RDMA 5.8 merge window pull request
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A more active cycle than most of the recent past, with a few large,
  long discussed works this time.

  The RNBD block driver has been posted for nearly two years now, and
  is flowing through RDMA due to it also introducing a new ULP. The
  removal of FMR has been a recurring discussion theme for a long time.
  And the usual smattering of features and bug fixes.

  Summary:

   - Various small driver bug fixes in rxe, mlx5, hfi1, and efa

   - Continuing driver cleanups in bnxt_re, hns

   - Big cleanup of mlx5 QP creation flows

   - More consistent use of src port and flow label when LAG is used,
     and a mlx5 implementation

   - Additional set of cleanups for IB CM

   - 'RNBD' network block driver and target. This is a network block
     RDMA device specific to ionos's cloud environment. It brings strong
     multipath and resiliency capabilities.

   - Accelerated IPoIB for HFI1

   - QP/WQ/SRQ ioctl migration for uverbs, and support for multiple
     async fds

   - Support for exchanging the new IBTA defined ECE data during RDMA CM
     exchanges

   - Removal of the very old and insecure FMR interface from all ULPs
     and drivers. FRWR should be preferred for at least a decade now"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (247 commits)
  RDMA/cm: Spurious WARNING triggered in cm_destroy_id()
  RDMA/mlx5: Return ECE DC support
  RDMA/mlx5: Don't rely on FW to set zeros in ECE response
  RDMA/mlx5: Return an error if copy_to_user fails
  IB/hfi1: Use free_netdev() in hfi1_netdev_free()
  RDMA/hns: Uninitialized variable in modify_qp_init_to_rtr()
  RDMA/core: Move and rename trace_cm_id_create()
  IB/hfi1: Fix hfi1_netdev_rx_init() error handling
  RDMA: Remove 'max_map_per_fmr'
  RDMA: Remove 'max_fmr'
  RDMA/core: Remove FMR device ops
  RDMA/rdmavt: Remove FMR memory registration
  RDMA/mthca: Remove FMR support for memory registration
  RDMA/mlx4: Remove FMR support for memory registration
  RDMA/i40iw: Remove FMR leftovers
  RDMA/bnxt_re: Remove FMR leftovers
  RDMA/mlx5: Remove FMR leftovers
  RDMA/core: Remove FMR pool API
  RDMA/rds: Remove FMR support for memory registration
  RDMA/srp: Remove support for FMR memory registration
  ...
commit 242b233198
Documentation/ABI/testing/sysfs-block-rnbd (new file, 46 lines)
@@ -0,0 +1,46 @@
|
||||
What: /sys/block/rnbd<N>/rnbd/unmap_device
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: To unmap a volume, "normal" or "force" has to be written to:
|
||||
/sys/block/rnbd<N>/rnbd/unmap_device
|
||||
|
||||
When "normal" is used, the operation will fail with EBUSY if any process
|
||||
is using the device. When "force" is used, the device is also unmapped
|
||||
even when the device is in use. All I/Os that are in progress will fail.
|
||||
|
||||
Example:
|
||||
|
||||
# echo "normal" > /sys/block/rnbd0/rnbd/unmap_device
|
||||
|
||||
What: /sys/block/rnbd<N>/rnbd/state
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: The file contains the current state of the block device. The state file
|
||||
returns "open" when the device is successfully mapped from the server
|
||||
and accepting I/O requests. When the connection to the server gets
|
||||
disconnected in case of an error (e.g. link failure), the state file
|
||||
returns "closed" and all I/O requests submitted to it will fail with -EIO.
|
||||
|
||||
What: /sys/block/rnbd<N>/rnbd/session
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: RNBD uses an RTRS session to transport the data between the client and
server. The entry "session" contains the name of the session that
|
||||
was used to establish the RTRS session. It's the same name that
|
||||
was passed as the "sessname" parameter to the map_device entry.
|
||||
|
||||
What: /sys/block/rnbd<N>/rnbd/mapping_path
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Contains the path that was passed as "device_path" to the map_device
|
||||
operation.
|
||||
|
||||
What: /sys/block/rnbd<N>/rnbd/access_mode
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Contains the device access mode: ro, rw or migration.
|
Documentation/ABI/testing/sysfs-class-rnbd-client (new file, 111 lines)
@@ -0,0 +1,111 @@
|
||||
What: /sys/class/rnbd-client
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Provide information about RNBD-client.
|
||||
All sysfs files that are not read-only provide the usage information on read:
|
||||
|
||||
Example:
|
||||
# cat /sys/class/rnbd-client/ctl/map_device
|
||||
|
||||
> Usage: echo "sessname=<name of the rtrs session> path=<[srcaddr,]dstaddr>
|
||||
> [path=<[srcaddr,]dstaddr>] device_path=<full path on remote side>
|
||||
> [access_mode=<ro|rw|migration>] > map_device
|
||||
>
|
||||
> addr ::= [ ip:<ipv4> | ip:<ipv6> | gid:<gid> ]
|
||||
|
||||
What: /sys/class/rnbd-client/ctl/map_device
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Expected format is the following:
|
||||
|
||||
sessname=<name of the rtrs session>
|
||||
path=<[srcaddr,]dstaddr> [path=<[srcaddr,]dstaddr> ...]
|
||||
device_path=<full path on remote side>
|
||||
[access_mode=<ro|rw|migration>]
|
||||
|
||||
Where:
|
||||
|
||||
sessname: accepts a string not bigger than 256 chars, which identifies
|
||||
a given session on the client and on the server.
|
||||
I.e. "clt_hostname-srv_hostname" could be a natural choice.
|
||||
|
||||
path: describes a connection between the client and the server by
|
||||
specifying destination and, when required, the source address.
|
||||
The addresses are to be provided in the following format:
|
||||
|
||||
ip:<IPv6>
|
||||
ip:<IPv4>
|
||||
gid:<GID>
|
||||
|
||||
for example:
|
||||
|
||||
path=ip:10.0.0.66
|
||||
The single addr is treated as the destination.
|
||||
The connection will be established to this server from any client IP address.
|
||||
|
||||
path=ip:10.0.0.66,ip:10.0.1.66
|
||||
The first addr is the source address and the second is the destination.
|
||||
|
||||
If multiple "path=" options are specified multiple connection
|
||||
will be established and data will be sent according to
|
||||
the selected multipath policy (see RTRS mp_policy sysfs entry description).
|
||||
|
||||
device_path: Path to the block device on the server side. Path is specified
|
||||
relative to the directory on server side configured in the
|
||||
'dev_search_path' module parameter of the rnbd_server.
|
||||
The rnbd_server prepends the <device_path> received from client
|
||||
with <dev_search_path> and tries to open the
|
||||
<dev_search_path>/<device_path> block device. On success,
|
||||
a /dev/rnbd<N> device file, a /sys/block/rnbd_client/rnbd<N>/
|
||||
directory and an entry in /sys/class/rnbd-client/ctl/devices
|
||||
will be created.
|
||||
|
||||
If 'dev_search_path' contains '%SESSNAME%', then each session can
|
||||
have a different device namespace, e.g. if the server was configured with
the following parameter "dev_search_path=/run/rnbd-devs/%SESSNAME%",
and the client passes "sessname=blya device_path=sda", then the server
|
||||
will try to open: /run/rnbd-devs/blya/sda.
|
||||
|
||||
access_mode: the access_mode parameter specifies if the device is to be
|
||||
mapped as "ro" read-only or "rw" read-write. The server allows
|
||||
a device to be exported in rw mode only once. The "migration"
|
||||
access mode has to be specified if a second mapping in read-write
|
||||
mode is desired.
|
||||
|
||||
By default "rw" is used.
|
||||
|
||||
Exit Codes:
|
||||
|
||||
If the device is already mapped it will fail with EEXIST. If the input
|
||||
has an invalid format it will return EINVAL. If the device path cannot
|
||||
be found on the server, it will fail with ENOENT.
|
||||
|
||||
Finding device file after mapping
|
||||
---------------------------------
|
||||
|
||||
After mapping, the device file can be found by:
|
||||
o The symlink /sys/class/rnbd-client/ctl/devices/<device_id>
|
||||
points to /sys/block/<dev-name>. The last part of the symlink destination
|
||||
is the same as the device name. By extracting the last part of that
path, the path to the device /dev/<dev-name> can be built.
|
||||
|
||||
o /dev/block/$(cat /sys/class/rnbd-client/ctl/devices/<device_id>/dev)
|
||||
|
||||
How to find the <device_id> of the device is described in the next
section.
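
Example, using the second method above (assuming a hypothetical
<device_id> of "ram0"):

# ls -l /dev/block/$(cat /sys/class/rnbd-client/ctl/devices/ram0/dev)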
|
||||
|
||||
What: /sys/class/rnbd-client/ctl/devices/
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: For each device mapped on the client a new symbolic link is created as
|
||||
/sys/class/rnbd-client/ctl/devices/<device_id>, which points
|
||||
to the block device created by rnbd (/sys/block/rnbd<N>/).
|
||||
The <device_id> of each device is created as follows:
|
||||
|
||||
- If the 'device_path' provided during mapping contains slashes ("/"),
|
||||
they are replaced by an exclamation mark ("!") and used as the
|
||||
<device_id>. Otherwise, the <device_id> will be the same as the
|
||||
"device_path" provided.
|
Documentation/ABI/testing/sysfs-class-rnbd-server (new file, 50 lines)
@@ -0,0 +1,50 @@
|
||||
What: /sys/class/rnbd-server
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Provide information about RNBD-server.
|
||||
|
||||
What: /sys/class/rnbd-server/ctl/
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: When a client maps a device, a directory entry with the name of the
|
||||
block device is created under /sys/class/rnbd-server/ctl/devices/.
|
||||
|
||||
What: /sys/class/rnbd-server/ctl/devices/<device_name>/block_dev
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Is a symlink to the sysfs entry of the exported device.
|
||||
|
||||
Example:
|
||||
block_dev -> ../../../../class/block/ram0
|
||||
|
||||
What: /sys/class/rnbd-server/ctl/devices/<device_name>/sessions/
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: For each client a particular device is exported to, the following directory will be
|
||||
created:
|
||||
|
||||
/sys/class/rnbd-server/ctl/devices/<device_name>/sessions/<session-name>/
|
||||
|
||||
When the device is unmapped by that client, the directory will be removed.
|
||||
|
||||
What: /sys/class/rnbd-server/ctl/devices/<device_name>/sessions/<session-name>/read_only
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Contains '1' if the device is mapped read-only, otherwise '0'.
|
||||
|
||||
What: /sys/class/rnbd-server/ctl/devices/<device_name>/sessions/<session-name>/mapping_path
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Contains the relative device path provided by the user during mapping.
|
||||
|
||||
What: /sys/class/rnbd-server/ctl/devices/<device_name>/sessions/<session-name>/access_mode
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Contains the device access mode: ro, rw or migration.
|
Documentation/ABI/testing/sysfs-class-rtrs-client (new file, 131 lines)
@@ -0,0 +1,131 @@
|
||||
What: /sys/class/rtrs-client
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: When a user of the RTRS API creates a new session, a directory entry with
|
||||
the name of that session is created under /sys/class/rtrs-client/<session-name>/
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/add_path
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: RW, adds a new path (connection) to an existing session. Expected format is the
|
||||
following:
|
||||
|
||||
<[source addr,]destination addr>
|
||||
*addr ::= [ ip:<ipv4|ipv6> | gid:<gid> ]
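
Example (addresses are illustrative):

# echo "ip:10.0.0.1,ip:10.0.0.66" > /sys/class/rtrs-client/<session-name>/add_path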
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/max_reconnect_attempts
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Maximum number of reconnect attempts the client should make before giving up
after the connection breaks unexpectedly.
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/mp_policy
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Multipath policy specifies which path should be selected on each IO:
|
||||
|
||||
round-robin (0):
|
||||
select the path in a per-CPU round-robin manner.
|
||||
|
||||
min-inflight (1):
|
||||
select the path with the fewest inflight requests.
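
Provided the entry is writable on your kernel, a policy can be selected
by writing its numeric id, e.g. for min-inflight:

# echo 1 > /sys/class/rtrs-client/<session-name>/mp_policy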
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/paths/
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Each path belonging to a given session is listed here by its source and
|
||||
destination address. When a new path is added to a session by writing to
|
||||
the "add_path" entry, a directory <src@dst> is created.
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/state
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: RO, Contains "connected" if the session is connected to the peer and fully
|
||||
functional. Otherwise the file contains "disconnected".
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/reconnect
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Write "1" to the file in order to reconnect the path.
|
||||
The operation is blocking and returns 0 if the reconnect was successful.
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/disconnect
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Write "1" to the file in order to disconnect the path.
|
||||
The operation blocks until the RTRS path is disconnected.
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/remove_path
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Write "1" to the file in order to disconnected and remove the path
|
||||
from the session. Operation blocks until the path is disconnected
|
||||
and removed from the session.
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/hca_name
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: RO, Contains the name of the HCA the connection was established on.
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/hca_port
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: RO, Contains the port number of the active port the traffic is going through.
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/src_addr
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: RO, Contains the source address of the path
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/dst_addr
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: RO, Contains the destination address of the path
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/stats/reset_all
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: RW, a read returns usage help; writing 0 clears all the statistics.
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/stats/cpu_migration
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: RTRS expects that each HCA IRQ is pinned to a separate CPU. If it's
|
||||
not the case, an I/O response could be processed on a
different CPU than the one it was originally submitted from. This file shows
how many interrupts were generated on an unexpected CPU.
|
||||
"from:" is the CPU on which the IRQ was expected, but not generated.
|
||||
"to:" is the CPU on which the IRQ was generated, but not expected.
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/stats/reconnects
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Contains two unsigned int values: the first records the number of successful
reconnects in the path lifetime, the second records the number of failed
|
||||
reconnects in the path lifetime.
|
||||
|
||||
What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/stats/rdma
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Contains statistics regarding rdma operations and inflight operations.
|
||||
The output consists of 6 values:
|
||||
|
||||
<read-count> <read-total-size> <write-count> <write-total-size> \
|
||||
<inflights> <failovered>
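
Example (values are illustrative):

# cat /sys/class/rtrs-client/<session-name>/paths/<src@dst>/stats/rdma
151 18432 220 28672 3 0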
|
Documentation/ABI/testing/sysfs-class-rtrs-server (new file, 53 lines)
@@ -0,0 +1,53 @@
|
||||
What: /sys/class/rtrs-server
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: When a user of the RTRS API creates a new session on the client side, a
|
||||
directory entry with the name of that session is created in here.
|
||||
|
||||
What: /sys/class/rtrs-server/<session-name>/paths/
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: When a new path is created by writing to the "add_path" entry on the client side,
a directory entry named <source address>@<destination address> is created
on the server.
|
||||
|
||||
What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/disconnect
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: When "1" is written to the file, the RTRS session is being disconnected.
|
||||
Operations is non-blocking and returns control immediately to the caller.
|
||||
|
||||
What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/hca_name
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: RO, Contains the name of the HCA the connection was established on.
|
||||
|
||||
What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/hca_port
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: RO, Contains the port number of the active port the traffic is going through.
|
||||
|
||||
What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/src_addr
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: RO, Contains the source address of the path
|
||||
|
||||
What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/dst_addr
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: RO, Contains the destination address of the path
|
||||
|
||||
What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/stats/rdma
|
||||
Date: Feb 2020
|
||||
KernelVersion: 5.7
|
||||
Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
Description: Contains statistics regarding rdma operations and inflight operations.
|
||||
The output consists of 5 values:
|
||||
<read-count> <read-total-size> <write-count> <write-total-size> <inflights>
|
@@ -37,9 +37,6 @@ InfiniBand core interfaces
|
||||
.. kernel-doc:: drivers/infiniband/core/ud_header.c
|
||||
:export:
|
||||
|
||||
.. kernel-doc:: drivers/infiniband/core/fmr_pool.c
|
||||
:export:
|
||||
|
||||
.. kernel-doc:: drivers/infiniband/core/umem.c
|
||||
:export:
|
||||
|
||||
|
@@ -22,7 +22,6 @@ Sleeping and interrupt context
|
||||
- post_recv
|
||||
- poll_cq
|
||||
- req_notify_cq
|
||||
- map_phys_fmr
|
||||
|
||||
which may not sleep and must be callable from any context.
|
||||
|
||||
@@ -36,7 +35,6 @@ Sleeping and interrupt context
|
||||
- ib_post_send
|
||||
- ib_post_recv
|
||||
- ib_req_notify_cq
|
||||
- ib_map_phys_fmr
|
||||
|
||||
are therefore safe to call from any context.
|
||||
|
||||
|
MAINTAINERS (14 changed lines)
@@ -14579,6 +14579,13 @@ F: arch/riscv/
|
||||
N: riscv
|
||||
K: riscv
|
||||
|
||||
RNBD BLOCK DRIVERS
|
||||
M: Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
M: Jack Wang <jinpu.wang@cloud.ionos.com>
|
||||
L: linux-block@vger.kernel.org
|
||||
S: Maintained
|
||||
F: drivers/block/rnbd/
|
||||
|
||||
ROCCAT DRIVERS
|
||||
M: Stefan Achatz <erazor_de@users.sourceforge.net>
|
||||
S: Maintained
|
||||
@@ -14716,6 +14723,13 @@ S: Maintained
|
||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/jes/linux.git rtl8xxxu-devel
|
||||
F: drivers/net/wireless/realtek/rtl8xxxu/
|
||||
|
||||
RTRS TRANSPORT DRIVERS
|
||||
M: Danil Kipnis <danil.kipnis@cloud.ionos.com>
|
||||
M: Jack Wang <jinpu.wang@cloud.ionos.com>
|
||||
L: linux-rdma@vger.kernel.org
|
||||
S: Maintained
|
||||
F: drivers/infiniband/ulp/rtrs/
|
||||
|
||||
RXRPC SOCKETS (AF_RXRPC)
|
||||
M: David Howells <dhowells@redhat.com>
|
||||
L: linux-afs@lists.infradead.org
|
||||
|
@@ -458,4 +458,6 @@ config BLK_DEV_RSXX
|
||||
To compile this driver as a module, choose M here: the
|
||||
module will be called rsxx.
|
||||
|
||||
source "drivers/block/rnbd/Kconfig"
|
||||
|
||||
endif # BLK_DEV
|
||||
|
@@ -39,6 +39,7 @@ obj-$(CONFIG_BLK_DEV_PCIESSD_MTIP32XX) += mtip32xx/
|
||||
|
||||
obj-$(CONFIG_BLK_DEV_RSXX) += rsxx/
|
||||
obj-$(CONFIG_ZRAM) += zram/
|
||||
obj-$(CONFIG_BLK_DEV_RNBD) += rnbd/
|
||||
|
||||
obj-$(CONFIG_BLK_DEV_NULL_BLK) += null_blk.o
|
||||
null_blk-objs := null_blk_main.o
|
||||
|
drivers/block/rnbd/Kconfig (new file, 28 lines)
@@ -0,0 +1,28 @@
|
||||
# SPDX-License-Identifier: GPL-2.0-or-later
|
||||
|
||||
config BLK_DEV_RNBD
|
||||
bool
|
||||
|
||||
config BLK_DEV_RNBD_CLIENT
|
||||
tristate "RDMA Network Block Device driver client"
|
||||
depends on INFINIBAND_RTRS_CLIENT
|
||||
select BLK_DEV_RNBD
|
||||
help
|
||||
RNBD client is a network block device driver using rdma transport.
|
||||
|
||||
RNBD client allows mapping of remote block devices over the
RTRS protocol from a target system where the RNBD server is running.
|
||||
|
||||
If unsure, say N.
|
||||
|
||||
config BLK_DEV_RNBD_SERVER
|
||||
tristate "RDMA Network Block Device driver server"
|
||||
depends on INFINIBAND_RTRS_SERVER
|
||||
select BLK_DEV_RNBD
|
||||
help
|
||||
RNBD server is the server side of RNBD using rdma transport.
|
||||
|
||||
RNBD server allows exporting local block devices to a remote client
over the RTRS protocol.
|
||||
|
||||
If unsure, say N.
|
drivers/block/rnbd/Makefile (new file, 15 lines)
@@ -0,0 +1,15 @@
|
||||
# SPDX-License-Identifier: GPL-2.0-or-later
|
||||
|
||||
ccflags-y := -I$(srctree)/drivers/infiniband/ulp/rtrs
|
||||
|
||||
rnbd-client-y := rnbd-clt.o \
|
||||
rnbd-clt-sysfs.o \
|
||||
rnbd-common.o
|
||||
|
||||
rnbd-server-y := rnbd-common.o \
|
||||
rnbd-srv.o \
|
||||
rnbd-srv-dev.o \
|
||||
rnbd-srv-sysfs.o
|
||||
|
||||
obj-$(CONFIG_BLK_DEV_RNBD_CLIENT) += rnbd-client.o
|
||||
obj-$(CONFIG_BLK_DEV_RNBD_SERVER) += rnbd-server.o
|
drivers/block/rnbd/README (new file, 92 lines)
@@ -0,0 +1,92 @@
|
||||
********************************
|
||||
RDMA Network Block Device (RNBD)
|
||||
********************************
|
||||
|
||||
Introduction
|
||||
------------
|
||||
|
||||
RNBD (RDMA Network Block Device) is a pair of kernel modules
|
||||
(client and server) that allow for remote access of a block device on
the server over the RTRS protocol using the RDMA (InfiniBand, RoCE, iWARP)
|
||||
transport. After being mapped, the remote block devices can be accessed
|
||||
on the client side as local block devices.
|
||||
|
||||
I/O is transferred between client and server by the RTRS transport
|
||||
modules. The administration of RNBD and RTRS modules is done via
|
||||
sysfs entries.
|
||||
|
||||
Requirements
|
||||
------------
|
||||
|
||||
RTRS kernel modules
|
||||
|
||||
Quick Start
|
||||
-----------
|
||||
|
||||
Server side:
|
||||
# modprobe rnbd_server
|
||||
|
||||
Client side:
|
||||
# modprobe rnbd_client
|
||||
# echo "sessname=blya path=ip:10.50.100.66 device_path=/dev/ram0" > \
|
||||
/sys/devices/virtual/rnbd-client/ctl/map_device
|
||||
|
||||
Where "sessname=" is a session name, a string to identify the session
|
||||
on client and on server sides; "path=" is a destination IP address or
|
||||
a pair of a source and a destination IPs, separated by comma. Multiple
|
||||
"path=" options can be specified in order to use multipath (see RTRS
|
||||
description for details); "device_path=" is the block device to be
|
||||
mapped from the server side. After the session to the server machine is
|
||||
established, the mapped device will appear on the client side under
|
||||
/dev/rnbd<N>.
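
To unmap the device again on the client ("rnbd0" is illustrative, see
Documentation/ABI/testing/sysfs-block-rnbd):

# echo "normal" > /sys/block/rnbd0/rnbd/unmap_device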
|
||||
|
||||
|
||||
RNBD-Server Module Parameters
|
||||
=============================
|
||||
|
||||
dev_search_path
|
||||
---------------
|
||||
|
||||
When a device is mapped from the client, the server generates the path
|
||||
to the block device on the server side by concatenating dev_search_path
|
||||
and the "device_path" that was specified in the map_device operation.
|
||||
|
||||
The default dev_search_path is: "/".
|
||||
|
||||
The dev_search_path option can also contain %SESSNAME% in order to provide
|
||||
different device namespaces for different sessions. See "device_path"
|
||||
option for details.
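
For example (the search path below is illustrative), a per-session
device namespace can be configured at module load time:

# modprobe rnbd_server dev_search_path=/run/rnbd-devs/%SESSNAME%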
|
||||
|
||||
============================
|
||||
Protocol (rnbd/rnbd-proto.h)
|
||||
============================
|
||||
|
||||
1. Before mapping the first device from a given server, the client sends an
RNBD_MSG_SESS_INFO to the server. The server responds with
RNBD_MSG_SESS_INFO_RSP. Currently the messages only contain the protocol
version for backward compatibility.
|
||||
|
||||
2. The client requests to open a device by sending an RNBD_MSG_OPEN message. This
contains the path to the device and access mode (read-only or writable).
The server responds to the message with RNBD_MSG_OPEN_RSP. This contains
a 32-bit device id to be used for IOs and device "geometry" related
information: size, max_hw_sectors, etc.
|
||||
|
||||
3. The client attaches RNBD_MSG_IO to each IO message sent to a device. This
message contains the device id provided by the server in its rnbd_msg_open_rsp,
the sector to be accessed, read-write flags and bi_size.
|
||||
|
||||
4. The client closes a device by sending RNBD_MSG_CLOSE, which contains only the
device id provided by the server.
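
In summary, a typical exchange looks as follows (message flow only,
layout is illustrative):

   client                             server
   RNBD_MSG_SESS_INFO       ---->
                            <----    RNBD_MSG_SESS_INFO_RSP
   RNBD_MSG_OPEN            ---->
                            <----    RNBD_MSG_OPEN_RSP (device id, geometry)
   RNBD_MSG_IO (device id)  ---->    (one per I/O)
   RNBD_MSG_CLOSE           ---->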
|
||||
|
||||
=========================================
|
||||
Contributors List (in alphabetical order)
|
||||
=========================================
|
||||
Danil Kipnis <danil.kipnis@profitbricks.com>
|
||||
Fabian Holler <mail@fholler.de>
|
||||
Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
|
||||
Jack Wang <jinpu.wang@profitbricks.com>
|
||||
Kleber Souza <kleber.souza@profitbricks.com>
|
||||
Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
|
||||
Milind Dumbare <Milind.dumbare@gmail.com>
|
||||
Roman Penyaev <roman.penyaev@profitbricks.com>
|
drivers/block/rnbd/rnbd-clt-sysfs.c (new file, 639 lines)
@@ -0,0 +1,639 @@
|
||||
// SPDX-License-Identifier: GPL-2.0-or-later
|
||||
/*
|
||||
* RDMA Network Block Driver
|
||||
*
|
||||
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
|
||||
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
|
||||
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
|
||||
*/
|
||||
|
||||
#undef pr_fmt
|
||||
#define pr_fmt(fmt) KBUILD_MODNAME " L" __stringify(__LINE__) ": " fmt
|
||||
|
||||
#include <linux/types.h>
|
||||
#include <linux/ctype.h>
|
||||
#include <linux/parser.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/in6.h>
|
||||
#include <linux/fs.h>
|
||||
#include <linux/uaccess.h>
|
||||
#include <linux/device.h>
|
||||
#include <rdma/ib.h>
|
||||
#include <rdma/rdma_cm.h>
|
||||
|
||||
#include "rnbd-clt.h"
|
||||
|
||||
static struct device *rnbd_dev;
|
||||
static struct class *rnbd_dev_class;
|
||||
static struct kobject *rnbd_devs_kobj;
|
||||
|
||||
enum {
|
||||
RNBD_OPT_ERR = 0,
|
||||
RNBD_OPT_DEST_PORT = 1 << 0,
|
||||
RNBD_OPT_PATH = 1 << 1,
|
||||
RNBD_OPT_DEV_PATH = 1 << 2,
|
||||
RNBD_OPT_ACCESS_MODE = 1 << 3,
|
||||
RNBD_OPT_SESSNAME = 1 << 6,
|
||||
};
|
||||
|
||||
static const unsigned int rnbd_opt_mandatory[] = {
|
||||
RNBD_OPT_PATH,
|
||||
RNBD_OPT_DEV_PATH,
|
||||
RNBD_OPT_SESSNAME,
|
||||
};
|
||||
|
||||
static const match_table_t rnbd_opt_tokens = {
|
||||
{RNBD_OPT_PATH, "path=%s" },
|
||||
{RNBD_OPT_DEV_PATH, "device_path=%s"},
|
||||
{RNBD_OPT_DEST_PORT, "dest_port=%d" },
|
||||
{RNBD_OPT_ACCESS_MODE, "access_mode=%s"},
|
||||
{RNBD_OPT_SESSNAME, "sessname=%s" },
|
||||
{RNBD_OPT_ERR, NULL },
|
||||
};
|
||||
|
||||
struct rnbd_map_options {
|
||||
char *sessname;
|
||||
struct rtrs_addr *paths;
|
||||
size_t *path_cnt;
|
||||
char *pathname;
|
||||
u16 *dest_port;
|
||||
enum rnbd_access_mode *access_mode;
|
||||
};
|
||||
|
||||
static int rnbd_clt_parse_map_options(const char *buf, size_t max_path_cnt,
|
||||
struct rnbd_map_options *opt)
|
||||
{
|
||||
char *options, *sep_opt;
|
||||
char *p;
|
||||
substring_t args[MAX_OPT_ARGS];
|
||||
int opt_mask = 0;
|
||||
int token;
|
||||
int ret = -EINVAL;
|
||||
int i, dest_port;
|
||||
int p_cnt = 0;
|
||||
|
||||
options = kstrdup(buf, GFP_KERNEL);
|
||||
if (!options)
|
||||
return -ENOMEM;
|
||||
|
||||
sep_opt = strstrip(options);
|
||||
while ((p = strsep(&sep_opt, " ")) != NULL) {
|
||||
if (!*p)
|
||||
continue;
|
||||
|
||||
token = match_token(p, rnbd_opt_tokens, args);
|
||||
opt_mask |= token;
|
||||
|
||||
switch (token) {
|
||||
case RNBD_OPT_SESSNAME:
|
||||
p = match_strdup(args);
|
||||
if (!p) {
|
||||
ret = -ENOMEM;
|
||||
goto out;
|
||||
}
|
||||
if (strlen(p) > NAME_MAX) {
|
||||
pr_err("map_device: sessname too long\n");
|
||||
ret = -EINVAL;
|
||||
kfree(p);
|
||||
goto out;
|
||||
}
|
||||
strlcpy(opt->sessname, p, NAME_MAX);
|
||||
kfree(p);
|
||||
break;
|
||||
|
||||
case RNBD_OPT_PATH:
|
||||
if (p_cnt >= max_path_cnt) {
|
||||
pr_err("map_device: too many (> %zu) paths provided\n",
|
||||
max_path_cnt);
|
||||
ret = -ENOMEM;
|
||||
goto out;
|
||||
}
|
||||
p = match_strdup(args);
|
||||
if (!p) {
|
||||
ret = -ENOMEM;
|
||||
goto out;
|
||||
}
|
||||
|
||||
ret = rtrs_addr_to_sockaddr(p, strlen(p),
|
||||
*opt->dest_port,
|
||||
&opt->paths[p_cnt]);
|
||||
if (ret) {
|
||||
pr_err("Can't parse path %s: %d\n", p, ret);
|
||||
kfree(p);
|
||||
goto out;
|
||||
}
|
||||
|
||||
p_cnt++;
|
||||
|
||||
kfree(p);
|
||||
break;
|
||||
|
||||
case RNBD_OPT_DEV_PATH:
|
||||
p = match_strdup(args);
|
||||
if (!p) {
|
||||
ret = -ENOMEM;
|
||||
goto out;
|
||||
}
|
||||
if (strlen(p) > NAME_MAX) {
|
||||
pr_err("map_device: Device path too long\n");
|
||||
ret = -EINVAL;
|
||||
kfree(p);
|
||||
goto out;
|
||||
}
|
||||
strlcpy(opt->pathname, p, NAME_MAX);
|
||||
kfree(p);
|
||||
break;
|
||||
|
||||
case RNBD_OPT_DEST_PORT:
|
||||
if (match_int(args, &dest_port) || dest_port < 0 ||
|
||||
dest_port > 65535) {
|
||||
pr_err("bad destination port number parameter '%d'\n",
|
||||
dest_port);
|
||||
ret = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
*opt->dest_port = dest_port;
|
||||
break;
|
||||
|
||||
case RNBD_OPT_ACCESS_MODE:
|
||||
p = match_strdup(args);
|
||||
if (!p) {
|
||||
ret = -ENOMEM;
|
||||
goto out;
|
||||
}
|
||||
|
||||
if (!strcmp(p, "ro")) {
|
||||
*opt->access_mode = RNBD_ACCESS_RO;
|
||||
} else if (!strcmp(p, "rw")) {
|
||||
*opt->access_mode = RNBD_ACCESS_RW;
|
||||
} else if (!strcmp(p, "migration")) {
|
||||
*opt->access_mode = RNBD_ACCESS_MIGRATION;
|
||||
} else {
|
||||
pr_err("map_device: Invalid access_mode: '%s'\n",
|
||||
p);
|
||||
ret = -EINVAL;
|
||||
kfree(p);
|
||||
goto out;
|
||||
}
|
||||
|
||||
kfree(p);
|
||||
break;
|
||||
|
||||
default:
|
||||
pr_err("map_device: Unknown parameter or missing value '%s'\n",
|
||||
p);
|
||||
ret = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
}
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(rnbd_opt_mandatory); i++) {
|
||||
if ((opt_mask & rnbd_opt_mandatory[i])) {
|
||||
ret = 0;
|
||||
} else {
|
||||
pr_err("map_device: Parameters missing\n");
|
||||
ret = -EINVAL;
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
out:
|
||||
*opt->path_cnt = p_cnt;
|
||||
kfree(options);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static ssize_t state_show(struct kobject *kobj,
|
||||
struct kobj_attribute *attr, char *page)
|
||||
{
|
||||
struct rnbd_clt_dev *dev;
|
||||
|
||||
dev = container_of(kobj, struct rnbd_clt_dev, kobj);
|
||||
|
||||
switch (dev->dev_state) {
|
||||
case DEV_STATE_INIT:
|
||||
return snprintf(page, PAGE_SIZE, "init\n");
|
||||
case DEV_STATE_MAPPED:
|
||||
/* TODO fix cli tool before changing to proper state */
|
||||
return snprintf(page, PAGE_SIZE, "open\n");
|
||||
case DEV_STATE_MAPPED_DISCONNECTED:
|
||||
/* TODO fix cli tool before changing to proper state */
|
||||
return snprintf(page, PAGE_SIZE, "closed\n");
|
||||
case DEV_STATE_UNMAPPED:
|
||||
return snprintf(page, PAGE_SIZE, "unmapped\n");
|
||||
default:
|
||||
return snprintf(page, PAGE_SIZE, "unknown\n");
|
||||
}
|
||||
}
|
||||
|
||||
static struct kobj_attribute rnbd_clt_state_attr = __ATTR_RO(state);
|
||||
|
||||
static ssize_t mapping_path_show(struct kobject *kobj,
|
||||
struct kobj_attribute *attr, char *page)
|
||||
{
|
||||
struct rnbd_clt_dev *dev;
|
||||
|
||||
dev = container_of(kobj, struct rnbd_clt_dev, kobj);
|
||||
|
||||
return scnprintf(page, PAGE_SIZE, "%s\n", dev->pathname);
|
||||
}
|
||||
|
||||
static struct kobj_attribute rnbd_clt_mapping_path_attr =
|
||||
__ATTR_RO(mapping_path);
|
||||
|
||||
static ssize_t access_mode_show(struct kobject *kobj,
|
||||
struct kobj_attribute *attr, char *page)
|
||||
{
|
||||
struct rnbd_clt_dev *dev;
|
||||
|
||||
dev = container_of(kobj, struct rnbd_clt_dev, kobj);
|
||||
|
||||
return snprintf(page, PAGE_SIZE, "%s\n",
|
||||
rnbd_access_mode_str(dev->access_mode));
|
||||
}
|
||||
|
||||
static struct kobj_attribute rnbd_clt_access_mode =
|
||||
__ATTR_RO(access_mode);
|
||||
|
||||
static ssize_t rnbd_clt_unmap_dev_show(struct kobject *kobj,
|
||||
struct kobj_attribute *attr, char *page)
|
||||
{
|
||||
return scnprintf(page, PAGE_SIZE, "Usage: echo <normal|force> > %s\n",
|
||||
attr->attr.name);
|
||||
}
|
||||
|
||||
static ssize_t rnbd_clt_unmap_dev_store(struct kobject *kobj,
|
||||
struct kobj_attribute *attr,
|
||||
const char *buf, size_t count)
|
||||
{
|
||||
struct rnbd_clt_dev *dev;
|
||||
char *opt, *options;
|
||||
bool force;
|
||||
int err;
|
||||
|
||||
opt = kstrdup(buf, GFP_KERNEL);
|
||||
if (!opt)
|
||||
return -ENOMEM;
|
||||
|
||||
options = strstrip(opt);
|
||||
dev = container_of(kobj, struct rnbd_clt_dev, kobj);
|
||||
if (sysfs_streq(options, "normal")) {
|
||||
force = false;
|
||||
} else if (sysfs_streq(options, "force")) {
|
||||
force = true;
|
||||
} else {
|
||||
rnbd_clt_err(dev,
|
||||
"unmap_device: Invalid value: %s\n",
|
||||
options);
|
||||
err = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
|
||||
rnbd_clt_info(dev, "Unmapping device, option: %s.\n",
|
||||
force ? "force" : "normal");
|
||||
|
||||
/*
|
||||
* We take explicit module reference only for one reason: do not
|
||||
* race with lockless rnbd_destroy_sessions().
|
||||
*/
|
||||
if (!try_module_get(THIS_MODULE)) {
|
||||
err = -ENODEV;
|
||||
goto out;
|
||||
}
|
||||
err = rnbd_clt_unmap_device(dev, force, &attr->attr);
|
||||
if (err) {
|
||||
if (err != -EALREADY)
|
||||
rnbd_clt_err(dev, "unmap_device: %d\n", err);
|
||||
goto module_put;
|
||||
}
|
||||
|
||||
/*
|
||||
* Here device can be vanished!
|
||||
*/
|
||||
|
||||
err = count;
|
||||
|
||||
module_put:
|
||||
module_put(THIS_MODULE);
|
||||
out:
|
||||
kfree(opt);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
static struct kobj_attribute rnbd_clt_unmap_device_attr =
|
||||
__ATTR(unmap_device, 0644, rnbd_clt_unmap_dev_show,
|
||||
rnbd_clt_unmap_dev_store);
|
||||
|
||||
static ssize_t rnbd_clt_resize_dev_show(struct kobject *kobj,
|
||||
struct kobj_attribute *attr,
|
||||
char *page)
|
||||
{
|
||||
return scnprintf(page, PAGE_SIZE,
|
||||
"Usage: echo <new size in sectors> > %s\n",
|
||||
attr->attr.name);
|
||||
}
|
||||
|
||||
static ssize_t rnbd_clt_resize_dev_store(struct kobject *kobj,
|
||||
struct kobj_attribute *attr,
|
||||
const char *buf, size_t count)
|
||||
{
|
||||
int ret;
|
||||
unsigned long sectors;
|
||||
struct rnbd_clt_dev *dev;
|
||||
|
||||
dev = container_of(kobj, struct rnbd_clt_dev, kobj);
|
||||
|
||||
ret = kstrtoul(buf, 0, &sectors);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
ret = rnbd_clt_resize_disk(dev, (size_t)sectors);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
return count;
|
||||
}
|
||||
|
||||
static struct kobj_attribute rnbd_clt_resize_dev_attr =
|
||||
__ATTR(resize, 0644, rnbd_clt_resize_dev_show,
|
||||
rnbd_clt_resize_dev_store);
|
||||
|
||||
static ssize_t rnbd_clt_remap_dev_show(struct kobject *kobj,
|
||||
struct kobj_attribute *attr, char *page)
|
||||
{
|
||||
return scnprintf(page, PAGE_SIZE, "Usage: echo <1> > %s\n",
|
||||
attr->attr.name);
|
||||
}
|
||||
|
||||
static ssize_t rnbd_clt_remap_dev_store(struct kobject *kobj,
|
||||
struct kobj_attribute *attr,
|
||||
const char *buf, size_t count)
|
||||
{
|
||||
struct rnbd_clt_dev *dev;
|
||||
char *opt, *options;
|
||||
int err;
|
||||
|
||||
opt = kstrdup(buf, GFP_KERNEL);
|
||||
if (!opt)
|
||||
return -ENOMEM;
|
||||
|
||||
options = strstrip(opt);
|
||||
dev = container_of(kobj, struct rnbd_clt_dev, kobj);
|
||||
if (!sysfs_streq(options, "1")) {
|
||||
rnbd_clt_err(dev,
|
||||
"remap_device: Invalid value: %s\n",
|
||||
options);
|
||||
err = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
err = rnbd_clt_remap_device(dev);
|
||||
if (likely(!err))
|
||||
err = count;
|
||||
|
||||
out:
|
||||
kfree(opt);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
static struct kobj_attribute rnbd_clt_remap_device_attr =
|
||||
__ATTR(remap_device, 0644, rnbd_clt_remap_dev_show,
|
||||
rnbd_clt_remap_dev_store);
|
||||
|
||||
static ssize_t session_show(struct kobject *kobj, struct kobj_attribute *attr,
|
||||
char *page)
|
||||
{
|
||||
struct rnbd_clt_dev *dev;
|
||||
|
||||
dev = container_of(kobj, struct rnbd_clt_dev, kobj);
|
||||
|
||||
return scnprintf(page, PAGE_SIZE, "%s\n", dev->sess->sessname);
|
||||
}
|
||||
|
||||
static struct kobj_attribute rnbd_clt_session_attr =
|
||||
__ATTR_RO(session);
|
||||
|
||||
static struct attribute *rnbd_dev_attrs[] = {
|
||||
&rnbd_clt_unmap_device_attr.attr,
|
||||
&rnbd_clt_resize_dev_attr.attr,
|
||||
&rnbd_clt_remap_device_attr.attr,
|
||||
&rnbd_clt_mapping_path_attr.attr,
|
||||
&rnbd_clt_state_attr.attr,
|
||||
&rnbd_clt_session_attr.attr,
|
||||
&rnbd_clt_access_mode.attr,
|
||||
NULL,
|
||||
};
|
||||
|
||||
void rnbd_clt_remove_dev_symlink(struct rnbd_clt_dev *dev)
|
||||
{
|
||||
/*
|
||||
* The module unload rnbd_client_exit path races with manually unmapping
* the last device from sysfs, i.e. rnbd_clt_unmap_dev_store(), which leads
* to a sysfs warning because the sysfs link was already removed.
|
||||
*/
|
||||
if (strlen(dev->blk_symlink_name) && try_module_get(THIS_MODULE)) {
|
||||
sysfs_remove_link(rnbd_devs_kobj, dev->blk_symlink_name);
|
||||
module_put(THIS_MODULE);
|
||||
}
|
||||
}
|
||||
|
||||
static struct kobj_type rnbd_dev_ktype = {
|
||||
.sysfs_ops = &kobj_sysfs_ops,
|
||||
.default_attrs = rnbd_dev_attrs,
|
||||
};
|
||||
|
||||
static int rnbd_clt_add_dev_kobj(struct rnbd_clt_dev *dev)
|
||||
{
|
||||
int ret;
|
||||
struct kobject *gd_kobj = &disk_to_dev(dev->gd)->kobj;
|
||||
|
||||
ret = kobject_init_and_add(&dev->kobj, &rnbd_dev_ktype, gd_kobj, "%s",
|
||||
"rnbd");
|
||||
if (ret)
|
||||
rnbd_clt_err(dev, "Failed to create device sysfs dir, err: %d\n",
|
||||
ret);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static ssize_t rnbd_clt_map_device_show(struct kobject *kobj,
|
||||
struct kobj_attribute *attr,
|
||||
char *page)
|
||||
{
|
||||
return scnprintf(page, PAGE_SIZE,
|
||||
"Usage: echo \"[dest_port=server port number] sessname=<name of the rtrs session> path=<[srcaddr@]dstaddr> [path=<[srcaddr@]dstaddr>] device_path=<full path on remote side> [access_mode=<ro|rw|migration>]\" > %s\n\naddr ::= [ ip:<ipv4> | ip:<ipv6> | gid:<gid> ]\n",
|
||||
attr->attr.name);
|
||||
}
|
||||
|
||||
static int rnbd_clt_get_path_name(struct rnbd_clt_dev *dev, char *buf,
|
||||
size_t len)
|
||||
{
|
||||
int ret;
|
||||
char pathname[NAME_MAX], *s;
|
||||
|
||||
strlcpy(pathname, dev->pathname, sizeof(pathname));
|
||||
while ((s = strchr(pathname, '/')))
|
||||
s[0] = '!';
|
||||
|
||||
ret = snprintf(buf, len, "%s", pathname);
|
||||
if (ret >= len)
|
||||
return -ENAMETOOLONG;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int rnbd_clt_add_dev_symlink(struct rnbd_clt_dev *dev)
|
||||
{
|
||||
struct kobject *gd_kobj = &disk_to_dev(dev->gd)->kobj;
|
||||
int ret;
|
||||
|
||||
ret = rnbd_clt_get_path_name(dev, dev->blk_symlink_name,
|
||||
sizeof(dev->blk_symlink_name));
|
||||
if (ret) {
|
||||
rnbd_clt_err(dev, "Failed to get /sys/block symlink path, err: %d\n",
|
||||
ret);
|
||||
goto out_err;
|
||||
}
|
||||
|
||||
ret = sysfs_create_link(rnbd_devs_kobj, gd_kobj,
|
||||
dev->blk_symlink_name);
|
||||
if (ret) {
|
||||
rnbd_clt_err(dev, "Creating /sys/block symlink failed, err: %d\n",
|
||||
ret);
|
||||
goto out_err;
|
||||
}
|
||||
|
||||
return 0;
|
||||
|
||||
out_err:
|
||||
dev->blk_symlink_name[0] = '\0';
|
||||
return ret;
|
||||
}
|
||||
|
||||
static ssize_t rnbd_clt_map_device_store(struct kobject *kobj,
|
||||
struct kobj_attribute *attr,
|
||||
const char *buf, size_t count)
|
||||
{
|
||||
struct rnbd_clt_dev *dev;
|
||||
struct rnbd_map_options opt;
|
||||
int ret;
|
||||
char pathname[NAME_MAX];
|
||||
char sessname[NAME_MAX];
|
||||
enum rnbd_access_mode access_mode = RNBD_ACCESS_RW;
|
||||
u16 port_nr = RTRS_PORT;
|
||||
|
||||
struct sockaddr_storage *addrs;
|
||||
struct rtrs_addr paths[6];
|
||||
size_t path_cnt;
|
||||
|
||||
opt.sessname = sessname;
|
||||
opt.paths = paths;
|
||||
opt.path_cnt = &path_cnt;
|
||||
opt.pathname = pathname;
|
||||
opt.dest_port = &port_nr;
|
||||
opt.access_mode = &access_mode;
|
||||
addrs = kcalloc(ARRAY_SIZE(paths) * 2, sizeof(*addrs), GFP_KERNEL);
|
||||
if (!addrs)
|
||||
return -ENOMEM;
|
||||
|
||||
for (path_cnt = 0; path_cnt < ARRAY_SIZE(paths); path_cnt++) {
|
||||
paths[path_cnt].src = &addrs[path_cnt * 2];
|
||||
paths[path_cnt].dst = &addrs[path_cnt * 2 + 1];
|
||||
}
|
||||
|
||||
ret = rnbd_clt_parse_map_options(buf, ARRAY_SIZE(paths), &opt);
|
||||
if (ret)
|
||||
goto out;
|
||||
|
||||
pr_info("Mapping device %s on session %s, (access_mode: %s)\n",
|
||||
pathname, sessname,
|
||||
rnbd_access_mode_str(access_mode));
|
||||
|
||||
dev = rnbd_clt_map_device(sessname, paths, path_cnt, port_nr, pathname,
|
||||
access_mode);
|
||||
if (IS_ERR(dev)) {
|
||||
ret = PTR_ERR(dev);
|
||||
goto out;
|
||||
}
|
||||
|
||||
ret = rnbd_clt_add_dev_kobj(dev);
|
||||
if (ret)
|
||||
goto unmap_dev;
|
||||
|
||||
ret = rnbd_clt_add_dev_symlink(dev);
|
||||
if (ret)
|
||||
goto unmap_dev;
|
||||
|
||||
kfree(addrs);
|
||||
return count;
|
||||
|
||||
unmap_dev:
|
||||
rnbd_clt_unmap_device(dev, true, NULL);
|
||||
out:
|
||||
kfree(addrs);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static struct kobj_attribute rnbd_clt_map_device_attr =
|
||||
__ATTR(map_device, 0644,
|
||||
rnbd_clt_map_device_show, rnbd_clt_map_device_store);
|
||||
|
||||
static struct attribute *default_attrs[] = {
|
||||
&rnbd_clt_map_device_attr.attr,
|
||||
NULL,
|
||||
};
|
||||
|
||||
static struct attribute_group default_attr_group = {
|
||||
.attrs = default_attrs,
|
||||
};
|
||||
|
||||
static const struct attribute_group *default_attr_groups[] = {
|
||||
&default_attr_group,
|
||||
NULL,
|
||||
};
|
||||
|
||||
int rnbd_clt_create_sysfs_files(void)
|
||||
{
|
||||
int err;
|
||||
|
||||
rnbd_dev_class = class_create(THIS_MODULE, "rnbd-client");
|
||||
if (IS_ERR(rnbd_dev_class))
|
||||
return PTR_ERR(rnbd_dev_class);
|
||||
|
||||
rnbd_dev = device_create_with_groups(rnbd_dev_class, NULL,
|
||||
MKDEV(0, 0), NULL,
|
||||
default_attr_groups, "ctl");
|
||||
if (IS_ERR(rnbd_dev)) {
|
||||
err = PTR_ERR(rnbd_dev);
|
||||
goto cls_destroy;
|
||||
}
|
||||
rnbd_devs_kobj = kobject_create_and_add("devices", &rnbd_dev->kobj);
|
||||
if (!rnbd_devs_kobj) {
|
||||
err = -ENOMEM;
|
||||
goto dev_destroy;
|
||||
}
|
||||
|
||||
return 0;
|
||||
|
||||
dev_destroy:
|
||||
device_destroy(rnbd_dev_class, MKDEV(0, 0));
|
||||
cls_destroy:
|
||||
class_destroy(rnbd_dev_class);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
void rnbd_clt_destroy_default_group(void)
|
||||
{
|
||||
sysfs_remove_group(&rnbd_dev->kobj, &default_attr_group);
|
||||
}
|
||||
|
||||
void rnbd_clt_destroy_sysfs_files(void)
|
||||
{
|
||||
kobject_del(rnbd_devs_kobj);
|
||||
kobject_put(rnbd_devs_kobj);
|
||||
device_destroy(rnbd_dev_class, MKDEV(0, 0));
|
||||
class_destroy(rnbd_dev_class);
|
||||
}
|
drivers/block/rnbd/rnbd-clt.c (new file, 1729 lines; diff suppressed because it is too large)
drivers/block/rnbd/rnbd-clt.h (new file, 156 lines)
@@ -0,0 +1,156 @@
|
||||
/* SPDX-License-Identifier: GPL-2.0-or-later */
|
||||
/*
|
||||
* RDMA Network Block Driver
|
||||
*
|
||||
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
|
||||
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
|
||||
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
|
||||
*/
|
||||
|
||||
#ifndef RNBD_CLT_H
|
||||
#define RNBD_CLT_H
|
||||
|
||||
#include <linux/wait.h>
|
||||
#include <linux/in.h>
|
||||
#include <linux/inet.h>
|
||||
#include <linux/blk-mq.h>
|
||||
#include <linux/refcount.h>
|
||||
|
||||
#include <rtrs.h>
|
||||
#include "rnbd-proto.h"
|
||||
#include "rnbd-log.h"
|
||||
|
||||
/* Max. number of segments per IO request, Mellanox Connect X ~ Connect X5,
|
||||
* choose the minimal 30 for all, minus 1 for the internal protocol, so 29.
|
||||
*/
|
||||
#define BMAX_SEGMENTS 29
|
||||
/* time in seconds between reconnect tries, default to 30 s */
|
||||
#define RECONNECT_DELAY 30
|
||||
/*
|
||||
* Number of times to reconnect on error before giving up, 0 for disabled,
|
||||
* -1 for forever
|
||||
*/
|
||||
#define MAX_RECONNECTS -1
|
||||
|
||||
enum rnbd_clt_dev_state {
|
||||
DEV_STATE_INIT,
|
||||
DEV_STATE_MAPPED,
|
||||
DEV_STATE_MAPPED_DISCONNECTED,
|
||||
DEV_STATE_UNMAPPED,
|
||||
};
|
||||
|
||||
struct rnbd_iu_comp {
|
||||
wait_queue_head_t wait;
|
||||
int errno;
|
||||
};
|
||||
|
||||
struct rnbd_iu {
|
||||
union {
|
||||
struct request *rq; /* for block io */
|
||||
void *buf; /* for user messages */
|
||||
};
|
||||
struct rtrs_permit *permit;
|
||||
union {
|
||||
/* use to send msg associated with a dev */
|
||||
struct rnbd_clt_dev *dev;
|
||||
/* use to send msg associated with a sess */
|
||||
struct rnbd_clt_session *sess;
|
||||
};
|
||||
struct scatterlist sglist[BMAX_SEGMENTS];
|
||||
struct work_struct work;
|
||||
int errno;
|
||||
struct rnbd_iu_comp comp;
|
||||
atomic_t refcount;
|
||||
};
|
||||
|
||||
struct rnbd_cpu_qlist {
|
||||
struct list_head requeue_list;
|
||||
spinlock_t requeue_lock;
|
||||
unsigned int cpu;
|
||||
};
|
||||
|
||||
struct rnbd_clt_session {
|
||||
struct list_head list;
|
||||
struct rtrs_clt *rtrs;
|
||||
wait_queue_head_t rtrs_waitq;
|
||||
bool rtrs_ready;
|
||||
struct rnbd_cpu_qlist __percpu
|
||||
*cpu_queues;
|
||||
DECLARE_BITMAP(cpu_queues_bm, NR_CPUS);
|
||||
int __percpu *cpu_rr; /* per-cpu var for CPU round-robin */
|
||||
atomic_t busy;
|
||||
int queue_depth;
|
||||
u32 max_io_size;
|
||||
struct blk_mq_tag_set tag_set;
|
||||
struct mutex lock; /* protects state and devs_list */
|
||||
struct list_head devs_list; /* list of struct rnbd_clt_dev */
|
||||
refcount_t refcount;
|
||||
char sessname[NAME_MAX];
|
||||
u8 ver; /* protocol version */
|
||||
};
|
||||
|
||||
/**
|
||||
* Submission queues.
|
||||
*/
|
||||
struct rnbd_queue {
|
||||
struct list_head requeue_list;
|
||||
unsigned long in_list;
|
||||
struct rnbd_clt_dev *dev;
|
||||
struct blk_mq_hw_ctx *hctx;
|
||||
};
|
||||
|
||||
struct rnbd_clt_dev {
|
||||
struct rnbd_clt_session *sess;
|
||||
struct request_queue *queue;
|
||||
struct rnbd_queue *hw_queues;
|
||||
u32 device_id;
|
||||
/* local Idr index - used to track minor number allocations. */
|
||||
u32 clt_device_id;
|
||||
struct mutex lock;
|
||||
enum rnbd_clt_dev_state dev_state;
|
||||
char pathname[NAME_MAX];
|
||||
enum rnbd_access_mode access_mode;
|
||||
bool read_only;
|
||||
bool rotational;
|
||||
u32 max_hw_sectors;
|
||||
u32 max_write_same_sectors;
|
||||
u32 max_discard_sectors;
|
||||
u32 discard_granularity;
|
||||
u32 discard_alignment;
|
||||
u16 secure_discard;
|
||||
u16 physical_block_size;
|
||||
u16 logical_block_size;
|
||||
u16 max_segments;
|
||||
size_t nsectors;
|
||||
u64 size; /* device size in bytes */
|
||||
struct list_head list;
|
||||
struct gendisk *gd;
|
||||
struct kobject kobj;
|
||||
char blk_symlink_name[NAME_MAX];
|
||||
refcount_t refcount;
|
||||
struct work_struct unmap_on_rmmod_work;
|
||||
};
|
||||
|
||||
/* rnbd-clt.c */
|
||||
|
||||
struct rnbd_clt_dev *rnbd_clt_map_device(const char *sessname,
|
||||
struct rtrs_addr *paths,
|
||||
size_t path_cnt, u16 port_nr,
|
||||
const char *pathname,
|
||||
enum rnbd_access_mode access_mode);
|
||||
int rnbd_clt_unmap_device(struct rnbd_clt_dev *dev, bool force,
|
||||
const struct attribute *sysfs_self);
|
||||
|
||||
int rnbd_clt_remap_device(struct rnbd_clt_dev *dev);
|
||||
int rnbd_clt_resize_disk(struct rnbd_clt_dev *dev, size_t newsize);
|
||||
|
||||
/* rnbd-clt-sysfs.c */
|
||||
|
||||
int rnbd_clt_create_sysfs_files(void);
|
||||
|
||||
void rnbd_clt_destroy_sysfs_files(void);
|
||||
void rnbd_clt_destroy_default_group(void);
|
||||
|
||||
void rnbd_clt_remove_dev_symlink(struct rnbd_clt_dev *dev);
|
||||
|
||||
#endif /* RNBD_CLT_H */
|
drivers/block/rnbd/rnbd-common.c (new file, 23 lines)
@@ -0,0 +1,23 @@
|
||||
// SPDX-License-Identifier: GPL-2.0-or-later
|
||||
/*
|
||||
* RDMA Network Block Driver
|
||||
*
|
||||
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
|
||||
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
|
||||
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
|
||||
*/
|
||||
#include "rnbd-proto.h"
|
||||
|
||||
const char *rnbd_access_mode_str(enum rnbd_access_mode mode)
|
||||
{
|
||||
switch (mode) {
|
||||
case RNBD_ACCESS_RO:
|
||||
return "ro";
|
||||
case RNBD_ACCESS_RW:
|
||||
return "rw";
|
||||
case RNBD_ACCESS_MIGRATION:
|
||||
return "migration";
|
||||
default:
|
||||
return "unknown";
|
||||
}
|
||||
}
|
drivers/block/rnbd/rnbd-log.h (new file, 41 lines)
@@ -0,0 +1,41 @@
|
||||
/* SPDX-License-Identifier: GPL-2.0-or-later */
|
||||
/*
|
||||
* RDMA Network Block Driver
|
||||
*
|
||||
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
|
||||
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
|
||||
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
|
||||
*/
|
||||
#ifndef RNBD_LOG_H
|
||||
#define RNBD_LOG_H
|
||||
|
||||
#include "rnbd-clt.h"
|
||||
#include "rnbd-srv.h"
|
||||
|
||||
#define rnbd_clt_log(fn, dev, fmt, ...) ( \
|
||||
fn("<%s@%s> " fmt, (dev)->pathname, \
|
||||
(dev)->sess->sessname, \
|
||||
##__VA_ARGS__))
|
||||
#define rnbd_srv_log(fn, dev, fmt, ...) ( \
|
||||
fn("<%s@%s>: " fmt, (dev)->pathname, \
|
||||
(dev)->sess->sessname, ##__VA_ARGS__))
|
||||
|
||||
#define rnbd_clt_err(dev, fmt, ...) \
|
||||
rnbd_clt_log(pr_err, dev, fmt, ##__VA_ARGS__)
|
||||
#define rnbd_clt_err_rl(dev, fmt, ...) \
|
||||
rnbd_clt_log(pr_err_ratelimited, dev, fmt, ##__VA_ARGS__)
|
||||
#define rnbd_clt_info(dev, fmt, ...) \
|
||||
rnbd_clt_log(pr_info, dev, fmt, ##__VA_ARGS__)
|
||||
#define rnbd_clt_info_rl(dev, fmt, ...) \
|
||||
rnbd_clt_log(pr_info_ratelimited, dev, fmt, ##__VA_ARGS__)
|
||||
|
||||
#define rnbd_srv_err(dev, fmt, ...) \
|
||||
rnbd_srv_log(pr_err, dev, fmt, ##__VA_ARGS__)
|
||||
#define rnbd_srv_err_rl(dev, fmt, ...) \
|
||||
rnbd_srv_log(pr_err_ratelimited, dev, fmt, ##__VA_ARGS__)
|
||||
#define rnbd_srv_info(dev, fmt, ...) \
|
||||
rnbd_srv_log(pr_info, dev, fmt, ##__VA_ARGS__)
|
||||
#define rnbd_srv_info_rl(dev, fmt, ...) \
|
||||
rnbd_srv_log(pr_info_ratelimited, dev, fmt, ##__VA_ARGS__)
|
||||
|
||||
#endif /* RNBD_LOG_H */
|
drivers/block/rnbd/rnbd-proto.h (new file, 303 lines)
@@ -0,0 +1,303 @@
|
||||
/* SPDX-License-Identifier: GPL-2.0-or-later */
|
||||
/*
|
||||
* RDMA Network Block Driver
|
||||
*
|
||||
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
|
||||
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
|
||||
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
|
||||
*/
|
||||
#ifndef RNBD_PROTO_H
|
||||
#define RNBD_PROTO_H
|
||||
|
||||
#include <linux/types.h>
|
||||
#include <linux/blkdev.h>
|
||||
#include <linux/limits.h>
|
||||
#include <linux/inet.h>
|
||||
#include <linux/in.h>
|
||||
#include <linux/in6.h>
|
||||
#include <rdma/ib.h>
|
||||
|
||||
#define RNBD_PROTO_VER_MAJOR 2
|
||||
#define RNBD_PROTO_VER_MINOR 0
|
||||
|
||||
/* The default port number the RTRS server is listening on. */
|
||||
#define RTRS_PORT 1234
|
||||
|
||||
/**
|
||||
* enum rnbd_msg_types - RNBD message types
|
||||
* @RNBD_MSG_SESS_INFO: initial session info from client to server
|
||||
* @RNBD_MSG_SESS_INFO_RSP: initial session info from server to client
|
||||
* @RNBD_MSG_OPEN: open (map) device request
|
||||
* @RNBD_MSG_OPEN_RSP: response to an @RNBD_MSG_OPEN
|
||||
* @RNBD_MSG_IO: block IO request operation
|
||||
* @RNBD_MSG_CLOSE: close (unmap) device request
|
||||
*/
|
||||
enum rnbd_msg_type {
|
||||
RNBD_MSG_SESS_INFO,
|
||||
RNBD_MSG_SESS_INFO_RSP,
|
||||
RNBD_MSG_OPEN,
|
||||
RNBD_MSG_OPEN_RSP,
|
||||
RNBD_MSG_IO,
|
||||
RNBD_MSG_CLOSE,
|
||||
};
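/*
 * Informal overview of the exchange implied by the message types above
 * (added here only as an illustration, not part of the original header):
 *
 *	client				server
 *	RNBD_MSG_SESS_INFO	------>
 *				<------	RNBD_MSG_SESS_INFO_RSP (negotiated ver)
 *	RNBD_MSG_OPEN		------>
 *				<------	RNBD_MSG_OPEN_RSP (device_id, geometry)
 *	RNBD_MSG_IO (device_id)	------>	block I/O executed, completion
 *					reported back through RTRS
 *	RNBD_MSG_CLOSE		------>	device unmapped on the server
 */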
|
||||
|
||||
/**
|
||||
* struct rnbd_msg_hdr - header of RNBD messages
|
||||
* @type: Message type, valid values see: enum rnbd_msg_types
|
||||
*/
|
||||
struct rnbd_msg_hdr {
|
||||
__le16 type;
|
||||
__le16 __padding;
|
||||
};
|
||||
|
||||
/**
|
||||
* A device may be mapped RO by many sessions at once, but RW by only one.
* One additional RW mapping is allowed when MIGRATION is requested (a second
* RW export can be required, for example, for VM migration).
|
||||
*/
|
||||
enum rnbd_access_mode {
|
||||
RNBD_ACCESS_RO,
|
||||
RNBD_ACCESS_RW,
|
||||
RNBD_ACCESS_MIGRATION,
|
||||
};
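/*
 * Illustration (added comment, not in the original header) of how the server
 * applies the policy above when several sessions map the same device; see
 * rnbd_srv_check_update_open_perm() in rnbd-srv.c:
 *
 *	RNBD_ACCESS_RO		always allowed
 *	RNBD_ACCESS_RW		allowed only while no other writer is open
 *	RNBD_ACCESS_MIGRATION	allowed as a second writer (open_write_cnt < 2)
 */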
|
||||
|
||||
/**
|
||||
* struct rnbd_msg_sess_info - initial session info from client to server
|
||||
* @hdr: message header
|
||||
* @ver: RNBD protocol version
|
||||
*/
|
||||
struct rnbd_msg_sess_info {
|
||||
struct rnbd_msg_hdr hdr;
|
||||
u8 ver;
|
||||
u8 reserved[31];
|
||||
};
|
||||
|
||||
/**
|
||||
* struct rnbd_msg_sess_info_rsp - initial session info from server to client
|
||||
* @hdr: message header
|
||||
* @ver: RNBD protocol version
|
||||
*/
|
||||
struct rnbd_msg_sess_info_rsp {
|
||||
struct rnbd_msg_hdr hdr;
|
||||
u8 ver;
|
||||
u8 reserved[31];
|
||||
};
|
||||
|
||||
/**
|
||||
* struct rnbd_msg_open - request to open a remote device.
|
||||
* @hdr: message header
|
||||
* @access_mode: the mode to open remote device, valid values see:
|
||||
* enum rnbd_access_mode
|
||||
* @device_name: device path on remote side
|
||||
*/
|
||||
struct rnbd_msg_open {
|
||||
struct rnbd_msg_hdr hdr;
|
||||
u8 access_mode;
|
||||
u8 resv1;
|
||||
s8 dev_name[NAME_MAX];
|
||||
u8 reserved[3];
|
||||
};
|
||||
|
||||
/**
|
||||
* struct rnbd_msg_close - request to close a remote device.
|
||||
* @hdr: message header
|
||||
* @device_id: device_id on server side to identify the device
|
||||
*/
|
||||
struct rnbd_msg_close {
|
||||
struct rnbd_msg_hdr hdr;
|
||||
__le32 device_id;
|
||||
};
|
||||
|
||||
/**
|
||||
* struct rnbd_msg_open_rsp - response message to RNBD_MSG_OPEN
|
||||
* @hdr: message header
|
||||
* @device_id: device_id on server side to identify the device
|
||||
* @nsectors: number of sectors in the usual 512b unit
|
||||
* @max_hw_sectors: max hardware sectors in the usual 512b unit
|
||||
* @max_write_same_sectors: max sectors for WRITE SAME in the 512b unit
|
||||
* @max_discard_sectors: max. sectors that can be discarded at once in 512b
|
||||
* unit.
|
||||
* @discard_granularity: size of the internal discard allocation unit in bytes
|
||||
* @discard_alignment: offset from internal allocation assignment in bytes
|
||||
* @physical_block_size: physical block size device supports in bytes
|
||||
* @logical_block_size: logical block size device supports in bytes
|
||||
* @max_segments: max segments hardware support in one transfer
|
||||
* @secure_discard: supports secure discard
|
||||
* @rotational: set if the device is a rotational disk
|
||||
*/
|
||||
struct rnbd_msg_open_rsp {
|
||||
struct rnbd_msg_hdr hdr;
|
||||
__le32 device_id;
|
||||
__le64 nsectors;
|
||||
__le32 max_hw_sectors;
|
||||
__le32 max_write_same_sectors;
|
||||
__le32 max_discard_sectors;
|
||||
__le32 discard_granularity;
|
||||
__le32 discard_alignment;
|
||||
__le16 physical_block_size;
|
||||
__le16 logical_block_size;
|
||||
__le16 max_segments;
|
||||
__le16 secure_discard;
|
||||
u8 rotational;
|
||||
u8 reserved[11];
|
||||
};
|
||||
|
||||
/**
|
||||
* struct rnbd_msg_io - message for I/O read/write
|
||||
* @hdr: message header
|
||||
* @device_id: device_id on server side to find the right device
|
||||
* @sector: bi_sector attribute from struct bio
|
||||
* @rw: valid values are defined in enum rnbd_io_flags
|
||||
* @bi_size: number of bytes for I/O read/write
|
||||
* @prio: priority
|
||||
*/
|
||||
struct rnbd_msg_io {
|
||||
struct rnbd_msg_hdr hdr;
|
||||
__le32 device_id;
|
||||
__le64 sector;
|
||||
__le32 rw;
|
||||
__le32 bi_size;
|
||||
__le16 prio;
|
||||
};
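/*
 * Sketch (illustrative only, not from the patch) of how the client side
 * might fill this message for a block request; the client code is outside
 * this header, so the field sources shown here are assumptions:
 *
 *	msg.hdr.type  = cpu_to_le16(RNBD_MSG_IO);
 *	msg.device_id = cpu_to_le32(dev->device_id);
 *	msg.sector    = cpu_to_le64(blk_rq_pos(rq));
 *	msg.rw        = cpu_to_le32(rq_to_rnbd_flags(rq));
 *	msg.bi_size   = cpu_to_le32(blk_rq_bytes(rq));
 *	msg.prio      = cpu_to_le16(req_get_ioprio(rq));
 */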
|
||||
|
||||
#define RNBD_OP_BITS 8
|
||||
#define RNBD_OP_MASK ((1 << RNBD_OP_BITS) - 1)
|
||||
|
||||
/**
|
||||
* enum rnbd_io_flags - RNBD request types from rq_flag_bits
|
||||
* @RNBD_OP_READ: read sectors from the device
|
||||
* @RNBD_OP_WRITE: write sectors to the device
|
||||
* @RNBD_OP_FLUSH: flush the volatile write cache
|
||||
* @RNBD_OP_DISCARD: discard sectors
|
||||
* @RNBD_OP_SECURE_ERASE: securely erase sectors
|
||||
* @RNBD_OP_WRITE_SAME: write the same sectors many times
|
||||
|
||||
* @RNBD_F_SYNC: request is sync (sync write or read)
|
||||
* @RNBD_F_FUA: forced unit access
|
||||
*/
|
||||
enum rnbd_io_flags {
|
||||
|
||||
/* Operations */
|
||||
|
||||
RNBD_OP_READ = 0,
|
||||
RNBD_OP_WRITE = 1,
|
||||
RNBD_OP_FLUSH = 2,
|
||||
RNBD_OP_DISCARD = 3,
|
||||
RNBD_OP_SECURE_ERASE = 4,
|
||||
RNBD_OP_WRITE_SAME = 5,
|
||||
|
||||
RNBD_OP_LAST,
|
||||
|
||||
/* Flags */
|
||||
|
||||
RNBD_F_SYNC = 1<<(RNBD_OP_BITS + 0),
|
||||
RNBD_F_FUA = 1<<(RNBD_OP_BITS + 1),
|
||||
|
||||
RNBD_F_ALL = (RNBD_F_SYNC | RNBD_F_FUA)
|
||||
|
||||
};
|
||||
|
||||
static inline u32 rnbd_op(u32 flags)
|
||||
{
|
||||
return flags & RNBD_OP_MASK;
|
||||
}
|
||||
|
||||
static inline u32 rnbd_flags(u32 flags)
|
||||
{
|
||||
return flags & ~RNBD_OP_MASK;
|
||||
}
|
||||
|
||||
static inline bool rnbd_flags_supported(u32 flags)
|
||||
{
|
||||
u32 op;
|
||||
|
||||
op = rnbd_op(flags);
|
||||
flags = rnbd_flags(flags);
|
||||
|
||||
if (op >= RNBD_OP_LAST)
|
||||
return false;
|
||||
if (flags & ~RNBD_F_ALL)
|
||||
return false;
|
||||
|
||||
return true;
|
||||
}
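/*
 * Example (illustrative comment only): a sync FUA write is encoded with the
 * opcode in the low RNBD_OP_BITS bits and the flag bits above them, so for
 *
 *	u32 opf = RNBD_OP_WRITE | RNBD_F_SYNC | RNBD_F_FUA;
 *
 * rnbd_op(opf) returns RNBD_OP_WRITE, rnbd_flags(opf) returns
 * RNBD_F_SYNC | RNBD_F_FUA, and rnbd_flags_supported(opf) returns true.
 */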
|
||||
|
||||
static inline u32 rnbd_to_bio_flags(u32 rnbd_opf)
|
||||
{
|
||||
u32 bio_opf;
|
||||
|
||||
switch (rnbd_op(rnbd_opf)) {
|
||||
case RNBD_OP_READ:
|
||||
bio_opf = REQ_OP_READ;
|
||||
break;
|
||||
case RNBD_OP_WRITE:
|
||||
bio_opf = REQ_OP_WRITE;
|
||||
break;
|
||||
case RNBD_OP_FLUSH:
|
||||
bio_opf = REQ_OP_FLUSH | REQ_PREFLUSH;
|
||||
break;
|
||||
case RNBD_OP_DISCARD:
|
||||
bio_opf = REQ_OP_DISCARD;
|
||||
break;
|
||||
case RNBD_OP_SECURE_ERASE:
|
||||
bio_opf = REQ_OP_SECURE_ERASE;
|
||||
break;
|
||||
case RNBD_OP_WRITE_SAME:
|
||||
bio_opf = REQ_OP_WRITE_SAME;
|
||||
break;
|
||||
default:
|
||||
WARN(1, "Unknown RNBD type: %d (flags %d)\n",
|
||||
rnbd_op(rnbd_opf), rnbd_opf);
|
||||
bio_opf = 0;
|
||||
}
|
||||
|
||||
if (rnbd_opf & RNBD_F_SYNC)
|
||||
bio_opf |= REQ_SYNC;
|
||||
|
||||
if (rnbd_opf & RNBD_F_FUA)
|
||||
bio_opf |= REQ_FUA;
|
||||
|
||||
return bio_opf;
|
||||
}
|
||||
|
||||
static inline u32 rq_to_rnbd_flags(struct request *rq)
|
||||
{
|
||||
u32 rnbd_opf;
|
||||
|
||||
switch (req_op(rq)) {
|
||||
case REQ_OP_READ:
|
||||
rnbd_opf = RNBD_OP_READ;
|
||||
break;
|
||||
case REQ_OP_WRITE:
|
||||
rnbd_opf = RNBD_OP_WRITE;
|
||||
break;
|
||||
case REQ_OP_DISCARD:
|
||||
rnbd_opf = RNBD_OP_DISCARD;
|
||||
break;
|
||||
case REQ_OP_SECURE_ERASE:
|
||||
rnbd_opf = RNBD_OP_SECURE_ERASE;
|
||||
break;
|
||||
case REQ_OP_WRITE_SAME:
|
||||
rnbd_opf = RNBD_OP_WRITE_SAME;
|
||||
break;
|
||||
case REQ_OP_FLUSH:
|
||||
rnbd_opf = RNBD_OP_FLUSH;
|
||||
break;
|
||||
default:
|
||||
WARN(1, "Unknown request type %d (flags %llu)\n",
|
||||
req_op(rq), (unsigned long long)rq->cmd_flags);
|
||||
rnbd_opf = 0;
|
||||
}
|
||||
|
||||
if (op_is_sync(rq->cmd_flags))
|
||||
rnbd_opf |= RNBD_F_SYNC;
|
||||
|
||||
if (op_is_flush(rq->cmd_flags))
|
||||
rnbd_opf |= RNBD_F_FUA;
|
||||
|
||||
return rnbd_opf;
|
||||
}
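/*
 * Round-trip illustration (added comment, not in the original file): a write
 * request carrying REQ_SYNC | REQ_FUA is translated by rq_to_rnbd_flags()
 * into RNBD_OP_WRITE | RNBD_F_SYNC | RNBD_F_FUA for the wire, and the server
 * maps it back with rnbd_to_bio_flags() to REQ_OP_WRITE | REQ_SYNC | REQ_FUA
 * before submitting the bio.
 */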
|
||||
|
||||
const char *rnbd_access_mode_str(enum rnbd_access_mode mode);
|
||||
|
||||
#endif /* RNBD_PROTO_H */
|
drivers/block/rnbd/rnbd-srv-dev.c (new file, 134 lines)
@@ -0,0 +1,134 @@
|
||||
// SPDX-License-Identifier: GPL-2.0-or-later
|
||||
/*
|
||||
* RDMA Network Block Driver
|
||||
*
|
||||
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
|
||||
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
|
||||
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
|
||||
*/
|
||||
#undef pr_fmt
|
||||
#define pr_fmt(fmt) KBUILD_MODNAME " L" __stringify(__LINE__) ": " fmt
|
||||
|
||||
#include "rnbd-srv-dev.h"
|
||||
#include "rnbd-log.h"
|
||||
|
||||
struct rnbd_dev *rnbd_dev_open(const char *path, fmode_t flags,
|
||||
struct bio_set *bs)
|
||||
{
|
||||
struct rnbd_dev *dev;
|
||||
int ret;
|
||||
|
||||
dev = kzalloc(sizeof(*dev), GFP_KERNEL);
|
||||
if (!dev)
|
||||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
dev->blk_open_flags = flags;
|
||||
dev->bdev = blkdev_get_by_path(path, flags, THIS_MODULE);
|
||||
ret = PTR_ERR_OR_ZERO(dev->bdev);
|
||||
if (ret)
|
||||
goto err;
|
||||
|
||||
dev->blk_open_flags = flags;
|
||||
bdevname(dev->bdev, dev->name);
|
||||
dev->ibd_bio_set = bs;
|
||||
|
||||
return dev;
|
||||
|
||||
err:
|
||||
kfree(dev);
|
||||
return ERR_PTR(ret);
|
||||
}
|
||||
|
||||
void rnbd_dev_close(struct rnbd_dev *dev)
|
||||
{
|
||||
blkdev_put(dev->bdev, dev->blk_open_flags);
|
||||
kfree(dev);
|
||||
}
|
||||
|
||||
static void rnbd_dev_bi_end_io(struct bio *bio)
|
||||
{
|
||||
struct rnbd_dev_blk_io *io = bio->bi_private;
|
||||
|
||||
rnbd_endio(io->priv, blk_status_to_errno(bio->bi_status));
|
||||
bio_put(bio);
|
||||
}
|
||||
|
||||
/**
|
||||
* rnbd_bio_map_kern - map kernel address into bio
|
||||
* @data: pointer to buffer to map
|
||||
* @bs: bio_set to use.
|
||||
* @len: length in bytes
|
||||
* @gfp_mask: allocation flags for bio allocation
|
||||
*
|
||||
* Map the kernel address into a bio suitable for io to a block
|
||||
* device. Returns an error pointer in case of error.
|
||||
*/
|
||||
static struct bio *rnbd_bio_map_kern(void *data, struct bio_set *bs,
|
||||
unsigned int len, gfp_t gfp_mask)
|
||||
{
|
||||
unsigned long kaddr = (unsigned long)data;
|
||||
unsigned long end = (kaddr + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
|
||||
unsigned long start = kaddr >> PAGE_SHIFT;
|
||||
const int nr_pages = end - start;
|
||||
int offset, i;
|
||||
struct bio *bio;
|
||||
|
||||
bio = bio_alloc_bioset(gfp_mask, nr_pages, bs);
|
||||
if (!bio)
|
||||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
offset = offset_in_page(kaddr);
|
||||
for (i = 0; i < nr_pages; i++) {
|
||||
unsigned int bytes = PAGE_SIZE - offset;
|
||||
|
||||
if (len <= 0)
|
||||
break;
|
||||
|
||||
if (bytes > len)
|
||||
bytes = len;
|
||||
|
||||
if (bio_add_page(bio, virt_to_page(data), bytes,
|
||||
offset) < bytes) {
|
||||
/* we don't support partial mappings */
|
||||
bio_put(bio);
|
||||
return ERR_PTR(-EINVAL);
|
||||
}
|
||||
|
||||
data += bytes;
|
||||
len -= bytes;
|
||||
offset = 0;
|
||||
}
|
||||
|
||||
bio->bi_end_io = bio_put;
|
||||
return bio;
|
||||
}
|
||||
|
||||
int rnbd_dev_submit_io(struct rnbd_dev *dev, sector_t sector, void *data,
|
||||
size_t len, u32 bi_size, enum rnbd_io_flags flags,
|
||||
short prio, void *priv)
|
||||
{
|
||||
struct rnbd_dev_blk_io *io;
|
||||
struct bio *bio;
|
||||
|
||||
/* Generate bio with pages pointing to the rdma buffer */
|
||||
bio = rnbd_bio_map_kern(data, dev->ibd_bio_set, len, GFP_KERNEL);
|
||||
if (IS_ERR(bio))
|
||||
return PTR_ERR(bio);
|
||||
|
||||
io = container_of(bio, struct rnbd_dev_blk_io, bio);
|
||||
|
||||
io->dev = dev;
|
||||
io->priv = priv;
|
||||
|
||||
bio->bi_end_io = rnbd_dev_bi_end_io;
|
||||
bio->bi_private = io;
|
||||
bio->bi_opf = rnbd_to_bio_flags(flags);
|
||||
bio->bi_iter.bi_sector = sector;
|
||||
bio->bi_iter.bi_size = bi_size;
|
||||
bio_set_prio(bio, prio);
|
||||
bio_set_dev(bio, dev->bdev);
|
||||
|
||||
submit_bio(bio);
|
||||
|
||||
return 0;
|
||||
}
|
drivers/block/rnbd/rnbd-srv-dev.h (new file, 92 lines)
@@ -0,0 +1,92 @@
|
||||
/* SPDX-License-Identifier: GPL-2.0-or-later */
|
||||
/*
|
||||
* RDMA Network Block Driver
|
||||
*
|
||||
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
|
||||
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
|
||||
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
|
||||
*/
|
||||
#ifndef RNBD_SRV_DEV_H
|
||||
#define RNBD_SRV_DEV_H
|
||||
|
||||
#include <linux/fs.h>
|
||||
#include "rnbd-proto.h"
|
||||
|
||||
struct rnbd_dev {
|
||||
struct block_device *bdev;
|
||||
struct bio_set *ibd_bio_set;
|
||||
fmode_t blk_open_flags;
|
||||
char name[BDEVNAME_SIZE];
|
||||
};
|
||||
|
||||
struct rnbd_dev_blk_io {
|
||||
struct rnbd_dev *dev;
|
||||
void *priv;
|
||||
/* have to be last member for front_pad usage of bioset_init */
|
||||
struct bio bio;
|
||||
};
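/*
 * Note added for illustration: the embedded bio must remain the last member
 * so that bioset_init(..., offsetof(struct rnbd_dev_blk_io, bio), ...) (done
 * per session in rnbd-srv.c) reserves the leading part of each allocation
 * for this struct, letting rnbd_dev_submit_io() recover it with
 *
 *	io = container_of(bio, struct rnbd_dev_blk_io, bio);
 */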
|
||||
|
||||
/**
|
||||
* rnbd_dev_open() - Open a device
* @path: path to the block device to open
* @flags: open flags
* @bs: bio_set to use during block I/O
|
||||
*/
|
||||
struct rnbd_dev *rnbd_dev_open(const char *path, fmode_t flags,
|
||||
struct bio_set *bs);
|
||||
|
||||
/**
|
||||
* rnbd_dev_close() - Close a device
|
||||
*/
|
||||
void rnbd_dev_close(struct rnbd_dev *dev);
|
||||
|
||||
void rnbd_endio(void *priv, int error);
|
||||
|
||||
static inline int rnbd_dev_get_max_segs(const struct rnbd_dev *dev)
|
||||
{
|
||||
return queue_max_segments(bdev_get_queue(dev->bdev));
|
||||
}
|
||||
|
||||
static inline int rnbd_dev_get_max_hw_sects(const struct rnbd_dev *dev)
|
||||
{
|
||||
return queue_max_hw_sectors(bdev_get_queue(dev->bdev));
|
||||
}
|
||||
|
||||
static inline int rnbd_dev_get_secure_discard(const struct rnbd_dev *dev)
|
||||
{
|
||||
return blk_queue_secure_erase(bdev_get_queue(dev->bdev));
|
||||
}
|
||||
|
||||
static inline int rnbd_dev_get_max_discard_sects(const struct rnbd_dev *dev)
|
||||
{
|
||||
if (!blk_queue_discard(bdev_get_queue(dev->bdev)))
|
||||
return 0;
|
||||
|
||||
return blk_queue_get_max_sectors(bdev_get_queue(dev->bdev),
|
||||
REQ_OP_DISCARD);
|
||||
}
|
||||
|
||||
static inline int rnbd_dev_get_discard_granularity(const struct rnbd_dev *dev)
|
||||
{
|
||||
return bdev_get_queue(dev->bdev)->limits.discard_granularity;
|
||||
}
|
||||
|
||||
static inline int rnbd_dev_get_discard_alignment(const struct rnbd_dev *dev)
|
||||
{
|
||||
return bdev_get_queue(dev->bdev)->limits.discard_alignment;
|
||||
}
|
||||
|
||||
/**
|
||||
* rnbd_dev_submit_io() - Submit an I/O to the disk
|
||||
* @dev: device to that the I/O is submitted
|
||||
* @sector: address to read/write data to
|
||||
* @data: I/O data to write or buffer to read I/O data into
|
||||
* @len: length of @data
|
||||
* @bi_size: Amount of data that will be read/written
|
||||
* @flags: I/O flags, see enum rnbd_io_flags
* @prio: I/O priority
* @priv: private data passed to rnbd_endio() on completion
|
||||
*/
|
||||
int rnbd_dev_submit_io(struct rnbd_dev *dev, sector_t sector, void *data,
|
||||
size_t len, u32 bi_size, enum rnbd_io_flags flags,
|
||||
short prio, void *priv);
|
||||
|
||||
#endif /* RNBD_SRV_DEV_H */
|
drivers/block/rnbd/rnbd-srv-sysfs.c (new file, 215 lines)
@@ -0,0 +1,215 @@
|
||||
// SPDX-License-Identifier: GPL-2.0-or-later
|
||||
/*
|
||||
* RDMA Network Block Driver
|
||||
*
|
||||
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
|
||||
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
|
||||
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
|
||||
*/
|
||||
#undef pr_fmt
|
||||
#define pr_fmt(fmt) KBUILD_MODNAME " L" __stringify(__LINE__) ": " fmt
|
||||
|
||||
#include <uapi/linux/limits.h>
|
||||
#include <linux/kobject.h>
|
||||
#include <linux/sysfs.h>
|
||||
#include <linux/stat.h>
|
||||
#include <linux/genhd.h>
|
||||
#include <linux/list.h>
|
||||
#include <linux/moduleparam.h>
|
||||
#include <linux/device.h>
|
||||
|
||||
#include "rnbd-srv.h"
|
||||
|
||||
static struct device *rnbd_dev;
|
||||
static struct class *rnbd_dev_class;
|
||||
static struct kobject *rnbd_devs_kobj;
|
||||
|
||||
static void rnbd_srv_dev_release(struct kobject *kobj)
|
||||
{
|
||||
struct rnbd_srv_dev *dev;
|
||||
|
||||
dev = container_of(kobj, struct rnbd_srv_dev, dev_kobj);
|
||||
|
||||
kfree(dev);
|
||||
}
|
||||
|
||||
static struct kobj_type dev_ktype = {
|
||||
.sysfs_ops = &kobj_sysfs_ops,
|
||||
.release = rnbd_srv_dev_release
|
||||
};
|
||||
|
||||
int rnbd_srv_create_dev_sysfs(struct rnbd_srv_dev *dev,
|
||||
struct block_device *bdev,
|
||||
const char *dev_name)
|
||||
{
|
||||
struct kobject *bdev_kobj;
|
||||
int ret;
|
||||
|
||||
ret = kobject_init_and_add(&dev->dev_kobj, &dev_ktype,
|
||||
rnbd_devs_kobj, dev_name);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
dev->dev_sessions_kobj = kobject_create_and_add("sessions",
|
||||
&dev->dev_kobj);
|
||||
if (!dev->dev_sessions_kobj)
|
||||
goto put_dev_kobj;
|
||||
|
||||
bdev_kobj = &disk_to_dev(bdev->bd_disk)->kobj;
|
||||
ret = sysfs_create_link(&dev->dev_kobj, bdev_kobj, "block_dev");
|
||||
if (ret)
|
||||
goto put_sess_kobj;
|
||||
|
||||
return 0;
|
||||
|
||||
put_sess_kobj:
|
||||
kobject_put(dev->dev_sessions_kobj);
|
||||
put_dev_kobj:
|
||||
kobject_put(&dev->dev_kobj);
|
||||
return ret;
|
||||
}
|
||||
|
||||
void rnbd_srv_destroy_dev_sysfs(struct rnbd_srv_dev *dev)
|
||||
{
|
||||
sysfs_remove_link(&dev->dev_kobj, "block_dev");
|
||||
kobject_del(dev->dev_sessions_kobj);
|
||||
kobject_put(dev->dev_sessions_kobj);
|
||||
kobject_del(&dev->dev_kobj);
|
||||
kobject_put(&dev->dev_kobj);
|
||||
}
|
||||
|
||||
static ssize_t read_only_show(struct kobject *kobj, struct kobj_attribute *attr,
|
||||
char *page)
|
||||
{
|
||||
struct rnbd_srv_sess_dev *sess_dev;
|
||||
|
||||
sess_dev = container_of(kobj, struct rnbd_srv_sess_dev, kobj);
|
||||
|
||||
return scnprintf(page, PAGE_SIZE, "%d\n",
|
||||
!(sess_dev->open_flags & FMODE_WRITE));
|
||||
}
|
||||
|
||||
static struct kobj_attribute rnbd_srv_dev_session_ro_attr =
|
||||
__ATTR_RO(read_only);
|
||||
|
||||
static ssize_t access_mode_show(struct kobject *kobj,
|
||||
struct kobj_attribute *attr,
|
||||
char *page)
|
||||
{
|
||||
struct rnbd_srv_sess_dev *sess_dev;
|
||||
|
||||
sess_dev = container_of(kobj, struct rnbd_srv_sess_dev, kobj);
|
||||
|
||||
return scnprintf(page, PAGE_SIZE, "%s\n",
|
||||
rnbd_access_mode_str(sess_dev->access_mode));
|
||||
}
|
||||
|
||||
static struct kobj_attribute rnbd_srv_dev_session_access_mode_attr =
|
||||
__ATTR_RO(access_mode);
|
||||
|
||||
static ssize_t mapping_path_show(struct kobject *kobj,
|
||||
struct kobj_attribute *attr, char *page)
|
||||
{
|
||||
struct rnbd_srv_sess_dev *sess_dev;
|
||||
|
||||
sess_dev = container_of(kobj, struct rnbd_srv_sess_dev, kobj);
|
||||
|
||||
return scnprintf(page, PAGE_SIZE, "%s\n", sess_dev->pathname);
|
||||
}
|
||||
|
||||
static struct kobj_attribute rnbd_srv_dev_session_mapping_path_attr =
|
||||
__ATTR_RO(mapping_path);
|
||||
|
||||
static struct attribute *rnbd_srv_default_dev_sessions_attrs[] = {
|
||||
&rnbd_srv_dev_session_access_mode_attr.attr,
|
||||
&rnbd_srv_dev_session_ro_attr.attr,
|
||||
&rnbd_srv_dev_session_mapping_path_attr.attr,
|
||||
NULL,
|
||||
};
|
||||
|
||||
static struct attribute_group rnbd_srv_default_dev_session_attr_group = {
|
||||
.attrs = rnbd_srv_default_dev_sessions_attrs,
|
||||
};
|
||||
|
||||
void rnbd_srv_destroy_dev_session_sysfs(struct rnbd_srv_sess_dev *sess_dev)
|
||||
{
|
||||
sysfs_remove_group(&sess_dev->kobj,
|
||||
&rnbd_srv_default_dev_session_attr_group);
|
||||
|
||||
kobject_del(&sess_dev->kobj);
|
||||
kobject_put(&sess_dev->kobj);
|
||||
}
|
||||
|
||||
static void rnbd_srv_sess_dev_release(struct kobject *kobj)
|
||||
{
|
||||
struct rnbd_srv_sess_dev *sess_dev;
|
||||
|
||||
sess_dev = container_of(kobj, struct rnbd_srv_sess_dev, kobj);
|
||||
rnbd_destroy_sess_dev(sess_dev);
|
||||
}
|
||||
|
||||
static struct kobj_type rnbd_srv_sess_dev_ktype = {
|
||||
.sysfs_ops = &kobj_sysfs_ops,
|
||||
.release = rnbd_srv_sess_dev_release,
|
||||
};
|
||||
|
||||
int rnbd_srv_create_dev_session_sysfs(struct rnbd_srv_sess_dev *sess_dev)
|
||||
{
|
||||
int ret;
|
||||
|
||||
ret = kobject_init_and_add(&sess_dev->kobj, &rnbd_srv_sess_dev_ktype,
|
||||
sess_dev->dev->dev_sessions_kobj, "%s",
|
||||
sess_dev->sess->sessname);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
ret = sysfs_create_group(&sess_dev->kobj,
|
||||
&rnbd_srv_default_dev_session_attr_group);
|
||||
if (ret)
|
||||
goto err;
|
||||
|
||||
return 0;
|
||||
|
||||
err:
|
||||
kobject_put(&sess_dev->kobj);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
int rnbd_srv_create_sysfs_files(void)
|
||||
{
|
||||
int err;
|
||||
|
||||
rnbd_dev_class = class_create(THIS_MODULE, "rnbd-server");
|
||||
if (IS_ERR(rnbd_dev_class))
|
||||
return PTR_ERR(rnbd_dev_class);
|
||||
|
||||
rnbd_dev = device_create(rnbd_dev_class, NULL,
|
||||
MKDEV(0, 0), NULL, "ctl");
|
||||
if (IS_ERR(rnbd_dev)) {
|
||||
err = PTR_ERR(rnbd_dev);
|
||||
goto cls_destroy;
|
||||
}
|
||||
rnbd_devs_kobj = kobject_create_and_add("devices", &rnbd_dev->kobj);
|
||||
if (!rnbd_devs_kobj) {
|
||||
err = -ENOMEM;
|
||||
goto dev_destroy;
|
||||
}
|
||||
|
||||
return 0;
|
||||
|
||||
dev_destroy:
|
||||
device_destroy(rnbd_dev_class, MKDEV(0, 0));
|
||||
cls_destroy:
|
||||
class_destroy(rnbd_dev_class);
|
||||
|
||||
return err;
|
||||
}
|
||||
|
||||
void rnbd_srv_destroy_sysfs_files(void)
|
||||
{
|
||||
kobject_del(rnbd_devs_kobj);
|
||||
kobject_put(rnbd_devs_kobj);
|
||||
device_destroy(rnbd_dev_class, MKDEV(0, 0));
|
||||
class_destroy(rnbd_dev_class);
|
||||
}
|
drivers/block/rnbd/rnbd-srv.c (new file, 844 lines)
@@ -0,0 +1,844 @@
|
||||
// SPDX-License-Identifier: GPL-2.0-or-later
|
||||
/*
|
||||
* RDMA Network Block Driver
|
||||
*
|
||||
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
|
||||
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
|
||||
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
|
||||
*/
|
||||
#undef pr_fmt
|
||||
#define pr_fmt(fmt) KBUILD_MODNAME " L" __stringify(__LINE__) ": " fmt
|
||||
|
||||
#include <linux/module.h>
|
||||
#include <linux/blkdev.h>
|
||||
|
||||
#include "rnbd-srv.h"
|
||||
#include "rnbd-srv-dev.h"
|
||||
|
||||
MODULE_DESCRIPTION("RDMA Network Block Device Server");
|
||||
MODULE_LICENSE("GPL");
|
||||
|
||||
static u16 port_nr = RTRS_PORT;
|
||||
|
||||
module_param_named(port_nr, port_nr, ushort, 0444);
|
||||
MODULE_PARM_DESC(port_nr,
|
||||
"The port number the server is listening on (default: "
|
||||
__stringify(RTRS_PORT)")");
|
||||
|
||||
#define DEFAULT_DEV_SEARCH_PATH "/"
|
||||
|
||||
static char dev_search_path[PATH_MAX] = DEFAULT_DEV_SEARCH_PATH;
|
||||
|
||||
static int dev_search_path_set(const char *val, const struct kernel_param *kp)
|
||||
{
|
||||
const char *p = strrchr(val, '\n') ? : val + strlen(val);
|
||||
|
||||
if (strlen(val) >= sizeof(dev_search_path))
|
||||
return -EINVAL;
|
||||
|
||||
snprintf(dev_search_path, sizeof(dev_search_path), "%.*s",
|
||||
(int)(p - val), val);
|
||||
|
||||
pr_info("dev_search_path changed to '%s'\n", dev_search_path);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static struct kparam_string dev_search_path_kparam_str = {
|
||||
.maxlen = sizeof(dev_search_path),
|
||||
.string = dev_search_path
|
||||
};
|
||||
|
||||
static const struct kernel_param_ops dev_search_path_ops = {
|
||||
.set = dev_search_path_set,
|
||||
.get = param_get_string,
|
||||
};
|
||||
|
||||
module_param_cb(dev_search_path, &dev_search_path_ops,
|
||||
&dev_search_path_kparam_str, 0444);
|
||||
MODULE_PARM_DESC(dev_search_path,
|
||||
"Sets the dev_search_path. When a device is mapped this path is prepended to the device path from the map device operation. If %SESSNAME% is specified in a path, then device will be searched in a session namespace. (default: "
|
||||
DEFAULT_DEV_SEARCH_PATH ")");
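/*
 * Worked example (comment added for clarity; the path and session names are
 * hypothetical): with dev_search_path set to "/run/rnbd/%SESSNAME%", a client
 * on session "client1" that maps "vol0" makes the server open
 * "/run/rnbd/client1/vol0", since rnbd_srv_get_full_path() substitutes the
 * session name for %SESSNAME% and then collapses duplicated slashes.
 */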
|
||||
|
||||
static DEFINE_MUTEX(sess_lock);
|
||||
static DEFINE_SPINLOCK(dev_lock);
|
||||
|
||||
static LIST_HEAD(sess_list);
|
||||
static LIST_HEAD(dev_list);
|
||||
|
||||
struct rnbd_io_private {
|
||||
struct rtrs_srv_op *id;
|
||||
struct rnbd_srv_sess_dev *sess_dev;
|
||||
};
|
||||
|
||||
static void rnbd_sess_dev_release(struct kref *kref)
|
||||
{
|
||||
struct rnbd_srv_sess_dev *sess_dev;
|
||||
|
||||
sess_dev = container_of(kref, struct rnbd_srv_sess_dev, kref);
|
||||
complete(sess_dev->destroy_comp);
|
||||
}
|
||||
|
||||
static inline void rnbd_put_sess_dev(struct rnbd_srv_sess_dev *sess_dev)
|
||||
{
|
||||
kref_put(&sess_dev->kref, rnbd_sess_dev_release);
|
||||
}
|
||||
|
||||
void rnbd_endio(void *priv, int error)
|
||||
{
|
||||
struct rnbd_io_private *rnbd_priv = priv;
|
||||
struct rnbd_srv_sess_dev *sess_dev = rnbd_priv->sess_dev;
|
||||
|
||||
rnbd_put_sess_dev(sess_dev);
|
||||
|
||||
rtrs_srv_resp_rdma(rnbd_priv->id, error);
|
||||
|
||||
kfree(priv);
|
||||
}
|
||||
|
||||
static struct rnbd_srv_sess_dev *
|
||||
rnbd_get_sess_dev(int dev_id, struct rnbd_srv_session *srv_sess)
|
||||
{
|
||||
struct rnbd_srv_sess_dev *sess_dev;
|
||||
int ret = 0;
|
||||
|
||||
rcu_read_lock();
|
||||
sess_dev = xa_load(&srv_sess->index_idr, dev_id);
|
||||
if (likely(sess_dev))
|
||||
ret = kref_get_unless_zero(&sess_dev->kref);
|
||||
rcu_read_unlock();
|
||||
|
||||
if (!sess_dev || !ret)
|
||||
return ERR_PTR(-ENXIO);
|
||||
|
||||
return sess_dev;
|
||||
}
|
||||
|
||||
static int process_rdma(struct rtrs_srv *sess,
|
||||
struct rnbd_srv_session *srv_sess,
|
||||
struct rtrs_srv_op *id, void *data, u32 datalen,
|
||||
const void *usr, size_t usrlen)
|
||||
{
|
||||
const struct rnbd_msg_io *msg = usr;
|
||||
struct rnbd_io_private *priv;
|
||||
struct rnbd_srv_sess_dev *sess_dev;
|
||||
u32 dev_id;
|
||||
int err;
|
||||
|
||||
priv = kmalloc(sizeof(*priv), GFP_KERNEL);
|
||||
if (!priv)
|
||||
return -ENOMEM;
|
||||
|
||||
dev_id = le32_to_cpu(msg->device_id);
|
||||
|
||||
sess_dev = rnbd_get_sess_dev(dev_id, srv_sess);
|
||||
if (IS_ERR(sess_dev)) {
|
||||
pr_err_ratelimited("Got I/O request on session %s for unknown device id %d\n",
|
||||
srv_sess->sessname, dev_id);
|
||||
err = -ENOTCONN;
|
||||
goto err;
|
||||
}
|
||||
|
||||
priv->sess_dev = sess_dev;
|
||||
priv->id = id;
|
||||
|
||||
err = rnbd_dev_submit_io(sess_dev->rnbd_dev, le64_to_cpu(msg->sector),
|
||||
data, datalen, le32_to_cpu(msg->bi_size),
|
||||
le32_to_cpu(msg->rw),
|
||||
srv_sess->ver < RNBD_PROTO_VER_MAJOR ||
|
||||
usrlen < sizeof(*msg) ?
|
||||
0 : le16_to_cpu(msg->prio), priv);
|
||||
if (unlikely(err)) {
|
||||
rnbd_srv_err(sess_dev, "Submitting I/O to device failed, err: %d\n",
|
||||
err);
|
||||
goto sess_dev_put;
|
||||
}
|
||||
|
||||
return 0;
|
||||
|
||||
sess_dev_put:
|
||||
rnbd_put_sess_dev(sess_dev);
|
||||
err:
|
||||
kfree(priv);
|
||||
return err;
|
||||
}
|
||||
|
||||
static void destroy_device(struct rnbd_srv_dev *dev)
|
||||
{
|
||||
WARN_ONCE(!list_empty(&dev->sess_dev_list),
|
||||
"Device %s is being destroyed but still in use!\n",
|
||||
dev->id);
|
||||
|
||||
spin_lock(&dev_lock);
|
||||
list_del(&dev->list);
|
||||
spin_unlock(&dev_lock);
|
||||
|
||||
mutex_destroy(&dev->lock);
|
||||
if (dev->dev_kobj.state_in_sysfs)
|
||||
/*
|
||||
* Destroy kobj only if it was really created.
|
||||
*/
|
||||
rnbd_srv_destroy_dev_sysfs(dev);
|
||||
else
|
||||
kfree(dev);
|
||||
}
|
||||
|
||||
static void destroy_device_cb(struct kref *kref)
|
||||
{
|
||||
struct rnbd_srv_dev *dev;
|
||||
|
||||
dev = container_of(kref, struct rnbd_srv_dev, kref);
|
||||
|
||||
destroy_device(dev);
|
||||
}
|
||||
|
||||
static void rnbd_put_srv_dev(struct rnbd_srv_dev *dev)
|
||||
{
|
||||
kref_put(&dev->kref, destroy_device_cb);
|
||||
}
|
||||
|
||||
void rnbd_destroy_sess_dev(struct rnbd_srv_sess_dev *sess_dev)
|
||||
{
|
||||
DECLARE_COMPLETION_ONSTACK(dc);
|
||||
|
||||
xa_erase(&sess_dev->sess->index_idr, sess_dev->device_id);
|
||||
synchronize_rcu();
|
||||
sess_dev->destroy_comp = &dc;
|
||||
rnbd_put_sess_dev(sess_dev);
|
||||
wait_for_completion(&dc); /* wait for inflights to drop to zero */
|
||||
|
||||
rnbd_dev_close(sess_dev->rnbd_dev);
|
||||
list_del(&sess_dev->sess_list);
|
||||
mutex_lock(&sess_dev->dev->lock);
|
||||
list_del(&sess_dev->dev_list);
|
||||
if (sess_dev->open_flags & FMODE_WRITE)
|
||||
sess_dev->dev->open_write_cnt--;
|
||||
mutex_unlock(&sess_dev->dev->lock);
|
||||
|
||||
rnbd_put_srv_dev(sess_dev->dev);
|
||||
|
||||
rnbd_srv_info(sess_dev, "Device closed\n");
|
||||
kfree(sess_dev);
|
||||
}
|
||||
|
||||
static void destroy_sess(struct rnbd_srv_session *srv_sess)
|
||||
{
|
||||
struct rnbd_srv_sess_dev *sess_dev, *tmp;
|
||||
|
||||
if (list_empty(&srv_sess->sess_dev_list))
|
||||
goto out;
|
||||
|
||||
mutex_lock(&srv_sess->lock);
|
||||
list_for_each_entry_safe(sess_dev, tmp, &srv_sess->sess_dev_list,
|
||||
sess_list)
|
||||
rnbd_srv_destroy_dev_session_sysfs(sess_dev);
|
||||
mutex_unlock(&srv_sess->lock);
|
||||
|
||||
out:
|
||||
xa_destroy(&srv_sess->index_idr);
|
||||
bioset_exit(&srv_sess->sess_bio_set);
|
||||
|
||||
pr_info("RTRS Session %s disconnected\n", srv_sess->sessname);
|
||||
|
||||
mutex_lock(&sess_lock);
|
||||
list_del(&srv_sess->list);
|
||||
mutex_unlock(&sess_lock);
|
||||
|
||||
mutex_destroy(&srv_sess->lock);
|
||||
kfree(srv_sess);
|
||||
}
|
||||
|
||||
static int create_sess(struct rtrs_srv *rtrs)
|
||||
{
|
||||
struct rnbd_srv_session *srv_sess;
|
||||
char sessname[NAME_MAX];
|
||||
int err;
|
||||
|
||||
err = rtrs_srv_get_sess_name(rtrs, sessname, sizeof(sessname));
|
||||
if (err) {
|
||||
pr_err("rtrs_srv_get_sess_name(%s): %d\n", sessname, err);
|
||||
|
||||
return err;
|
||||
}
|
||||
srv_sess = kzalloc(sizeof(*srv_sess), GFP_KERNEL);
|
||||
if (!srv_sess)
|
||||
return -ENOMEM;
|
||||
|
||||
srv_sess->queue_depth = rtrs_srv_get_queue_depth(rtrs);
|
||||
err = bioset_init(&srv_sess->sess_bio_set, srv_sess->queue_depth,
|
||||
offsetof(struct rnbd_dev_blk_io, bio),
|
||||
BIOSET_NEED_BVECS);
|
||||
if (err) {
|
||||
pr_err("Allocating srv_session for session %s failed\n",
|
||||
sessname);
|
||||
kfree(srv_sess);
|
||||
return err;
|
||||
}
|
||||
|
||||
xa_init_flags(&srv_sess->index_idr, XA_FLAGS_ALLOC);
|
||||
INIT_LIST_HEAD(&srv_sess->sess_dev_list);
|
||||
mutex_init(&srv_sess->lock);
|
||||
mutex_lock(&sess_lock);
|
||||
list_add(&srv_sess->list, &sess_list);
|
||||
mutex_unlock(&sess_lock);
|
||||
|
||||
srv_sess->rtrs = rtrs;
|
||||
strlcpy(srv_sess->sessname, sessname, sizeof(srv_sess->sessname));
|
||||
|
||||
rtrs_srv_set_sess_priv(rtrs, srv_sess);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int rnbd_srv_link_ev(struct rtrs_srv *rtrs,
|
||||
enum rtrs_srv_link_ev ev, void *priv)
|
||||
{
|
||||
struct rnbd_srv_session *srv_sess = priv;
|
||||
|
||||
switch (ev) {
|
||||
case RTRS_SRV_LINK_EV_CONNECTED:
|
||||
return create_sess(rtrs);
|
||||
|
||||
case RTRS_SRV_LINK_EV_DISCONNECTED:
|
||||
if (WARN_ON_ONCE(!srv_sess))
|
||||
return -EINVAL;
|
||||
|
||||
destroy_sess(srv_sess);
|
||||
return 0;
|
||||
|
||||
default:
|
||||
pr_warn("Received unknown RTRS session event %d from session %s\n",
|
||||
ev, srv_sess->sessname);
|
||||
return -EINVAL;
|
||||
}
|
||||
}
|
||||
|
||||
static int process_msg_close(struct rtrs_srv *rtrs,
|
||||
struct rnbd_srv_session *srv_sess,
|
||||
void *data, size_t datalen, const void *usr,
|
||||
size_t usrlen)
|
||||
{
|
||||
const struct rnbd_msg_close *close_msg = usr;
|
||||
struct rnbd_srv_sess_dev *sess_dev;
|
||||
|
||||
sess_dev = rnbd_get_sess_dev(le32_to_cpu(close_msg->device_id),
|
||||
srv_sess);
|
||||
if (IS_ERR(sess_dev))
|
||||
return 0;
|
||||
|
||||
rnbd_put_sess_dev(sess_dev);
|
||||
mutex_lock(&srv_sess->lock);
|
||||
rnbd_srv_destroy_dev_session_sysfs(sess_dev);
|
||||
mutex_unlock(&srv_sess->lock);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int process_msg_open(struct rtrs_srv *rtrs,
|
||||
struct rnbd_srv_session *srv_sess,
|
||||
const void *msg, size_t len,
|
||||
void *data, size_t datalen);
|
||||
|
||||
static int process_msg_sess_info(struct rtrs_srv *rtrs,
|
||||
struct rnbd_srv_session *srv_sess,
|
||||
const void *msg, size_t len,
|
||||
void *data, size_t datalen);
|
||||
|
||||
static int rnbd_srv_rdma_ev(struct rtrs_srv *rtrs, void *priv,
|
||||
struct rtrs_srv_op *id, int dir,
|
||||
void *data, size_t datalen, const void *usr,
|
||||
size_t usrlen)
|
||||
{
|
||||
struct rnbd_srv_session *srv_sess = priv;
|
||||
const struct rnbd_msg_hdr *hdr = usr;
|
||||
int ret = 0;
|
||||
u16 type;
|
||||
|
||||
if (WARN_ON_ONCE(!srv_sess))
|
||||
return -ENODEV;
|
||||
|
||||
type = le16_to_cpu(hdr->type);
|
||||
|
||||
switch (type) {
|
||||
case RNBD_MSG_IO:
|
||||
return process_rdma(rtrs, srv_sess, id, data, datalen, usr,
|
||||
usrlen);
|
||||
case RNBD_MSG_CLOSE:
|
||||
ret = process_msg_close(rtrs, srv_sess, data, datalen,
|
||||
usr, usrlen);
|
||||
break;
|
||||
case RNBD_MSG_OPEN:
|
||||
ret = process_msg_open(rtrs, srv_sess, usr, usrlen,
|
||||
data, datalen);
|
||||
break;
|
||||
case RNBD_MSG_SESS_INFO:
|
||||
ret = process_msg_sess_info(rtrs, srv_sess, usr, usrlen,
|
||||
data, datalen);
|
||||
break;
|
||||
default:
|
||||
pr_warn("Received unexpected message type %d with dir %d from session %s\n",
|
||||
type, dir, srv_sess->sessname);
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
rtrs_srv_resp_rdma(id, ret);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static struct rnbd_srv_sess_dev
|
||||
*rnbd_sess_dev_alloc(struct rnbd_srv_session *srv_sess)
|
||||
{
|
||||
struct rnbd_srv_sess_dev *sess_dev;
|
||||
int error;
|
||||
|
||||
sess_dev = kzalloc(sizeof(*sess_dev), GFP_KERNEL);
|
||||
if (!sess_dev)
|
||||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
error = xa_alloc(&srv_sess->index_idr, &sess_dev->device_id, sess_dev,
|
||||
xa_limit_32b, GFP_NOWAIT);
|
||||
if (error < 0) {
|
||||
pr_warn("Allocating idr failed, err: %d\n", error);
|
||||
kfree(sess_dev);
|
||||
return ERR_PTR(error);
|
||||
}
|
||||
|
||||
return sess_dev;
|
||||
}
|
||||
|
||||
static struct rnbd_srv_dev *rnbd_srv_init_srv_dev(const char *id)
|
||||
{
|
||||
struct rnbd_srv_dev *dev;
|
||||
|
||||
dev = kzalloc(sizeof(*dev), GFP_KERNEL);
|
||||
if (!dev)
|
||||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
strlcpy(dev->id, id, sizeof(dev->id));
|
||||
kref_init(&dev->kref);
|
||||
INIT_LIST_HEAD(&dev->sess_dev_list);
|
||||
mutex_init(&dev->lock);
|
||||
|
||||
return dev;
|
||||
}
|
||||
|
||||
static struct rnbd_srv_dev *
|
||||
rnbd_srv_find_or_add_srv_dev(struct rnbd_srv_dev *new_dev)
|
||||
{
|
||||
struct rnbd_srv_dev *dev;
|
||||
|
||||
spin_lock(&dev_lock);
|
||||
list_for_each_entry(dev, &dev_list, list) {
|
||||
if (!strncmp(dev->id, new_dev->id, sizeof(dev->id))) {
|
||||
if (!kref_get_unless_zero(&dev->kref))
|
||||
/*
|
||||
* We lost the race, device is almost dead.
|
||||
* Continue traversing to find a valid one.
|
||||
*/
|
||||
continue;
|
||||
spin_unlock(&dev_lock);
|
||||
return dev;
|
||||
}
|
||||
}
|
||||
list_add(&new_dev->list, &dev_list);
|
||||
spin_unlock(&dev_lock);
|
||||
|
||||
return new_dev;
|
||||
}
|
||||
|
||||
static int rnbd_srv_check_update_open_perm(struct rnbd_srv_dev *srv_dev,
|
||||
struct rnbd_srv_session *srv_sess,
|
||||
enum rnbd_access_mode access_mode)
|
||||
{
|
||||
int ret = -EPERM;
|
||||
|
||||
mutex_lock(&srv_dev->lock);
|
||||
|
||||
switch (access_mode) {
|
||||
case RNBD_ACCESS_RO:
|
||||
ret = 0;
|
||||
break;
|
||||
case RNBD_ACCESS_RW:
|
||||
if (srv_dev->open_write_cnt == 0) {
|
||||
srv_dev->open_write_cnt++;
|
||||
ret = 0;
|
||||
} else {
|
||||
pr_err("Mapping device '%s' for session %s with RW permissions failed. Device already opened as 'RW' by %d client(s), access mode %s.\n",
|
||||
srv_dev->id, srv_sess->sessname,
|
||||
srv_dev->open_write_cnt,
|
||||
rnbd_access_mode_str(access_mode));
|
||||
}
|
||||
break;
|
||||
case RNBD_ACCESS_MIGRATION:
|
||||
if (srv_dev->open_write_cnt < 2) {
|
||||
srv_dev->open_write_cnt++;
|
||||
ret = 0;
|
||||
} else {
|
||||
pr_err("Mapping device '%s' for session %s with migration permissions failed. Device already opened as 'RW' by %d client(s), access mode %s.\n",
|
||||
srv_dev->id, srv_sess->sessname,
|
||||
srv_dev->open_write_cnt,
|
||||
rnbd_access_mode_str(access_mode));
|
||||
}
|
||||
break;
|
||||
default:
|
||||
pr_err("Received mapping request for device '%s' on session %s with invalid access mode: %d\n",
|
||||
srv_dev->id, srv_sess->sessname, access_mode);
|
||||
ret = -EINVAL;
|
||||
}
|
||||
|
||||
mutex_unlock(&srv_dev->lock);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static struct rnbd_srv_dev *
|
||||
rnbd_srv_get_or_create_srv_dev(struct rnbd_dev *rnbd_dev,
|
||||
struct rnbd_srv_session *srv_sess,
|
||||
enum rnbd_access_mode access_mode)
|
||||
{
|
||||
int ret;
|
||||
struct rnbd_srv_dev *new_dev, *dev;
|
||||
|
||||
new_dev = rnbd_srv_init_srv_dev(rnbd_dev->name);
|
||||
if (IS_ERR(new_dev))
|
||||
return new_dev;
|
||||
|
||||
dev = rnbd_srv_find_or_add_srv_dev(new_dev);
|
||||
if (dev != new_dev)
|
||||
kfree(new_dev);
|
||||
|
||||
ret = rnbd_srv_check_update_open_perm(dev, srv_sess, access_mode);
|
||||
if (ret) {
|
||||
rnbd_put_srv_dev(dev);
|
||||
return ERR_PTR(ret);
|
||||
}
|
||||
|
||||
return dev;
|
||||
}
|
||||
|
||||
static void rnbd_srv_fill_msg_open_rsp(struct rnbd_msg_open_rsp *rsp,
|
||||
struct rnbd_srv_sess_dev *sess_dev)
|
||||
{
|
||||
struct rnbd_dev *rnbd_dev = sess_dev->rnbd_dev;
|
||||
|
||||
rsp->hdr.type = cpu_to_le16(RNBD_MSG_OPEN_RSP);
|
||||
rsp->device_id =
|
||||
cpu_to_le32(sess_dev->device_id);
|
||||
rsp->nsectors =
|
||||
cpu_to_le64(get_capacity(rnbd_dev->bdev->bd_disk));
|
||||
rsp->logical_block_size =
|
||||
cpu_to_le16(bdev_logical_block_size(rnbd_dev->bdev));
|
||||
rsp->physical_block_size =
|
||||
cpu_to_le16(bdev_physical_block_size(rnbd_dev->bdev));
|
||||
rsp->max_segments =
|
||||
cpu_to_le16(rnbd_dev_get_max_segs(rnbd_dev));
|
||||
rsp->max_hw_sectors =
|
||||
cpu_to_le32(rnbd_dev_get_max_hw_sects(rnbd_dev));
|
||||
rsp->max_write_same_sectors =
|
||||
cpu_to_le32(bdev_write_same(rnbd_dev->bdev));
|
||||
rsp->max_discard_sectors =
|
||||
cpu_to_le32(rnbd_dev_get_max_discard_sects(rnbd_dev));
|
||||
rsp->discard_granularity =
|
||||
cpu_to_le32(rnbd_dev_get_discard_granularity(rnbd_dev));
|
||||
rsp->discard_alignment =
|
||||
cpu_to_le32(rnbd_dev_get_discard_alignment(rnbd_dev));
|
||||
rsp->secure_discard =
|
||||
cpu_to_le16(rnbd_dev_get_secure_discard(rnbd_dev));
|
||||
rsp->rotational =
|
||||
!blk_queue_nonrot(bdev_get_queue(rnbd_dev->bdev));
|
||||
}
|
||||
|
||||
static struct rnbd_srv_sess_dev *
|
||||
rnbd_srv_create_set_sess_dev(struct rnbd_srv_session *srv_sess,
|
||||
const struct rnbd_msg_open *open_msg,
|
||||
struct rnbd_dev *rnbd_dev, fmode_t open_flags,
|
||||
struct rnbd_srv_dev *srv_dev)
|
||||
{
|
||||
struct rnbd_srv_sess_dev *sdev = rnbd_sess_dev_alloc(srv_sess);
|
||||
|
||||
if (IS_ERR(sdev))
|
||||
return sdev;
|
||||
|
||||
kref_init(&sdev->kref);
|
||||
|
||||
strlcpy(sdev->pathname, open_msg->dev_name, sizeof(sdev->pathname));
|
||||
|
||||
sdev->rnbd_dev = rnbd_dev;
|
||||
sdev->sess = srv_sess;
|
||||
sdev->dev = srv_dev;
|
||||
sdev->open_flags = open_flags;
|
||||
sdev->access_mode = open_msg->access_mode;
|
||||
|
||||
return sdev;
|
||||
}
|
||||
|
||||
static char *rnbd_srv_get_full_path(struct rnbd_srv_session *srv_sess,
|
||||
const char *dev_name)
|
||||
{
|
||||
char *full_path;
|
||||
char *a, *b;
|
||||
|
||||
full_path = kmalloc(PATH_MAX, GFP_KERNEL);
|
||||
if (!full_path)
|
||||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
/*
|
||||
* Replace %SESSNAME% with a real session name in order to
|
||||
* create device namespace.
|
||||
*/
|
||||
a = strnstr(dev_search_path, "%SESSNAME%", sizeof(dev_search_path));
|
||||
if (a) {
|
||||
int len = a - dev_search_path;
|
||||
|
||||
len = snprintf(full_path, PATH_MAX, "%.*s/%s/%s", len,
|
||||
dev_search_path, srv_sess->sessname, dev_name);
|
||||
if (len >= PATH_MAX) {
|
||||
pr_err("Too long path: %s, %s, %s\n",
|
||||
dev_search_path, srv_sess->sessname, dev_name);
|
||||
kfree(full_path);
|
||||
return ERR_PTR(-EINVAL);
|
||||
}
|
||||
} else {
|
||||
snprintf(full_path, PATH_MAX, "%s/%s",
|
||||
dev_search_path, dev_name);
|
||||
}
|
||||
|
||||
/* eliminate duplicated slashes */
|
||||
a = strchr(full_path, '/');
|
||||
b = a;
|
||||
while (*b != '\0') {
|
||||
if (*b == '/' && *a == '/') {
|
||||
b++;
|
||||
} else {
|
||||
a++;
|
||||
*a = *b;
|
||||
b++;
|
||||
}
|
||||
}
|
||||
a++;
|
||||
*a = '\0';
|
||||
|
||||
return full_path;
|
||||
}
|
||||
|
||||
static int process_msg_sess_info(struct rtrs_srv *rtrs,
|
||||
struct rnbd_srv_session *srv_sess,
|
||||
const void *msg, size_t len,
|
||||
void *data, size_t datalen)
|
||||
{
|
||||
const struct rnbd_msg_sess_info *sess_info_msg = msg;
|
||||
struct rnbd_msg_sess_info_rsp *rsp = data;
|
||||
|
||||
srv_sess->ver = min_t(u8, sess_info_msg->ver, RNBD_PROTO_VER_MAJOR);
|
||||
pr_debug("Session %s using protocol version %d (client version: %d, server version: %d)\n",
|
||||
srv_sess->sessname, srv_sess->ver,
|
||||
sess_info_msg->ver, RNBD_PROTO_VER_MAJOR);
|
||||
|
||||
rsp->hdr.type = cpu_to_le16(RNBD_MSG_SESS_INFO_RSP);
|
||||
rsp->ver = srv_sess->ver;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* find_srv_sess_dev() - check whether a device with this name is already open
* @srv_sess: the session to search.
* @dev_name: string containing the name of the device.
*
* Return struct rnbd_srv_sess_dev if srv_sess has already opened dev_name,
* or NULL if the session has not opened the device yet.
|
||||
*/
|
||||
static struct rnbd_srv_sess_dev *
|
||||
find_srv_sess_dev(struct rnbd_srv_session *srv_sess, const char *dev_name)
|
||||
{
|
||||
struct rnbd_srv_sess_dev *sess_dev;
|
||||
|
||||
if (list_empty(&srv_sess->sess_dev_list))
|
||||
return NULL;
|
||||
|
||||
list_for_each_entry(sess_dev, &srv_sess->sess_dev_list, sess_list)
|
||||
if (!strcmp(sess_dev->pathname, dev_name))
|
||||
return sess_dev;
|
||||
|
||||
return NULL;
|
||||
}
|
||||
|
||||
static int process_msg_open(struct rtrs_srv *rtrs,
|
||||
struct rnbd_srv_session *srv_sess,
|
||||
const void *msg, size_t len,
|
||||
void *data, size_t datalen)
|
||||
{
|
||||
int ret;
|
||||
struct rnbd_srv_dev *srv_dev;
|
||||
struct rnbd_srv_sess_dev *srv_sess_dev;
|
||||
const struct rnbd_msg_open *open_msg = msg;
|
||||
fmode_t open_flags;
|
||||
char *full_path;
|
||||
struct rnbd_dev *rnbd_dev;
|
||||
struct rnbd_msg_open_rsp *rsp = data;
|
||||
|
||||
pr_debug("Open message received: session='%s' path='%s' access_mode=%d\n",
|
||||
srv_sess->sessname, open_msg->dev_name,
|
||||
open_msg->access_mode);
|
||||
open_flags = FMODE_READ;
|
||||
if (open_msg->access_mode != RNBD_ACCESS_RO)
|
||||
open_flags |= FMODE_WRITE;
|
||||
|
||||
mutex_lock(&srv_sess->lock);
|
||||
|
||||
srv_sess_dev = find_srv_sess_dev(srv_sess, open_msg->dev_name);
|
||||
if (srv_sess_dev)
|
||||
goto fill_response;
|
||||
|
||||
if ((strlen(dev_search_path) + strlen(open_msg->dev_name))
|
||||
>= PATH_MAX) {
|
||||
pr_err("Opening device for session %s failed, device path too long. '%s/%s' is longer than PATH_MAX (%d)\n",
|
||||
srv_sess->sessname, dev_search_path, open_msg->dev_name,
|
||||
PATH_MAX);
|
||||
ret = -EINVAL;
|
||||
goto reject;
|
||||
}
|
||||
if (strstr(open_msg->dev_name, "..")) {
|
||||
pr_err("Opening device for session %s failed, device path %s contains relative path ..\n",
|
||||
srv_sess->sessname, open_msg->dev_name);
|
||||
ret = -EINVAL;
|
||||
goto reject;
|
||||
}
|
||||
full_path = rnbd_srv_get_full_path(srv_sess, open_msg->dev_name);
|
||||
if (IS_ERR(full_path)) {
|
||||
ret = PTR_ERR(full_path);
|
||||
pr_err("Opening device '%s' for client %s failed, failed to get device full path, err: %d\n",
|
||||
open_msg->dev_name, srv_sess->sessname, ret);
|
||||
goto reject;
|
||||
}
|
||||
|
||||
rnbd_dev = rnbd_dev_open(full_path, open_flags,
|
||||
&srv_sess->sess_bio_set);
|
||||
if (IS_ERR(rnbd_dev)) {
|
||||
pr_err("Opening device '%s' on session %s failed, failed to open the block device, err: %ld\n",
|
||||
full_path, srv_sess->sessname, PTR_ERR(rnbd_dev));
|
||||
ret = PTR_ERR(rnbd_dev);
|
||||
goto free_path;
|
||||
}
|
||||
|
||||
srv_dev = rnbd_srv_get_or_create_srv_dev(rnbd_dev, srv_sess,
|
||||
open_msg->access_mode);
|
||||
if (IS_ERR(srv_dev)) {
|
||||
pr_err("Opening device '%s' on session %s failed, creating srv_dev failed, err: %ld\n",
|
||||
full_path, srv_sess->sessname, PTR_ERR(srv_dev));
|
||||
ret = PTR_ERR(srv_dev);
|
||||
goto rnbd_dev_close;
|
||||
}
|
||||
|
||||
srv_sess_dev = rnbd_srv_create_set_sess_dev(srv_sess, open_msg,
|
||||
rnbd_dev, open_flags,
|
||||
srv_dev);
|
||||
if (IS_ERR(srv_sess_dev)) {
|
||||
pr_err("Opening device '%s' on session %s failed, creating sess_dev failed, err: %ld\n",
|
||||
full_path, srv_sess->sessname, PTR_ERR(srv_sess_dev));
|
||||
ret = PTR_ERR(srv_sess_dev);
|
||||
goto srv_dev_put;
|
||||
}
|
||||
|
||||
/* Create the srv_dev sysfs files if they haven't been created yet. The
* creation is delayed so that the sysfs files are not created before we
* are sure the device can be opened.
|
||||
*/
|
||||
mutex_lock(&srv_dev->lock);
|
||||
if (!srv_dev->dev_kobj.state_in_sysfs) {
|
||||
ret = rnbd_srv_create_dev_sysfs(srv_dev, rnbd_dev->bdev,
|
||||
rnbd_dev->name);
|
||||
if (ret) {
|
||||
mutex_unlock(&srv_dev->lock);
|
||||
rnbd_srv_err(srv_sess_dev,
|
||||
"Opening device failed, failed to create device sysfs files, err: %d\n",
|
||||
ret);
|
||||
goto free_srv_sess_dev;
|
||||
}
|
||||
}
|
||||
|
||||
ret = rnbd_srv_create_dev_session_sysfs(srv_sess_dev);
|
||||
if (ret) {
|
||||
mutex_unlock(&srv_dev->lock);
|
||||
rnbd_srv_err(srv_sess_dev,
|
||||
"Opening device failed, failed to create dev client sysfs files, err: %d\n",
|
||||
ret);
|
||||
goto free_srv_sess_dev;
|
||||
}
|
||||
|
||||
list_add(&srv_sess_dev->dev_list, &srv_dev->sess_dev_list);
|
||||
mutex_unlock(&srv_dev->lock);
|
||||
|
||||
list_add(&srv_sess_dev->sess_list, &srv_sess->sess_dev_list);
|
||||
|
||||
rnbd_srv_info(srv_sess_dev, "Opened device '%s'\n", srv_dev->id);
|
||||
|
||||
kfree(full_path);
|
||||
|
||||
fill_response:
|
||||
rnbd_srv_fill_msg_open_rsp(rsp, srv_sess_dev);
|
||||
mutex_unlock(&srv_sess->lock);
|
||||
return 0;
|
||||
|
||||
free_srv_sess_dev:
|
||||
xa_erase(&srv_sess->index_idr, srv_sess_dev->device_id);
|
||||
synchronize_rcu();
|
||||
kfree(srv_sess_dev);
|
||||
srv_dev_put:
|
||||
if (open_msg->access_mode != RNBD_ACCESS_RO) {
|
||||
mutex_lock(&srv_dev->lock);
|
||||
srv_dev->open_write_cnt--;
|
||||
mutex_unlock(&srv_dev->lock);
|
||||
}
|
||||
rnbd_put_srv_dev(srv_dev);
|
||||
rnbd_dev_close:
|
||||
rnbd_dev_close(rnbd_dev);
|
||||
free_path:
|
||||
kfree(full_path);
|
||||
reject:
|
||||
mutex_unlock(&srv_sess->lock);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static struct rtrs_srv_ctx *rtrs_ctx;
|
||||
|
||||
static struct rtrs_srv_ops rtrs_ops;
|
||||
static int __init rnbd_srv_init_module(void)
|
||||
{
|
||||
int err;
|
||||
|
||||
BUILD_BUG_ON(sizeof(struct rnbd_msg_hdr) != 4);
|
||||
BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info) != 36);
|
||||
BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info_rsp) != 36);
|
||||
BUILD_BUG_ON(sizeof(struct rnbd_msg_open) != 264);
|
||||
BUILD_BUG_ON(sizeof(struct rnbd_msg_close) != 8);
|
||||
BUILD_BUG_ON(sizeof(struct rnbd_msg_open_rsp) != 56);
|
||||
rtrs_ops = (struct rtrs_srv_ops) {
|
||||
.rdma_ev = rnbd_srv_rdma_ev,
|
||||
.link_ev = rnbd_srv_link_ev,
|
||||
};
|
||||
rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr);
|
||||
if (IS_ERR(rtrs_ctx)) {
|
||||
err = PTR_ERR(rtrs_ctx);
|
||||
pr_err("rtrs_srv_open(), err: %d\n", err);
|
||||
return err;
|
||||
}
|
||||
|
||||
err = rnbd_srv_create_sysfs_files();
|
||||
if (err) {
|
||||
pr_err("rnbd_srv_create_sysfs_files(), err: %d\n", err);
|
||||
rtrs_srv_close(rtrs_ctx);
|
||||
return err;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void __exit rnbd_srv_cleanup_module(void)
|
||||
{
|
||||
rtrs_srv_close(rtrs_ctx);
|
||||
WARN_ON(!list_empty(&sess_list));
|
||||
rnbd_srv_destroy_sysfs_files();
|
||||
}
|
||||
|
||||
module_init(rnbd_srv_init_module);
|
||||
module_exit(rnbd_srv_cleanup_module);
|
drivers/block/rnbd/rnbd-srv.h (new file, 78 lines)
@@ -0,0 +1,78 @@
|
||||
/* SPDX-License-Identifier: GPL-2.0-or-later */
|
||||
/*
|
||||
* RDMA Network Block Driver
|
||||
*
|
||||
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
|
||||
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
|
||||
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
|
||||
*/
|
||||
#ifndef RNBD_SRV_H
|
||||
#define RNBD_SRV_H
|
||||
|
||||
#include <linux/types.h>
|
||||
#include <linux/idr.h>
|
||||
#include <linux/kref.h>
|
||||
|
||||
#include <rtrs.h>
|
||||
#include "rnbd-proto.h"
|
||||
#include "rnbd-log.h"
|
||||
|
||||
struct rnbd_srv_session {
|
||||
/* Entry inside global sess_list */
|
||||
struct list_head list;
|
||||
struct rtrs_srv *rtrs;
|
||||
char sessname[NAME_MAX];
|
||||
int queue_depth;
|
||||
struct bio_set sess_bio_set;
|
||||
|
||||
struct xarray index_idr;
|
||||
/* List of struct rnbd_srv_sess_dev */
|
||||
struct list_head sess_dev_list;
|
||||
struct mutex lock;
|
||||
u8 ver;
|
||||
};
|
||||
|
||||
struct rnbd_srv_dev {
|
||||
/* Entry inside global dev_list */
|
||||
struct list_head list;
|
||||
struct kobject dev_kobj;
|
||||
struct kobject *dev_sessions_kobj;
|
||||
struct kref kref;
|
||||
char id[NAME_MAX];
|
||||
/* List of rnbd_srv_sess_dev structs */
|
||||
struct list_head sess_dev_list;
|
||||
struct mutex lock;
|
||||
int open_write_cnt;
|
||||
};
|
||||
|
||||
/* Structure which binds N devices and N sessions */
|
||||
struct rnbd_srv_sess_dev {
|
||||
/* Entry inside rnbd_srv_dev struct */
|
||||
struct list_head dev_list;
|
||||
/* Entry inside rnbd_srv_session struct */
|
||||
struct list_head sess_list;
|
||||
struct rnbd_dev *rnbd_dev;
|
||||
struct rnbd_srv_session *sess;
|
||||
struct rnbd_srv_dev *dev;
|
||||
struct kobject kobj;
|
||||
u32 device_id;
|
||||
fmode_t open_flags;
|
||||
struct kref kref;
|
||||
struct completion *destroy_comp;
|
||||
char pathname[NAME_MAX];
|
||||
enum rnbd_access_mode access_mode;
|
||||
};
|
||||
|
||||
/* rnbd-srv-sysfs.c */
|
||||
|
||||
int rnbd_srv_create_dev_sysfs(struct rnbd_srv_dev *dev,
|
||||
struct block_device *bdev,
|
||||
const char *dir_name);
|
||||
void rnbd_srv_destroy_dev_sysfs(struct rnbd_srv_dev *dev);
|
||||
int rnbd_srv_create_dev_session_sysfs(struct rnbd_srv_sess_dev *sess_dev);
|
||||
void rnbd_srv_destroy_dev_session_sysfs(struct rnbd_srv_sess_dev *sess_dev);
|
||||
int rnbd_srv_create_sysfs_files(void);
|
||||
void rnbd_srv_destroy_sysfs_files(void);
|
||||
void rnbd_destroy_sess_dev(struct rnbd_srv_sess_dev *sess_dev);
|
||||
|
||||
#endif /* RNBD_SRV_H */
|
drivers/infiniband/Kconfig
@@ -107,6 +107,7 @@ source "drivers/infiniband/ulp/srpt/Kconfig"
|
||||
|
||||
source "drivers/infiniband/ulp/iser/Kconfig"
|
||||
source "drivers/infiniband/ulp/isert/Kconfig"
|
||||
source "drivers/infiniband/ulp/rtrs/Kconfig"
|
||||
|
||||
source "drivers/infiniband/ulp/opa_vnic/Kconfig"
|
||||
|
||||
|
drivers/infiniband/core/Makefile
@@ -8,11 +8,11 @@ obj-$(CONFIG_INFINIBAND_USER_MAD) += ib_umad.o
|
||||
obj-$(CONFIG_INFINIBAND_USER_ACCESS) += ib_uverbs.o $(user_access-y)
|
||||
|
||||
ib_core-y := packer.o ud_header.o verbs.o cq.o rw.o sysfs.o \
|
||||
device.o fmr_pool.o cache.o netlink.o \
|
||||
device.o cache.o netlink.o \
|
||||
roce_gid_mgmt.o mr_pool.o addr.o sa_query.o \
|
||||
multicast.o mad.o smi.o agent.o mad_rmpp.o \
|
||||
nldev.o restrack.o counters.o ib_core_uverbs.o \
|
||||
trace.o
|
||||
trace.o lag.o
|
||||
|
||||
ib_core-$(CONFIG_SECURITY_INFINIBAND) += security.o
|
||||
ib_core-$(CONFIG_CGROUP_RDMA) += cgroup.o
|
||||
@@ -36,6 +36,9 @@ ib_uverbs-y := uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
|
||||
uverbs_std_types_flow_action.o uverbs_std_types_dm.o \
|
||||
uverbs_std_types_mr.o uverbs_std_types_counters.o \
|
||||
uverbs_uapi.o uverbs_std_types_device.o \
|
||||
uverbs_std_types_async_fd.o
|
||||
uverbs_std_types_async_fd.o \
|
||||
uverbs_std_types_srq.o \
|
||||
uverbs_std_types_wq.o \
|
||||
uverbs_std_types_qp.o
|
||||
ib_uverbs-$(CONFIG_INFINIBAND_USER_MEM) += umem.o
|
||||
ib_uverbs-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += umem_odp.o
|
||||
|
drivers/infiniband/core/addr.c
@@ -371,6 +371,8 @@ static int fetch_ha(const struct dst_entry *dst, struct rdma_dev_addr *dev_addr,
|
||||
(const void *)&dst_in6->sin6_addr;
|
||||
sa_family_t family = dst_in->sa_family;
|
||||
|
||||
might_sleep();
|
||||
|
||||
/* If we have a gateway in IB mode then it must be an IB network */
|
||||
if (has_gateway(dst, family) && dev_addr->network == RDMA_NETWORK_IB)
|
||||
return ib_nl_fetch_ha(dev_addr, daddr, seq, family);
|
||||
@@ -727,6 +729,8 @@ int roce_resolve_route_from_path(struct sa_path_rec *rec,
|
||||
struct rdma_dev_addr dev_addr = {};
|
||||
int ret;
|
||||
|
||||
might_sleep();
|
||||
|
||||
if (rec->roce.route_resolved)
|
||||
return 0;
|
||||
|
||||
|
drivers/infiniband/core/cm.c
@@ -66,6 +66,8 @@ static const char * const ibcm_rej_reason_strs[] = {
|
||||
[IB_CM_REJ_INVALID_CLASS_VERSION] = "invalid class version",
|
||||
[IB_CM_REJ_INVALID_FLOW_LABEL] = "invalid flow label",
|
||||
[IB_CM_REJ_INVALID_ALT_FLOW_LABEL] = "invalid alt flow label",
|
||||
[IB_CM_REJ_VENDOR_OPTION_NOT_SUPPORTED] =
|
||||
"vendor option is not supported",
|
||||
};
|
||||
|
||||
const char *__attribute_const__ ibcm_reject_msg(int reason)
|
||||
@@ -81,8 +83,11 @@ const char *__attribute_const__ ibcm_reject_msg(int reason)
|
||||
EXPORT_SYMBOL(ibcm_reject_msg);
|
||||
|
||||
struct cm_id_private;
|
||||
static void cm_add_one(struct ib_device *device);
|
||||
struct cm_work;
|
||||
static int cm_add_one(struct ib_device *device);
|
||||
static void cm_remove_one(struct ib_device *device, void *client_data);
|
||||
static void cm_process_work(struct cm_id_private *cm_id_priv,
|
||||
struct cm_work *work);
|
||||
static int cm_send_sidr_rep_locked(struct cm_id_private *cm_id_priv,
|
||||
struct ib_cm_sidr_rep_param *param);
|
||||
static int cm_send_dreq_locked(struct cm_id_private *cm_id_priv,
|
||||
@@ -287,6 +292,8 @@ struct cm_id_private {
|
||||
|
||||
struct list_head work_list;
|
||||
atomic_t work_count;
|
||||
|
||||
struct rdma_ucm_ece ece;
|
||||
};
|
||||
|
||||
static void cm_work_handler(struct work_struct *work);
|
||||
@@ -474,24 +481,19 @@ static int cm_init_av_for_response(struct cm_port *port, struct ib_wc *wc,
|
||||
grh, &av->ah_attr);
|
||||
}
|
||||
|
||||
static int add_cm_id_to_port_list(struct cm_id_private *cm_id_priv,
|
||||
struct cm_av *av,
|
||||
struct cm_port *port)
|
||||
static void add_cm_id_to_port_list(struct cm_id_private *cm_id_priv,
|
||||
struct cm_av *av, struct cm_port *port)
|
||||
{
|
||||
unsigned long flags;
|
||||
int ret = 0;
|
||||
|
||||
spin_lock_irqsave(&cm.lock, flags);
|
||||
|
||||
if (&cm_id_priv->av == av)
|
||||
list_add_tail(&cm_id_priv->prim_list, &port->cm_priv_prim_list);
|
||||
else if (&cm_id_priv->alt_av == av)
|
||||
list_add_tail(&cm_id_priv->altr_list, &port->cm_priv_altr_list);
|
||||
else
|
||||
ret = -EINVAL;
|
||||
|
||||
WARN_ON(true);
|
||||
spin_unlock_irqrestore(&cm.lock, flags);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static struct cm_port *
|
||||
@ -572,12 +574,7 @@ static int cm_init_av_by_path(struct sa_path_rec *path,
|
||||
return ret;
|
||||
|
||||
av->timeout = path->packet_life_time + 1;
|
||||
|
||||
ret = add_cm_id_to_port_list(cm_id_priv, av, port);
|
||||
if (ret) {
|
||||
rdma_destroy_ah_attr(&new_ah_attr);
|
||||
return ret;
|
||||
}
|
||||
add_cm_id_to_port_list(cm_id_priv, av, port);
|
||||
rdma_move_ah_attr(&av->ah_attr, &new_ah_attr);
|
||||
return 0;
|
||||
}
|
||||
@@ -587,11 +584,6 @@ static u32 cm_local_id(__be32 local_id)
	return (__force u32) (local_id ^ cm.random_id_operand);
}

static void cm_free_id(__be32 local_id)
{
	xa_erase_irq(&cm.local_id_table, cm_local_id(local_id));
}

static struct cm_id_private *cm_acquire_id(__be32 local_id, __be32 remote_id)
{
	struct cm_id_private *cm_id_priv;
@@ -698,9 +690,10 @@ static struct cm_id_private * cm_find_listen(struct ib_device *device,
		cm_id_priv = rb_entry(node, struct cm_id_private, service_node);
		if ((cm_id_priv->id.service_mask & service_id) ==
			cm_id_priv->id.service_id &&
		    (cm_id_priv->id.device == device))
		    (cm_id_priv->id.device == device)) {
			refcount_inc(&cm_id_priv->refcount);
			return cm_id_priv;
		}
		if (device < cm_id_priv->id.device)
			node = node->rb_left;
		else if (device > cm_id_priv->id.device)
@@ -745,12 +738,14 @@ static struct cm_timewait_info * cm_insert_remote_id(struct cm_timewait_info
	return NULL;
}

static struct cm_timewait_info * cm_find_remote_id(__be64 remote_ca_guid,
						   __be32 remote_id)
static struct cm_id_private *cm_find_remote_id(__be64 remote_ca_guid,
					       __be32 remote_id)
{
	struct rb_node *node = cm.remote_id_table.rb_node;
	struct cm_timewait_info *timewait_info;
	struct cm_id_private *res = NULL;

	spin_lock_irq(&cm.lock);
	while (node) {
		timewait_info = rb_entry(node, struct cm_timewait_info,
					 remote_id_node);
@@ -762,10 +757,14 @@ static struct cm_timewait_info * cm_find_remote_id(__be64 remote_ca_guid,
			node = node->rb_left;
		else if (be64_gt(remote_ca_guid, timewait_info->remote_ca_guid))
			node = node->rb_right;
		else
			return timewait_info;
		else {
			res = cm_acquire_id(timewait_info->work.local_id,
					    timewait_info->work.remote_id);
			break;
		}
	}
	return NULL;
	spin_unlock_irq(&cm.lock);
	return res;
}

static struct cm_timewait_info * cm_insert_remote_qpn(struct cm_timewait_info
@@ -917,6 +916,35 @@ static void cm_free_work(struct cm_work *work)
	kfree(work);
}

static void cm_queue_work_unlock(struct cm_id_private *cm_id_priv,
				 struct cm_work *work)
{
	bool immediate;

	/*
	 * To deliver the event to the user callback we have to drop the
	 * spinlock, however, we need to ensure that the user callback is single
	 * threaded and receives events in temporal order. If there are
	 * already events being processed then thread new events onto a list,
	 * the thread currently processing will pick them up.
	 */
	immediate = atomic_inc_and_test(&cm_id_priv->work_count);
	if (!immediate) {
		list_add_tail(&work->list, &cm_id_priv->work_list);
		/*
		 * This routine always consumes the incoming reference. Once
		 * queued to the work_list then a reference is held by the
		 * thread currently running cm_process_work() and this
		 * reference is not needed.
		 */
		cm_deref_id(cm_id_priv);
	}
	spin_unlock_irq(&cm_id_priv->lock);

	if (immediate)
		cm_process_work(cm_id_priv, work);
}

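For illustration only: cm_queue_work_unlock() relies on work_count starting at -1 when the cm_id is allocated, so the first atomic_inc_and_test() claims direct delivery and every later caller only queues. A minimal userspace model of that hand-off, with invented names and C11 atomics standing in for the kernel primitives:

#include <stdatomic.h>
#include <stdbool.h>

/* Toy model of the work_count hand-off: the counter starts at -1, the
 * caller that bumps it to 0 delivers events itself, everyone else only
 * appends to the work list and drops its reference. */
static atomic_int work_count = -1;

static bool claim_delivery(void)
{
	/* mirrors atomic_inc_and_test(): true when the new value is zero */
	return atomic_fetch_add(&work_count, 1) + 1 == 0;
}
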
static inline int cm_convert_to_ms(int iba_time)
|
||||
{
|
||||
/* approximate conversion to ms from 4.096us x 2^iba_time */
|
||||
@ -942,8 +970,10 @@ static u8 cm_ack_timeout(u8 ca_ack_delay, u8 packet_life_time)
|
||||
return min(31, ack_timeout);
|
||||
}
|
||||
|
||||
static void cm_cleanup_timewait(struct cm_timewait_info *timewait_info)
|
||||
static void cm_remove_remote(struct cm_id_private *cm_id_priv)
|
||||
{
|
||||
struct cm_timewait_info *timewait_info = cm_id_priv->timewait_info;
|
||||
|
||||
if (timewait_info->inserted_remote_id) {
|
||||
rb_erase(&timewait_info->remote_id_node, &cm.remote_id_table);
|
||||
timewait_info->inserted_remote_id = 0;
|
||||
@ -982,7 +1012,7 @@ static void cm_enter_timewait(struct cm_id_private *cm_id_priv)
|
||||
return;
|
||||
|
||||
spin_lock_irqsave(&cm.lock, flags);
|
||||
cm_cleanup_timewait(cm_id_priv->timewait_info);
|
||||
cm_remove_remote(cm_id_priv);
|
||||
list_add_tail(&cm_id_priv->timewait_info->list, &cm.timewait_list);
|
||||
spin_unlock_irqrestore(&cm.lock, flags);
|
||||
|
||||
@ -1001,6 +1031,11 @@ static void cm_enter_timewait(struct cm_id_private *cm_id_priv)
|
||||
msecs_to_jiffies(wait_time));
|
||||
spin_unlock_irqrestore(&cm.lock, flags);
|
||||
|
||||
/*
|
||||
* The timewait_info is converted into a work and gets freed during
|
||||
* cm_free_work() in cm_timewait_handler().
|
||||
*/
|
||||
BUILD_BUG_ON(offsetof(struct cm_timewait_info, work) != 0);
|
||||
cm_id_priv->timewait_info = NULL;
|
||||
}
|
||||
|
||||
@ -1013,7 +1048,7 @@ static void cm_reset_to_idle(struct cm_id_private *cm_id_priv)
|
||||
cm_id_priv->id.state = IB_CM_IDLE;
|
||||
if (cm_id_priv->timewait_info) {
|
||||
spin_lock_irqsave(&cm.lock, flags);
|
||||
cm_cleanup_timewait(cm_id_priv->timewait_info);
|
||||
cm_remove_remote(cm_id_priv);
|
||||
spin_unlock_irqrestore(&cm.lock, flags);
|
||||
kfree(cm_id_priv->timewait_info);
|
||||
cm_id_priv->timewait_info = NULL;
|
||||
@ -1076,7 +1111,9 @@ retest:
|
||||
case IB_CM_REP_SENT:
|
||||
case IB_CM_MRA_REP_RCVD:
|
||||
ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
|
||||
/* Fall through */
|
||||
cm_send_rej_locked(cm_id_priv, IB_CM_REJ_CONSUMER_DEFINED, NULL,
|
||||
0, NULL, 0);
|
||||
goto retest;
|
||||
case IB_CM_MRA_REQ_SENT:
|
||||
case IB_CM_REP_RCVD:
|
||||
case IB_CM_MRA_REP_SENT:
|
||||
@ -1101,7 +1138,7 @@ retest:
|
||||
case IB_CM_TIMEWAIT:
|
||||
/*
|
||||
* The cm_acquire_id in cm_timewait_handler will stop working
|
||||
* once we do cm_free_id() below, so just move to idle here for
|
||||
* once we do xa_erase below, so just move to idle here for
|
||||
* consistency.
|
||||
*/
|
||||
cm_id->state = IB_CM_IDLE;
|
||||
@ -1114,7 +1151,7 @@ retest:
|
||||
spin_lock(&cm.lock);
|
||||
/* Required for cleanup paths related cm_req_handler() */
|
||||
if (cm_id_priv->timewait_info) {
|
||||
cm_cleanup_timewait(cm_id_priv->timewait_info);
|
||||
cm_remove_remote(cm_id_priv);
|
||||
kfree(cm_id_priv->timewait_info);
|
||||
cm_id_priv->timewait_info = NULL;
|
||||
}
|
||||
@ -1131,7 +1168,7 @@ retest:
|
||||
spin_unlock(&cm.lock);
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
|
||||
cm_free_id(cm_id->local_id);
|
||||
xa_erase_irq(&cm.local_id_table, cm_local_id(cm_id->local_id));
|
||||
cm_deref_id(cm_id_priv);
|
||||
wait_for_completion(&cm_id_priv->comp);
|
||||
while ((work = cm_dequeue_work(cm_id_priv)) != NULL)
|
||||
@ -1287,6 +1324,13 @@ static void cm_format_mad_hdr(struct ib_mad_hdr *hdr,
|
||||
hdr->tid = tid;
|
||||
}
|
||||
|
||||
static void cm_format_mad_ece_hdr(struct ib_mad_hdr *hdr, __be16 attr_id,
|
||||
__be64 tid, u32 attr_mod)
|
||||
{
|
||||
cm_format_mad_hdr(hdr, attr_id, tid);
|
||||
hdr->attr_mod = cpu_to_be32(attr_mod);
|
||||
}
|
||||
|
||||
static void cm_format_req(struct cm_req_msg *req_msg,
|
||||
struct cm_id_private *cm_id_priv,
|
||||
struct ib_cm_req_param *param)
|
||||
@ -1299,8 +1343,8 @@ static void cm_format_req(struct cm_req_msg *req_msg,
|
||||
pri_ext = opa_is_extended_lid(pri_path->opa.dlid,
|
||||
pri_path->opa.slid);
|
||||
|
||||
cm_format_mad_hdr(&req_msg->hdr, CM_REQ_ATTR_ID,
|
||||
cm_form_tid(cm_id_priv));
|
||||
cm_format_mad_ece_hdr(&req_msg->hdr, CM_REQ_ATTR_ID,
|
||||
cm_form_tid(cm_id_priv), param->ece.attr_mod);
|
||||
|
||||
IBA_SET(CM_REQ_LOCAL_COMM_ID, req_msg,
|
||||
be32_to_cpu(cm_id_priv->id.local_id));
|
||||
@ -1423,6 +1467,7 @@ static void cm_format_req(struct cm_req_msg *req_msg,
|
||||
cm_ack_timeout(cm_id_priv->av.port->cm_dev->ack_delay,
|
||||
alt_path->packet_life_time));
|
||||
}
|
||||
IBA_SET(CM_REQ_VENDOR_ID, req_msg, param->ece.vendor_id);
|
||||
|
||||
if (param->private_data && param->private_data_len)
|
||||
IBA_SET_MEM(CM_REQ_PRIVATE_DATA, req_msg, param->private_data,
|
||||
@ -1779,6 +1824,9 @@ static void cm_format_req_event(struct cm_work *work,
|
||||
param->rnr_retry_count = IBA_GET(CM_REQ_RNR_RETRY_COUNT, req_msg);
|
||||
param->srq = IBA_GET(CM_REQ_SRQ, req_msg);
|
||||
param->ppath_sgid_attr = cm_id_priv->av.ah_attr.grh.sgid_attr;
|
||||
param->ece.vendor_id = IBA_GET(CM_REQ_VENDOR_ID, req_msg);
|
||||
param->ece.attr_mod = be32_to_cpu(req_msg->hdr.attr_mod);
|
||||
|
||||
work->cm_event.private_data =
|
||||
IBA_GET_MEM_PTR(CM_REQ_PRIVATE_DATA, req_msg);
|
||||
}
|
||||
@ -1927,7 +1975,6 @@ static struct cm_id_private * cm_match_req(struct cm_work *work,
|
||||
struct cm_id_private *listen_cm_id_priv, *cur_cm_id_priv;
|
||||
struct cm_timewait_info *timewait_info;
|
||||
struct cm_req_msg *req_msg;
|
||||
struct ib_cm_id *cm_id;
|
||||
|
||||
req_msg = (struct cm_req_msg *)work->mad_recv_wc->recv_buf.mad;
|
||||
|
||||
@ -1948,7 +1995,7 @@ static struct cm_id_private * cm_match_req(struct cm_work *work,
|
||||
/* Check for stale connections. */
|
||||
timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
|
||||
if (timewait_info) {
|
||||
cm_cleanup_timewait(cm_id_priv->timewait_info);
|
||||
cm_remove_remote(cm_id_priv);
|
||||
cur_cm_id_priv = cm_acquire_id(timewait_info->work.local_id,
|
||||
timewait_info->work.remote_id);
|
||||
|
||||
@ -1957,8 +2004,7 @@ static struct cm_id_private * cm_match_req(struct cm_work *work,
|
||||
IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REQ,
|
||||
NULL, 0);
|
||||
if (cur_cm_id_priv) {
|
||||
cm_id = &cur_cm_id_priv->id;
|
||||
ib_send_cm_dreq(cm_id, NULL, 0);
|
||||
ib_send_cm_dreq(&cur_cm_id_priv->id, NULL, 0);
|
||||
cm_deref_id(cur_cm_id_priv);
|
||||
}
|
||||
return NULL;
|
||||
@ -1969,14 +2015,13 @@ static struct cm_id_private * cm_match_req(struct cm_work *work,
|
||||
cm_id_priv->id.device,
|
||||
cpu_to_be64(IBA_GET(CM_REQ_SERVICE_ID, req_msg)));
|
||||
if (!listen_cm_id_priv) {
|
||||
cm_cleanup_timewait(cm_id_priv->timewait_info);
|
||||
cm_remove_remote(cm_id_priv);
|
||||
spin_unlock_irq(&cm.lock);
|
||||
cm_issue_rej(work->port, work->mad_recv_wc,
|
||||
IB_CM_REJ_INVALID_SERVICE_ID, CM_MSG_RESPONSE_REQ,
|
||||
NULL, 0);
|
||||
return NULL;
|
||||
}
|
||||
refcount_inc(&listen_cm_id_priv->refcount);
|
||||
spin_unlock_irq(&cm.lock);
|
||||
return listen_cm_id_priv;
|
||||
}
|
||||
@ -2153,9 +2198,7 @@ static int cm_req_handler(struct cm_work *work)
|
||||
|
||||
/* Refcount belongs to the event, pairs with cm_process_work() */
|
||||
refcount_inc(&cm_id_priv->refcount);
|
||||
atomic_inc(&cm_id_priv->work_count);
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
cm_process_work(cm_id_priv, work);
|
||||
cm_queue_work_unlock(cm_id_priv, work);
|
||||
/*
|
||||
* Since this ID was just created and was not made visible to other MAD
|
||||
* handlers until the cm_finalize_id() above we know that the
|
||||
@ -2176,7 +2219,8 @@ static void cm_format_rep(struct cm_rep_msg *rep_msg,
|
||||
struct cm_id_private *cm_id_priv,
|
||||
struct ib_cm_rep_param *param)
|
||||
{
|
||||
cm_format_mad_hdr(&rep_msg->hdr, CM_REP_ATTR_ID, cm_id_priv->tid);
|
||||
cm_format_mad_ece_hdr(&rep_msg->hdr, CM_REP_ATTR_ID, cm_id_priv->tid,
|
||||
param->ece.attr_mod);
|
||||
IBA_SET(CM_REP_LOCAL_COMM_ID, rep_msg,
|
||||
be32_to_cpu(cm_id_priv->id.local_id));
|
||||
IBA_SET(CM_REP_REMOTE_COMM_ID, rep_msg,
|
||||
@ -2203,6 +2247,10 @@ static void cm_format_rep(struct cm_rep_msg *rep_msg,
|
||||
IBA_SET(CM_REP_LOCAL_EE_CONTEXT_NUMBER, rep_msg, param->qp_num);
|
||||
}
|
||||
|
||||
IBA_SET(CM_REP_VENDOR_ID_L, rep_msg, param->ece.vendor_id);
|
||||
IBA_SET(CM_REP_VENDOR_ID_M, rep_msg, param->ece.vendor_id >> 8);
|
||||
IBA_SET(CM_REP_VENDOR_ID_H, rep_msg, param->ece.vendor_id >> 16);
|
||||
|
||||
if (param->private_data && param->private_data_len)
|
||||
IBA_SET_MEM(CM_REP_PRIVATE_DATA, rep_msg, param->private_data,
|
||||
param->private_data_len);
|
||||
@ -2350,6 +2398,11 @@ static void cm_format_rep_event(struct cm_work *work, enum ib_qp_type qp_type)
|
||||
param->flow_control = IBA_GET(CM_REP_END_TO_END_FLOW_CONTROL, rep_msg);
|
||||
param->rnr_retry_count = IBA_GET(CM_REP_RNR_RETRY_COUNT, rep_msg);
|
||||
param->srq = IBA_GET(CM_REP_SRQ, rep_msg);
|
||||
param->ece.vendor_id = IBA_GET(CM_REP_VENDOR_ID_H, rep_msg) << 16;
|
||||
param->ece.vendor_id |= IBA_GET(CM_REP_VENDOR_ID_M, rep_msg) << 8;
|
||||
param->ece.vendor_id |= IBA_GET(CM_REP_VENDOR_ID_L, rep_msg);
|
||||
param->ece.attr_mod = be32_to_cpu(rep_msg->hdr.attr_mod);
|
||||
|
||||
work->cm_event.private_data =
|
||||
IBA_GET_MEM_PTR(CM_REP_PRIVATE_DATA, rep_msg);
|
||||
}
|
||||
@ -2404,7 +2457,6 @@ static int cm_rep_handler(struct cm_work *work)
|
||||
struct cm_rep_msg *rep_msg;
|
||||
int ret;
|
||||
struct cm_id_private *cur_cm_id_priv;
|
||||
struct ib_cm_id *cm_id;
|
||||
struct cm_timewait_info *timewait_info;
|
||||
|
||||
rep_msg = (struct cm_rep_msg *)work->mad_recv_wc->recv_buf.mad;
|
||||
@ -2454,9 +2506,7 @@ static int cm_rep_handler(struct cm_work *work)
|
||||
/* Check for a stale connection. */
|
||||
timewait_info = cm_insert_remote_qpn(cm_id_priv->timewait_info);
|
||||
if (timewait_info) {
|
||||
rb_erase(&cm_id_priv->timewait_info->remote_id_node,
|
||||
&cm.remote_id_table);
|
||||
cm_id_priv->timewait_info->inserted_remote_id = 0;
|
||||
cm_remove_remote(cm_id_priv);
|
||||
cur_cm_id_priv = cm_acquire_id(timewait_info->work.local_id,
|
||||
timewait_info->work.remote_id);
|
||||
|
||||
@ -2472,8 +2522,7 @@ static int cm_rep_handler(struct cm_work *work)
|
||||
IBA_GET(CM_REP_REMOTE_COMM_ID, rep_msg));
|
||||
|
||||
if (cur_cm_id_priv) {
|
||||
cm_id = &cur_cm_id_priv->id;
|
||||
ib_send_cm_dreq(cm_id, NULL, 0);
|
||||
ib_send_cm_dreq(&cur_cm_id_priv->id, NULL, 0);
|
||||
cm_deref_id(cur_cm_id_priv);
|
||||
}
|
||||
|
||||
@ -2501,15 +2550,7 @@ static int cm_rep_handler(struct cm_work *work)
|
||||
cm_id_priv->alt_av.timeout - 1);
|
||||
|
||||
ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
|
||||
ret = atomic_inc_and_test(&cm_id_priv->work_count);
|
||||
if (!ret)
|
||||
list_add_tail(&work->list, &cm_id_priv->work_list);
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
|
||||
if (ret)
|
||||
cm_process_work(cm_id_priv, work);
|
||||
else
|
||||
cm_deref_id(cm_id_priv);
|
||||
cm_queue_work_unlock(cm_id_priv, work);
|
||||
return 0;
|
||||
|
||||
error:
|
||||
@ -2520,7 +2561,6 @@ error:
|
||||
static int cm_establish_handler(struct cm_work *work)
|
||||
{
|
||||
struct cm_id_private *cm_id_priv;
|
||||
int ret;
|
||||
|
||||
/* See comment in cm_establish about lookup. */
|
||||
cm_id_priv = cm_acquire_id(work->local_id, work->remote_id);
|
||||
@ -2534,15 +2574,7 @@ static int cm_establish_handler(struct cm_work *work)
|
||||
}
|
||||
|
||||
ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
|
||||
ret = atomic_inc_and_test(&cm_id_priv->work_count);
|
||||
if (!ret)
|
||||
list_add_tail(&work->list, &cm_id_priv->work_list);
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
|
||||
if (ret)
|
||||
cm_process_work(cm_id_priv, work);
|
||||
else
|
||||
cm_deref_id(cm_id_priv);
|
||||
cm_queue_work_unlock(cm_id_priv, work);
|
||||
return 0;
|
||||
out:
|
||||
cm_deref_id(cm_id_priv);
|
||||
@ -2553,7 +2585,6 @@ static int cm_rtu_handler(struct cm_work *work)
|
||||
{
|
||||
struct cm_id_private *cm_id_priv;
|
||||
struct cm_rtu_msg *rtu_msg;
|
||||
int ret;
|
||||
|
||||
rtu_msg = (struct cm_rtu_msg *)work->mad_recv_wc->recv_buf.mad;
|
||||
cm_id_priv = cm_acquire_id(
|
||||
@ -2576,15 +2607,7 @@ static int cm_rtu_handler(struct cm_work *work)
|
||||
cm_id_priv->id.state = IB_CM_ESTABLISHED;
|
||||
|
||||
ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
|
||||
ret = atomic_inc_and_test(&cm_id_priv->work_count);
|
||||
if (!ret)
|
||||
list_add_tail(&work->list, &cm_id_priv->work_list);
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
|
||||
if (ret)
|
||||
cm_process_work(cm_id_priv, work);
|
||||
else
|
||||
cm_deref_id(cm_id_priv);
|
||||
cm_queue_work_unlock(cm_id_priv, work);
|
||||
return 0;
|
||||
out:
|
||||
cm_deref_id(cm_id_priv);
|
||||
@ -2777,7 +2800,6 @@ static int cm_dreq_handler(struct cm_work *work)
|
||||
struct cm_id_private *cm_id_priv;
|
||||
struct cm_dreq_msg *dreq_msg;
|
||||
struct ib_mad_send_buf *msg = NULL;
|
||||
int ret;
|
||||
|
||||
dreq_msg = (struct cm_dreq_msg *)work->mad_recv_wc->recv_buf.mad;
|
||||
cm_id_priv = cm_acquire_id(
|
||||
@ -2842,15 +2864,7 @@ static int cm_dreq_handler(struct cm_work *work)
|
||||
}
|
||||
cm_id_priv->id.state = IB_CM_DREQ_RCVD;
|
||||
cm_id_priv->tid = dreq_msg->hdr.tid;
|
||||
ret = atomic_inc_and_test(&cm_id_priv->work_count);
|
||||
if (!ret)
|
||||
list_add_tail(&work->list, &cm_id_priv->work_list);
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
|
||||
if (ret)
|
||||
cm_process_work(cm_id_priv, work);
|
||||
else
|
||||
cm_deref_id(cm_id_priv);
|
||||
cm_queue_work_unlock(cm_id_priv, work);
|
||||
return 0;
|
||||
|
||||
unlock: spin_unlock_irq(&cm_id_priv->lock);
|
||||
@ -2862,7 +2876,6 @@ static int cm_drep_handler(struct cm_work *work)
|
||||
{
|
||||
struct cm_id_private *cm_id_priv;
|
||||
struct cm_drep_msg *drep_msg;
|
||||
int ret;
|
||||
|
||||
drep_msg = (struct cm_drep_msg *)work->mad_recv_wc->recv_buf.mad;
|
||||
cm_id_priv = cm_acquire_id(
|
||||
@ -2883,15 +2896,7 @@ static int cm_drep_handler(struct cm_work *work)
|
||||
cm_enter_timewait(cm_id_priv);
|
||||
|
||||
ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
|
||||
ret = atomic_inc_and_test(&cm_id_priv->work_count);
|
||||
if (!ret)
|
||||
list_add_tail(&work->list, &cm_id_priv->work_list);
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
|
||||
if (ret)
|
||||
cm_process_work(cm_id_priv, work);
|
||||
else
|
||||
cm_deref_id(cm_id_priv);
|
||||
cm_queue_work_unlock(cm_id_priv, work);
|
||||
return 0;
|
||||
out:
|
||||
cm_deref_id(cm_id_priv);
|
||||
@ -2987,24 +2992,15 @@ static void cm_format_rej_event(struct cm_work *work)
|
||||
|
||||
static struct cm_id_private * cm_acquire_rejected_id(struct cm_rej_msg *rej_msg)
|
||||
{
|
||||
struct cm_timewait_info *timewait_info;
|
||||
struct cm_id_private *cm_id_priv;
|
||||
__be32 remote_id;
|
||||
|
||||
remote_id = cpu_to_be32(IBA_GET(CM_REJ_LOCAL_COMM_ID, rej_msg));
|
||||
|
||||
if (IBA_GET(CM_REJ_REASON, rej_msg) == IB_CM_REJ_TIMEOUT) {
|
||||
spin_lock_irq(&cm.lock);
|
||||
timewait_info = cm_find_remote_id(
|
||||
cm_id_priv = cm_find_remote_id(
|
||||
*((__be64 *)IBA_GET_MEM_PTR(CM_REJ_ARI, rej_msg)),
|
||||
remote_id);
|
||||
if (!timewait_info) {
|
||||
spin_unlock_irq(&cm.lock);
|
||||
return NULL;
|
||||
}
|
||||
cm_id_priv =
|
||||
cm_acquire_id(timewait_info->work.local_id, remote_id);
|
||||
spin_unlock_irq(&cm.lock);
|
||||
} else if (IBA_GET(CM_REJ_MESSAGE_REJECTED, rej_msg) ==
|
||||
CM_MSG_RESPONSE_REQ)
|
||||
cm_id_priv = cm_acquire_id(
|
||||
@ -3022,7 +3018,6 @@ static int cm_rej_handler(struct cm_work *work)
|
||||
{
|
||||
struct cm_id_private *cm_id_priv;
|
||||
struct cm_rej_msg *rej_msg;
|
||||
int ret;
|
||||
|
||||
rej_msg = (struct cm_rej_msg *)work->mad_recv_wc->recv_buf.mad;
|
||||
cm_id_priv = cm_acquire_rejected_id(rej_msg);
|
||||
@ -3068,19 +3063,10 @@ static int cm_rej_handler(struct cm_work *work)
|
||||
__func__, be32_to_cpu(cm_id_priv->id.local_id),
|
||||
cm_id_priv->id.state);
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
ret = -EINVAL;
|
||||
goto out;
|
||||
}
|
||||
|
||||
ret = atomic_inc_and_test(&cm_id_priv->work_count);
|
||||
if (!ret)
|
||||
list_add_tail(&work->list, &cm_id_priv->work_list);
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
|
||||
if (ret)
|
||||
cm_process_work(cm_id_priv, work);
|
||||
else
|
||||
cm_deref_id(cm_id_priv);
|
||||
cm_queue_work_unlock(cm_id_priv, work);
|
||||
return 0;
|
||||
out:
|
||||
cm_deref_id(cm_id_priv);
|
||||
@ -3190,7 +3176,7 @@ static int cm_mra_handler(struct cm_work *work)
|
||||
{
|
||||
struct cm_id_private *cm_id_priv;
|
||||
struct cm_mra_msg *mra_msg;
|
||||
int timeout, ret;
|
||||
int timeout;
|
||||
|
||||
mra_msg = (struct cm_mra_msg *)work->mad_recv_wc->recv_buf.mad;
|
||||
cm_id_priv = cm_acquire_mraed_id(mra_msg);
|
||||
@ -3250,15 +3236,7 @@ static int cm_mra_handler(struct cm_work *work)
|
||||
|
||||
cm_id_priv->msg->context[1] = (void *) (unsigned long)
|
||||
cm_id_priv->id.state;
|
||||
ret = atomic_inc_and_test(&cm_id_priv->work_count);
|
||||
if (!ret)
|
||||
list_add_tail(&work->list, &cm_id_priv->work_list);
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
|
||||
if (ret)
|
||||
cm_process_work(cm_id_priv, work);
|
||||
else
|
||||
cm_deref_id(cm_id_priv);
|
||||
cm_queue_work_unlock(cm_id_priv, work);
|
||||
return 0;
|
||||
out:
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
@ -3393,15 +3371,7 @@ static int cm_lap_handler(struct cm_work *work)
|
||||
|
||||
cm_id_priv->id.lap_state = IB_CM_LAP_RCVD;
|
||||
cm_id_priv->tid = lap_msg->hdr.tid;
|
||||
ret = atomic_inc_and_test(&cm_id_priv->work_count);
|
||||
if (!ret)
|
||||
list_add_tail(&work->list, &cm_id_priv->work_list);
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
|
||||
if (ret)
|
||||
cm_process_work(cm_id_priv, work);
|
||||
else
|
||||
cm_deref_id(cm_id_priv);
|
||||
cm_queue_work_unlock(cm_id_priv, work);
|
||||
return 0;
|
||||
|
||||
unlock: spin_unlock_irq(&cm_id_priv->lock);
|
||||
@ -3413,7 +3383,6 @@ static int cm_apr_handler(struct cm_work *work)
|
||||
{
|
||||
struct cm_id_private *cm_id_priv;
|
||||
struct cm_apr_msg *apr_msg;
|
||||
int ret;
|
||||
|
||||
/* Currently Alternate path messages are not supported for
|
||||
* RoCE link layer.
|
||||
@ -3448,16 +3417,7 @@ static int cm_apr_handler(struct cm_work *work)
|
||||
cm_id_priv->id.lap_state = IB_CM_LAP_IDLE;
|
||||
ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
|
||||
cm_id_priv->msg = NULL;
|
||||
|
||||
ret = atomic_inc_and_test(&cm_id_priv->work_count);
|
||||
if (!ret)
|
||||
list_add_tail(&work->list, &cm_id_priv->work_list);
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
|
||||
if (ret)
|
||||
cm_process_work(cm_id_priv, work);
|
||||
else
|
||||
cm_deref_id(cm_id_priv);
|
||||
cm_queue_work_unlock(cm_id_priv, work);
|
||||
return 0;
|
||||
out:
|
||||
cm_deref_id(cm_id_priv);
|
||||
@ -3468,7 +3428,6 @@ static int cm_timewait_handler(struct cm_work *work)
|
||||
{
|
||||
struct cm_timewait_info *timewait_info;
|
||||
struct cm_id_private *cm_id_priv;
|
||||
int ret;
|
||||
|
||||
timewait_info = container_of(work, struct cm_timewait_info, work);
|
||||
spin_lock_irq(&cm.lock);
|
||||
@ -3487,15 +3446,7 @@ static int cm_timewait_handler(struct cm_work *work)
|
||||
goto out;
|
||||
}
|
||||
cm_id_priv->id.state = IB_CM_IDLE;
|
||||
ret = atomic_inc_and_test(&cm_id_priv->work_count);
|
||||
if (!ret)
|
||||
list_add_tail(&work->list, &cm_id_priv->work_list);
|
||||
spin_unlock_irq(&cm_id_priv->lock);
|
||||
|
||||
if (ret)
|
||||
cm_process_work(cm_id_priv, work);
|
||||
else
|
||||
cm_deref_id(cm_id_priv);
|
||||
cm_queue_work_unlock(cm_id_priv, work);
|
||||
return 0;
|
||||
out:
|
||||
cm_deref_id(cm_id_priv);
|
||||
@ -3642,7 +3593,6 @@ static int cm_sidr_req_handler(struct cm_work *work)
|
||||
.status = IB_SIDR_UNSUPPORTED });
|
||||
goto out; /* No match. */
|
||||
}
|
||||
refcount_inc(&listen_cm_id_priv->refcount);
|
||||
spin_unlock_irq(&cm.lock);
|
||||
|
||||
cm_id_priv->id.cm_handler = listen_cm_id_priv->id.cm_handler;
|
||||
@ -3674,8 +3624,8 @@ static void cm_format_sidr_rep(struct cm_sidr_rep_msg *sidr_rep_msg,
|
||||
struct cm_id_private *cm_id_priv,
|
||||
struct ib_cm_sidr_rep_param *param)
|
||||
{
|
||||
cm_format_mad_hdr(&sidr_rep_msg->hdr, CM_SIDR_REP_ATTR_ID,
|
||||
cm_id_priv->tid);
|
||||
cm_format_mad_ece_hdr(&sidr_rep_msg->hdr, CM_SIDR_REP_ATTR_ID,
|
||||
cm_id_priv->tid, param->ece.attr_mod);
|
||||
IBA_SET(CM_SIDR_REP_REQUESTID, sidr_rep_msg,
|
||||
be32_to_cpu(cm_id_priv->id.remote_id));
|
||||
IBA_SET(CM_SIDR_REP_STATUS, sidr_rep_msg, param->status);
|
||||
@ -3683,6 +3633,10 @@ static void cm_format_sidr_rep(struct cm_sidr_rep_msg *sidr_rep_msg,
|
||||
IBA_SET(CM_SIDR_REP_SERVICEID, sidr_rep_msg,
|
||||
be64_to_cpu(cm_id_priv->id.service_id));
|
||||
IBA_SET(CM_SIDR_REP_Q_KEY, sidr_rep_msg, param->qkey);
|
||||
IBA_SET(CM_SIDR_REP_VENDOR_ID_L, sidr_rep_msg,
|
||||
param->ece.vendor_id & 0xFF);
|
||||
IBA_SET(CM_SIDR_REP_VENDOR_ID_H, sidr_rep_msg,
|
||||
(param->ece.vendor_id >> 8) & 0xFF);
|
||||
|
||||
if (param->info && param->info_length)
|
||||
IBA_SET_MEM(CM_SIDR_REP_ADDITIONAL_INFORMATION, sidr_rep_msg,
|
||||
@ -4384,7 +4338,7 @@ static void cm_remove_port_fs(struct cm_port *port)
|
||||
|
||||
}
|
||||
|
||||
static void cm_add_one(struct ib_device *ib_device)
|
||||
static int cm_add_one(struct ib_device *ib_device)
|
||||
{
|
||||
struct cm_device *cm_dev;
|
||||
struct cm_port *port;
|
||||
@ -4403,7 +4357,7 @@ static void cm_add_one(struct ib_device *ib_device)
|
||||
cm_dev = kzalloc(struct_size(cm_dev, port, ib_device->phys_port_cnt),
|
||||
GFP_KERNEL);
|
||||
if (!cm_dev)
|
||||
return;
|
||||
return -ENOMEM;
|
||||
|
||||
cm_dev->ib_device = ib_device;
|
||||
cm_dev->ack_delay = ib_device->attrs.local_ca_ack_delay;
|
||||
@ -4415,8 +4369,10 @@ static void cm_add_one(struct ib_device *ib_device)
|
||||
continue;
|
||||
|
||||
port = kzalloc(sizeof *port, GFP_KERNEL);
|
||||
if (!port)
|
||||
if (!port) {
|
||||
ret = -ENOMEM;
|
||||
goto error1;
|
||||
}
|
||||
|
||||
cm_dev->port[i-1] = port;
|
||||
port->cm_dev = cm_dev;
|
||||
@ -4437,8 +4393,10 @@ static void cm_add_one(struct ib_device *ib_device)
|
||||
cm_recv_handler,
|
||||
port,
|
||||
0);
|
||||
if (IS_ERR(port->mad_agent))
|
||||
if (IS_ERR(port->mad_agent)) {
|
||||
ret = PTR_ERR(port->mad_agent);
|
||||
goto error2;
|
||||
}
|
||||
|
||||
ret = ib_modify_port(ib_device, i, 0, &port_modify);
|
||||
if (ret)
|
||||
@ -4447,15 +4405,17 @@ static void cm_add_one(struct ib_device *ib_device)
|
||||
count++;
|
||||
}
|
||||
|
||||
if (!count)
|
||||
if (!count) {
|
||||
ret = -EOPNOTSUPP;
|
||||
goto free;
|
||||
}
|
||||
|
||||
ib_set_client_data(ib_device, &cm_client, cm_dev);
|
||||
|
||||
write_lock_irqsave(&cm.device_lock, flags);
|
||||
list_add_tail(&cm_dev->list, &cm.device_list);
|
||||
write_unlock_irqrestore(&cm.device_lock, flags);
|
||||
return;
|
||||
return 0;
|
||||
|
||||
error3:
|
||||
ib_unregister_mad_agent(port->mad_agent);
|
||||
@ -4477,6 +4437,7 @@ error1:
|
||||
}
|
||||
free:
|
||||
kfree(cm_dev);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void cm_remove_one(struct ib_device *ib_device, void *client_data)
|
||||
@ -4491,9 +4452,6 @@ static void cm_remove_one(struct ib_device *ib_device, void *client_data)
|
||||
unsigned long flags;
|
||||
int i;
|
||||
|
||||
if (!cm_dev)
|
||||
return;
|
||||
|
||||
write_lock_irqsave(&cm.device_lock, flags);
|
||||
list_del(&cm_dev->list);
|
||||
write_unlock_irqrestore(&cm.device_lock, flags);
|
||||
|
@ -91,7 +91,13 @@ const char *__attribute_const__ rdma_reject_msg(struct rdma_cm_id *id,
|
||||
}
|
||||
EXPORT_SYMBOL(rdma_reject_msg);
|
||||
|
||||
bool rdma_is_consumer_reject(struct rdma_cm_id *id, int reason)
|
||||
/**
|
||||
* rdma_is_consumer_reject - return true if the consumer rejected the connect
|
||||
* request.
|
||||
* @id: Communication identifier that received the REJECT event.
|
||||
* @reason: Value returned in the REJECT event status field.
|
||||
*/
|
||||
static bool rdma_is_consumer_reject(struct rdma_cm_id *id, int reason)
|
||||
{
|
||||
if (rdma_ib_or_roce(id->device, id->port_num))
|
||||
return reason == IB_CM_REJ_CONSUMER_DEFINED;
|
||||
@ -102,7 +108,6 @@ bool rdma_is_consumer_reject(struct rdma_cm_id *id, int reason)
|
||||
WARN_ON_ONCE(1);
|
||||
return false;
|
||||
}
|
||||
EXPORT_SYMBOL(rdma_is_consumer_reject);
|
||||
|
||||
const void *rdma_consumer_reject_data(struct rdma_cm_id *id,
|
||||
struct rdma_cm_event *ev, u8 *data_len)
|
||||
@ -148,7 +153,7 @@ struct rdma_cm_id *rdma_res_to_id(struct rdma_restrack_entry *res)
|
||||
}
|
||||
EXPORT_SYMBOL(rdma_res_to_id);
|
||||
|
||||
static void cma_add_one(struct ib_device *device);
|
||||
static int cma_add_one(struct ib_device *device);
|
||||
static void cma_remove_one(struct ib_device *device, void *client_data);
|
||||
|
||||
static struct ib_client cma_client = {
|
||||
@ -479,6 +484,7 @@ static void _cma_attach_to_dev(struct rdma_id_private *id_priv,
|
||||
rdma_restrack_kadd(&id_priv->res);
|
||||
else
|
||||
rdma_restrack_uadd(&id_priv->res);
|
||||
trace_cm_id_attach(id_priv, cma_dev->device);
|
||||
}
|
||||
|
||||
static void cma_attach_to_dev(struct rdma_id_private *id_priv,
|
||||
@ -883,7 +889,6 @@ struct rdma_cm_id *__rdma_create_id(struct net *net,
|
||||
id_priv->id.route.addr.dev_addr.net = get_net(net);
|
||||
id_priv->seq_num &= 0x00ffffff;
|
||||
|
||||
trace_cm_id_create(id_priv);
|
||||
return &id_priv->id;
|
||||
}
|
||||
EXPORT_SYMBOL(__rdma_create_id);
|
||||
@ -1906,6 +1911,9 @@ static void cma_set_rep_event_data(struct rdma_cm_event *event,
|
||||
event->param.conn.rnr_retry_count = rep_data->rnr_retry_count;
|
||||
event->param.conn.srq = rep_data->srq;
|
||||
event->param.conn.qp_num = rep_data->remote_qpn;
|
||||
|
||||
event->ece.vendor_id = rep_data->ece.vendor_id;
|
||||
event->ece.attr_mod = rep_data->ece.attr_mod;
|
||||
}
|
||||
|
||||
static int cma_cm_event_handler(struct rdma_id_private *id_priv,
|
||||
@ -2124,6 +2132,9 @@ static void cma_set_req_event_data(struct rdma_cm_event *event,
|
||||
event->param.conn.rnr_retry_count = req_data->rnr_retry_count;
|
||||
event->param.conn.srq = req_data->srq;
|
||||
event->param.conn.qp_num = req_data->remote_qpn;
|
||||
|
||||
event->ece.vendor_id = req_data->ece.vendor_id;
|
||||
event->ece.attr_mod = req_data->ece.attr_mod;
|
||||
}
|
||||
|
||||
static int cma_ib_check_req_qp_type(const struct rdma_cm_id *id,
|
||||
@@ -2904,6 +2915,24 @@ static int iboe_tos_to_sl(struct net_device *ndev, int tos)
	return 0;
}

static __be32 cma_get_roce_udp_flow_label(struct rdma_id_private *id_priv)
{
	struct sockaddr_in6 *addr6;
	u16 dport, sport;
	u32 hash, fl;

	addr6 = (struct sockaddr_in6 *)cma_src_addr(id_priv);
	fl = be32_to_cpu(addr6->sin6_flowinfo) & IB_GRH_FLOWLABEL_MASK;
	if ((cma_family(id_priv) != AF_INET6) || !fl) {
		dport = be16_to_cpu(cma_port(cma_dst_addr(id_priv)));
		sport = be16_to_cpu(cma_port(cma_src_addr(id_priv)));
		hash = (u32)sport * 31 + dport;
		fl = hash & IB_GRH_FLOWLABEL_MASK;
	}

	return cpu_to_be32(fl);
}
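For illustration only: when the application did not supply an IPv6 flow label, the helper above derives one deterministically from the connection's UDP ports. A standalone sketch of that fallback arithmetic; the port values and the locally defined mask are illustrative (the mask mirrors the 20-bit IB_GRH_FLOWLABEL_MASK):

#include <stdint.h>
#include <stdio.h>

#define GRH_FLOWLABEL_MASK 0x000FFFFFu	/* 20-bit IPv6 flow label */

/* Same arithmetic as the fallback branch in cma_get_roce_udp_flow_label() */
static uint32_t flow_label_from_ports(uint16_t sport, uint16_t dport)
{
	uint32_t hash = (uint32_t)sport * 31 + dport;

	return hash & GRH_FLOWLABEL_MASK;
}

int main(void)
{
	/* 4791 is the RoCE v2 UDP port; 49152 is an arbitrary source port */
	printf("flow label = 0x%05x\n",
	       (unsigned)flow_label_from_ports(49152, 4791));
	return 0;
}
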
||||
static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
|
||||
{
|
||||
struct rdma_route *route = &id_priv->id.route;
|
||||
@ -2970,6 +2999,11 @@ static int cma_resolve_iboe_route(struct rdma_id_private *id_priv)
|
||||
goto err2;
|
||||
}
|
||||
|
||||
if (rdma_protocol_roce_udp_encap(id_priv->id.device,
|
||||
id_priv->id.port_num))
|
||||
route->path_rec->flow_label =
|
||||
cma_get_roce_udp_flow_label(id_priv);
|
||||
|
||||
cma_init_resolve_route_work(work, id_priv);
|
||||
queue_work(cma_wq, &work->work);
|
||||
|
||||
@ -3919,6 +3953,8 @@ static int cma_connect_ib(struct rdma_id_private *id_priv,
|
||||
req.local_cm_response_timeout = CMA_CM_RESPONSE_TIMEOUT;
|
||||
req.max_cm_retries = CMA_MAX_CM_RETRIES;
|
||||
req.srq = id_priv->srq ? 1 : 0;
|
||||
req.ece.vendor_id = id_priv->ece.vendor_id;
|
||||
req.ece.attr_mod = id_priv->ece.attr_mod;
|
||||
|
||||
trace_cm_send_req(id_priv);
|
||||
ret = ib_send_cm_req(id_priv->cm_id.ib, &req);
|
||||
@@ -4008,6 +4044,27 @@ err:
}
EXPORT_SYMBOL(rdma_connect);

/**
 * rdma_connect_ece - Initiate an active connection request with ECE data.
 * @id: Connection identifier to connect.
 * @conn_param: Connection information used for connected QPs.
 * @ece: ECE parameters
 *
 * See rdma_connect() explanation.
 */
int rdma_connect_ece(struct rdma_cm_id *id, struct rdma_conn_param *conn_param,
		     struct rdma_ucm_ece *ece)
{
	struct rdma_id_private *id_priv =
		container_of(id, struct rdma_id_private, id);

	id_priv->ece.vendor_id = ece->vendor_id;
	id_priv->ece.attr_mod = ece->attr_mod;

	return rdma_connect(id, conn_param);
}
EXPORT_SYMBOL(rdma_connect_ece);
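For illustration only: a hypothetical caller of the new entry point, showing how the two ECE fields ride along with an ordinary connect. The vendor_id/attr_mod values are placeholders, not real ECE options:

static int demo_connect_with_ece(struct rdma_cm_id *id)
{
	struct rdma_conn_param conn_param = {
		.responder_resources = 1,
		.initiator_depth = 1,
		.retry_count = 7,
		.rnr_retry_count = 7,
	};
	struct rdma_ucm_ece ece = {
		.vendor_id = 0x123456,	/* placeholder 24-bit vendor id */
		.attr_mod = 0x1,	/* placeholder option bits */
	};

	return rdma_connect_ece(id, &conn_param, &ece);
}

__rdma_accept_ece(), added further down in this file, is the passive-side counterpart and stores the same two fields before the reply is sent.
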
||||
static int cma_accept_ib(struct rdma_id_private *id_priv,
|
||||
struct rdma_conn_param *conn_param)
|
||||
{
|
||||
@ -4033,6 +4090,8 @@ static int cma_accept_ib(struct rdma_id_private *id_priv,
|
||||
rep.flow_control = conn_param->flow_control;
|
||||
rep.rnr_retry_count = min_t(u8, 7, conn_param->rnr_retry_count);
|
||||
rep.srq = id_priv->srq ? 1 : 0;
|
||||
rep.ece.vendor_id = id_priv->ece.vendor_id;
|
||||
rep.ece.attr_mod = id_priv->ece.attr_mod;
|
||||
|
||||
trace_cm_send_rep(id_priv);
|
||||
ret = ib_send_cm_rep(id_priv->cm_id.ib, &rep);
|
||||
@ -4080,7 +4139,11 @@ static int cma_send_sidr_rep(struct rdma_id_private *id_priv,
|
||||
return ret;
|
||||
rep.qp_num = id_priv->qp_num;
|
||||
rep.qkey = id_priv->qkey;
|
||||
|
||||
rep.ece.vendor_id = id_priv->ece.vendor_id;
|
||||
rep.ece.attr_mod = id_priv->ece.attr_mod;
|
||||
}
|
||||
|
||||
rep.private_data = private_data;
|
||||
rep.private_data_len = private_data_len;
|
||||
|
||||
@ -4133,11 +4196,24 @@ int __rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param,
|
||||
return 0;
|
||||
reject:
|
||||
cma_modify_qp_err(id_priv);
|
||||
rdma_reject(id, NULL, 0);
|
||||
rdma_reject(id, NULL, 0, IB_CM_REJ_CONSUMER_DEFINED);
|
||||
return ret;
|
||||
}
|
||||
EXPORT_SYMBOL(__rdma_accept);
|
||||
|
||||
int __rdma_accept_ece(struct rdma_cm_id *id, struct rdma_conn_param *conn_param,
|
||||
const char *caller, struct rdma_ucm_ece *ece)
|
||||
{
|
||||
struct rdma_id_private *id_priv =
|
||||
container_of(id, struct rdma_id_private, id);
|
||||
|
||||
id_priv->ece.vendor_id = ece->vendor_id;
|
||||
id_priv->ece.attr_mod = ece->attr_mod;
|
||||
|
||||
return __rdma_accept(id, conn_param, caller);
|
||||
}
|
||||
EXPORT_SYMBOL(__rdma_accept_ece);
|
||||
|
||||
int rdma_notify(struct rdma_cm_id *id, enum ib_event_type event)
|
||||
{
|
||||
struct rdma_id_private *id_priv;
|
||||
@ -4160,7 +4236,7 @@ int rdma_notify(struct rdma_cm_id *id, enum ib_event_type event)
|
||||
EXPORT_SYMBOL(rdma_notify);
|
||||
|
||||
int rdma_reject(struct rdma_cm_id *id, const void *private_data,
|
||||
u8 private_data_len)
|
||||
u8 private_data_len, u8 reason)
|
||||
{
|
||||
struct rdma_id_private *id_priv;
|
||||
int ret;
|
||||
@ -4175,9 +4251,8 @@ int rdma_reject(struct rdma_cm_id *id, const void *private_data,
|
||||
private_data, private_data_len);
|
||||
} else {
|
||||
trace_cm_send_rej(id_priv);
|
||||
ret = ib_send_cm_rej(id_priv->cm_id.ib,
|
||||
IB_CM_REJ_CONSUMER_DEFINED, NULL,
|
||||
0, private_data, private_data_len);
|
||||
ret = ib_send_cm_rej(id_priv->cm_id.ib, reason, NULL, 0,
|
||||
private_data, private_data_len);
|
||||
}
|
||||
} else if (rdma_cap_iw_cm(id->device, id->port_num)) {
|
||||
ret = iw_cm_reject(id_priv->cm_id.iw,
|
||||
@ -4633,29 +4708,34 @@ static struct notifier_block cma_nb = {
|
||||
.notifier_call = cma_netdev_callback
|
||||
};
|
||||
|
||||
static void cma_add_one(struct ib_device *device)
|
||||
static int cma_add_one(struct ib_device *device)
|
||||
{
|
||||
struct cma_device *cma_dev;
|
||||
struct rdma_id_private *id_priv;
|
||||
unsigned int i;
|
||||
unsigned long supported_gids = 0;
|
||||
int ret;
|
||||
|
||||
cma_dev = kmalloc(sizeof *cma_dev, GFP_KERNEL);
|
||||
if (!cma_dev)
|
||||
return;
|
||||
return -ENOMEM;
|
||||
|
||||
cma_dev->device = device;
|
||||
cma_dev->default_gid_type = kcalloc(device->phys_port_cnt,
|
||||
sizeof(*cma_dev->default_gid_type),
|
||||
GFP_KERNEL);
|
||||
if (!cma_dev->default_gid_type)
|
||||
if (!cma_dev->default_gid_type) {
|
||||
ret = -ENOMEM;
|
||||
goto free_cma_dev;
|
||||
}
|
||||
|
||||
cma_dev->default_roce_tos = kcalloc(device->phys_port_cnt,
|
||||
sizeof(*cma_dev->default_roce_tos),
|
||||
GFP_KERNEL);
|
||||
if (!cma_dev->default_roce_tos)
|
||||
if (!cma_dev->default_roce_tos) {
|
||||
ret = -ENOMEM;
|
||||
goto free_gid_type;
|
||||
}
|
||||
|
||||
rdma_for_each_port (device, i) {
|
||||
supported_gids = roce_gid_type_mask_support(device, i);
|
||||
@ -4681,15 +4761,14 @@ static void cma_add_one(struct ib_device *device)
|
||||
mutex_unlock(&lock);
|
||||
|
||||
trace_cm_add_one(device);
|
||||
return;
|
||||
return 0;
|
||||
|
||||
free_gid_type:
|
||||
kfree(cma_dev->default_gid_type);
|
||||
|
||||
free_cma_dev:
|
||||
kfree(cma_dev);
|
||||
|
||||
return;
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int cma_remove_id_dev(struct rdma_id_private *id_priv)
|
||||
@ -4751,9 +4830,6 @@ static void cma_remove_one(struct ib_device *device, void *client_data)
|
||||
|
||||
trace_cm_remove_one(device);
|
||||
|
||||
if (!cma_dev)
|
||||
return;
|
||||
|
||||
mutex_lock(&lock);
|
||||
list_del(&cma_dev->list);
|
||||
mutex_unlock(&lock);
|
||||
|
@ -322,8 +322,21 @@ fail:
|
||||
return ERR_PTR(err);
|
||||
}
|
||||
|
||||
static void drop_cma_dev(struct config_group *cgroup, struct config_item *item)
|
||||
{
|
||||
struct config_group *group =
|
||||
container_of(item, struct config_group, cg_item);
|
||||
struct cma_dev_group *cma_dev_group =
|
||||
container_of(group, struct cma_dev_group, device_group);
|
||||
|
||||
configfs_remove_default_groups(&cma_dev_group->ports_group);
|
||||
configfs_remove_default_groups(&cma_dev_group->device_group);
|
||||
config_item_put(item);
|
||||
}
|
||||
|
||||
static struct configfs_group_operations cma_subsys_group_ops = {
|
||||
.make_group = make_cma_dev,
|
||||
.drop_item = drop_cma_dev,
|
||||
};
|
||||
|
||||
static const struct config_item_type cma_subsys_type = {
|
||||
|
@ -95,6 +95,7 @@ struct rdma_id_private {
|
||||
* Internal to RDMA/core, don't use in the drivers
|
||||
*/
|
||||
struct rdma_restrack_entry res;
|
||||
struct rdma_ucm_ece ece;
|
||||
};
|
||||
|
||||
#if IS_ENABLED(CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS)
|
||||
|
@ -103,23 +103,33 @@ DEFINE_CMA_FSM_EVENT(sent_drep);
|
||||
DEFINE_CMA_FSM_EVENT(sent_dreq);
|
||||
DEFINE_CMA_FSM_EVENT(id_destroy);
|
||||
|
||||
TRACE_EVENT(cm_id_create,
|
||||
TRACE_EVENT(cm_id_attach,
|
||||
TP_PROTO(
|
||||
const struct rdma_id_private *id_priv
|
||||
const struct rdma_id_private *id_priv,
|
||||
const struct ib_device *device
|
||||
),
|
||||
|
||||
TP_ARGS(id_priv),
|
||||
TP_ARGS(id_priv, device),
|
||||
|
||||
TP_STRUCT__entry(
|
||||
__field(u32, cm_id)
|
||||
__array(unsigned char, srcaddr, sizeof(struct sockaddr_in6))
|
||||
__array(unsigned char, dstaddr, sizeof(struct sockaddr_in6))
|
||||
__string(devname, device->name)
|
||||
),
|
||||
|
||||
TP_fast_assign(
|
||||
__entry->cm_id = id_priv->res.id;
|
||||
memcpy(__entry->srcaddr, &id_priv->id.route.addr.src_addr,
|
||||
sizeof(struct sockaddr_in6));
|
||||
memcpy(__entry->dstaddr, &id_priv->id.route.addr.dst_addr,
|
||||
sizeof(struct sockaddr_in6));
|
||||
__assign_str(devname, device->name);
|
||||
),
|
||||
|
||||
TP_printk("cm.id=%u",
|
||||
__entry->cm_id
|
||||
TP_printk("cm.id=%u src=%pISpc dst=%pISpc device=%s",
|
||||
__entry->cm_id, __entry->srcaddr, __entry->dstaddr,
|
||||
__get_str(devname)
|
||||
)
|
||||
);
|
||||
|
||||
|
@ -414,4 +414,7 @@ void rdma_umap_priv_init(struct rdma_umap_priv *priv,
|
||||
struct vm_area_struct *vma,
|
||||
struct rdma_user_mmap_entry *entry);
|
||||
|
||||
void ib_cq_pool_init(struct ib_device *dev);
|
||||
void ib_cq_pool_destroy(struct ib_device *dev);
|
||||
|
||||
#endif /* _CORE_PRIV_H */
|
||||
|
@ -7,7 +7,11 @@
|
||||
#include <linux/slab.h>
|
||||
#include <rdma/ib_verbs.h>
|
||||
|
||||
#include "core_priv.h"
|
||||
|
||||
#include <trace/events/rdma_core.h>
|
||||
/* Max size for shared CQ, may require tuning */
|
||||
#define IB_MAX_SHARED_CQ_SZ 4096U
|
||||
|
||||
/* # of WCs to poll for with a single call to ib_poll_cq */
|
||||
#define IB_POLL_BATCH 16
|
||||
@ -218,6 +222,7 @@ struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
|
||||
cq->cq_context = private;
|
||||
cq->poll_ctx = poll_ctx;
|
||||
atomic_set(&cq->usecnt, 0);
|
||||
cq->comp_vector = comp_vector;
|
||||
|
||||
cq->wc = kmalloc_array(IB_POLL_BATCH, sizeof(*cq->wc), GFP_KERNEL);
|
||||
if (!cq->wc)
|
||||
@ -309,6 +314,8 @@ void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
|
||||
{
|
||||
if (WARN_ON_ONCE(atomic_read(&cq->usecnt)))
|
||||
return;
|
||||
if (WARN_ON_ONCE(cq->cqe_used))
|
||||
return;
|
||||
|
||||
switch (cq->poll_ctx) {
|
||||
case IB_POLL_DIRECT:
|
||||
@ -334,3 +341,169 @@ void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
|
||||
kfree(cq);
|
||||
}
|
||||
EXPORT_SYMBOL(ib_free_cq_user);
|
||||
|
||||
void ib_cq_pool_init(struct ib_device *dev)
|
||||
{
|
||||
unsigned int i;
|
||||
|
||||
spin_lock_init(&dev->cq_pools_lock);
|
||||
for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++)
|
||||
INIT_LIST_HEAD(&dev->cq_pools[i]);
|
||||
}
|
||||
|
||||
void ib_cq_pool_destroy(struct ib_device *dev)
|
||||
{
|
||||
struct ib_cq *cq, *n;
|
||||
unsigned int i;
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(dev->cq_pools); i++) {
|
||||
list_for_each_entry_safe(cq, n, &dev->cq_pools[i],
|
||||
pool_entry) {
|
||||
WARN_ON(cq->cqe_used);
|
||||
cq->shared = false;
|
||||
ib_free_cq(cq);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
static int ib_alloc_cqs(struct ib_device *dev, unsigned int nr_cqes,
|
||||
enum ib_poll_context poll_ctx)
|
||||
{
|
||||
LIST_HEAD(tmp_list);
|
||||
unsigned int nr_cqs, i;
|
||||
struct ib_cq *cq;
|
||||
int ret;
|
||||
|
||||
if (poll_ctx > IB_POLL_LAST_POOL_TYPE) {
|
||||
WARN_ON_ONCE(poll_ctx > IB_POLL_LAST_POOL_TYPE);
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
/*
|
||||
* Allocate at least as many CQEs as requested, and otherwise
|
||||
* a reasonable batch size so that we can share CQs between
|
||||
* multiple users instead of allocating a larger number of CQs.
|
||||
*/
|
||||
nr_cqes = min_t(unsigned int, dev->attrs.max_cqe,
|
||||
max(nr_cqes, IB_MAX_SHARED_CQ_SZ));
|
||||
nr_cqs = min_t(unsigned int, dev->num_comp_vectors, num_online_cpus());
|
||||
for (i = 0; i < nr_cqs; i++) {
|
||||
cq = ib_alloc_cq(dev, NULL, nr_cqes, i, poll_ctx);
|
||||
if (IS_ERR(cq)) {
|
||||
ret = PTR_ERR(cq);
|
||||
goto out_free_cqs;
|
||||
}
|
||||
cq->shared = true;
|
||||
list_add_tail(&cq->pool_entry, &tmp_list);
|
||||
}
|
||||
|
||||
spin_lock_irq(&dev->cq_pools_lock);
|
||||
list_splice(&tmp_list, &dev->cq_pools[poll_ctx]);
|
||||
spin_unlock_irq(&dev->cq_pools_lock);
|
||||
|
||||
return 0;
|
||||
|
||||
out_free_cqs:
|
||||
list_for_each_entry(cq, &tmp_list, pool_entry) {
|
||||
cq->shared = false;
|
||||
ib_free_cq(cq);
|
||||
}
|
||||
return ret;
|
||||
}
|
||||
|
||||
/**
|
||||
* ib_cq_pool_get() - Find the least used completion queue that matches
|
||||
* a given cpu hint (or least used for wild card affinity) and fits
|
||||
* nr_cqe.
|
||||
* @dev: rdma device
|
||||
* @nr_cqe: number of needed cqe entries
|
||||
* @comp_vector_hint: completion vector hint (-1) for the driver to assign
|
||||
* a comp vector based on internal counter
|
||||
* @poll_ctx: cq polling context
|
||||
*
|
||||
* Finds a cq that satisfies @comp_vector_hint and @nr_cqe requirements and
|
||||
* claim entries in it for us. In case there is no available cq, allocate
|
||||
* a new cq with the requirements and add it to the device pool.
|
||||
* IB_POLL_DIRECT cannot be used for shared cqs so it is not a valid value
|
||||
* for @poll_ctx.
|
||||
*/
|
||||
struct ib_cq *ib_cq_pool_get(struct ib_device *dev, unsigned int nr_cqe,
|
||||
int comp_vector_hint,
|
||||
enum ib_poll_context poll_ctx)
|
||||
{
|
||||
static unsigned int default_comp_vector;
|
||||
unsigned int vector, num_comp_vectors;
|
||||
struct ib_cq *cq, *found = NULL;
|
||||
int ret;
|
||||
|
||||
if (poll_ctx > IB_POLL_LAST_POOL_TYPE) {
|
||||
WARN_ON_ONCE(poll_ctx > IB_POLL_LAST_POOL_TYPE);
|
||||
return ERR_PTR(-EINVAL);
|
||||
}
|
||||
|
||||
num_comp_vectors =
|
||||
min_t(unsigned int, dev->num_comp_vectors, num_online_cpus());
|
||||
/* Project the affinty to the device completion vector range */
|
||||
if (comp_vector_hint < 0) {
|
||||
comp_vector_hint =
|
||||
(READ_ONCE(default_comp_vector) + 1) % num_comp_vectors;
|
||||
WRITE_ONCE(default_comp_vector, comp_vector_hint);
|
||||
}
|
||||
vector = comp_vector_hint % num_comp_vectors;
|
||||
|
||||
/*
|
||||
* Find the least used CQ with correct affinity and
|
||||
* enough free CQ entries
|
||||
*/
|
||||
while (!found) {
|
||||
spin_lock_irq(&dev->cq_pools_lock);
|
||||
list_for_each_entry(cq, &dev->cq_pools[poll_ctx],
|
||||
pool_entry) {
|
||||
/*
|
||||
* Check to see if we have found a CQ with the
|
||||
* correct completion vector
|
||||
*/
|
||||
if (vector != cq->comp_vector)
|
||||
continue;
|
||||
if (cq->cqe_used + nr_cqe > cq->cqe)
|
||||
continue;
|
||||
found = cq;
|
||||
break;
|
||||
}
|
||||
|
||||
if (found) {
|
||||
found->cqe_used += nr_cqe;
|
||||
spin_unlock_irq(&dev->cq_pools_lock);
|
||||
|
||||
return found;
|
||||
}
|
||||
spin_unlock_irq(&dev->cq_pools_lock);
|
||||
|
||||
/*
|
||||
* Didn't find a match or ran out of CQs in the device
|
||||
* pool, allocate a new array of CQs.
|
||||
*/
|
||||
ret = ib_alloc_cqs(dev, nr_cqe, poll_ctx);
|
||||
if (ret)
|
||||
return ERR_PTR(ret);
|
||||
}
|
||||
|
||||
return found;
|
||||
}
|
||||
EXPORT_SYMBOL(ib_cq_pool_get);
|
||||
|
||||
/**
 * ib_cq_pool_put - Return a CQ taken from a shared pool.
 * @cq: The CQ to return.
 * @nr_cqe: The max number of cqes that the user had requested.
 */
void ib_cq_pool_put(struct ib_cq *cq, unsigned int nr_cqe)
{
	if (WARN_ON_ONCE(nr_cqe > cq->cqe_used))
		return;

	spin_lock_irq(&cq->device->cq_pools_lock);
	cq->cqe_used -= nr_cqe;
	spin_unlock_irq(&cq->device->cq_pools_lock);
}
EXPORT_SYMBOL(ib_cq_pool_put);
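For illustration only: a hypothetical ULP-side use of the shared CQ pool introduced in this file. QP setup and completion handling are omitted, and demo_use_cq_pool() is not a real kernel function:

static int demo_use_cq_pool(struct ib_device *dev, unsigned int nr_cqe)
{
	struct ib_cq *cq;

	/* A hint of -1 lets the core rotate over completion vectors. */
	cq = ib_cq_pool_get(dev, nr_cqe, -1, IB_POLL_SOFTIRQ);
	if (IS_ERR(cq))
		return PTR_ERR(cq);

	/* ... create a QP against cq, post work, reap completions ... */

	/* Hand the claimed CQEs back; the CQ itself stays in the pool. */
	ib_cq_pool_put(cq, nr_cqe);
	return 0;
}
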
|
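For illustration only: the add_client_context() hunk below now honors a failing ->add() callback, which is why cm_add_one() and cma_add_one() earlier in this series were converted to return int. A hypothetical client showing the new contract; every name here is invented:

#include <linux/slab.h>
#include <rdma/ib_verbs.h>

struct demo_ctx {
	int dummy;
};

static struct ib_client demo_client;

static int demo_add_one(struct ib_device *device)
{
	struct demo_ctx *ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);

	if (!ctx)
		return -ENOMEM;	/* core will skip further ops on this client */

	ib_set_client_data(device, &demo_client, ctx);
	return 0;
}

static void demo_remove_one(struct ib_device *device, void *client_data)
{
	kfree(client_data);	/* only reached if demo_add_one() succeeded */
}

static struct ib_client demo_client = {
	.name	= "demo",
	.add	= demo_add_one,
	.remove	= demo_remove_one,
};
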
@ -677,8 +677,20 @@ static int add_client_context(struct ib_device *device,
|
||||
if (ret)
|
||||
goto out;
|
||||
downgrade_write(&device->client_data_rwsem);
|
||||
if (client->add)
|
||||
client->add(device);
|
||||
if (client->add) {
|
||||
if (client->add(device)) {
|
||||
/*
|
||||
* If a client fails to add then the error code is
|
||||
* ignored, but we won't call any more ops on this
|
||||
* client.
|
||||
*/
|
||||
xa_erase(&device->client_data, client->client_id);
|
||||
up_read(&device->client_data_rwsem);
|
||||
ib_device_put(device);
|
||||
ib_client_put(client);
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
/* Readers shall not see a client until add has been completed */
|
||||
xa_set_mark(&device->client_data, client->client_id,
|
||||
@ -1381,6 +1393,7 @@ int ib_register_device(struct ib_device *device, const char *name)
|
||||
goto dev_cleanup;
|
||||
}
|
||||
|
||||
ib_cq_pool_init(device);
|
||||
ret = enable_device_and_get(device);
|
||||
dev_set_uevent_suppress(&device->dev, false);
|
||||
/* Mark for userspace that device is ready */
|
||||
@ -1435,6 +1448,7 @@ static void __ib_unregister_device(struct ib_device *ib_dev)
|
||||
goto out;
|
||||
|
||||
disable_device(ib_dev);
|
||||
ib_cq_pool_destroy(ib_dev);
|
||||
|
||||
/* Expedite removing unregistered pointers from the hash table */
|
||||
free_netdevs(ib_dev);
|
||||
@ -2557,7 +2571,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
|
||||
SET_DEVICE_OP(dev_ops, add_gid);
|
||||
SET_DEVICE_OP(dev_ops, advise_mr);
|
||||
SET_DEVICE_OP(dev_ops, alloc_dm);
|
||||
SET_DEVICE_OP(dev_ops, alloc_fmr);
|
||||
SET_DEVICE_OP(dev_ops, alloc_hw_stats);
|
||||
SET_DEVICE_OP(dev_ops, alloc_mr);
|
||||
SET_DEVICE_OP(dev_ops, alloc_mr_integrity);
|
||||
@ -2584,7 +2597,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
|
||||
SET_DEVICE_OP(dev_ops, create_wq);
|
||||
SET_DEVICE_OP(dev_ops, dealloc_dm);
|
||||
SET_DEVICE_OP(dev_ops, dealloc_driver);
|
||||
SET_DEVICE_OP(dev_ops, dealloc_fmr);
|
||||
SET_DEVICE_OP(dev_ops, dealloc_mw);
|
||||
SET_DEVICE_OP(dev_ops, dealloc_pd);
|
||||
SET_DEVICE_OP(dev_ops, dealloc_ucontext);
|
||||
@ -2628,7 +2640,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
|
||||
SET_DEVICE_OP(dev_ops, iw_rem_ref);
|
||||
SET_DEVICE_OP(dev_ops, map_mr_sg);
|
||||
SET_DEVICE_OP(dev_ops, map_mr_sg_pi);
|
||||
SET_DEVICE_OP(dev_ops, map_phys_fmr);
|
||||
SET_DEVICE_OP(dev_ops, mmap);
|
||||
SET_DEVICE_OP(dev_ops, mmap_free);
|
||||
SET_DEVICE_OP(dev_ops, modify_ah);
|
||||
@ -2662,7 +2673,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
|
||||
SET_DEVICE_OP(dev_ops, resize_cq);
|
||||
SET_DEVICE_OP(dev_ops, set_vf_guid);
|
||||
SET_DEVICE_OP(dev_ops, set_vf_link_state);
|
||||
SET_DEVICE_OP(dev_ops, unmap_fmr);
|
||||
|
||||
SET_OBJ_SIZE(dev_ops, ib_ah);
|
||||
SET_OBJ_SIZE(dev_ops, ib_cq);
|
||||
|
@ -1,494 +0,0 @@
|
||||
/*
|
||||
* Copyright (c) 2004 Topspin Communications. All rights reserved.
|
||||
* Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
|
||||
*
|
||||
* This software is available to you under a choice of one of two
|
||||
* licenses. You may choose to be licensed under the terms of the GNU
|
||||
* General Public License (GPL) Version 2, available from the file
|
||||
* COPYING in the main directory of this source tree, or the
|
||||
* OpenIB.org BSD license below:
|
||||
*
|
||||
* Redistribution and use in source and binary forms, with or
|
||||
* without modification, are permitted provided that the following
|
||||
* conditions are met:
|
||||
*
|
||||
* - Redistributions of source code must retain the above
|
||||
* copyright notice, this list of conditions and the following
|
||||
* disclaimer.
|
||||
*
|
||||
* - Redistributions in binary form must reproduce the above
|
||||
* copyright notice, this list of conditions and the following
|
||||
* disclaimer in the documentation and/or other materials
|
||||
* provided with the distribution.
|
||||
*
|
||||
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
||||
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
|
||||
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
|
||||
* BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
|
||||
* ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
|
||||
* CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
* SOFTWARE.
|
||||
*/
|
||||
|
||||
#include <linux/errno.h>
|
||||
#include <linux/spinlock.h>
|
||||
#include <linux/export.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/jhash.h>
|
||||
#include <linux/kthread.h>
|
||||
|
||||
#include <rdma/ib_fmr_pool.h>
|
||||
|
||||
#include "core_priv.h"
|
||||
|
||||
#define PFX "fmr_pool: "
|
||||
|
||||
enum {
|
||||
IB_FMR_MAX_REMAPS = 32,
|
||||
|
||||
IB_FMR_HASH_BITS = 8,
|
||||
IB_FMR_HASH_SIZE = 1 << IB_FMR_HASH_BITS,
|
||||
IB_FMR_HASH_MASK = IB_FMR_HASH_SIZE - 1
|
||||
};
|
||||
|
||||
/*
|
||||
* If an FMR is not in use, then the list member will point to either
|
||||
* its pool's free_list (if the FMR can be mapped again; that is,
|
||||
* remap_count < pool->max_remaps) or its pool's dirty_list (if the
|
||||
* FMR needs to be unmapped before being remapped). In either of
|
||||
* these cases it is a bug if the ref_count is not 0. In other words,
|
||||
* if ref_count is > 0, then the list member must not be linked into
|
||||
* either free_list or dirty_list.
|
||||
*
|
||||
* The cache_node member is used to link the FMR into a cache bucket
|
||||
* (if caching is enabled). This is independent of the reference
|
||||
* count of the FMR. When a valid FMR is released, its ref_count is
|
||||
* decremented, and if ref_count reaches 0, the FMR is placed in
|
||||
* either free_list or dirty_list as appropriate. However, it is not
|
||||
* removed from the cache and may be "revived" if a call to
|
||||
* ib_fmr_register_physical() occurs before the FMR is remapped. In
|
||||
* this case we just increment the ref_count and remove the FMR from
|
||||
* free_list/dirty_list.
|
||||
*
|
||||
* Before we remap an FMR from free_list, we remove it from the cache
|
||||
* (to prevent another user from obtaining a stale FMR). When an FMR
|
||||
* is released, we add it to the tail of the free list, so that our
|
||||
* cache eviction policy is "least recently used."
|
||||
*
|
||||
* All manipulation of ref_count, list and cache_node is protected by
|
||||
* pool_lock to maintain consistency.
|
||||
*/
|
||||
|
||||
struct ib_fmr_pool {
|
||||
spinlock_t pool_lock;
|
||||
|
||||
int pool_size;
|
||||
int max_pages;
|
||||
int max_remaps;
|
||||
int dirty_watermark;
|
||||
int dirty_len;
|
||||
struct list_head free_list;
|
||||
struct list_head dirty_list;
|
||||
struct hlist_head *cache_bucket;
|
||||
|
||||
void (*flush_function)(struct ib_fmr_pool *pool,
|
||||
void * arg);
|
||||
void *flush_arg;
|
||||
|
||||
struct kthread_worker *worker;
|
||||
struct kthread_work work;
|
||||
|
||||
atomic_t req_ser;
|
||||
atomic_t flush_ser;
|
||||
|
||||
wait_queue_head_t force_wait;
|
||||
};
|
||||
|
||||
static inline u32 ib_fmr_hash(u64 first_page)
|
||||
{
|
||||
return jhash_2words((u32) first_page, (u32) (first_page >> 32), 0) &
|
||||
(IB_FMR_HASH_SIZE - 1);
|
||||
}
|
||||
|
||||
/* Caller must hold pool_lock */
|
||||
static inline struct ib_pool_fmr *ib_fmr_cache_lookup(struct ib_fmr_pool *pool,
|
||||
u64 *page_list,
|
||||
int page_list_len,
|
||||
u64 io_virtual_address)
|
||||
{
|
||||
struct hlist_head *bucket;
|
||||
struct ib_pool_fmr *fmr;
|
||||
|
||||
if (!pool->cache_bucket)
|
||||
return NULL;
|
||||
|
||||
bucket = pool->cache_bucket + ib_fmr_hash(*page_list);
|
||||
|
||||
hlist_for_each_entry(fmr, bucket, cache_node)
|
||||
if (io_virtual_address == fmr->io_virtual_address &&
|
||||
page_list_len == fmr->page_list_len &&
|
||||
!memcmp(page_list, fmr->page_list,
|
||||
page_list_len * sizeof *page_list))
|
||||
return fmr;
|
||||
|
||||
return NULL;
|
||||
}
|
||||
|
||||
static void ib_fmr_batch_release(struct ib_fmr_pool *pool)
|
||||
{
|
||||
int ret;
|
||||
struct ib_pool_fmr *fmr;
|
||||
LIST_HEAD(unmap_list);
|
||||
LIST_HEAD(fmr_list);
|
||||
|
||||
spin_lock_irq(&pool->pool_lock);
|
||||
|
||||
list_for_each_entry(fmr, &pool->dirty_list, list) {
|
||||
hlist_del_init(&fmr->cache_node);
|
||||
fmr->remap_count = 0;
|
||||
list_add_tail(&fmr->fmr->list, &fmr_list);
|
||||
}
|
||||
|
||||
list_splice_init(&pool->dirty_list, &unmap_list);
|
||||
pool->dirty_len = 0;
|
||||
|
||||
spin_unlock_irq(&pool->pool_lock);
|
||||
|
||||
if (list_empty(&unmap_list)) {
|
||||
return;
|
||||
}
|
||||
|
||||
ret = ib_unmap_fmr(&fmr_list);
|
||||
if (ret)
|
||||
pr_warn(PFX "ib_unmap_fmr returned %d\n", ret);
|
||||
|
||||
spin_lock_irq(&pool->pool_lock);
|
||||
list_splice(&unmap_list, &pool->free_list);
|
||||
spin_unlock_irq(&pool->pool_lock);
|
||||
}
|
||||
|
||||
static void ib_fmr_cleanup_func(struct kthread_work *work)
|
||||
{
|
||||
struct ib_fmr_pool *pool = container_of(work, struct ib_fmr_pool, work);
|
||||
|
||||
ib_fmr_batch_release(pool);
|
||||
atomic_inc(&pool->flush_ser);
|
||||
wake_up_interruptible(&pool->force_wait);
|
||||
|
||||
if (pool->flush_function)
|
||||
pool->flush_function(pool, pool->flush_arg);
|
||||
|
||||
if (atomic_read(&pool->flush_ser) - atomic_read(&pool->req_ser) < 0)
|
||||
kthread_queue_work(pool->worker, &pool->work);
|
||||
}
|
||||
|
||||
/**
|
||||
* ib_create_fmr_pool - Create an FMR pool
|
||||
* @pd:Protection domain for FMRs
|
||||
* @params:FMR pool parameters
|
||||
*
|
||||
* Create a pool of FMRs. Return value is pointer to new pool or
|
||||
* error code if creation failed.
|
||||
*/
|
||||
struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd *pd,
|
||||
struct ib_fmr_pool_param *params)
|
||||
{
|
||||
struct ib_device *device;
|
||||
struct ib_fmr_pool *pool;
|
||||
int i;
|
||||
int ret;
|
||||
int max_remaps;
|
||||
|
||||
if (!params)
|
||||
return ERR_PTR(-EINVAL);
|
||||
|
||||
device = pd->device;
|
||||
if (!device->ops.alloc_fmr || !device->ops.dealloc_fmr ||
|
||||
!device->ops.map_phys_fmr || !device->ops.unmap_fmr) {
|
||||
dev_info(&device->dev, "Device does not support FMRs\n");
|
||||
return ERR_PTR(-ENOSYS);
|
||||
}
|
||||
|
||||
if (!device->attrs.max_map_per_fmr)
|
||||
max_remaps = IB_FMR_MAX_REMAPS;
|
||||
else
|
||||
max_remaps = device->attrs.max_map_per_fmr;
|
||||
|
||||
pool = kmalloc(sizeof *pool, GFP_KERNEL);
|
||||
if (!pool)
|
||||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
pool->cache_bucket = NULL;
|
||||
pool->flush_function = params->flush_function;
|
||||
pool->flush_arg = params->flush_arg;
|
||||
|
||||
INIT_LIST_HEAD(&pool->free_list);
|
||||
INIT_LIST_HEAD(&pool->dirty_list);
|
||||
|
||||
if (params->cache) {
|
||||
pool->cache_bucket =
|
||||
kmalloc_array(IB_FMR_HASH_SIZE,
|
||||
sizeof(*pool->cache_bucket),
|
||||
GFP_KERNEL);
|
||||
if (!pool->cache_bucket) {
|
||||
ret = -ENOMEM;
|
||||
goto out_free_pool;
|
||||
}
|
||||
|
||||
for (i = 0; i < IB_FMR_HASH_SIZE; ++i)
|
||||
INIT_HLIST_HEAD(pool->cache_bucket + i);
|
||||
}
|
||||
|
||||
pool->pool_size = 0;
|
||||
pool->max_pages = params->max_pages_per_fmr;
|
||||
pool->max_remaps = max_remaps;
|
||||
pool->dirty_watermark = params->dirty_watermark;
|
||||
pool->dirty_len = 0;
|
||||
spin_lock_init(&pool->pool_lock);
|
||||
atomic_set(&pool->req_ser, 0);
|
||||
atomic_set(&pool->flush_ser, 0);
|
||||
init_waitqueue_head(&pool->force_wait);
|
||||
|
||||
pool->worker =
|
||||
kthread_create_worker(0, "ib_fmr(%s)", dev_name(&device->dev));
|
||||
if (IS_ERR(pool->worker)) {
|
||||
pr_warn(PFX "couldn't start cleanup kthread worker\n");
|
||||
ret = PTR_ERR(pool->worker);
|
||||
goto out_free_pool;
|
||||
}
|
||||
kthread_init_work(&pool->work, ib_fmr_cleanup_func);
|
||||
|
||||
{
|
||||
struct ib_pool_fmr *fmr;
|
||||
struct ib_fmr_attr fmr_attr = {
|
||||
.max_pages = params->max_pages_per_fmr,
|
||||
.max_maps = pool->max_remaps,
|
||||
.page_shift = params->page_shift
|
||||
};
|
||||
int bytes_per_fmr = sizeof *fmr;
|
||||
|
||||
if (pool->cache_bucket)
|
||||
bytes_per_fmr += params->max_pages_per_fmr * sizeof (u64);
|
||||
|
||||
for (i = 0; i < params->pool_size; ++i) {
|
||||
fmr = kmalloc(bytes_per_fmr, GFP_KERNEL);
|
||||
if (!fmr)
|
||||
goto out_fail;
|
||||
|
||||
fmr->pool = pool;
|
||||
fmr->remap_count = 0;
|
||||
fmr->ref_count = 0;
|
||||
INIT_HLIST_NODE(&fmr->cache_node);
|
||||
|
||||
fmr->fmr = ib_alloc_fmr(pd, params->access, &fmr_attr);
|
||||
if (IS_ERR(fmr->fmr)) {
|
||||
pr_warn(PFX "fmr_create failed for FMR %d\n",
|
||||
i);
|
||||
kfree(fmr);
|
||||
goto out_fail;
|
||||
}
|
||||
|
||||
list_add_tail(&fmr->list, &pool->free_list);
|
||||
++pool->pool_size;
|
||||
}
|
||||
}
|
||||
|
||||
return pool;
|
||||
|
||||
out_free_pool:
|
||||
kfree(pool->cache_bucket);
|
||||
kfree(pool);
|
||||
|
||||
return ERR_PTR(ret);
|
||||
|
||||
out_fail:
|
||||
ib_destroy_fmr_pool(pool);
|
||||
|
||||
return ERR_PTR(-ENOMEM);
|
||||
}
|
||||
EXPORT_SYMBOL(ib_create_fmr_pool);
|
||||
|
||||
/**
|
||||
* ib_destroy_fmr_pool - Free FMR pool
|
||||
* @pool:FMR pool to free
|
||||
*
|
||||
* Destroy an FMR pool and free all associated resources.
|
||||
*/
|
||||
void ib_destroy_fmr_pool(struct ib_fmr_pool *pool)
|
||||
{
|
||||
struct ib_pool_fmr *fmr;
|
||||
struct ib_pool_fmr *tmp;
|
||||
LIST_HEAD(fmr_list);
|
||||
int i;
|
||||
|
||||
kthread_destroy_worker(pool->worker);
|
||||
ib_fmr_batch_release(pool);
|
||||
|
||||
i = 0;
|
||||
list_for_each_entry_safe(fmr, tmp, &pool->free_list, list) {
|
||||
if (fmr->remap_count) {
|
||||
INIT_LIST_HEAD(&fmr_list);
|
||||
list_add_tail(&fmr->fmr->list, &fmr_list);
|
||||
ib_unmap_fmr(&fmr_list);
|
||||
}
|
||||
ib_dealloc_fmr(fmr->fmr);
|
||||
list_del(&fmr->list);
|
||||
kfree(fmr);
|
||||
++i;
|
||||
}
|
||||
|
||||
if (i < pool->pool_size)
|
||||
pr_warn(PFX "pool still has %d regions registered\n",
|
||||
pool->pool_size - i);
|
||||
|
||||
kfree(pool->cache_bucket);
|
||||
kfree(pool);
|
||||
}
|
||||
EXPORT_SYMBOL(ib_destroy_fmr_pool);
|
||||
|
||||
/**
|
||||
* ib_flush_fmr_pool - Invalidate all unmapped FMRs
|
||||
* @pool:FMR pool to flush
|
||||
*
|
||||
* Ensure that all unmapped FMRs are fully invalidated.
|
||||
*/
|
||||
int ib_flush_fmr_pool(struct ib_fmr_pool *pool)
|
||||
{
|
||||
int serial;
|
||||
struct ib_pool_fmr *fmr, *next;
|
||||
|
||||
/*
|
||||
* The free_list holds FMRs that may have been used
|
||||
* but have not been remapped enough times to be dirty.
|
||||
* Put them on the dirty list now so that the cleanup
|
||||
* thread will reap them too.
|
||||
*/
|
||||
spin_lock_irq(&pool->pool_lock);
|
||||
list_for_each_entry_safe(fmr, next, &pool->free_list, list) {
|
||||
if (fmr->remap_count > 0)
|
||||
list_move(&fmr->list, &pool->dirty_list);
|
||||
}
|
||||
spin_unlock_irq(&pool->pool_lock);
|
||||
|
||||
serial = atomic_inc_return(&pool->req_ser);
|
||||
kthread_queue_work(pool->worker, &pool->work);
|
||||
|
||||
if (wait_event_interruptible(pool->force_wait,
|
||||
atomic_read(&pool->flush_ser) - serial >= 0))
|
||||
return -EINTR;
|
||||
|
||||
return 0;
|
||||
}
|
||||
EXPORT_SYMBOL(ib_flush_fmr_pool);
|
||||
|
||||
/**
|
||||
* ib_fmr_pool_map_phys - Map an FMR from an FMR pool.
|
||||
* @pool_handle: FMR pool to allocate FMR from
|
||||
* @page_list: List of pages to map
|
||||
* @list_len: Number of pages in @page_list
|
||||
* @io_virtual_address: I/O virtual address for new FMR
|
||||
*/
|
||||
struct ib_pool_fmr *ib_fmr_pool_map_phys(struct ib_fmr_pool *pool_handle,
|
||||
u64 *page_list,
|
||||
int list_len,
|
||||
u64 io_virtual_address)
|
||||
{
|
||||
struct ib_fmr_pool *pool = pool_handle;
|
||||
struct ib_pool_fmr *fmr;
|
||||
unsigned long flags;
|
||||
int result;
|
||||
|
||||
if (list_len < 1 || list_len > pool->max_pages)
|
||||
return ERR_PTR(-EINVAL);
|
||||
|
||||
spin_lock_irqsave(&pool->pool_lock, flags);
|
||||
fmr = ib_fmr_cache_lookup(pool,
|
||||
page_list,
|
||||
list_len,
|
||||
io_virtual_address);
|
||||
if (fmr) {
|
||||
/* found in cache */
|
||||
++fmr->ref_count;
|
||||
if (fmr->ref_count == 1) {
|
||||
list_del(&fmr->list);
|
||||
}
|
||||
|
||||
spin_unlock_irqrestore(&pool->pool_lock, flags);
|
||||
|
||||
return fmr;
|
||||
}
|
||||
|
||||
if (list_empty(&pool->free_list)) {
|
||||
spin_unlock_irqrestore(&pool->pool_lock, flags);
|
||||
return ERR_PTR(-EAGAIN);
|
||||
}
|
||||
|
||||
fmr = list_entry(pool->free_list.next, struct ib_pool_fmr, list);
|
||||
list_del(&fmr->list);
|
||||
hlist_del_init(&fmr->cache_node);
|
||||
spin_unlock_irqrestore(&pool->pool_lock, flags);
|
||||
|
||||
result = ib_map_phys_fmr(fmr->fmr, page_list, list_len,
|
||||
io_virtual_address);
|
||||
|
||||
if (result) {
|
||||
spin_lock_irqsave(&pool->pool_lock, flags);
|
||||
list_add(&fmr->list, &pool->free_list);
|
||||
spin_unlock_irqrestore(&pool->pool_lock, flags);
|
||||
|
||||
pr_warn(PFX "fmr_map returns %d\n", result);
|
||||
|
||||
return ERR_PTR(result);
|
||||
}
|
||||
|
||||
++fmr->remap_count;
|
||||
fmr->ref_count = 1;
|
||||
|
||||
if (pool->cache_bucket) {
|
||||
fmr->io_virtual_address = io_virtual_address;
|
||||
fmr->page_list_len = list_len;
|
||||
memcpy(fmr->page_list, page_list, list_len * sizeof(*page_list));
|
||||
|
||||
spin_lock_irqsave(&pool->pool_lock, flags);
|
||||
hlist_add_head(&fmr->cache_node,
|
||||
pool->cache_bucket + ib_fmr_hash(fmr->page_list[0]));
|
||||
spin_unlock_irqrestore(&pool->pool_lock, flags);
|
||||
}
|
||||
|
||||
return fmr;
|
||||
}
|
||||
EXPORT_SYMBOL(ib_fmr_pool_map_phys);
|
||||
|
||||
/**
|
||||
* ib_fmr_pool_unmap - Unmap FMR
|
||||
* @fmr:FMR to unmap
|
||||
*
|
||||
* Unmap an FMR. The FMR mapping may remain valid until the FMR is
|
||||
* reused (or until ib_flush_fmr_pool() is called).
|
||||
*/
|
||||
void ib_fmr_pool_unmap(struct ib_pool_fmr *fmr)
|
||||
{
|
||||
struct ib_fmr_pool *pool;
|
||||
unsigned long flags;
|
||||
|
||||
pool = fmr->pool;
|
||||
|
||||
spin_lock_irqsave(&pool->pool_lock, flags);
|
||||
|
||||
--fmr->ref_count;
|
||||
if (!fmr->ref_count) {
|
||||
if (fmr->remap_count < pool->max_remaps) {
|
||||
list_add_tail(&fmr->list, &pool->free_list);
|
||||
} else {
|
||||
list_add_tail(&fmr->list, &pool->dirty_list);
|
||||
if (++pool->dirty_len >= pool->dirty_watermark) {
|
||||
atomic_inc(&pool->req_ser);
|
||||
kthread_queue_work(pool->worker, &pool->work);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
spin_unlock_irqrestore(&pool->pool_lock, flags);
|
||||
}
|
||||
EXPORT_SYMBOL(ib_fmr_pool_unmap);
|
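For context on the interface this series deletes, a minimal, hedged sketch of how a ULP consumed the FMR pool API shown above (the helper name and parameter values are invented for illustration; real users such as SRP and RDS add locking and fuller error handling):

#include <linux/err.h>
#include <linux/mm.h>
#include <rdma/ib_fmr_pool.h>

/* Hypothetical helper: map one DMA page list and tear everything down. */
static int demo_fmr_pool_roundtrip(struct ib_pd *pd, u64 *dma_pages,
				   int npages, u64 iova)
{
	struct ib_fmr_pool_param param = {
		.max_pages_per_fmr = 64,
		.page_shift        = PAGE_SHIFT,
		.access            = IB_ACCESS_LOCAL_WRITE |
				     IB_ACCESS_REMOTE_READ,
		.pool_size         = 32,
		.dirty_watermark   = 8,
		.cache             = 1,	/* enable the jhash bucket cache */
	};
	struct ib_fmr_pool *pool;
	struct ib_pool_fmr *fmr;
	int ret = 0;

	pool = ib_create_fmr_pool(pd, &param);
	if (IS_ERR(pool))
		return PTR_ERR(pool);

	/* May revive a cached mapping if this page list was seen before. */
	fmr = ib_fmr_pool_map_phys(pool, dma_pages, npages, iova);
	if (IS_ERR(fmr)) {
		ret = PTR_ERR(fmr);
		goto out;
	}

	/* The mapping can stay valid until the FMR is remapped or flushed. */
	ib_fmr_pool_unmap(fmr);
	ret = ib_flush_fmr_pool(pool);
out:
	ib_destroy_fmr_pool(pool);
	return ret;
}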
138
drivers/infiniband/core/lag.c
Normal file
@ -0,0 +1,138 @@
|
||||
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
|
||||
/*
|
||||
* Copyright (c) 2020 Mellanox Technologies. All rights reserved.
|
||||
*/
|
||||
|
||||
#include <rdma/ib_verbs.h>
|
||||
#include <rdma/ib_cache.h>
|
||||
#include <rdma/lag.h>
|
||||
|
||||
static struct sk_buff *rdma_build_skb(struct ib_device *device,
|
||||
struct net_device *netdev,
|
||||
struct rdma_ah_attr *ah_attr,
|
||||
gfp_t flags)
|
||||
{
|
||||
struct ipv6hdr *ip6h;
|
||||
struct sk_buff *skb;
|
||||
struct ethhdr *eth;
|
||||
struct iphdr *iph;
|
||||
struct udphdr *uh;
|
||||
u8 smac[ETH_ALEN];
|
||||
bool is_ipv4;
|
||||
int hdr_len;
|
||||
|
||||
is_ipv4 = ipv6_addr_v4mapped((struct in6_addr *)ah_attr->grh.dgid.raw);
|
||||
hdr_len = ETH_HLEN + sizeof(struct udphdr) + LL_RESERVED_SPACE(netdev);
|
||||
hdr_len += is_ipv4 ? sizeof(struct iphdr) : sizeof(struct ipv6hdr);
|
||||
|
||||
skb = alloc_skb(hdr_len, flags);
|
||||
if (!skb)
|
||||
return NULL;
|
||||
|
||||
skb->dev = netdev;
|
||||
skb_reserve(skb, hdr_len);
|
||||
skb_push(skb, sizeof(struct udphdr));
|
||||
skb_reset_transport_header(skb);
|
||||
uh = udp_hdr(skb);
|
||||
uh->source =
|
||||
htons(rdma_flow_label_to_udp_sport(ah_attr->grh.flow_label));
|
||||
uh->dest = htons(ROCE_V2_UDP_DPORT);
|
||||
uh->len = htons(sizeof(struct udphdr));
|
||||
|
||||
if (is_ipv4) {
|
||||
skb_push(skb, sizeof(struct iphdr));
|
||||
skb_reset_network_header(skb);
|
||||
iph = ip_hdr(skb);
|
||||
iph->frag_off = 0;
|
||||
iph->version = 4;
|
||||
iph->protocol = IPPROTO_UDP;
|
||||
iph->ihl = 0x5;
|
||||
iph->tot_len = htons(sizeof(struct udphdr) + sizeof(struct
|
||||
iphdr));
|
||||
memcpy(&iph->saddr, ah_attr->grh.sgid_attr->gid.raw + 12,
|
||||
sizeof(struct in_addr));
|
||||
memcpy(&iph->daddr, ah_attr->grh.dgid.raw + 12,
|
||||
sizeof(struct in_addr));
|
||||
} else {
|
||||
skb_push(skb, sizeof(struct ipv6hdr));
|
||||
skb_reset_network_header(skb);
|
||||
ip6h = ipv6_hdr(skb);
|
||||
ip6h->version = 6;
|
||||
ip6h->nexthdr = IPPROTO_UDP;
|
||||
memcpy(&ip6h->flow_lbl, &ah_attr->grh.flow_label,
|
||||
sizeof(*ip6h->flow_lbl));
|
||||
memcpy(&ip6h->saddr, ah_attr->grh.sgid_attr->gid.raw,
|
||||
sizeof(struct in6_addr));
|
||||
memcpy(&ip6h->daddr, ah_attr->grh.dgid.raw,
|
||||
sizeof(struct in6_addr));
|
||||
}
|
||||
|
||||
skb_push(skb, sizeof(struct ethhdr));
|
||||
skb_reset_mac_header(skb);
|
||||
eth = eth_hdr(skb);
|
||||
skb->protocol = eth->h_proto = htons(is_ipv4 ? ETH_P_IP : ETH_P_IPV6);
|
||||
rdma_read_gid_l2_fields(ah_attr->grh.sgid_attr, NULL, smac);
|
||||
memcpy(eth->h_source, smac, ETH_ALEN);
|
||||
memcpy(eth->h_dest, ah_attr->roce.dmac, ETH_ALEN);
|
||||
|
||||
return skb;
|
||||
}
|
||||
|
||||
static struct net_device *rdma_get_xmit_slave_udp(struct ib_device *device,
|
||||
struct net_device *master,
|
||||
struct rdma_ah_attr *ah_attr,
|
||||
gfp_t flags)
|
||||
{
|
||||
struct net_device *slave;
|
||||
struct sk_buff *skb;
|
||||
|
||||
skb = rdma_build_skb(device, master, ah_attr, flags);
|
||||
if (!skb)
|
||||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
rcu_read_lock();
|
||||
slave = netdev_get_xmit_slave(master, skb,
|
||||
!!(device->lag_flags &
|
||||
RDMA_LAG_FLAGS_HASH_ALL_SLAVES));
|
||||
if (slave)
|
||||
dev_hold(slave);
|
||||
rcu_read_unlock();
|
||||
kfree_skb(skb);
|
||||
return slave;
|
||||
}
|
||||
|
||||
void rdma_lag_put_ah_roce_slave(struct net_device *xmit_slave)
|
||||
{
|
||||
if (xmit_slave)
|
||||
dev_put(xmit_slave);
|
||||
}
|
||||
|
||||
struct net_device *rdma_lag_get_ah_roce_slave(struct ib_device *device,
|
||||
struct rdma_ah_attr *ah_attr,
|
||||
gfp_t flags)
|
||||
{
|
||||
struct net_device *slave = NULL;
|
||||
struct net_device *master;
|
||||
|
||||
if (!(ah_attr->type == RDMA_AH_ATTR_TYPE_ROCE &&
|
||||
ah_attr->grh.sgid_attr->gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP &&
|
||||
ah_attr->grh.flow_label))
|
||||
return NULL;
|
||||
|
||||
rcu_read_lock();
|
||||
master = rdma_read_gid_attr_ndev_rcu(ah_attr->grh.sgid_attr);
|
||||
if (IS_ERR(master)) {
|
||||
rcu_read_unlock();
|
||||
return master;
|
||||
}
|
||||
dev_hold(master);
|
||||
rcu_read_unlock();
|
||||
|
||||
if (!netif_is_bond_master(master))
|
||||
goto put;
|
||||
|
||||
slave = rdma_get_xmit_slave_udp(device, master, ah_attr, flags);
|
||||
put:
|
||||
dev_put(master);
|
||||
return slave;
|
||||
}
|
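The new lag.c above picks a bond slave by building a throwaway RoCEv2 UDP header, with the source port derived from the flow label, and asking the bonding driver which slave it would hash that packet to. A hedged sketch of how a caller might use the two exported helpers; the function name and the debug print are invented for illustration:

#include <linux/netdevice.h>
#include <rdma/ib_verbs.h>
#include <rdma/lag.h>

/* Hypothetical caller in an AH-creation path. */
static int demo_pick_lag_slave(struct ib_device *device,
			       struct rdma_ah_attr *ah_attr)
{
	struct net_device *slave;

	/* NULL means "no LAG steering": not RoCEv2, or no flow label set. */
	slave = rdma_lag_get_ah_roce_slave(device, ah_attr, GFP_KERNEL);
	if (IS_ERR(slave))
		return PTR_ERR(slave);
	if (!slave)
		return 0;

	pr_debug("steering AH traffic through %s\n", slave->name);

	/* Drops the dev_hold() taken by the get helper. */
	rdma_lag_put_ah_roce_slave(slave);
	return 0;
}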
@ -85,7 +85,6 @@ MODULE_PARM_DESC(send_queue_size, "Size of send queue in number of work requests
|
||||
module_param_named(recv_queue_size, mad_recvq_size, int, 0444);
|
||||
MODULE_PARM_DESC(recv_queue_size, "Size of receive queue in number of work requests");
|
||||
|
||||
/* Client ID 0 is used for snoop-only clients */
|
||||
static DEFINE_XARRAY_ALLOC1(ib_mad_clients);
|
||||
static u32 ib_mad_client_next;
|
||||
static struct list_head ib_mad_port_list;
|
||||
@ -483,141 +482,12 @@ error1:
|
||||
}
|
||||
EXPORT_SYMBOL(ib_register_mad_agent);
|
||||
|
||||
static inline int is_snooping_sends(int mad_snoop_flags)
|
||||
{
|
||||
return (mad_snoop_flags &
|
||||
(/*IB_MAD_SNOOP_POSTED_SENDS |
|
||||
IB_MAD_SNOOP_RMPP_SENDS |*/
|
||||
IB_MAD_SNOOP_SEND_COMPLETIONS /*|
|
||||
IB_MAD_SNOOP_RMPP_SEND_COMPLETIONS*/));
|
||||
}
|
||||
|
||||
static inline int is_snooping_recvs(int mad_snoop_flags)
|
||||
{
|
||||
return (mad_snoop_flags &
|
||||
(IB_MAD_SNOOP_RECVS /*|
|
||||
IB_MAD_SNOOP_RMPP_RECVS*/));
|
||||
}
|
||||
|
||||
static int register_snoop_agent(struct ib_mad_qp_info *qp_info,
|
||||
struct ib_mad_snoop_private *mad_snoop_priv)
|
||||
{
|
||||
struct ib_mad_snoop_private **new_snoop_table;
|
||||
unsigned long flags;
|
||||
int i;
|
||||
|
||||
spin_lock_irqsave(&qp_info->snoop_lock, flags);
|
||||
/* Check for empty slot in array. */
|
||||
for (i = 0; i < qp_info->snoop_table_size; i++)
|
||||
if (!qp_info->snoop_table[i])
|
||||
break;
|
||||
|
||||
if (i == qp_info->snoop_table_size) {
|
||||
/* Grow table. */
|
||||
new_snoop_table = krealloc(qp_info->snoop_table,
|
||||
sizeof mad_snoop_priv *
|
||||
(qp_info->snoop_table_size + 1),
|
||||
GFP_ATOMIC);
|
||||
if (!new_snoop_table) {
|
||||
i = -ENOMEM;
|
||||
goto out;
|
||||
}
|
||||
|
||||
qp_info->snoop_table = new_snoop_table;
|
||||
qp_info->snoop_table_size++;
|
||||
}
|
||||
qp_info->snoop_table[i] = mad_snoop_priv;
|
||||
atomic_inc(&qp_info->snoop_count);
|
||||
out:
|
||||
spin_unlock_irqrestore(&qp_info->snoop_lock, flags);
|
||||
return i;
|
||||
}
|
||||
|
||||
struct ib_mad_agent *ib_register_mad_snoop(struct ib_device *device,
|
||||
u8 port_num,
|
||||
enum ib_qp_type qp_type,
|
||||
int mad_snoop_flags,
|
||||
ib_mad_snoop_handler snoop_handler,
|
||||
ib_mad_recv_handler recv_handler,
|
||||
void *context)
|
||||
{
|
||||
struct ib_mad_port_private *port_priv;
|
||||
struct ib_mad_agent *ret;
|
||||
struct ib_mad_snoop_private *mad_snoop_priv;
|
||||
int qpn;
|
||||
int err;
|
||||
|
||||
/* Validate parameters */
|
||||
if ((is_snooping_sends(mad_snoop_flags) && !snoop_handler) ||
|
||||
(is_snooping_recvs(mad_snoop_flags) && !recv_handler)) {
|
||||
ret = ERR_PTR(-EINVAL);
|
||||
goto error1;
|
||||
}
|
||||
qpn = get_spl_qp_index(qp_type);
|
||||
if (qpn == -1) {
|
||||
ret = ERR_PTR(-EINVAL);
|
||||
goto error1;
|
||||
}
|
||||
port_priv = ib_get_mad_port(device, port_num);
|
||||
if (!port_priv) {
|
||||
ret = ERR_PTR(-ENODEV);
|
||||
goto error1;
|
||||
}
|
||||
/* Allocate structures */
|
||||
mad_snoop_priv = kzalloc(sizeof *mad_snoop_priv, GFP_KERNEL);
|
||||
if (!mad_snoop_priv) {
|
||||
ret = ERR_PTR(-ENOMEM);
|
||||
goto error1;
|
||||
}
|
||||
|
||||
/* Now, fill in the various structures */
|
||||
mad_snoop_priv->qp_info = &port_priv->qp_info[qpn];
|
||||
mad_snoop_priv->agent.device = device;
|
||||
mad_snoop_priv->agent.recv_handler = recv_handler;
|
||||
mad_snoop_priv->agent.snoop_handler = snoop_handler;
|
||||
mad_snoop_priv->agent.context = context;
|
||||
mad_snoop_priv->agent.qp = port_priv->qp_info[qpn].qp;
|
||||
mad_snoop_priv->agent.port_num = port_num;
|
||||
mad_snoop_priv->mad_snoop_flags = mad_snoop_flags;
|
||||
init_completion(&mad_snoop_priv->comp);
|
||||
|
||||
err = ib_mad_agent_security_setup(&mad_snoop_priv->agent, qp_type);
|
||||
if (err) {
|
||||
ret = ERR_PTR(err);
|
||||
goto error2;
|
||||
}
|
||||
|
||||
mad_snoop_priv->snoop_index = register_snoop_agent(
|
||||
&port_priv->qp_info[qpn],
|
||||
mad_snoop_priv);
|
||||
if (mad_snoop_priv->snoop_index < 0) {
|
||||
ret = ERR_PTR(mad_snoop_priv->snoop_index);
|
||||
goto error3;
|
||||
}
|
||||
|
||||
atomic_set(&mad_snoop_priv->refcount, 1);
|
||||
return &mad_snoop_priv->agent;
|
||||
error3:
|
||||
ib_mad_agent_security_cleanup(&mad_snoop_priv->agent);
|
||||
error2:
|
||||
kfree(mad_snoop_priv);
|
||||
error1:
|
||||
return ret;
|
||||
}
|
||||
EXPORT_SYMBOL(ib_register_mad_snoop);
|
||||
|
||||
static inline void deref_mad_agent(struct ib_mad_agent_private *mad_agent_priv)
|
||||
{
|
||||
if (atomic_dec_and_test(&mad_agent_priv->refcount))
|
||||
complete(&mad_agent_priv->comp);
|
||||
}
|
||||
|
||||
static inline void deref_snoop_agent(struct ib_mad_snoop_private *mad_snoop_priv)
|
||||
{
|
||||
if (atomic_dec_and_test(&mad_snoop_priv->refcount))
|
||||
complete(&mad_snoop_priv->comp);
|
||||
}
|
||||
|
||||
static void unregister_mad_agent(struct ib_mad_agent_private *mad_agent_priv)
|
||||
{
|
||||
struct ib_mad_port_private *port_priv;
|
||||
@ -650,25 +520,6 @@ static void unregister_mad_agent(struct ib_mad_agent_private *mad_agent_priv)
|
||||
kfree_rcu(mad_agent_priv, rcu);
|
||||
}
|
||||
|
||||
static void unregister_mad_snoop(struct ib_mad_snoop_private *mad_snoop_priv)
|
||||
{
|
||||
struct ib_mad_qp_info *qp_info;
|
||||
unsigned long flags;
|
||||
|
||||
qp_info = mad_snoop_priv->qp_info;
|
||||
spin_lock_irqsave(&qp_info->snoop_lock, flags);
|
||||
qp_info->snoop_table[mad_snoop_priv->snoop_index] = NULL;
|
||||
atomic_dec(&qp_info->snoop_count);
|
||||
spin_unlock_irqrestore(&qp_info->snoop_lock, flags);
|
||||
|
||||
deref_snoop_agent(mad_snoop_priv);
|
||||
wait_for_completion(&mad_snoop_priv->comp);
|
||||
|
||||
ib_mad_agent_security_cleanup(&mad_snoop_priv->agent);
|
||||
|
||||
kfree(mad_snoop_priv);
|
||||
}
|
||||
|
||||
/*
|
||||
* ib_unregister_mad_agent - Unregisters a client from using MAD services
|
||||
*
|
||||
@ -677,20 +528,11 @@ static void unregister_mad_snoop(struct ib_mad_snoop_private *mad_snoop_priv)
|
||||
void ib_unregister_mad_agent(struct ib_mad_agent *mad_agent)
|
||||
{
|
||||
struct ib_mad_agent_private *mad_agent_priv;
|
||||
struct ib_mad_snoop_private *mad_snoop_priv;
|
||||
|
||||
/* If the TID is zero, the agent can only snoop. */
|
||||
if (mad_agent->hi_tid) {
|
||||
mad_agent_priv = container_of(mad_agent,
|
||||
struct ib_mad_agent_private,
|
||||
agent);
|
||||
unregister_mad_agent(mad_agent_priv);
|
||||
} else {
|
||||
mad_snoop_priv = container_of(mad_agent,
|
||||
struct ib_mad_snoop_private,
|
||||
agent);
|
||||
unregister_mad_snoop(mad_snoop_priv);
|
||||
}
|
||||
mad_agent_priv = container_of(mad_agent,
|
||||
struct ib_mad_agent_private,
|
||||
agent);
|
||||
unregister_mad_agent(mad_agent_priv);
|
||||
}
|
||||
EXPORT_SYMBOL(ib_unregister_mad_agent);
|
||||
|
||||
@ -706,57 +548,6 @@ static void dequeue_mad(struct ib_mad_list_head *mad_list)
|
||||
spin_unlock_irqrestore(&mad_queue->lock, flags);
|
||||
}
|
||||
|
||||
static void snoop_send(struct ib_mad_qp_info *qp_info,
|
||||
struct ib_mad_send_buf *send_buf,
|
||||
struct ib_mad_send_wc *mad_send_wc,
|
||||
int mad_snoop_flags)
|
||||
{
|
||||
struct ib_mad_snoop_private *mad_snoop_priv;
|
||||
unsigned long flags;
|
||||
int i;
|
||||
|
||||
spin_lock_irqsave(&qp_info->snoop_lock, flags);
|
||||
for (i = 0; i < qp_info->snoop_table_size; i++) {
|
||||
mad_snoop_priv = qp_info->snoop_table[i];
|
||||
if (!mad_snoop_priv ||
|
||||
!(mad_snoop_priv->mad_snoop_flags & mad_snoop_flags))
|
||||
continue;
|
||||
|
||||
atomic_inc(&mad_snoop_priv->refcount);
|
||||
spin_unlock_irqrestore(&qp_info->snoop_lock, flags);
|
||||
mad_snoop_priv->agent.snoop_handler(&mad_snoop_priv->agent,
|
||||
send_buf, mad_send_wc);
|
||||
deref_snoop_agent(mad_snoop_priv);
|
||||
spin_lock_irqsave(&qp_info->snoop_lock, flags);
|
||||
}
|
||||
spin_unlock_irqrestore(&qp_info->snoop_lock, flags);
|
||||
}
|
||||
|
||||
static void snoop_recv(struct ib_mad_qp_info *qp_info,
|
||||
struct ib_mad_recv_wc *mad_recv_wc,
|
||||
int mad_snoop_flags)
|
||||
{
|
||||
struct ib_mad_snoop_private *mad_snoop_priv;
|
||||
unsigned long flags;
|
||||
int i;
|
||||
|
||||
spin_lock_irqsave(&qp_info->snoop_lock, flags);
|
||||
for (i = 0; i < qp_info->snoop_table_size; i++) {
|
||||
mad_snoop_priv = qp_info->snoop_table[i];
|
||||
if (!mad_snoop_priv ||
|
||||
!(mad_snoop_priv->mad_snoop_flags & mad_snoop_flags))
|
||||
continue;
|
||||
|
||||
atomic_inc(&mad_snoop_priv->refcount);
|
||||
spin_unlock_irqrestore(&qp_info->snoop_lock, flags);
|
||||
mad_snoop_priv->agent.recv_handler(&mad_snoop_priv->agent, NULL,
|
||||
mad_recv_wc);
|
||||
deref_snoop_agent(mad_snoop_priv);
|
||||
spin_lock_irqsave(&qp_info->snoop_lock, flags);
|
||||
}
|
||||
spin_unlock_irqrestore(&qp_info->snoop_lock, flags);
|
||||
}
|
||||
|
||||
static void build_smp_wc(struct ib_qp *qp, struct ib_cqe *cqe, u16 slid,
|
||||
u16 pkey_index, u8 port_num, struct ib_wc *wc)
|
||||
{
|
||||
@ -2289,9 +2080,6 @@ static void ib_mad_recv_done(struct ib_cq *cq, struct ib_wc *wc)
|
||||
recv->header.recv_wc.recv_buf.mad = (struct ib_mad *)recv->mad;
|
||||
recv->header.recv_wc.recv_buf.grh = &recv->grh;
|
||||
|
||||
if (atomic_read(&qp_info->snoop_count))
|
||||
snoop_recv(qp_info, &recv->header.recv_wc, IB_MAD_SNOOP_RECVS);
|
||||
|
||||
/* Validate MAD */
|
||||
if (!validate_mad((const struct ib_mad_hdr *)recv->mad, qp_info, opa))
|
||||
goto out;
|
||||
@ -2538,9 +2326,6 @@ retry:
|
||||
mad_send_wc.send_buf = &mad_send_wr->send_buf;
|
||||
mad_send_wc.status = wc->status;
|
||||
mad_send_wc.vendor_err = wc->vendor_err;
|
||||
if (atomic_read(&qp_info->snoop_count))
|
||||
snoop_send(qp_info, &mad_send_wr->send_buf, &mad_send_wc,
|
||||
IB_MAD_SNOOP_SEND_COMPLETIONS);
|
||||
ib_mad_complete_send_wr(mad_send_wr, &mad_send_wc);
|
||||
|
||||
if (queued_send_wr) {
|
||||
@ -2782,10 +2567,6 @@ static void local_completions(struct work_struct *work)
|
||||
local->mad_priv->header.recv_wc.recv_buf.grh = NULL;
|
||||
local->mad_priv->header.recv_wc.recv_buf.mad =
|
||||
(struct ib_mad *)local->mad_priv->mad;
|
||||
if (atomic_read(&recv_mad_agent->qp_info->snoop_count))
|
||||
snoop_recv(recv_mad_agent->qp_info,
|
||||
&local->mad_priv->header.recv_wc,
|
||||
IB_MAD_SNOOP_RECVS);
|
||||
recv_mad_agent->agent.recv_handler(
|
||||
&recv_mad_agent->agent,
|
||||
&local->mad_send_wr->send_buf,
|
||||
@ -2800,10 +2581,6 @@ local_send_completion:
|
||||
mad_send_wc.status = IB_WC_SUCCESS;
|
||||
mad_send_wc.vendor_err = 0;
|
||||
mad_send_wc.send_buf = &local->mad_send_wr->send_buf;
|
||||
if (atomic_read(&mad_agent_priv->qp_info->snoop_count))
|
||||
snoop_send(mad_agent_priv->qp_info,
|
||||
&local->mad_send_wr->send_buf,
|
||||
&mad_send_wc, IB_MAD_SNOOP_SEND_COMPLETIONS);
|
||||
mad_agent_priv->agent.send_handler(&mad_agent_priv->agent,
|
||||
&mad_send_wc);
|
||||
|
||||
@ -3119,10 +2896,6 @@ static void init_mad_qp(struct ib_mad_port_private *port_priv,
|
||||
init_mad_queue(qp_info, &qp_info->send_queue);
|
||||
init_mad_queue(qp_info, &qp_info->recv_queue);
|
||||
INIT_LIST_HEAD(&qp_info->overflow_list);
|
||||
spin_lock_init(&qp_info->snoop_lock);
|
||||
qp_info->snoop_table = NULL;
|
||||
qp_info->snoop_table_size = 0;
|
||||
atomic_set(&qp_info->snoop_count, 0);
|
||||
}
|
||||
|
||||
static int create_mad_qp(struct ib_mad_qp_info *qp_info,
|
||||
@ -3166,7 +2939,6 @@ static void destroy_mad_qp(struct ib_mad_qp_info *qp_info)
|
||||
return;
|
||||
|
||||
ib_destroy_qp(qp_info->qp);
|
||||
kfree(qp_info->snoop_table);
|
||||
}
|
||||
|
||||
/*
|
||||
@ -3304,9 +3076,11 @@ static int ib_mad_port_close(struct ib_device *device, int port_num)
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void ib_mad_init_device(struct ib_device *device)
|
||||
static int ib_mad_init_device(struct ib_device *device)
|
||||
{
|
||||
int start, i;
|
||||
unsigned int count = 0;
|
||||
int ret;
|
||||
|
||||
start = rdma_start_port(device);
|
||||
|
||||
@ -3314,17 +3088,23 @@ static void ib_mad_init_device(struct ib_device *device)
|
||||
if (!rdma_cap_ib_mad(device, i))
|
||||
continue;
|
||||
|
||||
if (ib_mad_port_open(device, i)) {
|
||||
ret = ib_mad_port_open(device, i);
|
||||
if (ret) {
|
||||
dev_err(&device->dev, "Couldn't open port %d\n", i);
|
||||
goto error;
|
||||
}
|
||||
if (ib_agent_port_open(device, i)) {
|
||||
ret = ib_agent_port_open(device, i);
|
||||
if (ret) {
|
||||
dev_err(&device->dev,
|
||||
"Couldn't open port %d for agents\n", i);
|
||||
goto error_agent;
|
||||
}
|
||||
count++;
|
||||
}
|
||||
return;
|
||||
if (!count)
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
return 0;
|
||||
|
||||
error_agent:
|
||||
if (ib_mad_port_close(device, i))
|
||||
@ -3341,6 +3121,7 @@ error:
|
||||
if (ib_mad_port_close(device, i))
|
||||
dev_err(&device->dev, "Couldn't close port %d\n", i);
|
||||
}
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void ib_mad_remove_device(struct ib_device *device, void *client_data)
|
||||
|
@ -42,7 +42,7 @@
|
||||
#include <rdma/ib_cache.h>
|
||||
#include "sa.h"
|
||||
|
||||
static void mcast_add_one(struct ib_device *device);
|
||||
static int mcast_add_one(struct ib_device *device);
|
||||
static void mcast_remove_one(struct ib_device *device, void *client_data);
|
||||
|
||||
static struct ib_client mcast_client = {
|
||||
@ -815,7 +815,7 @@ static void mcast_event_handler(struct ib_event_handler *handler,
|
||||
}
|
||||
}
|
||||
|
||||
static void mcast_add_one(struct ib_device *device)
|
||||
static int mcast_add_one(struct ib_device *device)
|
||||
{
|
||||
struct mcast_device *dev;
|
||||
struct mcast_port *port;
|
||||
@ -825,7 +825,7 @@ static void mcast_add_one(struct ib_device *device)
|
||||
dev = kmalloc(struct_size(dev, port, device->phys_port_cnt),
|
||||
GFP_KERNEL);
|
||||
if (!dev)
|
||||
return;
|
||||
return -ENOMEM;
|
||||
|
||||
dev->start_port = rdma_start_port(device);
|
||||
dev->end_port = rdma_end_port(device);
|
||||
@ -845,7 +845,7 @@ static void mcast_add_one(struct ib_device *device)
|
||||
|
||||
if (!count) {
|
||||
kfree(dev);
|
||||
return;
|
||||
return -EOPNOTSUPP;
|
||||
}
|
||||
|
||||
dev->device = device;
|
||||
@ -853,6 +853,7 @@ static void mcast_add_one(struct ib_device *device)
|
||||
|
||||
INIT_IB_EVENT_HANDLER(&dev->event_handler, device, mcast_event_handler);
|
||||
ib_register_event_handler(&dev->event_handler);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void mcast_remove_one(struct ib_device *device, void *client_data)
|
||||
@ -861,9 +862,6 @@ static void mcast_remove_one(struct ib_device *device, void *client_data)
|
||||
struct mcast_port *port;
|
||||
int i;
|
||||
|
||||
if (!dev)
|
||||
return;
|
||||
|
||||
ib_unregister_event_handler(&dev->event_handler);
|
||||
flush_workqueue(mcast_wq);
|
||||
|
||||
|
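The mad, multicast, sa and umad hunks in this series all make the same conversion: an IB client's ->add() callback now returns an errno instead of void, so the core can skip devices the client cannot serve and only calls ->remove() when ->add() succeeded. A hedged sketch of the resulting client shape (the client itself is hypothetical):

#include <rdma/ib_verbs.h>

static int demo_add_one(struct ib_device *device)
{
	unsigned int port;

	rdma_for_each_port(device, port)
		if (rdma_cap_ib_mad(device, port))
			return 0;	/* at least one usable port */

	return -EOPNOTSUPP;		/* core skips this device */
}

static void demo_remove_one(struct ib_device *device, void *client_data)
{
	/* Only reached when demo_add_one() returned 0. */
}

static struct ib_client demo_client = {
	.name   = "demo",
	.add    = demo_add_one,
	.remove = demo_remove_one,
};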
@ -130,6 +130,17 @@ static int uverbs_destroy_uobject(struct ib_uobject *uobj,
|
||||
lockdep_assert_held(&ufile->hw_destroy_rwsem);
|
||||
assert_uverbs_usecnt(uobj, UVERBS_LOOKUP_WRITE);
|
||||
|
||||
if (reason == RDMA_REMOVE_ABORT_HWOBJ) {
|
||||
reason = RDMA_REMOVE_ABORT;
|
||||
ret = uobj->uapi_object->type_class->destroy_hw(uobj, reason,
|
||||
attrs);
|
||||
/*
|
||||
* Drivers are not permitted to ignore RDMA_REMOVE_ABORT, see
|
||||
* ib_is_destroy_retryable, cleanup_retryable == false here.
|
||||
*/
|
||||
WARN_ON(ret);
|
||||
}
|
||||
|
||||
if (reason == RDMA_REMOVE_ABORT) {
|
||||
WARN_ON(!list_empty(&uobj->list));
|
||||
WARN_ON(!uobj->context);
|
||||
@ -653,11 +664,15 @@ void rdma_alloc_commit_uobject(struct ib_uobject *uobj,
|
||||
* object and anything else connected to uobj before calling this.
|
||||
*/
|
||||
void rdma_alloc_abort_uobject(struct ib_uobject *uobj,
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
struct uverbs_attr_bundle *attrs,
|
||||
bool hw_obj_valid)
|
||||
{
|
||||
struct ib_uverbs_file *ufile = uobj->ufile;
|
||||
|
||||
uverbs_destroy_uobject(uobj, RDMA_REMOVE_ABORT, attrs);
|
||||
uverbs_destroy_uobject(uobj,
|
||||
hw_obj_valid ? RDMA_REMOVE_ABORT_HWOBJ :
|
||||
RDMA_REMOVE_ABORT,
|
||||
attrs);
|
||||
|
||||
/* Matches the down_read in rdma_alloc_begin_uobject */
|
||||
up_read(&ufile->hw_destroy_rwsem);
|
||||
@ -927,8 +942,8 @@ uverbs_get_uobject_from_file(u16 object_id, enum uverbs_obj_access access,
|
||||
}
|
||||
|
||||
void uverbs_finalize_object(struct ib_uobject *uobj,
|
||||
enum uverbs_obj_access access, bool commit,
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
enum uverbs_obj_access access, bool hw_obj_valid,
|
||||
bool commit, struct uverbs_attr_bundle *attrs)
|
||||
{
|
||||
/*
|
||||
* refcounts should be handled at the object level and not at the
|
||||
@ -951,7 +966,7 @@ void uverbs_finalize_object(struct ib_uobject *uobj,
|
||||
if (commit)
|
||||
rdma_alloc_commit_uobject(uobj, attrs);
|
||||
else
|
||||
rdma_alloc_abort_uobject(uobj, attrs);
|
||||
rdma_alloc_abort_uobject(uobj, attrs, hw_obj_valid);
|
||||
break;
|
||||
default:
|
||||
WARN_ON(true);
|
||||
|
@ -64,8 +64,8 @@ uverbs_get_uobject_from_file(u16 object_id, enum uverbs_obj_access access,
|
||||
s64 id, struct uverbs_attr_bundle *attrs);
|
||||
|
||||
void uverbs_finalize_object(struct ib_uobject *uobj,
|
||||
enum uverbs_obj_access access, bool commit,
|
||||
struct uverbs_attr_bundle *attrs);
|
||||
enum uverbs_obj_access access, bool hw_obj_valid,
|
||||
bool commit, struct uverbs_attr_bundle *attrs);
|
||||
|
||||
int uverbs_output_written(const struct uverbs_attr_bundle *bundle, size_t idx);
|
||||
|
||||
@ -159,6 +159,9 @@ extern const struct uapi_definition uverbs_def_obj_dm[];
|
||||
extern const struct uapi_definition uverbs_def_obj_flow_action[];
|
||||
extern const struct uapi_definition uverbs_def_obj_intf[];
|
||||
extern const struct uapi_definition uverbs_def_obj_mr[];
|
||||
extern const struct uapi_definition uverbs_def_obj_qp[];
|
||||
extern const struct uapi_definition uverbs_def_obj_srq[];
|
||||
extern const struct uapi_definition uverbs_def_obj_wq[];
|
||||
extern const struct uapi_definition uverbs_def_write_intf[];
|
||||
|
||||
static inline const struct uverbs_api_write_method *
|
||||
|
@ -129,7 +129,7 @@ static int rdma_rw_init_mr_wrs(struct rdma_rw_ctx *ctx, struct ib_qp *qp,
|
||||
qp->integrity_en);
|
||||
int i, j, ret = 0, count = 0;
|
||||
|
||||
ctx->nr_ops = (sg_cnt + pages_per_mr - 1) / pages_per_mr;
|
||||
ctx->nr_ops = DIV_ROUND_UP(sg_cnt, pages_per_mr);
|
||||
ctx->reg = kcalloc(ctx->nr_ops, sizeof(*ctx->reg), GFP_KERNEL);
|
||||
if (!ctx->reg) {
|
||||
ret = -ENOMEM;
|
||||
|
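The rw.c hunk just above swaps open-coded round-up division for the DIV_ROUND_UP() helper; the arithmetic is unchanged. A stand-alone check of the equivalence (user-space C, with the macro copied in to keep the example self-contained):

#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))	/* same arithmetic as the kernel macro */

int main(void)
{
	/* e.g. 7 SG entries at 4 pages per MR still need 2 registrations */
	assert(DIV_ROUND_UP(7, 4) == (7 + 4 - 1) / 4);
	assert(DIV_ROUND_UP(7, 4) == 2);
	assert(DIV_ROUND_UP(8, 4) == 2);
	return 0;
}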
@ -174,7 +174,7 @@ static const struct nla_policy ib_nl_policy[LS_NLA_TYPE_MAX] = {
|
||||
};
|
||||
|
||||
|
||||
static void ib_sa_add_one(struct ib_device *device);
|
||||
static int ib_sa_add_one(struct ib_device *device);
|
||||
static void ib_sa_remove_one(struct ib_device *device, void *client_data);
|
||||
|
||||
static struct ib_client sa_client = {
|
||||
@ -190,7 +190,7 @@ static u32 tid;
|
||||
|
||||
#define PATH_REC_FIELD(field) \
|
||||
.struct_offset_bytes = offsetof(struct sa_path_rec, field), \
|
||||
.struct_size_bytes = sizeof((struct sa_path_rec *)0)->field, \
|
||||
.struct_size_bytes = sizeof_field(struct sa_path_rec, field), \
|
||||
.field_name = "sa_path_rec:" #field
|
||||
|
||||
static const struct ib_field path_rec_table[] = {
|
||||
@ -292,7 +292,7 @@ static const struct ib_field path_rec_table[] = {
|
||||
.struct_offset_bytes = \
|
||||
offsetof(struct sa_path_rec, field), \
|
||||
.struct_size_bytes = \
|
||||
sizeof((struct sa_path_rec *)0)->field, \
|
||||
sizeof_field(struct sa_path_rec, field), \
|
||||
.field_name = "sa_path_rec:" #field
|
||||
|
||||
static const struct ib_field opa_path_rec_table[] = {
|
||||
@ -420,7 +420,7 @@ static const struct ib_field opa_path_rec_table[] = {
|
||||
|
||||
#define MCMEMBER_REC_FIELD(field) \
|
||||
.struct_offset_bytes = offsetof(struct ib_sa_mcmember_rec, field), \
|
||||
.struct_size_bytes = sizeof ((struct ib_sa_mcmember_rec *) 0)->field, \
|
||||
.struct_size_bytes = sizeof_field(struct ib_sa_mcmember_rec, field), \
|
||||
.field_name = "sa_mcmember_rec:" #field
|
||||
|
||||
static const struct ib_field mcmember_rec_table[] = {
|
||||
@ -504,7 +504,7 @@ static const struct ib_field mcmember_rec_table[] = {
|
||||
|
||||
#define SERVICE_REC_FIELD(field) \
|
||||
.struct_offset_bytes = offsetof(struct ib_sa_service_rec, field), \
|
||||
.struct_size_bytes = sizeof ((struct ib_sa_service_rec *) 0)->field, \
|
||||
.struct_size_bytes = sizeof_field(struct ib_sa_service_rec, field), \
|
||||
.field_name = "sa_service_rec:" #field
|
||||
|
||||
static const struct ib_field service_rec_table[] = {
|
||||
@ -552,7 +552,7 @@ static const struct ib_field service_rec_table[] = {
|
||||
|
||||
#define CLASSPORTINFO_REC_FIELD(field) \
|
||||
.struct_offset_bytes = offsetof(struct ib_class_port_info, field), \
|
||||
.struct_size_bytes = sizeof((struct ib_class_port_info *)0)->field, \
|
||||
.struct_size_bytes = sizeof_field(struct ib_class_port_info, field), \
|
||||
.field_name = "ib_class_port_info:" #field
|
||||
|
||||
static const struct ib_field ib_classport_info_rec_table[] = {
|
||||
@ -630,7 +630,7 @@ static const struct ib_field ib_classport_info_rec_table[] = {
|
||||
.struct_offset_bytes =\
|
||||
offsetof(struct opa_class_port_info, field), \
|
||||
.struct_size_bytes = \
|
||||
sizeof((struct opa_class_port_info *)0)->field, \
|
||||
sizeof_field(struct opa_class_port_info, field), \
|
||||
.field_name = "opa_class_port_info:" #field
|
||||
|
||||
static const struct ib_field opa_classport_info_rec_table[] = {
|
||||
@ -710,7 +710,7 @@ static const struct ib_field opa_classport_info_rec_table[] = {
|
||||
|
||||
#define GUIDINFO_REC_FIELD(field) \
|
||||
.struct_offset_bytes = offsetof(struct ib_sa_guidinfo_rec, field), \
|
||||
.struct_size_bytes = sizeof((struct ib_sa_guidinfo_rec *) 0)->field, \
|
||||
.struct_size_bytes = sizeof_field(struct ib_sa_guidinfo_rec, field), \
|
||||
.field_name = "sa_guidinfo_rec:" #field
|
||||
|
||||
static const struct ib_field guidinfo_rec_table[] = {
|
||||
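The .struct_size_bytes conversions above (and the matching ones further down in ud_header.c and the uverbs code) replace the old null-pointer spelling with the sizeof_field() helper; both evaluate to the same compile-time constant and nothing is ever dereferenced. A small stand-alone check, with the record type invented for the demo and the macro written out as the kernel defines it:

#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))

struct demo_rec {		/* hypothetical wire-format record */
	uint64_t service_id;
	uint8_t  gid[16];
};

int main(void)
{
	/* Old spelling and the helper agree. */
	assert(sizeof((struct demo_rec *)0)->gid ==
	       sizeof_field(struct demo_rec, gid));
	assert(sizeof_field(struct demo_rec, service_id) == 8);
	return 0;
}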
@ -1412,17 +1412,13 @@ void ib_sa_pack_path(struct sa_path_rec *rec, void *attribute)
|
||||
EXPORT_SYMBOL(ib_sa_pack_path);
|
||||
|
||||
static bool ib_sa_opa_pathrecord_support(struct ib_sa_client *client,
|
||||
struct ib_device *device,
|
||||
struct ib_sa_device *sa_dev,
|
||||
u8 port_num)
|
||||
{
|
||||
struct ib_sa_device *sa_dev = ib_get_client_data(device, &sa_client);
|
||||
struct ib_sa_port *port;
|
||||
unsigned long flags;
|
||||
bool ret = false;
|
||||
|
||||
if (!sa_dev)
|
||||
return ret;
|
||||
|
||||
port = &sa_dev->port[port_num - sa_dev->start_port];
|
||||
spin_lock_irqsave(&port->classport_lock, flags);
|
||||
if (!port->classport_info.valid)
|
||||
@ -1450,8 +1446,8 @@ enum opa_pr_supported {
|
||||
* query is possible.
|
||||
*/
|
||||
static int opa_pr_query_possible(struct ib_sa_client *client,
|
||||
struct ib_device *device,
|
||||
u8 port_num,
|
||||
struct ib_sa_device *sa_dev,
|
||||
struct ib_device *device, u8 port_num,
|
||||
struct sa_path_rec *rec)
|
||||
{
|
||||
struct ib_port_attr port_attr;
|
||||
@ -1459,7 +1455,7 @@ static int opa_pr_query_possible(struct ib_sa_client *client,
|
||||
if (ib_query_port(device, port_num, &port_attr))
|
||||
return PR_NOT_SUPPORTED;
|
||||
|
||||
if (ib_sa_opa_pathrecord_support(client, device, port_num))
|
||||
if (ib_sa_opa_pathrecord_support(client, sa_dev, port_num))
|
||||
return PR_OPA_SUPPORTED;
|
||||
|
||||
if (port_attr.lid >= be16_to_cpu(IB_MULTICAST_LID_BASE))
|
||||
@ -1574,7 +1570,8 @@ int ib_sa_path_rec_get(struct ib_sa_client *client,
|
||||
|
||||
query->sa_query.port = port;
|
||||
if (rec->rec_type == SA_PATH_REC_TYPE_OPA) {
|
||||
status = opa_pr_query_possible(client, device, port_num, rec);
|
||||
status = opa_pr_query_possible(client, sa_dev, device, port_num,
|
||||
rec);
|
||||
if (status == PR_NOT_SUPPORTED) {
|
||||
ret = -EINVAL;
|
||||
goto err1;
|
||||
@ -2325,18 +2322,19 @@ static void ib_sa_event(struct ib_event_handler *handler,
|
||||
}
|
||||
}
|
||||
|
||||
static void ib_sa_add_one(struct ib_device *device)
|
||||
static int ib_sa_add_one(struct ib_device *device)
|
||||
{
|
||||
struct ib_sa_device *sa_dev;
|
||||
int s, e, i;
|
||||
int count = 0;
|
||||
int ret;
|
||||
|
||||
s = rdma_start_port(device);
|
||||
e = rdma_end_port(device);
|
||||
|
||||
sa_dev = kzalloc(struct_size(sa_dev, port, e - s + 1), GFP_KERNEL);
|
||||
if (!sa_dev)
|
||||
return;
|
||||
return -ENOMEM;
|
||||
|
||||
sa_dev->start_port = s;
|
||||
sa_dev->end_port = e;
|
||||
@ -2356,8 +2354,10 @@ static void ib_sa_add_one(struct ib_device *device)
|
||||
ib_register_mad_agent(device, i + s, IB_QPT_GSI,
|
||||
NULL, 0, send_handler,
|
||||
recv_handler, sa_dev, 0);
|
||||
if (IS_ERR(sa_dev->port[i].agent))
|
||||
if (IS_ERR(sa_dev->port[i].agent)) {
|
||||
ret = PTR_ERR(sa_dev->port[i].agent);
|
||||
goto err;
|
||||
}
|
||||
|
||||
INIT_WORK(&sa_dev->port[i].update_task, update_sm_ah);
|
||||
INIT_DELAYED_WORK(&sa_dev->port[i].ib_cpi_work,
|
||||
@ -2366,8 +2366,10 @@ static void ib_sa_add_one(struct ib_device *device)
|
||||
count++;
|
||||
}
|
||||
|
||||
if (!count)
|
||||
if (!count) {
|
||||
ret = -EOPNOTSUPP;
|
||||
goto free;
|
||||
}
|
||||
|
||||
ib_set_client_data(device, &sa_client, sa_dev);
|
||||
|
||||
@ -2386,7 +2388,7 @@ static void ib_sa_add_one(struct ib_device *device)
|
||||
update_sm_ah(&sa_dev->port[i].update_task);
|
||||
}
|
||||
|
||||
return;
|
||||
return 0;
|
||||
|
||||
err:
|
||||
while (--i >= 0) {
|
||||
@ -2395,7 +2397,7 @@ err:
|
||||
}
|
||||
free:
|
||||
kfree(sa_dev);
|
||||
return;
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void ib_sa_remove_one(struct ib_device *device, void *client_data)
|
||||
@ -2403,9 +2405,6 @@ static void ib_sa_remove_one(struct ib_device *device, void *client_data)
|
||||
struct ib_sa_device *sa_dev = client_data;
|
||||
int i;
|
||||
|
||||
if (!sa_dev)
|
||||
return;
|
||||
|
||||
ib_unregister_event_handler(&sa_dev->event_handler);
|
||||
flush_workqueue(ib_wq);
|
||||
|
||||
|
@ -1058,8 +1058,7 @@ static int add_port(struct ib_core_device *coredev, int port_num)
|
||||
coredev->ports_kobj,
|
||||
"%d", port_num);
|
||||
if (ret) {
|
||||
kfree(p);
|
||||
return ret;
|
||||
goto err_put;
|
||||
}
|
||||
|
||||
p->gid_attr_group = kzalloc(sizeof(*p->gid_attr_group), GFP_KERNEL);
|
||||
@ -1072,8 +1071,7 @@ static int add_port(struct ib_core_device *coredev, int port_num)
|
||||
ret = kobject_init_and_add(&p->gid_attr_group->kobj, &gid_attr_type,
|
||||
&p->kobj, "gid_attrs");
|
||||
if (ret) {
|
||||
kfree(p->gid_attr_group);
|
||||
goto err_put;
|
||||
goto err_put_gid_attrs;
|
||||
}
|
||||
|
||||
if (device->ops.process_mad && is_full_dev) {
|
||||
@ -1404,8 +1402,10 @@ int ib_port_register_module_stat(struct ib_device *device, u8 port_num,
|
||||
|
||||
ret = kobject_init_and_add(kobj, ktype, &port->kobj, "%s",
|
||||
name);
|
||||
if (ret)
|
||||
if (ret) {
|
||||
kobject_put(kobj);
|
||||
return ret;
|
||||
}
|
||||
}
|
||||
|
||||
return 0;
|
||||
|
@ -52,6 +52,7 @@
|
||||
#include <rdma/rdma_cm_ib.h>
|
||||
#include <rdma/ib_addr.h>
|
||||
#include <rdma/ib.h>
|
||||
#include <rdma/ib_cm.h>
|
||||
#include <rdma/rdma_netlink.h>
|
||||
#include "core_priv.h"
|
||||
|
||||
@ -360,6 +361,9 @@ static int ucma_event_handler(struct rdma_cm_id *cm_id,
|
||||
ucma_copy_conn_event(&uevent->resp.param.conn,
|
||||
&event->param.conn);
|
||||
|
||||
uevent->resp.ece.vendor_id = event->ece.vendor_id;
|
||||
uevent->resp.ece.attr_mod = event->ece.attr_mod;
|
||||
|
||||
if (event->event == RDMA_CM_EVENT_CONNECT_REQUEST) {
|
||||
if (!ctx->backlog) {
|
||||
ret = -ENOMEM;
|
||||
@ -404,7 +408,8 @@ static ssize_t ucma_get_event(struct ucma_file *file, const char __user *inbuf,
|
||||
* Old 32 bit user space does not send the 4 byte padding in the
|
||||
* reserved field. We don't care, allow it to keep working.
|
||||
*/
|
||||
if (out_len < sizeof(uevent->resp) - sizeof(uevent->resp.reserved))
|
||||
if (out_len < sizeof(uevent->resp) - sizeof(uevent->resp.reserved) -
|
||||
sizeof(uevent->resp.ece))
|
||||
return -ENOSPC;
|
||||
|
||||
if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
|
||||
@ -845,7 +850,7 @@ static ssize_t ucma_query_route(struct ucma_file *file,
|
||||
struct sockaddr *addr;
|
||||
int ret = 0;
|
||||
|
||||
if (out_len < sizeof(resp))
|
||||
if (out_len < offsetof(struct rdma_ucm_query_route_resp, ibdev_index))
|
||||
return -ENOSPC;
|
||||
|
||||
if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
|
||||
@ -869,6 +874,7 @@ static ssize_t ucma_query_route(struct ucma_file *file,
|
||||
goto out;
|
||||
|
||||
resp.node_guid = (__force __u64) ctx->cm_id->device->node_guid;
|
||||
resp.ibdev_index = ctx->cm_id->device->index;
|
||||
resp.port_num = ctx->cm_id->port_num;
|
||||
|
||||
if (rdma_cap_ib_sa(ctx->cm_id->device, ctx->cm_id->port_num))
|
||||
@ -880,8 +886,8 @@ static ssize_t ucma_query_route(struct ucma_file *file,
|
||||
|
||||
out:
|
||||
mutex_unlock(&ctx->mutex);
|
||||
if (copy_to_user(u64_to_user_ptr(cmd.response),
|
||||
&resp, sizeof(resp)))
|
||||
if (copy_to_user(u64_to_user_ptr(cmd.response), &resp,
|
||||
min_t(size_t, out_len, sizeof(resp))))
|
||||
ret = -EFAULT;
|
||||
|
||||
ucma_put_ctx(ctx);
|
||||
@ -895,6 +901,7 @@ static void ucma_query_device_addr(struct rdma_cm_id *cm_id,
|
||||
return;
|
||||
|
||||
resp->node_guid = (__force __u64) cm_id->device->node_guid;
|
||||
resp->ibdev_index = cm_id->device->index;
|
||||
resp->port_num = cm_id->port_num;
|
||||
resp->pkey = (__force __u16) cpu_to_be16(
|
||||
ib_addr_get_pkey(&cm_id->route.addr.dev_addr));
|
||||
@ -907,7 +914,7 @@ static ssize_t ucma_query_addr(struct ucma_context *ctx,
|
||||
struct sockaddr *addr;
|
||||
int ret = 0;
|
||||
|
||||
if (out_len < sizeof(resp))
|
||||
if (out_len < offsetof(struct rdma_ucm_query_addr_resp, ibdev_index))
|
||||
return -ENOSPC;
|
||||
|
||||
memset(&resp, 0, sizeof resp);
|
||||
@ -922,7 +929,7 @@ static ssize_t ucma_query_addr(struct ucma_context *ctx,
|
||||
|
||||
ucma_query_device_addr(ctx->cm_id, &resp);
|
||||
|
||||
if (copy_to_user(response, &resp, sizeof(resp)))
|
||||
if (copy_to_user(response, &resp, min_t(size_t, out_len, sizeof(resp))))
|
||||
ret = -EFAULT;
|
||||
|
||||
return ret;
|
||||
@ -974,7 +981,7 @@ static ssize_t ucma_query_gid(struct ucma_context *ctx,
|
||||
struct sockaddr_ib *addr;
|
||||
int ret = 0;
|
||||
|
||||
if (out_len < sizeof(resp))
|
||||
if (out_len < offsetof(struct rdma_ucm_query_addr_resp, ibdev_index))
|
||||
return -ENOSPC;
|
||||
|
||||
memset(&resp, 0, sizeof resp);
|
||||
@ -1007,7 +1014,7 @@ static ssize_t ucma_query_gid(struct ucma_context *ctx,
|
||||
&ctx->cm_id->route.addr.dst_addr);
|
||||
}
|
||||
|
||||
if (copy_to_user(response, &resp, sizeof(resp)))
|
||||
if (copy_to_user(response, &resp, min_t(size_t, out_len, sizeof(resp))))
|
||||
ret = -EFAULT;
|
||||
|
||||
return ret;
|
||||
@ -1070,12 +1077,15 @@ static void ucma_copy_conn_param(struct rdma_cm_id *id,
|
||||
static ssize_t ucma_connect(struct ucma_file *file, const char __user *inbuf,
|
||||
int in_len, int out_len)
|
||||
{
|
||||
struct rdma_ucm_connect cmd;
|
||||
struct rdma_conn_param conn_param;
|
||||
struct rdma_ucm_ece ece = {};
|
||||
struct rdma_ucm_connect cmd;
|
||||
struct ucma_context *ctx;
|
||||
size_t in_size;
|
||||
int ret;
|
||||
|
||||
if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
|
||||
in_size = min_t(size_t, in_len, sizeof(cmd));
|
||||
if (copy_from_user(&cmd, inbuf, in_size))
|
||||
return -EFAULT;
|
||||
|
||||
if (!cmd.conn_param.valid)
|
||||
@ -1086,8 +1096,13 @@ static ssize_t ucma_connect(struct ucma_file *file, const char __user *inbuf,
|
||||
return PTR_ERR(ctx);
|
||||
|
||||
ucma_copy_conn_param(ctx->cm_id, &conn_param, &cmd.conn_param);
|
||||
if (offsetofend(typeof(cmd), ece) <= in_size) {
|
||||
ece.vendor_id = cmd.ece.vendor_id;
|
||||
ece.attr_mod = cmd.ece.attr_mod;
|
||||
}
|
||||
|
||||
mutex_lock(&ctx->mutex);
|
||||
ret = rdma_connect(ctx->cm_id, &conn_param);
|
||||
ret = rdma_connect_ece(ctx->cm_id, &conn_param, &ece);
|
||||
mutex_unlock(&ctx->mutex);
|
||||
ucma_put_ctx(ctx);
|
||||
return ret;
|
||||
@ -1121,28 +1136,36 @@ static ssize_t ucma_accept(struct ucma_file *file, const char __user *inbuf,
|
||||
{
|
||||
struct rdma_ucm_accept cmd;
|
||||
struct rdma_conn_param conn_param;
|
||||
struct rdma_ucm_ece ece = {};
|
||||
struct ucma_context *ctx;
|
||||
size_t in_size;
|
||||
int ret;
|
||||
|
||||
if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
|
||||
in_size = min_t(size_t, in_len, sizeof(cmd));
|
||||
if (copy_from_user(&cmd, inbuf, in_size))
|
||||
return -EFAULT;
|
||||
|
||||
ctx = ucma_get_ctx_dev(file, cmd.id);
|
||||
if (IS_ERR(ctx))
|
||||
return PTR_ERR(ctx);
|
||||
|
||||
if (offsetofend(typeof(cmd), ece) <= in_size) {
|
||||
ece.vendor_id = cmd.ece.vendor_id;
|
||||
ece.attr_mod = cmd.ece.attr_mod;
|
||||
}
|
||||
|
||||
if (cmd.conn_param.valid) {
|
||||
ucma_copy_conn_param(ctx->cm_id, &conn_param, &cmd.conn_param);
|
||||
mutex_lock(&file->mut);
|
||||
mutex_lock(&ctx->mutex);
|
||||
ret = __rdma_accept(ctx->cm_id, &conn_param, NULL);
|
||||
ret = __rdma_accept_ece(ctx->cm_id, &conn_param, NULL, &ece);
|
||||
mutex_unlock(&ctx->mutex);
|
||||
if (!ret)
|
||||
ctx->uid = cmd.uid;
|
||||
mutex_unlock(&file->mut);
|
||||
} else {
|
||||
mutex_lock(&ctx->mutex);
|
||||
ret = __rdma_accept(ctx->cm_id, NULL, NULL);
|
||||
ret = __rdma_accept_ece(ctx->cm_id, NULL, NULL, &ece);
|
||||
mutex_unlock(&ctx->mutex);
|
||||
}
|
||||
ucma_put_ctx(ctx);
|
||||
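The in_size/offsetofend() dance in ucma_connect() and ucma_accept() above is the usual way to grow a write() command struct: old userspace passes a shorter buffer, so the new trailing ECE members are read only when the supplied length proves they are present. A hedged stand-alone illustration of the pattern; the struct and field names are invented, and offsetofend() is written out as the kernel defines it:

#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical command struct that grew a trailing "ece" member. */
struct demo_cmd {
	unsigned int id;
	struct { unsigned int vendor_id, attr_mod; } ece;	/* new member */
};

static void demo_parse(const void *inbuf, size_t in_len)
{
	struct demo_cmd cmd = { 0 };
	size_t in_size = in_len < sizeof(cmd) ? in_len : sizeof(cmd);

	memcpy(&cmd, inbuf, in_size);

	/* Only trust the new member when the caller actually sent it. */
	if (offsetofend(struct demo_cmd, ece) <= in_size)
		printf("ece.vendor_id=%u\n", cmd.ece.vendor_id);
	else
		printf("old userspace: no ECE data\n");
}

int main(void)
{
	struct demo_cmd full = { .id = 1, .ece = { .vendor_id = 7, .attr_mod = 0 } };

	demo_parse(&full, sizeof(full));		/* new userspace */
	demo_parse(&full, sizeof(unsigned int));	/* old, short write */
	return 0;
}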
@ -1159,12 +1182,24 @@ static ssize_t ucma_reject(struct ucma_file *file, const char __user *inbuf,
|
||||
if (copy_from_user(&cmd, inbuf, sizeof(cmd)))
|
||||
return -EFAULT;
|
||||
|
||||
if (!cmd.reason)
|
||||
cmd.reason = IB_CM_REJ_CONSUMER_DEFINED;
|
||||
|
||||
switch (cmd.reason) {
|
||||
case IB_CM_REJ_CONSUMER_DEFINED:
|
||||
case IB_CM_REJ_VENDOR_OPTION_NOT_SUPPORTED:
|
||||
break;
|
||||
default:
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
ctx = ucma_get_ctx_dev(file, cmd.id);
|
||||
if (IS_ERR(ctx))
|
||||
return PTR_ERR(ctx);
|
||||
|
||||
mutex_lock(&ctx->mutex);
|
||||
ret = rdma_reject(ctx->cm_id, cmd.private_data, cmd.private_data_len);
|
||||
ret = rdma_reject(ctx->cm_id, cmd.private_data, cmd.private_data_len,
|
||||
cmd.reason);
|
||||
mutex_unlock(&ctx->mutex);
|
||||
ucma_put_ctx(ctx);
|
||||
return ret;
|
||||
|
@ -41,7 +41,7 @@
|
||||
|
||||
#define STRUCT_FIELD(header, field) \
|
||||
.struct_offset_bytes = offsetof(struct ib_unpacked_ ## header, field), \
|
||||
.struct_size_bytes = sizeof ((struct ib_unpacked_ ## header *) 0)->field, \
|
||||
.struct_size_bytes = sizeof_field(struct ib_unpacked_ ## header, field), \
|
||||
.field_name = #header ":" #field
|
||||
|
||||
static const struct ib_field lrh_table[] = {
|
||||
|
@ -142,7 +142,7 @@ static dev_t dynamic_issm_dev;
|
||||
|
||||
static DEFINE_IDA(umad_ida);
|
||||
|
||||
static void ib_umad_add_one(struct ib_device *device);
|
||||
static int ib_umad_add_one(struct ib_device *device);
|
||||
static void ib_umad_remove_one(struct ib_device *device, void *client_data);
|
||||
|
||||
static void ib_umad_dev_free(struct kref *kref)
|
||||
@ -1352,37 +1352,41 @@ static void ib_umad_kill_port(struct ib_umad_port *port)
|
||||
put_device(&port->dev);
|
||||
}
|
||||
|
||||
static void ib_umad_add_one(struct ib_device *device)
|
||||
static int ib_umad_add_one(struct ib_device *device)
|
||||
{
|
||||
struct ib_umad_device *umad_dev;
|
||||
int s, e, i;
|
||||
int count = 0;
|
||||
int ret;
|
||||
|
||||
s = rdma_start_port(device);
|
||||
e = rdma_end_port(device);
|
||||
|
||||
umad_dev = kzalloc(struct_size(umad_dev, ports, e - s + 1), GFP_KERNEL);
|
||||
if (!umad_dev)
|
||||
return;
|
||||
return -ENOMEM;
|
||||
|
||||
kref_init(&umad_dev->kref);
|
||||
for (i = s; i <= e; ++i) {
|
||||
if (!rdma_cap_ib_mad(device, i))
|
||||
continue;
|
||||
|
||||
if (ib_umad_init_port(device, i, umad_dev,
|
||||
&umad_dev->ports[i - s]))
|
||||
ret = ib_umad_init_port(device, i, umad_dev,
|
||||
&umad_dev->ports[i - s]);
|
||||
if (ret)
|
||||
goto err;
|
||||
|
||||
count++;
|
||||
}
|
||||
|
||||
if (!count)
|
||||
if (!count) {
|
||||
ret = -EOPNOTSUPP;
|
||||
goto free;
|
||||
}
|
||||
|
||||
ib_set_client_data(device, &umad_client, umad_dev);
|
||||
|
||||
return;
|
||||
return 0;
|
||||
|
||||
err:
|
||||
while (--i >= s) {
|
||||
@ -1394,6 +1398,7 @@ err:
|
||||
free:
|
||||
/* balances kref_init */
|
||||
ib_umad_dev_put(umad_dev);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void ib_umad_remove_one(struct ib_device *device, void *client_data)
|
||||
@ -1401,9 +1406,6 @@ static void ib_umad_remove_one(struct ib_device *device, void *client_data)
|
||||
struct ib_umad_device *umad_dev = client_data;
|
||||
unsigned int i;
|
||||
|
||||
if (!umad_dev)
|
||||
return;
|
||||
|
||||
rdma_for_each_port (device, i) {
|
||||
if (rdma_cap_ib_mad(device, i))
|
||||
ib_umad_kill_port(
|
||||
|
@ -142,7 +142,7 @@ struct ib_uverbs_file {
|
||||
* ucontext_lock held
|
||||
*/
|
||||
struct ib_ucontext *ucontext;
|
||||
struct ib_uverbs_async_event_file *async_file;
|
||||
struct ib_uverbs_async_event_file *default_async_file;
|
||||
struct list_head list;
|
||||
|
||||
/*
|
||||
@ -180,6 +180,7 @@ struct ib_uverbs_mcast_entry {
|
||||
|
||||
struct ib_uevent_object {
|
||||
struct ib_uobject uobject;
|
||||
struct ib_uverbs_async_event_file *event_file;
|
||||
/* List member for ib_uverbs_async_event_file list */
|
||||
struct list_head event_list;
|
||||
u32 events_reported;
|
||||
@ -296,6 +297,24 @@ static inline u32 make_port_cap_flags(const struct ib_port_attr *attr)
|
||||
return res;
|
||||
}
|
||||
|
||||
static inline struct ib_uverbs_async_event_file *
|
||||
ib_uverbs_get_async_event(struct uverbs_attr_bundle *attrs,
|
||||
u16 id)
|
||||
{
|
||||
struct ib_uobject *async_ev_file_uobj;
|
||||
struct ib_uverbs_async_event_file *async_ev_file;
|
||||
|
||||
async_ev_file_uobj = uverbs_attr_get_uobject(attrs, id);
|
||||
if (IS_ERR(async_ev_file_uobj))
|
||||
async_ev_file = READ_ONCE(attrs->ufile->default_async_file);
|
||||
else
|
||||
async_ev_file = container_of(async_ev_file_uobj,
|
||||
struct ib_uverbs_async_event_file,
|
||||
uobj);
|
||||
if (async_ev_file)
|
||||
uverbs_uobject_get(&async_ev_file->uobj);
|
||||
return async_ev_file;
|
||||
}
|
||||
|
||||
void copy_port_attr_to_resp(struct ib_port_attr *attr,
|
||||
struct ib_uverbs_query_port_resp *resp,
|
||||
|
@ -311,7 +311,7 @@ static int ib_uverbs_get_context(struct uverbs_attr_bundle *attrs)
|
||||
return 0;
|
||||
|
||||
err_uobj:
|
||||
rdma_alloc_abort_uobject(uobj, attrs);
|
||||
rdma_alloc_abort_uobject(uobj, attrs, false);
|
||||
err_ucontext:
|
||||
kfree(attrs->context);
|
||||
attrs->context = NULL;
|
||||
@ -356,8 +356,6 @@ static void copy_query_dev_fields(struct ib_ucontext *ucontext,
|
||||
resp->max_mcast_qp_attach = attr->max_mcast_qp_attach;
|
||||
resp->max_total_mcast_qp_attach = attr->max_total_mcast_qp_attach;
|
||||
resp->max_ah = attr->max_ah;
|
||||
resp->max_fmr = attr->max_fmr;
|
||||
resp->max_map_per_fmr = attr->max_map_per_fmr;
|
||||
resp->max_srq = attr->max_srq;
|
||||
resp->max_srq_wr = attr->max_srq_wr;
|
||||
resp->max_srq_sge = attr->max_srq_sge;
|
||||
@ -1051,6 +1049,10 @@ static struct ib_ucq_object *create_cq(struct uverbs_attr_bundle *attrs,
|
||||
goto err_free;
|
||||
|
||||
obj->uevent.uobject.object = cq;
|
||||
obj->uevent.event_file = READ_ONCE(attrs->ufile->default_async_file);
|
||||
if (obj->uevent.event_file)
|
||||
uverbs_uobject_get(&obj->uevent.event_file->uobj);
|
||||
|
||||
memset(&resp, 0, sizeof resp);
|
||||
resp.base.cq_handle = obj->uevent.uobject.id;
|
||||
resp.base.cqe = cq->cqe;
|
||||
@ -1067,6 +1069,8 @@ static struct ib_ucq_object *create_cq(struct uverbs_attr_bundle *attrs,
|
||||
return obj;
|
||||
|
||||
err_cb:
|
||||
if (obj->uevent.event_file)
|
||||
uverbs_uobject_put(&obj->uevent.event_file->uobj);
|
||||
ib_destroy_cq_user(cq, uverbs_get_cleared_udata(attrs));
|
||||
cq = NULL;
|
||||
err_free:
|
||||
@ -1460,6 +1464,9 @@ static int create_qp(struct uverbs_attr_bundle *attrs,
|
||||
}
|
||||
|
||||
obj->uevent.uobject.object = qp;
|
||||
obj->uevent.event_file = READ_ONCE(attrs->ufile->default_async_file);
|
||||
if (obj->uevent.event_file)
|
||||
uverbs_uobject_get(&obj->uevent.event_file->uobj);
|
||||
|
||||
memset(&resp, 0, sizeof resp);
|
||||
resp.base.qpn = qp->qp_num;
|
||||
@ -1473,7 +1480,7 @@ static int create_qp(struct uverbs_attr_bundle *attrs,
|
||||
|
||||
ret = uverbs_response(attrs, &resp, sizeof(resp));
|
||||
if (ret)
|
||||
goto err_cb;
|
||||
goto err_uevent;
|
||||
|
||||
if (xrcd) {
|
||||
obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object,
|
||||
@ -1498,6 +1505,9 @@ static int create_qp(struct uverbs_attr_bundle *attrs,
|
||||
|
||||
rdma_alloc_commit_uobject(&obj->uevent.uobject, attrs);
|
||||
return 0;
|
||||
err_uevent:
|
||||
if (obj->uevent.event_file)
|
||||
uverbs_uobject_put(&obj->uevent.event_file->uobj);
|
||||
err_cb:
|
||||
ib_destroy_qp_user(qp, uverbs_get_cleared_udata(attrs));
|
||||
|
||||
@ -2954,11 +2964,11 @@ static int ib_uverbs_ex_create_wq(struct uverbs_attr_bundle *attrs)
|
||||
wq_init_attr.cq = cq;
|
||||
wq_init_attr.max_sge = cmd.max_sge;
|
||||
wq_init_attr.max_wr = cmd.max_wr;
|
||||
wq_init_attr.wq_context = attrs->ufile;
|
||||
wq_init_attr.wq_type = cmd.wq_type;
|
||||
wq_init_attr.event_handler = ib_uverbs_wq_event_handler;
|
||||
wq_init_attr.create_flags = cmd.create_flags;
|
||||
INIT_LIST_HEAD(&obj->uevent.event_list);
|
||||
obj->uevent.uobject.user_handle = cmd.user_handle;
|
||||
|
||||
wq = pd->device->ops.create_wq(pd, &wq_init_attr, &attrs->driver_udata);
|
||||
if (IS_ERR(wq)) {
|
||||
@ -2972,12 +2982,12 @@ static int ib_uverbs_ex_create_wq(struct uverbs_attr_bundle *attrs)
|
||||
wq->cq = cq;
|
||||
wq->pd = pd;
|
||||
wq->device = pd->device;
|
||||
wq->wq_context = wq_init_attr.wq_context;
|
||||
atomic_set(&wq->usecnt, 0);
|
||||
atomic_inc(&pd->usecnt);
|
||||
atomic_inc(&cq->usecnt);
|
||||
wq->uobject = obj;
|
||||
obj->uevent.uobject.object = wq;
|
||||
obj->uevent.event_file = READ_ONCE(attrs->ufile->default_async_file);
|
||||
if (obj->uevent.event_file)
|
||||
uverbs_uobject_get(&obj->uevent.event_file->uobj);
|
||||
|
||||
memset(&resp, 0, sizeof(resp));
|
||||
resp.wq_handle = obj->uevent.uobject.id;
|
||||
@ -2996,6 +3006,8 @@ static int ib_uverbs_ex_create_wq(struct uverbs_attr_bundle *attrs)
|
||||
return 0;
|
||||
|
||||
err_copy:
|
||||
if (obj->uevent.event_file)
|
||||
uverbs_uobject_put(&obj->uevent.event_file->uobj);
|
||||
ib_destroy_wq(wq, uverbs_get_cleared_udata(attrs));
|
||||
err_put_cq:
|
||||
rdma_lookup_put_uobject(&cq->uobject->uevent.uobject,
|
||||
@ -3441,46 +3453,25 @@ static int __uverbs_create_xsrq(struct uverbs_attr_bundle *attrs,
|
||||
}
|
||||
|
||||
attr.event_handler = ib_uverbs_srq_event_handler;
|
||||
attr.srq_context = attrs->ufile;
|
||||
attr.srq_type = cmd->srq_type;
|
||||
attr.attr.max_wr = cmd->max_wr;
|
||||
attr.attr.max_sge = cmd->max_sge;
|
||||
attr.attr.srq_limit = cmd->srq_limit;
|
||||
|
||||
INIT_LIST_HEAD(&obj->uevent.event_list);
|
||||
obj->uevent.uobject.user_handle = cmd->user_handle;
|
||||
|
||||
srq = rdma_zalloc_drv_obj(ib_dev, ib_srq);
|
||||
if (!srq) {
|
||||
ret = -ENOMEM;
|
||||
goto err_put;
|
||||
srq = ib_create_srq_user(pd, &attr, obj, udata);
|
||||
if (IS_ERR(srq)) {
|
||||
ret = PTR_ERR(srq);
|
||||
goto err_put_pd;
|
||||
}
|
||||
|
||||
srq->device = pd->device;
|
||||
srq->pd = pd;
|
||||
srq->srq_type = cmd->srq_type;
|
||||
srq->uobject = obj;
|
||||
srq->event_handler = attr.event_handler;
|
||||
srq->srq_context = attr.srq_context;
|
||||
|
||||
ret = pd->device->ops.create_srq(srq, &attr, udata);
|
||||
if (ret)
|
||||
goto err_free;
|
||||
|
||||
if (ib_srq_has_cq(cmd->srq_type)) {
|
||||
srq->ext.cq = attr.ext.cq;
|
||||
atomic_inc(&attr.ext.cq->usecnt);
|
||||
}
|
||||
|
||||
if (cmd->srq_type == IB_SRQT_XRC) {
|
||||
srq->ext.xrc.xrcd = attr.ext.xrc.xrcd;
|
||||
atomic_inc(&attr.ext.xrc.xrcd->usecnt);
|
||||
}
|
||||
|
||||
atomic_inc(&pd->usecnt);
|
||||
atomic_set(&srq->usecnt, 0);
|
||||
|
||||
obj->uevent.uobject.object = srq;
|
||||
obj->uevent.uobject.user_handle = cmd->user_handle;
|
||||
obj->uevent.event_file = READ_ONCE(attrs->ufile->default_async_file);
|
||||
if (obj->uevent.event_file)
|
||||
uverbs_uobject_get(&obj->uevent.event_file->uobj);
|
||||
|
||||
memset(&resp, 0, sizeof resp);
|
||||
resp.srq_handle = obj->uevent.uobject.id;
|
||||
@ -3505,14 +3496,11 @@ static int __uverbs_create_xsrq(struct uverbs_attr_bundle *attrs,
|
||||
return 0;
|
||||
|
||||
err_copy:
|
||||
if (obj->uevent.event_file)
|
||||
uverbs_uobject_put(&obj->uevent.event_file->uobj);
|
||||
ib_destroy_srq_user(srq, uverbs_get_cleared_udata(attrs));
|
||||
/* It was released in ib_destroy_srq_user */
|
||||
srq = NULL;
|
||||
err_free:
|
||||
kfree(srq);
|
||||
err_put:
|
||||
err_put_pd:
|
||||
uobj_put_obj_read(pd);
|
||||
|
||||
err_put_cq:
|
||||
if (ib_srq_has_cq(cmd->srq_type))
|
||||
rdma_lookup_put_uobject(&attr.ext.cq->uobject->uevent.uobject,
|
||||
@ -3751,7 +3739,7 @@ static int ib_uverbs_ex_modify_cq(struct uverbs_attr_bundle *attrs)
|
||||
#define UAPI_DEF_WRITE_IO(req, resp) \
|
||||
.write.has_resp = 1 + \
|
||||
BUILD_BUG_ON_ZERO(offsetof(req, response) != 0) + \
|
||||
BUILD_BUG_ON_ZERO(sizeof(((req *)0)->response) != \
|
||||
BUILD_BUG_ON_ZERO(sizeof_field(req, response) != \
|
||||
sizeof(u64)), \
|
||||
.write.req_size = sizeof(req), .write.resp_size = sizeof(resp)
|
||||
|
||||
|
@ -58,6 +58,7 @@ struct bundle_priv {
|
||||
|
||||
DECLARE_BITMAP(uobj_finalize, UVERBS_API_ATTR_BKEY_LEN);
|
||||
DECLARE_BITMAP(spec_finalize, UVERBS_API_ATTR_BKEY_LEN);
|
||||
DECLARE_BITMAP(uobj_hw_obj_valid, UVERBS_API_ATTR_BKEY_LEN);
|
||||
|
||||
/*
|
||||
* Must be last. bundle ends in a flex array which overlaps
|
||||
@ -136,7 +137,7 @@ EXPORT_SYMBOL(_uverbs_alloc);
|
||||
static bool uverbs_is_attr_cleared(const struct ib_uverbs_attr *uattr,
|
||||
u16 len)
|
||||
{
|
||||
if (uattr->len > sizeof(((struct ib_uverbs_attr *)0)->data))
|
||||
if (uattr->len > sizeof_field(struct ib_uverbs_attr, data))
|
||||
return ib_is_buffer_cleared(u64_to_user_ptr(uattr->data) + len,
|
||||
uattr->len - len);
|
||||
|
||||
@ -230,7 +231,8 @@ static void uverbs_free_idrs_array(const struct uverbs_api_attr *attr_uapi,
|
||||
|
||||
for (i = 0; i != attr->len; i++)
|
||||
uverbs_finalize_object(attr->uobjects[i],
|
||||
spec->u2.objs_arr.access, commit, attrs);
|
||||
spec->u2.objs_arr.access, false, commit,
|
||||
attrs);
|
||||
}
|
||||
|
||||
static int uverbs_process_attr(struct bundle_priv *pbundle,
|
||||
@ -502,7 +504,9 @@ static void bundle_destroy(struct bundle_priv *pbundle, bool commit)
|
||||
|
||||
uverbs_finalize_object(
|
||||
attr->obj_attr.uobject,
|
||||
attr->obj_attr.attr_elm->spec.u.obj.access, commit,
|
||||
attr->obj_attr.attr_elm->spec.u.obj.access,
|
||||
test_bit(i, pbundle->uobj_hw_obj_valid),
|
||||
commit,
|
||||
&pbundle->bundle);
|
||||
}
|
||||
|
||||
@ -590,6 +594,8 @@ static int ib_uverbs_cmd_verbs(struct ib_uverbs_file *ufile,
|
||||
sizeof(pbundle->bundle.attr_present));
|
||||
memset(pbundle->uobj_finalize, 0, sizeof(pbundle->uobj_finalize));
|
||||
memset(pbundle->spec_finalize, 0, sizeof(pbundle->spec_finalize));
|
||||
memset(pbundle->uobj_hw_obj_valid, 0,
|
||||
sizeof(pbundle->uobj_hw_obj_valid));
|
||||
|
||||
ret = ib_uverbs_run_method(pbundle, hdr->num_attrs);
|
||||
bundle_destroy(pbundle, ret == 0);
|
||||
@ -784,3 +790,15 @@ int uverbs_copy_to_struct_or_zero(const struct uverbs_attr_bundle *bundle,
|
||||
}
|
||||
return uverbs_copy_to(bundle, idx, from, size);
|
||||
}
|
||||
|
||||
/* Once called an abort will call through to the type's destroy_hw() */
|
||||
void uverbs_finalize_uobj_create(const struct uverbs_attr_bundle *bundle,
|
||||
u16 idx)
|
||||
{
|
||||
struct bundle_priv *pbundle =
|
||||
container_of(bundle, struct bundle_priv, bundle);
|
||||
|
||||
__set_bit(uapi_bkey_attr(uapi_key_attr(idx)),
|
||||
pbundle->uobj_hw_obj_valid);
|
||||
}
|
||||
EXPORT_SYMBOL(uverbs_finalize_uobj_create);
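The helper above changes the error-handling shape of the ioctl create handlers that follow in this series: any failure before the call must tear down the HW object by hand, while after the call a plain return is enough because the uobject abort path invokes the type's destroy_hw(). A minimal sketch of that calling convention, modeled on the CQ/QP/SRQ/WQ handlers below; the foo names, FOO_* attribute ids and create_foo_hw() are hypothetical placeholders, not part of this series:

/*
 * Illustrative sketch only: the calling convention expected around
 * uverbs_finalize_uobj_create().  FOO_HANDLE, FOO_RESP, struct foo and
 * create_foo_hw() are hypothetical.
 */
static int foo_create_handler(struct uverbs_attr_bundle *attrs)
{
	struct ib_uobject *uobj =
		uverbs_attr_get_uobject(attrs, FOO_HANDLE);
	struct foo *foo;
	int ret;

	foo = create_foo_hw(attrs);	/* hypothetical HW allocation */
	if (IS_ERR(foo))
		return PTR_ERR(foo);	/* nothing to undo yet */

	uobj->object = foo;
	uverbs_finalize_uobj_create(attrs, FOO_HANDLE);

	/*
	 * Past this point a plain return is enough: on failure the
	 * uobject abort path calls the type's destroy_hw() and frees
	 * the HW object for us.
	 */
	ret = uverbs_copy_to(attrs, FOO_RESP, &foo->id, sizeof(foo->id));
	return ret;
}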
|
||||
|
@ -75,7 +75,7 @@ static dev_t dynamic_uverbs_dev;
|
||||
static struct class *uverbs_class;
|
||||
|
||||
static DEFINE_IDA(uverbs_ida);
|
||||
static void ib_uverbs_add_one(struct ib_device *device);
|
||||
static int ib_uverbs_add_one(struct ib_device *device);
|
||||
static void ib_uverbs_remove_one(struct ib_device *device, void *client_data);
|
||||
|
||||
/*
|
||||
@ -146,8 +146,7 @@ void ib_uverbs_release_ucq(struct ib_uverbs_completion_event_file *ev_file,
|
||||
|
||||
void ib_uverbs_release_uevent(struct ib_uevent_object *uobj)
|
||||
{
|
||||
struct ib_uverbs_async_event_file *async_file =
|
||||
READ_ONCE(uobj->uobject.ufile->async_file);
|
||||
struct ib_uverbs_async_event_file *async_file = uobj->event_file;
|
||||
struct ib_uverbs_event *evt, *tmp;
|
||||
|
||||
if (!async_file)
|
||||
@ -159,6 +158,7 @@ void ib_uverbs_release_uevent(struct ib_uevent_object *uobj)
|
||||
kfree(evt);
|
||||
}
|
||||
spin_unlock_irq(&async_file->ev_queue.lock);
|
||||
uverbs_uobject_put(&async_file->uobj);
|
||||
}
|
||||
|
||||
void ib_uverbs_detach_umcast(struct ib_qp *qp,
|
||||
@ -197,8 +197,8 @@ void ib_uverbs_release_file(struct kref *ref)
|
||||
if (atomic_dec_and_test(&file->device->refcount))
|
||||
ib_uverbs_comp_dev(file->device);
|
||||
|
||||
if (file->async_file)
|
||||
uverbs_uobject_put(&file->async_file->uobj);
|
||||
if (file->default_async_file)
|
||||
uverbs_uobject_put(&file->default_async_file->uobj);
|
||||
put_device(&file->device->dev);
|
||||
|
||||
if (file->disassociate_page)
|
||||
@ -296,6 +296,8 @@ static __poll_t ib_uverbs_event_poll(struct ib_uverbs_event_queue *ev_queue,
|
||||
spin_lock_irq(&ev_queue->lock);
|
||||
if (!list_empty(&ev_queue->event_list))
|
||||
pollflags = EPOLLIN | EPOLLRDNORM;
|
||||
else if (ev_queue->is_closed)
|
||||
pollflags = EPOLLERR;
|
||||
spin_unlock_irq(&ev_queue->lock);
|
||||
|
||||
return pollflags;
|
||||
@ -425,7 +427,7 @@ void ib_uverbs_async_handler(struct ib_uverbs_async_event_file *async_file,
|
||||
static void uverbs_uobj_event(struct ib_uevent_object *eobj,
|
||||
struct ib_event *event)
|
||||
{
|
||||
ib_uverbs_async_handler(READ_ONCE(eobj->uobject.ufile->async_file),
|
||||
ib_uverbs_async_handler(eobj->event_file,
|
||||
eobj->uobject.user_handle, event->event,
|
||||
&eobj->event_list, &eobj->events_reported);
|
||||
}
|
||||
@ -482,10 +484,10 @@ void ib_uverbs_init_async_event_file(
|
||||
|
||||
/* The first async_event_file becomes the default one for the file. */
|
||||
mutex_lock(&uverbs_file->ucontext_lock);
|
||||
if (!uverbs_file->async_file) {
|
||||
if (!uverbs_file->default_async_file) {
|
||||
/* Pairs with the put in ib_uverbs_release_file */
|
||||
uverbs_uobject_get(&async_file->uobj);
|
||||
smp_store_release(&uverbs_file->async_file, async_file);
|
||||
smp_store_release(&uverbs_file->default_async_file, async_file);
|
||||
}
|
||||
mutex_unlock(&uverbs_file->ucontext_lock);
|
||||
|
||||
@ -1092,7 +1094,7 @@ static int ib_uverbs_create_uapi(struct ib_device *device,
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void ib_uverbs_add_one(struct ib_device *device)
|
||||
static int ib_uverbs_add_one(struct ib_device *device)
|
||||
{
|
||||
int devnum;
|
||||
dev_t base;
|
||||
@ -1100,16 +1102,16 @@ static void ib_uverbs_add_one(struct ib_device *device)
|
||||
int ret;
|
||||
|
||||
if (!device->ops.alloc_ucontext)
|
||||
return;
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
uverbs_dev = kzalloc(sizeof(*uverbs_dev), GFP_KERNEL);
|
||||
if (!uverbs_dev)
|
||||
return;
|
||||
return -ENOMEM;
|
||||
|
||||
ret = init_srcu_struct(&uverbs_dev->disassociate_srcu);
|
||||
if (ret) {
|
||||
kfree(uverbs_dev);
|
||||
return;
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
device_initialize(&uverbs_dev->dev);
|
||||
@ -1129,15 +1131,18 @@ static void ib_uverbs_add_one(struct ib_device *device)
|
||||
|
||||
devnum = ida_alloc_max(&uverbs_ida, IB_UVERBS_MAX_DEVICES - 1,
|
||||
GFP_KERNEL);
|
||||
if (devnum < 0)
|
||||
if (devnum < 0) {
|
||||
ret = -ENOMEM;
|
||||
goto err;
|
||||
}
|
||||
uverbs_dev->devnum = devnum;
|
||||
if (devnum >= IB_UVERBS_NUM_FIXED_MINOR)
|
||||
base = dynamic_uverbs_dev + devnum - IB_UVERBS_NUM_FIXED_MINOR;
|
||||
else
|
||||
base = IB_UVERBS_BASE_DEV + devnum;
|
||||
|
||||
if (ib_uverbs_create_uapi(device, uverbs_dev))
|
||||
ret = ib_uverbs_create_uapi(device, uverbs_dev);
|
||||
if (ret)
|
||||
goto err_uapi;
|
||||
|
||||
uverbs_dev->dev.devt = base;
|
||||
@ -1152,7 +1157,7 @@ static void ib_uverbs_add_one(struct ib_device *device)
|
||||
goto err_uapi;
|
||||
|
||||
ib_set_client_data(device, &uverbs_client, uverbs_dev);
|
||||
return;
|
||||
return 0;
|
||||
|
||||
err_uapi:
|
||||
ida_free(&uverbs_ida, devnum);
|
||||
@ -1161,7 +1166,7 @@ err:
|
||||
ib_uverbs_comp_dev(uverbs_dev);
|
||||
wait_for_completion(&uverbs_dev->comp);
|
||||
put_device(&uverbs_dev->dev);
|
||||
return;
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void ib_uverbs_free_hw_resources(struct ib_uverbs_device *uverbs_dev,
|
||||
@ -1201,9 +1206,6 @@ static void ib_uverbs_remove_one(struct ib_device *device, void *client_data)
|
||||
struct ib_uverbs_device *uverbs_dev = client_data;
|
||||
int wait_clients = 1;
|
||||
|
||||
if (!uverbs_dev)
|
||||
return;
|
||||
|
||||
cdev_device_del(&uverbs_dev->cdev, &uverbs_dev->dev);
|
||||
ida_free(&uverbs_ida, uverbs_dev->devnum);
|
||||
|
||||
|
@ -75,40 +75,6 @@ static int uverbs_free_mw(struct ib_uobject *uobject,
|
||||
return uverbs_dealloc_mw((struct ib_mw *)uobject->object);
|
||||
}
|
||||
|
||||
static int uverbs_free_qp(struct ib_uobject *uobject,
|
||||
enum rdma_remove_reason why,
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
{
|
||||
struct ib_qp *qp = uobject->object;
|
||||
struct ib_uqp_object *uqp =
|
||||
container_of(uobject, struct ib_uqp_object, uevent.uobject);
|
||||
int ret;
|
||||
|
||||
/*
|
||||
* If this is a user triggered destroy then do not allow destruction
|
||||
* until the user cleans up all the mcast bindings. Unlike in other
|
||||
* places we forcibly clean up the mcast attachments for !DESTROY
|
||||
* because the mcast attaches are not uobjects and will not be
|
||||
* destroyed by anything else during cleanup processing.
|
||||
*/
|
||||
if (why == RDMA_REMOVE_DESTROY) {
|
||||
if (!list_empty(&uqp->mcast_list))
|
||||
return -EBUSY;
|
||||
} else if (qp == qp->real_qp) {
|
||||
ib_uverbs_detach_umcast(qp, uqp);
|
||||
}
|
||||
|
||||
ret = ib_destroy_qp_user(qp, &attrs->driver_udata);
|
||||
if (ib_is_destroy_retryable(ret, why, uobject))
|
||||
return ret;
|
||||
|
||||
if (uqp->uxrcd)
|
||||
atomic_dec(&uqp->uxrcd->refcnt);
|
||||
|
||||
ib_uverbs_release_uevent(&uqp->uevent);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int uverbs_free_rwq_ind_tbl(struct ib_uobject *uobject,
|
||||
enum rdma_remove_reason why,
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
@ -125,48 +91,6 @@ static int uverbs_free_rwq_ind_tbl(struct ib_uobject *uobject,
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int uverbs_free_wq(struct ib_uobject *uobject,
|
||||
enum rdma_remove_reason why,
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
{
|
||||
struct ib_wq *wq = uobject->object;
|
||||
struct ib_uwq_object *uwq =
|
||||
container_of(uobject, struct ib_uwq_object, uevent.uobject);
|
||||
int ret;
|
||||
|
||||
ret = ib_destroy_wq(wq, &attrs->driver_udata);
|
||||
if (ib_is_destroy_retryable(ret, why, uobject))
|
||||
return ret;
|
||||
|
||||
ib_uverbs_release_uevent(&uwq->uevent);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int uverbs_free_srq(struct ib_uobject *uobject,
|
||||
enum rdma_remove_reason why,
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
{
|
||||
struct ib_srq *srq = uobject->object;
|
||||
struct ib_uevent_object *uevent =
|
||||
container_of(uobject, struct ib_uevent_object, uobject);
|
||||
enum ib_srq_type srq_type = srq->srq_type;
|
||||
int ret;
|
||||
|
||||
ret = ib_destroy_srq_user(srq, &attrs->driver_udata);
|
||||
if (ib_is_destroy_retryable(ret, why, uobject))
|
||||
return ret;
|
||||
|
||||
if (srq_type == IB_SRQT_XRC) {
|
||||
struct ib_usrq_object *us =
|
||||
container_of(uevent, struct ib_usrq_object, uevent);
|
||||
|
||||
atomic_dec(&us->uxrcd->refcnt);
|
||||
}
|
||||
|
||||
ib_uverbs_release_uevent(uevent);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int uverbs_free_xrcd(struct ib_uobject *uobject,
|
||||
enum rdma_remove_reason why,
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
@ -252,10 +176,6 @@ DECLARE_UVERBS_NAMED_OBJECT(
|
||||
"[infinibandevent]",
|
||||
O_RDONLY));
|
||||
|
||||
DECLARE_UVERBS_NAMED_OBJECT(
|
||||
UVERBS_OBJECT_QP,
|
||||
UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uqp_object), uverbs_free_qp));
|
||||
|
||||
DECLARE_UVERBS_NAMED_METHOD_DESTROY(
|
||||
UVERBS_METHOD_MW_DESTROY,
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_DESTROY_MW_HANDLE,
|
||||
@ -267,11 +187,6 @@ DECLARE_UVERBS_NAMED_OBJECT(UVERBS_OBJECT_MW,
|
||||
UVERBS_TYPE_ALLOC_IDR(uverbs_free_mw),
|
||||
&UVERBS_METHOD(UVERBS_METHOD_MW_DESTROY));
|
||||
|
||||
DECLARE_UVERBS_NAMED_OBJECT(
|
||||
UVERBS_OBJECT_SRQ,
|
||||
UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_usrq_object),
|
||||
uverbs_free_srq));
|
||||
|
||||
DECLARE_UVERBS_NAMED_METHOD_DESTROY(
|
||||
UVERBS_METHOD_AH_DESTROY,
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_DESTROY_AH_HANDLE,
|
||||
@ -296,10 +211,6 @@ DECLARE_UVERBS_NAMED_OBJECT(
|
||||
uverbs_free_flow),
|
||||
&UVERBS_METHOD(UVERBS_METHOD_FLOW_DESTROY));
|
||||
|
||||
DECLARE_UVERBS_NAMED_OBJECT(
|
||||
UVERBS_OBJECT_WQ,
|
||||
UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uwq_object), uverbs_free_wq));
|
||||
|
||||
DECLARE_UVERBS_NAMED_METHOD_DESTROY(
|
||||
UVERBS_METHOD_RWQ_IND_TBL_DESTROY,
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_DESTROY_RWQ_IND_TBL_HANDLE,
|
||||
@ -340,18 +251,12 @@ const struct uapi_definition uverbs_def_obj_intf[] = {
|
||||
UAPI_DEF_OBJ_NEEDS_FN(dealloc_pd)),
|
||||
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(UVERBS_OBJECT_COMP_CHANNEL,
|
||||
UAPI_DEF_OBJ_NEEDS_FN(dealloc_pd)),
|
||||
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(UVERBS_OBJECT_QP,
|
||||
UAPI_DEF_OBJ_NEEDS_FN(destroy_qp)),
|
||||
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(UVERBS_OBJECT_AH,
|
||||
UAPI_DEF_OBJ_NEEDS_FN(destroy_ah)),
|
||||
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(UVERBS_OBJECT_MW,
|
||||
UAPI_DEF_OBJ_NEEDS_FN(dealloc_mw)),
|
||||
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(UVERBS_OBJECT_SRQ,
|
||||
UAPI_DEF_OBJ_NEEDS_FN(destroy_srq)),
|
||||
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(UVERBS_OBJECT_FLOW,
|
||||
UAPI_DEF_OBJ_NEEDS_FN(destroy_flow)),
|
||||
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(UVERBS_OBJECT_WQ,
|
||||
UAPI_DEF_OBJ_NEEDS_FN(destroy_wq)),
|
||||
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(
|
||||
UVERBS_OBJECT_RWQ_IND_TBL,
|
||||
UAPI_DEF_OBJ_NEEDS_FN(destroy_rwq_ind_table)),
|
||||
|
@ -100,6 +100,9 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
|
||||
uverbs_uobject_get(ev_file_uobj);
|
||||
}
|
||||
|
||||
obj->uevent.event_file = ib_uverbs_get_async_event(
|
||||
attrs, UVERBS_ATTR_CREATE_CQ_EVENT_FD);
|
||||
|
||||
if (attr.comp_vector >= attrs->ufile->device->num_comp_vectors) {
|
||||
ret = -EINVAL;
|
||||
goto err_event_file;
|
||||
@ -129,19 +132,17 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
|
||||
obj->uevent.uobject.object = cq;
|
||||
obj->uevent.uobject.user_handle = user_handle;
|
||||
rdma_restrack_uadd(&cq->res);
|
||||
uverbs_finalize_uobj_create(attrs, UVERBS_ATTR_CREATE_CQ_HANDLE);
|
||||
|
||||
ret = uverbs_copy_to(attrs, UVERBS_ATTR_CREATE_CQ_RESP_CQE, &cq->cqe,
|
||||
sizeof(cq->cqe));
|
||||
if (ret)
|
||||
goto err_cq;
|
||||
return ret;
|
||||
|
||||
return 0;
|
||||
err_cq:
|
||||
ib_destroy_cq_user(cq, uverbs_get_cleared_udata(attrs));
|
||||
cq = NULL;
|
||||
err_free:
|
||||
kfree(cq);
|
||||
err_event_file:
|
||||
if (obj->uevent.event_file)
|
||||
uverbs_uobject_put(&obj->uevent.event_file->uobj);
|
||||
if (ev_file)
|
||||
uverbs_uobject_put(ev_file_uobj);
|
||||
return ret;
|
||||
@ -171,6 +172,10 @@ DECLARE_UVERBS_NAMED_METHOD(
|
||||
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_CREATE_CQ_RESP_CQE,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_FD(UVERBS_ATTR_CREATE_CQ_EVENT_FD,
|
||||
UVERBS_OBJECT_ASYNC_EVENT,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_UHW());
|
||||
|
||||
static int UVERBS_HANDLER(UVERBS_METHOD_CQ_DESTROY)(
|
||||
|
@ -136,21 +136,15 @@ static int UVERBS_HANDLER(UVERBS_METHOD_DM_MR_REG)(
|
||||
|
||||
uobj->object = mr;
|
||||
|
||||
uverbs_finalize_uobj_create(attrs, UVERBS_ATTR_REG_DM_MR_HANDLE);
|
||||
|
||||
ret = uverbs_copy_to(attrs, UVERBS_ATTR_REG_DM_MR_RESP_LKEY, &mr->lkey,
|
||||
sizeof(mr->lkey));
|
||||
if (ret)
|
||||
goto err_dereg;
|
||||
return ret;
|
||||
|
||||
ret = uverbs_copy_to(attrs, UVERBS_ATTR_REG_DM_MR_RESP_RKEY,
|
||||
&mr->rkey, sizeof(mr->rkey));
|
||||
if (ret)
|
||||
goto err_dereg;
|
||||
|
||||
return 0;
|
||||
|
||||
err_dereg:
|
||||
ib_dereg_mr_user(mr, uverbs_get_cleared_udata(attrs));
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
401
drivers/infiniband/core/uverbs_std_types_qp.c
Normal file
@ -0,0 +1,401 @@
|
||||
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
|
||||
/*
|
||||
* Copyright (c) 2020, Mellanox Technologies inc. All rights reserved.
|
||||
*/
|
||||
|
||||
#include <rdma/uverbs_std_types.h>
|
||||
#include "rdma_core.h"
|
||||
#include "uverbs.h"
|
||||
#include "core_priv.h"
|
||||
|
||||
static int uverbs_free_qp(struct ib_uobject *uobject,
|
||||
enum rdma_remove_reason why,
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
{
|
||||
struct ib_qp *qp = uobject->object;
|
||||
struct ib_uqp_object *uqp =
|
||||
container_of(uobject, struct ib_uqp_object, uevent.uobject);
|
||||
int ret;
|
||||
|
||||
/*
|
||||
* If this is a user triggered destroy then do not allow destruction
|
||||
* until the user cleans up all the mcast bindings. Unlike in other
|
||||
* places we forcibly clean up the mcast attachments for !DESTROY
|
||||
* because the mcast attaches are not uobjects and will not be
|
||||
* destroyed by anything else during cleanup processing.
|
||||
*/
|
||||
if (why == RDMA_REMOVE_DESTROY) {
|
||||
if (!list_empty(&uqp->mcast_list))
|
||||
return -EBUSY;
|
||||
} else if (qp == qp->real_qp) {
|
||||
ib_uverbs_detach_umcast(qp, uqp);
|
||||
}
|
||||
|
||||
ret = ib_destroy_qp_user(qp, &attrs->driver_udata);
|
||||
if (ib_is_destroy_retryable(ret, why, uobject))
|
||||
return ret;
|
||||
|
||||
if (uqp->uxrcd)
|
||||
atomic_dec(&uqp->uxrcd->refcnt);
|
||||
|
||||
ib_uverbs_release_uevent(&uqp->uevent);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int check_creation_flags(enum ib_qp_type qp_type,
|
||||
u32 create_flags)
|
||||
{
|
||||
create_flags &= ~IB_UVERBS_QP_CREATE_SQ_SIG_ALL;
|
||||
|
||||
if (!create_flags || qp_type == IB_QPT_DRIVER)
|
||||
return 0;
|
||||
|
||||
if (qp_type != IB_QPT_RAW_PACKET && qp_type != IB_QPT_UD)
|
||||
return -EINVAL;
|
||||
|
||||
if ((create_flags & IB_UVERBS_QP_CREATE_SCATTER_FCS ||
|
||||
create_flags & IB_UVERBS_QP_CREATE_CVLAN_STRIPPING) &&
|
||||
qp_type != IB_QPT_RAW_PACKET)
|
||||
return -EINVAL;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void set_caps(struct ib_qp_init_attr *attr,
|
||||
struct ib_uverbs_qp_cap *cap, bool req)
|
||||
{
|
||||
if (req) {
|
||||
attr->cap.max_send_wr = cap->max_send_wr;
|
||||
attr->cap.max_recv_wr = cap->max_recv_wr;
|
||||
attr->cap.max_send_sge = cap->max_send_sge;
|
||||
attr->cap.max_recv_sge = cap->max_recv_sge;
|
||||
attr->cap.max_inline_data = cap->max_inline_data;
|
||||
} else {
|
||||
cap->max_send_wr = attr->cap.max_send_wr;
|
||||
cap->max_recv_wr = attr->cap.max_recv_wr;
|
||||
cap->max_send_sge = attr->cap.max_send_sge;
|
||||
cap->max_recv_sge = attr->cap.max_recv_sge;
|
||||
cap->max_inline_data = attr->cap.max_inline_data;
|
||||
}
|
||||
}
|
||||
|
||||
static int UVERBS_HANDLER(UVERBS_METHOD_QP_CREATE)(
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
{
|
||||
struct ib_uqp_object *obj = container_of(
|
||||
uverbs_attr_get_uobject(attrs, UVERBS_ATTR_CREATE_QP_HANDLE),
|
||||
typeof(*obj), uevent.uobject);
|
||||
struct ib_qp_init_attr attr = {};
|
||||
struct ib_uverbs_qp_cap cap = {};
|
||||
struct ib_rwq_ind_table *rwq_ind_tbl = NULL;
|
||||
struct ib_qp *qp;
|
||||
struct ib_pd *pd = NULL;
|
||||
struct ib_srq *srq = NULL;
|
||||
struct ib_cq *recv_cq = NULL;
|
||||
struct ib_cq *send_cq = NULL;
|
||||
struct ib_xrcd *xrcd = NULL;
|
||||
struct ib_uobject *xrcd_uobj = NULL;
|
||||
struct ib_device *device;
|
||||
u64 user_handle;
|
||||
int ret;
|
||||
|
||||
ret = uverbs_copy_from_or_zero(&cap, attrs,
|
||||
UVERBS_ATTR_CREATE_QP_CAP);
|
||||
if (!ret)
|
||||
ret = uverbs_copy_from(&user_handle, attrs,
|
||||
UVERBS_ATTR_CREATE_QP_USER_HANDLE);
|
||||
if (!ret)
|
||||
ret = uverbs_get_const(&attr.qp_type, attrs,
|
||||
UVERBS_ATTR_CREATE_QP_TYPE);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
switch (attr.qp_type) {
|
||||
case IB_QPT_XRC_TGT:
|
||||
if (uverbs_attr_is_valid(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_RECV_CQ_HANDLE) ||
|
||||
uverbs_attr_is_valid(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_SEND_CQ_HANDLE) ||
|
||||
uverbs_attr_is_valid(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_PD_HANDLE) ||
|
||||
uverbs_attr_is_valid(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_IND_TABLE_HANDLE))
|
||||
return -EINVAL;
|
||||
|
||||
xrcd_uobj = uverbs_attr_get_uobject(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_XRCD_HANDLE);
|
||||
if (IS_ERR(xrcd_uobj))
|
||||
return PTR_ERR(xrcd_uobj);
|
||||
|
||||
xrcd = (struct ib_xrcd *)xrcd_uobj->object;
|
||||
if (!xrcd)
|
||||
return -EINVAL;
|
||||
device = xrcd->device;
|
||||
break;
|
||||
case IB_UVERBS_QPT_RAW_PACKET:
|
||||
if (!capable(CAP_NET_RAW))
|
||||
return -EPERM;
|
||||
fallthrough;
|
||||
case IB_UVERBS_QPT_RC:
|
||||
case IB_UVERBS_QPT_UC:
|
||||
case IB_UVERBS_QPT_UD:
|
||||
case IB_UVERBS_QPT_XRC_INI:
|
||||
case IB_UVERBS_QPT_DRIVER:
|
||||
if (uverbs_attr_is_valid(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_XRCD_HANDLE) ||
|
||||
(uverbs_attr_is_valid(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_SRQ_HANDLE) &&
|
||||
attr.qp_type == IB_QPT_XRC_INI))
|
||||
return -EINVAL;
|
||||
|
||||
pd = uverbs_attr_get_obj(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_PD_HANDLE);
|
||||
if (IS_ERR(pd))
|
||||
return PTR_ERR(pd);
|
||||
|
||||
rwq_ind_tbl = uverbs_attr_get_obj(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_IND_TABLE_HANDLE);
|
||||
if (!IS_ERR(rwq_ind_tbl)) {
|
||||
if (cap.max_recv_wr || cap.max_recv_sge ||
|
||||
uverbs_attr_is_valid(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_RECV_CQ_HANDLE) ||
|
||||
uverbs_attr_is_valid(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_SRQ_HANDLE))
|
||||
return -EINVAL;
|
||||
|
||||
/* send_cq is optional */
|
||||
if (cap.max_send_wr) {
|
||||
send_cq = uverbs_attr_get_obj(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_SEND_CQ_HANDLE);
|
||||
if (IS_ERR(send_cq))
|
||||
return PTR_ERR(send_cq);
|
||||
}
|
||||
attr.rwq_ind_tbl = rwq_ind_tbl;
|
||||
} else {
|
||||
send_cq = uverbs_attr_get_obj(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_SEND_CQ_HANDLE);
|
||||
if (IS_ERR(send_cq))
|
||||
return PTR_ERR(send_cq);
|
||||
|
||||
if (attr.qp_type != IB_QPT_XRC_INI) {
|
||||
recv_cq = uverbs_attr_get_obj(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_RECV_CQ_HANDLE);
|
||||
if (IS_ERR(recv_cq))
|
||||
return PTR_ERR(recv_cq);
|
||||
}
|
||||
}
|
||||
|
||||
device = pd->device;
|
||||
break;
|
||||
default:
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
ret = uverbs_get_flags32(&attr.create_flags, attrs,
|
||||
UVERBS_ATTR_CREATE_QP_FLAGS,
|
||||
IB_UVERBS_QP_CREATE_BLOCK_MULTICAST_LOOPBACK |
|
||||
IB_UVERBS_QP_CREATE_SCATTER_FCS |
|
||||
IB_UVERBS_QP_CREATE_CVLAN_STRIPPING |
|
||||
IB_UVERBS_QP_CREATE_PCI_WRITE_END_PADDING |
|
||||
IB_UVERBS_QP_CREATE_SQ_SIG_ALL);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
ret = check_creation_flags(attr.qp_type, attr.create_flags);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
if (uverbs_attr_is_valid(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_SOURCE_QPN)) {
|
||||
ret = uverbs_copy_from(&attr.source_qpn, attrs,
|
||||
UVERBS_ATTR_CREATE_QP_SOURCE_QPN);
|
||||
if (ret)
|
||||
return ret;
|
||||
attr.create_flags |= IB_QP_CREATE_SOURCE_QPN;
|
||||
}
|
||||
|
||||
srq = uverbs_attr_get_obj(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_SRQ_HANDLE);
|
||||
if (!IS_ERR(srq)) {
|
||||
if ((srq->srq_type == IB_SRQT_XRC &&
|
||||
attr.qp_type != IB_QPT_XRC_TGT) ||
|
||||
(srq->srq_type != IB_SRQT_XRC &&
|
||||
attr.qp_type == IB_QPT_XRC_TGT))
|
||||
return -EINVAL;
|
||||
attr.srq = srq;
|
||||
}
|
||||
|
||||
obj->uevent.event_file = ib_uverbs_get_async_event(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_EVENT_FD);
|
||||
INIT_LIST_HEAD(&obj->uevent.event_list);
|
||||
INIT_LIST_HEAD(&obj->mcast_list);
|
||||
obj->uevent.uobject.user_handle = user_handle;
|
||||
attr.event_handler = ib_uverbs_qp_event_handler;
|
||||
attr.send_cq = send_cq;
|
||||
attr.recv_cq = recv_cq;
|
||||
attr.xrcd = xrcd;
|
||||
if (attr.create_flags & IB_UVERBS_QP_CREATE_SQ_SIG_ALL) {
|
||||
/* This creation bit is a uverbs-only flag and must be masked before
|
||||
* calling the drivers. It was added to avoid requiring an extra user attr
|
||||
* just for this flag when using ioctl.
|
||||
*/
|
||||
attr.create_flags &= ~IB_UVERBS_QP_CREATE_SQ_SIG_ALL;
|
||||
attr.sq_sig_type = IB_SIGNAL_ALL_WR;
|
||||
} else {
|
||||
attr.sq_sig_type = IB_SIGNAL_REQ_WR;
|
||||
}
|
||||
|
||||
set_caps(&attr, &cap, true);
|
||||
mutex_init(&obj->mcast_lock);
|
||||
|
||||
if (attr.qp_type == IB_QPT_XRC_TGT)
|
||||
qp = ib_create_qp(pd, &attr);
|
||||
else
|
||||
qp = _ib_create_qp(device, pd, &attr, &attrs->driver_udata,
|
||||
obj);
|
||||
|
||||
if (IS_ERR(qp)) {
|
||||
ret = PTR_ERR(qp);
|
||||
goto err_put;
|
||||
}
|
||||
|
||||
if (attr.qp_type != IB_QPT_XRC_TGT) {
|
||||
atomic_inc(&pd->usecnt);
|
||||
if (attr.send_cq)
|
||||
atomic_inc(&attr.send_cq->usecnt);
|
||||
if (attr.recv_cq)
|
||||
atomic_inc(&attr.recv_cq->usecnt);
|
||||
if (attr.srq)
|
||||
atomic_inc(&attr.srq->usecnt);
|
||||
if (attr.rwq_ind_tbl)
|
||||
atomic_inc(&attr.rwq_ind_tbl->usecnt);
|
||||
} else {
|
||||
obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object,
|
||||
uobject);
|
||||
atomic_inc(&obj->uxrcd->refcnt);
|
||||
/* It is done in _ib_create_qp for other QP types */
|
||||
qp->uobject = obj;
|
||||
}
|
||||
|
||||
obj->uevent.uobject.object = qp;
|
||||
uverbs_finalize_uobj_create(attrs, UVERBS_ATTR_CREATE_QP_HANDLE);
|
||||
|
||||
if (attr.qp_type != IB_QPT_XRC_TGT) {
|
||||
ret = ib_create_qp_security(qp, device);
|
||||
if (ret)
|
||||
return ret;
|
||||
}
|
||||
|
||||
set_caps(&attr, &cap, false);
|
||||
ret = uverbs_copy_to_struct_or_zero(attrs,
|
||||
UVERBS_ATTR_CREATE_QP_RESP_CAP, &cap,
|
||||
sizeof(cap));
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
ret = uverbs_copy_to(attrs, UVERBS_ATTR_CREATE_QP_RESP_QP_NUM,
|
||||
&qp->qp_num,
|
||||
sizeof(qp->qp_num));
|
||||
|
||||
return ret;
|
||||
err_put:
|
||||
if (obj->uevent.event_file)
|
||||
uverbs_uobject_put(&obj->uevent.event_file->uobj);
|
||||
return ret;
|
||||
};
|
||||
|
||||
DECLARE_UVERBS_NAMED_METHOD(
|
||||
UVERBS_METHOD_QP_CREATE,
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_QP_HANDLE,
|
||||
UVERBS_OBJECT_QP,
|
||||
UVERBS_ACCESS_NEW,
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_QP_XRCD_HANDLE,
|
||||
UVERBS_OBJECT_XRCD,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_QP_PD_HANDLE,
|
||||
UVERBS_OBJECT_PD,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_QP_SRQ_HANDLE,
|
||||
UVERBS_OBJECT_SRQ,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_QP_SEND_CQ_HANDLE,
|
||||
UVERBS_OBJECT_CQ,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_QP_RECV_CQ_HANDLE,
|
||||
UVERBS_OBJECT_CQ,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_QP_IND_TABLE_HANDLE,
|
||||
UVERBS_OBJECT_RWQ_IND_TBL,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_QP_USER_HANDLE,
|
||||
UVERBS_ATTR_TYPE(u64),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_QP_CAP,
|
||||
UVERBS_ATTR_STRUCT(struct ib_uverbs_qp_cap,
|
||||
max_inline_data),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_CONST_IN(UVERBS_ATTR_CREATE_QP_TYPE,
|
||||
enum ib_uverbs_qp_type,
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_FLAGS_IN(UVERBS_ATTR_CREATE_QP_FLAGS,
|
||||
enum ib_uverbs_qp_create_flags,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_QP_SOURCE_QPN,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_FD(UVERBS_ATTR_CREATE_QP_EVENT_FD,
|
||||
UVERBS_OBJECT_ASYNC_EVENT,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_CREATE_QP_RESP_CAP,
|
||||
UVERBS_ATTR_STRUCT(struct ib_uverbs_qp_cap,
|
||||
max_inline_data),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_CREATE_QP_RESP_QP_NUM,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_UHW());
|
||||
|
||||
static int UVERBS_HANDLER(UVERBS_METHOD_QP_DESTROY)(
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
{
|
||||
struct ib_uobject *uobj =
|
||||
uverbs_attr_get_uobject(attrs, UVERBS_ATTR_DESTROY_QP_HANDLE);
|
||||
struct ib_uqp_object *obj =
|
||||
container_of(uobj, struct ib_uqp_object, uevent.uobject);
|
||||
struct ib_uverbs_destroy_qp_resp resp = {
|
||||
.events_reported = obj->uevent.events_reported
|
||||
};
|
||||
|
||||
return uverbs_copy_to(attrs, UVERBS_ATTR_DESTROY_QP_RESP, &resp,
|
||||
sizeof(resp));
|
||||
}
|
||||
|
||||
DECLARE_UVERBS_NAMED_METHOD(
|
||||
UVERBS_METHOD_QP_DESTROY,
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_DESTROY_QP_HANDLE,
|
||||
UVERBS_OBJECT_QP,
|
||||
UVERBS_ACCESS_DESTROY,
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_DESTROY_QP_RESP,
|
||||
UVERBS_ATTR_TYPE(struct ib_uverbs_destroy_qp_resp),
|
||||
UA_MANDATORY));
|
||||
|
||||
DECLARE_UVERBS_NAMED_OBJECT(
|
||||
UVERBS_OBJECT_QP,
|
||||
UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uqp_object), uverbs_free_qp),
|
||||
&UVERBS_METHOD(UVERBS_METHOD_QP_CREATE),
|
||||
&UVERBS_METHOD(UVERBS_METHOD_QP_DESTROY));
|
||||
|
||||
const struct uapi_definition uverbs_def_obj_qp[] = {
|
||||
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(UVERBS_OBJECT_QP,
|
||||
UAPI_DEF_OBJ_NEEDS_FN(destroy_qp)),
|
||||
{}
|
||||
};
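For reference, the method above is reached from user space through the usual verbs entry point; a short, hedged libibverbs sketch follows (whether a given rdma-core build carries this over the new ioctl method or the legacy write() path is outside this diff):

#include <infiniband/verbs.h>

/* Hedged userspace sketch: ordinary libibverbs QP creation whose cap,
 * CQ and type attributes correspond to the UVERBS_ATTR_CREATE_QP_*
 * attributes declared above. */
static struct ibv_qp *create_rc_qp(struct ibv_pd *pd, struct ibv_cq *cq)
{
	struct ibv_qp_init_attr attr = {
		.send_cq = cq,
		.recv_cq = cq,
		.qp_type = IBV_QPT_RC,
		.cap = {
			.max_send_wr  = 16,
			.max_recv_wr  = 16,
			.max_send_sge = 1,
			.max_recv_sge = 1,
		},
	};

	/* On success the cap fields are updated to the values actually
	 * allocated, which is what UVERBS_ATTR_CREATE_QP_RESP_CAP
	 * reports back from the kernel handler. */
	return ibv_create_qp(pd, &attr);
}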
|
234
drivers/infiniband/core/uverbs_std_types_srq.c
Normal file
@ -0,0 +1,234 @@
|
||||
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
|
||||
/*
|
||||
* Copyright (c) 2020, Mellanox Technologies inc. All rights reserved.
|
||||
*/
|
||||
|
||||
#include <rdma/uverbs_std_types.h>
|
||||
#include "rdma_core.h"
|
||||
#include "uverbs.h"
|
||||
|
||||
static int uverbs_free_srq(struct ib_uobject *uobject,
|
||||
enum rdma_remove_reason why,
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
{
|
||||
struct ib_srq *srq = uobject->object;
|
||||
struct ib_uevent_object *uevent =
|
||||
container_of(uobject, struct ib_uevent_object, uobject);
|
||||
enum ib_srq_type srq_type = srq->srq_type;
|
||||
int ret;
|
||||
|
||||
ret = ib_destroy_srq_user(srq, &attrs->driver_udata);
|
||||
if (ib_is_destroy_retryable(ret, why, uobject))
|
||||
return ret;
|
||||
|
||||
if (srq_type == IB_SRQT_XRC) {
|
||||
struct ib_usrq_object *us =
|
||||
container_of(uobject, struct ib_usrq_object,
|
||||
uevent.uobject);
|
||||
|
||||
atomic_dec(&us->uxrcd->refcnt);
|
||||
}
|
||||
|
||||
ib_uverbs_release_uevent(uevent);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int UVERBS_HANDLER(UVERBS_METHOD_SRQ_CREATE)(
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
{
|
||||
struct ib_usrq_object *obj = container_of(
|
||||
uverbs_attr_get_uobject(attrs, UVERBS_ATTR_CREATE_SRQ_HANDLE),
|
||||
typeof(*obj), uevent.uobject);
|
||||
struct ib_pd *pd =
|
||||
uverbs_attr_get_obj(attrs, UVERBS_ATTR_CREATE_SRQ_PD_HANDLE);
|
||||
struct ib_srq_init_attr attr = {};
|
||||
struct ib_uobject *xrcd_uobj;
|
||||
struct ib_srq *srq;
|
||||
u64 user_handle;
|
||||
int ret;
|
||||
|
||||
ret = uverbs_copy_from(&attr.attr.max_sge, attrs,
|
||||
UVERBS_ATTR_CREATE_SRQ_MAX_SGE);
|
||||
if (!ret)
|
||||
ret = uverbs_copy_from(&attr.attr.max_wr, attrs,
|
||||
UVERBS_ATTR_CREATE_SRQ_MAX_WR);
|
||||
if (!ret)
|
||||
ret = uverbs_copy_from(&attr.attr.srq_limit, attrs,
|
||||
UVERBS_ATTR_CREATE_SRQ_LIMIT);
|
||||
if (!ret)
|
||||
ret = uverbs_copy_from(&user_handle, attrs,
|
||||
UVERBS_ATTR_CREATE_SRQ_USER_HANDLE);
|
||||
if (!ret)
|
||||
ret = uverbs_get_const(&attr.srq_type, attrs,
|
||||
UVERBS_ATTR_CREATE_SRQ_TYPE);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
if (ib_srq_has_cq(attr.srq_type)) {
|
||||
attr.ext.cq = uverbs_attr_get_obj(attrs,
|
||||
UVERBS_ATTR_CREATE_SRQ_CQ_HANDLE);
|
||||
if (IS_ERR(attr.ext.cq))
|
||||
return PTR_ERR(attr.ext.cq);
|
||||
}
|
||||
|
||||
switch (attr.srq_type) {
|
||||
case IB_UVERBS_SRQT_XRC:
|
||||
xrcd_uobj = uverbs_attr_get_uobject(attrs,
|
||||
UVERBS_ATTR_CREATE_SRQ_XRCD_HANDLE);
|
||||
if (IS_ERR(xrcd_uobj))
|
||||
return PTR_ERR(xrcd_uobj);
|
||||
|
||||
attr.ext.xrc.xrcd = (struct ib_xrcd *)xrcd_uobj->object;
|
||||
if (!attr.ext.xrc.xrcd)
|
||||
return -EINVAL;
|
||||
obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object,
|
||||
uobject);
|
||||
atomic_inc(&obj->uxrcd->refcnt);
|
||||
break;
|
||||
case IB_UVERBS_SRQT_TM:
|
||||
ret = uverbs_copy_from(&attr.ext.tag_matching.max_num_tags,
|
||||
attrs,
|
||||
UVERBS_ATTR_CREATE_SRQ_MAX_NUM_TAGS);
|
||||
if (ret)
|
||||
return ret;
|
||||
break;
|
||||
case IB_UVERBS_SRQT_BASIC:
|
||||
break;
|
||||
default:
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
obj->uevent.event_file = ib_uverbs_get_async_event(attrs,
|
||||
UVERBS_ATTR_CREATE_SRQ_EVENT_FD);
|
||||
INIT_LIST_HEAD(&obj->uevent.event_list);
|
||||
attr.event_handler = ib_uverbs_srq_event_handler;
|
||||
obj->uevent.uobject.user_handle = user_handle;
|
||||
|
||||
srq = ib_create_srq_user(pd, &attr, obj, &attrs->driver_udata);
|
||||
if (IS_ERR(srq)) {
|
||||
ret = PTR_ERR(srq);
|
||||
goto err;
|
||||
}
|
||||
|
||||
obj->uevent.uobject.object = srq;
|
||||
uverbs_finalize_uobj_create(attrs, UVERBS_ATTR_CREATE_SRQ_HANDLE);
|
||||
|
||||
ret = uverbs_copy_to(attrs, UVERBS_ATTR_CREATE_SRQ_RESP_MAX_WR,
|
||||
&attr.attr.max_wr,
|
||||
sizeof(attr.attr.max_wr));
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
ret = uverbs_copy_to(attrs, UVERBS_ATTR_CREATE_SRQ_RESP_MAX_SGE,
|
||||
&attr.attr.max_sge,
|
||||
sizeof(attr.attr.max_sge));
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
if (attr.srq_type == IB_SRQT_XRC) {
|
||||
ret = uverbs_copy_to(attrs,
|
||||
UVERBS_ATTR_CREATE_SRQ_RESP_SRQ_NUM,
|
||||
&srq->ext.xrc.srq_num,
|
||||
sizeof(srq->ext.xrc.srq_num));
|
||||
if (ret)
|
||||
return ret;
|
||||
}
|
||||
|
||||
return 0;
|
||||
err:
|
||||
if (obj->uevent.event_file)
|
||||
uverbs_uobject_put(&obj->uevent.event_file->uobj);
|
||||
if (attr.srq_type == IB_SRQT_XRC)
|
||||
atomic_dec(&obj->uxrcd->refcnt);
|
||||
return ret;
|
||||
};
|
||||
|
||||
DECLARE_UVERBS_NAMED_METHOD(
|
||||
UVERBS_METHOD_SRQ_CREATE,
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_SRQ_HANDLE,
|
||||
UVERBS_OBJECT_SRQ,
|
||||
UVERBS_ACCESS_NEW,
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_SRQ_PD_HANDLE,
|
||||
UVERBS_OBJECT_PD,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_CONST_IN(UVERBS_ATTR_CREATE_SRQ_TYPE,
|
||||
enum ib_uverbs_srq_type,
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_SRQ_USER_HANDLE,
|
||||
UVERBS_ATTR_TYPE(u64),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_SRQ_MAX_WR,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_SRQ_MAX_SGE,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_SRQ_LIMIT,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_SRQ_XRCD_HANDLE,
|
||||
UVERBS_OBJECT_XRCD,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_SRQ_CQ_HANDLE,
|
||||
UVERBS_OBJECT_CQ,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_SRQ_MAX_NUM_TAGS,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_FD(UVERBS_ATTR_CREATE_SRQ_EVENT_FD,
|
||||
UVERBS_OBJECT_ASYNC_EVENT,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_CREATE_SRQ_RESP_MAX_WR,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_CREATE_SRQ_RESP_MAX_SGE,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_CREATE_SRQ_RESP_SRQ_NUM,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_UHW());
|
||||
|
||||
static int UVERBS_HANDLER(UVERBS_METHOD_SRQ_DESTROY)(
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
{
|
||||
struct ib_uobject *uobj =
|
||||
uverbs_attr_get_uobject(attrs, UVERBS_ATTR_DESTROY_SRQ_HANDLE);
|
||||
struct ib_usrq_object *obj =
|
||||
container_of(uobj, struct ib_usrq_object, uevent.uobject);
|
||||
struct ib_uverbs_destroy_srq_resp resp = {
|
||||
.events_reported = obj->uevent.events_reported
|
||||
};
|
||||
|
||||
return uverbs_copy_to(attrs, UVERBS_ATTR_DESTROY_SRQ_RESP, &resp,
|
||||
sizeof(resp));
|
||||
}
|
||||
|
||||
DECLARE_UVERBS_NAMED_METHOD(
|
||||
UVERBS_METHOD_SRQ_DESTROY,
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_DESTROY_SRQ_HANDLE,
|
||||
UVERBS_OBJECT_SRQ,
|
||||
UVERBS_ACCESS_DESTROY,
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_DESTROY_SRQ_RESP,
|
||||
UVERBS_ATTR_TYPE(struct ib_uverbs_destroy_srq_resp),
|
||||
UA_MANDATORY));
|
||||
|
||||
DECLARE_UVERBS_NAMED_OBJECT(
|
||||
UVERBS_OBJECT_SRQ,
|
||||
UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_usrq_object),
|
||||
uverbs_free_srq),
|
||||
&UVERBS_METHOD(UVERBS_METHOD_SRQ_CREATE),
|
||||
&UVERBS_METHOD(UVERBS_METHOD_SRQ_DESTROY)
|
||||
);
|
||||
|
||||
const struct uapi_definition uverbs_def_obj_srq[] = {
|
||||
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(UVERBS_OBJECT_SRQ,
|
||||
UAPI_DEF_OBJ_NEEDS_FN(destroy_srq)),
|
||||
{}
|
||||
};
|
194
drivers/infiniband/core/uverbs_std_types_wq.c
Normal file
@ -0,0 +1,194 @@
|
||||
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
|
||||
/*
|
||||
* Copyright (c) 2020, Mellanox Technologies inc. All rights reserved.
|
||||
*/
|
||||
|
||||
#include <rdma/uverbs_std_types.h>
|
||||
#include "rdma_core.h"
|
||||
#include "uverbs.h"
|
||||
|
||||
static int uverbs_free_wq(struct ib_uobject *uobject,
|
||||
enum rdma_remove_reason why,
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
{
|
||||
struct ib_wq *wq = uobject->object;
|
||||
struct ib_uwq_object *uwq =
|
||||
container_of(uobject, struct ib_uwq_object, uevent.uobject);
|
||||
int ret;
|
||||
|
||||
ret = ib_destroy_wq(wq, &attrs->driver_udata);
|
||||
if (ib_is_destroy_retryable(ret, why, uobject))
|
||||
return ret;
|
||||
|
||||
ib_uverbs_release_uevent(&uwq->uevent);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int UVERBS_HANDLER(UVERBS_METHOD_WQ_CREATE)(
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
{
|
||||
struct ib_uwq_object *obj = container_of(
|
||||
uverbs_attr_get_uobject(attrs, UVERBS_ATTR_CREATE_WQ_HANDLE),
|
||||
typeof(*obj), uevent.uobject);
|
||||
struct ib_pd *pd =
|
||||
uverbs_attr_get_obj(attrs, UVERBS_ATTR_CREATE_WQ_PD_HANDLE);
|
||||
struct ib_cq *cq =
|
||||
uverbs_attr_get_obj(attrs, UVERBS_ATTR_CREATE_WQ_CQ_HANDLE);
|
||||
struct ib_wq_init_attr wq_init_attr = {};
|
||||
struct ib_wq *wq;
|
||||
u64 user_handle;
|
||||
int ret;
|
||||
|
||||
ret = uverbs_get_flags32(&wq_init_attr.create_flags, attrs,
|
||||
UVERBS_ATTR_CREATE_WQ_FLAGS,
|
||||
IB_UVERBS_WQ_FLAGS_CVLAN_STRIPPING |
|
||||
IB_UVERBS_WQ_FLAGS_SCATTER_FCS |
|
||||
IB_UVERBS_WQ_FLAGS_DELAY_DROP |
|
||||
IB_UVERBS_WQ_FLAGS_PCI_WRITE_END_PADDING);
|
||||
if (!ret)
|
||||
ret = uverbs_copy_from(&wq_init_attr.max_sge, attrs,
|
||||
UVERBS_ATTR_CREATE_WQ_MAX_SGE);
|
||||
if (!ret)
|
||||
ret = uverbs_copy_from(&wq_init_attr.max_wr, attrs,
|
||||
UVERBS_ATTR_CREATE_WQ_MAX_WR);
|
||||
if (!ret)
|
||||
ret = uverbs_copy_from(&user_handle, attrs,
|
||||
UVERBS_ATTR_CREATE_WQ_USER_HANDLE);
|
||||
if (!ret)
|
||||
ret = uverbs_get_const(&wq_init_attr.wq_type, attrs,
|
||||
UVERBS_ATTR_CREATE_WQ_TYPE);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
if (wq_init_attr.wq_type != IB_WQT_RQ)
|
||||
return -EINVAL;
|
||||
|
||||
obj->uevent.event_file = ib_uverbs_get_async_event(attrs,
|
||||
UVERBS_ATTR_CREATE_WQ_EVENT_FD);
|
||||
obj->uevent.uobject.user_handle = user_handle;
|
||||
INIT_LIST_HEAD(&obj->uevent.event_list);
|
||||
wq_init_attr.event_handler = ib_uverbs_wq_event_handler;
|
||||
wq_init_attr.wq_context = attrs->ufile;
|
||||
wq_init_attr.cq = cq;
|
||||
|
||||
wq = pd->device->ops.create_wq(pd, &wq_init_attr, &attrs->driver_udata);
|
||||
if (IS_ERR(wq)) {
|
||||
ret = PTR_ERR(wq);
|
||||
goto err;
|
||||
}
|
||||
|
||||
obj->uevent.uobject.object = wq;
|
||||
wq->wq_type = wq_init_attr.wq_type;
|
||||
wq->cq = cq;
|
||||
wq->pd = pd;
|
||||
wq->device = pd->device;
|
||||
wq->wq_context = wq_init_attr.wq_context;
|
||||
atomic_set(&wq->usecnt, 0);
|
||||
atomic_inc(&pd->usecnt);
|
||||
atomic_inc(&cq->usecnt);
|
||||
wq->uobject = obj;
|
||||
uverbs_finalize_uobj_create(attrs, UVERBS_ATTR_CREATE_WQ_HANDLE);
|
||||
|
||||
ret = uverbs_copy_to(attrs, UVERBS_ATTR_CREATE_WQ_RESP_MAX_WR,
|
||||
&wq_init_attr.max_wr,
|
||||
sizeof(wq_init_attr.max_wr));
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
ret = uverbs_copy_to(attrs, UVERBS_ATTR_CREATE_WQ_RESP_MAX_SGE,
|
||||
&wq_init_attr.max_sge,
|
||||
sizeof(wq_init_attr.max_sge));
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
ret = uverbs_copy_to(attrs, UVERBS_ATTR_CREATE_WQ_RESP_WQ_NUM,
|
||||
&wq->wq_num,
|
||||
sizeof(wq->wq_num));
|
||||
return ret;
|
||||
|
||||
err:
|
||||
if (obj->uevent.event_file)
|
||||
uverbs_uobject_put(&obj->uevent.event_file->uobj);
|
||||
return ret;
|
||||
};
|
||||
|
||||
DECLARE_UVERBS_NAMED_METHOD(
|
||||
UVERBS_METHOD_WQ_CREATE,
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_WQ_HANDLE,
|
||||
UVERBS_OBJECT_WQ,
|
||||
UVERBS_ACCESS_NEW,
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_WQ_PD_HANDLE,
|
||||
UVERBS_OBJECT_PD,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_CONST_IN(UVERBS_ATTR_CREATE_WQ_TYPE,
|
||||
enum ib_wq_type,
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_WQ_USER_HANDLE,
|
||||
UVERBS_ATTR_TYPE(u64),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_WQ_MAX_WR,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_WQ_MAX_SGE,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_FLAGS_IN(UVERBS_ATTR_CREATE_WQ_FLAGS,
|
||||
enum ib_uverbs_wq_flags,
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_CREATE_WQ_CQ_HANDLE,
|
||||
UVERBS_OBJECT_CQ,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_FD(UVERBS_ATTR_CREATE_WQ_EVENT_FD,
|
||||
UVERBS_OBJECT_ASYNC_EVENT,
|
||||
UVERBS_ACCESS_READ,
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_CREATE_WQ_RESP_MAX_WR,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_CREATE_WQ_RESP_MAX_SGE,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_CREATE_WQ_RESP_WQ_NUM,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_OPTIONAL),
|
||||
UVERBS_ATTR_UHW());
|
||||
|
||||
static int UVERBS_HANDLER(UVERBS_METHOD_WQ_DESTROY)(
|
||||
struct uverbs_attr_bundle *attrs)
|
||||
{
|
||||
struct ib_uobject *uobj =
|
||||
uverbs_attr_get_uobject(attrs, UVERBS_ATTR_DESTROY_WQ_HANDLE);
|
||||
struct ib_uwq_object *obj =
|
||||
container_of(uobj, struct ib_uwq_object, uevent.uobject);
|
||||
|
||||
return uverbs_copy_to(attrs, UVERBS_ATTR_DESTROY_WQ_RESP,
|
||||
&obj->uevent.events_reported,
|
||||
sizeof(obj->uevent.events_reported));
|
||||
}
|
||||
|
||||
DECLARE_UVERBS_NAMED_METHOD(
|
||||
UVERBS_METHOD_WQ_DESTROY,
|
||||
UVERBS_ATTR_IDR(UVERBS_ATTR_DESTROY_WQ_HANDLE,
|
||||
UVERBS_OBJECT_WQ,
|
||||
UVERBS_ACCESS_DESTROY,
|
||||
UA_MANDATORY),
|
||||
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_DESTROY_WQ_RESP,
|
||||
UVERBS_ATTR_TYPE(u32),
|
||||
UA_MANDATORY));
|
||||
|
||||
|
||||
DECLARE_UVERBS_NAMED_OBJECT(
|
||||
UVERBS_OBJECT_WQ,
|
||||
UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uwq_object), uverbs_free_wq),
|
||||
&UVERBS_METHOD(UVERBS_METHOD_WQ_CREATE),
|
||||
&UVERBS_METHOD(UVERBS_METHOD_WQ_DESTROY)
|
||||
);
|
||||
|
||||
const struct uapi_definition uverbs_def_obj_wq[] = {
|
||||
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(UVERBS_OBJECT_WQ,
|
||||
UAPI_DEF_OBJ_NEEDS_FN(destroy_wq)),
|
||||
{}
|
||||
};
|
@ -634,6 +634,9 @@ static const struct uapi_definition uverbs_core_api[] = {
|
||||
UAPI_DEF_CHAIN(uverbs_def_obj_flow_action),
|
||||
UAPI_DEF_CHAIN(uverbs_def_obj_intf),
|
||||
UAPI_DEF_CHAIN(uverbs_def_obj_mr),
|
||||
UAPI_DEF_CHAIN(uverbs_def_obj_qp),
|
||||
UAPI_DEF_CHAIN(uverbs_def_obj_srq),
|
||||
UAPI_DEF_CHAIN(uverbs_def_obj_wq),
|
||||
UAPI_DEF_CHAIN(uverbs_def_write_intf),
|
||||
{},
|
||||
};
|
||||
|
@ -50,6 +50,7 @@
|
||||
#include <rdma/ib_cache.h>
|
||||
#include <rdma/ib_addr.h>
|
||||
#include <rdma/rw.h>
|
||||
#include <rdma/lag.h>
|
||||
|
||||
#include "core_priv.h"
|
||||
#include <trace/events/rdma_core.h>
|
||||
@ -500,8 +501,10 @@ rdma_update_sgid_attr(struct rdma_ah_attr *ah_attr,
|
||||
static struct ib_ah *_rdma_create_ah(struct ib_pd *pd,
|
||||
struct rdma_ah_attr *ah_attr,
|
||||
u32 flags,
|
||||
struct ib_udata *udata)
|
||||
struct ib_udata *udata,
|
||||
struct net_device *xmit_slave)
|
||||
{
|
||||
struct rdma_ah_init_attr init_attr = {};
|
||||
struct ib_device *device = pd->device;
|
||||
struct ib_ah *ah;
|
||||
int ret;
|
||||
@ -521,8 +524,11 @@ static struct ib_ah *_rdma_create_ah(struct ib_pd *pd,
|
||||
ah->pd = pd;
|
||||
ah->type = ah_attr->type;
|
||||
ah->sgid_attr = rdma_update_sgid_attr(ah_attr, NULL);
|
||||
init_attr.ah_attr = ah_attr;
|
||||
init_attr.flags = flags;
|
||||
init_attr.xmit_slave = xmit_slave;
|
||||
|
||||
ret = device->ops.create_ah(ah, ah_attr, flags, udata);
|
||||
ret = device->ops.create_ah(ah, &init_attr, udata);
|
||||
if (ret) {
|
||||
kfree(ah);
|
||||
return ERR_PTR(ret);
|
||||
@ -547,15 +553,22 @@ struct ib_ah *rdma_create_ah(struct ib_pd *pd, struct rdma_ah_attr *ah_attr,
|
||||
u32 flags)
|
||||
{
|
||||
const struct ib_gid_attr *old_sgid_attr;
|
||||
struct net_device *slave;
|
||||
struct ib_ah *ah;
|
||||
int ret;
|
||||
|
||||
ret = rdma_fill_sgid_attr(pd->device, ah_attr, &old_sgid_attr);
|
||||
if (ret)
|
||||
return ERR_PTR(ret);
|
||||
|
||||
ah = _rdma_create_ah(pd, ah_attr, flags, NULL);
|
||||
|
||||
slave = rdma_lag_get_ah_roce_slave(pd->device, ah_attr,
|
||||
(flags & RDMA_CREATE_AH_SLEEPABLE) ?
|
||||
GFP_KERNEL : GFP_ATOMIC);
|
||||
if (IS_ERR(slave)) {
|
||||
rdma_unfill_sgid_attr(ah_attr, old_sgid_attr);
|
||||
return (void *)slave;
|
||||
}
|
||||
ah = _rdma_create_ah(pd, ah_attr, flags, NULL, slave);
|
||||
rdma_lag_put_ah_roce_slave(slave);
|
||||
rdma_unfill_sgid_attr(ah_attr, old_sgid_attr);
|
||||
return ah;
|
||||
}
|
||||
@ -594,7 +607,8 @@ struct ib_ah *rdma_create_user_ah(struct ib_pd *pd,
|
||||
}
|
||||
}
|
||||
|
||||
ah = _rdma_create_ah(pd, ah_attr, RDMA_CREATE_AH_SLEEPABLE, udata);
|
||||
ah = _rdma_create_ah(pd, ah_attr, RDMA_CREATE_AH_SLEEPABLE,
|
||||
udata, NULL);
|
||||
|
||||
out:
|
||||
rdma_unfill_sgid_attr(ah_attr, old_sgid_attr);
|
||||
@ -967,15 +981,29 @@ EXPORT_SYMBOL(rdma_destroy_ah_user);
|
||||
|
||||
/* Shared receive queues */
|
||||
|
||||
struct ib_srq *ib_create_srq(struct ib_pd *pd,
|
||||
struct ib_srq_init_attr *srq_init_attr)
|
||||
/**
|
||||
* ib_create_srq_user - Creates an SRQ associated with the specified protection
|
||||
* domain.
|
||||
* @pd: The protection domain associated with the SRQ.
|
||||
* @srq_init_attr: A list of initial attributes required to create the
|
||||
* SRQ. If SRQ creation succeeds, then the attributes are updated to
|
||||
* the actual capabilities of the created SRQ.
|
||||
* @uobject: uobject pointer if this is not a kernel SRQ
|
||||
* @udata: udata pointer if this is not a kernel SRQ
|
||||
*
|
||||
* srq_attr->max_wr and srq_attr->max_sge are read to determine the
|
||||
* requested size of the SRQ, and set to the actual values allocated
|
||||
* on return. If ib_create_srq() succeeds, then max_wr and max_sge
|
||||
* will always be at least as large as the requested values.
|
||||
*/
|
||||
struct ib_srq *ib_create_srq_user(struct ib_pd *pd,
|
||||
struct ib_srq_init_attr *srq_init_attr,
|
||||
struct ib_usrq_object *uobject,
|
||||
struct ib_udata *udata)
|
||||
{
|
||||
struct ib_srq *srq;
|
||||
int ret;
|
||||
|
||||
if (!pd->device->ops.create_srq)
|
||||
return ERR_PTR(-EOPNOTSUPP);
|
||||
|
||||
srq = rdma_zalloc_drv_obj(pd->device, ib_srq);
|
||||
if (!srq)
|
||||
return ERR_PTR(-ENOMEM);
|
||||
@ -985,6 +1013,7 @@ struct ib_srq *ib_create_srq(struct ib_pd *pd,
|
||||
srq->event_handler = srq_init_attr->event_handler;
|
||||
srq->srq_context = srq_init_attr->srq_context;
|
||||
srq->srq_type = srq_init_attr->srq_type;
|
||||
srq->uobject = uobject;
|
||||
|
||||
if (ib_srq_has_cq(srq->srq_type)) {
|
||||
srq->ext.cq = srq_init_attr->ext.cq;
|
||||
@ -996,7 +1025,7 @@ struct ib_srq *ib_create_srq(struct ib_pd *pd,
|
||||
}
|
||||
atomic_inc(&pd->usecnt);
|
||||
|
||||
ret = pd->device->ops.create_srq(srq, srq_init_attr, NULL);
|
||||
ret = pd->device->ops.create_srq(srq, srq_init_attr, udata);
|
||||
if (ret) {
|
||||
atomic_dec(&srq->pd->usecnt);
|
||||
if (srq->srq_type == IB_SRQT_XRC)
|
||||
@ -1009,7 +1038,7 @@ struct ib_srq *ib_create_srq(struct ib_pd *pd,
|
||||
|
||||
return srq;
|
||||
}
|
||||
EXPORT_SYMBOL(ib_create_srq);
|
||||
EXPORT_SYMBOL(ib_create_srq_user);
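The kernel-doc above describes max_wr and max_sge as in/out values: read to size the SRQ, then overwritten with what was actually allocated. The same round-trip is visible from user space; a minimal, hedged libibverbs sketch:

#include <stdio.h>
#include <infiniband/verbs.h>

/* Hedged userspace sketch of the SRQ attribute round-trip described in
 * the ib_create_srq_user() kernel-doc above. */
static struct ibv_srq *create_srq_example(struct ibv_pd *pd)
{
	struct ibv_srq_init_attr init = {
		.attr = {
			.max_wr    = 128,	/* requested */
			.max_sge   = 1,
			.srq_limit = 0,
		},
	};
	struct ibv_srq *srq = ibv_create_srq(pd, &init);

	if (srq)
		/* On return the attr fields hold the values actually
		 * allocated, at least as large as what was asked for. */
		printf("srq max_wr=%u max_sge=%u\n",
		       init.attr.max_wr, init.attr.max_sge);
	return srq;
}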
|
||||
|
||||
int ib_modify_srq(struct ib_srq *srq,
|
||||
struct ib_srq_attr *srq_attr,
|
||||
@ -1633,11 +1662,35 @@ static int _ib_modify_qp(struct ib_qp *qp, struct ib_qp_attr *attr,
|
||||
const struct ib_gid_attr *old_sgid_attr_alt_av;
|
||||
int ret;
|
||||
|
||||
attr->xmit_slave = NULL;
|
||||
if (attr_mask & IB_QP_AV) {
|
||||
ret = rdma_fill_sgid_attr(qp->device, &attr->ah_attr,
|
||||
&old_sgid_attr_av);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
if (attr->ah_attr.type == RDMA_AH_ATTR_TYPE_ROCE &&
|
||||
is_qp_type_connected(qp)) {
|
||||
struct net_device *slave;
|
||||
|
||||
/*
|
||||
* If the user provided the qp_attr then we have to
|
||||
* resolve it. Kernel users have to provide already
|
||||
* resolved rdma_ah_attr's.
|
||||
*/
|
||||
if (udata) {
|
||||
ret = ib_resolve_eth_dmac(qp->device,
|
||||
&attr->ah_attr);
|
||||
if (ret)
|
||||
goto out_av;
|
||||
}
|
||||
slave = rdma_lag_get_ah_roce_slave(qp->device,
|
||||
&attr->ah_attr,
|
||||
GFP_KERNEL);
|
||||
if (IS_ERR(slave))
|
||||
goto out_av;
|
||||
attr->xmit_slave = slave;
|
||||
}
|
||||
}
|
||||
if (attr_mask & IB_QP_ALT_PATH) {
|
||||
/*
|
||||
@ -1664,18 +1717,6 @@ static int _ib_modify_qp(struct ib_qp *qp, struct ib_qp_attr *attr,
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* If the user provided the qp_attr then we have to resolve it. Kernel
|
||||
* users have to provide already resolved rdma_ah_attr's
|
||||
*/
|
||||
if (udata && (attr_mask & IB_QP_AV) &&
|
||||
attr->ah_attr.type == RDMA_AH_ATTR_TYPE_ROCE &&
|
||||
is_qp_type_connected(qp)) {
|
||||
ret = ib_resolve_eth_dmac(qp->device, &attr->ah_attr);
|
||||
if (ret)
|
||||
goto out;
|
||||
}
|
||||
|
||||
if (rdma_ib_or_roce(qp->device, port)) {
|
||||
if (attr_mask & IB_QP_RQ_PSN && attr->rq_psn & ~0xffffff) {
|
||||
dev_warn(&qp->device->dev,
|
||||
@ -1717,8 +1758,10 @@ out:
|
||||
if (attr_mask & IB_QP_ALT_PATH)
|
||||
rdma_unfill_sgid_attr(&attr->alt_ah_attr, old_sgid_attr_alt_av);
|
||||
out_av:
|
||||
if (attr_mask & IB_QP_AV)
|
||||
if (attr_mask & IB_QP_AV) {
|
||||
rdma_lag_put_ah_roce_slave(attr->xmit_slave);
|
||||
rdma_unfill_sgid_attr(&attr->ah_attr, old_sgid_attr_av);
|
||||
}
|
||||
return ret;
|
||||
}
|
||||
|
||||
@ -1962,6 +2005,9 @@ EXPORT_SYMBOL(__ib_create_cq);
|
||||
|
||||
int rdma_set_cq_moderation(struct ib_cq *cq, u16 cq_count, u16 cq_period)
|
||||
{
|
||||
if (cq->shared)
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
return cq->device->ops.modify_cq ?
|
||||
cq->device->ops.modify_cq(cq, cq_count,
|
||||
cq_period) : -EOPNOTSUPP;
|
||||
@ -1970,6 +2016,9 @@ EXPORT_SYMBOL(rdma_set_cq_moderation);
|
||||
|
||||
int ib_destroy_cq_user(struct ib_cq *cq, struct ib_udata *udata)
|
||||
{
|
||||
if (WARN_ON_ONCE(cq->shared))
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
if (atomic_read(&cq->usecnt))
|
||||
return -EBUSY;
|
||||
|
||||
@ -1982,6 +2031,9 @@ EXPORT_SYMBOL(ib_destroy_cq_user);
|
||||
|
||||
int ib_resize_cq(struct ib_cq *cq, int cqe)
|
||||
{
|
||||
if (cq->shared)
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
return cq->device->ops.resize_cq ?
|
||||
cq->device->ops.resize_cq(cq, cqe, NULL) : -EOPNOTSUPP;
|
||||
}
|
||||
@ -2160,54 +2212,6 @@ out:
|
||||
}
|
||||
EXPORT_SYMBOL(ib_alloc_mr_integrity);
|
||||
|
||||
/* "Fast" memory regions */
|
||||
|
||||
struct ib_fmr *ib_alloc_fmr(struct ib_pd *pd,
|
||||
int mr_access_flags,
|
||||
struct ib_fmr_attr *fmr_attr)
|
||||
{
|
||||
struct ib_fmr *fmr;
|
||||
|
||||
if (!pd->device->ops.alloc_fmr)
|
||||
return ERR_PTR(-EOPNOTSUPP);
|
||||
|
||||
fmr = pd->device->ops.alloc_fmr(pd, mr_access_flags, fmr_attr);
|
||||
if (!IS_ERR(fmr)) {
|
||||
fmr->device = pd->device;
|
||||
fmr->pd = pd;
|
||||
atomic_inc(&pd->usecnt);
|
||||
}
|
||||
|
||||
return fmr;
|
||||
}
|
||||
EXPORT_SYMBOL(ib_alloc_fmr);
|
||||
|
||||
int ib_unmap_fmr(struct list_head *fmr_list)
|
||||
{
|
||||
struct ib_fmr *fmr;
|
||||
|
||||
if (list_empty(fmr_list))
|
||||
return 0;
|
||||
|
||||
fmr = list_entry(fmr_list->next, struct ib_fmr, list);
|
||||
return fmr->device->ops.unmap_fmr(fmr_list);
|
||||
}
|
||||
EXPORT_SYMBOL(ib_unmap_fmr);
|
||||
|
||||
int ib_dealloc_fmr(struct ib_fmr *fmr)
|
||||
{
|
||||
struct ib_pd *pd;
|
||||
int ret;
|
||||
|
||||
pd = fmr->pd;
|
||||
ret = fmr->device->ops.dealloc_fmr(fmr);
|
||||
if (!ret)
|
||||
atomic_dec(&pd->usecnt);
|
||||
|
||||
return ret;
|
||||
}
|
||||
EXPORT_SYMBOL(ib_dealloc_fmr);
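With the FMR helpers removed above, in-kernel users register memory through fast registration work requests instead. A hedged sketch of that replacement pattern (the QP, PD and DMA-mapped scatterlist are assumed to exist; completion handling is omitted):

/* Hedged sketch of the FRWR pattern that replaces the removed FMR API.
 * All surrounding setup (qp, pd, mapped scatterlist) is assumed. */
static int reg_mr_frwr(struct ib_qp *qp, struct ib_pd *pd,
		       struct scatterlist *sg, int sg_nents)
{
	struct ib_reg_wr reg_wr = {};
	const struct ib_send_wr *bad_wr;
	struct ib_mr *mr;
	int n;

	mr = ib_alloc_mr(pd, IB_MR_TYPE_MEM_REG, sg_nents);
	if (IS_ERR(mr))
		return PTR_ERR(mr);

	n = ib_map_mr_sg(mr, sg, sg_nents, NULL, PAGE_SIZE);
	if (n < sg_nents) {
		ib_dereg_mr(mr);
		return n < 0 ? n : -EINVAL;
	}

	reg_wr.wr.opcode = IB_WR_REG_MR;
	reg_wr.mr = mr;
	reg_wr.key = mr->rkey;
	reg_wr.access = IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_WRITE;

	/* Registration completes on the send queue; a real user would
	 * wait for the corresponding completion before using the rkey. */
	return ib_post_send(qp, &reg_wr.wr, &bad_wr);
}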
|
||||
|
||||
/* Multicast groups */
|
||||
|
||||
static bool is_valid_mcast_lid(struct ib_qp *qp, u16 lid)
|
||||
@ -2574,6 +2578,7 @@ EXPORT_SYMBOL(ib_map_mr_sg_pi);
|
||||
* @page_size: page vector desired page size
|
||||
*
|
||||
* Constraints:
|
||||
*
|
||||
* - The first sg element is allowed to have an offset.
|
||||
* - Each sg element must either be aligned to page_size or virtually
|
||||
* contiguous to the previous element. In case an sg element has a
|
||||
@ -2607,10 +2612,12 @@ EXPORT_SYMBOL(ib_map_mr_sg);
|
||||
* @mr: memory region
|
||||
* @sgl: dma mapped scatterlist
|
||||
* @sg_nents: number of entries in sg
|
||||
* @sg_offset_p: IN: start offset in bytes into sg
|
||||
* OUT: offset in bytes for element n of the sg of the first
|
||||
* @sg_offset_p: ==== =======================================================
|
||||
* IN start offset in bytes into sg
|
||||
* OUT offset in bytes for element n of the sg of the first
|
||||
* byte that has not been processed where n is the return
|
||||
* value of this function.
|
||||
* ==== =======================================================
|
||||
* @set_page: driver page assignment function pointer
|
||||
*
|
||||
* Core service helper for drivers to convert the largest
|
||||
|
@@ -177,9 +177,6 @@ int bnxt_re_query_device(struct ib_device *ibdev,
ib_attr->max_total_mcast_qp_attach = 0;
ib_attr->max_ah = dev_attr->max_ah;

ib_attr->max_fmr = 0;
ib_attr->max_map_per_fmr = 0;

ib_attr->max_srq = dev_attr->max_srq;
ib_attr->max_srq_wr = dev_attr->max_srq_wqes;
ib_attr->max_srq_sge = dev_attr->max_srq_sges;
@@ -631,11 +628,12 @@ static u8 bnxt_re_stack_to_dev_nw_type(enum rdma_network_type ntype)
return nw_type;
}

int bnxt_re_create_ah(struct ib_ah *ib_ah, struct rdma_ah_attr *ah_attr,
u32 flags, struct ib_udata *udata)
int bnxt_re_create_ah(struct ib_ah *ib_ah, struct rdma_ah_init_attr *init_attr,
struct ib_udata *udata)
{
struct ib_pd *ib_pd = ib_ah->pd;
struct bnxt_re_pd *pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd);
struct rdma_ah_attr *ah_attr = init_attr->ah_attr;
const struct ib_global_route *grh = rdma_ah_read_grh(ah_attr);
struct bnxt_re_dev *rdev = pd->rdev;
const struct ib_gid_attr *sgid_attr;
@@ -673,7 +671,8 @@ int bnxt_re_create_ah(struct ib_ah *ib_ah, struct rdma_ah_attr *ah_attr,

memcpy(ah->qplib_ah.dmac, ah_attr->roce.dmac, ETH_ALEN);
rc = bnxt_qplib_create_ah(&rdev->qplib_res, &ah->qplib_ah,
!(flags & RDMA_CREATE_AH_SLEEPABLE));
!(init_attr->flags &
RDMA_CREATE_AH_SLEEPABLE));
if (rc) {
ibdev_err(&rdev->ibdev, "Failed to allocate HW AH");
return rc;
@@ -856,7 +855,7 @@ static int bnxt_re_init_user_qp(struct bnxt_re_dev *rdev, struct bnxt_re_pd *pd,
if (ib_copy_from_udata(&ureq, udata, sizeof(ureq)))
return -EFAULT;

bytes = (qplib_qp->sq.max_wqe * BNXT_QPLIB_MAX_SQE_ENTRY_SIZE);
bytes = (qplib_qp->sq.max_wqe * qplib_qp->sq.wqe_size);
/* Consider mapping PSN search memory only for RC QPs. */
if (qplib_qp->type == CMDQ_CREATE_QP_TYPE_RC) {
psn_sz = bnxt_qplib_is_chip_gen_p5(rdev->chip_ctx) ?
@@ -879,7 +878,7 @@ static int bnxt_re_init_user_qp(struct bnxt_re_dev *rdev, struct bnxt_re_pd *pd,
qplib_qp->qp_handle = ureq.qp_handle;

if (!qp->qplib_qp.srq) {
bytes = (qplib_qp->rq.max_wqe * BNXT_QPLIB_MAX_RQE_ENTRY_SIZE);
bytes = (qplib_qp->rq.max_wqe * qplib_qp->rq.wqe_size);
bytes = PAGE_ALIGN(bytes);
umem = ib_umem_get(&rdev->ibdev, ureq.qprva, bytes,
IB_ACCESS_LOCAL_WRITE);
@@ -976,6 +975,7 @@ static struct bnxt_re_qp *bnxt_re_create_shadow_qp
qp->qplib_qp.sig_type = true;

/* Shadow QP SQ depth should be same as QP1 RQ depth */
qp->qplib_qp.sq.wqe_size = bnxt_re_get_swqe_size();
qp->qplib_qp.sq.max_wqe = qp1_qp->rq.max_wqe;
qp->qplib_qp.sq.max_sge = 2;
/* Q full delta can be 1 since it is internal QP */
@@ -986,6 +986,7 @@ static struct bnxt_re_qp *bnxt_re_create_shadow_qp
qp->qplib_qp.scq = qp1_qp->scq;
qp->qplib_qp.rcq = qp1_qp->rcq;

qp->qplib_qp.rq.wqe_size = bnxt_re_get_rwqe_size();
qp->qplib_qp.rq.max_wqe = qp1_qp->rq.max_wqe;
qp->qplib_qp.rq.max_sge = qp1_qp->rq.max_sge;
/* Q full delta can be 1 since it is internal QP */
@@ -1021,10 +1022,12 @@ static int bnxt_re_init_rq_attr(struct bnxt_re_qp *qp,
struct bnxt_qplib_dev_attr *dev_attr;
struct bnxt_qplib_qp *qplqp;
struct bnxt_re_dev *rdev;
struct bnxt_qplib_q *rq;
int entries;

rdev = qp->rdev;
qplqp = &qp->qplib_qp;
rq = &qplqp->rq;
dev_attr = &rdev->dev_attr;

if (init_attr->srq) {
@@ -1036,23 +1039,21 @@ static int bnxt_re_init_rq_attr(struct bnxt_re_qp *qp,
return -EINVAL;
}
qplqp->srq = &srq->qplib_srq;
qplqp->rq.max_wqe = 0;
rq->max_wqe = 0;
} else {
rq->wqe_size = bnxt_re_get_rwqe_size();
/* Allocate 1 more than what's provided so posting max doesn't
* mean empty.
*/
entries = roundup_pow_of_two(init_attr->cap.max_recv_wr + 1);
qplqp->rq.max_wqe = min_t(u32, entries,
dev_attr->max_qp_wqes + 1);

qplqp->rq.q_full_delta = qplqp->rq.max_wqe -
init_attr->cap.max_recv_wr;
qplqp->rq.max_sge = init_attr->cap.max_recv_sge;
if (qplqp->rq.max_sge > dev_attr->max_qp_sges)
qplqp->rq.max_sge = dev_attr->max_qp_sges;
rq->max_wqe = min_t(u32, entries, dev_attr->max_qp_wqes + 1);
rq->q_full_delta = rq->max_wqe - init_attr->cap.max_recv_wr;
rq->max_sge = init_attr->cap.max_recv_sge;
if (rq->max_sge > dev_attr->max_qp_sges)
rq->max_sge = dev_attr->max_qp_sges;
}
qplqp->rq.sg_info.pgsize = PAGE_SIZE;
qplqp->rq.sg_info.pgshft = PAGE_SHIFT;
rq->sg_info.pgsize = PAGE_SIZE;
rq->sg_info.pgshft = PAGE_SHIFT;

return 0;
}
@@ -1080,15 +1081,18 @@ static void bnxt_re_init_sq_attr(struct bnxt_re_qp *qp,
struct bnxt_qplib_dev_attr *dev_attr;
struct bnxt_qplib_qp *qplqp;
struct bnxt_re_dev *rdev;
struct bnxt_qplib_q *sq;
int entries;

rdev = qp->rdev;
qplqp = &qp->qplib_qp;
sq = &qplqp->sq;
dev_attr = &rdev->dev_attr;

qplqp->sq.max_sge = init_attr->cap.max_send_sge;
if (qplqp->sq.max_sge > dev_attr->max_qp_sges)
qplqp->sq.max_sge = dev_attr->max_qp_sges;
sq->wqe_size = bnxt_re_get_swqe_size();
sq->max_sge = init_attr->cap.max_send_sge;
if (sq->max_sge > dev_attr->max_qp_sges)
sq->max_sge = dev_attr->max_qp_sges;
/*
* Change the SQ depth if user has requested minimum using
* configfs. Only supported for kernel consumers
@@ -1096,9 +1100,9 @@ static void bnxt_re_init_sq_attr(struct bnxt_re_qp *qp,
entries = init_attr->cap.max_send_wr;
/* Allocate 128 + 1 more than what's provided */
entries = roundup_pow_of_two(entries + BNXT_QPLIB_RESERVED_QP_WRS + 1);
qplqp->sq.max_wqe = min_t(u32, entries, dev_attr->max_qp_wqes +
BNXT_QPLIB_RESERVED_QP_WRS + 1);
qplqp->sq.q_full_delta = BNXT_QPLIB_RESERVED_QP_WRS + 1;
sq->max_wqe = min_t(u32, entries, dev_attr->max_qp_wqes +
BNXT_QPLIB_RESERVED_QP_WRS + 1);
sq->q_full_delta = BNXT_QPLIB_RESERVED_QP_WRS + 1;
/*
* Reserving one slot for Phantom WQE. Application can
* post one extra entry in this case. But allowing this to avoid
@@ -1511,7 +1515,7 @@ static int bnxt_re_init_user_srq(struct bnxt_re_dev *rdev,
if (ib_copy_from_udata(&ureq, udata, sizeof(ureq)))
return -EFAULT;

bytes = (qplib_srq->max_wqe * BNXT_QPLIB_MAX_RQE_ENTRY_SIZE);
bytes = (qplib_srq->max_wqe * qplib_srq->wqe_size);
bytes = PAGE_ALIGN(bytes);
umem = ib_umem_get(&rdev->ibdev, ureq.srqva, bytes,
IB_ACCESS_LOCAL_WRITE);
@@ -1534,15 +1538,20 @@ int bnxt_re_create_srq(struct ib_srq *ib_srq,
struct ib_srq_init_attr *srq_init_attr,
struct ib_udata *udata)
{
struct ib_pd *ib_pd = ib_srq->pd;
struct bnxt_re_pd *pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd);
struct bnxt_re_dev *rdev = pd->rdev;
struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr;
struct bnxt_re_srq *srq =
container_of(ib_srq, struct bnxt_re_srq, ib_srq);
struct bnxt_qplib_dev_attr *dev_attr;
struct bnxt_qplib_nq *nq = NULL;
struct bnxt_re_dev *rdev;
struct bnxt_re_srq *srq;
struct bnxt_re_pd *pd;
struct ib_pd *ib_pd;
int rc, entries;

ib_pd = ib_srq->pd;
pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd);
rdev = pd->rdev;
dev_attr = &rdev->dev_attr;
srq = container_of(ib_srq, struct bnxt_re_srq, ib_srq);

if (srq_init_attr->attr.max_wr >= dev_attr->max_srq_wqes) {
ibdev_err(&rdev->ibdev, "Create CQ failed - max exceeded");
rc = -EINVAL;
@@ -1563,8 +1572,9 @@ int bnxt_re_create_srq(struct ib_srq *ib_srq,
entries = roundup_pow_of_two(srq_init_attr->attr.max_wr + 1);
if (entries > dev_attr->max_srq_wqes + 1)
entries = dev_attr->max_srq_wqes + 1;

srq->qplib_srq.max_wqe = entries;

srq->qplib_srq.wqe_size = bnxt_re_get_rwqe_size();
srq->qplib_srq.max_sge = srq_init_attr->attr.max_sge;
srq->qplib_srq.threshold = srq_init_attr->attr.srq_limit;
srq->srq_limit = srq_init_attr->attr.srq_limit;

@@ -122,12 +122,6 @@ struct bnxt_re_frpl {
u64 *page_list;
};

struct bnxt_re_fmr {
struct bnxt_re_dev *rdev;
struct ib_fmr ib_fmr;
struct bnxt_qplib_mrw qplib_fmr;
};

struct bnxt_re_mw {
struct bnxt_re_dev *rdev;
struct ib_mw ib_mw;
@@ -142,6 +136,16 @@ struct bnxt_re_ucontext {
spinlock_t sh_lock; /* protect shpg */
};

static inline u16 bnxt_re_get_swqe_size(void)
{
return sizeof(struct sq_send);
}

static inline u16 bnxt_re_get_rwqe_size(void)
{
return sizeof(struct rq_wqe);
}

int bnxt_re_query_device(struct ib_device *ibdev,
struct ib_device_attr *ib_attr,
struct ib_udata *udata);
@@ -160,7 +164,7 @@ enum rdma_link_layer bnxt_re_get_link_layer(struct ib_device *ibdev,
u8 port_num);
int bnxt_re_alloc_pd(struct ib_pd *pd, struct ib_udata *udata);
void bnxt_re_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
int bnxt_re_create_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr, u32 flags,
int bnxt_re_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr,
struct ib_udata *udata);
int bnxt_re_modify_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);
int bnxt_re_query_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);

@@ -300,12 +300,12 @@ static void bnxt_qplib_service_nq(unsigned long data)
{
struct bnxt_qplib_nq *nq = (struct bnxt_qplib_nq *)data;
struct bnxt_qplib_hwq *hwq = &nq->hwq;
struct nq_base *nqe, **nq_ptr;
struct bnxt_qplib_cq *cq;
int num_cqne_processed = 0;
int num_srqne_processed = 0;
int num_cqne_processed = 0;
struct bnxt_qplib_cq *cq;
int budget = nq->budget;
u32 sw_cons, raw_cons;
struct nq_base *nqe;
uintptr_t q_handle;
u16 type;

@@ -314,8 +314,7 @@ static void bnxt_qplib_service_nq(unsigned long data)
raw_cons = hwq->cons;
while (budget--) {
sw_cons = HWQ_CMP(raw_cons, hwq);
nq_ptr = (struct nq_base **)hwq->pbl_ptr;
nqe = &nq_ptr[NQE_PG(sw_cons)][NQE_IDX(sw_cons)];
nqe = bnxt_qplib_get_qe(hwq, sw_cons, NULL);
if (!NQE_CMP_VALID(nqe, raw_cons, hwq->max_elements))
break;

@@ -392,13 +391,11 @@ static irqreturn_t bnxt_qplib_nq_irq(int irq, void *dev_instance)
{
struct bnxt_qplib_nq *nq = dev_instance;
struct bnxt_qplib_hwq *hwq = &nq->hwq;
struct nq_base **nq_ptr;
u32 sw_cons;

/* Prefetch the NQ element */
sw_cons = HWQ_CMP(hwq->cons, hwq);
nq_ptr = (struct nq_base **)nq->hwq.pbl_ptr;
prefetch(&nq_ptr[NQE_PG(sw_cons)][NQE_IDX(sw_cons)]);
prefetch(bnxt_qplib_get_qe(hwq, sw_cons, NULL));

/* Fan out to CPU affinitized kthreads? */
tasklet_schedule(&nq->nq_tasklet);
@@ -612,12 +609,13 @@ int bnxt_qplib_create_srq(struct bnxt_qplib_res *res,
struct cmdq_create_srq req;
struct bnxt_qplib_pbl *pbl;
u16 cmd_flags = 0;
u16 pg_sz_lvl;
int rc, idx;

hwq_attr.res = res;
hwq_attr.sginfo = &srq->sg_info;
hwq_attr.depth = srq->max_wqe;
hwq_attr.stride = BNXT_QPLIB_MAX_RQE_ENTRY_SIZE;
hwq_attr.stride = srq->wqe_size;
hwq_attr.type = HWQ_TYPE_QUEUE;
rc = bnxt_qplib_alloc_init_hwq(&srq->hwq, &hwq_attr);
if (rc)
@@ -638,22 +636,11 @@ int bnxt_qplib_create_srq(struct bnxt_qplib_res *res,

req.srq_size = cpu_to_le16((u16)srq->hwq.max_elements);
pbl = &srq->hwq.pbl[PBL_LVL_0];
req.pg_size_lvl = cpu_to_le16((((u16)srq->hwq.level &
CMDQ_CREATE_SRQ_LVL_MASK) <<
CMDQ_CREATE_SRQ_LVL_SFT) |
(pbl->pg_size == ROCE_PG_SIZE_4K ?
CMDQ_CREATE_SRQ_PG_SIZE_PG_4K :
pbl->pg_size == ROCE_PG_SIZE_8K ?
CMDQ_CREATE_SRQ_PG_SIZE_PG_8K :
pbl->pg_size == ROCE_PG_SIZE_64K ?
CMDQ_CREATE_SRQ_PG_SIZE_PG_64K :
pbl->pg_size == ROCE_PG_SIZE_2M ?
CMDQ_CREATE_SRQ_PG_SIZE_PG_2M :
pbl->pg_size == ROCE_PG_SIZE_8M ?
CMDQ_CREATE_SRQ_PG_SIZE_PG_8M :
pbl->pg_size == ROCE_PG_SIZE_1G ?
CMDQ_CREATE_SRQ_PG_SIZE_PG_1G :
CMDQ_CREATE_SRQ_PG_SIZE_PG_4K));
pg_sz_lvl = ((u16)bnxt_qplib_base_pg_size(&srq->hwq) <<
CMDQ_CREATE_SRQ_PG_SIZE_SFT);
pg_sz_lvl |= (srq->hwq.level & CMDQ_CREATE_SRQ_LVL_MASK) <<
CMDQ_CREATE_SRQ_LVL_SFT;
req.pg_size_lvl = cpu_to_le16(pg_sz_lvl);
req.pbl = cpu_to_le64(pbl->pg_map_arr[0]);
req.pd_id = cpu_to_le32(srq->pd->id);
req.eventq_id = cpu_to_le16(srq->eventq_hw_ring_id);
@@ -740,7 +727,7 @@ int bnxt_qplib_post_srq_recv(struct bnxt_qplib_srq *srq,
struct bnxt_qplib_swqe *wqe)
{
struct bnxt_qplib_hwq *srq_hwq = &srq->hwq;
struct rq_wqe *srqe, **srqe_ptr;
struct rq_wqe *srqe;
struct sq_sge *hw_sge;
u32 sw_prod, sw_cons, count = 0;
int i, rc = 0, next;
@@ -758,9 +745,8 @@ int bnxt_qplib_post_srq_recv(struct bnxt_qplib_srq *srq,
spin_unlock(&srq_hwq->lock);

sw_prod = HWQ_CMP(srq_hwq->prod, srq_hwq);
srqe_ptr = (struct rq_wqe **)srq_hwq->pbl_ptr;
srqe = &srqe_ptr[RQE_PG(sw_prod)][RQE_IDX(sw_prod)];
memset(srqe, 0, BNXT_QPLIB_MAX_RQE_ENTRY_SIZE);
srqe = bnxt_qplib_get_qe(srq_hwq, sw_prod, NULL);
memset(srqe, 0, srq->wqe_size);
/* Calculate wqe_size16 and data_len */
for (i = 0, hw_sge = (struct sq_sge *)srqe->data;
i < wqe->num_sge; i++, hw_sge++) {
@@ -809,6 +795,7 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
struct bnxt_qplib_pbl *pbl;
u16 cmd_flags = 0;
u32 qp_flags = 0;
u8 pg_sz_lvl;
int rc;

RCFW_CMD_PREP(req, CREATE_QP1, cmd_flags);
@@ -822,7 +809,7 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
hwq_attr.res = res;
hwq_attr.sginfo = &sq->sg_info;
hwq_attr.depth = sq->max_wqe;
hwq_attr.stride = BNXT_QPLIB_MAX_SQE_ENTRY_SIZE;
hwq_attr.stride = sq->wqe_size;
hwq_attr.type = HWQ_TYPE_QUEUE;
rc = bnxt_qplib_alloc_init_hwq(&sq->hwq, &hwq_attr);
if (rc)
@@ -835,33 +822,18 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
}
pbl = &sq->hwq.pbl[PBL_LVL_0];
req.sq_pbl = cpu_to_le64(pbl->pg_map_arr[0]);
req.sq_pg_size_sq_lvl =
((sq->hwq.level & CMDQ_CREATE_QP1_SQ_LVL_MASK)
<< CMDQ_CREATE_QP1_SQ_LVL_SFT) |
(pbl->pg_size == ROCE_PG_SIZE_4K ?
CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_4K :
pbl->pg_size == ROCE_PG_SIZE_8K ?
CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_8K :
pbl->pg_size == ROCE_PG_SIZE_64K ?
CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_64K :
pbl->pg_size == ROCE_PG_SIZE_2M ?
CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_2M :
pbl->pg_size == ROCE_PG_SIZE_8M ?
CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_8M :
pbl->pg_size == ROCE_PG_SIZE_1G ?
CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_1G :
CMDQ_CREATE_QP1_SQ_PG_SIZE_PG_4K);
pg_sz_lvl = (bnxt_qplib_base_pg_size(&sq->hwq) <<
CMDQ_CREATE_QP1_SQ_PG_SIZE_SFT);
pg_sz_lvl |= (sq->hwq.level & CMDQ_CREATE_QP1_SQ_LVL_MASK);
req.sq_pg_size_sq_lvl = pg_sz_lvl;

if (qp->scq)
req.scq_cid = cpu_to_le32(qp->scq->id);

qp_flags |= CMDQ_CREATE_QP1_QP_FLAGS_RESERVED_LKEY_ENABLE;

/* RQ */
if (rq->max_wqe) {
hwq_attr.res = res;
hwq_attr.sginfo = &rq->sg_info;
hwq_attr.stride = BNXT_QPLIB_MAX_RQE_ENTRY_SIZE;
hwq_attr.stride = rq->wqe_size;
hwq_attr.depth = qp->rq.max_wqe;
hwq_attr.type = HWQ_TYPE_QUEUE;
rc = bnxt_qplib_alloc_init_hwq(&rq->hwq, &hwq_attr);
@@ -876,32 +848,20 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
}
pbl = &rq->hwq.pbl[PBL_LVL_0];
req.rq_pbl = cpu_to_le64(pbl->pg_map_arr[0]);
req.rq_pg_size_rq_lvl =
((rq->hwq.level & CMDQ_CREATE_QP1_RQ_LVL_MASK) <<
CMDQ_CREATE_QP1_RQ_LVL_SFT) |
(pbl->pg_size == ROCE_PG_SIZE_4K ?
CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_4K :
pbl->pg_size == ROCE_PG_SIZE_8K ?
CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_8K :
pbl->pg_size == ROCE_PG_SIZE_64K ?
CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_64K :
pbl->pg_size == ROCE_PG_SIZE_2M ?
CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_2M :
pbl->pg_size == ROCE_PG_SIZE_8M ?
CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_8M :
pbl->pg_size == ROCE_PG_SIZE_1G ?
CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_1G :
CMDQ_CREATE_QP1_RQ_PG_SIZE_PG_4K);
pg_sz_lvl = (bnxt_qplib_base_pg_size(&rq->hwq) <<
CMDQ_CREATE_QP1_RQ_PG_SIZE_SFT);
pg_sz_lvl |= (rq->hwq.level & CMDQ_CREATE_QP1_RQ_LVL_MASK);
req.rq_pg_size_rq_lvl = pg_sz_lvl;
if (qp->rcq)
req.rcq_cid = cpu_to_le32(qp->rcq->id);
}

/* Header buffer - allow hdr_buf pass in */
rc = bnxt_qplib_alloc_qp_hdr_buf(res, qp);
if (rc) {
rc = -ENOMEM;
goto fail;
}
qp_flags |= CMDQ_CREATE_QP1_QP_FLAGS_RESERVED_LKEY_ENABLE;
req.qp_flags = cpu_to_le32(qp_flags);
req.sq_size = cpu_to_le32(sq->hwq.max_elements);
req.rq_size = cpu_to_le32(rq->hwq.max_elements);
@@ -948,23 +908,47 @@ exit:
return rc;
}

static void bnxt_qplib_init_psn_ptr(struct bnxt_qplib_qp *qp, int size)
{
struct bnxt_qplib_hwq *hwq;
struct bnxt_qplib_q *sq;
u64 fpsne, psne, psn_pg;
u16 indx_pad = 0, indx;
u16 pg_num, pg_indx;
u64 *page;

sq = &qp->sq;
hwq = &sq->hwq;

fpsne = (u64)bnxt_qplib_get_qe(hwq, hwq->max_elements, &psn_pg);
if (!IS_ALIGNED(fpsne, PAGE_SIZE))
indx_pad = ALIGN(fpsne, PAGE_SIZE) / size;

page = (u64 *)psn_pg;
for (indx = 0; indx < hwq->max_elements; indx++) {
pg_num = (indx + indx_pad) / (PAGE_SIZE / size);
pg_indx = (indx + indx_pad) % (PAGE_SIZE / size);
psne = page[pg_num] + pg_indx * size;
sq->swq[indx].psn_ext = (struct sq_psn_search_ext *)psne;
sq->swq[indx].psn_search = (struct sq_psn_search *)psne;
}
}

int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
{
struct bnxt_qplib_rcfw *rcfw = res->rcfw;
struct bnxt_qplib_hwq_attr hwq_attr = {};
unsigned long int psn_search, poff = 0;
struct bnxt_qplib_sg_info sginfo = {};
struct sq_psn_search **psn_search_ptr;
struct bnxt_qplib_q *sq = &qp->sq;
struct bnxt_qplib_q *rq = &qp->rq;
int i, rc, req_size, psn_sz = 0;
struct sq_send **hw_sq_send_ptr;
struct creq_create_qp_resp resp;
int rc, req_size, psn_sz = 0;
struct bnxt_qplib_hwq *xrrq;
u16 cmd_flags = 0, max_ssge;
struct cmdq_create_qp req;
struct bnxt_qplib_pbl *pbl;
struct cmdq_create_qp req;
u32 qp_flags = 0;
u8 pg_sz_lvl;
u16 max_rsge;

RCFW_CMD_PREP(req, CREATE_QP, cmd_flags);
@@ -983,7 +967,7 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)

hwq_attr.res = res;
hwq_attr.sginfo = &sq->sg_info;
hwq_attr.stride = BNXT_QPLIB_MAX_SQE_ENTRY_SIZE;
hwq_attr.stride = sq->wqe_size;
hwq_attr.depth = sq->max_wqe;
hwq_attr.aux_stride = psn_sz;
hwq_attr.aux_depth = hwq_attr.depth;
@@ -997,64 +981,25 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
rc = -ENOMEM;
goto fail_sq;
}
hw_sq_send_ptr = (struct sq_send **)sq->hwq.pbl_ptr;
if (psn_sz) {
psn_search_ptr = (struct sq_psn_search **)
&hw_sq_send_ptr[get_sqe_pg
(sq->hwq.max_elements)];
psn_search = (unsigned long int)
&hw_sq_send_ptr[get_sqe_pg(sq->hwq.max_elements)]
[get_sqe_idx(sq->hwq.max_elements)];
if (psn_search & ~PAGE_MASK) {
/* If the psn_search does not start on a page boundary,
* then calculate the offset
*/
poff = (psn_search & ~PAGE_MASK) /
BNXT_QPLIB_MAX_PSNE_ENTRY_SIZE;
}
for (i = 0; i < sq->hwq.max_elements; i++) {
sq->swq[i].psn_search =
&psn_search_ptr[get_psne_pg(i + poff)]
[get_psne_idx(i + poff)];
/*psns_ext will be used only for P5 chips. */
sq->swq[i].psn_ext =
(struct sq_psn_search_ext *)
&psn_search_ptr[get_psne_pg(i + poff)]
[get_psne_idx(i + poff)];
}
}

if (psn_sz)
bnxt_qplib_init_psn_ptr(qp, psn_sz);

pbl = &sq->hwq.pbl[PBL_LVL_0];
req.sq_pbl = cpu_to_le64(pbl->pg_map_arr[0]);
req.sq_pg_size_sq_lvl =
((sq->hwq.level & CMDQ_CREATE_QP_SQ_LVL_MASK)
<< CMDQ_CREATE_QP_SQ_LVL_SFT) |
(pbl->pg_size == ROCE_PG_SIZE_4K ?
CMDQ_CREATE_QP_SQ_PG_SIZE_PG_4K :
pbl->pg_size == ROCE_PG_SIZE_8K ?
CMDQ_CREATE_QP_SQ_PG_SIZE_PG_8K :
pbl->pg_size == ROCE_PG_SIZE_64K ?
CMDQ_CREATE_QP_SQ_PG_SIZE_PG_64K :
pbl->pg_size == ROCE_PG_SIZE_2M ?
CMDQ_CREATE_QP_SQ_PG_SIZE_PG_2M :
pbl->pg_size == ROCE_PG_SIZE_8M ?
CMDQ_CREATE_QP_SQ_PG_SIZE_PG_8M :
pbl->pg_size == ROCE_PG_SIZE_1G ?
CMDQ_CREATE_QP_SQ_PG_SIZE_PG_1G :
CMDQ_CREATE_QP_SQ_PG_SIZE_PG_4K);
pg_sz_lvl = (bnxt_qplib_base_pg_size(&sq->hwq) <<
CMDQ_CREATE_QP_SQ_PG_SIZE_SFT);
pg_sz_lvl |= (sq->hwq.level & CMDQ_CREATE_QP_SQ_LVL_MASK);
req.sq_pg_size_sq_lvl = pg_sz_lvl;

if (qp->scq)
req.scq_cid = cpu_to_le32(qp->scq->id);

qp_flags |= CMDQ_CREATE_QP_QP_FLAGS_RESERVED_LKEY_ENABLE;
qp_flags |= CMDQ_CREATE_QP_QP_FLAGS_FR_PMR_ENABLED;
if (qp->sig_type)
qp_flags |= CMDQ_CREATE_QP_QP_FLAGS_FORCE_COMPLETION;

/* RQ */
if (rq->max_wqe) {
hwq_attr.res = res;
hwq_attr.sginfo = &rq->sg_info;
hwq_attr.stride = BNXT_QPLIB_MAX_RQE_ENTRY_SIZE;
hwq_attr.stride = rq->wqe_size;
hwq_attr.depth = rq->max_wqe;
hwq_attr.aux_stride = 0;
hwq_attr.aux_depth = 0;
@@ -1071,22 +1016,10 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
}
pbl = &rq->hwq.pbl[PBL_LVL_0];
req.rq_pbl = cpu_to_le64(pbl->pg_map_arr[0]);
req.rq_pg_size_rq_lvl =
((rq->hwq.level & CMDQ_CREATE_QP_RQ_LVL_MASK) <<
CMDQ_CREATE_QP_RQ_LVL_SFT) |
(pbl->pg_size == ROCE_PG_SIZE_4K ?
CMDQ_CREATE_QP_RQ_PG_SIZE_PG_4K :
pbl->pg_size == ROCE_PG_SIZE_8K ?
CMDQ_CREATE_QP_RQ_PG_SIZE_PG_8K :
pbl->pg_size == ROCE_PG_SIZE_64K ?
CMDQ_CREATE_QP_RQ_PG_SIZE_PG_64K :
pbl->pg_size == ROCE_PG_SIZE_2M ?
CMDQ_CREATE_QP_RQ_PG_SIZE_PG_2M :
pbl->pg_size == ROCE_PG_SIZE_8M ?
CMDQ_CREATE_QP_RQ_PG_SIZE_PG_8M :
pbl->pg_size == ROCE_PG_SIZE_1G ?
CMDQ_CREATE_QP_RQ_PG_SIZE_PG_1G :
CMDQ_CREATE_QP_RQ_PG_SIZE_PG_4K);
pg_sz_lvl = (bnxt_qplib_base_pg_size(&rq->hwq) <<
CMDQ_CREATE_QP_RQ_PG_SIZE_SFT);
pg_sz_lvl |= (rq->hwq.level & CMDQ_CREATE_QP_RQ_LVL_MASK);
req.rq_pg_size_rq_lvl = pg_sz_lvl;
} else {
/* SRQ */
if (qp->srq) {
@@ -1097,7 +1030,13 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)

if (qp->rcq)
req.rcq_cid = cpu_to_le32(qp->rcq->id);

qp_flags |= CMDQ_CREATE_QP_QP_FLAGS_RESERVED_LKEY_ENABLE;
qp_flags |= CMDQ_CREATE_QP_QP_FLAGS_FR_PMR_ENABLED;
if (qp->sig_type)
qp_flags |= CMDQ_CREATE_QP_QP_FLAGS_FORCE_COMPLETION;
req.qp_flags = cpu_to_le32(qp_flags);

req.sq_size = cpu_to_le32(sq->hwq.max_elements);
req.rq_size = cpu_to_le32(rq->hwq.max_elements);
qp->sq_hdr_buf = NULL;
@@ -1483,12 +1422,11 @@ bail:
static void __clean_cq(struct bnxt_qplib_cq *cq, u64 qp)
{
struct bnxt_qplib_hwq *cq_hwq = &cq->hwq;
struct cq_base *hw_cqe, **hw_cqe_ptr;
struct cq_base *hw_cqe;
int i;

for (i = 0; i < cq_hwq->max_elements; i++) {
hw_cqe_ptr = (struct cq_base **)cq_hwq->pbl_ptr;
hw_cqe = &hw_cqe_ptr[CQE_PG(i)][CQE_IDX(i)];
hw_cqe = bnxt_qplib_get_qe(cq_hwq, i, NULL);
if (!CQE_CMP_VALID(hw_cqe, i, cq_hwq->max_elements))
continue;
/*
@@ -1615,6 +1553,34 @@ void *bnxt_qplib_get_qp1_rq_buf(struct bnxt_qplib_qp *qp,
return NULL;
}

static void bnxt_qplib_fill_psn_search(struct bnxt_qplib_qp *qp,
struct bnxt_qplib_swqe *wqe,
struct bnxt_qplib_swq *swq)
{
struct sq_psn_search_ext *psns_ext;
struct sq_psn_search *psns;
u32 flg_npsn;
u32 op_spsn;

psns = swq->psn_search;
psns_ext = swq->psn_ext;

op_spsn = ((swq->start_psn << SQ_PSN_SEARCH_START_PSN_SFT) &
SQ_PSN_SEARCH_START_PSN_MASK);
op_spsn |= ((wqe->type << SQ_PSN_SEARCH_OPCODE_SFT) &
SQ_PSN_SEARCH_OPCODE_MASK);
flg_npsn = ((swq->next_psn << SQ_PSN_SEARCH_NEXT_PSN_SFT) &
SQ_PSN_SEARCH_NEXT_PSN_MASK);

if (bnxt_qplib_is_chip_gen_p5(qp->cctx)) {
psns_ext->opcode_start_psn = cpu_to_le32(op_spsn);
psns_ext->flags_next_psn = cpu_to_le32(flg_npsn);
} else {
psns->opcode_start_psn = cpu_to_le32(op_spsn);
psns->flags_next_psn = cpu_to_le32(flg_npsn);
}
}

void bnxt_qplib_post_send_db(struct bnxt_qplib_qp *qp)
{
struct bnxt_qplib_q *sq = &qp->sq;
@@ -1625,16 +1591,16 @@ void bnxt_qplib_post_send_db(struct bnxt_qplib_qp *qp)
int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp,
struct bnxt_qplib_swqe *wqe)
{
struct bnxt_qplib_q *sq = &qp->sq;
struct bnxt_qplib_swq *swq;
struct sq_send *hw_sq_send_hdr, **hw_sq_send_ptr;
struct sq_sge *hw_sge;
struct bnxt_qplib_nq_work *nq_work = NULL;
bool sch_handler = false;
u32 sw_prod;
u8 wqe_size16;
int i, rc = 0, data_len = 0, pkt_num = 0;
struct bnxt_qplib_q *sq = &qp->sq;
struct sq_send *hw_sq_send_hdr;
struct bnxt_qplib_swq *swq;
bool sch_handler = false;
struct sq_sge *hw_sge;
u8 wqe_size16;
__le32 temp32;
u32 sw_prod;

if (qp->state != CMDQ_MODIFY_QP_NEW_STATE_RTS) {
if (qp->state == CMDQ_MODIFY_QP_NEW_STATE_ERR) {
@@ -1663,11 +1629,8 @@ int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp,
swq->flags |= SQ_SEND_FLAGS_SIGNAL_COMP;
swq->start_psn = sq->psn & BTH_PSN_MASK;

hw_sq_send_ptr = (struct sq_send **)sq->hwq.pbl_ptr;
hw_sq_send_hdr = &hw_sq_send_ptr[get_sqe_pg(sw_prod)]
[get_sqe_idx(sw_prod)];

memset(hw_sq_send_hdr, 0, BNXT_QPLIB_MAX_SQE_ENTRY_SIZE);
hw_sq_send_hdr = bnxt_qplib_get_qe(&sq->hwq, sw_prod, NULL);
memset(hw_sq_send_hdr, 0, sq->wqe_size);

if (wqe->flags & BNXT_QPLIB_SWQE_FLAGS_INLINE) {
/* Copy the inline data */
@@ -1854,28 +1817,8 @@ int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp,
goto done;
}
swq->next_psn = sq->psn & BTH_PSN_MASK;
if (swq->psn_search) {
u32 opcd_spsn;
u32 flg_npsn;

opcd_spsn = ((swq->start_psn << SQ_PSN_SEARCH_START_PSN_SFT) &
SQ_PSN_SEARCH_START_PSN_MASK);
opcd_spsn |= ((wqe->type << SQ_PSN_SEARCH_OPCODE_SFT) &
SQ_PSN_SEARCH_OPCODE_MASK);
flg_npsn = ((swq->next_psn << SQ_PSN_SEARCH_NEXT_PSN_SFT) &
SQ_PSN_SEARCH_NEXT_PSN_MASK);
if (bnxt_qplib_is_chip_gen_p5(qp->cctx)) {
swq->psn_ext->opcode_start_psn =
cpu_to_le32(opcd_spsn);
swq->psn_ext->flags_next_psn =
cpu_to_le32(flg_npsn);
} else {
swq->psn_search->opcode_start_psn =
cpu_to_le32(opcd_spsn);
swq->psn_search->flags_next_psn =
cpu_to_le32(flg_npsn);
}
}
if (qp->type == CMDQ_CREATE_QP_TYPE_RC)
bnxt_qplib_fill_psn_search(qp, wqe, swq);
queue_err:
if (sch_handler) {
/* Store the ULP info in the software structures */
@@ -1918,13 +1861,13 @@ void bnxt_qplib_post_recv_db(struct bnxt_qplib_qp *qp)
int bnxt_qplib_post_recv(struct bnxt_qplib_qp *qp,
struct bnxt_qplib_swqe *wqe)
{
struct bnxt_qplib_q *rq = &qp->rq;
struct rq_wqe *rqe, **rqe_ptr;
struct sq_sge *hw_sge;
struct bnxt_qplib_nq_work *nq_work = NULL;
struct bnxt_qplib_q *rq = &qp->rq;
bool sch_handler = false;
u32 sw_prod;
struct sq_sge *hw_sge;
struct rq_wqe *rqe;
int i, rc = 0;
u32 sw_prod;

if (qp->state == CMDQ_MODIFY_QP_NEW_STATE_ERR) {
sch_handler = true;
@@ -1941,10 +1884,8 @@ int bnxt_qplib_post_recv(struct bnxt_qplib_qp *qp,
sw_prod = HWQ_CMP(rq->hwq.prod, &rq->hwq);
rq->swq[sw_prod].wr_id = wqe->wr_id;

rqe_ptr = (struct rq_wqe **)rq->hwq.pbl_ptr;
rqe = &rqe_ptr[RQE_PG(sw_prod)][RQE_IDX(sw_prod)];

memset(rqe, 0, BNXT_QPLIB_MAX_RQE_ENTRY_SIZE);
rqe = bnxt_qplib_get_qe(&rq->hwq, sw_prod, NULL);
memset(rqe, 0, rq->wqe_size);

/* Calculate wqe_size16 and data_len */
for (i = 0, hw_sge = (struct sq_sge *)rqe->data;
@@ -1997,9 +1938,10 @@ int bnxt_qplib_create_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq)
struct bnxt_qplib_rcfw *rcfw = res->rcfw;
struct bnxt_qplib_hwq_attr hwq_attr = {};
struct creq_create_cq_resp resp;
struct cmdq_create_cq req;
struct bnxt_qplib_pbl *pbl;
struct cmdq_create_cq req;
u16 cmd_flags = 0;
u32 pg_sz_lvl;
int rc;

hwq_attr.res = res;
@@ -2020,22 +1962,13 @@ int bnxt_qplib_create_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq)
}
req.dpi = cpu_to_le32(cq->dpi->dpi);
req.cq_handle = cpu_to_le64(cq->cq_handle);

req.cq_size = cpu_to_le32(cq->hwq.max_elements);
pbl = &cq->hwq.pbl[PBL_LVL_0];
req.pg_size_lvl = cpu_to_le32(
((cq->hwq.level & CMDQ_CREATE_CQ_LVL_MASK) <<
CMDQ_CREATE_CQ_LVL_SFT) |
(pbl->pg_size == ROCE_PG_SIZE_4K ? CMDQ_CREATE_CQ_PG_SIZE_PG_4K :
pbl->pg_size == ROCE_PG_SIZE_8K ? CMDQ_CREATE_CQ_PG_SIZE_PG_8K :
pbl->pg_size == ROCE_PG_SIZE_64K ? CMDQ_CREATE_CQ_PG_SIZE_PG_64K :
pbl->pg_size == ROCE_PG_SIZE_2M ? CMDQ_CREATE_CQ_PG_SIZE_PG_2M :
pbl->pg_size == ROCE_PG_SIZE_8M ? CMDQ_CREATE_CQ_PG_SIZE_PG_8M :
pbl->pg_size == ROCE_PG_SIZE_1G ? CMDQ_CREATE_CQ_PG_SIZE_PG_1G :
CMDQ_CREATE_CQ_PG_SIZE_PG_4K));

pg_sz_lvl = (bnxt_qplib_base_pg_size(&cq->hwq) <<
CMDQ_CREATE_CQ_PG_SIZE_SFT);
pg_sz_lvl |= (cq->hwq.level & CMDQ_CREATE_CQ_LVL_MASK);
req.pg_size_lvl = cpu_to_le32(pg_sz_lvl);
req.pbl = cpu_to_le64(pbl->pg_map_arr[0]);

req.cq_fco_cnq_id = cpu_to_le32(
(cq->cnq_hw_ring_id & CMDQ_CREATE_CQ_CNQ_ID_MASK) <<
CMDQ_CREATE_CQ_CNQ_ID_SFT);
@@ -2194,13 +2127,13 @@ void bnxt_qplib_mark_qp_error(void *qp_handle)
static int do_wa9060(struct bnxt_qplib_qp *qp, struct bnxt_qplib_cq *cq,
u32 cq_cons, u32 sw_sq_cons, u32 cqe_sq_cons)
{
struct bnxt_qplib_q *sq = &qp->sq;
struct bnxt_qplib_swq *swq;
u32 peek_sw_cq_cons, peek_raw_cq_cons, peek_sq_cons_idx;
struct cq_base *peek_hwcqe, **peek_hw_cqe_ptr;
struct bnxt_qplib_q *sq = &qp->sq;
struct cq_req *peek_req_hwcqe;
struct bnxt_qplib_qp *peek_qp;
struct bnxt_qplib_q *peek_sq;
struct bnxt_qplib_swq *swq;
struct cq_base *peek_hwcqe;
int i, rc = 0;

/* Normal mode */
@@ -2230,9 +2163,8 @@ static int do_wa9060(struct bnxt_qplib_qp *qp, struct bnxt_qplib_cq *cq,
i = cq->hwq.max_elements;
while (i--) {
peek_sw_cq_cons = HWQ_CMP((peek_sw_cq_cons), &cq->hwq);
peek_hw_cqe_ptr = (struct cq_base **)cq->hwq.pbl_ptr;
peek_hwcqe = &peek_hw_cqe_ptr[CQE_PG(peek_sw_cq_cons)]
[CQE_IDX(peek_sw_cq_cons)];
peek_hwcqe = bnxt_qplib_get_qe(&cq->hwq,
peek_sw_cq_cons, NULL);
/* If the next hwcqe is VALID */
if (CQE_CMP_VALID(peek_hwcqe, peek_raw_cq_cons,
cq->hwq.max_elements)) {
@@ -2294,11 +2226,11 @@ static int bnxt_qplib_cq_process_req(struct bnxt_qplib_cq *cq,
struct bnxt_qplib_cqe **pcqe, int *budget,
u32 cq_cons, struct bnxt_qplib_qp **lib_qp)
{
struct bnxt_qplib_qp *qp;
struct bnxt_qplib_q *sq;
struct bnxt_qplib_cqe *cqe;
u32 sw_sq_cons, cqe_sq_cons;
struct bnxt_qplib_swq *swq;
struct bnxt_qplib_cqe *cqe;
struct bnxt_qplib_qp *qp;
struct bnxt_qplib_q *sq;
int rc = 0;

qp = (struct bnxt_qplib_qp *)((unsigned long)
@@ -2408,10 +2340,10 @@ static int bnxt_qplib_cq_process_res_rc(struct bnxt_qplib_cq *cq,
struct bnxt_qplib_cqe **pcqe,
int *budget)
{
struct bnxt_qplib_qp *qp;
struct bnxt_qplib_q *rq;
struct bnxt_qplib_srq *srq;
struct bnxt_qplib_cqe *cqe;
struct bnxt_qplib_qp *qp;
struct bnxt_qplib_q *rq;
u32 wr_id_idx;
int rc = 0;

@@ -2483,10 +2415,10 @@ static int bnxt_qplib_cq_process_res_ud(struct bnxt_qplib_cq *cq,
struct bnxt_qplib_cqe **pcqe,
int *budget)
{
struct bnxt_qplib_qp *qp;
struct bnxt_qplib_q *rq;
struct bnxt_qplib_srq *srq;
struct bnxt_qplib_cqe *cqe;
struct bnxt_qplib_qp *qp;
struct bnxt_qplib_q *rq;
u32 wr_id_idx;
int rc = 0;

@@ -2561,15 +2493,13 @@ done:

bool bnxt_qplib_is_cq_empty(struct bnxt_qplib_cq *cq)
{
struct cq_base *hw_cqe, **hw_cqe_ptr;
struct cq_base *hw_cqe;
u32 sw_cons, raw_cons;
bool rc = true;

raw_cons = cq->hwq.cons;
sw_cons = HWQ_CMP(raw_cons, &cq->hwq);
hw_cqe_ptr = (struct cq_base **)cq->hwq.pbl_ptr;
hw_cqe = &hw_cqe_ptr[CQE_PG(sw_cons)][CQE_IDX(sw_cons)];

hw_cqe = bnxt_qplib_get_qe(&cq->hwq, sw_cons, NULL);
/* Check for Valid bit. If the CQE is valid, return false */
rc = !CQE_CMP_VALID(hw_cqe, raw_cons, cq->hwq.max_elements);
return rc;
@@ -2813,7 +2743,7 @@ int bnxt_qplib_process_flush_list(struct bnxt_qplib_cq *cq,
int bnxt_qplib_poll_cq(struct bnxt_qplib_cq *cq, struct bnxt_qplib_cqe *cqe,
int num_cqes, struct bnxt_qplib_qp **lib_qp)
{
struct cq_base *hw_cqe, **hw_cqe_ptr;
struct cq_base *hw_cqe;
u32 sw_cons, raw_cons;
int budget, rc = 0;

@@ -2822,8 +2752,7 @@ int bnxt_qplib_poll_cq(struct bnxt_qplib_cq *cq, struct bnxt_qplib_cqe *cqe,

while (budget) {
sw_cons = HWQ_CMP(raw_cons, &cq->hwq);
hw_cqe_ptr = (struct cq_base **)cq->hwq.pbl_ptr;
hw_cqe = &hw_cqe_ptr[CQE_PG(sw_cons)][CQE_IDX(sw_cons)];
hw_cqe = bnxt_qplib_get_qe(&cq->hwq, sw_cons, NULL);

/* Check for Valid bit */
if (!CQE_CMP_VALID(hw_cqe, raw_cons, cq->hwq.max_elements))

@@ -45,6 +45,7 @@ struct bnxt_qplib_srq {
struct bnxt_qplib_db_info dbinfo;
u64 srq_handle;
u32 id;
u16 wqe_size;
u32 max_wqe;
u32 max_sge;
u32 threshold;
@@ -65,38 +66,7 @@ struct bnxt_qplib_sge {
u32 size;
};

#define BNXT_QPLIB_MAX_SQE_ENTRY_SIZE sizeof(struct sq_send)

#define SQE_CNT_PER_PG (PAGE_SIZE / BNXT_QPLIB_MAX_SQE_ENTRY_SIZE)
#define SQE_MAX_IDX_PER_PG (SQE_CNT_PER_PG - 1)

static inline u32 get_sqe_pg(u32 val)
{
return ((val & ~SQE_MAX_IDX_PER_PG) / SQE_CNT_PER_PG);
}

static inline u32 get_sqe_idx(u32 val)
{
return (val & SQE_MAX_IDX_PER_PG);
}

#define BNXT_QPLIB_MAX_PSNE_ENTRY_SIZE sizeof(struct sq_psn_search)

#define PSNE_CNT_PER_PG (PAGE_SIZE / BNXT_QPLIB_MAX_PSNE_ENTRY_SIZE)
#define PSNE_MAX_IDX_PER_PG (PSNE_CNT_PER_PG - 1)

static inline u32 get_psne_pg(u32 val)
{
return ((val & ~PSNE_MAX_IDX_PER_PG) / PSNE_CNT_PER_PG);
}

static inline u32 get_psne_idx(u32 val)
{
return (val & PSNE_MAX_IDX_PER_PG);
}

#define BNXT_QPLIB_QP_MAX_SGL 6

struct bnxt_qplib_swq {
u64 wr_id;
int next_idx;
@@ -226,19 +196,13 @@ struct bnxt_qplib_swqe {
};
};

#define BNXT_QPLIB_MAX_RQE_ENTRY_SIZE sizeof(struct rq_wqe)

#define RQE_CNT_PER_PG (PAGE_SIZE / BNXT_QPLIB_MAX_RQE_ENTRY_SIZE)
#define RQE_MAX_IDX_PER_PG (RQE_CNT_PER_PG - 1)
#define RQE_PG(x) (((x) & ~RQE_MAX_IDX_PER_PG) / RQE_CNT_PER_PG)
#define RQE_IDX(x) ((x) & RQE_MAX_IDX_PER_PG)

struct bnxt_qplib_q {
struct bnxt_qplib_hwq hwq;
struct bnxt_qplib_swq *swq;
struct bnxt_qplib_db_info dbinfo;
struct bnxt_qplib_sg_info sg_info;
u32 max_wqe;
u16 wqe_size;
u16 q_full_delta;
u16 max_sge;
u32 psn;
@@ -256,7 +220,7 @@ struct bnxt_qplib_qp {
struct bnxt_qplib_dpi *dpi;
struct bnxt_qplib_chip_ctx *cctx;
u64 qp_handle;
#define BNXT_QPLIB_QP_ID_INVALID 0xFFFFFFFF
#define BNXT_QPLIB_QP_ID_INVALID 0xFFFFFFFF
u32 id;
u8 type;
u8 sig_type;

@@ -89,10 +89,9 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, struct cmdq_base *req,
struct creq_base *resp, void *sb, u8 is_block)
{
struct bnxt_qplib_cmdq_ctx *cmdq = &rcfw->cmdq;
struct bnxt_qplib_cmdqe *cmdqe, **hwq_ptr;
struct bnxt_qplib_hwq *hwq = &cmdq->hwq;
struct bnxt_qplib_crsqe *crsqe;
u32 cmdq_depth = rcfw->cmdq_depth;
struct bnxt_qplib_cmdqe *cmdqe;
u32 sw_prod, cmdq_prod;
struct pci_dev *pdev;
unsigned long flags;
@@ -163,13 +162,11 @@ static int __send_message(struct bnxt_qplib_rcfw *rcfw, struct cmdq_base *req,
BNXT_QPLIB_CMDQE_UNITS;
}

hwq_ptr = (struct bnxt_qplib_cmdqe **)hwq->pbl_ptr;
preq = (u8 *)req;
do {
/* Locate the next cmdq slot */
sw_prod = HWQ_CMP(hwq->prod, hwq);
cmdqe = &hwq_ptr[get_cmdq_pg(sw_prod, cmdq_depth)]
[get_cmdq_idx(sw_prod, cmdq_depth)];
cmdqe = bnxt_qplib_get_qe(hwq, sw_prod, NULL);
if (!cmdqe) {
dev_err(&pdev->dev,
"RCFW request failed with no cmdqe!\n");
@@ -378,7 +375,7 @@ static void bnxt_qplib_service_creq(unsigned long data)
struct bnxt_qplib_creq_ctx *creq = &rcfw->creq;
u32 type, budget = CREQ_ENTRY_POLL_BUDGET;
struct bnxt_qplib_hwq *hwq = &creq->hwq;
struct creq_base *creqe, **hwq_ptr;
struct creq_base *creqe;
u32 sw_cons, raw_cons;
unsigned long flags;

@@ -387,8 +384,7 @@ static void bnxt_qplib_service_creq(unsigned long data)
raw_cons = hwq->cons;
while (budget > 0) {
sw_cons = HWQ_CMP(raw_cons, hwq);
hwq_ptr = (struct creq_base **)hwq->pbl_ptr;
creqe = &hwq_ptr[get_creq_pg(sw_cons)][get_creq_idx(sw_cons)];
creqe = bnxt_qplib_get_qe(hwq, sw_cons, NULL);
if (!CREQ_CMP_VALID(creqe, raw_cons, hwq->max_elements))
break;
/* The valid test of the entry must be done first before
@@ -434,7 +430,6 @@ static irqreturn_t bnxt_qplib_creq_irq(int irq, void *dev_instance)
{
struct bnxt_qplib_rcfw *rcfw = dev_instance;
struct bnxt_qplib_creq_ctx *creq;
struct creq_base **creq_ptr;
struct bnxt_qplib_hwq *hwq;
u32 sw_cons;

@@ -442,8 +437,7 @@ static irqreturn_t bnxt_qplib_creq_irq(int irq, void *dev_instance)
hwq = &creq->hwq;
/* Prefetch the CREQ element */
sw_cons = HWQ_CMP(hwq->cons, hwq);
creq_ptr = (struct creq_base **)creq->hwq.pbl_ptr;
prefetch(&creq_ptr[get_creq_pg(sw_cons)][get_creq_idx(sw_cons)]);
prefetch(bnxt_qplib_get_qe(hwq, sw_cons, NULL));

tasklet_schedule(&creq->creq_tasklet);

@@ -468,29 +462,13 @@ int bnxt_qplib_deinit_rcfw(struct bnxt_qplib_rcfw *rcfw)
return 0;
}

static int __get_pbl_pg_idx(struct bnxt_qplib_pbl *pbl)
{
return (pbl->pg_size == ROCE_PG_SIZE_4K ?
CMDQ_INITIALIZE_FW_QPC_PG_SIZE_PG_4K :
pbl->pg_size == ROCE_PG_SIZE_8K ?
CMDQ_INITIALIZE_FW_QPC_PG_SIZE_PG_8K :
pbl->pg_size == ROCE_PG_SIZE_64K ?
CMDQ_INITIALIZE_FW_QPC_PG_SIZE_PG_64K :
pbl->pg_size == ROCE_PG_SIZE_2M ?
CMDQ_INITIALIZE_FW_QPC_PG_SIZE_PG_2M :
pbl->pg_size == ROCE_PG_SIZE_8M ?
CMDQ_INITIALIZE_FW_QPC_PG_SIZE_PG_8M :
pbl->pg_size == ROCE_PG_SIZE_1G ?
CMDQ_INITIALIZE_FW_QPC_PG_SIZE_PG_1G :
CMDQ_INITIALIZE_FW_QPC_PG_SIZE_PG_4K);
}

int bnxt_qplib_init_rcfw(struct bnxt_qplib_rcfw *rcfw,
struct bnxt_qplib_ctx *ctx, int is_virtfn)
{
struct cmdq_initialize_fw req;
struct creq_initialize_fw_resp resp;
u16 cmd_flags = 0, level;
struct cmdq_initialize_fw req;
u16 cmd_flags = 0;
u8 pgsz, lvl;
int rc;

RCFW_CMD_PREP(req, INITIALIZE_FW, cmd_flags);
@@ -511,32 +489,30 @@ int bnxt_qplib_init_rcfw(struct bnxt_qplib_rcfw *rcfw,
if (bnxt_qplib_is_chip_gen_p5(rcfw->res->cctx))
goto config_vf_res;

level = ctx->qpc_tbl.level;
req.qpc_pg_size_qpc_lvl = (level << CMDQ_INITIALIZE_FW_QPC_LVL_SFT) |
__get_pbl_pg_idx(&ctx->qpc_tbl.pbl[level]);
level = ctx->mrw_tbl.level;
req.mrw_pg_size_mrw_lvl = (level << CMDQ_INITIALIZE_FW_MRW_LVL_SFT) |
__get_pbl_pg_idx(&ctx->mrw_tbl.pbl[level]);
level = ctx->srqc_tbl.level;
req.srq_pg_size_srq_lvl = (level << CMDQ_INITIALIZE_FW_SRQ_LVL_SFT) |
__get_pbl_pg_idx(&ctx->srqc_tbl.pbl[level]);
level = ctx->cq_tbl.level;
req.cq_pg_size_cq_lvl = (level << CMDQ_INITIALIZE_FW_CQ_LVL_SFT) |
__get_pbl_pg_idx(&ctx->cq_tbl.pbl[level]);
level = ctx->srqc_tbl.level;
req.srq_pg_size_srq_lvl = (level << CMDQ_INITIALIZE_FW_SRQ_LVL_SFT) |
__get_pbl_pg_idx(&ctx->srqc_tbl.pbl[level]);
level = ctx->cq_tbl.level;
req.cq_pg_size_cq_lvl = (level << CMDQ_INITIALIZE_FW_CQ_LVL_SFT) |
__get_pbl_pg_idx(&ctx->cq_tbl.pbl[level]);
level = ctx->tim_tbl.level;
req.tim_pg_size_tim_lvl = (level << CMDQ_INITIALIZE_FW_TIM_LVL_SFT) |
__get_pbl_pg_idx(&ctx->tim_tbl.pbl[level]);
level = ctx->tqm_ctx.pde.level;
req.tqm_pg_size_tqm_lvl =
(level << CMDQ_INITIALIZE_FW_TQM_LVL_SFT) |
__get_pbl_pg_idx(&ctx->tqm_ctx.pde.pbl[level]);

lvl = ctx->qpc_tbl.level;
pgsz = bnxt_qplib_base_pg_size(&ctx->qpc_tbl);
req.qpc_pg_size_qpc_lvl = (pgsz << CMDQ_INITIALIZE_FW_QPC_PG_SIZE_SFT) |
lvl;
lvl = ctx->mrw_tbl.level;
pgsz = bnxt_qplib_base_pg_size(&ctx->mrw_tbl);
req.mrw_pg_size_mrw_lvl = (pgsz << CMDQ_INITIALIZE_FW_QPC_PG_SIZE_SFT) |
lvl;
lvl = ctx->srqc_tbl.level;
pgsz = bnxt_qplib_base_pg_size(&ctx->srqc_tbl);
req.srq_pg_size_srq_lvl = (pgsz << CMDQ_INITIALIZE_FW_QPC_PG_SIZE_SFT) |
lvl;
lvl = ctx->cq_tbl.level;
pgsz = bnxt_qplib_base_pg_size(&ctx->cq_tbl);
req.cq_pg_size_cq_lvl = (pgsz << CMDQ_INITIALIZE_FW_QPC_PG_SIZE_SFT) |
lvl;
lvl = ctx->tim_tbl.level;
pgsz = bnxt_qplib_base_pg_size(&ctx->tim_tbl);
req.tim_pg_size_tim_lvl = (pgsz << CMDQ_INITIALIZE_FW_QPC_PG_SIZE_SFT) |
lvl;
lvl = ctx->tqm_ctx.pde.level;
pgsz = bnxt_qplib_base_pg_size(&ctx->tqm_ctx.pde);
req.tqm_pg_size_tqm_lvl = (pgsz << CMDQ_INITIALIZE_FW_QPC_PG_SIZE_SFT) |
lvl;
req.qpc_page_dir =
cpu_to_le64(ctx->qpc_tbl.pbl[PBL_LVL_0].pg_map_arr[0]);
req.mrw_page_dir =

@@ -87,12 +87,6 @@ static inline u32 bnxt_qplib_cmdqe_page_size(u32 depth)
return (bnxt_qplib_cmdqe_npages(depth) * PAGE_SIZE);
}

static inline u32 bnxt_qplib_cmdqe_cnt_per_pg(u32 depth)
{
return (bnxt_qplib_cmdqe_page_size(depth) /
BNXT_QPLIB_CMDQE_UNITS);
}

/* Set the cmd_size to a factor of CMDQE unit */
static inline void bnxt_qplib_set_cmd_slots(struct cmdq_base *req)
{
@@ -100,30 +94,12 @@ static inline void bnxt_qplib_set_cmd_slots(struct cmdq_base *req)
BNXT_QPLIB_CMDQE_UNITS;
}

#define MAX_CMDQ_IDX(depth) ((depth) - 1)

static inline u32 bnxt_qplib_max_cmdq_idx_per_pg(u32 depth)
{
return (bnxt_qplib_cmdqe_cnt_per_pg(depth) - 1);
}

#define RCFW_MAX_COOKIE_VALUE 0x7FFF
#define RCFW_CMD_IS_BLOCKING 0x8000
#define RCFW_BLOCKED_CMD_WAIT_COUNT 0x4E20

#define HWRM_VERSION_RCFW_CMDQ_DEPTH_CHECK 0x1000900020011ULL

static inline u32 get_cmdq_pg(u32 val, u32 depth)
{
return (val & ~(bnxt_qplib_max_cmdq_idx_per_pg(depth))) /
(bnxt_qplib_cmdqe_cnt_per_pg(depth));
}

static inline u32 get_cmdq_idx(u32 val, u32 depth)
{
return val & (bnxt_qplib_max_cmdq_idx_per_pg(depth));
}

/* Crsq buf is 1024-Byte */
struct bnxt_qplib_crsbe {
u8 data[1024];
@@ -133,76 +109,9 @@ struct bnxt_qplib_crsbe {
/* Allocate 1 per QP for async error notification for now */
#define BNXT_QPLIB_CREQE_MAX_CNT (64 * 1024)
#define BNXT_QPLIB_CREQE_UNITS 16 /* 16-Bytes per prod unit */
#define BNXT_QPLIB_CREQE_CNT_PER_PG (PAGE_SIZE / BNXT_QPLIB_CREQE_UNITS)

#define MAX_CREQ_IDX (BNXT_QPLIB_CREQE_MAX_CNT - 1)
#define MAX_CREQ_IDX_PER_PG (BNXT_QPLIB_CREQE_CNT_PER_PG - 1)

static inline u32 get_creq_pg(u32 val)
{
return (val & ~MAX_CREQ_IDX_PER_PG) / BNXT_QPLIB_CREQE_CNT_PER_PG;
}

static inline u32 get_creq_idx(u32 val)
{
return val & MAX_CREQ_IDX_PER_PG;
}

#define BNXT_QPLIB_CREQE_PER_PG (PAGE_SIZE / sizeof(struct creq_base))

#define CREQ_CMP_VALID(hdr, raw_cons, cp_bit) \
(!!((hdr)->v & CREQ_BASE_V) == \
!((raw_cons) & (cp_bit)))

#define CREQ_DB_KEY_CP (0x2 << CMPL_DOORBELL_KEY_SFT)
#define CREQ_DB_IDX_VALID CMPL_DOORBELL_IDX_VALID
#define CREQ_DB_IRQ_DIS CMPL_DOORBELL_MASK
#define CREQ_DB_CP_FLAGS_REARM (CREQ_DB_KEY_CP | \
CREQ_DB_IDX_VALID)
#define CREQ_DB_CP_FLAGS (CREQ_DB_KEY_CP | \
CREQ_DB_IDX_VALID | \
CREQ_DB_IRQ_DIS)

static inline void bnxt_qplib_ring_creq_db64(void __iomem *db, u32 index,
u32 xid, bool arm)
{
u64 val = 0;

val = xid & DBC_DBC_XID_MASK;
val |= DBC_DBC_PATH_ROCE;
val |= arm ? DBC_DBC_TYPE_NQ_ARM : DBC_DBC_TYPE_NQ;
val <<= 32;
val |= index & DBC_DBC_INDEX_MASK;

writeq(val, db);
}

static inline void bnxt_qplib_ring_creq_db_rearm(void __iomem *db, u32 raw_cons,
u32 max_elements, u32 xid,
bool gen_p5)
{
u32 index = raw_cons & (max_elements - 1);

if (gen_p5)
bnxt_qplib_ring_creq_db64(db, index, xid, true);
else
writel(CREQ_DB_CP_FLAGS_REARM | (index & DBC_DBC32_XID_MASK),
db);
}

static inline void bnxt_qplib_ring_creq_db(void __iomem *db, u32 raw_cons,
u32 max_elements, u32 xid,
bool gen_p5)
{
u32 index = raw_cons & (max_elements - 1);

if (gen_p5)
bnxt_qplib_ring_creq_db64(db, index, xid, true);
else
writel(CREQ_DB_CP_FLAGS | (index & DBC_DBC32_XID_MASK),
db);
}

#define CREQ_ENTRY_POLL_BUDGET 0x100

/* HWQ */

@@ -347,6 +347,7 @@ done:
hwq->depth = hwq_attr->depth;
hwq->max_elements = depth;
hwq->element_size = stride;
hwq->qe_ppg = pg_size / stride;
/* For direct access to the elements */
lvl = hwq->level;
if (hwq_attr->sginfo->nopte && hwq->level)

@@ -80,6 +80,15 @@ enum bnxt_qplib_pbl_lvl {
#define ROCE_PG_SIZE_8M (8 * 1024 * 1024)
#define ROCE_PG_SIZE_1G (1024 * 1024 * 1024)

enum bnxt_qplib_hwrm_pg_size {
BNXT_QPLIB_HWRM_PG_SIZE_4K = 0,
BNXT_QPLIB_HWRM_PG_SIZE_8K = 1,
BNXT_QPLIB_HWRM_PG_SIZE_64K = 2,
BNXT_QPLIB_HWRM_PG_SIZE_2M = 3,
BNXT_QPLIB_HWRM_PG_SIZE_8M = 4,
BNXT_QPLIB_HWRM_PG_SIZE_1G = 5,
};

struct bnxt_qplib_reg_desc {
u8 bar_id;
resource_size_t bar_base;
@@ -126,6 +135,7 @@ struct bnxt_qplib_hwq {
u32 max_elements;
u32 depth;
u16 element_size; /* Size of each entry */
u16 qe_ppg; /* queue entry per page */

u32 prod; /* raw */
u32 cons; /* raw */
@@ -263,6 +273,49 @@ static inline u8 bnxt_qplib_get_ring_type(struct bnxt_qplib_chip_ctx *cctx)
RING_ALLOC_REQ_RING_TYPE_ROCE_CMPL;
}

static inline u8 bnxt_qplib_base_pg_size(struct bnxt_qplib_hwq *hwq)
{
u8 pg_size = BNXT_QPLIB_HWRM_PG_SIZE_4K;
struct bnxt_qplib_pbl *pbl;

pbl = &hwq->pbl[PBL_LVL_0];
switch (pbl->pg_size) {
case ROCE_PG_SIZE_4K:
pg_size = BNXT_QPLIB_HWRM_PG_SIZE_4K;
break;
case ROCE_PG_SIZE_8K:
pg_size = BNXT_QPLIB_HWRM_PG_SIZE_8K;
break;
case ROCE_PG_SIZE_64K:
pg_size = BNXT_QPLIB_HWRM_PG_SIZE_64K;
break;
case ROCE_PG_SIZE_2M:
pg_size = BNXT_QPLIB_HWRM_PG_SIZE_2M;
break;
case ROCE_PG_SIZE_8M:
pg_size = BNXT_QPLIB_HWRM_PG_SIZE_8M;
break;
case ROCE_PG_SIZE_1G:
pg_size = BNXT_QPLIB_HWRM_PG_SIZE_1G;
break;
default:
break;
}

return pg_size;
}

static inline void *bnxt_qplib_get_qe(struct bnxt_qplib_hwq *hwq,
u32 indx, u64 *pg)
{
u32 pg_num, pg_idx;

pg_num = (indx / hwq->qe_ppg);
pg_idx = (indx % hwq->qe_ppg);
if (pg)
*pg = (u64)&hwq->pbl_ptr[pg_num];
return (void *)(hwq->pbl_ptr[pg_num] + hwq->element_size * pg_idx);
}

#define to_bnxt_qplib(ptr, type, member) \
container_of(ptr, type, member)

@@ -132,9 +132,6 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw,
attr->max_raw_ethy_qp = le32_to_cpu(sb->max_raw_eth_qp);
attr->max_ah = le32_to_cpu(sb->max_ah);

attr->max_fmr = le32_to_cpu(sb->max_fmr);
attr->max_map_per_fmr = sb->max_map_per_fmr;

attr->max_srq = le16_to_cpu(sb->max_srq);
attr->max_srq_wqes = le32_to_cpu(sb->max_srq_wr) - 1;
attr->max_srq_sges = sb->max_srq_sge;

@@ -64,8 +64,6 @@ struct bnxt_qplib_dev_attr {
u32 max_mw;
u32 max_raw_ethy_qp;
u32 max_ah;
u32 max_fmr;
u32 max_map_per_fmr;
u32 max_srq;
u32 max_srq_wqes;
u32 max_srq_sges;

@@ -210,6 +210,20 @@ struct sq_send {
__le32 data[24];
};

/* sq_send_hdr (size:256b/32B) */
struct sq_send_hdr {
u8 wqe_type;
u8 flags;
u8 wqe_size;
u8 reserved8_1;
__le32 inv_key_or_imm_data;
__le32 length;
__le32 q_key;
__le32 dst_qp;
__le32 avid;
__le64 reserved64;
};

/* Send Raw Ethernet and QP1 SQ WQE (40 bytes) */
struct sq_send_raweth_qp1 {
u8 wqe_type;
@@ -265,6 +279,21 @@ struct sq_send_raweth_qp1 {
__le32 data[24];
};

/* sq_send_raweth_qp1_hdr (size:256b/32B) */
struct sq_send_raweth_qp1_hdr {
u8 wqe_type;
u8 flags;
u8 wqe_size;
u8 reserved8;
__le16 lflags;
__le16 cfa_action;
__le32 length;
__le32 reserved32_1;
__le32 cfa_meta;
__le32 reserved32_2;
__le64 reserved64;
};

/* RDMA SQ WQE (40 bytes) */
struct sq_rdma {
u8 wqe_type;
@@ -288,6 +317,20 @@ struct sq_rdma {
__le32 data[24];
};

/* sq_rdma_hdr (size:256b/32B) */
struct sq_rdma_hdr {
u8 wqe_type;
u8 flags;
u8 wqe_size;
u8 reserved8;
__le32 imm_data;
__le32 length;
__le32 reserved32_1;
__le64 remote_va;
__le32 remote_key;
__le32 reserved32_2;
};

/* Atomic SQ WQE (40 bytes) */
struct sq_atomic {
u8 wqe_type;
@@ -307,6 +350,17 @@ struct sq_atomic {
__le32 data[24];
};

/* sq_atomic_hdr (size:256b/32B) */
struct sq_atomic_hdr {
u8 wqe_type;
u8 flags;
__le16 reserved16;
__le32 remote_key;
__le64 remote_va;
__le64 swap_data;
__le64 cmp_data;
};

/* Local Invalidate SQ WQE (40 bytes) */
struct sq_localinvalidate {
u8 wqe_type;
@@ -324,6 +378,16 @@ struct sq_localinvalidate {
__le32 data[24];
};

/* sq_localinvalidate_hdr (size:256b/32B) */
struct sq_localinvalidate_hdr {
u8 wqe_type;
u8 flags;
__le16 reserved16;
__le32 inv_l_key;
__le64 reserved64;
u8 reserved128[16];
};

/* FR-PMR SQ WQE (40 bytes) */
struct sq_fr_pmr {
u8 wqe_type;
@@ -380,6 +444,21 @@ struct sq_fr_pmr {
__le32 data[24];
};

/* sq_fr_pmr_hdr (size:256b/32B) */
struct sq_fr_pmr_hdr {
u8 wqe_type;
u8 flags;
u8 access_cntl;
u8 zero_based_page_size_log;
__le32 l_key;
u8 length[5];
u8 reserved8_1;
u8 reserved8_2;
u8 numlevels_pbl_page_size_log;
__le64 pblptr;
__le64 va;
};

/* Bind SQ WQE (40 bytes) */
struct sq_bind {
u8 wqe_type;
@@ -417,6 +496,22 @@ struct sq_bind {
#define SQ_BIND_DATA_SFT 0
};

/* sq_bind_hdr (size:256b/32B) */
struct sq_bind_hdr {
u8 wqe_type;
u8 flags;
u8 access_cntl;
u8 reserved8_1;
u8 mw_type_zero_based;
u8 reserved8_2;
__le16 reserved16;
__le32 parent_l_key;
__le32 l_key;
__le64 va;
u8 length[5];
u8 reserved24[3];
};

/* RQ/SRQ WQE Structures */
/* RQ/SRQ WQE (40 bytes) */
struct rq_wqe {
@@ -435,6 +530,17 @@ struct rq_wqe {
__le32 data[24];
};

/* rq_wqe_hdr (size:256b/32B) */
struct rq_wqe_hdr {
u8 wqe_type;
u8 flags;
u8 wqe_size;
u8 reserved8;
__le32 reserved32;
__le32 wr_id[2];
u8 reserved128[16];
};

/* CQ CQE Structures */
/* Base CQE (32 bytes) */
struct cq_base {

@@ -953,6 +953,7 @@ void c4iw_dealloc(struct uld_ctx *ctx)
static void c4iw_remove(struct uld_ctx *ctx)
{
pr_debug("c4iw_dev %p\n", ctx->dev);
debugfs_remove_recursive(ctx->dev->debugfs_root);
c4iw_unregister_device(ctx->dev);
c4iw_dealloc(ctx);
}

@@ -1,6 +1,6 @@
/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
/*
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
* Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All rights reserved.
*/

#ifndef _EFA_H_
@@ -40,6 +40,7 @@ struct efa_sw_stats {
atomic64_t reg_mr_err;
atomic64_t alloc_ucontext_err;
atomic64_t create_ah_err;
atomic64_t mmap_err;
};

/* Don't use anything other than atomic64 */
@@ -153,8 +154,7 @@ int efa_mmap(struct ib_ucontext *ibucontext,
struct vm_area_struct *vma);
void efa_mmap_free(struct rdma_user_mmap_entry *rdma_entry);
int efa_create_ah(struct ib_ah *ibah,
struct rdma_ah_attr *ah_attr,
u32 flags,
struct rdma_ah_init_attr *init_attr,
struct ib_udata *udata);
void efa_destroy_ah(struct ib_ah *ibah, u32 flags);
int efa_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,

@ -37,7 +37,7 @@ enum efa_admin_aq_feature_id {
|
||||
EFA_ADMIN_NETWORK_ATTR = 3,
|
||||
EFA_ADMIN_QUEUE_ATTR = 4,
|
||||
EFA_ADMIN_HW_HINTS = 5,
|
||||
EFA_ADMIN_FEATURES_OPCODE_NUM = 8,
|
||||
EFA_ADMIN_HOST_INFO = 6,
|
||||
};
|
||||
|
||||
/* QP transport type */
@@ -799,6 +799,54 @@ struct efa_admin_mmio_req_read_less_resp {
	u32 reg_val;
};

enum efa_admin_os_type {
	EFA_ADMIN_OS_LINUX = 0,
};

struct efa_admin_host_info {
	/* OS distribution string format */
	u8 os_dist_str[128];

	/* Defined in enum efa_admin_os_type */
	u32 os_type;

	/* Kernel version string format */
	u8 kernel_ver_str[32];

	/* Kernel version numeric format */
	u32 kernel_ver;

	/*
	 * 7:0 : driver_module_type
	 * 15:8 : driver_sub_minor
	 * 23:16 : driver_minor
	 * 31:24 : driver_major
	 */
	u32 driver_ver;

	/*
	 * Device's Bus, Device and Function
	 * 2:0 : function
	 * 7:3 : device
	 * 15:8 : bus
	 */
	u16 bdf;

	/*
	 * Spec version
	 * 7:0 : spec_minor
	 * 15:8 : spec_major
	 */
	u16 spec_ver;

	/*
	 * 0 : intree - Intree driver
	 * 1 : gdr - GPUDirect RDMA supported
	 * 31:2 : reserved2
	 */
	u32 flags;
};

/* create_qp_cmd */
#define EFA_ADMIN_CREATE_QP_CMD_SQ_VIRT_MASK BIT(0)
#define EFA_ADMIN_CREATE_QP_CMD_RQ_VIRT_MASK BIT(1)
@@ -820,4 +868,17 @@ struct efa_admin_mmio_req_read_less_resp {
/* feature_device_attr_desc */
#define EFA_ADMIN_FEATURE_DEVICE_ATTR_DESC_RDMA_READ_MASK BIT(0)

/* host_info */
#define EFA_ADMIN_HOST_INFO_DRIVER_MODULE_TYPE_MASK GENMASK(7, 0)
#define EFA_ADMIN_HOST_INFO_DRIVER_SUB_MINOR_MASK GENMASK(15, 8)
#define EFA_ADMIN_HOST_INFO_DRIVER_MINOR_MASK GENMASK(23, 16)
#define EFA_ADMIN_HOST_INFO_DRIVER_MAJOR_MASK GENMASK(31, 24)
#define EFA_ADMIN_HOST_INFO_FUNCTION_MASK GENMASK(2, 0)
#define EFA_ADMIN_HOST_INFO_DEVICE_MASK GENMASK(7, 3)
#define EFA_ADMIN_HOST_INFO_BUS_MASK GENMASK(15, 8)
#define EFA_ADMIN_HOST_INFO_SPEC_MINOR_MASK GENMASK(7, 0)
#define EFA_ADMIN_HOST_INFO_SPEC_MAJOR_MASK GENMASK(15, 8)
#define EFA_ADMIN_HOST_INFO_INTREE_MASK BIT(0)
#define EFA_ADMIN_HOST_INFO_GDR_MASK BIT(1)

#endif /* _EFA_ADMIN_CMDS_H_ */
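The host_info masks above follow the standard GENMASK() layout, so a field can be packed with FIELD_PREP() from <linux/bitfield.h>; the EFA driver does this through its EFA_SET() helper. A minimal sketch with made-up values, not the driver's code:

#include <linux/bitfield.h>

/* Sketch only: pack a 1.2.3, module-type 0 driver version per the masks above. */
static u32 example_host_info_driver_ver(void)
{
	return FIELD_PREP(EFA_ADMIN_HOST_INFO_DRIVER_MAJOR_MASK, 1) |
	       FIELD_PREP(EFA_ADMIN_HOST_INFO_DRIVER_MINOR_MASK, 2) |
	       FIELD_PREP(EFA_ADMIN_HOST_INFO_DRIVER_SUB_MINOR_MASK, 3) |
	       FIELD_PREP(EFA_ADMIN_HOST_INFO_DRIVER_MODULE_TYPE_MASK, 0);
}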
|
||||
|
@@ -631,17 +631,20 @@ int efa_com_cmd_exec(struct efa_com_admin_queue *aq,
			cmd->aq_common_descriptor.opcode, PTR_ERR(comp_ctx));

		up(&aq->avail_cmds);
		atomic64_inc(&aq->stats.cmd_err);
		return PTR_ERR(comp_ctx);
	}

	err = efa_com_wait_and_process_admin_cq(comp_ctx, aq);
	if (err)
	if (err) {
		ibdev_err_ratelimited(
			aq->efa_dev,
			"Failed to process command %s (opcode %u) comp_status %d err %d\n",
			efa_com_cmd_str(cmd->aq_common_descriptor.opcode),
			cmd->aq_common_descriptor.opcode, comp_ctx->comp_status,
			err);
		atomic64_inc(&aq->stats.cmd_err);
	}

	up(&aq->avail_cmds);
|
||||
|
||||
|
@ -1,6 +1,6 @@
|
||||
/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
|
||||
/*
|
||||
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
|
||||
* Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All rights reserved.
|
||||
*/
|
||||
|
||||
#ifndef _EFA_COM_H_
|
||||
@ -47,6 +47,7 @@ struct efa_com_admin_sq {
|
||||
struct efa_com_stats_admin {
|
||||
atomic64_t submitted_cmd;
|
||||
atomic64_t completed_cmd;
|
||||
atomic64_t cmd_err;
|
||||
atomic64_t no_completion;
|
||||
};
|
||||
|
||||
|
@ -351,7 +351,7 @@ int efa_com_destroy_ah(struct efa_com_dev *edev,
|
||||
return 0;
|
||||
}
|
||||
|
||||
static bool
|
||||
bool
|
||||
efa_com_check_supported_feature_id(struct efa_com_dev *edev,
|
||||
enum efa_admin_aq_feature_id feature_id)
|
||||
{
|
||||
@ -388,7 +388,7 @@ static int efa_com_get_feature_ex(struct efa_com_dev *edev,
|
||||
|
||||
if (control_buff_size)
|
||||
EFA_SET(&get_cmd.aq_common_descriptor.flags,
|
||||
EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_INDIRECT, 1);
|
||||
EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA, 1);
|
||||
|
||||
efa_com_set_dma_addr(control_buf_dma_addr,
|
||||
&get_cmd.control_buffer.address.mem_addr_high,
|
||||
@ -517,12 +517,12 @@ int efa_com_get_hw_hints(struct efa_com_dev *edev,
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int efa_com_set_feature_ex(struct efa_com_dev *edev,
|
||||
struct efa_admin_set_feature_resp *set_resp,
|
||||
struct efa_admin_set_feature_cmd *set_cmd,
|
||||
enum efa_admin_aq_feature_id feature_id,
|
||||
dma_addr_t control_buf_dma_addr,
|
||||
u32 control_buff_size)
|
||||
int efa_com_set_feature_ex(struct efa_com_dev *edev,
|
||||
struct efa_admin_set_feature_resp *set_resp,
|
||||
struct efa_admin_set_feature_cmd *set_cmd,
|
||||
enum efa_admin_aq_feature_id feature_id,
|
||||
dma_addr_t control_buf_dma_addr,
|
||||
u32 control_buff_size)
|
||||
{
|
||||
struct efa_com_admin_queue *aq;
|
||||
int err;
|
||||
@ -540,7 +540,7 @@ static int efa_com_set_feature_ex(struct efa_com_dev *edev,
|
||||
if (control_buff_size) {
|
||||
set_cmd->aq_common_descriptor.flags = 0;
|
||||
EFA_SET(&set_cmd->aq_common_descriptor.flags,
|
||||
EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA_INDIRECT, 1);
|
||||
EFA_ADMIN_AQ_COMMON_DESC_CTRL_DATA, 1);
|
||||
efa_com_set_dma_addr(control_buf_dma_addr,
|
||||
&set_cmd->control_buffer.address.mem_addr_high,
|
||||
&set_cmd->control_buffer.address.mem_addr_low);
|
||||
|
@ -1,6 +1,6 @@
|
||||
/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */
|
||||
/*
|
||||
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
|
||||
* Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All rights reserved.
|
||||
*/
|
||||
|
||||
#ifndef _EFA_COM_CMD_H_
|
||||
@ -270,6 +270,15 @@ int efa_com_get_device_attr(struct efa_com_dev *edev,
|
||||
struct efa_com_get_device_attr_result *result);
|
||||
int efa_com_get_hw_hints(struct efa_com_dev *edev,
|
||||
struct efa_com_get_hw_hints_result *result);
|
||||
bool
|
||||
efa_com_check_supported_feature_id(struct efa_com_dev *edev,
|
||||
enum efa_admin_aq_feature_id feature_id);
|
||||
int efa_com_set_feature_ex(struct efa_com_dev *edev,
|
||||
struct efa_admin_set_feature_resp *set_resp,
|
||||
struct efa_admin_set_feature_cmd *set_cmd,
|
||||
enum efa_admin_aq_feature_id feature_id,
|
||||
dma_addr_t control_buf_dma_addr,
|
||||
u32 control_buff_size);
|
||||
int efa_com_set_aenq_config(struct efa_com_dev *edev, u32 groups);
|
||||
int efa_com_alloc_pd(struct efa_com_dev *edev,
|
||||
struct efa_com_alloc_pd_result *result);
|
||||
|
@ -1,10 +1,12 @@
|
||||
// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
|
||||
/*
|
||||
* Copyright 2018-2019 Amazon.com, Inc. or its affiliates. All rights reserved.
|
||||
* Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All rights reserved.
|
||||
*/
|
||||
|
||||
#include <linux/module.h>
|
||||
#include <linux/pci.h>
|
||||
#include <linux/utsname.h>
|
||||
#include <linux/version.h>
|
||||
|
||||
#include <rdma/ib_user_verbs.h>
|
||||
|
||||
@@ -187,6 +189,52 @@ static void efa_stats_init(struct efa_dev *dev)
		atomic64_set(s, 0);
}

static void efa_set_host_info(struct efa_dev *dev)
{
	struct efa_admin_set_feature_resp resp = {};
	struct efa_admin_set_feature_cmd cmd = {};
	struct efa_admin_host_info *hinf;
	u32 bufsz = sizeof(*hinf);
	dma_addr_t hinf_dma;

	if (!efa_com_check_supported_feature_id(&dev->edev,
						EFA_ADMIN_HOST_INFO))
		return;

	/* Failures in host info set shall not disturb probe */
	hinf = dma_alloc_coherent(&dev->pdev->dev, bufsz, &hinf_dma,
				  GFP_KERNEL);
	if (!hinf)
		return;

	strlcpy(hinf->os_dist_str, utsname()->release,
		min(sizeof(hinf->os_dist_str), sizeof(utsname()->release)));
	hinf->os_type = EFA_ADMIN_OS_LINUX;
	strlcpy(hinf->kernel_ver_str, utsname()->version,
		min(sizeof(hinf->kernel_ver_str), sizeof(utsname()->version)));
	hinf->kernel_ver = LINUX_VERSION_CODE;
	EFA_SET(&hinf->driver_ver, EFA_ADMIN_HOST_INFO_DRIVER_MAJOR, 0);
	EFA_SET(&hinf->driver_ver, EFA_ADMIN_HOST_INFO_DRIVER_MINOR, 0);
	EFA_SET(&hinf->driver_ver, EFA_ADMIN_HOST_INFO_DRIVER_SUB_MINOR, 0);
	EFA_SET(&hinf->driver_ver, EFA_ADMIN_HOST_INFO_DRIVER_MODULE_TYPE, 0);
	EFA_SET(&hinf->bdf, EFA_ADMIN_HOST_INFO_BUS, dev->pdev->bus->number);
	EFA_SET(&hinf->bdf, EFA_ADMIN_HOST_INFO_DEVICE,
		PCI_SLOT(dev->pdev->devfn));
	EFA_SET(&hinf->bdf, EFA_ADMIN_HOST_INFO_FUNCTION,
		PCI_FUNC(dev->pdev->devfn));
	EFA_SET(&hinf->spec_ver, EFA_ADMIN_HOST_INFO_SPEC_MAJOR,
		EFA_COMMON_SPEC_VERSION_MAJOR);
	EFA_SET(&hinf->spec_ver, EFA_ADMIN_HOST_INFO_SPEC_MINOR,
		EFA_COMMON_SPEC_VERSION_MINOR);
	EFA_SET(&hinf->flags, EFA_ADMIN_HOST_INFO_INTREE, 1);
	EFA_SET(&hinf->flags, EFA_ADMIN_HOST_INFO_GDR, 0);

	efa_com_set_feature_ex(&dev->edev, &resp, &cmd, EFA_ADMIN_HOST_INFO,
			       hinf_dma, bufsz);

	dma_free_coherent(&dev->pdev->dev, bufsz, hinf, hinf_dma);
}
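For reference, the bus/device/function triple written into hinf->bdf above comes straight from the PCI core: pdev->bus->number plus PCI_SLOT()/PCI_FUNC() on pdev->devfn. A small illustrative snippet, not part of the patch:

/* Sketch only: print the same B/D/F that efa_set_host_info() reports. */
static void example_print_bdf(struct pci_dev *pdev)
{
	pr_info("efa at %02x:%02x.%x\n", pdev->bus->number,
		PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
}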
|
||||
|
||||
static const struct ib_device_ops efa_dev_ops = {
|
||||
.owner = THIS_MODULE,
|
||||
.driver_id = RDMA_DRIVER_EFA,
|
||||
@ -251,6 +299,8 @@ static int efa_ib_device_add(struct efa_dev *dev)
|
||||
if (err)
|
||||
goto err_release_doorbell_bar;
|
||||
|
||||
efa_set_host_info(dev);
|
||||
|
||||
dev->ibdev.node_type = RDMA_NODE_UNSPECIFIED;
|
||||
dev->ibdev.phys_port_cnt = 1;
|
||||
dev->ibdev.num_comp_vectors = 1;
|
||||
|
@ -37,13 +37,16 @@ struct efa_user_mmap_entry {
|
||||
op(EFA_RX_DROPS, "rx_drops") \
|
||||
op(EFA_SUBMITTED_CMDS, "submitted_cmds") \
|
||||
op(EFA_COMPLETED_CMDS, "completed_cmds") \
|
||||
op(EFA_CMDS_ERR, "cmds_err") \
|
||||
op(EFA_NO_COMPLETION_CMDS, "no_completion_cmds") \
|
||||
op(EFA_KEEP_ALIVE_RCVD, "keep_alive_rcvd") \
|
||||
op(EFA_ALLOC_PD_ERR, "alloc_pd_err") \
|
||||
op(EFA_CREATE_QP_ERR, "create_qp_err") \
|
||||
op(EFA_CREATE_CQ_ERR, "create_cq_err") \
|
||||
op(EFA_REG_MR_ERR, "reg_mr_err") \
|
||||
op(EFA_ALLOC_UCONTEXT_ERR, "alloc_ucontext_err") \
|
||||
op(EFA_CREATE_AH_ERR, "create_ah_err")
|
||||
op(EFA_CREATE_AH_ERR, "create_ah_err") \
|
||||
op(EFA_MMAP_ERR, "mmap_err")
|
||||
|
||||
#define EFA_STATS_ENUM(ename, name) ename,
|
||||
#define EFA_STATS_STR(ename, name) [ename] = name,
|
||||
@ -1568,6 +1571,7 @@ static int __efa_mmap(struct efa_dev *dev, struct efa_ucontext *ucontext,
|
||||
ibdev_dbg(&dev->ibdev,
|
||||
"pgoff[%#lx] does not have valid entry\n",
|
||||
vma->vm_pgoff);
|
||||
atomic64_inc(&dev->stats.sw_stats.mmap_err);
|
||||
return -EINVAL;
|
||||
}
|
||||
entry = to_emmap(rdma_entry);
|
||||
@ -1603,12 +1607,14 @@ static int __efa_mmap(struct efa_dev *dev, struct efa_ucontext *ucontext,
|
||||
err = -EINVAL;
|
||||
}
|
||||
|
||||
if (err)
|
||||
if (err) {
|
||||
ibdev_dbg(
|
||||
&dev->ibdev,
|
||||
"Couldn't mmap address[%#llx] length[%#zx] mmap_flag[%d] err[%d]\n",
|
||||
entry->address, rdma_entry->npages * PAGE_SIZE,
|
||||
entry->mmap_flag, err);
|
||||
atomic64_inc(&dev->stats.sw_stats.mmap_err);
|
||||
}
|
||||
|
||||
rdma_user_mmap_entry_put(rdma_entry);
|
||||
return err;
|
||||
@ -1639,10 +1645,10 @@ static int efa_ah_destroy(struct efa_dev *dev, struct efa_ah *ah)
|
||||
}
|
||||
|
||||
int efa_create_ah(struct ib_ah *ibah,
|
||||
struct rdma_ah_attr *ah_attr,
|
||||
u32 flags,
|
||||
struct rdma_ah_init_attr *init_attr,
|
||||
struct ib_udata *udata)
|
||||
{
|
||||
struct rdma_ah_attr *ah_attr = init_attr->ah_attr;
|
||||
struct efa_dev *dev = to_edev(ibah->device);
|
||||
struct efa_com_create_ah_params params = {};
|
||||
struct efa_ibv_create_ah_resp resp = {};
|
||||
@ -1650,7 +1656,7 @@ int efa_create_ah(struct ib_ah *ibah,
|
||||
struct efa_ah *ah = to_eah(ibah);
|
||||
int err;
|
||||
|
||||
if (!(flags & RDMA_CREATE_AH_SLEEPABLE)) {
|
||||
if (!(init_attr->flags & RDMA_CREATE_AH_SLEEPABLE)) {
|
||||
ibdev_dbg(&dev->ibdev,
|
||||
"Create address handle is not supported in atomic context\n");
|
||||
err = -EOPNOTSUPP;
|
||||
@ -1747,15 +1753,18 @@ int efa_get_hw_stats(struct ib_device *ibdev, struct rdma_hw_stats *stats,
|
||||
as = &dev->edev.aq.stats;
|
||||
stats->value[EFA_SUBMITTED_CMDS] = atomic64_read(&as->submitted_cmd);
|
||||
stats->value[EFA_COMPLETED_CMDS] = atomic64_read(&as->completed_cmd);
|
||||
stats->value[EFA_CMDS_ERR] = atomic64_read(&as->cmd_err);
|
||||
stats->value[EFA_NO_COMPLETION_CMDS] = atomic64_read(&as->no_completion);
|
||||
|
||||
s = &dev->stats;
|
||||
stats->value[EFA_KEEP_ALIVE_RCVD] = atomic64_read(&s->keep_alive_rcvd);
|
||||
stats->value[EFA_ALLOC_PD_ERR] = atomic64_read(&s->sw_stats.alloc_pd_err);
|
||||
stats->value[EFA_CREATE_QP_ERR] = atomic64_read(&s->sw_stats.create_qp_err);
|
||||
stats->value[EFA_CREATE_CQ_ERR] = atomic64_read(&s->sw_stats.create_cq_err);
|
||||
stats->value[EFA_REG_MR_ERR] = atomic64_read(&s->sw_stats.reg_mr_err);
|
||||
stats->value[EFA_ALLOC_UCONTEXT_ERR] = atomic64_read(&s->sw_stats.alloc_ucontext_err);
|
||||
stats->value[EFA_CREATE_AH_ERR] = atomic64_read(&s->sw_stats.create_ah_err);
|
||||
stats->value[EFA_MMAP_ERR] = atomic64_read(&s->sw_stats.mmap_err);
|
||||
|
||||
return ARRAY_SIZE(efa_stats_names);
|
||||
}
|
||||
|
@@ -22,9 +22,13 @@ hfi1-y := \
	init.o \
	intr.o \
	iowait.o \
	ipoib_main.o \
	ipoib_rx.o \
	ipoib_tx.o \
	mad.o \
	mmu_rb.o \
	msix.o \
	netdev_rx.o \
	opfn.o \
	pcie.o \
	pio.o \
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright(c) 2015 - 2018 Intel Corporation.
|
||||
* Copyright(c) 2015 - 2020 Intel Corporation.
|
||||
*
|
||||
* This file is provided under a dual BSD/GPLv2 license. When using or
|
||||
* redistributing this file, you may do so under either license.
|
||||
@ -64,6 +64,7 @@ struct hfi1_affinity_node_list node_affinity = {
|
||||
static const char * const irq_type_names[] = {
|
||||
"SDMA",
|
||||
"RCVCTXT",
|
||||
"NETDEVCTXT",
|
||||
"GENERAL",
|
||||
"OTHER",
|
||||
};
|
||||
@ -915,6 +916,11 @@ static int get_irq_affinity(struct hfi1_devdata *dd,
|
||||
set = &entry->rcv_intr;
|
||||
scnprintf(extra, 64, "ctxt %u", rcd->ctxt);
|
||||
break;
|
||||
case IRQ_NETDEVCTXT:
|
||||
rcd = (struct hfi1_ctxtdata *)msix->arg;
|
||||
set = &entry->def_intr;
|
||||
scnprintf(extra, 64, "ctxt %u", rcd->ctxt);
|
||||
break;
|
||||
default:
|
||||
dd_dev_err(dd, "Invalid IRQ type %d\n", msix->type);
|
||||
return -EINVAL;
|
||||
@ -987,6 +993,10 @@ void hfi1_put_irq_affinity(struct hfi1_devdata *dd,
|
||||
if (rcd->ctxt != HFI1_CTRL_CTXT)
|
||||
set = &entry->rcv_intr;
|
||||
break;
|
||||
case IRQ_NETDEVCTXT:
|
||||
rcd = (struct hfi1_ctxtdata *)msix->arg;
|
||||
set = &entry->def_intr;
|
||||
break;
|
||||
default:
|
||||
mutex_unlock(&node_affinity.lock);
|
||||
return;
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright(c) 2015 - 2018 Intel Corporation.
|
||||
* Copyright(c) 2015 - 2020 Intel Corporation.
|
||||
*
|
||||
* This file is provided under a dual BSD/GPLv2 license. When using or
|
||||
* redistributing this file, you may do so under either license.
|
||||
@ -52,6 +52,7 @@
|
||||
enum irq_type {
|
||||
IRQ_SDMA,
|
||||
IRQ_RCVCTXT,
|
||||
IRQ_NETDEVCTXT,
|
||||
IRQ_GENERAL,
|
||||
IRQ_OTHER
|
||||
};
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright(c) 2015 - 2018 Intel Corporation.
|
||||
* Copyright(c) 2015 - 2020 Intel Corporation.
|
||||
*
|
||||
* This file is provided under a dual BSD/GPLv2 license. When using or
|
||||
* redistributing this file, you may do so under either license.
|
||||
@ -66,10 +66,7 @@
|
||||
#include "affinity.h"
|
||||
#include "debugfs.h"
|
||||
#include "fault.h"
|
||||
|
||||
uint kdeth_qp;
|
||||
module_param_named(kdeth_qp, kdeth_qp, uint, S_IRUGO);
|
||||
MODULE_PARM_DESC(kdeth_qp, "Set the KDETH queue pair prefix");
|
||||
#include "netdev.h"
|
||||
|
||||
uint num_vls = HFI1_MAX_VLS_SUPPORTED;
|
||||
module_param(num_vls, uint, S_IRUGO);
|
||||
@@ -128,13 +125,15 @@ struct flag_table {

/*
 * RSM instance allocation
 *   0 - Verbs
 *   1 - User Fecn Handling
 *   2 - Vnic
 *   0 - User Fecn Handling
 *   1 - Vnic
 *   2 - AIP
 *   3 - Verbs
 */
#define RSM_INS_VERBS 0
#define RSM_INS_FECN 1
#define RSM_INS_VNIC 2
#define RSM_INS_FECN 0
#define RSM_INS_VNIC 1
#define RSM_INS_AIP 2
#define RSM_INS_VERBS 3
|
||||
|
||||
/* Bit offset into the GUID which carries HFI id information */
|
||||
#define GUID_HFI_INDEX_SHIFT 39
|
||||
@ -175,6 +174,25 @@ struct flag_table {
|
||||
/* QPN[m+n:1] QW 1, OFFSET 1 */
|
||||
#define QPN_SELECT_OFFSET ((1ull << QW_SHIFT) | (1ull))
|
||||
|
||||
/* RSM fields for AIP */
|
||||
/* LRH.BTH above is reused for this rule */
|
||||
|
||||
/* BTH.DESTQP: QW 1, OFFSET 16 for match */
|
||||
#define BTH_DESTQP_QW 1ull
|
||||
#define BTH_DESTQP_BIT_OFFSET 16ull
|
||||
#define BTH_DESTQP_OFFSET(off) ((BTH_DESTQP_QW << QW_SHIFT) | (off))
|
||||
#define BTH_DESTQP_MATCH_OFFSET BTH_DESTQP_OFFSET(BTH_DESTQP_BIT_OFFSET)
|
||||
#define BTH_DESTQP_MASK 0xFFull
|
||||
#define BTH_DESTQP_VALUE 0x81ull
|
||||
|
||||
/* DETH.SQPN: QW 1 Offset 56 for select */
|
||||
/* We use the 8 most significant Source QPN bits as entropy for AIP */
|
||||
#define DETH_AIP_SQPN_QW 3ull
|
||||
#define DETH_AIP_SQPN_BIT_OFFSET 56ull
|
||||
#define DETH_AIP_SQPN_OFFSET(off) ((DETH_AIP_SQPN_QW << QW_SHIFT) | (off))
|
||||
#define DETH_AIP_SQPN_SELECT_OFFSET \
|
||||
DETH_AIP_SQPN_OFFSET(DETH_AIP_SQPN_BIT_OFFSET)
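These match/select offsets pack a receive-header quadword index above QW_SHIFT and a bit offset within that quadword in the low bits. A worked illustration only, restating the convention already visible in the macros above:

/*
 * Illustration: BTH.DestQP lives in quadword 1 at bit 16, so its match
 * offset is (1ull << QW_SHIFT) | 16ull, which is what
 * BTH_DESTQP_MATCH_OFFSET expands to; the AIP select offset is built the
 * same way from quadword 3, bit 56.
 */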
|
||||
|
||||
/* RSM fields for Vnic */
|
||||
/* L2_TYPE: QW 0, OFFSET 61 - for match */
|
||||
#define L2_TYPE_QW 0ull
|
||||
@ -8463,6 +8481,49 @@ static void hfi1_rcd_eoi_intr(struct hfi1_ctxtdata *rcd)
|
||||
local_irq_restore(flags);
|
||||
}
|
||||
|
||||
/**
 * hfi1_netdev_rx_napi - napi poll function to move eoi inline
 * @napi - pointer to napi object
 * @budget - netdev budget
 */
int hfi1_netdev_rx_napi(struct napi_struct *napi, int budget)
{
	struct hfi1_netdev_rxq *rxq = container_of(napi,
			struct hfi1_netdev_rxq, napi);
	struct hfi1_ctxtdata *rcd = rxq->rcd;
	int work_done = 0;

	work_done = rcd->do_interrupt(rcd, budget);

	if (work_done < budget) {
		napi_complete_done(napi, work_done);
		hfi1_rcd_eoi_intr(rcd);
	}

	return work_done;
}

/* Receive packet napi handler for netdevs VNIC and AIP */
irqreturn_t receive_context_interrupt_napi(int irq, void *data)
{
	struct hfi1_ctxtdata *rcd = data;

	receive_interrupt_common(rcd);

	if (likely(rcd->napi)) {
		if (likely(napi_schedule_prep(rcd->napi)))
			__napi_schedule_irqoff(rcd->napi);
		else
			__hfi1_rcd_eoi_intr(rcd);
	} else {
		WARN_ONCE(1, "Napi IRQ handler without napi set up ctxt=%d\n",
			  rcd->ctxt);
		__hfi1_rcd_eoi_intr(rcd);
	}

	return IRQ_HANDLED;
}
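A hedged sketch of the generic NAPI wiring this poll function relies on; the helper name is illustrative and this is not the driver's actual registration code:

/* Sketch only: a poll routine such as hfi1_netdev_rx_napi() is registered
 * once and then only scheduled from the interrupt handler; the core calls
 * it from softirq context with a budget.
 */
static void example_register_rx_napi(struct net_device *dev,
				     struct napi_struct *napi)
{
	netif_napi_add(dev, napi, hfi1_netdev_rx_napi, 64); /* 64: conventional weight */
	napi_enable(napi);
}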
|
||||
|
||||
/*
|
||||
* Receive packet IRQ handler. This routine expects to be on its own IRQ.
|
||||
* This routine will try to handle packets immediately (latency), but if
|
||||
@ -13330,13 +13391,12 @@ static int set_up_interrupts(struct hfi1_devdata *dd)
|
||||
* in array of contexts
|
||||
* freectxts - number of free user contexts
|
||||
* num_send_contexts - number of PIO send contexts being used
|
||||
* num_vnic_contexts - number of contexts reserved for VNIC
|
||||
* num_netdev_contexts - number of contexts reserved for netdev
|
||||
*/
|
||||
static int set_up_context_variables(struct hfi1_devdata *dd)
|
||||
{
|
||||
unsigned long num_kernel_contexts;
|
||||
u16 num_vnic_contexts = HFI1_NUM_VNIC_CTXT;
|
||||
int total_contexts;
|
||||
u16 num_netdev_contexts;
|
||||
int ret;
|
||||
unsigned ngroups;
|
||||
int rmt_count;
|
||||
@ -13373,13 +13433,6 @@ static int set_up_context_variables(struct hfi1_devdata *dd)
|
||||
num_kernel_contexts = send_contexts - num_vls - 1;
|
||||
}
|
||||
|
||||
/* Accommodate VNIC contexts if possible */
|
||||
if ((num_kernel_contexts + num_vnic_contexts) > rcv_contexts) {
|
||||
dd_dev_err(dd, "No receive contexts available for VNIC\n");
|
||||
num_vnic_contexts = 0;
|
||||
}
|
||||
total_contexts = num_kernel_contexts + num_vnic_contexts;
|
||||
|
||||
/*
|
||||
* User contexts:
|
||||
* - default to 1 user context per real (non-HT) CPU core if
|
||||
@ -13392,28 +13445,32 @@ static int set_up_context_variables(struct hfi1_devdata *dd)
|
||||
/*
|
||||
* Adjust the counts given a global max.
|
||||
*/
|
||||
if (total_contexts + n_usr_ctxts > rcv_contexts) {
|
||||
if (num_kernel_contexts + n_usr_ctxts > rcv_contexts) {
|
||||
dd_dev_err(dd,
|
||||
"Reducing # user receive contexts to: %d, from %u\n",
|
||||
rcv_contexts - total_contexts,
|
||||
"Reducing # user receive contexts to: %u, from %u\n",
|
||||
(u32)(rcv_contexts - num_kernel_contexts),
|
||||
n_usr_ctxts);
|
||||
/* recalculate */
|
||||
n_usr_ctxts = rcv_contexts - total_contexts;
|
||||
n_usr_ctxts = rcv_contexts - num_kernel_contexts;
|
||||
}
|
||||
|
||||
num_netdev_contexts =
|
||||
hfi1_num_netdev_contexts(dd, rcv_contexts -
|
||||
(num_kernel_contexts + n_usr_ctxts),
|
||||
&node_affinity.real_cpu_mask);
|
||||
	/*
	 * The RMT entries are currently allocated as shown below:
	 * 1. QOS (0 to 128 entries);
	 * 2. FECN (num_kernel_context - 1 + num_user_contexts +
	 *    num_vnic_contexts);
	 * 3. VNIC (num_vnic_contexts).
	 *    num_netdev_contexts);
	 * 3. netdev (num_netdev_contexts).
	 * It should be noted that FECN oversubscribe num_vnic_contexts
	 * entries of RMT because both VNIC and PSM could allocate any receive
	 * It should be noted that FECN oversubscribe num_netdev_contexts
	 * entries of RMT because both netdev and PSM could allocate any receive
	 * context between dd->first_dyn_alloc_text and dd->num_rcv_contexts,
	 * and PSM FECN must reserve an RMT entry for each possible PSM receive
	 * context.
	 */
	rmt_count = qos_rmt_entries(dd, NULL, NULL) + (num_vnic_contexts * 2);
	rmt_count = qos_rmt_entries(dd, NULL, NULL) + (num_netdev_contexts * 2);
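A worked example of the accounting described in the comment above, with made-up counts:

/*
 * Illustration only: with 16 QOS entries, 8 netdev contexts and TID RDMA
 * enabled on 16 kernel contexts, the code reserves
 *	rmt_count = 16 + (8 * 2) + (16 - 1) = 47
 * of the NUM_MAP_ENTRIES (256) RMT slots before user contexts are added.
 */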
|
||||
if (HFI1_CAP_IS_KSET(TID_RDMA))
|
||||
rmt_count += num_kernel_contexts - 1;
|
||||
if (rmt_count + n_usr_ctxts > NUM_MAP_ENTRIES) {
|
||||
@ -13426,21 +13483,20 @@ static int set_up_context_variables(struct hfi1_devdata *dd)
|
||||
n_usr_ctxts = user_rmt_reduced;
|
||||
}
|
||||
|
||||
total_contexts += n_usr_ctxts;
|
||||
|
||||
/* the first N are kernel contexts, the rest are user/vnic contexts */
|
||||
dd->num_rcv_contexts = total_contexts;
|
||||
/* the first N are kernel contexts, the rest are user/netdev contexts */
|
||||
dd->num_rcv_contexts =
|
||||
num_kernel_contexts + n_usr_ctxts + num_netdev_contexts;
|
||||
dd->n_krcv_queues = num_kernel_contexts;
|
||||
dd->first_dyn_alloc_ctxt = num_kernel_contexts;
|
||||
dd->num_vnic_contexts = num_vnic_contexts;
|
||||
dd->num_netdev_contexts = num_netdev_contexts;
|
||||
dd->num_user_contexts = n_usr_ctxts;
|
||||
dd->freectxts = n_usr_ctxts;
|
||||
dd_dev_info(dd,
|
||||
"rcv contexts: chip %d, used %d (kernel %d, vnic %u, user %u)\n",
|
||||
"rcv contexts: chip %d, used %d (kernel %d, netdev %u, user %u)\n",
|
||||
rcv_contexts,
|
||||
(int)dd->num_rcv_contexts,
|
||||
(int)dd->n_krcv_queues,
|
||||
dd->num_vnic_contexts,
|
||||
dd->num_netdev_contexts,
|
||||
dd->num_user_contexts);
|
||||
|
||||
/*
|
||||
@ -14119,21 +14175,12 @@ static void init_early_variables(struct hfi1_devdata *dd)
|
||||
|
||||
static void init_kdeth_qp(struct hfi1_devdata *dd)
|
||||
{
|
||||
/* user changed the KDETH_QP */
|
||||
if (kdeth_qp != 0 && kdeth_qp >= 0xff) {
|
||||
/* out of range or illegal value */
|
||||
dd_dev_err(dd, "Invalid KDETH queue pair prefix, ignoring");
|
||||
kdeth_qp = 0;
|
||||
}
|
||||
if (kdeth_qp == 0) /* not set, or failed range check */
|
||||
kdeth_qp = DEFAULT_KDETH_QP;
|
||||
|
||||
write_csr(dd, SEND_BTH_QP,
|
||||
(kdeth_qp & SEND_BTH_QP_KDETH_QP_MASK) <<
|
||||
(RVT_KDETH_QP_PREFIX & SEND_BTH_QP_KDETH_QP_MASK) <<
|
||||
SEND_BTH_QP_KDETH_QP_SHIFT);
|
||||
|
||||
write_csr(dd, RCV_BTH_QP,
|
||||
(kdeth_qp & RCV_BTH_QP_KDETH_QP_MASK) <<
|
||||
(RVT_KDETH_QP_PREFIX & RCV_BTH_QP_KDETH_QP_MASK) <<
|
||||
RCV_BTH_QP_KDETH_QP_SHIFT);
|
||||
}
|
||||
|
||||
@ -14249,6 +14296,12 @@ static void complete_rsm_map_table(struct hfi1_devdata *dd,
|
||||
}
|
||||
}
|
||||
|
||||
/* Is a receive side mapping rule */
|
||||
static bool has_rsm_rule(struct hfi1_devdata *dd, u8 rule_index)
|
||||
{
|
||||
return read_csr(dd, RCV_RSM_CFG + (8 * rule_index)) != 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Add a receive side mapping rule.
|
||||
*/
|
||||
@ -14485,77 +14538,138 @@ static void init_fecn_handling(struct hfi1_devdata *dd,
|
||||
rmt->used += total_cnt;
|
||||
}
|
||||
|
||||
/* Initialize RSM for VNIC */
|
||||
void hfi1_init_vnic_rsm(struct hfi1_devdata *dd)
|
||||
static inline bool hfi1_is_rmt_full(int start, int spare)
|
||||
{
|
||||
return (start + spare) > NUM_MAP_ENTRIES;
|
||||
}
|
||||
|
||||
static bool hfi1_netdev_update_rmt(struct hfi1_devdata *dd)
|
||||
{
|
||||
u8 i, j;
|
||||
u8 ctx_id = 0;
|
||||
u64 reg;
|
||||
u32 regoff;
|
||||
struct rsm_rule_data rrd;
|
||||
int rmt_start = hfi1_netdev_get_free_rmt_idx(dd);
|
||||
int ctxt_count = hfi1_netdev_ctxt_count(dd);
|
||||
|
||||
if (hfi1_vnic_is_rsm_full(dd, NUM_VNIC_MAP_ENTRIES)) {
|
||||
dd_dev_err(dd, "Vnic RSM disabled, rmt entries used = %d\n",
|
||||
dd->vnic.rmt_start);
|
||||
return;
|
||||
/* We already have contexts mapped in RMT */
|
||||
if (has_rsm_rule(dd, RSM_INS_VNIC) || has_rsm_rule(dd, RSM_INS_AIP)) {
|
||||
dd_dev_info(dd, "Contexts are already mapped in RMT\n");
|
||||
return true;
|
||||
}
|
||||
|
||||
dev_dbg(&(dd)->pcidev->dev, "Vnic rsm start = %d, end %d\n",
|
||||
dd->vnic.rmt_start,
|
||||
dd->vnic.rmt_start + NUM_VNIC_MAP_ENTRIES);
|
||||
if (hfi1_is_rmt_full(rmt_start, NUM_NETDEV_MAP_ENTRIES)) {
|
||||
dd_dev_err(dd, "Not enough RMT entries used = %d\n",
|
||||
rmt_start);
|
||||
return false;
|
||||
}
|
||||
|
||||
dev_dbg(&(dd)->pcidev->dev, "RMT start = %d, end %d\n",
|
||||
rmt_start,
|
||||
rmt_start + NUM_NETDEV_MAP_ENTRIES);
|
||||
|
||||
/* Update RSM mapping table, 32 regs, 256 entries - 1 ctx per byte */
|
||||
regoff = RCV_RSM_MAP_TABLE + (dd->vnic.rmt_start / 8) * 8;
|
||||
regoff = RCV_RSM_MAP_TABLE + (rmt_start / 8) * 8;
|
||||
reg = read_csr(dd, regoff);
|
||||
for (i = 0; i < NUM_VNIC_MAP_ENTRIES; i++) {
|
||||
/* Update map register with vnic context */
|
||||
j = (dd->vnic.rmt_start + i) % 8;
|
||||
for (i = 0; i < NUM_NETDEV_MAP_ENTRIES; i++) {
|
||||
/* Update map register with netdev context */
|
||||
j = (rmt_start + i) % 8;
|
||||
reg &= ~(0xffllu << (j * 8));
|
||||
reg |= (u64)dd->vnic.ctxt[ctx_id++]->ctxt << (j * 8);
|
||||
/* Wrap up vnic ctx index */
|
||||
ctx_id %= dd->vnic.num_ctxt;
|
||||
reg |= (u64)hfi1_netdev_get_ctxt(dd, ctx_id++)->ctxt << (j * 8);
|
||||
/* Wrap up netdev ctx index */
|
||||
ctx_id %= ctxt_count;
|
||||
/* Write back map register */
|
||||
if (j == 7 || ((i + 1) == NUM_VNIC_MAP_ENTRIES)) {
|
||||
if (j == 7 || ((i + 1) == NUM_NETDEV_MAP_ENTRIES)) {
|
||||
dev_dbg(&(dd)->pcidev->dev,
|
||||
"Vnic rsm map reg[%d] =0x%llx\n",
|
||||
"RMT[%d] =0x%llx\n",
|
||||
regoff - RCV_RSM_MAP_TABLE, reg);
|
||||
|
||||
write_csr(dd, regoff, reg);
|
||||
regoff += 8;
|
||||
if (i < (NUM_VNIC_MAP_ENTRIES - 1))
|
||||
if (i < (NUM_NETDEV_MAP_ENTRIES - 1))
|
||||
reg = read_csr(dd, regoff);
|
||||
}
|
||||
}
|
||||
|
||||
/* Add rule for vnic */
|
||||
rrd.offset = dd->vnic.rmt_start;
|
||||
rrd.pkt_type = 4;
|
||||
/* Match 16B packets */
|
||||
rrd.field1_off = L2_TYPE_MATCH_OFFSET;
|
||||
rrd.mask1 = L2_TYPE_MASK;
|
||||
rrd.value1 = L2_16B_VALUE;
|
||||
/* Match ETH L4 packets */
|
||||
rrd.field2_off = L4_TYPE_MATCH_OFFSET;
|
||||
rrd.mask2 = L4_16B_TYPE_MASK;
|
||||
rrd.value2 = L4_16B_ETH_VALUE;
|
||||
/* Calc context from veswid and entropy */
|
||||
rrd.index1_off = L4_16B_HDR_VESWID_OFFSET;
|
||||
rrd.index1_width = ilog2(NUM_VNIC_MAP_ENTRIES);
|
||||
rrd.index2_off = L2_16B_ENTROPY_OFFSET;
|
||||
rrd.index2_width = ilog2(NUM_VNIC_MAP_ENTRIES);
|
||||
add_rsm_rule(dd, RSM_INS_VNIC, &rrd);
|
||||
return true;
|
||||
}
|
||||
|
||||
/* Enable RSM if not already enabled */
|
||||
static void hfi1_enable_rsm_rule(struct hfi1_devdata *dd,
|
||||
int rule, struct rsm_rule_data *rrd)
|
||||
{
|
||||
if (!hfi1_netdev_update_rmt(dd)) {
|
||||
dd_dev_err(dd, "Failed to update RMT for RSM%d rule\n", rule);
|
||||
return;
|
||||
}
|
||||
|
||||
add_rsm_rule(dd, rule, rrd);
|
||||
add_rcvctrl(dd, RCV_CTRL_RCV_RSM_ENABLE_SMASK);
|
||||
}
|
||||
|
||||
void hfi1_init_aip_rsm(struct hfi1_devdata *dd)
|
||||
{
|
||||
/*
|
||||
* go through with the initialisation only if this rule actually doesn't
|
||||
* exist yet
|
||||
*/
|
||||
if (atomic_fetch_inc(&dd->ipoib_rsm_usr_num) == 0) {
|
||||
int rmt_start = hfi1_netdev_get_free_rmt_idx(dd);
|
||||
struct rsm_rule_data rrd = {
|
||||
.offset = rmt_start,
|
||||
.pkt_type = IB_PACKET_TYPE,
|
||||
.field1_off = LRH_BTH_MATCH_OFFSET,
|
||||
.mask1 = LRH_BTH_MASK,
|
||||
.value1 = LRH_BTH_VALUE,
|
||||
.field2_off = BTH_DESTQP_MATCH_OFFSET,
|
||||
.mask2 = BTH_DESTQP_MASK,
|
||||
.value2 = BTH_DESTQP_VALUE,
|
||||
.index1_off = DETH_AIP_SQPN_SELECT_OFFSET +
|
||||
ilog2(NUM_NETDEV_MAP_ENTRIES),
|
||||
.index1_width = ilog2(NUM_NETDEV_MAP_ENTRIES),
|
||||
.index2_off = DETH_AIP_SQPN_SELECT_OFFSET,
|
||||
.index2_width = ilog2(NUM_NETDEV_MAP_ENTRIES)
|
||||
};
|
||||
|
||||
hfi1_enable_rsm_rule(dd, RSM_INS_AIP, &rrd);
|
||||
}
|
||||
}
|
||||
|
||||
/* Initialize RSM for VNIC */
|
||||
void hfi1_init_vnic_rsm(struct hfi1_devdata *dd)
|
||||
{
|
||||
int rmt_start = hfi1_netdev_get_free_rmt_idx(dd);
|
||||
struct rsm_rule_data rrd = {
|
||||
/* Add rule for vnic */
|
||||
.offset = rmt_start,
|
||||
.pkt_type = 4,
|
||||
/* Match 16B packets */
|
||||
.field1_off = L2_TYPE_MATCH_OFFSET,
|
||||
.mask1 = L2_TYPE_MASK,
|
||||
.value1 = L2_16B_VALUE,
|
||||
/* Match ETH L4 packets */
|
||||
.field2_off = L4_TYPE_MATCH_OFFSET,
|
||||
.mask2 = L4_16B_TYPE_MASK,
|
||||
.value2 = L4_16B_ETH_VALUE,
|
||||
/* Calc context from veswid and entropy */
|
||||
.index1_off = L4_16B_HDR_VESWID_OFFSET,
|
||||
.index1_width = ilog2(NUM_NETDEV_MAP_ENTRIES),
|
||||
.index2_off = L2_16B_ENTROPY_OFFSET,
|
||||
.index2_width = ilog2(NUM_NETDEV_MAP_ENTRIES)
|
||||
};
|
||||
|
||||
hfi1_enable_rsm_rule(dd, RSM_INS_VNIC, &rrd);
|
||||
}
|
||||
|
||||
void hfi1_deinit_vnic_rsm(struct hfi1_devdata *dd)
|
||||
{
|
||||
clear_rsm_rule(dd, RSM_INS_VNIC);
|
||||
}
|
||||
|
||||
/* Disable RSM if used only by vnic */
|
||||
if (dd->vnic.rmt_start == 0)
|
||||
clear_rcvctrl(dd, RCV_CTRL_RCV_RSM_ENABLE_SMASK);
|
||||
void hfi1_deinit_aip_rsm(struct hfi1_devdata *dd)
|
||||
{
|
||||
/* only actually clear the rule if it's the last user asking to do so */
|
||||
if (atomic_fetch_add_unless(&dd->ipoib_rsm_usr_num, -1, 0) == 1)
|
||||
clear_rsm_rule(dd, RSM_INS_AIP);
|
||||
}
|
||||
|
||||
static int init_rxe(struct hfi1_devdata *dd)
|
||||
@ -14574,8 +14688,8 @@ static int init_rxe(struct hfi1_devdata *dd)
|
||||
init_qos(dd, rmt);
|
||||
init_fecn_handling(dd, rmt);
|
||||
complete_rsm_map_table(dd, rmt);
|
||||
/* record number of used rsm map entries for vnic */
|
||||
dd->vnic.rmt_start = rmt->used;
|
||||
/* record number of used rsm map entries for netdev */
|
||||
hfi1_netdev_set_free_rmt_idx(dd, rmt->used);
|
||||
kfree(rmt);
|
||||
|
||||
/*
|
||||
@ -15129,6 +15243,10 @@ int hfi1_init_dd(struct hfi1_devdata *dd)
|
||||
(dd->revision >> CCE_REVISION_SW_SHIFT)
|
||||
& CCE_REVISION_SW_MASK);
|
||||
|
||||
/* alloc netdev data */
|
||||
if (hfi1_netdev_alloc(dd))
|
||||
goto bail_cleanup;
|
||||
|
||||
ret = set_up_context_variables(dd);
|
||||
if (ret)
|
||||
goto bail_cleanup;
|
||||
@ -15229,6 +15347,7 @@ bail_clear_intr:
|
||||
hfi1_comp_vectors_clean_up(dd);
|
||||
msix_clean_up_interrupts(dd);
|
||||
bail_cleanup:
|
||||
hfi1_netdev_free(dd);
|
||||
hfi1_pcie_ddcleanup(dd);
|
||||
bail_free:
|
||||
hfi1_free_devdata(dd);
|
||||
|
@ -1,7 +1,7 @@
|
||||
#ifndef _CHIP_H
|
||||
#define _CHIP_H
|
||||
/*
|
||||
* Copyright(c) 2015 - 2018 Intel Corporation.
|
||||
* Copyright(c) 2015 - 2020 Intel Corporation.
|
||||
*
|
||||
* This file is provided under a dual BSD/GPLv2 license. When using or
|
||||
* redistributing this file, you may do so under either license.
|
||||
@ -1447,6 +1447,7 @@ irqreturn_t general_interrupt(int irq, void *data);
|
||||
irqreturn_t sdma_interrupt(int irq, void *data);
|
||||
irqreturn_t receive_context_interrupt(int irq, void *data);
|
||||
irqreturn_t receive_context_thread(int irq, void *data);
|
||||
irqreturn_t receive_context_interrupt_napi(int irq, void *data);
|
||||
|
||||
int set_intr_bits(struct hfi1_devdata *dd, u16 first, u16 last, bool set);
|
||||
void init_qsfp_int(struct hfi1_devdata *dd);
|
||||
@ -1455,6 +1456,8 @@ void remap_intr(struct hfi1_devdata *dd, int isrc, int msix_intr);
|
||||
void remap_sdma_interrupts(struct hfi1_devdata *dd, int engine, int msix_intr);
|
||||
void reset_interrupts(struct hfi1_devdata *dd);
|
||||
u8 hfi1_get_qp_map(struct hfi1_devdata *dd, u8 idx);
|
||||
void hfi1_init_aip_rsm(struct hfi1_devdata *dd);
|
||||
void hfi1_deinit_aip_rsm(struct hfi1_devdata *dd);
|
||||
|
||||
/*
|
||||
* Interrupt source table.
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright(c) 2015 - 2018 Intel Corporation.
|
||||
* Copyright(c) 2015 - 2020 Intel Corporation.
|
||||
*
|
||||
* This file is provided under a dual BSD/GPLv2 license. When using or
|
||||
* redistributing this file, you may do so under either license.
|
||||
@ -72,13 +72,6 @@
|
||||
* compilation unit
|
||||
*/
|
||||
|
||||
/*
|
||||
* If a packet's QP[23:16] bits match this value, then it is
|
||||
* a PSM packet and the hardware will expect a KDETH header
|
||||
* following the BTH.
|
||||
*/
|
||||
#define DEFAULT_KDETH_QP 0x80
|
||||
|
||||
/* driver/hw feature set bitmask */
|
||||
#define HFI1_CAP_USER_SHIFT 24
|
||||
#define HFI1_CAP_MASK ((1UL << HFI1_CAP_USER_SHIFT) - 1)
|
||||
@ -149,7 +142,8 @@
|
||||
HFI1_CAP_NO_INTEGRITY | \
|
||||
HFI1_CAP_PKEY_CHECK | \
|
||||
HFI1_CAP_TID_RDMA | \
|
||||
HFI1_CAP_OPFN) << \
|
||||
HFI1_CAP_OPFN | \
|
||||
HFI1_CAP_AIP) << \
|
||||
HFI1_CAP_USER_SHIFT)
|
||||
/*
|
||||
* Set of capabilities that need to be enabled for kernel context in
|
||||
@ -166,6 +160,7 @@
|
||||
HFI1_CAP_PKEY_CHECK | \
|
||||
HFI1_CAP_MULTI_PKT_EGR | \
|
||||
HFI1_CAP_EXTENDED_PSN | \
|
||||
HFI1_CAP_AIP | \
|
||||
((HFI1_CAP_HDRSUPP | \
|
||||
HFI1_CAP_MULTI_PKT_EGR | \
|
||||
HFI1_CAP_STATIC_RATE_CTRL | \
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright(c) 2015-2018 Intel Corporation.
|
||||
* Copyright(c) 2015-2020 Intel Corporation.
|
||||
*
|
||||
* This file is provided under a dual BSD/GPLv2 license. When using or
|
||||
* redistributing this file, you may do so under either license.
|
||||
@ -54,6 +54,7 @@
|
||||
#include <linux/module.h>
|
||||
#include <linux/prefetch.h>
|
||||
#include <rdma/ib_verbs.h>
|
||||
#include <linux/etherdevice.h>
|
||||
|
||||
#include "hfi.h"
|
||||
#include "trace.h"
|
||||
@ -63,6 +64,9 @@
|
||||
#include "vnic.h"
|
||||
#include "fault.h"
|
||||
|
||||
#include "ipoib.h"
|
||||
#include "netdev.h"
|
||||
|
||||
#undef pr_fmt
|
||||
#define pr_fmt(fmt) DRIVER_NAME ": " fmt
|
||||
|
||||
@ -748,6 +752,39 @@ static noinline int skip_rcv_packet(struct hfi1_packet *packet, int thread)
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void process_rcv_packet_napi(struct hfi1_packet *packet)
|
||||
{
|
||||
packet->etype = rhf_rcv_type(packet->rhf);
|
||||
|
||||
/* total length */
|
||||
packet->tlen = rhf_pkt_len(packet->rhf); /* in bytes */
|
||||
/* retrieve eager buffer details */
|
||||
packet->etail = rhf_egr_index(packet->rhf);
|
||||
packet->ebuf = get_egrbuf(packet->rcd, packet->rhf,
|
||||
&packet->updegr);
|
||||
/*
|
||||
* Prefetch the contents of the eager buffer. It is
|
||||
* OK to send a negative length to prefetch_range().
|
||||
* The +2 is the size of the RHF.
|
||||
*/
|
||||
prefetch_range(packet->ebuf,
|
||||
packet->tlen - ((packet->rcd->rcvhdrqentsize -
|
||||
(rhf_hdrq_offset(packet->rhf)
|
||||
+ 2)) * 4));
|
||||
|
||||
packet->rcd->rhf_rcv_function_map[packet->etype](packet);
|
||||
packet->numpkt++;
|
||||
|
||||
/* Set up for the next packet */
|
||||
packet->rhqoff += packet->rsize;
|
||||
if (packet->rhqoff >= packet->maxcnt)
|
||||
packet->rhqoff = 0;
|
||||
|
||||
packet->rhf_addr = (__le32 *)packet->rcd->rcvhdrq + packet->rhqoff +
|
||||
packet->rcd->rhf_offset;
|
||||
packet->rhf = rhf_to_cpu(packet->rhf_addr);
|
||||
}
|
||||
|
||||
static inline int process_rcv_packet(struct hfi1_packet *packet, int thread)
|
||||
{
|
||||
int ret;
|
||||
@ -826,6 +863,36 @@ static inline void finish_packet(struct hfi1_packet *packet)
|
||||
packet->etail, rcv_intr_dynamic, packet->numpkt);
|
||||
}
|
||||
|
||||
/*
|
||||
* handle_receive_interrupt_napi_fp - receive a packet
|
||||
* @rcd: the context
|
||||
* @budget: polling budget
|
||||
*
|
||||
* Called from interrupt handler for receive interrupt.
|
||||
* This is the fast path interrupt handler
|
||||
* when executing napi soft irq environment.
|
||||
*/
|
||||
int handle_receive_interrupt_napi_fp(struct hfi1_ctxtdata *rcd, int budget)
|
||||
{
|
||||
struct hfi1_packet packet;
|
||||
|
||||
init_packet(rcd, &packet);
|
||||
if (last_rcv_seq(rcd, rhf_rcv_seq(packet.rhf)))
|
||||
goto bail;
|
||||
|
||||
while (packet.numpkt < budget) {
|
||||
process_rcv_packet_napi(&packet);
|
||||
if (hfi1_seq_incr(rcd, rhf_rcv_seq(packet.rhf)))
|
||||
break;
|
||||
|
||||
process_rcv_update(0, &packet);
|
||||
}
|
||||
hfi1_set_rcd_head(rcd, packet.rhqoff);
|
||||
bail:
|
||||
finish_packet(&packet);
|
||||
return packet.numpkt;
|
||||
}
|
||||
|
||||
/*
|
||||
* Handle receive interrupts when using the no dma rtail option.
|
||||
*/
|
||||
@ -1073,6 +1140,63 @@ bail:
|
||||
return last;
|
||||
}
|
||||
|
||||
/*
|
||||
* handle_receive_interrupt_napi_sp - receive a packet
|
||||
* @rcd: the context
|
||||
* @budget: polling budget
|
||||
*
|
||||
* Called from interrupt handler for errors or receive interrupt.
|
||||
* This is the slow path interrupt handler
|
||||
* when executing napi soft irq environment.
|
||||
*/
|
||||
int handle_receive_interrupt_napi_sp(struct hfi1_ctxtdata *rcd, int budget)
|
||||
{
|
||||
struct hfi1_devdata *dd = rcd->dd;
|
||||
int last = RCV_PKT_OK;
|
||||
bool needset = true;
|
||||
struct hfi1_packet packet;
|
||||
|
||||
init_packet(rcd, &packet);
|
||||
if (last_rcv_seq(rcd, rhf_rcv_seq(packet.rhf)))
|
||||
goto bail;
|
||||
|
||||
while (last != RCV_PKT_DONE && packet.numpkt < budget) {
|
||||
if (hfi1_need_drop(dd)) {
|
||||
/* On to the next packet */
|
||||
packet.rhqoff += packet.rsize;
|
||||
packet.rhf_addr = (__le32 *)rcd->rcvhdrq +
|
||||
packet.rhqoff +
|
||||
rcd->rhf_offset;
|
||||
packet.rhf = rhf_to_cpu(packet.rhf_addr);
|
||||
|
||||
} else {
|
||||
if (set_armed_to_active(&packet))
|
||||
goto bail;
|
||||
process_rcv_packet_napi(&packet);
|
||||
}
|
||||
|
||||
if (hfi1_seq_incr(rcd, rhf_rcv_seq(packet.rhf)))
|
||||
last = RCV_PKT_DONE;
|
||||
|
||||
if (needset) {
|
||||
needset = false;
|
||||
set_all_fastpath(dd, rcd);
|
||||
}
|
||||
|
||||
process_rcv_update(last, &packet);
|
||||
}
|
||||
|
||||
hfi1_set_rcd_head(rcd, packet.rhqoff);
|
||||
|
||||
bail:
|
||||
/*
|
||||
* Always write head at end, and setup rcv interrupt, even
|
||||
* if no packets were processed.
|
||||
*/
|
||||
finish_packet(&packet);
|
||||
return packet.numpkt;
|
||||
}
|
||||
|
||||
/*
|
||||
* We may discover in the interrupt that the hardware link state has
|
||||
* changed from ARMED to ACTIVE (due to the arrival of a non-SC15 packet),
|
||||
@ -1550,6 +1674,82 @@ void handle_eflags(struct hfi1_packet *packet)
|
||||
show_eflags_errs(packet);
|
||||
}
|
||||
|
||||
static void hfi1_ipoib_ib_rcv(struct hfi1_packet *packet)
|
||||
{
|
||||
struct hfi1_ibport *ibp;
|
||||
struct net_device *netdev;
|
||||
struct hfi1_ctxtdata *rcd = packet->rcd;
|
||||
struct napi_struct *napi = rcd->napi;
|
||||
struct sk_buff *skb;
|
||||
struct hfi1_netdev_rxq *rxq = container_of(napi,
|
||||
struct hfi1_netdev_rxq, napi);
|
||||
u32 extra_bytes;
|
||||
u32 tlen, qpnum;
|
||||
bool do_work, do_cnp;
|
||||
struct hfi1_ipoib_dev_priv *priv;
|
||||
|
||||
trace_hfi1_rcvhdr(packet);
|
||||
|
||||
hfi1_setup_ib_header(packet);
|
||||
|
||||
packet->ohdr = &((struct ib_header *)packet->hdr)->u.oth;
|
||||
packet->grh = NULL;
|
||||
|
||||
if (unlikely(rhf_err_flags(packet->rhf))) {
|
||||
handle_eflags(packet);
|
||||
return;
|
||||
}
|
||||
|
||||
qpnum = ib_bth_get_qpn(packet->ohdr);
|
||||
netdev = hfi1_netdev_get_data(rcd->dd, qpnum);
|
||||
if (!netdev)
|
||||
goto drop_no_nd;
|
||||
|
||||
trace_input_ibhdr(rcd->dd, packet, !!(rhf_dc_info(packet->rhf)));
|
||||
trace_ctxt_rsm_hist(rcd->ctxt);
|
||||
|
||||
/* handle congestion notifications */
|
||||
do_work = hfi1_may_ecn(packet);
|
||||
if (unlikely(do_work)) {
|
||||
do_cnp = (packet->opcode != IB_OPCODE_CNP);
|
||||
(void)hfi1_process_ecn_slowpath(hfi1_ipoib_priv(netdev)->qp,
|
||||
packet, do_cnp);
|
||||
}
|
||||
|
||||
/*
|
||||
* We have split point after last byte of DETH
|
||||
* lets strip padding and CRC and ICRC.
|
||||
* tlen is whole packet len so we need to
|
||||
* subtract header size as well.
|
||||
*/
|
||||
tlen = packet->tlen;
|
||||
extra_bytes = ib_bth_get_pad(packet->ohdr) + (SIZE_OF_CRC << 2) +
|
||||
packet->hlen;
|
||||
if (unlikely(tlen < extra_bytes))
|
||||
goto drop;
|
||||
|
||||
tlen -= extra_bytes;
|
||||
|
||||
skb = hfi1_ipoib_prepare_skb(rxq, tlen, packet->ebuf);
|
||||
if (unlikely(!skb))
|
||||
goto drop;
|
||||
|
||||
priv = hfi1_ipoib_priv(netdev);
|
||||
hfi1_ipoib_update_rx_netstats(priv, 1, skb->len);
|
||||
|
||||
skb->dev = netdev;
|
||||
skb->pkt_type = PACKET_HOST;
|
||||
netif_receive_skb(skb);
|
||||
|
||||
return;
|
||||
|
||||
drop:
|
||||
++netdev->stats.rx_dropped;
|
||||
drop_no_nd:
|
||||
ibp = rcd_to_iport(packet->rcd);
|
||||
++ibp->rvp.n_pkt_drops;
|
||||
}
|
||||
|
||||
/*
|
||||
* The following functions are called by the interrupt handler. They are type
|
||||
* specific handlers for each packet type.
|
||||
@ -1572,28 +1772,10 @@ static void process_receive_ib(struct hfi1_packet *packet)
|
||||
hfi1_ib_rcv(packet);
|
||||
}
|
||||
|
||||
static inline bool hfi1_is_vnic_packet(struct hfi1_packet *packet)
|
||||
{
|
||||
/* Packet received in VNIC context via RSM */
|
||||
if (packet->rcd->is_vnic)
|
||||
return true;
|
||||
|
||||
if ((hfi1_16B_get_l2(packet->ebuf) == OPA_16B_L2_TYPE) &&
|
||||
(hfi1_16B_get_l4(packet->ebuf) == OPA_16B_L4_ETHR))
|
||||
return true;
|
||||
|
||||
return false;
|
||||
}
|
||||
|
||||
static void process_receive_bypass(struct hfi1_packet *packet)
|
||||
{
|
||||
struct hfi1_devdata *dd = packet->rcd->dd;
|
||||
|
||||
if (hfi1_is_vnic_packet(packet)) {
|
||||
hfi1_vnic_bypass_rcv(packet);
|
||||
return;
|
||||
}
|
||||
|
||||
if (hfi1_setup_bypass_packet(packet))
|
||||
return;
|
||||
|
||||
@ -1757,3 +1939,14 @@ const rhf_rcv_function_ptr normal_rhf_rcv_functions[] = {
|
||||
[RHF_RCV_TYPE_INVALID6] = process_receive_invalid,
|
||||
[RHF_RCV_TYPE_INVALID7] = process_receive_invalid,
|
||||
};
|
||||
|
||||
const rhf_rcv_function_ptr netdev_rhf_rcv_functions[] = {
|
||||
[RHF_RCV_TYPE_EXPECTED] = process_receive_invalid,
|
||||
[RHF_RCV_TYPE_EAGER] = process_receive_invalid,
|
||||
[RHF_RCV_TYPE_IB] = hfi1_ipoib_ib_rcv,
|
||||
[RHF_RCV_TYPE_ERROR] = process_receive_error,
|
||||
[RHF_RCV_TYPE_BYPASS] = hfi1_vnic_bypass_rcv,
|
||||
[RHF_RCV_TYPE_INVALID5] = process_receive_invalid,
|
||||
[RHF_RCV_TYPE_INVALID6] = process_receive_invalid,
|
||||
[RHF_RCV_TYPE_INVALID7] = process_receive_invalid,
|
||||
};
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright(c) 2015-2017 Intel Corporation.
|
||||
* Copyright(c) 2015-2020 Intel Corporation.
|
||||
*
|
||||
* This file is provided under a dual BSD/GPLv2 license. When using or
|
||||
* redistributing this file, you may do so under either license.
|
||||
@ -1264,7 +1264,7 @@ static int get_base_info(struct hfi1_filedata *fd, unsigned long arg, u32 len)
|
||||
memset(&binfo, 0, sizeof(binfo));
|
||||
binfo.hw_version = dd->revision;
|
||||
binfo.sw_version = HFI1_KERN_SWVERSION;
|
||||
binfo.bthqp = kdeth_qp;
|
||||
binfo.bthqp = RVT_KDETH_QP_PREFIX;
|
||||
binfo.jkey = uctxt->jkey;
|
||||
/*
|
||||
* If more than 64 contexts are enabled the allocated credit
|
||||
|
@ -1,7 +1,7 @@
|
||||
#ifndef _HFI1_KERNEL_H
|
||||
#define _HFI1_KERNEL_H
|
||||
/*
|
||||
* Copyright(c) 2015-2018 Intel Corporation.
|
||||
* Copyright(c) 2015-2020 Intel Corporation.
|
||||
*
|
||||
* This file is provided under a dual BSD/GPLv2 license. When using or
|
||||
* redistributing this file, you may do so under either license.
|
||||
@ -233,6 +233,8 @@ struct hfi1_ctxtdata {
|
||||
intr_handler fast_handler;
|
||||
/** slow handler */
|
||||
intr_handler slow_handler;
|
||||
/* napi pointer associated with netdev */
|
||||
struct napi_struct *napi;
|
||||
/* verbs rx_stats per rcd */
|
||||
struct hfi1_opcode_stats_perctx *opstats;
|
||||
/* clear interrupt mask */
|
||||
@ -383,11 +385,11 @@ struct hfi1_packet {
|
||||
u32 rhqoff;
|
||||
u32 dlid;
|
||||
u32 slid;
|
||||
int numpkt;
|
||||
u16 tlen;
|
||||
s16 etail;
|
||||
u16 pkey;
|
||||
u8 hlen;
|
||||
u8 numpkt;
|
||||
u8 rsize;
|
||||
u8 updegr;
|
||||
u8 etype;
|
||||
@ -985,7 +987,7 @@ typedef void (*hfi1_make_req)(struct rvt_qp *qp,
|
||||
struct hfi1_pkt_state *ps,
|
||||
struct rvt_swqe *wqe);
|
||||
extern const rhf_rcv_function_ptr normal_rhf_rcv_functions[];
|
||||
|
||||
extern const rhf_rcv_function_ptr netdev_rhf_rcv_functions[];
|
||||
|
||||
/* return values for the RHF receive functions */
|
||||
#define RHF_RCV_CONTINUE 0 /* keep going */
|
||||
@ -1045,23 +1047,10 @@ struct hfi1_asic_data {
|
||||
#define NUM_MAP_ENTRIES 256
|
||||
#define NUM_MAP_REGS 32
|
||||
|
||||
/*
|
||||
* Number of VNIC contexts used. Ensure it is less than or equal to
|
||||
* max queues supported by VNIC (HFI1_VNIC_MAX_QUEUE).
|
||||
*/
|
||||
#define HFI1_NUM_VNIC_CTXT 8
|
||||
|
||||
/* Number of VNIC RSM entries */
|
||||
#define NUM_VNIC_MAP_ENTRIES 8
|
||||
|
||||
/* Virtual NIC information */
|
||||
struct hfi1_vnic_data {
|
||||
struct hfi1_ctxtdata *ctxt[HFI1_NUM_VNIC_CTXT];
|
||||
struct kmem_cache *txreq_cache;
|
||||
struct xarray vesws;
|
||||
u8 num_vports;
|
||||
u8 rmt_start;
|
||||
u8 num_ctxt;
|
||||
};
|
||||
|
||||
struct hfi1_vnic_vport_info;
|
||||
@ -1167,8 +1156,8 @@ struct hfi1_devdata {
|
||||
u64 z_send_schedule;
|
||||
|
||||
u64 __percpu *send_schedule;
|
||||
/* number of reserved contexts for VNIC usage */
|
||||
u16 num_vnic_contexts;
|
||||
/* number of reserved contexts for netdev usage */
|
||||
u16 num_netdev_contexts;
|
||||
/* number of receive contexts in use by the driver */
|
||||
u32 num_rcv_contexts;
|
||||
/* number of pio send contexts in use by the driver */
|
||||
@ -1417,12 +1406,12 @@ struct hfi1_devdata {
|
||||
struct hfi1_vnic_data vnic;
|
||||
/* Lock to protect IRQ SRC register access */
|
||||
spinlock_t irq_src_lock;
|
||||
};
|
||||
int vnic_num_vports;
|
||||
struct net_device *dummy_netdev;
|
||||
|
||||
static inline bool hfi1_vnic_is_rsm_full(struct hfi1_devdata *dd, int spare)
|
||||
{
|
||||
return (dd->vnic.rmt_start + spare) > NUM_MAP_ENTRIES;
|
||||
}
|
||||
/* Keeps track of IPoIB RSM rule users */
|
||||
atomic_t ipoib_rsm_usr_num;
|
||||
};
|
||||
|
||||
/* 8051 firmware version helper */
|
||||
#define dc8051_ver(a, b, c) ((a) << 16 | (b) << 8 | (c))
|
||||
@ -1500,6 +1489,8 @@ struct hfi1_ctxtdata *hfi1_rcd_get_by_index(struct hfi1_devdata *dd, u16 ctxt);
|
||||
int handle_receive_interrupt(struct hfi1_ctxtdata *rcd, int thread);
|
||||
int handle_receive_interrupt_nodma_rtail(struct hfi1_ctxtdata *rcd, int thread);
|
||||
int handle_receive_interrupt_dma_rtail(struct hfi1_ctxtdata *rcd, int thread);
|
||||
int handle_receive_interrupt_napi_fp(struct hfi1_ctxtdata *rcd, int budget);
|
||||
int handle_receive_interrupt_napi_sp(struct hfi1_ctxtdata *rcd, int budget);
|
||||
void set_all_slowpath(struct hfi1_devdata *dd);
|
||||
|
||||
extern const struct pci_device_id hfi1_pci_tbl[];
|
||||
@ -2250,7 +2241,6 @@ extern int num_user_contexts;
|
||||
extern unsigned long n_krcvqs;
|
||||
extern uint krcvqs[];
|
||||
extern int krcvqsset;
|
||||
extern uint kdeth_qp;
|
||||
extern uint loopback;
|
||||
extern uint quick_linkup;
|
||||
extern uint rcv_intr_timeout;
|
||||
|
@ -1,5 +1,5 @@
|
||||
/*
|
||||
* Copyright(c) 2015 - 2018 Intel Corporation.
|
||||
* Copyright(c) 2015 - 2020 Intel Corporation.
|
||||
*
|
||||
* This file is provided under a dual BSD/GPLv2 license. When using or
|
||||
* redistributing this file, you may do so under either license.
|
||||
@ -69,6 +69,7 @@
|
||||
#include "affinity.h"
|
||||
#include "vnic.h"
|
||||
#include "exp_rcv.h"
|
||||
#include "netdev.h"
|
||||
|
||||
#undef pr_fmt
|
||||
#define pr_fmt(fmt) DRIVER_NAME ": " fmt
|
||||
@ -374,6 +375,7 @@ int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
|
||||
rcd->numa_id = numa;
|
||||
rcd->rcv_array_groups = dd->rcv_entries.ngroups;
|
||||
rcd->rhf_rcv_function_map = normal_rhf_rcv_functions;
|
||||
rcd->msix_intr = CCE_NUM_MSIX_VECTORS;
|
||||
|
||||
mutex_init(&rcd->exp_mutex);
|
||||
spin_lock_init(&rcd->exp_lock);
|
||||
@ -1316,6 +1318,7 @@ static struct hfi1_devdata *hfi1_alloc_devdata(struct pci_dev *pdev,
|
||||
goto bail;
|
||||
}
|
||||
|
||||
atomic_set(&dd->ipoib_rsm_usr_num, 0);
|
||||
return dd;
|
||||
|
||||
bail:
|
||||
@ -1663,9 +1666,6 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
|
||||
/* do the generic initialization */
|
||||
initfail = hfi1_init(dd, 0);
|
||||
|
||||
/* setup vnic */
|
||||
hfi1_vnic_setup(dd);
|
||||
|
||||
ret = hfi1_register_ib_device(dd);
|
||||
|
||||
/*
|
||||
@ -1704,7 +1704,6 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
|
||||
hfi1_device_remove(dd);
|
||||
if (!ret)
|
||||
hfi1_unregister_ib_device(dd);
|
||||
hfi1_vnic_cleanup(dd);
|
||||
postinit_cleanup(dd);
|
||||
if (initfail)
|
||||
ret = initfail;
|
||||
@ -1749,8 +1748,8 @@ static void remove_one(struct pci_dev *pdev)
|
||||
/* unregister from IB core */
|
||||
hfi1_unregister_ib_device(dd);
|
||||
|
||||
/* cleanup vnic */
|
||||
hfi1_vnic_cleanup(dd);
|
||||
/* free netdev data */
|
||||
hfi1_netdev_free(dd);
|
||||
|
||||
/*
|
||||
* Disable the IB link, disable interrupts on the device,
|
||||
|
drivers/infiniband/hw/hfi1/ipoib.h (new file, 171 lines)
@@ -0,0 +1,171 @@
|
||||
/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
|
||||
/*
|
||||
* Copyright(c) 2020 Intel Corporation.
|
||||
*
|
||||
*/
|
||||
|
||||
/*
|
||||
* This file contains HFI1 support for IPOIB functionality
|
||||
*/
|
||||
|
||||
#ifndef HFI1_IPOIB_H
|
||||
#define HFI1_IPOIB_H
|
||||
|
||||
#include <linux/types.h>
|
||||
#include <linux/stddef.h>
|
||||
#include <linux/atomic.h>
|
||||
#include <linux/netdevice.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/skbuff.h>
|
||||
#include <linux/list.h>
|
||||
#include <linux/if_infiniband.h>
|
||||
|
||||
#include "hfi.h"
|
||||
#include "iowait.h"
|
||||
#include "netdev.h"
|
||||
|
||||
#include <rdma/ib_verbs.h>
|
||||
|
||||
#define HFI1_IPOIB_ENTROPY_SHIFT 24
|
||||
|
||||
#define HFI1_IPOIB_TXREQ_NAME_LEN 32
|
||||
|
||||
#define HFI1_IPOIB_PSEUDO_LEN 20
|
||||
#define HFI1_IPOIB_ENCAP_LEN 4
|
||||
|
||||
struct hfi1_ipoib_dev_priv;
|
||||
|
||||
union hfi1_ipoib_flow {
|
||||
u16 as_int;
|
||||
struct {
|
||||
u8 tx_queue;
|
||||
u8 sc5;
|
||||
} __attribute__((__packed__));
|
||||
};
|
||||
|
||||
/**
 * struct hfi1_ipoib_circ_buf - List of items to be processed
 * @items: ring of items
 * @head: ring head
 * @tail: ring tail
 * @max_items: max items + 1 that the ring can contain
 * @producer_lock: producer sync lock
 * @consumer_lock: consumer sync lock
 */
struct hfi1_ipoib_circ_buf {
	void **items;
	unsigned long head;
	unsigned long tail;
	unsigned long max_items;
	spinlock_t producer_lock; /* head sync lock */
	spinlock_t consumer_lock; /* tail sync lock */
};
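A minimal sketch, not the driver's tx path, of how a producer typically advances head in a ring shaped like hfi1_ipoib_circ_buf; one slot is kept empty, matching the "max items + 1" note above:

/* Sketch only: push one item; caller is assumed to hold producer_lock. */
static bool example_ring_push(struct hfi1_ipoib_circ_buf *r, void *item)
{
	unsigned long next = (r->head + 1) % r->max_items;

	if (next == r->tail)
		return false;		/* ring full */

	r->items[r->head] = item;
	r->head = next;
	return true;
}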
|
||||
|
||||
/**
|
||||
* struct hfi1_ipoib_txq - IPOIB per Tx queue information
|
||||
* @priv: private pointer
|
||||
* @sde: sdma engine
|
||||
* @tx_list: tx request list
|
||||
* @sent_txreqs: count of txreqs posted to sdma
|
||||
* @flow: tracks when list needs to be flushed for a flow change
|
||||
* @q_idx: ipoib Tx queue index
|
||||
* @pkts_sent: indicator packets have been sent from this queue
|
||||
* @wait: iowait structure
|
||||
* @complete_txreqs: count of txreqs completed by sdma
|
||||
* @napi: pointer to tx napi interface
|
||||
* @tx_ring: ring of ipoib txreqs to be reaped by napi callback
|
||||
*/
|
||||
struct hfi1_ipoib_txq {
|
||||
struct hfi1_ipoib_dev_priv *priv;
|
||||
struct sdma_engine *sde;
|
||||
struct list_head tx_list;
|
||||
u64 sent_txreqs;
|
||||
union hfi1_ipoib_flow flow;
|
||||
u8 q_idx;
|
||||
bool pkts_sent;
|
||||
struct iowait wait;
|
||||
|
||||
atomic64_t ____cacheline_aligned_in_smp complete_txreqs;
|
||||
struct napi_struct *napi;
|
||||
struct hfi1_ipoib_circ_buf tx_ring;
|
||||
};
|
||||
|
||||
struct hfi1_ipoib_dev_priv {
|
||||
struct hfi1_devdata *dd;
|
||||
struct net_device *netdev;
|
||||
struct ib_device *device;
|
||||
struct hfi1_ipoib_txq *txqs;
|
||||
struct kmem_cache *txreq_cache;
|
||||
struct napi_struct *tx_napis;
|
||||
u16 pkey;
|
||||
u16 pkey_index;
|
||||
u32 qkey;
|
||||
u8 port_num;
|
||||
|
||||
const struct net_device_ops *netdev_ops;
|
||||
struct rvt_qp *qp;
|
||||
struct pcpu_sw_netstats __percpu *netstats;
|
||||
};
|
||||
|
||||
/* hfi1 ipoib rdma netdev's private data structure */
|
||||
struct hfi1_ipoib_rdma_netdev {
|
||||
struct rdma_netdev rn; /* keep this first */
|
||||
/* followed by device private data */
|
||||
struct hfi1_ipoib_dev_priv dev_priv;
|
||||
};
|
||||
|
||||
static inline struct hfi1_ipoib_dev_priv *
|
||||
hfi1_ipoib_priv(const struct net_device *dev)
|
||||
{
|
||||
return &((struct hfi1_ipoib_rdma_netdev *)netdev_priv(dev))->dev_priv;
|
||||
}
|
||||
|
||||
static inline void
|
||||
hfi1_ipoib_update_rx_netstats(struct hfi1_ipoib_dev_priv *priv,
|
||||
u64 packets,
|
||||
u64 bytes)
|
||||
{
|
||||
struct pcpu_sw_netstats *netstats = this_cpu_ptr(priv->netstats);
|
||||
|
||||
u64_stats_update_begin(&netstats->syncp);
|
||||
netstats->rx_packets += packets;
|
||||
netstats->rx_bytes += bytes;
|
||||
u64_stats_update_end(&netstats->syncp);
|
||||
}
|
||||
|
||||
static inline void
|
||||
hfi1_ipoib_update_tx_netstats(struct hfi1_ipoib_dev_priv *priv,
|
||||
u64 packets,
|
||||
u64 bytes)
|
||||
{
|
||||
struct pcpu_sw_netstats *netstats = this_cpu_ptr(priv->netstats);
|
||||
|
||||
u64_stats_update_begin(&netstats->syncp);
|
||||
netstats->tx_packets += packets;
|
||||
netstats->tx_bytes += bytes;
|
||||
u64_stats_update_end(&netstats->syncp);
|
||||
}
|
||||
|
||||
int hfi1_ipoib_send_dma(struct net_device *dev,
|
||||
struct sk_buff *skb,
|
||||
struct ib_ah *address,
|
||||
u32 dqpn);
|
||||
|
||||
int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv);
|
||||
void hfi1_ipoib_txreq_deinit(struct hfi1_ipoib_dev_priv *priv);
|
||||
|
||||
int hfi1_ipoib_rxq_init(struct net_device *dev);
|
||||
void hfi1_ipoib_rxq_deinit(struct net_device *dev);
|
||||
|
||||
void hfi1_ipoib_napi_tx_enable(struct net_device *dev);
|
||||
void hfi1_ipoib_napi_tx_disable(struct net_device *dev);
|
||||
|
||||
struct sk_buff *hfi1_ipoib_prepare_skb(struct hfi1_netdev_rxq *rxq,
|
||||
int size, void *data);
|
||||
|
||||
int hfi1_ipoib_rn_get_params(struct ib_device *device,
|
||||
u8 port_num,
|
||||
enum rdma_netdev_t type,
|
||||
struct rdma_netdev_alloc_params *params);
|
||||
|
||||
#endif /* _IPOIB_H */
drivers/infiniband/hw/hfi1/ipoib_main.c (new file, 309 lines)
@@ -0,0 +1,309 @@
// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
|
||||
/*
|
||||
* Copyright(c) 2020 Intel Corporation.
|
||||
*
|
||||
*/
|
||||
|
||||
/*
|
||||
* This file contains HFI1 support for ipoib functionality
|
||||
*/
|
||||
|
||||
#include "ipoib.h"
|
||||
#include "hfi.h"
|
||||
|
||||
static u32 qpn_from_mac(u8 *mac_arr)
|
||||
{
|
||||
return (u32)mac_arr[1] << 16 | mac_arr[2] << 8 | mac_arr[3];
|
||||
}
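A note on qpn_from_mac(): the IPoIB hardware address packs the 24-bit QPN into bytes 1..3 of dev_addr (byte 0 carries flags, the remaining bytes the port GID), which is what the shifts above reassemble. Below is a minimal userspace sketch of the same extraction; the address value is made up purely for illustration.

#include <stdio.h>

static unsigned int qpn_from_mac(const unsigned char *mac_arr)
{
	/* bytes 1..3 of the IPoIB hardware address hold the 24-bit QPN */
	return (unsigned int)mac_arr[1] << 16 | mac_arr[2] << 8 | mac_arr[3];
}

int main(void)
{
	/* hypothetical dev_addr prefix: flags byte, then QPN 0x00048a */
	unsigned char addr[4] = { 0x80, 0x00, 0x04, 0x8a };

	printf("qpn = 0x%06x\n", qpn_from_mac(addr)); /* prints qpn = 0x00048a */
	return 0;
}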
|
||||
|
||||
static int hfi1_ipoib_dev_init(struct net_device *dev)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
int ret;
|
||||
|
||||
priv->netstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
|
||||
|
||||
ret = priv->netdev_ops->ndo_init(dev);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
ret = hfi1_netdev_add_data(priv->dd,
|
||||
qpn_from_mac(priv->netdev->dev_addr),
|
||||
dev);
|
||||
if (ret < 0) {
|
||||
priv->netdev_ops->ndo_uninit(dev);
|
||||
return ret;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void hfi1_ipoib_dev_uninit(struct net_device *dev)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
|
||||
hfi1_netdev_remove_data(priv->dd, qpn_from_mac(priv->netdev->dev_addr));
|
||||
|
||||
priv->netdev_ops->ndo_uninit(dev);
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_dev_open(struct net_device *dev)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
int ret;
|
||||
|
||||
ret = priv->netdev_ops->ndo_open(dev);
|
||||
if (!ret) {
|
||||
struct hfi1_ibport *ibp = to_iport(priv->device,
|
||||
priv->port_num);
|
||||
struct rvt_qp *qp;
|
||||
u32 qpn = qpn_from_mac(priv->netdev->dev_addr);
|
||||
|
||||
rcu_read_lock();
|
||||
qp = rvt_lookup_qpn(ib_to_rvt(priv->device), &ibp->rvp, qpn);
|
||||
if (!qp) {
|
||||
rcu_read_unlock();
|
||||
priv->netdev_ops->ndo_stop(dev);
|
||||
return -EINVAL;
|
||||
}
|
||||
rvt_get_qp(qp);
|
||||
priv->qp = qp;
|
||||
rcu_read_unlock();
|
||||
|
||||
hfi1_netdev_enable_queues(priv->dd);
|
||||
hfi1_ipoib_napi_tx_enable(dev);
|
||||
}
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_dev_stop(struct net_device *dev)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
|
||||
if (!priv->qp)
|
||||
return 0;
|
||||
|
||||
hfi1_ipoib_napi_tx_disable(dev);
|
||||
hfi1_netdev_disable_queues(priv->dd);
|
||||
|
||||
rvt_put_qp(priv->qp);
|
||||
priv->qp = NULL;
|
||||
|
||||
return priv->netdev_ops->ndo_stop(dev);
|
||||
}
|
||||
|
||||
static void hfi1_ipoib_dev_get_stats64(struct net_device *dev,
|
||||
struct rtnl_link_stats64 *storage)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
u64 rx_packets = 0ull;
|
||||
u64 rx_bytes = 0ull;
|
||||
u64 tx_packets = 0ull;
|
||||
u64 tx_bytes = 0ull;
|
||||
int i;
|
||||
|
||||
netdev_stats_to_stats64(storage, &dev->stats);
|
||||
|
||||
for_each_possible_cpu(i) {
|
||||
const struct pcpu_sw_netstats *stats;
|
||||
unsigned int start;
|
||||
u64 trx_packets;
|
||||
u64 trx_bytes;
|
||||
u64 ttx_packets;
|
||||
u64 ttx_bytes;
|
||||
|
||||
stats = per_cpu_ptr(priv->netstats, i);
|
||||
do {
|
||||
start = u64_stats_fetch_begin_irq(&stats->syncp);
|
||||
trx_packets = stats->rx_packets;
|
||||
trx_bytes = stats->rx_bytes;
|
||||
ttx_packets = stats->tx_packets;
|
||||
ttx_bytes = stats->tx_bytes;
|
||||
} while (u64_stats_fetch_retry_irq(&stats->syncp, start));
|
||||
|
||||
rx_packets += trx_packets;
|
||||
rx_bytes += trx_bytes;
|
||||
tx_packets += ttx_packets;
|
||||
tx_bytes += ttx_bytes;
|
||||
}
|
||||
|
||||
storage->rx_packets += rx_packets;
|
||||
storage->rx_bytes += rx_bytes;
|
||||
storage->tx_packets += tx_packets;
|
||||
storage->tx_bytes += tx_bytes;
|
||||
}
|
||||
|
||||
static const struct net_device_ops hfi1_ipoib_netdev_ops = {
|
||||
.ndo_init = hfi1_ipoib_dev_init,
|
||||
.ndo_uninit = hfi1_ipoib_dev_uninit,
|
||||
.ndo_open = hfi1_ipoib_dev_open,
|
||||
.ndo_stop = hfi1_ipoib_dev_stop,
|
||||
.ndo_get_stats64 = hfi1_ipoib_dev_get_stats64,
|
||||
};
|
||||
|
||||
static int hfi1_ipoib_send(struct net_device *dev,
|
||||
struct sk_buff *skb,
|
||||
struct ib_ah *address,
|
||||
u32 dqpn)
|
||||
{
|
||||
return hfi1_ipoib_send_dma(dev, skb, address, dqpn);
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_mcast_attach(struct net_device *dev,
|
||||
struct ib_device *device,
|
||||
union ib_gid *mgid,
|
||||
u16 mlid,
|
||||
int set_qkey,
|
||||
u32 qkey)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
u32 qpn = (u32)qpn_from_mac(priv->netdev->dev_addr);
|
||||
struct hfi1_ibport *ibp = to_iport(priv->device, priv->port_num);
|
||||
struct rvt_qp *qp;
|
||||
int ret = -EINVAL;
|
||||
|
||||
rcu_read_lock();
|
||||
|
||||
qp = rvt_lookup_qpn(ib_to_rvt(priv->device), &ibp->rvp, qpn);
|
||||
if (qp) {
|
||||
rvt_get_qp(qp);
|
||||
rcu_read_unlock();
|
||||
if (set_qkey)
|
||||
priv->qkey = qkey;
|
||||
|
||||
/* attach QP to multicast group */
|
||||
ret = ib_attach_mcast(&qp->ibqp, mgid, mlid);
|
||||
rvt_put_qp(qp);
|
||||
} else {
|
||||
rcu_read_unlock();
|
||||
}
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_mcast_detach(struct net_device *dev,
|
||||
struct ib_device *device,
|
||||
union ib_gid *mgid,
|
||||
u16 mlid)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
u32 qpn = (u32)qpn_from_mac(priv->netdev->dev_addr);
|
||||
struct hfi1_ibport *ibp = to_iport(priv->device, priv->port_num);
|
||||
struct rvt_qp *qp;
|
||||
int ret = -EINVAL;
|
||||
|
||||
rcu_read_lock();
|
||||
|
||||
qp = rvt_lookup_qpn(ib_to_rvt(priv->device), &ibp->rvp, qpn);
|
||||
if (qp) {
|
||||
rvt_get_qp(qp);
|
||||
rcu_read_unlock();
|
||||
ret = ib_detach_mcast(&qp->ibqp, mgid, mlid);
|
||||
rvt_put_qp(qp);
|
||||
} else {
|
||||
rcu_read_unlock();
|
||||
}
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void hfi1_ipoib_netdev_dtor(struct net_device *dev)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
|
||||
hfi1_ipoib_txreq_deinit(priv);
|
||||
hfi1_ipoib_rxq_deinit(priv->netdev);
|
||||
|
||||
free_percpu(priv->netstats);
|
||||
}
|
||||
|
||||
static void hfi1_ipoib_free_rdma_netdev(struct net_device *dev)
|
||||
{
|
||||
hfi1_ipoib_netdev_dtor(dev);
|
||||
free_netdev(dev);
|
||||
}
|
||||
|
||||
static void hfi1_ipoib_set_id(struct net_device *dev, int id)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
|
||||
priv->pkey_index = (u16)id;
|
||||
ib_query_pkey(priv->device,
|
||||
priv->port_num,
|
||||
priv->pkey_index,
|
||||
&priv->pkey);
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_setup_rn(struct ib_device *device,
|
||||
u8 port_num,
|
||||
struct net_device *netdev,
|
||||
void *param)
|
||||
{
|
||||
struct hfi1_devdata *dd = dd_from_ibdev(device);
|
||||
struct rdma_netdev *rn = netdev_priv(netdev);
|
||||
struct hfi1_ipoib_dev_priv *priv;
|
||||
int rc;
|
||||
|
||||
rn->send = hfi1_ipoib_send;
|
||||
rn->attach_mcast = hfi1_ipoib_mcast_attach;
|
||||
rn->detach_mcast = hfi1_ipoib_mcast_detach;
|
||||
rn->set_id = hfi1_ipoib_set_id;
|
||||
rn->hca = device;
|
||||
rn->port_num = port_num;
|
||||
rn->mtu = netdev->mtu;
|
||||
|
||||
priv = hfi1_ipoib_priv(netdev);
|
||||
priv->dd = dd;
|
||||
priv->netdev = netdev;
|
||||
priv->device = device;
|
||||
priv->port_num = port_num;
|
||||
priv->netdev_ops = netdev->netdev_ops;
|
||||
|
||||
netdev->netdev_ops = &hfi1_ipoib_netdev_ops;
|
||||
|
||||
ib_query_pkey(device, port_num, priv->pkey_index, &priv->pkey);
|
||||
|
||||
rc = hfi1_ipoib_txreq_init(priv);
|
||||
if (rc) {
|
||||
dd_dev_err(dd, "IPoIB netdev TX init - failed(%d)\n", rc);
|
||||
hfi1_ipoib_free_rdma_netdev(netdev);
|
||||
return rc;
|
||||
}
|
||||
|
||||
rc = hfi1_ipoib_rxq_init(netdev);
|
||||
if (rc) {
|
||||
dd_dev_err(dd, "IPoIB netdev RX init - failed(%d)\n", rc);
|
||||
hfi1_ipoib_free_rdma_netdev(netdev);
|
||||
return rc;
|
||||
}
|
||||
|
||||
netdev->priv_destructor = hfi1_ipoib_netdev_dtor;
|
||||
netdev->needs_free_netdev = true;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
int hfi1_ipoib_rn_get_params(struct ib_device *device,
|
||||
u8 port_num,
|
||||
enum rdma_netdev_t type,
|
||||
struct rdma_netdev_alloc_params *params)
|
||||
{
|
||||
struct hfi1_devdata *dd = dd_from_ibdev(device);
|
||||
|
||||
if (type != RDMA_NETDEV_IPOIB)
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
if (!HFI1_CAP_IS_KSET(AIP) || !dd->num_netdev_contexts)
|
||||
return -EOPNOTSUPP;
|
||||
|
||||
if (!port_num || port_num > dd->num_pports)
|
||||
return -EINVAL;
|
||||
|
||||
params->sizeof_priv = sizeof(struct hfi1_ipoib_rdma_netdev);
|
||||
params->txqs = dd->num_sdma;
|
||||
params->rxqs = dd->num_netdev_contexts;
|
||||
params->param = NULL;
|
||||
params->initialize_rdma_netdev = hfi1_ipoib_setup_rn;
|
||||
|
||||
return 0;
|
||||
}
drivers/infiniband/hw/hfi1/ipoib_rx.c (new file, 95 lines)
@@ -0,0 +1,95 @@
// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
|
||||
/*
|
||||
* Copyright(c) 2020 Intel Corporation.
|
||||
*
|
||||
*/
|
||||
|
||||
#include "netdev.h"
|
||||
#include "ipoib.h"
|
||||
|
||||
#define HFI1_IPOIB_SKB_PAD ((NET_SKB_PAD) + (NET_IP_ALIGN))
|
||||
|
||||
static void copy_ipoib_buf(struct sk_buff *skb, void *data, int size)
|
||||
{
|
||||
void *dst_data;
|
||||
|
||||
skb_checksum_none_assert(skb);
|
||||
skb->protocol = *((__be16 *)data);
|
||||
|
||||
dst_data = skb_put(skb, size);
|
||||
memcpy(dst_data, data, size);
|
||||
skb->mac_header = HFI1_IPOIB_PSEUDO_LEN;
|
||||
skb_pull(skb, HFI1_IPOIB_ENCAP_LEN);
|
||||
}
|
||||
|
||||
static struct sk_buff *prepare_frag_skb(struct napi_struct *napi, int size)
|
||||
{
|
||||
struct sk_buff *skb;
|
||||
int skb_size = SKB_DATA_ALIGN(size + HFI1_IPOIB_SKB_PAD);
|
||||
void *frag;
|
||||
|
||||
skb_size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
|
||||
skb_size = SKB_DATA_ALIGN(skb_size);
|
||||
frag = napi_alloc_frag(skb_size);
|
||||
|
||||
if (unlikely(!frag))
|
||||
return napi_alloc_skb(napi, size);
|
||||
|
||||
skb = build_skb(frag, skb_size);
|
||||
|
||||
if (unlikely(!skb)) {
|
||||
skb_free_frag(frag);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
skb_reserve(skb, HFI1_IPOIB_SKB_PAD);
|
||||
return skb;
|
||||
}
|
||||
|
||||
struct sk_buff *hfi1_ipoib_prepare_skb(struct hfi1_netdev_rxq *rxq,
|
||||
int size, void *data)
|
||||
{
|
||||
struct napi_struct *napi = &rxq->napi;
|
||||
int skb_size = size + HFI1_IPOIB_ENCAP_LEN;
|
||||
struct sk_buff *skb;
|
||||
|
||||
/*
|
||||
* For smaller allocations (4k + skb overhead) we use the
* napi cache. Otherwise we try the napi frag cache.
|
||||
*/
|
||||
if (size <= SKB_WITH_OVERHEAD(PAGE_SIZE))
|
||||
skb = napi_alloc_skb(napi, skb_size);
|
||||
else
|
||||
skb = prepare_frag_skb(napi, skb_size);
|
||||
|
||||
if (unlikely(!skb))
|
||||
return NULL;
|
||||
|
||||
copy_ipoib_buf(skb, data, size);
|
||||
|
||||
return skb;
|
||||
}
|
||||
|
||||
int hfi1_ipoib_rxq_init(struct net_device *netdev)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *ipoib_priv = hfi1_ipoib_priv(netdev);
|
||||
struct hfi1_devdata *dd = ipoib_priv->dd;
|
||||
int ret;
|
||||
|
||||
ret = hfi1_netdev_rx_init(dd);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
hfi1_init_aip_rsm(dd);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
void hfi1_ipoib_rxq_deinit(struct net_device *netdev)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *ipoib_priv = hfi1_ipoib_priv(netdev);
|
||||
struct hfi1_devdata *dd = ipoib_priv->dd;
|
||||
|
||||
hfi1_deinit_aip_rsm(dd);
|
||||
hfi1_netdev_rx_destroy(dd);
|
||||
}
drivers/infiniband/hw/hfi1/ipoib_tx.c (new file, 828 lines)
@@ -0,0 +1,828 @@
// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
|
||||
/*
|
||||
* Copyright(c) 2020 Intel Corporation.
|
||||
*
|
||||
*/
|
||||
|
||||
/*
|
||||
* This file contains HFI1 support for IPOIB SDMA functionality
|
||||
*/
|
||||
|
||||
#include <linux/log2.h>
|
||||
#include <linux/circ_buf.h>
|
||||
|
||||
#include "sdma.h"
|
||||
#include "verbs.h"
|
||||
#include "trace_ibhdrs.h"
|
||||
#include "ipoib.h"
|
||||
|
||||
/* Add a convenience helper */
|
||||
#define CIRC_ADD(val, add, size) (((val) + (add)) & ((size) - 1))
|
||||
#define CIRC_NEXT(val, size) CIRC_ADD(val, 1, size)
|
||||
#define CIRC_PREV(val, size) CIRC_ADD(val, -1, size)
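The mask in CIRC_ADD() only wraps correctly because the ring size is a power of two (hfi1_ipoib_txreq_init() rounds it up with roundup_pow_of_two()). A small standalone sketch of the wrap behaviour, not driver code:

#include <stdio.h>

#define CIRC_ADD(val, add, size)  (((val) + (add)) & ((size) - 1))
#define CIRC_NEXT(val, size)      CIRC_ADD(val, 1, size)
#define CIRC_PREV(val, size)      CIRC_ADD(val, -1, size)

int main(void)
{
	unsigned long size = 8;	/* must be a power of two */

	/* 7 + 1 wraps to 0, 0 - 1 wraps to 7, because the mask is size - 1 */
	printf("%lu %lu\n", CIRC_NEXT(7UL, size), CIRC_PREV(0UL, size));
	return 0;
}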
|
||||
|
||||
/**
|
||||
* struct ipoib_txreq - IPOIB transmit descriptor
|
||||
* @txreq: sdma transmit request
|
||||
* @sdma_hdr: 9b ib headers
|
||||
* @sdma_status: status returned by sdma engine
|
||||
* @priv: ipoib netdev private data
|
||||
* @txq: txq on which skb was output
|
||||
* @skb: skb to send
|
||||
*/
|
||||
struct ipoib_txreq {
|
||||
struct sdma_txreq txreq;
|
||||
struct hfi1_sdma_header sdma_hdr;
|
||||
int sdma_status;
|
||||
struct hfi1_ipoib_dev_priv *priv;
|
||||
struct hfi1_ipoib_txq *txq;
|
||||
struct sk_buff *skb;
|
||||
};
|
||||
|
||||
struct ipoib_txparms {
|
||||
struct hfi1_devdata *dd;
|
||||
struct rdma_ah_attr *ah_attr;
|
||||
struct hfi1_ibport *ibp;
|
||||
struct hfi1_ipoib_txq *txq;
|
||||
union hfi1_ipoib_flow flow;
|
||||
u32 dqpn;
|
||||
u8 hdr_dwords;
|
||||
u8 entropy;
|
||||
};
|
||||
|
||||
static u64 hfi1_ipoib_txreqs(const u64 sent, const u64 completed)
|
||||
{
|
||||
return sent - completed;
|
||||
}
|
||||
|
||||
static void hfi1_ipoib_check_queue_depth(struct hfi1_ipoib_txq *txq)
|
||||
{
|
||||
if (unlikely(hfi1_ipoib_txreqs(++txq->sent_txreqs,
|
||||
atomic64_read(&txq->complete_txreqs)) >=
|
||||
min_t(unsigned int, txq->priv->netdev->tx_queue_len,
|
||||
txq->tx_ring.max_items - 1)))
|
||||
netif_stop_subqueue(txq->priv->netdev, txq->q_idx);
|
||||
}
|
||||
|
||||
static void hfi1_ipoib_check_queue_stopped(struct hfi1_ipoib_txq *txq)
|
||||
{
|
||||
struct net_device *dev = txq->priv->netdev;
|
||||
|
||||
/* If the queue is already running just return */
|
||||
if (likely(!__netif_subqueue_stopped(dev, txq->q_idx)))
|
||||
return;
|
||||
|
||||
/* If shutting down just return as queue state is irrelevant */
|
||||
if (unlikely(dev->reg_state != NETREG_REGISTERED))
|
||||
return;
|
||||
|
||||
/*
|
||||
* When the queue has been drained to less than half full it will be
|
||||
* restarted.
|
||||
* The size of the txreq ring is fixed at initialization.
|
||||
* The tx queue len can be adjusted upward while the interface is
|
||||
* running.
|
||||
* The tx queue len can be large enough to overflow the txreq_ring.
|
||||
* Use the minimum of the current tx_queue_len or the rings max txreqs
|
||||
* to protect against ring overflow.
|
||||
*/
|
||||
if (hfi1_ipoib_txreqs(txq->sent_txreqs,
|
||||
atomic64_read(&txq->complete_txreqs))
|
||||
< min_t(unsigned int, dev->tx_queue_len,
|
||||
txq->tx_ring.max_items) >> 1)
|
||||
netif_wake_subqueue(dev, txq->q_idx);
|
||||
}
|
||||
|
||||
static void hfi1_ipoib_free_tx(struct ipoib_txreq *tx, int budget)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = tx->priv;
|
||||
|
||||
if (likely(!tx->sdma_status)) {
|
||||
hfi1_ipoib_update_tx_netstats(priv, 1, tx->skb->len);
|
||||
} else {
|
||||
++priv->netdev->stats.tx_errors;
|
||||
dd_dev_warn(priv->dd,
|
||||
"%s: Status = 0x%x pbc 0x%llx txq = %d sde = %d\n",
|
||||
__func__, tx->sdma_status,
|
||||
le64_to_cpu(tx->sdma_hdr.pbc), tx->txq->q_idx,
|
||||
tx->txq->sde->this_idx);
|
||||
}
|
||||
|
||||
napi_consume_skb(tx->skb, budget);
|
||||
sdma_txclean(priv->dd, &tx->txreq);
|
||||
kmem_cache_free(priv->txreq_cache, tx);
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_drain_tx_ring(struct hfi1_ipoib_txq *txq, int budget)
|
||||
{
|
||||
struct hfi1_ipoib_circ_buf *tx_ring = &txq->tx_ring;
|
||||
unsigned long head;
|
||||
unsigned long tail;
|
||||
unsigned int max_tx;
|
||||
int work_done;
|
||||
int tx_count;
|
||||
|
||||
spin_lock_bh(&tx_ring->consumer_lock);
|
||||
|
||||
/* Read index before reading contents at that index. */
|
||||
head = smp_load_acquire(&tx_ring->head);
|
||||
tail = tx_ring->tail;
|
||||
max_tx = tx_ring->max_items;
|
||||
|
||||
work_done = min_t(int, CIRC_CNT(head, tail, max_tx), budget);
|
||||
|
||||
for (tx_count = work_done; tx_count; tx_count--) {
|
||||
hfi1_ipoib_free_tx(tx_ring->items[tail], budget);
|
||||
tail = CIRC_NEXT(tail, max_tx);
|
||||
}
|
||||
|
||||
atomic64_add(work_done, &txq->complete_txreqs);
|
||||
|
||||
/* Finished freeing tx items so store the tail value. */
|
||||
smp_store_release(&tx_ring->tail, tail);
|
||||
|
||||
spin_unlock_bh(&tx_ring->consumer_lock);
|
||||
|
||||
hfi1_ipoib_check_queue_stopped(txq);
|
||||
|
||||
return work_done;
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_process_tx_ring(struct napi_struct *napi, int budget)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(napi->dev);
|
||||
struct hfi1_ipoib_txq *txq = &priv->txqs[napi - priv->tx_napis];
|
||||
|
||||
int work_done = hfi1_ipoib_drain_tx_ring(txq, budget);
|
||||
|
||||
if (work_done < budget)
|
||||
napi_complete_done(napi, work_done);
|
||||
|
||||
return work_done;
|
||||
}
|
||||
|
||||
static void hfi1_ipoib_add_tx(struct ipoib_txreq *tx)
|
||||
{
|
||||
struct hfi1_ipoib_circ_buf *tx_ring = &tx->txq->tx_ring;
|
||||
unsigned long head;
|
||||
unsigned long tail;
|
||||
size_t max_tx;
|
||||
|
||||
spin_lock(&tx_ring->producer_lock);
|
||||
|
||||
head = tx_ring->head;
|
||||
tail = READ_ONCE(tx_ring->tail);
|
||||
max_tx = tx_ring->max_items;
|
||||
|
||||
if (likely(CIRC_SPACE(head, tail, max_tx))) {
|
||||
tx_ring->items[head] = tx;
|
||||
|
||||
/* Finish storing txreq before incrementing head. */
|
||||
smp_store_release(&tx_ring->head, CIRC_ADD(head, 1, max_tx));
|
||||
napi_schedule(tx->txq->napi);
|
||||
} else {
|
||||
struct hfi1_ipoib_txq *txq = tx->txq;
|
||||
struct hfi1_ipoib_dev_priv *priv = tx->priv;
|
||||
|
||||
/* Ring was full */
|
||||
hfi1_ipoib_free_tx(tx, 0);
|
||||
atomic64_inc(&txq->complete_txreqs);
|
||||
dd_dev_dbg(priv->dd, "txq %d full.\n", txq->q_idx);
|
||||
}
|
||||
|
||||
spin_unlock(&tx_ring->producer_lock);
|
||||
}
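The producer/consumer pairing above relies on smp_store_release()/smp_load_acquire(): the producer publishes head only after the item pointer has been stored, and the consumer frees a slot only by releasing tail afterwards. Below is a hedged, self-contained userspace sketch of the same single-producer/single-consumer discipline using C11 atomics; the names and sizes are illustrative, not the driver's actual helpers.

#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 8	/* power of two; one slot is kept free */

struct ring {
	void *items[RING_SIZE];
	_Atomic unsigned long head;	/* written only by the producer */
	_Atomic unsigned long tail;	/* written only by the consumer */
};

/* producer side: returns 0 on success, -1 if the ring is full */
static int ring_push(struct ring *r, void *item)
{
	unsigned long head = atomic_load_explicit(&r->head, memory_order_relaxed);
	unsigned long tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (((head + 1) & (RING_SIZE - 1)) == tail)
		return -1;			/* full (one slot kept free) */

	r->items[head] = item;
	/* publish the item before the new head becomes visible */
	atomic_store_explicit(&r->head, (head + 1) & (RING_SIZE - 1),
			      memory_order_release);
	return 0;
}

/* consumer side: returns NULL when the ring is empty */
static void *ring_pop(struct ring *r)
{
	unsigned long tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	unsigned long head = atomic_load_explicit(&r->head, memory_order_acquire);
	void *item;

	if (tail == head)
		return NULL;			/* empty */

	item = r->items[tail];
	/* the slot may be reused only after tail has been released */
	atomic_store_explicit(&r->tail, (tail + 1) & (RING_SIZE - 1),
			      memory_order_release);
	return item;
}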
|
||||
|
||||
static void hfi1_ipoib_sdma_complete(struct sdma_txreq *txreq, int status)
|
||||
{
|
||||
struct ipoib_txreq *tx = container_of(txreq, struct ipoib_txreq, txreq);
|
||||
|
||||
tx->sdma_status = status;
|
||||
|
||||
hfi1_ipoib_add_tx(tx);
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_build_ulp_payload(struct ipoib_txreq *tx,
|
||||
struct ipoib_txparms *txp)
|
||||
{
|
||||
struct hfi1_devdata *dd = txp->dd;
|
||||
struct sdma_txreq *txreq = &tx->txreq;
|
||||
struct sk_buff *skb = tx->skb;
|
||||
int ret = 0;
|
||||
int i;
|
||||
|
||||
if (skb_headlen(skb)) {
|
||||
ret = sdma_txadd_kvaddr(dd, txreq, skb->data, skb_headlen(skb));
|
||||
if (unlikely(ret))
|
||||
return ret;
|
||||
}
|
||||
|
||||
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
|
||||
const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
|
||||
|
||||
ret = sdma_txadd_page(dd,
|
||||
txreq,
|
||||
skb_frag_page(frag),
|
||||
frag->bv_offset,
|
||||
skb_frag_size(frag));
|
||||
if (unlikely(ret))
|
||||
break;
|
||||
}
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_build_tx_desc(struct ipoib_txreq *tx,
|
||||
struct ipoib_txparms *txp)
|
||||
{
|
||||
struct hfi1_devdata *dd = txp->dd;
|
||||
struct sdma_txreq *txreq = &tx->txreq;
|
||||
struct hfi1_sdma_header *sdma_hdr = &tx->sdma_hdr;
|
||||
u16 pkt_bytes =
|
||||
sizeof(sdma_hdr->pbc) + (txp->hdr_dwords << 2) + tx->skb->len;
|
||||
int ret;
|
||||
|
||||
ret = sdma_txinit(txreq, 0, pkt_bytes, hfi1_ipoib_sdma_complete);
|
||||
if (unlikely(ret))
|
||||
return ret;
|
||||
|
||||
/* add pbc + headers */
|
||||
ret = sdma_txadd_kvaddr(dd,
|
||||
txreq,
|
||||
sdma_hdr,
|
||||
sizeof(sdma_hdr->pbc) + (txp->hdr_dwords << 2));
|
||||
if (unlikely(ret))
|
||||
return ret;
|
||||
|
||||
/* add the ulp payload */
|
||||
return hfi1_ipoib_build_ulp_payload(tx, txp);
|
||||
}
|
||||
|
||||
static void hfi1_ipoib_build_ib_tx_headers(struct ipoib_txreq *tx,
|
||||
struct ipoib_txparms *txp)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = tx->priv;
|
||||
struct hfi1_sdma_header *sdma_hdr = &tx->sdma_hdr;
|
||||
struct sk_buff *skb = tx->skb;
|
||||
struct hfi1_pportdata *ppd = ppd_from_ibp(txp->ibp);
|
||||
struct rdma_ah_attr *ah_attr = txp->ah_attr;
|
||||
struct ib_other_headers *ohdr;
|
||||
struct ib_grh *grh;
|
||||
u16 dwords;
|
||||
u16 slid;
|
||||
u16 dlid;
|
||||
u16 lrh0;
|
||||
u32 bth0;
|
||||
u32 sqpn = (u32)(priv->netdev->dev_addr[1] << 16 |
|
||||
priv->netdev->dev_addr[2] << 8 |
|
||||
priv->netdev->dev_addr[3]);
|
||||
u16 payload_dwords;
|
||||
u8 pad_cnt;
|
||||
|
||||
pad_cnt = -skb->len & 3;
|
||||
|
||||
/* Includes ICRC */
|
||||
payload_dwords = ((skb->len + pad_cnt) >> 2) + SIZE_OF_CRC;
|
||||
|
||||
/* header size in dwords LRH+BTH+DETH = (8+12+8)/4. */
|
||||
txp->hdr_dwords = 7;
|
||||
|
||||
if (rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH) {
|
||||
grh = &sdma_hdr->hdr.ibh.u.l.grh;
|
||||
txp->hdr_dwords +=
|
||||
hfi1_make_grh(txp->ibp,
|
||||
grh,
|
||||
rdma_ah_read_grh(ah_attr),
|
||||
txp->hdr_dwords - LRH_9B_DWORDS,
|
||||
payload_dwords);
|
||||
lrh0 = HFI1_LRH_GRH;
|
||||
ohdr = &sdma_hdr->hdr.ibh.u.l.oth;
|
||||
} else {
|
||||
lrh0 = HFI1_LRH_BTH;
|
||||
ohdr = &sdma_hdr->hdr.ibh.u.oth;
|
||||
}
|
||||
|
||||
lrh0 |= (rdma_ah_get_sl(ah_attr) & 0xf) << 4;
|
||||
lrh0 |= (txp->flow.sc5 & 0xf) << 12;
|
||||
|
||||
dlid = opa_get_lid(rdma_ah_get_dlid(ah_attr), 9B);
|
||||
if (dlid == be16_to_cpu(IB_LID_PERMISSIVE)) {
|
||||
slid = be16_to_cpu(IB_LID_PERMISSIVE);
|
||||
} else {
|
||||
u16 lid = (u16)ppd->lid;
|
||||
|
||||
if (lid) {
|
||||
lid |= rdma_ah_get_path_bits(ah_attr) &
|
||||
((1 << ppd->lmc) - 1);
|
||||
slid = lid;
|
||||
} else {
|
||||
slid = be16_to_cpu(IB_LID_PERMISSIVE);
|
||||
}
|
||||
}
|
||||
|
||||
/* Includes ICRC */
|
||||
dwords = txp->hdr_dwords + payload_dwords;
|
||||
|
||||
/* Build the lrh */
|
||||
sdma_hdr->hdr.hdr_type = HFI1_PKT_TYPE_9B;
|
||||
hfi1_make_ib_hdr(&sdma_hdr->hdr.ibh, lrh0, dwords, dlid, slid);
|
||||
|
||||
/* Build the bth */
|
||||
bth0 = (IB_OPCODE_UD_SEND_ONLY << 24) | (pad_cnt << 20) | priv->pkey;
|
||||
|
||||
ohdr->bth[0] = cpu_to_be32(bth0);
|
||||
ohdr->bth[1] = cpu_to_be32(txp->dqpn);
|
||||
ohdr->bth[2] = cpu_to_be32(mask_psn((u32)txp->txq->sent_txreqs));
|
||||
|
||||
/* Build the deth */
|
||||
ohdr->u.ud.deth[0] = cpu_to_be32(priv->qkey);
|
||||
ohdr->u.ud.deth[1] = cpu_to_be32((txp->entropy <<
|
||||
HFI1_IPOIB_ENTROPY_SHIFT) | sqpn);
|
||||
|
||||
/* Construct the pbc. */
|
||||
sdma_hdr->pbc =
|
||||
cpu_to_le64(create_pbc(ppd,
|
||||
ib_is_sc5(txp->flow.sc5) <<
|
||||
PBC_DC_INFO_SHIFT,
|
||||
0,
|
||||
sc_to_vlt(priv->dd, txp->flow.sc5),
|
||||
dwords - SIZE_OF_CRC +
|
||||
(sizeof(sdma_hdr->pbc) >> 2)));
|
||||
}
|
||||
|
||||
static struct ipoib_txreq *hfi1_ipoib_send_dma_common(struct net_device *dev,
|
||||
struct sk_buff *skb,
|
||||
struct ipoib_txparms *txp)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
struct ipoib_txreq *tx;
|
||||
int ret;
|
||||
|
||||
tx = kmem_cache_alloc_node(priv->txreq_cache,
|
||||
GFP_ATOMIC,
|
||||
priv->dd->node);
|
||||
if (unlikely(!tx))
|
||||
return ERR_PTR(-ENOMEM);
|
||||
|
||||
/* so that we can test if the sdma descriptors are there */
|
||||
tx->txreq.num_desc = 0;
|
||||
tx->priv = priv;
|
||||
tx->txq = txp->txq;
|
||||
tx->skb = skb;
|
||||
|
||||
hfi1_ipoib_build_ib_tx_headers(tx, txp);
|
||||
|
||||
ret = hfi1_ipoib_build_tx_desc(tx, txp);
|
||||
if (likely(!ret)) {
|
||||
if (txp->txq->flow.as_int != txp->flow.as_int) {
|
||||
txp->txq->flow.tx_queue = txp->flow.tx_queue;
|
||||
txp->txq->flow.sc5 = txp->flow.sc5;
|
||||
txp->txq->sde =
|
||||
sdma_select_engine_sc(priv->dd,
|
||||
txp->flow.tx_queue,
|
||||
txp->flow.sc5);
|
||||
}
|
||||
|
||||
return tx;
|
||||
}
|
||||
|
||||
sdma_txclean(priv->dd, &tx->txreq);
|
||||
kmem_cache_free(priv->txreq_cache, tx);
|
||||
|
||||
return ERR_PTR(ret);
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_submit_tx_list(struct net_device *dev,
|
||||
struct hfi1_ipoib_txq *txq)
|
||||
{
|
||||
int ret;
|
||||
u16 count_out;
|
||||
|
||||
ret = sdma_send_txlist(txq->sde,
|
||||
iowait_get_ib_work(&txq->wait),
|
||||
&txq->tx_list,
|
||||
&count_out);
|
||||
if (likely(!ret) || ret == -EBUSY || ret == -ECOMM)
|
||||
return ret;
|
||||
|
||||
dd_dev_warn(txq->priv->dd, "cannot send skb tx list, err %d.\n", ret);
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_flush_tx_list(struct net_device *dev,
|
||||
struct hfi1_ipoib_txq *txq)
|
||||
{
|
||||
int ret = 0;
|
||||
|
||||
if (!list_empty(&txq->tx_list)) {
|
||||
/* Flush the current list */
|
||||
ret = hfi1_ipoib_submit_tx_list(dev, txq);
|
||||
|
||||
if (unlikely(ret))
|
||||
if (ret != -EBUSY)
|
||||
++dev->stats.tx_carrier_errors;
|
||||
}
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_submit_tx(struct hfi1_ipoib_txq *txq,
|
||||
struct ipoib_txreq *tx)
|
||||
{
|
||||
int ret;
|
||||
|
||||
ret = sdma_send_txreq(txq->sde,
|
||||
iowait_get_ib_work(&txq->wait),
|
||||
&tx->txreq,
|
||||
txq->pkts_sent);
|
||||
if (likely(!ret)) {
|
||||
txq->pkts_sent = true;
|
||||
iowait_starve_clear(txq->pkts_sent, &txq->wait);
|
||||
}
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_send_dma_single(struct net_device *dev,
|
||||
struct sk_buff *skb,
|
||||
struct ipoib_txparms *txp)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
struct hfi1_ipoib_txq *txq = txp->txq;
|
||||
struct ipoib_txreq *tx;
|
||||
int ret;
|
||||
|
||||
tx = hfi1_ipoib_send_dma_common(dev, skb, txp);
|
||||
if (IS_ERR(tx)) {
|
||||
int ret = PTR_ERR(tx);
|
||||
|
||||
dev_kfree_skb_any(skb);
|
||||
|
||||
if (ret == -ENOMEM)
|
||||
++dev->stats.tx_errors;
|
||||
else
|
||||
++dev->stats.tx_carrier_errors;
|
||||
|
||||
return NETDEV_TX_OK;
|
||||
}
|
||||
|
||||
ret = hfi1_ipoib_submit_tx(txq, tx);
|
||||
if (likely(!ret)) {
|
||||
trace_sdma_output_ibhdr(tx->priv->dd,
|
||||
&tx->sdma_hdr.hdr,
|
||||
ib_is_sc5(txp->flow.sc5));
|
||||
hfi1_ipoib_check_queue_depth(txq);
|
||||
return NETDEV_TX_OK;
|
||||
}
|
||||
|
||||
txq->pkts_sent = false;
|
||||
|
||||
if (ret == -EBUSY) {
|
||||
list_add_tail(&tx->txreq.list, &txq->tx_list);
|
||||
|
||||
trace_sdma_output_ibhdr(tx->priv->dd,
|
||||
&tx->sdma_hdr.hdr,
|
||||
ib_is_sc5(txp->flow.sc5));
|
||||
hfi1_ipoib_check_queue_depth(txq);
|
||||
return NETDEV_TX_OK;
|
||||
}
|
||||
|
||||
if (ret == -ECOMM) {
|
||||
hfi1_ipoib_check_queue_depth(txq);
|
||||
return NETDEV_TX_OK;
|
||||
}
|
||||
|
||||
sdma_txclean(priv->dd, &tx->txreq);
|
||||
dev_kfree_skb_any(skb);
|
||||
kmem_cache_free(priv->txreq_cache, tx);
|
||||
++dev->stats.tx_carrier_errors;
|
||||
|
||||
return NETDEV_TX_OK;
|
||||
}
|
||||
|
||||
static int hfi1_ipoib_send_dma_list(struct net_device *dev,
|
||||
struct sk_buff *skb,
|
||||
struct ipoib_txparms *txp)
|
||||
{
|
||||
struct hfi1_ipoib_txq *txq = txp->txq;
|
||||
struct ipoib_txreq *tx;
|
||||
|
||||
/* Has the flow changed? */
|
||||
if (txq->flow.as_int != txp->flow.as_int)
|
||||
(void)hfi1_ipoib_flush_tx_list(dev, txq);
|
||||
|
||||
tx = hfi1_ipoib_send_dma_common(dev, skb, txp);
|
||||
if (IS_ERR(tx)) {
|
||||
int ret = PTR_ERR(tx);
|
||||
|
||||
dev_kfree_skb_any(skb);
|
||||
|
||||
if (ret == -ENOMEM)
|
||||
++dev->stats.tx_errors;
|
||||
else
|
||||
++dev->stats.tx_carrier_errors;
|
||||
|
||||
return NETDEV_TX_OK;
|
||||
}
|
||||
|
||||
list_add_tail(&tx->txreq.list, &txq->tx_list);
|
||||
|
||||
hfi1_ipoib_check_queue_depth(txq);
|
||||
|
||||
trace_sdma_output_ibhdr(tx->priv->dd,
|
||||
&tx->sdma_hdr.hdr,
|
||||
ib_is_sc5(txp->flow.sc5));
|
||||
|
||||
if (!netdev_xmit_more())
|
||||
(void)hfi1_ipoib_flush_tx_list(dev, txq);
|
||||
|
||||
return NETDEV_TX_OK;
|
||||
}
|
||||
|
||||
static u8 hfi1_ipoib_calc_entropy(struct sk_buff *skb)
|
||||
{
|
||||
if (skb_transport_header_was_set(skb)) {
|
||||
u8 *hdr = (u8 *)skb_transport_header(skb);
|
||||
|
||||
return (hdr[0] ^ hdr[1] ^ hdr[2] ^ hdr[3]);
|
||||
}
|
||||
|
||||
return (u8)skb_get_queue_mapping(skb);
|
||||
}
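For context, the entropy byte above is just the XOR of the first four transport-header octets (for TCP/UDP, the source and destination ports), so packets of one flow keep hashing to the same value. A tiny sketch with a hypothetical header:

#include <stdio.h>

int main(void)
{
	/* hypothetical transport header: sport 0xc001, dport 0x0050 (80) */
	unsigned char hdr[4] = { 0xc0, 0x01, 0x00, 0x50 };
	unsigned char entropy = hdr[0] ^ hdr[1] ^ hdr[2] ^ hdr[3];

	printf("entropy = 0x%02x\n", entropy); /* 0xc0^0x01^0x00^0x50 = 0x91 */
	return 0;
}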
|
||||
|
||||
int hfi1_ipoib_send_dma(struct net_device *dev,
|
||||
struct sk_buff *skb,
|
||||
struct ib_ah *address,
|
||||
u32 dqpn)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
struct ipoib_txparms txp;
|
||||
struct rdma_netdev *rn = netdev_priv(dev);
|
||||
|
||||
if (unlikely(skb->len > rn->mtu + HFI1_IPOIB_ENCAP_LEN)) {
|
||||
dd_dev_warn(priv->dd, "packet len %d (> %d) too long to send, dropping\n",
|
||||
skb->len,
|
||||
rn->mtu + HFI1_IPOIB_ENCAP_LEN);
|
||||
++dev->stats.tx_dropped;
|
||||
++dev->stats.tx_errors;
|
||||
dev_kfree_skb_any(skb);
|
||||
return NETDEV_TX_OK;
|
||||
}
|
||||
|
||||
txp.dd = priv->dd;
|
||||
txp.ah_attr = &ibah_to_rvtah(address)->attr;
|
||||
txp.ibp = to_iport(priv->device, priv->port_num);
|
||||
txp.txq = &priv->txqs[skb_get_queue_mapping(skb)];
|
||||
txp.dqpn = dqpn;
|
||||
txp.flow.sc5 = txp.ibp->sl_to_sc[rdma_ah_get_sl(txp.ah_attr)];
|
||||
txp.flow.tx_queue = (u8)skb_get_queue_mapping(skb);
|
||||
txp.entropy = hfi1_ipoib_calc_entropy(skb);
|
||||
|
||||
if (netdev_xmit_more() || !list_empty(&txp.txq->tx_list))
|
||||
return hfi1_ipoib_send_dma_list(dev, skb, &txp);
|
||||
|
||||
return hfi1_ipoib_send_dma_single(dev, skb, &txp);
|
||||
}
|
||||
|
||||
/*
|
||||
* hfi1_ipoib_sdma_sleep - ipoib sdma sleep function
|
||||
*
|
||||
* This function gets called from sdma_send_txreq() when there are not enough
|
||||
* sdma descriptors available to send the packet. It adds Tx queue's wait
|
||||
* structure to sdma engine's dmawait list to be woken up when descriptors
|
||||
* become available.
|
||||
*/
|
||||
static int hfi1_ipoib_sdma_sleep(struct sdma_engine *sde,
|
||||
struct iowait_work *wait,
|
||||
struct sdma_txreq *txreq,
|
||||
uint seq,
|
||||
bool pkts_sent)
|
||||
{
|
||||
struct hfi1_ipoib_txq *txq =
|
||||
container_of(wait->iow, struct hfi1_ipoib_txq, wait);
|
||||
|
||||
write_seqlock(&sde->waitlock);
|
||||
|
||||
if (likely(txq->priv->netdev->reg_state == NETREG_REGISTERED)) {
|
||||
if (sdma_progress(sde, seq, txreq)) {
|
||||
write_sequnlock(&sde->waitlock);
|
||||
return -EAGAIN;
|
||||
}
|
||||
|
||||
netif_stop_subqueue(txq->priv->netdev, txq->q_idx);
|
||||
|
||||
if (list_empty(&txq->wait.list))
|
||||
iowait_queue(pkts_sent, wait->iow, &sde->dmawait);
|
||||
|
||||
write_sequnlock(&sde->waitlock);
|
||||
return -EBUSY;
|
||||
}
|
||||
|
||||
write_sequnlock(&sde->waitlock);
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
/*
|
||||
* hfi1_ipoib_sdma_wakeup - ipoib sdma wakeup function
|
||||
*
|
||||
* This function gets called when SDMA descriptors becomes available and Tx
|
||||
* queue's wait structure was previously added to sdma engine's dmawait list.
|
||||
*/
|
||||
static void hfi1_ipoib_sdma_wakeup(struct iowait *wait, int reason)
|
||||
{
|
||||
struct hfi1_ipoib_txq *txq =
|
||||
container_of(wait, struct hfi1_ipoib_txq, wait);
|
||||
|
||||
if (likely(txq->priv->netdev->reg_state == NETREG_REGISTERED))
|
||||
iowait_schedule(wait, system_highpri_wq, WORK_CPU_UNBOUND);
|
||||
}
|
||||
|
||||
static void hfi1_ipoib_flush_txq(struct work_struct *work)
|
||||
{
|
||||
struct iowait_work *ioww =
|
||||
container_of(work, struct iowait_work, iowork);
|
||||
struct iowait *wait = iowait_ioww_to_iow(ioww);
|
||||
struct hfi1_ipoib_txq *txq =
|
||||
container_of(wait, struct hfi1_ipoib_txq, wait);
|
||||
struct net_device *dev = txq->priv->netdev;
|
||||
|
||||
if (likely(dev->reg_state == NETREG_REGISTERED) &&
|
||||
likely(__netif_subqueue_stopped(dev, txq->q_idx)) &&
|
||||
likely(!hfi1_ipoib_flush_tx_list(dev, txq)))
|
||||
netif_wake_subqueue(dev, txq->q_idx);
|
||||
}
|
||||
|
||||
int hfi1_ipoib_txreq_init(struct hfi1_ipoib_dev_priv *priv)
|
||||
{
|
||||
struct net_device *dev = priv->netdev;
|
||||
char buf[HFI1_IPOIB_TXREQ_NAME_LEN];
|
||||
unsigned long tx_ring_size;
|
||||
int i;
|
||||
|
||||
/*
|
||||
* Ring holds 1 less than tx_ring_size
|
||||
* Round up to next power of 2 in order to hold at least tx_queue_len
|
||||
*/
|
||||
tx_ring_size = roundup_pow_of_two((unsigned long)dev->tx_queue_len + 1);
|
||||
|
||||
snprintf(buf, sizeof(buf), "hfi1_%u_ipoib_txreq_cache", priv->dd->unit);
|
||||
priv->txreq_cache = kmem_cache_create(buf,
|
||||
sizeof(struct ipoib_txreq),
|
||||
0,
|
||||
0,
|
||||
NULL);
|
||||
if (!priv->txreq_cache)
|
||||
return -ENOMEM;
|
||||
|
||||
priv->tx_napis = kcalloc_node(dev->num_tx_queues,
|
||||
sizeof(struct napi_struct),
|
||||
GFP_ATOMIC,
|
||||
priv->dd->node);
|
||||
if (!priv->tx_napis)
|
||||
goto free_txreq_cache;
|
||||
|
||||
priv->txqs = kcalloc_node(dev->num_tx_queues,
|
||||
sizeof(struct hfi1_ipoib_txq),
|
||||
GFP_ATOMIC,
|
||||
priv->dd->node);
|
||||
if (!priv->txqs)
|
||||
goto free_tx_napis;
|
||||
|
||||
for (i = 0; i < dev->num_tx_queues; i++) {
|
||||
struct hfi1_ipoib_txq *txq = &priv->txqs[i];
|
||||
|
||||
iowait_init(&txq->wait,
|
||||
0,
|
||||
hfi1_ipoib_flush_txq,
|
||||
NULL,
|
||||
hfi1_ipoib_sdma_sleep,
|
||||
hfi1_ipoib_sdma_wakeup,
|
||||
NULL,
|
||||
NULL);
|
||||
txq->priv = priv;
|
||||
txq->sde = NULL;
|
||||
INIT_LIST_HEAD(&txq->tx_list);
|
||||
atomic64_set(&txq->complete_txreqs, 0);
|
||||
txq->q_idx = i;
|
||||
txq->flow.tx_queue = 0xff;
|
||||
txq->flow.sc5 = 0xff;
|
||||
txq->pkts_sent = false;
|
||||
|
||||
netdev_queue_numa_node_write(netdev_get_tx_queue(dev, i),
|
||||
priv->dd->node);
|
||||
|
||||
txq->tx_ring.items =
|
||||
vzalloc_node(array_size(tx_ring_size,
|
||||
sizeof(struct ipoib_txreq)),
|
||||
priv->dd->node);
|
||||
if (!txq->tx_ring.items)
|
||||
goto free_txqs;
|
||||
|
||||
spin_lock_init(&txq->tx_ring.producer_lock);
|
||||
spin_lock_init(&txq->tx_ring.consumer_lock);
|
||||
txq->tx_ring.max_items = tx_ring_size;
|
||||
|
||||
txq->napi = &priv->tx_napis[i];
|
||||
netif_tx_napi_add(dev, txq->napi,
|
||||
hfi1_ipoib_process_tx_ring,
|
||||
NAPI_POLL_WEIGHT);
|
||||
}
|
||||
|
||||
return 0;
|
||||
|
||||
free_txqs:
|
||||
for (i--; i >= 0; i--) {
|
||||
struct hfi1_ipoib_txq *txq = &priv->txqs[i];
|
||||
|
||||
netif_napi_del(txq->napi);
|
||||
vfree(txq->tx_ring.items);
|
||||
}
|
||||
|
||||
kfree(priv->txqs);
|
||||
priv->txqs = NULL;
|
||||
|
||||
free_tx_napis:
|
||||
kfree(priv->tx_napis);
|
||||
priv->tx_napis = NULL;
|
||||
|
||||
free_txreq_cache:
|
||||
kmem_cache_destroy(priv->txreq_cache);
|
||||
priv->txreq_cache = NULL;
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
static void hfi1_ipoib_drain_tx_list(struct hfi1_ipoib_txq *txq)
|
||||
{
|
||||
struct sdma_txreq *txreq;
|
||||
struct sdma_txreq *txreq_tmp;
|
||||
atomic64_t *complete_txreqs = &txq->complete_txreqs;
|
||||
|
||||
list_for_each_entry_safe(txreq, txreq_tmp, &txq->tx_list, list) {
|
||||
struct ipoib_txreq *tx =
|
||||
container_of(txreq, struct ipoib_txreq, txreq);
|
||||
|
||||
list_del(&txreq->list);
|
||||
sdma_txclean(txq->priv->dd, &tx->txreq);
|
||||
dev_kfree_skb_any(tx->skb);
|
||||
kmem_cache_free(txq->priv->txreq_cache, tx);
|
||||
atomic64_inc(complete_txreqs);
|
||||
}
|
||||
|
||||
if (hfi1_ipoib_txreqs(txq->sent_txreqs, atomic64_read(complete_txreqs)))
|
||||
dd_dev_warn(txq->priv->dd,
|
||||
"txq %d not empty found %llu requests\n",
|
||||
txq->q_idx,
|
||||
hfi1_ipoib_txreqs(txq->sent_txreqs,
|
||||
atomic64_read(complete_txreqs)));
|
||||
}
|
||||
|
||||
void hfi1_ipoib_txreq_deinit(struct hfi1_ipoib_dev_priv *priv)
|
||||
{
|
||||
int i;
|
||||
|
||||
for (i = 0; i < priv->netdev->num_tx_queues; i++) {
|
||||
struct hfi1_ipoib_txq *txq = &priv->txqs[i];
|
||||
|
||||
iowait_cancel_work(&txq->wait);
|
||||
iowait_sdma_drain(&txq->wait);
|
||||
hfi1_ipoib_drain_tx_list(txq);
|
||||
netif_napi_del(txq->napi);
|
||||
(void)hfi1_ipoib_drain_tx_ring(txq, txq->tx_ring.max_items);
|
||||
vfree(txq->tx_ring.items);
|
||||
}
|
||||
|
||||
kfree(priv->txqs);
|
||||
priv->txqs = NULL;
|
||||
|
||||
kfree(priv->tx_napis);
|
||||
priv->tx_napis = NULL;
|
||||
|
||||
kmem_cache_destroy(priv->txreq_cache);
|
||||
priv->txreq_cache = NULL;
|
||||
}
|
||||
|
||||
void hfi1_ipoib_napi_tx_enable(struct net_device *dev)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
int i;
|
||||
|
||||
for (i = 0; i < dev->num_tx_queues; i++) {
|
||||
struct hfi1_ipoib_txq *txq = &priv->txqs[i];
|
||||
|
||||
napi_enable(txq->napi);
|
||||
}
|
||||
}
|
||||
|
||||
void hfi1_ipoib_napi_tx_disable(struct net_device *dev)
|
||||
{
|
||||
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
|
||||
int i;
|
||||
|
||||
for (i = 0; i < dev->num_tx_queues; i++) {
|
||||
struct hfi1_ipoib_txq *txq = &priv->txqs[i];
|
||||
|
||||
napi_disable(txq->napi);
|
||||
(void)hfi1_ipoib_drain_tx_ring(txq, txq->tx_ring.max_items);
|
||||
}
|
||||
}
|
@@ -1,6 +1,6 @@
|
||||
// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
|
||||
/*
|
||||
* Copyright(c) 2018 Intel Corporation.
|
||||
* Copyright(c) 2018 - 2020 Intel Corporation.
|
||||
*
|
||||
* This file is provided under a dual BSD/GPLv2 license. When using or
|
||||
* redistributing this file, you may do so under either license.
|
||||
@@ -49,6 +49,7 @@
|
||||
#include "hfi.h"
|
||||
#include "affinity.h"
|
||||
#include "sdma.h"
|
||||
#include "netdev.h"
|
||||
|
||||
/**
|
||||
* msix_initialize() - Calculate, request and configure MSIx IRQs
|
||||
@@ -69,7 +70,7 @@ int msix_initialize(struct hfi1_devdata *dd)
|
||||
* one for each VNIC context
|
||||
* ...any new IRQs should be added here.
|
||||
*/
|
||||
total = 1 + dd->num_sdma + dd->n_krcv_queues + dd->num_vnic_contexts;
|
||||
total = 1 + dd->num_sdma + dd->n_krcv_queues + dd->num_netdev_contexts;
|
||||
|
||||
if (total >= CCE_NUM_MSIX_VECTORS)
|
||||
return -EINVAL;
|
||||
@@ -140,7 +141,7 @@ static int msix_request_irq(struct hfi1_devdata *dd, void *arg,
|
||||
ret = pci_request_irq(dd->pcidev, nr, handler, thread, arg, name);
|
||||
if (ret) {
|
||||
dd_dev_err(dd,
|
||||
"%s: request for IRQ %d failed, MSIx %lu, err %d\n",
|
||||
"%s: request for IRQ %d failed, MSIx %lx, err %d\n",
|
||||
name, irq, nr, ret);
|
||||
spin_lock(&dd->msix_info.msix_lock);
|
||||
__clear_bit(nr, dd->msix_info.in_use_msix);
|
||||
@@ -160,7 +161,7 @@ static int msix_request_irq(struct hfi1_devdata *dd, void *arg,
|
||||
/* This is a request, so a failure is not fatal */
|
||||
ret = hfi1_get_irq_affinity(dd, me);
|
||||
if (ret)
|
||||
dd_dev_err(dd, "unable to pin IRQ %d\n", ret);
|
||||
dd_dev_err(dd, "%s: unable to pin IRQ %d\n", name, ret);
|
||||
|
||||
return nr;
|
||||
}
|
||||
@@ -171,7 +172,8 @@ static int msix_request_rcd_irq_common(struct hfi1_ctxtdata *rcd,
|
||||
const char *name)
|
||||
{
|
||||
int nr = msix_request_irq(rcd->dd, rcd, handler, thread,
|
||||
IRQ_RCVCTXT, name);
|
||||
rcd->is_vnic ? IRQ_NETDEVCTXT : IRQ_RCVCTXT,
|
||||
name);
|
||||
if (nr < 0)
|
||||
return nr;
|
||||
|
||||
@@ -203,6 +205,21 @@ int msix_request_rcd_irq(struct hfi1_ctxtdata *rcd)
|
||||
receive_context_thread, name);
|
||||
}
|
||||
|
||||
/**
|
||||
* msix_netdev_request_rcd_irq() - Helper function for RCVAVAIL IRQs
* for netdev context
* @rcd: valid netdev context
|
||||
*/
|
||||
int msix_netdev_request_rcd_irq(struct hfi1_ctxtdata *rcd)
|
||||
{
|
||||
char name[MAX_NAME_SIZE];
|
||||
|
||||
snprintf(name, sizeof(name), DRIVER_NAME "_%d nd kctxt%d",
|
||||
rcd->dd->unit, rcd->ctxt);
|
||||
return msix_request_rcd_irq_common(rcd, receive_context_interrupt_napi,
|
||||
NULL, name);
|
||||
}
|
||||
|
||||
/**
|
||||
* msix_request_sdma_irq() - Helper for getting SDMA IRQ resources
|
||||
* @sde: valid sdma engine
|
||||
@@ -355,15 +372,16 @@ void msix_clean_up_interrupts(struct hfi1_devdata *dd)
|
||||
}
|
||||
|
||||
/**
|
||||
* msix_vnic_syncrhonize_irq() - Vnic IRQ synchronize
|
||||
* msix_netdev_synchronize_irq() - netdev IRQ synchronize
|
||||
* @dd: valid devdata
|
||||
*/
|
||||
void msix_vnic_synchronize_irq(struct hfi1_devdata *dd)
|
||||
void msix_netdev_synchronize_irq(struct hfi1_devdata *dd)
|
||||
{
|
||||
int i;
|
||||
int ctxt_count = hfi1_netdev_ctxt_count(dd);
|
||||
|
||||
for (i = 0; i < dd->vnic.num_ctxt; i++) {
|
||||
struct hfi1_ctxtdata *rcd = dd->vnic.ctxt[i];
|
||||
for (i = 0; i < ctxt_count; i++) {
|
||||
struct hfi1_ctxtdata *rcd = hfi1_netdev_get_ctxt(dd, i);
|
||||
struct hfi1_msix_entry *me;
|
||||
|
||||
me = &dd->msix_info.msix_entries[rcd->msix_intr];
|
||||
|
@@ -1,6 +1,6 @@
|
||||
/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
|
||||
/*
|
||||
* Copyright(c) 2018 Intel Corporation.
|
||||
* Copyright(c) 2018 - 2020 Intel Corporation.
|
||||
*
|
||||
* This file is provided under a dual BSD/GPLv2 license. When using or
|
||||
* redistributing this file, you may do so under either license.
|
||||
@@ -59,7 +59,8 @@ int msix_request_rcd_irq(struct hfi1_ctxtdata *rcd);
|
||||
int msix_request_sdma_irq(struct sdma_engine *sde);
|
||||
void msix_free_irq(struct hfi1_devdata *dd, u8 msix_intr);
|
||||
|
||||
/* VNIC interface */
|
||||
void msix_vnic_synchronize_irq(struct hfi1_devdata *dd);
|
||||
/* Netdev interface */
|
||||
void msix_netdev_synchronize_irq(struct hfi1_devdata *dd);
|
||||
int msix_netdev_request_rcd_irq(struct hfi1_ctxtdata *rcd);
|
||||
|
||||
#endif
|
||||
drivers/infiniband/hw/hfi1/netdev.h (new file, 118 lines)
@@ -0,0 +1,118 @@
/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
|
||||
/*
|
||||
* Copyright(c) 2020 Intel Corporation.
|
||||
*
|
||||
*/
|
||||
|
||||
#ifndef HFI1_NETDEV_H
|
||||
#define HFI1_NETDEV_H
|
||||
|
||||
#include "hfi.h"
|
||||
|
||||
#include <linux/netdevice.h>
|
||||
#include <linux/xarray.h>
|
||||
|
||||
/**
|
||||
* struct hfi1_netdev_rxq - Receive Queue for HFI
|
||||
* dummy netdev. Both IPoIB and VNIC netdevices will be working on
|
||||
* top of this device.
|
||||
* @napi: napi object
|
||||
* @priv: ptr to netdev_priv
|
||||
* @rcd: ptr to receive context data
|
||||
*/
|
||||
struct hfi1_netdev_rxq {
|
||||
struct napi_struct napi;
|
||||
struct hfi1_netdev_priv *priv;
|
||||
struct hfi1_ctxtdata *rcd;
|
||||
};
|
||||
|
||||
/*
|
||||
* Number of netdev contexts used. Ensure it is less than or equal to
|
||||
* max queues supported by VNIC (HFI1_VNIC_MAX_QUEUE).
|
||||
*/
|
||||
#define HFI1_MAX_NETDEV_CTXTS 8
|
||||
|
||||
/* Number of NETDEV RSM entries */
|
||||
#define NUM_NETDEV_MAP_ENTRIES HFI1_MAX_NETDEV_CTXTS
|
||||
|
||||
/**
|
||||
* struct hfi1_netdev_priv: data required to setup and run HFI netdev.
|
||||
* @dd: hfi1_devdata
|
||||
* @rxq: pointer to dummy netdev receive queues.
|
||||
* @num_rx_q: number of receive queues
|
||||
* @rmt_start: first free index in RMT Array
|
||||
* @msix_start: first free MSI-X interrupt vector.
|
||||
* @dev_tbl: netdev table of unique identifiers used by VNIC and IPoIB VLANs.
|
||||
* @enabled: atomic counter of netdevs enabling receive queues.
|
||||
* When 0 NAPI will be disabled.
|
||||
* @netdevs: atomic counter of netdevs using dummy netdev.
|
||||
* When 0 receive queues will be freed.
|
||||
*/
|
||||
struct hfi1_netdev_priv {
|
||||
struct hfi1_devdata *dd;
|
||||
struct hfi1_netdev_rxq *rxq;
|
||||
int num_rx_q;
|
||||
int rmt_start;
|
||||
struct xarray dev_tbl;
|
||||
/* count of enabled napi polls */
|
||||
atomic_t enabled;
|
||||
/* count of netdevs on top */
|
||||
atomic_t netdevs;
|
||||
};
|
||||
|
||||
static inline
|
||||
struct hfi1_netdev_priv *hfi1_netdev_priv(struct net_device *dev)
|
||||
{
|
||||
return (struct hfi1_netdev_priv *)&dev[1];
|
||||
}
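The &dev[1] cast works only because hfi1_netdev_alloc() below allocates the private area in the same block, immediately after the net_device. A minimal sketch of that layout trick with stand-in types (the names are illustrative, not kernel structures):

#include <stdio.h>
#include <stdlib.h>

struct fake_dev  { long a, b; };	/* stands in for struct net_device */
struct fake_priv { int x; };		/* stands in for the private data */

int main(void)
{
	/* one allocation: device struct followed by its private data */
	struct fake_dev *dev = calloc(1, sizeof(struct fake_dev) +
					 sizeof(struct fake_priv));
	struct fake_priv *priv = (struct fake_priv *)&dev[1];

	priv->x = 42;	/* lands sizeof(struct fake_dev) bytes past dev */
	printf("%d\n", priv->x);
	free(dev);
	return 0;
}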
|
||||
|
||||
static inline
|
||||
int hfi1_netdev_ctxt_count(struct hfi1_devdata *dd)
|
||||
{
|
||||
struct hfi1_netdev_priv *priv = hfi1_netdev_priv(dd->dummy_netdev);
|
||||
|
||||
return priv->num_rx_q;
|
||||
}
|
||||
|
||||
static inline
|
||||
struct hfi1_ctxtdata *hfi1_netdev_get_ctxt(struct hfi1_devdata *dd, int ctxt)
|
||||
{
|
||||
struct hfi1_netdev_priv *priv = hfi1_netdev_priv(dd->dummy_netdev);
|
||||
|
||||
return priv->rxq[ctxt].rcd;
|
||||
}
|
||||
|
||||
static inline
|
||||
int hfi1_netdev_get_free_rmt_idx(struct hfi1_devdata *dd)
|
||||
{
|
||||
struct hfi1_netdev_priv *priv = hfi1_netdev_priv(dd->dummy_netdev);
|
||||
|
||||
return priv->rmt_start;
|
||||
}
|
||||
|
||||
static inline
|
||||
void hfi1_netdev_set_free_rmt_idx(struct hfi1_devdata *dd, int rmt_idx)
|
||||
{
|
||||
struct hfi1_netdev_priv *priv = hfi1_netdev_priv(dd->dummy_netdev);
|
||||
|
||||
priv->rmt_start = rmt_idx;
|
||||
}
|
||||
|
||||
u32 hfi1_num_netdev_contexts(struct hfi1_devdata *dd, u32 available_contexts,
|
||||
struct cpumask *cpu_mask);
|
||||
|
||||
void hfi1_netdev_enable_queues(struct hfi1_devdata *dd);
|
||||
void hfi1_netdev_disable_queues(struct hfi1_devdata *dd);
|
||||
int hfi1_netdev_rx_init(struct hfi1_devdata *dd);
|
||||
int hfi1_netdev_rx_destroy(struct hfi1_devdata *dd);
|
||||
int hfi1_netdev_alloc(struct hfi1_devdata *dd);
|
||||
void hfi1_netdev_free(struct hfi1_devdata *dd);
|
||||
int hfi1_netdev_add_data(struct hfi1_devdata *dd, int id, void *data);
|
||||
void *hfi1_netdev_remove_data(struct hfi1_devdata *dd, int id);
|
||||
void *hfi1_netdev_get_data(struct hfi1_devdata *dd, int id);
|
||||
void *hfi1_netdev_get_first_data(struct hfi1_devdata *dd, int *start_id);
|
||||
|
||||
/* chip.c */
|
||||
int hfi1_netdev_rx_napi(struct napi_struct *napi, int budget);
|
||||
|
||||
#endif /* HFI1_NETDEV_H */
drivers/infiniband/hw/hfi1/netdev_rx.c (new file, 481 lines)
@@ -0,0 +1,481 @@
// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
|
||||
/*
|
||||
* Copyright(c) 2020 Intel Corporation.
|
||||
*
|
||||
*/
|
||||
|
||||
/*
|
||||
* This file contains HFI1 support for netdev RX functionality
|
||||
*/
|
||||
|
||||
#include "sdma.h"
|
||||
#include "verbs.h"
|
||||
#include "netdev.h"
|
||||
#include "hfi.h"
|
||||
|
||||
#include <linux/netdevice.h>
|
||||
#include <linux/etherdevice.h>
|
||||
#include <rdma/ib_verbs.h>
|
||||
|
||||
static int hfi1_netdev_setup_ctxt(struct hfi1_netdev_priv *priv,
|
||||
struct hfi1_ctxtdata *uctxt)
|
||||
{
|
||||
unsigned int rcvctrl_ops;
|
||||
struct hfi1_devdata *dd = priv->dd;
|
||||
int ret;
|
||||
|
||||
uctxt->rhf_rcv_function_map = netdev_rhf_rcv_functions;
|
||||
uctxt->do_interrupt = &handle_receive_interrupt_napi_sp;
|
||||
|
||||
/* Now allocate the RcvHdr queue and eager buffers. */
|
||||
ret = hfi1_create_rcvhdrq(dd, uctxt);
|
||||
if (ret)
|
||||
goto done;
|
||||
|
||||
ret = hfi1_setup_eagerbufs(uctxt);
|
||||
if (ret)
|
||||
goto done;
|
||||
|
||||
clear_rcvhdrtail(uctxt);
|
||||
|
||||
rcvctrl_ops = HFI1_RCVCTRL_CTXT_DIS;
|
||||
rcvctrl_ops |= HFI1_RCVCTRL_INTRAVAIL_DIS;
|
||||
|
||||
if (!HFI1_CAP_KGET_MASK(uctxt->flags, MULTI_PKT_EGR))
|
||||
rcvctrl_ops |= HFI1_RCVCTRL_ONE_PKT_EGR_ENB;
|
||||
if (HFI1_CAP_KGET_MASK(uctxt->flags, NODROP_EGR_FULL))
|
||||
rcvctrl_ops |= HFI1_RCVCTRL_NO_EGR_DROP_ENB;
|
||||
if (HFI1_CAP_KGET_MASK(uctxt->flags, NODROP_RHQ_FULL))
|
||||
rcvctrl_ops |= HFI1_RCVCTRL_NO_RHQ_DROP_ENB;
|
||||
if (HFI1_CAP_KGET_MASK(uctxt->flags, DMA_RTAIL))
|
||||
rcvctrl_ops |= HFI1_RCVCTRL_TAILUPD_ENB;
|
||||
|
||||
hfi1_rcvctrl(uctxt->dd, rcvctrl_ops, uctxt);
|
||||
done:
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int hfi1_netdev_allocate_ctxt(struct hfi1_devdata *dd,
|
||||
struct hfi1_ctxtdata **ctxt)
|
||||
{
|
||||
struct hfi1_ctxtdata *uctxt;
|
||||
int ret;
|
||||
|
||||
if (dd->flags & HFI1_FROZEN)
|
||||
return -EIO;
|
||||
|
||||
ret = hfi1_create_ctxtdata(dd->pport, dd->node, &uctxt);
|
||||
if (ret < 0) {
|
||||
dd_dev_err(dd, "Unable to create ctxtdata, failing open\n");
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
uctxt->flags = HFI1_CAP_KGET(MULTI_PKT_EGR) |
|
||||
HFI1_CAP_KGET(NODROP_RHQ_FULL) |
|
||||
HFI1_CAP_KGET(NODROP_EGR_FULL) |
|
||||
HFI1_CAP_KGET(DMA_RTAIL);
|
||||
/* Netdev contexts are always NO_RDMA_RTAIL */
|
||||
uctxt->fast_handler = handle_receive_interrupt_napi_fp;
|
||||
uctxt->slow_handler = handle_receive_interrupt_napi_sp;
|
||||
hfi1_set_seq_cnt(uctxt, 1);
|
||||
uctxt->is_vnic = true;
|
||||
|
||||
hfi1_stats.sps_ctxts++;
|
||||
|
||||
dd_dev_info(dd, "created netdev context %d\n", uctxt->ctxt);
|
||||
*ctxt = uctxt;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void hfi1_netdev_deallocate_ctxt(struct hfi1_devdata *dd,
|
||||
struct hfi1_ctxtdata *uctxt)
|
||||
{
|
||||
flush_wc();
|
||||
|
||||
/*
|
||||
* Disable receive context and interrupt available, reset all
|
||||
* RcvCtxtCtrl bits to default values.
|
||||
*/
|
||||
hfi1_rcvctrl(dd, HFI1_RCVCTRL_CTXT_DIS |
|
||||
HFI1_RCVCTRL_TIDFLOW_DIS |
|
||||
HFI1_RCVCTRL_INTRAVAIL_DIS |
|
||||
HFI1_RCVCTRL_ONE_PKT_EGR_DIS |
|
||||
HFI1_RCVCTRL_NO_RHQ_DROP_DIS |
|
||||
HFI1_RCVCTRL_NO_EGR_DROP_DIS, uctxt);
|
||||
|
||||
if (uctxt->msix_intr != CCE_NUM_MSIX_VECTORS)
|
||||
msix_free_irq(dd, uctxt->msix_intr);
|
||||
|
||||
uctxt->msix_intr = CCE_NUM_MSIX_VECTORS;
|
||||
uctxt->event_flags = 0;
|
||||
|
||||
hfi1_clear_tids(uctxt);
|
||||
hfi1_clear_ctxt_pkey(dd, uctxt);
|
||||
|
||||
hfi1_stats.sps_ctxts--;
|
||||
|
||||
hfi1_free_ctxt(uctxt);
|
||||
}
|
||||
|
||||
static int hfi1_netdev_allot_ctxt(struct hfi1_netdev_priv *priv,
|
||||
struct hfi1_ctxtdata **ctxt)
|
||||
{
|
||||
int rc;
|
||||
struct hfi1_devdata *dd = priv->dd;
|
||||
|
||||
rc = hfi1_netdev_allocate_ctxt(dd, ctxt);
|
||||
if (rc) {
|
||||
dd_dev_err(dd, "netdev ctxt alloc failed %d\n", rc);
|
||||
return rc;
|
||||
}
|
||||
|
||||
rc = hfi1_netdev_setup_ctxt(priv, *ctxt);
|
||||
if (rc) {
|
||||
dd_dev_err(dd, "netdev ctxt setup failed %d\n", rc);
|
||||
hfi1_netdev_deallocate_ctxt(dd, *ctxt);
|
||||
*ctxt = NULL;
|
||||
}
|
||||
|
||||
return rc;
|
||||
}
|
||||
|
||||
/**
|
||||
* hfi1_num_netdev_contexts - Count of netdev recv contexts to use.
|
||||
* @dd: device on which to allocate netdev contexts
|
||||
* @available_contexts: count of available receive contexts
|
||||
* @cpu_mask: mask of possible cpus to include for contexts
|
||||
*
|
||||
* Return: count of physical cores on a node or the remaining available recv
|
||||
* contexts for netdev recv context usage up to the maximum of
|
||||
* HFI1_MAX_NETDEV_CTXTS.
|
||||
* A value of 0 can be returned when acceleration is explicitly turned off,
|
||||
* a memory allocation error occurs or when there are no available contexts.
|
||||
*
|
||||
*/
|
||||
u32 hfi1_num_netdev_contexts(struct hfi1_devdata *dd, u32 available_contexts,
|
||||
struct cpumask *cpu_mask)
|
||||
{
|
||||
cpumask_var_t node_cpu_mask;
|
||||
unsigned int available_cpus;
|
||||
|
||||
if (!HFI1_CAP_IS_KSET(AIP))
|
||||
return 0;
|
||||
|
||||
/* Always give user contexts priority over netdev contexts */
|
||||
if (available_contexts == 0) {
|
||||
dd_dev_info(dd, "No receive contexts available for netdevs.\n");
|
||||
return 0;
|
||||
}
|
||||
|
||||
if (!zalloc_cpumask_var(&node_cpu_mask, GFP_KERNEL)) {
|
||||
dd_dev_err(dd, "Unable to allocate cpu_mask for netdevs.\n");
|
||||
return 0;
|
||||
}
|
||||
|
||||
cpumask_and(node_cpu_mask, cpu_mask,
|
||||
cpumask_of_node(pcibus_to_node(dd->pcidev->bus)));
|
||||
|
||||
available_cpus = cpumask_weight(node_cpu_mask);
|
||||
|
||||
free_cpumask_var(node_cpu_mask);
|
||||
|
||||
return min3(available_cpus, available_contexts,
|
||||
(u32)HFI1_MAX_NETDEV_CTXTS);
|
||||
}
|
||||
|
||||
static int hfi1_netdev_rxq_init(struct net_device *dev)
|
||||
{
|
||||
int i;
|
||||
int rc;
|
||||
struct hfi1_netdev_priv *priv = hfi1_netdev_priv(dev);
|
||||
struct hfi1_devdata *dd = priv->dd;
|
||||
|
||||
priv->num_rx_q = dd->num_netdev_contexts;
|
||||
priv->rxq = kcalloc_node(priv->num_rx_q, sizeof(struct hfi1_netdev_rxq),
|
||||
GFP_KERNEL, dd->node);
|
||||
|
||||
if (!priv->rxq) {
|
||||
dd_dev_err(dd, "Unable to allocate netdev queue data\n");
|
||||
return (-ENOMEM);
|
||||
}
|
||||
|
||||
for (i = 0; i < priv->num_rx_q; i++) {
|
||||
struct hfi1_netdev_rxq *rxq = &priv->rxq[i];
|
||||
|
||||
rc = hfi1_netdev_allot_ctxt(priv, &rxq->rcd);
|
||||
if (rc)
|
||||
goto bail_context_irq_failure;
|
||||
|
||||
hfi1_rcd_get(rxq->rcd);
|
||||
rxq->priv = priv;
|
||||
rxq->rcd->napi = &rxq->napi;
|
||||
dd_dev_info(dd, "Setting rcv queue %d napi to context %d\n",
|
||||
i, rxq->rcd->ctxt);
|
||||
/*
|
||||
* Disable BUSY_POLL on this NAPI as this is not supported
|
||||
* right now.
|
||||
*/
|
||||
set_bit(NAPI_STATE_NO_BUSY_POLL, &rxq->napi.state);
|
||||
netif_napi_add(dev, &rxq->napi, hfi1_netdev_rx_napi, 64);
|
||||
rc = msix_netdev_request_rcd_irq(rxq->rcd);
|
||||
if (rc)
|
||||
goto bail_context_irq_failure;
|
||||
}
|
||||
|
||||
return 0;
|
||||
|
||||
bail_context_irq_failure:
|
||||
dd_dev_err(dd, "Unable to allot receive context\n");
|
||||
for (; i >= 0; i--) {
|
||||
struct hfi1_netdev_rxq *rxq = &priv->rxq[i];
|
||||
|
||||
if (rxq->rcd) {
|
||||
hfi1_netdev_deallocate_ctxt(dd, rxq->rcd);
|
||||
hfi1_rcd_put(rxq->rcd);
|
||||
rxq->rcd = NULL;
|
||||
}
|
||||
}
|
||||
kfree(priv->rxq);
|
||||
priv->rxq = NULL;
|
||||
|
||||
return rc;
|
||||
}
|
||||
|
||||
static void hfi1_netdev_rxq_deinit(struct net_device *dev)
|
||||
{
|
||||
int i;
|
||||
struct hfi1_netdev_priv *priv = hfi1_netdev_priv(dev);
|
||||
struct hfi1_devdata *dd = priv->dd;
|
||||
|
||||
for (i = 0; i < priv->num_rx_q; i++) {
|
||||
struct hfi1_netdev_rxq *rxq = &priv->rxq[i];
|
||||
|
||||
netif_napi_del(&rxq->napi);
|
||||
hfi1_netdev_deallocate_ctxt(dd, rxq->rcd);
|
||||
hfi1_rcd_put(rxq->rcd);
|
||||
rxq->rcd = NULL;
|
||||
}
|
||||
|
||||
kfree(priv->rxq);
|
||||
priv->rxq = NULL;
|
||||
priv->num_rx_q = 0;
|
||||
}
|
||||
|
||||
static void enable_queues(struct hfi1_netdev_priv *priv)
|
||||
{
|
||||
int i;
|
||||
|
||||
for (i = 0; i < priv->num_rx_q; i++) {
|
||||
struct hfi1_netdev_rxq *rxq = &priv->rxq[i];
|
||||
|
||||
dd_dev_info(priv->dd, "enabling queue %d on context %d\n", i,
|
||||
rxq->rcd->ctxt);
|
||||
napi_enable(&rxq->napi);
|
||||
hfi1_rcvctrl(priv->dd,
|
||||
HFI1_RCVCTRL_CTXT_ENB | HFI1_RCVCTRL_INTRAVAIL_ENB,
|
||||
rxq->rcd);
|
||||
}
|
||||
}
|
||||
|
||||
static void disable_queues(struct hfi1_netdev_priv *priv)
|
||||
{
|
||||
int i;
|
||||
|
||||
msix_netdev_synchronize_irq(priv->dd);
|
||||
|
||||
for (i = 0; i < priv->num_rx_q; i++) {
|
||||
struct hfi1_netdev_rxq *rxq = &priv->rxq[i];
|
||||
|
||||
dd_dev_info(priv->dd, "disabling queue %d on context %d\n", i,
|
||||
rxq->rcd->ctxt);
|
||||
|
||||
/* wait for napi if it was scheduled */
|
||||
hfi1_rcvctrl(priv->dd,
|
||||
HFI1_RCVCTRL_CTXT_DIS | HFI1_RCVCTRL_INTRAVAIL_DIS,
|
||||
rxq->rcd);
|
||||
napi_synchronize(&rxq->napi);
|
||||
napi_disable(&rxq->napi);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* hfi1_netdev_rx_init - Increments netdevs counter. When called the first time,
|
||||
* it allocates receive queue data and calls netif_napi_add
|
||||
* for each queue.
|
||||
*
|
||||
* @dd: hfi1 dev data
|
||||
*/
|
||||
int hfi1_netdev_rx_init(struct hfi1_devdata *dd)
|
||||
{
|
||||
struct hfi1_netdev_priv *priv = hfi1_netdev_priv(dd->dummy_netdev);
|
||||
int res;
|
||||
|
||||
if (atomic_fetch_inc(&priv->netdevs))
|
||||
return 0;
|
||||
|
||||
mutex_lock(&hfi1_mutex);
|
||||
init_dummy_netdev(dd->dummy_netdev);
|
||||
res = hfi1_netdev_rxq_init(dd->dummy_netdev);
|
||||
mutex_unlock(&hfi1_mutex);
|
||||
return res;
|
||||
}
|
||||
|
||||
/**
|
||||
* hfi1_netdev_rx_destroy - Decrements netdevs counter; when it reaches 0,
* napi is deleted and the receive queues memory is freed.
|
||||
*
|
||||
* @dd: hfi1 dev data
|
||||
*/
|
||||
int hfi1_netdev_rx_destroy(struct hfi1_devdata *dd)
|
||||
{
|
||||
struct hfi1_netdev_priv *priv = hfi1_netdev_priv(dd->dummy_netdev);
|
||||
|
||||
/* destroy the RX queues only if it is the last netdev going away */
|
||||
if (atomic_fetch_add_unless(&priv->netdevs, -1, 0) == 1) {
|
||||
mutex_lock(&hfi1_mutex);
|
||||
hfi1_netdev_rxq_deinit(dd->dummy_netdev);
|
||||
mutex_unlock(&hfi1_mutex);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}

/**
 * hfi1_netdev_alloc - Allocates the netdev and its private data. It is
 * required because the RMT index and MSI-X interrupt can be set only
 * during driver initialization.
 *
 * @dd: hfi1 dev data
 */
int hfi1_netdev_alloc(struct hfi1_devdata *dd)
{
        struct hfi1_netdev_priv *priv;
        const int netdev_size = sizeof(*dd->dummy_netdev) +
                sizeof(struct hfi1_netdev_priv);

        dd_dev_info(dd, "allocating netdev size %d\n", netdev_size);
        dd->dummy_netdev = kcalloc_node(1, netdev_size, GFP_KERNEL, dd->node);

        if (!dd->dummy_netdev)
                return -ENOMEM;

        priv = hfi1_netdev_priv(dd->dummy_netdev);
        priv->dd = dd;
        xa_init(&priv->dev_tbl);
        atomic_set(&priv->enabled, 0);
        atomic_set(&priv->netdevs, 0);

        return 0;
}

void hfi1_netdev_free(struct hfi1_devdata *dd)
{
        if (dd->dummy_netdev) {
                dd_dev_info(dd, "hfi1 netdev freed\n");
                free_netdev(dd->dummy_netdev);
                dd->dummy_netdev = NULL;
        }
}

/**
 * hfi1_netdev_enable_queues - This is the napi enable function.
 * It enables the napi objects associated with the queues.
 * When at least one device has called it, it increments the atomic counter.
 * The disable function decrements the counter and, when it reaches 0,
 * calls napi_disable for every queue.
 *
 * @dd: hfi1 dev data
 */
void hfi1_netdev_enable_queues(struct hfi1_devdata *dd)
{
        struct hfi1_netdev_priv *priv;

        if (!dd->dummy_netdev)
                return;

        priv = hfi1_netdev_priv(dd->dummy_netdev);
        if (atomic_fetch_inc(&priv->enabled))
                return;

        mutex_lock(&hfi1_mutex);
        enable_queues(priv);
        mutex_unlock(&hfi1_mutex);
}

void hfi1_netdev_disable_queues(struct hfi1_devdata *dd)
{
        struct hfi1_netdev_priv *priv;

        if (!dd->dummy_netdev)
                return;

        priv = hfi1_netdev_priv(dd->dummy_netdev);
        if (atomic_dec_if_positive(&priv->enabled))
                return;

        mutex_lock(&hfi1_mutex);
        disable_queues(priv);
        mutex_unlock(&hfi1_mutex);
}
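
The two counters above (priv->netdevs for the shared RX queues, priv->enabled for NAPI) are what allow the VNIC and accelerated IPoIB paths to share one dummy netdev. Below is a minimal usage sketch, not part of the patch; hfi1_example_netdev_up() and hfi1_example_netdev_down() are hypothetical callers invented for illustration.

/* Illustrative sketch only - not part of this patch. */
static int hfi1_example_netdev_up(struct hfi1_devdata *dd)
{
        /* first caller allocates the shared RX queues, later callers
         * only bump priv->netdevs */
        int rc = hfi1_netdev_rx_init(dd);

        if (rc)
                return rc;

        /* first caller here enables NAPI and the receive contexts */
        hfi1_netdev_enable_queues(dd);
        return 0;
}

static void hfi1_example_netdev_down(struct hfi1_devdata *dd)
{
        /* last caller disables NAPI, then frees the shared RX queues */
        hfi1_netdev_disable_queues(dd);
        hfi1_netdev_rx_destroy(dd);
}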

/**
 * hfi1_netdev_add_data - Registers data with a unique identifier to be
 * requested later; this is needed for the VNIC and IPoIB VLAN
 * implementations.
 * This call is protected by mutex idr_lock.
 *
 * @dd: hfi1 dev data
 * @id: requested integer id up to INT_MAX
 * @data: data to be associated with index
 */
int hfi1_netdev_add_data(struct hfi1_devdata *dd, int id, void *data)
{
        struct hfi1_netdev_priv *priv = hfi1_netdev_priv(dd->dummy_netdev);

        return xa_insert(&priv->dev_tbl, id, data, GFP_NOWAIT);
}

/**
 * hfi1_netdev_remove_data - Removes data with the previously given id.
 * Returns the reference to the removed entry.
 *
 * @dd: hfi1 dev data
 * @id: requested integer id up to INT_MAX
 */
void *hfi1_netdev_remove_data(struct hfi1_devdata *dd, int id)
{
        struct hfi1_netdev_priv *priv = hfi1_netdev_priv(dd->dummy_netdev);

        return xa_erase(&priv->dev_tbl, id);
}

/**
 * hfi1_netdev_get_data - Gets the data with the given id
 *
 * @dd: hfi1 dev data
 * @id: requested integer id up to INT_MAX
 */
void *hfi1_netdev_get_data(struct hfi1_devdata *dd, int id)
{
        struct hfi1_netdev_priv *priv = hfi1_netdev_priv(dd->dummy_netdev);

        return xa_load(&priv->dev_tbl, id);
}

/**
 * hfi1_netdev_get_first_data - Gets the first entry with a greater or equal id.
 *
 * @dd: hfi1 dev data
 * @start_id: requested integer id up to INT_MAX
 */
void *hfi1_netdev_get_first_data(struct hfi1_devdata *dd, int *start_id)
{
        struct hfi1_netdev_priv *priv = hfi1_netdev_priv(dd->dummy_netdev);
        unsigned long index = *start_id;
        void *ret;

        ret = xa_find(&priv->dev_tbl, &index, UINT_MAX, XA_PRESENT);
        *start_id = (int)index;
        return ret;
}
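
The dev_tbl helpers above give the VNIC and IPoIB code a small id-to-pointer registry backed by an xarray. A minimal sketch of the call pattern, not taken from the patch; example_id and example_priv are invented for illustration.

/* Illustrative sketch only - not part of this patch. */
static int hfi1_example_register(struct hfi1_devdata *dd, void *example_priv)
{
        const int example_id = 10;      /* any key from 0 up to INT_MAX */
        int rc;

        rc = hfi1_netdev_add_data(dd, example_id, example_priv);
        if (rc)
                return rc;              /* xa_insert() returns -EBUSY if taken */

        /* lookups return the registered pointer until it is removed */
        WARN_ON(hfi1_netdev_get_data(dd, example_id) != example_priv);

        /* removal hands back the stored entry */
        WARN_ON(hfi1_netdev_remove_data(dd, example_id) != example_priv);
        return 0;
}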

@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2015 - 2019 Intel Corporation.
+ * Copyright(c) 2015 - 2020 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license. When using or
  * redistributing this file, you may do so under either license.
@@ -186,15 +186,6 @@ static void flush_iowait(struct rvt_qp *qp)
         write_sequnlock_irqrestore(lock, flags);
 }
 
-static inline int opa_mtu_enum_to_int(int mtu)
-{
-        switch (mtu) {
-        case OPA_MTU_8192: return 8192;
-        case OPA_MTU_10240: return 10240;
-        default: return -1;
-        }
-}
-
 /**
  * This function is what we would push to the core layer if we wanted to be a
  * "first class citizen". Instead we hide this here and rely on Verbs ULPs
@@ -202,15 +193,10 @@ static inline int opa_mtu_enum_to_int(int mtu)
  */
 static inline int verbs_mtu_enum_to_int(struct ib_device *dev, enum ib_mtu mtu)
 {
-        int val;
-
         /* Constraining 10KB packets to 8KB packets */
         if (mtu == (enum ib_mtu)OPA_MTU_10240)
                 mtu = OPA_MTU_8192;
-        val = opa_mtu_enum_to_int((int)mtu);
-        if (val > 0)
-                return val;
-        return ib_mtu_enum_to_int(mtu);
+        return opa_mtu_enum_to_int((enum opa_mtu)mtu);
 }
 
 int hfi1_check_modify_qp(struct rvt_qp *qp, struct ib_qp_attr *attr,
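
A hedged illustration of the consolidated MTU mapping above; it assumes the shared opa_mtu_enum_to_int() that replaces the deleted local copy still covers the standard IB enums, as the old ib_mtu_enum_to_int() fallback did.

/* Illustrative sketch only - not part of this patch. */
static void hfi1_example_mtu_mapping(struct ib_device *ibdev)
{
        /* standard IB enum, assumed to still map through the shared helper */
        WARN_ON(verbs_mtu_enum_to_int(ibdev, IB_MTU_4096) != 4096);
        /* the OPA 10KB MTU is constrained to 8KB before the lookup */
        WARN_ON(verbs_mtu_enum_to_int(ibdev, (enum ib_mtu)OPA_MTU_10240) != 8192);
}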

@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
 /*
- * Copyright(c) 2018 Intel Corporation.
+ * Copyright(c) 2018 - 2020 Intel Corporation.
  *
  */
 
@@ -194,7 +194,7 @@ void tid_rdma_opfn_init(struct rvt_qp *qp, struct tid_rdma_params *p)
 {
         struct hfi1_qp_priv *priv = qp->priv;
 
-        p->qp = (kdeth_qp << 16) | priv->rcd->ctxt;
+        p->qp = (RVT_KDETH_QP_PREFIX << 16) | priv->rcd->ctxt;
         p->max_len = TID_RDMA_MAX_SEGMENT_SIZE;
         p->jkey = priv->rcd->jkey;
         p->max_read = TID_RDMA_MAX_READ_SEGS_PER_REQ;
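
A small made-up helper showing the layout of the value the changed line writes to p->qp; nothing here is from the patch beyond the expression itself, and hfi1_example_kdeth_qp() is hypothetical.

/* Illustrative sketch only - not part of this patch. */
static u32 hfi1_example_kdeth_qp(u16 ctxt)
{
        /* the KDETH prefix is shifted above the low 16 bits, which carry
         * the receive context, exactly as tid_rdma_opfn_init() builds p->qp */
        return (RVT_KDETH_QP_PREFIX << 16) | ctxt;
}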

@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2015 - 2018 Intel Corporation.
+ * Copyright(c) 2015 - 2020 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license. When using or
  * redistributing this file, you may do so under either license.
@@ -47,6 +47,7 @@
 #define CREATE_TRACE_POINTS
 #include "trace.h"
 #include "exp_rcv.h"
+#include "ipoib.h"
 
 static u8 __get_ib_hdr_len(struct ib_header *hdr)
 {
@@ -126,6 +127,7 @@ const char *hfi1_trace_get_packet_l2_str(u8 l2)
 #define RETH_PRN "reth vaddr:0x%.16llx rkey:0x%.8x dlen:0x%.8x"
 #define AETH_PRN "aeth syn:0x%.2x %s msn:0x%.8x"
 #define DETH_PRN "deth qkey:0x%.8x sqpn:0x%.6x"
+#define DETH_ENTROPY_PRN "deth qkey:0x%.8x sqpn:0x%.6x entropy:0x%.2x"
 #define IETH_PRN "ieth rkey:0x%.8x"
 #define ATOMICACKETH_PRN "origdata:%llx"
 #define ATOMICETH_PRN "vaddr:0x%llx rkey:0x%.8x sdata:%llx cdata:%llx"
@@ -444,6 +446,12 @@ const char *parse_everbs_hdrs(
                 break;
         /* deth */
         case OP(UD, SEND_ONLY):
+                trace_seq_printf(p, DETH_ENTROPY_PRN,
+                                 be32_to_cpu(eh->ud.deth[0]),
+                                 be32_to_cpu(eh->ud.deth[1]) & RVT_QPN_MASK,
+                                 be32_to_cpu(eh->ud.deth[1]) >>
+                                         HFI1_IPOIB_ENTROPY_SHIFT);
+                break;
         case OP(UD, SEND_ONLY_WITH_IMMEDIATE):
                 trace_seq_printf(p, DETH_PRN,
                                  be32_to_cpu(eh->ud.deth[0]),
@@ -512,6 +520,38 @@ u16 hfi1_trace_get_tid_idx(u32 ent)
         return EXP_TID_GET(ent, IDX);
 }
 
+struct hfi1_ctxt_hist {
+        atomic_t count;
+        atomic_t data[255];
+};
+
+struct hfi1_ctxt_hist hist = {
+        .count = ATOMIC_INIT(0)
+};
+
+const char *hfi1_trace_print_rsm_hist(struct trace_seq *p, unsigned int ctxt)
+{
+        int i, len = ARRAY_SIZE(hist.data);
+        const char *ret = trace_seq_buffer_ptr(p);
+        unsigned long packet_count = atomic_fetch_inc(&hist.count);
+
+        trace_seq_printf(p, "packet[%lu]", packet_count);
+        for (i = 0; i < len; ++i) {
+                unsigned long val;
+                atomic_t *count = &hist.data[i];
+
+                if (ctxt == i)
+                        val = atomic_fetch_inc(count);
+                else
+                        val = atomic_read(count);
+
+                if (val)
+                        trace_seq_printf(p, "(%d:%lu)", i, val);
+        }
+        trace_seq_putc(p, 0);
+        return ret;
+}
+
 __hfi1_trace_fn(AFFINITY);
 __hfi1_trace_fn(PKT);
 __hfi1_trace_fn(PROC);
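
To make the new histogram helper concrete, an invented example of the string it builds. Note that atomic_fetch_inc() returns the value before the increment, so the printed numbers are the counts prior to the current packet.

/*
 * Illustrative output only - not from the patch. If this is the 8th
 * packet counted, the current packet landed on context 3 (already hit
 * 4 times before), and context 5 has seen 2 packets, the helper emits:
 *
 *   packet[7](3:4)(5:2)
 *
 * Contexts with a zero count are skipped.
 */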