b40f4757da
In practice, each RDMA device has a unique set of counters that the hardware implements. Having a central set of counters that they must all adhere to is limiting and causes many useful counters to not be available. Therefore we create a dynamic counter registration infrastructure. The driver must implement a stats structure allocation routine, in which the driver must place the directory name it wants, a list of names for all of the counters, an array of u64 counters themselves, plus a few generic configuration options. We then implement a core routine to create a sysfs file for each of the named stats elements, and a core routine to retrieve the stats when any of the sysfs attribute files are read. To avoid excessive beating on the stats generation routine in the drivers, the core code also caches the stats for a short period of time so that someone attempting to read all of the stats in a given device's directory will not result in a stats generation call per file read. Future work will attempt to standardize just the shared stats elements, and possibly add a method to get the stats via netlink in addition to sysfs. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> [ Add caching, make structure names more informative, add i40iw support, other significant rewrites from the original patch ]
100 lines
3.7 KiB
Plaintext
100 lines
3.7 KiB
Plaintext
SYSFS FILES
|
|
|
|
For each InfiniBand device, the InfiniBand drivers create the
|
|
following files under /sys/class/infiniband/<device name>:
|
|
|
|
node_type - Node type (CA, switch or router)
|
|
node_guid - Node GUID
|
|
sys_image_guid - System image GUID
|
|
|
|
In addition, there is a "ports" subdirectory, with one subdirectory
|
|
for each port. For example, if mthca0 is a 2-port HCA, there will
|
|
be two directories:
|
|
|
|
/sys/class/infiniband/mthca0/ports/1
|
|
/sys/class/infiniband/mthca0/ports/2
|
|
|
|
(A switch will only have a single "0" subdirectory for switch port
|
|
0; no subdirectory is created for normal switch ports)
|
|
|
|
In each port subdirectory, the following files are created:
|
|
|
|
cap_mask - Port capability mask
|
|
lid - Port LID
|
|
lid_mask_count - Port LID mask count
|
|
rate - Port data rate (active width * active speed)
|
|
sm_lid - Subnet manager LID for port's subnet
|
|
sm_sl - Subnet manager SL for port's subnet
|
|
state - Port state (DOWN, INIT, ARMED, ACTIVE or ACTIVE_DEFER)
|
|
phys_state - Port physical state (Sleep, Polling, LinkUp, etc)
|
|
|
|
There is also a "counters" subdirectory, with files
|
|
|
|
VL15_dropped
|
|
excessive_buffer_overrun_errors
|
|
link_downed
|
|
link_error_recovery
|
|
local_link_integrity_errors
|
|
port_rcv_constraint_errors
|
|
port_rcv_data
|
|
port_rcv_errors
|
|
port_rcv_packets
|
|
port_rcv_remote_physical_errors
|
|
port_rcv_switch_relay_errors
|
|
port_xmit_constraint_errors
|
|
port_xmit_data
|
|
port_xmit_discards
|
|
port_xmit_packets
|
|
symbol_error
|
|
|
|
Each of these files contains the corresponding value from the port's
|
|
Performance Management PortCounters attribute, as described in
|
|
section 16.1.3.5 of the InfiniBand Architecture Specification.
|
|
|
|
The "pkeys" and "gids" subdirectories contain one file for each
|
|
entry in the port's P_Key or GID table respectively. For example,
|
|
ports/1/pkeys/10 contains the value at index 10 in port 1's P_Key
|
|
table.
|
|
|
|
There is an optional "hw_counters" subdirectory that may be under either
|
|
the parent device or the port subdirectories or both. If present,
|
|
there are a list of counters provided by the hardware. They may match
|
|
some of the counters in the counters directory, but they often include
|
|
many other counters. In addition to the various counters, there will
|
|
be a file named "lifespan" that configures how frequently the core
|
|
should update the counters when they are being accessed (counters are
|
|
not updated if they are not being accessed). The lifespan is in milli-
|
|
seconds and defaults to 10 unless set to something else by the driver.
|
|
Users may echo a value between 0 - 10000 to the lifespan file to set
|
|
the length of time between updates in milliseconds.
|
|
|
|
MTHCA
|
|
|
|
The Mellanox HCA driver also creates the files:
|
|
|
|
hw_rev - Hardware revision number
|
|
fw_ver - Firmware version
|
|
hca_type - HCA type: "MT23108", "MT25208 (MT23108 compat mode)",
|
|
or "MT25208"
|
|
|
|
HFI1
|
|
|
|
The hfi1 driver also creates these additional files:
|
|
|
|
hw_rev - hardware revision
|
|
board_id - manufacturing board id
|
|
tempsense - thermal sense information
|
|
serial - board serial number
|
|
nfreectxts - number of free user contexts
|
|
nctxts - number of allowed contexts (PSM2)
|
|
chip_reset - diagnostic (root only)
|
|
boardversion - board version
|
|
ports/1/
|
|
CCMgtA/
|
|
cc_settings_bin - CCA tables used by PSM2
|
|
cc_table_bin
|
|
cc_prescan - enable prescaning for faster BECN response
|
|
sc2v/ - 32 files (0 - 31) used to translate sl->vl
|
|
sl2sc/ - 32 files (0 - 31) used to translate sl->sc
|
|
vl2mtu/ - 16 (0 - 15) files used to determine MTU for vl
|