Merge tag 'mlx5-updates-2023-06-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-06-06

1) Support 4 ports VF LAG, part 2/2
2) A few extra trivial cleanup patches

Shay Drory says:
================

Support 4 ports VF LAG, part 2/2

This series continues series [1], "Support 4 ports VF LAG, part 1/2",
and adds support for 4 ports VF LAG (single FDB E-Switch).

The patches refactor LAG code that assumes VF LAG supports only two
ports, and then enable 4 ports VF LAG.

Patch 1:
- Fix the IB rep code.
Patches 2-5:
- Refactor the LAG layer.
Patches 6-7:
- Block LAG types that don't support 4 ports.
Patch 8:
- Enable 4 ports VF LAG.

This series specifically allows HCAs with 4 ports to create a VF LAG
only across all 4 ports; creating a VF LAG with 2 or 3 ports on HCAs
that have 4 ports is not supported.

Currently, the Merged E-Switch feature only supports HCAs with 2 ports.
However, upcoming patches will introduce support for HCAs with 4 ports.

To activate VF LAG, a user can execute:

devlink dev eswitch set pci/0000:08:00.0 mode switchdev
devlink dev eswitch set pci/0000:08:00.1 mode switchdev
devlink dev eswitch set pci/0000:08:00.2 mode switchdev
devlink dev eswitch set pci/0000:08:00.3 mode switchdev
ip link add name bond0 type bond
ip link set dev bond0 type bond mode 802.3ad
ip link set dev eth2 master bond0
ip link set dev eth3 master bond0
ip link set dev eth4 master bond0
ip link set dev eth5 master bond0

where eth2, eth3, eth4 and eth5 are the net interfaces of pci/0000:08:00.0,
pci/0000:08:00.1, pci/0000:08:00.2 and pci/0000:08:00.3, respectively.
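
The mapping between a netdev and its PCI function can be confirmed with
standard tools, for example (an illustrative sketch; interface names vary
per system):

ethtool -i eth2 | grep bus-info    # expected to print "bus-info: 0000:08:00.0"
ls -l /sys/class/net/eth2/device   # symlink resolves to the backing PCI function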

A user can verify the LAG state and type via debugfs:
/sys/kernel/debug/mlx5/0000\:08\:00.0/lag/state
/sys/kernel/debug/mlx5/0000\:08\:00.0/lag/type
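
For example (a sketch only; the exact contents of these files depend on
the driver and bonding state):

cat /sys/kernel/debug/mlx5/0000\:08\:00.0/lag/state   # whether the hardware LAG is active
cat /sys/kernel/debug/mlx5/0000\:08\:00.0/lag/type    # which LAG type the driver selected
cat /proc/net/bonding/bond0                           # standard bonding view of the software side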

[1]
https://lore.kernel.org/netdev/20230601060118.154015-1-saeed@kernel.org/T/#mf1d2083780970ba277bfe721554d4925f03f36d1

================

* tag 'mlx5-updates-2023-06-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: simplify condition after napi budget handling change
  mlx5/core: E-Switch, Allocate ECPF vport if it's an eswitch manager
  net/mlx5: Skip inline mode check after mlx5_eswitch_enable_locked() failure
  net/mlx5e: TC, refactor access to hash key
  net/mlx5e: Remove RX page cache leftovers
  net/mlx5e: Expose catastrophic steering error counters
  net/mlx5: Enable 4 ports VF LAG
  net/mlx5: LAG, block multiport eswitch LAG in case ldev have more than 2 ports
  net/mlx5: LAG, block multipath LAG in case ldev have more than 2 ports
  net/mlx5: LAG, change mlx5_shared_fdb_supported() to static
  net/mlx5: LAG, generalize handling of shared FDB
  net/mlx5: LAG, check if all eswitches are paired for shared FDB
  {net/RDMA}/mlx5: introduce lag_for_each_peer
  RDMA/mlx5: Free second uplink ib port
====================

Link: https://lore.kernel.org/r/20230607210410.88209-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Committed by Jakub Kicinski on 2023-06-08 19:28:20 -07:00
commit f84ad5cffd
18 changed files with 202 additions and 104 deletions


@ -290,6 +290,13 @@ Description of the vnic counters:
- nic_receive_steering_discard
number of packets that completed RX flow
steering but were discarded due to a mismatch in flow table.
- generated_pkt_steering_fail
number of packets generated by the VNIC experiencing unexpected steering
failure (at any point in steering flow).
- handled_pkt_steering_fail
number of packets handled by the VNIC experiencing unexpected steering
failure (at any point in steering flow owned by the VNIC, including the FDB
for the eswitch owner).
User commands examples:
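
A minimal usage sketch, assuming the reporter is registered under the
name "vnic" and reusing the PCI address from the cover letter above; the
two counters added here appear in the diagnose output:

devlink health diagnose pci/0000:08:00.0 reporter vnic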


@ -30,45 +30,65 @@ mlx5_ib_set_vport_rep(struct mlx5_core_dev *dev,
static void mlx5_ib_register_peer_vport_reps(struct mlx5_core_dev *mdev);
static void mlx5_ib_num_ports_update(struct mlx5_core_dev *dev, u32 *num_ports)
{
struct mlx5_core_dev *peer_dev;
int i;
mlx5_lag_for_each_peer_mdev(dev, peer_dev, i) {
u32 peer_num_ports = mlx5_eswitch_get_total_vports(peer_dev);
if (mlx5_lag_is_mpesw(peer_dev))
*num_ports += peer_num_ports;
else
/* Only 1 ib port is the representor for all uplinks */
*num_ports += peer_num_ports - 1;
}
}
static int
mlx5_ib_vport_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
{
u32 num_ports = mlx5_eswitch_get_total_vports(dev);
struct mlx5_core_dev *lag_master = dev;
const struct mlx5_ib_profile *profile;
struct mlx5_core_dev *peer_dev;
struct mlx5_ib_dev *ibdev;
int second_uplink = false;
u32 peer_num_ports;
int new_uplink = false;
int vport_index;
int ret;
int i;
vport_index = rep->vport_index;
if (mlx5_lag_is_shared_fdb(dev)) {
peer_dev = mlx5_lag_get_peer_mdev(dev);
peer_num_ports = mlx5_eswitch_get_total_vports(peer_dev);
if (mlx5_lag_is_master(dev)) {
if (mlx5_lag_is_mpesw(dev))
num_ports += peer_num_ports;
else
num_ports += peer_num_ports - 1;
mlx5_ib_num_ports_update(dev, &num_ports);
} else {
if (rep->vport == MLX5_VPORT_UPLINK) {
if (!mlx5_lag_is_mpesw(dev))
return 0;
second_uplink = true;
new_uplink = true;
}
mlx5_lag_for_each_peer_mdev(dev, peer_dev, i) {
u32 peer_n_ports = mlx5_eswitch_get_total_vports(peer_dev);
vport_index += peer_num_ports;
dev = peer_dev;
if (mlx5_lag_is_master(peer_dev))
lag_master = peer_dev;
else if (!mlx5_lag_is_mpesw(dev))
/* Only 1 ib port is the representor for all uplinks */
peer_n_ports--;
if (mlx5_get_dev_index(peer_dev) < mlx5_get_dev_index(dev))
vport_index += peer_n_ports;
}
}
}
if (rep->vport == MLX5_VPORT_UPLINK && !second_uplink)
if (rep->vport == MLX5_VPORT_UPLINK && !new_uplink)
profile = &raw_eth_profile;
else
return mlx5_ib_set_vport_rep(dev, rep, vport_index);
return mlx5_ib_set_vport_rep(lag_master, rep, vport_index);
ibdev = ib_alloc_device(mlx5_ib_dev, ib_dev);
if (!ibdev)
@ -85,8 +105,8 @@ mlx5_ib_vport_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
vport_index = rep->vport_index;
ibdev->port[vport_index].rep = rep;
ibdev->port[vport_index].roce.netdev =
mlx5_ib_get_rep_netdev(dev->priv.eswitch, rep->vport);
ibdev->mdev = dev;
mlx5_ib_get_rep_netdev(lag_master->priv.eswitch, rep->vport);
ibdev->mdev = lag_master;
ibdev->num_ports = num_ports;
ret = __mlx5_ib_add(ibdev, profile);
@ -94,8 +114,8 @@ mlx5_ib_vport_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
goto fail_add;
rep->rep_data[REP_IB].priv = ibdev;
if (mlx5_lag_is_shared_fdb(dev))
mlx5_ib_register_peer_vport_reps(dev);
if (mlx5_lag_is_shared_fdb(lag_master))
mlx5_ib_register_peer_vport_reps(lag_master);
return 0;
@ -118,23 +138,27 @@ mlx5_ib_vport_rep_unload(struct mlx5_eswitch_rep *rep)
struct mlx5_ib_dev *dev = mlx5_ib_rep_to_dev(rep);
int vport_index = rep->vport_index;
struct mlx5_ib_port *port;
int i;
if (WARN_ON(!mdev))
return;
if (mlx5_lag_is_shared_fdb(mdev) &&
!mlx5_lag_is_master(mdev)) {
struct mlx5_core_dev *peer_mdev;
if (rep->vport == MLX5_VPORT_UPLINK)
return;
peer_mdev = mlx5_lag_get_peer_mdev(mdev);
vport_index += mlx5_eswitch_get_total_vports(peer_mdev);
}
if (!dev)
return;
if (mlx5_lag_is_shared_fdb(mdev) &&
!mlx5_lag_is_master(mdev)) {
if (rep->vport == MLX5_VPORT_UPLINK && !mlx5_lag_is_mpesw(mdev))
return;
for (i = 0; i < dev->num_ports; i++) {
if (dev->port[i].rep == rep)
break;
}
if (WARN_ON(i == dev->num_ports))
return;
vport_index = i;
}
port = &dev->port[vport_index];
write_lock(&port->roce.netdev_lock);
port->roce.netdev = NULL;
@ -143,14 +167,19 @@ mlx5_ib_vport_rep_unload(struct mlx5_eswitch_rep *rep)
port->rep = NULL;
if (rep->vport == MLX5_VPORT_UPLINK) {
if (mlx5_lag_is_shared_fdb(mdev) && !mlx5_lag_is_master(mdev))
return;
if (mlx5_lag_is_shared_fdb(mdev)) {
struct mlx5_core_dev *peer_mdev;
struct mlx5_eswitch *esw;
if (mlx5_lag_is_shared_fdb(mdev)) {
peer_mdev = mlx5_lag_get_peer_mdev(mdev);
mlx5_lag_for_each_peer_mdev(mdev, peer_mdev, i) {
esw = peer_mdev->priv.eswitch;
mlx5_eswitch_unregister_vport_reps(esw, REP_IB);
}
}
__mlx5_ib_remove(dev, dev->profile, MLX5_IB_STAGE_MAX);
}
}
@ -163,14 +192,14 @@ static const struct mlx5_eswitch_rep_ops rep_ops = {
static void mlx5_ib_register_peer_vport_reps(struct mlx5_core_dev *mdev)
{
struct mlx5_core_dev *peer_mdev = mlx5_lag_get_peer_mdev(mdev);
struct mlx5_core_dev *peer_mdev;
struct mlx5_eswitch *esw;
int i;
if (!peer_mdev)
return;
mlx5_lag_for_each_peer_mdev(mdev, peer_mdev, i) {
esw = peer_mdev->priv.eswitch;
mlx5_eswitch_register_vport_reps(esw, &rep_ops, REP_IB);
}
}
struct net_device *mlx5_ib_get_rep_netdev(struct mlx5_eswitch *esw,


@ -76,6 +76,16 @@ int mlx5_reporter_vnic_diagnose_counters(struct mlx5_core_dev *dev,
if (err)
return err;
err = devlink_fmsg_u64_pair_put(fmsg, "generated_pkt_steering_fail",
VNIC_ENV_GET64(&vnic, generated_pkt_steering_fail));
if (err)
return err;
err = devlink_fmsg_u64_pair_put(fmsg, "handled_pkt_steering_fail",
VNIC_ENV_GET64(&vnic, handled_pkt_steering_fail));
if (err)
return err;
err = devlink_fmsg_obj_nest_end(fmsg);
if (err)
return err;


@ -594,13 +594,6 @@ struct mlx5e_mpw_info {
#define MLX5E_MAX_RX_FRAGS 4
/* a single cache unit is capable to serve one napi call (for non-striding rq)
* or a MPWQE (for striding rq).
*/
#define MLX5E_CACHE_UNIT (MLX5_MPWRQ_MAX_PAGES_PER_WQE > NAPI_POLL_WEIGHT ? \
MLX5_MPWRQ_MAX_PAGES_PER_WQE : NAPI_POLL_WEIGHT)
#define MLX5E_CACHE_SIZE (4 * roundup_pow_of_two(MLX5E_CACHE_UNIT))
struct mlx5e_rq;
typedef void (*mlx5e_fp_handle_rx_cqe)(struct mlx5e_rq*, struct mlx5_cqe64*);
typedef struct sk_buff *


@ -25,8 +25,8 @@ struct mlx5e_tc_act_stats {
static const struct rhashtable_params act_counters_ht_params = {
.head_offset = offsetof(struct mlx5e_tc_act_stats, hash),
.key_offset = 0,
.key_len = offsetof(struct mlx5e_tc_act_stats, counter),
.key_offset = offsetof(struct mlx5e_tc_act_stats, tc_act_cookie),
.key_len = sizeof_field(struct mlx5e_tc_act_stats, tc_act_cookie),
.automatic_shrinking = true,
};
@ -169,14 +169,11 @@ mlx5e_tc_act_stats_fill_stats(struct mlx5e_tc_act_stats_handle *handle,
{
struct rhashtable *ht = &handle->ht;
struct mlx5e_tc_act_stats *item;
struct mlx5e_tc_act_stats key;
u64 pkts, bytes, lastused;
int err = 0;
key.tc_act_cookie = fl_act->cookie;
rcu_read_lock();
item = rhashtable_lookup(ht, &key, act_counters_ht_params);
item = rhashtable_lookup(ht, &fl_act->cookie, act_counters_ht_params);
if (!item) {
rcu_read_unlock();
err = -ENOENT;


@ -207,7 +207,7 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget)
}
ch_stats->aff_change++;
aff_change = true;
if (budget && work_done == budget)
if (work_done == budget)
work_done--;
}


@ -1601,7 +1601,8 @@ static int mlx5_esw_vports_init(struct mlx5_eswitch *esw)
idx++;
}
if (mlx5_ecpf_vport_exists(dev)) {
if (mlx5_ecpf_vport_exists(dev) ||
mlx5_core_is_ecpf_esw_manager(dev)) {
err = mlx5_esw_vport_alloc(esw, idx, MLX5_VPORT_ECPF);
if (err)
goto err;


@ -779,6 +779,13 @@ static inline int mlx5_eswitch_num_vfs(struct mlx5_eswitch *esw)
return 0;
}
static inline int mlx5_eswitch_get_npeers(struct mlx5_eswitch *esw)
{
if (mlx5_esw_allowed(esw))
return esw->num_peers;
return 0;
}
static inline struct mlx5_flow_table *
mlx5_eswitch_get_slow_fdb(struct mlx5_eswitch *esw)
{
@ -826,6 +833,8 @@ static inline void
mlx5_eswitch_offloads_single_fdb_del_one(struct mlx5_eswitch *master_esw,
struct mlx5_eswitch *slave_esw) {}
static inline int mlx5_eswitch_get_npeers(struct mlx5_eswitch *esw) { return 0; }
static inline int
mlx5_eswitch_reload_reps(struct mlx5_eswitch *esw)
{


@ -2178,6 +2178,7 @@ static int esw_offloads_start(struct mlx5_eswitch *esw,
"Failed setting eswitch to offloads");
esw->mode = MLX5_ESWITCH_LEGACY;
mlx5_rescan_drivers(esw->dev);
return err;
}
if (esw->offloads.inline_mode == MLX5_INLINE_MODE_NONE) {
if (mlx5_eswitch_inline_mode_get(esw,
@ -2187,7 +2188,7 @@ static int esw_offloads_start(struct mlx5_eswitch *esw,
"Inline mode is different between vports");
}
}
return err;
return 0;
}
static void mlx5_esw_offloads_rep_mark_set(struct mlx5_eswitch *esw,


@ -244,16 +244,22 @@ static int mlx5_cmd_update_root_ft(struct mlx5_flow_root_namespace *ns,
ft->type == FS_FT_FDB &&
mlx5_lag_is_shared_fdb(dev) &&
mlx5_lag_is_master(dev)) {
err = mlx5_cmd_set_slave_root_fdb(dev,
mlx5_lag_get_peer_mdev(dev),
!disconnect, (!disconnect) ?
ft->id : 0);
struct mlx5_core_dev *peer_dev;
int i;
mlx5_lag_for_each_peer_mdev(dev, peer_dev, i) {
err = mlx5_cmd_set_slave_root_fdb(dev, peer_dev, !disconnect,
(!disconnect) ? ft->id : 0);
if (err && !disconnect) {
MLX5_SET(set_flow_table_root_in, in, op_mod, 0);
MLX5_SET(set_flow_table_root_in, in, table_id,
ns->root_ft->id);
mlx5_cmd_exec_in(dev, set_flow_table_root, in);
}
if (err)
break;
}
}
return err;


@ -512,8 +512,11 @@ static void mlx5_lag_set_port_sel_mode_offloads(struct mlx5_lag *ldev,
return;
if (MLX5_CAP_PORT_SELECTION(dev0->dev, port_select_flow_table) &&
tracker->tx_type == NETDEV_LAG_TX_TYPE_HASH)
tracker->tx_type == NETDEV_LAG_TX_TYPE_HASH) {
if (ldev->ports > 2)
ldev->buckets = MLX5_LAG_MAX_HASH_BUCKETS;
set_bit(MLX5_LAG_MODE_FLAG_HASH_BASED, flags);
}
}
static int mlx5_lag_set_flags(struct mlx5_lag *ldev, enum mlx5_lag_mode mode,
@ -708,7 +711,7 @@ int mlx5_deactivate_lag(struct mlx5_lag *ldev)
return 0;
}
#define MLX5_LAG_OFFLOADS_SUPPORTED_PORTS 2
#define MLX5_LAG_OFFLOADS_SUPPORTED_PORTS 4
bool mlx5_lag_check_prereq(struct mlx5_lag *ldev)
{
#ifdef CONFIG_MLX5_ESWITCH
@ -734,7 +737,7 @@ bool mlx5_lag_check_prereq(struct mlx5_lag *ldev)
if (mlx5_eswitch_mode(ldev->pf[i].dev) != mode)
return false;
if (mode == MLX5_ESWITCH_OFFLOADS && ldev->ports != MLX5_LAG_OFFLOADS_SUPPORTED_PORTS)
if (mode == MLX5_ESWITCH_OFFLOADS && ldev->ports > MLX5_LAG_OFFLOADS_SUPPORTED_PORTS)
return false;
#else
for (i = 0; i < ldev->ports; i++)
@ -782,7 +785,6 @@ void mlx5_disable_lag(struct mlx5_lag *ldev)
{
bool shared_fdb = test_bit(MLX5_LAG_MODE_FLAG_SHARED_FDB, &ldev->mode_flags);
struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev;
bool roce_lag;
int err;
int i;
@ -807,28 +809,35 @@ void mlx5_disable_lag(struct mlx5_lag *ldev)
if (shared_fdb || roce_lag)
mlx5_lag_add_devices(ldev);
if (shared_fdb) {
if (!(dev0->priv.flags & MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV))
mlx5_eswitch_reload_reps(dev0->priv.eswitch);
if (!(dev1->priv.flags & MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV))
mlx5_eswitch_reload_reps(dev1->priv.eswitch);
}
if (shared_fdb)
for (i = 0; i < ldev->ports; i++)
if (!(ldev->pf[i].dev->priv.flags & MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV))
mlx5_eswitch_reload_reps(ldev->pf[i].dev->priv.eswitch);
}
bool mlx5_shared_fdb_supported(struct mlx5_lag *ldev)
static bool mlx5_shared_fdb_supported(struct mlx5_lag *ldev)
{
struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev;
struct mlx5_core_dev *dev;
int i;
if (is_mdev_switchdev_mode(dev0) &&
is_mdev_switchdev_mode(dev1) &&
mlx5_eswitch_vport_match_metadata_enabled(dev0->priv.eswitch) &&
mlx5_eswitch_vport_match_metadata_enabled(dev1->priv.eswitch) &&
mlx5_devcom_comp_is_ready(dev0->priv.devcom,
MLX5_DEVCOM_ESW_OFFLOADS) &&
MLX5_CAP_GEN(dev1, lag_native_fdb_selection) &&
MLX5_CAP_ESW(dev1, root_ft_on_other_esw) &&
MLX5_CAP_ESW(dev0, esw_shared_ingress_acl))
for (i = MLX5_LAG_P1 + 1; i < ldev->ports; i++) {
dev = ldev->pf[i].dev;
if (is_mdev_switchdev_mode(dev) &&
mlx5_eswitch_vport_match_metadata_enabled(dev->priv.eswitch) &&
MLX5_CAP_GEN(dev, lag_native_fdb_selection) &&
MLX5_CAP_ESW(dev, root_ft_on_other_esw) &&
mlx5_eswitch_get_npeers(dev->priv.eswitch) ==
MLX5_CAP_GEN(dev, num_lag_ports) - 1)
continue;
return false;
}
dev = ldev->pf[MLX5_LAG_P1].dev;
if (is_mdev_switchdev_mode(dev) &&
mlx5_eswitch_vport_match_metadata_enabled(dev->priv.eswitch) &&
mlx5_devcom_comp_is_ready(dev->priv.devcom, MLX5_DEVCOM_ESW_OFFLOADS) &&
MLX5_CAP_ESW(dev, esw_shared_ingress_acl) &&
mlx5_eswitch_get_npeers(dev->priv.eswitch) == MLX5_CAP_GEN(dev, num_lag_ports) - 1)
return true;
return false;
@ -865,7 +874,6 @@ static bool mlx5_lag_should_disable_lag(struct mlx5_lag *ldev, bool do_bond)
static void mlx5_do_bond(struct mlx5_lag *ldev)
{
struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
struct mlx5_core_dev *dev1 = ldev->pf[MLX5_LAG_P2].dev;
struct lag_tracker tracker = { };
bool do_bond, roce_lag;
int err;
@ -906,20 +914,24 @@ static void mlx5_do_bond(struct mlx5_lag *ldev)
for (i = 1; i < ldev->ports; i++)
mlx5_nic_vport_enable_roce(ldev->pf[i].dev);
} else if (shared_fdb) {
int i;
dev0->priv.flags &= ~MLX5_PRIV_FLAGS_DISABLE_IB_ADEV;
mlx5_rescan_drivers_locked(dev0);
err = mlx5_eswitch_reload_reps(dev0->priv.eswitch);
if (!err)
err = mlx5_eswitch_reload_reps(dev1->priv.eswitch);
for (i = 0; i < ldev->ports; i++) {
err = mlx5_eswitch_reload_reps(ldev->pf[i].dev->priv.eswitch);
if (err)
break;
}
if (err) {
dev0->priv.flags |= MLX5_PRIV_FLAGS_DISABLE_IB_ADEV;
mlx5_rescan_drivers_locked(dev0);
mlx5_deactivate_lag(ldev);
mlx5_lag_add_devices(ldev);
mlx5_eswitch_reload_reps(dev0->priv.eswitch);
mlx5_eswitch_reload_reps(dev1->priv.eswitch);
for (i = 0; i < ldev->ports; i++)
mlx5_eswitch_reload_reps(ldev->pf[i].dev->priv.eswitch);
mlx5_core_err(dev0, "Failed to enable lag\n");
return;
}
@ -1519,26 +1531,37 @@ u8 mlx5_lag_get_num_ports(struct mlx5_core_dev *dev)
}
EXPORT_SYMBOL(mlx5_lag_get_num_ports);
struct mlx5_core_dev *mlx5_lag_get_peer_mdev(struct mlx5_core_dev *dev)
struct mlx5_core_dev *mlx5_lag_get_next_peer_mdev(struct mlx5_core_dev *dev, int *i)
{
struct mlx5_core_dev *peer_dev = NULL;
struct mlx5_lag *ldev;
unsigned long flags;
int idx;
spin_lock_irqsave(&lag_lock, flags);
ldev = mlx5_lag_dev(dev);
if (!ldev)
goto unlock;
peer_dev = ldev->pf[MLX5_LAG_P1].dev == dev ?
ldev->pf[MLX5_LAG_P2].dev :
ldev->pf[MLX5_LAG_P1].dev;
if (*i == ldev->ports)
goto unlock;
for (idx = *i; idx < ldev->ports; idx++)
if (ldev->pf[idx].dev != dev)
break;
if (idx == ldev->ports) {
*i = idx;
goto unlock;
}
*i = idx + 1;
peer_dev = ldev->pf[idx].dev;
unlock:
spin_unlock_irqrestore(&lag_lock, flags);
return peer_dev;
}
EXPORT_SYMBOL(mlx5_lag_get_peer_mdev);
EXPORT_SYMBOL(mlx5_lag_get_next_peer_mdev);
int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev,
u64 *values,


@ -111,7 +111,6 @@ int mlx5_activate_lag(struct mlx5_lag *ldev,
bool shared_fdb);
int mlx5_lag_dev_get_netdev_idx(struct mlx5_lag *ldev,
struct net_device *ndev);
bool mlx5_shared_fdb_supported(struct mlx5_lag *ldev);
char *mlx5_get_str_port_sel_mode(enum mlx5_lag_mode mode, unsigned long flags);
void mlx5_infer_tx_enabled(struct lag_tracker *tracker, u8 num_ports,


@ -14,6 +14,7 @@ static bool __mlx5_lag_is_multipath(struct mlx5_lag *ldev)
return ldev->mode == MLX5_LAG_MODE_MULTIPATH;
}
#define MLX5_LAG_MULTIPATH_OFFLOADS_SUPPORTED_PORTS 2
static bool mlx5_lag_multipath_check_prereq(struct mlx5_lag *ldev)
{
if (!mlx5_lag_is_ready(ldev))
@ -22,6 +23,9 @@ static bool mlx5_lag_multipath_check_prereq(struct mlx5_lag *ldev)
if (__mlx5_lag_is_active(ldev) && !__mlx5_lag_is_multipath(ldev))
return false;
if (ldev->ports > MLX5_LAG_MULTIPATH_OFFLOADS_SUPPORTED_PORTS)
return false;
return mlx5_esw_multipath_prereq(ldev->pf[MLX5_LAG_P1].dev,
ldev->pf[MLX5_LAG_P2].dev);
}


@ -65,6 +65,7 @@ err_metadata:
return err;
}
#define MLX5_LAG_MPESW_OFFLOADS_SUPPORTED_PORTS 2
static int enable_mpesw(struct mlx5_lag *ldev)
{
struct mlx5_core_dev *dev0 = ldev->pf[MLX5_LAG_P1].dev;
@ -74,6 +75,9 @@ static int enable_mpesw(struct mlx5_lag *ldev)
if (ldev->mode != MLX5_LAG_MODE_NONE)
return -EINVAL;
if (ldev->ports > MLX5_LAG_MPESW_OFFLOADS_SUPPORTED_PORTS)
return -EOPNOTSUPP;
if (mlx5_eswitch_mode(dev0) != MLX5_ESWITCH_OFFLOADS ||
!MLX5_CAP_PORT_SELECTION(dev0, port_select_flow_table) ||
!MLX5_CAP_GEN(dev0, create_lag_when_not_master_up) ||


@ -75,13 +75,14 @@ struct mlx5_devcom *mlx5_devcom_register_device(struct mlx5_core_dev *dev)
if (!mlx5_core_is_pf(dev))
return NULL;
if (MLX5_CAP_GEN(dev, num_lag_ports) != MLX5_DEVCOM_PORTS_SUPPORTED)
if (MLX5_CAP_GEN(dev, num_lag_ports) > MLX5_DEVCOM_PORTS_SUPPORTED)
return NULL;
mlx5_dev_list_lock();
sguid0 = mlx5_query_nic_system_image_guid(dev);
list_for_each_entry(iter, &devcom_list, list) {
struct mlx5_core_dev *tmp_dev = NULL;
/* There is at least one device in iter */
struct mlx5_core_dev *tmp_dev;
idx = -1;
for (i = 0; i < MLX5_DEVCOM_PORTS_SUPPORTED; i++) {


@ -6,7 +6,7 @@
#include <linux/mlx5/driver.h>
#define MLX5_DEVCOM_PORTS_SUPPORTED 2
#define MLX5_DEVCOM_PORTS_SUPPORTED 4
enum mlx5_devcom_components {
MLX5_DEVCOM_ESW_OFFLOADS,


@ -1174,7 +1174,13 @@ int mlx5_lag_query_cong_counters(struct mlx5_core_dev *dev,
u64 *values,
int num_counters,
size_t *offsets);
struct mlx5_core_dev *mlx5_lag_get_peer_mdev(struct mlx5_core_dev *dev);
struct mlx5_core_dev *mlx5_lag_get_next_peer_mdev(struct mlx5_core_dev *dev, int *i);
#define mlx5_lag_for_each_peer_mdev(dev, peer, i) \
for (i = 0, peer = mlx5_lag_get_next_peer_mdev(dev, &i); \
peer; \
peer = mlx5_lag_get_next_peer_mdev(dev, &i))
u8 mlx5_lag_get_num_ports(struct mlx5_core_dev *dev);
struct mlx5_uars_page *mlx5_get_uars_page(struct mlx5_core_dev *mdev);
void mlx5_put_uars_page(struct mlx5_core_dev *mdev, struct mlx5_uars_page *up);


@ -1755,7 +1755,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 reserved_at_328[0x2];
u8 relaxed_ordering_read[0x1];
u8 log_max_pd[0x5];
u8 reserved_at_330[0x9];
u8 reserved_at_330[0x7];
u8 vnic_env_cnt_steering_fail[0x1];
u8 reserved_at_338[0x1];
u8 q_counter_aggregation[0x1];
u8 q_counter_other_vport[0x1];
u8 log_max_xrcd[0x5];
@ -3673,7 +3675,13 @@ struct mlx5_ifc_vnic_diagnostic_statistics_bits {
u8 eth_wqe_too_small[0x20];
u8 reserved_at_220[0xdc0];
u8 reserved_at_220[0xc0];
u8 generated_pkt_steering_fail[0x40];
u8 handled_pkt_steering_fail[0x40];
u8 reserved_at_360[0xc80];
};
struct mlx5_ifc_traffic_counter_bits {