Commit Graph

8398 Commits

Danielle Ratson
463e1ab82a mlxsw: Support FLOW_ACTION_MANGLE for SIP and DIP IPv6 addresses
Spectrum-2 supports an ACL action SIP_DIP, which allows changing IPv4 and
IPv6 source and destination addresses. Offload suitable mangles to
the IPv6 address change action.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-07 11:59:57 +00:00
Danielle Ratson
d7809b620f mlxsw: Support FLOW_ACTION_MANGLE for SIP and DIP IPv4 addresses
Spectrum-2 supports an ACL action SIP_DIP, which allows changing IPv4 and
IPv6 source and destination addresses. Offload suitable mangles to
the IPv4 address change action.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-07 11:59:57 +00:00
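
For illustration, the kind of rules these two mangle offloads target could look roughly like the following tc commands (device name and addresses are placeholders, the exact match/action combination needed for offload depends on the setup, and standard iproute2 extended pedit syntax is assumed):

 # IPv4: rewrite source and destination addresses
 tc filter add dev swp1 ingress protocol ip flower skip_sw \
     action pedit ex munge ip src set 192.0.2.1 munge ip dst set 192.0.2.2

 # IPv6: rewrite source and destination addresses
 tc filter add dev swp1 ingress protocol ipv6 flower skip_sw \
     action pedit ex munge ip6 src set 2001:db8::1 munge ip6 dst set 2001:db8::2
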
Danielle Ratson
e3541022e4 mlxsw: core_acl_flex_actions: Add SIP_DIP_ACTION
Add fields related to SIP_DIP_ACTION, which is used for changing SIP
and DIP addresses.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-07 11:59:57 +00:00
Jakub Kicinski
c59400a68c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-03 17:36:16 -08:00
Kees Cook
ad5185735f net/mlx5e: Avoid field-overflowing memcpy()
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

Use flexible arrays instead of zero-element arrays (which look like they
are always overflowing) and split the cross-field memcpy() into two halves
that can be appropriately bounds-checked by the compiler.

We were doing:

	#define ETH_HLEN  14
	#define VLAN_HLEN  4
	...
	#define MLX5E_XDP_MIN_INLINE (ETH_HLEN + VLAN_HLEN)
	...
        struct mlx5e_tx_wqe      *wqe  = mlx5_wq_cyc_get_wqe(wq, pi);
	...
        struct mlx5_wqe_eth_seg  *eseg = &wqe->eth;
        struct mlx5_wqe_data_seg *dseg = wqe->data;
	...
	memcpy(eseg->inline_hdr.start, xdptxd->data, MLX5E_XDP_MIN_INLINE);

The target is wqe->eth.inline_hdr.start (which the compiler sees as being
2 bytes in size), but 18 bytes are copied, intentionally writing across
start (really vlan_tci, 2 bytes). The remaining 16 bytes get written into
wqe->data[0], covering byte_count (4 bytes), lkey (4 bytes), and addr
(8 bytes).

struct mlx5e_tx_wqe {
        struct mlx5_wqe_ctrl_seg   ctrl;                 /*     0    16 */
        struct mlx5_wqe_eth_seg    eth;                  /*    16    16 */
        struct mlx5_wqe_data_seg   data[];               /*    32     0 */

        /* size: 32, cachelines: 1, members: 3 */
        /* last cacheline: 32 bytes */
};

struct mlx5_wqe_eth_seg {
        u8                         swp_outer_l4_offset;  /*     0     1 */
        u8                         swp_outer_l3_offset;  /*     1     1 */
        u8                         swp_inner_l4_offset;  /*     2     1 */
        u8                         swp_inner_l3_offset;  /*     3     1 */
        u8                         cs_flags;             /*     4     1 */
        u8                         swp_flags;            /*     5     1 */
        __be16                     mss;                  /*     6     2 */
        __be32                     flow_table_metadata;  /*     8     4 */
        union {
                struct {
                        __be16     sz;                   /*    12     2 */
                        u8         start[2];             /*    14     2 */
                } inline_hdr;                            /*    12     4 */
                struct {
                        __be16     type;                 /*    12     2 */
                        __be16     vlan_tci;             /*    14     2 */
                } insert;                                /*    12     4 */
                __be32             trailer;              /*    12     4 */
        };                                               /*    12     4 */

        /* size: 16, cachelines: 1, members: 9 */
        /* last cacheline: 16 bytes */
};

struct mlx5_wqe_data_seg {
        __be32                     byte_count;           /*     0     4 */
        __be32                     lkey;                 /*     4     4 */
        __be64                     addr;                 /*     8     8 */

        /* size: 16, cachelines: 1, members: 3 */
        /* last cacheline: 16 bytes */
};

So, split the memcpy() so the compiler can reason about the buffer
sizes.

"pahole" shows no size nor member offset changes to struct mlx5e_tx_wqe
nor struct mlx5e_umr_wqe. "objdump -d" shows no meaningful object
code changes (i.e. only source line number induced differences and
optimizations).

Fixes: b5503b994e ("net/mlx5e: XDP TX forwarding support")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:43 -08:00
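
A minimal user-space sketch of the split described above, with the structures reduced to the fields quoted in the commit message (the names and types below are illustrative stand-ins, not the actual driver code): each memcpy() now targets a destination whose declared size covers exactly what is written into it.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MLX5E_XDP_MIN_INLINE 18			/* ETH_HLEN (14) + VLAN_HLEN (4) */

struct eth_seg {				/* reduced mlx5_wqe_eth_seg */
	uint8_t  other_fields[12];
	uint16_t inline_hdr_sz;
	uint8_t  inline_hdr_start[2];		/* compiler sees only 2 bytes */
};

struct data_seg {				/* reduced mlx5_wqe_data_seg */
	uint32_t byte_count;
	uint32_t lkey;
	uint64_t addr;
};

struct tx_wqe {					/* reduced mlx5e_tx_wqe */
	uint8_t         ctrl[16];
	struct eth_seg  eth;
	struct data_seg data[1];
};

int main(void)
{
	uint8_t pkt[MLX5E_XDP_MIN_INLINE] = {0};	/* Ethernet + VLAN header */
	struct tx_wqe wqe = {0};

	/* Before: one 18-byte copy into a 2-byte field, spilling into data[0].
	 * memcpy(wqe.eth.inline_hdr_start, pkt, MLX5E_XDP_MIN_INLINE);
	 */

	/* After: split the copy so each half fits its destination. */
	memcpy(wqe.eth.inline_hdr_start, pkt, sizeof(wqe.eth.inline_hdr_start));
	memcpy(&wqe.data[0], pkt + sizeof(wqe.eth.inline_hdr_start),
	       MLX5E_XDP_MIN_INLINE - sizeof(wqe.eth.inline_hdr_start));

	printf("inlined %d bytes\n", MLX5E_XDP_MIN_INLINE);
	return 0;
}
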
Kees Cook
6d5c900eb6 net/mlx5e: Use struct_group() for memcpy() region
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

Use struct_group() in struct vlan_ethhdr around members h_dest and
h_source, so they can be referenced together. This will allow memcpy()
and sizeof() to more easily reason about sizes, improve readability,
and avoid future warnings about writing beyond the end of h_dest.

"pahole" shows no size nor member offset changes to struct vlan_ethhdr.
"objdump -d" shows no object code changes.

Fixes: 34802a42b3 ("net/mlx5e: Do not modify the TX SKB")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:43 -08:00
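
A small self-contained sketch of the struct_group() idea, using a simplified local definition of the macro (the kernel's version lives in include/linux/stddef.h and also supports a tag and attributes): grouping h_dest and h_source gives memcpy() and sizeof() a named member that covers exactly both addresses.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ETH_ALEN 6

/* Simplified stand-in for the kernel's struct_group(): a tagless union of
 * an anonymous struct (keeps the members addressable by name) and a named
 * struct (lets the whole group be addressed as one object). */
#define struct_group(NAME, ...)			\
	union {					\
		struct { __VA_ARGS__ };		\
		struct { __VA_ARGS__ } NAME;	\
	}

struct vlan_ethhdr_demo {
	struct_group(addrs,
		uint8_t h_dest[ETH_ALEN];
		uint8_t h_source[ETH_ALEN];
	);
	uint16_t h_vlan_proto;
	uint16_t h_vlan_TCI;
	uint16_t h_vlan_encapsulated_proto;
};

int main(void)
{
	const uint8_t eth_addrs[2 * ETH_ALEN] = {
		0x00, 0x11, 0x22, 0x33, 0x44, 0x55,	/* destination */
		0x66, 0x77, 0x88, 0x99, 0xaa, 0xbb,	/* source */
	};
	struct vlan_ethhdr_demo vhdr;

	/* One copy covering both addresses, bounded by the group's size. */
	memcpy(&vhdr.addrs, eth_addrs, sizeof(vhdr.addrs));

	printf("dest[0]=%02x source[0]=%02x group=%zu bytes\n",
	       vhdr.h_dest[0], vhdr.h_source[0], sizeof(vhdr.addrs));
	return 0;
}
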
Roi Dayan
5b209d1a22 net/mlx5e: Avoid implicit modify hdr for decap drop rule
Currently the driver adds an implicit modify header action for
decap rules on tunnel devices if the port is an OVS port.
This is also done when the action is drop, which makes the modify
header redundant; moreover, the FW doesn't support it and will generate
a syndrome.

kernel: mlx5_core 0000:08:00.0: mlx5_cmd_check:777:(pid 102063): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x8708c3)

Fix it by adding the implicit modify hdr only for fwd actions.

Fixes: b16eb3c81f ("net/mlx5: Support internal port as decap route device")
Fixes: 077cdda764 ("net/mlx5e: TC, Fix memory leak with rules with internal port")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:43 -08:00
Raed Salem
de47db0cf7 net/mlx5e: IPsec: Fix tunnel mode crypto offload for non TCP/UDP traffic
IPsec tunnel mode crypto offload software parser (SWP) setting in the data
path currently always sets the inner L4 offset, regardless of the
encapsulated L4 header type and whether it exists in the first place.
This breaks non-TCP/UDP traffic.

Set the SWP inner L4 offset only when the IPsec tunnel's encapsulated L4
header protocol is TCP/UDP.

While at it, fix the inner IP protocol read used for setting the
MLX5_ETH_WQE_SWP_INNER_L4_UDP flag, to address the case where the inner IP
header is IPv6.

Fixes: f1267798c9 ("net/mlx5: Fix checksum issue of VXLAN and IPsec crypto offload")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:43 -08:00
Raed Salem
5352859b3b net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated traffic
IPsec crypto offload always sets the Ethernet segment checksum flags with
the inner L4 header checksum flag enabled for encapsulated IPsec offloaded
packets, regardless of the encapsulated L4 header type and even if it
doesn't exist in the first place. This breaks non-TCP/UDP traffic.

Set the inner L4 checksum flag only when the encapsulated L4 header
protocol is TCP/UDP, using the software parser swp_inner_l4_offset field as
the indication.

Fixes: 5cfb540ef2 ("net/mlx5e: Set IPsec WAs only in IP's non checksum partial case.")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:42 -08:00
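
A rough user-space illustration of the protocol check both IPsec fixes above rely on (the functions and layout below are illustrative, not the driver's data path): the inner L4 offset and checksum/UDP flags are only meaningful when the inner IP header, IPv4 'protocol' or IPv6 'nexthdr', actually carries TCP or UDP.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <linux/in.h>		/* IPPROTO_TCP, IPPROTO_UDP, IPPROTO_ICMP */
#include <linux/ip.h>		/* struct iphdr */
#include <linux/ipv6.h>		/* struct ipv6hdr */

/* Return the L4 protocol carried by the inner IP header (IPv4 'protocol'
 * or IPv6 'nexthdr'), or -1 for anything that is not IPv4/IPv6. */
static int inner_l4_proto(const void *inner_ip)
{
	uint8_t version = *(const uint8_t *)inner_ip >> 4;

	if (version == 4)
		return ((const struct iphdr *)inner_ip)->protocol;
	if (version == 6)
		return ((const struct ipv6hdr *)inner_ip)->nexthdr;
	return -1;
}

/* Only point the software parser at an inner L4 header (and only set the
 * inner L4 checksum/UDP flags) when that header is really TCP or UDP. */
static bool inner_l4_is_tcp_udp(const void *inner_ip)
{
	int proto = inner_l4_proto(inner_ip);

	return proto == IPPROTO_TCP || proto == IPPROTO_UDP;
}

int main(void)
{
	struct iphdr v4_icmp = { .version = 4, .ihl = 5, .protocol = IPPROTO_ICMP };
	struct ipv6hdr v6_tcp = { .version = 6, .nexthdr = IPPROTO_TCP };

	printf("inner ICMP over IPv4 -> set inner L4 bits: %d\n",
	       inner_l4_is_tcp_udp(&v4_icmp));
	printf("inner TCP over IPv6  -> set inner L4 bits: %d\n",
	       inner_l4_is_tcp_udp(&v6_tcp));
	return 0;
}
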
Maxim Mikityanskiy
736dfe4e68 net/mlx5e: Don't treat small ceil values as unlimited in HTB offload
The hardware spec defines max_average_bw == 0 as "unlimited bandwidth".
max_average_bw is calculated as `ceil / BYTES_IN_MBIT`, which can become
0 when ceil is small, leading to an undesired effect of having no
bandwidth limit.

This commit fixes it by rounding up small values of ceil to 1 Mbit/s.

Fixes: 214baf2287 ("net/mlx5e: Support HTB offload")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:42 -08:00
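
The arithmetic behind the fix, as a tiny standalone sketch (BYTES_IN_MBIT and the function name are stand-ins assumed from the commit text, not copied from the driver):

#include <stdint.h>
#include <stdio.h>

#define BYTES_IN_MBIT (1000000ULL / 8)	/* bytes per second in 1 Mbit/s */

/* Convert an HTB ceil given in bytes/sec into the device's max_average_bw
 * units (Mbit/s). Never return 0: to the hardware, 0 means "unlimited". */
static uint32_t ceil_to_max_average_bw(uint64_t ceil_bytes_per_sec)
{
	uint64_t mbps = ceil_bytes_per_sec / BYTES_IN_MBIT;

	return mbps ? (uint32_t)mbps : 1;	/* round small ceils up to 1 Mbit/s */
}

int main(void)
{
	uint64_t small_ceil = 100000 / 8;	/* 100 Kbit/s in bytes/sec */

	/* Naive division would yield 0 (i.e. no limit); the fix yields 1. */
	printf("max_average_bw = %u\n", ceil_to_max_average_bw(small_ceil));
	return 0;
}
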
Maor Dickman
d8e5883d69 net/mlx5: E-Switch, Fix uninitialized variable modact
The variable modact is not initialized before being used in the modify
header allocation command, which can cause the command to fail.

Fix this by initializing modact with zeros.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 8f1e0b97cc ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:42 -08:00
Maor Dickman
ec41332e02 net/mlx5e: Fix handling of wrong devices during bond netevent
The current implementation of the bond netevent handler only checks whether
the handled netdev is a VF representor; it is missing a check of whether
the VF representor is on the same physical device as the bond handling
the netevent.

Fix this by adding the missing check and optimizing the VF representor
check so that it does not access uninitialized private data and crash.

BUG: kernel NULL pointer dereference, address: 000000000000036c
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
Workqueue: eth3bond0 bond_mii_monitor [bonding]
RIP: 0010:mlx5e_is_uplink_rep+0xc/0x50 [mlx5_core]
RSP: 0018:ffff88812d69fd60 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8881cf800000 RCX: 0000000000000000
RDX: ffff88812d69fe10 RSI: 000000000000001b RDI: ffff8881cf800880
RBP: ffff8881cf800000 R08: 00000445cabccf2b R09: 0000000000000008
R10: 0000000000000004 R11: 0000000000000008 R12: ffff88812d69fe10
R13: 00000000fffffffe R14: ffff88820c0f9000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88846fb00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000036c CR3: 0000000103d80006 CR4: 0000000000370ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 mlx5e_eswitch_uplink_rep+0x31/0x40 [mlx5_core]
 mlx5e_rep_is_lag_netdev+0x94/0xc0 [mlx5_core]
 mlx5e_rep_esw_bond_netevent+0xeb/0x3d0 [mlx5_core]
 raw_notifier_call_chain+0x41/0x60
 call_netdevice_notifiers_info+0x34/0x80
 netdev_lower_state_changed+0x4e/0xa0
 bond_mii_monitor+0x56b/0x640 [bonding]
 process_one_work+0x1b9/0x390
 worker_thread+0x4d/0x3d0
 ? rescuer_thread+0x350/0x350
 kthread+0x124/0x150
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x1f/0x30

Fixes: 7e51891a23 ("net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:41 -08:00
Khalid Manaa
7957837b81 net/mlx5e: Fix broken SKB allocation in HW-GRO
In case the HW doesn't perform header-data split, it will write the whole
packet into the data buffer in the WQ. In this case the SHAMPO CQE handler
can't use the header entry to build the SKB; instead it should allocate
new memory and build the SKB using the function
mlx5e_skb_from_cqe_mpwrq_nonlinear().

Fixes: f97d5c2a45 ("net/mlx5e: Add handle SHAMPO cqe support")
Signed-off-by: Khalid Manaa <khalidm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:41 -08:00
Khalid Manaa
b8d91145ed net/mlx5e: Fix wrong calculation of header index in HW_GRO
The HW doesn't wrap the CQE.shampo.header_index field according to the
headers buffer size; instead it keeps increasing it until the u16 value
overflows.

Thus the mlx5e_handle_rx_cqe_mpwrq_shampo() handler should mask the CQE
header_index field to find the actual header index in the headers buffer.

Fixes: f97d5c2a45 ("net/mlx5e: Add handle SHAMPO cqe support")
Signed-off-by: Khalid Manaa <khalidm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:41 -08:00
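
A condensed sketch of the masking described above, assuming the headers buffer holds a power-of-two number of entries (the constant and names are illustrative):

#include <stdint.h>
#include <stdio.h>

#define HEADERS_PER_WQ 1024	/* assumed power-of-two headers buffer size */

/* The device keeps incrementing the 16-bit header index until it overflows
 * instead of wrapping it at the buffer size; mask it to get the real slot. */
static uint16_t actual_header_index(uint16_t cqe_header_index)
{
	return cqe_header_index & (HEADERS_PER_WQ - 1);
}

int main(void)
{
	uint16_t raw = HEADERS_PER_WQ + 7;	/* index past the buffer size */

	printf("raw=%u -> slot=%u\n", raw, actual_header_index(raw));
	return 0;
}
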
Roi Dayan
880b517691 net/mlx5: Bridge, Fix devlink deadlock on net namespace deletion
When changing the mode to switchdev, rep bridge init, which registers to the
netdevice notifier, holds the devlink lock and then takes pernet_ops_rwsem.
At the same time, deleting a netns holds pernet_ops_rwsem and then takes
the devlink lock.

Example sequence is:
$ ip netns add foo
$ devlink dev eswitch set pci/0000:00:08.0 mode switchdev &
$ ip netns del foo

deleting netns trace:

[ 1185.365555]  ? devlink_pernet_pre_exit+0x74/0x1c0
[ 1185.368331]  ? mutex_lock_io_nested+0x13f0/0x13f0
[ 1185.370984]  ? xt_find_table+0x40/0x100
[ 1185.373244]  ? __mutex_lock+0x24a/0x15a0
[ 1185.375494]  ? net_generic+0xa0/0x1c0
[ 1185.376844]  ? wait_for_completion_io+0x280/0x280
[ 1185.377767]  ? devlink_pernet_pre_exit+0x74/0x1c0
[ 1185.378686]  devlink_pernet_pre_exit+0x74/0x1c0
[ 1185.379579]  ? devlink_nl_cmd_get_dumpit+0x3a0/0x3a0
[ 1185.380557]  ? xt_find_table+0xda/0x100
[ 1185.381367]  cleanup_net+0x372/0x8e0

changing mode to switchdev trace:

[ 1185.411267]  down_write+0x13a/0x150
[ 1185.412029]  ? down_write_killable+0x180/0x180
[ 1185.413005]  register_netdevice_notifier+0x1e/0x210
[ 1185.414000]  mlx5e_rep_bridge_init+0x181/0x360 [mlx5_core]
[ 1185.415243]  mlx5e_uplink_rep_enable+0x269/0x480 [mlx5_core]
[ 1185.416464]  ? mlx5e_uplink_rep_disable+0x210/0x210 [mlx5_core]
[ 1185.417749]  mlx5e_attach_netdev+0x232/0x400 [mlx5_core]
[ 1185.418906]  mlx5e_netdev_attach_profile+0x15b/0x1e0 [mlx5_core]
[ 1185.420172]  mlx5e_netdev_change_profile+0x15a/0x1d0 [mlx5_core]
[ 1185.421459]  mlx5e_vport_rep_load+0x557/0x780 [mlx5_core]
[ 1185.422624]  ? mlx5e_stats_grp_vport_rep_num_stats+0x10/0x10 [mlx5_core]
[ 1185.424006]  mlx5_esw_offloads_rep_load+0xdb/0x190 [mlx5_core]
[ 1185.425277]  esw_offloads_enable+0xd74/0x14a0 [mlx5_core]

Fix this by registering the rep bridges to a per-net netdev notifier
instead of the global one, which operates on the net namespace without
holding pernet_ops_rwsem.

Fixes: 19e9bfa044 ("net/mlx5: Bridge, add offload infrastructure")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:40 -08:00
Dima Chumak
55b2ca702c net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLE
Only prio 1 is supported in NIC mode when there is no ignore-flow-level
support in firmware. But for switchdev mode, which supports a fixed number
of statically pre-allocated prios, this restriction is not relevant, so
it can be relaxed.

Fixes: d671e109bd ("net/mlx5: Fix tc max supported prio for nic mode")
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:40 -08:00
Roi Dayan
5623ef8a11 net/mlx5e: TC, Reject rules with forward and drop actions
Such rules are redundant but allowed and passed to the driver.
The driver does not support offloading such rules so return an error.

Fixes: 03a9d11e6e ("net/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:40 -08:00
Maher Sanalla
3c5193a87b net/mlx5: Use del_timer_sync in fw reset flow of halting poll
Substitute del_timer() with del_timer_sync() in the fw reset polling
deactivation flow, in order to prevent a race condition which occurs
when del_timer() is called and the timer is deactivated while another
process is handling the timer interrupt. This situation led to
the following call trace:
	RIP: 0010:run_timer_softirq+0x137/0x420
	<IRQ>
	recalibrate_cpu_khz+0x10/0x10
	ktime_get+0x3e/0xa0
	? sched_clock_cpu+0xb/0xc0
	__do_softirq+0xf5/0x2ea
	irq_exit_rcu+0xc1/0xf0
	sysvec_apic_timer_interrupt+0x9e/0xc0
	asm_sysvec_apic_timer_interrupt+0x12/0x20
	</IRQ>

Fixes: 38b9f903f2 ("net/mlx5: Handle sync reset request event")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:40 -08:00
Gal Pressman
4a08a13135 net/mlx5e: Fix module EEPROM query
When querying the module EEPROM, the 'offset' variable was misused vs.
the 'query.offset' field.
Fix that by always using 'offset' and assigning its value to
'query.offset' right before the MCIA register read call.

While at it, the cross-pages read size adjustment was changed to be more
intuitive.

Fixes: e19b0a3474 ("net/mlx5: Refactor module EEPROM query")
Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:39 -08:00
Roi Dayan
a2446bc77a net/mlx5e: TC, Reject rules with drop and modify hdr action
This kind of action is not supported by firmware and generates a
syndrome.

kernel: mlx5_core 0000:08:00.0: mlx5_cmd_check:777:(pid 102063): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x8708c3)

Fixes: d7e75a325c ("net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) actions")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:39 -08:00
Vlad Buslov
350d9a8237 net/mlx5: Bridge, ensure dev_name is null-terminated
Even though net_device->name is guaranteed to be a null-terminated string of
size <= IFNAMSIZ, the test robot complains that the return value of
netdev_name() can be larger:

In file included from include/trace/define_trace.h:102,
                    from drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h:113,
                    from drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c:12:
   drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h: In function 'trace_event_raw_event_mlx5_esw_bridge_fdb_template':
>> drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h:24:29: warning: 'strncpy' output may be truncated copying 16 bytes from a string of length 20 [-Wstringop-truncation]
      24 |                             strncpy(__entry->dev_name,
         |                             ^~~~~~~~~~~~~~~~~~~~~~~~~~
      25 |                                     netdev_name(fdb->dev),
         |                                     ~~~~~~~~~~~~~~~~~~~~~~
      26 |                                     IFNAMSIZ);
         |                                     ~~~~~~~~~

This is caused by the fact that the default value of IFNAMSIZ is 16, while
the placeholder value returned by netdev_name() for unnamed net devices
is longer than that.

The offending code is in a tracing function that is only called for mlx5
representors, so there is no straightforward way to reproduce the issue,
but let's fix it for correctness' sake by replacing strncpy() with strscpy()
to ensure that the resulting string is always null-terminated.

Fixes: 9724fd5d9c ("net/mlx5: Bridge, add tracepoints")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:39 -08:00
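
A user-space sketch of why the replacement matters; bounded_copy() below is a minimal stand-in for the kernel's strscpy(), not its implementation: strncpy() with a too-long source leaves the destination unterminated, while an strscpy()-style copy always terminates and reports truncation.

#include <stdio.h>
#include <string.h>
#include <sys/types.h>

#define IFNAMSIZ 16

/* Minimal stand-in for the kernel's strscpy(): always NUL-terminates and
 * returns the copied length, or -1 (-E2BIG in the kernel) on truncation. */
static ssize_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (len >= size) {
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -1;
	}
	memcpy(dst, src, len + 1);
	return (ssize_t)len;
}

int main(void)
{
	/* A placeholder name longer than IFNAMSIZ, like the one returned by
	 * netdev_name() for an unnamed net device. */
	const char *name = "placeholder-device-name";
	char a[IFNAMSIZ], b[IFNAMSIZ];

	strncpy(a, name, IFNAMSIZ);		/* truncated, no NUL written */
	bounded_copy(b, name, IFNAMSIZ);	/* truncated, always NUL-terminated */

	printf("strncpy last byte: 0x%02x, bounded copy last byte: 0x%02x\n",
	       a[IFNAMSIZ - 1], b[IFNAMSIZ - 1]);
	return 0;
}
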
Vlad Buslov
04f8c12f03 net/mlx5: Bridge, take rtnl lock in init error handler
The mlx5_esw_bridge_cleanup() function is expected to be called with the rtnl
lock taken, which is true for mlx5e_rep_bridge_cleanup() but not for the
error handling code in mlx5e_rep_bridge_init(). Add the missing rtnl
lock/unlock calls and extend both mlx5_esw_bridge_cleanup() and its dual
function mlx5_esw_bridge_init() with ASSERT_RTNL() to verify the invariant
from now on.

Fixes: 7cd6a54a82 ("net/mlx5: Bridge, handle FDB events")
Fixes: 19e9bfa044 ("net/mlx5: Bridge, add offload infrastructure")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-01 20:59:38 -08:00
Ido Schimmel
ef14c298b5 mlxsw: spectrum_acl: Allocate default actions for internal TCAM regions
In Spectrum-2 and later ASICs, each TCAM region has a default action
that is executed in case a packet did not match any rule in the region.
The location of the action in the database (KVDL) is computed by adding
the region's index to a base value.

Some TCAM regions are not exposed to the host and are used internally by
the device. Allocate KVDL entries for the default actions of these regions
to prevent the host from overwriting them.

With mlxsw, lookups in the internal regions are not currently performed,
but it is a good practice not to overwrite their default actions.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 19:19:58 -08:00
Amit Cohen
bcdfd615f8 mlxsw: spectrum: Guard against invalid local ports
When processing events generated by the device's firmware, the driver
protects itself from events reported for non-existent local ports, but
not for the CPU port (local port 0), which exists but does not have all
the fields of a regular local port.

This can result in a NULL pointer dereference when trying to access
'struct mlxsw_sp_port' fields which are not initialized for the CPU port.

Commit 63b08b1f68 ("mlxsw: spectrum: Protect driver from buggy firmware")
already handled such issue by bailing early when processing a PUDE event
reported for the CPU port.

Generalize the approach by moving the check to a common function and
making use of it in all relevant places.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 19:19:58 -08:00
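
An illustrative shape of the common guard the commit describes (the types and names here are made up, not the actual mlxsw helper): a single lookup that rejects both out-of-range local ports and the CPU port (local port 0) before any per-port state is touched.

#include <stddef.h>
#include <stdio.h>

#define MAX_LOCAL_PORTS 64
#define CPU_LOCAL_PORT  0

struct demo_port {
	int up;		/* stands in for the per-port state event handlers touch */
};

static struct demo_port *ports[MAX_LOCAL_PORTS];

/* Common guard: firmware event handlers call this instead of indexing the
 * port array directly, so the CPU port and bogus local ports are rejected
 * in one place. */
static struct demo_port *port_get_by_local_port(unsigned int local_port)
{
	if (local_port == CPU_LOCAL_PORT || local_port >= MAX_LOCAL_PORTS)
		return NULL;
	return ports[local_port];
}

int main(void)
{
	printf("CPU port lookup: %p\n", (void *)port_get_by_local_port(CPU_LOCAL_PORT));
	printf("out-of-range lookup: %p\n", (void *)port_get_by_local_port(200));
	return 0;
}
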
Jiri Pirko
636d3ad238 mlxsw: core: Consolidate trap groups to a single event group
For event traps which are used in core, avoid having a separate trap
group for each event. Instead, introduce a single core event trap
group and use it for all event traps.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 19:19:58 -08:00
Jiri Pirko
981f1d18be mlxsw: core: Move functions to register/unregister array of traps to core.c
These functions belong in core.c alongside the functions that
register/unregister a single trap. Move them there and make them
usable by other parts of the mlxsw code.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 19:19:58 -08:00
Jiri Pirko
8ae89cf454 mlxsw: core: Move basic trap group initialization from spectrum.c
Instead of initializing the trap groups used by the core from spectrum.c
via an op, do it directly in core.c.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 19:19:58 -08:00
Jiri Pirko
74e0494d35 mlxsw: core: Move basic_trap_groups_set() call out of EMAD init code
The call initializes the EMAD trap group, but other groups as well.
Therefore, move it out of the EMAD init code and call it beforehand.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 19:19:57 -08:00
Jiri Pirko
7aad5244f0 mlxsw: spectrum: Set basic trap groups from an array
Instead of calling the same code four times, do it in a loop over an array
which contains the trap groups to be set.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 19:19:57 -08:00
Dima Chumak
60dc0ef674 net/mlx5: VLAN push on RX, pop on TX
Some older NIC hardware isn't capable of doing VLAN push on RX and pop
on TX.

A workaround has been added in software to support it, but it has a
performance penalty since it requires a hairpin + loopback.

There's no such limitation with the newer NICs, so there is no need to pay
the price of the workaround. With this change the software workaround is
disabled for the HW versions and steering modes that support the operation
natively.

Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:36 -08:00
Dima Chumak
8348b71ccd net/mlx5: Introduce software defined steering capabilities
There are two different internal steering modes, abstracted from the
rest of the driver. In order to keep the upper layers of the driver agnostic
to the differences in capabilities of the steering modes, this patch
introduces the mlx5_fs_get_capabilities() API to check if a certain
software-defined capability is supported. It differs from the capabilities
exposed by the hardware, as it takes into account the flow steering mode
(SMFS/DMFS) currently enabled.

This implementation supports only two capability flags:

  MLX5_FLOW_STEERING_CAP_VLAN_PUSH_ON_RX
  MLX5_FLOW_STEERING_CAP_VLAN_POP_ON_TX

They map to the DR_ACTION_STATE_PUSH_VLAN and DR_ACTION_STATE_POP_VLAN
actions, implemented in SW steering earlier in commit f5e22be534
("net/mlx5: DR, Split modify VLAN state to separate pop/push states").
This enables using VLAN pop/push without restrictions, e.g. doing
VLAN pop on TX and RX, compared to FW steering, which supports only VLAN
pop on RX and push on TX.

Other capabilities can be added in the future.

Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:35 -08:00
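
A condensed sketch of the pattern this introduces (the flag names follow the commit text, but the surrounding types and logic are illustrative, not the driver's implementation): a capability bitmask derived from the currently selected steering mode, which upper layers query instead of raw hardware caps.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Software-defined capability flags named in the commit message. */
enum {
	MLX5_FLOW_STEERING_CAP_VLAN_PUSH_ON_RX = 1U << 0,
	MLX5_FLOW_STEERING_CAP_VLAN_POP_ON_TX  = 1U << 1,
};

/* Steering mode currently in use; the real driver keeps this in its flow
 * steering state. */
enum steering_mode { STEERING_MODE_DMFS, STEERING_MODE_SMFS };

struct demo_dev {
	enum steering_mode mode;
};

/* Illustrative equivalent of mlx5_fs_get_capabilities(): a capability is
 * reported only if the active steering mode implements it. */
static bool fs_get_capabilities(const struct demo_dev *dev, uint32_t cap)
{
	uint32_t caps = 0;

	if (dev->mode == STEERING_MODE_SMFS)	/* SW steering: unrestricted pop/push */
		caps |= MLX5_FLOW_STEERING_CAP_VLAN_PUSH_ON_RX |
			MLX5_FLOW_STEERING_CAP_VLAN_POP_ON_TX;

	return (caps & cap) == cap;
}

int main(void)
{
	struct demo_dev dmfs = { STEERING_MODE_DMFS };
	struct demo_dev smfs = { STEERING_MODE_SMFS };

	printf("DMFS push-on-RX: %d, SMFS push-on-RX: %d\n",
	       fs_get_capabilities(&dmfs, MLX5_FLOW_STEERING_CAP_VLAN_PUSH_ON_RX),
	       fs_get_capabilities(&smfs, MLX5_FLOW_STEERING_CAP_VLAN_PUSH_ON_RX));
	return 0;
}
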
Roi Dayan
a572c0a748 net/mlx5e: CT, Remove redundant flow args from tc ct calls
The flow arg is not used, so remove it.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:35 -08:00
Roi Dayan
73a3f1bcab net/mlx5e: TC, Store mapped tunnel id on flow attr
In preparation for multiple attr instances, the tunnel_id should
be attr-specific and not flow-specific.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:35 -08:00
Roi Dayan
84ba8062e3 net/mlx5e: Test CT and SAMPLE on flow attr
Currently the mlx5_flow object contains a single mlx5_attr instance.
However, multi-table actions (e.g. CT) instantiate multiple attr instances.
Prepare for multiple attr instances by testing for the CT or SAMPLE flag on
the attr flags instead of the flow flags.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Chris Mi <cmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:34 -08:00
Roi Dayan
e5d4e1da65 net/mlx5e: Refactor eswitch attr flags to just attr flags
The flags are flow attr flags and not eswitch-specific attr flags.
Refactor to remove the esw prefix and move them from eswitch.h
to en_tc.h, where struct mlx5_flow_attr is defined.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:34 -08:00
Roi Dayan
efe6f961cd net/mlx5e: CT, Don't set flow flag CT for ct clear flow
The ct clear action is a normal flow with a modify header that sets the
registers to 0; there is no need for any special handling in tc_ct.c.
Parsing of the ct clear action still allocates mod acts to set the
registers to 0, and the driver continues to add a normal rule with a
modify header context.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:34 -08:00
Roi Dayan
eeed226ed1 net/mlx5e: TC, Hold sample_attr on stack instead of pointer
In a later commit we are going to instantiate multiple attr instances
per flow instead of a single attr.
Parsing a TC sample action allocates new memory, but there is no symmetric
cleanup in the infrastructure.
To avoid an asymmetric alloc/free, hold sample_attr as part of the flow attr
instead of allocating it and holding it as a pointer.
This avoids a cleanup leak when the sample action is not on the first
attr.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:33 -08:00
Roi Dayan
3b49a7edec net/mlx5e: TC, Reject rules with multiple CT actions
The driver doesn't support multiple CT actions.
Multiple CT clear actions are OK, as they are redundant, also when combined
with other CT actions.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:33 -08:00
Roi Dayan
ff99316700 net/mlx5e: TC, Refactor mlx5e_tc_add_flow_mod_hdr() to get flow attr
In a later commit we are going to instantiate multiple attr instances
per flow instead of a single attr.
Make sure mlx5e_tc_add_flow_mod_hdr() uses the correct attr and not flow->attr.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:33 -08:00
Roi Dayan
8be9686d24 net/mlx5e: TC, Pass attr to tc_act can_offload()
In a later commit we are going to instantiate multiple attr instances
per flow instead of a single attr.
Make sure the parsing uses the correct attr and not flow->attr.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:32 -08:00
Roi Dayan
918ed7bf76 net/mlx5e: TC, Split pedit offloads verify from alloc_tc_pedit_action()
Split the pedit verify part into a new subfunction for better
maintainability.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:32 -08:00
Roi Dayan
09bf979232 net/mlx5e: TC, Move pedit_headers_action to parse_attr
Move pedit_headers_action from the flow parse_state to the flow parse_attr.
In a follow-up commit we are going to have multiple attrs per flow,
and pedit_headers_action is unique per attr.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:32 -08:00
Roi Dayan
df67ad625b net/mlx5e: Move counter creation call to alloc_flow_attr_counter()
Move shared code to alloc_flow_attr_counter() for reuse by the next patches.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:31 -08:00
Roi Dayan
c118ebc982 net/mlx5e: Pass attr arg for attaching/detaching encaps
In a later commit, where we will have multiple attr instances per flow,
we would like to pass a specific attr instance to set encaps.

Currently the mlx5_flow object contains a single mlx5_attr instance.
However, multi table actions (e.g. CT) instantiate multiple attr instances.

Currently mlx5e_attach/detach_encap() read the first attr instance
from the flow instance. Modify the functions to receive the attr
instance as a parameter, which is set by the calling function.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:31 -08:00
Roi Dayan
39542e234b net/mlx5e: Move code chunk setting encap dests into its own function
Split the code chunk that sets encap dests out of mlx5e_tc_add_fdb_flow()
to make the function smaller, for maintainability and reuse.
For symmetry, do the same for mlx5e_tc_del_fdb_flow().
While at it, refactor the cleanup to first check for the encap flag, as is
done when setting encap dests.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-27 12:37:31 -08:00
Gustavo A. R. Silva
70b3c38b4c mlxsw: spectrum_kvdl: Use struct_size() helper in kzalloc()
Make use of the struct_size() helper instead of an open-coded version,
in order to avoid any potential type mistakes or integer overflows that,
in the worst scenario, could lead to heap overflows.

Also, address the following sparse warnings:
drivers/net/ethernet/mellanox/mlxsw/spectrum1_kvdl.c:229:24: warning: using sizeof on a flexible structure

Link: https://github.com/KSPP/linux/issues/174
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26 16:35:34 +00:00
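
A compact sketch of the struct_size() idea, using a simplified local macro and made-up struct names (the kernel's helper lives in include/linux/overflow.h and additionally saturates on arithmetic overflow): the allocation size of a struct ending in a flexible array is derived from the element count instead of being open-coded.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel's struct_size(): size of the struct
 * plus 'count' trailing flexible-array elements. The kernel version also
 * checks the multiplication/addition for overflow. */
#define struct_size_demo(ptr, member, count) \
	(sizeof(*(ptr)) + sizeof(*(ptr)->member) * (count))

struct kvdl_part_demo {
	unsigned int usage_count;
	unsigned long entries[];	/* flexible array member */
};

int main(void)
{
	size_t n = 128;
	struct kvdl_part_demo *part;

	/* Instead of: malloc(sizeof(*part) + sizeof(unsigned long) * n) */
	part = calloc(1, struct_size_demo(part, entries, n));
	if (!part)
		return 1;

	part->usage_count = 1;
	part->entries[n - 1] = 0xdeadbeef;

	printf("allocated %zu bytes for %zu entries\n",
	       struct_size_demo(part, entries, n), n);
	free(part);
	return 0;
}
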
Danielle Ratson
b7347cdf10 mlxsw: core_env: Forbid module reset on RJ45 ports
Transceiver module reset through the 'rst' field in the PMAOS register is
not supported on RJ45 ports, so module reset should be rejected.

Therefore, before trying to access this field, validate the port module
type that was queried during initialization and return an error to user
space in case the port module type is RJ45 (twisted pair).

Output example:

 # ethtool --reset swp11 phy
 ETHTOOL_RESET 0x40
 Cannot issue ETHTOOL_RESET: Invalid argument
 $ dmesg
 mlxsw_spectrum 0000:03:00.0 swp11: Reset module is not supported on port module type

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26 11:15:42 +00:00
Danielle Ratson
c8f994ccdd mlxsw: core_env: Forbid power mode set and get on RJ45 ports
PMMP (Port Module Memory Map Properties) and MCION (Management Cable IO
and Notifications) registers are not supported on RJ45 ports, so setting
and getting power mode should be rejected.

Therefore, before trying to access those registers, validate the port
module type that was queried during initialization and return an error
to user space in case the port module type is RJ45 (twisted pair).

Set output example:

 # ethtool --set-module swp1 power-mode-policy auto
 netlink error: mlxsw_core: Power mode is not supported on port module type
 netlink error: Invalid argument

Get output example:

 $ ethtool --show-module swp11
 netlink error: mlxsw_core: Power mode is not supported on port module type
 netlink error: Invalid argument

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26 11:15:42 +00:00
Danielle Ratson
615ebb8cc4 mlxsw: core_env: Forbid getting module EEPROM on RJ45 ports
MCIA (Management Cable Info Access) register is not supported on RJ45
ports, so getting module EEPROM should be rejected.

Therefore, before trying to access this register, validate the port
module type that was queried during initialization and return an error
to user space in case the port module type is RJ45 (twisted pair).

Example output when trying to get the module EEPROM:

Using netlink:

 # ethtool -m swp1
 netlink error: mlxsw_core: EEPROM is not equipped on port module type
 netlink error: Invalid argument

Using IOCTL:

 # ethtool -m swp1
 Cannot get module EEPROM information: Invalid argument
 $ dmesg
 mlxsw_spectrum 0000:03:00.0 swp1: EEPROM is not equipped on port module type

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26 11:15:42 +00:00
Danielle Ratson
e62f5b0e3f mlxsw: core_env: Query and store port module's type during initialization
Query and store the port module's type during initialization so that it
can later be used to determine whether certain configurations are allowed
based on the type.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-26 11:15:42 +00:00