linux

Author	SHA1	Message	Date
Alexander Lobakin	5ce6663158	ice: switch to napi_build_skb() napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. ice driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-28 09:42:33 -08:00
Alexander Lobakin	ef687d61e0	iavf: switch to napi_build_skb() napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. iavf driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-28 09:42:33 -08:00
Alexander Lobakin	6e19cf7d38	i40e: switch to napi_build_skb() napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. i40e driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-28 09:42:33 -08:00
Alexander Lobakin	89a354c03b	e1000: switch to napi_build_skb() napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx element. e1000 driver runs Tx completion polling cycle right before the Rx one. Now that e1000 uses napi_consume_skb() to put skbuff_heads of completed entries into the cache, it will never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx and increase throughput. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-28 09:42:25 -08:00
Alexander Lobakin	dcb95f06ea	e1000: switch to napi_consume_skb() In order to take the best from per-cpu NAPI skbuff_head caches and CPU cycles, let's switch from dev_kfree_skb_any(), which passes skb back to the mm layer, to napi_consume_skb(), which feeds those caches on non-zero budget instead (falls back to the former on 0). Do the replacement in e1000_unmap_and_free_tx_resource(). There are 4 call sites of this function throughout the driver: * e1000_clean_tx_ring(). Slowpath, process context, cleans the whole Tx ring on ifdown. Use budget of 0 here; * e1000_tx_map(). Hotpath, net Tx softirq, unmaps the buffers in case of error. Use 0 as well; * e1000_clean_tx_irq(). Hotpath, NAPI Tx completion polling cycle. As the driver doesn't count completed Tx entries towards the NAPI budget, just use the poll budget of 64 to utilize caches. Apart from being a preparation for switching to napi_build_skb(), this is useful on its own as well, as napi_consume_skb() flushes skb caches by batches of 32 instead of one-at-a-time. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-12-28 09:41:57 -08:00
Aleksander Jan Bajkowski	5be60a9453	net: lantiq_xrx200: fix statistics of received bytes Received frames have FCS truncated. There is no need to subtract FCS length from the statistics. Fixes: `fe1a56420c` ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver") Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-28 12:18:18 +00:00
Christophe JAILLET	1cd5384c88	net: ag71xx: Fix a potential double free in error handling paths 'ndev' is a managed resource allocated with devm_alloc_etherdev(), so there is no need to call free_netdev() explicitly or there will be a double free(). Simplify all error handling paths accordingly. Fixes: `d51b6ce441` ("net: ethernet: add ag71xx driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-28 12:16:34 +00:00
Zekun Shen	5f50153288	atlantic: Fix buff_ring OOB in aq_ring_rx_clean The function obtain the next buffer without boundary check. We should return with I/O error code. The bug is found by fuzzing and the crash report is attached. It is an OOB bug although reported as use-after-free. [ 4.804724] BUG: KASAN: use-after-free in aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.805661] Read of size 4 at addr ffff888034fe93a8 by task ksoftirqd/0/9 [ 4.806505] [ 4.806703] CPU: 0 PID: 9 Comm: ksoftirqd/0 Tainted: G W 5.6.0 #34 [ 4.809030] Call Trace: [ 4.809343] dump_stack+0x76/0xa0 [ 4.809755] print_address_description.constprop.0+0x16/0x200 [ 4.810455] ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.811234] ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.813183] __kasan_report.cold+0x37/0x7c [ 4.813715] ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.814393] kasan_report+0xe/0x20 [ 4.814837] aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.815499] ? hw_atl_b0_hw_ring_rx_receive+0x9a5/0xb90 [atlantic] [ 4.816290] aq_vec_poll+0x179/0x5d0 [atlantic] [ 4.816870] ? _GLOBAL__sub_I_65535_1_aq_pci_func_init+0x20/0x20 [atlantic] [ 4.817746] ? __next_timer_interrupt+0xba/0xf0 [ 4.818322] net_rx_action+0x363/0xbd0 [ 4.818803] ? call_timer_fn+0x240/0x240 [ 4.819302] ? __switch_to_asm+0x40/0x70 [ 4.819809] ? napi_busy_loop+0x520/0x520 [ 4.820324] __do_softirq+0x18c/0x634 [ 4.820797] ? takeover_tasklets+0x5f0/0x5f0 [ 4.821343] run_ksoftirqd+0x15/0x20 [ 4.821804] smpboot_thread_fn+0x2f1/0x6b0 [ 4.822331] ? smpboot_unregister_percpu_thread+0x160/0x160 [ 4.823041] ? __kthread_parkme+0x80/0x100 [ 4.823571] ? smpboot_unregister_percpu_thread+0x160/0x160 [ 4.824301] kthread+0x2b5/0x3b0 [ 4.824723] ? kthread_create_on_node+0xd0/0xd0 [ 4.825304] ret_from_fork+0x35/0x40 Signed-off-by: Zekun Shen <bruceshenzk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 14:49:53 +00:00
Lad Prabhakar	32f52e8e78	net: ethernet: ti: davinci_emac: Use platform_get_irq() to get the interrupt platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq() for DT users only. While at it propagate error code in case request_irq() fails instead of returning -EBUSY. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:22:19 +00:00
Lad Prabhakar	7801302b9a	net: xilinx: emaclite: Use platform_get_irq() to get the interrupt platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:22:19 +00:00
Lad Prabhakar	6c119fbdb8	net: ethoc: Use platform_get_irq() to get the interrupt platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:22:19 +00:00
Lad Prabhakar	441faddaad	fsl/fman: Use platform_get_irq() to get the interrupt platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). While doing so return error pointer from read_dts_node() as platform_get_irq() may return -EPROBE_DEFER. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:22:19 +00:00
Lad Prabhakar	f83b434811	net: pxa168_eth: Use platform_get_irq() to get the interrupt platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:22:19 +00:00
Lad Prabhakar	c0032d6e87	ethernet: netsec: Use platform_get_irq() to get the interrupt platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq(). Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:22:19 +00:00
Horatiu Vultur	0c94d657d2	net: lan966x: Fix the vlan used by host ports The blamed commit changed the vlan used by the host ports to be 4095 instead of 0. Because of this change the following issues are seen: - when the port is probed first it was adding an entry in the MAC table with the wrong vlan (port->pvid which is default 0) and not HOST_PVID - when the port is removed from a bridge, it was using the wrong vlan to add entries in the MAC table. It was using the old PVID and not the HOST_PVID This patch fixes this two issues by using the HOST_PVID instead of port->pvid. Fixes: `6d2c186afa` ("net: lan966x: Add vlan support.") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:16:30 +00:00
Jakub Kicinski	720908e5f8	bnxt_en: Use page frag RX buffers for better software GRO performance If NETIF_F_GRO_HW is disabled, the existing driver code uses kmalloc'ed data for RX buffers. This causes inefficient SW GRO performance because the GRO data is merged using the less efficient frag_list. Use netdev_alloc_frag() and friends instead so that GRO data can be merged into skb_shinfo(skb)->frags for better performance. [Use skb_free_frag() - Vikas Gupta] Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:00:28 +00:00
Edwin Peer	b976969bed	bnxt_en: convert to xdp_do_flush The xdp_do_flush_map function has been replaced with the more general xdp_do_flush(). Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:00:28 +00:00
Michael Chan	3fcbdbd5d8	bnxt_en: Support CQE coalescing mode in ethtool Support showing and setting the CQE mode in ethtool. Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:00:28 +00:00
Michael Chan	df78ea2246	bnxt_en: Support configurable CQE coalescing mode CQE coalescing mode is the same as the timer reset coalescing mode on Broadcom devices. Currently this mode is always enabled if it is supported by the device. Restructure the code slightly to support dynamically changing this mode. Add a flags field to struct bnxt_coal. Initially, the CQE flag will be set for the RX and TX side if the device supports it. We need to move bnxt_init_dflt_coal() to set up default coalescing until the capability is determined. Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:00:28 +00:00
Andy Gospodarek	dc1f5d1ebc	bnxt_en: enable interrupt sampling on 5750X for DIM 5750X (P5) chips handle receiving packets on the NQ rather than the main completion queue so we need to get and set stats from the correct spots for dynamic interrupt moderation. Signed-off-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:00:28 +00:00
Michael Chan	0fb8582ae5	bnxt_en: Log error report for dropped doorbell Log the unrecognized error report type value as well. Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:00:28 +00:00
Somnath Kotur	5a717f4a8e	bnxt_en: Add event handler for PAUSE Storm event FW has been modified to send a new async event when it detects a pause storm. Register for this new event and log it upon receipt. Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-27 12:00:28 +00:00
Jakub Kicinski	6f6f0ac664	Merge tag 'mlx5-fixes-2021-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2021-12-22 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2021-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()' net/mlx5e: Delete forward rule for ct or sample action net/mlx5e: Fix ICOSQ recovery flow for XSK net/mlx5e: Fix interoperability between XSK and ICOSQ recovery flow net/mlx5e: Fix skb memory leak when TC classifier action offloads are disabled net/mlx5e: Wrap the tx reporter dump callback to extract the sq net/mlx5: Fix tc max supported prio for nic mode net/mlx5: Fix SF health recovery flow net/mlx5: Fix error print in case of IRQ request failed net/mlx5: Use first online CPU instead of hard coded CPU net/mlx5: DR, Fix querying eswitch manager vport for ECPF net/mlx5: DR, Fix NULL vs IS_ERR checking in dr_domain_init_resources ==================== Link: https://lore.kernel.org/r/20211223190441.153012-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-23 19:04:33 -08:00
Jakub Kicinski	8b3f913322	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net include/net/sock.h commit `8f905c0e73` ("inet: fully convert sk->sk_rx_dst to RCU rules") commit `43f51df417` ("net: move early demux fields close to sk_refcnt") https://lore.kernel.org/all/20211222141641.0caa0ab3@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-23 16:09:58 -08:00
Nobuhiro Iwamatsu	391e5975c0	net: stmmac: dwmac-visconti: Fix value of ETHER_CLK_SEL_FREQ_SEL_2P5M ETHER_CLK_SEL_FREQ_SEL_2P5M is not 0 bit of the register. This is a value, which is 0. Fix from BIT(0) to 0. Reported-by: Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp> Fixes: `b38dd98ff8` ("net: stmmac: Add Toshiba Visconti SoCs glue driver") Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Link: https://lore.kernel.org/r/20211223073633.101306-1-nobuhiro1.iwamatsu@toshiba.co.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-23 09:58:13 -08:00
Xiaoliang Yang	eccffcf465	net: stmmac: ptp: fix potentially overflowing expression Convert the u32 variable to type u64 in a context where expression of type u64 is required to avoid potential overflow. Fixes: `e9e3720002` ("net: stmmac: ptp: update tas basetime after ptp adjust") Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Link: https://lore.kernel.org/r/20211223073928.37371-1-xiaoliang.yang_1@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-23 09:45:55 -08:00
Ong Boon Leong	e48cb313fd	net: stmmac: add tc flower filter for EtherType matching This patch adds basic support for EtherType RX frame steering for LLDP and PTP using the hardware offload capabilities. Example steps for setting up RX frame steering for LLDP and PTP: $ IFDEVNAME=eth0 $ tc qdisc add dev $IFDEVNAME ingress $ tc qdisc add dev $IFDEVNAME root mqprio num_tc 8 \ map 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0 For LLDP $ tc filter add dev $IFDEVNAME parent ffff: protocol 0x88cc \ flower hw_tc 5 OR $ tc filter add dev $IFDEVNAME parent ffff: protocol LLDP \ flower hw_tc 5 For PTP $ tc filter add dev $IFDEVNAME parent ffff: protocol 0x88f7 \ flower hw_tc 6 Show tc ingress filter $ tc filter show dev $IFDEVNAME ingress v1->v2: Thanks to Kurt's and Sebastian's suggestion. - change from __be16 to u16 etype - change ETHER_TYPE_FULL_MASK to use cpu_to_be16() macro Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-23 11:20:49 +00:00
Horatiu Vultur	2e49761e4f	net: lan966x: Add support for multiple bridge flags This patch series extends the current supported bridge flags with the following flags: BR_FLOOD, BR_BCAST_FLOOD and BR_LEARNING. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-23 11:19:06 +00:00
Christophe JAILLET	4390c6edc0	net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()' All the error handling paths of 'mlx5e_tc_add_fdb_flow()' end to 'err_out' where 'flow_flag_set(flow, FAILED);' is called. All but the new error handling paths added by the commits given in the Fixes tag below. Fix these error handling paths and branch to 'err_out'. Fixes: `166f431ec6` ("net/mlx5e: Add indirect tc offload of ovs internal port") Fixes: `b16eb3c81f` ("net/mlx5: Support internal port as decap route device") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> (cherry picked from commit `31108d142f`)	2021-12-22 20:38:49 -08:00
Chris Mi	2820110d94	net/mlx5e: Delete forward rule for ct or sample action When there is ct or sample action, the ct or sample rule will be deleted and return. But if there is an extra mirror action, the forward rule can't be deleted because of the return. Fix it by removing the return. Fixes: `69e2916ebc` ("net/mlx5: CT: Add support for mirroring") Fixes: `f94d6389f6` ("net/mlx5e: TC, Add support to offload sample action") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:49 -08:00
Maxim Mikityanskiy	19c4aba2d4	net/mlx5e: Fix ICOSQ recovery flow for XSK There are two ICOSQs per channel: one is needed for RX, and the other for async operations (XSK TX, kTLS offload). Currently, the recovery flow for both is the same, and async ICOSQ is mistakenly treated like the regular ICOSQ. This patch prevents running the regular ICOSQ recovery on async ICOSQ. The purpose of async ICOSQ is to handle XSK wakeup requests and post kTLS offload RX parameters, it has nothing to do with RQ and XSKRQ UMRs, so the regular recovery sequence is not applicable here. Fixes: `be5323c837` ("net/mlx5e: Report and recover from CQE error on ICOSQ") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:48 -08:00
Maxim Mikityanskiy	17958d7cd7	net/mlx5e: Fix interoperability between XSK and ICOSQ recovery flow Both regular RQ and XSKRQ use the same ICOSQ for UMRs. When doing recovery for the ICOSQ, don't forget to deactivate XSKRQ. XSK can be opened and closed while channels are active, so a new mutex prevents the ICOSQ recovery from running at the same time. The ICOSQ recovery deactivates and reactivates XSKRQ, so any parallel change in XSK state would break consistency. As the regular RQ is running, it's not enough to just flush the recovery work, because it can be rescheduled. Fixes: `be5323c837` ("net/mlx5e: Report and recover from CQE error on ICOSQ") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:48 -08:00
Gal Pressman	a0cb909644	net/mlx5e: Fix skb memory leak when TC classifier action offloads are disabled When TC classifier action offloads are disabled (CONFIG_MLX5_CLS_ACT in Kconfig), the mlx5e_rep_tc_receive() function which is responsible for passing the skb to the stack (or freeing it) is defined as a nop, and results in leaking the skb memory. Replace the nop with a call to napi_gro_receive() to resolve the leak. Fixes: `28e7606fa8` ("net/mlx5e: Refactor rx handler of represetor device") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:48 -08:00
Amir Tzin	918fc3855a	net/mlx5e: Wrap the tx reporter dump callback to extract the sq Function mlx5e_tx_reporter_dump_sq() casts its void * argument to struct mlx5e_txqsq , but in TX-timeout-recovery flow the argument is actually of type struct mlx5e_tx_timeout_ctx . mlx5_core 0000:08:00.1 enp8s0f1: TX timeout detected mlx5_core 0000:08:00.1 enp8s0f1: TX timeout on queue: 1, SQ: 0x11ec, CQ: 0x146d, SQ Cons: 0x0 SQ Prod: 0x1, usecs since last trans: 21565000 BUG: stack guard page was hit at 0000000093f1a2de (stack is 00000000b66ea0dc..000000004d932dae) kernel stack overflow (page fault): 0000 [#1] SMP NOPTI CPU: 5 PID: 95 Comm: kworker/u20:1 Tainted: G W OE 5.13.0_mlnx #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core] RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180 [mlx5_core] Call Trace: mlx5e_tx_reporter_dump+0x43/0x1c0 [mlx5_core] devlink_health_do_dump.part.91+0x71/0xd0 devlink_health_report+0x157/0x1b0 mlx5e_reporter_tx_timeout+0xb9/0xf0 [mlx5_core] ? mlx5e_tx_reporter_err_cqe_recover+0x1d0/0x1d0 [mlx5_core] ? mlx5e_health_queue_dump+0xd0/0xd0 [mlx5_core] ? update_load_avg+0x19b/0x550 ? set_next_entity+0x72/0x80 ? pick_next_task_fair+0x227/0x340 ? finish_task_switch+0xa2/0x280 mlx5e_tx_timeout_work+0x83/0xb0 [mlx5_core] process_one_work+0x1de/0x3a0 worker_thread+0x2d/0x3c0 ? process_one_work+0x3a0/0x3a0 kthread+0x115/0x130 ? kthread_park+0x90/0x90 ret_from_fork+0x1f/0x30 --[ end trace 51ccabea504edaff ]--- RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180 PKRU: 55555554 Kernel panic - not syncing: Fatal exception Kernel Offset: disabled end Kernel panic - not syncing: Fatal exception To fix this bug add a wrapper for mlx5e_tx_reporter_dump_sq() which extracts the sq from struct mlx5e_tx_timeout_ctx and set it as the TX-timeout-recovery flow dump callback. Fixes: `5f29458b77` ("net/mlx5e: Support dump callback in TX reporter") Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Amir Tzin <amirtz@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:48 -08:00
Chris Mi	d671e109bd	net/mlx5: Fix tc max supported prio for nic mode Only prio 1 is supported if firmware doesn't support ignore flow level for nic mode. The offending commit removed the check wrongly. Add it back. Fixes: `9a99c8f125` ("net/mlx5e: E-Switch, Offload all chain 0 priorities when modify header and forward action is not supported") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:47 -08:00
Moshe Shemesh	33de865f7b	net/mlx5: Fix SF health recovery flow SF do not directly control the PCI device. During recovery flow SF should not be allowed to do pci disable or pci reset, its PF will do it. It fixes the following kernel trace: mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:387:(pid 40948): starting health recovery flow mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called mlx5_core 0000:03:00.0: wait vital counter value 0xab175 after 1 iterations mlx5_core.sf mlx5_core.sf.25: firmware version: 24.32.532 mlx5_core.sf mlx5_core.sf.23: mlx5_health_try_recover:387:(pid 40946): starting health recovery flow mlx5_core 0000:03:00.0: mlx5_pci_slot_reset was called mlx5_core 0000:03:00.0: wait vital counter value 0xab193 after 1 iterations mlx5_core.sf mlx5_core.sf.23: firmware version: 24.32.532 mlx5_core.sf mlx5_core.sf.25: mlx5_cmd_check:813:(pid 40948): ENABLE_HCA(0x104) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x658908) mlx5_core.sf mlx5_core.sf.25: mlx5_function_setup:1292:(pid 40948): enable hca failed mlx5_core.sf mlx5_core.sf.25: mlx5_health_try_recover:389:(pid 40948): health recovery failed Fixes: `1958fc2f07` ("net/mlx5: SF, Add auxiliary device driver") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:47 -08:00
Shay Drory	aa968f9220	net/mlx5: Fix error print in case of IRQ request failed In case IRQ layer failed to find or to request irq, the driver is printing the first cpu of the provided affinity as part of the error print. Empty affinity is a valid input for the IRQ layer, and it is an error to call cpumask_first() on empty affinity. Remove the first cpu print from the error message. Fixes: `c36326d38d` ("net/mlx5: Round-Robin EQs over IRQs") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:47 -08:00
Shay Drory	26a7993c93	net/mlx5: Use first online CPU instead of hard coded CPU Hard coded CPU (0 in our case) might be offline. Hence, use the first online CPU instead. Fixes: `f891b7cdbd` ("net/mlx5: Enable single IRQ for PCI Function") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:46 -08:00
Yevgeny Kliteynik	624bf42c2e	net/mlx5: DR, Fix querying eswitch manager vport for ECPF On BlueField the E-Switch manager is the ECPF (vport 0xFFFE), but when querying capabilities of ECPF eswitch manager, need to query vport 0 with other_vport = 0. Fixes: `9091b821aa` ("net/mlx5: DR, Handle eswitch manager and uplink vports separately") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:46 -08:00
Miaoqian Lin	6b8b425858	net/mlx5: DR, Fix NULL vs IS_ERR checking in dr_domain_init_resources The mlx5_get_uars_page() function returns error pointers. Using IS_ERR() to check the return value to fix this. Fixes: `4ec9e7b026` ("net/mlx5: DR, Expose steering domain functionality") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-12-22 20:38:46 -08:00
Amit Cohen	70ec72d5b6	mlxsw: spectrum_flower: Make vlan_id limitation more specific Spectrum ASICs do not support matching of VLAN ID at egress. Currently, mlxsw driver forbids matching of all VLAN related fields at egress, which is too strict check. For example, the following filter is not supported by the driver: $ tc filter add dev swpX egress protocol 802.1q pref 1 handle 101 flower vlan_ethtype ipv4 src_ip .. dst_ip .. skip_sw action pass Error: mlxsw_spectrum: vlan_id key is not supported on egress. We have an error talking to the kernel The filter above does not match on VLAN ID, but is bounced anyway. Make the check more specific, forbid only matching of 'vlan_id' at egress. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-22 17:14:32 -08:00
Jakub Kicinski	5de24da1b3	Merge tag 'mlx5-updates-2021-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2021-12-21 1) From Shay Drory: Devlink user knobs to control device's EQ size This series provides knobs which will enable users to minimize memory consumption of mlx5 Functions (PF/VF/SF). mlx5 exposes two new generic devlink params for EQ size configuration and uses devlink generic param max_macs. LINK: https://lore.kernel.org/netdev/20211208141722.13646-1-shayd@nvidia.com/ 2) From Tariq and Lama, allocate software channel objects and statistics of a mlx5 netdevice private data dynamically upon first demand to save on memory. * tag 'mlx5-updates-2021-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Take packet_merge params directly from the RX res struct net/mlx5e: Allocate per-channel stats dynamically at first usage net/mlx5e: Use dynamic per-channel allocations in stats net/mlx5e: Allow profile-specific limitation on max num of channels net/mlx5e: Save memory by using dynamic allocation in netdev priv net/mlx5e: Add profile indications for PTP and QOS HTB features net/mlx5e: Use bitmap field for profile features net/mlx5: Remove the repeated declaration net/mlx5: Let user configure max_macs generic param devlink: Clarifies max_macs generic devlink param net/mlx5: Let user configure event_eq_size param devlink: Add new "event_eq_size" generic device param net/mlx5: Let user configure io_eq_size param devlink: Add new "io_eq_size" generic device param ==================== Link: https://lore.kernel.org/r/20211222031604.14540-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-22 17:13:01 -08:00
Colin Ian King	62a3106697	net: broadcom: bcm4908enet: remove redundant variable bytes The variable bytes is being used to summate slot lengths, however the value is never used afterwards. The summation is redundant so remove variable bytes. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20211222003937.727325-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-22 14:58:43 -08:00
Jesse Brandeburg	0092db5fac	ice: trivial: fix odd indenting Fix an odd indent where some code was left indented, and causes smatch to warn: ice_log_pkg_init() warn: inconsistent indenting While here, for consistency, add a break after the default case. This commit has a Fixes: but we caught this while it was only in net-next. Fixes: `247dd97d71` ("ice: Refactor status flow for DDP load") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Link: https://lore.kernel.org/r/20211221230538.2546315-1-jesse.brandeburg@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-22 14:57:55 -08:00
Jakub Kicinski	2030eddced	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 100GbE Intel Wired LAN Driver Updates 2021-12-21 This series contains updates to ice driver only. Karol modifies the reset flow to correct issues with PTP reset. Jake extends PTP support for E822 based devices. This includes a few cleanup patches, that fix some minor issues. In addition, there are some slight refactors to ease the addition of E822 support, followed by adding the new hardware implementation ice_ptp_hw.c. There are a few major differences with E822 support compared to E810 support: ) The E822 device has a Clock Generation Unit which must be initialized in order to generate proper clock frequencies on the output that drives the PTP hardware clock registers ) The E822 PHY is a bit different and requires a more complex initialization procedure which must be rerun any time the link configuration changes. ) The E822 devices support enhanced timestamp calibration by making use of a process called Vernier offset measurement. This allows the hardware to measure phase offset related to the PHY clocks for Serdes and FEC, reducing the inaccuracy of the timestamp relative to the actual packet transmission and receipt. Making use of this requires data gathered from the first transmitted and received packets, and waiting for the PHY to complete the calibration measurements. This is done as part of a new kthread, ov_work. Note that to avoid delay in enabling timestamps, we start the PHY in 'bypass' mode which allows timestamps to be captured without the Vernier calibration measurement. Once the first packets have been sent and received, we then complete the calibration setup and exit bypass mode and begin using the more precise timestamps. According to the datasheet, timestamps without calibration data can be incorrect relative to actual receipt or transmission by up to 1 clock cycle (~1.25 nanoseconds), while calibrated timestamps should be correct to within 1/8th of a clock cycle (~0.15 nanoseconds). ) E822 devices support crosstimestamping via PCIe PTM, which we enable when available on the platform. There is a fair amount of logic required to perform PHY and CGU initialization, which is the vast majority of the new code, but it is fairly self contained within ice_ptp_hw.c, with the exception of monitoring for offset validity being handled by a kthread. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: support crosstimestamping on E822 devices if supported ice: exit bypass mode once hardware finishes timestamp calibration ice: ensure the hardware Clock Generation Unit is configured ice: implement basic E822 PTP support ice: convert clk_freq capability into time_ref ice: introduce ice_ptp_init_phc function ice: use 'int err' instead of 'int status' in ice_ptp_hw.c ice: PTP: move setting of tstamp_config ice: introduce ice_base_incval function ice: Fix E810 PTP reset flow ==================== Link: https://lore.kernel.org/r/20211221174845.3063640-1-anthony.l.nguyen@intel.com Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-22 14:25:42 -08:00
Jiasheng Jiang	9b8bdd1eb5	sfc: falcon: Check null pointer of rx_queue->page_ring Because of the possible failure of the kcalloc, it should be better to set rx_queue->page_ptr_mask to 0 when it happens in order to maintain the consistency. Fixes: `5a6681e22c` ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://lore.kernel.org/r/20211220140344.978408-1-jiasheng@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-22 12:25:18 -08:00
Jiasheng Jiang	bdf1b5c388	sfc: Check null pointer of rx_queue->page_ring Because of the possible failure of the kcalloc, it should be better to set rx_queue->page_ptr_mask to 0 when it happens in order to maintain the consistency. Fixes: `5a6681e22c` ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://lore.kernel.org/r/20211220135603.954944-1-jiasheng@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-22 12:23:18 -08:00
David E. Box	a5f8ef0baf	net/mlx5e: Use auxiliary_device driver data helpers Use auxiliary_get_drvdata and auxiliary_set_drvdata helpers. Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20211221235852.323752-4-david.e.box@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-22 13:59:01 +01:00
Jiasheng Jiang	99d7fbb5ce	net: ks8851: Check for error irq Because platform_get_irq() could fail and return error irq. Therefore, it might be better to check it if order to avoid the use of error irq. Fixes: `797047f875` ("net: ks8851: Implement Parallel bus operations") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-22 10:23:50 +00:00
Jiasheng Jiang	cb93b3e11d	drivers: net: smc911x: Check for error irq Because platform_get_irq() could fail and return error irq. Therefore, it might be better to check it if order to avoid the use of error irq. Fixes: `ae150435b5` ("smsc: Move the SMC (SMSC) drivers") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-22 10:23:03 +00:00

... 4 5 6 7 8 ...

41005 Commits