Commit Graph

1297933 Commits

Author SHA1 Message Date
Daniel Xu
23dc986732 bpf, cpumap: Move xdp:xdp_cpumap_kthread tracepoint before rcv
cpumap takes RX processing out of softirq and onto a separate kthread.
Since the kthread needs to be scheduled in order to run (versus softirq
which does not), we can theoretically experience extra latency if the
system is under load and the scheduler is being unfair to us.

Moving the tracepoint to before passing the skb list up the stack allows
users to more accurately measure enqueue/dequeue latency introduced by
cpumap via xdp:xdp_cpumap_enqueue and xdp:xdp_cpumap_kthread tracepoints.

f9419f7bd7 ("bpf: cpumap add tracepoints") which added the tracepoints
states that the intent behind them was for general observability and for
a feedback loop to see if the queues are being overwhelmed. This change
does not mess with either of those use cases but rather adds a third
one.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/bpf/47615d5b5e302e4bd30220473779e98b492d47cd.1725585718.git.dxu@dxuuu.xyz
2024-09-11 16:32:11 +02:00
Maciej Fijalkowski
d41905b3bb selftests/xsk: Read current MAX_SKB_FRAGS from sysctl knob
Currently, xskxceiver assumes that the MAX_SKB_FRAGS value is always 17,
which is not true: since the introduction of BIG TCP it can take any
value from 17 to 45 via CONFIG_MAX_SKB_FRAGS.

Adjust the TOO_MANY_FRAGS test case to read the currently configured
MAX_SKB_FRAGS value from /proc/sys/net/core/max_skb_frags. If the running
system does not provide that sysctl file, run the test with a default
value.
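
The read-with-fallback behaviour described above can be sketched as follows. This is a Python illustration, not the xskxceiver C code: the helper name `read_max_skb_frags` is hypothetical, and using 17 as the default is an assumption based on the value the test previously hard-coded.

```python
from pathlib import Path

# Hypothetical names for illustration; not taken from xskxceiver.
SYSCTL_PATH = "/proc/sys/net/core/max_skb_frags"
DEFAULT_MAX_SKB_FRAGS = 17  # assumed default: the value the test used to hard-code


def read_max_skb_frags(path=SYSCTL_PATH, default=DEFAULT_MAX_SKB_FRAGS):
    """Return the configured MAX_SKB_FRAGS, or `default` if the sysctl is unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing, unreadable, or not an integer: fall back to the default.
        return default
```

A caller would size the TOO_MANY_FRAGS packet from the returned value instead of a compile-time constant.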

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20240910124129.289874-1-maciej.fijalkowski@intel.com
2024-09-11 15:48:35 +02:00
David S. Miller
bf73478b53 Merge branch 'lan743x-phylink'
Raju Lakkaraju says:

====================
Add support to PHYLINK for LAN743x/PCI11x1x chips

This is the follow-up patch series of
https://lkml.iu.edu/hypermail/linux/kernel/2310.2/02078.html

Divide the PHYLINK adaptation and SFP modifications into two separate patch
series.

The current patch series focuses on transitioning the LAN743x driver's PHY
support from phylib to phylink.

Tested on PCI11010 Rev-1 Evaluation board

Change List:
============
V5 -> V6:
  - Remove the lan743x_find_max_speed( ) function. Not required
  - Add EEE enable check before calling lan743x_mac_eee_enable( ) function
V4 -> V5:
  - Remove the fixed_phy_unregister( ) function. Not required
  - Remove the "phydev->eee_enabled" check to update the MAC EEE
    enable/disable
  - Call lan743x_mac_eee_enable() with true after update tx_lpi_timer.
  - Add phy_support_eee() to initialize the EEE flags
V3 -> V4:
  - Add fixed-link patch along with this series.
    Note: This code was developed by Mr. Russell King
    Ref:
    https://lore.kernel.org/netdev/LV8PR11MB8700C786F5F1C274C73036CC9F8E2@LV8PR11MB8700.namprd11.prod.outlook.com/T/#me943adf54f1ea082edf294aba448fa003a116815
  - Change phylink fixed-link function header's string from "Returns" to
    "Returns:"
  - Remove the EEE private variable from LAN743x adapter structure and fix the
    EEE's set/get functions
  - Replace setting the individual caps (i.e. _RGMII, _RGMII_ID, _RGMII_RXID and
    _RGMII_TXID) with the phy_interface_set_rgmii( ) function
  - Change lan743x_set_eee( ) to lan743x_mac_eee_enable( )

V2 -> V3:
  - Remove the unwanted parens in each of these if() sub-blocks
  - Replace "to_net_dev(config->dev)" with "netdev".
  - Add GMII_ID/RGMII_TXID/RGMII_RXID in supported_interfaces
  - Fix the lan743x_phy_handle_exists( ) return type

V1 -> V2:
  - Address Russell King's comments, i.e. remove the speed, duplex update in
    lan743x_phylink_mac_config( )
  - pre-March 2020 legacy support has been removed

V0 -> V1:
  - Integrate with Synopsys DesignWare XPCS drivers
  - Based on external review comments,
  - Changes made to SGMII interface support only 1G/100M/10M bps speed
  - Changes made to 2500Base-X interface support only 2.5Gbps speed
  - Add check for not is_sgmii_en with is_sfp_support_en support
  - Change the "pci11x1x_strap_get_status" function return type from void to
    int
  - Add ethtool phylink wol, eee, pause get/set functions
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-11 11:06:12 +01:00
Raju Lakkaraju
f95f28d794 net: lan743x: Add support to ethtool phylink get and set settings
Add support to ethtool phylink functions:
  - get/set settings like speed, duplex etc
  - get/set the wake-on-lan (WOL)
  - get/set the energy-efficient ethernet (EEE)
  - get/set the pause

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-11 11:06:12 +01:00
Raju Lakkaraju
a5f199a8d8 net: lan743x: Migrate phylib to phylink
Migrate phy support from phylib to phylink.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-11 11:06:12 +01:00
Raju Lakkaraju
92b740a43f net: lan743x: Create separate Link Speed Duplex state function
Split the Link Speed Duplex (LSD) state update out of lan743x_sgmii_config()
into a separate function so it can be used as a subroutine.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-11 11:06:12 +01:00
Raju Lakkaraju
ef0250456c net: lan743x: Create separate PCS power reset function
Split the PCS power reset out of lan743x_sgmii_config() into a separate
function so it can be used as a subroutine.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-11 11:06:12 +01:00
Russell King
4b3fc475c6 net: phylink: Add phylink_set_fixed_link() to configure fixed link state in phylink
The function allows for the configuration of a fixed link state for a given
phylink instance. This addition is particularly useful for network devices that
operate with a fixed link configuration, where the link parameters do not change
dynamically. By using `phylink_set_fixed_link()`, drivers can easily set up
the fixed link state during initialization or configuration changes.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-11 11:06:11 +01:00
Arnd Bergmann
0e7af99aef RISC-V soc fixes for v6.11-final

Merge tag 'riscv-soc-fixes-for-v6.11-final' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes

RISC-V soc fixes for v6.11-final

StarFive:
A fix to return one of the clocks on the JH7110 from 1 GHz to 1.5 GHz

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-soc-fixes-for-v6.11-final' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: dts: starfive: jh7110-common: Fix lower rate of CPUfreq by setting PLL0 rate to 1.5GHz

Link: https://lore.kernel.org/r/20240909-hybrid-groovy-601a33b5b309@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-09-11 08:54:37 +00:00
Arnd Bergmann
b97acde6f9 platform: cznic: turris-omnia-mcu: fix HW_RANDOM dependency
There is still a build failure when the hwrng support is in a loadable
module but the mcu driver is built-in:

arm-linux-gnueabi-ld: drivers/platform/cznic/turris-omnia-mcu-trng.o: in function `omnia_mcu_register_trng':
turris-omnia-mcu-trng.c:(.text.omnia_mcu_register_trng+0x11c): undefined reference to `devm_hwrng_register'

Change the dependency to explicitly disallow the broken
configuration.

Fixes: 41bb142a40 ("platform: cznic: turris-omnia-mcu: Add support for MCU provided TRNG")
Reviewed-by: Marek Behún <kabel@kernel.org>
Link: https://lore.kernel.org/r/20240909110417.247453-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-09-11 08:54:21 +00:00
Petr Mladek
2c83ded8ae Merge branch 'for-6.11-fixup' into for-linus
2024-09-11 09:30:22 +02:00
Jakub Kicinski
d1aaaa2e0a Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-09-09 (ice, igb)

This series contains updates to ice and igb drivers.

Martyna moves LLDP rule removal to the proper uninitialization function
for ice.

Jake corrects accounting logic for FWD_TO_VSI_LIST switch filters on
ice.

Przemek removes incorrect, explicit calls to pci_disable_device() for
ice.

Michal Schmidt stops incorrect use of VSI list for VLAN use on ice.

Sriram Yagnaraman adjusts igb_xdp_ring_update_tail() to be called under
Tx lock on igb.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  igb: Always call igb_xdp_ring_update_tail() under Tx lock
  ice: fix VSI lists confusion when adding VLANs
  ice: stop calling pci_disable_device() as we use pcim
  ice: fix accounting for filters shared by multiple VSIs
  ice: Fix lldp packets dropping after changing the number of channels
====================

Link: https://patch.msgid.link/20240909203842.3109822-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 20:15:10 -07:00
Jakub Kicinski
3d731dc9b1 mlx5-fixes-2024-09-09

Merge tag 'mlx5-fixes-2024-09-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2024-09-09

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2024-09-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Fix bridge mode operations when there are no VFs
  net/mlx5: Verify support for scheduling element and TSAR type
  net/mlx5: Add missing masks and QoS bit masks for scheduling elements
  net/mlx5: Explicitly set scheduling element and TSAR type
  net/mlx5e: Add missing link mode to ptys2ext_ethtool_map
  net/mlx5e: Add missing link modes to ptys2ethtool_map
  net/mlx5: Update the list of the PCI supported devices
====================

Link: https://patch.msgid.link/20240909194505.69715-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 20:11:40 -07:00
Jakub Kicinski
f3b6129b7d Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
ice: support devlink subfunction

Michal Swiatkowski says:

Currently ice driver does not allow creating more than one networking
device per physical function. The only way to have more hardware backed
netdev is to use SR-IOV.

Following patchset adds support for devlink port API. For each new
pcisf type port, driver allocates new VSI, configures all resources
needed, including dynamically MSIX vectors, program rules and registers
new netdev.

This series supports only one Tx/Rx queue pair per subfunction.

Example commands:
devlink port add pci/0000:31:00.1 flavour pcisf pfnum 1 sfnum 1000
devlink port function set pci/0000:31:00.1/1 hw_addr 00:00:00:00:03:14
devlink port function set pci/0000:31:00.1/1 state active
devlink port function del pci/0000:31:00.1/1

Make the port representor and eswitch code generic to support
subfunction representor type.

VSI configuration is slightly different between VF and SF. It needs to
be reflected in the code.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: subfunction activation and base devlink ops
  ice: basic support for VLAN in subfunctions
  ice: support subfunction devlink Tx topology
  ice: implement netdevice ops for SF representor
  ice: check if SF is ready in ethtool ops
  ice: don't set target VSI for subfunction
  ice: create port representor for SF
  ice: make representor code generic
  ice: implement netdev for subfunction
  ice: base subfunction aux driver
  ice: allocate devlink for subfunction
  ice: treat subfunction VSI the same as PF VSI
  ice: add basic devlink subfunctions support
  ice: export ice ndo_ops functions
  ice: add new VSI type for subfunctions
====================

Link: https://patch.msgid.link/20240906223010.2194591-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 20:05:10 -07:00
Jakub Kicinski
474bb1aa45 mlx5-updates-2024-08-29

Merge tag 'mlx5-updates-2024-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2024-08-29

HW-Managed Flow Steering in mlx5 driver

Yevgeny Kliteynik says:
=======================

1. Overview
-----------

ConnectX devices support packet matching, modification, and redirection.
This functionality is referred to as Flow Steering.
To configure a steering rule, the rule is written to the device-owned
memory. This memory is accessed and cached by the device when processing
a packet.

The first implementation of Flow Steering was done in FW, and it is
referred to in the mlx5 driver as Device-Managed Flow Steering (DMFS).
Later we introduced SW-managed Flow Steering (SWS or SMFS), where the
driver is writing directly to the device's configuration memory (ICM)
through RC QP using RDMA operations (RDMA-read and RDMA-write), thus
achieving higher rates of rule insertion/deletion.

Now we introduce a new flow steering implementation: HW-Managed Flow
Steering (HWS or HMFS).

In this new approach, the driver is configuring steering rules directly
to the HW using the WQs with a special new type of WQE. This way we can
reach higher rule insertion/deletion rate with much lower CPU utilization
compared to SWS.

The key benefits of HWS as opposed to SWS:
+ HW manages the steering decision tree
   - HW calculates CRC for each entry
   - HW handles tree hash collisions
   - HW & FW manage objects refcount
+ HW keeps cache coherency:
   - HW provides tree access locking and synchronization
   - HW provides notification on completion
+ Insertion rate isn’t affected by background traffic
   - Dedicated HW components that handle insertion

2. Performance
--------------

Measuring Connection Tracking with simple IPv4 flows w/o NAT, we
are able to get ~5 times more flows offloaded per second using HWS.

3. Configuration
----------------

The enablement of HWS mode in eswitch manager is done using the same
devlink param that is already used for switching between FW-managed
steering and SW-managed steering modes:

  # devlink dev param set pci/<PCI_ID> name flow_steering_mode cmod runtime value hmfs

4. Upstream Submission
----------------------

HWS support consists of 3 main components:
+ Steering:
   - The lower layer that exposes HWS API to upper layers and implements
     all the management of flow steering building blocks
+ FS-Core
   - Implementation of fs_hws layer to enable fs_core to use HWS instead
     of FW or SW steering
   - Create HW steering action pools to utilize the ability of HWS to
     share steering actions among different rules
   - Add support for configuring HWS mode through devlink command,
     similar to configuring SWS mode
+ Connection Tracking
   - Implementation of CT support for HW steering
   - Hooks up the CT ops for the new steering mode and uses the HWS API
     to implement connection tracking.

Because of the large number of patches, we need to perform the submission
in several separate patch series. This series is the first submission that
lays the groundwork for the next submissions, where an actual user of HWS
will be added.

5. Patches in this series
-------------------------

This patch series contains implementation of the first bullet from above.

=======================

* tag 'mlx5-updates-2024-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: HWS, added API and enabled HWS support
  net/mlx5: HWS, added send engine and context handling
  net/mlx5: HWS, added debug dump and internal headers
  net/mlx5: HWS, added backward-compatible API handling
  net/mlx5: HWS, added memory management handling
  net/mlx5: HWS, added vport handling
  net/mlx5: HWS, added modify header pattern and args handling
  net/mlx5: HWS, added FW commands handling
  net/mlx5: HWS, added matchers functionality
  net/mlx5: HWS, added definers handling
  net/mlx5: HWS, added rules handling
  net/mlx5: HWS, added tables handling
  net/mlx5: HWS, added actions handling
  net/mlx5: Added missing definitions in preparation for HW Steering
  net/mlx5: Added missing mlx5_ifc definition for HW Steering
====================

Link: https://patch.msgid.link/20240909181250.41596-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 20:01:15 -07:00
Jakub Kicinski
ea403549da ipsec-next-2024-09-10

Merge tag 'ipsec-next-2024-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2024-09-10

1) Remove an unneeded WARN_ON on packet offload.
   From Patrisious Haddad.

2) Add a copy from skb_seq_state to buffer function.
   This is needed for the upcoming IPTFS patchset.
   From Christian Hopps.

3) Spelling fix in xfrm.h.
   From Simon Horman.

4) Speed up xfrm policy insertions.
   From Florian Westphal.

5) Add and revert a patch to support xfrm interfaces
   for packet offload. This patch was just half cooked.

6) Extend usage of the new xfrm_policy_is_dead_or_sk helper.
   From Florian Westphal.

7) Update comments on sdb and xfrm_policy.
   From Florian Westphal.

8) Fix a null pointer dereference in the new policy insertion
   code. From Florian Westphal.

9) Fix an uninitialized variable in the new policy insertion
   code. From Nathan Chancellor.

* tag 'ipsec-next-2024-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next:
  xfrm: policy: Restore dir assignments in xfrm_hash_rebuild()
  xfrm: policy: fix null dereference
  Revert "xfrm: add SA information to the offloaded packet"
  xfrm: minor update to sdb and xfrm_policy comments
  xfrm: policy: use recently added helper in more places
  xfrm: add SA information to the offloaded packet
  xfrm: policy: remove remaining use of inexact list
  xfrm: switch migrate to xfrm_policy_lookup_bytype
  xfrm: policy: don't iterate inexact policies twice at insert time
  selftests: add xfrm policy insertion speed test script
  xfrm: Correct spelling in xfrm.h
  net: add copy from skb_seq_state to buffer function
  xfrm: Remove documentation WARN_ON to limit return values for offloaded SA
====================

Link: https://patch.msgid.link/20240910065507.2436394-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 19:00:47 -07:00
Jakub Kicinski
e35b0515bb Merge branch 'bnxt_en-msix-improvements'
Michael Chan says:

====================
bnxt_en: MSIX improvements

This patchset makes some improvements related to MSIX.  The first
patch adjusts the default MSIX vectors assigned for RoCE.  On the
PF, the number of MSIX is increased to 64 from the current 9.  The
second patch allocates additional MSIX vectors ahead of time when
changing ethtool channels if dynamic MSIX is supported.  The 3rd
patch makes sure that the IRQ name is not truncated.
====================

Link: https://patch.msgid.link/20240909202737.93852-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:42:49 -07:00
Edwin Peer
f77cdee5db bnxt_en: resize bnxt_irq name field to fit format string
The name field of struct bnxt_irq is written using snprintf in
bnxt_setup_msix(). Make the field large enough to fit the maximal
formatted string to prevent truncation.  Truncated IRQ names are
less meaningful to the user.  For example, "enp4s0f0np0-TxRx-0"
gets truncated to "enp4s0f0np0-TxRx-" with the existing code.

Make sure we have space for the extra characters added to the IRQ
names:

  - the characters introduced by the static format string: hyphens
  - the maximal static substituted ring type string: "TxRx"
  - the maximum length of an integer formatted as a string, even
    though reasonable ring numbers would never be as long as this.
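
The sizing argument above can be checked with a short sketch. It is written in Python rather than C and is illustrative only: the "<ifname>-<type>-<index>" layout is inferred from the example name, the 18-byte "old" buffer is an assumption chosen because it reproduces the quoted truncation, and the 10-digit bound assumes an unsigned 32-bit ring index. The helper names are hypothetical.

```python
IFNAMSIZ = 16  # Linux interface-name buffer size, including the trailing NUL


def snprintf_sim(text, size):
    """Mimic C snprintf truncation: at most size - 1 characters are stored."""
    return text[: max(size - 1, 0)]


def irq_name_size(ring_type="TxRx", max_int_digits=10):
    """Bytes needed for "<ifname>-<ring_type>-<index>" plus the trailing NUL."""
    hyphens = 2  # characters contributed by the static format string
    return (IFNAMSIZ - 1) + hyphens + len(ring_type) + max_int_digits + 1


name = "enp4s0f0np0-TxRx-0"
truncated = snprintf_sim(name, 18)            # assumed old field size
intact = snprintf_sim(name, irq_name_size())  # field sized for the worst case
```

With these assumptions the worst-case field is 32 bytes, and the example name from the commit message survives unmodified.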

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909202737.93852-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:42:45 -07:00
Michael Chan
2d51eb0bd8 bnxt_en: Add MSIX check in bnxt_check_rings()
bnxt_check_rings() is called to ensure that we have the hardware ring
resources before committing to reinitialize with the new number of
rings.  MSIX vectors are never checked at this point, because up
until recently we had to disable MSIX first before we could allocate the
new set of MSIX vectors.

Now that we support dynamic MSIX allocation, check to make sure we
can dynamically allocate the new MSIX vectors as the last step in
bnxt_check_rings() if dynamic MSIX is supported.

For example, the IOMMU group may limit the number of MSIX vectors
for the device.  With this patch, the ring change will fail more
gracefully when there are not enough MSIX vectors.

It is also better to move bnxt_check_rings() to be called as the last
step when changing ethtool rings.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909202737.93852-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:42:45 -07:00
Michael Chan
f775cb1bbf bnxt_en: Increase the number of MSIX vectors for RoCE device
If RoCE is supported on the device, set the number of RoCE MSIX vectors
to the number of online CPUs + 1 and capped at these maximums:

VF: 2
NPAR: 5
PF: 64

For the PF, the maximum is now increased from the previous value
of 9 to get better performance for kernel applications.
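
The rule above (online CPUs + 1, capped per function type) amounts to a one-line computation. The sketch below is a Python illustration rather than the driver's C code; the function and table names are hypothetical, while the caps are the ones listed in this message.

```python
# Caps from the commit message; the names are illustrative, not driver symbols.
ROCE_MSIX_CAPS = {"VF": 2, "NPAR": 5, "PF": 64}


def roce_msix_vectors(online_cpus, function_type):
    """Number of RoCE MSIX vectors: online CPUs + 1, capped by function type."""
    return min(online_cpus + 1, ROCE_MSIX_CAPS[function_type])
```

For example, a PF on an 8-CPU system would get 9 vectors under this rule, while a PF on a 128-CPU system would be capped at 64.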

Remove the unnecessary check for BNXT_FLAG_ROCE_CAP.
bnxt_set_dflt_ulp_msix() will only be called if the flag is set.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909202737.93852-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:42:45 -07:00
Rob Herring (Arm)
955f5b1508 net: amlogic,meson-dwmac: Fix "amlogic,tx-delay-ns" schema
The "amlogic,tx-delay-ns" property schema has an unnecessary type reference,
as "-ns" is a standard unit suffix, and the constraints are in freeform
text rather than schema.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://patch.msgid.link/20240909172342.487675-2-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:35:51 -07:00
Jakub Kicinski
3a1f6f4551 Merge branch 'net-xilinx-axienet-partial-checksum-offload-improvements'
Sean Anderson says:

====================
net: xilinx: axienet: Partial checksum offload improvements

Partial checksum offload is not always used when it could be.
Enable it in more cases.
====================

Link: https://patch.msgid.link/20240909161016.1149119-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:34:55 -07:00
Sean Anderson
736f0c7a8e net: xilinx: axienet: Relax partial rx checksum checks
The partial rx checksum feature computes a checksum over the entire
packet, regardless of the L3 protocol. Remove the check for IPv4.
Additionally, testing with csum.py (from kselftests) shows no anomalies
with 64-byte packets, so we can remove that check as well.
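
To illustrate why a whole-packet checksum does not depend on the L3 protocol, here is a minimal sketch of a 16-bit ones'-complement sum over raw bytes, in the style of RFC 1071. It assumes the hardware's partial checksum is this kind of sum; the function name is hypothetical and the code is not taken from the axienet driver.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit ones'-complement sum of `data`, padding odd lengths with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        # Combine each byte pair into a big-endian 16-bit word.
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold any carries back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return total
```

The function never inspects headers, so the same sum is produced whether the bytes carry IPv4, IPv6, or anything else; interpreting the result is left to the stack.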

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909161016.1149119-5-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:34:51 -07:00
Sean Anderson
06c069ff2f net: xilinx: axienet: Set RXCSUM in features
When it is supported by hardware, we enable receive checksum offload
unconditionally. Update features to reflect this.

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Link: https://patch.msgid.link/20240909161016.1149119-4-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:34:51 -07:00
Sean Anderson
dd28f4c0e8 net: xilinx: axienet: Enable NETIF_F_HW_CSUM for partial tx checksumming
Partial tx checksumming is completely generic and does not depend on the
L3/L4 protocol. Signal this to the net subsystem by enabling the
more-generic offload feature (instead of restricting ourselves to
TCP/UDP over IPv4 checksumming only, as is necessary with full
checksumming).
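
To illustrate why partial tx checksumming is protocol-agnostic, here is a minimal Python sketch of the generic "sum from csum_start, store at csum_start + csum_offset" contract as it is commonly described for partial checksum offload. It is not axienet code. The function names, the addresses, the ports and the 20-byte placeholder IP header are all assumptions made for the example.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """Folded 16-bit ones'-complement sum of data (not inverted)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def finish_partial_checksum(packet: bytearray, csum_start: int, csum_offset: int) -> None:
    """Device side: sum from csum_start to the end, store the inverted result.

    Nothing here looks at the L3/L4 protocol; only the two offsets matter.
    """
    total = ones_complement_sum(bytes(packet[csum_start:]))
    struct.pack_into("!H", packet, csum_start + csum_offset, ~total & 0xFFFF)


# Stack side (hypothetical UDP/IPv4 example): seed the checksum field with the
# pseudo-header sum, then hand csum_start/csum_offset to the device.
src, dst = bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2])
payload = b"hello"
udp_len = 8 + len(payload)
pseudo = src + dst + struct.pack("!BBH", 0, 17, udp_len)
seed = ones_complement_sum(pseudo)
packet = bytearray(20)  # placeholder IPv4 header bytes, not inspected here
packet += struct.pack("!HHHH", 33000, 34000, udp_len, seed) + payload

finish_partial_checksum(packet, csum_start=20, csum_offset=6)

# A receiver summing pseudo-header plus segment sees all ones.
assert ones_complement_sum(pseudo + bytes(packet[20:])) == 0xFFFF
```

The same finish_partial_checksum() call would work for any protocol that uses this checksum, as long as the stack supplies the right offsets and seed.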

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909161016.1149119-3-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:34:51 -07:00
Sean Anderson
b1e455cd86 net: xilinx: axienet: Remove unused checksum variables
These variables are set but never used. Remove them.

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Link: https://patch.msgid.link/20240909161016.1149119-2-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:34:50 -07:00
Colin Ian King
d59239f8a4 rtase: Fix spelling mistake: "tx_underun" -> "tx_underrun"
There is a spelling mistake in the struct field tx_underun, rename
it to tx_underrun.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909134612.63912-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:33:32 -07:00
Colin Ian King
8df9439389 r8169: Fix spelling mistake: "tx_underun" -> "tx_underrun"
There is a spelling mistake in the struct field tx_underun, rename
it to tx_underrun.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/20240909140021.64884-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:32:54 -07:00
Dave Taht
c48994baef sch_cake: constify inverse square root cache
sch_cake uses a cache of the first 16 values of the inverse square root
calculation for the Cobalt AQM to save some cycles on the fast path.
This cache is populated when the qdisc is first loaded, but there's
really no reason why it can't just be pre-populated. So change it to be
pre-populated with constants, which also makes it possible to constify
it.
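
As a rough illustration of what such a lookup table holds, the Python sketch below builds 16 inverse-square-root values in unsigned Q0.32 fixed point and uses them on a fast path. The names, the saturation at index 0 and the exact rounding are assumptions of this sketch. The kernel's own constants may differ in the low-order bits depending on how they were generated, and in C the values would be written out as literals in a const array.

```python
from math import isqrt

CACHE_SIZE = 16          # "first 16 values", per the commit message
U32_MAX = 0xFFFFFFFF


def inv_sqrt_q32(count: int) -> int:
    """Approximate 1/sqrt(count) as an unsigned Q0.32 fixed-point value."""
    if count == 0:
        return U32_MAX   # no finite value; saturate (an assumption of this sketch)
    # floor(2**32 / sqrt(count)) == isqrt(2**64 // count), clamped to 32 bits
    return min(isqrt((1 << 64) // count), U32_MAX)


# Built once and never mutated, mimicking a const table.
INV_SQRT_CACHE = tuple(inv_sqrt_q32(n) for n in range(CACHE_SIZE))


def inv_sqrt(count: int) -> int:
    """Fast path: table lookup for small counts, computation otherwise."""
    if count < CACHE_SIZE:
        return INV_SQRT_CACHE[count]
    return inv_sqrt_q32(count)


for n in (1, 2, 4, 16):
    print(n, inv_sqrt(n))
```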

This gives a modest space saving for the module (not counting debug data):
.text:  -224 bytes
.rodata: +80 bytes
.bss:    -64 bytes
Total:  -192 bytes

Signed-off-by: Dave Taht <dave.taht@gmail.com>
[ fixed up comment, rewrote commit message ]
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20240909091630.22177-1-toke@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:31:52 -07:00
Kory Maincent
330dadacc5 MAINTAINERS: Add ethtool pse-pd to PSE NETWORK DRIVER
Add net/ethtool/pse-pd.c to PSE NETWORK DRIVER to receive emails concerning
modifications to the ethtool part.

Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909114336.362174-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 18:31:23 -07:00
Pieter Van Trappen
3f464b193d net: dsa: microchip: update tag_ksz masks for KSZ9477 family
Remove magic number 7 by introducing a GENMASK macro instead.
Remove magic number 0x80 by using the BIT macro instead.
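
For readers unfamiliar with the two kernel helpers, this small Python sketch mirrors what GENMASK(h, l) and BIT(n) compute. The functions genmask() and bit() are stand-ins written for this illustration, not kernel code, and the bit positions are simply the ones that reproduce the two magic numbers named above.

```python
def bit(n: int) -> int:
    """Return an integer with only bit n set, like the kernel's BIT(n)."""
    return 1 << n


def genmask(high: int, low: int) -> int:
    """Return a mask with bits low..high (inclusive) set, like GENMASK(high, low)."""
    if high < low:
        raise ValueError("high must be >= low")
    return ((1 << (high - low + 1)) - 1) << low


print(hex(genmask(2, 0)))  # 0x7
print(hex(bit(7)))         # 0x80
```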

Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20240909134301.75448-1-vtpieter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 17:27:56 -07:00
Wei Fang
2f9caba9b2 dt-bindings: net: tja11xx: fix the broken binding
As Rob pointed out in another mail thread [1], the binding of the tja11xx
PHY is completely broken: the schema cannot catch errors in the DTS. A
compatible string is needed if we want to add a custom property.
So extract the known PHY IDs from the tja11xx PHY drivers and convert them
into a list of supported compatible strings to fix the broken binding.

Fixes: 52b2fe4535 ("dt-bindings: net: tja11xx: add nxp,refclk_in property")
Link: https://lore.kernel.org/31058f49-bac5-49a9-a422-c43b121bf049@kernel.org  # [1]
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20240909012152.431647-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 17:12:58 -07:00
Jakub Kicinski
97b1ebb1e2 Merge branch 'net-timestamp-introduce-a-flag-to-filter-out-rx-software-and-hardware-report'
Jason Xing says:

====================
net-timestamp: introduce a flag to filter out rx software and hardware report

When one socket sets SOF_TIMESTAMPING_RX_SOFTWARE, the system-wide
netstamp_needed_key is turned on. Other sockets that only set
SOF_TIMESTAMPING_SOFTWARE are then affected and report rx timestamp
information even though they never set the
SOF_TIMESTAMPING_RX_SOFTWARE generation flag.

How to solve it without breaking users?
We introduce a new flag named SOF_TIMESTAMPING_OPT_RX_FILTER. Using
it together with SOF_TIMESTAMPING_SOFTWARE can stop reporting the
rx software timestamp.

Similarly, we also filter out the hardware case, where one process
enables the rx hardware generation flag and another process that only
passes SOF_TIMESTAMPING_RAW_HARDWARE then gets the timestamp. With this
series applied, setting both SOF_TIMESTAMPING_RAW_HARDWARE and
SOF_TIMESTAMPING_OPT_RX_FILTER stops reporting the rx hardware timestamp.

v6: https://lore.kernel.org/20240906095640.77533-1-kerneljasonxing@gmail.com
v5: https://lore.kernel.org/20240905071738.3725-1-kerneljasonxing@gmail.com
v4: https://lore.kernel.org/20240830153751.86895-1-kerneljasonxing@gmail.com
v3: https://lore.kernel.org/20240828160145.68805-1-kerneljasonxing@gmail.com
v2: https://lore.kernel.org/20240825152440.93054-1-kerneljasonxing@gmail.com
====================

Link: https://patch.msgid.link/20240909015612.3856-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:57:12 -07:00
Jason Xing
fffe8efd68 net-timestamp: add selftests for SOF_TIMESTAMPING_OPT_RX_FILTER
Test a few possible cases where we use SOF_TIMESTAMPING_OPT_RX_FILTER
with software or hardware report/generation flag.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240909015612.3856-3-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:55:23 -07:00
Jason Xing
be8e9eb375 net-timestamp: introduce SOF_TIMESTAMPING_OPT_RX_FILTER flag
Introduce a new flag, SOF_TIMESTAMPING_OPT_RX_FILTER, in the receive
path. Users can set it together with SOF_TIMESTAMPING_SOFTWARE to filter
out rx software timestamp reports, especially after a process turns on
netstamp_needed_key, which can timestamp every incoming skb.

Previously, if an application that turns on netstamp_needed_key started
first, another application passing only SOF_TIMESTAMPING_SOFTWARE
could also get rx timestamps. The new flag handles this case without
breaking existing users.

Quoting Willem to explain why we need the flag:
"why a process would want to request software timestamp reporting, but
not receive software timestamp generation. The only use I see is when
the application does request
SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_TX_SOFTWARE."

Similarly, the new flag can be used for the hardware case: set it
together with SOF_TIMESTAMPING_RAW_HARDWARE and hardware receive
timestamps are no longer reported.

A note on the errqueue handling in this patch: the egress path must be
handled carefully, or else reporting the tx timestamp will fail. Both the
egress and ingress paths eventually call sock_recv_timestamp(), so we have
to distinguish them. Whether the skb comes from the errqueue is a good
indicator of the flow direction.
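
The software-timestamp reporting decision described above can be modelled roughly as follows. This is a simplified Python sketch, not the kernel's code. The bit values are what I understand include/uapi/linux/net_tstamp.h to define and should be checked against your own headers. The function and parameter names are invented for the example.

```python
# Assumed bit values (verify against include/uapi/linux/net_tstamp.h).
SOF_TIMESTAMPING_RX_SOFTWARE = 1 << 3
SOF_TIMESTAMPING_SOFTWARE = 1 << 4
SOF_TIMESTAMPING_OPT_RX_FILTER = 1 << 17


def reports_sw_timestamp(tsflags: int, skb_has_tstamp: bool, from_errqueue: bool) -> bool:
    """Model: does this socket get a software timestamp reported for this skb?"""
    if not skb_has_tstamp:
        return False
    if not (tsflags & SOF_TIMESTAMPING_SOFTWARE):
        return False  # reporting flag not set at all
    if from_errqueue:
        return True   # tx timestamps arrive via the error queue; never filtered
    if tsflags & SOF_TIMESTAMPING_RX_SOFTWARE:
        return True   # this socket asked for rx generation itself
    # Only another socket enabled generation: report unless the filter is set.
    return not (tsflags & SOF_TIMESTAMPING_OPT_RX_FILTER)


legacy = SOF_TIMESTAMPING_SOFTWARE
filtered = SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_OPT_RX_FILTER
print(reports_sw_timestamp(legacy, True, False))    # True: old behaviour is kept
print(reports_sw_timestamp(filtered, True, False))  # False: rx report suppressed
print(reports_sw_timestamp(filtered, True, True))   # True: tx report still delivered
```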

Suggested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240909015612.3856-2-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:55:23 -07:00
Jason Xing
e503f82e30 net-timestamp: correct the use of SOF_TIMESTAMPING_RAW_HARDWARE
SOF_TIMESTAMPING_RAW_HARDWARE is a report flag which passes the
timestamps generated by either SOF_TIMESTAMPING_TX_HARDWARE or
SOF_TIMESTAMPING_RX_HARDWARE to the userspace all the time.

So let us revise the doc here.

Link: https://lore.kernel.org/all/66d8c21d3042a_163d93294cb@willemb.c.googlers.com.notmuch/
Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Link: https://patch.msgid.link/20240908124141.39628-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:55:08 -07:00
Jakub Kicinski
cce2991e7e Merge branch 'net-stmmac-fpe-via-ethtool-tc'
Furong Xu says:

====================
net: stmmac: FPE via ethtool + tc

Move the Frame Preemption (FPE) support over to the new standard API,
which uses ethtool-mm/tc-mqprio/tc-taprio.
====================

Link: https://patch.msgid.link/cover.1725631883.git.0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:42:15 -07:00
Furong Xu
22a805d880 net: stmmac: silence FPE kernel logs
ethtool --show-mm can report the real-time state of FPE, so the
fpe_irq_status logs should be kept quiet.

tc-taprio can always query the driver state, so delete the unbalanced logs.

Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/39943d7967f291674a97ef0572878aca273087e9.1725631883.git.0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:42:12 -07:00
Furong Xu
15d8a407a5 net: stmmac: support fp parameter of tc-taprio
tc-taprio can select whether traffic classes are express or preemptible.

0) tc qdisc add dev eth1 parent root handle 100 taprio \
        num_tc 4 \
        map 0 1 2 3 2 2 2 2 2 2 2 2 2 2 2 3 \
        queues 1@0 1@1 1@2 1@3 \
        base-time 1000000000 \
        sched-entry S 03 10000000 \
        sched-entry S 0e 10000000 \
        flags 0x2 fp P E E E

1) After some traffic tests, MAC merge layer statistics are all good.

Local device:
[ {
        "ifname": "eth1",
        "pmac-enabled": true,
        "tx-enabled": true,
        "tx-active": true,
        "tx-min-frag-size": 60,
        "rx-min-frag-size": 60,
        "verify-enabled": true,
        "verify-time": 100,
        "max-verify-time": 128,
        "verify-status": "SUCCEEDED",
        "statistics": {
            "MACMergeFrameAssErrorCount": 0,
            "MACMergeFrameSmdErrorCount": 0,
            "MACMergeFrameAssOkCount": 0,
            "MACMergeFragCountRx": 0,
            "MACMergeFragCountTx": 17837,
            "MACMergeHoldCount": 18639
        }
    } ]

Remote device:
[ {
        "ifname": "end1",
        "pmac-enabled": true,
        "tx-enabled": true,
        "tx-active": true,
        "tx-min-frag-size": 60,
        "rx-min-frag-size": 60,
        "verify-enabled": true,
        "verify-time": 100,
        "max-verify-time": 128,
        "verify-status": "SUCCEEDED",
        "statistics": {
            "MACMergeFrameAssErrorCount": 0,
            "MACMergeFrameSmdErrorCount": 0,
            "MACMergeFrameAssOkCount": 17189,
            "MACMergeFragCountRx": 17837,
            "MACMergeFragCountTx": 0,
            "MACMergeHoldCount": 0
        }
    } ]

Tested on DWMAC CORE 5.10a

Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/0d21ae356fb3cab77337527e87d46748a4852055.1725631883.git.0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:42:12 -07:00
Furong Xu
195e4f409a net: stmmac: support fp parameter of tc-mqprio
tc-mqprio can select whether traffic classes are express or preemptible.

After some traffic tests, MAC merge layer statistics are all good.

Local device:
ethtool --include-statistics --json --show-mm eth1
[ {
        "ifname": "eth1",
        "pmac-enabled": true,
        "tx-enabled": true,
        "tx-active": true,
        "tx-min-frag-size": 60,
        "rx-min-frag-size": 60,
        "verify-enabled": true,
        "verify-time": 100,
        "max-verify-time": 128,
        "verify-status": "SUCCEEDED",
        "statistics": {
            "MACMergeFrameAssErrorCount": 0,
            "MACMergeFrameSmdErrorCount": 0,
            "MACMergeFrameAssOkCount": 0,
            "MACMergeFragCountRx": 0,
            "MACMergeFragCountTx": 35105,
            "MACMergeHoldCount": 0
        }
    } ]

Remote device:
ethtool --include-statistics --json --show-mm end1
[ {
        "ifname": "end1",
        "pmac-enabled": true,
        "tx-enabled": true,
        "tx-active": true,
        "tx-min-frag-size": 60,
        "rx-min-frag-size": 60,
        "verify-enabled": true,
        "verify-time": 100,
        "max-verify-time": 128,
        "verify-status": "SUCCEEDED",
        "statistics": {
            "MACMergeFrameAssErrorCount": 0,
            "MACMergeFrameSmdErrorCount": 0,
            "MACMergeFrameAssOkCount": 35105,
            "MACMergeFragCountRx": 35105,
            "MACMergeFragCountTx": 0,
            "MACMergeHoldCount": 0
        }
    } ]

Tested on DWMAC CORE 5.10a

Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/592965ea93ed8240f0a1b8f6f8ebb8914f69419b.1725631883.git.0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:42:12 -07:00
Furong Xu
0f156aceee net: stmmac: configure FPE via ethtool-mm
Implement ethtool --show-mm and --set-mm callbacks.

NIC up/down, link up/down, suspend/resume, kselftest-ethtool_mm,
all tested okay.

Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/06ed409314fe0ee37b78b800922f2c0cce762532.1725631883.git.0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:42:12 -07:00
Furong Xu
8d43e99a5a net: stmmac: refactor FPE verification process
Drop driver defined stmmac_fpe_state, and switch to common
ethtool_mm_verify_status for local TX verification status.

Local side and remote side verification processes are completely
independent. There is no reason at all to keep a local state and
a remote state.

Add a spinlock to avoid races among ISR, timer, link update
and register configuration.

This patch is based on Vladimir Oltean's proposal.

Vladimir Oltean says:

  ====================
  In the INITIAL state, the timer sends MPACKET_VERIFY. Eventually the
  stmmac_fpe_event_status() IRQ fires and advances the state to VERIFYING,
  then rearms the timer after verify_time ms. If a subsequent IRQ comes in
  and modifies the state to SUCCEEDED after getting MPACKET_RESPONSE, the
  timer sees this. It must enable the EFPE bit now. Otherwise, it
  decrements the verify_limit counter and tries again. Eventually it
  moves the status to FAILED, from which the IRQ cannot move it anywhere
  else, except for another stmmac_fpe_apply() call.
  ====================
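
The interplay between timer and IRQ quoted above can be sketched as a small state machine. This Python model is only an illustration under stated assumptions: the class, the method names, the event strings and the default retry budget of 3 are invented here, and the real driver also handles locking and register access.

```python
from enum import Enum


class Status(Enum):
    INITIAL = "INITIAL"
    VERIFYING = "VERIFYING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


class FpeVerify:
    """Toy model of the timer/IRQ interplay described above (not driver code)."""

    def __init__(self, verify_limit: int = 3):  # default retry budget is an assumption
        self.apply(verify_limit)

    def apply(self, verify_limit: int = 3) -> None:
        """Stand-in for stmmac_fpe_apply(): restart verification from scratch."""
        self.status = Status.INITIAL
        self.verify_limit = verify_limit
        self.efpe_enabled = False
        self.verify_packets_sent = 0

    def irq(self, event: str) -> None:
        """IRQ path; event is 'verify_sent' or 'response_received' (names invented here)."""
        if self.status is Status.FAILED:
            return  # the IRQ cannot move the state out of FAILED
        if event == "verify_sent" and self.status is Status.INITIAL:
            self.status = Status.VERIFYING
        elif event == "response_received":
            self.status = Status.SUCCEEDED

    def timer(self) -> bool:
        """Timer path; returns True when the timer should be rearmed."""
        if self.status is Status.SUCCEEDED:
            self.efpe_enabled = True  # models setting the EFPE bit
            return False
        if self.status is Status.FAILED:
            return False
        if self.verify_limit == 0:
            self.status = Status.FAILED
            return False
        self.verify_limit -= 1
        self.verify_packets_sent += 1  # models sending MPACKET_VERIFY
        return True
```

In this model a run of timer(), irq("verify_sent"), irq("response_received"), timer() ends with efpe_enabled set, while repeated timer() calls without a response exhaust verify_limit and end in FAILED until apply() is called again.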

Co-developed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/151f86c8428eba967039718c6bf90a7d841e703b.1725631883.git.0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:42:11 -07:00
Furong Xu
59dd7fc932 net: stmmac: drop stmmac_fpe_handshake
ethtool --set-mm can trigger the FPE verification process by calling
stmmac_fpe_send_mpacket, so stmmac_fpe_handshake is no longer needed.

Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/42018b1a15eb3ced567fd6a73798c7cd4e08799a.1725631883.git.0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:42:11 -07:00
Furong Xu
070a5e6295 net: stmmac: move stmmac_fpe_cfg to stmmac_priv data
By moving the fpe_cfg field to the stmmac_priv data, stmmac_fpe_cfg
becomes platform-data eventually, instead of a run-time config.

Suggested-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://patch.msgid.link/d9b3d7ecb308c5e39778a4c8ae9df288a2754379.1725631883.git.0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:42:11 -07:00
Sean Anderson
e8a63d473b selftests: net: csum: Fix checksums for packets with non-zero padding
Padding is not included in UDP and TCP checksums. Therefore, reduce the
length of the checksummed data to include only the data in the IP
payload. This fixes spurious reported checksum failures like

rx: pkt: sport=33000 len=26 csum=0xc850 verify=0xf9fe
pkt: bad csum

Technically it is possible for there to be trailing bytes after the UDP
data but before the Ethernet padding (e.g. if sizeof(ip) + sizeof(udp) +
udp.len < ip.len). However, we don't generate such packets.
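
The effect of non-zero padding can be reproduced with a short Python sketch. It is not the selftest's code: the helper names, addresses, ports and the 16 bytes of 0xaa padding are assumptions chosen for the example. It checks a UDP checksum once over the IP payload only and once with the padding wrongly included.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Inverted 16-bit ones'-complement sum (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src: bytes, dst: bytes, l4_len: int) -> bytes:
    """IPv4 pseudo-header for UDP (protocol 17)."""
    return src + dst + struct.pack("!BBH", 0, 17, l4_len)


def build_udp(src: bytes, dst: bytes, sport: int, dport: int, payload: bytes) -> bytes:
    """Build a UDP datagram with a valid checksum."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    csum = internet_checksum(pseudo_header(src, dst, length) + header + payload)
    csum = csum or 0xFFFF  # UDP sends an all-zero result as 0xFFFF
    return struct.pack("!HHHH", sport, dport, length, csum) + payload


def udp_csum_ok(src: bytes, dst: bytes, l4_bytes: bytes, l4_len: int) -> bool:
    """Verify over exactly l4_len bytes; anything after that is ignored."""
    return internet_checksum(pseudo_header(src, dst, l4_len) + l4_bytes[:l4_len]) == 0


SRC, DST = bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2])
datagram = build_udp(SRC, DST, 33000, 34000, b"hi")
on_wire = datagram + b"\xaa" * 16  # non-zero Ethernet padding after the IP payload

print(udp_csum_ok(SRC, DST, on_wire, len(datagram)))  # True: length from the IP payload
print(udp_csum_ok(SRC, DST, on_wire, len(on_wire)))   # False: padding wrongly included
```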

Fixes: 91a7de8560 ("selftests/net: add csum offload test")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240906210743.627413-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:33:31 -07:00
Tomas Paukrt
3f62ea572b net: phy: dp83822: Fix NULL pointer dereference on DP83825 devices
The probe() function is only used for DP83822 and DP83826 PHY,
leaving the private data pointer uninitialized for the DP83825 models
which causes a NULL pointer dereference in the recently introduced/changed
functions dp8382x_config_init() and dp83822_set_wol().

Add the dp8382x_probe() function, so all PHY models will have a valid
private data pointer to fix this issue and also prevent similar issues
in the future.

Fixes: 9ef9ecfa9e ("net: phy: dp8382x: keep WOL settings across suspends")
Signed-off-by: Tomas Paukrt <tomaspaukrt@email.cz>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/66w.ZbGt.65Ljx42yHo5.1csjxu@seznam.cz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:20:15 -07:00
Linus Torvalds
8d8d276ba2 More tracing fixes for 6.11:
- Move declaration of interface_lock outside of CONFIG_TIMERLAT_TRACER
 
   The fix to some locking races moved the declaration of the
   interface_lock up in the file, but also moved it into the
   CONFIG_TIMERLAT_TRACER #ifdef block, breaking the build when
   that wasn't set. Move it further up and out of that #ifdef block.
 
 - Remove unused function run_tracer_selftest() stub
 
   When CONFIG_FTRACE_STARTUP_TEST is not set the stub function
   run_tracer_selftest() is not used and clang is warning about it.
   Remove the function stub as it is not needed.

Merge tag 'trace-v6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Move declaration of interface_lock outside of CONFIG_TIMERLAT_TRACER

   The fix to some locking races moved the declaration of the
   interface_lock up in the file, but also moved it into the
   CONFIG_TIMERLAT_TRACER #ifdef block, breaking the build when that
   wasn't set. Move it further up and out of that #ifdef block.

 - Remove unused function run_tracer_selftest() stub

   When CONFIG_FTRACE_STARTUP_TEST is not set the stub function
   run_tracer_selftest() is not used and clang is warning about it.
   Remove the function stub as it is not needed.

* tag 'trace-v6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Drop unused helper function to fix the build
  tracing/osnoise: Fix build when timerlat is not enabled
2024-09-10 09:05:20 -07:00
Jakub Kicinski
48aa361c5d Merge branch 'revert-virtio_net-rx-enable-premapped-mode-by-default'
Xuan Zhuo says:

====================
Revert "virtio_net: rx enable premapped mode by default"

Regression: http://lore.kernel.org/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com
====================

Link: https://patch.msgid.link/20240906123137.108741-1-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 09:01:08 -07:00
Xuan Zhuo
111fc9f517 virtio_net: disable premapped mode by default
The premapped mode currently runs into a problem:

    http://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com

So we disable the premapped mode by default.
We can re-enable it in the future.

Fixes: f9dac92ba9 ("virtio_ring: enable premapped mode whatever use_dma_api")
Reported-by: "Si-Wei Liu" <si-wei.liu@oracle.com>
Closes: http://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Takero Funaki <flintglass@gmail.com>
Link: https://patch.msgid.link/20240906123137.108741-4-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 09:01:06 -07:00
Xuan Zhuo
38eef112a8 Revert "virtio_net: big mode skip the unmap check"
This reverts commit a377ae542d.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Takero Funaki <flintglass@gmail.com>
Link: https://patch.msgid.link/20240906123137.108741-3-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 09:01:06 -07:00