cpumap takes RX processing out of softirq and onto a separate kthread.
Since the kthread needs to be scheduled in order to run (versus softirq
which does not), we can theoretically experience extra latency if the
system is under load and the scheduler is being unfair to us.
Moving the tracepoint to before passing the skb list up the stack allows
users to more accurately measure enqueue/dequeue latency introduced by
cpumap via xdp:xdp_cpumap_enqueue and xdp:xdp_cpumap_kthread tracepoints.
Commit f9419f7bd7 ("bpf: cpumap add tracepoints"), which added the tracepoints,
states that the intent behind them was for general observability and for
a feedback loop to see if the queues are being overwhelmed. This change
does not mess with either of those use cases but rather adds a third
one.
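One possible way to turn the two tracepoints into a latency figure is to pair each kthread event with the oldest unmatched enqueue event for the same CPU. The sketch below works on already-captured events; the (timestamp_ns, cpu) tuple layout and the function name are assumptions made for illustration and are not the tracepoints' real field layout. It also assumes that one kthread run drains everything queued before it.

```python
from collections import defaultdict, deque


def pair_latencies(enqueue_events, kthread_events):
    """Estimate per-CPU enqueue -> kthread latency in nanoseconds.

    Both arguments are iterables of (timestamp_ns, cpu) tuples, a
    hypothetical pre-parsed form of xdp:xdp_cpumap_enqueue and
    xdp:xdp_cpumap_kthread events.
    """
    pending = defaultdict(deque)  # cpu -> timestamps of unmatched enqueues
    events = [(ts, 0, cpu) for ts, cpu in enqueue_events]
    events += [(ts, 1, cpu) for ts, cpu in kthread_events]
    latencies = defaultdict(list)
    # Sort by time; on equal timestamps the enqueue (kind 0) comes first.
    for ts, kind, cpu in sorted(events):
        if kind == 0:
            pending[cpu].append(ts)
        elif pending[cpu]:
            # Assumption: this kthread run drains all pending enqueues, so
            # the oldest one gives the longest wait for this run.
            latencies[cpu].append(ts - pending[cpu][0])
            pending[cpu].clear()
    return dict(latencies)
```

Under load, a growing gap between the two timestamps would point at scheduling delay of the kthread rather than at the queue itself.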
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/bpf/47615d5b5e302e4bd30220473779e98b492d47cd.1725585718.git.dxu@dxuuu.xyz
Currently, xskxceiver assumes that the MAX_SKB_FRAGS value is always 17,
which is not true: since the introduction of BIG TCP it can take any
value from 17 to 45 via CONFIG_MAX_SKB_FRAGS.
Adjust the TOO_MANY_FRAGS test case to read the currently configured
MAX_SKB_FRAGS value by reading it from /proc/sys/net/core/max_skb_frags.
If the running system does not provide that sysctl file, then let us try
running the test with a default value.
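The read-with-fallback logic described above can be sketched as follows. This is a Python illustration of the idea, not the selftest's actual C code; the function name and the choice of 17 as the fallback (the value the test previously assumed) are assumptions.

```python
DEFAULT_MAX_SKB_FRAGS = 17  # assumed fallback: the previously hard-coded value


def read_max_skb_frags(path="/proc/sys/net/core/max_skb_frags"):
    """Return the configured MAX_SKB_FRAGS, or the default if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # Sysctl file missing, unreadable, or malformed: use the default.
        return DEFAULT_MAX_SKB_FRAGS
```

Taking the path as a parameter keeps the helper testable on systems that do not expose the sysctl.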
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20240910124129.289874-1-maciej.fijalkowski@intel.com
Raju Lakkaraju says:
====================
Add support to PHYLINK for LAN743x/PCI11x1x chips
This is the follow-up patch series of
https://lkml.iu.edu/hypermail/linux/kernel/2310.2/02078.html
Divide the PHYLINK adaptation and SFP modifications into two separate patch
series.
The current patch series focuses on transitioning the LAN743x driver's PHY
support from phylib to phylink.
Tested on PCI11010 Rev-1 Evaluation board
Change List:
============
V5 -> V6:
- Remove the lan743x_find_max_speed( ) function. Not required
- Add EEE enable check before calling lan743x_mac_eee_enable( ) function
V4 -> V5:
- Remove the fixed_phy_unregister( ) function. Not required
- Remove the "phydev->eee_enabled" check to update the MAC EEE
enable/disable
- Call lan743x_mac_eee_enable() with true after update tx_lpi_timer.
- Add phy_support_eee() to initialize the EEE flags
V3 -> V4:
- Add fixed-link patch along with this series.
Note: This code was developed by Mr. Russell King
Ref:
https://lore.kernel.org/netdev/LV8PR11MB8700C786F5F1C274C73036CC9F8E2@LV8PR11MB8700.namprd11.prod.outlook.com/T/#me943adf54f1ea082edf294aba448fa003a116815
- Change phylink fixed-link function header's string from "Returns" to
"Returns:"
- Remove the EEE private variable from the LAN743x adapter structure and fix the
EEE's set/get functions
- Replace setting the individual caps (i.e. _RGMII, _RGMII_ID, _RGMII_RXID and
_RGMII_TXID) with the phy_interface_set_rgmii( ) function
- Change lan743x_set_eee( ) to lan743x_mac_eee_enable( )
V2 -> V3:
- Remove the unwanted parens in each of these if() sub-blocks
- Replace "to_net_dev(config->dev)" with "netdev".
- Add GMII_ID/RGMII_TXID/RGMII_RXID in supported_interfaces
- Fix the lan743x_phy_handle_exists( ) return type
V1 -> V2:
- Address Russell King's comments, i.e. remove the speed and duplex update in
lan743x_phylink_mac_config( )
- pre-March 2020 legacy support has been removed
V0 -> V1:
- Integrate with Synopsys DesignWare XPCS drivers
- Based on external review comments,
- Changes made to SGMII interface support only 1G/100M/10M bps speed
- Changes made to 2500Base-X interface support only 2.5Gbps speed
- Add check for not is_sgmii_en with is_sfp_support_en support
- Change the "pci11x1x_strap_get_status" function return type from void to
int
- Add ethtool phylink wol, eee, pause get/set functions
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support to ethtool phylink functions:
- get/set settings like speed, duplex etc
- get/set the wake-on-lan (WOL)
- get/set the energy-efficient ethernet (EEE)
- get/set the pause
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Migrate phy support from phylib to phylink.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a separate Link Speed Duplex (LSD) update state function from
lan743x_sgmii_config () to use as a subroutine.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a separate PCS power reset function from lan743x_sgmii_config () to use
as a subroutine.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function allows for the configuration of a fixed link state for a given
phylink instance. This addition is particularly useful for network devices that
operate with a fixed link configuration, where the link parameters do not change
dynamically. By using `phylink_set_fixed_link()`, drivers can easily set up
the fixed link state during initialization or configuration changes.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
StarFive:
A fix to return one of the clocks on the JH7110 from 1 GHz to 1.5 GHz
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Merge tag 'riscv-soc-fixes-for-v6.11-final' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes
RISC-V soc fixes for v6.11-final
StarFive:
A fix to return one of the clocks on the JH7110 from 1 GHz to 1.5 GHz
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'riscv-soc-fixes-for-v6.11-final' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
riscv: dts: starfive: jh7110-common: Fix lower rate of CPUfreq by setting PLL0 rate to 1.5GHz
Link: https://lore.kernel.org/r/20240909-hybrid-groovy-601a33b5b309@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
There is still a build failure when the hwrng support is in a loadable
module but the mcu driver is built-in:
arm-linux-gnueabi-ld: drivers/platform/cznic/turris-omnia-mcu-trng.o: in function `omnia_mcu_register_trng':
turris-omnia-mcu-trng.c:(.text.omnia_mcu_register_trng+0x11c): undefined reference to `devm_hwrng_register'
Change the dependency to explicitly disallow the broken
configuration.
Fixes: 41bb142a40 ("platform: cznic: turris-omnia-mcu: Add support for MCU provided TRNG")
Reviewed-by: Marek Behún <kabel@kernel.org>
Link: https://lore.kernel.org/r/20240909110417.247453-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-09-09 (ice, igb)
This series contains updates to ice and igb drivers.
Martyna moves LLDP rule removal to the proper uninitialization function
for ice.
Jake corrects accounting logic for FWD_TO_VSI_LIST switch filters on
ice.
Przemek removes incorrect, explicit calls to pci_disable_device() for
ice.
Michal Schmidt stops incorrect use of VSI list for VLAN use on ice.
Sriram Yagnaraman adjusts igb_xdp_ring_update_tail() to be called under
Tx lock on igb.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
igb: Always call igb_xdp_ring_update_tail() under Tx lock
ice: fix VSI lists confusion when adding VLANs
ice: stop calling pci_disable_device() as we use pcim
ice: fix accounting for filters shared by multiple VSIs
ice: Fix lldp packets dropping after changing the number of channels
====================
Link: https://patch.msgid.link/20240909203842.3109822-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge tag 'mlx5-fixes-2024-09-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2024-09-09
This series provides bug fixes to mlx5 driver.
* tag 'mlx5-fixes-2024-09-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: Fix bridge mode operations when there are no VFs
net/mlx5: Verify support for scheduling element and TSAR type
net/mlx5: Add missing masks and QoS bit masks for scheduling elements
net/mlx5: Explicitly set scheduling element and TSAR type
net/mlx5e: Add missing link mode to ptys2ext_ethtool_map
net/mlx5e: Add missing link modes to ptys2ethtool_map
net/mlx5: Update the list of the PCI supported devices
====================
Link: https://patch.msgid.link/20240909194505.69715-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
ice: support devlink subfunction
Michal Swiatkowski says:
Currently the ice driver does not allow creating more than one networking
device per physical function. The only way to have more hardware-backed
netdevs is to use SR-IOV.
The following patchset adds support for the devlink port API. For each new
pcisf type port, the driver allocates a new VSI, configures all resources
needed (including dynamically allocated MSIX vectors), programs rules, and
registers a new netdev.
This series supports only one Tx/Rx queue pair per subfunction.
Example commands:
devlink port add pci/0000:31:00.1 flavour pcisf pfnum 1 sfnum 1000
devlink port function set pci/0000:31:00.1/1 hw_addr 00:00:00:00:03:14
devlink port function set pci/0000:31:00.1/1 state active
devlink port function del pci/0000:31:00.1/1
Make the port representor and eswitch code generic to support
subfunction representor type.
VSI configuration is slightly different between VF and SF. It needs to
be reflected in the code.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: subfunction activation and base devlink ops
ice: basic support for VLAN in subfunctions
ice: support subfunction devlink Tx topology
ice: implement netdevice ops for SF representor
ice: check if SF is ready in ethtool ops
ice: don't set target VSI for subfunction
ice: create port representor for SF
ice: make representor code generic
ice: implement netdev for subfunction
ice: base subfunction aux driver
ice: allocate devlink for subfunction
ice: treat subfunction VSI the same as PF VSI
ice: add basic devlink subfunctions support
ice: export ice ndo_ops functions
ice: add new VSI type for subfunctions
====================
Link: https://patch.msgid.link/20240906223010.2194591-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
HW-Managed Flow Steering in mlx5 driver
Yevgeny Kliteynik says:
=======================
1. Overview
-----------
ConnectX devices support packet matching, modification, and redirection.
This functionality is referred to as Flow Steering.
To configure a steering rule, the rule is written to the device-owned
memory. This memory is accessed and cached by the device when processing
a packet.
The first implementation of Flow Steering was done in FW, and it is
referred to in the mlx5 driver as Device-Managed Flow Steering (DMFS).
Later we introduced SW-managed Flow Steering (SWS or SMFS), where the
driver is writing directly to the device's configuration memory (ICM)
through RC QP using RDMA operations (RDMA-read and RDMA-write), thus
achieving higher rates of rule insertion/deletion.
Now we introduce a new flow steering implementation: HW-Managed Flow
Steering (HWS or HMFS).
In this new approach, the driver is configuring steering rules directly
to the HW using the WQs with a special new type of WQE. This way we can
reach higher rule insertion/deletion rate with much lower CPU utilization
compared to SWS.
The key benefits of HWS as opposed to SWS:
+ HW manages the steering decision tree
- HW calculates CRC for each entry
- HW handles tree hash collisions
- HW & FW manage objects refcount
+ HW keeps cache coherency:
- HW provides tree access locking and synchronization
- HW provides notification on completion
+ Insertion rate isn’t affected by background traffic
- Dedicated HW components that handle insertion
2. Performance
--------------
Measuring Connection Tracking with simple IPv4 flows w/o NAT, we
are able to get ~5 times more flows offloaded per second using HWS.
3. Configuration
----------------
The enablement of HWS mode in eswitch manager is done using the same
devlink param that is already used for switching between FW-managed
steering and SW-managed steering modes:
# devlink dev param set pci/<PCI_ID> name flow_steering_mode cmod runtime value hmfs
4. Upstream Submission
----------------------
HWS support consists of 3 main components:
+ Steering:
- The lower layer that exposes HWS API to upper layers and implements
all the management of flow steering building blocks
+ FS-Core
- Implementation of fs_hws layer to enable fs_core to use HWS instead
of FW or SW steering
- Create HW steering action pools to utilize the ability of HWS to
share steering actions among different rules
- Add support for configuring HWS mode through devlink command,
similar to configuring SWS mode
+ Connection Tracking
- Implementation of CT support for HW steering
- Hooks up the CT ops for the new steering mode and uses the HWS API
to implement connection tracking.
Because of the large number of patches, we need to perform the submission
in several separate patch series. This series is the first submission that
lays the groundwork for the next submissions, where an actual user of HWS
will be added.
5. Patches in this series
-------------------------
This patch series contains implementation of the first bullet from above.
The patches are:
[patch 01/15] net/mlx5: Added missing mlx5_ifc definition for HW Steering
[patch 02/15] net/mlx5: Added missing definitions in preparation for HW Steering
[patch 03/15] net/mlx5: HWS, added actions handling
[patch 04/15] net/mlx5: HWS, added tables handling
[patch 05/15] net/mlx5: HWS, added rules handling
[patch 06/15] net/mlx5: HWS, added definers handling
[patch 07/15] net/mlx5: HWS, added matchers functionality
[patch 08/15] net/mlx5: HWS, added FW commands handling
[patch 09/15] net/mlx5: HWS, added modify header pattern and args handling
[patch 10/15] net/mlx5: HWS, added vport handling
[patch 11/15] net/mlx5: HWS, added memory management handling
[patch 12/15] net/mlx5: HWS, added backward-compatible API handling
[patch 13/15] net/mlx5: HWS, added debug dump and internal headers
[patch 14/15] net/mlx5: HWS, added send engine and context handling
[patch 15/15] net/mlx5: HWS, added API and enabled HWS support
=======================
Merge tag 'mlx5-updates-2024-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2024-08-29
HW-Managed Flow Steering in mlx5 driver
Yevgeny Kliteynik says:
=======================
1. Overview
-----------
ConnectX devices support packet matching, modification, and redirection.
This functionality is referred to as Flow Steering.
To configure a steering rule, the rule is written to the device-owned
memory. This memory is accessed and cached by the device when processing
a packet.
The first implementation of Flow Steering was done in FW, and it is
referred to in the mlx5 driver as Device-Managed Flow Steering (DMFS).
Later we introduced SW-managed Flow Steering (SWS or SMFS), where the
driver is writing directly to the device's configuration memory (ICM)
through RC QP using RDMA operations (RDMA-read and RDMA-write), thus
achieving higher rates of rule insertion/deletion.
Now we introduce a new flow steering implementation: HW-Managed Flow
Steering (HWS or HMFS).
In this new approach, the driver is configuring steering rules directly
to the HW using the WQs with a special new type of WQE. This way we can
reach higher rule insertion/deletion rate with much lower CPU utilization
compared to SWS.
The key benefits of HWS as opposed to SWS:
+ HW manages the steering decision tree
- HW calculates CRC for each entry
- HW handles tree hash collisions
- HW & FW manage objects refcount
+ HW keeps cache coherency:
- HW provides tree access locking and synchronization
- HW provides notification on completion
+ Insertion rate isn’t affected by background traffic
- Dedicated HW components that handle insertion
2. Performance
--------------
Measuring Connection Tracking with simple IPv4 flows w/o NAT, we
are able to get ~5 times more flows offloaded per second using HWS.
3. Configuration
----------------
The enablement of HWS mode in eswitch manager is done using the same
devlink param that is already used for switching between FW-managed
steering and SW-managed steering modes:
# devlink dev param set pci/<PCI_ID> name flow_steering_mode cmod runtime value hmfs
4. Upstream Submission
----------------------
HWS support consists of 3 main components:
+ Steering:
- The lower layer that exposes HWS API to upper layers and implements
all the management of flow steering building blocks
+ FS-Core
- Implementation of fs_hws layer to enable fs_core to use HWS instead
of FW or SW steering
- Create HW steering action pools to utilize the ability of HWS to
share steering actions among different rules
- Add support for configuring HWS mode through devlink command,
similar to configuring SWS mode
+ Connection Tracking
- Implementation of CT support for HW steering
- Hooks up the CT ops for the new steering mode and uses the HWS API
to implement connection tracking.
Because of the large number of patches, we need to perform the submission
in several separate patch series. This series is the first submission that
lays the groundwork for the next submissions, where an actual user of HWS
will be added.
5. Patches in this series
-------------------------
This patch series contains implementation of the first bullet from above.
=======================
* tag 'mlx5-updates-2024-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: HWS, added API and enabled HWS support
net/mlx5: HWS, added send engine and context handling
net/mlx5: HWS, added debug dump and internal headers
net/mlx5: HWS, added backward-compatible API handling
net/mlx5: HWS, added memory management handling
net/mlx5: HWS, added vport handling
net/mlx5: HWS, added modify header pattern and args handling
net/mlx5: HWS, added FW commands handling
net/mlx5: HWS, added matchers functionality
net/mlx5: HWS, added definers handling
net/mlx5: HWS, added rules handling
net/mlx5: HWS, added tables handling
net/mlx5: HWS, added actions handling
net/mlx5: Added missing definitions in preparation for HW Steering
net/mlx5: Added missing mlx5_ifc definition for HW Steering
====================
Link: https://patch.msgid.link/20240909181250.41596-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge tag 'ipsec-next-2024-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2024-09-10
1) Remove an unneeded WARN_ON on packet offload.
From Patrisious Haddad.
2) Add a copy from skb_seq_state to buffer function.
This is needed for the upcoming IPTFS patchset.
From Christian Hopps.
3) Spelling fix in xfrm.h.
From Simon Horman.
4) Speed up xfrm policy insertions.
From Florian Westphal.
5) Add and revert a patch to support xfrm interfaces
for packet offload. This patch was just half cooked.
6) Extend usage of the new xfrm_policy_is_dead_or_sk helper.
From Florian Westphal.
7) Update comments on sdb and xfrm_policy.
From Florian Westphal.
8) Fix a null pointer dereference in the new policy insertion
code. From Florian Westphal.
9) Fix an uninitialized variable in the new policy insertion
code. From Nathan Chancellor.
* tag 'ipsec-next-2024-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next:
xfrm: policy: Restore dir assignments in xfrm_hash_rebuild()
xfrm: policy: fix null dereference
Revert "xfrm: add SA information to the offloaded packet"
xfrm: minor update to sdb and xfrm_policy comments
xfrm: policy: use recently added helper in more places
xfrm: add SA information to the offloaded packet
xfrm: policy: remove remaining use of inexact list
xfrm: switch migrate to xfrm_policy_lookup_bytype
xfrm: policy: don't iterate inexact policies twice at insert time
selftests: add xfrm policy insertion speed test script
xfrm: Correct spelling in xfrm.h
net: add copy from skb_seq_state to buffer function
xfrm: Remove documentation WARN_ON to limit return values for offloaded SA
====================
Link: https://patch.msgid.link/20240910065507.2436394-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michael Chan says:
====================
bnxt_en: MSIX improvements
This patchset makes some improvements related to MSIX. The first
patch adjusts the default MSIX vectors assigned for RoCE. On the
PF, the number of MSIX is increased to 64 from the current 9. The
second patch allocates additional MSIX vectors ahead of time when
changing ethtool channels if dynamic MSIX is supported. The 3rd
patch makes sure that the IRQ name is not truncated.
====================
Link: https://patch.msgid.link/20240909202737.93852-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The name field of struct bnxt_irq is written using snprintf in
bnxt_setup_msix(). Make the field large enough to fit the maximal
formatted string to prevent truncation. Truncated IRQ names are
less meaningful to the user. For example, "enp4s0f0np0-TxRx-0"
gets truncated to "enp4s0f0np0-TxRx-" with the existing code.
Make sure we have space for the extra characters added to the IRQ
names:
- the characters introduced by the static format string: hyphens
- the maximal static substituted ring type string: "TxRx"
- the maximum length of an integer formatted as a string, even
though reasonable ring numbers would never be as long as this.
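The sizing argument above can be checked with a little arithmetic. The sketch below mimics snprintf() truncation in Python; the 18-byte "old" buffer is an assumption chosen because it reproduces the truncation quoted above, and the 10-digit bound assumes an unsigned 32-bit ring number. The helper names are hypothetical.

```python
IFNAMSIZ = 16  # Linux interface name buffer size, including the NUL byte


def c_snprintf(size, text):
    """Mimic snprintf() truncation: at most size - 1 characters are kept."""
    return text[: max(size - 1, 0)]


def irq_name_size(ring_type_max="TxRx", int_digits_max=10):
    """Bytes needed for '<ifname>-<type>-<n>' plus the terminating NUL.

    IFNAMSIZ already accounts for the NUL, so only the extra characters
    are added: two hyphens, the longest ring type string, and the
    longest decimal integer (assumed to be 10 digits).
    """
    hyphens = 2
    return IFNAMSIZ + hyphens + len(ring_type_max) + int_digits_max


full_name = "enp4s0f0np0-TxRx-0"
truncated = c_snprintf(18, full_name)             # assumed old buffer size
untruncated = c_snprintf(irq_name_size(), full_name)
```

With these assumptions the enlarged buffer holds the whole name, while the 18-byte buffer drops the ring number.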
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909202737.93852-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
bnxt_check_rings() is called to ensure that we have the hardware ring
resources before committing to reinitialize with the new number of
rings. MSIX vectors are never checked at this point, because up
until recently we had to disable MSIX first before we could allocate the
new set of MSIX vectors.
Now that we support dynamic MSIX allocation, check to make sure we
can dynamically allocate the new MSIX vectors as the last step in
bnxt_check_rings() if dynamic MSIX is supported.
For example, the IOMMU group may limit the number of MSIX vectors
for the device. With this patch, the ring change will fail more
gracefully when there are not enough MSIX vectors.
It is also better to move bnxt_check_rings() to be called as the last
step when changing ethtool rings.
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909202737.93852-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If RoCE is supported on the device, set the number of RoCE MSIX vectors
to the number of online CPUs + 1 and capped at these maximums:
VF: 2
NPAR: 5
PF: 64
For the PF, the maximum is now increased from the previous value
of 9 to get better performance for kernel applications.
Remove the unnecessary check for BNXT_FLAG_ROCE_CAP.
bnxt_set_dflt_ulp_msix() will only be called if the flag is set.
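The default described above (online CPUs + 1, capped per function type) amounts to a single min(). A minimal sketch, with a hypothetical function name and the caps taken from the list above:

```python
ROCE_MSIX_CAPS = {"VF": 2, "NPAR": 5, "PF": 64}  # maximums listed above


def default_roce_msix(online_cpus, function_type):
    """Default RoCE MSIX vector count: online CPUs + 1, capped by type."""
    return min(online_cpus + 1, ROCE_MSIX_CAPS[function_type])
```

On an 8-CPU system this yields 9 vectors for a PF, which matches the old PF maximum; only larger systems benefit from the new cap of 64.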
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909202737.93852-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The "amlogic,tx-delay-ns" property schema has an unnecessary type reference
as it's a standard unit suffix, and the constraints are in freeform
text rather than schema.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://patch.msgid.link/20240909172342.487675-2-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sean Anderson says:
====================
net: xilinx: axienet: Partial checksum offload improvements
Partial checksum offload is not always used when it could be.
Enable it in more cases.
====================
Link: https://patch.msgid.link/20240909161016.1149119-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The partial rx checksum feature computes a checksum over the entire
packet, regardless of the L3 protocol. Remove the check for IPv4.
Additionally, testing with csum.py (from kselftests) shows no anomalies
with 64-byte packets, so we can remove that check as well.
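To illustrate why a whole-packet checksum does not care about the L3 protocol, here is a sketch of the 16-bit ones' complement sum (RFC 1071 style) over raw bytes. It is a software illustration only; where the hardware starts summing and how the driver hands the value to the stack are not shown here.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum over all bytes of a buffer."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total
```

The function never inspects headers, so the same sum is valid for IPv4, IPv6, or any other payload.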
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909161016.1149119-5-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When it is supported by hardware, we enable receive checksum offload
unconditionally. Update features to reflect this.
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Link: https://patch.msgid.link/20240909161016.1149119-4-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Partial tx checksumming is completely generic and does not depend on the
L3/L4 protocol. Signal this to the net subsystem by enabling the
more-generic offload feature (instead of restricting ourselves to
TCP/UDP over IPv4 checksumming only like is necessary with full
checksumming).
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909161016.1149119-3-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
These variables are set but never used. Remove them.
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Link: https://patch.msgid.link/20240909161016.1149119-2-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is a spelling mistake in the struct field tx_underun, rename
it to tx_underrun.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909134612.63912-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is a spelling mistake in the struct field tx_underun, rename
it to tx_underrun.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/20240909140021.64884-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
sch_cake uses a cache of the first 16 values of the inverse square root
calculation for the Cobalt AQM to save some cycles on the fast path.
This cache is populated when the qdisc is first loaded, but there's
really no reason why it can't just be pre-populated. So change it to be
pre-populated with constants, which also makes it possible to constify
it.
This gives a modest space saving for the module (not counting debug data):
.text: -224 bytes
.rodata: +80 bytes
.bss: -64 bytes
Total: -192 bytes
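As an illustration of pre-populating such a cache, the sketch below generates a 16-entry table of 1/sqrt(count) values. It assumes an unsigned 32-bit fixed-point format with 32 fractional bits and uses floating-point math, so the values need not match the constants in the kernel, which come from the qdisc's own fixed-point calculation. All names are hypothetical.

```python
import math

CACHE_SIZE = 16       # number of cached entries, per the text above
U32_MAX = 0xFFFFFFFF  # saturation value for a 32-bit table entry


def build_inv_sqrt_table(entries=CACHE_SIZE):
    """Return a tuple of 1/sqrt(count) scaled by 2**32 (assumed format)."""
    table = []
    for count in range(entries):
        if count == 0:
            table.append(U32_MAX)  # 1/sqrt(0) is undefined: saturate
        else:
            scaled = round((1 << 32) / math.sqrt(count))
            table.append(min(U32_MAX, scaled))
    return tuple(table)


# Built once and immutable, mirroring a constant table in read-only data.
INV_SQRT_TABLE = build_inv_sqrt_table()
```

Generating the values offline and pasting them in as constants is what lets the table be const.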
Signed-off-by: Dave Taht <dave.taht@gmail.com>
[ fixed up comment, rewrote commit message ]
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20240909091630.22177-1-toke@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add net/ethtool/pse-pd.c to PSE NETWORK DRIVER to receive emails concerning
modifications to the ethtool part.
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240909114336.362174-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Remove magic number 7 by introducing a GENMASK macro instead.
Remove magic number 0x80 by using the BIT macro instead.
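For readers unfamiliar with the two macros, the following Python equivalents show how they replace the magic numbers. Which bit range the driver actually uses for the value 7 is not stated above; GENMASK(2, 0) is assumed here because it is the contiguous mask equal to 7.

```python
def BIT(n):
    """Value with only bit n set, like the kernel's BIT() macro."""
    return 1 << n


def GENMASK(high, low):
    """Contiguous mask covering bits low..high inclusive."""
    if high < low:
        raise ValueError("high must be >= low")
    return ((1 << (high - low + 1)) - 1) << low


mask_low_three = GENMASK(2, 0)  # assumed replacement for magic number 7
flag_bit_seven = BIT(7)         # replacement for magic number 0x80
```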
Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20240909134301.75448-1-vtpieter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As Rob pointed out in another mail thread [1], the binding of the
tja11xx PHY is completely broken: the schema cannot catch errors in the
DTS. A compatible string is required if we want to add a custom property.
So extract the known PHY IDs from the tja11xx PHY drivers and convert them
into a list of supported compatible strings to fix the broken binding.
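For illustration, the sketch below shows how a 32-bit PHY ID maps onto
the generic "ethernet-phy-idAAAA.BBBB" compatible form. The helper name
is hypothetical and any ID passed to it is only an example value, not
one taken from this patch.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 32-bit PHY ID as a generic Ethernet PHY compatible string of
 * the form "ethernet-phy-idAAAA.BBBB" (upper and lower 16 bits, hex).
 * Returns the snprintf() result; buf should hold at least 25 bytes.
 */
static int phy_id_to_compatible(uint32_t phy_id, char *buf, size_t len)
{
	return snprintf(buf, len, "ethernet-phy-id%04x.%04x",
			(unsigned int)(phy_id >> 16),
			(unsigned int)(phy_id & 0xffff));
}
```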
Fixes: 52b2fe4535 ("dt-bindings: net: tja11xx: add nxp,refclk_in property")
Link: https://lore.kernel.org/31058f49-bac5-49a9-a422-c43b121bf049@kernel.org # [1]
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20240909012152.431647-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Test a few possible cases where we use SOF_TIMESTAMPING_OPT_RX_FILTER
with the software or hardware report/generation flags.
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240909015612.3856-3-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Introduce a new flag, SOF_TIMESTAMPING_OPT_RX_FILTER, in the receive
path. Users can set it together with SOF_TIMESTAMPING_SOFTWARE to filter
out rx software timestamp reports, especially after a process turns on
netstamp_needed_key, which can timestamp every incoming skb.
Previously, if one application started first and turned on
netstamp_needed_key, another application passing only
SOF_TIMESTAMPING_SOFTWARE could also get rx timestamps. Now we handle
this case by introducing this new flag without breaking existing users.
Quoting Willem to explain why we need the flag:
"why a process would want to request software timestamp reporting, but
not receive software timestamp generation. The only use I see is when
the application does request
SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_TX_SOFTWARE."
Similarly, this new flag can also be used in the hardware case: when it
is set together with SOF_TIMESTAMPING_RAW_HARDWARE, we won't receive
hardware receive timestamps.
A few words about errqueue in this patch:
We need to handle the egress path carefully, or else reporting the tx
timestamp will fail. Both the egress path and the ingress path
eventually call sock_recv_timestamp(), so we have to distinguish them.
Errqueue is a good indicator of the flow direction.
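The reporting decision described above can be modelled roughly as
follows. This is a simplified userspace sketch, not the kernel code: the
M_* constants are local stand-ins whose bit positions are assumptions,
and report_sw_tstamp() is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the uapi flags; the bit positions are assumptions. */
enum {
	M_TX_SOFTWARE   = 1u << 1,  /* generate tx software timestamps */
	M_RX_SOFTWARE   = 1u << 3,  /* generate rx software timestamps */
	M_SOFTWARE      = 1u << 4,  /* report software timestamps */
	M_OPT_RX_FILTER = 1u << 17, /* the new filter flag */
};

/*
 * Decide whether a software timestamp is reported to the socket.
 * from_errqueue marks an skb looped back on the error queue, i.e. a tx
 * timestamp, which the filter must never suppress.
 */
static bool report_sw_tstamp(uint32_t tsflags, bool from_errqueue)
{
	if (!(tsflags & M_SOFTWARE))
		return false;           /* reporting was not requested */
	if (from_errqueue)
		return true;            /* egress: always deliver */
	if (tsflags & M_RX_SOFTWARE)
		return true;            /* rx generation explicitly requested */
	return !(tsflags & M_OPT_RX_FILTER);
}
```

Without the filter flag the last line returns true, which keeps the old
behaviour for existing users; with it, an rx timestamp is only reported
when rx generation was requested on the same socket.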
Suggested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240909015612.3856-2-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Furong Xu says:
====================
net: stmmac: FPE via ethtool + tc
Move the Frame Preemption (FPE) over to the new standard API which uses
ethtool-mm/tc-mqprio/tc-taprio.
====================
Link: https://patch.msgid.link/cover.1725631883.git.0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ethtool --show-mm can report the real-time state of FPE, so the
fpe_irq_status logs should stay quiet.
tc-taprio can always query the driver state, so delete the unbalanced logs.
Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/39943d7967f291674a97ef0572878aca273087e9.1725631883.git.0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Drop driver defined stmmac_fpe_state, and switch to common
ethtool_mm_verify_status for local TX verification status.
Local side and remote side verification processes are completely
independent. There is no reason at all to keep a local state and
a remote state.
Add a spinlock to avoid races among ISR, timer, link update
and register configuration.
This patch is based on Vladimir Oltean's proposal.
Vladimir Oltean says:
====================
In the INITIAL state, the timer sends MPACKET_VERIFY. Eventually the
stmmac_fpe_event_status() IRQ fires and advances the state to VERIFYING,
then rearms the timer after verify_time ms. If a subsequent IRQ comes in
and modifies the state to SUCCEEDED after getting MPACKET_RESPONSE, the
timer sees this. It must enable the EFPE bit now. Otherwise, it
decrements the verify_limit counter and tries again. Eventually it
moves the status to FAILED, from which the IRQ cannot move it anywhere
else, except for another stmmac_fpe_apply() call.
====================
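The verification flow quoted above can be sketched as a small state
machine. This is a userspace model under stated assumptions: the type
and function names are hypothetical, locking is omitted (the driver
takes a spinlock around these transitions), and sending an mPacket is
reduced to a counter.

```c
#include <stdbool.h>

enum verify_status { VS_INITIAL, VS_VERIFYING, VS_SUCCEEDED, VS_FAILED };

struct fpe_model {
	enum verify_status status;
	int verify_limit;   /* remaining verification attempts */
	int verify_sent;    /* MPACKET_VERIFY frames sent so far */
	bool efpe_enabled;  /* stands in for the EFPE register bit */
};

/* Timer callback: retry verification, finish it, or give up. */
static void fpe_timer_tick(struct fpe_model *m)
{
	switch (m->status) {
	case VS_INITIAL:
	case VS_VERIFYING:
		if (m->verify_limit > 0) {
			m->verify_limit--;
			m->verify_sent++;   /* send MPACKET_VERIFY */
		} else {
			m->status = VS_FAILED;
		}
		break;
	case VS_SUCCEEDED:
		m->efpe_enabled = true;
		break;
	case VS_FAILED:
		break;
	}
}

/* IRQ: the verify mPacket has gone out on the wire. */
static void fpe_irq_verify_sent(struct fpe_model *m)
{
	if (m->status == VS_INITIAL)
		m->status = VS_VERIFYING;
}

/* IRQ: MPACKET_RESPONSE received from the link partner. */
static void fpe_irq_response(struct fpe_model *m)
{
	if (m->status == VS_VERIFYING)
		m->status = VS_SUCCEEDED;
}
```

Note that neither IRQ handler acts on VS_FAILED, which mirrors the
statement that only another apply call can leave that state.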
Co-developed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/151f86c8428eba967039718c6bf90a7d841e703b.1725631883.git.0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
By moving the fpe_cfg field to the stmmac_priv data, stmmac_fpe_cfg
becomes platform-data eventually, instead of a run-time config.
Suggested-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Furong Xu <0x1207@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://patch.msgid.link/d9b3d7ecb308c5e39778a4c8ae9df288a2754379.1725631883.git.0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Padding is not included in UDP and TCP checksums. Therefore, reduce the
length of the checksummed data to include only the data in the IP
payload. This fixes spurious reported checksum failures like
rx: pkt: sport=33000 len=26 csum=0xc850 verify=0xf9fe
pkt: bad csum
Technically it is possible for there to be trailing bytes after the UDP
data but before the Ethernet padding (e.g. if sizeof(ip) + sizeof(udp) +
udp.len < ip.len). However, we don't generate such packets.
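As a sketch of the length calculation, assuming an IPv4-style total
length field: the checksummed length is taken from the IP header rather
than from the captured frame size, so Ethernet padding is excluded. The
helper name and its parameters are hypothetical and do not come from the
selftest.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the number of bytes the L4 checksum should cover, derived from
 * the IP total length rather than from the captured frame length.
 * Returns 0 if the header fields do not fit inside the captured data.
 */
static size_t l4_csum_len(size_t captured_len, size_t l3_offset,
			  size_t ip_hdr_len, uint16_t ip_total_len)
{
	if (ip_total_len < ip_hdr_len)
		return 0;
	if (l3_offset > captured_len ||
	    ip_total_len > captured_len - l3_offset)
		return 0;
	return ip_total_len - ip_hdr_len;
}
```

For a 60-byte frame with a 14-byte Ethernet header, a 20-byte IP header
and an IP total length of 38, this gives 18 bytes, whereas subtracting
headers from the frame size would give 26 and pull 8 bytes of padding
into the checksum.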
Fixes: 91a7de8560 ("selftests/net: add csum offload test")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240906210743.627413-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The probe() function is only used for the DP83822 and DP83826 PHYs,
leaving the private data pointer uninitialized for the DP83825 models,
which causes a NULL pointer dereference in the recently introduced/changed
functions dp8382x_config_init() and dp83822_set_wol().
Add the dp8382x_probe() function, so all PHY models will have a valid
private data pointer to fix this issue and also prevent similar issues
in the future.
Fixes: 9ef9ecfa9e ("net: phy: dp8382x: keep WOL settings across suspends")
Signed-off-by: Tomas Paukrt <tomaspaukrt@email.cz>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/66w.ZbGt.65Ljx42yHo5.1csjxu@seznam.cz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge tag 'trace-v6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Move declaration of interface_lock outside of CONFIG_TIMERLAT_TRACER
The fix to some locking races moved the declaration of the
interface_lock up in the file, but also moved it into the
CONFIG_TIMERLAT_TRACER #ifdef block, breaking the build when that
wasn't set. Move it further up and out of that #ifdef block.
- Remove unused function run_tracer_selftest() stub
When CONFIG_FTRACE_STARTUP_TEST is not set the stub function
run_tracer_selftest() is not used and clang is warning about it.
Remove the function stub as it is not needed.
* tag 'trace-v6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Drop unused helper function to fix the build
tracing/osnoise: Fix build when timerlat is not enabled