Commit Graph

111082 Commits

Author SHA1 Message Date
Martin Habets
7f9e4b2a61 sfc/siena: Rename RX/TX functions to avoid conflicts with sfc
For siena use efx_siena_ as the function prefix.
Several functions are not used in Siena, so they are removed.

Use a Siena specific variable name for module parameter
efx_separate_tx_channels.
Move efx_fini_tx_queue() to avoid a forward declaration of
efx_dequeue_buffer().

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:14 -07:00
Martin Habets
71ad88f661 sfc/siena: Rename functions in efx headers to avoid conflicts with sfc
When building with allyesconfig there are many identical
symbol names.
For siena use efx_siena_ as the function and variable prefix
to avoid build errors.

efx_mtd_remove_partition can become static as it is no longer called
from other files.
efx_ticks_to_usecs and efx_xmit_done_single are not used in Siena, so
they are removed.
Several functions are only used inside efx_channels.c for Siena so
they can become static.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:14 -07:00
Martin Habets
956f2d86cb sfc/siena: Remove build references to missing functionality
Functionality not supported or needed on Siena includes:
- Anything for EF100
- EF10 specifics such as register access, PIO and TSO offload.
Also only bind to Siena NICs.

Remove EF10 specifics from nic.h.
The functions that start with efx_farch_ will be removed from sfc.ko
with a subsequent patch.
Add the efx_ prefix to siena_prepare_flush() to make it consistent
with the other APIs.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:14 -07:00
Martin Habets
d48523cb88 sfc: Copy shared files needed for Siena (part 2)
These are the files starting with m through w.
No changes are done, those will be done with subsequent commits.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:14 -07:00
Martin Habets
6e173d3b4a sfc: Copy shared files needed for Siena (part 1)
These are the files starting with b through i.
No changes are done, those will be done with subsequent commits.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:14 -07:00
Martin Habets
36ff639329 sfc: Move Siena specific files
Files are only moved, no changes are made.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:13 -07:00
Wan Jiabing
12a4d677b1 net: phy: micrel: Fix incorrect variable type in micrel
In lanphy_read_page_reg, calling __phy_read() might return a negative
error code. Use 'int' to check the error code.

Fixes: 7c2dcfa295 ("net: phy: micrel: Add support for LAN8804 PHY")
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220509144519.2343399-1-wanjiabing@vivo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:18:13 -07:00
Louis Peens
61004d1d4b nfp: flower: fix 'variable 'flow6' set but not used'
Kernel test robot reported an issue after a recent patch about an
unused variable when CONFIG_IPV6 is disabled. Move the variable
declaration to be inside the #ifdef, and do a bit more cleanup. There
is no need to use a temporary ipv6 bool value, it is just checked once,
remove the extra variable and just do the check directly.

Fixes: 9d5447ed44 ("nfp: flower: fixup ipv6/ipv4 route lookup for neigh events")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220510074845.41457-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:15:12 -07:00
Sasha Neftin
95073d0815 igc: Change type of the 'igc_check_downshift' method
The 'igc_check_downshift' method always returns 0; there is no need
for a return value so change the type of this method to void.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-05-10 14:02:53 -07:00
Sasha Neftin
7241069f7a igc: Remove unused phy_type enum
Complete to commit 8e153faf58 ("igc: Remove unused phy type")
i225 parts have only one PHY. There is no point to use phy_type enum.
Clean up the code accordingly, and get rid of the unused enum lines.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-05-10 14:02:40 -07:00
Sasha Neftin
d098538ed4 igc: Remove igc_set_spd_dplx method
igc_set_spd_dplx method is not used. This patch comes to tidy up
the driver code.

Reported-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-05-10 14:02:18 -07:00
Colin Ian King
25c321e853 ath11k: remove redundant assignment to variables vht_mcs and he_mcs
The variables vht_mcs and he_mcs are being initialized in the
start of for-loops however they are re-assigned new values in
the loop and not used outside the loop. The initializations
are redundant and can be removed.

Cleans up clang scan warnings:

warning: Although the value stored to 'vht_mcs' is used in the
enclosing expression, the value is never actually read from
'vht_mcs' [deadcode.DeadStores]

warning: Although the value stored to 'he_mcs' is used in the
enclosing expression, the value is never actually read from
'he_mcs' [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220507184155.26939-1-colin.i.king@gmail.com
2022-05-10 19:34:17 +03:00
Anilkumar Kolli
5962f370ce ath11k: Reuse the available memory after firmware reload
Ath11k allocates memory when firmware requests memory in QMI.
Coldboot calibration and firmware recovery uses firmware reload.
On firmware reload, firmware sends memory request again. If Ath11k
allocates memory on first firmware boot, reuse the available
memory. Also check if the segment type and size is same
on the next firmware boot. Reuse if segment type/size is
same as previous firmware boot else free the segment and
allocate the segment with size/type.

Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.6.0.1-00752-QCAHKSWPL_SILICONZ-1

Signed-off-by: Anilkumar Kolli <quic_akolli@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220506141448.10340-1-quic_akolli@quicinc.com
2022-05-10 19:33:33 +03:00
Johannes Berg
4255a07a98 wil6210: remove 'freq' debugfs
This is completely racy, remove it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220506095451.9852305a92c8.I05155824a1b9059eb59beccde81786dc69de354a@changeid
2022-05-10 19:32:45 +03:00
Baochen Qiang
1d7f514577 ath11k: Designating channel frequency when sending management frames
In case of Passpoint, the WLAN interface may be requested to
remain on a specific channel and then to send some management
frames on that channel. Now chanfreq of wmi_mgmt_send_cmd is set
as 0, as a result firmware may choose a default but wrong channel.
Fix it by assigning chanfreq field with the designated channel.

This change only applies to WCN6855 and QCA6390, other chips are
not affected.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220506013614.1580274-4-quic_bqiang@quicinc.com
2022-05-10 19:32:08 +03:00
Baochen Qiang
355333a217 ath11k: Don't check arvif->is_started before sending management frames
Commit 66307ca040 ("ath11k: fix mgmt_tx_wmi cmd sent to FW for
deleted vdev") wants both of below two conditions are true before
sending management frames:

1: ar->allocated_vdev_map & (1LL << arvif->vdev_id)
2: arvif->is_started

Actually the second one is not necessary because with the first one
we can make sure the vdev is present.

Also use ar->conf_mutex to synchronize vdev delete and mgmt. TX.

This issue is found in case of Passpoint scenario where ath11k
needs to send action frames before vdev is started.

Fix it by removing the second condition.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1

Fixes: 66307ca040 ("ath11k: fix mgmt_tx_wmi cmd sent to FW for deleted vdev")
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220506013614.1580274-3-quic_bqiang@quicinc.com
2022-05-10 19:32:07 +03:00
Baochen Qiang
3a5627b942 ath11k: Implement remain-on-channel support
Add remain on channel support, it is needed in several
scenarios such as Passpoint etc.

Currently this is supported by QCA6390, WCN6855, IPQ8074,
IPQ6018 and QCN9074.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220506013614.1580274-2-quic_bqiang@quicinc.com
2022-05-10 19:32:07 +03:00
Baochen Qiang
0f84a156aa ath11k: Handle keepalive during WoWLAN suspend and resume
With WoWLAN enabled and after sleeping for a rather long time,
we are seeing that with some APs, it is not able to wake up
the STA though the correct wake up pattern has been configured.
This is because the host doesn't send keepalive command to
firmware, thus firmware will not send any packet to the AP and
after a specific time the AP kicks out the STA.

Fix this issue by enabling keepalive before going to suspend
and disabling it after resume back.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1

Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220506012540.1579604-1-quic_bqiang@quicinc.com
2022-05-10 19:30:37 +03:00
Yishai Hadas
846e437387 net/mlx5: Expose mlx5_sriov_blocking_notifier_register / unregister APIs
Expose mlx5_sriov_blocking_notifier_register / unregister APIs to let a
VF register to be notified for its enablement / disablement by the PF.

Upon VF probe it will call mlx5_sriov_blocking_notifier_register() with
its notifier block and upon VF remove it will call
mlx5_sriov_blocking_notifier_unregister() to drop its registration.

This can give a VF the ability to clean some resources upon disable
before that the command interface goes down and on the other hand sets
some stuff before that it's enabled.

This may be used by a VF which is migration capable in few cases.(e.g.
PF load/unload upon an health recovery).

Link: https://lore.kernel.org/r/20220510090206.90374-2-yishaih@nvidia.com
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-05-10 15:45:28 +03:00
Wells Lu
fd3040b939 net: ethernet: Add driver for Sunplus SP7021
Add driver for Sunplus SP7021 SoC.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Wells Lu <wellslutw@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-10 11:31:32 +02:00
Manuel Ullmann
1809c30b6e net: atlantic: always deep reset on pm op, fixing up my null deref regression
The impact of this regression is the same for resume that I saw on
thaw: the kernel hangs and nothing except SysRq rebooting can be done.

Fixes regression in commit cbe6c3a8f8 ("net: atlantic: invert deep
par in pm functions, preventing null derefs"), where I disabled deep
pm resets in suspend and resume, trying to make sense of the
atl_resume_common() deep parameter in the first place.

It turns out, that atlantic always has to deep reset on pm
operations. Even though I expected that and tested resume, I screwed
up by kexec-rebooting into an unpatched kernel, thus missing the
breakage.

This fixup obsoletes the deep parameter of atl_resume_common, but I
leave the cleanup for the maintainers to post to mainline.

Suspend and hibernation were successfully tested by the reporters.

Fixes: cbe6c3a8f8 ("net: atlantic: invert deep par in pm functions, preventing null derefs")
Link: https://lore.kernel.org/regressions/9-Ehc_xXSwdXcvZqKD5aSqsqeNj5Izco4MYEwnx5cySXVEc9-x_WC4C3kAoCqNTi-H38frroUK17iobNVnkLtW36V6VWGSQEOHXhmVMm5iQ=@protonmail.com/
Reported-by: Jordan Leppert <jordanleppert@protonmail.com>
Reported-by: Holger Hoffstaette <holger@applied-asynchrony.com>
Tested-by: Jordan Leppert <jordanleppert@protonmail.com>
Tested-by: Holger Hoffstaette <holger@applied-asynchrony.com>
CC: <stable@vger.kernel.org> # 5.10+
Signed-off-by: Manuel Ullmann <labre@posteo.de>
Link: https://lore.kernel.org/r/87bkw8dfmp.fsf@posteo.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-10 10:29:54 +02:00
Gerhard Engleder
0abb62b682 tsnep: Add free running cycle counter support
The TSN endpoint Ethernet MAC supports a free running counter
additionally to its clock. This free running counter can be read and
hardware timestamps are supported. As the name implies, this counter
cannot be set and its frequency cannot be adjusted.

Add free running cycle counter support based on this free running
counter to physical clock. This also requires hardware time stamps
based on that free running counter.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-10 09:48:09 +02:00
Jakub Kicinski
b3552d6a3b eth: dpaa2-mac: remove a dead-code NULL check on fwnode parent
Since commit 4e30e98c4b ("dpaa2-mac: return -EPROBE_DEFER from dpaa2_mac_open in case the fwnode is not set")
@parent can't be NULL after the if. It's either the address
of the ->fwnode of @dpmacs or @fwnode in case of ACPI.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20220506200029.852310-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-10 09:19:56 +02:00
Mark Bloch
7f46a0b732 net/mlx5: Lag, add debugfs to query hardware lag state
Lag state has become very complicated with many modes, flags, types and
port selections methods and future work will add additional features.

Add a debugfs to query the current lag state. A new directory named "lag"
will be created under the mlx5 debugfs directory. As the driver has
debugfs per pci function the location will be: <debugfs>/mlx5/<BDF>/lag

For example:
/sys/kernel/debug/mlx5/0000:08:00.0/lag

The following files are exposed:

- state: Returns "active" or "disabled". If "active" it means hardware
         lag is active.

- members: Returns the BDFs of all the members of lag object.

- type: Returns the type of the lag currently configured. Valid only
	if hardware lag is active.
	* "roce" - Members are bare metal PFs.
	* "switchdev" - Members are in switchdev mode.
	* "multipath" - ECMP offloads.

- port_sel_mode: Returns the egress port selection method, valid
		 only if hardware lag is active.
		 * "queue_affinity" - Egress port is selected by
		   the QP/SQ affinity.
		 * "hash" - Egress port is selected by hash done on
		   each packet. Controlled by: xmit_hash_policy of the
		   bond device.
- flags: Returns flags that are specific per lag @type. Valid only if
	 hardware lag is active.
	 * "shared_fdb" - "on" or "off", if "on" single FDB is used.

- mapping: Returns the mapping which is used to select egress port.
	   Valid only if hardware lag is active.
	   If @port_sel_mode is "hash" returns the active egress ports.
	   The hash result will select only active ports.
	   if @port_sel_mode is "queue_affinity" returns the mapping
	   between the configured port affinity of the QP/SQ and actual
	   egress port. For example:
	   * 1:1 - Mapping means if the configured affinity is port 1
	           traffic will egress via port 1.
	   * 1:2 - Mapping means if the configured affinity is port 1
		   traffic will egress via port 2. This can happen
		   if port 1 is down or in active/backup mode and port 1
		   is backup.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:04 -07:00
Mark Bloch
352899f384 net/mlx5: Lag, use buckets in hash mode
When in hardware lag and the NIC has more than 2 ports when one port
goes down need to distribute the traffic between the remaining
active ports.

For better spread in such cases instead of using 1-to-1 mapping and only
4 slots in the hash, use many.

Each port will have many slots that point to it. When a port goes down
go over all the slots that pointed to that port and spread them between
the remaining active ports. Once the port comes back restore the default
mapping.

We will have number_of_ports * MLX5_LAG_MAX_HASH_BUCKETS slots.
Each MLX5_LAG_MAX_HASH_BUCKETS belong to a different port.
The native mapping is such that:

port 1: The first MLX5_LAG_MAX_HASH_BUCKETS slots are: [1, 1, .., 1]
which means if a packet is hased into one of this slots it will hit the
wire via port 1.

port 2: The second MLX5_LAG_MAX_HASH_BUCKETS slots are: [2, 2, .., 2]
which means if a packet is hased into one of this slots it will hit the
wire via port2.

and this mapping is the same of the rest of the ports.
On a failover, lets say port 2 goes down (port 1, 3, 4 are still up).
the new mapping for port 2 will be:

port 2: The second MLX5_LAG_MAX_HASH_BUCKETS are: [1, 3, 1, 4, .., 4]
which means the mapping was changed from the native mapping to a mapping
that consists of only the active ports.

With this if a port goes down the traffic will be split between the
active ports randomly

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:03 -07:00
Mark Bloch
24b3599eff net/mlx5: Lag, refactor dmesg print
Combine dmesg lag prints into a single function.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:03 -07:00
Mark Bloch
4cd14d44b1 net/mlx5: Support devices with more than 2 ports
Increase the define MLX5_MAX_PORTS to 4 as the driver is ready
to support NICs with 4 ports.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:03 -07:00
Mark Bloch
7e978e7714 net/mlx5: Lag, use actual number of lag ports
Refactor the entire lag code to use ldev->ports instead of hard-coded
defines (like MLX5_MAX_PORTS) for its operations.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:02 -07:00
Mark Bloch
cdf611d170 net/mlx5: Lag, use hash when in roce lag on 4 ports
Downstream patches will add support for lag over 4 ports.
In that mode we will only use hash as the uplink selection method.
Using hash instead of queue affinity (before this patch) offers key
advantages like:

- Align ports selection method with the method used by the bond device

- Better packets distribution where a single queue can transmit from
  multiple ports (with queue affinity a queue is bound to a single port
  regardless of the packet being sent).

- In case of failover we traffic is split between multiple ports and not
  a single one like in queue affinity.

Going forward it was decided that queue affinity will be deprecated
as using hash provides a better user experience which means on 4 ports
HCAs hash will always be used.

Future work will add hash support for 2 ports HCAs as well.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:02 -07:00
Mark Bloch
e2c45931ff net/mlx5: Lag, support single FDB only on 2 ports
E-Switch currently doesn't support more than 2 E-Switch managers
being aggregated under a single hardware lag. Have specific checks
to disallow creating lag when the code doesn't support it.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:02 -07:00
Mark Bloch
e9d5bb51c5 net/mlx5: Lag, store number of ports inside lag object
Store the number of lag ports inside the lag object. Lag object is a single
shared object managing the lag state of multiple mlx5 devices on the same
physical HCA.

Downstream patches will allow hardware lag to be created over devices with
more than 2 ports.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:02 -07:00
Mark Bloch
bc4c2f2e01 net/mlx5: Lag, filter non compatible devices
When search for a peer lag device we can filter based on that
device's capabilities.

Downstream patch will be less strict when filtering compatible devices
and remove the limitation where we require exact MLX5_MAX_PORTS and
change it to a range.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:01 -07:00
Mark Bloch
ec2fa47d7b net/mlx5: Lag, use lag lock
Use a lag specific lock instead of depending on external locks to
synchronise the lag creation/destruction.

With this, taking E-Switch mode lock is no longer needed for syncing
lag logic.

Cleanup any dead code that is left over and don't export functions that
aren't used outside the E-Switch core code.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:01 -07:00
Mark Bloch
4202ea95a6 net/mlx5: Lag, move E-Switch prerequisite check into lag code
There is no need to expose E-Switch function for something that can be
checked with already present API inside lag code.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:01 -07:00
Mark Bloch
8a6e75e5f5 net/mlx5: devcom only supports 2 ports
Devcom API is intended to be used between 2 devices only add this
implied assumption into the code and check when it's no true.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:00 -07:00
Mark Bloch
34a30d7635 net/mlx5: Lag, expose number of lag ports
Downstream patches will add support for hardware lag with
more than 2 ports. Add a way for users to query the number of lag ports.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:00 -07:00
Gavin Li
37ca95e62e net/mlx5: Increase FW pre-init timeout for health recovery
Currently, health recovery will reload driver to recover it from fatal
errors. During the driver's load process, it would wait for FW to set the
pre-init bit for up to 120 seconds, beyond this threshold it would abort
the load process. In some cases, such as a FW upgrade on the DPU, this
timeout period is insufficient, and the user has no way to recover the
host device.

To solve this issue, introduce a new FW pre-init timeout for health
recovery, which is set to 2 hours.

The timeout for devlink reload and probe will use the original one because
they are user triggered flows, and therefore should not have a
significantly long timeout, during which the user command would hang.

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:00 -07:00
Gavin Li
8324a02c34 net/mlx5: Add exit route when waiting for FW
Currently, removing a device needs to get the driver interface lock before
doing any cleanup. If the driver is waiting in a loop for FW init, there
is no way to cancel the wait, instead the device cleanup waits for the
loop to conclude and release the lock.

To allow immediate response to remove device commands, check the TEARDOWN
flag while waiting for FW init, and exit the loop if it has been set.

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:00 -07:00
Yu Xiao
299ba7a32a nfp: support Corigine PCIE vendor ID
Historically the nfp driver has supported NFP chips with Netronome's
PCIE vendor ID. This patch extends the driver to also support NFP
chips, which at this point are assumed to be otherwise identical from
a software perspective, that have Corigine's PCIE vendor ID (0x1da8).

Also, Rename the macro definitions PCI_DEVICE_ID_NERTONEOME_NFPXXXX
to PCI_DEVICE_ID_NFPXXXX, as they are now used in conjunction with two
PCIE vendor IDs.

Signed-off-by: Yu Xiao <yu.xiao@corigine.com>
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-09 18:20:39 -07:00
Yu Xiao
34e244ea15 nfp: vendor neutral strings for chip and Corigne in strings for driver
Historically the nfp driver has supported NFP chips with Netronome's
PCIE vendor ID. In preparation for extending the to also support NFP
chips that have Corigine's PCIE vendor ID (0x1da8) make printk statements
relating to the chip vendor neutral.

An alternate approach is to set the string based on the PCI vendor ID.
In our judgement this proved to cumbersome so we have taken this simpler
approach.

Update strings relating to the driver to use Corigine, who have taken
over maintenance of the driver.

Signed-off-by: Yu Xiao <yu.xiao@corigine.com>
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-09 18:20:39 -07:00
Jakub Kicinski
d75b4c7dee Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-05-06

This series contains updates to ice driver only.

Ivan Vecera fixes a race with aux plug/unplug by delaying setting adev
until initialization is complete and adding locking.

Anatolii ensures VF queues are completely disabled before attempting to
reconfigure them.

Michal ensures stale Tx timestamps are cleared from hardware.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: fix PTP stale Tx timestamps cleanup
  ice: clear stale Tx queue settings before configuring
  ice: Fix race during aux device (un)plugging
====================

Link: https://lore.kernel.org/r/20220506174129.4976-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-09 17:54:27 -07:00
Jakub Kicinski
5bcfeb6efe Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2022-05-06

Marcin Szycik says:

This patchset adds support for systemd defined naming scheme for port
representors, as well as re-enables displaying PCI bus-info in ethtool.

bus-info information has previously been removed from ethtool for port
representors, as a workaround for a bug in lshw tool, where the tool would
sometimes display wrong descriptions for port representors/PF. Now the bug
has been fixed in lshw tool [1].

Removing the workaround can be considered a regression (user might be
running an older, unpatched version of lshw) (see [2] for discussion).
However, calling SET_NETDEV_DEV also produces the same effect as removing
the workaround, i.e. lshw is able to access PCI bus-info (this time not
via ethtool, but in some other way) and the bug can occur.

Adding SET_NETDEV_DEV is important, as it greatly improves netdev naming -
- port representors are named based on PF name. Currently port representors
are named "ethX", which might be confusing, especially when spawning VFs on
multiple PFs. Furthermore, it's currently harder to determine to which PF
does a particular port representor belong, as bus-info is not shown in
ethtool.

Consider the following three cases:

Case 1: current code - driver workaround in place, no SET_NETDEV_DEV,
lshw with or without fix. Port representors are not displayed because they
don't have bus-info (the workaround), PFs are labelled correctly:

$ sudo ./lshw -c net -businfo
Bus info          Device      Class          Description
========================================================
pci@0000:02:00.0  ens6f0      network        Ethernet Controller E810-XXV for SFP <-- PF
pci@0000:02:00.1  ens6f1      network        Ethernet Controller E810-XXV for SFP
pci@0000:02:01.0  ens6f0v0    network        Ethernet Adaptive Virtual Function <-- VF
pci@0000:02:01.1  ens6f0v1    network        Ethernet Adaptive Virtual Function
...

Case 2: driver workaround in place, SET_NETDEV_DEV, no lshw fix. Port
representors have predictable names. lshw is able to get bus-info because
of SET_NETDEV_DEV and netdevs CAN be mislabelled:

$ sudo ./lshw -c net -businfo
Bus info          Device           Class          Description
=============================================================
pci@0000:02:00.0  ens6f0npf0vf60   network        Ethernet Controller E810-XXV for SFP <-- mislabeled port representor
pci@0000:02:00.1  ens6f1           network        Ethernet Controller E810-XXV for SFP
pci@0000:02:01.0  ens6f0v0         network        Ethernet Adaptive Virtual Function
pci@0000:02:01.1  ens6f0v1         network        Ethernet Adaptive Virtual Function
...
pci@0000:02:00.0  ens6f0npf0vf26   network        Ethernet interface
pci@0000:02:00.0  ens6f0           network        Ethernet interface <-- mislabeled PF
pci@0000:02:00.0  ens6f0npf0vf81   network        Ethernet interface
...
$ sudo ethtool -i ens6f0npf0vf60
driver: ice
...
bus-info:
...

Output of lshw would be the same with workaround removed; it does not
change the fact that lshw labels netdevs incorrectly, while at the same
time it prevents ethtool from displaying potentially useful data
(bus-info).

Case 3: workaround removed, SET_NETDEV_DEV, lshw fix:

$ sudo ./lshw -c net -businfo
Bus info          Device           Class          Description
=============================================================
pci@0000:02:00.0  ens6f0npf0vf73   network        Ethernet Controller E810-XXV for SFP
pci@0000:02:00.1  ens6f1           network        Ethernet Controller E810-XXV for SFP
pci@0000:02:01.0  ens6f0v0         network        Ethernet Adaptive Virtual Function
pci@0000:02:01.1  ens6f0v1         network        Ethernet Adaptive Virtual Function
...
pci@0000:02:00.0  ens6f0npf0vf5    network        Ethernet Controller E810-XXV for SFP
pci@0000:02:00.0  ens6f0           network        Ethernet Controller E810-XXV for SFP
pci@0000:02:00.0  ens6f0npf0vf60   network        Ethernet Controller E810-XXV for SFP
...
$ sudo ethtool -i ens6f0npf0vf73
driver: ice
...
bus-info: 0000:02:00.0
...

In this case poort representors have predictable names, netdevs are not
mislabelled in lshw, and bus-info is shown in ethtool.

[1] https://ezix.org/src/pkg/lshw/commit/9bf4e4c9c1
[2] https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20220321144731.3935-1-marcin.szycik@linux.intel.com

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  Revert "ice: Hide bus-info in ethtool for PRs in switchdev mode"
  ice: link representors to PCI device
====================

Link: https://lore.kernel.org/r/20220506180052.5256-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-09 17:51:01 -07:00
Francesco Dolcini
91a7cda1f4 net: phy: Fix race condition on link status change
This fixes the following error caused by a race condition between
phydev->adjust_link() and a MDIO transaction in the phy interrupt
handler. The issue was reproduced with the ethernet FEC driver and a
micrel KSZ9031 phy.

[  146.195696] fec 2188000.ethernet eth0: MDIO read timeout
[  146.201779] ------------[ cut here ]------------
[  146.206671] WARNING: CPU: 0 PID: 571 at drivers/net/phy/phy.c:942 phy_error+0x24/0x6c
[  146.214744] Modules linked in: bnep imx_vdoa imx_sdma evbug
[  146.220640] CPU: 0 PID: 571 Comm: irq/128-2188000 Not tainted 5.18.0-rc3-00080-gd569e86915b7 #9
[  146.229563] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[  146.236257]  unwind_backtrace from show_stack+0x10/0x14
[  146.241640]  show_stack from dump_stack_lvl+0x58/0x70
[  146.246841]  dump_stack_lvl from __warn+0xb4/0x24c
[  146.251772]  __warn from warn_slowpath_fmt+0x5c/0xd4
[  146.256873]  warn_slowpath_fmt from phy_error+0x24/0x6c
[  146.262249]  phy_error from kszphy_handle_interrupt+0x40/0x48
[  146.268159]  kszphy_handle_interrupt from irq_thread_fn+0x1c/0x78
[  146.274417]  irq_thread_fn from irq_thread+0xf0/0x1dc
[  146.279605]  irq_thread from kthread+0xe4/0x104
[  146.284267]  kthread from ret_from_fork+0x14/0x28
[  146.289164] Exception stack(0xe6fa1fb0 to 0xe6fa1ff8)
[  146.294448] 1fa0:                                     00000000 00000000 00000000 00000000
[  146.302842] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  146.311281] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  146.318262] irq event stamp: 12325
[  146.321780] hardirqs last  enabled at (12333): [<c01984c4>] __up_console_sem+0x50/0x60
[  146.330013] hardirqs last disabled at (12342): [<c01984b0>] __up_console_sem+0x3c/0x60
[  146.338259] softirqs last  enabled at (12324): [<c01017f0>] __do_softirq+0x2c0/0x624
[  146.346311] softirqs last disabled at (12319): [<c01300ac>] __irq_exit_rcu+0x138/0x178
[  146.354447] ---[ end trace 0000000000000000 ]---

With the FEC driver phydev->adjust_link() calls fec_enet_adjust_link()
calls fec_stop()/fec_restart() and both these function reset and
temporary disable the FEC disrupting any MII transaction that
could be happening at the same time.

fec_enet_adjust_link() and phy_read() can be running at the same time
when we have one additional interrupt before the phy_state_machine() is
able to terminate.

Thread 1 (phylib WQ)       | Thread 2 (phy interrupt)
                           |
                           | phy_interrupt()            <-- PHY IRQ
                           |  handle_interrupt()
                           |   phy_read()
                           |   phy_trigger_machine()
                           |    --> schedule phylib WQ
                           |
                           |
phy_state_machine()        |
 phy_check_link_status()   |
  phy_link_change()        |
   phydev->adjust_link()   |
    fec_enet_adjust_link() |
     --> FEC reset         | phy_interrupt()            <-- PHY IRQ
                           |  phy_read()
                           |

Fix this by acquiring the phydev lock in phy_interrupt().

Link: https://lore.kernel.org/all/20220422152612.GA510015@francesco-nb.int.toradex.com/
Fixes: c974bdbc3e ("net: phy: Use threaded IRQ, to allow IRQ from sleeping devices")
cc: <stable@vger.kernel.org>
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220506060815.327382-1-francesco.dolcini@toradex.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-09 17:48:17 -07:00
Yang Yingliang
51ca86b4c9 ethernet: tulip: fix missing pci_disable_device() on error in tulip_init_one()
Fix the missing pci_disable_device() before return
from tulip_init_one() in the error handling case.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220506094250.3630615-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-09 15:52:38 -07:00
Yang Yingliang
e4b1045bf9 ionic: fix missing pci_release_regions() on error in ionic_probe()
If ionic_map_bars() fails, pci_release_regions() need be called.

Fixes: fbfb803153 ("ionic: Add hardware init and device commands")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220506034040.2614129-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-09 15:49:12 -07:00
Guangbin Huang
443edfd6d4 net: hns3: fix incorrect type of argument in declaration of function hclge_comm_get_rss_indir_tbl
The argument rss_ind_tbl_size is type u16 in function definition of
hclge_comm_get_rss_indir_tbl(), but it is set to type __le16 in function
declaration by mistake, so fix it.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-09 14:30:38 +01:00
Guangbin Huang
a1aed456e3 net: hns3: add query vf ring and vector map relation
This patch adds a new mailbox opcode to query map relation between
vf ring and vector.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-09 14:30:37 +01:00
Jie Wang
416eedb603 net: hns3: add byte order conversion for VF to PF mailbox message
This patch uses __le16/__32 to define mailbox data structures. Then byte
order conversion are added for mailbox messages from VF to PF.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-09 14:30:37 +01:00
Jie Wang
767975e582 net: hns3: add byte order conversion for PF to VF mailbox message
Currently, hns3 mailbox processing between PF and VF missed to convert
message byte order and use data type u16 instead of __le16 for mailbox
data process. These processes may cause problems between different
architectures.

So this patch uses __le16/__le32 data type to define mailbox data
structures. To be compatible with old hns3 driver, these structures use
one-byte alignment. Then byte order conversions are added to mailbox
messages from PF to VF.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-09 14:30:37 +01:00
Yufeng Mo
bbed702412 net: hns3: remove the affinity settings of vector0
Vector0 is used for common interrupt control events and is
irrelevant to performance. Currently, the driver sets the
default affinity of vector0 to NUMA nodes, which is unnecessary.
Therefore, the default setting is removed, and the driver does
not set the affinity for vector0.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-09 14:30:37 +01:00