Commit Graph

42672 Commits

Author SHA1 Message Date
Manuel Ullmann
cbe6c3a8f8 net: atlantic: invert deep par in pm functions, preventing null derefs
This will reset deeply on freeze and thaw instead of suspend and
resume and prevent null pointer dereferences of the uninitialized ring
0 buffer while thawing.

The impact is an indefinitely hanging kernel. You can't switch
consoles after this and the only possible user interaction is SysRq.

BUG: kernel NULL pointer dereference
RIP: 0010:aq_ring_rx_fill+0xcf/0x210 [atlantic]
aq_vec_init+0x85/0xe0 [atlantic]
aq_nic_init+0xf7/0x1d0 [atlantic]
atl_resume_common+0x4f/0x100 [atlantic]
pci_pm_thaw+0x42/0xa0

resolves in aq_ring.o to

```
0000000000000ae0 <aq_ring_rx_fill>:
{
/* ... */
 baf:	48 8b 43 08          	mov    0x8(%rbx),%rax
 		buff->flags = 0U; /* buff is NULL */
```

The bug has been present since the introduction of the new pm code in
8aaa112a57 ("net: atlantic: refactoring pm logic") and was hidden
until 8ce8427169 ("net: atlantic: changes for multi-TC support"),
which refactored the aq_vec_{free,alloc} functions into
aq_vec_{,ring}_{free,alloc}, but is technically not wrong. The
original functions just always reinitialized the buffers on S3/S4. If
the interface is down before freezing, the bug does not occur. It does
not matter, whether the initrd contains and loads the module before
thawing.

So the fix is to invert the boolean parameter deep in all pm function
calls, which was clearly intended to be set like that.

First report was on Github [1], which you have to guess from the
resume logs in the posted dmesg snippet. Recently I posted one on
Bugzilla [2], since I did not have an AQC device so far.

#regzbot introduced: 8ce8427169
#regzbot from: koo5 <kolman.jindrich@gmail.com>
#regzbot monitor: https://github.com/Aquantia/AQtion/issues/32

Fixes: 8aaa112a57 ("net: atlantic: refactoring pm logic")
Link: https://github.com/Aquantia/AQtion/issues/32 [1]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215798 [2]
Cc: stable@vger.kernel.org
Reported-by: koo5 <kolman.jindrich@gmail.com>
Signed-off-by: Manuel Ullmann <labre@posteo.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 13:34:36 +01:00
Jiri Pirko
6445eef0f6 mlxsw: spectrum: Add port to linecard mapping
For each port get slot_index using PMLP register. For ports residing
on a linecard, identify it with the linecard by setting mapping
using devlink_port_linecard_set() helper. Use linecard slot index for
PMTDB register queries.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 11:00:19 +01:00
Jiri Pirko
45bf3b7267 mlxsw: core: Extend driver ops by remove selected ports op
In case of line card implementation, the core has to have a way to
remove relevant ports manually. Extend the Spectrum driver ops by an op
that implements port removal of selected ports upon request.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 11:00:19 +01:00
Jiri Pirko
ee7a70fa67 mlxsw: core_linecards: Implement line card activation process
Allow to process events generated upon line card getting "ready" and
"active".

When DSDSC event with "ready" bit set is delivered, that means the
line card is powered up. Use MDDC register to push the line card to
active state. Once FW is done with that, the DSDSC event with "active"
bit set is delivered.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 11:00:19 +01:00
Jiri Pirko
b217127e5e mlxsw: core_linecards: Add line card objects and implement provisioning
Introduce objects for line cards and an infrastructure around that.
Use devlink_linecard_create/destroy() to register the line card with
devlink core. Implement provisioning ops with a list of supported
line cards.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 11:00:19 +01:00
Jiri Pirko
5bade5aa4a mlxsw: reg: Add Management Binary Code Transfer Register
The MBCT register allows to transfer binary INI codes from the host to
the management FW by transferring it by chunks of maximum 1KB.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 11:00:19 +01:00
Jiri Pirko
5290a8ff2e mlxsw: reg: Add Management DownStream Device Control Register
The MDDC register allows to control downstream devices and line cards.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 11:00:19 +01:00
Jiri Pirko
505f524dc6 mlxsw: reg: Add Management DownStream Device Query Register
The MDDQ register allows to query the DownStream device properties.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 11:00:19 +01:00
Jiri Pirko
b0ec003e9a mlxsw: spectrum: Introduce port mapping change event processing
Register PMLPE trap and process the port mapping changes delivered
by it by creating related ports. Note that this happens after
provisioning. The INI of the linecard is processed and merged by FW.
PMLPE is generated for each port. Process this mapping change.

Layout of PMLPE is the same as layout of PMLP.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 11:00:19 +01:00
Jiri Pirko
adc6462376 mlxsw: Narrow the critical section of devl_lock during ports creation/removal
No need to hold the lock for alloc and freecpu. So narrow the critical
section. Follow-up patch is going to benefit from this by adding more
code to the functions which will be out of the critical as well.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 11:00:18 +01:00
Jiri Pirko
ebf0c53417 mlxsw: reg: Add Ports Mapping Event Configuration Register
The PMECR register is used to enable/disable event triggering
in case of local port mapping change.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 11:00:18 +01:00
Jiri Pirko
d3ad2d8820 mlxsw: spectrum: Allocate port mapping array of structs instead of pointers
Instead of array of pointers to port mapping structures, allocate the
array of structures directly.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 11:00:18 +01:00
Jiri Pirko
bac62191a3 mlxsw: spectrum: Allow lane to start from non-zero index
So far, the lane index always started from zero. That is not true for
modular systems with gearbox-equipped linecards. Loose the check so the
lanes can start from non-zero index.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 11:00:18 +01:00
Horatiu Vultur
d08ed85256 net: lan966x: Make sure to release ptp interrupt
When the lan966x driver is removed make sure to remove also the ptp_irq
IRQ.

Fixes: e85a96e48e ("net: lan966x: Add support for ptp interrupts")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/20220413195716.3796467-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 14:29:38 -07:00
David Ahern
db53cd3d88 net: Handle l3mdev in ip_tunnel_init_flow
Ido reported that the commit referenced in the Fixes tag broke
a gre use case with dummy devices. Add a check to ip_tunnel_init_flow
to see if the oif is an l3mdev port and if so set the oif to 0 to
avoid the oif comparison in fib_lookup_good_nhc.

Fixes: 40867d74c3 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices")
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 14:27:30 -07:00
Jakub Kicinski
f3226eed54 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-04-13

This series contains updates to igc and e1000e drivers.

Sasha removes waiting for hardware semaphore as it could cause an
infinite loop and changes usleep_range() calls done under atomic
context to udelay() for igc. For e1000e, he changes some variables from
u16 to u32 to prevent possible overflow of values.

Vinicius disables PTM when going to suspend as it is causing hang issues
on some platforms for igc.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  e1000e: Fix possible overflow in LTR decoding
  igc: Fix suspending when PTM is active
  igc: Fix BUG: scheduling while atomic
  igc: Fix infinite loop in release_swfw_sync
====================

Link: https://lore.kernel.org/r/20220413170814.2066855-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 14:25:42 -07:00
Leon Romanovsky
31248b5a35 octeon_ep: Remove custom driver version
In review comment [1] was pointed that new code is not supposed
to set driver version and should rely on kernel version instead.

As an outcome of that comment all the dance around setting such
driver version to FW should be removed too, because in upstream
kernel whole driver will have same version so read/write from/to
FW will give same result.

[1] https://lore.kernel.org/all/YladGTmon1x3dfxI@unreal

Fixes: 862cd659a6 ("octeon_ep: Add driver framework and device initialization")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/5d76f3116ee795071ec044eabb815d6c2bdc7dbd.1649922731.git.leonro@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 14:04:14 -07:00
Sukadev Bhattiprolu
93b1ebb348 ibmvnic: Allow multiple ltbs in txpool ltb_set
Allow multiple LTBs in the txpool's ltb_set. i.e rather than using
a single large LTB, use several smaller LTBs.

The first n-1 LTBs will all be of the same size. The size of the last
LTB in the set depends on the number of buffers and buffer (mtu) size.
This strategy hopefully allows more reuse of the initial LTBs and also
reduces the chances of an allocation failure (of the large LTB) when
system is low in memory.

Suggested-by: Brian King <brking@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 14:02:06 -07:00
Sukadev Bhattiprolu
a75de82057 ibmvnic: Allow multiple ltbs in rxpool ltb_set
Allow multiple LTBs in the rxpool's ltb_set. The first n-1 LTBs will all
be of the same size. The size of the last LTB in the set depends on the
number of buffers and buffer (mtu) size.

Having a set of LTBs per pool provides a couple of benefits.

First, with the current value of IBMVNIC_MAX_LTB_SIZE of 16MB, with an
MTU of 9000, we need a LTB (DMA buffer) of that size but the allocation
can fail in low memory conditions. With a set of LTBs per pool, we can
use several smaller (8MB) LTBs and hopefully have fewer allocation
failures. (See also comments in ibmvnic.h on the trade-off with smaller
LTBs)

Second since the kernel limits the size of the DMA buffer to 16MB (based
on MAX_ORDER), with a single DMA buffer per pool, the pool is also limited
to 16MB. This in turn limits the number of buffers per pool to 1763 when
MTU is 9000. With a set of LTBs per pool, we can have upto the max of 4096
buffers per pool even when MTU is 9000.

Suggested-by: Brian King <brking@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 14:02:05 -07:00
Sukadev Bhattiprolu
d6b4585090 ibmvnic: convert rxpool ltb to a set of ltbs
Define and use interfaces that treat the long term buffer (LTB) of an
rxpool as a set of LTBs rather than a single LTB. The set only has one
LTB for now.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 14:02:05 -07:00
Sukadev Bhattiprolu
0c91bf9ceb ibmvnic: define map_txpool_buf_to_ltb()
Define a helper to map a given txpool buffer into its corresponding long
term buffer (LTB) and offset. Currently there is just one LTB per txpool
so the mapping is trivial. When we add support for multiple LTBs per
txpool, this helper will be more useful.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 14:02:04 -07:00
Sukadev Bhattiprolu
2872a67c6b ibmvnic: define map_rxpool_buf_to_ltb()
Define a helper to map a given rx pool buffer into its corresponding long
term buffer (LTB) and offset. Currently there is just one LTB per pool so
the mapping is trivial. When we add support for multiple LTBs per pool,
this helper will be more useful.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 14:02:04 -07:00
Sukadev Bhattiprolu
8880fc669d ibmvnic: rename local variable index to bufidx
The local variable 'index' is heavily used in some functions and is
confusing with the presence of other "index" fields like pool->index,
->consumer_index, etc. Rename it to bufidx to better reflect that its
the index of a buffer in the pool.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 14:02:03 -07:00
Maciej Fijalkowski
4efad19616 ice, xsk: Avoid refilling single Rx descriptors
Call alloc Rx routine for ZC driver only when the amount of unallocated
descriptors exceeds given threshold.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220413153015.453864-14-maciej.fijalkowski@intel.com
2022-04-15 21:11:13 +02:00
Maciej Fijalkowski
a817ead415 stmmac, xsk: Diversify return values from xsk_wakeup call paths
Currently, when debugging AF_XDP workloads, one can correlate the -ENXIO
return code as the case that XSK is not in the bound state. Returning
same code from ndo_xsk_wakeup can be misleading and simply makes it
harder to follow what is going on.

Change ENXIOs in stmmac's ndo_xsk_wakeup() implementation to EINVALs, so
that when probing it is clear that something is wrong on the driver
side, not the xsk_{recv,send}msg.

There is a -ENETDOWN that can happen from both kernel/driver sides
though, but I don't have a correct replacement for this on one of the
sides, so let's keep it that way.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220413153015.453864-13-maciej.fijalkowski@intel.com
2022-04-15 21:11:05 +02:00
Maciej Fijalkowski
7b7f2f273d mlx5, xsk: Diversify return values from xsk_wakeup call paths
Currently, when debugging AF_XDP workloads, one can correlate the -ENXIO
return code as the case that XSK is not in the bound state. Returning
same code from ndo_xsk_wakeup can be misleading and simply makes it
harder to follow what is going on.

Change ENXIO in mlx5's ndo_xsk_wakeup() implementation to EINVAL, so
that when probing it is clear that something is wrong on the driver
side, not the xsk_{recv,send}msg.

There is a -ENETDOWN that can happen from both kernel/driver sides
though, but I don't have a correct replacement for this on one of the
sides, so let's keep it that way.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220413153015.453864-12-maciej.fijalkowski@intel.com
2022-04-15 21:11:01 +02:00
Maciej Fijalkowski
0f8bf01889 ixgbe, xsk: Diversify return values from xsk_wakeup call paths
Currently, when debugging AF_XDP workloads, one can correlate the -ENXIO
return code as the case that XSK is not in the bound state. Returning
same code from ndo_xsk_wakeup can be misleading and simply makes it
harder to follow what is going on.

Change ENXIOs in ixgbe's ndo_xsk_wakeup() implementation to EINVALs, so
that when probing it is clear that something is wrong on the driver
side, not the xsk_{recv,send}msg.

There is a -ENETDOWN that can happen from both kernel/driver sides
though, but I don't have a correct replacement for this on one of the
sides, so let's keep it that way.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220413153015.453864-11-maciej.fijalkowski@intel.com
2022-04-15 21:10:57 +02:00
Maciej Fijalkowski
ed7ae2d622 i40e, xsk: Diversify return values from xsk_wakeup call paths
Currently, when debugging AF_XDP workloads, one can correlate the -ENXIO
return code as the case that XSK is not in the bound state. Returning
same code from ndo_xsk_wakeup can be misleading and simply makes it
harder to follow what is going on.

Change ENXIOs in i40e's ndo_xsk_wakeup() implementation to EINVALs, so
that when probing it is clear that something is wrong on the driver
side, not the xsk_{recv,send}msg.

There is a -ENETDOWN that can happen from both kernel/driver sides
though, but I don't have a correct replacement for this on one of the
sides, so let's keep it that way.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220413153015.453864-10-maciej.fijalkowski@intel.com
2022-04-15 21:10:54 +02:00
Maciej Fijalkowski
ed8a6bc60f ice, xsk: Diversify return values from xsk_wakeup call paths
Currently, when debugging AF_XDP workloads, one can correlate the -ENXIO
return code as the case that XSK is not in the bound state. Returning
same code from ndo_xsk_wakeup can be misleading and simply makes it
harder to follow what is going on.

Change ENXIOs in ice's ndo_xsk_wakeup() implementation to EINVALs, so
that when probing it is clear that something is wrong on the driver
side, not the xsk_{recv,send}msg.

There is a -ENETDOWN that can happen from both kernel/driver sides
though, but I don't have a correct replacement for this on one of the
sides, so let's keep it that way.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220413153015.453864-9-maciej.fijalkowski@intel.com
2022-04-15 21:10:49 +02:00
Maciej Fijalkowski
c7dd09fd46 ixgbe, xsk: Terminate Rx side of NAPI when XSK Rx queue gets full
When XSK pool uses need_wakeup feature, correlate -ENOBUFS that was
returned from xdp_do_redirect() with a XSK Rx queue being full. In such
case, terminate the Rx processing that is being done on the current HW
Rx ring and let the user space consume descriptors from XSK Rx queue so
that there is room that driver can use later on.

Introduce new internal return code IXGBE_XDP_EXIT that will indicate case
described above.

Note that it does not affect Tx processing that is bound to the same
NAPI context, nor the other Rx rings.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220413153015.453864-8-maciej.fijalkowski@intel.com
2022-04-15 21:10:45 +02:00
Maciej Fijalkowski
b8aef650e5 i40e, xsk: Terminate Rx side of NAPI when XSK Rx queue gets full
When XSK pool uses need_wakeup feature, correlate -ENOBUFS that was
returned from xdp_do_redirect() with a XSK Rx queue being full. In such
case, terminate the Rx processing that is being done on the current HW
Rx ring and let the user space consume descriptors from XSK Rx queue so
that there is room that driver can use later on.

Introduce new internal return code I40E_XDP_EXIT that will indicate case
described above.

Note that it does not affect Tx processing that is bound to the same
NAPI context, nor the other Rx rings.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220413153015.453864-7-maciej.fijalkowski@intel.com
2022-04-15 21:10:41 +02:00
Maciej Fijalkowski
50ae066480 ice, xsk: Terminate Rx side of NAPI when XSK Rx queue gets full
When XSK pool uses need_wakeup feature, correlate -ENOBUFS that was
returned from xdp_do_redirect() with a XSK Rx queue being full. In such
case, terminate the Rx processing that is being done on the current HW
Rx ring and let the user space consume descriptors from XSK Rx queue so
that there is room that driver can use later on.

Introduce new internal return code ICE_XDP_EXIT that will indicate case
described above.

Note that it does not affect Tx processing that is bound to the same
NAPI context, nor the other Rx rings.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220413153015.453864-6-maciej.fijalkowski@intel.com
2022-04-15 21:10:37 +02:00
Maciej Fijalkowski
d090c88586 ixgbe, xsk: Decorate IXGBE_XDP_REDIR with likely()
ixgbe_run_xdp_zc() suggests to compiler that XDP_REDIRECT is the most
probable action returned from BPF program that AF_XDP has in its
pipeline. Let's also bring this suggestion up to the callsite of
ixgbe_run_xdp_zc() so that compiler will be able to generate more
optimized code which in turn will make branch predictor happy.

Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220413153015.453864-5-maciej.fijalkowski@intel.com
2022-04-15 21:10:32 +02:00
Maciej Fijalkowski
0bd5ab511e ice, xsk: Decorate ICE_XDP_REDIR with likely()
ice_run_xdp_zc() suggests to compiler that XDP_REDIRECT is the most
probable action returned from BPF program that AF_XDP has in its
pipeline. Let's also bring this suggestion up to the callsite of
ice_run_xdp_zc() so that compiler will be able to generate more
optimized code which in turn will make branch predictor happy.

Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220413153015.453864-4-maciej.fijalkowski@intel.com
2022-04-15 21:10:25 +02:00
Jie Wang
1f702c1643 net: hns3: add tx push support in hns3 ring param process
This patch adds tx push param to hns3 ring param and adapts the set and get
API of ring params. So users can set it by cmd ethtool -G and get it by cmd
ethtool -g.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 11:41:51 -07:00
Stephen Hemminger
da367ac74a net: restore alpha order to Ethernet devices in config
The displayed list of Ethernet devices in make menuconfig
has gotten out of order. This is mostly due to changes in vendor
names etc, but also because of new Microsoft entry in wrong place.

This restores so that the display is in order even if the names
of the sub directories are not.

Fixes: ca9c54d2d6 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:56:16 +01:00
Yang Yingliang
0a03f3c511 octeon_ep: fix error return code in octep_probe()
If register_netdev() fails , it should return error
code in octep_probe().

Fixes: 862cd659a6 ("octeon_ep: Add driver framework and device initialization")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:49:20 +01:00
Shravya Kumbham
7240bf6fb2 net: emaclite: Remove custom BUFFER_ALIGN macro
BUFFER_ALIGN macro is used to calculate the number of bytes
required for the next alignment. Instead of this, we can directly
use the skb_reserve(skb, NET_IP_ALIGN) to make the protocol header
buffer aligned on at least a 4-byte boundary, where the NET_IP_ALIGN
is by default defined as 2. So removing the BUFFER_ALIGN and its
related defines which it can be done by the skb_reserve() itself.

Signed-off-by: Shravya Kumbham <shravya.kumbham@xilinx.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:46:29 +01:00
Michal Simek
7ae7d494f6 net: emaclite: Update copyright text to correct format
Based on recommended guidance Copyright term should be also present in
front of (c). That's why aligned driver to match this pattern.
It helps automated tools with source code scanning.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:46:29 +01:00
Radhey Shyam Pandey
945e659dff net: emaclite: Fix coding style
Make coding style changes to fix checkpatch script warnings.
There is no functional change. Fixes below check and warnings-

CHECK: Blank lines aren't necessary after an open brace '{'
CHECK: spinlock_t definition without comment
CHECK: Please don't use multiple blank lines
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
CHECK: braces {} should be used on all arms of this statement
CHECK: Unbalanced braces around else statement
CHECK: Alignment should match open parenthesis
WARNING: Missing a blank line after declarations

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:46:29 +01:00
Minghao Chi
81669e7c6c net: ethernet: ti: davinci_emac: using pm_runtime_resume_and_get instead of pm_runtime_get_sync
Using pm_runtime_resume_and_get() to replace pm_runtime_get_sync and
pm_runtime_put_noidle. This change is just to simplify the code, no
actual functional changes.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:18:24 +01:00
Colin Ian King
bb578430d0 octeon_ep: Fix spelling mistake "inerrupts" -> "interrupts"
There is a spelling mistake in a dev_info message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:16:09 +01:00
Vadim Pasternak
03978fb88b mlxsw: core_thermal: Use common define for thermal zone name length
Replace internal define 'MLXSW_THERMAL_ZONE_MAX_NAME' by common
'THERMAL_NAME_LENGTH'.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:06:13 +01:00
Vadim Pasternak
739d56bc63 mlxsw: core_thermal: Use exact name of cooling devices for binding
Modular system supports additional cooling devices "mlxreg_fan1",
"mlxreg_fan2", etcetera. Thermal zones in "mlxsw" driver should be
bound to the same device as before called "mlxreg_fan". Used exact
match for cooling device name to avoid binding to new additional
cooling devices.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:06:13 +01:00
Vadim Pasternak
6d94449a7d mlxsw: core_thermal: Add line card id prefix to line card thermal zone name
Add prefix "lc#n" to thermal zones associated with the thermal objects
found on line cards.

For example thermal zone for module #9 located at line card #7 will
have type:
mlxsw-lc7-module9.
And thermal zone for gearbox #3 located at line card #5 will have type:
mlxsw-lc5-gearbox3.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:06:13 +01:00
Vadim Pasternak
ef0df4fa32 mlxsw: core_thermal: Extend internal structures to support multi thermal areas
Introduce intermediate level for thermal zones areas.
Currently all thermal zones are associated with thermal objects located
within the main board. Such objects are created during driver
initialization and removed during driver de-initialization.

For line cards in modular system the thermal zones are to be associated
with the specific line card. They should be created whenever new line
card is available (inserted, validated, powered and enabled) and
removed, when line card is getting unavailable.
The thermal objects found on the line card #n are accessed by setting
slot index to #n, while for access to objects found on the main board
slot index should be set to default value zero.

Each thermal area contains the set of thermal zones associated with
particular slot index.
Thus introduction of thermal zone areas allows to use the same APIs for
the main board and line cards, by adding slot index argument.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:06:13 +01:00
Vadim Pasternak
fd27849dd6 mlxsw: core_hwmon: Introduce slot parameter in hwmon interfaces
Add 'slot' parameter to 'mlxsw_hwmon_dev' structure. Use this parameter
in mlxsw_reg_mtmp_pack(), mlxsw_reg_mtbr_pack(), mlxsw_reg_mgpir_pack()
and mlxsw_reg_mtmp_slot_index_set() routines.
For main board it'll always be zero, for line cards it'll be set to
the physical slot number in modular systems.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:06:13 +01:00
Vadim Pasternak
b890ad418e mlxsw: core_hwmon: Extend internal structures to support multi hwmon objects
Currently, mlxsw supports a single hwmon device and registers it with
attributes corresponding to the various objects found on the main
board such as fans and gearboxes.

Line cards can have the same objects, but unlike the main board they
can be added and removed while the system is running. The various
hwmon objects found on these line cards should be created when the
line card becomes available and destroyed when the line card becomes
unavailable.

The above can be achieved by representing each line card as a
different hwmon device and registering / unregistering it when the
line card becomes available / unavailable.

Prepare for multi hwmon device support by splitting
'struct mlxsw_hwmon' into 'struct mlxsw_hwmon' and
'struct mlxsw_hwmon_dev'. The first will hold information relevant to
all hwmon devices, whereas the second will hold per-hwmon device
information.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:06:13 +01:00
Vadim Pasternak
b244143a08 mlxsw: core: Move port module events enablement to a separate function
Use a separate function for enablement of port module events such
plug/unplug and temperature threshold crossing. The motivation is to
reuse the function for line cards.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:06:12 +01:00
Vadim Pasternak
e5b6a5bac8 mlxsw: core: Extend port module data structures for line cards
The port module core is tasked with module operations such as setting
power mode policy and reset. The per-module information is currently
stored in one large array suited for non-modular systems where only the
main board is present (i.e., slot index 0).

As a preparation for line cards support, allocate a per line card array
according to the queried number of slots in the system. For each line
card, allocate a module array according to the queried maximum number of
modules per-slot.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:06:12 +01:00