Commit Graph

35849 Commits

Author SHA1 Message Date
Jakub Kicinski
5c39f26e67 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Trivial conflict in CAN, keep the net-next + the byteswap wrapper.

Conflicts:
	drivers/net/can/usb/gs_usb.c

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 18:25:27 -08:00
Ido Schimmel
ff47fa13c9 mlxsw: spectrum_router: Update adjacency index more efficiently
The device supports an operation that allows the driver to issue one
request to update the adjacency index for all the routes in a given
virtual router (VR) from old index and size to new ones. This is useful
in case the configuration of a certain nexthop group is updated and its
adjacency index changes.

Currently, the driver does not use this operation in an efficient
manner. It iterates over all the routes using the nexthop group and
issues an update request for the VR if it is not the same as the
previous VR.

Instead, use the VR tracking added in the previous patch to update the
adjacency index once for each VR currently using the nexthop group.

Example:

8k IPv6 routes were added in an alternating manner to two VRFs. All the
routes are using the same nexthop object ('nhid 1').

Before:

# perf stat -e devlink:devlink_hwmsg --filter='incoming==0' -- ip nexthop replace id 1 via 2001:db8:1::2 dev swp3

 Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3':

            16,385      devlink:devlink_hwmsg

       4.255933213 seconds time elapsed

       0.000000000 seconds user
       0.666923000 seconds sys

Number of EMAD transactions corresponds to number of routes using the
nexthop group.

After:

# perf stat -e devlink:devlink_hwmsg --filter='incoming==0' -- ip nexthop replace id 1 via 2001:db8:1::2 dev swp3

 Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3':

                 3      devlink:devlink_hwmsg

       0.077655094 seconds time elapsed

       0.000000000 seconds user
       0.076698000 seconds sys

Number of EMAD transactions corresponds to number of VRFs / VRs.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:33 -08:00
Ido Schimmel
d2141a42b9 mlxsw: spectrum_router: Track nexthop group virtual router membership
For each nexthop group, track in which virtual routers (VRs) the group is
used. This is going to be used by the next patch to perform a more
efficient adjacency index update whenever the group's adjacency index
changes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:33 -08:00
Ido Schimmel
9a4ab10c74 mlxsw: spectrum_router: Rollback virtual router adjacency pointer update
In the rare case where the adjacency pointer cannot be updated for a
given virtual router, rollback the operation so that virtual routers
that are already using the new index will use the old one again.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:33 -08:00
Ido Schimmel
40e4413d5d mlxsw: spectrum_router: Pass virtual router parameters directly instead of pointer
mlxsw_sp_adj_index_mass_update_vr() only needs the virtual router's
identifier and protocol, so pass them directly. In a subsequent patch
the caller will not have access to the pointer.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:33 -08:00
Ido Schimmel
1c2c5eb6e1 mlxsw: spectrum_router: Fix error handling issue
Return error to the caller instead of suppressing it.

Fixes: e3ddfb45ba ("mlxsw: spectrum_router: Allow returning errors from mlxsw_sp_nexthop_group_refresh()")
Addresses-Coverity: ("Error handling issues  (CHECKED_RETURN)")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:32 -08:00
Ingo Molnar
a787bdaff8 Merge branch 'linus' into sched/core, to resolve semantic conflict
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-11-27 11:10:50 +01:00
Parav Pandit
617b860c18 net/mlx5: Treat host PF vport as other (non eswitch manager) vport
When eswitch manager is running on ECPF, host PF should be treated
as non eswitch manager port, similar to other VF vports.
Fail to do so, results in firmware treating PF's vport as ECPF
vport for eswitch ACL tables.
Non zero check to figure out if a given vport is other vport or not
is not sufficient becase PF vport number = 0 on ECPF.
Hence, create esw acl tables with an attribute of other vport.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:45:03 -08:00
Parav Pandit
5bef709d76 net/mlx5: Enable host PF HCA after eswitch is initialized
Currently ECPF enables external host PF too early in the initialization
sequence for Ethernet links when ECPF is eswitch manager.

Due to this, when external host PF driver is loaded, host PF's HCA CAP has
inner_ip_version supported by NIC RX flow table.
This capability is later updated by firmware after ECPF driver enables
ENCAP/DECAP as eswitch manager.

This results into a timing race condition, where CREATE_TIR command
fails with a below syndrome on host PF.

mlx5_cmd_check:775:(pid 510): CREATE_TIR(0x900) op_mod(0x0) failed,
status bad parameter(0x3), syndrome (0x562b00)

Hence, enable the external host PF after necessary eswitch and per vport
initialization is completed.
Continue to enable host PF when eswitch manager capability is off for a
ECPF.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:45:03 -08:00
Parav Pandit
8a90f2fc67 net/mlx5: Rename peer_pf to host_pf
To match the hardware spec, rename peer_pf to host_pf.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:45:03 -08:00
Eli Cohen
8d2a9d8d64 net/mlx5: Export steering related functions
Export
 mlx5_create_flow_table()
 mlx5_create_flow_group()
 mlx5_destroy_flow_group().

These symbols are required by the VDPA implementation to create rules
that consume VDPA specific traffic.

We do not deal with putting the prototypes in a header file since they
already exist in include/linux/mlx5/fs.h.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:45:02 -08:00
Meir Lichtinger
dd8595eabe net/mlx5: Update the list of the PCI supported devices
Add the upcoming BlueField-3 device ID.

Signed-off-by: Meir Lichtinger <meirl@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:48 -08:00
Parav Pandit
e5dfe6b57e net/mlx5: Avoid exposing driver internal command helpers
mlx5 command init and cleanup routines are internal to mlx5_core driver.
Hence, avoid exporting them and move their definition to mlx5_core
driver's internal file mlx5_core.h

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:48 -08:00
Muhammad Sammar
7da3ad6c26 net/mlx5: Add misc4 to mlx5_ifc_fte_match_param_bits
Add misc4 match params to enable matching on prog_sample_fields.

Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:47 -08:00
Muhammad Sammar
699d531f55 net/mlx5: Check dr mask size against mlx5_match_param size
This is to allow passing misc4 match param from userspace when
function like ib_flow_matcher_create is called.

Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:47 -08:00
Chris Mi
3873063088 net/mlx5: Add sampler destination type
The flow sampler object is a new destination type. Add a new member
for the flow destination.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:47 -08:00
Jakub Kicinski
594e31bceb Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
40GbE Intel Wired LAN Driver Updates 2020-11-24

This series contains updates to i40e and igbvf drivers.

Marek removes a redundant assignment for i40e.

Stefan Assmann corrects reporting of VF link speed for i40e.

Karen revises a couple of error messages to warnings for igbvf as they
could be misinterpreted as issues when they are not.

v2: Dropped PTP patch as it's being updated.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  igbvf: Refactor traces
  i40e: report correct VF link speed when link state is set to enable
  i40e: remove redundant assignment
====================

Link: https://lore.kernel.org/r/20201124165245.2844118-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 18:11:25 -08:00
Rohit Maheshwari
cbf3d60329 ch_ktls: lock is not freed
Currently lock gets freed only if timeout expires, but missed a
case when HW returns failure and goes for cleanup.

Fixes: efca3878a5 ("ch_ktls: Issue if connection offload fails")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Link: https://lore.kernel.org/r/20201125072626.10861-1-rohitm@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:44:42 -08:00
Vladimir Oltean
90cf87d16b enetc: Let the hardware auto-advance the taprio base-time of 0
The tc-taprio base time indicates the beginning of the tc-taprio
schedule, which is cyclic by definition (where the length of the cycle
in nanoseconds is called the cycle time). The base time is a 64-bit PTP
time in the TAI domain.

Logically, the base-time should be a future time. But that imposes some
restrictions to user space, which has to retrieve the current PTP time
from the NIC first, then calculate a base time that will still be larger
than the base time by the time the kernel driver programs this value
into the hardware. Actually ensuring that the programmed base time is in
the future is still a problem even if the kernel alone deals with this.

Luckily, the enetc hardware already advances a base-time that is in the
past into a congruent time in the immediate future, according to the
same formula that can be found in the software implementation of taprio
(in taprio_get_start_time):

	/* Schedule the start time for the beginning of the next
	 * cycle.
	 */
	n = div64_s64(ktime_sub_ns(now, base), cycle);
	*start = ktime_add_ns(base, (n + 1) * cycle);

There's only one problem: the driver doesn't let the hardware do that.
It interferes with the base-time passed from user space, by special-casing
the situation when the base-time is zero, and replaces that with the
current PTP time. This changes the intended effective base-time of the
schedule, which will in the end have a different phase offset than if
the base-time of 0.000000000 was to be advanced by an integer multiple
of the cycle-time.

Fixes: 34c6adf197 ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20201124220259.3027991-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 12:36:27 -08:00
Christian Eggers
37e9d0559a mlxsw: spectrum_ptp: use PTP wide message type definitions
Use recently introduced PTP wide defines instead of a driver internal
enumeration.

Signed-off-by: Christian Eggers <ceggers@arri.de>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Cc: Petr Machata <petrm@mellanox.com>
Cc: Jiri Pirko <jiri@nvidia.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 12:23:14 -08:00
Antonio Borneo
12a8fe56c0 net: stmmac: fix incorrect merge of patch upstream
Commit 7579262478 ("net: stmmac: add flexible PPS to dwmac
4.10a") was intended to modify the struct dwmac410_ops, but it got
somehow badly merged and modified the struct dwmac4_ops.

Revert the modification in struct dwmac4_ops and re-apply it
properly in struct dwmac410_ops.

Fixes: 7579262478 ("net: stmmac: add flexible PPS to dwmac 4.10a")
Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
Reported-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Link: https://lore.kernel.org/r/20201124223729.886992-1-antonio.borneo@st.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 11:23:40 -08:00
Lijun Pan
3ada288150 ibmvnic: enhance resetting status check during module exit
Based on the discussion with Sukadev Bhattiprolu and Dany Madden,
we believe that checking adapter->resetting bit is preferred
since RESETTING state flag is not as strict as resetting bit.
RESETTING state flag is removed since it is verbose now.

Fixes: 7d7195a026 ("ibmvnic: Do not process device remove during device reset")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:28:34 -08:00
Lijun Pan
0e435befae ibmvnic: fix NULL pointer dereference in ibmvic_reset_crq
crq->msgs could be NULL if the previous reset did not complete after
freeing crq->msgs. Check for NULL before dereferencing them.

Snippet of call trace:
...
ibmvnic 30000003 env3 (unregistering): Releasing sub-CRQ
ibmvnic 30000003 env3 (unregistering): Releasing CRQ
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc0000000000c1a30
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: ibmvnic(E-) rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag tun raw_diag inet_diag unix_diag bridge af_packet_diag netlink_diag stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ibmvnic]
CPU: 20 PID: 8426 Comm: kworker/20:0 Tainted: G            E     5.10.0-rc1+ #12
Workqueue: events __ibmvnic_reset [ibmvnic]
NIP:  c0000000000c1a30 LR: c008000001b00c18 CTR: 0000000000000400
REGS: c00000000d05b7a0 TRAP: 0380   Tainted: G            E      (5.10.0-rc1+)
MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44002480  XER: 20040000
CFAR: c0000000000c19ec IRQMASK: 0
GPR00: 0000000000000400 c00000000d05ba30 c008000001b17c00 0000000000000000
GPR04: 0000000000000000 0000000000000000 0000000000000000 00000000000001e2
GPR08: 000000000001f400 ffffffffffffd950 0000000000000000 c008000001b0b280
GPR12: c0000000000c19c8 c00000001ec72e00 c00000000019a778 c00000002647b440
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000006 0000000000000001 0000000000000003 0000000000000002
GPR24: 0000000000001000 c008000001b0d570 0000000000000005 c00000007ab5d550
GPR28: c00000007ab5c000 c000000032fcf848 c00000007ab5cc00 c000000032fcf800
NIP [c0000000000c1a30] memset+0x68/0x104
LR [c008000001b00c18] ibmvnic_reset_crq+0x70/0x110 [ibmvnic]
Call Trace:
[c00000000d05ba30] [0000000000000800] 0x800 (unreliable)
[c00000000d05bab0] [c008000001b0a930] do_reset.isra.40+0x224/0x634 [ibmvnic]
[c00000000d05bb80] [c008000001b08574] __ibmvnic_reset+0x17c/0x3c0 [ibmvnic]
[c00000000d05bc50] [c00000000018d9ac] process_one_work+0x2cc/0x800
[c00000000d05bd20] [c00000000018df58] worker_thread+0x78/0x520
[c00000000d05bdb0] [c00000000019a934] kthread+0x1c4/0x1d0
[c00000000d05be20] [c00000000000d5d0] ret_from_kernel_thread+0x5c/0x6c

Fixes: 032c5e8284 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:28:34 -08:00
Lijun Pan
a0faaa27c7 ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues
adapter->tx_scrq and adapter->rx_scrq could be NULL if the previous reset
did not complete after freeing sub crqs. Check for NULL before
dereferencing them.

Snippet of call trace:
ibmvnic 30000006 env6: Releasing sub-CRQ
ibmvnic 30000006 env6: Releasing CRQ
...
ibmvnic 30000006 env6: Got Control IP offload Response
ibmvnic 30000006 env6: Re-setting tx_scrq[0]
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc008000003dea7cc
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag tun bridge stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmvnic ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
CPU: 80 PID: 1856 Comm: kworker/80:2 Tainted: G        W         5.8.0+ #4
Workqueue: events __ibmvnic_reset [ibmvnic]
NIP:  c008000003dea7cc LR: c008000003dea7bc CTR: 0000000000000000
REGS: c0000007ef7db860 TRAP: 0380   Tainted: G        W          (5.8.0+)
MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28002422  XER: 0000000d
CFAR: c000000000bd9520 IRQMASK: 0
GPR00: c008000003dea7bc c0000007ef7dbaf0 c008000003df7400 c0000007fa26ec00
GPR04: c0000007fcd0d008 c0000007fcd96350 0000000000000027 c0000007fcd0d010
GPR08: 0000000000000023 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000002000 c00000001ec18e00 c0000000001982f8 c0000007bad6e840
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 fffffffffffffef7
GPR24: 0000000000000402 c0000007fa26f3a8 0000000000000003 c00000016f8ec048
GPR28: 0000000000000000 0000000000000000 0000000000000000 c0000007fa26ec00
NIP [c008000003dea7cc] ibmvnic_reset_init+0x15c/0x258 [ibmvnic]
LR [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic]
Call Trace:
[c0000007ef7dbaf0] [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic] (unreliable)
[c0000007ef7dbb80] [c008000003de8860] __ibmvnic_reset+0x408/0x970 [ibmvnic]
[c0000007ef7dbc50] [c00000000018b7cc] process_one_work+0x2cc/0x800
[c0000007ef7dbd20] [c00000000018bd78] worker_thread+0x78/0x520
[c0000007ef7dbdb0] [c0000000001984c4] kthread+0x1d4/0x1e0
[c0000007ef7dbe20] [c00000000000cea8] ret_from_kernel_thread+0x5c/0x74

Fixes: 57a49436f4 ("ibmvnic: Reset sub-crqs during driver reset")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:28:34 -08:00
Sven Van Asbroeck
470dfd808a lan743x: replace polling loop by wait_event_timeout()
The driver's ISR sends a 'software interrupt' event to the probe()
thread using the following method:
- probe(): write 0 to flag, enable s/w interrupt
- probe(): poll on flag, relax using usleep_range()
- ISR    : write 1 to flag

Replace with wake_up() / wait_event_timeout(). Besides being easier
to get right, this abstraction has better timing and memory
consistency properties.

Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # lan7430
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Link: https://lore.kernel.org/r/20201123191529.14908-2-TheSven73@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:14:04 -08:00
Sven Van Asbroeck
c31799bae8 lan743x: clean up software_isr function
For no apparent reason, this function reads the INT_STS register, and
checks if the software interrupt bit is set. These things have already
been carried out by this function's only caller.

Clean up by removing the redundant code.

Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # lan7430
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Link: https://lore.kernel.org/r/20201123191529.14908-1-TheSven73@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:14:04 -08:00
Shay Agroskin
1396d3148b net: ena: fix packet's addresses for rx_offset feature
This patch fixes two lines in which the rx_offset received by the device
wasn't taken into account:

- prefetch function:
	In our driver the copied data would reside in
	rx_info->page + rx_headroom + rx_offset

	so the prefetch function is changed accordingly.

- setting page_offset to zero for descriptors > 1:
	for every descriptor but the first, the rx_offset is zero. Hence
	the page_offset value should be set to rx_headroom.

	The previous implementation changed the value of rx_info after
	the descriptor was added to the SKB (essentially providing wrong
	page offset).

Fixes: 68f236df93 ("net: ena: add support for the rx offset feature")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:07:13 -08:00
Shay Agroskin
09323b3bca net: ena: set initial DMA width to avoid intel iommu issue
The ENA driver uses the readless mechanism, which uses DMA, to find
out what the DMA mask is supposed to be.

If DMA is used without setting the dma_mask first, it causes the
Intel IOMMU driver to think that ENA is a 32-bit device and therefore
disables IOMMU passthrough permanently.

This patch sets the dma_mask to be ENA_MAX_PHYS_ADDR_SIZE_BITS=48
before readless initialization in
ena_device_init()->ena_com_mmio_reg_read_request_init(),
which is large enough to workaround the intel_iommu issue.

DMA mask is set again to the correct value after it's received from the
device after readless is initialized.

The patch also changes the driver to use dma_set_mask_and_coherent()
function instead of the two pci_set_dma_mask() and
pci_set_consistent_dma_mask() ones. Both methods achieve the same
effect.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Mike Cui <mikecui@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:07:13 -08:00
Shay Agroskin
5b7022cf1d net: ena: handle bad request id in ena_netdev
After request id is checked in validate_rx_req_id() its value is still
used in the line
	rx_ring->free_ids[next_to_clean] =
					rx_ring->ena_bufs[i].req_id;
even if it was found to be out-of-bound for the array free_ids.

The patch moves the request id to an earlier stage in the napi routine and
makes sure its value isn't used if it's found out-of-bounds.

Fixes: 30623e1ed1 ("net: ena: avoid memory access violation by validating req_id properly")
Signed-off-by: Ido Segev <idose@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:07:13 -08:00
Lorenzo Bianconi
039fbc47f9 net: mvneta: alloc skb_shared_info on the mvneta_rx_swbm stack
Build skb_shared_info on mvneta_rx_swbm stack and sync it to xdp_buff
skb_shared_info area only on the last fragment. Leftover cache miss in
mvneta_swbm_rx_frame will be addressed introducing mb bit in
xdp_buff/xdp_frame struct

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 15:09:44 -08:00
Lorenzo Bianconi
eb33f11864 net: mvneta: move skb_shared_info in mvneta_xdp_put_buff caller
Pass skb_shared_info pointer from mvneta_xdp_put_buff caller. This is a
preliminary patch to reduce accesses to skb_shared_info area and reduce
cache misses.
Remove napi parameter in mvneta_xdp_put_buff signature since it is always
run in NAPI context

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 15:09:41 -08:00
Lorenzo Bianconi
05c748f7d0 net: mvneta: avoid unnecessary xdp_buff initialization
Explicitly initialize mandatory fields defining xdp_buff struct
in mvneta_rx_swbm routine

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 15:09:29 -08:00
Stefan Chulski
9a71baf719 net: mvpp2: divide fifo for dts-active ports only
Tx/Rx FIFO is a HW resource limited by total size, but shared
by all ports of same CP110 and impacting port-performance.
Do not divide the FIFO for ports which are not enabled in DTS,
so active ports could have more FIFO.
No change in FIFO allocation if all 3 ports on the communication
processor enabled in DTS.

The active port mapping should be done in probe before FIFO-init.

Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/1606154073-28267-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 15:05:04 -08:00
Ezequiel Garcia
078eb55cdf dpaa2-eth: Fix compile error due to missing devlink support
The dpaa2 driver depends on devlink, so it should select
NET_DEVLINK in order to fix compile errors, such as:

drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.o: in function `dpaa2_eth_rx_err':
dpaa2-eth.c:(.text+0x3cec): undefined reference to `devlink_trap_report'
drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-devlink.o: in function `dpaa2_eth_dl_info_get':
dpaa2-eth-devlink.c:(.text+0x160): undefined reference to `devlink_info_driver_name_put'

Fixes: ceeb03ad8e ("dpaa2-eth: add basic devlink support")
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20201123163553.1666476-1-ciorneiioana@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 14:57:55 -08:00
Colin Ian King
be419fcacf net: hns3: fix spelling mistake "memroy" -> "memory"
There are spelling mistakes in two dev_err messages. Fix them.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20201123103452.197708-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 13:20:13 -08:00
Ido Schimmel
37b50e556e mlxsw: spectrum_trap: Add blackhole_nexthop trap
Register with devlink the blackhole_nexthop trap so that mlxsw will be
able to report packets dropped due to a blackhole nexthop.

The internal trap identifier is "DISCARD_ROUTER3", which traps packets
dropped in the adjacency table.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 12:14:56 -08:00
Ido Schimmel
68e92ad855 mlxsw: spectrum_router: Add support for blackhole nexthops
Add support for blackhole nexthops by programming them to the adjacency
table with a discard action.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 12:14:56 -08:00
Ido Schimmel
18c4b79d28 mlxsw: spectrum_router: Resolve RIF from nexthop struct instead of neighbour
The two are the same, but for blackhole nexthops we will not have an
associated neighbour struct, so resolve the RIF from the nexthop struct
itself instead.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 12:14:56 -08:00
Ido Schimmel
919f6aaa3a mlxsw: spectrum_router: Use loopback RIF for unresolved nexthops
Now that the driver creates a loopback RIF during its initialization, it
can be used to program the adjacency entries for unresolved nexthops
instead of other RIFs. The loopback RIF is guaranteed to exist for the
entire life time of the driver, unlike other RIFs that come and go.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 12:14:55 -08:00
Ido Schimmel
52d45575ec mlxsw: spectrum_router: Use different trap identifier for unresolved nexthops
Unresolved nexthops are currently written to the adjacency table with a
discard action. Packets hitting such entries are trapped to the CPU via
the 'DISCARD_ROUTER3' trap which can be enabled or disabled on demand,
but is always enabled in order to ensure the kernel can resolve the
unresolved neighbours.

This trap will be needed for blackhole nexthops support. Therefore, move
unresolved nexthops to explicitly program the adjacency entries with a
trap action and a different trap identifier, 'RTR_EGRESS0'.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 12:14:55 -08:00
Ido Schimmel
07c78536ef mlxsw: spectrum_router: Create loopback RIF during initialization
Up until now RIFs (router interfaces) were created on demand (e.g.,
when an IP address was added to a netdev). However, sometimes the device
needs to be provided with a RIF when one might not be available.

For example, adjacency entries that drop packets need to be programmed
with an egress RIF despite the RIF not being used to forward packets.

Create such a RIF during initialization so that it could be used later
on to support blackhole nexthops.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 12:14:55 -08:00
Lincoln Ramsay
9bd2702d29 aquantia: Remove the build_skb path
When performing IPv6 forwarding, there is an expectation that SKBs
will have some headroom. When forwarding a packet from the aquantia
driver, this does not always happen, triggering a kernel warning.

aq_ring.c has this code (edited slightly for brevity):

if (buff->is_eop && buff->len <= AQ_CFG_RX_FRAME_MAX - AQ_SKB_ALIGN) {
    skb = build_skb(aq_buf_vaddr(&buff->rxdata), AQ_CFG_RX_FRAME_MAX);
} else {
    skb = napi_alloc_skb(napi, AQ_CFG_RX_HDR_SIZE);

There is a significant difference between the SKB produced by these
2 code paths. When napi_alloc_skb creates an SKB, there is a certain
amount of headroom reserved. However, this is not done in the
build_skb codepath.

As the hardware buffer that build_skb is built around does not
handle the presence of the SKB header, this code path is being
removed and the napi_alloc_skb path will always be used. This code
path does have to copy the packet header into the SKB, but it adds
the packet data as a frag.

Fixes: 018423e90b ("net: ethernet: aquantia: Add ring support code")
Signed-off-by: Lincoln Ramsay <lincoln.ramsay@opengear.com>
Link: https://lore.kernel.org/r/MWHPR1001MB23184F3EAFA413E0D1910EC9E8FC0@MWHPR1001MB2318.namprd10.prod.outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 10:59:17 -08:00
Vincent Whitchurch
d5a05e69ac net: stmmac: Use hrtimer for TX coalescing
This driver uses a normal timer for TX coalescing, which means that the
with the default tx-usecs of 1000 microseconds the cleanups actually
happen 10 ms or more later with HZ=100.  This leads to very low
througput with TCP when bridged to a slow link such as a 4G modem.  Fix
this by using an hrtimer instead.

On my ARM platform with HZ=100 and the default TX coalescing settings
(tx-frames 25 tx-usecs 1000), with "tc qdisc add dev eth0 root netem
delay 60ms 40ms rate 50Mbit" run on the server, netperf's TCP_STREAM
improves from ~5.5 Mbps to ~100 Mbps.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Link: https://lore.kernel.org/r/20201120150208.6838-1-vincent.whitchurch@axis.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 08:47:44 -08:00
Karen Sornek
24453a8428 igbvf: Refactor traces
Refactoring "PF still resetting" and changing "Failed
 to add vlan id" to "Vlan id is not added"
messages because previous version looked like a bug
- it informed about changes that worked as
designed but might confuse users

Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-11-24 08:02:24 -08:00
Stefan Assmann
6ec12e1e94 i40e: report correct VF link speed when link state is set to enable
When the virtual link state was set to "enable" ethtool would report
link speed as 40000Mb/s regardless of the underlying device.
Report the correct link speed.

Example from a XXV710 NIC.
Before:
$ ip link set ens3f0 vf 0 state auto
$  ethtool enp8s2 | grep Speed
        Speed: 25000Mb/s
$ ip link set ens3f0 vf 0 state enable
$ ethtool enp8s2 | grep Speed
        Speed: 40000Mb/s
After:
$ ip link set ens3f0 vf 0 state auto
$  ethtool enp8s2 | grep Speed
        Speed: 25000Mb/s
$ ip link set ens3f0 vf 0 state enable
$ ethtool enp8s2 | grep Speed
        Speed: 25000Mb/s

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-11-24 08:02:24 -08:00
Marek Majtyka
088d5360d0 i40e: remove redundant assignment
Remove a redundant assignment of the software ring pointer in the i40e
driver. The variable is assigned twice with no use in between, so just
get rid of the first occurrence.

Fixes: 3b4f0b66c2 ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL")
Signed-off-by: Marek Majtyka <marekx.majtyka@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-11-24 08:02:24 -08:00
Peter Zijlstra
545b8c8df4 smp: Cleanup smp_call_function*()
Get rid of the __call_single_node union and cleanup the API a little
to avoid external code relying on the structure layout as much.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
2020-11-24 16:47:49 +01:00
Jakub Kicinski
cc69837fca net: don't include ethtool.h from netdevice.h
linux/netdevice.h is included in very many places, touching any
of its dependecies causes large incremental builds.

Drop the linux/ethtool.h include, linux/netdevice.h just needs
a forward declaration of struct ethtool_ops.

Fix all the places which made use of this implicit include.

Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20201120225052.1427503-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 17:27:04 -08:00
Christophe JAILLET
7fd6372e27 net: pch_gbe: Use 'dma_free_coherent()' to undo 'dma_alloc_coherent()'
Memory allocation are done with 'dma_alloc_coherent()'. Be consistent
and use 'dma_free_coherent()' to free the corresponding memory.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/20201121090330.1332543-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 17:01:48 -08:00
Christophe JAILLET
8ff39301ef net: pch_gbe: Use dma_set_mask_and_coherent to simplify code
'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by
an equivalent 'dma_set_mask_and_coherent()' which is much less verbose.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/20201121090302.1332491-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 17:01:48 -08:00
Sylwester Dziedziuch
2980cbd4dc i40e: Fix removing driver while bare-metal VFs pass traffic
Prevent VFs from resetting when PF driver is being unloaded:
- introduce new pf state: __I40E_VF_RESETS_DISABLED;
- check if pf state has __I40E_VF_RESETS_DISABLED state set,
  if so, disable any further VFLR event notifications;
- when i40e_remove (rmmod i40e) is called, disable any resets on
  the VFs;

Previously if there were bare-metal VFs passing traffic and PF
driver was removed, there was a possibility of VFs triggering a Tx
timeout right before iavf_remove. This was causing iavf_close to
not be called because there is a check in the beginning of  iavf_remove
that bails out early if adapter->state < IAVF_DOWN_PENDING. This
makes it so some resources do not get cleaned up.

Fixes: 6a9ddb36ee ("i40e: disable IOV before freeing resources")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20201120180640.3654474-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 16:52:59 -08:00
Paul Moore
3df98d7921 lsm,selinux: pass flowi_common instead of flowi to the LSM hooks
As pointed out by Herbert in a recent related patch, the LSM hooks do
not have the necessary address family information to use the flowi
struct safely.  As none of the LSMs currently use any of the protocol
specific flowi information, replace the flowi pointers with pointers
to the address family independent flowi_common struct.

Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-11-23 18:36:21 -05:00
Christian Eggers
6b6817c5d8 dpaa2-eth: use new PTP_MSGTYPE_* define(s)
Remove usage of magic numbers.

Signed-off-by: Christian Eggers <ceggers@arri.de>
Cc: Ioana Ciornei <ioana.ciornei@nxp.com>
Cc: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Cc: Yangbo Lu <yangbo.lu@nxp.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 13:43:31 -08:00
George Cherian
f9e425e99b octeontx2-af: Add support for RSS hashing based on Transport protocol field
Add support to choose RSS flow key algorithm with IPv4 transport protocol
field included in hashing input data. This will be enabled by default.
There-by enabling 3/5 tuple hash

Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: George Cherian <george.cherian@marvell.com>
Link: https://lore.kernel.org/r/20201120093906.2873616-1-george.cherian@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-21 16:05:22 -08:00
Lijun Pan
855a631a4c ibmvnic: skip tx timeout reset while in resetting
Sometimes it takes longer than 5 seconds (watchdog timeout) to complete
failover, migration, and other resets. In stead of scheduling another
timeout reset, we wait for the current one to complete.

Suggested-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-21 15:30:47 -08:00
Lijun Pan
98025bce3a ibmvnic: notify peers when failover and migration happen
Commit 61d3e1d9bc ("ibmvnic: Remove netdev notify for failover resets")
excluded the failover case for notify call because it said
netdev_notify_peers() can cause network traffic to stall or halt.
Current testing does not show network traffic stall
or halt because of the notify call for failover event.
netdev_notify_peers may be used when a device wants to inform the
rest of the network about some sort of a reconfiguration
such as failover or migration.

It is unnecessary to call that in other events like
FATAL, NON_FATAL, CHANGE_PARAM, and TIMEOUT resets
since in those scenarios the hardware does not change.
If the driver must do a hard reset, it is necessary to notify peers.

Fixes: 61d3e1d9bc ("ibmvnic: Remove netdev notify for failover resets")
Suggested-by: Brian King <brking@linux.vnet.ibm.com>
Suggested-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-21 15:30:47 -08:00
Lijun Pan
8393597579 ibmvnic: fix call_netdevice_notifiers in do_reset
When netdev_notify_peers was substituted in
commit 986103e792 ("net/ibmvnic: Fix RTNL deadlock during device reset"),
call_netdevice_notifiers(NETDEV_RESEND_IGMP, dev) was missed.
Fix it now.

Fixes: 986103e792 ("net/ibmvnic: Fix RTNL deadlock during device reset")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-21 15:30:47 -08:00
Yonglong Liu
c331ecf1af net: hns3: adds debugfs to dump more info of shaping parameters
Adds debugfs to dump new shaping parameters: rate and flag.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-21 14:33:46 -08:00
Yonglong Liu
e364ad303f net: hns3: add support to utilize the firmware calculated shaping parameters
Since the calculation of the driver is fixed, if the number of
queue or clock changed, the calculated result may be inaccurate.

So for compatible and maintainable, add a new flag to tell the
firmware to calculate the shaping parameters with the specified
rate.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-21 14:33:46 -08:00
Yufeng Mo
3a6863e4e8 net: hns3: add support for pf querying new interrupt resources
For HNAE3_DEVICE_VERSION_V3, a maximum of 1281 interrupt
resources are supported. To utilize these new resources,
extend the corresponding field or variable to 16bit type,
and remove the restriction of NIC client that only use a
maximum of 65 interrupt vectors. In addition, the I/O address
of the extended interrupt resources are different, so an extra
handler is needed.

Currently, the total number of interrupts is the sum of RoCE's
number and RoCE's offset (RoCE is in front of NIC), since
the number of both NIC and RoCE are same. For readability,
rewrite the corresponding field of the command, rename the
RoCE's offset field as the number of NIC interrupts, then
the total number of interrupts is sum of the number of RoCE
and NIC, and replace vport->back with hdev in
hclge_init_roce_base_info() for simplifying the code.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-21 14:33:46 -08:00
Huazhong Tan
30ae7f8a6a net: hns3: add support for mapping device memory
For device who has device memory accessed through the PCI BAR4,
IO descriptor push of NIC and direct WQE(Work Queue Element) of
RoCE will use this device memory, so add support for mapping
this device memory, and add this info to the RoCE client whose
new feature needs.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-21 14:33:46 -08:00
Yonglong Liu
9a5ef4aa54 net: hns3: add support for 1280 queues
For DEVICE_VERSION_V1/2, there are total 1024 queues and
queue sets. For DEVICE_VERSION_V3, it increases to 1280,
and can be assigned to one pf, so remove the limitation
of 1024.

To keep compatible with DEVICE_VERSION_V1/2 and old driver
version, the queue number is split into two part:
tqp_num(range 0~1023) and ext_tqp_num(range 1024~1279).

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-21 14:33:46 -08:00
Tom Seewald
659fbdcf2f cxgb4: Fix build failure when CONFIG_TLS=m
After commit 9d2e5e9eeb ("cxgb4/ch_ktls: decrypted bit is not enough")
whenever CONFIG_TLS=m and CONFIG_CHELSIO_T4=y, the following build
failure occurs:

ld: drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.o: in function
`cxgb_select_queue':
cxgb4_main.c:(.text+0x2dac): undefined reference to `tls_validate_xmit_skb'

Fix this by ensuring that if TLS is set to be a module, CHELSIO_T4 will
also be compiled as a module. As otherwise the cxgb4 driver will not be
able to access TLS' symbols.

Fixes: 9d2e5e9eeb ("cxgb4/ch_ktls: decrypted bit is not enough")
Signed-off-by: Tom Seewald <tseewald@gmail.com>
Link: https://lore.kernel.org/r/20201120192528.615-1-tseewald@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-21 13:10:18 -08:00
Dwip N. Banerjee
41ed0a00ff ibmvnic: Do not replenish RX buffers after every polling loop
Reduce the amount of time spent replenishing RX buffers by only doing
so once available buffers has fallen under a certain threshold, in this
case half of the total number of buffers, or if the polling loop exits
before the packets processed is less than its budget. Non-exhaustion of
NAPI budget implies lower incoming packet pressure, allowing the leeway
to refill the buffers in preparation for any impending burst.

Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 19:50:41 -08:00
Dwip N. Banerjee
e552aa313b ibmvnic: Use netdev_alloc_skb instead of alloc_skb to replenish RX buffers
Take advantage of the additional optimizations in netdev_alloc_skb when
allocating socket buffers to be used for packet reception.

Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Acked-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 19:50:34 -08:00
Dwip N. Banerjee
ec20f36bb4 ibmvnic: Correctly re-enable interrupts in NAPI polling routine
If the current NAPI polling loop exits without completing it's
budget, only re-enable interrupts if there are no entries remaining
in the queue and napi_complete_done is successful. If there are entries
remaining on the queue that were missed, restart the polling loop.

Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 19:50:34 -08:00
Dwip N. Banerjee
9a87c3fca2 ibmvnic: Ensure that device queue memory is cache-line aligned
PCI bus slowdowns were observed on IBM VNIC devices as a result
of partial cache line writes and non-cache aligned full cache line writes.
Ensure that packet data buffers are cache-line aligned to avoid these
slowdowns.

Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 19:50:34 -08:00
Thomas Falcon
8ed589f383 ibmvnic: Remove send_subcrq function
It is not longer used, so remove it.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Acked-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 19:50:34 -08:00
Thomas Falcon
c62aa3734f ibmvnic: Clean up TX code and TX buffer data structure
Remove unused and superfluous code and members in
existing TX implementation and data structures.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 19:50:34 -08:00
Thomas Falcon
0d97338818 ibmvnic: Introduce xmit_more support using batched subCRQ hcalls
Include support for the xmit_more feature utilizing the
H_SEND_SUB_CRQ_INDIRECT hypervisor call which allows the sending
of multiple subordinate Command Response Queue descriptors in one
hypervisor call via a DMA-mapped buffer. This update reduces hypervisor
calls and thus hypervisor call overhead per TX descriptor.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 19:50:33 -08:00
Thomas Falcon
4f0b6812e9 ibmvnic: Introduce batched RX buffer descriptor transmission
Utilize the H_SEND_SUB_CRQ_INDIRECT hypervisor call to send
multiple RX buffer descriptors to the device in one hypervisor
call operation. This change will reduce the number of hypervisor
calls and thus hypervisor call overhead needed to transmit
RX buffer descriptors to the device.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 19:50:33 -08:00
Thomas Falcon
f019fb6392 ibmvnic: Introduce indirect subordinate Command Response Queue buffer
This patch introduces the infrastructure to send batched subordinate
Command Response Queue descriptors, which are used by the ibmvnic
driver to send TX frame and RX buffer descriptors.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Acked-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 19:50:33 -08:00
Heiner Kallweit
bf7b0bf68e r8169: use dev_err_probe in rtl_get_ether_clk
Tiny improvement, let dev_err_probe() deal with EPROBE_DEFER.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/b0c4ebcf-2047-e933-b890-8a20e4bdb19f@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 18:36:13 -08:00
Heiner Kallweit
94d8a98e62 r8169: reduce number of workaround doorbell rings
Some chip versions have a hw bug resulting in lost door bell rings.
To work around this the doorbell is also rung whenever we still have
tx descriptors in flight after having cleaned up tx descriptors.
These PCI(e) writes come at a cost, therefore let's reduce the number
of extra doorbell rings.
If skb is NULL then this means:
- last cleaned-up descriptor belongs to a skb with at least one fragment
  and last fragment isn't marked as sent yet
- hw is in progress sending the skb, therefore no extra doorbell ring
  is needed for this skb
- once last fragment is marked as transmitted hw will trigger
  a tx done interrupt and we come here again (with skb != NULL)
  and ring the doorbell if needed
Therefore skip the workaround doorbell ring if skb is NULL.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/0a15a83c-aecf-ab51-8071-b29d9dcd529a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 18:33:55 -08:00
Ioana Ciornei
d2624e70a2 dpaa2-eth: select XGMAC_MDIO for MDIO bus support
Explicitly enable the FSL_XGMAC_MDIO Kconfig option in order to have
MDIO access to internal and external PHYs.

Fixes: 7194792308 ("dpaa2-eth: add MAC/PHY support through phylink")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20201119145106.712761-1-ciorneiioana@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 15:21:54 -08:00
Ido Schimmel
cdd6cfc54c mlxsw: spectrum_router: Allow programming routes with nexthop objects
Now that the driver supports nexthop objects, the check is no longer
necessary. Remove it.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 15:20:20 -08:00
Ido Schimmel
c25db3a77f mlxsw: spectrum_router: Enable resolution of nexthop groups from nexthop objects
If the FIB info (i.e, 'struct fib_info', 'struct fib6_info') uses a
nexthop object, then use the object's identifier to resolve the nexthop
group.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 15:20:20 -08:00
Ido Schimmel
2a014b200b mlxsw: spectrum_router: Add support for nexthop objects
Register a listener to the nexthop notification chain and parse notified
nexthop objects into the existing mlxsw nexthop data structures.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 15:20:20 -08:00
Colin Ian King
7648398017 octeontx2-af: Fix access of iter->entry after iter object has been kfree'd
The call to pc_delete_flow can kfree the iter object, so the following
dev_err message that accesses iter->entry can accessmemory that has
just been kfree'd.  Fix this by adding a temporary variable 'entry'
that has a copy of iter->entry and also use this when indexing into
the array mcam->entry2target_pffunc[]. Also print the unsigned value
using the %u format specifier rather than %d.

Addresses-Coverity: ("Read from pointer after free")
Fixes: 55307fcb92 ("octeontx2-af: Add mbox messages to install and delete MCAM rules")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20201118143803.463297-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:07:57 -08:00
Colin Ian King
dd6028a3cb octeontx2-af: Fix return of uninitialized variable err
Currently the variable err may be uninitialized if several of the if
statements are not executed in function nix_tx_vtag_decfg and a garbage
value in err is returned.  Fix this by initialized ret at the start of
the function.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 9a946def26 ("octeontx2-af: Modify nix_vtag_cfg mailbox to support TX VTAG entries")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20201118132502.461098-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:06:10 -08:00
Colin Ian King
583b273dea octeontx2-pf: Fix unintentional sign extension issue
The shifting of the u16 result from ntohs(proto) by 16 bits to the
left will be promoted to a 32 bit signed int and then sign-extended
to a u64.  In the event that the top bit of the return from ntohs(proto)
is set then all then all the upper 32 bits of a 64 bit long end up as
also being set because of the sign-extension. Fix this by casting to
a u64 long before the shift.

Addresses-Coverity: ("Unintended sign extension")
Fixes: f0c2982aaf ("octeontx2-pf: Add support for SR-IOV management function")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20201118130520.460365-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:05:16 -08:00
Raju Rangoju
bff453921a cxgb4: fix the panic caused by non smac rewrite
SMT entry is allocated only when loopback Source MAC
rewriting is requested. Accessing SMT entry for non
smac rewrite cases results in kernel panic.

Fix the panic caused by non smac rewrite

Fixes: 937d842058 ("cxgb4: set up filter action after rewrites")
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Link: https://lore.kernel.org/r/20201118143213.13319-1-rajur@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:03:37 -08:00
Srujana Challa
76638a2e58 octeontx2-af: add debugfs entries for CPT block
Add entries to debugfs at /sys/kernel/debug/octeontx2/cpt.

cpt_pc: dump cpt performance HW registers.
Usage:
cat /sys/kernel/debug/octeontx2/cpt/cpt_pc

cpt_ae_sts: show cpt asymmetric engines current state
Usage:
cat /sys/kernel/debug/octeontx2/cpt/cpt_ae_sts

cpt_se_sts: show cpt symmetric engines current state
Usage:
cat /sys/kernel/debug/octeontx2/cpt/cpt_se_sts

cpt_engines_info: dump cpt engine control registers.
Usage:
cat /sys/kernel/debug/octeontx2/cpt/cpt_engines_info

cpt_lfs_info: dump cpt lfs control registers.
Usage:
cat /sys/kernel/debug/octeontx2/cpt/cpt_lfs_info

cpt_err_info: dump cpt error registers.
Usage:
cat /sys/kernel/debug/octeontx2/cpt/cpt_err_info

Signed-off-by: Suheil Chandran <schandran@marvell.com>
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:01:13 -08:00
Srujana Challa
ae454086e3 octeontx2-af: add mailbox interface for CPT
On OcteonTX2 SoC, the admin function (AF) is the only one with all
priviliges to configure HW and alloc resources, PFs and it's VFs
have to request AF via mailbox for all their needs. This patch adds
a mailbox interface for CPT PFs and VFs to allocate resources
for cryptography. It also adds hardware CPT AF register defines.

Signed-off-by: Suheil Chandran <schandran@marvell.com>
Signed-off-by: Lukasz Bartosik <lbartosik@marvell.com>
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:01:13 -08:00
Srujana Challa
956fb85218 octeontx2-pf: move lmt flush to include/linux/soc
On OcteonTX2 platform CPT instruction enqueue and NIX
packet send are only possible via LMTST operations which
uses LDEOR instruction. This patch moves lmt flush
function from OcteonTX2 nic driver to include/linux/soc
since it will be used by OcteonTX2 CPT and NIC driver for
LMTST.

Signed-off-by: Suheil Chandran <schandran@marvell.com>
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 11:01:13 -08:00
Tariq Toukan
1a0058cf0c net/mlx4_en: Remove unused performance counters
Performance analysis counters are maintained under the MLX4_EN_PERF_STAT
definition, which is never set.
Clean them up, with all related structures and logic.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Link: https://lore.kernel.org/r/20201118103427.4314-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 10:59:39 -08:00
Michael Chan
c54bc3ced5 bnxt_en: Release PCI regions when DMA mask setup fails during probe.
Jump to init_err_release to cleanup.  bnxt_unmap_bars() will also be
called but it will do nothing if the BARs are not mapped yet.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1605858271-8209-1-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 10:09:09 -08:00
Lorenzo Bianconi
c3bc2adb05 net: netsec: add xdp tx return bulking support
Convert netsec driver to xdp_return_frame_bulk APIs.
Rely on xdp_return_frame_rx_napi for XDP_TX in order to try to recycle
the page in the "in-irq" page_pool cache.

Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/01487b8f5167d62649339469cdd0c6d8df885902.1605605531.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 09:57:33 -08:00
Claudiu Manoil
0dfd294c92 enetc: Fix endianness issues for enetc_qos
Currently the control buffer descriptor (cbd) fields have endianness
restrictions while the commands passed into the control buffers
don't (with one exception). This patch fixes offending code,
by adding endianness accessors for cbd fields and removing the
unnecessary ones in case of data buffer fields. Currently there's
no need to convert all commands to little endian format, the patch
only focuses on fixing current endianness issues reported by sparse.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 22:05:42 -08:00
Claudiu Manoil
d548d3930a enetc: Fix endianness issues for enetc_ethtool
These particular fields are specified in the H/W reference
manual as having network byte order format, so enforce big
endian annotation for them and clear the related sparse
warnings in the process.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 22:05:34 -08:00
Zhang Changzhong
3383176efc bnxt_en: fix error return code in bnxt_init_board()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Link: https://lore.kernel.org/r/1605792621-6268-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 21:49:01 -08:00
Zhang Changzhong
b5f796b62c bnxt_en: fix error return code in bnxt_init_one()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: c213eae8d3 ("bnxt_en: Improve VF/PF link change logic.")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Link: https://lore.kernel.org/r/1605701851-20270-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 21:46:30 -08:00
Jacob Keller
52cc5f3a16 devlink: move flash end and begin to core devlink
When performing a flash update via devlink, device drivers may inform
user space of status updates via
devlink_flash_update_(begin|end|timeout|status)_notify functions.

It is expected that drivers do not send any status notifications unless
they send a begin and end message. If a driver sends a status
notification without sending the appropriate end notification upon
finishing (regardless of success or failure), the current implementation
of the devlink userspace program can get stuck endlessly waiting for the
end notification that will never come.

The current ice driver implementation may send such a status message
without the appropriate end notification in rare cases.

Fixing the ice driver is relatively simple: we just need to send the
begin_notify at the start of the function and always send an end_notify
no matter how the function exits.

Rather than assuming driver authors will always get this right in the
future, lets just fix the API so that it is not possible to get wrong.
Make devlink_flash_update_begin_notify and
devlink_flash_update_end_notify static, and call them in devlink.c core
code. Always send the begin_notify just before calling the driver's
flash_update routine. Always send the end_notify just after the routine
returns regardless of success or failure.

Doing this makes the status notification easier to use from the driver,
as it no longer needs to worry about catching failures and cleaning up
by calling devlink_flash_update_end_notify. It is now no longer possible
to do the wrong thing in this regard. We also save a couple of lines of
code in each driver.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 21:41:02 -08:00
Jacob Keller
b44cfd4f5b devlink: move request_firmware out of driver
All drivers which implement the devlink flash update support, with the
exception of netdevsim, use either request_firmware or
request_firmware_direct to locate the firmware file. Rather than having
each driver do this separately as part of its .flash_update
implementation, perform the request_firmware within net/core/devlink.c

Replace the file_name parameter in the struct devlink_flash_update_params
with a pointer to the fw object.

Use request_firmware rather than request_firmware_direct. Although most
Linux distributions today do not have the fallback mechanism
implemented, only about half the drivers used the _direct request, as
compared to the generic request_firmware. In the event that
a distribution does support the fallback mechanism, the devlink flash
update ought to be able to use it to provide the firmware contents. For
distributions which do not support the fallback userspace mechanism,
there should be essentially no difference between request_firmware and
request_firmware_direct.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 21:40:57 -08:00
Eric Biggers
a24d22b225 crypto: sha - split sha.h into sha1.h and sha2.h
Currently <crypto/sha.h> contains declarations for both SHA-1 and SHA-2,
and <crypto/sha3.h> contains declarations for SHA-3.

This organization is inconsistent, but more importantly SHA-1 is no
longer considered to be cryptographically secure.  So to the extent
possible, SHA-1 shouldn't be grouped together with any of the other SHA
versions, and usage of it should be phased out.

Therefore, split <crypto/sha.h> into two headers <crypto/sha1.h> and
<crypto/sha2.h>, and make everyone explicitly specify whether they want
the declarations for SHA-1, SHA-2, or both.

This avoids making the SHA-1 declarations visible to files that don't
want anything to do with SHA-1.  It also prepares for potentially moving
sha1.h into a new insecure/ or dangerous/ directory.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-11-20 14:45:33 +11:00
Jakub Kicinski
56495a2442 Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 19:08:46 -08:00
Aya Levin
6d9c8d15af net/mlx4_core: Fix init_hca fields offset
Slave function read the following capabilities from the wrong offset:
1. log_mc_entry_sz
2. fs_log_entry_sz
3. log_mc_hash_sz

Fix that by adjusting these capabilities offset to match firmware
layout.

Due to the wrong offset read, the following issues might occur:
1+2. Negative value reported at max_mcast_qp_attach.
3. Driver to init FW with multicast hash size of zero.

Fixes: a40ded6043 ("net/mlx4_core: Add masking for a few queries on HCA caps")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20201118081922.553-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 17:46:20 -08:00
Heiner Kallweit
bd4bdeb4f2 r8169: remove not needed check in rtl8169_start_xmit
In rtl_tx() the released descriptors are zero'ed by
rtl8169_unmap_tx_skb(). And in the beginning of rtl8169_start_xmit()
we check that enough descriptors are free, therefore there's no way
the DescOwn bit can be set here.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/6965d665-6c50-90c5-70e6-0bb335d4ea47@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 17:26:31 -08:00
Jakub Kicinski
f93e8497a9 mlx5-fixes-2020-11-17
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl+0KZ4ACgkQSD+KveBX
 +j5DDAgAip1LGfOUq7RISh1OWQ9zLl2KT/mmbdSioObGgjKunU4lMGZAPrLB0bpe
 RH1RodslY+1bepcz7VQ/QxbL0EU605SZn08xLZIQAYYXT0Sar+hhg7h7vhD0iSA1
 dwDF/XLQYHf8snc3x/tu6/zp9BzfzRfuiieYAbC1ri97Iz9uoYO+QcRI7OyqgXWf
 LmK9/s6WzgaC0DKg8XXEQGp7F1MbZfJph6tThZnF62NY9Rrkut6AEah85p1QaqF7
 9WLjqOtJwo893fzbN+WCXJTYo2UX9oU18nUSvSoYid2LEofiwY3PcT8R0gKSPZto
 cVcjm1oGsotbonBnPlR2eMZq8LdPDQ==
 =MKi5
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2020-11-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2020-11-17

This series introduces some fixes to mlx5 driver.

* tag 'mlx5-fixes-2020-11-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: fix error return code in mlx5e_tc_nic_init()
  net/mlx5: E-Switch, Fail mlx5_esw_modify_vport_rate if qos disabled
  net/mlx5: Disable QoS when min_rates on all VFs are zero
  net/mlx5: Clear bw_share upon VF disable
  net/mlx5: Add handling of port type in rule deletion
  net/mlx5e: Fix check if netdev is bond slave
  net/mlx5e: Fix IPsec packet drop by mlx5e_tc_update_skb
  net/mlx5e: Set IPsec WAs only in IP's non checksum partial case.
  net/mlx5e: Fix refcount leak on kTLS RX resync
====================

Link: https://lore.kernel.org/r/20201117195702.386113-1-saeedm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 17:17:33 -08:00
Ido Schimmel
e3ddfb45ba mlxsw: spectrum_router: Allow returning errors from mlxsw_sp_nexthop_group_refresh()
The function is responsible for allocating the adjacency entries used by
the nexthop group and populating them with the adjacency information
such as egress RIF and MAC address.

Allow the function to return an error when it encounters a problem and
have the relevant call sites check it.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:51:18 -08:00
Ido Schimmel
2efca2bfba mlxsw: spectrum_router: Add an indication if a nexthop group can be destroyed
Currently, a nexthop group is destroyed when the last FIB entry is
detached from it.

When nexthop objects are supported, this can no longer be the case, as
the group is a separate object whose lifetime is managed by user space.

Add an indication if a nexthop group can be destroyed and always set it
to true for the existing IPv4 and IPv6 nexthop groups.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:51:18 -08:00
Ido Schimmel
a9a711a3f7 mlxsw: spectrum_router: Only clear offload indication from valid IPv6 FIB info
When the IPv6 FIB info has a nexthop object, the nexthop offload
indication is set on the nexthop object and not on the FIB info itself.

Therefore, do not try to clear the offload indication from the FIB info
when it has a nexthop object.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:51:18 -08:00
Ido Schimmel
5b9954e1e7 mlxsw: spectrum_router: Re-order mlxsw_sp_nexthop6_group_get()
Attach the FIB entry to the nexthop group after setting the offload flag
on the IPv6 FIB info (i.e., 'struct fib6_info'). The second operation is
not needed when the nexthop group is a nexthop object. This will allow
us to have a common exit path from the function, regardless of the
nexthop group's type.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:51:18 -08:00
Ido Schimmel
c0351b7c25 mlxsw: spectrum_router: Set FIB entry's type based on nexthop group
The previous patch associated a nexthop group with the FIB entry before
the entry's type is determined.

Make use of the nexthop group when determining the entry's type instead
of relying on helpers that assume that the nexthop info is not a nexthop
object (i.e., 'struct nexthop').

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:51:18 -08:00
Ido Schimmel
5c9a3b2451 mlxsw: spectrum_router: Set FIB entry's type after creating nexthop group
Each FIB entry has a type (e.g., remote, local) that determines how the
entry is programmed to the device. In order to determine if the entry is
local (directly connected) or remote (has a gateway) the relevant FIB
info structures (e.g., 'struct fib_info') are checked.

When entries that use nexthop objects are supported, these checks will
need to be changed to take into account 'struct nexthop'.

Instead, first associate the entry with a nexthop group so that the next
patch could determine the entry's type based on the associated nexthop
group's type.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:51:18 -08:00
Ido Schimmel
c68e248d53 mlxsw: spectrum_router: Pass ifindex to mlxsw_sp_ipip_entry_find_by_decap()
The sole caller of the function will soon only have the ifindex
available, instead of the pointer itself.

Therefore, change the function to take the ifindex as input and have it
get the pointer.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:51:18 -08:00
Ido Schimmel
ff8a24182a mlxsw: spectrum_router: Set ifindex for IPv4 nexthops
The ifindex of the nexthop device was never set for IPv4 nexthops,
unlike IPv6 nexthops. This went unnoticed since only IPv6 nexthops use
it.

Set the ifindex for IPv4 nexthops in order to be consistent with IPv6
and also because it will be used by a later patch.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:51:18 -08:00
Ido Schimmel
fbf805bf1f mlxsw: spectrum_router: Fix wrong kfree() in error path
The function allocates 'nhgi', not 'nh_grp', so it needs to free the
former in its error path.

Fixes: 7f7a417e6a ("mlxsw: spectrum_router: Split nexthop group configuration to a different struct")
Addresses-Coverity: ("Memory - corruptions  (USE_AFTER_FREE)")
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:51:18 -08:00
Ido Schimmel
1f492eab67 mlxsw: core: Use variable timeout for EMAD retries
The driver sends Ethernet Management Datagram (EMAD) packets to the
device for configuration purposes and waits for up to 200ms for a reply.
A request is retried up to 5 times.

When the system is under heavy load, replies are not always processed in
time and EMAD transactions fail.

Make the process more robust to such delays by using exponential
backoff. First wait for up to 200ms, then retransmit and wait for up to
400ms and so on.

Fixes: caf7297e7a ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Reported-by: Denis Yulevich <denisyu@nvidia.com>
Tested-by: Denis Yulevich <denisyu@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:07:00 -08:00
Ido Schimmel
fb738b99ef mlxsw: Fix firmware flashing
The commit cited below moved firmware flashing functionality from
mlxsw_spectrum to mlxsw_core, but did not adjust the Kconfig
dependencies. This makes it possible to have mlxsw_core as built-in and
mlxfw as a module. The mlxfw code is therefore not reachable from
mlxsw_core and firmware flashing fails:

# devlink dev flash pci/0000:01:00.0 file mellanox/mlxsw_spectrum-13.2008.1310.mfa2
devlink answers: Operation not supported

Fix by having mlxsw_core select mlxfw.

Fixes: b79cb787ac ("mlxsw: Move fw flashing code into core.c")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reported-by: Vadim Pasternak <vadimp@nvidia.com>
Tested-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:07:00 -08:00
Zhang Changzhong
3a36060bf2 atl1e: fix error return code in atl1e_probe()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: a6a5325239 ("atl1e: Atheros L1E Gigabit Ethernet driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Link: https://lore.kernel.org/r/1605581875-36281-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:02:15 -08:00
Zhang Changzhong
537a147265 atl1c: fix error return code in atl1c_probe()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 43250ddd75 ("atl1c: Atheros L1C Gigabit Ethernet driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Link: https://lore.kernel.org/r/1605581721-36028-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 11:00:17 -08:00
Gustavo A. R. Silva
ed30aef3c8 nfp: tls: Fix unreachable code issue
Fix the following unreachable code issue:

   drivers/net/ethernet/netronome/nfp/crypto/tls.c: In function 'nfp_net_tls_add':
   include/linux/compiler_attributes.h:208:41: warning: statement will never be executed [-Wswitch-unreachable]
     208 | # define fallthrough                    __attribute__((__fallthrough__))
         |                                         ^~~~~~~~~~~~~
   drivers/net/ethernet/netronome/nfp/crypto/tls.c:299:3: note: in expansion of macro 'fallthrough'
     299 |   fallthrough;
         |   ^~~~~~~~~~~

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Link: https://lore.kernel.org/r/20201117171347.GA27231@embeddedor
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-18 09:45:52 -08:00
Dmitry Bogdanov
93be526124 qed: fix ILT configuration of SRC block
The code refactoring of ILT configuration was not complete, the old
unused variables were used for the SRC block. That could lead to the memory
corruption by HW when rx filters are configured.
This patch completes that refactoring.

Fixes: 8a52bbab39 (qed: Debug feature: ilt and mdump)
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Link: https://lore.kernel.org/r/20201116132944.2055-1-dbogdanov@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 16:35:32 -08:00
Subbaraya Sundeep
5a57966785 octeontx2-af: Delete NIX_RXVLAN_ALLOC mailbox message
Since mailbox message for installing flows is in place,
remove the RXVLAN_ALLOC mbox message which is redundant.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 13:48:21 -08:00
Naveen Mamindlapalli
dbab48cecc octeontx2-af: Add new mbox messages to retrieve MCAM entries
This patch introduces new mailbox mesages to retrieve a given
MCAM entry or base flow steering rule of a VF installed by its
parent PF. This helps while updating the existing MCAM rules
with out re-framing the whole mailbox request again. The INSTALL
FLOW mailbox consumer can read-modify-write the existing entry.
Similarly while installing new flow rules for a VF, the base
flow steering rule match creteria is copied to the new flow rule
and the deltas are appended to the new rule.

Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Co-developed-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 13:48:21 -08:00
Hariprasad Kelam
4f88ed2cc5 octeontx2-af: Handle PF-VF mac address changes
This patch handles the VF mac address changes as given below.
    1. mac addr configrued by VF will be retained until VF module unload.
    2. mac addr configred by PF for VF will be retained until power cycle.
    3. mac addr confgired by PF for its VF can't be overwritten by VF.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 13:48:21 -08:00
Naveen Mamindlapalli
f0c2982aaf octeontx2-pf: Add support for SR-IOV management functions
This patch adds support for ndo_set_vf_mac, ndo_set_vf_vlan
and ndo_get_vf_config handlers. The traffic redirection
based on the VF mac address or vlan id is done by installing
MCAM rules. Reserved RX_VTAG_TYPE7 in each NIXLF for VF VLAN
which strips the VLAN tag from ingress VLAN traffic. The NIX PF
allocates two MCAM entries for VF VLAN feature, one used for
ingress VTAG strip and another entry for egress VTAG insertion.

This patch also updates the MAC address in PF installed VF VLAN
rule upon receiving nix_lf_start_rx mbox request for VF since
Administrative Function driver will assign a valid MAC addr
in nix_lf_start_rx function.

Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Co-developed-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 13:48:21 -08:00
Hariprasad Kelam
fd9d7859db octeontx2-pf: Implement ingress/egress VLAN offload
This patch implements egress VLAN offload by appending NIX_SEND_EXT_S
header to NIX_SEND_HDR_S. The VLAN TCI information is specified
in the NIX_SEND_EXT_S. The VLAN offload in the ingress path is
implemented by configuring the NIX_RX_VTAG_ACTION_S to strip and
capture the outer vlan fields. The NIX PF allocates one MCAM entry
for Rx VLAN offload.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 13:48:21 -08:00
Vamsi Attunuru
9a946def26 octeontx2-af: Modify nix_vtag_cfg mailbox to support TX VTAG entries
This patch modifies the existing nix_vtag_config mailbox message
to allocate and free TX VTAG entries as requested by a NIX PF.
The TX VTAG entries are global resource that shared by all PFs
and each entry specifies the size of VTAG to insert and the VTAG
header data to insert. The mailbox response contains the entry
index which is used by mailbox requester in configuring the
NPC_TX_VTAG_ACTION for any MCAM entry.

Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 13:48:21 -08:00
Subbaraya Sundeep
4d6beb9c80 octeontx2-af: Add debugfs entry to dump the MCAM rules
Add debugfs support to dump the MCAM rules installed using
NPC_INSTALL_FLOW mbox message. Debugfs file can display mcam
entry, counter if any, flow type and counter hits.

Ethtool will dump the ntuple flows related to the PF only.
The debugfs file gives systemwide view of the MCAM rules
installed by all the PF's.

Below is the example output when the debugfs file is read:
~ # mount -t debugfs none /sys/kernel/debug
~ # cat /sys/kernel/debug/octeontx2/npc/mcam_rules

	Installed by: PF1
	direction: RX
        mcam entry: 227
	udp source port 23 mask 0xffff
	Forward to: PF1 VF0
        action: Direct to queue 0
	enabled: yes
        counter: 1
        hits: 0

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 13:48:21 -08:00
Hariprasad Kelam
63ee51575f octeontx2-pf: Add support for unicast MAC address filtering
Add unicast MAC address filtering support using install flow
message. Total of 8 MCAM entries are allocated for adding
unicast mac filtering rules. If the MCAM allocation fails,
the unicast filtering support will not be advertised.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 13:48:20 -08:00
Subbaraya Sundeep
f0a1913f8a octeontx2-pf: Add support for ethtool ntuple filters
This patch adds support for adding and deleting ethtool ntuple
filters. The filters for ether, ipv4, ipv6, tcp, udp and sctp
are supported. The mask is also supported. The supported actions
are drop and direct to a queue. Additionally we support FLOW_EXT
field vlan_tci and FLOW_MAC_EXT.

The NIX PF will allocate total 32 MCAM entries for the use of
ethtool ntuple filters. The Administrative Function(AF) will
install/delete the MCAM rules when NIX PF sends mailbox message
to install/delete the ntuple filters.

Ethtool ntuple filters support is restricted to PFs as of now
and PF can install ntuple filters to direct the traffic to its
VFs. Hence added a separate callback for VFs to get/set RSS
configuration.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 13:48:20 -08:00
Subbaraya Sundeep
55307fcb92 octeontx2-af: Add mbox messages to install and delete MCAM rules
Added new mailbox messages to install and delete MCAM rules.
These mailbox messages will be used for adding/deleting ethtool
n-tuple filters by NIX PF. The installed MCAM rules are stored
in a list that will be traversed later to delete the MCAM entries
when the interface is brought down or when PCIe FLR is received.
The delete mailbox supports deleting a single MCAM entry or range
of entries or all the MCAM entries owned by the pcifunc. Each MCAM
entry can be associated with a HW match stat entry if the mailbox
requester wants to check the hit count for debugging.

Modified adding default unicast DMAC match rule using install
flow API. The default unicast DMAC match entry installed by
Administrative Function is saved and can be changed later by the
mailbox user to fit additional fields, or the default MCAM entry
rule action can be used for other flow rules installed later.

Modified rvu_mbox_handler_nix_lf_free mailbox to add a flag to
disable or delete the MCAM entries. The MCAM entries are disabled
when the interface is brought down and deleted in FLR handler.
The disabled MCAM entries will be re-enabled when the interface
is brought up again.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 13:48:20 -08:00
Subbaraya Sundeep
9b179a960a octeontx2-af: Generate key field bit mask from KEX profile
Key Extraction(KEX) profile decides how the packet metadata such as
layer information and selected packet data bytes at each layer are
placed in MCAM search key. This patch reads the configured KEX profile
parameters to find out the bit position and bit mask for each field.
The information is used when programming the MCAM match data by SW
to match a packet flow and take appropriate action on the flow. This
patch also verifies the mandatory fields such as channel and DMAC
are not overwritten by the KEX configuration of other fields.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 13:48:20 -08:00
Subbaraya Sundeep
041a1c1715 octeontx2-af: Verify MCAM entry channel and PF_FUNC
This patch adds support to verify the channel number sent by
mailbox requester before writing MCAM entry for Ingress packets.
Similarly for Egress packets, verifying the PF_FUNC sent by the
mailbox user.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 13:48:20 -08:00
Stanislaw Kardach
f1517f6f1d octeontx2-af: Modify default KEX profile to extract TX packet fields
The current default Key Extraction(KEX) profile can only use RX
packet fields while generating the MCAM search key. The profile
can't be used for matching TX packet fields. This patch modifies
the default KEX profile to add support for extracting TX packet
fields into MCAM search key. Enabled Tx KPU packet parsing by
configuring TX PKIND in tx_parse_cfg.

Modified the KEX profile to extract 2 bytes of VLAN TCI from an
offset of 2 bytes from LB_PTR. The LB_PTR points to the byte offset
where the VLAN header starts. The NPC KPU parser profile has been
modified to point LB_PTR to the starting byte offset of VLAN header
which points to the tpid field.

Signed-off-by: Stanislaw Kardach <skardach@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 13:48:20 -08:00
Magnus Karlsson
3106c580fb i40e: Use batched xsk Tx interfaces to increase performance
Use the new batched xsk interfaces for the Tx path in the i40e driver
to improve performance. On my machine, this yields a throughput
increase of 4% for the l2fwd sample app in xdpsock. If we instead just
look at the Tx part, this patch set increases throughput with above
20% for Tx.

Note that I had to explicitly loop unroll the inner loop to get to
this performance level, by using a pragma. It is honored by both clang
and gcc and should be ignored by versions that do not support
it. Using the -funroll-loops compiler command line switch on the
source file resulted in a loop unrolling on a higher level that
lead to a performance decrease instead of an increase.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/1605525167-14450-6-git-send-email-magnus.karlsson@gmail.com
2020-11-17 22:07:40 +01:00
Magnus Karlsson
f320460b94 i40e: Remove unnecessary sw_ring access from xsk Tx
Remove the unnecessary access to the software ring for the AF_XDP
zero-copy driver. This was used to record the length of the packet so
that the driver Tx completion code could sum this up to produce the
total bytes sent. This is now performed during the transmission of the
packet, so no need to record this in the software ring.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/1605525167-14450-3-git-send-email-magnus.karlsson@gmail.com
2020-11-17 22:07:40 +01:00
Alex Marginean
fd5736bf9f enetc: Workaround for MDIO register access issue
Due to a hardware issue, an access to MDIO registers
that is concurrent with other ENETC register accesses
may lead to the MDIO access being dropped or corrupted.
The workaround introduces locking for all register accesses
to the ENETC register space.  To reduce performance impact,
a readers-writers locking scheme has been implemented.
The writer in this case is the MDIO access code (irrelevant
whether that MDIO access is a register read or write), and
the reader is any access code to non-MDIO ENETC registers.
Also, the datapath functions acquire the read lock fewer times
and use _hot accessors.  All the rest of the code uses the _wa
accessors which lock every register access.
The commit introducing MDIO support is -
commit ebfcb23d62 ("enetc: Add ENETC PF level external MDIO support")
but due to subsequent refactoring this patch is applicable on
top of a later commit.

Fixes: 6517798dd3 ("enetc: Make MDIO accessors more generic and export to include/linux/fsl")
Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20201112182608.26177-1-claudiu.manoil@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 12:12:12 -08:00
Wang Hai
68ec32daf7 net/mlx5: fix error return code in mlx5e_tc_nic_init()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: aedd133d17 ("net/mlx5e: Support CT offload for tc nic flows")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17 11:50:54 -08:00
Eli Cohen
5b8631c7b2 net/mlx5: E-Switch, Fail mlx5_esw_modify_vport_rate if qos disabled
Avoid calling mlx5_esw_modify_vport_rate() if qos is not enabled and
avoid unnecessary syndrome messages from firmware.

Fixes: fcb64c0f56 ("net/mlx5: E-Switch, add ingress rate support")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17 11:50:53 -08:00
Vladyslav Tarasiuk
470b747582 net/mlx5: Disable QoS when min_rates on all VFs are zero
Currently when QoS is enabled for VF and any min_rate is configured,
the driver sets bw_share value to at least 1 and doesn’t allow to set
it to 0 to make minimal rate unlimited. It means there is always a
minimal rate configured for every VF, even if user tries to remove it.

In order to make QoS disable possible, check whether all vports have
configured min_rate = 0. If this is true, set their bw_share to 0 to
disable min_rate limitations.

Fixes: c9497c9890 ("net/mlx5: Add support for setting VF min rate")
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17 11:50:53 -08:00
Vladyslav Tarasiuk
1ce5fc724a net/mlx5: Clear bw_share upon VF disable
Currently, if user disables VFs with some min and max rates configured,
they are cleared. But QoS data is not cleared and restored upon next VF
enable placing limits on minimal rate for given VF, when user expects
none.

To match cleared vport->info struct with QoS-related min and max rates
upon VF disable, clear vport->qos struct too.

Fixes: 556b9d16d3 ("net/mlx5: Clear VF's configuration on disabling SRIOV")
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17 11:50:53 -08:00
Michael Guralnik
8cbcc5ef2a net/mlx5: Add handling of port type in rule deletion
Handle destruction of rules with port destination type to enable
full destruction of flow.

Without this handling of TX rules the deletion of these rules fails.
Dmesg of flow destruction failure:

[  203.714146] mlx5_core 0000:00:0b.0: mlx5_cmd_check:753:(pid 342): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x144b7a)
[  210.547387] ------------[ cut here ]------------
[  210.548663] refcount_t: decrement hit 0; leaking memory.
[  210.550651] WARNING: CPU: 4 PID: 342 at lib/refcount.c:31 refcount_warn_saturate+0x5c/0x110
[  210.550654] Modules linked in: mlx5_ib mlx5_core ib_ipoib rdma_ucm rdma_cm iw_cm ib_cm ib_umad ib_uverbs ib_core
[  210.550675] CPU: 4 PID: 342 Comm: test Not tainted 5.8.0-rc2+ #116
[  210.550678] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[  210.550680] RIP: 0010:refcount_warn_saturate+0x5c/0x110
[  210.550685] Code: c6 d1 1b 01 00 0f 84 ad 00 00 00 5b 5d c3 80 3d b5 d1 1b 01 00 75 f4 48 c7 c7 20 d1 15 82 c6 05 a5 d1 1b 01 01 e8 a7 eb af ff <0f> 0b eb dd 80 3d 99 d1 1b 01 00 75 d4 48 c7 c7 c0 cf 15 82 c6 05
[  210.550687] RSP: 0018:ffff8881642e77e8 EFLAGS: 00010282
[  210.550691] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000000
[  210.550694] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed102c85ceef
[  210.550696] RBP: ffff888161720428 R08: ffffffff8124c10e R09: ffffed103243beae
[  210.550698] R10: ffff8881921df56b R11: ffffed103243bead R12: ffff8881841b4180
[  210.550701] R13: ffff888161720428 R14: ffff8881616d0000 R15: ffff888161720380
[  210.550704] FS:  00007fc27f025740(0000) GS:ffff888192000000(0000) knlGS:0000000000000000
[  210.550706] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  210.550708] CR2: 0000557e4b41a6a0 CR3: 0000000002415004 CR4: 0000000000360ea0
[  210.550711] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  210.550713] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  210.550715] Call Trace:
[  210.550717]  mlx5_del_flow_rules+0x484/0x490 [mlx5_core]
[  210.550720]  ? mlx5_cmd_set_fte+0xa80/0xa80 [mlx5_core]
[  210.550722]  mlx5_ib_destroy_flow+0x17f/0x280 [mlx5_ib]
[  210.550724]  uverbs_free_flow+0x4c/0x90 [ib_uverbs]
[  210.550726]  destroy_hw_idr_uobject+0x41/0xb0 [ib_uverbs]
[  210.550728]  uverbs_destroy_uobject+0xaa/0x390 [ib_uverbs]
[  210.550731]  __uverbs_cleanup_ufile+0x129/0x1b0 [ib_uverbs]
[  210.550733]  ? uverbs_destroy_uobject+0x390/0x390 [ib_uverbs]
[  210.550735]  uverbs_destroy_ufile_hw+0x78/0x190 [ib_uverbs]
[  210.550737]  ib_uverbs_close+0x36/0x140 [ib_uverbs]
[  210.550739]  __fput+0x181/0x380
[  210.550741]  task_work_run+0x88/0xd0
[  210.550743]  do_exit+0x5f6/0x13b0
[  210.550745]  ? sched_clock_cpu+0x30/0x140
[  210.550747]  ? is_current_pgrp_orphaned+0x70/0x70
[  210.550750]  ? lock_downgrade+0x360/0x360
[  210.550752]  ? mark_held_locks+0x1d/0x90
[  210.550754]  do_group_exit+0x8a/0x140
[  210.550756]  get_signal+0x20a/0xf50
[  210.550758]  do_signal+0x8c/0xbe0
[  210.550760]  ? hrtimer_nanosleep+0x1d8/0x200
[  210.550762]  ? nanosleep_copyout+0x50/0x50
[  210.550764]  ? restore_sigcontext+0x320/0x320
[  210.550766]  ? __hrtimer_init+0xf0/0xf0
[  210.550768]  ? timespec64_add_safe+0x150/0x150
[  210.550770]  ? mark_held_locks+0x1d/0x90
[  210.550772]  ? lockdep_hardirqs_on_prepare+0x14c/0x240
[  210.550774]  __prepare_exit_to_usermode+0x119/0x170
[  210.550776]  do_syscall_64+0x65/0x300
[  210.550778]  ? trace_hardirqs_off+0x10/0x120
[  210.550781]  ? mark_held_locks+0x1d/0x90
[  210.550783]  ? asm_sysvec_apic_timer_interrupt+0xa/0x20
[  210.550785]  ? lockdep_hardirqs_on+0x112/0x190
[  210.550787]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  210.550789] RIP: 0033:0x7fc27f1cd157
[  210.550791] Code: Bad RIP value.
[  210.550793] RSP: 002b:00007ffd4db27ea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000023
[  210.550798] RAX: fffffffffffffdfc RBX: ffffffffffffff80 RCX: 00007fc27f1cd157
[  210.550800] RDX: 00007fc27f025740 RSI: 00007ffd4db27eb0 RDI: 00007ffd4db27eb0
[  210.550803] RBP: 0000000000000016 R08: 0000000000000000 R09: 000000000000000e
[  210.550805] R10: 00007ffd4db27dc7 R11: 0000000000000246 R12: 0000000000400c00
[  210.550808] R13: 00007ffd4db285f0 R14: 0000000000000000 R15: 0000000000000000
[  210.550809] irq event stamp: 49399
[  210.550812] hardirqs last  enabled at (49399): [<ffffffff81172d36>] console_unlock+0x556/0x6f0
[  210.550815] hardirqs last disabled at (49398): [<ffffffff81172897>] console_unlock+0xb7/0x6f0
[  210.550818] softirqs last  enabled at (48706): [<ffffffff81e0037b>] __do_softirq+0x37b/0x60c
[  210.550820] softirqs last disabled at (48697): [<ffffffff81c00e2f>] asm_call_on_stack+0xf/0x20
[  210.550822] ---[ end trace ad18c0e6fa846454 ]---
[  210.581862] mlx5_core 0000:00:0c.0: mlx5_destroy_flow_table:2132:(pid 342): Flow table 262150 wasn't destroyed, refcount > 1

Fixes: a7ee18bdee ("RDMA/mlx5: Allow creating a matcher for a NIC TX flow table")
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17 11:50:52 -08:00
Maor Dickman
219b3267ca net/mlx5e: Fix check if netdev is bond slave
Bond events handler uses bond_slave_get_rtnl to check if net device
is bond slave. bond_slave_get_rtnl return the rcu rx_handler pointer
from the netdev which exists for bond slaves but also exists for
devices that are attached to linux bridge so using it as indication
for bond slave is wrong.

Fix by using netif_is_lag_port instead.

Fixes: 7e51891a23 ("net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17 11:50:52 -08:00
Huy Nguyen
6248ce991f net/mlx5e: Fix IPsec packet drop by mlx5e_tc_update_skb
Both TC and IPsec crypto offload use metadata_regB to store
private information. Since TC does not use bit 31 of regB, IPsec
will use bit 31 as the IPsec packet marker. The IPsec's regB usage
is changed to:
Bit31: IPsec marker
Bit30-24: IPsec syndrome
Bit23-0: IPsec obj id

Fixes: b2ac7541e3 ("net/mlx5e: IPsec: Add Connect-X IPsec Rx data path offload")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17 11:50:52 -08:00
Huy Nguyen
5cfb540ef2 net/mlx5e: Set IPsec WAs only in IP's non checksum partial case.
The IP's checksum partial still requires L4 csum flag on Ethernet WQE.
Make the IPsec WAs only for the IP's non checksum partial case
(for example icmd packet)

Fixes: 5be019040c ("net/mlx5e: IPsec: Add Connect-X IPsec Tx data path offload")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Alaa Hleihel <alaa@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17 11:50:52 -08:00
Maxim Mikityanskiy
ea63609857 net/mlx5e: Fix refcount leak on kTLS RX resync
On resync, the driver calls inet_lookup_established
(__inet6_lookup_established) that increases sk_refcnt of the socket. To
decrease it, the driver set skb->destructor to sock_edemux. However, it
didn't work well, because the TCP stack also sets this destructor for
early demux, and the refcount gets decreased only once, while increased
two times (in mlx5e and in the TCP stack). It leads to a socket leak, a
TLS context leak, which in the end leads to calling tls_dev_del twice:
on socket close and on driver unload, which in turn leads to a crash.

This commit fixes the refcount leak by calling sock_gen_put right away
after using the socket, thus fixing all the subsequent issues.

Fixes: 0419d8c9d8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17 11:50:51 -08:00
Huazhong Tan
de25bcc47f net: hns3: rename gl_adapt_enable in struct hns3_enet_coalesce
Besides GL(Gap Limiting), QL(Quantity Limiting) can be modified
dynamically when DIM is supported. So rename gl_adapt_enable as
adapt_enable in struct hns3_enet_coalesce.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 11:39:21 -08:00
Huazhong Tan
5ac84b02d3 net: hns3: add support for 1us unit GL configuration
For device whose version is above V3(include V3), the GL
configuration can set as 1us unit, so adds support for
configuring this field.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 11:39:20 -08:00
Huazhong Tan
ab16b49cdf net: hns3: add support for querying maximum value of GL
For maintainability and compatibility, add support for querying
the maximum value of GL.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 11:39:20 -08:00
Huazhong Tan
91bfae25ee net: hns3: add support for configuring interrupt quantity limiting
QL(quantity limiting) means that hardware supports the interrupt
coalesce based on the frame quantity.  QL can be configured when
int_ql_max in device's specification is non-zero, so add support
to configure it. Also, rename two coalesce init function to fit
their purpose.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 11:39:20 -08:00
Joel Stanley
3d5179458d net: ftgmac100: Fix crash when removing driver
When removing the driver we would hit BUG_ON(!list_empty(&dev->ptype_specific))
in net/core/dev.c due to still having the NC-SI packet handler
registered.

 # echo 1e660000.ethernet > /sys/bus/platform/drivers/ftgmac100/unbind
  ------------[ cut here ]------------
  kernel BUG at net/core/dev.c:10254!
  Internal error: Oops - BUG: 0 [#1] SMP ARM
  CPU: 0 PID: 115 Comm: sh Not tainted 5.10.0-rc3-next-20201111-00007-g02e0365710c4 #46
  Hardware name: Generic DT based system
  PC is at netdev_run_todo+0x314/0x394
  LR is at cpumask_next+0x20/0x24
  pc : [<806f5830>]    lr : [<80863cb0>]    psr: 80000153
  sp : 855bbd58  ip : 00000001  fp : 855bbdac
  r10: 80c03d00  r9 : 80c06228  r8 : 81158c54
  r7 : 00000000  r6 : 80c05dec  r5 : 80c05d18  r4 : 813b9280
  r3 : 813b9054  r2 : 8122c470  r1 : 00000002  r0 : 00000002
  Flags: Nzcv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment none
  Control: 00c5387d  Table: 85514008  DAC: 00000051
  Process sh (pid: 115, stack limit = 0x7cb5703d)
 ...
  Backtrace:
  [<806f551c>] (netdev_run_todo) from [<80707eec>] (rtnl_unlock+0x18/0x1c)
   r10:00000051 r9:854ed710 r8:81158c54 r7:80c76bb0 r6:81158c10 r5:8115b410
   r4:813b9000
  [<80707ed4>] (rtnl_unlock) from [<806f5db8>] (unregister_netdev+0x2c/0x30)
  [<806f5d8c>] (unregister_netdev) from [<805a8180>] (ftgmac100_remove+0x20/0xa8)
   r5:8115b410 r4:813b9000
  [<805a8160>] (ftgmac100_remove) from [<805355e4>] (platform_drv_remove+0x34/0x4c)

Fixes: bd466c3fb5 ("net/faraday: Support NCSI mode")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20201117024448.1170761-1-joel@jms.id.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 10:59:03 -08:00
Zhang Changzhong
7b027c249d net: b44: fix error return code in b44_init_one()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 39a6f4bce6 ("b44: replace the ssb_dma API with the generic DMA API")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1605582131-36735-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 10:50:28 -08:00
Sven Van Asbroeck
7c3e2b771d lan743x: replace devicetree phy parse code with library function
The code in this driver which parses the devicetree to determine
the phy/fixed link setup, can be replaced by a single library
function: of_phy_get_and_connect().

Behaviour is identical, except that the library function will
complain when 'phy-connection-type' is omitted, instead of
blindly using PHY_INTERFACE_MODE_NA, which would result in an
invalid phy configuration.

The library function no longer brings out the exact phy_mode,
but the driver doesn't need this, because phy_interface_is_rgmii()
queries the phydev directly. Remove 'phy_mode' from the private
adapter struct.

While we're here, log info about the attached phy on connect,
this is useful because the phy type and connection method is now
fully configurable via the devicetree.

Tested on a lan7430 chip with built-in phy. Verified that adding
fixed-link/phy-connection-type in the devicetree results in a
fixed-link setup. Used ethtool to verify that the devicetree
settings are used.

Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # lan7430
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20201116170155.26967-1-TheSven73@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 10:46:20 -08:00
Zhang Changzhong
cb47d16ea2 qed: fix error return code in qed_iwarp_ll2_start()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 469981b17a ("qed: Add unaligned and packed packet processing")
Fixes: fcb39f6c10 ("qed: Add mpa buffer descriptors for storing and processing mpa fpdus")
Fixes: 1e28eaad07 ("qed: Add iWARP support for fpdu spanned over more than two tcp packets")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Link: https://lore.kernel.org/r/1605532033-27373-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 10:40:34 -08:00
Heiner Kallweit
83c317d7b3 r8169: remove nr_frags argument from rtl_tx_slots_avail
The only time when nr_frags isn't SKB_MAX_FRAGS is when entering
rtl8169_start_xmit(). However we can use SKB_MAX_FRAGS also here
because when queue isn't stopped there should always be room for
MAX_SKB_FRAGS + 1 descriptors.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/3d1f2ad7-31d5-2cac-4f4a-394f8a3cab63@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17 09:29:14 -08:00
Vasundhara Volam
0ae0a779ef bnxt_en: Avoid unnecessary NVM_GET_DEV_INFO cmd error log on VFs.
VFs do not have access permissions to issue NVM_GET_DEV_INFO
firmware command.

Fixes: 4933f6753b ("bnxt_en: Add bnxt_hwrm_nvm_get_dev_info() to query NVM info.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 17:39:47 -08:00
Michael Chan
fa97f303fa bnxt_en: Fix counter overflow logic.
bnxt_add_one_ctr() adds a hardware counter to a software counter and
adjusts for the hardware counter wraparound against the mask.  The logic
assumes that the hardware counter is always smaller than or equal to
the mask.

This assumption is mostly correct.  But in some cases if the firmware
is older and does not provide the accurate mask, the driver can use
a mask that is smaller than the actual hardware mask.  This can cause
some extra carry bits to be added to the software counter, resulting in
counters that far exceed the actual value.  Fix it by masking the
hardware counter with the mask passed into bnxt_add_one_ctr().

Fixes: fea6b33355 ("bnxt_en: Accumulate all counters.")
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 17:39:46 -08:00
Michael Chan
eba93de6d3 bnxt_en: Free port stats during firmware reset.
Firmware is unable to retain the port counters during any kind of
fatal or non-fatal resets, so we must clear the port counters to
avoid false detection of port counter overflow.

Fixes: fea6b33355 ("bnxt_en: Accumulate all counters.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 17:39:46 -08:00
Edwin Peer
4260330b32 bnxt_en: read EEPROM A2h address using page 0
The module eeprom address range returned by bnxt_get_module_eeprom()
should be 256 bytes of A0h address space, the lower half of the A2h
address space, and page 0 for the upper half of the A2h address space.

Fix the firmware call by passing page_number 0 for the A2h slave address
space.

Fixes: 42ee18fe4c ("bnxt_en: Add Support for ETHTOOL_GMODULEINFO and ETHTOOL_GMODULEEEPRO")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 17:39:46 -08:00
Subash Abhinov Kasiviswanathan
fc70f5bf5e net: qualcomm: rmnet: Fix incorrect receive packet handling during cleanup
During rmnet unregistration, the real device rx_handler is first cleared
followed by the removal of rx_handler_data after the rcu synchronization.

Any packets in the receive path may observe that the rx_handler is NULL.
However, there is no check when dereferencing this value to use the
rmnet_port information.

This fixes following splat by adding the NULL check.

Unable to handle kernel NULL pointer dereference at virtual
address 000000000000000d
pc : rmnet_rx_handler+0x124/0x284
lr : rmnet_rx_handler+0x124/0x284
 rmnet_rx_handler+0x124/0x284
 __netif_receive_skb_core+0x758/0xd74
 __netif_receive_skb+0x50/0x17c
 process_backlog+0x15c/0x1b8
 napi_poll+0x88/0x284
 net_rx_action+0xbc/0x23c
 __do_softirq+0x20c/0x48c

Fixes: ceed73a2cf ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Link: https://lore.kernel.org/r/1605298325-3705-1-git-send-email-subashab@codeaurora.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 16:34:49 -08:00
Lorenzo Bianconi
9c79a8ab5f net: mvneta: fix possible memory leak in mvneta_swbm_add_rx_fragment
Recycle the page running page_pool_put_full_page() in
mvneta_swbm_add_rx_fragment routine when the last descriptor
contains just the FCS or if the received packet contains more than
MAX_SKB_FRAGS fragments

Fixes: ca0e014609 ("net: mvneta: move skb build after descriptors processing")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/df6a2bad70323ee58d3901491ada31c1ca2a40b9.1605291228.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 16:30:50 -08:00
Wong Vee Khee
8e5debed39 net: stmmac: Use rtnl_lock/unlock on netif_set_real_num_rx_queues() call
Fix an issue where dump stack is printed on suspend resume flow due to
netif_set_real_num_rx_queues() is not called with rtnl_lock held().

Fixes: 686cff3d70 ("net: stmmac: Fix incorrect location to set real_num_rx|tx_queues")
Reported-by: Christophe ROULLIER <christophe.roullier@st.com>
Tested-by: Christophe ROULLIER <christophe.roullier@st.com>
Cc: Alexandre TORGUE <alexandre.torgue@st.com>
Reviewed-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com>
Link: https://lore.kernel.org/r/20201115074210.23605-1-vee.khee.wong@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 16:12:36 -08:00
Zhang Changzhong
35f735c665 net: ethernet: ti: cpsw: fix error return code in cpsw_probe()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 83a8471ba2 ("net: ethernet: ti: cpsw: refactor probe to group common hw initialization")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Link: https://lore.kernel.org/r/1605250173-18438-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 15:37:28 -08:00
Zhang Changzhong
661710bfd5 net: stmmac: dwmac-intel-plat: fix error return code in intel_eth_plat_probe()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 9efc9b2b04 ("net: stmmac: Add dwmac-intel-plat for GBE driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Link: https://lore.kernel.org/r/1605249243-17262-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 15:32:30 -08:00
Zhang Changzhong
3beb9be165 qlcnic: fix error return code in qlcnic_83xx_restart_hw()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 3ced0a88cd ("qlcnic: Add support to run firmware POST")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Link: https://lore.kernel.org/r/1605248186-16013-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 15:26:42 -08:00
Zhang Qilong
da875fa504 net: fec: Fix reference count leak in fec series ops
pm_runtime_get_sync() will increment pm usage at first and it will
resume the device later. If runtime of the device has error or
device is in inaccessible state(or other error state), resume
operation will fail. If we do not call put operation to decrease
the reference, it will result in reference count leak. Moreover,
this device cannot enter the idle state and always stay busy or other
non-idle state later. So we fixed it by replacing it with
pm_runtime_resume_and_get.

Fixes: 8fff755e9f ("net: fec: Ensure clocks are enabled while using mdio bus")
Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 09:37:01 -08:00
Heiner Kallweit
41294e6a43 r8169: improve rtl8169_start_xmit
Improve the following in rtl8169_start_xmit:
- tp->cur_tx can be accessed in parallel by rtl_tx(), therefore
  annotate the race by using WRITE_ONCE
- avoid checking stop_queue a second time by moving the doorbell check
- netif_stop_queue() uses atomic operation set_bit() that includes a
  full memory barrier on some platforms, therefore use
  smp_mb__after_atomic to avoid overhead

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/80085451-3eaf-507a-c7c0-08d607c46fbc@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 07:54:07 -08:00
Ido Schimmel
245f4e44d2 mlxsw: spectrum_router: Remove outdated comment
Since commit 21151f64a4 ("mlxsw: Add new FIB entry type for reject
routes") this comment is no longer correct. Remove it.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:04 -08:00
Ido Schimmel
9ed2b4d287 mlxsw: spectrum_router: Consolidate mlxsw_sp_nexthop{4, 6}_type_fini()
The two functions are identical, so consolidate them to
mlxsw_sp_nexthop_type_fini().

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:04 -08:00
Ido Schimmel
c181a89a6d mlxsw: spectrum_router: Consolidate mlxsw_sp_nexthop{4, 6}_type_init()
The two functions are now identical, so consolidate them to
mlxsw_sp_nexthop_type_init().

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:04 -08:00
Ido Schimmel
b360952bbf mlxsw: spectrum_router: Remove unused argument from mlxsw_sp_nexthop6_type_init()
Remove it as it is unused.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:04 -08:00
Ido Schimmel
c3bde5a914 mlxsw: spectrum_router: Pass nexthop netdev to mlxsw_sp_nexthop4_type_init()
Instead of passing the nexthop and resolving the nexthop netdev from it,
pass the nexthop netdev directly.

This will later allow us to consolidate code paths between IPv4 and IPv6
code.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:04 -08:00
Ido Schimmel
4dd38da54a mlxsw: spectrum_router: Pass nexthop netdev to mlxsw_sp_nexthop6_type_init()
Instead of passing the route and resolving the nexthop netdev from it,
pass the nexthop netdev directly.

This will later allow us to consolidate code paths between IPv4 and IPv6
code.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:04 -08:00
Ido Schimmel
7ba7bc55cf mlxsw: spectrum_ipip: Remove overlay protocol from can_offload() callback
The overlay protocol (i.e., IPv4/IPv6) that is being encapsulated has
no impact on whether a certain IP tunnel can be offloaded or not. Only
the underlay protocol matters.

Therefore, remove the unused overlay protocol parameter from the
callback.

This will later allow us to consolidate code paths between IPv4 and IPv6
code.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:04 -08:00
Ido Schimmel
7f7a417e6a mlxsw: spectrum_router: Split nexthop group configuration to a different struct
Currently, the individual nexthops member in the group and attributes of
the group (e.g., its type) are stored in the same struct (i.e., 'struct
mlxsw_sp_nexthop_group'). This is fine since the individual nexthops
cannot change during the lifetime of the group.

With nexthop objects this is no longer the case. An existing nexthop
group can be replaced to use a new set of nexthops. Creating a new
struct whenever a group is replaced entails replacing the group pointer
of all the routes (i.e., 'struct mlxsw_sp_fib_entry') using the group.

Avoid this inefficient step by splitting the nexthop group configuration
to a different struct (i.e., 'struct mlxsw_sp_nexthop_group_info').
When a nexthop group is replaced a new group info struct is created and
the individual rotues do not need to be touched.

Illustration after the change:

  mlxsw_sp_fib_entry    mlxsw_sp_nexthop_group    mlxsw_sp_nexthop_group_info
+-------------------+  +----------------------+  +---------------------------+
| nh_group;         +--> nhgi;                +-->                           |
|                   |  |                      |  |                           |
+-------------------+  +----------------------+  +---------------------------+

No functional changes intended.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:04 -08:00
Ido Schimmel
5a49dfe51f mlxsw: spectrum_router: Move IPv4 FIB info into a union in nexthop group struct
Instead of storing the FIB info as 'priv' when the nexthop group
represents an IPv4 nexthop group, simply store it as a FIB info with a
proper comment.

When nexthop objects are supported, this field will become a union with
the nexthop object's identifier.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:03 -08:00
Ido Schimmel
46d5b7b541 mlxsw: spectrum_router: Remove unused field 'prio' from IPv4 FIB entry struct
Not used anywhere.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:03 -08:00
Ido Schimmel
9ce254d9fb mlxsw: spectrum_router: Store FIB info in route
When needed, IPv4 routes fetch the FIB info (i.e., 'struct fib_info')
from their associated nexthop group. This will not work when the nexthop
group represents a nexthop object (i.e., 'struct nexthop'), as it will
only have access to the nexthop's identifier.

Instead, store the FIB info in the route itself.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:03 -08:00
Ido Schimmel
02d8fdcad7 mlxsw: spectrum_router: Associate neighbour table with nexthop instead of group
As explained in the previous patch, nexthop objects can have both IPv4
and IPv6 nexthops in the same group. Therefore, move the neighbour table
to be a property of the nexthop instead of the nexthop group.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:03 -08:00
Ido Schimmel
1664dd3d5e mlxsw: spectrum_router: Use nexthop group type in hash table key
Both IPv4 and IPv6 nexthop groups are hashed in the same table. The
protocol field is used to indicate how the hash should be computed for
each group.

When nexthop group objects are supported, the hash will be computed for
them based on the nexthop identifier.

To differentiate between all the nexthop group types, encode the type of
the group in the key instead of the protocol.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:03 -08:00
Ido Schimmel
a06191aabb mlxsw: spectrum_router: Add nexthop group type field
Currently, the type (i.e., IPv4/IPv6) of the nexthop group is derived
from the neighbour table associated with the group.

This is problematic when nexthop objects are taken into account, as a
nexthop group object can contain both IPv4 and IPv6 nexthops.

Instead, add a new field that indicates the type of the group and
initialize it during the group's creation. Currently, the types are IPv4
('struct fib_info') and IPv6 ('struct fib6_info'). In the future another
type will be added for nexthop objects ('struct nexthop').

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:03 -08:00
Ido Schimmel
10502d055b mlxsw: spectrum_router: Compare key with correct object type
When comparing a key with a nexthop group in rhastable's obj_cmpfn()
callback, make sure that the key and nexthop group are of the same type
(i.e., IPv4 / IPv6).

The bug is not currently visible because IPv6 nexthop groups do not
populate the FIB info pointer and IPv4 nexthop groups do not set the
ifindex for the individual nexthops.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:55:03 -08:00
Jisheng Zhang
56311a315d net: stmmac: dwmac_lib: enlarge dma reset timeout
If the phy enables power saving technology, the dwmac's software reset
needs more time to complete, enlarge dma reset timeout to 200000us.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Link: https://lore.kernel.org/r/20201113090902.5c7aab1a@xhacker.debian
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 16:04:02 -08:00
Sven Van Asbroeck
796a2665ca lan743x: prevent entire kernel HANG on open, for some platforms
On arm imx6, when opening the chip's netdev, the whole Linux
kernel intermittently hangs/freezes.

This is caused by a bug in the driver code which tests if pcie
interrupts are working correctly, using the software interrupt:

1. open: enable the software interrupt
2. open: tell the chip to assert the software interrupt
3. open: wait for flag
4. ISR: acknowledge s/w interrupt, set flag
5. open: notice flag, disable the s/w interrupt, continue

Unfortunately the ISR only acknowledges the s/w interrupt, but
does not disable it. This will re-trigger the ISR in a tight
loop.

On some (lucky) platforms, open proceeds to disable the s/w
interrupt even while the ISR is 'spinning'. On arm imx6,
the spinning ISR does not allow open to proceed, resulting
in a hung Linux kernel.

Fix minimally by disabling the s/w interrupt in the ISR, which
will prevent it from spinning. This won't break anything because
the s/w interrupt is used as a one-shot interrupt.

Note that this is a minimal fix, overlooking many possible
cleanups, e.g.:
- lan743x_intr_software_isr() is completely redundant and reads
  INT_STS twice for no apparent reason
- disabling the s/w interrupt in lan743x_intr_test_isr() is now
  redundant, but harmless
- waiting on software_isr_flag can be converted from a sleeping
  poll loop to wait_event_timeout()

Fixes: 23f0703c12 ("lan743x: Add main source files for new lan743x driver")
Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # arm imx6 lan7430
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Link: https://lore.kernel.org/r/20201112204741.12375-1-TheSven73@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 15:25:12 -08:00
Sven Van Asbroeck
e35df62e04 lan743x: fix issue causing intermittent kernel log warnings
When running this chip on arm imx6, we intermittently observe
the following kernel warning in the log, especially when the
system is under high load:

[   50.119484] ------------[ cut here ]------------
[   50.124377] WARNING: CPU: 0 PID: 303 at kernel/softirq.c:169 __local_bh_enable_ip+0x100/0x184
[   50.132925] IRQs not enabled as expected
[   50.159250] CPU: 0 PID: 303 Comm: rngd Not tainted 5.7.8 #1
[   50.164837] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[   50.171395] [<c0111a38>] (unwind_backtrace) from [<c010be28>] (show_stack+0x10/0x14)
[   50.179162] [<c010be28>] (show_stack) from [<c05b9dec>] (dump_stack+0xac/0xd8)
[   50.186408] [<c05b9dec>] (dump_stack) from [<c0122e40>] (__warn+0xd0/0x10c)
[   50.193391] [<c0122e40>] (__warn) from [<c0123238>] (warn_slowpath_fmt+0x98/0xc4)
[   50.200892] [<c0123238>] (warn_slowpath_fmt) from [<c012b010>] (__local_bh_enable_ip+0x100/0x184)
[   50.209860] [<c012b010>] (__local_bh_enable_ip) from [<bf09ecbc>] (destroy_conntrack+0x48/0xd8 [nf_conntrack])
[   50.220038] [<bf09ecbc>] (destroy_conntrack [nf_conntrack]) from [<c0ac9b58>] (nf_conntrack_destroy+0x94/0x168)
[   50.230160] [<c0ac9b58>] (nf_conntrack_destroy) from [<c0a4aaa0>] (skb_release_head_state+0xa0/0xd0)
[   50.239314] [<c0a4aaa0>] (skb_release_head_state) from [<c0a4aadc>] (skb_release_all+0xc/0x24)
[   50.247946] [<c0a4aadc>] (skb_release_all) from [<c0a4b4cc>] (consume_skb+0x74/0x17c)
[   50.255796] [<c0a4b4cc>] (consume_skb) from [<c081a2dc>] (lan743x_tx_release_desc+0x120/0x124)
[   50.264428] [<c081a2dc>] (lan743x_tx_release_desc) from [<c081a98c>] (lan743x_tx_napi_poll+0x5c/0x18c)
[   50.273755] [<c081a98c>] (lan743x_tx_napi_poll) from [<c0a6b050>] (net_rx_action+0x118/0x4a4)
[   50.282306] [<c0a6b050>] (net_rx_action) from [<c0101364>] (__do_softirq+0x13c/0x53c)
[   50.290157] [<c0101364>] (__do_softirq) from [<c012b29c>] (irq_exit+0x150/0x17c)
[   50.297575] [<c012b29c>] (irq_exit) from [<c0196a08>] (__handle_domain_irq+0x60/0xb0)
[   50.305423] [<c0196a08>] (__handle_domain_irq) from [<c05d44fc>] (gic_handle_irq+0x4c/0x90)
[   50.313790] [<c05d44fc>] (gic_handle_irq) from [<c0100ed4>] (__irq_usr+0x54/0x80)
[   50.321287] Exception stack(0xecd99fb0 to 0xecd99ff8)
[   50.326355] 9fa0:                                     1cf1aa74 00000001 00000001 00000000
[   50.334547] 9fc0: 00000001 00000000 00000000 00000000 00000000 00000000 00004097 b6d17d14
[   50.342738] 9fe0: 00000001 b6d17c60 00000000 b6e71f94 800b0010 ffffffff
[   50.349364] irq event stamp: 2525027
[   50.352955] hardirqs last  enabled at (2525026): [<c0a6afec>] net_rx_action+0xb4/0x4a4
[   50.360892] hardirqs last disabled at (2525027): [<c0d6d2fc>] _raw_spin_lock_irqsave+0x1c/0x50
[   50.369517] softirqs last  enabled at (2524660): [<c01015b4>] __do_softirq+0x38c/0x53c
[   50.377446] softirqs last disabled at (2524693): [<c012b29c>] irq_exit+0x150/0x17c
[   50.385027] ---[ end trace c0b571db4bc8087d ]---

The driver is calling dev_kfree_skb() from code inside a spinlock,
where h/w interrupts are disabled. This is forbidden, as documented
in include/linux/netdevice.h. The correct function to use
dev_kfree_skb_irq(), or dev_kfree_skb_any().

Fix by using the correct dev_kfree_skb_xxx() functions:

in lan743x_tx_release_desc():
  called by lan743x_tx_release_completed_descriptors()
    called by in lan743x_tx_napi_poll()
    which holds a spinlock
  called by lan743x_tx_release_all_descriptors()
    called by lan743x_tx_close()
    which can-sleep
conclusion: use dev_kfree_skb_any()

in lan743x_tx_xmit_frame():
  which holds a spinlock
conclusion: use dev_kfree_skb_irq()

in lan743x_tx_close():
  which can-sleep
conclusion: use dev_kfree_skb()

in lan743x_rx_release_ring_element():
  called by lan743x_rx_close()
    which can-sleep
  called by lan743x_rx_open()
    which can-sleep
conclusion: use dev_kfree_skb()

Fixes: 23f0703c12 ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Link: https://lore.kernel.org/r/20201112185949.11315-1-TheSven73@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 13:32:22 -08:00
Shannon Nelson
7c8d008cc0 ionic: useful names for booleans
With a few more uses of true and false in function calls, we
need to give them some useful names so we can tell from the
calling point what we're doing.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 13:22:59 -08:00
Shannon Nelson
81dbc24147 ionic: change set_rx_mode from_ndo to can_sleep
Instead of having two different ways of expressing the same
sleepability concept, using opposite logic, we can rework the
from_ndo to can_sleep for a more consistent usage.

Fixes: 1800eee166 ("net: ionic: Replace in_interrupt() usage.")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 13:22:59 -08:00
Shannon Nelson
e94f76bb20 ionic: flatten calls to ionic_lif_rx_mode
The _ionic_lif_rx_mode() is only used once and really doesn't
need to be broken out.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 13:22:59 -08:00
Shannon Nelson
e0243e1966 ionic: use mc sync for multicast filters
We should be using the multicast sync routines for the multicast
filters.  Also, let's just flatten the logic a bit and pull
the small unicast routine back into ionic_set_rx_mode().

Fixes: 1800eee166 ("net: ionic: Replace in_interrupt() usage.")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 13:22:58 -08:00
Shannon Nelson
a8205ab620 ionic: batch rx buffer refilling
We don't need to refill the rx descriptors on every napi
if only a few were handled.  Waiting until we can batch up
a few together will save us a few Rx cycles.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 13:22:58 -08:00
Shannon Nelson
e7e8e087ac ionic: add lif quiesce
After the queues are stopped, expressly quiesce the lif.
This assures that even if the queues were in an odd state,
the firmware will close up everything cleanly.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 13:22:58 -08:00
Shannon Nelson
f6e428b27e ionic: check for link after netdev registration
Request a link check as soon as the netdev is registered rather
than waiting for the watchdog to go off in order to get the
interface operational a little more quickly.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 13:22:58 -08:00
Shannon Nelson
8f56bc4dc1 ionic: start queues before announcing link up
Change the order of operations in the link_up handling to be
sure that the queues are up and ready before we announce that
the link is up.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 13:22:58 -08:00
YueHaibing
9e6cad531c net: macb: Fix passing zero to 'PTR_ERR'
Check PTR_ERR with IS_ERR to fix this.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20201112144936.54776-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 12:35:33 -08:00
Jakub Kicinski
07cbce2e46 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-11-14

1) Add BTF generation for kernel modules and extend BTF infra in kernel
   e.g. support for split BTF loading and validation, from Andrii Nakryiko.

2) Support for pointers beyond pkt_end to recognize LLVM generated patterns
   on inlined branch conditions, from Alexei Starovoitov.

3) Implements bpf_local_storage for task_struct for BPF LSM, from KP Singh.

4) Enable FENTRY/FEXIT/RAW_TP tracing program to use the bpf_sk_storage
   infra, from Martin KaFai Lau.

5) Add XDP bulk APIs that introduce a defer/flush mechanism to optimize the
   XDP_REDIRECT path, from Lorenzo Bianconi.

6) Fix a potential (although rather theoretical) deadlock of hashtab in NMI
   context, from Song Liu.

7) Fixes for cross and out-of-tree build of bpftool and runqslower allowing build
   for different target archs on same source tree, from Jean-Philippe Brucker.

8) Fix error path in htab_map_alloc() triggered from syzbot, from Eric Dumazet.

9) Move functionality from test_tcpbpf_user into the test_progs framework so it
   can run in BPF CI, from Alexander Duyck.

10) Lift hashtab key_size limit to be larger than MAX_BPF_STACK, from Florian Lehner.

Note that for the fix from Song we have seen a sparse report on context
imbalance which requires changes in sparse itself for proper annotation
detection where this is currently being discussed on linux-sparse among
developers [0]. Once we have more clarification/guidance after their fix,
Song will follow-up.

  [0] https://lore.kernel.org/linux-sparse/CAHk-=wh4bx8A8dHnX612MsDO13st6uzAz1mJ1PaHHVevJx_ZCw@mail.gmail.com/T/
      https://lore.kernel.org/linux-sparse/20201109221345.uklbp3lzgq6g42zb@ltop.local/T/

* git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (66 commits)
  net: mlx5: Add xdp tx return bulking support
  net: mvpp2: Add xdp tx return bulking support
  net: mvneta: Add xdp tx return bulking support
  net: page_pool: Add bulk support for ptr_ring
  net: xdp: Introduce bulking for xdp tx return path
  bpf: Expose bpf_d_path helper to sleepable LSM hooks
  bpf: Augment the set of sleepable LSM hooks
  bpf: selftest: Use bpf_sk_storage in FENTRY/FEXIT/RAW_TP
  bpf: Allow using bpf_sk_storage in FENTRY/FEXIT/RAW_TP
  bpf: Rename some functions in bpf_sk_storage
  bpf: Folding omem_charge() into sk_storage_charge()
  selftests/bpf: Add asm tests for pkt vs pkt_end comparison.
  selftests/bpf: Add skb_pkt_end test
  bpf: Support for pointers beyond pkt_end.
  tools/bpf: Always run the *-clean recipes
  tools/bpf: Add bootstrap/ to .gitignore
  bpf: Fix NULL dereference in bpf_task_storage
  tools/bpftool: Fix build slowdown
  tools/runqslower: Build bpftool using HOSTCC
  tools/runqslower: Enable out-of-tree build
  ...
====================

Link: https://lore.kernel.org/r/20201114020819.29584-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 09:13:41 -08:00
Lorenzo Bianconi
b87c57ae12 net: mlx5: Add xdp tx return bulking support
Convert mlx5 driver to xdp_return_frame_bulk APIs.

XDP_REDIRECT (upstream codepath): 8.9Mpps
XDP_REDIRECT (upstream codepath + bulking APIs): 10.2Mpps

Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/250460319fd868b7b5668fc1deca74dd42813a90.1605267335.git.lorenzo@kernel.org
2020-11-14 02:29:00 +01:00
Lorenzo Bianconi
dbef19ccde net: mvpp2: Add xdp tx return bulking support
Convert mvpp2 driver to xdp_return_frame_bulk APIs.

XDP_REDIRECT (upstream codepath): 1.79Mpps
XDP_REDIRECT (upstream codepath + bulking APIs): 1.93Mpps

Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Matteo Croce <mcroce@microsoft.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/0b38c295e58e8ce251ef6b4e2187a2f457f9f7a3.1605267335.git.lorenzo@kernel.org
2020-11-14 02:29:00 +01:00
Lorenzo Bianconi
2f9d09394d net: mvneta: Add xdp tx return bulking support
Convert mvneta driver to xdp_return_frame_bulk APIs.

XDP_REDIRECT (upstream codepath): 275Kpps
XDP_REDIRECT (upstream codepath + bulking APIs): 284Kpps

Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/9af8014006d022fc0fec78cdaa71beb56999750d.1605267335.git.lorenzo@kernel.org
2020-11-14 02:29:00 +01:00
Jisheng Zhang
bb3222f71b net: stmmac: platform: use optional clk/reset get APIs
Use the devm_reset_control_get_optional() and devm_clk_get_optional()
rather than open coding them.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Link: https://lore.kernel.org/r/20201112092606.5173aa6f@xhacker.debian
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 16:31:52 -08:00
Heiner Kallweit
ca1ab89cd2 r8169: improve rtl_tx
We can simplify the for() condition and eliminate variable tx_left.
The change also considers that tp->cur_tx may be incremented by a
racing rtl8169_start_xmit().
In addition replace the write to tp->dirty_tx and the following
smp_mb() with an equivalent call to smp_store_mb(). This implicitly
adds a WRITE_ONCE() to the write.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/c2e19e5e-3d3f-d663-af32-13c3374f5def@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 16:29:07 -08:00
Heiner Kallweit
95f3c5458d r8169: use READ_ONCE in rtl_tx_slots_avail
tp->dirty_tx and tp->cur_tx may be changed by a racing rtl_tx() or
rtl8169_start_xmit(). Use READ_ONCE() to annotate the races and ensure
that the compiler doesn't use cached values.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/5676fee3-f6b4-84f2-eba5-c64949a371ad@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 16:28:59 -08:00
Edward Cree
c5122cf584 sfc: support GRE TSO on EF100
We can treat SKB_GSO_GRE almost exactly the same as UDP tunnels, except
 that we don't want to edit the outer UDP len (as there isn't one).
For SKB_GSO_GRE_CSUM, we have to use GSO_PARTIAL as the device doesn't
 support offload of non-UDP outer L4 checksums.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Martin Habets <mhabets@solarflare.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
2020-11-13 15:33:30 -08:00
Edward Cree
42bfd69a9f sfc: correctly support non-partial GSO_UDP_TUNNEL_CSUM on EF100
By asking the HW for the correct edits, we can make UDP tunnel TSO
 work without needing GSO_PARTIAL.  So don't specify it in our
 netdev->gso_partial_features.
However, retain GSO_PARTIAL support, as this will be used for other
 protocols later.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Martin Habets <mhabets@solarflare.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
2020-11-13 15:33:27 -08:00
Edward Cree
dc8d2512e6 sfc: extend bitfield macros to 19 fields
Our TSO descriptors got even more fussy.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Martin Habets <mhabets@solarflare.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
2020-11-13 15:33:03 -08:00
Wang Qing
81e329e93b net: ethernet: ti: am65-cpts: update ret when ptp_clock is ERROR
We always have to update the value of ret, otherwise the
 error value may be the previous one.

Fixes: f6bd59526c ("net: ethernet: ti: introduce am654 common platform time sync driver")
Signed-off-by: Wang Qing <wangqing@vivo.com>
[grygorii.strashko@ti.com: fix build warn, subj add fixes tag]
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://lore.kernel.org/r/20201112164541.3223-1-grygorii.strashko@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 15:25:43 -08:00
Wang Hai
8c07205aea net: marvell: prestera: fix error return code in prestera_pci_probe()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 4c2703dfd7 ("net: marvell: prestera: Add PCI interface support")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Vadym Kochan <vadym.kochan@plvision.eu>
Acked-by: Vadym Kochan <vadym.kochan@plvision.eu>
Link: https://lore.kernel.org/r/20201113113236.71678-1-wanghai38@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 15:09:54 -08:00
Grygorii Strashko
2b56687330 net: ethernet: ti: cpsw: fix cpts irq after suspend
Depending on the SoC/platform the CPSW can completely lose context after a
suspend/resume cycle, including CPSW wrapper (WR) which will cause reset of
WR_C0_MISC_EN register, so CPTS IRQ will became disabled.

Fix it by moving CPTS IRQ enabling in cpsw_ndo_open() where CPTS is
actually started.

Fixes: 84ea9c0a95 ("net: ethernet: ti: cpsw: enable cpts irq")
Reported-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20201112111546.20343-1-grygorii.strashko@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-13 14:20:42 -08:00
Zhang Changzhong
baee1991fa net: ethernet: mtk-star-emac: fix error return code in mtk_star_enable()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 8c7bd5a454 ("net: ethernet: mtk-star-emac: new driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Link: https://lore.kernel.org/r/1605180879-2573-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 17:58:37 -08:00
Vincent Stehlé
e8aa6d520b net: ethernet: mtk-star-emac: return ok when xmit drops
The ndo_start_xmit() method must return NETDEV_TX_OK if the DMA mapping
fails, after freeing the socket buffer.
Fix the mtk_star_netdev_start_xmit() function accordingly.

Fixes: 8c7bd5a454 ("net: ethernet: mtk-star-emac: new driver")
Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net>
Acked-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Link: https://lore.kernel.org/r/20201112084833.21842-1-vincent.stehle@laposte.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 17:05:15 -08:00
Jakub Kicinski
e1d9d7b913 Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 16:54:48 -08:00
Jiri Pirko
173f14cda3 mlxsw: spectrum_router: Introduce FIB entry update op
Follow-up patchset introducing XMDR implementation is going to need
to distinguish write and update ops. Therefore introduce "update op"
and call "write op" only when new FIB entry is inserted.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:22 -08:00
Jiri Pirko
a005a7fe2f mlxsw: spectrum_router: Track FIB entry committed state and skip uncommitted on delete
In case bulking is used, the entry that was previously added may not
be yet committed to the HW as it waits in the queue for bulk send. For
such entries, skip the deletion.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:22 -08:00
Jiri Pirko
ae9ce81aa7 mlxsw: spectrum_router: Introduce fib_entry priv for low-level ops
Prepare for the low-level ops that need to store some data alongside
the fib_entry and introduce a per-fib_entry priv for ll ops.
The priv is reference counted as in the follow-up patch it is going
to be saved in pack() function and used later on in commit() even in
case the related fib_entry gets freed in the middle.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:21 -08:00
Jiri Pirko
91d20d71b2 mlxsw: spectrum_router: Have FIB entry op context allocated for the instance
Get the max size needed for FIB entry op context and allocate it once
for the instance. Use it repeatedly from the scheduled work.
By this, allow to extend the context to hold more data than it is wise
to do when it was on the stack. Make sure to signalize that the context
needs to be initialized in case families of subsequent FIB entries differ.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:21 -08:00
Jiri Pirko
505cd65c66 mlxsw: spectrum_router: Prepare work context for possible bulking
For XMDR register it is possible to carry multiple FIB entry
operations in a single write. However the FW does not restrict mixing
the types of operations, make the code easier and indicate the bulking
is ok only in case the bulk contains FIB operations of the same family
and event.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:21 -08:00
Jiri Pirko
7f5c4090e4 mlxsw: spectrum: Push RALUE packing and writing into low-level router ops
With follow-up introduction of XM implementation, XMDR register is
going to be optionally used instead of RALUE register. Push the RALUE
packing helpers and write call into low-level router ops.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:21 -08:00
Jiri Pirko
1a9c21d5f7 mlxsw: spectrum_router: Use RALUE pack helper from abort function
Unify the RALUE register payload packing and use the
__mlxsw_sp_fib_entry_ralue_pack() helper from
__mlxsw_sp_router_set_abort_trap().

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:21 -08:00
Jiri Pirko
1a7fcdf75d mlxsw: reg: Allow to pass NULL pointer to mlxsw_reg_ralue_pack4/6()
In preparation for the change that is going to be done in the next
patch, allow to pass NULL pointer to mlxsw_reg_ralue_pack4() and
mlxsw_reg_ralue_pack6() helpers.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:20 -08:00
Jiri Pirko
0c1d6b2694 mlxsw: spectrum_router: Pass destination IP as a pointer to mlxsw_reg_ralue_pack4()
Instead of passing destination IP as a u32 value, pass it as pointer to
u32. Avoid using local variable for the pointer store.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:20 -08:00
Jiri Pirko
d271cf9f29 mlxsw: spectrum: Export RALUE pack helper and use it from IPIP
As the RALUE packing is going to be put into op, make the user from
IPIP code use the same helper as the router code does.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:20 -08:00
Jiri Pirko
0f6b66011a mlxsw: spectrum_router: Push out RALUE pack into separate helper
As the RALUE packing is going to be pushed into an op, in preparation
for that push the code into a separate function in the meantime.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:20 -08:00
Jiri Pirko
2d5bd7a111 mlxsw: spectrum: Propagate context from work handler containing RALUE payload
Currently, RALUE payload is defined locally in the function that is
calling the register write. With introduction of alternative register to
RALUE, XMDR, it has to be possible to put multiple FIB entry
operations into single register write.

So in order to prepare for that, have per-work entry operation context
and propagate it all the way down to the functions writing RALUE.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:19 -08:00
Jiri Pirko
c1b290d594 mlxsw: spectrum_router: Introduce FIB event queue instead of separate works
Currently, every FIB event is queued-up as a separate work to be
processed. However, that allows to process only one FIB entry per work
callback.

In preparation of future XMDR register bulking of multiple FIB entries,
convert to FIB event queue. Implement this by a list_head, adding new
events to the end of the list in the FIB notify callback. That allows to
process multiple events from the list inside the work callback.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:19 -08:00
Jiri Pirko
d57ff02286 mlxsw: spectrum_router: Use RALUE-independent op arg
Since the write/delete of FIB entry is going to be implemented by XMDR
register for XM implementation, introduce RALUE-independent enum for op
so the enum could be used in both RALUE and XMDR.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:19 -08:00
Jiri Pirko
69ba53e72b mlxsw: spectrum_router: Pass non-register proto enum to __mlxsw_sp_router_set_abort_trap()
Don't pass RALXX register enum and rather pass enum mlxsw_sp_l3proto
to __mlxsw_sp_router_set_abort_trap(). This is in preparation to fib
entry pack implementation by XMDR register.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 15:55:19 -08:00
Andrew Lunn
7958ba7e62 drivers: net: smsc: Add COMPILE_TEST support
Improve the build testing of these SMSC drivers by enabling them when
COMPILE_TEST is selected.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 14:49:40 -08:00
Andrew Lunn
6e4a930c40 drivers: net: smc911x: Fix cast from pointer to integer of different size
drivers/net/ethernet/smsc/smc911x.c: In function ‘smc911x_hardware_send_pkt’:
drivers/net/ethernet/smsc/smc911x.c:471:11: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
  471 |  cmdA = (((u32)skb->data & 0x3) << 16) |

When built on 64bit targets, the skb->data pointer cannot be cast to a
u32 in a meaningful way. Use uintptr_t instead.

Suggested-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 14:49:40 -08:00
Andrew Lunn
dd5fdb3f97 drivers: net: smc911x: Fix passing wrong number of parameters to DBG() macro
Now that the compiler always sees the parameters passed to the DBG()
macro, it gives an error message about wrong parameters. The comment
says it all:

	/* ndev is not valid yet, so avoid passing it in. */
	DBG(SMC_DEBUG_FUNC, "--> %s\n",  __func__);

You cannot not just pass a parameter!

The DBG does not seem to have any real value, to just remove it.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 14:49:40 -08:00
Andrew Lunn
40f6d1d915 drivers: net: smc911x: Fix set but unused status because of DBG macro
drivers/net/ethernet/smsc/smc911x.c: In function ‘smc911x_timeout’:
drivers/net/ethernet/smsc/smc911x.c:1251:6: warning: variable ‘status’ set but not used [-Wunused-but-set-variable]
 1251 |  int status, mask;

The status is read in order to print it via the DBG macro. However,
due to the way DBG is disabled, the compiler never sees it being used.

Change the DBG macro to actually make use of the passed parameters,
and the leave the optimiser to remove the unwanted code inside the
while (0).

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 14:49:39 -08:00
Andrew Lunn
6015e6f2ef drivers: net: smc911x: Work around set but unused status
drivers/net/ethernet/smsc/smc911x.c: In function ‘smc911x_phy_interrupt’:
drivers/net/ethernet/smsc/smc911x.c:976:6: warning: variable ‘status’ set but not used [-Wunused-but-set-variable]
  976 |  int status;

A comment indicates the status needs to be read from the PHY,
otherwise bad things happen. But due to the macro magic, it is hard to
perform the read without assigning it to a variable. So add
_always_unused attribute to status to tell the compiler we don't
expect to use the value.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 14:49:39 -08:00
Andrew Lunn
606ddf1f04 drivers: net: smc91x: Fix missing kerneldoc reported by W=1
drivers/net/ethernet/smsc/smc91x.c:2199: warning: Function parameter or member 'dev' not described in 'try_toggle_control_gpio'
drivers/net/ethernet/smsc/smc91x.c:2199: warning: Function parameter or member 'desc' not described in 'try_toggle_control_gpio'
drivers/net/ethernet/smsc/smc91x.c:2199: warning: Function parameter or member 'name' not described in 'try_toggle_control_gpio'
drivers/net/ethernet/smsc/smc91x.c:2199: warning: Function parameter or member 'index' not described in 'try_toggle_control_gpio'
drivers/net/ethernet/smsc/smc91x.c:2199: warning: Function parameter or member 'value' not described in 'try_toggle_control_gpio'
drivers/net/ethernet/smsc/smc91x.c:2199: warning: Function parameter or member 'nsdelay' not described in 'try_toggle_control_gpio'

Document these parameters.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 14:49:39 -08:00
Andrew Lunn
5b320b5343 drivers: net: smc91x: Fix set but unused W=1 warning
drivers/net/ethernet/smsc/smc91x.c:706:51: warning: variable ‘pkt_len’ set but not used [-Wunused-but-set-variable]
  706 |  unsigned int saved_packet, packet_no, tx_status, pkt_len;

The read of the packet length in the descriptor probably needs to be
kept in order to keep the hardware happy. So tell the compiler we
don't expect to use the value by using the __always_unused attribute.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 14:49:39 -08:00
Andrew Lunn
03dfd15767 drivers: net: xilinx_emaclite: Add COMPILE_TEST support
To improve build testing of this driver, add COMPILE_TEST support.

Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 14:32:31 -08:00
Andrew Lunn
eccd540381 drivers: net: xilinx_emaclite: Fix -Wpointer-to-int-cast warnings with W=1
drivers/net/ethernet//xilinx/xilinx_emaclite.c:341:35: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
  341 |   addr = (void __iomem __force *)((u32 __force)addr ^

Use uintptr_t instead of u32 to avoid problems on 64 bit systems.

Also, cast the address to an unsigned long for printing.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 14:32:31 -08:00
Andrew Lunn
27b4255798 drivers: net: xilinx_emaclite: Add missing parameter kerneldoc
The txqueue parameter to the watchdog callback is unused in this
driver. But it still needs to be documented.

Reviewed-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 14:32:30 -08:00
YueHaibing
95530a59db nfp: Fix passing zero to 'PTR_ERR'
nfp_cpp_from_nfp6000_pcie() returns ERR_PTR() and never returns
NULL. The NULL test should be removed, also return correct err.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Link: https://lore.kernel.org/r/20201112145852.6580-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 10:11:12 -08:00
Sven Van Asbroeck
edbc21113b lan743x: fix use of uninitialized variable
When no devicetree is present, the driver will use an
uninitialized variable.

Fix by initializing this variable.

Fixes: 902a66e08c ("lan743x: correctly handle chips with internal PHY")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Link: https://lore.kernel.org/r/20201112152513.1941-1-TheSven73@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 10:03:16 -08:00
Jakub Kicinski
8a5c2906c5 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2020-11-10

This series contains updates to i40e and igc drivers and the MAINTAINERS
file.

Slawomir fixes updating VF MAC addresses to fix various issues related
to reporting and setting of these addresses for i40e.

Dan Carpenter fixes a possible used before being initialized issue for
i40e.

Vinicius fixes reporting of netdev stats for igc.

Tony updates repositories for Intel Ethernet Drivers.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  MAINTAINERS: Update repositories for Intel Ethernet Drivers
  igc: Fix returning wrong statistics
  i40e, xsk: uninitialized variable in i40e_clean_rx_irq_zc()
  i40e: Fix MAC address setting for a VF via Host/VM
====================

Link: https://lore.kernel.org/r/20201111001955.533210-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-12 08:47:23 -08:00
Andrew Lunn
0575bedd6a drivers: net: sky2: Fix -Wstringop-truncation with W=1
In function ‘strncpy’,
    inlined from ‘sky2_name’ at drivers/net/ethernet/marvell/sky2.c:4903:3,
    inlined from ‘sky2_probe’ at drivers/net/ethernet/marvell/sky2.c:5049:2:
./include/linux/string.h:297:30: warning: ‘__builtin_strncpy’ specified bound 16 equals destination size [-Wstringop-truncation]

None of the device names are 16 characters long, so it was never an
issue. But replace the strncpy with an snprintf() to prevent the
theoretical overflow.

Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://lore.kernel.org/r/20201110023222.1479398-1-andrew@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 17:42:24 -08:00
Rohit Maheshwari
83a95df04b ch_ktls: stop the txq if reaches threshold
Stop the queue and ask for the credits if queue reaches to
threashold.

Fixes: 5a4b9fe7fe ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 16:30:38 -08:00
Rohit Maheshwari
7d01c428c8 ch_ktls: tcb update fails sometimes
context id and port id should be filled while sending tcb update.

Fixes: 5a4b9fe7fe ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 16:30:38 -08:00
Rohit Maheshwari
21f82acbb8 ch_ktls/cxgb4: handle partial tag alone SKBs
If TCP congestion caused a very small packets which only has some
part fo the TAG, and that too is not till the end. HW can't handle
such case, so falling back to sw crypto in such cases.

v1->v2:
- Marked chcr_ktls_sw_fallback() static.

Fixes: dc05f3df8f ("chcr: Handle first or middle part of record")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 16:30:38 -08:00
Rohit Maheshwari
659bf0383d ch_ktls: don't free skb before sending FIN
If its a last packet and fin is set. Make sure FIN is informed
to HW before skb gets freed.

Fixes: 429765a149 ("chcr: handle partial end part of a record")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 16:30:38 -08:00
Rohit Maheshwari
9478e08394 ch_ktls: packet handling prior to start marker
There could be a case where ACK for tls exchanges prior to start
marker is missed out, and by the time tls is offloaded. This pkt
should not be discarded and handled carefully. It could be
plaintext alone or plaintext + finish as well.

Fixes: 5a4b9fe7fe ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 16:30:38 -08:00
Rohit Maheshwari
63ee4591fa ch_ktls: Correction in middle record handling
If a record starts in middle, reset TCB UNA so that we could
avoid sending out extra packet which is needed to make it 16
byte aligned to start AES CTR.
Check also considers prev_seq, which should be what is
actually sent, not the skb data length.
Avoid updating partial TAG to HW at any point of time, that's
why we need to check if remaining part is smaller than TAG
size, then reset TX_MAX to be TAG starting sequence number.

Fixes: 5a4b9fe7fe ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 16:30:37 -08:00
Rohit Maheshwari
83deb094dd ch_ktls: missing handling of header alone
If an skb has only header part which doesn't start from
beginning, is not being handled properly.

Fixes: dc05f3df8f ("chcr: Handle first or middle part of record")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 16:30:37 -08:00
Rohit Maheshwari
c68a28a9e2 ch_ktls: Correction in trimmed_len calculation
trimmed length calculation goes wrong if skb has only tag part
to send. It should be zero if there is no data bytes apart from
TAG.

Fixes: dc05f3df8f ("chcr: Handle first or middle part of record")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 16:30:37 -08:00
Rohit Maheshwari
687823d2d1 cxgb4/ch_ktls: creating skbs causes panic
Creating SKB per tls record and freeing the original one causes
panic. There will be race if connection reset is requested. By
freeing original skb, refcnt will be decremented and that means,
there is no pending record to send, and so tls_dev_del will be
requested in control path while SKB of related connection is in
queue.
 Better approach is to use same SKB to send one record (partial
data) at a time. We still have to create a new SKB when partial
last part of a record is requested.
 This fix introduces new API cxgb4_write_partial_sgl() to send
partial part of skb. Present cxgb4_write_sgl can only provide
feasibility to start from an offset which limits to header only
and it can write sgls for the whole skb len. But this new API
will help in both. It can start from any offset and can end
writing in middle of the skb.

v4->v5:
- Removed extra changes.

Fixes: 429765a149 ("chcr: handle partial end part of a record")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 16:30:37 -08:00
Rohit Maheshwari
86716b51d1 ch_ktls: Update cheksum information
Checksum update was missing in the WR.

Fixes: 429765a149 ("chcr: handle partial end part of a record")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 16:30:37 -08:00
Rohit Maheshwari
b1b5cb1803 ch_ktls: Correction in finding correct length
There is a possibility of linear skbs coming in. Correcting
the length extraction logic.

v2->v3:
- Separated un-related changes from this patch.

Fixes: 5a4b9fe7fe ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 16:30:37 -08:00
Rohit Maheshwari
9d2e5e9eeb cxgb4/ch_ktls: decrypted bit is not enough
If skb has retransmit data starting before start marker, e.g. ccs,
decrypted bit won't be set for that, and if it has some data to
encrypt, then it must be given to crypto ULD. So in place of
decrypted, check if socket is tls offloaded. Also, unless skb has
some data to encrypt, no need to give it for tls offload handling.

v2->v3:
- Removed ifdef.

Fixes: 5a4b9fe7fe ("cxgb4/chcr: complete record tx handling")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 16:30:37 -08:00
Jisheng Zhang
a884915f4c net: stmmac: dwc-qos: Change the dwc_eth_dwmac_data's .probe prototype
The return pointer of dwc_eth_dwmac_data's .probe isn't used, and
"probe" usually return int, so change the prototype to follow standard
way. Secondly, it can simplify the tegra_eqos_probe() code.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Link: https://lore.kernel.org/r/20201109160440.3a736ee3@xhacker.debian
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-11 15:15:02 -08:00
Sven Van Asbroeck
2b52a4b65b lan743x: fix "BUG: invalid wait context" when setting rx mode
In the net core, the struct net_device_ops -> ndo_set_rx_mode()
callback is called with the dev->addr_list_lock spinlock held.

However, this driver's ndo_set_rx_mode callback eventually calls
lan743x_dp_write(), which acquires a mutex. Mutex acquisition
may sleep, and this is not allowed when holding a spinlock.

Fix by removing the dp_lock mutex entirely. Its purpose is to
prevent concurrent accesses to the data port. No concurrent
accesses are possible, because the dev->addr_list_lock
spinlock in the core only lets through one thread at a time.

Fixes: 23f0703c12 ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Link: https://lore.kernel.org/r/20201109203828.5115-1-TheSven73@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-10 17:52:54 -08:00
Sven Van Asbroeck
902a66e08c lan743x: correctly handle chips with internal PHY
Commit 6f197fb638 ("lan743x: Added fixed link and RGMII support")
assumes that chips with an internal PHY will never have a devicetree
entry. This is incorrect: even for these chips, a devicetree entry
can be useful e.g. to pass the mac address from bootloader to chip:

    &pcie {
            status = "okay";

            host@0 {
                    reg = <0 0 0 0 0>;

                    #address-cells = <3>;
                    #size-cells = <2>;

                    lan7430: ethernet@0 {
                            /* LAN7430 with internal PHY */
                            compatible = "microchip,lan743x";
                            status = "okay";
                            reg = <0 0 0 0 0>;
                            /* filled in by bootloader */
                            local-mac-address = [00 00 00 00 00 00];
                    };
            };
    };

If a devicetree entry is present, the driver will not attach the chip
to its internal phy, and the chip will be non-operational.

Fix by tweaking the phy connection algorithm:
- first try to connect to a phy specified in the devicetree
  (could be 'real' phy, or just a 'fixed-link')
- if that doesn't succeed, try to connect to an internal phy, even
  if the chip has a devnode

Tested on a LAN7430 with internal PHY. I cannot test a device using
fixed-link, as I do not have access to one.

Fixes: 6f197fb638 ("lan743x: Added fixed link and RGMII support")
Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # lan7430
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Link: https://lore.kernel.org/r/20201108171224.23829-1-TheSven73@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-10 15:46:52 -08:00
Kaixu Xia
1aa844b921 net: pch_gbe: remove unneeded variable retval in __pch_gbe_suspend
Fix the following coccicheck warning:

./drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:2415:5-11: Unneeded variable: "retval". Return "0" on line 2435

Reported-by: Tosk Robot <tencent_os_robot@tencent.com>
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Link: https://lore.kernel.org/r/1604837580-12419-1-git-send-email-kaixuxia@tencent.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-10 15:26:31 -08:00
Kaixu Xia
3ec94da976 net: atlantic: Remove unnecessary conversion to bool
The '!=' expression itself is bool, no need to convert it to bool.
Fix the following coccicheck warning:

./drivers/net/ethernet/aquantia/atlantic/aq_nic.c:1477:34-39: WARNING: conversion to bool not needed here

Reported-by: Tosk Robot <tencent_os_robot@tencent.com>
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Link: https://lore.kernel.org/r/1604797919-10157-1-git-send-email-kaixuxia@tencent.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-10 15:24:49 -08:00
Vinicius Costa Gomes
6b7ed22ae4 igc: Fix returning wrong statistics
'igc_update_stats()' was not updating 'netdev->stats', so the returned
statistics, for example, requested by:

$ ip -s link show dev enp3s0

were not being updated and were always zero.

Fix by returning a set of statistics that are actually being
updated (adapter->stats64).

Fixes: c9a11c23ce ("igc: Add netdev")
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-11-10 15:03:14 -08:00
Dan Carpenter
1773482fd8 i40e, xsk: uninitialized variable in i40e_clean_rx_irq_zc()
The "failure" variable is used without being initialized.  It should be
set to false.

Fixes: 8cbf741499 ("i40e, xsk: move buffer allocation out of the Rx processing loop")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-11-10 15:03:14 -08:00
Slawomir Laba
3a7001788f i40e: Fix MAC address setting for a VF via Host/VM
Fix MAC setting flow for the PF driver.

Update the unicast VF's MAC address in VF structure if it is
a new setting in i40e_vc_add_mac_addr_msg.

When unicast MAC address gets deleted, record that and
set the new unicast MAC address that is already waiting in the filter
list. This logic is based on the order of messages arriving to
the PF driver.

Without this change the MAC address setting was interpreted
incorrectly in the following use cases:
1) Print incorrect VF MAC or zero MAC
ip link show dev $pf
2) Don't preserve MAC between driver reload
rmmod iavf; modprobe iavf
3) Update VF MAC when macvlan was set
ip link add link $vf address $mac $vf.1 type macvlan
4) Failed to update mac address when VF was trusted
ip link set dev $vf address $mac

This includes all other configurations including above commands.

Fixes: f657a6e131 ("i40e: Fix VF driver MAC address configuration")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-11-10 15:03:06 -08:00
Kaixu Xia
785d21b826 net/mlx4: Assign boolean values to a bool variable
Fix the following coccinelle warnings:

./drivers/net/ethernet/mellanox/mlx4/en_rx.c:687:1-17: WARNING: Assignment of 0/1 to bool variable

Reported-by: Tosk Robot <tencent_os_robot@tencent.com>
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/1604732038-6057-1-git-send-email-kaixuxia@tencent.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-09 17:37:53 -08:00
Voon Weifeng
bff6f1db91 stmmac: intel: change all EHL/TGL to auto detect phy addr
Set all EHL/TGL phy_addr to -1 so that the driver will automatically
detect it at run-time by probing all the possible 32 addresses.

Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com>
Link: https://lore.kernel.org/r/20201106094341.4241-1-vee.khee.wong@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07 16:11:54 -08:00
Parshuram Thombare
0012eeb370 net: macb: fix NULL dereference due to no pcs_config method
This patch fixes NULL pointer dereference due to NULL pcs_config
in pcs_ops.

Fixes: e4e143e26c ("net: macb: add support for high speed interface")
Reported-by: Nicolas Ferre <Nicolas.Ferre@microchip.com>
Link: https://lore.kernel.org/netdev/2db854c7-9ffb-328a-f346-f68982723d29@microchip.com/
Signed-off-by: Parshuram Thombare <pthombar@cadence.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Link: https://lore.kernel.org/r/1604599113-2488-1-git-send-email-pthombar@cadence.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07 13:21:38 -08:00
Vadym Kochan
4e0396c595 net: marvell: prestera: fix compilation with CONFIG_BRIDGE=m
With CONFIG_BRIDGE=m the compilation fails:

    ld: drivers/net/ethernet/marvell/prestera/prestera_switchdev.o: in function `prestera_bridge_port_event':
    prestera_switchdev.c:(.text+0x2ebd): undefined reference to `br_vlan_enabled'

in case the driver is statically enabled.

Fix it by adding 'BRIDGE || BRIDGE=n' dependency.

Fixes: e1189d9a5f ("net: marvell: prestera: Add Switchdev driver implementation")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/r/20201106161128.24069-1-vadym.kochan@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07 12:43:26 -08:00
Jakub Kicinski
ee661a4abd mlx5-fixes-2020-11-03
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl+kXcMACgkQSD+KveBX
 +j61Zgf+IBCrevYpytPWmbrLnpWP6sVNkg0Plw8cmhRcLCCeN8mEMPIrhNDysmJu
 WOen+yVz7K65WEA5LQ1nUM+VY8PdI3sAre81rTrfyEm2Nkd0aO22BoWSESrNZowf
 XDNwn4lEJ5MToxvozfV158Gi921EwGtB/JKS6If7Cyjf+Ok8+ie8ayGfvPi7SiNK
 +2Bnai7RJSeYYt8xZjvvg17b1+ZbTeoJF/waEXpFsM/BWVKg5ZOawcfRKr4MuvEJ
 4WDzCjdaoeDZ8T6cppuWnznpoAeEf6Avq5DG/wMnUQxXRk9X5i043jr1Iydo5Bj9
 MvE31Mg9v15CecaKBldVs4cMSv86gw==
 =pwvB
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2020-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2020-11-03

v1->v2:
 - Fix fixes line tag in patch #1
 - Toss ktls refcount leak fix, Maxim will look further into the root
   cause.
 - Toss eswitch chain 0 prio patch, until we determine if it is needed
   for -rc and net.

* tag 'mlx5-fixes-2020-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Fix incorrect access of RCU-protected xdp_prog
  net/mlx5e: Fix VXLAN synchronization after function reload
  net/mlx5: E-switch, Avoid extack error log for disabled vport
  net/mlx5: Fix deletion of duplicate rules
  net/mlx5e: Use spin_lock_bh for async_icosq_lock
  net/mlx5e: Protect encap route dev from concurrent release
  net/mlx5e: Fix modify header actions memory leak
====================

Link: https://lore.kernel.org/r/20201105202129.23644-1-saeedm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07 12:27:26 -08:00
Heiner Kallweit
847f0a2bfd r8169: disable hw csum for short packets on all chip versions
RTL8125B has same or similar short packet hw padding bug as RTL8168evl.
The main workaround has been extended accordingly, however we have to
disable also hw checksumming for short packets on affected new chip
versions. Instead of checking for an affected chip version let's
simply disable hw checksumming for short packets in general.

v2:
- remove the version checks and disable short packet hw csum in general
- reflect this in commit title and message

Fixes: 0439297be9 ("r8169: add support for RTL8125B")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/7fbb35f0-e244-ef65-aa55-3872d7d38698@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07 12:14:35 -08:00
Heiner Kallweit
cc6528bc9a r8169: fix potential skb double free in an error path
The caller of rtl8169_tso_csum_v2() frees the skb if false is returned.
eth_skb_pad() internally frees the skb on error what would result in a
double free. Therefore use __skb_put_padto() directly and instruct it
to not free the skb on error.

Fixes: b423e9ae49 ("r8169: fix offloaded tx checksum for small packets.")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/f7e68191-acff-9ded-4263-c016428a8762@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07 12:10:16 -08:00
Kaixu Xia
ea8146c684 cxgb4: Fix the -Wmisleading-indentation warning
Fix the gcc warning:

drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c:2673:9: warning: this 'for' clause does not guard... [-Wmisleading-indentation]
 2673 |         for (i = 0; i < n; ++i) \

Reported-by: Tosk Robot <tencent_os_robot@tencent.com>
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Link: https://lore.kernel.org/r/1604467444-23043-1-git-send-email-kaixuxia@tencent.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07 11:56:03 -08:00
Clayton Rayment
253761a0e6 net: xilinx: axiethernet: Enable dynamic MDIO MDC
MDIO spec does not require an MDC at all times, only when MDIO
transactions are occurring. This patch allows the xilinx_axienet
driver to disable the MDC when not in use, and re-enable it when
needed. It also simplifies the driver by removing MDC disable
and enable in device reset sequence.

Signed-off-by: Clayton Rayment <clayton.rayment@xilinx.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07 11:13:52 -08:00
Radhey Shyam Pandey
6c3cbaa0f0 net: xilinx: axiethernet: Introduce helper functions for MDC enable/disable
Introduce helper functions to enable/disable MDIO interface clock. This
change serves a preparatory patch for the coming feature to dynamically
control the management bus clock.

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07 11:13:51 -08:00
Jakub Kicinski
ae0d0bb29b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-06 17:33:38 -08:00
Dany Madden
9f32c27eb4 Revert ibmvnic merge do_change_param_reset into do_reset
This reverts commit 16b5f5ce35
where it restructures do_reset. There are patches being tested that
would require major rework if this is committed first.

We will resend this after the other patches have been applied.

Signed-off-by: Dany Madden <drt@linux.ibm.com>
Link: https://lore.kernel.org/r/20201106191745.1679846-1-drt@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-06 17:27:50 -08:00
Linus Torvalds
41f1653024 Networking fixes for 5.10-rc3, including fixes from wireless, can,
and netfilter subtrees.
 
 Current release - bugs in new features:
 
  - can: isotp: isotp_rcv_cf(): enable RX timeout handling in
    listen-only mode
 
 Previous release - regressions:
 
  - mac80211:
    - don't require VHT elements for HE on 2.4 GHz
    - fix regression where EAPOL frames were sent in plaintext
 
  - netfilter:
    - ipset: Update byte and packet counters regardless of whether
      they match
 
  - ip_tunnel: fix over-mtu packet send by allowing fragmenting even
    if inner packet has IP_DF (don't fragment) set in its header
    (when TUNNEL_DONT_FRAGMENT flag is not set on the tunnel dev)
 
  - net: fec: fix MDIO probing for some FEC hardware blocks
 
  - ip6_tunnel: set inner ipproto before ip6_tnl_encap to un-break
    gso support
 
  - sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian
    platforms, sparse-related fix used the wrong integer size
 
 Previous release - always broken:
 
  - netfilter: use actual socket sk rather than skb sk when routing
    harder
 
  - r8169: work around short packet hw bug on RTL8125 by padding frames
 
  - net: ethernet: ti: cpsw: disable PTPv1 hw timestamping
    advertisement, the hardware does not support it
 
  - chelsio/chtls: fix always leaking ctrl_skb and another leak caused
    by a race condition
 
  - fix drivers incorrectly writing into skbs on TX:
    - cadence: force nonlinear buffers to be cloned
    - gianfar: Account for Tx PTP timestamp in the skb headroom
    - gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP
 
  - can: flexcan:
    - remove FLEXCAN_QUIRK_DISABLE_MECR quirk for LS1021A
    - add ECC initialization for VF610 and LX2160A
    - flexcan_remove(): disable wakeup completely
 
  - can: fix packet echo functionality:
    - peak_canfd: fix echo management when loopback is on
    - make sure skbs are not freed in IRQ context in case they need
      to be dropped
    - always clone the skbs to make sure they have a reference on
      the socket, and prevent it from disappearing
    - fix real payload length return value for RTR frames
 
  - can: j1939: return failure on bind if netdev is down, rather than
    waiting indefinitely
 
 Misc:
 
  - IPv6: reply ICMP error if the first fragment don't include all
    headers to improve compliance with RFC 8200
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl+kTDcACgkQMUZtbf5S
 IrtC9A//f9rwNFI7sRaz9FYi6ljtWY7paPxdOxy3pWRoNzbfffjTGSPheNvy1pQb
 IPaLsNwRrckQNSEPTbQqlUYcjzk1W74ffvq0sQOan4kNKxjX3uf78E6RuWARJsRC
 dLqfcJctO6bFi6sEMwIFZ2tLOO5lUIA+Pd0GbjhSdObWzl3uqJ26v7wC6vVk29vS
 116Mmhe8/TDVtCOzwlZnBPHqBJkTAirB+MAEX4Sp6FB9YirlcNZbWyHX5L6ejGqC
 WQVjU2tPBBugeo0j72tc+y0mD3iK0aLcPL+dk0EQQYHRDMVTebl+gxNPUXCo9Out
 HGe5z4e4qrR4Rx1W6MQ3pKwTYuCdwKjMRGd72JAi428/l4NN3y9W/HkI2Zuppd2l
 7ifURkNQllYjGCSoHBviJbajyFBeA1nkFJgMSJiRs4T167K3zTbsyjNnfa4LnsvS
 B3SrYMGqIH+oR20R9EoV8prVX+Alj1hh/jX02J8zsCcHmBqF2yZi17NarVAWoarm
 v/AAqehlP+D1vjAmbCG9DeborrjaNi+v6zFTKK6ZadvLXRJX/N+wEPIpG4KjiK8W
 DWKIVlee0R+kgCXE1n9AuZaZLWb7VwrAjkG1Pmfi3vkZhWeAhOW4X98ehhi/hVR/
 Gq+e48ZECW5yuOA1q4hbsCYkGr2qAn/LPbsXxhEmW8qwkJHZYkI=
 =5R2w
 -----END PGP SIGNATURE-----

Merge tag 'net-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes for 5.10-rc3, including fixes from wireless, can, and
  netfilter subtrees.

  Current merge window - bugs in new features:

   - can: isotp: isotp_rcv_cf(): enable RX timeout handling in
     listen-only mode

  Previous releases - regressions:

   - mac80211:
      - don't require VHT elements for HE on 2.4 GHz
      - fix regression where EAPOL frames were sent in plaintext

   - netfilter:
      - ipset: Update byte and packet counters regardless of whether
        they match

   - ip_tunnel: fix over-mtu packet send by allowing fragmenting even if
     inner packet has IP_DF (don't fragment) set in its header (when
     TUNNEL_DONT_FRAGMENT flag is not set on the tunnel dev)

   - net: fec: fix MDIO probing for some FEC hardware blocks

   - ip6_tunnel: set inner ipproto before ip6_tnl_encap to un-break gso
     support

   - sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian
     platforms, sparse-related fix used the wrong integer size

  Previous releases - always broken:

   - netfilter: use actual socket sk rather than skb sk when routing
     harder

   - r8169: work around short packet hw bug on RTL8125 by padding frames

   - net: ethernet: ti: cpsw: disable PTPv1 hw timestamping
     advertisement, the hardware does not support it

   - chelsio/chtls: fix always leaking ctrl_skb and another leak caused
     by a race condition

   - fix drivers incorrectly writing into skbs on TX:
      - cadence: force nonlinear buffers to be cloned
      - gianfar: Account for Tx PTP timestamp in the skb headroom
      - gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP

   - can: flexcan:
      - remove FLEXCAN_QUIRK_DISABLE_MECR quirk for LS1021A
      - add ECC initialization for VF610 and LX2160A
      - flexcan_remove(): disable wakeup completely

   - can: fix packet echo functionality:
      - peak_canfd: fix echo management when loopback is on
      - make sure skbs are not freed in IRQ context in case they need to
        be dropped
      - always clone the skbs to make sure they have a reference on the
        socket, and prevent it from disappearing
      - fix real payload length return value for RTR frames

   - can: j1939: return failure on bind if netdev is down, rather than
     waiting indefinitely

  Misc:

   - IPv6: reply ICMP error if the first fragment don't include all
     headers to improve compliance with RFC 8200"

* tag 'net-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
  ionic: check port ptr before use
  r8169: work around short packet hw bug on RTL8125
  net: openvswitch: silence suspicious RCU usage warning
  chelsio/chtls: fix always leaking ctrl_skb
  chelsio/chtls: fix memory leaks caused by a race
  can: flexcan: flexcan_remove(): disable wakeup completely
  can: flexcan: add ECC initialization for VF610
  can: flexcan: add ECC initialization for LX2160A
  can: flexcan: remove FLEXCAN_QUIRK_DISABLE_MECR quirk for LS1021A
  can: mcp251xfd: remove unneeded break
  can: mcp251xfd: mcp251xfd_regmap_nocrc_read(): fix semicolon.cocci warnings
  can: mcp251xfd: mcp251xfd_regmap_crc_read(): increase severity of CRC read error messages
  can: peak_canfd: pucan_handle_can_rx(): fix echo management when loopback is on
  can: peak_usb: peak_usb_get_ts_time(): fix timestamp wrapping
  can: peak_usb: add range checking in decode operations
  can: xilinx_can: handle failure cases of pm_runtime_get_sync
  can: ti_hecc: ti_hecc_probe(): add missed clk_disable_unprepare() in error path
  can: isotp: padlen(): make const array static, makes object smaller
  can: isotp: isotp_rcv_cf(): enable RX timeout handling in listen-only mode
  can: isotp: Explain PDU in CAN_ISOTP help text
  ...
2020-11-06 11:50:28 -08:00
Jakub Kicinski
c9448e828d mlx5-updates-2020-11-03
This series includes updates to mlx5 software steering component.
 
 1) Few improvements in the DR area, such as removing unneeded checks,
   renaming to better general names, refactor in some places, etc.
 
 2) Software steering (DR) Memory management improvements
 
 This patch series contains SW Steering memory management improvements:
 using buddy allocator instead of an existing bucket allocator, and
 several other optimizations.
 
 The buddy system is a memory allocation and management algorithm
 that manages memory in power of two increments.
 
 The algorithm is well-known and well-described, such as here:
 https://en.wikipedia.org/wiki/Buddy_memory_allocation
 
 Linux uses this algorithm for managing and allocating physical pages,
 as described here:
 https://www.kernel.org/doc/gorman/html/understand/understand009.html
 
 In our case, although the algorithm in principal is similar to the
 Linux physical page allocator, the "building blocks" and the circumstances
 are different: in SW steering, buddy allocator doesn't really allocates
 a memory, but rather manages ICM (Interconnect Context Memory) that was
 previously allocated and registered.
 
 The ICM memory that is used in SW steering is always power
 of 2 (order), so buddy system is a good fit for this.
 
 Patches in this series:
 
 [PATH 4] net/mlx5: DR, Add buddy allocator utilities
   This patch adds a modified implementation of a well-known buddy allocator,
   adjusted for SW steering needs: the algorithm in principal is similar to
   the Linux physical page allocator, but in our case buddy allocator doesn't
   really allocate a memory, but rather manages ICM memory that was previously
   allocated and registered.
 
 [PATH 5] net/mlx5: DR, Handle ICM memory via buddy allocation instead of bucket management
   This patch changes ICM management of SW steering to use buddy-system mechanism
   Instead of the previous bucket management.
 
 [PATH 6] net/mlx5: DR, Sync chunks only during free
   This patch makes syncing happen only when freeing memory chunks.
 
 [PATH 7] net/mlx5: DR, ICM memory pools sync optimization
   This patch adds tracking of pool's "hot" memory and makes the
   check whether steering sync is required much shorter and faster.
 
 [PATH 8] net/mlx5: DR, Free buddy ICM memory if it is unused
   This patch adds tracking buddy's used ICM memory,
   and frees the buddy if all its memory becomes unused.
 
 3) Misc code cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl+kW/sACgkQSD+KveBX
 +j7pRwf+KH6vIkwKlt0I2cYUmUINAkz0o1E82wXGx5Q81iMJLeIeMPKiatai6/0r
 BIYD8t8oagOVw5OU+H9DbVIR47wz6pB90bkQIrIJk0S3ocVcDfLlN0ssbwCdEDlH
 jj56SB6jjVVj1LlTXSVfUrXEKJ2FBjTxPbNtVL/NLW0GQkoJM5RaYjK++ZB58o+O
 YI1Gb2w+FT5vVHdRVxs/2a/NTy71VQOrYctkhql4/8P0SsstPvhgOD/oRU26l8Il
 Qpl68H/0N9KM04SlOlD91AwaWhPEIBP1qFSgRnfStmqxAfQlAfqajOoSntK8Vkz0
 0yRF7pVLELhzxx+IR/W9Vpdf7uzjeA==
 =rQcg
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2020-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-11-03

This series includes updates to mlx5 software steering component.

1) Few improvements in the DR area, such as removing unneeded checks,
  renaming to better general names, refactor in some places, etc.

2) Software steering (DR) Memory management improvements

This patch series contains SW Steering memory management improvements:
using buddy allocator instead of an existing bucket allocator, and
several other optimizations.

The buddy system is a memory allocation and management algorithm
that manages memory in power of two increments.

The algorithm is well-known and well-described, such as here:
https://en.wikipedia.org/wiki/Buddy_memory_allocation

Linux uses this algorithm for managing and allocating physical pages,
as described here:
https://www.kernel.org/doc/gorman/html/understand/understand009.html

In our case, although the algorithm in principal is similar to the
Linux physical page allocator, the "building blocks" and the circumstances
are different: in SW steering, buddy allocator doesn't really allocates
a memory, but rather manages ICM (Interconnect Context Memory) that was
previously allocated and registered.

The ICM memory that is used in SW steering is always power
of 2 (order), so buddy system is a good fit for this.

Patches in this series:

[PATH 4] net/mlx5: DR, Add buddy allocator utilities
  This patch adds a modified implementation of a well-known buddy allocator,
  adjusted for SW steering needs: the algorithm in principal is similar to
  the Linux physical page allocator, but in our case buddy allocator doesn't
  really allocate a memory, but rather manages ICM memory that was previously
  allocated and registered.

[PATH 5] net/mlx5: DR, Handle ICM memory via buddy allocation instead of bucket management
  This patch changes ICM management of SW steering to use buddy-system mechanism
  Instead of the previous bucket management.

[PATH 6] net/mlx5: DR, Sync chunks only during free
  This patch makes syncing happen only when freeing memory chunks.

[PATH 7] net/mlx5: DR, ICM memory pools sync optimization
  This patch adds tracking of pool's "hot" memory and makes the
  check whether steering sync is required much shorter and faster.

[PATH 8] net/mlx5: DR, Free buddy ICM memory if it is unused
  This patch adds tracking buddy's used ICM memory,
  and frees the buddy if all its memory becomes unused.

3) Misc code cleanups

* tag 'mlx5-updates-2020-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net: mlx5: Replace in_irq() usage
  net/mlx5: Cleanup kernel-doc warnings
  net/mlx4: Cleanup kernel-doc warnings
  net/mlx5e: Validate stop_room size upon user input
  net/mlx5: DR, Free unused buddy ICM memory
  net/mlx5: DR, ICM memory pools sync optimization
  net/mlx5: DR, Sync chunks only during free
  net/mlx5: DR, Handle ICM memory via buddy allocation instead of buckets
  net/mlx5: DR, Add buddy allocator utilities
  net/mlx5: DR, Rename matcher functions to be more HW agnostic
  net/mlx5: DR, Rename builders HW specific names
  net/mlx5: DR, Remove unused member of action struct
====================

Link: https://lore.kernel.org/r/20201105201242.21716-1-saeedm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-05 18:01:31 -08:00
Maxim Mikityanskiy
1a50cf9a67 net/mlx5e: Fix incorrect access of RCU-protected xdp_prog
rq->xdp_prog is RCU-protected and should be accessed only with
rcu_access_pointer for the NULL check in mlx5e_poll_rx_cq.

rq->xdp_prog may change on the fly only from one non-NULL value to
another non-NULL value, so the checks in mlx5e_xdp_handle and
mlx5e_poll_rx_cq will have the same result during one NAPI cycle,
meaning that no additional synchronization is needed.

Fixes: fe45386a20 ("net/mlx5e: Use RCU to protect rq->xdp_prog")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:17:06 -08:00
Aya Levin
c5eb51adf0 net/mlx5e: Fix VXLAN synchronization after function reload
During driver reload, perform firmware tear-down which results in
firmware losing the configured VXLAN ports. These ports are still
available in the driver's database. Fix this by cleaning up driver's
VXLAN database in the nic unload flow, before firmware tear-down. With
that, minimize mlx5_vxlan_destroy() to remove only what was added in
mlx5_vxlan_create() and warn on leftover UDP ports.

Fixes: 18a2b7f969 ("net/mlx5: convert to new udp_tunnel infrastructure")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:17:06 -08:00
Parav Pandit
ae35859445 net/mlx5: E-switch, Avoid extack error log for disabled vport
When E-switch vport is disabled, querying its hardware address is
unsupported.
Avoid setting extack error log message in such case.

Fixes: f099fde16d ("net/mlx5: E-switch, Support querying port function mac address")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:17:06 -08:00
Maor Gottlieb
465e7baab6 net/mlx5: Fix deletion of duplicate rules
When a rule is duplicated, the refcount of the rule is increased so only
the second deletion of the rule should cause destruction of the FTE.
Currently, the FTE will be destroyed in the first deletion of rule since
the modify_mask will be 0.
Fix it and call to destroy FTE only if all the rules (FTE's children)
have been removed.

Fixes: 718ce4d601 ("net/mlx5: Consolidate update FTE for all removal changes")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:17:06 -08:00
Maxim Mikityanskiy
f42139ba49 net/mlx5e: Use spin_lock_bh for async_icosq_lock
async_icosq_lock may be taken from softirq and non-softirq contexts. It
requires protection with spin_lock_bh, otherwise a softirq may be
triggered in the middle of the critical section, and it may deadlock if
it tries to take the same lock. This patch fixes such a scenario by
using spin_lock_bh to disable softirqs on that CPU while inside the
critical section.

Fixes: 8d94b590f1 ("net/mlx5e: Turn XSK ICOSQ into a general asynchronous one")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:17:05 -08:00
Vlad Buslov
78c906e430 net/mlx5e: Protect encap route dev from concurrent release
In functions mlx5e_route_lookup_ipv{4|6}() route_dev can be arbitrary net
device and not necessary mlx5 eswitch port representor. As such, in order
to ensure that route_dev is not destroyed concurrent the code needs either
explicitly take reference to the device before releasing reference to
rtable instance or ensure that caller holds rtnl lock. First approach is
chosen as a fix since rtnl lock dependency was intentionally removed from
mlx5 TC layer.

To prevent unprotected usage of route_dev in encap code take a reference to
the device before releasing rt. Don't save direct pointer to the device in
mlx5_encap_entry structure and use ifindex instead. Modify users of
route_dev pointer to properly obtain the net device instance from its
ifindex.

Fixes: 61086f3910 ("net/mlx5e: Protect encap hash table with mutex")
Fixes: 6707f74be8 ("net/mlx5e: Update hw flows when encap source mac changed")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:17:05 -08:00
Maor Dickman
e68e28b4a9 net/mlx5e: Fix modify header actions memory leak
Modify header actions are allocated during parse tc actions and only
freed during the flow creation, however, on error flow the allocated
memory is wrongly unfreed.

Fix this by calling dealloc_mod_hdr_actions in __mlx5e_add_fdb_flow
and mlx5e_add_nic_flow error flow.

Fixes: d7e75a325c ("net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) actions")
Fixes: 2f4fe4cab0 ("net/mlx5e: Add offloading of NIC TC pedit (header re-write) actions")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:17:05 -08:00
Sebastian Andrzej Siewior
5144368571 net: mlx5: Replace in_irq() usage
mlx5_eq_async_int() uses in_irq() to decide whether eq::lock needs to be
acquired and released with spin_[un]lock() or the irq saving/restoring
variants.

The usage of in_*() in drivers is phased out and Linus clearly requested
that code which changes behaviour depending on context should either be
seperated or the context be conveyed in an argument passed by the caller,
which usually knows the context.

mlx5_eq_async_int() knows the context via the action argument already so
using it for the lock variant decision is a straight forward replacement
for in_irq().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:09:31 -08:00
Saeed Mahameed
6c6132032d net/mlx5: Cleanup kernel-doc warnings
$ git ls-files *.[ch] | egrep drivers/net/ethernet/mellanox/ | \
        xargs scripts/kernel-doc -none

drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.h:57:
warning: Enum value 'MLX5_FPGA_ACCESS_TYPE_I2C' not described ...
drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.h:57:
warning: Enum value 'MLX5_FPGA_ACCESS_TYPE_DONTCARE' not described ...
drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.h:118:
warning: Function parameter or member 'cb_arg' not described ...
drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.h:160:
warning: Function parameter or member 'conn' not described ...
drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.h:160:
warning: Excess function parameter 'fdev' description ...

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reported-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
2020-11-05 12:09:31 -08:00
Saeed Mahameed
7c36e785d6 net/mlx4: Cleanup kernel-doc warnings
$ git ls-files *.[ch] | egrep drivers/net/ethernet/mellanox/ | \
	xargs scripts/kernel-doc -none

drivers/net/ethernet/mellanox/mlx4/fw_qos.h:144:
warning: Function parameter or member 'in_param' not described ...
drivers/net/ethernet/mellanox/mlx4/fw_qos.h:144:
warning: Excess function parameter 'out_param' description ...

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reported-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
2020-11-05 12:09:30 -08:00
Vladyslav Tarasiuk
579524c6ea net/mlx5e: Validate stop_room size upon user input
Stop room is a space that may be taken by WQEs in the SQ during a packet
transmit. It is used to check if next packet has enough room in the SQ.
Stop room guarantees this packet can be served and if not, the queue is
stopped, so no more packets are passed to the driver until it's ready.

Currently, stop_room size is calculated and validated upon tx queues
allocation. This makes it impossible to know if user provided valid
input for certain parameters when interface is down.

Instead, store stop_room in mlx5e_sq_param and create
mlx5e_validate_params(), to validate its fields upon user input even
when the interface is down.

Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:09:30 -08:00
Yevgeny Kliteynik
284836d966 net/mlx5: DR, Free unused buddy ICM memory
Track buddy's used ICM memory, and free it if all
of the buddy's memory bacame unused.
Do this only for STEs.
MODIFY_ACTION buddies are much smaller, so in case there
is a large amount of modify_header actions, which result
in large amount of MODIFY_ACTION buddies, doing this
cleanup during sync will result in performance hit while
not freeing significant amount of memory.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:09:30 -08:00
Yevgeny Kliteynik
1c58651412 net/mlx5: DR, ICM memory pools sync optimization
Track the pool's hot ICM memory when freeing/allocating
chunk, so that when checking if the sync is required, just
check if the pool hot memory has reached the sync threshold.

Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:09:30 -08:00
Yevgeny Kliteynik
3eb1006a3b net/mlx5: DR, Sync chunks only during free
When freeing chunks, we want to sync the steering
so that all the "hot" memory will be written to ICM
and all the chunks that are in the hot_list will be
actually destroyed.
When allocating from the pool, we don't have a need
to sync the steering, as we're not freeing anything,
and sync might just hurt the performance in terms of
flow-per-second offloaded.

Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:09:29 -08:00
Yevgeny Kliteynik
a00cd87880 net/mlx5: DR, Handle ICM memory via buddy allocation instead of buckets
Till now in order to manage the ICM memory we used bucket
mechanism, which kept a bucket per specified size (sizes were
between 1 block to 2^21 blocks).

Now changing that with buddy-system mechanism, which gives us much
more flexible way to manage the ICM memory.
Its biggest advantage over the bucket is by using the same ICM memory
area for all the sizes of blocks, which reduces the memory consumption.

Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:09:29 -08:00
Yevgeny Kliteynik
3b72422dea net/mlx5: DR, Add buddy allocator utilities
Add implementation of SW Steering variation of buddy allocator.

The buddy system for ICM memory uses 2 main data structures:
- Bitmap per order, that keeps the current state of allocated
  blocks for this order
- Indicator for the number of available blocks per each order

Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:09:29 -08:00
Yevgeny Kliteynik
8a8a102300 net/mlx5: DR, Rename matcher functions to be more HW agnostic
Remove flex parser from the matcher function names since
the matcher should not be aware of such HW specific details.

Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:09:28 -08:00
Yevgeny Kliteynik
de1facaf56 net/mlx5: DR, Rename builders HW specific names
We will support multiple STE versions.
The existing naming is not suitable for newer versions.
Removed the HW specific details and renamed with a more
general names.

Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:09:28 -08:00
Yevgeny Kliteynik
77662e75e0 net/mlx5: DR, Remove unused member of action struct
Struct mlx5dr_action doesn't use this member

Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-05 12:09:28 -08:00
Shannon Nelson
2bcbf42add ionic: check port ptr before use
Check for corner case of port_init failure before using
the port_info pointer.

Fixes: 4d03e00a21 ("ionic: Add initial ethtool support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Link: https://lore.kernel.org/r/20201104195606.61184-1-snelson@pensando.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-05 09:58:25 -08:00
Heiner Kallweit
2aaf09a0e7 r8169: work around short packet hw bug on RTL8125
Network problems with RTL8125B have been reported [0] and with help
from Realtek it turned out that this chip version has a hw problem
with short packets (similar to RTL8168evl). Having said that activate
the same workaround as for RTL8168evl.
Realtek suggested to activate the workaround for RTL8125A too, even
though they're not 100% sure yet which RTL8125 versions are affected.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=209839

Fixes: 0439297be9 ("r8169: add support for RTL8125B")
Reported-by: Maxim Plotnikov <wgh@torlan.ru>
Tested-by: Maxim Plotnikov <wgh@torlan.ru>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/8002c31a-60b9-58f1-f0dd-8fd07239917f@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-04 17:40:22 -08:00
Claudiu Manoil
82728b91f1 enetc: Remove Tx checksumming offload code
Tx checksumming has been defeatured and completely removed
from the h/w reference manual. Made a little cleanup for the
TSE case as this is complementary code.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20201103140213.3294-1-claudiu.manoil@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-04 17:35:01 -08:00
Zou Wei
ebcaa207b4 dpaa_eth: use false and true for bool variables
Fix coccicheck warnings:

./dpaa_eth.c:2549:2-22: WARNING: Assignment of 0/1 to bool variable
./dpaa_eth.c:2562:2-22: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://lore.kernel.org/r/1604405100-33255-1-git-send-email-zou_wei@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-04 17:24:00 -08:00
Vinay Kumar Yadav
a74e44a111 chelsio/chtls: Utilizing multiple rxq/txq to process requests
patch adds a logic to utilize multiple queues to process requests.
The queue selection logic uses a round-robin distribution technique
using a counter.

Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Link: https://lore.kernel.org/r/20201102162832.22344-1-vinay.yadav@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-03 17:54:23 -08:00
Colin Ian King
873b807c98 octeontx2-pf: Fix sizeof() mismatch
An incorrect sizeof() is being used, sizeof(u64 *) is not correct,
it should be sizeof(*sq->sqb_ptrs).

Addresses-Coverity: ("Sizeof not portable (SIZEOF_MISMATCH)")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20201102134601.698436-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-03 17:51:56 -08:00
kernel test robot
4c4ac83177 forcedeth: fix excluded_middle.cocci warnings
Condition !A || A && B is equivalent to !A || B.

Generated by: scripts/coccinelle/misc/excluded_middle.cocci

Fixes: b76f0ea013 ("coccinelle: misc: add excluded_middle.cocci script")
CC: Denis Efremov <efremov@linux.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2011020936100.3077@hadrien
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-03 17:47:04 -08:00
Sebastian Andrzej Siewior
abba4b16fd net: dpaa: Replace in_irq() usage.
The driver uses in_irq() + in_serving_softirq() magic to decide if NAPI
scheduling is required or packet processing.

The usage of in_*() in drivers is phased out and Linus clearly requested
that code which changes behaviour depending on context should either be
separated or the context be conveyed in an argument passed by the caller,
which usually knows the context.

Use the `sched_napi' argument passed by the callback. It is set true if
called from the interrupt handler and NAPI should be scheduled.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Aymen Sghaier <aymen.sghaier@nxp.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Tested-by: Camelia Groza <camelia.groza@nxp.com>
2020-11-03 17:41:24 -08:00
Sebastian Andrzej Siewior
f84754dbc5 soc/fsl/qbman: Add an argument to signal if NAPI processing is required.
dpaa_eth_napi_schedule() and caam_qi_napi_schedule() schedule NAPI if
invoked from:

 - Hard interrupt context
 - Any context which is not serving soft interrupts

Any context which is not serving soft interrupts includes hard interrupts
so the in_irq() check is redundant. caam_qi_napi_schedule() has a comment
about this:

        /*
         * In case of threaded ISR, for RT kernels in_irq() does not return
         * appropriate value, so use in_serving_softirq to distinguish between
         * softirq and irq contexts.
         */
         if (in_irq() || !in_serving_softirq())

This has nothing to do with RT. Even on a non RT kernel force threaded
interrupts run obviously in thread context and therefore in_irq() returns
false when invoked from the handler.

The extension of the in_irq() check with !in_serving_softirq() was there
when the drivers were added, but in the out of tree FSL BSP the original
condition was in_irq() which got extended due to failures on RT.

The usage of in_xxx() in drivers is phased out and Linus clearly requested
that code which changes behaviour depending on context should either be
separated or the context be conveyed in an argument passed by the caller,
which usually knows the context. Right he is, the above construct is
clearly showing why.

The following callchains have been analyzed to end up in
dpaa_eth_napi_schedule():

qman_p_poll_dqrr()
  __poll_portal_fast()
    fq->cb.dqrr()
       dpaa_eth_napi_schedule()

portal_isr()
  __poll_portal_fast()
    fq->cb.dqrr()
       dpaa_eth_napi_schedule()

Both need to schedule NAPI.
The crypto part has another code path leading up to this:
  kill_fq()
     empty_retired_fq()
       qman_p_poll_dqrr()
         __poll_portal_fast()
            fq->cb.dqrr()
               dpaa_eth_napi_schedule()

kill_fq() is called from task context and ends up scheduling NAPI, but
that's pointless and an unintended side effect of the !in_serving_softirq()
check.

The code path:
  caam_qi_poll() -> qman_p_poll_dqrr()

is invoked from NAPI and I *assume* from crypto's NAPI device and not
from qbman's NAPI device. I *guess* it is okay to skip scheduling NAPI
(because this is what happens now) but could be changed if it is wrong
due to `budget' handling.

Add an argument to __poll_portal_fast() which is true if NAPI needs to be
scheduled. This requires propagating the value to the caller including
`qman_cb_dqrr' typedef which is used by the dpaa and the crypto driver.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Aymen Sghaier <aymen.sghaier@nxp.com>
Cc: Herbert XS <herbert@gondor.apana.org.au>
Cc: Li Yang <leoyang.li@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Tested-by: Camelia Groza <camelia.groza@nxp.com>
2020-11-03 17:41:03 -08:00
Sergej Bauer
e9e13b6adc lan743x: fix for potential NULL pointer dereference with bare card
This is the 3rd revision of the patch fix for potential null pointer dereference
with lan743x card.

The simpliest way to reproduce: boot with bare lan743x and issue "ethtool ethN"
commant where ethN is the interface with lan743x card. Example:

$ sudo ethtool eth7
dmesg:
[  103.510336] BUG: kernel NULL pointer dereference, address: 0000000000000340
...
[  103.510836] RIP: 0010:phy_ethtool_get_wol+0x5/0x30 [libphy]
...
[  103.511629] Call Trace:
[  103.511666]  lan743x_ethtool_get_wol+0x21/0x40 [lan743x]
[  103.511724]  dev_ethtool+0x1507/0x29d0
[  103.511769]  ? avc_has_extended_perms+0x17f/0x440
[  103.511820]  ? tomoyo_init_request_info+0x84/0x90
[  103.511870]  ? tomoyo_path_number_perm+0x68/0x1e0
[  103.511919]  ? tty_insert_flip_string_fixed_flag+0x82/0xe0
[  103.511973]  ? inet_ioctl+0x187/0x1d0
[  103.512016]  dev_ioctl+0xb5/0x560
[  103.512055]  sock_do_ioctl+0xa0/0x140
[  103.512098]  sock_ioctl+0x2cb/0x3c0
[  103.512139]  __x64_sys_ioctl+0x84/0xc0
[  103.512183]  do_syscall_64+0x33/0x80
[  103.512224]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  103.512274] RIP: 0033:0x7f54a9cba427
...

Previous versions can be found at:
v1:
initial version
    https://lkml.org/lkml/2020/10/28/921

v2:
do not return from lan743x_ethtool_set_wol if netdev->phydev == NULL, just skip
the call of phy_ethtool_set_wol() instead.
    https://lkml.org/lkml/2020/10/31/380

v3:
in function lan743x_ethtool_set_wol:
use ternary operator instead of if-else sentence (review by Markus Elfring)
return -ENETDOWN insted of -EIO (review by Andrew Lunn)

Signed-off-by: Sergej Bauer <sbauer@blackbox.su>

Link: https://lore.kernel.org/r/20201101223556.16116-1-sbauer@blackbox.su
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-03 17:34:42 -08:00
Heiner Kallweit
870f531e17 r8169: set IRQF_NO_THREAD if MSI(X) is enabled
We had to remove flag IRQF_NO_THREAD because it conflicts with shared
interrupts in case legacy interrupts are used. Following up on the
linked discussion set IRQF_NO_THREAD if MSI or MSI-X is used, because
both guarantee that interrupt won't be shared.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://www.spinics.net/lists/netdev/msg695341.html
Link: https://lore.kernel.org/r/446cf5b8-dddd-197f-cb96-66783141ade4@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-03 17:34:27 -08:00
Heiner Kallweit
f06059c244 r8169: align number of tx descriptors with vendor driver
Lowest number of tx descriptors used in the vendor drivers is 256 in
r8169. r8101/r8168/r8125 use 1024 what seems to be the hw limit. Stay
on the safe side and go with 256, same as number of rx descriptors.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/a52a6de4-f792-5038-ae2f-240d3b7860eb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-03 17:34:11 -08:00
Jiri Pirko
803be1085d mlxsw: spectrum_router: Introduce low-level ops and implement them for RALXX regs
In preparation for support of XM router implementation which uses
different registers to work with trees and FIB entries, introduce
a structure to hold low-level ops and implement tree manipulation
register ops.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-03 17:27:15 -08:00
Jiri Pirko
fb281f24f8 mlxsw: reg: Add XRALXX Registers
Add a couple of registers used to manipulate LPM trees on XM:
The XRALTA is used to allocate the XLT LPM trees.
The XRALST is used to set and query the structure of an XLT LPM tree.
The XRALTB register is used to bind virtual router and protocol to
an allocated LPM tree.

Since the XM registers are identical to the legacy router registers
with a fixed offset, re-use their pack functions.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-03 17:27:15 -08:00
Lijun Pan
16b5f5ce35 ibmvnic: merge do_change_param_reset into do_reset
Commit b27507bb59 ("net/ibmvnic: unlock rtnl_lock in reset so
linkwatch_event can run") introduced do_change_param_reset function to
solve the rtnl lock issue. Majority of the code in do_change_param_reset
duplicates do_reset. Also, we can handle the rtnl lock issue in do_reset
itself. Hence merge do_change_param_reset back into do_reset to clean up
the code.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Link: https://lore.kernel.org/r/20201031094645.17255-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-03 15:08:14 -08:00