Trivial fix to spelling mistakes in dev_dbg warning messages
"Reloade" -> "Reload"
"chang" -> "change"
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net'
overlapped the renaming of a netlink attribute in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
Today mlx5 devices support two teardown modes:
1- Regular teardown
2- Force teardown
This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the pages.
Fast teardown provides the following advantages:
1- Fix a FW race condition that could cause command timeout
2- Avoid moving to polling mode
3- Close the vport to prevent PCI ACK to be sent without been scatter
to memory
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Count aRFS rules insertion failure for ethtool output. In addition, move
the error print into debug prints mechanism, as it could flood the dmesg
and reduce system BW dramatically.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Return tc extack messages for failures to user space.
Messages provide reasons for not being able to offload rules to HW.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Return extack messages for failures in the e-switch devlink callbacks.
Messages provide reasons for not being able to issue the operation.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add extack argument to the eswitch related operations.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
David writes:
"Networking fixes:
1) Prefix length validation in xfrm layer, from Steffen Klassert.
2) TX status reporting fix in mac80211, from Andrei Otcheretianski.
3) Fix hangs due to TX_DROP in mac80211, from Bob Copeland.
4) Fix DMA error regression in b43, from Larry Finger.
5) Add input validation to xenvif_set_hash_mapping(), from Jan Beulich.
6) SMMU unmapping fix in hns driver, from Yunsheng Lin.
7) Bluetooh crash in unpairing on SMP, from Matias Karhumaa.
8) WoL handling fixes in the phy layer, from Heiner Kallweit.
9) Fix deadlock in bonding, from Mahesh Bandewar.
10) Fill ttl inherit infor in vxlan driver, from Hangbin Liu.
11) Fix TX timeouts during netpoll, from Michael Chan.
12) RXRPC layer fixes from David Howells.
13) Another batch of ndo_poll_controller() removals to deal with
excessive resource consumption during load. From Eric Dumazet.
14) Fix a specific TIPC failure secnario, from LUU Duc Canh.
15) Really disable clocks in r8169 during suspend so that low
power states can actually be reached.
16) Fix SYN backlog lockdep issue in tcp and dccp, from Eric Dumazet.
17) Fix RCU locking in netpoll SKB send, which shows up in bonding,
from Dave Jones.
18) Fix TX stalls in r8169, from Heiner Kallweit.
19) Fix locksup in nfp due to control message storms, from Jakub
Kicinski.
20) Various rmnet bug fixes from Subash Abhinov Kasiviswanathan and
Sean Tranchetti.
21) Fix use after free in ip_cmsg_recv_dstaddr(), from Eric Dumazet."
* gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net: (122 commits)
ixgbe: check return value of napi_complete_done()
sctp: fix fall-through annotation
r8169: always autoneg on resume
ipv4: fix use-after-free in ip_cmsg_recv_dstaddr()
net: qualcomm: rmnet: Fix incorrect allocation flag in receive path
net: qualcomm: rmnet: Fix incorrect allocation flag in transmit
net: qualcomm: rmnet: Skip processing loopback packets
net: systemport: Fix wake-up interrupt race during resume
rtnl: limit IFLA_NUM_TX_QUEUES and IFLA_NUM_RX_QUEUES to 4096
bonding: fix warning message
inet: make sure to grab rcu_read_lock before using ireq->ireq_opt
nfp: avoid soft lockups under control message storm
declance: Fix continuation with the adapter identification message
net: fec: fix rare tx timeout
r8169: fix network stalls due to missing bit TXCFG_AUTO_FIFO
tun: napi flags belong to tfile
tun: initialize napi_mutex unconditionally
tun: remove unused parameters
bond: take rcu lock in netpoll_send_skb_on_dev
rtnetlink: Fail dump if target netnsid is invalid
...
The NIC driver should only enable interrupts when napi_complete_done()
returns true. This patch adds the check for ixgbe.
Cc: stable@vger.kernel.org # 4.10+
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This trivial patch fixes a typo in iavf.h.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds zero-copy Tx support for AF_XDP sockets. It implements
the ndo_xsk_async_xmit netdev ndo and performs all the Tx logic from a
NAPI context. This means pulling egress packets from the Tx ring,
placing the frames on the NIC HW descriptor ring and completing sent
frames back to the application via the completion ring.
The regular XDP Tx ring is used for AF_XDP as well. This rationale for
this is as follows: XDP_REDIRECT guarantees mutual exclusion between
different NAPI contexts based on CPU id. In other words, a netdev can
XDP_REDIRECT to another netdev with a different NAPI context, since
the operation is bound to a specific core and each core has its own
hardware ring.
As the AF_XDP Tx action is running in the same NAPI context and using
the same ring, it will also be protected from XDP_REDIRECT actions
with the exact same mechanism.
As with AF_XDP Rx, all AF_XDP Tx specific functions are added to
ixgbe_xsk.c.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch prepares for the upcoming zero-copy Tx functionality by
moving common functions used both by the regular path and zero-copy
path.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds zero-copy Rx support for AF_XDP sockets. Instead of
allocating buffers of type MEM_TYPE_PAGE_SHARED, the Rx frames are
allocated as MEM_TYPE_ZERO_COPY when AF_XDP is enabled for a certain
queue.
All AF_XDP specific functions are added to a new file, ixgbe_xsk.c.
Note that when AF_XDP zero-copy is enabled, the XDP action XDP_PASS
will allocate a new buffer and copy the zero-copy frame prior passing
it to the kernel stack.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch prepares for the upcoming zero-copy Rx functionality, by
moving/changing linkage of common functions, used both by the regular
path and zero-copy path.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add functions for Rx/Tx ring enable/disable. Instead of resetting the
whole device, only the affected ring is disabled or enabled.
This plumbing is used in later commits, when zero-copy AF_XDP support
is introduced.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fix crash when we have restore flow director filters after reset
adapter. In ixgbe_fdir_filter_restore() filter->action is outside of the
rx_ring array, as it has a VF identifier in the upper 32 bits.
Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clang warns that the address of a pointer will always evaluated as true
in a boolean context.
drivers/net/ethernet/intel/i40e/i40e_debugfs.c:136:9: warning: address
of array 'vsi->active_vlans' will always evaluate to 'true'
[-Wpointer-bool-conversion]
vsi->active_vlans ? "<valid>" : "<null>");
~~~~~^~~~~~~~~~~~ ~
./include/linux/device.h:1431:33: note: expanded from macro 'dev_info'
_dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
^~~~~~~~~~~
1 warning generated.
Given that the statement shows that active_vlans is always valid, just
remove the statement since it's not giving any useful information.
Link: https://github.com/ClangBuiltLinux/linux/issues/82
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clang warns when one enumerated type is converted implicitly to another.
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4214:42: warning:
implicit conversion from enumeration type 'enum i40e_aq_link_speed' to
different enumeration type 'enum virtchnl_link_speed'
[-Wenum-conversion]
pfe.event_data.link_event.link_speed = I40E_LINK_SPEED_40GB;
~ ^~~~~~~~~~~~~~~~~~~~
1 warning generated.
Use the proper enum from virtchnl_link_speed, which has the same value
as I40E_LINK_SPEED_40GB, VIRTCHNL_LINK_SPEED_40GB. This appears to be
missed by commit ff3f4cc267 ("virtchnl: finish conversion to virtchnl
interface").
Link: https://github.com/ClangBuiltLinux/linux/issues/81
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ipsec->tx_tbl[] array has IXGBE_IPSEC_MAX_SA_COUNT elements so the >
should be a >=.
Fixes: 0062e7cc95 ("ixgbevf: add VF IPsec offload code")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There are no in-tree callers.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We have Tx hang when number Tx and XDP queues are more than 64.
In XDP always is MTQC == 0x0 (64TxQs). We need more space for Tx queues.
Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don't be fancy with message lengths, just set lengths to
number of dwords, not bytes.
Fixes: 0062e7cc95 ("ixgbevf: add VF IPsec offload code")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2018-10-03
This series contains updates to ice and virtchnl.
Yashaswini Raghuram adds a new virtchnl capability flag to support the
exchange of additional supported speeds.
Anirudh adds support for SR-IOV for the ice driver. Added code to
initialize, configure and use mailbox queues for PF and VF
communication. Updated the VSI and queue management to handle both PF
and VF VSI type. Added "Adaptive Virtual Function (AVF)" support for
the ice PF driver by implementing virtchnl commands. Extended the
malicious driver detection logic to include the VF driver as well.
Fixed the queue region size which needs to be log base 2 of the number
of queues in region.
Brett fixes an issue which was causing switch rules to be lost, by
making a call to ice_update_pkt_fwd_rule() with the necessary changes.
Fixed how the PF and VF assigned the ITR index by adding a struct member
itr_idx to be used to dynamically program the correct ITR index.
Dave fixed a potential NULL pointer dereference by adding checks in the
filter handling.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb_set_tx_maxrate will be called holding rtnl lock,
hence remove all unneeded locks.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update version string to 0.7.2-k
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ice_ena/dis_vsi should have a single differentiating
factor to determine if the netdev_ops call is used or a
direct call to ice_vsi_open/close. This is if the netif is
running or not. If netif is running, use ndo_open/ndo_close.
Else, use ice_vsi_open/ice_vsi_close.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This issue came about when looking at the VF function
ice_vc_cfg_irq_map_msg. Currently we are assigning the itr_setting value
to the itr_idx received from the AVF driver, which is not correct and is
not used for the VF flow anyway. Currently the only way we set the ITR
index for both the PF and VF driver is by hard coding ICE_TX_ITR or
ICE_RX_ITR for the ITR index on each q_vector.
To fix this, add the member itr_idx in struct ice_ring_container. This
can then be used to dynamically program the correct ITR index. This change
also affected the PF driver so make the necessary changes there as well.
Also, removed the itr_setting member in struct ice_ring because it is not
being used meaningfully and is going to be removed in a future patch that
includes dynamic ITR.
On another note, this will be useful moving forward if we decide to split
Rx/Tx rings on different q_vectors instead of sharing them as queue pairs.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add checks in the filter handling flow to avoid dereferencing
NULL pointers.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When a switch rule is initially created we set the filter action to
ICE_FWD_TO_VSI. The filter action changes to ICE_FWD_TO_VSI_LIST
whenever more than one VSI is subscribed to the same switch rule. When
the switch rule goes from 2 VSIs in the list to 1 VSI we remove and
delete the VSI list rule, but we currently don't update the switch rule
in hardware. This is causing switch rules to be lost, so fix that by
making a call to ice_update_pkt_fwd_rule() with the necessary changes.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When adding a rule, queue region size needs to be provided as log base 2
of the number of queues in region. Fix that.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch extends the existing malicious driver operation detection
logic to cover malicious operations by the VF driver as well.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When PF gets a link status change event, notify the VFs of the same.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
virtchnl is a protocol/interface specification that allows the Intel
"Adaptive Virtual Function (AVF)" driver (iavf.ko) to work with more than
one physical function driver. The AVF driver sends "virtchnl commands"
(control plane only) to the PF driver over mailbox queues and the PF driver
executes these commands and returns a result to the VF, again over mailbox.
This patch adds AVF support for the ice PF driver by implementing the
following virtchnl commands:
VIRTCHNL_OP_VERSION
VIRTCHNL_OP_GET_VF_RESOURCES
VIRTCHNL_OP_RESET_VF
VIRTCHNL_OP_ADD_ETH_ADDR
VIRTCHNL_OP_DEL_ETH_ADDR
VIRTCHNL_OP_CONFIG_VSI_QUEUES
VIRTCHNL_OP_ENABLE_QUEUES
VIRTCHNL_OP_DISABLE_QUEUES
VIRTCHNL_OP_ADD_ETH_ADDR
VIRTCHNL_OP_DEL_ETH_ADDR
VIRTCHNL_OP_CONFIG_VSI_QUEUES
VIRTCHNL_OP_ENABLE_QUEUES
VIRTCHNL_OP_DISABLE_QUEUES
VIRTCHNL_OP_REQUEST_QUEUES
VIRTCHNL_OP_CONFIG_IRQ_MAP
VIRTCHNL_OP_CONFIG_RSS_KEY
VIRTCHNL_OP_CONFIG_RSS_LUT
VIRTCHNL_OP_GET_STATS
VIRTCHNL_OP_ADD_VLAN
VIRTCHNL_OP_DEL_VLAN
VIRTCHNL_OP_ENABLE_VLAN_STRIPPING
VIRTCHNL_OP_DISABLE_VLAN_STRIPPING
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch implements handlers for the following NDO operations:
.ndo_set_vf_spoofchk
.ndo_set_vf_mac
.ndo_get_vf_config
.ndo_set_vf_trust
.ndo_set_vf_vlan
.ndo_set_vf_link_state
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Post VF initialization, there are a couple of different ways in which a
VF reset can be triggered. One is when the underlying PF itself goes
through a reset and other is via a VFLR interrupt. ice_reset_vf introduced
in this patch handles both these cases.
Also introduced in this patch is a helper function ice_aq_send_msg_to_vf
to send messages to VF over the mailbox queue. The PF uses this to send
reset notifications to VFs.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Until now, all the VSI and queue management code supported only the PF
VSI type (ICE_VSI_PF). Update these flows to handle the VF VSI type
(ICE_VSI_VF) type as well.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch implements parts of ice_sriov_configure and VF reset flow.
To create virtual functions (VFs), the user sets a value in num_vfs
through sysfs. This results in the kernel calling the handler for
.sriov_configure which is ice_sriov_configure.
VF setup first starts with a VF reset, followed by allocation of the VF
VSI using ice_vf_vsi_setup. Once the VF setup is complete a state bit
ICE_VF_STATE_INIT is set in the vf->states bitmap to indicate that
the VF is ready to go.
Also for VF reset to go into effect, it's necessary to issue a disable
queue command (ice_aqc_opc_dis_txqs). So this patch updates multiple
functions in the disable queue flow to take additional parameters that
distinguish if queues are being disabled due to VF reset.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mailbox queue is a type of control queue that's used for communication
between PF and VF. This patch adds code to initialize, configure and
use mailbox queues.
This patch also adds support to detect and parse SR-IOV capabilities
returned by the hardware.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This affects at least versions 25 and 33, so assume all cards are broken
and just renegotiate by default.
Fixes: 10bc6a6042 ("r8169: fix autoneg issue on resume with RTL8168E")
Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c:390:4: warning: implicit
conversion from enumeration type 'enum cxgb4_dcb_state' to different
enumeration type 'enum cxgb4_dcb_state_input' [-Wenum-conversion]
IEEE_FAUX_SYNC(dev, dcb);
^~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.h:70:10: note: expanded
from macro 'IEEE_FAUX_SYNC'
CXGB4_DCB_STATE_FW_ALLSYNCED);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use the equivalent value of the expected type to silence Clang while
resulting in no functional change.
CXGB4_DCB_STATE_FW_ALLSYNCED = CXGB4_DCB_INPUT_FW_ALLSYNCED = 3
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c:303:7: warning: implicit
conversion from enumeration type 'enum cxgb4_dcb_state' to different
enumeration type 'enum cxgb4_dcb_state_input' [-Wenum-conversion]
? CXGB4_DCB_STATE_FW_ALLSYNCED
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c:304:7: warning: implicit
conversion from enumeration type 'enum cxgb4_dcb_state' to different
enumeration type 'enum cxgb4_dcb_state_input' [-Wenum-conversion]
: CXGB4_DCB_STATE_FW_INCOMPLETE);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 warnings generated.
Use the equivalent value of the expected type to silence Clang while
resulting in no functional change.
CXGB4_DCB_STATE_FW_INCOMPLETE = CXGB4_DCB_INPUT_FW_INCOMPLETE = 2
CXGB4_DCB_STATE_FW_ALLSYNCED = CXGB4_DCB_INPUT_FW_ALLSYNCED = 3
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns:
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c:2734:34: warning:
tentative array definition assumed to have one element
static const struct of_device_id dpaa_match[];
^
1 warning generated.
Turns out that since this driver was introduced in commit 9ad1a37493
("dpaa_eth: add support for DPAA Ethernet"), this declaration has been
unused. Remove it to silence the warning.
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for inserting and deleting Rx flow classification
rules through ethtool.
We support classification based on some header fields for
flow-types ether, ip4, tcp4, udp4 and sctp4.
Rx queues are core affine, so the action argument effectively
selects on which cpu the matching frame will be processed.
Discarding the frame is also supported.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For firmware versions that support it, configure an Rx flow
classification key at probe time.
Hardware expects all rules in the classification table to share
the same key. So we setup a key containing all supported fields
at driver init and when a user adds classification rules through
ethtool, we will just mask out the unused header fields.
Since the key composition process is the same for flow
classification and hashing, reuse existing code where possible.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the array of supported header fields will be used for
Rx flow classification as well, rename it from "hash_fields" to
the more inclusive "dist_fields".
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Management Complex (MC) firmware initially allowed the
configuration of a single key to be used both for Rx flow hashing
and flow classification. This prevented us from supporting
Rx flow classification through ethtool.
Starting with version 10.7.0, the Management Complex(MC) offers
a new set of APIs for separate configuration of Rx hashing and
classification keys.
Update the Rx flow hashing support to use the new API, if available.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbsmA5AAoJEEg/ir3gV/o+Bg0IAM4qUD878349YufTxlNI3usz
pPyRYLFfh+f3oTjGWxZ4SlAzMsCVnGDWMKFBj33qnJen1YcMxD2DriTIStQfOHkE
EpErxxMa70YFY2VyvD6Gt/Xfw9Q4ZGrp9j4tNnXVBMVT+EDhui/OpHFlpgoLtpLk
A8pxPRkbQNdql4OWOLwVpFJoleB0mJp4+ed+z743z4ectBfY0Iqx6Yyk1TRUzLfB
o/cUV02M33Fi8XjjZ59crPrIc4r0iUcPDcd5+8i2WtOkfbFod9s1Dg4GMNbGpcFn
RPJX9A+XilMPCJNM8YhwcS2JxX8Y2XwUjBDbspx0RHX3HZY22SjuN2i5p1GZnLo=
=pu43
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2018-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2018-10-01
This pull request includes some fixes to mlx5 driver,
Please pull and let me know if there's any problem.
For -stable v4.11:
"6e0a4a23c59a ('net/mlx5: E-Switch, Fix out of bound access when setting vport rate')"
For -stable v4.18:
"98d6627c372a ('net/mlx5e: Set vlan masks for all offloaded TC rules')"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The incoming skb needs to be reallocated in case the headroom
is not sufficient to adjust the ethernet header. This allocation
needs to be atomic otherwise it results in this splat
[<600601bb>] ___might_sleep+0x185/0x1a3
[<603f6314>] ? _raw_spin_unlock_irqrestore+0x0/0x27
[<60069bb0>] ? __wake_up_common_lock+0x95/0xd1
[<600602b0>] __might_sleep+0xd7/0xe2
[<60065598>] ? enqueue_task_fair+0x112/0x209
[<600eea13>] __kmalloc_track_caller+0x5d/0x124
[<600ee9b6>] ? __kmalloc_track_caller+0x0/0x124
[<602696d5>] __kmalloc_reserve.isra.34+0x30/0x7e
[<603f629b>] ? _raw_spin_lock_irqsave+0x0/0x3d
[<6026b744>] pskb_expand_head+0xbf/0x310
[<6025ca6a>] rmnet_rx_handler+0x7e/0x16b
[<6025c9ec>] ? rmnet_rx_handler+0x0/0x16b
[<6027ad0c>] __netif_receive_skb_core+0x301/0x96f
[<60033c17>] ? set_signals+0x0/0x40
[<6027bbcb>] __netif_receive_skb+0x24/0x8e
Fixes: 74692caf1b ("net: qualcomm: rmnet: Process packets over ethernet")
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The incoming skb needs to be reallocated in case the headroom
is not sufficient to add the MAP header. This allocation needs to
be atomic otherwise it results in the following splat
[32805.801456] BUG: sleeping function called from invalid context
[32805.841141] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[32805.904773] task: ffffffd7c5f62280 task.stack: ffffff80464a8000
[32805.910851] pc : ___might_sleep+0x180/0x188
[32805.915143] lr : ___might_sleep+0x180/0x188
[32806.131520] Call trace:
[32806.134041] ___might_sleep+0x180/0x188
[32806.137980] __might_sleep+0x50/0x84
[32806.141653] __kmalloc_track_caller+0x80/0x3bc
[32806.146215] __kmalloc_reserve+0x3c/0x88
[32806.150241] pskb_expand_head+0x74/0x288
[32806.154269] rmnet_egress_handler+0xb0/0x1d8
[32806.162239] rmnet_vnd_start_xmit+0xc8/0x13c
[32806.166627] dev_hard_start_xmit+0x148/0x280
[32806.181181] sch_direct_xmit+0xa4/0x198
[32806.185125] __qdisc_run+0x1f8/0x310
[32806.188803] net_tx_action+0x23c/0x26c
[32806.192655] __do_softirq+0x220/0x408
[32806.196420] do_softirq+0x4c/0x70
Fixes: ceed73a2cf ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
RMNET RX handler was processing invalid packets that were
originally sent on the real device and were looped back via
dev_loopback_xmit(). This was detected using syzkaller.
Fixes: ceed73a2cf ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The AON_PM_L2 is normally used to trigger and identify the source of a
wake-up event. Since the RX_SYS clock is no longer turned off, we also
have an interrupt being sent to the SYSTEMPORT INTRL_2_0 controller, and
that interrupt remains active up until the magic packet detector is
disabled which happens much later during the driver resumption.
The race happens if we have a CPU that is entering the SYSTEMPORT
INTRL2_0 handler during resume, and another CPU has managed to clear the
wake-up interrupt during bcm_sysport_resume_from_wol(). In that case, we
have the first CPU stuck in the interrupt handler with an interrupt
cause that has been cleared under its feet, and so we keep returning
IRQ_NONE and we never make any progress.
This was not a problem before because we would always turn off the
RX_SYS clock during WoL, so the SYSTEMPORT INTRL2_0 would also be turned
off as well, thus not latching the interrupt.
The fix is to make sure we do not enable either the MPD or
BRCM_TAG_MATCH interrupts since those are redundant with what the
AON_PM_L2 interrupt controller already processes and they would cause
such a race to occur.
Fixes: bb9051a2b2 ("net: systemport: Add support for WAKE_FILTER")
Fixes: 83e82f4c70 ("net: systemport: add Wake-on-LAN support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After bfcb79fca1 ("PCI/ERR: Run error recovery callbacks for all affected
devices"), AER errors are always cleared by the PCI core and drivers don't
need to do it themselves.
Remove calls to pci_cleanup_aer_uncorrect_error_status() from device
driver error recovery functions.
Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog, remove PCI core changes, remove unused variables]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When FW floods the driver with control messages try to exit the cmsg
processing loop every now and then to avoid soft lockups. Cmsg
processing is generally very lightweight so 512 seems like a reasonable
budget, which should not be exceeded under normal conditions.
Fixes: 77ece8d5f1 ("nfp: add control vNIC datapath")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2018-10-02
This series contains updates to ice driver only.
Anirudh expands the use of VSI handles across the rest of the driver,
which includes refactoring the code to correctly use VSI handles. After
a reset, ensure that all configurations for a VSI get re-applied before
moving on to rebuilding the next VSI.
Dave fixed the driver to check the current link state after reset to
ensure that the correct link state of a port is reported. Fixed an
issue where if the driver is unloaded when traffic is in progress,
errors are generated.
Preethi breaks up the IRQ tracker into a software and hardware IRQ
tracker, where the software IRQ tracker tracks only the PF's IRQ
requests and does not play any role in the VF initialization. The
hardware IRQ tracker represents the device's interrupt space and will be
looked up to see if the device has run our of interrupts when a
interrupt has to be allocated in the device for either PF or VF.
Md Fahad adds support for enabling/disabling RSS via ethtool.
Brett aligns the ice_reset_req enum values to the values that the
hardware understands. Also added initial support for dynamic interrupt
moderation in the ice driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a commit 4bcc595ccd ("printk: reinstate KERN_CONT for printing
continuation lines") regression with the `declance' driver, which caused
the adapter identification message to be split between two lines, e.g.:
declance.c: v0.011 by Linux MIPS DECstation task force
tc6: PMAD-AA
, addr = 08:00:2b:1b:2a:6a, irq = 14
tc6: registered as eth0.
Address that properly, by printing identification with a single call,
making the messages now look like:
declance.c: v0.011 by Linux MIPS DECstation task force
tc6: PMAD-AA, addr = 08:00:2b:1b:2a:6a, irq = 14
tc6: registered as eth0.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Fixes: 4bcc595ccd ("printk: reinstate KERN_CONT for printing continuation lines")
Signed-off-by: David S. Miller <davem@davemloft.net>
Add driver support for reading/configuring the 20G link speed via ethtool.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add driver support for configuring/reading the 20G link speed.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During certain heavy network loads TX could time out
with TX ring dump.
TX is sometimes never restarted after reaching
"tx_stop_threshold" because function "fec_enet_tx_queue"
only tests the first queue.
In addition the TX timeout callback function failed to
recover because it also operated only on the first queue.
Signed-off-by: Rickard x Andersson <rickaran@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the driver is unloaded when traffic is in progress, errors are
generated. Fix this by releasing qvectors and NAPI handler on remove.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently there is no support for dynamic interrupt moderation. This
patch adds some initial code to support this. The following changes
were made:
1. Currently we are using multiple members to store the interrupt
granularity (itr_gran_25/50/100/200). This is not necessary because
we can query the device to determine what the interrupt granularity
should be set to, done by a new function ice_get_itr_intrl_gran.
2. Added intrl to ice_q_vector structure to support interrupt rate
limiting.
3. Added the function ice_intrl_usecs_to_reg for converting to a value
in usecs that the device understands.
4. Added call to write to the GLINT_RATE register. Disable intrl by
default for now.
5. Changed rx/tx_itr_setting to itr_setting because having both seems
redundant because a ring is either Tx or Rx.
6. Initialize itr_setting for both Tx/Rx rings in ice_vsi_alloc_rings()
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the ice_reset_req enum values have to be translated into
a different set of values that the hardware understands for the same
reset types. Avoid this translation by aligning ice_reset_req enum
values to the ones that the hardware understands.
Also add and else if block to check for ICE_RESET_EMPR and put a dev_dbg
message in the else case.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch implements ethtool hook for enabling/disabling
RSS. While disabling RSS, the LUT should be cleared. And
the LUT should be reconfigured while enabling RSS.
Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
For the PF driver, when mapping interrupts to queues, we need to request
IRQs from the kernel and we also have to allocate interrupts from
the device.
Similarly, when the VF driver (iavf.ko) initializes, it requests the kernel
IRQs that it needs but it can't directly allocate interrupts in the device.
Instead, it sends a mailbox message to the ice driver, which then allocates
interrupts in the device on the VF driver's behalf.
Currently both these cases end up having to reserve entries in
pf->irq_tracker but irq_tracker itself is sized based on how many vectors
the PF driver needs. Under the right circumstances, the VF driver can fail
to get entries in irq_tracker, which will result in the VF driver failing
probe.
To fix this, sw_irq_tracker and hw_irq_tracker are introduced. The
sw_irq_tracker tracks only the PF's IRQ request and doesn't play any
role in VF init. hw_irq_tracker represents the device's interrupt space.
When interrupts have to be allocated in the device for either PF or VF,
hw_irq_tracker will be looked up to see if the device has run out of
interrupts.
Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We are currently replaying the link state of a port after a reset, but
it is possible that the link state of a port can change during the reset
process. So check for the current link state of a port during the rebuild
process of a reset.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently, switch filters get replayed after reset. In addition to
filters, other VSI attributes (like RSS configuration, Tx scheduler
configuration, etc.) also need to be replayed after reset.
Thus, instead of replaying based on functional blocks (i.e. replay
all filters for all VSIs, followed by RSS configuration replay for
all VSIs, and so on), it makes more sense to have the replay centered
around a VSI. In other words, replay all configurations for a VSI before
moving on to rebuilding the next VSI.
To that effect, this patch introduces a VSI replay framework in a new
function ice_vsi_replay_all. Currently it only replays switch filters,
but it will be expanded in the future to replay additional VSI attributes.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch is a continuation of the previous patch where VSI
handles are used instead of VSI numbers.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A VSI handle is just a number the driver maintains to uniquely identify
a VSI. A VSI handle is backed by a VSI number in the hardware. When
interacting when the hardware, VSI handles are converted into VSI numbers.
In commit 0f9d5027a7 ("ice: Refactor VSI allocation, deletion and
rebuild flow"), VSI handles were introduced but it was used only
when creating and deleting VSIs. This patch is part one of two patches
that expands the use of VSI handles across the rest of the driver. Also
in this patch, certain parts of the code had to be refactored to correctly
use VSI handles.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In current ABI the size of the messages carrying map elements was
statically defined to at most 16 words of key and 16 words of value
(NFP word is 4 bytes). We should not make this assumption and use
the max key and value sizes from the BPF capability instead.
To make sure old kernels don't get surprised with larger (or smaller)
messages bump the FW ABI version to 3 when key/value size is different
than 16 words.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Some apps may want to have higher MTU on the control vNIC/queue.
Allow them to set the requested MTU at init time.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Up until now we only had per-vNIC BPF ABI version capabilities,
which are slightly awkward to use because bulk of the resources
and configuration does not relate to any particular vNIC. Add
a new capability for global ABI version and check the per-vNIC
version are equal to it. Assume the ABI version 2 if no explicit
version capability is present.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
When choosing channel amounts and ring sizes, the maximums in the
ibmvnic driver are defined by the virtual i/o server management
partition. Even though they are defined as maximums, the client
driver may in fact successfully request resources that exceed
these limits, which are mostly dependent on a user's hardware
With this in mind, provide an ethtool flag that when enabled will
allow the user to request resources limited by driver-defined
maximums instead of limits defined by the management partition.
The driver will try to honor the user's request but may not allowed
by the management partition. In this case, the driver requests
as close as it can get to the desired amount until it succeeds.
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce driver-defined maximums for queue ring sizes. Devices
available for IBM vNIC today will likely not allow this amount,
but this should give us some leeway for future devices that may
support larger ring sizes.
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Increase queue size limit to 16. Devices available for IBM vNIC today
will not allow this amount, but this should give us some leeway for
future devices that may support more RX or TX queues.
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of the chip-specific hw_start functions set bit TXCFG_AUTO_FIFO
in register TxConfig. The original patch changed the order of some
calls resulting in these changes being overwritten by
rtl_set_tx_config_registers() in rtl_hw_start(). This eventually
resulted in network stalls especially under high load.
Analyzing the chip-specific hw_start functions all chip version from
34, with the exception of version 39, need this bit set.
This patch moves setting this bit to rtl_set_tx_config_registers().
Fixes: 4fd48c4ac0 ("r8169: move common initializations to tp->hw_start")
Reported-by: Ortwin Glück <odi@odi.ch>
Reported-by: David Arendt <admin@prnet.org>
Root-caused-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Tested-by: Tony Atkinson <tatkinson@linux.com>
Tested-by: David Arendt <admin@prnet.org>
Tested-by: Ortwin Glück <odi@odi.ch>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When inserting the TSB, keep track of how many times we had to do it and
if there was a failure in doing so, this helps profile the driver for
possibly incorrect headroom settings.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During bcm_sysport_insert_tsb() make sure we differentiate a SKB
headroom re-allocation failure from the normal swap and replace path.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can turn on the RX/TX checksum offloads by default and make sure that
those are properly reflected back to e.g: stacked devices such as VLAN
or DSA.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During driver resume and open, the HW may have lost its context/state,
utilize bcm_sysport_set_features() to make sure we do restore the
correct set of features that were previously configured.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for unconditionally enabling TX and RX checksum offloads,
refactor bcm_sysport_set_features() a bit such that
__netdev_update_features() during register_netdev() can make sure that
features are correctly programmed during network device registration.
Since we can now be called during register_netdev() with clocks gated,
we need to temporarily turn them on/off in order to have a successful
register programming.
We also move the CRC forward setting read into
bcm_sysport_set_features() since priv->crc_fwd matters while turning on
RX checksum offload, that way we are guaranteed they are in sync in case
we ever add support for NETIF_F_RXFCS at some point in the future.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds switch for flow director with ethtool command
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes all flow director rules when unload hns3 driver.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When doing reset, remove all entries in TCAM block, and keep flow
director rules list. After finishing reset, restore all entries.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for querying rule number and rule details
by ethtool commands.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for add and delete rule by ethtool commands.
HNS3 driver supports several flow types, include ETHER_FLOW,
IP_USER_FLOW, TCP_V4_FLOW, UDP_V4_FLOW, SCTP_V4_FLOW, IPV6_USER_FLOW,
TCP_V6_FLOW, UDP_V6_FLOW and SCTP_V6_FLOW.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each flow director rule consists of input key and action. The input key
is the condition for matching, includes tuples of L2/L3/L4 header.
Action is the behaviour when a packet matches with the input key, such
as drop the packet, or forward to a specified queue.
The input key is stored in the tcam blocks, Each bit of input key can
be masked.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Flow director is a new feature supported by hardware with revision 0x21.
This patch adds flow direcor initialization for each PF. It queries flow
director mode and tcam resource from firmware, selects tuples used for
input key.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is one step in allowing phylib to make use of link_mode bitmaps,
instead of u32 for supported and advertised features. Convert the phy
drivers to use bitmaps to indicates the features they support.
Build bitmap equivalents of the u32 values at runtime, and have the
drivers point to the appropriate bitmap. These bitmaps are shared, and
we don't want a driver to modify them. So mark them __ro_after_init.
Within phylib, the features bitmap is currently turned back into a
u32. This will be removed once the whole of phylib, and the drivers
are converted to use bitmaps.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The macro PHY_GBIT_FEAUTRES needs to change into a bitmap in order to
support link_modes. Remove its use from xgde by replacing it with its
definition.
Probably, the current behavior is wrong. It probably should be
ANDing not assigning.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper to convert the local advertising to an LCL capabilities,
which is then used to resolve pause flow control settings.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reserve two TLV types for feature development, and warn in the driver
if they ever leak into production.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Address compiler warning reported by kbuild autobuilders
when building for i386 as a result of dma_addr_t size on
different architectures.
warning: cast to pointer from integer of different size
[-Wint-to-pointer-cast]
Fixes: 7e8d5755be ("net: nixge: Add support for 64-bit platforms")
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This series includes updates to mlx5e ethernet netdevice driver:
From Or Gerlitz:
1) Support masks for l3/l4 filters in ethtool flow steering
2) Report checksum unnecessary also when the L3 checksum flag on the
cqe is set and there's no L4 header
3) Allow reporting of checksum unnecessary, using an ethtool private flag.
From Gavi Teitz and Or, VF representors netdevs performance improvements
4) Allow striding RQ in VF representor and bigger RQ size, ~3X performance improvement
5) Enable stateless offloads for VF representor, csum and TSO, 1.5X performance improvement
6) RSS Support for VF representors
6.1) Allow flow table destination fir VF representor steering rule.
6.2) Create RSS flow table per representor netdev
6.3) Expose mlx5e RSS ethtool to be used by representor netdevs
6.4) Enable multi-queue and RSS for VF representors, using mlx5e existing infrastructure
for managing a multi-queue RX RSS tables.
From Alaa Hleihel:
7) Cache the system image guid, The system image guid is a read-only field
Read this once and save it on the core device.
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbsmk5AAoJEEg/ir3gV/o+S+IIAMpGR1b79uPjCIdZSTxLlCvg
xGdc3MGQ4Lv+DuUEP1DQpZDIAJQeh10KqICGfRZxo7d4jUtajwujwaUylkZrb9da
QNIeK2FLMXkxQo51owZxXgdAy8BNpPmg3D7xI/O/DET4UWCKRmZ/4eJfWLG7r8qp
+BEqQmJjhY67mj1tuzJ1JsysrOvUhfXEwqJGgBCLlME8KVWh0No4fTsrAlKBmeec
kYxaW9hS57LKOu32y8LdYb830yHDjUqtWAd+5cGWNTOaygm0dd5ZPVnR9oJ/Ga7I
Ru64aMMQoLuRbMfryUh8gYZ/oIwP48oIIr3Wler5SXkHxEKR++F5fJrApgxvibA=
=ivVt
-----END PGP SIGNATURE-----
Merge tag 'mlx5e-updates-2018-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5e-updates-2018-10-01
This series includes updates to mlx5e ethernet netdevice driver:
From Or Gerlitz:
1) Support masks for l3/l4 filters in ethtool flow steering
2) Report checksum unnecessary also when the L3 checksum flag on the
cqe is set and there's no L4 header
3) Allow reporting of checksum unnecessary, using an ethtool private flag.
From Gavi Teitz and Or, VF representors netdevs performance improvements
4) Allow striding RQ in VF representor and bigger RQ size, ~3X performance improvement
5) Enable stateless offloads for VF representor, csum and TSO, 1.5X performance improvement
6) RSS Support for VF representors
6.1) Allow flow table destination fir VF representor steering rule.
6.2) Create RSS flow table per representor netdev
6.3) Expose mlx5e RSS ethtool to be used by representor netdevs
6.4) Enable multi-queue and RSS for VF representors, using mlx5e existing infrastructure
for managing a multi-queue RX RSS tables.
From Alaa Hleihel:
7) Cache the system image guid, The system image guid is a read-only field
Read this once and save it on the core device.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2018-10-01
This series contains updates to ice driver only.
Anirudh provides several changes to "prep" the driver for upcoming
features. Specifically, the functions that are used for PF VSI/netdev
setup will also be used in SR-IOV support and to allow the reuse of
these functions, code needs to move.
Dave provides the only other change in the series, updates the driver to
protect the reset patch in its entirety. This is done by adding the
various bit checks to determine if a reset is scheduled/initiated and
whether it came from the software or firmware.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, there is no bit, or set of bits, that protect the entirety
of the reset path.
If the reset is originated by the driver, then the relevant
one of the following bits will be set when the reset is scheduled:
__ICE_PFR_REQ
__ICE_CORER_REQ
__ICE_GLOBR_REQ
This bit will not be cleared until after the rebuild has completed.
If the reset is originated by the FW, then the first the driver knows of
it will be the reception of the OICR interrupt. The __ICE_RESET_OICR_RECV
bit will be set in the interrupt handler. This will also be the indicator
in a SW originated reset that we have completed the pre-OICR tasks and
have informed the FW that a reset was requested.
To utilize these bits, change the function:
ice_is_reset_recovery_pending()
to be:
ice_is_reset_in_progress()
The new function will check all of the above bits in the pf->state and
will return a true if one or more of these bits are set.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch completes the code move out of ice_main.c
The following top level functions and related dependency functions) were
moved to ice_lib.c:
ice_vsi_setup
ice_vsi_cfg_tc
The following functions were made static again:
ice_vsi_setup_vector_base
ice_vsi_alloc_q_vectors
ice_vsi_get_qs
void ice_vsi_map_rings_to_vectors
ice_vsi_alloc_rings
ice_vsi_set_rss_params
ice_vsi_set_num_qs
ice_get_free_slot
ice_vsi_init
ice_vsi_alloc_arrays
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch continues the code move out of ice_main.c
The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_setup_vector_base
ice_vsi_alloc_q_vectors
ice_vsi_get_qs
The following functions were made static again:
ice_vsi_free_arrays
ice_vsi_clear_rings
Also, in this patch, the netdev and NAPI registration logic was de-coupled
from the VSI creation logic (ice_vsi_setup) as for SR-IOV, while we want to
create VF VSIs using ice_vsi_setup, we don't want to create netdevs.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch continues the code move out of ice_main.c
The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_clear
ice_vsi_close
ice_vsi_free_arrays
ice_vsi_map_rings_to_vectors
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch continues the code move out of ice_main.c
The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_alloc_rings
ice_vsi_set_rss_params
ice_vsi_set_num_qs
ice_get_free_slot
ice_vsi_init
ice_vsi_clear_rings
ice_vsi_alloc_arrays
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch continues the code move out of ice_main.c
The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_delete
ice_free_res
ice_get_res
ice_is_reset_recovery_pending
ice_vsi_put_qs
ice_vsi_dis_irq
ice_vsi_free_irq
ice_vsi_free_rx_rings
ice_vsi_free_tx_rings
ice_msix_clean_rings
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch continues the code move out of ice_main.c
The following top level functions (and related dependency functions) were
moved to ice_lib.c:
ice_vsi_start_rx_rings
ice_vsi_stop_rx_rings
ice_vsi_stop_tx_rings
ice_vsi_cfg_rxqs
ice_vsi_cfg_txqs
ice_vsi_cfg_msix
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The functions that are used for PF VSI/netdev setup will also be used
for SR-IOV support. To allow reuse of these functions, move these
functions out of ice_main.c to ice_common.c/ice_lib.c
This move is done across multiple patches. Each patch moves a few
functions and may have minor adjustments. For example, a function that was
previously static in ice_main.c will be made non-static temporarily in
its new location to allow the driver to build cleanly. These adjustments
will be removed in subsequent patches where more code is moved out of
ice_main.c
In this particular patch, the following functions were moved out of
ice_main.c:
int ice_add_mac_to_list
ice_free_fltr_list
ice_stat_update40
ice_stat_update32
ice_update_eth_stats
ice_vsi_add_vlan
ice_vsi_kill_vlan
ice_vsi_manage_vlan_insertion
ice_vsi_manage_vlan_stripping
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The system image guid is a read-only field which is used by the TC
offloads code to determine if two mlx5 devices belong to the same
ASIC while adding flows.
Read this once and save it on the core device rather than querying each
time an offloaded flow is added.
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently we practically never report checksum unnecessary, because
for all IP packets we take the checksum complete path.
Enable non-default runs with reprorting checksum unnecessary, using
an ethtool private flag. This can be useful for performance evals
and other explorations.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We can report checksum unnecessary also when the L3 checksum
flag on the cqe is set and there's no L4 header.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Increased the amount of channels the representors can open to be the
amount of CPUs. The default amount opened remains one.
Used the standard NIC netdev functions to:
* Set RSS params when building the representors' params.
* Setup an indirect TIR and RQT for the representors upon
initialization.
* Create a TTC flow table for the representors' indirect TIR (when
creating the TTC table, mlx5e_set_ttc_basic_params() is not called,
in order to avoid setting the inner_ttc param, which is not needed).
Added ethtool control to the representors for setting and querying
the amount of open channels. Additionally, included logic in the
representors' ethtool set channels handler which controls a
representor's vport rx rule, so that if there is one open channel
the rx rule steers traffic to the representor's direct TIR, whereas
if there is more than one channel, the rx rule steers traffic to the
new TTC flow table.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Towards enabling RSS for the vport representors, expose the functions for
querying the rss hash key size and indirection table size via ethtool.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Towards enabling RSS for the vport representors, extract the
procedure for building a device's RSS params, and expose the
function.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Change the driver functions that deal with creating indirect tirs
to get a flag telling if inner ttc is desired.
A pre-step for enabling rss on the vport representors, where
inner ttc is not needed.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently the destination for the representor e-switch rx rule is
a TIR number. Towards changing that to potentially be a flow table,
as part of enabling RSS for representors, modify the signature of
the related e-switch API to get a flow destination.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Cleaning up the flow of the representors' rx initialization, towards
enabling RSS for the representors.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Enabled checksum and TSO offloads for the representors, in
order to increase their performance, which is required to
increase the performance of flows that cannot be offloaded.
Checksum offloads contribute to a general acceleration of all
traffic (to around 150%), whereas the TSO offload contributes
to a prominent acceleration of the representor's TX for traffic
flows with larger than MTU sized packets (to around 200%). This
is the usual case for TCP streams, as the PF, which serves as
the uplink representor, and the VF representors employ GRO before
forwarding the packets to the representor.
GRO was enabled implicitly for the representors beforehand, and
is explicitly enabled here to ensure that the representors preserve
the performance boost it provides (of around 200%) when working in
tandem with the TSO offload by the forwardee, which is the standard
case as both the PF and the VF representors employ HW TSO.
The impact of these changes can be seen in the following
measurements taken on a setup of a VM over a VF, connected
to OVS via the VF representor, to an external host:
Before current changes:
TCP Throughput [Gb/s]
External host to VM ~ 10.5
VM to external host ~ 23.5
With just checksum offloads enabled:
TCP Throughput [Gb/s]
External host to VM ~ 14.9
VM to external host ~ 28.5
With the TSO offload also enabled:
TCP Throughput [Gb/s]
External host to VM ~ 30.5
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The representors' RQ size was not large enough for them to achieve
high enough performance, and therefore needed to be enlarged, while
suffering a minimum hit to its memory usage. To achieve this the
representors RQ size was increased, and its type was changed to be a
striding RQ if it is supported.
Towards that goal the following changes were made:
* Extracted the sequence for setting the standard netdev's RQ parmas
into a function
* Replaced the sequence for setting the representor's RQ params with
the standard sequence
The impact of this change can be seen in the following measurements
taken on a setup of a VM over a VF, connected to OVS via the VF
representor, to an external host:
Before current change:
TCP Throughput [Gb/s]
VM to external host ~ 7.2
With the current change (measured with a striding RQ):
TCP Throughput [Gb/s]
VM to external host ~ 23.5
Each representor now consumes 2 [MB] of memory for its packet
buffers.
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Allow using partial masks for L3 addresses and L4 ports across
the place.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In flow steering, if asked to, the hardware matches on the first ethertype
which is not vlan. It's possible to set a rule as follows, which is meant
to match on untagged packet, but will match on a vlan packet:
tc filter add dev eth0 parent ffff: protocol ip flower ...
To avoid this for packets with single tag, we set vlan masks to tell
hardware to check the tags for every matched packet.
Fixes: 095b6cfd69 ('net/mlx5e: Add TC vlan match parsing')
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The code that deals with eswitch vport bw guarantee was going beyond the
eswitch vport array limit, fix that. This was pointed out by the kernel
address sanitizer (KASAN).
The error from KASAN log:
[2018-09-15 15:04:45] BUG: KASAN: slab-out-of-bounds in
mlx5_eswitch_set_vport_rate+0x8c1/0xae0 [mlx5_core]
Fixes: c9497c9890 ("net/mlx5: Add support for setting VF min rate")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
If the peer device was already unbound, then do not attempt to modify
it's resources, otherwise we will crash on dereferencing non-existing
device.
Fixes: 5c65c564c9 ("net/mlx5e: Support offloading TC NIC hairpin flows")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Disable the clk during suspend to save power. Note that tp->clk may be
NULL, the clk core functions handle this without problems.
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Carlo Caione <carlo@endlessm.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In regular NIC transmission flow, driver always configures MAC using
Tx queue zero descriptor as a part of MAC learning flow.
But with multi Tx queue supported NIC, regular transmission can occur on
any non-zero Tx queue and from that context it uses
Tx queue zero descriptor to configure MAC, at the same time TX queue
zero could be used by another CPU for regular transmission
which could lead to Tx queue zero descriptor corruption and cause FW
abort.
This patch fixes this in such a way that driver always configures
learned MAC address from the same Tx queue which is used for
regular transmission.
Fixes: 7e2cf4feba ("qlcnic: change driver hardware interface mechanism")
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c: In function ‘hclge_get_sset_count’:
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:496:31: error: ‘HNAE3_REVISION_ID_21’ undeclared (first use in this function); did you mean ‘FADT2_REVISION_ID’?
if (hdev->pdev->revision >= HNAE3_REVISION_ID_21 ||
^~~~~~~~~~~~~~~~~~~~
FADT2_REVISION_ID
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c: In function ‘hns3_self_test’:
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c:278:15: error: ‘HNS3_SELF_TEST_TYPE_NUM’ undeclared (first use in this function); did you mean ‘HNS3_SELF_TEST_TPYE_NUM’?
int st_param[HNS3_SELF_TEST_TYPE_NUM][2];
^~~~~~~~~~~~~~~~~~~~~~~
HNS3_SELF_TEST_TPYE_NUM
Signed-off-by: David S. Miller <davem@davemloft.net>
rx_mini_pending was set to an incorrect value. This was causing EINVAL to
always be returned to 'ethtool -G'. The driver does not support mini or
jumbo rings so the respective settings should be zero.
Also, change the valid range of the number of descriptors in the rings to
make the code simpler and easier for users to understand (this removes the
valid settings of 8 and 16). Add a system log message indicating when the
number is rounded-up from what the user specifies with the 'ethtool -G'
command (i.e. when it is not a multiple of 32), and update the log message
when a user-provided value is out of range to also indicate the stride.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch makes a couple of changes in the way the driver uses the
"get capabilities" command.
1. Get device capabilities in addition to function capabilities
2. Align to latest spec by using cap_count to determine size of the
buffer in case of length error.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Query the Tx scheduler tree node information from FW before adding it to
the driver's software database. This will keep the node information current
in driver.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Previously the comment stated that VSI lists should be used when a
second VSI becomes a subscriber to the "VLAN address". VSI lists
are always used for VLAN membership, so replace "VLAN address" with
"MAC address". Also note that VLAN(s) always use VSI list rules.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We have MAX_FW_API_VER_BRANCH, MAX_FW_API_VER_MAJOR, and
MAX_FW_API_VER_MINOR that we use in ice_controlq.h to test when a
firmware version is newer than expected. This is currently tested by
comparing each field separately. Thus, we compare the branch field
against the MAX_FW_API_VER_BRANCH, and so forth.
This means that currently, if we suppose that the max firmware version
is defined as 0.2.1, i.e.
Then firmware 0.1.3 will fail to load. This is because the minor version
3 is greater than the max minor version 1.
This is not intuitive, because of the notion that increasing the major
firmware version to 2 should mean any firmware version with a major
version is less than 2 should be considered older than 2...
In order to allow both 0.2.1 and 0.1.3 to load, you would have to define
the "max" firmware version as 0.2.3.. It is possible that such
a firmware version doesn't even exist yet!
Fix this by replacing the current logic with an updated check that
behaves as follows:
First, we check the major version. If it is greater than the expected
version, then we prevent driver load. Additionally, a warning message is
logged to indicate to the system administrator that they need to update
their driver. This is now the only case where the driver will refuse to
load.
Second, if the major version is less than the expected version, we log
an information message indicating the NVM should be updated.
Third, if the major version is exact, we'll then check the minor
version. If the minor version is more than two versions less than
expected, we log an information message indicating the NVM should be
updated. If it is more than two versions greater than the expected
version, we log an information message that the driver should be
updated.
To support this, the ice_aq_ver_check function needs its signature
updated to pass the HW structure. Since we now pass this structure,
there is no need to pass the firmware API versions separately.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Update branding strings and remove device ids 0x1594 and 0x1595.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Direct assignment is preferred over a memcpy()
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When shutting down the controlqs, we check if they are initialized
before we shut them down and destroy the lock. This is important, as it
prevents attempts to access the lock of an already shutdown queue.
Unfortunately, we checked rq.head and sq.head as the value to determine
if the queue was initialized. This doesn't work, because head is not
reset when the queue is shutdown. In some flows, the adminq will have
already been shut down prior to calling ice_shutdown_all_ctrlqs. This
can result in a crash due to attempting to access the already destroyed
mutex.
Fix this by using rq.count and sq.count instead. Indeed, ice_shutdown_sq
and ice_shutdown_rq already indicate that this is the value we should be
using to determine of the queue was initialized.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Trivial fix to spelling mistake struct field name, rename it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
ibmvnic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
ibmvnic_netpoll_controller() was completely wrong anyway,
as it was scheduling NAPI to service RX queues (instead of TX),
so I doubt netpoll ever worked on this driver.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
sfc-falcon uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Cc: Edward Cree <ecree@solarflare.com>
Cc: Bert Kenward <bkenward@solarflare.com>
Acked-By: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
sfc uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Edward Cree <ecree@solarflare.com>
Cc: Bert Kenward <bkenward@solarflare.com>
Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Acked-By: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
ena uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Netanel Belgazal <netanel@amazon.com>
Cc: Saeed Bishara <saeedb@amazon.com>
Cc: Zorik Machulsky <zorik@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
netxen uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Manish Chopra <manish.chopra@cavium.com>
Cc: Rahul Verma <rahul.verma@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
qlcnic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Harish Patil <harish.patil@cavium.com>
Cc: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
hns uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
ehea uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Douglas Miller <dougmill@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
hinic uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Note that hinic_netpoll() was incorrectly scheduling NAPI
on both RX and TX queues.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2018-09-27
This series contains fixes to the ice driver only.
Jake fixes a potential crash due to attempting to access the mutex which
is already destroyed. Fix this by using rq.count and sq.count to
determine if the queue was initialized. Fixed the current logic for
checking the firmware version to properly handle situations when
firmware major/minor versions differ and when the branch version
differs.
Bruce replaces a memcpy() with a direct assignment, which is preferred.
Also updated the branding strings and device ids supported by the
driver. Fixed the "ethtool -G" command in the driver, which was always
returning EINVAL when changing the descriptor ring size.
Brett update and clarified code comments.
Anirudh updates the driver to ensure we query the firmware for the
transmit scheduler node information before adding it to the driver
database, to ensure we have the current information. Also update the
"get capabilities" command to get device and function capabilities.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The structure shared between driver and the management FW (mfw) differ in
sizes. This would lead to issues when driver try to access the structure
members which are not-aligned with the mfw copy e.g., data_ptr usage in the
case of mfw_tlv request.
Align the driver structure with mfw copy, add reserved field(s) to driver
structure for the members not used by the driver.
Fixes: dd006921d6 ("qed: Add MFW interfaces for TLV request support.)
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
The user's coal configuration will be lost after reset, so the tx_coal
and rx_coal fields are added to the struct hns_nic_priv to save the coal
configuration and used to restore the user's configuration after the reset
is complete.
Fixes: bb6b94a896 ("net: hns3: Add reset interface implementation in client")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current hns3_get_max_available_channels returns the total number
of queues for the device, which makes ethtool -L set the number of queues
per channel queues incorrectly, so hns3_get_max_available_channels should
return the maximum available number of queues per channel, depending on
the total number of queues allocated and the hardware configurations.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hclge_tm_schd_info_update should return an error when num_tc is greater
than alloc_tqps.
This patch changes the return type of hnae3_register_ae_algo from void
to int.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently hns3_nic_change_mtu will try to down the netdev before
setting mtu, and it does not up the netdev when the setting fails,
which causes netdev not up problem.
This patch fixes it by not returning when the setting fails.
Fixes: a8e8b7ff35 ("net: hns3: Add support to change MTU in HNS3 hardware")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware expects a unit of 128 bytes when setting
packet buffer. When calculating the packet buffer size,
hclge_rx_buffer_calc does not round up the size as a unit
of 128 byte, which may casue packet lost problem when stress
testing.
This patch fixes it by rounding up packet size when calculating.
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds serdes parallel inner loopback support for self test.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In fact, our implementation of mac loopback is the implementation of app
loopback now. Current name is wrong. This patch renames mac loopback to
app loopback.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Our loop mode includes mac loop, serdes loop and phy loop. Not all of them
are related with mac. This patch corrects their names.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The extra mac address of pause param is used to do double check
for pause frame. This patch set it to HW. If we do not do that,
pfc pause frame will be transferred protocol stack when normal
flow control mode is enabled.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for sctp checksum offload.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch that removed the only users of the oldadv/newadv variables
accidentally left the now-unused declarations behind:
drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c: In function 'dpaa_set_pauseparam':
drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c:185:14: error: unused variable 'oldadv' [-Werror=unused-variable]
drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c:185:6: error: unused variable 'newadv' [-Werror=unused-variable]
Fixes: 70814e819c ("net: ethernet: Add helper for set_pauseparam for Asym Pause")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warning:
drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c:873:5: warning:
symbol 'aq_fw1x_set_power' was not declared. Should it be static?
Fixes: a0da96c08c ("net: aquantia: implement WOL support")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/qlogic/qed/qed_ooo.c: In function 'qed_ooo_delete_isles':
drivers/net/ethernet/qlogic/qed/qed_ooo.c:354:30: warning:
variable 'p_archipelago' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/qlogic/qed/qed_ooo.c: In function 'qed_ooo_join_isles':
drivers/net/ethernet/qlogic/qed/qed_ooo.c:463:30: warning:
variable 'p_archipelago' set but not used [-Wunused-but-set-variable]
Since commit 1eec2437d1 ("qed: Make OOO archipelagos into an array"),
'p_archipelago' is no longer in use.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Fix a long standing race bug when destroying comp_event file descriptors
- srp, hfi1, bnxt_re: Various driver crashes from missing validation and
other cases
- Fixes for regressions in patches merged this window in the gid cache,
devx, ucma and uapi.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlutG7QACgkQOG33FX4g
mxps3Q//fPL1o4Na/nx/aSmtfcEuPxjXwczUbB0MzlN7rKu0HjrES/DOXbGJZgS7
5XGu7TQyOBWzpZHZM4Xs+EjTdVY9pOK5+Q+t76Ns1vZsblCATJPOCIYMVkr+zM5T
zQxEfarlz8kEPWVz/TBAM7djoSkGXNLQQ1ttSVJBSAindhb25NzoiV1fsFGkXKmT
xbZHW03K1o4dZMf0JJO77eu1m9tkRfX0iFXBg4efgNOZtl9MP4l5lRwPqyRdGhMO
MvBNCJte/4dKlg5TeoVtp8B0sJvrO6QOj878VqYH4nL8uW2lZWFuzB7ZWxWGxMb/
VgXwpmXyDuTUjX9gn3kQToG8L9FJqsMzPZGBePMjilJqpgpqoVZpQ8i674JBZznF
as/bRSLdkA4PL5upFepBD/j9ep9LGIb6ojDcUSmwB+aJ99ZtlFUOhdf4ehQ2x9MU
WETIcdSqlmFjhY2IoHFKEr5PP3NGIDtPlvZbAmJrrne3uTCwXChXeZgzX1S6AHkY
djvDEaV8DrMmdJc0XTIOIQvxzMnHuoCo3oHdVaJMLudGkw1NKMN84SkIPb6U2Qjk
QiPf3pRRUazO6tqJfEZr8Es5lQjYkZUK4wKagOJKdcLBAubEnMhrsvECXQuoFWT8
sYhzK3AwyEaXp4Y3VbA+aqjZA/AhtPk/l3QKesU6u7hoLtiawoc=
=cjxG
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Jason writes:
"Second RDMA rc pull request
- Fix a long standing race bug when destroying comp_event file descriptors
- srp, hfi1, bnxt_re: Various driver crashes from missing validation
and other cases
- Fixes for regressions in patches merged this window in the gid
cache, devx, ucma and uapi."
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/core: Set right entry state before releasing reference
IB/mlx5: Destroy the DEVX object upon error flow
IB/uverbs: Free uapi on destroy
RDMA/bnxt_re: Fix system crash during RDMA resource initialization
IB/hfi1: Fix destroy_qp hang after a link down
IB/hfi1: Fix context recovery when PBC has an UnsupportedVL
IB/hfi1: Invalid user input can result in crash
IB/hfi1: Fix SL array bounds check
RDMA/uverbs: Fix validity check for modify QP
IB/srp: Avoid that sg_reset -d ${srp_device} triggers an infinite loop
ucma: fix a use-after-free in ucma_resolve_ip()
RDMA/uverbs: Atomically flush and mark closed the comp event queue
cxgb4: fix abort_req_rss6 struct
rx_mini_pending was set to an incorrect value. This was causing EINVAL to
always be returned to 'ethtool -G'. The driver does not support mini or
jumbo rings so the respective settings should be zero.
Also, change the valid range of the number of descriptors in the rings to
make the code simpler and easier for users to understand (this removes the
valid settings of 8 and 16). Add a system log message indicating when the
number is rounded-up from what the user specifies with the 'ethtool -G'
command (i.e. when it is not a multiple of 32), and update the log message
when a user-provided value is out of range to also indicate the stride.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch makes a couple of changes in the way the driver uses the
"get capabilities" command.
1. Get device capabilities in addition to function capabilities
2. Align to latest spec by using cap_count to determine size of the
buffer in case of length error.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Query the Tx scheduler tree node information from FW before adding it to
the driver's software database. This will keep the node information current
in driver.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Previously the comment stated that VSI lists should be used when a
second VSI becomes a subscriber to the "VLAN address". VSI lists
are always used for VLAN membership, so replace "VLAN address" with
"MAC address". Also note that VLAN(s) always use VSI list rules.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We have MAX_FW_API_VER_BRANCH, MAX_FW_API_VER_MAJOR, and
MAX_FW_API_VER_MINOR that we use in ice_controlq.h to test when a
firmware version is newer than expected. This is currently tested by
comparing each field separately. Thus, we compare the branch field
against the MAX_FW_API_VER_BRANCH, and so forth.
This means that currently, if we suppose that the max firmware version
is defined as 0.2.1, i.e.
Then firmware 0.1.3 will fail to load. This is because the minor version
3 is greater than the max minor version 1.
This is not intuitive, because of the notion that increasing the major
firmware version to 2 should mean any firmware version with a major
version is less than 2 should be considered older than 2...
In order to allow both 0.2.1 and 0.1.3 to load, you would have to define
the "max" firmware version as 0.2.3.. It is possible that such
a firmware version doesn't even exist yet!
Fix this by replacing the current logic with an updated check that
behaves as follows:
First, we check the major version. If it is greater than the expected
version, then we prevent driver load. Additionally, a warning message is
logged to indicate to the system administrator that they need to update
their driver. This is now the only case where the driver will refuse to
load.
Second, if the major version is less than the expected version, we log
an information message indicating the NVM should be updated.
Third, if the major version is exact, we'll then check the minor
version. If the minor version is more than two versions less than
expected, we log an information message indicating the NVM should be
updated. If it is more than two versions greater than the expected
version, we log an information message that the driver should be
updated.
To support this, the ice_aq_ver_check function needs its signature
updated to pass the HW structure. Since we now pass this structure,
there is no need to pass the firmware API versions separately.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Update branding strings and remove device ids 0x1594 and 0x1595.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Direct assignment is preferred over a memcpy()
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When shutting down the controlqs, we check if they are initialized
before we shut them down and destroy the lock. This is important, as it
prevents attempts to access the lock of an already shutdown queue.
Unfortunately, we checked rq.head and sq.head as the value to determine
if the queue was initialized. This doesn't work, because head is not
reset when the queue is shutdown. In some flows, the adminq will have
already been shut down prior to calling ice_shutdown_all_ctrlqs. This
can result in a crash due to attempting to access the already destroyed
mutex.
Fix this by using rq.count and sq.count instead. Indeed, ice_shutdown_sq
and ice_shutdown_rq already indicate that this is the value we should be
using to determine of the queue was initialized.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The current netpoll implementation in the bnxt_en driver has problems
that may miss TX completion events. bnxt_poll_work() in effect is
only handling at most 1 TX packet before exiting. In addition,
there may be in flight TX completions that ->poll() may miss even
after we fix bnxt_poll_work() to handle all visible TX completions.
netpoll may not call ->poll() again and HW may not generate IRQ
because the driver does not ARM the IRQ when the budget (0 for netpoll)
is reached.
We fix it by handling all TX completions and to always ARM the IRQ
when we exit ->poll() with 0 budget.
Also, the logic to ACK the completion ring in case it is almost filled
with TX completions need to be adjusted to take care of the 0 budget
case, as discussed with Eric Dumazet <edumazet@google.com>
Reported-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mvneta controller can handle speeds up to 2500Mbps on the SGMII
interface. This relies on serdes configuration, the lane must be
configured at 3.125Gbps and we can't use in-band autoneg at that speed.
The main issue when supporting that speed on this particular controller
is that the link partner can send ethernet frames with a shortened
preamble, which if not explicitly enabled in the controller will cause
unexpected behaviours.
This was tested on Armada 385, with the comphy configuration done in
bootloader.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_iwarp.c:1713:25: warning: implicit
conversion from enumeration type 'enum tcp_ip_version' to different
enumeration type 'enum qed_tcp_ip_version' [-Wenum-conversion]
cm_info->ip_version = TCP_IPV4;
~ ^~~~~~~~
drivers/net/ethernet/qlogic/qed/qed_iwarp.c:1733:25: warning: implicit
conversion from enumeration type 'enum tcp_ip_version' to different
enumeration type 'enum qed_tcp_ip_version' [-Wenum-conversion]
cm_info->ip_version = TCP_IPV6;
~ ^~~~~~~~
2 warnings generated.
Use the appropriate values from the expected type, qed_tcp_ip_version:
TCP_IPV4 = QED_TCP_IPV4 = 0
TCP_IPV6 = QED_TCP_IPV6 = 1
Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns when a constant is used in a boolean context as it thinks a
bitwise operation may have been intended.
drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: warning: use of logical
'&&' with constant operand [-Wconstant-logical-operand]
if (!p_iov->b_pre_fp_hsi &&
^
drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: note: use '&' for a
bitwise operation
if (!p_iov->b_pre_fp_hsi &&
^~
&
drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: note: remove constant
to silence this warning
if (!p_iov->b_pre_fp_hsi &&
~^~
1 warning generated.
This has been here since commit 1fe614d10f ("qed: Relax VF firmware
requirements") and I am not entirely sure why since 0 isn't a special
case. Just remove the statement causing Clang to warn since it isn't
required.
Link: https://github.com/ClangBuiltLinux/linux/issues/126
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_roce.c:153:12: warning: implicit
conversion from enumeration type 'enum roce_mode' to different
enumeration type 'enum roce_flavor' [-Wenum-conversion]
flavor = ROCE_V2_IPV6;
~ ^~~~~~~~~~~~
drivers/net/ethernet/qlogic/qed/qed_roce.c:156:12: warning: implicit
conversion from enumeration type 'enum roce_mode' to different
enumeration type 'enum roce_flavor' [-Wenum-conversion]
flavor = MAX_ROCE_MODE;
~ ^~~~~~~~~~~~~
2 warnings generated.
Use the appropriate values from the expected type, roce_flavor:
ROCE_V2_IPV6 = RROCE_IPV6 = 2
MAX_ROCE_MODE = MAX_ROCE_FLAVOR = 3
While we're add it, ditch the local variable flavor, we can just return
the value directly from the switch statement.
Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang complains when one enumerated type is implicitly converted to
another.
drivers/net/ethernet/qlogic/qed/qed_vf.c:686:6: warning: implicit
conversion from enumeration type 'enum qed_tunn_mode' to different
enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
QED_MODE_L2GENEVE_TUNN,
^~~~~~~~~~~~~~~~~~~~~~
Update mask's parameter to expect qed_tunn_mode, which is what was
intended.
Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:163:25: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->vxlan.tun_cls = type;
~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:165:26: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->l2_gre.tun_cls = type;
~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:167:26: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->ip_gre.tun_cls = type;
~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:169:29: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->l2_geneve.tun_cls = type;
~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:171:29: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->ip_geneve.tun_cls = type;
~ ^~~~
5 warnings generated.
Avoid this by changing type to an int.
Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix to spelling mistake in DP_VERBOSE message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trival cleanup, list_move_tail will implement the same function that
list_del() + list_add_tail() will do. hence just replace them.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trival cleanup, list_move_tail will implement the same function that
list_del() + list_add_tail() will do. hence just replace them.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2018-09-25
This series contains updates to i40e and xsk.
Mariusz fixes an issue where the VF link state was not being updated
properly when the PF is down or up. Also cleaned up the promiscuous
configuration during a VF reset.
Patryk simplifies the code a bit to use the variables for PF and HW that
are declared, rather than using the VSI pointers. Cleaned up the
message length parameter to several virtchnl functions, since it was not
being used (or needed).
Harshitha fixes two potential race conditions when trying to change VF
settings by creating a helper function to validate that the VF is
enabled and that the VSI is set up.
Sergey corrects a double "link down" message by putting in a check for
whether or not the link is up or going down.
Björn addresses an AF_XDP zero-copy issue that buffers passed
from userspace to the kernel was leaked when the hardware descriptor
ring was torn down. A zero-copy capable driver picks buffers off the
fill ring and places them on the hardware receive ring to be completed at
a later point when DMA is complete. Similar on the transmit side; The
driver picks buffers off the transmit ring and places them on the
transmit hardware ring.
In the typical flow, the receive buffer will be placed onto an receive
ring (completed to the user), and the transmit buffer will be placed on
the completion ring to notify the user that the transfer is done.
However, if the driver needs to tear down the hardware rings for some
reason (interface goes down, reconfiguration and such), the userspace
buffers cannot be leaked. They have to be reused or completed back to
userspace.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When an AF_XDP UMEM is attached to any of the Rx rings, we disallow a
user to change the number of descriptors via e.g. "ethtool -G IFNAME".
Otherwise, the size of the stash/reuse queue can grow unbounded, which
would result in OOM or leaking userspace buffers.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Outstanding Rx descriptors are temporarily stored on a stash/reuse
queue. When/if the HW rings comes up again, entries from the stash are
used to re-populate the ring.
The latter required some restructuring of the allocation scheme for
the AF_XDP zero-copy implementation. There is now a fast, and a slow
allocation. The "fast allocation" is used from the fast-path and
obtains free buffers from the fill ring and the internal recycle
mechanism. The "slow allocation" is only used in ring setup, and
obtains buffers from the fill ring and the stash (if any).
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the zero-copy enabled XDP Tx ring is torn down, due to
configuration changes, outstanding frames on the hardware descriptor
ring are queued on the completion ring.
The completion ring has a back-pressure mechanism that will guarantee
that there is sufficient space on the ring.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
msglen parameter seems to be unused in several virtchnl function.
This patch removes it from signatures of those functions.
Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When isup is false meaning that interface is going to shut down
set new speed to 0 to avoid double 'NIC Link is Down' messages.
Signed-off-by: Sergey Nemov <sergey.nemov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When we are trying to change VF settings, it is possible for 2 race
conditions to happen. One, when the VF is created but not yet enabled.
Second, the VF is enabled but the VSI is still not created or not yet
re-created in the VF reset flow.
This patch introduces a helper function to validate that the VF is
enabled and that the VSI is set up. This patch also calls this
function from other functions which could get into these race conditions.
While we are poking around here, remove unnecessary parenthesis that
checkpatch was complaining about.
Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In order to slightly simplify the code use the variables for pf and hw
that are declared in i40e_set_mac function.
Signed-off-by: Patryk Małek <patryk.malek@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch cleans up promiscuous configuration when a VF reset occurs.
Previously the promiscuous mode settings were still there after the VF
driver removal.
Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This resolves an issue where the VF link state was not being updated
when the PF is down or up, and the VF link state would always show
that it is running.
Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If SMMU is on, there is more likely that skb_shinfo(skb)->frags[i]
can not send by a single BD. when this happen, the
hns_nic_net_xmit_hw function map the whole data in a frags using
skb_frag_dma_map, but unmap each BD' data individually when tx is
done, which causes problem when SMMU is on.
This patch fixes this problem by ummapping the whole data in a
frags when tx is done.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Version bump conflict in batman-adv, take what's in net-next.
iavf conflict, adjustment of netdev_ops in net-next conflicting
with poll controller method removal in net.
Signed-off-by: David S. Miller <davem@davemloft.net>
Clear ADDR64 dma bit in DMACFG register in case that HW_DMA_CAP_64B is
not detected on 64bit system.
The issue was observed when bootloader(u-boot) does not check macb
feature at DCFG6 register (DAW64_OFFSET) and enabling 64bit dma support
by default. Then macb driver is reading DMACFG register back and only
adding 64bit dma configuration but not cleaning it out.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clear ADDR64 dma bit in DMACFG register in case that HW_DMA_CAP_64B is
not detected on 64bit system.
The issue was observed when bootloader(u-boot) does not check macb
feature at DCFG6 register (DAW64_OFFSET) and enabling 64bit dma support
by default. Then macb driver is reading DMACFG register back and only
adding 64bit dma configuration but not cleaning it out.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The check for pci_is_pcie() is redundant here because all
chip versions >=18 are PCIe only anyway. In addition use
dma_set_mask_and_coherent() instead of separate calls to
pci_set_dma_mask() and pci_set_consistent_dma_mask().
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Code can be slightly simplified by acking even events we're not
interested in. In addition add a comment making clear that the
read has no functional purpose and is just a PCI commit.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The networking core has a default watchdog timeout of 5s. I see no
need to define an own timeout of 6s which is basically the same.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set uid as part of DCT commands so that the firmware can manage the
DCT object in a secured way.
That will enable using a DCT that was created by verbs application
to be used by the DEVX flow in case the uid is equal.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Set uid as part of SRQ commands so that the firmware can manage the
SRQ object in a secured way.
That will enable using an SRQ that was created by verbs application
to be used by the DEVX flow in case the uid is equal.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Set uid as part of SQ commands so that the firmware can manage the
SQ object in a secured way.
That will enable using an SQ that was created by verbs application
to be used by the DEVX flow in case the uid is equal.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Set uid as part of RQ commands so that the firmware can manage the
RQ object in a secured way.
That will enable using an RQ that was created by verbs application
to be used by the DEVX flow in case the uid is equal.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Set uid as part of QP commands so that the firmware can manage the
QP object in a secured way.
That will enable using a QP that was created by verbs application to
be used by the DEVX flow in case the uid is equal.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Set uid as part of CQ commands so that the firmware can manage the CQ
object in a secured way.
The firmware should mark this CQ with the given uid so that it can
be used later on only by objects with the same uid.
Upon DEVX flows that use this CQ (e.g. create QP command), the
pointed CQ must have the same uid as of the issuer uid command.
When a command is issued with uid=0 it means that the issuer of the
command is trusted (i.e. kernel), in that case any pointed object
can be used regardless of its uid.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Until now, the Rx flow hash key was a 5-tuple (IP src, IP dst,
IP nextproto, L4 src port, L4 dst port) fixed value that we
configured at probe.
Add support for configuring this hash key at runtime.
We support all standard header fields configurable through ethtool,
but cannot differentiate between flow types, so the same hash key
is applied regardless of protocol.
We also don't support the discard option.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With CONFIG_DMA_API_DEBUG enabled we get DMA unmapping warning in
various places of the mvneta driver, for example when putting down an
interface while traffic is passing through.
The issue is when using s/w buffer management, the Rx buffers are mapped
using dma_map_page but unmapped with dma_unmap_single. This patch fixes
this by using the right unmapping function.
Fixes: 562e2f467e ("net: mvneta: Improve the buffer allocation method for SWBM")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SPI protocol for the QCA7000 doesn't have any fault detection.
In order to increase the drivers reliability in noisy environments,
we could implement a write verification inspired by the enc28j60.
This should avoid situations were the driver wrongly assumes the
receive interrupt is enabled and miss all incoming packets.
This function is disabled per default and can be controlled via module
parameter wr_verify.
Signed-off-by: Michael Heimpold <michael.heimpold@i2se.com>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit allows each TXQ to be picked in a round-robin fashion by
the PPv2 transmit scheduling mechanism. This is opposed to the default
behaviour that prioritizes the highest numbered queues.
Suggested-by: Yan Markman <ymarkman@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the PPv2 controller has multiple TX queues, we can spread traffic
by assining TX queues to CPUs, allowing to use XPS to balance egress
traffic between CPUs.
Suggested-by : Yan Markman <ymarkman@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes skb_shared area, which will be corrupted
upon reception of 4K jumbo packets.
Originally build_skb usage purpose was to reuse page for skb to eliminate
needs of extra fragments. But that logic does not take into account that
skb_shared_info should be reserved at the end of skb data area.
In case packet data consumes all the page (4K), skb_shinfo location
overflows the page. As a consequence, __build_skb zeroed shinfo data above
the allocated page, corrupting next page.
The issue is rarely seen in real life because jumbo are normally larger
than 4K and that causes another code path to trigger.
But it 100% reproducible with simple scapy packet, like:
sendp(IP(dst="192.168.100.3") / TCP(dport=443) \
/ Raw(RandString(size=(4096-40))), iface="enp1s0")
Fixes: 018423e90b ("net: ethernet: aquantia: Add ring support code")
Reported-by: Friedemann Gerold <f.gerold@b-c-s.de>
Reported-by: Michael Rauch <michael@rauch.be>
Signed-off-by: Friedemann Gerold <f.gerold@b-c-s.de>
Tested-by: Nikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
nfp uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
bnxt uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
bnx2x uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
mlx5 uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
mlx4 uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
i40evf uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
ice uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
igb uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
ixgb uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
This also removes a problematic use of disable_irq() in
a context it is forbidden, as explained in commit
af3e0fcf78 ("8139too: Use disable_irq_nosync() in
rtl8139_poll_controller()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
lasts for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
fm10k uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
ixgbevf uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
ixgbe uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Reported-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Up until now, mlxsw tolerated firmware versions that weren't exactly
matching the required version, if the branch number matched. That
allowed the users to test various firmware versions as long as they were
on the right branch.
On the other hand, it made it impossible for mlxsw to put a hard lower
bound on a version that fixes all problems known to date. If a user had
a somewhat older FW version installed, mlxsw would start up just fine,
possibly performing non-optimally as it would use features that trigger
problematic behavior.
Therefore tweak the check to accept any FW version that is:
- on the same branch as the preferred version, and
- the same as or newer than the preferred version.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes hclge_get_port_type which is redundant.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Our VF has not implemented the ops for get_port_type. So when we executing
ethtool ethx cmd of VF, hns3_get_link_ksettings will return directly. And
we can not query anything.
To support get_link_ksettings for VF, this patch replaces get_port_type
with get_media_type. If the media type is HNAE3_MEDIA_TYPE_NONE,
hns3_get_link_ksettings will return link information of VF.
Fixes: 12f46bc1d4 ("net: hns3: Refine hns3_get_link_ksettings()")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the ops of get_media_type support for VF.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are already multiple types packets statistics for error packets,
it's unnecessary to print them, which may affect the rx performance if
print too many.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For dma_mapping_error is unlikely happened, this patch adds unlikely for
dma_mapping_error check.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When nic down, it firstly calls netif_tx_stop_all_queues(), then calls
napi_disable(). But napi_disable() will wait current napi_poll finish,
it may call netif_tx_wake_queue(). This patch fixes it by add nic state
checking.
Fixes: 424eb834a9 ("net: hns3: Unified HNS3 {VF|PF} Ethernet Driver for hip08 SoC")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are a few "switch-case" codes missed handle for default case. For
some abnormal case, it should return error code instead of return 0.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The prefix of most functions for vf are hclgevf. This patch renames the
function with inconsistent prefix.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two tqp_num variables "hdev->tqp_num" and "kinfo->tqp_num"
used in VF. "hdev->tqp_num" is the total tqp number allocated to the
VF, and "kinfo->tqp_num" indicates the tqp number being used by the
VF. Usually the two variables are equal. But for the case hdev->tqp_num
larger than rss_size_max, and num_tc is 1, "kinfo->tqp_num" will be
less than "hdev->tqp_num".
In original codes, "hdev->tqp_num" is always used to traverse the
tqp array of kinfo. It may cause null pointer error when "hdev->tqp_num"
is larger than "kinfo->tqp_num"
Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some prefix of tx/rx statistic names are redundant, this patch modifies
these names.
The new prefix looks like below:
rxq#1_ -> rxq1_
txq#1_ -> txq1_
tx_dropped -> dropped
tx_wake -> wake
tx_busy -> busy
rx_dropped -> dropped
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For desc.data is already point to the address of struct member "data[6]",
it's unnecessary to use '&' to get its address. This patch unifies all
the type convert for dest.data, using "req = (struct name *)dest.data".
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a defect in hclge_ets_validate(). If each member of tc_tsa is
not IEEE_8021QAZ_TSA_ETS, the variable total_ets_bw won't be updated.
In this case, the check for value of total_ets_bw will fail. This patch
fixes it by checking total_ets_bw only after it has been updated.
Fixes: cacde272dd ("net: hns3: Add hclge_dcb module for the support of DCB feature")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The previous commit to ravb had the side effect of making the PHY
advertise Pause and Asym Pause, which previously did not happen. By
default, phydev->supported has both forms of pause enabled, but
phydev->advertising does not. The new phy_remove_link_mode() copies
phydev->supported to phydev->advertising after removing the requested
link mode. These Pause configuration bits appears it stops the PHY
from completing Auto-Neg and the link remains down. Be explicit and
remove the Pause and Asym Pause modes, so restoring the old behavior.
Fixes: 41124fa64d ("net: ethernet: Add helper to remove a supported link mode")
Reported-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns that the address of a pointer will always evaluated as true
in a boolean context:
drivers/net/ethernet/mellanox/mlx4/eq.c:243:11: warning: address of
array 'eq->affinity_mask' will always evaluate to 'true'
[-Wpointer-bool-conversion]
if (!eq->affinity_mask || cpumask_empty(eq->affinity_mask))
~~~~~^~~~~~~~~~~~~
1 warning generated.
Use cpumask_available, introduced in commit f7e30f01a9 ("cpumask: Add
helper cpumask_available()"), which does the proper checking and avoids
this warning.
Link: https://github.com/ClangBuiltLinux/linux/issues/86
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns when a variable is assigned to itself.
drivers/net/ethernet/brocade/bna/bna_enet.c:1800:9: warning: explicitly
assigning value of variable of type 'int' to itself [-Wself-assign]
for (i = i; i < (bna->ioceth.attr.num_ucmac * 2); i++)
~ ^ ~
drivers/net/ethernet/brocade/bna/bna_enet.c:1835:9: warning: explicitly
assigning value of variable of type 'int' to itself [-Wself-assign]
for (i = i; i < (bna->ioceth.attr.num_mcmac * 2); i++)
~ ^ ~
2 warnings generated.
Link: https://github.com/ClangBuiltLinux/linux/issues/110
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns when multiple pairs of parentheses are used for a single
conditional statement.
drivers/net/ethernet/neterion/vxge/vxge-traffic.c:2265:31: warning:
equality comparison with extraneous parentheses [-Wparentheses-equality]
if ((hldev->config.intr_mode ==
VXGE_HW_INTR_MODE_MSIX_ONE_SHOT))
~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/neterion/vxge/vxge-traffic.c:2265:31: note: remove
extraneous parentheses around the comparison to silence this warning
if ((hldev->config.intr_mode ==
VXGE_HW_INTR_MODE_MSIX_ONE_SHOT))
~ ^ ~
drivers/net/ethernet/neterion/vxge/vxge-traffic.c:2265:31: note: use '='
to turn this equality comparison into an assignment
if ((hldev->config.intr_mode ==
VXGE_HW_INTR_MODE_MSIX_ONE_SHOT))
^~
=
1 warning generated.
Link: https://github.com/ClangBuiltLinux/linux/issues/124
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove a trailing underscore from the multicast/unicast names.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Provide current link status of VF in ndo_get_vf_config
handler.
Signed-off-by: Shahed Shaikh <Shahed.Shaikh@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a workaround for FW bug -
MFW generates bandwidth attention in single function mode, which
is only expected to be generated in multi function mode.
This undesired attention in SF mode results in incorrect HW
configuration and resulting into Tx timeout.
Signed-off-by: Shahed Shaikh <Shahed.Shaikh@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for `ndo_set_vf_spoofchk' to allow PF control over
its VF spoof-checking configuration.
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When extracting frames from the Ocelot switch, the frame check sequence
(FCS) is present at the end of the data extracted. The FCS was put into
the sk buffer which introduced some issues (as length related ones), as
the FCS shouldn't be part of an Rx sk buffer.
This patch fixes the Ocelot switch extraction behaviour by discarding
the FCS.
Fixes: a556c76adc ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
kfree_skb has taken the null pointer into account. hence it is safe
to remove the redundant null pointer check before kfree_skb.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
kfree_skb has taken the null pointer into account. hence it is safe
to remove the redundant null pointer check before kfree_skb.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The continue will not truely skip any code. hence it is safe to
remove it.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The continue will not truely skip any code. hence it is safe to
remove it.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It was reported that chip version 33 (RTL8168E) ends up with
10MBit/Half on a 1GBit link after resuming from S3 (with different
link partners). For whatever reason the PHY on this chip doesn't
properly start a renegotiation when soft-reset.
Explicitly requesting a renegotiation fixes this.
Fixes: a2965f12fd ("r8169: remove rtl8169_set_speed_xmii")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reported-by: Neil MacLeod <neil@nmacleod.com>
Tested-by: Neil MacLeod <neil@nmacleod.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt offload code currently supports only 'push' and 'pop' operation: let
.ndo_setup_tc() return -EOPNOTSUPP if VLAN 'modify' action is configured.
Fixes: 2ae7408fed ("bnxt_en: bnxt: add TC flower filter offload support")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The header ocelot_dev_gmii.h is unused since the inclusion of the driver.
It is unused, lets just remove it.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
MC-aware mode was introduced to mlxsw in commit 7b81953066 ("mlxsw: spectrum:
Configure MC-aware mode on mlxsw ports") and fixed up later in commit
3a3539cd36 ("mlxsw: spectrum_buffers: Set up a dedicated pool for BUM
traffic"). As the final piece of puzzle, a firmware issue whereby a wrong
priority was assigned to BUM traffic was corrected in FW version 13.1703.4.
Therefore require this FW version in the driver.
Fixes: 7b81953066 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SBMM register configures shared buffer allocation and settings for
MC packets according to switch priority. The recommended values are no
reserved buffer and alpha of 1/4, which corresponds to buf_max of 6.
Update mlxsw_sp_sb_mms accordingly.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pool 15 (indexed as 8) is dedicated to MC traffic. Its configuration has
been kept at default, because the table-based configuration wasn't
expressive enough to allow the explicit configuration.
Now that the configuration of pool 15 can be described, do so. The MC
pool should have infinite size, infinite per-TC quota, and per-port
limit of 90K.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some pools configured through the sb_pm entries may have by default
static size. The MC pool is now not explicitly configured, however it
gets configured as static implicitly by 0-initializing sb->prs, and a
follow-up patch adds an explicit configuration to the same effect.
To support this, pass max_buff taken from sb_pm and sb_cm entries
through cell conversion before handing it to mlxsw_sp_sb_pm_write(), if
the pool that the sb_pm entry configures is statically-sized.
To keep current behavior, update mlxsw_sp_sb_cms_egress[] to denote
buffer sizes in bytes (assuming Spectrum 1 cell sizes, which the
original code assumed as well) instead of cells. Note that a follow-up
patch changes this to infinite size.
Also tweak a comment at SBMM configuration to remain true now that
statically-sized pools exist.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SBPM register configures the shared buffer allocation and
configuration per port and pool. The min_buff value is the buffer size
dedicated to this single function, and is configured in cells.
Currently, all sb_pm entries have 0 for min_buff, and therefore the
actual unit is immaterial. However, in a follow-up patch we want to add
entries with non-zero minimum.
Therefore pass the min_buff from the sb_pm table through the cell
conversion before handing it over to mlxsw_sp_sb_pm_write().
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SBCM register configures the shared buffer configuration according
to port and TC. So far all pools have had a dynamic size, where the
infinite size is easy to express by using max_buff of 0xff. However the
MC pool should be configured with static size, and the infinite size
thus needs to be set using the field SBCM.infi_max.
Therefore add the field infi_max to the SBCM register and to
mlxsw_reg_sbcm_pack(). Extend mlxsw_sp_sb_cm_write() to handle infinite
sizes as well. Report infinite pool limits as if the limit actually were
the total shared buffer size.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MC pool should have an infinite size (i.e. no quota).
To that end, add infi_size to the SBPR register and extend
mlxsw_reg_sbpr_pack(). Also add MLXSW_SP_SB_INFI to denote
buffers that should have an infinite size.
Change mlxsw_sp_sb_pr_write() to take as parameter byte size,
instead of cell size, and add the special handling of infinite
buffers. Report pools with infinite size as if they actually
take the full shared buffer size.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Entities of infinite size will be reported as if they had the maximum
size allowed by the chip. To that end, keep track of maximum shared
buffer size in mlxsw_sp->sb.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current code assumes that ingress and egress has the same number of
traffic classes. Since the introduction of MC-aware mode that assumption
hasn't held anymore, and there have been 16 TCs on the egress as opposed
to 8 on ingress.
Break the assumption of symmetry by splitting the artifacts related to
shared-buffer TC counting to ingress and egress parts.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, mlxsw assumes that each ingress pool has its egress
counterpart, and that pool index for purposes of caching matches the
index with which the hardware should be configured. As we want to expose
the MC pool, both of these assumptions break.
Instead, maintain the pool index as long as possible. Unify ingress and
egress caches and use the pool index as cache index as well. Only
translate to FW pool numbering when actually packing the registers. This
simplifies things considerably, as the pool index is the only quantity
necessary to uniquely identify a pool, and the pool/direction split is
not necessary until firmware is talked to.
To support the mapping between pool indices and pool numbers and
directions, which is not neatly mathematical anymore, introduce a pool
descriptor table, indexed by pool index, to facilitate the translation.
Include the MC pool in the descriptor table as well, so that it can be
referenced from mlxsw_sp_sb_cms_egress.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With introduction of MC-aware mode to mlxsw, it became necessary to
configure TCs above 7 as well. There is now code in mlxsw to disable ETS
for these higher classes, but disablement of max shaper was neglected.
By default, max shaper is currently disabled to begin with, so the
problem is just cosmetic. However, for symmetry, do like we do for ETS
configuration, and call mlxsw_sp_port_ets_maxrate_set() for both TC i
and i + 8.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support to configure the DORQ to use vlan-id/priority for
roce EDPM.
Fixes: cac6f691 ("qed: Add support for Unified Fabric Port")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In certain multi-function switch dependent modes, firmware adds vlan tag 0
to the untagged frames. This leads to double tagging for the traffic
if the dcbx is enabled, which is not the desired behavior. To avoid this,
driver needs to set "dcb_dont_add_vlan0" flag.
Fixes: cac6f691 ("qed: Add support for Unified Fabric Port")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In multi-function mode, driver receives the stag value (outer vlan)
for a PF from management FW (MFW). If the stag value is negotiated prior to
the driver load, then the stag is not notified to the driver and hence
driver will have the invalid stag value.
The fix is to request the MFW for STAG value during the driver load time.
Fixes: cac6f691 ("qed: Add support for Unified Fabric Port")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/atheros/atlx/atl1.c: In function 'atl1_set_link_ksettings':
drivers/net/ethernet/atheros/atlx/atl1.c:3280:6: warning:
variable 'advertising' set but not used [-Wunused-but-set-variable]
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/intel/e1000/e1000_main.c: In function 'e1000_watchdog':
drivers/net/ethernet/intel/e1000/e1000_main.c:2436:9: warning:
variable 'txb2b' set but not used [-Wunused-but-set-variable]
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The local variable 'index_specified' is never used after being assigned.
hence it should be redundant adn can be removed.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NFP supports fairly enormous ring sizes (up to 256k descriptors).
In commit 4662717038 ("nfp: use kvcalloc() to allocate SW buffer
descriptor arrays") we have started using kvcalloc() functions to
make sure the allocation of software state arrays doesn't hit
the MAX_ORDER limit. Unfortunately, we can't use virtual mappings
for the DMA region holding HW descriptors. In case this allocation
fails instead of the generic (and fairly scary) warning/splat in
the logs print a helpful message explaining what happened and
suggesting how to fix it.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Setting register 0x82 to value 01 is done a few lines before for all
chip versions <= 06 anyway. And setting PHY register 0x0b to value 00
is done at the end of rtl8169s_hw_phy_config() already. So we can
remove this.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PCI_LATENCY_TIMER is ignored on PCIe, therefore we have to do this
for the PCI chips (version <= 06) only. Also we can move setting
PCI_CACHE_LINE_SIZE.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
That local variable are never used after being assigned.
hence it should be redundant and can be removed.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The local variable 'k' is never used after being assigned.
hence it should be redundant adn can be removed.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With CONFIG_DMA_API_DEBUG enabled we now get a warning when using the
mvneta driver:
mvneta d0030000.ethernet: DMA-API: device driver frees DMA memory with
wrong function [device address=0x000000001165b000] [size=4096 bytes]
[mapped as page] [unmapped as single]
This is because when using the s/w buffer management, the Rx descriptor
buffer is mapped with dma_map_page but unmapped with dma_unmap_single.
This patch fixes this by using the right unmapping function.
Fixes: 562e2f467e ("net: mvneta: Improve the buffer allocation method for SWBM")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So far all the places calling hclge_tm_q_to_qs_map_cfg() are assigning
an u16 type value to "q_id", and in the processing of
hclge_tm_q_to_qs_map_cfg(), it also converts the "q_id" to le16.
The max tqp number for pf can be more than 256, we should use "u16" to
store the queue id, instead of "u8", which may cause data lost.
Fixes: 848440544b ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When roce is loaded before nic, the roce client will not be initialized
until nic client is initialized, but roce init flag is set before it.
Furthermore, in this case of nic initialized success and roce failed,
the nic init flag is not set, and roce init flag is not cleared.
This patch fixes it by set init flag only after the client is initialized
successfully.
Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If initialize client failed or finish uninitializing client, we should
clear the client pointer. It may cause unexpected result when use
uninitialized client. Meanwhile, we also should check whether client
exist when uninitialize it.
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to hardware's description, the head pointer register should
be written before the tail pointer register while initializing the vf
command queue. Otherwise, it may trigger an interrupt even though there
is no command received.
Fixes: fedd0c15d2 ("net: hns3: Add HNS3 VF IMP(Integrated Management Proc) cmd interface")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function of genphy_read_status is that reading phy information
from HW and using these information to update SW variable. If user
is using ethtool to setting the speed of phy and service task is calling
by hclge_get_mac_phy_link, the result of speed setting is uncertain.
Because ethtool cmd will modified phydev and hclge_get_mac_phy_link also
will modified phydev.
Because phy state machine will update phy link periodically, we can
just use phydev->link to check the link status. This patch removes
function call of genphy_read_status. To ensure accuracy, this patch
adds a phy state check. If phy state is not PHY_RUNNING, we consider
link is down. Because in some scenarios, phydev->link may be link up,
but phy state is not PHY_RUNNING. This is just an intermediate state.
In fact, the link is not ready yet.
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By default, HW link status is up. If hclge_update_link_status is called
before net up, driver will print "link up". It is not suitable. hdev
state check is needed when getting link status.
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We clear STATE_DOWN bit of hdev state when starting net, but do not set
it again when stopping net. It causes that the net is down, but hdev state
is still up. STATE_DOWN bit of hdev state should be set when stopping net.
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the .ndo_do_ioctl net_device_ops operation to support
the PHY MII ioctl for PF driver.
Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All pf have permission to read packet statistics of public in hardware,
but the read operation will clear registers which cause statistical
inaccuracy.
This patch removes all packet statistics of public.
Signed-off-by: Junxin Chen <chenjunxin1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The the actual Tx work is minimal, driver can clean up as more
Tx descriptors as possible in a irq.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds unlikely for buf_num check.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All irq will float to cpu0 if do not set irq affinity.
This patch adds default irq affinity in hns3 driver, users can
also change the irq affinity in OS.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, but the implementation in this
driver returns an 'int'.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the explicit call to netif_carrier_off() in
mvneta_open() as this is already handled in phylink_start().
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the explicit call to netif_carrier_off() in PPv2's
open() path, as this is now handled in phylink_start().
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the mvpp2_percpu_read/write/... functions aren't really per-cpu but
per s/w thread, rename them to include 'thread' instead of 'percpu'.
This is a cosmetic patch.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Marvell PPv2 network controller has 9 internal threads. The driver
works fine when there are less CPUs available than threads. This isn't
true if more CPUs are available. As this is a valid use case, handle
this particular case.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch maps all uses of the CPU to threads. All this_cpu calls are
replaced, and all smp_processor_id() calls are wrapped into the
indirection.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch reworks the Marvell PPv2 driver to stop using directly the
CPU number to access per-thread registers.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the Marvell PPv2 driver the mvpp2_read_relaxed function is only used
in a single file. Make it static and remove its prototype from the
header.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Marvell PPv2 driver has per-cpu functions. As they only are used in
the main file, make them static and remove their prototype from the
header.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Updates the PPv2 driver so that all CPU variables are unsigned, as it
makes no sense to have a negative CPU number. This patch is cosmetic.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Marvell PPv2.2 engine only has 8 Rx queues per CPU, while PPv2.1 has
16 of them. This patch updates the code so that the Rx queues mask width
is selected given the version of the network controller used.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates the probing function so that the queue mode isn't
updated while probing, as the driver would silently end up using a
configuration not wanted by the user. The patch adds an extra check to
validate the chosen queue mode instead, and the driver will fail to
probe if the configuration is invalid.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch renames the IRQs in the Marvell PPv2 driver as their current
names match the way they are used in software. But this will change in
the future, and those IRQs have nothing to do with Rx/Tx interrupts
(this can be configured). The new binding also describe more interrupts
as some where left out.
The old binding support is kept for backward compatibility.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch sets the number of s/w threads to 9, its maximum value,
instead of 8. This is not a fix as only 4 of the s/w threads were used
so far, but more could be used in the future.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch sets from two descriptor to one descriptor because R-Car Gen3
does not have the 4 bytes alignment restriction of the transmission buffer.
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
FIELD_SIZEOF is defined as a macro to calculate the specified value. Therefore,
We prefer to use the macro rather than calculating its value.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FIELD_SIZEOF is defined as a macro to calculate the specified value. Therefore,
We prefer to use the macro rather than calculating its value.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When no Tx IRQ is available, the txq_done() routine (called from
tx_done()) shouldn't be called from the polling function, as in such
case it is already called in the Tx path thanks to an hrtimer. This
mostly occurred when using PPv2.1, as the engine then do not have Tx
IRQs.
Fixes: edc660fa09 ("net: mvpp2: replace TX coalescing interrupts with hrtimer")
Reported-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
EtherAVB hardware requires 0 to be written to status register bits in
order to clear them, however, care must be taken not to:
1. Clear other bits, by writing zero to them
2. Write one to reserved bits
This patch corrects the ravb driver with respect to the second point above.
This is done by defining reserved bit masks for the affected registers and,
after auditing the code, ensure all sites that may write a one to a
reserved bit use are suitably masked.
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
module.h already contained moduleparam.h, so it is safe to remove
the redundant include.
The issue is detected with the help of Coccinelle.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch replace the custom definition of writeq/read and use ones
defined in linux/io-64-nonatomic-lo-hi.h.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch replace the custom definition of writeq/read and use ones
defined in linux/io-64-nonatomic-lo-hi.h.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following compile warning:
drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c:49:5: warning: nvm_param.dir_type may be used uninitialized in this function [-Wmaybe-uninitialized]
if (nvm_param.dir_type == BNXT_NVM_PORT_CFG)
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, also the implementation in this
driver has returns 'netdev_tx_t' value, so just change the function
return type to netdev_tx_t.
Found by coccinelle.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJboCdPAAoJEEg/ir3gV/o+OqQH/2U2kCMUHRuG6WGEbm+bTz7m
fF2L1iRMeXvpmD0FrWs1XiBEg1/6Gf6GfNLmuwRqhHJMi/ok1310V5UbQ3dcXEQ5
7rIzBUbd2jyrI5wln9ISgmvgi/r7Cdzf9xNM3MHmPCouNWxbX85VLTGdTXcsUknd
Z7lZHbPjB8aeZEBmeQdNbGvwZOe9IIgTlfEN4gwrYvUtJD82HrGgGpdLvnSWgy4X
RhKdjvhkYx5ba3AH9CbY88oR5LRjhk3dGhngpMDvQZBmqE0W0j8tXkR8bj36zJN6
m2YJadeGlds7fubFORYaaDGuPBtCT5SnrJqboEMZ8fzuJMqdyvBNI/LtTRTIGAc=
=FOdR
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2018-09-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2018-09-17
Sorry about the previous submission of this series which was mistakenly
marked for net-next, here I am resending with 'net' mark.
This series provides three fixes to mlx5 core and mlx5e netdevice
driver.
Please pull and let me know if there's any problem.
For -stable v4.16:
('net/mlx5: Check for SQ and not RQ state when modifying hairpin SQ')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
On the Netgear WNDAP620, the emac ethernet isn't receiving nor
xmitting any frames from/to the RTL8363SB (identifies itself
as a RTL8367RB).
This is caused by the emac hardware not knowing the forced link
parameters for speed, duplex, pause, etc.
This begs the question, how this was working on the original
driver code, when it was necessary to set the phy_address and
phy_map to 0xffffffff. But I guess without access to the old
PPC405/440/460 hardware, it's not possible to know.
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we are always setting the tail address of descriptor list to
the end of the pre-allocated list.
According to databook this is not correct. Tail address should point to
the last available descriptor + 1, which means we have to update the
tail address everytime we call the xmit function.
This should make no impact in older versions of MAC but in newer
versions there are some DMA features which allows the IP to fetch
descriptors in advance and in a non sequential order so its critical
that we set the tail address correctly.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Fixes: f748be531d ("stmmac: support new GMAC4")
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This follows David Miller advice and tries to fix coalesce timer in
multi-queue scenarios.
We are now using per-queue coalesce values and per-queue TX timer.
Coalesce timer default values was changed to 1ms and the coalesce frames
to 25.
Tested in B2B setup between XGMAC2 and GMAC5.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Fixes: ce736788e8 ("net: stmmac: adding multiple buffers for TX")
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2018-09-18
This series contains changes to i40evf so that it becomes a more
generic virtual function driver for current and future silicon.
While doing the rename of i40evf to a more generic name of iavf,
we also put the driver on a severe diet due to how much of the
code was unneeded or was unused. The outcome is a lean and mean
virtual function driver that continues to work on existing 40GbE
(i40e) virtual devices and prepped for future supported devices,
like the 100GbE (ice) virtual devices.
This solves 2 issues we saw coming or were already present, the
first was constant code duplication happening with i40e/i40evf,
when much of the duplicate code in the i40evf was not used or was
not needed. The second was to remove the future confusion of why
future VF devices that were not considered "40GbE" only devices
were supported by i40evf.
The thought is that iavf will be the virtual function driver for
all future devices, so it should have a "generic" name to properly
represent that it is the VF driver for multiple generations of
devices.
The last patch in this series is unreleated to the iavf conversion
and just has to do with a MODULE_LICENSE correction.
Known Caveats:
Existing user space configurations may have to change, but the module
alias in patch 1 helps a bit here.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We recently updated all our SPDX identifiers to correctly
indicate our net/ethernet/intel/* drivers were always released
and intended to be released under GPL v2, but the MODULE_LICENSE
declaration was never updated.
Fix the MODULE_LICENSE to be GPL v2, for all our drivers.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This finishes the process of renaming the files that
make sense to rename (skipping adminq related files that
talk to i40e), and fixes up the build and the #includes
so that everything builds nicely.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is the big rename patch, it takes most of the i40e_
and I40E_ strings and renames them to iavf_ and IAVF_.
Some of the adminq code, as well as most of the client
interface code used by RDMA is left unchanged in order
to indicate that the driver is talking to non-internal to
iavf code.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Rename the i40e_trace file and fix up all the callers
to the new names inside the iavf_trace.h file.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix up the i40e_hw names to new name, including versions
inside other strings.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Take care of some renames containing I40E_ADMINQ_DESC.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Rename the device ID defines to have IAVF in them
and remove all the unused defines.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove the register name references to I40E_VF* and change to
IAVF_VF. Update the descriptor names and defines to the IAVF
name.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Simply move the i40evf files to the new name, updating the #includes
to track the new names, and updating the Makefile as well.
A future patch will remove the i40e references (after the code
removal patches later in this series).
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is just a rename of an internal variable i40e_status, but
it was a pretty big change and so deserved it's own patch.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This basically begins the internal portion of the rename of i40evf to iavf,
by renaming many of the functions, structs, variables and defines.
Most of the changes were made mechanically, which introduces some
alignment issues.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove a bunch of unused code and reformat a few lines. Also
remove some now un-necessary files.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Two new tls tests added in parallel in both net and net-next.
Used Stephen Rothwell's linux-next resolution.
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the Intel Ethernet Adaptive Virtual Function driver
(i40evf) to a new name (iavf) that is more consistent with
the ongoing maintenance of the driver as the universal VF driver
for multiple product lines.
This first patch fixes up the directory names and the .ko name,
intentionally ignoring the function names inside the driver
for now. Basically this is the simplest patch that gets
the rename done and will be followed by other patches that
rename the internal functions.
This patch also addresses a couple of string/name issues
and updates the Copyright year.
Also, made sure to add a MODULE_ALIAS to the old name.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver may sleep with holding a spinlock.
The function call paths (from bottom to top) in Linux-4.17 are:
[FUNC] usleep_range
drivers/net/ethernet/socionext/sni_ave.c, 892:
usleep_range in ave_rxfifo_reset
drivers/net/ethernet/socionext/sni_ave.c, 932:
ave_rxfifo_reset in ave_irq_handler
[FUNC] usleep_range
drivers/net/ethernet/socionext/sni_ave.c, 888:
usleep_range in ave_rxfifo_reset
drivers/net/ethernet/socionext/sni_ave.c, 932:
ave_rxfifo_reset in ave_irq_handler
To fix these bugs, usleep_range() is replaced with udelay().
These bugs are found by my static analysis tool DSAC.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some boards a platform clock is used as clock for the r8169 chip,
this commit adds support for getting and enabling this clock (assuming
it has an "ether_clk" alias set on it).
This is related to commit d31fd43c0f ("clk: x86: Do not gate clocks
enabled by the firmware") which is a previous attempt to fix this for some
x86 boards, but this causes all Cherry Trail SoC using boards to not reach
there lowest power states when suspending.
This commit (together with an atom-pmc-clk driver commit adding the alias)
fixes things properly by making the r8169 get the clock and enable it when
it needs it.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=193891#c102
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=196861
Cc: Johannes Stezenbach <js@sig21.net>
Cc: Carlo Caione <carlo@endlessm.com>
Reported-by: Johannes Stezenbach <js@sig21.net>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Intel SoC was prevented from entering lower idle state because
of RTL8106E's ASPM was not enabled.
So enable ASPM on RTL8106E (chip version 39).
Now the Intel SoC can enter lower idle state, power consumption and
temperature are much lower.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a small delay after setting ASPM in vendor drivers, r8101 and
r8168.
In addition, those drivers enable ASPM before ClkReq, also change that
to align with vendor driver.
I haven't seen anything bad becasue of this, but I think it's better to
keep in sync with vendor driver.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When modifying hairpin SQ, instead of checking if the next state equals
to MLX5_SQC_STATE_RDY, we compare it against the MLX5_RQC_STATE_RDY enum
value.
The code worked since both of MLX5_RQC_STATE_RDY and MLX5_SQC_STATE_RDY
have the same value today.
This patch fixes this issue.
Fixes: 18e568c390 ("net/mlx5: Hairpin pair core object setup")
Change-Id: I6758aa7b4bd137966ae28206b70648c5bc223b46
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Use accessor function READ_ONCE to read from coherent memory modified
by the device and read by the driver. This becomes most important in
preemptive kernels where cond_resched implementation does not have the
side effect which guaranteed the updated value.
Fixes: 269d26f47f ("net/mlx5: Reduce command polling interval")
Change-Id: Ie6deeb565ffaf76777b07448c7fbcce3510bbb8a
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Fix the following compile warning:
drivers/net/ethernet/microchip/lan743x_main.c:2964:12: warning: lan743x_pm_suspend defined but not used [-Wunused-function]
static int lan743x_pm_suspend(struct device *dev)
drivers/net/ethernet/microchip/lan743x_main.c:2987:12: warning: lan743x_pm_resume defined but not used [-Wunused-function]
static int lan743x_pm_resume(struct device *dev)
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Add functions for get_fecparam and set_fecparam.
2. Modify lio_get_link_ksettings to display FEC setting.
Signed-off-by: Weilin Chang <weilin.chang@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put has taken the null pointer check into account. So it is
safe to remove the duplicated check before of_node_put.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes skb_shared area, which will be corrupted
upon reception of 4K jumbo packets.
Originally build_skb usage purpose was to reuse page for skb to eliminate
needs of extra fragments. But that logic does not take into account that
skb_shared_info should be reserved at the end of skb data area.
In case packet data consumes all the page (4K), skb_shinfo location
overflows the page. As a consequence, __build_skb zeroed shinfo data above
the allocated page, corrupting next page.
The issue is rarely seen in real life because jumbo are normally larger
than 4K and that causes another code path to trigger.
But it 100% reproducible with simple scapy packet, like:
sendp(IP(dst="192.168.100.3") / TCP(dport=443) \
/ Raw(RandString(size=(4096-40))), iface="enp1s0")
Fixes: 018423e90b ("net: ethernet: aquantia: Add ring support code")
Reported-by: Friedemann Gerold <f.gerold@b-c-s.de>
Reported-by: Michael Rauch <michael@rauch.be>
Signed-off-by: Friedemann Gerold <f.gerold@b-c-s.de>
Tested-by: Nikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The switch and the MAC are in one IP core and they use the same clock
signal from the clock generation unit.
Currently the clock architecture in the lantiq SoC code does not allow
to easily share the same clocks, this has to be fixed by switching to
the common clock framework.
As a workaround the clock of the switch and MAC should be activated when
the MAC gets probed and only disabled when the MAC gets removed. This
way it is ensured that the clock is always enabled when the switch or
MAC is used. The switch can not be used without the MAC.
This fixes a data bus error when rebooting the system and deactivating
the switch and mac and later accessing some registers in the cleanup
while the clocks are disabled.
Fixes: fe1a56420c ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warning:
drivers/net/ethernet/hisilicon/hns/hns_dsaf_gmac.c:322:5: warning:
symbol 'hns_gmac_wait_fifo_clean' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of error, the function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().
Fixes: fe1a56420c ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent commit to always forward the VF MAC address to the PF for
approval may not work if the PF driver or the firmware is older. This
will cause the VF driver to fail during probe:
bnxt_en 0000:00:03.0 (unnamed net_device) (uninitialized): hwrm req_type 0xf seq id 0x5 error 0xffff
bnxt_en 0000:00:03.0 (unnamed net_device) (uninitialized): VF MAC address 00:00:17:02:05:d0 not approved by the PF
bnxt_en 0000:00:03.0: Unable to initialize mac address.
bnxt_en: probe of 0000:00:03.0 failed with error -99
We fix it by treating the error as fatal only if the VF MAC address is
locally generated by the VF.
Fixes: 707e7e9660 ("bnxt_en: Always forward VF MAC address to the PF.")
Reported-by: Seth Forshee <seth.forshee@canonical.com>
Reported-by: Siwei Liu <loseweigh@gmail.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The operation ~(p100_inb(VG_LAN_CFG_1) & HP100_LINK_UP) returns a value
that is always non-zero and hence the wait for the link to drop always
terminates prematurely. Fix this by using a logical not operator instead
of a bitwise complement. This issue has been in the driver since
pre-2.6.12-rc2.
Detected by CoverityScan, CID#114157 ("Logical vs. bitwise operator")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a new configuration for the sama5d3-macb new compatibility string.
This configuration disables scatter-gather because we experienced lock down
of the macb interface of this particular SoC under very high load.
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Net drivers using phylink shouldn't mess with the link carrier
themselves and should let phylink manage it. The mvpp2 driver wasn't
following this best practice as the mac_config() function made calls to
change the link carrier state. This led to wrongly reported carrier link
state which then triggered other issues. This patch fixes this
behaviour.
But the PPv2 driver relied on this misbehaviour in two cases: for fixed
links and when not using phylink (ACPI mode). The later was fixed by
adding an explicit call to link_up(), which when the ACPI mode will use
phylink should be removed.
The fixed link case was relying on the mac_config() function to set the
link up, as we found an issue in phylink_start() which assumes the
carrier is off. If not, the link_up() function is never called. To fix
this, a call to netif_carrier_off() is added just before phylink_start()
so that we do not introduce a regression in the driver.
Fixes: 4bb0432628 ("net: mvpp2: phylink support")
Reported-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch mades TI_DAVINCI_CPDMA select GENERIC_ALLOCATOR.
without that, the following sparc64 build failure happen
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_check_free_tx_desc':
(.text+0x278): undefined reference to `gen_pool_avail'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_chan_submit':
(.text+0x340): undefined reference to `gen_pool_alloc'
(.text+0x5c4): undefined reference to `gen_pool_free'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `__cpdma_chan_free':
davinci_cpdma.c:(.text+0x64c): undefined reference to `gen_pool_free'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_desc_pool_destroy.isra.6':
davinci_cpdma.c:(.text+0x17ac): undefined reference to `gen_pool_size'
davinci_cpdma.c:(.text+0x17b8): undefined reference to `gen_pool_avail'
davinci_cpdma.c:(.text+0x1824): undefined reference to `gen_pool_size'
davinci_cpdma.c:(.text+0x1830): undefined reference to `gen_pool_avail'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_ctlr_create':
(.text+0x19f8): undefined reference to `devm_gen_pool_create'
(.text+0x1a90): undefined reference to `gen_pool_add_virt'
Makefile:1011: recipe for target 'vmlinux' failed
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Synopsys DWC Ethernet MAC can be configured to have 1..32, 64, or
128 unicast filter entries. (Table 7-8 MAC Address Registers from
databook) Fix dwmac1000_validate_ucast_entries() to accept values
between 1 and 32 in addition.
Signed-off-by: Jongsung Kim <neidhard.kim@lge.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- In CXGB4_DCB_STATE_FW_INCOMPLETE state check if the dcb
version is changed and update the dcb supported version.
- Also, fill the priority code point value for priority
based flow control.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
print per rx-queue packet errors in sge_qinfo
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not put host-endian 0 or 1 into big endian feild.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the quest to remove all stack VLA usage from the kernel[1], this
removes the VLA used for the emac xaht registers size. Since the size
of registers can only ever be 4 or 8, as detected in emac_init_config(),
the max can be hardcoded and a runtime test added for robustness.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christian Lamparter <chunkeey@gmail.com>
Cc: Ivan Mikhaylov <ivan@de.ibm.com>
Cc: netdev@vger.kernel.org
Co-developed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace "fallthru" with a proper "fall through" annotation.
This fix is part of the ongoing efforts to enabling
-Wimplicit-fallthrough
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This drives the PMAC between the GSWIP Switch and the CPU in the VRX200
SoC. This is currently only the very basic version of the Ethernet
driver.
When the DMA channel is activated we receive some packets which were
send to the SoC while it was still in U-Boot, these packets have the
wrong header. Resetting the IP cores did not work so we read out the
extra packets at the beginning and discard them.
This also adapts the clock code in sysctrl.c to use the default name of
the device node so that the driver gets the correct clock. sysctrl.c
should be replaced with a proper common clock driver later.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a DMA channel is opened the IRQ should not get activated
automatically, this allows it to pull data out manually without the help
of interrupts. This is needed for a workaround in the vrx200 Ethernet
driver.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DIV_ROUND_UP has implemented the code-opened function. Therefore, just
replace the implementation with DIV_ROUND_UP.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c: In function 'qlcnic_sriov_pull_bc_msg':
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c:907:6: warning:
variable 'fw_mbx' set but not used [-Wunused-but-set-variable]
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c: In function 'qlcnic_sriov_issue_bc_post':
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c:939:16: warning:
variable 'hdr_size' set but not used [-Wunused-but-set-variable]
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rather than have MAC drivers open code the test, add a helper in
phylib. This will help when we change the type of phydev->supported.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ethtool can be used to enable/disable pause. Add a helper to configure
the PHY when Pause is supported.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ethtool can be used to enable/disable pause. Add a helper to configure
the PHY when asym pause is supported.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rather than have the MAC drivers manipulate phydev members, add a
helper function for MACs supporting Pause, but not Asym Pause.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rather than have the MAC drivers manipulate phydev members to indicate
they support Asym Pause, add a helper function.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some MAC hardware cannot support a subset of link modes. e.g. often
1Gbps Full duplex is supported, but Half duplex is not. Add a helper
to remove such a link mode.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PHY drivers don't indicate they support pause. They expect MAC drivers
to enable its support if the MAC has the needed hardware. Thus MAC
drivers should not mask Pause support, but enable it.
Change a few ANDs to ORs.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The phy supported speed is being used to determine if the MAC should
be configured to 100 or 1G. The masking logic is broken. Instead, look
at 1G supported speeds to enable 1G MAC support.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many Ethernet MAC drivers want to limit the PHY to only advertise a
maximum speed of 100Mbs or 1Gbps. Rather than using a mask, make use
of the helper function phy_set_max_speed().
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Report in standard netdev stats drops and errors as well as
RX multicast from the FW vNIC counters.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a bug where ipv6 tunnels would report that it is
getting offloaded to hardware but would actually be rejected
by hardware.
Fixes: b27d6a95a7 ("nfp: compile flower vxlan tunnel set actions")
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously we only checked if the vlan id field is present when trying
to match a vlan tag. The vlan id and vlan pcp field should be treated
independently.
Fixes: 5571e8c9f2 ("nfp: extend flower matching capabilities")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After system suspend, sometimes the r8169 doesn't work when ethernet
cable gets pluggued.
This issue happens because rtl_reset_work() doesn't get called from
rtl8169_runtime_resume(), after system suspend.
In rtl_task(), RTL_FLAG_TASK_* only gets cleared if this condition is
met:
if (!netif_running(dev) ||
!test_bit(RTL_FLAG_TASK_ENABLED, tp->wk.flags))
...
If RTL_FLAG_TASK_ENABLED was cleared during system suspend while
RTL_FLAG_TASK_RESET_PENDING was set, the next rtl_schedule_task() won't
schedule task as the flag is still there.
So in addition to clearing RTL_FLAG_TASK_ENABLED, also clears other
flags.
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removed extra characters from the names of structures to unify prefixes
used through the driver code (we normally use hw_atl for hw specifics).
HW_ATL_B0_ and HW_ATL_A0_ are the same and useless copies.
Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removed extra spaces, corrected alignment.
Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support of Energy-Efficient Ethernet to aQuantia NIC's via ethtool
(according to the IEEE 802.3az specifications)
Signed-off-by: Yana Esina <yana.esina@aquantia.com>
Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com>
Tested-by: Nikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add WOL support. Currently only magic packet
(ethtool -s <ethX> wol g) feature is implemented.
Remove hw_set_power and move that to FW_OPS set_power:
because WOL configuration behaves differently on 1x and 2x
firmwares
Signed-off-by: Yana Esina <yana.esina@aquantia.com>
Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com>
Tested-by: Nikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added definitions and structures needed to support WOL.
Signed-off-by: Yana Esina <yana.esina@aquantia.com>
Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com>
Tested-by: Nikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the upload function, which worked incorrectly with
some chips.
Signed-off-by: Yana Esina <yana.esina@aquantia.com>
Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com>
Tested-by: Nikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the changes in patch 1 and 2, droq lock is not required.
So removing droq lock.
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removed oom task unconditional rescheduling every 250ms and created per
queue oom work queue for refilling buffers.
The oom task refills only if the available descriptors is fallen to 64.
There will be no packets coming in after hitting this level. So NAPI will
not run until oom task refills the buffers.
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Control packets are processed in tasklet when interface is down and in
NAPI when interface is up. So tasklet can be disabled when interface up
and re-enabled when interface is down.
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dma_zalloc_coherent() now crashes if no dev pointer is given.
Add a dev pointer to the ltq_dma_channel structure and fill it in the
driver using it.
This fixes a bug introduced in kernel 4.19.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the incorrect WR_HDR field which can cause a misinterpretation
of ABORT CPL by ULDs, such as iw_cxgb4.
Fixes: a3cdaa69e4 ("cxgb4: Adds CPL support for Shared Receive Queues")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch updates license to use SPDX-License-Identifier
instead of verbose license text.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A recent commit updated vlan_cmd.dropnovlan_fm but failed to remove
the older assignment. Fix this by removing the former redundant
assignment.
Detected by CoverityScan, CID#1473290 ("Unused value")
Fixes: a89cdd8e7c ("cxgb4: impose mandatory VLAN usage when non-zero TAG ID")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added memory barriers where they were missing to support multiple
architectures, and removed redundant ones.
As part of removing the redundant memory barriers and improving
performance, we moved to more relaxed versions of memory barriers,
as well as to the more relaxed version of writel - writel_relaxed,
while maintaining correctness.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add READ_ONCE calls where necessary (for example when iterating
over a memory field that gets updated by the hardware).
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
acquire the rtnl_lock during device destruction to avoid
using partially destroyed device.
ena_remove() shares almost the same logic as ena_destroy_device(),
so use ena_destroy_device() and avoid duplications.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ena_destroy_device() can potentially be called twice.
To avoid this, check that the device is running and
only then proceed destroying it.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When ena_destroy_device() is called from ena_suspend(), the device is
still reachable from the driver. Therefore, the driver can send a command
to the device to free all resources.
However, in all other cases of calling ena_destroy_device(), the device is
potentially in an error state and unreachable from the driver. In these
cases the driver must not send commands to the device.
The current implementation does not request resource freeing from the
device even when possible. We add the graceful parameter to
ena_destroy_device() to enable resource freeing when possible, and
use it in ena_suspend().
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The buffer length field in the ena rx descriptor is 16 bit, and the
current driver passes a full page in each ena rx descriptor.
When PAGE_SIZE equals 64kB or more, the buffer length field becomes
zero.
To solve this issue, limit the ena Rx descriptor to use 16kB even
when allocating 64kB kernel pages. This change would not impact ena
device functionality, as 16kB is still larger than maximum MTU.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Starting with driver version 1.5.0, in case of a surprise device
unplug, there is a race caused by invoking ena_destroy_device()
from two different places. As a result, the readless register might
be accessed after it was destroyed.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
GMAC >= 4 also supports CBS. Lets enable the TC Ops for these versions.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Essentially reverts commit 8fd75c58a0 ("i40e: move ethtool
stats boiler plate code to i40e_ethtool_stats.h", 2018-08-30), and
additionally moves the similar code in i40evf into i40evf_ethtool.c.
The code was intially moved from i40e_ethtool.c into i40e_ethtool_stats.h
as a way of better logically organizing the code. This has two problems.
First, we can't have an inline function with variadic arguments on all
platforms. Second, it gave the appearance that we had plans to share
code between the i40e and i40evf drivers, due to having a near copy of
the contents in the i40evf/i40e_ethtool_stats.h file.
Patches which actually attempt to combine or share code between the i40e
and i40evf drivers have not materialized, and are likely a ways off.
Rather than fixing the one function which causes build issues, just move
this code back into the i40e_ethtool.c and i40evf_ethtool.c files. Note
that we also change these functions back from static inlines to just
statics, since they're no longer in a header file.
We can revisit this if/when work is done to actually attempt to share
code between drivers. Alternatively, this stats code could be made more
generic so that it can be shared across drivers as part of ethtool
kernel work.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a non-zero VLAN Tag ID is passed to t4_set_vlan_acl()
then impose mandatory VLAN Usage with that VLAN ID.
I.e any other VLAN ID should result in packets getting
dropped.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: 488752220b ("liquidio: Add spoof checking on a VF MAC address")
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As you are already in a tasklet, it is unnecessary to call spin_lock_bh.
Signed-off-by: jun qian <hangdianqj@163.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 3559d81e76 ("r8169: simplify rtl_hw_start_8169") changed order of
two register writes:
1) Caused RxConfig to be written before TX / RX is enabled,
2) Caused TxConfig to be written before TX / RX is enabled.
At least on XIDs 10000000 ("RTL8169sb/8110sb") and
18000000 ("RTL8169sc/8110sc") such writes are ignored by the chip, leaving
values in these registers intact.
Change 1) was reverted by
commit 05212ba813 ("r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices"),
however change 2) wasn't.
In practice, this caused TxConfig's "InterFrameGap time" and "Max DMA Burst
Size per Tx DMA Burst" bits to be zero dramatically reducing TX performance
(in my tests it dropped from around 500Mbps to around 50Mbps).
This patch fixes the issue by moving TxConfig register write a bit later in
the code so it happens after TX / RX is already enabled.
Fixes: 05212ba813 ("r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices")
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both WARN_ON() and WARN_ONCE() already contain an unlikely(), so it's not
necessary to wrap it into another.
Signed-off-by: Igor Stoppa <igor.stoppa@huawei.com>
Cc: Madalin Bucur <madalin.bucur@nxp.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c: In function 'bnxt_tc_parse_flow':
drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c:186:6: warning:
variable 'addr_type' set but not used [-Wunused-but-set-variable]
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c: In function 'cn23xx_setup_octeon_vf_device':
drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c:619:20: warning:
variable 'ring_flag' set but not used [-Wunused-but-set-variable]
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Provide the API to set/unset the spoof checking feature.
2. Add a function to periodically provide the count of found
packets with spoof VF MAC address.
3. Prevent VF MAC address changing while the spoofchk of the VF is
on unless the changing MAC address is issued from PF.
Signed-off-by: Weilin Chang <weilin.chang@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This series provides updates to mlx5 ethernet driver.
1) Starting with a four patches series to optimize flow counters updates,
From Vlad Buslov:
==============================================
By default mlx5 driver updates cached counters each second. Update function
consumes noticeable amount of CPU resources. The goal of this patch series
is to optimize update function.
Investigation revealed following bottlenecks in fs counters
implementation:
1) Update code(scheduled each second) iterates over all counters twice.
(first for finding and deleting counters that are marked for deletion,
second iteration is for actually updating the counters)
2) Counters are stored in rb tree. Linear iteration over all rb tree
elements(rb_next in profiling data) consumed ~65% of time spent in
update function.
Following optimizations were implemented:
1) Instead of just marking counters for deletion, store them in
standalone list. This removes first iteration over whole counters tree.
2) Store counters in sorted list to optimize traversing them and remove
calls to rb_next.
First implementation of these changes caused degradation of performance,
instead of improving it. Investigation revealed that there first cache
line of struct mlx5_fc is full and adding anything to it causes amount
of cache misses to double. To mitigate that, following refactorings were
implemented:
- Change 'addlist' list type from double linked to single linked. This
allowes to get free space for one additional pointer that is used to
store deletion list(optimization 1)
- Substitute rb tree with idr. Idr is non-intrusive data structure and
doesn't require adding any new members to struct mlx5_fc. Use free
space that became available for double linked sorted list that is used
for traversing all counters. (optimization 2)
Described changes reduced CPU time spent in mlx5_fc_stats_work from 70%
to 44%. (global perf profile mode)
============================================
The rest of the series are misc updates:
2) From Kamal, Move mlx5e_priv_flags into en_ethtool.c, to avoid a
compilation warning.
3) From Roi Dayan, Move Q counters allocation and drop RQ to init_rx profile
function to avoid allocating Q counters when not required.
4) From Shay Agroskin, Replace PTP clock lock from RW lock to seq lock.
Almost double the packet rate when timestamping is active on multiple TX
queues.
5) From: Natali Shechtman, set ECN for received packets using CQE indication.
6) From: Alaa Hleihel, don't set CHECKSUM_COMPLETE on SCTP packets.
CHECKSUM_COMPLETE is not applicable to SCTP protocol.
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbkKzcAAoJEEg/ir3gV/o+4bsIAKu61DsZZhJAXfwZ4JXI9fN7
iZ9wOPbPovTrkfRfqatcCa3CBwxM1zZ6WzbVTGGieMsUnxjp4G+LAs7m8J+cmWB2
GsxhO3lsF5iGTcEB5Q5a/TlWlUP0CiPcdNZZe6FvT2Vzil+mLAb3BZgIPKPgoyJW
CmDhIOslfolBJC0s4fc+zVcWrShmKnUwHmVLhkOXpz8cfYBB1fBVGebRTZiYsvbf
8CV4buiYx0Fuq9MSnOwaMuPrUP1IB6DweNbFpg9gRd0k4mmWnoNu3l+g1Flgmd88
RtU4+5PEtZF2kNhH/LeiHRqkZ5f2cmOsIKbDZoX6xwzfYesws8jlDJpL5dIhxZA=
=nxDw
-----END PGP SIGNATURE-----
Merge tag 'mlx5e-updates-2018-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5e-updates-2018-09-05
This series provides updates to mlx5 ethernet driver.
1) Starting with a four patches series to optimize flow counters updates,
From Vlad Buslov:
==============================================
By default mlx5 driver updates cached counters each second. Update function
consumes noticeable amount of CPU resources. The goal of this patch series
is to optimize update function.
Investigation revealed following bottlenecks in fs counters
implementation:
1) Update code(scheduled each second) iterates over all counters twice.
(first for finding and deleting counters that are marked for deletion,
second iteration is for actually updating the counters)
2) Counters are stored in rb tree. Linear iteration over all rb tree
elements(rb_next in profiling data) consumed ~65% of time spent in
update function.
Following optimizations were implemented:
1) Instead of just marking counters for deletion, store them in
standalone list. This removes first iteration over whole counters tree.
2) Store counters in sorted list to optimize traversing them and remove
calls to rb_next.
First implementation of these changes caused degradation of performance,
instead of improving it. Investigation revealed that there first cache
line of struct mlx5_fc is full and adding anything to it causes amount
of cache misses to double. To mitigate that, following refactorings were
implemented:
- Change 'addlist' list type from double linked to single linked. This
allowes to get free space for one additional pointer that is used to
store deletion list(optimization 1)
- Substitute rb tree with idr. Idr is non-intrusive data structure and
doesn't require adding any new members to struct mlx5_fc. Use free
space that became available for double linked sorted list that is used
for traversing all counters. (optimization 2)
Described changes reduced CPU time spent in mlx5_fc_stats_work from 70%
to 44%. (global perf profile mode)
============================================
The rest of the series are misc updates:
2) From Kamal, Move mlx5e_priv_flags into en_ethtool.c, to avoid a
compilation warning.
3) From Roi Dayan, Move Q counters allocation and drop RQ to init_rx profile
function to avoid allocating Q counters when not required.
4) From Shay Agroskin, Replace PTP clock lock from RW lock to seq lock.
Almost double the packet rate when timestamping is active on multiple TX
queues.
5) From: Natali Shechtman, set ECN for received packets using CQE indication.
6) From: Alaa Hleihel, don't set CHECKSUM_COMPLETE on SCTP packets.
CHECKSUM_COMPLETE is not applicable to SCTP protocol.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new qed firmware with fixes and support for new features.
Fixes:
- Fix a rare case of device crash with iWARP, iSCSI or FCoE offload.
- Fix GRE tunneled traffic when iWARP offload is enabled.
- Fix RoCE failure in ib_send_bw when using inline data.
- Fix latency optimization flow for inline WQEs.
- BigBear 100G fix
RDMA:
- Reduce task context size.
- Application page sizes above 2GB support.
- Performance improvements.
ETH:
- Tenant DCB support.
- Replace RSS indirection table update interface.
Misc:
- Debug Tools changes.
Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VXLAN and GRE FW features have to currently be both advertised
for the driver to enable them. Separate the handling.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the accesses to rtsyms now all going via special helpers
we can easily make sure the driver is not reading past the
end of the symbol.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For ease of debug preface all error messages with the name
of the symbol which caused them. Use the same message format
for existing messages while at it.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return the error and report value through the output param.
Fixes: 640917dd81 ("nfp: support access to absolute RTsyms")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Francois H. Theron <francois.theron@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CHECKSUM_COMPLETE is not applicable to SCTP protocol.
Setting it for SCTP packets leads to CRC32c validation failure.
Fixes: bbceefce9a ("net/mlx5e: Support RX CHECKSUM_COMPLETE")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In multi-host (MH) NIC scheme, a single HW port serves multiple hosts
or sockets on the same host.
The HW uses a mechanism in the PCIe buffer which monitors
the amount of consumed PCIe buffers per host.
On a certain configuration, under congestion,
the HW emulates a switch doing ECN marking on packets using ECN
indication on the completion descriptor (CQE).
The driver needs to set the ECN bits on the packet SKB,
such that the network stack can react on that, this commit does that.
Signed-off-by: Natali Shechtman <natali@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Changed "priv.clock.lock" lock from 'rw_lock' to 'seq_lock'
in order to improve packet rate performance.
Tested on Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz.
Sent 64b packets between two peers connected by ConnectX-5,
and measured packet rate for the receiver in three modes:
no time-stamping (base rate)
time-stamping using rw_lock (old lock) for critical region
time-stamping using seq_lock (new lock) for critical region
Only the receiver time stamped its packets.
The measured packet rate improvements are:
Single flow (multiple TX rings to single RX ring):
without timestamping: 4.26 (M packets)/sec
with rw-lock (old lock): 4.1 (M packets)/sec
with seq-lock (new lock): 4.16 (M packets)/sec
1.46% improvement
Multiple flows (multiple TX rings to six RX rings):
without timestamping: 22 (M packets)/sec
with rw-lock (old lock): 11.7 (M packets)/sec
with seq-lock (new lock): 21.3 (M packets)/sec
82.05% improvement
The packet rate improvement is due to the lack of atomic operations
for the 'readers' by the seq-lock.
Since there are much more 'readers' than 'writers' contention
on this lock, almost all atomic operations are saved.
this results in a dramatic decrease in overall
cache misses.
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Not all profiles query the HW Q counters in update_stats() callback.
HW Q couners are limited per device and in case of representors all
their Q counters are allocated on the parent PF device.
Avoid reundant allocation of HW Q counters by moving the allocation
to init_rx profile callback.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Move the definition of mlx5e_priv_flags into en_ethtool.c because it's
only used there.
Fixes: 4e59e28881 ("net/mlx5e: Introduce net device priv flags infrastructure")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Previous patch in series changed flow counter storage structure from
rb_tree to linked list in order to improve flow counter traversal
performance. The drawback of such solution is that flow counter lookup by
id becomes linear in complexity.
Store pointers to flow counters in idr in order to improve lookup
performance to logarithmic again. Idr is non-intrusive data structure and
doesn't require extending flow counter struct with new elements. This means
that idr can be used for lookup, while linked list from previous patch is
used for traversal, and struct mlx5_fc size is <= 2 cache lines.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In order to improve performance of flow counter stats query loop that
traverses all configured flow counters, replace rb_tree with double-linked
list. This change improves performance of traversing flow counters by
removing the tree traversal. (profiling data showed that call to rb_next
was most top CPU consumer)
However, lookup of flow flow counter in list becomes linear, instead of
logarithmic. This problem is fixed by next patch in series, which adds idr
for fast lookup. Idr is to be used because it is not an intrusive data
structure and doesn't require adding any new members to struct mlx5_fc,
which allows its control data part to stay <= 1 cache line in size.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In order to prevent flow counters stats work function from traversing whole
flow counters tree while searching for deleted flow counters, new list to
store deleted flow counters is added to struct mlx5_fc_stats. Lockless
NULL-terminated single linked list data type is used due to following
reasons:
- This use case only needs to add single element to list and
remove/iterate whole list. Lockless list doesn't require any additional
synchronization for these operations.
- First cache line of flow counter data structure only has space to store
single additional pointer, which precludes usage of double linked list.
Remove flow counter 'deleted' flag that is no longer needed.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In order to prevent flow counters stats work function from traversing whole
flow counters tree while searching for deleted flow counters, new list to
store deleted flow counters will be added to struct mlx5_fc_stats. However,
the flow counter structure itself has no space left to store any more data
in first cache line. To free space that is needed to store additional list
node, convert current addlist double linked list (two pointers per node) to
atomic single linked list (one pointer per node).
Lockless NULL-terminated single linked list data type doesn't require any
additional external synchronization for operations used by flow counters
module (add single new element, remove all elements from list and traverse
them). Remove addlist_lock that is no longer needed.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This is a false positive report due to incorrect nested lock
annotations as we lock multiple fgs with the same subclass.
Instead of locking all fgs only lock the one being used as was
done before.
Fixes: bd71b08ec2 ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Copy and paste bug was introduced in the offending patch.
We need to write udp source port value into the headers value and not
headers criteria "mask".
Fixes: 142644f8a1 ("net/mlx5e: Ethtool steering flow parsing refactoring")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Currently, mlx5_attach_interface does not check for error
after calling intf->attach or intf->add. When these two calls
fails, the client is not initialized and will cause issues such as
kernel panic on invalid address in the teardown path (mlx5_detach_interface)
Fixes: 737a234bb6 ("net/mlx5: Introduce attach/detach to interface API")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The PCI BDF is not unique. PCI domain must also be considered when
searching for the next physical device during lag setup. Example below:
mlx5_core 0000:01:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(128) RxCqeCmprss(0)
mlx5_core 0000:01:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(128) RxCqeCmprss(0)
mlx5_core 0001:01:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(128) RxCqeCmprss(0)
mlx5_core 0001:01:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(128) RxCqeCmprss(0)
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
If building match list fg fails and we never jumped to
search_again_locked label then the function returned without
unlocking the read lock.
Fixes: bd71b08ec2 ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The memory allocated for the slow path table flow group input structure
was not freed upon successful return, fix that.
Fixes: 1967ce6ea5 ("net/mlx5: E-Switch, Refactor fast path FDB table creation in switchdev mode")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Minimal stride size is 16.
Hence, the number of strides in a fragment (of PAGE_SIZE)
is <= PAGE_SIZE / 16 <= 4K.
u16 is sufficient to represent this.
Fixes: d7037ad73d ("net/mlx5: Fix QP fragmented buffer allocation")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Minimal stride size is 16.
Hence, the number of strides in a fragment (of PAGE_SIZE)
is <= PAGE_SIZE / 16 <= 4K.
u16 is sufficient to represent this.
Fixes: 388ca8be00 ("IB/mlx5: Implement fragmented completion queue (CQ)")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When initializing the device (procedure init_one), the driver
calls mlx5_pci_init to perform pci initialization. As part of this
initialization, mlx5_pci_init creates a debugfs directory.
If this creation fails, init_one aborts, returning failure to
the caller (which is the probe method caller).
The main reason for such a failure to occur is if the debugfs
directory already exists. This can happen if the last time
mlx5_pci_close was called, debugfs_remove (silently) failed due
to the debugfs directory not being empty.
Guarantee that such a debugfs_remove failure will not occur by
instead calling debugfs_remove_recursive in procedure mlx5_pci_close.
Fixes: 59211bd3b6 ("net/mlx5: Split the load/unload flow into hardware and software flows")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When the mlx5 health mechanism detects a problem while the driver
is in the middle of init_one or remove_one, the driver needs to prevent
the health mechanism from scheduling future work; if future work
is scheduled, there is a problem with use-after-free: the system WQ
tries to run the work item (which has been freed) at the scheduled
future time.
Prevent this by disabling work item scheduling in the health mechanism
when the driver is in the middle of init_one() or remove_one().
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
With performance optimization the spi transfer and messages of basic
register operations like qcaspi_read_register moved into the private
driver structure. But they weren't protected against mutual access
(e.g. between driver kthread and ethtool). So dumping the QCA7000
registers via ethtool during network traffic could make spi_sync
hang forever, because the completion in spi_message is overwritten.
So revert the optimization completely.
Fixes: 291ab06ecf ("net: qualcomm: new Ethernet over SPI driver for QCA700")
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DMA allocated memory is lost in be_cmd_get_profile_config() when we
call it with non-NULL port_res parameter.
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warning:
drivers/net/ethernet/microchip/lan743x_ptp.c:980:6: warning:
symbol 'lan743x_ptp_set_sync_ts_insert' was not declared. Should it be static?
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warning:
drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c:119:6: warning:
symbol 'mlx5i_grp_sw_update_stats' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
MC-aware mode was recently enabled by mlxsw on Spectrum switches in
commit 7b81953066 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw
ports"). Unfortunately, testing has shown that the fix is incomplete and
in the presented form actually makes the problem even worse, because any
amount of MC traffic causes UC disruption.
The reason for this is that currently, mlxsw configures the MC-specific
TCs (8..15) to map to pool 0. It also configures a maximum buffer size
of 0, but for MC traffic that maximum is disregarded and not part of the
quota. Therefore MC traffic is always admitted to the egress buffer.
Fix the configuration by directing the MC TCs into pool 15, which is
dedicated to MC traffic and recognized as such by the silicon.
Fixes: 7b81953066 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will allow for the RDMA side to allocate packet reformat context.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Currently we attach packet reformat actions only to the FDB namespace.
In preparation to be able to use that for NIC steering, pass the actual
namespace as a parameter.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>