Commit Graph

217 Commits

Author SHA1 Message Date
Lukasz Czapnik
870f805e97 ice: Fix FW version formatting in dmesg
The FW build id is currently being displayed as an int which doesn't make
sense. Instead display FW build id as a hex value. Also add other useful
information to the output such as NVM version, API patch info, and FW
build hash.

Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-12 10:37:22 -07:00
Paul M Stillwell Jr
e3710a01a8 ice: send driver version to firmware
The driver is required to send a version to the firmware
to indicate that the driver is up. If the driver doesn't
do this the firmware doesn't behave properly.

Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-12 10:22:04 -07:00
Anirudh Venkataramanan
8c243700ab ice: Minor refactor in queue management
Remove q_left_tx and q_left_rx from the PF struct as these can be
obtained by calling ice_get_avail_txq_count and ice_get_avail_rxq_count
respectively.

The function ice_determine_q_usage is only setting num_lan_tx and
num_lan_rx in the PF structure, and these are later assigned to
vsi->alloc_txq and vsi->alloc_rxq respectively. This is an unnecessary
indirection, so remove ice_determine_q_usage and just assign values
for vsi->alloc_txq and vsi->alloc_rxq in ice_vsi_set_num_qs and use
these to set num_lan_tx and num_lan_rx respectively.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:41 -07:00
Dave Ertman
ea300f41bb ice: Allow for delayed LLDP MIB change registration
Add an additional boolean parameter to the ice_init_dcb
function.  This boolean controls if the LLDP MIB change
events are registered for.  Also, add a new function
defined ice_cfg_lldp_mib_change.  The additional function
is necessary to be able to register for LLDP MIB change
events after calling ice_init_dcb.  The net effect of these
two changes is to allow a delayed registration for MIB change
events so that the driver is not accepting events before it
is ready for them.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:41 -07:00
Anirudh Venkataramanan
80739b57b1 ice: Check for DCB capability before initializing DCB
Check the ICE_FLAG_DCB_CAPABLE before calling ice_init_pf_dcb.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:41 -07:00
Anirudh Venkataramanan
208ff75135 ice: Add ice_get_main_vsi to get PF/main VSI
There are multiple places where we currently use ice_find_vsi_by_type
to get the PF (a.k.a. main) VSI. The PF VSI by definition is always
the first element in the pf->vsi array (i.e. pf->vsi[0]). So instead
add and use a new helper function ice_get_main_vsi, which just returns
pf->vsi[0].

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05 08:13:40 -07:00
Dave Ertman
3d57fd10f2 ice: Report stats when VSI is down
There is currently a check in get_ndo_stats that
returns before updating stats if the VSI is down
or there are no Tx or Rx queues.  This causes the
netdev to report zero stats with the netdev is down.

Remove the check so that the behavior of reporting
stats is the same as it was in IXGBE.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 17:07:50 -07:00
Jesse Brandeburg
2e0ab37c04 ice: print extra message if topology issue
The driver needs to inform the user if there is an issue
with the topology / configuration of the link.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 16:27:45 -07:00
Jesse Brandeburg
432609887a ice: add print of autoneg state to link message
Print the state of auto-negotiation when printing the Link
up message.  Adds new text to the "NIC Link is up" line like
Autoneg: <True | False>

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 16:25:34 -07:00
Bruce Allan
18057cb357 ice: add needed PFR during driver unload
According to the specification, a PF Reset must be done as part of the
driver unload flow.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 16:18:52 -07:00
Anirudh Venkataramanan
03af840650 ice: Fix EMP reset handling
ice_reset_subtask needs to handle EMP resets as well, as EMP resets
can be triggered by the firmware. This patch adds the logic to do
this.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-03 13:47:12 -07:00
Henry Tieman
ae2bdbb45d ice: fix adminq calls during remove
The order of operations was incorrect in ice_remove(). The code would
try to use adminq operations after the adminq was disabled. This caused
all adminq calls to fail and possibly timeout waiting.

Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-26 23:54:29 -07:00
Anirudh Venkataramanan
152b978a1f ice: Rework ice_ena_msix_range
The current implementation of ice_ena_msix_range is difficult to read
and has subtle issues. This patch reworks the said function for
clarity and correctness.

More specifically,

1. Add more checks to bail out of 'needed' is greater than 'v_left'.

2. Simplify fallback logic

3. Do not set pf->num_avail_sw_msix in ice_ena_msix_range as it
   gets overwritten by ice_init_interrupt_scheme.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-26 23:52:29 -07:00
Anirudh Venkataramanan
78b5713ac1 ice: Alloc queue management bitmaps and arrays dynamically
The total number of queues available on the device is divided between
multiple physical functions (PF) in the firmware and provided to the
driver when it gets function capabilities from the firmware. Thus
each PF knows how many Tx/Rx queues it has. These queues are then
doled out to different VSIs (for LAN traffic, SR-IOV VF traffic, etc.)

To track usage of these queues at the PF level, the driver uses two
bitmaps avail_txqs and avail_rxqs. At the VSI level (i.e. struct ice_vsi
instances) the driver uses two arrays txq_map and rxq_map, to track
ownership of VSIs' queues in avail_txqs and avail_rxqs respectively.

The aforementioned bitmaps and arrays should be allocated dynamically,
because the number of queues supported by a PF is only available once
function capabilities have been queried. The current static allocation
consumes way more memory than required.

This patch removes the DECLARE_BITMAP for avail_txqs and avail_rxqs
and instead uses bitmap_zalloc to allocate the bitmaps during init.
Similarly txq_map and rxq_map are now allocated in ice_vsi_alloc_arrays.
As a result ICE_MAX_TXQS and ICE_MAX_RXQS defines are no longer needed.
Also as txq_map and rxq_map are now allocated and freed, some code
reordering was required in ice_vsi_rebuild for correct functioning.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-26 23:45:54 -07:00
Paul Greenwalt
77ca27c417 ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap
The VF driver can call VIRTCHNL_OP_[ENABLE|DISABLE]_QUEUES separately
for each queue. Add support for virtchnl_queue_select.[tx|rx]_queues
bitmap which is used to indicate which queues to enable and disable.

Add tracing of VF Tx/Rx per queue enable state to avoid enabling enabled
queues and disabling disabled queues. Add total queues enabled count and
clear ICE_VF_STATE_QS_ENA when count is zero.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Peng Huang <peng.huang@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-26 23:37:16 -07:00
Colin Ian King
a1199d679a ice: fix potential infinite loop
The loop counter of a for-loop is a u8 however this is being compared
to an int upper bound and this can lead to an infinite loop if the
upper bound is greater than 255 since the loop counter will wrap back
to zero. Fix this potential issue by making the loop counter an int.

Addresses-Coverity: ("Infinite loop")
Fixes: c7aeb4d1b9 ("ice: Disable VFs until reset is completed")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-26 23:30:26 -07:00
Akeem G Abodunrin
e63a1dbdc7 ice: Don't clog kernel debug log with VF MDD events errors
In case of MDD events on VF, don't clog kernel log with unlimited VF MDD
events message "VF 0 has had 1018 MDD events since last boot" - limit
events log message to 30, based on the observation in some experimentation
with sending malicious packet once, and number of events reported before
device stopped observing MDD events.

Also removed defunct macro "ICE_DFLT_NUM_MDD_EVENTS_ALLOWED" for tracking
number of MDD events allowed before disabling the interface...

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-26 23:21:28 -07:00
Krzysztof Kazimierczak
4425e0531c ice: Introduce a local variable for a VSI in the rebuild path
When a VSI is accessed inside the ice_for_each_vsi macro in the rebuild
path (ice_vsi_rebuild_all() and ice_vsi_replay_all()), it is referred to
as pf->vsi[i]. Introduce local variables to improve readability.

Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-26 23:18:06 -07:00
Anirudh Venkataramanan
f27db2e65e ice: Sanitize ice_ena_vsi and ice_dis_vsi
1. ndo_open and ndo_stop are implemented by ice_open and ice_stop
   respectively. When enabling/disabling VSIs, just call
   ice_open/ice_stop instead of ndo_open/ndo_stop.

2. Rework logic around rtnl_lock/rtnl_unlock

3. In ice_ena_vsi, remove an unnecessary stack variable and return
   0 instead of err when __ICE_NEEDS_RESTART is not set.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-26 23:02:48 -07:00
Usha Ketineni
9e7a5d1746 ice: Fix ethtool port and PFC stats for 4x25G cards
This patch fixes the issue where port and PFC statistics counters are
incrementing at the wrong port with 4x25G cards.
Read the GLPRT port registers using lport parameter instead of pf_id to
update the statistics otherwise the pf_ids are flipped for ports 2 and 3
when read from the HW register PF_FUNC_RID and this is expected as per
hardware specification.

Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-26 22:54:12 -07:00
Akeem G Abodunrin
bbb968e8b3 ice: Fix issues updating VSI MAC filters
VSI, especially VF could request to add or remove filter for another VSI,
driver should really guide such request and disallow it.
However, instead of returning error for such malicious request, driver
can simply return success.

In addition, we are not tracking number of MAC filters configured per
VF correctly - and this leads to issue updating VF MAC filters whenever
they were removed and re-configured via bringing VF interface down and
up. Also, since VF could send request to update multiple MAC filters at
once, driver should program those filters individually in the switch, in
order to determine which action resulted to error, and communicate
accordingly to the VF.

So, with this changes, we now track number of filters added right from
when VF resources allocation is done, and could properly add filters for
both trusted and non_trusted VFs, without MAC filters mis-match issue in
the switch...

Also refactor code, so that driver can use new function to add or remove
MAC filters.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-23 10:46:53 -07:00
Bruce Allan
5a4a867310 ice: update ethtool stats on-demand
Users expect ethtool statistics to be updated on-demand when invoking
'ethtool -S <iface>' instead of providing a snapshot of statistics taken
once a second (the frequency of the watchdog task where stats are currently
updated).  Update stats every time 'ethtool -S <iface>' is run.

Also, fix an indentation style issue and an unnecessary local variable
initialization in ice_get_ethtool_stats() discovered while investigating
the subject issue.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-23 10:34:27 -07:00
Brett Creeley
11836214d5 ice: Increase size of Mailbox receive queue for many VFs
Currently we use the ICE_MBXQ_LEN for both the Mailbox send and receive
queues that are used to communicate with VFs. This is fine for the send
queue because the PF driver will lock the queue for every single send,
but for the Mailbox receive queue every VF is posting to its Mailbox
send queue and the hardware is then handing the message to the PF on its
Mailbox receive queue. This becomes a problem with many VFs because it
seems to overburden the Mailbox receive queue on the PF. Fix this by
increasing the Mailbox receive queue for the PF to 512 entries.

The number 512 was determined based on the number of VFs supported by
the device. We can have a total of 256 VFs so in the worst case this
allows the VFs to put 2 messages in the PFs Mailbox receive queue at the
same time.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-20 14:37:15 -07:00
Tony Nguyen
e6c45149b8 ice: Do not always bring up PF VSI in ice_ena_vsi()
During rebuild ice_ena_vsi() is called to recover the VSI state.
This function assumes the PF VSI is always to be enabled, however,
it's possible that during reset/rebuild the interface can be
brought down.  If this occurs, we can attempt to bring up the PF
VSI on a downed interface which can lead to various crashes. If
the interface is not running, do not bring up the associated VSI.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-20 14:32:42 -07:00
Brett Creeley
c1ddf1f5c4 ice: Use the software based tail when checking for hung Tx ring
Currently in ice_get_tx_pending we try to read a Tx ring's tail. This is
then compared with the software based head (next_to_clean) to determine
if we have pending work. This will never work because reading of the Tx
ring's tail is no longer supported. Fix this by using the software based
tail (next_to_use) to determine if there is pending work.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-08-20 12:28:35 -07:00
Tony Nguyen
3015b8fcb6 ice: Bump version number
Update driver version to 0.7.5

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 13:41:09 -07:00
Brett Creeley
ba880734ba ice: Remove unnecessary flag ICE_FLAG_MSIX_ENA
This flag is not needed and is called every time we re-enable interrupts
in the hotpath so remove it. Also remove ice_vsi_req_irq() because it
was a wrapper function for ice_vsi_req_irq_msix() whose sole purpose was
checking the ICE_FLAG_MSIX_ENA flag.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 13:41:01 -07:00
Brett Creeley
56923ab664 ice: Add stats for Rx drops at the port level
Currently we are not reporting dropped counts at the port level to
ethtool or netlink. This was found when debugging Rx dropped issues
and the total packets sent did not equal the total packets received
minus the rx_dropped, which was very confusing. To determine dropped
counts at the port level we need to read the PRTRPB_RDPC register.
To fix reporting we will store the dropped counts in the PF's
rx_discards. This will be reported to netlink by storing it in the
PF VSI's rx_missed_errors signaling that the receiver missed the
packet. Also, we will report this to ethtool in the rx_dropped.nic
field.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 13:40:46 -07:00
Akeem G Abodunrin
c7aeb4d1b9 ice: Disable VFs until reset is completed
This patch adds code to clear VFs enable status until reset is completed,
and Tx/Rx rings are setup. Without this patch, the code flow request Tx
queues to be disabled after reset, especially PFR - where VF VSI Tx rings
have already been wiped off in the NVM and result to adminq error based on
the call to disable Tx LAN queue in ice_reset_all_vfs function call.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 10:23:04 -07:00
Tony Nguyen
6d5999467d ice: Do not configure port with no media
The firmware reports an error when trying to configure a port with no
media. Instead of always configuring the port, check for media before
attempting to configure it. In the absence of media, turn off link and
poll for media to become available before re-enabling link.

Move ice_force_phys_link_state() up to avoid forward declaration.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 10:23:04 -07:00
Jacob Keller
36517fd397 ice: track hardware stat registers past rollover
Currently, ice_stat_update32 and ice_stat_update40 will limit the
value of the software statistic to 32 or 40 bits wide, depending on
which register is being read.

This means that if a driver is running for a long time, the displayed
software register values will roll over to zero at 40 bits or 32 bits.

This occurs because the functions directly assign the difference between
the previous value and current value of the hardware statistic.

Instead, add this value to the current software statistic, and then
update the previous value.

In this way, each time ice_stat_update40 or ice_stat_update32 are
called, they will increment the software tracking value by the
difference of the hardware register from its last read. The software
tracking value will correctly count up until it overflows a u64.

The only requirement is that the ice_stat_update functions be called at
least once each time the hardware register overflows.

While we're fixing ice_stat_update40, modify it to use rd64 instead of
two calls to rd32. Additionally, drop the now unnecessary hireg
function parameter.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-07-31 10:23:04 -07:00
Mauro Carvalho Chehab
fe34c89d25 docs: driver-model: move it to the driver-api book
The audience for the Kernel driver-model is clearly Kernel hackers.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> # ice driver changes
2019-07-15 11:03:02 -03:00
Linus Torvalds
f632a8170a Driver Core and debugfs changes for 5.3-rc1
Here is the "big" driver core and debugfs changes for 5.3-rc1
 
 It's a lot of different patches, all across the tree due to some api
 changes and lots of debugfs cleanups.  Because of this, there is going
 to be some merge issues with your tree at the moment, I'll follow up
 with the expected resolutions to make it easier for you.
 
 Other than the debugfs cleanups, in this set of changes we have:
 	- bus iteration function cleanups (will cause build warnings
 	  with s390 and coresight drivers in your tree)
 	- scripts/get_abi.pl tool to display and parse Documentation/ABI
 	  entries in a simple way
 	- cleanups to Documenatation/ABI/ entries to make them parse
 	  easier due to typos and other minor things
 	- default_attrs use for some ktype users
 	- driver model documentation file conversions to .rst
 	- compressed firmware file loading
 	- deferred probe fixes
 
 All of these have been in linux-next for a while, with a bunch of merge
 issues that Stephen has been patient with me for.  Other than the merge
 issues, functionality is working properly in linux-next :)
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXSgpnQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykcwgCfS30OR4JmwZydWGJ7zK/cHqk+KjsAnjOxjC1K
 LpRyb3zX29oChFaZkc5a
 =XrEZ
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core and debugfs updates from Greg KH:
 "Here is the "big" driver core and debugfs changes for 5.3-rc1

  It's a lot of different patches, all across the tree due to some api
  changes and lots of debugfs cleanups.

  Other than the debugfs cleanups, in this set of changes we have:

   - bus iteration function cleanups

   - scripts/get_abi.pl tool to display and parse Documentation/ABI
     entries in a simple way

   - cleanups to Documenatation/ABI/ entries to make them parse easier
     due to typos and other minor things

   - default_attrs use for some ktype users

   - driver model documentation file conversions to .rst

   - compressed firmware file loading

   - deferred probe fixes

  All of these have been in linux-next for a while, with a bunch of
  merge issues that Stephen has been patient with me for"

* tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
  debugfs: make error message a bit more verbose
  orangefs: fix build warning from debugfs cleanup patch
  ubifs: fix build warning after debugfs cleanup patch
  driver: core: Allow subsystems to continue deferring probe
  drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
  arch_topology: Remove error messages on out-of-memory conditions
  lib: notifier-error-inject: no need to check return value of debugfs_create functions
  swiotlb: no need to check return value of debugfs_create functions
  ceph: no need to check return value of debugfs_create functions
  sunrpc: no need to check return value of debugfs_create functions
  ubifs: no need to check return value of debugfs_create functions
  orangefs: no need to check return value of debugfs_create functions
  nfsd: no need to check return value of debugfs_create functions
  lib: 842: no need to check return value of debugfs_create functions
  debugfs: provide pr_fmt() macro
  debugfs: log errors when something goes wrong
  drivers: s390/cio: Fix compilation warning about const qualifiers
  drivers: Add generic helper to match by of_node
  driver_find_device: Unify the match function with class_find_device()
  bus_find_device: Unify the match callback with class_find_device
  ...
2019-07-12 12:24:03 -07:00
Mauro Carvalho Chehab
4489f161b7 docs: driver-model: convert docs to ReST and rename to *.rst
Convert the various documents at the driver-model, preparing
them to be part of the driver-api book.

The conversion is actually:
  - add blank lines and identation in order to identify paragraphs;
  - fix tables markups;
  - add some lists markups;
  - mark literal blocks;
  - adjust title markups.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> # ice
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-21 15:47:26 +02:00
Anirudh Venkataramanan
2f2da36ebf ice: Trivial cosmetic changes
This patch mostly capitalizes abbreviations in code comments. Fixed some
typos and removed some unnecessary newlines as well.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-30 10:57:55 -07:00
Anirudh Venkataramanan
072efdf8bf ice: Recognize higher speeds
In ice_print_link_msg, add cases for 50GB and 100GB speeds. This
results in the right speed being reported on load, instead of
"Unknownbps".

When VF link if forced (in ice_set_pfe_link_forced), report
max speed 100GB.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-30 10:53:05 -07:00
Paul Greenwalt
f776b3acb0 ice: Add support for Forward Error Correction (FEC)
This patch adds driver support for Forward Error Correction (FEC)
and ethtool handlers to set/get FEC params.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 23:01:49 -07:00
Tony Nguyen
561f437901 ice: Introduce ice_init_mac_fltr and move ice_napi_del
Consolidate adding unicast and broadcast MAC filters in a single new
function ice_init_mac_fltr.

Move ice_napi_del to ice_lib.c

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 22:51:35 -07:00
Brett Creeley
e89e899f3e ice: Add a helper to trigger software interrupt
Add a new function ice_trigger_sw_intr to trigger interrupts.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 03:00:53 -07:00
Mitch Williams
4cc82aaa74 ice: Change message level
Change the message level of the MTU change log message from debug to
info.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:52:23 -07:00
Mitch Williams
23c0112246 ice: Check all VFs for MDD activity, don't disable
Don't use the mdd_detected variable as an exit condition for this loop;
the first VF to NOT have an MDD event will cause the loop to terminate.

Instead just look at all of the VFs, but don't disable them. This
prevents proper release of resources if the VFs are rebooted or the VF
driver reloaded. Instead, just log a message and call out repeat
offenders.

To make it clear what we are doing, use a differently-named variable in
the loop.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:50:46 -07:00
Brett Creeley
cbe66bfee6 ice: Refactor interrupt tracking
Currently we have two MSI-x (IRQ) trackers, one for OS requested MSI-x
entries (sw_irq_tracker) and one for hardware MSI-x vectors
(hw_irq_tracker). Generally the sw_irq_tracker has less entries than the
hw_irq_tracker because the hw_irq_tracker has entries equal to the max
allowed MSI-x per PF and the sw_irq_tracker is mainly the minimum (non
SR-IOV portion of the vectors, kernel granted IRQs). All of the non
SR-IOV portions of the driver (i.e. LAN queues, RDMA queues, OICR, etc.)
take at least one of each type of tracker resource. SR-IOV only grabs
entries from the hw_irq_tracker. There are a few issues with this approach
that can be seen when doing any kind of device reconfiguration (i.e.
ethtool -L, SR-IOV, etc.). One of them being, any time the driver creates
an ice_q_vector and associates it to a LAN queue pair it will grab and
use one entry from the hw_irq_tracker and one from the sw_irq_tracker.
If the indices on these does not match it will cause a Tx timeout, which
will cause a reset and then the indices will match up again and traffic
will resume. The mismatched indices come from the trackers not being the
same size and/or the search_hint in the two trackers not being equal.
Another reason for the refactor is the co-existence of features with
SR-IOV. If SR-IOV is enabled and the interrupts are taken from the end
of the sw_irq_tracker then other features can no longer use this space
because the hardware has now given the remaining interrupts to SR-IOV.

This patch reworks how we track MSI-x vectors by removing the
hw_irq_tracker completely and instead MSI-x resources needed for SR-IOV
are determined all at once instead of per VF. This can be done because
when creating VFs we know how many are wanted and how many MSI-x vectors
each VF needs. This also allows us to start using MSI-x resources from
the end of the PF's allowed MSI-x vectors so we are less likely to use
entries needed for other features (i.e. RDMA, L2 Offload, etc).

This patch also reworks the ice_res_tracker structure by removing the
search_hint and adding a new member - "end". Instead of having a
search_hint we will always search from 0. The new member, "end", will be
used to manipulate the end of the ice_res_tracker (specifically
sw_irq_tracker) during runtime based on MSI-x vectors needed by SR-IOV.
In the normal case, the end of ice_res_tracker will be equal to the
ice_res_tracker's num_entries.

The sriov_base_vector member was added to the PF structure. It is used
to represent the starting MSI-x index of all the needed MSI-x vectors
for all SR-IOV VFs. Depending on how many MSI-x are needed, SR-IOV may
have to take resources from the sw_irq_tracker. This is done by setting
the sw_irq_tracker->end equal to the pf->sriov_base_vector. When all
SR-IOV VFs are removed then the sw_irq_tracker->end is reset back to
sw_irq_tracker->num_entries. The sriov_base_vector, along with the VF's
number of MSI-x (pf->num_vf_msix), vf_id, and the base MSI-x index on
the PF (pf->hw.func_caps.common_cap.msix_vector_first_id), is used to
calculate the first HW absolute MSI-x index for each VF, which is used
to write to the VPINT_ALLOC[_PCI] and GLINT_VECT2FUNC registers to
program the VFs MSI-x PCI configuration bits. Also, the sriov_base_vector
is used along with VF's num_vf_msix, vf_id, and q_vector->v_idx to
determine the MSI-x register index (used for writing to GLINT_DYN_CTL)
within the PF's space.

Interrupt changes removed any references to hw_base_vector, hw_oicr_idx,
and hw_irq_tracker. Only sw_base_vector, sw_oicr_idx, and sw_irq_tracker
variables remain. Change all of these by removing the "sw_" prefix to
help avoid confusion with these variables and their use.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:48:49 -07:00
Anirudh Venkataramanan
0e674aeb0b ice: Add handler for ethtool selftest
This patch adds a handler for ethtool selftest. Selftest includes
testing link, interrupts, eeprom, registers and packet loopback.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:44:12 -07:00
Tony Nguyen
3171948e94 ice: Implement toggling ethtool rx-vlan-filter
Implement the toggling of rx-vlan-filter; enable|disable VLAN
pruning based on on|off, respectively.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-29 02:35:06 -07:00
Brett Creeley
aa6ccf3f2d ice: Fix couple of issues in ice_vsi_release
Currently the driver is calling ice_napi_del() and then
unregister_netdev(). The call to unregister_netdev() will result in a
call to ice_stop() and then ice_vsi_close(). This is where we call
napi_disable() for all the MSI-X vectors. This flow is reversed so make
the changes to ensure napi_disable() happens prior to napi_del().

Before calling napi_del() and free_netdev() make sure
unregister_netdev() was called. This is done by making sure the
__ICE_DOWN bit is set in the vsi->state for the interested VSI.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23 10:51:54 -07:00
Dave Ertman
e223eaec67 ice: Fix hang when ethtool disables FW LLDP
When disabling and enabling VSIs, there are a couple of flows
that recursively acquire the RTNL lock which causes a deadlock.
Fix that.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23 10:51:53 -07:00
Anirudh Venkataramanan
b4603dbf1e ice: Fix double spacing
Fix double spacing in ice_napi_disable_all

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-23 10:51:53 -07:00
Michal Swiatkowski
64439f8f0b ice: Disable sniffing VF traffic on PF
Delete code that add default Tx rule on PF. With this rule PF can see
Tx VF traffic that should go outside. For traffic from VF to another
VF default Tx rule on PF doesn't apply because of lower priority than
VF mac rule.

With this change on PF in promisc mode we can see only Rx traffic that
doesn't match any other rule (mac etc.). We can't see Tx traffic from
other VSI.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-04 14:44:47 -07:00
Tony Nguyen
8f529ff912 ice: Separate if conditions for ice_set_features()
Set features can have multiple features turned on|off in a single
call.  Grouping these all in an if/else means after one condition
is met, other conditions/features will not be evaluated.  Break
the if/else statements by feature to ensure all features will be
handled properly.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-04 14:33:44 -07:00
Michal Swiatkowski
a52db6b260 ice: Fix for allowing too many MDD events on VF
Disable VF if any malicious device driver (MDD) event is detected by
hardware. Track vf->num_mdd_events for information about VF MDD events.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-04 13:42:53 -07:00
Brett Creeley
c2a23e0061 ice: Refactor link event flow
Currently the link event flow works, but can be much better.
Refactor the link event flow to make it cleaner and more clear
on what is going on.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-02 01:27:11 -07:00
Brett Creeley
b07833a00d ice: Add reg_idx variable in ice_q_vector structure
Every time we want to re-enable interrupts and/or write to a register
that requires an interrupt vector's hardware index we do the following:

vsi->hw_base_vector + q_vector->v_idx

This is a wasteful operation, especially in the hot path. Fix this by
adding a u16 reg_idx member to the ice_q_vector structure and make the
necessary changes to make this work.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-02 01:21:56 -07:00
Md Fahad Iqbal Polash
8d7189d266 ice: Remove runtime change of PFINT_OICR_ENA register
Runtime change of PFINT_OICR_ENA register is unnecessary.
The handlers should always clear the atomic bit for each
task as they start, because it will make sure that any late
interrupt will either 1) re-set the bit, or 2) be handled
directly in the "already running" task handler.

Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-02 01:19:26 -07:00
Brett Creeley
0c2561c81f ice: Use ice_for_each_q_vector macro where possible
There are many places in the code where we do the following:

for (i = 0; i < vsi->num_q_vectors; i++)

Instead use the macro mentioned in the commit title:

ice_for_each_q_vector(vsi, i)

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-05-02 01:09:51 -07:00
Anirudh Venkataramanan
9c010de7cf ice: Bump driver version
Update driver version to 0.7.4

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:48 -07:00
Anirudh Venkataramanan
b832c2f631 ice: Add code for DCB rebuild
This patch introduces a new function ice_dcb_rebuild which reinitializes
DCB after a reset.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:48 -07:00
Anirudh Venkataramanan
4b0fdceb81 ice: Add code to get DCB related statistics
This patch adds a new function ice_update_dcb_stats to get DCB stats
from the hardware and ethtool support for displaying these stats.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan
a629cf0a01 ice: Update rings based on TC information
This patch adds a new function ice_vsi_cfg_dcb_rings which updates a
VSI's rings based on DCB traffic class information.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan
00cc3f1b3a ice: Add code to process LLDP MIB change events
This patch adds support to process LLDP MIB change notifications sent
by the firmware.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan
7b9ffc76bf ice: Add code for DCB initialization part 3/4
This patch adds a new function ice_pf_dcb_cfg (and related helpers)
which applies the DCB configuration obtained from the firmware. As
part of this, VSIs/netdevs are updated with traffic class information.

This patch requires a bit of a refactor of existing code.

1. For a MIB change event, the associated VSI is closed and brought up
   again. The gap between closing and opening the VSI can cause a race
   condition. Fix this by grabbing the rtnl_lock prior to closing the
   VSI and then only free it after re-opening the VSI during a MIB
   change event.

2. ice_sched_query_elem is used in ice_sched.c and with this patch, in
   ice_dcb.c as well. However, ice_dcb.c is not built when CONFIG_DCB is
   unset. This results in namespace warnings (ice_sched.o: Externally
   defined symbols with no external references) when CONFIG_DCB is unset.
   To avoid this move ice_sched_query_elem from ice_sched.c to
   ice_common.c.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan
37b6f6469f ice: Add code for DCB initialization part 1/4
This patch introduces a skeleton for ice_init_pf_dcb, the top level
function for DCB initialization. Subsequent patches will add to this
DCB init flow.

In this patch, ice_init_pf_dcb checks if DCB is a supported capability.
If so, an admin queue call to start the LLDP and DCBx in firmware is
issued. If not, an error is reported. Note that we don't fail the driver
init if DCB init fails.

Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan
802abbb44a ice: Bump version
Bump driver version to 0.7.3

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan
f9867df6d9 ice: Fix incorrect use of abbreviations
Capitalize abbreviations and spell out some that aren't obvious.

Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan
94c4441b5a ice: Fix typos in code comments
This patch fixes typos in code comments.

Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Brett Creeley
203a068ac9 ice: Add missing case in print_link_msg for printing flow control
Currently we aren't checking for the ICE_FC_NONE case for the current
flow control mode. This is causing "Unknown" to be printed for the
current flow control method if flow control is disabled. Fix this by
adding the case for ICE_FC_NONE to print "None".

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-26 15:05:27 -07:00
Preethi Banala
89f3e4a5b7 ice: Do not bail out when filter already exists
If filter already exists, do not go through error path flow but instead
continue to process rest of the function. Hence have an appropriate check
after adding MAC filters.

Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-26 14:51:59 -07:00
Brett Creeley
5995b6d0c6 ice: Implement pci_error_handler ops
This patch implements the following pci_error_handler ops:
	.error_detected = ice_pci_err_detected
	.slot_reset = ice_pci_err_slot_reset
	.reset_notify = ice_pci_err_reset_notify
	.reset_prepare = ice_pci_err_reset_prepare
	.reset_done = ice_pci_err_reset_done
	.resume = ice_pci_err_resume

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-26 13:57:12 -07:00
Brett Creeley
5abac9d7e1 ice: Put __ICE_PREPARED_FOR_RESET check in ice_prepare_for_reset
Currently we check if the __ICE_PREPARED_FOR_RESET bit is set prior to
calling ice_prepare_for_reset in ice_reset_subtask(), but we aren't
checking that bit in ice_do_reset() before calling
ice_prepare_for_reset(). This is not consistent and can cause issues if
ice_prepare_for_reset() is called prior to ice_do_reset(). Fix this by
checking if the __ICE_PREPARED_FOR_RESET bit is set internal to
ice_prepare_for_reset().

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-26 13:40:20 -07:00
Dave Ertman
2ebd4428d9 ice: Prevent unintended multiple chain resets
In the current implementation of ice_reset_subtask, if multiple reset
types are set in the pf->state, the most intrusive one is meant to be
performed only, but the bits requesting the other types are not being
cleared. This would lead to another reset being performed the next time
the service task is scheduled.

Change the flow of ice_reset_subtask so that all reset request bits in
pf->state are cleared, and we still perform the most intrusive of the
resets requested.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-25 10:12:21 -07:00
Brett Creeley
250c3b3e0a ice: Enable link events over the ARQ
The hardware now supports link events over the admin receive queue (ARQ),
so enable HW link events over the ARQ and remove code for link event
polling.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-25 07:54:44 -07:00
Alan Brady
8d051b8b5d ice: use irq_num var in ice_vsi_req_irq_msix
Someone went through the effort of making this a variable so let's use
it instead of recalculating it again.

Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-25 07:52:51 -07:00
Akeem G Abodunrin
5eda8afd6b ice: Add support for PF/VF promiscuous mode
Implement support for VF promiscuous mode, MAC/VLAN/MAC_VLAN and PF
multicast MAC/VLAN/MAC_VLAN promiscuous mode.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-22 08:19:17 -07:00
Bruce Allan
c8b7abdd7d ice: fix some function prototype and signature style issues
Put the return type on a separate line for function prototypes and
signatures that would exceed the 80-character limit if both were on
the same line.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-22 08:19:17 -07:00
Bruce Allan
1b5c19c779 ice: fix static analysis warnings
cppcheck warns "Identical condition '<var>', second condition is always
false". Fix them.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-22 08:19:16 -07:00
Akeem G Abodunrin
1c44e3bce1 ice: Implement flow to reset VFs with PFR and other resets
All VF VSIs need to be reset and rebuild with the main VSIs before
replaying all VSIs, so that all existing switch filters, scheduler tree
and other configuration could be replayed at once. This fixes issues when
doing PFR and CORER reset.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19 17:00:09 -07:00
Brett Creeley
70457520ba ice: configure GLINT_ITR to always have an ITR gran of 2
Instead of hoping that our ITR granularity will be 2 usec program the
GLINT_CTL register to make sure the ITR granularity is always 2 usecs.

Now that we know what the ITR granularity will be get rid of the check
in ice_probe() to verify our previous assumption.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19 16:56:10 -07:00
Brett Creeley
80ed404abb ice: use ice_for_each_vsi macro when possible
Replace all instances of:
	for (i = 0; i < pf->num_alloc_vsi; i++)

with the following macro:
	ice_for_each_vsi(pf, i)

This will allow the code to be consistent since there are currently
cases of using both.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19 16:54:23 -07:00
Bruce Allan
77ed84f49a ice: avoid multiple unnecessary de-references in probe
Add a local variable struct device *dev to avoid unnecessary de-references
throughout ice_probe().

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19 16:49:36 -07:00
Mitch Williams
4cf7bc0d27 ice: don't spam VFs with link messages
Don't send a link message to the VFs unless link actually changes state.
This avoids a small timing hole in some VF drivers that can cause an
apparent TX hang if they receive a link status message at the wrong time.

Although we have fixed the timing hole in the current VF driver, there
are still lots of drivers in the field that have this timing hole. Let's
not fall into it if we can avoid it.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Brett Creeley
0e04e8e14b ice: fix issue where host reboots on unload when iommu=on
Currently if the kernel has the intel_iommu=on parameter set, on some
platforms removing the driver causes a system reboot. In initialization
we associate the control queue interrupts with the pf->hw_oicr_idx and
enable the interrupts by setting the CAUSE_ENA bit. The problem comes
on teardown because we are not clearing the CAUSE_ENA bit for the
control queues, but the vector at pf->hw_oicr_idx (miscellaneous
interrupt vector) gets disabled.

Fix this by clearing the CAUSE_ENA bit in the appropriate control queue
registers on when freeing the miscellaneous interrupt vector. Also,
move the call to ice_free_irq_msix_misc() to after ice_deinit_sw() in
ice_remove() because ice_deinit_sw() makes an AQ call, but
ice_free_irq_msix_misc() disables the miscellaneous vector and it's
associated interrupts.

Also, create two small helper functions to enable and disable the
control queue interrupts respectively.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Bruce Allan
198a666a45 ice: fix stack hogs from struct ice_vsi_ctx structures
struct ice_vsi_ctx has gotten large enough that function local declarations
of it on the stack are causing stack hogs.  Fix that by allocating the
structs on heap.  Cleanup some formatting issues in the code around these
changes and fix incorrect data type uses of returned functions in a couple
places.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Bruce Allan
c6dfd690f1 ice: sizeof(<type>) should be avoided
With sizeof(), it is preferable to use the variable of type <type> instead
of sizeof(<type>).

There are multiple places where a temporary variable is used to hold a
'size' value which is then used for a subsequent alloc/memset. Get rid
of the temporary variable by calculating size as part of the alloc/memset
statement.

Also remove unnecessary type-cast.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Bruce Allan
99be37edeb ice: Mark extack argument as __always_unused
Commit 87b0984ebf ("net: Add extack argument to ndo_fdb_add()") in
net-next added an extended parameter to the .ndo_fdb_add op and changed
ice_fdb_add() accordingly. Update the function header and add the
__always_unused attribute to the new parameter to avoid -Wunused-parameter
warnings.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25 08:56:01 -08:00
Petr Machata
87b0984ebf net: Add extack argument to ndo_fdb_add()
Drivers may not be able to support certain FDB entries, and an error
code is insufficient to give clear hints as to the reasons of rejection.

In order to make it possible to communicate the rejection reason, extend
ndo_fdb_add() with an extack argument. Adapt the existing
implementations of ndo_fdb_add() to take the parameter (and ignore it).
Pass the extack parameter when invoking ndo_fdb_add() from rtnl_fdb_add().

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-17 15:18:47 -08:00
Anirudh Venkataramanan
cf909e19ac ice: Offload SCTP checksum
This patch adds the ability to offload SCTP checksum calculations to the
NIC.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-01-15 12:02:27 -08:00
Brett Creeley
63f545ed12 ice: Add support for adaptive interrupt moderation
Currently the driver does not support adaptive/dynamic interrupt
moderation. This patch adds support for this. Also, adaptive/dynamic
interrupt moderation is turned on by default upon driver load.

In order to support adaptive interrupt moderation, two functions were
added, ice_update_itr() and ice_itr_divisor(). These are used to
determine the current packet load and to determine a divisor based
on link speed respectively.

This patch also adds the ICE_ITR_GRAN_S define that is used in the
hot-path when setting a new ITR value. The shift is used to pet two
birds with one hand, set the ITR value while re-enabling the
interrupt. Also, the ICE_ITR_GRAN_S is defined as 1 because the device
has a ITR granularity of 2usecs.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-01-15 11:29:16 -08:00
Anirudh Venkataramanan
03f7a98668 ice: Rework queue management code for reuse
This patch reworks the queue management code to allow for reuse with the
XDP feature (to be added in a future patch).

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-01-15 11:11:10 -08:00
Bruce Allan
ab4ab73fc1 ice: Add ethtool private flag to make forcing link down optional
Add new infrastructure for implementing ethtool private flags using the
existing pf->flags bitmap to store them, and add the link-down-on-close
ethtool private flag to optionally bring down the PHY link when the
interface is administratively downed.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-01-15 10:32:59 -08:00
Brett Creeley
b6f934f027 ice: Set physical link up/down when an interface is set up/down
When a netdev is set up/down we need to set the phsyical link state
accordingly. This patch adds that functionality by calling
ice_force_phys_link_state(vsi, link_up) in both the ice_stop() and
ice_open() paths.

In order to force link, ice_force_phys_link_state(vsi, link_up) will
first determine the current phy capabilities. If link has not changed
there is nothing to do. If link has changed, previous PHY capabilities
are saved and the "Enable Automatic Link Update" and "Link Establishment
State Machine (LESM)" enable bits are set. Then the new PHY config is
saved. The "Enable Automatic Link Update" will force the FW to execute
Setup link and restart auto-negotiation. This *should* then result in a
"Link Status Event (LSE)" which will cause the driver to get the current
link status.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-01-15 10:27:18 -08:00
Bruce Allan
3d50514717 ice: Fix unused variable build warning
Commit 2fd527b72b ("net: ndo_bridge_setlink: Add extack") added a new
parameter "extack" to ice_bridge_setlink but this parameter isn't used
by the function. This results in a warning: unused parameter ‘extack’
[-Wunused-parameter]. Fix that by adding an "__always_unused" qualifier.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-01-15 08:44:05 -08:00
Young Xiao
eec903769b ice: Do not enable NAPI on q_vectors that have no rings
If ice driver has q_vectors w/ active NAPI that has no rings,
then this will result in a divide by zero error. To correct it
I am updating the driver code so that we only support NAPI on
q_vectors that have 1 or more rings allocated to them.

See commit 13a8cd191a ("i40e: Do not enable NAPI on q_vectors
that have no rings") for detail.

Signed-off-by: Young Xiao <YangX92@hotmail.com>
Acked-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-12-20 12:10:24 -08:00
Petr Machata
2fd527b72b net: ndo_bridge_setlink: Add extack
Drivers may not be able to implement a VLAN addition or reconfiguration.
In those cases it's desirable to explain to the user that it was
rejected (and why).

To that end, add extack argument to ndo_bridge_setlink. Adapt all users
to that change.

Following patches will use the new argument in the bridge driver.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-12 16:34:21 -08:00
Anirudh Venkataramanan
d337f2afb7 ice: Use Tx|Rx in comments
In code comments, use Tx|Rx instead of tx|rx

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:04 -08:00
Anirudh Venkataramanan
df17b7e02f ice: Cosmetic formatting changes
1. Fix several cases of double spacing
2. Fix typos
3. Capitalize abbreviations

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:04 -08:00
Bruce Allan
bc0c6fab8a ice: Cleanup ice_tx_timeout()
Clean up number of formatting issues and a comment that could use
clarification.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:04 -08:00
Usha Ketineni
c5a2a4a388 ice: Fix to make VLAN priority tagged traffic to appear on all TCs
This patch includes below changes to resolve the issue of ETS bandwidth
shaping to work.

1. Allocation of Tx queues is accounted for based on the enabled TC's
   in ice_vsi_setup_q_map() and enabled the Tx queues on those TC's via
   ice_vsi_cfg_txqs()

2. Get the mapped netdev TC # for the user priority and set the priority
   to TC mapping for the VSI.

Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:26 -08:00
Dave Ertman
d09e2693b6 ice: Avoid nested RTNL locking in ice_dis_vsi
ice_dis_vsi() performs an rtnl_lock() if it detects a netdev that is
running on the VSI. In cases where the RTNL lock has already been
acquired, a deadlock results. Add a boolean to pass to ice_dis_vsi to
tell it if the RTNL lock is already held.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:26 -08:00
Anirudh Venkataramanan
995c90f2de ice: Calculate guaranteed VSIs per function and use it
Currently we are setting the guar_num_vsi to equal to ICE_MAX_VSI
which is the device limit of 768. This is incorrect and could have
unintended consequences. To fix this use the valid_function's 8-bit
bitmap returned from discovering device capabilities to determine the
guar_num_vsi per function. guar_num_vsi value is then passed on to
pf->num_alloc_vsi.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:26 -08:00
Brett Creeley
807bc98d31 ice: Fix debug print in ice_tx_timeout
Currently the debug print in ice_tx_timeout is printing useless and
duplicate values. First, head is being assigned to tx_ring->next_to_clean
and we are printing both of those values, but naming them HWB and NTC
respectively. Also, reading tail always returns 0 so remove that as well.

Instead of assigning the SW head (NTC) read to head, use the actual head
register and change the debug print to note that this is HW_HEAD. Also
reduce the scope of a couple variables.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-13 09:09:25 -08:00
Brett Creeley
c585ea42ec ice: Fix tx_timeout in PF driver
Prior to this commit the driver was running into tx_timeouts when a
queue was stressed enough. This was happening because the HW tail
and SW tail (NTU) were incorrectly out of sync. Consequently this was
causing the HW head to collide with the HW tail, which to the hardware
means that all descriptors posted for Tx have been processed.

Due to the Tx logic used in the driver SW tail and HW tail are allowed
to be out of sync. This is done as an optimization because it allows the
driver to write HW tail as infrequently as possible, while still
updating the SW tail index to keep track. However, there are situations
where this results in the tail never getting updated, resulting in Tx
timeouts.

Tx HW tail write condition:
	if (netif_xmit_stopped(txring_txq(tx_ring) || !skb->xmit_more)
		writel(sw_tail, tx_ring->tail);

An issue was found in the Tx logic that was causing the afore mentioned
condition for updating HW tail to never happen, causing tx_timeouts.

In ice_xmit_frame_ring we calculate how many descriptors we need for the
Tx transaction based on the skb the kernel hands us. This is then passed
into ice_maybe_stop_tx along with some extra padding to determine if we
have enough descriptors available for this transaction. If we don't then
we return -EBUSY to the stack, otherwise we move on and eventually
prepare the Tx descriptors accordingly in ice_tx_map and set
next_to_watch. In ice_tx_map we make another call to ice_maybe_stop_tx
with a value of MAX_SKB_FRAGS + 4. The key here is that this value is
possibly less than the value we sent in the first call to
ice_maybe_stop_tx in ice_xmit_frame_ring. Now, if the number of unused
descriptors is between MAX_SKB_FRAGS + 4 and the value used in the first
call to ice_maybe_stop_tx in ice_xmit_frame_ring then we do not update
the HW tail because of the "Tx HW tail write condition" above. This is
because in ice_maybe_stop_tx we return success from ice_maybe_stop_tx
instead of calling __ice_maybe_stop_tx and subsequently calling
netif_stop_subqueue, which sets the __QUEUE_STATE_DEV_XOFF bit. This
bit is then checked in the "Tx HW tail write condition" by calling
netif_xmit_stopped and subsequently updating HW tail if the
afore mentioned bit is set.

In ice_clean_tx_irq, if next_to_watch is not NULL, we end up cleaning
the descriptors that HW sets the DD bit on and we have the budget. The
HW head will eventually run into the HW tail in response to the
description in the paragraph above.

The next time through ice_xmit_frame_ring we make the initial call to
ice_maybe_stop_tx with another skb from the stack. This time we do not
have enough descriptors available and we return NETDEV_TX_BUSY to the
stack and end up setting next_to_watch to NULL.

This is where we are stuck. In ice_clean_tx_irq we never clean anything
because next_to_watch is always NULL and in ice_xmit_frame_ring we never
update HW tail because we already return NETDEV_TX_BUSY to the stack and
eventually we hit a tx_timeout.

This issue was fixed by making sure that the second call to
ice_maybe_stop_tx in ice_tx_map is passed a value that is >= the value
that was used on the initial call to ice_maybe_stop_tx in
ice_xmit_frame_ring. This was done by adding the following defines to
make the logic more clear and to reduce the chance of mucking this up
again:

ICE_CACHE_LINE_BYTES		64
ICE_DESCS_PER_CACHE_LINE	(ICE_CACHE_LINE_BYTES / \
				 sizeof(struct ice_tx_desc))
ICE_DESCS_FOR_CTX_DESC		1
ICE_DESCS_FOR_SKB_DATA_PTR	1

The ICE_CACHE_LINE_BYTES being 64 is an assumption being made so we
don't have to figure this out on every pass through the Tx path. Instead
I added a sanity check in ice_probe to verify cache line size and print
a message if it's not 64 Bytes. This will make it easier to file issues
if they are seen when the cache line size is not 64 Bytes when reading
from the GLPCI_CNF2 register.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-06 12:46:47 -08:00