Some cards do not support autoneg. The current code does not prevent the
user from enabling autoneg with ethtool on such cards, causing confusion.
Firmware provides the autoneg capability information and we just need to
store it in the support_auto_speeds field in bnxt_link_info struct.
The ethtool set_settings() call will check this field before proceeding
with autoneg.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add bnxt_gro_func_5731x() to handle GRO packets for this chip. The
completion structures used in the new chip have new data to help determine
the header offsets. The offsets can be off by 4 if the packet is an
internal loopback packet (e.g. from one VF to another VF). Some additional
logic is added to adjust the offsets if it is a loopback packet.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newer chips require different logic to handle GRO packets. So refactor
the code so that we can call different functions depending on the chip.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define all the supported chip numbers and chip categories. Store the
chip_num returned by firmware. If the call to get the version and chip
number fails, we should abort.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NPAR type is read from bnxt_hwrm_func_qcfg. Do not allow changing link
parameters if in NPAR mode sinc ethe port is shared among multiple
partitions. The link parameters are set up by firmware.
Signed-off-by: Satish Baddipadige <sbaddipa@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the VF driver gets this event, the VF configuration has changed (such
as default VLAN). The VF driver will initiate a silent reset to pick up
the new configuration.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code is more complicated than necessary and can only report
unsupported SFP+ module if it is plugged in after the device is up.
Rename bnxt_port_module_event() to bnxt_get_port_module_status(). We
already have the current module_status in the link_info structure, so
just check that and report any unsupported SFP+ module status. Delete
the unnecessary last_port_module_event. Call this function at the
end of bnxt_open to report unsupported module already plugged in.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The chip supports 4K/8K/64K page sizes for the rings and we try to
match it to the CPU PAGE_SIZE. The current page size limits for the rings
are based on 4K/8K page size. If the page size is 64K, these limits are
too large. Reduce them appropriately.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support to fetch the SFP EEPROM settings from the firmware
and display it via the ethtool -m command. We support SFP+ and QSFP
modules.
v2: Fixed a bug in bnxt_get_module_eeprom() found by Ben Hutchings.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The nf_conntrack_core.c fix in 'net' is not relevant in 'net-next'
because we no longer have a per-netns conntrack hash.
The ip_gre.c conflict as well as the iwlwifi ones were cases of
overlapping changes.
Conflicts:
drivers/net/wireless/intel/iwlwifi/mvm/tx.c
net/ipv4/ip_gre.c
net/netfilter/nf_conntrack_core.c
Signed-off-by: David S. Miller <davem@davemloft.net>
Add detection and recovery code when the hardware returned opaque value
does not match the expected consumer index. Once the issue is detected,
we skip the processing of all RX and LRO/GRO packets. These completion
entries are discarded without sending the SKB to the stack and without
producing new buffers. The function will be reset from a workqueue.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a rare hardware bug that can cause a bad opaque value in the RX
or TPA completion. When this happens, the hardware may have used the
same buffer twice for 2 rx packets. In addition, the driver will also
crash later using the bad opaque as the index into the ring.
The rx opaque value is predictable and is always monotonically increasing.
The workaround is to keep track of the expected next opaque value and
compare it with the one returned by hardware during RX and TPA start
completions. If they miscompare, we will not process any more RX and
TPA completions and exit NAPI. We will then schedule a workqueue to
reset the function.
This patch adds the logic to keep track of the next rx consumer index.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
net/ipv4/ip_gre.c
Minor conflicts between tunnel bug fixes in net and
ipv6 tunnel cleanups in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
If PAGE_SIZE is bigger than BNXT_RX_PAGE_SIZE, that means the native CPU
page is bigger than the maximum length of the RX BD. Divide the page
into multiple 32K buffers for the aggregation ring.
Add an offset field in the bnxt_sw_rx_agg_bd struct to keep track of the
page offset of each buffer. Since each page can be referenced by multiple
buffer entries, call get_page() as needed to get the proper reference
count.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RX BD length field of this device is 16-bit, so the largest buffer
size is 65535. For LRO and GRO, we allocate native CPU pages for the
aggregation ring buffers. It won't work if the native CPU page size is
64K or bigger.
We fix this by defining BNXT_RX_PAGE_SIZE to be native CPU page size
up to 32K. Replace PAGE_SIZE with BNXT_RX_PAGE_SIZE in all appropriate
places related to the rx aggregation ring logic.
The next patch will add additional logic to divide the page into 32K
chunks for aggrgation ring buffers if PAGE_SIZE is bigger than
BNXT_RX_PAGE_SIZE.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10GBaseT devices must autonegotiate to determine master/slave clocking.
Disallow forced speed in ethtool .set_settings() for these devices.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the PORT_CONN_NOT_ALLOWED async event handling logic. The driver
will print an appropriate warning to reflect the SFP+ module enforcement
policy done in the firmware.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Add bnxt_hwrm_set_eee() function to setup EEE firmware parameters based
on the bp->eee settings.
2. The new function bnxt_eee_config_ok() will check if EEE parameters need
to be modified due to autoneg changes.
3. bnxt_hwrm_set_link() has added a new parameter to update EEE. If the
parameter is set, it will call bnxt_hwrm_set_eee().
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get EEE capability and the initial EEE settings from firmware.
Add "EEE is active | not active" to link up dmesg.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use new field names in API structs and stop using deprecated fields
auto_link_speed and auto_duplex in phy_cfg/phy_qcfg structs.
Update copyright year to 2016.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The size of every padded firmware message is specified in the first
HWRM_VER_GET response message. Use this value to pad every message
after that.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gather periodic port statistics if the device is PF and link is up. This
is triggered in bnxt_timer() every one second to request firmware to DMA
the counters.
Signed-off-by: Michael Chan <michael.chan@broadocm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow all autoneg speeds aupported by firmware to be advertised. If
the advertising parameter is 0, then all supported speeds will be
advertised.
Remove BNXT_ALL_COPPER_ETHTOOL_SPEED which is no longer used as all
supported speeds can be advertised.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And report actual pause settings to ETHTOOL_GPAUSEPARAM to let ethtool
resolve the actual pause settings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is used to send NVM_FIND_DIR_ENTRY messages which can return error
if the entry is not found. This is normal and the error message will
cause unnecessary alarm, so silence it.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use appropriate firmware request header structure to prepare the
firmware messages. This avoids the unnecessary conversion of the
fields to 32-bit fields. Add appropriate endian conversion when
printing out the message fields in dmesg so that they appear correct
in the log.
Reported-by: Rob Swindell <swindell@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before this patch, we used a hardcoded value of 500 msec as the default
value for firmware message response timeout. For better portability with
future hardware or debug platforms, use the value provided by firmware in
the first response and store it for all susequent messages. Redefine the
macro HWRM_CMD_TIMEOUT to the stored value. Since we don't have the
value yet in the first message, use the 500 ms default if the stored value
is zero.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When tx and rx rings don't share the same completion ring, tx coalescing
parameters can be set differently from the rx coalescing parameters.
Otherwise, use rx coalescing parameters on shared completion rings.
Adjust rx coalescing default values to lower interrupt rate.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't convert these to internal hardware tick values before storing
them. This avoids the confusion of ethtool -c returning slightly
different values than the ones set using ethtool -C when we convert
hardware tick values back to micro seconds. Add better comments for
the hardware settings.
Also, rename the current set of coalescing fields with rx_ prefix.
The next patch will add support of tx coalescing values.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During remove_one() when SRIOV is enabled, the PF driver
should broadcast PF driver unload notification to all
VFs that are attached to VMs. Upon receiving the PF
driver unload notification, the VF driver should print
a warning message to message log. Certain operations on the
VF may not succeed after the PF has unloaded.
Signed-off-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current default tx ring size of 512 causes an extra page to be
allocated for the tx ring with only 1 entry in it. Reduce it to
511. The default rx ring size is also reduced to 511 to use less
memory by default.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tx push is supported for small packets to reduce DMA latency. The
following bugs are fixed in this patch:
1. Fix the definition of the push BD which is different from the DMA BD.
2. The push buffer has to be zero padded to the next 64-bit word boundary
or tx checksum won't be correct.
3. Increase the tx push packet threshold to 164 bytes (192 bytes with the BD)
so that small tunneled packets are within the threshold.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add logic to calculate how many shared or non shared rings can be
supported. Default is to use shared rings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to support dedicated or shared completion rings, the ring
indexing and mapping are re-structured as below:
1. bp->grp_info[] array index is 1:1 with bp->bnapi[] array index and
completion ring index.
2. rx rings 0 to n will be mapped to completion rings 0 to n.
3. If tx and rx rings share completion rings, then tx rings 0 to m will
be mapped to completion rings 0 to m.
4. If tx and rx rings use dedicated completion rings, then tx rings 0 to
m will be mapped to completion rings n + 1 to n + m.
5. Each tx or rx ring will use the corresponding completion ring index
for doorbell mapping and MSIX mapping.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, an rx and a tx ring are always paired with a completion ring.
We want to restructure it so that it is possible to have a dedicated
completion ring for tx or rx only.
The bnxt hardware uses a completion ring for rx and tx events. The driver
has to process the completion ring entries sequentially for the rx and tx
events. Using a dedicated completion ring for rx only or tx only has these
benefits:
1. A burst of rx packets can cause delay in processing tx events if the
completion ring is shared. If tx queue is stopped by BQL, this can cause
delay in re-starting the tx queue.
2. A completion ring is sized according to the rx and tx ring size rounded
up to the nearest power of 2. When the completion ring is shared, it is
sized by adding the rx and tx ring sizes and then rounded to the next power
of 2, often with a lot of wasted space.
3. Using dedicated completion ring, we can adjust the tx and rx coalescing
parameters independently for rx and tx.
The first step is to separate the rx and tx ring structures from the
bnxt_napi struct.
In this patch, an rx ring and a tx ring will point to the same bnxt_napi
struct to share the same completion ring. No change in ring assignment
and mapping yet.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This interface will be forward compatible with future changes.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newer firmware will return the ring group resource when we call
hwrm_func_qcaps(). To be compatible with older firmware, use the
number of tx rings as the number of ring groups if the older firmware
returns 0. When determining how many rx rings we can support, take
the ring group resource in account as well in _bnxt_get_max_rings().
Divide and assign the ring groups to VFs.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to keep track of all resources, such as rx rings, tx rings,
cmpl rings, rss contexts, stats contexts, vnics, after we have
divided them for the VFs. Otherwise, subsequent ring changes on
the PF may not work correctly.
We adjust all max resources in struct bnxt_pf_info after they have been
assigned to the VFs. There is no need to keep the separate
max_pf_tx_rings and max_pf_rx_rings.
When SR-IOV is disabled, we call bnxt_hwrm_func_qcaps() to restore the
max resources for the PF.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When implementing driver reset from tx_timeout in the next patch,
bnxt_close_nic() will be called from the sp_task workqueue. Calling
cancel_work() on sp_task will hang the workqueue.
Instead, set a new bit BNXT_STATE_IN_SP_TASK when bnxt_sp_task() is running.
bnxt_close_nic() will wait for BNXT_STATE_IN_SP_TASK to clear before
proceeding.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows multiple independent bits to be set for various states.
Subsequent patches to implement tx timeout reset will require this.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to use offset 0x4014 for reading CAG interrupt status,
the actual CAG register must be mapped to GRC bar0 window #4.
Otherwise, the driver is reading garbage. This patch corrects
this issue.
Signed-off-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The profile ID in the completion record needs to be ANDed with the
profile ID mask of 0x1f. This bug was causing the SKB hash type
and the gso_type to be wrong in some cases.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the sp event bits to be bit positions instead of bit values since
the bit helper functions are expecting the former.
Signed-off-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_gro_skb() has unused variables when CONFIG_INET is not set. We
really cannot support hardware GRO if CONFIG_INET is not set, so
compile out bnxt_gro_skb() completely and define BNXT_FLAG_GRO to be 0
if CONFIG_INET is not set. This will effectively always disable
hardware GRO if CONFIG_INET is not set.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct bnxt_pf_info needs to be always defined. Move bnxt_update_vf_mac()
to bnxt_sriov.c and add some missing #ifdef CONFIG_BNXT_SRIOV.
Reported-by: Jim Hull <jim.hull@hpe.com>
Tested-by: Jim Hull <jim.hull@hpe.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Broadcom ethernet driver for the new family of NetXtreme-C/E
ethernet devices.
v5:
- Removed empty blank lines at end of files (noted by David Miller).
- Moved busy poll helper functions to bnxt.h to at least make the
.c file look less cluttered with #ifdef (noted by Stephen Hemminger).
v4:
- Broke up 2 long message strings with "\n" (suggested by John Linville)
- Constify an array of strings (suggested by Stephen Hemminger)
- Improve bnxt_vf_pciid() (suggested by Stephen Hemminger)
- Use PCI_VDEVICE() to populate pci_device_id table for more compact
source.
v3:
- Fixed 2 more sparse warnings.
- Removed some unused structures in .h files.
v2:
- Fixed all kbuild test robot reported warnings.
- Fixed many of the checkpatch.pl errors and warnings.
- Fixed the Kconfig description (noted by Dmitry Kravkov).
Acked-by: Eddie Wai <eddie.wai@broadcom.com>
Acked-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>