On older firmware that doesn't support the HWRM_NVM_GET_DEV_INFO
command that returns detailed stored firmware versions, fallback
to use the same firmware package version that is reported to ethtool.
Refactor bnxt_get_pkgver() in bnxt_ethtool.c so that devlink can call
and get the package version.
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Live patches are activated by using the 'limit no_reset' option when
performing a devlink dev reload fw_activate operation. These packages
must first be installed on the device in the usual way. For example,
via devlink dev flash or ethtool -f.
The devlink device info has also been enhanced to render stored and
running live patch versions.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The main changes are firmware live patch support and 2 additional FEC
standard counters.
Add the matching FEC counters to ethtool counter array. Firmware older
than 220 does not return the proper size of the extended RX counters so
we need to cap it at the smaller legacy size. Otherwise the new FEC
counters may show up with garbage values.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Populate the dump with firmware 'live' coredump data. This includes
the information stored in NVRAM by the firmware exception handler
prior to recovery. Thus, the live dump includes the desired crash
context.
Firmware does not support HWRM calls after RESET_NOTIFY, so there is
no supported way to capture a coredump during the auto dump phase.
Detect this and abort when called from devlink_health_report().
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tools other than 'ethtool -w' may be used to produce a coredump. For
devlink health, such dumps could even be driver initiated in response
to a health event. In these cases, the kernel thread information will
be placed in the coredump record instead.
v2: use min_t() instead of min() to fix the mismatched type warning
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent firmware provides coredump and crashdump size info via
DBG_QCFG command. Read the dump sizes from firmware, instead of
computing in the driver. This patch reduces the time taken
to collect the dump via ethtool.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Firmware sets compression flags for each segment, add this information
while filling segment header.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change bnxt_get_coredump() and bnxt_get_coredump_length() to non-static
functions.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The coredump functionality will be used by devlink health. Refactor
these functions that get coredump and coredump length. There is no
functional change, but the following checkpatch warnings were
addressed:
- strscpy is preferred over strlcpy.
- sscanf results should be checked, with an additional warning.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add firmware event counters as well as health state severity. In
the unhealthy state, recommend a remedy and inform the user as to
its impact.
Readability of the devlink tool's output is negatively impacted by
adding these fields to the diagnosis. The single line of text, as
rendered by devlink health diagnose, benefits from more terse
descriptions, which can be substituted without loss of clarity, even
in pretty printed JSON mode.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge 'fw' and 'fw_fatal' health reporters. There is no longer a need
to distinguish between firmware reporters. Only bonafide errors are
reported now and no reports were being generated for the 'fw' reporter.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Firmware resets initiated by the user are not errors and should not
be reported via devlink. Once only unsolicited resets remain, it is no
longer sensible to maintain a separate fw_reset reporter.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recovery election messages are often mistaken for errors. Improve
the wording to clarify the meaning of these frequent and expected
events. Also, take the first step towards more inclusive language.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The reported parameter value should not take into account the state
of remote drivers. Firmware will reject remote resets as appropriate,
thus it is not strictly necessary to check HOT_RESET_ALLOWED before
attempting to initiate a reset. But we add the check so that we can
provide more intuitive messages when reset is not permitted.
This firmware setting needs to be restored from all functions after
a firmware reset.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to reload driver_reinit, the RTNL lock is held across reload
down and up to prevent interleaving state changes. But we need to
subsequently release the RTNL lock while waiting for firmware reset
to complete.
Also keep a statistic on fw_activate resets initiated remotely from
other functions.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RTNL lock must be held between down and up to prevent interleaving
state changes, especially since external state changes might release
and allocate different driver resource subsets that would otherwise
need to be tracked and carefully handled. If the down function fails,
then devlink will not call the corresponding up function, thus the
lock is released in the down error paths.
v2: Don't use devlink_reload_disable() and devlink_reload_enable().
Instead, check that the netdev is not in unregistered state before
proceeding with reload.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-Off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Resource reservations will also need to be reset after FUNC_DRV_UNRGTR
in the following devlink driver_reinit patch. Extract this logic into a
reusable function.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device info logged during probe will be reused by the devlink
driver_reinit code in a following patch. Extract this logic into
the new bnxt_print_device_info() function. The board index needs
to be saved in the driver context so that the board information
can be retrieved at a later time, outside of the probe function.
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 16nm internal EPHY that is present in 7712 is actually a 16nm
Gigabit PHY which has been forced to operate in 10/100 mode. Its
controls are therefore via the EXT_GPHY_CTRL registers and not via the
EXT_EPHY_CTRL which are used for all GENETv5 adapters. Add a match on
the 7712 compatible string to allow that differentiation to happen.
On previous GENETv4 chips the EXT_CFG_IDDQ_GLOBAL_PWR bit was cleared by
default, but this is not the case with this chip, so we need to make
sure we clear it to power on the EPHY.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The variable will be assigned again later in the if condition,
there is no meaning there.
drivers/net/ethernet/broadcom/tg3.c:5750:2 warning:
Value stored to 'current_link_up' is never read.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: luo penghao <luo.penghao@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.
Read the address into an array on the stack, then call
eth_hw_addr_set().
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.
Read the address into an array on the stack, then call
eth_hw_addr_set().
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This big patch sprinkles const on local variables and
function arguments which may refer to netdev->dev_addr.
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.
Some of the changes here are not strictly required - const
is sometimes cast off but pointer is not used for writing.
It seems like it's still better to add the const in case
the code changes later or relevant -W flags get enabled
for the build.
No functional changes.
Link: https://lore.kernel.org/r/20211014142432.449314-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The open code which is dev->priv_flags & IFF_RXFH_CONFIGURED is defined as
a helper function on netdevice.h. So use netif_is_rxfh_configured()
function instead of open code. This patch doesn't change logic.
Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tg3 does various forms of direct writes to netdev->dev_addr.
Use a local buffer. Make sure local buffer is aligned since
eth_platform_get_mac_address() may call ether_addr_copy().
tg3_get_device_address() returns whenever it finds a method
that found a valid address. Instead of modifying all the exit
points pass the buffer from the outside and commit the address
in the caller.
Constify the argument of the set addr helper.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the redundant check of (tg3_asic_rev(tp) == ASIC_REV_5705) after
it is checked to be true.
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20211008063147.1421-1-sakiwit@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use the new device_get_ethdev_address() helper for the cases
where dev->dev_addr is passed in directly as the destination.
@@
expression dev, np;
@@
- device_get_mac_address(np, dev->dev_addr, ETH_ALEN)
+ device_get_ethdev_address(np, dev)
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
All callers pass in ETH_ALEN and the function itself
will return -EINVAL for any other address length.
Just assume it's ETH_ALEN like all other mac address
helpers (nvm, of, platform).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
fwnode_get_mac_address() and device_get_mac_address()
return a pointer to the buffer that was passed to them
on success or NULL on failure. None of the callers
care about the actual value, only if it's NULL or not.
These semantics differ from of_get_mac_address() which
returns an int so to avoid confusion make the device
helpers return an errno.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the new of_get_ethdev_address() helper for the cases
where dev->dev_addr is passed in directly as the destination.
@@
expression dev, np;
@@
- of_get_mac_address(np, dev->dev_addr)
+ of_get_ethdev_address(np, dev)
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The err variable is checked for true or false a few lines above. When
!err is checked again, it always evaluates to true. Therefore we should
skip this check.
We should also group the adjacent statements together for readability.
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert all Ethernet drivers from memcpy(... dev->addr_len)
to eth_hw_addr_set():
@@
expression dev, np;
@@
- memcpy(dev->dev_addr, np, dev->addr_len)
+ eth_hw_addr_set(dev, np)
In theory addr_len may not be ETH_ALEN, but we don't expect
non-Ethernet devices to live under this directory, and only
the following cases of setting addr_len exist:
- cxgb4 for mgmt device,
and the drivers which set it to ETH_ALEN: s2io, mlx4, vxge.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check ethernet controller DT node for "mdio" subnode and use it with
of_mdiobus_register() when present. That allows specifying MDIO and its
PHY devices in a standard DT based way.
This is required for BCM53573 SoC support. That family is sometimes
called Northstar (by marketing?) but is quite different from it. It uses
different CPU(s) and many different hw blocks.
One of shared blocks in BCM53573 is Ethernet controller. Switch however
is not SRAB accessible (as it Northstar) but is MDIO attached.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Use info from DT if available
It allows describing for example a fixed link. It's more accurate than
just guessing there may be one (depending on a chipset).
2. Verify PHY ID before trying to connect PHY
PHY addr 0x1e (30) is special in Broadcom routers and means a switch
connected as MDIO devices instead of a real PHY. Don't try connecting to
it.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert Ethernet from ether_addr_copy() to eth_hw_addr_set():
@@
expression dev, np;
@@
- ether_addr_copy(dev->dev_addr, np)
+ eth_hw_addr_set(dev, np)
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert all Ethernet drivers from memcpy(... ETH_ADDR)
to eth_hw_addr_set():
@@
expression dev, np;
@@
- memcpy(dev->dev_addr, np, ETH_ALEN)
+ eth_hw_addr_set(dev, np)
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit extends the supported ethtool operations to allow MAC
level flow control to be configured for the bcmgenet driver.
The ethtool utility can be used to change the configuration of
auto-negotiated symmetric and asymmetric modes as well as manually
configuring support for RX and TX Pause frames individually.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit separates out the MAC configuration that occurs on a
PHY state change into a function named bcmgenet_mac_config().
This allows the function to be called directly elsewhere.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PHY state machine has been fixed to only call the adjust_link
callback when the link state has changed. Therefore the old link
state variables are no longer needed to detect a change in link
state.
This commit effectively reverts
commit 5ad6e6c508 ("net: bcmgenet: improve bcmgenet_mii_setup()")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bcmgenet_mii_setup() function is registered as the adjust_link
callback from the phylib for the GENET driver.
The phylib always sets the netif_carrier according to phydev->link
prior to invoking the adjust_link callback, so there is no need to
repeat that in the link down case within the network driver.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move devlink_register() to be last command in devlink configuration
sequence, so no user space access will be possible till devlink instance
is fully operable. As part of this change, the devlink_params_publish
call is removed as not needed.
This change fixes forgotten devlink_params_unpublish() too.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use dma_alloc_coherent() instead of pci_alloc_consistent(),
because only dma_alloc_coherent() is called here.
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is a replication of Christian Lamparter's "net: bgmac-bcma:
handle deferred probe error due to mac-address" patch for the
bgmac-platform driver [1].
As is the case with the bgmac-bcma driver, this change is to cover the
scenario where the MAC address cannot yet be discovered due to reliance
on an nvmem provider which is yet to be instantiated, resulting in a
random address being assigned that has to be manually overridden.
[1] https://lore.kernel.org/netdev/20210919115725.29064-1-chunkeey@gmail.com
Signed-off-by: Matthew Hagan <mnhagan88@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.
Use struct_group() around members queue_id, min_bw, max_bw, tsa, pri_lvl,
and bw_weight so they can be referenced together. This will allow memcpy()
and sizeof() to more easily reason about sizes, improve readability,
and avoid future warnings about writing beyond the end of queue_id.
"pahole" shows no size nor member offset changes to struct bnxt_cos2bw_cfg.
"objdump -d" shows no meaningful object code changes (i.e. only source
line number induced differences and optimizations).
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/lkml/CACKFLinDc6Y+P8eZ=450yA1nMC7swTURLtcdyiNR=9J6dfFyBg@mail.gmail.com
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/lkml/20210728044517.GE35706@embeddedor
This driver doesn't have any port parameters and registers
devlink port parameters with empty table. Remove the useless
calls to devlink_port_params_register and _unregister.
Fixes: da203dfa89 ("Revert "devlink: Add a generic wake_on_lan port parameter"")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
devlink is a software interface that doesn't depend on any hardware
capabilities. The failure in SW means memory issues, wrong parameters,
programmer error e.t.c.
Like any other such interface in the kernel, the returned status of
devlink APIs should be checked and propagated further and not ignored.
Fixes: 4ab0c6a8ff ("bnxt_en: add support to enable VF-representors")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/mptcp/protocol.c
977d293e23 ("mptcp: ensure tx skbs always have the MPTCP ext")
efe686ffce ("mptcp: ensure tx skbs always have the MPTCP ext")
same patch merged in both trees, keep net-next.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
devlink_register() can't fail and always returns success, but all drivers
are obligated to check returned status anyway. This adds a lot of boilerplate
code to handle impossible flow.
Make devlink_register() void and simplify the drivers that use that
API call.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com> # dsa
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When interfacing with a Broadcom PHY, request the auto-power down, DLL
disable and IDDQ-SR modes to be enabled.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The smallest TX ring size we support must fit a TX SKB with MAX_SKB_FRAGS
+ 1. Because the first TX BD for a packet is always a long TX BD, we
need an extra TX BD to fit this packet. Define BNXT_MIN_TX_DESC_CNT with
this value to make this more clear. The current code uses a minimum
that is off by 1. Fix it using this constant.
The tx_wake_thresh to determine when to wake up the TX queue is half the
ring size but we must have at least BNXT_MIN_TX_DESC_CNT for the next
packet which may have maximum fragments. So the comparison of the
available TX BDs with tx_wake_thresh should be >= instead of > in the
current code. Otherwise, at the smallest ring size, we will never wake
up the TX queue and will cause TX timeout.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadocm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to the inclusion of nvmem handling into the mac-address getter
function of_get_mac_address() by
commit d01f449c00 ("of_net: add NVMEM support to of_get_mac_address")
it is now possible to get a -EPROBE_DEFER return code. Which did cause
bgmac to assign a random ethernet address.
This exact issue happened on my Meraki MR32. The nvmem provider is
an EEPROM (at24c64) which gets instantiated once the module
driver is loaded... This happens once the filesystem becomes available.
With this patch, bgmac_probe() will propagate the -EPROBE_DEFER error.
Then the driver subsystem will reschedule the probe at a later time.
Cc: Petr Štetiar <ynezz@true.cz>
Cc: Michael Walle <michael@walle.cc>
Fixes: d01f449c00 ("of_net: add NVMEM support to of_get_mac_address")
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we are using a dedicated PHY driver (not the Generic PHY driver)
chances are that it is going to configure RGMII delays and do that in a
way that is incompatible with our incorrect interpretation of the
phy_interface value.
Add a quirk in order to reverse the PHY_INTERFACE_MODE_RGMII to the
value of PHY_INTERFACE_MODE_RGMII_ID such that the MAC continues to be
configured the way it used to be, but the PHY driver can account for
adding delays. Conversely when PHY_INTERFACE_MODE_RGMII_TXID is
specified, return PHY_INTERFACE_MODE_RGMII_RXID to the PHY since we will
have enabled a TXC MAC delay (id_mode_dis=0, meaning there is a delay
inserted).
This is not considered a bug fix at this point since it only affects
Broadcom STB platforms shipping with a Device Tree blob that is not
updatable in the field (quite a few devices out there) and which was
generated using the scripted Device Tree environment shipped with those
platforms' SDK.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the assert from the callback priv lookup function since it does
not require RTNL lock and is already protected by flow_indr_block_lock.
This will avoid warnings from being emitted to dmesg if the driver
registers its callback after an ingress qdisc was created for a
netdevice.
The warnings started after the following patch was merged:
commit 74fc4f8287 ("net: Fix offloading indirect devices dependency on qdisc order creation")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function is called to enable SR-IOV when available,
not enabling interfaces without VFs was a regression.
Fixes: 65161c3555 ("bnx2x: Fix missing error code in bnx2x_iov_init_one()")
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Reported-by: YunQiang Su <wzssyqa@gmail.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Shai Malin <smalin@marvell.com>
Link: https://lore.kernel.org/r/20210912190523.27991-1-bunk@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We recently changed the completion ring page arrays to be dynamically
allocated to better support the expanded range of ring depths. The
cleanup path for this was not quite complete. It might cause the
shutdown path to crash if we need to abort before the completion ring
arrays have been allocated and initialized.
Fix it by initializing the ring_mem->pg_arr to NULL after freeing the
completion ring page array. Add a check in bnxt_free_ring() to skip
referencing the rmem->pg_arr if it is NULL.
Fixes: 03c7448790 ("bnxt_en: Don't use static arrays for completion ring pages")
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The call to bnxt_free_mem(..., false) in the bnxt_half_open_nic() error
path will deallocate ring descriptor memory via bnxt_free_?x_rings(),
but because irq_re_init is false, the ring info itself is not freed.
To simplify error paths, deallocation functions have generally been
written to be safe when called on unallocated memory. It should always
be safe to call dev_close(), which calls bnxt_free_skbs() a second time,
even in this semi- allocated ring state.
Calling bnxt_free_skbs() a second time with the rings already freed will
cause NULL pointer dereference. Fix it by checking the rings are valid
before proceeding in bnxt_free_tx_skbs() and
bnxt_free_one_rx_ring_skbs().
Fixes: 975bc99a4a ("bnxt_en: Refactor bnxt_free_rx_skbs().")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent patch has introduced a regression by not reading the reset
count in the ERROR_RECOVERY async event handler. We may have just
gone through a reset and the reset count has just incremented. If
we don't update the reset count in the ERROR_RECOVERY event handler,
the health check timer will see that the reset count has changed and
will initiate an unintended reset.
Restore the unconditional update of the reset count in
bnxt_async_event_process() if error recovery watchdog is enabled.
Also, update the reset count at the end of the reset sequence to
make it even more robust.
Fixes: 1b2b918319 ("bnxt_en: Fix possible unintended driver initiated error recovery")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If error recovery is already enabled, bnxt_timer() will periodically
check the heartbeat register and the reset counter. If we get an
error recovery async. notification from the firmware (e.g. change in
primary/secondary role), we will immediately read and update the
heartbeat register and the reset counter. If the timer for the next
health check expires soon after this, we may read the heartbeat register
again in quick succession and find that it hasn't changed. This will
trigger error recovery unintentionally.
The likelihood is small because we also reset fw_health->tmr_counter
which will reset the interval for the next health check. But the
update is not protected and bnxt_timer() can miss the update and
perform the health check without waiting for the full interval.
Fix it by only reading the heartbeat register and reset counter in
bnxt_async_event_process() if error recovery is trasitioning to the
enabled state. Also add proper memory barriers so that when enabling
for the first time, bnxt_timer() will see the tmr_counter interval and
perform the health check after the full interval has elapsed.
Fixes: 7e914027f7 ("bnxt_en: Enable health monitoring.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current logic assumes that when the driver sends the message to the
firmware to add the VXLAN or Geneve port, the firmware will never fail
the operation. The UDP ports are always stored and are used to check
the tunnel packets in .ndo_features_check(). These tunnnel packets
will fail to offload on the transmit side if firmware fails the call to
add the UDP ports.
To fix the problem, bp->vxlan_port and bp->nge_port will only be set to
the offloaded ports when the HWRM_TUNNEL_DST_PORT_ALLOC firmware call
succeeds. When deleting a UDP port, we check that the port was
previously added successfuly first by checking the FW ID.
Fixes: 1698d600b3 ("bnxt_en: Implement .ndo_features_check().")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current asic.rev is incomplete and does not include the metal
revision. Add the metal revision and decode the complete asic
revision into the more common and readable form (A0, B0, etc).
Fixes: 7154917a12 ("bnxt_en: Refactor bnxt_dl_info_get().")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
P5 devices store NVM arrays using a different internal representation.
This implementation detail permeates into the HWRM API, requiring the
caller to explicitly index the array elements in HWRM_NVM_GET_VARIABLE
on these devices. Conversely, older devices do not support the indexed
mode of operation and require reading the raw NVM content.
Fixes: db28b6c77f ("bnxt_en: Fix devlink info's stored fw.psid version format.")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The FW_PSID version components are 8 bits wide, not 4.
Fixes: db28b6c77f ("bnxt_en: Fix devlink info's stored fw.psid version format.")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parameter names in the comments did not match the function arguments.
Fixes: 2138081708 ("bnxt_en: add support for HWRM request slices")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Reported-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210901185315.57137-1-edwin.peer@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The driver requires 64-bit doorbell writes to be atomic on 32-bit
architectures. So we redefined writeq as a new macro with spinlock
protection on 32-bit architectures. This created a new warning when
we added a new file in a recent patchset. writeq is defined on many
32-bit architectures to do the memory write non-atomically and it
generated a new macro redefined warning. This warning was fixed
incorrectly in the recent patch.
Fix this properly by adding a new bnxt_writeq() function that will
do the non-atomic write under spinlock on 32-bit systems. All callers
in the driver will now call bnxt_writeq() instead.
v2: Need to pass in bp to bnxt_writeq()
Use lo_hi_writeq() [suggested by Florian]
Reported-by: kernel test robot <lkp@intel.com>
Fixes: f9ff578251 ("bnxt_en: introduce new firmware message API based on DMA pools")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add infrastructure to maintain a pending list of HWRM commands awaiting
completion and reduce the scope of the hwrm_cmd_lock mutex so that it
protects only the request mailbox. The mailbox is free to use for one
or more concurrent commands after receiving deferred response events.
For uniformity and completeness, use the same pending list for
collecting completions for commands that respond via a completion ring.
These commands are only used for freeing rings and for IRQ test and
we only support one such command in flight.
Note deferred responses are also only supported on the main channel.
The secondary channel (KONG) does not support deferred responses.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are no longer any callers relying on the old API.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The conversion follows this general pattern for most of the calls:
1. The input message is changed from a stack variable initialized
using bnxt_hwrm_cmd_hdr_init() to a pointer allocated and intialized
using hwrm_req_init().
2. If we don't need to read the firmware response, the hwrm_send_message()
call is replaced with hwrm_req_send().
3. If we need to read the firmware response, the mutex lock is replaced
by hwrm_req_hold() to hold the response. When the response is read, the
mutex unlock is replaced by hwrm_req_drop().
If additional DMA buffers are needed for firmware response data, the
hwrm_req_dma_slice() is used instead of calling dma_alloc_coherent().
Some minor refactoring is also done while doing these conversions.
v2: Fix unintialized variable warnings in __bnxt_hwrm_get_tx_rings()
and bnxt_approve_mac()
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently use the hwrm_cmd_lock to serialize the update of the
firmware's link status response data and the copying of link status data
to the VF. This won't work when we update the firmware message APIs, so
we use the link_lock mutex instead. All link_info data should be
updated under the link_lock mutex. Also add link_lock to functions that
touch link_info in __bnxt_open_nic() and bnxt_probe_phy(). The locking
is probably not strictly necessary during probe, but it's more consistent.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Slices are a mechanism for suballocating DMA mapped regions from the
request buffer. Such regions can be used for indirect command data
instead of creating new mappings with dma_alloc_coherent().
The advantage of using a slice is that the lifetime of the slice is
bound to the request and will be automatically unmapped when the
request is consumed.
A single external region is also supported. This allows for regions
that will not fit inside the spare request buffer space such that
the same API can be used consistently even for larger mappings.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hwrm_req_replace() provides an assignment like operation to replace a
managed HWRM request object with data from a pre-built source. This is
useful for handling request data provided by higher layer HWRM clients.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During firmware crash recovery, it is possible for firmware to respond
to stale HWRM commands that have already timed out. Because response
buffers may be reused, any out of sequence responses need to be ignored
and only the matching seq_id should be accepted.
Also, READ_ONCE should be used for the reads from the DMA buffer to
ensure that the necessary loads are scheduled.
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change constitutes a major step towards supporting multiple
firmware commands in flight by maintaining a separate response buffer
for the duration of each request. These firmware commands are also
known as Hardware Resource Manager (HWRM) commands. Using separate
response buffers requires an API change in order for callers to be
able to free the buffer when done.
It is impossible to keep the existing APIs unchanged. The existing
usage for a simple HWRM message request such as the following:
struct input req = {0};
bnxt_hwrm_cmd_hdr_init(bp, &req, REQ_TYPE, -1, -1);
rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
if (rc)
/* error */
changes to:
struct input *req;
rc = hwrm_req_init(bp, req, REQ_TYPE);
if (rc)
/* error */
rc = hwrm_req_send(bp, req); /* consumes req */
if (rc)
/* error */
The key changes are:
1. The req is no longer allocated on the stack.
2. The caller must call hwrm_req_init() to allocate a req buffer and
check for a valid buffer.
3. The req buffer is automatically released when hwrm_req_send() returns.
4. If the caller wants to check the firmware response, the caller must
call hwrm_req_hold() to take ownership of the response buffer and
release it afterwards using hwrm_req_drop(). The caller is no longer
required to explicitly hold the hwrm_cmd_lock mutex to read the
response.
5. Because the firmware commands and responses all have different sizes,
some safeguards are added to the code.
This patch maintains legacy API compatibiltiy, implementing the old
API in terms of the new. The follow-on patches will convert all
callers to use the new APIs.
v2: Fix redefined writeq with parisc .config
Fix "cast from pointer to integer of different size" warning in
hwrm_calc_sentinel()
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move all firmware messaging functions and definitions to new
bnxt_hwrm.[ch]. The follow-on patches will make major modifications
to these APIs.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactor the code so that __bnxt_hwrm_ver_get() does not call
bnxt_hwrm_do_send_msg() directly. The new APIs will not expose this
internal call. Add a new bnxt_hwrm_poll() to poll the HWRM_VER_GET
firmware call silently. The other bnxt_hwrm_ver_get() function will
send the HWRM_VER_GET message directly with error logs enabled.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The additional response buffer serves no useful purpose. There can
be only one firmware command in flight due to the hwrm_cmd_lock mutex,
which is taken for the entire duration of any command completion,
KONG or otherwise. It is thus safe to share a single DMA buffer.
Removing the code associated with the additional mapping will simplify
matters in the next patch, which allocates response buffers from DMA
pools on a per request basis.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Count packets dropped due to buffer or skb allocation errors.
Report as part of rx_dropped.
v2: drop the ethtool -S entry [Vladimir]
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
bnxt may discard packets if Rx completions are consumed
in an attempt to let netpoll make progress. It should be
extremely rare in practice but nonetheless such events
should be counted.
Since completion ring memory is allocated dynamically use
a similar scheme to what is done for HW stats to save them.
Report the stats in rx_dropped and per-netdev ethtool
counter. Chances that users care which ring dropped are
very low.
v3: only save the stat to rx_dropped on reset,
rx_total_netpoll_discards will now only show drops since
last reset, similar to other "total_discard" counters.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use pci_vpd_alloc() to dynamically allocate a properly sized buffer and
read the full VPD data into it.
This simplifies the code, and we no longer have to make assumptions about
VPD size.
Link: https://lore.kernel.org/r/62522a24-f39a-2b35-1577-1fbb41695bed@gmail.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use pci_vpd_find_ro_info_keyword() to search for keywords in VPD to
simplify the code.
Use strncasecmp() to match Vendor ID instead of comparing with lower- and
upper-case hex string.
[bhelgaas: convert to strncasecmp()]
Link: https://lore.kernel.org/r/a9f730cf-e31e-902b-7b39-0ff2e99636e0@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use pci_vpd_alloc() to dynamically allocate a properly sized buffer and
read the full VPD data into it.
This simplifies the code, and we no longer have to make assumptions about
VPD size.
Link: https://lore.kernel.org/r/821a334d-ff9d-386e-5f42-9b620ab3dbfa@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Read NVRAM directly into buffer and use swab32s() to byte swap it in-place
instead of reading it into the end of the buffer and swapping it manually
while copying it.
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/e4ac6229-1df5-8760-3a87-1ad0ace87137@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In order to support more coalesce parameters through netlink,
add two new parameter kernel_coal and extack for .set_coalesce
and .get_coalesce, then some extra info can return to user with
the netlink API.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use pci_vpd_find_ro_info_keyword() to search for keywords in VPD to
simplify the code.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use pci_vpd_alloc() to dynamically allocate a properly sized buffer and
read the full VPD data into it.
This simplifies the code, and we no longer have to make assumptions about
VPD size.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use pci_vpd_find_ro_info_keyword() to search for keywords in VPD to
simplify the code.
str_id_reg and str_id_cap hold the same string and are used in the same
comparison. This doesn't make sense, use one string str_id instead.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use pci_vpd_alloc() to dynamically allocate a properly sized buffer and
read the full VPD data into it.
This simplifies the code, and we no longer have to make assumptions about
VPD size.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>