Commit Graph

888 Commits

Author SHA1 Message Date
Michael Chan
43a440c400 bnxt_en: Improve the status_reliable flag in bp->fw_health.
In order to read the firmware health status, we first need to determine
the register location and then the register may need to be mapped.
There are 2 code paths to do this.  The first one is done early as a
best effort attempt by the function bnxt_try_map_fw_health_reg().  The
second one is done later in the function bnxt_map_fw_health_regs()
after establishing communications with the firmware.  We currently
only set fw_health->status_reliable if we can successfully set up the
health register in the first code path.

Improve the scheme by setting the fw_health->status_reliable flag if
either (or both) code paths can successfully set up the health
register.  This flag is relied upon during run-time when we need to
check the health status.  So this will make it work better.

During ifdown, if the health register is mapped, we need to invalidate
the health register mapping because a potential fw reset will reset
the mapping.  Similarly, we need to do the same after firmware reset
during recovery.  We'll remap it during ifup.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 13:07:28 -07:00
Edwin Peer
20d7d1c5c9 bnxt_en: reliably allocate IRQ table on reset to avoid crash
The following trace excerpt corresponds with a NULL pointer dereference
of 'bp->irq_tbl' in bnxt_setup_inta() on an Aarch64 system after many
device resets:

    Unable to handle kernel NULL pointer dereference at ... 000000d
    ...
    pc : string+0x3c/0x80
    lr : vsnprintf+0x294/0x7e0
    sp : ffff00000f61ba70 pstate : 20000145
    x29: ffff00000f61ba70 x28: 000000000000000d
    x27: ffff0000009c8b5a x26: ffff00000f61bb80
    x25: ffff0000009c8b5a x24: 0000000000000012
    x23: 00000000ffffffe0 x22: ffff000008990428
    x21: ffff00000f61bb80 x20: 000000000000000d
    x19: 000000000000001f x18: 0000000000000000
    x17: 0000000000000000 x16: ffff800b6d0fb400
    x15: 0000000000000000 x14: ffff800b7fe31ae8
    x13: 00001ed16472c920 x12: ffff000008c6b1c9
    x11: ffff000008cf0580 x10: ffff00000f61bb80
    x9 : 00000000ffffffd8 x8 : 000000000000000c
    x7 : ffff800b684b8000 x6 : 0000000000000000
    x5 : 0000000000000065 x4 : 0000000000000001
    x3 : ffff0a00ffffff04 x2 : 000000000000001f
    x1 : 0000000000000000 x0 : 000000000000000d
    Call trace:
    string+0x3c/0x80
    vsnprintf+0x294/0x7e0
    snprintf+0x44/0x50
    __bnxt_open_nic+0x34c/0x928 [bnxt_en]
    bnxt_open+0xe8/0x238 [bnxt_en]
    __dev_open+0xbc/0x130
    __dev_change_flags+0x12c/0x168
    dev_change_flags+0x20/0x60
    ...

Ordinarily, a call to bnxt_setup_inta() (not in trace due to inlining)
would not be expected on a system supporting MSIX at all. However, if
bnxt_init_int_mode() does not end up being called after the call to
bnxt_clear_int_mode() in bnxt_fw_reset_close(), then the driver will
think that only INTA is supported and bp->irq_tbl will be NULL,
causing the above crash.

In the error recovery scenario, we call bnxt_clear_int_mode() in
bnxt_fw_reset_close() early in the sequence. Ordinarily, we will
call bnxt_init_int_mode() in bnxt_hwrm_if_change() after we
reestablish communication with the firmware after reset.  However,
if the sequence has to abort before we call bnxt_init_int_mode() and
if the user later attempts to re-open the device, then it will cause
the crash above.

We fix it in 2 ways:

1. Check for bp->irq_tbl in bnxt_setup_int_mode(). If it is NULL, call
bnxt_init_init_mode().

2. If we need to abort in bnxt_hwrm_if_change() and cannot complete
the error recovery sequence, set the BNXT_STATE_ABORT_ERR flag.  This
will cause more drastic recovery at the next attempt to re-open the
device, including a call to bnxt_init_int_mode().

Fixes: 3bc7d4a352 ("bnxt_en: Add BNXT_STATE_IN_FW_RESET state.")
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-26 15:50:23 -08:00
Vasundhara Volam
d20cd74521 bnxt_en: Fix race between firmware reset and driver remove.
The driver's error recovery reset sequence can take many seconds to
complete and only the critical sections are protected by rtnl_lock.
A recent change has introduced a regression in this sequence.

bnxt_remove_one() may be called while the recovery is in progress.
Normally, unregister_netdev() would cause bnxt_close_nic() to be
called and this would cause the error recovery to safely abort
with the BNXT_STATE_ABORT_ERR flag set in bnxt_close_nic().

Recently, we added bnxt_reinit_after_abort() to allow the user to
reopen the device after an aborted recovery.  This causes the
regression in the scenario described above because we would
attempt to re-open even after the netdev has been unregistered.

Fix it by checking the netdev reg_state in
bnxt_reinit_after_abort() and abort if it is unregistered.

Fixes: 6882c36cf8 ("bnxt_en: attempt to reinitialize after aborted reset")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-26 15:50:23 -08:00
David S. Miller
d489ded1a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-02-16 17:51:13 -08:00
Michael Chan
f4d95c3c19 bnxt_en: Improve logging of error recovery settings information.
We currently only log the error recovery settings if it is enabled.
In some cases, firmware disables error recovery after it was
initially enabled.  Without logging anything, the user will not be
aware of this change in setting.

Log it when error recovery is disabled.  Also, change the reset count
value from hexadecimal to decimal.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:51 -08:00
Michael Chan
df97b34d3a bnxt_en: Reply to firmware's echo request async message.
This is a new async message that the firmware can send to check if it
can communicate with the driver.  This is an added error detection
scheme that firmware can use if it suspects errors in the PCIe
interface.  When the driver receives this async message, it will reply
back echoing some data in the async message.  If the firmware is not
getting the reply with the proper data after some retries, error
recovery will kick in.

Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:51 -08:00
Michael Chan
41435c3940 bnxt_en: Initialize "context kind" field for context memory blocks.
If firmware provides the offset to the "context kind" field of the
relevant context memory blocks, we'll initialize just that field for
each block instead of initializing all of context memory.

Populate the bnxt_mem_init structure with the proper offset returned
by firmware.  If it is older firmware and the information is not
available, we set the offset to an invalid value and fall back to
the old behavior of initializing every byte.  Otherwise, we initialize
only the "context kind" byte at the offset.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:51 -08:00
Michael Chan
e9696ff33c bnxt_en: Add context memory initialization infrastructure.
Currently, the driver calls memset() to set all relevant context memory
used by the chip to the initial value.  This can take many milliseconds
with the potentially large number of context pages allocated for the
chip.

To make this faster, we only need to initialize the "context kind" field
of each block of context memory.  This patch sets up the infrastructure
to do that with the bnxt_mem_init structure.  In the next patch, we'll
add the logic to obtain the offset of the "context kind" from the
firmware.  This patch is not changing the current behavior of calling
memset() to initialize all relevant context memory.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:50 -08:00
Michael Chan
dab62e7c2d bnxt_en: Implement faster recovery for firmware fatal error.
During some fatal firmware error conditions, the PCI config space
register 0x2e which normally contains the subsystem ID will become
0xffff.  This register will revert back to the normal value after
the chip has completed core reset.  If we detect this condition,
we can poll this config register immediately for the value to revert.
Because we use config read cycles to poll this register, there is no
possibility of Master Abort if we happen to read it during core reset.
This speeds up recovery significantly as we don't have to wait for the
conservative min_time before polling MMIO to see if the firmware has
come out of reset.  As soon as this register changes value we can
proceed to re-initialize the device.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:50 -08:00
Edwin Peer
be6d755f3d bnxt_en: selectively allocate context memories
Newer devices may have local context memory instead of relying on the
host for backing store. In these cases, HWRM_FUNC_BACKING_STORE_QCAPS
will return a zero entry size to indicate contexts for which the host
should not allocate backing store.

Selectively allocate context memory based on device capabilities and
only enable backing store for the appropriate contexts.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:50 -08:00
Edwin Peer
132e0b65dc bnxt_en: reverse order of TX disable and carrier off
A TX queue can potentially immediately timeout after it is stopped
and the last TX timestamp on that queue was more than 5 seconds ago with
carrier still up.  Prevent these intermittent false TX timeouts
by bringing down carrier first before calling netif_tx_disable().

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-11 14:36:22 -08:00
Michael Chan
871127e6ab bnxt_en: Convert to use netif_level() helpers.
Use the various netif_level() helpers to simplify the C code.  This was
suggested by Joe Perches.

Cc: Joe Perches <joe@perches.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1611642024-3166-1-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-26 17:25:59 -08:00
Michael Chan
0da65f4932 bnxt_en: Do not process completion entries after fatal condition detected.
Once the firmware fatal condition is detected, we should cease
comminication with the firmware and hardware quickly even if there
are many completion entries in the completion rings.  This will
speed up the recovery process and prevent further I/Os that may
cause further exceptions.

Do not proceed in the NAPI poll function if fatal condition is
detected.  Call napi_complete() and return without arming interrupts.
Cleanup of all rings and reset are imminent.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:05 -08:00
Michael Chan
5863b10aa8 bnxt_en: Consolidate firmware reset event logging.
Combine the three netdev_warn() calls into a single call, printed at
the NETIF_MSG_HW log level.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Michael Chan
4f036b2e75 bnxt_en: Improve firmware fatal error shutdown sequence.
In the event of a fatal firmware error, firmware will notify the host
and then it will proceed to do core reset when it sees that all functions
have disabled Bus Master.  To prevent Master Aborts and other hard
errors, we need to quiesce all activities in addition to disabling Bus
Master before the chip goes into core reset.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Michael Chan
38290e3729 bnxt_en: Modify bnxt_disable_int_sync() to be called more than once.
In the event of a fatal firmware error, we want to disable IRQ early
in the recovery sequence.  This change will allow it to be called
safely again as part of the normal shutdown sequence.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Michael Chan
e340a5c4fb bnxt_en: Add a new BNXT_STATE_NAPI_DISABLED flag to keep track of NAPI state.
Up until now, we don't need to keep track of this state because NAPI
is always enabled once and disabled once during bring up and shutdown.
For better error recovery in subsequent patches, we want to quiesce
the device earlier during fatal error conditions.  The normal shutdown
sequence will disable NAPI again and the flag will prevent disabling
NAPI twice.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Michael Chan
339eeb4bd9 bnxt_en: Add bnxt_fw_reset_timeout() helper.
This code to check if we have reached the maximum wait time after
firmware reset is used multiple times.  Add a helper function to
do this.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Vasundhara Volam
5d06eb5cb1 bnxt_en: Retry open if firmware is in reset.
Firmware may be in the middle of reset when the driver tries to do ifup.
In that case, firmware will return a special error code and the driver
will retry 10 times with 50 msecs delay after each retry.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Edwin Peer
6882c36cf8 bnxt_en: attempt to reinitialize after aborted reset
Drawing a hard line on aborted resets prevents a NIC open in
some scenarios that may otherwise be recoverable. For example,
if a firmware recovery happened while a PF was down and an
attempt was made to bring up an associated VF in this state,
then it was impossible to ever bring up this VF without a
rebind or reload of its driver.

Attempt to reinitialize the firmware when an aborted reset (or
failed init after a reset) is discovered during open - it may
succeed. Also take care to allow the user to retry opening the
NIC even after an aborted reset.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Edwin Peer
a44daa8fcb bnxt_en: log firmware debug notifications
Firmware is capable of generating asynchronous debug notifications.
The event data is opaque to the driver and is simply logged. Debug
notifications can be enabled by turning on hardware status messages
using the ethtool msglvl interface.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:03 -08:00
Vasundhara Volam
881d8353b0 bnxt_en: Add an upper bound for all firmware command timeouts.
The timeout period for firmware messages is passed to the driver
from the firmware in the response of the first command.  This
timeout period is multiplied by a factor for certain long
running commands such as NVRAM commands.  In some cases, the
timeout period can become really long and it can cause hung task
warnings if firmware has crashed or is not responding.  To avoid
such long delays, cap all firmware commands to a max timeout value
of 40 seconds.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:03 -08:00
Vasundhara Volam
3e3c09b0e9 bnxt_en: Move reading VPD info after successful handshake with fw.
If firmware is in reset or in bad state, it won't be able to return
VPD data.  Move bnxt_vpd_read_info() until after bnxt_fw_init_one_p1()
successfully returns.  By then we would have established proper
communications with the firmware.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:03 -08:00
Michael Chan
d1cbd1659c bnxt_en: Retry sending the first message to firmware if it is under reset.
The first HWRM_VER_GET message to firmware during probe may timeout if
firmware is under reset.  This can happen during hot-plug for example.
On P5 and newer chips, we can check if firmware is in the boot stage by
reading a status register.  Retry 5 times if the status register shows
that firmware is not ready and not in error state.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:03 -08:00
Edwin Peer
b187e4bae0 bnxt_en: handle CRASH_NO_MASTER during bnxt_open()
Add missing support for handling NO_MASTER crashes while ports are
administratively down (ifdown). On some SoC platforms, the driver
needs to assist the firmware to recover from a crash via OP-TEE.
This is performed in a similar fashion to what is done during driver
probe.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:03 -08:00
Michael Chan
16db632304 bnxt_en: Update firmware interface to 1.10.2.11.
Updates to backing store APIs, QoS profiles, and push buffer initial
index support.

Since the new HWRM_FUNC_BACKING_STORE_CFG message size has increased,
we need to add some compat. logic to fall back to the smaller legacy
size if firmware cannot accept the larger message size.  The new fields
added to the structure are not used yet.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:03 -08:00
Jakub Kicinski
30bfce1094 net: remove ndo_udp_tunnel_* callbacks
All UDP tunnel port management is now routed via udp_tunnel_nic
infra directly. Remove the old callbacks.

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 12:53:29 -08:00
Zheng Yongjun
33dbcf6055 bnxt_en: Use kzalloc for allocating only one thing
Use kzalloc rather than kcalloc(1,...)

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
@@

- kcalloc(1,
+ kzalloc(
          ...)
// </smpl>

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-01-05 15:43:41 -08:00
Michael Chan
a029a2fef5 bnxt_en: Check TQM rings for maximum supported value.
TQM rings are hardware resources that require host context memory
managed by the driver.  The driver supports up to 9 TQM rings and
the number of rings to use is requested by firmware during run-time.
Cap this number to the maximum supported to prevent accessing beyond
the array.  Future firmware may request more than 9 TQM rings.  Define
macros to remove the magic number 9 from the C code.

Fixes: ac3158cb01 ("bnxt_en: Allocate TQM ring context memory according to fw specification.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28 14:10:53 -08:00
Vasundhara Volam
fb1e6e562b bnxt_en: Fix AER recovery.
A recent change skips sending firmware messages to the firmware when
pci_channel_offline() is true during fatal AER error.  To make this
complete, we need to move the re-initialization sequence to
bnxt_io_resume(), otherwise the firmware messages to re-initialize
will all be skipped.  In any case, it is more correct to re-initialize
in bnxt_io_resume().

Also, fix the reverse x-mas tree format when defining variables
in bnxt_io_slot_reset().

Fixes: b340dc680e ("bnxt_en: Avoid sending firmware messages when AER error is detected.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28 14:10:52 -08:00
Jakub Kicinski
a1dd1d8697 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-12-03

The main changes are:

1) Support BTF in kernel modules, from Andrii.

2) Introduce preferred busy-polling, from Björn.

3) bpf_ima_inode_hash() and bpf_bprm_opts_set() helpers, from KP Singh.

4) Memcg-based memory accounting for bpf objects, from Roman.

5) Allow bpf_{s,g}etsockopt from cgroup bind{4,6} hooks, from Stanislav.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (118 commits)
  selftests/bpf: Fix invalid use of strncat in test_sockmap
  libbpf: Use memcpy instead of strncpy to please GCC
  selftests/bpf: Add fentry/fexit/fmod_ret selftest for kernel module
  selftests/bpf: Add tp_btf CO-RE reloc test for modules
  libbpf: Support attachment of BPF tracing programs to kernel modules
  libbpf: Factor out low-level BPF program loading helper
  bpf: Allow to specify kernel module BTFs when attaching BPF programs
  bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier
  selftests/bpf: Add CO-RE relocs selftest relying on kernel module BTF
  selftests/bpf: Add support for marking sub-tests as skipped
  selftests/bpf: Add bpf_testmod kernel module for testing
  libbpf: Add kernel module BTF support for CO-RE relocations
  libbpf: Refactor CO-RE relocs to not assume a single BTF object
  libbpf: Add internal helper to load BTF data by FD
  bpf: Keep module's btf_data_size intact after load
  bpf: Fix bpf_put_raw_tracepoint()'s use of __module_address()
  selftests/bpf: Add Userspace tests for TCP_WINDOW_CLAMP
  bpf: Adds support for setting window clamp
  samples/bpf: Fix spelling mistake "recieving" -> "receiving"
  bpf: Fix cold build of test_progs-no_alu32
  ...
====================

Link: https://lore.kernel.org/r/20201204021936.85653-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04 07:48:12 -08:00
Björn Töpel
b02e5a0ebb xsk: Propagate napi_id to XDP socket Rx path
Add napi_id to the xdp_rxq_info structure, and make sure the XDP
socket pick up the napi_id in the Rx path. The napi_id is used to find
the corresponding NAPI structure for socket busy polling.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/bpf/20201130185205.196029-7-bjorn.topel@gmail.com
2020-12-01 00:09:25 +01:00
Michael Chan
c54bc3ced5 bnxt_en: Release PCI regions when DMA mask setup fails during probe.
Jump to init_err_release to cleanup.  bnxt_unmap_bars() will also be
called but it will do nothing if the BARs are not mapped yet.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1605858271-8209-1-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-20 10:09:09 -08:00
Zhang Changzhong
3383176efc bnxt_en: fix error return code in bnxt_init_board()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Link: https://lore.kernel.org/r/1605792621-6268-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 21:49:01 -08:00
Zhang Changzhong
b5f796b62c bnxt_en: fix error return code in bnxt_init_one()
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: c213eae8d3 ("bnxt_en: Improve VF/PF link change logic.")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Link: https://lore.kernel.org/r/1605701851-20270-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 21:46:30 -08:00
Michael Chan
fa97f303fa bnxt_en: Fix counter overflow logic.
bnxt_add_one_ctr() adds a hardware counter to a software counter and
adjusts for the hardware counter wraparound against the mask.  The logic
assumes that the hardware counter is always smaller than or equal to
the mask.

This assumption is mostly correct.  But in some cases if the firmware
is older and does not provide the accurate mask, the driver can use
a mask that is smaller than the actual hardware mask.  This can cause
some extra carry bits to be added to the software counter, resulting in
counters that far exceed the actual value.  Fix it by masking the
hardware counter with the mask passed into bnxt_add_one_ctr().

Fixes: fea6b33355 ("bnxt_en: Accumulate all counters.")
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 17:39:46 -08:00
Michael Chan
eba93de6d3 bnxt_en: Free port stats during firmware reset.
Firmware is unable to retain the port counters during any kind of
fatal or non-fatal resets, so we must clear the port counters to
avoid false detection of port counter overflow.

Fixes: fea6b33355 ("bnxt_en: Accumulate all counters.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16 17:39:46 -08:00
Vasundhara Volam
825741b071 bnxt_en: Send HWRM_FUNC_RESET fw command unconditionally.
In the AER or firmware reset flow, if we are in fatal error state or
if pci_channel_offline() is true, we don't send any commands to the
firmware because the commands will likely not reach the firmware and
most commands don't matter much because the firmware is likely to be
reset imminently.

However, the HWRM_FUNC_RESET command is different and we should always
attempt to send it.  In the AER flow for example, the .slot_reset()
call will trigger this fw command and we need to try to send it to
effect the proper reset.

Fixes: b340dc680e ("bnxt_en: Avoid sending firmware messages when AER error is detected.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-26 18:26:35 -07:00
Michael Chan
a1301f08c5 bnxt_en: Check abort error state in bnxt_open_nic().
bnxt_open_nic() is called during configuration changes that require
the NIC to be closed and then opened.  This call is protected by
rtnl_lock.  Firmware reset can be happening at the same time.  Only
critical portions of the entire firmware reset sequence are protected
by the rtnl_lock.  It is possible that bnxt_open_nic() can be called
when the firmware reset sequence is aborting.  In that case,
bnxt_open_nic() needs to check if the ABORT_ERR flag is set and
abort if it is.  The configuration change that resulted in the
bnxt_open_nic() call will fail but the NIC will be brought to a
consistent IF_DOWN state.

Without this patch, if bnxt_open_nic() were to continue in this error
state, it may crash like this:

[ 1648.659736] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 1648.659768] IP: [<ffffffffc01e9b3a>] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
[ 1648.659796] PGD 101e1b3067 PUD 101e1b2067 PMD 0
[ 1648.659813] Oops: 0000 [#1] SMP
[ 1648.659825] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dell_smbios dell_wmi_descriptor dcdbas amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper vfat cryptd fat pcspkr ipmi_ssif sg k10temp i2c_piix4 wmi ipmi_si ipmi_devintf ipmi_msghandler tpm_crb acpi_power_meter sch_fq_codel ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm libahci megaraid_sas crct10dif_pclmul crct10dif_common
[ 1648.660063]  tg3 libata crc32c_intel bnxt_en(OE) drm_panel_orientation_quirks devlink ptp pps_core dm_mirror dm_region_hash dm_log dm_mod fuse
[ 1648.660105] CPU: 13 PID: 3867 Comm: ethtool Kdump: loaded Tainted: G           OE  ------------   3.10.0-1152.el7.x86_64 #1
[ 1648.660911] Hardware name: Dell Inc. PowerEdge R7515/0R4CNN, BIOS 1.2.14 01/28/2020
[ 1648.661662] task: ffff94e64cbc9080 ti: ffff94f55df1c000 task.ti: ffff94f55df1c000
[ 1648.662409] RIP: 0010:[<ffffffffc01e9b3a>]  [<ffffffffc01e9b3a>] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
[ 1648.663171] RSP: 0018:ffff94f55df1fba8  EFLAGS: 00010202
[ 1648.663927] RAX: 0000000000000000 RBX: ffff94e6827e0000 RCX: 0000000000000000
[ 1648.664684] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff94e6827e08c0
[ 1648.665433] RBP: ffff94f55df1fc20 R08: 00000000000001ff R09: 0000000000000008
[ 1648.666184] R10: 0000000000000d53 R11: ffff94f55df1f7ce R12: ffff94e6827e08c0
[ 1648.666940] R13: ffff94e6827e08c0 R14: ffff94e6827e08c0 R15: ffffffffb9115e40
[ 1648.667695] FS:  00007f8aadba5740(0000) GS:ffff94f57eb40000(0000) knlGS:0000000000000000
[ 1648.668447] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1648.669202] CR2: 0000000000000000 CR3: 0000001022772000 CR4: 0000000000340fe0
[ 1648.669966] Call Trace:
[ 1648.670730]  [<ffffffffc01f1d5d>] ? bnxt_need_reserve_rings+0x9d/0x170 [bnxt_en]
[ 1648.671496]  [<ffffffffc01fa7ea>] __bnxt_open_nic+0x8a/0x9a0 [bnxt_en]
[ 1648.672263]  [<ffffffffc01f7479>] ? bnxt_close_nic+0x59/0x1b0 [bnxt_en]
[ 1648.673031]  [<ffffffffc01fb11b>] bnxt_open_nic+0x1b/0x50 [bnxt_en]
[ 1648.673793]  [<ffffffffc020037c>] bnxt_set_ringparam+0x6c/0xa0 [bnxt_en]
[ 1648.674550]  [<ffffffffb8a5f564>] dev_ethtool+0x1334/0x21a0
[ 1648.675306]  [<ffffffffb8a719ff>] dev_ioctl+0x1ef/0x5f0
[ 1648.676061]  [<ffffffffb8a324bd>] sock_do_ioctl+0x4d/0x60
[ 1648.676810]  [<ffffffffb8a326bb>] sock_ioctl+0x1eb/0x2d0
[ 1648.677548]  [<ffffffffb8663230>] do_vfs_ioctl+0x3a0/0x5b0
[ 1648.678282]  [<ffffffffb8b8e678>] ? __do_page_fault+0x238/0x500
[ 1648.679016]  [<ffffffffb86634e1>] SyS_ioctl+0xa1/0xc0
[ 1648.679745]  [<ffffffffb8b93f92>] system_call_fastpath+0x25/0x2a
[ 1648.680461] Code: 9e 60 01 00 00 0f 1f 40 00 45 8b 8e 48 01 00 00 31 c9 45 85 c9 0f 8e 73 01 00 00 66 0f 1f 44 00 00 49 8b 86 a8 00 00 00 48 63 d1 <48> 8b 14 d0 48 85 d2 0f 84 46 01 00 00 41 8b 86 44 01 00 00 c7
[ 1648.681986] RIP  [<ffffffffc01e9b3a>] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
[ 1648.682724]  RSP <ffff94f55df1fba8>
[ 1648.683451] CR2: 0000000000000000

Fixes: ec5d31e3c1 ("bnxt_en: Handle firmware reset status during IF_UP.")
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-26 18:26:35 -07:00
Vasundhara Volam
f75d9a0aa9 bnxt_en: Re-write PCI BARs after PCI fatal error.
When a PCIe fatal error occurs, the internal latched BAR addresses
in the chip get reset even though the BAR register values in config
space are retained.

pci_restore_state() will not rewrite the BAR addresses if the
BAR address values are valid, causing the chip's internal BAR addresses
to stay invalid.  So we need to zero the BAR registers during PCIe fatal
error to force pci_restore_state() to restore the BAR addresses.  These
write cycles to the BAR registers will cause the proper BAR addresses to
latch internally.

Fixes: 6316ea6db9 ("bnxt_en: Enable AER support.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-26 18:26:35 -07:00
Vasundhara Volam
631ce27a30 bnxt_en: Invoke cancel_delayed_work_sync() for PFs also.
As part of the commit b148bb238c
("bnxt_en: Fix possible crash in bnxt_fw_reset_task()."),
cancel_delayed_work_sync() is called only for VFs to fix a possible
crash by cancelling any pending delayed work items. It was assumed
by mistake that the flush_workqueue() call on the PF would flush
delayed work items as well.

As flush_workqueue() does not cancel the delayed workqueue, extend
the fix for PFs. This fix will avoid the system crash, if there are
any pending delayed work items in fw_reset_task() during driver's
.remove() call.

Unify the workqueue cleanup logic for both PF and VF by calling
cancel_work_sync() and cancel_delayed_work_sync() directly in
bnxt_remove_one().

Fixes: b148bb238c ("bnxt_en: Fix possible crash in bnxt_fw_reset_task().")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-26 18:26:35 -07:00
Vasundhara Volam
21d6a11e2c bnxt_en: Fix regression in workqueue cleanup logic in bnxt_remove_one().
A recent patch has moved the workqueue cleanup logic before
calling unregister_netdev() in bnxt_remove_one().  This caused a
regression because the workqueue can be restarted if the device is
still open.  Workqueue cleanup must be done after unregister_netdev().
The workqueue will not restart itself after the device is closed.

Call bnxt_cancel_sp_work() after unregister_netdev() and
call bnxt_dl_fw_reporters_destroy() after that.  This fixes the
regession and the original NULL ptr dereference issue.

Fixes: b16939b59c ("bnxt_en: Fix NULL ptr dereference crash in bnxt_fw_reset_task()")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-26 18:26:35 -07:00
Vasundhara Volam
4933f6753b bnxt_en: Add bnxt_hwrm_nvm_get_dev_info() to query NVM info.
Add a new bnxt_hwrm_nvm_get_dev_info() to query firmware version
information via NVM_GET_DEV_INFO firmware command.  Use it to
get the running version of the NVM configuration information.

This new function will also be used in subsequent patches to get the
stored firmware versions.

Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1602493854-29283-8-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-12 14:27:03 -07:00
Michael Chan
8eddb3e7ce bnxt_en: Log unknown link speed appropriately.
If the VF virtual link is set to always enabled, the speed may be
unknown when the physical link is down.  The driver currently logs
the link speed as 4294967295 Mbps which is SPEED_UNKNOWN.  Modify
the link up log message as "speed unknown" which makes more sense.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1602493854-29283-7-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-12 14:27:03 -07:00
Michael Chan
c966c67c09 bnxt_en: Log event_data1 and event_data2 when handling RESET_NOTIFY event.
Log these values that contain useful firmware state information.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1602493854-29283-6-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-12 14:27:03 -07:00
Michael Chan
03ab8ca1e9 bnxt_en: Simplify bnxt_async_event_process().
event_data1 and event_data2 are used when processing most events.
Store these in local variables at the beginning of the function to
simplify many of the case statements.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1602493854-29283-5-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-12 14:27:03 -07:00
Michael Chan
8fb35cd302 bnxt_en: Set driver default message level.
Currently, bp->msg_enable has default value of 0.  It is more useful
to have the commonly used NETIF_MSG_DRV and NETIF_MSG_HW enabled by
default.

v2: Change the fall back bnxt_reset_task() inside bnxt_rx_ring_reset()
to silent mode.  With older fw, we would take the fall back path and
it would be very noisy.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1602493854-29283-4-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-12 14:27:03 -07:00
Vasundhara Volam
cf223bfaf7 bnxt_en: Return -EROFS to user space, if NVM writes are not permitted.
If NVRAM resources are locked, NVM writes are not permitted. In such
scenarios, firmware returns HWRM_ERR_CODE_RESOURCE_LOCKED error to
firmware commands.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1602493854-29283-2-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-12 14:27:02 -07:00
Michael Chan
8d4bd96b54 bnxt_en: Eliminate unnecessary RX resets.
Currently, the driver will schedule RX ring reset when we get a buffer
error in the RX completion record.  These RX buffer errors can be due
to normal out-of-buffer conditions or a permanent error in the RX
ring.  Because the driver cannot distinguish between these 2
conditions, we assume all these buffer errors require reset.

This is very disruptive when it is just a normal out-of-buffer
condition.  Newer firmware will now monitor the rings for the permanent
failure and will send a notification to the driver when it happens.
This allows the driver to reset only when such a notification is
received.  In environments where we have predominently out-of-buffer
conditions, we now can avoid these unnecessary resets.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 14:41:05 -07:00
Michael Chan
1b5c8b63d6 bnxt_en: Reduce unnecessary message log during RX errors.
There is logic in the RX path to detect unexpected handles in the
RX completion.  We'll print a warning and schedule a reset.  The
next expected handle is then set to 0xffff which is guaranteed to
not match any valid handle.  This will force all remaining packets in
the ring to be discarded before the reset.  There can be hundreds of
these packets remaining in the ring and there is no need to print the
warnings for these forced errors.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-04 14:41:05 -07:00