Commit Graph

339 Commits

Author SHA1 Message Date
Michael Chan
cb98526bf9 bnxt_en: Fix NULL pointer dereference at bnxt_free_irq().
When open fails during ethtool -L ring change, for example, the driver
may crash at bnxt_free_irq() because bp->bnapi is NULL.

If we fail to allocate all the new rings, bnxt_open_nic() will free
all the memory including bp->bnapi.  Subsequent call to bnxt_close_nic()
will try to dereference bp->bnapi in bnxt_free_irq().

Fix it by checking for !bp->bnapi in bnxt_free_irq().

Fixes: e5811b8c09 ("bnxt_en: Add IRQ remapping logic.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11 14:42:00 -04:00
Michael Chan
11c3ec7bb9 bnxt_en: Need to include RDMA rings in bnxt_check_rings().
With recent changes to reserve both L2 and RDMA rings, we need to include
the RDMA rings in bnxt_check_rings().  Otherwise we will under-estimate
the rings we need during ethtool -L and may lead to failure.

Fixes: fbcfc8e467 ("bnxt_en: Reserve completion rings and MSIX for bnxt_re RDMA driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-11 14:42:00 -04:00
Michael Chan
ec86f14ea5 bnxt_en: Add ULP calls to stop and restart IRQs.
When the driver needs to re-initailize the IRQ vectors, we make the
new ulp_irq_stop() call to tell the RDMA driver to disable and free
the IRQ vectors.  After IRQ vectors have been re-initailized, we
make the ulp_irq_restart() call to tell the RDMA driver that
IRQs can be restarted.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:20 -04:00
Michael Chan
fbcfc8e467 bnxt_en: Reserve completion rings and MSIX for bnxt_re RDMA driver.
Add additional logic to reserve completion rings for the bnxt_re driver
when it requests MSIX vectors.  The function bnxt_cp_rings_in_use()
will return the total number of completion rings used by both drivers
that need to be reserved.  If the network interface in up, we will
close and open the NIC to reserve the new set of completion rings and
re-initialize the vectors.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:20 -04:00
Michael Chan
4e41dc5deb bnxt_en: Refactor bnxt_need_reserve_rings().
Refactor bnxt_need_reserve_rings() slightly so that __bnxt_reserve_rings()
can call it and remove some duplicated code.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:20 -04:00
Michael Chan
e5811b8c09 bnxt_en: Add IRQ remapping logic.
Add remapping logic so that bnxt_en can use any arbitrary MSIX vectors.
This will allow the driver to reserve one range of MSIX vectors to be
used by both bnxt_en and bnxt_re.  bnxt_en can now skip over the MSIX
vectors used by bnxt_re.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:20 -04:00
Michael Chan
08654eb213 bnxt_en: Change IRQ assignment for RDMA driver.
In the current code, the range of MSIX vectors allocated for the RDMA
driver is disjoint from the network driver.  This creates a problem
for the new firmware ring reservation scheme.  The new scheme requires
the reserved completion rings/MSIX vectors to be in a contiguous
range.

Change the logic to allocate RDMA MSIX vectors to be contiguous with
the vectors used by bnxt_en on new firmware using the new scheme.
The new function bnxt_get_num_msix() calculates the exact number of
vectors needed by both drivers.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:20 -04:00
Michael Chan
9899bb59ff bnxt_en: Improve ring allocation logic.
Currently, the driver code makes some assumptions about the group index
and the map index of rings.  This makes the code more difficult to
understand and less flexible.

Improve it by adding the grp_idx and map_idx fields explicitly to the
bnxt_ring_struct as a union.  The grp_idx is initialized for each tx ring
and rx agg ring during init. time.  We do the same for the map_idx for
each cmpl ring.

The grp_idx ties the tx ring to the ring group.  The map_idx is the
doorbell index of the ring.  With this new infrastructure, we can change
the ring index mapping scheme easily in the future.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:20 -04:00
Michael Chan
845adfe40c bnxt_en: Improve valid bit checking in firmware response message.
When firmware sends a DMA response to the driver, the last byte of the
message will be set to 1 to indicate that the whole response is valid.
The driver waits for the message to be valid before reading the message.

The firmware spec allows these response messages to increase in
length by adding new fields to the end of these messages.  The
older spec's valid location may become a new field in a newer
spec.  To guarantee compatibility, the driver should zero the valid
byte before interpreting the entire message so that any new fields not
implemented by the older spec will be read as zero.

For messages that are forwarded to VFs, we need to set the length
and re-instate the valid bit so the VF will see the valid response.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Michael Chan
db4723b3cd bnxt_en: Check max_tx_scheduler_inputs value from firmware.
When checking for the maximum pre-set TX channels for ethtool -l, we
need to check the current max_tx_scheduler_inputs parameter from firmware.
This parameter specifies the max input for the internal QoS nodes currently
available to this function.  The function's TX rings will be capped by this
parameter.  By adding this logic, we provide a more accurate pre-set max
TX channels to the user.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Vasundhara Volam
00db3cba35 bnxt_en: Add extended port statistics support
Gather periodic extended port statistics, if the device is PF and
link is up.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Vasundhara Volam
746df13964 bnxt_en: Add support for ndo_set_vf_trust
Trusted VFs are allowed to modify MAC address, even when PF
has assigned one.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Michael Chan
abe93ad2e0 bnxt_en: Use a dedicated VNIC mode for RDMA.
If the RDMA driver is registered, use a new VNIC mode that allows
RDMA traffic to be seen on the netdev in promiscuous mode.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Michael Chan
1d3ef13dd4 bnxt_en: Adjust default rings for multi-port NICs.
Change the default ring logic to select default number of rings to be up to
8 per port if the default rings x NIC ports <= total CPUs.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Michael Chan
d4f52de02f bnxt_en: Update firmware interface to 1.9.1.15.
Minor changes, such as new extended port statistics.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-31 23:24:19 -04:00
Sinan Kaya
fd141fa47c bnxt_en: Eliminate duplicate barriers on weakly-ordered archs
Code includes wmb() followed by writel(). writel() already has a barrier on
some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a wmb().

Since code already has an explicit barrier call, changing writel() to
writel_relaxed().

Also add mmiowb() so that write code doesn't move outside of scope.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-26 12:47:56 -04:00
Michael Chan
3c4fe80b32 bnxt_en: Check valid VNIC ID in bnxt_hwrm_vnic_set_tpa().
During initialization, if we encounter errors, there is a code path that
calls bnxt_hwrm_vnic_set_tpa() with invalid VNIC ID.  This may cause a
warning in firmware logs.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12 10:58:12 -04:00
Venkat Duvvuru
1a037782e7 bnxt_en: close & open NIC, only when the interface is in running state.
bnxt_restore_pf_fw_resources routine frees PF resources by calling
close_nic and allocates the resources back, by doing open_nic. However,
this is not needed, if the PF is already in closed state.

This bug causes the driver to call open the device and call request_irq()
when it is not needed.  Ultimately, pci_disable_msix() will crash
when bnxt_en is unloaded.

This patch fixes the problem by skipping __bnxt_close_nic and
__bnxt_open_nic inside bnxt_restore_pf_fw_resources routine, if the
interface is not running.

Fixes: 80fcaf46c0 ("bnxt_en: Restore MSIX after disabling SRIOV.")
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12 10:58:05 -04:00
Michael Chan
832aed16ce bnxt_en: Fix regressions when setting up MQPRIO TX rings.
Recent changes added the bnxt_init_int_mode() call in the driver's open
path whenever ring reservations are changed.  This call was previously
only called in the probe path.  In the open path, if MQPRIO TC has been
setup, the bnxt_init_int_mode() call would reset and mess up the MQPRIO
per TC rings.

Fix it by not re-initilizing bp->tx_nr_rings_per_tc in
bnxt_init_int_mode().  Instead, initialize it in the probe path only
after the bnxt_init_int_mode() call.

Fixes: 674f50a5b0 ("bnxt_en: Implement new method to reserve rings.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12 10:57:51 -04:00
Michael Chan
ed7bc602f6 bnxt_en: Pass complete VLAN TCI to the stack.
When receiving a packet with VLAN tag, pass the entire 16-bit TCI to the
stack when calling __vlan_hwaccel_put_tag().  The current code is only
passing the 12-bit tag and it is missing the priority bits.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12 10:57:45 -04:00
Eddie Wai
6fc2ffdf10 bnxt_en: Fix vnic accounting in the bnxt_check_rings() path.
The number of vnics to check must be determined ahead of time because
only standard RX rings require vnics to support RFS.  The logic is
similar to the ring reservation logic and we can now use the
refactored common functions to do most of the work in setting up
the firmware message.

Fixes: 8f23d638b3 ("bnxt_en: Expand bnxt_check_rings() to check all resources.")
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12 10:57:32 -04:00
Michael Chan
4ed50ef4da bnxt_en: Refactor the functions to reserve hardware rings.
The bnxt_hwrm_reserve_{pf|vf}_rings() functions are very similar to
the bnxt_hwrm_check_{pf|vf}_rings() functions.  Refactor the former
so that the latter can make use of common code in the next patch.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-12 10:57:23 -04:00
Andy Gospodarek
0bc0b97fca bnxt_en: cleanup DIM work on device shutdown
Make sure to cancel any pending work that might update driver coalesce
settings when taking down an interface.

Fixes: 6a8788f256 ("bnxt_en: add support for software dynamic interrupt moderation")
Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-29 14:19:23 -05:00
Jakub Kicinski
312324f124 bnxt: use tc_cls_can_offload_and_chain0()
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case
ethtool tc offload flag is not set or chain unsupported.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 21:23:08 -05:00
Sathya Perla
dd4ea1da12 bnxt_en: export a common switchdev PARENT_ID for all reps of an adapter
Currently the driver exports different switchdev PARENT_IDs for
representors belonging to different SR-IOV PF-pools of an adapter.
This is not correct as the adapter can switch across all vports
of an adapter. This patch fixes this by exporting a common switchdev
PARENT_ID for all reps of an adapter. The PCIE DSN is used as the id.

Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:27 -05:00
Michael Chan
c3480a6037 bnxt_en: Add cache line size setting to optimize performance.
The chip supports 64-byte and 128-byte cache line size for more optimal
DMA performance when matched to the CPU cache line size.  The default is 64.
If the system is using 128-byte cache line size, set it to 128.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:27 -05:00
Vasundhara Volam
91cdda4071 bnxt_en: Forward VF MAC address to the PF.
Forward hwrm_func_vf_cfg command from VF to PF driver, to store
VF MAC address in PF's context.  This will allow "ip link show"
to display all VF MAC addresses.

Maintain 2 locations of MAC address in VF info structure, one for
a PF assigned MAC and one for VF assigned MAC.

Display VF assigned MAC in "ip link show", only if PF assigned MAC is
not valid.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:27 -05:00
Vasundhara Volam
92abef361b bnxt_en: Add BCM5745X NPAR device IDs
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:26 -05:00
Michael Chan
8f23d638b3 bnxt_en: Expand bnxt_check_rings() to check all resources.
bnxt_check_rings() is called by ethtool, XDP setup, and ndo_setup_tc()
to see if there are enough resources to support the new configuration.
Expand the call to test all resources if the firmware supports the new
API.  With the more flexible resource allocation scheme, this call must
be made to check that all resources are available before committing to
allocate the resources.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:26 -05:00
Michael Chan
4673d66468 bnxt_en: Implement new method for the PF to assign SRIOV resources.
Instead of the old method of evenly dividing the resources to the VFs,
use the new firmware API to specify min and max resources for each VF.
This way, there is more flexibility for each VF to allocate more or less
resources.

The min is the absolute minimum for each VF to function.  The max is the
global resources minus the resources used by the PF.  Each VF is
guaranteed the min.  Up to max resources may be available for some VFs.

The PF driver can use one of 2 strategies specified in NVRAM to assign
the resources.  The old legacy strategy of evenly dividing the resources
or the new flexible strategy.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:26 -05:00
Michael Chan
6a1eef5b90 bnxt_en: Reserve resources for RFS.
In bnxt_rfs_capable(), add call to reserve vnic resources to support
NTUPLE.  Return true if we can successfully reserve enough vnics.
Otherwise, reserve the minimum 1 VNIC for normal operations not
supporting NTUPLE and return false.

Also, suppress warning message about not enough resources for NTUPLE when
only 1 RX ring is in use.  NTUPLE filters by definition require multiple
RX rings.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:26 -05:00
Michael Chan
674f50a5b0 bnxt_en: Implement new method to reserve rings.
The new method will call firmware to reserve the desired tx, rx, cmpl
rings, ring groups, stats context, and vnic resources.  A second query
call will check the actual resources that firmware is able to reserve.
The driver will then trim and adjust based on the actual resources
provided by firmware.  The driver will then reserve the final resources
in use.

This method is a more flexible way of using hardware resources.  The
resources are not fixed and can by adjusted by firmware.  The driver
adapts to the available resources that the firmware can reserve for
the driver.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:26 -05:00
Michael Chan
58ea801ac4 bnxt_en: Set initial default RX and TX ring numbers the same in combined mode.
In combined mode, the driver is currently not setting RX and TX ring
numbers the same when firmware can allocate more RX than TX or vice versa.
This will confuse the user as the ethtool convention assumes they are the
same in combined mode.  Fix it by adding bnxt_trim_dflt_sh_rings() to trim
RX and TX ring numbers to be the same as the completion ring number in
combined mode.

Note that if TCs are enabled and/or XDP is enabled, the number of TX rings
will not be the same as RX rings in combined mode.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:26 -05:00
Michael Chan
be0dd9c410 bnxt_en: Add the new firmware API to query hardware resources.
The new API HWRM_FUNC_RESOURCE_QCAPS provides min and max hardware
resources.  Use the new API when it is supported by firmware.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:26 -05:00
Michael Chan
6a4f294705 bnxt_en: Refactor hardware resource data structures.
In preparation for new firmware APIs to allocate hardware resources,
add a new struct bnxt_hw_resc to hold various min, max and reserved
resources.  This new structure is common for PFs and VFs.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:26 -05:00
Michael Chan
80fcaf46c0 bnxt_en: Restore MSIX after disabling SRIOV.
After SRIOV has been enabled and disabled, the MSIX vectors assigned to
the VFs have to be re-initialized.  Otherwise they cannot be re-used by
the PF.  For example, increasing the number of PF rings after disabling
SRIOV may fail if the PF uses MSIX vectors previously assigned to the VFs.

To fix this, we add logic in bnxt_restore_pf_fw_resources() to close the
NIC, clear and re-init MSIX, and re-open the NIC.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:25 -05:00
Michael Chan
86e953db01 bnxt_en: Refactor bnxt_close_nic().
Add a new __bnxt_close_nic() function to do all the work previously done
in bnxt_close_nic() except waiting for SRIOV configuration.  The new
function will be used in the next patch as part of SRIOV cleanup.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:25 -05:00
Michael Chan
894aa69a90 bnxt_en: Update firmware interface to 1.9.0.
The version has new firmware APIs to allocate PF/VF resources more
flexibly.

New toolchains were used to generate this file, resulting in a one-time
large diffstat.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:48:25 -05:00
Colin Ian King
e7e70fa678 bnxt_en: don't update cpr->rx_bytes with uninitialized length len
Currently in the cases where cmp_type == CMP_TYPE_RX_L2_TPA_START_CMP or
CMP_TYPE_RX_L2_TPA_END_CMP the exit path updates cpr->rx_bytes with an
uninitialized length len.  Fix this by adding a new exit path that does
not update the cpr stats with the bogus length len and remove the unused
label next_rx_no_prod.

Detected by CoverityScan, CID#1463807 ("Uninitialized scalar variable")
Fixes: 6a8788f256 ("bnxt_en: add support for software dynamic interrupt moderation")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-16 15:24:29 -05:00
Andy Gospodarek
6a8788f256 bnxt_en: add support for software dynamic interrupt moderation
This implements the changes needed for the bnxt_en driver to add support
for dynamic interrupt moderation per ring.

This does add additional counters in the receive path, but testing shows
that any additional instructions are offset by throughput gain when the
default configuration is for low latency.

Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10 15:27:45 -05:00
Jesper Dangaard Brouer
96a8604f95 bnxt_en: setup xdp_rxq_info
Driver hook points for xdp_rxq_info:
 * reg  : bnxt_alloc_rx_rings
 * unreg: bnxt_free_rx_rings

This driver should be updated to re-register when changing
allocation mode of RX rings.

Tested on actual hardware.

Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-05 15:21:21 -08:00
Michael Chan
1054aee823 bnxt_en: Use NETIF_F_GRO_HW.
Advertise NETIF_F_GRO_HW in hw_features if hardware GRO is supported.
In bnxt_fix_features(), disable GRO_HW and LRO if current hardware
configuration does not allow it.  GRO_HW depends on GRO.  GRO_HW is
also mutually exclusive with LRO.  XDP setup will now rely on
bnxt_fix_features() to turn off aggregation.  During chip init, turn on
or off hardware GRO based on NETIF_F_GRO_HW in features flag.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-19 10:38:36 -05:00
David S. Miller
51e18a453f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflict was two parallel additions of include files to sch_generic.c,
no biggie.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-09 22:09:55 -05:00
Calvin Owens
2edbdb3159 bnxt_en: Fix sources of spurious netpoll warnings
After applying 2270bc5da3 ("bnxt_en: Fix netpoll handling") and
903649e718 ("bnxt_en: Improve -ENOMEM logic in NAPI poll loop."),
we still see the following WARN fire:

  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 1875170 at net/core/netpoll.c:165 netpoll_poll_dev+0x15a/0x160
  bnxt_poll+0x0/0xd0 exceeded budget in poll
  <snip>
  Call Trace:
   [<ffffffff814be5cd>] dump_stack+0x4d/0x70
   [<ffffffff8107e013>] __warn+0xd3/0xf0
   [<ffffffff8107e07f>] warn_slowpath_fmt+0x4f/0x60
   [<ffffffff8179519a>] netpoll_poll_dev+0x15a/0x160
   [<ffffffff81795f38>] netpoll_send_skb_on_dev+0x168/0x250
   [<ffffffff817962fc>] netpoll_send_udp+0x2dc/0x440
   [<ffffffff815fa9be>] write_ext_msg+0x20e/0x250
   [<ffffffff810c8125>] call_console_drivers.constprop.23+0xa5/0x110
   [<ffffffff810c9549>] console_unlock+0x339/0x5b0
   [<ffffffff810c9a88>] vprintk_emit+0x2c8/0x450
   [<ffffffff810c9d5f>] vprintk_default+0x1f/0x30
   [<ffffffff81173df5>] printk+0x48/0x50
   [<ffffffffa0197713>] edac_raw_mc_handle_error+0x563/0x5c0 [edac_core]
   [<ffffffffa0197b9b>] edac_mc_handle_error+0x42b/0x6e0 [edac_core]
   [<ffffffffa01c3a60>] sbridge_mce_output_error+0x410/0x10d0 [sb_edac]
   [<ffffffffa01c47cc>] sbridge_check_error+0xac/0x130 [sb_edac]
   [<ffffffffa0197f3c>] edac_mc_workq_function+0x3c/0x90 [edac_core]
   [<ffffffff81095f8b>] process_one_work+0x19b/0x480
   [<ffffffff810967ca>] worker_thread+0x6a/0x520
   [<ffffffff8109c7c4>] kthread+0xe4/0x100
   [<ffffffff81884c52>] ret_from_fork+0x22/0x40

This happens because we increment rx_pkts on -ENOMEM and -EIO, resulting
in rx_pkts > 0. Fix this by only bumping rx_pkts if we were actually
given a non-zero budget.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-08 14:07:19 -05:00
Michael Chan
a8168b6cee bnxt_en: Don't print "Link speed -1 no longer supported" messages.
On some dual port NICs, the 2 ports have to be configured with compatible
link speeds.  Under some conditions, a port's configured speed may no
longer be supported.  The firmware will send a message to the driver
when this happens.

Improve this logic that prints out the warning by only printing it if
we can determine the link speed that is no longer supported.  If the
speed is unknown or it is in autoneg mode, skip the warning message.

Reported-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Tested-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-07 13:35:56 -05:00
David S. Miller
7cda4cee13 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Small overlapping change conflict ('net' changed a line,
'net-next' added a line right afterwards) in flexcan.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-05 10:44:19 -05:00
Vasundhara Volam
ebd5818cc5 bnxt_en: Fix a variable scoping in bnxt_hwrm_do_send_msg()
short_input variable is assigned to another data pointer which is
referred out of its scope. Fix it by moving short_input definition
to the beginning of bnxt_hwrm_do_send_msg() function.

No failure has been reported so far due to this issue.

Fixes: e605db801b ("bnxt_en: Support for Short Firmware Message")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:25:38 -05:00
Ray Jui
a7f3f939dd bnxt_en: Need to unconditionally shut down RoCE in bnxt_shutdown
The current 'bnxt_shutdown' implementation only invokes
'bnxt_ulp_shutdown' to shut down RoCE in the case when the system is in
the path of power off (SYSTEM_POWER_OFF). While this may work in most
cases, it does not work in the smart NIC case, when Linux 'reboot'
command is initiated from the Linux that runs on the ARM cores of the
NIC card. In this particular case, Linux 'reboot' results in a system
'L3' level reset where the entire ARM and associated subsystems are
being reset, but at the same time, Nitro core is being kept in sane state
(to allow external PCIe connected servers to continue to work). Without
properly shutting down RoCE and freeing all associated resources, it
results in the ARM core to hang immediately after the 'reboot'

By always invoking 'bnxt_ulp_shutdown' in 'bnxt_shutdown', it fixes the
above issue

Fixes: 0efd2fc65c ("bnxt_en: Add a callback to inform RDMA driver during PCI shutdown.")

Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:25:38 -05:00
Jakub Kicinski
bd0b2e7fe6 net: xdp: make the stack take care of the tear down
Since day one of XDP drivers had to remember to free the program
on the remove path.  This leads to code duplication and is error
prone.  Make the stack query the installed programs on unregister
and if something is installed, remove the program.  Freeing of
program attached to XDP generic is moved from free_netdev() as well.

Because the remove will now be called before notifiers are
invoked, BPF offload state of the program will not get destroyed
before uninstall.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-03 00:27:57 +01:00
Kees Cook
e99e88a9d2 treewide: setup_timer() -> timer_setup()
This converts all remaining cases of the old setup_timer() API into using
timer_setup(), where the callback argument is the structure already
holding the struct timer_list. These should have no behavioral changes,
since they just change which pointer is passed into the callback with
the same available pointers after conversion. It handles the following
examples, in addition to some other variations.

Casting from unsigned long:

    void my_callback(unsigned long data)
    {
        struct something *ptr = (struct something *)data;
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, ptr);

and forced object casts:

    void my_callback(struct something *ptr)
    {
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr);

become:

    void my_callback(struct timer_list *t)
    {
        struct something *ptr = from_timer(ptr, t, my_timer);
    ...
    }
    ...
    timer_setup(&ptr->my_timer, my_callback, 0);

Direct function assignments:

    void my_callback(unsigned long data)
    {
        struct something *ptr = (struct something *)data;
    ...
    }
    ...
    ptr->my_timer.function = my_callback;

have a temporary cast added, along with converting the args:

    void my_callback(struct timer_list *t)
    {
        struct something *ptr = from_timer(ptr, t, my_timer);
    ...
    }
    ...
    ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback;

And finally, callbacks without a data assignment:

    void my_callback(unsigned long data)
    {
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, 0);

have their argument renamed to verify they're unused during conversion:

    void my_callback(struct timer_list *unused)
    {
    ...
    }
    ...
    timer_setup(&ptr->my_timer, my_callback, 0);

The conversion is done with the following Coccinelle script:

spatch --very-quiet --all-includes --include-headers \
	-I ./arch/x86/include -I ./arch/x86/include/generated \
	-I ./include -I ./arch/x86/include/uapi \
	-I ./arch/x86/include/generated/uapi -I ./include/uapi \
	-I ./include/generated/uapi --include ./include/linux/kconfig.h \
	--dir . \
	--cocci-file ~/src/data/timer_setup.cocci

@fix_address_of@
expression e;
@@

 setup_timer(
-&(e)
+&e
 , ...)

// Update any raw setup_timer() usages that have a NULL callback, but
// would otherwise match change_timer_function_usage, since the latter
// will update all function assignments done in the face of a NULL
// function initialization in setup_timer().
@change_timer_function_usage_NULL@
expression _E;
identifier _timer;
type _cast_data;
@@

(
-setup_timer(&_E->_timer, NULL, _E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E->_timer, NULL, (_cast_data)_E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, &_E);
+timer_setup(&_E._timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, (_cast_data)&_E);
+timer_setup(&_E._timer, NULL, 0);
)

@change_timer_function_usage@
expression _E;
identifier _timer;
struct timer_list _stl;
identifier _callback;
type _cast_func, _cast_data;
@@

(
-setup_timer(&_E->_timer, _callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
 _E->_timer@_stl.function = _callback;
|
 _E->_timer@_stl.function = &_callback;
|
 _E->_timer@_stl.function = (_cast_func)_callback;
|
 _E->_timer@_stl.function = (_cast_func)&_callback;
|
 _E._timer@_stl.function = _callback;
|
 _E._timer@_stl.function = &_callback;
|
 _E._timer@_stl.function = (_cast_func)_callback;
|
 _E._timer@_stl.function = (_cast_func)&_callback;
)

// callback(unsigned long arg)
@change_callback_handle_cast
 depends on change_timer_function_usage@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
identifier _handle;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
(
	... when != _origarg
	_handletype *_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
)
 }

// callback(unsigned long arg) without existing variable
@change_callback_handle_cast_no_arg
 depends on change_timer_function_usage &&
                     !change_callback_handle_cast@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
+	_handletype *_origarg = from_timer(_origarg, t, _timer);
+
	... when != _origarg
-	(_handletype *)_origarg
+	_origarg
	... when != _origarg
 }

// Avoid already converted callbacks.
@match_callback_converted
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
	    !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier t;
@@

 void _callback(struct timer_list *t)
 { ... }

// callback(struct something *handle)
@change_callback_handle_arg
 depends on change_timer_function_usage &&
	    !match_callback_converted &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
@@

 void _callback(
-_handletype *_handle
+struct timer_list *t
 )
 {
+	_handletype *_handle = from_timer(_handle, t, _timer);
	...
 }

// If change_callback_handle_arg ran on an empty function, remove
// the added handler.
@unchange_callback_handle_arg
 depends on change_timer_function_usage &&
	    change_callback_handle_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
identifier t;
@@

 void _callback(struct timer_list *t)
 {
-	_handletype *_handle = from_timer(_handle, t, _timer);
 }

// We only want to refactor the setup_timer() data argument if we've found
// the matching callback. This undoes changes in change_timer_function_usage.
@unchange_timer_function_usage
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg &&
	    !change_callback_handle_arg@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type change_timer_function_usage._cast_data;
@@

(
-timer_setup(&_E->_timer, _callback, 0);
+setup_timer(&_E->_timer, _callback, (_cast_data)_E);
|
-timer_setup(&_E._timer, _callback, 0);
+setup_timer(&_E._timer, _callback, (_cast_data)&_E);
)

// If we fixed a callback from a .function assignment, fix the
// assignment cast now.
@change_timer_function_assignment
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_func;
typedef TIMER_FUNC_TYPE;
@@

(
 _E->_timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-(_cast_func)_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-&_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-(_cast_func)_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
)

// Sometimes timer functions are called directly. Replace matched args.
@change_timer_function_calls
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression _E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_data;
@@

 _callback(
(
-(_cast_data)_E
+&_E->_timer
|
-(_cast_data)&_E
+&_E._timer
|
-_E
+&_E->_timer
)
 )

// If a timer has been configured without a data argument, it can be
// converted without regard to the callback argument, since it is unused.
@match_timer_function_unused_data@
expression _E;
identifier _timer;
identifier _callback;
@@

(
-setup_timer(&_E->_timer, _callback, 0);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0L);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0UL);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0L);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0UL);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0L);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0UL);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0L);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0UL);
+timer_setup(_timer, _callback, 0);
)

@change_callback_unused_data
 depends on match_timer_function_unused_data@
identifier match_timer_function_unused_data._callback;
type _origtype;
identifier _origarg;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *unused
 )
 {
	... when != _origarg
 }

Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 15:57:07 -08:00