Add the capability to split the Tx queues onto their own
interrupts with their own napi contexts. This gives the
opportunity for more direct control of Tx interrupt
handling, such as CPU affinity and interrupt coalescing,
useful for some traffic loads.
v2: use ethtool -L, not a vendor specific priv-flag
v3: simplify logging, drop unnecessary "no-change" tests
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
We give the tx clean path its own budget and service routine in
order to give a little more leeway to be more aggressive, and
in preparation for coming changes. We've found this gives us
a little better performance in some packet processing scenarios
without hurting other scenarios.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
We really don't need to hit the Rx queue doorbell so many times,
we can wait to the end and cause a little less thrash.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ionic_wait_on_bit_lock() was a open-coded mutex knock-off
used only for protecting the queue reset operations, and there
was no reason not to use the real thing. We can use the lock
more correctly and to better protect the queue stop and start
operations from cross threading. We can also remove a useless
and expensive bit operation from the Rx path.
This fixes a case found where the link_status_check from a link
flap could run into an MTU change and cause a crash.
Fixes: beead698b1 ("ionic: Add the basic NDO callbacks for netdev support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add hardware port stats and a few more driver collected
statistics to the ethtool stats output.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
The version 1 Tx queues can use longer SG lists than the
original version 0 queues, but we need to check to see if the
firmware supports the v1 Tx queues. This implements the queue
type query for all queue types, and uses the information to
set up for using the longer Tx SG lists.
Because the Tx SG list can be longer, we need to limit the
max ring length to be sure we stay inside the boundaries of a
DMA allocation max size, so we lower the max Tx ring size.
The driver sets its highest known version in the Q_IDENTITY
command, and the FW returns the highest version that it knows,
bounded by the driver's version. The negotiated version number
is later used in the Q_INIT commands.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean out tx requests that didn't get finished before
shutting down the queue.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the unused flags field and and fix the bitflag names
to include the _F_ flag hint.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use new helper tcp_v6_gso_csum_prep in additional network drivers.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure the NIC drops packets that are larger than the
specified MTU.
The front end of the NIC will accept packets larger than MTU and
will copy all the data it can to fill up the driver's posted
buffers - if the buffers are not long enough the packet will
then get dropped. With the Rx SG buffers allocagted as full
pages, we are currently setting up more space than MTU size
available and end up receiving some packets that are larger
than MTU, up to the size of buffers posted. To be sure the
NIC doesn't waste our time with oversized packets we need to
lie a little in the SG descriptor about how long is the last
SG element.
At dealloc time, we know the allocation was a page, so the
deallocation doesn't care about what length we put in the
descriptor.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a counter for packets dropped by the driver, typically
for bad size or a receive error seen by the device.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/pensando/ionic/ionic_txrx.c: In function 'ionic_rx_empty':
drivers/net/ethernet/pensando/ionic/ionic_txrx.c:405:28: warning:
variable 'sg_desc' set but not used [-Wunused-but-set-variable]
It is never used, so can be removed.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even out Rx performance across MTU sizes by changing from full
skb allocations to page-based frag allocations. The device
supports a form of scatter-gather in the Rx path, so we can
set up a number of pages for each descriptor, all of which are
easier to alloc and pass around than the standard kzalloc'd
buffer. An skb is wrapped around the pages while processing
the received packets, and pages are recycled as needed, or
left alone if they weren't used in the Rx.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Interrupt coalescing, tunable copybreak value, and
tx timeout.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add both the Tx and Rx queue setup and handling. The related
stats display comes later. Instead of using the generic napi
routines used by the slow-path commands, the Tx and Rx paths
are simplified and inlined in one file in order to get better
compiler optimizations.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>