We need to increase TSO_HEADER_SIZE from 128 to 256.
Since otx2_sq_init() calls qmem_alloc() with TSO_HEADER_SIZE,
we need to change (struct qmem)->entry_sz to avoid truncation to 0.
Fixes: 7a37245ef2 ("octeontx2-af: NPA block admin queue init")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The older SoCs like Armada XP support a 2500BaseX mode in the datasheets
referred to as DR-SGMII (Double rated SGMII) or HS-SGMII (High Speed
SGMII). This is an upclocked 1000BaseX mode, thus
PHY_INTERFACE_MODE_2500BASEX is the appropriate mode define for it.
adding support for it merely means writing the correct magic value into
the MVNETA_SERDES_CFG register.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MVNETA_SERDES_CFG register is only available on older SoCs like the
Armada XP. On newer SoCs like the Armada 38x the fields are moved to
comphy. This patch moves the writes to this register next to the comphy
initialization, so that depending on the SoC either comphy or
MVNETA_SERDES_CFG is configured.
With this we no longer write to the MVNETA_SERDES_CFG on SoCs where it
doesn't exist.
Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The remove function does not destroy all
BM Pools when per cpu pool is active.
When reloading the mvpp2 as a module the BM Pools
are still active in hardware and due to the bug
have twice the size now old + new.
This eventually leads to a kernel crash.
v2:
* add Fixes tag
Fixes: 7d04b0b13b ("mvpp2: percpu buffers")
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool rx and tx queue statistics are reporting wrong values.
Fix reading out the correct ones.
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix cfg80211 deadlock, from Johannes Berg.
2) RXRPC fails to send norigications, from David Howells.
3) MPTCP RM_ADDR parsing has an off by one pointer error, fix from
Geliang Tang.
4) Fix crash when using MSG_PEEK with sockmap, from Anny Hu.
5) The ucc_geth driver needs __netdev_watchdog_up exported, from
Valentin Longchamp.
6) Fix hashtable memory leak in dccp, from Wang Hai.
7) Fix how nexthops are marked as FDB nexthops, from David Ahern.
8) Fix mptcp races between shutdown and recvmsg, from Paolo Abeni.
9) Fix crashes in tipc_disc_rcv(), from Tuong Lien.
10) Fix link speed reporting in iavf driver, from Brett Creeley.
11) When a channel is used for XSK and then reused again later for XSK,
we forget to clear out the relevant data structures in mlx5 which
causes all kinds of problems. Fix from Maxim Mikityanskiy.
12) Fix memory leak in genetlink, from Cong Wang.
13) Disallow sockmap attachments to UDP sockets, it simply won't work.
From Lorenz Bauer.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
net: ethernet: ti: ale: fix allmulti for nu type ale
net: ethernet: ti: am65-cpsw-nuss: fix ale parameters init
net: atm: Remove the error message according to the atomic context
bpf: Undo internal BPF_PROBE_MEM in BPF insns dump
libbpf: Support pre-initializing .bss global variables
tools/bpftool: Fix skeleton codegen
bpf: Fix memlock accounting for sock_hash
bpf: sockmap: Don't attach programs to UDP sockets
bpf: tcp: Recv() should return 0 when the peer socket is closed
ibmvnic: Flush existing work items before device removal
genetlink: clean up family attributes allocations
net: ipa: header pad field only valid for AP->modem endpoint
net: ipa: program upper nibbles of sequencer type
net: ipa: fix modem LAN RX endpoint id
net: ipa: program metadata mask differently
ionic: add pcie_print_link_status
rxrpc: Fix race between incoming ACK parser and retransmitter
net/mlx5: E-Switch, Fix some error pointer dereferences
net/mlx5: Don't fail driver on failure to create debugfs
net/mlx5e: CT: Fix ipv6 nat header rewrite actions
...
Since commit 84af7a6194 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.
This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.
There are a variety of indentation styles found.
a) 4 spaces + '---help---'
b) 7 spaces + '---help---'
c) 8 spaces + '---help---'
d) 1 space + 1 tab + '---help---'
e) 1 tab + '---help---' (correct indentation)
f) 1 tab + 1 space + '---help---'
g) 1 tab + 2 spaces + '---help---'
In order to convert all of them to 1 tab + 'help', I ran the
following commend:
$ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Disable frames injection in mvneta_xdp_xmit routine during hw
re-configuration in order to avoid hardware hangs
Fixes: b0a43db908 ("net: mvneta: add XDP_TX support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The replacement of <asm/pgrable.h> with <linux/pgtable.h> made the include
of the latter in the middle of asm includes. Fix this up with the aid of
the below script and manual adjustments here and there.
import sys
import re
if len(sys.argv) is not 3:
print "USAGE: %s <file> <header>" % (sys.argv[0])
sys.exit(1)
hdr_to_move="#include <linux/%s>" % sys.argv[2]
moved = False
in_hdrs = False
with open(sys.argv[1], "r") as f:
lines = f.readlines()
for _line in lines:
line = _line.rstrip('
')
if line == hdr_to_move:
continue
if line.startswith("#include <linux/"):
in_hdrs = True
elif not moved and in_hdrs:
moved = True
print hdr_to_move
print line
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The include/linux/pgtable.h is going to be the home of generic page table
manipulation functions.
Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and
make the latter include asm/pgtable.h.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit ca23cb0bc5 ("mvneta: MVNETA_SKB_HEADROOM set last 3 bits to zero")
added headroom alignment check against 8.
Hovewer (if we imagine that NET_SKB_PAD or XDP_PACKET_HEADROOM is not
aligned to cacheline size), it actually aligns headroom down, while
skb/xdp_buff headroom should be *at least* equal to one of the values
(depending on XDP prog presence).
So, fix the check to align the value up. This satisfies both
hardware/driver and network stack requirements.
Fixes: ca23cb0bc5 ("mvneta: MVNETA_SKB_HEADROOM set last 3 bits to zero")
Signed-off-by: Alexander Lobakin <bloodyreaper@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to use standard 'xdp' prefix, rename convert_to_xdp_frame
utility routine in xdp_convert_buff_to_frame and replace all the
occurrences
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/6344f739be0d1a08ab2b9607584c4d5478c8c083.1590698295.git.lorenzo@kernel.org
For XDP the MVNETA_SKB_HEADROOM is used as an offset for
the received data.
The MVNETA manual states that the last 3 bits assumed to be 0.
This is currently the case but lets make it explicit in the definition
to prevent future problems.
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MSCC bug fix in 'net' had to be slightly adjusted because the
register accesses are done slightly differently in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
When call function devm_platform_ioremap_resource(), we should use IS_ERR()
to check the return value and return PTR_ERR() if failed.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
When rxhash is enabled on any ethernet port except the first in each CP
block, traffic flow is prevented. The analysis is below:
I've been investigating this afternoon, and what I've found, comparing
a kernel without 895586d5dc and with 895586d5dc applied is:
- The table programmed into the hardware via mvpp22_rss_fill_table()
appears to be identical with or without the commit.
- When rxhash is enabled on eth2, mvpp2_rss_port_c2_enable() reports
that c2.attr[0] and c2.attr[2] are written back containing:
- with 895586d5dc, failing: 00200000 40000000
- without 895586d5dc, working: 04000000 40000000
- When disabling rxhash, c2.attr[0] and c2.attr[2] are written back as:
04000000 00000000
The second value represents the MVPP22_CLS_C2_ATTR2_RSS_EN bit, the
first value is the queue number, which comprises two fields. The high
5 bits are 24:29 and the low three are 21:23 inclusive. This comes
from:
c2.attr[0] = MVPP22_CLS_C2_ATTR0_QHIGH(qh) |
MVPP22_CLS_C2_ATTR0_QLOW(ql);
So, the working case gives eth2 a queue id of 4.0, or 32 as per
port->first_rxq, and the non-working case a queue id of 0.1, or 1.
The allocation of queue IDs seems to be in mvpp2_port_probe():
if (priv->hw_version == MVPP21)
port->first_rxq = port->id * port->nrxqs;
else
port->first_rxq = port->id * priv->max_port_rxqs;
Where:
if (priv->hw_version == MVPP21)
priv->max_port_rxqs = 8;
else
priv->max_port_rxqs = 32;
Making the port 0 (eth0 / eth1) have port->first_rxq = 0, and port 1
(eth2) be 32. It seems the idea is that the first 32 queues belong to
port 0, the second 32 queues belong to port 1, etc.
mvpp2_rss_port_c2_enable() gets the queue number from it's parameter,
'ctx', which comes from mvpp22_rss_ctx(port, 0). This returns
port->rss_ctx[0].
mvpp22_rss_context_create() is responsible for allocating that, which
it does by looking for an unallocated priv->rss_tables[] pointer. This
table is shared amongst all ports on the CP silicon.
When we write the tables in mvpp22_rss_fill_table(), the RSS table
entry is defined by:
u32 sel = MVPP22_RSS_INDEX_TABLE(rss_ctx) |
MVPP22_RSS_INDEX_TABLE_ENTRY(i);
where rss_ctx is the context ID (queue number) and i is the index in
the table.
If we look at what is written:
- The first table to be written has "sel" values of 00000000..0000001f,
containing values 0..3. This appears to be for eth1. This is table 0,
RX queue number 0.
- The second table has "sel" values of 00000100..0000011f, and appears
to be for eth2. These contain values 0x20..0x23. This is table 1,
RX queue number 0.
- The third table has "sel" values of 00000200..0000021f, and appears
to be for eth3. These contain values 0x40..0x43. This is table 2,
RX queue number 0.
How do queue numbers translate to the RSS table? There is another
table - the RXQ2RSS table, indexed by the MVPP22_RSS_INDEX_QUEUE field
of MVPP22_RSS_INDEX and accessed through the MVPP22_RXQ2RSS_TABLE
register. Before 895586d5dc, it was:
mvpp2_write(priv, MVPP22_RSS_INDEX,
MVPP22_RSS_INDEX_QUEUE(port->first_rxq));
mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE,
MVPP22_RSS_TABLE_POINTER(port->id));
and after:
mvpp2_write(priv, MVPP22_RSS_INDEX, MVPP22_RSS_INDEX_QUEUE(ctx));
mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE, MVPP22_RSS_TABLE_POINTER(ctx));
Before the commit, for eth2, that would've contained '32' for the
index and '1' for the table pointer - mapping queue 32 to table 1.
Remember that this is queue-high.queue-low of 4.0.
After the commit, we appear to map queue 1 to table 1. That again
looks fine on the face of it.
Section 9.3.1 of the A8040 manual seems indicate the reason that the
queue number is separated. queue-low seems to always come from the
classifier, whereas queue-high can be from the ingress physical port
number or the classifier depending on the MVPP2_CLS_SWFWD_PCTRL_REG.
We set the port bit in MVPP2_CLS_SWFWD_PCTRL_REG, meaning that queue-high
comes from the MVPP2_CLS_SWFWD_P2HQ_REG() register... and this seems to
be where our bug comes from.
mvpp2_cls_oversize_rxq_set() sets this up as:
mvpp2_write(port->priv, MVPP2_CLS_SWFWD_P2HQ_REG(port->id),
(port->first_rxq >> MVPP2_CLS_OVERSIZE_RXQ_LOW_BITS));
val = mvpp2_read(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG);
val |= MVPP2_CLS_SWFWD_PCTRL_MASK(port->id);
mvpp2_write(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG, val);
Setting the MVPP2_CLS_SWFWD_PCTRL_MASK bit means that the queue-high
for eth2 is _always_ 4, so only queues 32 through 39 inclusive are
available to eth2. Yet, we're trying to tell the classifier to set
queue-high, which will be ignored, to zero. Hence, the queue-high
field (MVPP22_CLS_C2_ATTR0_QHIGH()) from the classifier will be
ignored.
This means we end up directing traffic from eth2 not to queue 1, but
to queue 33, and then we tell it to look up queue 33 in the RSS table.
However, RSS table has not been programmed for queue 33, and so it ends
up (presumably) dropping the packets.
It seems that mvpp22_rss_context_create() doesn't take account of the
fact that the upper 5 bits of the queue ID can't actually be changed
due to the settings in mvpp2_cls_oversize_rxq_set(), _or_ it seems that
mvpp2_cls_oversize_rxq_set() has been missed in this commit. Either
way, these two functions mutually disagree with what queue number
should be used.
Looking deeper into what mvpp2_cls_oversize_rxq_set() and the MTU
validation is doing, it seems that MVPP2_CLS_SWFWD_P2HQ_REG() is used
for over-sized packets attempting to egress through this port. With
the classifier having had RSS enabled and directing eth2 traffic to
queue 1, we may still have packets appearing on queue 32 for this port.
However, the only way we may end up with over-sized packets attempting
to egress through eth2 - is if the A8040 forwards frames between its
ports. From what I can see, we don't support that feature, and the
kernel restricts the egress packet size to the MTU. In any case, if we
were to attempt to transmit an oversized packet, we have no support in
the kernel to deal with that appearing in the port's receive queue.
So, this patch attempts to solve the issue by clearing the
MVPP2_CLS_SWFWD_PCTRL_MASK() bit, allowing MVPP22_CLS_C2_ATTR0_QHIGH()
from the classifier to define the queue-high field of the queue number.
My testing seems to confirm my findings above - clearing this bit
means that if I enable rxhash on eth2, the interface can then pass
traffic, as we are now directing traffic to RX queue 1 rather than
queue 33. Traffic still seems to work with rxhash off as well.
Reported-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Fixes: 895586d5dc ("net: mvpp2: cls: Use RSS contexts to handle RSS tables")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the bpf verifier trace check into the new switch statement in
HEAD.
Resolve the overlapping changes in hinic, where bug fixes overlap
the addition of VF support.
Signed-off-by: David S. Miller <davem@davemloft.net>
This marvell driver mvneta uses PAGE_SIZE frames, which makes it
really easy to convert. Driver updates rxq and now frame_sz
once per NAPI call.
This driver takes advantage of page_pool PP_FLAG_DMA_SYNC_DEV that
can help reduce the number of cache-lines that need to be flushed
when doing DMA sync for_device. Due to xdp_adjust_tail can grow the
area accessible to the by the CPU (can possibly write into), then max
sync length *after* bpf_prog_run_xdp() needs to be taken into account.
For XDP_TX action the driver is smart and does DMA-sync. When growing
tail this is still safe, because page_pool have DMA-mapped the entire
page size.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: thomas.petazzoni@bootlin.com
Link: https://lore.kernel.org/bpf/158945335786.97035.12714388304493736747.stgit@firesoul
Some PHYs connected to this ethernet hardware support the WoL feature.
But when WoL is enabled and the machine is powered off, the PHY remains
waiting for a magic packet at max speed (i.e. 1Gbps), which is a waste of
energy.
Slow down the PHY speed before stopping the ethernet if WoL is enabled,
and save some energy while the machine is powered off or sleeping.
Tested using an Armada 370 based board (LS421DE) equipped with a Marvell
88E1518 PHY.
Signed-off-by: Daniel González Cabanelas <dgcbueu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the current codes, the octeontx2 uses its own method to allocate
the pool buffers, but there are some issues in this implementation.
1. We have to run the otx2_get_page() for each allocation cycle and
this is pretty error prone. As I can see there is no invocation
of the otx2_get_page() in otx2_pool_refill_task(), this will leave
the allocated pages have the wrong refcount and may be freed wrongly.
2. It wastes memory. For example, if we only receive one packet in a
NAPI RX cycle, and then allocate a 2K buffer with otx2_alloc_rbuf()
to refill the pool buffers and leave the remain area of the allocated
page wasted. On a kernel with 64K page, 62K area is wasted.
IMHO it is really unnecessary to implement our own method for the
buffers allocate, we can reuse the napi_alloc_frag() to simplify
our code.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix to return negative error code -ENOMEM from the alloc failed error
handling case instead of 0, as done elsewhere in this function.
Fixes: 3184fb5ba9 ("octeontx2-vf: Virtual function driver support")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The "info->fs.location" is a u32 that comes from the user via the
ethtool_set_rxnfc() function. We need to check for invalid values to
prevent a buffer overflow.
I copy and pasted this check from the mvpp2_ethtool_cls_rule_ins()
function.
Fixes: 90b509b39a ("net: mvpp2: cls: Add Classification offload support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "rss_context" variable comes from the user via ethtool_get_rxfh().
It can be any u32 value except zero. Eventually it gets passed to
mvpp22_rss_ctx() and if it is over MVPP22_N_RSS_TABLES (8) then it
results in an array overflow.
Fixes: 895586d5dc ("net: mvpp2: cls: Use RSS contexts to handle RSS tables")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 84411f73b8 ("net: mv643xx_eth: Avoid setting the initial TCP checksum")
left behind this, remove it.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: 5a6d7c9dae ("octeontx2-pf: Mailbox communication with AF")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes coccicheck warning:
drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h:312:2-3: Unneeded semicolon
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Overlapping header include additions in macsec.c
A bug fix in 'net' overlapping with the removal of 'version'
string in ena_netdev.c
Overlapping test additions in selftests Makefile
Overlapping PCI ID table adjustments in iwlwifi driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
Since set_rx_mode takes a mutex lock for sending mailbox
message to admin function to set the mode, moved logic
to a workqueue.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed an issue wherein while refilling receive buffers
for the last page allocated, recount is not being updated.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes wrapper fn()s around mutex_init/lock/unlock.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removed MODULE_VERSION and fixed MODULE_AUTHOR.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With MTU sized receive buffers it is not expected to have CQE_RX
with multiple receive buffer pointers. But since same physcial link
is shared by PF and it's VFs, the max receive packet configured
at link could be morethan MTU. Hence there is a chance of receiving
plts morethan MTU which then gets DMA'ed into multiple buffers
and notified in a single CQE_RX. This patch treats such pkts as errors
and frees up receive buffers pointers back to hardware.
Also on the transmit side this patch sets SMQ MAXLEN to max value to avoid
HW length errors for the packets whose size > MTU, eg due to path MTU.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VF shares physical link with PF. Admin function (AF) sends
notification to PF whenever a link change event happens. PF
has to forward the same notification to each of the enabled VF.
PF traps START/STOP_RX messages sent by VF to AF to keep track of
VF's enabled/disabled state.
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added ethtool support for VF devices for
- Driver stats, Tx/Rx perqueue stats
- Set/show Rx/Tx queue count
- Set/show Rx/Tx ring sizes
- Set/show IRQ coalescing parameters
- RSS configuration etc
It's the PF which owns the interface, hence VF
cannot display underlying CGX interface stats.
Except for this rest ethtool support reuses PF's
APIs.
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On OcteonTx2 silicon there two two types VFs, VFs that share the
physical link with their parent SR-IOV PF and the VFs which work
in pairs using internal HW loopback channels (LBK). Except for the
underlying Rx/Tx channel mapping from netdev functionality perspective
they are almost identical. This patch adds netdev driver support
for these VFs.
Unlike it's parent PF a VF cannot directly communicate with admin
function (AF) and it has to go through PF for the same. The mailbox
communication with AF works like 'VF <=> PF <=> AF'.
Also functionality wise VF and PF are identical, hence to avoid code
duplication PF driver's APIs are resued here for HW initialization,
packet handling etc etc ie almost everything. For VF driver to compile
as module exported few of the existing PF driver APIs.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When FLR is initiated for a VF (PCI function level reset),
the parent PF gets a interrupt. PF then sends a message to
admin function (AF), which then cleanups all resources attached
to that VF.
Also handled IRQs triggered when master enable bit is cleared
or set for VFs. This handler just clears the transaction pending
ie TRPEND bit.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added 'sriov_configure' to enable/disable virtual functions (VFs).
Also added handling of mailbox messages from these VFs.
Admin function (AF) is the only one with all priviliges to configure
HW, alloc resources etc etc, PFs and it's VFs have to request AF
via mbox for all their needs. But unlike PFs, their VFs cannot
send a mbox request directly. A VF shares a mailbox region with
it's parent PF, so VF sends a mailbox msg to PF and then PF forwards
it to AF. Then AF after processing sends response to PF which it
again forwards to VF.
This patch adds support for this 'VF <=> PF <=> AF' mailbox
communication.
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Christina Jacob <cjacob@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
flow_action_hw_stats_types_check() helper takes one of the
FLOW_ACTION_HW_STATS_*_BIT values as input. If we align
the arguments to the opening bracket of the helper there
is no way to call this helper and stay under 80 characters.
Remove the "types" part from the new flow_action helpers
and enum values.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Per the dt-binding the interrupt is optional so use
platform_get_irq_optional() instead of platform_get_irq(). Since
commit 7723f4c5ec ("driver core: platform: Add an error message to
platform_get_irq*()") platform_get_irq() produces an error message
orion-mdio f1072004.mdio: IRQ index 0 not found
which is perfectly normal if one hasn't specified the optional property
in the device tree.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit e1f550dc44.
platform_get_irq_optional() will still return -ENXIO when no interrupt
is provided so the additional error handling caused the driver prone to
fail when no interrupt was specified. Revert the change so we can apply
the correct minimal fix.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the case where the last mvneta_poll did not process all
RX packets, we need to xor the pp->cause_rx_tx or port->cause_rx_tx
before claculating the rx_queue.
Fixes: 2dcf75e279 ("net: mvneta: Associate RX queues with each CPU")
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.
This driver did not previously reject unsupported parameters.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.
This driver did not previously reject unsupported parameters.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.
This driver correctly rejects all unsupported
parameters, no functional changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.
This driver did not previously reject unsupported parameters.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.
This driver did not previously reject unsupported parameters.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set ethtool_ops->supported_coalesce_params to let
the core reject unsupported coalescing parameters.
This driver did not previously reject unsupported parameters.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to unlock before returning if this allocation fails.
Fixes: 75f3627099 ("octeontx2-pf: Support to enable/disable pause frames via ethtool")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Per the dt-binding the interrupt is optional so use
platform_get_irq_optional() instead of platform_get_irq(). Since
commit 7723f4c5ec ("driver core: platform: Add an error message to
platform_get_irq*()") platform_get_irq() produces an error message
orion-mdio f1072004.mdio: IRQ index 0 not found
which is perfectly normal if one hasn't specified the optional property
in the device tree.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce flow_action_basic_hw_stats_types_check() helper and use it
in drivers. That sanitizes the drivers which do not have support
for action HW stats types.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This collection of PCI error bits is used in more than one driver,
so move it to the PCI core.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation of factoring out PCI_STATUS error bit handling let drivers
use the same collection of error bits. To facilitate bisecting we do this
in a separate patch per affected driver. For the Marvell drivers we have
to add PCI_STATUS_SIG_TARGET_ABORT to the error bits.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a spelling mistake in a dev_warn message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding ethtool stats for when XDP transmitted packets overrun the TX
queue. This is recorded separately for XDP_TX and ndo_xdp_xmit. This
is an important aid for troubleshooting XDP based setups.
It is currently a known weakness and property of XDP that there isn't
any push-back or congestion feedback when transmitting frames via XDP.
It's easy to realise when redirecting from a higher speed link into a
slower speed link, or simply two ingress links into a single egress.
The situation can also happen when Ethernet flow control is active.
For testing the patch and provoking the situation to occur on my
Espressobin board, I configured the TX-queue to be smaller (434) than
RX-queue (512) and overload network with large MTU size frames (as a
larger frame takes longer to transmit).
Hopefully the upcoming XDP TX hook can be extended to provide insight
into these TX queue overflows, to allow programmable adaptation
strategies.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently on the first check if the operation is still not
finished, the poll goes to sleep for 2-5 usecs. But if for
some reason (due to other priority stuff like interrupts etc) by
the time the poll wakes up the 10ms time is expired then we don't
check if operation is finished or not and return failure.
This patch modifies poll logic to check HW operation after sleep so
that the status is checked atleast twice.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bus mastering is enabled by firmware, but when this driver
is unbinded bus mastering gets disabled by the PCI subsystem
which results interrupts not working when driver is reloaded.
Hence set bus mastering everytime in probe().
Also
- Converted pci_set_dma_mask() and pci_set_consistent_dma_mask()
to dma_set_mask_and_coherent().
- Cleared transaction pending bit which gets set during
driver unbind due to clearing of bus mastering (ME bit).
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently there is no way for AF dependent drivers in
any domain to check if the AF driver is loaded. This
patch sets an ID for RVUM block which will automatically
reflects in PF/VFs discovery register which they can
check and defer their probe until AF is up.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For retrieving info like interface MAC addresses, packet
parser key extraction config etc currently a command
is sent to firmware and firmware which periodically polls
for commands, processes these and returns the info.
This is resulting in interface initialization taking lot
of time. To optimize this a memory region is shared between
firmware and this driver, firmware while booting puts
static info like these into that region for driver to
read directly without using commands.
With this
- Logic for retrieving packet parser extraction config
via commands is removed and repalced with using the
shared 'fwdata' structure.
- Now RVU MSIX vector address is also retrieved from this fwdata struct
instead of from CSR. Otherwise when kexec/kdump crash kernel loads
CSR will have a IOVA setup by primary kernel which impacts
RVU PF/VF's interrupts.
- Also added a mbox handler for PF/VF interfaces to retrieve their MAC
addresses from AF.
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Christina Jacob <cjacob@marvell.com>
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added mailbox requests to retrieve backpressure IDs from AF and Aura,
CQ contexts are configured with these BPIDs. So that when resource
levels reach configured thresholds they assert backpressure on the
interface which is also mapped to same BPID.
Also added support to enable/disable pause frames generation via ethtool.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
CGX LMAC, the physical interface can generate pause frames when
internal resources asserts backpressure due to exhaustion.
This patch configures CGX to generate 802.3 pause frames.
Also enabled processing of received pause frames on the line which
will assert backpressure on the internal transmit path.
Also added mailbox handlers for PF drivers to enable or disable
pause frames anytime.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each of the interface receive channels can be backpressured by
resources upon exhaustion or reaching configured threshold levels.
Resources here are receive buffer queues (Auras) and pkt notification
descriptor queues (CQs). Resources and interface channels are mapped
using backpressure IDs (BPIDs).
HW supports upto 512 BPIDs, this patch divides these BPIDs statically
across CGX/LBK/SDP interfaces as follows.
BPIDs 0 - 191 are mapped to LMAC channels, 16 per LMAC.
BPIDs 192 - 255 are mapped to LBK channels.
BPIDs 256 - 511 are mapped to SDP channels.
Also did the needed basic configuration of BPIDs.
Added mbox handlers with which a PF device can request for a BPID which
it will use to configure Auras and CQs.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the Marvell mvpp2 ethernet driver to use the finalised link
parameters in mac_link_up() rather than the parameters in mac_config().
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the Marvell mvneta ethernet driver to use the finalised link
parameters in mac_link_up() rather than the parameters in mac_config().
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Propagate the resolved link parameters via the mac_link_up() call for
MACs that do not automatically track their PCS state. We propagate the
link parameters via function arguments so that inappropriate members
of struct phylink_link_state can't be accessed, and creating a new
structure just for this adds needless complexity to the API.
Tested-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Functions starting with __ usually indicate those which are exported,
but should not be called directly. Update some of those declared in the
API and make it more readable.
page_pool_unmap_page() and page_pool_release_page() were doing
exactly the same thing calling __page_pool_clean_page(). Let's
rename __page_pool_clean_page() to page_pool_release_page() and
export it in order to show up on perf logs and get rid of
page_pool_unmap_page().
Finally rename __page_pool_put_page() to page_pool_put_page() since we
can now directly call it from drivers and rename the existing
page_pool_put_page() to page_pool_put_full_page() since they do the same
thing but the latter is trying to sync the full DMA area.
This patch also updates netsec, mvneta and stmmac drivers which use
those functions.
Suggested-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce "rx" prefix in the name scheme for xdp counters
on rx path.
Differentiate between XDP_TX and ndo_xdp_xmit counters
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cleanedup repititive nixlf and blkaddr retrieving logic
is various mailbox handlers throughout the rvu_nix.c file.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most of the CGX register config is restricted to mapped RVU PFs,
this patch cleans up these permission checks spread across
the rvu_cgx.c file by moving the checks to a common fn().
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since CGX driver and AF driver are built into a single module
the export symbols in CGX driver are not needed. This patch
gets rid of them.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of xdp_ret in mvneta_swbm_rx_frame routine since now
we can rely on xdp_stats to flush in case of xdp_redirect
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add xdp_redirect, xdp_pass, xdp_drop and xdp_tx counters
to ethtool statistics
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce mvneta_stats structure in mvneta_update_stats routine signature
in order to collect all the rx stats and update them at the end at the
napi loop. mvneta_stats will be reused adding xdp statistics support to
ethtool.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In oreder to avoid unnecessary instructions rely on open-coding updating
per-cpu stats in mvneta_tx/mvneta_xdp_submit_frame and mvneta_rx_hwbm
routines. This patch will be used to add xdp support to ethtool for the
mvneta driver
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
mvneta_ethtool_update_stats routine is currently reporting
skb_alloc_error and refill_error only for the first rx queue.
Fix the issue moving skb_alloc_err and refill_err in
mvneta_pcpu_stats structure.
Moreover this patch will be used to introduce xdp statistics
to ethtool for the mvneta driver
Fixes: 17a96da627 ("net: mvneta: discriminate error cause for missed packet")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move rx_dropped and rx_errors counters in mvneta_pcpu_stats in order to
avoid possible races updating statistics
Fixes: 562e2f467e ("net: mvneta: Improve the buffer allocation method for SWBM")
Fixes: dc35a10f68 ("net: mvneta: bm: add support for hardware buffer management")
Fixes: c5aff18204 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The otx2_mbox_get_rsp() function never returns NULL, it returns error
pointers on error.
Fixes: 34bfe0ebed ("octeontx2-pf: MTU, MAC and RX mode config support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In order to fix XDP support if sw buffer management is used as fallback
for hw bm devices, define MVNETA_SKB_HEADROOM as maximum between
XDP_PACKET_HEADROOM and NET_SKB_PAD and let the hw aligns the IP header
to 4-byte boundary.
Fix rx_offset_correction initialization if mvneta_bm_port_init fails in
mvneta_resume routine
Fixes: 0db51da7a8 ("net: mvneta: add basic XDP support")
Tested-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking updates from David Miller:
1) Add WireGuard
2) Add HE and TWT support to ath11k driver, from John Crispin.
3) Add ESP in TCP encapsulation support, from Sabrina Dubroca.
4) Add variable window congestion control to TIPC, from Jon Maloy.
5) Add BCM84881 PHY driver, from Russell King.
6) Start adding netlink support for ethtool operations, from Michal
Kubecek.
7) Add XDP drop and TX action support to ena driver, from Sameeh
Jubran.
8) Add new ipv4 route notifications so that mlxsw driver does not have
to handle identical routes itself. From Ido Schimmel.
9) Add BPF dynamic program extensions, from Alexei Starovoitov.
10) Support RX and TX timestamping in igc, from Vinicius Costa Gomes.
11) Add support for macsec HW offloading, from Antoine Tenart.
12) Add initial support for MPTCP protocol, from Christoph Paasch,
Matthieu Baerts, Florian Westphal, Peter Krystad, and many others.
13) Add Octeontx2 PF support, from Sunil Goutham, Geetha sowjanya, Linu
Cherian, and others.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1469 commits)
net: phy: add default ARCH_BCM_IPROC for MDIO_BCM_IPROC
udp: segment looped gso packets correctly
netem: change mailing list
qed: FW 8.42.2.0 debug features
qed: rt init valid initialization changed
qed: Debug feature: ilt and mdump
qed: FW 8.42.2.0 Add fw overlay feature
qed: FW 8.42.2.0 HSI changes
qed: FW 8.42.2.0 iscsi/fcoe changes
qed: Add abstraction for different hsi values per chip
qed: FW 8.42.2.0 Additional ll2 type
qed: Use dmae to write to widebus registers in fw_funcs
qed: FW 8.42.2.0 Parser offsets modified
qed: FW 8.42.2.0 Queue Manager changes
qed: FW 8.42.2.0 Expose new registers and change windows
qed: FW 8.42.2.0 Internal ram offsets modifications
MAINTAINERS: Add entry for Marvell OcteonTX2 Physical Function driver
Documentation: net: octeontx2: Add RVU HW and drivers overview
octeontx2-pf: ethtool RSS config support
octeontx2-pf: Add basic ethtool support
...
- remove ioremap_nocache given that is is equivalent to
ioremap everywhere
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl4vKHwLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYMPGBAAuVNUZaZfWYHpiVP2oRcUQUguFiD3NTbknsyzV2oH
J9P0GfeENSKwE9OOhZ7XIjnCZAJwQgTK/ppQY5yiQ/KAtYyyXjXEJ6jqqjiTDInr
+3+I3t/LhkgrK7tMrb7ylTGa/d7KhaciljnOXC8+b75iddvM9I1z2pbHDbppZMS9
wT4RXL/cFtRb85AfOyPLybcka3f5P2gGvQz38qyimhJYEzHDXZu9VO1Bd20f8+Xf
eLBKX0o6yWMhcaPLma8tm0M0zaXHEfLHUKLSOkiOk+eHTWBZ3b/w5nsOQZYZ7uQp
25yaClbameAn7k5dHajduLGEJv//ZjLRWcN3HJWJ5vzO111aHhswpE7JgTZJSVWI
ggCVkytD3ESXapvswmACSeCIDMmiJMzvn6JvwuSMVB7a6e5mcqTuGo/FN+DrBF/R
IP+/gY/T7zIIOaljhQVkiEIIwiD/akYo0V9fheHTBnqcKEDTHV4WjKbeF6aCwcO+
b8inHyXZSKSMG//UlDuN84/KH/o1l62oKaB1uDIYrrL8JVyjAxctWt3GOt5KgSFq
wVz1lMw4kIvWtC/Sy2H4oB+RtODLp6yJDqmvmPkeJwKDUcd/1JKf0KsZ8j3FpGei
/rEkBEss0KBKyFAgBSRO2jIpdj2epgcBcsdB/r5mlhcn8L77AS6mHbA173kY4pQ/
Kdg=
=TUCJ
-----END PGP SIGNATURE-----
Merge tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap
Pull ioremap updates from Christoph Hellwig:
"Remove the ioremap_nocache API (plus wrappers) that are always
identical to ioremap"
* tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap:
remove ioremap_nocache and devm_ioremap_nocache
MIPS: define ioremap_nocache to ioremap
Added support to show or configure RSS hash key, indirection table,
2,4 tuple via ethtool. Also added debug msg_level support
to dump messages when HW reports errors in packet received
or transmitted.
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds ethtool support for
- Driver stats, Tx/Rx perqueue and CGX LMAC stats
- Set/show Rx/Tx queue count
- Set/show Rx/Tx ring sizes
- Set/show IRQ coalescing parameters
Signed-off-by: Christina Jacob <cjacob@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added ndo_get_stats64 which returns stats maintained by HW.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds TCP segmentation offload (TSO) support. First version
of the silicon didn't support TSO offload, for this driver
level TSO support is added.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds receive side scaling (RSS) support to distribute
pkts/flows across multiple queues. Sets up key, indirection
table etc. Also added extraction of HW calculated rxhash and
adding to same to SKB ie NETIF_F_RXHASH offload support.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
HW reports many errors on the receive and transmit paths.
Such as incorrect queue configuration, pkt transmission errors,
LMTST instruction errors, transmit queue full etc. These are reported
via QINT interrupt. Most of the errors are fatal and needs
reinitialization.
Also added support to allocate receive buffers in non-atomic context
when allocation fails in NAPI context.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Aleksey Makarov <amakarov@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch addes support to change interface MTU, MAC address
retrieval and config, RX mode ie unicast, multicast and promiscuous.
Also added link loopback support
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PF and AF (admin function) shares 64KB of reserved memory region for
communication. This region is shared for
- Messages sent by PF and responses sent by AF.
- Notifications sent by AF and ACKs sent by PF.
This patch adds infrastructure to handle notifications sent
by AF and adds handlers to process them.
One of the main usecase of notifications from AF is physical
link changes. So this patch adds registration of PF with AF
to receive link status change notifications and also adds
the handler for that notification.
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the packet transmission support.
For a given skb prepares send queue descriptors (SQEs) and pushes them
to HW. Here driver doesn't maintain it's own SQ rings, SQEs are pushed
to HW using a silicon specific operations called LMTST. From the
instuction HW derives the transmit queue number and queues the SQE to
that queue. These LMTST instructions are designed to avoid queue
maintenance in SW and lockless behavior ie when multiple cores are trying
to add SQEs to same queue then HW will takecare of serialization, no need
for SW to hold locks.
Also supports scatter/gather.
Co-developed-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added receive packet handling (NAPI) support, error stats, RX_ALL
capability config option to passon error pkts to stack upon user request.
In subsequent patches these error stats will be added to ethttool.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Completion queue (CQ) is the one with which HW notifies SW on a packet
reception or transmission. Each of the RQ and SQ are mapped to a unique
CQ and again both CQs are mapped to same interrupt ie the CINT. So that
each core has one interrupt source in whose handler both Rx and Tx
notifications are processed.
Also
- Registered a NAPI handler for the CINT.
- Setup coalescing parameters.
- IRQ affinity hints etc
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch does the initialization of all queues ie the
receive buffer pools, receive and transmit queues, completion
or notification queues etc. Allocates all required resources
(eg transmit schedulers, receive buffers etc) and configures
them for proper functioning of queues. Also sets up receive
queue's RED dropping levels.
Co-developed-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For a PF to function as a NIC, NPA (for Rx buffers, Tx descriptors etc)
and NIX (for rcv, send and completion queues) are the minimum resources
needed. So request admin function (AF) to attach one each of NIX and NPA
block LFs (local functions).
Only AF can configure a LF's contexts, so request AF to allocate memory
for NPA aura/pool and NIX RQ/SQ/CQ HW contexts. Upon receiving response,
save some of the HW constants like number of pointers per stack page,
size of send queue buffer (SQBs, where SQEs are queued by HW) e.t.c which
are later used to initialize queues.
A HW context here is like a state machine maintained for a descriptor
queue. eg size, head/tail pointers, irq etc etc. HW maintains this in
memory.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the resource virtualization unit (RVU) each of the PF and AF
(admin function) share a 64KB of reserved memory region for
communication. This patch initializes PF <=> AF mailbox IRQs,
registers handlers for processing these communication messages.
Also adds support to process these messages in both directions
ie responses to PF initiated DOWN (PF => AF) messages and AF
initiated UP messages (AF => PF).
Mbox communication APIs and message formats are defined in AF driver
(drivers/net/ethernet/marvell/octeontx2/af), mbox.h from AF driver is
included here to avoid duplication.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Christina Jacob <cjacob@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Aleksey Makarov <amakarov@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds template for the Marvell's OcteonTX2 network
controller's physical function driver. Just the probe, PCI
specific initialization and netdev registration.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recently XDP Support was added to the mvneta driver
for software buffer management only.
It is still possible to attach an XDP program if
hardware buffer management is used.
It is not doing anything at that point.
The patch disallows attaching XDP programs to mvneta
if hardware buffer management is used.
I am sorry about that. It is my first submission and I am having
some troubles with the format of my emails.
v4 -> v5:
- Remove extra tabs
v3 -> v4:
- Please ignore v3 I accidentally submitted
my other patch with git-send-mail and v4 is correct
v2 -> v3:
- My mailserver corrupted the patch
resubmission with git-send-email
v1 -> v2:
- Fixing the patches indentation
Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The first batch of driver conversions missed a few cases where we can
use phy_do_ioctl too.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Page pool API will start syncing (if requested) starting from
page->dma_addr + pool->p.offset. Fix dma sync length in
mvneta_run_xdp since we do not need to account xdp headroom
Fixes: 07e13edbb6 ("net: mvneta: get rid of huge dma sync in mvneta_rx_refill")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With 'commit 44768decb7 ("page_pool: handle page recycle for NUMA_NO_NODE
condition")' we can safely change nid to NUMA_NO_NODE and accommodate
future NUMA aware hardware using mvneta network interface
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ioremap has provided non-cached semantics by default since the Linux 2.6
days, so remove the additional ioremap_nocache interface.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Switch network drivers, phy drivers, and SFP/phylink over to use the
more correct 10GBASE-R, rather than 10GBASE-KR. 10GBASE-KR is backplane
ethernet, which is 10GBASE-R with autonegotiation on top, which our
current usage on the affected platforms does not have.
The only remaining user of PHY_INTERFACE_MODE_10GKR is the Aquantia
PHY, which has a separate mode for 10GBASE-KR.
For Marvell mvpp2, we detect 10GBASE-KR, and rewrite it to 10GBASE-R
for compatibility with existing DT - this is the only network driver
at present that makes use of PHY_INTERFACE_MODE_10GKR.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Several nf_flow_table_offload fixes from Pablo Neira Ayuso,
including adding a missing ipv6 match description.
2) Several heap overflow fixes in mwifiex from qize wang and Ganapathi
Bhat.
3) Fix uninit value in bond_neigh_init(), from Eric Dumazet.
4) Fix non-ACPI probing of nxp-nci, from Stephan Gerhold.
5) Fix use after free in tipc_disc_rcv(), from Tuong Lien.
6) Enforce limit of 33 tail calls in mips and riscv JIT, from Paul
Chaignon.
7) Multicast MAC limit test is off by one in qede, from Manish Chopra.
8) Fix established socket lookup race when socket goes from
TCP_ESTABLISHED to TCP_LISTEN, because there lacks an intervening
RCU grace period. From Eric Dumazet.
9) Don't send empty SKBs from tcp_write_xmit(), also from Eric Dumazet.
10) Fix active backup transition after link failure in bonding, from
Mahesh Bandewar.
11) Avoid zero sized hash table in gtp driver, from Taehee Yoo.
12) Fix wrong interface passed to ->mac_link_up(), from Russell King.
13) Fix DSA egress flooding settings in b53, from Florian Fainelli.
14) Memory leak in gmac_setup_txqs(), from Navid Emamdoost.
15) Fix double free in dpaa2-ptp code, from Ioana Ciornei.
16) Reject invalid MTU values in stmmac, from Jose Abreu.
17) Fix refcount leak in error path of u32 classifier, from Davide
Caratti.
18) Fix regression causing iwlwifi firmware crashes on boot, from Anders
Kaseorg.
19) Fix inverted return value logic in llc2 code, from Chan Shu Tak.
20) Disable hardware GRO when XDP is attached to qede, frm Manish
Chopra.
21) Since we encode state in the low pointer bits, dst metrics must be
at least 4 byte aligned, which is not necessarily true on m68k. Add
annotations to fix this, from Geert Uytterhoeven.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (160 commits)
sfc: Include XDP packet headroom in buffer step size.
sfc: fix channel allocation with brute force
net: dst: Force 4-byte alignment of dst_metrics
selftests: pmtu: fix init mtu value in description
hv_netvsc: Fix unwanted rx_table reset
net: phy: ensure that phy IDs are correctly typed
mod_devicetable: fix PHY module format
qede: Disable hardware gro when xdp prog is installed
net: ena: fix issues in setting interrupt moderation params in ethtool
net: ena: fix default tx interrupt moderation interval
net/smc: unregister ib devices in reboot_event
net: stmmac: platform: Fix MDIO init for platforms without PHY
llc2: Fix return statement of llc_stat_ev_rx_null_dsap_xid_c (and _test_c)
net: hisilicon: Fix a BUG trigered by wrong bytes_compl
net: dsa: ksz: use common define for tag len
s390/qeth: don't return -ENOTSUPP to userspace
s390/qeth: fix promiscuous mode after reset
s390/qeth: handle error due to unsupported transport mode
cxgb4: fix refcount init for TC-MQPRIO offload
tc-testing: initial tdc selftests for cls_u32
...
Presently, at boot time, the comphys are enabled. For firmware
compatibility reasons, the comphy driver does not power down the
comphys at boot. Consequently, the ethernet comphys are left active
until the network interfaces are brought through an up/down cycle.
If the port is never used, the port wastes power needlessly. Arrange
for the ethernet comphys to be cycled by the mvpp2 driver as if the
interface went through an up/down cycle during driver probe, thereby
powering them down.
This saves:
270mW per 10G SFP+ port on the Macchiatobin Single Shot (eth0/eth1)
370mW per 10G PHY port on the Macchiatobin Double Shot (eth0/eth1)
160mW on the SFP port on either Macchiatobin flavour (eth3)
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up the mvpp2 validate implementation to adopt the same behaviour as
mvneta:
- only allow the link modes that the specified PHY interface mode
supports with the exception of 1000base-X and 2500base-X.
- use the basex helper to deal with SFP modules that can be switched
between 1000base-X vs 2500base-X.
This gives consistent behaviour between mvneta and mvpp2.
This commit depends on "net: phylink: extend clause 45 PHY validation
workaround" so is not marked for backporting to stable kernels.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
phylink requires the MAC to report when its link status changes when
operating in inband modes. Failure to report link status changes
means that phylink has no idea when the link events happen, which
results in either the network interface's carrier remaining up or
remaining permanently down.
For example, with a fiber module, if the interface is brought up and
link is initially established, taking the link down at the far end
will cut the optical power. The SFP module's LOS asserts, we
deactivate the link, and the network interface reports no carrier.
When the far end is brought back up, the SFP module's LOS deasserts,
but the MAC may be slower to establish link. If this happens (which
in my tests is a certainty) then phylink never hears that the MAC
has established link with the far end, and the network interface is
stuck reporting no carrier. This means the interface is
non-functional.
Avoiding the link interrupt when we have phylink is basically not
an option, so remove the !port->phylink from the test.
Fixes: 4bb0432628 ("net: mvpp2: phylink support")
Tested-by: Sven Auhagen <sven.auhagen@voleatech.de>
Tested-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except
at places where these are defined. Later patches will remove the unused
definition of FIELD_SIZEOF().
This patch is generated using following script:
EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h"
git grep -l -e "\bFIELD_SIZEOF\b" | while read file;
do
if [[ "$file" =~ $EXCLUDE_FILES ]]; then
continue
fi
sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file;
done
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: David Miller <davem@davemloft.net> # for net
Rename the mac_link_state() method to mac_pcs_get_state() to make it
clear that it should be returning the MACs PCS current state, which
is used for inband negotiation rather than just reading back what the
MAC has been configured for. Update the documentation to explicitly
mention that this is for inband.
We drop the return value as well; most of phylink doesn't check the
return value and it is not clear what it should do on error - instead
arrange for state->link to be false.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
If rvu_get_blkaddr() fails, then this rvu_cgx_nix_cuml_stats() returns
zero and we write some uninitialized data into the debugfs output.
On the error paths, the use of the uninitialized "*stat" is harmless,
but it will lead to a Smatch warning (static analysis) and a UBSan
warning (runtime analysis) so we should prevent that as well.
Fixes: f967488d09 ("octeontx2-af: Add per CGX port level NIX Rx/Tx counters")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of costly dma_sync_single_for_device in mvneta_rx_refill
since now the driver can let page_pool API to manage needed DMA
sync with a proper size.
- XDP_DROP DMA sync managed by mvneta driver: ~420Kpps
- XDP_DROP DMA sync managed by page_pool API: ~585Kpps
Tested-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rely on page_pool_recycle_direct and not on xdp_return_buff in
mvneta_run_xdp. This is a preliminary patch to limit the dma sync len
to the one strictly necessary
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch corrects the SPDX License Identifier style in
header files related to Marvell OcteonTX2 network devices.
It uses an expilict block comment for the SPDX License
Identifier.
Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Traffic for a CGX mapped NIXLF can be stopped by disabling entries
in NPC MCAM or by configuring CGX and mailbox messages exist for the
two options. If traffic is stopped at CGX then VFs of that PF are
also effected hence CGX traffic should be started/stopped by
tracking all the users of it. This patch implements that CGX users
tracking. CGX is also configured along with NPC if required.
Also removed a check which mandates even number of LBK VFs.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A config option is added to disable caching of dynamic entries
like SQEs and stack pages. Also locks down all HW contexts in NDC,
preventing them from being evicted.
This option is useful when the queue count is large and there are
huge NDC cache misses. It's trade off between SQ context misses and
dynamically changing entries like SQE and stack page pointers.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each of the NIX/NPA LFs can choose which ways of their respective
NDC caches should be used to cache their contexts. This enables
flexible configurations like disabling caching for a LF, limiting
it's context to a certain set of ways etc etc. Separate way_mask
for NIX-TX and NIX-RX is not supported.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ingress packet replication support has been added to 96xx B0
silicon. This patch enables using that feature to replicate
ingress broadcast packets to PF and it's VFs.
Also fixed below issues
- VFs can also install NPC MCAM entry to forward broadcast pkts.
Otherwise, unless PF's interface is UP, VFs will not receive
bcast packets.
- NPC MCAM entry is disabled when PF and all it's VFs are down.
- Few corner cases in installing multicast entry list.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CN96xx initial silicon doesn't support all features pertaining to
NIX transmit scheduling and shaping.
- It supports a fixed topology of 1:1 mapped transmit
limiters at all levels.
- Supports DWRR only at SMQ/MDQ and TL1.
- Doesn't support shaping and coloring.
This patch adds HW capability structure by which each variant
and skew of silicon can be differentiated by their supported
features. And adds support for A0 silicon's transmit scheduler
capabilities or rather limitations.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for few more RSS key types for flow key
algorithm to compute rss hash index.
Following flow key types have been added.
- Tunnel types like NVGRE, VXLAN, GENEVE.
- L2 offload type ETH_DMAC, Here we will consider only DMAC 6 bytes.
- And extension header IPV6_EXT (1 byte followed by IPV6 header
- Hashing inner protocol fields for inner DMAC, IPv4/v6, TCP, UDP, SCTP.
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Writing into NPC MCAM1 and MCAM0 registers are suppressed if
they happened to form a reserved combination. Hence
clear and disable MCAM entries before update.
For HRM:
[CAM(1)]<n>=1, [CAM(0)]<n>=1: Reserved.
The reserved combination is not allowed. Hardware suppresses any
write to CAM(0) or CAM(1) that would result in the reserved combination for
any CAM bit.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Updated NPC KPU packet parsing profile with support for following
- Fragmentation support for IPv4 IPv6 outer header
- NIX instruction header support
- QinQ with TPID of 0x8100 as non inner most vlan tag, as legacy
network equipments still generate QinQ packets with this configuration.
- To better support RSS for tunnelled packets, udp based tunnel
protocols such as vxlan, vxlan-gpe, geneve and gtpu are now
captured into a separate layer E. Consequently, the inner
packet headers are pushed one layer down to LF, LG, and LH
accordingly.
- Support for rfc7510 mpls in udp. Up to 4 MPLS labels can be parsed
and captured in one layer LE.
- Parser support for DSA, extended DSA and eDSA tags right after
ethernet header by Marvell SOHO and Falcon switches. For extended
DSA and eDSA tags, a special PKIND of 62 is used, as these tags don't
contain a tpid field.
- Higig2 protocol header parsing support, added a NPC_LT_LA_HIGIG2_ETHER
for a combined header of HIGIG2 and Ethernet. Add a
NPC_LT_LA_IH_NIX_HIGIG2_ETHER for a combined header of nix_ih,
HIGIG2 and Ethernet on egress side. Also added 2 upper flags in LA to
indicate the presence of nix_ih and HIGIG2.
Other changes include
- IPv4.TTL==0 IPv6.HLIM==0 check
- Per RFC 1858, mark fragment offset == 1 as error
- TCP invalid flags check
- Separate error codes for outer and inner IPv4 checksum errors.
- Fix a parser error when KPU parses incoming IPSec ESP and AH packets
- NPC vtag capture/strip hardware expect tag pointer to point to
tpid/ethertype instead of tci. So move lb_ptr to point to tpid/ethertype.
- Fix npc parser error when parsing udp packets that don't have any payload.
- For a single MCAM entry to match on packets with one or stacked vlan tags
combine NPC_LT_LB_STAG and NPC_LT_LB_QINQ to NPC_LT_LB_STAG_QINQ.
- NVGRE to have a separate ltype LD_NVGRE instead of combined with LD_GRE.
- Reserve top LD/LTYPEs to support custom KPU profile fields.
Signed-off-by: Hao Zheng <haoz@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For every mailbox handler added to rvu, we are adding a function
declaration in rvu header file. Cleaned this up by adding a macro
to generate these declarations automatically.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If mailbox client has a bounce buffer or a intermediate buffer where
mbox messages are framed then copy them from there to HW buffer.
If 'mbase' and 'hw_mbase' are not same then assume 'mbase' points to
bounce buffer.
This patch also adds msg_size field to mbox header to copy only valid
data instead of whole buffer.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added a new mailbox API which goes through all responses
to check their IDs and response codes.
Also added logic to prevent queuing multiple works to
process the same mailbox message. This scenario happens
when AF is processing a PF's request and menawhile PF
sends ACK to AF sent UP message, then mbox_hdr->num_msgs
in the PF->AF DOWN mbox region will be nonzero and AF
will end up processing PF's request again. This is fixed
by taking a backup of num_msgs counter and clearing the
same in the mbox region before scheduling work.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added support to display current NPC MCAM entries and counter's allocation
status ín debugfs.
cat /sys/kernel/debug/octeontx2/npc/mcam_info' will dump following info
- MCAM Rx and Tx keysize
- Total MCAM entries and counters
- Current available count
- Count of number of MCAM entries and counters allocated
by a RVU PF/VF device.
Also, one NPC MCAM counter (last one) is reserved and mapped to
NPC RX_INTF's MISS_ACTION to count dropped packets due to no MCAM
entry match. This pkt drop counter can be checked via debugfs.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A CGX port is shared by a RVU PF and it's VFs. These per
CGX port level NIX Rx/Tx counters are cumilative stats of
all NIXLFs sharing this port. These stats when compared
to CGX Rx/Tx stats helps in identifying pkts dropped within
the system, if any.
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds CGX LMAC physical interface or serdes Rx/Tx
packet stats to debugfs.
'cat cgx<idx>/lmac<idx>/stats' dumps the current interface link
status and Rx/Tx stats. Stats include pkt received/transmitted,
dropped, pause frames etc etc.
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NDC is a data cache unit which caches NPA and NIX block's
aura/pool/RQ/SQ/CQ/etc contexts to reduce number of costly
DRAM accesses.
This patch adds support to dump cache's performance stats
like cache line hit/miss counters, average cycles taken for
accessing cached and non-cached data. This will help in
checking if NPA/NIX context reads/writes are having NDC cache
misses which inturn might effect performance.
Also changed NDC enums to reflect correct NDC hardware instance.
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To aid in debugging NIX block related issues, added support to dump
NIX block LF's RQ, SQ and CQ hardware contexts in debugfs. User can
check which contexts are enabled currently and dump it's current HW
context.
Four new files 'qsize', 'rq_ctx', 'sq_ctx' and 'cq_ctx' are added to the
debugfs at 'sys/kernel/debug/octeontx2/nix/'
'echo <nixlf index> > qsize' will display current enabled CQ/SQ/RQs.
'echo <nixlf> [rq number/all] > rq_ctx',
'echo <nixlf> [sq number/all] > sq_ctx' &
'echo <nixlf> [cq number/all] > cq_ctx' will dump RQ/SQ/CQ's current
hardware context.
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To aid in debugging NPA related issues, added support to dump
NPA (pool allocator) block LF's aura and pool hardware contexts in
debugfs. User can check which contexts are enabled currently and dump
it's current HW context.
Three new files 'qsize', 'aura_ctx', 'pool_ctx' are added to the
debugfs at 'sys/kernel/debug/octeontx2/npa/'
'echo <npalf index> > qsize' will display current enabled Aura/Pools.
'echo <npalf> [aura number/all] > aura_ctx' &
'echo <npalf> [aura number/all] > pool_ctx' will dump Aura/Pool
context info.
Signed-off-by: Christina Jacob <cjacob@marvell.com>
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added support to dump current resource provisioning status
of all resource virtualization unit (RVU) block's
(i.e NPA, NIX, SSO, SSOW, CPT, TIM) local functions attached
to a PF_FUNC into a debugfs file.
'cat /sys/kernel/debug/octeontx2/rsrc_alloc'
will show the current block LF's allocation status.
Signed-off-by: Christina Jacob <cjacob@marvell.com>
Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix build_skb for bm capable devices when they fall-back using swbm path
(e.g. when bm properties are configured in device tree but
CONFIG_MVNETA_BM_ENABLE is not set). In this case rx_offset_correction is
overwritten so we need to use it building skb instead of
MVNETA_SKB_HEADROOM directly
Fixes: 8dc9a0888f ("net: mvneta: rely on build_skb in mvneta_rx_swbm poll routine")
Fixes: 0db51da7a8 ("net: mvneta: add basic XDP support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reported-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before this change of_get_phy_mode() returned an enum,
phy_interface_t. On error, -ENODEV etc, is returned. If the result of
the function is stored in a variable of type phy_interface_t, and the
compiler has decided to represent this as an unsigned int, comparision
with -ENODEV etc, is a signed vs unsigned comparision.
Fix this problem by changing the API. Make the function return an
error, or 0 on success, and pass a pointer, of type phy_interface_t,
where the phy mode should be stored.
v2:
Return with *interface set to PHY_INTERFACE_MODE_NA on error.
Add error checks to all users of of_get_phy_mode()
Fixup a few reverse christmas tree errors
Fixup a few slightly malformed reverse christmas trees
v3:
Fix 0-day reported errors.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only slightly tricky merge conflict was the netdevsim because the
mutex locking fix overlapped a lot of driver reload reorganization.
The rest were (relatively) trivial in nature.
Signed-off-by: David S. Miller <davem@davemloft.net>
When receiving traffic, eth_type_trans() is high up on the perf top list,
because it's the first function which access the packet data.
Move the DMA unmap a bit higher, and put a prefetch just after it, so we
have more time to load the data into the cache.
The packet rate increase is about 14% with a tc drop test: 1620 => 1853 kpps
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the RX path we always sync against the maximum frame size for that pool.
Do the DMA sync and the unmap separately, so we can only sync by the
size of the received frame.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the CONFIG_MVNET_BA is not set, then make the stub functions
static inline to avoid trying to export them, and remove hte
following sparse warnings:
drivers/net/ethernet/marvell/mvneta_bm.h:163:6: warning: symbol 'mvneta_bm_pool_destroy' was not declared. Should it be static?
drivers/net/ethernet/marvell/mvneta_bm.h:165:6: warning: symbol 'mvneta_bm_bufs_free' was not declared. Should it be static?
drivers/net/ethernet/marvell/mvneta_bm.h:167:5: warning: symbol 'mvneta_bm_construct' was not declared. Should it be static?
drivers/net/ethernet/marvell/mvneta_bm.h:168:5: warning: symbol 'mvneta_bm_pool_refill' was not declared. Should it be static?
drivers/net/ethernet/marvell/mvneta_bm.h:170:23: warning: symbol 'mvneta_bm_pool_use' was not declared. Should it be static?
drivers/net/ethernet/marvell/mvneta_bm.h:181:18: warning: symbol 'mvneta_bm_get' was not declared. Should it be static?
drivers/net/ethernet/marvell/mvneta_bm.h:182:6: warning: symbol 'mvneta_bm_put' was not declared. Should it be static?
Signed-off-by: Ben Dooks (Codethink) <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement XDP_TX verdict and ndo_xdp_xmit net_device_ops function
pointer
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow tx buffer array to contain both skb and xdp buffers in order to
enable xdp frame recycling adding XDP_TX verdict support
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move data buffer prefetch in mvneta_swbm_rx_frame after
dma_sync_single_range_for_cpu
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactor mvneta_rx_swbm code introducing mvneta_swbm_rx_frame and
mvneta_swbm_add_rx_fragment routines. Rely on build_skb in oreder to
allocate skb since the previous patch introduced buffer recycling using
the page_pool API.
This patch fixes even an issue in the original driver where dma buffers
are accessed before dma sync.
mvneta driver can run on not cache coherent devices so it is
necessary to sync DMA buffers before sending them to the device
in order to avoid memory corruptions. Running perf analysis we can
see a performance cost associated with this DMA-sync (anyway it is
already there in the original driver code). In follow up patches we
will add more logic to reduce DMA-sync as much as possible.
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the page_pool api for allocations and DMA handling instead of
__dev_alloc_page()/dma_map_page() and free_page()/dma_unmap_page().
Pages are unmapped using page_pool_release_page before packets
go into the network stack.
The page_pool API offers buffer recycling capabilities for XDP but
allocates one page per packet, unless the driver splits and manages
the allocated page.
This is a preliminary patch to add XDP support to mvneta driver
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce mvneta_update_stats routine to collect {rx/tx} statistics
(packets and bytes). This is a preliminary patch to add XDP support to
mvneta driver
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recycling in mvpp2 has gone long time ago, but two comment still refers
to it. Remove those two misleading comments as they generate confusion.
Fixes: 7ef7e1d949 ("net: mvpp2: drop useless fields in mvpp2_bm_pool and related code")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Running old skge driver on PowerPC causes checksum errors
because hardware reported 1's complement checksum is in little-endian
byte order.
Reported-by: Benoit <benoit.sansoni@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Every mvpp2 unit can use up to 8 buffers mapped by the BM (the HW buffer
manager). The HW will place the frames in the buffer pool depending on the
frame size: short (< 128 bytes), long (< 1664) or jumbo (up to 9856).
As any unit can have up to 4 ports, the driver allocates only 2 pools,
one for small and one long frames, and share them between ports.
When the first port MTU is set higher than 1664 bytes, a third pool is
allocated for jumbo frames.
This shared allocation makes impossible to use percpu allocators,
and creates contention between HW queues.
If possible, i.e. if the number of possible CPU are less than 8 and jumbo
frames are not used, switch to a new scheme: allocate 8 per-cpu pools for
short and long frames and bind every pool to an RXQ.
When the first port MTU is set higher than 1664 bytes, the allocation
scheme is reverted to the old behaviour (3 shared pools), and when all
ports MTU are lowered, the per-cpu buffers are allocated again.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactor mvpp2_bm_pool_create(), mvpp2_bm_pool_destroy() and
mvpp2_bm_pools_init() so that they accept a struct device instead
of a struct platform_device, as they just need platform_device->dev.
Removing such dependency makes the BM code more reusable in context
where we don't have a pointer to the platform_device.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A similar workaround for the suspend/resume problem is needed for yet
another ASUS machines, P6X models. Like the previous fix, the BIOS
doesn't provide the standard DMI_SYS_* entry, so again DMI_BOARD_*
entries are used instead.
Reported-and-tested-by: SteveM <swm@swm1.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tx_done_tasklet tasklet is used in invoke the hrtimer
(mvpp2_hr_timer_cb) in softirq context. This can be also achieved without
the tasklet but with HRTIMER_MODE_SOFT as hrtimer mode.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Maxime Chevallier <maxime.chevallier@bootlin.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nathan Huckleberry <nhuck@google.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use devm_platform_ioremap_resource() to simplify the code a bit.
This is detected by coccinelle.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use accessor functions for skb fragment's page_offset instead
of direct references, in preparation for bvec conversion.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware can only offload checksum calculation on first port due to
the Tx FIFO size limitation, and has a maximum L3 offset of 128 bytes.
Document this in a comment and move duplicated code in a function.
Fixes: 576193f2d5 ("net: mvpp2: jumbo frames support")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MTU change code can call napi_disable() with the device already down,
leading to a deadlock. Also, lot of code is duplicated unnecessarily.
Rework mvpp2_change_mtu() to avoid the deadlock and remove duplicated code.
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
devm_platform_ioremap_resource() wraps platform_get_resource() and
devm_ioremap_resource() in a single helper, let's use that helper to
simplify the code.
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The onboard sky2 NIC on ASUS P6T WS PRO doesn't work after PM resume
due to the infamous IRQ problem. Disabling MSI works around it, so
let's add it to the blacklist.
Unfortunately the BIOS on the machine doesn't fill the standard
DMI_SYS_* entry, so we pick up DMI_BOARD_* entries instead.
BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1142496
Reported-and-tested-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for unifying the skb_frag and bio_vec, use the fine
accessors which already exist and use skb_frag_t instead of
struct skb_frag_struct.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
PPv2's XLGMAC can wait for 3 idle frames before triggering a link up
event. This can cause the link to be stuck low when there's traffic on
the interface, so disable this feature.
Fixes: 4bb0432628 ("net: mvpp2: phylink support")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
kvzalloc already zeroes the memory during the allocation.
pci_alloc_consistent calls dma_alloc_coherent directly.
In commit 518a2f1925
("dma-mapping: zero memory returned from dma_alloc_*"),
dma_alloc_coherent has already zeroed the memory.
So the memset after these function is not needed.
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The onboard sky2 NICs send IRQs after S3, resulting in ethernet not
working after resume.
Maskable MSI and MSI-X are also not supported, so fall back to INTx.
Signed-off-by: Tasos Sahanidis <tasos@tasossah.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Defer probing of the orion-mdio interface when getting a clock returns
EPROBE_DEFER. This avoids locking up the Armada 8k SoC when mdio is used
before all clocks have been enabled.
Signed-off-by: Josua Mayer <josua@solid-run.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Print a warning when device tree specifies more than the maximum of four
clocks supported by orion-mdio. Because reading from mdio can lock up
the Armada 8k when a required clock is not initialized, it is important
to notify the user when a specified clock is ignored.
Signed-off-by: Josua Mayer <josua@solid-run.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow up to four clocks to be specified and enabled for the orion-mdio
interface, which are required by the Armada 8k and defined in
armada-cp110.dtsi.
Fixes a hang in probing the mvmdio driver that was encountered on the
Clearfog GT 8K with all drivers built as modules, but also affects other
boards such as the MacchiatoBIN.
Cc: stable@vger.kernel.org
Fixes: 96cb434238 ("net: mvmdio: allow up to three clocks to be specified for orion-mdio")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Josua Mayer <josua@solid-run.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Users can specify classification actions based on the 'ether' flow type.
In that case, this will apply to all ethernet traffic, superseeding
flows such as 'udp4' or 'tcp6'.
Add support for this flow type in the PPv2 classifier, by mapping the
ETHER_FLOW value to the corresponding entries in the classifier.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a missing check to detect flow types that we don't support, so that
user can be informed of this.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Header Parser allows identifying various fields in the packet
headers, used for various kind of filtering and classification
steps.
This is a re-entrant process, where the offset in the packet header
depends on the previous lookup results. This offset is represented in
the SRAM results of the TCAM, as a shift to be operated.
This shift can be negative in some cases, such as in IPv6 parsing.
This commit prevents overriding the sign bit when setting the shift
value, which could cause instabilities when parsing IPv6 flows.
Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Suggested-by: Alan Winkowski <walan@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There was an unused variable 'mvpp2_dbgfs_prs_pmap_fops'
Added a usage consistent with other fops to dump pmap
to userspace.
Cc: clang-built-linux@googlegroups.com
Link: https://github.com/ClangBuiltLinux/linux/issues/529
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit allows using the vlan Id and priority as parts of the key
for classification offload. These fields are extracted from the
outermost tag, if multiple tags are present.
Vlan Id and priority are considered as 2 different fields by the
classifier, however the fields are both appended in the Header Extracted
Key in the same layout as they are found in the tags. This means that
when steering only based on the prio, a 16-bit slot is still taken in
the HEK.
The classifier doesn't allow extracting the DEI bit from the tag, so we
explicitly prevent user from using this bit in the key.
This commit adds the vlan priotity as a compatible HEK field for
tagged traffic, meaning that we limit the possibility of extracting this
field only to the flows that contain tagged traffic.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The C2 TCAM used for classification uses a key (Header Extracted Key)
built by concatenating several fields extracted from the packet header.
After a lot of trial-and-error and some guess work, it seems the HEK is
right justified, with the first fields being stored in the MSB, then
concatenated up until the LSB.
Until now, this doesn't cause any issue since all HEK fields we use are
full bytes. However this is an issue for the upcoming VLAN id and pri
extraction, which aren't full bytes.
Rework the way we built that TCAM key, by changing the order in which we
append the fields.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The way we currently handle classification offload and RSS is by having
dedicated lookup sequences in the flow table, each being selected
depending on several fields being present in the packet header.
We need to make sure the classification operation we want to perform can
be done in each flow we want to insert it into. As an example,
classifying on VLAN tag can only be done on flows used for tagged
traffic.
This commit makes sure we don't insert rules in flows we aren't
compatible with.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When performing a TCAM lookup in the C2 engine, it's possible that
multiple entries match the packet. To make sure the correct entry match
when performing a lookup, the Flow Table can set a lookup type, which
will be used in the TCAM lookup, thus preventing such false-positives.
We need to make sure the RSS match doesn't interfere with other
classification lookups, hence we use a dedicated lookup_type for it.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When removing all VID filters, the mvpp2_prs_vid_entry_remove would be
called with the TCAM id incorrectly used as a VID, causing the wrong
TCAM entries to be invalidated.
Fix this by directly invalidating entries in the VID range.
Fixes: 56beda3db6 ("net: mvpp2: Add hardware offloading for VLAN filtering")
Suggested-by: Yuri Chipchev <yuric@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VID filtering is implemented in the Header Parser, with one range of 11
vids being assigned for each no-loopback port.
Make sure we use the per-port range when looking for existing entries in
the Parser.
Since we used a global range instead of a per-port one, this causes VIDs
to be removed from the whitelist from all ports of the same PPv2
instance.
Fixes: 56beda3db6 ("net: mvpp2: Add hardware offloading for VLAN filtering")
Suggested-by: Yuri Chipchev <yuric@marvell.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Besides the MIB counters, some other useful counters can be exposed to
the user. This commit adds support for :
- Per-port counters, that indicate FIFO drops and classifier drops,
- Per-rxq counters,
- Per-txq counters
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we'll be adding support for other kind of internal counters, make
clear that the currently supported counters are the MIB counters.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When first configuring a port on PPv2, we want to clear the internal
counters so that we don't get values from previous boot stages.
However, we can't really clear these counters when resetting the MAC,
since there are valid reasons to do so while the port is being used,
such as when reconfiguring the interface mode with the PHY.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on review, `lock' is only acquired in hwbm_pool_add() which is
invoked via ->probe(), ->resume() and ->ndo_change_mtu(). Based on this
the lock can become a mutex and there is no need to disable interrupts
during the procedure.
Now that the lock is a mutex, hwbm_pool_add() no longer invokes
hwbm_pool_refill() in an atomic context so we can pass GFP_KERNEL to
hwbm_pool_refill() and remove the `gfp' argument from hwbm_pool_add().
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some ISDN files that got removed in net-next had some changes
done in mainline, take the removals.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Free AF_PACKET po->rollover properly, from Willem de Bruijn.
2) Read SFP eeprom in max 16 byte increments to avoid problems with
some SFP modules, from Russell King.
3) Fix UDP socket lookup wrt. VRF, from Tim Beale.
4) Handle route invalidation properly in s390 qeth driver, from Julian
Wiedmann.
5) Memory leak on unload in RDS, from Zhu Yanjun.
6) sctp_process_init leak, from Neil HOrman.
7) Fix fib_rules rule insertion semantic change that broke Android,
from Hangbin Liu.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
pktgen: do not sleep with the thread lock held.
net: mvpp2: Use strscpy to handle stat strings
net: rds: fix memory leak in rds_ib_flush_mr_pool
ipv6: fix EFAULT on sendto with icmpv6 and hdrincl
ipv6: use READ_ONCE() for inet->hdrincl as in ipv4
Revert "fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied"
net: aquantia: fix wol configuration not applied sometimes
ethtool: fix potential userspace buffer overflow
Fix memory leak in sctp_process_init
net: rds: fix memory leak when unload rds_rdma
ipv6: fix the check before getting the cookie in rt6_get_cookie
ipv4: not do cache for local delivery if bc_forwarding is enabled
s390/qeth: handle error when updating TX queue count
s390/qeth: fix VLAN attribute in bridge_hostnotify udev event
s390/qeth: check dst entry before use
s390/qeth: handle limited IPv4 broadcast in L3 TX path
net: fix indirect calls helpers for ptype list hooks.
net: ipvlan: Fix ipvlan device tso disabled while NETIF_F_IP_CSUM is set
udp: only choose unbound UDP socket for multicast when not in a VRF
net/tls: replace the sleeping lock around RX resync with a bit lock
...
Use a safe strscpy call to copy the ethtool stat strings into the
relevant buffers, instead of a memcpy that will be accessing
out-of-bound data.
Fixes: 118d6298f6 ("net: mvpp2: add ethtool GOP statistics")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The phylink conflict was between a bug fix by Russell King
to make sure we have a consistent PHY interface mode, and
a change in net-next to pull some code in phylink_resolve()
into the helper functions phylink_mac_link_{up,down}()
On the dp83867 side it's mostly overlapping changes, with
the 'net' side removing a condition that was supposed to
trigger for RGMII but because of how it was coded never
actually could trigger.
Signed-off-by: David S. Miller <davem@davemloft.net>