Add support for the VTU Load Purge operation and implement the
port_vlan_del driver function to remove a port from a VLAN entry, and
delete the VLAN if the given port was its last member.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an helper function to read the next valid VLAN entry for a given
port. It is used in the VID to FID conversion function to retrieve the
forwarding database assigned to a given VLAN port.
Finally update the FDB getnext operation to iterate on the next valid
port VLAN when the end of the current database is reached.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the port_pvid_get and vlan_getnext driver functions required
to dump VLAN entries from the hardware, with the VTU Get Next operation.
Some functions and structure will be shared with STU operations, since
their table format are similar (e.g. STU data entries are accessible
with the same registers as VTU entries, except with an offset of 2).
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the VTU Flush operation (which also flushes the STU), so that
warm boots won't preserved old entries.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new functions in DSA drivers to access hardware VLAN entries through
SWITCHDEV_OBJ_PORT_VLAN objects:
- port_pvid_get() and vlan_getnext() to dump a VLAN
- port_vlan_del() to exclude a port from a VLAN
- port_pvid_set() and port_vlan_add() to join a port to a VLAN
The DSA infrastructure will ensure that each VLAN of the given range
does not already belong to another bridge. If it does, it will fallback
to software VLAN and won't program the hardware.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Like the ipv4 patch with a similar title, this adds a sysctl to allow
the user to change routing behavior based on whether or not the
interface associated with the nexthop was an up or down link. The
default setting preserves the current behavior, but anyone that enables
it will notice that nexthops on down interfaces will no longer be
selected:
net.ipv6.conf.all.ignore_routes_with_linkdown = 0
net.ipv6.conf.default.ignore_routes_with_linkdown = 0
net.ipv6.conf.lo.ignore_routes_with_linkdown = 0
...
When the above sysctls are set, not only will link status be reported to
userspace, but an indication that a nexthop is dead and will not be used
is also reported.
1000::/8 via 7000::2 dev p7p1 metric 1024 dead linkdown pref medium
1000::/8 via 8000::2 dev p8p1 metric 1024 pref medium
7000::/8 dev p7p1 proto kernel metric 256 dead linkdown pref medium
8000::/8 dev p8p1 proto kernel metric 256 pref medium
9000::/8 via 8000::2 dev p8p1 metric 2048 pref medium
9000::/8 via 7000::2 dev p7p1 metric 1024 dead linkdown pref medium
fe80::/64 dev p7p1 proto kernel metric 256 dead linkdown pref medium
fe80::/64 dev p8p1 proto kernel metric 256 pref medium
This also adds devconf support and notification when sysctl values
change.
v2: drop use of rt6i_nhflags since it is not needed right now
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support to track current link status of ipv6 nexthops to match
recent changes that added support for ipv4 nexthops. This takes a
simple approach to track linkdown status for next-hops and simply
checks the dev for the dst entry and sets proper flags that to be used
in the netlink message.
v2: drop use of rt6i_nhflags since it is not needed right now
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Handle TRACE_PKT, stack can sniff them on the first port
Add debubfs enrty to configure tracing for offload traffic like iWARP
& iSCSI for debugging purpose.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rocker driver tracks arp_tbl neighs to resolve IPv4 route nexthops. The
driver uses NETEVENT_NEIGH_UPDATE for neigh adds and updates, but there is
no event when the neigh is removed from the device (such as when the device
goes admin down). This patches hooks ndo_neigh_destroy so the driver can
know when a neigh is removed from the device. In response, the driver will
purge the neigh entry from its internal tbl.
I didn't find an in-tree users of ndo_neigh_destroy, so I'm not sure if
this ndo is vestigial or if there are out-of-tree users. In any case, it
does what I need here. An alternative design would be to generate
NETEVENT_NEIGH_UPDATE event when neigh is being destroyed, setting state to
NUD_NONE so driver knows neigh entry is dead.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On sucessful probe, driver prints the switch ID. This patch changes the
format of the printed ID to match what's used in sysfs phys_switch_id node.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeremy Linton says:
====================
Enable smsc911x for use with ACPI
This set of patches enables the front Ethernet port on the
ARM Juno development platform when used with an ACPI enabled kernel.
These patches covert the of_property* calls in the driver to the
DT/ACPI agnostic device_property* calls, and add the arm hardware
id to the acpi_match_table.
To support the above changes I copied a couple routines from
of_net into the properties.c file, and modified them to
be ACPI/DT agnostic. I'm not 100% sure this is the correct location
for these functions. But I think they are required to avoid having
a dozen different implementations scattered across assorted Ethernet
adapters that are being enabled to use ACPI properties.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ACPI bindings for the smsc911x driver. Convert the DT specific calls
to nonspecific device* calls, This allows the driver to work
with both ACPI and DT configurations. Ethernet should now work when using
ACPI on ARM Juno.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Graeme Gregory <graeme.gregory@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
OF has some helper functions for parsing MAC and PHY settings.
In cases where the platform is providing this information rather
than the device itself, there needs to be similar functions for ACPI.
These functions are slightly modified versions of the ones in
of_net which can use information provided via DT or ACPI.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung Cheng says:
====================
minor tail loss probe improvements
This patch series enhance the tail loss probe (TLP) on some error
conditions. When TLP fails to send a probe, it will no longer
extend the RTO. When it fails to send a new packet because of
receiver window limit, it'll try to retransmit the last packet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When TLP fails to send new packet because of receive window
limit, it should fall back to retransmit the last packet instead.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If TLP was unable to send a probe, it extended the RTO to
now + icsk_rto. But extending the RTO makes little sense
if no TLP probe went out. With this commit, instead of
extending the RTO we re-arm it relative to the transmit time
of the write queue head.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N says:
====================
Add AM335x PG1.0 CPSW errata workaround
With commit 870915feab ("drivers: net: cpsw: remove
disable_irq/enable_irq as irq can be masked from cpsw itself"),
CPSW on AM335x beagle bone white is broken as there is a errata
for AM335x PG1.0. This patch series implements the workaround by
disabling the interrupts from ARM IRQ controller for AM335x SoC
in addition to the masking of interrupts in CPSW.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
CPSW driver has been updated with compatibles for enabling errata
workarounds. So updating cpsw compatibles.
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CPSW driver has been updated with compatibles for enabling errata
workarounds. So updating cpsw compatibles.
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As per Am335x Errata [1] Advisory 1.0.9, The CPSW C0_TX_PEND and
C0_RX_PEND interrupt outputs provide a single transmit interrupt
that combines transmit channel interrupts TXPEND[7:0] and a
single receive interrupt that combines receive channel interrupts
RXPEND[7:0]. The TXPEND[0] and RXPEND[0] interrupt outputs are
connected to the ARM Cortex-A8 interrupt controller (INTC) rather
than the C0_TX_PEND and C0_RX_PEND interrupt outputs. So even
though CPSW interrupt is cleared by writing appropriate values to
EOI register the interrupt is not cleared in IRQ controller. So
interrupt is still pending and CPU is struck in ISR, the
workaround is to disable the interrupts in ARM irq controller.
[1] http://www.ti.com/lit/er/sprz360f/sprz360f.pdf
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/cavium/Kconfig
The cavium conflict was overlapping dependency
changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
metadata snapshots aren't widely used but help provide a consistent
view of the metadata associated with an active thin-pool.
- a dm-cache fix for the 4.2 "default" policy switch from "mq" to "smq"
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJVzPl/AAoJEMUj8QotnQNa2MUH/24BKs8WYNy79B9pr6H41ouQ
LAXFVfx3Vy7n3U1KrZcPylXb7R0feXphUrSXJ6ZlKz/Df+AhG5ycUOCfSZxSDpA9
pX5rtWFSWLXGF3wRf0GXqB3CXkrMKlNWVAUlzge39TpIzpdkbh9reM9He4yQphl9
u0QVvIOTQsUrtN9htnQb7Dgmm7qD6NinrebOrWTiOqBdCh3Iw4J48RsYggYOG0ZP
fbMnpZlH537vgLyKaz72Xh+kMrYmP5DBd1f+iAUBVKk12HQVQNMzEwOgkEB0M+4l
EG3c4O3zMVrBxXdxsI2srFZfLX2XeUYAuMx8cfVSSCDEzQwjC0kmXVdkCYuAjAk=
=r/53
-----END PGP SIGNATURE-----
Merge tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- two stable fixes for corruption seen in a snapshot of thinp metadata;
metadata snapshots aren't widely used but help provide a consistent
view of the metadata associated with an active thin-pool.
- a dm-cache fix for the 4.2 "default" policy switch from "mq" to "smq"
* tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache policy smq: move 'dm-cache-default' module alias to SMQ
dm btree: add ref counting ops for the leaves of top level btrees
dm thin metadata: delete btrees when releasing metadata snapshot
Pull xen block driver fixes from Jens Axboe:
"A few small bug fixes for xen-blk{front,back} that have been sitting
over my vacation"
* 'for-linus' of git://git.kernel.dk/linux-block:
xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()
xen-blkfront: don't add indirect pages to list when !feature_persistent
xen-blkfront: introduce blkfront_gather_backend_features()
- Revert a fix from 4.2-rc5 that was causing lots of WARNING spam.
- Fix a memory leak affecting backends in HVM guests.
- Fix PV domU hang with certain configurations.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJVzHsAAAoJEFxbo/MsZsTR9Y0H/2j1PHt29RPcNdgGQ84AH0Wh
tw1emL8rMcdhWQnsO7bNmywNNvRNQnU3ZJ8dzoq+5GPikNsbfQzYc7U2pIL4A+gB
AAJsNDNzecuq4srk8vNxcmZ7ySvm9w6dccDUex2ge3sNWaq6gzSQvz6FSWiL0Sxg
k3JcnemEg6JrYOTWdxKInAORMcRO6rgx9eIsdPUPOpgC5XLg6/mZOqBAWXIksDvs
V9uCMqQicaUgBgKFIOSllqH6fcCNooRu3aDwNNj/2mMcJmEvMeBkHmNlQgEm2j5L
ubdDyrC5y48TUPJm8i3+W2/AY+kgWzhThcqyVy6LRAAj5RItJFxMf0nMXzIEqlQ=
=UgMy
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel:
- revert a fix from 4.2-rc5 that was causing lots of WARNING spam.
- fix a memory leak affecting backends in HVM guests.
- fix PV domU hang with certain configurations.
* tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/xenbus: Don't leak memory when unmapping the ring on HVM backend
Revert "xen/events/fifo: Handle linked events when closing a port"
x86/xen: build "Xen PV" APIC driver for domU as well
This reverts commits 9a036b93a3 ("x86/signal/64: Remove 'fs' and 'gs'
from sigcontext") and c6f2062935 ("x86/signal/64: Fix SS handling for
signals delivered to 64-bit programs").
They were cleanups, but they break dosemu by changing the signal return
behavior (and removing 'fs' and 'gs' from the sigcontext struct - while
not actually changing any behavior - causes build problems).
Reported-and-tested-by: Stas Sergeev <stsp@list.ru>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking fixes from David Miller:
1) Workaround hw bug when acquiring PCI bos ownership of iwlwifi
devices, from Emmanuel Grumbach.
2) Falling back to vmalloc in conntrack should not emit a warning, from
Pablo Neira Ayuso.
3) Fix NULL deref when rtlwifi driver is used as an AP, from Luis
Felipe Dominguez Vega.
4) Rocker doesn't free netdev on device removal, from Ido Schimmel.
5) UDP multicast early sock demux has route handling races, from Eric
Dumazet.
6) Fix L4 checksum handling in openvswitch, from Glenn Griffin.
7) Fix use-after-free in skb_set_peeked, from Herbert Xu.
8) Don't advertize NETIF_F_FRAGLIST in virtio_net driver, this can lead
to fraglists longer than the driver can support. From Jason Wang.
9) Fix mlx5 on non-4k-pagesize systems, from Carol L Soto.
10) Fix interrupt storm in bna driver, from Ivan Vecera.
11) Don't propagate -EBUSY from netlink_insert(), from Daniel Borkmann.
12) Fix inet request sock leak, from Eric Dumazet.
13) Fix TX interrupt masking and marking in TX descriptors of fs_enet
driver, from LEROY Christophe.
14) Get rid of rule optimizer in gianfar driver, it's buggy and unlikely
to get fixed any time soon. From Jakub Kicinski
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
cosa: missing error code on failure in probe()
gianfar: remove faulty filer optimizer
gianfar: correct list membership accounting
gianfar: correct filer table writing
bonding: Gratuitous ARP gets dropped when first slave added
net: dsa: Do not override PHY interface if already configured
net: fs_enet: mask interrupts for TX partial frames.
net: fs_enet: explicitly remove I flag on TX partial frames
inet: fix possible request socket leak
inet: fix races with reqsk timers
mkiss: Fix error handling in mkiss_open()
bnx2x: Free NVRAM lock at end of each page
bnx2x: Prevent null pointer dereference on SKB release
cxgb4: missing curly braces in t4_setup_debugfs()
net-timestamp: Update skb_complete_tx_timestamp comment
ipv6: don't reject link-local nexthop on other interface
netlink: make sure -EBUSY won't escape from netlink_insert
bna: fix interrupts storm caused by erroneous packets
net: mvpp2: replace TX coalescing interrupts with hrtimer
net: mvpp2: enable proper per-CPU TX buffers unmapping
...
missed during the conversion a couple of years ago.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVzBnmAAoJEBLB8Bhh3lVKCdwP+welT0JP1MkElR0KPTyrfi98
xn1NSO6WfEuXeYQWCUKImsEpMGMulIzc2MW/+KD2aYwnT+iUG2S/etOC6nCEzbq4
vx0CTKyNAMz55YHZqWXXCUh563WvJczE56J5FQK2tTqv4jaWUKFxPI6tZ+zvDfoR
phLHRAwaHcaSeD7eaFznNsggF5lEJthzzM7jF2d+g2TO6nbVRZWVo4lhKmV7wOj8
+/fkPzUFrY+pRvE3Ceop10hT1UY6t6mi3xQFPcU3f+gY9DIr8VFZF1soyLnKMSqi
SBCiiHzVUhLpSew5JBtvjxL3vSu9vJAFXwBFY4FCkwzIdzyrwNCnH3roDmCGuSuu
Kj9KHh6BIpBCI5njgvtsx9Ou6XNPwEtZojd2VUfh3BIkZEtXH1+UjIgoOaGuhg4A
H5kzdEs/w2sfXX5MAJwh5F97qKXJbJ656IgdE7ZDNIhUN6EQLWet37mLkXADtEr1
0yn//KsidWAk4FqYMSBs0RAHYd7vipm6GodLOTxwejpMWrp1nUS5KhRaSu/0k1u0
gwEmR1BtwO+mv0o0tg9YGYtmYdZ3X+cYrozIFF1M3RzFd1FzqgEIIpQRzQ66B3Wm
zPnfvr8EXscQIcxgF/AwRGrf8Yc0mvN3FMctQkEfV9rNEsCwNJdv2xSu+lOm3S/Y
qhbkLxg35KonH1v04w0f
=AM3b
-----END PGP SIGNATURE-----
Merge tag 'edac_fix_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Pull EDAC fix from Borislav Petkov:
"A ppc4xx_edac fix for accessing ->csrows properly. This driver was
missed during the conversion a couple of years ago"
* tag 'edac_fix_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, ppc4xx: Access mci->csrows array elements properly
The commit
de3910eb79 ("edac: change the mem allocation scheme to
make Documentation/kobject.txt happy")
changed the memory allocation for the csrows member. But ppc4xx_edac was
forgotten in the patch. Fix it.
Signed-off-by: Michael Walle <michael@walle.cc>
Cc: <stable@vger.kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Link: http://lkml.kernel.org/r/1437469253-8611-1-git-send-email-michael@walle.cc
Signed-off-by: Borislav Petkov <bp@suse.de>
If register_hdlc_device() fails, the current code returns 0 but we
should return an error code instead.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A few things have changed since the previous version of the vxlan
documentation was written, so update it and correct some grammar and
such while we are at it.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to use the IS_ERR_VALUE() macro for checking
the return value from pm_runtime_* functions.
Just do a simple negative test instead.
The semantic patch that makes this change is available
in scripts/coccinelle/api/pm_runtime.cocci.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai says:
====================
Add some more debug info
This patch series adds the following.
Add more info for sge_qinfo dump
Differentiate tid and stids between different regions, and add a debugfs
entry to dump all the tid info
This patch series has been created against net-next tree and includes
patches on cxgb4 driver.
We have included all the maintainers of respective drivers. Kindly review
the change and let us know in case of any review comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add debugfs support to dump tid info like stid, sftid, tids, atid and
hwtids
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For T4 adapter, offloaded servers tid for IPv4 connections are
allocated from filter region. So add a new field for server filter tid if
server tid is allocated from filter region.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the tid info, differentiate from which region the TID is allocated
from. It can be from TCAM region or HASH region.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding more details to sge qinfo for debugging purpose.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a system has multiple ethernet devices and during DHCP
request (for using NFS), the system waits only for HZ/2 which is
500mS before switching to another interface for DHCP.
There are some routers (Ex: Trendnet routers) which responds to
DHCP request at about 560mS. When the system has only one
ethernet interface there is no issue as the timeout is 2S and the
dev xid doesn't changes and only retries.
But when the system has multiple Ethernet like DRA74x with CPSW
in dual EMAC mode, the DHCP response is dropped as the dev xid
changes while shifting to the next device. So changing inter
device timeout to HZ (which is 1S).
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two improvements in this patch:
1. Fix the build warnings;
2. Add function read_trace_pipe() to print the result on
the screen;
Before this patch, we can get the result through /sys/kernel/de
bug/tracing/trace_pipe and get nothing on the screen.
By applying this patch, the result can be printed on the screen.
$ ./tracex6
...
tracex6-705 [003] d..1 131.428593: : CPU-3 19981414
sshd-683 [000] d..1 131.428727: : CPU-0 221682321
sshd-683 [000] d..1 131.428821: : CPU-0 221808766
sshd-683 [000] d..1 131.428950: : CPU-0 221982984
sshd-683 [000] d..1 131.429045: : CPU-0 222111851
tracex6-705 [003] d..1 131.429168: : CPU-3 20757551
sshd-683 [000] d..1 131.429170: : CPU-0 222281240
sshd-683 [000] d..1 131.429261: : CPU-0 222403340
sshd-683 [000] d..1 131.429378: : CPU-0 222561024
...
Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This BQL implementation is mostly derived from its related driver, alx.
Tested on AR8131 (rev c0) [1969:1063]. Saturated a 100mbps link with 5
concurrent runs of netperf. Ping latency dropped from 14ms to 3ms.
Signed-off-by: Ron Angeles <ronangeles@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
gianfar: filer changes
respinning with examples as requested.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Current filer rule optimization is broken in several ways:
(1) Can perform reads/writes beyond end of allocated tables.
(gianfar_ethtool.c:1326).
(2) It breaks badly for rules with more than 2 specifiers
(e.g. matching ip, port, tos).
Example:
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1 tos 1 action 1
Added rule with ID 254
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2 tos 2 action 9
Added rule with ID 253
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3 tos 3 action 17
Added rule with ID 252
# ./filer_decode /sys/kernel/debug/gfar1/filer_raw
00: MASK == 00000210 AND Q:00 ctrl:00000080 prop:00000210
01: FPR == 00000210 AND CLE Q:00 ctrl:00000281 prop:00000210
02: MASK == ffffffff AND Q:00 ctrl:00000080 prop:ffffffff
03: DPT == 00000003 AND Q:00 ctrl:0000008e prop:00000003
04: TOS == 00000003 AND Q:00 ctrl:0000008a prop:00000003
05: DIA == 0a000003 AND Q:11 ctrl:0000448c prop:0a000003
06: DPT == 00000002 AND Q:00 ctrl:0000008e prop:00000002
07: TOS == 00000002 AND Q:00 ctrl:0000008a prop:00000002
08: DIA == 0a000002 AND Q:09 ctrl:0000248c prop:0a000002
09: DIA == 0a000001 AND Q:00 ctrl:0000008c prop:0a000001
0a: DPT == 00000001 AND Q:00 ctrl:0000008e prop:00000001
0b: TOS == 00000001 CLE Q:01 ctrl:0000060a prop:00000001
ff: MASK >= 00000000 Q:00 ctrl:00000020 prop:00000000
(Entire cluster gets AND-ed together).
(3) We observed that the masking rules it generates do not
play well with clustering on P2020. Only first rule
of the cluster would ever fire. Given that optimizer
relies heavily on masking this is very hard to fix.
Example:
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1 action 1
Added rule with ID 254
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2 action 9
Added rule with ID 253
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3 action 17
Added rule with ID 252
# ./filer_decode /sys/kernel/debug/gfar1/filer_raw
00: MASK == 00000210 AND Q:00 ctrl:00000080 prop:00000210
01: FPR == 00000210 AND CLE Q:00 ctrl:00000281 prop:00000210
02: MASK == ffffffff AND Q:00 ctrl:00000080 prop:ffffffff
03: DPT == 00000003 AND Q:00 ctrl:0000008e prop:00000003
04: DIA == 0a000003 Q:11 ctrl:0000440c prop:0a000003
05: DPT == 00000002 AND Q:00 ctrl:0000008e prop:00000002
06: DIA == 0a000002 Q:09 ctrl:0000240c prop:0a000002
07: DIA == 0a000001 AND Q:00 ctrl:0000008c prop:0a000001
08: DPT == 00000001 CLE Q:01 ctrl:0000060e prop:00000001
ff: MASK >= 00000000 Q:00 ctrl:00000020 prop:00000000
Which looks correct according to the spec but only the first
(eth id 252)/last added rule for 10.0.0.3 will ever trigger.
As if filer did not treat the AND CLE as cluster start but
also kept AND-ing the rules. We found no errata covering this.
The fact that nobody noticed (2) or (3) makes me think
that this feature is not very widely used and we should just
remove it.
Reported-by: Aleksander Dutkowski <adutkowski@gmail.com>
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Acked-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At a cost of one line let's make sure .count is correct
when calling gfar_process_filer_changes().
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
MAX_FILER_IDX is the last usable index. Using less-than
will already guarantee that one entry for catch-all rule
will be left, no need to subtract 1 here.
Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
This enables the use of ethtool --set-channels devname combined N to
change the number of vRSS queues. Separate rx, tx, and other parameters
are not supported. The maximum is rsscap.num_recv_que. It passes the
given value to rndis_filter_device_add through the device_info->num_chn
field.
If the procedure fails, it attempts to recover to the prior state. If
the recovery fails, it logs an error and aborts.
Current num_chn is saved and restored when changing the MTU.
Signed-off-by: Andrew Schwartzmeyer <andschwa@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Uses device_info->num_chn to pass user provided number of vRSS
queues (from ethtool --set-channels) to rndis_filter_device_add. If
nonzero and less than the maximum, set net_device->num_chn to the given
value; else default to prior algorithm.
Always initialize struct device_info to 0, otherwise not all its fields
are guaranteed to be 0, which is necessary when checking if num_chn has
been purposefully set.
Signed-off-by: Andrew Schwartzmeyer <andschwa@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the first slave is added (such as during bootup) the first
gratuitous ARP gets dropped. We don't see this drop during a failover.
The packet gets dropped in qdisc (noop_enqueue).
The fix is to delay the sending of gratuitous ARPs till the bond dev's
carrier is present.
It can also be worked around by setting num_grat_arp to more than 1.
Signed-off-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>