Commit Graph

769375 Commits

Author SHA1 Message Date
Pablo Neira Ayuso
483f3fdcc7 netfilter: nft_tunnel: fix sparse errors
[...]
net/netfilter/nft_tunnel.c:117:25:    expected unsigned int [unsigned] [usertype] flags
net/netfilter/nft_tunnel.c:117:25:    got restricted __be16 [usertype] <noident>
[...]
net/netfilter/nft_tunnel.c:246:33:    expected restricted __be16 [addressable] [assigned] [usertype] tp_dst
net/netfilter/nft_tunnel.c:246:33:    got int

Fixes: af308b94a2 ("netfilter: nf_tables: add tunnel support")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-04 00:53:29 +02:00
Florian Westphal
020f6cc5f7 netfilter: conntrack: avoid use-after free on rmmod
When the conntrack module is removed, we call nf_ct_iterate_destroy via
nf_ct_l4proto_unregister().

Problem is that nf_conntrack_proto_fini() gets called after the
conntrack hash table has already been freed.

Just remove the l4proto unregister call, its unecessary as the
nf_ct_protos[] array gets free'd right after anyway.

v2: add comment wrt. missing unreg call.

Fixes: a0ae2562c6 ("netfilter: conntrack: remove l3proto abstraction")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03 21:15:13 +02:00
Florian Westphal
7bdfcea875 netfilter: kconfig: remove ct zone/label dependencies
connection tracking zones currently depend on the xtables CT target.
The reasoning was that it makes no sense to support zones if they can't
be configured (which needed CT target).

Nowadays zones can also be used by OVS and configured via nftables,
so remove the dependency.

connection tracking labels are handled via hidden dependency that gets
auto-selected by the connlabel match.
Make it a visible knob, as labels can be attached via ctnetlink
or via nftables rules (nft_ct expression) too.

This allows to use conntrack labels and zones with nftables-only build.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03 21:15:12 +02:00
Pablo Neira Ayuso
445509eb9b netfilter: nf_tables: simplify NLM_F_CREATE handling
* From nf_tables_newchain(), codepath provides context that allows us to
  infer if we are updating a chain (in that case, no module autoload is
  required) or adding a new one (then, module autoload is indeed
  needed).
* We only need it in one single spot in nf_tables_newrule().
* Not needed for nf_tables_newset() at all.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03 21:15:11 +02:00
Máté Eckl
94276fa8a2 netfilter: bridge: Expose nf_tables bridge hook priorities through uapi
Netfilter exposes standard hook priorities in case of ipv4, ipv6 and
arp but not in case of bridge.

This patch exposes the hook priority values of the bridge family (which are
different from the formerly mentioned) via uapi so that they can be used by
user-space applications just like the others.

Signed-off-by: Máté Eckl <ecklm94@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03 21:15:09 +02:00
Pablo Neira Ayuso
aaecfdb5c5 netfilter: nf_tables: match on tunnel metadata
This patch allows us to match on the tunnel metadata that is available
of the packet. We can use this to validate if the packet comes from/goes
to tunnel and the corresponding tunnel ID.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03 21:12:19 +02:00
Pablo Neira Ayuso
af308b94a2 netfilter: nf_tables: add tunnel support
This patch implements the tunnel object type that can be used to
configure tunnels via metadata template through the existing lightweight
API from the ingress path.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03 21:12:12 +02:00
Máté Eckl
033eab53ff netfilter: nft_tproxy: Add missing config check
A config check was missing form the code when using
nf_defrag_ipv6_enable with NFT_TPROXY != n and NF_DEFRAG_IPV6 = n and
this caused the following error:

../net/netfilter/nft_tproxy.c: In function 'nft_tproxy_init':
../net/netfilter/nft_tproxy.c:237:3: error: implicit declaration of function
+'nf_defrag_ipv6_enable' [-Werror=implicit-function-declaration]
   err = nf_defrag_ipv6_enable(ctx->net);

This patch adds a check for NF_TABLES_IPV6 when NF_DEFRAG_IPV6 is
selected by Kconfig.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: 4ed8eb6570 ("netfilter: nf_tables: Add native tproxy support")
Signed-off-by: Máté Eckl <ecklm94@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03 20:20:53 +02:00
Harsha Sharma
c753032690 netfilter: cttimeout: Make NF_CT_NETLINK_TIMEOUT depend on NF_CONNTRACK_TIMEOUT
With this, remove ifdef for CONFIG_NF_CONNTRACK_TIMEOUT in
nfnetlink_cttimeout. This is also required for moving ctnl_untimeout
from nfnetlink_cttimeout to nf_conntrack_timeout.

Signed-off-by: Harsha Sharma <harshasharmaiitr@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03 18:50:41 +02:00
YueHaibing
1974d2453f netfilter: nf_tables: remove unused variable
Variable 'ext' is being assigned but are never used hence they are
unused and can be removed.

Cleans up clang warnings:
net/netfilter/nf_tables_api.c:4032:28: warning: variable ‘ext’ set but not used [-Wunused-but-set-variable]

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03 18:50:35 +02:00
Florian Westphal
9e619d87b2 netfilter: nf_tables: flow event notifier must use transaction mutex
Fixes: f102d66b33 ("netfilter: nf_tables: use dedicated mutex to guard transactions")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03 18:38:31 +02:00
Fernando Fernandez Mancera
ddba40be59 netfilter: nfnetlink_osf: rename nf_osf header file to nfnetlink_osf
The first client of the nf_osf.h userspace header is nft_osf, coming in
this batch, rename it to nfnetlink_osf.h as there are no userspace
clients for this yet, hence this looks consistent with other nfnetlink
subsystem.

Suggested-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03 18:38:30 +02:00
Fernando Fernandez Mancera
7cca1ed0bb netfilter: nf_osf: move nf_osf_fingers to non-uapi header file
All warnings (new ones prefixed by >>):

>> ./usr/include/linux/netfilter/nf_osf.h:73: userspace cannot reference function or variable defined in the kernel

Fixes: f932495208 ("netfilter: nfnetlink_osf: extract nfnetlink_subsystem code from xt_osf.c")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03 18:38:29 +02:00
Li RongQing
285189c78e netfilter: use kvmalloc_array to allocate memory for hashtable
nf_ct_alloc_hashtable is used to allocate memory for conntrack,
NAT bysrc and expectation hashtable. Assuming 64k bucket size,
which means 7th order page allocation, __get_free_pages, called
by nf_ct_alloc_hashtable, will trigger the direct memory reclaim
and stall for a long time, when system has lots of memory stress

so replace combination of __get_free_pages and vzalloc with
kvmalloc_array, which provides a overflow check and a fallback
if no high order memory is available, and do not retry to reclaim
memory, reduce stall

and remove nf_ct_free_hashtable, since it is just a kvfree

Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: Wang Li <wangli39@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-08-03 18:37:55 +02:00
Máté Eckl
4ed8eb6570 netfilter: nf_tables: Add native tproxy support
A great portion of the code is taken from xt_TPROXY.c

There are some changes compared to the iptables implementation:
 - tproxy statement is not terminal here
 - Either address or port has to be specified, but at least one of them
   is necessary. If one of them is not specified, the evaluation will be
   performed with the original attribute of the packet (ie. target port
   is not specified => the packet's dport will be used).

To make this work in inet tables, the tproxy structure has a family
member (typically called priv->family) which is not necessarily equal to
ctx->family.

priv->family can have three values legally:
 - NFPROTO_IPV4 if the table family is ip OR if table family is inet,
   but an ipv4 address is specified as a target address. The rule only
   evaluates ipv4 packets in this case.
 - NFPROTO_IPV6 if the table family is ip6 OR if table family is inet,
   but an ipv6 address is specified as a target address. The rule only
   evaluates ipv6 packets in this case.
 - NFPROTO_UNSPEC if the table family is inet AND if only the port is
   specified. The rule will evaluate both ipv4 and ipv6 packets.

Signed-off-by: Máté Eckl <ecklm94@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-30 14:07:12 +02:00
Fernando Fernandez Mancera
b96af92d6e netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf
Add basic module functions into nft_osf.[ch] in order to implement OSF
module in nf_tables.

Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-30 14:07:11 +02:00
Fernando Fernandez Mancera
f932495208 netfilter: nfnetlink_osf: extract nfnetlink_subsystem code from xt_osf.c
Move nfnetlink osf subsystem from xt_osf.c to standalone module so we can
reuse it from the new nft_ost extension.

Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-30 14:07:11 +02:00
Fernando Fernandez Mancera
f6b7b5f4f3 netfilter: nf_osf: rename nf_osf.c to nfnetlink_osf.c
Rename nf_osf.c to nfnetlink_osf.c as we introduce nfnetlink_osf which is
the OSF infraestructure.

Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-30 14:07:10 +02:00
YueHaibing
33b78aaa44 netfilter: use PTR_ERR_OR_ZERO()
Fix ptr_ret.cocci warnings:

  net/netfilter/xt_connlimit.c:96:1-3: WARNING: PTR_ERR_OR_ZERO can be used
  net/netfilter/nft_numgen.c:240:1-3: WARNING: PTR_ERR_OR_ZERO can be used

Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR

Generated by: scripts/coccinelle/api/ptr_ret.cocci

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-30 14:07:09 +02:00
Pablo Neira Ayuso
51c23b47e6 netfilter: nf_osf: add nf_osf_find()
This new function returns the OS genre as a string. Plan is to use to
from the new nft_osf extension.

Note that this doesn't yet support ttl options, but it could be easily
extended to do so.

Tested-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-30 14:06:59 +02:00
Florian Westphal
222440b4e8 netfilter: nf_tables: handle meta/lookup with direct call
Currently nft uses inlined variants for common operations
such as 'ip saddr 1.2.3.4' instead of an indirect call.

Also handle meta get operations and lookups without indirect call,
both are builtin.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-07-30 11:52:02 +02:00
David S. Miller
ecbcd689d7 mlx5e-updates-2018-07-26 (XDP redirect)
This series from Tariq adds the support for device-out XDP redirect.
 
 Start with a simple RX and XDP cleanups:
 - Replace call to MPWQE free with dealloc in interface down flow
 - Do not recycle RX pages in interface down flow
 - Gather all XDP pre-requisite checks in a single function
 - Restrict the combination of large MTU and XDP
 
 Since now XDP logic is going to be called from TX side as well,
 generic XDP TX logic is not RX only anymore, for that Tariq creates
 a new xdp.c file and moves XDP related code into it, and generalizes
 the code to support XDP TX for XDP redirect, such as the xdp tx sq
 structures and xdp counters.
 
 XDP redirect support:
 Add implementation for the ndo_xdp_xmit callback.
 
 Dedicate a new set of XDP-SQ instances to satisfy the XDP_REDIRECT
 requests.  These instances are totally separated from the existing
 XDP-SQ objects that satisfy local XDP_TX actions.
 
 Performance tests:
 
 xdp_redirect_map from ConnectX-5 to ConnectX-5.
 CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
 Packet-rate of 64B packets.
 
 Single queue: 7 Mpps.
 Multi queue: 55 Mpps.
 
 -Saeed.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbWk7YAAoJEEg/ir3gV/o+b+4H/RKfHrDgqRcz+qw2hYqMc/0Q
 N/s6SaZm/AZi63lOwUeeSGORJSGYxxbwcWFNuI6CZrHEupi0zQivDko+E3fTPeDQ
 3Y7jAaLjB0ZsWwEukcgZ1PG1QIYUg2/EBijG+aah9q/TuUfsFYJDRImlXRi5zvSe
 34yuLls6ctuoxoet5S5SVWFssdQZ2YpmmMNqVmQIRGDSmd03dhE1WUGUGmPhR5mK
 vxUBL5X3ucb1lcqxpFnZqsrIZphxkimX0q8qUK1yHk7VU4JXZgikPkRFM1mnlNt0
 YosvDlnDqJugBvYspBUx+9au8wxgPwpHEbnGKdS/npsIuNXa0ZFnEky1tp6CzV0=
 =3CNK
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-updates-2018-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-07-26 (XDP redirect)

This series from Tariq adds the support for device-out XDP redirect.

Start with a simple RX and XDP cleanups:
- Replace call to MPWQE free with dealloc in interface down flow
- Do not recycle RX pages in interface down flow
- Gather all XDP pre-requisite checks in a single function
- Restrict the combination of large MTU and XDP

Since now XDP logic is going to be called from TX side as well,
generic XDP TX logic is not RX only anymore, for that Tariq creates
a new xdp.c file and moves XDP related code into it, and generalizes
the code to support XDP TX for XDP redirect, such as the xdp tx sq
structures and xdp counters.

XDP redirect support:
Add implementation for the ndo_xdp_xmit callback.

Dedicate a new set of XDP-SQ instances to satisfy the XDP_REDIRECT
requests.  These instances are totally separated from the existing
XDP-SQ objects that satisfy local XDP_TX actions.

Performance tests:

xdp_redirect_map from ConnectX-5 to ConnectX-5.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Packet-rate of 64B packets.

Single queue: 7 Mpps.
Multi queue: 55 Mpps.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:33:24 -07:00
Jakub Kicinski
f61b6db378 netdevsim: make debug dirs' dentries static
The root directories of netdevsim should only be used by the core
to create per-device subdirectories, so limit their visibility to
the core file.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:28:54 -07:00
David S. Miller
472f5975f7 Merge branch 'docs-net-Convert-netdev-FAQ-to-RST'
Tobin C. Harding says:

====================
docs: net: Convert netdev-FAQ to RST

Jon answered all the tree questions on v1 so if you will please take
this through your tree that would be awesome.

v2:
 - Fix typo 'canonical_path_format' (thanks Edward)
 - Add patch fixing references netdev-FAQ
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:27:54 -07:00
Tobin C. Harding
287f4fa99a docs: Update references to netdev-FAQ
File 'Documentation/networking/netdev-FAQ.txt' has been converted to RST
format.  We should update all links/references to point to the new file.

Update references to netdev-FAQ

Signed-off-by: Tobin C. Harding <me@tobin.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:27:54 -07:00
Tobin C. Harding
96398ddf63 docs: net: Convert netdev-FAQ to restructured text
Preferred kernel docs format is now restructured text.  Convert
netdev-FAQ.txt to restructured text.

 - Add SPDX license identifier.

 - Change file heading 'Information you need to know about netdev' to
  'netdev FAQ' to better suit displayed index (in HTML).

 - Change question/answer layout to suit rst.  Copy format in
   Documentation/bpf/bpf_devel_QA.rst

 - Fix indentation of code snippets

 - If multiple consecutive URLs appear put them in a list (to maintain
  whitespace).

 - Use uniform spelling of 'bug fix' throughout document (not bugfix or
   bug-fix).

 - Add double back ticks to 'net' and 'net-next' when referring to the
   trees.

 - Use rst references for Documentation/ links.

 - Add rst label 'netdev-FAQ' for referencing by other docs files.

 - Remove stale entry from Documentation/networking/00-INDEX

Signed-off-by: Tobin C. Harding <me@tobin.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:27:54 -07:00
Tobin C. Harding
f58252cdc0 docs: Add rest label the_canonical_patch_format
In preparation to convert Documentation/network/netdev-FAQ.rst to
restructured text format.  We would like to be able to reference 'the
canonical patch format' section.

Add rest label: 'the_canonical_patch_format'.

Signed-off-by: Tobin C. Harding <me@tobin.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:27:54 -07:00
Jia-Ju Bai
d8ad2f31f8 net: adaptec: Replace mdelay() with msleep() in starfire_init_one()
starfire_init_one() is never called in atomic context.
It calls mdelay() to busily wait, which is not necessary.
mdelay() can be replaced with msleep().

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:24:23 -07:00
Jia-Ju Bai
055d624fac isdn: hisax: config: Replace GFP_ATOMIC with GFP_KERNEL
hisax_cs_new() and hisax_cs_setup() are never called in atomic context.
They call kmalloc() and kzalloc() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:23:47 -07:00
Jia-Ju Bai
87935aa776 isdn: hisax: callc: Replace GFP_ATOMIC with GFP_KERNEL in init_PStack()
init_PStack() is never called in atomic context.
It calls kmalloc() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:23:16 -07:00
Jia-Ju Bai
9d8009dee9 isdn: mISDN: netjet: Replace GFP_ATOMIC with GFP_KERNEL in nj_probe()
nj_probe() is never called in atomic context.
It calls kzalloc() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:21:23 -07:00
Jia-Ju Bai
8c957d66d2 isdn: mISDN: hfcpci: Replace GFP_ATOMIC with GFP_KERNEL in hfc_probe()
hfc_probe() is never called in atomic context.
It calls kzalloc() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:20:52 -07:00
YueHaibing
ff7b91262b net: hns: make hns_dsaf_roce_reset non static
hns_dsaf_roce_reset is exported and used in hns_roce_hw_v1.c
In commit 336a443bd9 ("net: hns: Make many functions static") I make
it static wrongly.

drivers/infiniband/hw/hns/hns_roce_hw_v1.o: In function `hns_roce_v1_reset':
hns_roce_hw_v1.c:(.text+0x37ac): undefined reference to `hns_dsaf_roce_reset'
hns_roce_hw_v1.c:(.text+0x37cc): undefined reference to `hns_dsaf_roce_reset'

Fixes: 336a443bd9 ("net: hns: Make many functions static")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:19:31 -07:00
Tariq Toukan
8ee4823356 net/mlx5e: TX, Use function to access sq_dma object in fifo
Use designated function mlx5e_dma_get() to get
the mlx5e_sq_dma object to be pushed into fifo.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:59 -07:00
Tariq Toukan
9a3956da1c net/mlx5e: TX, Move DB fields in TXQ-SQ struct
Pointers in DB are static, move them to read-only area so they
do not share a cacheline with fields modified in datapath.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:58 -07:00
Tariq Toukan
d3398a4f18 net/mlx5e: RX, Prefetch the xdp_frame data area
A loaded XDP program might write to the xdp_frame data area,
prefetchw() it to avoid a potential cache miss.

Performance tests:
ConnectX-5, XDP_TX packet rate, single ring.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz

Before: 13,172,976 pps
After:  13,456,248 pps
2% gain.

Fixes: 22f4539881 ("net/mlx5e: Support XDP over Striding RQ")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:57 -07:00
Tariq Toukan
58b99ee3e3 net/mlx5e: Add support for XDP_REDIRECT in device-out side
Add implementation for the ndo_xdp_xmit callback.

Dedicate a new set of XDP-SQ instances to satisfy the XDP_REDIRECT
requests.  These instances are totally separated from the existing
XDP-SQ objects that satisfy local XDP_TX actions.

Performance tests:

xdp_redirect_map from ConnectX-5 to ConnectX-5.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Packet-rate of 64B packets.

Single queue: 7 Mpps.
Multi queue: 55 Mpps.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:57 -07:00
Tariq Toukan
dac0d15fff net/mlx5e: Re-order fields of struct mlx5e_xdpsq
In the downstream patch that adds support to XDP_REDIRECT-out,
the XDP xmit frame function doesn't share the same run context as
the NAPI that polls the XDP-SQ completion queue.

Hence, need to re-order the XDP-SQ fields to avoid cacheline
false-sharing.

Take redirect_flush and doorbell out of DB, into separated
cachelines.

Add a cacheline breaker within the stats struct.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:56 -07:00
Tariq Toukan
890388ad6f net/mlx5e: Refactor XDP counters
Separate the XDP counters into two sets:
(1) One set reside in the RQ stats, and they monitor XDP stats
in the RQ side.
(2) Another set is per XDP-SQ, and they monitor XDP stats that
are related to XDP transmit flow.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:55 -07:00
Tariq Toukan
c94e4f117e net/mlx5e: Make XDP xmit functions more generic
Convert the XDP xmit functions to use the generic xdp_frame API
in XDP_TX flow.
Same functions will be used later in this series to transmit
the XDP redirect-out packets as well.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:55 -07:00
Tariq Toukan
86690b4b4a net/mlx5e: Add counter for XDP redirect in RX
Add per-ring and total stats for received packets that
goes into XDP redirection.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:54 -07:00
Tariq Toukan
159d213134 net/mlx5e: Move XDP related code into new XDP files
Take XDP code out of the general EN header and RX file into
new XDP files.

Currently, XDP-SQ resides only within an RQ and used from a
single flow (XDP_TX) triggered upon RX completions.
In a downstream patch, additional type of XDP-SQ instances will be
presented and used for the XDP_REDIRECT flow, totally unrelated to
the RX context.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:53 -07:00
Tariq Toukan
a26a5bdf3e net/mlx5e: Restrict the combination of large MTU and XDP
Add checks in control path upon an MTU change or an XDP program set,
to prevent reaching cases where large MTU and XDP are set simultaneously.

This is to make sure we allow XDP only with the linear RX memory scheme,
i.e. a received packet is not scattered to different pages.
Change mlx5e_rx_get_linear_frag_sz() accordingly, so that we make sure
the XDP configuration can really be set, instead of assuming that it is.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:53 -07:00
Tariq Toukan
0ec13877ce net/mlx5e: Gather all XDP pre-requisite checks in a single function
Dedicate a function to all checks done when setting an XDP program.
Take indications from priv instead of netdev features.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:52 -07:00
Tariq Toukan
cb5189d173 net/mlx5e: Do not recycle RX pages in interface down flow
Keep all page-pool recycle calls within NAPI context.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:51 -07:00
Tariq Toukan
afab995e06 net/mlx5e: Replace call to MPWQE free with dealloc in interface down flow
No need to expose the MPWQE free function to control path.
The dealloc function already exposed, use it.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-26 15:23:51 -07:00
David S. Miller
6a8fab1794 Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2018-07-26

This series contains updates to ixgbe and igb.

Tony fixes ixgbe to add checks to ensure jumbo frames or LRO get enabled
after an XDP program is loaded.

Shannon Nelson adds the missing security configuration registers to the
ixgbe register dump, which will help in debugging.

Christian Grönke fixes an issue in igb that occurs on SGMII based SPF
mdoules, by reverting changes from 2 previous patches.  The issue was
that initialization would fail on the fore mentioned modules because the
driver would try to reset the PHY before obtaining the PHY address of
the SGMII attached PHY.

Venkatesh Srinivas replaces wmb() with dma_wmb() for doorbell writes,
which avoids SFENCEs before the doorbell writes.

Alex cleans up and refactors ixgbe Tx/Rx shutdown to reduce time needed
to stop the device.  The code refactor allows us to take the completion
time into account when disabling queues, so that on some platforms with
higher completion times, would not result in receive queues disabled
messages.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 14:14:01 -07:00
Jiri Pirko
c921d7db3d net: sched: unmark chain as explicitly created on delete
Once user manually deletes the chain using "chain del", the chain cannot
be marked as explicitly created anymore.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Fixes: 32a4f5ecd7 ("net: sched: introduce chain object to uapi")
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 14:12:58 -07:00
Doron Roberts-Kedes
0a26cf3ff4 tls: Skip zerocopy path for ITER_KVEC
The zerocopy path ultimately calls iov_iter_get_pages, which defines the
step function for ITER_KVECs as simply, return -EFAULT. Taking the
non-zerocopy path for ITER_KVECs avoids the unnecessary fallback.

See https://lore.kernel.org/lkml/20150401023311.GL29656@ZenIV.linux.org.uk/T/#u
for a discussion of why zerocopy for vmalloc data is not a good idea.

Discovered while testing NBD traffic encrypted with ktls.

Fixes: c46234ebb4 ("tls: RX path for ktls")
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 14:11:34 -07:00
Gustavo A. R. Silva
2ed9db3074 net: sched: cls_api: fix dead code in switch
Code at line 1850 is unreachable. Fix this by removing the break
statement above it, so the code for case RTM_GETCHAIN can be
properly executed.

Addresses-Coverity-ID: 1472050 ("Structurally dead code")
Fixes: 32a4f5ecd7 ("net: sched: introduce chain object to uapi")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 14:09:27 -07:00