Move the fib_nh cleanup code from free_fib_info_rcu into a new helper,
fib_nh_release. Move classid accounting into fib_nh_release which is
called per fib_nh to make accounting symmetrical with fib_nh_init.
Export the helper to allow for use with nexthop objects in the
future.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Consolidate the fib_nh initialization which is duplicated between
fib_create_info for single path and fib_get_nhs for multipath.
Export the helper to allow for use with nexthop objects in the
future.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
in_dev lookup followed by IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN check
is called in several places, some with the rcu lock and others with the
rtnl held.
Move the check to a helper similar to what IPv6 has. Since the helper
can be invoked from either context use rcu_dereference_rtnl to
dereference ip_ptr.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define fib_get_nhs to return EINVAL when CONFIG_IP_ROUTE_MULTIPATH is
not enabled and remove the ifdef check for CONFIG_IP_ROUTE_MULTIPATH
in fib_create_info.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for fine-grain timeout support to conntrack action.
The new OVS_CT_ATTR_TIMEOUT attribute of the conntrack action
specifies a timeout to be associated with this connection.
If no timeout is specified, it acts as is, that is the default
timeout for the connection will be automatically applied.
Example usage:
$ nfct timeout add timeout_1 inet tcp syn_sent 100 established 200
$ ovs-ofctl add-flow br0 in_port=1,ip,tcp,action=ct(commit,timeout=timeout_1)
CC: Pravin Shelar <pshelar@ovn.org>
CC: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch exports nf_ct_set_timeout() and nf_ct_destroy_timeout().
The two functions are derived from xt_ct_destroy_timeout() and
xt_ct_set_timeout() in xt_CT.c, and moved to nf_conntrack_timeout.c
without any functional change.
It would be useful for other users (i.e. OVS) that utilizes the
finer-grain conntrack timeout feature.
CC: Pablo Neira Ayuso <pablo@netfilter.org>
CC: Pravin Shelar <pshelar@ovn.org>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently if the driver registers devlink port instance, it should set
the devlink port attributes as well. Then the devlink core is able to
obtain physical port name itself, no need for driver to implement
the ndo. Once all drivers will implement devlink port registration,
this ndo should be removed. This warning guides new
drivers to do things as they should be done.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since each non-legacy slave has its own devlink port instance
correctly set, rely on devlink core to generate correct phys port name.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order for devlink compat functions to work, implement
ndo_get_devlink_port. Legacy slaves does not have devlink port instances
created for themselves.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce devlink_compat_phys_port_name_get() helper that
gets the physical port name for specified netdevice
according to devlink port attributes.
Call this helper from dev_get_phys_port_name()
in case ndo_get_phys_port_name is not defined.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Drop license boilerplate (obsoleted by SPDX license IDs),
by Sven Eckelmann
- Drop documentation for sysfs and debugfs Documentation,
by Sven Eckelmann (2 patches)
- Mark sysfs as optional and deprecated, by Sven Eckelmann (3 patches)
- Update MAINTAINERS Tree, Chat and Bugtracker,
by Sven Eckelmann (3 patches)
- Rename batadv_dat_send_data, by Sven Eckelmann
- update DAT entries with incoming ARP replies, by Linus Luessing
- add multicast-to-unicast support for limited destinations,
by Linus Luessing
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlyc6uEWHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoSA9D/sEpVY0qOITIwzbttcyeDU5PPSD
OF4dVCf6Za6CqfnPRCdViKAGtC1FOz+X2BXtedrIxgsjSPFoRvRoi1XBdu4Bobv2
/4wx56rz3AeMoBZ1UyziUIS6Qam1x7vVYSRXk+QHqBYVc16YiIePpCqTuryrzuk4
4MMqXz+V0dqm7z7irRDe7W9/CdFRtZEDAS8o6cgw4IlL56Ul3Yz6xP6p3PRA+H6V
OWtVwmwcbX2KzZnrWDgql5NBhJ1bOfn2oDp1Y4RpLRmBp0iwg1qZdNZK2+MD2TTw
xxuz5lsZFhTBXNqGgeoGk87m2z0wNkvnj9UnkMPl3gb7j+FyyaAgvVY4M2s2qJv/
++wKDPPun/aGDOuo/rJdBTdlnToH17KS3jsDwhj4TooroI8uCCLWZQaYWkgjcugD
ZKsZlIqFrfH3rPAzOBwRZodoYkOPpz/+xHp3p/cg9ANifwqpxqq3PY35BoP4ZXRi
xUy79QgNIFxYXwrrqTrt3UrY8AGo1/OOHmA6nFQGZT79S648ZoG5vPDKFKRzTmcj
Mj2GXuBzMIkWayHgnH69Kv9vVZc7mZPi7lartsVq/aZtMCh3HbPNfKtNOYsu4QEq
6c2966jvFB+LdTibiJQWbe0s5Z96UaFQUxH5+gGdM5TS5TCIaG3udXoI1ou4YVJI
q6eOdAgblbD7oaNY4w==
=WB31
-----END PGP SIGNATURE-----
Merge tag 'batadv-next-for-davem-20190328' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
- Drop license boilerplate (obsoleted by SPDX license IDs),
by Sven Eckelmann
- Drop documentation for sysfs and debugfs Documentation,
by Sven Eckelmann (2 patches)
- Mark sysfs as optional and deprecated, by Sven Eckelmann (3 patches)
- Update MAINTAINERS Tree, Chat and Bugtracker,
by Sven Eckelmann (3 patches)
- Rename batadv_dat_send_data, by Sven Eckelmann
- update DAT entries with incoming ARP replies, by Linus Luessing
- add multicast-to-unicast support for limited destinations,
by Linus Luessing
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
According to Amit Klein and Benny Pinkas, IP ID generation is too weak
and might be used by attackers.
Even with recent net_hash_mix() fix (netns: provide pure entropy for net_hash_mix())
having 64bit key and Jenkins hash is risky.
It is time to switch to siphash and its 128bit keys.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Amit Klein <aksecurity@gmail.com>
Reported-by: Benny Pinkas <benny@pinkas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
My recent patch had at least three problems :
1) TX zerocopy wants notification when skb is acknowledged,
thus we need to call skb_zcopy_clear() if the skb is
cached into sk->sk_tx_skb_cache
2) Some applications might expect precise EPOLLOUT
notifications, so we need to update sk->sk_wmem_queued
and call sk_mem_uncharge() from sk_wmem_free_skb()
in all cases. The SOCK_QUEUE_SHRUNK flag must also be set.
3) Reuse of saved skb should have used skb_cloned() instead
of simply checking if the fast clone has been freed.
Fixes: 472c2e07ee ("tcp: add one skb cache for tx")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new action - 'check_pkt_len' which checks the
packet length and executes a set of actions if the packet
length is greater than the specified length or executes
another set of actions if the packet length is lesser or equal to.
This action takes below nlattrs
* OVS_CHECK_PKT_LEN_ATTR_PKT_LEN - 'pkt_len' to check for
* OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER - Nested actions
to apply if the packet length is greater than the specified 'pkt_len'
* OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL - Nested
actions to apply if the packet length is lesser or equal to the
specified 'pkt_len'.
The main use case for adding this action is to solve the packet
drops because of MTU mismatch in OVN virtual networking solution.
When a VM (which belongs to a logical switch of OVN) sends a packet
destined to go via the gateway router and if the nic which provides
external connectivity, has a lesser MTU, OVS drops the packet
if the packet length is greater than this MTU.
With the help of this action, OVN will check the packet length
and if it is greater than the MTU size, it will generate an
ICMP packet (type 3, code 4) and includes the next hop mtu in it
so that the sender can fragment the packets.
Reported-at:
https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.html
Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
CC: Gregory Rose <gvrose8192@gmail.com>
CC: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for Fast Link Down as new PHY tunable.
Fast Link Down reduces the time until a link down event is reported
for 1000BaseT. According to the standard it's 750ms what is too long
for several use cases.
v2:
- add comment describing the constants
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of declaring a function in a .c file, declare it in a header
file and include that header file from the source files that define
and that use the function. That allows the compiler to verify
consistency of declaration and definition. See also commit
52267790ef ("sock: add MSG_ZEROCOPY") # v4.14.
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch avoids that the following warnings are reported when building
with W=1:
net/core/rtnetlink.c:3580: warning: Function parameter or member 'ndm' not described in 'ndo_dflt_fdb_add'
net/core/rtnetlink.c:3580: warning: Function parameter or member 'tb' not described in 'ndo_dflt_fdb_add'
net/core/rtnetlink.c:3580: warning: Function parameter or member 'dev' not described in 'ndo_dflt_fdb_add'
net/core/rtnetlink.c:3580: warning: Function parameter or member 'addr' not described in 'ndo_dflt_fdb_add'
net/core/rtnetlink.c:3580: warning: Function parameter or member 'vid' not described in 'ndo_dflt_fdb_add'
net/core/rtnetlink.c:3580: warning: Function parameter or member 'flags' not described in 'ndo_dflt_fdb_add'
net/core/rtnetlink.c:3718: warning: Function parameter or member 'ndm' not described in 'ndo_dflt_fdb_del'
net/core/rtnetlink.c:3718: warning: Function parameter or member 'tb' not described in 'ndo_dflt_fdb_del'
net/core/rtnetlink.c:3718: warning: Function parameter or member 'dev' not described in 'ndo_dflt_fdb_del'
net/core/rtnetlink.c:3718: warning: Function parameter or member 'addr' not described in 'ndo_dflt_fdb_del'
net/core/rtnetlink.c:3718: warning: Function parameter or member 'vid' not described in 'ndo_dflt_fdb_del'
net/core/rtnetlink.c:3861: warning: Function parameter or member 'skb' not described in 'ndo_dflt_fdb_dump'
net/core/rtnetlink.c:3861: warning: Function parameter or member 'cb' not described in 'ndo_dflt_fdb_dump'
net/core/rtnetlink.c:3861: warning: Function parameter or member 'filter_dev' not described in 'ndo_dflt_fdb_dump'
net/core/rtnetlink.c:3861: warning: Function parameter or member 'idx' not described in 'ndo_dflt_fdb_dump'
net/core/rtnetlink.c:3861: warning: Excess function parameter 'nlh' description in 'ndo_dflt_fdb_dump'
Cc: Hubert Sokolowski <hubert.sokolowski@intel.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch avoids that the following warning is reported when building
with W=1:
warning: Function parameter or member 'flags' not described in '__skb_flow_dissect'
Cc: Tom Herbert <tom@herbertland.com>
Fixes: cd79a2382a ("flow_dissector: Add flags argument to skb_flow_dissector functions") # v4.3.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch avoids that the following warnings are reported when building
with W=1:
net/core/dev_ioctl.c:378: warning: Function parameter or member 'ifr' not described in 'dev_ioctl'
net/core/dev_ioctl.c:378: warning: Function parameter or member 'need_copyout' not described in 'dev_ioctl'
net/core/dev_ioctl.c:378: warning: Excess function parameter 'arg' description in 'dev_ioctl'
Cc: Al Viro <viro@zeniv.linux.org.uk>
Fixes: 44c02a2c3d ("dev_ioctl(): move copyin/copyout to callers") # v4.16.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch avoids that the following warning is reported when building
with W=1:
warning: Function parameter or member 'bind_inany' not described in 'reuseport_add_sock'
Cc: Martin KaFai Lau <kafai@fb.com>
Fixes: 2dbb9b9e6d ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT") # v4.19.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
An FoU socket is currently bound to the wildcard-address. While this
works fine, there are several use-cases where the use of the
wildcard-address is not desirable. For example, I use FoU on some
multi-homed servers and would like to use FoU on only one of the
interfaces.
This commit adds support for binding FoU sockets to a given source
address/interface, as well as connecting the socket to a given
destination address/port. udp_tunnel already provides the required
infrastructure, so most of the code added is for exposing and setting
the different attributes (local address, peer address, etc.).
The lookups performed when we add, delete or get an FoU-socket has also
been updated to compare all the attributes a user can set. Since the
comparison now involves several elements, I have added a separate
comparison-function instead of open-coding.
In order to test the code and ensure that the new comparison code works
correctly, I started by creating a wildcard socket bound to port 1234 on
my machine. I then tried to create a non-wildcarded socket bound to the
same port, as well as fetching and deleting the socket (including source
address, peer address or interface index in the netlink request). Both
the create, fetch and delete request failed. Deleting/fetching the
socket was only successful when my netlink request attributes matched
those used to create the socket.
I then repeated the tests, but with a socket bound to a local ip
address, a socket bound to a local address + interface, and a bound
socket that was also «connected» to a peer. Add only worked when no
socket with the matching source address/interface (or wildcard) existed,
while fetch/delete was only successful when all attributes matched.
In addition to testing that the new code work, I also checked that the
current behavior is kept. If none of the new attributes are provided,
then an FoU-socket is configured as before (i.e., wildcarded). If any
of the new attributes are provided, the FoU-socket is configured as
expected.
v1->v2:
* Fixed building with IPv6 disabled (kbuild).
* Fixed a return type warning and make the ugly comparison function more
readable (kbuild).
* Describe more in detail what has been tested (thanks David Miller).
* Make peer port required if peer address is specified.
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"Fixes here and there, a couple new device IDs, as usual:
1) Fix BQL race in dpaa2-eth driver, from Ioana Ciornei.
2) Fix 64-bit division in iwlwifi, from Arnd Bergmann.
3) Fix documentation for some eBPF helpers, from Quentin Monnet.
4) Some UAPI bpf header sync with tools, also from Quentin Monnet.
5) Set descriptor ownership bit at the right time for jumbo frames in
stmmac driver, from Aaro Koskinen.
6) Set IFF_UP properly in tun driver, from Eric Dumazet.
7) Fix load/store doubleword instruction generation in powerpc eBPF
JIT, from Naveen N. Rao.
8) nla_nest_start() return value checks all over, from Kangjie Lu.
9) Fix asoc_id handling in SCTP after the SCTP_*_ASSOC changes this
merge window. From Marcelo Ricardo Leitner and Xin Long.
10) Fix memory corruption with large MTUs in stmmac, from Aaro
Koskinen.
11) Do not use ipv4 header for ipv6 flows in TCP and DCCP, from Eric
Dumazet.
12) Fix topology subscription cancellation in tipc, from Erik Hugne.
13) Memory leak in genetlink error path, from Yue Haibing.
14) Valid control actions properly in packet scheduler, from Davide
Caratti.
15) Even if we get EEXIST, we still need to rehash if a shrink was
delayed. From Herbert Xu.
16) Fix interrupt mask handling in interrupt handler of r8169, from
Heiner Kallweit.
17) Fix leak in ehea driver, from Wen Yang"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (168 commits)
dpaa2-eth: fix race condition with bql frame accounting
chelsio: use BUG() instead of BUG_ON(1)
net: devlink: skip info_get op call if it is not defined in dumpit
net: phy: bcm54xx: Encode link speed and activity into LEDs
tipc: change to check tipc_own_id to return in tipc_net_stop
net: usb: aqc111: Extend HWID table by QNAP device
net: sched: Kconfig: update reference link for PIE
net: dsa: qca8k: extend slave-bus implementations
net: dsa: qca8k: remove leftover phy accessors
dt-bindings: net: dsa: qca8k: support internal mdio-bus
dt-bindings: net: dsa: qca8k: fix example
net: phy: don't clear BMCR in genphy_soft_reset
bpf, libbpf: clarify bump in libbpf version info
bpf, libbpf: fix version info and add it to shared object
rxrpc: avoid clang -Wuninitialized warning
tipc: tipc clang warning
net: sched: fix cleanup NULL pointer exception in act_mirr
r8169: fix cable re-plugging issue
net: ethernet: ti: fix possible object reference leak
net: ibm: fix possible object reference leak
...
Alexei Starovoitov says:
====================
pull-request: bpf-next 2019-03-26
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) introduce bpf_tcp_check_syncookie() helper for XDP and tc, from Lorenz.
2) allow bpf_skb_ecn_set_ce() in tc, from Peter.
3) numerous bpf tc tunneling improvements, from Willem.
4) and other miscellaneous improvements from Adrian, Alan, Daniel, Ivan, Stanislav.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Highlights include:
Stable fixes:
- Fix nfs4_lock_state refcounting in nfs4_alloc_{lock,unlock}data()
- fix mount/umount race in nlmclnt.
- NFSv4.1 don't free interrupted slot on open
Bugfixes:
- Don't let RPC_SOFTCONN tasks time out if the transport is connected
- Fix a typo in nfs_init_timeout_values()
- Fix layoutstats handling during read failovers
- fix uninitialized variable warning
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJcmobMAAoJEA4mA3inWBJc/7cP+wR19SPLnbPAFnA09LyT2wDu
wZI/y4KYcqGX4kW+ZfhvtR91Zy+UzF685NlbY+kH74JH9Wp9o9DJHW6DC//oxAM5
bzMKH4FIY5IEYN6R554QzHHvIzDzJADgdmjwaSjZyYiNQMJ5xnYClkAWBqU4zG4c
luTLcYg2cHYic/2bYCVI/SvSSH4Rq93MhttxWgmP0yUm2l3ed+r+ZydQiAyxBFRv
0DN8dM7gltHnbOapKVxttmdNpK7EIDlTdUFupiwZMvsm5OCGcLm09DUUE0oE0d+s
bZflhWNtV/0P7zjx0SZTfd3/XKo5PRIzAB2sx4KsqzbnC5kR9fl3royZ0CUgPJYa
n7Bb9PJd8AJV+0FK5cyH3KQwL5UokpU7g1pD7MNxUuIM8iDbpZcOfsiKN/ZWVInJ
E/eot9/D4kaDvTWQ+EmCzb7bI6yjVo6B27KFVC+ZNunfP1hFz+CrybUHpbraMw+7
okvE9x+qCeeHRKTNGhcFTAEjGFPQX6nomS6MyFUXUriKSy29Fiq9kUem1qFFsPxk
c79pYQdu/TUX3sUxjVsOaOr1sS+VJZOrUzGe2/IAZKM86Mu0fQ8W4PTKhqv/ZG+4
oxC4ukHI39cDYcjyUMnpOGgZ3k1w7UcttVKy0fcsfHQJCTfa5kfd+s9mPpCBV3JG
GN9QQkWPLud8uoR/85rR
=d5ft
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-5.1-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
Stable fixes:
- Fix nfs4_lock_state refcounting in nfs4_alloc_{lock,unlock}data()
- fix mount/umount race in nlmclnt.
- NFSv4.1 don't free interrupted slot on open
Bugfixes:
- Don't let RPC_SOFTCONN tasks time out if the transport is connected
- Fix a typo in nfs_init_timeout_values()
- Fix layoutstats handling during read failovers
- fix uninitialized variable warning"
* tag 'nfs-for-5.1-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: fix uninitialized variable warning
pNFS/flexfiles: Fix layoutstats handling during read failovers
NFS: Fix a typo in nfs_init_timeout_values()
SUNRPC: Don't let RPC_SOFTCONN tasks time out if the transport is connected
NFSv4.1 don't free interrupted slot on open
NFS: fix mount/umount race in nlmclnt.
NFS: Fix nfs4_lock_state refcounting in nfs4_alloc_{lock,unlock}data()
Avoid following compiler warning on uninitialized variable
net/sunrpc/xprtsock.c: In function ‘xs_read_stream_request.constprop’:
net/sunrpc/xprtsock.c:525:10: warning: ‘read’ may be used uninitialized in this function [-Wmaybe-uninitialized]
return read;
^~~~
net/sunrpc/xprtsock.c:529:23: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
return ret < 0 ? ret : read;
~~~~~~~~~~~~~~^~~~~~
Signed-off-by: Alakesh Haloi <alakesh.haloi@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
When the conntrack is initialized, there is no helper attached
yet so the nat info initialization (nf_nat_setup_info) skips
adding the seqadj ext.
A helper is attached later when the conntrack is not confirmed
but is going to be committed. In this case, if NAT is needed then
adds the seqadj ext as well.
Fixes: 16ec3d4fbb ("openvswitch: Fix cached ct with helper.")
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the return value check which testing the wrong variable
in tipc_mcast_send_sync().
Fixes: c55c8edafa ("tipc: smooth change between replicast and broadcast")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When phylink_of_phy_connect fails, dsa_slave_phy_setup tries to save the
day by connecting to an alternative PHY, none other than a PHY on the
switch's internal MDIO bus, at an address equal to the port's index.
However this does not take into consideration the scenario when the
switch that failed to probe an external PHY does not have an internal
MDIO bus at all.
Fixes: aab9c4067d ("net: dsa: Plug in PHYLINK support")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In dumpit, unlike doit, the check for info_get op being defined
is missing. Add it and avoid null pointer dereference in case driver
does not define this op.
Fixes: f9cf22882c ("devlink: add device information API")
Reported-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When running a syz script, a panic occurred:
[ 156.088228] BUG: KASAN: use-after-free in tipc_disc_timeout+0x9c9/0xb20 [tipc]
[ 156.094315] Call Trace:
[ 156.094844] <IRQ>
[ 156.095306] dump_stack+0x7c/0xc0
[ 156.097346] print_address_description+0x65/0x22e
[ 156.100445] kasan_report.cold.3+0x37/0x7a
[ 156.102402] tipc_disc_timeout+0x9c9/0xb20 [tipc]
[ 156.106517] call_timer_fn+0x19a/0x610
[ 156.112749] run_timer_softirq+0xb51/0x1090
It was caused by the netns freed without deleting the discoverer timer,
while later on the netns would be accessed in the timer handler.
The timer should have been deleted by tipc_net_stop() when cleaning up a
netns. However, tipc has been able to enable a bearer and start d->timer
without the local node_addr set since Commit 52dfae5c85 ("tipc: obtain
node identity from interface by default"), which caused the timer not to
be deleted in tipc_net_stop() then.
So fix it in tipc_net_stop() by changing to check local node_id instead
of local node_addr, as Jon suggested.
While at it, remove the calling of tipc_nametbl_withdraw() there, since
tipc_nametbl_stop() will take of the nametbl's freeing after.
Fixes: 52dfae5c85 ("tipc: obtain node identity from interface by default")
Reported-by: syzbot+a25307ad099309f1c2b9@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With this patch multicast packets with a limited number of destinations
(current default: 16) will be split and transmitted by the originator as
individual unicast transmissions.
Wifi broadcasts with their low bitrate are still a costly undertaking.
In a mesh network this cost multiplies with the overall size of the mesh
network. Therefore using multiple unicast transmissions instead of
broadcast flooding is almost always less burdensome for the mesh
network.
The maximum amount of unicast packets can be configured via the newly
introduced multicast_fanout parameter. If this limit is exceeded
distribution will fall back to classic broadcast flooding.
The multicast-to-unicast conversion is performed on the initial
multicast sender node and counts on a final destination node, mesh-wide
basis (and not next hop, neighbor node basis).
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Currently incoming ARP Replies, for example via a DHT-PUT message, do
not update the timeout for an already existing DAT entry. These ARP
Replies are dropped instead.
This however defeats the purpose of the DHCPACK snooping, for instance.
Right now, a DAT entry in the DHT will be purged every five minutes,
likely leading to a mesh-wide ARP Request broadcast after this timeout.
Which then recreates the entry. The idea of the DHCPACK snooping is to
be able to update an entry before a timeout happens, to avoid ARP Request
flooding.
This patch fixes this issue by updating a DAT entry on incoming
ARP Replies even if a matching DAT entry already exists. While still
filtering the ARP Reply towards the soft-interface, to avoid duplicate
messages on the client device side.
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Acked-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The send functions in batman-adv are expected to consume the skb when
either the data is queued up for the underlying driver or when some
precondition failed. batadv_dat_send_data didn't do this and instead
created a copy of the skb, modified it and queued the copy up for
transmission. The caller has to take care that the skb is handled correctly
(for example free'd) when batadv_dat_send_data returns.
This unclear behavior already lead to memory leaks in the recent past.
Renaming the function to batadv_dat_forward_data should make it easier to
identify that the data is forwarded but the skb is not actually
send+consumed.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The sysfs files to read and modify the configuration settings were replaced
by the batadv generic netlink family. They are also marked as obsolete in
the ABI documentation. But not all users of this functionality might follow
changes in the Documentation/ABI/obsolete/ folder. They might benefit from
a warning messages about the deprecation of the functionality which they
just tried to access
batman_adv: [Deprecated]: batctl (pid 30381) Use of sysfs file "orig_interval".
Use batadv genl family instead
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The sysfs files will be marked as deprecated in the near future. They are
already replaced by the batadv generic netlink family. Add an Kconfig
option to disable the sysfs support for users who want to test their tools
or want to safe some space. This setting should currently still be enabled
by default to keep backward compatible with legacy tools.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
All files got a SPDX-License-Identifier with commit 7db7d9f369
("batman-adv: Add SPDX license identifier above copyright header"). All the
required information about the license conditions can be found in
LICENSES/.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
These three variables are set in one branch and used in another with
the same condition. But on some architectures they still generate
compiler warnings of the kind:
warning: 'inner_trans' may be used uninitialized in this function [-Wmaybe-uninitialized]
Silence these false positives. Use the straightforward approach to
always initialize them, if a bit superfluous.
Fixes: 868d523535 ("bpf: add bpf_skb_adjust_room encap flags")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Some drivers are becoming more dependent on NET_DEVLINK being selected
in configuration. With upcoming compat functions, the behavior would be
wrong in case devlink was not compiled in. So make the drivers select
NET_DEVLINK and rely on the functions being there, not just stubs.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add spinlock to protect port type and type_dev pointer consistency.
Without that, userspace may see inconsistent type and type_dev
combinations.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
v1->v2:
- rebased
Signed-off-by: David S. Miller <davem@davemloft.net>
Port needs to be registered first before the type is set. Warn and
bail-out in case it is not.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the port attributes are static and cannot change during the port
lifetime, WARN_ON if some driver calls it after registration. Also, no
need to call notifications as it is noop anyway due to check of
devlink_port->registered there.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since attrs are static during the existence of devlink port, set the
before registration of the port.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
__devlink_port_type_set() returns void, it makes no sense to pass it on,
so don't do that.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netdevice is guaranteed to not disappear so we can rely that
devlink_port and devlink won't disappear as well. No need to take
devlink_mutex so don't take it here.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
devlink functions are in use, so include the related header file.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing called to mutex_destroy() for two mutexes used
in devlink code.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Often times, recvmsg() system calls and BH handling for a particular
TCP socket are done on different cpus.
This means the incoming skb had to be allocated on a cpu,
but freed on another.
This incurs a high spinlock contention in slab layer for small rpc,
but also a high number of cache line ping pongs for larger packets.
A full size GRO packet might use 45 page fragments, meaning
that up to 45 put_page() can be involved.
More over performing the __kfree_skb() in the recvmsg() context
adds a latency for user applications, and increase probability
of trapping them in backlog processing, since the BH handler
might found the socket owned by the user.
This patch, combined with the prior one increases the rpc
performance by about 10 % on servers with large number of cores.
(tcp_rr workload with 10,000 flows and 112 threads reach 9 Mpps
instead of 8 Mpps)
This also increases single bulk flow performance on 40Gbit+ links,
since in this case there are often two cpus working in tandem :
- CPU handling the NIC rx interrupts, feeding the receive queue,
and (after this patch) freeing the skbs that were consumed.
- CPU in recvmsg() system call, essentially 100 % busy copying out
data to user space.
Having at most one skb in a per-socket cache has very little risk
of memory exhaustion, and since it is protected by socket lock,
its management is essentially free.
Note that if rps/rfs is used, we do not enable this feature, because
there is high chance that the same cpu is handling both the recvmsg()
system call and the TCP rx path, but that another cpu did the skb
allocations in the device driver right before the RPS/RFS logic.
To properly handle this case, it seems we would need to record
on which cpu skb was allocated, and use a different channel
to give skbs back to this cpu.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>