Commit Graph

49191 Commits

Author SHA1 Message Date
Stefan Hajnoczi
ba3169fc75 VSOCK: set POLLOUT | POLLWRNORM for TCP_CLOSING
select(2) with wfds but no rfds must return when the socket is shut down
by the peer.  This way userspace notices socket activity and gets -EPIPE
from the next write(2).

Currently select(2) does not return for virtio-vsock when a SEND+RCV
shutdown packet is received.  This is because vsock_poll() only sets
POLLOUT | POLLWRNORM for TCP_CLOSE, not the TCP_CLOSING state that the
socket is in when the shutdown is received.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-26 11:16:27 -05:00
Alexey Kodanev
dd5684ecae dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
ccid2_hc_tx_rto_expire() timer callback always restarts the timer
again and can run indefinitely (unless it is stopped outside), and after
commit 120e9dabaf ("dccp: defer ccid_hc_tx_delete() at dismantle time"),
which moved ccid_hc_tx_delete() (also includes sk_stop_timer()) from
dccp_destroy_sock() to sk_destruct(), this started to happen quite often.
The timer prevents releasing the socket, as a result, sk_destruct() won't
be called.

Found with LTP/dccp_ipsec tests running on the bonding device,
which later couldn't be unloaded after the tests were completed:

  unregister_netdevice: waiting for bond0 to become free. Usage count = 148

Fixes: 2a91aa3967 ("[DCCP] CCID2: Initial CCID2 (TCP-Like) implementation")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-26 11:15:00 -05:00
Nicolas Dichtel
f15ca723c1 net: don't call update_pmtu unconditionally
Some dst_ops (e.g. md_dst_ops)) doesn't set this handler. It may result to:
"BUG: unable to handle kernel NULL pointer dereference at           (null)"

Let's add a helper to check if update_pmtu is available before calling it.

Fixes: 52a589d51f ("geneve: update skb dst pmtu on tx path")
Fixes: a93bf0ff44 ("vxlan: update skb dst pmtu on tx path")
CC: Roman Kapl <code@rkapl.cz>
CC: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 16:27:34 -05:00
Dan Streetman
4ee806d511 net: tcp: close sock if net namespace is exiting
When a tcp socket is closed, if it detects that its net namespace is
exiting, close immediately and do not wait for FIN sequence.

For normal sockets, a reference is taken to their net namespace, so it will
never exit while the socket is open.  However, kernel sockets do not take a
reference to their net namespace, so it may begin exiting while the kernel
socket is still open.  In this case if the kernel socket is a tcp socket,
it will stay open trying to complete its close sequence.  The sock's dst(s)
hold a reference to their interface, which are all transferred to the
namespace's loopback interface when the real interfaces are taken down.
When the namespace tries to take down its loopback interface, it hangs
waiting for all references to the loopback interface to release, which
results in messages like:

unregister_netdevice: waiting for lo to become free. Usage count = 1

These messages continue until the socket finally times out and closes.
Since the net namespace cleanup holds the net_mutex while calling its
registered pernet callbacks, any new net namespace initialization is
blocked until the current net namespace finishes exiting.

After this change, the tcp socket notices the exiting net namespace, and
closes immediately, releasing its dst(s) and their reference to the
loopback interface, which lets the net namespace continue exiting.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811
Signed-off-by: Dan Streetman <ddstreet@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25 10:56:45 -05:00
Tom Herbert
e557124023 kcm: Check if sk_user_data already set in kcm_attach
This is needed to prevent sk_user_data being overwritten.
The check is done under the callback lock. This should prevent
a socket from being attached twice to a KCM mux. It also prevents
a socket from being attached for other use cases of sk_user_data
as long as the other cases set sk_user_data under the lock.
Followup work is needed to unify all the use cases of sk_user_data
to use the same locking.

Reported-by: syzbot+114b15f2be420a8886c3@syzkaller.appspotmail.com
Fixes: ab7ac4eb98 ("kcm: Kernel Connection Multiplexor module")
Signed-off-by: Tom Herbert <tom@quantonium.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 15:54:30 -05:00
Tom Herbert
581e7226a5 kcm: Only allow TCP sockets to be attached to a KCM mux
TCP sockets for IPv4 and IPv6 that are not listeners or in closed
stated are allowed to be attached to a KCM mux.

Fixes: ab7ac4eb98 ("kcm: Kernel Connection Multiplexor module")
Reported-by: syzbot+8865eaff7f9acd593945@syzkaller.appspotmail.com
Signed-off-by: Tom Herbert <tom@quantonium.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 15:54:30 -05:00
Wolfgang Bumiller
560a66075d net: sched: em_nbyte: don't add the data offset twice
'ptr' is shifted by the offset and then validated,
the memcmp should not add it a second time.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 14:52:40 -05:00
David S. Miller
97edf7c526 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2018-01-24

1) Only offloads SAs after they are fully initialized.
   Otherwise a NIC may receive packets on a SA we can
   not yet handle in the stack.
   From Yossi Kuperman.

2) Fix negative refcount in case of a failing offload.
   From Aviad Yehezkel.

3) Fix inner IP ptoro version when decapsulating
   from interaddress family tunnels.
   From Yossi Kuperman.

4) Use true or false for boolean variables instead of an
   integer value in xfrm_get_type_offload.
   From Gustavo A. R. Silva.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24 10:32:29 -05:00
Ben Hutchings
e9191ffb65 ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
Commit 513674b5a2 ("net: reevalulate autoflowlabel setting after
sysctl setting") removed the initialisation of
ipv6_pinfo::autoflowlabel and added a second flag to indicate
whether this field or the net namespace default should be used.

The getsockopt() handling for this case was not updated, so it
currently returns 0 for all sockets for which IPV6_AUTOFLOWLABEL is
not explicitly enabled.  Fix it to return the effective value, whether
that has been set at the socket or net namespace level.

Fixes: 513674b5a2 ("net: reevalulate autoflowlabel setting after sysctl ...")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-23 19:53:24 -05:00
Gustavo A. R. Silva
545d8ae7af xfrm: fix boolean assignment in xfrm_get_type_offload
Assign true or false to boolean variables instead of an integer value.

This issue was detected with the help of Coccinelle.

Fixes: ffdb5211da ("xfrm: Auto-load xfrm offload modules")
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-01-23 10:56:36 +01:00
Yossi Kuperman
5efec5c655 xfrm: Fix eth_hdr(skb)->h_proto to reflect inner IP version
IPSec tunnel mode supports encapsulation of IPv4 over IPv6 and vice-versa.

The outer IP header is stripped and the inner IP inherits the original
Ethernet header. Tcpdump fails to properly decode the inner packet in
case that h_proto is different than the inner IP version.

Fix h_proto to reflect the inner IP version.

Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-01-23 10:56:36 +01:00
Dave Watson
7a8c4dd9be tls: Correct length of scatterlist in tls_sw_sendpage
The scatterlist is reused by both sendmsg and sendfile.
If a sendmsg of smaller number of pages is followed by a sendfile
of larger number of pages, the scatterlist may be too short, resulting
in a crash in gcm_encrypt.

Add sg_unmark_end to make the list the correct length.

tls_sw_sendmsg already calls sg_unmark_end correctly when it allocates
memory in alloc_sg, or in zerocopy_from_iter.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22 16:25:21 -05:00
Felix Fietkau
ad23b75093 net: igmp: fix source address check for IGMPv3 reports
Commit "net: igmp: Use correct source address on IGMPv3 reports"
introduced a check to validate the source address of locally generated
IGMPv3 packets.
Instead of checking the local interface address directly, it uses
inet_ifa_match(fl4->saddr, ifa), which checks if the address is on the
local subnet (or equal to the point-to-point address if used).

This breaks for point-to-point interfaces, so check against
ifa->ifa_local directly.

Cc: Kevin Cernekee <cernekee@chromium.org>
Fixes: a46182b002 ("net: igmp: Use correct source address on IGMPv3 reports")
Reported-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22 16:16:05 -05:00
Willem de Bruijn
121d57af30 gso: validate gso_type in GSO handlers
Validate gso_type during segmentation as SKB_GSO_DODGY sources
may pass packets where the gso_type does not match the contents.

Syzkaller was able to enter the SCTP gso handler with a packet of
gso_type SKB_GSO_TCPV4.

On entry of transport layer gso handlers, verify that the gso_type
matches the transport protocol.

Fixes: 90017accff ("sctp: Add GSO support")
Link: http://lkml.kernel.org/r/<001a1137452496ffc305617e5fe0@google.com>
Reported-by: syzbot+fee64147a25aecd48055@syzkaller.appspotmail.com
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22 16:01:30 -05:00
Eric Dumazet
7c68d1a6b4 net: qdisc_pkt_len_init() should be more robust
Without proper validation of DODGY packets, we might very well
feed qdisc_pkt_len_init() with invalid GSO packets.

tcp_hdrlen() might access out-of-bound data, so let's use
skb_header_pointer() and proper checks.

Whole story is described in commit d0c081b491 ("flow_dissector:
properly cap thoff field")

We have the goal of validating DODGY packets earlier in the stack,
so we might very well revert this fix in the future.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jason Wang <jasowang@redhat.com>
Reported-by: syzbot+9da69ebac7dddd804552@syzkaller.appspotmail.com
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22 16:00:05 -05:00
Sowmini Varadhan
b589513e63 rds: tcp: compute m_ack_seq as offset from ->write_seq
rds-tcp uses m_ack_seq to track the tcp ack# that indicates
that the peer has received a rds_message. The m_ack_seq is
used in rds_tcp_is_acked() to figure out when it is safe to
drop the rds_message from the RDS retransmit queue.

The m_ack_seq must be calculated as an offset from the right
edge of the in-flight tcp buffer, i.e., it should be based on
the ->write_seq, not the ->snd_nxt.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-22 15:43:54 -05:00
Aviad Yehezkel
aa5dd6fa6f xfrm: fix error flow in case of add state fails
If add state fails in case of device offload, netdev refcount
will be negative since gc task is attempting to dev_free this state.
This is fixed by putting NULL in state dev field.

Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Boris Pismeny <borisp@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-01-19 06:44:22 +01:00
David S. Miller
69c4a65e4b linux-can-fixes-for-4.15-20180118
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEE4bay/IylYqM/npjQHv7KIOw4HPYFAlpgXAoTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAe/sog7Dgc9pfMB/92awOK3MnAv1k5HjWNK0y2Qn0ZF2uq
 5X63v54tNs/nTs/GRcq69YOehrgKKonTKGHfbU/IDw0LGcjQBd1VRXe38fSpURzu
 0ecaN6/5HY7bhmT4FrR1S0JUxM/mLF3WygJDbOQsoWByqyGdZBZVTMNhl12gGcNU
 mq3TPrAx2PO2C4/6U/QIIG0PVx+RytFobcGAssKhhILyJdbO/BjqQIgejg5uvZP2
 DqXVbj2+zFrcXjB6lhAiOCvdrYqQq2fdgJHeLtmYjfLf7hzLOt0aWqr4vgAkb/ew
 ejArkC8LaZZVxsB/I/dNKQebzBTxzs1QswwXpdjmR3xaxsazyTaQ4VnT
 =R2B8
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-4.15-20180118' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2018-01-18

====================
this is a pull reqeust of two patches for net/master:

The syzkaller project triggered two WARN_ONCE() in the af_can code from
userspace and we decided to replace it by a pr_warn_once().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 21:16:13 -05:00
Wei Wang
591ff9ea51 ipv6: don't let tb6_root node share routes with other node
After commit 4512c43eac, if we add a route to the subtree of tb6_root
which does not have any route attached to it yet, the current code will
let tb6_root and the node in the subtree share the same route.
This could cause problem cause tb6_root has RTN_INFO flag marked and the
tree repair and clean up code will not work properly.
This commit makes sure tb6_root->leaf points back to null_entry instead
of sharing route with other node.

It fixes the following syzkaller reported issue:
BUG: KASAN: use-after-free in ipv6_prefix_equal include/net/ipv6.h:540 [inline]
BUG: KASAN: use-after-free in fib6_add_1+0x165f/0x1790 net/ipv6/ip6_fib.c:618
Read of size 8 at addr ffff8801bc043498 by task syz-executor5/19819

CPU: 1 PID: 19819 Comm: syz-executor5 Not tainted 4.15.0-rc7+ #186
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:53
 print_address_description+0x73/0x250 mm/kasan/report.c:252
 kasan_report_error mm/kasan/report.c:351 [inline]
 kasan_report+0x25b/0x340 mm/kasan/report.c:409
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
 ipv6_prefix_equal include/net/ipv6.h:540 [inline]
 fib6_add_1+0x165f/0x1790 net/ipv6/ip6_fib.c:618
 fib6_add+0x5fa/0x1540 net/ipv6/ip6_fib.c:1214
 __ip6_ins_rt+0x6c/0x90 net/ipv6/route.c:1003
 ip6_route_add+0x141/0x190 net/ipv6/route.c:2790
 ipv6_route_ioctl+0x4db/0x6b0 net/ipv6/route.c:3299
 inet6_ioctl+0xef/0x1e0 net/ipv6/af_inet6.c:520
 sock_do_ioctl+0x65/0xb0 net/socket.c:958
 sock_ioctl+0x2c2/0x440 net/socket.c:1055
 vfs_ioctl fs/ioctl.c:46 [inline]
 do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
 SYSC_ioctl fs/ioctl.c:701 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
 entry_SYSCALL_64_fastpath+0x23/0x9a
RIP: 0033:0x452ac9
RSP: 002b:00007fd42b321c58 EFLAGS: 00000212 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000071bea0 RCX: 0000000000452ac9
RDX: 0000000020fd7000 RSI: 000000000000890b RDI: 0000000000000013
RBP: 000000000000049e R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000212 R12: 00000000006f4f70
R13: 00000000ffffffff R14: 00007fd42b3226d4 R15: 0000000000000000

Fixes: 4512c43eac ("ipv6: remove null_entry before adding default route")
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 21:14:00 -05:00
Alexey Kodanev
128bb975dc ip6_gre: init dev->mtu and dev->hard_header_len correctly
Commit b05229f442 ("gre6: Cleanup GREv6 transmit path,
call common GRE functions") moved dev->mtu initialization
from ip6gre_tunnel_setup() to ip6gre_tunnel_init(), as a
result, the previously set values, before ndo_init(), are
reset in the following cases:

* rtnl_create_link() can update dev->mtu from IFLA_MTU
  parameter.

* ip6gre_tnl_link_config() is invoked before ndo_init() in
  netlink and ioctl setup, so ndo_init() can reset MTU
  adjustments with the lower device MTU as well, dev->mtu
  and dev->hard_header_len.

  Not applicable for ip6gretap because it has one more call
  to ip6gre_tnl_link_config(tunnel, 1) in ip6gre_tap_init().

Fix the first case by updating dev->mtu with 'tb[IFLA_MTU]'
parameter if a user sets it manually on a device creation,
and fix the second one by moving ip6gre_tnl_link_config()
call after register_netdevice().

Fixes: b05229f442 ("gre6: Cleanup GREv6 transmit path, call common GRE functions")
Fixes: db2ec95d1b ("ip6_gre: Fix MTU setting")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 21:05:56 -05:00
Eric Dumazet
d0c081b491 flow_dissector: properly cap thoff field
syzbot reported yet another crash [1] that is caused by
insufficient validation of DODGY packets.

Two bugs are happening here to trigger the crash.

1) Flow dissection leaves with incorrect thoff field.

2) skb_probe_transport_header() sets transport header to this invalid
thoff, even if pointing after skb valid data.

3) qdisc_pkt_len_init() reads out-of-bound data because it
trusts tcp_hdrlen(skb)

Possible fixes :

- Full flow dissector validation before injecting bad DODGY packets in
the stack.
 This approach was attempted here : https://patchwork.ozlabs.org/patch/
861874/

- Have more robust functions in the core.
  This might be needed anyway for stable versions.

This patch fixes the flow dissection issue.

[1]
CPU: 1 PID: 3144 Comm: syzkaller271204 Not tainted 4.15.0-rc4-mm1+ #49
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:53
 print_address_description+0x73/0x250 mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:355 [inline]
 kasan_report+0x23b/0x360 mm/kasan/report.c:413
 __asan_report_load2_noabort+0x14/0x20 mm/kasan/report.c:432
 __tcp_hdrlen include/linux/tcp.h:35 [inline]
 tcp_hdrlen include/linux/tcp.h:40 [inline]
 qdisc_pkt_len_init net/core/dev.c:3160 [inline]
 __dev_queue_xmit+0x20d3/0x2200 net/core/dev.c:3465
 dev_queue_xmit+0x17/0x20 net/core/dev.c:3554
 packet_snd net/packet/af_packet.c:2943 [inline]
 packet_sendmsg+0x3ad5/0x60a0 net/packet/af_packet.c:2968
 sock_sendmsg_nosec net/socket.c:628 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:638
 sock_write_iter+0x31a/0x5d0 net/socket.c:907
 call_write_iter include/linux/fs.h:1776 [inline]
 new_sync_write fs/read_write.c:469 [inline]
 __vfs_write+0x684/0x970 fs/read_write.c:482
 vfs_write+0x189/0x510 fs/read_write.c:544
 SYSC_write fs/read_write.c:589 [inline]
 SyS_write+0xef/0x220 fs/read_write.c:581
 entry_SYSCALL_64_fastpath+0x1f/0x96

Fixes: 34fad54c25 ("net: __skb_flow_dissect() must cap its return value")
Fixes: a6e544b0a8 ("flow_dissector: Jump to exit code in __skb_flow_dissect")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 16:25:49 -05:00
Johannes Berg
5762d7d3ed cfg80211: fix station info handling bugs
Fix two places where the structure isn't initialized to zero,
and thus can't be filled properly by the driver.

Fixes: 4a4b816950 ("cfg80211: Accept multiple RSSI thresholds for CQM")
Fixes: 9930380f0b ("cfg80211: implement IWRATE")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 15:36:18 -05:00
Xin Long
cd443f1e91 netlink: reset extack earlier in netlink_rcv_skb
Move up the extack reset/initialization in netlink_rcv_skb, so that
those 'goto ack' will not skip it. Otherwise, later on netlink_ack
may use the uninitialized extack and cause kernel crash.

Fixes: cbbdf8433a ("netlink: extack needs to be reset each time through loop")
Reported-by: syzbot+03bee3680a37466775e7@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 15:14:51 -05:00
David S. Miller
7155f8f391 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2018-01-18

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix a divide by zero due to wrong if (src_reg == 0) check in
   64-bit mode. Properly handle this in interpreter and mask it
   also generically in verifier to guard against similar checks
   in JITs, from Eric and Alexei.

2) Fix a bug in arm64 JIT when tail calls are involved and progs
   have different stack sizes, from Daniel.

3) Reject stores into BPF context that are not expected BPF_STX |
   BPF_MEM variant, from Daniel.

4) Mark dst reg as unknown on {s,u}bounds adjustments when the
   src reg has derived bounds from dead branches, from Daniel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 09:17:04 -05:00
Yossi Kuperman
cc01572e2f xfrm: Add SA to hardware at the end of xfrm_state_construct()
Current code configures the hardware with a new SA before the state has been
fully initialized. During this time interval, an incoming ESP packet can cause
a crash due to a NULL dereference. More specifically, xfrm_input() considers
the packet as valid, and yet, anti-replay mechanism is not initialized.

Move hardware configuration to the end of xfrm_state_construct(), and mark
the state as valid once the SA is fully initialized.

Fixes: d77e38e612 ("xfrm: Add an IPsec hardware offloading API")
Signed-off-by: Aviad Yehezkel <aviadye@mellnaox.com>
Signed-off-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2018-01-18 11:09:29 +01:00
Marc Kleine-Budde
d468984688 can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once
If an invalid CANFD frame is received, from a driver or from a tun
interface, a Kernel warning is generated.

This patch replaces the WARN_ONCE by a simple pr_warn_once, so that a
kernel, bootet with panic_on_warn, does not panic. A printk seems to be
more appropriate here.

Reported-by: syzbot+e3b775f40babeff6e68b@syzkaller.appspotmail.com
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-01-18 09:32:54 +01:00
Marc Kleine-Budde
8cb68751c1 can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once
If an invalid CAN frame is received, from a driver or from a tun
interface, a Kernel warning is generated.

This patch replaces the WARN_ONCE by a simple pr_warn_once, so that a
kernel, bootet with panic_on_warn, does not panic. A printk seems to be
more appropriate here.

Reported-by: syzbot+4386709c0c1284dca827@syzkaller.appspotmail.com
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-01-18 09:32:54 +01:00
Daniel Borkmann
ad9294dbc2 bpf: fix cls_bpf on filter replace
Running the following sequence is currently broken:

  # tc qdisc add dev foo clsact
  # tc filter replace dev foo ingress prio 1 handle 1 bpf da obj bar.o
  # tc filter replace dev foo ingress prio 1 handle 1 bpf da obj bar.o
  RTNETLINK answers: Invalid argument

The normal expectation on kernel side is that the second command
succeeds replacing the existing program. However, what happens is
in cls_bpf_change(), we bail out with err in the second run in
cls_bpf_offload(). The EINVAL comes directly in cls_bpf_offload()
when comparing prog vs oldprog's gen_flags. In case of above
replace the new prog's gen_flags are 0, but the old ones are 8,
which means TCA_CLS_FLAGS_NOT_IN_HW is set (e.g. drivers not having
cls_bpf offload).

Fix 102740bd94 ("cls_bpf: fix offload assumptions after callback
conversion") in the following way: gen_flags from user space passed
down via netlink cannot include status flags like TCA_CLS_FLAGS_IN_HW
or TCA_CLS_FLAGS_NOT_IN_HW as opposed to oldprog that we previously
loaded. Therefore, it doesn't make any sense to include them in the
gen_flags comparison with the new prog before we even attempt to
offload. Thus, lets fix this before 4.15 goes out.

Fixes: 102740bd94 ("cls_bpf: fix offload assumptions after callback conversion")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 17:14:06 -05:00
Sabrina Dubroca
6db959c82e tls: reset crypto_info when do_tls_setsockopt_tx fails
The current code copies directly from userspace to ctx->crypto_send, but
doesn't always reinitialize it to 0 on failure. This causes any
subsequent attempt to use this setsockopt to fail because of the
TLS_CRYPTO_INFO_READY check, eventhough crypto_info is not actually
ready.

This should result in a correctly set up socket after the 3rd call, but
currently it does not:

    size_t s = sizeof(struct tls12_crypto_info_aes_gcm_128);
    struct tls12_crypto_info_aes_gcm_128 crypto_good = {
        .info.version = TLS_1_2_VERSION,
        .info.cipher_type = TLS_CIPHER_AES_GCM_128,
    };

    struct tls12_crypto_info_aes_gcm_128 crypto_bad_type = crypto_good;
    crypto_bad_type.info.cipher_type = 42;

    setsockopt(sock, SOL_TLS, TLS_TX, &crypto_bad_type, s);
    setsockopt(sock, SOL_TLS, TLS_TX, &crypto_good, s - 1);
    setsockopt(sock, SOL_TLS, TLS_TX, &crypto_good, s);

Fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 16:16:04 -05:00
Sabrina Dubroca
877d17c79b tls: return -EBUSY if crypto_info is already set
do_tls_setsockopt_tx returns 0 without doing anything when crypto_info
is already set. Silent failure is confusing for users.

Fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 16:16:03 -05:00
Sabrina Dubroca
cf6d43ef66 tls: fix sw_ctx leak
During setsockopt(SOL_TCP, TLS_TX), if initialization of the software
context fails in tls_set_sw_offload(), we leak sw_ctx. We also don't
reassign ctx->priv_ctx to NULL, so we can't even do another attempt to
set it up on the same socket, as it will fail with -EEXIST.

Fixes: 3c4d755915 ('tls: kernel TLS support')
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 16:16:03 -05:00
Ilya Lesokhin
d91c3e17f7 net/tls: Only attach to sockets in ESTABLISHED state
Calling accept on a TCP socket with a TLS ulp attached results
in two sockets that share the same ulp context.
The ulp context is freed while a socket is destroyed, so
after one of the sockets is released, the second second will
trigger a use after free when it tries to access the ulp context
attached to it.
We restrict the TLS ulp to sockets in ESTABLISHED state
to prevent the scenario above.

Fixes: 3c4d755915 ("tls: kernel TLS support")
Reported-by: syzbot+904e7cd6c5c741609228@syzkaller.appspotmail.com
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 16:05:28 -05:00
Linus Torvalds
b45a53be53 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Two read past end of buffer fixes in AF_KEY, from Eric Biggers.

 2) Memory leak in key_notify_policy(), from Steffen Klassert.

 3) Fix overflow with bpf arrays, from Daniel Borkmann.

 4) Fix RDMA regression with mlx5 due to mlx5 no longer using
    pci_irq_get_affinity(), from Saeed Mahameed.

 5) Missing RCU read locking in nl80211_send_iface() when it calls
    ieee80211_bss_get_ie(), from Dominik Brodowski.

 6) cfg80211 should check dev_set_name()'s return value, from Johannes
    Berg.

 7) Missing module license tag in 9p protocol, from Stephen Hemminger.

 8) Fix crash due to too small MTU in udp ipv6 sendmsg, from Mike
    Maloney.

 9) Fix endless loop in netlink extack code, from David Ahern.

10) TLS socket layer sets inverted error codes, resulting in an endless
    loop. From Robert Hering.

11) Revert openvswitch erspan tunnel support, it's mis-designed and we
    need to kill it before it goes into a real release. From William Tu.

12) Fix lan78xx failures in full speed USB mode, from Yuiko Oshino.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits)
  net, sched: fix panic when updating miniq {b,q}stats
  qed: Fix potential use-after-free in qed_spq_post()
  nfp: use the correct index for link speed table
  lan78xx: Fix failure in USB Full Speed
  sctp: do not allow the v4 socket to bind a v4mapped v6 address
  sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
  sctp: reinit stream if stream outcnt has been change by sinit in sendmsg
  ibmvnic: Fix pending MAC address changes
  netlink: extack: avoid parenthesized string constant warning
  ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
  net: Allow neigh contructor functions ability to modify the primary_key
  sh_eth: fix dumping ARSTR
  Revert "openvswitch: Add erspan tunnel support."
  net/tls: Fix inverted error codes to avoid endless loop
  ipv6: ip6_make_skb() needs to clear cork.base.dst
  sctp: avoid compiler warning on implicit fallthru
  net: ipv4: Make "ip route get" match iif lo rules again.
  netlink: extack needs to be reset each time through loop
  tipc: fix a memory leak in tipc_nl_node_get_link()
  ipv6: fix udpv6 sendmsg crash caused by too small MTU
  ...
2018-01-16 12:45:30 -08:00
Daniel Borkmann
81d947e2b8 net, sched: fix panic when updating miniq {b,q}stats
While working on fixing another bug, I ran into the following panic
on arm64 by simply attaching clsact qdisc, adding a filter and running
traffic on ingress to it:

  [...]
  [  178.188591] Unable to handle kernel read from unreadable memory at virtual address 810fb501f000
  [  178.197314] Mem abort info:
  [  178.200121]   ESR = 0x96000004
  [  178.203168]   Exception class = DABT (current EL), IL = 32 bits
  [  178.209095]   SET = 0, FnV = 0
  [  178.212157]   EA = 0, S1PTW = 0
  [  178.215288] Data abort info:
  [  178.218175]   ISV = 0, ISS = 0x00000004
  [  178.222019]   CM = 0, WnR = 0
  [  178.224997] user pgtable: 4k pages, 48-bit VAs, pgd = 0000000023cb3f33
  [  178.231531] [0000810fb501f000] *pgd=0000000000000000
  [  178.236508] Internal error: Oops: 96000004 [#1] SMP
  [...]
  [  178.311855] CPU: 73 PID: 2497 Comm: ping Tainted: G        W        4.15.0-rc7+ #5
  [  178.319413] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017
  [  178.326887] pstate: 60400005 (nZCv daif +PAN -UAO)
  [  178.331685] pc : __netif_receive_skb_core+0x49c/0xac8
  [  178.336728] lr : __netif_receive_skb+0x28/0x78
  [  178.341161] sp : ffff00002344b750
  [  178.344465] x29: ffff00002344b750 x28: ffff810fbdfd0580
  [  178.349769] x27: 0000000000000000 x26: ffff000009378000
  [...]
  [  178.418715] x1 : 0000000000000054 x0 : 0000000000000000
  [  178.424020] Process ping (pid: 2497, stack limit = 0x000000009f0a3ff4)
  [  178.430537] Call trace:
  [  178.432976]  __netif_receive_skb_core+0x49c/0xac8
  [  178.437670]  __netif_receive_skb+0x28/0x78
  [  178.441757]  process_backlog+0x9c/0x160
  [  178.445584]  net_rx_action+0x2f8/0x3f0
  [...]

Reason is that sch_ingress and sch_clsact are doing mini_qdisc_pair_init()
which sets up miniq pointers to cpu_{b,q}stats from the underlying qdisc.
Problem is that this cannot work since they are actually set up right after
the qdisc ->init() callback in qdisc_create(), so first packet going into
sch_handle_ingress() tries to call mini_qdisc_bstats_cpu_update() and we
therefore panic.

In order to fix this, allocation of {b,q}stats needs to happen before we
call into ->init(). In net-next, there's already such option through commit
d59f5ffa59 ("net: sched: a dflt qdisc may be used with per cpu stats").
However, the bug needs to be fixed in net still for 4.15. Thus, include
these bits to reduce any merge churn and reuse the static_flags field to
set TCQ_F_CPUSTATS, and remove the allocation from qdisc_create() since
there is no other user left. Prashant Bhole ran into the same issue but
for net-next, thus adding him below as well as co-author. Same issue was
also reported by Sandipan Das when using bcc.

Fixes: 46209401f8 ("net: core: introduce mini_Qdisc and eliminate usage of tp->q for clsact fastpath")
Reference: https://lists.iovisor.org/pipermail/iovisor-dev/2018-January/001190.html
Reported-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Co-authored-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Co-authored-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-16 15:02:36 -05:00
David S. Miller
161f72ed6d More fixes:
* hwsim:
     - properly flush deletion works at module unload
     - validate # of channels passed from userspace
  * cfg80211:
     - fix RCU locking regression
     - initialize on-stack channel data for nl80211 event
     - check dev_set_name() return value
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlpch0kACgkQB8qZga/f
 l8R3dg/+IkRShxLcSrig3u8o24gWgEYP01y98gtW0eSe2kuWkC4lMYejSg/Wa/7F
 2w6kLyZwXpUxiHqyrcZpZG6xxcBklL77TfxiTnscWdC/ubcKHHRQrcsIglKcuFeN
 jpogCOVS3F0xFxdssKBJoLMYRkb4ZXAa3GN9/2x5M9dWQRBE5ixtx23iy85YGXyk
 xyhAXvGqdk0PKWj63G65dCKxISWVunAWxnXZh5KfKySNzPiuYf6zlDHRGUY7AhVb
 ZD5FeFI0tledoYoCpqNRuDcjfi1z3jUCINb1IVsA7LaXCiJRDW0PmhWE1KDNoREU
 Zono6ytEdt9tLMCNrZ8Gi2FZvIkLD0SCYMkkduIGyrXgDMB37H1HctkFa+YK2C/E
 TxfKZYPChIT9lVczVRySz69fzp9twALKwQO8AAQzi7eWNLQ8ztJnVvF6vMIVHODh
 DSWfHdfqIEaIiku4mcV/Urd3xGm6JTHgQExyfA5VkRDkMIQdpWWQv9pUsKGAswtp
 x5KjV6ytbWwzwULXY1StDalG0S+jWk3G/4Cin8FQH4VbbfWlUyE/azT+549GReuj
 wlU9wgIWGA8s1qzHj/vUlTqW0GOLob5uvbq5HdfHvzhbP89PJz+DriU5mKgJS1uL
 80PS+ocLJSvWQFc9Ep65LpKTU6FHdrFrRl6ZBwk6E2lYLEOzs6c=
 =xGDk
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2018-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
More fixes:
 * hwsim:
    - properly flush deletion works at module unload
    - validate # of channels passed from userspace
 * cfg80211:
    - fix RCU locking regression
    - initialize on-stack channel data for nl80211 event
    - check dev_set_name() return value
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-16 14:28:14 -05:00
Xin Long
c5006b8aa7 sctp: do not allow the v4 socket to bind a v4mapped v6 address
The check in sctp_sockaddr_af is not robust enough to forbid binding a
v4mapped v6 addr on a v4 socket.

The worse thing is that v4 socket's bind_verify would not convert this
v4mapped v6 addr to a v4 addr. syzbot even reported a crash as the v4
socket bound a v6 addr.

This patch is to fix it by doing the common sa.sa_family check first,
then AF_INET check for v4mapped v6 addrs.

Fixes: 7dab83de50 ("sctp: Support ipv6only AF_INET6 sockets.")
Reported-by: syzbot+7b7b518b1228d2743963@syzkaller.appspotmail.com
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-16 14:24:20 -05:00
Xin Long
a0ff660058 sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
After commit cea0cc80a6 ("sctp: use the right sk after waking up from
wait_buf sleep"), it may change to lock another sk if the asoc has been
peeled off in sctp_wait_for_sndbuf.

However, the asoc's new sk could be already closed elsewhere, as it's in
the sendmsg context of the old sk that can't avoid the new sk's closing.
If the sk's last one refcnt is held by this asoc, later on after putting
this asoc, the new sk will be freed, while under it's own lock.

This patch is to revert that commit, but fix the old issue by returning
error under the old sk's lock.

Fixes: cea0cc80a6 ("sctp: use the right sk after waking up from wait_buf sleep")
Reported-by: syzbot+ac6ea7baa4432811eb50@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-16 14:22:51 -05:00
Xin Long
625637bf4a sctp: reinit stream if stream outcnt has been change by sinit in sendmsg
After introducing sctp_stream structure, sctp uses stream->outcnt as the
out stream nums instead of c.sinit_num_ostreams.

However when users use sinit in cmsg, it only updates c.sinit_num_ostreams
in sctp_sendmsg. At that moment, stream->outcnt is still using previous
value. If it's value is not updated, the sinit_num_ostreams of sinit could
not really work.

This patch is to fix it by updating stream->outcnt and reiniting stream
if stream outcnt has been change by sinit in sendmsg.

Fixes: a83863174a ("sctp: prepare asoc stream for stream reconf")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-16 14:20:21 -05:00
Jim Westfall
cd9ff4de01 ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
Map all lookup neigh keys to INADDR_ANY for loopback/point-to-point devices
to avoid making an entry for every remote ip the device needs to talk to.

This used the be the old behavior but became broken in a263b30936
(ipv4: Make neigh lookups directly in output packet path) and later removed
in 0bb4087cbe (ipv4: Fix neigh lookup keying over loopback/point-to-point
devices) because it was broken.

Signed-off-by: Jim Westfall <jwestfall@surrealistic.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15 14:53:43 -05:00
Jim Westfall
096b9854c0 net: Allow neigh contructor functions ability to modify the primary_key
Use n->primary_key instead of pkey to account for the possibility that a neigh
constructor function may have modified the primary_key value.

Signed-off-by: Jim Westfall <jwestfall@surrealistic.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15 14:53:43 -05:00
William Tu
95a332088e Revert "openvswitch: Add erspan tunnel support."
This reverts commit ceaa001a17.

The OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS attr should be designed
as a nested attribute to support all ERSPAN v1 and v2's fields.
The current attr is a be32 supporting only one field.  Thus, this
patch reverts it and later patch will redo it using nested attr.

Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Jiri Benc <jbenc@redhat.com>
Cc: Pravin Shelar <pshelar@ovn.org>
Acked-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15 14:33:16 -05:00
r.hering@avm.de
30be8f8dba net/tls: Fix inverted error codes to avoid endless loop
sendfile() calls can hang endless with using Kernel TLS if a socket error occurs.
Socket error codes must be inverted by Kernel TLS before returning because
they are stored with positive sign. If returned non-inverted they are
interpreted as number of bytes sent, causing endless looping of the
splice mechanic behind sendfile().

Signed-off-by: Robert Hering <r.hering@avm.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15 14:21:57 -05:00
Eric Dumazet
95ef498d97 ipv6: ip6_make_skb() needs to clear cork.base.dst
In my last patch, I missed fact that cork.base.dst was not initialized
in ip6_make_skb() :

If ip6_setup_cork() returns an error, we might attempt a dst_release()
on some random pointer.

Fixes: 862c03ee1d ("ipv6: fix possible mem leaks in ipv6_make_skb()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15 14:19:32 -05:00
Marcelo Ricardo Leitner
37f47bc90c sctp: avoid compiler warning on implicit fallthru
These fall-through are expected.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15 13:56:13 -05:00
Lorenzo Colitti
6503a30440 net: ipv4: Make "ip route get" match iif lo rules again.
Commit 3765d35ed8 ("net: ipv4: Convert inet_rtm_getroute to rcu
versions of route lookup") broke "ip route get" in the presence
of rules that specify iif lo.

Host-originated traffic always has iif lo, because
ip_route_output_key_hash and ip6_route_output_flags set the flow
iif to LOOPBACK_IFINDEX. Thus, putting "iif lo" in an ip rule is a
convenient way to select only originated traffic and not forwarded
traffic.

inet_rtm_getroute used to match these rules correctly because
even though it sets the flow iif to 0, it called
ip_route_output_key which overwrites iif with LOOPBACK_IFINDEX.
But now that it calls ip_route_output_key_hash_rcu, the ifindex
will remain 0 and not match the iif lo in the rule. As a result,
"ip route get" will return ENETUNREACH.

Fixes: 3765d35ed8 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup")
Tested: https://android.googlesource.com/kernel/tests/+/master/net/test/multinetwork_test.py passes again
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15 13:53:30 -05:00
David Ahern
cbbdf8433a netlink: extack needs to be reset each time through loop
syzbot triggered the WARN_ON in netlink_ack testing the bad_attr value.
The problem is that netlink_rcv_skb loops over the skb repeatedly invoking
the callback and without resetting the extack leaving potentially stale
data. Initializing each time through avoids the WARN_ON.

Fixes: 2d4bc93368 ("netlink: extended ACK reporting")
Reported-by: syzbot+315fa6766d0f7c359327@syzkaller.appspotmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15 13:50:07 -05:00
Cong Wang
59b36613e8 tipc: fix a memory leak in tipc_nl_node_get_link()
When tipc_node_find_by_name() fails, the nlmsg is not
freed.

While on it, switch to a goto label to properly
free it.

Fixes: be9c086715c ("tipc: narrow down exposure of struct tipc_node")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15 13:45:50 -05:00
Mike Maloney
749439bfac ipv6: fix udpv6 sendmsg crash caused by too small MTU
The logic in __ip6_append_data() assumes that the MTU is at least large
enough for the headers.  A device's MTU may be adjusted after being
added while sendmsg() is processing data, resulting in
__ip6_append_data() seeing any MTU.  For an mtu smaller than the size of
the fragmentation header, the math results in a negative 'maxfraglen',
which causes problems when refragmenting any previous skb in the
skb_write_queue, leaving it possibly malformed.

Instead sendmsg returns EINVAL when the mtu is calculated to be less
than IPV6_MIN_MTU.

Found by syzkaller:
kernel BUG at ./include/linux/skbuff.h:2064!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 14216 Comm: syz-executor5 Not tainted 4.13.0-rc4+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8801d0b68580 task.stack: ffff8801ac6b8000
RIP: 0010:__skb_pull include/linux/skbuff.h:2064 [inline]
RIP: 0010:__ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617
RSP: 0018:ffff8801ac6bf570 EFLAGS: 00010216
RAX: 0000000000010000 RBX: 0000000000000028 RCX: ffffc90003cce000
RDX: 00000000000001b8 RSI: ffffffff839df06f RDI: ffff8801d9478ca0
RBP: ffff8801ac6bf780 R08: ffff8801cc3f1dbc R09: 0000000000000000
R10: ffff8801ac6bf7a0 R11: 43cb4b7b1948a9e7 R12: ffff8801cc3f1dc8
R13: ffff8801cc3f1d40 R14: 0000000000001036 R15: dffffc0000000000
FS:  00007f43d740c700(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7834984000 CR3: 00000001d79b9000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ip6_finish_skb include/net/ipv6.h:911 [inline]
 udp_v6_push_pending_frames+0x255/0x390 net/ipv6/udp.c:1093
 udpv6_sendmsg+0x280d/0x31a0 net/ipv6/udp.c:1363
 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:762
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 SYSC_sendto+0x352/0x5a0 net/socket.c:1750
 SyS_sendto+0x40/0x50 net/socket.c:1718
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x4512e9
RSP: 002b:00007f43d740bc08 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00000000007180a8 RCX: 00000000004512e9
RDX: 000000000000002e RSI: 0000000020d08000 RDI: 0000000000000005
RBP: 0000000000000086 R08: 00000000209c1000 R09: 000000000000001c
R10: 0000000000040800 R11: 0000000000000216 R12: 00000000004b9c69
R13: 00000000ffffffff R14: 0000000000000005 R15: 00000000202c2000
Code: 9e 01 fe e9 c5 e8 ff ff e8 7f 9e 01 fe e9 4a ea ff ff 48 89 f7 e8 52 9e 01 fe e9 aa eb ff ff e8 a8 b6 cf fd 0f 0b e8 a1 b6 cf fd <0f> 0b 49 8d 45 78 4d 8d 45 7c 48 89 85 78 fe ff ff 49 8d 85 ba
RIP: __skb_pull include/linux/skbuff.h:2064 [inline] RSP: ffff8801ac6bf570
RIP: __ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617 RSP: ffff8801ac6bf570

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Mike Maloney <maloney@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15 13:28:18 -05:00
Stephen Hemminger
d542296a4d 9p: add missing module license for xen transport
The 9P of Xen module is missing required license and module information.
See https://bugzilla.kernel.org/show_bug.cgi?id=198109

Reported-by: Alan Bartlett <ajb@elrepo.org>
Fixes: 868eb12273 ("xen/9pfs: introduce Xen 9pfs transport driver")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-15 13:13:53 -05:00
Johannes Berg
59b179b48c cfg80211: check dev_set_name() return value
syzbot reported a warning from rfkill_alloc(), and after a while
I think that the reason is that it was doing fault injection and
the dev_set_name() failed, leaving the name NULL, and we didn't
check the return value and got to rfkill_alloc() with a NULL name.
Since we really don't want a NULL name, we ought to check the
return value.

Fixes: fb28ad3590 ("net: struct device - replace bus_id with dev_name(), dev_set_name()")
Reported-by: syzbot+1ddfb3357e1d7bb5b5d3@syzkaller.appspotmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-01-15 11:35:06 +01:00