linux/net/ipv4
Joe Stringer 8282f27449 inet: frag: Always orphan skbs inside ip_defrag()
Later parts of the stack (including fragmentation) expect that there is
never a socket attached to frag in a frag_list, however this invariant
was not enforced on all defrag paths. This could lead to the
BUG_ON(skb->sk) during ip_do_fragment(), as per the call stack at the
end of this commit message.

While the call could be added to openvswitch to fix this particular
error, the head and tail of the frags list are already orphaned
indirectly inside ip_defrag(), so it seems like the remaining fragments
should all be orphaned in all circumstances.

kernel BUG at net/ipv4/ip_output.c:586!
[...]
Call Trace:
 <IRQ>
 [<ffffffffa0205270>] ? do_output.isra.29+0x1b0/0x1b0 [openvswitch]
 [<ffffffffa02167a7>] ovs_fragment+0xcc/0x214 [openvswitch]
 [<ffffffff81667830>] ? dst_discard_out+0x20/0x20
 [<ffffffff81667810>] ? dst_ifdown+0x80/0x80
 [<ffffffffa0212072>] ? find_bucket.isra.2+0x62/0x70 [openvswitch]
 [<ffffffff810e0ba5>] ? mod_timer_pending+0x65/0x210
 [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
 [<ffffffffa03205a2>] ? nf_conntrack_in+0x252/0x500 [nf_conntrack]
 [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
 [<ffffffffa02051a3>] do_output.isra.29+0xe3/0x1b0 [openvswitch]
 [<ffffffffa0206411>] do_execute_actions+0xe11/0x11f0 [openvswitch]
 [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
 [<ffffffffa0206822>] ovs_execute_actions+0x32/0xd0 [openvswitch]
 [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch]
 [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
 [<ffffffffa02068a2>] ovs_execute_actions+0xb2/0xd0 [openvswitch]
 [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch]
 [<ffffffffa0215019>] ? ovs_ct_get_labels+0x49/0x80 [openvswitch]
 [<ffffffffa0213a1d>] ovs_vport_receive+0x5d/0xa0 [openvswitch]
 [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
 [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
 [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
 [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch]
 [<ffffffffa02148fc>] internal_dev_xmit+0x6c/0x140 [openvswitch]
 [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch]
 [<ffffffff81660299>] dev_hard_start_xmit+0x2b9/0x5e0
 [<ffffffff8165fc21>] ? netif_skb_features+0xd1/0x1f0
 [<ffffffff81660f20>] __dev_queue_xmit+0x800/0x930
 [<ffffffff81660770>] ? __dev_queue_xmit+0x50/0x930
 [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90
 [<ffffffff81669876>] ? neigh_resolve_output+0x106/0x220
 [<ffffffff81661060>] dev_queue_xmit+0x10/0x20
 [<ffffffff816698e8>] neigh_resolve_output+0x178/0x220
 [<ffffffff816a8e6f>] ? ip_finish_output2+0x1ff/0x590
 [<ffffffff816a8e6f>] ip_finish_output2+0x1ff/0x590
 [<ffffffff816a8cee>] ? ip_finish_output2+0x7e/0x590
 [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0
 [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0
 [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80
 [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340
 [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190
 [<ffffffff816ab4c0>] ip_output+0x70/0x110
 [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80
 [<ffffffff816aa9f9>] ip_local_out+0x39/0x70
 [<ffffffff816abf89>] ip_send_skb+0x19/0x40
 [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40
 [<ffffffff816df21a>] icmp_push_reply+0xea/0x120
 [<ffffffff816df93d>] icmp_reply.constprop.23+0x1ed/0x230
 [<ffffffff816df9ce>] icmp_echo.part.21+0x4e/0x50
 [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
 [<ffffffff810d5f9e>] ? rcu_read_lock_held+0x5e/0x70
 [<ffffffff816dfa06>] icmp_echo+0x36/0x70
 [<ffffffff816e0d11>] icmp_rcv+0x271/0x450
 [<ffffffff816a4ca7>] ip_local_deliver_finish+0x127/0x3a0
 [<ffffffff816a4bc1>] ? ip_local_deliver_finish+0x41/0x3a0
 [<ffffffff816a5160>] ip_local_deliver+0x60/0xd0
 [<ffffffff816a4b80>] ? ip_rcv_finish+0x560/0x560
 [<ffffffff816a46fd>] ip_rcv_finish+0xdd/0x560
 [<ffffffff816a5453>] ip_rcv+0x283/0x3e0
 [<ffffffff810b6302>] ? match_held_lock+0x192/0x200
 [<ffffffff816a4620>] ? inet_del_offload+0x40/0x40
 [<ffffffff8165d062>] __netif_receive_skb_core+0x392/0xae0
 [<ffffffff8165e68e>] ? process_backlog+0x8e/0x230
 [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90
 [<ffffffff8165d7c8>] __netif_receive_skb+0x18/0x60
 [<ffffffff8165e678>] process_backlog+0x78/0x230
 [<ffffffff8165e6dd>] ? process_backlog+0xdd/0x230
 [<ffffffff8165e355>] net_rx_action+0x155/0x400
 [<ffffffff8106b48c>] __do_softirq+0xcc/0x420
 [<ffffffff816a8e87>] ? ip_finish_output2+0x217/0x590
 [<ffffffff8178e78c>] do_softirq_own_stack+0x1c/0x30
 <EOI>
 [<ffffffff8106b88e>] do_softirq+0x4e/0x60
 [<ffffffff8106b948>] __local_bh_enable_ip+0xa8/0xb0
 [<ffffffff816a8eb0>] ip_finish_output2+0x240/0x590
 [<ffffffff816a9a31>] ? ip_do_fragment+0x831/0x8a0
 [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0
 [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0
 [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80
 [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340
 [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190
 [<ffffffff816ab4c0>] ip_output+0x70/0x110
 [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80
 [<ffffffff816aa9f9>] ip_local_out+0x39/0x70
 [<ffffffff816abf89>] ip_send_skb+0x19/0x40
 [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40
 [<ffffffff816d55d3>] raw_sendmsg+0x7d3/0xc30
 [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
 [<ffffffff816e7557>] ? inet_sendmsg+0xc7/0x1d0
 [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
 [<ffffffff816e759a>] inet_sendmsg+0x10a/0x1d0
 [<ffffffff816e7495>] ? inet_sendmsg+0x5/0x1d0
 [<ffffffff8163e398>] sock_sendmsg+0x38/0x50
 [<ffffffff8163ec5f>] ___sys_sendmsg+0x25f/0x270
 [<ffffffff811aadad>] ? handle_mm_fault+0x8dd/0x1320
 [<ffffffff8178c147>] ? _raw_spin_unlock+0x27/0x40
 [<ffffffff810529b2>] ? __do_page_fault+0x1e2/0x460
 [<ffffffff81204886>] ? __fget_light+0x66/0x90
 [<ffffffff8163f8e2>] __sys_sendmsg+0x42/0x80
 [<ffffffff8163f932>] SyS_sendmsg+0x12/0x20
 [<ffffffff8178cb17>] entry_SYSCALL_64_fastpath+0x12/0x6f
Code: 00 00 44 89 e0 e9 7c fb ff ff 4c 89 ff e8 e7 e7 ff ff 41 8b 9d 80 00 00 00 2b 5d d4 89 d8 c1 f8 03 0f b7 c0 e9 33 ff ff f
 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48
RIP  [<ffffffff816a9a92>] ip_do_fragment+0x892/0x8a0
 RSP <ffff88006d603170>

Fixes: 7f8a436eaa ("openvswitch: Add conntrack action")
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-28 16:00:46 -08:00
..
netfilter inet: frag: Always orphan skbs inside ip_defrag() 2016-01-28 16:00:46 -08:00
af_inet.c net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
ah4.c
arp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
cipso_ipv4.c
datagram.c
devinet.c netlink: Rightsize IFLA_AF_SPEC size calculation 2015-10-21 19:15:20 -07:00
esp4.c
fib_frontend.c net: Flush local routes when device changes vrf association 2015-12-13 23:58:44 -05:00
fib_lookup.h
fib_rules.c
fib_semantics.c net: Fix prefsrc lookups 2015-11-04 21:34:37 -05:00
fib_trie.c fib_trie: leaf_walk_rcu should not compute key if key is less than pn->key 2015-10-27 18:14:51 -07:00
fou.c udp: restrict offloads to one namespace 2016-01-10 17:28:24 -05:00
gre_demux.c
gre_offload.c ipv6: gre: support SIT encapsulation 2015-10-26 22:01:18 -07:00
icmp.c Revert "ipv4/icmp: redirect messages can use the ingress daddr as source" 2015-10-14 06:01:07 -07:00
igmp.c ipv4: igmp: Allow removing groups from a removed interface 2015-12-03 12:07:05 -05:00
inet_connection_sock.c tcp: ensure proper barriers in lockless contexts 2015-11-15 18:36:38 -05:00
inet_diag.c net: diag: support v4mapped sockets in inet_diag_find_one_icsk() 2016-01-20 18:51:31 -08:00
inet_fragment.c inet: kill unused skb_free op 2016-01-05 22:25:57 -05:00
inet_hashtables.c tcp/dccp: fix hashdance race for passive sessions 2015-10-23 05:42:21 -07:00
inet_lro.c
inet_timewait_sock.c tcp/dccp: fix timewait races in timer handling 2015-09-21 16:32:29 -07:00
inetpeer.c
ip_forward.c net: Pass net into dst_output and remove dst_output_okfn 2015-10-08 04:26:54 -07:00
ip_fragment.c inet: frag: Always orphan skbs inside ip_defrag() 2016-01-28 16:00:46 -08:00
ip_gre.c ip_tunnel: Move stats update to iptunnel_xmit() 2015-12-25 23:32:23 -05:00
ip_input.c ipv4: Pass struct net into ip_defrag and ip_check_defrag 2015-10-12 19:44:16 -07:00
ip_options.c
ip_output.c net: preserve IP control block during GSO segmentation 2016-01-15 14:35:24 -05:00
ip_sockglue.c ipv4: fix a potential deadlock in mcast getsockopt() path 2015-11-04 21:29:59 -05:00
ip_tunnel_core.c ipv4: fix endianness warnings in ip_tunnel_core.c 2016-01-08 21:30:43 -05:00
ip_tunnel.c ip_tunnel: Move stats update to iptunnel_xmit() 2015-12-25 23:32:23 -05:00
ip_vti.c ip_tunnel: Move stats update to iptunnel_xmit() 2015-12-25 23:32:23 -05:00
ipcomp.c
ipconfig.c net/ipv4/ipconfig: Rejoin broken lines in console output 2015-11-24 12:00:09 -05:00
ipip.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-31 18:20:10 -05:00
ipmr.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-12-03 21:09:12 -05:00
Kconfig ipv4+ipv6: Make INET*_ESP select CRYPTO_ECHAINIV 2016-01-25 10:45:41 -08:00
Makefile tcp: track the packet timings in RACK 2015-10-21 07:00:48 -07:00
netfilter.c ipv4: Pass struct net into ip_route_me_harder 2015-09-29 20:21:32 +02:00
ping.c ipv4: eliminate lock count warnings in ping.c 2016-01-08 21:30:43 -05:00
proc.c
protocol.c
raw.c net: Propagate lookup failure in l3mdev_get_saddr to caller 2016-01-04 22:58:30 -05:00
route.c net: Do not drop to make_route if oif is l3mdev 2015-10-08 05:18:47 -07:00
syncookies.c net: Allow accepted sockets to be bound to l3mdev domain 2015-12-18 14:43:38 -05:00
sysctl_net_ipv4.c ipv4: Namespecify the tcp_keepalive_intvl sysctl knob 2016-01-10 17:32:09 -05:00
tcp_bic.c
tcp_cdg.c
tcp_cong.c tcp: remove tcp_ecn_make_synack() socket argument 2015-09-25 13:00:38 -07:00
tcp_cubic.c tcp_cubic: do not set epoch_start in the future 2015-09-17 22:35:07 -07:00
tcp_dctcp.c tcp: allow dctcp alpha to drop to zero 2015-10-23 02:46:52 -07:00
tcp_diag.c net: diag: Support destroying TCP sockets. 2015-12-15 23:26:52 -05:00
tcp_fastopen.c tcp/dccp: fix hashdance race for passive sessions 2015-10-23 05:42:21 -07:00
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-01-06 22:54:18 -05:00
tcp_ipv4.c tcp: fix NULL deref in tcp_v4_send_ack() 2016-01-21 11:20:14 -08:00
tcp_lp.c
tcp_memcontrol.c mm: memcontrol: switch to the updated jump-label API 2016-01-14 16:00:49 -08:00
tcp_metrics.c
tcp_minisocks.c tcp: honour SO_BINDTODEVICE for TW_RST case too 2015-12-22 17:03:05 -05:00
tcp_offload.c
tcp_output.c net: tcp_memcontrol: simplify linkage between socket and page counter 2016-01-14 16:00:49 -08:00
tcp_probe.c
tcp_recovery.c tcp: use RACK to detect losses 2015-10-21 07:00:53 -07:00
tcp_scalable.c
tcp_timer.c ipv4: Namespecify the tcp_keepalive_intvl sysctl knob 2016-01-10 17:32:09 -05:00
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c tcp_yeah: don't set ssthresh below 2 2016-01-11 17:25:16 -05:00
tcp.c net: tcp_memcontrol: protect all tcp_memcontrol calls by jump-label 2016-01-14 16:00:49 -08:00
tunnel4.c
udp_diag.c soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF 2016-01-04 22:49:59 -05:00
udp_impl.h
udp_offload.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-01-11 23:55:43 -05:00
udp_tunnel.c ip_tunnel: Move stats update to iptunnel_xmit() 2015-12-25 23:32:23 -05:00
udp.c udp: fix potential infinite loop in SO_REUSEPORT logic 2016-01-19 13:52:25 -05:00
udplite.c
xfrm4_input.c netfilter: Pass net into okfn 2015-09-17 17:18:37 -07:00
xfrm4_mode_beet.c
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c
xfrm4_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-24 06:54:12 -07:00
xfrm4_policy.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2015-12-22 16:26:31 -05:00
xfrm4_protocol.c
xfrm4_state.c
xfrm4_tunnel.c