linux/net/ipv4
Eric Dumazet 5037e9ef94 net: fix IP early demux races
David Wilder reported crashes caused by dst reuse.

<quote David>
  I am seeing a crash on a distro V4.2.3 kernel caused by a double
  release of a dst_entry.  In ipv4_dst_destroy() the call to
  list_empty() finds a poisoned next pointer, indicating the dst_entry
  has already been removed from the list and freed. The crash occurs
  18 to 24 hours into a run of a network stress exerciser.
</quote>

Thanks to his detailed report and analysis, we were able to understand
the core issue.

IP early demux can associate a dst to skb, after a lookup in TCP/UDP
sockets.

When socket cache is not properly set, we want to store into
sk->sk_dst_cache the dst for future IP early demux lookups,
by acquiring a stable refcount on the dst.

Problem is this acquisition is simply using an atomic_inc(),
which works well, unless the dst was queued for destruction from
dst_release() noticing dst refcount went to zero, if DST_NOCACHE
was set on dst.

We need to make sure current refcount is not zero before incrementing
it, or risk double free as David reported.

This patch, being a stable candidate, adds two new helpers, and use
them only from IP early demux problematic paths.

It might be possible to merge in net-next skb_dst_force() and
skb_dst_force_safe(), but I prefer having the smallest patch for stable
kernels : Maybe some skb_dst_force() callers do not expect skb->dst
can suddenly be cleared.

Can probably be backported back to linux-3.6 kernels

Reported-by: David J. Wilder <dwilder@us.ibm.com>
Tested-by: David J. Wilder <dwilder@us.ibm.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14 23:52:00 -05:00
..
netfilter netfilter: nf_dup: add missing dependencies with NF_CONNTRACK 2015-12-10 18:17:06 +01:00
af_inet.c net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
ah4.c ah4: Fix error return in ah_input(). 2015-08-25 13:38:50 -07:00
arp.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
cipso_ipv4.c
datagram.c net: Set sk_txhash from a random number 2015-07-29 22:44:04 -07:00
devinet.c netlink: Rightsize IFLA_AF_SPEC size calculation 2015-10-21 19:15:20 -07:00
esp4.c esp4: Switch to new AEAD interface 2015-05-28 11:23:20 +08:00
fib_frontend.c net: Flush local routes when device changes vrf association 2015-12-13 23:58:44 -05:00
fib_lookup.h ipv4: consider TOS in fib_select_default 2015-07-24 22:46:11 -07:00
fib_rules.c net: ipv6: use common fib_default_rule_pref 2015-09-09 14:19:50 -07:00
fib_semantics.c net: Fix prefsrc lookups 2015-11-04 21:34:37 -05:00
fib_trie.c fib_trie: leaf_walk_rcu should not compute key if key is less than pn->key 2015-10-27 18:14:51 -07:00
fou.c fou: reject IPv6 config 2015-08-29 13:07:54 -07:00
gre_demux.c gre: Remove support for sharing GRE protocol hook. 2015-08-10 14:03:54 -07:00
gre_offload.c ipv6: gre: support SIT encapsulation 2015-10-26 22:01:18 -07:00
icmp.c Revert "ipv4/icmp: redirect messages can use the ingress daddr as source" 2015-10-14 06:01:07 -07:00
igmp.c ipv4: igmp: Allow removing groups from a removed interface 2015-12-03 12:07:05 -05:00
inet_connection_sock.c tcp: ensure proper barriers in lockless contexts 2015-11-15 18:36:38 -05:00
inet_diag.c tcp/dccp: install syn_recv requests into ehash table 2015-10-03 04:32:41 -07:00
inet_fragment.c net: fix percpu memory leaks 2015-11-02 22:47:14 -05:00
inet_hashtables.c tcp/dccp: fix hashdance race for passive sessions 2015-10-23 05:42:21 -07:00
inet_lro.c
inet_timewait_sock.c tcp/dccp: fix timewait races in timer handling 2015-09-21 16:32:29 -07:00
inetpeer.c net: Add helper function to compare inetpeer addresses 2015-08-28 13:32:36 -07:00
ip_forward.c net: Pass net into dst_output and remove dst_output_okfn 2015-10-08 04:26:54 -07:00
ip_fragment.c net: fix percpu memory leaks 2015-11-02 22:47:14 -05:00
ip_gre.c openvswitch: Fix egress tunnel info. 2015-10-22 19:39:25 -07:00
ip_input.c ipv4: Pass struct net into ip_defrag and ip_check_defrag 2015-10-12 19:44:16 -07:00
ip_options.c
ip_output.c ipv4: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment 2015-11-01 12:01:27 -05:00
ip_sockglue.c ipv4: fix a potential deadlock in mcast getsockopt() path 2015-11-04 21:29:59 -05:00
ip_tunnel_core.c ipv4, ipv6: Pass net into ip_local_out and ip6_local_out 2015-10-08 04:27:02 -07:00
ip_tunnel.c ip_gre: Add support to collect tunnel metadata. 2015-08-10 14:03:54 -07:00
ip_vti.c net: Pass net into dst_output and remove dst_output_okfn 2015-10-08 04:26:54 -07:00
ipcomp.c
ipconfig.c ipconfig: send Client-identifier in DHCP requests 2015-10-18 19:23:52 -07:00
ipip.c ip_gre: Add support to collect tunnel metadata. 2015-08-10 14:03:54 -07:00
ipmr.c net: ipmr, ip6mr: fix vif/tunnel failure race condition 2015-11-24 17:15:56 -05:00
Kconfig geneve: Consolidate Geneve functionality in single module. 2015-08-27 15:42:48 -07:00
Makefile tcp: track the packet timings in RACK 2015-10-21 07:00:48 -07:00
netfilter.c ipv4: Pass struct net into ip_route_me_harder 2015-09-29 20:21:32 +02:00
ping.c ipv6: Nonlocal bind 2015-07-09 21:09:10 -07:00
proc.c net: track success and failure of TCP PMTU probing 2015-07-21 22:36:33 -07:00
protocol.c
raw.c raw: increment correct SNMP counters for ICMP messages 2015-11-16 15:08:48 -05:00
route.c net: Do not drop to make_route if oif is l3mdev 2015-10-08 05:18:47 -07:00
syncookies.c tcp/dccp: fix hashdance race for passive sessions 2015-10-23 05:42:21 -07:00
sysctl_net_ipv4.c ipv4: disable BH when changing ip local port range 2015-11-04 21:29:06 -05:00
tcp_bic.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_cdg.c tcp: do not slow start when cwnd equals ssthresh 2015-07-09 14:22:52 -07:00
tcp_cong.c tcp: remove tcp_ecn_make_synack() socket argument 2015-09-25 13:00:38 -07:00
tcp_cubic.c tcp_cubic: do not set epoch_start in the future 2015-09-17 22:35:07 -07:00
tcp_dctcp.c tcp: allow dctcp alpha to drop to zero 2015-10-23 02:46:52 -07:00
tcp_diag.c tcp: ensure proper barriers in lockless contexts 2015-11-15 18:36:38 -05:00
tcp_fastopen.c tcp/dccp: fix hashdance race for passive sessions 2015-10-23 05:42:21 -07:00
tcp_highspeed.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_htcp.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_hybla.c tcp: do not slow start when cwnd equals ssthresh 2015-07-09 14:22:52 -07:00
tcp_illinois.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_input.c tcp: initialize tp->copied_seq in case of cross SYN connection 2015-11-30 15:34:17 -05:00
tcp_ipv4.c net: fix IP early demux races 2015-12-14 23:52:00 -05:00
tcp_lp.c
tcp_memcontrol.c
tcp_metrics.c net: Add helper function to compare inetpeer addresses 2015-08-28 13:32:36 -07:00
tcp_minisocks.c tcp: fix req->saved_syn race 2015-11-05 14:36:09 -05:00
tcp_offload.c tcp: reserve tcp_skb_mss() to tcp stack 2015-06-11 16:33:10 -07:00
tcp_output.c net: make skb_set_owner_w() more robust 2015-11-02 16:28:49 -05:00
tcp_probe.c
tcp_recovery.c tcp: use RACK to detect losses 2015-10-21 07:00:53 -07:00
tcp_scalable.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_timer.c tcp: fix Fast Open snmp over-counting bug 2015-11-20 10:51:12 -05:00
tcp_vegas.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_vegas.h
tcp_veno.c tcp: add tcp_in_slow_start helper 2015-07-09 14:22:52 -07:00
tcp_westwood.c
tcp_yeah.c
tcp.c net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
tunnel4.c
udp_diag.c sock_diag: specify info_size per inet protocol 2015-06-15 19:49:22 -07:00
udp_impl.h
udp_offload.c
udp_tunnel.c tunnel: introduce udp_tun_rx_dst() 2015-08-27 15:42:47 -07:00
udp.c udp: remove duplicate include 2015-11-18 14:58:02 -05:00
udplite.c
xfrm4_input.c netfilter: Pass net into okfn 2015-09-17 17:18:37 -07:00
xfrm4_mode_beet.c
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c
xfrm4_output.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-24 06:54:12 -07:00
xfrm4_policy.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2015-10-30 20:51:56 +09:00
xfrm4_protocol.c
xfrm4_state.c
xfrm4_tunnel.c