linux/net/ipv4
Ido Schimmel f97f4dd8b3 net: ipv4: Fix memory leak in network namespace dismantle
IPv4 routing tables are flushed in two cases:

1. In response to events in the netdev and inetaddr notification chains
2. When a network namespace is being dismantled

In both cases only routes associated with a dead nexthop group are
flushed. However, a nexthop group will only be marked as dead in case it
is populated with actual nexthops using a nexthop device. This is not
the case when the route in question is an error route (e.g.,
'blackhole', 'unreachable').

Therefore, when a network namespace is being dismantled such routes are
not flushed and leaked [1].

To reproduce:
# ip netns add blue
# ip -n blue route add unreachable 192.0.2.0/24
# ip netns del blue

Fix this by not skipping error routes that are not marked with
RTNH_F_DEAD when flushing the routing tables.

To prevent the flushing of such routes in case #1, add a parameter to
fib_table_flush() that indicates if the table is flushed as part of
namespace dismantle or not.

Note that this problem does not exist in IPv6 since error routes are
associated with the loopback device.

[1]
unreferenced object 0xffff888066650338 (size 56):
  comm "ip", pid 1206, jiffies 4294786063 (age 26.235s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 b0 1c 62 61 80 88 ff ff  ..........ba....
    e8 8b a1 64 80 88 ff ff 00 07 00 08 fe 00 00 00  ...d............
  backtrace:
    [<00000000856ed27d>] inet_rtm_newroute+0x129/0x220
    [<00000000fcdfc00a>] rtnetlink_rcv_msg+0x397/0xa20
    [<00000000cb85801a>] netlink_rcv_skb+0x132/0x380
    [<00000000ebc991d2>] netlink_unicast+0x4c0/0x690
    [<0000000014f62875>] netlink_sendmsg+0x929/0xe10
    [<00000000bac9d967>] sock_sendmsg+0xc8/0x110
    [<00000000223e6485>] ___sys_sendmsg+0x77a/0x8f0
    [<000000002e94f880>] __sys_sendmsg+0xf7/0x250
    [<00000000ccb1fa72>] do_syscall_64+0x14d/0x610
    [<00000000ffbe3dae>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [<000000003a8b605b>] 0xffffffffffffffff
unreferenced object 0xffff888061621c88 (size 48):
  comm "ip", pid 1206, jiffies 4294786063 (age 26.235s)
  hex dump (first 32 bytes):
    6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
    6b 6b 6b 6b 6b 6b 6b 6b d8 8e 26 5f 80 88 ff ff  kkkkkkkk..&_....
  backtrace:
    [<00000000733609e3>] fib_table_insert+0x978/0x1500
    [<00000000856ed27d>] inet_rtm_newroute+0x129/0x220
    [<00000000fcdfc00a>] rtnetlink_rcv_msg+0x397/0xa20
    [<00000000cb85801a>] netlink_rcv_skb+0x132/0x380
    [<00000000ebc991d2>] netlink_unicast+0x4c0/0x690
    [<0000000014f62875>] netlink_sendmsg+0x929/0xe10
    [<00000000bac9d967>] sock_sendmsg+0xc8/0x110
    [<00000000223e6485>] ___sys_sendmsg+0x77a/0x8f0
    [<000000002e94f880>] __sys_sendmsg+0xf7/0x250
    [<00000000ccb1fa72>] do_syscall_64+0x14d/0x610
    [<00000000ffbe3dae>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [<000000003a8b605b>] 0xffffffffffffffff

Fixes: 8cced9eff1 ("[NETNS]: Enable routing configuration in non-initial namespace.")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-15 13:33:44 -08:00
..
bpfilter net: bpfilter: disallow to remove bpfilter module while being used 2019-01-11 18:05:41 -08:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2018-12-20 18:20:26 -08:00
af_inet.c net: use indirect call wrappers at GRO transport layer 2018-12-15 13:23:02 -08:00
ah4.c net-ipv4: remove 2 always zero parameters from ipv4_redirect() 2018-09-26 20:30:55 -07:00
arp.c net: Evict neighbor entries on carrier down 2018-10-12 09:47:39 -07:00
cipso_ipv4.c net/ipv4: defensive cipso option parsing 2018-09-17 19:37:46 -07:00
datagram.c ipv4: Allow sending multicast packets on specific i/f using VRF socket 2018-10-02 22:28:17 -07:00
devinet.c netlink: fixup regression in RTM_GETADDR 2019-01-04 12:47:06 -08:00
esp4_offload.c net: use skb_sec_path helper in more places 2018-12-19 11:21:37 -08:00
esp4.c net: use skb_sec_path helper in more places 2018-12-19 11:21:37 -08:00
fib_frontend.c net: ipv4: Fix memory leak in network namespace dismantle 2019-01-15 13:33:44 -08:00
fib_lookup.h
fib_notifier.c
fib_rules.c ipv4: fib_rules: Fix possible infinite loop in fib_empty_table 2018-12-30 12:57:04 -08:00
fib_semantics.c net: Add extack argument to ip_fib_metrics_init 2018-11-06 15:00:45 -08:00
fib_trie.c net: ipv4: Fix memory leak in network namespace dismantle 2019-01-15 13:33:44 -08:00
fou.c fou: Prevent unbounded recursion in GUE error handler also with UDP-Lite 2019-01-04 13:06:07 -08:00
gre_demux.c net: Convert protocol error handlers from void to int 2018-11-08 17:13:08 -08:00
gre_offload.c Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net 2018-07-03 10:29:26 +09:00
icmp.c net: Convert protocol error handlers from void to int 2018-11-08 17:13:08 -08:00
igmp.c ipv4/igmp: fix v1/v2 switchback timeout based on rfc3376, 8.12 2018-10-29 20:26:06 -07:00
inet_connection_sock.c inet: minor optimization for backlog setting in listen(2) 2018-11-07 22:31:07 -08:00
inet_diag.c tcp: fix a race in inet_diag_dump_icsk() 2018-12-20 19:23:22 -08:00
inet_fragment.c inet: frags: better deal with smp races 2018-11-08 18:40:30 -08:00
inet_hashtables.c net: dccp: fix kernel crash on module load 2018-12-24 15:27:56 -08:00
inet_timewait_sock.c
inetpeer.c
ip_forward.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-12-20 11:53:36 -08:00
ip_fragment.c net: ipv4: do not handle duplicate fragments as overlapping 2018-12-15 11:50:40 -08:00
ip_gre.c ip: validate header length on virtual device xmit 2019-01-01 12:05:02 -08:00
ip_input.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-12-09 21:43:31 -08:00
ip_options.c
ip_output.c sk_buff: add skb extension infrastructure 2018-12-19 11:21:37 -08:00
ip_sockglue.c ip: on queued skb use skb_header_pointer instead of pskb_may_pull 2019-01-10 09:27:20 -05:00
ip_tunnel_core.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-12-24 16:19:56 -08:00
ip_tunnel.c ip: validate header length on virtual device xmit 2019-01-01 12:05:02 -08:00
ip_vti.c ip: validate header length on virtual device xmit 2019-01-01 12:05:02 -08:00
ipcomp.c net-ipv4: remove 2 always zero parameters from ipv4_redirect() 2018-09-26 20:30:55 -07:00
ipconfig.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-12-20 11:53:36 -08:00
ipip.c net: Convert protocol error handlers from void to int 2018-11-08 17:13:08 -08:00
ipmr_base.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-10-19 11:03:06 -07:00
ipmr.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-12-20 11:53:36 -08:00
Kconfig net: remove blank lines at end of file 2018-07-24 14:10:43 -07:00
Makefile bpf, sockmap: convert to generic sk_msg interface 2018-10-15 12:23:19 -07:00
metrics.c net: Add extack argument to ip_fib_metrics_init 2018-11-06 15:00:45 -08:00
netfilter.c netfilter: utils: move nf_ip_checksum* from ipv4 to utils 2018-07-16 17:51:48 +02:00
netlink.c ipv4: support sport, dport and ip_proto in RTM_GETROUTE 2018-05-23 15:14:12 -04:00
ping.c ipv4: Allow sending multicast packets on specific i/f using VRF socket 2018-10-02 22:28:17 -07:00
proc.c tcp: implement coalescing on backlog queue 2018-11-30 13:26:54 -08:00
protocol.c fou, fou6: ICMP error handlers for FoU and GUE 2018-11-08 17:13:08 -08:00
raw_diag.c
raw.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-12-20 11:53:36 -08:00
route.c net: ipv4: Set skb->dev for output route resolution 2018-12-20 16:42:39 -08:00
syncookies.c tcp: provide earliest departure time in skb->tstamp 2018-09-21 19:37:59 -07:00
sysctl_net_ipv4.c net: provide a sysctl raw_l3mdev_accept for raw socket lookup with VRFs 2018-11-07 16:12:38 -08:00
tcp_bbr.c tcp_bbr: update comments to reflect pacing_margin_percent 2018-11-08 20:46:17 -08:00
tcp_bic.c
tcp_bpf.c bpf: sk_msg, sock{map|hash} redirect through ULP 2018-12-20 23:47:09 +01:00
tcp_cdg.c tcp: cdg: use tcp high resolution clock cache 2018-10-15 22:56:42 -07:00
tcp_cong.c
tcp_cubic.c
tcp_dctcp.c tcp: refactor DCTCP ECN ACK handling 2018-10-10 22:26:00 -07:00
tcp_dctcp.h tcp: refactor DCTCP ECN ACK handling 2018-10-10 22:26:00 -07:00
tcp_diag.c
tcp_fastopen.c
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: take care of compressed acks in tcp_add_reno_sack() 2018-11-30 13:26:53 -08:00
tcp_ipv4.c tcp: md5: add tcp_md5_needed jump label 2018-11-30 13:28:03 -08:00
tcp_lp.c
tcp_metrics.c mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00
tcp_minisocks.c tcp: do not restart timewait timer on rst reception 2018-08-31 23:10:35 -07:00
tcp_nv.c
tcp_offload.c net: use indirect call wrappers at GRO transport layer 2018-12-15 13:23:02 -08:00
tcp_output.c tcp: handle EOR and FIN conditions the same in tcp_tso_should_defer() 2018-12-10 12:09:15 -08:00
tcp_rate.c tcp: introduce tcp_skb_timestamp_us() helper 2018-09-21 19:37:59 -07:00
tcp_recovery.c tcp: introduce tcp_skb_timestamp_us() helper 2018-09-21 19:37:59 -07:00
tcp_scalable.c
tcp_timer.c tcp: change txhash on SYN-data timeout 2019-01-10 16:55:41 -05:00
tcp_ulp.c tcp, ulp: remove socket lock assertion on ULP cleanup 2018-10-16 12:38:41 -07:00
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c
tcp.c tcp: fix code style in tcp_recvmsg() 2018-12-06 12:19:47 -08:00
tunnel4.c net: Convert protocol error handlers from void to int 2018-11-08 17:13:08 -08:00
udp_diag.c net: diag: document swapped src/dst in udp_dump_one. 2018-10-28 19:27:21 -07:00
udp_impl.h net: Convert protocol error handlers from void to int 2018-11-08 17:13:08 -08:00
udp_offload.c udp: use indirect call wrappers for GRO socket lookup 2018-12-15 13:23:02 -08:00
udp_tunnel.c udp_tunnel: add config option to bind to a device 2018-12-03 14:15:26 -08:00
udp.c net: udp: prefer listeners bound to an address 2018-12-14 15:55:20 -08:00
udplite.c net: Convert protocol error handlers from void to int 2018-11-08 17:13:08 -08:00
xfrm4_input.c xfrm: reset transport header back to network header after all input transforms ahave been applied 2018-09-04 10:26:30 +02:00
xfrm4_mode_beet.c
xfrm4_mode_transport.c xfrm: reset transport header back to network header after all input transforms ahave been applied 2018-09-04 10:26:30 +02:00
xfrm4_mode_tunnel.c
xfrm4_output.c
xfrm4_policy.c
xfrm4_protocol.c net: Convert protocol error handlers from void to int 2018-11-08 17:13:08 -08:00
xfrm4_state.c
xfrm4_tunnel.c