linux/net/ipv4
Florian Westphal 18685451fc inet: inet_defrag: prevent sk release while still in use
ip_local_out() and other functions can pass skb->sk as function argument.

If the skb is a fragment and reassembly happens before such function call
returns, the sk must not be released.

This affects skb fragments reassembled via netfilter or similar
modules, e.g. openvswitch or ct_act.c, when run as part of tx pipeline.

Eric Dumazet made an initial analysis of this bug.  Quoting Eric:
  Calling ip_defrag() in output path is also implying skb_orphan(),
  which is buggy because output path relies on sk not disappearing.

  A relevant old patch about the issue was :
  8282f27449 ("inet: frag: Always orphan skbs inside ip_defrag()")

  [..]

  net/ipv4/ip_output.c depends on skb->sk being set, and probably to an
  inet socket, not an arbitrary one.

  If we orphan the packet in ipvlan, then downstream things like FQ
  packet scheduler will not work properly.

  We need to change ip_defrag() to only use skb_orphan() when really
  needed, ie whenever frag_list is going to be used.

Eric suggested to stash sk in fragment queue and made an initial patch.
However there is a problem with this:

If skb is refragmented again right after, ip_do_fragment() will copy
head->sk to the new fragments, and sets up destructor to sock_wfree.
IOW, we have no choice but to fix up sk_wmem accouting to reflect the
fully reassembled skb, else wmem will underflow.

This change moves the orphan down into the core, to last possible moment.
As ip_defrag_offset is aliased with sk_buff->sk member, we must move the
offset into the FRAG_CB, else skb->sk gets clobbered.

This allows to delay the orphaning long enough to learn if the skb has
to be queued or if the skb is completing the reasm queue.

In the former case, things work as before, skb is orphaned.  This is
safe because skb gets queued/stolen and won't continue past reasm engine.

In the latter case, we will steal the skb->sk reference, reattach it to
the head skb, and fix up wmem accouting when inet_frag inflates truesize.

Fixes: 7026b1ddb6 ("netfilter: Pass socket pointer down through okfn().")
Diagnosed-by: Eric Dumazet <edumazet@google.com>
Reported-by: xingwei lee <xrivendell7@gmail.com>
Reported-by: yue sun <samsun1006219@gmail.com>
Reported-by: syzbot+e5167d7144a62715044c@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240326101845.30836-1-fw@strlen.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-28 12:06:22 +01:00
..
netfilter netfilter: arptables: Select NETFILTER_FAMILY_ARP when building arp_tables.c 2024-03-28 03:54:02 +01:00
af_inet.c net: introduce include/net/rps.h 2024-03-07 21:12:43 -08:00
ah4.c net: fill in MODULE_DESCRIPTION()s for ipv4 modules 2024-02-09 14:12:02 -08:00
arp.c arp: Prevent overflow in arp_req_get(). 2024-02-20 10:50:19 +01:00
bpf_tcp_ca.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
cipso_ipv4.c netlabel: remove impossible return value in netlbl_bitmap_walk 2024-02-28 19:37:34 -08:00
datagram.c ipv4: Set the routing scope properly in ip_route_output_ports(). 2024-02-12 17:33:05 -08:00
devinet.c netlink: let core handle error cases in dump operations 2024-03-07 20:48:22 -08:00
esp4_offload.c xfrm: Support GRO for IPv4 ESP in UDP encapsulation 2023-10-06 07:30:40 +02:00
esp4.c net: esp: fix bad handling of pages from page_pool 2024-03-18 11:53:46 +01:00
fib_frontend.c netlink: let core handle error cases in dump operations 2024-03-07 20:48:22 -08:00
fib_lookup.h
fib_notifier.c
fib_rules.c fib: remove unnecessary input parameters in fib_default_rule_add 2024-01-03 16:42:48 -08:00
fib_semantics.c ipv4: fib: annotate races around nh->nh_saddr_genid and nh->nh_saddr 2023-10-18 18:11:31 -07:00
fib_trie.c inet: switch inet_dump_fib() to RCU protection 2024-02-26 11:46:13 +00:00
fou_bpf.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
fou_core.c net: gro: rename skb_gro_header_hard() 2024-03-05 13:30:11 +01:00
fou_nl.c net: ynl: prefix uAPI header include with uapi/ 2023-05-26 10:30:14 +01:00
fou_nl.h net: ynl: prefix uAPI header include with uapi/ 2023-05-26 10:30:14 +01:00
gre_demux.c Normalise "name (ad@dr)" MODULE_AUTHORs to "name <ad@dr>" 2024-03-06 13:07:39 -08:00
gre_offload.c net: gro: rename skb_gro_header_hard() 2024-03-05 13:30:11 +01:00
icmp.c xfrm: pass struct net to xfrm_decode_session wrappers 2023-10-06 08:31:53 +02:00
igmp.c inet: annotate devconf data-races 2024-02-28 19:36:39 -08:00
inet_connection_sock.c tcp: properly terminate timers for kernel sockets 2024-03-25 19:51:57 -07:00
inet_diag.c inet_diag: skip over empty buckets 2024-01-23 15:13:55 +01:00
inet_fragment.c inet: inet_defrag: prevent sk release while still in use 2024-03-28 12:06:22 +01:00
inet_hashtables.c tcp: Fix refcnt handling in __inet_hash_connect(). 2024-03-14 10:57:02 +01:00
inet_timewait_sock.c tcp: Fix NEW_SYN_RECV handling in inet_twsk_purge() 2024-03-12 18:56:15 -07:00
inetpeer.c net: ipv4: Simplify the allocation of slab caches in inet_initpeers 2024-01-31 16:39:42 -08:00
ip_forward.c net: fix IPSTATS_MIB_OUTFORWDATAGRAMS increment after fragment check 2023-10-13 09:58:45 -07:00
ip_fragment.c inet: inet_defrag: prevent sk release while still in use 2024-03-28 12:06:22 +01:00
ip_gre.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-02-15 16:20:04 -08:00
ip_input.c ipv4: ignore dst hint for multipath routes 2023-09-01 08:11:51 +01:00
ip_options.c
ip_output.c Revert "net: Re-use and set mono_delivery_time bit for userspace tstamp packets" 2024-03-18 12:29:53 +00:00
ip_sockglue.c inet: Add getsockopt support for IP_ROUTER_ALERT and IPV6_ROUTER_ALERT 2024-03-06 12:37:06 +00:00
ip_tunnel_core.c tunnels: fix out of bounds access when building IPv6 PMTU error 2024-02-03 12:43:19 +00:00
ip_tunnel.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-03-11 20:38:36 -07:00
ip_vti.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-02-15 16:20:04 -08:00
ipcomp.c
ipconfig.c net: ipconfig: move ic_nameservers_fallback into #ifdef block 2023-05-22 11:17:55 +01:00
ipip.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-02-15 16:20:04 -08:00
ipmr_base.c
ipmr.c ipmr: fix incorrect parameter validation in the ip_mroute_getsockopt() function 2024-03-11 09:53:22 +00:00
Kconfig net/tcp: Add TCP-AO config and structures 2023-10-27 10:35:44 +01:00
Makefile bpfilter: remove bpfilter 2024-01-04 10:23:10 -08:00
metrics.c ipv4: prevent potential spectre v1 gadget in ip_metrics_convert() 2023-01-23 21:37:25 -08:00
netfilter.c xfrm: pass struct net to xfrm_decode_session wrappers 2023-10-06 08:31:53 +02:00
netlink.c
nexthop.c nexthop: fix uninitialized variable in nla_put_nh_group_stats() 2024-03-22 18:03:29 -07:00
ping.c bpf-next-for-netdev 2023-10-16 21:05:33 -07:00
proc.c inet: annotate devconf data-races 2024-02-28 19:36:39 -08:00
protocol.c
raw_diag.c inet_diag: add module pointer to "struct inet_diag_handler" 2024-01-23 15:13:54 +01:00
raw.c ipv4: raw: Fix sending packets from raw sockets via IPsec tunnels 2024-03-19 13:45:58 +01:00
route.c inet: annotate devconf data-races 2024-02-28 19:36:39 -08:00
syncookies.c tcp: Clear req->syncookie in reqsk_alloc(). 2024-03-19 19:35:59 -07:00
sysctl_net_ipv4.c Use READ/WRITE_ONCE() for IP local_port_range. 2023-12-08 10:44:42 -08:00
tcp_ao.c net: tcp: Remove redundant initialization of variable len 2024-02-20 11:40:15 +01:00
tcp_bbr.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
tcp_bic.c
tcp_bpf.c tcp_bpf: properly release resources on error paths 2023-10-18 18:09:31 -07:00
tcp_cdg.c
tcp_cong.c bpf, net: validate struct_ops when updating value. 2024-03-04 10:03:57 -08:00
tcp_cubic.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
tcp_dctcp.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
tcp_dctcp.h
tcp_diag.c inet_diag: add module pointer to "struct inet_diag_handler" 2024-01-23 15:13:54 +01:00
tcp_fastopen.c inet: move inet->defer_connect to inet->inet_flags 2023-08-16 11:09:18 +01:00
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: add dropreasons in tcp_rcv_state_process() 2024-02-28 10:39:22 +00:00
tcp_ipv4.c tcp: make dropreason in tcp_child_process() work 2024-02-28 10:39:22 +00:00
tcp_lp.c tcp: rename tcp_time_stamp() to tcp_time_stamp_ts() 2023-10-23 09:35:01 +01:00
tcp_metrics.c tcp_metrics: optimize tcp_metrics_flush_all() 2023-10-03 10:05:22 +02:00
tcp_minisocks.c rds: tcp: Fix use-after-free of net in reqsk_timer_handler(). 2024-03-12 18:56:16 -07:00
tcp_nv.c
tcp_offload.c net: move tcpv4_offload and tcpv6_offload to net_hotdata 2024-03-07 21:12:42 -08:00
tcp_output.c net: Remove acked SYN flag from packet in the transmit queue correctly 2023-12-12 15:56:02 -08:00
tcp_plb.c prandom: remove prandom_u32_max() 2022-12-20 03:13:45 +01:00
tcp_rate.c
tcp_recovery.c tcp: fix excessive TLP and RACK timeouts from HZ rounding 2023-10-17 17:25:42 -07:00
tcp_scalable.c
tcp_sigpool.c net/tcp_sigpool: Use kref_get_unless_zero() 2024-01-01 14:42:05 +00:00
tcp_timer.c tcp: use tp->total_rto to track number of linear timeouts in SYN_SENT state 2023-11-16 23:35:12 +00:00
tcp_ulp.c net/ulp: use consistent error code when blocking ULP 2023-01-19 09:26:16 -08:00
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c
tcp.c tcp: properly terminate timers for kernel sockets 2024-03-25 19:51:57 -07:00
tunnel4.c net: fill in MODULE_DESCRIPTION()s for ipv4 modules 2024-02-09 14:12:02 -08:00
udp_bpf.c bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser() 2023-03-03 17:25:15 +01:00
udp_diag.c inet_diag: add module pointer to "struct inet_diag_handler" 2024-01-23 15:13:54 +01:00
udp_impl.h sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES) 2023-06-24 15:50:13 -07:00
udp_offload.c udp: move udpv4_offload and udpv6_offload to net_hotdata 2024-03-07 21:12:43 -08:00
udp_tunnel_core.c net: fill in MODULE_DESCRIPTION()s for ipv4 modules 2024-02-09 14:12:02 -08:00
udp_tunnel_nic.c udp_tunnel: Use flex array to simplify code 2023-10-03 11:39:34 +02:00
udp_tunnel_stub.c
udp.c udp: no longer touch sk->sk_refcnt in early demux 2024-03-11 09:56:03 +00:00
udplite.c udplite: remove UDPLITE_BIT 2023-09-14 16:16:36 +02:00
xfrm4_input.c net: adopt skb_network_offset() and similar helpers 2024-03-04 08:47:06 +00:00
xfrm4_output.c
xfrm4_policy.c sysctl-6.6-rc1 2023-08-29 17:39:15 -07:00
xfrm4_protocol.c
xfrm4_state.c
xfrm4_tunnel.c net: fill in MODULE_DESCRIPTION()s for ipv4 modules 2024-02-09 14:12:02 -08:00