linux/net/ipv4
Paolo Abeni 3dd1c9a127 ip: hash fragments consistently
The skb hash for locally generated ip[v6] fragments belonging
to the same datagram can vary in several circumstances:
* for connected UDP[v6] sockets, the first fragment get its hash
  via set_owner_w()/skb_set_hash_from_sk()
* for unconnected IPv6 UDPv6 sockets, the first fragment can get
  its hash via ip6_make_flowlabel()/skb_get_hash_flowi6(), if
  auto_flowlabel is enabled

For the following frags the hash is usually computed via
skb_get_hash().
The above can cause OoO for unconnected IPv6 UDPv6 socket: in that
scenario the egress tx queue can be selected on a per packet basis
via the skb hash.
It may also fool flow-oriented schedulers to place fragments belonging
to the same datagram in different flows.

Fix the issue by copying the skb hash from the head frag into
the others at fragmentation time.

Before this commit:
perf probe -a "dev_queue_xmit skb skb->hash skb->l4_hash:b1@0/8 skb->sw_hash:b1@1/8"
netperf -H $IPV4 -t UDP_STREAM -l 5 -- -m 2000 -n &
perf record -e probe:dev_queue_xmit -e probe:skb_set_owner_w -a sleep 0.1
perf script
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=3713014309 l4_hash=1 sw_hash=0
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=0 l4_hash=0 sw_hash=0

After this commit:
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0

Fixes: b73c3d0e4f ("net: Save TX flow hash in sock and set in skbuf on xmit")
Fixes: 67800f9b1f ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-23 11:39:30 -07:00
..
bpfilter net: bpfilter: make function bpfilter_mbox_request() static 2018-05-29 09:51:44 -04:00
netfilter netfilter: nf_tproxy: fix possible non-linear access to transport header 2018-07-06 14:32:44 +02:00
af_inet.c Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
ah4.c
arp.c proc: introduce proc_create_net{,_data} 2018-05-16 07:24:30 +02:00
cipso_ipv4.c
datagram.c
devinet.c net/ipv4: Add support for specifying metric of connected routes 2018-05-29 10:12:45 -04:00
esp4_offload.c
esp4.c
fib_frontend.c net/ipv4: Set oif in fib_compute_spec_dst 2018-07-08 10:54:58 +09:00
fib_lookup.h
fib_notifier.c
fib_rules.c net: fib_rules: add extack support 2018-04-23 10:21:24 -04:00
fib_semantics.c net: metrics: add proper netlink validation 2018-06-05 12:29:43 -04:00
fib_trie.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-06-06 18:39:49 -07:00
fou.c net: fix use-after-free in GRO with ESP 2018-07-02 20:34:04 +09:00
gre_demux.c
gre_offload.c net: fix use-after-free in GRO with ESP 2018-07-02 20:34:04 +09:00
icmp.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
igmp.c multicast: do not restore deleted record source filter mode to new one 2018-07-21 22:58:17 -07:00
inet_connection_sock.c net: ipv4: remove define INET_CSK_DEBUG and unnecessary EXPORT_SYMBOL 2018-05-10 17:43:55 -04:00
inet_diag.c sock_diag: request _diag module only when the family or proto has been registered 2018-03-12 11:03:42 -04:00
inet_fragment.c ipfrag: really prevent allocation on netns exit 2018-07-08 13:05:33 +09:00
inet_hashtables.c net/tcp: Fix socket lookups with SO_BINDTODEVICE 2018-06-20 08:03:06 +09:00
inet_timewait_sock.c soreuseport: initialise timewait reuseport field 2018-04-07 22:32:32 -04:00
inetpeer.c inetpeer: fix uninit-value in inet_getpeer 2018-04-09 10:57:35 -04:00
ip_forward.c net: rename skb_gso_validate_mtu -> skb_gso_validate_network_len 2018-03-04 17:49:17 -05:00
ip_fragment.c inet: frags: fix ip6frag_low_thresh boundary 2018-04-04 12:04:59 -04:00
ip_gre.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-05-21 16:01:54 -04:00
ip_input.c net: Make ip_ra_chain per struct net 2018-03-22 15:12:56 -04:00
ip_options.c
ip_output.c ip: hash fragments consistently 2018-07-23 11:39:30 -07:00
ip_sockglue.c ipv4/igmp: init group mode as INCLUDE when join source group 2018-07-16 11:20:06 -07:00
ip_tunnel_core.c net/ipv4: Update ip_tunnel_metadata_cnt static key to modern api 2018-05-10 15:13:33 -04:00
ip_tunnel.c ip_tunnel: Fix name string concatenate in __ip_tunnel_create() 2018-06-07 16:27:16 -04:00
ip_vti.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
ipcomp.c
ipconfig.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-06-06 18:39:49 -07:00
ipip.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
ipmr_base.c ipmr: fix error path when ipmr_new_table fails 2018-06-05 12:26:41 -04:00
ipmr.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-06-06 18:39:49 -07:00
Kconfig ipmr,ipmr6: Define a uniform vif_device 2018-03-01 13:13:23 -05:00
Makefile ipv4: support sport, dport and ip_proto in RTM_GETROUTE 2018-05-23 15:14:12 -04:00
metrics.c net: metrics: add proper netlink validation 2018-06-05 12:29:43 -04:00
netfilter.c
netlink.c ipv4: support sport, dport and ip_proto in RTM_GETROUTE 2018-05-23 15:14:12 -04:00
ping.c proc: introduce proc_create_net{,_data} 2018-05-16 07:24:30 +02:00
proc.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-06-06 18:39:49 -07:00
protocol.c
raw_diag.c
raw.c proc: introduce proc_create_net{,_data} 2018-05-16 07:24:30 +02:00
route.c treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00
syncookies.c net/ipv4: disable SMC TCP option with SYN Cookies 2018-03-25 20:53:54 -04:00
sysctl_net_ipv4.c ipv4: Return EINVAL when ping_group_range sysctl doesn't map to user ns 2018-07-06 11:51:18 +09:00
tcp_bbr.c tcp_bbr: fix to zero idle_restart only upon S/ACKed data 2018-05-02 11:12:32 -04:00
tcp_bic.c
tcp_cdg.c
tcp_cong.c
tcp_cubic.c
tcp_dctcp.c tcp: do not delay ACK in DCTCP upon CE status change 2018-07-20 14:32:23 -07:00
tcp_diag.c
tcp_fastopen.c
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c net/tcp/illinois: replace broken algorithm reference link 2018-02-28 12:03:47 -05:00
tcp_input.c tcp: do not delay ACK in DCTCP upon CE status change 2018-07-20 14:32:23 -07:00
tcp_ipv4.c tcp: fix sequence numbers for repaired sockets re-using TIME-WAIT sockets 2018-07-12 14:33:45 -07:00
tcp_lp.c
tcp_metrics.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
tcp_minisocks.c net-tcp: remove useless tw_timeout field 2018-06-05 10:45:24 -04:00
tcp_nv.c
tcp_offload.c tcp: Do not reload skb pointer after skb_gro_receive(). 2018-06-11 20:00:56 -07:00
tcp_output.c tcp: do not cancel delay-AcK on DCTCP special ACK 2018-07-20 14:32:23 -07:00
tcp_rate.c
tcp_recovery.c tcp: tcp_rack_reo_wnd() can be static 2018-05-18 13:28:40 -04:00
tcp_scalable.c
tcp_timer.c tcp: add SACK compression 2018-05-18 11:40:27 -04:00
tcp_ulp.c
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c
tcp.c tcp: identify cryptic messages as TCP seq # bugs 2018-07-18 15:26:33 -07:00
tunnel4.c inet: whitespace cleanup 2018-02-28 11:43:28 -05:00
udp_diag.c udp: fix rx queue len reported by diag and proc interface 2018-06-08 19:55:15 -04:00
udp_impl.h
udp_offload.c net: fix use-after-free in GRO with ESP 2018-07-02 20:34:04 +09:00
udp_tunnel.c
udp.c Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
udplite.c proc: introduce proc_create_net{,_data} 2018-05-16 07:24:30 +02:00
xfrm4_input.c
xfrm4_mode_beet.c
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c xfrm: Verify MAC header exists before overwriting eth_hdr(skb)->h_proto 2018-03-07 10:54:29 +01:00
xfrm4_output.c net: xfrm: use skb_gso_validate_network_len() to check gso sizes 2018-03-04 17:49:17 -05:00
xfrm4_policy.c net: Drop pernet_operations::async 2018-03-27 13:18:09 -04:00
xfrm4_protocol.c
xfrm4_state.c
xfrm4_tunnel.c