linux/net
Ihar Hrachyshka 77d7123342 neighbour: update neigh timestamps iff update is effective
It's a common practice to send gratuitous ARPs after moving an
IP address to another device to speed up healing of a service. To
fulfill service availability constraints, the timing of network peers
updating their caches to point to a new location of an IP address can be
particularly important.

Sometimes neigh_update calls won't touch neither lladdr nor state, for
example if an update arrives in locktime interval. The neigh->updated
value is tested by the protocol specific neigh code, which in turn
will influence whether NEIGH_UPDATE_F_OVERRIDE gets set in the
call to neigh_update() or not. As a result, we may effectively ignore
the update request, bailing out of touching the neigh entry, except that
we still bump its timestamps inside neigh_update.

This may be a problem for updates arriving in quick succession. For
example, consider the following scenario:

A service is moved to another device with its IP address. The new device
sends three gratuitous ARP requests into the network with ~1 seconds
interval between them. Just before the first request arrives to one of
network peer nodes, its neigh entry for the IP address transitions from
STALE to DELAY.  This transition, among other things, updates
neigh->updated. Once the kernel receives the first gratuitous ARP, it
ignores it because its arrival time is inside the locktime interval. The
kernel still bumps neigh->updated. Then the second gratuitous ARP
request arrives, and it's also ignored because it's still in the (new)
locktime interval. Same happens for the third request. The node
eventually heals itself (after delay_first_probe_time seconds since the
initial transition to DELAY state), but it just wasted some time and
require a new ARP request/reply round trip. This unfortunate behaviour
both puts more load on the network, as well as reduces service
availability.

This patch changes neigh_update so that it bumps neigh->updated (as well
as neigh->confirmed) only once we are sure that either lladdr or entry
state will change). In the scenario described above, it means that the
second gratuitous ARP request will actually update the entry lladdr.

Ideally, we would update the neigh entry on the very first gratuitous
ARP request. The locktime mechanism is designed to ignore ARP updates in
a short timeframe after a previous ARP update was honoured by the kernel
layer. This would require tracking timestamps for state transitions
separately from timestamps when actual updates are received. This would
probably involve changes in neighbour struct. Therefore, the patch
doesn't tackle the issue of the first gratuitous APR ignored, leaving
it for a follow-up.

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17 11:41:38 -04:00
..
6lowpan 6lowpan: Don't set IFF_NO_QUEUE 2017-04-12 22:02:40 +02:00
9p xen: fixes and featrues for 4.12 2017-05-04 11:37:09 -07:00
802 Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
8021q vlan: Keep NETIF_F_HW_CSUM similar to other software devices 2017-05-08 14:39:19 -04:00
appletalk lib/vsprintf.c: remove %Z support 2017-02-27 18:43:47 -08:00
atm neighbour: fix nlmsg_pid in notifications 2017-03-22 10:48:49 -07:00
ax25 net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
batman-adv This feature/cleanup patchset includes the following patches: 2017-04-06 14:37:50 -07:00
bluetooth Bluetooth: Add selftest for ECDH key generation 2017-04-30 16:52:43 +03:00
bpf bpf: Align packet data properly in program testing framework. 2017-05-02 11:46:28 -04:00
bridge bridge: netlink: account for IFLA_BRPORT_{B, M}CAST_FLOOD size and policy 2017-05-05 11:21:03 -04:00
caif sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> 2017-03-02 08:42:29 +01:00
can can: fix CAN BCM build with CONFIG_PROC_FS disabled 2017-04-27 09:34:13 +02:00
ceph The two main items are support for disabling automatic rbd exclusive 2017-05-10 08:42:33 -07:00
core neighbour: update neigh timestamps iff update is effective 2017-05-17 11:41:38 -04:00
dcb net: rtnetlink: plumb extended ack to doit function 2017-04-17 15:35:38 -04:00
dccp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-05-15 15:50:49 -07:00
decnet Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-05-09 15:42:31 -07:00
dns_resolver Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-03 10:16:38 -08:00
dsa net: dsa: Remove redundant NULL dst check 2017-04-21 10:41:24 -04:00
ethernet Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2017-02-16 21:25:49 -05:00
hsr netlink: extended ACK reporting 2017-04-13 13:58:20 -04:00
ieee802154 netlink: pass extended ACK struct where available 2017-04-13 13:58:22 -04:00
ife net: Introduce ife encapsulation module 2017-02-03 15:16:45 -05:00
ipv4 arp: honour gratuitous ARP _replies_ 2017-05-17 11:32:44 -04:00
ipv6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-05-15 15:50:49 -07:00
ipx ipx: call ipxitf_put() in ioctl error path 2017-05-02 15:34:53 -04:00
irda net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
iucv net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
kcm kcm: remove a useless copy_from_user() 2017-04-17 13:28:48 -04:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-04-21 20:23:53 -07:00
l2tp l2tp: remove useless device duplication test in l2tp_eth_create() 2017-04-27 16:32:13 -04:00
l3mdev
lapb Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
llc Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu 2017-04-23 11:12:44 +02:00
mac80211 mac80211: fix IBSS presp allocation size 2017-05-08 11:25:04 +02:00
mac802154 drivers: add explicit interrupt.h includes 2017-03-30 11:05:34 -07:00
mpls treewide: use kv[mz]alloc* rather than opencoded variants 2017-05-08 17:15:13 -07:00
ncsi
netfilter Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-10 10:30:46 -07:00
netlabel netlink: pass extended ACK struct to parsing functions 2017-04-13 13:58:22 -04:00
netlink netlink: pass extended ACK struct where available 2017-04-13 13:58:22 -04:00
netrom net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
nfc NFC 4.12 pull request 2017-04-21 15:29:40 -04:00
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf 2017-05-03 10:11:26 -04:00
packet net/packet: fix missing net_device reference release 2017-05-15 14:22:12 -04:00
phonet net: rtnetlink: plumb extended ack to doit function 2017-04-17 15:35:38 -04:00
psample net: Introduce psample, a new genetlink channel for packet sampling 2017-01-24 13:44:28 -05:00
qrtr Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-04-21 20:23:53 -07:00
rds Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
rfkill rfkill: remove rfkill-regulator 2017-01-24 11:07:35 +01:00
rose net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
rxrpc rxrpc: Trace client call connection 2017-04-06 11:10:41 +01:00
sched net: sched: optimize class dumps 2017-05-11 21:37:40 -04:00
sctp sctp: fix src address selection if using secondary addresses for ipv6 2017-05-12 10:50:32 -04:00
smc net/smc: Add warning about remote memory exposure 2017-05-16 14:49:43 -04:00
strparser strparser: destroy workqueue on module exit 2017-03-03 20:43:26 -08:00
sunrpc Another RDMA update from Chuck Lever, and a bunch of miscellaneous 2017-05-10 13:29:23 -07:00
switchdev netlink: pass extended ACK struct to parsing functions 2017-04-13 13:58:22 -04:00
tipc tipc: make macro tipc_wait_for_cond() smp safe 2017-05-11 22:19:30 -04:00
unix af_unix: Use designated initializers 2017-04-06 12:43:04 -07:00
vmw_vsock virtio: fixes, cleanups, performance 2017-05-10 11:33:08 -07:00
wimax genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
wireless nl80211: correctly validate MU-MIMO groups 2017-05-08 11:24:34 +02:00
x25 net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
compat.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-02-22 10:15:09 -08:00
Kconfig bpf: make jited programs visible in traces 2017-02-17 13:40:05 -05:00
Makefile bpf: introduce BPF_PROG_TEST_RUN command 2017-04-01 12:45:57 -07:00
socket.c l2tp: device MTU setup, tunnel socket needs a lock 2017-04-17 13:01:48 -04:00
sysctl_net.c sysctl: Remove dead register_sysctl_root 2017-04-16 23:42:49 -05:00