linux/net/core
Ido Schimmel e4961b0768 net: core: Correctly iterate over lower adjacency list
Tamir reported the following trace when processing ARP requests received
via a vlan device on top of a VLAN-aware bridge:

 NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0]
[...]
 CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W       4.8.0-rc7 #1
 Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
 task: ffff88017edfea40 task.stack: ffff88017ee10000
 RIP: 0010:[<ffffffff815dcc73>]  [<ffffffff815dcc73>] netdev_all_lower_get_next_rcu+0x33/0x60
[...]
 Call Trace:
  <IRQ>
  [<ffffffffa015de0a>] mlxsw_sp_port_lower_dev_hold+0x5a/0xa0 [mlxsw_spectrum]
  [<ffffffffa016f1b0>] mlxsw_sp_router_netevent_event+0x80/0x150 [mlxsw_spectrum]
  [<ffffffff810ad07a>] notifier_call_chain+0x4a/0x70
  [<ffffffff810ad13a>] atomic_notifier_call_chain+0x1a/0x20
  [<ffffffff815ee77b>] call_netevent_notifiers+0x1b/0x20
  [<ffffffff815f2eb6>] neigh_update+0x306/0x740
  [<ffffffff815f38ce>] neigh_event_ns+0x4e/0xb0
  [<ffffffff8165ea3f>] arp_process+0x66f/0x700
  [<ffffffff8170214c>] ? common_interrupt+0x8c/0x8c
  [<ffffffff8165ec29>] arp_rcv+0x139/0x1d0
  [<ffffffff816e505a>] ? vlan_do_receive+0xda/0x320
  [<ffffffff815e3794>] __netif_receive_skb_core+0x524/0xab0
  [<ffffffff815e6830>] ? dev_queue_xmit+0x10/0x20
  [<ffffffffa06d612d>] ? br_forward_finish+0x3d/0xc0 [bridge]
  [<ffffffffa06e5796>] ? br_handle_vlan+0xf6/0x1b0 [bridge]
  [<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
  [<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
  [<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
  [<ffffffffa06d7856>] br_pass_frame_up+0xc6/0x160 [bridge]
  [<ffffffffa06d63d7>] ? deliver_clone+0x37/0x50 [bridge]
  [<ffffffffa06d656c>] ? br_flood+0xcc/0x160 [bridge]
  [<ffffffffa06d7b14>] br_handle_frame_finish+0x224/0x4f0 [bridge]
  [<ffffffffa06d7f94>] br_handle_frame+0x174/0x300 [bridge]
  [<ffffffff815e3599>] __netif_receive_skb_core+0x329/0xab0
  [<ffffffff81374815>] ? find_next_bit+0x15/0x20
  [<ffffffff8135e802>] ? cpumask_next_and+0x32/0x50
  [<ffffffff810c9968>] ? load_balance+0x178/0x9b0
  [<ffffffff815e3d38>] __netif_receive_skb+0x18/0x60
  [<ffffffff815e3dc0>] netif_receive_skb_internal+0x40/0xb0
  [<ffffffff815e3e4c>] netif_receive_skb+0x1c/0x70
  [<ffffffffa01544a1>] mlxsw_sp_rx_listener_func+0x61/0xb0 [mlxsw_spectrum]
  [<ffffffffa005c9f7>] mlxsw_core_skb_receive+0x187/0x200 [mlxsw_core]
  [<ffffffffa007332a>] mlxsw_pci_cq_tasklet+0x63a/0x9b0 [mlxsw_pci]
  [<ffffffff81091986>] tasklet_action+0xf6/0x110
  [<ffffffff81704556>] __do_softirq+0xf6/0x280
  [<ffffffff8109213f>] irq_exit+0xdf/0xf0
  [<ffffffff817042b4>] do_IRQ+0x54/0xd0
  [<ffffffff8170214c>] common_interrupt+0x8c/0x8c

The problem is that netdev_all_lower_get_next_rcu() never advances the
iterator, thereby causing the loop over the lower adjacency list to run
forever.

Fix this by advancing the iterator and avoid the infinite loop.

Fixes: 7ce856aaaf ("mlxsw: spectrum: Add couple of lower device helper functions")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Tamir Winetroub <tamirw@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-19 10:38:08 -04:00
..
datagram.c udp: enable MSG_PEEK at non-zero offset 2016-04-05 16:29:37 -04:00
dev_addr_lists.c net: fix spelling for synchronized 2014-11-18 15:26:32 -05:00
dev_ioctl.c dev_ioctl: use sizeof(x) instead of sizeof x 2014-11-18 15:27:32 -05:00
dev.c net: core: Correctly iterate over lower adjacency list 2016-10-19 10:38:08 -04:00
devlink.c devlink: add hardware messages tracing facility 2016-07-12 14:20:18 -07:00
drop_monitor.c drop_monitor: make genl_multicast_group const 2016-09-01 14:09:00 -07:00
dst_cache.c net: dst_cache_per_cpu_dst_set() can be static 2016-03-18 17:45:08 -04:00
dst.c net: add dst_cache to ovs vxlan lwtunnel 2016-02-16 20:21:48 -05:00
ethtool.c sctp: Add GSO support 2016-06-03 19:37:21 -04:00
fib_rules.c fib_rules: Added NLM_F_EXCL support to fib_nl_newrule 2016-06-30 08:23:19 -04:00
filter.c bpf: add helper to invalidate hash 2016-09-23 08:40:28 -04:00
flow_dissector.c flow_dissector: Check skb for VLAN only if skb specified. 2016-10-19 10:35:46 -04:00
flow.c flowcache: Avoid OOM condition under preasure 2016-03-17 10:28:42 +01:00
gen_estimator.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
gen_stats.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-06-10 11:52:24 -07:00
hwbm.c net: hwbm: Fix unbalanced spinlock in error case 2016-05-25 12:35:09 -07:00
link_watch.c dev: introduce dev_get_iflink() 2015-04-02 14:04:59 -04:00
lwtunnel.c net: lwtunnel: Handle fragmentation 2016-08-30 22:27:18 -07:00
Makefile net: add a hardware buffer management helper API 2016-03-14 12:19:46 -04:00
neighbour.c neigh: allow admin to set NUD_STALE 2016-08-08 15:36:38 -07:00
net_namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2016-10-06 09:52:23 -07:00
net-procfs.c net: remove NETDEV_TX_LOCKED support 2016-04-26 15:53:05 -04:00
net-sysfs.c net: introduce NETDEV_CHANGE_TX_QUEUE_LEN 2016-07-01 05:32:17 -04:00
net-sysfs.h
net-traces.c net: IPv6 fib lookup tracepoint 2015-11-22 11:54:10 -05:00
netclassid_cgroup.c core: remove unneded headers for net cgroup controllers. 2016-02-17 15:31:27 -05:00
netevent.c netevent: remove automatic variable in register_netevent_notifier() 2015-05-31 00:03:21 -07:00
netpoll.c net: tracepoint napi:napi_poll add work and budget 2016-07-09 18:05:02 -04:00
netprio_cgroup.c core: remove unneded headers for net cgroup controllers. 2016-02-17 15:31:27 -05:00
pktgen.c net: pktgen: remove rcu locking in pktgen_change_name() 2016-10-17 10:52:59 -04:00
ptp_classifier.c ptp: Change ptp_class to a proper bitmask 2015-11-03 11:08:22 -05:00
request_sock.c tcp: restore fastopen operations 2015-10-05 03:19:06 -07:00
rtnetlink.c net: rtnl: info leak in rtnl_fill_vfinfo() 2016-10-13 12:12:04 -04:00
scm.c unix: correctly track in-flight fds in sending process user_struct 2016-02-08 10:30:42 -05:00
secure_seq.c net: remove a sparse error in secure_dccpv6_sequence_number() 2015-05-25 22:55:37 -04:00
skbuff.c Merge branch 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-07 15:36:58 -07:00
sock_diag.c sock_diag: align nlattr properly when needed 2016-04-26 12:00:48 -04:00
sock_reuseport.c soreuseport: do not export reuseport_add_sock() 2016-10-18 14:18:23 -04:00
sock.c mm: memcontrol: consolidate cgroup socket tracking 2016-10-07 18:46:29 -07:00
stream.c net: do not export sk_stream_write_space 2016-09-28 20:32:38 -04:00
sysctl_net_core.c bpf: add generic constant blinding for use in jits 2016-05-16 13:49:32 -04:00
timestamping.c net: skb_defer_rx_timestamp should check for phydev before setting up classify 2015-07-09 14:17:15 -07:00
tso.c net: tso: add support for IPv6 2015-10-26 22:24:22 -07:00
utils.c net: the space is required before the open parenthesis '(' 2016-06-29 05:15:14 -04:00