linux

Author	SHA1	Message	Date
NeilBrown	b7b17c9b67	md: remove mddev_lock() from md_attr_show() Most attributes can be read safely without any locking. A race might lead to a slightly out-dated value, but nothing wrong. We already have locking in some places where needed. All that remains is can_clear_show(), behind_writes_used_show() and action_show() which are easily fixed. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-06 09:32:55 +11:00
NeilBrown	7b1485bab9	md/raid5: use ->lock to protect accessing raid5 sysfs attributes. It is important that mddev->private isn't freed while a sysfs attribute function is accessing it. So use mddev->lock to protect the setting of ->private to NULL, and take that lock when checking ->private for NULL and de-referencing it in the sysfs access functions. This only applies to the read ('show') side of access. Write access will be handled separately. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-06 09:32:55 +11:00
NeilBrown	f97fcad38f	md: remove need for mddev_lock() in md_seq_show() The only access in md_seq_show that could suffer from races not protected by ->lock is walking the rdev list. This can receive sufficient protection from 'rcu'. So use rdev_for_each_rcu() and get rid of mddev_lock(). Now reading /proc/mdstat will never block in md_seq_show. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-06 09:32:55 +11:00
NeilBrown	978a7a47ca	md/bitmap: protect clearing of ->bitmap by mddev->lock This makes it safe to inspect the struct while holding only the spinlock. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-06 09:32:55 +11:00
NeilBrown	36d091f475	md: protect ->pers changes with mddev->lock ->pers is already protected by ->reconfig_mutex, and cannot possibly change when there are threads running or outstanding IO. However there are some places where we access ->pers not in a thread or IO context, and where ->reconfig_mutex is unnecessarily heavy-weight: level_show and md_seq_show(). So protect all changes, and those accesses, with ->lock. This is a step toward taking those accesses out from under reconfig_mutex. [Fixed missing "mddev->pers" -> "pers" conversion, thanks to Dan Carpenter <dan.carpenter@oracle.com>] Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:53 +11:00
NeilBrown	db721d32b7	md: level_store: group all important changes into one place. Gather all the changes that can happen atomically and might be relevant to other code into one place. This will make it easier to refine the locking. Note that this puts quite a few things between mddev_detach() and ->free(). Enabling this was the point of some recent patches. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:53 +11:00
NeilBrown	afa0f557cb	md: rename ->stop to ->free Now that the ->stop function only frees the private data, rename is accordingly. Also pass in the private pointer as an arg rather than using mddev->private. This flexibility will be useful in level_store(). Finally, don't clear ->private. It doesn't make sense to clear it seeing that isn't what we free, and it is no longer necessary to clear ->private (it was some time ago before ->to_remove was introduced). Setting ->to_remove in ->free() is a bit of a wart, but not a big problem at the moment. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:52 +11:00
NeilBrown	5aa61f427e	md: split detach operation out from ->stop. Each md personality has a 'stop' operation which does two things: 1/ it finalizes some aspects of the array to ensure nothing is accessing the ->private data 2/ it frees the ->private data. All the steps in '1' can apply to all arrays and so can be performed in common code. This is useful as in the case where we change the personality which manages an array (in level_store()), it would be helpful to do step 1 early, and step 2 later. So split the 'step 1' functionality out into a new mddev_detach(). Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:52 +11:00
NeilBrown	3be260cc18	md/linear: remove rcu protections in favour of suspend/resume The use of 'rcu' to protect accesses to ->private_data so that the ->private_data could be updated predates the introduction of mddev_suspend/mddev_resume. These are a cleaner mechanism for providing stability while swapping in a new ->private data - it is used by level_store() to support changing of raid levels. So get rid of the RCU stuff and just use mddev_suspend, mddev_resume. As these function call ->quiesce(), we add an empty function for linear just like for raid0. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:52 +11:00
NeilBrown	64590f45dd	md: make merge_bvec_fn more robust in face of personality changes. There is no locking around calls to merge_bvec_fn(), so it is possible that calls which coincide with a level (or personality) change could go wrong. So create a central dispatch point for these functions and use rcu_read_lock(). If the array is suspended, reject any merge that can be rejected. If not, we know it is safe to call the function. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:52 +11:00
NeilBrown	5c675f83c6	md: make ->congested robust against personality changes. There is currently no locking around calls to the 'congested' bdi function. If called at an awkward time while an array is being converted from one level (or personality) to another, there is a tiny chance of running code in an unreferenced module etc. So add a 'congested' function to the md_personality operations structure, and call it with appropriate locking from a central 'mddev_congested'. When the array personality is changing the array will be 'suspended' so no IO is processed. If mddev_congested detects this, it simply reports that the array is congested, which is a safe guess. As mddev_suspend calls synchronize_rcu(), mddev_congested can avoid races by included the whole call inside an rcu_read_lock() region. This require that the congested functions for all subordinate devices can be run under rcu_lock. Fortunately this is the case. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:52 +11:00
NeilBrown	85572d7c75	md: rename mddev->write_lock to mddev->lock This lock is used for (slightly) more than helping with writing superblocks, and it will soon be extended further. So the name is inappropriate. Also, the _irq variant hasn't been needed since 2.6.37 as it is never taking from interrupt or bh context. So: -rename write_lock to lock -document what it protects -remove _irq ... except in md_flush_request() as there is no wait_event_lock() (with no _irq). This can be cleaned up after appropriate changes to wait.h. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:52 +11:00
NeilBrown	ea664c8245	md/raid5: need_this_block: tidy/fix last condition. That last condition is unclear and over cautious. There are two related issues here. If a partial write is destined for a missing device, then either RMW or RCW can work. We must read all the available block. Only then can the missing blocks be calculated, and then the parity update performed. If RMW is not an option, then there is a complication even without partial writes. If we would need to read a missing device to perform the reconstruction, then we must first read every block so the missing device data can be computed. This is the case for RAID6 (Which currently does not support RMW) and for times when we don't trust the parity (after a crash) and so are in the process of resyncing it. So make these two cases more clear and separate, and perform the relevant tests more thoroughly. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:51 +11:00
NeilBrown	a9d56950f7	md/raid5: need_this_block: start simplifying the last two conditions. Both the last two cases are only relevant if something has failed and something needs to be written (but not over-written), and if it is OK to pre-read blocks at this point. So factor out those tests and explain them. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:51 +11:00
NeilBrown	a79cfe12c6	md/raid5: separate out the easy conditions in need_this_block. Some of the conditions in need_this_block have very straight forward motivation. Separate those out and document them. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:51 +11:00
NeilBrown	2c58f06e6f	md/raid5: separate large if clause out of fetch_block(). fetch_block() has a very large and hard to read 'if' condition. Separate it into its own function so that it can be made more readable. Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:51 +11:00
Jes Sorensen	ad3ab8b608	md: do_release_stripe(): No need to call md_wakeup_thread() twice `67f455486d` introduced a call to md_wakeup_thread() when adding to the delayed_list. However the md thread is woken up unconditionally just below. Remove the unnecessary wakeup call. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:51 +11:00
Jan Beulich	75aaf4c3e6	x86/raid6: correctly check for assembler capabilities Just like for AVX2 (which simply needs an #if -> #ifdef conversion), SSSE3 assembler support should be checked for before using it. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Jim Kukunas <james.t.kukunas@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-04 08:35:51 +11:00
NeilBrown	d959014334	md/bitmap: fix a might_sleep() warning. commit `8eb23b9f35` sched: Debug nested sleeps causes false-positive warnings in RAID5 code. This annotation removes them and adds a comment explaining why there is no real problem. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-02 17:08:03 +11:00
NeilBrown	b1b02fe97f	md/raid5: fix another livelock caused by non-aligned writes. If a non-page-aligned write is destined for a device which is missing/faulty, we can deadlock. As the target device is missing, a read-modify-write cycle is not possible. As the write is not for a full-page, a recontruct-write cycle is not possible. This should be handled by logic in fetch_block() which notices there is a non-R5_OVERWRITE write to a missing device, and so loads all blocks. However since commit `67f455486d`, that code requires STRIPE_PREREAD_ACTIVE before it will active, and those circumstances never set STRIPE_PREREAD_ACTIVE. So: in handle_stripe_dirtying, if neither rmw or rcw was possible, set STRIPE_DELAYED, which will cause STRIPE_PREREAD_ACTIVE be set after a suitable delay. Fixes: `67f455486d` Cc: stable@vger.kernel.org (v3.16+) Reported-by: Mikulas Patocka <mpatocka@redhat.com> Tested-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>	2015-02-02 16:57:17 +11:00
Linus Torvalds	59343cd7c4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Don't OOPS on socket AIO, from Christoph Hellwig. 2) Scheduled scans should be aborted upon RFKILL, from Emmanuel Grumbach. 3) Fix sleep in atomic context in kvaser_usb, from Ahmed S Darwish. 4) Fix RCU locking across copy_to_user() in bpf code, from Alexei Starovoitov. 5) Lots of crash, memory leak, short TX packet et al bug fixes in sh_eth from Ben Hutchings. 6) Fix memory corruption in SCTP wrt. INIT collitions, from Daniel Borkmann. 7) Fix return value logic for poll handlers in netxen, enic, and bnx2x. From Eric Dumazet and Govindarajulu Varadarajan. 8) Header length calculation fix in mac80211 from Fred Chou. 9) mv643xx_eth doesn't handle highmem correctly in non-TSO code paths. From Ezequiel Garcia. 10) udp_diag has bogus logic in it's hash chain skipping, copy same fix tcp diag used. From Herbert Xu. 11) amd-xgbe programs wrong rx flow control register, from Thomas Lendacky. 12) Fix race leading to use after free in ping receive path, from Subash Abhinov Kasiviswanathan. 13) Cache redirect routes otherwise we can get a heavy backlog of rcu jobs liberating DST_NOCACHE entries. From Hannes Frederic Sowa. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (48 commits) net: don't OOPS on socket aio stmmac: prevent probe drivers to crash kernel bnx2x: fix napi poll return value for repoll ipv6: replacing a rt6_info needs to purge possible propagated rt6_infos too sh_eth: Fix DMA-API usage for RX buffers sh_eth: Check for DMA mapping errors on transmit sh_eth: Ensure DMA engines are stopped before freeing buffers sh_eth: Remove RX overflow log messages ping: Fix race in free in receive path udp_diag: Fix socket skipping within chain can: kvaser_usb: Fix state handling upon BUS_ERROR events can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT can: kvaser_usb: Send correct context to URB completion can: kvaser_usb: Do not sleep in atomic context ipv4: try to cache dst_entries which would cause a redirect samples: bpf: relax test_maps check bpf: rcu lock must not be held when calling copy_to_user() net: sctp: fix slab corruption from use after free on INIT collisions net: mv643xx_eth: Fix highmem support in non-TSO egress path sh_eth: Fix serialisation of interrupt disable with interrupt & NAPI handlers ...	2015-01-27 13:55:36 -08:00
Christoph Hellwig	06539d3071	net: don't OOPS on socket aio Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 12:25:33 -08:00
Andy Shevchenko	9afec6efc6	stmmac: prevent probe drivers to crash kernel In the case when alloc_netdev fails we return NULL to a caller. But there is no check for NULL in the probe drivers. This patch changes NULL to an error pointer. The function description is amended to reflect what we may get returned. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 12:24:30 -08:00
Linus Torvalds	7da323bb45	Two powerpc fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUxx7MAAoJEFHr6jzI4aWAqe0QAIGGb2mtU1gON4XtWIlYPzSu TZDGAK6pCeCr92tH5H3HXy7hOEbZOPQGJCqvOHMK/kcJtlkXSrW1ERDPryqEv9Xq V+PJMY5Tre1ga1xttyk3ny4zKmAy4UcLm6Wepi5054kB0Svo7/s2IKECwsInY/D3 yYxSW4XUZXvCYqdZUqdQbCSgocWOfFJbPqt4kf2IiqjCPlWU1GGjpMpmqXFhTh4Y wuU0FtcC7db+wCvjaXs/dbCHMZqWqsQ+WCo9kz2+fLtRbthXSk0A9V96jLzah32/ yQHqyH2+/6ZklxApjgvuQ4ZhZogU2U2PlFLfpJqiBhy+5WnkBpZtJ1lIj9hGlyC4 pvKjrHTp0Q8+gZ8ZxuEuiKV1L+01kWbtQwg4Z5HIuulz7DfiSGatSez6CuwIeV49 A5dQmse0bK6L4XWlB+RKyLdnj+HSPVFIcLlTPMSV/UlDp7IySphR+zp28ykyWv0C yGa9JUeD7hvM3Oezo88QoS4P5rk/KmQjPMhYz+ab8/eX3RCtBA1jb/KrdLm2b4tM sVJyuiyalL35IoOZlAh8rrKHfu1ZgXyYkJgMdccAy2yQZghmK4H8o8gdf9YWGNNq vEZHk+2gKI1T1o5uqvHMmkw0IWV4pE9jm3Jyo2W/pZozxR3oswOtUebCXbK+mLqy 81GwzBWwALxbRM5HGtIr =cDwY -----END PGP SIGNATURE----- Merge tag 'powerpc-3.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc fixes from Michael Ellerman: "Two powerpc fixes" * tag 'powerpc-3.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: powerpc/powernv: Restore LPCR with LPCR_PECE1 cleared powerpc/xmon: Fix another endiannes issue in RTAS call from xmon	2015-01-27 10:04:38 -08:00
Linus Torvalds	41592e2fb5	SCSI was using module_refcount() to figure out when the module was unloading: this broke with new atomic refcounting. The code is still suspicious, but this solves the WARN_ON(). Thanks, Rusty. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUxuOxAAoJENkgDmzRrbjxtGIP/im0s3L+wM28NdQ5zFOiIpwX 0N4y4U6Aa+rrLs27jDZtXW4ss+fr/V7pUSD1uwTrTHl9X4a3M+DLpEME5Rq40TXG MYHihlQo8PM/yCuTMFR7esF20/q9CIksuVA3GkjYqwBW+ks097+t60Yn7YSfX22U HzSiR3Pp6yPjMoZWROP576rc9VtOM6tScKU1rzwe8Bo7ayzNADmTFJR4UpQuiQhU NwHn7Lih6EGsVTtaX3KS91dA7v5Un/S4jskqAkkwOW3ipgf38LLfjETueHvDWpda szVFoMMMkqDwse2MyluiLk/80Cpl528GO24D0MMbQoPfsUi7OgxQN6t1QviRxeut pH1VyK4VrRDqhpGB+5l7PQUOk58dL4SjmIVO/G9wxMgKzw2giONJdt/zHBXMr5/l 6RYINwHEQalMoG3X+kgYKa5E1WPXeJsE4xQTBIQRsaHO+9lk95BbNUVfUgvygUPV TTaymhHhEYOsntiMHPkTI05FkK79khs/qX7lU1MVtZ9x7UaYXs3ziY7z+D5fRKf1 WgoYemWBeFZznsceacNSZGjFsaMJr6gsZ/rVuPSbytXDXSOrQsJxiENph7CnHbRk 4UJffEomjdQ93CNGBI2hCYlr4Gf5DflbwHjJbT+YXcqumkXJq4+ktSFp5aCH+9q8 901XYQ7QIJ4roIOzODUG =JI7o -----END PGP SIGNATURE----- Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull one more module fix from Rusty Russell: "SCSI was using module_refcount() to figure out when the module was unloading: this broke with new atomic refcounting. The code is still suspicious, but this solves the WARN_ON()" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: scsi: always increment reference count	2015-01-27 09:02:09 -08:00
Govindarajulu Varadarajan	24e579c889	bnx2x: fix napi poll return value for repoll With the commit `d75b1ade56` ("net: less interrupt masking in NAPI") napi repoll is done only when work_done == budget. When in busy_poll is we return 0 in napi_poll. We should return budget. Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:29:29 -08:00
David S. Miller	bf693f7beb	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== ipsec 2015-01-26 Just two small fixes for _decode_session6() where we might decode to wrong header information in some rare situations. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:28:38 -08:00
Hannes Frederic Sowa	6e9e16e614	ipv6: replacing a rt6_info needs to purge possible propagated rt6_infos too Lubomir Rintel reported that during replacing a route the interface reference counter isn't correctly decremented. To quote bug <https://bugzilla.kernel.org/show_bug.cgi?id=91941>: \| [root@rhel7-5 lkundrak]# sh -x lal \| + ip link add dev0 type dummy \| + ip link set dev0 up \| + ip link add dev1 type dummy \| + ip link set dev1 up \| + ip addr add 2001:db8:8086::2/64 dev dev0 \| + ip route add 2001:db8:8086::/48 dev dev0 proto static metric 20 \| + ip route add 2001:db8:8088::/48 dev dev1 proto static metric 10 \| + ip route replace 2001:db8:8086::/48 dev dev1 proto static metric 20 \| + ip link del dev0 type dummy \| Message from syslogd@rhel7-5 at Jan 23 10:54:41 ... \| kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2 \| \| Message from syslogd@rhel7-5 at Jan 23 10:54:51 ... \| kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2 During replacement of a rt6_info we must walk all parent nodes and check if the to be replaced rt6_info got propagated. If so, replace it with an alive one. Fixes: `4a287eba2d` ("IPv6 routing, NLM_F_* flag support: REPLACE and EXCL flags support, warn about missing CREATE flag") Reported-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Tested-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:22:14 -08:00
David S. Miller	225776098b	Merge branch 'sh_eth' Ben Hutchings says: ==================== Fixes for sh_eth #3 I'm continuing review and testing of Ethernet support on the R-Car H2 chip. This series fixes the last of the more serious issues I've found. These are not tested on any of the other supported chips. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:18:57 -08:00
Ben Hutchings	52b9fa3696	sh_eth: Fix DMA-API usage for RX buffers - Use the return value of dma_map_single(), rather than calling virt_to_page() separately - Check for mapping failue - Call dma_unmap_single() rather than dma_sync_single_for_cpu() Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:18:54 -08:00
Ben Hutchings	aa3933b873	sh_eth: Check for DMA mapping errors on transmit dma_map_single() may fail if an IOMMU or swiotlb is in use, so we need to check for this. Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:18:53 -08:00
Ben Hutchings	740c7f31c0	sh_eth: Ensure DMA engines are stopped before freeing buffers Currently we try to clear EDRRR and EDTRR and immediately continue to free buffers. This is unsafe because: - In general, register writes are not serialised with DMA, so we still have to wait for DMA to complete somehow - The R8A7790 (R-Car H2) manual states that the TX running flag cannot be cleared by writing to EDTRR - The same manual states that clearing the RX running flag only stops RX DMA at the next packet boundary I applied this patch to the driver to detect DMA writes to freed buffers: > --- a/drivers/net/ethernet/renesas/sh_eth.c > +++ b/drivers/net/ethernet/renesas/sh_eth.c > @@ -1098,7 +1098,14 @@ static void sh_eth_ring_free(struct net_device ndev) > / Free Rx skb ringbuffer */ > if (mdp->rx_skbuff) { > for (i = 0; i < mdp->num_rx_ring; i++) > + memcpy(mdp->rx_skbuff[i]->data, > + "Hello, world", 12); > + msleep(100); > + for (i = 0; i < mdp->num_rx_ring; i++) { > + WARN_ON(memcmp(mdp->rx_skbuff[i]->data, > + "Hello, world", 12)); > dev_kfree_skb(mdp->rx_skbuff[i]); > + } > } > kfree(mdp->rx_skbuff); > mdp->rx_skbuff = NULL; then ran the loop: while ethtool -G eth0 rx 128 ; ethtool -G eth0 rx 64; do echo -n .; done and 'ping -f' toward the sh_eth port from another machine. The warning fired several times a minute. To fix these issues: - Deactivate all TX descriptors rather than writing to EDTRR - As there seems to be no way of telling when RX DMA is stopped, perform a soft reset to ensure that both DMA enginess are stopped - To reduce the possibility of the reset truncating a transmitted frame, disable egress and wait a reasonable time to reach a packet boundary before resetting - Update statistics before resetting (The 'reasonable time' does not allow for CS/CD in half-duplex mode, but half-duplex no longer seems reasonable!) Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:18:53 -08:00
Ben Hutchings	dc1d0e6d55	sh_eth: Remove RX overflow log messages If RX traffic is overflowing the FIFO or DMA ring, logging every time this happens just makes things worse. These errors are visible in the statistics anyway. Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:18:53 -08:00
David S. Miller	8d8d67f140	linux-can-fixes-for-3.19-20150127 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJUx0SsAAoJECte4hHFiupUy9wP/3YkYTtcFcyZ4xbv3v8q+AZX GklmuK8aSmyy6Rwq/ZWtE8g9FCnedBy5D5ws76BgJjwmuAinsfjA893IYfPD0oTV sR6QKkA9QP3CgAwLfkH4+Zkd/dV4ltwZwTenwTPUXLZ/6qxfWjwvGwgXPoaZtChO R3Sqf0l4h7hmVu2/xuUNAycSysQYqCkbHVD3q1bEyd6H3jvFze/YixRqOaiCXwNh nI0H7lrdAAnYKfmWfKI6SaWtfxfyejExdMfDGwbe6FWPX6YNWI/4RB+vJwVLELuy kjJBZEFEVJWNsrwXOU/EqIyajXrSIt6asLy0wNaME99QzSAyV857PY8ZLuhEwcWP bmYrIPRvFKnG+STcIhZC+xOGEPVTe2Mh4y+JLgTQ0JWFR8+8GWdBRI1Oj1BYblGh CUMIeSs1KeQ0HHBAD/frCg2FxPzE/cZvqgznr8neC6o7fkrw0HOq9zn8ovC1vsYk lky0tNDO3ZTdagghGzr1Wfcs+TAyI4qrMWTI7cbuyZ+o/icPhR3WIkA3RFR0imBb yEB32HUN06YZnvQs8DsRIh4EhJe2f/ZjSETl5uxhI/bgaBYGzplSP0S95Q8tblWR 6zpJyBTEkoaFZXLD0MPmAi7rp3EcpK3orjI5g9CgOoYK09AvIPcH3wRffvkht1/E WgmUt4gecjxWfukFU408 =S/gP -----END PGP SIGNATURE----- Merge tag 'linux-can-fixes-for-3.19-20150127' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2015-01-27 this is another pull request for net/master which consists of 4 patches. All 4 patches are contributed by Ahmed S. Darwish, he fixes more problems in the kvaser_usb driver. David, please merge net/master to net-next/master, as we have more kvaser_usb patches in the queue, that target net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:13:05 -08:00
subashab@codeaurora.org	fc752f1f43	ping: Fix race in free in receive path An exception is seen in ICMP ping receive path where the skb destructor sock_rfree() tries to access a freed socket. This happens because ping_rcv() releases socket reference with sock_put() and this internally frees up the socket. Later icmp_rcv() will try to free the skb and as part of this, skb destructor is called and which leads to a kernel panic as the socket is freed already in ping_rcv(). -->\|exception -007\|sk_mem_uncharge -007\|sock_rfree -008\|skb_release_head_state -009\|skb_release_all -009\|__kfree_skb -010\|kfree_skb -011\|icmp_rcv -012\|ip_local_deliver_finish Fix this incorrect free by cloning this skb and processing this cloned skb instead. This patch was suggested by Eric Dumazet Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:04:53 -08:00
Herbert Xu	86f3cddbc3	udp_diag: Fix socket skipping within chain While working on rhashtable walking I noticed that the UDP diag dumping code is buggy. In particular, the socket skipping within a chain never happens, even though we record the number of sockets that should be skipped. As this code was supposedly copied from TCP, this patch does what TCP does and resets num before we walk a chain. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-27 00:02:41 -08:00
Ahmed S. Darwish	e638642b08	can: kvaser_usb: Fix state handling upon BUS_ERROR events While being in an ERROR_WARNING state, and receiving further bus error events with error counters still in the ERROR_WARNING range of 97-127 inclusive, the state handling code erroneously reverts back to ERROR_ACTIVE. Per the CAN standard, only revert to ERROR_ACTIVE when the error counters are less than 96. Moreover, in certain Kvaser models, the BUS_ERROR flag is always set along with undefined bits in the M16C status register. Thus use bitwise operators instead of full equality for checking that register against bus errors. Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2015-01-27 08:55:09 +01:00
Ahmed S. Darwish	14c10c2a1d	can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT On some x86 laptops, plugging a Kvaser device again after an unplug makes the firmware always ignore the very first command. For such a case, provide some room for retries instead of completely exiting the driver init code. Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2015-01-27 08:55:09 +01:00
Ahmed S. Darwish	3803fa6977	can: kvaser_usb: Send correct context to URB completion Send expected argument to the URB completion hander: a CAN netdevice instead of the network interface private context `kvaser_usb_net_priv'. This was discovered by having some garbage in the kernel log in place of the netdevice names: can0 and can1. Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2015-01-27 08:55:08 +01:00
Ahmed S. Darwish	ded5006667	can: kvaser_usb: Do not sleep in atomic context Upon receiving a hardware event with the BUS_RESET flag set, the driver kills all of its anchored URBs and resets all of its transmit URB contexts. Unfortunately it does so under the context of URB completion handler `kvaser_usb_read_bulk_callback()', which is often called in an atomic context. While the device is flooded with many received error packets, usb_kill_urb() typically sleeps/reschedules till the transfer request of each killed URB in question completes, leading to the sleep in atomic bug. [3] In v2 submission of the original driver patch [1], it was stated that the URBs kill and tx contexts reset was needed since we don't receive any tx acknowledgments later and thus such resources will be locked down forever. Fortunately this is no longer needed since an earlier bugfix in this patch series is now applied: all tx URB contexts are reset upon CAN channel close. [2] Moreover, a BUS_RESET is now treated _exactly_ like a BUS_OFF event, which is the recommended handling method advised by the device manufacturer. [1] http://article.gmane.org/gmane.linux.network/239442 http://www.webcitation.org/6Vr2yagAQ [2] can: kvaser_usb: Reset all URB tx contexts upon channel close `889b77f7fd` [3] Stacktrace: <IRQ> [<ffffffff8158de87>] dump_stack+0x45/0x57 [<ffffffff8158b60c>] __schedule_bug+0x41/0x4f [<ffffffff815904b1>] __schedule+0x5f1/0x700 [<ffffffff8159360a>] ? _raw_spin_unlock_irqrestore+0xa/0x10 [<ffffffff81590684>] schedule+0x24/0x70 [<ffffffff8147d0a5>] usb_kill_urb+0x65/0xa0 [<ffffffff81077970>] ? prepare_to_wait_event+0x110/0x110 [<ffffffff8147d7d8>] usb_kill_anchored_urbs+0x48/0x80 [<ffffffffa01f4028>] kvaser_usb_unlink_tx_urbs+0x18/0x50 [kvaser_usb] [<ffffffffa01f45d0>] kvaser_usb_rx_error+0xc0/0x400 [kvaser_usb] [<ffffffff8108b14a>] ? vprintk_default+0x1a/0x20 [<ffffffffa01f5241>] kvaser_usb_read_bulk_callback+0x4c1/0x5f0 [kvaser_usb] [<ffffffff8147a73e>] __usb_hcd_giveback_urb+0x5e/0xc0 [<ffffffff8147a8a1>] usb_hcd_giveback_urb+0x41/0x110 [<ffffffffa0008748>] finish_urb+0x98/0x180 [ohci_hcd] [<ffffffff810cd1a7>] ? acct_account_cputime+0x17/0x20 [<ffffffff81069f65>] ? local_clock+0x15/0x30 [<ffffffffa000a36b>] ohci_work+0x1fb/0x5a0 [ohci_hcd] [<ffffffff814fbb31>] ? process_backlog+0xb1/0x130 [<ffffffffa000cd5b>] ohci_irq+0xeb/0x270 [ohci_hcd] [<ffffffff81479fc1>] usb_hcd_irq+0x21/0x30 [<ffffffff8108bfd3>] handle_irq_event_percpu+0x43/0x120 [<ffffffff8108c0ed>] handle_irq_event+0x3d/0x60 [<ffffffff8108ec84>] handle_fasteoi_irq+0x74/0x110 [<ffffffff81004dfd>] handle_irq+0x1d/0x30 [<ffffffff81004727>] do_IRQ+0x57/0x100 [<ffffffff8159482a>] common_interrupt+0x6a/0x6a Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2015-01-27 08:55:08 +01:00
David S. Miller	7d63585bf0	Another set of last-minute fixes: * fix station double-removal when suspending while associating * fix the HT (802.11n) header length calculation * fix the CCK radiotap flag used for monitoring, a pretty old regression but a simple one-liner * fix per-station group-key handling -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJUwmkTAAoJEDBSmw7B7bqr3X0P/3Y/2UnQSBSi2LnqmsuV8igR noAwaPuCf2ASs25v9nQv64JM1qC0kiemYesi3fFWHiqPhoxx9WrMSvjXdarynrQN LwBG74tcQVEeW70w+36yeKCS+wMU35eTzDDxdnYOfJeNOhW2rQE7WVT2drkc6lVL hvGRhqlBx0mVLQ3VCW5c1kmMN/VAttZKx5WWWurmA8OSQzcHJGZW6ZTuqwpDx4RT iqwi5uBtMkqTQV+22Zb4xI+YxLGwcGiYjy15KksPsrSZZeS5pJzeyovrDhACBVpE aVj3K+OAGVmXCN7HVpQ3tTqCaQea5o+EDqtnT/IQrjnmw6p4mC3co6ShiVn+DkTX 6nv/ka92k9q1LMGIzl+lje2cNmaerVrDa84gUqnHXbTr80+ugKu4NpZowQgdW4yG Qu4kQALkTRhUoNf+RUzXKcFqAW2qLKf83qLcrhbQWkSrY0hzLVWKI/1/FAkCyo6I zKhsB44mHMMaKGVq5qsHTD0E89PtiJuizDLiU04uJExYJCi1FCh9NDlc+HaVlTnt 5LHFbrsPDuAUKoKSiJmJUIyueL+Of5N34+epuLRZb55aE2AQhApg6Nxd+FMuro1m 0aeHEREEO04QZ7IUrYgBA4G7L1dsZJDD8auU6Y4V3chAg6ArbswbtzAv8ON1tAk+ pr0kThACKqkfg2tN5fny =DhIp -----END PGP SIGNATURE----- Merge tag 'mac80211-for-davem-2015-01-23' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Another set of last-minute fixes: * fix station double-removal when suspending while associating * fix the HT (802.11n) header length calculation * fix the CCK radiotap flag used for monitoring, a pretty old regression but a simple one-liner * fix per-station group-key handling Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 17:32:24 -08:00
Hannes Frederic Sowa	df4d92549f	ipv4: try to cache dst_entries which would cause a redirect Not caching dst_entries which cause redirects could be exploited by hosts on the same subnet, causing a severe DoS attack. This effect aggravated since commit `f886497212` ("ipv4: fix dst race in sk_dst_get()"). Lookups causing redirects will be allocated with DST_NOCACHE set which will force dst_release to free them via RCU. Unfortunately waiting for RCU grace period just takes too long, we can end up with >1M dst_entries waiting to be released and the system will run OOM. rcuos threads cannot catch up under high softirq load. Attaching the flag to emit a redirect later on to the specific skb allows us to cache those dst_entries thus reducing the pressure on allocation and deallocation. This issue was discovered by Marcelo Leitner. Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: Marcelo Leitner <mleitner@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 17:28:27 -08:00
David S. Miller	412d2907c4	Merge branch 'bpf' Alexei Starovoitov says: ==================== bpf: fix two bugs Michael Holzheu caught two issues (in bpf syscall and in the test). Fix them. Details in corresponding patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 17:20:48 -08:00
Alexei Starovoitov	ba1a68bf13	samples: bpf: relax test_maps check hash map is unordered, so get_next_key() iterator shouldn't rely on particular order of elements. So relax this test. Fixes: `ffb65f27a1` ("bpf: add a testsuite for eBPF maps") Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 17:20:40 -08:00
Alexei Starovoitov	8ebe667c41	bpf: rcu lock must not be held when calling copy_to_user() BUG: sleeping function called from invalid context at mm/memory.c:3732 in_atomic(): 0, irqs_disabled(): 0, pid: 671, name: test_maps 1 lock held by test_maps/671: #0: (rcu_read_lock){......}, at: [<0000000000264190>] map_lookup_elem+0xe8/0x260 Call Trace: ([<0000000000115b7e>] show_trace+0x12e/0x150) [<0000000000115c40>] show_stack+0xa0/0x100 [<00000000009b163c>] dump_stack+0x74/0xc8 [<000000000017424a>] ___might_sleep+0x23a/0x248 [<00000000002b58e8>] might_fault+0x70/0xe8 [<0000000000264230>] map_lookup_elem+0x188/0x260 [<0000000000264716>] SyS_bpf+0x20e/0x840 Fix it by allocating temporary buffer to store map element value. Fixes: `db20fd2b01` ("bpf: add lookup/update/delete/iterate methods to BPF maps") Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 17:20:40 -08:00
Daniel Borkmann	600ddd6825	net: sctp: fix slab corruption from use after free on INIT collisions When hitting an INIT collision case during the 4WHS with AUTH enabled, as already described in detail in commit `1be9a950c6` ("net: sctp: inherit auth_capable on INIT collisions"), it can happen that we occasionally still remotely trigger the following panic on server side which seems to have been uncovered after the fix from commit `1be9a950c6` ... [ 533.876389] BUG: unable to handle kernel paging request at 00000000ffffffff [ 533.913657] IP: [<ffffffff811ac385>] __kmalloc+0x95/0x230 [ 533.940559] PGD 5030f2067 PUD 0 [ 533.957104] Oops: 0000 [#1] SMP [ 533.974283] Modules linked in: sctp mlx4_en [...] [ 534.939704] Call Trace: [ 534.951833] [<ffffffff81294e30>] ? crypto_init_shash_ops+0x60/0xf0 [ 534.984213] [<ffffffff81294e30>] crypto_init_shash_ops+0x60/0xf0 [ 535.015025] [<ffffffff8128c8ed>] __crypto_alloc_tfm+0x6d/0x170 [ 535.045661] [<ffffffff8128d12c>] crypto_alloc_base+0x4c/0xb0 [ 535.074593] [<ffffffff8160bd42>] ? _raw_spin_lock_bh+0x12/0x50 [ 535.105239] [<ffffffffa0418c11>] sctp_inet_listen+0x161/0x1e0 [sctp] [ 535.138606] [<ffffffff814e43bd>] SyS_listen+0x9d/0xb0 [ 535.166848] [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b ... or depending on the the application, for example this one: [ 1370.026490] BUG: unable to handle kernel paging request at 00000000ffffffff [ 1370.026506] IP: [<ffffffff811ab455>] kmem_cache_alloc+0x75/0x1d0 [ 1370.054568] PGD 633c94067 PUD 0 [ 1370.070446] Oops: 0000 [#1] SMP [ 1370.085010] Modules linked in: sctp kvm_amd kvm [...] [ 1370.963431] Call Trace: [ 1370.974632] [<ffffffff8120f7cf>] ? SyS_epoll_ctl+0x53f/0x960 [ 1371.000863] [<ffffffff8120f7cf>] SyS_epoll_ctl+0x53f/0x960 [ 1371.027154] [<ffffffff812100d3>] ? anon_inode_getfile+0xd3/0x170 [ 1371.054679] [<ffffffff811e3d67>] ? __alloc_fd+0xa7/0x130 [ 1371.080183] [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b With slab debugging enabled, we can see that the poison has been overwritten: [ 669.826368] BUG kmalloc-128 (Tainted: G W ): Poison overwritten [ 669.826385] INFO: 0xffff880228b32e50-0xffff880228b32e50. First byte 0x6a instead of 0x6b [ 669.826414] INFO: Allocated in sctp_auth_create_key+0x23/0x50 [sctp] age=3 cpu=0 pid=18494 [ 669.826424] __slab_alloc+0x4bf/0x566 [ 669.826433] __kmalloc+0x280/0x310 [ 669.826453] sctp_auth_create_key+0x23/0x50 [sctp] [ 669.826471] sctp_auth_asoc_create_secret+0xcb/0x1e0 [sctp] [ 669.826488] sctp_auth_asoc_init_active_key+0x68/0xa0 [sctp] [ 669.826505] sctp_do_sm+0x29d/0x17c0 [sctp] [...] [ 669.826629] INFO: Freed in kzfree+0x31/0x40 age=1 cpu=0 pid=18494 [ 669.826635] __slab_free+0x39/0x2a8 [ 669.826643] kfree+0x1d6/0x230 [ 669.826650] kzfree+0x31/0x40 [ 669.826666] sctp_auth_key_put+0x19/0x20 [sctp] [ 669.826681] sctp_assoc_update+0x1ee/0x2d0 [sctp] [ 669.826695] sctp_do_sm+0x674/0x17c0 [sctp] Since this only triggers in some collision-cases with AUTH, the problem at heart is that sctp_auth_key_put() on asoc->asoc_shared_key is called twice when having refcnt 1, once directly in sctp_assoc_update() and yet again from within sctp_auth_asoc_init_active_key() via sctp_assoc_update() on the already kzfree'd memory, which is also consistent with the observation of the poison decrease from 0x6b to 0x6a (note: the overwrite is detected at a later point in time when poison is checked on new allocation). Reference counting of auth keys revisited: Shared keys for AUTH chunks are being stored in endpoints and associations in endpoint_shared_keys list. On endpoint creation, a null key is being added; on association creation, all endpoint shared keys are being cached and thus cloned over to the association. struct sctp_shared_key only holds a pointer to the actual key bytes, that is, struct sctp_auth_bytes which keeps track of users internally through refcounting. Naturally, on assoc or enpoint destruction, sctp_shared_key are being destroyed directly and the reference on sctp_auth_bytes dropped. User space can add keys to either list via setsockopt(2) through struct sctp_authkey and by passing that to sctp_auth_set_key() which replaces or adds a new auth key. There, sctp_auth_create_key() creates a new sctp_auth_bytes with refcount 1 and in case of replacement drops the reference on the old sctp_auth_bytes. A key can be set active from user space through setsockopt() on the id via sctp_auth_set_active_key(), which iterates through either endpoint_shared_keys and in case of an assoc, invokes (one of various places) sctp_auth_asoc_init_active_key(). sctp_auth_asoc_init_active_key() computes the actual secret from local's and peer's random, hmac and shared key parameters and returns a new key directly as sctp_auth_bytes, that is asoc->asoc_shared_key, plus drops the reference if there was a previous one. The secret, which where we eventually double drop the ref comes from sctp_auth_asoc_set_secret() with intitial refcount of 1, which also stays unchanged eventually in sctp_assoc_update(). This key is later being used for crypto layer to set the key for the hash in crypto_hash_setkey() from sctp_auth_calculate_hmac(). To close the loop: asoc->asoc_shared_key is freshly allocated secret material and independant of the sctp_shared_key management keeping track of only shared keys in endpoints and assocs. Hence, also commit `4184b2a79a` ("net: sctp: fix memory leak in auth key management") is independant of this bug here since it concerns a different layer (though same structures being used eventually). asoc->asoc_shared_key is reference dropped correctly on assoc destruction in sctp_association_free() and when active keys are being replaced in sctp_auth_asoc_init_active_key(), it always has a refcount of 1. Hence, it's freed prematurely in sctp_assoc_update(). Simple fix is to remove that sctp_auth_key_put() from there which fixes these panics. Fixes: `730fc3d05c` ("[SCTP]: Implete SCTP-AUTH parameter processing") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 17:02:05 -08:00
Linus Torvalds	4adca1cbc4	Merge branch 'akpm' (patches from Andrew Morton) Merge misc fixes from Andrew Morton: "Six fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: drivers/rtc/rtc-s5m.c: terminate s5m_rtc_id array with empty element printk: add dummy routine for when CONFIG_PRINTK=n mm/vmscan: fix highidx argument type memcg: remove extra newlines from memcg oom kill log x86, build: replace Perl script with Shell script mm: page_alloc: embed OOM killing naturally into allocation slowpath	2015-01-26 16:25:42 -08:00
Ezequiel Garcia	9e911414af	net: mv643xx_eth: Fix highmem support in non-TSO egress path Commit `69ad0dd7af` Author: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Date: Mon May 19 13:59:59 2014 -0300 net: mv643xx_eth: Use dma_map_single() to map the skb fragments caused a nasty regression by removing the support for highmem skb fragments. By using page_address() to get the address of a fragment's page, we are assuming a lowmem page. However, such assumption is incorrect, as fragments can be in highmem pages, resulting in very nasty issues. This commit fixes this by using the skb_frag_dma_map() helper, which takes care of mapping the skb fragment properly. Additionally, the type of mapping is now tracked, so it can be unmapped using dma_unmap_page or dma_unmap_single when appropriate. This commit also fixes the error path in txq_init() to release the resources properly. Fixes: `69ad0dd7af` ("net: mv643xx_eth: Use dma_map_single() to map the skb fragments") Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 16:14:40 -08:00
David S. Miller	9d08da9630	Merge branch 'sh_eth' Ben Hutchings says: ==================== Fixes for sh_eth #2 I'm continuing review and testing of Ethernet support on the R-Car H2 chip. This series fixes more of the issues I've found, but it won't be the last set. These are not tested on any of the other supported chips. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 16:13:20 -08:00
Ben Hutchings	283e38db65	sh_eth: Fix serialisation of interrupt disable with interrupt & NAPI handlers In order to stop the RX path accessing the RX ring while it's being stopped or resized, we clear the interrupt mask (EESIPR) and then call free_irq() or synchronise_irq(). This is insufficient because the interrupt handler or NAPI poller may set EESIPR again after we clear it. Also, in sh_eth_set_ringparam() we currently don't disable NAPI polling at all. I could easily trigger a crash by running the loop: while ethtool -G eth0 rx 128 && ethtool -G eth0 rx 64; do echo -n .; done and 'ping -f' toward the sh_eth port from another machine. To fix this: - Add a software flag (irq_enabled) to signal whether interrupts should be enabled - In the interrupt handler, if the flag is clear then clear EESIPR and return - In the NAPI poller, if the flag is clear then don't set EESIPR - Set the flag before enabling interrupts in sh_eth_dev_init() and sh_eth_set_ringparam() - Clear the flag and serialise with the interrupt and NAPI handlers before clearing EESIPR in sh_eth_close() and sh_eth_set_ringparam() After this, I could run the loop for 100,000 iterations successfully. Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-01-26 16:13:15 -08:00

1 2 3 4 5 ...

495677 Commits