We intend this patch to improve spidernet interrupt handling to be
more strict. We had following problem and this patch solves it.
-when CONFIG_DEBUG_SHIRQ=y, request_irq() calls handler().
-when spider_net_open() is called, it calls request_irq() which calls
spider_net_interrupt().
-if some specific interrupt bit is set at this timing, it calls
netif_rx_schedule() and spider_net_poll() is scheduled.
-spider_net_open() calls netif_poll_enable() which clears the bit
__LINK_STATE_RX_SCHED.
-when spider_net_poll() is called, it calls netif_rx_complete() which
causes BUG_ON() because __LINK_STATE_RX_SCHED is not set.
Signed-off-by: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The transmit frame tail bit is stranglely misnamed as
"no checksum". Fix the name to what it should be:
"transmit frame tail". No functional change,
just a name change.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Put the enable and disable routines next to one-another,
as this makes verifying thier symmetry that much easier.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
When entering the netdev poll routine, empty out the RX
chain first, before cleaning up the TX chain. This should
help avoid RX buffer overflows.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Invalidate a pointer as its pci_unmap'ed; this is a bit of
paranoia to make sure hardware doesn't continue trying to
DMA to it.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Crazy device problems are hard to debug, when one does not have
good trace info. This patch makes a major enhancement to the
device dump routine.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
It doesn't look like spidernet hardware can really checksum all protocols,
the code looks like it does IPV4 only. If so, it should use NETIF_F_IP_CSUM
instead of NETIF_F_HW_CSUM.
The driver doesn't need it's own get/set for ethtool tx csum, and it
should use the standard ethtool_op_get_link.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Although the previous patch resolved issues with hangs when the
RX ram full interrupt is encountered, there are still situations
where lots of RX ramfull interrupts arrive, resulting in a noisy
log in syslog. There is no need for this.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The terminated RX ring will cause trouble during the RX ram full
conditions, leading to a hung driver, as the hardware can't find
the next descr. There is no real reason to terminate the RX ring;
it doesn't make the operation any smooother, and it does
require an extra sync. So don't do it.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch fixes a rare deadlock that can occur when the kernel
is not able to empty out the RX ring quickly enough. Below follows
a detailed description of the bug and the fix.
As long as the OS can empty out the RX buffers at a rate faster than
the hardware can fill them, there is no problem. If, for some reason,
the OS fails to empty the RX ring fast enough, the hardware GDACTDPA
pointer will catch up to the head, notice the not-empty condition,
ad stop. However, RX packets may still continue arriving on the wire.
The spidernet chip can save some limited number of these in local RAM.
When this local ram fills up, the spider chip will issue an interrupt
indicating this (GHIINT0STS will show ERRINT, and the GRMFLLINT bit
will be set in GHIINT1STS). When te RX ram full condition occurs,
a certain bug/feature is triggered that has to be specially handled.
This section describes the special handling for this condition.
When the OS finally has a chance to run, it will empty out the RX ring.
In particular, it will clear the descriptor on which the hardware had
stopped. However, once the hardware has decided that a certain
descriptor is invalid, it will not restart at that descriptor; instead
it will restart at the next descr. This potentially will lead to a
deadlock condition, as the tail pointer will be pointing at this descr,
which, from the OS point of view, is empty; the OS will be waiting for
this descr to be filled. However, the hardware has skipped this descr,
and is filling the next descrs. Since the OS doesn't see this, there
is a potential deadlock, with the OS waiting for one descr to fill,
while the hardware is waiting for a differen set of descrs to become
empty.
A call to show_rx_chain() at this point indicates the nature of the
problem. A typical print when the network is hung shows the following:
net eth1: Spider RX RAM full, incoming packets might be discarded!
net eth1: Total number of descrs=256
net eth1: Chain tail located at descr=255
net eth1: Chain head is at 255
net eth1: HW curr desc (GDACTDPA) is at 0
net eth1: Have 1 descrs with stat=xa0800000
net eth1: HW next desc (GDACNEXTDA) is at 1
net eth1: Have 127 descrs with stat=x40800101
net eth1: Have 1 descrs with stat=x40800001
net eth1: Have 126 descrs with stat=x40800101
net eth1: Last 1 descrs with stat=xa0800000
Both the tail and head pointers are pointing at descr 255, which is
marked xa... which is "empty". Thus, from the OS point of view, there
is nothing to be done. In particular, there is the implicit assumption
that everything in front of the "empty" descr must surely also be empty,
as explained in the last section. The OS is waiting for descr 255 to
become non-empty, which, in this case, will never happen.
The HW pointer is at descr 0. This descr is marked 0x4.. or "full".
Since its already full, the hardware can do nothing more, and thus has
halted processing. Notice that descrs 0 through 254 are all marked
"full", while descr 254 and 255 are empty. (The "Last 1 descrs" is
descr 254, since tail was at 255.) Thus, the system is deadlocked,
and there can be no forward progress; the OS thinks there's nothing
to do, and the hardware has nowhere to put incoming data.
This bug/feature is worked around with the spider_net_resync_head_ptr()
routine. When the driver receives RX interrupts, but an examination
of the RX chain seems to show it is empty, then it is probable that
the hardware has skipped a descr or two (sometimes dozens under heavy
network conditions). The spider_net_resync_head_ptr() subroutine will
search the ring for the next full descr, and the driver will resume
operations there. Since this will leave "holes" in the ring, there
is also a spider_net_resync_tail_ptr() that will skip over such holes.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Avoid kernel crash in mm/slab.c due to double-free of pointer.
If the ethernet interface is brought down while there is still
RX traffic in flight, the device shutdown routine can end up
trying to double-free an skb, leading to a crash in mm/slab.c
Avoid the double-free by nulling out the skb pointer.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Many drivers had code that did kill_vid, but they weren't doing vlan
filtering. With new API the stub is unneeded unless device sets
NETIF_F_HW_VLAN_FILTER.
Bad habit: I couldn't resist fixing a couple of nearby style things
in acenic, and forcedeth.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The stats update code in spider_net_pass_skb_up() is touching the skb
after it's been passed up to the stack. To avoid that, just update the
stats first.
Signed-off-by: Florin Malita <fmalita@gmail.com>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Spidernet was the driver I original did all the node-aware netdevice
allocation for, but after a year it still hasn't hit mainline.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
These are all the remaining instances of get_property. Simple rename of
get_property to of_get_property.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
One less thing for drivers writers to worry about.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The basic structure of "normal" UDP/IP/Ethernet
frames (that actually work):
- It starts with the Ethernet header (dest MAC, src MAC, etc.)
- The next part is occupied by the IP header (version info, length of
packet, id=0, fragment offset=0, checksum, from / to address, etc.)
- Then comes the UDP header (src / dest port, length, checksum)
- Actual payload
- Ethernet checksum
Now what's different for IP fragment:
- The IP header has id set to some value (same for all fragments),
offset is set appropriately (i.e. 0 for first fragment, following
according to size of other fragments), size is the length of the frame.
- UDP header is unchanged. I.e. length is according to full UDP
datagram, not just the part within the actual frame! But this is only
true within the first frame: all following frames don't have a valid
UDP-header at all.
The spidernet silicon seems to be quite intelligent: It's able to
compute (IP / UDP / Ethernet) checksums on the fly and tests if frames
are conforming to RFC -- at least conforming to RFC on complete frames.
But IP fragments are different as explained above:
I.e. for IP fragments containing part of a UDP datagram it sees
incompatible length in the headers for IP and UDP in the first frame
and, thus, skips this frame. But the content *is* correct for IP
fragments. For all following frames it finds (most probably) no valid
UDP header at all. But this *is* also correct for IP fragments.
The Linux IP-stack seems to be clever in this point. It expects the
spidernet to calculate the checksum (since the module claims to be able
to do so) and marks the skb's for "normal" frames accordingly
(ip_summed set to CHECKSUM_HW).
But for the IP fragments it does not expect the driver to be capable to
handle the frames appropriately. Thus all checksums are allready
computed. This is also flaged within the skb (ip_summed set to
CHECKSUM_NONE).
Unfortunately the spidernet driver ignores that hints. It tries to send
the IP fragments of UDP datagrams as normal UDP/IP frames. Since they
have different structure the silicon detects them the be not
"well-formed" and skips them.
The following one-liner against 2.6.21-rc2 changes this behavior. If the
IP-stack claims to have done the checksumming, the driver should not
try to checksum (and analyze) the frame but send it as is.
Signed-off-by: Norbert Eicker <n.eicker@fz-juelich.de>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Multiple threads performing a transmit can race into
the spidernet tx ring cleanup code. This puts the
relevant check under a lock.
Signed-off-by: Linas Vepstas <lins@austin.ibm.com>
Cc: Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
Cc: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
It appears that under certain circumstances, a race will result
in a double-free of an skb. This patch null's out the skb pointer
upon the skb free, avoiding the inadvertent deref of bogus data.
The next patch fixes the actual race.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
Cc: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch separates the hardware descriptor state from the
driver descriptor state, per (old) suggestion from Ben Herrenschmidt.
This compiles and boots and seems to work.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
Cc: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This moves the medium variable into the spidernet card structure.
It renames the GMII_ variables to BCM54XX specific ones.
Signed-off-by: Jens Osterkamp <jens@de.ibm.com>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patches removes logging for SPIDER_NET_GTMFLLINT interrupts.
Since the interrupts are not irregular, and they happen frequently
when using 100Mbps network switches.
Signed-off-by: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch adds or changes some HW specific settings for spider_net on
Celleb.
Signed-off-by: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch moves calling init_firmware() from spider_net_probe() to
spider_net_open() so as to use the driver by built-in.
Signed-off-by: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add auto negotiation support for Celleb.
Signed-off-by: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
As of 2.6.20-git4, the spider_net driver does not compile.
This appears to be due to some archaic usage involving kobjects.
It also fixes a nasty double-free during ifdown of the interface.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
Cc: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Make the hardware perceive the RX descriptor ring as a null-terminated linked
list, instead of a circular ring.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: James K Lewis <jklewis@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add some debugging and error printing.
The show_rx_chain() prints out the status of the rx chain,
which shows that the status of the descriptors gets
messed up after the second & subsequent RX ramfulls.
Print out contents of bad packets if error occurs.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: James K Lewis <jklewis@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Delete possible source of chain corruption; the hardware
already knows the location of the tail, and writing it
again is likely to mess it up.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: James K Lewis <jklewis@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add memory barrier to make sure that the rest of the
RX descriptor state is flushed to memory before we tell
the hardware that its ready to go.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: James K Lewis <jklewis@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tell the hardware the location of the rx ring tail.
More punctuation cleanup.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: James K Lewis <jklewis@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Remove unused variable; this makes code easier to read.
Tweak commentary.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: James K Lewis <jklewis@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The invocation of the rx ring refill routine is haphazard,
it can be called from a central location.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: James K Lewis <jklewis@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Simplify the somewhat convoluted use of return codes
in the rx buffer handling.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: James K Lewis <jklewis@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Another skb leak in an error branch. Fix this by adding
call to dev_kfree_skb_irq() after moving to a more
appropriate spot.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: James K Lewis <jklewis@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
One of the unlikely error branches has an skb memory leak.
Fix this by handling the error conditions consistently.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: James K Lewis <jklewis@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
There is no need to pass a flag into spider_net_decode_one_descr()
so remove this, and perform some othre minor cleanup.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: James K Lewis <jklewis@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Get rid of the rxramfull tasklet, and let the NAPI poll routine
deal with this situation. (The rxramfull interrupt is simply
stating that the h/w has run out of room for incoming packets).
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: James K Lewis <jklewis@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch adds net_ratelimit to many of the printks in order to
limit extraneous warning messages (created in response to Bug 28554).
This patch supercedes all previous ratelimit patches.
This has been tested, please apply.
From: James K Lewis <jklewis@us.ibm.com>
Signed-off-by: James K Lewis <jklewis@us.ibm.com>
Signed-off-by: Linas Vepstas <jlinas@austin.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The current driver code performs 512 DMA mappings of a bunch of
32-byte ring descriptor structures. This is silly, as they are
all in contiguous memory. This patch changes the code to
dma_map_coherent() each rx/tx ring as a whole.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: James K Lewis <jklewis@us.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
We forget to call spider_net_free_rx_chain_contents which does the
actual dev_kfree_skb. New skbs are allocated from skbuff_head_cache
on each "ifconfig up" letting the cache grow infinitely.
This patch fixes it.
Signed-off-by: Jens Osterkamp <jens@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Conflicts:
drivers/ata/libata-scsi.c
include/linux/libata.h
Futher merge of Linus's head and compilation fixups.
Signed-Off-By: David Howells <dhowells@redhat.com>
Conflicts:
drivers/infiniband/core/iwcm.c
drivers/net/chelsio/cxgb2.c
drivers/net/wireless/bcm43xx/bcm43xx_main.c
drivers/net/wireless/prism54/islpci_eth.c
drivers/usb/core/hub.h
drivers/usb/input/hid-core.c
net/core/netpoll.c
Fix up merge failures with Linus's head and fix new compilation failures.
Signed-Off-By: David Howells <dhowells@redhat.com>
We use the powerpc specific low level MMIO accessor variants instead
of readl() or readl_be() because we know spidernet is not a real PCI
device and we can thus avoid the performance hit caused by the PCI
workarounds.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>