Commit Graph

287469 Commits

Author SHA1 Message Date
Bruce Allan
07914ee3cc e1000e: use true/false for bool autoneg_false
v2 - replaced mac->autoneg_failed == false with !mac->autoneg_failed

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-10 00:06:58 -08:00
Bruce Allan
668018d747 e1000e: remove unnecessary parentheses
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-10 00:06:44 -08:00
Bruce Allan
fe1e980f24 e1000e: remove unnecessary returns from void functions
...and convert some goto's which simply return to just return.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-10 00:06:30 -08:00
Bruce Allan
4bcf053baf e1000e: remove test that is always false
warning: comparison of unsigned expression < 0 is always false

Remove an unnecessary test that is reported when compiling driver with W=1.
The test is unnecessary because Intel wired GbE hardware older (i.e. less)
than 82571 is not supported by this driver.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-10 00:06:07 -08:00
Eddie Wai
4cbbb04dc1 cnic: Update VLAN ID during ISCSI_UEVENT_PATH_UPDATE
This will support the new VLAN attribute in the iSCSI iface file.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-09 14:03:50 -05:00
Jeffrey Huang
0cb1f4b960 cnic: set error flag when iSCSI connection fails
to speed up error recovery due to SPQ failures.  The error flag will
expedite the recovery process by skipping the timeouts.

Signed-off-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-09 14:03:50 -05:00
Dan Carpenter
a584b7ae4e netxen_nic: signedness bug in netxen_md_entry_err_chk()
"esize" should be signed because it can be negative here.  For example,
when we call it in netxen_parse_md_template(), it could be -1 from the
return value of netxen_md_L2Cache().

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 19:57:55 -05:00
Ursula Braun
039055b965 qeth: add wake_up on write channel
To send commands on the write channel 8 buffers exist. If all
8 buffers are used, a wait is triggered on the write channel. When
such buffer are freed, a wake_up is needed. This patch adds the
missing wake_up in qeth_release_buffer().
This fix is especially important when running Communications
Controller for Linux on System z.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 18:50:20 -05:00
Frank Blaschka
c3ab96f36a qeth: add query OSA address table support
Add qeth device private ioctl to query the OSA address table.
This helps debugging hw related problems.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 18:50:20 -05:00
Ursula Braun
51363b8751 af_iucv: allow retrieval of maximum message size
For HS transport the maximum message size depends on the MTU-size
of the HS-device bound to the AF_IUCV socket. This patch adds a
getsockopt option MSGSIZE returning the maximum message size that
can be handled for this AF_IUCV socket.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 18:50:19 -05:00
Ursula Braun
800c5eb7b5 af_iucv: change net_device handling for HS transport
This patch saves the net_device in the iucv_sock structure during
bind in order to fasten skb sending.
In addition some other small improvements are made for HS transport:
   - error checking when sending skbs
   - locking changes in afiucv_hs_callback_txnotify
   - skb freeing in afiucv_hs_callback_txnotify
And finally it contains code cleanup to get rid of iucv_skb_queue_purge.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 18:50:19 -05:00
Ursula Braun
7f1b0ea42a af_iucv: block writing if msg limit is exceeded
When polling on an AF_IUCV socket, writing should be blocked if the
number of pending messages exceeds a defined limit.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 18:50:19 -05:00
Ursula Braun
7d316b9453 af_iucv: remove IUCV-pathes completely
A SEVER is missing in the callback of a receiving SEVERED. This may
inhibit z/VM to remove the corresponding IUCV-path completely.
This patch adds a SEVER in iucv_callback_connrej (together with
additional locking.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 18:50:19 -05:00
Pradeep A. Dalvi
1ab0d2ec9a netdev: ethernet dev_alloc_skb to netdev_alloc_skb
Replaced deprecating dev_alloc_skb with netdev_alloc_skb in drivers/net/ethernet
  - Removed extra skb->dev = dev after netdev_alloc_skb

Signed-off-by: Pradeep A Dalvi <netdev@pradeepdalvi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 18:46:38 -05:00
Pradeep A. Dalvi
dae2e9f430 netdev: ethernet dev_alloc_skb to netdev_alloc_skb
Replaced deprecating dev_alloc_skb with netdev_alloc_skb in drivers/net/ethernet
  - Removed extra skb->dev = dev after netdev_alloc_skb

Signed-off-by: Pradeep A Dalvi <netdev@pradeepdalvi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 18:46:38 -05:00
Erich E. Hoover
c4062dfc42 ipv6: Implement IPV6_UNICAST_IF socket option.
The IPV6_UNICAST_IF feature is the IPv6 compliment to IP_UNICAST_IF.

Signed-off-by: Erich E. Hoover <ehoover@mines.edu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 15:52:45 -05:00
Erich E. Hoover
76e21053b5 ipv4: Implement IP_UNICAST_IF socket option.
The IP_UNICAST_IF feature is needed by the Wine project.  This patch
implements the feature by setting the outgoing interface in a similar
fashion to that of IP_MULTICAST_IF.  A separate option is needed to
handle this feature since the existing options do not provide all of
the characteristics required by IP_UNICAST_IF, a summary is provided
below.

SO_BINDTODEVICE:
* SO_BINDTODEVICE requires administrative privileges, IP_UNICAST_IF
does not.  From reading some old mailing list articles my
understanding is that SO_BINDTODEVICE requires administrative
privileges because it can override the administrator's routing
settings.
* The SO_BINDTODEVICE option restricts both outbound and inbound
traffic, IP_UNICAST_IF only impacts outbound traffic.

IP_PKTINFO:
* Since IP_PKTINFO and IP_UNICAST_IF are independent options,
implementing IP_UNICAST_IF with IP_PKTINFO will likely break some
applications.
* Implementing IP_UNICAST_IF on top of IP_PKTINFO significantly
complicates the Wine codebase and reduces the socket performance
(doing this requires a lot of extra communication between the
"server" and "user" layers).

bind():
* bind() does not work on broadcast packets, IP_UNICAST_IF is
specifically intended to work with broadcast packets.
* Like SO_BINDTODEVICE, bind() restricts both outbound and inbound
traffic.

Signed-off-by: Erich E. Hoover <ehoover@mines.edu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 15:52:45 -05:00
Eric Dumazet
43480aecb1 gro: more generic L2 header check
Shlomo Pongratz reported GRO L2 header check was suited for Ethernet
only, and failed on IB/ipoib traffic.

He provided a patch faking a zeroed header to let GRO aggregates frames.

Roland Dreier, Herbert Xu, and others suggested we change GRO L2 header
check to be more generic, ie not assuming L2 header is 14 bytes, but
taking into account hard_header_len.

__napi_gro_receive() has special handling for the common case (Ethernet)
to avoid a memcmp() call and use an inline optimized function instead.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Shlomo Pongratz <shlomop@mellanox.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 15:50:01 -05:00
Roland Dreier
377cb4f9e7 IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses
Commit a0417fa3a1 ("net: Make qdisc_skb_cb upper size bound
explicit.") made it possible for a netdev driver to use skb->cb
between its header_ops.create method and its .ndo_start_xmit
method.  Use this in ipoib_hard_header() to stash away the LL address
(GID + QPN), instead of the "ipoib_pseudoheader" hack.  This allows
IPoIB to stop lying about its hard_header_len, which will let us fix
the L2 check for GRO.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 15:50:01 -05:00
Masanari Iida
8c1a7f5283 stmmac: Fix typo in stmmac_pci.c
Correct spelling "regiser" to "register" in
drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 15:29:15 -05:00
Pradeep A. Dalvi
31a4c8b827 mace: Fix build for mace due to netdev_alloc_skb
Refs:
1. pmac32_defconfig
http://kisskb.ellerman.id.au/kisskb/buildresult/5583746/
2. ppc6xx_defconfig
http://kisskb.ellerman.id.au/kisskb/buildresult/5584116/

Confirmed any such occurances from all failed defconfigs &
in net-next sources with
grep -nrs "netdev_alloc_skb" drivers/net/ethernet/ | grep -v ","

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Pradeep A Dalvi <netdev@pradeepdalvi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-08 15:23:52 -05:00
David S. Miller
7280f5ae0d sonice: Fix build due to botched netdev_alloc_skb() conversion.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-07 15:28:15 -05:00
Dan Carpenter
af2ce213f6 caif: remove duplicate initialization
"priv" is initialized twice.  I kept the second one, because it is next
to the check for NULL.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-07 13:39:37 -05:00
Eric Dumazet
bb7d92e3e3 sh-eth: use netdev stats structure and fix dma_map_single
No need to maintain a parallel net_device_stats structure in
sh_eth_private, since we have a generic one in netdev

Fix two dma_map_single() incorrect parameters, passing skb->tail instead
of skb->data. Seems that there is no corresponding dmap_unmap_single()
calls for the moment in this driver.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-07 13:38:57 -05:00
Fabio Estevam
b72061a3cb net: fec: Fix build due to wrong dev annotation
commit 21a4e469 (netdev: ethernet dev_alloc_skb to netdev_alloc_skb)
should have used "ndev" instead of "dev".

This causes the following build errors:

drivers/net/ethernet/freescale/fec.c: In function 'fec_enet_rx':
drivers/net/ethernet/freescale/fec.c:714: error: 'dev' undeclared (first use in this function)
drivers/net/ethernet/freescale/fec.c:714: error: (Each undeclared identifier is reported only once
drivers/net/ethernet/freescale/fec.c:714: error: for each function it appears in.)
drivers/net/ethernet/freescale/fec.c: In function 'fec_enet_alloc_buffers':
drivers/net/ethernet/freescale/fec.c:1213: error: 'dev' undeclared (first use in this function)

Fix it, so that fec driver can be built again.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-07 13:32:50 -05:00
Shriram Rajagopalan
c3059be16c net/sched: sch_plug - Queue traffic until an explicit release command
The qdisc supports two operations - plug and unplug. When the
qdisc receives a plug command via netlink request, packets arriving
henceforth are buffered until a corresponding unplug command is received.
Depending on the type of unplug command, the queue can be unplugged
indefinitely or selectively.

This qdisc can be used to implement output buffering, an essential
functionality required for consistent recovery in checkpoint based
fault-tolerance systems. Output buffering enables speculative execution
by allowing generated network traffic to be rolled back. It is used to
provide network protection for Xen Guests in the Remus high availability
project, available as part of Xen.

This module is generic enough to be used by any other system that wishes
to add speculative execution and output buffering to its applications.

This module was originally available in the linux 2.6.32 PV-OPS tree,
used as dom0 for Xen.

For more information, please refer to http://nss.cs.ubc.ca/remus/
and http://wiki.xensource.com/xenwiki/Remus

Changes in V3:
  * Removed debug output (printk) on queue overflow
  * Added TCQ_PLUG_RELEASE_INDEFINITE - that allows the user to
    use this qdisc, for simple plug/unplug operations.
  * Use of packet counts instead of pointers to keep track of
    the buffers in the queue.

Signed-off-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
[author of the code in the linux 2.6.32 pvops tree]
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-07 12:54:56 -05:00
David S. Miller
17b8a74f00 Merge branch 'tipc_net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2012-02-07 12:31:01 -05:00
Bruce Allan
0e15df490e e1000e: minor whitespace and indentation cleanup
Cleanup of some whitespace and indentation of a single code block.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:18:09 -08:00
Bruce Allan
e885d762b7 e1000e: fix sparse warnings with -D__CHECK_ENDIAN__
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:16:54 -08:00
Bruce Allan
a2a5b3235d e1000e: fix checkpatch warning from MINMAX test
WARNING: min() should probably be min_t(unsigned int, 4, skb->data_len)

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:15:52 -08:00
Bruce Allan
24b706b2f4 e1000e: cleanup - use braces in both branches of a conditional statement
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:14:50 -08:00
Bruce Allan
f23efdff77 e1000e: cleanup e1000_set_phys_id
Use the existing hw pointer.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:13:17 -08:00
Bruce Allan
66092f5925 e1000e: cleanup e1000_init_mac_params_82571()
Combine two switch statements into one, convert a nebulous pointer to one
that is a bit more in keeping with the rest of the driver code and cleanup
some coding style.  No change in functionality, just cosmetic changes.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:12:14 -08:00
Bruce Allan
e68782ed78 e1000e: cleanup e1000_init_mac_params_80003es2lan()
Combine two switch statements into one, convert a nebulous pointer to one
that is a bit more in keeping with the rest of the driver code and remove
some dead code (there are no 80003es2lan devices with fiber).  No change in
functionality, just cosmetic changes.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:11:10 -08:00
Bruce Allan
9e2d7657e2 e1000e: cleanup - check return values consistently
The majority of the e1000e code checks most function return values using a
test like 'if (ret_val)' or 'if (!ret_val)' but there are a few instances
of 'if (ret_val == 0)'.  This patch converts the latter to the former for
consistency.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:10:07 -08:00
Bruce Allan
f36bb6cacd e1000e: add missing initializers reported when compiling with W=1
warning: missing initializer

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:09:14 -08:00
Tushar Dave
b04e36bac5 e1000: Adding e1000_dump function
When TX hang occurs e1000_dump prints TX ring, RX ring and Device registers.

Signed-off-by: Tushar Dave <tushar.n.dave@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 04:01:46 -08:00
Mitch A Williams
ab50a2a430 igbvf: refactor Interrupt Throttle Rate code
The existing ITR code is broken and confusing, with lots of similarly-named
variables that do different things. Additionally, after the driver carefully
determines the optimal interrupt rate for the adapter, it then
ignores it and always writes a fixed, suboptimal value.

This patch refactors that code to make variable names more descriptive of
what they actually do, and then actually writes the calculated result to
the hardware.

Preliminary testing shows that netperf TCP_STREAM tests goes from ~918Mbps
to ~940Mbps, and TCP_RR goes from ~2k transactions/sec up to > 8k.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Robert E Garrett <robertX.e.garrett@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2012-02-07 03:49:23 -08:00
Allan Stephens
dff10e9e63 tipc: Minor optimization to rejection of connection-based messages
Modifies message rejection logic so that TIPC doesn't attempt to
send a FIN message to the rejecting port if it is known in advance
that there is no such message because the rejecting port doesn't exist.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:19 -05:00
Allan Stephens
3175bd9add tipc: Eliminate alteration of publication key during name table purging
Removes code that alters the publication key of a name table entry
that is being forcibly purged from TIPC's name table after contact
with the publishing node has been lost.

Current TIPC ensures that all defunct names are purged before
re-establishing contact with a failed node.  There used to be a risk
that the publication might be accidentally deleted because it might be
re-added to the name table before the purge operation was completed.
But now there is no longer a need to ensure that the new key is different
than the old one.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:19 -05:00
Allan Stephens
63e7f1ac28 tipc: Prevent loss of fragmented messages over broadcast link
Modifies broadcast link so that an incoming fragmented message is not
lost if reassembly cannot begin because there currently is no buffer
big enough to hold the entire reassembled message. The broadcast link
now ignores the first fragment completely, which causes the sending node
to retransmit the first fragment so that reassembly can be re-attempted.

Previously, the sender would have had no reason to retransmit the 1st
fragment, so we would never have a chance to re-try the allocation.

To do this cleanly without duplicaton, a new bclink_accept_pkt()
function is introduced.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:19 -05:00
Allan Stephens
b76b27cad5 tipc: Prevent loss of fragmented messages over unicast links
Modifies unicast link endpoint logic so an incoming fragmented message
is not lost if reassembly cannot begin because there is no buffer big
enough to hold the entire reassembled message. The link endpoint now
ignores the first fragment completely, which causes the sending node to
retransmit the first fragment so that reassembly can be re-attempted.

Previously, the sender would have had no reason to retransmit the 1st
fragment, so we would never have a chance to re-try the allocation.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
2012-02-06 16:59:19 -05:00
Allan Stephens
1ec2bb0840 tipc: Remove obsolete broadcast tag capability
Eliminates support for the broadcast tag field, which is no longer
used by broadcast link NACK messages.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:18 -05:00
Allan Stephens
7a54d4a99d tipc: Major redesign of broadcast link ACK/NACK algorithms
Completely redesigns broadcast link ACK and NACK mechanisms to prevent
spurious retransmit requests in dual LAN networks, and to prevent the
broadcast link from stalling due to the failure of a receiving node to
acknowledge receiving a broadcast message or request its retransmission.

Note: These changes only impact the timing of when ACK and NACK messages
are sent, and not the basic broadcast link protocol itself, so inter-
operability with nodes using the "classic" algorithms is maintained.

The revised algorithms are as follows:

1) An explicit ACK message is still sent after receiving 16 in-sequence
messages, and implicit ACK information continues to be carried in other
unicast link message headers (including link state messages).  However,
the timing of explicit ACKs is now based on the receiving node's absolute
network address rather than its relative network address to ensure that
the failure of another node does not delay the ACK beyond its 16 message
target.

2) A NACK message is now typically sent only when a message gap persists
for two consecutive incoming link state messages; this ensures that a
suspected gap is not confirmed until both LANs in a dual LAN network have
had an opportunity to deliver the message, thereby preventing spurious NACKs.
A NACK message can also be generated by the arrival of a single link state
message, if the deferred queue is so big that the current message gap
cannot be the result of "normal" mis-ordering due to the use of dual LANs
(or one LAN using a bonded interface). Since link state messages typically
arrive at different nodes at different times the problem of multiple nodes
issuing identical NACKs simultaneously is inherently avoided.

3) Nodes continue to "peek" at NACK messages sent by other nodes. If
another node requests retransmission of a message gap suspected (but not
yet confirmed) by the peeking node, the peeking node forgets about the
gap and does not generate a duplicate retransmit request. (If the peeking
node subsequently fails to receive the lost message, later link state
messages will cause it to rediscover and confirm the gap and send another
NACK.)

4) Message gap "equality" is now determined by the start of the gap only.
This is sufficient to deal with the most common cases of message loss,
and eliminates the need for complex end of gap computations.

5) A peeking node no longer tries to determine whether it should send a
complementary NACK, since the most common cases of message loss don't
require it to be sent. Consequently, the node no longer examines the
"broadcast tag" field of a NACK message when peeking.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:18 -05:00
Allan Stephens
b98158e3b3 tipc: Add missing locks in broadcast link statistics accumulation
Ensures that all attempts to update broadcast link statistics are done
only while holding the lock that protects the link's main data structures,
to prevent interference by simultaneous updates caused by messages
arriving on other interfaces.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:18 -05:00
Allan Stephens
0232c5a566 tipc: Fix bug in broadcast link duplicate message statistics
Modifies broadcast link so that it increments the "received duplicate
message" count if an incoming message cannot be added to the deferred
message queue because it is already present in the queue. (The aligns
broadcast link behavior with that of TIPC's unicast links.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:17 -05:00
Allan Stephens
8a275a6a30 tipc: Fix node lock reclamation issues in broadcast link reception
Fixes a pair of problems in broadcast link message reception code
relating to the reclamation of the node lock after consuming an
in-sequence message.

1) Now retests to see if the sending node is still up after reclaiming
   the node lock, and bails out if it is non-operational.

2) Now manipulates the node's deferred message queue only after
   reclaiming the node lock, rather than using queue head pointer
   information that was cached previously.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:17 -05:00
Allan Stephens
57732560d1 tipc: Add missing broadcast link lock when sending NACK
Ensures that any attempt to send a NACK message over TIPC's broadcast
link has exclusive access to the link's main data structures, to prevent
interference with a simultaneous attempt to send other broadcast link
traffic (such as application-generated multicast messages).

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:17 -05:00
Allan Stephens
47361c87c5 tipc: Fix problem with broadcast link synchronization between nodes
Corrects a problem in which a link endpoint that activates as the
result of receiving a RESET/STATE sequence of link protocol messages
fails to properly record the broadcast link status information about
the node to which it is now communicating with. (The problem does
not occur with the more common RESET/ACTIVATE sequence of messages.)
The fix ensures that the broadcast link status info is updated after
the RESET message resets the link endpoint, rather than before, thereby
preventing new information from being overwritten by the reset operation.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:16 -05:00
Allan Stephens
9349931371 tipc: Ensure broadcast link re-acquires node after link failure
Fix a bug that can prevent TIPC from sending broadcast messages to a node
if contact with the node is lost and then regained. The problem occurs if
the broadcast link first clears the flag indicating the node is part of the
link's distribution set (when it loses contact with the node), and later
fails to restore the flag (when contact is regained); restoration fails
if contact with the node is regained by implicit unicast link activation
triggered by the arrival of a data message, rather than explicitly by the
arrival of a link activation message.

The broadcast link now uses separate fields to track whether a node is
theoretically capable of receiving broadcast messages versus whether it is
actually part of the link's distribution set. The former member is updated
by the receipt of link protocol messages, which can occur at any time; the
latter member is updated only when contact with the node is gained or lost.
This change also permits the simplification of several conditional
expressions since the broadcast link's "supported" field can now only be
set if there are working links to the associated node.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-02-06 16:59:16 -05:00