This patch aims to remove struct bonding's first_slave and struct
slave's next and prev pointers, and replace them with the standard Linux
list API. The old macros are converted to list API as well and some new
primitives are available now. The checks if there're slaves that used
slave_cnt have been replaced by the list_empty macro.
Also a few small style fixes, changing longest -> shortest line in local
variable declarations, leaving an empty line before return and removing
unnecessary brackets.
This is the first step to gradual RCU conversion.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When bond_3ad_get_active_agg_info() is used in all show_ad_ functions
it is not protected against slave manipulation and since it walks over
the slaves and uses them, this can easily result in NULL pointer
dereference or use of freed memory. Both the new wrapper and the
internal function are exported to the bonding as they're needed in
different places.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 3ad machine state spinlock can be used before it is inititialized
while doing bond_enslave() (and the port is being initialized) since
port->slave is set before the lock is prepared, thus causing soft
lock-ups and a multitude of other nasty bugs.
[ Rename __initialize_port_locks() variable name to 'slave' -DaveM ]
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
port->slave can be NULL since it's being initialized in bond_enslave
thus dereferencing a NULL pointer in bond_3ad_update_lacp_rate()
Also fix a minor bug, which could cause a port not to have
AD_STATE_LACP_TIMEOUT since there's no sync between
bond_3ad_update_lacp_rate() and bond_3ad_bind_slave(), by changing
the read_lock to a write_lock_bh in bond_3ad_update_lacp_rate().
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benefit from new upper dev list and free bonding from dev->master usage.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
When packets are dropped in TX path, its better to use kfree_skb()
instead of dev_kfree_skb() to give proper drop_monitor events.
Also move the kfree_skb() call after read_unlock() in bond_alb_xmit()
and bond_xmit_activebackup()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cloning all packets in input path have a significant cost.
Use skb_header_pointer()/skb_copy_bits() instead of pskb_may_pull() so
that recv_probe handlers (bond_3ad_lacpdu_recv / bond_arp_rcv /
rlb_arp_recv ) dont touch input skb.
bond_handle_frame() can avoid the skb_clone()/dev_kfree_skb()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Cc: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 3aba891d, bonding processes LACP frames (802.3ad
mode) with bond_handle_frame(). Currently a copy of the skb is
made and the original is left to be processed by other
rx_handlers and the rest of the network stack by returning
RX_HANDLER_ANOTHER. As there is no protocol handler for
PKT_TYPE_LACPDU, the frame is dropped and dev->rx_dropped
increased.
Fix this by making bond_handle_frame() return RX_HANDLER_CONSUMED
if bonding has processed the LACP frame.
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch resolves two sets of race conditions.
Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> reported the
first, as follows:
The bond_close() calls cancel_delayed_work() to cancel delayed works.
It, however, cannot cancel works that were already queued in workqueue.
The bond_open() initializes work->data, and proccess_one_work() refers
get_work_cwq(work)->wq->flags. The get_work_cwq() returns NULL when
work->data has been initialized. Thus, a panic occurs.
He included a patch that converted the cancel_delayed_work calls
in bond_close to flush_delayed_work_sync, which eliminated the above
problem.
His patch is incorporated, at least in principle, into this
patch. In this patch, we use cancel_delayed_work_sync in place of
flush_delayed_work_sync, and also convert bond_uninit in addition to
bond_close.
This conversion to _sync, however, opens new races between
bond_close and three periodically executing workqueue functions:
bond_mii_monitor, bond_alb_monitor and bond_activebackup_arp_mon.
The race occurs because bond_close and bond_uninit are always
called with RTNL held, and these workqueue functions may acquire RTNL to
perform failover-related activities. If bond_close or bond_uninit is
waiting in cancel_delayed_work_sync, deadlock occurs.
These deadlocks are resolved by having the workqueue functions
acquire RTNL conditionally. If the rtnl_trylock() fails, the functions
reschedule and return immediately. For the cases that are attempting to
perform link failover, a delay of 1 is used; for the other cases, the
normal interval is used (as those activities are not as time critical).
Additionally, the bond_mii_monitor function now stores the delay
in a variable (mimicing the structure of activebackup_arp_mon).
Lastly, all of the above renders the kill_timers sentinel moot,
and therefore it has been removed.
Tested-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The port shouldn't be enabled unless its current MUX
state is DISTRIBUTING which is correctly handled by
ad_mux_machine(), otherwise the packet sent can be
lost because the other end may not be ready.
The issue happens on every port initialization, but
as the ports are expected to move quickly to DISTRIBUTING,
it doesn't cause much problem. However, it does cause
constant packet loss if the other peer has the port
configured to stay in STANDBY (i.e. SYNC set to OFF).
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During a test where a pair of bonding interfaces using ARP monitoring
were both brought up and torn down (with an rmmod) repeatedly, a panic
in the timer code was noticed. I tracked this down and determined that
any of the bonding functions that ran as workqueue handlers and requeued
more work might not properly exit when the module was removed.
There was a flag protected by the bond lock called kill_timers that is
set when the interface goes down or the module is removed, but many of
the functions that monitor link status now unlock the bond lock to take
rtnl first. There is a chance that another CPU running the rmmod could
get the lock and set kill_timers after the first check has passed.
This patch does not allow any function to queue work that will make
itself run unless kill_timers is not set. I also noticed while doing
this work that bond_resend_igmp_join_requests did not have a check for
kill_timers, so I added the needed call there as well.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Reported-by: Liang Zheng <lzheng@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for a configuring the minimum number of links that
must be active before asserting carrier. It is similar to the Cisco
EtherChannel min-links feature. This allows setting the minimum number
of member ports that must be up (link-up state) before marking the
bond device as up (carrier on). This is useful for situations where
higher level services such as clustering want to ensure a minimum
number of low bandwidth links are active before switchover.
See:
http://bugzilla.vyatta.com/show_bug.cgi?id=7196
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter found that there was a dereference before a check,
added in 56d00c677de0(bonding:delete lacp_fast from ad_bond_info).
Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
bond_params->ad_select and ad_bond_info->agg_select_mode have the same
meaning, they are duplicate and need extra synchronization.
__get_agg_selection_mode() get ad_select from bond_params directly.
Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These is also a bug, that if you modify lacp_rate via sysfs,
and add new slaves in bonding, new slaves won't use the latest lacp_rate,
since ad_bond_info->lacp_fast is initialized only once,
in bond_3ad_initialize().
Since both struct bond_params and ad_bond_info have lacp_fast,
they are duplicate and need extra synchronization.
bond_3ad_bind_slave() can use bond_params->lacp_fast to initialize port.
So we can just remove lacp_fast from struct ad_bond_info.
Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is bug that when you modify lacp_rate via sysfs,
802.3ad won't use the new value of lacp_rate to transmit packets.
This is because port->actor_oper_port_state isn't changed.
Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull read_lock(&bond->lock) and BOND_IS_OK() to bond_start_xmit() from
mode-dependent xmit functions.
netif_running() is always true in hard_start_xmit.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Resolved logic conflicts causing a build failure due to
drivers/net/r8169.c changes using a patch from Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
Since now when bonding uses rx_handler, all traffic going into bond
device goes thru bond_handle_frame. So there's no need to go back into
bonding code later via ptype handlers. This patch converts
original ptype handlers into "bonding receive probes". These functions
are called from bond_handle_frame and they are registered per-mode.
Note that vlan packets are also handled because they are always untagged
thanks to vlan_untag()
Note that this also allows arpmon for eth-bond-bridge-vlan topology.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The slave member of struct aggregator does not necessarily point
to a slave which is part of the aggregator. It points to the
slave structure containing the aggregator structure, while
completely different slaves (or no slaves at all) may be part of
the aggregator.
The agg_device_up() function wrongly uses agg->slave to find the state
of the aggregator. Use agg->lag_ports->slave instead. The bug has
been introduced by commit 4cd6fe1c64
("bonding: fix link down handling in 802.3ad mode").
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The __get_link_speed() function returns a u16 value which was stored
in a u32 local variable. This patch uses the return value directly,
thus fixing that minor type consistency.
The 'duplex' field in struct slave being encoded on 8 bits, to be more
consistent we use a u8 integer (instead of u16) whenever we copy it to
local variables.
Signed-off-by: David Decotigny <decot@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
transfers slave->state into slave->backup (that it's going to transfer
into bitfield. Introduce wrapper inlines to do the work with it.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the rx_machine_lock to state_machine_lock as this makes more
sense in light of it now protecting all the state machines against
concurrency.
Signed-off-by: Nils Carlson <nils.carlson@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Changes since v1:
* Clarify an unclear comment
* Move a (possible) name change to a separate patch
The ad_rx_machine, ad_periodic_machine and ad_port_selection_logic
functions all inspect and alter common fields within the port structure.
Previous to this patch, only the ad_rx_machines were mutexed, and the
periodic and port_selection could run unmutexed against an ad_rx_machine
trigged by an arriving LACPDU.
This patch remedies the situation by protecting all the state machines
from concurrency. This is accomplished by locking around all the state
machines for a given port, which are executed at regular intervals; and
the ad_rx_machine when handling an incoming LACPDU.
Signed-off-by: Nils Carlson <nils.carlson@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
checkpatch.pl cleanup : Remove braces from single statement
blocks.
Signed-off-by: Bandan Das <bandan.das@stratus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
checkpatch.pl cleanup: Added spaces around operators at various places.
Also fixed some c99 style comments that I came across.
Signed-off-by: Bandan Das <bandan.das@stratus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change "return (EXPR);" to "return EXPR;"
return is not a function, parentheses are not required.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It was recently brought to my attention that 802.3ad mode bonds would no
longer form when using some network hardware after a driver update.
After snooping around I realized that the particular hardware was using
page-based skbs and found that skb->data did not contain a valid LACPDU
as it was not stored there. That explained the inability to form an
802.3ad-based bond. For balance-alb mode bonds this was also an issue
as ARPs would not be properly processed.
This patch fixes the issue in my tests and should be applied to 2.6.36
and as far back as anyone cares to add it to stable.
Thanks to Alexander Duyck <alexander.h.duyck@intel.com> and Jesse
Brandeburg <jesse.brandeburg@intel.com> for the suggestions on this one.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: stable@kerne.org
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A few lines earlier we assume that best->slave could be either null or non-null so
we should check it here as well.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
Remove DRV_NAME from pr_<level>s
Consolidate long format strings
Remove some extra tab indents
Remove some unnecessary ()s from pr_<level>s arguments
Align pr_<level> arguments
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only files where David Miller is the primary git-signer.
wireless, wimax, ixgbe, etc are not modified.
Compile tested x86 allyesconfig only
Not all files compiled (not x86 compatible)
Added a few > 80 column lines, which I ignored.
Existing checkpatch complaints ignored.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The language of 802.3ad 43.4.9 requires the "recordPDU" function
to, in part, compare the Partner parameter values in a received LACPDU
to the stored Actor values. If those match, then the Partner's
synchronization state is set to true.
The current 802.3ad implementation is performing these steps out
of order; first, the synchronization check is done, then the paramters are
checked to see if they match (the synch check being done against a match
check of a prior LACPDU). This causes delays in establishing aggregators
in some circumstances.
This patch modifies the 802.3ad code to call __choose_matched,
the function that does the "match" comparisions, as the first step of
__record_pdu, instead of immediately afterwards. This new behavior is
in compliance with the language of the standard.
Some additional commentary relating to code vs. standard is also
added.
Reported by Martin Patterson <martin@gear6.com> who also supplied
the logic of the fix and verified the patch.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the bonding device is no longer used in determining the device to
which to send packets, it can be dropped from the argument list of the various
xmit_hash_policy calls.
Signed-off-by: Jasper Spaans <spaans@fox-it.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I did not introduce new lines over 80 chars. I even eliminated some of
them.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
AD_SHORT_TIMEOUT and AD_STATE_LACP_ACTIVITY have the same value, but
AD_STATE_LACP_ACTIVITY better reflects the intended semantics.
[ J adds: AD_STATE_LACP_ACTIVITY is a value defined by the standard, and
should be set here in accordance with 802.3ad 43.4.12; AD_SHORT_TIMEOUT
is a constant specific to the Linux 802.3ad implementation that happens
to have the same value ]
The semantic match that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@
struct port_params p;
@@
* p.port_state |= AD_SHORT_TIMEOUT
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch converts the remaining occurences of raw return values to their
symbolic counterparts in ndo_start_xmit() functions that were missed by the
previous automatic conversion.
Additionally code that assumed the symbolic value of NETDEV_TX_OK to be zero
is changed to explicitly use NETDEV_TX_OK.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
One of the purposes of bonding is to allow for redundant links, and failover
correctly if the cable is pulled. If all the members of a bonded device have
no carrier present, the bonded device itself needs to report no carrier present
to user space so management tools (like routing daemons) can respond.
Bonding in 802.3ad mode does not work correctly for this because it incorrectly
chooses a link that is down as a possible aggregator.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch fixes issues with dev->dev_addr changing from array to pointer.
Hopefully there are no others.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
802.3ad has its own ethhdr-like structure in the form of an ad_header,
which is at the start of both the LACPDU and marker PDU. Both are
the same from the struct values, both are packed as well.
It's therefore perfectly fine to replace the ad_header by the ethhdr
and to remove its definition.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Generalize out mac address initializer for the LACPDU multicast
address and use in two places. Remove the now unused
AD_MULTICAST_LACPDU_ADDR.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Save some text by initializing ports LACPDU from const initializer,
then get rid of ad_initialize_lacpdu().
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Turn ports is_individual into a bool. There is no functional change.
Signed-off-by: Holger Eitzenberger <heitzenberger@astaro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Turn ports is_enabled into a bool. There is no functional change.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Turn Need-To-Transmit port variable into a bool. There is no
functional change.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
I also removed some of the unneeded braces in the if condition to
improve readability and a little bit of reformatting.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
They are all defined before used, it's therefore ok to remove
them.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also remove the pointless comment at the top.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It helps in maintaining the various partner information values from
the LACPDU. It also removes the pointless comment at the top.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It generally helps to handle those values in various places, using it
might make the code more readable and gives room for other improvements.
The IEEE standard talks about them as "parameter values".
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The previous code was just a funny way of assigning both values (they
are both of type u8).
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use pr_debug() instead of own macros.
Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have some reasons to kill netdev->priv:
1. netdev->priv is equal to netdev_priv().
2. netdev_priv() wraps the calculation of netdev->priv's offset, obviously
netdev_priv() is more flexible than netdev->priv.
But we cann't kill netdev->priv, because so many drivers reference to it
directly.
This patch is a safe convert for netdev->priv to netdev_priv(netdev).
Since all of the netdev->priv is only for read.
But it is too big to be sent in one mail.
I split it to 4 parts and make every part smaller than 100,000 bytes,
which is max size allowed by vger.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements alternative aggregator selection policies
for 802.3ad. The existing policy, now termed "stable," selects the active
aggregator by greatest bandwidth, and only reselects a new aggregator
if the active aggregator is entirely disabled (no more ports or all ports
down).
This patch adds two new policies: bandwidth and count, selecting
the active aggregator by total bandwidth (like the stable policy) or by
the number of ports in the aggregator, respectively. These two policies
also differ from the stable policy in that they will reselect the active
aggregator when availability-related changes occur in the bond (e.g.,
link state change).
This permits "gang failover" within 802.3ad, allowing redundant
aggregators along parallel paths to always maintain the "best" aggregator
as the active aggregator (rather than having to wait for the active to
entirely fail).
This patch also updates the driver version to 3.5.0.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Refactor mii monitor. As with the previous ARP monitor refactor,
the motivation for this is to handle locking rationally (in this case,
removing conditional locking) and generally clean up the code.
This patch breaks up the monolithic mii monitor into two phases:
an inspection phase, followed by an optional commit phase. The commit phase
is the only portion that requires RTNL or makes changes to state, and is
only called when inspection finds something to change.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The 802.3ad state machine lock can be acquired in both softirq and
not softirq context, but was not held at _bh to prevent a deadlock (which
could occur if a LACPDU arrived and was processed while the lock was
held).
Corrected this, now hold the state machine lock at _bh to prevent
deadlock.
Bug reported by Todd Fleisher <todd@fleish.org>.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Introduce per-net_device inlines: dev_net(), dev_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Convert bonding timers to workqueues. This converts the various
monitor functions to run in periodic work queues instead of timers. This
patch introduces the framework and convers the calls, but does not resolve
various locking issues, and does not stand alone.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Prior to use struct marker in the linux kernel markers, we need to clean
two drivers which use this structure name.
Change bonding driver types :
- struct marker to struct bond_marker.
- marker_t to bond_marker_t.
- marker_header to bond_marker_header.
- marker_header_t to bond_marker_header_t.
Change qla4xxx struct marker_entry usage :
- Change struct marker_entry for struct qla4_marker_entry.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Chad Tindel <ctindel@users.sourceforge.net>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: David Somayajulu <david.somayajulu@qlogic.com>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Ravi Anand <ravi.anand@qlogic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch modifies every packet receive function
registered with dev_add_pack() to drop packets if they
are not from the initial network namespace.
This should ensure that the various network stacks do
not receive packets in a anything but the initial network
namespace until the code has been converted and is ready
for them.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify carrier state determination for 802.3ad mode to comply
with section 43.3.9 of IEEE 802.3, which requires that "Links that are
not successful candidates for aggregation (e.g., links that are attached
to other devices that cannot perform aggregation or links that have been
manually configured to be non-aggregatable) are enabled to operate as
individual IEEE 802.3 links."
Bug reported by Laurent Chavey <chavey@google.com>. This patch
is an updated version of his patch that changes the wording of
commentary and adds an update to the driver version.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Laurent Chavey <chavey@google.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
skb->mac to skb->mac_header, to match the names of the associated helpers
(skb[_[re]set]_{transport,network,mac}_header).
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can
later turn skb->mac.raw into a offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.
This one touches just the most simple case, next will handle the slightly more
"complex" cases.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed handling of 802.3ad LACPDUs. Do not byte swap data in
place in the packet. Updated nomenclature of "__ntohs_lacpdu" to be
"htons"; it was previously used for both ntohs and htons operations, but
only called ntohs functions.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Allow channel bonding to enslave a 10 Gig adapter without errors.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add support for the bonding master to specify its carrier state
based upon the state of the slaves. For 802.3ad, the bond is up if
there is an active, parterned aggregator. For other modes, the bond is
up if any slaves are up. Updates driver version to 3.0.3.
Based on a patch by jamal <hadi@cyberus.ca>.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add the bond name to all error messages so we can tell which one is
complaining. Also reformats some error messages to be more consistent.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bonding just wants the device before the skb_bond()
decapsulation occurs, so simply pass that original
device into packet_type->func() as an argument.
It remains to be seen whether we can use this same
exact thing to get rid of skb->input_dev as well.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for alternate slave selection algorithms to bonding
balance-xor and 802.3ad modes. Default mode (what we have now: xor of
MAC addresses) is "layer2", new choice is "layer3+4", using IP and port
information for hashing to select peer.
Originally submitted by Jason Gabler for balance-xor mode;
modified by Jay Vosburgh to additionally support 802.3ad mode. Jason's
original comment is as follows:
The attached patch to the Linux Etherchannel Bonding driver modifies the
driver's "balance-xor" mode as follows:
- alternate hashing policy support for mode 2
* Added kernel parameter "xmit_policy" to allow the specification
of different hashing policies for mode 2. The original mode 2
policy is the default, now found in xmit_hash_policy_layer2().
* Added xmit_hash_policy_layer34()
This patch was inspired by hashing policies implemented by Cisco,
Foundry and IBM, which are explained in
Foundry documentation found at:
http://www.foundrynet.com/services/documentation/sribcg/Trunking.html#112750
Signed-off-by: Jason Gabler <jygabler@lbl.gov>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!