When a dump is interrupted at the last device in a hash chain and
then continued, "idx" won't get incremented past s_idx, so s_ip_idx
is not reset when moving on to the next device. This means of all
following devices only the last n - s_ip_idx addresses are dumped.
Tested-by: Pawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Use list_add_tail() to get the behavior we had before
the list_head conversion for ipv6 address lists.
Signed-off-by: David S. Miller <davem@davemloft.net>
hlist_for_each_entry(p...) will not necessarily initialize 'p'
to anything if the hlist is empty. GCC notices this and emits
a warning.
Just return true explicitly when we hit a match, and return
false is we fall out of the loop without one.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch reduces timer events while keeping accuracy by rounding
our timer and/or batching several address validations in addrconf_verify().
addrconf_verify() is called at earliest timeout among interface addresses'
timeouts, but at maximum ADDR_CHECK_FREQUENCY (120 secs).
In most cases, all of timeouts of interface addresses are long enough
(e.g. several hours or days vs 2 minutes), this timer is usually called
every ADDR_CHECK_FREQUENCY, and it is okay to be lazy.
(Note this timer could be eliminated if all code paths which modifies
variables related to timeouts call us manually, but it is another story.)
However, in other least but important cases, we try keeping accuracy.
When the real interface address timeout is coming, and the timeout
is just before the rounded timeout, we accept some error.
When a timeout has been reached, we also try batching other several
events in very near future.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The variable regen_advance is only used in the privacy case.
Move it to simplify code and eliminate ifdef's
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some minor stuff, reformat comments and add whitespace for clarity
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert to list macro's for the list of addresses per interface
in IPv6.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing hash function has a couple of issues:
* it is hardwired to 16 for IN6_ADDR_HSIZE
* limited to 256 and callers using int
* use jhash2 rather than some old BSD algorithm
No need for random seed since this is local only (based on assigned
addresses) table.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert from reader/writer lock to RCU and spinlock for addrconf
hash list.
Adds an additional helper macro for hlist_for_each_entry_continue_rcu
to handle the continue case.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using hash list macros, simplifies code and helps later RCU.
This patch includes some initialization that is not strictly necessary,
since an empty hlist node/list is all zero; and list is in BSS
and node is allocated with kzalloc.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use list macros instead of open coded linked list.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since generally there could be more netdevices changing type other
than bonding, making this event type name "bonding-unrelated"
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we are managing IPv6 addresses using DHCP, it would be nice
for user-space to be notified if an address configured through
DHCP fails DAD. Otherwise user-space would have to poll to see
whether DAD succeeds.
This patch uses the existing notification mechanism and simply
hooks it into the DAD failure code path.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This solves a potential race problem during the cleanup process.
The issue is that addrconf_ifdown() needs to traverse address list,
but then drop lock to call the notifier. The version in -next
could get confused if add/delete happened during this window.
Original code (2.6.32 and earlier) was okay because all addresses
were always deleted.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
My recent change in net-next to retain permanent addresses caused regression.
Device refcount would not go to zero when device was unregistered because
left over anycast reference would hold ipv6 dev reference which would hold
device references...
The correct procedure is to call notify chain when address is no longer
available for use. When interface comes back DAD timer will notify
back that address is available.
Also, link local addresses should be purged when interface is brought
down. The address might be changed.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Router Solicitation timer races with device state changes
because it doesn't lock the device. Use local variable to avoid
one repeated dereference.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timer code runs in bottom half, so there is no need for
using _bh form of locking. Also check if device is not ready
to avoid race with address that is no longer active.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFC 4291 section 2.4 states that all uncategorized addresses
should be considered as Global Unicast.
This will remove IPV6_ADDR_RESERVED completely
and return IPV6_ADDR_UNICAST in ipv6_addr_type() instead.
Signed-off-by: Ulrich Weber <uweber@astaro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuck. It turns out that when we restart sysctls we were restarting
with the values already changed. Which unfortunately meant that
the second time through we thought there was no change and skipped
all kinds of work, despite the fact that there was indeed a change.
I have fixed this the simplest way possible by restoring the changed
values when we restart the sysctl write.
One of my coworkers spotted this bug when after disabling forwarding
on an interface pings were still forwarded.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add __percpu sparse annotations to net.
These annotations are to make sparse consider percpu variables to be
in a different address space and warn if accessed without going
through percpu accessors. This patch doesn't affect normal builds.
The macro and type tricks around snmp stats make things a bit
interesting. DEFINE/DECLARE_SNMP_STAT() macros mark the target field
as __percpu and SNMP_UPD_PO_STATS() macro is updated accordingly. All
snmp_mib_*() users which used to cast the argument to (void **) are
updated to cast it to (void __percpu **).
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Stop computing the number of neighbour table settings we have by
counting the number of binary sysctls. This behaviour was silly
and meant that we could not add another neighbour table setting
without also adding another binary sysctl.
Don't pass the binary sysctl path for neighour table entries
into neigh_sysctl_register. These parameters are no longer
used and so are just dead code.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't need to disable bottom half it is already down in the
previous lock. Move some blank lines to group locking in same
context.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Permanent IPV6 addresses should not be removed when the link is
set to admin down, only when device is removed.
When link is lost permanent addresses should be marked as tentative
so that when link comes back they are subject to duplicate address
detection (if DAD was enabled for that address).
Other routing systems keep manually configured IPv6 addresses
when link is set down.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__net_init/__net_exit are apparently not going away, so use them
to full extent.
In some cases __net_init was removed, because it was called from
__net_exit code.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits)
mac80211: fix reorder buffer release
iwmc3200wifi: Enable wimax core through module parameter
iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter
iwmc3200wifi: Coex table command does not expect a response
iwmc3200wifi: Update wiwi priority table
iwlwifi: driver version track kernel version
iwlwifi: indicate uCode type when fail dump error/event log
iwl3945: remove duplicated event logging code
b43: fix two warnings
ipw2100: fix rebooting hang with driver loaded
cfg80211: indent regulatory messages with spaces
iwmc3200wifi: fix NULL pointer dereference in pmkid update
mac80211: Fix TX status reporting for injected data frames
ath9k: enable 2GHz band only if the device supports it
airo: Fix integer overflow warning
rt2x00: Fix padding bug on L2PAD devices.
WE: Fix set events not propagated
b43legacy: avoid PPC fault during resume
b43: avoid PPC fault during resume
tcp: fix a timewait refcnt race
...
Fix up conflicts due to sysctl cleanups (dead sysctl_check code and
CTL_UNNUMBERED removed) in
kernel/sysctl_check.c
net/ipv4/sysctl_net_ipv4.c
net/ipv6/addrconf.c
net/sctp/sysctl.c
Generated with the following semantic patch
@@
struct net *n1;
struct net *n2;
@@
- n1 == n2
+ net_eq(n1, n2)
@@
struct net *n1;
struct net *n2;
@@
- n1 != n2
+ !net_eq(n1, n2)
applied over {include,net,drivers/net}.
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When handling large number of netdevices, inet6_dump_addr()
is very slow because it has O(N^2) complexity.
Instead of scanning one single list, we can use the NETDEV_HASHENTRIES
sub lists of the dev_index hash table, and RCU lookups.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that sys_sysctl is a compatiblity wrapper around /proc/sys
all sysctl strategy routines, and all ctl_name and strategy
entries in the sysctl tables are unused, and can be
revmoed.
In addition neigh_sysctl_register has been modified to no longer
take a strategy argument and it's callers have been modified not
to pass one.
Cc: "David Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Reported by Stephen Rothwell:
--------------------
Today's linux-next build (x86_64 allmodconfig) produced this warning:
net/ipv6/addrconf.c: In function 'inet6_dump_ifinfo':
net/ipv6/addrconf.c:3833: warning: unused variable 'err'
Introduced by commit 84d2697d96 ("ipv6:
speedup inet6_dump_ifinfo()").
--------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
Apparently, inet6_dump_addr() is not able to handle more than
64 ipv6 addresses per device. We must break from inner loops
in case skb is full, or else cursor is put at the end of list.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When handling large number of netdevice, inet6_dump_ifinfo()
is very slow because it has O(N^2) complexity.
Instead of scanning one single list, we can use the 256 sub lists
of the dev_index hash table, and RCU lookups.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds RCU management to the list of netdevices.
Convert some for_each_netdev() users to RCU version, if
it can avoid read_lock-ing dev_base_lock
Ie:
read_lock(&dev_base_loack);
for_each_netdev(net, dev)
some_action();
read_unlock(&dev_base_lock);
becomes :
rcu_read_lock();
for_each_netdev_rcu(net, dev)
some_action();
rcu_read_unlock();
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On Friday 02 October 2009 20:53:51 you wrote:
> This is good although I would have shortened the name.
Ah, I knew I forgot something :) Here is v4.
tavi
>From 24d96d825b9fa832b22878cc6c990d5711968734 Mon Sep 17 00:00:00 2001
From: Octavian Purdila <opurdila@ixiacom.com>
Date: Fri, 2 Oct 2009 00:51:15 +0300
Subject: [PATCH] ipv6: new sysctl for sending TLLAO with unicast NAs
Neighbor advertisements responding to unicast neighbor solicitations
did not include the target link-layer address option. This patch adds
a new sysctl option (disabled by default) which controls whether this
option should be sent even with unicast NAs.
The need for this arose because certain routers expect the TLLAO in
some situations even as a response to unicast NS packets.
Moreover, RFC 2461 recommends sending this to avoid a race condition
(section 4.4, Target link-layer address)
Signed-off-by: Cosmin Ratiu <cratiu@ixiacom.com>
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's unused.
It isn't needed -- read or write flag is already passed and sysctl
shouldn't care about the rest.
It _was_ used in two places at arch/frv for some reason.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If an interface has multiple addresses, the current message for DAD
failure isn't really helpful, so this patch adds the address itself to
the printk.
Signed-off-by: Jens Rosenboom <me@jayr.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes commit e36b9d16c6. The approach
there is to call dev_close()/dev_open() whenever the device type is changed in
order to remap the device IP multicast addresses to HW multicast addresses.
This approach suffers from 2 drawbacks:
*. It assumes tha the device is UP when calling dev_close(), or otherwise
dev_close() has no affect. It is worth to mention that initscripts (Redhat)
and sysconfig (Suse) doesn't act the same in this matter.
*. dev_close() has other side affects, like deleting entries from the routing
table, which might be unnecessary.
The fix here is to directly remap the IP multicast addresses to HW multicast
addresses for a bonding device that changes its type, and nothing else.
Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add IFA_F_DADFAILED flag to denote an IPv6 address that has
failed Duplicate Address Detection, that way tools like
/sbin/ip can be more informative.
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:db8::1/64 scope global tentative dadfailed
valid_lft forever preferred_lft forever
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a bug in addrconf_prefix_rcv() where it won't update the
preferred lifetime of an IPv6 address if the current valid lifetime
of the address is less than 2 hours (the minimum value in the RA).
For example, If I send a router advertisement with a prefix that
has valid lifetime = preferred lifetime = 2 hours we'll build
this address:
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
valid_lft 7175sec preferred_lft 7175sec
If I then send the same prefix with valid lifetime = preferred
lifetime = 0 it will be ignored since the minimum valid lifetime
is 2 hours:
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
valid_lft 7161sec preferred_lft 7161sec
But according to RFC 4862 we should always reset the preferred lifetime
even if the valid lifetime is invalid, which would cause the address
to immediately get deprecated. So with this patch we'd see this:
5: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:1890:1109:a20:21f:29ff:fe5a:ef04/64 scope global deprecated dynamic
valid_lft 7163sec preferred_lft 0sec
The comment winds-up being 5x the size of the code to fix the problem.
Update the preferred lifetime of IPv6 addresses derived from a prefix
info option in a router advertisement even if the valid lifetime in
the option is invalid, as specified in RFC 4862 Section 5.5.3e. Fixes
an issue where an address will not immediately become deprecated.
Reported by Jens Rosenboom.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid showing wrong high values when the preferred lifetime of an address
is expired.
Signed-off-by: Jens Rosenboom <me@jayr.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add 'autoconf' and 'disable_ipv6' parameters to the IPv6 module.
The first controls if IPv6 addresses are autoconfigured from
prefixes received in Router Advertisements. The IPv6 loopback
(::1) and link-local addresses are still configured.
The second controls if IPv6 addresses are desired at all. No
IPv6 addresses will be added to any interfaces.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A tunnel with no local ipv4 endpoint would otherwise use the
ISATAP linklocal address fe80::5efe:0:0, which is invalid. Rather not
add a linklocal address at all.
Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just returning -ERESTARTSYS without a signal pending is not
good that will just leak it to userspace. We need return
-ERESTARTNOINTR so we always restart and set signal pending
so that we fall of the fast path of syscall return and setup
the system call restart.
So use restart_syscall() which does all of this for us.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ipv6 version of bind_conflict code calls ipv6_rcv_saddr_equal()
which at times wrongly identified intersections between addresses.
It particularly broke down under a few instances and caused erroneous
bind conflicts.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>