Commit Graph

8171 Commits

Author SHA1 Message Date
YOSHIFUJI Hideaki
1218854afa [NET] NETNS: Omit seq_net_private->net without CONFIG_NET_NS.
Without CONFIG_NET_NS, no namespace other than &init_net exists,
no need to store net in seq_net_private.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:56 +09:00
YOSHIFUJI Hideaki
3b1e0a655f [NET] NETNS: Omit sock->sk_net without CONFIG_NET_NS.
Introduce per-sock inlines: sock_net(), sock_net_set()
and per-inet_timewait_sock inlines: twsk_net(), twsk_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:55 +09:00
YOSHIFUJI Hideaki
c346dca108 [NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS.
Introduce per-net_device inlines: dev_net(), dev_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-26 04:39:53 +09:00
YOSHIFUJI Hideaki
7cbca67c07 [IPV6]: Support Source Address Selection API (RFC5014).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:24:01 +09:00
YOSHIFUJI Hideaki
6b75d09081 [IPV6]: Optimize hop-limit determination.
Last part of hop-limit determination is always:
    hoplimit = dst_metric(dst, RTAX_HOPLIMIT);
    if (hoplimit < 0)
        hoplimit = ipv6_get_hoplimit(dst->dev).

Let's consolidate it as ip6_dst_hoplimit(dst).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:24:00 +09:00
YOSHIFUJI Hideaki
c8cdaf998d [IPV4,IPV6]: Share cork.rt between IPv4 and IPv6.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:59 +09:00
YOSHIFUJI Hideaki
a9b05723ff [IPV6] ADDRCONF: Clean-up ipv6_dev_get_saddr().
old:
|    text	   data	    bss	    dec	    hex	filename
|   28599	   1416	     96	  30111	   759f	net/ipv6/addrconf.o

new:
|    text	   data	    bss	    dec	    hex	filename
|   28007	   1416	     96	  29519	   734f	net/ipv6/addrconf.o

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:58 +09:00
YOSHIFUJI Hideaki
9bb182a700 [XFRM] MIP6: Fix address keys for routing search.
Each MIPv6 XFRM state (DSTOPT/RH2) holds either destination or source
address to be mangled in the IPv6 header (that is "CoA").
On Inter-MN communication after both nodes binds each other,
they use route optimized traffic two MIPv6 states applied, and
both source and destination address in the IPv6 header
are replaced by the states respectively.
The packet format is correct, however, next-hop routing search
are not.
This patch fixes it by remembering address pairs for later states.

Based on patch from Masahide NAKAMURA <nakam@linux-ipv6.org>.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:57 +09:00
YOSHIFUJI Hideaki
df8ea19b5d [XFRM] IPV6: Optimize __xfrm_tunnel_alloc_spi().
| % size old/net/ipv6/xfrm6_tunnel.o new/net/ipv6/xfrm6_tunnel.o
|    text	   data	    bss	    dec	    hex	filename
|    1606	     40	   2080	   3726	    e8e	old/net/ipv6/xfrm6_tunnel.o
|    1574	     40	   2080	   3694	    e6e	new/net/ipv6/xfrm6_tunnel.o

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:57 +09:00
YOSHIFUJI Hideaki
a002c6fd71 [XFRM] IPV6: Optimize xfrm6_input_addr().
| % size old/net/ipv6/xfrm6_input.o new/net/ipv6/xfrm6_input.o
|    text	   data	    bss	    dec	    hex	filename
|    1026	      0	      0	   1026	    402	old/net/ipv6/xfrm6_input.o
|     947	      0	      0	    947	    3b3	new/net/ipv6/xfrm6_input.o

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:56 +09:00
YOSHIFUJI Hideaki
3b6cdf94cd [XFRM] IPV6: Use distribution counting sort for xfrm_state/xfrm_tmpl chain.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25 10:23:56 +09:00
Denis V. Lunev
92f1fecb45 [NETNS]: Enable TCP/UDP/ICMP inside namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:34:06 -07:00
Denis V. Lunev
2342fd7e14 [NETNS]: Allow to create sockets in non-initial namespace.
Allow to create sockets in the namespace if the protocol ok with this.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:33:42 -07:00
Denis V. Lunev
f145049a06 [NETNS]: Drop packets in the non-initial namespace on the per/protocol basis.
IP layer now can handle multiple namespaces normally. So, process such
packets normally and drop them only if the transport layer is not
aware about namespaces.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:33:00 -07:00
Denis V. Lunev
0be43f82c4 [NETNS]: Process netfilter hooks in initial namespace only.
There were no packets in the namespace other than initial
previously. This will be changed in the neareast future. Netfilters
are not namespace aware and should be processed in the initial
namespace only for now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:32:09 -07:00
Denis V. Lunev
05cf89d40c [NETNS]: Process INET socket layer in the correct namespace.
Replace all the reast of the init_net with a proper net on the socket
layer.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:31:35 -07:00
Denis V. Lunev
cb84663e4d [NETNS]: Process IP layer in the context of the correct namespace.
Replace all the rest of the init_net with a proper net on the IP layer.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:31:00 -07:00
Denis V. Lunev
7a6adb92fe [NETNS]: Add namespace parameter to ip_cmsg_send.
Pass the init_net there for now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:30:27 -07:00
Denis V. Lunev
f2c4802b3f [NETNS]: Add namespace parameter to ip_options_get(...).
Pass the init_net there for now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:29:55 -07:00
Denis V. Lunev
0e6bd4a1c6 [NETNS]: Add namespace parameter to ip_options_compile.
ip_options_compile uses inet_addr_type which requires a namespace. The
packet argument is optional, so parameter is the only way to obtain
it. Pass the init_net there for now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:29:23 -07:00
Denis V. Lunev
ffc31d3d77 [NETNS]: /proc/net/arp namespacing.
Seqfile operation showing /proc/net/arp are already namespace
aware. All we need is to register this file for each namespace.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:28:43 -07:00
Denis V. Lunev
49e8a279a1 [NETNS]: Process ARP in the context of the correct namespace.
Get namespace from a device and pass it to the routing engine. Enable
ARP packet processing and device notifiers after that.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 15:28:12 -07:00
Pavel Emelyanov
2feb27dbe0 [NETNS]: Minor information leak via /proc/net/ptype file.
This file displays the registered packet types, but some of them
(packet sockets creates such) can be bound to a net device and showing
them in a wrong namespace is not correct.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:57:45 -07:00
Pavel Emelyanov
84c375af0f [NETNS][UDP-Lite]: Register /proc/net/udplite(6) in a namespace.
UDP-Lite sockets are displayed in another files, rather than
UDP ones, so make the present in namespaces as well.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:56:57 -07:00
Pavel Emelyanov
ff2bac6a63 [UDP-Lite]: Clean up proc creation a bit.
Just introduce a helper to remove ifdefs from inside the
udplite4_register function. This will help to make the next patch
nicer.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:56:34 -07:00
Pavel Emelyanov
757764f61d [NETNS][TCP]: Register /proc/net/tcp in a namespace.
After the commit f40c8174d3 ([NETNS][IPV4] 
tcp - make proc handle the network namespaces) it is now possible to make
this file present in newly created namespaces.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:56:02 -07:00
Pavel Emelyanov
15439febb0 [NETNS][UDP]: Register /proc/net/udp in a namespace.
After the commit a91275eff4 ([NETNS][IPV6]
udp - make proc handle the network namespace) it is now possible to make
this file present in newly created namespaces.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:53:49 -07:00
Kazunori MIYAZAWA
df9dcb4588 [IPSEC]: Fix inter address family IPsec tunnel handling.
Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:51:51 -07:00
Pavel Emelyanov
fa86d322d8 [NEIGH]: Fix race between pneigh deletion and ipv6's ndisc_recv_ns (v3).
Proxy neighbors do not have any reference counting, so any caller
of pneigh_lookup (unless it's a netlink triggered add/del routine)
should _not_ perform any actions on the found proxy entry. 

There's one exception from this rule - the ipv6's ndisc_recv_ns() 
uses found entry to check the flags for NTF_ROUTER.

This creates a race between the ndisc and pneigh_delete - after 
the pneigh is returned to the caller, the nd_tbl.lock is dropped 
and the deleting procedure may proceed.

One of the fixes would be to add a reference counting, but this
problem exists for ndisc only. Besides such a patch would be too 
big for -rc4.

So I propose to introduce a __pneigh_lookup() which is supposed
to be called with the lock held and use it in ndisc code to check
the flags on alive pneigh entry.


Changes from v2:
As David noticed, Exported the __pneigh_lookup() to ipv6 module. 
The checkpatch generates a warning on it, since the EXPORT_SYMBOL 
does not follow the symbol itself, but in this file all the 
exports come at the end, so I decided no to break this harmony.

Changes from v1:
Fixed comments from YOSHIFUJI - indentation of prototype in header
and the pndisc_check_router() name - and a compilation fix, pointed
by Daniel - the is_routed was (falsely) considered as uninitialized
by gcc.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-24 14:48:59 -07:00
David S. Miller
06802a819a Merge branch 'master' of ../net-2.6/
Conflicts:

	net/ipv6/ndisc.c
2008-03-23 22:54:03 -07:00
Florian Westphal
80445cfb28 [SCTP]: Remove redundant wrapper functions.
sctp_datamsg_free and sctp_datamsg_track are just aliases for
sctp_datamsg_put and sctp_chunk_hold, respectively.

Saves 32 Bytes on x86.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:47:08 -07:00
Florian Westphal
2444844cef [SCTP]: Replace char msg[] with static const char[].
133886    2004     220  136110   213ae sctp.new/sctp.o
134018    2004     220  136242   21432 sctp.old/sctp.o

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:46:34 -07:00
Stephen Hemminger
3d3b2d25a4 fib_trie: print information on all routing tables
Make /proc/net/fib_trie and /proc/net/fib_triestat display
all routing tables, not just local and main.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:43:56 -07:00
Jiri Olsa
2a706ec188 [AF_PACKET]: Remove unused variable.
Signed-off-by: Jiri Olsa <olsajiri@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:42:34 -07:00
Florian Westphal
2051f11fb8 [TCP]: Shrink syncookie_secret by 8 byte.
the first u32 copied from syncookie_secret is overwritten by the
minute-counter four lines below.  After adjusting the destination
address, the size of syncookie_secret can be reduced accordingly.

AFAICS, the only other user of syncookie_secret[] is the ipv6
syncookie support.  Because ipv6 syncookies only grab 44 bytes from
syncookie_secret[], this shouldn't affect them in any way.

With fixes from Glenn Griffin.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Glenn Griffin <ggriffin.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:21:28 -07:00
Martin Devera
8f3ea33a50 sch_htb: fix "too many events" situation
HTB is event driven algorithm and part of its work is to apply
scheduled events at proper times. It tried to defend itself from
livelock by processing only limited number of events per dequeue.
Because of faster computers some users already hit this hardcoded
limit.

This patch limits processing up to 2 jiffies (why not 1 jiffie ?
because it might stop prematurely when only fraction of jiffie
remains).

Signed-off-by: Martin Devera <devik@cdi.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 22:00:38 -07:00
Rami Rosen
061964fb98 [IPV6]: Remove unused code in ndisc_send_redirect().
This patches removes unused code in ndisc_send_redirect() method in
net/ipv6/ndisc.c.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 21:58:44 -07:00
Wang Chen
dbee0d3f46 [ATM]: When proc_create() fails, do some error handling work and return -ENOMEM.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-23 21:45:36 -07:00
Julia Lawall
53a6201fdf [9P] net/9p/trans_fd.c: remove unused variable
The variable cb is initialized but never used otherwise.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
type T;
identifier i;
constant C;
@@

(
extern T i;
|
- T i;
  <+... when != i
- i = C;
  ...+>
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 18:05:33 -07:00
Julia Lawall
421f099bc5 [IPV6] net/ipv6/ndisc.c: remove unused variable
The variable hlen is initialized but never used otherwise.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
type T;
identifier i;
constant C;
@@

(
extern T i;
|
- T i;
  <+... when != i
- i = C;
  ...+>
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 18:04:16 -07:00
Stephen Hemminger
6440cc9e0f [IPV4] fib_trie: fix warning from rcu_assign_poinger
This gets rid of a warning caused by the test in rcu_assign_pointer.
I tried to fix rcu_assign_pointer, but that devolved into a long set
of discussions about doing it right that came to no real solution.
Since the test in rcu_assign_pointer for constant NULL would never
succeed in fib_trie, just open code instead.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 17:59:58 -07:00
Stephen Hemminger
817bc4db77 [IPV4] route: use read_mostly
The route table parameters are set based on system memory and sysctl
values that almost never change. Also the genid only changes every
10 minutes.

RTprint is defined by never used.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 17:43:59 -07:00
Denis V. Lunev
ce25999078 [IPV4]: sk parameter is unused in ipv4_dst_blackhole.
Just remove it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 17:42:37 -07:00
Pavel Emelyanov
fc8717baa8 [RAW]: Add raw_hashinfo member on struct proto.
Sorry for the patch sequence confusion :| but I found that the similar
thing can be done for raw sockets easily too late.

Expand the proto.h union with the raw_hashinfo member and use it in
raw_prot and rawv6_prot. This allows to drop the protocol specific
versions of hash and unhash callbacks.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:56:51 -07:00
Pavel Emelyanov
6ba5a3c52d [UDP]: Make full use of proto.h.udp_hash innovation.
After this we have only udp_lib_get_port to get the port and two 
stubs for ipv4 and ipv6. No difference in udp and udplite except
for initialized h.udp_hash member.

I tried to find a graceful way to drop the only difference between
udp_v4_get_port and udp_v6_get_port (i.e. the rcv_saddr comparison 
routine), but adding one more callback on the struct proto didn't 
appear such :( Maybe later.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:51:21 -07:00
Pavel Emelyanov
39d8cda76c [SOCK]: Add udp_hash member to struct proto.
Inspired by the commit ab1e0a13 ([SOCK] proto: Add hashinfo member to 
struct proto) from Arnaldo, I made similar thing for UDP/-Lite IPv4 
and -v6 protocols.

The result is not that exciting, but it removes some levels of
indirection in udpxxx_get_port and saves some space in code and text.

The first step is to union existing hashinfo and new udp_hash on the
struct proto and give a name to this union, since future initialization 
of tcpxxx_prot, dccp_vx_protinfo and udpxxx_protinfo will cause gcc 
warning about inability to initialize anonymous member this way.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:50:58 -07:00
Denis V. Lunev
22aba383ce [IPV4]: Always pass ip_options pointer into ip_options_compile.
This makes code a bit more uniform and straigthforward.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:36:20 -07:00
Denis V. Lunev
ef722495c8 [IPV4]: Remove unused ip_options->is_data.
ip_options->is_data is assigned only and never checked. The structure is
not a part of kernel interface to the userspace. So, it is safe to remove
this field.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:35:29 -07:00
Denis V. Lunev
10fe7d85e2 [IPV4]: Remove unnecessary check for opt->is_data in ip_options_compile.
There is the only way to reach ip_options compile with opt != NULL:

ip_options_get_finish
    opt->is_data = 1;
    ip_options_compile(opt, NULL)

So, checking for is_data inside opt != NULL branch is not needed.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 16:35:00 -07:00
Herbert Xu
69d1506731 [TCP]: Let skbs grow over a page on fast peers
While testing the virtio-net driver on KVM with TSO I noticed
that TSO performance with a 1500 MTU is significantly worse
compared to the performance of non-TSO with a 16436 MTU.  The
packet dump shows that most of the packets sent are smaller
than a page.

Looking at the code this actually is quite obvious as it always
stop extending the packet if it's the first packet yet to be
sent and if it's larger than the MSS.  Since each extension is
bound by the page size, this means that (given a 1500 MTU) we're
very unlikely to construct packets greater than a page, provided
that the receiver and the path is fast enough so that packets can
always be sent immediately.

The fix is also quite obvious.  The push calls inside the loop
is just an optimisation so that we don't end up doing all the
sending at the end of the loop.  Therefore there is no specific
reason why it has to do so at MSS boundaries.  For TSO, the
most natural extension of this optimisation is to do the pushing
once the skb exceeds the TSO size goal.

This is what the patch does and testing with KVM shows that the
TSO performance with a 1500 MTU easily surpasses that of a 16436
MTU and indeed the packet sizes sent are generally larger than
16436.

I don't see any obvious downsides for slower peers or connections,
but it would be prudent to test this extensively to ensure that
those cases don't regress.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-22 15:47:05 -07:00
Patrick McManus
ec3c0982a2 [TCP]: TCP_DEFER_ACCEPT updates - process as established
Change TCP_DEFER_ACCEPT implementation so that it transitions a
connection to ESTABLISHED after handshake is complete instead of
leaving it in SYN-RECV until some data arrvies. Place connection in
accept queue when first data packet arrives from slow path.

Benefits:
  - established connection is now reset if it never makes it
   to the accept queue

 - diagnostic state of established matches with the packet traces
   showing completed handshake

 - TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now be
   enforced with reasonable accuracy instead of rounding up to next
   exponential back-off of syn-ack retry.

Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 16:33:01 -07:00
Patrick McManus
e4c7884028 [TCP]: TCP_DEFER_ACCEPT updates - dont retxmt synack
a socket in LISTEN that had completed its 3 way handshake, but not notified
userspace because of SO_DEFER_ACCEPT, would retransmit the already
acked syn-ack during the time it was waiting for the first data byte
from the peer.

Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 16:29:22 -07:00
Patrick McManus
539fae89be [TCP]: TCP_DEFER_ACCEPT updates - defer timeout conflicts with max_thresh
timeout associated with SO_DEFER_ACCEPT wasn't being honored if it was
less than the timeout allowed by the maximum syn-recv queue size
algorithm. Fix by using the SO_DEFER_ACCEPT value if the ack has
arrived.

Signed-off-by: Patrick McManus <mcmanus@ducksong.com>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 16:27:38 -07:00
Pavel Emelyanov
7512cbf6ef [DLCI]: Fix tiny race between module unload and sock_ioctl.
This is a narrow pedantry :) but the dlci_ioctl_hook check and call
should not be parted with the mutex lock.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:58:52 -07:00
Pavel Emelyanov
28518fc170 [NET]: NULL pointer dereference and other nasty things in /proc/net/(tcp|udp)[6]
Commits f40c81 ([NETNS][IPV4] tcp - make proc handle the network
namespaces) and a91275 ([NETNS][IPV6] udp - make proc handle the
network namespace) both introduced bad checks on sockets and tw
buckets to belong to proper net namespace.

I.e. when checking for socket to belong to given net and family the

	do {
		sk = sk_next(sk);
	} while (sk && sk->sk_net != net && sk->sk_family != family);

constructions were used. This is wrong, since as soon as the
sk->sk_net fits the net the socket is immediately returned, even if it
belongs to other family.

As the result four /proc/net/(udp|tcp)[6] entries show wrong info.
The udp6 entry even oopses when dereferencing inet6_sk(sk) pointer:

static void udp6_sock_seq_show(struct seq_file *seq, struct sock *sp, int bucket)
{
	...
        struct ipv6_pinfo *np = inet6_sk(sp);
	...

        dest  = &np->daddr; /* will be NULL for AF_INET sockets */
	...
	seq_printf(...
	           dest->s6_addr32[0], dest->s6_addr32[1],
                   dest->s6_addr32[2], dest->s6_addr32[3],
	...

Fix it by converting && to ||.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:52:00 -07:00
Stephen Hemminger
b1153f29ee netlink: make socket filters work on netlink
Make socket filters work for netlink unicast and notifications.
This is useful for applications like Zebra that get overrun with
messages that are then ignored.

Note: netlink messages are in host byte order, but packet filter
state machine operations are done as network byte order.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:46:12 -07:00
Phil Oester
12b101555f [IPV4]: Fix null dereference in ip_defrag
Been seeing occasional panics in my testing of 2.6.25-rc in ip_defrag.
Offending line in ip_defrag is here:

	net = skb->dev->nd_net

where dev is NULL.  Bisected the problem down to commit
ac18e7509e ([NETNS][FRAGS]: Make the
inet_frag_queue lookup work in namespaces).  

Below patch (idea from Patrick McHardy) fixes the problem for me.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 15:01:50 -07:00
Daniel Lezcano
6f8b13bcb3 [NETNS][IPV6] tcp6 - make proc per namespace
Make the proc for tcp6 to be per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:14:45 -07:00
Daniel Lezcano
0c96d8c50b [NETNS][IPV6] udp6 - make proc per namespace
The proc init/exit functions take a new network namespace parameter in
order to register/unregister /proc/net/udp6 for a namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:14:17 -07:00
Daniel Lezcano
f40c8174d3 [NETNS][IPV4] tcp - make proc handle the network namespaces
This patch, like udp proc, makes the proc functions to take care of
which namespace the socket belongs.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:13:54 -07:00
Daniel Lezcano
8d9f1744ca [NETNS][IPV6] tcp - assign the netns for timewait sockets
Copy the network namespace from the socket to the timewait socket.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:12:54 -07:00
Daniel Lezcano
a91275eff4 [NETNS][IPV6] udp - make proc handle the network namespace
This patch makes the common udp proc functions to take care of which
socket they should show taking into account the namespace it belongs.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:11:58 -07:00
Daniel Lezcano
ea82edf704 [NETNS][IPV6] mcast - fix compilation warning when procfs is not compiled in
When CONFIG_PROC_FS=no, the out_sock_create label is not used because
the code using it is disabled and that leads to a warning at compile
time.

This patch fix that by making a specific function to initialize proc
for igmp6, and remove the annoying CONFIG_PROC_FS sections in
init/exit function.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 04:10:53 -07:00
Peter P Waskiewicz Jr
82cc1a7a56 [NET]: Add per-connection option to set max TSO frame size
Update: My mailer ate one of Jarek's feedback mails...  Fixed the
parameter in netif_set_gso_max_size() to be u32, not u16.  Fixed the
whitespace issue due to a patch import botch.  Changed the types from
u32 to unsigned int to be more consistent with other variables in the
area.  Also brought the patch up to the latest net-2.6.26 tree.

Update: Made gso_max_size container 32 bits, not 16.  Moved the
location of gso_max_size within netdev to be less hotpath.  Made more
consistent names between the sock and netdev layers, and added a
define for the max GSO size.

Update: Respun for net-2.6.26 tree.

Update: changed max_gso_frame_size and sk_gso_max_size from signed to
unsigned - thanks Stephen!

This patch adds the ability for device drivers to control the size of
the TSO frames being sent to them, per TCP connection.  By setting the
netdevice's gso_max_size value, the socket layer will set the GSO
frame size based on that value.  This will propogate into the TCP
layer, and send TSO's of that size to the hardware.

This can be desirable to help tune the bursty nature of TSO on a
per-adapter basis, where one may have 1 GbE and 10 GbE devices
coexisting in a system, one running multiqueue and the other not, etc.

This can also be desirable for devices that cannot support full 64 KB
TSO's, but still want to benefit from some level of segmentation
offloading.

Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-21 03:43:19 -07:00
David S. Miller
a25606c845 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-03-21 03:42:24 -07:00
YOSHIFUJI Hideaki
38fe999e22 [IPV6] KCONFIG: Fix description about IPV6_TUNNEL.
Based on notice from "Colin" <colins@sjtu.edu.cn>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 16:13:58 -07:00
Patrick McHardy
607bfbf2d5 [TCP]: Fix shrinking windows with window scaling
When selecting a new window, tcp_select_window() tries not to shrink
the offered window by using the maximum of the remaining offered window
size and the newly calculated window size. The newly calculated window
size is always a multiple of the window scaling factor, the remaining
window size however might not be since it depends on rcv_wup/rcv_nxt.
This means we're effectively shrinking the window when scaling it down.


The dump below shows the problem (scaling factor 2^7):

- Window size of 557 (71296) is advertised, up to 3111907257:

IP 172.2.2.3.33000 > 172.2.2.2.33000: . ack 3111835961 win 557 <...>

- New window size of 514 (65792) is advertised, up to 3111907217, 40 bytes
  below the last end:

IP 172.2.2.3.33000 > 172.2.2.2.33000: . 3113575668:3113577116(1448) ack 3111841425 win 514 <...>

The number 40 results from downscaling the remaining window:

3111907257 - 3111841425 = 65832
65832 / 2^7 = 514
65832 % 2^7 = 40

If the sender uses up the entire window before it is shrunk, this can have
chaotic effects on the connection. When sending ACKs, tcp_acceptable_seq()
will notice that the window has been shrunk since tcp_wnd_end() is before
tp->snd_nxt, which makes it choose tcp_wnd_end() as sequence number.
This will fail the receivers checks in tcp_sequence() however since it
is before it's tp->rcv_wup, making it respond with a dupack.

If both sides are in this condition, this leads to a constant flood of
ACKs until the connection times out.

Make sure the window is never shrunk by aligning the remaining window to
the window scaling factor.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 16:11:27 -07:00
Jarek Poplawski
8a455b087c netpoll: zap_completion_queue: adjust skb->users counter
zap_completion_queue() retrieves skbs from completion_queue where they have
zero skb->users counter.  Before dev_kfree_skb_any() it should be non-zero
yet, so it's increased now.

Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 16:07:27 -07:00
Fabio Checconi
2bec008ca9 bridge: use time_before() in br_fdb_cleanup()
In br_fdb_cleanup() next_timer and this_timer are in jiffies, so they
should be compared using the time_after() macro.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:54:58 -07:00
Vlad Yasevich
270637abff [SCTP]: Fix a race between module load and protosw access
There is a race is SCTP between the loading of the module
and the access by the socket layer to the protocol functions.
In particular, a list of addresss that SCTP maintains is
not initialized prior to the registration with the protosw.
Thus it is possible for a user application to gain access
to SCTP functions before everything has been initialized.
The problem shows up as odd crashes during connection
initializtion when we try to access the SCTP address list.

The solution is to refactor how we do registration and
initialize the lists prior to registering with the protosw.
Care must be taken since the address list initialization
depends on some other pieces of SCTP initialization.  Also
the clean-up in case of failure now also needs to be refactored.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:17:14 -07:00
Daniel Hokka Zakrisson
d0ebf13359 [NETFILTER]: ipt_recent: sanity check hit count
If a rule using ipt_recent is created with a hit count greater than
ip_pkt_list_tot, the rule will never match as it cannot keep track
of enough timestamps. This patch makes ipt_recent refuse to create such
rules.

With ip_pkt_list_tot's default value of 20, the following can be used
to reproduce the problem.

nc -u -l 0.0.0.0 1234 &
for i in `seq 1 100`; do echo $i | nc -w 1 -u 127.0.0.1 1234; done

This limits it to 20 packets:
iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \
         --rsource
iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \
         60 --hitcount 20 --name test --rsource -j DROP

While this is unlimited:
iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \
         --rsource
iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \
         60 --hitcount 21 --name test --rsource -j DROP

With the patch the second rule-set will throw an EINVAL.

Reported-by: Sean Kennedy <skennedy@vcn.com>
Signed-off-by: Daniel Hokka Zakrisson <daniel@hozac.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:07:10 -07:00
Roel Kluin
6aebb9b280 [NETFILTER]: nf_conntrack_h323: logical-bitwise & confusion in process_setup()
logical-bitwise & confusion

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-20 15:06:23 -07:00
Robert P. J. Day
938b93adb2 [NET]: Add debugging names to __RW_LOCK_UNLOCKED macros.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-18 00:59:23 -07:00
David S. Miller
577f99c1d0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/rt2x00/rt2x00dev.c
	net/8021q/vlan_dev.c
2008-03-18 00:37:55 -07:00
David S. Miller
2f633928cb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-03-17 23:44:31 -07:00
Al Viro
5e226e4d90 [IPV4]: esp_output() misannotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:50:23 -07:00
Al Viro
9534f035ee [8021Q]: vlan_dev misannotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:49:48 -07:00
Al Viro
27724426a9 [SUNRPC]: net/* NULL noise
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:48:03 -07:00
Al Viro
bc92dd194d [SCTP]: fix misannotated __sctp_rcv_asconf_lookup()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:47:32 -07:00
Al Viro
0382b9c354 [PKT_SCHED]: annotate cls_u32
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:46:46 -07:00
Al Viro
e6f1cebf71 [NET] endianness noise: INADDR_ANY
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-17 22:44:53 -07:00
Adrian Bunk
d9357136ac the scheduled rc80211-simple.c removal
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-13 16:02:31 -04:00
Adrian Bunk
7524d7d6de the scheduled ieee80211 softmac removal
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-13 16:02:31 -04:00
Linus Torvalds
609eb39c8d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
  [SCTP]: Fix local_addr deletions during list traversals.
  net: fix build with CONFIG_NET=n
  [TCP]: Prevent sending past receiver window with TSO (at last skb)
  rt2x00: Add new D-Link USB ID
  rt2x00: never disable multicast because it disables broadcast too
  libertas: fix the 'compare command with itself' properly
  drivers/net/Kconfig: fix whitespace for GELIC_WIRELESS entry
  [NETFILTER]: nf_queue: don't return error when unregistering a non-existant handler
  [NETFILTER]: nfnetlink_queue: fix EPERM when binding/unbinding and instance 0 exists
  [NETFILTER]: nfnetlink_log: fix EPERM when binding/unbinding and instance 0 exists
  [NETFILTER]: nf_conntrack: replace horrible hack with ksize()
  [NETFILTER]: nf_conntrack: add \n to "expectation table full" message
  [NETFILTER]: xt_time: fix failure to match on Sundays
  [NETFILTER]: nfnetlink_log: fix computation of netlink skb size
  [NETFILTER]: nfnetlink_queue: fix computation of allocated size for netlink skb.
  [NETFILTER]: nfnetlink: fix ifdef in nfnetlink_compat.h
  [NET]: include <linux/types.h> into linux/ethtool.h for __u* typedef
  [NET]: Make /proc/net a symlink on /proc/self/net (v3)
  RxRPC: fix rxrpc_recvmsg()'s returning of msg_name
  net/enc28j60: oops fix
  ...
2008-03-12 13:08:09 -07:00
Tom Tucker
3fedb3c5a8 SVCRDMA: Fix erroneous BUG_ON in send_write
The assertion that checks for sge context overflow is
incorrectly hard-coded to 32. This causes a kernel bug
check when using big-data mounts. Changed the BUG_ON to
use the computed value RPCSVC_MAXPAGES.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-12 12:37:34 -07:00
Tom Tucker
c48cbb405c SVCRDMA: Add xprt refs to fix close/unmount crash
RDMA connection shutdown on an SMP machine can cause a kernel crash due
to the transport close path racing with the I/O tasklet.

Additional transport references were added as follows:
- A reference when on the DTO Q to avoid having the transport
  deleted while queued for I/O.
- A reference while there is a QP able to generate events.
- A reference until the DISCONNECTED event is received on the CM ID

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-12 12:37:34 -07:00
David S. Miller
ba73d4c84a Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6.26 2008-03-11 19:17:18 -07:00
Chidambar 'ilLogict' Zinnoury
22626216c4 [SCTP]: Fix local_addr deletions during list traversals.
Since the lists are circular, we need to explicitely tag
the address to be deleted since we might end up freeing
the list head instead.  This fixes some interesting SCTP
crashes.

Signed-off-by: Chidambar 'ilLogict' Zinnoury <illogict@online.fr>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-11 18:05:02 -07:00
Ilpo Järvinen
5ea3a74806 [TCP]: Prevent sending past receiver window with TSO (at last skb)
With TSO it was possible to send past the receiver window when the skb
to be sent was the last in the write queue while the receiver window
is the limiting factor. One can notice that there's a loophole in the
tcp_mss_split_point that lacked a receiver window check for the
tcp_write_queue_tail() if also cwnd was smaller than the full skb.

Noticed by Thomas Gleixner <tglx@linutronix.de> in form of "Treason
uncloaked! Peer ... shrinks window .... Repaired."  messages (the peer
didn't actually shrink its window as the message suggests, we had just
sent something past it without a permission to do so).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-11 17:55:27 -07:00
Patrick McHardy
94be1a3f36 [NETFILTER]: nf_queue: don't return error when unregistering a non-existant handler
Commit ce7663d84:

[NETFILTER]: nfnetlink_queue: don't unregister handler of other subsystem

changed nf_unregister_queue_handler to return an error when attempting to
unregister a queue handler that is not identical to the one passed in.
This is correct in case we really do have a different queue handler already
registered, but some existing userspace code always does an unbind before
bind and aborts if that fails, so try to be nice and return success in
that case.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:45:05 -07:00
Patrick McHardy
914afea84e [NETFILTER]: nfnetlink_queue: fix EPERM when binding/unbinding and instance 0 exists
Similar to the nfnetlink_log problem, nfnetlink_queue incorrectly
returns -EPERM when binding or unbinding to an address family and
queueing instance 0 exists and is owned by a different process. Unlike
nfnetlink_log it previously completes the operation, but it is still
incorrect.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:44:36 -07:00
Patrick McHardy
b7047a1c88 [NETFILTER]: nfnetlink_log: fix EPERM when binding/unbinding and instance 0 exists
When binding or unbinding to an address family, the res_id is usually set
to zero. When logging instance 0 already exists and is owned by a different
process, this makes nfunl_recv_config return -EPERM without performing
the bind operation.

Since no operation on the foreign logging instance itself was requested,
this is incorrect. Move bind/unbind commands before the queue instance
permissions checks.

Also remove an incorrect comment.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:44:13 -07:00
Pekka Enberg
019f692ea7 [NETFILTER]: nf_conntrack: replace horrible hack with ksize()
There's a horrible slab abuse in net/netfilter/nf_conntrack_extend.c
that can be replaced with a call to ksize().

Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:43:41 -07:00
Alexey Dobriyan
3d89e9cf36 [NETFILTER]: nf_conntrack: add \n to "expectation table full" message
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:43:10 -07:00
Jan Engelhardt
4f4c9430cf [NETFILTER]: xt_time: fix failure to match on Sundays
From: Andrew Schulman <andrex@alumni.utexas.net>

xt_time_match() in net/netfilter/xt_time.c in kernel 2.6.24 never
matches on Sundays. On my host I have a rule like

    iptables -A OUTPUT -m time --weekdays Sun -j REJECT

and it never matches. The problem is in localtime_2(), which uses

    r->weekday = (4 + r->dse) % 7;

to map the epoch day onto a weekday in {0,...,6}. In particular this
gives 0 for Sundays. But 0 has to be wrong; a weekday of 0 can never
match. xt_time_match() has

    if (!(info->weekdays_match & (1 << current_time.weekday)))
        return false;

and when current_time.weekday = 0, the result of the & is always
zero, even when info->weekdays_match = XT_TIME_ALL_WEEKDAYS = 0xFE.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:42:40 -07:00
Eric Leblond
7000d38d61 [NETFILTER]: nfnetlink_log: fix computation of netlink skb size
This patch is similar to nfnetlink_queue fixes. It fixes the computation
of skb size by using NLMSG_SPACE instead of NLMSG_ALIGN.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:42:04 -07:00
Eric Leblond
cabaa9bfb0 [NETFILTER]: nfnetlink_queue: fix computation of allocated size for netlink skb.
Size of the netlink skb was wrongly computed because the formula was using
NLMSG_ALIGN instead of NLMSG_SPACE. NLMSG_ALIGN does not add the room for
netlink header as NLMSG_SPACE does. This was causing a failure of message
building in some cases.

On my test system, all messages for packets in range [8*k+41, 8*k+48] where k
is an integer were invalid and the corresponding packets were dropped.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-10 16:41:43 -07:00
Johannes Berg
e5f98f2df9 mac80211: don't call conf_tx under RCU lock
Reinette pointed out that with the sta_info RCU-ification
the behaviour here changed and the conf_tx callback is
now invoked under RCU read lock. That is not necessary so
this patch restores the original behaviour

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-07 16:02:59 -05:00
Linus Torvalds
4c1aa6f8b9 Merge branch 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix dentry revalidation for NFSv4 referrals and mountpoint crossings
  NFS: Fix the fsid revalidation in nfs_update_inode()
  SUNRPC: Fix a nfs4 over rdma transport oops
  NFS: Fix an f_mode/f_flags confusion in fs/nfs/write.c
2008-03-07 12:08:07 -08:00
Tom Talpey
ee1a2c564f SUNRPC: Fix a nfs4 over rdma transport oops
Prevent an RPC oops when freeing a dynamically allocated RDMA
buffer, used in certain special-case large metadata operations.

Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: James Lentini <jlentini@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-07 14:35:32 -05:00
Daniel Lezcano
b8ad0cbc58 [NETNS][IPV6] mcast - handle several network namespace
This patch make use of the network namespace information at the right
places to handle the multicast for several network namespaces.  It
makes the socket control to be per namespace too.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:16:55 -08:00
Daniel Lezcano
e504799276 [NETNS][IPV6] tcp6 - handle several network namespace
We have the right network namespace at the right place now.
So make use of this information to make tcp6 per network namespace

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:16:26 -08:00
Daniel Lezcano
93ec926b07 [NETNS][IPV6] tcp6 - make socket control per namespace
Instead of having a tcp6_socket global to all the namespace, there is
tcp6 socket control per namespace. That is consistent with which
namespace sent a RST and allows to pass the socket to the underlying
function to retrieve the network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:16:02 -08:00
Daniel Lezcano
1762f7e88e [NETNS][IPV6] ndisc - make socket control per namespace
Make ndisc socket control per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:15:34 -08:00
Daniel Lezcano
a18bc6959d [NETNS][IPV6] ndisc - make ndisc handle multiple network namespaces
Make ndisc handle multiple network namespaces: 
Remove references to init_net, add network namespace parameters and add 
pernet_operations for ndisc

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:14:49 -08:00
Daniel Lezcano
8a3edd800d [NETNS][IPV6] fix some missing namespace
This patch adds some missing namespace

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-07 11:14:16 -08:00
David S. Miller
db8dac20d5 [UDP]: Revert udplite and code split.
This reverts commit db1ed684f6 ("[IPV6]
UDP: Rename IPv6 UDP files."), commit
8be8af8fa4 ("[IPV4] UDP: Move
IPv4-specific bits to other file.") and commit
e898d4db27 ("[UDP]: Allow users to
configure UDP-Lite.").

First, udplite is of such small cost, and it is a core protocol just
like TCP and normal UDP are.

We spent enormous amounts of effort to make udplite share as much code
with core UDP as possible.  All of that work is less valuable if we're
just going to slap a config option on udplite support.

It is also causing build failures, as reported on linux-next, showing
that the changeset was not tested very well.  In fact, this is the
second build failure resulting from the udplite change.

Finally, the config options provided was a bool, instead of a modular
option.  Meaning the udplite code does not even get build tested
by allmodconfig builds, and furthermore the user is not presented
with a reasonable modular build option which is particularly needed
by distribution vendors.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 16:22:02 -08:00
Allan Stephens
ba0fa45994 [TIPC]: Update version to 1.6.3
This patch updates TIPC's version number to 1.6.3.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:08:40 -08:00
Allan Stephens
c0cb7ef086 [TIPC]: Enhancements to message header writing
This patch makes two enhancements to the routine used to
set bit fields within a TIPC message header:

 1) It now ignores any bits of the new field value that are not
    covered by the mask being used.  (Previously, if the new value
    exceeded the size of the mask the extra bits could corrupt
    other fields in the message header word being updated.)

 2) The code has been optimized to minimize the number of run-time
    endianness conversion operations by leveraging the fact that the
    mask (and, in some cases, the value as well) is constant and the
    necessary conversion can be performed by the compiler.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:08:10 -08:00
Allan Stephens
37695420a2 [TIPC]: Use correct bitmask when setting version
This patch ensures that the 3-bit version field of the TIPC
message header is masked correctly when written into a message.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:07:42 -08:00
Allan Stephens
06d82c9191 [TIPC]: Minor cleanup of message header code
This patch eliminates some unused or duplicate message header
symbols, and fixes up the comments and/or location of a few
other symbols.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:06:55 -08:00
Allan Stephens
0e0609bbd2 [TIPC]: Eliminate "sparse" symbol warnings
This patch eliminates warnings about undeclared symbols.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:06:06 -08:00
Allan Stephens
e247a8f5d0 [TIPC]: Add argument validation for shutdown()
This patch validates that the "how" argument to shutdown()
is SHUT_RDWR, since this is the only form that TIPC supports.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:05:38 -08:00
Allan Stephens
8c8696553a [TIPC]: Removal of message header option code
This patch removes code associated with optional, user-specified
fields of the TIPC message header.  Such fields were never
utilized by TIPC, and have now been removed from the protocol
specification.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-06 15:05:07 -08:00
Johannes Berg
69d3b6f491 mac80211: fix hardware scan completion
The mac80211 MLME requires restarting timers after a scan
completes but this wasn't done when hardware scan offload
was added, so add it now.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Tested-by: Bill Moss <bmoss@clemson.edu>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:54 -05:00
Luis Carlos Cobo
2a8ca29a88 mac80211: fix mesh_path and sta_info get_by_idx functions
Skip properly entries whose dev does not match.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:54 -05:00
Luis Carlos Cobo
a00de5d08b mac80211: path IE fields macros, fix alignment problems and clean up
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:53 -05:00
Luis Carlos Cobo
b4e08ea141 mac80211: add PLINK_ prefix and kernel doc to enum plink_state
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:52 -05:00
Luis Carlos Cobo
cfa22c716f mac80211: always force mesh_path deletions
Postponing the deletion is not really useful anymore.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:51 -05:00
Luis Carlos Cobo
89a1ad6990 mac80211: delete mesh_path timer on mesh_path removal
This avoids dereferencing a no longer existing struct mesh_path.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:51 -05:00
Luis Carlos Cobo
aa2b592843 mac80211: clean up use of endianness conversion functions
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:50 -05:00
Luis Carlos Cobo
4f5d4c4da8 mac80211: breakdown mesh network attributes in different extra fields for wext
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:50 -05:00
Luis Carlos Cobo
3b091cd494 mac80211: move comment to better location
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:49 -05:00
Luis Carlos Cobo
1d1b535969 mac80211: fix incorrect parenthesis
Pointed out by Johannes Berg.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:49 -05:00
Luis Carlos Cobo
37659ff8e1 mac80211: fix mesh endianness sparse warnings and unmark it as broken
This patch fixes all the mesh related endianness warnings reported by sparse. As
they were the reason why Johannes marked mesh as BROKEN, that flag has been
removed.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 16:40:48 -05:00
Johannes Berg
96c46546e2 mac80211: always insert key into list
Today I hit one of my new WARN_ONs in the mac80211 code because
a key wasn't being freed correctly. After wondering for a while
I finally tracked it to the fact that STA keys aren't added to
the per-sdata key list correctly, they are supposed to always be
on that list, not just for default keys. This patch fixes that.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
03e4497ebe mac80211: fix sta_info mesh timer bug
I noticed a bug I introduced when mesh is enabled: sta_info_destroy()
will end up calling cancel_timer() on a timer that has never been
initialized because the timer is only initialized in mesh_plink_alloc(),
not in sta_info_alloc(). This patch moves the initialization of all mesh
related fields into sta_info_alloc(), adds a bit of sanity checking to
the cfg80211 handlers and sta_info_insert() and makes mesh_plink_alloc()
a static helper function that is only used from the mesh plink code.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
dbbea6713d mac80211: add documentation book
Quite a while ago I started this book. The required kernel-doc
patches have since gone into the tree so it is now possible to
build the book in mainline.

The actual documentation is still rather incomplete and not all
things are linked into the book, but this enables us to edit
the documentation collaboratively, hopefully driver authors can
add documentation based on their experience with mac80211.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
7c8076bd8b mac80211: don't clear next_hop in path reclaim
Luis pointed out that this path is going to be freed right
away anyway so there's no point in assigning next_hop.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
44213b5e13 mac80211: remove STA entries when taking down interface
When we take down an interface, we need to remove the STA info
items that belong to it because otherwise we might invoke a
sta_notify() callback in the driver when we later delete the
STA entries, but in that case the driver will already have
removed its knowledge of the interface they belonged to leading
to confusion. Also, we could invoke the set_tim() callback after
the driver removed its knowledge of the interface, which can
lead to a crash if it requests a beacon with a then-invalid vif
pointer!

A side effect of this patch is that, because it was easier, it
disallows changing the WDS peer while an interface is up. Should
that actually be necessary, it can be added back, but the WDS
peer STA entry may not be added while the interface is UP so for
now I've simplified the WDS peer's STA entry lifetime management.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
693b1bbcc4 mac80211: clean up sta_info and document locking
This patch cleans up the sta_info struct and documents how
each set of variables is locked. Notably, flags locking is
completely missing. It also adds kernel-doc for some (but
not all yet) members of the struct.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
73651ee639 mac80211: split sta_info_add
sta_info_add() has two functions: allocating a station info
structure and inserting it into the hash table/list. Splitting
these two functions allows allocating with GFP_KERNEL in many
places instead of GFP_ATOMIC which is now required by the RCU
protection. Additionally, in many places RCU protection is now
no longer needed at all because between sta_info_alloc() and
sta_info_insert() the caller owns the structure.

This fixes a few race conditions with setting initial flags
and similar, but not all (see comments in ieee80211_sta.c and
cfg.c). More documentation on the existing races will be in
a follow-up patch.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:47 -05:00
Johannes Berg
d0709a6518 mac80211: RCU-ify STA info structure access
This makes access to the STA hash table/list use RCU to protect
against freeing of items. However, it's not a true RCU, the
copy step is missing: whenever somebody changes a STA item it
is simply updated. This is an existing race condition that is
now somewhat understandable.

This patch also fixes the race key freeing vs. STA destruction
by making sure that sta_info_destroy() is always called under
RTNL and frees the key.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg
5cf121c3cd mac80211: split ieee80211_txrx_data
Split it into ieee80211_tx_data and ieee80211_rx_data to clarify
usage/flag usage and remove the stupid union thing.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg
7495883bdd mac80211: reorder a few fields in sta_info
Three __le16s followed by an enum (int) leave a two-byte hole
of padding which we can use for two of the other fields.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg
42096b634f mac80211: fix kernel-doc comment for mesh_plink_deactivate
Accidentally copied in a __mesh_plink_deactivate, noticed by Luis
Carlos Cobo.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg
d6d1a5a709 mac80211: clean up mesh RX path a bit more
Moves another ifdef into the sta_info header file in favour of
compiling more code even w/o CONFIG_MAC80211_MESH.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:46 -05:00
Johannes Berg
c1edd987a4 mac80211: export mesh_plink_broken
This needs to be exported because rate control algorithms
can be modular.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:45 -05:00
Johannes Berg
5c142e8db4 mac80211: clarify mesh Kconfig
This clarifies that the mesh networking code is currently
based on Draft 1.08 of the 802.11 Mesh Networking amendment.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:44 -05:00
Johannes Berg
ff59dc76e6 mac80211: add missing "break" statement in mesh code
This inserts a missing break statement which, if hit, would cause
the code to fall-through and unlock a spinlock twice. Noticed via
sparse's "lock count wrong in basic block" warning and careful
code inspection.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:43 -05:00
Johannes Berg
2f5ce793c0 mac80211: enable mesh in Kconfig
Currently marked BROKEN because of endianness problems.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:43 -05:00
Johannes Berg
dc0b0f7d1e mac80211: mesh hwmp locking fixes
This fixes missing unlocks noticed by sparse.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Johannes Berg
902acc7896 mac80211: clean up mesh code
Various cleanups, reducing the #ifdef mess and other things.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo
f7a9214437 mac80211: complete the mesh (interface handling) code
This completes the mesh interface handling code and a few other
bits about the mac80211 module.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo
c5dd9c2bd0 mac80211: mesh path and mesh peer configuration
This adds code to allow adding mesh interfaces and configuring
mesh peers etc. Also, it adds code for station dumping.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo
9f42f60705 mac80211: mesh statistics and config through debugfs
This patch contains the debugfs code for mesh statistics and configuration
parameters. Please note that generic support for r/w debugfs attributes has been
added.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo
050ac52cbe mac80211: code for on-demand Hybrid Wireless Mesh Protocol
This file implements the on-demand Hybrid Wireless Mesh Protocol, at this moment
using hop-count as the metric. When no mesh path exists for a given destination
or the mesh path is not active, frames addressed to that destination will be
queued and a Path Request frame will be sent. Queued frames will be sent when
the path is resolved (usually after reception of a Path Response) or discarded
if discovery times out. Path Requests will also be sent to refresh paths that
are being used and are close to expiring.

Path Errors are sent when a path discovery process triggered by the attempt to
forward a frame originated in a different mesh point times out. Path Errors are
also sent when a peer link is determined to be unreachable because of high error
rates.

Multiple destination support in Path Requests and Path Errors and precursors
have not been implemented yet.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo
eb2b9311fd mac80211: mesh path table implementation
The mesh path table associates destinations with the next hop to reach them. The
table is a hash of linked lists protected by rcu mechanisms. Every mesh path
contains a lock to protect the mesh path state.

Each outgoing mesh frame requires a look up into this table. Therefore, the
table it has been designed so it is not necessary to hold any lock to find the
appropriate next hop.

If the path is determined to be active within a rcu context we can safely
dereference mpath->next_hop->addr, since it holds a reference to the sta
next_hop. After a mesh path has been set active for the first time it next_hop
must always point to a valid sta.  If this is not possible the mpath must be
deleted or replaced in a RCU safe fashion.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:42 -05:00
Luis Carlos Cobo
c3896d2ca4 mac80211: mesh peer link implementation
This file implements mesh discovery and peer link establishment support using
the mesh peer link table provided in mesh_plinktbl.c.

Secure peer links have not been implemented yet.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00
Luis Carlos Cobo
f709fc696d mac80211: mesh changes to the MLME
This includes support for mesh network scanning. The ugly code in
ieee80211_sta_scan_result() is my approach to work around wext. This has been
tested with wireless tools version 29 and works as expected (the new interface
mode is just not shown).

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-03-06 15:30:41 -05:00