Commit Graph

8988 Commits

Author SHA1 Message Date
Vlad Yasevich
8b750ce54b sctp: Flush the queue only once during fast retransmit.
When fast retransmit is triggered by a sack, we should flush the queue
only once so that only 1 retransmit happens.  Also, since we could
potentially have non-fast-rtx chunks on the retransmit queue, we need
to make sure any chunks eligible for fast retransmit are sent first
during fast retransmission.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:39:36 -07:00
Vlad Yasevich
62aeaff5cc sctp: Start T3-RTX timer when fast retransmitting lowest TSN
When we are trying to fast retransmit the lowest outstanding TSN, we
need to restart the T3-RTX timer, so that subsequent timeouts will
correctly tag all the packets necessary for retransmissions.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:39:11 -07:00
Vlad Yasevich
a646523481 sctp: Correctly implement Fast Recovery cwnd manipulations.
Correctly keep track of Fast Recovery state and do not reduce
congestion window multiple times during such state.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:38:43 -07:00
Gui Jianfeng
159c6bea37 sctp: Move sctp_v4_dst_saddr out of loop
There's no need to execute sctp_v4_dst_saddr() for each
iteration, just move it out of loop.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:38:07 -07:00
Gui Jianfeng
4141ddc02a sctp: retran_path update bug fix
If the current retran_path is the only active one, it should
update it to the next inactive one.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:37:33 -07:00
David S. Miller
aed5a833fb Merge branch 'net-2.6-misc-20080605a' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-fix
2008-06-04 12:10:21 -07:00
Ilpo Järvinen
a6604471db tcp: fix skb vs fack_count out-of-sync condition
This bug is able to corrupt fackets_out in very rare cases.
In order for this to cause corruption:
  1) DSACK in the middle of previous SACK block must be generated.
  2) In order to take that particular branch, part or all of the
     DSACKed segment must already be SACKed so that we have that
     in cache in the first place.
  3) The new info must be top enough so that fackets_out will be
     updated on this iteration.
...then fack_count is updated while skb wasn't, then we walk again
that particular segment thus updating fack_count twice for
a single skb and finally that value is assigned to fackets_out
by tcp_sacktag_one.

It is safe to call tcp_sacktag_one just once for a segment (at
DSACK), no need to call again for plain SACK.

Potential problems from the miscount are limited to premature entry
to recovery and to an inflated reordering metric (which could even
cancel each other out in the luckiest scenarios :-)).
Both are quite insignificant even in the worst case, and code already
exists to reset them (fackets_out once sacked_out becomes zero
and reordering metric on RTO).

This has been reported by a number of people, but because it occurred
quite rarely, it has been very elusive. Andy Furniss was able to
get it to occur a couple of times so that a bit more info was
collected about the problem using a debug patch, though it still
required a lot of checking around. Thanks also to others who have
tried to help here.

This is listed as Bugzilla #10346. The bug was introduced by
me in commit 68f8353b48 ([TCP]: Rewrite SACK block processing & 
sack_recv_cache use). I probably thought back then that there was a
need to scan that entry twice or didn't dare to make it go
through it just once there. Going through twice would have
required restoring fack_count after the walk but as noted above,
I chose to drop the additional walk step altogether here.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:07:44 -07:00
Adrian-Ken Rueegsegger
a13366c632 xfrm: xfrm_algo: correct usage of RIPEMD-160
This patch fixes the usage of RIPEMD-160 in xfrm_algo which in turn
allows hmac(rmd160) to be used as authentication mechanism in IPsec
ESP and AH (see RFC 2857).

Signed-off-by: Adrian-Ken Rueegsegger <rueegsegger@swiss-it.ch>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 12:04:55 -07:00
Denis V. Lunev
9596cc826e [IPV6]: Do not change protocol for UDPv6 sockets with pending sent data.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:38 +09:00
Denis V. Lunev
36d926b94a [IPV6]: inet_sk(sk)->cork.opt leak
IPv6 UDP sockets with an IPv4-mapped address actually use udp_sendmsg to
send the data. In this case ip_flush_pending_frames should be called instead
of ip6_flush_pending_frames.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:38 +09:00
Denis V. Lunev
49d074f400 [IPV6]: Do not change protocol for raw IPv6 sockets.
It is not allowed to change underlying protocol for
   int fd = socket(PF_INET6, SOCK_RAW, IPPROTO_UDP);

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:37 +09:00
YOSHIFUJI Hideaki
91e1908f56 [IPV6] NETNS: Handle ancillary data in appropriate namespace.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:36 +09:00
YOSHIFUJI Hideaki
187e38384c [IPV6]: Check outgoing interface even if source address is unspecified.
The outgoing interface index (ipi6_ifindex) in IPV6_PKTINFO
ancillary data is not checked if the source address (ipi6_addr)
is unspecified.  If ipi6_ifindex names a nonexistent interface,
the call should fail.

Based on patch from Shan Wei <shanwei@cn.fujitsu.com> and
Brian Haley <brian.haley@hp.com>.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:35 +09:00
Yang Hongyang
95b496b666 [IPV6]: Fix the data length of get destination options with short length
If the destination options are requested with a buffer length that is
too short for the option, getsockopt() still returns the real length of
the option, which is larger than the buffer space.
This is because ipv6_getsockopt_sticky() returns the real length of
the option.

This patch fixes this problem.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:35 +09:00
Yang Hongyang
05335c2220 [IPV6]: Fix the return value of get destination options with NULL data pointer
If we pass a NULL data buffer to getsockopt(), it will return 0,
and the option length is set to -EFAULT:
    getsockopt(sk, IPPROTO_IPV6, IPV6_DSTOPTS, NULL, &len);

This is because ipv6_getsockopt_sticky() will return -EFAULT or
-EINVAL if an error occurs.
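
The error path described above can be sketched in user space as follows.
This is only an illustration under stated assumptions: sticky_copy() and
get_option() are hypothetical stand-ins, not the kernel's
ipv6_getsockopt_sticky() or its caller. The point is that a negative
result is propagated as the return value and never stored in the length.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for the helper: returns bytes copied or -errno. */
static int sticky_copy(void *buf, int buf_len, const void *opt, int opt_len)
{
    int n;

    if (buf == NULL)
        return -EFAULT;
    if (buf_len < 0 || opt_len < 0)
        return -EINVAL;
    /* Never copy (or report) more than the caller's buffer can hold. */
    n = opt_len < buf_len ? opt_len : buf_len;
    memcpy(buf, opt, (size_t)n);
    return n;
}

/* Caller: an error becomes the return value; *len only ever receives
 * a real byte count. */
static int get_option(void *buf, int *len, const void *opt, int opt_len)
{
    int ret;

    assert(len != NULL);
    ret = sticky_copy(buf, *len, opt, opt_len);
    if (ret < 0)
        return ret;
    *len = ret;
    return 0;
}
```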

This patch fixes this problem.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:34 +09:00
YOSHIFUJI Hideaki
4bed72e4f5 [IPV6] ADDRCONF: Allow longer lifetime on 64bit archs.
- Allow longer lifetimes (>= 0x7fffffff/HZ) on 64bit archs
  by using unsigned long.
- Shadow this arithmetic overflow workaround by introducing
  helper functions: addrconf_timeout_fixup() and
  addrconf_finite_timeout().
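
A rough user-space sketch of what such helpers might look like. The
function names below are shortened on purpose and the bodies are an
assumption based only on the description above (clamp so that scaling by
a tick rate cannot overflow a signed long, and treat 0xffffffff as
"infinite"); they are not copied from the kernel.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Clamp a 32-bit lifetime (in seconds) so that lifetime * unit cannot
 * overflow a signed long. 0xffffffff is treated as "infinite". */
static unsigned long timeout_fixup(uint32_t timeout, unsigned long unit)
{
    assert(unit != 0);
    if (timeout == 0xffffffffu)
        return ULONG_MAX;                       /* infinite lifetime */
    if ((unsigned long)timeout > (unsigned long)LONG_MAX / unit)
        return (unsigned long)LONG_MAX / unit;  /* largest safe value */
    return timeout;
}

/* Non-zero when the fixed-up lifetime is finite. */
static int finite_timeout(unsigned long timeout)
{
    return timeout != ULONG_MAX;
}
```

On a 64-bit target the clamp never triggers for realistic tick rates,
which is what allows the longer lifetimes; on 32-bit it still guards the
multiplication.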

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:34 +09:00
YOSHIFUJI Hideaki
baa2bfb8ae [IPV4] TUNNEL4: Fix incoming packet length check for inter-protocol tunnel.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:33 +09:00
Colin
8283637231 [IPV6] TUNNEL6: Fix incoming packet length check for inter-protocol tunnel.
I discovered a strange behavior in the [ipv4 in ipv6] tunnel. When the IPv6
tunnel payload is less than 40 (0x28), the packet can be sent to the network
and is received on the physical interface, but is not seen on the IP tunnel
interface. No counter increases on the tunnel interface.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:32 +09:00
Thomas Graf
24ef0da7b8 [IPV6] ADDRCONF: Check range of prefix length
As of now, the prefix length is not validated when adding or deleting
addresses. The value is passed directly into the inet6_ifaddr structure
and later passed on to memcmp() as length indicator which relies on
the value never to exceed 128 (bits).

Due to the missing check, the current code allows for any 8 bit
value to be passed on as prefix length while using the netlink
interface, and any 32 bit value while using the ioctl interface.
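
The missing range check can be sketched like this. The function names
and the SKETCH_IPV6_BITS constant are illustrative assumptions; only the
128-bit limit and the memcmp() dependency come from the text above.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define SKETCH_IPV6_BITS 128u

/* Reject prefix lengths that cannot describe an IPv6 prefix. */
static int check_prefix_len(unsigned int plen)
{
    return plen > SKETCH_IPV6_BITS ? -EINVAL : 0;
}

/* Compare the whole bytes of two 16-byte addresses covered by plen.
 * This is only safe once plen has been validated: an unchecked value
 * would make memcmp() read past the 16-byte address. */
static int prefix_bytes_equal(const unsigned char *a, const unsigned char *b,
                              unsigned int plen)
{
    assert(plen <= SKETCH_IPV6_BITS);
    return memcmp(a, b, plen / 8) == 0;
}
```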

[Use unsigned int instead to generate better code - yoshfuji]

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:31 +09:00
YOSHIFUJI Hideaki
a3c960899e [IPV6] UDP: Possible dst leak in udpv6_sendmsg.
ip6_sk_dst_lookup returns held dst entry. It should be released
on all paths beyond this point. Add missed release when up->pending
is set.

Bug report and initial patch by Denis V. Lunev <den@openvz.org>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Denis V. Lunev <den@openvz.org>
2008-06-05 04:02:31 +09:00
YOSHIFUJI Hideaki
e51171019b [SCTP]: Fix NULL dereference of asoc.
Commit 7cbca67c07 ("[IPV6]: Support
Source Address Selection API (RFC5014)") introduced NULL dereference
of asoc to sctp_v6_get_saddr in net/sctp/ipv6.c.
Pointed out by Johann Felix Soden <johfel@users.sourceforge.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-06-05 04:02:30 +09:00
Ilpo Järvinen
8aca6cb117 tcp: Fix inconsistency source (CA_Open only when !tcp_left_out(tp))
It is possible that this skip path causes TCP to end up into an
invalid state where ca_state was left to CA_Open while some
segments already came into sacked_out. If next valid ACK doesn't
contain new SACK information TCP fails to enter into
tcp_fastretrans_alert(). Thus at least high_seq is set
incorrectly to a too high seqno because some new data segments
could be sent in between (and also, limited transmit is not
being correctly invoked there). Reordering in both directions
can easily cause this situation to occur.

I guess we would want to use tcp_moderate_cwnd(tp) there as well
as it may be possible to use this to trigger oversized burst to
network by sending an old ACK with huge amount of SACK info, but
I'm a bit unsure about its effects (mainly to FlightSize), so to
be on the safe side I just currently fixed it minimally to keep
TCP's state consistent (obviously, such nasty ACKs have been
possible this far). Though it seems that FlightSize is already
underestimated by some amount, so probably on the long term we
might want to trigger recovery there too, if appropriate, to make
FlightSize calculation to resemble reality at the time when the
losses were discovered (but such change scares me too much now
and requires some more thinking anyway how to do that as it
likely involves some code shuffling).

This bug was found by Brian Vowell while running my TCP debug
patch to find cause of another TCP issue (fackets_out
miscount).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 11:34:22 -07:00
Jarek Poplawski
b9c6989646 netfilter: nf_conntrack_ipv6: fix inconsistent lock state in nf_ct_frag6_gather()
[   63.531438] =================================
[   63.531520] [ INFO: inconsistent lock state ]
[   63.531520] 2.6.26-rc4 #7
[   63.531520] ---------------------------------
[   63.531520] inconsistent {softirq-on-W} -> {in-softirq-W} usage.
[   63.531520] tcpsic6/3864 [HC0[0]:SC1[1]:HE1:SE0] takes:
[   63.531520]  (&q->lock#2){-+..}, at: [<c07175b0>] ipv6_frag_rcv+0xd0/0xbd0
[   63.531520] {softirq-on-W} state was registered at:
[   63.531520]   [<c0143bba>] __lock_acquire+0x3aa/0x1080
[   63.531520]   [<c0144906>] lock_acquire+0x76/0xa0
[   63.531520]   [<c07a8f0b>] _spin_lock+0x2b/0x40
[   63.531520]   [<c0727636>] nf_ct_frag6_gather+0x3f6/0x910
 ...

According to this and another similar lockdep report inet_fragment
locks are taken from nf_ct_frag6_gather() with softirqs enabled, but
these locks are mainly used in softirq context, so disabling BHs is
necessary.

Reported-and-tested-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 09:58:27 -07:00
Dong Wei
d2ee3f2c4b netfilter: xt_connlimit: fix accounting when receive RST packet in ESTABLISHED state
In the xt_connlimit match module, the counter of an IP is decreased when
a TCP packet goes through the chain with ip_conntrack state TW.
This is fine when the server and client close the socket normally
with a FIN packet. But when the client/server closes the socket with an RST
packet (using so_linger), the counter for this connection still exists.
The following patch, which is based on linux-2.6.25.4, fixes it.

Signed-off-by: Dong Wei <dwei.zh@gmail.com>
Acked-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04 09:57:51 -07:00
Al Viro
d430a227d2 bogus format in ip6mr
ptrdiff_t is %t..., not %Z...
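
For illustration, the same rule in standard C's printf family (the
kernel's printk accepts the same 't' length modifier). The helper name
format_ptrdiff() is made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format the distance between two pointers into the same array.
 * "%td" is the conversion for ptrdiff_t; a size_t would use "%zu". */
static int format_ptrdiff(char *buf, size_t len,
                          const char *end, const char *start)
{
    ptrdiff_t diff = end - start;

    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%td", diff);
}
```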

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-06-04 08:06:02 -07:00
Thomas Graf
ab32cd793d route: Remove unused ifa_anycast field
The field was supposed to allow the creation of an anycast route by
assigning an anycast address to an address prefix. It was never
implemented so this field is unused and serves no purpose. Remove it.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 16:37:33 -07:00
Thomas Graf
bc3ed28caa netlink: Improve returned error codes
Make nlmsg_trim(), nlmsg_cancel(), genlmsg_cancel(), and
nla_nest_cancel() void functions.

Return -EMSGSIZE instead of -1 if the provided message buffer is not
big enough.
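
A small sketch of why a specific errno helps the caller. struct msg_buf
and msg_put() are hypothetical and only stand in for a fixed-size
message buffer; they are not netlink APIs. Note that a bare -1 is
numerically -EPERM on Linux, which would mislead anyone decoding it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a fixed-size message buffer. */
struct msg_buf {
    unsigned char data[16];
    size_t used;
};

/* Append a payload, or report that the buffer is too small. */
static int msg_put(struct msg_buf *msg, const void *payload, size_t len)
{
    assert(msg != NULL && msg->used <= sizeof(msg->data));
    if (len > sizeof(msg->data) - msg->used)
        return -EMSGSIZE;   /* says why it failed, unlike a bare -1 */
    memcpy(msg->data + msg->used, payload, len);
    msg->used += len;
    return 0;
}
```

A caller that sees -EMSGSIZE can retry with a larger buffer instead of
treating the failure as fatal.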

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 16:36:54 -07:00
Thomas Graf
1f9d11c7c9 route: Mark unused routing attributes as such
Also removes an unused policy entry for an attribute which is
only used in kernel->user direction.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 16:36:27 -07:00
Thomas Graf
51b77cae0d route: Mark unused route cache flags as such.
Also removes an obsolete check for the unused flag RTCF_MASQ.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 16:36:01 -07:00
Brice Goglin
7557af2515 net_dma: remove duplicate assignment in dma_skb_copy_datagram_iovec
No need to compute copy twice in the frags loop in
dma_skb_copy_datagram_iovec().

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 16:07:45 -07:00
Stephen Hemminger
b9f5f52cca net: neighbour table ABI problem
The neighbor table time of last use information is returned in the
incorrect unit. Kernel to user space ABIs need to use USER_HZ (or
milliseconds), otherwise the application has to try and discover the
real system HZ value which is problematic.  Linux has standardized on
keeping USER_HZ consistent (100 Hz) even when the kernel is running
internally at some other value.
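
The unit conversion amounts to rescaling internal ticks to USER_HZ
ticks. A minimal sketch follows; the function name and the
SKETCH_USER_HZ macro are assumptions for illustration (the kernel has
its own conversion helpers), and overflow for very large tick counts is
not handled.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_USER_HZ 100u   /* per the text above: USER_HZ stays at 100 */

/* Convert ticks counted at kernel_hz into USER_HZ ticks. */
static uint64_t ticks_to_user_hz(uint64_t ticks, unsigned int kernel_hz)
{
    assert(kernel_hz != 0);
    return ticks * SKETCH_USER_HZ / kernel_hz;
}
```

With this, one second of idle time is always reported as 100, whether
the kernel runs at 100, 250, or 1000 Hz internally.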

This change is small, but it breaks the ABI for older versions of
iproute2 utilities.  But these utilities are already broken since they
are looking at the psched_hz values which are completely different. So
let's just go ahead and fix both kernel and user space. Older
utilities will just print wrong values.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 16:03:15 -07:00
Pavel Emelyanov
9ecad87794 irda: Sock leak on error path in irda_create.
Specifying a bad type/protocol results in an sk leak.

Fix is simple - release the sk if bad values are given,
but to make it possible just to call sk_free(), I move
some sk initialization a bit lower.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 15:18:36 -07:00
Jarek Poplawski
7dccf1f4e1 ax25: Fix NULL pointer dereference and lockup.
From: Jarek Poplawski <jarkao2@gmail.com>

There is only one function in AX25 calling skb_append(), and it really
looks suspicious: appends skb after previously enqueued one, but in
the meantime this previous skb could be removed from the queue.

This patch fixes it the simple way, so this is not fully compatible with
the current method, but testing hasn't shown any problems.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 14:53:46 -07:00
Dave Young
537d59af73 bluetooth: rfcomm_dev_state_change deadlock fix
There's logic in __rfcomm_dlc_close:
	rfcomm_dlc_lock(d);
	d->state = BT_CLOSED;
	d->state_changed(d, err);
	rfcomm_dlc_unlock(d);

In rfcomm_dev_state_change, it's possible that rfcomm_dev_put tries to
take the dlc lock, and then we deadlock.

Fix it by unlocking the dlc before rfcomm_dev_get in
rfcomm_dev_state_change.

Why not unlock just before rfcomm_dev_put? Because there's
another problem.  rfcomm_dev_get/rfcomm_dev_del will take
rfcomm_dev_lock, but in rfcomm_dev_add the lock order is:
rfcomm_dev_lock --> dlc lock

So I unlock the dlc before taking rfcomm_dev_lock.

Actually it's a regression caused by commit
1905f6c736 ("bluetooth :
__rfcomm_dlc_close lock fix"), the dlc state_change could be two
callbacks : rfcomm_sk_state_change and rfcomm_dev_state_change. I
missed the rfcomm_sk_state_change that time.

Thanks Arjan van de Ven <arjan@linux.intel.com> for the effort in
commit 4c8411f8c1 ("bluetooth: fix
locking bug in the rfcomm socket cleanup handling") but he missed the
rfcomm_dev_state_change lock issue.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03 14:27:17 -07:00
Linus Torvalds
1beee8dc8c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits)
  llc: Fix double accounting of received packets
  netfilter: nf_conntrack_expect: fix error path unwind in nf_conntrack_expect_init()
  bluetooth: fix locking bug in the rfcomm socket cleanup handling
  mac80211: fix alignment issue with compare_ether_addr()
  mac80211: Fix for NULL pointer dereference in sta_info_get()
  mac80211: fix a typo in ieee80211_handle_filtered_frame comment
  rndis_wlan: add missing range check for power_output modparam
  iwlwifi: fix rate scale TLC column selection bug
  iwlwifi: fix exit from stay_in_table state
  rndis_wlan: Make connections to TKIP PSK networks work
  mac80211 : Fixes the status message for iwconfig
  rt2x00: Use atomic interface iteration in irq context
  rt2x00: Reset antenna RSSI after switch
  rt2x00: Don't count retries as failure
  rt2x00: Fix memleak in tx() path
  mac80211: reorder channel and freq reporting in wext scan report
  b43: Fix controller restart crash
  mac80211: fix ieee80211_rx_bss_put/get imbalance
  net/mac80211: always true conditionals
  b43: Upload both beacon templates on initial load
  ...
2008-05-30 07:45:20 -07:00
Arnaldo Carvalho de Melo
3446b9d57e llc: Fix double accounting of received packets
llc_sap_rcv was being preceded by skb_set_owner_r, then calling
llc_state_process that calls sock_queue_rcv_skb, that in turn calls
skb_set_owner_r again, leaking the space the socket is allowed to use and
causing the socket to get stuck.

Fix it by setting skb->sk at llc_sap_rcv and leave the accounting to be done
only at sock_queue_rcv_skb.

Reported-by: Dmitry Petukhov <dmgenp@gmail.com>
Tested-by: Dmitry Petukhov <dmgenp@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-30 02:57:29 -07:00
Alexey Dobriyan
12293bf911 netfilter: nf_conntrack_expect: fix error path unwind in nf_conntrack_expect_init()
Signed-off-by: Alexey Dobriyan <adobriyan@parallels.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-29 03:19:37 -07:00
David S. Miller
8c3a01d0c2 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6
2008-05-29 01:49:04 -07:00
Arjan van de Ven
4c8411f8c1 bluetooth: fix locking bug in the rfcomm socket cleanup handling
in net/bluetooth/rfcomm/sock.c, rfcomm_sk_state_change() does the
following operation:

        if (parent && sock_flag(sk, SOCK_ZAPPED)) {
                /* We have to drop DLC lock here, otherwise
                 * rfcomm_sock_destruct() will dead lock. */
                rfcomm_dlc_unlock(d);
                rfcomm_sock_kill(sk);
                rfcomm_dlc_lock(d);
        }
}

which is fine, since rfcomm_sock_kill() will call sk_free() which will call
rfcomm_sock_destruct() which takes the rfcomm_dlc_lock()... so far so good.

HOWEVER, this assumes that the rfcomm_sk_state_change() function always gets
called with the rfcomm_dlc_lock() taken. This is the case for all but one
case, and in that case where we don't have the lock, we do a double unlock
followed by an attempt to take the lock, which due to underflow isn't
going anywhere fast.

This patch fixes this by moving the straggling case inside the lock, like
the other usages of the same call are doing in this code.

This was found with the help of the www.kerneloops.org project, where this
deadlock was observed 51 times at this point in time:
http://www.kerneloops.org/search.php?search=rfcomm_sock_destruct

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-29 01:32:47 -07:00
Senthil Balasubramanian
c97c23e386 mac80211: fix alignment issue with compare_ether_addr()
This addresses an alignment issue with compare_ether_addr().
The addresses passed to compare_ether_addr should be two-byte aligned.
It may function properly on x86 platforms, but it may not work properly
on IA-64 or ARM processors.
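
To illustrate the alignment concern: comparing a 6-byte address as three
16-bit words is fast, but loading those words straight from the address
requires 2-byte alignment. The user-space sketch below is an assumption,
not the kernel's compare_ether_addr(); it copies into aligned temporaries
first, so it works for any input alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6

/* Returns 0 when the two addresses are equal, non-zero otherwise. */
static unsigned int ether_addr_differs(const uint8_t *a, const uint8_t *b)
{
    uint16_t wa[3], wb[3];

    assert(a != NULL && b != NULL);
    /* memcpy avoids unaligned 16-bit loads on strict-alignment CPUs. */
    memcpy(wa, a, SKETCH_ETH_ALEN);
    memcpy(wb, b, SKETCH_ETH_ALEN);
    return (unsigned int)((wa[0] ^ wb[0]) | (wa[1] ^ wb[1]) | (wa[2] ^ wb[2]));
}
```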

This also fixes a typo in mlme.c where the sk_buff struct name is incorrect.
Though sizeof() works for any incorrect structure pointer name as it's just
a pointer length that we want, let's just fix it.

Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:50 -04:00
Senthil Balasubramanian
70d251b24c mac80211: Fix for NULL pointer dereference in sta_info_get()
This addresses a NULL pointer dereference in sta_info_get().
TID and sta_info are extracted in ADDBA Timer expiry function
through the timer handler's argument.

The problem is extracting the TID (which was stored in
timer_to_tid[] array of type "u8") through "int *" typecast which
may also yield unwanted bytes for the MSB of TID that results
in incorrect sta_info and ieee80211_local pointers.
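
A user-space model of the width mismatch described above. Both function
names are hypothetical; memcpy() is used to imitate what a read through
an "int *" cast would fetch without actually performing the invalid
cast.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Correct read: one byte, matching how the value was stored. */
static unsigned int read_tid_as_u8(const uint8_t *slot)
{
    assert(slot != NULL);
    return *slot;
}

/* Model of the buggy read: fetches sizeof(int) bytes starting at the
 * slot, so neighbouring array elements leak into the result.
 * The caller must ensure sizeof(int) bytes are readable at slot. */
static unsigned int read_tid_as_int(const uint8_t *slot)
{
    unsigned int wide;

    assert(slot != NULL);
    memcpy(&wide, slot, sizeof(wide));
    return wide;
}
```

Whatever the byte order, the wide read no longer equals the stored TID
once the adjacent bytes are non-zero, which is how the bogus pointers
arise.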

ieee80211_local pointer is NULL as illustrated below, it crashes in
sta_info_get(). The problem started when extracting ieee80211_local
pointer out of sta_info itself and eventually crashed in
sta_info_get().

The proper way to fix is to change the data type of TID to u8
instead of u16. However changing all the occurrences requires
some prototype changes as well. We should fix this in upcoming
patches.

Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com>
Signed-off-by: Luis Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:49 -04:00
Yi Zhu
f6d9710489 mac80211: fix a typo in ieee80211_handle_filtered_frame comment
fix a typo in ieee80211_handle_filtered_frame comment

Signed-off-by: Yi Zhu <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:49 -04:00
Abhijeet Kolekar
d4231ca3e1 mac80211 : Fixes the status message for iwconfig
iwconfig was showing incorrect status messages when disassociated.
Patch fixes this by always checking for association status in
ioctl calls for getting ap address.

Signed-off-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:46 -04:00
Tomas Winkler
9381be059b mac80211: reorder channel and freq reporting in wext scan report
This patch switches the order of channel and freq (SIOCGIWFREQ) reports
in scan results in order to overcome wpa_supplicant's inability
to handle channel numbers in the 5.2 GHz band.
Wext reporting of the channel number is ambiguous as channels 7-12 (802.11j)
exist on both bands.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:43 -04:00
Tomas Winkler
167ad6f7a2 mac80211: fix ieee80211_rx_bss_put/get imbalance
This patch fixes ieee80211_rx_bss_put/get imbalance
introduced by 'mac80211: enable IBSS merging' patch.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:42 -04:00
Nicolas Kaiser
679fda1aa4 net/mac80211: always true conditionals
Correct always true conditionals.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-28 16:43:41 -04:00
Gerrit Renker
825de27d9e dccp ccid-3: Fix "t_ipi explosion" bug
The identification of this bug is thanks to Cheng Wei and Tomasz
Grobelny.

To avoid divide-by-zero, the implementation previously ignored RTTs
smaller than 4 microseconds when performing integer division RTT/4.

When the RTT reached a value less than 4 microseconds (as observed on
loopback), this prevented the Window Counter CCVal value from
advancing. As a result, the receiver stopped sending feedback. This in
turn caused non-ending expiries of the nofeedback timer at the sender,
so that the sending rate was progressively reduced until reaching the
minimum of one packet per 64 seconds.

The patch fixes this bug by handling integer division more
intelligently. Due to consistent use of dccp_sample_rtt(),
divide-by-zero-RTT is avoided.
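
The integer-division pitfall can be sketched as follows. Both functions
are illustrative assumptions built from the description above, not the
CCID-3 code itself: the first models "ignore RTTs below 4 microseconds",
the second scales before dividing so that small RTTs still advance the
count. The second relies on the caller guaranteeing a non-zero RTT.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour as described: tiny RTTs are ignored, so the number of
 * elapsed quarter-RTTs is stuck at zero. */
static uint32_t quarter_rtts_old(uint32_t delta_us, uint32_t rtt_us)
{
    if (rtt_us < 4)
        return 0;
    return delta_us / (rtt_us / 4);
}

/* Sketch of a safer form: multiply first, then divide by the full RTT.
 * Assumes rtt_us > 0 is guaranteed by the RTT sampling code. */
static uint32_t quarter_rtts_new(uint32_t delta_us, uint32_t rtt_us)
{
    assert(rtt_us > 0);
    return (uint32_t)((4 * (uint64_t)delta_us) / rtt_us);
}
```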

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-27 06:33:54 -07:00
Wei Yongjun
6079a463cf dccp: Fix to handle short sequence numbers packet correctly
RFC4340 said:
  8.5.  Pseudocode
       ...
       If P.type is not Data, Ack, or DataAck and P.X == 0 (the packet
             has short sequence numbers), drop packet and return

But DCCP handles packets with short sequence numbers incorrectly: it
currently drops the packet only if P.type is Data, Ack, or DataAck and P.X == 0.
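
The RFC rule quoted above, written out as a predicate. The enum and
function names are invented for this sketch and do not match the
kernel's DCCP definitions; only the rule itself comes from the RFC text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical packet types; only the three named by the rule matter. */
enum pkt_type { PKT_REQUEST, PKT_RESPONSE, PKT_DATA, PKT_ACK, PKT_DATAACK,
                PKT_OTHER };

/* Only Data, Ack and DataAck may carry short sequence numbers. */
static bool allows_short_seqno(enum pkt_type type)
{
    return type == PKT_DATA || type == PKT_ACK || type == PKT_DATAACK;
}

/* x_bit is the header's X field: 0 means short sequence numbers. */
static bool must_drop(enum pkt_type type, unsigned int x_bit)
{
    assert(x_bit <= 1);
    return x_bit == 0 && !allows_short_seqno(type);
}
```

The bug described above corresponds to dropping when
allows_short_seqno() is true, i.e. the condition was inverted.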

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-27 06:22:38 -07:00
Linus Torvalds
c5e6fd28e5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (52 commits)
  vlan: Use bitmask of feature flags instead of separate feature bits
  fmvj18x_cs: add NextCom NC5310 rev B support
  xirc2ps_cs: re-initialize the multicast address in do_reset
  3C509: rx_bytes should not be increased when alloc_skb failed
  NETFRONT: Use __skb_queue_purge()
  VIRTIO: Use __skb_queue_purge()
  phylib: do EXPORT_SYMBOL on get_phy_id
  netlink: Fix nla_parse_nested_compat() to call nla_parse() directly
  WAN: protect HDLC proto list while insmod/rmmod
  drivers/net/fs_enet: remove null pointer dereference
  S2io: Version update for napi and MSI-X patches
  S2io: Added napi support when MSIX is enabled.
  S2io: Move all the transmit completions to a single msi-x (alarm) vector
  drivers/net/ehea - remove unnecessary memset after kzalloc
  au1000_eth: remove useless check
  Blackfin EMAC Driver: Removed duplicated include <linux/ethtool.h>
  cpmac bugfixes and enhancements
  e1000e: use resource_size_t, not unsigned long, for phys addrs
  net/usb: add support for Apple USB Ethernet Adapter
  uli526x: add support for netpoll
  ...
2008-05-26 10:14:02 -07:00
Carlos R. Mafra
962cf36c5b Remove argument from open_softirq which is always NULL
As git-grep shows, open_softirq() is always called with the last argument
being NULL

block/blk-core.c:       open_softirq(BLOCK_SOFTIRQ, blk_done_softirq, NULL);
kernel/hrtimer.c:       open_softirq(HRTIMER_SOFTIRQ, run_hrtimer_softirq, NULL);
kernel/rcuclassic.c:    open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
kernel/rcupreempt.c:    open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
kernel/sched.c: open_softirq(SCHED_SOFTIRQ, run_rebalance_domains, NULL);
kernel/softirq.c:       open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
kernel/softirq.c:       open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
kernel/timer.c: open_softirq(TIMER_SOFTIRQ, run_timer_softirq, NULL);
net/core/dev.c: open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL);
net/core/dev.c: open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL);

This observation has already been made by Matthew Wilcox in June 2002
(http://www.cs.helsinki.fi/linux/linux-kernel/2002-25/0687.html)

"I notice that none of the current softirq routines use the data element
passed to them."

and the situation hasn't changed since then. So it appears we can safely
remove that extra argument to save 128 (54) bytes of kernel data (text).

Signed-off-by: Carlos R. Mafra <crmafra@ift.unesp.br>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-25 07:43:15 +02:00
Patrick McHardy
289c79a4bd vlan: Use bitmask of feature flags instead of separate feature bits
Herbert Xu points out that the use of separate feature bits for features
to be propagated to VLAN devices is going to get messy real soon.
Replace the VLAN feature bits by a bitmask of feature flags to be
propagated and restore the old GSO_SHIFT/MASK values.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-23 00:27:50 -07:00
Ingo Molnar
2ba4cc319a rcu: fix nf_conntrack_helper.c build bug
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-22 10:08:38 +02:00
Linus Torvalds
a0abb93bf9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  net: The world is not perfect patch.
  tcp: Make prior_ssthresh a u32
  xfrm_user: Remove zero length key checks.
  net/ipv4/arp.c: Use common hex_asc helpers
  cassini: Only use chip checksum for ipv4 packets.
  tcp: TCP connection times out if ICMP frag needed is delayed
  netfilter: Move linux/types.h inclusions outside of #ifdef __KERNEL__
  af_key: Fix selector family initialization.
  libertas: Fix ethtool statistics
  mac80211: fix NULL pointer dereference in ieee80211_compatible_rates
  mac80211: don't claim iwspy support
  orinoco_cs: add ID for SpeedStream wireless adapters
  hostap_cs: add ID for Conceptronic CON11CPro
  rtl8187: resource leak in error case
  ath5k: Fix loop variable initializations
2008-05-21 22:14:39 -07:00
Rami Rosen
071f92d059 net: The world is not perfect patch.
Unless there are objections, I suggest considering the
following patch, which simply removes the code guarded by
-DI_WISH_WORLD_WERE_PERFECT in the three methods that use it.

The compilation errors we get when building with -DI_WISH_WORLD_WERE_PERFECT
show that this code has not been built or used for a very long time.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-21 17:47:54 -07:00
David S. Miller
88860c9ef4 xfrm_user: Remove zero length key checks.
The crypto layer will determine whether that is valid
or not.

Suggested by Herbert Xu, based upon a report and patch
by Martin Willi.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-05-21 17:36:21 -07:00
Denis Cheng
51f82a2b12 net/ipv4/arp.c: Use common hex_asc helpers
Here the local hexbuf is a duplicate of global const char hex_asc from
lib/hexdump.c, except the hex letters' cases:

	const char hexbuf[] = "0123456789ABCDEF";

	const char hex_asc[] = "0123456789abcdef";

and here to print HW addresses, the hex cases are not significant.

Thanks to Harvey Harrison for introducing the hex_asc_hi/hex_asc_lo helpers.
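
As a rough illustration of the helpers mentioned above, here is a userspace sketch that re-creates hex_asc and the hex_asc_hi/hex_asc_lo macros and uses them to print a hardware address. The format_hw_addr() function is a hypothetical helper invented for this example and is not the code in net/ipv4/arp.c.

```c
#include <stddef.h>

/* Userspace re-creation of the lower-case table from lib/hexdump.c. */
static const char hex_asc[] = "0123456789abcdef";

/* Low and high nibble lookups, modelled on the kernel helpers. */
#define hex_asc_lo(x) hex_asc[((x) & 0x0f)]
#define hex_asc_hi(x) hex_asc[((x) & 0xf0) >> 4]

/*
 * Hypothetical helper: write addr as "aa:bb:cc..." into out.
 * out must have room for at least 3 * len bytes (and at least 1 byte).
 */
static void format_hw_addr(const unsigned char *addr, size_t len, char *out)
{
	size_t k = 0;

	for (size_t i = 0; i < len; i++) {
		out[k++] = hex_asc_hi(addr[i]);
		out[k++] = hex_asc_lo(addr[i]);
		if (i + 1 < len)
			out[k++] = ':';
	}
	out[k] = '\0';
}
```

With the shared table the digits come out in lower case, which is the only visible difference from the old local hexbuf.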

Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-21 17:34:32 -07:00
Sridhar Samudrala
7d227cd235 tcp: TCP connection times out if ICMP frag needed is delayed
We are seeing an issue with TCP in handling an ICMP frag needed
message that is received after net.ipv4.tcp_retries1 retransmits.
The default value of retries1 is 3. So if the path mtu changes
and ICMP frag needed is lost for the first 3 retransmits or if
it gets delayed until 3 retransmits are done, TCP doesn't update
MSS correctly and continues to retransmit the original message
until it times out after tcp_retries2 retransmits.

I am seeing this issue even with the latest 2.6.25.4 kernel.

In tcp_retransmit_timer(), when the retransmit counter exceeds the
tcp_retries1 value, the dst cache entry of the socket is reset.
At this time, if we receive an ICMP frag needed message, the
dst entry gets updated with the new MTU, but the TCP socket's
dst_cache entry remains NULL.

So the next time when we try to retransmit after the ICMP frag
needed is received, tcp_retransmit_skb() gets called. Here the
cur_mss value is calculated at the start of the routine with
a NULL sk_dst_cache. Instead we should call tcp_current_mss after
the rebuild_header that caches the dst entry with the updated mtu.
Also the rebuild_header should be called before tcp_fragment
so that skb is fragmented if the mss goes down.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-21 16:42:20 -07:00
Kazunori MIYAZAWA
4da5105687 af_key: Fix selector family initialization.
This propagates the xfrm_user fix made in commit
bcf0dda8d2 ("[XFRM]: xfrm_user: fix
selector family initialization")

Based upon a bug report from, and tested by, Alan Swanson.

Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-21 13:26:11 -07:00
David S. Miller
d8ac48d4cb Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-05-20 20:34:22 -07:00
Helmut Schaa
0d580a774b mac80211: fix NULL pointer dereference in ieee80211_compatible_rates
Fix a possible NULL pointer dereference in ieee80211_compatible_rates
introduced in the patch "mac80211: fix association with some APs". If no bss
is available just use all supported rates in the association request.

Signed-off-by: Helmut Schaa <hschaa@suse.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-20 22:40:30 -04:00
Linus Torvalds
d40ace0c7b Merge branch 'for-2.6.26' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.26' of git://linux-nfs.org/~bfields/linux: (25 commits)
  svcrdma: Verify read-list fits within RPCSVC_MAXPAGES
  svcrdma: Change svc_rdma_send_error return type to void
  svcrdma: Copy transport address and arm CQ before calling rdma_accept
  svcrdma: Set rqstp transport address in rdma_read_complete function
  svcrdma: Use ib verbs version of dma_unmap
  svcrdma: Cleanup queued, but unprocessed I/O in svc_rdma_free
  svcrdma: Move the QP and cm_id destruction to svc_rdma_free
  svcrdma: Add reference for each SQ/RQ WR
  svcrdma: Move destroy to kernel thread
  svcrdma: Shrink scope of spinlock on RQ CQ
  svcrdma: Use standard Linux lists for context cache
  svcrdma: Simplify RDMA_READ deferral buffer management
  svcrdma: Remove unused READ_DONE context flags bit
  svcrdma: Return error from rdma_read_xdr so caller knows to free context
  svcrdma: Fix error handling during listening endpoint creation
  svcrdma: Free context on post_recv error in send_reply
  svcrdma: Free context on ib_post_recv error
  svcrdma: Add put of connection ESTABLISHED reference in rdma_cma_handler
  svcrdma: Fix return value in svc_rdma_send
  svcrdma: Fix race with dto_tasklet in svc_rdma_send
  ...
2008-05-20 19:30:54 -07:00
Linus Torvalds
e616c63033 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
  pktgen: make sure that pktgen_thread_worker has been executed
  [VLAN]: Propagate selected feature bits to VLAN devices
  drivers/atm/: remove CVS keywords
  vlan: Correctly handle device notifications for layered VLAN devices
  net: Fix call to ->change_rx_flags(dev, IFF_MULTICAST) in dev_change_flags()
  net_sched: cls_api: fix return value for non-existent classifiers
  ipsec: Use the correct ip_local_out function
  ipv6 addrconf: Allow infinite prefix lifetime.
  ipv6 route: Fix lifetime in netlink.
  ipv6 addrconf: Fix route lifetime setting in corner case.
  ndisc: Add missing strategies for per-device retrans timer/reachable time settings.
  ipv6: Move <linux/in6.h> from header-y to unifdef-y.
  l2tp: avoid skb truesize bug if headroom is increased
  wireless: Create 'device' symlink in sysfs
  wireless, airo: waitbusy() won't delay
  libertas: fix command timeout after firmware failure
  mac80211: Add RTNL version of ieee80211_iterate_active_interfaces
  mac80211 : Association with 11n hidden ssid ap.
  hostap: fix "registers" registration in procfs
  isdn/capi: Return proper errnos on module init.
  ...
2008-05-20 17:23:03 -07:00
J. Bruce Fields
68432a03f8 Merge branch 'from-tomtucker' into for-2.6.26 2008-05-20 19:57:38 -04:00
Denis V. Lunev
d3ede327e8 pktgen: make sure that pktgen_thread_worker has been executed
The following corruption can happen during pktgen stop:
list_del corruption. prev->next should be ffff81007e8a5e70, but was 6b6b6b6b6b6b6b6b
kernel BUG at lib/list_debug.c:67!
      :pktgen:pktgen_thread_worker+0x374/0x10b0
      ? autoremove_wake_function+0x0/0x40
      ? _spin_unlock_irqrestore+0x42/0x80
      ? :pktgen:pktgen_thread_worker+0x0/0x10b0
      kthread+0x4d/0x80
      child_rip+0xa/0x12
      ? restore_args+0x0/0x30
      ? kthread+0x0/0x80
      ? child_rip+0x0/0x12
RIP  list_del+0x48/0x70

The problem is that pktgen_thread_worker may never be executed if kthread_stop
has been called too early. Insert a completion on the normal initialization
path to make sure that pktgen_thread_worker is guaranteed to run.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-20 15:12:44 -07:00
Johannes Berg
51e779f0da mac80211: don't claim iwspy support
We removed iwspy support a very long time ago because it is useless, but
forgot to stop claiming to support it. Apparently, nobody cares, but
remove it nonetheless.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-20 17:55:30 -04:00
Patrick McHardy
5fb1357054 [VLAN]: Propagate selected feature bits to VLAN devices
Propagate feature bits from the NETDEV_FEAT_CHANGE notifier. For now
only TSO is propagated for devices that announce their ability to
support TSO in combination with VLAN accel by setting the NETIF_F_VLAN_TSO
flag.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-20 14:54:50 -07:00
Patrick McHardy
81d85346b3 vlan: Correctly handle device notifications for layered VLAN devices
Commit 30688a9 ([VLAN]: Handle vlan devices net namespace changing)
changed the device notifier to special-case notifications for VLAN
devices, effectively disabling state propagation to underlying VLAN
devices. This is needed for layered VLANs though, so restore the
original behaviour.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-20 14:37:36 -07:00
David Woodhouse
0e91796eb4 net: Fix call to ->change_rx_flags(dev, IFF_MULTICAST) in dev_change_flags()
Am I just being particularly dim today, or can the call to
dev->change_rx_flags(dev, IFF_MULTICAST) in dev_change_flags() never
happen?

We've just set dev->flags = flags & IFF_MULTICAST, effectively. So the
condition '(dev->flags ^ flags) & IFF_MULTICAST' is _never_ going to be
true.
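
The reasoning above can be reproduced with a small userspace sketch. The struct fake_dev type and both functions are hypothetical stand-ins, and the "fixed" variant (comparing against the flags saved before the assignment) is an assumption about the shape of the fix, not a copy of the patch.

```c
#define IFF_MULTICAST 0x1000	/* same bit as in <linux/if.h> */

/* Hypothetical stand-in for struct net_device. */
struct fake_dev {
	unsigned int flags;
	int rx_flag_calls;	/* counts simulated ->change_rx_flags() calls */
};

/* Buggy shape: the comparison happens after dev->flags was overwritten. */
static void change_flags_buggy(struct fake_dev *dev, unsigned int flags)
{
	dev->flags = flags;	/* simplified; the real code merges a bit mask */
	if ((dev->flags ^ flags) & IFF_MULTICAST)	/* always 0 here */
		dev->rx_flag_calls++;
}

/* Assumed fix: compare the new flags against the saved old value. */
static void change_flags_fixed(struct fake_dev *dev, unsigned int flags)
{
	unsigned int old_flags = dev->flags;

	dev->flags = flags;
	if ((old_flags ^ flags) & IFF_MULTICAST)
		dev->rx_flag_calls++;
}
```

Toggling IFF_MULTICAST through the buggy variant never increments rx_flag_calls, while the variant that remembers the old flags increments it exactly when the bit changes.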

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-20 14:36:14 -07:00
Patrick McHardy
f2df824948 net_sched: cls_api: fix return value for non-existent classifiers
cls_api should return ENOENT when the requested classifier doesn't
exist.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-20 14:34:46 -07:00
Herbert Xu
1ac06e0306 ipsec: Use the correct ip_local_out function
Because the IPsec output function xfrm_output_resume does its
own dst_output call it should always call __ip_local_out
instead of ip_local_out as the latter may invoke dst_output
directly.  Otherwise the return values from nf_hook and dst_output
may clash as they both use the value 1 but for different purposes.

When that clash occurs this can cause a packet to be used after
it has been freed which usually leads to a crash.  Because the
offending value is only returned from dst_output with qdiscs
such as HTB, this bug is normally not visible.

Thanks to Marco Berizzi for his perseverance in tracking this
down.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-20 14:32:14 -07:00
YOSHIFUJI Hideaki
6f704992d3 ipv6 addrconf: Allow infinite prefix lifetime.
We need to handle infinite prefix lifetime specially.
With help from original reporter "Bonitch, Joseph"
<Joseph.Bonitch@xerox.com>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-19 16:56:11 -07:00
YOSHIFUJI Hideaki
69cdf8f92a ipv6 route: Fix lifetime in netlink.
We could not see the appropriate lifetime if the route had been scheduled
to expire at 0 (in jiffies).  We should check rt6i_flags instead of
rt6i_expires to determine whether lifetime is valid or not.
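
A minimal sketch of why the expiry time makes a poor "has a lifetime" indicator: jiffies wraps, so 0 is a legitimate expiry value. The struct route_sketch type and the two predicates are hypothetical; only the idea of testing a flag instead of the timestamp comes from the message above.

```c
#define RTF_EXPIRES 0x00400000	/* illustrative; mirrors the kernel flag name */

/* Hypothetical, reduced stand-in for the route entry. */
struct route_sketch {
	unsigned int flags;	/* plays the role of rt6i_flags */
	unsigned long expires;	/* plays the role of rt6i_expires, in jiffies */
};

/* Old approach: treats expires == 0 as "no lifetime set". */
static int has_lifetime_by_expires(const struct route_sketch *rt)
{
	return rt->expires != 0;
}

/* Approach described above: the flag alone says whether a lifetime applies. */
static int has_lifetime_by_flags(const struct route_sketch *rt)
{
	return (rt->flags & RTF_EXPIRES) != 0;
}
```

A route with RTF_EXPIRES set and an expiry of exactly 0 jiffies is misclassified by the first predicate and handled correctly by the second.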

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-19 16:55:13 -07:00
YOSHIFUJI Hideaki
a3264435b4 ipv6 addrconf: Fix route lifetime setting in corner case.
Because of arithmetic overflow avoidance, the actual lifetime setting
(vs the value given by RA) did not increase monotonically around
0x7fffffff/HZ.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-19 16:54:29 -07:00
David S. Miller
44dc19c829 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-05-19 16:29:40 -07:00
YOSHIFUJI Hideaki
0686caa35e ndisc: Add missing strategies for per-device retrans timer/reachable time settings.
Noticed from Al Viro <viro@ftp.linux.org.uk> via David Miller
<davem@davemloft.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-19 16:25:42 -07:00
Tom Tucker
a6f911c04e svcrdma: Verify read-list fits within RPCSVC_MAXPAGES
A RDMA read-list cannot contain more elements than RPCSVC_MAXPAGES or
it will overflow the DTO context. Verify this when processing the
protocol header.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:34:02 -05:00
Tom Tucker
008fdbc571 svcrdma: Change svc_rdma_send_error return type to void
The svc_rdma_send_error function is called when an RPCRDMA protocol
error is detected. This function attempts to post an error reply message.
Since an error posting to a transport in error is ignored, change
the return type to void.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:34:01 -05:00
Tom Tucker
af261af4db svcrdma: Copy transport address and arm CQ before calling rdma_accept
This race was found by inspection. Messages can be received from the peer
immediately following the rdma_accept call, however, the CQs have not yet
been armed and the transport address has not yet been set.

Set the transport address in the connect request handler and arm the CQ
prior to calling rdma_accept.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:34:00 -05:00
Tom Tucker
69500c43b4 svcrdma: Set rqstp transport address in rdma_read_complete function
The rdma_read_complete function needs to copy the rqstp transport address
from the transport. Failure to do so can result in using the wrong
authentication method for the RPC or bug checking if the rqstp address
is not valid.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:59 -05:00
Tom Tucker
97a3df382e svcrdma: Use ib verbs version of dma_unmap
Use the ib_verbs version of the dma_unmap service in the
svc_rdma_put_context function. This should support providers
using software rdma.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:58 -05:00
Tom Tucker
356d0a1519 svcrdma: Cleanup queued, but unprocessed I/O in svc_rdma_free
When the transport is closing, the DTO tasklet may queue data
that never gets processed. Clean up resources associated with
this I/O.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:57 -05:00
Tom Tucker
1711386c62 svcrdma: Move the QP and cm_id destruction to svc_rdma_free
Move the destruction of the QP and CM_ID to the free path so that the
QP cleanup code doesn't race with the dto_tasklet handling flushed WR.
The QP reference is not needed because we now have a reference for
every WR.

Also add a guard in the SQ and RQ completion handlers to ignore
calls generated by some providers when the QP is destroyed.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:56 -05:00
Tom Tucker
0905c0f0a2 svcrdma: Add reference for each SQ/RQ WR
Add a reference on the transport for every outstanding WR.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:55 -05:00
Tom Tucker
8da91ea8de svcrdma: Move destroy to kernel thread
Some providers may wait while destroying adapter resources.
Since it is possible that the last reference is put on the
dto_tasklet, the actual destroy must be scheduled as a work item.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:54 -05:00
Tom Tucker
47698e083e svcrdma: Shrink scope of spinlock on RQ CQ
The rq_cq_reap function is only called from the dto_tasklet. The
only resource shared with other threads is the sc_rq_dto_q. Move the
spin lock to protect only this list.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:53 -05:00
Tom Tucker
8740767376 svcrdma: Use standard Linux lists for context cache
Replace the one-off linked list implementation used to implement the
context cache with the standard Linux list_head lists. Add a context
counter to catch resource leaks. A WARN_ON will be added later to
ensure that we've freed all contexts.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:52 -05:00
Tom Tucker
02e7452de7 svcrdma: Simplify RDMA_READ deferral buffer management
An NFS_WRITE requires a set of RDMA_READ requests to fetch the write
data from the client. There are two principal pieces of data that
need to be tracked: the list of pages that comprise the completed RPC
and the SGE of dma mapped pages to refer to this list of pages. Previously
this whole bit was managed as a linked list of contexts with the
context containing the page list buried in this list. This patch
simplifies this processing by not keeping a linked list, but rather only
a pointer from the last submitted RDMA_READ's context to the context
that maps the set of pages that describe the RPC.  This significantly
simplifies this code path. SGE contexts are cleaned up inline in the DTO
path instead of at read completion time.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:51 -05:00
Tom Tucker
10a38c33f4 svcrdma: Remove unused READ_DONE context flags bit
The RDMACTXT_F_READ_DONE bit is no longer used. Remove it.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:50 -05:00
Tom Tucker
d16d40093a svcrdma: Return error from rdma_read_xdr so caller knows to free context
The rdma_read_xdr function did not discriminate between no read-list and
an error posting the read-list. This results in a leak of a page if there
is an error posting the read-list.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:49 -05:00
Tom Tucker
58e8f62137 svcrdma: Fix error handling during listening endpoint creation
A listening endpoint isn't known to the generic transport switch until
the svc_create_xprt function returns without error. Calling
svc_xprt_put within the xpo_create function causes the module reference
count to be erroneously decremented.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:48 -05:00
Tom Tucker
5ac461a6f0 svcrdma: Free context on post_recv error in send_reply
If an error is encountered trying to post a recv buffer in send_reply,
free the passed in context. Return an error to the caller so it is
aware that the request was not posted.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:47 -05:00
Tom Tucker
05a0826a6e svcrdma: Free context on ib_post_recv error
If there is an error posting the recv WR to the RQ, free the
context associated with the WR. This would leak a context when
asynchronous errors occurred on the transport while concurrent threads
were processing their RPC.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:47 -05:00
Tom Tucker
120693d12c svcrdma: Add put of connection ESTABLISHED reference in rdma_cma_handler
The svcrdma transport takes a reference when it gets the ESTABLISHED
event from the provider. This reference is supposed to be removed when
the DISCONNECT event is received, however, the call to svc_xprt_put
was missing in the switch statement. This results in the memory
associated with the transport never being freed.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:46 -05:00
Tom Tucker
9d6347acd2 svcrdma: Fix return value in svc_rdma_send
Fix the return value on close to -ENOTCONN so caller knows to free context.
Also if a thread is waiting for free SQ space, check for close when waking
to avoid posting WR to a closing transport.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:45 -05:00
Tom Tucker
dbcd00eba9 svcrdma: Fix race with dto_tasklet in svc_rdma_send
The svc_rdma_send function will attempt to reap SQ WR to make room for
a new request if it finds the SQ full. This function races with the
dto_tasklet that also reaps SQ WR. To avoid polling and arming the CQ
unnecessarily move the test_and_clear_bit of the RDMAXPRT_SQ_PENDING
flag and arming of the CQ to the sq_cq_reap function.

Refactor the rq_cq_reap function to match sq_cq_reap so that the
code is easier to follow.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:44 -05:00
Tom Tucker
0e7f011a19 svcrdma: Simplify receive buffer posting
The svcrdma transport provider currently allocates receive buffers
to the RQ through the xpo_release_rqst method. This approach is overly
complicated since it means that the rqstp rq_xprt_ctxt has to be
selectively set based on whether the RPC is going to be processed
immediately or deferred. Instead, just post the receive buffer when
we are certain that we are replying in the send_reply function.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:43 -05:00
Tom Tucker
aa3314c8d6 svc: Remove unused header files from svc_xprt.c
This cosmetic patch removes unused header files that svc_xprt.c
inherited from svcsock.c

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:42 -05:00
Tom Tucker
fc63a05086 svc: Remove extra check for XPT_DEAD bit in svc_xprt_enqueue
Remove a redundant check for the XPT_DEAD bit in the svc_xprt_enqueue
function. This same bit is checked below while holding the pool lock
and prints a debug message if found to be dead.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
2008-05-19 07:33:41 -05:00
Ingo Molnar
711bbdd659 rculist.h: fix include in net/netfilter/nf_conntrack_netlink.c
this file has rculist dependency but did not explicitly include it,
which broke the build.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-19 10:03:40 +02:00
Franck Bui-Huu
82524746c2 rcu: split list.h and move rcu-protected lists into rculist.h
Move rcu-protected lists from list.h into a new header file rculist.h.

This is done because lists are a heavily used primitive structure all over the
kernel and it's currently impossible to include other header files in
list.h without creating circular dependencies.

For example, list.h implements rcu-protected lists and uses rcu_dereference()
without including rcupdate.h.  It actually compiles because the users of
rcu_dereference() are macros.  Other RCU functions could be used too but
aren't, probably because of this.

Therefore this patch creates rculist.h, which includes rcupdate.h without too
many changes or troubles.

Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Josh Triplett <josh@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-19 10:01:37 +02:00
J. Bruce Fields
d71a4dd72e svcrpc: fix proc/net/rpc/auth.unix.ip/content display
Commit f15364bd4c ("IPv6 support for NFS
server export caches") dropped a couple spaces, rendering the output
here difficult to read.

(However note that we expect the output to be parsed only by humans, not
machines, so this shouldn't have broken any userland software.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-05-18 19:13:07 -04:00
Trond Myklebust
b4528762ca SUNRPC: AUTH_SYS "machine creds" shouldn't use negative valued uid/gid
Apparently this causes Solaris 10 servers to refuse our NFSv4 SETCLIENTID
calls. Fall back to root creds for now, since most servers that care are
very likely to have root squashing enabled.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-05-18 14:18:27 -04:00
Ivo van Doorn
2f561feb38 mac80211: Add RTNL version of ieee80211_iterate_active_interfaces
Since commit e38bad4766
	mac80211: make ieee80211_iterate_active_interfaces not need rtnl
rt2500usb and rt73usb broke down due to attempting register access
in atomic context (which is not possible for USB hardware).

This patch restores ieee80211_iterate_active_interfaces() to use RTNL lock,
and provides the non-RTNL version under a new name:
	ieee80211_iterate_active_interfaces_atomic()

So far only rt2x00 uses ieee80211_iterate_active_interfaces(), and those
drivers require the RTNL version of ieee80211_iterate_active_interfaces().
Since they already call that function directly, this patch will automatically
fix the USB rt2x00 drivers.

v2: Rename ieee80211_iterate_active_interfaces_rtnl

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-16 17:15:09 -04:00
Abhijeet Kolekar
34a961f7db mac80211 : Association with 11n hidden ssid ap.
This patch fixes the association problem with 11n hidden SSID APs.
It fixes associating with a hidden SSID when all three parameters
(ap, essid and channel) are given to iwconfig.
It removes the check for the three parameters and always looks
for the bss in the bss list while associating.

Signed-off-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-16 17:14:44 -04:00
Stephen Hemminger
dcc997738e net: handle errors from device_rename
device_rename can fail with -EEXIST or -ENOMEM, so handle any
problems.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-14 22:33:38 -07:00
Linus Torvalds
8f40f672e6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of ssh://master.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9p: fix error path during early mount
  9p: make cryptic unknown error from server less scary
  9p: fix flags length in net
  9p: Correct fidpool creation failure in p9_client_create
  9p: use struct mutex instead of struct semaphore
  9p: propagate parse_option changes to client and transports
  fs/9p/v9fs.c (v9fs_parse_options): Handle kstrdup and match_strdup failure.
  9p: Documentation updates
  add match_strlcpy() and use it to make v9fs uname and remotename parsing more robust
2008-05-14 19:30:51 -07:00
Andrew Morton
f7fd63c0b5 net/irda/irnet/irnet_irda.c needs unaligned.h
net/irda/irnet/irnet_irda.c: In function 'irnet_discovery_indication':
net/irda/irnet/irnet_irda.c:1676: error: implicit declaration of function 'get_unaligned'

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-14 19:11:15 -07:00
Eric Van Hensbergen
887b3ece65 9p: fix error path during early mount
There were some cleanup issues during early mount which would trigger
a kernel bug for certain types of failure.  This patch reorganizes the
cleanup to get rid of the bad behavior.

This also merges the 9pnet and 9pnet_fd modules for the purpose of
configuration and initialization.  Keeping the fd transport separate
from the core 9pnet code seemed like a good idea at the time, but in
practice has caused more harm and confusion than good.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-05-14 19:23:27 -05:00
Eric Van Hensbergen
332c421e67 9p: make cryptic unknown error from server less scary
Right now when we get an error string from the server that we can't
map we report a cryptic error that actually makes it look like we are
reporting a problem with the client.  This changes the text of the log
message to clarify where the error is coming from.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-05-14 19:23:26 -05:00
Steven Rostedt
d0c447180b 9p: fix flags length in net
Some files in the net/9p directory use "int" for flags. This can
cause hard-to-find bugs on some architectures. This patch converts the
flags to use "long" instead.

This bug was discovered by doing an allyesconfig make on the -rt kernel
where checks are done to ensure all flags are of size sizeof(long).

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-05-14 19:23:26 -05:00
Josef 'Jeff' Sipek
728fc4ef17 9p: Correct fidpool creation failure in p9_client_create
On error, p9_idpool_create returns an ERR_PTR-encoded errno.
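
For readers unfamiliar with the ERR_PTR convention, here is a userspace sketch of why a plain NULL test misses such a failure. ERR_PTR, PTR_ERR and IS_ERR are re-created here in simplified form; struct idpool and idpool_create() are hypothetical stand-ins rather than the 9p code, and the NULL test is only assumed to be the kind of check being corrected.

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Simplified userspace versions of the kernel's <linux/err.h> helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error cookies live in the top MAX_ERRNO values of the address space. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

struct idpool { int next_id; };

/* Hypothetical creator: returns an ERR_PTR-encoded errno on failure. */
static struct idpool *idpool_create(int simulate_failure)
{
	struct idpool *p;

	if (simulate_failure)
		return ERR_PTR(-ENOMEM);
	p = malloc(sizeof(*p));
	if (!p)
		return ERR_PTR(-ENOMEM);
	p->next_id = 0;
	return p;
}

/* A NULL test never sees the failure, since the error cookie is non-NULL. */
static int failed_by_null_check(const struct idpool *p) { return p == NULL; }

/* IS_ERR() recognises the cookie, and PTR_ERR() recovers the errno. */
static int failed_by_is_err(const struct idpool *p) { return IS_ERR(p); }
```

A caller that relies on the NULL test would carry on and dereference the error cookie, whereas the IS_ERR() test lets it return PTR_ERR() to its own caller.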

Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-05-14 19:23:26 -05:00
Josef 'Jeff' Sipek
c1549497e9 9p: use struct mutex instead of struct semaphore
Replace semaphores protecting use flags with a mutex.

Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-05-14 19:23:26 -05:00
Eric Van Hensbergen
bb8ffdfc3e 9p: propagate parse_option changes to client and transports
Propagate changes that were made to the parse_options code to the
other parse options pieces present in the other modules.  Looks like
the client parse options was probably corrupting the parse string
and causing problems for others.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-05-14 19:23:26 -05:00
Eric Van Hensbergen
ee443996a3 9p: Documentation updates
The kernel-doc comments of much of the 9p system have been in disarray since
reorganization.  This patch fixes those problems, adds additional documentation
and a template book which collects the 9p information.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-05-14 19:23:25 -05:00
Linus Torvalds
6aa5fc4349 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (73 commits)
  net: Fix typo in net/core/sock.c.
  ppp: Do not free not yet unregistered net device.
  netfilter: xt_iprange: module aliases for xt_iprange
  netfilter: ctnetlink: dump conntrack ID in event messages
  irda: Fix a misalign access issue. (v2)
  sctp: Fix use of uninitialized pointer
  cipso: Relax too much careful cipso hash function.
  tcp FRTO: work-around inorder receivers
  tcp FRTO: Fix fallback to conventional recovery
  New maintainer for Intel ethernet adapters
  DM9000: Use delayed work to update MII PHY state
  DM9000: Update and fix driver debugging messages
  DM9000: Add __devinit and __devexit attributes to probe and remove
  sky2: fix simple define thinko
  [netdrvr] sfc: sfc: Add self-test support
  [netdrvr] sfc: Increment rx_reset when reported as driver event
  [netdrvr] sfc: Remove unused macro EFX_XAUI_RETRAIN_MAX
  [netdrvr] sfc: Fix code formatting
  [netdrvr] sfc: Remove kernel-doc comments for removed members of struct efx_nic
  [netdrvr] sfc: Remove garbage from comment
  ...
2008-05-14 10:08:24 -07:00
Rami Rosen
9ee6b7f155 net: Fix typo in net/core/sock.c.
In sock_queue_rcv_skb()  (net/core/sock.c) it should be:
"Cast sk->rcvbuf ..." instead of: "Cast skb->rcvbuf ..."

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-14 03:50:03 -07:00
Phil Oester
01b7a31429 netfilter: xt_iprange: module aliases for xt_iprange
Using iptables 1.3.8 with kernel 2.6.25, rules which include '-m
iprange' don't automatically pull in xt_iprange module.  Below patch
adds module aliases to fix that.  Patch against latest -git, but seems
like a good candidate for -stable also.

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-13 23:27:48 -07:00
Eric Leblond
1eedf69993 netfilter: ctnetlink: dump conntrack ID in event messages
Conntrack ID is not put (anymore ?) in event messages. This causes
current ulogd2 code to fail because it uses the ID to build a hash in
userspace. This hash is used to be able to output the starting time of
a connection.

Conntrack ID can be used in userspace applications to maintain an easy
match between the kernel connection list and the userspace one. It may be
worth adding if there is no performance-related issue.

[ Patrick: it was never included in events, but really should be ]

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-13 23:27:11 -07:00
Graf Yang
332223831e irda: Fix a misalign access issue. (v2)
Replace u16ho with put/get_unaligned functions
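
A minimal userspace sketch of the idea behind such helpers, assuming the
hypothetical names load_u16_unaligned() and store_u16_unaligned() (these are
not the kernel's implementations): copying through memcpy() never issues a
16-bit load or store on a misaligned address, and keeps host byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-ins for unaligned access helpers. */

/* Read a host-order 16-bit value from an address with any alignment. */
uint16_t load_u16_unaligned(const void *p)
{
    uint16_t v;

    memcpy(&v, p, sizeof(v));   /* byte copy, safe on strict-alignment CPUs */
    return v;
}

/* Write a host-order 16-bit value to an address with any alignment. */
void store_u16_unaligned(uint16_t v, void *p)
{
    memcpy(p, &v, sizeof(v));
}
```

Dereferencing a cast pointer such as *(uint16_t *)(buf + 1) is the pattern
these helpers are meant to replace.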

Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-13 23:25:57 -07:00
Patrick McHardy
c1cc678ada sctp: Fix use of uninitialized pointer
Introduced by c4492586 (sctp: Add address type check while process
paramaters of ASCONF chunk):

net/sctp/sm_make_chunk.c: In function 'sctp_process_asconf':
net/sctp/sm_make_chunk.c:2828: warning: 'addr_param' may be used uninitialized in this function
net/sctp/sm_make_chunk.c:2828: note: 'addr_param' was declared here

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-13 23:25:00 -07:00
Pavel Emelyanov
5e0f8923f3 cipso: Relax too much careful cipso hash function.
The cipso_v4_cache is allocated to contain CIPSO_V4_CACHE_BUCKETS
buckets. The CIPSO_V4_CACHE_BUCKETS = 1 << CIPSO_V4_CACHE_BUCKETBITS,
where CIPSO_V4_CACHE_BUCKETBITS = 7.

The bucket-selection function for this hash is calculated like this:

  bkt = hash & (CIPSO_V4_CACHE_BUCKETBITS - 1);
                                     ^^^

i.e. picking only 4 buckets of possible 128 :)
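
The arithmetic can be checked with a small userspace sketch. The macro and
function names below are hypothetical stand-ins for the CIPSO ones; the fixed
variant assumes the intended mask is the bucket count minus one.

```c
#include <stdint.h>

#define CACHE_BUCKETBITS 7                       /* as in the description */
#define CACHE_BUCKETS    (1 << CACHE_BUCKETBITS) /* 128 buckets */

/* Buggy selection: the mask is 7 - 1 = 6 (binary 110). */
uint32_t bucket_buggy(uint32_t hash)
{
    return hash & (CACHE_BUCKETBITS - 1);
}

/* Assumed fix: the mask is 128 - 1 = 127, covering every bucket. */
uint32_t bucket_fixed(uint32_t hash)
{
    return hash & (CACHE_BUCKETS - 1);
}

/* Count how many distinct buckets a selection function can reach. */
unsigned int count_buckets(uint32_t (*select)(uint32_t))
{
    unsigned char seen[CACHE_BUCKETS] = {0};
    unsigned int n = 0;

    for (uint32_t h = 0; h < 1024; h++) {
        uint32_t b = select(h);

        if (!seen[b]) {
            seen[b] = 1;
            n++;
        }
    }
    return n;
}
```

With the mask 6, only buckets 0, 2, 4 and 6 are ever selected.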

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-13 23:23:55 -07:00
Ilpo Järvinen
79d44516b4 tcp FRTO: work-around inorder receivers
If receiver consumes segments successfully only in-order, FRTO
fallback to conventional recovery produces RTO loop because
FRTO's forward transmissions will always get dropped and need to
be resent, yet by default they're not marked as lost (which are
the only segments we will retransmit in CA_Loss).

The price to pay for this is occasionally retransmitting the
forward transmission(s) unnecessarily. SACK blocks help
a bit to avoid this, so it's mainly a concern for the NewReno case,
though SACK is not fully immune either.

This change has the side effect of fixing a SACKFRTO problem where
snd_nxt of the RTO time was no longer available when
fallback becomes necessary (this problem would only have occurred
when an RTO occurs for two or more segments and ECE arrives
in step 3; no need to figure out how to fix that unless the
TODO item of selective behavior is considered in future).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Damon L. Chesser <damon@damtek.com>
Tested-by: Damon L. Chesser <damon@damtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-13 02:54:19 -07:00
Ilpo Järvinen
a1c1f281b8 tcp FRTO: Fix fallback to conventional recovery
It seems that commit 009a2e3e4e ("[TCP] FRTO: Improve
interoperability with other undo_marker users") ran into
another land-mine which caused fallback to conventional
recovery to break:

1. Cumulative ACK arrives after FRTO retransmission
2. tcp_try_to_open sees zero retrans_out, clears retrans_stamp
   which should be kept like in CA_Loss state it would be
3. undo_marker change allowed tcp_packet_delayed to return
   true because of the cleared retrans_stamp once FRTO is
   terminated causing LossUndo to occur, which means all loss
   markings FRTO made are reverted.

This means that the conventional recovery basically recovered
one loss per RTT, which is not that efficient. It was far from
obvious that the undo_marker change broke something like
this; I had quite a long session tracking it down because the
bug is so unintuitive (luckily I had a trivial
reproducer at hand and I was also able to learn to use kprobes
in the process as well :-)).

Together with the NewReno+FRTO fix and the FRTO in-order
workaround, this fixes Damon's problems; this patch and the
NewReno+FRTO fix are enough to fix Bugzilla #10063.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Damon L. Chesser <damon@damtek.com>
Tested-by: Damon L. Chesser <damon@damtek.com>
Tested-by: Sebastian Hyrwall <zibbe@cisko.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-13 02:53:26 -07:00
David S. Miller
608961a5ec mac80211: Use skb_header_cloned() on TX path.
When skb_header_cloned() returns false you can change the
headers however you like.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-12 21:59:32 -07:00
Johannes Berg
f3994eceeb mac80211: assign needed_headroom/tailroom for netdevs
This assigns the netdev's needed_headroom/tailroom members to take
advantage of pre-allocated space for 802.11 headers.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-12 20:51:44 -07:00
Johannes Berg
f5184d267c net: Allow netdevices to specify needed head/tailroom
This patch adds needed_headroom/needed_tailroom members to struct
net_device and updates many places that allocate skbs to use them. Not
all of them can be converted though, and I'm sure I missed some (I
mostly grepped for LL_RESERVED_SPACE)

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-12 20:48:31 -07:00
Pavel Roskin
a4278e18e7 mac80211: add missing newlines in printk()
Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-12 21:44:41 -04:00
Helmut Schaa
36d16ae73b mac80211: fix association with some APs
Some APs refuse association if the supported rates contained in the
association request do not match its own supported rates. This patch
introduces a new function which builds the intersection between the AP's
supported rates and the client's supported rates to work around such
problems. The same approach is already used in ipw2200 for example.
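
The intersection step can be sketched as follows. This is an illustration
only, not the function the patch adds: rates_intersect() is a hypothetical
name, and rates are treated as plain byte values without any flag bits.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy into out[] every client rate that the AP also
 * lists, preserving the client's order. out[] must have room for n_client
 * entries. Returns the number of rates written.
 */
size_t rates_intersect(const uint8_t *ap, size_t n_ap,
                       const uint8_t *client, size_t n_client,
                       uint8_t *out)
{
    size_t n = 0;

    for (size_t i = 0; i < n_client; i++) {
        for (size_t j = 0; j < n_ap; j++) {
            if (client[i] == ap[j]) {
                out[n++] = client[i];
                break;          /* each client rate is copied at most once */
            }
        }
    }
    return n;
}
```

The association request would then advertise only the rates in out[], so an
AP that insists on a matching set sees nothing it does not support itself.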

Signed-off-by: Helmut Schaa <hschaa@suse.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-12 21:22:19 -04:00
Pavel Emelyanov
6d6936e2ea Fix potential scheduling while atomic in mesh_path_add.
Calling synchronize_rcu() under write-lock-ed pathtbl_resize_lock may
result in this warning (and other side effects).

It looks safe just dropping this lock before calling synchronize_rcu.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-12 21:22:19 -04:00
Pavel Emelyanov
0eb03d5a14 Fix not checked kmalloc() result.
The new_node kmallocation is not checked for success, so add
this check.

BTW, it also happens under the read_lock.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-12 21:22:19 -04:00
Pavel Emelyanov
f84e71a94c Fix GFP_KERNEL allocation under read lock.
The mesh_path_add() read-locks the pathtbl_resize_lock and calls
kmalloc with GFP_KERNEL mask.

Fix it and move the endadd2 label lower. It should be _before_ the
if() beyond, but it makes no sense for it being there, so I move it
right after this if().

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-12 21:22:18 -04:00
Patrick McHardy
812714d741 mac80211: mesh hwmp: fix kfree(skb)
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-12 21:22:18 -04:00
Luis Carlos Cobo
69687a0b99 mac80211: fix access to null skb
Without this patch, if xmit_skb is null but net_ratelimit() returns 0 we would
go to the else branch and access the null xmit_skb. Pointed out by Johannes
Berg.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-12 21:22:18 -04:00
Luis Carlos Cobo
ef26925477 mac80211: fix incorrect mesh header length
This should have been updated at the same time we were transitioning from 3 byte
to 4 byte mesh sequence number. Pointed out by Johannes Berg.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-12 21:22:18 -04:00
Ivo van Doorn
df44205455 mac80211: Don't encrypt beacons
mac80211 should set the IEEE80211_TX_CTL_DO_NOT_ENCRYPT flag in tx_control
structure to inform drivers not to encrypt the beacon. Drivers that only check
for that flag before accessing the hw_key field, will otherwise cause a NULL
pointer dereference since that field is not configured for beacons.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-12 21:22:18 -04:00
Johannes Berg
78520cad4b mac80211: fix debugfs default key oops
Under certain circumstances (in AP mode) the debugfs function
that is supposed to add the default key symlink can encounter
a NULL default_key pointer. This patch makes it handle that
situation gracefully.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-12 21:18:05 -04:00
Steven Rostedt
bb55bdd512 fix irq flags in mac80211 code
A file in the net/mac80211 directory uses "int" for flags.  This can cause
hard to find bugs on some architectures.  This patch converts the flags to use
"long" instead.

This bug was discovered by doing an allyesconfig make on the -rt kernel where
checks are done to ensure all flags are of size sizeof(long).

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-05-12 21:18:04 -04:00
Wei Yongjun
c4492586a6 sctp: Add address type check while process paramaters of ASCONF chunk
If a socket is created with type AF_INET, adding an IPv6 address to the
asoc will cause a kernel panic when a packet is transmitted on that transport.

This patch adds an address type check before processing the parameters of
an ASCONF chunk. If the peer does not support this address type, return
with an invalid parameter error.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-12 03:11:48 -07:00
Wei Yongjun
6e40a915de sctp: Do not enable peer IPv6 address support on PF_INET socket
If a socket is created with type PF_INET, it cannot use IPv6 addresses to
send/recv DATA, so we cannot use IPv6 addresses even if the peer tells us
it supports them.
This patch enables peer IPv6 address support only on PF_INET6 sockets.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-12 03:11:43 -07:00
Linus Torvalds
a8f43ee7e1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  sit: Add missing kfree_skb() on pskb_may_pull() failure.
  tipc: Increase buffer header to support worst-case device
2008-05-09 08:01:19 -07:00
David S. Miller
36ca34cc3b sit: Add missing kfree_skb() on pskb_may_pull() failure.
Noticed by Paul Marks <paul@pmarks.net>.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-08 23:40:26 -07:00
Allan Stephens
f08269d3ec tipc: Increase buffer header to support worst-case device
This patch increases the headroom TIPC reserves in each sk_buff
to accommodate the largest possible link level device header.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-08 21:38:24 -07:00
Linus Torvalds
28a4acb485 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits)
  net: Added ASSERT_RTNL() to dev_open() and dev_close().
  can: Fix can_send() handling on dev_queue_xmit() failures
  netns: Fix arbitrary net_device-s corruptions on net_ns stop.
  netfilter: Kconfig: default DCCP/SCTP conntrack support to the protocol config values
  netfilter: nf_conntrack_sip: restrict RTP expect flushing on error to last request
  macvlan: Fix memleak on device removal/crash on module removal
  net/ipv4: correct RFC 1122 section reference in comment
  tcp FRTO: SACK variant is errorneously used with NewReno
  e1000e: don't return half-read eeprom on error
  ucc_geth: Don't use RX clock as TX clock.
  cxgb3: Use CAP_SYS_RAWIO for firmware
  pcnet32: delete non NAPI code from driver.
  fs_enet: Fix a memory leak in fs_enet_mdio_probe
  [netdrvr] eexpress: IPv6 fails - multicast problems
  3c59x: use netstats in net_device structure
  3c980-TX needs EXTRA_PREAMBLE
  fix warning in drivers/net/appletalk/cops.c
  e1000e: Add support for BM PHYs on ICH9
  uli526x: fix endianness issues in the setup frame
  uli526x: initialize the hardware prior to requesting interrupts
  ...
2008-05-08 19:03:26 -07:00
Huang Weiyi
625fc3a375 Remove duplicated include in net/sunrpc/svc.c
<linux/sched.h> was included twice.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08 10:58:25 -07:00
Steven Rostedt
a1f2aa1be2 fix irq flags in mac80211 code
A file in the net/mac80211 directory uses "int" for flags.  This can cause
hard to find bugs on some architectures.  This patch converts the flags to use
"long" instead.

This bug was discovered by doing an allyesconfig make on the -rt kernel where
checks are done to ensure all flags are of size sizeof(long).

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-08 10:46:55 -07:00
Ben Hutchings
e46b66bc42 net: Added ASSERT_RTNL() to dev_open() and dev_close().
dev_open() and dev_close() must be called holding the RTNL, since they
call device functions and netdevice notifiers that are promised the RTNL.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-08 02:53:17 -07:00
Oliver Hartkopp
c2ab7ac225 can: Fix can_send() handling on dev_queue_xmit() failures
The tx packet counting and the local loopback of CAN frames should
only happen in the case that the CAN frame has been enqueued to the
netdevice tx queue successfully.
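
A minimal sketch of that ordering, assuming hypothetical names (struct
can_stats_sketch, can_send_sketch) and treating any non-zero transmit result
as a failure; the real can_send() is more involved.

```c
/* Hypothetical counters standing in for the CAN statistics. */
struct can_stats_sketch {
    unsigned long tx_frames;   /* frames accepted by the tx queue */
    unsigned long loopbacks;   /* frames echoed to local listeners */
};

/*
 * xmit_result stands in for the return value of the enqueue call:
 * 0 means the frame was queued, anything else means it was not.
 */
int can_send_sketch(struct can_stats_sketch *st, int xmit_result)
{
    if (xmit_result != 0)
        return xmit_result;    /* not queued: no counting, no loopback */

    st->tx_frames++;           /* count only frames that were queued */
    st->loopbacks++;           /* loop back only frames that were queued */
    return 0;
}
```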

Thanks to Andre Naujoks <nautsch@gmail.com> for reporting this issue.

Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: Urs Thuermann <urs@isnogud.escape.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-08 02:49:55 -07:00
Pavel Emelyanov
aca51397d0 netns: Fix arbitrary net_device-s corruptions on net_ns stop.
When a net namespace is destroyed, some devices (those, not killed
on ns stop explicitly) are moved back to init_net.

The problem is that this net_ns change has one point of failure -
the __dev_alloc_name() may be called if a name collision occurs (and
this is easy to trigger). This allocator performs a likely-to-fail
GFP_ATOMIC allocation to find a suitable number. Other possible 
conditions that may cause error (for device being ns local or not
registered) are always false in this case.

So, when this call fails, the device is unregistered. But this is
*not* the right thing to do, since after this the device may be
released (and kfree-ed) improperly. E. g. bridges require more
actions (sysfs update, timer disarming, etc.), some other devices 
want to remove their private areas from lists, etc.

I. e. arbitrary use-after-free cases may occur.

The proposed fix is the following: since the only reason for the
dev_change_net_namespace to fail is the name generation, we may
give it a unique fall-back name w/o %d-s in it - the dev<ifindex>
one, since ifindexes are still unique.

So make this change, raise the failure-case printk loglevel to 
EMERG and replace the unregister_netdevice call with BUG().

[ Use snprintf() -DaveM ]
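
A userspace sketch of building such a fall-back name with snprintf(). The
helper name fallback_name() and its return convention are assumptions for
illustration; IFNAMSIZ is 16 as in the kernel headers.

```c
#include <stdio.h>

#define IFNAMSIZ 16   /* interface name buffer size, including the NUL */

/*
 * Hypothetical helper: write "dev<ifindex>" into buf.
 * Returns 0 on success, -1 if the name did not fit.
 */
int fallback_name(char *buf, size_t len, int ifindex)
{
    int n = snprintf(buf, len, "dev%d", ifindex);

    /* snprintf() reports the length it wanted to write, so a value
     * greater than or equal to len means the output was truncated. */
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

Because the name contains no %d pattern to resolve, no search for a free
number (and so no allocation) is needed.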

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-08 01:24:25 -07:00
Patrick McHardy
f3261aff35 netfilter: Kconfig: default DCCP/SCTP conntrack support to the protocol config values
When conntrack and DCCP/SCTP protocols are enabled, chances are good
that people also want DCCP/SCTP conntrack and NAT support.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-08 01:16:04 -07:00
Patrick McHardy
ef75d49f11 netfilter: nf_conntrack_sip: restrict RTP expect flushing on error to last request
Some Inovaphone PBXs exhibit very strange behaviour: when dialing for
example "123", the device sends INVITE requests for "1", "12" and
"123" back to back.  The first requests will elicit error responses
from the receiver, causing the SIP helper to flush the RTP
expectations even though we might still see a positive response.

Note the sequence number of the last INVITE request that contained a
media description and only flush the expectations when receiving a
negative response for that sequence number.
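
The rule can be sketched as a small state machine. All names here
(struct sip_sketch, on_invite_with_media, on_error_response) are hypothetical
and stand in for the SIP helper's internals.

```c
#include <stdbool.h>

/* Hypothetical per-connection state. */
struct sip_sketch {
    unsigned int invite_cseq;   /* CSeq of the last INVITE with media */
    bool have_expectations;     /* RTP expectations currently installed */
};

/* Called for every INVITE request that carries a media description. */
void on_invite_with_media(struct sip_sketch *s, unsigned int cseq)
{
    s->invite_cseq = cseq;
    s->have_expectations = true;
}

/*
 * Called for a negative response. Flushes only when the response answers
 * the last recorded INVITE; returns true if a flush happened.
 */
bool on_error_response(struct sip_sketch *s, unsigned int cseq)
{
    if (cseq != s->invite_cseq)
        return false;           /* stale error for an earlier INVITE */

    s->have_expectations = false;
    return true;
}
```

In the "1", "12", "123" case, the errors for the first two INVITEs no longer
match the recorded sequence number, so the expectations survive until the
response to the third one arrives.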

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-08 01:15:21 -07:00
J.H.M. Dassen (Ray)
c67fa02799 net/ipv4: correct RFC 1122 section reference in comment
RFC 1122 does not have a section 3.1.2.2. The requirement to silently
discard datagrams with a bad checksum is in section 3.2.1.2 instead.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=10611

Signed-off-by: J.H.M. Dassen (Ray) <jdassen@debian.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-08 01:11:04 -07:00
Ilpo Järvinen
62ab222783 tcp FRTO: SACK variant is errorneously used with NewReno
Note: there's actually another bug in FRTO's SACK variant, which
is causing the failure in the NewReno case because of the error
that's fixed here. I'll fix the SACK case separately (it's
a separate bug really, though related, but in order to fix that
I need to audit tp->snd_nxt usage a bit).

There were two places where SACK variant of FRTO is getting
incorrectly used even if SACK wasn't negotiated by the TCP flow.
This leads to incorrect setting of frto_highmark with NewReno
if a previous recovery was interrupted by another RTO.

An eventual fallback to conventional recovery then incorrectly
considers one or a couple of segments as forward transmissions
though they weren't, which then are not LOST marked during
fallback, making them "non-retransmittable" until the next RTO.
In a bad case, those segments are really lost and are the only
ones left in the window. Thus TCP needs another RTO to continue.
The next FRTO, however, could again repeat the same events
making the progress of the TCP flow extremely slow.

In order for these events to occur at all, FRTO must occur
again in FRTO's step 3 while the key segments must be lost as
well, which is not too likely in practice. It seems to happen most
frequently with some small devices such as network printers
that *seem* to accept TCP segments only in-order. In cases
where key segments weren't lost, things get automatically
resolved because those wrongly marked segments don't need to be
retransmitted in order to continue.

I found a reproducer after digging up relevant reports (a few
reports in total, none on netdev or lkml that I know of); some
cases seemed to indicate middlebox issues, which now seems
to be a false assumption some people had made. Bugzilla
#10063 _might_ be related. Damon L. Chesser <damon@damtek.com>
had a reproducible case and was kind enough to tcpdump it
for me. With the tcpdump log it was quite trivial to figure
out.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-08 01:09:11 -07:00
Linus Torvalds
4880d10927 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  net_cls_act: act_simple dont ignore realloc code
  iwlwifi: make IWLWIFI a tristate
  Revert "atm: Do not free already unregistered net device."
  dccp: return -EINVAL on invalid feature length
  irda: fix !PNP support for drivers/net/irda/smsc-ircc2.c
  irda: fix !PNP support in drivers/net/irda/nsc-ircc.c
  net_cls_act: Make act_simple use of netlink policy.
  ip: Use inline function dst_metric() instead of direct access to dst->metric[]
  ip: Make use of the inline function dst_metric_locked()
  atm: Bad locking on br2684_devs modifications.
  atm: Do not free already unregistered net device.
  mac80211: Do not free net device after it is unregistered.
  bridge: Consolidate error paths in br_add_bridge().
  bridge: Net device leak in br_add_bridge().
  niu: Fix probing regression for maramba on-board chips.
  lapbeth: Release ->ethdev when unregistering device.
  xfrm: convert empty xfrm_audit_* macros to functions
  net: Fix useless comment reference loop.
  sch_htb: remove from event queue in htb_parent_to_leaf()
2008-05-06 07:49:20 -07:00
Jamal Hadi Salim
9d1045ad68 net_cls_act: act_simple dont ignore realloc code
The result of reallocating the policy data was being ignored, although
the reallocation could fail.
Simplify so that there is no need for reallocating.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-06 00:10:24 -07:00
David S. Miller
5f6b1ea41b Revert "atm: Do not free already unregistered net device."
This reverts commit 65e4113684.

Unlike the other cases Pavel fixed, this case did not
setup a netdev->destructor of free_netdev, therefore this
change was not correct.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-06 00:00:16 -07:00
Chris Wright
19443178fb dccp: return -EINVAL on invalid feature length
dccp_feat_change() validates length and on error is returning 1.
This happens to work since call chain is checking for 0 == success,
but this is returned to userspace, so make it a real error value.
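
A sketch of the convention being applied, with a hypothetical validator and a
made-up length limit (neither is taken from dccp_feat_change()): failures are
reported as negative errno values so that the same number is meaningful both
to in-kernel callers and to userspace.

```c
#include <errno.h>
#include <stddef.h>

#define FEAT_MAX_LEN_SKETCH 16   /* made-up limit, for illustration only */

/* Hypothetical length check: 0 on success, -EINVAL on a bad length. */
int feat_check_len_sketch(size_t len)
{
    if (len == 0 || len > FEAT_MAX_LEN_SKETCH)
        return -EINVAL;          /* a real error value, not a bare 1 */
    return 0;
}
```

A bare 1 passes a "non-zero means failure" check but carries no errno
meaning once it reaches userspace.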

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-05 13:50:24 -07:00
Jamal Hadi Salim
fa1b1cff3d net_cls_act: Make act_simple use of netlink policy.
Convert to netlink helpers by using netlink policy validation.
As a side effect fixes a leak.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-05 00:22:35 -07:00
Satoru SATOH
5ffc02a158 ip: Use inline function dst_metric() instead of direct access to dst->metric[]
Some functions read the value of dst->metric[THE_METRIC-1]
directly instead of using the inline function "dst_metric" defined in
net/dst.h.

The following patch changes them to use the inline function
consistently.

Signed-off-by: Satoru SATOH <satoru.satoh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-04 22:14:42 -07:00
Satoru SATOH
0bbeafd011 ip: Make use of the inline function dst_metric_locked()
Signed-off-by: Satoru SATOH <satoru.satoh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-04 22:12:43 -07:00
Pavel Emelyanov
1e0ba0060f atm: Bad locking on br2684_devs modifications.
The list_del happens under read-locked devs_lock.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-04 18:00:36 -07:00
Pavel Emelyanov
65e4113684 atm: Do not free already unregistered net device.
Both br2684_push and br2684_exit do so, but unregister_netdev()
releases the device itself.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-04 18:00:05 -07:00
Pavel Emelyanov
339a7c41c9 mac80211: Do not free net device after it is unregistered.
The error path in ieee80211_register_hw() may call the unregister_netdev()
and right after it - the free_netdev(), which is wrong, since the
unregister releases the device itself.

So the proposed fix is to NULL the local->mdev after unregister is done
and check this before calling free_netdev().

I checked - no code uses the local->mdev after unregister in this error
path (but even if some did this would be a BUG).
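
The pattern can be sketched in userspace as below. The types and functions
(fake_dev, fake_local, fake_unregister, error_path) are hypothetical, and
free() stands in for both the release done by unregistering and the explicit
free_netdev() call.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the device and its owner. */
struct fake_dev { int unused; };
struct fake_local { struct fake_dev *mdev; };

/* In this sketch, unregistering releases the device itself. */
void fake_unregister(struct fake_dev *dev)
{
    free(dev);
}

/* Error path: free the device only if unregistering has not done so. */
void error_path(struct fake_local *local, int was_registered)
{
    if (was_registered) {
        fake_unregister(local->mdev);
        local->mdev = NULL;      /* already released, forget the pointer */
    }

    if (local->mdev) {           /* still owned by us: release it here */
        free(local->mdev);
        local->mdev = NULL;
    }
}
```

Clearing the pointer makes the second release a no-op instead of a double
free.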

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-04 17:59:30 -07:00
Pavel Emelyanov
e340a90e6e bridge: Consolidate error paths in br_add_bridge().
This actually had to be merged with the patch #1, but I decided not to
mix two changes in one patch.

There are already two calls to free_netdev() in there, so merge them
into one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-04 17:58:07 -07:00
Pavel Emelyanov
c37aa90b04 bridge: Net device leak in br_add_bridge().
In case the register_netdevice() call fails the device is leaked,
since the out: label is just rtnl_unlock()+return.

Free the device.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-04 17:57:29 -07:00
Johannes Berg
c800578510 net: Fix useless comment reference loop.
include/linux/skbuff.h says:
        /* These elements must be at the end, see alloc_skb() for details.  */

net/core/skbuff.c says:
	* See comment in sk_buff definition, just before the 'tail' member

This patch contains my guess as to the actual reason rather than a
dead comment reference loop.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-03 20:56:42 -07:00
Jarek Poplawski
3ba08b00e0 sch_htb: remove from event queue in htb_parent_to_leaf()
A class is not removed from the event queue when it changes from
parent to leaf, which can cause corruption of this rb tree. This
patch fixes a bug introduced by my patch "sch_htb: turn intermediate
classes into leaves", commit 160d5e10f8.

Many thanks to Jan 'yanek' Bortl for finding a way to reproduce this
rare bug and narrowing the test case, which made a proper
diagnosis possible.

This patch is recommended for all kernels starting from 2.6.20.

Reported-and-tested-by: Jan 'yanek' Bortl <yanek@ya.bofh.cz>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-03 20:46:29 -07:00
Linus Torvalds
afa26be86b Merge git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt
* git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt:
  clocksource: allow read access to available/current_clocksource
  clocksource: Fix permissions for available_clocksource
  hrtimer: remove duplicate helper function
2008-05-03 13:51:10 -07:00
Linus Torvalds
4f9faaace2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
  rose: Wrong list_lock argument in rose_node seqops
  netns: Fix reassembly timer to use the right namespace
  netns: Fix device renaming for sysfs
  bnx2: Update version to 1.7.5.
  bnx2: Update RV2P firmware for 5709.
  bnx2: Zero out context memory for 5709.
  bnx2: Fix register test on 5709.
  bnx2: Fix remote PHY initial link state.
  bnx2: Refine remote PHY locking.
  bridge: forwarding table information for >256 devices
  tg3: Update version to 3.92
  tg3: Add link state reporting to UMP firmware
  tg3: Fix ethtool loopback test for 5761 BX devices
  tg3: Fix 5761 NVRAM sizes
  tg3: Use constant 500KHz MI clock on adapters with a CPMU
  hci_usb.h: fix hard-to-trigger race
  dccp: ccid2.c, ccid3.c use clamp(), clamp_t()
  net: remove NR_CPUS arrays in net/core/dev.c
  net: use get/put_unaligned_* helpers
  bluetooth: use get/put_unaligned_* helpers
  ...
2008-05-03 10:18:21 -07:00
Oliver Hartkopp
4346f65426 hrtimer: remove duplicate helper function
The helper function hrtimer_callback_running() is used in
kernel/hrtimer.c as well as in the updated net/can/bcm.c which now
supports hrtimers. Moving the helper function to hrtimer.h removes the
duplicate definition in the C-files.

Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-03 18:11:48 +02:00
Bernard Pidoux
f37f2c62a2 rose: Wrong list_lock argument in rose_node seqops
In rose_node_start() as well as in rose_node_stop() __acquires() and
spin_lock_bh() were wrongly passing rose_neigh_list_lock instead of
rose_node_list_lock arguments.

Signed-off-by: Bernard Pidoux <f6bvp@amsat.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 17:03:22 -07:00
Daniel Lezcano
4ac2ccd016 netns: Fix reassembly timer to use the right namespace
This trivial fix retrieves the network namespace from the frag queue
and uses it to get the network device in the right namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 17:02:03 -07:00
Daniel Lezcano
aaf8cdc34d netns: Fix device renaming for sysfs
When a netdev is moved across namespaces with the
'dev_change_net_namespace' function, the 'device_rename' function is
used to fixup kobject and refresh the sysfs tree. The device_rename
function will call kobject_rename and this one will check if there is
an object with the same name and this is the case because we are
renaming the object with the same name.

The use of 'device_rename' seems wrong to me because we usually don't
rename the device but just move it across namespaces. As we just want to
do a mini "netdev_[un]register", IMO the functions
'netdev_[un]register_kobject' should be used instead, as for a usual
network device [un]registering.

This patch replaces device_rename with netdev_unregister_kobject,
followed by netdev_register_kobject.

The netdev_register_kobject will call device_initialize and will raise
a warning indicating the device was already initialized. In order to
fix that, I split the device initialization into a separate function
and use it together with 'netdev_register_kobject' into
register_netdevice. So we can safely call 'netdev_register_kobject' in
'dev_change_net_namespace'.

This fix will allow proper use of the per-namespace sysfs which is
coming from the -mm tree.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 17:00:58 -07:00
Stephen Hemminger
ae4f8fca40 bridge: forwarding table information for >256 devices
The forwarding table binary interface (my bad choice) only exposes
the first 8 bits of the port number. The bridge code was limited to
256 ports at the time, but now the kernel supports up to 1024 ports, so
the upper bits are lost when doing:

   brctl showmacs

The fix is to squeeze the extra bits into a small hole left in the data
structure, to maintain binary compatibility.
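
The splitting can be sketched as follows. The struct and function names are
hypothetical and show only the two port fields, not the full layout of the
binary interface.

```c
#include <stdint.h>

/* Hypothetical excerpt: the original 8-bit field plus the extra byte. */
struct fdb_port_sketch {
    uint8_t port_no;   /* low 8 bits, where old tools already look */
    uint8_t port_hi;   /* upper bits, stored in the former hole */
};

/* Split a port number across the two fields. */
void fdb_port_pack(struct fdb_port_sketch *e, uint16_t port)
{
    e->port_no = (uint8_t)(port & 0xff);
    e->port_hi = (uint8_t)(port >> 8);
}

/* Rebuild the full port number from the two fields. */
uint16_t fdb_port_unpack(const struct fdb_port_sketch *e)
{
    return (uint16_t)(e->port_no | (e->port_hi << 8));
}
```

A reader that knows only the low byte still gets the same value as before
for ports below 256, which is what keeps the layout compatible.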

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 16:53:33 -07:00
Harvey Harrison
84994e16f2 dccp: ccid2.c, ccid3.c use clamp(), clamp_t()
Makes the intention of the nested min/max clear.
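
For illustration, here are the two spellings side by side as plain int
functions. These are hypothetical helpers, not the kernel's type-checking
macros, and both assume lo <= hi.

```c
/* Nested form: min(max(val, lo), hi), which hides the intent. */
int clamp_nested(int val, int lo, int hi)
{
    int m = val > lo ? val : lo;   /* max(val, lo) */

    return m < hi ? m : hi;        /* min(m, hi) */
}

/* Named form: restrict val to the range [lo, hi]. */
int clamp_int(int val, int lo, int hi)
{
    if (val < lo)
        return lo;
    if (val > hi)
        return hi;
    return val;
}
```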

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 16:44:07 -07:00
Mike Travis
0c0b0aca66 net: remove NR_CPUS arrays in net/core/dev.c
Remove the fixed size channels[NR_CPUS] array in net/core/dev.c and
dynamically allocate array based on nr_cpu_ids.
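
A userspace sketch of the same idea, with hypothetical names: the table is
sized from a CPU count known at run time instead of a compile-time maximum.

```c
#include <stdlib.h>

/* Hypothetical per-CPU channel table sized at run time. */
struct chan_table_sketch {
    size_t nr;         /* number of slots, one per possible CPU */
    void **channels;   /* dynamically allocated, zero-initialised */
};

/* Allocate nr_cpu_ids slots; returns 0 on success, -1 on failure. */
int chan_table_init(struct chan_table_sketch *t, size_t nr_cpu_ids)
{
    t->nr = 0;
    t->channels = calloc(nr_cpu_ids, sizeof(*t->channels));
    if (!t->channels)
        return -1;
    t->nr = nr_cpu_ids;
    return 0;
}

/* Release the table. */
void chan_table_free(struct chan_table_sketch *t)
{
    free(t->channels);
    t->channels = NULL;
    t->nr = 0;
}
```

On a machine with few CPUs, this avoids reserving space for every CPU the
kernel was configured to support.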

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 16:43:08 -07:00
Harvey Harrison
d3e2ce3bcd net: use get/put_unaligned_* helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 16:26:16 -07:00
Harvey Harrison
8398531939 bluetooth: use get/put_unaligned_* helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 16:25:46 -07:00
Harvey Harrison
260ffeed3f irda: use get_unaligned_* helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 16:21:52 -07:00
Arjan van de Ven
b4192bbd85 net: Add a WARN_ON_ONCE() to the transmit timeout function
WARN_ON_ONCE() gives a stack trace including the full module list.
Having this in the kernel dump for the timeout case in the
generic netdev watchdog will help us see more quickly which driver
is involved. It also allows us to collect statistics
and patterns in terms of which drivers have this event occurring.

Suggested by Andrew Morton

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 16:21:07 -07:00
Ilpo Järvinen
50aab54f30 net: Add missing braces to multi-statement if()s
One finds all kinds of crazy things with some shell pipelining.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 16:20:10 -07:00
Denis V. Lunev
8b169240e2 netfilter: assign PDE->data before gluing PDE into /proc tree
Replace proc_net_fops_create with proc_create_data.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 04:11:52 -07:00
Denis V. Lunev
52c0e111fa netfilter: assign PDE->fops before gluing PDE into /proc tree
Replace create_proc_entry with proc_create, which was created specifically for this purpose.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 04:10:57 -07:00
Denis V. Lunev
84841c3c6c ipv4: assign PDE->data before gluing PDE into /proc tree
The check for PDE->data != NULL becomes useless after the replacement
of proc_net_fops_create with proc_create_data.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 04:10:08 -07:00
Denis V. Lunev
1d3faa390d vlan: assign PDE->data before gluing PDE into /proc tree
Simply replace proc_create and the subsequent data assignment with proc_create_data.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 04:09:11 -07:00
Denis V. Lunev
0c89652a74 atm: assign PDE->data before gluing PDE into /proc tree
Simply replace proc_create and the subsequent data assignment with proc_create_data.
proc_atm_dev_ops holds the proper reference.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 04:08:30 -07:00
Denis V. Lunev
0bb53a66fe ipv6: assign PDE->data before gluing PDE into /proc tree
Simply replace proc_create and the subsequent data assignment with proc_create_data.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 02:46:55 -07:00
Denis V. Lunev
5efdccbcda net: assign PDE->data before gluing PDE into /proc tree
Simply replace proc_create and the subsequent data assignment with proc_create_data.
Additionally, there is no need to assign NULL to PDE->data after creation;
the generic /proc code has already done this for us.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 02:46:22 -07:00
Denis V. Lunev
6e79d85d9a netfilter: assign PDE->data before gluing PDE into /proc tree
Simply replace proc_create and the subsequent data assignment with proc_create_data.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 02:45:42 -07:00
Denis V. Lunev
e7fe23363b sunrpc: assign PDE->data before gluing PDE into /proc tree
Simply replace proc_create and the subsequent data assignment with proc_create_data.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-02 02:44:36 -07:00
Roman Zippel
6f6d6a1a6a rename div64_64 to div64_u64
Rename div64_64 to div64_u64 to make it consistent with the other divide
functions, so it clearly includes the type of the divide.  Move its definition
to math64.h as currently no architecture overrides the generic implementation.
They can still override it of course, but the duplicated declarations are
avoided.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-05-01 08:03:58 -07:00
Harvey Harrison
ab59859de1 net: fix returning void-valued expression warnings
drivers/net/8390.c:37:2: warning: returning void-valued expression
drivers/net/bnx2.c:1635:3: warning: returning void-valued expression
drivers/net/xen-netfront.c:1806:2: warning: returning void-valued expression
net/ipv4/tcp_hybla.c:105:3: warning: returning void-valued expression
net/ipv4/tcp_vegas.c:171:3: warning: returning void-valued expression
net/ipv4/tcp_veno.c:123:3: warning: returning void-valued expression
net/sysctl_net.c:85:2: warning: returning void-valued expression

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-01 02:47:38 -07:00
David S. Miller
c2a3b23345 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6
2008-05-01 02:06:32 -07:00
Linus Torvalds
ccc7518415 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  ipv6: Compilation fix for compat MCAST_MSFILTER sockopts.
2008-04-30 20:13:22 -07:00
Harvey Harrison
17f830459d mac80211: incorrect shift direction
Looks like 5d2cdcd4e8 ("mac80211: get a
TKIP phase key from skb") got the shifts wrong.

Noticed by sparse:
net/mac80211/tkip.c:234:25: warning: right shift by bigger than source value
net/mac80211/tkip.c:235:25: warning: right shift by bigger than source value
net/mac80211/tkip.c:236:25: warning: right shift by bigger than source value
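
To show why sparse flags this pattern, here is a hedged sketch; it is not the code from tkip.c, and the function names are invented. When a 32-bit value is assembled from individual bytes, shifting each byte right by 8 or more always yields zero, so only the first byte survives. Shifting left places each byte where it belongs.

```c
#include <stdint.h>

/* Wrong direction: an 8-bit value shifted right by 8, 16 or 24 is
 * always 0, so the result is just p[0]. */
static uint32_t le32_wrong(const uint8_t *p)
{
	return p[0] | (p[1] >> 8) | (p[2] >> 16) | (p[3] >> 24);
}

/* Right direction: widen each byte, then shift it left into place. */
static uint32_t le32_right(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}
```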

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-30 20:34:26 -04:00
Johannes Berg
636c5d488b mac80211: insert WDS peer after adding interface
This reorders the open code so that WDS peer STA info entries
are added after the corresponding interface is added to the
driver so that driver callbacks aren't invoked out of order.
Also make any master device startup fatal.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-30 20:34:26 -04:00
Johannes Berg
e94e106831 mac80211: don't allow invalid WDS peer addresses
Rather than just disallowing the zero address, disallow all
invalid ones.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-30 20:34:26 -04:00
Johannes Berg
8b808bf29b mac80211: assign conf.beacon_control for mesh
Drivers can rightfully assume that they get a beacon_control
if the beacon is set.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-30 20:34:26 -04:00
Luis Carlos Cobo
51ceddade0 mac80211: use 4-byte mesh sequence number
This follows the new 802.11s/D2.0 draft.

Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-04-30 20:34:26 -04:00
Pavel Emelyanov
8099179031 ipv6: Compilation fix for compat MCAST_MSFILTER sockopts.
The last hunk from the commit dae50295 (ipv4/ipv6 compat: Fix SSM
applications on 64bit kernels.) escaped from the compat_ipv6_setsockopt
to the ipv6_getsockopt (I guess due to patch smartness wrt searching
for context) thus breaking 32-bit and 64-bit-without-compat compilation.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-30 14:49:54 -07:00
Linus Torvalds
95dfec6ae1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (53 commits)
  tcp: Overflow bug in Vegas
  [IPv4] UFO: prevent generation of chained skb destined to UFO device
  iwlwifi: move the selects to the tristate drivers
  ipv4: annotate a few functions __init in ipconfig.c
  atm: ambassador: vcc_sf semaphore to mutex
  MAINTAINERS: The socketcan-core list is subscribers-only.
  netfilter: nf_conntrack: padding breaks conntrack hash on ARM
  ipv4: Update MTU to all related cache entries in ip_rt_frag_needed()
  sch_sfq: use del_timer_sync() in sfq_destroy()
  net: Add compat support for getsockopt (MCAST_MSFILTER)
  net: Several cleanups for the setsockopt compat support.
  ipvs: fix oops in backup for fwmark conn templates
  bridge: kernel panic when unloading bridge module
  bridge: fix error handling in br_add_if()
  netfilter: {nfnetlink,ip,ip6}_queue: fix skb_over_panic when enlarging packets
  netfilter: x_tables: fix net namespace leak when reading /proc/net/xxx_tables_names
  netfilter: xt_TCPOPTSTRIP: signed tcphoff for ipv6_skip_exthdr() retval
  tcp: Limit cwnd growth when deferring for GSO
  tcp: Allow send-limited cwnd to grow up to max_burst when gso disabled
  [netdrvr] gianfar: Determine TBIPA value dynamically
  ...
2008-04-30 08:45:48 -07:00