Commit Graph

3010 Commits

Author SHA1 Message Date
Philip Craig
5c170a09d9 [NETFILTER]: fix format specifier for netfilter log targets
The prefix argument for nf_log_packet is a format specifier,
so don't pass the user defined string directly to it.

Signed-off-by: Philip Craig <philipc@snapgear.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-19 02:15:47 -07:00
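The format-specifier problem described in 5c170a09d9 above can be illustrated outside the kernel. The following is only a sketch: `format_prefix_safe` is a hypothetical helper, not part of the netfilter API, and it uses the standard `snprintf` in place of any kernel logging call. The user string is passed as an argument to a fixed `"%s"` format, so any conversion characters it contains are copied literally.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy a user-supplied prefix into a buffer.
 * The unsafe variant would be snprintf(out, out_len, user_prefix),
 * which lets the user string be interpreted as a format.
 */
static int format_prefix_safe(char *out, size_t out_len, const char *user_prefix)
{
    /* The user string is data for "%s", never the format itself. */
    return snprintf(out, out_len, "%s", user_prefix);
}
```

With this shape, a prefix such as `"drop %s %n"` ends up in the buffer unchanged instead of being expanded.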
Jesper Juhl
493e2428aa [NETFILTER]: Fix memory leak in ipt_recent
The Coverity checker spotted that we may leak 'hold' in
net/ipv4/netfilter/ipt_recent.c::checkentry() when the following
is true:
  if (!curr_table->status_proc) {
    ...
    if(!curr_table) {
    ...
      return 0;  <-- here we leak.
Simply moving an existing vfree(hold); up a bit avoids the possible leak.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-19 02:15:13 -07:00
John W. Linville
5dd8816aeb Merge branch 'from-linus' into upstream 2006-05-17 14:51:24 -04:00
Angelo P. Castellani
8872d8e1c4 [TCP]: reno sacked_out count fix
From: "Angelo P. Castellani" <angelo.castellani+lkml@gmail.com>

Using NewReno, if a sk_buff is timed out and is accounted as lost_out,
it should also be removed from the sacked_out.

This is necessary because recovery using NewReno fast retransmit could
take many RTTs, and the sk_buff RTO can expire without the segment
actually being lost.

left_out = sacked_out + lost_out
in_flight = packets_out - left_out + retrans_out

Using NewReno without this patch, on very large network losses,
left_out becomes bigger than packets_out + retrans_out (!!).

For this reason the unsigned integer in_flight wraps around to 2^32 - something.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-16 21:42:11 -07:00
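The two formulas quoted in 8872d8e1c4 above can be checked with plain unsigned arithmetic. This is a minimal sketch, assuming 32-bit counters; `in_flight` is a hypothetical stand-alone function that only mirrors the formulas from the message, not the kernel's own code.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the two formulas in the commit message. */
static uint32_t in_flight(uint32_t packets_out, uint32_t sacked_out,
                          uint32_t lost_out, uint32_t retrans_out)
{
    uint32_t left_out = sacked_out + lost_out;

    /* Unsigned arithmetic wraps modulo 2^32 once left_out grows too large. */
    return packets_out - left_out + retrans_out;
}
```

For example, with packets_out = 10, sacked_out = 8, lost_out = 5 and retrans_out = 1, left_out is 13 and the result wraps to 2^32 - 2 rather than going negative.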
Alexey Dobriyan
d8fd0a7316 [IPV6]: Endian fix in net/ipv6/netfilter/ip6t_eui64.c:match().
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-16 15:24:41 -07:00
Adrian Bunk
6599519e9c [TR]: Remove an unused export.
This patch removes the unused EXPORT_SYMBOL(tr_source_route).

(Note, the usage in net/llc/llc_output.c can't be modular.)

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-16 15:23:40 -07:00
Alexey Dobriyan
4ac396c046 [IPX]: Correct return type of ipx_map_frame_type().
Casting BE16 to int and back may or may not work. Correct, to be sure.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-16 15:17:49 -07:00
Alexey Dobriyan
53d42f5412 [IPX]: Correct argument type of ipxrtr_delete().
A single caller passes __u32. Inside function "net" is compared with
__u32 (__be32 really, just wasn't annotated).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-16 15:07:28 -07:00
Stephen Hemminger
338f7566e5 [PKT_SCHED]: Potential jiffy wrap bug in dev_watchdog().
There is a potential jiffy wraparound bug in the transmit watchdog
that is easily avoided by using time_after().

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-16 15:02:12 -07:00
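The wraparound issue in 338f7566e5 above comes down to how two tick counters are compared. The sketch below assumes a 32-bit counter for brevity (the kernel's jiffies is an unsigned long), and both function names are hypothetical; `wrap_safe_after` follows the general idea behind time_after(), which is to test the sign of the difference.

```c
#include <stdint.h>

/* Naive comparison: gives the wrong answer once the deadline wraps past zero. */
static int naive_after(uint32_t now, uint32_t deadline)
{
    return now > deadline;
}

/*
 * Wrap-safe comparison: true if now is later than deadline, provided the
 * two values are less than 2^31 ticks apart. Relies on the usual
 * two's-complement conversion from uint32_t to int32_t.
 */
static int wrap_safe_after(uint32_t now, uint32_t deadline)
{
    return (int32_t)(deadline - now) < 0;
}
```

If now is 3 ticks before the counter wraps and the deadline is 10 ticks later, the deadline value is 7; the naive test reports the deadline as passed, while the wrap-safe test does not.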
Simon Kelley
bd89efc532 [NEIGH]: Fix IP-over-ATM and ARP interaction.
The classical IP over ATM code maintains its own IPv4 <-> <ATM stuff>
ARP table, using the standard neighbour-table code. The
neigh_table_init function adds this neighbour table to a linked list
of all neighbour tables, which is used by the functions neigh_delete(),
neigh_add() and neightbl_set(), all called by the netlink code.

Once the ATM neighbour table is added to the list, there are two
tables with family == AF_INET there, and ARP entries sent via netlink
go into the first table with matching family. This is indeterminate
and often wrong.

To see the bug, on a kernel with CLIP enabled, create a standard IPv4
ARP entry by pinging an unused address on a local subnet. Then attempt
to complete that entry by doing

ip neigh replace <ip address> lladdr <some mac address> nud reachable

Looking at the ARP tables by using 

ip neigh show

will reveal two ARP entries for the same address. One of these can be
found in /proc/net/arp, and the other in /proc/net/atm/arp.

This patch adds a new function, neigh_table_init_no_netlink() which
does everything neigh_table_init() does, except add the table to
the netlink all-arp-tables chain. In addition neigh_table_init() has a
check that all tables on the chain have a distinct address family.
The init call in clip.c is changed to call
neigh_table_init_no_netlink().

Since ATM ARP tables are rather more complicated than can currently be
handled by the available rtattrs in the netlink protocol, no
functionality is lost by this patch, and non-ATM ARP manipulation via
netlink is rescued. A more complete solution would involve a rtattr
for ATM ARP entries and some way for the netlink code to give
neigh_add and friends more information than just address family with
which to find the correct ARP table.

[ I've changed the assertion checking in neigh_table_init() to not
  use BUG_ON() while holding neigh_tbl_lock.  Instead we remember that
  we found an existing tbl with the same family, and after dropping
  the lock we'll give a diagnostic kernel log message and a stack dump.
  -DaveM ]

Signed-off-by: Simon Kelley <simon@thekelleys.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-12 14:56:08 -07:00
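The ambiguity described in bd89efc532 above can be modelled with a simple linked list. This is a simplified sketch, not the kernel's data structures: `struct nbr_table`, `find_by_family` and `count_family` are hypothetical names, and the family value 2 used in the usage note is only an illustrative constant.

```c
#include <stddef.h>

/* Hypothetical, simplified model of a chain of neighbour tables. */
struct nbr_table {
    int family;
    const char *name;
    struct nbr_table *next;
};

/* Return the first table whose family matches, as a family-only lookup would. */
static const struct nbr_table *find_by_family(const struct nbr_table *head, int family)
{
    for (; head != NULL; head = head->next)
        if (head->family == family)
            return head;
    return NULL;
}

/* Count tables with the given family; more than one makes the lookup ambiguous. */
static int count_family(const struct nbr_table *head, int family)
{
    int n = 0;

    for (; head != NULL; head = head->next)
        if (head->family == family)
            n++;
    return n;
}
```

With two tables of family 2 on the chain, `find_by_family` returns whichever was linked first, which is the behaviour the message calls indeterminate; a count greater than one is the condition the added check in neigh_table_init() is meant to catch.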
Patrick McHardy
210525d65d [NET_SCHED]: HFSC: fix thinko in hfsc_adjust_levels()
When deleting the last child the level of a class should drop to zero.

Noticed by Andreas Mueller <andreas@stapelspeicher.org>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-11 12:22:03 -07:00
Alexey Kuznetsov
b0013fd47b [IPV6]: skb leakage in inet6_csk_xmit
inet6_csk_xmit does not free skb when routing fails.

Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-10 13:24:38 -07:00
Stephen Hemminger
ac05202e8b [BRIDGE]: Do sysfs registration inside rtnl.
Now that netdevice sysfs registration is done as part of
register_netdevice, bridge code no longer has to be tricky when adding
its kobjects to bridges.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-10 13:21:53 -07:00
Stephen Hemminger
b17a7c179d [NET]: Do sysfs registration as part of register_netdevice.
The last step of netdevice registration was being done by a delayed
call, but because it was delayed, it was impossible to return any error
code if the class_device registration failed.

Side effects:
 * one state in registration process is unnecessary.
 * register_netdevice can sleep inside class_device registration/hotplug
 * code in netdev_run_todo only does unregistration so it is simpler.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-10 13:21:17 -07:00
Herbert Xu
8c1056839e [NET] linkwatch: Handle jiffies wrap-around
The test used in the linkwatch does not handle wrap-arounds correctly.
Since the intention of the code is to eliminate bursts of messages we
can afford to delay things up to a second.  Using that fact we can
easily handle wrap-arounds by making sure that we don't delay things
by more than one second.

This is based on diagnosis and a patch by Stefan Rompf.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Stefan Rompf <stefan@loplof.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-09 15:27:54 -07:00
Adrian Bunk
11766199a0 [IRDA]: Removing unused EXPORT_SYMBOLs
This patch removes the following unused EXPORT_SYMBOL's:
- irias_find_attrib
- irias_new_string_value
- irias_new_octseq_value

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Samuel Ortiz <samuel.ortiz@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-09 15:25:25 -07:00
Alan Stern
f07d5b9465 [NET]: Make netdev_chain a raw notifier.
From: Alan Stern <stern@rowland.harvard.edu>

This chain does its own locking via the RTNL semaphore, and
can also run recursively so adding a new mutex here was causing
deadlocks.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-09 15:23:03 -07:00
Wei Yongjun
63cbd2fda3 [IPV4]: ip_options_fragment() has no effect on fragmentation
Fix the pointer to the options in ip_options_fragment(). optptr was
wrongly set to point at the IPv4 header; it should point at the IPv4 options.

Signed-off-by: Wei Yongjun <weiyj@soft.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-09 15:18:50 -07:00
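The distinction in 63cbd2fda3 above, between a pointer to the IPv4 header and a pointer to the IPv4 options, can be sketched on a raw byte buffer. The helpers below are hypothetical and work on a plain `uint8_t` array rather than kernel structures; the only facts assumed are that the fixed IPv4 header is 20 bytes and that the IHL field (low nibble of the first byte) gives the header length in 32-bit words.

```c
#include <stdint.h>
#include <stddef.h>

#define IPV4_FIXED_HDR_LEN 20u  /* size of the IPv4 header without options */

/* Header length in bytes, taken from the IHL field (low nibble of byte 0). */
static size_t ipv4_hdr_len(const uint8_t *hdr)
{
    return (size_t)(hdr[0] & 0x0f) * 4u;
}

/* Options start right after the fixed 20-byte header, not at the header itself. */
static const uint8_t *ipv4_options(const uint8_t *hdr)
{
    return hdr + IPV4_FIXED_HDR_LEN;
}

/* Number of option bytes, or 0 if the header carries no options. */
static size_t ipv4_options_len(const uint8_t *hdr)
{
    size_t len = ipv4_hdr_len(hdr);

    return len > IPV4_FIXED_HDR_LEN ? len - IPV4_FIXED_HDR_LEN : 0;
}
```

Code that walks the options starting from the header pointer would read header fields as if they were option types, which is consistent with the title's observation that the function had no effect.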
Stephen Hemminger
23aee82e75 Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2006-05-08 16:01:20 -07:00
Hua Zhong
0182bd2b1e [IPV4]: Remove likely in ip_rcv_finish()
This is another result from my likely profiling tool
(dwalker@mvista.com just sent the patch of the profiling tool to
linux-kernel mailing list, which is similar to what I use).

On my system (not very busy, normal development machine within a
VMWare workstation), I see a 6/5 miss/hit ratio for this "likely".

Signed-off-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-06 18:11:39 -07:00
Stephen Hemminger
fe9925b551 [NET]: Create netdev attribute_groups with class_device_add
Atomically create attributes when class device is added. This avoids
the race between registering class_device (which generates hotplug
event), and the creation of attribute groups.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-06 17:56:03 -07:00
John Heffner
5528e568a7 [TCP]: Fix snd_cwnd adjustments in tcp_highspeed.c
Xiaoliang (David) Wei wrote:
> Hi gurus,
> 
>    I am reading the code of tcp_highspeed.c in the kernel and have a
> question on the hstcp_cong_avoid function, specifically the following
> AI part (line 136~143 in net/ipv4/tcp_highspeed.c ):
> 
>                /* Do additive increase */
>                if (tp->snd_cwnd < tp->snd_cwnd_clamp) {
>                        tp->snd_cwnd_cnt += ca->ai;
>                        if (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
>                                tp->snd_cwnd++;
>                                tp->snd_cwnd_cnt -= tp->snd_cwnd;
>                        }
>                }
> 
>    In this part, when (tp->snd_cwnd_cnt == tp->snd_cwnd),
> snd_cwnd_cnt will be -1... snd_cwnd_cnt is defined as u16, will this
> small chance of getting -1 becomes a problem?
> Shall we change it by reversing the order of the cwnd++ and cwnd_cnt -= 
> cwnd?

Absolutely correct.  Thanks.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:41:44 -07:00
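The ordering question raised in 5528e568a7 above can be reproduced with a small stand-alone model. This is a sketch under stated assumptions: `struct cwnd_state` and both step functions are hypothetical, keeping only the two fields and the u16 counter width mentioned in the quoted mail; the fixed variant simply reverses the two statements as the mail suggests.

```c
#include <stdint.h>

/* Hypothetical reduced state: only the two fields discussed in the mail. */
struct cwnd_state {
    uint32_t snd_cwnd;
    uint16_t snd_cwnd_cnt;
};

/* Order as quoted: increment the window first, then subtract the new value. */
static void ai_step_buggy(struct cwnd_state *s, uint16_t ai)
{
    s->snd_cwnd_cnt += ai;
    if (s->snd_cwnd_cnt >= s->snd_cwnd) {
        s->snd_cwnd++;
        s->snd_cwnd_cnt -= s->snd_cwnd;  /* wraps when cnt == old cwnd */
    }
}

/* Reversed order: subtract the old window, then increment it. */
static void ai_step_fixed(struct cwnd_state *s, uint16_t ai)
{
    s->snd_cwnd_cnt += ai;
    if (s->snd_cwnd_cnt >= s->snd_cwnd) {
        s->snd_cwnd_cnt -= s->snd_cwnd;
        s->snd_cwnd++;
    }
}
```

Starting from snd_cwnd = 10 and snd_cwnd_cnt = 9 with an increment of 1, the first variant leaves the counter at 65535 (the u16 representation of -1), while the second leaves it at 0.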
Ralf Baechle
f530937b2c [NETROM/ROSE]: Kill module init version kernel log messages.
These are out of date and don't tell the user anything useful.
The similar messages which IPV4 and the core networking used
to output were killed a long time ago.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:19:26 -07:00
Herbert Xu
134af34632 [DCCP]: Fix sock_orphan dead lock
Calling sock_orphan inside bh_lock_sock in dccp_close can lead to dead
locks.  For example, the inet_diag code holds sk_callback_lock without
disabling BH.  If an inbound packet arrives during that admittedly tiny
window, it will cause a dead lock on bh_lock_sock.  Another possible
path would be through sock_wfree if the network device driver frees the
tx skb in process context with BH enabled.

We can fix this by moving sock_orphan out of bh_lock_sock.

The tricky bit is to work out when we need to destroy the socket
ourselves and when it has already been destroyed by someone else.

By moving sock_orphan before the release_sock we can solve this
problem.  This is because as long as we own the socket lock its
state cannot change.

So we simply record the socket state before the release_sock
and then check the state again after we regain the socket lock.
If the socket state has transitioned to DCCP_CLOSED in the time being,
we know that the socket has been destroyed.  Otherwise the socket is
still ours to keep.

This problem was discovered by Ingo Molnar using his lock validator.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:09:13 -07:00
Stephen Hemminger
1c29fc4989 [BRIDGE]: keep track of received multicast packets
It makes sense to add this simple statistic to keep track of received
multicast packets.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:07:13 -07:00
Sridhar Samudrala
35d63edb1c [SCTP]: Fix state table entries for chunks received in CLOSED state.
Discard an unexpected chunk in CLOSED state rather than calling BUG().

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:05:23 -07:00
Sridhar Samudrala
62b08083ec [SCTP]: Fix panic's when receiving fragmented SCTP control chunks.
Use pskb_pull() to handle incoming COOKIE_ECHO and HEARTBEAT chunks that
are received as skb's with fragment list.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:04:43 -07:00
Vladislav Yasevich
672e7cca17 [SCTP]: Prevent possible infinite recursion with multiple bundled DATA.
There is a rare situation that causes lksctp to go into infinite recursion
and crash the system.  The trigger is a packet that contains at least the
first two DATA fragments of a message bundled together. The recursion is
triggered when the user data buffer is smaller than the full data message.
The problem is that we clone the skb for every fragment in the message.
When reassembling the full message, we try to link skbs from the "first
fragment" clone using the frag_list. However, since the frag_list is shared
between two clones in this rare situation, we end up setting the frag_list
pointer of the second fragment to point to itself.  This causes
sctp_skb_pull() to potentially recurse indefinitely.

Proposed solution is to make a copy of the skb when attempting to link
things using frag_list.

Signed-off-by: Vladislav Yasevich <vladsilav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:03:49 -07:00
Neil Horman
7c3ceb4fb9 [SCTP]: Allow spillover of receive buffer to avoid deadlock.
This patch fixes a deadlock situation in the receive path by allowing
temporary spillover of the receive buffer.

- If the chunk we receive has a tsn that immediately follows the ctsn,
  accept it even if we run out of receive buffer space and renege data with
  higher TSNs.
- Once we accept one chunk in a packet, accept all the remaining chunks
  even if we run out of receive buffer space.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Mark Butler <butlerm@middle.net>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:02:09 -07:00
Daniel Drake
8462fe3cd9 [PATCH] softmac: suggest per-frame-type TX rate
This patch is the first step towards rate control inside softmac.

The txrates substructure has been extended to provide
different fields for different types of packets (management/data,
unicast/multicast). These fields are updated on association to values
compatible with the access point we are associating to.

Drivers can then use the new ieee80211softmac_suggest_txrate() function
call when deciding which rate to transmit each frame at. This is
immensely useful for ZD1211, and bcm can use it too.

The user can still specify a rate through iwconfig, which is matched
for all transmissions (assuming the rate they have specified is in
the rate set required by the AP).

At a later date, we can incorporate automatic rate management into
the ieee80211softmac_recalc_txrates() function.

This patch also removes the mcast_fallback field. Sam Leffler pointed
out that this field is meaningless, because no driver will ever be
retransmitting mcast frames (they are not acked).

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05 17:10:41 -04:00
Adrian Bunk
6274115ce9 [PATCH] ieee80211_wx.c: remove dead code
Since sec->key_sizes[] is a u8, len can't be < 0.

Spotted by the Coverity checker.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05 17:10:40 -04:00
Daniel Drake
6d92f83ffa [PATCH] softmac: deauthentication implies deassociation
The 802.11 specs state that deauthenticating also implies
disassociating. This patch implements that, which improves the behaviour
of SIOCSIWMLME.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05 17:10:39 -04:00
John W. Linville
fd5226a726 Merge branch 'upstream-fixes' into upstream 2006-05-05 16:56:24 -04:00
Daniel Drake
d57336e3f2 [PATCH] softmac: make non-operational after being stopped
zd1211 with softmac and wpa_supplicant revealed an issue with softmac
and the use of workqueues. Some of the work functions actually
reschedule themselves, so this meant that there could still be
pending work after flush_scheduled_work() had been called during
ieee80211softmac_stop().

This patch introduces a "running" flag which is used to ensure that
rescheduling does not happen in this situation.

I also used this flag to ensure that softmac's hooks into ieee80211 are
non-operational once the stop operation has been started. This simply
makes softmac a little more robust, because I could crash it easily
by receiving frames in the short timeframe after shutting down softmac
and before turning off the ZD1211 radio. (ZD1211 is now fixed as well!)

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05 16:55:22 -04:00
Daniel Drake
995c99268e [PATCH] softmac: don't reassociate if user asked for deauthentication
When wpa_supplicant exits, it uses SIOCSIWMLME to request
deauthentication.  softmac then tries to reassociate without any user
intervention, which isn't the desired behaviour of this signal.

This change makes softmac only attempt reassociation if the remote
network itself deauthenticated us.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05 16:55:22 -04:00
John W. Linville
aad61439e6 Merge branch 'from-linus' into upstream 2006-05-05 16:50:23 -04:00
Patrick Caulfield
d1a6498388 [DECNET]: Fix level1 router hello
This patch fixes hello messages sent when a node is a level 1
router. Slightly contrary to the spec (maybe) VMS ignores hello
messages that do not name level2 routers that it also knows about.

So, here we simply name all the routers that the node knows about
rather than just other level1 routers.  (I hope the patch is clearer than
the description. sorry).

Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:36:23 -07:00
Herbert Xu
75c2d9077c [TCP]: Fix sock_orphan dead lock
Calling sock_orphan inside bh_lock_sock in tcp_close can lead to dead
locks.  For example, the inet_diag code holds sk_callback_lock without
disabling BH.  If an inbound packet arrives during that admittedly tiny
window, it will cause a dead lock on bh_lock_sock.  Another possible
path would be through sock_wfree if the network device driver frees the
tx skb in process context with BH enabled.

We can fix this by moving sock_orphan out of bh_lock_sock.

The tricky bit is to work out when we need to destroy the socket
ourselves and when it has already been destroyed by someone else.

By moving sock_orphan before the release_sock we can solve this
problem.  This is because as long as we own the socket lock its
state cannot change.

So we simply record the socket state before the release_sock
and then check the state again after we regain the socket lock.
If the socket state has transitioned to TCP_CLOSE in the time being,
we know that the socket has been destroyed.  Otherwise the socket is
still ours to keep.

Note that I've also moved the increment on the orphan count forward.
This may look like a problem as we're increasing it even if the socket
is just about to be destroyed where it'll be decreased again.  However,
this simply enlarges a window that already exists.  This also changes
the orphan count test by one.

Considering what the orphan count is meant to do this is no big deal.

This problem was discovered by Ingo Molnar using his lock validator.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:31:35 -07:00
Ralf Baechle
82e84249f0 [ROSE]: Eliminate HZ from ROSE kernel interfaces
Convert all ROSE sysctl time values from jiffies to ms as units.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:28:20 -07:00
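The motivation for 82e84249f0 above, and for the matching NET/ROM and AX.25 changes, is that a value expressed in jiffies means a different duration depending on HZ. The sketch below shows the arithmetic only; `ticks_to_ms` and `ms_to_ticks` are hypothetical helpers and do not reproduce the kernel's own conversion functions, and they ignore overflow for very large inputs.

```c
/* Convert a tick count to milliseconds for a given tick rate (hz ticks per second). */
static unsigned long ticks_to_ms(unsigned long ticks, unsigned long hz)
{
    return ticks * 1000ul / hz;
}

/* Convert milliseconds back to ticks for a given tick rate. */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
    return ms * hz / 1000ul;
}
```

A setting of 500 ticks is 5000 ms when HZ is 100 but only 500 ms when HZ is 1000, so exposing milliseconds keeps the user-visible value independent of the kernel configuration.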
Ralf Baechle
4d8937d0b1 [NETROM]: Eliminate HZ from NET/ROM kernel interfaces
Convert all NET/ROM sysctl time values from jiffies to ms as units.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:27:47 -07:00
Ralf Baechle
e1fdb5b396 [AX.25]: Eliminate HZ from AX.25 kernel interfaces
Convert all AX.25 sysctl time values from jiffies to ms as units.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:27:16 -07:00
Ralf Baechle
4cc7c2734e [ROSE]: Fix routing table locking in rose_remove_neigh.
The locking rule for rose_remove_neigh() is that the caller needs to
hold rose_neigh_list_lock, so we had better not take it again in
rose_remove_neigh().

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:26:20 -07:00
Ralf Baechle
70868eace5 [AX.25]: Move AX.25 symbol exports
Move AX.25 symbol exports to next to their definitions where they're
supposed to be these days.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:25:17 -07:00
Ralf Baechle
86cfcb95ec [AX25, ROSE]: Remove useless SET_MODULE_OWNER calls.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:23:48 -07:00
Ralf Baechle
3f072310d0 [AX.25]: Spelling fix
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:22:36 -07:00
Ralf Baechle
0cc5ae24af [ROSE]: Remove useless prototype for rose_remove_neigh().
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:22:01 -07:00
Patrick McHardy
7800007c1e [NETFILTER]: x_tables: don't use __copy_{from,to}_user on unchecked memory in compat layer
Noticed by Linus Torvalds <torvalds@osdl.org>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:20:27 -07:00
Jing Min Zhao
7582e9d17e [NETFILTER]: H.323 helper: Change author's email address
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:19:59 -07:00
Patrick McHardy
2354feaeb2 [NETFILTER]: NAT: silence unused variable warnings with CONFIG_XFRM=n
net/ipv4/netfilter/ip_nat_standalone.c: In function 'ip_nat_out':
net/ipv4/netfilter/ip_nat_standalone.c:223: warning: unused variable 'ctinfo'
net/ipv4/netfilter/ip_nat_standalone.c:222: warning: unused variable 'ct'

Surprisingly no complaints so far ..

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:19:26 -07:00
Patrick McHardy
4228e2a989 [NETFILTER]: H.323 helper: fix use of uninitialized data
When a Choice element contains an unsupported choice no error is returned
and parsing continues normally, but the choice value is not set and
contains data from the last parsed message. This may in turn lead to
parsing of more stale data and following crashes.

Fixes a crash triggered by testcase 0003243 from the PROTOS c07-h2250v4
testsuite following random other testcases:

CPU:    0
EIP:    0060:[<c01a9554>]    Not tainted VLI
EFLAGS: 00210646   (2.6.17-rc2 #3)
EIP is at memmove+0x19/0x22
eax: d7be0307   ebx: d7be0307   ecx: e841fcf9   edx: d7be0307
esi: bfffffff   edi: bfffffff   ebp: da5eb980   esp: c0347e2c
ds: 007b   es: 007b   ss: 0068
Process events/0 (pid: 4, threadinfo=c0347000 task=dff86a90)
Stack: <0>00000006 c0347ea6 d7be0301 e09a6b2c 00000006 da5eb980 d7be003e d7be0052
       c0347f6c e09a6d9c 00000006 c0347ea6 00000006 00000000 d7b9a548 00000000
       c0347f6c d7b9a548 00000004 e0a1a119 0000028f 00000006 c0347ea6 00000006
Call Trace:
 [<e09a6b2c>] mangle_contents+0x40/0xd8 [ip_nat]
 [<e09a6d9c>] ip_nat_mangle_tcp_packet+0xa1/0x191 [ip_nat]
 [<e0a1a119>] set_addr+0x60/0x14d [ip_nat_h323]
 [<e0ab6e66>] q931_help+0x2da/0x71a [ip_conntrack_h323]
 [<e0ab6e98>] q931_help+0x30c/0x71a [ip_conntrack_h323]
 [<e09af242>] ip_conntrack_help+0x22/0x2f [ip_conntrack]
 [<c022934a>] nf_iterate+0x2e/0x5f
 [<c025d357>] xfrm4_output_finish+0x0/0x39f
 [<c02294ce>] nf_hook_slow+0x42/0xb0
 [<c025d357>] xfrm4_output_finish+0x0/0x39f
 [<c025d732>] xfrm4_output+0x3c/0x4e
 [<c025d357>] xfrm4_output_finish+0x0/0x39f
 [<c0230370>] ip_forward+0x1c2/0x1fa
 [<c022f417>] ip_rcv+0x388/0x3b5
 [<c02188f9>] netif_receive_skb+0x2bc/0x2ec
 [<c0218994>] process_backlog+0x6b/0xd0
 [<c021675a>] net_rx_action+0x4b/0xb7
 [<c0115606>] __do_softirq+0x35/0x7d
 [<c0104294>] do_softirq+0x38/0x3f

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:17:11 -07:00
Patrick McHardy
6fd737031e [NETFILTER]: H.323 helper: fix endless loop caused by invalid TPKT len
When the TPKT len included in the packet is below the lowest valid value
of 4 an underflow occurs which results in an endless loop.

Found by testcase 0000058 from the PROTOS c07-h2250v4 testsuite.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:16:29 -07:00
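The underflow described in 6fd737031e above arises when a length smaller than the header is used to compute a payload size. The following sketch is not the H.323 helper's code: `tpkt_payload_len` is a hypothetical function, and it assumes the usual TPKT layout of a 4-byte header whose last two bytes hold a big-endian length that includes the header itself.

```c
#include <stdint.h>
#include <stddef.h>

#define TPKT_HDR_LEN 4u  /* lowest valid TPKT length, per the commit message */

/*
 * Return the payload length of the TPKT at buf, or -1 if the length field
 * is below the header size or exceeds the available bytes.
 */
static long tpkt_payload_len(const uint8_t *buf, size_t avail)
{
    size_t tpkt_len;

    if (avail < TPKT_HDR_LEN)
        return -1;

    tpkt_len = ((size_t)buf[2] << 8) | buf[3];

    /* Without this check, tpkt_len - TPKT_HDR_LEN would wrap for tpkt_len < 4. */
    if (tpkt_len < TPKT_HDR_LEN || tpkt_len > avail)
        return -1;

    return (long)(tpkt_len - TPKT_HDR_LEN);
}
```

A caller that loops over consecutive TPKTs can stop on -1 instead of advancing by a wrapped or zero-sized step.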
Patrick McHardy
e17df688f7 [NETFILTER] SCTP conntrack: fix infinite loop
fix infinite loop in the SCTP-netfilter code: check SCTP chunk size to
guarantee progress of for_each_sctp_chunk(). (all other uses of
for_each_sctp_chunk() are preceded by do_basic_checks(), so this fix
should be complete.)

Based on patch from Ingo Molnar <mingo@elte.hu>

CVE-2006-1527

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-02 17:26:39 -07:00
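The progress guarantee mentioned in e17df688f7 above can be shown with a loop over length-prefixed chunks. This is a sketch, not the conntrack code or the for_each_sctp_chunk() macro: `count_chunks` is hypothetical and assumes a 4-byte chunk header (type, flags, 16-bit big-endian length that includes the header) with chunks padded to a multiple of 4 bytes.

```c
#include <stdint.h>
#include <stddef.h>

#define CHUNK_HDR_LEN 4u

/* Count the chunks in buf, or return -1 if a chunk length is invalid. */
static int count_chunks(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int n = 0;

    while (len - off >= CHUNK_HDR_LEN) {
        size_t clen = ((size_t)buf[off + 2] << 8) | buf[off + 3];
        size_t step;

        /* A length below the header size would stall the loop at the same offset. */
        if (clen < CHUNK_HDR_LEN || clen > len - off)
            return -1;

        /* Advance by the padded length, but never past the end of the buffer. */
        step = (clen + 3u) & ~(size_t)3u;
        if (step > len - off)
            step = len - off;
        off += step;
        n++;
    }
    return n;
}
```

Because every accepted chunk advances the offset by at least 4 bytes, the loop is bounded by the buffer length; a chunk that claims length 0 is rejected rather than revisited forever.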
Jeff Garzik
1fb5fef9b8 Merge branch 'master' into upstream 2006-05-02 14:33:57 -04:00
Linus Torvalds
532f57da40 Merge branch 'audit.b10' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b10' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] Audit Filter Performance
  [PATCH] Rework of IPC auditing
  [PATCH] More user space subject labels
  [PATCH] Reworked patch for labels on user space messages
  [PATCH] change lspp ipc auditing
  [PATCH] audit inode patch
  [PATCH] support for context based audit filtering, part 2
  [PATCH] support for context based audit filtering
  [PATCH] no need to wank with task_lock() and pinning task down in audit_syscall_exit()
  [PATCH] drop task argument of audit_syscall_{entry,exit}
  [PATCH] drop gfp_mask in audit_log_exit()
  [PATCH] move call of audit_free() into do_exit()
  [PATCH] sockaddr patch
  [PATCH] deal with deadlocks in audit_free()
2006-05-01 21:43:05 -07:00
Patrick McHardy
46c5ea3c9a [NETFILTER] x_tables: fix compat related crash on non-x86
When iptables userspace adds an ipt_standard_target, it calculates the size
of the entire entry as:

sizeof(struct ipt_entry) + XT_ALIGN(sizeof(struct ipt_standard_target))

ipt_standard_target looks like this:

  struct xt_standard_target
  {
        struct xt_entry_target target;
        int verdict;
  };

xt_entry_target contains a pointer, so when compiled for 64 bit the
structure gets an extra 4 bytes of padding at the end. On 32 bit
architectures where iptables aligns to 8 bytes it will also have 4
bytes of padding at the end because it is only 36 bytes large.

The compat_ipt_standard_fn in the kernel adjusts the offsets by

  sizeof(struct ipt_standard_target) - sizeof(struct compat_ipt_standard_target),

which will always result in 4, even if the structure from userspace
was already padded to a multiple of 8. On x86 this works out by
accident because userspace only aligns to 4, on all other
architectures this is broken and causes incorrect adjustments to
the size and following offsets.

Thanks to Linus for lots of debugging help and testing.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-01 20:48:32 -07:00
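The size arithmetic in 46c5ea3c9a above depends on how a structure size is rounded up to an alignment. The sketch below uses hypothetical helpers rather than the XT_ALIGN macro itself, and takes the 36-byte size and the 4- and 8-byte alignments directly from the commit message.

```c
#include <stddef.h>

/* Round size up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}

/* Number of padding bytes that rounding up adds to size. */
static size_t padding_for(size_t size, size_t align)
{
    return align_up(size, align) - size;
}
```

A 36-byte structure gains 4 bytes of padding under 8-byte alignment and none under 4-byte alignment, which matches the message's explanation of why a fixed adjustment of 4 only happens to work on x86.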
Steve Grubb
e7c3497013 [PATCH] Reworked patch for labels on user space messages
The below patch should be applied after the inode and ipc sid patches.
This patch is a reworking of Tim's patch that has been updated to match
the inode and ipc patches since it's similar.

[updated:
>  Stephen Smalley also wanted to change a variable from isec to tsec in the
>  user sid patch.                                                              ]

Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-05-01 06:09:58 -04:00
Steve Grubb
d6fe3945b4 [PATCH] sockaddr patch
On Thursday 23 March 2006 09:08, John D. Ramsdell wrote:
>  I noticed that a socketcall(bind) and socketcall(connect) event contain a
>  record of type=SOCKADDR, but I cannot see one for a system call event
>  associated with socketcall(accept).  Recording the sockaddr of an accepted
>  socket is important for cross platform information flow analys

Thanks for pointing this out. The following patch should address this.

Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-05-01 06:06:10 -04:00
YOSHIFUJI Hideaki
c302e6d54e [IPV6]: Fix race in route selection.
We eliminated rt6_dflt_lock (to protect default router pointer)
at 2.6.17-rc1, and introduced rt6_select() for general router selection.
The function is called in the context of rt6_lock read-lock held,
but this means we have some race conditions when we do round-robin.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:22 -07:00
Ingo Molnar
e959d8121f [XFRM]: fix incorrect xfrm_policy_afinfo_lock use
xfrm_policy_afinfo_lock can be taken in bh context, at:

 [<c013fe1a>] lockdep_acquire_read+0x54/0x6d
 [<c0f6e024>] _read_lock+0x15/0x22
 [<c0e8fcdb>] xfrm_policy_get_afinfo+0x1a/0x3d
 [<c0e8fd10>] xfrm_decode_session+0x12/0x32
 [<c0e66094>] ip_route_me_harder+0x1c9/0x25b
 [<c0e770d3>] ip_nat_local_fn+0x94/0xad
 [<c0e2bbc8>] nf_iterate+0x2e/0x7a
 [<c0e2bc50>] nf_hook_slow+0x3c/0x9e
 [<c0e3a342>] ip_push_pending_frames+0x2de/0x3a7
 [<c0e53e19>] icmp_push_reply+0x136/0x141
 [<c0e543fb>] icmp_reply+0x118/0x1a0
 [<c0e54581>] icmp_echo+0x44/0x46
 [<c0e53fad>] icmp_rcv+0x111/0x138
 [<c0e36764>] ip_local_deliver+0x150/0x1f9
 [<c0e36be2>] ip_rcv+0x3d5/0x413
 [<c0df760f>] netif_receive_skb+0x337/0x356
 [<c0df76c3>] process_backlog+0x95/0x110
 [<c0df5fe2>] net_rx_action+0xa5/0x16d
 [<c012d8a7>] __do_softirq+0x6f/0xe6
 [<c0105ec2>] do_softirq+0x52/0xb1

this means that all write-locking of xfrm_policy_afinfo_lock must be
bh-safe. This patch fixes xfrm_policy_register_afinfo() and
xfrm_policy_unregister_afinfo().

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:21 -07:00
Ingo Molnar
f3111502c0 [XFRM]: fix incorrect xfrm_state_afinfo_lock use
xfrm_state_afinfo_lock can be read-locked from bh context, so take it
in a bh-safe manner in xfrm_state_register_afinfo() and
xfrm_state_unregister_afinfo(). Found by the lock validator.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:20 -07:00
Hua Zhong
83de47cd0c [TCP]: Fix unlikely usage in tcp_transmit_skb()
The following unlikely should be replaced by likely because the
condition happens every time unless there is a hard error to transmit
a packet.

Signed-off-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:19 -07:00
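A rough userspace illustration of the hint discussed in 83de47cd0c: likely() and unlikely() only tell the compiler which branch to treat as the fast path, they never change the result. This is a minimal sketch assuming GCC or Clang (__builtin_expect); send_one() and its arguments are hypothetical names, not the code in tcp_transmit_skb().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's branch hints (GCC/Clang builtin). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical send path: err is the result reported by a lower layer,
 * 0 on success and negative on a hard error.
 */
static int send_one(int err, int *error_count)
{
    assert(error_count != NULL);

    if (likely(err == 0))   /* common case: the packet went out */
        return 0;

    (*error_count)++;       /* rare case: account for the failure */
    return err;
}
```

Swapping likely() for unlikely() here would return exactly the same values; only the generated branch layout would differ.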
Ingo Molnar
8dff7c2970 [XFRM]: fix softirq-unsafe xfrm typemap->lock use
xfrm typemap->lock may be used in softirq context, so all write_lock()
uses must be softirq-safe.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:18 -07:00
Herbert Xu
a76e07acd0 [IPSEC]: Fix IP ID selection
I was looking through the xfrm input/output code in order to abstract
out the address family specific encapsulation/decapsulation code.  During
that process I found this bug in the IP ID selection code in xfrm4_output.c.

At that point dst is still the xfrm_dst for the current SA which
represents an internal flow as far as the IPsec tunnel is concerned.
Since the IP ID is going to sit on the outside of the encapsulated
packet, we obviously want the external flow which is just dst->child.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:16 -07:00
Heiko Carstens
a536e07787 [IPV4]: inet_init() -> fs_initcall
Convert inet_init to an fs_initcall to make sure it's called before any
device driver's initcall.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:14 -07:00
Soyoung Park
09493abfdb [NETLINK]: cleanup unused macro in net/netlink/af_netlink.c
1 line removal, of unused macro.
ran 'egrep -r' from linux-2.6.16/ for Nprintk and
didn't see it anywhere else but here, in #define...

Signed-off-by: Soyoung Park <speattle@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:13 -07:00
Stephen Hemminger
89bbb0a361 [PKT_SCHED] netem: fix loss
The following one line fix is needed to make loss function of
netem work right when doing loss on the local host.
Otherwise, higher layers just recover.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:12 -07:00
Shaun Pereira
43dff98b02 [X25]: fix for spinlock recurse and spinlock lockup with timer handler
When the sk_timer function x25_heartbeat_expiry() is called by the
kernel in a running/terminating process, spinlock-recursion and
spinlock-lockup lock up the kernel.  This has happened during testing
on some distros, and the patch below fixed it.

Signed-off-by: Shaun Pereira <spereira@tusc.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:11 -07:00
Jeff Garzik
1a2e8a6f8e Merge branch 'master' into upstream 2006-04-27 04:52:44 -04:00
Linus Torvalds
07db8696f5 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
  [PATCH] forcedeth: fix initialization
  [PATCH] sky2: version 1.2
  [PATCH] sky2: reset function can be devinit
  [PATCH] sky2: use ALIGN() macro
  [PATCH] sky2: add fake idle irq timer
  [PATCH] sky2: reschedule if irq still pending
  [PATCH] bcm43xx: make PIO mode usable
  [PATCH] bcm43xx: add to MAINTAINERS
  [PATCH] softmac: fix SIOCSIWAP
  [PATCH] Fix crash on big-endian systems during scan
  e1000: Update truesize with the length of the packet for packet split
  [PATCH] Fix locking in gianfar
2006-04-26 07:46:19 -07:00
Jeff Garzik
00355cd938 Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream 2006-04-26 06:18:15 -04:00
Jeff Garzik
3b908870b8 Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream-fixes 2006-04-26 06:16:50 -04:00
Stephen Hemminger
85ca719e57 [BRIDGE]: allow full size vlan packets
Need to allow for VLAN header when bridging.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-26 02:39:19 -07:00
Patrick McHardy
18118cdbfd [NETFILTER]: ipt action: use xt_check_target for basic verification
The targets don't do the basic verification themselves anymore so
the ipt action needs to take care of it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-24 17:27:34 -07:00
Dmitry Mishin
91536b7ae6 [NETFILTER]: x_tables: move table->lock initialization
xt_table->lock should be initialized before xt_replace_table() call, which
uses it. This patch removes strict requirement that table should define
lock before registering.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-24 17:27:33 -07:00
Patrick McHardy
e4a79ef811 [NETFILTER]: ip6_tables: remove broken comefrom debugging
The introduction of x_tables broke comefrom debugging, remove it from
ip6_tables as well (ip_tables already got removed).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-24 17:27:32 -07:00
Yasuyuki Kozakai
2c16b774c7 [NETFILTER]: nf_conntrack: kill unused callback init_conntrack
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-24 17:27:31 -07:00
Thomas Voegtle
44adf28f4a [NETFILTER]: ULOG target is not obsolete
The backend part is obsoleted, but the target itself is still needed.

Signed-off-by: Thomas Voegtle <tv@lio96.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-24 17:27:29 -07:00
Yasuyuki Kozakai
e1bbdebdba [NETFILTER]: nf_conntrack: Fix module refcount dropping too far
If nf_ct_l3proto_find_get() fails to get the refcount of
nf_ct_l3proto_generic, nf_ct_l3proto_put() will drop the refcount
too far.

This gets rid of '.me = THIS_MODULE' of nf_ct_l3proto_generic so that
nf_ct_l3proto_find_get() doesn't try to get refcount of it.
It's OK because its symbol is usable until nf_conntrack.ko is unloaded.

This also kills an unnecessary NULL pointer check.
__nf_ct_proto_find() always returns a non-NULL pointer.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-24 17:27:28 -07:00
Johannes Berg
921a91ef6a [PATCH] softmac: clean up event handling code
This patch cleans up the event handling code in ieee80211softmac_event.c and
makes the module slightly smaller by removing some strings that are not used
any more and consolidating some code.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:58 -04:00
Johannes Berg
9a1771e867 [PATCH] softmac: add SIOCSIWMLME
This patch adds the SIOCSIWMLME wext to softmac, this functionality
appears to be used by wpa_supplicant and is softmac-specific.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Jouni Malinen <jkm@devicescape.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:58 -04:00
Zhu Yi
7736b5bd93 [PATCH] ieee80211: replace debug IEEE80211_WARNING with each own debug macro
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:54 -04:00
Zhu Yi
35c14b855f [PATCH] ieee80211: remove unnecessary CONFIG_WIRELESS_EXT checking
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:53 -04:00
Zhu Yi
09593047d8 [PATCH] ieee80211: export list of bit rates with standard WEXT procddures
This patch changes how the list of bit rates is exported in scan results,
from IWEVCUSTOM to SIOCGIWRATE. It also removes the max_rate item exported
with SIOCGIWRATE since this should be done by userspace.

Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:53 -04:00
Zhu Yi
73858062b6 [PATCH] ieee80211: Fix TX code doesn't enable QoS when using WPA + QoS
Fix ieee80211 TX code when using WPA+QOS. TKIP/CCMP will use
the TID field of qos_ctl in 802.11 frame header to do encryption. We
cannot ignore this field when doing host encryption and add the qos_ctl
field later.

Signed-off-by: Hong Liu <hong.liu@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:53 -04:00
Zhu Yi
ea2841521a [PATCH] ieee80211: Fix TKIP MIC calculation for QoS frames
Fix TKIP MIC verification failure when receiving QoS frames from AP.

Signed-off-by: Hong Liu <hong.liu@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:53 -04:00
Johannes Berg
818667f7c4 [PATCH] softmac: fix SIOCSIWAP
There are some bugs in the current implementation of the SIOCSIWAP wext,
for example that when you do it twice and it fails, it may still try
another access point for some reason. This patch fixes this by introducing
a new flag that tells the association code that the bssid that is in use
was fixed by the user and shouldn't be deviated from.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 15:20:23 -04:00
Linus Torvalds
f4ffaa452e Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (21 commits)
  [PATCH] wext: Fix RtNetlink ENCODE security permissions
  [PATCH] bcm43xx: iw_priv_args names should be <16 characters
  [PATCH] bcm43xx: sysfs code cleanup
  [PATCH] bcm43xx: fix pctl slowclock limit calculation
  [PATCH] bcm43xx: fix dyn tssi2dbm memleak
  [PATCH] bcm43xx: fix config menu alignment
  [PATCH] bcm43xx wireless: fix printk format warnings
  [PATCH] softmac: report when scanning has finished
  [PATCH] softmac: fix event sending
  [PATCH] softmac: handle iw_mode properly
  [PATCH] softmac: dont send out packets while scanning
  [PATCH] softmac: return -EAGAIN from getscan while scanning
  [PATCH] bcm43xx: set trans_start on TX to prevent bogus timeouts
  [PATCH] orinoco: fix truncating commsquality RID with the latest Symbol firmware
  [PATCH] softmac: fix spinlock recursion on reassoc
  [PATCH] Revert NET_RADIO Kconfig title change
  [PATCH] wext: Fix IWENCODEEXT security permissions
  [PATCH] wireless/atmel: send WEXT scan completion events
  [PATCH] wireless/airo: clean up WEXT association and scan events
  [PATCH] softmac uses Wiress Ext.
  ...
2006-04-20 15:26:25 -07:00
Jeff Garzik
f18b95c3e2 Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2006-04-20 17:36:10 -04:00
Jayachandran C
18bc89aa25 [EBTABLES]: Clean up vmalloc usage in net/bridge/netfilter/ebtables.c
Make all the vmalloc calls in net/bridge/netfilter/ebtables.c follow
the standard convention.  Remove unnecessary casts, and use '*object'
instead of 'type'.

Signed-off-by: Jayachandran C. <c.jayachandran@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-20 00:14:49 -07:00
David S. Miller
dc6de33674 [NET]: Add skb->truesize assertion checking.
Add some sanity checking.  truesize should be at least sizeof(struct
sk_buff) plus the current packet length.  If not, then truesize is
seriously mangled and deserves a kernel log message.

Currently we'll do the check for release of stream socket buffers.

But we can add checks to more spots over time.

Incorporating ideas from Herbert Xu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-20 00:10:50 -07:00
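The invariant described in dc6de33674 can be sketched in userspace as below. struct fake_skb is a hypothetical, heavily reduced stand-in for struct sk_buff, and truesize_is_sane() is an illustrative name; the real check and its log message live in the kernel and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct sk_buff. */
struct fake_skb {
    unsigned int len;       /* current packet length in bytes */
    unsigned int truesize;  /* memory charged for this buffer */
    void *data;
};

/* Returns 1 when truesize covers the header struct plus the payload. */
static int truesize_is_sane(const struct fake_skb *skb)
{
    assert(skb != NULL);
    return skb->truesize >= sizeof(struct fake_skb) + skb->len;
}
```

A caller would log a warning, rather than crash, when the function returns 0.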
Herbert Xu
b60b49ea6a [TCP]: Account skb overhead in tcp_fragment
Make sure that we get the full sizeof(struct sk_buff)
plus the data size accounted for in skb->truesize.

This will create invariants that will allow adding
assertion checks on skb->truesize.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-19 21:35:00 -07:00
David S. Miller
5185db09f4 [LLC]: Use pskb_trim_rcsum() in llc_fixup_skb().
Kernel Bugzilla #6409

If we use plain skb_trim(), that's wrong, because if
the SKB is cloned, and it can be because we unshared
it in the caller, we have to allow reallocation.  The
pskb_trim*() family of routines is therefore the most
appropriate here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-19 15:37:13 -07:00
Hua Zhong
3672558c61 [NET]: sockfd_lookup_light() returns random error for -EBADFD
This applies to 2.6.17-rc2.

There is a missing initialization of err in sockfd_lookup_light() that
could return random error for an invalid file handle.

Signed-off-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-19 15:25:02 -07:00
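The class of bug fixed by 3672558c61 is an output parameter that some path leaves unset. A minimal sketch of the defensive pattern, assuming a hypothetical descriptor table and a hypothetical lookup_light() helper (the sketch uses EBADF from errno.h; none of this is the kernel's actual code):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NFILES 4

/* Hypothetical descriptor table: a NULL slot means "not open". */
static const char *fd_table[NFILES] = { "sock0", NULL, "sock2", NULL };

/* Looks up fd; on failure returns NULL and always sets *err. */
static const char *lookup_light(int fd, int *err)
{
    assert(err != NULL);

    *err = -EBADF;  /* set up front so no path leaves it unset */
    if (fd < 0 || fd >= NFILES || fd_table[fd] == NULL)
        return NULL;

    *err = 0;
    return fd_table[fd];
}
```

Without the up-front assignment, a caller that passes an uninitialized err would see whatever value happened to be on its stack.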
Jean Tourrilhes
848ef85552 [PATCH] wext: Fix RtNetlink ENCODE security permissions
I've just realised that the RtNetlink code does not check the
permission for SIOCGIWENCODE and SIOCGIWENCODEEXT, which means that
any user can read the encryption keys. The fix is trivial and should
go in 2.6.17 alongside the two other patches I sent you last week.

Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:41 -04:00
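The point of 848ef85552 is that some "get" requests return secrets and therefore need the same privilege check as "set" requests. A minimal sketch of that idea follows; the enum values, the is_admin flag and wx_check_permission() are hypothetical stand-ins, not the wireless extensions API.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical request codes; the real values live in the wireless headers. */
enum wx_cmd { WX_GET_NAME, WX_GET_ENCODE, WX_GET_ENCODEEXT };

/* Returns 0 if the caller may issue cmd, -EPERM otherwise. */
static int wx_check_permission(enum wx_cmd cmd, int is_admin)
{
    assert(is_admin == 0 || is_admin == 1);

    switch (cmd) {
    case WX_GET_ENCODE:
    case WX_GET_ENCODEEXT:
        /* Reading key material is privileged even though it is a "get". */
        return is_admin ? 0 : -EPERM;
    default:
        return 0;
    }
}
```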
Johannes Berg
6788a07f8f [PATCH] softmac: report when scanning has finished
Make softmac report a scan event when scanning has finished, that way
userspace can wait for the event to happen instead of polling for the
results.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:40 -04:00
Johannes Berg
feeeaa87e8 [PATCH] softmac: fix event sending
Softmac is sending custom events to userspace already, but it
should _really_ be sending the right WEXT events instead. This
patch fixes that.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:39 -04:00
johannes@sipsolutions.net
68970ce6ac [PATCH] softmac: handle iw_mode properly
Below patch allows using iw_mode auto with softmac. bcm43xx forces managed
so this bug wasn't noticed earlier, but this was one of the problems why
zd1211 didn't work earlier.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:39 -04:00
johannes@sipsolutions.net
fc242746ea [PATCH] softmac: dont send out packets while scanning
Seems we forgot to stop the queue while scanning. Better do that so we
don't transmit packets all the time during background scanning.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:39 -04:00
johannes@sipsolutions.net
ba2f8c1875 [PATCH] softmac: return -EAGAIN from getscan while scanning
Below patch was developed after discussion with Daniel Drake who
mentioned to me that wireless tools expect an EAGAIN return from getscan
so that they can wait for the scan to finish before printing out the
results.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:39 -04:00
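The contract described in ba2f8c1875 (return -EAGAIN while a scan is running so the caller can retry) can be sketched as follows. struct fake_dev, get_scan() and wait_for_scan() are hypothetical names for illustration only; a real tool would sleep between attempts.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical driver state; scan_ticks_left > 0 means a scan is running. */
struct fake_dev {
    int scan_ticks_left;
    int nresults;
};

/* Driver side: refuse to report results while the scan is in progress. */
static int get_scan(struct fake_dev *dev)
{
    assert(dev != NULL);

    if (dev->scan_ticks_left > 0) {
        dev->scan_ticks_left--; /* stand-in for time passing */
        return -EAGAIN;
    }
    return dev->nresults;
}

/* Tool side: retry on -EAGAIN, give up after max_tries attempts. */
static int wait_for_scan(struct fake_dev *dev, int max_tries)
{
    int ret = -EAGAIN;

    while (max_tries-- > 0 && (ret = get_scan(dev)) == -EAGAIN)
        ;   /* a real tool would sleep here */
    return ret;
}
```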
Michael Buesch
9b0b4d8ae8 [PATCH] softmac: fix spinlock recursion on reassoc
This fixes a spinlock recursion on receiving a reassoc request.

On reassoc, the softmac calls back into the driver. This results in a
driver lock recursion. This schedules the assoc workqueue, instead
of calling it directly.

Probably, we should defer the _whole_ management frame processing
to a tasklet or workqueue, because it does several callbacks into the driver.
That is dangerous.

This fix should go into Linus's tree before 2.6.17 is released, because it
is remotely exploitable (DoS by crash).

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:38 -04:00
Jean Tourrilhes
a417016d1a [PATCH] wext: Fix IWENCODEEXT security permissions
Check the permissions when user-space tries to read the
encryption parameters via SIOCGIWENCODEEXT. This is trivial and
probably should go in 2.6.17...
	Bug was found by Brian Eaton <eaton.lists@gmail.com>, thanks !

Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:38 -04:00
Randy Dunlap
e4b5fae8b3 [PATCH] softmac uses Wiress Ext.
softmac uses wireless extensions, so let it SELECT that config option;
WARNING: "wireless_send_event" [net/ieee80211/softmac/ieee80211softmac.ko] undefined!

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:37 -04:00
Eric Sesterhenn
a5f9145bc9 SUNRPC: Dead code in net/sunrpc/auth_gss/auth_gss.c
Hi,

the coverity checker spotted that cred is always NULL
when we jump to out_err ( there is just one case, when
we fail to allocate the memory for cred )
This is Coverity ID #79

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 13:06:49 -04:00
Adrian Bunk
ec535ce154 NFS: make 2 functions static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 12:43:47 -04:00
J. Bruce Fields
d4a30e7e66 RPCSEC_GSS: fix leak in krb5 code caused by superfluous kmalloc
I was sloppy when generating a previous patch; I modified the callers of
krb5_make_checksum() to allocate memory for the buffer where the result is
returned, then forgot to modify krb5_make_checksum to stop allocating that
memory itself.  The result is a per-packet memory leak.  This fixes the
problem by removing the now-superfluous kmalloc().

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 12:43:46 -04:00
Jesper Juhl
63903ca6af [NET]: Remove redundant NULL checks before [kv]free
Redundant NULL check before kfree removal
from net/

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:55 -07:00
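The cleanup in 63903ca6af relies on the free routine accepting NULL. The userspace analogue is free(), which the C standard defines as a no-op for a null pointer. A minimal sketch, with struct table and table_release() as hypothetical names:

```c
#include <assert.h>
#include <stdlib.h>

struct table {
    int *entries;
};

/* free(NULL) is a no-op, so no "if (t->entries)" guard is needed. */
static void table_release(struct table *t)
{
    assert(t != NULL);

    free(t->entries);
    t->entries = NULL;  /* makes a second release harmless too */
}
```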
Dmitry Mishin
40daafc80b unaligned access in sk_run_filter()
This patch fixes unaligned access warnings noticed on IA64
in sk_run_filter(). 'ptr' can be unaligned.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:54 -07:00
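For the unaligned access issue in 40daafc80b, a portable userspace way to read a value from a possibly misaligned pointer is to copy it out byte-wise. This sketch uses memcpy() and a hypothetical load_u32() helper; it illustrates the technique in general and is not the helper the kernel patch itself uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit value from any address without a misaligned load. */
static uint32_t load_u32(const void *ptr)
{
    uint32_t v;

    assert(ptr != NULL);
    memcpy(&v, ptr, sizeof(v)); /* byte-wise copy, valid at any alignment */
    return v;
}
```

Casting the pointer to uint32_t * and dereferencing it instead is what triggers warnings or traps on strict-alignment architectures.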
YOSHIFUJI Hideaki
b809739a1b [IPV6]: Clean up hop-by-hop options handler.
- Removed unused argument (nhoff) for ipv6_parse_hopopts().
- Make ipv6_parse_hopopts() align with other extension header
  handlers.
- Removed pointless assignment (hdr), which is not used afterwards.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:53 -07:00
YOSHIFUJI Hideaki
e5d25a9088 [IPV6] XFRM: Fix decoding session with preceding extension header(s).
We did not correctly decode session with preceding extension
header(s).  This was because we had already pulled preceding
headers, skb->nh.raw + 40 + 1 - skb->data was minus, and
pskb_may_pull() failed.

We now have IP6CB(skb)->nhoff and skb->h.raw, and we can
start parsing / decoding upper layer protocol from current
position.

Tracked down by Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
and tested by Kazunori Miyazawa <kazunori@miyazawa.org>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:52 -07:00
YOSHIFUJI Hideaki
e3cae904d7 [IPV6] XFRM: Don't use old copy of pointer after pskb_may_pull().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:51 -07:00
YOSHIFUJI Hideaki
ec6700958a [IPV6]: Ensure to have hop-by-hop options in our header of &sk_buff.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:50 -07:00
Herbert Xu
ef5cb9738b [TCP]: Fix truesize underflow
There is a problem with the TSO packet trimming code.  The cause of
this lies in the tcp_fragment() function.

When we allocate a fragment for a completely non-linear packet the
truesize is calculated for a payload length of zero.  This means that
truesize could in fact be less than the real payload length.

When that happens the TSO packet trimming can cause truesize to become
negative.  This in turn can cause sk_forward_alloc to be -n * PAGE_SIZE
which would trigger the warning.

I've copied the code DaveM used in tso_fragment which should work here.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:49 -07:00
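The arithmetic behind ef5cb9738b can be illustrated with plain integers. HDR_OVERHEAD and the three helpers below are hypothetical; the sketch only shows why charging a fragment for a payload length of zero lets a later trim push the accounted size below zero.

```c
#include <assert.h>

#define HDR_OVERHEAD 256    /* hypothetical sizeof(struct sk_buff) */

/* Buggy accounting: fragment charged as if it carried no payload. */
static int truesize_buggy(int payload_len)
{
    (void)payload_len;
    return HDR_OVERHEAD;
}

/* Fixed accounting: header overhead plus the payload carried. */
static int truesize_fixed(int payload_len)
{
    assert(payload_len >= 0);
    return HDR_OVERHEAD + payload_len;
}

/* Trimming removes `trimmed` payload bytes from the charge. */
static int after_trim(int truesize, int trimmed)
{
    return truesize - trimmed;
}
```

With a 4096-byte payload, the buggy charge ends up at 256 - 4096 after a full trim, while the fixed charge returns to the header overhead.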
Stephen Hemminger
d2c962b853 [IPV4]: ip_route_input panic fix
This fixes http://bugzilla.kernel.org/show_bug.cgi?id=6388
The bug is caused by ip_route_input dereferencing skb->nh.protocol of
the dummy skb passed down from inet_rtm_getroute (Thanks Thomas for seeing
it). It only happens if the route requested is for a multicast IP
address.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-17 17:27:11 -07:00
Zach Brown
3d9dd7564d [PATCH] ip_output: account for fraggap when checking to add trailer_len
During other work I noticed that ip_append_data() seemed to be forgetting to
include the frag gap in its calculation of a fragment that consumes the rest of
the payload.  Herbert confirmed that this was a bug that snuck in during a
previous rework.

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 16:04:18 -07:00
Stephen Hemminger
4909e488f6 [ATM] clip: add module info
Add module information

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 16:01:26 -07:00
Stephen Hemminger
5ff765f3d0 [ATM] clip: notifier related cleanups
Cleanup some code around notifier.  Don't need (void) casts to ignore
return values, and use C90 style initializer. Just ignore unused device
events.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 16:00:59 -07:00
Stephen Hemminger
dcdb02752f [ATM] clip: get rid of PROC_FS ifdef
Don't need the ifdef here since create_proc_entry() is stubbed to
always return NULL.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 16:00:25 -07:00
Stephen Hemminger
e49e76db03 [ATM] clip: run through Lindent
Run CLIP driver through Lindent script to fix formatting.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 15:59:37 -07:00
Stephen Hemminger
2d9073922b [ATM]: Clip timer race.
By inspection, the clip idle timer code is racy on SMP.
Here is a safe version of timer management.
Untested, I don't have ATM hardware.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 15:56:02 -07:00
Stephen Hemminger
f3a0592b37 [ATM]: clip causes unregister hang
If Classical IP over ATM module is loaded, its neighbor table gets
populated when permanent neighbor entries are created; but these entries
are not flushed when the device is removed. Since the entry never gets
flushed the unregister of the network device never completes.

This version of the patch also adds locking around the reference to
the atm arp daemon to avoid races with events and daemon state changes.
(Note: barrier() was never really safe)

Bug-reference: http://bugzilla.kernel.org/show_bug.cgi?id=6295
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 15:07:27 -07:00
Jamal Hadi Salim
2717096ab4 [XFRM]: Fix aevent timer.
Send aevent immediately if we have sent nothing since last timer and
this is the first packet.

Fixes a corner case where the packet threshold is very high, the timer is
low, and the input packet rate is very low but bursty.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 15:03:05 -07:00
Adrian Bunk
6c97e72a16 [IPV4]: Possible cleanups.
This patch contains the following possible cleanups:
- make the following needlessly global function static:
  - arp.c: arp_rcv()
- remove the following unused EXPORT_SYMBOL's:
  - devinet.c: devinet_ioctl
  - fib_frontend.c: ip_rt_ioctl
  - inet_hashtables.c: inet_bind_bucket_create
  - inet_hashtables.c: inet_bind_hash
  - tcp_input.c: sysctl_tcp_abc
  - tcp_ipv4.c: sysctl_tcp_tw_reuse
  - tcp_output.c: sysctl_tcp_mtu_probing
  - tcp_output.c: sysctl_tcp_base_mss

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 15:00:20 -07:00
Adrian Bunk
8db60bcf30 [WAN]: Remove broken and unmaintained Sangoma drivers.
The in-kernel Sangoma drivers are both not compiling and marked as BROKEN
since at least kernel 2.6.0.

Sangoma offers out-of-tree drivers, and David Mandelstam told me Sangoma
no longer maintains the in-kernel drivers and prefers to provide them
as a separate installation package.

This patch therefore removes these drivers.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-11 17:28:33 -07:00
Jayachandran C
7ad4d2f690 [BRIDGE] ebtables: fix allocation in net/bridge/netfilter/ebtables.c
Allocate an array of 'struct ebt_chainstack *', the current code allocates
array of 'struct ebt_chainstack'.

akpm: converted to use the

	foo = alloc(sizeof(*foo))

form.  Which would have prevented this from happening in the first place.

akpm: also removed unneeded typecast.

akpm: what on earth is this code doing anyway?  cpu_possible_map can be
sparse..

Signed-off-by: Jayachandran C. <c.jayachandran@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-11 17:25:38 -07:00
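The allocation idiom recommended in 7ad4d2f690 takes the element size from the variable being assigned, so an array of pointers cannot accidentally be sized as an array of structs. A userspace sketch with calloc(); struct chainstack and alloc_stack_ptrs() are hypothetical stand-ins, not the ebtables code.

```c
#include <assert.h>
#include <stdlib.h>

struct chainstack {     /* hypothetical stand-in for ebt_chainstack */
    int chaininfo[16];
};

/* Allocates n pointers; the element size comes from the variable itself. */
static struct chainstack **alloc_stack_ptrs(size_t n)
{
    struct chainstack **stacks;

    assert(n > 0);
    stacks = calloc(n, sizeof(*stacks));    /* sizeof(struct chainstack *) */
    return stacks;
}
```

If the type of stacks ever changes, the sizeof(*stacks) expression follows automatically, which is why this form avoids the original mistake.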
Eric Sesterhenn
b8282dcf04 [DCCP]: Fix leak in net/dccp/ipv4.c
We don't free req if we can't parse the options.
This fixes coverity bug id #1046

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-11 17:21:06 -07:00
Stephen Hemminger
b7595b4955 [BRIDGE]: receive link-local on disabled ports.
This change allows link local packets (like 802.3ad and Spanning Tree
Protocol) to be processed even when the bridge is not using the port.
It fixes the chicken-egg problem for bridging a bonded device, and
may also fix problems with spanning tree failover.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-11 17:21:05 -07:00
Zach Brown
f6596f9d2b [IPv6] reassembly: Always compute hash under the fragment lock.
This closes a race where an ipq6hashfn() caller could get a hash value
and race with the cycling of the random seed.  By the time they got to
the read_lock they'd have a stale hash value and might not find
previous fragments of their datagram.

This matches the previous patch to IPv4.

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-11 17:21:05 -07:00
Linus Torvalds
88dd9c16ce Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] vfs: add splice_write and splice_read to documentation
  [PATCH] Remove sys_ prefix of new syscalls from __NR_sys_*
  [PATCH] splice: warning fix
  [PATCH] another round of fs/pipe.c cleanups
  [PATCH] splice: comment styles
  [PATCH] splice: add Ingo as addition copyright holder
  [PATCH] splice: unlikely() optimizations
  [PATCH] splice: speedups and optimizations
  [PATCH] pipe.c/fifo.c code cleanups
  [PATCH] get rid of the PIPE_*() macros
  [PATCH] splice: speedup __generic_file_splice_read
  [PATCH] splice: add direct fd <-> fd splicing support
  [PATCH] splice: add optional input and output offsets
  [PATCH] introduce a "kernel-internal pipe object" abstraction
  [PATCH] splice: be smarter about calling do_page_cache_readahead()
  [PATCH] splice: optimize the splice buffer mapping
  [PATCH] splice: cleanup __generic_file_splice_read()
  [PATCH] splice: only call wake_up_interruptible() when we really have to
  [PATCH] splice: potential !page dereference
  [PATCH] splice: mark the io page as accessed
2006-04-11 06:34:02 -07:00
NeilBrown
dfee55f062 [PATCH] knfsd: svcrpc: gss: don't call svc_take_page unnecessarily
We're using svc_take_page here to get another page for the tail in case one
wasn't already allocated.  But there isn't always guaranteed to be another
page available.

Also fix a typo that made us check the tail buffer for space when we meant to
be checking the head buffer.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:52 -07:00
KAMEZAWA Hiroyuki
6f91204225 [PATCH] for_each_possible_cpu: network codes
for_each_cpu() actually iterates across all possible CPUs.  We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs.  This is inefficient and
possibly buggy.

We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.

This patch replaces for_each_cpu with for_each_possible_cpu under /net

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:31 -07:00
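The distinction drawn in 6f91204225 between possible and online CPUs can be sketched with bitmasks. Everything below is a userspace stand-in: the mask values are made up, and the macros only mimic the naming of the kernel iterators.

```c
#include <assert.h>

#define NR_CPUS 8

/* Hypothetical masks: bit n set means CPU n is in the set. */
static const unsigned int cpu_possible_mask = 0xF5; /* sparse: 0,2,4,5,6,7 */
static const unsigned int cpu_online_mask   = 0x05; /* only CPUs 0 and 2 */

/* Visits every CPU in `mask`, skipping holes. */
#define for_each_cpu_in(cpu, mask) \
    for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) \
        if ((mask) & (1u << (cpu)))

#define for_each_possible_cpu(cpu) for_each_cpu_in(cpu, cpu_possible_mask)
#define for_each_online_cpu(cpu)   for_each_cpu_in(cpu, cpu_online_mask)

static int count_possible(void)
{
    int cpu, n = 0;

    assert(NR_CPUS <= 8 * (int)sizeof(cpu_possible_mask));
    for_each_possible_cpu(cpu)
        n++;
    return n;
}

static int count_online(void)
{
    int cpu, n = 0;

    for_each_online_cpu(cpu)
        n++;
    return n;
}
```

The explicit name makes it obvious at the call site that the loop also visits CPUs that are not currently running.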
Andrew Morton
88e6faefae [PATCH] splice: warning fix
From: Andrew Morton <akpm@osdl.org>

net/socket.c:148: warning: initialization from incompatible pointer type

extern declarations in .c files!  Bad boy.

Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:59:36 +02:00
Denis Vlasenko
b1a7ffcb7a [IPV6]: Deinline few large functions in inet6 code
Deinline a few functions which produce 200+ bytes of code.

Size  Uses Wasted Name and definition
===== ==== ====== ================================================
  429    3    818 __inet6_lookup        include/net/inet6_hashtables.h
  404    2    384 __inet6_lookup_established    include/net/inet6_hashtables.h
  206    3    372 __inet6_hash  include/net/inet6_hashtables.h

Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:48:59 -07:00
David S. Miller
55c0022e53 [IPV4] ip_fragment: Always compute hash with ipfrag_lock held.
Otherwise we could compute an inaccurate hash due to the
random seed changing.

Noticed by Zach Brown and patch is based upon some feedback
from Herbert Xu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:43:55 -07:00
Patrick McHardy
19910d1aec [NETFILTER]: Fix DNAT in LOCAL_OUT
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:38:29 -07:00
Sergey Vlasov
9469d458b9 [NET]: Fix hotplug race during device registration.
From: Thomas de Grenier de Latour <degrenier@easyconnect.fr>

On Sun, 9 Apr 2006 21:56:59 +0400,
Sergey Vlasov <vsu@altlinux.ru> wrote:

> However, show_address() does not output anything unless
> dev->reg_state == NETREG_REGISTERED - and this state is set by
> netdev_run_todo() only after netdev_register_sysfs() returns, so in
> the meantime (while netdev_register_sysfs() is busy adding the
> "statistics" attribute group) some process may see an empty "address"
> attribute.

I've tried the attached patch, suggested by Sergey Vlasov on
hotplug-devel@, and as far as I can test it works just fine.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:32:48 -07:00
Brian Haley
503e4faad1 [NETFILTER]: Fix build with CONFIG_NETFILTER=y/m on IA64
Can't build with CONFIG_NETFILTER=y/m on IA64, there's a missing
#include in net/ipv6/netfilter.c

net/ipv6/netfilter.c: In function `nf_ip6_checksum':
net/ipv6/netfilter.c:92: warning: implicit declaration of function
`csum_ipv6_magic'

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:49 -07:00
Andrew Morton
77d04bd957 [NET]: More kzalloc conversions.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:48 -07:00
Paolo 'Blaisorblade' Giarrusso
31380de95c [NET] kzalloc: use in alloc_netdev
Noticed this use, fixed it.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:47 -07:00
Jamal Hadi Salim
83b950c89c [PKT_SCHED] act_police: Rename methods.
Rename policer specific _generic_ methods to be specific to
_act_police_

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:46 -07:00
Patrick McHardy
7a43c99551 [NETFILTER]: H.323 helper: remove changelog
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:43 -07:00
Patrick McHardy
96f6bf82ea [NETFILTER]: Convert conntrack/ipt_REJECT to new checksumming functions
Besides removing lots of duplicate code, all converted users benefit
from improved HW checksum error handling. Tested with and without HW
checksums in almost all combinations.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:42 -07:00
Patrick McHardy
422c346fad [NETFILTER]: Add address family specific checksum helpers
Add checksum operation which takes care of verifying the checksum and
dealing with HW checksum errors and avoids multiple checksum
operations by setting ip_summed to CHECKSUM_UNNECESSARY after
successful verification.
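
The "verify once, then remember" part of this can be sketched as follows. It is a simplified userspace model and does not cover the HW checksum error handling; struct pkt_model, the MODEL_CHECKSUM_* values, model_csum() and model_verify_checksum() are hypothetical stand-ins, and the buffer length is assumed to be even to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's ip_summed states. */
enum { MODEL_CHECKSUM_NONE, MODEL_CHECKSUM_UNNECESSARY };

struct pkt_model {
        const uint8_t *data;
        size_t len;             /* assumed even in this sketch */
        int ip_summed;
        int sums_computed;      /* instrumentation: how often data was walked */
};

/* 16-bit one's-complement sum over the buffer, folded and inverted. */
static uint16_t model_csum(const uint8_t *data, size_t len)
{
        uint32_t sum = 0;
        size_t i;

        for (i = 0; i + 1 < len; i += 2)
                sum += (uint32_t)((data[i] << 8) | data[i + 1]);
        while (sum >> 16)
                sum = (sum & 0xffff) + (sum >> 16);
        return (uint16_t)~sum;
}

/* Return 0 if the checksum is good, -1 otherwise. */
int model_verify_checksum(struct pkt_model *p)
{
        assert(p != NULL && p->len % 2 == 0);
        if (p->ip_summed == MODEL_CHECKSUM_UNNECESSARY)
                return 0;       /* already verified once, skip the work */
        p->sums_computed++;
        if (model_csum(p->data, p->len) != 0)
                return -1;
        p->ip_summed = MODEL_CHECKSUM_UNNECESSARY;      /* remember the result */
        return 0;
}
```

Calling model_verify_checksum() twice on a good packet walks the data only once; the second call returns early on the cached state.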

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:41 -07:00
Patrick McHardy
bce8032ef3 [NETFILTER]: Introduce infrastructure for address family specific operations
Change the queue rerouter infrastructure to a generically usable
infrastructure for address family specific operations as a base for
some cleanups.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:40 -07:00
Patrick McHardy
a0aed49bdb [NETFILTER]: Fix IP_NF_CONNTRACK_NETLINK dependency
When NAT is built as a module, ip_conntrack_netlink can not be linked
statically.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:39 -07:00
Jing Min Zhao
a0b7db5e86 [NETFILTER]: H.323 helper: add parameter 'default_rrq_ttl'
default_rrq_ttl is used when no TTL is included in the RRQ.

Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:38 -07:00
Jing Min Zhao
51d42f5e4e [NETFILTER]: H.323 helper: make get_h245_addr() static
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:37 -07:00
Jing Min Zhao
0f249685fd [NETFILTER]: H.323 helper: change EXPORT_SYMBOL to EXPORT_SYMBOL_GPL
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:36 -07:00
Jing Min Zhao
48bfee5fad [NETFILTER]: H.323 helper: move some function prototypes to ip_conntrack_h323.h
Move prototypes of NAT callbacks to ip_conntrack_h323.h. Because of the
use of typedefs as arguments, some header files need to be moved as
well.

Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:35 -07:00
Patrick McHardy
32292a7ff1 [NETFILTER]: Fix section mismatch warnings
Fix section mismatch warnings caused by netfilter's init_or_cleanup
functions used in many places by splitting the init from the cleanup
parts.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:34 -07:00
Patrick McHardy
964ddaa10d [NETFILTER]: Clean up hook registration
Clean up hook registration by making use of the new mass registration and
unregistration helpers.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:33 -07:00
Patrick McHardy
972d1cb142 [NETFILTER]: Add helper functions for mass hook registration/unregistration
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:32 -07:00
Herbert Xu
45af08be6d [INET]: Use port unreachable instead of proto for tunnels
This patch changes GRE and SIT to generate port unreachable instead of
protocol unreachable errors when we can't find a matching tunnel for a
packet.

This removes the ambiguity as to whether the error is caused by no
tunnel being found or by the lack of support for the given tunnel
type.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:29 -07:00
Eric Sesterhenn
cdee5751bf [BLUETOOTH] sco: Possible double free.
this fixes coverity bug id #1068.
hci_send_sco() frees skb if (skb->len > hdev->sco_mtu).
Since it returns a negative error value only in this case, we
can directly return here.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:29 -07:00
Adrian Bunk
e3a5cd9edf [NET]: Fix an off-by-21-or-49 error.
This patch fixes an off-by-21-or-49 error ;-) spotted by the Coverity
checker.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:26 -07:00
Herbert Xu
50fba2aa7c [INET]: Move no-tunnel ICMP error to tunnel4/tunnel6
This patch moves the sending of ICMP messages when there are no IPv4/IPv6
tunnels present to tunnel4/tunnel6 respectively.  Please note that for now
if xfrm4_tunnel/xfrm6_tunnel is loaded then no ICMP messages will ever be
sent.  This is similar to how we handle AH/ESP/IPCOMP.

This move fixes the bug where we always send an ICMP message when there is
no ip6_tunnel device present for a given packet even if it is later handled
by IPsec.  It also causes ICMP messages to be sent when no IPIP tunnel is
present.

I've decided to use the "port unreachable" ICMP message over the current
value of "address unreachable" (and "protocol unreachable" by GRE) because
it is not ambiguous unlike the other ones which can be triggered by other
conditions.  There seems to be no standard specifying what value must be
used so this change should be OK.  In fact we should change GRE to use
this value as well.

Incidentally, this patch also fixes a fairly serious bug in xfrm6_tunnel
where we don't check whether the embedded IPv6 header is present before
dereferencing it for the inside source address.

This patch is inspired by a previous patch by Hugo Santos <hsantos@av.it.pt>.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:25 -07:00
Patrick McHardy
2e2f7aefa8 [NETFILTER]: Fix fragmentation issues with bridge netfilter
The conntrack code doesn't do re-fragmentation of defragmented packets
anymore but relies on fragmentation in the IP layer. Purely bridged
packets don't pass through the IP layer, so the bridge netfilter code
needs to take care of fragmentation itself.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:23 -07:00
Robert Olsson
550e29bc96 [FIB_TRIE]: Fix leaf freeing.
It seems leaves (end-nodes) have been freed by __tnode_free_rcu and not
by __leaf_free_rcu. This fixes the problem. Only tnode_free is now
used which checks for appropriate node type. free_leaf can be removed.

Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:23 -07:00
Herbert Xu
8bf4b8a108 [IPSEC]: Check x->encap before dereferencing it
We need to check x->encap before dereferencing it for encap_type.
If it's absent then the encap_type is zero.
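
A minimal sketch of the check, assuming trimmed-down stand-ins for the real structures (struct encap_model, struct state_model and state_encap_type() are hypothetical names used only for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct encap_model {
        int encap_type;
};

struct state_model {
        struct encap_model *encap;      /* NULL when no encapsulation is set up */
};

/* Return the encapsulation type, or 0 when x->encap is absent. */
int state_encap_type(const struct state_model *x)
{
        assert(x != NULL);
        return x->encap ? x->encap->encap_type : 0;
}
```

The pointer test comes first, so the absent case yields zero instead of a NULL dereference.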

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:25:22 -07:00
David S. Miller
9a1875e60e [NET]: Fully fix the memory leaks in sys_accept().
Andi Kleen was right, fput() on sock->file will end up calling
sock_release() if necessary.  So here is the rest of his version
of the fix for these leaks.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 12:48:36 -08:00
Dmitry Mishin
2722971cbe [NETFILTER]: iptables 32bit compat layer
This patch extends current iptables compatibility layer in order to get
32bit iptables to work on 64bit kernel. Current layer is insufficient due
to alignment checks both in kernel and user space tools.

Patch is for current net-2.6.17 with addition of move of ipt_entry_{match|
target} definitions to xt_entry_{match|target}.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Acked-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 02:25:19 -08:00
Martin Josefsson
e64a70be51 [NETFILTER]: {ip,nf}_conntrack_netlink: fix expectation notifier unregistration
This patch fixes expectation notifier unregistration on module unload to
use ip_conntrack_expect_unregister_notifier(). This bug causes a soft
lockup at the first expectation created after a rmmod ; insmod of this
module.

Should go into -stable as well.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 02:24:48 -08:00
Martin Josefsson
bcd1e830a5 [NETFILTER]: fix ifdef for connmark support in nf_conntrack_netlink
When ctnetlink was ported from ip_conntrack to nf_conntrack two #ifdef's
for connmark support were left unchanged and this code was never
compiled.

Problem noticed by Daniel De Graaf.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 02:23:21 -08:00
Yasuyuki Kozakai
a89ecb6a2e [NETFILTER]: x_tables: unify IPv4/IPv6 multiport match
This unifies ipt_multiport and ip6t_multiport to xt_multiport.
As a result, this adds support for inversion and port range match
to IPv6 packets.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 02:22:54 -08:00
Yasuyuki Kozakai
dc5ab2faec [NETFILTER]: x_tables: unify IPv4/IPv6 esp match
This unifies ipt_esp and ip6t_esp to xt_esp. Please note that now
a user program needs to specify IPPROTO_ESP as protocol to use esp match
with IPv6. This means that ip6tables requires '-p esp' like iptables.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 02:22:30 -08:00
David S. Miller
9606a21635 [NET]: Fix dentry leak in sys_accept().
This regression was added by commit:
39d8c1b6fb
("Do not lose accepted socket when -ENFILE/-EMFILE.")

This is based upon a patch from Andi Kleen.

Thanks to Adrian Bridgett for narrowing down a good test case, and to
Andi Kleen and Andrew Morton for eyeballing this code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 01:00:14 -08:00
Herbert Xu
dbe5b4aaaf [IPSEC]: Kill unused decap state structure
This patch removes the *_decap_state structures which were previously
used to share state between input/post_input.  This is no longer
needed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 00:54:16 -08:00
Herbert Xu
e695633e21 [IPSEC]: Kill unused decap state argument
This patch removes the decap_state argument from the xfrm input hook.
Previously this function allowed the input hook to share state with
the post_input hook.  The latter has since been removed.

The only purpose for it now is to check the encap type.  However, it
is easier and better to move the encap type check to the generic
xfrm_rcv function.  This allows us to get rid of the decap state
argument altogether.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-01 00:52:46 -08:00
Linus Torvalds
4b75679f60 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET]: Allow skb headroom to be overridden
  [TCP]: Kill unused extern decl for tcp_v4_hash_connecting()
  [NET]: add SO_RCVBUF comment
  [NET]: Deinline some larger functions from netdevice.h
  [DCCP]: Use NULL for pointers, comfort sparse.
  [DECNET]: Fix refcount
2006-03-31 12:52:30 -08:00
Andrew Morton
c08e49611a [NET]: add SO_RCVBUF comment
Put a comment in there explaining why we double the setsockopt()
caller's SO_RCVBUF.  People keep wondering.
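
The doubling is observable from userspace. The sketch below uses only the standard setsockopt()/getsockopt() calls; rcvbuf_after_request() is a hypothetical helper name. On Linux the value read back is normally twice the request (subject to the system's minimum and maximum limits); other systems may report the value unchanged.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Request a receive buffer of `requested` bytes on a fresh UDP socket and
 * return the size that getsockopt() reports back, or -1 on error.
 */
int rcvbuf_after_request(int requested)
{
        int fd, got = -1;
        socklen_t len = sizeof(got);

        assert(requested > 0);
        fd = socket(AF_INET, SOCK_DGRAM, 0);
        if (fd < 0)
                return -1;
        if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                       &requested, sizeof(requested)) < 0 ||
            getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len) < 0)
                got = -1;
        close(fd);
        return got;
}
```

Printing rcvbuf_after_request(4096) on a Linux machine is a quick way to see the effect the comment explains.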

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-31 02:09:36 -08:00
Jens Axboe
5274f052e7 [PATCH] Introduce sys_splice() system call
This adds support for the sys_splice system call. Using a pipe as a
transport, it can connect to files or sockets (latter as output only).

From the splice.c comments:

   "splice": joining two ropes together by interweaving their strands.

   This is the "extended pipe" functionality, where a pipe is used as
   an arbitrary in-memory buffer. Think of a pipe as a small kernel
   buffer that you can use to transfer data from one end to the other.

   The traditional unix read/write is extended with a "splice()" operation
   that transfers data buffers to or from a pipe buffer.

   Named by Larry McVoy, original implementation from Linus, extended by
   Jens to support splicing to files and fixing the initial implementation
   bugs.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-30 12:28:18 -08:00
Denis Vlasenko
56079431b6 [NET]: Deinline some larger functions from netdevice.h
On an allyesconfig'ured kernel:

Size  Uses Wasted Name and definition
===== ==== ====== ================================================
   95  162  12075 netif_wake_queue      include/linux/netdevice.h
  129   86   9265 dev_kfree_skb_any     include/linux/netdevice.h
  127   56   5885 netif_device_attach   include/linux/netdevice.h
   73   86   4505 dev_kfree_skb_irq     include/linux/netdevice.h
   46   60   1534 netif_device_detach   include/linux/netdevice.h
  119   16   1485 __netif_rx_schedule   include/linux/netdevice.h
  143    5    492 netif_rx_schedule     include/linux/netdevice.h
   81    7    366 netif_schedule        include/linux/netdevice.h

netif_wake_queue is big because __netif_schedule is a big inline:

static inline void __netif_schedule(struct net_device *dev)
{
        if (!test_and_set_bit(__LINK_STATE_SCHED, &dev->state)) {
                unsigned long flags;
                struct softnet_data *sd;

                local_irq_save(flags);
                sd = &__get_cpu_var(softnet_data);
                dev->next_sched = sd->output_queue;
                sd->output_queue = dev;
                raise_softirq_irqoff(NET_TX_SOFTIRQ);
                local_irq_restore(flags);
        }
}

static inline void netif_wake_queue(struct net_device *dev)
{
#ifdef CONFIG_NETPOLL_TRAP
        if (netpoll_trap())
                return;
#endif
        if (test_and_clear_bit(__LINK_STATE_XOFF, &dev->state))
                __netif_schedule(dev);
}

By de-inlining __netif_schedule we are saving a lot of text
at each callsite of netif_wake_queue and netif_schedule.
__netif_rx_schedule is also big, and it makes more sense to keep
both of them out of line.

Patch also deinlines dev_kfree_skb_any. We can deinline dev_kfree_skb_irq
instead... oh well.

netif_device_attach/detach are not hot paths, we can deinline them too.

Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-29 15:57:29 -08:00
Jeff Garzik
e21a2b0cc5 Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2006-03-29 17:30:19 -05:00
Randy Dunlap
68907dad58 [DCCP]: Use NULL for pointers, comfort sparse.
From: Randy Dunlap <rdunlap@xenotime.net>

Use NULL instead of 0 for pointers.
Fix these sparse warnings:
net/dccp/feat.c:207:20: warning: Using plain integer as NULL pointer
net/dccp/feat.c:325:21: warning: Using plain integer as NULL pointer
net/dccp/feat.c:526:20: warning: Using plain integer as NULL pointer

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-29 13:58:25 -08:00
Patrick Caulfield
6a57b2ee45 [DECNET]: Fix refcount
From: Patrick Caulfield <patrick@tykepenguin.com>

This patch fixes a bug in the reference counting for the default
DECnet device.

If the device is changed, then the new device had its refcount
decremented rather than the old one!
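
The intended behaviour can be sketched as below. This is a userspace model, not the DECnet code; struct dev_model, dev_hold_model(), dev_put_model() and set_default_dev() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted device. */
struct dev_model {
        int refcnt;
};

static struct dev_model *default_dev;

static void dev_hold_model(struct dev_model *d) { d->refcnt++; }
static void dev_put_model(struct dev_model *d)  { d->refcnt--; }

/* Make newdev the default device, moving the reference from old to new. */
void set_default_dev(struct dev_model *newdev)
{
        struct dev_model *old = default_dev;

        assert(newdev != NULL);
        dev_hold_model(newdev);         /* the default pointer pins the new device */
        default_dev = newdev;
        if (old)
                dev_put_model(old);     /* drop the reference on the OLD device */
}
```

Dropping the reference on newdev instead of old would leave the old device pinned forever and the new default one reference short.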

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-29 13:57:31 -08:00
Andrew Morton
65b4b4e81a [NETFILTER]: Rename init functions.
Every netfilter module uses `init' for its module_init() function and
`fini' or `cleanup' for its module_exit() function.

Problem is, this creates uninformative initcall_debug output and makes
ctags rather useless.

So go through and rename them all to $(filename)_init and
$(filename)_fini.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-28 17:02:48 -08:00
S P
c3e5d877aa [TCP]: Fix RFC2465 typo.
Signed-off-by: S P <speattle@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-28 17:02:47 -08:00
Herbert Xu
d2acc3479c [INET]: Introduce tunnel4/tunnel6
Basically this patch moves the generic tunnel protocol stuff out of
xfrm4_tunnel/xfrm6_tunnel and moves it into the new files of tunnel4.c
and tunnel6 respectively.

The reason for this is that the problem that Hugo uncovered is only
the tip of the iceberg.  The real problem is that when we removed the
dependency of ipip on xfrm4_tunnel we didn't really consider the module
case at all.

For instance, as it is it's possible to build both ipip and xfrm4_tunnel
as modules and if the latter is loaded then ipip simply won't load.

After considering the alternatives I've decided that the best way out of
this is to restore the dependency of ipip on the non-xfrm-specific part
of xfrm4_tunnel.  This is acceptable IMHO because the intention of the
removal was really to be able to use ipip without the xfrm subsystem.
This is still preserved by this patch.

So now both ipip/xfrm4_tunnel depend on the new tunnel4.c which handles
the arbitration between the two.  The order of processing is determined
by a simple integer which ensures that ipip gets processed before
xfrm4_tunnel.
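
The arbitration described above can be sketched as a priority-sorted handler list. This is a simplified userspace model under stated assumptions: every name ending in _model, the two example handlers and the priority values 1 and 2 are hypothetical, and a handler is assumed to return 0 when it consumes the packet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler record; lower priority values run first. */
struct tunnel_handler_model {
        int (*handler)(int pkt);        /* returns 0 when it consumed pkt */
        int priority;
        struct tunnel_handler_model *next;
};

static struct tunnel_handler_model *handlers;
int last_claimed_by;                    /* 1 = ipip-like, 2 = xfrm-like */

/* Insert h so the list stays sorted by ascending priority. */
int tunnel_register_model(struct tunnel_handler_model *h)
{
        struct tunnel_handler_model **pprev;

        assert(h != NULL && h->handler != NULL);
        for (pprev = &handlers; *pprev; pprev = &(*pprev)->next) {
                if ((*pprev)->priority > h->priority)
                        break;
                if ((*pprev)->priority == h->priority)
                        return -1;      /* refuse duplicate priorities */
        }
        h->next = *pprev;
        *pprev = h;
        return 0;
}

/* Offer pkt to each handler in order; stop at the first that accepts it. */
int tunnel_rcv_model(int pkt)
{
        struct tunnel_handler_model *h;

        for (h = handlers; h; h = h->next)
                if (h->handler(pkt) == 0)
                        return 0;
        return -1;                      /* nobody claimed the packet */
}

/* Example handlers: the ipip-like one only accepts even packet ids. */
int ipip_like_rcv(int pkt)
{
        if (pkt % 2)
                return -1;
        last_claimed_by = 1;
        return 0;
}

int xfrm_like_rcv(int pkt)
{
        (void)pkt;
        last_claimed_by = 2;
        return 0;
}
```

Registration order does not matter in this model: the ipip-like handler with priority 1 is always consulted before the xfrm-like handler with priority 2.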

The situation for ICMP handling is a little bit more complicated since
we may not have enough information to determine who it's for.  It's not
a big deal at the moment since the xfrm ICMP handlers are basically
no-ops.  In future we can deal with this when we look at ICMP caching
in general.

The user-visible change to this is the removal of the TUNNEL Kconfig
prompts.  This makes sense because it can only be used through IPCOMP
as it stands.

The addition of the new modules shouldn't introduce any problems since
module dependency will cause them to be loaded.

Oh and I also turned some unnecessary pskb's in IPv6 related to this
patch to skb's.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-28 17:02:46 -08:00
Denis Vlasenko
f0088a50e7 [NET]: deinline 200+ byte inlines in sock.h
Sizes in bytes (allyesconfig, i386) and files where those inlines
are used:

238 sock_queue_rcv_skb 2.6.16/net/x25/x25_in.o
238 sock_queue_rcv_skb 2.6.16/net/rose/rose_in.o
238 sock_queue_rcv_skb 2.6.16/net/packet/af_packet.o
238 sock_queue_rcv_skb 2.6.16/net/netrom/nr_in.o
238 sock_queue_rcv_skb 2.6.16/net/llc/llc_sap.o
238 sock_queue_rcv_skb 2.6.16/net/llc/llc_conn.o
238 sock_queue_rcv_skb 2.6.16/net/irda/af_irda.o
238 sock_queue_rcv_skb 2.6.16/net/ipx/af_ipx.o
238 sock_queue_rcv_skb 2.6.16/net/ipv6/udp.o
238 sock_queue_rcv_skb 2.6.16/net/ipv6/raw.o
238 sock_queue_rcv_skb 2.6.16/net/ipv4/udp.o
238 sock_queue_rcv_skb 2.6.16/net/ipv4/raw.o
238 sock_queue_rcv_skb 2.6.16/net/ipv4/ipmr.o
238 sock_queue_rcv_skb 2.6.16/net/econet/econet.o
238 sock_queue_rcv_skb 2.6.16/net/econet/af_econet.o
238 sock_queue_rcv_skb 2.6.16/net/bluetooth/sco.o
238 sock_queue_rcv_skb 2.6.16/net/bluetooth/l2cap.o
238 sock_queue_rcv_skb 2.6.16/net/bluetooth/hci_sock.o
238 sock_queue_rcv_skb 2.6.16/net/ax25/ax25_in.o
238 sock_queue_rcv_skb 2.6.16/net/ax25/af_ax25.o
238 sock_queue_rcv_skb 2.6.16/net/appletalk/ddp.o
238 sock_queue_rcv_skb 2.6.16/drivers/net/pppoe.o

276 sk_receive_skb 2.6.16/net/decnet/dn_nsp_in.o
276 sk_receive_skb 2.6.16/net/dccp/ipv6.o
276 sk_receive_skb 2.6.16/net/dccp/ipv4.o
276 sk_receive_skb 2.6.16/net/dccp/dccp_ipv6.o
276 sk_receive_skb 2.6.16/drivers/net/pppoe.o

209 sk_dst_check 2.6.16/net/ipv6/ip6_output.o
209 sk_dst_check 2.6.16/net/ipv4/udp.o
209 sk_dst_check 2.6.16/net/decnet/dn_nsp_out.o

Large inlines with multiple callers:
Size  Uses Wasted Name and definition
===== ==== ====== ================================================
  238   21   4360 sock_queue_rcv_skb    include/net/sock.h
  109   10    801 sock_recv_timestamp   include/net/sock.h
  276    4    768 sk_receive_skb        include/net/sock.h
   94    8    518 __sk_dst_check        include/net/sock.h
  209    3    378 sk_dst_check  include/net/sock.h
  131    4    333 sk_setup_caps include/net/sock.h
  152    2    132 sk_stream_alloc_pskb  include/net/sock.h
  125    2    105 sk_stream_writequeue_purge    include/net/sock.h

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-28 17:02:45 -08:00
David S. Miller
1d1818316f [ECONET]: Convert away from SOCKOPS_WRAPPED
Just use a local econet_mutex instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-28 17:02:43 -08:00
Petr Vandrovec
f6c90b71a3 [NET]: Fix ipx/econet/appletalk/irda ioctl crashes
Fix kernel oopses whenever somebody issues compatible ioctl on AppleTalk,
Econet, IPX or IRDA socket.  For AppleTalk/Econet/IRDA it restores state
in which these sockets were before compat_ioctl was introduced to the socket
ops, for IPX it implements support for 4 ioctls which were not implemented
before - as these ioctls use structures which match between 32bit and 64bit
userspace, no special code is needed, just call 64bit ioctl handler.

Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-28 17:02:43 -08:00
Alexey Dobriyan
7f927fcc2f [PATCH] Typo fixes
Fix a lot of typos.  Eyeballed by jmc@ in OpenBSD.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:08 -08:00
Arjan van de Ven
4b6f5d20b0 [PATCH] Make most file operations structs in fs/ const
This is a conversion to make the various file_operations structs in fs/
const.  Basically a regexp job, with a few manual fixups.

The goal is both to increase correctness (harder to accidentally write to
shared datastructures) and to reduce the false sharing of cachelines with
things that get dirty in .data (while .rodata is nicely read only and thus
cache clean).

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:06 -08:00
Arjan van de Ven
99ac48f54a [PATCH] mark f_ops const in the inode
Mark the f_ops members of inodes as const, as well as fix the
ripple-through this causes by places that copy this f_ops and then "do
stuff" with it.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:05 -08:00
David Woodhouse
2638fed7cc [PATCH] softmac: reduce default rate to 11Mbps.
We don't make much of an attempt to fall back to lower rates, and 54M
just isn't reliable enough for many people. In fact, it's not clear we
even set it to 11M if we're trying to associate with an 802.11b AP.

This patch makes us default to 11M, which ought to work for most people.
When we actually handle dynamic rate adjustment, we can reconsider the
defaults -- but even then, probably it makes as much sense to start at
11M and adjust it upwards as it does to start at 54M and reduce it.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-27 14:04:13 -05:00
David Woodhouse
16f4352733 [PATCH] softmac: reduce scan dwell time
It currently takes something like 8 seconds to do a scan, because we
spend half a second on each channel. Reduce that time to 20ms per
channel.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-27 14:04:09 -05:00
Larry Finger
d94606e058 [PATCH] Minor (janitorial) change to ieee80211
The attached patch removes a potential problem from ieee80211_wx.c, by changing the name of routine
ipw2100_translate_scan to ieee80211_translate_scan. The problem is minor as the routine is declared
static; however, if it were made global, it would pollute the namespace.

Signed-Off-By: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-27 12:07:02 -05:00
Linus Torvalds
fdccffc6b7 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET]: drop duplicate assignment in request_sock
  [IPSEC]: Fix tunnel error handling in ipcomp6
2006-03-27 08:47:29 -08:00
Alan Stern
e041c68341 [PATCH] Notifier chain update: API changes
The kernel's implementation of notifier chains is unsafe.  There is no
protection against entries being added to or removed from a chain while the
chain is in use.  The issues were discussed in this thread:

    http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2

We noticed that notifier chains in the kernel fall into two basic usage
classes:

	"Blocking" chains are always called from a process context
	and the callout routines are allowed to sleep;

	"Atomic" chains can be called from an atomic context and
	the callout routines are not allowed to sleep.

We decided to codify this distinction and make it part of the API.  Therefore
this set of patches introduces three new, parallel APIs: one for blocking
notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
really just the old API under a new name).  New kinds of data structures are
used for the heads of the chains, and new routines are defined for
registration, unregistration, and calling a chain.  The three APIs are
explained in include/linux/notifier.h and their implementation is in
kernel/sys.c.

With atomic and blocking chains, the implementation guarantees that the chain
links will not be corrupted and that chain callers will not get messed up by
entries being added or removed.  For raw chains the implementation provides no
guarantees at all; users of this API must provide their own protections.  (The
idea was that situations may come up where the assumptions of the atomic and
blocking APIs are not appropriate, so it should be possible for users to
handle these things in their own way.)

There are some limitations, which should not be too hard to live with.  For
atomic/blocking chains, registration and unregistration must always be done in
a process context since the chain is protected by a mutex/rwsem.  Also, a
callout routine for a non-raw chain must not try to register or unregister
entries on its own chain.  (This did happen in a couple of places and the code
had to be changed to avoid it.)

Since atomic chains may be called from within an NMI handler, they cannot use
spinlocks for synchronization.  Instead we use RCU.  The overhead falls almost
entirely in the unregister routine, which is okay since unregistration is much
less frequent than calling a chain.
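
For orientation, the shape of a "raw" chain (no locking at all, so callers must serialize access themselves) can be sketched in userspace C. This is a simplified model and not the code in kernel/sys.c; all names ending in _model, the MODEL_NOTIFY_* values and count_events() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NOTIFY_DONE 0
#define MODEL_NOTIFY_STOP 1

struct notifier_model {
        int (*call)(struct notifier_model *nb, unsigned long event, void *data);
        struct notifier_model *next;
};

struct raw_head_model {
        struct notifier_model *head;
};

/* Append n to the end of the chain. */
void chain_register_model(struct raw_head_model *nh, struct notifier_model *n)
{
        struct notifier_model **nl = &nh->head;

        assert(n != NULL && n->call != NULL);
        while (*nl)
                nl = &(*nl)->next;
        n->next = NULL;
        *nl = n;
}

/* Unlink n from the chain; return 0 on success, -1 if it was not found. */
int chain_unregister_model(struct raw_head_model *nh, struct notifier_model *n)
{
        struct notifier_model **nl;

        for (nl = &nh->head; *nl; nl = &(*nl)->next) {
                if (*nl == n) {
                        *nl = n->next;
                        return 0;
                }
        }
        return -1;
}

/* Invoke every callout in order until one asks to stop. */
int chain_call_model(struct raw_head_model *nh, unsigned long event, void *data)
{
        struct notifier_model *nb = nh->head, *next;
        int ret = MODEL_NOTIFY_DONE;

        while (nb) {
                next = nb->next;        /* read the link before the callout runs */
                ret = nb->call(nb, event, data);
                if (ret == MODEL_NOTIFY_STOP)
                        break;
                nb = next;
        }
        return ret;
}

/* Example callout: counts invocations through the data pointer. */
int count_events(struct notifier_model *nb, unsigned long event, void *data)
{
        (void)nb;
        (void)event;
        ++*(int *)data;
        return MODEL_NOTIFY_DONE;
}
```

The atomic and blocking variants described above would wrap the same list walk in RCU or an rwsem respectively; that protection is deliberately left out of this sketch.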

Here is the list of chains that we adjusted and their classifications.  None
of them use the raw API, so for the moment it is only a placeholder.

  ATOMIC CHAINS
  -------------
arch/i386/kernel/traps.c:		i386die_chain
arch/ia64/kernel/traps.c:		ia64die_chain
arch/powerpc/kernel/traps.c:		powerpc_die_chain
arch/sparc64/kernel/traps.c:		sparc64die_chain
arch/x86_64/kernel/traps.c:		die_chain
drivers/char/ipmi/ipmi_si_intf.c:	xaction_notifier_list
kernel/panic.c:				panic_notifier_list
kernel/profile.c:			task_free_notifier
net/bluetooth/hci_core.c:		hci_notifier
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_chain
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_expect_chain
net/ipv6/addrconf.c:			inet6addr_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_expect_chain
net/netlink/af_netlink.c:		netlink_chain

  BLOCKING CHAINS
  ---------------
arch/powerpc/platforms/pseries/reconfig.c:	pSeries_reconfig_chain
arch/s390/kernel/process.c:		idle_chain
arch/x86_64/kernel/process.c:		idle_notifier
drivers/base/memory.c:			memory_chain
drivers/cpufreq/cpufreq.c:		cpufreq_policy_notifier_list
drivers/cpufreq/cpufreq.c:		cpufreq_transition_notifier_list
drivers/macintosh/adb.c:		adb_client_list
drivers/macintosh/via-pmu.c:		sleep_notifier_list
drivers/macintosh/via-pmu68k.c:		sleep_notifier_list
drivers/macintosh/windfarm_core.c:	wf_client_list
drivers/usb/core/notify.c:		usb_notifier_list
drivers/video/fbmem.c:			fb_notifier_list
kernel/cpu.c:				cpu_chain
kernel/module.c:			module_notify_list
kernel/profile.c:			munmap_notifier
kernel/profile.c:			task_exit_notifier
kernel/sys.c:				reboot_notifier_list
net/core/dev.c:				netdev_chain
net/decnet/dn_dev.c:			dnaddr_chain
net/ipv4/devinet.c:			inetaddr_chain

It's possible that some of these classifications are wrong.  If they are,
please let us know or submit a patch to fix them.  Note that any chain that
gets called very frequently should be atomic, because the rwsem read-locking
used for blocking chains is very likely to incur cache misses on SMP systems.
(However, if the chain's callout routines may sleep then the chain cannot be
atomic.)

The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
material written by Keith Owens and suggestions from Paul McKenney and Andrew
Morton.

[jes@sgi.com: restructure the notifier chain initialization macros]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:50 -08:00
NeilBrown
ad1b5229de [PATCH] knfsd: Tidy up unix_domain_find
We shouldn't really compare &new->h with anything when new == NULL.  Also
gather three different if statements that all start

  if (rv ...

into one large if.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:43 -08:00
Adrian Bunk
74cae61ab4 [PATCH] fs/nfsd/export.c,net/sunrpc/cache.c: make needlessly global code static
We can now make some code static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:43 -08:00
NeilBrown
baab935ff3 [PATCH] knfsd: Convert sunrpc_cache to use krefs
.. it makes some of the code nicer.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:43 -08:00
NeilBrown
ebd0cb1af3 [PATCH] knfsd: Unexport cache_fresh and fix a small race
Cache_fresh is now only used in cache.c, so unexport it.

Part of cache_fresh (setting CACHE_VALID) should really be done under the
lock, while part (calling cache_revisit_request etc) must be done outside the
lock.  So we split it up appropriately.
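
A minimal userspace sketch of that split, assuming a pthread mutex in place of
the cache lock.  The names (struct item, ITEM_VALID, ITEM_PENDING, item_fresh,
revisit_requests) are hypothetical and do not come from net/sunrpc/cache.c;
the pending flag is an assumption added so the example has something to
decide on.

```c
#include <pthread.h>

#define ITEM_VALID   (1u << 0)
#define ITEM_PENDING (1u << 1)

/* Hypothetical cache entry. */
struct item {
	pthread_mutex_t lock;
	unsigned int flags;
	int revisits;	/* counts revisit_requests() calls, for illustration */
};

/* Stand-in for work that must not run with the lock held. */
static void revisit_requests(struct item *it)
{
	it->revisits++;
}

static void item_fresh(struct item *it)
{
	int was_pending;

	/* Part 1: state changes, done under the lock. */
	pthread_mutex_lock(&it->lock);
	it->flags |= ITEM_VALID;
	was_pending = (it->flags & ITEM_PENDING) != 0;
	it->flags &= ~ITEM_PENDING;
	pthread_mutex_unlock(&it->lock);

	/* Part 2: follow-up work, done after the lock is dropped. */
	if (was_pending)
		revisit_requests(it);
}
```

The decision about whether follow-up work is needed is captured in a local
variable while the lock is held, so the second half never has to look at
shared state again.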

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:43 -08:00
NeilBrown
4013edea9a [PATCH] knfsd: An assortment of little fixes to the sunrpc cache code
- in cache_check, h must be non-NULL as it has been de-referenced,
  so don't bother checking for NULL.

- When a cache-item is updated, we need to call cache_revisit_request to see
  if there is a pending request waiting for that item.  We were using
  a transition to CACHE_VALID to see if that was needed, however that is
  wrong as an expired entry will still be marked 'valid' (as the data is valid
  and will need to be released).  So instead use an off transition for
  CACHE_PENDING which is exactly the right thing to test.

- Add a little bit more debugging info.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:42 -08:00
NeilBrown
17f834b6d2 [PATCH] knfsd: Use new cache code for rsc cache
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:42 -08:00
NeilBrown
d4d11ea9d6 [PATCH] knfsd: Use new sunrpc cache for rsi cache
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:42 -08:00
NeilBrown
1a9917c2da [PATCH] knfsd: Convert ip_map cache to use the new lookup routine
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:42 -08:00
NeilBrown
15a5f6bd23 [PATCH] knfsd: Create cache_lookup function instead of using a macro to declare one
The C++-like 'template' approach proves to be too ugly and hard to work with.

The old 'template' won't go away until all users are updated.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:41 -08:00
NeilBrown
7d317f2c9f [PATCH] knfsd: Get rid of 'inplace' sunrpc caches
These were an unnecessary wart.  Also only have one 'DefineSimpleCache..'
instead of two.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:41 -08:00
NeilBrown
efc36aa560 [PATCH] knfsd: Change the store of auth_domains to not be a 'cache'
The 'auth_domain's are simply handles on internal data structures.  They do
not cache information from user-space, and forcing them into the mold of a
'cache' misrepresents their true nature and causes confusion.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:41 -08:00
Norbert Kiesel
3eb4801d7b [NET]: drop duplicate assignment in request_sock
Just noticed that request_sock.[ch] contain a useless assignment of
rskq_accept_head to itself.  I assume this is a typo and the 2nd one
was supposed to be _tail.  However, setting _tail to NULL is not
needed, so the patch below just drops the 2nd assignment.

Signed-off-By: Norbert Kiesel <nkiesel@tbdnetworks.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-26 17:39:55 -08:00
Herbert Xu
6abaaaae6d [IPSEC]: Fix tunnel error handling in ipcomp6
The error handling in ipcomp6_tunnel_create is broken in two ways:

1) If we fail to allocate an SPI (this should never happen in practice
since there are plenty of 32-bit SPI values for us to use), we will
still go ahead and create the SA.

2) When xfrm_init_state fails, we first of all may trigger the BUG_TRAP
in __xfrm_state_destroy because we didn't set the state to DEAD.  More
importantly we end up returning the freed state as if we succeeded!

This patch fixes them both.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-26 17:37:54 -08:00
Matthew Dobson
93d2341c75 [PATCH] mempool: use mempool_create_slab_pool()
Modify well over a dozen mempool users to call mempool_create_slab_pool()
rather than calling mempool_create() with extra arguments, saving about 30
lines of code and increasing readability.

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:00 -08:00
Ingo Molnar
14cc3e2b63 [PATCH] sem2mutex: misc static one-file mutexes
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jens Axboe <axboe@suse.de>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Cc: Greg KH <greg@kroah.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Adam Belay <ambx1@neo.rr.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:55 -08:00
Linus Torvalds
1b9a391736 Merge branch 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (22 commits)
  [PATCH] fix audit_init failure path
  [PATCH] EXPORT_SYMBOL patch for audit_log, audit_log_start, audit_log_end and audit_format
  [PATCH] sem2mutex: audit_netlink_sem
  [PATCH] simplify audit_free() locking
  [PATCH] Fix audit operators
  [PATCH] promiscuous mode
  [PATCH] Add tty to syscall audit records
  [PATCH] add/remove rule update
  [PATCH] audit string fields interface + consumer
  [PATCH] SE Linux audit events
  [PATCH] Minor cosmetic cleanups to the code moved into auditfilter.c
  [PATCH] Fix audit record filtering with !CONFIG_AUDITSYSCALL
  [PATCH] Fix IA64 success/failure indication in syscall auditing.
  [PATCH] Miscellaneous bug and warning fixes
  [PATCH] Capture selinux subject/object context information.
  [PATCH] Exclude messages by message type
  [PATCH] Collect more inode information during syscall processing.
  [PATCH] Pass dentry, not just name, in fsnotify creation hooks.
  [PATCH] Define new range of userspace messages.
  [PATCH] Filter rule comparators
  ...

Fixed trivial conflict in security/selinux/hooks.c
2006-03-25 09:24:53 -08:00
Linus Torvalds
53846a21c1 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (103 commits)
  SUNRPC,RPCSEC_GSS: spkm3--fix config dependencies
  SUNRPC,RPCSEC_GSS: spkm3: import contexts using NID_cast5_cbc
  LOCKD: Make nlmsvc_traverse_shares return void
  LOCKD: nlmsvc_traverse_blocks return is unused
  SUNRPC,RPCSEC_GSS: fix krb5 sequence numbers.
  NFSv4: Dont list system.nfs4_acl for filesystems that don't support it.
  SUNRPC,RPCSEC_GSS: remove unnecessary kmalloc of a checksum
  SUNRPC: Ensure rpc_call_async() always calls tk_ops->rpc_release()
  SUNRPC: Fix memory barriers for req->rq_received
  NFS: Fix a race in nfs_sync_inode()
  NFS: Clean up nfs_flush_list()
  NFS: Fix a race with PG_private and nfs_release_page()
  NFSv4: Ensure the callback daemon flushes signals
  SUNRPC: Fix a 'Busy inodes' error in rpc_pipefs
  NFS, NLM: Allow blocking locks to respect signals
  NFS: Make nfs_fhget() return appropriate error values
  NFSv4: Fix an oops in nfs4_fill_super
  lockd: blocks should hold a reference to the nlm_file
  NFSv4: SETCLIENTID_CONFIRM should handle NFS4ERR_DELAY/NFS4ERR_RESOURCE
  NFSv4: Send the delegation stateid for SETATTR calls
  ...
2006-03-25 09:18:27 -08:00
Linus Torvalds
b55813a2e5 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NETFILTER] x_table.c: sem2mutex
  [IPV4]: Aggregate route entries with different TOS values
  [TCP]: Mark tcp_*mem[] __read_mostly.
  [TCP]: Set default max buffers from memory pool size
  [SCTP]: Fix up sctp_rcv return value
  [NET]: Take RTNL when unregistering notifier
  [WIRELESS]: Fix config dependencies.
  [NET]: Fill in a 32-bit hole in struct sock on 64-bit platforms.
  [NET]: Ensure device name passed to SO_BINDTODEVICE is NULL terminated.
  [MODULES]: Don't allow statically declared exports
  [BRIDGE]: Unaligned accesses in the ethernet bridge
2006-03-25 08:39:20 -08:00
Davide Libenzi
f348d70a32 [PATCH] POLLRDHUP/EPOLLRDHUP handling for half-closed devices notifications
Implement half-closed device notification by adding a new POLLRDHUP
(and its alias EPOLLRDHUP) bit to the existing poll/select sets.  The
existing POLLHUP handling does not correctly report half-closed devices,
but there was concern about changing it, so this implementation leaves the
current POLLHUP reporting unchanged and simply adds a new bit that is set in
the few places where it makes sense.  The same thing was discussed and
conceptually agreed quite some time ago:

http://lkml.org/lkml/2003/7/12/116

Since this new event bit is added to the existing Linux poll infrastructure,
even the existing poll/select system calls will be able to use it.  As for
the existing POLLHUP handling, the patch leaves it as is.  The
pollrdhup-2.6.16.rc5-0.10.diff defines the POLLRDHUP for all the existing
archs and sets the bit in the six relevant files.  The other attached diff
is the simple change required to sys/epoll.h to add the EPOLLRDHUP
definition.

There is "a stupid program" to test POLLRDHUP delivery here:

 http://www.xmailserver.org/pollrdhup-test.c

It tests poll(2), but since the delivery is the same, epoll(2) will work
equally well.
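
For readers who want something shorter than the test program, here is a
minimal sketch of how a caller might ask for the new bit.  It is not the
program linked above.  It assumes Linux with glibc, where POLLRDHUP is only
visible when _GNU_SOURCE is defined, and it uses an AF_UNIX socket pair so
that it needs no network setup; peer_half_close_seen() is a name made up for
this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes POLLRDHUP in <poll.h> on glibc */
#endif
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if POLLRDHUP is reported after the peer shuts down its
 * writing side, 0 if it is not, and -1 on a setup error. */
static int peer_half_close_seen(void)
{
	int sv[2];
	struct pollfd pfd;
	int n, seen;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* Half-close: sv[1] will send no more data but can still read. */
	if (shutdown(sv[1], SHUT_WR) < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	/* Unlike POLLHUP, POLLRDHUP must be requested explicitly. */
	pfd.fd = sv[0];
	pfd.events = POLLIN | POLLRDHUP;
	pfd.revents = 0;
	n = poll(&pfd, 1, 1000);
	seen = (n == 1) && (pfd.revents & POLLRDHUP);

	close(sv[0]);
	close(sv[1]);
	return seen ? 1 : 0;
}
```

A caller that sees POLLRDHUP knows the peer will send nothing more, while
writes toward the peer may still succeed.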

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:56 -08:00
Jesper Juhl
21e4f95269 [PATCH] fix 'defined but not used' warning in net/rxrpc/main.c::rxrpc_initialise
net/rxrpc/main.c: In function `rxrpc_initialise':
net/rxrpc/main.c:83: warning: label `error_proc' defined but not used

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:52 -08:00
Al Viro
871751e25d [PATCH] slab: implement /proc/slab_allocators
Implement /proc/slab_allocators.   It produces output like:

idr_layer_cache: 80 idr_pre_get+0x33/0x4e
buffer_head: 2555 alloc_buffer_head+0x20/0x75
mm_struct: 9 mm_alloc+0x1e/0x42
mm_struct: 20 dup_mm+0x36/0x370
vm_area_struct: 384 dup_mm+0x18f/0x370
vm_area_struct: 151 do_mmap_pgoff+0x2e0/0x7c3
vm_area_struct: 1 split_vma+0x5a/0x10e
vm_area_struct: 11 do_brk+0x206/0x2e2
vm_area_struct: 2 copy_vma+0xda/0x142
vm_area_struct: 9 setup_arg_pages+0x99/0x214
fs_cache: 8 copy_fs_struct+0x21/0x133
fs_cache: 29 copy_process+0xf38/0x10e3
files_cache: 30 alloc_files+0x1b/0xcf
signal_cache: 81 copy_process+0xbaa/0x10e3
sighand_cache: 77 copy_process+0xe65/0x10e3
sighand_cache: 1 de_thread+0x4d/0x5f8
anon_vma: 241 anon_vma_prepare+0xd9/0xf3
size-2048: 1 add_sect_attrs+0x5f/0x145
size-2048: 2 journal_init_revoke+0x99/0x302
size-2048: 2 journal_init_revoke+0x137/0x302
size-2048: 2 journal_init_inode+0xf9/0x1c4

Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Alexander Nyberg <alexn@telia.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
DESC
slab-leaks3-locking-fix
EDESC
From: Andrew Morton <akpm@osdl.org>

Update for slab-remove-cachep-spinlock.patch

Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Alexander Nyberg <alexn@telia.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:49 -08:00
Ingo Molnar
9e19bb6d7a [NETFILTER] x_table.c: sem2mutex
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-25 01:41:10 -08:00
Ilia Sotnikov
cef2685e00 [IPV4]: Aggregate route entries with different TOS values
When we get an ICMP need-to-frag message, the original TOS value in the
ICMP payload cannot be used as a key to look up the routes to update.
This is because the TOS field may have been modified by routers on the
way.  Similarly, ip_rt_redirect should also ignore the TOS as the router
that gave us the message may have modified the TOS value.

The patch achieves this objective by aggregating entries that differ only
in their TOS values into the same bucket.  This
makes it easy to update them at the same time when an ICMP message is
received.

In future we should use a twin-hashing scheme where the aggregation
occurs at the entry level.  That is, the TOS goes back into the hash
for normal lookups while ICMP lookups will end up with a node that
gives us a list that contains all other route entries that differ
only by TOS.
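
The bucket-level aggregation that this patch introduces can be sketched in
userspace as follows.  This is an illustration only: the structure, the hash
function, and all names (struct route, route_hash, route_add, route_lookup,
route_update_mtu) are hypothetical and far simpler than the real route cache.

```c
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 16

/* Hypothetical, much simplified route cache entry. */
struct route {
	uint32_t dst, src;
	uint8_t tos;
	unsigned int mtu;
	struct route *next;
};

static struct route *table[NBUCKETS];

/* TOS is deliberately left out of the hash, so entries that differ
 * only by TOS always share a bucket. */
static unsigned int route_hash(uint32_t dst, uint32_t src)
{
	uint32_t h = dst ^ (src * 2654435761u);

	return (h ^ (h >> 16)) % NBUCKETS;
}

static void route_add(struct route *r)
{
	unsigned int b = route_hash(r->dst, r->src);

	r->next = table[b];
	table[b] = r;
}

/* A normal lookup still compares TOS while walking the bucket. */
static struct route *route_lookup(uint32_t dst, uint32_t src, uint8_t tos)
{
	struct route *r;

	for (r = table[route_hash(dst, src)]; r; r = r->next)
		if (r->dst == dst && r->src == src && r->tos == tos)
			return r;
	return NULL;
}

/* An ICMP-style update ignores TOS: one bucket walk reaches every
 * matching entry.  Returns the number of entries updated. */
static int route_update_mtu(uint32_t dst, uint32_t src, unsigned int mtu)
{
	struct route *r;
	int n = 0;

	for (r = table[route_hash(dst, src)]; r; r = r->next) {
		if (r->dst == dst && r->src == src) {
			r->mtu = mtu;
			n++;
		}
	}
	return n;
}
```

The trade-off is longer bucket chains for normal lookups, which is what the
twin-hashing idea described above is meant to avoid.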

Signed-off-by: Ilia Sotnikov <hostcc@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-25 01:38:55 -08:00
David S. Miller
b8059eadf9 [TCP]: Mark tcp_*mem[] __read_mostly.
Suggested by Stephen Hemminger.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-25 01:36:56 -08:00
John Heffner
7b4f4b5ebc [TCP]: Set default max buffers from memory pool size
This patch sets the maximum TCP buffer sizes (available to automatic
buffer tuning, not to setsockopt) based on the TCP memory pool size.
The maximum sndbuf and rcvbuf each will be up to 4 MB, but no more
than 1/128 of the memory pressure threshold.
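
The arithmetic can be sketched as below.  The helper name and its parameters
are hypothetical, and the sketch assumes the pressure threshold is expressed
in pages and that "4 MB" means 4 * 1024 * 1024 bytes.

```c
/* Cap from the text above: up to 4 MB, but no more than 1/128 of
 * the memory pressure threshold.  Taking the threshold in pages is
 * an assumption of this sketch. */
static unsigned long long tcp_max_buf(unsigned long long pressure_pages,
				      unsigned long long page_size)
{
	const unsigned long long cap = 4ULL * 1024 * 1024;
	unsigned long long limit = pressure_pages * page_size / 128;

	return limit < cap ? limit : cap;
}
```

With 4096-byte pages, a threshold of 98304 pages (384 MB) gives a 3 MB
maximum, while a threshold of 1048576 pages (4 GB) would give 32 MB and is
therefore capped at 4 MB.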

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-25 01:34:07 -08:00
Herbert Xu
2babf9daae [SCTP]: Fix up sctp_rcv return value
I was working on the ipip/xfrm problem and as usual I get side-tracked by
other problems.

As part of an attempt to change the IPv4 protocol handler calling
convention I found that SCTP violated the existing convention.

It's returning non-zero values after freeing the skb.  This is doubly bad
as 1) the skb gets resubmitted; 2) the return value is interpreted as a
protocol number.

This patch changes those return values to zero.

IPv6 doesn't suffer from this problem because it uses a positive return
value as an indication for resubmission.  So the only effect of this patch
there is to increment the IPSTATS_MIB_INDELIVERS counter which IMHO is
the right thing to do.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-25 01:25:29 -08:00
Herbert Xu
9f514950bb [NET]: Take RTNL when unregistering notifier
The netdev notifier call chain is currently unregistered without taking
any locks outside the notifier system.  Because the notifier system itself
does not synchronise unregistration with respect to the calling of the
chain, we as its user need to do our own locking.

We are supposed to take the RTNL for all calls to netdev notifiers, so
taking the RTNL should be sufficient to protect it.

The registration path in dev.c already takes the RTNL so it's OK.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-25 01:24:25 -08:00
David S. Miller
f67ed26f2b [NET]: Ensure device name passed to SO_BINDTODEVICE is NULL terminated.
Found by Solar Designer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-24 15:44:59 -08:00
Peter Chubb
4dc6d9cc38 [BRIDGE]: Unaligned accesses in the ethernet bridge
I see lots of
	kernel unaligned access to 0xa0000001009dbb6f, ip=0xa000000100811591
	kernel unaligned access to 0xa0000001009dbb6b, ip=0xa0000001008115c1
	kernel unaligned access to 0xa0000001009dbb6d, ip=0xa0000001008115f1
messages in my logs on IA64 when using the ethernet bridge with 2.6.16.

Appended is a patch to fix them.

Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-24 15:44:57 -08:00
Alexey Dobriyan
53b3531bbb [PATCH] s/;;/;/g
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:24 -08:00
Paul Jackson
fffb60f93c [PATCH] cpuset memory spread: slab cache format
Rewrap the overly long source code lines resulting from the previous
patch's addition of the slab cache flag SLAB_MEM_SPREAD.  This patch
contains only formatting changes, and no function change.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:23 -08:00
Paul Jackson
4b6a9316fa [PATCH] cpuset memory spread: slab cache filesystems
Mark file system inode and similar slab caches subject to SLAB_MEM_SPREAD
memory spreading.

If a slab cache is marked SLAB_MEM_SPREAD, then anytime that a task that's
in a cpuset with the 'memory_spread_slab' option enabled goes to allocate
from such a slab cache, the allocations are spread evenly over all the
memory nodes (task->mems_allowed) allowed to that task, instead of favoring
allocation on the node local to the current cpu.

The following inode and similar caches are marked SLAB_MEM_SPREAD:

    file                               cache
    ====                               =====
    fs/adfs/super.c                    adfs_inode_cache
    fs/affs/super.c                    affs_inode_cache
    fs/befs/linuxvfs.c                 befs_inode_cache
    fs/bfs/inode.c                     bfs_inode_cache
    fs/block_dev.c                     bdev_cache
    fs/cifs/cifsfs.c                   cifs_inode_cache
    fs/coda/inode.c                    coda_inode_cache
    fs/dquot.c                         dquot
    fs/efs/super.c                     efs_inode_cache
    fs/ext2/super.c                    ext2_inode_cache
    fs/ext2/xattr.c (fs/mbcache.c)     ext2_xattr
    fs/ext3/super.c                    ext3_inode_cache
    fs/ext3/xattr.c (fs/mbcache.c)     ext3_xattr
    fs/fat/cache.c                     fat_cache
    fs/fat/inode.c                     fat_inode_cache
    fs/freevxfs/vxfs_super.c           vxfs_inode
    fs/hpfs/super.c                    hpfs_inode_cache
    fs/isofs/inode.c                   isofs_inode_cache
    fs/jffs/inode-v23.c                jffs_fm
    fs/jffs2/super.c                   jffs2_i
    fs/jfs/super.c                     jfs_ip
    fs/minix/inode.c                   minix_inode_cache
    fs/ncpfs/inode.c                   ncp_inode_cache
    fs/nfs/direct.c                    nfs_direct_cache
    fs/nfs/inode.c                     nfs_inode_cache
    fs/ntfs/super.c                    ntfs_big_inode_cache_name
    fs/ntfs/super.c                    ntfs_inode_cache
    fs/ocfs2/dlm/dlmfs.c               dlmfs_inode_cache
    fs/ocfs2/super.c                   ocfs2_inode_cache
    fs/proc/inode.c                    proc_inode_cache
    fs/qnx4/inode.c                    qnx4_inode_cache
    fs/reiserfs/super.c                reiser_inode_cache
    fs/romfs/inode.c                   romfs_inode_cache
    fs/smbfs/inode.c                   smb_inode_cache
    fs/sysv/inode.c                    sysv_inode_cache
    fs/udf/super.c                     udf_inode_cache
    fs/ufs/super.c                     ufs_inode_cache
    net/socket.c                       sock_inode_cache
    net/sunrpc/rpc_pipe.c              rpc_inode_cache

The choice of which slab caches to so mark was quite simple.  I marked
those already marked SLAB_RECLAIM_ACCOUNT, except for fs/xfs, dentry_cache,
inode_cache, and buffer_head, which were marked in a previous patch.  Even
though SLAB_RECLAIM_ACCOUNT is for a different purpose, it marks the same
potentially large file system i/o related slab caches as we need for memory
spreading.

Given that the rule now becomes "wherever you would have used a
SLAB_RECLAIM_ACCOUNT slab cache flag before (usually the inode cache), use
the SLAB_MEM_SPREAD flag too", this should be easy enough to maintain.
Future file system writers will just copy one of the existing file system
slab cache setups and tend to get it right without thinking.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:23 -08:00
Trond Myklebust
1ebbe2b200 Merge branch 'linus' 2006-03-23 23:44:19 -05:00
Linus Torvalds
aca361c1a0 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (45 commits)
  [PATCH] Restore channel setting after scan.
  [PATCH] hostap: Fix memory leak on PCI probe error path
  [PATCH] hostap: Remove dead code (duplicated idx != 0)
  [PATCH] hostap: Fix unlikely read overrun in CIS parsing
  [PATCH] hostap: Fix double free in prism2_config() error path
  [PATCH] hostap: Fix ap_add_sta() return value verification
  [PATCH] hostap: Fix hw reset after CMDCODE_ACCESS_WRITE timeout
  [PATCH] wireless/airo: cache wireless scans
  [PATCH] wireless/airo: define default MTU
  [PATCH] wireless/airo: clean up printk usage to print device name
  [PATCH] WE-20 for kernel 2.6.16
  [PATCH] softmac: remove function_enter()
  [PATCH] skge: version 1.5
  [PATCH] skge: compute available ring buffers
  [PATCH] skge: dont free skb until multi-part transmit complete
  [PATCH] skge: multicast statistics fix
  [PATCH] skge: rx_reuse called twice
  [PATCH] skge: dont use dev_alloc_skb for rx buffs
  [PATCH] skge: align receive buffers
  [PATCH] sky2: dont need to use dev_kfree_skb_any
  ...
2006-03-23 16:25:49 -08:00
Jeff Garzik
9b7c84899e Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2006-03-23 17:15:41 -05:00
David Woodhouse
4edac92fcf [PATCH] Restore channel setting after scan.
After a scan, we weren't switching back to the original channel if we
were associated with an AP. So NetworkManager's periodic scans would
disrupt connectivity until the ESSID was manually set again. Fix that.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-23 16:18:47 -05:00
Jean Tourrilhes
711e2c33ac [PATCH] WE-20 for kernel 2.6.16
This is version 20 of the Wireless Extensions. This is the
completion of the RtNetlink work I started early 2004, it enables the
full Wireless Extension API over RtNetlink.

	A few comments on the patch:
	o totally driver transparent, no change in drivers needed.
	o iwevents were already RtNetlink based since they were created
(around 2.5.7). This adds all the regular SET and GET requests over
RtNetlink, using the exact same mechanism and data format as iwevents.
	o This is a Kconfig option, as currently most people have no
need for it. Surprisingly, patch is actually small and well
encapsulated.
	o Tested on SMP; attention has been paid to make it 64-bit clean.
	o The code probably does too many checks and could be further
optimised, but better safe than sorry.
	o RtNetlink based version of the Wireless Tools available on
my web page for people inclined to try out this stuff.

	I would also like to thank Alexey Kuznetsov for his helpful
suggestions to make this patch better.

Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-23 07:12:57 -05:00
John W. Linville
9a107aa24a [PATCH] softmac: remove function_enter()
Remove the function_enter() debugging macros.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-23 07:12:36 -05:00
Patrick McHardy
b30bd282cb [IPV6]: ip6_xmit: remove unnecessary NULL ptr check
The sk argument to ip6_xmit is never NULL nowadays since the skb->priority
assignment expects a valid socket.

Coverity #354

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-23 01:17:25 -08:00
Patrick McHardy
1ae39a430b [NET_SCHED]: cls_u32: remove unnecessary NULL-ptr check
In both cases n can't be NULL without crashing anyway.

Coverity #78

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-23 01:16:48 -08:00
Patrick McHardy
a5cdc03003 [IPV4]: Add fib rule netlink notifications
To really make sense of route notifications in the presence of
multiple tables, userspace also needs to be notified about routing
rule updates.  Notifications are sent to the so far unused
RTNLGRP_NOP1 (now RTNLGRP_RULE) group.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-23 01:16:06 -08:00
Steven Whitehouse
ca6549af77 [PKTGEN]: Add MPLS extension.
Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-23 01:10:26 -08:00
Jeff Garzik
fa4fa40a99 Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2006-03-22 22:55:57 -05:00
Larry Finger
fe0b06b123 [PATCH] Fix softmac scan
Softmac scanning fails because the stop flag is not cleared before
scanning is started. The attached one-line patch fixes this.

Signed-Off-By: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:17:02 -05:00
Johannes Berg
1196862b79 [PATCH] softmac: remove dead code
This patch removes ieee80211softmac_reassoc which is neither implemented
nor used nor necessary.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:17:01 -05:00
Johannes Berg
b6c7658ef8 [PATCH] softmac: add reassociation code
This patch adds handling of reassociation to softmac when the AP
requests it. Patch from Larry Finger.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:17:01 -05:00
Johannes Berg
b10c991fa4 [PATCH] softmac: update deauth handler to quiet warning
Recently the deauth packet handler was updated to use a deauth packet
struct (identical to the auth packet struct) and this now gives a
warning. This patch updates the code to properly use a deauth struct and
deauth variable.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:17:00 -05:00
Johannes Berg
f484d582d3 [PATCH] trivial fixes to softmac
This patch removes a blank line that shouldn't be there and fixes a
spelling error in softmac.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:59 -05:00
Johannes Berg
7985905106 [PATCH] update copyright in softmac
This patch updates the copyright statements in softmac that I
erroneously added for 2005 only (when we already had 2006).

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:59 -05:00
Denis Vlasenko
1a995b45a5 [PATCH] ieee80211_rx_any: filter out packets, call ieee80211_rx or ieee80211_rx_mgt
Version 2 of the patch. Added checks for version 0
and proper from/to DS bits. Even in promisc
mode we won't receive packets from other BSSes.

bcm43xx_rx() contains code to filter out packets from
foreign BSSes and decide whether to call ieee80211_rx
or ieee80211_rx_mgt. This is not bcm specific.

This patch adapts that code and adds it to 80211
as the ieee80211_rx_any() function.

Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:58 -05:00
Johannes Berg
4c718cfd7d [PATCH] softmac: move EXPORT_SYMBOL_GPL right after functions
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:58 -05:00
Johannes Berg
9ebdd46681 [PATCH] softmac: add MODULE_DESCRIPTION and MODULE_AUTHORs
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:57 -05:00
Johannes Berg
4855d25b1e [PATCH] softmac: add copyright and license headers
add copyright and license headers to all softmac files

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:56 -05:00
Johannes Berg
b2b9b6518e [PATCH] softmac: some comment stuff
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:56 -05:00
Johannes Berg
bba52d5e9e [PATCH] softmac: properly check return value of ieee80211softmac_alloc_mgt
Properly check return value of ieee80211softmac_alloc_mgt
in ieee80211softmac_disassoc_deauth (patch by Denis Vlasenko)

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:55 -05:00
Johannes Berg
1dc09776d7 [PATCH] softmac: scan at least once before selecting a network by essid
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:55 -05:00
Johannes Berg
48b2e4ce69 [PATCH] softmac: check if disassociation is for us before processing it
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:54 -05:00
Johannes Berg
78e4f36e05 [PATCH] softmac: select "best" network based on rssi
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:54 -05:00
Johannes Berg
51da28a847 [PATCH] softmac: add fixme for disassoc
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:53 -05:00
Johannes Berg
d1469cf2c7 [PATCH] softmac: try to reassociate when being disassociated from the AP
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:53 -05:00
Johannes Berg
2dd50801b3 [PATCH] softmac: correctly use netif_carrier_{on,off}
TODO: add callbacks for ifup/ifdown (see mailing list)

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:52 -05:00
Johannes Berg
5c4df6da58 [PATCH] softmac: convert to use global workqueue
Convert softmac to use global workqueue instead of private one...

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:52 -05:00
Johannes Berg
45867e6a55 [PATCH] softmac: fix Makefiles
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:51 -05:00
Johannes Berg
714e1a5116 [PATCH] softmac: fix some sparse warnings
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:51 -05:00
Johannes Berg
32821837fa [PATCH] make softmac depend on IEEE80211 and EXPERIMENTAL
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:50 -05:00
Johannes Berg
370121e519 [PATCH] wireless: Add softmac layer to the kernel
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-22 22:16:50 -05:00
Alexey Kuznetsov
1a55d57b10 [TCP]: Do not use inet->id of global tcp_socket when sending RST.
The problem is in ip_push_pending_frames(), which uses:

        if (!df) {
                __ip_select_ident(iph, &rt->u.dst, 0);
        } else {
                iph->id = htons(inet->id++);
        }

instead of ip_select_ident().

Right now I think the code is nonsense. Most likely, I copied it from
old ip_build_xmit(), where it was really special, we had to decide
whether to generate unique ID when generating the first (well, the last)
fragment.

In ip_push_pending_frames() it does not make sense, it should use plain
ip_select_ident() instead.

Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 14:27:59 -08:00
Patrick McHardy
6a534ee35c [NETFILTER]: Fix undefined references to get_h225_addr
get_h225_addr is exported, but declared static, which fails when
linking statically.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 13:57:25 -08:00
Patrick McHardy
81fbfd6925 [NETFILTER]: Fix xt_policy address matching
Fix missing inversion in address matching, it was broken during the
conversion to x_tables.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 13:56:33 -08:00
Pablo Neira Ayuso
b9f78f9fca [NETFILTER]: nf_conntrack: support for layer 3 protocol load on demand
x_tables matches and targets that require nf_conntrack_ipv[4|6] to work
don't have enough information to load these modules on demand. This
patch introduces the following changes to solve this issue:

o nf_ct_l3proto_try_module_get: tries to load the layer 3 connection
tracker module and increases the refcount.
o nf_ct_l3proto_module_put: drops the refcount of the module.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 13:56:08 -08:00
Pablo Neira Ayuso
a45049c51c [NETFILTER]: x_tables: set the protocol family in x_tables targets/matches
Set the family field in xt_[matches|targets] registered.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 13:55:40 -08:00
Pablo Neira Ayuso
4e3882f773 [NETFILTER]: conntrack: cleanup the conntrack ID initialization
Currently the first conntrack ID assigned is 2, use 1 instead.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 13:55:11 -08:00
Pablo Neira Ayuso
f0d835835b [NETFILTER]: nfnetlink_queue: fix nfnetlink message size
Fix oversized message, use NLMSG_SPACE just once since it reserves space
for the netlink header and NFA_SPACE for every attribute.

Thanks to Harald Welte for the feedback

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 13:54:40 -08:00
Pablo Neira Ayuso
1cde64365b [NETFILTER]: ctnetlink: Fix expectation mask dumping
The expectation mask has some particularities that require different
handling. The protocol number fields can be set to non-valid protocols,
i.e. l3num is set to 0xFFFF. Since that protocol does not exist, the mask
tuple will not be dumped. Moreover, this results in a kernel panic when
nf_conntrack accesses the array of protocol handlers, that is PF_MAX (0x1F)
long.

This patch introduces the function ctnetlink_exp_dump_mask, that correctly
dumps the expectation mask. Such function uses the l3num value from the
expectation tuple that is a valid layer 3 protocol number. The value of the
l3num mask isn't dumped since it is meaningless from the userspace side.

Thanks to Yasuyuki Kozakai and Patrick McHardy for the feedback.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 13:54:15 -08:00
Thomas Vögtle
50b521aa54 [NETFILTER]: Fix Kconfig typos
Signed-off-by: Thomas Vögtle <tv@lio96.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 13:53:48 -08:00
Patrick McHardy
443da0d527 [NETFILTER]: Fix ip6tables breakage from {get,set}sockopt compat layer
do_ipv6_getsockopt returns -EINVAL for unknown options, not
-ENOPROTOOPT as do_ipv6_setsockopt.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 13:53:20 -08:00
Shaun Pereira
9a6b9f2e76 [X25]: dte facilities 32 64 ioctl conversion
Allows dte facility patch to use 32 64 bit ioctl conversion mechanism

Signed-off-by: Shaun Pereira <spereira@tusc.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 00:02:00 -08:00
Shaun Pereira
a64b7b936d [X25]: allow ITU-T DTE facilities for x25
Allows use of the optional user facility to insert ITU-T
(http://www.itu.int/ITU-T/) specified DTE facilities in call set-up x25
packets.  This feature is optional; no facilities will be added if the ioctl
is not used, and call setup packet remains the same as before.

If the ioctls provided by the patch are used, then a facility marker will be
added to the x25 packet header so that the called dte address extension
facility can be differentiated from other types of facilities (as described in
the ITU-T X.25 recommendation) that are also allowed in the x25 packet header.

Facility markers are made up of two octets, and may be present in the x25
packet headers of call-request, incoming call, call accepted, clear request,
and clear indication packets.  The first of the two octets represents the
facility code field and is set to zero by this patch.  The second octet of the
marker represents the facility parameter field and is set to 0x0F because the
marker will be inserted before ITU-T type DTE facilities.
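
As a rough illustration of the marker layout described above (a sketch only, not code taken from the patch; the macro and function names are hypothetical), the two octets could be written into a packet buffer like this:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, used only for this illustration. */
#define FAC_MARKER_CODE     0x00  /* facility code field: zero */
#define FAC_MARKER_ITU_DTE  0x0F  /* facility parameter field: 00001111 */

/*
 * Write the two-octet facility marker at the start of buf.
 * Returns the number of octets written, or 0 if buf is unusable.
 */
static size_t put_itu_dte_marker(uint8_t *buf, size_t buflen)
{
    if (buf == NULL || buflen < 2)
        return 0;
    buf[0] = FAC_MARKER_CODE;
    buf[1] = FAC_MARKER_ITU_DTE;
    return 2;
}
```

The ITU-T DTE facilities themselves would follow the marker in the buffer.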

Since according to ITU-T X.25 Recommendation X.25(10/96)- 7.1 "All networks
will support the facility markers with a facility parameter field set to all
ones or to 00001111", therefore this patch should work with all x.25 networks.

While there are many ITU-T DTE facilities, this patch implements only the
called and calling address extension, with placeholders in the
x25_dte_facilities structure for the rest of the facilities.

Testing:

This patch was tested using a cisco xot router connected on its serial ports
to an X.25 network, and on its lan ports to a host running an xotd daemon.

It is also possible to test this patch using an xotd daemon and an x25tap
patch, where the xotd daemons work back-to-back without actually using an x.25
network.  See www.fyonne.net for details on how to do this.

Signed-off-by: Shaun Pereira <spereira@tusc.com.au>
Acked-by: Andrew Hendry <ahendry@tusc.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 00:01:31 -08:00
Shaun Pereira
bac37ec830 [X25]: fix kernel error message 64 bit kernel
Fixes the following error from kernel
T2 kernel: schedule_timeout:
wrong timeout value ffffffffffffffff from ffffffff88164796

Signed-off-by: Shaun Pereira <spereira@tusc.com.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 00:00:40 -08:00
Shaun Pereira
1b06e6ba25 [X25]: ioctl conversion 32 bit user to 64 bit kernel
To allow 32 bit x25 module structures to be passed to a 64 bit kernel via
ioctl using the new compat_sock_ioctl registration mechanism instead of the
obsolete 'register_ioctl32_conversion into hash table' mechanism

Signed-off-by: Shaun Pereira <spereira@tusc.com.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 00:00:12 -08:00
Shaun Pereira
f0ac261441 [NET]: socket timestamp 32 bit handler for 64 bit kernel
Get socket timestamp handler function that does not use the
ioctl32_hash_table.

Signed-off-by: Shaun Pereira <spereira@tusc.com.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-21 23:59:39 -08:00
Shaun Pereira
89bbfc95d6 [NET]: allow 32 bit socket ioctl in 64 bit kernel
Since the register_ioctl32_conversion() patch in the kernel is now obsolete,
provide another method to allow 32 bit user space ioctls to reach the kernel.

Signed-off-by: Shaun Pereira <spereira@tusc.com.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-21 23:58:08 -08:00
Tobias Klauser
67b52e554b [BLUETOOTH]: Return negative error constant
Return negative error constant.

Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-21 23:53:16 -08:00
Trond Myklebust
ac58c9059d Merge branch 'linus' 2006-03-21 12:08:21 -05:00
Jing Min Zhao
5e35941d99 [NETFILTER]: Add H.323 conntrack/NAT helper
Signed-off-by: Jing Min Zhao <zhaojignmin@hotmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 23:41:17 -08:00
Ingo Oeser
322f74a432 [IPV6]: Cleanups for net/ipv6/addrconf.c (kzalloc, early exit) v2
Here are some possible (and trivial) cleanups.
- use kzalloc() where possible
- invert allocation failure test like
  if (object) {
        /* Rest of function here */
  }
  to

  if (object == NULL)
        return NULL;

  /* Rest of function here */
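
A userspace analogue of both cleanups might look like the sketch below. It assumes calloc() as a stand-in for kzalloc(), and struct item is a made-up type for illustration, not something from addrconf.c:

```c
#include <stdlib.h>

/* Hypothetical structure, for illustration only. */
struct item {
    int id;
    int refcnt;
    struct item *next;
};

/* calloc() returns zeroed memory, so no separate memset() is needed. */
static struct item *item_create(int id)
{
    struct item *it = calloc(1, sizeof(*it));

    /* Early exit on allocation failure keeps the main path unindented. */
    if (it == NULL)
        return NULL;

    /* Rest of function here: only non-zero fields need setting. */
    it->id = id;
    it->refcnt = 1;
    return it;
}
```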

Signed-off-by: Ingo Oeser <ioe-lkml@rameria.de>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 23:01:47 -08:00
Ingo Oeser
0c600eda4b [IPV6]: Nearly complete kzalloc cleanup for net/ipv6
Stupidly use kzalloc() instead of kmalloc()/memset()
everywhere where this is possible in net/ipv6/*.c .

Signed-off-by: Ingo Oeser <ioe-lkml@rameria.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 23:01:32 -08:00
Ingo Oeser
78c784c47a [IPV6]: Cleanup of net/ipv6/reassembly.c
Two minor cleanups:

1. Using kzalloc() in frag_alloc_queue()
   saves the memset() in ipv6_frag_create().

2. Invert sense of if-statements to streamline code.
   Inverts the comment, too.

Signed-off-by: Ingo Oeser <ioe-lkml@rameria.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 23:01:17 -08:00
Andrew Morton
b3e83d6d18 [BRIDGE]: Remove duplicate const from is_link_local() argument type.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 23:00:56 -08:00
Adrian Bunk
3400112794 [DECNET]: net/decnet/dn_route.c: fix inconsistent NULL checking
The Coverity checker noted this inconsistent NULL checking in
dnrt_drop().

Since all callers ensure that NULL isn't passed, we can simply remove
the check.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 23:00:29 -08:00
Stephen Hemminger
12ac84c4a9 [BRIDGE]: use LLC to send STP
The bridge code can use existing LLC output code when building
spanning tree protocol packets.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:59:49 -08:00
Stephen Hemminger
f4ad2b162d [LLC]: llc_mac_hdr_init const arguments
Cleanup of LLC.  llc_mac_hdr_init can take constant arguments,
and it is defined twice, once in llc_output.h, which is otherwise unused.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Acked-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:59:36 -08:00
Stephen Hemminger
fda93d92d7 [BRIDGE]: allow show/store of group multicast address
Bridges communicate with each other using Spanning Tree Protocol
over a standard multicast address. There are times when testing or
layering bridges over existing topologies or tunnels, when it is
useful to use alternative multicast addresses for STP packets.

The 802.1d standard has some unused addresses, that can be used for this.
This patch is restrictive in that it only allows one of the possible
addresses in the standard.
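
A minimal sketch of such a restriction, assuming the reserved 802.1D block 01:80:C2:00:00:00 through 01:80:C2:00:00:0F (the function name is hypothetical, and the check in the patch itself may be stricter than this one):

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Returns true if addr lies in 01:80:C2:00:00:00 .. 01:80:C2:00:00:0F. */
static bool is_8021d_reserved_addr(const uint8_t addr[6])
{
    static const uint8_t prefix[5] = { 0x01, 0x80, 0xC2, 0x00, 0x00 };

    /* The first five octets must match the reserved prefix exactly. */
    if (memcmp(addr, prefix, sizeof(prefix)) != 0)
        return false;

    /* Only the low nibble of the last octet may vary. */
    return (addr[5] & 0xF0) == 0;
}
```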

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:59:21 -08:00
Stephen Hemminger
cf0f02d04a [BRIDGE]: use llc for receiving STP packets
Use LLC for the receive path of Spanning Tree Protocol packets.
This allows link local multicast packets to be received by
other protocols (if they care), and uses the existing LLC
code to get STP packets back into bridge code.

The bridge multicast address is also checked, so bridges using
other link local multicast addresses are ignored. This allows
for use of different multicast addresses to define separate STP
domains.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:59:06 -08:00
Stephen Hemminger
18fdb2b25b [BRIDGE]: stp timer to jiffies cleanup
Cleanup the get/set of bridge timer value in the packets.
It is clearer not to bury the conversion in macro.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:58:49 -08:00
Stephen Hemminger
f8ae737dee [BRIDGE]: forwarding remove unneeded preempt and bh disables
Optimize the forwarding and transmit paths. Both places are
called with bottom half/no preempt so there is no need to use
spin_lock_bh or rcu_read_lock.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:58:36 -08:00
Stephen Hemminger
fdeabdefb2 [BRIDGE]: netfilter inline cleanup
Move nf_bridge_alloc from header file to the one place it is
used and optimize it.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:58:21 -08:00
Stephen Hemminger
8b42ec3926 [BRIDGE]: netfilter VLAN macro cleanup
Fix the VLAN macros in bridge netfilter code. Macros should
not depend on magic variables.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:58:05 -08:00
Stephen Hemminger
f8a2602861 [BRIDGE]: netfilter dont use __constant_htons
Only use __constant_htons() for initializers and switch cases.
For other uses, it is just as efficient and clearer to use htons.
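
The distinction can be sketched in userspace as follows. CONST_HTONS is a hypothetical stand-in for the kernel's __constant_htons(), it relies on the GCC/Clang __BYTE_ORDER__ macros, and the PROTO_* names are made up for the example:

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical compile-time byte swap; not the kernel macro itself. */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define CONST_HTONS(x) \
    ((uint16_t)((((x) & 0x00ffU) << 8) | (((x) & 0xff00U) >> 8)))
#else
#define CONST_HTONS(x) ((uint16_t)(x))
#endif

#define PROTO_IP   0x0800
#define PROTO_IPV6 0x86DD

/* Case labels must be constant expressions, so the macro is needed here. */
static int proto_version(uint16_t proto_be)
{
    switch (proto_be) {
    case CONST_HTONS(PROTO_IP):
        return 4;
    case CONST_HTONS(PROTO_IPV6):
        return 6;
    default:
        return 0;
    }
}

/* In an ordinary expression, plain htons() reads more clearly. */
static int is_ip(uint16_t proto_be)
{
    return proto_be == htons(PROTO_IP);
}
```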

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:57:46 -08:00
Stephen Hemminger
789bc3e5b6 [BRIDGE]: netfilter whitespace
Run br_netfilter through Lindent to fix whitespace.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:57:32 -08:00
Stephen Hemminger
d5513a7d32 [BRIDGE]: optimize frame pass up
The netfilter hook that is used to receive frames doesn't need to be a
stub.  It is only called in two ways, both of which ignore the return
value.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:57:18 -08:00
Stephen Hemminger
cee4854122 [BRIDGE]: use kzalloc
Use kzalloc versus kmalloc+memset. Also don't need to do
memset() of bridge address since it is in netdev private data
that is already zero'd in alloc_netdev.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:57:03 -08:00
Stephen Hemminger
3b781fa10b [BRIDGE]: use kcalloc
Use kcalloc rather than kmalloc + memset.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:56:50 -08:00
Stephen Hemminger
a95fcacdc3 [BRIDGE]: use setup_timer
Use the now standard setup_timer function.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:56:38 -08:00
Stephen Hemminger
e3efe08e9a [BRIDGE]: remove unneeded bh disables
The STP timers run off softirq (kernel timers), so there is no need to
disable bottom half in the spin locks.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:56:25 -08:00
Andrew Morton
9ebddc1aa3 [BRIDGE] br_netfilter: Warning fixes.
net/bridge/br_netfilter.c: In function `br_nf_pre_routing':
net/bridge/br_netfilter.c:427: warning: unused variable `vhdr'
net/bridge/br_netfilter.c:445: warning: unused variable `vhdr'

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:55:24 -08:00
Andrew Morton
74ca4e5acd [BRIDGE] ebtables: Build fix.
net/bridge/netfilter/ebtables.c:1481: warning: initialization makes pointer from integer without a cast

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:55:02 -08:00
David S. Miller
dbeff12b4d [INET]: Fix typo in Arnaldo's connection sock compat fixups.
"struct inet_csk" --> "struct inet_connection_sock" :-)

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:52:32 -08:00
Arnaldo Carvalho de Melo
8ca0d17bd7 [DCCP] feat: Pass dccp_minisock ptr where only the minisock is used
This is in preparation for having a dccp_minisock embedded into
dccp_request_sock so that feature negotiation can be done prior to
creating the full blown dccp_sock.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:51:53 -08:00
Arnaldo Carvalho de Melo
a4bf390242 [DCCP] minisock: Rename struct dccp_options to struct dccp_minisock
This will later be included in struct dccp_request_sock so that we can
have per connection feature negotiation state while in the 3way
handshake, when we clone the DCCP_ROLE_LISTEN socket (in
dccp_create_openreq_child) we'll just copy this state from
dreq_minisock to dccps_minisock.

Also the feature negotiation and option parsing code will mostly touch
dccps_minisock, which will simplify some stuff.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:50:58 -08:00
Arnaldo Carvalho de Melo
543d9cfeec [NET]: Indentation & other cleanups related to compat_[gs]etsockopt cset
No code changes, just tidying up, in some cases moving EXPORT_SYMBOLs
to just after the function exported, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:48:35 -08:00
Arnaldo Carvalho de Melo
f94691acf9 [SK_BUFF]: export skb_pull_rcsum
*** Warning: "skb_pull_rcsum" [net/bridge/bridge.ko] undefined!
*** Warning: "skb_pull_rcsum" [net/8021q/8021q.ko] undefined!
*** Warning: "skb_pull_rcsum" [drivers/net/pppoe.ko] undefined!
*** Warning: "skb_pull_rcsum" [drivers/net/ppp_generic.ko] undefined!

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:47:55 -08:00
Arnaldo Carvalho de Melo
dec73ff029 [ICSK] compat: Introduce inet_csk_compat_[gs]etsockopt
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:46:16 -08:00
Arnaldo Carvalho de Melo
d1d47beef8 [SNAP]: Remove leftover unused hdr variable
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:45:37 -08:00
Dmitry Mishin
3fdadf7d27 [NET]: {get|set}sockopt compatibility layer
This patch extends {get|set}sockopt compatibility layer in order to
move protocol specific parts to their place and avoid huge universal
net/compat.c file in the future.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:45:21 -08:00
Dave Jones
c750360938 [IPV6]: remove useless test in ip6_append_data
We've already dereferenced 'np' a dozen
times at this point, so it's safe to say it's not null.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:44:52 -08:00
Adrian Bunk
afb5bb5744 [PKT_SCHED]: Let NET_CLS_ACT no longer depend on EXPERIMENTAL
This option should IMHO no longer depend on EXPERIMENTAL.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
ACKed-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:44:24 -08:00
Herbert Xu
cbb042f9e1 [NET]: Replace skb_pull/skb_postpull_rcsum with skb_pull_rcsum
We're now starting to have quite a number of places that do skb_pull
followed immediately by an skb_postpull_rcsum.  We can merge these two
operations into one function with skb_pull_rcsum.  This makes sense
since most pull operations on receive skb's need to update the
checksum.

I've decided to make this out-of-line since it is fairly big and the
fast path where hardware checksums are enabled need to call
csum_partial anyway.

Since this is a brand new function we get to add an extra check on the
len argument.  As it is most callers of skb_pull ignore its return
value which essentially means that there is no check on the len
argument.
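
The idea of a combined, length-checked pull can be sketched with a toy buffer. This is an illustration under simplifying assumptions, not the kernel implementation: struct rxbuf is a hypothetical stand-in for struct sk_buff, and a plain 32-bit byte sum replaces the kernel's ones'-complement checksum.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy receive buffer; hypothetical, for illustration only. */
struct rxbuf {
    const uint8_t *data;  /* start of the unconsumed bytes */
    size_t len;           /* number of unconsumed bytes */
    uint32_t csum;        /* plain sum of the unconsumed bytes */
};

/* Simplified checksum: 32-bit sum of the bytes (wraps modulo 2^32). */
static uint32_t byte_sum(const uint8_t *p, size_t n)
{
    uint32_t s = 0;

    while (n--)
        s += *p++;
    return s;
}

/*
 * Remove len bytes from the front and update the checksum in one step.
 * Returns the new data pointer, or NULL (buffer untouched) if len
 * exceeds the bytes available.
 */
static const uint8_t *rxbuf_pull_rcsum(struct rxbuf *b, size_t len)
{
    if (len > b->len)
        return NULL;
    b->csum -= byte_sum(b->data, len);
    b->data += len;
    b->len -= len;
    return b->data;
}
```

Because the length check lives inside the helper, a caller that ignores the return value can no longer advance past the end of the data.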

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:43:56 -08:00
Steven Whitehouse
ecba320f2e [DECnet]: Use RCU locking in dn_rules.c
As per Robert Olsson's patch for ipv4, this is the DECnet
version to keep the code "in step". It changes the list
of rules to use RCU rather than an rwlock.

Inspired-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:43:28 -08:00
Patrick Caulfield
c60992db46 [DECnet]: Patch to fix recvmsg() flag check
This patch means that 64bit kernel/32bit userland platforms will now
work correctly with DECnet.

Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:43:05 -08:00
Steven Whitehouse
c4ea94ab37 [DECnet]: Endian annotation and fixes for DECnet.
The typedef for dn_address has been removed in favour of using __le16
or __u16 directly as appropriate. All the DECnet header files are
updated accordingly.

The byte ordering of dn_eth2dn() and dn_dn2eth() are both changed
since just about all their callers wanted network order rather than
host order, so the conversion is now done in the functions themselves.

Several missed endianness conversions have been picked up during the
conversion process. The nh_gw field in struct dn_fib_info has been
changed from a 32 bit field to 16 bits as it ought to be.

One or two cases of using htons rather than dn_htons in the routing
code have been found and fixed.

There are still a few warnings to fix, but this patch deals with the
important cases.

Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:42:39 -08:00
Catherine Zhang
2c7946a7bf [SECURITY]: TCP/UDP getpeersec
This patch implements an application of the LSM-IPSec networking
controls whereby an application can determine the label of the
security association its TCP or UDP sockets are currently connected to
via getsockopt and the auxiliary data mechanism of recvmsg.

Patch purpose:

This patch enables a security-aware application to retrieve the
security context of an IPSec security association a particular TCP or
UDP socket is using.  The application can then use this security
context to determine the security context for processing on behalf of
the peer at the other end of this connection.  In the case of UDP, the
security context is for each individual packet.  An example
application is the inetd daemon, which could be modified to start
daemons running at security contexts dependent on the remote client.

Patch design approach:

- Design for TCP
The patch enables the SELinux LSM to set the peer security context for
a socket based on the security context of the IPSec security
association.  The application may retrieve this context using
getsockopt.  When called, the kernel determines if the socket is a
connected (TCP_ESTABLISHED) TCP socket and, if so, uses the dst_entry
cache on the socket to retrieve the security associations.  If a
security association has a security context, the context string is
returned, as for UNIX domain sockets.

- Design for UDP
Unlike TCP, UDP is connectionless.  This requires a somewhat different
API to retrieve the peer security context.  With TCP, the peer
security context stays the same throughout the connection, thus it can
be retrieved at any time between when the connection is established
and when it is torn down.  With UDP, each read/write can have
a different peer and thus the security context might change every time.
As a result the security context retrieval must be done TOGETHER with
the packet retrieval.

The solution is to build upon the existing Unix domain socket API for
retrieving user credentials.  Linux offers the API for obtaining user
credentials via ancillary messages (i.e., out of band/control messages
that are bundled together with a normal message).

Patch implementation details:

- Implementation for TCP
The security context can be retrieved by applications using getsockopt
with the existing SO_PEERSEC flag.  As an example (ignoring error
checking):

getsockopt(sockfd, SOL_SOCKET, SO_PEERSEC, optbuf, &optlen);
printf("Socket peer context is: %s\n", optbuf);

The SELinux function, selinux_socket_getpeersec, is extended to check
for labeled security associations for connected (TCP_ESTABLISHED ==
sk->sk_state) TCP sockets only.  If so, the socket has a dst_cache of
struct dst_entry values that may refer to security associations.  If
these have security associations with security contexts, the security
context is returned.

getsockopt returns a buffer that contains a security context string or
the buffer is unmodified.

- Implementation for UDP
To retrieve the security context, the application first indicates to
the kernel such desire by setting the IP_PASSSEC option via
setsockopt.  Then the application retrieves the security context using
the auxiliary data mechanism.

An example server application for UDP should look like this:

toggle = 1;
toggle_len = sizeof(toggle);

setsockopt(sockfd, SOL_IP, IP_PASSSEC, &toggle, toggle_len);
recvmsg(sockfd, &msg_hdr, 0);
if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) {
    cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr);
    if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) &&
        cmsg_hdr->cmsg_level == SOL_IP &&
        cmsg_hdr->cmsg_type == SCM_SECURITY) {
        memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext));
    }
}

ip_setsockopt is enhanced with a new socket option IP_PASSSEC to allow
a server socket to receive the security context of the peer.  A new
ancillary message type, SCM_SECURITY, is also added.

When the packet is received we get the security context from the
sec_path pointer which is contained in the sk_buff, and copy it to the
ancillary message space.  An additional LSM hook,
selinux_socket_getpeersec_udp, is defined to retrieve the security
context from the SELinux space.  The existing function,
selinux_socket_getpeersec does not suit our purpose, because the
security context is copied directly to user space, rather than to
kernel space.

Testing:

We have tested the patch by setting up TCP and UDP connections between
applications on two machines using the IPSec policies that result in
labeled security associations being built.  For TCP, we can then
extract the peer security context using getsockopt on either end.  For
UDP, the receiving end can retrieve the security context using the
auxiliary data mechanism of recvmsg.

Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:41:23 -08:00
Patrick McHardy
be33690d8f [XFRM]: Fix aevent related crash
When xfrm_user isn't loaded xfrm_nl is NULL, which makes IPsec crash because
xfrm_aevent_is_on passes the NULL pointer to netlink_has_listeners as socket.
A second problem is that the xfrm_nl pointer is not cleared when the socket
is released at module unload time.

Protect references of xfrm_nl from outside of xfrm_user by RCU, check
that the socket is present in xfrm_aevent_is_on and set it to NULL
when unloading xfrm_user.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:40:54 -08:00
Rick Jones
15d99e02ba [TCP]: sysctl to allow TCP window > 32767 sans wscale
Back in the dark ages, we had to be conservative and only allow 15-bit
window fields if the window scale option was not negotiated.  Some
ancient stacks used a signed 16-bit quantity for the window field of
the TCP header and would get confused.

Those days are long gone, so we can use the full 16-bits by default
now.

There is a sysctl added so that we can still interact with such old
stacks.
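
The resulting limit can be sketched as below. This is a hedged illustration rather than the patch itself; workaround_signed_windows is a hypothetical flag standing in for the new sysctl, and the function names are made up:

```c
#include <stdint.h>

#define MAX_TCP_WINDOW 65535U  /* full 16-bit window field */

/*
 * Largest window to advertise when window scaling was not negotiated.
 * workaround_signed_windows is a hypothetical stand-in for the sysctl.
 */
static uint32_t max_unscaled_window(int workaround_signed_windows)
{
    if (workaround_signed_windows)
        return 32767U;  /* 15 bits: safe for peers using a signed field */
    return MAX_TCP_WINDOW;
}

/* Clamp the window we would like to offer to the applicable limit. */
static uint32_t clamp_window(uint32_t wanted, int workaround_signed_windows)
{
    uint32_t limit = max_unscaled_window(workaround_signed_windows);

    return wanted < limit ? wanted : limit;
}
```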

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:40:29 -08:00
Neil Horman
abd596a4b6 [IPV4] ARP: Allow acceptance of unsolicited ARP via netdevice sysctl.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:39:47 -08:00
Per Liden
87546b1c25 [TIPC]: Avoid compiler warning
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:38:33 -08:00
Per Liden
de70c5ba43 [TIPC]: Reduce stack usage
The node_map struct can be quite large (516 bytes) and allocating two of
them on the stack is not a good idea since we might only have a 4K stack
to start with.

Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:38:14 -08:00
Adrian Bunk
988f088a8e [TIPC]: Cleanups
This patch contains the following possible cleanups:
- make needlessly global code static
- #if 0 the following unused global functions:
  - name_table.c: tipc_nametbl_print()
  - name_table.c: tipc_nametbl_dump()
  - net.c: tipc_net_next_node()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:37:52 -08:00
Per Liden
7c501a5960 [TIPC]: Remove unused functions
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:37:27 -08:00
Sam Ravnborg
05790c6456 [TIPC]: Remove inlines from *.c
With reference to latest discussions on linux-kernel with respect to
inline here is a patch for tipc to remove all inlines as used in
the .c files. See also chapter 14 in Documentation/CodingStyle.

Before:
   text        data     bss     dec     hex filename
 102990        5292    1752  110034   1add2 tipc.o

Now:
   text        data     bss     dec     hex filename
 101190        5292    1752  108234   1a6ca tipc.o

This is a nice text size reduction which will improve icache usage.
In some cases bigger (> 4 lines) functions were declared inline
and used in many places; they are most probably no longer inlined by gcc,
resulting in the size reduction.
There are several one-liners that are no longer declared inline, but gcc
should inline these just fine without the inline hint.

With this patch applied one warning is added about an unused static
function - that was hidden by utilising inline before.
The function in question was kept so this patch is solely an
inline removal patch.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:37:04 -08:00
Sam Ravnborg
1fc54d8f49 [TIPC]: Fix simple sparse warnings
Tried to run the new tipc stack through sparse.
Following patch fixes all cases where 0 was used
as replacement of NULL.
Use NULL to document this is a pointer and to silence sparse.

This brought the sparse warning count down by 127, to 24 warnings.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:36:47 -08:00
David S. Miller
edb2c34fb2 [NETFILTER]: Fix warnings in ip_nat_snmp_basic.c
net/ipv4/netfilter/ip_nat_snmp_basic.c: In function 'asn1_header_decode':
net/ipv4/netfilter/ip_nat_snmp_basic.c:248: warning: 'len' may be used uninitialized in this function
net/ipv4/netfilter/ip_nat_snmp_basic.c:248: warning: 'def' may be used uninitialized in this function
net/ipv4/netfilter/ip_nat_snmp_basic.c: In function 'snmp_translate':
net/ipv4/netfilter/ip_nat_snmp_basic.c:672: warning: 'l' may be used uninitialized in this function
net/ipv4/netfilter/ip_nat_snmp_basic.c:668: warning: 'type' may be used uninitialized in this function

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:36:21 -08:00
David S. Miller
fb9504964d [DCCP]: Fix uninitialized var warnings in dccp_parse_options().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:36:01 -08:00
Ingo Molnar
57b47a53ec [NET]: sem2mutex part 2
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:35:41 -08:00
Arjan van de Ven
4a3e2f711a [NET] sem2mutex: net/
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:33:17 -08:00
Stephen Hemminger
1533306186 [NET]: dev_put/dev_hold cleanup
Get rid of the old __dev_put macro that is just a holdover from pre-2.6
kernel.  And turn dev_hold into an inline instead of a macro.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:32:28 -08:00
Arnaldo Carvalho de Melo
2d0817d11e [DCCP] options: Make dccp_insert_options & friends yell on error
And not the silly LIMIT_NETDEBUG and silently return without inserting
the option requested.

Also drop some old debugging messages associated to option insertion.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:32:06 -08:00
Arnaldo Carvalho de Melo
110bae4efb [DCCP]: Remove leftover dccp_send_response prototype
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:31:46 -08:00
Arnaldo Carvalho de Melo
c5fed1597e [DCCP]: ditch dccp_v[46]_ctl_send_ack
Merging it with its only user: dccp_v[46]_reqsk_send_ack.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:31:26 -08:00
Arnaldo Carvalho de Melo
118b2c9532 [DCCP]: Use sk->sk_prot->max_header consistently for non-data packets
Using this also provides opportunities for introducing
inet_csk_alloc_skb that would call alloc_skb, account it to the sock
and skb_reserve(max_header), but I'll leave this for later, for now
using sk_prot->max_header consistently is enough.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:31:09 -08:00
Arnaldo Carvalho de Melo
e5a6de915b [DCCP] options: Fix handling of ackvecs in DATA packets
I.e. they should be just ignored, but we have to use 'break', not 'continue',
as we have to possibly reset the mandatory flag.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:30:51 -08:00
David S. Miller
aa837b5bbd [ATM]: Fix build after neigh->parms->neigh_destructor change.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:30:23 -08:00
Benjamin LaHaise
6cb153cab9 [NET]: use fget_light() in net/socket.c
Here's an updated copy of the patch to use fget_light in net/socket.c.
Rerunning the tests shows a drop of ~80Mbit/s on average, which looks
bad until you see the drop in cpu usage from ~89% to ~82%.  That will
get fixed in another patch...

Before: max 8113.70, min 8026.32, avg 8072.34
 87380  16384  16384    10.01      8045.55   87.11    87.11    1.774   1.774
 87380  16384  16384    10.01      8065.14   90.86    90.86    1.846   1.846
 87380  16384  16384    10.00      8077.76   89.85    89.85    1.822   1.822
 87380  16384  16384    10.00      8026.32   89.80    89.80    1.833   1.833
 87380  16384  16384    10.01      8108.59   89.81    89.81    1.815   1.815
 87380  16384  16384    10.01      8034.53   89.01    89.01    1.815   1.815
 87380  16384  16384    10.00      8113.70   90.45    90.45    1.827   1.827
 87380  16384  16384    10.00      8111.37   89.90    89.90    1.816   1.816
 87380  16384  16384    10.01      8077.75   87.96    87.96    1.784   1.784
 87380  16384  16384    10.00      8062.70   90.25    90.25    1.834   1.834

After: max 8035.81, min 7963.69, avg 7998.14
 87380  16384  16384    10.01      8000.93   82.11    82.11    1.682   1.682
 87380  16384  16384    10.01      8016.17   83.67    83.67    1.710   1.710
 87380  16384  16384    10.01      7963.69   83.47    83.47    1.717   1.717
 87380  16384  16384    10.01      8014.35   81.71    81.71    1.671   1.671
 87380  16384  16384    10.00      7967.68   83.41    83.41    1.715   1.715
 87380  16384  16384    10.00      7995.22   81.00    81.00    1.660   1.660
 87380  16384  16384    10.00      8002.61   83.90    83.90    1.718   1.718
 87380  16384  16384    10.00      8035.81   81.71    81.71    1.666   1.666
 87380  16384  16384    10.01      8005.36   82.56    82.56    1.690   1.690
 87380  16384  16384    10.00      7979.61   82.50    82.50    1.694   1.694

Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:27:12 -08:00
Stephen Hemminger
8aca8a27d9 [NET]: minor net_rx_action optimization
Calling list_del followed by list_add_tail is equivalent to the
existing inline list_move_tail. list_move_tail avoids unnecessary
_LIST_POISON.
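
The equivalence can be illustrated with a minimal userspace stand-in
for the kernel list (a sketch only; the names below are hypothetical
and are not the kernel's list.h API).  The two-step variant writes
marker values into the unlinked entry that are immediately overwritten
when the entry is re-added; the one-step variant skips those writes.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, modelled on the kernel idea. */
struct list_node {
    struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
    head->next = head->prev = head;
}

/* Unlink `entry` from the list it is currently on. */
static void list_unlink(struct list_node *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
}

/* Link `entry` just before `head`, i.e. at the tail of the list. */
static void list_append(struct list_node *entry, struct list_node *head)
{
    assert(entry != head);
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Two-step move: unlink, poison the stale pointers, then re-add. */
static void move_tail_two_step(struct list_node *entry, struct list_node *head)
{
    list_unlink(entry);
    entry->next = entry->prev = NULL;   /* stands in for the poison writes */
    list_append(entry, head);
}

/* One-step move: the stale pointers are simply overwritten by the re-add. */
static void move_tail_one_step(struct list_node *entry, struct list_node *head)
{
    list_unlink(entry);
    list_append(entry, head);
}
```

Both variants leave the list in the same state; the second just avoids
two stores per move.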

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:26:39 -08:00
Michael S. Tsirkin
c5ecd62c25 [NET]: Move destructor from neigh->ops to neigh_params
struct neigh_ops currently has a destructor field, which no in-kernel
drivers outside of infiniband use.  The infiniband/ulp/ipoib in-tree
driver stashes some info in the neighbour structure (the results of
the second-stage lookup from ARP results to real link-level path), and
it uses neigh->ops->destructor to get a callback so it can clean up
this extra info when a neighbour is freed.  We've run into problems
with this: since the destructor is in an ops field that is shared
between neighbours that may belong to different net devices, there's
no way to set/clear it safely.

The following patch moves this field to neigh_parms where it can be
safely set, together with its twin neigh_setup.  Two additional
patches in the patch series update ipoib to use this new interface.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:25:41 -08:00
Luiz Capitulino
53dcb0e38c [PKTGEN]: Updates version.
Due to the thread's lock changes, we're at a new version now.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:25:05 -08:00
Luiz Capitulino
6146e6a43b [PKTGEN]: Removes thread_{un,}lock() macros.
As suggested by Arnaldo, this patch replaces the
thread_lock()/thread_unlock() macros with direct calls to
mutex_lock()/mutex_unlock().

This change makes the code a bit more readable, and the direct calls
are used everywhere in the kernel.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:24:45 -08:00
Luiz Capitulino
222fa07665 [PKTGEN]: Convert thread lock to mutexes.
pktgen's thread semaphores are strict mutexes, convert them to the
mutex implementation.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:24:27 -08:00
Stephen Hemminger
6756ae4b4e [NET]: Convert RTNL to mutex.
This patch turns the RTNL from a semaphore to a new 2.6.16 mutex and
gets rid of some of the leftover legacy.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:23:58 -08:00
David S. Miller
253aa11578 [IPSEC] xfrm_user: Kill PAGE_SIZE check in verify_sec_ctx_len()
First, it warns when PAGE_SIZE >= 64K because the ctx_len
field is 16-bits.

Secondly, if there are any real length limitations it can
be verified by the security layer security_xfrm_state_alloc()
call.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:23:35 -08:00
Baruch Even
50bf3e224a [TCP] H-TCP: Better time accounting
Instead of estimating the time since the last congestion event, count
it directly.

Signed-off-by: Baruch Even <baruch@ev-en.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:23:10 -08:00
Baruch Even
0bc6d90b82 [TCP] H-TCP: Account for delayed-ACKs
Account for delayed-ACKs in H-TCP.

Delayed-ACKs cause H-TCP to be less aggressive than its design calls
for. It is especially true when the receiver is a Linux machine where
the average delayed ack is over 3 packets with values of 7 not unheard
of.

Signed-off-By: Baruch Even <baruch@ev-en.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:22:47 -08:00
Baruch Even
c33ad6e476 [TCP] H-TCP: Use msecs_to_jiffies
Use functions to calculate jiffies from milliseconds and not the old,
crude method of dividing HZ by a value. Ensures more accurate values
even in the face of strange HZ values.
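
To illustrate why dividing HZ is crude, here is a small userspace
sketch (not kernel code).  The rounding-up formula is a simplified
assumption in the spirit of msecs_to_jiffies(), and both function names
are hypothetical.

```c
#include <assert.h>

/* Old style: ticks for a period of 1/divisor seconds; truncates. */
static unsigned long jiffies_by_division(unsigned long hz,
                                         unsigned long divisor)
{
    assert(divisor != 0);
    return hz / divisor;
}

/* Sketch of a millisecond conversion that rounds up to a whole tick. */
static unsigned long ms_to_jiffies_sketch(unsigned long hz, unsigned long ms)
{
    return (ms * hz + 999) / 1000;
}
```

For a 10 ms interval with HZ=250, HZ/100 gives 2 ticks (8 ms, too
short) while the rounding conversion gives 3.  For a 1 ms interval with
HZ=100, HZ/1000 gives 0 ticks while the rounding conversion gives 1.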

Signed-off-By: Baruch Even <baruch@ev-en.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:22:20 -08:00
Luiz Capitulino
65a3980e6b [PKTGEN]: Updates version.
With all the previous changes, we're at a new version now.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:18:31 -08:00
Luiz Capitulino
c26a80168f [PKTGEN]: Ports if_list to the in-kernel implementation.
This patch ports the per-thread interface list to the in-kernel
linked list implementation. In general, the resulting code is a
bit simpler.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:18:16 -08:00
Luiz Capitulino
8024bb2454 [PKTGEN]: Fix Initialization fail leak.
Even if pktgen's thread initialization fails for all CPUs, the module
will be successfully loaded.

This patch changes that behavior, by returning an error at module load time,
and also freeing all the resources allocated. It also prints a warning if a
thread initialization has failed.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:17:55 -08:00
Luiz Capitulino
12e1872328 [PKTGEN]: Fix kernel_thread() fail leak.
Free all the allocated resources if the kernel_thread() call fails.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:17:00 -08:00
Luiz Capitulino
cdcdbe0b17 [PKTGEN]: Ports thread list to Kernel list implementation.
The final result is simpler and smaller code.

Note that I'm adding a new member in the struct pktgen_thread called
'removed'. The reason is that I didn't find a better wait condition to
be used in the place of the replaced one.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:16:40 -08:00
Luiz Capitulino
222f180658 [PKTGEN]: Lindent run.
Lindent run, with some fixes made by hand.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:16:13 -08:00
Arnaldo Carvalho de Melo
6df9424a9c [DCCP] options: Fix some aspects of mandatory option processing
According to the dccp draft (draft-ietf-dccp-spec-13.txt) section 5.8.2
(Mandatory Option), this patch corrects the handling of the
following cases:

1) "... and any Mandatory options received on DCCP-Data packets MUST be
  ignored."

2) "The connection is in error and should be reset with Reset Code 5, ...
  if option O is absent (Mandatory was the last byte of the option list), or
  if option O equals Mandatory."
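
A simplified sketch of these two rules, for illustration only: every
option is treated here as a single type byte (real options may carry a
length and data), the function and enum names are hypothetical, and the
Mandatory type code of 1 is assumed from the DCCP specification.

```c
#include <assert.h>
#include <stddef.h>

#define OPT_MANDATORY 1   /* assumed type code of the Mandatory option */

enum parse_result { PARSE_OK = 0, PARSE_RESET = -1 };

/*
 * Walk a list of option type bytes and apply the two rules quoted above.
 * Returns PARSE_RESET where the connection should be reset.
 */
static enum parse_result check_mandatory(const unsigned char *opts, size_t n,
                                         int is_data_packet)
{
    int mandatory = 0;   /* set while a Mandatory marker awaits its option */

    assert(opts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (opts[i] == OPT_MANDATORY) {
            if (is_data_packet)
                continue;              /* rule 1: ignored on DCCP-Data */
            if (mandatory)
                return PARSE_RESET;    /* rule 2: option O equals Mandatory */
            mandatory = 1;
            continue;
        }
        mandatory = 0;                 /* option O consumes the marker */
    }
    /* rule 2: Mandatory was the last byte of the option list */
    return mandatory ? PARSE_RESET : PARSE_OK;
}
```

A list of {Mandatory, X} is accepted, while {X, Mandatory} and
{Mandatory, Mandatory, X} ask for a reset unless the packet is a
DCCP-Data packet.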

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:06:02 -08:00
Arnaldo Carvalho de Melo
c0c736db7e [DCCP] ccid2: coding style cleanups
No changes in the logic were made.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:05:37 -08:00
Arnaldo Carvalho de Melo
45329e71ee [DCCP] ipv6: cleanups
No changes in the logic were made, just removing trailing whitespaces,
etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:01:29 -08:00
Arnaldo Carvalho de Melo
c4d9390941 [ICSK]: Introduce inet_csk_ctl_sock_create
Consolidating open coded sequences in tcp and dccp, v4 and v6.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:01:03 -08:00
Arnaldo Carvalho de Melo
7247887357 [DCCP] ipv6: Add missing ipv6 control socket
I guess I forgot to add it, nah, now it just works:

18:04:33.274066 IP6 ::1.1476 > ::1.5001: request (service=0)
18:04:33.334482 IP6 ::1.5001 > ::1.1476: reset (code=bad_service_code)

Ditched IP_DCCP_UNLOAD_HACK, as now we would have to do it for both
IPv6 and IPv4, so I'll come up with another way for freeing the
control sockets in upcoming changesets.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:00:37 -08:00
Arnaldo Carvalho de Melo
c25a18ba34 [DCCP]: Uninline some functions
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:58:56 -08:00
Adrian Bunk
5e0817f84c [DCCP] ipv4: make struct dccp_v4_prot static
There's no reason for struct dccp_v4_prot being global.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:58:29 -08:00
David S. Miller
d76e60a5b5 [IPV6]: Fix some code/comment formatting in ip6_dst_output().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:35:50 -08:00
Robert Olsson
06ef921d60 [IPV4]: fib_trie stats fix
fib_triestats has been buggy and caused oopses on some platforms such as
openwrt.  The patch below should cure those problems.

Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:35:01 -08:00
Robert Olsson
5ddf0eb2bf [IPV4]: fib_trie initialization fix
In some kernel configs /proc functions seem to be accessed before the
trie is initialized. The patch below checks for this.

Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:34:12 -08:00
John Heffner
0e7b13685f [TCP] mtu probing: move tcp-specific data out of inet_connection_sock
This moves some TCP-specific MTU probing state out of
inet_connection_sock back to tcp_sock.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:32:58 -08:00
Benjamin LaHaise
e9df7d7f58 [AF_UNIX]: use shift instead of integer division
The patch below replaces a divide by 2 with a shift -- sk_sndbuf is an
integer, so gcc emits an idiv, which takes 10x longer than a shift by 1.
This improves af_unix bandwidth by ~6-10K/s.  Also, tidy up the comment
to fit in 80 columns while we're at it.
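
A small userspace sketch of the substitution (illustrative only; the
function names are hypothetical).  For non-negative values, which is
what a send buffer size is, halving by division and by shift agree.
For negative values C division truncates toward zero, so a plain shift
is not a drop-in replacement in general.

```c
#include <assert.h>

/* Signed division by two: truncates toward zero for negative inputs. */
static int half_by_division(int sndbuf)
{
    return sndbuf / 2;
}

/* Shift by one: only used here for non-negative values. */
static int half_by_shift(int sndbuf)
{
    assert(sndbuf >= 0);
    return sndbuf >> 1;
}
```

Both return 106496 for a 212992-byte buffer and 3 for an input of 7.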

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:29:05 -08:00
Jörn Engel
231d06ae82 [NET]: Uninline kfree_skb and allow NULL argument
o Uninline kfree_skb, which saves some 15k of object code on my notebook.

o Allow kfree_skb to be called with a NULL argument.

  Subsequent patches can remove conditional from drivers and further
  reduce source and object size.
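
The NULL-tolerant release pattern can be sketched in userspace C as
follows.  This is an illustration, not the kernel implementation:
struct buf is a hypothetical stand-in for struct sk_buff, and the
return value exists only to make the behaviour observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted buffer. */
struct buf {
    int users;   /* reference count */
};

static struct buf *buf_alloc(void)
{
    struct buf *b = malloc(sizeof(*b));

    if (b)
        b->users = 1;
    return b;
}

/* Drop one reference; returns 1 if the buffer was freed, 0 otherwise. */
static int buf_put(struct buf *b)
{
    if (!b)
        return 0;            /* NULL is accepted and ignored, like free(NULL) */
    assert(b->users > 0);
    if (--b->users > 0)
        return 0;            /* other holders remain */
    free(b);
    return 1;
}
```

Callers can then write buf_put(ptr) unconditionally instead of
guarding every call with "if (ptr)".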

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:28:35 -08:00
Arnaldo Carvalho de Melo
2e1f47c74c [LLC]: Fix sap refcounting
Thanks to Leslie Harlley Watter <leslie@watter.org> for reporting the
problem and testing this patch.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:28:11 -08:00
Arnaldo Carvalho de Melo
2342c990bb [LLC]: Replace __inline__ with inline
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:27:43 -08:00
Arnaldo Carvalho de Melo
9c005e018c [LLC]: Fix struct proto .name
Cut'n'paste error from ddp_proto.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:27:23 -08:00
Arthur Kepner
95ed63f791 [NET] pktgen: Fix races between control/worker threads.
There's a race in pktgen which can lead to a double
free of a pktgen_dev's skb. If a worker thread is in
the midst of doing fill_packet(), and the controlling
thread gets a "stop" message, the already freed skb
can be freed once again in pktgen_stop_device(). This
patch gives all responsibility for cleaning up a
pktgen_dev's skb to the associated worker thread.

Signed-off-by: Arthur Kepner <akepner@sgi.com>
Acked-by: Robert Olsson <Robert.Olsson@data.slu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:26:56 -08:00
Arnaldo Carvalho de Melo
b61fafc4ef [DCCP]: Move the IPv4 specific bits from proto.c to ipv4.c
With this patch in place we can break down the complexity by better
compartmentalizing the code that is common to ipv6 and ipv4.

Now we have these modules:
Module                  Size  Used by
dccp_diag               1344  0
inet_diag               9448  1 dccp_diag
dccp_ccid3             15856  0
dccp_tfrc_lib          12320  1 dccp_ccid3
dccp_ccid2              5764  0
dccp_ipv4              16996  2
dccp                   48208  4 dccp_diag,dccp_ccid3,dccp_ccid2,dccp_ipv4

dccp_ipv6 still requires dccp_ipv4 due to dccp_ipv6_mapped, that is
the next target to work on the "hey, ipv4 is legacy, I only want ipv6
dude!" direction.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:25:11 -08:00
Arnaldo Carvalho de Melo
46f09ffa7d [DCCP]: Rename init_dccp_v4_mibs to dccp_mib_init
And introduce dccp_mib_exit grouping previously open coded sequence.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:24:42 -08:00
Arnaldo Carvalho de Melo
075ae86611 [DCCP]: Move dccp_hashinfo from ipv4.c to the core
As it is used by both ipv4 and ipv6.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:24:19 -08:00
Arnaldo Carvalho de Melo
0a1ec676dd [DCCP]: Dont use dccp_v4_checksum in dccp_make_response
dccp_make_response is shared by ipv4/6 and the ipv6 code was
recalculating the checksum, not good, so move the dccp_v4_checksum
call to dccp_v4_send_response.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:23:59 -08:00
Arnaldo Carvalho de Melo
c985ed705f [DCCP]: Move dccp_[un]hash from ipv4.c to the core
As this is used by both ipv4 and ipv6 and is not ipv4 specific.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:23:39 -08:00
Arnaldo Carvalho de Melo
3e0fadc51f [DCCP]: Move dccp_v4_{init,destroy}_sock to the core
Removing one more ipv6 uses ipv4 stuff case in dccp land.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 21:23:15 -08:00
J. Bruce Fields
0e19c1ea2f SUNRPC,RPCSEC_GSS: spkm3: import contexts using NID_cast5_cbc
Import the NID_cast5_cbc from the userland context. Not used.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 23:24:40 -05:00
J. Bruce Fields
eaa82edf20 SUNRPC,RPCSEC_GSS: fix krb5 sequence numbers.
Use a spinlock to ensure unique sequence numbers when creating krb5 gss tokens.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 23:24:04 -05:00
J. Bruce Fields
9e57b302cf SUNRPC,RPCSEC_GSS: remove unnecessary kmalloc of a checksum
Remove unnecessary kmalloc of temporary space to hold the md5 result; it's
small enough to just put on the stack.

This code may be called to process rpc's necessary to perform writes, so
there's a potential deadlock whenever we kmalloc() here.  After this a
couple kmalloc()'s still remain, to be removed soon.

This also fixes a rare double-free on error noticed by coverity.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 23:23:11 -05:00
Arnaldo Carvalho de Melo
017487d7d1 [DCCP]: Generalize dccp_v4_send_reset
Renaming it to dccp_send_reset and moving it from the ipv4 specific
code to the core dccp code.

This fixes some bugs in IPV6 where timers would send v4 resets, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:25:24 -08:00
Arnaldo Carvalho de Melo
e55d912f5b [DCCP] feat: Introduce sysctls for the default features
[root@qemu ~]# for a in /proc/sys/net/dccp/default/* ; do echo $a ; cat $a ; done
/proc/sys/net/dccp/default/ack_ratio
2
/proc/sys/net/dccp/default/rx_ccid
3
/proc/sys/net/dccp/default/send_ackvec
1
/proc/sys/net/dccp/default/send_ndp
1
/proc/sys/net/dccp/default/seq_window
100
/proc/sys/net/dccp/default/tx_ccid
3
[root@qemu ~]#

So if wanting to test ccid3 as the tx CCID one can just do:

[root@qemu ~]# echo 3 > /proc/sys/net/dccp/default/tx_ccid
[root@qemu ~]# echo 2 > /proc/sys/net/dccp/default/rx_ccid
[root@qemu ~]# cat /proc/sys/net/dccp/default/[tr]x_ccid
2
3
[root@qemu ~]#

Of course we also need the setsockopt for each app to tell its preferences, but
for testing or defining something other than CCID2 as the default for apps that
don't explicitly set their preference the sysctl interface is handy.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:25:02 -08:00
Arnaldo Carvalho de Melo
04e2661e9c [DCCP]: Call dccp_feat_init more early in dccp_v4_init_sock
So that dccp_feat_clean doesn't get confused with uninitialized
list_heads.

Noticed when testing with no ccid kernel modules.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:24:41 -08:00
Arnaldo Carvalho de Melo
057fc6755a [DCCP]: Kconfig tidy up
Make CCID2 and CCID3 default to what was selected for DCCP and use the
standard short description for the CCIDs (TCP-Like & TCP-Friendly).

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:24:22 -08:00
Andrea Bittau
60fe62e789 [DCCP]: sparse endianness annotations
This also fixes the layout of dccp_hdr short sequence numbers, problem
was not fatal now as we only support long (48 bits) sequence numbers.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:23:32 -08:00
Patrick McHardy
a193a4abdd [NETFILTER]: Fix skb->nf_bridge lifetime issues
The bridge netfilter code simulates the NF_IP_PRE_ROUTING hook and skips
the real hook by registering with high priority and returning NF_STOP if
skb->nf_bridge is present and the BRNF_NF_BRIDGE_PREROUTING flag is not
set. The flag is only set during the simulated hook.

Because skb->nf_bridge is only freed when the packet is destroyed, the
packet will not only skip the first invocation of NF_IP_PRE_ROUTING, but
in the case of tunnel devices on top of the bridge also all further ones.
Forwarded packets from a bridge encapsulated by a tunnel device and sent
as locally outgoing packet will also still have the incorrect bridge
information from the input path attached.

We already have nf_reset calls on all RX/TX paths of tunnel devices,
so simply reset the nf_bridge field there too. As an added bonus,
the bridge information for locally delivered packets is now also freed
when the packet is queued to a socket.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:23:05 -08:00
Andrea Bittau
6ffd30fbbb [DCCP] feat: Actually change the CCID upon negotiation
Change the CCID upon successful feature negotiation.

Committer note: patch mostly rewritten to use the new ccid API.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:22:37 -08:00
Arnaldo Carvalho de Melo
91f0ebf7b6 [DCCP] CCID: Improve CCID infrastructure
1. No need for ->ccid_init nor ->ccid_exit, this is what module_{init,exit}
   does and anyway neither ccid2 nor ccid3 were using it.

2. Rename struct ccid to struct ccid_operations and introduce struct ccid
   with a pointer to ccid_operations and right after it the rx or tx
   private state.

3. Remove the pointer to the state of the half connections from struct
   dccp_sock, now it's derived through ccid_priv() from the ccid pointer.

Now we also can implement the setsockopt for changing the CCID easily as
no ccid init routines can affect struct dccp_sock in any way that prevents
other CCIDs from working if a CCID switch operation is asked by apps.
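
The layout described in points 2 and 3 can be sketched in userspace C
like this.  It is an illustration under assumptions, not the code from
the patch: the field names, priv_size, ccid_new() and ccid_delete() are
hypothetical, and only the idea of an operations pointer followed
directly by private state reached through ccid_priv() comes from the
description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Per-CCID operations table (fields here are assumed for the sketch). */
struct ccid_operations {
    const char *name;
    size_t      priv_size;   /* bytes of private rx or tx state */
};

/* A CCID instance: the ops pointer, then the private state right after it. */
struct ccid {
    const struct ccid_operations *ops;
    unsigned char priv[];    /* flexible array member holding the state */
};

/* Allocate one instance with zeroed private state. */
static struct ccid *ccid_new(const struct ccid_operations *ops)
{
    struct ccid *c;

    assert(ops != NULL);
    c = malloc(sizeof(*c) + ops->priv_size);
    if (c) {
        c->ops = ops;
        memset(c->priv, 0, ops->priv_size);
    }
    return c;
}

/* The private state is derived from the ccid pointer, not stored elsewhere. */
static void *ccid_priv(struct ccid *c)
{
    return c->priv;
}

static void ccid_delete(struct ccid *c)
{
    free(c);
}
```

Because the socket only holds the struct ccid pointer, switching CCIDs
amounts to deleting one instance and creating another.  The sketch
assumes the private state needs no stricter alignment than a pointer.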

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:21:44 -08:00
Patrick McHardy
f38c39d6ce [PKT_SCHED]: Convert sch_red to a classful qdisc
Convert sch_red to a classful qdisc. All qdiscs that maintain accurate
backlog counters are eligible as child qdiscs. When a queue limit larger
than zero is given, a bfifo qdisc is used for backwards compatibility.
Current versions of tc enforce a limit larger than zero, other users
can avoid creating the default qdisc by using zero.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:20:44 -08:00
David S. Miller
a70fcb0ba3 [XFRM]: Add some missing exports.
To fix the case of modular xfrm_user.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:18:52 -08:00
David S. Miller
ee857a7d67 [XFRM]: Move xfrm_nl to xfrm_state.c from xfrm_user.c
xfrm_user could be modular, and since generic code uses this symbol
now...

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:18:37 -08:00
David S. Miller
0ac8475248 [XFRM]: Make sure xfrm_replay_timer_handler() is declared early enough.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:18:23 -08:00
Jamal Hadi Salim
6c5c8ca7ff [IPSEC]: Sync series - policy expires
This is similar to the SA expire insertion patch - only it inserts
expires for SP.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:17:25 -08:00
Jamal Hadi Salim
53bc6b4d29 [IPSEC]: Sync series - SA expires
This patch allows a user to insert SA expires. This is useful to
do on an HA backup for the case of byte counts but may not be very
useful for the case of time based expiry.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:17:03 -08:00
Jamal Hadi Salim
980ebd2579 [IPSEC]: Sync series - acquire insert
This introduces a feature similar to the one described in RFC 2367:
"
   ... the application needing an SA sends a PF_KEY
   SADB_ACQUIRE message down to the Key Engine, which then either
   returns an error or sends a similar SADB_ACQUIRE message up to one or
   more key management applications capable of creating such SAs.
   ...
   ...
   The third is where an application-layer consumer of security
   associations (e.g.  an OSPFv2 or RIPv2 daemon) needs a security
   association.

        Send an SADB_ACQUIRE message from a user process to the kernel.

        <base, address(SD), (address(P),) (identity(SD),) (sensitivity,)
          proposal>

        The kernel returns an SADB_ACQUIRE message to registered
          sockets.

        <base, address(SD), (address(P),) (identity(SD),) (sensitivity,)
          proposal>

        The user-level consumer waits for an SADB_UPDATE or SADB_ADD
        message for its particular type, and then can use that
        association by using SADB_GET messages.

 "
An app such as OSPF could then use ipsec KM to get keys

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:16:40 -08:00
Jamal Hadi Salim
d51d081d65 [IPSEC]: Sync series - user
Add xfrm as the user of the core changes

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:16:12 -08:00
Jamal Hadi Salim
9500e8a81f [IPSEC]: Sync series - fast path
Fast path sequence updates that will generate ipsec async
events

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:15:29 -08:00
Jamal Hadi Salim
f8cd54884e [IPSEC]: Sync series - core changes
This patch provides the core functionality needed for sync events
for ipsec. Derived work of Krisztian KOVACS <hidden@balabit.hu>

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:15:11 -08:00
Patrick McHardy
f5539eb8ca [PKT_SCHED]: Keep backlog counter in sch_sfq
Keep backlog counter in SFQ qdisc to make it usable as child qdisc
with RED.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:01:38 -08:00
Patrick McHardy
053cfed75d [PKT_SCHED]: Restore TBF change semantic
When TBF was converted to a classful qdisc, the semantic of the limit
parameter was broken. On initialization an inner bfifo qdisc is created
for backwards compatibility; when changing parameters, however, the new
limit is ignored and the current child qdisc remains in place.

Always replace the child qdisc by the default bfifo when limit is above
zero, otherwise don't touch the inner qdisc. Current tc versions enforce
a limit above zero; other users can avoid creating the inner qdisc by
using zero.
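
The change rule described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `toy_tbf`, `toy_child_kind` and `toy_tbf_change` are hypothetical names, and the real sch_tbf code manages actual child qdisc objects rather than an enum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of what is attached below the TBF qdisc. */
enum toy_child_kind { TOY_CHILD_NONE, TOY_CHILD_BFIFO, TOY_CHILD_CUSTOM };

struct toy_tbf {
    enum toy_child_kind child;
    unsigned int child_limit; /* byte limit of the default bfifo, if any */
};

/*
 * Apply a parameter change: a limit above zero always installs a fresh
 * default bfifo with that limit; a limit of zero leaves the child alone.
 */
void toy_tbf_change(struct toy_tbf *q, unsigned int limit)
{
    assert(q != NULL);
    if (limit > 0) {
        q->child = TOY_CHILD_BFIFO;
        q->child_limit = limit;
    }
}
```

With this rule a caller that attached its own child keeps it only as long as it passes a zero limit.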

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:01:21 -08:00
Patrick McHardy
cdc7f8e362 [PKT_SCHED]: Dump child qdisc handle in sch_{atm,dsmark}
A qdisc should set tcm_info to the child qdisc handle in its class
dump function.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:01:06 -08:00
Patrick McHardy
6d037a26f0 [PKT_SCHED]: Qdisc drop operation is optional
The drop operation is optional and qdiscs must check if children support it.
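
A minimal sketch of such a check, assuming a simplified ops table: the `toy_` types and functions below are hypothetical stand-ins, not the kernel's `struct Qdisc_ops`.

```c
#include <assert.h>
#include <stddef.h>

struct toy_qdisc;

/* Hypothetical, simplified ops table for a queueing discipline. */
struct toy_qdisc_ops {
    /* Optional: NULL when the qdisc cannot drop a packet on request. */
    unsigned int (*drop)(struct toy_qdisc *q);
};

struct toy_qdisc {
    const struct toy_qdisc_ops *ops;
    unsigned int backlog; /* bytes currently queued */
};

/* Returns the number of bytes dropped, or 0 if the child has no drop op. */
unsigned int toy_child_drop(struct toy_qdisc *child)
{
    assert(child != NULL && child->ops != NULL);
    if (child->ops->drop == NULL)
        return 0;
    return child->ops->drop(child);
}

/* Example drop op for the sketch: discards the whole backlog. */
unsigned int toy_drop_all(struct toy_qdisc *q)
{
    unsigned int len = q->backlog;

    q->backlog = 0;
    return len;
}
```

A parent that calls through the pointer without the NULL test would crash on any child that leaves the operation unset.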

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 19:00:49 -08:00
Patrick McHardy
4277a083ec [NETLINK]: Add netlink_has_listeners for avoiding unnecessary event message generation
Keep a bitmask of multicast groups with subscribed listeners to let
netlink users check for listeners before generating multicast
messages.

Queries don't perform any locking, which may result in false
positives. It is guaranteed, however, that any new subscriptions are
visible before bind() or setsockopt() return.
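
The bitmask idea can be illustrated as follows. This is a sketch, not the netlink implementation: the `toy_` names are hypothetical, groups are assumed to be numbered 1 to 32, and unsubscription (which would need per-group counting) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per multicast group, groups numbered 1..32. */
struct toy_nl_table {
    uint32_t listeners;
};

static uint32_t toy_group_bit(unsigned int group)
{
    assert(group >= 1 && group <= 32);
    return (uint32_t)1 << (group - 1);
}

/* Record that at least one socket listens on the group. */
void toy_subscribe(struct toy_nl_table *t, unsigned int group)
{
    assert(t != NULL);
    t->listeners |= toy_group_bit(group);
}

/* Cheap query a sender can make before building a multicast message. */
int toy_has_listeners(const struct toy_nl_table *t, unsigned int group)
{
    assert(t != NULL);
    return (t->listeners & toy_group_bit(group)) != 0;
}
```

A sender would test `toy_has_listeners()` first and skip message construction entirely when it returns zero.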

Signed-off-by: Patrick McHardy <kaber@trash.net>
ACKed-by: Jamal Hadi Salim<hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 18:52:01 -08:00
Patrick McHardy
a242769248 [NETFILTER]: ctnetlink: avoid unnecessary event message generation
Avoid unnecessary event message generation by checking for netlink
listeners before building a message.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 18:03:59 -08:00
Patrick McHardy
c4b8851392 [NETFILTER]: x_tables: replace IPv4/IPv6 policy match by address family independent version
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 18:03:40 -08:00
Patrick McHardy
f2ffd9eeda [NETFILTER]: Move ip6_masked_addrcmp to include/net/ipv6.h
Replace netfilter's ip6_masked_addrcmp by a more efficient version
in include/net/ipv6.h to make it usable without module dependencies.
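
For readers unfamiliar with masked comparison, here is one way it can be written. The commit text does not show the kernel's version, so this is an assumption-laden sketch: `toy_in6` and `toy_masked_addrcmp` are hypothetical names, and the address is modelled as four 32-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an IPv6 address: four 32-bit words. */
struct toy_in6 {
    uint32_t w[4];
};

/* Returns 0 when a1 and a2 agree on every bit set in mask, 1 otherwise. */
int toy_masked_addrcmp(const struct toy_in6 *a1, const struct toy_in6 *mask,
                       const struct toy_in6 *a2)
{
    uint32_t diff = 0;
    size_t i;

    assert(a1 != NULL && mask != NULL && a2 != NULL);
    /* XOR exposes differing bits; AND keeps only those the mask covers. */
    for (i = 0; i < 4; i++)
        diff |= (a1->w[i] ^ a2->w[i]) & mask->w[i];
    return diff != 0;
}
```

Working on whole words avoids a byte-by-byte loop, which is one plausible source of the efficiency gain the message mentions.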

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 18:03:16 -08:00
Patrick McHardy
c498673474 [NETFILTER]: x_tables: add xt_{match,target} arguments to match/target functions
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 18:02:56 -08:00
Patrick McHardy
1c524830d0 [NETFILTER]: x_tables: pass registered match/target data to match/target functions
This allows decisions to be made based on the revision (and address family,
with a follow-up patch) at runtime.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 18:02:15 -08:00
Patrick McHardy
5d04bff096 [NETFILTER]: Convert x_tables matches/targets to centralized error checking
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 18:01:58 -08:00
Patrick McHardy
7f9397138e [NETFILTER]: Convert ip6_tables matches/targets to centralized error checking
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 18:01:43 -08:00
Patrick McHardy
aa83c1ab43 [NETFILTER]: Convert arp_tables targets to centralized error checking
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 18:01:28 -08:00
Patrick McHardy
1d5cd90976 [NETFILTER]: Convert ip_tables matches/targets to centralized error checking
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 18:01:14 -08:00
Patrick McHardy
3cdc7c953e [NETFILTER]: Change {ip,ip6,arp}_tables to use centralized error checking
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 18:00:36 -08:00
Patrick McHardy
37f9f7334b [NETFILTER]: xt_tables: add centralized error checking
Introduce new functions for common match/target checks (private data
size, valid hooks, valid tables and valid protocols) to get more consistent
error reporting and to avoid each module duplicating them.
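
The four checks named above could be centralized along these lines. This is a sketch only: `toy_match` and `toy_check_match` are hypothetical, and the field layout is an assumption rather than the x_tables structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified description of a registered match. */
struct toy_match {
    const char *name;
    size_t matchsize;     /* expected size of the private data */
    const char *table;    /* required table name, or NULL for any */
    unsigned int hooks;   /* bitmask of permitted hooks, 0 for any */
    unsigned short proto; /* required protocol, 0 for any */
};

/* Returns 0 when the rule's use of the match is valid, -1 otherwise. */
int toy_check_match(const struct toy_match *m, size_t size,
                    const char *table, unsigned int hook_mask,
                    unsigned short proto)
{
    assert(m != NULL && table != NULL);
    if (m->matchsize != size)
        return -1; /* private data size mismatch */
    if (m->table != NULL && strcmp(m->table, table) != 0)
        return -1; /* used in the wrong table */
    if (m->hooks != 0 && (hook_mask & ~m->hooks) != 0)
        return -1; /* called from a hook that is not permitted */
    if (m->proto != 0 && m->proto != proto)
        return -1; /* wrong protocol */
    return 0;
}
```

Each module then only declares its constraints in the structure instead of repeating the comparisons in its own check function.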

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:59:06 -08:00
Yasuyuki Kozakai
6ea46c9c12 [NETFILTER]: nf_conntrack: use ipv6_addr_equal in nf_ct_reasm
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:58:44 -08:00
Holger Eitzenberger
f2ad52c9da [NETFILTER]: Fix CID offset bug in PPTP NAT helper debug message
The recent (kernel 2.6.15.1) fix for the PPTP NAT helper introduced a
bug, which only appears if DEBUGP is enabled though.

The calculation of the CID offset into a PPTP request struct is
not correct, so at the very least the wrong CID is displayed
if DEBUGP is enabled.

This patch corrects CID offset calculation and introduces a #define
for that.

Signed-off-by: Holger Eitzenberger <heitzenberger@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:58:21 -08:00
Andrea Bittau
77ff72d528 [DCCP] CCID2: Drop sock reference count on timer expiration and reset.
There was a hybrid use of standard timers and sk_timers.  This caused
the reference count of the sock to be incorrect when resetting the RTO
timer.  The sock reference count should now be correct, enabling its
destruction, and allowing the DCCP module to be unloaded.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-03-20 17:57:52 -08:00
Harald Welte
dc808fe28d [NETFILTER] nf_conntrack: clean up to reduce size of 'struct nf_conn'
This patch moves all helper related data fields of 'struct nf_conn'
into a separate structure 'struct nf_conn_help'.  This new structure
is only present in conntrack entries for which we actually have a
helper loaded.

Also, this patch cleans up the nf_conntrack 'features' mechanism to
resemble what the original idea was: Just glue the feature-specific
data structures at the end of 'struct nf_conn', and explicitly
re-calculate the pointer to it when needed rather than keeping
pointers around.

Saves 20 bytes per conntrack on my x86_64 box. A non-helped conntrack
is 276 bytes. We still need to save another 20 bytes in order to fit
into the target of 256 bytes.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:56:32 -08:00
John Heffner
5d424d5a67 [TCP]: MTU probing
Implementation of packetization layer path mtu discovery for TCP, based on
the internet-draft currently found at
<http://www.ietf.org/internet-drafts/draft-ietf-pmtud-method-05.txt>.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:53:41 -08:00
Adrian Bunk
d15150f755 [IPV4] fib_rules.c: make struct fib_rules static again
struct fib_rules became global for no good reason.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:46:56 -08:00
Jesper Juhl
2b191befe2 [IPCOMP6]: don't check vfree() argument for NULL.
vfree does its own NULL checking, so checking a pointer before
handing it to vfree is pointless.
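
The same pattern exists in userspace C, where `free(NULL)` is defined to do nothing. The sketch below uses `free()` as an analogue for `vfree()`; `toy_release` is a hypothetical helper, not kernel code.

```c
#include <assert.h>
#include <stdlib.h>

/* Frees *bufp and clears it; safe to call when *bufp is already NULL. */
void toy_release(void **bufp)
{
    assert(bufp != NULL);
    free(*bufp); /* no NULL test needed: free(NULL) is a no-op */
    *bufp = NULL;
}
```

Dropping the redundant test removes a branch and a line of code without changing behaviour.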

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:46:29 -08:00
Andrea Bittau
afe00251dd [DCCP]: Initial feature negotiation implementation
Still needs more work, but boots and doesn't crash, even
does some negotiation!

18:38:52.174934  127.0.0.1.43458 > 127.0.0.1.5001: request <change_l ack_ratio 2, change_r ccid 2, change_l ccid 2>
18:38:52.218526  127.0.0.1.5001 > 127.0.0.1.43458: response <nop, nop, change_l ack_ratio 2, confirm_r ccid 2 2, confirm_l ccid 2 2, confirm_r ack_ratio 2>
18:38:52.185398  127.0.0.1.43458 > 127.0.0.1.5001: <nop, confirm_r ack_ratio 2, ack_vector0 0x00, elapsed_time 212>

:-)

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:43:56 -08:00
Andrea Bittau
2a91aa3967 [DCCP] CCID2: Initial CCID2 (TCP-Like) implementation
Original work by Andrea Bittau, Arnaldo Melo cleaned up and fixed several
issues on the merge process.

For now CCID2 was made the default for all SOCK_DCCP connections, but this
will be remedied soon with the merge of the feature negotiation code.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:41:47 -08:00
Arnaldo Carvalho de Melo
aa5d7df3b2 [DCCP] CCID3: Set the no_feedback_timer fields near init_timer
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:35:13 -08:00
Arnaldo Carvalho de Melo
9833d6da00 [DCCP]: Don't alloc ack vector for the control sock
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:34:53 -08:00
Arnaldo Carvalho de Melo
d5e9b2c737 [DCCP] ackvec: Delete all the ack vector records in dccp_ackvec_free
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:20:46 -08:00
Arnaldo Carvalho de Melo
411447019a [DCCP] CCID: Allow ccid_{init,exit} to be NULL
Testing if the ccid being instantiated has these methods in
ccid_init().

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:20:23 -08:00
Andrea Bittau
02bcf28c82 [DCCP] ackvec: Introduce ack vector records
Based on a patch by Andrea Bittau.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:19:55 -08:00
Robert Olsson
7b204afd45 [IPV4]: Use RCU locking in fib_rules.
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:18:53 -08:00
Arnaldo Carvalho de Melo
9b07ef5dda [DCCP] ackvec: Introduce dccp_ackvec_slab
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:16:17 -08:00
Arnaldo Carvalho de Melo
fa23e2ecd3 [DCCP]: Fix error handling in dccp_init
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:16:01 -08:00
Arnaldo Carvalho de Melo
7400d78110 [DCCP] ackvec: Ditch dccpav_buf_len
Simplifying the code a bit as we're always using DCCP_MAX_ACKVEC_LEN.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:15:42 -08:00
Harald Welte
0af5f6c1eb [NETFILTER] nfnetlink_log: add sequence numbers for log events
By using a sequence number for every logged netfilter event, we can
determine from userspace whether logging information was lost somewhere
downstream.

The user has a choice of either having per-instance local sequence
counters, or using a global sequence counter, or both.
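
On the receiving side, loss detection from such a counter might look like the sketch below. It assumes events arrive in order and that the counter is 32 bits wide; `toy_seq_tracker` and `toy_seq_gap` are hypothetical names, not part of any netfilter userspace library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace tracker for one stream of sequence numbers. */
struct toy_seq_tracker {
    uint32_t next; /* sequence number expected next */
    int started;   /* zero until the first event has been seen */
};

/* Returns how many events were missed immediately before this one. */
uint32_t toy_seq_gap(struct toy_seq_tracker *t, uint32_t seq)
{
    uint32_t missed = 0;

    assert(t != NULL);
    if (t->started)
        missed = seq - t->next; /* unsigned arithmetic handles wraparound */
    t->started = 1;
    t->next = seq + 1;
    return missed;
}
```

A listener would keep one tracker per instance for the local counter and, if enabled, a second one for the global counter.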

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:15:11 -08:00
David S. Miller
39d8c1b6fb [NET]: Do not lose accepted socket when -ENFILE/-EMFILE.
Try to allocate the struct file and an unused file
descriptor before we try to pull a newly accepted
socket out of the protocol layer.

Based upon a patch by Prassana Meda.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:13:49 -08:00
Stefan Rompf
ddd7bf9fe4 [VLAN]: translate IF_OPER_DORMANT to netif_dormant_on()
This patch adds support to the VLAN driver to translate IF_OPER_DORMANT of the
underlying device to netif_dormant_on(). Besides clean state forwarding, this
allows running independent userspace supplicants on both the real device and
the stacked VLAN. It depends on my RFC2863 patch.

Signed-off-by: Stefan Rompf <stefan@loplof.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:11:41 -08:00
Stefan Rompf
b00055aacd [NET] core: add RFC2863 operstate
This patch adds a dormant flag to network devices, an RFC2863 operstate derived
from these flags, and the possibility of userspace interaction. It allows drivers
to signal that a device is unusable for user traffic without disabling
queueing (and therefore the possibility for protocol establishment traffic to
flow) and a userspace supplicant (WPA, 802.1X) to mark a device unusable
without changes to the driver.

It is the result of our long discussion. However I must admit that it
represents what Jamal and I agreed on with compromises towards Krzysztof, but
Thomas and Krzysztof still disagree with some parts. Anyway I think it should
be applied.

Signed-off-by: Stefan Rompf <stefan@loplof.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:09:11 -08:00
YOSHIFUJI Hideaki
e843b9e1be [IPV6]: ROUTE: Ensure to accept redirects from nexthop for the target.
It is possible to get redirects from nexthop of "more-specific"
routes.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:07:49 -08:00
YOSHIFUJI Hideaki
09c884d4c3 [IPV6]: ROUTE: Add accept_ra_rt_info_max_plen sysctl.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:07:03 -08:00
YOSHIFUJI Hideaki
e317da9622 [IPV6]: ROUTE: Flag RTF_DEFAULT for Route Information for ::/0.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:06:42 -08:00
YOSHIFUJI Hideaki
70ceb4f539 [IPV6]: ROUTE: Add experimental support for Route Information Option in RA (RFC4191).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:06:24 -08:00
YOSHIFUJI Hideaki
52e1635631 [IPV6]: ROUTE: Add router_probe_interval sysctl.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:05:47 -08:00
YOSHIFUJI Hideaki
930d6ff2e2 [IPV6]: ROUTE: Add accept_ra_rtr_pref sysctl.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:05:30 -08:00
YOSHIFUJI Hideaki
270972554c [IPV6]: ROUTE: Add Router Reachability Probing (RFC4191).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:05:13 -08:00
YOSHIFUJI Hideaki
ebacaaa0fd [IPV6]: ROUTE: Add support for Router Preference (RFC4191).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:04:53 -08:00
YOSHIFUJI Hideaki
8238dd0698 [IPV6]: ROUTE: Handle finding the next best route in reachability in BACKTRACK().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:04:35 -08:00
YOSHIFUJI Hideaki
bb133964e0 [IPV6]: ROUTE: Try finding the next best route.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:01:43 -08:00
YOSHIFUJI Hideaki
1ddef044ed [IPV6]: ROUTE: Clean up rt6_select() code path in ip6_route_{input,output}().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:01:24 -08:00
YOSHIFUJI Hideaki
118f8c1654 [IPV6]: ROUTE: Try selecting better route for non-default routes as well.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:01:06 -08:00
YOSHIFUJI Hideaki
045927ff84 [IPV6]: ROUTE: More strict check for default routers in rt6_get_dflt_router().
Check RTF_ADDRCONF|RTF_DEFAULT in rt6_get_dflt_router().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:00:48 -08:00
YOSHIFUJI Hideaki
554cfb7ee5 [IPV6]: ROUTE: Eliminate lock for default route pointer.
And prepare for more advanced router selection.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:00:26 -08:00
YOSHIFUJI Hideaki
519fbd8715 [IPV6]: ROUTE: Clean-up cow'ing in ip6_route_{input,output}().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 17:00:05 -08:00
YOSHIFUJI Hideaki
e40cf3533c [IPV6]: ROUTE: Convert rt6_cow() to rt6_alloc_cow().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:59:27 -08:00
YOSHIFUJI Hideaki
fb9de91ea8 [IPV6]: ROUTE: Clean up reference counting / unlocking for returning object.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:59:08 -08:00
YOSHIFUJI Hideaki
d5315b500b [IPV6]: ROUTE: Unify two code paths for pmtu disc.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:58:48 -08:00
YOSHIFUJI Hideaki
299d993908 [IPV6]: ROUTE: Add rt6_alloc_clone() for cloning route allocation.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:58:32 -08:00
YOSHIFUJI Hideaki
76f9edd17d [IPV6]: ROUTE: Copy u.dst.error for RTF_REJECT routes when cloning.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:56:50 -08:00
YOSHIFUJI Hideaki
a1e783634a [IPV6]: ROUTE: Set appropriate information before inserting a route.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:56:32 -08:00
YOSHIFUJI Hideaki
95a9a5ba02 [IPV6]: ROUTE: Split up rt6_cow() for future changes.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:55:51 -08:00
YOSHIFUJI Hideaki
c4fd30eb18 [IPV6]: ADDRCONF: Add accept_ra_pinfo sysctl.
This controls whether we accept Prefix Information in RAs.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:55:26 -08:00
YOSHIFUJI Hideaki
65f5c7c114 [IPV6]: ROUTE: Add accept_ra_defrtr sysctl.
This controls whether we accept default router information
in RAs.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:55:08 -08:00
YOSHIFUJI Hideaki
073a8e0e15 [IPV6]: ADDRCONF: Split up ipv6_generate_eui64() by device type.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:54:49 -08:00
YOSHIFUJI Hideaki
955189efb4 [IPV6]: ADDRCONF: Use our standard algorithm for randomized ifid.
RFC 3041 describes an algorithm to generate a random interface
identifier.  RFC 3041bis allows using a different algorithm from
the one described in RFC 3041.

So, let's use our standard pseudo random algorithm to simplify
our implementation.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:54:09 -08:00
YOSHIFUJI Hideaki
955aaa2fe3 [NET]: NEIGHBOUR: Ensure to record time to neigh->updated when neighbour's state changed.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:52:52 -08:00
YOSHIFUJI Hideaki
74a3a0ed90 [IPV6]: TUNNEL6: Don't try to add multicast route twice.
Since addrconf_add_dev() has already called addrconf_add_mroute()
to add a route for the multicast prefix, there's no point in calling it
again in addrconf_ip6_tnl_config().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 16:51:48 -08:00
Trond Myklebust
7a1218a277 SUNRPC: Ensure rpc_call_async() always calls tk_ops->rpc_release()
Currently this will not happen if we exit before rpc_new_task() was called.
Also fix up rpc_run_task() to do the same (for consistency).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 18:11:10 -05:00
Steve Grubb
5bdb988680 [PATCH] promiscuous mode
Hi,

When a network interface goes into promiscuous mode, it's an important security
issue. The attached patch is intended to capture that action and send an
event to the audit system.

The patch carves out a new block of numbers for kernel detected anomalies.
These are events that may indicate suspicious activity. Other examples of
potential kernel anomalies would be: exceeding disk quota, rlimit violations,
changes to syscall entry table.

Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-20 14:08:55 -05:00
Trond Myklebust
43ac3f2961 SUNRPC: Fix memory barriers for req->rq_received
We need to ensure that all writes to the XDR buffers are done before
req->rq_received is visible to other processors.
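
The ordering requirement can be illustrated in portable C11 with release/acquire atomics. This swaps in a different technique from the kernel's own barrier primitives, and `toy_req`, `toy_complete` and `toy_poll` are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

struct toy_req {
    char buf[64];           /* stands in for the XDR receive buffer */
    atomic_size_t received; /* 0 means "no reply yet" */
};

/* Writer: fill the buffer first, then publish the length with release. */
void toy_complete(struct toy_req *req, const char *data, size_t len)
{
    assert(req != NULL && len > 0 && len <= sizeof(req->buf));
    memcpy(req->buf, data, len);
    atomic_store_explicit(&req->received, len, memory_order_release);
}

/* Reader: the acquire load pairs with the release store above. */
size_t toy_poll(struct toy_req *req, char *out)
{
    size_t len;

    assert(req != NULL && out != NULL);
    len = atomic_load_explicit(&req->received, memory_order_acquire);
    if (len != 0)
        memcpy(out, req->buf, len); /* buffer writes are visible here */
    return len;
}
```

Without the release/acquire pairing, another processor could observe a non-zero length while the buffer contents are still stale.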

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:51 -05:00
Trond Myklebust
5428154827 SUNRPC: Fix a 'Busy inodes' error in rpc_pipefs
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:49 -05:00
Chuck Lever
5eb53f41d1 SUNRPC: fix compile warnings on 64-bit platforms
Introduced by NFS metrics patch.

Test plan:
Compile kernel with CONFIG_NFS enabled on a 64-bit platform.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:42 -05:00
Chuck Lever
e95b85ec9d SUNRPC: minor cleanup
RPC_DEBUG_DATA no longer needed in net/sunrpc/xprt.c.

Test plan:
Compile kernel with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:23 -05:00
Chuck Lever
dead28da8e SUNRPC: eliminate rpc_call()
Clean-up: replace rpc_call() helper with direct call to rpc_call_sync.

This makes NFSv2 and NFSv3 synchronous calls more computationally
efficient, and reduces stack consumption in functions that used to
invoke rpc_call more than once.

Test plan:
Compile kernel with CONFIG_NFS enabled.  Connectathon on NFS version 2,
version 3, and version 4 mount points.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:23 -05:00
Chuck Lever
cc0175c1dc SUNRPC: display human-readable procedure name in rpc_iostats output
Add fields to the rpc_procinfo struct that allow the display of a
human-readable name for each procedure in the rpc_iostats output.

Also fix it so that the NFSv4 stats are broken up correctly by
sub-procedure number.  NFSv4 uses only two real RPC procedures:
NULL, and COMPOUND.

Test plan:
Mount with NFSv2, NFSv3, and NFSv4, and do "cat /proc/self/mountstats".

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:22 -05:00
Chuck Lever
11c556b3d8 SUNRPC: provide a mechanism for collecting stats in the RPC client
Add a simple mechanism for collecting stats in the RPC client.  Stats are
tabulated during xprt_release.  Note that per_cpu shenanigans are not
required here because the RPC client already serializes on the transport
write lock.

Test plan:
Compile kernel with CONFIG_NFS enabled.  Basic performance regression
testing with high-speed networking and high performance server.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:22 -05:00
Chuck Lever
ef759a2e54 SUNRPC: introduce per-task RPC iostats
Account for various things that occur while an RPC task is executed.
Separate timers for RPC round trip and RPC execution time show how
long RPC requests wait in queue before being sent.  Eventually these
will be accumulated at xprt_release time in one place where they can
be viewed from userland.

Test plan:
Compile kernel with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:17 -05:00
Chuck Lever
262ca07de4 SUNRPC: add a handful of per-xprt counters
Monitor generic transport events.  Add a transport switch callout to
format transport counters for export to user-land.

Test plan:
Compile kernel with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:16 -05:00
Chuck Lever
e19b63dafd SUNRPC: track length of RPC wait queues
RPC wait queue length will eventually be exported to userland via the RPC
iostats interface.

Test plan:
Compile kernel with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:15 -05:00
Levent Serinol
1356b8c28d SUNRPC: more verbose output for rpc auth weak error
This patch prints the server IP address when the "server
requires stronger authentication" error occurs.

Signed-off-by: Levent Serinol <lserinol@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:11 -05:00
Trond Myklebust
12de3b35ea SUNRPC: Ensure that rpc_mkpipe returns a refcounted dentry
If not, we cannot guarantee that idmap->idmap_dentry, gss_auth->dentry and
clnt->cl_dentry are valid dentries.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:09 -05:00
Trond Myklebust
24c5d9d7ea SUNRPC: Run rpci->queue_timeout on the rpciod workqueue instead of generic
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:08 -05:00
Olaf Kirch
f344f6df4b SUNRPC: Auto-load RPC authentication kernel modules
This patch adds a request_module call to rpcauth_create which will try
to auto-load the kernel module for the requested authentication flavor.
For kernels with modular sunrpc, this reduces the admin overhead for
the user.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:08 -05:00
Jeff Garzik
2e9ff56efb Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2006-03-20 04:38:50 -05:00
Jeff Garzik
d378aca6ec Merge branch 'master' 2006-03-20 04:38:03 -05:00
Ralf Baechle DL5RB
c7c694d196 [AX.25]: Fix potential memory hole.
If the AX.25 dialect chosen by the sysadmin is set to DAMA master / 3
(or DAMA slave / 2, if CONFIG_AX25_DAMA_SLAVE=n) ax25_kick() will fall
through the switch statement without calling ax25_send_iframe() or any
other function that would eventually free skbn thus leaking the packet.

Fix by restricting the sysctl interface to allow only actually supported
AX.25 dialects.

The system administration mistake needed for this to happen is rather
unlikely, so this is an uncritical hole.

Coverity #651.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-19 13:20:06 -08:00
James Ketrenos
f44349f221 [PATCH] ieee80211: Don't update network statistics from off-channel packets.
This patch fixes a problem in the ieee80211 probe response and beacon
reception code that would use the packet statistics for a network even
if they were received on a channel other than that which the network
exists on.

This causes a problem in overlapping channels where, for example, a
strong AP on channel 2 could have its beacons received on channels 1 and
3, but at much lower signal levels.  If scanning was done sequentially,
this means the beacon received on channel 3 would update the AP's signal
level as being much lower than it really is, which subsequently could
cause that AP to be passed over and an alternate AP selected.
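
The filtering rule can be sketched as below, assuming a much reduced network record: `toy_network` and `toy_update_stats` are hypothetical, and the real ieee80211 code tracks far more state than a channel and a signal level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced record of a known network. */
struct toy_network {
    unsigned char channel; /* channel the network actually operates on */
    int rssi;              /* last recorded signal level, in dBm */
};

/* Returns 1 if the statistics were updated, 0 if the frame was ignored. */
int toy_update_stats(struct toy_network *net, unsigned char rx_channel,
                     int rssi)
{
    assert(net != NULL);
    if (rx_channel != net->channel)
        return 0; /* bleed-over from a neighbouring channel: skip */
    net->rssi = rssi;
    return 1;
}
```

In the example from the message, a beacon from the channel 2 AP heard while scanning channel 3 would be ignored, so the stronger on-channel reading is kept.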

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-17 15:38:55 -05:00
Jeff Garzik
abc71c46dc Merge branch 'upstream-fixes' 2006-03-16 19:27:08 -05:00
John W. Linville
dd288e7d75 Merge branch 'upstream-fixes' 2006-03-15 17:02:08 -05:00
Hong Liu
72df16f109 [PATCH] ieee80211: Fix QoS is not active problem
Fix QoS not being active even when the network and the card are QoS enabled.
The problem is that we pass the wrong ieee80211_network address to
ipw_handle_beacon/ipw_handle_probe_response, thus
ieee80211_network->qos_data.active will not be set, causing the driver
not to send QoS frames at all.

Signed-off-by: Hong Liu <hong.liu@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-15 16:16:07 -05:00
Zhu Yi
0df7861240 [PATCH] ieee80211: Fix CCMP decryption problem when QoS is enabled
Use the correct STYPE for QoS data.

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-03-15 16:11:55 -05:00
Trond Myklebust
e6d83d5569 [PATCH] SUNRPC: Fix potential deadlock in RPC code
In rpc_wake_up() and rpc_wake_up_status(), it is possible for the call to
__rpc_wake_up_task() to fail if another thread happens to be calling
rpc_wake_up_task() on the same rpc_task.

Problem noticed by Bruno Faccini.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-14 07:57:18 -08:00
Adrian Bunk
712917d1c0 [PATCH] SUNRPC: fix a NULL pointer dereference in net/sunrpc/clnt.c
The Coverity checker spotted this possible NULL pointer dereference in
rpc_new_client().

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-14 07:57:17 -08:00
Herbert Xu
3759fa9c55 [TCP]: Fix zero port problem in IPv6
When we link a socket into the hash table, we need to make sure that we
set the num/port fields so that it shows up with a non-zero port value
in proc/netlink and on the wire.  This code and comment are copied over
from the IPv4 stack as is.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-03-13 14:26:12 -08:00
Patrick McHardy
31fe4d3317 [NETFILTER]: arp_tables: fix NULL pointer dereference
The check is wrong and lets NULL-ptrs slip through since !IS_ERR(NULL)
is true.
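
To illustrate why the check lets NULL through, here is a userspace
imitation of the error-pointer convention.  The sketch_ names and the
two helper checks are assumptions for illustration, not the arp_tables
code; only the convention that the top 4095 addresses encode errors is
taken from the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitation of the kernel's error-pointer convention. */
#define SKETCH_MAX_ERRNO 4095

bool sketch_is_err(const void *ptr)
{
    /* Error pointers live in the last SKETCH_MAX_ERRNO addresses. */
    return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Buggy check: accepts NULL, because NULL is not an error pointer. */
bool looks_valid_buggy(const void *ptr)
{
    return !sketch_is_err(ptr);
}

/* Safer check: reject both NULL and error pointers. */
bool looks_valid_fixed(const void *ptr)
{
    return ptr != NULL && !sketch_is_err(ptr);
}
```

looks_valid_buggy(NULL) returns true, which is exactly how a NULL
pointer slips past an IS_ERR-only test.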

Coverity #190

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-12 20:40:43 -08:00
Patrick McHardy
baa829d892 [IPV4/6]: Fix UFO error propagation
When ufo_append_data fails err is uninitialized, but returned back.
Strangely gcc doesn't notice it.

Coverity #901 and #902

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-12 20:39:40 -08:00
Patrick McHardy
4a1ff6e2bd [TCP]: tcp_highspeed: fix AIMD table out-of-bounds access
Coverity #547

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-12 20:39:39 -08:00
Patrick McHardy
cc9a06cd8d [NETLINK]: Fix use-after-free in netlink_recvmsg
The skb given to netlink_cmsg_recv_pktinfo is already freed, move it up
a few lines.

Coverity #948

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-12 20:39:38 -08:00
Patrick McHardy
f8dc01f543 [XFRM]: Fix leak in ah6_input
tmp_hdr is not freed when ipv6_clear_mutable_options fails.

Coverity #650

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-12 20:39:37 -08:00
Patrick McHardy
f6e57464df [NET_SCHED]: act_api: fix skb leak in error path
The skb is allocated by the function, so it needs to be freed instead
of trimmed on overrun.

Coverity #614

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-12 20:39:36 -08:00
Patrick McHardy
406dbfc9ae [NETFILTER]: nfnetlink_queue: fix possible NULL-ptr dereference
Fix NULL-ptr dereference when a config message for a non-existent
queue containing only an NFQA_CFG_PARAMS attribute is received.

Coverity #433

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-12 20:39:35 -08:00
David S. Miller
ba244fe900 [TCP]: Fix tcp_tso_should_defer() when limit>=65536
That's >= a full sized TSO frame, so we should always
return 0 in that case.

Based upon a report and initial patch from Lachlan
Andrew, final patch suggested by Herbert Xu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-11 18:51:49 -08:00
Gregor Maier
c127437641 [NETFILTER]: Fix wrong option spelling in Makefile for CONFIG_BRIDGE_EBT_ULOG
Signed-off-by: Gregor Maier <gregor@net.in.tum.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-11 18:51:25 -08:00
Brian Haley
0d27b42739 [IPV6]: fix ipv6_saddr_score struct element
The scope element in the ipv6_saddr_score struct used in 
ipv6_dev_get_saddr() is an unsigned integer, but __ipv6_addr_src_scope() 
returns a signed integer (and can return -1).
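
A small sketch of the signedness problem, using hypothetical stand-ins
(sketch_src_scope() and the two score structs are not the kernel's
definitions):

```c
#include <stdbool.h>

/*
 * Hypothetical helper standing in for __ipv6_addr_src_scope(): returns -1
 * when no scope can be determined, a small positive value otherwise.
 */
int sketch_src_scope(bool known)
{
    return known ? 2 : -1;
}

struct score_unsigned { unsigned int scope; };
struct score_signed   { int scope; };

/* Unsigned field: -1 wraps to UINT_MAX and outranks every real scope. */
bool unknown_beats_known_unsigned(void)
{
    struct score_unsigned unknown = { (unsigned int)sketch_src_scope(false) };
    struct score_unsigned known   = { (unsigned int)sketch_src_scope(true) };

    return unknown.scope > known.scope;
}

/* Signed field: -1 stays below every real scope. */
bool unknown_beats_known_signed(void)
{
    struct score_signed unknown = { sketch_src_scope(false) };
    struct score_signed known   = { sketch_src_scope(true) };

    return unknown.scope > known.scope;
}
```

Only the unsigned variant reports the "unknown" scope as larger than a
real one, which is the kind of miscomparison a signed field avoids.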

Signed-off-by: Brian Haley <brian.haley@hp.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-11 18:50:14 -08:00
Jeff Garzik
749dfc7055 Merge branch 'upstream-fixes' 2006-03-11 13:35:31 -05:00
Dipankar Sarma
529bf6be5c [PATCH] fix file counting
I have benchmarked this on an x86_64 NUMA system and see no significant
performance difference on kernbench.  Tested on both x86_64 and powerpc.

The way we do file struct accounting is not very suitable for batched
freeing.  For scalability reasons, file accounting was
constructor/destructor based.  This meant that nr_files was decremented
only when the object was removed from the slab cache.  This is susceptible
to slab fragmentation.  With RCU based file structure, consequent batched
freeing and a test program like Serge's, we just speed this up and end up
with a very fragmented slab -

llm22:~ # cat /proc/sys/fs/file-nr
587730  0       758844

At the same time, I see only 2000+ objects in the filp cache.  The following
patch fixes this problem.

This patch changes the file counting by removing the filp_count_lock.
Instead we use a separate percpu counter, nr_files, for now and all
accesses to it are through get_nr_files() api.  In the sysctl handler for
nr_files, we populate files_stat.nr_files before returning to user.

Counting files as and when they are created and destroyed (as opposed to
inside slab) allows us to correctly count open files with RCU.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-08 14:14:01 -08:00
Thomas Graf
850a9a4e3c [NETFILTER] ip_queue: Fix wrong skb->len == nlmsg_len assumption
The size of the skb carrying the netlink message is not
equivalent to the length of the actual netlink message
due to padding. ip_queue matches the length of the payload
against the original packet size to determine if packet
mangling is desired. Due to the above wrong assumption,
arbitrary packets may not be mangled depending on their
original size.
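
The effect of padding can be sketched as follows.  This assumes the
usual 4-byte netlink alignment; sketch_align() and the two payload
helpers are hypothetical names, not ip_queue functions.

```c
#include <stddef.h>

/* Round len up to the next multiple of 4, as netlink alignment does. */
size_t sketch_align(size_t len)
{
    const size_t align = 4;

    return (len + align - 1) & ~(align - 1);
}

/* Payload length taken from the message length field (correct). */
size_t payload_from_msg_len(size_t msg_len, size_t hdr_len)
{
    return msg_len - hdr_len;
}

/* Payload length taken from the padded buffer length (wrong assumption). */
size_t payload_from_buf_len(size_t msg_len, size_t hdr_len)
{
    return sketch_align(msg_len) - hdr_len;
}
```

For a 16-byte header and a 41-byte payload the message length is 57 but
the buffer holds 60 bytes, so the buffer-based calculation yields 44
instead of 41 and a comparison against the original packet size goes
wrong.  Payloads whose total length is already a multiple of 4 are
unaffected, which is why only some packets misbehave.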

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-07 14:56:12 -08:00
Ian McDonald
c09966608d [DCCP] ccid3: Divide by zero fix
In rare circumstances 0 is returned by dccp_li_hist_calc_i_mean which
leads to a divide by zero in ccid3_hc_rx_packet_recv. Explicitly check
for zero return now. Update copyright notice at same time.
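
A sketch of the guard, with a hypothetical function name and scaling
factor; what the real ccid3 code does when the mean is zero is not
stated here, so returning 0 is an assumption.

```c
#include <stdint.h>

/*
 * Hypothetical: derive a scaled loss event rate from the mean loss
 * interval.  A zero mean is checked explicitly before dividing.
 */
uint32_t sketch_loss_rate(uint32_t i_mean)
{
    if (i_mean == 0)
        return 0;             /* assumption: treat as "no loss data" */

    return 1000000u / i_mean;
}
```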

Found by Arnaldo.

Signed-off-by: Ian McDonald <imcdnzl@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-04 21:06:29 -08:00
Chas Williams
0f8f325b25 [ATM]: keep atmsvc failure messages quiet
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-04 21:06:25 -08:00
Stephen Hemminger
125a12ccf3 [BRIDGE]: generate kobject remove event
The earlier round of kobject/sysfs changes to bridge caused
it not to generate a uevent on removal. Don't think any application
cares (not sure about Xen) but since it generates add uevent
it should generate remove as well.

Signed-off-by: Stephen Hemminger <shemmigner@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-04 21:06:23 -08:00
Stephen Hemminger
d32439c0d4 [BRIDGE]: port timer initialization
Initialize the STP timers for a port when it is created,
rather than when it is enabled. This will prevent future race conditions
where timer gets started before port is enabled.

Signed-off-by: Stephen Hemminger <shemmigner@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-04 21:06:21 -08:00
Stephen Hemminger
6e86b89084 [BRIDGE]: fix crash in STP
Bridge would crash because of uninitialized timer if STP is used and
device was inserted into a bridge before bridge was up. This got
introduced when the delayed port checking was added.  Fix is to not
enable STP on port unless bridge is up.

Bugzilla: http://bugzilla.kernel.org/show_bug.cgi?id=6140
Dup:      http://bugzilla.kernel.org/show_bug.cgi?id=6156

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-04 21:06:19 -08:00
Jay Vosburgh
8f903c708f [PATCH] bonding: suppress duplicate packets
Originally submitted by Kenzo Iwami; his original description is:

The current bonding driver receives duplicate packets when broadcast/
multicast packets are sent by other devices or packets are flooded by the
switch. In this patch, new flags are added in priv_flags of net_device
structure to let the bonding driver discard duplicate packets in
dev.c:skb_bond().

	Modified by Jay Vosburgh to change a define name, update some
comments, rearrange the new skb_bond() for clarity, clear all bonding
priv_flags on slave release, and update the driver version.

Signed-off-by: Kenzo Iwami <k-iwami@cj.jp.nec.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-03-03 20:58:00 -05:00
Jeff Garzik
75e47b3600 Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2006-03-01 01:59:15 -05:00
Jeff Garzik
68727fed54 Merge branch 'upstream-fixes' 2006-03-01 01:58:38 -05:00
Jeff Garzik
ce7eeb6b52 Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2006-02-28 18:04:30 -05:00
Pete Zaitcev
07981aa43f [PATCH] ieee80211_geo.c: remove frivolous BUG_ON's
I have come to consider BUG_ON generally harmful. The idea of an assert is
to prevent a program from executing past a point where its state is known
to be erroneous, thus preventing it from dealing more damage to the data
(or hiding the traces of malfunction). The problem is, in the kernel this
harm has to be balanced against the harm of a forced reboot.

The last straw was our softmac tree, where "iwlist eth1 scan" causes
a lockup. It is absolutely frivolous and provides none of the advantages
a normal assert has to provide. In fact, doing this impedes debugging.

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-02-27 20:14:58 -05:00
John W. Linville
acfaf10be5 Merge branch 'upstream-fixes' 2006-02-27 20:13:10 -05:00
John W. Linville
9f5a405b68 Merge branch 'from-linus' 2006-02-27 20:12:23 -05:00
Pete Zaitcev
4832843d77 [PATCH] ieee80211_rx.c: is_beacon
Fix broken is_beacon().

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-02-27 20:12:02 -05:00
Arnaldo Carvalho de Melo
ba13c98405 [REQSK]: Don't reset rskq_defer_accept in reqsk_queue_alloc
In 295f7324ff I moved defer_accept from
tcp_sock to request_queue and mistakenly reset it in reqsk_queue_alloc, causing
calls to setsockopt(TCP_DEFER_ACCEPT) to be lost after bind. The fix is to
remove the zeroing of rskq_defer_accept from reqsk_queue_alloc.

Thanks to Alexandra N. Kossovsky <Alexandra.Kossovsky@oktetlabs.ru> for
reporting and testing the suggested fix.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-27 13:30:43 -08:00
Patrick McHardy
bafac2a512 [NETFILTER]: Restore {ipt,ip6t,ebt}_LOG compatibility
The nfnetlink_log infrastructure changes broke compatibility of the LOG
targets. They currently use whatever log backend was registered first,
which means that if ipt_ULOG was loaded first, no messages will be printed
to the ring buffer anymore.

Restore compatibility by using the old log functions by default and only use
the nf_log backend if the user explicitly said so.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-27 13:04:17 -08:00
Patrick McHardy
45fe4dc08c [NETFILTER]: nf_queue: fix end-of-list check
The comparison wants to find out if the last list iteration reached the
end of the list. It needs to compare the iterator with the list head to
do this, not the element it is looking for.
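
A minimal sketch of the corrected end-of-list test on a circular list
with a sentinel head.  struct list_node and list_contains() are
hypothetical stand-ins, not the nf_queue code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Circular singly linked list; the head is a sentinel node. */
struct list_node {
    struct list_node *next;
};

bool list_contains(struct list_node *head, const struct list_node *wanted)
{
    struct list_node *pos;

    for (pos = head->next; pos != head; pos = pos->next) {
        if (pos == wanted)
            break;
    }

    /*
     * The walk ran off the end only if the iterator is back at the head.
     * Comparing pos against wanted here would be the wrong test.
     */
    return pos != head;
}
```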

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-27 13:03:55 -08:00
Patrick McHardy
e121e9ecb0 [NETFILTER]: nf_queue: remove unnecessary check for outfn
The only point of registering a queue handler is to provide an outfn,
so there is no need to check for it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-27 13:03:39 -08:00
Patrick McHardy
7a11b9848a [NETFILTER]: nf_queue: fix rerouting after packet mangling
Packets should be rerouted when they come back from userspace, not before.
Also move the queue_rerouters to RCU to avoid taking the queue_handler_lock
for each reinjected packet.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-27 13:03:24 -08:00
Patrick McHardy
f92f871989 [NETFILTER]: nf_queue: check if rerouter is present before using it
Every rerouter needs to provide a save and a reroute function, we don't
need to check for them. But we do need to check if a rerouter is registered
at all for the current family, with bridging for example packets of
unregistered families can hit nf_queue.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-27 13:03:10 -08:00
Patrick McHardy
e02f7d1603 [NETFILTER]: nf_queue: don't copy registered rerouter data
Use the registered data structure instead of copying it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-27 13:02:52 -08:00
Herbert Xu
752c1f4c78 [IPSEC]: Kill post_input hook and do NAT-T in esp_input directly
The only reason post_input exists at all is that it gives us the
potential to adjust the checksums incrementally in future which
we ought to do.

However, after thinking about it for a bit we can adjust the
checksums without using this post_input stuff at all.  The crucial
point is that only the inner-most NAT-T SA needs to be considered
when adjusting checksums.  What's more, the checksum adjustment
comes down to a single u32 due to the linearity of IP checksums.
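
The linearity argument can be illustrated with a generic incremental
update of a ones-complement sum.  This is a sketch of the arithmetic
only, with hypothetical helper names; it is not the esp_input code.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones-complement sum. */
uint16_t csum_fold32(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Ones-complement sum of 16-bit words (not inverted). */
uint16_t csum_words(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += words[i];
    return csum_fold32(sum);
}

/*
 * Adjust an existing sum when one word changes from old_w to new_w:
 * add the new word and the ones-complement negation of the old one.
 */
uint16_t csum_adjust(uint16_t sum, uint16_t old_w, uint16_t new_w)
{
    return csum_fold32((uint32_t)sum + (uint16_t)~old_w + new_w);
}
```

Because the sum is linear, all address and port changes made by NAT can
be accumulated into one such adjustment value and applied once, without
walking the packet again.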

We just happen to have a spare u32 lying around in our skb structure :)
When ip_summed is set to CHECKSUM_NONE on input, the value of skb->csum
is currently unused.  All we have to do is to make that the checksum
adjustment and voila, there goes all the post_input and decap structures!

I've left in the decap data structures for now since it's intricately
woven into the sec_path stuff.  We can kill them later too.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-27 13:00:40 -08:00
Herbert Xu
4bf05eceec [IPSEC] esp: Kill unnecessary block and indentation
We used to keep sg on the stack which is why the extra block was useful.
We've long since stopped doing that so let's kill the block and save
some indentation.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-27 13:00:01 -08:00
Jeff Garzik
dbfedbb981 Merge branch 'master' 2006-02-27 11:33:51 -05:00
YOSHIFUJI Hideaki
d91675f9c7 [IPV6]: Do not ignore IPV6_MTU socket option.
Based on patch by Hoerdt Mickael <hoerdt@clarinet.u-strasbg.fr>.

Signed-off-by: YOSHIFUJI Hideaki <yosufuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-24 13:18:33 -08:00
Hugo Santos
0c0888908d [IPV6] ip6_tunnel: release cached dst on change of tunnel params
The included patch fixes ip6_tunnel to release the cached dst entry
when the tunnel parameters (such as tunnel endpoints) are changed so
they are used immediately for the next encapsulated packets.

Signed-off-by: Hugo Santos <hsantos@av.it.pt>
Acked-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-24 13:16:25 -08:00
Jeff Garzik
7b0386921d Merge branch 'upstream-fixes' 2006-02-23 21:16:27 -05:00
Herbert Xu
4da3089f2b [IPSEC]: Use TOS when doing tunnel lookups
We should use the TOS because it's one of the routing keys.  It also
means that we update the correct routing cache entry when PMTU occurs.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-23 16:19:26 -08:00
Jamal Hadi Salim
f8d0e3f115 [NET] ethernet: Fix first packet goes out with MAC 00:00:00:00:00:00
When you turn off ARP on a netdevice then the first packet always goes
out with a dstMAC of all zeroes. This is because the first packet is
used to resolve ARP entries. Even though the ARP entry may be resolved
(I tried by setting a static ARP entry for a host i was pinging from),
it gets overwritten by virtue of having the netdevice disabling ARP.

Subsequent packets go out fine with correct dstMAC address (which may
be why people have ignored reporting this issue).

To cut the story short: 

the culprit code is in net/ethernet/eth.c::eth_header()

----
        /*
         *      Anyway, the loopback-device should never use this function...
         */

        if (dev->flags & (IFF_LOOPBACK|IFF_NOARP))
        {
                memset(eth->h_dest, 0, dev->addr_len);
                return ETH_HLEN;
        }

        if(daddr)
        {
                memcpy(eth->h_dest,daddr,dev->addr_len);
                return ETH_HLEN;
        }

----

Note how the h_dest is being reset when device has IFF_NOARP.

As a note:
All devices including loopback pass a daddr. loopback in fact passes
a 0 all the time ;-> 
This means i can delete the check totally or i can remove the IFF_NOARP

Alexey says:
--------------------
I think, it was me who did this crap. It was so long ago I do not remember
why it was made.

I remember some troubles with dummy device. It tried to resolve
addresses, apparently, without success and generated errors instead of
blackholing. I think the problem was eventually solved at neighbour
level.

After some thinking I suspect the deletion of this chunk could change
behaviour of some parts which do not use neighbour cache f.e. packet
socket.

I think safer approach would be to move this chunk after if (daddr).
And the possibility to remove this completely could be analyzed later.
--------------------

Patch updated with Alexey's safer suggestions.
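
The reordering Alexey suggests can be sketched like this.  fill_dest()
and its return convention are hypothetical; only the order of the two
checks reflects the description above.  The SKETCH_ flag values mirror
the usual IFF_LOOPBACK and IFF_NOARP bits.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_ETH_ALEN     6
#define SKETCH_IFF_LOOPBACK 0x8
#define SKETCH_IFF_NOARP    0x80

/*
 * Fill the destination MAC.  Returns 1 when h_dest is final, 0 when the
 * address still has to be resolved (hypothetical return convention).
 */
int fill_dest(unsigned char *h_dest, const unsigned char *daddr,
              unsigned int flags)
{
    /* A caller-supplied address now wins, even on NOARP devices. */
    if (daddr) {
        memcpy(h_dest, daddr, SKETCH_ETH_ALEN);
        return 1;
    }

    /* Only zero the address when nothing was supplied. */
    if (flags & (SKETCH_IFF_LOOPBACK | SKETCH_IFF_NOARP)) {
        memset(h_dest, 0, SKETCH_ETH_ALEN);
        return 1;
    }

    return 0;
}
```

With the checks in this order, the first packet on a NOARP device keeps
the static destination address instead of going out with all zeroes.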

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-23 16:18:01 -08:00
Herbert Xu
21380b81ef [XFRM]: Eliminate refcounting confusion by creating __xfrm_state_put().
We often just do an atomic_dec(&x->refcnt) on an xfrm_state object
because we know there is more than 1 reference remaining and thus
we can elide the heavier xfrm_state_put() call.

Do this behind an inline function called __xfrm_state_put() so that it is
more obvious and also to allow us to add refcount debugging more cleanly
later.
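
A userspace sketch of the pattern using C11 atomics.  The sketch_ names
are hypothetical and the "destroyed" flag stands in for the real
teardown; this is not the xfrm code.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_state {
    atomic_int refcnt;
    bool destroyed;
};

/*
 * Light put: the caller guarantees that at least one other reference
 * remains, so no destroy check is needed.
 */
static inline void sketch_state_put_light(struct sketch_state *x)
{
    atomic_fetch_sub(&x->refcnt, 1);
}

/* Full put: tears the object down when the last reference is dropped. */
static inline void sketch_state_put(struct sketch_state *x)
{
    if (atomic_fetch_sub(&x->refcnt, 1) == 1)
        x->destroyed = true;
}
```

Naming the light variant makes every "I know I am not the last holder"
site easy to find, and gives one place to hang debugging checks on.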

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-23 16:10:53 -08:00
Suresh Bhogavilli
8525987849 [IPV4]: Fix garbage collection of multipath route entries
When garbage collecting route cache entries of multipath routes
in rt_garbage_collect(), entries were deleted from the hash bucket
'i' while holding a spin lock on bucket 'k' resulting in a system
hang.  Delete entries, if any, from bucket 'k' instead.

Signed-off-by: Suresh Bhogavilli <sbhogavilli@verisign.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-23 16:10:52 -08:00
Patrick McHardy
42cf93cd46 [NETFILTER]: Fix bridge netfilter related crash in xfrm_lookup
The bridge-netfilter code attaches a fake dst_entry with dst->ops == NULL
to purely bridged packets. When these packets are SNATed and a policy
lookup is done, xfrm_lookup crashes because it tries to dereference
dst->ops.

Change xfrm_lookup not to dereference dst->ops before checking for the
DST_NOXFRM flag and set this flag in the fake dst_entry.
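
The ordering matters because the fake entry has no ops.  A sketch with
hypothetical structs and a made-up flag value (not the kernel's
definitions):

```c
#include <stddef.h>

#define SKETCH_DST_NOXFRM 0x2   /* hypothetical flag value */

struct sketch_dst_ops {
    int family;
};

struct sketch_dst {
    unsigned int flags;
    const struct sketch_dst_ops *ops;   /* NULL for the fake entry */
};

/* Returns -1 when the lookup is skipped, else the family from ops. */
int sketch_lookup_family(const struct sketch_dst *dst)
{
    /* Test the flag first, so a NULL ops pointer is never touched. */
    if (dst->flags & SKETCH_DST_NOXFRM)
        return -1;

    return dst->ops->family;
}
```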

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-23 16:10:51 -08:00
Linus Torvalds
cf70a6f264 Merge branch 'fixes.b8' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/bird 2006-02-20 20:09:44 -08:00
YOSHIFUJI Hideaki
a8372f035a [NET]: NETFILTER: remove duplicated lines and fix order in skb_clone().
Some of the netfilter-related members are initialized / copied twice in
skb_clone(). Remove one.

Pointed out by Olivier MATZ <olivier.matz@6wind.com>.

And this patch also fixes order of copying / clearing members.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-19 22:32:06 -08:00
Patrick McHardy
8e249f0881 [NETFILTER]: Fix outgoing redirects to loopback
When redirecting an outgoing packet to loopback, it keeps the original
conntrack reference and information from the outgoing path, which
falsely triggers the check for DNAT on input and the dst_entry is
released to trigger rerouting. ip_route_input refuses to route the
packet because it has a local source address and it is dropped.

Look at the packet itself to determine if it was NATed. Also fix a
missing inversion that causes unnecessary xfrm lookups.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-19 22:29:47 -08:00
Patrick McHardy
bc6e14b6f0 [NETFILTER]: Fix NAT PMTUD problems
ICMP errors are only SNATed when their source matches the source of the
connection they are related to, otherwise the source address is not
changed. This creates problems with ICMP frag. required messages
originating from a router behind the NAT: if private IPs are used, the
packet has a good chance of getting dropped on the path to its destination.

Always NAT ICMP errors similar to the original connection.

Based on report by Al Viro.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-19 22:26:40 -08:00
Patrick McHardy
9951101438 [XFRM]: Fix policy double put
The policy is put once immediately and once at the error label, which results
in the following Oops:

kernel BUG at net/xfrm/xfrm_policy.c:250!
invalid opcode: 0000 [#2]
PREEMPT
[...]
CPU:    0
EIP:    0060:[<c028caf7>]    Not tainted VLI
EFLAGS: 00210246   (2.6.16-rc3 #39)
EIP is at __xfrm_policy_destroy+0xf/0x46
eax: d49f2000   ebx: d49f2000   ecx: f74bd880   edx: f74bd280
esi: d49f2000   edi: 00000001   ebp: cd506dcc   esp: cd506dc8
ds: 007b   es: 007b   ss: 0068
Process ssh (pid: 31970, threadinfo=cd506000 task=cfb04a70)
Stack: <0>cd506000 cd506e34 c028e92b ebde7280 cd506e58 cd506ec0 f74bd280 00000000
       00000214 0000000a 0000000a 00000000 00000002 f7ae6000 00000000 cd506e58
       cd506e14 c0299e36 f74bd280 e873fe00 c02943fd cd506ec0 ebde7280 f271f440
Call Trace:
 [<c0103a44>] show_stack_log_lvl+0xaa/0xb5
 [<c0103b75>] show_registers+0x126/0x18c
 [<c0103e68>] die+0x14e/0x1db
 [<c02b6809>] do_trap+0x7c/0x96
 [<c0104237>] do_invalid_op+0x89/0x93
 [<c01035af>] error_code+0x4f/0x54
 [<c028e92b>] xfrm_lookup+0x349/0x3c2
 [<c02b0b0d>] ip6_datagram_connect+0x317/0x452
 [<c0281749>] inet_dgram_connect+0x49/0x54
 [<c02404d2>] sys_connect+0x51/0x68
 [<c0240928>] sys_socketcall+0x6f/0x166
 [<c0102aa1>] syscall_call+0x7/0xb

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-19 22:11:50 -08:00
Al Viro
cc6cdac0cf [PATCH] missing ntohs() in ip6_tunnel
->payload_len is net-endian

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-18 16:02:18 -05:00
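
For the ip6_tunnel ntohs() fix above, a sketch of why the conversion is
needed.  The struct and function are hypothetical; the 40-byte fixed
IPv6 header length is the only protocol fact used.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical cut-down header: payload_len is in network byte order. */
struct sketch_ip6_hdr {
    uint16_t payload_len;
};

/* Total packet length: fixed 40-byte IPv6 header plus the payload. */
unsigned int sketch_packet_len(const struct sketch_ip6_hdr *h)
{
    /* Convert to host order before doing arithmetic on the field. */
    return 40u + ntohs(h->payload_len);
}
```

On a little-endian host, using the raw field instead of ntohs() would
byte-swap the length and give a wildly wrong packet size.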
Jeff Garzik
b04a92e160 Merge branch 'upstream-fixes' 2006-02-17 16:20:30 -05:00
Johannes Berg
b7cffb028a [PATCH] ieee80211: fix sparse warning about missing "static"
This patch adds a missing "static" on a variable (sparse complaint)

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-02-17 10:41:34 -05:00
Zhu Yi
4716808283 [PATCH] ieee80211: Use IWEVGENIE to set WPA IE
It replaces returning WPA/RSN IEs as custom events with returning them
as IWEVGENIE events. I have tested that it returns proper information
with both Xsupplicant, and the latest development version of the Linux
wireless tools.

Signed-off-by: Chris Hessing <Chris.Hessing@utah.edu>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-02-17 08:16:59 -05:00
John W. Linville
750b50ab56 Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2006-02-17 08:15:41 -05:00
Yasuyuki Kozakai
7c6de05884 [NETFILTER]: nf_conntrack: Fix TCP/UDP HW checksum handling for IPv6 packet
If skb->ip_summed is CHECKSUM_HW here, skb->csum includes the checksum
of the actual IPv6 header and extension headers. This excess
checksum must be subtracted when nf_conntrack calculates the TCP/UDP
checksum with the pseudo IPv6 header. Spotted by Ben Skeggs.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-15 15:25:18 -08:00
Yasuyuki Kozakai
763ecff187 [NETFILTER]: nf_conntrack: attach conntrack to locally generated ICMPv6 error
Locally generated ICMPv6 errors should be associated with the conntrack
of the original packet. Since the conntrack entry may not be in the hash
tables (for the first packet), it must be manually attached.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-15 15:24:15 -08:00
Yasuyuki Kozakai
08857fa745 [NETFILTER]: nf_conntrack: attach conntrack to TCP RST generated by ip6t_REJECT
TCP RSTs generated by the REJECT target should be associated with the
conntrack of the original TCP packet. Since the conntrack entry is
usually not in the hash tables, it must be manually attached.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-15 15:23:28 -08:00
Yasuyuki Kozakai
7d3cdc6b55 [NETFILTER]: nf_conntrack: move registration of __nf_ct_attach
Move registration of __nf_ct_attach to nf_conntrack_core to make it usable
for IPv6 connection tracking as well.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-15 15:22:21 -08:00
Yasuyuki Kozakai
deac0ccdb4 [NETFILTER]: x_tables: fix dependencies of conntrack related modules
NF_CONNTRACK_MARK is bool and depends on NF_CONNTRACK which is
tristate.  If a variable depends on NF_CONNTRACK_MARK and doesn't take
care about NF_CONNTRACK, it can be y even if NF_CONNTRACK isn't y.
NF_CT_ACCT has the same issue.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-15 15:21:31 -08:00
Patrick McHardy
48d5cad87c [XFRM]: Fix SNAT-related crash in xfrm4_output_finish
When a packet matching an IPsec policy is SNATed so it doesn't match any
policy anymore, it loses its xfrm bundle, which makes xfrm4_output_finish
crash because of a NULL pointer dereference.

This patch directs these packets to the original output path instead. Since
the packets have already passed the POST_ROUTING hook, but need to start at
the beginning of the original output path which includes another
POST_ROUTING invocation, a flag is added to the IPCB to indicate that the
packet was rerouted and doesn't need to pass the POST_ROUTING hook again.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-15 15:10:22 -08:00
Adrian Drzewiecki
78872ccb68 [BRIDGE]: Fix deadlock in br_stp_disable_bridge
Looks like somebody forgot to use the _bh spin_lock variant. We ran into a 
deadlock where br->hello_timer expired while br_stp_disable_br() walked 
br->port_list. 

Signed-off-by: Adrian Drzewiecki <z@drze.net>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-15 01:47:48 -08:00
Patrick McHardy
ee68cea2c2 [NETFILTER]: Fix xfrm lookup after SNAT
To find out if a packet needs to be handled by IPsec after SNAT, packets
are currently rerouted in POST_ROUTING and a new xfrm lookup is done. This
breaks SNAT of non-unicast packets to non-local addresses because the
packet is routed as incoming packet and no neighbour entry is bound to the
dst_entry. In general, it seems to be a bad idea to replace the dst_entry
after the packet was already sent to the output routine because its state
might not match what's expected.

This patch changes the xfrm lookup in POST_ROUTING to re-use the original
dst_entry without routing the packet again. This means no policy routing
can be used for transport mode transforms (which keep the original route)
when packets are SNATed to match the policy, but it looks like the best
we can do for now.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-15 01:34:23 -08:00
David S. Miller
b4d9eda028 [NET]: Revert skb_copy_datagram_iovec() recursion elimination.
Revert the following changeset:

bc8dfcb939

Recursive SKB frag lists are really possible and disallowing
them breaks things.

Noticed by: Jesse Brandeburg <jesse.brandeburg@intel.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-13 16:06:10 -08:00
Herbert Xu
00de651d14 [IPSEC]: Fix strange IPsec freeze.
Problem discovered and initial patch by Olaf Kirch:

	there's a problem with IPsec that has been bugging some of our users
	for the last couple of kernel revs. Every now and then, IPsec will
	freeze the machine completely. This is with openswan user land,
	and with kernels up to and including 2.6.16-rc2.

	I managed to debug this a little, and what happens is that we end
	up looping in xfrm_lookup, and never get out. With a bit of debug
	printks added, I can see this happening:

		ip_route_output_flow calls xfrm_lookup

		xfrm_find_bundle returns NULL (apparently we're in the
			middle of negotiating a new SA or something)

		We therefore call xfrm_tmpl_resolve. This returns EAGAIN
			We go to sleep, waiting for a policy update.
			Then we loop back to the top

		Apparently, the dst_orig that was passed into xfrm_lookup
			has been dropped from the routing table (obsolete=2)
			This leads to the endless loop, because we now create
			a new bundle, check the new bundle and find it's stale
			(stale_bundle -> xfrm_bundle_ok -> dst_check() return 0)

	People have been testing with the patch below, which seems to fix the
	problem partially. They still see connection hangs however (things
	only clear up when they start a new ping or new ssh). So the patch
	is obviously not sufficient, and something else seems to go wrong.

	I'm grateful for any hints you may have...

I suggest that we simply bail out always.  If the dst decides to die
on us later on, the packet will be dropped anyway.  So there is no
great urgency to retry here.  Once we have the proper resolution
queueing, we can then do the retry again.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Olaf Kirch <okir@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-13 16:01:27 -08:00
Nicolas DICHTEL
6d3e85ecf2 [IPV6] Don't store dst_entry for RAW socket
Signed-off-by: Nicolas DICHTEL <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-13 15:56:13 -08:00
Jamal Hadi Salim
e200bd8065 [NETLINK] genetlink: Fix bugs spotted by Andrew Morton.
- panic() doesn't return.

- Don't forget to unlock on genl_register_family() error path

- genl_rcv_msg() is called via pointer so there's no point in declaring it
  `inline'.

Notes:

genl_ctrl_event() ignores the genlmsg_multicast() return value.

lots of things ignore the genl_ctrl_event() return value.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-13 15:51:24 -08:00
Stephen Hemminger
178a3259f2 [BRIDGE]: Better fix for netfilter missing symbol has_bridge_parent
Horms' patch was the best of the three fixes. Dave already applied
Harald's version, so this patch converts that to the better one.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-13 15:43:58 -08:00
Harald Welte
a6c1cd5726 [NETFILTER] Fix Kconfig menu level for x_tables
The new x_tables related Kconfig options appear at the wrong menu level
without this patch.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-13 15:42:48 -08:00
David S. Miller
15c38c6ecd Merge master.kernel.org:/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6 2006-02-13 15:40:55 -08:00
Dave Jones
99e382afd2 [P8023]: Fix tainting of kernel.
Missing license tag.
I've assumed this is GPL.  (It could also use a MODULE_AUTHOR)

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-13 15:38:42 -08:00
Dave Jones
77decfc716 [IPV4] ICMP: Invert default for invalid icmp msgs sysctl
isic can trigger these msgs to be spewed at a very high rate.
There's already a sysctl to turn them off. Given these messages
aren't useful for most people, this patch disables them by
default.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-13 15:36:21 -08:00
Dave Jones
bf3883c12f [ATM]: Ratelimit atmsvc failure messages
This seems to be trivial to trigger.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-13 15:34:58 -08:00
Marcel Holtmann
7b005bd34c [Bluetooth] Fix NULL pointer dereferences of the HCI socket
This patch fixes the two NULL pointer dereferences found by the sfuzz
tool from Ilja van Sprundel. The first one was a call of getsockname()
for an unbound socket and the second was calling accept() while this
operation isn't implemented for the HCI socket interface.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-02-13 11:40:03 +01:00
Marcel Holtmann
56f3a40a5e [Bluetooth] Reduce L2CAP MTU for RFCOMM connections
This patch reduces the default L2CAP MTU for all RFCOMM connections
from 1024 to 1013 to improve the interoperability with some broken
RFCOMM implementations. To make this more flexible the L2CAP MTU
also becomes a module parameter, so it can be changed at runtime.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2006-02-13 11:39:57 +01:00
Jesper Juhl
3c791925da [PATCH] netfilter: fix build error due to missing has_bridge_parent macro
net/bridge/br_netfilter.c: In function `br_nf_post_routing':
net/bridge/br_netfilter.c:808: warning: implicit declaration of function `has_bridge_parent'

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Cc: Harald Welte <laforge@netfilter.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-12 16:10:47 -08:00
Stephen Hemminger
bab1deea30 [BRIDGE]: fix error handling for add interface to bridge
Refactor how the bridge code interacts with kobject system.
It should still use kobjects even if not using sysfs.
Fix the error unwind handling in br_add_if.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-09 17:10:12 -08:00
Stephen Hemminger
5dce971acf [BRIDGE]: netfilter handle RCU during removal
Bridge netfilter code needs to handle the case where a device is
removed from the bridge while a packet is in process. In these cases the
bridge_parent can become null while processing.

This should fix: http://bugzilla.kernel.org/show_bug.cgi?id=5803

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-09 17:09:38 -08:00
Stephen Hemminger
b3f1be4b54 [BRIDGE]: fix for RCU and deadlock on device removal
Change Bridge receive path to correctly handle RCU removal of device
from bridge.  Also fixes deadlock between carrier_check and del_nbp.
This replaces the previous deleted flag fix.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-09 17:08:52 -08:00
John Heffner
6fcf9412de [TCP]: rcvbuf lock when tcp_moderate_rcvbuf enabled
The rcvbuf lock should probably be honored here.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-09 17:06:57 -08:00
David Binderman
80ba250e59 [IRDA]: out of range array access
This patch fixes an out of range array access in irnet_irda.c.

Author: David Binderman <dcb314@hotmail.com>
Signed-off-by: Samuel Ortiz <samuel.ortiz@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-09 16:59:48 -08:00
Samuel Ortiz
d93077fb0e [IRDA]: Set proper IrLAP device address length
This patch sets IrDA's addr_len properly, i.e. to 4 bytes, the size of the
IrLAP device address.

Signed-off-by: Samuel Ortiz <samuel.ortiz@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-09 16:58:46 -08:00
Alexey Kuznetsov
28633514af [NETLINK]: illegal use of pid in rtnetlink
When a netlink message is not related to a netlink socket,
it is issued by the kernel socket with pid 0. Netlink "pid" has nothing
to do with current->pid. I named it incorrectly; had it been called "port",
the confusion would have been avoided.

Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-09 16:43:41 -08:00
Alexey Kuznetsov
a70ea994a0 [NETLINK]: Fix a severe bug
Netlink overrun handling was broken during the netlink improvements.
The destination socket is used where the source socket was meant,
so overrun is now never sent to user netlink sockets when it should be,
and it can even be set on a kernel socket, which results in a complete deadlock
of rtnetlink.

The suggested fix is to restore the status quo by passing the source socket as an
additional argument to netlink_attachskb().

A little explanation: overrun is set on a socket when it failed
to receive some message and the sender of this message does not handle, or even
has no way to handle, this error. This happens in two cases:
1. when the kernel sends something. The kernel never retransmits and cannot
   wait for buffer space.
2. when a user sends a broadcast and the message was not delivered
   to some recipients.

Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-09 16:43:38 -08:00
Jeff Garzik
70c07e0262 Merge branch 'viro' 2006-02-09 14:17:05 -05:00
Kristian Slavov
9908104935 [IPV6]: Address autoconfiguration does not work after device down/up cycle
If you set network interface down and up again, the IPv6 address
autoconfiguration does not work. 'ip addr' shows that the link-local
address is in tentative state. We don't even react to periodical router
advertisements.

During NETDEV_DOWN we clear IF_READY, and we don't set it back in
NETDEV_UP. While starting to perform DAD on the link-local address, we
notice that the device is not in IF_READY, and we abort the autoconfiguration
process (which would eventually send router solicitations).

Acked-by: Juha-Matti Tapio <jmtapio@verkkotelakka.net>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-08 16:13:28 -08:00
Al Viro
e80e28b6b6 [PATCH] net/ipv6/mcast.c NULL noise removal
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-07 20:58:56 -05:00
Al Viro
76edc6051e [PATCH] ipv4 NULL noise removal
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-07 20:57:37 -05:00
Al Viro
1b8623545b [PATCH] remove bogus asm/bug.h includes.
A bunch of asm/bug.h includes are both not needed (since it will get
pulled anyway) and bogus (since they are done too early).  Removed.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-07 20:56:35 -05:00
Jeff Garzik
3c9b3a8575 Merge branch 'master' 2006-02-07 01:47:12 -05:00
Linus Torvalds
98bd0c07b6 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2006-02-05 11:10:29 -08:00
Eric Dumazet
88a2a4ac6b [PATCH] percpu data: only iterate over possible CPUs
percpu_data blindly allocates bootmem memory to store NR_CPUS instances of
cpudata, instead of allocating memory only for possible cpus.

As a preparation for changing that, we need to convert various 0 -> NR_CPUS
loops to use for_each_cpu().

(The above only applies to users of asm-generic/percpu.h.  powerpc has gone it
alone and is presently only allocating memory for present CPUs, so it's
currently corrupting memory).

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <axboe@suse.de>
Cc: Anton Blanchard <anton@samba.org>
Acked-by: William Irwin <wli@holomorphy.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:51 -08:00
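
Note on the loop conversion described above: a minimal sketch in plain C of
walking only the CPUs that can exist instead of every slot from 0 to NR_CPUS.
MAX_CPUS, the bitmask argument and sum_possible() are hypothetical stand-ins
for NR_CPUS, the possible-CPU map and a per-CPU counter walk; this is not the
kernel's for_each_cpu() implementation.

```c
#include <assert.h>

#define MAX_CPUS 8  /* hypothetical stand-in for NR_CPUS */

/*
 * Sum per-CPU counters, visiting only CPUs whose bit is set in
 * possible_mask. Slots for impossible CPUs are never read, so they
 * would not need to be allocated at all.
 */
static unsigned long sum_possible(const unsigned long counters[MAX_CPUS],
                                  unsigned int possible_mask)
{
    unsigned long total = 0;

    for (unsigned int cpu = 0; cpu < MAX_CPUS; cpu++) {
        if (!(possible_mask & (1u << cpu)))
            continue;  /* skip CPUs that can never come online */
        total += counters[cpu];
    }
    return total;
}
```

With the mask 0x5 only slots 0 and 2 are touched, which is the property the
planned allocation change would rely on.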
Patrick McHardy
7918d212df [NETFILTER]: Fix check whether dst_entry needs to be released after NAT
After DNAT the original dst_entry needs to be released if present
so the packet doesn't skip input routing with its new address. The
current check for DNAT in ip_nat_in is reversed and checks for SNAT.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:29 -08:00
Patrick McHardy
0047c65a60 [NETFILTER]: Prepare {ipt,ip6t}_policy match for x_tables unification
The IPv4 and IPv6 version of the policy match are identical besides address
comparison and the data structure used for userspace communication. Unify
the data structures to break compatibility now (before it is released), so
we can port it to x_tables in 2.6.17.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:28 -08:00
Patrick McHardy
878c41ce57 [NETFILTER]: Fix ip6t_policy address matching
Fix two bugs in ip6t_policy address matching:
- misorder arguments to ip6_masked_addrcmp, mask must be the second argument
- inversion incorrectly applied to the entire expression instead of just
  the address comparison

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:27 -08:00
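
Note on the two bugs listed above: a minimal sketch of a masked address
comparison where the mask is the second argument and the inversion flag is
applied to the address comparison only. struct addr6, masked_equal() and
policy_match() are hypothetical names for illustration; the real
ip6_masked_addrcmp() and the match code are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct addr6 { uint8_t b[16]; };  /* hypothetical 128-bit address */

/* True when a and b agree on every bit that is set in mask. */
static bool masked_equal(const struct addr6 *a, const struct addr6 *mask,
                         const struct addr6 *b)
{
    for (int i = 0; i < 16; i++)
        if ((a->b[i] ^ b->b[i]) & mask->b[i])
            return false;
    return true;
}

/*
 * invert_addr flips only the address comparison. other_ok stands for the
 * remaining, non-inverted checks; it must hold regardless of inversion.
 */
static bool policy_match(const struct addr6 *pkt, const struct addr6 *mask,
                         const struct addr6 *want, bool invert_addr,
                         bool other_ok)
{
    return (masked_equal(pkt, mask, want) != invert_addr) && other_ok;
}
```

Inverting the whole expression instead would turn a failed other_ok check
into a match, which is the second bug described in the commit message.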
Patrick McHardy
e55f1bc5dc [NETFILTER]: Check policy length in policy match strict mode
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:26 -08:00
Kirill Korotaev
ee4bb818ae [NETFILTER]: Fix possible overflow in netfilters do_replace()
netfilter's do_replace() can overflow on addition within SMP_ALIGN()
and/or on multiplication by NR_CPUS, resulting in a buffer overflow on
the copy_from_user().  In practice, the overflow on addition is
triggerable on all systems, whereas the multiplication one might require
much physical memory to be present due to the check above.  Either is
sufficient to overwrite arbitrary amounts of kernel memory.

I really hate adding the same check to all 4 versions of do_replace(),
but the code is duplicate...

Found by Solar Designer during security audit of OpenVZ.org

Signed-Off-By: Kirill Korotaev <dev@openvz.org>
Signed-Off-By: Solar Designer <solar@openwall.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:25 -08:00
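
Note on the overflow described above: a minimal sketch of checking both the
alignment addition and the per-CPU multiplication before using the result as
an allocation size. CACHE_LINE, align_up_checked() and total_size_checked()
are hypothetical names; the actual check added to do_replace() is not shown
in this log and may look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u  /* assumed alignment unit */

/* Round size up to a multiple of CACHE_LINE; fail instead of wrapping. */
static bool align_up_checked(size_t size, size_t *out)
{
    if (size > SIZE_MAX - (CACHE_LINE - 1))
        return false;  /* the addition would overflow */
    *out = (size + CACHE_LINE - 1) & ~(size_t)(CACHE_LINE - 1);
    return true;
}

/* Bytes needed for ncpus aligned copies; fail if the product overflows. */
static bool total_size_checked(size_t size, size_t ncpus, size_t *out)
{
    size_t aligned;

    if (ncpus == 0 || !align_up_checked(size, &aligned))
        return false;
    if (aligned > SIZE_MAX / ncpus)
        return false;  /* the multiplication would overflow */
    *out = aligned * ncpus;
    return true;
}
```

A caller would reject the request when either helper fails, before any
allocation or copy_from_user() takes place.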
Samir Bellabes
df4e9574a3 [NETFILTER]: nf_conntrack: fix incorrect memset() size in FTP helper
This memset() is executing with a bad size. According to Yasuyuki Kozakai,
this memset() can be deleted, as 'ftp' is declared in global area.

Signed-off-by: Samir Bellabes <sbellabes@mandriva.com>
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:23 -08:00
Patrick McHardy
6f16930078 [NETFILTER]: Fix missing src port initialization in tftp expectation mask
Reported by David Ahern <dahern@avaya.com>, netfilter bugzilla #426.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:21 -08:00
Patrick McHardy
a706124d0a [NETFILTER]: nfnetlink_queue: fix packet marking over netlink
The packet marked is the netlink skb, not the queued skb.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:20 -08:00
Patrick McHardy
ad2ad0f965 [NETFILTER]: Fix undersized skb allocation in ipt_ULOG/ebt_ulog/nfnetlink_log
The skb allocated is always of size nlbufsize, even if that is smaller than
the size needed for the current packet.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:19 -08:00
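
Note on the sizing problem described above: a minimal sketch of choosing an
allocation size that is never smaller than what the current packet needs.
log_alloc_size() is a hypothetical helper, not a function from ipt_ULOG,
ebt_ulog or nfnetlink_log.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Use the configured buffer size as the default, but grow the request
 * when the current packet would not fit into it.
 */
static size_t log_alloc_size(size_t nlbufsiz, size_t pkt_size)
{
    return pkt_size > nlbufsiz ? pkt_size : nlbufsiz;
}
```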
Holger Eitzenberger
c2db292438 [NETFILTER]: ULOG/nfnetlink_log: Use better default value for 'nlbufsiz'
Performance tests showed that ULOG may fail on heavy loaded systems
because of failed order-N allocations (N >= 1).

The default value of 4096 is not optimal in the sense that it actually
allocates _two_ contiguous physical pages.  Reasoning: ULOG uses
alloc_skb(), which adds another ~300 bytes for skb_shared_info.

This patch sets the default value to NLMSG_GOODSIZE and adds some
documentation at the top.

Signed-off-by: Holger Eitzenberger <heitzenberger@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:18 -08:00
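
Note on the reasoning above: a minimal sketch of the arithmetic, assuming a
4096-byte page and the roughly 300 bytes of overhead quoted in the message.
PAGE_SZ and alloc_order() are hypothetical; the real allocator and the value
of NLMSG_GOODSIZE are not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ 4096u  /* assumed page size */

/* Smallest order n such that (PAGE_SZ << n) bytes hold the request. */
static unsigned int alloc_order(size_t bytes)
{
    unsigned int order = 0;
    size_t span = PAGE_SZ;

    while (span < bytes) {
        span <<= 1;
        order++;
    }
    return order;
}
```

A 4096-byte buffer plus about 300 bytes of overhead needs order 1 (two
contiguous pages), while a buffer that leaves room for the overhead within
one page stays at order 0.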
Yasuyuki Kozakai
ddc8d029ac [NETFILTER]: nf_conntrack: check address family when finding protocol module
__nf_conntrack_{l3}proto_find() doesn't check the passed protocol family,
so it is possible to access beyond the end of the array, which has only AF_MAX items.

Spotted by Pablo Neira Ayuso.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:17 -08:00
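
Note on the missing check described above: a minimal sketch of validating a
family number before using it as an array index. FAMILY_MAX, struct l3proto
and l3proto_find() are hypothetical stand-ins, and falling back to a generic
entry is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define FAMILY_MAX 32  /* hypothetical stand-in for AF_MAX */

struct l3proto { const char *name; };

static struct l3proto generic = { "generic" };
static struct l3proto ipv4 = { "ipv4" };
static struct l3proto *protos[FAMILY_MAX];  /* indexed by family number */

/* Look up a protocol module; out-of-range families never index the array. */
static struct l3proto *l3proto_find(unsigned int family)
{
    if (family >= FAMILY_MAX || protos[family] == NULL)
        return &generic;
    return protos[family];
}
```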
Pablo Neira Ayuso
34f9a2e4de [NETFILTER]: ctnetlink: add MODULE_ALIAS for expectation subsystem
Add load-on-demand support for expectation requests, e.g. conntrack -L expect

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:16 -08:00
Marcus Sundberg
b633ad5fbf [NETFILTER]: ctnetlink: Fix subsystem used for expectation events
The ctnetlink expectation events should use the NFNL_SUBSYS_CTNETLINK_EXP
subsystem, not NFNL_SUBSYS_CTNETLINK.

Signed-off-by: Marcus Sundberg <marcus@ingate.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:15 -08:00
Herbert Xu
fa60cf7f64 [ICMP]: Fix extra dst release when ip_options_echo fails
When two ip_route_output_key lookups in icmp_send were combined I
forgot to change the error path for ip_options_echo to not drop the
dst reference since it now sits before the dst lookup.  To fix it we
simply jump past the ip_rt_put call.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-04 23:51:14 -08:00
Linus Torvalds
d6c8f6aaa1 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2006-02-03 08:33:06 -08:00
Stephen Hemminger
0dec456d1f [NET]: Add CONFIG_NETDEBUG to suppress bad packet messages.
If you are on a hostile network, or are running protocol tests, you can
easily get the log swamped by messages about bad UDP and ICMP packets.
This turns those messages off unless a config option is enabled.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-02 20:40:09 -08:00
Horms
f00c401b9b [IPV4]: Remove spurious use of goto out: in icmp_reply
This seems to be an artifact of the following commit in February '02.

e7e173af42dbf37b1d946f9ee00219cb3b2bea6a

In a nutshell, goto out and return actually do the same thing,
and both are called in this function. This patch removes out.

Signed-Off-By: Horms <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-02 17:03:18 -08:00
Herbert Xu
6f4b6ec1cf [IPV6]: Fix illegal dst locking in softirq context.
On Tue, Jan 31, 2006 at 10:24:32PM +0100, Ingo Molnar wrote:
>
>  [<c04de9e8>] _write_lock+0x8/0x10
>  [<c0499015>] inet6_destroy_sock+0x25/0x100
>  [<c04b8672>] tcp_v6_destroy_sock+0x12/0x20
>  [<c046bbda>] inet_csk_destroy_sock+0x4a/0x150
>  [<c047625c>] tcp_rcv_state_process+0xd4c/0xdd0
>  [<c047d8e9>] tcp_v4_do_rcv+0xa9/0x340
>  [<c047eabb>] tcp_v4_rcv+0x8eb/0x9d0

OK this is definitely broken.  We should never touch the dst lock in
softirq context.  Since inet6_destroy_sock may be called from that
context due to the asynchronous nature of sockets, we can't take the
lock there.

In fact this sk_dst_reset is totally redundant since all IPv6 sockets
use inet_sock_destruct as their socket destructor which always cleans
up the dst anyway.  So the solution is to simply remove the call.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-02 17:01:13 -08:00
Herbert Xu
f8addb3215 [IPV4] multipath_wrandom: Fix softirq-unsafe spin lock usage
The spin locks in multipath_wrandom may be obtained from either process
context or softirq context depending on whether the packet is locally
or remotely generated.  Therefore we need to disable BH processing when
taking these locks.

This bug was found by Ingo's lock validator.
 
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-02 16:59:16 -08:00
Vlad Yasevich
27852c26ba [SCTP]: Fix 'fast retransmit' to send a TSN only once.
SCTP used to "fast retransmit" a TSN every time we hit the number
of missing reports for the TSN.  However the Implementers Guide
specifies that we should only "fast retransmit" a given TSN once.
Subsequent retransmits should be timeouts only. Also change the
number of missing reports to 3 as per the latest IG (similar to TCP).

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-02 16:57:31 -08:00
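
Note on the rule described above: a minimal sketch of triggering a fast
retransmit only once per TSN, at the third missing report. struct chunk and
report_missing() are hypothetical names; the SCTP code itself is not
reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

#define MISSING_REPORT_THRESHOLD 3  /* value named in the commit message */

struct chunk {
    unsigned int missing_reports;  /* times this TSN was reported missing */
    bool fast_retransmitted;       /* set once a fast retransmit was done */
};

/*
 * Record one missing report. Returns true only the first time the
 * threshold is reached; later retransmits are left to the timer.
 */
static bool report_missing(struct chunk *c)
{
    c->missing_reports++;
    if (c->missing_reports >= MISSING_REPORT_THRESHOLD &&
        !c->fast_retransmitted) {
        c->fast_retransmitted = true;
        return true;
    }
    return false;
}
```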
Herbert Xu
4641e7a334 [IPV6]: Don't hold extra ref count in ipv6_ifa_notify
Currently the logic in ipv6_ifa_notify is to hold an extra reference
count for addrconf dst's that get added to the routing table.  Thus,
when addrconf dst entries are taken out of the routing table, we need
to drop that dst.  However, addrconf dst entries may be removed from
the routing table by means other than __ipv6_ifa_notify.

So we're faced with the choice of either fixing up all places where
addrconf dst entries are removed, or dropping the extra reference count
altogether.

I chose the latter because the ifp itself always holds a dst reference
count of 1 while it's alive.  This is dropped just before we kfree the
ifp object.  Therefore we know that in __ipv6_ifa_notify we will always
hold that count.

This bug was found by Eric W. Biederman.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-02 16:55:45 -08:00
Stephen Hemminger
42c5e15f18 [NET] snap: needs hardware checksum fix
The SNAP code pops off its 5-byte header, but doesn't adjust
the checksum. This would cause problems when using a device that
does IP over SNAP and hardware receive checksums.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-02 16:53:26 -08:00
Trond Myklebust
fba3bad488 SUNRPC: Move upcall out of auth->au_ops->crcreate()
This fixes a bug whereby if two processes try to look up the same auth_gss
 credential, they may end up creating two creds, and triggering two upcalls
 because the upcall is performed before the credential is added to the
 credcache.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-02-01 12:52:25 -05:00
Trond Myklebust
adb12f63e0 SUNRPC: Remove the deprecated function lookup_hash() from rpc_pipefs code
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-02-01 12:52:24 -05:00
Trond Myklebust
9842ef3557 SUNRPC: rpc_timeout_upcall_queue should not sleep
The function rpc_timeout_upcall_queue runs from a workqueue, and hence
 sleeping is not recommended. Convert the protection of the upcall queue
 from being mutex-based to being spinlock-based.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-02-01 12:52:24 -05:00
Trond Myklebust
8a3177604b SUNRPC: Fix a lock recursion in the auth_gss downcall
When we look up a new cred in the auth_gss downcall so that we can stuff
 the credcache, we do not want that lookup to queue up an upcall in order
 to initialise it. Doing an upcall here is not only redundant, but since we
 are already holding the inode->i_mutex, it will trigger a lock recursion.

 This patch allows rpcauth cache searches to indicate that they can cope
 with uninitialised credentials.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-02-01 12:52:23 -05:00
Martin Waitz
99acf04421 [PATCH] DocBook: fix some kernel-doc comments in net/sunrpc
Fix the syntax of some kernel-doc comments

Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-01 08:53:27 -08:00
David S. Miller
0cbd782507 [DCCP] ipv6: dccp_v6_send_response() has a DST leak too.
It was copy&pasted from tcp_v6_send_synack() which has
a DST leak recently fixed by Eric W. Biederman.

So dccp_v6_send_response() needs the same fix too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-31 17:53:37 -08:00
Eric W. Biederman
78b910429e [IPV6] tcp_v6_send_synack: release the destination
This patch fixes dst reference counting in tcp_v6_send_synack.

Analysis:
Currently tcp_v6_send_synack is never called with a dst entry
so dst always comes in as NULL.

ip6_dst_lookup calls ip6_route_output which calls dst_hold
before it returns the dst entry.   Neither xfrm_lookup
nor tcp_make_synack consume the dst entry so we still have
a dst_entry with a bumped reference count at the end of
this function.

Therefore we need to call dst_release just before we return
just like tcp_v4_send_synack does.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-31 17:51:44 -08:00
Sam Ravnborg
f9d9516db7 [NET]: Do not export inet_bind_bucket_create twice.
inet_bind_bucket_create was exported twice.  Keep the export in the
file where inet_bind_bucket_create is defined.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-31 17:47:02 -08:00
Stephen Hemminger
3f4cfc2d11 [BRIDGE]: Fix device delete race.
This is a simpler fix for the two races in bridge device removal.
The Xen race of delif and notify is managed now by a new deleted flag.
No need for barriers or other locking because of rtnl mutex.

The del_timer_sync()'s are unnecessary, because br_stp_disable_port
deletes the timers, and they will finish running before RCU callback.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-31 17:44:07 -08:00
Patrick McHardy
5d39a795bf [IPV4]: Always set fl.proto in ip_route_newports
ip_route_newports uses the struct flowi from the struct rtable returned
by ip_route_connect for the new route lookup and just replaces the port
numbers if they have changed. If an IPsec policy exists which doesn't match
port 0 the struct flowi won't have the proto field set and no xfrm lookup
is done for the changed ports.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-31 17:35:35 -08:00
Linus Torvalds
dd1c1853e2 Fix ipv4/igmp.c compile with gcc-4 and IP_MULTICAST
Modern versions of gcc do not like case statements at the end of a block
statement: you need at least an empty statement.  Using just a "break;"
is preferred for visual style.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-31 13:11:41 -08:00
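
Note on the compile fix described above: a minimal sketch of a switch whose
last label is followed by a statement. msg_kind() and its type codes are
hypothetical and unrelated to the real igmp.c code; the point is only the
"break;" after the final label.

```c
#include <assert.h>

/* Map a hypothetical message type code to a short description. */
static const char *msg_kind(int type)
{
    const char *kind = "other";

    switch (type) {
    case 1:
        kind = "query";
        break;
    case 2:
        kind = "report";
        break;
    default:
        /*
         * A label must be followed by a statement, so a bare
         * "default:" directly before the closing brace is rejected
         * by compilers that enforce this rule.
         */
        break;
    }
    return kind;
}
```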
Linus Torvalds
0827f2b698 Merge branch 'upstream-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2006-01-31 10:29:35 -08:00
Larry Finger
2f633db5e9 [PATCH] Add two management functions to ieee80211_rx.c
On my system, I get unhandled management functions corresponding
to IEEE80211_STYPE_REASSOC_REQ and IEEE80211_STYPE_ASSOC_REQ. The
attached patch adds the logic to pass these requests off to a user
stack. The patches to implement these requests in softmac have already
been sent to Johannes Berg.

Signed-Off-By: Larry Finger <Larry.Finger@lwfinger.net>

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-31 10:35:46 -05:00
Baruch Even
2c74088e41 [TCP] H-TCP: Fix accounting
This fixes the accounting in H-TCP, the ccount variable is also
adjusted a few lines above this one.

This line was not supposed to be there and wasn't there in the patches
originally submitted, the four patches submitted were merged to one
and in that merge the bug was introduced.

Signed-Off-By: Baruch Even <baruch@ev-en.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-30 20:54:39 -08:00
Dave Jones
c5d90e0004 [IPV4] igmp: remove pointless printk
This is easily triggerable by sending bogus packets,
allowing a malicious user to flood remote logs.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-30 20:27:17 -08:00
Larry Finger
dd5eeb461e [PATCH] ieee80211: common wx auth code
This patch creates two functions ieee80211_wx_set_auth and
ieee80211_wx_get_auth that can be used by drivers for the wireless
extension handlers instead of writing their own, if the implementation
should be software only.

These patches enable using bcm43xx devices with WPA and this seems (as
far as I can tell) to be the only difference between the stock ieee80211
and softmac's ieee80211 left.

Signed-Off-By: Johannes Berg <johannes@sipsolutions.net>

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-30 20:35:35 -05:00
Denis Vlasenko
9eafe76b8a [PATCH] ieee80211: trivial fix for misplaced ()'s
Patch fixes misplaced (). Diffed against wireless-2.6.git

Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-30 20:35:31 -05:00
Adrian Bunk
d86b5e0e6b [PATCH] net/: fix the WIRELESS_EXT abuse
This patch contains the following changes:
- add a CONFIG_WIRELESS_EXT select'ed by NET_RADIO for conditional
  code
- remove the now no longer required #ifdef CONFIG_NET_RADIO from some
  #include's

Based on a patch by Jean Tourrilhes <jt@hpl.hp.com>.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-30 20:35:30 -05:00
Vlad Yasevich
e2c2fc2c8f [SCTP]: heartbeats exceed maximum retransmission limit
The number of HEARTBEAT chunks that an association may transmit is
limited by Association.Max.Retrans count; however, the code allows
us to send one extra heartbeat.

This patch limits the number of heartbeats to the maximum count.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-30 16:00:40 -08:00
Vlad Yasevich
81845c21dc [SCTP]: correct the number of INIT retransmissions
We currently count the initial INIT/COOKIE_ECHO chunk toward the
retransmit count and thus send a total of sctp_max_retrans_init chunks.
The correct behavior is to retransmit the chunk sctp_max_retrans_init in
addition to sending the original.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-30 15:59:54 -08:00
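
Note on the counting described above: a minimal sketch of the off-by-one,
assuming every attempt times out. total_sends() and its count_original flag
are hypothetical and only model the arithmetic, not the SCTP state machine.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Total number of times a chunk goes out when every attempt times out.
 * count_original = true models the old behaviour, where the first
 * transmission already consumes one unit of the retransmit budget.
 */
static unsigned int total_sends(unsigned int max_retrans, bool count_original)
{
    unsigned int sends = 1;  /* the original transmission */
    unsigned int counter = count_original ? 1 : 0;

    while (counter < max_retrans) {
        counter++;
        sends++;  /* one retransmission */
    }
    return sends;
}
```

With a limit of 8 the old counting sends 8 chunks in total, while the
corrected counting sends the original plus 8 retransmissions.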
John W. Linville
747af1e154 Merge branch 'upstream-fixes' 2006-01-30 17:43:25 -05:00
Larry Finger
1a1fedf4d3 [PATCH] Typo corrections for ieee80211
This patch, generated against 2.6.16-rc1-git4, corrects two typographical
errors in ieee80211_rx.c and adds the facility name to a bare printk.

Signed-Off-By: Larry Finger <Larry.Finger@lwfinger.net>

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-30 17:41:36 -05:00
Zhu Yi
d1b46b0fba [PATCH] ieee80211: Add 802.11h information element parsing
Added default handlers for various 802.11h DFS and TPC information
elements.  Moved all information elements into single location (called
from two places).  Added debug message with information on unparsed IEs
if debug_level set.  Added code to reset network IBSS DFS information
when appropriate.  Added code to invoke driver callback for 802.11h
ACTION STYPE.  Changed a few printk's to IEEE80211_DEBUG_MGMT.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-27 17:08:07 -05:00
Zhu Yi
15f385982e [PATCH] ieee80211: Add helpers for IBSS DFS handling
To support IEEE 802.11h in IBSS, an ibss_dfs field is added to struct
ieee80211_network. In IBSS, if one STA sends a beacon with DFS info
(for radar detection), all the other STAs should receive and store
this DFS.  All STAs should send the DFS as one of the information
element in the beacon they are scheduled to send (if possible) in
the future.

Since the ibss_dfs has variable length, it must be allocated
dynamically. ieee80211_network_reset() is added to clear the ibss_dfs
field. ieee80211_network_free() is also updated to free the ibss_dfs
field if it is not NULL.

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-27 17:08:07 -05:00
Zhu Yi
b79e20b609 [PATCH] ieee80211: Add 802.11h data type and structures
Add 802.11h data types and structure definitions to ieee80211.h.

Signed-off-by: Hong Liu <hong.liu@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-27 17:08:07 -05:00
Zhu Yi
9184d9348a [PATCH] ieee80211: Add TKIP crypt->build_iv
This patch adds an ieee80211 TKIP build_iv() method to support hardware
that can do TKIP encryption but relies on the ieee80211 layer to build
the IV. It also changes the build_iv() interface to return the key
if possible after the IV is built (this is required by TKIP).

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-27 17:08:07 -05:00
Zhu Yi
41a25c616b [PATCH] ieee80211: TIM information element parsing
Added partial support of TIM information element parsing

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-27 17:08:07 -05:00
Zhu Yi
8aa914b747 [PATCH] ieee80211: kmalloc+memset -> kzalloc cleanups
kmalloc+memset -> kzalloc cleanups in ieee80211_crypt_tkip

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-27 17:08:06 -05:00
Zhu Yi
7bd6436604 [PATCH] ieee80211: Add spectrum management information
Add spectrum management information and use stat.signal to provide
signal level information.

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-27 17:08:06 -05:00
Zhu Yi
d128f6c176 [PATCH] ieee80211: add flags for all geo channels
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-27 17:08:06 -05:00
Zhu Yi
d652923751 [PATCH] ieee80211: Log if netif_rx() drops the packet
Log to wireless network stats if netif_rx() drops the packet.

(also trailing whitespace and Lindent cleanups as part of patch-apply
process)

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-27 17:08:06 -05:00
Denis Vlasenko
44d7a8cfbd [PATCH] WEP fields are incorrectly shown to be INSIDE snap in the doc
>If encryption is enabled, each fragment payload size is reduced by enough space
>to add the prefix and postfix (IV and ICV totalling 8 bytes in the case of WEP)
>So if you have 1500 bytes of payload with ieee->fts set to 500 without
>encryption it will take 3 frames.  With WEP it will take 4 frames as the
>payload of each frame is reduced to 492 bytes.

Text is correct, but in picture (IV,payload,ICV) sits inside SNAP.
Patch corrects this.
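
The frame counts quoted above can be checked with a short sketch. It assumes the fragmentation threshold bounds each fragment and that encryption reserves a fixed number of bytes per fragment (8 for WEP, as quoted). The helper name frag_count is hypothetical and is not part of the ieee80211 code.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the ieee80211 code: number of fragments
 * needed for `payload` bytes when each fragment holds at most `fts` bytes
 * and encryption reserves `overhead` bytes (IV + ICV) in every fragment. */
static size_t frag_count(size_t payload, size_t fts, size_t overhead)
{
	size_t per_frag;

	if (fts <= overhead)
		return 0;	/* no room left for any payload */
	per_frag = fts - overhead;
	return (payload + per_frag - 1) / per_frag;	/* round up */
}
```

With a 1500 byte payload and a threshold of 500, an overhead of 0 yields 3 fragments and an overhead of 8 yields 4, matching the quoted text.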

Signed-Off-By: Denis Vlasenko <vda@ilport.com.ua>
Acked-By: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-27 17:08:06 -05:00
Zhu Yi
55cd94aa1d [PATCH] ieee80211: Fix iwlist scan can only show about 20 APs
Limit the amount of output given to iwlist scan.

Signed-off-by: Hong Liu <hong.liu@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-27 16:49:58 -05:00
Zhu Yi
b6daa25d65 [PATCH] ieee80211: Fix problem with not decrypting broadcast packets
The code for pulling the key to use for decrypt was correctly using
the host_mc_decrypt flag.  The code that actually decrypted,
however, was based on host_decrypt.  This patch changes this
behavior.

Signed-off-by: Etay Bogner <etay.bogner@gmail.com>
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-01-27 16:49:58 -05:00
David L Stevens
7add2a4398 [IPV6] MLDv2: fix change records when transitioning to/from inactive
The following patch fixes these problems in MLDv2:

1) Add/remove "delete" records for sending change reports when
        addition of a filter results in that filter transitioning to/from
        inactive. [same as recent IPv4 IGMPv3 fix]
2) Remove 2 redundant "group_type" checks (can't be IPV6_ADDR_ANY
        within that loop, so checks are always true)
3) change an is_in() "return 0" to "return type == MLD2_MODE_IS_INCLUDE".
        It should always be "0" to get here, but it improves code locality 
        to not assume it, and if some race allowed otherwise, doing
        the check would return the correct result.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-24 13:06:39 -08:00
Jerome Borsboom
151bb0ffe5 [AF_KEY]: no message type set
When returning a message to userspace in reply to a SADB_FLUSH or 
SADB_X_SPDFLUSH message, the type was not set for the returned PFKEY 
message. The patch below corrects this problem.

Signed-off-by: Jerome Borsboom <j.borsboom@erasmusmc.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-24 12:57:19 -08:00
Thomas Graf
cabcac0b29 [BONDING]: Remove CAP_NET_ADMIN requirement for INFOQUERY ioctl
This information is already available via /proc/net/bonding/*
therefore it doesn't make sense to require CAP_NET_ADMIN
privileges.

Original patch by Laurent Deniel <laurent.deniel@free.fr>

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-24 12:46:33 -08:00
Herbert Xu
8798b3fb71 [NET]: Fix skb fclone error path handling.
On the error path if we allocated an fclone then we will free it in
the wrong pool.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-23 16:32:45 -08:00
Kris Katterjohn
8ae55f0489 [NET]: Fix some whitespace issues in af_packet.c
Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-23 16:28:02 -08:00
Kris Katterjohn
2966b66c25 [NET]: more whitespace issues in net/core/filter.c
This fixes some whitespace issues in net/core/filter.c

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-23 16:26:16 -08:00
David S. Miller
cf9e50a920 Merge master.kernel.org:/pub/scm/linux/kernel/git/sridhar/lksctp-2.6 2006-01-19 16:53:02 -08:00
Alan Cox
715b49ef2d [PATCH] EDAC: atomic scrub operations
EDAC requires a way to scrub memory if an ECC error is found and the chipset
does not do the work automatically.  That means rewriting memory locations
atomically with respect to all CPUs _and_ bus masters.  That means we can't
use atomic_add(foo, 0) as it gets optimised for non-SMP.

This adds a function to include/asm-foo/atomic.h for the platforms currently
supported which implements a scrub of a mapped block.

It also adjusts the include order in a few other files where atomic.h is
included before types.h, as this now causes an error because atomic_scrub
uses u32.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 19:20:30 -08:00
J. Bruce Fields
5fb8b49e29 [PATCH] svcrpc: gss: svc context creation error handling
Allow mechanisms to return more varied errors on the context creation
downcall.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 19:20:25 -08:00
Kevin Coffman
91a4762e0a [PATCH] svcrpc: gss: server context init failure handling
We require the server's gssd to create a completed context before asking the
kernel to send a final context init reply.  However, gssd could be buggy, or
under some bizarre circumstances we might purge the context from our cache
before we get the chance to use it here.

Handle this case by returning GSS_S_NO_CONTEXT to the client.

Also move the relevant code here to a separate function rather than nesting
excessively.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 19:20:25 -08:00
Andy Adamson
822f1005ae [PATCH] svcrpc: gss: handle the GSS_S_CONTINUE
Kerberos context initiation is handled in a single round trip, but other
mechanisms (including spkm3) may require more, so we need to handle the
GSS_S_CONTINUE case in svcauth_gss_accept.  Send a null verifier.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 19:20:25 -08:00
J. Bruce Fields
1918e34138 [PATCH] svcrpc: save and restore the daddr field when request deferred
The server code currently keeps track of the destination address on every
request so that it can reply using the same address.  However we forget to do
that in the case of a deferred request.  Remedy this oversight.  From folks
at PolyServe.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 19:20:24 -08:00
David S. Miller
27a7b0415f Merge git://tipc.cslab.ericsson.net/pub/git/tipc 2006-01-18 14:23:54 -08:00
David L Stevens
ad12583f46 [IPV4]: Fix multiple bugs in IGMPv3
1) fix "mld_marksources()" to
        a) send nothing when all queried sources are excluded
        b) send full exclude report when some queried sources are
                not excluded
        c) don't schedule a timer when there's nothing to report

2) fix "add_grec()" to send empty-source records when it should
        The original check doesn't account for a non-empty source
        list with all sources inactive; the new code keeps that
        short-circuit case, and also generates the group header
        with an empty list if needed.

3) fix mca_crcount decrement to be after add_grec(), which needs
        its original value

4) add/remove delete records and prevent current advertisements
        when an exclude-mode filter moves from "active" to "inactive"
        or vice versa based on new filter additions.

        Items 1-3 are just IPv4 versions of the IPv6 bugs found
by Yan Zheng and fixed earlier. Item #4 is a related bug that
affects exclude-mode change records only (but not queries) and
also occurs in IPv6 (IPv6 version coming soon).

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-18 14:20:56 -08:00
David S. Miller
7ac5459ec0 [PKTGEN]: Respect hard_header_len of device.
Don't assume 16.

Found by Ben Greear.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-18 14:19:10 -08:00
Andrew Morton
dbd2915ce8 [IPV4]: RT_CACHE_STAT_INC() warning fix
BUG: using smp_processor_id() in preemptible [00000001] code: rpc.statd/2408

And it _is_ a bug, but I guess we don't care enough to add preempt_disable().

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 22:46:49 -08:00
Per Liden
4323add677 [TIPC] Avoid polluting the global namespace
This patch adds a tipc_ prefix to all externally visible symbols.

Signed-off-by: Per Liden <per.liden@ericsson.com>
2006-01-18 00:45:16 +01:00
Per Liden
1e63e681e0 [TIPC] Group protocols with sub-options in Kconfig
This is just a cosmetic change that moves the TIPC configuration
entry next to the other protocols that also have sub-options.
Makes the networking options menu look a bit better.

Signed-off-by: Per Liden <per.liden@ericsson.com>
2006-01-18 00:45:15 +01:00
Per Liden
c11ac3f236 [TIPC] Add help text for TIPC configuration option
Signed-off-by: Per Liden <per.liden@ericsson.com>
2006-01-18 00:45:15 +01:00
Per Liden
50f9bcddf8 [TIPC] Remove unused #includes
Signed-off-by: Per Liden <per.liden@ericsson.com>
2006-01-18 00:45:15 +01:00
Per Liden
33a9c4da5a [TIPC] Move ethernet protocol id to linux/if_ether.h
Signed-off-by: Per Liden <per.liden@ericsson.com>
2006-01-18 00:45:15 +01:00
Per Liden
16cb4b333c [TIPC] Updated link priority macros
Added macros for min/default/max link priority in tipc_config.h.
Also renamed TIPC_NUM_LINK_PRI to TIPC_MEDIA_LINK_PRI since that
is a more accurate description of what it is used for.

Signed-off-by: Per Liden <per.liden@ericsson.com>
2006-01-18 00:45:15 +01:00
Jon Maloy
5f7c3ff6a2 [TIPC] Minor changes to #includes
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
2006-01-18 00:45:14 +01:00
Kris Katterjohn
3860288ee8 [NET]: Use is_zero_ether_addr() in net/core/netpoll.c
This replaces a memcmp() with is_zero_ether_addr().

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 15:15:38 -08:00
Kris Katterjohn
f404e9a67f [PKTGEN]: Replacing with (compare|is_zero)_ether_addr() and ETH_ALEN
This replaces some tests with is_zero_ether_addr(), memcmp(one, two,
6) with compare_ether_addr(one, two), and 6 with ETH_ALEN where
appropriate.
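
As a rough illustration of the replacement, the sketch below re-creates the two helpers in userspace. The sketch_ names are stand-ins invented here; the kernel versions live in <linux/etherdevice.h> and may be implemented differently.

```c
#include <string.h>

#define ETH_ALEN 6	/* length of an Ethernet address, as in <linux/if_ether.h> */

/* Userspace stand-ins for illustration only. */
static int sketch_is_zero_ether_addr(const unsigned char *addr)
{
	/* OR all six bytes together: the result is 0 only for 00:00:00:00:00:00 */
	return !(addr[0] | addr[1] | addr[2] | addr[3] | addr[4] | addr[5]);
}

/* Returns 0 when the two addresses are equal, non-zero otherwise. */
static int sketch_compare_ether_addr(const unsigned char *a,
				     const unsigned char *b)
{
	return memcmp(a, b, ETH_ALEN) != 0;
}
```

Named helpers make the intent visible at the call site, and ETH_ALEN removes the bare 6.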

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 13:04:57 -08:00
Kris Katterjohn
a8fc3d8dec [NET]: "signed long" -> "long"
Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 13:03:54 -08:00
Patrick McHardy
ab67a4d511 [EBTABLES]: Handle SCTP/DCCP in ebt_{ip,log}
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 13:01:31 -08:00
Patrick McHardy
ae82af54d7 [PKT_SCHED]: Handle SCTP/DCCP in sfq_hash
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 13:01:06 -08:00
Tsutomu Fujii
a7d1f1b66c [SCTP]: Fix sctp_rcv_ootb() to handle the last chunk of a packet correctly.
Signed-off-by: Tsutomu Fujii <t-fujii@nb.jp.nec.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:57:09 -08:00
Sridhar Samudrala
c4d2444e99 [SCTP]: Fix couple of races between sctp_peeloff() and sctp_rcv().
Validate and update the sk in sctp_rcv() to avoid the race where an
assoc/ep could move to a different socket after we get the sk, but before
the skb is added to the backlog.

Also migrate the skb's in backlog queue to new sk when doing a peeloff.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:56:26 -08:00
Vlad Yasevich
313e7b4d25 [SCTP]: Fix machine check/connection hang on IA64.
sctp_unpack_cookie used an on-stack array called digest as a result/out
parameter in the call to crypto_hmac. However, hmac code (crypto_hmac_final)
assumes that the 'out' argument is in virtual memory (identity mapped region)
and can use virt_to_page call on it.  This does not work with the on-stack
declared digest.  The problems observed so far have been:
 a) incorrect hmac digest
 b) machine check and hardware reset.

Solution is to define the digest in an identity mapped region by kmalloc'ing
it.  We can do this once as part of the endpoint structure and re-use it when
verifying the SCTP cookie.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:55:57 -08:00
Vlad Yasevich
8116ffad41 [SCTP]: Fix bad sysctl formatting of SCTP timeout values on 64-bit m/cs.
Change all the structure members that hold jiffies to be of type
unsigned long.  This also corrects bad sysctl formatting on 64 bit
architectures.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:55:17 -08:00
Vlad Yasevich
38b0e42aba [SCTP]: Fix sctp_assoc_seq_show() panics on big-endian systems.
This patch corrects the panic by casting the argument to the
pointer of correct size.  On big-endian systems we ended up loading
only 32 bits of data because we are treating the pointer as an int*.
By treating this pointer as loff_t*, we'll load the full 64 bits
and then let regular integer demotion take place which will give us
the correct value.
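
A minimal userspace sketch of the effect described above. It lays a 64-bit value out in big-endian byte order by hand, so it behaves the same on any host. All function names are assumptions made for this illustration, not kernel code.

```c
#include <stdint.h>

/* Lay out a 64-bit value the way a big-endian machine stores it. */
static void store_be64(unsigned char out[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		out[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* What a 32-bit load through an int* sees on big-endian: the first 4 bytes,
 * which are the HIGH half of the 64-bit value. */
static uint32_t load_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* What a full 64-bit load through a loff_t* sees. */
static uint64_t load_be64(const unsigned char *p)
{
	return ((uint64_t)load_be32(p) << 32) | load_be32(p + 4);
}
```

For a small position such as 5, the 32-bit load returns 0, while the 64-bit load followed by a narrowing conversion returns 5.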

Signed-off-by: Vlad Yaseivch <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:54:06 -08:00
Vlad Yasevich
49392e5ecf [SCTP]: sctp doesn't show all associations/endpoints in /proc
When creating a very large number of associations (and endpoints),
/proc/assocs and /proc/eps will not show all of them.  As a result
netstat will not show all of them either.  This is particularly evident
when creating 1000+ associations (or endpoints).  As an example with
1500 tcp style associations over loopback, netstat showed 1420 on my
system instead of 3000.

The reason for this is that the seq_operations start method is invoked
multiple times because of the amount of data that is provided.  The
start method always increments the position parameter and since we use
the position as the hash bucket id, we end up skipping hash buckets.

This patch corrects this situation and gets rid of the silly hash-1
decrement.
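
The skipping can be modelled with a simplified sketch. It is not the seq_file API: it only assumes that start is called once per read, that each read covers a fixed number of buckets, and that the position doubles as the bucket index. The name buckets_visited is hypothetical.

```c
/* Count how many of `nbuckets` hash buckets get visited when the dump is
 * split into reads of `per_read` buckets each. `start_increments` models a
 * start method that bumps the position on every call. */
static int buckets_visited(int nbuckets, int per_read, int start_increments)
{
	int pos = 0, visited = 0;

	if (per_read <= 0)
		return 0;	/* nothing can be emitted per read */

	while (pos < nbuckets) {
		/* start: called once at the beginning of every read */
		if (start_increments)
			pos++;	/* skips the bucket at the old position */
		/* show + next: emit up to per_read buckets */
		for (int i = 0; i < per_read && pos < nbuckets; i++) {
			visited++;
			pos++;
		}
	}
	return visited;
}
```

In this model, 10 buckets read 3 at a time are all visited when start leaves the position alone, but only 7 are visited when start increments it, because one bucket is lost at every restart.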

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:53:06 -08:00
Vlad Yasevich
9834a2bb49 [SCTP]: Fix sctp_cookie alignment in the packet.
On 64 bit architectures, sctp_cookie sent as part of INIT-ACK is not
aligned on a 64 bit boundary and thus causes unaligned access exceptions.

The layout of the cookie parameter is this:
|<----- Parameter Header --------------------|<--- Cookie DATA --------
-----------------------------------------------------------------------
| param type (16 bits) | param len (16 bits) | sig [32 bytes] | cookie..
-----------------------------------------------------------------------

The cookie data portion contains 64 bit values on 64 bit architectures
(timeval) that fall on a 32 bit alignment boundary when used as part of
the on-wire format, but align correctly when used in internal
structures.  This patch explicitly pads the on-wire format so that
it is properly aligned.
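
The offsets in the layout above can be sketched with two illustrative structs. The struct and field names are assumptions and do not match the real SCTP definitions. The sketch also assumes the parameter itself starts on an 8-byte boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for illustration only. */
struct param_hdr {
	uint16_t type;		/* param type (16 bits) */
	uint16_t len;		/* param len (16 bits) */
};

struct cookie_unpadded {
	struct param_hdr hdr;	/* 4 bytes */
	uint8_t sig[32];	/* signature */
	uint8_t cookie[8];	/* stands in for 64-bit cookie fields */
};

struct cookie_padded {
	struct param_hdr hdr;
	uint8_t sig[32];
	uint8_t pad[4];		/* explicit padding up to an 8-byte boundary */
	uint8_t cookie[8];
};
```

Without padding the cookie data starts at offset 4 + 32 = 36, which is only 4-byte aligned; with 4 bytes of explicit padding it starts at offset 40.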

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-01-17 11:52:12 -08:00
Sridhar Samudrala
7a48f923b8 [SCTP]: Fix potential race condition between sctp_close() and sctp_rcv().
Do not release the reference to association/endpoint if an incoming skb is
added to backlog. Instead release it after the chunk is processed in
sctp_backlog_rcv().

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2006-01-17 11:51:28 -08:00
Eric Dumazet
2f970d8357 [IPV4]: rt_cache_stat can be statically defined
Using __get_cpu_var(obj) is slightly faster than per_cpu_ptr(obj, 
raw_smp_processor_id()).

1) Smaller code and memory use
For static and small objects, DEFINE_PER_CPU(type, object) is preferred over
alloc_percpu(): better and smaller code to access them, and no extra memory
(storing the pointer, and the percpu array of pointers).

x86_64 code before patch

mov    1237577(%rip),%rax        # ffffffff803e5990 <rt_cache_stat>
not    %rax  # part of per_cpu machinery
mov    %gs:0x3c,%edx # get cpu number
movslq %edx,%rdx # extend 32 bits cpu number to 64 bits
mov    (%rax,%rdx,8),%rax # get the pointer for this cpu
incl   0x38(%rax)

x86_64 code after patch

mov    $per_cpu__rt_cache_stat,%rdx
mov    %gs:0x48,%rax # get percpu data offset
incl   0x38(%rax,%rdx,1)

2) False sharing avoidance for SMP :
For a small NR_CPUS, the array of per cpu pointers allocated in alloc_percpu()
can be <= 32 bytes. This lets the slab code hand out only part of a cache line.
If the other part of this 64 byte (or 128 byte) cache line is used by a mostly
written object, we can have false sharing and expensive per_cpu_ptr() operations.

Size of rt_cache_stat is 64 bytes, so this patch does not risk too big an
increase of bss (in UP mode) or static per_cpu data for SMP
(PERCPU_ENOUGH_ROOM is currently 32768 bytes).

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:54:36 -08:00
David S. Miller
f09484ff87 [NETFILTER]: ip_conntrack_proto_gre.c needs linux/interrupt.h
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:42:02 -08:00
Yasuyuki Kozakai
f0daaa654a [NETFILTER] ip6tables: whitespace and indent cosmetic cleanup
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:39:39 -08:00
Yasuyuki Kozakai
6dd42af790 [NETFILTER] Makefile cleanup
These are replaced with x_tables matches and no longer exist.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:38:56 -08:00
Benoit Boissinot
ccc91324a1 [NETFILTER] ip[6]t_policy: Fix compilation warnings
ip[6]t_policy argument conversion slipped when merging with x_tables

Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:26:34 -08:00
Kris Katterjohn
e35bedf369 [NET]: Fix whitespace issues in net/core/filter.c
This fixes some whitespace issues in net/core/filter.c

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:25:52 -08:00
Amnon Aaronsohn
dd914b4082 [PKT_SCHED] sch_prio: fix qdisc bands init
Currently when PRIO is configured to use N bands, it lets the packets be
directed to any of the bands 0..N-1. However, PRIO attaches a fifo qdisc
only to the bands that appear in the priomap; the rest of the N bands
remain with a noop qdisc attached. This patch changes PRIO's behavior so
that it attaches a fifo qdisc to all of the N bands.

Signed-off-by: Amnon Aaronsohn <bla@cs.huji.ac.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:24:26 -08:00
YOSHIFUJI Hideaki
9343e79a7b [IPV6]: Preserve procfs IPV6 address output format
Procfs always outputs IPV6 addresses without the colon
characters, and we cannot change that.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 02:10:53 -08:00
Linus Torvalds
caf5b04c82 x86: Work around compiler code generation bug with -Os
Some versions of gcc generate incorrect code for the inet_check_attr()
function, apparently due to a totally bogus index -> pointer comparison
transformation.

At least "gcc version 4.0.1 20050727 (Red Hat 4.0.1-5)" from FC4 is
affected, possibly others too.

This changes the function subtly so that the buggy gcc transformation
doesn't trigger.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-14 22:08:28 -08:00
Arjan van de Ven
858119e159 [PATCH] Uninline a bunch of other functions
Remove the "inline" keyword from a bunch of big functions in the kernel with
the goal of shrinking it by 30kb to 40kb

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-14 18:27:06 -08:00
David S. Miller
37d8dc82e0 [NETFILTER] x-tables: Missing linux/ipv6.h includes.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 16:19:44 -08:00
Patrick McHardy
dca80b962a [PKT_SCHED]: Change default clock source to gettimeofday
The default of using jiffies is very bad and results in
underutilization except with very low bandwidth.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 14:36:55 -08:00
Patrick McHardy
ee51b1b6ce [XFRM]: IPsec tunnel wildcard address support
When the source address of a tunnel is given as 0.0.0.0 do a routing lookup
to get the real source address for the destination and fill that into the
acquire message. This allows specifying policies like this:

spdadd 172.16.128.13/32 172.16.0.0/20 any -P out ipsec
        esp/tunnel/0.0.0.0-x.x.x.x/require;
spdadd 172.16.0.0/20 172.16.128.13/32 any -P in ipsec
        esp/tunnel/x.x.x.x-0.0.0.0/require;

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 14:34:36 -08:00
Kris Katterjohn
7b11f69fb5 [NET]: Clean up comments for sk_chk_filter()
This removes redundant comments, and moves one comment to a better
location.

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 14:33:06 -08:00
Joe Perches
46b86a2da0 [NET]: Use NIP6_FMT in kernel.h
There are errors and inconsistencies in the display of NIP6 strings.
	e.g. net/ipv6/ip6_flowlabel.c

There are errors and inconsistencies in the display of NIPQUAD strings too.
	e.g. net/netfilter/nf_conntrack_ftp.c

This patch:
	adds NIP6_FMT to kernel.h
	changes all code to use NIP6_FMT
	fixes net/ipv6/ip6_flowlabel.c
	adds NIPQUAD_FMT to kernel.h
	fixes net/netfilter/nf_conntrack_ftp.c
	changes a few uses of "%u.%u.%u.%u" to NIPQUAD_FMT for symmetry to NIP6_FMT
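
A userspace sketch of how a format/argument macro pair of this kind is used. The macro bodies below are re-created for illustration and may differ from what kernel.h defines. format_ipv4 is a hypothetical helper.

```c
#include <stdio.h>
#include <stdint.h>

/* Userspace re-creation of the macro pair for illustration. */
#define NIPQUAD_FMT "%u.%u.%u.%u"
#define NIPQUAD(addr) \
	((const unsigned char *)&(addr))[0], \
	((const unsigned char *)&(addr))[1], \
	((const unsigned char *)&(addr))[2], \
	((const unsigned char *)&(addr))[3]

/* Format an IPv4 address held in network byte order into buf. */
static int format_ipv4(char *buf, size_t len, uint32_t addr_be)
{
	return snprintf(buf, len, NIPQUAD_FMT, NIPQUAD(addr_be));
}
```

Keeping the format string next to the argument macro means every caller prints addresses the same way, which is the consistency the patch is after.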

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 14:29:07 -08:00
Per Liden
23b0ca5bf5 [PATCH] genetlink: don't touch module ref count
Increasing the module ref count at registration will block the module from
ever being unloaded. In fact, genetlink should not care about the owner at
all. This patch removes the owner field from the struct registered with
genetlink.

Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-13 13:06:40 -08:00
Harald Welte
2e4e6a17af [NETFILTER] x_tables: Abstraction layer for {ip,ip6,arp}_tables
This monster-patch tries to do the best job for unifying the data
structures and backend interfaces for the three evil clones ip_tables,
ip6_tables and arp_tables.  In an ideal world we would never have
allowed this kind of copy+paste programming... but well, our world
isn't (yet?) ideal.

o introduce a new x_tables module
o {ip,arp,ip6}_tables depend on this x_tables module
o registration functions for tables, matches and targets are only
  wrappers around x_tables provided functions
o all matches/targets that are used from ip_tables and ip6_tables
  are now implemented as xt_FOOBAR.c files and provide module aliases
  to ipt_FOOBAR and ip6t_FOOBAR
o header files for xt_matches are in include/linux/netfilter/,
  include/linux/netfilter_{ipv4,ipv6} contains compatibility wrappers
  around the xt_FOOBAR.h headers

Based on this patchset we're going to further unify the code,
gradually getting rid of all the layer 3 specific assumptions.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-12 14:06:43 -08:00
David S. Miller
880b005f29 [TIPC]: Fix 64-bit build warnings.
When storing u32 values in a pointer, need to do
some long casts to keep GCC happy.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-12 14:06:41 -08:00
Per Liden
593a5f22d8 [TIPC] More updates of file headers
Updated copyright notice to include the year the file was
actually created. Information about file creation dates
was extracted from the files in the old CVS repository
at tipc.sourceforge.net.

Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:39 -08:00
Per Liden
9da1c8b694 [TIPC] Update of file headers
The copyright statements from different parts of Ericsson
have been merged into one.

Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:38 -08:00
Per Liden
d0a14a9dbd [TIPC] Cleaned up info/warn/err macros
Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:37 -08:00
Per Liden
9ea1fd3c1a [TIPC] License header update
The license header in each file now more clearly states that this
code is licensed under a dual BSD/GPL. Before, this was only
evident if you looked at the MODULE_LICENSE line in core.c.

Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:36 -08:00
Per Liden
ea714ccda5 [TIPC] Moved configuration interface into tipc_config.h
Restored the old tipc_config.h to get a cleaner division between the
interfaces used by normal TIPC users and TIPC administration utilities.

Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:35 -08:00
Jon Maloy
b70e4f45a8 [TIPC] Fixed bug in disc_timeout()
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
2006-01-12 14:06:33 -08:00
Per Liden
1dba974333 [TIPC] Use dynamically allocated family id with NETLINK_GENERIC
Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:32 -08:00
Per Liden
b97bf3fd8f [TIPC] Initial merge
TIPC (Transparent Inter Process Communication) is a protocol designed for
intra cluster communication. For more information see
http://tipc.sourceforge.net

Signed-off-by: Per Liden <per.liden@nospam.ericsson.com>
2006-01-12 14:06:31 -08:00
Randy Dunlap
4fc268d24c [PATCH] capable/capability.h (net/)
net: Use <linux/capability.h> where capable() is used.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 18:42:14 -08:00
Adrian Bunk
bb7e8c5a55 [PKT_SCHED] net/sched/Kconfig: fix typo in NET_EMATCH_META description
Noted by Matt LaPlante <webmaster@cyberdogtech.com>.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:40:30 -08:00
Evgeniy Polyakov
54608b7099 [PKT_SCHED] ematch: Remove bogus include.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:16 -08:00
Evgeniy Polyakov
c3f343e4d7 [NET]: Fix diverter build.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:15 -08:00
Kris Katterjohn
8b3a70058b [NET]: Remove more unneeded typecasts on *malloc()
This removes more unneeded casts on the return value for kmalloc(),
sock_kmalloc(), and vmalloc().

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:14 -08:00
David Woodhouse
ae0f7d5f83 [IPV6]: Avoid calling ip6_xmit() with NULL sk
The ip6_xmit() function now assumes that its sk argument is non-NULL,
which isn't currently true when TCPv6 code is sending RST or ACK
packets. This fixes that code to use a socket of its own for sending
such packets, as TCPv4 does. (Thanks Andi for the pointer).

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:13 -08:00
David S. Miller
a776809755 [NETFILTER]: ip_ct_proto_gre_fini() cannot be __exit
It is invoked from failure paths of __init code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:12 -08:00
David S. Miller
82bf7e97ac [NET]: Some more missing include/etherdevice.h includes
For compare_ether_addr()

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-11 16:32:11 -08:00
David S. Miller
5bf887f2ff [IPV6]: Fix modular build with netfilter enabled.
Also, drop __exit marker from ipv6_netfilter_fini() as this
can be invoked from inet6_init() error handling paths.

Based upon a report from Stephen Hemminger.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 21:02:21 -08:00
Linus Torvalds
9819d85c21 Fix net/core/wireless.c link failure
It needs <linux/etherdevice.h> for compare_ether_addr()
2006-01-10 19:35:19 -08:00
Nicolas Kaiser
b8ab50bc55 netfilter: headers included twice
Headers included twice.

Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-01-11 02:04:35 +01:00
Bart De Schuymer
8a4c8a96a4 [EBTABLES] Don't match tcp/udp source/destination port for IP fragments
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 13:12:22 -08:00
Jesper Juhl
12fe2c588d [NET]: Remove unneeded kmalloc() return value casts
Get rid of needless casting of kmalloc() return value in net/

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 13:08:21 -08:00
Jesper Juhl
ea2e90dfce [RXRPC]: Decrease number of pointer derefs in connection.c
Decrease the number of pointer derefs in net/rxrpc/connection.c

Benefits of the patch:
 - Fewer pointer dereferences should make the code slightly faster.
 - Size of generated code is smaller
 - improved readability

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 13:07:44 -08:00
Martin Murray
ad8e4b75c8 [AF_NETLINK]: Fix DoS in netlink_rcv_skb()
From: Martin Murray <murrayma@citi.umich.edu>

Sanity check nlmsg_len during netlink_rcv_skb.  An nlmsg_len == 0 can
cause an infinite loop in the kernel, effectively DoSing the machine.
Noted by Martin Murray.
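
A minimal userspace sketch of the failure mode, not the kernel code: a
walker over back-to-back messages advances by each header's length field,
so a length of zero would leave the offset unchanged forever.  The struct
and function names below are hypothetical, and real netlink also rounds
lengths up for alignment, which is left out here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a netlink header; only the length matters here. */
struct msg_hdr {
    uint32_t len;   /* total length of this message, header included */
    uint32_t type;
};

/*
 * Walk a buffer of back-to-back messages and return how many were well
 * formed.  The walk stops at the first header whose length is smaller
 * than the header itself or larger than the bytes that remain.  Without
 * the first check, len == 0 would never advance the offset.
 */
static int count_messages(const unsigned char *buf, size_t buflen)
{
    size_t off = 0;
    int count = 0;

    while (buflen - off >= sizeof(struct msg_hdr)) {
        struct msg_hdr hdr;

        memcpy(&hdr, buf + off, sizeof(hdr));
        if (hdr.len < sizeof(hdr) || hdr.len > buflen - off)
            break;              /* malformed: refuse to loop on it */
        off += hdr.len;
        count++;
    }
    return count;
}
```

With two valid messages followed by a header whose length is zero, the
walker counts two and stops instead of spinning.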

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 13:02:29 -08:00
Patrick McHardy
babbdb1a18 [NETFILTER]: Fix timeout sysctls on big-endian 64bit architectures
The connection tracking timeout variables are unsigned long, but
proc_dointvec_jiffies is used with sizeof(unsigned int) in the sysctl
tables. Since there is no proc_doulongvec_jiffies function, change the
timeout variables to unsigned int.
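
Why the width mismatch only bites on big-endian 64-bit machines can be
sketched in portable C by modelling the byte order explicitly.  This is an
illustration under assumptions, not the sysctl code: the helper names are
hypothetical, and the 8-byte array stands in for an unsigned long timeout
that a handler updates as if it were 4 bytes wide.

```c
#include <stdint.h>

/* Store a 32-bit value into 4 bytes in big-endian order. */
static void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Load 8 bytes in big-endian order as a 64-bit value. */
static uint64_t load_be64(const unsigned char *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/*
 * Model a handler that believes the variable is 4 bytes wide writing
 * into storage that the rest of the code reads as 8 bytes.
 */
static uint64_t write_narrow_read_wide(uint32_t timeout)
{
    unsigned char storage[8] = {0};

    store_be32(storage, timeout);   /* lands in the high-order half */
    return load_be64(storage);
}
```

A timeout of 30 written this way reads back as 30 << 32.  On a
little-endian layout the same 4 bytes would be the low-order half, which
is why the problem stays hidden there.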

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:35 -08:00
Patrick McHardy
9d28026b7e [NETFILTER]: Remove unused function from NAT protocol helpers
->print and ->print_range are not used (and apparently never were).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:34 -08:00
Patrick McHardy
c07bc1ffbd [NETFILTER]: Fix return value confusion in PPTP NAT helper
ip_nat_mangle_tcp_packet doesn't return NF_* values but 0/1 for
failure/success.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:33 -08:00
Patrick McHardy
03b9feca89 [NETFILTER]: Fix another crash in ip_nat_pptp
The PPTP NAT helper calculates the offset at which the packet needs
to be mangled as difference between two pointers to the header. With
non-linear skbs however the pointers may point to two separate buffers
on the stack and the calculation results in a wrong offset being
used.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:32 -08:00
Patrick McHardy
15db34702c [NETFILTER]: Fix crash in ip_nat_pptp
When an inbound PPTP_IN_CALL_REQUEST packet is received the
PPTP NAT helper uses a NULL pointer in pointer arithmetic to
calculate the offset in the packet which needs to be mangled
and corrupts random memory or crashes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:30 -08:00
Patrick McHardy
bb94aa169e [NETFILTER]: net/ipv[46]/netfilter.c cleanups
Don't wrap entire file in #ifdef CONFIG_NETFILTER, remove a few
unnecessary includes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:29 -08:00
Kris Katterjohn
d3f4a687f6 [NET]: Change memcmp(,,ETH_ALEN) to compare_ether_addr()
This changes some memcmp(one,two,ETH_ALEN) to compare_ether_addr(one,two).
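
For readers unfamiliar with the helper, the idea can be sketched in
userspace as below.  This is an assumption-laden illustration, not the
kernel's definition: the function name is hypothetical, and the only
property relied on is that the result is zero for equal addresses and
nonzero otherwise, so it can stand in for memcmp() in equality tests but
not where ordering matters.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6  /* length of an Ethernet address in bytes */

/*
 * Compare two Ethernet addresses as three 16-bit words.
 * Returns 0 when they are equal and 1 otherwise.  Unlike memcmp(),
 * the result says nothing about which address sorts first.
 */
static unsigned int ether_addr_differs(const uint8_t *a, const uint8_t *b)
{
    uint16_t x[3], y[3];

    /* memcpy avoids assuming the inputs are 16-bit aligned */
    memcpy(x, a, ETH_ALEN);
    memcpy(y, b, ETH_ALEN);
    return ((x[0] ^ y[0]) | (x[1] ^ y[1]) | (x[2] ^ y[2])) != 0;
}
```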

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-10 12:54:28 -08:00
Linus Torvalds
4f47707b05 Fix rpc shutdown event condition bug
We want to wait for the cl_users to go down to zero, not for it to stay
positive.  Quoth Trond (who wasn't even the author, but acked the wrong
version): "Argh! I need to increase my daily caffeine dosages."

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:56:39 -08:00
Alan Cox
33f0f88f1c [PATCH] TTY layer buffering revamp
The API and code have been through various bits of initial review by
serial driver people but they definitely need to live somewhere for a
while so the unconverted drivers can get knocked into shape, existing
drivers that have been updated can be better tuned and bugs whacked out.

This replaces the tty flip buffers with kmalloc objects in rings. In the
normal situation for an IRQ driven serial port at typical speeds the
behaviour is pretty much the same, two buffers end up allocated and the
kernel cycles between them as before.

When there are delays or at high speed we now behave far better as the
buffer pool can grow a bit rather than lose characters. This also means
that we can operate at higher speeds reliably.

For drivers that receive characters in blocks (DMA based, USB and
especially virtualisation) the layer allows a lot of driver specific
code that works around the tty layer with private secondary queues to be
removed. The IBM folks need this sort of layer, the smart serial port
people do, the virtualisers do (because a virtualised tty typically
operates at infinite speed rather than emulating 9600 baud).

Finally many drivers had invalid and unsafe attempts to avoid buffer
overflows by directly invoking tty methods extracted out of the innards
of work queue structs. These are no longer needed and all go away. That
fixes various random hangs with serial ports on overflow.

The other change in here is to optimise the receive_room path that is
used by some callers. It turns out that only one ldisc uses receive room
except as a constant and it updates it far far less than the value is
read. We thus make it a variable not a function call.

I expect the code to contain bugs due to the size alone but I'll be
watching and squashing them and feeding out new patches as it goes.

Because the buffers now dynamically expand you should only run out of
buffering when the kernel runs out of memory for real.  That means a lot of
the horrible hacks high performance drivers used to do just aren't needed any
more.

Description:

tty_insert_flip_char is an old API and continues to work as before, as does
tty_flip_buffer_push() [this is why many drivers don't need modification].  It
does now also return the number of chars inserted.

There are also

tty_buffer_request_room(tty, len)

which asks for a buffer block of the length requested and returns the space
found.  This improves efficiency with hardware that knows how much to
transfer.

and tty_insert_flip_string_flags(tty, str, flags, len)

to insert a string of characters and flags

For a smart interface the usual code is

    len = tty_buffer_request_room(tty, amount_hardware_says);
    tty_insert_flip_string(tty, buffer_from_card, len);

More description!

At the moment tty buffers are attached directly to the tty.  This is causing a
lot of the problems related to tty layer locking, also problems at high speed
and also with bursty data (such as occurs in virtualised environments)

I'm working on ripping out the flip buffers and replacing them with a pool of
dynamically allocated buffers.  This allows both for old style "byte I/O"
devices and also helps virtualisation and smart devices where large blocks of
data suddenly materialise and need storing.

So far so good.  Lots of drivers reference tty->flip.*.  Several of them also
call directly and unsafely into function pointers it provides.  This will all
break.  Most drivers can use tty_insert_flip_char which can be kept as an API
but others need more.

At the moment I've added the following interfaces, if people think more will
be needed now is a good time to say

 int tty_buffer_request_room(tty, size)

Try and ensure at least size bytes are available, returns actual room (may be
zero).  At the moment it just uses the flipbuf space but that will change.
Repeated calls without characters being added are not cumulative (i.e. if you
call it with 1, 1, 1, and then 4 you'll have four characters of space).  The
other functions will also try and grow buffers in future but this will be a
more efficient way when you know block sizes.

 int tty_insert_flip_char(tty, ch, flag)

As before insert a character if there is room.  Now returns 1 for success, 0
for failure.

 int tty_insert_flip_string(tty, str, len)

Insert a block of non error characters.  Returns the number inserted.

 int tty_prepare_flip_string(tty, strptr, len)

Adjust the buffer to allow len characters to be added.  Returns a buffer
pointer in strptr and the length available.  This allows for hardware that
needs to use functions like insl or memcpy_fromio.
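
The request-then-insert pattern described above can be tried out in
userspace with a small mock.  Everything here is hypothetical (struct
mock_tty, mock_request_room, mock_insert_string); it only mirrors the
described semantics: room requests grow the buffer on demand, repeated
requests are not cumulative, and insertion returns the number of
characters stored.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for a tty receive buffer. */
struct mock_tty {
    char *data;
    size_t used;    /* bytes already inserted */
    size_t size;    /* bytes allocated */
};

/*
 * Try to ensure room for `want` more bytes and return the room that is
 * actually available, capped at `want` (0 if allocation failed).
 * Asking again without inserting does not add up.
 */
static size_t mock_request_room(struct mock_tty *t, size_t want)
{
    size_t room;

    if (t->size - t->used < want) {
        char *p = realloc(t->data, t->used + want);

        if (p) {
            t->data = p;
            t->size = t->used + want;
        }
    }
    room = t->size - t->used;
    return room < want ? room : want;
}

/* Insert up to `len` bytes and return how many were stored. */
static size_t mock_insert_string(struct mock_tty *t, const char *s, size_t len)
{
    size_t room = mock_request_room(t, len);

    if (room)
        memcpy(t->data + t->used, s, room);
    t->used += room;
    return room;
}
```

Requesting 1, 1, 1 and then 4 bytes on an empty mock leaves four bytes of
space, matching the non-cumulative behaviour described for
tty_buffer_request_room().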

Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:59 -08:00
Ingo Molnar
532347e2bb [PATCH] nfs: sleep_on() removal
Convert sleep_on() to wait_event_timeout().  Probably safe with the BKL but
could be racy once BKL use in NFS-client is gone.

Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:42 -08:00
Andrey Borzenkov
6dd214b554 [PATCH] fix /sys/class/net/<if>/wireless without dev->get_wireless_stats
dev->get_wireless_stats is deprecated but removing it also removes wireless
subdirectory in sysfs. This patch puts it back.

akpm: I don't know what's happening here.  This might be appropriate as a
2.6.15.x compatibility backport.  Waiting to hear from Jeff.

Signed-off-by: Andrey Borzenkov <arvidjaar@mail.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:24 -08:00
Linus Torvalds
80c0531514 Merge master.kernel.org:/pub/scm/linux/kernel/git/mingo/mutex-2.6 2006-01-09 17:31:38 -08:00
Linus Torvalds
a457aa6c2b Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial 2006-01-09 17:06:53 -08:00
Jes Sorensen
1b1dcc1b57 [PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_sem
This patch converts the inode semaphore to a mutex. I have tested it on
XFS and compiled as much as one can consider on an ia64. Anyway your
luck with it might be different.

Modified-by: Ingo Molnar <mingo@elte.hu>

(finished the conversion)

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2006-01-09 15:59:24 -08:00
Adrian Bunk
93b1fae491 spelling: s/trough/through/
Additionally, one comment was reformulated by Joe Perches <joe@perches.com>.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-01-10 00:13:33 +01:00
Arnaldo Carvalho de Melo
dff2c03534 [INET_DIAG]: Introduce sk_diag_fill
To be called from inet_diag_get_exact, also rename inet_diag_fill to
inet_csk_diag_fill, for consistency with inet_twsk_diag_fill.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:56:56 -08:00
Arnaldo Carvalho de Melo
c7d58aabdc [INET_DIAG]: Introduce inet_twsk_diag_dump & inet_twsk_diag_fill
To properly dump TIME_WAIT sockets and to reduce complexity a bit by
having per socket class accessor routines.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:56:38 -08:00
Arnaldo Carvalho de Melo
4e852c0279 [INET_DIAG]: whitespace/simple cleanups
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:56:19 -08:00
Arnaldo Carvalho de Melo
7dbf075524 [INET_DIAG]: Use inet_twsk() with TIME_WAIT sockets
The fields being accessed in inet_diag_dump are outside sock_common, the
common part of struct sock and struct inet_timewait_sock.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:56:03 -08:00
Patrick McHardy
a2c2064f7f [IPV6]: Set skb->priority in ip6_output.c
Set skb->priority = sk->sk_priority as in raw.c and IPv4.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:31 -08:00
Patrick McHardy
cfacb0577e [IPV4]: ip_output.c needs xfrm.h
This patch fixes a warning from my IPsec patches:

   CC      net/ipv4/ip_output.o
net/ipv4/ip_output.c: In function 'ip_finish_output':
net/ipv4/ip_output.c:208: warning: implicit declaration of function
'xfrm4_output_finish'

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:28 -08:00
Jamal Hadi Salim
29f1df6cc1 [PKT_SCHED]: Fix qdisc return code.
The mapping between TC_ACTION_SHOT and the qdisc return codes is better
suited to NET_XMIT_BYPASS so as not to confuse TCP

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:26 -08:00
Kris Katterjohn
09a626600b [NET]: Change some "if (x) BUG();" to "BUG_ON(x);"
This changes some simple "if (x) BUG();" statements to "BUG_ON(x);"

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:18 -08:00
Patrick McHardy
4bba392592 [PKT_SCHED]: Prefix tc actions with act_
Clean up the net/sched directory a bit by prefix all actions with act_.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:14 -08:00
Patrick McHardy
541673c859 [PKT_SCHED]: Fix memory leak when dumping in pedit action
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:12 -08:00
Patrick McHardy
31bd06eb33 [PKT_SCHED]: Remove some obsolete policer exports
Also make sure the legacy code is only built when CONFIG_NET_CLS_ACT
is not set.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:10 -08:00
Patrick McHardy
f43c5a0df3 [PKT_SCHED]: Convert tc action functions to single skb pointers
tcf_action_exec only gets a single skb pointer and doesn't own the skb,
but passes double skb pointers (to a local variable) to the action
functions. Change to use single skb pointers everywhere.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:08 -08:00
Patrick McHardy
538e43a4bd [PKT_SCHED]: Use USEC_PER_SEC
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:05 -08:00
Patrick McHardy
2941a48631 [NET]: Convert net/{ipv4,ipv6,sched} to netdev_priv
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:16:03 -08:00
Linus Torvalds
cf10b2853f Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2006-01-09 09:39:05 -08:00
Kirill Korotaev
14591de147 [PATCH] netlink oops fix due to incorrect error code
Fixed oops after failed netlink socket creation.

Wrong parentheses in if() statement caused err to be 1,
instead of negative value.

Trivial fix, not trivial to find though.
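
One common way misplaced parentheses produce an err of 1 is sketched
below.  It is an illustration only: create_thing() is a hypothetical
helper, and the exact expression in the patched code is not reproduced
here.  Because '<' binds tighter than '=', the buggy form assigns the
result of the comparison rather than the return value.

```c
/* Hypothetical helper that fails with a negative errno-style code. */
static int create_thing(void)
{
    return -12;
}

static int buggy(void)
{
    int err;

    /* err receives (create_thing() < 0), i.e. 1, not the error code */
    if ((err = create_thing() < 0))
        return err;
    return 0;
}

static int fixed(void)
{
    int err;

    /* the assignment is parenthesised first, then compared */
    if ((err = create_thing()) < 0)
        return err;
    return 0;
}
```

A caller that treats only negative values as failure sees the 1 from
buggy() as success and carries on with an object that was never created.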

Signed-Off-By: Dmitry Mishin <dim@sw.ru>
Signed-Off-By: Kirill Korotaev <dev@openvz.org>
Signed-Off-By: Linus Torvalds <torvalds@osdl.org>
2006-01-09 09:36:52 -08:00
Johannes Berg
a4bf26f30e [PATCH] ieee80211: enable hw wep where host has to build IV
This patch fixes some of the ieee80211 crypto related code so that
instead of having the host fully do crypto operations, the host_build_iv
flag works properly (for WEP in this patch) which, if turned on,
requires the hardware to do all crypto operations, but the ieee80211
layer builds the IV. The hardware also has to build the ICV.

Previously, the host_build_iv flag couldn't be used at all for WEP, and
not alone (with both host_decrypt and host_encrypt disabled) because the
crypto algorithm wasn't assigned. This is also fixed.

I have tested this patch both in host crypto mode and in hw crypto mode
(with the Broadcom chipset).

[resent, signing digitally caused it to be MIME-junked, sorry]

Signed-Off-By: Johannes Berg <johannes@sipsolutions.net>

Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2006-01-09 10:34:25 -05:00
Matt Mackall
18e92b12e8 [PATCH] tiny: Trim non-IPX builds
trivial: drop unused 802.3 code if we compile without IPX

(originally from http://wohnheim.fh-wedel.de/~joern/software/kernel/je/25/)

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:14:10 -08:00
Eric Dumazet
5160ee6fc8 [PATCH] shrink dentry struct
Some long time ago, dentry struct was carefully tuned so that on 32 bits
UP, sizeof(struct dentry) was exactly 128, ie a power of 2, and a multiple
of memory cache lines.

Then RCU was added and dentry struct enlarged by two pointers, with nice
results for SMP, but not so good on UP, because breaking the above tuning
(128 + 8 = 136 bytes)

This patch reverts this unwanted side effect, by using an union (d_u),
where d_rcu and d_child are placed so that these two fields can share their
memory needs.

At the time d_free() is called (and d_rcu is really used), d_child is known
to be empty and not touched by the dentry freeing.

Lockless lookups only access d_name, d_parent, d_lock, d_op, d_flags (so
the previous content of d_child is not needed if said dentry was unhashed
but still accessed by a CPU because of RCU constraints)

As dentry cache easily contains millions of entries, a size reduction is
worth the extra complexity of the ugly C union.
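
The space saving from overlapping two fields that are never live at the
same time can be sketched with simplified stand-in types.  These structs
are hypothetical and far smaller than the real dentry; they only show
that a union costs the size of its largest member rather than the sum.

```c
#include <stddef.h>

/* Simplified stand-ins: a list link and an RCU callback head. */
struct list_node {
    struct list_node *next, *prev;
};

struct rcu_node {
    struct rcu_node *next;
    void (*func)(struct rcu_node *);
};

/* Layout with both fields present side by side. */
struct entry_separate {
    int flags;
    struct list_node child;
    struct rcu_node rcu;
};

/* Layout where the two fields share storage, as with d_u. */
struct entry_shared {
    int flags;
    union {
        struct list_node child; /* used while the entry is live */
        struct rcu_node rcu;    /* used only once it is being freed */
    } u;
};

/* Bytes saved per entry by sharing the storage. */
static size_t bytes_saved(void)
{
    return sizeof(struct entry_separate) - sizeof(struct entry_shared);
}
```

The trade-off is the one the message names: the code must guarantee that
the child link is no longer needed by the time the RCU head is written.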

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Paul Jackson <pj@sgi.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@epoch.ncsc.mil>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:13:58 -08:00
Pekka Enberg
f9f7500521 [PATCH] slab: remove unused align parameter from alloc_percpu
__alloc_percpu and alloc_percpu both take an 'align' argument which is
completely ignored.  snmp6_mib_init() in net/ipv6/af_inet6.c attempts to use
it, but it will be ignored.  Therefore, remove the 'align' argument and fixup
the lone caller.

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:12:39 -08:00
Adrian Bunk
9f5336e218 [IPV6]: small cleanups
This patch contains the following cleanups:
- addrconf.c: make addrconf_dad_stop() static
- inet6_connection_sock.c should #include <net/inet6_connection_sock.h>
  for getting the prototypes of its global functions

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 13:24:25 -08:00
Adrian Bunk
97dc627fb3 [IPV4]: make ip_fragment() static
Since there's no longer any external user of ip_fragment() we can make 
it static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 13:23:39 -08:00
Joe Kappus
da7bc6ee8e [NETFILTER]: ip_conntrack_proto_sctp.c needs linux/interrupt.h
Signed-off-by: Joe Kappus <joecool1029@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:41 -08:00
Patrick McHardy
e16a8f0b8c [NETFILTER]: Add ipt_policy/ip6t_policy matches
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:38 -08:00
Patrick McHardy
eb9c7ebe69 [NETFILTER]: Handle NAT in IPsec policy checks
Handle NAT of decapsulated IPsec packets by reconstructing the struct flowi
of the original packet from the conntrack information for IPsec policy
checks.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:37 -08:00
Patrick McHardy
b59c270104 [NETFILTER]: Keep conntrack reference until IPsec policy checks are done
Keep the conntrack reference until policy checks have been performed for
IPsec NAT support. The reference needs to be dropped before a packet is
queued to avoid having the conntrack module unloadable.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:36 -08:00
Patrick McHardy
5c901daaea [NETFILTER]: Redo policy lookups after NAT when neccessary
When NAT changes the key used for the xfrm lookup it needs to be done
again. If a new policy is returned in POST_ROUTING the packet needs
to be passed to xfrm4_output_one manually after all hooks were called
because POST_ROUTING is called with fixed okfn (ip_finish_output).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:35 -08:00
Patrick McHardy
4e8e9de7c2 [NETFILTER]: Use conntrack information to determine if packet was NATed
Preparation for IPsec support for NAT:
Use conntrack information instead of saving and comparing the
addresses to determine if a packet was NATed and needs to be rerouted to
make it easier to extend the key.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:34 -08:00
Patrick McHardy
3e3850e989 [NETFILTER]: Fix xfrm lookup in ip_route_me_harder/ip6_route_me_harder
ip_route_me_harder doesn't use the port numbers of the xfrm lookup and
uses ip_route_input for non-local addresses which doesn't do a xfrm
lookup, ip6_route_me_harder doesn't do a xfrm lookup at all.

Use xfrm_decode_session and do the lookup manually, make sure both
only do the lookup if the packet hasn't been transformed already.

Making sure the lookup only happens once needs a new field in the
IP6CB, which exceeds the size of skb->cb. The size of skb->cb is
increased to 48b. Apparently the IPv6 mobile extensions need some
more room anyway.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:33 -08:00
Patrick McHardy
8cdfab8a43 [IPV4]: reset IPCB flags when neccessary
Reset IPSKB_XFRM_TUNNEL_SIZE flags in ipip and ip_gre hard_start_xmit
function before the packet reenters IP. This is necessary so the
encapsulated packets are checked not to be oversized in xfrm4_output.c
again. Reset all flags in sit when a packet changes its address family.

Also remove some obsolete IPSKB flags.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:32 -08:00
Patrick McHardy
b05e106698 [IPV4/6]: Netfilter IPsec input hooks
When the innermost transform uses transport mode the decapsulated packet
is not visible to netfilter. Pass the packet through the PRE_ROUTING and
LOCAL_IN hooks again before handing it to upper layer protocols to make
netfilter-visibility symmetrical to the output path.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:31 -08:00
Patrick McHardy
951dbc8ac7 [IPV6]: Move nextheader offset to the IP6CB
Move nextheader offset to the IP6CB to make it possible to pass a
packet to ip6_input_finish multiple times and have it skip already
parsed headers. As a nice side effect this gets rid of the manual
hopopts skipping in ip6_input_finish.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:29 -08:00
Patrick McHardy
16a6677fdf [XFRM]: Netfilter IPsec output hooks
Call netfilter hooks before IPsec transforms. Packets visit the
FORWARD/LOCAL_OUT and POST_ROUTING hook before the first encapsulation
and the LOCAL_OUT and POST_ROUTING hook before each following tunnel mode
transform.

Patch from Herbert Xu <herbert@gondor.apana.org.au>:

Move the loop from dst_output into xfrm4_output/xfrm6_output since they're
the only ones who need it. xfrm{4,6}_output_one() processes the first SA
and all subsequent transport mode SAs and is called in a loop that calls the
netfilter hooks between each two calls.

In order to avoid the tail call issue, I've added the inline function
nf_hook which is nf_hook_slow plus the empty list check.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:28 -08:00
David S. Miller
aa0e4e4aea [DCCP]: ipv6.c needs net/ip6_checksum.c
Reported by Dave Jones.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-07 12:57:26 -08:00
Linus Torvalds
d8d8f6a4fd Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2006-01-06 15:24:28 -08:00
Linus Torvalds
57d1c91fa6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild 2006-01-06 15:23:56 -08:00
Alexey Dobriyan
a2167dc62e [NET]: Endian-annotate in_aton()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:24:54 -08:00
Alexey Dobriyan
76ab608d86 [NET]: Endian-annotate struct iphdr
And fix trivial warnings that emerged.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:24:29 -08:00
Trent Jaeger
5f8ac64b15 [LSM-IPSec]: Corrections to LSM-IPSec Nethooks
This patch contains two corrections to the LSM-IPsec Nethooks patches
previously applied.  

(1) free a security context on a failed insert via xfrm_user 
interface in xfrm_add_policy.  Memory leak.

(2) change the authorization of the allocation of a security context
in a xfrm_policy or xfrm_state from both relabelfrom and relabelto 
to setcontext.

Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:22:39 -08:00
Luiz Capitulino
69549ddd2f [PKTGEN]: Adds missing __init.
pktgen_find_thread() and pktgen_create_thread() are only called at
initialization time.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:19:31 -08:00
Joe
3cbc4ab58f [NETFILTER]: ipt_helper.c needs linux/interrupt.h
From: Joe <joecool1029@gmail.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:15:11 -08:00
Stephen Hemminger
ee02b3a613 [BRIDGE] netfilter: vlan + hw checksum = bug?
It looks like the bridge netfilter code does not correctly update
the hardware checksum after popping off the VLAN header.

This is by inspection, I have *not* tested this.
To test you would need to set up a filtering bridge with vlans
and a device that does hardware receive checksum (skge, or sungem)

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:13:29 -08:00
Shaun Pereira
a20a855479 [X25]: Fix for broken x25 module.
When a user-space server application calls bind on a socket, then in kernel
space this bound socket is considered 'x25-linked' and the SOCK_ZAPPED flag
is unset.(As in x25_bind()/af_x25.c).

Now when a user-space client application attempts to connect to the server
on the listening socket, if the kernel accepts this in-coming call, then it
returns a new socket to userland and attempts to reply to the caller.

The reply/x25_sendmsg() will fail, because the new socket created on
call-accept has its SOCK_ZAPPED flag set by x25_make_new().
(sock_init_data() called by x25_alloc_socket() called by x25_make_new()
sets the flag to SOCK_ZAPPED)).

Fix: Using the sock_copy_flag() routine available in sock.h fixes this.

Tested on 32 and 64 bit kernels with x25 over tcp.

Signed-off-by: Shaun Pereira <pereira.shaun@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:11:35 -08:00
Kris Katterjohn
4bad4dc919 [NET]: Change sk_run_filter()'s return type in net/core/filter.c
It should return an unsigned value, and fix sk_filter() as well.

Signed-off-by: Kris Katterjohn <kjak@ispwest.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:08:20 -08:00
Kris Katterjohn
dbbc098828 [NET]: Use newer is_multicast_ether_addr() in some files
This uses is_multicast_ether_addr() because it has recently been
changed to do the same thing these separate tests are doing.
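
As background, a multicast test on an Ethernet address comes down to the
group bit, the least significant bit of the first octet.  The helper below
is a hypothetical userspace sketch of such a test, not the kernel's
is_multicast_ether_addr() itself; note that the broadcast address also has
this bit set.

```c
#include <stdint.h>

/*
 * Return nonzero when the group (multicast) bit is set, i.e. the least
 * significant bit of the first octet.  Broadcast ff:ff:ff:ff:ff:ff
 * counts as multicast under this test.
 */
static int is_multicast_addr(const uint8_t *addr)
{
    return addr[0] & 0x01;
}
```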

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-06 13:05:58 -08:00
Sam Ravnborg
367cb70421 kbuild: un-stringnify KBUILD_MODNAME
Now when kbuild passes KBUILD_MODNAME with "" do not __stringify it when
used. Remove __stringify for all users.
This also fixes the output of:

$ ls -l /sys/module/
drwxr-xr-x 4 root root 0 2006-01-05 14:24 pcmcia
drwxr-xr-x 4 root root 0 2006-01-05 14:24 pcmcia_core
drwxr-xr-x 3 root root 0 2006-01-05 14:24 "processor"
drwxr-xr-x 3 root root 0 2006-01-05 14:24 "psmouse"

The quoting of the module names will be gone again.
Thanks to GregKH + Kay Sievers for reporting this.
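
How the stray quotes arise can be reproduced with the preprocessor alone.
The macro names below are hypothetical stand-ins for __stringify and
KBUILD_MODNAME; the point is that stringifying a token that is already a
string literal keeps its quote characters in the result.

```c
#include <string.h>

/* Two-level stringification, so macro arguments are expanded first. */
#define MY_STRINGIFY_1(x) #x
#define MY_STRINGIFY(x)   MY_STRINGIFY_1(x)

#define MODNAME_BARE   processor    /* name passed as a bare token */
#define MODNAME_QUOTED "processor"  /* name passed already quoted */

/* Bare token, stringified once: processor */
static const char *name_from_bare(void)
{
    return MY_STRINGIFY(MODNAME_BARE);
}

/* Quoted token, stringified again: "processor" with literal quotes */
static const char *name_double_quoted(void)
{
    return MY_STRINGIFY(MODNAME_QUOTED);
}

/* Quoted token used directly: processor */
static const char *name_direct(void)
{
    return MODNAME_QUOTED;
}
```

Once the name arrives already quoted, using it directly gives the clean
module name; stringifying it again yields the quoted directory names shown
in the listing above.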

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2006-01-06 21:17:50 +01:00
J. Bruce Fields
9e56904e41 SUNRPC: Make krb5 report unsupported encryption types
Print messages when an unsupported encryption algorithm is requested or
 there is an error locating a supported algorithm.

 Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:59:00 -05:00
J. Bruce Fields
42181d4baf SUNRPC: Make spkm3 report unsupported encryption types
Print messages when an unsupported encryption algorithm is requested or
 there is an error locating a supported algorithm.

 Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:59 -05:00
J. Bruce Fields
9eed129bbd SUNRPC: Update the spkm3 code to use the make_checksum interface
Also update the tokenlen calculations to accommodate g_token_size().

 Signed-off-by: Andy Adamson <andros@citi.umich.edu>
 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:59 -05:00
Trond Myklebust
0065db3285 SUNRPC: Clean up xprt_destroy()
We ought never to be calling xprt_destroy() if there are still active
 rpc_tasks. Optimise away the broken code that attempts to "fix" that case.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:58 -05:00
Trond Myklebust
632e3bdc50 SUNRPC: Ensure client closes the socket when server initiates a close
If the server decides to close the RPC socket, we currently don't actually
 respond until either another RPC call is scheduled, or until xprt_autoclose()
 gets called by the socket expiry timer (which may be up to 5 minutes
 later).

 This patch ensures that xprt_autoclose() is called much sooner if the
 server closes the socket.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:57 -05:00
Chuck Lever
f518e35aec SUNRPC: get rid of cl_chatty
Clean up: Every ULP that uses the in-kernel RPC client, except the NLM
 client, sets cl_chatty.  There's no reason why NLM shouldn't set it, so
 just get rid of cl_chatty and always be verbose.

 Test-plan:
 Compile with CONFIG_NFS enabled.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:56 -05:00
Chuck Lever
922004120b SUNRPC: transport switch API for setting port number
At some point, transport endpoint addresses will no longer be IPv4.  To hide
 the structure of the rpc_xprt's address field from ULPs and port mappers,
 add an API for setting the port number during an RPC bind operation.

 Test-plan:
 Destructive testing (unplugging the network temporarily).  Connectathon
 with UDP and TCP.  NFSv2/3 and NFSv4 mounting should be carefully checked.
 Probably need to rig a server where certain services aren't running, or
 that returns an error for some typical operation.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:56 -05:00
Chuck Lever
35f5a422ce SUNRPC: new interface to force an RPC rebind
We'd like to hide fields in rpc_xprt and rpc_clnt from upper layer protocols.
 Start by creating an API to force RPC rebind, replacing logic that simply
 sets cl_port to zero.

 Test-plan:
 Destructive testing (unplugging the network temporarily).  Connectathon
 with UDP and TCP.  NFSv2/3 and NFSv4 mounting should be carefully checked.
 Probably need to rig a server where certain services aren't running, or
 that returns an error for some typical operation.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:56 -05:00
Chuck Lever
0210714834 SUNRPC: switchable buffer allocation
Add RPC client transport switch support for replacing buffer management
 on a per-transport basis.

 In the current IPv4 socket transport implementation, RPC buffers are
 allocated as needed for each RPC message that is sent.  Some transport
 implementations may choose to use pre-allocated buffers for encoding,
 sending, receiving, and unmarshalling RPC messages, however.  For
 transports capable of direct data placement, the buffers can be carved
 out of a pre-registered area of memory rather than from a slab cache.

 Test-plan:
 Millions of fsx operations.  Performance characterization with "sio" and
 "iozone".  Use oprofile and other tools to look for significant regression
 in CPU utilization.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:55 -05:00
Adrian Bunk
fb459f45f7 SUNRPC: net/sunrpc/xdr.c: remove xdr_decode_string()
This patch removes the unused function xdr_decode_string().

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Neil Brown <neilb@suse.de>
Acked-by: Charles Lever <Charles.Lever@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:53 -05:00
Trond Myklebust
969b7f2522 SUNRPC: Fix a potential race in rpc_pipefs.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:51 -05:00
Trond Myklebust
2bd615797e SUNRPC: Ensure that SIGKILL will always terminate a synchronous RPC call.
...and make sure that the "intr" flag also enables SIGHUP and SIGTERM to
 interrupt RPC calls too (as per the Solaris implementation).

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:45 -05:00
Trond Myklebust
e60859ac0e SUNRPC: rpc_execute should not return task->tk_status;
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:42 -05:00
Trond Myklebust
89991c24e4 SUNRPC: Get rid of some unused exports
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:41 -05:00
Trond Myklebust
44c288732f NFSv4: stateful NFSv4 RPC call interface
The NFSv4 model requires us to complete all RPC calls that might
 establish state on the server whether or not the user wants to
 interrupt it. We may also need to schedule new work (including
 new RPC calls) in order to cancel the new state.

 The asynchronous RPC model will allow us to ensure that RPC calls
 always complete, but in order to allow for "synchronous" RPC, we
 want to add the ability to wait for completion.
 The waits are, of course, interruptible.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:40 -05:00
Trond Myklebust
4ce70ada1f SUNRPC: Further cleanups
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:40 -05:00
Trond Myklebust
963d8fe533 RPC: Clean up RPC task structure
Shrink the RPC task structure. Instead of storing separate pointers
 for task->tk_exit and task->tk_release, put them in a structure.

 Also pass the user data pointer as a parameter instead of passing it via
 task->tk_calldata. This enables us to nest callbacks.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:39 -05:00
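The restructuring described in 963d8fe533 can be sketched roughly as follows. This is a simplified userspace illustration, not the kernel code: struct task, struct call_ops, task_finish() and the counting callbacks are hypothetical names chosen for the example. It shows the two callbacks grouped in one ops structure, with the user data passed as an explicit parameter.

```c
#include <stddef.h>

struct task;

/* Hypothetical ops table: one pointer in the task instead of two. */
struct call_ops {
	void (*done)(struct task *task, void *calldata);
	void (*release)(void *calldata);
};

struct task {
	int status;
	const struct call_ops *ops;
	void *calldata;
};

/* Run the completion callback, then the release callback.
 * The user data is handed over as an argument, so a callback can wrap
 * another callback and pass it different data. */
static void task_finish(struct task *task)
{
	if (task->ops->done)
		task->ops->done(task, task->calldata);
	if (task->ops->release)
		task->ops->release(task->calldata);
}

/* Example callbacks that only count how often they were invoked. */
struct counters {
	int done;
	int released;
};

static void count_done(struct task *task, void *calldata)
{
	(void)task;
	((struct counters *)calldata)->done++;
}

static void count_release(void *calldata)
{
	((struct counters *)calldata)->released++;
}

static const struct call_ops counting_ops = {
	.done = count_done,
	.release = count_release,
};

static struct counters example_counters;
static struct task example_task = {
	.status = 0,
	.ops = &counting_ops,
	.calldata = &example_counters,
};
```

Calling task_finish(&example_task) once increments both counters once.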
Trond Myklebust
abbcf28f23 SUNRPC: Yet more RPC cleanups
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:39 -05:00
Olaf Kirch
93fbf1a5de [PATCH] Keep nfsd from exiting when seeing recv() errors
I submitted this one previously - svc_tcp_recvfrom currently returns
any errors to the caller, including ECONNRESET and the like.

This is something svc_recv isn't able to deal with:

	len = svsk->sk_recvfrom(rqstp);
	[...]
	if (len == 0 || len == -EAGAIN) {
		[...]
		return -EAGAIN;
	}

	[...]
	return len;

The nfsd main loop will exit when it sees an error code other than
EAGAIN.

The following patch fixes this problem

svc_recv is not equipped to deal with error codes other than EAGAIN,
and will propagate anything else (such as ECONNRESET) up to nfsd,
causing it to exit.

Signed-off-by: Olaf Kirch <okir@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:59 -08:00
NeilBrown
1f1e030bf7 [PATCH] knfsd: fix hash function for IP addresses on 64bit little-endian machines.
The hash.h hash_long function, when used on a 64 bit machine, ignores many
of the middle-order bits.  (The prime chosen is too bit-sparse).

IP addresses for clients of an NFS server are very likely to differ only in
the low-order bits.  As addresses are stored in network-byte-order, these
bits become middle-order bits in a little-endian 64bit 'long', and so do
not contribute to the hash.  Thus you can have the situation where all
clients appear on one hash chain.

So, until hash_long is fixed (or maybe forever), use a hash function that
works well on IP addresses - xor the bytes together.

Thanks to "Iozone" <capps@iozone.org> for identifying this problem.

Cc: "Iozone" <capps@iozone.org>

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:21 -08:00
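A minimal sketch of the "xor the bytes together" idea from 1f1e030bf7, assuming an IPv4 address held in a 32-bit integer. The function name ip_xor_hash() is hypothetical and the code is an illustration, not the kernel implementation. Because every byte is folded into the result, the hash does not depend on byte order, and addresses that differ only in one byte land in different buckets.

```c
#include <stdint.h>

/* Fold the four bytes of an IPv4 address into a single byte (0..255). */
static unsigned int ip_xor_hash(uint32_t addr)
{
	uint32_t h = addr ^ (addr >> 16);	/* fold upper half into lower half */

	h ^= h >> 8;				/* fold the remaining two bytes */
	return h & 0xffu;
}
```

For example, 10.0.0.1 and 10.0.0.2 hash to 0x0b and 0x08, whichever byte order the address is stored in.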
Kris Katterjohn
46f25dffba [NET]: Change 1500 to ETH_DATA_LEN in some files
These patches add the header linux/if_ether.h and change 1500 to
ETH_DATA_LEN in some files.

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 16:48:56 -08:00
Andrew Morton
e924283bf9 [IPVS]: Another file needs linux/interrupt.h
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 16:48:55 -08:00
Yasuyuki Kozakai
e8eaedf2f8 [NETFILTER]: Use HOPLIMIT metric as TTL of TCP reset sent by REJECT
The HOPLIMIT metric is more appropriate for the TCP reset sent by the
REJECT target than a hard-coded max TTL. Thanks to David S. Miller for the hint.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:28:57 -08:00
Patrick McHardy
0ae2cfe7f3 [NETFILTER]: nf_conntrack_l3proto_ipv4.c needs net/route.h
CC [M]  net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.o
net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c: In function 'ipv4_refrag':
net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c:198: error: dereferencing pointer to incomplete type
make[3]: *** [net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.o] Error 1

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:21:52 -08:00
Patrick McHardy
22dea562bb [NETFILTER]: Export ip6_masked_addrcmp, don't pass IPv6 addresses on stack
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:21:34 -08:00
Patrick McHardy
b777e0ce74 [NETFILTER]: make ipv6_find_hdr() find transport protocol header
The original ipv6_find_hdr() finds the specified header in IPv6 packets.
This makes it possible to get transport header so that we can kill similar
loop in ip6_match_packet().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:21:16 -08:00
Patrick McHardy
1bd9bef6f9 [NETFILTER]: Call POST_ROUTING hook before fragmentation
Call POST_ROUTING hook before fragmentation to get rid of the okfn use
in ip_refrag and save the useless fragmentation/defragmentation step
when NAT is used.

The patch introduces one user-visible change, the POSTROUTING chain
in the mangle table gets entire packets, not fragments, which should
simplify use of the MARK and CLASSIFY targets for queueing as a nice
side-effect.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:20:59 -08:00
Patrick McHardy
abbcc73982 [NETFILTER]: Remove okfn usage in ip_vs_core.c
okfn should only be used from different contexts to avoid deep call chains,
i.e. by nf_queue.

Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:20:40 -08:00
Patrick McHardy
a9b305c4e5 [NETFILTER]: ctnetlink: Fix dumping of helper name
Properly dump the helper name instead of internal kernel data.
Based on patch by Marcus Sundberg <marcus@ingate.com>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:20:02 -08:00
Patrick McHardy
e7be6994ec [NETFILTER]: Fix module_param types and permissions
Fix netfilter module_param types and permissions. Also fix an off-by-one in
the ipt_ULOG nlbufsiz < 128k check.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:19:46 -08:00
Pablo Neira Ayuso
87711cb81c [NETFILTER]: Filter dumped entries based on the layer 3 protocol number
Dump entries of a given Layer 3 protocol number.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:19:23 -08:00
Pablo Neira Ayuso
c1d10adb4a [NETFILTER]: Add ctnetlink port for nf_conntrack
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:19:05 -08:00
Pablo Neira Ayuso
205d67c7d9 [NETFILTER]: ctnetlink: remove unused variable
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:18:44 -08:00
Pablo Neira Ayuso
d4d6bb41e0 [NETFILTER]: ctnetlink: fix conntrack mark race
Set conntrack mark before it is in hashes.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:18:25 -08:00
Pablo Neira Ayuso
0368309cb4 [NETFILTER]: ctnetlink: ctnetlink_event cleanup
Cleanup: Use 'else if' instead of an ugly 'goto' statement.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:18:08 -08:00
Pablo Neira Ayuso
47116eb201 [NETFILTER]: ctnetlink: use u_int32_t instead of unsigned int
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:17:50 -08:00
Pablo Neira Ayuso
984955b3d7 [NETFILTER]: ctnetlink: propagate ctnetlink_dump_tuples_proto return value back
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:17:29 -08:00
Yasuyuki Kozakai
90c4656eb4 [NETFILTER]: ctnetlink: Add sanity checkings for ICMP
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:17:03 -08:00
Pablo Neira Ayuso
684f7b296c [NETFILTER]: ctnetlink: remove bogus checks in ICMP protocol at dumping
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:16:41 -08:00
Jesper Juhl
d695aa8a1f [NETFILTER]: Decrease number of pointer derefs in nf_conntrack_core.c
Benefits of the patch:
 - Fewer pointer dereferences should make the code slightly faster.
 - Size of generated code is smaller
 - improved readability

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:16:16 -08:00
Jesper Juhl
3e4ead4fe5 [NETFILTER]: Decrease number of pointer derefs in nfnetlink_queue.c
Benefits of the patch:
 - Fewer pointer dereferences should make the code slightly faster.
 - Size of generated code is smaller
 - improved readability

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:15:58 -08:00
Adrian Bunk
4ffd2e4907 [IPVS]: Fix compilation
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 12:14:43 -08:00
Linus Torvalds
db9edfd7e3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6
Trivial manual merge fixup for usb_find_interface clashes.
2006-01-04 18:44:12 -08:00
Linus Torvalds
52347f4e81 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial 2006-01-04 16:34:57 -08:00
Linus Torvalds
d779188d2b Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2006-01-04 16:31:56 -08:00
Kay Sievers
fd586bacf4 [PATCH] net: switch device attribute creation to default attrs
Recent udev versions no longer cover bad sysfs timing with built-in
logic. Explicit rules are required to do that. For net devices, the
following is needed:
  ACTION=="add", SUBSYSTEM=="net", WAIT_FOR_SYSFS="address"
to handle access to net device properties from an event handler without
races.

This patch changes the main net attributes to be created by the driver
core, which is done _before_ the event is sent out and will not require
the stat() loop of the WAIT_FOR_SYSFS key.

Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 16:18:10 -08:00
Kay Sievers
312c004d36 [PATCH] driver core: replace "hotplug" by "uevent"
Leave the overloaded "hotplug" word to subsystems which are handling
real devices. The driver core does not "plug" anything, it just exports
the state to userspace and generates events.

Signed-off-by: Kay Sievers <kay.sievers@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 16:18:08 -08:00
Thomas Young
74cb879822 [TCP] tcp_vegas: Fix slow start
Vegas' slow start was only adding one MSS per RTT rather than one for
every ack. Slow start behavior should now match Reno.

Signed-off-by: Thomas Young <tyo@ee.mu.oz.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-04 13:59:32 -08:00
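The difference described in 74cb879822 can be illustrated with a small idealised model. The sketch assumes one ACK per in-flight segment, no loss and no delayed ACKs; the function names are hypothetical and the window is counted in segments. Growing by one segment per ACK doubles the window each round trip, while growing by one segment per RTT is only linear.

```c
/* Window after `rtts` round trips when every ACK adds one segment. */
static unsigned int cwnd_growth_per_ack(unsigned int cwnd, unsigned int rtts)
{
	while (rtts--) {
		unsigned int acks = cwnd;	/* one ACK per segment in flight */

		cwnd += acks;
	}
	return cwnd;
}

/* Window after `rtts` round trips when each RTT adds a single segment. */
static unsigned int cwnd_growth_per_rtt(unsigned int cwnd, unsigned int rtts)
{
	return cwnd + rtts;
}
```

Starting from 2 segments, five round trips give 64 segments in the per-ACK model and 7 in the per-RTT model.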
Kris Katterjohn
9369986306 [NET]: More instruction checks for net/core/filter.c
Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-04 13:58:36 -08:00
YOSHIFUJI Hideaki
181a46a56e [NETFILTER]: Use macro for spinlock_t/rwlock_t initializations/definition.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-04 13:56:54 -08:00
YOSHIFUJI Hideaki
196433c5b7 [IPV6]: Use macro for rwlock_t initialization.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-04 13:56:31 -08:00
YOSHIFUJI Hideaki
ca40330248 [ECONET]: Use macro for spinlock_t definition.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-04 13:56:08 -08:00
Arnaldo Carvalho de Melo
f190055ff5 [IPVS]: Add missing include <linux/net.h>
CC [M]  net/ipv4/ipvs/ip_vs_conn.o
  /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/ipvs/ip_vs_conn.c: In
  function 'ip_vs_conn_new':
  /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/ipvs/ip_vs_conn.c:606:
  warning: implicit declaration of function 'net_ratelimit'
  /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/ipvs/ip_vs_conn.c: In
  function 'ip_vs_random_dropentry':
  /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/ipvs/ip_vs_conn.c:810:
  warning: implicit declaration of function 'net_random'

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-01-04 02:02:20 -02:00
Arnaldo Carvalho de Melo
80e40daa47 [TCP]: syn_flood_warning is only needed if CONFIG_SYN_COOKIES is selected
CC      net/ipv4/tcp_ipv4.o
  /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/tcp_ipv4.c:665: warning:
  'syn_flood_warning' defined but not used

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-01-04 01:58:06 -02:00
Arnaldo Carvalho de Melo
e4dfd449c8 [DCCP] ackvec: use u8 for the buf offsets
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-01-04 01:46:34 -02:00
Andrea Bittau
6742bbcbb8 [DCCP] ackvec: Fix spelling of "throw"
Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-01-04 01:45:17 -02:00
Stephen Hemminger
40efc6fa17 [TCP]: less inline's
TCP inline usage cleanup:
 * get rid of inline in several places
 * replace __inline__ with inline where possible
 * move functions used in one file out of tcp.h
 * let compiler decide on used once cases

On x86_64: 
   text	   data	    bss	    dec	    hex	filename
3594701	 648348	 567400	4810449	 4966d1	vmlinux.orig
3593133	 648580	 567400	4809113	 496199	vmlinux

On sparc64:
   text	   data	    bss	    dec	    hex	filename
2538278	 406152	 530392	3474822	 350586	vmlinux.ORIG
2536382	 406384	 530392	3473158	 34ff06	vmlinux

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 16:03:49 -08:00
Stephen Hemminger
3c19065a1e [IEEE80211] ipw2200: Simplify multicast checks.
From: Stephen Hemminger <shemminger@osdl.org>

is_multicast_ether_addr() accepts broadcast too, so the
is_broadcast_ether_addr() calls are redundant.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 15:27:38 -08:00
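Why the broadcast check is redundant in 3c19065a1e: an Ethernet address is multicast when the least significant bit of its first octet is set, and the broadcast address ff:ff:ff:ff:ff:ff has that bit set too. The sketch below uses userspace stand-ins with hypothetical names (addr_is_multicast, addr_is_broadcast) and made-up example addresses; it is not the kernel's own helper code.

```c
#include <stdbool.h>

/* Group (multicast) bit: lowest bit of the first octet. */
static bool addr_is_multicast(const unsigned char addr[6])
{
	return (addr[0] & 0x01) != 0;
}

/* Broadcast: all six octets are 0xff. */
static bool addr_is_broadcast(const unsigned char addr[6])
{
	for (int i = 0; i < 6; i++)
		if (addr[i] != 0xff)
			return false;
	return true;
}

/* Example addresses, for illustration only. */
static const unsigned char example_bcast[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
static const unsigned char example_mcast[6] = { 0x01, 0x00, 0x5e, 0x00, 0x00, 0x01 };
static const unsigned char example_ucast[6] = { 0x00, 0x16, 0x3e, 0x00, 0x00, 0x01 };
```

Every address for which addr_is_broadcast() is true also satisfies addr_is_multicast(), so testing both adds nothing.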
Stephen Hemminger
cd8787ab04 [IPV4] fib_trie: build fix
Need this to fix build of fib_trie in net-2.6.16 (rebased) tree.
The code needs the new inet_make_mask inline.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:38:34 -08:00
Stephen Hemminger
554c9a8ec3 [BRIDGE]: Fix faulty check in br_stp_recalculate_bridge_id()
One of the conversions from memcmp to compare_ether_addr is incorrect.
We need to do relative comparison to determine min MAC address to
use in bridge id. 

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:35:54 -08:00
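The problem fixed in 554c9a8ec3 comes down to the difference between an equality test and an ordering test. The sketch below is a userspace illustration with hypothetical names (addr_differs, lowest_addr) and made-up addresses: an equality-only comparison says whether two addresses differ but not which is smaller, so picking the minimum address needs an ordered comparison such as memcmp().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Equality-only comparison: 0 when equal, non-zero otherwise.
 * The magnitude of the result says nothing about ordering. */
static unsigned int addr_differs(const unsigned char *a, const unsigned char *b)
{
	unsigned int diff = 0;

	for (int i = 0; i < 6; i++)
		diff |= (unsigned int)(a[i] ^ b[i]);
	return diff;
}

/* Pick the numerically lowest address using an ordered comparison. */
static const unsigned char *lowest_addr(const unsigned char addrs[][6], size_t n)
{
	const unsigned char *min;

	assert(n > 0);		/* an empty list has no minimum */
	min = addrs[0];
	for (size_t i = 1; i < n; i++)
		if (memcmp(addrs[i], min, 6) < 0)
			min = addrs[i];
	return min;
}

/* Example port addresses, for illustration only. */
static const unsigned char example_ports[3][6] = {
	{ 0x00, 0x16, 0x3e, 0x00, 0x00, 0x05 },
	{ 0x00, 0x16, 0x3e, 0x00, 0x00, 0x02 },
	{ 0x00, 0x16, 0x3e, 0x00, 0x00, 0x09 },
};
```

Here lowest_addr() returns the second entry, while addr_differs() is non-zero for every distinct pair and so cannot be used to rank them.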
Andrea Bittau
e84a9f5e9c [DCCP]: Notify CCID only after ACK vectors have been processed.
The CCID should be notified of packet reception only when a packet is
valid.  Therefore, the ACK vector needs to be processed before
notifying the CCID.  Also, the CCID might need information provided by
the ACK vector.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:26:15 -08:00
Andrea Bittau
9e377202d2 [DCCP]: Send an ACK vector when ACKing a response packet
If ACK vectors are used, each packet with an ACK should contain an ACK
vector.  The only exception currently is response packets.  It
probably is not a good idea to store ACK vector state before the
connection is completed (to help protect from syn floods).

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:25:49 -08:00
Andrea Bittau
709dd3aaf5 [DCCP]: Do not process a packet twice when it's not in state DCCP_OPEN.
When packets are received, the connection is either in DCCP_OPEN
[fast-path] or it isn't.  If it's not [e.g. DCCP_PARTOPEN] upper
layers will perform sanity checks and parse options.  If it is in
DCCP_OPEN, dccp_rcv_established() will do it.  It is important not to
re-parse options in dccp_rcv_established() when it is not called from
the fast-path.  Otherwise, for example, the ack vector will be added twice
and the CCID will see the packet twice.

The solution is to always enforce sanity checks from the upper layers.
When packets arrive in the fast-path, sanity checks will be performed
before calling dccp_rcv_established().

Note(acme): I rewrote the patch to achieve the same result but keeping
dccp_rcv_established with the previous semantics and having it split
into __dccp_rcv_established, which doesn't do any sanity checks;
code in state != DCCP_OPEN uses this lighter version as it already does
the sanity checks.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:25:17 -08:00
Patrick Caulfield
5062430c5c [DECNET]: Only use local routers
The attached patch makes DECnet routing only use routers from the same
area - rather than the highest rated router seen.

In theory there should not be an out-of-area router on a local network
but some networks are bridged rather than properly routed. VMS seems
to behave similarly: if I bring up a VMS node with no router then it
can't see anything else on the global network.

Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:24:02 -08:00
Roberto Nibali
4b5bdf5cc3 [IPVS]: Cleanup IP_VS_DBG statements.
From: Roberto Nibali <ratz@drugphish.ch>

The attached patch (against current -GIT) is a cleanup patch which does
the following:

o lookup debug messages shifted back to 9
o added more informational value to flags and refcnt since those
entries can be in multiple referenced structures
o cleanup 80 char violation

It's the prepatch to the session pool implementation and helps very much
to debug and monitor important variables and structures regarding the
threshold limitation and persistency without the thousands of lookup
messages which no one is interested in.

Signed-off-by: Horms <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:22:59 -08:00
Christoph Hellwig
b5e5fa5e09 [NET]: Add a dev_ioctl() fallback to sock_ioctl()
Currently all network protocols need to call dev_ioctl as the default
fallback in their ioctl implementations.  This patch adds a fallback
to dev_ioctl to sock_ioctl if the protocol returned -ENOIOCTLCMD.
This way all the protocol ioctl handlers can be simplified and we don't
need to export dev_ioctl.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:18:33 -08:00
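The fallback described in b5e5fa5e09 can be sketched as a small dispatch function. This is a userspace illustration under stated assumptions: the command numbers, generic_dev_ioctl(), example_proto_ioctl() and socket_ioctl() are hypothetical stand-ins, and ENOIOCTLCMD is defined locally with the kernel-internal value 515 because userspace headers do not provide it.

```c
#include <errno.h>

#define ENOIOCTLCMD	515	/* kernel-internal "no ioctl command" code */

/* Hypothetical command numbers, for illustration only. */
enum {
	CMD_PROTO = 1,		/* handled by the protocol itself */
	CMD_DEVICE = 2,		/* handled by the generic device layer */
};

typedef int (*ioctl_fn)(unsigned int cmd);

/* Stand-in for the generic device ioctl handler. */
static int generic_dev_ioctl(unsigned int cmd)
{
	return cmd == CMD_DEVICE ? 0 : -EINVAL;
}

/* A protocol handler only deals with its own commands and reports
 * everything else as "not mine". */
static int example_proto_ioctl(unsigned int cmd)
{
	return cmd == CMD_PROTO ? 0 : -ENOIOCTLCMD;
}

/* Socket-level dispatch: try the protocol first, then fall back to the
 * generic device handler when the protocol does not know the command. */
static int socket_ioctl(ioctl_fn proto, unsigned int cmd)
{
	int err = proto(cmd);

	if (err == -ENOIOCTLCMD)
		err = generic_dev_ioctl(cmd);
	return err;
}
```

With this shape, a protocol handler no longer has to call the device handler itself; it returns -ENOIOCTLCMD for anything it does not recognise.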
Christoph Hellwig
5ff7630e4a [NETROM]: Remove unnecessary lock_sock calls in netrom_ioctl()
lock_sock is needed only in very few cases, so do it there instead of
around the switch statement.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:14:46 -08:00
Per Liden
b461d2f218 [NETLINK] genetlink: fix cmd type in genl_ops to be consistent to u8
Signed-off-by: Per Liden <per.liden@ericsson.com>
ACKed-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:13:29 -08:00
Benjamin LaHaise
fd19f329a3 [AF_UNIX]: Convert to use a spinlock instead of rwlock
From: Benjamin LaHaise <bcrl@kvack.org>

In af_unix, a rwlock is used to protect internal state.  At least on my 
P4 with HT it is faster to use a spinlock due to the simpler memory 
barrier used to unlock.  This patch raises bw_unix to ~690K/s.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:10:46 -08:00
Benjamin LaHaise
4947d3ef8d [NET]: Speed up __alloc_skb()
From: Benjamin LaHaise <bcrl@kvack.org>

In __alloc_skb(), the use of skb_shinfo() which casts a u8 * to the 
shared info structure results in gcc being forced to do a reload of the 
pointer since it has no information on possible aliasing.  Fix this by 
using a pointer to refer to skb_shared_info.

By initializing skb_shared_info sequentially, the write combining buffers 
can reduce the number of memory transactions to a single write.  Reorder 
the initialization in __alloc_skb() to match the structure definition.  
There is also an alignment issue on 64 bit systems with skb_shared_info;
by converting nr_frags to a short, everything packs up nicely.

Also, pass the slab cache pointer according to the fclone flag instead 
of using two almost identical function calls.

This raises bw_unix performance up to a peak of 707KB/s when combined 
with the spinlock patch.  It should help other networking protocols, too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:06:50 -08:00
Arnaldo Carvalho de Melo
14c850212e [INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h
To help in reducing the number of include dependencies, several files were
touched as they were getting needed headers indirectly for stuff they use.

Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had
linux/dccp.h included twice.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:21 -08:00
Arnaldo Carvalho de Melo
25995ff577 [SOCK]: Introduce sk_receive_skb
It's common enough to justify that. TCP still can't use it as it has the
prequeueing stuff, still to be made generic in the not so distant future :-)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:19 -08:00
Christoph Hellwig
ce1d4d3e88 [NET]: restructure sock_aio_{read,write} / sock_{readv,writev}
Mid-term I plan to restructure the file_operations so that we don't need
to have all these duplicate aio and vectored versions.  This patch is
a small step in that direction but also a worthwhile cleanup on its own:

(1) introduce an alloc_sock_iocb helper that encapsulates allocating a
    proper sock_iocb
(2) add do_sock_read and do_sock_write helpers for common read/write
    code

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:18 -08:00
David S. Miller
cbeb321a64 [NET]: Fix sock_init() return value.
It needs to return zero now that it is an initcall.

Also, net/nonet.c no longer needs a dummy sock_init().

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:17 -08:00
Jaco Kroon
f34fbb9713 [PKTGEN]: Deinitialise static variables.
static variables should not be explicitly initialised to 0.  This causes
them to be placed in .data instead of .bss.  This patch de-initialises 3
static variables in net/core/pktgen.c.

There are approximately 800 more such variables in the source tree
(2.6.15rc5).  If there is more interest I'd be willing to track down the
rest of these as well and de-initialise them as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:16 -08:00
Eric Dumazet
90ddc4f047 [NET]: move struct proto_ops to const
I noticed that some of 'struct proto_ops' used in the kernel may share
a cache line used by locks or other heavily modified data. (default
linker alignment is 32 bytes, and L1_CACHE_LINE is 64 or 128 at
least)

This patch makes sure a 'struct proto_ops' can be declared as const,
so that all cpus can share all parts of it without false sharing.

This is not mandatory : a driver can still use a read/write structure
if it needs to (and eventually a __read_mostly)

I made a global substitution to change all existing occurrences to make
them const.

This should reduce the possibility of false sharing on SMP, and
speedup some socket system calls.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:15 -08:00
Andi Kleen
77d76ea310 [NET]: Small cleanup to socket initialization
sock_init can be done as a core_initcall instead of calling
it directly in init/main.c

Also I removed an out of date #ifdef.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:14 -08:00
Frank Filz
7708610b1b [SCTP]: Add support for SCTP_DELAYED_ACK_TIME socket option.
Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:13 -08:00
Frank Filz
52ccb8e90c [SCTP]: Update SCTP_PEER_ADDR_PARAMS socket option to the latest api draft.
This patch adds support to set/get heartbeat interval, maximum number of
retransmissions, pathmtu, sackdelay time for a particular transport/
association/socket as per the latest SCTP sockets api draft11.

Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:11 -08:00
Robert Olsson
fd9662555c [IPV4] fib_trie: Add credits.
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:10 -08:00
Stephen Hemminger
9eb2d62719 [TCP] cubic: use Newton-Raphson
Replace cube root algorithm with a faster version using Newton-Raphson.
Surprisingly, doing the scaled div64_64 is faster than a true 64 bit
division on 64 bit CPUs.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:09 -08:00
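For readers unfamiliar with the technique named in 9eb2d62719, here is a generic integer cube root using Newton-Raphson iteration, x' = (2x + a/x^2) / 3. It is a plain sketch with ordinary 64-bit division and a hypothetical function name; it does not reproduce the kernel's scaled div64_64 approach or its choice of starting estimate.

```c
#include <stdint.h>

/* Integer cube root: returns the largest r with r*r*r <= a. */
static uint32_t int_cube_root(uint64_t a)
{
	unsigned int bits = 0;
	uint64_t x;

	if (a == 0)
		return 0;

	/* Start above the true root: 2^ceil(bits/3), where bits is the
	 * bit length of a. */
	for (uint64_t t = a; t; t >>= 1)
		bits++;
	x = 1ULL << ((bits + 2) / 3);

	/* From above, each Newton step strictly decreases x until the
	 * floor of the cube root is reached; then it stops decreasing. */
	for (;;) {
		uint64_t y = (2 * x + a / (x * x)) / 3;

		if (y >= x)
			return (uint32_t)x;
		x = y;
	}
}
```

The starting estimate is at most 2^22 for a 64-bit input, so x * x cannot overflow.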
Stephen Hemminger
89b3d9aaf4 [TCP] cubic: precompute constants
Revised version of patch to pre-compute values for TCP cubic.
  * d32,d64 replaced with descriptive names
  * cube_factor replaces
	 srtt[scaled by count] / HZ * ((1 << (10+2*BICTCP_HZ)) / bic_scale)
  * beta_scale replaces
	8*(BICTCP_BETA_SCALE+beta)/3/(BICTCP_BETA_SCALE-beta);

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:08 -08:00
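The beta_scale expression quoted in 89b3d9aaf4 is integer arithmetic that only depends on two parameters, which is why it can be computed once instead of on every call. The sketch below evaluates it in userspace; the function name and the example inputs (a scale of 1024 and a beta of 819) are assumptions for illustration and are not taken from the commit.

```c
#include <assert.h>

/* Evaluate 8*(scale+beta)/3/(scale-beta) with C integer division,
 * applied left to right as in the quoted expression. */
static unsigned int compute_beta_scale(unsigned int scale, unsigned int beta)
{
	assert(beta < scale);	/* otherwise the divisor is zero or wraps */
	return 8 * (scale + beta) / 3 / (scale - beta);
}
```

With the assumed inputs the steps are 8 * 1843 = 14744, then 14744 / 3 = 4914, then 4914 / 205 = 23.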
Stephen Hemminger
c865e5d99e [PKT_SCHED] netem: packet corruption option
Here is a new feature for netem in 2.6.16. It adds the ability to
randomly corrupt packets with netem. A version was done by
Hagen Paul Pfeifer, but I redid it to handle the cases of backwards
compatibility with netlink interface and presence of hardware checksum
offload. It is useful for testing hardware offload in devices.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:05 -08:00
Stephen Hemminger
8cbb512e50 [BRIDGE]: add version number
Add version info to bridge module.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:04 -08:00
Stephen Hemminger
edb5e46fc0 [BRIDGE]: limited ethtool support
Add limited ethtool support to bridge to allow disabling
features.

Note: if underlying device does not support a feature (like checksum
offload), then the bridge device won't inherit it.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:03 -08:00
Stephen Hemminger
0e5eabac49 [BRIDGE]: filter packets in learning state
While in the learning state, run filters but drop the result.
This prevents us from acquiring bad fdb entries in learning state.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:02 -08:00
Stephen Hemminger
4433f420e5 [BRIDGE]: handle speed detection after carrier changes
Speed of an interface may not be available until carrier
is detected in the case of autonegotiation. To get the correct value
we need to recheck speed after carrier event.  But the check needs to
be done in a context that is similar to normal ethtool interface (can sleep).

Also, delay the check for 1ms to try to avoid any carrier bounce transitions.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:01 -08:00
Stephen Hemminger
4505a3ef72 [BRIDGE]: allow setting hardware address of bridge pseudo-dev
Some people are using bridging to hide multiple machines from an ISP
that restricts by MAC address. So in that case allow the bridge mac
address to be set to any of the existing interfaces.  I don't want to
allow any arbitrary value and confuse STP.
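The restriction described above can be sketched as a simple membership test. This is an illustrative model only; the function name and the string representation of MAC addresses are assumptions, not the bridge code's interface.

```python
from typing import Iterable


def can_set_bridge_address(new_mac: str, port_macs: Iterable[str]) -> bool:
    """Accept the new bridge address only if an enslaved port already uses it."""
    wanted = new_mac.lower()
    return any(wanted == mac.lower() for mac in port_macs)
```

Rejecting arbitrary values keeps the bridge identifier tied to real hardware, which is the property STP relies on.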

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:00 -08:00
David S. Miller
fbe9cc4a87 [AF_UNIX]: Use spinlock for unix_table_lock
This lock is actually taken mostly as a writer,
so using a rwlock actually just makes performance
worse especially on chips like the Intel P4.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:59 -08:00
Arnaldo Carvalho de Melo
d83d8461f9 [IP_SOCKGLUE]: Remove most of the tcp specific calls
As DCCP needs to be called in the same spots.

Now we have a member in inet_sock (is_icsk), set at sock creation time from
struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, as for TCP and
DCCP), that tells whether a struct sock instance is an inet_connection_sock.
It is for places like the ones in ip_sockglue.c (v4 and v6) where we previously
checked whether sk_type was SOCK_STREAM. That check is insufficient because we
now use the same code for DCCP, which has sk_type SOCK_DCCP.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:58 -08:00
Arnaldo Carvalho de Melo
d8313f5ca2 [INET6]: Generalise tcp_v6_hash_connect
Renaming it to inet6_hash_connect, making it possible to ditch
dccp_v6_hash_connect and share the same code with TCP instead.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:56 -08:00
Arnaldo Carvalho de Melo
a7f5e7f164 [INET]: Generalise tcp_v4_hash_connect
Renaming it to inet_hash_connect, making it possible to ditch
dccp_v4_hash_connect and share the same code with TCP instead.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:55 -08:00
Arnaldo Carvalho de Melo
6d6ee43e0b [TWSK]: Introduce struct timewait_sock_ops
So that we can share several timewait sockets related functions and
make the timewait mini sockets infrastructure closer to the request
mini sockets one.

Next changesets will take advantage of this, moving more code out of
TCP and DCCP v4 and v6 to common infrastructure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:54 -08:00
Arnaldo Carvalho de Melo
fc44b98053 [DCCP]: Use reqsk_free in dccp_v4_conn_request
Now we have the destructor (dccp_v4_reqsk_destructor) in our
request_sock_ops vtable.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:53 -08:00
Arnaldo Carvalho de Melo
3df80d9320 [DCCP]: Introduce DCCPv6
Still needs mucho polishing, especially in the checksum code, but works
just fine, inet_diag/iproute2 and all 8)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:52 -08:00
Arnaldo Carvalho de Melo
399c07def6 [IPV6]: Export ipv6_opt_accepted
It was already non-TCP specific, will be used by DCCPv6.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:51 -08:00
Arnaldo Carvalho de Melo
f21e68caa0 [DCCP]: Prepare the AF agnostic core for the introduction of DCCPv6
Basically exports a similar set of functions as the one exported by
the non-AF specific TCP code.

In the process moved some non-AF specific code from dccp_v4_connect to
dccp_connect_init and moved the checksum verification from
dccp_invalid_packet to dccp_v4_rcv, so as to use it in dccp_v6_rcv
too.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:50 -08:00
Arnaldo Carvalho de Melo
34ca686081 [DCCP]: Just rename dccp_v4_prot to dccp_prot
To match TCP equivalent.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:49 -08:00
Arnaldo Carvalho de Melo
3cf3dc6c2e [IPV6]: Export some symbols for DCCPv6
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:48 -08:00
Arnaldo Carvalho de Melo
0fa1a53e1f [IPV6]: Introduce inet6_timewait_sock
Out of tcp6_timewait_sock, that now is just an aggregation of
inet_timewait_sock and inet6_timewait_sock, using tw_ipv6_offset in struct
inet_timewait_sock, that is common to the IPv6 transport protocols that use
timewait sockets, like DCCP and TCP.

tw_ipv6_offset plays the struct inet_sock pinfo6 role, i.e. for the generic
code to find the IPv6 area in a timewait sock.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:47 -08:00
Arnaldo Carvalho de Melo
b9750ce13c [IPV6]: Generalise some functions
Using sk->sk_protocol instead of IPPROTO_TCP.

Will be used by DCCPv6 in the next changesets.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:46 -08:00
Benjamin LaHaise
830a1e5c21 [AF_UNIX]: Remove superfluous reference counting in unix_stream_sendmsg
AF_UNIX stream socket performance on P4 CPUs tends to suffer due to a
lot of pipeline flushes from atomic operations.  The patch below
removes the sock_hold() and sock_put() in unix_stream_sendmsg().  This
should be safe as the socket still holds a reference to its peer which
is only released after the file descriptor's final user invokes
unix_release_sock().  The only consideration is that we must add a
memory barrier before setting the peer initially.

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:45 -08:00
Benjamin LaHaise
c1cbe4b7ad [NET]: Avoid atomic xchg() for non-error case
It also looks like there were 2 places where the test on sk_err was
missing from the event wait logic (in sk_stream_wait_connect and
sk_stream_wait_memory), while the rest of the sock_error() users look
to be doing the right thing.  This version of the patch fixes those,
and cleans up a few places that were testing ->sk_err directly.
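The idea in the subject line, reading the error field cheaply and paying for an atomic exchange only when an error is pending, can be modeled as below. This is a sketch, not the kernel code: the Sock class and its xchg_calls counter are hypothetical stand-ins used to make the saved operations visible.

```python
class Sock:
    """Toy socket holding a pending error code (0 means no error)."""

    def __init__(self) -> None:
        self.sk_err = 0
        self.xchg_calls = 0  # counts simulated atomic exchanges

    def xchg_err(self) -> int:
        """Simulate an atomic swap of sk_err with 0, returning the old value."""
        self.xchg_calls += 1
        old, self.sk_err = self.sk_err, 0
        return old


def sock_error(sk: Sock) -> int:
    """Return 0 if no error is pending, else clear it and return it negated."""
    if sk.sk_err == 0:  # plain read on the common path
        return 0
    return -sk.xchg_err()
```

In the common no-error case the function returns without touching the simulated atomic at all.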

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:44 -08:00
Roberto Nibali
f1f71e03b1 [IPVS]: remove dead code
This patch removes dead code. I don't see the reason to keep this cruft
around, besides cluttering the nice and functionally working code.

Signed-off-by: Roberto Nibali <ratz@drugphish.ch>
Signed-off-by: Horms <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:43 -08:00
Stephen Hemminger
65a45441d7 [UDP]: udp_checksum_init return value
Since udp_checksum_init always returns 0 there is no point in
having it return a value.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:42 -08:00
Herbert Xu
3305b80c21 [IP]: Simplify and consolidate MSG_PEEK error handling
When a packet is obtained from skb_recv_datagram with MSG_PEEK enabled
it is left on the socket receive queue.  This means that when we detect
a checksum error we have to be careful when trying to free the packet
as someone could have dequeued it in the meantime.

Currently this delicate logic is duplicated three times between UDPv4,
UDPv6 and RAWv6.  This patch moves them into one place and simplifies
the code somewhat.
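The delicate part described above is that a peeked packet may already have been dequeued by someone else. A minimal model of one shared helper is sketched here, assuming the check is "is this packet still at the head of the queue, under the queue lock"; the class and function names are hypothetical.

```python
import threading
from collections import deque


class ReceiveQueue:
    """Toy receive queue protected by a lock."""

    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.packets = deque()


def kill_peeked_datagram(queue: ReceiveQueue, packet: object) -> bool:
    """Drop a peeked packet that failed its checksum, if it is still queued.

    Returns True when this caller removed it, False when another reader
    had already dequeued it.
    """
    with queue.lock:
        if queue.packets and queue.packets[0] is packet:
            queue.packets.popleft()
            return True
    return False
```

With one helper, each protocol's receive path only has to call it on checksum failure instead of repeating the locking logic.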

This is based on a suggestion by Eric Dumazet.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:41 -08:00
Arnaldo Carvalho de Melo
57cca05af1 [DCCP]: Introduce dccp_ipv4_af_ops
And make the core DCCP code AF agnostic, just like TCP. Now it's time
to work on net/dccp/ipv6.c, we are close to the end!

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:40 -08:00
Arnaldo Carvalho de Melo
af05dc9394 [ICSK]: Move v4_addr2sockaddr from TCP to icsk
Renaming it to inet_csk_addr2sockaddr.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:39 -08:00
Arnaldo Carvalho de Melo
8292a17a39 [ICSK]: Rename struct tcp_func to struct inet_connection_sock_af_ops
And move it to struct inet_connection_sock. DCCP will use it in the
upcoming changesets.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:38 -08:00
Arnaldo Carvalho de Melo
ca304b6104 [IPV6]: Introduce inet6_rsk()
And inet6_rsk_offset in inet_request_sock, for the same reasons as
inet_sock's pinfo6 member.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:37 -08:00
Arnaldo Carvalho de Melo
8129765ac0 [IPV6]: Generalise tcp_v6_search_req & tcp_v6_synq_add
More work is needed, though, to introduce inet6_request_sock from
tcp6_request_sock, in the same layout considerations as ipv6_pinfo in
inet_sock, next changeset will do that.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:36 -08:00
Arnaldo Carvalho de Melo
c2977c2213 [ICSK]: make inet_csk_reqsk_queue_hash_add timeout arg unsigned long
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:34 -08:00
Arnaldo Carvalho de Melo
90b19d3169 [IPV6]: Generalise __tcp_v6_hash, renaming it to __inet6_hash
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:33 -08:00
Arnaldo Carvalho de Melo
971af18bbf [IPV6]: Reuse inet_csk_get_port in tcp_v6_get_port
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:33 -08:00
Herbert Xu
89cee8b1cb [IPV4]: Safer reassembly
Another spin of Herbert Xu's "safer ip reassembly" patch
for 2.6.16.

(The original patch is here:
http://marc.theaimsgroup.com/?l=linux-netdev&m=112281936522415&w=2
and my only contribution is to have tested it.)

This patch (optionally) does additional checks before accepting IP
fragments, which can greatly reduce the possibility of reassembling
fragments which originated from different IP datagrams.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arthur Kepner <akepner@sgi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:31 -08:00
Bart De Schuymer
d5228a4f49 [NETFILTER] ebtables: Support nf_log API from ebt_log and ebt_ulog
This makes ebt_log and ebt_ulog use the new nf_log api.  This enables
the bridging packet filter to log packets e.g. via nfnetlink_log.

Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:30 -08:00
Eric Dumazet
3183606469 [NETFILTER] ip_tables: NUMA-aware allocation
Part of a performance problem with ip_tables is that memory allocation
is not NUMA aware, but 'only' SMP aware (i.e. each CPU normally touches
separate cache lines).

Even with small iptables rules, the cost of this misplacement can be
high on common workloads.  Instead of using one vmalloc() area
(located in the node of the iptables process), we now allocate an area
for each possible CPU, using vmalloc_node() so that memory should be
allocated in the CPU's node if possible.

Port to arp_tables and ip6_tables by Harald Welte.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:29 -08:00
Stephen Hemminger
df3271f336 [TCP] BIC: CUBIC window growth (2.0)
Replace existing BIC version 1.1 with new version 2.0.
The main change is to replace the window growth function
with a cubic function as described in:
  http://www.csc.ncsu.edu/faculty/rhee/export/bitcp/cubic-paper.pdf
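For readers unfamiliar with the shape of a cubic growth function, a floating-point sketch follows. It assumes the general form W(t) = C*(t - K)^3 + W_max, with K chosen so that W(0) equals the window right after the loss; the constant C = 0.4 and all names are illustrative assumptions, and the kernel uses scaled integer arithmetic instead.

```python
def cubic_k(w_max: float, w_after_loss: float, c: float = 0.4) -> float:
    """Time at which the window climbs back to w_max.

    Derived from requiring W(0) == w_after_loss in W(t) = c*(t-K)**3 + w_max.
    Assumes w_after_loss <= w_max.
    """
    return ((w_max - w_after_loss) / c) ** (1.0 / 3.0)


def cubic_window(t: float, w_max: float, w_after_loss: float, c: float = 0.4) -> float:
    """Window size t seconds after the last reduction (illustrative units)."""
    k = cubic_k(w_max, w_after_loss, c)
    return c * (t - k) ** 3 + w_max
```

The curve rises quickly at first, flattens as it approaches w_max at t = K, and then accelerates again to probe for more bandwidth.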

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:28 -08:00
Stephen Hemminger
05d054503a [TCP] BIC: spelling and whitespace
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:27 -08:00
Stephen Hemminger
018da8f44c [TCP] BIC: remove low utilization code.
The latest BICTCP patch at:
http://www.csc.ncsu.edu:8080/faculty/rhee/export/bitcp/index_files/Page546.htm

disables the low_utilization feature of BICTCP because it doesn't work
in some cases. This patch removes it.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:26 -08:00
Trent Jaeger
df71837d50 [LSM-IPSec]: Security association restriction.
This patch series implements per packet access control via the
extension of the Linux Security Modules (LSM) interface by hooks in
the XFRM and pfkey subsystems that leverage IPSec security
associations to label packets.  Extensions to the SELinux LSM are
included that leverage the patch for this purpose.

This patch implements the changes necessary to the XFRM subsystem,
pfkey interface, ipv4/ipv6, and xfrm_user interface to restrict a
socket to use only authorized security associations (or no security
association) to send/receive network packets.

Patch purpose:

The patch is designed to enable per-packet access control based on
the strongly authenticated IPSec security association.  Such access
controls augment the existing ones based on network interface and IP
address.  The former are very coarse-grained, and the latter can be
spoofed.  By using IPSec, the system can control access to remote
hosts based on cryptographic keys generated using the IPSec mechanism.
This enables access control on a per-machine basis or per-application
if the remote machine is running the same mechanism and trusted to
enforce the access control policy.

Patch design approach:

The overall approach is that policy (xfrm_policy) entries set by
user-level programs (e.g., setkey for ipsec-tools) are extended with a
security context that is used at policy selection time in the XFRM
subsystem to restrict the sockets that can send/receive packets via
security associations (xfrm_states) that are built from those
policies.

A presentation available at
www.selinux-symposium.org/2005/presentations/session2/2-3-jaeger.pdf
from the SELinux symposium describes the overall approach.

Patch implementation details:

On output, the policy retrieved (via xfrm_policy_lookup or
xfrm_sk_policy_lookup) must be authorized for the security context of
the socket and the same security context is required for the resultant
security association (retrieved or negotiated via racoon in
ipsec-tools).  This is enforced in xfrm_state_find.

On input, the policy retrieved must also be authorized for the socket
(at __xfrm_policy_check), and the security context of the policy must
also match the security association being used.
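The two conditions described for output and input can be summarized in a toy model. Everything here is hypothetical: security contexts are plain strings, and the set of authorized (socket, policy) pairs stands in for whatever decision the LSM would make.

```python
from typing import Set, Tuple


def xfrm_checks_pass(
    socket_ctx: str,
    policy_ctx: str,
    sa_ctx: str,
    authorized: Set[Tuple[str, str]],
) -> bool:
    """Model of the two checks described in the commit message.

    1. The policy must be authorized for the socket's security context.
    2. The security association must carry the same context as the policy.
    """
    if (socket_ctx, policy_ctx) not in authorized:
        return False
    return sa_ctx == policy_ctx
```

Both conditions must hold; a matching security association does not help a socket that is not authorized for the policy.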

The patch has virtually no impact on packets that do not use IPSec.
The existing Netfilter (outgoing) and LSM rcv_skb hooks are used as
before.

Also, if IPSec is used without security contexts, the impact is
minimal.  The LSM must allow such policies to be selected for the
combination of socket and remote machine, but subsequent IPSec
processing proceeds as in the original case.

Testing:

The pfkey interface is tested using ipsec-tools.  ipsec-tools has
been modified (a separate ipsec-tools patch is available for version
0.5) to support assignment of xfrm_policy entries and security
associations with security contexts via setkey, and negotiation
using the security contexts via racoon.

The xfrm_user interface is tested via ad hoc programs that set
security contexts.  These programs are also available from me, and
contain programs for setting, getting, and deleting policy for testing
this interface.  Testing of SA functions was done by tracing kernel
behavior.

Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:24 -08:00
Jeff Garzik
ac67c62473 Merge branch 'master' 2006-01-03 10:49:18 -05:00
Matt Mackall
4a4efbdee2 s/retreiv/retriev/g
As everyone knows, the rule is: "i before e.. um.. always."

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-01-03 13:27:11 +01:00
David L Stevens
5ab4a6c81e [IPV6] mcast: Fix multiple issues in MLDv2 reports.
The below "jumbo" patch fixes the following problems in MLDv2.

1) Add necessary "ntohs" to recent "pskb_may_pull" check [breaks
        all nonzero source queries on little-endian (!)]

2) Add locking to source filter list [resend of prior patch]

3) fix "mld_marksources()" to
        a) send nothing when all queried sources are excluded
        b) send full exclude report when source queried sources are
                not excluded
        c) don't schedule a timer when there's nothing to report

NOTE: RFC 3810 specifies the source list should be saved and each
  source reported individually as an IS_IN. This is an obvious DOS
  path, requiring the host to store and then multicast as many sources
  as are queried (e.g., millions...). This alternative sends a full, 
  relevant report that's limited to number of sources present on the
  machine.

4) fix "add_grec()" to send empty-source records when it should
        The original check doesn't account for a non-empty source
        list with all sources inactive; the new code keeps that
        short-circuit case, and also generates the group header
        with an empty list if needed.

5) fix mca_crcount decrement to be after add_grec(), which needs
        its original value
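Item 1 is a byte-order bug, and its effect is easy to reproduce in user space. The sketch below assumes a 16-bit count carried in network (big-endian) byte order and read by a little-endian CPU; the function name and the choice of field are illustrative.

```python
import struct


def read_count(wire: bytes, convert: bool) -> int:
    """Read a 16-bit wire field the way a little-endian CPU would.

    With convert=False the raw load is returned, as if ntohs were missing.
    With convert=True the two bytes are swapped, as ntohs does on
    little-endian hardware.
    """
    (raw,) = struct.unpack("<H", wire)
    if not convert:
        return raw
    return ((raw & 0xFF) << 8) | (raw >> 8)
```

A count of 2 on the wire is seen as 512 without the conversion, so a length check based on it demands far more data than the packet holds. A count of zero reads the same either way, which matches the note that only nonzero source queries broke.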

These issues (other than item #1 ;-) ) were all found by Yan Zheng,
much thanks!

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-27 14:03:00 -08:00
David S. Miller
1b93ae64ca [NET]: Validate socket filters against BPF_MAXINSNS in one spot.
Currently the checks are scattered all over and this leads
to inconsistencies and even cases where the check is not made.

Based upon a patch from Kris Katterjohn.
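The benefit of validating in one spot can be shown with a small model in which every entry point calls the same length check. BPF_MAXINSNS is taken to be 4096 here; the two attach functions are hypothetical callers invented for illustration.

```python
BPF_MAXINSNS = 4096  # maximum number of instructions in a filter program


def filter_length_ok(flen: int) -> bool:
    """Single place where the instruction count is validated."""
    return 0 < flen <= BPF_MAXINSNS


def attach_socket_filter(flen: int) -> bool:
    """Hypothetical entry point: relies on the shared check."""
    return filter_length_ok(flen)


def attach_other_filter(flen: int) -> bool:
    """Second hypothetical entry point: same check, so no inconsistency."""
    return filter_length_ok(flen)
```

Because no caller carries its own copy of the bound, none can forget it or disagree about it.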

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-27 13:57:59 -08:00
YOSHIFUJI Hideaki
6732badee0 [IPV6]: Fix addrconf dead lock.
We need to release idev->lock before we call addrconf_dad_stop().
It calls ipv6_addr_del(), which will hold idev->lock.
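The pattern behind this fix, dropping a non-recursive lock before calling code that takes the same lock, can be sketched in user space. The classes and functions below are hypothetical stand-ins, not the kernel's addrconf code.

```python
import threading


class Idev:
    """Toy device with an address list guarded by a non-recursive lock."""

    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.addresses = ["fe80::1"]


def addr_del(idev: Idev, addr: str) -> None:
    """Remove an address; takes the device lock itself."""
    with idev.lock:
        if addr in idev.addresses:
            idev.addresses.remove(addr)


def dad_failure(idev: Idev, addr: str) -> None:
    """Handle a duplicate address without deadlocking."""
    idev.lock.acquire()
    present = addr in idev.addresses
    idev.lock.release()  # release first: addr_del() acquires the same lock
    if present:
        addr_del(idev, addr)
```

Calling addr_del() while still holding the lock would block forever. Because the state can change between the release and the re-acquire, addr_del() rechecks membership under the lock.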

Bug spotted by Yasuyuki KOZAKAI <yasuyuki.kozakai@toshiba.co.jp>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-27 13:35:15 -08:00
David Kimdon
79cac2a221 [BR_NETFILTER]: Fix leak if skb traverses > 1 bridge
Call nf_bridge_put() before allocating a new nf_bridge structure and
potentially overwriting the pointer to a previously allocated one.
This fixes a memory leak which can occur when the bridge topology
allows for an skb to traverse more than one bridge.
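The leak and its fix can be modeled with a reference count and a counter of live objects. The names mirror those in the message, but this is a toy user-space model and not the kernel implementation.

```python
class NfBridge:
    """Toy reference-counted object; `live` tracks objects not yet freed."""

    live = 0

    def __init__(self) -> None:
        self.refs = 1
        NfBridge.live += 1


def nf_bridge_put(nb) -> None:
    """Drop one reference and 'free' the object when none remain."""
    if nb is None:
        return
    nb.refs -= 1
    if nb.refs == 0:
        NfBridge.live -= 1


class Skb:
    def __init__(self) -> None:
        self.nf_bridge = None


def nf_bridge_alloc(skb: Skb) -> NfBridge:
    """Attach a fresh structure, releasing any previous one first."""
    nf_bridge_put(skb.nf_bridge)  # without this, the old object leaks
    skb.nf_bridge = NfBridge()
    return skb.nf_bridge
```

Calling nf_bridge_alloc() twice on the same skb, as happens when it crosses a second bridge, leaves exactly one live object.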

Signed-off-by: David Kimdon <david.kimdon@devicescape.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-26 17:27:10 -08:00
David L Stevens
6f4353d891 [IPV6]: Increase default MLD_MAX_MSF to 64.
The existing default of 10 is just way too low.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-26 17:03:46 -08:00
Hiroyuki YAMAMORI
291d809ba5 [IPV6]: Fix Temporary Address Generation
From: Hiroyuki YAMAMORI <h-yamamo@db3.so-net.ne.jp>

Since regen_count is stored in the public address, we need to reset it
when we start renewing temporary address.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-23 11:24:05 -08:00
YOSHIFUJI Hideaki
3dd3bf8357 [IPV6]: Fix dead lock.
We need to release ifp->lock before we call addrconf_dad_stop(),
which will hold ifp->lock.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-23 11:23:21 -08:00
David S. Miller
e6469297d4 Merge git://git.skbuff.net/gitroot/yoshfuji/linux-2.6.14+git+ipv6-fix-20051221a 2005-12-22 07:41:27 -08:00