Commit Graph

2333 Commits

Author SHA1 Message Date
Herbert Xu
2c6cc0d853 [BRIDGE]: Add support for NETIF_F_HW_CSUM devices
As it is, the bridge will only ever declare NETIF_F_IP_CSUM even if all
its constituent devices support NETIF_F_HW_CSUM.  This patch fixes
this by supporting the first one out of NETIF_F_NO_CSUM,
NETIF_F_HW_CSUM, and NETIF_F_IP_CSUM that is supported by all
constituent devices.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:06:45 -07:00
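
The selection rule in the bridge commit above can be sketched in user-space C. This is only an illustration under stated assumptions: the flag values, the helper names (csum_rank, bridge_csum_flag), and the reading that a stronger capability implies the weaker ones are all hypothetical and are not taken from the bridge code.

```c
#include <stddef.h>

/* Illustrative flag values; the kernel's own definitions may differ. */
#define NETIF_F_IP_CSUM 0x02u
#define NETIF_F_NO_CSUM 0x04u
#define NETIF_F_HW_CSUM 0x08u

/* Rank one port: 3 = NO_CSUM, 2 = HW_CSUM, 1 = IP_CSUM, 0 = none.
 * Assumption: a stronger capability implies the weaker ones. */
static int csum_rank(unsigned int features)
{
    if (features & NETIF_F_NO_CSUM)
        return 3;
    if (features & NETIF_F_HW_CSUM)
        return 2;
    if (features & NETIF_F_IP_CSUM)
        return 1;
    return 0;
}

/* Hypothetical helper: the bridge advertises the strongest checksum
 * flag that every constituent port can honour, or 0 if there is none. */
static unsigned int bridge_csum_flag(const unsigned int *ports, size_t nports)
{
    static const unsigned int flag_for_rank[] = {
        0, NETIF_F_IP_CSUM, NETIF_F_HW_CSUM, NETIF_F_NO_CSUM
    };
    int rank = 3;
    size_t i;

    if (nports == 0)
        return 0;
    for (i = 0; i < nports; i++) {
        int r = csum_rank(ports[i]);
        if (r < rank)
            rank = r;   /* the weakest port limits the bridge */
    }
    return flag_for_rank[rank];
}
```

With two NETIF_F_HW_CSUM ports this sketch yields NETIF_F_HW_CSUM, which is the case the commit message says was previously reported as NETIF_F_IP_CSUM.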
Herbert Xu
8648b3053b [NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM
The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
identically so we test for them in quite a few places.  For the sake
of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two.  We
also test the disjunct of NETIF_F_IP_CSUM and the other two in various
places; for that purpose I've added NETIF_F_ALL_CSUM.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:06:05 -07:00
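
As a rough sketch of the two macros this commit describes: NETIF_F_GEN_CSUM groups the two flags the stack treats identically, and NETIF_F_ALL_CSUM adds NETIF_F_IP_CSUM to them. The numeric flag values and the two helper functions below are assumptions made for illustration only.

```c
/* Illustrative flag values; the kernel's own definitions may differ. */
#define NETIF_F_IP_CSUM 0x02u
#define NETIF_F_NO_CSUM 0x04u
#define NETIF_F_HW_CSUM 0x08u

/* The two combinations named in the commit message. */
#define NETIF_F_GEN_CSUM (NETIF_F_NO_CSUM | NETIF_F_HW_CSUM)
#define NETIF_F_ALL_CSUM (NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM)

/* Hypothetical helper: true if the device has either "generic" flag. */
static int has_generic_csum(unsigned int features)
{
    return (features & NETIF_F_GEN_CSUM) != 0;
}

/* Hypothetical helper: true if the device has any checksum flag. */
static int has_any_csum(unsigned int features)
{
    return (features & NETIF_F_ALL_CSUM) != 0;
}
```

A caller that used to test both flags separately can then test one mask instead.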
David S. Miller
35089bb203 [TCP]: Add tcp_slow_start_after_idle sysctl.
A lot of people have asked for a way to disable tcp_cwnd_restart(),
and it seems reasonable to add a sysctl to do that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:53 -07:00
Luca De Cicco
bc726a71d2 [TCP] Westwood: reset RTT min after FRTO
RTT_min is updated each time a timeout event occurs
in order to cope with hard handovers in wireless scenarios such as UMTS.

Signed-off-by: Luca De Cicco <ldecicco@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:38 -07:00
Luca De Cicco
b3a92eabe5 [TCP] Westwood: bandwidth filter startup
The bandwidth estimate filter is now initialized with the first
sample in order to have better performance in the case of small
file transfers.

Signed-off-by: Luca De Cicco <ldecicco@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:36 -07:00
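
The effect of seeding a bandwidth filter with its first sample can be sketched as follows. The structure, the function names, and the 7/8 smoothing weight are assumptions chosen for illustration; the commit message does not give the filter's coefficients, and this is not the Westwood module's code.

```c
#include <stdint.h>

/* Hypothetical filter state. */
struct bw_filter_sketch {
    uint32_t bw_est;       /* current bandwidth estimate */
    int      have_sample;  /* 0 until the first sample arrives */
};

/* Assumed low-pass step: new = (7 * old + sample) / 8. */
static uint32_t lowpass_sketch(uint32_t old, uint32_t sample)
{
    return (uint32_t)((7 * (uint64_t)old + sample) >> 3);
}

/* Feed one sample; the first sample initializes the estimate directly
 * instead of being averaged against an initial zero. */
static void bw_filter_update(struct bw_filter_sketch *f, uint32_t sample)
{
    if (!f->have_sample) {
        f->bw_est = sample;
        f->have_sample = 1;
    } else {
        f->bw_est = lowpass_sketch(f->bw_est, sample);
    }
}
```

Under these assumptions a first sample of 800 gives an estimate of 800 at once, whereas averaging it against zero would give 100, which matters for short transfers that only ever produce a few samples.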
Luca De Cicco
b7d7a9e3c9 [TCP] Westwood: comment fixes
Clean up some comments and add more references.

Signed-off-by: Luca De Cicco <ldecicco@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:34 -07:00
Stephen Hemminger
f61e29018a [TCP] Westwood: fix first sample
Need to update send sequence number tracking after first ack.
Rework of patch from Luca De Cicco.

Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:32 -07:00
Stephen Hemminger
bdeb04c6d9 [NET]: net.ipv4.ip_autoconfig sysctl removal
The sysctl net.ipv4.ip_autoconfig is a legacy value that is not used.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:30 -07:00
Alexey Dobriyan
f8d5962112 [IPX]: Endian bug in ipxrtr_route_packet()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:24 -07:00
Herbert Xu
3cc0e87398 [NET]: Warn in __skb_trim if skb is paged
It's better to warn and fail rather than rarely triggering BUG on paths
that incorrectly call skb_trim/__skb_trim on a non-linear skb.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:22 -07:00
Herbert Xu
b38dfee3d6 [NET]: skb_trim audit
I found a few more spots where pskb_trim_rcsum could be used but were not.
This patch changes them to use it.

Also, sk_filter can get paged skb data.  Therefore we must use pskb_trim
instead of skb_trim.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:20 -07:00
Herbert Xu
364c6badde [NET]: Clean up skb_linearize
The linearisation operation doesn't need to be super-optimised.  So we can
replace __skb_linearize with __pskb_pull_tail which does the same thing but
is more general.

Also, most users of skb_linearize end up testing whether the skb is linear
or not so it helps to make skb_linearize do just that.

Some callers of skb_linearize also use it to copy cloned data, so it's
useful to have a new function skb_linearize_cow to copy the data if it's
either non-linear or cloned.

Last but not least, I've removed the gfp argument since nobody uses it
anymore.  If it's ever needed we can easily add it back.

Misc bugs fixed by this patch:

* via-velocity error handling (also, no SG => no frags)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:16 -07:00
Herbert Xu
932ff279a4 [NET]: Add netif_tx_lock
Various drivers use xmit_lock internally to synchronise with their
transmission routines.  They do so without setting xmit_lock_owner.
This is fine as long as netpoll is not in use.

With netpoll it is possible for deadlocks to occur if xmit_lock_owner
isn't set.  This is because a printk that occurs while xmit_lock is held
and xmit_lock_owner is not set can cause netpoll to attempt to take
xmit_lock recursively.

While it is possible to resolve this by getting netpoll to use
trylock, it is suboptimal because netpoll's sole objective is to
maximise the chance of getting the printk out on the wire.  So
delaying or dropping the message is to be avoided as much as possible.

So the only alternative is to always set xmit_lock_owner.  The
following patch does this by introducing the netif_tx_lock family of
functions that take care of setting/unsetting xmit_lock_owner.

I renamed xmit_lock to _xmit_lock to indicate that it should not be
used directly.  I didn't provide irq versions of the netif_tx_lock
functions since xmit_lock is meant to be a BH-disabling lock.

This is pretty much a straight text substitution except for a small
bug fix in winbond.  It currently uses
netif_stop_queue/spin_unlock_wait to stop transmission.  This is
unsafe as an IRQ can potentially wake up the queue.  So it is safer to
use netif_tx_disable.

The hamradio bits used spin_lock_irq but it is unnecessary as
xmit_lock must never be taken in an IRQ handler.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:14 -07:00
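
The idea of a lock wrapper that always records its owner can be sketched in user space with C11 atomics. Everything here is a hypothetical stand-in: the struct, the _sketch function names, and the use of an atomic_flag spin loop are illustrative assumptions, not the kernel's netif_tx_lock implementation.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the relevant net_device fields. */
struct net_device_sketch {
    atomic_flag _xmit_lock;       /* not meant to be taken directly */
    atomic_int  xmit_lock_owner;  /* CPU id of the holder, -1 if free */
};

static void netif_tx_lock_init_sketch(struct net_device_sketch *dev)
{
    atomic_flag_clear(&dev->_xmit_lock);
    atomic_init(&dev->xmit_lock_owner, -1);
}

/* Take the lock and record the owner in one place, so that no caller
 * can forget to set xmit_lock_owner. */
static void netif_tx_lock_sketch(struct net_device_sketch *dev, int cpu)
{
    while (atomic_flag_test_and_set_explicit(&dev->_xmit_lock,
                                             memory_order_acquire))
        ;  /* spin until the lock is free */
    atomic_store(&dev->xmit_lock_owner, cpu);
}

static void netif_tx_unlock_sketch(struct net_device_sketch *dev)
{
    atomic_store(&dev->xmit_lock_owner, -1);
    atomic_flag_clear_explicit(&dev->_xmit_lock, memory_order_release);
}

/* A netpoll-like caller can check this before locking to avoid
 * recursing on a lock that its own CPU already holds. */
static int tx_lock_held_by(struct net_device_sketch *dev, int cpu)
{
    return atomic_load(&dev->xmit_lock_owner) == cpu;
}
```

Because the owner is set inside the wrapper, a check such as tx_lock_held_by() is reliable for every caller that goes through the wrapper, which is the property the commit message is after.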
Patrick McHardy
bf0857ea32 [NETFILTER]: hashlimit match: fix random initialization
hashlimit does:

        if (!ht->rnd)
                get_random_bytes(&ht->rnd, 4);

ignoring that 0 is also a valid random number.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:11 -07:00
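
One way to avoid treating 0 as "not yet seeded" is to track initialization in a separate flag. The sketch below assumes that approach; the struct, the field name rnd_initialized, and the injected random source are hypothetical, and fake_random() is a deterministic stand-in used only to show the case where the random value happens to be 0.

```c
#include <stdint.h>

/* Hypothetical table state: the seed plus an explicit "seeded" flag. */
struct htable_sketch {
    uint32_t rnd;
    int      rnd_initialized;
};

typedef uint32_t (*random_fn)(void);

/* Seed exactly once, even when the random value drawn is 0. */
static void htable_seed_once(struct htable_sketch *ht, random_fn get_random)
{
    if (!ht->rnd_initialized) {
        ht->rnd = get_random();
        ht->rnd_initialized = 1;
    }
}

/* Deterministic stand-in for a random source: returns 0 on the first
 * call and 7 afterwards, and counts how often it was called. */
static unsigned int fake_random_calls;

static uint32_t fake_random(void)
{
    return fake_random_calls++ == 0 ? 0u : 7u;
}
```

With the original `if (!ht->rnd)` test, a seed of 0 would be redrawn on the next call; with the flag, the seed stays fixed after the first draw.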
Patrick McHardy
2b2283d030 [NETFILTER]: recent match: missing refcnt initialization
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:09 -07:00
Patrick McHardy
a0e889bb1b [NETFILTER]: recent match: fix "sleeping function called from invalid context"
create_proc_entry must not be called with locks held. Use a mutex
instead to protect data only changed in user context.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:07 -07:00
James Morris
100468e9c0 [SECMARK]: Add CONNSECMARK xtables target
Add a new xtables target, CONNSECMARK, which is used to specify rules
for copying security marks from packets to connections, and for
copying security marks back from connections to packets.  This is
similar to the CONNMARK target, but is more limited in scope in that
it only allows copying of security marks to and from packets, as this
is all it needs to do.

A typical scenario would be to apply a security mark to a 'new' packet
with SECMARK, then copy that to its conntrack via CONNSECMARK, and then
restore the security mark from the connection to established and
related packets on that connection.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:03 -07:00
James Morris
7c9728c393 [SECMARK]: Add secmark support to conntrack
Add a secmark field to IP and NF conntracks, so that security markings
on packets can be copied to their associated connections, and also
copied back to packets as required.  This is similar to the network
mark field currently used with conntrack, although it is intended for
enforcement of security policy rather than network policy.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:01 -07:00
James Morris
5e6874cdb8 [SECMARK]: Add xtables SECMARK target
Add a SECMARK target to xtables, allowing the admin to apply security
marks to packets via both iptables and ip6tables.

The target currently handles SELinux security marking, but can be
extended for other purposes as needed.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:59 -07:00
James Morris
984bc16cc9 [SECMARK]: Add secmark support to core networking.
Add a secmark field to the skbuff structure, to allow security subsystems to
place security markings on network packets.  This is similar to the nfmark
field, except that it is intended for implementing security policy, rather than
networking policy.

This patch was already acked in principle by Dave Miller.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:57 -07:00
David S. Miller
6f68dc3775 [NET]: Fix warnings after LSM-IPSEC changes.
Assignment used as truth value in xfrm_del_sa()
and xfrm_get_policy().

Wrong argument type declared for security_xfrm_state_delete()
when SELINUX is disabled.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:49 -07:00
Dave Jones
9dadaa19cb [NET]: NET_TCPPROBE Kconfig fix
Just spotted this typo in a new option.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:47 -07:00
Catherine Zhang
c8c05a8eec [LSM-IPsec]: SELinux Authorize
This patch contains a fix for the previous patch that adds security
contexts to IPsec policies and security associations.  In the previous
patch, no authorization (besides the check for write permissions to
SAD and SPD) is required to delete IPsec policies and security
associations with security contexts.  Thus a user authorized to change
SAD and SPD can bypass the IPsec policy authorization by simply
deleting policies with security contexts.  To fix this security hole,
an additional authorization check is added for removing security
policies and security associations with security contexts.

Note that if no security context is supplied on add or present on
policy to be deleted, the SELinux module allows the change
unconditionally.  The hook is called on deletion when no context is
present, which we may want to change.  At present, I left it up to the
module.

LSM changes:

The patch adds two new LSM hooks: xfrm_policy_delete and
xfrm_state_delete.  The new hooks are necessary to authorize deletion
of IPsec policies that have security contexts.  The existing hooks
xfrm_policy_free and xfrm_state_free lack the context to do the
authorization, so I decided to split authorization of deletion and
memory management of security data, as is typical in the LSM
interface.

Use:

The new delete hooks are checked when xfrm_policy or xfrm_state are
deleted by either the xfrm_user interface (xfrm_get_policy,
xfrm_del_sa) or the pfkey interface (pfkey_spddelete, pfkey_delete).

SELinux changes:

The new policy_delete and state_delete functions are added.

Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:45 -07:00
David S. Miller
f86502bfc1 [IPV4] icmp: Kill local 'ip' arg in icmp_redirect().
It is typed wrong, and it's only assigned and used once.
So just pass in iph->daddr directly which fixes both problems.

Based upon a patch by Alexey Dobriyan.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:41 -07:00
Alexey Dobriyan
6d74165350 [IPV4]: Right prototype of __raw_v4_lookup()
All users pass 32-bit values as addresses and internally they're
compared with 32-bit entities. So, change "laddr" and "raddr" types to
__be32.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:39 -07:00
Alexey Dobriyan
338fcf9886 [IPV4] igmp: Fixup struct ip_mc_list::multiaddr type
All users except two expect 32-bit big-endian value. One is of

	->multiaddr = ->multiaddr

variety. And last one is "%08lX".

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:37 -07:00
David S. Miller
70df2311ee [TCP]: Fix compile warning in tcp_probe.c
The suseconds_t et al. are not necessarily any particular type on
every platform, so cast to unsigned long so that we can use one printf
format string and avoid warnings across the board.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:35 -07:00
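
The cast-to-unsigned-long approach can be sketched in user space with snprintf standing in for printk. The helper name format_timeval_sketch is hypothetical; the point is only that both fields are cast to one known type so a single format string works regardless of how the platform defines time_t and suseconds_t.

```c
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical helper: format a timeval as "seconds.microseconds".
 * The casts make the argument types match %lu on every platform. */
static int format_timeval_sketch(char *buf, size_t len,
                                 const struct timeval *tv)
{
    return snprintf(buf, len, "%lu.%06lu",
                    (unsigned long)tv->tv_sec,
                    (unsigned long)tv->tv_usec);
}
```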
Stephen Hemminger
738980ffa6 [TCP]: Limited slow start for Highspeed TCP
Implementation of RFC3742 limited slow start. Added as part
of the TCP highspeed congestion control module.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:33 -07:00
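
The per-ACK increase rule from RFC 3742 can be sketched as below. This follows the RFC's formula (K = int(cwnd / (0.5 * max_ssthresh)), then cwnd grows by int(MSS / K)) in byte units; the function name and the handling of max_ssthresh == 0 as "no limit" are assumptions, and this is not the code added to the highspeed module.

```c
#include <stdint.h>

/* Hypothetical helper: bytes to add to cwnd for one ACK during slow
 * start, with cwnd, max_ssthresh and mss all expressed in bytes. */
static uint32_t limited_slow_start_incr(uint32_t cwnd,
                                        uint32_t max_ssthresh,
                                        uint32_t mss)
{
    uint32_t k;

    /* Ordinary slow start below the threshold. Treating a zero
     * threshold as "no limit" is an assumption of this sketch. */
    if (max_ssthresh == 0 || cwnd <= max_ssthresh)
        return mss;

    /* K = int(cwnd / (0.5 * max_ssthresh)); K >= 2 in this branch. */
    k = (uint32_t)((2 * (uint64_t)cwnd) / max_ssthresh);
    return mss / k;
}
```

For example, with an MSS of 1000 bytes and max_ssthresh of 100000 bytes, a cwnd of 200000 bytes gives K = 4 and an increase of 250 bytes per ACK instead of 1000.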
Stephen Hemminger
a42e9d6ce8 [TCP]: TCP Probe congestion window tracing
This adds a new module for tracking TCP state variables non-intrusively
using kprobes.  It has a simple /proc interface that outputs one line
for each packet received. A sample usage is to collect congestion
window and ssthresh over time graphs.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:31 -07:00
Stephen Hemminger
72dc5b9225 [TCP]: Minimum congestion window consolidation.
Many of the TCP congestion methods all just use ssthresh
as the minimum congestion window on decrease.  Rather than
duplicating the code, just have that be the default if that
handle in the ops structure is not set.

Minor behaviour change to TCP compound.  It probably wants
to use this (ssthresh) as lower bound, rather than ssthresh/2
because the latter causes undershoot on loss.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:29 -07:00
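
The "use ssthresh unless the module overrides it" default can be sketched with an optional function pointer. The struct layouts and names below are simplified, hypothetical stand-ins for the congestion control ops structure and the TCP socket state.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical, reduced socket state. */
struct tcp_sock_sketch {
    uint32_t snd_cwnd;
    uint32_t snd_ssthresh;
};

/* Hypothetical ops structure; min_cwnd may be left NULL. */
struct cong_ops_sketch {
    uint32_t (*min_cwnd)(const struct tcp_sock_sketch *tp);
};

/* Lower bound for cwnd on decrease: call the module's handler if it
 * set one, otherwise fall back to ssthresh. */
static uint32_t cwnd_floor_sketch(const struct cong_ops_sketch *ops,
                                  const struct tcp_sock_sketch *tp)
{
    if (ops->min_cwnd != NULL)
        return ops->min_cwnd(tp);
    return tp->snd_ssthresh;
}

/* Example override that a module could supply instead of the default. */
static uint32_t half_ssthresh_sketch(const struct tcp_sock_sketch *tp)
{
    return tp->snd_ssthresh / 2;
}
```

Modules that are happy with ssthresh simply leave the pointer unset, which is what removes the duplicated code.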
Stephen Hemminger
a4ed258495 [TCP]: TCP Compound quad root function
The original code did a 64 bit divide directly, which won't work on
32 bit platforms.  Rather than doing a 64 bit square root twice,
just implement a 4th root function in one pass using Newton's method.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:27 -07:00
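
A single-pass fourth root by Newton's method can be sketched as follows. This is a user-space illustration, not the function from the commit: the name qroot_sketch is hypothetical, and the sketch freely uses 64-bit division, which is exactly what kernel code on 32-bit platforms cannot do directly.

```c
#include <stdint.h>

/* Hypothetical helper: floor of the fourth root of a, using the Newton
 * iteration x' = (3x + a / x^3) / 4 in integer arithmetic. */
static uint32_t qroot_sketch(uint64_t a)
{
    uint64_t x, y;

    if (a == 0)
        return 0;

    /* Start above the root: a < 2^64, so the root is below 2^16. */
    x = 1u << 16;
    for (;;) {
        uint64_t x3 = x * x * x;   /* at most 2^48, cannot overflow */
        y = (3 * x + a / x3) / 4;
        if (y >= x)                /* stopped decreasing: x is the floor */
            return (uint32_t)x;
        x = y;
    }
}
```

Starting from an upper bound makes the iterates decrease monotonically until they reach the integer floor, so no separate correction step is needed.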
Angelo P. Castellani
f890f92104 [TCP]: TCP Compound congestion control
TCP Compound is a sender-side only change to TCP that uses
a mixed Reno/Vegas approach to calculate the cwnd.

For further details look here:
  ftp://ftp.research.microsoft.com/pub/tr/TR-2005-86.pdf

Signed-off-by: Angelo P. Castellani <angelo.castellani@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:25 -07:00
Bin Zhou
76f1017757 [TCP]: TCP Veno congestion control
TCP Veno module is a new congestion control module to improve TCP
performance over wireless networks. The key innovation in TCP Veno is
the enhancement of TCP Reno/Sack congestion control algorithm by using
the estimated state of a connection based on TCP Vegas. This scheme
significantly reduces "blind" reduction of TCP window regardless of
the cause of packet loss.

This work is based on the research paper "TCP Veno: TCP Enhancement
for Transmission over Wireless Access Networks." C. P. Fu, S. C. Liew,
IEEE Journal on Selected Areas in Communication, Feb. 2003.

Original paper and many latest research works on veno:
 http://www.ntu.edu.sg/home/ascpfu/veno/veno.html

Signed-off-by: Bin Zhou <zhou0022@ntu.edu.sg>
	       Cheng Peng Fu <ascpfu@ntu.edu.sg>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:23 -07:00
Wong Hoi Sing Edison
7c106d7e78 [TCP]: TCP Low Priority congestion control
TCP Low Priority is a distributed algorithm whose goal is to utilize only
 the excess network bandwidth as compared to the ``fair share`` of
 bandwidth as targeted by TCP. Available from:
   http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf

Original Author:
 Aleksandar Kuzmanovic <akuzma@northwestern.edu>

See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation.
As of 2.6.13, Linux supports pluggable congestion control algorithms.
Due to the limitation of the API, we take the following changes from
the original TCP-LP implementation:
 o We use newReno in most core CA handling. Only add some checking
   within cong_avoid.
 o Error correction for the remote HZ: the remote HZ is continually
   checked and updated.
 o Handle calculation of One-Way-Delay (OWD) within rtt_sample, since
   OWD has a similar meaning to RTT. Also correct the buggy formula.
 o Handle reaction to Early Congestion Indication (ECI) within
   pkts_acked, as mentioned in the pseudo code.
 o OWD is handled in relative format, where the local time stamp will be
   in tcp_time_stamp format.

Port from 2.4.19 to 2.6.16 as module by:
 Wong Hoi Sing Edison <hswong3i@gmail.com>
 Hung Hing Lun <hlhung3i@gmail.com>

Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:21 -07:00
Andrew Morton
2f45c340e0 [LLC]: Fix double receive of SKB.
Oops fix from Stephen: remove duplicate rcv() calls.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:19 -07:00
Alexey Dobriyan
c45fb1089e [NETFILTER]: PPTP helper: fixup gre_keymap_lookup() return type
GRE keys are 16-bit wide.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:17 -07:00
Patrick McHardy
ae5b7d8ba2 [NETFILTER]: Add SIP connection tracking helper
Add SIP connection tracking helper. Originally written by
Christian Hentschel <chentschel@arnet.com.ar>, some cleanup, minor
fixes and bidirectional SIP support added by myself.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:15 -07:00
Patrick McHardy
e44ab66a75 [NETFILTER]: H.323 helper: replace internal_net_addr parameter by routing-based heuristic
Call Forwarding doesn't need to create an expectation if both peers can
reach each other without our help. The internal_net_addr parameter
lets the user explicitly specify a single network where this is true,
but is not very flexible and even fails in the common case that calls
will be forwarded to both outside parties and inside parties. Use an
optional heuristic based on routing instead: the assumption is that
if both the outgoing device and the gateway are equal, both peers can
reach each other directly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:13 -07:00
Jing Min Zhao
c0d4cfd96d [NETFILTER]: H.323 helper: Add support for Call Forwarding
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:11 -07:00
Patrick McHardy
c952616934 [NETFILTER]: amanda helper: convert to textsearch infrastructure
When a port number within a packet is replaced by a differently sized
number only the packet is resized, but not the copy of the data.
Following port numbers are rewritten based on their offsets within
the copy, leading to packet corruption.

Convert the amanda helper to the textsearch infrastructure to avoid
the copy entirely.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:09 -07:00
Patrick McHardy
7d8c501817 [NETFILTER]: FTP helper: search optimization
Instead of skipping search entries for the wrong direction simply index
them by direction.

Based on patch by Pablo Neira <pablo@netfilter.org>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:07 -07:00
Patrick McHardy
695ecea329 [NETFILTER]: SNMP helper: fix debug module param type
debug is the debug level, not a bool.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:05 -07:00
Patrick McHardy
89f2e21883 [NETFILTER]: ctnetlink: change table dumping not to require an unique ID
Instead of using the ID to find out where to continue dumping, take a
reference to the last entry dumped and try to continue there.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:03 -07:00
Patrick McHardy
3726add766 [NETFILTER]: ctnetlink: fix NAT configuration
The current configuration only allows configuring one manip and overloads
conntrack status flags with netlink semantics.

Signed-off-by: Patrick Mchardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:01 -07:00
Eric Leblond
997ae831ad [NETFILTER]: conntrack: add fixed timeout flag in connection tracking
Add a flag to the connection status so that the timeout is not updated.
This allows connections that automatically die at a given
time.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:59 -07:00
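
The behaviour of a fixed-timeout flag can be sketched as a refresh routine that skips flagged connections. The flag name, its value, the struct, and the function are hypothetical placeholders rather than the conntrack definitions.

```c
/* Hypothetical status bit meaning "never extend this timeout". */
#define CT_FIXED_TIMEOUT_SKETCH 0x1u

/* Hypothetical, reduced connection state. */
struct conn_sketch {
    unsigned int  status;
    unsigned long expires;   /* absolute expiry time, in ticks */
};

/* Called when traffic is seen: extend the lifetime unless the
 * connection was marked as having a fixed timeout. */
static void conn_refresh_sketch(struct conn_sketch *ct,
                                unsigned long now,
                                unsigned long extra)
{
    if (ct->status & CT_FIXED_TIMEOUT_SKETCH)
        return;
    ct->expires = now + extra;
}
```

A flagged connection therefore expires at the time that was set originally, no matter how much traffic it carries.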
Patrick McHardy
39a27a35c5 [NETFILTER]: conntrack: add sysctl to disable checksumming
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:57 -07:00
Patrick McHardy
6442f1cf89 [NETFILTER]: conntrack: don't call helpers for related ICMP messages
None of the existing helpers expects to get called for related ICMP
packets and some even drop them if they can't parse them.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:55 -07:00
Patrick McHardy
404bdbfd24 [NETFILTER]: recent match: replace by rewritten version
Replace the unmaintainable ipt_recent match by a rewritten version that
should be fully compatible.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:53 -07:00
Patrick McHardy
f3389805e5 [NETFILTER]: x_tables: add statistic match
Add statistic match which is a combination of the nth and random matches.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:51 -07:00
Patrick McHardy
62b7743483 [NETFILTER]: x_tables: add quota match
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:49 -07:00
Patrick McHardy
957dc80ac3 [NETFILTER]: x_tables: add SCTP/DCCP support where missing
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:47 -07:00
Patrick McHardy
3e72b2fe5b [NETFILTER]: x_tables: remove some unnecessary casts
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:45 -07:00
Herbert Xu
31a4ab9302 [IPSEC] proto: Move transport mode input path into xfrm_mode_transport
Now that we have xfrm_mode objects we can move the transport mode specific
input decapsulation code into xfrm_mode_transport.  This removes duplicate
code as well as unnecessary header movement in case of tunnel mode SAs
since we will discard the original IP header immediately.

This also fixes a minor bug for transport-mode ESP where the IP payload
length is set to the correct value minus the header length (with extension
headers for IPv6).

Of course the other neat thing is that we no longer have to allocate
temporary buffers to hold the IP headers for ESP and IPComp.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:41 -07:00
Herbert Xu
b59f45d0b2 [IPSEC] xfrm: Abstract out encapsulation modes
This patch adds the structure xfrm_mode.  It is meant to represent
the operations carried out by transport/tunnel modes.

By doing this we allow additional encapsulation modes to be added
without clogging up the xfrm_input/xfrm_output paths.

Candidate modes include 4-to-6 tunnel mode, 6-to-4 tunnel mode, and
BEET modes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:39 -07:00
Herbert Xu
546be2405b [IPSEC] xfrm: Undo afinfo lock proliferation
The number of locks used to manage afinfo structures can easily be reduced
down to one each for policy and state respectively.  This is based on the
observation that the write locks are only held by module insertion/removal
which are very rare events so there is no need to further differentiate
between the insertion of modules like ipv6 versus esp6.

The removal of the read locks in xfrm4_policy.c/xfrm6_policy.c might look
suspicious at first.  However, after you realise that nobody ever takes
the corresponding write lock you'll feel better :)

As far as I can gather it's an attempt to guard against the removal of
the corresponding modules.  Since neither module can be unloaded at all
we can leave it to whoever fixes up IPv6 unloading :)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:37 -07:00
David S. Miller
15986e1aad [TCP]: tcp_rcv_rtt_measure_ts() call in pure-ACK path is superfluous
We only want to take receive RTT measurements for data-bearing
frames.  Here, in the header prediction fast path
for a pure sender, we know that we have a pure-ACK and
thus the checks in tcp_rcv_rtt_measure_ts() will not pass.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:16 -07:00
Stephen Hemminger
11dc1f36a6 [BRIDGE]: netlink interface for link management
Add basic netlink support to the Ethernet bridge. Including:
 * dump interfaces in bridges
 * monitor link status changes
 * change state of bridge port

For some demo programs see:
	http://developer.osdl.org/shemminger/prototypes/brnl.tar.gz

These are to allow building a daemon that does alternative
implementations of Spanning Tree Protocol.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:14 -07:00
Stephen Hemminger
c090971326 [BRIDGE]: fix module startup error handling
Return address in use if some other kernel code has the SAP.
Propagate error codes from netfilter registration and unwind.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:12 -07:00
Stephen Hemminger
9ef513bed6 [BRIDGE]: optimize conditional in forward path
Small optimizations of bridge forwarding path.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:10 -07:00
Stephen Hemminger
bc0e646796 [LLC]: add multicast support for datagrams
Allow multicast reception of datagrams (similar to UDP).
All sockets bound to the same SAP receive a clone.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:08 -07:00
Stephen Hemminger
8f182b494f [LLC]: allow applications to get copy of kernel datagrams
It is legal for an application to bind to a SAP that is also being
used by the kernel. This happens if the bridge module binds to the
STP SAP, and the user wants to have a daemon for STP as well.
It is possible to have kernel doing STP on one bridge, but
let application do RSTP on another bridge.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:06 -07:00
Stephen Hemminger
23dbe7912d [LLC]: use rcu_dereference on receive handler
The receive handler pointer might be modified during network changes
of protocol. So use rcu_dereference (only matters on alpha).

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:04 -07:00
Stephen Hemminger
29efcd2666 [LLC]: allow datagram recvmsg
LLC receive is broken for SOCK_DGRAM.
If an application does recv() on a datagram socket and there
is no data present, don't return "not connected". Instead, just
do normal datagram semantics.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:02 -07:00
Stephen Hemminger
aecbd4e45c [LLC]: use more efficient ether address routines
Use more cache efficient Ethernet address manipulation functions
in etherdevice.h.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
2006-06-17 21:26:00 -07:00
Chris Leech
1a2449a87b [I/OAT]: TCP recv offload to I/OAT
Locks down user pages and sets up for DMA in tcp_recvmsg, then calls
dma_async_try_early_copy in tcp_v4_do_rcv

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:56 -07:00
Chris Leech
9593782585 [I/OAT]: Add a sysctl for tuning the I/OAT offloaded I/O threshold
Any socket recv of less than this amount will not be offloaded.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:54 -07:00
Chris Leech
624d116473 [I/OAT]: Make sk_eat_skb I/OAT aware.
Add an extra argument to sk_eat_skb, and make it move early copied
packets to the async_wait_queue instead of freeing them.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:52 -07:00
Chris Leech
0e4b4992b8 [I/OAT]: Rename cleanup_rbuf to tcp_cleanup_rbuf and make non-static
Needed to be able to call tcp_cleanup_rbuf in tcp_input.c for I/OAT

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:50 -07:00
Chris Leech
97fc2f0848 [I/OAT]: Structure changes for TCP recv offload to I/OAT
Adds an async_wait_queue and some additional fields to tcp_sock, and a
dma_cookie_t to sk_buff.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:48 -07:00
Chris Leech
de5506e155 [I/OAT]: Utility functions for offloading sk_buff to iovec copies
Provides for pinning user space pages in memory, copying to iovecs,
and copying from sk_buffs including fragmented and chained sk_buffs.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:46 -07:00
Chris Leech
db21733488 [I/OAT]: Setup the networking subsystem as a DMA client
Attempts to allocate per-CPU DMA channels

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:24:58 -07:00
Sean Hefty
a1e8733e55 [NET]: Export ip_dev_find()
Export ip_dev_find() to allow locating a net_device given an IP address.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:28 -07:00
Larry Finger
7bd6b91800 [PATCH] wireless: correct dump of WPA IE
In net/ieee80211/softmac/ieee80211softmac_wx.c, there is a bug that
prints extended sign information whenever the byte value exceeds
0x7f. The following patch changes the printk to use a u8 cast to limit
the output to 2 digits. This bug was first noticed by Dan Williams
<dcbw@redhat.com>. This patch applies to the current master branch
of the Linville tree.
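
The sign-extension effect described above can be reproduced in user space. This is a minimal sketch, not the softmac code: the helper names are hypothetical, `snprintf` stands in for `printk`, and the 8-digit output assumes a 32-bit `int`.

```c
#include <stdio.h>

/* Hypothetical helper: formats one byte as hex without the unsigned cast.
 * A negative signed char is promoted to int, so values above 0x7f print
 * with the sign bits filled in (e.g. ffffff80 with a 32-bit int). */
int hex_byte_sign_extended(char *out, size_t n, signed char b)
{
    return snprintf(out, n, "%02x", (unsigned int)(int)b);
}

/* Hypothetical helper: the cast to an unsigned 8-bit type (u8 in the
 * kernel, unsigned char here) keeps the output to two hex digits. */
int hex_byte_fixed(char *out, size_t n, signed char b)
{
    return snprintf(out, n, "%02x", (unsigned char)b);
}
```

Both helpers return the number of characters written, so the first returns 8 and the second returns 2 for the same byte value of 0x80.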

Signed-Off-By: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-15 15:48:14 -04:00
Jeff Garzik
b5ed7639c9 Merge branch 'master' into upstream 2006-06-13 20:29:04 -04:00
John W. Linville
76df73ff90 Merge branch 'from-linus' into upstream 2006-06-13 15:38:11 -04:00
Weidong
42d1d52e69 [IPV4]: Increment ipInHdrErrors when TTL expires.
Signed-off-by: Weidong <weid@nanjing-fnst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-12 13:09:59 -07:00
Aki M Nyrhinen
79320d7e14 [TCP]: continued: reno sacked_out count fix
From: Aki M Nyrhinen <anyrhine@cs.helsinki.fi>

IMHO the current fix to the problem (in_flight underflow in reno)
is incorrect.  it treats the symptoms but ignores the problem. the
problem is timing out packets other than the head packet when we
don't have sack. i try to explain (sorry if explaining the obvious).

with sack, scanning the retransmit queue for timed out packets is
fine because we know which packets in our retransmit queue have been
acked by the receiver.

without sack, we know only how many packets in our retransmit queue the
receiver has acknowledged, but no idea which packets.

think of a "typical" slow-start overshoot case, where for example
every third packet in a window gets lost because a router buffer gets
full.

with sack, we check for timeouts on those every third packet (as the
rest have been sacked). the packet counting works out and if there
is no reordering, we'll retransmit exactly the packets that were 
lost.

without sack, however, we check for timeout on every packet and end up
retransmitting consecutive packets in the retransmit queue. in our
slow-start example, 2/3 of those retransmissions are unnecessary. these
unnecessary retransmissions eat the congestion window and eventually
prevent fast recovery from continuing, if enough packets were lost.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-11 21:18:56 -07:00
Andrea Bittau
afec35e3fe [DCCP] Ackvec: fix soft lockup in ackvec handling code
A soft lockup existed in the handling of ack vector records.
Specifically, when a tail of the list of ack vector records was
removed, it was possible to end up iterating infinitely on an element
of the tail.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-11 21:08:03 -07:00
Trond Myklebust
81039f1f20 NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mounts
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:34 -04:00
Trond Myklebust
8b23ea7bed RPC: Allow struct xdr_stream to read the page section of an xdr_buf
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:21 -04:00
Trond Myklebust
1f5ce9e93a VFS: Unexport do_kern_mount() and clean up simple_pin_fs()
Replace all module uses with the new vfs_kern_mount() interface, and fix up
simple_pin_fs().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:16 -04:00
Chuck Lever
bf3fcf8955 SUNRPC: NFS_ROOT always uses the same XIDs
The XID generator uses get_random_bytes to generate an initial XID.
NFS_ROOT starts up before the random driver, though, so get_random_bytes
doesn't set a random XID for NFS_ROOT.  This causes NFS_ROOT mount points
to reuse XIDs every time the client is booted.  If the client boots often
enough, the server will start serving old replies out of its DRC.

Use net_random() instead.

Test plan:
I/O intensive workloads should perform well and generate no errors.  Traces
taken during client reboots should show that NFS_ROOT mounts use unique
XIDs after every reboot.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:06 -04:00
Chuck Lever
b85d880684 SUNRPC: select privileged port numbers at random
Make the RPC client select privileged ephemeral source ports at
random.  This improves DRC behavior on the server by using the
same port when reconnecting for the same mount point, but using
a different port for fresh mounts.

The Linux TCP implementation already does this for nonprivileged
ports.  Note that TCP sockets in TIME_WAIT will prevent quick reuse
of a random ephemeral port number by leaving the port INUSE until
the connection transitions out of TIME_WAIT.
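
As a rough illustration of choosing a source port at random from a bounded range, here is a minimal sketch. It is not the SUNRPC implementation: the function name, the caller-supplied random value, and any bounds passed in are assumptions for the example, and the caller is assumed to remember the chosen port so that a reconnect for the same mount point can reuse it.

```c
#include <stdint.h>

/* Hypothetical helper: maps a random value onto [min_port, max_port].
 * Requires min_port <= max_port. The random value is supplied by the
 * caller so the mapping itself stays deterministic and testable. */
uint16_t pick_resv_port(uint16_t min_port, uint16_t max_port, uint32_t rnd)
{
    uint32_t range = (uint32_t)max_port - min_port + 1;

    return (uint16_t)(min_port + rnd % range);
}
```

A fresh mount would call the helper with a new random value, while a reconnect would reuse the port stored from the first call.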

Test plan:
Connectathon against every known server implementation using multiple
mount points.  Locking especially.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:05 -04:00
Jeff Garzik
ba9b28d19a Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream 2006-06-08 15:48:25 -04:00
Jeff Garzik
d15a88fc21 Merge branch 'master' into upstream 2006-06-08 15:24:46 -04:00
Jiri Benc
36485707bb [BRIDGE]: fix locking and memory leak in br_add_bridge
There are several bugs in error handling in br_add_bridge:
- when dev_alloc_name fails, allocated net_device is not freed
- unregister_netdev is called when rtnl lock is held
- free_netdev is called before netdev_run_todo has a chance to be run after
  unregistering net_device

Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-05 16:39:34 -07:00
Florin Malita
8c893ff6ab [IRDA]: Missing allocation result check in irlap_change_speed().
The skb allocation may fail, which can result in a NULL pointer dereference
in irlap_queue_xmit().

Coverity CID: 434.

Signed-off-by: Florin Malita <fmalita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-05 15:34:52 -07:00
Jes Sorensen
6569a351da [NET]: Eliminate unused /proc/sys/net/ethernet
The /proc/sys/net/ethernet directory has been sitting empty for more than
10 years!  Time to eliminate it!

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-05 15:34:11 -07:00
Herbert Xu
f291196979 [TCP]: Avoid skb_pull if possible when trimming head
Trimming the head of an skb by calling skb_pull can cause the packet
to become unaligned if the length pulled is odd.  Since the length is
entirely arbitrary for a FIN packet carrying data, this is actually
quite common.

Unaligned data is not the end of the world, but we should avoid it if
it's easily done.  In this case it is trivial.  Since we're discarding
all of the head data it doesn't matter whether we move skb->data forward
or back.

However, it is still possible to have unaligned skb->data in general.
So network drivers should be prepared to handle it instead of crashing.

This patch also adds an unlikely marking on len < headlen since partial
ACKs on head data are extremely rare in the wild.  As the return value
of __pskb_trim_head is no longer ever NULL that has been removed.

Signed-off-by: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-05 15:03:37 -07:00
Joseph Jezak
c4b3d1bb32 [PATCH] softmac: unified capabilities computation
This patch moves the capabilities field computation to a function for clarity
and adds some previously unimplemented bits.

Signed-off-by: Joseph Jezak <josejx@gentoo.org>
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-By: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05 15:51:30 -04:00
Daniel Drake
6ae15df16e [PATCH] softmac: Fix handling of authentication failure
My router blew up earlier, but exhibited some interesting behaviour during
its dying moments. It was broadcasting beacons but wouldn't respond to
any authentication requests.

I noticed that softmac wasn't playing nice with this, as I couldn't make it try
to connect to other networks after it had timed out authenticating to my ill
router.

To resolve this, I modified the softmac event/notify API to pass the event
code to the callback, so that callbacks being notified from
IEEE80211SOFTMAC_EVENT_ANY masks can make some judgement. In this case, the
ieee80211softmac_assoc callback needs to make a decision based upon whether
the association passed or failed.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05 15:51:30 -04:00
Daniel Drake
76ea4c7f4c [PATCH] softmac: complete shared key authentication
This patch finishes off the partially-complete shared key authentication
implementation in softmac.

The complication here is that we need to encrypt a management frame during
the authentication process. I don't think there are any other scenarios where
this would have to happen.

To get around this without causing too many headaches, we decided to just use
software encryption for this frame. The softmac config option now selects
IEEE80211_CRYPT_WEP so that we can ensure this is available. This also involved
a modification to some otherwise unused ieee80211 API.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05 15:51:29 -04:00
Toralf Förster
47fbe1bf39 [PATCH] ieee80211softmac_io.c: fix warning "defined but not used"
Got this compiler warning and Johannes Berg <johannes@sipsolutions.net>
wrote:

Yeah, known 'bug', we have that code there but never use it. Feel free
to submit a patch (to John Linville, CC netdev and softmac-dev) to
remove it.

Signed-off-by: Toralf Foerster <toralf.foerster@gmx.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05 15:48:31 -04:00
John W. Linville
dea58b80f2 Merge branch 'from-linus' into upstream 2006-06-05 14:42:27 -04:00
Stephen Hemminger
fb80a6e1a5 [TCP] tcp_highspeed: Fix problem observed by Xiaoliang (David) Wei
When snd_cwnd is smaller than 38 and the connection is in
congestion avoidance phase (snd_cwnd > snd_ssthresh), the snd_cwnd
seems to stop growing.

The additive increase was confused because C arrays are 0-based.
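
The kind of off-by-one this describes can be sketched as follows. This is an illustration under stated assumptions, not the tcp_highspeed.c code: the threshold values, the function names, and the mapping "increase = table index + 1" are assumptions made for the example.

```c
#include <stddef.h>

/* Illustrative cwnd thresholds only; row i is assumed to stand for an
 * additive increase of i + 1 segments. */
static const unsigned int cwnd_threshold[] = { 38, 118, 221, 347 };
#define N_THRESHOLDS (sizeof(cwnd_threshold) / sizeof(cwnd_threshold[0]))

/* Index of the first row whose threshold is >= cwnd, clamped to the last row. */
size_t aimd_index(unsigned int cwnd)
{
    size_t i = 0;

    while (i < N_THRESHOLDS - 1 && cwnd > cwnd_threshold[i])
        i++;
    return i;
}

/* Off-by-one: using the 0-based index as the increase gives 0 for every
 * cwnd that falls in the first row, so the window stops growing. */
unsigned int additive_increase_buggy(unsigned int cwnd)
{
    return (unsigned int)aimd_index(cwnd);
}

/* Corrected: row 0 corresponds to an increase of 1. */
unsigned int additive_increase_fixed(unsigned int cwnd)
{
    return (unsigned int)aimd_index(cwnd) + 1;
}
```

With these assumed thresholds, any cwnd of 38 or less maps to row 0, which is where the buggy variant yields no growth at all.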

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-02 17:51:08 -07:00
Alexey Dobriyan
7114b0bb6d [NETFILTER]: PPTP helper: fix sstate/cstate typo
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-28 22:51:05 -07:00
Patrick McHardy
ca3ba88d0c [NETFILTER]: mark H.323 helper experimental
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-28 22:50:40 -07:00
Marcel Holtmann
6c813c3fe9 [NETFILTER]: Fix small information leak in SO_ORIGINAL_DST (CVE-2006-1343)
It appears that sockaddr_in.sin_zero is not zeroed during
getsockopt(...SO_ORIGINAL_DST...) operation. This can lead
to an information leak (CVE-2006-1343).
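
The general pattern for avoiding this kind of leak is to clear the structure before filling it in and copying it out. The following is a minimal user-space sketch, not the netfilter patch itself: the function name and parameters are hypothetical, and zeroing the whole structure rather than only `sin_zero` is a choice made for the example.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: builds the address that would be returned to the
 * caller. Clearing the whole struct first guarantees that sin_zero (and
 * any other padding) holds no stale stack contents. Address and port are
 * expected in network byte order. */
void fill_original_dst(struct sockaddr_in *sin, uint32_t addr_be, uint16_t port_be)
{
    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    sin->sin_port = port_be;
    sin->sin_addr.s_addr = addr_be;
}
```

Filling only the named fields of an uninitialized stack variable would leave `sin_zero` holding whatever was previously on the stack, which is the leak the commit describes.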

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-28 22:50:18 -07:00
Jeff Garzik
cbc696a5fa Merge branch 'upstream-fixes' into upstream 2006-05-26 21:26:34 -04:00
Stephen Hemminger
3041a06909 [NET]: dev.c comment fixes
Noticed that dev_alloc_name() comment was incorrect, and more spellung
errors.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-26 13:25:24 -07:00