linux

Author	SHA1	Message	Date
Harvey Harrison	673d57e723	net: replace NIPQUAD() in net/ipv4/ net/ipv6/ Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u can be replaced with %pI4 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-31 00:53:57 -07:00
Harvey Harrison	cffee385d7	net: replace NIPQUAD() in net/ipv4/netfilter/ Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u can be replaced with %pI4 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-31 00:53:08 -07:00
David S. Miller	a1744d3bee	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/p54/p54common.c	2008-10-31 00:17:34 -07:00
Eric Dumazet	c8db3fec5b	udp: Should use spin_lock_bh()/spin_unlock_bh() in udp_lib_unhash() Spotted by Alexander Beregalov Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-30 14:00:53 -07:00
roel kluin	00af5c6959	cipso: unsigned buf_len cannot be negative unsigned buf_len cannot be negative Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Paul Moore <paul.moore@hp.com>	2008-10-29 15:55:53 -04:00
Harvey Harrison	5b095d9892	net: replace %p6 with %pI6 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 12:52:50 -07:00
Eric Dumazet	96631ed16c	udp: introduce sk_for_each_rcu_safenext() Corey Minyard found a race added in commit `271b72c7fa` (udp: RCU handling for Unicast packets.) "If the socket is moved from one list to another list in-between the time the hash is calculated and the next field is accessed, and the socket has moved to the end of the new list, the traversal will not complete properly on the list it should have, since the socket will be on the end of the new list and there's not a way to tell it's on a new list and restart the list traversal. I think that this can be solved by pre-fetching the "next" field (with proper barriers) before checking the hash." This patch corrects this problem, introducing a new sk_for_each_rcu_safenext() macro. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 11:19:58 -07:00
Eric Dumazet	f52b5054ec	udp: udp_get_next() should use spin_unlock_bh() Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 11:19:11 -07:00
Eric Dumazet	8203efb3c6	udp: calculate udp_mem based on low memory instead of all memory This patch mimics commit `57413ebc4e` (tcp: calculate tcp_mem based on low memory instead of all memory) The udp_mem array which contains limits on the total amount of memory used by UDP sockets is calculated based on nr_all_pages. On a 32 bits x86 system, we should base this on the number of lowmem pages. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 02:32:32 -07:00
Eric Dumazet	271b72c7fa	udp: RCU handling for Unicast packets. Goals are : 1) Optimizing handling of incoming Unicast UDP frames, so that no memory writes should happen in the fast path. Note: Multicasts and broadcasts still will need to take a lock, because doing a full lockless lookup in this case is difficult. 2) No expensive operations in the socket bind/unhash phases : - No expensive synchronize_rcu() calls. - No added rcu_head in socket structure, increasing memory needs, but more important, forcing us to use call_rcu() calls, that have the bad property of making sockets structure cold. (rcu grace period between socket freeing and its potential reuse make this socket being cold in CPU cache). David did a previous patch using call_rcu() and noticed a 20% impact on TCP connection rates. Quoting Cristopher Lameter : "Right. That results in cacheline cooldown. You'd want to recycle the object as they are cache hot on a per cpu basis. That is screwed up by the delayed regular rcu processing. We have seen multiple regressions due to cacheline cooldown. The only choice in cacheline hot sensitive areas is to deal with the complexity that comes with SLAB_DESTROY_BY_RCU or give up on RCU." - Because udp sockets are allocated from dedicated kmem_cache, use of SLAB_DESTROY_BY_RCU can help here. Theory of operation : --------------------- As the lookup is lockfree (using rcu_read_lock()/rcu_read_unlock()), special attention must be taken by readers and writers. Use of SLAB_DESTROY_BY_RCU is tricky too, because a socket can be freed, reused, inserted in a different chain or in worst case in the same chain while readers could do lookups in the same time. In order to avoid loops, a reader must check each socket found in a chain really belongs to the chain the reader was traversing. If it finds a mismatch, lookup must start again at the begining. This restart loop is the reason we had to use rdlock for the multicast case, because we dont want to send same message several times to the same socket. We use RCU only for fast path. Thus, /proc/net/udp still takes spinlocks. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 02:11:14 -07:00
Eric Dumazet	645ca708f9	udp: introduce struct udp_table and multiple spinlocks UDP sockets are hashed in a 128 slots hash table. This hash table is protected by one rwlock. This rwlock is readlocked each time an incoming UDP message is handled. This rwlock is writelocked each time a socket must be inserted in hash table (bind time), or deleted from this table (close time) This is not scalable on SMP machines : 1) Even in read mode, lock() and unlock() are atomic operations and must dirty a contended cache line, shared by all cpus. 2) A writer might be starved if many readers are 'in flight'. This can happen on a machine with some NIC receiving many UDP messages. User process can be delayed a long time at socket creation/dismantle time. This patch prepares RCU migration, by introducing 'struct udp_table and struct udp_hslot', and using one spinlock per chain, to reduce contention on central rwlock. Introducing one spinlock per chain reduces latencies, for port randomization on heavily loaded UDP servers. This also speedup bindings to specific ports. udp_lib_unhash() was uninlined, becoming to big. Some cleanups were done to ease review of following patch (RCUification of UDP Unicast lookups) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 01:41:45 -07:00
Harvey Harrison	0c6ce78abf	net: replace uses of NIP6_FMT with %p6 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 23:02:31 -07:00
Alexey Dobriyan	93adcc80f3	net: don't use INIT_RCU_HEAD call_rcu() will unconditionally rewrite RCU head anyway. Applies to struct neigh_parms struct neigh_table struct net struct cipso_v4_doi struct in_ifaddr struct in_device rt->u.dst Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 13:25:09 -07:00
Alexey Dobriyan	def8b4faff	net: reduce structures when XFRM=n ifdef out * struct sk_buff::sp (pointer) * struct dst_entry::xfrm (pointer) * struct sock::sk_policy (2 pointers) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 13:24:06 -07:00
Neil Horman	1080d709fb	net: implement emergency route cache rebulds when gc_elasticity is exceeded This is a patch to provide on demand route cache rebuilding. Currently, our route cache is rebulid periodically regardless of need. This introduced unneeded periodic latency. This patch offers a better approach. Using code provided by Eric Dumazet, we compute the standard deviation of the average hash bucket chain length while running rt_check_expire. Should any given chain length grow to larger that average plus 4 standard deviations, we trigger an emergency hash table rebuild for that net namespace. This allows for the common case in which chains are well behaved and do not grow unevenly to not incur any latency at all, while those systems (which may be being maliciously attacked), only rebuild when the attack is detected. This patch take 2 other factors into account: 1) chains with multiple entries that differ by attributes that do not affect the hash value are only counted once, so as not to unduly bias system to rebuilding if features like QOS are heavily used 2) if rebuilding crosses a certain threshold (which is adjustable via the added sysctl in this patch), route caching is disabled entirely for that net namespace, since constant rebuilding is less efficient that no caching at all Tested successfully by me. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-27 17:06:14 -07:00
Florian Westphal	8b5f12d04b	syncookies: fix inclusion of tcp options in syn-ack David Miller noticed that commit `33ad798c92` '(tcp: options clean up') did not move the req->cookie_ts check. This essentially disabled commit `4dfc281702` '[Syncookies]: Add support for TCP options via timestamps.'. This restores the original logic. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-26 23:10:12 -07:00
Ilpo Järvinen	fd6149d332	tcp: Restore ordering of TCP options for the sake of inter-operability This is not our bug! Sadly some devices cannot cope with the change of TCP option ordering which was a result of the recent rewrite of the option code (not that there was some particular reason steming from the rewrite for the reordering) though any ordering of TCP options is perfectly legal. Thus we restore the original ordering to allow interoperability with/through such broken devices and add some warning about this trap. Since the reordering just happened without any particular reason, this change shouldn't cost us anything. There are already couple of known failure reports (within close proximity of the last release), so the problem might be more wide-spread than a single device. And other reports which may be due to the same problem though the symptoms were less obvious. Analysis of one of the case revealed (with very high probability) that sack capability cannot be negotiated as the first option (SYN never got a response). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Reported-by: Aldo Maggi <sentiniate@tiscali.it> Tested-by: Aldo Maggi <sentiniate@tiscali.it> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-23 14:06:35 -07:00
Ilpo Järvinen	75e3d8db53	tcp: should use number of sack blocks instead of -1 While looking for the recent "sack issue" I also read all eff_sacks usage that was played around by some relevant commit. I found out that there's another thing that is asking for a fix (unrelated to the "sack issue" though). This feature has probably very little significance in practice. Opposite direction timeout with bidirectional tcp comes to me as the most likely scenario though there might be other cases as well related to non-data segments we send (e.g., response to the opposite direction segment). Also some ACK losses or option space wasted for other purposes is necessary to prevent the earlier SACK feedback getting to the sender. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-21 16:28:36 -07:00
Linus Torvalds	5fdf11283e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: netfilter: replace old NF_ARP calls with NFPROTO_ARP netfilter: fix compilation error with NAT=n netfilter: xt_recent: use proc_create_data() netfilter: snmp nat leaks memory in case of failure netfilter: xt_iprange: fix range inversion match netfilter: netns: use NFPROTO_NUMPROTO instead of NUMPROTO for tables array netfilter: ctnetlink: remove obsolete NAT dependency from Kconfig pkt_sched: sch_generic: Fix oops in sch_teql dccp: Port redirection support for DCCP tcp: Fix IPv6 fallout from 'Port redirection support for TCP' netdev: change name dropping error codes ipvs: Update CONFIG_IP_VS_IPV6 description and help text	2008-10-20 09:06:35 -07:00
Jan Engelhardt	fdc9314cbe	netfilter: replace old NF_ARP calls with NFPROTO_ARP (Supplements: `ee999d8b95`) NFPROTO_ARP actually has a different value from NF_ARP, so ensure all callers use the new value so that packets _do_ get delivered to the registered hooks. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-20 03:34:51 -07:00
Ilpo Järvinen	311670f3ea	netfilter: snmp nat leaks memory in case of failure Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-20 03:33:24 -07:00
Linus Torvalds	b225ee5bed	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: net: Remove CONFIG_KMOD from net/ (towards removing CONFIG_KMOD entirely) ipv4: Add a missing rcu_assign_pointer() in routing cache. [netdrvr] ibmtr: PCMCIA IBMTR is ok on 64bit xen-netfront: Avoid unaligned accesses to IP header lmc: copy_*_user under spinlock [netdrvr] myri10ge, ixgbe: remove broken select INTEL_IOATDMA	2008-10-17 08:58:52 -07:00
Johannes Berg	95a5afca4a	net: Remove CONFIG_KMOD from net/ (towards removing CONFIG_KMOD entirely) Some code here depends on CONFIG_KMOD to not try to load protocol modules or similar, replace by CONFIG_MODULES where more than just request_module depends on CONFIG_KMOD and and also use try_then_request_module in ebtables. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-16 15:24:51 -07:00
Eric Dumazet	00269b54ed	ipv4: Add a missing rcu_assign_pointer() in routing cache. rt_intern_hash() is doing an update of a RCU guarded hash chain without using rcu_assign_pointer() or equivalent barrier. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-16 14:18:29 -07:00
Linus Torvalds	cb23832e39	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits) decnet: Fix compiler warning in dn_dev.c IPV6: Fix default gateway criteria wrt. HIGH/LOW preference radv option net/802/fc.c: Fix compilation warnings netns: correct mib stats in ip6_route_me_harder() netns: fix net_generic array leak rt2x00: fix regression introduced by "mac80211: free up 2 bytes in skb->cb" rtl8187: Add USB ID for Belkin F5D7050 with RTL8187B chip p54usb: Device ID updates mac80211: fixme for kernel-doc ath9k/mac80211: disallow fragmentation in ath9k, report to userspace libertas : Remove unused variable warning for "old_channel" from cmd.c mac80211: Fix scan RX processing oops orinoco: fix unsafe locking in spectrum_cs_suspend orinoco: fix unsafe locking in orinoco_cs_resume cfg80211: fix debugfs error handling mac80211: fix debugfs netdev rename iwlwifi: fix ct kill configuration for 5350 mac80211: fix HT information element parsing p54: Fix compilation problem on PPC mac80211: fix debugfs lockup ...	2008-10-16 11:26:26 -07:00
Alexey Dobriyan	f221e726bf	sysctl: simplify ->strategy name and nlen parameters passed to ->strategy hook are unused, remove them. In general ->strategy hook should know what it's doing, and don't do something tricky for which, say, pointer to original userspace array may be needed (name). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> [ networking bits ] Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Howells <dhowells@redhat.com> Cc: Matt Mackall <mpm@selenic.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:47 -07:00
Pablo Neira Ayuso	e6a7d3c04f	netfilter: ctnetlink: remove bogus module dependency between ctnetlink and nf_nat This patch removes the module dependency between ctnetlink and nf_nat by means of an indirect call that is initialized when nf_nat is loaded. Now, nf_conntrack_netlink only requires nf_conntrack and nfnetlink. This patch puts nfnetlink_parse_nat_setup_hook into the nf_conntrack_core to avoid dependencies between ctnetlink, nf_conntrack_ipv4 and nf_conntrack_ipv6. This patch also introduces the function ctnetlink_change_nat that is only invoked from the creation path. Actually, the nat handling cannot be invoked from the update path since this is not allowed. By introducing this function, we remove the useless nat handling in the update path and we avoid deadlock-prone code. This patch also adds the required EAGAIN logic for nfnetlink. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-14 11:58:31 -07:00
Patrick McHardy	38f7ac3eb7	netfilter: restore lost #ifdef guarding defrag exception Nir Tzachar <nir.tzachar@gmail.com> reported a warning when sending fragments over loopback with NAT: [ 6658.338121] WARNING: at net/ipv4/netfilter/nf_nat_standalone.c:89 nf_nat_fn+0x33/0x155() The reason is that defragmentation is skipped for already tracked connections. This is wrong in combination with NAT and ip_conntrack actually had some ifdefs to avoid this behaviour when NAT is compiled in. The entire "optimization" may seem a bit silly, for now simply restoring the lost #ifdef is the easiest solution until we can come up with something better. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-14 11:56:59 -07:00
Linus Torvalds	43096597a4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: qlge: Fix page size ifdef test. net: Rationalise email address: Network Specific Parts dsa: fix compile bug on s390 netns: mib6 section fixlet enic: Fix Kconfig headline description de2104x: wrong MAC address fix s390: claw compile fixlet net: export genphy_restart_aneg cxgb3: extend copyrights to 2008 cxgb3: update driver version net/phy: add missing kernel-doc pktgen: fix skb leak in case of failure mISDN/dsp_cmx.c: fix size checks misdn: use nonseekable_open() net: fix driver build errors due to missing net/ip6_checksum.h include	2008-10-14 10:28:49 -07:00
Alan Cox	113aa838ec	net: Rationalise email address: Network Specific Parts Clean up the various different email addresses of mine listed in the code to a single current and valid address. As Dave says his network merges for 2.6.28 are now done this seems a good point to send them in where they won't risk disrupting real changes. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-13 19:01:08 -07:00
James Morris	93db628658	Merge branch 'next' into for-linus	2008-10-13 09:35:14 +11:00
Herbert Xu	7bb82d9245	gre: Initialise rtnl_link tunnel parameters properly Brown paper bag error of calling memset with sizeof(p) instead of sizeof(*p). Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-11 12:20:15 -07:00
Patrick McHardy	4d74f8ba1f	gre: minor cleanups in netlink interface - use typeful helpers for IFLA_GRE_LOCAL/IFLA_GRE_REMOTE - replace magic value by FIELD_SIZEOF - use MODULE_ALIAS_RTNL_LINK macro Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-10 12:11:06 -07:00
Patrick McHardy	ba9e64b1c2	gre: fix copy and paste error The flags are dumped twice, the keys not at all. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-10 12:10:30 -07:00
Paul Moore	15c45f7b2e	cipso: Add support for native local labeling and fixup mapping names This patch accomplishes three minor tasks: add a new tag type for local labeling, rename the CIPSO_V4_MAP_STD define to CIPSO_V4_MAP_TRANS and replace some of the CIPSO "magic numbers" with constants from the header file. The first change allows CIPSO to support full LSM labels/contexts, not just MLS attributes. The second change brings the mapping names inline with what userspace is using, compatibility is preserved since we don't actually change the value. The last change is to aid readability and help prevent mistakes. Signed-off-by: Paul Moore <paul.moore@hp.com>	2008-10-10 10:16:34 -04:00
Paul Moore	014ab19a69	selinux: Set socket NetLabel based on connection endpoint Previous work enabled the use of address based NetLabel selectors, which while highly useful, brought the potential for additional per-packet overhead when used. This patch attempts to solve that by applying NetLabel socket labels when sockets are connect()'d. This should alleviate the per-packet NetLabel labeling for all connected sockets (yes, it even works for connected DGRAM sockets). Signed-off-by: Paul Moore <paul.moore@hp.com> Reviewed-by: James Morris <jmorris@namei.org>	2008-10-10 10:16:33 -04:00
Paul Moore	948bf85c1b	netlabel: Add functionality to set the security attributes of a packet This patch builds upon the new NetLabel address selector functionality by providing the NetLabel KAPI and CIPSO engine support needed to enable the new packet-based labeling. The only new addition to the NetLabel KAPI at this point is shown below: * int netlbl_skbuff_setattr(skb, family, secattr) ... and is designed to be called from a Netfilter hook after the packet's IP header has been populated such as in the FORWARD or LOCAL_OUT hooks. This patch also provides the necessary SELinux hooks to support this new functionality. Smack support is not currently included due to uncertainty regarding the permissions needed to expand the Smack network access controls. Signed-off-by: Paul Moore <paul.moore@hp.com> Reviewed-by: James Morris <jmorris@namei.org>	2008-10-10 10:16:32 -04:00
Paul Moore	b1edeb1023	netlabel: Replace protocol/NetLabel linking with refrerence counts NetLabel has always had a list of backpointers in the CIPSO DOI definition structure which pointed to the NetLabel LSM domain mapping structures which referenced the CIPSO DOI struct. The rationale for this was that when an administrator removed a CIPSO DOI from the system all of the associated NetLabel LSM domain mappings should be removed as well; a list of backpointers made this a simple operation. Unfortunately, while the backpointers did make the removal easier they were a bit of a mess from an implementation point of view which was making further development difficult. Since the removal of a CIPSO DOI is a realtively rare event it seems to make sense to remove this backpointer list as the optimization was hurting us more then it was helping. However, we still need to be able to track when a CIPSO DOI definition is being used so replace the backpointer list with a reference count. In order to preserve the current functionality of removing the associated LSM domain mappings when a CIPSO DOI is removed we walk the LSM domain mapping table, removing the relevant entries. Signed-off-by: Paul Moore <paul.moore@hp.com> Reviewed-by: James Morris <jmorris@namei.org>	2008-10-10 10:16:31 -04:00
Eric Dumazet	f24d43c07e	udp: complete port availability checking While looking at UDP port randomization, I noticed it was litle bit pessimistic, not looking at type of sockets (IPV6/IPV4) and not looking at bound addresses if any. We should perform same tests than when binding to a specific port. This permits a cleanup of udp_lib_get_port() Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-09 14:51:27 -07:00
Ilpo Järvinen	78e645cb89	tcpv[46]: fix md5 pseudoheader address field ordering Maybe it's just me but I guess those md5 people made a mess out of it by having _md5_hash_ to use daddr, saddr order instead of the one that is natural (and equal to what csum functions use). For the segment were sending, the original addresses are reversed so buff's saddr == skb's daddr and vice-versa. Maybe I can finally proceed with unification of some code after fixing it first... :-) Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-09 14:37:47 -07:00
Herbert Xu	64194c31a0	inet: Make tunnel RX/TX byte counters more consistent This patch makes the RX/TX byte counters for IPIP, GRE and SIT more consistent. Previously we included the external IP headers on the way out but not when the packet is inbound. The new scheme is to count payload only in both directions. For IPIP and SIT this simply means the exclusion of the external IP header. For GRE this means that we exclude the GRE header as well. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-09 12:03:17 -07:00
Herbert Xu	e1a8000228	gre: Add Transparent Ethernet Bridging This patch adds support for Ethernet over GRE encapsulation. This is exposed to user-space with a new link type of "gretap" instead of "gre". It will create an ARPHRD_ETHER device in lieu of the usual ARPHRD_IPGRE. Note that to preserver backwards compatibility all Transparent Ethernet Bridging packets are passed to an ARPHRD_IPGRE tunnel if its key matches and there is no ARPHRD_ETHER device whose key matches more closely. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-09 12:00:17 -07:00
Herbert Xu	c19e654ddb	gre: Add netlink interface This patch adds a netlink interface that will eventually displace the existing ioctl interface. It utilises the elegant rtnl_link_ops mechanism. This also means that user-space no longer needs to rely on the tunnel interface being of type GRE to identify GRE tunnels. The identification can now occur using rtnl_link_ops. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-09 11:59:55 -07:00
Herbert Xu	42aa916265	gre: Move MTU setting out of ipgre_tunnel_bind_dev This patch moves the dev->mtu setting out of ipgre_tunnel_bind_dev. This is in prepartion of using rtnl_link where we'll need to make the MTU setting conditional on whether the user has supplied an MTU. This also requires the move of the ipgre_tunnel_bind_dev call out of the dev->init function so that we can access the user parameters later. This patch also adds a check to prevent setting the MTU below the minimum of 68. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-09 11:59:32 -07:00
Herbert Xu	c95b819ad7	gre: Use needed_headroom Now that we have dev->needed_headroom, we can use it instead of having a bogus dev->hard_header_len. This also allows us to include dev->hard_header_len in the MTU computation so that when we do have a meaningful hard_harder_len in future it is included automatically in figuring out the MTU. Incidentally, this fixes a bug where we ignored the needed_headroom field of the underlying device in calculating our own hard_header_len. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-09 11:58:54 -07:00
David S. Miller	4dd565134e	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e1000e/ich8lan.c drivers/net/e1000e/netdev.c	2008-10-08 14:56:41 -07:00
Sven Wegener	071d7ab664	ipvs: Remove stray file left over from ipvs move Commit `cb7f6a7b71` ("IPVS: Move IPVS to net/netfilter/ipvs") has left a stray file in the old location of ipvs. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-08 14:41:35 -07:00
David S. Miller	db2bf2476b	Merge branch 'lvs-next-2.6' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-2.6 Conflicts: net/netfilter/Kconfig	2008-10-08 14:26:36 -07:00
Eric Dumazet	3c689b7320	inet: cleanup of local_port_range I noticed sysctl_local_port_range[] and its associated seqlock sysctl_local_port_range_lock were on separate cache lines. Moreover, sysctl_local_port_range[] was close to unrelated variables, highly modified, leading to cache misses. Moving these two variables in a structure can help data locality and moving this structure to read_mostly section helps sharing of this data among cpus. Cleanup of extern declarations (moved in include file where they belong), and use of inet_get_local_port_range() accessor instead of direct access to ports values. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-08 14:18:04 -07:00
Eric Dumazet	9088c56095	udp: Improve port randomization Current UDP port allocation is suboptimal. We select the shortest chain to chose a port (out of 512) that will hash in this shortest chain. First, it can lead to give not so ramdom ports and ease give attackers more opportunities to break the system. Second, it can consume a lot of CPU to scan all table in order to find the shortest chain. Third, in some pathological cases we can fail to find a free port even if they are plenty of them. This patch zap the search for a short chain and only use one random seed. Problem of getting long chains should be addressed in another way, since we can obtain long chains with non random ports. Based on a report and patch from Vitaly Mayatskikh Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-08 11:44:17 -07:00

1 2 3 4 5 ...

2888 Commits