linux

Author	SHA1	Message	Date
Octavian Purdila	395264d509	net: introduce NETDEV_UNREGISTER_PERNET This new event is called once for each unique net namespace in batched unregister operations (with the argument set to a random device from that namespace) and once per device in non-batched unregister operations. It allows us to factorize some device unregister work such as clearing the routing cache. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-18 05:03:03 -08:00
Eric Dumazet	9793241fe9	vlan: Precise RX stats accounting With multi queue devices, its possible that several cpus call vlan RX routines simultaneously for the same vlan device. We update RX stats counter without any locking, so we can get slightly wrong counters. One possible fix is to use percpu counters, to get precise accounting and also get guarantee of no cache line ping pongs between cpus. Note: this adds 16 bytes (32 bytes on 64bit arches) of percpu data per vlan device. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-17 23:51:55 -08:00
Eric Dumazet	d83345adf9	net: add dev_txq_stats_fold() helper Some drivers ndo_get_stats() method need to perform txqueue stats folding. Move folding from dev_get_stats() to a new dev_txq_stats_fold() function Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-17 23:51:52 -08:00
Eric Dumazet	6b863d1d32	vlan: Fix register_vlan_dev() error path In case register_netdevice() returns an error, and a new vlan_group was allocated and inserted in vlan_group_hash[] we call vlan_group_free() without deleting group from hash table. Future lookups can give infinite loops or crashes. We must delete the vlan_group using RCU safe procedure. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-17 06:45:04 -08:00
Herbert Xu	69c0cab120	gro: Fix illegal merging of trailer trash When we've merged skb's with page frags, and subsequently receive a trailer skb (< MSS) that is not completely non-linear (this can occur on Intel NICs if the packet size falls below the threshold), GRO ends up producing an illegal GSO skb with a frag_list. This is harmless unless the skb is then forwarded through an interface that requires software GSO, whereupon the GSO code will BUG. This patch detects this case in GRO and avoids merging the trailer skb. Reported-by: Mark Wagner <mwagner@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-17 05:18:18 -08:00
Changli Gao	b76965e02b	act_mirred: optimization. move checking if eaction is valid in tcf_mirred_init() Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-17 04:15:38 -08:00
Changli Gao	feed1f1724	act_mirred: cleanup 1. don't let go back using goto. 2. don't call skb_act_clone() until it is necessary. 3. one exit of the critical context. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-17 04:15:37 -08:00
Rémi Denis-Courmont	b2a5decddb	Phonet: missing rcu_dereference() Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-17 04:08:50 -08:00
Johannes Berg	649300b927	netlink: remove subscriptions check on notifier The netlink URELEASE notifier doesn't notify for sockets that have been used to receive multicast but it should be called for such sockets as well since they might _also_ be used for sending and not solely for receiving multicast. We will need that for nl80211 (generic netlink sockets) in the future. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-17 04:08:49 -08:00
David S. Miller	a2bfbc072e	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/can/Kconfig	2009-11-17 00:05:02 -08:00
Jouni Malinen	b23709248f	mac80211: Do not queue Probe Request frames for station MLME Cooked monitor interfaces cannot currently receive Probe Request frames when the interface is in station mode. However, we do not process Probe Request frames internally in the station MLME, so there is no point in queueing the frame here. Remove Probe Request frames from the queued frame list to allow cooked monitor interfaces to receive these frames. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-16 14:17:14 -05:00
Eric Dumazet	91e9c07bd6	net: Fix the rollback test in dev_change_name() net: Fix the rollback test in dev_change_name() In dev_change_name() an err variable is used for storing the original call_netdevice_notifiers() errno (negative) and testing for a rollback error later, but the test for non-zero is wrong, because the err might have positive value as well - from dev_alloc_name(). It means the rollback for a netdevice with a number > 0 will never happen. (The err test is reordered btw. to make it more readable.) Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-16 03:30:35 -08:00
Marin Mitov	b9f5d52670	remove deprecated and not used: print_mac() The function print_mac in net/ethernet/eth.c is marked __deprecated and not used. Remove it. Signed-off-by: Marin Mitov <mitov@issp.bas.bg> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-15 22:21:34 -08:00
Eric Dumazet	b93ab837a2	vlan: Use __vlan_hwaccel_put_tag() in rx Commit `05423b2413` (vlan: allow null VLAN ID to be used) forgot to update __vlan_hwaccel_rx() & vlan_gro_common() We need to set VLAN_TAG_PRESENT flag in skb->vlan_tci Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-15 22:21:33 -08:00
Jarek Poplawski	9a1654ba0b	net: Optimize hard_start_xmit() return checking Recent changes in the TX error propagation require additional checking and masking of values returned from hard_start_xmit(), mainly to separate cases where skb was consumed. This aim can be simplified by changing the order of NETDEV_TX and NET_XMIT codes, because the latter are treated similarly to negative (ERRNO) values. After this change much simpler dev_xmit_complete() is also used in sch_direct_xmit(), so it is moved to netdevice.h. Additionally NET_RX definitions in netdevice.h are moved up from between TX codes to avoid confusion while reading the TX comment. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-15 22:08:33 -08:00
Eric Dumazet	ed04642f75	net: check the return value of ndo_select_queue() Check the return value of ndo_select_queue(). If the value isn't smaller than the real_num_tx_queues, print a warning message, and reset it to zero. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> ---- Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-15 22:08:05 -08:00
David S. Miller	eaa04dc353	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6	2009-11-15 20:59:34 -08:00
Gustavo F. Padovan	68ae6639b6	Bluetooth: Fix regression with L2CAP configuration in Basic Mode Basic Mode is the default mode of operation of a L2CAP entity. In this case the RFC (Retransmission and Flow Control) configuration option should not be used at all. Normally remote L2CAP implementation should just ignore this option, but it can cause various side effects with other Bluetooth stacks that are not capable of handling unknown options. Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-11-16 01:31:41 +01:00
Gustavo F. Padovan	a0e55a32af	Bluetooth: Select Basic Mode as default for SOCK_SEQPACKET The default mode for SOCK_SEQPACKET is Basic Mode. So when no mode has been specified, Basic Mode shall be used. This is important for current application to keep working as expected and not cause a regression. Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-11-16 01:31:16 +01:00
Andrei Emeltchenko	93f19c9fc8	Bluetooth: Set general bonding security for ACL by default This patch fixes double pairing issues with Secure Simple Paring support. It was observed that when pairing with SSP enabled, that the confirmation will be asked twice. http://www.spinics.net/lists/linux-bluetooth/msg02473.html This also causes bug when initiating SSP connection from Windows Vista. The reason is because bluetoothd does not store link keys since HCIGETAUTHINFO returns 0. Setting default to general bonding fixes these issues. Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-11-16 01:30:28 +01:00
David S. Miller	958fc41e32	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/lowpan/lowpan	2009-11-14 20:24:30 -08:00
Rémi Denis-Courmont	888801357f	Phonet: convert routing table to RCU Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 20:47:02 -08:00
Rémi Denis-Courmont	7ed0132f23	Phonet: put protocols array under RCU Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 20:47:01 -08:00
Ursula Braun	b7c2aecc07	iucv: add work_queue cleanup for suspend If iucv_work_queue is not empty during kernel freeze, a kernel panic occurs. This suspend-patch adds flushing of the work queue for pending connection requests and severing of remaining pending connections. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 20:46:58 -08:00
Eric Dumazet	2c1409a0a2	inetpeer: Optimize inet_getid() While investigating for network latencies, I found inet_getid() was a contention point for some workloads, as inet_peer_idlock is shared by all inet_getid() users regardless of peers. One way to fix this is to make ip_id_count an atomic_t instead of __u16, and use atomic_add_return(). In order to keep sizeof(struct inet_peer) = 64 on 64bit arches tcp_ts_stamp is also converted to __u32 instead of "unsigned long". Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 20:46:58 -08:00
Eric Dumazet	234b27c3fd	ipv6: speedup inet6_dump_addr() When handling large number of netdevices, inet6_dump_addr() is very slow because it has O(N^2) complexity. Instead of scanning one single list, we can use the NETDEV_HASHENTRIES sub lists of the dev_index hash table, and RCU lookups. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 20:46:57 -08:00
Eric Dumazet	eec4df9885	ipv4: speedup inet_dump_ifaddr() Stephen Hemminger a écrit : > On Thu, 12 Nov 2009 15:11:36 +0100 > Eric Dumazet <eric.dumazet@gmail.com> wrote: > >> When handling large number of netdevices, inet_dump_ifaddr() >> is very slow because it has O(N^2) complexity. >> >> Instead of scanning one single list, we can use the NETDEV_HASHENTRIES >> sub lists of the dev_index hash table, and RCU lookups. >> >> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> > > You might be able to make RCU critical section smaller by moving > it into loop. > Indeed. But we dump at most one skb (<= 8192 bytes ?), so rcu_read_lock holding time is small, unless we meet many netdevices without addresses. I wonder if its really common... Thanks [PATCH net-next-2.6] ipv4: speedup inet_dump_ifaddr() When handling large number of netdevices, inet_dump_ifaddr() is very slow because it has O(N2) complexity. Instead of scanning one single list, we can use the NETDEV_HASHENTRIES sub lists of the dev_index hash table, and RCU lookups. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 20:46:55 -08:00
Eric Dumazet	6baff15037	igmp: Use next_net_device_rcu() We need to use next_det_device_rcu() in RCU protected section. We also can avoid in_dev_get()/in_dev_put() overhead (code size mainly) in rcu_read_lock() sections. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 20:38:49 -08:00
Eric Dumazet	ce81b76a39	ipv6: use RCU to walk list of network devices No longer need read_lock(&dev_base_lock), use RCU instead. We also can avoid taking references on inet6_dev structs. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 20:38:49 -08:00
William Allen Simpson	bee7ca9ec0	net: TCP_MSS_DEFAULT, TCP_MSS_DESIRED Define two symbols needed in both kernel and user space. Remove old (somewhat incorrect) kernel variant that wasn't used in most cases. Default should apply to both RMSS and SMSS (RFC2581). Replace numeric constants with defined symbols. Stand-alone patch, originally developed for TCPCT. Signed-off-by: William.Allen.Simpson@gmail.com Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 20:38:48 -08:00
Dan Carpenter	d0490cfdf4	ipmr: missing dev_put() on error path in vif_add() The other error paths in front of this one have a dev_put() but this one got missed. Found by smatch static checker. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Wang Chen <ellre923@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 19:56:54 -08:00
Vlad Yasevich	a78102e74e	sctp: Set socket source address when additing first transport Recent commits sctp: Get rid of an extra routing lookup when adding a transport and sctp: Set source addresses on the association before adding transports changed when routes are added to the sctp transports. As such, we didn't set the socket source address correctly when adding the first transport. The first transport is always the primary/active one, so when adding it, set the socket source address. This was causing regression failures in SCTP tests. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 19:56:52 -08:00
Vlad Yasevich	f9c67811eb	sctp: Fix regression introduced by new sctp_connectx api A new (unrealeased to the user) sctp_connectx api `c6ba68a266` sctp: support non-blocking version of the new sctp_connectx() API introduced a regression cought by the user regression test suite. In particular, the API requires the user library to re-allocate the buffer and could potentially trigger a SIGFAULT. This change corrects that regression by passing the original address buffer to the kernel unmodified, but still allows for a returned association id. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 19:56:51 -08:00
Vlad Yasevich	409b95aff3	sctp: Set source addresses on the association before adding transports Recent commit `8da645e101` sctp: Get rid of an extra routing lookup when adding a transport introduced a regression in the connection setup. The behavior was different between IPv4 and IPv6. IPv4 case ended up working because the route lookup routing returned a NULL route, which triggered another route lookup later in the output patch that succeeded. In the IPv6 case, a valid route was returned for first call, but we could not find a valid source address at the time since the source addresses were not set on the association yet. Thus resulted in a hung connection. The solution is to set the source addresses on the association prior to adding peers. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 19:56:50 -08:00
Felix Fietkau	c258d2de97	nl80211: only allow adding stations to running vlan interfaces Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:59 -05:00
Felix Fietkau	f501dba4c4	mac80211: fix broadcast frame handling for 4-addr AP VLANs Without this patch, broadcast frames from the station behind a 4-addr AP VLAN would be reflected back to the source. Fix this by checking the 4-addr flag before bridging multicast frames in the cell. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:59 -05:00
Holger Schurig	61fa713c75	cfg80211: return channel noise via survey API This patch implements the NL80211_CMD_GET_SURVEY command and an get_survey() ops that a driver can implement. The goal of this command is to allow a drivers to report channel survey data (e.g. channel noise, channel occupation). For now, only the mechanism to report back channel noise has been implemented. In future, there will either be a survey-trigger command --- or the existing scan-trigger command will be enhanced. This will allow user-space to request survey for arbitrary channels. Note: any driver that cannot report channel noise should not report any value at all, e.g. made-up -92 dBm. Signed-off-by: Holger Schurig <holgerschurig@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:58 -05:00
Holger Schurig	a043897a31	cfg80211: introduce nl80211_get_ifidx() ... which get's rid of three indentical cut-n-paste sections. Signed-off-by: Holger Schurig <holgerschurig@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:58 -05:00
Rui Paulo	264d9b7d8a	mac80211: update copyrights to 2009 Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:57 -05:00
Rui Paulo	63c5723bc3	mac80211: add nl80211/cfg80211 handling of the new mesh root mode option. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:57 -05:00
Rui Paulo	e304bfd30f	mac80211: implement a timer to send RANN action frames RANN (Root Annoucement) frame TX. Send an action frame every second trying to build a path to all nodes on the mesh. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:56 -05:00
Rui Paulo	d19b3bf638	mac80211: replace "destination" with "target" to follow the spec Resulting object files have the same MD5 as before. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:56 -05:00
Rui Paulo	be125c60e4	mac80211: add the DS params to the beacon Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:56 -05:00
Rui Paulo	36f0d5f537	mac80211: fix BSSID setup for beacon frames BSSID is now set to the TA. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:55 -05:00
Rui Paulo	77fa76bb7f	mac80211: set the AID field correctly for mesh peer frames This sets the AID field correctly for mesh peer confirm frames. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:55 -05:00
Rui Paulo	a6a58b4f14	mac80211: properly forward the RANN IE Increase hopcount and convert metric to LE before forwarding the RANN action frame. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:55 -05:00
Rui Paulo	d611f062f4	mac80211: update PERR frame format Update the PERR IE frame format according to latest draft (3.03). Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:54 -05:00
Rui Paulo	90a5e16992	mac80211: implement RANN processing and forwarding Process the RANN (Root Annoucement) Frame and try to find the HWMP root station by sending a PREQ. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-13 17:43:54 -05:00
Patrick McHardy	cbbef5e183	vlan/macvlan: propagate transmission state to upper layers Both vlan and macvlan devices usually don't use a qdisc and immediately queue packets to the underlying device. Propagate transmission state of the underlying device to the upper layers so they can react on congestion and/or inform the sending process. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 14:07:33 -08:00
Patrick McHardy	572a9d7b6f	net: allow to propagate errors through ->ndo_hard_start_xmit() Currently the ->ndo_hard_start_xmit() callbacks are only permitted to return one of the NETDEV_TX codes. This prevents any kind of error propagation for virtual devices, like queue congestion of the underlying device in case of layered devices, or unreachability in case of tunnels. This patches changes the NET_XMIT codes to avoid clashes with the NETDEV_TX codes and changes the two callers of dev_hard_start_xmit() to expect either errno codes, NET_XMIT codes or NETDEV_TX codes as return value. In case of qdisc_restart(), all non NETDEV_TX codes are mapped to NETDEV_TX_OK since no error propagation is possible when using qdiscs. In case of dev_queue_xmit(), the error is propagated upwards. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 14:07:32 -08:00
Ilpo Järvinen	d792c1006f	tcp: provide more information on the tcp receive_queue bugs The addition of rcv_nxt allows to discern whether the skb was out of place or tp->copied. Also catch fancy combination of flags if necessary (sadly we might miss the actual causer flags as it might have already returned). Btw, we perhaps would want to forward copied_seq in somewhere or otherwise we might have some nice loop with WARN stuff within but where to do that safely I don't know at this stage until more is known (but it is not made significantly worse by this patch). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-13 13:56:33 -08:00
Dmitry Eremin-Solenikov	282a39546f	ieee802154: make wpan-phy class registration to subsys_initcall Move ieee802154 initialisation to subsys_initcall call, so that wpan-phy class is initialised before all devices (thus saving us from oops during bootup). Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>	2009-11-13 00:07:15 +03:00
Arnd Bergmann	805003a41c	net/atm: move all compat_ioctl handling to atm/ioctl.c We have two implementations of the compat_ioctl handling for ATM, the one that we have had for ages in fs/compat_ioctl.c and the one added to net/atm/ioctl.c by David Woodhouse. Unfortunately, both versions are incomplete, and in practice we use a very confusing combination of the two. For ioctl numbers that have the same identifier on 32 and 64 bit systems, we go directly through the compat_ioctl socket operation, for those that differ, we do a conversion in fs/compat_ioctl.c. This patch moves both variants into the vcc_compat_ioctl() function, while preserving the current behaviour. It also kills off the COMPATIBLE_IOCTL definitions that we never use here. Doing it this way is clearly not a good solution, but I hope it is a step into the right direction, so that someone is able to clean up this mess for real. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-11 19:22:23 -08:00
Arnd Bergmann	a2116ed223	net/compat: fix dev_ifsioc emulation corner cases Handling for SIOCSHWTSTAMP is broken on architectures with a split user/kernel address space like s390, because it passes a real user pointer while using set_fs(KERNEL_DS). A similar problem might arise the next time somebody adds code to dev_ifsioc. Split up dev_ifsioc into three separate functions for SIOCSHWTSTAMP, SIOC*IFMAP and all other numbers so we can get rid of set_fs in all potentially affected cases. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Patrick Ohly <patrick.ohly@intel.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-11 19:22:22 -08:00
stephen hemminger	e5c140a340	decnet: convert dndev_lock to spinlock There is no reason for this lock to be reader/writer since the reader only has lock held for a very brief period. The overhead of read_lock is more expensive than spinlock. Compile tested only, I am not a decnet user. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-11 19:22:18 -08:00
stephen hemminger	41bdecf17e	decnet: add RTNL lock when reading address list Add missing locking in the case of auto binding to the default device. The address list might change while this code is looking at the list. Compile tested only, I am not a decnet user. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-11 19:22:15 -08:00
stephen hemminger	08e9897d51	netdev: fold name hash properly (v3) The full_name_hash function does not produce well distributed values in the lower bits, so most code uses hash_32() to fold it. This is really a bug introduced when name hashing was added, back in 2.5 when I added name hashing. hash_32 is all that is needed since full_name_hash returns unsigned int which is only 32 bits on 64 bit platforms. Also, there is no point in using hash_32 on ifindex, because the is naturally sequential and usually well distributed. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-11 19:22:12 -08:00
Anton Vorontsov	e84af6ddef	skbuff: Do not allow skb recycling with disabled IRQs NAPI drivers try to recycle SKBs in their polling routine, but we generally don't know the context in which the polling will be called, and the skb recycling itself may require IRQs to be enabled. This patch adds irqs_disabled() test to the skb_recycle_check() routine, so that we'll not let the drivers hit the skb recycling path with IRQs disabled. As a side effect, this patch actually disables skb recycling for some [broken] drivers. E.g. gianfar driver grabs an irqsave spinlock during TX ring processing, and then tries to recycle an skb, and that caused the following badness: nf_conntrack version 0.5.0 (1008 buckets, 4032 max) ------------[ cut here ]------------ Badness at kernel/softirq.c:143 NIP: c003e3c4 LR: c423a528 CTR: c003e344 ... NIP [c003e3c4] local_bh_enable+0x80/0xc4 LR [c423a528] destroy_conntrack+0xd4/0x13c [nf_conntrack] Call Trace: [c15d1b60] [c003e32c] local_bh_disable+0x1c/0x34 (unreliable) [c15d1b70] [c423a528] destroy_conntrack+0xd4/0x13c [nf_conntrack] [c15d1b80] [c02c6370] nf_conntrack_destroy+0x3c/0x70 Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-11 19:03:28 -08:00
David S. Miller	434a8a58d7	ipv6: Remove unused var in inet6_dump_ifinfo() Reported by Stephen Rothwell: -------------------- Today's linux-next build (x86_64 allmodconfig) produced this warning: net/ipv6/addrconf.c: In function 'inet6_dump_ifinfo': net/ipv6/addrconf.c:3833: warning: unused variable 'err' Introduced by commit `84d2697d96` ("ipv6: speedup inet6_dump_ifinfo()"). -------------------- Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-11 18:53:00 -08:00
Luis R. Rodriguez	e5d6eb8305	mac80211: fix max HT rate processing on mac80211 The max MCS index is 76, fix the higher check to allow through frames received at MCS 76. This is a non-issue for current drivers as MCS 76 is only possible with a device supporting 4 spatial streams. While at it change the WARN_ON() on invalid HT rates to a WARN() to provide more useful information. This will help debug issues when the driver is passing up a bogus HT rate value. The rate must map to a valid MCS index which can be any of the values in the set [0 - 76] (inclusive). Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-11 17:09:18 -05:00
Felix Fietkau	f14543ee4d	mac80211: implement support for 4-address frames for AP and client mode In some situations it might be useful to run a network with an Access Point and multiple clients, but with each client bridged to a network behind it. For this to work, both the client and the AP need to transmit 4-address frames, containing both source and destination MAC addresses. With this patch, you can configure a client to communicate using only 4-address frames for data traffic. On the AP side you can enable 4-address frames for individual clients by isolating them in separate AP VLANs which are configured in 4-address mode. Such an AP VLAN will be limited to one client only, and this client will be used as the destination for all traffic on its interface, regardless of the destination MAC address in the packet headers. The advantage of this mode compared to regular WDS mode is that it's easier to configure and does not require a static list of peer MAC addresses on any side. Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-11 17:02:10 -05:00
Felix Fietkau	8b787643ca	nl80211: add a parameter for using 4-address frames on virtual interfaces Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-11 17:02:07 -05:00
Rui Paulo	1460dd158a	mac80211: improve peer link management debugging Print the FSM state strings instead of just the numbers. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-11 15:23:59 -05:00
Rui Paulo	f3c0d88a7f	mac80211: improve HWMP debugging Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-11 15:23:59 -05:00
Rui Paulo	dbb81c428b	mac80211: allow processing of more than one HWMP IE Since the HWMP IEs are now all optional and the action code is fixed, allow the HWMP code to find and process each IE on the path selection action frames. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <rpaulo@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-11 15:23:58 -05:00
Rui Paulo	27db2e423f	mac80211: add MAC80211_VERBOSE_MHWMP_DEBUG Add MAC80211_VERBOSE_MHWMP_DEBUG, a debugging option for HWMP frame processing. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-11 15:23:58 -05:00
Rui Paulo	095de01325	mac80211: update the format of path selection frames Update the format of path selection frames according to latest draft (3.03). Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-11 15:23:58 -05:00
Rui Paulo	0938393f02	mac80211: update peer link management IE and action frames Update the length and format of the peer link management action frames. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-11 15:23:57 -05:00
Rui Paulo	23c7a29cd0	mac80211: fix typo in a comment Signed-off-by: Javier Cardona <javier@cozybit.com> Signed-off-by: Rui Paulo <rpaulo@gmail.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-11 15:23:57 -05:00
Rui Paulo	8f2fda9594	mac80211: implement the meshconf formation info field The Mesh Configuration Formation Info field contains the number of neighbors. This means that the beacon must be updated every time a peer joins or leaves. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <rpaulo@gmail.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-11 15:23:57 -05:00
Rui Paulo	a1935218da	mac80211: set MESH_TTL to 31 Update the mesh time to live field to 31 according to draft 3.03. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-11 15:23:56 -05:00
Rui Paulo	3491707a07	mac80211: update meshconf IE This updates the Mesh Configuration IE according to the latest draft (3.03). Notable changes include the simplified protocol IDs. Signed-off-by: Rui Paulo <rpaulo@gmail.com> Signed-off-by: Javier Cardona <javier@cozybit.com> Reviewed-by: Andrey Yurovsky <andrey@cozybit.com> Tested-by: Brian Cavagnolo <brian@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-11-11 15:23:56 -05:00
stephen hemminger	ff879eb611	CAN: use dev_get_by_index_rcu Use new function to avoid doing read_lock(). Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-10 22:27:13 -08:00
stephen hemminger	61fbab77a8	IPV4: use rcu to walk list of devices in IGMP This also needs to be optimized for large number of devices. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-10 22:27:12 -08:00
stephen hemminger	fa918602b6	decnet: use RCU to find network devices When showing device statistics use RCU rather than read_lock(&dev_base_lock) Compile tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-10 22:26:31 -08:00
stephen hemminger	f1e9016da6	net: use rcu for network scheduler API Use RCU to walk list of network devices in qdisc dump. This could be optimized for large number of devices. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-10 22:26:30 -08:00
stephen hemminger	9e067597ee	vlan: eliminate use of dev_base_lock Do not need to use read_lock(&dev_base_lock), use RCU instead. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-10 22:26:30 -08:00
Brian Haley	856540ee31	IPv6: use ipv6_addr_v4mapped() Change udp6_portaddr_hash() to use ipv6_addr_v4mapped() inline instead of ipv6_addr_type(). Signed-off-by: Brian Haley <brian.haley@hp.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-10 20:54:44 -08:00
Herbert Xu	292f4f3ce4	sit: Clean up DF code by copying from IPIP This patch rearranges the SIT DF bit handling using the new IPIP DF code. The only externally visible effect should be the case where PMTU is enabled and the MTU is exactly 1280 bytes. In this case the previous code would send packets out with DF off while the new code would set the DF bit. This is inline with RFC 4213. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Thanks, Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-10 20:54:43 -08:00
Eric Dumazet	bcd323262a	ipv6: Allow inet6_dump_addr() to handle more than 64 addresses Apparently, inet6_dump_addr() is not able to handle more than 64 ipv6 addresses per device. We must break from inner loops in case skb is full, or else cursor is put at the end of list. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-10 20:54:42 -08:00
Eric Dumazet	84d2697d96	ipv6: speedup inet6_dump_ifinfo() When handling large number of netdevice, inet6_dump_ifinfo() is very slow because it has O(N^2) complexity. Instead of scanning one single list, we can use the 256 sub lists of the dev_index hash table, and RCU lookups. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-10 20:54:41 -08:00
Cyrill Gorcunov	13cfa97bef	net: netlink_getname, packet_getname -- use DECLARE_SOCKADDR guard Use guard DECLARE_SOCKADDR in a few more places which allow us to catch if the structure copied back is too big. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-10 20:54:41 -08:00
Eric Dumazet	30fff9231f	udp: bind() optimisation UDP bind() can be O(N^2) in some pathological cases. Thanks to secondary hash tables, we can make it O(N) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-10 20:54:38 -08:00
Rémi Denis-Courmont	b1704374fd	Phonet: allocate and copy for pipe TX without sock lock Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-10 20:54:34 -08:00
Rémi Denis-Courmont	6b0d07ba15	Phonet: put sockets in a hash table Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-10 20:54:33 -08:00
David S. Miller	f6d773cd4f	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2009-11-09 11:17:24 -08:00
Linus Torvalds	1ce55238e2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (34 commits) net/fsl_pq_mdio: add module license GPL can: fix WARN_ON dump in net/core/rtnetlink.c:rtmsg_ifinfo() can: should not use __dev_get_by_index() without locks hisax: remove bad udelay call to fix build error on ARM ipip: Fix handling of DF packets when pmtudisc is OFF qlge: Set PCIe reset type for EEH to fundamental. qlge: Fix early exit from mbox cmd complete wait. ixgbe: fix traffic hangs on Tx with ioatdma loaded ixgbe: Fix checking TFCS register for TXOFF status when DCB is enabled ixgbe: Fix gso_max_size for 82599 when DCB is enabled macsonic: fix crash on PowerBook 520 NET: cassini, fix lock imbalance ems_usb: Fix byte order issues on big endian machines be2net: Bug fix to send config commands to hardware after netdev_register be2net: fix to set proper flow control on resume netfilter: xt_connlimit: fix regression caused by zero family value rt2x00: Don't queue ieee80211 work after USB removal Revert "ipw2200: fix oops on missing firmware" decnet: netdevice refcount leak netfilter: nf_nat: fix NAT issue in 2.6.30.4+ ...	2009-11-09 09:51:42 -08:00
David S. Miller	d0e1e88d6e	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/can/usb/ems_usb.c	2009-11-08 23:00:54 -08:00
Yury Polyanskiy	9e0d57fd6d	xfrm: SAD entries do not expire correctly after suspend-resume This fixes the following bug in the current implementation of net/xfrm: SAD entries timeouts do not count the time spent by the machine in the suspended state. This leads to the connectivity problems because after resuming local machine thinks that the SAD entry is still valid, while it has already been expired on the remote server. The cause of this is very simple: the timeouts in the net/xfrm are bound to the old mod_timer() timers. This patch reassigns them to the CLOCK_REALTIME hrtimer. I have been using this version of the patch for a few months on my machines without any problems. Also run a few stress tests w/o any issues. This version of the patch uses tasklet_hrtimer by Peter Zijlstra (commit 9ba5f0). This patch is against 2.6.31.4. Please CC me. Signed-off-by: Yury Polyanskiy <polyanskiy@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-08 20:58:41 -08:00
Arnd Bergmann	7a50a240c4	net/compat_ioctl: support SIOCWANDEV This adds compat_ioctl support for SIOCWANDEV, which has always been missing. The definition of struct compat_ifreq was missing an ifru_settings fields that is needed to support SIOCWANDEV, so add that and clean up the whitespace damage in the struct definition. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-08 20:57:03 -08:00
Arnd Bergmann	fab2532ba5	net, compat_ioctl: fix SIOCGMII ioctls SIOCGMIIPHY and SIOCGMIIREG return data through ifreq, so it needs to be converted on the way out as well. SIOCGIFPFLAGS is unused, but has the same problem in theory. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-08 20:56:21 -08:00
Eric Dumazet	f6b8f32ca7	udp: multicast RX should increment SNMP/sk_drops counter in allocation failures When skb_clone() fails, we should increment sk_drops and SNMP counters. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-08 20:53:10 -08:00
Eric Dumazet	a1ab77f97e	ipv6: udp: Optimise multicast reception IPV6 UDP multicast rx path is a bit complex and can hold a spinlock for a long time. Using a small (32 or 64 entries) stack of socket pointers can help to perform expensive operations (skb_clone(), udp_queue_rcv_skb()) outside of the lock, in most cases. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-08 20:53:09 -08:00
Eric Dumazet	1240d1373c	ipv4: udp: Optimise multicast reception UDP multicast rx path is a bit complex and can hold a spinlock for a long time. Using a small (32 or 64 entries) stack of socket pointers can help to perform expensive operations (skb_clone(), udp_queue_rcv_skb()) outside of the lock, in most cases. It's also a base for a future RCU conversion of multicast recption. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Lucian Adrian Grijincu <lgrijincu@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-08 20:53:08 -08:00
Eric Dumazet	fddc17defa	ipv6: udp: optimize unicast RX path We first locate the (local port) hash chain head If few sockets are in this chain, we proceed with previous lookup algo. If too many sockets are listed, we take a look at the secondary (port, address) hash chain. We choose the shortest chain and proceed with a RCU lookup on the elected chain. But, if we chose (port, address) chain, and fail to find a socket on given address, we must try another lookup on (port, in6addr_any) chain to find sockets not bound to a particular IP. -> No extra cost for typical setups, where the first lookup will probabbly be performed. RCU lookups everywhere, we dont acquire spinlock. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-08 20:53:07 -08:00
Eric Dumazet	5051ebd275	ipv4: udp: optimize unicast RX path We first locate the (local port) hash chain head If few sockets are in this chain, we proceed with previous lookup algo. If too many sockets are listed, we take a look at the secondary (port, address) hash chain we added in previous patch. We choose the shortest chain and proceed with a RCU lookup on the elected chain. But, if we chose (port, address) chain, and fail to find a socket on given address, we must try another lookup on (port, INADDR_ANY) chain to find socket not bound to a particular IP. -> No extra cost for typical setups, where the first lookup will probabbly be performed. RCU lookups everywhere, we dont acquire spinlock. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-08 20:53:07 -08:00
Eric Dumazet	512615b6b8	udp: secondary hash on (local port, local address) Extends udp_table to contain a secondary hash table. socket anchor for this second hash is free, because UDP doesnt use skc_bind_node : We define an union to hold both skc_bind_node & a new hlist_nulls_node udp_portaddr_node udp_lib_get_port() inserts sockets into second hash chain (additional cost of one atomic op) udp_lib_unhash() deletes socket from second hash chain (additional cost of one atomic op) Note : No spinlock lockdep annotation is needed, because lock for the secondary hash chain is always get after lock for primary hash chain. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-08 20:53:06 -08:00
Eric Dumazet	d4cada4ae1	udp: split sk_hash into two u16 hashes Union sk_hash with two u16 hashes for udp (no extra memory taken) One 16 bits hash on (local port) value (the previous udp 'hash') One 16 bits hash on (local address, local port) values, initialized but not yet used. This second hash is using jenkin hash for better distribution. Because the 'port' is xored later, a partial hash is performed on local address + net_hash_mix(net) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-08 20:53:05 -08:00
Eric Dumazet	fdcc8aa953	udp: add a counter into udp_hslot Adds a counter in udp_hslot to keep an accurate count of sockets present in chain. This will permit to upcoming UDP lookup algo to chose the shortest chain when secondary hash is added. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-08 20:53:04 -08:00
Stephen Rothwell	415ce61aef	net/appletalk: using compat_ptr needs inclusion of linux/compat.h Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-08 20:41:03 -08:00

1 2 3 4 5 ...

13950 Commits