linux

Author	SHA1	Message	Date
Wei Yongjun	10a43cea7d	sctp: fix panic when T4-rto timer expire on removed transport If T4-rto timer is expired on a removed transport, kernel panic will occur when we do failure management on that transport. You can reproduce this use the following sequence: Endpoint A Endpoint B (ESTABLISHED) (ESTABLISHED) <----------------- ASCONF (SRC=X) ASCONF -----------------> (Delete IP Address = X) <----------------- ASCONF-ACK (Success Indication) <----------------- ASCONF (T4-rto timer expire) This patch fixed the problem. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2009-06-03 09:14:46 -04:00
Wei Yongjun	6345b19985	sctp: fix panic when T2-shutdown timer expire on removed transport If T2-shutdown timer is expired on a removed transport, kernel panic will occur when we do failure management on that transport. You can reproduce this use the following sequence: Endpoint A Endpoint B (ESTABLISHED) (ESTABLISHED) <----------------- SHUTDOWN (SRC=X) ASCONF -----------------> (Delete IP Address = X) <----------------- ASCONF-ACK (Success Indication) <----------------- SHUTDOWN (T2-shutdown timer expire) This patch fixed the problem. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2009-06-03 09:14:46 -04:00
Wei Yongjun	a2c395846c	sctp: fix to only enable IPv6 address support on PF_INET6 socket If socket is create by PF_INET type, it can not used IPv6 address to send/recv DATA. So only enable IPv6 address support on PF_INET6 socket. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2009-06-03 09:14:46 -04:00
Wei Yongjun	4553e88d87	sctp: fix a typo in net/sctp/sm_statetable.c Just fix a typo in net/sctp/sm_statetable.c. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2009-06-03 09:14:45 -04:00
Wei Yongjun	945e5abcee	sctp: fix the error code when ASCONF is received with invalid address Use Unresolvable Address error cause instead of Invalid Mandatory Parameter error cause when process ASCONF chunk with invalid address since address parameters are not mandatory in the ASCONF chunk. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2009-06-03 09:14:45 -04:00
Wei Yongjun	a987f762ca	sctp: fix report unrecognized parameter in ACSONF-ACK RFC5061 Section 5.2. Upon Reception of an ASCONF Chunk V2) In processing the chunk, the receiver should build a response message with the appropriate error TLVs, as specified in the Parameter type bits, for any ASCONF Parameter it does not understand. To indicate an unrecognized parameter, Cause Type 8 should be used as defined in the ERROR in Section 3.3.10.8, [RFC4960]. The endpoint may also use the response to carry rejections for other reasons, such as resource shortages, etc., using the Error Cause TLV and an appropriate error condition. So we should indicate an unrecognized parameter with error SCTP_ERROR_UNKNOWN_PARAM in ACSONF-ACK chunk, not SCTP_ERROR_INV_PARAM. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>	2009-06-03 09:14:45 -04:00
Eric Dumazet	adf30907d6	net: skb->dst accessors Define three accessors to get/set dst attached to a skb struct dst_entry skb_dst(const struct sk_buff skb) void skb_dst_set(struct sk_buff skb, struct dst_entry dst) void skb_dst_drop(struct sk_buff *skb) This one should replace occurrences of : dst_release(skb->dst) skb->dst = NULL; Delete skb->dst field Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-03 02:51:04 -07:00
Eric Dumazet	511c3f92ad	net: skb->rtable accessor Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb Delete skb->rtable field Setting rtable is not allowed, just set dst instead as rtable is an alias. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-03 02:51:02 -07:00
David S. Miller	b2f8f7525c	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/forcedeth.c	2009-06-03 02:43:41 -07:00
Pablo Neira Ayuso	e34d5c1a4f	netfilter: conntrack: replace notify chain by function pointer This patch removes the notify chain infrastructure and replace it by a simple function pointer. This issue has been mentioned in the mailing list several times: the use of the notify chain adds too much overhead for something that is only used by ctnetlink. This patch also changes nfnetlink_send(). It seems that gfp_any() returns GFP_KERNEL for user-context request, like those via ctnetlink, inside the RCU read-side section which is not valid. Using GFP_KERNEL is also evil since netlink may schedule(), this leads to "scheduling while atomic" bug reports. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-03 10:32:06 +02:00
Pablo Neira Ayuso	17e6e4eac0	netfilter: conntrack: simplify event caching system This patch simplifies the conntrack event caching system by removing several events: * IPCT_[]_VOLATILE, IPCT_HELPINFO and IPCT_NATINFO has been deleted since the have no clients. IPCT_COUNTER_FILLING which is a leftover of the 32-bits counter days. * IPCT_REFRESH which is not of any use since we always include the timeout in the messages. After this patch, the existing events are: * IPCT_NEW, IPCT_RELATED and IPCT_DESTROY, that are used to identify addition and deletion of entries. * IPCT_STATUS, that notes that the status bits have changes, eg. IPS_SEEN_REPLY and IPS_ASSURED. * IPCT_PROTOINFO, that reports that internal protocol information has changed, eg. the TCP, DCCP and SCTP protocol state. * IPCT_HELPER, that a helper has been assigned or unassigned to this entry. * IPCT_MARK and IPCT_SECMARK, that reports that the mark has changed, this covers the case when a mark is set to zero. * IPCT_NATSEQADJ, to report that there's updates in the NAT sequence adjustment. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:08:46 +02:00
Pablo Neira Ayuso	274d383b9c	netfilter: conntrack: don't report events on module removal During the module removal there are no possible event listeners since ctnetlink must be removed before to allow removing nf_conntrack. This patch removes the event reporting for the module removal case which is not of any use in the existing code. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:08:38 +02:00
Pablo Neira Ayuso	03b64f518a	netfilter: ctnetlink: cleanup message-size calculation This patch cleans up the message calculation to make it similar to rtnetlink, moreover, it removes unneeded verbose information. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:08:27 +02:00
Pablo Neira Ayuso	96bcf938dc	netfilter: ctnetlink: use nlmsg_* helper function to build messages Replaces the old macros to build Netlink messages with the new nlmsg_*() helper functions. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:07:39 +02:00
Pablo Neira Ayuso	f2f3e38c63	netfilter: ctnetlink: rename tuple() by nf_ct_tuple() macro definition This patch move the internal tuple() macro definition to the header file as nf_ct_tuple(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:03:35 +02:00
Pablo Neira Ayuso	8b0a231d4d	netfilter: ctnetlink: remove nowait parameter from fill_info() This patch is a cleanup, it removes the `nowait' parameter from all fill_info() function since it is always set to one. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:03:34 +02:00
Pablo Neira Ayuso	f49c857ff2	netfilter: nfnetlink: cleanup for nfnetlink_rcv_msg() function This patch cleans up the message handling path in two aspects: * it uses NLMSG_LENGTH() instead of NLMSG_SPACE() like rtnetlink does in this case to check if there is enough room for the Netlink/nfnetlink headers. No need to check for the padding room. * it removes a redundant header size checking that has been already do at the beginning of the function. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:03:33 +02:00
Jozsef Kadlecsik	874ab9233e	netfilter: nf_ct_tcp: TCP simultaneous open support The patch below adds supporting TCP simultaneous open to conntrack. The unused LISTEN state is replaced by a new state (SYN_SENT2) denoting the second SYN sent from the reply direction in the new case. The state table is updated and the function tcp_in_window is modified to handle simultaneous open. The functionality can fairly easily be tested by socat. A sample tcpdump recording 23:21:34.244733 IP (tos 0x0, ttl 64, id 49224, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xe75f (correct), 3383710133:3383710133(0) win 5840 <mss 1460,sackOK,timestamp 173445629 0,nop,wscale 7> 23:21:34.244783 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) 192.168.0.1.2020 > 192.168.0.254.2020: R, cksum 0x0253 (correct), 0:0(0) ack 3383710134 win 0 23:21:36.038680 IP (tos 0x0, ttl 64, id 28092, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.1.2020 > 192.168.0.254.2020: S, cksum 0x704b (correct), 2634546729:2634546729(0) win 5840 <mss 1460,sackOK,timestamp 824213 0,nop,wscale 1> 23:21:36.038777 IP (tos 0x0, ttl 64, id 49225, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xb179 (correct), 3383710133:3383710133(0) ack 2634546730 win 5840 <mss 1460,sackOK,timestamp 173447423 824213,nop,wscale 7> 23:21:36.038847 IP (tos 0x0, ttl 64, id 28093, offset 0, flags [DF], proto TCP (6), length 52) 192.168.0.1.2020 > 192.168.0.254.2020: ., cksum 0xebad (correct), ack 3383710134 win 2920 <nop,nop,timestamp 824213 173447423> and the corresponding netlink events: [NEW] tcp 6 120 SYN_SENT src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 [UNREPLIED] src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [UPDATE] tcp 6 120 LISTEN src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [UPDATE] tcp 6 60 SYN_RECV src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [UPDATE] tcp 6 432000 ESTABLISHED src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [ASSURED] The RST packet was dropped in the raw table, thus it did not reach conntrack. nfnetlink_conntrack is unpatched so it shows the new SYN_SENT2 state as the old unused LISTEN. With TCP simultaneous open support we satisfy REQ-2 in RFC 5382 ;-) . Additional minor correction in this patch is that in order to catch uninitialized reply directions, "td_maxwin == 0" is used instead of "td_end == 0" because the former can't be true except in uninitialized state while td_end may accidentally be equal to zero in the mid of a connection. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-02 13:58:56 +02:00
Patrick McHardy	8cc848fa34	Merge branch 'master' of git://dev.medozas.de/linux	2009-06-02 13:44:56 +02:00
Minoru Usui	12186be7d2	net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup This patch fixes a bug which unconfigured struct tcf_proto keeps chaining in tc_ctl_tfilter(), and avoids kernel panic in cls_cgroup_classify() when we use cls_cgroup. When we execute 'tc filter add', tcf_proto is allocated, initialized by classifier's init(), and chained. After it's chained, tc_ctl_tfilter() calls classifier's change(). When classifier's change() fails, tc_ctl_tfilter() does not free and keeps tcf_proto. In addition, cls_cgroup is initialized in change() not in init(). It accesses unconfigured struct tcf_proto which is chained before change(), then hits Oops. Signed-off-by: Minoru Usui <usui@mxm.nes.nec.co.jp> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Tested-by: Minoru Usui <usui@mxm.nes.nec.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-02 02:17:34 -07:00
Nivedita Singhvi	f771bef980	ipv4: New multicast-all socket option After some discussion offline with Christoph Lameter and David Stevens regarding multicast behaviour in Linux, I'm submitting a slightly modified patch from the one Christoph submitted earlier. This patch provides a new socket option IP_MULTICAST_ALL. In this case, default behaviour is _unchanged_ from the current Linux standard. The socket option is set by default to provide original behaviour. Sockets wishing to receive data only from multicast groups they join explicitly will need to clear this socket option. Signed-off-by: Nivedita Singhvi <niv@us.ibm.com> Signed-off-by: Christoph Lameter<cl@linux.com> Acked-by: David Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-02 00:45:24 -07:00
Eric Dumazet	4d52cfbef6	net: ipv4/ip_sockglue.c cleanups Pure cleanups Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-02 00:42:16 -07:00
Brian Haley	dae9de8e13	IPv6: Print error value when skb allocation fails Print-out the error value when sock_alloc_send_skb() fails in the IPv6 neighbor discovery code - can be useful for debugging. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-02 00:20:26 -07:00
Rémi Denis-Courmont	bbd5898d39	Phonet: fix accounting race between gprs_writeable() and gprs_xmit() In the unlikely event that gprs_writeable() and gprs_xmit() check for writeability at the same, we could stop the device queue forever. Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-02 00:17:43 -07:00
David S. Miller	fc23ffe075	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6	2009-06-01 14:32:08 -07:00
Brian Haley	56d417b12e	IPv6: Add 'autoconf' and 'disable_ipv6' module parameters Add 'autoconf' and 'disable_ipv6' parameters to the IPv6 module. The first controls if IPv6 addresses are autoconfigured from prefixes received in Router Advertisements. The IPv6 loopback (::1) and link-local addresses are still configured. The second controls if IPv6 addresses are desired at all. No IPv6 addresses will be added to any interfaces. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-01 03:07:33 -07:00
Jiri Pirko	ccffad25b5	net: convert unicast addr list This patch converts unicast address list to standard list_head using previously introduced struct netdev_hw_addr. It also relaxes the locking. Original spinlock (still used for multicast addresses) is not needed and is no longer used for a protection of this list. All reading and writing takes place under rtnl (with no changes). I also removed a possibility to specify the length of the address while adding or deleting unicast address. It's always dev->addr_len. The convertion touched especially e1000 and ixgbe codes when the change is not so trivial. Signed-off-by: Jiri Pirko <jpirko@redhat.com> drivers/net/bnx2.c \| 13 +-- drivers/net/e1000/e1000_main.c \| 24 +++-- drivers/net/ixgbe/ixgbe_common.c \| 14 ++-- drivers/net/ixgbe/ixgbe_common.h \| 4 +- drivers/net/ixgbe/ixgbe_main.c \| 6 +- drivers/net/ixgbe/ixgbe_type.h \| 4 +- drivers/net/macvlan.c \| 11 +- drivers/net/mv643xx_eth.c \| 11 +- drivers/net/niu.c \| 7 +- drivers/net/virtio_net.c \| 7 +- drivers/s390/net/qeth_l2_main.c \| 6 +- drivers/scsi/fcoe/fcoe.c \| 16 ++-- include/linux/netdevice.h \| 18 ++-- net/8021q/vlan.c \| 4 +- net/8021q/vlan_dev.c \| 10 +- net/core/dev.c \| 195 +++++++++++++++++++++++++++----------- net/dsa/slave.c \| 10 +- net/packet/af_packet.c \| 4 +- 18 files changed, 227 insertions(+), 137 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-29 22:12:32 -07:00
Ilpo Järvinen	2df9001edc	tcp: fix loop in ofo handling code and reduce its complexity Somewhat luckily, I was looking into these parts with very fine comb because I've made somewhat similar changes on the same area (conflicts that arose weren't that lucky though). The loop was very much overengineered recently in commit `915219441d` (tcp: Use SKB queue and list helpers instead of doing it by-hand), while it basically just wants to know if there are skbs after 'skb'. Also it got broken because skb1 = skb->next got translated into skb1 = skb1->next (though abstracted) improperly. Note that 'skb1' is pointing to previous sk_buff than skb or NULL if at head. Two things went wrong: - We'll kfree 'skb' on the first iteration instead of the skbuff following 'skb' (it would require required SACK reneging to recover I think). - The list head case where 'skb1' is NULL is checked too early and the loop won't execute whereas it previously did. Conclusion, mostly revert the recent changes which makes the cset very messy looking but using proper accessor in the previous-like version. The effective changes against the original can be viewed with: git-diff 915219441d566f1da0caa0e262be49b666159e17^ \ net/ipv4/tcp_input.c \| sed -n -e '57,70 p' Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-29 15:02:29 -07:00
Eric Dumazet	108bfa895c	net: unset IFF_XMIT_DST_RELEASE in ipgre_tunnel_setup() ipgre_tunnel_xmit() might need skb->dst, so tell dev_hard_start_xmit() to no release it. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-29 01:46:29 -07:00
Eric Dumazet	a489e51c1a	atm: unset IFF_XMIT_DST_RELEASE in clip_setup() clip_start_xmit() needs skb->dst so tell dev_hard_start_xmit() to no release it. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-29 01:46:28 -07:00
Eric Dumazet	28e72216d7	net: unset IFF_XMIT_DST_RELEASE in ipip_tunnel_setup() ipip_tunnel_xmit() might need skb->dst, so tell dev_hard_start_xmit() to no release it. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-29 01:46:27 -07:00
David S. Miller	3f1f39c42b	Merge branch 'linux-2.6.31.y' of git://git.kernel.org/pub/scm/linux/kernel/git/inaky/wimax	2009-05-29 01:41:32 -07:00
David S. Miller	dfe9a83798	llc: Kill outdated and incorrect comment. This comment suggested storing two pieces of state in the LLC skb control block, and in fact we do. Someone did the implementation but never killed this todo comment :-) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-28 23:31:56 -07:00
David S. Miller	528be7ff82	irda: Use SKB queue and list helpers instead of doing it by-hand. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-28 23:26:33 -07:00
David S. Miller	915219441d	tcp: Use SKB queue and list helpers instead of doing it by-hand. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-28 21:35:47 -07:00
Paulius Zaleckas	7f0333eb2f	wimax: Add netlink interface to get device state wimax connection manager / daemon has to know what is current state of the device. Previously it was only possible to get notification whet state has changed. Note: By mistake, the new generic netlink's number for WIMAX_GNL_OP_STATE_GET was declared inserting into the existing list of API calls, not appending; thus, it'd break existing API. Fixed by Inaky Perez-Gonzalez <inaky@linux.intel.com> by moving to the tail, where we add to the interface, not modify the interface. Thanks to Stephen Hemminger <shemminger@vyatta.com> for catching this. Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>	2009-05-28 18:02:20 -07:00
Inaky Perez-Gonzalez	52a8d96308	wimax: document why wimax_msg_*() operations can be used in any state Funcion documentation for wimax_msg_alloc() and wimax_msg_send() needs to clarify that they can be used in the very early stages of a wimax_dev lifecycle. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>	2009-05-28 18:02:04 -07:00
David S. Miller	de1033428b	econet: Use SKB queue and list helpers instead of doing it by-hand. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-28 16:46:29 -07:00
David S. Miller	bec571ec76	decnet: Use SKB queue and list helpers instead of doing it by-hand. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-28 16:43:52 -07:00
David S. Miller	b6211ae7f2	atm: Use SKB queue and list helpers instead of doing it by-hand. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-28 16:36:47 -07:00
David S. Miller	4d3383d0ad	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2009-05-27 15:51:25 -07:00
Eric Dumazet	2a91525c20	net: net/core/sock.c cleanup Pure style cleanup patch. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 15:47:07 -07:00
Eric Dumazet	1ce8e7b57b	net: ALIGN/PTR_ALIGN cleanup in alloc_netdev_mq()/netdev_priv() Use ALIGN() and PTR_ALIGN() macros instead of handcoding them. Get rid of NETDEV_ALIGN_CONST ugly define Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 15:47:06 -07:00
Jiri Pirko	0bb32417ff	bridge: avoid an extra space in br_fdb_update() Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 15:46:54 -07:00
Pablo Neira Ayuso	a17c859849	netfilter: conntrack: add support for DCCP handshake sequence to ctnetlink This patch adds CTA_PROTOINFO_DCCP_HANDSHAKE_SEQ that exposes the u64 handshake sequence number to user-space. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-05-27 17:50:35 +02:00
Pablo Neira Ayuso	eeff9beec3	netfilter: nfnetlink_log: fix wrong skbuff size calculation This problem was introduced in `72961ecf84` since no space was reserved for the new attributes NFULA_HWTYPE, NFULA_HWLEN and NFULA_HWHEADER. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-05-27 15:49:11 +02:00
Jesper Dangaard Brouer	683a04cebc	netfilter: xt_hashlimit does a wrong SEQ_SKIP The function dl_seq_show() returns 1 (equal to SEQ_SKIP) in case a seq_printf() call return -1. It should return -1. This SEQ_SKIP behavior brakes processing the proc file e.g. via a pipe or just through less. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-05-27 15:45:34 +02:00
Herbert Xu	a2a804cddf	tcp: Do not check flush when comparing options for GRO There is no need to repeatedly check flush when comparing TCP options for GRO as it will be false 99% of the time where it matters. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:26:05 -07:00
Herbert Xu	9aaa156cf9	gro: Store shinfo in local variable in skb_gro_receive This patch stores the two shinfo pointers in local variables because they're used over and over again in skb_gro_receive. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:26:05 -07:00
Herbert Xu	66e92fcf1d	gro: Nasty optimisations for page frags in skb_gro_receive This patch reverses the direction of the frags array copy in skb_gro_receive in order simplify the loop conditional. It also avoids touching the first element of the original frags array. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:26:04 -07:00
Herbert Xu	cb18978cbf	gro: Open-code final pskb_may_pull As we know the only packets which need the final pskb_may_pull are completely non-linear, and have all the required bits in frag0, we can perform a straight memcpy instead of going through pskb_may_pull and doing skb_copy_bits. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:26:02 -07:00
Herbert Xu	1075f3f65d	ipv4: Use 32-bit loads for ID and length in GRO This patch optimises the IPv4 GRO code by using 32-bit loads (instead of 16-bit ones) on the ID and length checks in the receive function. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:26:02 -07:00
Herbert Xu	a5b1cf288d	gro: Avoid unnecessary comparison after skb_gro_header For the overwhelming majority of cases, skb_gro_header's return value cannot be NULL. Yet we must check it because of its current form. This patch splits it up into multiple functions in order to avoid this. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:26:01 -07:00
Herbert Xu	7489594cb2	gro: Optimise length comparison in skb_gro_header By caching frag0_len, we can avoid checking both frag0 and the length separately in skb_gro_header. This helps as skb_gro_header is called four times per packet which amounts to a few million times at 10Gb/s. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:26:01 -07:00
Herbert Xu	30a3ae30c7	tcp: Optimise len/mss comparison Instead of checking len > mss \|\| len == 0, we can accomplish both by checking (len - 1) > mss using the unsigned wraparound. At nearly a million times a second, this might just help. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:26:00 -07:00
Herbert Xu	4a9a2968a1	tcp: Remove unnecessary window comparisons for GRO The window has already been checked as part of the flag word so there is no need to check it explicitly. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:25:59 -07:00
Herbert Xu	745898eaf0	tcp: Optimise GRO port comparisons Instead of doing two 16-bit operations for the source/destination ports, we can do one 32-bit operation to take care both. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:25:57 -07:00
Herbert Xu	78d3fd0b7d	gro: Only use skb_gro_header for completely non-linear packets Currently skb_gro_header is used for packets which put the hardware header in skb->data with the rest in frags. Since the drivers that need this optimisation all provide completely non-linear packets, we can gain extra optimisations by only performing the frag0 optimisation for completely non-linear packets. In particular, we can simply test frag0 (instead of skb_headlen) to see whether the optimisation is in force. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:25:57 -07:00
Herbert Xu	67147ba99a	gro: Localise offset/headlen in skb_gro_offset This patch stores the offset/headlen in local variables as they're used repeatedly in skb_gro_offset. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:25:55 -07:00
Herbert Xu	78a478d0ef	gro: Inline skb_gro_header and cache frag0 virtual address The function skb_gro_header is called four times per packet which quickly adds up at 10Gb/s. This patch inlines it to allow better optimisations. Some architectures perform multiplication for page_address, which is done by each skb_gro_header invocation. This patch caches that value in skb->cb to avoid the unnecessary multiplications. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:25:55 -07:00
Herbert Xu	42da6994ca	gro: Open-code frags copy in skb_gro_receive gcc does a poor job at generating code for the memcpy of the frags array in skb_gro_receive, which is the primary purpose of that function when merging frags. In particular, it can't utilise the alignment information of the source and destination. This patch open-codes the copy so we process words instead of bytes. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-27 03:25:54 -07:00
Dave Young	4c71318948	Bluetooth: Remove useless flush_work() causing lockdep warnings The calls to flush_work() are pointless in a single thread workqueue and they are actually causing a lockdep warning. ============================================= [ INFO: possible recursive locking detected ] 2.6.30-rc6-02911-gbb803cf #16 --------------------------------------------- bluetooth/2518 is trying to acquire lock: (bluetooth){+.+.+.}, at: [<c0130c14>] flush_work+0x28/0xb0 but task is already holding lock: (bluetooth){+.+.+.}, at: [<c0130424>] worker_thread+0x149/0x25e other info that might help us debug this: 2 locks held by bluetooth/2518: #0: (bluetooth){+.+.+.}, at: [<c0130424>] worker_thread+0x149/0x25e #1: (&conn->work_del){+.+...}, at: [<c0130424>] worker_thread+0x149/0x25e stack backtrace: Pid: 2518, comm: bluetooth Not tainted 2.6.30-rc6-02911-gbb803cf #16 Call Trace: [<c03d64d9>] ? printk+0xf/0x11 [<c0140d96>] __lock_acquire+0x7ce/0xb1b [<c0141173>] lock_acquire+0x90/0xad [<c0130c14>] ? flush_work+0x28/0xb0 [<c0130c2e>] flush_work+0x42/0xb0 [<c0130c14>] ? flush_work+0x28/0xb0 [<f8b84966>] del_conn+0x1c/0x84 [bluetooth] [<c0130469>] worker_thread+0x18e/0x25e [<c0130424>] ? worker_thread+0x149/0x25e [<f8b8494a>] ? del_conn+0x0/0x84 [bluetooth] [<c0133843>] ? autoremove_wake_function+0x0/0x33 [<c01302db>] ? worker_thread+0x0/0x25e [<c013355a>] kthread+0x45/0x6b [<c0133515>] ? kthread+0x0/0x6b [<c01034a7>] kernel_thread_helper+0x7/0x10 Based on a report by Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Tested-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2009-05-27 09:15:57 +02:00
David S. Miller	079e24ed80	nl80211: Eliminate reference to BUS_ID_SIZE. It's going away. Just leave the constant "20" here so that behavior doesn't change. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-26 21:15:00 -07:00
David S. Miller	2b0cc7f78b	net: Remove bogus reference to BUS_ID_SIZE in sysfs code. BUS_ID_SIZE is really no more, and device names are dynamically allocated and thus can be any necessary size. So remove the BUG check here making sure BUS_ID_SIZE is at least as large as IFNAMSIZ. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-26 21:05:19 -07:00
Paul Menage	e65fcfd63a	cls_cgroup: read classid atomically in classifier Avoid reading the unsynchronized value cs->classid multiple times, since it could change concurrently from non-zero to zero; this would result in the classifier returning a positive result with a bogus (zero) classid. Signed-off-by: Paul Menage <menage@google.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-26 20:47:02 -07:00
Eric Dumazet	08baf56108	net: txq_trans_update() helper We would like to get rid of netdev->trans_start = jiffies; that about all net drivers have to use in their start_xmit() function, and use txq->trans_start instead. This can be done generically in core network, as suggested by David. Some devices, (particularly loopback) dont need trans_start update, because they dont have transmit watchdog. We could add a new device flag, or rely on fact that txq->tran_start can be updated is txq->xmit_lock_owner is different than -1. Use a helper function to hide our choice. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-25 22:58:01 -07:00
Jarek Poplawski	a1dcb6628b	pkt_sched: gen_estimator: Fix signed integers right-shifts. Right-shifts of signed integers are implementation-defined so unportable. With feedback from: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-25 22:47:01 -07:00
Doug Leith	c80a5cdfc5	tcp: tcp_vegas ssthresh bugfix This patch fixes ssthresh accounting issues in tcp_vegas when cwnd decreases Signed-off-by: Doug Leith <doug.leith@nuim.ie> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-25 22:44:59 -07:00
Pablo Neira Ayuso	b38b1f6168	netfilter: nf_ct_dccp: add missing DCCP protocol changes in event cache This patch adds the missing protocol state-change event reporting for DCCP. $ sudo conntrack -E [NEW] dccp 33 240 src=192.168.0.2 dst=192.168.1.2 sport=57040 dport=5001 [UNREPLIED] src=192.168.1.2 dst=192.168.1.100 sport=5001 dport=57040 With this patch: $ sudo conntrack -E [NEW] dccp 33 240 REQUEST src=192.168.0.2 dst=192.168.1.2 sport=57040 dport=5001 [UNREPLIED] src=192.168.1.2 dst=192.168.1.100 sport=5001 dport=57040 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-05-25 17:29:43 +02:00
Jozsef Kadlecsik	bfcaa50270	netfilter: nf_ct_tcp: fix accepting invalid RST segments Robert L Mathews discovered that some clients send evil TCP RST segments, which are accepted by netfilter conntrack but discarded by the destination. Thus the conntrack entry is destroyed but the destination retransmits data until timeout. The same technique, i.e. sending properly crafted RST segments, can easily be used to bypass connlimit/connbytes based restrictions (the sample script written by Robert can be found in the netfilter mailing list archives). The patch below adds a new flag and new field to struct ip_ct_tcp_state so that checking RST segments can be made more strict and thus TCP conntrack can catch the invalid ones: the RST segment is accepted only if its sequence number higher than or equal to the highest ack we seen from the other direction. (The last_ack field cannot be reused because it is used to catch resent packets.) Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-05-25 17:23:15 +02:00
Alexander Beregalov	e3804cbebb	net: remove COMPAT_NET_DEV_OPS All drivers are already converted to new net_device_ops API and nobody uses old API anymore. Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-25 01:53:53 -07:00
David S. Miller	c649c0e31d	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/ath/ath5k/phy.c drivers/net/wireless/iwlwifi/iwl-agn.c drivers/net/wireless/iwlwifi/iwl3945-base.c	2009-05-25 01:42:21 -07:00
Herbert Xu	9bcb97cace	skbuff: Copy csum instead of csum_start/csum_offset Hi: skbuff: Copy csum instead of csum_start/csum_offset It's easier to copy the u32 csum instead of its two u16 constituents. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cheers, Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-25 00:40:43 -07:00
Herbert Xu	82c49a352e	skbuff: Move new code into __copy_skb_header Hi: skbuff: Move new __skb_clone code into __copy_skb_header It seems that people just keep on adding stuff to __skb_clone instead __copy_skb_header. This is wrong as it means your brand-new attributes won't always get copied as you intended. This patch moves them to the right place, and adds a comment to prevent this from happening again. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Thanks, Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-25 00:40:43 -07:00
David S. Miller	45ea4ea2af	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2009-05-25 00:38:24 -07:00
Zhu Yi	e31a16d6f6	wireless: move some utility functions from mac80211 to cfg80211 The patch moves some utility functions from mac80211 to cfg80211. Because these functions are doing generic 802.11 operations so they are not mac80211 specific. The moving allows some fullmac drivers to be also benefit from these utility functions. Signed-off-by: Zhu Yi <yi.zhu@intel.com> Signed-off-by: Samuel Ortiz <samuel.ortiz@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-22 14:06:02 -04:00
Johannes Berg	a971be223f	mac80211: correct probe wait time My first patch submission used 200ms, which I then somehow managed to revert back to the earlier 50ms I had used for some tests in the second patch submission -- but that was wrong, I should have used 200ms here. Correct that. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-22 14:06:01 -04:00
Johannes Berg	4ef699fb77	mac80211: fix probe response wait timing In "mac80211: split out and decrease probe wait time" I tried to reduce the time waiting for a probe response, but failed to take into account the case where we are detecting beacon loss in software -- in that case we still wait the monitoring time rather than the probe wait time. Fix this by refactoring the mod_timer() calls in ieee80211_associated(). Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-22 14:05:59 -04:00
Johannes Berg	8705782582	wext: remove atomic requirement for wireless stats The requirement for wireless stats to be atomic is now mostly artificial since we hold the rtnl _and_ the dev_base_lock for iterating the device list. Doing that is not required, just the rtnl is sufficient (and the rtnl is required for other reasons outlined in commit "wext: fix get_wireless_stats locking"). This will fix http://bugzilla.kernel.org/show_bug.cgi?id=13344 and make things easier for drivers. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-22 14:05:59 -04:00
Herbert Xu	3699067381	tcp: Unexport TCPv6 GRO functions Sinec the TCPv6 GRO functions are used in the same file where they are defined, we do not need to export them. This was a cut-n-paste from the IPv4 code which does need to export them. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-22 00:45:28 -07:00
David S. Miller	7d18f11489	net: Fix arg to trace_napi_poll() in netpoll. Reproted by Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 23:30:09 -07:00
Michał Mirosław	0d63cbb535	wireless: Use genl_register_family_with_ops() Use genl_register_family_with_ops() instead of a copy. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 16:50:25 -07:00
Michał Mirosław	7ae740df3a	netlabel: Use genl_register_family_with_ops() Use genl_register_family_with_ops() instead of a copy. This fixes genetlink family leak on error path. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 16:50:24 -07:00
Michał Mirosław	8f698d5453	ipvs: Use genl_register_family_with_ops() Use genl_register_family_with_ops() instead of a copy. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 16:50:24 -07:00
Michał Mirosław	acb0a200ae	tipc: Use genl_register_family_with_ops() Use genl_register_family_with_ops() instead of a copy. This also changes netlink related variable names to be kernel-wide unique for consistency with other users. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 16:50:23 -07:00
Michał Mirosław	502664eeaf	irda: Use genl_register_family_with_ops() Use genl_register_family_with_ops() instead of a copy. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Cc: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 16:50:22 -07:00
Michał Mirosław	a7b11d7382	genetlink: Introduce genl_register_family_with_ops() This introduces genl_register_family_with_ops() that registers a genetlink family along with operations from a table. This is used to kill copy'n'paste occurrences in following patches. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 16:50:22 -07:00
Neil Horman	4ea7e38696	dropmon: add ability to detect when hardware dropsrxpackets Patch to add the ability to detect drops in hardware interfaces via dropwatch. Adds a tracepoint to net_rx_action to signal everytime a napi instance is polled. The dropmon code then periodically checks to see if the rx_frames counter has changed, and if so, adds a drop notification to the netlink protocol, using the reserved all-0's vector to indicate the drop location was in hardware, rather than somewhere in the code. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/net_dropmon.h \| 8 ++ include/trace/napi.h \| 11 +++ net/core/dev.c \| 5 + net/core/drop_monitor.c \| 124 ++++++++++++++++++++++++++++++++++++++++++-- net/core/net-traces.c \| 4 + net/core/netpoll.c \| 2 6 files changed, 149 insertions(+), 5 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 16:50:21 -07:00
Dan Carpenter	0975ecba3b	RxRPC: Error handling for rxrpc_alloc_connection() rxrpc_alloc_connection() doesn't return an error code on failure, it just returns NULL. IS_ERR(NULL) is false. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 15:22:02 -07:00
Robert Olsson	3ed18d76d9	ipv4: Fix oops with FIB_TRIE It seems we can fix this by disabling preemption while we re-balance the trie. This is with the CONFIG_CLASSIC_RCU. It's been stress-tested at high loads continuesly taking a full BGP table up/down via iproute -batch. Note. fib_trie is not updated for CONFIG_PREEMPT_RCU Reported-by: Andrei Popa Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 15:20:59 -07:00
Eric W. Biederman	d95ed9275e	af_packet: Teach to listen for multiple unicast addresses. The the PACKET_ADD_MEMBERSHIP and the PACKET_DROP_MEMBERSHIP setsockopt calls for af_packet already has all of the infrastructure needed to subscribe to multiple mac addresses. All that is missing is a flag to say that the address we want to listen on is a unicast address. So introduce PACKET_MR_UNICAST and wire it up to dev_unicast_add and dev_unicast_delete. Additionally I noticed that errors from dev_mc_add were not propagated from packet_dev_mc so fix that. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 15:13:39 -07:00
Stephen Hemminger	ca0f31125c	netns: simplify net_ns_init The net_ns_init code can be simplified. No need to save error code if it is only going to panic if it is set 4 lines later. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 15:10:31 -07:00
Stephen Hemminger	1f7a2bb4ef	netns: remove leftover debugging message Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 15:10:05 -07:00
Florian Westphal	5b5f792a6a	pktgen: do not access flows[] beyond its length typo -- pkt_dev->nflows is for stats only, the number of concurrent flows is stored in cflows. Reported-By: Vladimir Ivashchenko <hazard@francoudi.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 15:07:12 -07:00
Jean-Mickael Guerin	4f72427998	IPv6: set RTPROT_KERNEL to initial route The use of unspecified protocol in IPv6 initial route prevents quagga to install IPv6 default route: # show ipv6 route S ::/0 [1/0] via fe80::1, eth1_0 K>* ::/0 is directly connected, lo, rej C>* ::1/128 is directly connected, lo C>* fe80::/64 is directly connected, eth1_0 # ip -6 route fe80::/64 dev eth1_0 proto kernel metric 256 mtu 1500 advmss 1440 hoplimit -1 ff00::/8 dev eth1_0 metric 256 mtu 1500 advmss 1440 hoplimit -1 unreachable default dev lo proto none metric -1 error -101 hoplimit 255 The attached patch ensures RTPROT_KERNEL to the default initial route and fixes the problem for quagga. This is similar to "ipv6: protocol for address routes" `f410a1fba7`. # show ipv6 route S>* ::/0 [1/0] via fe80::1, eth1_0 C>* ::1/128 is directly connected, lo C>* fe80::/64 is directly connected, eth1_0 # ip -6 route fe80::/64 dev eth1_0 proto kernel metric 256 mtu 1500 advmss 1440 hoplimit -1 fe80::/64 dev eth1_0 proto kernel metric 256 mtu 1500 advmss 1440 hoplimit -1 ff00::/8 dev eth1_0 metric 256 mtu 1500 advmss 1440 hoplimit -1 default via fe80::1 dev eth1_0 proto zebra metric 1024 mtu 1500 advmss 1440 hoplimit -1 unreachable default dev lo proto kernel metric -1 error -101 hoplimit 255 Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-20 17:38:59 -07:00
David S. Miller	86c2fe1e3a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2009-05-20 17:31:25 -07:00
Rami Rosen	04af8cf6f3	net: Remove unused parameter from fill method in fib_rules_ops. The netlink message header (struct nlmsghdr) is an unused parameter in fill method of fib_rules_ops struct. This patch removes this parameter from this method and fixes the places where this method is called. (include/net/fib_rules.h) Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-20 17:26:23 -07:00
Eric Dumazet	1ddbcb005c	net: fix rtable leak in net/ipv4/route.c Alexander V. Lukyanov found a regression in 2.6.29 and made a complete analysis found in http://bugzilla.kernel.org/show_bug.cgi?id=13339 Quoted here because its a perfect one : begin_of_quotation 2.6.29 patch has introduced flexible route cache rebuilding. Unfortunately the patch has at least one critical flaw, and another problem. rt_intern_hash calculates rthi pointer, which is later used for new entry insertion. The same loop calculates cand pointer which is used to clean the list. If the pointers are the same, rtable leak occurs, as first the cand is removed then the new entry is appended to it. This leak leads to unregister_netdevice problem (usage count > 0). Another problem of the patch is that it tries to insert the entries in certain order, to facilitate counting of entries distinct by all but QoS parameters. Unfortunately, referencing an existing rtable entry moves it to list beginning, to speed up further lookups, so the carefully built order is destroyed. For the first problem the simplest patch it to set rthi=0 when rthi==cand, but it will also destroy the ordering. end_of_quotation Problematic commit is `1080d709fb` (net: implement emergency route cache rebulds when gc_elasticity is exceeded) Trying to keep dst_entries ordered is too complex and breaks the fact that order should depend on the frequency of use for garbage collection. A possible fix is to make rt_intern_hash() simpler, and only makes rt_check_expire() a litle bit smarter, being able to cope with an arbitrary entries order. The added loop is running on cache hot data, while cpu is prefetching next object, so should be unnoticied. Reported-and-analyzed-by: Alexander V. Lukyanov <lav@yar.ru> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-20 17:18:02 -07:00
Eric Dumazet	cf8da764fc	net: fix length computation in rt_check_expire() rt_check_expire() computes average and standard deviation of chain lengths, but not correclty reset length to 0 at beginning of each chain. This probably gives overflows for sum2 (and sum) on loaded machines instead of meaningful results. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-20 17:18:01 -07:00
Johannes Berg	9cef873798	mac80211: fix managed mode BSSID handling Currently, we will ask the driver to configure right away when somebody changes the desired BSSID. That's totally strange because then we will configure the driver without even knowing whether the BSS exists. Change this to only configure the BSSID when associated, and configure a zero BSSID when not associated. As a side effect, this fixes an issue with the iwlwifi driver which doesn't implement sta_notify properly and uses the BSSID instead and gets very confused if the BSSID is cleared before we disassociate, which results in the warning Marcel posted [1] and iwlwifi bug 1995 [2]. [1] http://thread.gmane.org/gmane.linux.kernel.wireless.general/32598 [2] http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1995 Cc: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:37 -04:00
Luis R. Rodriguez	bbcf3f0277	cfg80211: warn when wiphy_apply_custom_regulatory() does nothing Device drivers using wiphy_apply_custom_regulatory() want some regulatory settings applied to their wiphy, if no bands were configured on the wiphy then something went wrong. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:37 -04:00
Johannes Berg	db67645db6	mac80211: fix parameter confusion when finding IBSS When I fixed the crypto bit I must have done the negative test only -- it is quite clearly impossible to find _any_ IBSS to join with the parameters put the wrong way around. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:36 -04:00
Johannes Berg	175427ce40	mac80211: don't try to do anything on unchanged genIE When the genIE hasn't changed there's no reason to kick the state machine since it won't be able to do anything new -- doing this decreases the useless work we do for reassociating because if we do kick the state machine it will try to find a usable BSS but there might not be one because wpa_supplicant will only change the BSSID a little later. In a sense this is a workaround for userspace behaviour, but on the other hand userspace cannot really keep track of what the kernel currently has for genIE since any process could have changed that while wpa_supplicant wasn't looking. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:35 -04:00
Jouni Malinen	7e0aae4732	mac80211: Do not override AID in the duration field When updating the duration field for TX frames, skip the update for PS-Poll frames that use this field for other purposes (AID). Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:34 -04:00
Jouni Malinen	30196673fe	mac80211: PS processing for every Beacon with our AID in TIM If the AP includes our AID in the TIM IE, we need to process the Beacon frame as far as PS is concerned (send PS-Poll or nullfunc data with PM=0). The previous code skipped this in cases where the CRC value did not change and it would not change if the AP continues including our AID in the TIM.. There is no need to count the crc32 value for directed_tim with this change, so we can remove that part. In order not to change the order of operations (i.e., update WMM parameters prior to sending PS-Poll), the CRC match is checked twice as only after the PS processing step, the rest of the function is skipped if nothing changed in the Beacon. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:33 -04:00
Johannes Berg	cce4c77b87	mac80211: fix kernel-doc Moving information from config_interface to bss_info_changed removed struct ieee80211_if_conf which the documentation still refers to, additionally there's one kernel-doc description too much and one other missing, fix all this. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:32 -04:00
Pavel Roskin	e43e820c9c	cfg80211: fix compile error with CONFIG_CFG80211_DEBUGFS If CONFIG_CFG80211_DEBUGFS is enabled and CONFIG_MAC80211_DEBUGFS is not, compilation fails in net/wireless/debugfs.c: net/wireless/debugfs.c: In function 'cfg80211_debugfs_drv_add': net/wireless/debugfs.c:117: error: 'struct cfg80211_registered_device' has no member named 'debugfs' The debugfs filed is needed if and only if CONFIG_CFG80211_DEBUGFS is enabled, so use that instead of CONFIG_MAC80211_DEBUGFS. Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:32 -04:00
Luis R. Rodriguez	61405e9778	cfg80211: fix in nl80211_set_reg() There is a race on access to last_request and its alpha2 through reg_is_valid_request() and us possibly processing first another regulatory request on another CPU. We avoid this improbably race by locking with the cfg80211_mutex as we should have done in the first place. While at it add the assert on locking on reg_is_valid_request(). Cc: stable@kernel.org Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:32 -04:00
Luis R. Rodriguez	d0e18f833d	cfg80211: cleanup return calls on nl80211_set_reg() This has no functional change, but it will make the race fix easier to spot in my next patch. Cc: stable@kernel.org Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:31 -04:00
Luis R. Rodriguez	4776c6e7f6	cfg80211: return immediately if num reg rules > NL80211_MAX_SUPP_REG_RULES This has no functional change except we save a kfree(rd) and allows us to clean this code up a bit after this. We do avoid an unnecessary kfree(NULL) but calling that was OK too. Cc: stable@kernel.org Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:31 -04:00
Johannes Berg	e3da574a0d	cfg80211: allow wext to remove keys that don't exist Some applications using wireless extensions expect to be able to remove a key that doesn't exist. One example is wpa_supplicant which doesn't actually change behaviour when running into an error while trying to do that, but it prints an error message which users interpret as wpa_supplicant having problems. The safe thing to do is not change the behaviour of wireless extensions any more, so when the driver reports -ENOENT let the wext bridge code return success to userspace. To guarantee this, also document that drivers should return -ENOENT when the key doesn't exist. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:30 -04:00
David Kilroy	3dcf670baf	cfg80211: mark ops as pointer to const This allows drivers to mark their cfg80211_ops tables const. Signed-off-by: David Kilroy <kilroyd@googlemail.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:27 -04:00
Johannes Berg	5bb644a0fd	mac80211: cancel/restart all timers across suspend/resume We forgot to cancel all timers in mac80211 when suspending. In particular we forgot to deal with some things that can cause hardware reconfiguration -- while it is down. While at it we go ahead and add a warning in ieee80211_sta_work() if its run while the suspend->resume cycle is in effect. This should not happen and if it does it would indicate there is a bug lurking in either mac80211 or mac80211 drivers. With this now wpa_supplicant doesn't blink when I go to suspend and resume where as before there where issues with some timers running during the suspend->resume cycle. This caused a lot of incorrect assumptions and would at times bring back the device in an incoherent, but mostly recoverable, state. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:25 -04:00
Johannes Berg	cc32abd494	mac80211: move channel switch code The channel switch code is currently in the spectrum management file, where arguably it belongs. However, it is for managed mode only and uses the structures for that mode only so having it in a more generic file can be confusing. Additionally, my next patch gets simpler with the code here. When/if we ever implement this for IBSS or mesh then we will need to rework the structures it uses anyway at which point we could move the code back. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:25 -04:00
Jouni Malinen	9f26a95221	nl80211: Validate NL80211_ATTR_KEY_SEQ length Validate RSC (NL80211_ATTR_KEY_SEQ) length in nl80211/cfg80211 instead of having to do this in all the drivers. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:25 -04:00
Jouni Malinen	92778180f7	mac80211: Cancel pending probereq poll on beacon RX While the probe request poll is expected to work, it looks like it does not always result in getting a response. The exact reason for this is unclear, but anyway, if we do receive a Beacon frame from our AP, there is no need to disconnect based on the probereq poll. This seems to help keep the connection bit more stable in cases where beacon loss is occurring semi-frequently. Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:24 -04:00
Senthil Balasubramanian	cccaec98a3	mac80211: Initialize RX's last received sequence number The STA may drop the very first frame if it happens to be a retried frame. This is because we maintian the last received sequence number per TID for QoS frames and it is initialized to zero through kzalloc during sta_info_alloc and the sequence number of the very first date frame received would be ZERO (as per IEEE 802.11-2007, 7.1.3.4.1). If the frame dropped happens to be an EAP Request Identity(very first frame from the AP), then wpa_supplicnat disconnects the STA and the whole procedure starts again. Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:23 -04:00
Luis R. Rodriguez	80a3511d70	cfg80211: add debugfs HT40 allow map Here's a screenshot of what this looks like with ath9k: mcgrof@pogo /debug/ieee80211/phy0 $ cat ht40allow_map 2412 HT40 + 2417 HT40 + 2422 HT40 + 2427 HT40 + 2432 HT40 -+ 2437 HT40 -+ 2442 HT40 -+ 2447 HT40 - 2452 HT40 - 2457 HT40 - 2462 HT40 - 2467 Disabled 2472 Disabled 2484 Disabled 5180 HT40 + 5200 HT40 -+ 5220 HT40 -+ 5240 HT40 -+ 5260 HT40 -+ 5280 HT40 -+ 5300 HT40 -+ 5320 HT40 - 5500 HT40 + 5520 HT40 -+ 5540 HT40 -+ 5560 HT40 -+ 5580 HT40 -+ 5600 HT40 -+ 5620 HT40 -+ 5640 HT40 -+ 5660 HT40 -+ 5680 HT40 -+ 5700 HT40 - 5745 HT40 + 5765 HT40 -+ 5785 HT40 -+ 5805 HT40 -+ 5825 HT40 - Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:23 -04:00
Luis R. Rodriguez	1ac61302dc	mac80211/cfg80211: move wiphy specific debugfs entries to cfg80211 This moves the cfg80211 specific stuff to new cfg80211 debugfs entries. Non-mac80211 will also get these entries now. There were only 4 which we take: rts_threshold fragmentation_threshold short_retry_limit long_retry_limit Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:23 -04:00
Luis R. Rodriguez	294196ab22	cfg80211: check allowed channel type upon userspace requests Thanks to nl80211 userspace can be very specific upon device configuration. Before processing the request for the new HT40 channel types (HT40- or HT40+) we need to ensure we can use them regulatory-wise. This wasn't required with wireless extensions as specifying the channel type wasn't not available and configuration was done towards the end implicitly upon association or reception of beacons from the AP. For the new nl80211 we have to check this when configuring the interfaces explicitly. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:23 -04:00
Luis R. Rodriguez	768777ea11	mac80211: check if HT40+/- is allowed before sending assoc We weren't checking this at all. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:23 -04:00
Luis R. Rodriguez	689da1b3b8	wireless: rename IEEE80211_CHAN_NO_FAT_* to HT40-/+ This is more consistent with our nl80211 naming convention for HT40-/+. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:22 -04:00
Luis R. Rodriguez	038659e7c6	cfg80211: Process regulatory max bandwidth checks for HT40 We are not correctly listening to the regulatory max bandwidth settings. To actually make use of it we need to redesign things a bit. This patch does the work for that. We do this to so we can obey to regulatory rules accordingly for use of HT40. We end up dealing with HT40 by having two passes for each channel. The first check will see if a 20 MHz channel fits into the channel's center freq on a given frequency range. We check for a 20 MHz banwidth channel as that is the maximum an individual channel will use, at least for now. The first pass will go ahead and check if the regulatory rule for that given center of frequency allows 40 MHz bandwidths and we use this to determine whether or not the channel supports HT40 or not. So to support HT40 you'll need at a regulatory rule that allows you to use 40 MHz channels but you're channel must also be enabled and support 20 MHz by itself. The second pass is done after we do the regulatory checks over an device's supported channel list. On each channel we'll check if the control channel and the extension both: o exist o are enabled o regulatory allows 40 MHz bandwidth on its frequency range This work allows allows us to idependently check for HT40- and HT40+. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:46:22 -04:00
Luis R. Rodriguez	5078b2e32a	cfg80211: fix race between core hint and driver's custom apply Its possible for cfg80211 to have scheduled the work and for the global workqueue to not have kicked in prior to a cfg80211 driver's regulatory hint or wiphy_apply_custom_regulatory(). Although this is very unlikely its possible and should fix this race. When this race would happen you are expected to have hit a null pointer dereference panic. Cc: stable@kernel.org Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:29:54 -04:00
Johannes Berg	88f16db7a2	wext: verify buffer size for SIOCSIWENCODEEXT Another design flaw in wireless extensions (is anybody surprised?) in the way it handles the iw_encode_ext structure: The structure is part of the 'extra' memory but contains the key length explicitly, instead of it just being the length of the extra buffer - size of the struct and using the explicit key length only for the get operation (which only writes it). Therefore, we have this layout: extra: +-------------------------+ \| struct iw_encode_ext { \| \| ... \| \| u16 key_len; \| \| u8 key[0]; \| \| }; \| +-------------------------+ \| key material \| +-------------------------+ Now, all drivers I checked use ext->key_len without checking that both key_len and the struct fit into the extra buffer that has been copied from userspace. This leads to a buffer overrun while reading that buffer, depending on the driver it may be possible to specify arbitrary key_len or it may need to be a proper length for the key algorithm specified. Thankfully, this is only exploitable by root, but root can actually cause a segfault or use kernel memory as a key (which you can even get back with siocgiwencode or siocgiwencodeext from the key buffer). Fix this by verifying that key_len fits into the buffer along with struct iw_encode_ext. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2009-05-20 14:07:50 -04:00
Sascha Hlusiak	645069299a	sit: stateless autoconf for isatap be sent periodically. The rs_delay can be speficied when adding the PRL entry and defaults to 15 minutes. The RS is sent from every link local adress that's assigned to the tunnel interface. It's directed to the (guessed) linklocal address of the router and is sent through the tunnel. Better: send to ff02::2 encapsuled in unicast directed to router-v4. Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-19 16:02:02 -07:00
Sascha Hlusiak	9af28511be	addrconf: refuse isatap eui64 for INADDR_ANY A tunnel with no local ipv4 endpoint would otherwise use the ISATAP linklocal address fe80::5efe:0:0, which is invalid. Rather not add a linklocal address at all. Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-19 16:02:02 -07:00
Sascha Hlusiak	4b27960174	sit: ipip6_tunnel_del_prl: return err Typo. When deleting a PRL entry, return status to userspace instead of success. Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-19 16:02:01 -07:00
Sascha Hlusiak	4fddbf5d78	sit: strictly restrict incoming traffic to tunnel link device Check link device when looking up a tunnel. When a tunnel is linked to a interface, traffic from a different interface must not reach the tunnel. This also allows creating of multiple tunnels with the same endpoints, if the link device differs. Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-19 16:02:00 -07:00
Sascha Hlusiak	8db99e5717	sit: Fail to create tunnel, if it already exists When locating the tunnel, do not continue if it is found. Otherwise a different tunnel with similar configuration would be returned and parts could be overwritten. Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-19 16:02:00 -07:00
Chris Friesen	9643f45512	ipv4: teach ipconfig about the MTU option in DHCP The DHCP spec allows the server to specify the MTU. This can be useful for netbooting with UDP-based NFS-root on a network using jumbo frames. This patch allows the kernel IP autoconfiguration to handle this option correctly. It would be possible to use initramfs and add a script to set the MTU, but that seems like a complicated solution if no initramfs is otherwise necessary, and would bloat the kernel image more than this code would. This patch was originally submitted to LKML in 2003 by Hans-Peter Jansen. Signed-off-by: Chris Friesen <cfriesen@nortel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-19 15:36:17 -07:00
Pablo Neira Ayuso	fd2120ca0d	net: use NLMSG_DEFAULT_SIZE in nlmsg_new() allocations nlmsg_new() adds the size of the netlink header to the value that has been passed as parameter. If NLMSG_GOODSIZE is selected, we request an allocation of one memory page plus the size of the header. Instead, NLMSG_DEFAULT_SIZE should be used since it already substracts the size of the Netlink header. I have the impression that the similar naming in both constant is error prone when using it with nlmsg_new(). This is already documented in include/net/netlink.h Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-19 15:36:16 -07:00
Eric Dumazet	ab35cd4b8f	sch_teql: Use net_device internal stats We can slightly reduce size of teqlN structure, not duplicating stats structure in teql_master but using stats field from net_device.stats for tx_errors and from netdev_queue for tx_bytes/tx_packets/tx_dropped values. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-19 15:36:15 -07:00
Eric Dumazet	93f154b594	net: release dst entry in dev_hard_start_xmit() One point of contention in high network loads is the dst_release() performed when a transmited skb is freed. This is because NIC tx completion calls dev_kree_skb() long after original call to dev_queue_xmit(skb). CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is quite visible if one CPU is 100% handling softirqs for a network device, since dst_clone() is done by other cpus, involving cache line ping pongs. It seems right place to release dst is in dev_hard_start_xmit(), for most devices but ones that are virtual, and some exceptions. David Miller suggested to define a new device flag, set in alloc_netdev_mq() (so that most devices set it at init time), and carefuly unset in devices which dont want a NULL skb->dst in their ndo_start_xmit(). List of devices that must clear this flag is : - loopback device, because it calls netif_rx() and quoting Patrick : "ip_route_input() doesn't accept loopback addresses, so loopback packets already need to have a dst_entry attached." - appletalk/ipddp.c : needs skb->dst in its xmit function - And all devices that call again dev_queue_xmit() from their xmit function (as some classifiers need skb->dst) : bonding, vlan, macvlan, eql, ifb, hdlc_fr Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 22:19:19 -07:00
Eric W. Biederman	af38f29895	net: Fix bridgeing sysfs handling of rtnl_lock Holding rtnl_lock when we are unregistering the sysfs files can deadlock if we unconditionally take rtnl_lock in a sysfs file. So fix it with the now familiar patter of: rtnl_trylock and syscall_restart() Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 22:15:59 -07:00
Eric W. Biederman	9b8adb5ea0	net: Fix devinet_sysctl_forward sysctls are unregistered with the rntl_lock held making it unsafe to unconditionally grab the the rtnl_lock. Instead we need to call rtnl_trylock and restart the system call if we can not grab it. Otherwise we could deadlock at unregistration time. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 22:15:58 -07:00
Eric W. Biederman	5007392d85	net: FIX ipv6_forward sysctl restart Just returning -ERESTARTSYS without a signal pending is not good that will just leak it to userspace. We need return -ERESTARTNOINTR so we always restart and set signal pending so that we fall of the fast path of syscall return and setup the system call restart. So use restart_syscall() which does all of this for us. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 22:15:58 -07:00
Eric W. Biederman	336ca57c3b	net-sysfs: Use rtnl_trylock in sysfs methods. The earlier patch to fix the deadlock between a network device going away and writing to sysfs attributes was incomplete. - It did not set signal_pending so we would leak ERSTARTSYS to user space. - It used ERESTARTSYS which only restarts if sigaction configures it to. - It did not cover store and show for ifalias. So fix all of these up and use the new helper restart_syscall so we get the details correct on what it takes. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 22:15:57 -07:00
Johann Baudy	69e3c75f4d	net: TX_RING and packet mmap New packet socket feature that makes packet socket more efficient for transmission. - It reduces number of system call through a PACKET_TX_RING mechanism, based on PACKET_RX_RING (Circular buffer allocated in kernel space which is mmapped from user space). - It minimizes CPU copy using fragmented SKB (almost zero copy). Signed-off-by: Johann Baudy <johann.baudy@gnu-log.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 22:11:22 -07:00
Frans Pop	bc8a539743	ipv4: make default for INET_LRO consistent with help text Commit `e81963b1` ("ipv4: Make INET_LRO a bool instead of tristate.") changed this config from tristate to bool. Add default so that it is consistent with the help text. Signed-off-by: Frans Pop <elendil@planet.nl> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 21:48:38 -07:00
Thomas Chenault	995b337952	net: fix skb_seq_read returning wrong offset/length for page frag data When called with a consumed value that is less than skb_headlen(skb) bytes into a page frag, skb_seq_read() incorrectly returns an offset/length relative to skb->data. Ensure that data which should come from a page frag does. Signed-off-by: Thomas Chenault <thomas_chenault@dell.com> Tested-by: Shyam Iyer <shyam_iyer@dell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 21:43:27 -07:00
David S. Miller	bb803cfbec	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/scsi/fcoe/fcoe.c	2009-05-18 21:08:20 -07:00
Eric Dumazet	511e11e396	pkt_sched: gen_estimator: use 64 bit intermediate counters for bps gen_estimator can overflow bps (bytes per second) with Gb links, while it was designed with a u32 API, with a theorical limit of 34360Mbit (2^32 bytes) Using 64 bit intermediate avbps/brate counters can allow us to reach this theorical limit. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 19:26:37 -07:00
Rami Rosen	d23a9b5baa	ipv4: cleanup: remove unnecessary include. There is no need for net/icmp.h header in net/ipv4/fib_frontend.c. This patch removes the #include net/icmp.h from it. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 15:16:38 -07:00
Rami Rosen	e204a345a0	ipv4: cleanup - remove two unused parameters from fib_semantic_match(). Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 15:16:37 -07:00
Eric Dumazet	450c4ea15e	vlan: use struct netdev_queue counters instead of dev->stats We can update netdev_queue tx_bytes/tx_packets/tx_dropped counters instead of dev->stats ones, to reduce number of cache lines dirtied in xmit path. This fixes a performance problem on SMP when many different cpus take vlan tx path. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 15:15:06 -07:00
Eric Dumazet	7004bf252c	net: add tx_packets/tx_bytes/tx_dropped counters in struct netdev_queue offsetof(struct net_device, features)=0x44 offsetof(struct net_device, stats.tx_packets)=0x54 offsetof(struct net_device, stats.tx_bytes)=0x5c offsetof(struct net_device, stats.tx_dropped)=0x6c Network drivers that touch dev->stats.tx_packets/stats.tx_bytes in their tx path can slow down SMP operations, since they dirty a cache line that should stay shared (dev->features is needed in rx and tx paths) We could move away stats field in net_device but it wont help that much. (Two cache lines dirtied in tx path, we can do one only) Better solution is to add tx_packets/tx_bytes/tx_dropped in struct netdev_queue because this structure is already touched in tx path and counters updates will then be free (no increase in size) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 15:15:06 -07:00
Eric Dumazet	c0f84d0d4b	sch_teql: should not dereference skb after ndo_start_xmit() It is illegal to dereference a skb after a successful ndo_start_xmit() call. We must store skb length in a local variable instead. Bug was introduced in 2.6.27 by commit `0abf77e55a` (net_sched: Add accessor function for packet length for qdiscs) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 15:12:31 -07:00
Ilpo Järvinen	7752731318	tcp: fix MSG_PEEK race check Commit `518a09ef11` (tcp: Fix recvmsg MSG_PEEK influence of blocking behavior) lets the loop run longer than the race check did previously expect, so we need to be more careful with this check and consider the work we have been doing. I tried my best to deal with urg hole madness too which happens here: if (!sock_flag(sk, SOCK_URGINLINE)) { ++*seq; ... by using additional offset by one but I certainly have very little interest in testing that part. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Frans Pop <elendil@planet.nl> Tested-by: Ian Zimmermann <itz@buug.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-18 15:05:40 -07:00
David S. Miller	82d048186e	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2009-05-18 14:48:30 -07:00

1 2 3 4 5 ...

12649 Commits