linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-14 16:12:02 +00:00

Author	SHA1	Message	Date
Tom Herbert	e51d739ab7	net: Fix locking in flush_backlog Need to take spinlocks when dequeuing from input_pkt_queue in flush_backlog. Also, flush_backlog can now be called directly from netdev_run_todo. Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-23 23:17:18 -07:00
Amerigo Wang	5fc05f8764	netpoll: warn when there are spaces in parameters v2: update according to Frans' comments. Currently, if we leave spaces before dst port, netconsole will silently accept it as 0. Warn about this. Also, when spaces appear in other places, make them visible in error messages. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: David Miller <davem@davemloft.net> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-22 20:05:45 -07:00
Tom Herbert	e880eb6c5c	rps: Fix build with CONFIG_SYSFS enabled Fix build with CONFIG_SYSFS not enabled. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-22 18:06:47 -07:00
Robert Olsson	e99b99b471	pktgen node allocation Here is a patch to manipulate packet node allocation and implicitly how packets are DMA'd etc. The flag NODE_ALLOC enables the function and numa_node_id(); when enabled it can also be explicitly controlled via a new node parameter. Tested this with 10 Intel 82599 ports w. TYAN S7025 E5520 CPUs. Was able to TX/DMA ~80 Gbit/s to Ethernet wires. Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 20:33:36 -07:00
Eric Dumazet	99fe3c391d	net: dev_getfirstbyhwtype() optimization Use RCU to avoid RTNL use in dev_getfirstbyhwtype() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 20:33:36 -07:00
Eric Dumazet	283f2fe87e	net: speedup netdev_set_master() We currently force a synchronize_net() in netdev_set_master(). This seems necessary only when a slave had a master and we dismantle it. In the other case ("ifenslave bond0 eth0"), we don't need this long delay. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 18:34:15 -07:00
Jiri Pirko	32a806c194	bonding: flush unicast and multicast lists when changing type After the type change, addresses in unicast and multicast lists wouldn't make sense, not to mention possible different lengths. So flush both lists here. Note "dev_addr_discard" will be very soon replaced by "dev_mc_flush" (once the mc_list conversion is done). Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 18:31:34 -07:00
Patrick McHardy	755d0e77ac	net: rtnetlink: ignore NETDEV_PRE_TYPE_CHANGE in rtnetlink_event() Ignore the new NETDEV_PRE_TYPE_CHANGE event in rtnetlink_event() since there have been no changes userspace needs to be notified of. Also add a comment to the netdev notifier event definitions to remind people to update the exclusion list when adding new event types. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-21 18:31:34 -07:00
David S. Miller	e77c8e83dd	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2010-03-20 15:24:29 -07:00
Eric Dumazet	0641e4fbf2	net: Potential null skb->dev dereference When doing "ifenslave -d bond0 eth0", there is chance to get NULL dereference in netif_receive_skb(), because dev->master suddenly becomes NULL after we tested it. We should use ACCESS_ONCE() to avoid this (or rcu_dereference()) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-18 21:16:45 -07:00
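The race described in the message above comes from loading `dev->master` twice: once for the test and once for the dereference. The following is a minimal userspace sketch of the single-load pattern, not the kernel code itself. It assumes a simplified, hypothetical `struct net_device` with only two fields, a hypothetical helper `rx_ifindex()`, and a local macro `ACCESS_ONCE_SKETCH()` that approximates the kernel's `ACCESS_ONCE()` using the GNU `__typeof__` extension.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct net_device. */
struct net_device {
    struct net_device *master;
    int ifindex;
};

/* Userspace approximation of ACCESS_ONCE(): force one volatile load. */
#define ACCESS_ONCE_SKETCH(x) (*(volatile __typeof__(x) *)&(x))

/*
 * Return the ifindex of the device that should receive the packet.
 * dev->master is loaded exactly once into a local, so a concurrent
 * writer that sets it to NULL after the test cannot cause a NULL
 * dereference here.
 */
int rx_ifindex(struct net_device *dev)
{
    struct net_device *master = ACCESS_ONCE_SKETCH(dev->master);

    if (master)
        return master->ifindex;
    return dev->ifindex;
}
```

The sketch only shows why the pointer must be read into a local before testing it; it does not model the lifetime guarantees that `rcu_dereference()` would add.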
Jiri Pirko	3ca5b4042e	bonding: check return value of notifier when changing type This patch adds the possibility to refuse the bonding type change for other subsystems (such as for example bridge, vlan, etc.) Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-18 20:00:02 -07:00
Tom Herbert	1e94d72fea	rps: Fixed build with CONFIG_SMP not enabled. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-18 17:45:44 -07:00
Jan Engelhardt	10708f37ae	net: core: add IFLA_STATS64 support `ip -s link` shows interface counters truncated to 32 bit. This is because interface statistics are transported only in 32-bit quantity to userspace. This commit adds a new IFLA_STATS64 attribute that exports them in full 64 bit. References: http://lkml.indiana.edu/hypermail/linux/kernel/0307.3/0215.html Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:23:22 -07:00
Eric Dumazet	2fb3573dfb	net: remove rcu locking from fib_rules_event() We hold RTNL at this point and don't use RCU variants of list traversals, so we don't need rcu_read_lock()/rcu_read_unlock() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:23:19 -07:00
Tom Herbert	0a9627f264	rps: Receive Packet Steering This patch implements software receive side packet steering (RPS). RPS distributes the load of received packet processing across multiple CPUs. Problem statement: Protocol processing done in the NAPI context for received packets is serialized per device queue and becomes a bottleneck under high packet load. This substantially limits pps that can be achieved on a single queue NIC and provides no scaling with multiple cores. This solution queues packets early on in the receive path on the backlog queues of other CPUs. This allows protocol processing (e.g. IP and TCP) to be performed on packets in parallel. For each device (or each receive queue in a multi-queue device) a mask of CPUs is set to indicate the CPUs that can process packets. A CPU is selected on a per packet basis by hashing contents of the packet header (e.g. the TCP or UDP 4-tuple) and using the result to index into the CPU mask. The IPI mechanism is used to raise networking receive softirqs between CPUs. This effectively emulates in software what a multi-queue NIC can provide, but is generic requiring no device support. Many devices now provide a hash over the 4-tuple on a per packet basis (e.g. the Toeplitz hash). This patch allows drivers to set the HW reported hash in an skb field, and that value in turn is used to index into the RPS maps. Using the HW generated hash can avoid cache misses on the packet when steering it to a remote CPU. The CPU mask is set on a per device and per queue basis in the sysfs variable /sys/class/net/<device>/queues/rx-<n>/rps_cpus. This is a set of canonical bit maps for receive queues in the device (numbered by <n>). If a device does not support multi-queue, a single variable is used for the device (rx-0). Generally, we have found this technique increases pps capabilities of a single queue device with good CPU utilization. Optimal settings for the CPU mask seem to depend on architectures and cache hierarchy. Below are some results running 500 instances of netperf TCP_RR test with 1 byte req. and resp. Results show cumulative transaction rate and system CPU utilization. e1000e on 8 core Intel: Without RPS: 108K tps at 33% CPU; With RPS: 311K tps at 64% CPU. forcedeth on 16 core AMD: Without RPS: 156K tps at 15% CPU; With RPS: 404K tps at 49% CPU. bnx2x on 16 core AMD: Without RPS: 567K tps at 61% CPU (4 HW RX queues); Without RPS: 738K tps at 96% CPU (8 HW RX queues); With RPS: 854K tps at 76% CPU (4 HW RX queues). Caveats: - The benefits of this patch are dependent on architecture and cache hierarchy. Tuning the masks to get best performance is probably necessary. - This patch adds overhead in the path for processing a single packet. In a lightly loaded server this overhead may eliminate the advantages of increased parallelism, and possibly cause some relative performance degradation. We have found that masks that are cache aware (share same caches with the interrupting CPU) mitigate much of this. - The RPS masks can be changed dynamically, however whenever the mask is changed this introduces the possibility of generating out of order packets. It's probably best not to change the masks too frequently. Signed-off-by: Tom Herbert <therbert@google.com> include/linux/netdevice.h | 32 ++++- include/linux/skbuff.h | 3 + net/core/dev.c | 335 +++++++++++++++++++++++++++++++++++++-------- net/core/net-sysfs.c | 225 ++++++++++++++++++++++++++++++- net/core/skbuff.c | 2 + 5 files changed, 538 insertions(+), 59 deletions(-) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:23:18 -07:00
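The CPU selection step described in the RPS message (hash the 4-tuple, then use the result to index into the set of allowed CPUs) can be sketched in userspace as follows. This is an illustration under stated assumptions, not the patch's code: `struct flow4`, `struct rps_map_sketch`, `flow_hash()`, `select_cpu()` and the toy mixer `mix32()` are all hypothetical names, the hash is not the one the kernel or any NIC uses, and scaling the hash with a multiply-and-shift is one possible way to map it onto the CPU list.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key: the TCP/UDP 4-tuple named in the commit message. */
struct flow4 {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/* Hypothetical per-queue map: the CPUs allowed to process packets. */
struct rps_map_sketch {
    size_t len;
    const uint16_t *cpus;
};

/* Toy 32-bit mixer, for illustration only (not the kernel's hash). */
static uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Hash the 4-tuple so that packets of one flow always get the same value. */
uint32_t flow_hash(const struct flow4 *f)
{
    uint32_t ports = ((uint32_t)f->sport << 16) | f->dport;

    return mix32(f->saddr ^ mix32(f->daddr ^ mix32(ports)));
}

/* Scale the hash into [0, len) and pick that CPU; -1 if the map is empty. */
int select_cpu(const struct rps_map_sketch *map, uint32_t hash)
{
    if (map->len == 0)
        return -1;
    return map->cpus[((uint64_t)hash * map->len) >> 32];
}
```

Because the CPU depends only on the flow hash and the map, packets of one flow stay on one CPU until the map changes, which is consistent with the caveat above about reordering when masks are changed.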
Jiri Slaby	21edbb223e	NET: netpoll, fix potential NULL ptr dereference Stanse found that one error path in netpoll_setup dereferences npinfo even though it is NULL. Avoid that by adding new label and go to that instead. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Daniel Borkmann <danborkmann@googlemail.com> Cc: David S. Miller <davem@davemloft.net> Acked-by: chavey@google.com Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 14:15:45 -07:00
Eric Dumazet	3041f51707	net: Fix dev_mc_add() Commit `6e17d45a` (net: add addr len check to dev_mc_add) added a bug in dev_mc_add(), since it can now exit with a lock imbalance. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-10 07:32:28 -08:00
Eric Dumazet	0a141509ed	net: Annotates neigh_invalidate() Annotates neigh_invalidate() with __releases() and __acquires() for sparse sake. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-10 07:32:28 -08:00
Eric Dumazet	f5c445ed41	ethtool: Use noinline_for_stack Use self documenting noinline_for_stack instead of duplicated comments. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-08 12:17:04 -08:00
Dan Carpenter	72150e9b7f	sock.c: potential null dereference We test that "prot->rsk_prot" is non-null right before we dereference it on this line. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-07 15:25:50 -08:00
Jeff Garzik	d17792ebdf	ethtool: Add direct access to ops->get_sset_count On 03/04/2010 09:26 AM, Ben Hutchings wrote: > On Thu, 2010-03-04 at 00:51 -0800, Jeff Kirsher wrote: >> From: Jeff Garzik<jgarzik@redhat.com> >> >> This patch is an alternative approach for accessing string >> counts, vs. the drvinfo indirect approach. This way the drvinfo >> space doesn't run out, and we don't break ABI later. > [...] >> --- a/net/core/ethtool.c >> +++ b/net/core/ethtool.c >> @@ -214,6 +214,10 @@ static noinline int ethtool_get_drvinfo(struct net_device *dev, void __user *use >> info.cmd = ETHTOOL_GDRVINFO; >> ops->get_drvinfo(dev,&info); >> >> + /* >> + * this method of obtaining string set info is deprecated; >> + * consider using ETHTOOL_GSSET_INFO instead >> + */ > > This comment belongs on the interface (ethtool.h) not the > implementation. Debatable -- the current comment is located at the callsite of ops->get_sset_count(), which is where an implementor might think to add a new call. Not all the numeric fields in ethtool_drvinfo are obtained from ->get_sset_count(). Hence the "some" in the attached patch to include/linux/ethtool.h, addressing your comment. > [...] >> +static noinline int ethtool_get_sset_info(struct net_device *dev, >> + void __user *useraddr) >> +{ > [...] >> + /* calculate size of return buffer */ >> + for (i = 0; i< 64; i++) >> + if (sset_mask& (1ULL<< i)) >> + n_bits++; > [...] > > We have a function for this: > > n_bits = hweight64(sset_mask); Agreed. I've attached a follow-up patch, which should enable my/Jeff's kernel patch to be applied, followed by this one. Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-05 14:00:17 -08:00
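The review exchange quoted above replaces an open-coded bit-counting loop with `hweight64()`. A small userspace sketch of the equivalence follows. It assumes GCC or Clang, since `__builtin_popcountll()` is used here as a stand-in for the kernel's `hweight64()`; the function names `count_bits_loop()` and `count_bits_popcount()` are hypothetical.

```c
#include <stdint.h>

/* The open-coded loop from the quoted patch: test each of the 64 bits. */
int count_bits_loop(uint64_t mask)
{
    int i, n_bits = 0;

    for (i = 0; i < 64; i++)
        if (mask & (1ULL << i))
            n_bits++;
    return n_bits;
}

/* Userspace stand-in for hweight64(): the compiler's population count. */
int count_bits_popcount(uint64_t mask)
{
    return __builtin_popcountll(mask);
}
```

Both functions return the number of set bits in the mask, so the second can replace the first without changing the size computed for the return buffer.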
Jeff Garzik	723b2f57ad	ethtool: Add direct access to ops->get_sset_count This patch is an alternative approach for accessing string counts, vs. the drvinfo indirect approach. This way the drvinfo space doesn't run out, and we don't break ABI later. Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-05 14:00:17 -08:00
Zhu Yi	a3a858ff18	net: backlog functions rename sk_add_backlog -> __sk_add_backlog sk_add_backlog_limited -> sk_add_backlog Signed-off-by: Zhu Yi <yi.zhu@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-05 13:34:03 -08:00
Zhu Yi	8eae939f14	net: add limit for socket backlog We got system OOM while running some UDP netperf testing on the loopback device. The case is multiple senders sent stream UDP packets to a single receiver via loopback on local host. Of course, the receiver is not able to handle all the packets in time. But we surprisingly found that these packets were not discarded due to the receiver's sk->sk_rcvbuf limit. Instead, they are kept queuing to sk->sk_backlog and finally ate up all the memory. We believe this is a security hole that lets a non-privileged user crash the system. The root cause for this problem is, when the receiver is doing __release_sock() (i.e. after userspace recv, kernel udp_recvmsg -> skb_free_datagram_locked -> release_sock), it moves skbs from backlog to sk_receive_queue with the softirq enabled. In the above case, multiple busy senders will almost make it an endless loop. The skbs in the backlog end up eating all the system memory. The issue is not only for UDP. Any protocol using the socket backlog is potentially affected. The patch adds a limit for the socket backlog so that the backlog size cannot be expanded endlessly. Reported-by: Alex Shi <alex.shi@intel.com> Cc: David Miller <davem@davemloft.net> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "Pekka Savola (ipv6)" <pekkas@netcore.fi> Cc: Patrick McHardy <kaber@trash.net> Cc: Vlad Yasevich <vladislav.yasevich@hp.com> Cc: Sridhar Samudrala <sri@us.ibm.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Allan Stephens <allan.stephens@windriver.com> Cc: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: Zhu Yi <yi.zhu@intel.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-05 13:33:59 -08:00
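The idea of bounding the backlog can be illustrated with a small userspace sketch. It is not the patch's implementation: `struct sock_sketch` and `backlog_add_limited()` are hypothetical names, the accounting is in plain bytes, and the exact comparison the kernel uses may differ.

```c
#include <stddef.h>

/* Hypothetical, simplified socket: only the fields this sketch needs. */
struct sock_sketch {
    size_t backlog_len;   /* bytes currently queued on the backlog */
    size_t backlog_limit; /* upper bound chosen by the protocol */
};

/*
 * Account a packet of `size` bytes to the backlog.
 * Returns 0 on success, or -1 if queuing it would exceed the limit,
 * in which case the caller is expected to drop the packet.
 * Assumes the invariant backlog_len <= backlog_limit.
 */
int backlog_add_limited(struct sock_sketch *sk, size_t size)
{
    if (size > sk->backlog_limit - sk->backlog_len)
        return -1;
    sk->backlog_len += size;
    return 0;
}
```

With a check like this in the enqueue path, busy senders can no longer grow the backlog without bound while the receiver holds the socket lock; excess packets are dropped instead.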
David S. Miller	47871889c6	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ Conflicts: drivers/firmware/iscsi_ibft.c	2010-02-28 19:23:06 -08:00
Eric W. Biederman	76dadd76c2	scm: Only support SCM_RIGHTS on unix domain sockets. We use scm_send and scm_recv on both unix domain and netlink sockets, but only unix domain sockets support everything required for file descriptor passing, so error if someone attempts to pass file descriptors over netlink sockets. Cc: stable@kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-28 18:22:02 -08:00
Jeff Garzik	9675478bba	ethtool: do not set some flags, if others failed NETIF_F_NTUPLE flag setting introduced a bug: non-ntuple flags like LRO may be successfully set, before ioctl(2) returns failure to userspace. The set-flags operation should be all-or-none, rather than leaving things in an inconsistent state prior to reporting failure to userspace. Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-28 01:40:30 -08:00
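The all-or-none behaviour asked for in the message above amounts to validating the whole request before touching any state. A minimal sketch, assuming a hypothetical `struct dev_sketch`, hypothetical flag bits, and a hypothetical `set_flags_all_or_none()` helper (none of these are the ethtool API):

```c
#include <stdint.h>

#define FLAG_LRO_SKETCH    (1u << 0) /* hypothetical flag bits */
#define FLAG_NTUPLE_SKETCH (1u << 1)

/* Hypothetical device state for this sketch. */
struct dev_sketch {
    uint32_t features;  /* currently enabled flags */
    uint32_t supported; /* flags the device is able to enable */
};

/*
 * Validate the whole request first; change the device only if every
 * requested flag is supported. On failure nothing has been modified.
 */
int set_flags_all_or_none(struct dev_sketch *dev, uint32_t wanted)
{
    if (wanted & ~dev->supported)
        return -1;
    dev->features = wanted;
    return 0;
}
```

If the request names an unsupported flag, the function returns before any assignment, so the device is never left with only part of the request applied.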
Patrick McHardy	3729d50212	rtnetlink: support specifying device flags on device creation commit e8469ed959c373c2ff9e6f488aa5a14971aebe1f Author: Patrick McHardy <kaber@trash.net> Date: Tue Feb 23 20:41:30 2010 +0100 Support specifying the initial device flags when creating a device through rtnl_link. Devices allocated by rtnl_create_link() are marked as INITIALIZING in order to suppress netlink registration notifications. To complete setup, rtnl_configure_link() must be called, which performs the device flag changes and invokes the deferred notifiers if everything went well. Two examples: # add macvlan to eth0 # $ ip link add link eth0 up allmulticast on type macvlan [LINK]11: macvlan0@eth0: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN link/ether 26:f8:84:02:f9:2a brd ff:ff:ff:ff:ff:ff [ROUTE]ff00::/8 dev macvlan0 table local metric 256 mtu 1500 advmss 1440 hoplimit 0 [ROUTE]fe80::/64 dev macvlan0 proto kernel metric 256 mtu 1500 advmss 1440 hoplimit 0 [LINK]11: macvlan0@eth0: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 link/ether 26:f8:84:02:f9:2a [ADDR]11: macvlan0 inet6 fe80::24f8:84ff:fe02:f92a/64 scope link valid_lft forever preferred_lft forever [ROUTE]local fe80::24f8:84ff:fe02:f92a via :: dev lo table local proto none metric 0 mtu 16436 advmss 16376 hoplimit 0 [ROUTE]default via fe80::215:e9ff:fef0:10f8 dev macvlan0 proto kernel metric 1024 mtu 1500 advmss 1440 hoplimit 0 [NEIGH]fe80::215:e9ff:fef0:10f8 dev macvlan0 lladdr 00:15:e9:f0:10:f8 router STALE [ROUTE]2001:6f8:974::/64 dev macvlan0 proto kernel metric 256 expires 0sec mtu 1500 advmss 1440 hoplimit 0 [PREFIX]prefix 2001:6f8:974::/64 dev macvlan0 onlink autoconf valid 14400 preferred 131084 [ADDR]11: macvlan0 inet6 2001:6f8:974:0:24f8:84ff:fe02:f92a/64 scope global dynamic valid_lft 86399sec preferred_lft 14399sec # add VLAN to eth1, eth1 is down # $ ip link add link eth1 up type vlan id 1000 RTNETLINK answers: Network is down <no events> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-27 02:43:40 -08:00
Patrick McHardy	bd38081160	dev: support deferring device flag change notifications Split dev_change_flags() into two functions: __dev_change_flags() to perform the actual changes and __dev_notify_flags() to invoke netdevice notifiers. This will be used by rtnl_link to defer netlink notifications until the device has been fully configured. This changes ordering of some operations, in particular: - netlink notifications are sent after all changes have been performed. As a side effect this suppresses one unnecessary netlink message when the IFF_UP and other flags are changed simultaneously. - The NETDEV_UP/NETDEV_DOWN and NETDEV_CHANGE notifiers are invoked after all changes have been performed. Their relative order is unchanged. - net_dmaengine_put() is invoked before the NETDEV_DOWN notifier instead of afterwards. This should not make any difference since both RX and TX are already shut down at this point. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-27 02:43:40 -08:00
Patrick McHardy	a2835763e1	rtnetlink: handle rtnl_link netlink notifications manually In order to support specifying device flags during device creation, we must be able to roll back device registration in case setting the flags fails without sending any notifications related to the device to userspace. This patch changes rollback_registered_many() and register_netdevice() to manually send netlink notifications for devices not handled by rtnl_link and allows to defer notifications for devices handled by rtnl_link until setup is complete. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-27 02:43:39 -08:00
Patrick McHardy	10de05afe0	rtnetlink: ignore NETDEV_PRE_UP notifier in rtnetlink_event() Commit `3b8bcfd` (net: introduce pre-up netdev notifier) added a new notifier which is run before a device is set UP for use by cfg80211. The patch failed to add the new notifier to the ignore list in rtnetlink_event(), so we currently get an unnecessary netlink notification before a device is set UP. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-27 02:43:39 -08:00
David S. Miller	738b0343e7	Revert "ethtool: Add n-tuple string length to drvinfo and return it" This reverts commit `c79c5ffdce`. As Jeff points out we can't break the user visible interface like this, we need to add this into the reserved[] thing. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-26 05:12:02 -08:00
Jiri Pirko	6e17d45ae3	net: add addr len check to dev_mc_add Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-26 04:22:26 -08:00
Peter Waskiewicz	c79c5ffdce	ethtool: Add n-tuple string length to drvinfo and return it The drvinfo struct should include the number of strings that get_rx_ntuple will return. It will be variable if an underlying driver implements its own get_rx_ntuple routine, so userspace needs to know how much data is coming. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-26 04:18:43 -08:00
stephen hemminger	e5e26d75f4	netdev: use list_first_entry macro Use list_first_entry macro; no longer any need to use 'next' directly in list to find first entry. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-26 04:18:35 -08:00
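For readers unfamiliar with the macro named above, the following userspace sketch shows what a `list_first_entry`-style helper does: it takes the `next` pointer of the list head and converts it back to the enclosing structure. The macros and types carry a `_sketch` suffix because they are simplified re-creations for illustration, not the kernel headers; `struct netdev_sketch` and `first_ifindex()` are hypothetical.

```c
#include <stddef.h>

/* Minimal doubly linked list node, shaped like the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Recover the enclosing struct from a pointer to its embedded member. */
#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* First entry of a non-empty list; the caller must check for emptiness. */
#define list_first_entry_sketch(head, type, member) \
    container_of_sketch((head)->next, type, member)

/* Hypothetical device record linked through an embedded node. */
struct netdev_sketch {
    int ifindex;
    struct list_head node;
};

void list_init_sketch(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

void list_add_tail_sketch(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* The macro hides the direct use of head->next at the call site. */
int first_ifindex(struct list_head *head)
{
    return list_first_entry_sketch(head, struct netdev_sketch, node)->ifindex;
}
```

The macro does the same work as open-coding `head->next` with a container lookup; the gain is that the call site states its intent.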
Williams, Mitch A	4edb246626	rtnetlink: clean up SR-IOV config interface This patch consists of a few minor cleanups to the SR-IOV configuration code in rtnetlink. - Remove unnecessary lock - Remove unnecessary casts - Return correct error code for no driver support These changes are based on comments from Patrick McHardy Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-26 04:18:35 -08:00
David S. Miller	0448873480	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2010-02-25 23:22:42 -08:00
Paul E. McKenney	a898def29e	net: Add checking to rcu_dereference() primitives Update rcu_dereference() primitives to use new lockdep-based checking. The rcu_dereference() in __in6_dev_get() may be protected either by rcu_read_lock() or RTNL, per Eric Dumazet. The rcu_dereference() in __sk_free() is protected by the fact that it is never reached if an update could change it. Check for this by using rcu_dereference_check() to verify that the struct sock's ->sk_wmem_alloc counter is zero. Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1266887105-1528-5-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-25 09:41:03 +01:00
Ajit Khaparde	c4d49794ff	net: bug fix for vlan + gro issue Traffic (tcp) does not start on a vlan interface when gro is enabled. Even the tcp handshake was not taking place. This is because the eth_type_trans call before the netif_receive_skb in napi_gro_finish() resets the skb->dev to napi->dev from the previously set vlan netdev interface. This causes the ip_route_input to drop the incoming packet considering it as a packet coming from a martian source. I could repro this on 2.6.32.7 (stable) and 2.6.33-rc7. With this fix, the traffic starts and the test runs fine on both vlan and non-vlan interfaces. CC: Herbert Xu <herbert@gondor.apana.org.au> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Ajit Khaparde <ajitk@serverengines.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-23 19:09:31 -08:00
Jamal Hadi Salim	bd55775c8d	xfrm: SA lookups signature with mark pass mark to all SA lookups to prepare them for when we add code to have them search. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-22 16:20:22 -08:00
Eric W. Biederman	b8afe64161	net-sysfs: Use rtnl_trylock in wireless sysfs methods. The wireless sysfs methods like the rest of the networking sysfs methods are removed with the rtnl_lock held and block until the existing methods stop executing. So use rtnl_trylock and restart_syscall so that the code continues to work. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-19 15:40:51 -08:00
Michael S. Tsirkin	5ff3f07367	net: export attach/detach filter routines Export sk_attach_filter/sk_detach_filter routines, so that tun module can use them. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-17 16:35:16 -08:00
Ajit Khaparde	e76b69cc01	net: bug fix for vlan + gro issue Traffic (tcp) does not start on a vlan interface when gro is enabled. Even the tcp handshake was not taking place. This is because the eth_type_trans call before the netif_receive_skb in napi_gro_finish() resets the skb->dev to napi->dev from the previously set vlan netdev interface. This causes the ip_route_input to drop the incoming packet considering it as a packet coming from a martian source. I could repro this on 2.6.32.7 (stable) and 2.6.33-rc7. With this fix, the traffic starts and the test runs fine on both vlan and non-vlan interfaces. CC: Herbert Xu <herbert@gondor.apana.org.au> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Ajit Khaparde <ajitk@serverengines.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-17 15:59:47 -08:00
Ben Hutchings	7af3351f71	ethtool: Don't flush n-tuple list from ethtool_reset() The n-tuple list should be flushed if and only if the ETH_RESET_FILTER flag is set and the driver is able to reset filtering/flow direction hardware without also resetting a component whose flag is not set. This test is best left to the driver. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-17 13:38:10 -08:00
Alexey Dobriyan	faf234220f	net: use kasprintf() for socket cache names kasprintf() makes code smaller. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-17 13:27:11 -08:00
Alexey Dobriyan	dc4c2c3105	net: remove INIT_RCU_HEAD() usage call_rcu() will unconditionally reinitialize RCU head anyway. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-17 00:03:27 -08:00
David S. Miller	2bb4646fce	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2010-02-16 22:09:29 -08:00
Eric W. Biederman	54716e3beb	net neigh: Decouple per interface neighbour table controls from binary sysctls Stop computing the number of neighbour table settings we have by counting the number of binary sysctls. This behaviour was silly and meant that we could not add another neighbour table setting without also adding another binary sysctl. Don't pass the binary sysctl path for neighbour table entries into neigh_sysctl_register. These parameters are no longer used and so are just dead code. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-16 15:55:18 -08:00
stephen hemminger	1cab819b5e	ethtool: allow non-admin user to read GRO settings. Looks like an oversight in GRO design. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-16 14:53:23 -08:00
Eric Dumazet	339c6e9985	ethtool: reduce stack usage dev_ethtool() is currently using 604 bytes of stack, even with gcc-4.4.2 objdump -d vmlinux \| scripts/checkstack.pl ... 0xc04bbc33 dev_ethtool [vmlinux]: 604 ... Adding noinline attributes to selected functions can reduce stack usage. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-15 21:51:33 -08:00
Peter Waskiewicz	0d643e1fb4	ethtool: Move n-tuple capability check into set_flags set_flags should check if the underlying device supports n-tuple filter programming before setting the device flags on the netdevice. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-15 21:49:47 -08:00
Peter Waskiewicz	e858911804	ethtool: Fix filter addition when caching n-tuple filters We can allow a filter to be added successfully to the underlying hardware, but still return an error if the cached list memory allocation fails. This patch fixes that condition. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-15 21:49:47 -08:00
Williams, Mitch A	ebc08a6f47	rtnetlink: Add VF config code to rtnetlink Add code to allow rtnetlink clients to query and set VF information through the PF driver. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-12 16:56:08 -08:00
Jiri Pirko	4cd24eaf0c	net: use netdev_mc_count and netdev_mc_empty when appropriate This patch replaces dev->mc_count in all drivers (hopefully I didn't miss anything). Used spatch and did small tweaks and coding style changes when it was suitable. Jirka Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-12 11:38:58 -08:00
Roland Dreier	8e5574211d	ethtool: Use explicit designated initializers for .cmd Initialize the .cmd member of various ethtool structs using a designated struct initializer rather than a positional one. This makes things a teeny bit more robust, although the chance of a struct layout changing is extremely remote, and also makes the code a little easier to read. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-11 12:14:23 -08:00
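The difference between the two initializer styles discussed above can be shown with a short sketch. The struct and command number below are hypothetical stand-ins (they are not taken from `ethtool.h`); the only assumption carried over from the message is that `.cmd` happens to be the first member.

```c
#include <stdint.h>

/* Hypothetical struct shaped like an ethtool command: .cmd is first. */
struct cmd_value_sketch {
    uint32_t cmd;
    uint32_t data;
};

#define CMD_GET_SKETCH 0x10u /* hypothetical command number */

/* Positional form: correct only while cmd stays the first member. */
struct cmd_value_sketch make_positional(void)
{
    struct cmd_value_sketch v = { CMD_GET_SKETCH };
    return v;
}

/* Designated form: names the member, so member order no longer matters. */
struct cmd_value_sketch make_designated(void)
{
    struct cmd_value_sketch v = { .cmd = CMD_GET_SKETCH };
    return v;
}
```

Both forms zero-initialize the remaining members and produce the same value today; the designated form keeps doing so if the struct layout is ever reordered.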
Peter P Waskiewicz Jr	15682bc488	ethtool: Introduce n-tuple filter programming support This patchset enables the ethtool layer to program n-tuple filters to an underlying device. The idea is to allow capable hardware to have static rules applied that can assist steering flows into appropriate queues. Hardware that is known to support these types of filters today are ixgbe and niu. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-10 20:03:05 -08:00
David S. Miller	b1109bf085	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6	2010-02-09 11:44:44 -08:00
Eric Dumazet	2fc1b5dd99	dst: call cond_resched() in dst_gc_task() Kernel bugzilla #15239 On some workloads, it is quite possible to get a huge dst list to process in dst_gc_task(), and trigger soft lockup detection. Fix is to call cond_resched(), as we run in process context. Reported-by: Pawel Staszewski <pstaszewski@itcare.pl> Tested-by: Pawel Staszewski <pstaszewski@itcare.pl> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-08 15:00:39 -08:00
Rafael J. Wysocki	1b3f720bf0	pktgen: Fix freezing problem Add missing try_to_freeze() to one of the pktgen_thread_worker() code paths so that it doesn't block suspend/hibernation. Fixes http://bugzilla.kernel.org/show_bug.cgi?id=15006 Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-and-tested-by: Ciprian Dorin Craciun <ciprian.craciun@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-04 14:00:41 -08:00
Arnd Bergmann	8a83a00b07	net: maintain namespace isolation between vlan and real device In the vlan and macvlan drivers, the start_xmit function forwards data to the dev_queue_xmit function for another device, which may potentially belong to a different namespace. To make sure that classification stays within a single namespace, this resets the potentially critical fields. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-02-03 20:20:32 -08:00
Jiri Pirko	32e7bfc411	net: use helpers to access uc list V2 This patch introduces three macros to work with uc list from net drivers. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-25 13:36:10 -08:00
Alexey Dobriyan	81c1ebfc43	neigh: simplify seq_file code Simply pass 'struct neigh_table' with seq_file private pointer, and save one dereference. Proc entry itself isn't interesting. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-23 01:21:27 -08:00
Krishna Kumar	4b258461c0	net: Optimize non-gso test checks Avoid checking twice whether skb needs to be linearized, if one skb_linearize was already done. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-21 01:26:29 -08:00
David S. Miller	11380a4b2d	net: Unexport napi_gro_flush(). Nothing outside of net/core/dev.c uses it. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-19 13:46:10 -08:00
Alexey Dobriyan	2c8c1e7297	net: spread __net_init, __net_exit __net_init/__net_exit are apparently not going away, so use them to full extent. In some cases __net_init was removed, because it was called from __net_exit code. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-17 19:16:02 -08:00
H Hartley Sweeten	4d0392be21	net/core/sock.c: quiet sparse noise In sock_getsockopt the symbol 'lv' is declared as an unsigned int type, probably due to sizeof returning a size_t which is really an unsigned int. This produces a sparse warning for SO_PEERNAME due to the sock->ops->getname() call: warning: incorrect type in argument 3 (different signedness) expected int sockaddr_len got unsigned int <noident> Quiet the warning by changing the type of 'lv' to an int. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-15 01:08:58 -08:00
Daniel Borkmann	508e14b4a4	netpoll: allow execution of multiple rx_hooks per interface Signed-off-by: Daniel Borkmann <danborkmann@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-13 20:38:26 -08:00
Linus Torvalds	597d8c7178	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (56 commits) sky2: Fix oops in sky2_xmit_frame() after TX timeout Documentation/3c509: document ethtool support af_packet: Don't use skb after dev_queue_xmit() vxge: use pci_dma_mapping_error to test return value netfilter: ebtables: enforce CAP_NET_ADMIN e1000e: fix and commonize code for setting the receive address registers e1000e: e1000e_enable_tx_pkt_filtering() returns wrong value e1000e: perform 10/100 adaptive IFS only on parts that support it e1000e: don't accumulate PHY statistics on PHY read failure e1000e: call pci_save_state() after pci_restore_state() netxen: update version to 4.0.72 netxen: fix set mac addr netxen: fix smatch warning netxen: fix tx ring memory leak tcp: update the netstamp_needed counter when cloning sockets TI DaVinci EMAC: Handle emac module clock correctly. dmfe/tulip: Let dmfe handle DM910x except for SPARC on-board chips ixgbe: Fix compiler warning about variable being used uninitialized netfilter: nf_ct_ftp: fix out of bounds read in update_nl_seq() mv643xx_eth: don't include cache padding in rx desc buffer size ... Fix trivial conflict in drivers/scsi/cxgb3i/cxgb3i_offload.c	2010-01-12 20:53:29 -08:00
David S. Miller	d4a66e752d	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/benet/be_cmds.h include/linux/sysctl.h	2010-01-10 22:55:03 -08:00
Octavian Purdila	704da560c0	tcp: update the netstamp_needed counter when cloning sockets This fixes a netstamp_needed accounting issue when the listen socket has SO_TIMESTAMP set: s = socket(AF_INET, SOCK_STREAM, 0); setsockopt(s, SOL_SOCKET, SO_TIMESTAMP, 1); -> netstamp_needed = 1 bind(s, ...); listen(s, ...); s2 = accept(s, ...); -> netstamp_needed = 1 close(s2); -> netstamp_needed = 0 close(s); -> netstamp_needed = -1 Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-08 00:00:09 -08:00
Jesper Dangaard Brouer	2d13bafeba	net: Make it easier to parse /proc/net/dev contents. The contents of /proc/net/dev is annoying to parse, because it changes whether there is a space after the "ethX:" or not. It depends upon the size of the "Receive bytes" counter, if the number is below 7 digits, then there is whitespaces else if the number is 8 digits or above there is no space between the ":" and the number. This patch changes the output to assure there is always a space between the ":" and the number. Given that all existing userspace application already need to handle the whitespaces, I see no breakage of existing tools. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-07 00:59:10 -08:00
Andy Gospodarek	ca8d9ea30b	fix bonding: allow arp_ip_targets on separate vlans to use arp validation On Wed, Jan 06, 2010 at 10:10:03PM +0100, Eric Dumazet wrote: > Le 06/01/2010 19:38, Eric Dumazet a écrit : > > > > (net-next-2.6 doesnt work well on my bond/vlan setup, I suspect I need a bisection) > > David, I had to revert `1f3c8804ac` > (bonding: allow arp_ip_targets on separate vlans to use arp validation) > > Or else, my vlan devices dont work (unfortunatly I dont have much time > these days to debug the thing) > > My config : > > +---------+ > vlan.103 -----+ bond0 +--- eth1 (bnx2) > \| + > vlan.825 -----+ +--- eth2 (tg3) > +---------+ > > $ cat /proc/net/bonding/bond0 > Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009) > > Bonding Mode: fault-tolerance (active-backup) > Primary Slave: None > Currently Active Slave: eth2 > MII Status: up > MII Polling Interval (ms): 100 > Up Delay (ms): 0 > Down Delay (ms): 0 > > Slave Interface: eth1 (bnx2) > MII Status: down > Link Failure Count: 1 > Permanent HW addr: 00:1e:0b:ec:d3:d2 > > Slave Interface: eth2 (tg3) > MII Status: up > Link Failure Count: 0 > Permanent HW addr: 00:1e:0b:92:78:50 > This patch fixes up a problem with found with commit `1f3c8804ac`. The original change overloaded null_or_orig, but doing that prevented any packet handlers that were not tied to a specific device (i.e. ptype->dev == NULL) from ever receiving any frames. The null_or_orig variable cannot be overloaded, and must be kept as NULL to prevent the frame from being ignored by packet handlers designed to accept frames on any interface. Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-07 00:46:39 -08:00
Andy Gospodarek	1f3c8804ac	bonding: allow arp_ip_targets on separate vlans to use arp validation This allows a bond device to specify an arp_ip_target as a host that is not on the same vlan as the base bond device and still use arp validation. A configuration like this, now works: BONDING_OPTS="mode=active-backup arp_interval=1000 arp_ip_target=10.0.100.1 arp_validate=3" 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000 link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff 3: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000 link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff 8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff inet6 fe80::213:21ff:febe:33e9/64 scope link valid_lft forever preferred_lft forever 9: bond0.100@bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff inet 10.0.100.2/24 brd 10.0.100.255 scope global bond0.100 inet6 fe80::213:21ff:febe:33e9/64 scope link valid_lft forever preferred_lft forever Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009) Bonding Mode: fault-tolerance (active-backup) Primary Slave: None Currently Active Slave: eth1 MII Status: up MII Polling Interval (ms): 0 Up Delay (ms): 0 Down Delay (ms): 0 ARP Polling Interval (ms): 1000 ARP IP target/s (n.n.n.n form): 10.0.100.1 Slave Interface: eth1 MII Status: up Link Failure Count: 1 Permanent HW addr: 00:40:05:30:ff:30 Slave Interface: eth0 MII Status: up Link Failure Count: 0 Permanent HW addr: 00:13:21:be:33:e9 Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-01-03 21:17:16 -08:00
Linus Torvalds	c3bf4906fb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (74 commits) Revert "b43: Enforce DMA descriptor memory constraints" iwmc3200wifi: fix array out-of-boundary access wl1251: timeout one too soon in wl1251_boot_run_firmware() mac80211: fix propagation of failed hardware reconfigurations mac80211: fix race with suspend and dynamic_ps_disable_work ath9k: fix missed error codes in the tx status check ath9k: wake hardware during AMPDU TX actions ath9k: wake hardware for interface IBSS/AP/Mesh removal ath9k: fix suspend by waking device prior to stop cfg80211: fix error path in cfg80211_wext_siwscan wl1271_cmd.c: cleanup char => u8 iwlwifi: Storage class should be before const qualifier ath9k: Storage class should be before const qualifier cfg80211: fix race between deauth and assoc response wireless: remove remaining qual code rt2x00: Add USB ID for Linksys WUSB 600N rev 2. ath5k: fix SWI calibration interrupt storm mac80211: fix ibss join with fixed-bssid libertas: Remove carrier signaling from the scan code orinoco: fix GFP_KERNEL in orinoco_set_key with interrupts disabled ...	2009-12-30 12:37:35 -08:00
John Fastabend	f466dba183	pktgen: ndo_start_xmit can return NET_XMIT_xxx values This updates pktgen so that it does not decrement skb->users when it receives valid NET_XMIT_xxx values. These are now valid return values from ndo_start_xmit in net-next-2.6. They also indicate the skb has been consumed. This fixes pktgen to work correctly with vlan devices. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-23 22:02:57 -08:00
Krishna Kumar	068a2de57d	net: release dst entry while cache-hot for GSO case too Non-GSO code drops dst entry for performance reasons, but the same is missing for GSO code. Drop dst while cache-hot for GSO case too. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-23 14:13:30 -08:00
Linus Torvalds	59be2e04e5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits) net: sh_eth alignment fix for sh7724 using NET_IP_ALIGN V2 ixgbe: allow tx of pre-formatted vlan tagged packets ixgbe: Fix 82598 premature copper PHY link indicatation ixgbe: Fix tx_restart_queue/non_eop_desc statistics counters bcm63xx_enet: fix compilation failure after get_stats_count removal packet: dont call sleeping functions while holding rcu_read_lock() tcp: Revert per-route SACK/DSACK/TIMESTAMP changes. ipvs: zero usvc and udest netfilter: fix crashes in bridge netfilter caused by fragment jumps ipv6: reassembly: use seperate reassembly queues for conntrack and local delivery sky2: leave PCI config space writeable sky2: print Optima chip name x25: Update maintainer. ipvs: fix synchronization on connection close netfilter: xtables: document minimal required version drivers/net/bonding/: : use pr_fmt can: CAN_MCP251X should depend on HAS_DMA drivers/net/usb: Correct code taking the size of a pointer drivers/net/cpmac.c: Correct code taking the size of a pointer drivers/net/sfc: Correct code taking the size of a pointer ...	2009-12-16 10:33:18 -08:00
Alexey Dobriyan	28dfef8feb	const: constify remaining pipe_buf_operations Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-16 07:20:05 -08:00
Eric W. Biederman	d90a909e1f	net: Fix userspace RTM_NEWLINK notifications. I received some bug reports about userspace programs having problems because after RTM_NEWLINK was received they could not immediately access files under /proc/sys/net/ because they had not been registered yet. The original problem was trivially fixed by moving the userspace notification from rtnetlink_event() to the end of register_netdevice(). When testing that change I discovered I was still getting RTM_NEWLINK events before I could access proc and I was also getting RTM_NEWLINK events after I was seeing RTM_DELLINK. Things practically guaranteed to confuse userspace. After a little more investigation these extra notifications proved to be from the new notifiers NETDEV_POST_INIT and NETDEV_UNREGISTER_BATCH hitting the default case in rtnetlink_event, and triggering unnecessary RTM_NEWLINK messages. rtnetlink_event now explicitly handles NETDEV_UNREGISTER_BATCH and NETDEV_POST_INIT to avoid sending the incorrect userspace notifications. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-13 19:45:22 -08:00
Krishna Kumar	e93737b0f0	net: Handle NETREG_UNINITIALIZED devices correctly Fix two problems: 1. If unregister_netdevice_many() is called with both registered and unregistered devices, rollback_registered_many() bails out when it reaches the first unregistered device. The processing of the prior registered devices is unfinished, and the remaining devices are skipped, and possible registered netdev's are leaked/unregistered. 2. System hangs or panics depending on how the devices are passed, since when netdev_run_todo() runs, some devices were not fully processed. Tested by passing intermingled unregistered and registered vlan devices to unregister_netdevice_many() as follows: 1. dev, fake_dev1, fake_dev2: hangs in run_todo ("unregister_netdevice: waiting for eth1.100 to become free. Usage count = 1") 2. fake_dev1, dev, fake_dev2: failure during de-registration and next registration, followed by a vlan driver Oops during subsequent registration. Confirmed that the patch fixes both cases. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-11 15:11:45 -08:00
Linus Torvalds	d7fc02c7ba	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits) mac80211: fix reorder buffer release iwmc3200wifi: Enable wimax core through module parameter iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter iwmc3200wifi: Coex table command does not expect a response iwmc3200wifi: Update wiwi priority table iwlwifi: driver version track kernel version iwlwifi: indicate uCode type when fail dump error/event log iwl3945: remove duplicated event logging code b43: fix two warnings ipw2100: fix rebooting hang with driver loaded cfg80211: indent regulatory messages with spaces iwmc3200wifi: fix NULL pointer dereference in pmkid update mac80211: Fix TX status reporting for injected data frames ath9k: enable 2GHz band only if the device supports it airo: Fix integer overflow warning rt2x00: Fix padding bug on L2PAD devices. WE: Fix set events not propagated b43legacy: avoid PPC fault during resume b43: avoid PPC fault during resume tcp: fix a timewait refcnt race ... Fix up conflicts due to sysctl cleanups (dead sysctl_check code and CTL_UNNUMBERED removed) in kernel/sysctl_check.c net/ipv4/sysctl_net_ipv4.c net/ipv6/addrconf.c net/sctp/sysctl.c	2009-12-08 07:55:01 -08:00
Linus Torvalds	1557d33007	Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6: (43 commits) security/tomoyo: Remove now unnecessary handling of security_sysctl. security/tomoyo: Add a special case to handle accesses through the internal proc mount. sysctl: Drop & in front of every proc_handler. sysctl: Remove CTL_NONE and CTL_UNNUMBERED sysctl: kill dead ctl_handler definitions. sysctl: Remove the last of the generic binary sysctl support sysctl net: Remove unused binary sysctl code sysctl security/tomoyo: Don't look at ctl_name sysctl arm: Remove binary sysctl support sysctl x86: Remove dead binary sysctl support sysctl sh: Remove dead binary sysctl support sysctl powerpc: Remove dead binary sysctl support sysctl ia64: Remove dead binary sysctl support sysctl s390: Remove dead sysctl binary support sysctl frv: Remove dead binary sysctl support sysctl mips/lasat: Remove dead binary sysctl support sysctl drivers: Remove dead binary sysctl support sysctl crypto: Remove dead binary sysctl support sysctl security/keys: Remove dead binary sysctl support sysctl kernel: Remove binary sysctl logic ...	2009-12-08 07:38:50 -08:00
David S. Miller	28b4d5cc17	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ Conflicts: drivers/net/pcmcia/fmvj18x_cs.c drivers/net/pcmcia/nmclan_cs.c drivers/net/pcmcia/xirc2ps_cs.c drivers/net/wireless/ray_cs.c	2009-12-05 15:22:26 -08:00
Linus Torvalds	d0b093a8b5	Merge branch 'core-printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: ratelimit: Make suppressed output messages more useful printk: Remove ratelimit.h from kernel.h ratelimit: Fix/allow use in atomic contexts ratelimit: Use per ratelimit context locking	2009-12-05 09:50:22 -08:00
Patrick Mullaney	fc4a748966	netdevice: provide common routine for macvlan and vlan operstate management Provide common routine for the transition of operational state for a leaf device during a root device transition. Signed-off-by: Patrick Mullaney <pmullaney@novell.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-03 15:59:22 -08:00
Eric W. Biederman	e9c5158ac2	net: Allow fib_rule_unregister to batch Refactor the code so fib_rules_register always takes a template instead of the actual fib_rules_ops structure that will be used. This is required for network namespace support so 2 out of the 3 callers already do this, it allows the error handling to be made common, and it allows fib_rules_unregister to free the template for the caller. Modify fib_rules_unregister to use call_rcu instead of synchronize_rcu to allow multiple namespaces to be cleaned up in the same rcu grace period. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-03 12:22:55 -08:00
Eric W. Biederman	3a765edadb	netns: Add an explicit rcu_barrier to unregister_pernet_{device\|subsys} This allows namespace exit methods to batch work that requires an rcu barrier using call_rcu without having to treat the unregister_pernet_operations cases specially. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-03 12:22:03 -08:00
Eric W. Biederman	04dc7f6be3	net: Move network device exit batching Move network device exit batching from a special case in net_namespace.c to using common mechanisms in dev.c Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-03 12:22:02 -08:00
Eric W. Biederman	72ad937abd	net: Add support for batching network namespace cleanups - Add exit_list to struct net to support building lists of network namespaces to cleanup. - Add exit_batch to pernet_operations to allow running operations only once during a network namespace exit. Instead of once per network namespace. - Factor out ops_exit_list and ops_exit_free so the logic for cleaning up a network namespace does not need to be duplicated. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-03 12:22:01 -08:00
Patrick McHardy	5adef18091	net 04/05: fib_rules: allow to delete local rule commit d124356ce314fff22a047ea334379d5105b2d834 Author: Patrick McHardy <kaber@trash.net> Date: Thu Dec 3 12:16:35 2009 +0100 net: fib_rules: allow to delete local rule Allow to delete the local rule and recreate it with a higher priority. This can be used to force packets with a local destination out on the wire instead of routing them to loopback. Additionally this patch allows to recreate rules with a priority of 0. Combined with the previous patch to allow oif classification, a socket can be bound to the desired interface and packets routed to the wire like this: # move local rule to lower priority ip rule add pref 1000 lookup local ip rule del pref 0 # route packets of sockets bound to eth0 to the wire independent # of the destination address ip rule add pref 100 oif eth0 lookup 100 ip route add default dev eth0 table 100 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-03 12:14:37 -08:00
Patrick McHardy	1b038a5e60	net 03/05: fib_rules: add oif classification commit 68144d350f4f6c348659c825cde6a82b34c27a91 Author: Patrick McHardy <kaber@trash.net> Date: Thu Dec 3 12:05:25 2009 +0100 net: fib_rules: add oif classification Support routing table lookup based on the flow's oif. This is useful to classify packets originating from sockets bound to interfaces differently. The route cache already includes the oif and needs no changes. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-03 12:14:36 -08:00
Patrick McHardy	491deb24bf	net 02/05: fib_rules: rename ifindex/ifname/FRA_IFNAME to iifindex/iifname/FRA_IIFNAME commit 229e77eec406ad68662f18e49fda8b5d366768c5 Author: Patrick McHardy <kaber@trash.net> Date: Thu Dec 3 12:05:23 2009 +0100 net: fib_rules: rename ifindex/ifname/FRA_IFNAME to iifindex/iifname/FRA_IIFNAME The next patch will add oif classification, rename interface related members and attributes to reflect that they're used for iif classification. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-03 12:14:36 -08:00
Alexander Duyck	c81c2d9544	skbuff: remove skb_dma_map/unmap The two functions skb_dma_map/unmap are unsafe to use as they cause problems when packets are cloned and sent to multiple devices while a HW IOMMU is enabled. Due to this it is best to remove the code so it is not used by any other network driver maintainers. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-02 19:57:15 -08:00
Eric W. Biederman	e008b5fc8d	net: Simplfy default_device_exit and improve batching. - Defer dellink to net_cleanup() allowing for batching. - Fix comment. - Use for_each_netdev_safe again as dev_change_net_namespace touches at most one network device (unlike veth dellink). Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-01 16:15:52 -08:00
Eric W. Biederman	f875bae065	net: Automatically allocate per namespace data. To get the full benefit of batched network namespace cleanup network device deletion needs to be performed by the generic code. When using register_pernet_gen_device and freeing the data in exit_net it is impossible to delay allocation until after exit_net has called as the device uninit methods are no longer safe. To correct this, and to simplify working with per network namespace data I have moved allocation and deletion of per network namespace data into the network namespace core. The core now frees the data only after all of the network namespace exit routines have run. Now it is only required to set the new fields .id and .size in the pernet_operations structure if you want network namespace data to be managed for you automatically. This makes the current register_pernet_gen_device and register_pernet_gen_subsys routines unnecessary. For the moment I have left them as compatibility wrappers in net_namespace.h They will be removed once all of the users have been updated. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-01 16:15:51 -08:00
Eric W. Biederman	2b035b3997	net: Batch network namespace destruction. It is fairly common to kill several network namespaces at once. Either because they are nested one inside the other or because they are cooperating in multiple machine networking experiments. As the network stack control logic does not parallelize easily batch up multiple network namespaces existing together. To get the full benefit of batching the virtual network devices to be removed must be all removed in one batch. For that purpose I have added a loop after the last network device operations have run that batches up all remaining network devices and deletes them. An extra benefit is that the reorganization slightly shrinks the size of the per network namespace data structures replacing a work_struct with a list_head. In a trivial test with 4K namespaces this change reduced the cost of destroying 4K namespaces from 7+ minutes (at 12% cpu) to 44 seconds (at 60% cpu). The bulk of that 44s was spent in inet_twsk_purge. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-01 16:15:51 -08:00
Eric W. Biederman	a5ee155136	net: NETDEV_UNREGISTER_PERNET -> NETDEV_UNREGISTER_BATCH The motivation for an additional notifier in batched netdevice notification (rt_do_flush) only needs to be called once per batch not once per namespace. For further batching improvements I need a guarantee that the netdevices are unregistered in order allowing me to unregister all of the network devices in a network namespace at the same time with the guarantee that the loopback device is really and truly unregistered last. Additionally it appears that we moved the route cache flush after the final synchronize_net, which seems wrong and there was no explanation. So I have restored the original location of the final synchronize_net. Cc: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-01 16:15:50 -08:00
Joe Perches	f64f9e7192	net: Move && and \|\| to end of previous line Not including net/atm/ Compiled tested x86 allyesconfig only Added a > 80 column line or two, which I ignored. Existing checkpatch plaints willfully, cheerfully ignored. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-29 16:55:45 -08:00
Eric Dumazet	3291b9db56	pktgen: NUMA aware pktgen threads are bound to given CPU, we can allocate memory for these threads in a NUMA aware way. After a pktgen session on two threads, we can check flows memory was allocated on right node, instead of a not related one. # grep pktgen_thread_write /proc/vmallocinfo 0xffffc90007204000-0xffffc90007385000 1576960 pktgen_thread_write+0x3a4/0x6b0 [pktgen] pages=384 vmalloc N0=384 0xffffc90007386000-0xffffc90007507000 1576960 pktgen_thread_write+0x3a4/0x6b0 [pktgen] pages=384 vmalloc N1=384 Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-29 01:17:39 -08:00
David S. Miller	9b963e5d0e	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/ieee802154/fakehard.c drivers/net/e1000e/ich8lan.c drivers/net/e1000e/phy.c drivers/net/netxen/netxen_nic_init.c drivers/net/wireless/ath/ath9k/main.c	2009-11-29 00:57:15 -08:00

1669 Commits