Commit Graph

266185 Commits

Author SHA1 Message Date
David S. Miller
1805b2f048 Merge branch 'master' of ra.kernel.org:/pub/scm/linux/kernel/git/davem/net 2011-10-24 18:18:09 -04:00
Flavio Leitner
78d81d15b7 TCP: remove TCP_DEBUG
It was enabled by default and the messages guarded
by the define are useful.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 17:36:08 -04:00
Dirk Eibach
f42af6c486 net: Fix driver name for mdio-gpio.c
Since commit
"7488876... dt/net: Eliminate users of of_platform_{,un}register_driver"
there are two platform drivers named "mdio-gpio" registered.
I renamed the of variant to "mdio-ofgpio".

Signed-off-by: Dirk Eibach <eibach@gdsys.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 17:24:40 -04:00
David S. Miller
8b1857357a Merge git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next 2011-10-24 03:12:49 -04:00
Eric Dumazet
66b13d99d9 ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT
There is a long standing bug in linux tcp stack, about ACK messages sent
on behalf of TIME_WAIT sockets.

In the IP header of the ACK message, we choose to reflect TOS field of
incoming message, and this might break some setups.

Example of things that were broken :
  - Routing using TOS as a selector
  - Firewalls
  - Trafic classification / shaping

We now remember in timewait structure the inet tos field and use it in
ACK generation, and route lookup.

Notes :
 - We still reflect incoming TOS in RST messages.
 - We could extend MuraliRaja Muniraju patch to report TOS value in
netlink messages for TIME_WAIT sockets.
 - A patch is needed for IPv6

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 03:06:21 -04:00
Eric W. Biederman
d2237d3574 rtnetlink: Add missing manual netlink notification in dev_change_net_namespaces
Renato Westphal noticed that since commit a2835763e1
"rtnetlink: handle rtnl_link netlink notifications manually" was merged
we no longer send a netlink message when a networking device is moved
from one network namespace to another.

Fix this by adding the missing manual notification in dev_change_net_namespaces.

Since all network devices that are processed by dev_change_net_namspaces are
in the initialized state the complicated tests that guard the manual
rtmsg_ifinfo calls in rollback_registered and register_netdevice are
unnecessary and we can just perform a plain notification.

Cc: stable@kernel.org
Tested-by: Renato Westphal <renatowestphal@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 03:03:37 -04:00
Yan, Zheng
b73233960a ipv4: fix ipsec forward performance regression
There is bug in commit 5e2b61f(ipv4: Remove flowi from struct rtable).
It makes xfrm4_fill_dst() modify wrong data structure.

Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Reported-by: Kim Phillips <kim.phillips@freescale.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 03:01:22 -04:00
Clemens Buchacher
a7d5b76d9a jme: fix irq storm after suspend/resume
If the device is down during suspend/resume, interrupts are enabled
without a registered interrupt handler, causing a storm of
unhandled interrupts until the IRQ is disabled because "nobody
cared".

Instead, check that the device is up before touching it in the
suspend/resume code.

Fixes https://bugzilla.kernel.org/show_bug.cgi?id=39112

Helped-by: Adrian Chadd <adrian@freebsd.org>
Helped-by: Mohammed Shafi <shafi.wireless@gmail.com>
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 02:58:09 -04:00
Flavio Leitner
7cc9150ebe route: fix ICMP redirect validation
The commit f39925dbde
(ipv4: Cache learned redirect information in inetpeer.)
removed some ICMP packet validations which are required by
RFC 1122, section 3.2.2.2:
...
  A Redirect message SHOULD be silently discarded if the new
  gateway address it specifies is not on the same connected
  (sub-) net through which the Redirect arrived [INTRO:2,
  Appendix A], or if the source of the Redirect is not the
  current first-hop gateway for the specified destination (see
  Section 3.3.1).

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 02:56:38 -04:00
Richard Cochran
da92b194cc net: hold sock reference while processing tx timestamps
The pair of functions,

 * skb_clone_tx_timestamp()
 * skb_complete_tx_timestamp()

were designed to allow timestamping in PHY devices. The first
function, called during the MAC driver's hard_xmit method, identifies
PTP protocol packets, clones them, and gives them to the PHY device
driver. The PHY driver may hold onto the packet and deliver it at a
later time using the second function, which adds the packet to the
socket's error queue.

As pointed out by Johannes, nothing prevents the socket from
disappearing while the cloned packet is sitting in the PHY driver
awaiting a timestamp. This patch fixes the issue by taking a reference
on the socket for each such packet. In addition, the comments
regarding the usage of these function are expanded to highlight the
rule that PHY drivers must use skb_complete_tx_timestamp() to release
the packet, in order to release the socket reference, too.

These functions first appeared in v2.6.36.

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: <stable@vger.kernel.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 02:54:50 -04:00
Eric Dumazet
318cf7aaa0 tcp: md5: add more const attributes
Now tcp_md5_hash_header() has a const tcphdr argument, we can add more
const attributes to callers.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 02:46:04 -04:00
Rick Jones
8f9f4668b3 Add ethtool -g support to virtio_net
Add support for reporting ring sizes via ethtool -g to the virtio_net
driver.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 02:07:21 -04:00
Eric Dumazet
ca35a0ef85 tcp: md5: dont write skb head in tcp_md5_hash_header()
tcp_md5_hash_header() writes into skb header a temporary zero value,
this might confuse other users of this area.

Since tcphdr is small (20 bytes), copy it in a temporary variable and
make the change in the copy.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-24 01:52:35 -04:00
Eric W. Biederman
01718e36df bonding: Add a forgetten sysfs_attr_init on class_attr_bonding_masters
When I made class_attr_bonding_matters per network namespace and dynamically
allocated I overlooked the need for calling sysfs_attr_init.  Oops.

This fixes the following lockdep splat:

[    5.749651] bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
[    5.749655] bonding: MII link monitoring set to 100 ms
[    5.749676] BUG: key f49a831c not in .data!
[    5.749677] ------------[ cut here ]------------
[    5.749752] WARNING: at kernel/lockdep.c:2897 lockdep_init_map+0x1c3/0x460()
[    5.749809] Hardware name: ProLiant BL460c G1
[    5.749862] Modules linked in: bonding(+)
[    5.749978] Pid: 3177, comm: modprobe Not tainted 3.1.0-rc9-02177-gf2d1a4e-dirty #1157
[    5.750066] Call Trace:
[    5.750120]  [<c1352c2f>] ? printk+0x18/0x21
[    5.750176]  [<c103112d>] warn_slowpath_common+0x6d/0xa0
[    5.750231]  [<c1060133>] ? lockdep_init_map+0x1c3/0x460
[    5.750287]  [<c1060133>] ? lockdep_init_map+0x1c3/0x460
[    5.750342]  [<c103117d>] warn_slowpath_null+0x1d/0x20
[    5.750398]  [<c1060133>] lockdep_init_map+0x1c3/0x460
[    5.750453]  [<c1355ddd>] ? _raw_spin_unlock+0x1d/0x20
[    5.750510]  [<c11255c8>] ? sysfs_new_dirent+0x68/0x110
[    5.750565]  [<c1124d4b>] sysfs_add_file_mode+0x8b/0xe0
[    5.750621]  [<c1124db3>] sysfs_add_file+0x13/0x20
[    5.750675]  [<c1124e7c>] sysfs_create_file+0x1c/0x20
[    5.750737]  [<c1208f09>] class_create_file+0x19/0x20
[    5.750794]  [<c12c186f>] netdev_class_create_file+0xf/0x20
[    5.750853]  [<f85deaf4>] bond_create_sysfs+0x44/0x90 [bonding]
[    5.750911]  [<f8410947>] ? bond_create_proc_dir+0x1e/0x3e [bonding]
[    5.750970]  [<f841007e>] bond_net_init+0x7e/0x87 [bonding]
[    5.751026]  [<f8410000>] ? 0xf840ffff
[    5.751080]  [<c12abc7a>] ops_init.clone.4+0xba/0x100
[    5.751135]  [<c12abdb2>] ? register_pernet_subsys+0x12/0x30
[    5.751191]  [<c12abd03>] register_pernet_operations.clone.3+0x43/0x80
[    5.751249]  [<c12abdb9>] register_pernet_subsys+0x19/0x30
[    5.751306]  [<f84108b9>] bonding_init+0x832/0x8a2 [bonding]
[    5.751363]  [<c10011f0>] do_one_initcall+0x30/0x160
[    5.751420]  [<f8410087>] ? bond_net_init+0x87/0x87 [bonding]
[    5.751477]  [<c106d5cf>] sys_init_module+0xef/0x1890
[    5.751533]  [<c1356490>] sysenter_do_call+0x12/0x36
[    5.751588] ---[ end trace 89f492d83a7f5006 ]---

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-22 05:08:44 -04:00
Eric Dumazet
f7ff19871b tg3: fix tigon3_dma_hwbug_workaround()
Ari got kernel panics using tg3 NIC, and bisected to 2669069aac "tg3:
enable transmit time stamping."

This is because tigon3_dma_hwbug_workaround() might alloc a new skb and
free the original. We panic when skb_tx_timestamp() is called on freed
skb.

Reported-by: Ari Savolainen <ari.m.savolainen@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-22 03:29:53 -04:00
Eric Dumazet
b5d9c9c281 inet: add rfc 3168 extract in front of INET_ECN_encapsulate()
INET_ECN_encapsulate() is better understood if we can read the official
statement.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-22 01:25:23 -04:00
Maciej Żenczykowski
2c67e9acb6 net: use INET_ECN_MASK instead of hardcoded 3
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-22 00:07:47 -04:00
Carolyn Wyborny
1128c756be igb: VFTA Table Fix for i350 devices
Due to a hardware problem, writes to the VFTA register can
theoretically fail. Although the likelihood of this is very low.
This patch adds a shadow vfta in the adapter struct for reading
and adds new write functions for these devices to work around the problem.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-10-21 03:19:39 -07:00
Carolyn Wyborny
b6e0c419f0 igb: Move DMA Coalescing init code to separate function.
This patch moves the DMA Coalescing feature initialization code from
igb_reset to a new function and replaces it with a call to the new
function.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-10-21 03:15:10 -07:00
Carolyn Wyborny
65189d284b igb: Fix for Alt MAC Address feature on 82580 and later devices
In 82580 and later devices, the alternate MAC address feature is
completely handled by the option ROM and software does not handle
it anymore.  This patch changes the check_alt_mac_addr function to
exit immediately if device is 82580 or later.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-10-21 03:13:21 -07:00
Williams, Mitch A
7d94eb84f3 igbvf: Bump version number
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-10-21 03:08:58 -07:00
Williams, Mitch A
10090751c0 igbvf: Update module identification strings
Update adapter identification strings to properly indicate i350 VF devices
in the VF driver. Change the driver ID string to remove 82576-specific
wording. Update copyright date.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-10-21 03:06:55 -07:00
Eric Dumazet
cf533ea53e tcp: add const qualifiers where possible
Adding const qualifiers to pointers can ease code review, and spot some
bugs. It might allow compiler to optimize code further.

For example, is it legal to temporary write a null cksum into tcphdr
in tcp_md5_hash_header() ? I am afraid a sniffer could catch the
temporary null value...

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 05:22:42 -04:00
Mihai Maruseac
f04565ddf5 dev: use name hash for dev_seq_ops
Instead of using the dev->next chain and trying to resync at each call to
dev_seq_start, use the name hash, keeping the bucket and the offset in
seq->private field.

Tests revealed the following results for ifconfig > /dev/null
	* 1000 interfaces:
		* 0.114s without patch
		* 0.089s with patch
	* 3000 interfaces:
		* 0.489s without patch
		* 0.110s with patch
	* 5000 interfaces:
		* 1.363s without patch
		* 0.250s with patch
	* 128000 interfaces (other setup):
		* ~100s without patch
		* ~30s with patch

Signed-off-by: Mihai Maruseac <mmaruseac@ixiacom.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:54:38 -04:00
Eric W. Biederman
e09eff7fc1 macvtap: Fix the minor device number allocation
On systems that create and delete lots of dynamic devices the
31bit linux ifindex fails to fit in the 16bit macvtap minor,
resulting in unusable macvtap devices.  I have systems running
automated tests that that hit this condition in just a few days.

Use a linux idr allocator to track which mavtap minor numbers
are available and and to track the association between macvtap
minor numbers and macvtap network devices.

Remove the unnecessary unneccessary check to see if the network
device we have found is indeed a macvtap device.  With macvtap
specific data structures it is impossible to find any other
kind of networking device.

Increase the macvtap minor range from 65536 to the full 20 bits
that is supported by linux device numbers.  It doesn't solve the
original problem but there is no penalty for a larger minor
device range.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:53:07 -04:00
Eric W. Biederman
9bf1907f42 macvtap: Rewrite macvtap_newlink so the error handling works.
Place macvlan_common_newlink at the end of macvtap_newlink because
failing in newlink after registering your network device is not
supported.

Move device_create into a netdevice creation notifier.   The network device
notifier is the only hook that is called after the network device has been
registered with the device layer and before register_network_device returns
success.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:53:07 -04:00
Eric W. Biederman
2259fef0bb macvtap: Don't leak unreceived packets when we delete a macvtap device.
To avoid leaking packets in the receive queue.  Add a socket destructor
that will run whenever destroy a macvtap socket.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:53:07 -04:00
Eric W. Biederman
047af9cfed macvtap: Fix macvtap_open races in the zero copy enable code.
To see if it is appropriate to enable the macvtap zero copy feature
don't test the lowerdev network device flags.   Instead test the
macvtap network device flags which are a direct copy of the lowerdev
flags.  This is important because nothing holds a reference to lowerdev
and on a very bad day we lowerdev could be a pointer to stale memory.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:53:07 -04:00
Eric W. Biederman
99f34b38cd macvtap: Close a race between macvtap_open and macvtap_dellink.
There is a small window in macvtap_open between looking up a
networking device and calling macvtap_set_queue in which
macvtap_del_queues called from macvtap_dellink.   After
calling macvtap_del_queues it is totally incorrect to
allow macvtap_set_queue to proceed so prevent success by
reporting that all of the available queues are in use.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:53:06 -04:00
Eric Dumazet
4b727361f0 virtio_net: fix truesize underestimation
We must account in skb->truesize, the size of the fragments, not the
used part of them.

Doing this work is important to avoid unexpected OOM situations.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: virtualization@lists.linux-foundation.org
CC: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:53:00 -04:00
Eric Dumazet
e1ac50f646 bnx2x: fix skb truesize underestimation
bnx2x allocates a full page per fragment.

We must account in skb->truesize, the size of the fragment, not the used
part of it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:52:59 -04:00
Ian Campbell
a8605c6063 net: add opaque struct around skb frag page
I've split this bit out of the skb frag destructor patch since it helps enforce
the use of the fragment API.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:52:53 -04:00
Ian Campbell
6a39a16a5a cxgbi: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: James Bottomley <James.Bottomley@suse.de>
Cc: Karen Xie <kxie@chelsio.com>
Cc: linux-scsi@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:52:53 -04:00
Ian Campbell
a0006a86cb cxgb4vf: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Casey Leedom <leedom@chelsio.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:52:52 -04:00
Ian Campbell
e91b0f2491 cxgb4: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Dimitris Michailidis <dm@chelsio.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:52:52 -04:00
Ian Campbell
311761c8a5 mlx4: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:52:52 -04:00
Maciej Żenczykowski
6cc7a765c2 net: allow CAP_NET_RAW to set socket options IP{,V6}_TRANSPARENT
Up till now the IP{,V6}_TRANSPARENT socket options (which actually set
the same bit in the socket struct) have required CAP_NET_ADMIN
privileges to set or clear the option.

- we make clearing the bit not require any privileges.
- we allow CAP_NET_ADMIN to set the bit (as before this change)
- we allow CAP_NET_RAW to set this bit, because raw
  sockets already pretty much effectively allow you
  to emulate socket transparency.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 18:21:36 -04:00
Eric Dumazet
05bdd2f143 net: constify skbuff and Qdisc elements
Preliminary patch before tcp constification

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 17:45:43 -04:00
Eric Dumazet
20c4cb792d tcp: remove unused tcp_fin() parameters
tcp_fin() only needs socket pointer, we can remove skb and th params.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 17:44:03 -04:00
David S. Miller
580043a27d Merge branch 'batman-adv/maint' of git://git.open-mesh.org/linux-merge 2011-10-20 17:40:43 -04:00
Ricardo
9eac2d4d53 ll_temac: Add support for ethtool
This patch enables the ethtool interface. The implementation is done
using the libphy helper functions.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 17:10:55 -04:00
RongQing Li
46a016985a igb: fix a compile warning
control these three function declarations and
definitions with same macro CONFIG_PCI_IOV

drivers/net/ethernet/intel/igb/igb_main.c:165:
warning: ‘igb_vf_configure’ declared ‘static’ but never defined
drivers/net/ethernet/intel/igb/igb_main.c:166:
warning: ‘igb_find_enabled_vfs’ declared ‘static’ but never defined
drivers/net/ethernet/intel/igb/igb_main.c:167:
warning: ‘igb_check_vf_assignment’ declared ‘static’ but never defined

Signed-off-by: RongQing Li <roy.qing.li@gmail.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 17:09:40 -04:00
Eric Dumazet
924a4c7d2e myri10ge: fix truesize underestimation
skb->truesize must account for allocated memory, not the used part of
it. Doing this work is important to avoid unexpected OOM situations.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jon Mason <mason@myri.com>
Acked-by: Jon Mason <mason@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 17:04:20 -04:00
Eric Dumazet
7b8b59617e igbvf: fix truesize underestimation
igbvf allocates half a page per skb fragment. We must account
PAGE_SIZE/2 increments on skb->truesize, not the actual frag length.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 17:04:20 -04:00
Eric Dumazet
33136d12be pktgen: remove ndelay() call
Daniel Turull reported inaccuracies in pktgen when using low packet
rates, because we call ndelay(val) with values bigger than 20000.

Instead of calling ndelay() for delays < 100us, we can instead loop
calling ktime_now() only.

Reported-by: Daniel Turull <daniel.turull@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 17:00:21 -04:00
Eric Dumazet
e9266a02b7 tcp: use TCP_DEFAULT_INIT_RCVWND in tcp_fixup_rcvbuf()
Since commit 356f039822 (TCP: increase default initial receive
window.), we allow sender to send 10 (TCP_DEFAULT_INIT_RCVWND) segments.

Change tcp_fixup_rcvbuf() to reflect this change, even if no real change
is expected, since sysctl_tcp_rmem[1] = 87380 and this value
is bigger than tcp_fixup_rcvbuf() computed rcvmem (~23720)

Note: Since commit 356f039822 limited default window to maximum of
10*1460 and 2*MSS, we use same heuristic in this patch.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 16:54:51 -04:00
Eric Dumazet
113ab386c7 ip_gre: dont increase dev->needed_headroom on a live device
It seems ip_gre is able to change dev->needed_headroom on the fly.

Its is not legal unfortunately and triggers a BUG in raw_sendmsg()

skb = sock_alloc_send_skb(sk, ... + LL_ALLOCATED_SPACE(rt->dst.dev)

< another cpu change dev->needed_headromm (making it bigger)

...
skb_reserve(skb, LL_RESERVED_SPACE(rt->dst.dev));

We end with LL_RESERVED_SPACE() being bigger than LL_ALLOCATED_SPACE()
-> we crash later because skb head is exhausted.

Bug introduced in commit 243aad83 in 2.6.34 (ip_gre: include route
header_len in max_headroom calculation)

Reported-by: Elmar Vonlanthen <evonlanthen@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Timo Teräs <timo.teras@iki.fi>
CC: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 16:20:30 -04:00
Linus Torvalds
fd11e153b8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Add alignment flag to PCI expansion resources
  sparc: Avoid calling sigprocmask()
  sparc: Use set_current_blocked()
  sparc32,leon: SRMMU MMU Table probe fix
2011-10-20 22:16:28 +03:00
Linus Torvalds
505f48b534 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  fib_rules: fix unresolved_rules counting
  r8169: fix wrong eee setting for rlt8111evl
  r8169: fix driver shutdown WoL regression.
  ehea: Change maintainer to me
  pptp: pptp_rcv_core() misses pskb_may_pull() call
  tproxy: copy transparent flag when creating a time wait
  pptp: fix skb leak in pptp_xmit()
  bonding: use local function pointer of bond->recv_probe in bond_handle_frame
  smsc911x: Add support for SMSC LAN89218
  tg3: negate USE_PHYLIB flag check
  netconsole: enable netconsole can make net_device refcnt incorrent
  bluetooth: Properly clone LSM attributes to newly created child connections
  l2tp: fix a potential skb leak in l2tp_xmit_skb()
  bridge: fix hang on removal of bridge via netlink
  x25: Prevent skb overreads when checking call user data
  x25: Handle undersized/fragmented skbs
  x25: Validate incoming call user data lengths
  udplite: fast-path computation of checksum coverage
  IPVS netns shutdown/startup dead-lock
  netfilter: nf_conntrack: fix event flooding in GRE protocol tracker
2011-10-20 22:15:20 +03:00
Ian Campbell
30d3c128ea mm: add a "struct page_frag" type containing a page, offset and length
A few network drivers currently use skb_frag_struct for this purpose but I have
patches which add additional fields and semantics there which these other uses
do not want.

A structure for reference sub-page regions seems like a generally useful thing
so do so instead of adding a network subsystem specific structure.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jens Axboe <jaxboe@fusionio.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 04:58:32 -04:00