linux

Author	SHA1	Message	Date
Jason Wang	bb91accf27	virtio-net: XDP support for small buffers Commit `f600b69050` ("virtio_net: Add XDP support") leaves the case of small receive buffer untouched. This will confuse the user who want to set XDP but use small buffers. Other than forbid XDP in small buffer mode, let's make it work. XDP then can only work at skb->data since virtio-net create skbs during refill, this is sub optimal which could be optimized in the future. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:56 -05:00
Jason Wang	c47a43d300	virtio-net: remove big packet XDP codes Now we in fact don't allow XDP for big packets, remove its codes. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:55 -05:00
Jason Wang	92502fe86c	virtio-net: forbid XDP when VIRTIO_NET_F_GUEST_UFO is support When VIRTIO_NET_F_GUEST_UFO is negotiated, host could still send UFO packet that exceeds a single page which could not be handled correctly by XDP. So this patch forbids setting XDP when GUEST_UFO is supported. While at it, forbid XDP for ECN (which comes only from GRO) too to prevent user from misconfiguration. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:55 -05:00
Jason Wang	5c33474d41	virtio-net: make rx buf size estimation works for XDP We don't update ewma rx buf size in the case of XDP. This will lead underestimation of rx buf size which causes host to produce more than one buffers. This will greatly increase the possibility of XDP page linearization. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:55 -05:00
Jason Wang	b00f70b0da	virtio-net: unbreak csumed packets for XDP_PASS We drop csumed packet when do XDP for packets. This breaks XDP_PASS when GUEST_CSUM is supported. Fix this by allowing csum flag to be set. With this patch, simple TCP works for XDP_PASS. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:54 -05:00
Jason Wang	1830f8935f	virtio-net: correctly handle XDP_PASS for linearized packets When XDP_PASS were determined for linearized packets, we try to get new buffers in the virtqueue and build skbs from them. This is wrong, we should create skbs based on existed buffers instead. Fixing them by creating skb based on xdp_page. With this patch "ping 192.168.100.4 -s 3900 -M do" works for XDP_PASS. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:54 -05:00
Jason Wang	56a86f84b8	virtio-net: fix page miscount during XDP linearizing We don't put page during linearizing, the would cause leaking when xmit through XDP_TX or the packet exceeds PAGE_SIZE. Fix them by put page accordingly. Also decrease the number of buffers during linearizing to make sure caller can free buffers correctly when packet exceeds PAGE_SIZE. With this patch, we won't get OOM after linearize huge number of packets. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:54 -05:00
Jason Wang	275be061b3	virtio-net: correctly xmit linearized page on XDP_TX After we linearize page, we should xmit this page instead of the page of first buffer which may lead unexpected result. With this patch, we can see correct packet during XDP_TX. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:54 -05:00
Jason Wang	73b62bd085	virtio-net: remove the warning before XDP linearizing Since we use EWMA to estimate the size of rx buffer. When rx buffer size is underestimated, it's usual to have a packet with more than one buffers. Consider this is not a bug, remove the warning and correct the comment before XDP linearizing. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-23 13:48:53 -05:00
John Fastabend	72979a6c35	virtio_net: xdp, add slowpath case for non contiguous buffers virtio_net XDP support expects receive buffers to be contiguous. If this is not the case we enable a slowpath to allow connectivity to continue but at a significan performance overhead associated with linearizing data. To make it painfully aware to users that XDP is running in a degraded mode we throw an xdp buffer error. To linearize packets we allocate a page and copy the segments of the data, including the header, into it. After this the page can be handled by XDP code flow as normal. Then depending on the return code the page is either freed or sent to the XDP xmit path. There is no attempt to optimize this path. This case is being handled simple as a precaution in case some unknown backend were to generate packets in this form. To test this I had to hack qemu and force it to generate these packets. I do not expect this case to be generated by "real" backends. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-17 11:48:55 -05:00
John Fastabend	56434a01b1	virtio_net: add XDP_TX support This adds support for the XDP_TX action to virtio_net. When an XDP program is run and returns the XDP_TX action the virtio_net XDP implementation will transmit the packet on a TX queue that aligns with the current CPU that the XDP packet was processed on. Before sending the packet the header is zeroed. Also XDP is expected to handle checksum correctly so no checksum offload support is provided. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-17 11:48:55 -05:00
John Fastabend	672aafd5d8	virtio_net: add dedicated XDP transmit queues XDP requires using isolated transmit queues to avoid interference with normal networking stack (BQL, NETDEV_TX_BUSY, etc). This patch adds a XDP queue per cpu when a XDP program is loaded and does not expose the queues to the OS via the normal API call to netif_set_real_num_tx_queues(). This way the stack will never push an skb to these queues. However virtio/vhost/qemu implementation only allows for creating TX/RX queue pairs at this time so creating only TX queues was not possible. And because the associated RX queues are being created I went ahead and exposed these to the stack and let the backend use them. This creates more RX queues visible to the network stack than TX queues which is worth mentioning but does not cause any issues as far as I can tell. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-17 11:48:55 -05:00
John Fastabend	f600b69050	virtio_net: Add XDP support This adds XDP support to virtio_net. Some requirements must be met for XDP to be enabled depending on the mode. First it will only be supported with LRO disabled so that data is not pushed across multiple buffers. Second the MTU must be less than a page size to avoid having to handle XDP across multiple pages. If mergeable receive is enabled this patch only supports the case where header and data are in the same buf which we can check when a packet is received by looking at num_buf. If the num_buf is greater than 1 and a XDP program is loaded the packet is dropped and a warning is thrown. When any_header_sg is set this does not happen and both header and data is put in a single buffer as expected so we check this when XDP programs are loaded. Subsequent patches will process the packet in a degraded mode to ensure connectivity and correctness is not lost even if backend pushes packets into multiple buffers. If big packets mode is enabled and MTU/LRO conditions above are met then XDP is allowed. This patch was tested with qemu with vhost=on and vhost=off where mergeable and big_packet modes were forced via hard coding feature negotiation. Multiple buffers per packet was forced via a small test patch to vhost.c in the vhost=on qemu mode. Suggested-by: Shrijeet Mukherjee <shrijeet@gmail.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-17 11:48:55 -05:00
Jason Wang	a220871be6	virtio-net: correctly enable multiqueue Commit `4490001029` ("virtio-net: enable multiqueue by default") blindly set the affinity instead of queues during probe which can cause a mismatch of #queues between guest and host. This patch fixes it by setting queues. Reported-by: Theodore Ts'o <tytso@mit.edu> Tested-by: Theodore Ts'o <tytso@mit.edu> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Michael S. Tsirkin <mst@redhat.com> Fixes: 49000102901 ("virtio-net: enable multiqueue by default") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-13 10:37:38 -05:00
David S. Miller	c63d352f05	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-12-06 21:33:19 -05:00
Andy Lutomirski	e37e2ff350	virtio-net: Fix DMA-from-the-stack in virtnet_set_mac_address() With CONFIG_VMAP_STACK=y, virtnet_set_mac_address() can be passed a pointer to the stack and it will OOPS. Copy the address to the heap to prevent the crash. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Laura Abbott <labbott@redhat.com> Reported-by: zbyszek@in.waw.pl Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-06 11:38:43 -05:00
Jason Wang	4490001029	virtio-net: enable multiqueue by default We use single queue even if multiqueue is enabled and let admin to enable it through ethtool later. This is used to avoid possible regression (small packet TCP stream transmission). But looks like an overkill since: - single queue user can disable multiqueue when launching qemu - brings extra troubles for the management since it needs extra admin tool in guest to enable multiqueue - multiqueue performs much better than single queue in most of the cases So this patch enables multiqueue by default: if #queues is less than or equal to #vcpu, enable as much as queue pairs; if #queues is greater than #vcpu, enable #vcpu queue pairs. Cc: Hannes Frederic Sowa <hannes@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Neil Horman <nhorman@redhat.com> Cc: Jeremy Eder <jeder@redhat.com> Cc: Marko Myllynen <myllynen@redhat.com> Cc: Maxime Coquelin <maxime.coquelin@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-28 13:17:40 -05:00
David S. Miller	f9aa9dc7d2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net All conflicts were simple overlapping changes except perhaps for the Thunder driver. That driver has a change_mtu method explicitly for sending a message to the hardware. If that fails it returns an error. Normally a driver doesn't need an ndo_change_mtu method becuase those are usually just range changes, which are now handled generically. But since this extra operation is needed in the Thunder driver, it has to stay. However, if the message send fails we have to restore the original MTU before the change because the entire call chain expects that if an error is thrown by ndo_change_mtu then the MTU did not change. Therefore code is added to nicvf_change_mtu to remember the original MTU, and to restore it upon nicvf_update_hw_max_frs() failue. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-22 13:27:16 -05:00
Eric Dumazet	963abe5c8a	virtio-net: add a missing synchronize_net() It seems many drivers do not respect napi_hash_del() contract. When napi_hash_del() is used before netif_napi_del(), an RCU grace period is needed before freeing NAPI object. Fixes: `91815639d8` ("virtio-net: rx busy polling support") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-16 15:09:29 -05:00
David S. Miller	bb598c1b8c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Several cases of bug fixes in 'net' overlapping other changes in 'net-next-. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-15 10:54:36 -05:00
Michael S. Tsirkin	f3358507c1	virtio-net: drop legacy features in virtio 1 mode Virtio 1.0 spec says VIRTIO_F_ANY_LAYOUT and VIRTIO_NET_F_GSO are legacy-only feature bits. Do not negotiate them in virtio 1 mode. Note this is a spec violation so we need to backport it to stable/downstream kernels. Cc: stable@vger.kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-07 20:35:46 -05:00
Aaron Conole	93a205ee98	virtio-net: Update the mtu code to match virtio spec The virtio committee recently ratified a change, VIRTIO-152, which defines the mtu field to be 'max' MTU, not simply desired MTU. This commit brings the virtio-net device in compliance with VIRTIO-152. Additionally, drop the max_mtu branch - it cannot be taken since the u16 returned by virtio_cread16 will never exceed the initial value of max_mtu. Signed-off-by: Aaron Conole <aconole@redhat.com> Acked-by: "Michael S. Tsirkin" <mst@redhat.com> Acked-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 11:58:51 -04:00
Jarod Wilson	d0c2c9973e	net: use core MTU range checking in virt drivers hyperv_net: - set min/max_mtu, per Haiyang, after rndis_filter_device_add virtio_net: - set min/max_mtu - remove virtnet_change_mtu vmxnet3: - set min/max_mtu xen-netback: - min_mtu = 0, max_mtu = 65517 xen-netfront: - min_mtu = 0, max_mtu = 65535 unisys/visor: - clean up defines a little to not clash with network core or add redundat definitions CC: netdev@vger.kernel.org CC: virtualization@lists.linux-foundation.org CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: "Michael S. Tsirkin" <mst@redhat.com> CC: Shrikrishna Khare <skhare@vmware.com> CC: "VMware, Inc." <pv-drivers@vmware.com> CC: Wei Liu <wei.liu2@citrix.com> CC: Paul Durrant <paul.durrant@citrix.com> CC: David Kershner <david.kershner@unisys.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-20 14:51:09 -04:00
Sebastian Andrzej Siewior	8017c27919	net/virtio-net: Convert to hotplug state machine Install the callbacks via the state machine. The driver supports multiple instances and therefore the new cpuhp_state_add_instance_nocalls() infrastrucure is used. The driver currently uses get_online_cpus() to avoid missing a CPU hotplug event while invoking virtnet_set_affinity(). This could be avoided by using cpuhp_state_add_instance() variant which holds the hotplug lock and invokes callback during registration. This is more or less a 1:1 conversion of the current code. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Mark Rutland <mark.rutland@arm.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: netdev@vger.kernel.org Cc: Will Deacon <will.deacon@arm.com> Cc: virtualization@lists.linux-foundation.org Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/1471024183-12666-7-git-send-email-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-09-02 20:05:06 +02:00
Andy Lutomirski	a725ee3e44	virtio-net: Remove more stack DMA VLAN and MQ control was doing DMA from the stack. Fix it. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-07-19 19:25:43 -07:00
Mike Rapoport	d1dc06dcd0	virtio_net: fix csum generation for virtio-net devices The commit `e858fae2b0` ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion") replaced the tun code for header manipulation with the generic helpers. While doing so, it implictly moved the skb_partial_csum_set() invocation after eth_type_trans(), which invalidate the current gso start/offset values. Fix it by moving the helper invocation before the mac pulling. Fixes: `e858fae2b0` ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion") Reported-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-14 03:03:34 -04:00
Mike Rapoport	e858fae2b0	virtio_net: use common code for virtio_net_hdr and skb GSO conversion Replace open coded conversion between virtio_net_hdr to skb GSO info with virtio_net_hdr_{from,to}_skb Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-10 23:03:55 -07:00
Aaron Conole	14de9d114a	virtio-net: Add initial MTU advice feature This commit adds the feature bit and associated mtu device entry for the virtio network device. When a virtio device comes up, it checks the feature bit for the VIRTIO_NET_F_MTU feature. If such feature bit is enabled, the driver will read the advised MTU and use it as the initial value. Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-06 21:08:49 -04:00
wangyunjian	f00e35e259	virtio_net: fix virtnet_open and virtnet_probe competing for try_fill_recv In function virtnet_open() and virtnet_probe(), func try_fill_recv() may be executed at the same time. VQ in virtqueue_add() has not been protected well and BUG_ON will be triggered when virito_net.ko being removed. Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-31 14:21:09 -07:00
Linus Torvalds	f0691533b7	virtio/vhost: new features, performance improvements, cleanups This adds basic polling support for vhost. Reworks virtio to optionally use DMA API, fixing it on Xen. Balloon stats gained a new entry. Using the new napi_alloc_skb speeds up virtio net. virtio blk stats can now be read while another VCPU us busy inflating or deflating the balloon. Plus misc cleanups in various places. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJW7qRJAAoJECgfDbjSjVRpVNoH/A7z+lZ6nooSJ9fUBtAAlwit mE1VKi8g0G6naV1NVLFVe7hPAejExGiHfR3ZUrVoenJKj2yeW/DFojFC10YR/KTe ac7Imuc+owA3UOE/QpeGBs59+EEWKTZUYt6r8HSJVwoodeosw9v2ecP/Iwhbax8H a4V3HqOADjKnHg73R9o3u+bAgA1GrGYHeK0AfhCBSTNwlPdxkvf0463HgfOpM4nl /sNoFWO3vOyekk+loIk+jpmWVIoIfG2NFzW4lPwEPkfqUBX7r0ei/NR23hIqHL7r QZ6vMj1Ew9qctUONbJu4kXjuV2Vk9NhxwbDjoJtm8plKL2hz2prJynUEogkHh2g= =VMD0 -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio/vhost updates from Michael Tsirkin: "New features, performance improvements, cleanups: - basic polling support for vhost - rework virtio to optionally use DMA API, fixing it on Xen - balloon stats gained a new entry - using the new napi_alloc_skb speeds up virtio net - virtio blk stats can now be read while another VCPU is busy inflating or deflating the balloon plus misc cleanups in various places" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_net: replace netdev_alloc_skb_ip_align() with napi_alloc_skb() vhost_net: basic polling support vhost: introduce vhost_vq_avail_empty() vhost: introduce vhost_has_work() virtio_balloon: Allow to resize and update the balloon stats in parallel virtio_balloon: Use a workqueue instead of "vballoon" kthread virtio/s390: size of SET_IND payload virtio/s390: use dev_to_virtio vhost: rename vhost_init_used() vhost: rename cross-endian helpers virtio_blk: VIRTIO_BLK_F_WCE->VIRTIO_BLK_F_FLUSH vring: Use the DMA API on Xen virtio_pci: Use the DMA API if enabled virtio_mmio: Use the DMA API if enabled virtio: Add improved queue allocation API virtio_ring: Support DMA APIs vring: Introduce vring_use_dma_api() s390/dma: Allow per device dma ops alpha/dma: use common noop dma ops dma: Provide simple noop dma ops	2016-03-20 13:28:18 -07:00
Paolo Abeni	c67f5db820	virtio_net: replace netdev_alloc_skb_ip_align() with napi_alloc_skb() This gives small but noticeable rx performance improvement (2-3%) and will allow exploiting future napi improvement. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2016-03-17 17:42:00 +02:00
Nikolay Aleksandrov	0cf3ace9e7	virtio_net: validate ethtool port setting and explain the user validation We should validate the port setting that we got from the user and check if it's what we've set it to (PORT_OTHER), also add explanation that ignoring advertising is good as long as we don't have autonegotiation. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-11 11:55:38 -05:00
Nikolay Aleksandrov	16032be56c	virtio_net: add ethtool support for set and get of settings This patch allows the user to set and retrieve speed and duplex of the virtio_net device via ethtool. Having this functionality is very helpful for simulating different environments and also enables the virtio_net device to participate in operations where proper speed and duplex are required (e.g. currently bonding lacp mode requires full duplex). Custom speed and duplex are not allowed, the user-supplied settings are validated before applying. Example: $ ethtool eth1 Settings for eth1: ... Speed: Unknown! Duplex: Unknown! (255) $ ethtool -s eth1 speed 1000 duplex full $ ethtool eth1 Settings for eth1: ... Speed: 1000Mb/s Duplex: Full Based on a patch by Roopa Prabhu. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-07 14:30:45 -05:00
David S. Miller	b3e0d3d7ba	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/geneve.c Here we had an overlapping change, where in 'net' the extraneous stats bump was being removed whilst in 'net-next' the final argument to udp_tunnel6_xmit_skb() was being changed. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-17 22:08:28 -05:00
Michael S. Tsirkin	2ac4603033	virtio-net: Stop doing DMA from the stack Once virtio starts using the DMA API, we won't be able to safely DMA from the stack. virtio-net does a couple of config DMA requests from small stack buffers -- switch to using dynamically-allocated memory. This should have no effect on any performance-critical code paths. Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Andy Lutomirski <luto@kernel.org>	2015-12-07 16:10:53 +02:00
Eric Dumazet	93d05d4a32	net: provide generic busy polling to all NAPI drivers NAPI drivers no longer need to observe a particular protocol to benefit from busy polling (CONFIG_NET_RX_BUSY_POLL=y) napi_hash_add() and napi_hash_del() are automatically called from core networking stack, respectively from netif_napi_add() and netif_napi_del() This patch depends on free_netdev() and netif_napi_del() being called from process context, which seems to be the norm. Drivers might still prefer to call napi_hash_del() on their own, since they might combine all the rcu grace periods into a single one, knowing their NAPI structures lifetime, while core networking stack has no idea of a possible combining. Once this patch proves to not bring serious regressions, we will cleanup drivers to either remove napi_hash_del() or provide appropriate rcu grace periods combining. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:42 -05:00
Eric Dumazet	93f93a4404	net: move skb_mark_napi_id() into core networking stack We would like to automatically provide busy polling support to all NAPI drivers, without them having to implement anything. skb_mark_napi_id() can be called from napi_gro_receive() and napi_get_frags(). Few drivers are still calling skb_mark_napi_id() because they use netif_receive_skb(). They should eventually call napi_gro_receive() instead. I will leave this to drivers maintainers. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-11-18 16:17:41 -05:00
Jason Wang	547c890cfd	virtio-net: avoid unnecessary sg initialzation Usually an skb does not have up to MAX_SKB_FRAGS frags. So no need to initialize the unuse part of sg. This patch initialize the sg based on the real number it will used: - during xmit, it could be inferred from nr_frags and can_push. - for small receive buffer, it will also be 2. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-27 15:51:45 -07:00
Johannes Berg	5377d75823	virtio_net: use DECLARE_EWMA Instead of using the out-of-line EWMA calculation, use DECLARE_EWMA() to create static inlines. On x86/64 this results in no change in code size for me, but reduces the struct receive_queue size by the two unsigned long values that store the parameters. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-20 14:10:22 -07:00
David S. Miller	182ad468e7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/cavium/Kconfig The cavium conflict was overlapping dependency changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-13 16:23:11 -07:00
Jason Wang	48900cb6af	virtio-net: drop NETIF_F_FRAGLIST virtio declares support for NETIF_F_FRAGLIST, but assumes that there are at most MAX_SKB_FRAGS + 2 fragments which isn't always true with a fraglist. A longer fraglist in the skb will make the call to skb_to_sgvec overflow the sg array, leading to memory corruption. Drop NETIF_F_FRAGLIST so we only get what we can handle. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-07 00:13:10 -07:00
Eric Dumazet	0fbd050a7d	virtio_net: add gro capability Straightforward patch to add GRO processing to virtio_net. napi_complete_done() usage allows more aggressive aggregation, opted-in by setting /sys/class/net/xxx/gro_flush_timeout Tested: Setting /sys/class/net/xxx/gro_flush_timeout to 1000 nsec, Rick Jones reported following results. One VM of each on a pair of OpenStack compute nodes with E5-2650Lv3 CPUs and Intel 82599ES-based NICs. So, two "before" and two "after" VMs. The OpenStack compute nodes were running OpenStack Kilo, with VxLAN encapsulation being used through OVS so no GRO coming-up the host stack. The compute nodes themselves were running a 3.14-based kernel. Single-stream netperf, CPU utilizations and thus service demands are based on intra-guest reported CPU. Throughput Mbit/s, bigger is better Min Median Average Max 4.2.0-rc3+ 1364 1686 1678 1938 4.2.0-rc3+flush1k 1824 2269 2275 2647 Send Service Demand, smaller is better Min Median Average Max 4.2.0-rc3+ 0.236 0.558 0.524 0.802 4.2.0-rc3+flush1k 0.176 0.503 0.471 0.738 Receive Service Demand, smaller is better. Min Median Average Max 4.2.0-rc3+ 1.906 2.188 2.191 2.531 4.2.0-rc3+flush1k 0.448 0.529 0.533 0.692 Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Rick Jones <rick.jones2@hp.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-08-03 14:22:53 -07:00
Michael S. Tsirkin	75993300d0	virtio_net: don't require ANY_LAYOUT with VERSION_1 ANY_LAYOUT is a compatibility feature. It's implied for VERSION_1 devices, and non-transitional devices might not offer it. Change code to behave accordingly. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-07-20 12:43:33 -07:00
Michael S. Tsirkin	60302ff631	virtio: document queue state logic commit `d631b94e7a` virtio: change comment in transmit started clarifying the logic behind queue state management, but introduced an inaccuracy: TX_BUSY does not cause a BUG message. Clean this up some more, explaining the tradeoffs in detail. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-04-06 16:44:24 -04:00
Li RongQing	faadb05f4b	virtio: simplify the using of received in virtnet_poll received is 0, no need to minus it and use "+=" to reassign it Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:37:17 -07:00
stephen hemminger	d631b94e7a	virtio: change comment in transmit The original comment was not really informative or funny as well as sexist. Replace it with a better explanation of why the driver does stop and what the impacts are. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:22:50 -04:00
Jason Wang	ab3971b1e7	virtio-net: correctly delete napi hash We don't delete napi from hash list during module exit. This will cause the following panic when doing module load and unload: BUG: unable to handle kernel paging request at 0000004e00000075 IP: [<ffffffff816bd01b>] napi_hash_add+0x6b/0xf0 PGD 3c5d5067 PUD 0 Oops: 0000 [#1] SMP ... Call Trace: [<ffffffffa0a5bfb7>] init_vqs+0x107/0x490 [virtio_net] [<ffffffffa0a5c9f2>] virtnet_probe+0x562/0x791815639d880be [virtio_net] [<ffffffff8139e667>] virtio_dev_probe+0x137/0x200 [<ffffffff814c7f2a>] driver_probe_device+0x7a/0x250 [<ffffffff814c81d3>] __driver_attach+0x93/0xa0 [<ffffffff814c8140>] ? __device_attach+0x40/0x40 [<ffffffff814c6053>] bus_for_each_dev+0x63/0xa0 [<ffffffff814c7a79>] driver_attach+0x19/0x20 [<ffffffff814c76f0>] bus_add_driver+0x170/0x220 [<ffffffffa0a60000>] ? 0xffffffffa0a60000 [<ffffffff814c894f>] driver_register+0x5f/0xf0 [<ffffffff8139e41b>] register_virtio_driver+0x1b/0x30 [<ffffffffa0a60010>] virtio_net_driver_init+0x10/0x12 [virtio_net] This patch fixes this by doing this in virtnet_free_queues(). And also don't delete napi in virtnet_freeze() since it will call virtnet_free_queues() which has already did this. Fixes `91815639d8` ("virtio-net: rx busy polling support") Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-12 14:37:17 -04:00
Linus Torvalds	53861af9a1	OK, this has the big virtio 1.0 implementation, as specified by OASIS. On top of tht is the major rework of lguest, to use PCI and virtio 1.0, to double-check the implementation. Then comes the inevitable fixes and cleanups from that work. Thanks, Rusty. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJU5B9cAAoJENkgDmzRrbjxPacP/jajliXX353JJ/g/hkZ6oDN5 o7FhELBKiUMr7enVZYwj2BBYk5OM36nB9pQkiqHMSbjJGoS5IK70enxb4YRxSHBn YCLblZMNqutGS0kclZ9DDysztjAhxH7CvLM6pMZ7eHP0f3+FM/QhbxHfbG9DTBUH 2U/nybvd3M/+YBe7ptwQdrH8aOCAD6RTIsXellfm99dNMK6K/5lqnWQ98WSXmNXq vyvdaAQsqqUkmxtajjcBumaCH4/SehOJJjUqojCMsR3aBkgOBWDZJURMek+KA5Dt X996fBsTAlvTtCUKRrmLTb2ScDH7fu+jwbWRqMYDk8zpEr3XqiLTTPV4/TiHGmi7 Wiw3g1wIY1YbETlZyongB5MIoVyUfmDAd+bT8nBsj3KIITD84gOUQFDMl6d63c0I z6A9Pu/UzpJGsXZT3WoFLi6TO67QyhOseqZnhS4wBgLabjxffNM7yov9RVKUVH/n JHunnpUk2iTtSgscBarOBz5867dstuurnaUIspZthVBo6y6N0z+GrU+agJ8Y4DXx mvwzeYLhQH2208PjxPFiah/kA/gHNm1m678TbpS+CUsgmpQiJ4gTwtazDSi4TwZY Hs9T9GulkzpZIzEyKL3qG2TsfyDhW5Avn+GvKInAT9+Fkig4BnP3DUONBxcwGZ78 eI3FDUWsE36NqE5ECWmz =ivCe -----END PGP SIGNATURE----- Merge tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio updates from Rusty Russell: "OK, this has the big virtio 1.0 implementation, as specified by OASIS. On top of tht is the major rework of lguest, to use PCI and virtio 1.0, to double-check the implementation. Then comes the inevitable fixes and cleanups from that work" * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (80 commits) virtio: don't set VIRTIO_CONFIG_S_DRIVER_OK twice. virtio_net: unconditionally define struct virtio_net_hdr_v1. tools/lguest: don't use legacy definitions for net device in example launcher. virtio: Don't expose legacy net features when VIRTIO_NET_NO_LEGACY defined. tools/lguest: use common error macros in the example launcher. tools/lguest: give virtqueues names for better error messages tools/lguest: more documentation and checking of virtio 1.0 compliance. lguest: don't look in console features to find emerg_wr. tools/lguest: don't start devices until DRIVER_OK status set. tools/lguest: handle indirect partway through chain. tools/lguest: insert driver references from the 1.0 spec (4.1 Virtio Over PCI) tools/lguest: insert device references from the 1.0 spec (4.1 Virtio Over PCI) tools/lguest: rename virtio_pci_cfg_cap field to match spec. tools/lguest: fix features_accepted logic in example launcher. tools/lguest: handle device reset correctly in example launcher. virtual: Documentation: simplify and generalize paravirt_ops.txt lguest: remove NOTIFY call and eventfd facility. lguest: remove NOTIFY facility from demonstration launcher. lguest: use the PCI console device's emerg_wr for early boot messages. lguest: always put console in PCI slot #1. ...	2015-02-18 09:24:01 -08:00
David S. Miller	6e03f896b5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/vxlan.c drivers/vhost/net.c include/linux/if_vlan.h net/core/dev.c The net/core/dev.c conflict was the overlap of one commit marking an existing function static whilst another was adding a new function. In the include/linux/if_vlan.h case, the type used for a local variable was changed in 'net', whereas the function got rewritten to fix a stacked vlan bug in 'net-next'. In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next' overlapped with an endainness fix for VHOST 1.0 in 'net'. In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter in 'net-next' whereas in 'net' there was a bug fix to pass in the correct network namespace pointer in calls to this function. Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-05 14:33:28 -08:00
Vlad Yasevich	e3e3c423f8	Revert "drivers/net: Disable UFO through virtio" This reverts commit `3d0ad09412`. Now that GSO functionality can correctly track if the fragment id has been selected and select a fragment id if necessary, we can re-enable UFO on tap/macvap and virtio devices. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-02-03 23:06:43 -08:00

1 2 3 4 5 ...

322 Commits