linux

Author	SHA1	Message	Date
David Ahern	afe80a4998	net: vrf: ipv4 support for local traffic to local addresses Add support for locally originated traffic to VRF-local addresses. If destination device for an skb is the loopback or VRF device then set its dst to a local version of the VRF cached dst_entry and call netif_rx to insert the packet onto the rx queue - similar to what is done for loopback. This patch handles IPv4 support; follow on patch handles IPv6. With this patch, ping, tcp and udp packets to a local IPv4 address are successfully routed: $ ip addr show dev eth1 4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000 link/ether 02:e0:f9:1c:b9:74 brd ff:ff:ff:ff:ff:ff inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1 valid_lft forever preferred_lft forever inet6 2100:1::1/120 scope global valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe1c:b974/64 scope link valid_lft forever preferred_lft forever $ ping -c1 -I red 10.100.1.1 ping: Warning: source address might be selected on device other than red. PING 10.100.1.1 (10.100.1.1) from 10.100.1.1 red: 56(84) bytes of data. 64 bytes from 10.100.1.1: icmp_seq=1 ttl=64 time=0.057 ms This patch also enables use of IPv4 loopback address on the VRF device: $ ip addr add dev red 127.0.0.1/8 $ ping -c1 -I red 127.0.0.1 PING 127.0.0.1 (127.0.0.1) from 127.0.0.1 red: 56(84) bytes of data. 64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.058 ms Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-08 00:25:38 -07:00
David Ahern	911a66fbc8	net: vrf: Minor refactoring for local address patches Move the stripping of the ethernet header from is_ip_tx_frame into the ipv4 and ipv6 outbound functions and collapse vrf_send_v4_prep into vrf_process_v4_outbound. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-08 00:25:38 -07:00
Tom Herbert	c1e48af796	gue: Implement direction IP encapsulation This patch implements direct encapsulation of IPv4 and IPv6 packets in UDP. This is done a version "1" of GUE and as explained in I-D draft-ietf-nvo3-gue-03. Changes here are only in the receive path, fou with IPxIPx already supports the transmit side. Both the normal receive path and GRO path are modified to check for GUE version and check for IP version in the case that GUE version is "1". Tested: IPIP with direct GUE encap 1 TCP_STREAM 4530 Mbps 200 TCP_RR 1297625 tps 135/232/444 90/95/99% latencies IP4IP6 with direct GUE encap 1 TCP_STREAM 4903 Mbps 200 TCP_RR 1184481 tps 149/253/473 90/95/99% latencies IP6IP6 direct GUE encap 1 TCP_STREAM 5146 Mbps 200 TCP_RR 1202879 tps 146/251/472 90/95/99% latencies SIT with direct GUE encap 1 TCP_STREAM 6111 Mbps 200 TCP_RR 1250337 tps 139/241/467 90/95/99% latencies Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 23:51:14 -07:00
David S. Miller	34fe76abbe	Merge branch 'net-sched-fast-stats' Eric Dumazet says: ==================== net: sched: faster stats gathering A while back, I sent one RFC patch using lockless stats gathering on 64bit arches. This patch series does it more cleanly, using a seqcount. Since qdisc/class stats are written at dequeue() time, we can ask the dequeue to change the seqcount, so that stats readers can avoid taking the root qdisc lock, and instead the typical read_seqcount_{begin\|retry} guarded loop. This does not change fast path costs, as the seqcount increments are not more expensive than the bit manipulation, and allows readers to not freeze the fast path anymore. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 16:37:14 -07:00
Eric Dumazet	edb09eb17e	net: sched: do not acquire qdisc spinlock in qdisc/class stats dump Large tc dumps (tc -s {qdisc\|class} sh dev ethX) done by Google BwE host agent [1] are problematic at scale : For each qdisc/class found in the dump, we currently lock the root qdisc spinlock in order to get stats. Sampling stats every 5 seconds from thousands of HTB classes is a challenge when the root qdisc spinlock is under high pressure. Not only the dumps take time, they also slow down the fast path (queue/dequeue packets) by 10 % to 20 % in some cases. An audit of existing qdiscs showed that sch_fq_codel is the only qdisc that might need the qdisc lock in fq_codel_dump_stats() and fq_codel_dump_class_stats() In v2 of this patch, I now use the Qdisc running seqcount to provide consistent reads of packets/bytes counters, regardless of 32/64 bit arches. I also changed rate estimators to use the same infrastructure so that they no longer need to lock root qdisc lock. [1] http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43838.pdf Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Kevin Athey <kda@google.com> Cc: Xiaotian Pei <xiaotian@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 16:37:14 -07:00
Eric Dumazet	f9eb8aea2a	net_sched: transform qdisc running bit into a seqcount Instead of using a single bit (__QDISC___STATE_RUNNING) in sch->__state, use a seqcount. This adds lockdep support, but more importantly it will allow us to sample qdisc/class statistics without having to grab qdisc root lock. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 16:37:13 -07:00
David S. Miller	64151ae36e	Merge branch 'be2net-noncrit-fixes' Sathya Perla says: ==================== be2net: patch set Hi David, the following patch set contains three non-critical fixes that can go into the net-next tree. Patch 1 fixes the logic for provisioning queue pairs on VFs to take into account the limit on number of TXQs too as in some profiles the number of TXQs is less than that of RXQs. Patch 2 enables WoL support from shutdown on Skyhawk. Patch 3 enhances the logic for provisioning queue pairs on VFs on SR-IOV over multi-partition configs. Each PF (partition) on a port has to compute the number of RSS tables it's VFs can use. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 16:18:20 -07:00
Somnath Kotur	de2b1e0366	be2net: Fix provisioning of RSS for VFs in multi-partition configurations Currently, we do not distribute queue resources to enable RSS for VFs in multi-channel/partition configurations. Fix this by having each PF(SRIOV capable) calculate it's share of the 15 RSS Policy Tables available per port before provisioning resources for all the VFs. This proportional share calculation is done based on division of the PF's MAX VFs with the Total MAX VFs on that port. It also needs to learn about the no: of NIC PFs on the port and subtract that from the 15 RSS Policy Tables on the port. Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 16:18:20 -07:00
Sriharsha Basavapatna	45f13df75f	be2net: Enable Wake-On-LAN from shutdown for Skyhawk Skyhawk does support wake-up from ACPI shutdown state - S5, provided the platform supports it (like Auxiliary power source etc). The changes listed below are done to fix this. 1) There's no need to defer the HW configuration of WOL to be_suspend(). Remove this in be_suspend() and move it to be_set_wol() ethtool function so it is configured directly in the context of ethtool. This automatically takes care of the shutdown case. 2) The driver incorrectly uses WOL_CAP field in the FW response to get_acpi_wol_cap() command, to determine if WOL is enabled. Instead the driver must rely on the macaddr field in the response to infer WOL state. 3) In be_get_config() during init, if we find that WOL is enabled in FW, call pci_enable_wake() to enable pmcsr.pme_en bit. This is needed to support persistent WOL configuration provided by the FW in some platforms. 4) Remove code in be_set_wol() that writes to PCICFG_PM_CONTROL_OFFSET to set pme_en bit; pci_enable_wake() sets that. Fixes: `028991e49` ("Enabling Wake-on-LAN is not supported in S5 state") Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Sathya Perla <sathya.perla@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 16:18:19 -07:00
Suresh Reddy	b9263cbf21	be2net: use max-TXQs limit too while provisioning VF queue pairs When the PF driver provisions resources for VFs, it currently only looks at max RSS queues available to calculate the number of VF queue pairs. This logic breaks when there are less number of TX-queues than RSS-queues. This patch fixes this problem by using the max-TXQs available in the PF-pool in the calculations. As a part of this change the be_calculate_vf_qs() routine is renamed as be_calculate_vf_res() and the code that calculates limits on other related resources is moved here to contain all resource calculation code inside one routine. Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com> Signed-off-by: Sathya Perla <sathya.perla@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 16:18:19 -07:00
Zhao Qiang	c19b6d246a	drivers/net: support hdlc function for QE-UCC The driver add hdlc support for Freescale QUICC Engine. It support NMSI and TSA mode. Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:56:31 -07:00
Zhao Qiang	35ef1c20fd	fsl/qe: Add QE TDM lib QE has module to support TDM, some other protocols supported by QE are based on TDM. add a qe-tdm lib, this lib provides functions to the protocols using TDM to configurate QE-TDM. Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:56:31 -07:00
Zhao Qiang	19163ac312	fsl/qe: Make regs resouce_size_t Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:56:31 -07:00
Zhao Qiang	bb8b2062af	fsl/qe: setup clock source for TDM mode Add tdm clock configuration in both qe clock system and ucc fast controller. Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:56:30 -07:00
Zhao Qiang	68f047e3d6	fsl/qe: add rx_sync and tx_sync for TDM mode Rx_sync and tx_sync are used by QE-TDM mode, add them to struct ucc_fast_info. Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:56:30 -07:00
Jamal Hadi Salim	0b0f43fe2e	net sched: indentation and other OCD stylistic fixes Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com>	2016-06-07 15:53:54 -07:00
David S. Miller	be11991368	Merge branch 'sch-action-tstamp' Jamal Hadi Salim says: ==================== net sched action timestamp improvements Various aggregations of duplicated code, fixes and introduction of firstused timestamp v2: add const for source time info per suggestion from Cong ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:53:44 -07:00
Jamal Hadi Salim	48d8ee1694	net sched actions: aggregate dumping of actions timeinfo Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:53:43 -07:00
Jamal Hadi Salim	53eb440f4a	net sched actions: introduce timestamp for firsttime use Useful to know when the action was first used for accounting (and debugging) Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:53:43 -07:00
Jamal Hadi Salim	9c4a4e488b	net sched: actions use tcf_lastuse_update for consistency Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:53:43 -07:00
Amir Vadai	e69985c67c	net/sched: cls_flower: Introduce support in SKIP SW flag In order to make a filter processed only by hardware, skip_sw flag should be supplied. This is an addition to the already existing skip_hw flag (filter will be processed by software only). If no flag is specified, filter will be processed by both software and hardware. If only hardware offloaded filters exist, fl_classify() will return without doing anything. A following userspace patch will be sent once kernel patch is accepted. Example: tc filter add dev enp0s9 protocol ip prio 20 parent ffff: \ flower \ ip_proto 6 \ indev enp0s9 \ skip_sw \ action skbedit mark 0x1234 Signed-off-by: Amir Vadai <amirva@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:49:53 -07:00
David S. Miller	919f274fd6	Merge branch 'qed-iov-fw-reqs' Yuval Mintz says: ==================== qed: IOV series - relax firmware requirements In order for VFs to work, current implementation demands that the VF's requried storm firmware would be exactly the version that was loaded by the PF, which is a very harsh requirement. This patch series is intended to relax this - the recently submitted firmware is intended to be forward/backward compatible in its fastpath [slowpath is configured by PF on behalf of VF], and so VFs would only be required of having the same major faspath HSI in order to work. Most of the other patches in this series extend current forward compatibilty of driver to reduce chance of breaking PF/VF compatibility in the future. A few are unrelated IOV changes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:40:12 -07:00
Yuval Mintz	54fdd80f6f	qed: PF to reply to unknown messages If a future VF would send the PF an unknown message, the PF today would not send a reply. This would have 2 bad effects: a. VF would have to timeout on the request. b. If VF were to send an additional message to PF, firmware would mark it as malicious. Instead, if there's some valid reply-address on the message - let the PF answer and tell the VF it doesn't know the message. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:40:12 -07:00
Yuval Mintz	8246d0b48b	qed: PF enforce MAC limitation of VFs The only limitation relating to MACs the PF enforce today on its VFs is in case it has a forced-unicast MAC address for them, in which case they can't configure other unicast addresses. Specifically, the PF isn't enforcing the number of MAC addresse a VF can configure regardless of the nubmer of such filters agreed upon by PF and VF during the acquisition process. PF's shadow-config is now extended to also contain information about its VFs' unicast addresses configuration, allowing such enforcement. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:40:12 -07:00
Yuval Mintz	5040acf537	qed: Move doorbell calculation from VF to PF Today, the VF is aware of its queues context-ids, and calculates the doorbell address when opening its queues on its own. The configuration of doorbells in HW can sometime in the future be changed by the PF [hw has several configurable features that might affect doorbell addresses, e.g., dpm support], this would break compatibility with older VFs as their calculated doorbell addresses would be incorrect for such a configuration. In order to avoid such a backward compatibility failure, let the PF make the calculation of the doorbell offset based on the context-id, and pass that to the VF. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:40:11 -07:00
Yuval Mintz	41086467d6	qed: Make PF more robust against malicious VF There are several requests the VF can make toward the PF which the driver would pass to firmware without checking the validity first - specifically, opening queues and updating vports. Such configurations might cause the firmware to assert. This adds validation of the legality of said configurations on the PF side before passing it onward via ramrod to firmware. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:40:11 -07:00
Yuval Mintz	1cf2b1a971	qed: PF-VF resource negotiation One of the goals of the vf's first message to the PF [acquire] is to learn about the number of resources available to it [macs, vlans, etc.]. This is done via negotiation - the VF requires a set of resources, which the PF either approves or disaproves and sends a smaller set of resources as alternative. In this later case, the VF is then expected to either abort the probe or re-send the acquire message with less required resources. While this infrastructure exists since the initial submision of qed SRIOV support, it's in fact completely inoperational - PF isn't really looking into the resources the VF has asked for and is never going to reply to the VF that it lacks resources. This patch addresses this flow, fixing it and allowing the PF and VF to actually agree on a set of resources. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:40:11 -07:00
Yuval Mintz	1fe614d10f	qed: Relax VF firmware requirements Current driver require an exact match between VF and PF storm firmware; Any difference would fail the VF acquire message, causing the VF probe to be aborted. While there's still dependencies between the two, the recent FW submission has relaxed the match requirement - instead of an exact match, there's now a 'fastpath' HSI major/minor scheme, where VFs and PFs that match in their major number can co-exist even if their minor is different. In order to accomadate this change some changes in the vf-start init flow had to be made, as the VF start ramrod now has to be sent only after PF learns which fastpath HSI its VF is requiring. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:40:11 -07:00
Eric Dumazet	3bcb846ca4	net: get rid of spin_trylock() in net_tx_action() Note: Tom Herbert posted almost same patch 3 months back, but for different reasons. The reasons we want to get rid of this spin_trylock() are : 1) Under high qdisc pressure, the spin_trylock() has almost no chance to succeed. 2) We loop multiple times in softirq handler, eventually reaching the max retry count (10), and we schedule ksoftirqd. Since we want to adhere more strictly to ksoftirqd being waked up in the future (https://lwn.net/Articles/687617/), better avoid spurious wakeups. 3) calls to __netif_reschedule() dirty the cache line containing q->next_sched, slowing down the owner of qdisc. 4) RT kernels can not use the spin_trylock() here. With help of busylock, we get the qdisc spinlock fast enough, and the trylock trick brings only performance penalty. Depending on qdisc setup, I observed a gain of up to 19 % in qdisc performance (1016600 pps instead of 853400 pps, using prio+tbf+fq_codel) ("mpstat -I SCPU 1" is much happier now) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 15:32:03 -07:00
Jason Wang	8241a1e466	vhost_net: stop polling socket during rx processing We don't stop rx polling socket during rx processing, this will lead unnecessary wakeups from under layer net devices (E.g sock_def_readable() form tun). Rx will be slowed down in this way. This patch avoids this by stop polling socket during rx processing. A small drawback is that this introduces some overheads in light load case because of the extra start/stop polling, but single netperf TCP_RR does not notice any change. In a super heavy load case, e.g using pktgen to inject packet to guest, we get about ~8.8% improvement on pps: before: ~1240000 pkt/s after: ~1350000 pkt/s Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-07 14:46:11 -07:00
Bhaktipriya Shridhar	aaa76724d7	net: ethernet: cavium: liquidio: request_manager: Remove create_workqueue alloc_workqueue replaces deprecated create_workqueue(). A dedicated workqueue has been used since the workitem viz (&db_wq->wk.work which maps to check_db_timeout) is involved in normal device operation. WQ_MEM_RECLAIM has been set to guarantee forward progress under memory pressure, which is a requirement here. Since there are only a fixed number of work items, explicit concurrency limit is unnecessary. flush_workqueue is unnecessary since destroy_workqueue() itself calls drain_workqueue() which flushes repeatedly till the workqueue becomes empty. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-06 21:17:02 -04:00
Bhaktipriya Shridhar	523a61b488	net: ethernet: cavium: liquidio: response_manager: Remove create_workqueue alloc_workqueue replaces deprecated create_workqueue(). A dedicated workqueue has been used since the workitem viz (&cwq->wk.work which maps to oct_poll_req_completion) is involved in normal device operation. WQ_MEM_RECLAIM has been set to guarantee forward progress under memory pressure, which is a requirement here. Since there are only a fixed number of work items, explicit concurrency limit is unnecessary. flush_workqueue is unnecessary since destroy_workqueue() itself calls drain_workqueue() which flushes repeatedly till the workqueue becomes empty. Hence the call to flush_workqueue() has been dropped. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-06 21:17:01 -04:00
Aaron Conole	14de9d114a	virtio-net: Add initial MTU advice feature This commit adds the feature bit and associated mtu device entry for the virtio network device. When a virtio device comes up, it checks the feature bit for the VIRTIO_NET_F_MTU feature. If such feature bit is enabled, the driver will read the advised MTU and use it as the initial value. Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-06 21:08:49 -04:00
David S. Miller	3d9dc408fa	net: Revert vrf-local changes. This reverts commit `2fb7ea455d`. It results in build errors because ip6_input is not a symbol exported to modules. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-06 15:58:34 -07:00
David S. Miller	2fb7ea455d	Merge branch 'vrf-local' David Ahern says: ==================== net: vrf: Add support for local traffic to local addresses Add support for locally originated traffic to VRF-local addresses, be it addresses on enslaved devices or addresses on the VRF device: $ ip addr show dev red 33: red: <NOARP,MASTER,UP,LOWER_UP> mtu 65536 qdisc pfifo_fast state UP group default qlen 1000 link/ether be:00:53:b5:e4:25 brd ff:ff:ff:ff:ff:ff inet 1.1.1.1/32 scope global red valid_lft forever preferred_lft forever inet6 1111:1::1/128 scope global valid_lft forever preferred_lft forever $ ip addr show dev eth1 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000 link/ether 02:e0:f9:79:34:bd brd ff:ff:ff:ff:ff:ff inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1 valid_lft forever preferred_lft forever inet6 2100:1::1/120 scope global valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe79:34bd/64 scope link valid_lft forever preferred_lft forever $ ping -c1 -I red 10.100.1.1 ping: Warning: source address might be selected on device other than red. PING 10.100.1.1 (10.100.1.1) from 10.100.1.1 red: 56(84) bytes of data. 64 bytes from 10.100.1.1: icmp_seq=1 ttl=64 time=0.057 ms $ ping -c1 -I red 1.1.1.1 PING 1.1.1.1 (1.1.1.1) from 1.1.1.1 red: 56(84) bytes of data. 64 bytes from 1.1.1.1: icmp_seq=1 ttl=64 time=0.136 ms --- 1.1.1.1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.136/0.136/0.136/0.000 ms $ ping6 -c1 -I red 2100:1::1 ping6: Warning: source address might be selected on device other than red. PING 2100:1::1(2100:1::1) from 2100:1::1 red: 56 data bytes 64 bytes from 2100:1::1: icmp_seq=1 ttl=64 time=0.167 ms --- 2100:1::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.167/0.167/0.167/0.000 ms $ ping6 -c1 -I red 1111::1 PING 1111::1(1111::1) from 1111:1::1 red: 56 data bytes 64 bytes from 1111::1: icmp_seq=1 ttl=64 time=0.187 ms --- 1111::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.187/0.187/0.187/0.000 ms This change also enables use of loopback address on the VRF device: $ ip addr add dev red 127.0.0.1/8 $ ping -c1 -I red 127.0.0.1 PING 127.0.0.1 (127.0.0.1) from 127.0.0.1 red: 56(84) bytes of data. 64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.058 ms ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-06 15:19:07 -07:00
David Ahern	625b47b507	net: vrf: ipv6 support for local traffic to local addresses Add support for locally originated traffic to VRF-local IPv6 addresses. Similar to IPv4 a local dst is set on the skb and the packet is reinserted with a call to netif_rx. With this patch, ping, tcp and udp packets to a local IPv6 address are successfully routed: $ ip addr show dev eth1 4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000 link/ether 02:e0:f9:1c:b9:74 brd ff:ff:ff:ff:ff:ff inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1 valid_lft forever preferred_lft forever inet6 2100:1::1/120 scope global valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe1c:b974/64 scope link valid_lft forever preferred_lft forever $ ping6 -c1 -I red 2100:1::1 ping6: Warning: source address might be selected on device other than red. PING 2100:1::1(2100:1::1) from 2100:1::1 red: 56 data bytes 64 bytes from 2100:1::1: icmp_seq=1 ttl=64 time=0.098 ms ip6_input is exported so the VRF driver can use it for the dst input function. The dst_alloc function for IPv4 defaults to setting the input and output functions; IPv6's does not. VRF does not need to duplicate the Rx path so just export the ipv6 input function. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-06 15:19:02 -07:00
David Ahern	671cd19ade	net: vrf: ipv4 support for local traffic to local addresses Add support for locally originated traffic to VRF-local addresses. If destination device for an skb is the loopback or VRF device then set its dst to a local version of the VRF cached dst_entry and call netif_rx to insert the packet onto the rx queue - similar to what is done for loopback. This patch handles IPv4 support; follow on patch handles IPv6. With this patch, ping, tcp and udp packets to a local IPv4 address are successfully routed: $ ip addr show dev eth1 4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000 link/ether 02:e0:f9:1c:b9:74 brd ff:ff:ff:ff:ff:ff inet 10.100.1.1/24 brd 10.100.1.255 scope global eth1 valid_lft forever preferred_lft forever inet6 2100:1::1/120 scope global valid_lft forever preferred_lft forever inet6 fe80::e0:f9ff:fe1c:b974/64 scope link valid_lft forever preferred_lft forever $ ping -c1 -I red 10.100.1.1 ping: Warning: source address might be selected on device other than red. PING 10.100.1.1 (10.100.1.1) from 10.100.1.1 red: 56(84) bytes of data. 64 bytes from 10.100.1.1: icmp_seq=1 ttl=64 time=0.057 ms This patch also enables use of IPv4 loopback address on the VRF device: $ ip addr add dev red 127.0.0.1/8 $ ping -c1 -I red 127.0.0.1 PING 127.0.0.1 (127.0.0.1) from 127.0.0.1 red: 56(84) bytes of data. 64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.058 ms Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-06 15:19:02 -07:00
David Ahern	09fcf91662	net: vrf: Minor refactoring for local address patches Move the stripping of the ethernet header from is_ip_tx_frame into the ipv4 and ipv6 outbound functions. If the packet is destined to a local address the header is retained since the packet is sent back to netif_rx. Collapse vrf_send_v4_prep into vrf_process_v4_outbound. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-06 15:19:02 -07:00
David S. Miller	b94eb2ce3b	Merge branch 'hv_netvsc-cleanups' Vitaly Kuznetsov says: ==================== hv_netvsc: cleanup after untangling the pointer mess Changes since v1: - resend when net-next is open [David Miller] - rebased to current net-next. After we made traveling through our internal structures explicit it became obvious that some functions take arguments they don't need just to do redundant pointer travel and get to what they really need while their callers already have the required information. This is just a cleanup series with no functional changes intended. It doesn't pretend to be complete, additional cleanup of other functions may follow. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-05 23:16:50 -04:00
Vitaly Kuznetsov	426d95417e	hv_netvsc: pass struct net_device to rndis_filter_set_offload_params() The only caller rndis_filter_device_add() has 'struct net_device' pointer already. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-05 23:16:36 -04:00
Vitaly Kuznetsov	e834da9a40	hv_netvsc: pass struct net_device to rndis_filter_set_device_mac() We unpack 'struct net_device' in netvsc_set_mac_addr() to get to 'struct hv_device' pointer which we use in rndis_filter_set_device_mac() to get back to 'struct net_device'. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-05 23:16:36 -04:00
Vitaly Kuznetsov	2f5fa6c869	hv_netvsc: pass struct netvsc_device to rndis_filter_{open, close}() Both rndis_filter_open()/rndis_filter_close() use struct hv_device to reach to struct netvsc_device only and all callers have it already. While on it, rename net_device to nvdev in rndis_filter_open() as net_device is misleading. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-05 23:16:36 -04:00
Vitaly Kuznetsov	2625466d6d	hv_netvsc: introduce {net, hv}_device_to_netvsc_device() helpers Make it easier to get 'struct netvsc_device' from 'struct net_device' and 'struct hv_device' by introducing inline helpers. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-05 23:16:35 -04:00
Vitaly Kuznetsov	4baa994dc9	hv_netvsc: remove redundant assignment in netvsc_recv_callback() net_device_ctx is assigned in the very beginning of the function and 'net' pointer doesn't change. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-05 23:16:35 -04:00
Michal Kubeček	30759219f5	net: disable fragment reassembly if high_thresh is zero Before commit `6d7b857d54` ("net: use lib/percpu_counter API for fragmentation mem accounting"), setting the reassembly high threshold to 0 prevented fragment reassembly as first fragment would be always evicted before second could be added to the queue. While inefficient, some users apparently relied on this method. Since the commit mentioned above, a percpu counter is used for reassembly memory accounting and high batch size avoids taking slow path in most common scenarios. As a result, a whole full sized packet can be reassembled without the percpu counter's main counter changing its value so that even with high_thresh set to 0, fragmented packets can be still reassembled and processed. Add explicit check preventing reassembly if high threshold is zero. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-05 22:56:42 -04:00
David S. Miller	b1c6a3a46b	Merge branch 'hns-acpi' Kejian Yan says: ==================== net: hns: add support of ACPI This series adds HNS support of acpi. The routine will call some ACPI helper functions, like acpi_dev_found() and acpi_evaluate_dsm(), which are not included in other cases. In order to make system compile successfully in other cases except ACPI, it needs to add relative stub functions to linux/acpi.h. And we use device property functions instead of serial helper functions to suport both DT and ACPI cases. And then add the supports of ACPI for HNS. change log: v3->v4: mii-id gets from dev-name instead of address v2->v3: 1. add Review-by: Andy Shevchenko 2. fix the potential memory leak v1 -> v2: 1. use acpi_dev_found() instead of acpi_match_device_ids() to check if it is a acpi node. 2. use is_of_node() instead of IS_ENABLED() to check if it is a DT node. 3. split the patch("add support of acpi for hns-mdio") into two patches: 3.1 Move to use fwnode_handle 3.2 Add ACPI 4. add the patch which subject is dsaf misc operation method 5. fix the comments by Andy Shevchenko ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-04 21:32:46 -04:00
Kejian Yan	63434888aa	net: hns: net: hns: enet adds support of acpi Enet needs to get configration parameter by acpi. This patch adds support of ACPI for enet. The configuration parameter will be configed in BIOS. Signed-off-by: Kejian Yan <yankejian@huawei.com> Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-04 21:32:41 -04:00
Kejian Yan	f00ef863da	net: hns: implement the miscellaneous operation by asl The miscellaneous operation is implemented in BIOS, the kernel can call _DSM method help to call the implementation in ACPI case. Here is a patch to do that. Signed-off-by: Kejian Yan <yankejian@huawei.com> Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-04 21:32:41 -04:00
Kejian Yan	1d1afa2ebf	net: hns: register phy device in each mac initial sequence In ACPI case, there is no interface to register phy device to mdio-bus. Phy device has to be registered itself to mdio-bus, and then enet can get the phy device's info so that it can config the phy-device to help to trasmit and receive data. HNS hardware topology is as below. The MDIO controller may control several PHY-devices, and each PHY-device connects to a MAC device. PHY-devices will register when each mac find PHY device in initial sequence. cpu \| \| ------------------------------------------- \| \| \| \| \| \| \| dsaf \| MDIO \| MDIO \| --------------------------- \| \| \| \| \| \| \| \| \| \| \| \| \| \| MAC MAC MAC MAC \| \| \| \| \| \| \| ---- \|-------- \|-------- \| \| -------- \|\| \|\| \|\| \|\| PHY PHY PHY PHY Signed-off-by: Kejian Yan <yankejian@huawei.com> Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-04 21:32:41 -04:00
Kejian Yan	8413b3be4d	net: hns: dsaf adds support of acpi Dsaf needs to get configuration parameter by ACPI, so this patch add support of ACPI. Signed-off-by: Kejian Yan <yankejian@huawei.com> Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-04 21:32:41 -04:00

1 2 3 4 5 ...

601887 Commits