Commit Graph

693342 Commits

Author SHA1 Message Date
Arkadi Sharshevsky
c9eb3e0f87 net: dsa: Add support for learning FDB through notification
Add support for learning FDB through notification. The driver defers
the hardware update via ordered work queue. In case of a successful
FDB add a notification is sent back to bridge.

In case of hw FDB del failure the static FDB will be deleted from
the bridge, thus, the interface is moved to down state in order to
indicate inconsistent situation.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:48:48 -07:00
Arkadi Sharshevsky
2acf4e6a89 net: dsa: Remove switchdev dependency from DSA switch notifier chain
Currently, the switchdev objects are embedded inside the DSA notifier
info. This patch removes this dependency. This is done as a preparation
stage before adding support for learning FDB through the switchdev
notification chain.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:48:48 -07:00
Arkadi Sharshevsky
1b6dd556c3 net: dsa: Remove prepare phase for FDB
The prepare phase for FDB add is unneeded because most of DSA devices
can have failures during bus transactions (SPI, I2C, etc.), thus, the
prepare phase cannot guarantee success of the commit stage.

The support for learning FDB through notification chain, which will be
introduced in the following patches, will provide the ability to notify
back the bridge about successful offload.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:48:47 -07:00
Arkadi Sharshevsky
6c2c1dcb18 net: dsa: Change DSA slave FDB API to be switchdev independent
In order to support FDB add/del to be on a notifier chain the slave
API need to be changed to be switchdev independent.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:48:47 -07:00
Bhumika Goyal
511aeaf466 hamradio: baycom: make hdlcdrv_ops const
Make hdlcdrv_ops structures const as they are only passed to
hdlcdrv_register function. The corresponding argument is of type const,
so make the structures const.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:26:46 -07:00
Florian Westphal
13ead5c4f2 xfrm: check that cached bundle is still valid
Quoting Ilan Tayari:
  1. Set up a host-to-host IPSec tunnel (or transport, doesn't matter)
  2. Ping over IPSec, or do something to populate the pcpu cache
  3. Join a MC group, then leave MC group
  4. Try to ping again using same CPU as before -> traffic
     doesn't egress the machine at all

Ilan debugged the problem down to the fact that one of the path dsts
devices point to lo due to earlier dst_dev_put().
In this case, dst is marked as DEAD and we cannot reuse the bundle.

The cache only asserted that the requested policy and that of the cached
bundle match, but its not enough - also verify the path is still valid.

Fixes: ec30d78c14 ("xfrm: add xdst pcpu cache")
Reported-by: Ayham Masood <ayhamm@mellanox.com>
Tested-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:25:39 -07:00
David S. Miller
d899cb2e5f Merge branch 'net-dsa-remove-useless-arguments'
Vivien Didelot says:

====================
net: dsa: remove useless arguments

Several DSA core setup functions take many arguments, mostly because of
the legacy code. This patch series removes the useless args of these
functions, where either the dsa_switch or dsa_port argument is enough.

Changes in v2:
  - ds->dev is already assigned by dsa_switch_alloc
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:24:21 -07:00
Vivien Didelot
4cfbf09cf9 net: dsa: remove useless args of dsa_slave_create
dsa_slave_create currently takes 4 arguments while it only needs the
related dsa_port and its name. Remove all other arguments.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:24:14 -07:00
Vivien Didelot
47d0dcc354 net: dsa: remove useless args of dsa_cpu_dsa_setup
dsa_cpu_dsa_setup currently takes 4 arguments but they are all available
from the dsa_port argument. Remove all others.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:22:42 -07:00
Vivien Didelot
206e41fe7f net: dsa: remove useless argument in legacy setup
dsa_switch_alloc() already assigns ds-dev, which can be used in
dsa_switch_setup_one and dsa_cpu_dsa_setups instead of requiring an
additional struct device argument.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:22:42 -07:00
Colin Ian King
e00e21979d net: hns3: fix spelling mistake: "capabilty" -> "capability"
Trivial fix to spelling mistake in dev_err error message and also
split overly long line to avoid a checkpatch warning.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:20:52 -07:00
David S. Miller
2f23678c28 Merge branch 'Refactor-lan9303_xxx_packet_processing'
Egil Hjelmeland says:

====================
Refactor lan9303_xxx_packet_processing

This series is purely non functional.

It changes the lan9303_enable_packet_processing,
lan9303_disable_packet_processing() to pass port number (0,1,2) as
parameter instead of port offset. This aligns them with
other functions in the module, and makes it possible to simplify the code.

The lan9303_enable_packet_processing, lan9303_disable_packet_processing
functions operate on port. Therefore rename the functions to reflect that
as well.

Reviewer pointed out lan9303_get_ethtool_stats would be better off with
the use of a lan9303_read_switch_port(). So that was added to the series.

Changes v3 -> v4:
 - Whitespace adjustments.

Changes v2 -> v3:
 - Patch 1: Removed the change in lan9303_get_ethtool_stats
 - Added patch 4: rename lan9303_xxx_packet_processing
 - Added patch 5: refactor lan9303_get_ethtool_stats

Changes v1 -> v2:
 - introduced lan9303_write_switch_port() in first patch
 - inserted LAN9303_NUM_PORTS patch
 - Use LAN9303_NUM_PORTS in last patch. Plus whitespace change.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:18:01 -07:00
Egil Hjelmeland
0a967b4a8e net: dsa: lan9303: refactor lan9303_get_ethtool_stats
In lan9303_get_ethtool_stats: Get rid of 0x400 constant magic
by using new lan9303_read_switch_reg() inside loop.
Reduced scope of two variables.

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:18:01 -07:00
Egil Hjelmeland
9c84258ed6 net: dsa: lan9303: Rename lan9303_xxx_packet_processing()
The lan9303_enable_packet_processing, lan9303_disable_packet_processing
functions operate on port, so the names should reflect that.
And to align with lan9303_disable_processing(), rename:

lan9303_enable_packet_processing -> lan9303_enable_processing_port
lan9303_disable_packet_processing -> lan9303_disable_processing_port

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:18:01 -07:00
Egil Hjelmeland
b3d14a2b2f net: dsa: lan9303: Simplify lan9303_xxx_packet_processing() usage
Simplify usage of lan9303_enable_packet_processing,
lan9303_disable_packet_processing()

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:18:01 -07:00
Egil Hjelmeland
a368ca5378 net: dsa: lan9303: define LAN9303_NUM_PORTS 3
Will be used instead of '3' in upcomming patches.

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:18:01 -07:00
Egil Hjelmeland
451d3ca0a0 net: dsa: lan9303: Change lan9303_xxx_packet_processing() port param.
lan9303_enable_packet_processing, lan9303_disable_packet_processing()
Pass port number (0,1,2) as parameter instead of port offset.
Because other functions in the module pass port numbers.
And to enable simplifications in following patch.

Introduce lan9303_write_switch_port().

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:18:01 -07:00
David S. Miller
51cb87c075 Merge branch 'ipv6-sr-add-support-for-advanced-local-segment-processing'
David Lebrun says:

====================
ipv6: sr: add support for advanced local segment processing

v2: use EXPORT_SYMBOL_GPL

The current implementation of IPv6 SR supports SRH insertion/encapsulation
and basic segment endpoint behavior (i.e., processing of an SRH contained in
a packet whose active segment (IPv6 DA) is routed to the local node). This
behavior simply consists of updating the DA to the next segment and forwarding
the packet accordingly. This processing is realised for all such packets,
regardless of the active segment.

The most recent specifications of IPv6 SR [1] [2] extend the SRH processing
features as follows. Each segment endpoint defines a MyLocalSID table.
This table maps segments to operations to perform. For each ingress IPv6
packet whose DA is part of a given prefix, the segment endpoint looks
up the active segment (i.e., the IPv6 DA) in the MyLocalSID table and
applies the corresponding operation. Such specifications enable to specify
arbitrary operations besides the basic SRH processing and allow for a more
fine-grained classification.

This patch series implements those extended specifications by leveraging
a new type of lightweight tunnel, seg6local. The MyLocalSID table is
simply an arbitrary routing table (using CONFIG_IPV6_MULTIPLE_TABLES). The
following commands would assign the prefix fc00::/64 to the MyLocalSID
table, map the segment fc00::42 to the regular SRH processing function
(named "End"), and drop all packets received with an undefined active
segment:

ip -6 rule add fc00::/64 lookup 100
ip -6 route add fc00::42 encap seg6local action End dev eth0 table 100
ip -6 route add blackhole default table 100

As another example, the following command would assign the segment
fc00::1234 to the regular SRH processing function, except that the
processed packet must be forwarded to the next-hop fc42::1 (this operation
is named "End.X"):

ip -6 route add fc00::1234 encap seg6local action End.X nh6 fc42::1 dev eth0 table 100

Those two basic operations (End and End.X) are defined in [1]. A more
extensive list of advanced operations is defined in [2].

The first two patches of the series are preliminary work that remove an
assumption about initial SRH format, and export the two functions used to
insert and encapsulate an SRH onto packets. The third patch defines the
new seg6local lightweight tunnel and implement the core functions. The
fourth patch implements the operations needed to handle the newly defined
rtnetlink attributes. The fifth patch implements a few SRH processing
operations, including End and End.X.

[1] https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-07
[2] https://tools.ietf.org/html/draft-filsfils-spring-srv6-network-programming-01
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:16:22 -07:00
David Lebrun
140f04c33b ipv6: sr: implement several seg6local actions
This patch implements the following seg6local actions.

- SEG6_LOCAL_ACTION_END: regular SRH processing. The DA of the packet
  is updated to the next segment and forwarded accordingly.

- SEG6_LOCAL_ACTION_END_X: same as above, except that the packet is
  forwarded to the specified IPv6 next-hop.

- SEG6_LOCAL_ACTION_END_DX6: decapsulate the packet and forward to
  inner IPv6 packet to the specified IPv6 next-hop.

- SEG6_LOCAL_ACTION_END_B6: insert the specified SRH directly after
  the IPv6 header of the packet.

- SEG6_LOCAL_ACTION_END_B6_ENCAP: encapsulate the packet within
  an outer IPv6 header, containing the specified SRH.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:16:22 -07:00
David Lebrun
2d9cc60aee ipv6: sr: add rtnetlink functions for seg6local action parameters
This patch adds the necessary functions to parse, fill, and compare
seg6local rtnetlink attributes, for all defined action parameters.

- The SRH parameter defines an SRH to be inserted or encapsulated.
- The TABLE parameter defines the table to use for the route lookup of
  the next segment or the inner decapsulated packet.
- The NH4 parameter defines the IPv4 next-hop for an inner decapsulated
  IPv4 packet.
- The NH6 parameter defines the IPv6 next-hop for the next segment or
  for an inner decapsulated IPv6 packet
- The IIF parameter defines an ingress interface index.
- The OIF parameter defines an egress interface index.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:16:22 -07:00
David Lebrun
d1df6fd8a1 ipv6: sr: define core operations for seg6local lightweight tunnel
This patch implements a new type of lightweight tunnel named seg6local.
A seg6local lwt is defined by a type of action and a set of parameters.
The action represents the operation to perform on the packets matching the
lwt's route, and is not necessarily an encapsulation. The set of parameters
are arguments for the processing function.

Each action is defined in a struct seg6_action_desc within
seg6_action_table[]. This structure contains the action, mandatory
attributes, the processing function, and a static headroom size required by
the action. The mandatory attributes are encoded as a bitmask field. The
static headroom is set to a non-zero value when the processing function
always add a constant number of bytes to the skb (e.g. the header size for
encapsulations).

To facilitate rtnetlink-related operations such as parsing, fill_encap,
and cmp_encap, each type of action parameter is associated to three
function pointers, in seg6_action_params[].

All actions defined in seg6_local.h are detailed in [1].

[1] https://tools.ietf.org/html/draft-filsfils-spring-srv6-network-programming-01

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:16:22 -07:00
David Lebrun
b04c80d3a7 ipv6: sr: export SRH insertion functions
This patch exports the seg6_do_srh_encap() and seg6_do_srh_inline()
functions. It also removes the CONFIG_IPV6_SEG6_INLINE knob
that enabled the compilation of seg6_do_srh_inline(). This function
is now built-in.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:16:21 -07:00
David Lebrun
925615ceda ipv6: sr: allow SRH insertion with arbitrary segments_left value
The seg6_validate_srh() function only allows SRHs whose active segment is
the first segment of the path. However, an application may insert an SRH
whose active segment is not the first one. Such an application might be
for example an SR-aware Virtual Network Function.

This patch enables to insert SRHs with an arbitrary active segment.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:16:21 -07:00
John Fastabend
4cc7b9544b bpf: devmap fix mutex in rcu critical section
Originally we used a mutex to protect concurrent devmap update
and delete operations from racing with netdev unregister notifier
callbacks.

The notifier hook is needed because we increment the netdev ref
count when a dev is added to the devmap. This ensures the netdev
reference is valid in the datapath. However, we don't want to block
unregister events, hence the initial mutex and notifier handler.

The concern was in the notifier hook we search the map for dev
entries that hold a refcnt on the net device being torn down. But,
in order to do this we require two steps,

  (i) dereference the netdev:  dev = rcu_dereference(map[i])
 (ii) test ifindex:   dev->ifindex == removing_ifindex

and then finally we can swap in the NULL dev in the map via an
xchg operation,

  xchg(map[i], NULL)

The danger here is a concurrent update could run a different
xchg op concurrently leading us to replace the new dev with a
NULL dev incorrectly.

      CPU 1                        CPU 2

   notifier hook                   bpf devmap update

   dev = rcu_dereference(map[i])
                                   dev = rcu_dereference(map[i])
                                   xchg(map[i]), new_dev);
                                   rcu_call(dev,...)
   xchg(map[i], NULL)

The above flow would create the incorrect state with the dev
reference in the update path being lost. To resolve this the
original code used a mutex around the above block. However,
updates, deletes, and lookups occur inside rcu critical sections
so we can't use a mutex in this context safely.

Fortunately, by writing slightly better code we can avoid the
mutex altogether. If CPU 1 in the above example uses a cmpxchg
and _only_ replaces the dev reference in the map when it is in
fact the expected dev the race is removed completely. The two
cases being illustrated here, first the race condition,

      CPU 1                          CPU 2

   notifier hook                     bpf devmap update

   dev = rcu_dereference(map[i])
                                     dev = rcu_dereference(map[i])
                                     xchg(map[i]), new_dev);
                                     rcu_call(dev,...)
   odev = cmpxchg(map[i], dev, NULL)

Now we can test the cmpxchg return value, detect odev != dev and
abort. Or in the good case,

      CPU 1                          CPU 2

   notifier hook                     bpf devmap update
   dev = rcu_dereference(map[i])
   odev = cmpxchg(map[i], dev, NULL)
                                     [...]

Now 'odev == dev' and we can do proper cleanup.

And viola the original race we tried to solve with a mutex is
corrected and the trace noted by Sasha below is resolved due
to removal of the mutex.

Note: When walking the devmap and removing dev references as needed
we depend on the core to fail any calls to dev_get_by_index() using
the ifindex of the device being removed. This way we do not race with
the user while searching the devmap.

Additionally, the mutex was also protecting list add/del/read on
the list of maps in-use. This patch converts this to an RCU list
and spinlock implementation. This protects the list from concurrent
alloc/free operations. The notifier hook walks this list so it uses
RCU read semantics.

BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
in_atomic(): 1, irqs_disabled(): 0, pid: 16315, name: syz-executor1
1 lock held by syz-executor1/16315:
 #0:  (rcu_read_lock){......}, at: [<ffffffff8c363bc2>] map_delete_elem kernel/bpf/syscall.c:577 [inline]
 #0:  (rcu_read_lock){......}, at: [<ffffffff8c363bc2>] SYSC_bpf kernel/bpf/syscall.c:1427 [inline]
 #0:  (rcu_read_lock){......}, at: [<ffffffff8c363bc2>] SyS_bpf+0x1d32/0x4ba0 kernel/bpf/syscall.c:1388

Fixes: 2ddf71e23c ("net: add notifier hooks for devmap bpf map")
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:13:04 -07:00
David S. Miller
98909f9545 Merge branch 'net_sched-clean-up-filter-handle'
Cong Wang says:

====================
net_sched: clean up filter handle

This patchset sits in my local branch for a long time, it is time to
send it out. It cleans up the ambiguous use of 'unsigned long fh',
please see each of them for details.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:12:18 -07:00
WANG Cong
8113c09567 net_sched: use void pointer for filter handle
Now we use 'unsigned long fh' as a pointer in every place,
it is safe to convert it to a void pointer now. This gets
rid of many casts to pointer.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:12:17 -07:00
WANG Cong
54df2cf819 net_sched: refactor notification code for RTM_DELTFILTER
It is confusing to use 'unsigned long fh' as both a handle
and a pointer, especially commit 9ee7837449
("net sched filters: fix notification of filter delete with proper handle").

This patch introduces tfilter_del_notify() so that we can
pass it as a pointer as before, and we don't need to check
RTM_DELTFILTER in tcf_fill_node() any more.

This prepares for the next patch.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:12:17 -07:00
Roopa Prabhu
08bd10ffb4 lwtunnel: replace EXPORT_SYMBOL with EXPORT_SYMBOL_GPL
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:10:38 -07:00
David S. Miller
1c14dc4850 Merge branch 'bpf-add-support-for-sys-enter-exit-tracepoints'
Yonghong Song says:

====================
bpf: add support for sys_{enter|exit}_* tracepoints

Currently, bpf programs cannot be attached to sys_enter_* and sys_exit_*
style tracepoints. The main reason is that syscalls/sys_enter_* and syscalls/sys_exit_*
tracepoints are treated differently from other tracepoints and there
is no bpf hook to it.

This patch set adds bpf support for these syscalls tracepoints and also
adds a test case for it.

Changelogs:
v3 -> v4:
 - Check the legality of ctx offset access for syscall tracepoint as well.
   trace_event_get_offsets will return correct max offset for each
   specific syscall tracepoint.
 - Use variable length array to avoid hardcode 6 as the maximum
   arguments beyond syscall_nr.
v2 -> v3:
 - Fix a build issue
v1 -> v2:
 - Do not use TRACE_EVENT_FL_CAP_ANY to identify syscall tracepoint.
   Instead use trace_event_call->class.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:09:48 -07:00
Yonghong Song
1da236b6be bpf: add a test case for syscalls/sys_{enter|exit}_* tracepoints
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:09:48 -07:00
Yonghong Song
cf5f5cea27 bpf: add support for sys_enter_* and sys_exit_* tracepoints
Currently, bpf programs cannot be attached to sys_enter_* and sys_exit_*
style tracepoints. The iovisor/bcc issue #748
(https://github.com/iovisor/bcc/issues/748) documents this issue.
For example, if you try to attach a bpf program to tracepoints
syscalls/sys_enter_newfstat, you will get the following error:
   # ./tools/trace.py t:syscalls:sys_enter_newfstat
   Ioctl(PERF_EVENT_IOC_SET_BPF): Invalid argument
   Failed to attach BPF to tracepoint

The main reason is that syscalls/sys_enter_* and syscalls/sys_exit_*
tracepoints are treated differently from other tracepoints and there
is no bpf hook to it.

This patch adds bpf support for these syscalls tracepoints by
  . permitting bpf attachment in ioctl PERF_EVENT_IOC_SET_BPF
  . calling bpf programs in perf_syscall_enter and perf_syscall_exit

The legality of bpf program ctx access is also checked.
Function trace_event_get_offsets returns correct max offset for each
specific syscall tracepoint, which is compared against the maximum offset
access in bpf program.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:09:48 -07:00
Sergei Shtylyov
d226a2b84d of_mdio: use of_property_read_u32_array()
The "fixed-link" prop support predated of_property_read_u32_array(), so
basically had to open-code it. Using the modern API saves 24 bytes of the
object code (ARM gcc 4.8.5); the only behavior change would be that the
prop length check is now less strict (however the strict pre-check done
in of_phy_is_fixed_link() is left intact anyway)...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:07:32 -07:00
John Allen
e1cea2e739 ibmvnic: Report rx buffer return codes as netdev_dbg
Reporting any return code for a receive buffer as an "rx error" only
produces alarming noise and the only values that have been observed to be
used in this field are not error conditions. Change this to a netdev_dbg
with a more descriptive message.

Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 14:00:44 -07:00
David S. Miller
9bcb5a572f Merge branch 'net-l3mdev-Support-for-sockets-bound-to-enslaved-device'
David Ahern says:

====================
net: l3mdev: Support for sockets bound to enslaved device

A missing piece to the VRF puzzle is the ability to bind sockets to
devices enslaved to a VRF. This patch set adds the enslaved device
index, sdif, to IPv4 and IPv6 socket lookups. The end result for users
is the following scope options for services:

1. "global" services - sockets not bound to any device

   Allows 1 service to work across all network interfaces with
   connected sockets bound to the VRF the connection originates
   (Requires net.ipv4.tcp_l3mdev_accept=1 for TCP and
    net.ipv4.udp_l3mdev_accept=1 for UDP)

2. "VRF" local services - sockets bound to a VRF

   Sockets work across all network interfaces enslaved to a VRF but
   are limited to just the one VRF.

3. "device" services - sockets bound to a specific network interface

   Service works only through the one specific interface.

v3
- convert __inet_lookup_established in dccp_v4_err; missed in v2

v2
- remove sk_lookup struct and add sdif as an argument to existing
  functions

Changes since RFC:
- no significant logic changes; mainly whitespace cleanups
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 11:39:22 -07:00
David Ahern
5108ab4bf4 net: ipv6: add second dif to raw socket lookups
Add a second device index, sdif, to raw socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 11:39:22 -07:00
David Ahern
4297a0ef08 net: ipv6: add second dif to inet6 socket lookups
Add a second device index, sdif, to inet6 socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.

TCP moves the data in the cb. Prior to tcp_v4_rcv (e.g., early demux) the
ingress index is obtained from IPCB using inet_sdif and after tcp_v4_rcv
tcp_v4_sdif is used.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 11:39:22 -07:00
David Ahern
1801b570dd net: ipv6: add second dif to udp socket lookups
Add a second device index, sdif, to udp socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.

Early demux lookups are handled in the next patch as part of INET_MATCH
changes.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 11:39:22 -07:00
David Ahern
60d9b03141 net: ipv4: add second dif to multicast source filter
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 11:39:22 -07:00
David Ahern
67359930e1 net: ipv4: add second dif to raw socket lookups
Add a second device index, sdif, to raw socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 11:39:21 -07:00
David Ahern
3fa6f616a7 net: ipv4: add second dif to inet socket lookups
Add a second device index, sdif, to inet socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.

TCP moves the data in the cb. Prior to tcp_v4_rcv (e.g., early demux) the
ingress index is obtained from IPCB using inet_sdif and after the cb move
in  tcp_v4_rcv the tcp_v4_sdif helper is used.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 11:39:21 -07:00
David Ahern
fb74c27735 net: ipv4: add second dif to udp socket lookups
Add a second device index, sdif, to udp socket lookups. sdif is the
index for ingress devices enslaved to an l3mdev. It allows the lookups
to consider the enslaved device as well as the L3 domain when searching
for a socket.

Early demux lookups are handled in the next patch as part of INET_MATCH
changes.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 11:39:21 -07:00
David S. Miller
46d4b68f89 wireless-drivers-next patches for 4.14
The first wireless-drivers-next pull request for 4.14. I'm submitting
 this unusally late in the cycle as my vacation postponed this. But
 even if this is late there's not still that much new features, mostly
 cleanup or fixes.
 
 Major changes:
 
 ath10k
 
 * preparation for wcn3990 support
 
 iwlwifi
 
 * Reorganization of the code into separate directories continues
 
 qtnfmac
 
 * regulatory support updates
 
 * add get_channel, dump_survey and channel_switch cfg80211 handlers
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJZiHuQAAoJEG4XJFUm622bVSEIAKdausycC6OOZjwTGWnFyxE/
 58n79VTrTbXVLwJ7lSBCGYCTujc7amPxAVlDOLYd+9TKm0fO7gap50Gdl35HO5sp
 9v/augHQSouz52q2vgsTi0JbXsqhJQZ4Ie4P0fo8OyqJMYAvFga2FhFBpJseMYd9
 NX88SMoxAGgDkTC0JfzzLnA/jZ0W6ULai6zmRE1s6lUIynP2kzHgpfbMH3+KEkod
 SUW+yX91MdOkkyFGXyY11uuBqanUpEVSAQXW6J76vw3qS88qIqaL3iIeJ6C4Vozq
 fKNkHN4iZOd9FlKY1IFi4vS0+7hWiq6DQ3c+ngtU6cuq1XdBa6PuanC3I2e0B8E=
 =PKUU
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-next-for-davem-2017-08-07' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for 4.14

The first wireless-drivers-next pull request for 4.14. I'm submitting
this unusally late in the cycle as my vacation postponed this. But
even if this is late there's not still that much new features, mostly
cleanup or fixes.

Major changes:

ath10k

* preparation for wcn3990 support

iwlwifi

* Reorganization of the code into separate directories continues

qtnfmac

* regulatory support updates

* add get_channel, dump_survey and channel_switch cfg80211 handlers
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 11:37:47 -07:00
Arnd Bergmann
2a32ca138e hns3: fix unused function warning
Without CONFIG_PCI_IOV, we get a harmless warning about an
unused function:

drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:2273:13: error: 'hclge_disable_sriov' defined but not used [-Werror=unused-function]

The #ifdefs in this driver are obviously wrong, so this just
removes them and uses an IS_ENABLED() check that does the same
thing correctly in a more readable way.

Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 11:23:48 -07:00
David S. Miller
fde6af4729 mlx5-shared-2017-08-07
This series includes some mlx5 updates for both net-next and rdma trees.
 
 From Saeed,
 Core driver updates to allow selectively building the driver with
 or without some large driver components, such as
 	- E-Switch (Ethernet SRIOV support).
 	- Multi-Physical Function Switch (MPFs) support.
 For that we split E-Switch and MPFs functionalities into separate files.
 
 From Erez,
 Delay mlx5_core events when mlx5 interfaces, namely mlx5_ib, registration
 is taking place and until it completes.
 
 From Rabie,
 Increase the maximum supported flow counters.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJZiDoAAAoJEEg/ir3gV/o+594H/RH5kRwC719s/5YQFJXvGsVC
 fjtj3UUJPLrWB8XBh7a4PRcxXPIHaFKJuY3MU7KHFIeZQFklJcit3njjpxDlUINo
 F5S1LHBSYBkeMD/ksWBA8OLCBprNGN6WQ2tuFfAjZlQQ44zqv8LJmegoDtW9bGRy
 aGAkjUmALEblQsq81y0BQwN2/8DA8HAywrs8L2dkH1LHwijoIeYMZFOtKugv1FbB
 ABSKxcU7D/NYw6rsVdZG59fHFQ+eKOspDFqBZrUzfQ+zUU2hFFo96ovfXBfIqYCV
 7BtJuKXu2LeGPzFLsuw4h1131iqFT1iSMy9fEhf/4OwaL/KPP/+Umy8vP/XfM+U=
 =wCpd
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-shared-2017-08-07' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Saeed Mahameed says:

====================
mlx5-shared-2017-08-07

This series includes some mlx5 updates for both net-next and rdma trees.

From Saeed,
Core driver updates to allow selectively building the driver with
or without some large driver components, such as
	- E-Switch (Ethernet SRIOV support).
	- Multi-Physical Function Switch (MPFs) support.
For that we split E-Switch and MPFs functionalities into separate files.

From Erez,
Delay mlx5_core events when mlx5 interfaces, namely mlx5_ib, registration
is taking place and until it completes.

From Rabie,
Increase the maximum supported flow counters.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 10:42:09 -07:00
David S. Miller
71feeef678 Merge branch 'net-sched-summer-cleanup-part-2-ndo_setup_tc'
Jiri Pirko says:

====================
net: sched: summer cleanup part 2, ndo_setup_tc

This patchset focuses on ndo_setup_tc and its args.
Currently there are couple of things that do not make much sense.
The type is passed in struct tc_to_netdev, but as it is always
required, should be arg of the ndo. Other things are passed as args
but they are only relevant for cls offloads and not mqprio. Therefore,
they should be pushed to struct. As the tc_to_netdev struct in the end
is just a container of single pointer, we get rid of it and pass the
struct according to type. So in the end, we have:
ndo_setup_tc(dev, type, type_data_struct)

There are couple of cosmetics done on the way to make things smooth.
Also, reported error is consolidated to eopnotsupp in case the
asked offload is not supported.

v1->v2:
- added forgotten hns3pf bits
====================

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 09:42:37 -07:00
Jiri Pirko
de4784ca03 net: sched: get rid of struct tc_to_netdev
Get rid of struct tc_to_netdev which is now just unnecessary container
and rather pass per-type structures down to drivers directly.
Along with that, consolidate the naming of per-type structure variables
in cls_*.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 09:42:37 -07:00
Jiri Pirko
38cf0426e5 net: sched: change return value of ndo_setup_tc for driver supporting mqprio only
Change the return value from -EINVAL to -EOPNOTSUPP. The rest of the
drivers have it like that, so be aligned.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 09:42:37 -07:00
Jiri Pirko
d7c1c8d2e5 net: sched: move prio into cls_common
prio is not cls_flower specific, but it is meaningful for all
classifiers. Seems that only mlxsw cares about the value. Obviously,
cls offload in other drivers is broken.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 09:42:37 -07:00
Jiri Pirko
5fd9fc4e20 net: sched: push cls related args into cls_common structure
As ndo_setup_tc is generic offload op for whole tc subsystem, does not
really make sense to have cls-specific args. So move them under
cls_common structurure which is embedded in all cls structs.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 09:42:37 -07:00
Jiri Pirko
74897ef0a5 hns3pf: don't check handle during mqprio offload
Similar to the rest offloaders of mqprio, no need to check handle.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 09:42:36 -07:00