Commit Graph

785368 Commits

Author SHA1 Message Date
David S. Miller
4cf34c0cf6 Merge branch 'ena-fixes'
Arthur Kiyanovski says:

====================
minor bug fixes for ENA Ethernet driver

Arthur Kiyanovski (4):
  net: ena: fix warning in rmmod caused by double iounmap
  net: ena: fix rare bug when failed restart/resume is followed by
    driver removal
  net: ena: fix NULL dereference due to untimely napi initialization
  net: ena: fix auto casting to boolean
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 10:49:50 -07:00
Arthur Kiyanovski
248ab77342 net: ena: fix auto casting to boolean
Eliminate potential auto casting compilation error.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 10:49:49 -07:00
Arthur Kiyanovski
78a55d05de net: ena: fix NULL dereference due to untimely napi initialization
napi poll functions should be initialized before running request_irq(),
to handle a rare condition where there is a pending interrupt, causing
the ISR to fire immediately while the poll function wasn't set yet,
causing a NULL dereference.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 10:49:49 -07:00
Arthur Kiyanovski
d7703ddbd7 net: ena: fix rare bug when failed restart/resume is followed by driver removal
In a rare scenario when ena_device_restore() fails, followed by device
remove, an FLR will not be issued. In this case, the device will keep
sending asynchronous AENQ keep-alive events, even after driver removal,
leading to memory corruption.

Fixes: 8c5c7abdeb ("net: ena: add power management ops to the ENA driver")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 10:49:49 -07:00
Arthur Kiyanovski
d79c3888bd net: ena: fix warning in rmmod caused by double iounmap
Memory mapped with devm_ioremap is automatically freed when the driver
is disconnected from the device. Therefore there is no need to
explicitly call devm_iounmap.

Fixes: 0857d92f71 ("net: ena: add missing unmap bars on device removal")
Fixes: 411838e7b4 ("net: ena: fix rare kernel crash when bar memory remap fails")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 10:49:49 -07:00
Andreas Gruenbacher
dc480feb45 gfs2: Fix iomap buffered write support for journaled files
Commit 64bc06bb32 broke buffered writes to journaled files (chattr
+j): we'll try to journal the buffer heads of the page being written to
in gfs2_iomap_journaled_page_done.  However, the iomap code no longer
creates buffer heads, so we'll BUG() in gfs2_page_add_databufs.  Fix
that by creating buffer heads ourself when needed.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2018-10-09 18:20:13 +02:00
Chris Boot
41591b38f5 mmc: block: avoid multiblock reads for the last sector in SPI mode
On some SD cards over SPI, reading with the multiblock read command the last
sector will leave the card in a bad state.

Remove last sectors from the multiblock reading cmd.

Signed-off-by: Chris Boot <bootc@bootc.net>
Signed-off-by: Clément Péron <peron.clem@gmail.com>
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2018-10-09 09:23:00 +02:00
Greg Kroah-Hartman
64c5e530ac ARC updates for 4.19-rc8
- Fix clone syscall to update Thread pointer register
 
  - Make/build updates (needed for AGL/OE builds)   [Alexey]
 
  - Typo fix [Colin Ian King]
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJbu7S9AAoJEGnX8d3iisJedk4QAMM6mlNYRNsryVlDHPlIv90D
 yKSqdKBbcJPmrRlD7wEe5fy17CBg8lCuBKKf4kEkQ60f+TPJ5zYstrGzjiIC2TtR
 SPTDr4aQdmzyEZbxaFw63KUyDPKioIdUNVmzSC0R5QGnvbJj30+5cMnSWTJQamC8
 NAzYSOZCCi8VtXON77fwbnxXdN3/6y+sguYF33N0d3EuNJch+plOvYFQAZTYhB0i
 XT6ve8sa5YwVf2zmD5CtVQlihUP0iOPBkwIKMrBWtXzt/flrc90iFDOsd4uK7Kih
 WvtVlGfKkJuAhL/lMn3I5epV+zYpYR0YhwTXrCVHsNiRr4Qo7uuJx/uEId/Bk7oZ
 JJNcAZZ+d0eJ5iith9Kv4Oi+9VAo+MsaozvQOhUzAalJlA3J/kV2+40DRnjtj1HG
 LBYToKjGKlNAZ+y5o9ygTpOlluZCNsCExsjAp4T4kXNvm8hJs/P9xrjVd7APzRuz
 rMHJMJgQB8P2YaNvDm5O1ZBgD5ePKviKwuPmyuHF5LWvlh8RzEbKiGnT0diS8Ius
 /lAZJKH8EQVNrCNbp0A5EVOQ4sYxjeJnnwbMLNvOs0PYuUNunsP7fiIiwwUEPgxP
 ahqQgnRKtcxyHYI30lpzQGyLUTy45RjURNHlEhcffQhLJj5zhDmgA2e96RXBr5zH
 X3FBPTFjhzHjVABv/iZZ
 =MxJ3
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Vineet writes:
   "ARC updates for 4.19-rc8
    - Fix clone syscall to update Thread pointer register
    - Make/build updates (needed for AGL/OE builds)   [Alexey]
    - Typo fix [Colin Ian King]"

* tag 'arc-4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: clone syscall to setp r25 as thread pointer
  ARC: build: Don't set CROSS_COMPILE in arch's Makefile
  ARC: fix spelling mistake "entires" -> "entries"
  ARC: build: Get rid of toolchain check
  ARCv2: build: use mcpu=hs38 iso generic mcpu=archs
2018-10-09 09:17:46 +02:00
Kees Cook
184d47f0fd x86/mm: Avoid VLA in pgd_alloc()
Arnd Bergmann reported that turning on -Wvla found a new (unintended) VLA usage:

  arch/x86/mm/pgtable.c: In function 'pgd_alloc':
  include/linux/build_bug.h:29:45: error: ISO C90 forbids variable length array 'u_pmds' [-Werror=vla]
  arch/x86/mm/pgtable.c:190:34: note: in expansion of macro 'static_cpu_has'
   #define PREALLOCATED_USER_PMDS  (static_cpu_has(X86_FEATURE_PTI) ? \
                                    ^~~~~~~~~~~~~~
  arch/x86/mm/pgtable.c:431:16: note: in expansion of macro 'PREALLOCATED_USER_PMDS'
    pmd_t *u_pmds[PREALLOCATED_USER_PMDS];
                ^~~~~~~~~~~~~~~~~~~~~~

Use the actual size of the array that is used for X86_FEATURE_PTI,
which is known at build time, instead of the variable size.

[ mingo: Squashed original fix with followup fix to avoid bisection breakage, wrote new changelog. ]

Reported-by: Arnd Bergmann <arnd@arndb.de>
Original-written-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Fixes: 1be3f247c288 ("x86/mm: Avoid VLA in pgd_alloc()")
Link: http://lkml.kernel.org/r/20181008235434.GA35035@beast
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-09 08:55:07 +02:00
David S. Miller
071a234ad7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-10-08

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) sk_lookup_[tcp|udp] and sk_release helpers from Joe Stringer which allow
BPF programs to perform lookups for sockets in a network namespace. This would
allow programs to determine early on in processing whether the stack is
expecting to receive the packet, and perform some action (eg drop,
forward somewhere) based on this information.

2) per-cpu cgroup local storage from Roman Gushchin.
Per-cpu cgroup local storage is very similar to simple cgroup storage
except all the data is per-cpu. The main goal of per-cpu variant is to
implement super fast counters (e.g. packet counters), which don't require
neither lookups, neither atomic operations in a fast path.
The example of these hybrid counters is in selftests/bpf/netcnt_prog.c

3) allow HW offload of programs with BPF-to-BPF function calls from Quentin Monnet

4) support more than 64-byte key/value in HW offloaded BPF maps from Jakub Kicinski

5) rename of libbpf interfaces from Andrey Ignatov.
libbpf is maturing as a library and should follow good practices in
library design and implementation to play well with other libraries.
This patch set brings consistent naming convention to global symbols.

6) relicense libbpf as LGPL-2.1 OR BSD-2-Clause from Alexei Starovoitov
to let Apache2 projects use libbpf

7) various AF_XDP fixes from Björn and Magnus
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 23:42:44 -07:00
Srikar Dronamraju
e054637597 mm, sched/numa: Remove remaining traces of NUMA rate-limiting
Remove the leftover pglist_data::numabalancing_migrate_lock and its
initialization, we stopped using this lock with:

  efaffc5e40 ("mm, sched/numa: Remove rate-limiting of automatic NUMA balancing migration")

[ mingo: Rewrote the changelog. ]

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux-MM <linux-mm@kvack.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1538824999-31230-1-git-send-email-srikar@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-09 08:30:51 +02:00
Reinette Chatre
49e00eee00 x86/intel_rdt: Fix out-of-bounds memory access in CBM tests
While the DOC at the beginning of lib/bitmap.c explicitly states that
"The number of valid bits in a given bitmap does _not_ need to be an
exact multiple of BITS_PER_LONG.", some of the bitmap operations do
indeed access BITS_PER_LONG portions of the provided bitmap no matter
the size of the provided bitmap. For example, if bitmap_intersects()
is provided with an 8 bit bitmap the operation will access
BITS_PER_LONG bits from the provided bitmap. While the operation
ensures that these extra bits do not affect the result, the memory
is still accessed.

The capacity bitmasks (CBMs) are typically stored in u32 since they
can never exceed 32 bits. A few instances exist where a bitmap_*
operation is performed on a CBM by simply pointing the bitmap operation
to the stored u32 value.

The consequence of this pattern is that some bitmap_* operations will
access out-of-bounds memory when interacting with the provided CBM. This
is confirmed with a KASAN test that reports:

 BUG: KASAN: stack-out-of-bounds in __bitmap_intersects+0xa2/0x100

and

 BUG: KASAN: stack-out-of-bounds in __bitmap_weight+0x58/0x90

Fix this by moving any CBM provided to a bitmap operation needing
BITS_PER_LONG to an 'unsigned long' variable.

[ tglx: Changed related function arguments to unsigned long and got rid
	of the _cbm extra step ]

Fixes: 72d5050566 ("x86/intel_rdt: Add utilities to test pseudo-locked region possibility")
Fixes: 49f7b4efa1 ("x86/intel_rdt: Enable setting of exclusive mode")
Fixes: d9b48c86eb ("x86/intel_rdt: Display resource groups' allocations' size in bytes")
Fixes: 95f0b77efa ("x86/intel_rdt: Initialize new resource group with sane defaults")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fenghua.yu@intel.com
Cc: tony.luck@intel.com
Cc: gavin.hindman@intel.com
Cc: jithu.joseph@intel.com
Cc: dave.hansen@intel.com
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/69a428613a53f10e80594679ac726246020ff94f.1538686926.git.reinette.chatre@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-09 08:02:15 +02:00
David S. Miller
9000a457a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next tree:

1) Support for matching on ipsec policy already set in the route, from
   Florian Westphal.

2) Split set destruction into deactivate and destroy phase to make it
   fit better into the transaction infrastructure, also from Florian.
   This includes a patch to warn on imbalance when setting the new
   activate and deactivate interfaces.

3) Release transaction list from the workqueue to remove expensive
   synchronize_rcu() from configuration plane path. This speeds up
   configuration plane quite a bit. From Florian Westphal.

4) Add new xfrm/ipsec extension, this new extension allows you to match
   for ipsec tunnel keys such as source and destination address, spi and
   reqid. From Máté Eckl and Florian Westphal.

5) Add secmark support, this includes connsecmark too, patches
   from Christian Gottsche.

6) Allow to specify remaining bytes in xt_quota, from Chenbo Feng.
   One follow up patch to calm a clang warning for this one, from
   Nathan Chancellor.

7) Flush conntrack entries based on layer 3 family, from Kristian Evensen.

8) New revision for cgroups2 to shrink the path field.

9) Get rid of obsolete need_conntrack(), as a result from recent
   demodularization works.

10) Use WARN_ON instead of BUG_ON, from Florian Westphal.

11) Unused exported symbol in nf_nat_ipv4_fn(), from Florian.

12) Remove superfluous check for timeout netlink parser and dump
    functions in layer 4 conntrack helpers.

13) Unnecessary redundant rcu read side locks in NAT redirect,
    from Taehee Yoo.

14) Pass nf_hook_state structure to error handlers, patch from
    Florian Westphal.

15) Remove ->new() interface from layer 4 protocol trackers. Place
    them in the ->packet() interface. From Florian.

16) Place conntrack ->error() handling in the ->packet() interface.
    Patches from Florian Westphal.

17) Remove unused parameter in the pernet initialization path,
    also from Florian.

18) Remove additional parameter to specify layer 3 protocol when
    looking up for protocol tracker. From Florian.

19) Shrink array of layer 4 protocol trackers, from Florian.

20) Check for linear skb only once from the ALG NAT mangling
    codebase, from Taehee Yoo.

21) Use rhashtable_walk_enter() instead of deprecated
    rhashtable_walk_init(), also from Taehee.

22) No need to flush all conntracks when only one single address
    is gone, from Tan Hu.

23) Remove redundant check for NAT flags in flowtable code, from
    Taehee Yoo.

24) Use rhashtable_lookup() instead of rhashtable_lookup_fast()
    from netfilter codebase, since rcu read lock side is already
    assumed in this path.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 21:28:55 -07:00
Arnd Bergmann
df3f94a0bb bpf: fix building without CONFIG_INET
The newly added TCP and UDP handling fails to link when CONFIG_INET
is disabled:

net/core/filter.o: In function `sk_lookup':
filter.c:(.text+0x7ff8): undefined reference to `tcp_hashinfo'
filter.c:(.text+0x7ffc): undefined reference to `tcp_hashinfo'
filter.c:(.text+0x8020): undefined reference to `__inet_lookup_established'
filter.c:(.text+0x8058): undefined reference to `__inet_lookup_listener'
filter.c:(.text+0x8068): undefined reference to `udp_table'
filter.c:(.text+0x8070): undefined reference to `udp_table'
filter.c:(.text+0x808c): undefined reference to `__udp4_lib_lookup'
net/core/filter.o: In function `bpf_sk_release':
filter.c:(.text+0x82e8): undefined reference to `sock_gen_put'

Wrap the related sections of code in #ifdefs for the config option.

Furthermore, sk_lookup() should always have been marked 'static', this
also avoids a warning about a missing prototype when building with
'make W=1'.

Fixes: 6acc9b432e ("bpf: Add helper to retrieve socket in BPF")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Joe Stringer <joe@wand.net.nz>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-09 00:49:44 +02:00
Nathan Chancellor
ffa0a9a590 netfilter: xt_quota: Don't use aligned attribute in sizeof
Clang warns:

net/netfilter/xt_quota.c:47:44: warning: 'aligned' attribute ignored
when parsing type [-Wignored-attributes]
        BUILD_BUG_ON(sizeof(atomic64_t) != sizeof(__aligned_u64));
                                                  ^~~~~~~~~~~~~

Use 'sizeof(__u64)' instead, as the alignment doesn't affect the size
of the type.

Fixes: e9837e55b0 ("netfilter: xt_quota: fix the behavior of xt_quota module")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-09 00:19:25 +02:00
David Howells
c1e15b4944 rxrpc: Fix the packet reception routine
The rxrpc_input_packet() function and its call tree was built around the
assumption that data_ready() handler called from UDP to inform a kernel
service that there is data to be had was non-reentrant.  This means that
certain locking could be dispensed with.

This, however, turns out not to be the case with a multi-queue network card
that can deliver packets to multiple cpus simultaneously.  Each of those
cpus can be in the rxrpc_input_packet() function at the same time.

Fix by adding or changing some structure members:

 (1) Add peer->rtt_input_lock to serialise access to the RTT buffer.

 (2) Make conn->service_id into a 32-bit variable so that it can be
     cmpxchg'd on all arches.

 (3) Add call->input_lock to serialise access to the Rx/Tx state.  Note
     that although the Rx and Tx states are (almost) entirely separate,
     there's no point completing the separation and having separate locks
     since it's a bi-phasal RPC protocol rather than a bi-direction
     streaming protocol.  Data transmission and data reception do not take
     place simultaneously on any particular call.

and making the following functional changes:

 (1) In rxrpc_input_data(), hold call->input_lock around the core to
     prevent simultaneous producing of packets into the Rx ring and
     updating of tracking state for a particular call.

 (2) In rxrpc_input_ping_response(), only read call->ping_serial once, and
     check it before checking RXRPC_CALL_PINGING as that's a cheaper test.
     The bit test and bit clear can then be combined.  No further locking
     is needed here.

 (3) In rxrpc_input_ack(), take call->input_lock after we've parsed much of
     the ACK packet.  The superseded ACK check is then done both before and
     after the lock is taken.

     The handing of ackinfo data is split, parsing before the lock is taken
     and processing with it held.  This is keyed on rxMTU being non-zero.

     Congestion management is also done within the locked section.

 (4) In rxrpc_input_ackall(), take call->input_lock around the Tx window
     rotation.  The ACKALL packet carries no information and is only really
     useful after all packets have been transmitted since it's imprecise.

 (5) In rxrpc_input_implicit_end_call(), we use rx->incoming_lock to
     prevent calls being simultaneously implicitly ended on two cpus and
     also to prevent any races with incoming call setup.

 (6) In rxrpc_input_packet(), use cmpxchg() to effect the service upgrade
     on a connection.  It is only permitted to happen once for a
     connection.

 (7) In rxrpc_new_incoming_call(), we have to recheck the routing inside
     rx->incoming_lock to see if someone else set up the call, connection
     or peer whilst we were getting there.  We can't trust the values from
     the earlier routing check unless we pin refs on them - which we want
     to avoid.

     Further, we need to allow for an incoming call to have its state
     changed on another CPU between us making it live and us adjusting it
     because the conn is now in the RXRPC_CONN_SERVICE state.

 (8) In rxrpc_peer_add_rtt(), take peer->rtt_input_lock around the access
     to the RTT buffer.  Don't need to lock around setting peer->rtt.

For reference, the inventory of state-accessing or state-altering functions
used by the packet input procedure is:

> rxrpc_input_packet()
  * PACKET CHECKING

  * ROUTING
    > rxrpc_post_packet_to_local()
    > rxrpc_find_connection_rcu() - uses RCU
      > rxrpc_lookup_peer_rcu() - uses RCU
      > rxrpc_find_service_conn_rcu() - uses RCU
      > idr_find() - uses RCU

  * CONNECTION-LEVEL PROCESSING
    - Service upgrade
      - Can only happen once per conn
      ! Changed to use cmpxchg
    > rxrpc_post_packet_to_conn()
    - Setting conn->hi_serial
      - Probably safe not using locks
      - Maybe use cmpxchg

  * CALL-LEVEL PROCESSING
    > Old-call checking
      > rxrpc_input_implicit_end_call()
        > rxrpc_call_completed()
	> rxrpc_queue_call()
	! Need to take rx->incoming_lock
	> __rxrpc_disconnect_call()
	> rxrpc_notify_socket()
    > rxrpc_new_incoming_call()
      - Uses rx->incoming_lock for the entire process
        - Might be able to drop this earlier in favour of the call lock
      > rxrpc_incoming_call()
      	! Conflicts with rxrpc_input_implicit_end_call()
    > rxrpc_send_ping()
      - Don't need locks to check rtt state
      > rxrpc_propose_ACK

  * PACKET DISTRIBUTION
    > rxrpc_input_call_packet()
      > rxrpc_input_data()
	* QUEUE DATA PACKET ON CALL
	> rxrpc_reduce_call_timer()
	  - Uses timer_reduce()
	! Needs call->input_lock()
	> rxrpc_receiving_reply()
	  ! Needs locking around ack state
	  > rxrpc_rotate_tx_window()
	  > rxrpc_end_tx_phase()
	> rxrpc_proto_abort()
	> rxrpc_input_dup_data()
	- Fills the Rx buffer
	- rxrpc_propose_ACK()
	- rxrpc_notify_socket()

      > rxrpc_input_ack()
	* APPLY ACK PACKET TO CALL AND DISCARD PACKET
	> rxrpc_input_ping_response()
	  - Probably doesn't need any extra locking
	  ! Need READ_ONCE() on call->ping_serial
	  > rxrpc_input_check_for_lost_ack()
	    - Takes call->lock to consult Tx buffer
	  > rxrpc_peer_add_rtt()
	    ! Needs to take a lock (peer->rtt_input_lock)
	    ! Could perhaps manage with cmpxchg() and xadd() instead
	> rxrpc_input_requested_ack
	  - Consults Tx buffer
	    ! Probably needs a lock
	  > rxrpc_peer_add_rtt()
	> rxrpc_propose_ack()
	> rxrpc_input_ackinfo()
	  - Changes call->tx_winsize
	    ! Use cmpxchg to handle change
	    ! Should perhaps track serial number
	  - Uses peer->lock to record MTU specification changes
	> rxrpc_proto_abort()
	! Need to take call->input_lock
	> rxrpc_rotate_tx_window()
	> rxrpc_end_tx_phase()
	> rxrpc_input_soft_acks()
	- Consults the Tx buffer
	> rxrpc_congestion_management()
	  - Modifies the Tx annotations
	  ! Needs call->input_lock()
	  > rxrpc_queue_call()

      > rxrpc_input_abort()
	* APPLY ABORT PACKET TO CALL AND DISCARD PACKET
	> rxrpc_set_call_completion()
	> rxrpc_notify_socket()

      > rxrpc_input_ackall()
	* APPLY ACKALL PACKET TO CALL AND DISCARD PACKET
	! Need to take call->input_lock
	> rxrpc_rotate_tx_window()
	> rxrpc_end_tx_phase()

    > rxrpc_reject_packet()

There are some functions used by the above that queue the packet, after
which the procedure is terminated:

 - rxrpc_post_packet_to_local()
   - local->event_queue is an sk_buff_head
   - local->processor is a work_struct
 - rxrpc_post_packet_to_conn()
   - conn->rx_queue is an sk_buff_head
   - conn->processor is a work_struct
 - rxrpc_reject_packet()
   - local->reject_queue is an sk_buff_head
   - local->processor is a work_struct

And some that offload processing to process context:

 - rxrpc_notify_socket()
   - Uses RCU lock
   - Uses call->notify_lock to call call->notify_rx
   - Uses call->recvmsg_lock to queue recvmsg side
 - rxrpc_queue_call()
   - call->processor is a work_struct
 - rxrpc_propose_ACK()
   - Uses call->lock to wrap __rxrpc_propose_ACK()

And a bunch that complete a call, all of which use call->state_lock to
protect the call state:

 - rxrpc_call_completed()
 - rxrpc_set_call_completion()
 - rxrpc_abort_call()
 - rxrpc_proto_abort()
   - Also uses rxrpc_queue_call()

Fixes: 17926a7932 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-08 22:42:04 +01:00
David Howells
4e2abd3c05 rxrpc: Fix the rxrpc_tx_packet trace line
Fix the rxrpc_tx_packet trace line by storing the where parameter.

Fixes: 4764c0da69 ("rxrpc: Trace packet transmission")
Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-08 22:42:04 +01:00
David Howells
647530924f rxrpc: Fix connection-level abort handling
Fix connection-level abort handling to cache the abort and error codes
properly so that a new incoming call can be properly aborted if it races
with the parent connection being aborted by another CPU.

The abort_code and error parameters can then be dropped from
rxrpc_abort_calls().

Fixes: f5c17aaeb2 ("rxrpc: Calls should only have one terminal state")
Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-08 22:42:04 +01:00
David Howells
298bc15b20 rxrpc: Only take the rwind and mtu values from latest ACK
Move the out-of-order and duplicate ACK packet check to before the call to
rxrpc_input_ackinfo() so that the receive window size and MTU size are only
checked in the latest ACK packet and don't regress.

Fixes: 248f219cb8 ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-08 22:42:04 +01:00
Ioana Ciocoi Radulescu
68049a5f4d dpaa2-eth: Don't account Tx confirmation frames on NAPI poll
Until now, both Rx and Tx confirmation frames handled during
NAPI poll were counted toward the NAPI budget. However, Tx
confirmations are lighter to process than Rx frames, which can
skew the amount of work actually done inside one NAPI cycle.

Update the code to only count Rx frames toward the NAPI budget
and set a separate threshold on how many Tx conf frames can be
processed in one poll cycle.

The NAPI poll routine stops when either the budget is consumed
by Rx frames or when Tx confirmation frames reach this threshold.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:05:05 -07:00
YueHaibing
9e19dabc05 net: mscc: ocelot: remove set but not used variable 'phy_mode'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/mscc/ocelot_board.c: In function 'mscc_ocelot_probe':
drivers/net/ethernet/mscc/ocelot_board.c:262:17: warning:
 variable 'phy_mode' set but not used [-Wunused-but-set-variable]
   enum phy_mode phy_mode;

It never used since introduction in
commit 71e32a20cf ("net: mscc: ocelot: make use of SerDes PHYs for handling their configuration")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:03:42 -07:00
David S. Miller
ee9615be25 Merge branch 'more-pmtu-selftests'
Sabrina Dubroca says:

====================
selftests: add more PMTU tests

The current selftests for PMTU cover VTI tunnels, but there's nothing
about the generation and handling of PMTU exceptions by intermediate
routers. This series adds and improves existing helpers, then adds
IPv4 and IPv6 selftests with a setup involving an intermediate router.

Joint work with Stefano Brivio.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:00:24 -07:00
Sabrina Dubroca
e44e428f59 selftests: pmtu: add basic IPv4 and IPv6 PMTU tests
Commit d1f1b9cbf3 ("selftests: net: Introduce first PMTU test") and
follow-ups introduced some PMTU tests, but they all rely on tunneling,
and, particularly, on VTI.

These new tests use simple routing to exercise the generation and
update of PMTU exceptions in IPv4 and IPv6.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:00:23 -07:00
Sabrina Dubroca
72ebddd7ff selftests: pmtu: extend MTU parsing helper to locked MTU
The mtu_parse helper introduced in commit f2c929feec ("selftests:
pmtu: Factor out MTU parsing helper") can only handle "mtu 1234", but
not "mtu lock 1234". Extend it, so that we can do IPv4 tests with PMTU
smaller than net.ipv4.route.min_pmtu

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:00:23 -07:00
Stefano Brivio
1e0a720779 selftests: pmtu: Introduce check_pmtu_value()
Introduce and use a function that checks PMTU values against
expected values and logs error messages, to remove some clutter.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 11:00:23 -07:00
Gustavo A. R. Silva
062f97a314 isdn/gigaset/isocdata: mark expected switch fall-through
Notice that in this particular case, I replaced the
"--v-- fall through --v--" comment with a proper
"fall through", which is what GCC is expecting to
find.

This fix is part of the ongoing efforts to enabling
-Wimplicit-fallthrough

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:54:14 -07:00
David S. Miller
cd7f7df6ca Merge branch 'rtnetlink-Add-support-for-rigid-checking-of-data-in-dump-request'
David Ahern says:

====================
rtnetlink: Add support for rigid checking of data in dump request

There are many use cases where a user wants to influence what is
returned in a dump for some rtnetlink command: one is wanting data
for a different namespace than the one the request is received and
another is limiting the amount of data returned in the dump to a
specific set of interest to userspace, reducing the cpu overhead of
both kernel and userspace. Unfortunately, the kernel has historically
not been strict with checking for the proper header or checking the
values passed in the header. This lenient implementation has allowed
iproute2 and other packages to pass any struct or data in the dump
request as long as the family is the first byte. For example, ifinfomsg
struct is used by iproute2 for all generic dump requests - links,
addresses, routes and rules when it is really only valid for link
requests.

There is 1 is example where the kernel deals with the wrong struct: link
dumps after VF support was added. Older iproute2 was sending rtgenmsg as
the header instead of ifinfomsg so a patch was added to try and detect
old userspace vs new:
e5eca6d41f ("rtnetlink: fix userspace API breakage for iproute2 < v3.9.0")

The latest example is Christian's patch set wanting to return addresses for
a target namespace. It guesses the header struct is an ifaddrmsg and if it
guesses wrong a netlink warning is generated in the kernel log on every
address dump which is unacceptable.

Another example where the kernel is a bit lenient is route dumps: iproute2
can send either a request with either ifinfomsg or a rtmsg as the header
struct, yet the kernel always treats the header as an rtmsg (see
inet_dump_fib and rtm_flags check). The header inconsistency impacts the
ability to add kernel side filters for route dumps - a necessary feature
for scale setups with 100k+ routes.

How to resolve the problem of not breaking old userspace yet be able to
move forward with new features such as kernel side filtering which are
crucial for efficient operation at high scale?

This patch set addresses the problem by adding a new socket flag,
NETLINK_DUMP_STRICT_CHK, that userspace can use with setsockopt to
request strict checking of headers and attributes on dump requests and
hence unlock the ability to use kernel side filters as they are added.

Kernel side, the dump handlers are updated to verify the message contains
at least the expected header struct:
    RTM_GETLINK:       ifinfomsg
    RTM_GETADDR:       ifaddrmsg
    RTM_GETMULTICAST:  ifaddrmsg
    RTM_GETANYCAST:    ifaddrmsg
    RTM_GETADDRLABEL:  ifaddrlblmsg
    RTM_GETROUTE:      rtmsg
    RTM_GETSTATS:      if_stats_msg
    RTM_GETNEIGH:      ndmsg
    RTM_GETNEIGHTBL:   ndtmsg
    RTM_GETNSID:       rtgenmsg
    RTM_GETRULE:       fib_rule_hdr
    RTM_GETNETCONF:    netconfmsg
    RTM_GETMDB:        br_port_msg

And then every field in the header struct should be 0 with the exception
of the family. There are a few exceptions to this rule where the kernel
already influences the data returned by values in the struct. Next the
message should not contain attributes unless the kernel implements
filtering for it. Any unexpected data causes the dump to fail with EINVAL.
If the new flag is honored by the kernel and the dump contents adjusted
by any data passed in the request, the dump handler can set the
NLM_F_DUMP_FILTERED flag in the netlink message header.

For old userspace on new kernel there is no impact as all checks are
wrapped in a check on the new strict flag. For new userspace on old
kernel, the data in the headers and any appended attributes are
silently ignored though the setsockopt failing is the clue to userspace
the feature is not supported. New userspace on new kernel gets the
requested data dump.

iproute2 patches can be found here:
    https://github.com/dsahern/iproute2 dump-enhancements

Major changes since v1
- inner header is supposed to be 4-bytes aligned. So for dumps that
  should not have attributes appended changed the check to use:
        if (nlmsg_attrlen(nlh, sizeof(hdr)))
  Only impacts patches with headers that are not multiples of 4-bytes
  (rtgenmsg, netconfmsg), but applied the change to all patches not
  calling nlmsg_parse for consistency.

- Added nlmsg_parse_strict and nla_parse_strict for tighter control on
  attribute parsing. There should be no unknown attribute types or extra
  bytes.

- Moved validation to a helper in most cases

Changes since rfc-v2
- dropped the NLM_F_DUMP_FILTERED flag from target nsid dumps per
  Jiri's objections
- changed the opt-in uapi from a netlink message flag to a socket
  flag. setsockopt provides an api for userspace to definitively
  know if the kernel supports strict checking on dumps.
- re-ordered patches to peel off the extack on dumps if needed to
  keep this set size within limits
- misc cleanups in patches based on testing
====================

Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:06 -07:00
David Ahern
8c6e137fbc rtnetlink: Update rtnl_fdb_dump for strict data checking
Update rtnl_fdb_dump for strict data checking. If the flag is set,
the dump request is expected to have an ndmsg struct as the header
potentially followed by one or more attributes. Any data passed in the
header or as an attribute is taken as a request to influence the data
returned. Only values supported by the dump handler are allowed to be
non-0 or set in the request. At the moment only the NDA_IFINDEX and
NDA_MASTER attributes are supported.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
8dfbda19a2 rtnetlink: Move input checking for rtnl_fdb_dump to helper
Move the existing input checking for rtnl_fdb_dump into a helper,
valid_fdb_dump_legacy. This function will retain the current
logic that works around the 2 headers that userspace has been
allowed to send up to this point.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
c77b93641e net/bridge: Update br_mdb_dump for strict data checking
Update br_mdb_dump for strict data checking. If the flag is set,
the dump request is expected to have a br_port_msg struct as the
header. All elements of the struct are expected to be 0 and no
attributes can be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
addd383f5a net: Update netconf dump handlers for strict data checking
Update inet_netconf_dump_devconf, inet6_netconf_dump_devconf, and
mpls_netconf_dump_devconf for strict data checking. If the flag is set,
the dump request is expected to have an netconfmsg struct as the header.
The struct only has the family member and no attributes can be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
f2ae64bb6b net/ipv6: Update ip6addrlbl_dump for strict data checking
Update ip6addrlbl_dump for strict data checking. If the flag is set,
the dump request is expected to have an ifaddrlblmsg struct as the
header. All elements of the struct are expected to be 0 and no
attributes can be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
4a73e5e56d net/fib_rules: Update fib_nl_dumprule for strict data checking
Update fib_nl_dumprule for strict data checking. If the flag is set,
the dump request is expected to have fib_rule_hdr struct as the header.
All elements of the struct are expected to be 0 and no attributes can
be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
f80f14c364 net/namespace: Update rtnl_net_dumpid for strict data checking
Update rtnl_net_dumpid for strict data checking. If the flag is set,
the dump request is expected to have an rtgenmsg struct as the header
which has the family as the only element. No data may be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
9632d47f6a net/neighbor: Update neightbl_dump_info for strict data checking
Update neightbl_dump_info for strict data checking. If the flag is set,
the dump request is expected to have an ndtmsg struct as the header.
All elements of the struct are expected to be 0 and no attributes can
be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
51183d233b net/neighbor: Update neigh_dump_info for strict data checking
Update neigh_dump_info for strict data checking. If the flag is set,
the dump request is expected to have an ndmsg struct as the header
potentially followed by one or more attributes. Any data passed in the
header or as an attribute is taken as a request to influence the data
returned. Only values supported by the dump handler are allowed to be
non-0 or set in the request. At the moment only the NDA_IFINDEX and
NDA_MASTER attributes are supported.

Existing code does not fail the dump if nlmsg_parse fails. That behavior
is kept for non-strict checking.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
e8ba330ac0 rtnetlink: Update fib dumps for strict data checking
Add helper to check netlink message for route dumps. If the strict flag
is set the dump request is expected to have an rtmsg struct as the header.
All elements of the struct are expected to be 0 with the exception of
rtm_flags (which is used by both ipv4 and ipv6 dumps) and no attributes
can be appended. rtm_flags can only have RTM_F_CLONED and RTM_F_PREFIX
set.

Update inet_dump_fib, inet6_dump_fib, mpls_dump_routes, ipmr_rtm_dumproute,
and ip6mr_rtm_dumproute to call this helper if strict data checking is
enabled.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:05 -07:00
David Ahern
14fc5bb29f rtnetlink: Update ipmr_rtm_dumplink for strict data checking
Update ipmr_rtm_dumplink for strict data checking. If the flag is set,
the dump request is expected to have an ifinfomsg struct as the header.
All elements of the struct are expected to be 0 and no attributes can
be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
786e0007e2 rtnetlink: Update inet6_dump_ifinfo for strict data checking
Update inet6_dump_ifinfo for strict data checking. If the flag is
set, the dump request is expected to have an ifinfomsg struct as
the header. All elements of the struct are expected to be 0 and no
attributes can be appended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
841891ec0c rtnetlink: Update rtnl_stats_dump for strict data checking
Update rtnl_stats_dump for strict data checking. If the flag is set,
the dump request is expected to have an if_stats_msg struct as the header.
All elements of the struct are expected to be 0 except filter_mask which
must be non-0 (legacy behavior). No attributes are supported.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
2d011be8c0 rtnetlink: Update rtnl_bridge_getlink for strict data checking
Update rtnl_bridge_getlink for strict data checking. If the flag is set,
the dump request is expected to have an ifinfomsg struct as the header
potentially followed by one or more attributes. Any data passed in the
header or as an attribute is taken as a request to influence the data
returned. Only values supported by the dump handler are allowed to be
non-0 or set in the request. At the moment only the IFLA_EXT_MASK
attribute is supported.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
905cf0abe8 rtnetlink: Update rtnl_dump_ifinfo for strict data checking
Update rtnl_dump_ifinfo for strict data checking. If the flag is set,
the dump request is expected to have an ifinfomsg struct as the header
potentially followed by one or more attributes. Any data passed in the
header or as an attribute is taken as a request to influence the data
returned. Only values supported by the dump handler are allowed to be
non-0 or set in the request. At the moment only the IFA_TARGET_NETNSID,
IFLA_EXT_MASK, IFLA_MASTER, and IFLA_LINKINFO attributes are supported.

Existing code does not fail the dump if nlmsg_parse fails. That behavior
is kept for non-strict checking.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
ed6eff1179 net/ipv6: Update inet6_dump_addr for strict data checking
Update inet6_dump_addr for strict data checking. If the flag is set, the
dump request is expected to have an ifaddrmsg struct as the header
potentially followed by one or more attributes. Any data passed in the
header or as an attribute is taken as a request to influence the data
returned. Only values suppored by the dump handler are allowed to be
non-0 or set in the request. At the moment only the IFA_TARGET_NETNSID
attribute is supported. Follow on patches can add support for other fields
(e.g., honor ifa_index and only return data for the given device index).

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
c33078e3df net/ipv4: Update inet_dump_ifaddr for strict data checking
Update inet_dump_ifaddr for strict data checking. If the flag is set,
the dump request is expected to have an ifaddrmsg struct as the header
potentially followed by one or more attributes. Any data passed in the
header or as an attribute is taken as a request to influence the data
returned. Only values supported by the dump handler are allowed to be
non-0 or set in the request. At the moment only the IFA_TARGET_NETNSID
attribute is supported. Follow on patches can support for other fields
(e.g., honor ifa_index and only return data for the given device index).

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
89d35528d1 netlink: Add new socket option to enable strict checking on dumps
Add a new socket option, NETLINK_DUMP_STRICT_CHK, that userspace
can use via setsockopt to request strict checking of headers and
attributes on dump requests.

To get dump features such as kernel side filtering based on data in
the header or attributes appended to the dump request, userspace
must call setsockopt() for NETLINK_DUMP_STRICT_CHK and a non-zero
value. Since the netlink sock and its flags are private to the
af_netlink code, the strict checking flag is passed to dump handlers
via a flag in the netlink_callback struct.

For old userspace on new kernel there is no impact as all of the data
checks in later patches are wrapped in a check on the new strict flag.

For new userspace on old kernel, the setsockopt will fail and even if
new userspace sets data in the headers and appended attributes the
kernel will silently ignore it. Moving forward when the setsockopt
succeeds, the new userspace on old kernel means the dump request can
pass an attribute the kernel does not understand. The dump will then
fail as the older kernel does not understand it.

New userspace on new kernel setting the socket option gets the benefit
of the improved data dump.

Kernel side the NETLINK_DUMP_STRICT_CHK uapi is converted to a generic
NETLINK_F_STRICT_CHK flag which can potentially be leveraged for tighter
checking on the NEW, DEL, and SET commands.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
6ba1e6e856 net/ipv6: Refactor address dump to push inet6_fill_args to in6_dump_addrs
Pull the inet6_fill_args arg up to in6_dump_addrs and move netnsid
into it.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
a5f6cba291 netlink: Add strict version of nlmsg_parse and nla_parse
nla_parse is currently lenient on message parsing, allowing type to be 0
or greater than max expected and only logging a message

    "netlink: %d bytes leftover after parsing attributes in process `%s'."

if the netlink message has unknown data at the end after parsing. What this
could mean is that the header at the front of the attributes is actually
wrong and the parsing is shifted from what is expected.

Add a new strict version that actually fails with EINVAL if there are any
bytes remaining after the parsing loop completes, if the atttrbitue type
is 0 or greater than max expected.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
dac9c9790e net: Add extack to nlmsg_parse
Make sure extack is passed to nlmsg_parse where easy to do so.
Most of these are dump handlers and leveraging the extack in
the netlink_callback.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:04 -07:00
David Ahern
3d0d4337d7 netlink: Add extack message to nlmsg_parse for invalid header length
Give a user a reason why EINVAL is returned in nlmsg_parse.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:03 -07:00
David Ahern
4a19edb60d netlink: Pass extack to dump handlers
Declare extack in netlink_dump and pass to dump handlers via
netlink_callback. Add any extack message after the dump_done_errno
allowing error messages to be returned. This will be useful when
strict checking is done on dump requests, returning why the dump
fails EINVAL.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08 10:39:03 -07:00