mainlining shenanigans
Go to file
David S. Miller 0565de29cb Merge branch 'ipv6-Separate-data-structures-for-FIB-and-data-path'
David Ahern says:

====================
net/ipv6: Separate data structures for FIB and data path

IPv6 uses the same data struct for both control plane (FIB entries) and
data path (dst entries). This struct has elements needed for both paths
adding memory overhead and complexity (taking a dst hold in most places
but an additional reference on rt6i_ref in a few). Furthermore, because
of the dst_alloc tie, all FIB entries are allocated with GFP_ATOMIC.

This patch set separates FIB entries from dst entries, better aligning
IPv6 code with IPv4, simplifying the reference counting and allowing
FIB entries added by userspace (not autoconf) to use GFP_KERNEL. It is
first step to a number of performance and scalability changes.

The end result of this patch set:
  - FIB entries (fib6_info):
        /* size: 208, cachelines: 4, members: 25 */
        /* sum members: 207, holes: 1, sum holes: 1 */

  - dst entries (rt6_info)
       /* size: 240, cachelines: 4, members: 11 */

Versus the the single rt6_info struct today for both paths:
      /* size: 320, cachelines: 5, members: 28 */

This amounts to a 35% reduction in memory use for FIB entries and a
25% reduction for dst entries.

With respect to locking FIB entries use RCU and a single atomic
counter with fib6_info_hold and fib6_info_release helpers to manage
the reference counting. dst entries use only the traditional dst
refcounts with dst_hold and dst_release.

FIB entries for host routes are referenced by inet6_ifaddr and
ifacaddr6. In both cases, additional holds are taken -- similar to
what is done for devices.

This set is the first of many changes to improve the scalability of the
IPv6 code. Follow on changes include:
- consolidating duplicate fib6_info references like IPv4 does with
  duplicate fib_info

- moving fib6_info into a slab cache to avoid allocation roundups to
  power of 2 (the 208 size becomes a 256 actual allocation)

- Allow FIB lookups without generating a dst (e.g., most rt6_lookup
  users just want to verify the egress device). Means moving dst
  allocation to the other side of fib6_rule_lookup which again aligns
  with IPv4 behavior

- using separate standalone nexthop objects which have performance
  benefits beyond fib_info consolidation

At this point I am not seeing any refcount leaks or underflows, no
oops or bug_ons, or warnings from kasan, so I think it is ready for
others to beat up on it finding errors in code paths I have missed.

v2 changes
- rebased to top of tree
- improved commit message on patch 7

v1 changes
- rebased to top of tree
- fix memory leak of metrics as noted by Ido
- MTU fixes based on pmtu tests (thanks Stefano Brivio for writing)

RFC v2 changes
- improved commit messages
- move common metrics code from dst.c to net/ipv4/metrics.c (comment
  from DaveM)
- address comments from Wei Wang and Martin KaFai Lau (let me know if
  I missed something)
- fixes detected by kernel test robots
  + added fib6_metric_set to change metric on a FIB entry which could
    be pointing to read-only dst_default_metrics
  + 0day testing found a problem with an intermediate patch; added
    dst_hold_safe on rt->from. Code is removed 3 patches later
- allow cacheinfo to handle NULL dst; means only expires is pushed to
  userspace
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17 23:41:18 -04:00
arch xen: fixes for 4.17-rc1 2018-04-12 11:04:35 -07:00
block for-4.17/block-20180402 2018-04-05 14:27:02 -07:00
certs certs/blacklist_nohashes.c: fix const confusion in certs blacklist 2018-02-21 15:35:43 -08:00
crypto MIPS changes for 4.17 2018-04-10 11:39:22 -07:00
Documentation IOMMU Updates for Linux v4.17 2018-04-11 18:50:41 -07:00
drivers net/ipv6: Flip FIB entries to fib6_info 2018-04-17 23:41:18 -04:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs This pull request contains updates for both UBI and UBIFS: 2018-04-11 16:39:34 -07:00
include net/ipv6: Remove unused code and variables for rt6_info 2018-04-17 23:41:18 -04:00
init seq_file: allocate seq_file from kmem_cache 2018-04-11 10:28:36 -07:00
ipc ipc/shm.c: shm_split(): remove unneeded test for NULL shm_file_data.vm_ops 2018-04-11 10:28:38 -07:00
kernel xdp: transition into using xdp_frame for return API 2018-04-17 10:50:29 -04:00
lib Fix for one swiotlb regression in 2.16 from Takashi. 2018-04-12 11:00:48 -07:00
LICENSES LICENSES: Add MPL-1.1 license 2018-01-06 10:59:44 -07:00
mm page cache: use xa_lock 2018-04-11 10:28:39 -07:00
net net/ipv6: Remove unused code and variables for rt6_info 2018-04-17 23:41:18 -04:00
samples remoteproc updates for v4.17 2018-04-10 12:09:27 -07:00
scripts asm-generic fixes for v4.17-rc1 2018-04-12 09:15:48 -07:00
security ipc/msg: introduce msgctl(MSG_STAT_ANY) 2018-04-11 10:28:37 -07:00
sound sound fixes for 4.17-rc1 2018-04-10 10:16:04 -07:00
tools selftest: tc_flower: add testcase for 'ip_flags' 2018-04-17 13:41:54 -04:00
usr kbuild: rename built-in.o to built-in.a 2018-03-26 02:01:19 +09:00
virt KVM/ARM updates for v4.17 2018-03-28 16:09:09 +02:00
.clang-format clang-format: add configuration file 2018-04-11 10:28:35 -07:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore clang-format: add configuration file 2018-04-11 10:28:35 -07:00
.mailmap Merge candidates for 4.17 merge window 2018-04-06 17:35:43 -07:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS/CREDITS: Drop METAG ARCHITECTURE 2018-03-05 16:34:24 +00:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
MAINTAINERS ARM: SoC fixes for 4.17 2018-04-11 16:12:21 -07:00
Makefile Kconfig updates for v4.17 2018-04-03 16:28:01 -07:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.