A mirror of the official Linux kernel repository just in case
Go to file
Roman Gushchin eedc4e5a14 mm: memcg: factor out memcg- and lruvec-level changes out of __mod_lruvec_state()
Patch series "The new cgroup slab memory controller", v7.

The patchset moves the accounting from the page level to the object level.
It allows to share slab pages between memory cgroups.  This leads to a
significant win in the slab utilization (up to 45%) and the corresponding
drop in the total kernel memory footprint.  The reduced number of
unmovable slab pages should also have a positive effect on the memory
fragmentation.

The patchset makes the slab accounting code simpler: there is no more need
in the complicated dynamic creation and destruction of per-cgroup slab
caches, all memory cgroups use a global set of shared slab caches.  The
lifetime of slab caches is not more connected to the lifetime of memory
cgroups.

The more precise accounting does require more CPU, however in practice the
difference seems to be negligible.  We've been using the new slab
controller in Facebook production for several months with different
workloads and haven't seen any noticeable regressions.  What we've seen
were memory savings in order of 1 GB per host (it varied heavily depending
on the actual workload, size of RAM, number of CPUs, memory pressure,
etc).

The third version of the patchset added yet another step towards the
simplification of the code: sharing of slab caches between accounted and
non-accounted allocations.  It comes with significant upsides (most
noticeable, a complete elimination of dynamic slab caches creation) but
not without some regression risks, so this change sits on top of the
patchset and is not completely merged in.  So in the unlikely event of a
noticeable performance regression it can be reverted separately.

The slab memory accounting works in exactly the same way for SLAB and
SLUB.  With both allocators the new controller shows significant memory
savings, with SLUB the difference is bigger.  On my 16-core desktop
machine running Fedora 32 the size of the slab memory measured after the
start of the system was lower by 58% and 38% with SLUB and SLAB
correspondingly.

As an estimation of a potential CPU overhead, below are results of
slab_bulk_test01 test, kindly provided by Jesper D.  Brouer.  He also
helped with the evaluation of results.

The test can be found here: https://github.com/netoptimizer/prototype-kernel/
The smallest number in each row should be picked for a comparison.

SLUB-patched - bulk-API
 - SLUB-patched : bulk_quick_reuse objects=1 : 187 -  90 - 224  cycles(tsc)
 - SLUB-patched : bulk_quick_reuse objects=2 : 110 -  53 - 133  cycles(tsc)
 - SLUB-patched : bulk_quick_reuse objects=3 :  88 -  95 -  42  cycles(tsc)
 - SLUB-patched : bulk_quick_reuse objects=4 :  91 -  85 -  36  cycles(tsc)
 - SLUB-patched : bulk_quick_reuse objects=8 :  32 -  66 -  32  cycles(tsc)

SLUB-original -  bulk-API
 - SLUB-original: bulk_quick_reuse objects=1 :  87 -  87 - 142  cycles(tsc)
 - SLUB-original: bulk_quick_reuse objects=2 :  52 -  53 -  53  cycles(tsc)
 - SLUB-original: bulk_quick_reuse objects=3 :  42 -  42 -  91  cycles(tsc)
 - SLUB-original: bulk_quick_reuse objects=4 :  91 -  37 -  37  cycles(tsc)
 - SLUB-original: bulk_quick_reuse objects=8 :  31 -  79 -  76  cycles(tsc)

SLAB-patched -  bulk-API
 - SLAB-patched : bulk_quick_reuse objects=1 :  67 -  67 - 140  cycles(tsc)
 - SLAB-patched : bulk_quick_reuse objects=2 :  55 -  46 -  46  cycles(tsc)
 - SLAB-patched : bulk_quick_reuse objects=3 :  93 -  94 -  39  cycles(tsc)
 - SLAB-patched : bulk_quick_reuse objects=4 :  35 -  88 -  85  cycles(tsc)
 - SLAB-patched : bulk_quick_reuse objects=8 :  30 -  30 -  30  cycles(tsc)

SLAB-original-  bulk-API
 - SLAB-original: bulk_quick_reuse objects=1 : 143 - 136 -  67  cycles(tsc)
 - SLAB-original: bulk_quick_reuse objects=2 :  45 -  46 -  46  cycles(tsc)
 - SLAB-original: bulk_quick_reuse objects=3 :  38 -  39 -  39  cycles(tsc)
 - SLAB-original: bulk_quick_reuse objects=4 :  35 -  87 -  87  cycles(tsc)
 - SLAB-original: bulk_quick_reuse objects=8 :  29 -  66 -  30  cycles(tsc)

This patch (of 19):

To convert memcg and lruvec slab counters to bytes there must be a way to
change these counters without touching node counters.  Factor out
__mod_memcg_lruvec_state() out of __mod_lruvec_state().

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/20200623174037.3951353-1-guro@fb.com
Link: http://lkml.kernel.org/r/20200623174037.3951353-2-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:24 -07:00
arch mm, treewide: rename kzfree() to kfree_sensitive() 2020-08-07 11:33:22 -07:00
block for-5.9/block-merge-20200804 2020-08-05 11:12:34 -07:00
certs .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
crypto mm, treewide: rename kzfree() to kfree_sensitive() 2020-08-07 11:33:22 -07:00
Documentation tmpfs: support 64-bit inums per-sb 2020-08-07 11:33:24 -07:00
drivers mm, treewide: rename kzfree() to kfree_sensitive() 2020-08-07 11:33:22 -07:00
fs tmpfs: support 64-bit inums per-sb 2020-08-07 11:33:24 -07:00
include mm: memcg: factor out memcg- and lruvec-level changes out of __mod_lruvec_state() 2020-08-07 11:33:24 -07:00
init mm/slab: expand CONFIG_SLAB_FREELIST_HARDENED to include SLAB 2020-08-07 11:33:22 -07:00
ipc mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
kernel kthread: remove incorrect comment in kthread_create_on_cpu() 2020-08-07 11:33:21 -07:00
lib mm, treewide: rename kzfree() to kfree_sensitive() 2020-08-07 11:33:22 -07:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
mm mm: memcg: factor out memcg- and lruvec-level changes out of __mod_lruvec_state() 2020-08-07 11:33:24 -07:00
net mm, treewide: rename kzfree() to kfree_sensitive() 2020-08-07 11:33:22 -07:00
samples Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-08-05 20:13:21 -07:00
scripts mm, treewide: rename kzfree() to kfree_sensitive() 2020-08-07 11:33:22 -07:00
security mm, treewide: rename kzfree() to kfree_sensitive() 2020-08-07 11:33:22 -07:00
sound sound updates for 5.9 2020-08-06 14:27:31 -07:00
tools tools/testing/selftests/cgroup/cgroup_util.c: cg_read_strcmp: fix null pointer dereference 2020-08-07 11:33:21 -07:00
usr usr: Add support for zstd compressed initramfs 2020-07-31 11:49:08 +02:00
virt s390: implement diag318 2020-08-06 12:59:31 -07:00
.clang-format block: add bio_for_each_bvec_all() 2020-05-25 11:25:24 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: Add ZSTD-compressed files 2020-07-31 11:50:49 +02:00
.mailmap It's been a busy cycle for documentation - hopefully the busiest for a 2020-08-04 22:47:54 -07:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS CREDITS: Replace HTTP links with HTTPS ones 2020-07-23 14:53:58 -06:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS TTY/Serial patches for 5.9-rc1 2020-08-06 14:56:11 -07:00
Makefile Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-08-05 20:13:21 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.