linux/fs/gfs2
Vladimir Davydov 503c358cf1 list_lru: introduce list_lru_shrink_{count,walk}
Kmem accounting of memcg is unusable now, because it lacks slab shrinker
support.  That means when we hit the limit we will get ENOMEM w/o any
chance to recover.  What we should do then is to call shrink_slab, which
would reclaim old inode/dentry caches from this cgroup.  This is what
this patch set is intended to do.

Basically, it does two things.  First, it introduces the notion of
per-memcg slab shrinker.  A shrinker that wants to reclaim objects per
cgroup should mark itself as SHRINKER_MEMCG_AWARE.  Then it will be
passed the memory cgroup to scan from in shrink_control->memcg.  For
such shrinkers shrink_slab iterates over the whole cgroup subtree under
the target cgroup and calls the shrinker for each kmem-active memory
cgroup.

Secondly, this patch set makes the list_lru structure per-memcg.  It's
done transparently to list_lru users - everything they have to do is to
tell list_lru_init that they want memcg-aware list_lru.  Then the
list_lru will automatically distribute objects among per-memcg lists
basing on which cgroup the object is accounted to.  This way to make FS
shrinkers (icache, dcache) memcg-aware we only need to make them use
memcg-aware list_lru, and this is what this patch set does.

As before, this patch set only enables per-memcg kmem reclaim when the
pressure goes from memory.limit, not from memory.kmem.limit.  Handling
memory.kmem.limit is going to be tricky due to GFP_NOFS allocations, and
it is still unclear whether we will have this knob in the unified
hierarchy.

This patch (of 9):

NUMA aware slab shrinkers use the list_lru structure to distribute
objects coming from different NUMA nodes to different lists.  Whenever
such a shrinker needs to count or scan objects from a particular node,
it issues commands like this:

        count = list_lru_count_node(lru, sc->nid);
        freed = list_lru_walk_node(lru, sc->nid, isolate_func,
                                   isolate_arg, &sc->nr_to_scan);

where sc is an instance of the shrink_control structure passed to it
from vmscan.

To simplify this, let's add special list_lru functions to be used by
shrinkers, list_lru_shrink_count() and list_lru_shrink_walk(), which
consolidate the nid and nr_to_scan arguments in the shrink_control
structure.

This will also allow us to avoid patching shrinkers that use list_lru
when we make shrink_slab() per-memcg - all we will have to do is extend
the shrink_control structure to include the target memcg and make
list_lru_shrink_{count,walk} handle this appropriately.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Greg Thelen <gthelen@google.com>
Cc: Glauber Costa <glommer@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-12 18:54:08 -08:00
..
acl.c GFS2: Fix crash during ACL deletion in acl max entry check in gfs2_set_acl() 2015-02-10 10:14:56 +00:00
acl.h GFS2: Increase the max number of ACLs 2014-03-19 15:16:24 +00:00
aops.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-06-12 10:30:18 -07:00
bmap.c GFS2: Change maxlen variables to size_t 2014-08-21 10:22:23 +01:00
bmap.h GFS2: Clean up journal extent mapping 2014-03-03 13:50:12 +00:00
dentry.c vfs: Remove unnecessary calls of check_submounts_and_drop 2014-10-09 02:38:56 -04:00
dir.c GFS2: use __vmalloc GFP_NOFS for fs-related allocations. 2015-02-04 09:58:41 +00:00
dir.h GFS2: Make rename not save dirent location 2014-10-01 14:06:15 +01:00
export.c vfs: make first argument of dir_context.actor typed 2014-10-31 17:48:54 -04:00
file.c mm: drop vm_ops->remap_pages and generic_file_remap_pages() stub 2015-02-10 14:30:30 -08:00
gfs2.h
glock.c GFS2: Eliminate __gfs2_glock_remove_from_lru 2015-01-09 11:33:13 +00:00
glock.h GFS2: Don't use ENOBUFS when ENOMEM is the correct error code 2014-01-16 10:31:13 +00:00
glops.c GFS2: update freeze code to use freeze/thaw_super on all nodes 2014-11-17 10:36:39 +00:00
glops.h GFS2: update freeze code to use freeze/thaw_super on all nodes 2014-11-17 10:36:39 +00:00
incore.h GFS2: update freeze code to use freeze/thaw_super on all nodes 2014-11-17 10:36:39 +00:00
inode.c GFS2: Eliminate a nonsense goto 2015-01-26 20:58:54 +00:00
inode.h GFS2: Add atomic_open support 2013-06-14 11:17:15 +01:00
Kconfig Finally eradicate CONFIG_HOTPLUG 2013-06-03 14:20:18 -07:00
lock_dlm.c Merge branch 'sched/urgent' into sched/core, to merge fixes before applying new changes 2014-07-28 10:03:00 +02:00
log.c GFS2: update freeze code to use freeze/thaw_super on all nodes 2014-11-17 10:36:39 +00:00
log.h GFS2: remove transaction glock 2014-05-14 10:04:34 +01:00
lops.c GFS2: lops.c: replace 0 by NULL for pointers 2014-04-28 09:41:55 +01:00
lops.h GFS2: Move log buffer lists into transaction 2014-02-24 16:54:54 +00:00
main.c GFS2: update freeze code to use freeze/thaw_super on all nodes 2014-11-17 10:36:39 +00:00
Makefile GFS2: Rename ops_inode.c to inode.c 2011-05-10 13:12:49 +01:00
meta_io.c mm: non-atomically mark page accessed during page cache allocation where possible 2014-06-04 16:54:10 -07:00
meta_io.h GFS2: Fix address space from page function 2014-03-31 17:48:27 +01:00
ops_fstype.c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2014-12-10 15:43:30 -08:00
quota.c list_lru: introduce list_lru_shrink_{count,walk} 2015-02-12 18:54:08 -08:00
quota.h GFS2: Use RCU/hlist_bl based hash for quotas 2014-01-14 19:27:56 +00:00
recovery.c GFS2: fix sprintf format specifier 2015-01-13 10:48:57 +00:00
recovery.h GFS2: Move recovery variables to journal structure in memory 2014-03-07 09:14:48 +00:00
rgrp.c GFS2: If we use up our block reservation, request more next time 2014-11-03 19:26:54 +00:00
rgrp.h GFS2: If we use up our block reservation, request more next time 2014-11-03 19:26:54 +00:00
super.c GFS2: update freeze code to use freeze/thaw_super on all nodes 2014-11-17 10:36:39 +00:00
super.h GFS2: update freeze code to use freeze/thaw_super on all nodes 2014-11-17 10:36:39 +00:00
sys.c GFS2: fix sprintf format specifier 2015-01-13 10:48:57 +00:00
sys.h GFS2: dlm based recovery coordination 2012-01-11 09:23:05 +00:00
trace_gfs2.h GFS2: Add origin indicator to glock demote tracing 2013-04-10 10:32:05 +01:00
trans.c GFS2: update freeze code to use freeze/thaw_super on all nodes 2014-11-17 10:36:39 +00:00
trans.h GFS2: Split gfs2_trans_add_bh() into two 2013-01-29 10:28:04 +00:00
util.c GFS2: Convert gfs2_lm_withdraw to use fs_err 2014-03-07 09:39:18 +00:00
util.h GFS2: Convert gfs2_lm_withdraw to use fs_err 2014-03-07 09:39:18 +00:00
xattr.c gfs2: use generic posix ACL infrastructure 2014-01-25 23:58:22 -05:00
xattr.h