linux/lib
Johannes Weiner ea07b862ac mm: workingset: fix use-after-free in shadow node shrinker
Several people report seeing warnings about inconsistent radix tree
nodes followed by crashes in the workingset code, which all looked like
use-after-free access from the shadow node shrinker.

Dave Jones managed to reproduce the issue with a debug patch applied,
which confirmed that the radix tree shrinking indeed frees shadow nodes
while they are still linked to the shadow LRU:

  WARNING: CPU: 2 PID: 53 at lib/radix-tree.c:643 delete_node+0x1e4/0x200
  CPU: 2 PID: 53 Comm: kswapd0 Not tainted 4.10.0-rc2-think+ #3
  Call Trace:
     delete_node+0x1e4/0x200
     __radix_tree_delete_node+0xd/0x10
     shadow_lru_isolate+0xe6/0x220
     __list_lru_walk_one.isra.4+0x9b/0x190
     list_lru_walk_one+0x23/0x30
     scan_shadow_nodes+0x2e/0x40
     shrink_slab.part.44+0x23d/0x5d0
     shrink_node+0x22c/0x330
     kswapd+0x392/0x8f0

This is the WARN_ON_ONCE(!list_empty(&node->private_list)) placed in the
inlined radix_tree_shrink().

The problem is with 14b468791f ("mm: workingset: move shadow entry
tracking to radix tree exceptional tracking"), which passes an update
callback into the radix tree to link and unlink shadow leaf nodes when
tree entries change, but forgot to pass the callback when reclaiming a
shadow node.

While the reclaimed shadow node itself is unlinked by the shrinker, its
deletion from the tree can cause the left-most leaf node in the tree to
be shrunk.  If that happens to be a shadow node as well, we don't unlink
it from the LRU as we should.

Consider this tree, where the s are shadow entries:

       root->rnode
            |
       [0       n]
        |       |
     [s    ] [sssss]

Now the shadow node shrinker reclaims the rightmost leaf node through
the shadow node LRU:

       root->rnode
            |
       [0        ]
        |
    [s     ]

Because the parent of the deleted node is the first level below the
root and has only one child in the left-most slot, the intermediate
level is shrunk and the node containing the single shadow is put in
its place:

       root->rnode
            |
       [s        ]

The shrinker again sees a single left-most slot in a first level node
and thus decides to store the shadow in root->rnode directly and free
the node - which is a leaf node on the shadow node LRU.

  root->rnode
       |
       s

Without the update callback, the freed node remains on the shadow LRU,
where it causes later shrinker runs to crash.

Pass the node updater callback into __radix_tree_delete_node() in case
the deletion causes the left-most branch in the tree to collapse too.

Also add warnings when linked nodes are freed right away, rather than
wait for the use-after-free when the list is scanned much later.

Fixes: 14b468791f ("mm: workingset: move shadow entry tracking to radix tree exceptional tracking")
Reported-by: Dave Chinner <david@fromorbit.com>
Reported-by: Hugh Dickins <hughd@google.com>
Reported-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-and-tested-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chris Leech <cleech@redhat.com>
Cc: Lee Duncan <lduncan@suse.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-07 18:22:40 -08:00
..
842 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2016-03-17 21:38:27 -07:00
fonts fonts: Add 6x10 font 2014-10-09 11:35:48 +03:00
lz4 lib: lz4: cleanup unaligned access efficiency detection 2016-04-13 09:22:49 -07:00
lzo lzo: check for length overrun in variable length encoding. 2014-09-28 11:08:01 +02:00
mpi mpi: Fix NULL ptr dereference in mpi_powm() [ver #3] 2016-11-25 12:57:50 +11:00
raid6 lib/raid6: Add AVX2 optimized xor_syndrome functions 2016-11-07 15:08:20 -08:00
reed_solomon
xz lib/xz: enable all filters by default in Kconfig 2014-06-04 16:54:18 -07:00
zlib_deflate zlib_deflate/deftree: remove bi_reverse() 2015-09-10 13:29:01 -07:00
zlib_inflate zlib: clean up some dead code 2014-08-06 18:01:24 -07:00
.gitignore
argv_split.c
asn1_decoder.c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2016-05-17 09:33:39 -07:00
assoc_array.c assoc_array: don't call compare_object() on a node 2016-04-06 14:06:48 +01:00
atomic64_test.c atomic64: no need for CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE 2016-10-07 18:46:30 -07:00
atomic64.c locking/atomic: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() 2016-06-16 10:48:32 +02:00
audit.c syscalls: implement execveat() system call 2014-12-13 12:42:51 -08:00
bcd.c
bch.c
bitmap.c lib/bitmap.c: enhance bitmap syntax 2016-10-11 15:06:30 -07:00
bitrev.c ARM: 8187/1: add CONFIG_HAVE_ARCH_BITREVERSE to support rbit instruction 2014-12-22 16:43:06 +00:00
bsearch.c
btree.c treewide: Remove old email address 2015-11-23 09:44:58 +01:00
bug.c lib/bug.c: use common WARN helper 2016-03-17 15:09:34 -07:00
build_OID_registry
bust_spinlocks.c
chacha20.c random: replace non-blocking pool with a Chacha20-based CRNG 2016-07-03 00:57:23 -04:00
check_signature.c
checksum.c ipv4: Update parameters for csum_tcpudp_magic to their original types 2016-03-13 23:55:13 -04:00
clz_ctz.c lib/clz_ctz.c: add prototype declarations in lib/clz_ctz.c 2014-04-03 16:21:12 -07:00
clz_tab.c
cmdline.c lib: Add a generic cmdline parse function parse_option_str 2014-10-03 18:40:58 +01:00
compat_audit.c audit: Add generic compat syscall support 2014-03-20 10:11:35 -04:00
cordic.c
cpu_rmap.c sched/topology: Rename topology_thread_cpumask() to topology_sibling_cpumask() 2015-05-27 15:22:15 +02:00
cpumask.c cpumask: Export cpumask_any_but() 2016-02-29 09:35:20 +01:00
crc7.c lib/crc7: Shift crc7() output left 1 bit 2014-05-16 14:26:52 -04:00
crc8.c
crc16.c
crc32.c crc32: use ktime_get_ns() for measurement 2016-08-02 19:35:08 -04:00
crc32defs.h
crc-ccitt.c
crc-itu-t.c lib: crc-itu-t.[ch] fix 0x0x prefix in integer constants 2015-05-26 15:26:43 +02:00
crc-t10dif.c lib: introduce crc_t10dif_update() 2015-05-30 22:42:24 -07:00
ctype.c
debug_info.c kbuild: include core debug info when DEBUG_INFO_REDUCED 2015-06-11 15:08:32 +02:00
debug_locks.c
debugobjects.c Merge branch 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2016-12-13 12:59:57 -08:00
dec_and_lock.c
decompress_bunzip2.c lib/decompressors: use real out buf size for gunzip with kernel 2015-09-10 13:29:01 -07:00
decompress_inflate.c lib/decompressors: use real out buf size for gunzip with kernel 2015-09-10 13:29:01 -07:00
decompress_unlz4.c lib/decompressors: use real out buf size for gunzip with kernel 2015-09-10 13:29:01 -07:00
decompress_unlzma.c lib/decompress_unlzma: Do a NULL check for pointer 2015-09-10 13:29:01 -07:00
decompress_unlzo.c lib/decompressors: use real out buf size for gunzip with kernel 2015-09-10 13:29:01 -07:00
decompress_unxz.c lib/decompressors: use real out buf size for gunzip with kernel 2015-09-10 13:29:01 -07:00
decompress.c lib/decompress: set the compressor name to NULL on error 2015-07-17 16:39:54 -07:00
devres.c devres: use to_pci_dev() 2016-02-07 23:17:59 -08:00
digsig.c lib/digsig: digsig_verify_rsa(): return -EINVAL if modulo length is zero 2016-05-31 16:42:00 +08:00
div64.c __div64_32(): make it overridable at compile time 2015-11-16 14:42:12 -05:00
dma-debug.c dmaengine updates for 4.8-rc1 2016-10-06 17:13:54 -07:00
dma-noop.c dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
dump_stack.c dump_stack: avoid potential deadlocks 2016-02-05 18:10:40 -08:00
dynamic_debug.c dynamic_debug: add jump label support 2016-08-04 08:50:07 -04:00
dynamic_queue_limits.c lib/dynamic_queue_limits.c: simplify includes 2015-02-12 18:54:15 -08:00
earlycpio.c lib/cpio: Make find_cpio_data()'s offset arg optional 2016-06-08 11:04:19 +02:00
extable.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
fault-inject.c fault-inject: fix inverted interval/probability values in printk 2015-10-23 17:55:10 +09:00
fdt_empty_tree.c lib: add fdt_empty_tree.c 2014-04-30 19:49:37 +01:00
fdt_ro.c
fdt_rw.c
fdt_strerror.c
fdt_sw.c
fdt_wip.c
fdt.c
find_bit.c lib: rename lib/find_next_bit.c to lib/find_bit.c 2015-04-17 09:03:54 -04:00
flex_array.c reciprocal_divide: update/correction of the algorithm 2014-01-21 23:17:20 -08:00
flex_proportions.c lib+mm: fix few spelling mistakes 2016-02-15 11:18:23 +01:00
gcd.c lib/GCD.c: use binary GCD algorithm instead of Euclidean 2016-05-20 17:58:30 -07:00
gen_crc32table.c lib: crc32: constify crc32 lookup table 2015-02-13 21:21:35 -08:00
genalloc.c lib/genalloc.c: start search from start of chunk 2016-10-27 18:43:43 -07:00
glob.c lib/glob.c: add CONFIG_GLOB_SELFTEST 2014-08-06 18:01:25 -07:00
halfmd4.c lib/halfmd4.c: use rol32 inline function in the ROUND macro 2015-11-06 17:50:42 -08:00
hexdump.c lib/hexdump.c: truncate output in case of overflow 2015-11-06 17:50:42 -08:00
hweight.c x86/hweight: Get rid of the special calling convention 2016-06-08 15:01:02 +02:00
idr.c lib/ida: document locking requirements a bit better 2016-12-12 18:55:09 -08:00
inflate.c
int_sqrt.c
interval_tree_test.c lib: Export interval_tree 2014-05-05 09:09:14 +02:00
interval_tree.c lib/interval_tree.c: simplify includes 2015-02-12 18:54:15 -08:00
iomap_copy.c lib/iomap_copy.c: add __ioread32_copy() 2016-01-20 17:09:18 -08:00
iomap.c Kconfig: rename HAS_IOPORT to HAS_IOPORT_MAP 2014-04-07 16:36:11 -07:00
iommu-common.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc 2015-11-05 16:34:48 -08:00
iommu-helper.c lib/iommu-helper: skip to next segment 2016-08-02 19:35:07 -04:00
ioremap.c x86, mm: support huge KVA mappings on x86 2015-04-14 16:49:04 -07:00
iov_iter.c [iov_iter] fix iterate_all_kinds() on empty iterators 2016-12-22 23:00:22 -05:00
irq_poll.c This adds a new gcc plugin named "latent_entropy". It is designed to 2016-10-15 10:03:15 -07:00
irq_regs.c
is_single_threaded.c lib/is_single_threaded.c: change current_is_single_threaded() to use for_each_thread() 2015-11-06 17:50:42 -08:00
jedec_ddr_data.c
kasprintf.c lib/kasprintf.c: add sanity check to kvasprintf 2016-01-16 11:17:27 -08:00
Kconfig Merge branch 'akpm' (patches from Andrew) 2016-10-07 21:38:00 -07:00
Kconfig.debug cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions 2016-12-25 10:47:43 +01:00
Kconfig.kasan mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB 2016-07-28 16:07:41 -07:00
Kconfig.kgdb kgdb: depends on VT 2016-05-23 17:04:14 -07:00
Kconfig.kmemcheck
Kconfig.ubsan Kconfig: lib/Kconfig.ubsan fix reference to ubsan documentation 2016-12-14 16:04:08 -08:00
kfifo.c kfifo: use BUG_ON 2014-08-08 15:57:25 -07:00
klist.c klist: fix starting point removed bug in klist iterators 2016-02-07 22:18:47 -08:00
kobject_uevent.c kobject: improve function-level documentation 2016-10-28 02:42:10 -04:00
kobject.c kobject: export kset_find_obj() for module use 2016-02-09 17:36:34 -08:00
kstrtox.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
kstrtox.h
lcm.c block: fix blk_stack_limits() regression due to lcm() change 2015-03-31 09:45:50 -06:00
libcrc32c.c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2016-01-22 11:58:43 -08:00
list_debug.c bug: Provide toggle for BUG on data corruption 2016-10-31 13:01:58 -07:00
list_sort.c lib/list_sort: use late_initcall to hook in self tests 2015-06-16 14:12:35 -04:00
llist.c lib/llist.c: fix data race in llist_del_first 2015-11-06 17:50:42 -08:00
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c locking/selftest: Fix output since KERN_CONT changes 2016-11-25 07:12:19 +01:00
lockref.c locking/core: Remove cpu_relax_lowlatency() users 2016-11-16 10:15:10 +01:00
lru_cache.c lru_cache: Converted lc_seq_printf_status to return void 2015-11-25 09:22:02 -07:00
Makefile cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions 2016-12-25 10:47:43 +01:00
md5.c lib/md5.c: simplify include 2015-02-12 18:54:15 -08:00
memory-notifier-error-inject.c
memweight.c
net_utils.c mac_pton: Use bool not int return 2014-06-25 17:45:43 -07:00
netdev-notifier-error-inject.c net: Add support for CHANGEUPPER notifier error injection 2015-12-03 11:49:23 -05:00
nlattr.c netlink: smaller nla_attr_minlen table 2016-11-19 22:11:25 -05:00
nmi_backtrace.c nmi_backtrace: generate one-line reports for idle cpus 2016-10-07 18:46:30 -07:00
nodemask.c include/linux/nodemask.h: create next_node_in() helper 2016-05-19 19:12:14 -07:00
notifier-error-inject.c
notifier-error-inject.h
of-reconfig-notifier-error-inject.c
oid_registry.c
once.c once: make helper generic for calling functions once 2015-10-08 05:26:36 -07:00
parser.c parser: add u64 number parser 2016-12-06 10:17:03 +02:00
pci_iomap.c libnvdimm for 4.3: 2015-09-08 14:35:59 -07:00
percpu_counter.c lib/percpu_counter: Convert to hotplug state machine 2016-11-09 23:45:26 +01:00
percpu_ida.c mm, page_alloc: rename __GFP_WAIT to __GFP_RECLAIM 2015-11-06 17:50:42 -08:00
percpu_test.c percpu: add test module for various percpu operations 2013-11-13 12:09:11 +09:00
percpu-refcount.c percpu-refcount: init ->confirm_switch member properly 2016-08-11 13:52:23 -04:00
plist.c lib/plist.c: remove redundant include 2015-02-12 18:54:16 -08:00
pm-notifier-error-inject.c
radix-tree.c mm: workingset: fix use-after-free in shadow node shrinker 2017-01-07 18:22:40 -08:00
random32.c This adds a new gcc plugin named "latent_entropy". It is designed to 2016-10-15 10:03:15 -07:00
ratelimit.c ratelimit: extend to print suppressed messages on release 2016-08-02 19:35:06 -04:00
rational.c
rbtree_test.c rbtree/test: test rbtree_postorder_for_each_entry_safe() 2014-01-23 16:37:03 -08:00
rbtree.c lib/rbtree.c: fix typo in comment of ____rb_erase_color 2016-12-12 18:55:09 -08:00
reciprocal_div.c reciprocal_divide: update/correction of the algorithm 2014-01-21 23:17:20 -08:00
rhashtable.c rhashtable: Add rhlist interface 2016-09-20 04:43:36 -04:00
sbitmap.c sbitmap: initialize weight to zero 2016-09-19 08:19:40 -06:00
scatterlist.c scatterlist: fix a typo in comment block of sg_miter_stop() 2016-02-08 10:15:17 -08:00
seq_buf.c tracing: Use seq_buf_used() in seq_buf_to_user() instead of len 2015-12-23 14:27:20 -05:00
sg_pool.c lib: scatterlist: move SG pool code from SCSI driver to lib/sg_pool.c 2016-04-15 16:53:14 -04:00
sg_split.c lib: scatterlist: add sg splitting function 2015-08-24 14:28:01 -06:00
sha1.c lib: EXPORT_SYMBOL sha_init 2015-03-23 22:12:08 -04:00
show_mem.c lib/show_mem.c: correct reserved memory calculation 2015-09-08 15:35:28 -07:00
smp_processor_id.c percpu: add preemption checks to __this_cpu ops 2014-04-07 16:36:14 -07:00
sort.c lib/sort: Add 64 bit swap function 2015-06-25 17:00:40 -07:00
stackdepot.c lib/stackdepot: export save/fetch stack for drivers 2016-11-11 08:12:37 -08:00
stmp_device.c lib/stmp_device.c: replace module.h include 2015-02-12 18:54:16 -08:00
string_helpers.c string_helpers: add kstrdup_quotable_file 2016-04-21 10:47:26 +10:00
string.c lib: move strtobool() to kstrtobool() 2016-03-17 15:09:34 -07:00
strncpy_from_user.c lib: harden strncpy_from_user 2016-10-11 15:06:30 -07:00
strnlen_user.c unsafe_[get|put]_user: change interface to use a error target label 2016-08-08 13:02:01 -07:00
swiotlb.c swiotlb: Export swiotlb_max_segment to users 2017-01-06 13:00:01 -05:00
syscall.c lib/syscall: Pin the task stack in collect_syscall() 2016-09-16 09:18:53 +02:00
test_bitmap.c test_bitmap: unit tests for lib/bitmap.c 2016-02-19 22:54:09 -05:00
test_bpf.c bpf, test: fix ld_abs + vlan push/pop stress test 2016-10-20 14:39:06 -04:00
test_firmware.c test: firmware_class: add asynchronous request trigger 2016-01-07 13:44:22 -07:00
test_hash.c lib/test_hash.c: fix warning in preprocessor symbol evaluation 2016-09-01 17:52:01 -07:00
test_hexdump.c test_hexdump: print statistics at the end 2016-01-20 17:09:18 -08:00
test_kasan.c kasan: support use-after-scope detection 2016-11-30 16:32:52 -08:00
test_module.c test: add minimal module for verification testing 2014-01-23 16:36:57 -08:00
test_printf.c mm, printk: introduce new format string for flags 2016-03-15 16:55:16 -07:00
test_rhashtable.c rhashtable-test: Fix max_size parameter description 2016-08-08 12:52:42 -07:00
test_static_key_base.c locking/static_keys: Provide a selftest 2015-08-03 11:51:12 +02:00
test_static_keys.c locking/static_keys: Avoid nested functions 2016-02-09 10:27:29 +01:00
test_user_copy.c test: check copy_to/from_user boundary validation 2014-01-23 16:36:57 -08:00
test_uuid.c lib/uuid: add a test module 2016-05-30 15:26:57 -07:00
test-kstrtox.c kstrto*: accept "-0" for signed conversion 2015-09-10 13:29:01 -07:00
test-string_helpers.c lib/test-string_helpers.c: fix and improve string_get_size() tests 2016-02-03 08:28:43 -08:00
textsearch.c lib/textsearch.c: remove textsearch_put reference from comments 2014-10-14 02:18:14 +02:00
timerqueue.c ktime: Get rid of the union 2016-12-25 17:21:22 +01:00
ts_bm.c
ts_fsm.c
ts_kmp.c
ubsan.c UBSAN: fix typo in format string 2016-08-02 17:31:41 -04:00
ubsan.h UBSAN: run-time undefined behavior sanity checker 2016-01-20 17:09:18 -08:00
ucs2_string.c lib/ucs2_string: Speed up ucs2_utf8size() 2016-09-09 16:08:46 +01:00
uuid.c lib/uuid.c: use correct offset in uuid parser 2016-05-30 15:26:57 -07:00
vsprintf.c lib/uuid.c: introduce a few more generic helpers 2016-05-20 17:58:30 -07:00
win_minmax.c lib/win_minmax: windowed min or max estimator 2016-09-21 00:22:59 -04:00