linux/fs/proc
David Hildenbrand 0daa322b8f fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages
Let's avoid reading:

1) Offline memory sections: the content of offline memory sections is
   stale as the memory is effectively unused by the kernel.  On s390x with
   standby memory, offline memory sections (belonging to offline storage
   increments) are not accessible.  With virtio-mem and the Hyper-V
   balloon, we can have unavailable memory chunks that should not be
   accessed inside offline memory sections.  Last but not least, offline
   memory sections might contain hwpoisoned pages which we can no longer
   identify because the memmap is stale.

2) PG_offline pages: logically offline pages that are documented as
   "The content of these pages is effectively stale.  Such pages should
   not be touched (read/write/dump/save) except by their owner.".
   Examples include pages inflated in a balloon or unavailable memory
   ranges inside hotplugged memory sections with virtio-mem or the Hyper-V
   balloon.

3) PG_hwpoison pages: Reading pages marked as hwpoisoned can be fatal.
   As documented: "Accessing is not safe since it may cause another
   machine check.  Don't touch!"

Introduce is_page_hwpoison(), adding a comment that it is inherently racy
but the best we can really do.

Reading /proc/kcore now performs checks similar to those applied when
reading /proc/vmcore for kdump via makedumpfile: problematic pages are
excluded.  This is also similar to the hibernation code; however, we don't
yet skip hwpoisoned pages when processing pages in
kernel/power/snapshot.c:saveable_page().
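
The three checks above can be modelled in userspace as a rough sketch. This is
not the kernel code: struct model_page, model_page_readable() and
model_kcore_read() are hypothetical names invented for illustration, and
zero-filling the output for a skipped page is an assumption about how
"not reading" appears to the reader of the file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for per-page state; not a kernel type. */
struct model_page {
	bool section_online;	/* models the memory section being online */
	bool pg_offline;	/* models PG_offline (logically offline page) */
	bool pg_hwpoison;	/* models PG_hwpoison */
	unsigned char data[8];	/* tiny stand-in for the page content */
};

/* Returns true only if none of the three exclusion criteria applies. */
static bool model_page_readable(const struct model_page *p)
{
	if (!p->section_online)	/* 1) offline memory section */
		return false;
	if (p->pg_offline)	/* 2) logically offline page */
		return false;
	if (p->pg_hwpoison)	/* 3) hwpoisoned page */
		return false;
	return true;
}

/*
 * Copies up to sizeof(p->data) bytes into buf. For an excluded page the
 * page content is never touched; buf is zero-filled instead (assumption).
 * Returns the number of bytes written to buf.
 */
static size_t model_kcore_read(const struct model_page *p,
			       unsigned char *buf, size_t len)
{
	if (len > sizeof(p->data))
		len = sizeof(p->data);

	if (model_page_readable(p))
		memcpy(buf, p->data, len);
	else
		memset(buf, 0, len);

	return len;
}
```

As in the real check, the flag test in this model is only a snapshot: a flag
can change right after it is tested, which is why the commit describes
is_page_hwpoison() as inherently racy.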

Note 1: we can race against memory offlining code, especially memory going
offline and getting unplugged: however, we will properly tear down the
identity mapping and handle faults gracefully when accessing this memory
from kcore code.

Note 2: we can race against drivers setting PageOffline() and turning
memory inaccessible in the hypervisor.  We'll handle this in a follow-up
patch.

Link: https://lkml.kernel.org/r/20210526093041.8800-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Aili Yao <yaoaili@kingsoft.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-30 20:47:28 -07:00
array.c seccomp: Fix CONFIG tests for Seccomp_filters 2021-03-30 22:33:50 -07:00
base.c proc: only require mm_struct for writing 2021-06-15 10:47:51 -07:00
bootconfig.c proc/bootconfig: Fix to use correct quotes for value 2020-06-16 21:21:03 -04:00
cmdline.c
consoles.c
cpuinfo.c proc/cpuinfo: switch to ->read_iter 2020-11-06 10:05:18 -08:00
devices.c block: move block-related definitions out of fs.h 2020-06-24 09:16:02 -06:00
fd.c fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
fd.h fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
generic.c proc: save LOC in __xlate_proc_name() 2021-05-06 19:24:11 -07:00
inode.c proc: delete redundant subset=pid check 2021-05-06 19:24:11 -07:00
internal.h fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
interrupts.c
Kconfig treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
kcore.c fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages 2021-06-30 20:47:28 -07:00
kmsg.c
loadavg.c
Makefile
meminfo.c mm: memcontrol: convert NR_FILE_PMDMAPPED account to pages 2021-02-24 13:38:29 -08:00
namespaces.c
nommu.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
page.c mm: Add PG_arch_2 page flag 2020-09-04 12:46:06 +01:00
proc_net.c fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
proc_sysctl.c proc/sysctl: fix function name error in comments 2021-05-06 19:24:11 -07:00
proc_tty.c
root.c fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
self.c Revert "proc: don't allow async path resolution of /proc/self components" 2021-02-23 20:32:11 -07:00
softirqs.c
stat.c time-namespace-v5.11 2020-12-14 16:35:39 -08:00
task_mmu.c mm/pagemap: export uffd-wp protection information 2021-06-30 20:47:27 -07:00
task_nommu.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
thread_self.c Revert "proc: don't allow async path resolution of /proc/thread-self components" 2021-02-23 20:32:11 -07:00
uptime.c
util.c
version.c
vmcore.c vmalloc: remove redundant NULL check 2021-02-24 13:38:30 -08:00