Commit Graph

811737 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo
eb563d6604 perf pmu: Remove needless evsel.h include, only needs one fwd decl
To reduce the header dependency tree.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-rc389o1z0htwukqv6ni1viun@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
e9dacd63a1 perf tests pmu: Add missing headers
It needs the definitions for PATH_MAX and snprintf, was getting it
by luck from headers it included and that are now being sanitized.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7bbh3kk0h5mywvfqm64nhv28@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
71551288d2 perf hist: Remove the needless callchain.h include from hist.h
Nothing that is provided by callchain.h is used there, just things that
should've be directly included in hist.h, such as rbtree.h and a
map_symbol forward declaration.

Remove it so that we reduce the headers dependency tree.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-zivvqfx93w5zzur7hr7h0nlh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
b10ba7f1a2 perf tools: Add missing include <callchain.h> in various places
Its getting it from hist.h and that will go away, as that header doesn't
need callchain.h at all.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-6ebl3mwwiqocl79yts44qltu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
e22c1c7511 perf thread: Don't include symbol.h, symbol_conf.h is enough
Also add stdio.h to get the FILE definition.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-8vx5396phynuxhdsxxfbdhsk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
9cd997f85e perf evsel: No need to include symbol.h in evsel.h, symbol_conf.h is enough
To reduce the header dependency and avoid unnecessary rebuilds when
things change in symbol.h.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-6duflwliprh2tr47w5x4t260@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
daecf9e0fa perf tools: Add missing include for symbols.h
Several places were using definitions found in symbols.h but not
including it, getting it by sheer luck from some other headers that now
are in the process of removing that include because they don't need it
or because simply having struct forward declarations is enough, fix it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-xbcvvx296d70kpg9wb0qmeq9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
7cadca8e1b perf hist: Remove symbol.h from hist.h, just fwd decls are needed
To reduce the includes dependencies.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-cmvg5ght75mmfg1efeyna9rn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
2f2ae234e5 perf tests: Add missing headers so far obtained indirectly
We're going to remove symbol.h from some places and this breaks
some of the perf tests, fix it by adding the required includes.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-wpa4b6x0btpnh2kjxzl9no4w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
41f30914fc perf map: Move structs and prototypes for map groups to a separate header
And since machine.h only needs what is in there, make it stop including
map.h and instead include this newly introduced map_groups.h instead.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-dbob25fv5rp2rjpwlnterf38@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
1101f69af5 pref tools: Add missing map.h includes
Lots of places get the map.h file indirectly, and since we're going to
remove it from machine.h, then those need to include it directly, do it
now, before we remove that dep.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ob8jehdjda8h5jsrv9dqj9tf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
9f4e8ff27a perf symbols: Introduce map_symbol.h
To allow headers just wanting this definition to be able to get it
without all the things in symbol.h, to reduce the include dep tree.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-l32z2qyhs6fe8unf4gk2ead2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
7b644f9ad1 perf callchain: Uninline callchain_cursor_reset() to remove map.h dependency
That was the only thing that made including map.h in callchain.h a
requiriment, so uninline it and just add a 'struct map' forward
declaration.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7fjz4hvv1bpzqaeriku44fn4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
4fed072609 perf srccode: Move struct definition from map.h to srccode.h
To reduce the header dependencies, since we already have a srccode.h
header, then there is where the 'struct srccode_state' should be, and
map.h, that is more widely used should have just a forward declaraion
of 'struct srccode_state'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-64lrkjjaa7wlo1zi2gr5u3es@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
af1db7f6b7 perf arm pmu: Add missing linux/string.h header
It uses strstarts(), that is defined in linux/string.h but that was
being including by sheer luck, indirectly, fix it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dongjiu Geng <gengdongjiu@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/n/tip-vub5lp82wb7vy5wssfad0xu8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Arnaldo Carvalho de Melo
d6e4ae499f perf powerpc: Add missing headers to skip-callchain-idx.c
It uses several structs but don't explicitely includes the headers where
they are defined, getting them by sheer luck from one of the headers it
includes, since those are being streamlined to avoid unnecessary
rebuilds when changes are made to a random header, they will break, fix
them now so that they continue to build.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Maynard Johnson <maynard@us.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: https://lkml.kernel.org/n/tip-j1nyksegpnz36wi3qx2p46i1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-06 10:00:38 -03:00
Elena Reshetova
ca3bb3d027 perf/ring_buffer: Convert ring_buffer.aux_refcount to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:

 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable ring_buffer.aux_refcount is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

** Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts. Please check Documentation/core-api/refcount-vs-atomic.rst
for more information.

Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.

For the ring_buffer.aux_refcount it might make a difference
in following places:

 - perf_aux_output_begin(): increment in refcount_inc_not_zero() only
   guarantees control dependency on success vs. fully ordered
   atomic counterpart
 - rb_free_aux(): decrement in refcount_dec_and_test() only
   provides RELEASE ordering and ACQUIRE ordering + control dependency
   on success vs. fully ordered atomic counterpart

Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: namhyung@kernel.org
Link: https://lkml.kernel.org/r/1548678448-24458-4-git-send-email-elena.reshetova@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04 08:46:17 +01:00
Elena Reshetova
fecb8ed2ce perf/ring_buffer: Convert ring_buffer.refcount to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:

 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable ring_buffer.refcount is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

** Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts. Please check Documentation/core-api/refcount-vs-atomic.rst
for more information.

Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.

For the ring_buffer.refcount it might make a difference
in following places:

 - ring_buffer_get(): increment in refcount_inc_not_zero() only
   guarantees control dependency on success vs. fully ordered
   atomic counterpart
 - ring_buffer_put(): decrement in refcount_dec_and_test() only
   provides RELEASE ordering and ACQUIRE ordering + control dependency
   on success vs. fully ordered atomic counterpart

Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: namhyung@kernel.org
Link: https://lkml.kernel.org/r/1548678448-24458-3-git-send-email-elena.reshetova@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04 08:46:16 +01:00
Elena Reshetova
8c94abbbe1 perf: Convert perf_event_context.refcount to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:

 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable perf_event_context.refcount is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

** Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts. Please check Documentation/core-api/refcount-vs-atomic.rst
for more information.

Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.

For the perf_event_context.refcount it might make a difference
in following places:

 - get_ctx(), perf_event_ctx_lock_nested(), perf_lock_task_context()
   and __perf_event_ctx_lock_double(): increment in
   refcount_inc_not_zero() only guarantees control dependency
   on success vs. fully ordered atomic counterpart
 - put_ctx(): decrement in refcount_dec_and_test() provides
   RELEASE ordering and ACQUIRE ordering + control dependency on success
   vs. fully ordered atomic counterpart

Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: namhyung@kernel.org
Link: https://lkml.kernel.org/r/1548678448-24458-2-git-send-email-elena.reshetova@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04 08:46:15 +01:00
Thomas Gleixner
720e596a16 perf/uprobes: Convert to SPDX license identifier
Replace the license boiler plate with a SPDX license identifier.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paul McKenney <paulmck@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190116111308.211981422@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04 08:46:13 +01:00
Thomas Gleixner
469eb32eaf perf/hw_breakpoints: Convert to SPDX license identifier
Replace the license boiler plate with a SPDX license identifier.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paul McKenney <paulmck@linux.ibm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190116111308.105855650@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04 08:46:13 +01:00
Thomas Gleixner
8e86e01526 perf/core: Convert to SPDX license identifiers
Use proper SPDX license identifiers instead of the bogus reference to
kernel-base/COPYING.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190116111308.012666937@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04 08:46:11 +01:00
Ingo Molnar
98cb621081 Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04 08:45:42 +01:00
Mark Rutland
9dff0aa95a perf/core: Don't WARN() for impossible ring-buffer sizes
The perf tool uses /proc/sys/kernel/perf_event_mlock_kb to determine how
large its ringbuffer mmap should be. This can be configured to arbitrary
values, which can be larger than the maximum possible allocation from
kmalloc.

When this is configured to a suitably large value (e.g. thanks to the
perf fuzzer), attempting to use perf record triggers a WARN_ON_ONCE() in
__alloc_pages_nodemask():

   WARNING: CPU: 2 PID: 5666 at mm/page_alloc.c:4511 __alloc_pages_nodemask+0x3f8/0xbc8

Let's avoid this by checking that the requested allocation is possible
before calling kzalloc.

Reported-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190110142745.25495-1-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04 08:45:25 +01:00
Peter Zijlstra
602cae04c4 perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu()
intel_pmu_cpu_prepare() allocated memory for ->shared_regs among other
members of struct cpu_hw_events. This memory is released in
intel_pmu_cpu_dying() which is wrong. The counterpart of the
intel_pmu_cpu_prepare() callback is x86_pmu_dead_cpu().

Otherwise if the CPU fails on the UP path between CPUHP_PERF_X86_PREPARE
and CPUHP_AP_PERF_X86_STARTING then it won't release the memory but
allocate new memory on the next attempt to online the CPU (leaking the
old memory).
Also, if the CPU down path fails between CPUHP_AP_PERF_X86_STARTING and
CPUHP_PERF_X86_PREPARE then the CPU will go back online but never
allocate the memory that was released in x86_pmu_dying_cpu().

Make the memory allocation/free symmetrical in regard to the CPU hotplug
notifier by moving the deallocation to intel_pmu_cpu_dead().

This started in commit:

   a7e3ed1e47 ("perf: Add support for supplementary event registers").

In principle the bug was introduced in v2.6.39 (!), but it will almost
certainly not backport cleanly across the big CPU hotplug rewrite between v4.7-v4.15...

[ bigeasy: Added patch description. ]
[ mingo: Added backporting guidance. ]

Reported-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With developer hat on
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With maintainer hat on
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: jolsa@kernel.org
Cc: kan.liang@linux.intel.com
Cc: namhyung@kernel.org
Cc: <stable@vger.kernel.org>
Fixes: a7e3ed1e47 ("perf: Add support for supplementary event registers").
Link: https://lkml.kernel.org/r/20181219165350.6s3jvyxbibpvlhtq@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04 08:44:51 +01:00
Kan Liang
9e63a7894f perf/x86/intel/uncore: Add Node ID mask
Some PCI uncore PMUs cannot be registered on an 8-socket system (HPE
Superdome Flex).

To understand which Socket the PCI uncore PMUs belongs to, perf retrieves
the local Node ID of the uncore device from CPUNODEID(0xC0) of the PCI
configuration space, and the mapping between Socket ID and Node ID from
GIDNIDMAP(0xD4). The Socket ID can be calculated accordingly.

The local Node ID is only available at bit 2:0, but current code doesn't
mask it. If a BIOS doesn't clear the rest of the bits, an incorrect Node ID
will be fetched.

Filter the Node ID by adding a mask.

Reported-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org> # v3.7+
Fixes: 7c94ee2e09 ("perf/x86: Add Intel Nehalem and Sandy Bridge-EP uncore support")
Link: https://lkml.kernel.org/r/1548600794-33162-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-04 08:44:43 +01:00
Linus Torvalds
8834f5600c Linux 5.0-rc5 2019-02-03 13:48:04 -08:00
Linus Torvalds
24b888d8d5 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A few updates for x86:

   - Fix an unintended sign extension issue in the fault handling code

   - Rename the new resource control config switch so it's less
     confusing

   - Avoid setting up EFI info in kexec when the EFI runtime is
     disabled.

   - Fix the microcode version check in the AMD microcode loader so it
     only loads higher version numbers and never downgrades

   - Set EFER.LME in the 32bit trampoline before returning to long mode
     to handle older AMD/KVM behaviour properly.

   - Add Darren and Andy as x86/platform reviewers"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Avoid confusion over the new X86_RESCTRL config
  x86/kexec: Don't setup EFI info if EFI runtime is not enabled
  x86/microcode/amd: Don't falsely trick the late loading mechanism
  MAINTAINERS: Add Andy and Darren as arch/x86/platform/ reviewers
  x86/fault: Fix sign-extend unintended sign extension
  x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline before returning to long mode
  x86/cpu: Add Atom Tremont (Jacobsville)
2019-02-03 09:08:12 -08:00
Linus Torvalds
cc6810e36b Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull cpu hotplug fixes from Thomas Gleixner:
 "Two fixes for the cpu hotplug machinery:

   - Replace the overly clever 'SMT disabled by BIOS' detection logic as
     it breaks KVM scenarios and prevents speculation control updates
     when the Hyperthreads are brought online late after boot.

   - Remove a redundant invocation of the speculation control update
     function"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM
  x86/speculation: Remove redundant arch_smt_update() invocation
2019-02-03 09:02:03 -08:00
Linus Torvalds
58f6d4287a Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "A pile of perf updates:

   - Fix broken sanity check in the /proc/sys/kernel/perf_cpu_time_max_percent
     write handler

   - Cure a perf script crash which caused by an unitinialized data
     structure

   - Highlight the hottest instruction in perf top and not a random one

   - Cure yet another clang issue when building perf python

   - Handle topology entries with no CPU correctly in the tools

   - Handle perf data which contains both tracepoints and performance
     counter entries correctly.

   - Add a missing NULL pointer check in perf ordered_events_free()"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf script: Fix crash when processing recorded stat data
  perf top: Fix wrong hottest instruction highlighted
  perf tools: Handle TOPOLOGY headers with no CPU
  perf python: Remove -fstack-clash-protection when building with some clang versions
  perf core: Fix perf_proc_update_handler() bug
  perf script: Fix crash with printing mixed trace point and other events
  perf ordered_events: Fix crash in ordered_events__free
2019-02-03 08:59:51 -08:00
Linus Torvalds
89401be658 Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fix from Thomas Gleixner:
 "The dump info for the efi page table debugging lacks a terminator
  which causes the kernel to crash when the debugfile is read"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi/arm64: Fix debugfs crash by adding a terminator for ptdump marker
2019-02-03 08:57:05 -08:00
Linus Torvalds
312b3a93dd for-5.0-rc4-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlxWtJgACgkQxWXV+ddt
 WDsDow//ZpnyDwQWvSIfF2UUQOPlcBjbHKuBA7rU0wdybW635QYGR0mqnI+1VnMj
 7ssUkeN6N0a2gQzrUG4Y+zpdzOWv2xQ4jKZ9GMOp9gwyzEFyPkcFXOnmM8UfYtVu
 e3fK65e8BZHmTeu0kGKah4Dt1g0t4fUmhsKR4Pfp5YNJC+zuuGTwUW1K/ZQHXJ+3
 kTHc7WP1lsF7wgaZ+Gl+Kvp8fVrHVdygMVTdRBW8QaBgPLa/KExvK62jW+NmCYhj
 7OIkWdew7e8IXc3Ie5IbOomHAv7IgqqgiO9VO9+n0EpyV4UobUgxrgBKJ+0yc1Ya
 eidbKhMslwUE50y00JVm+vw0gwQHkR+hZDn/mRB6xiIeI8tu/yQIJZ6AhYJXoByR
 cu8+SNO5Z5dOZ1f146ZH8lnkr24tuSnkDUhbRDR5pdb4tAHHej2ALzhbfbwbPEpF
 IverYLw5fOMGeRU/mBsjkVadpSZ4S0HVNU85ERdhLtVLK1PSaY2UkUaA+Ii5y7au
 EYDjaGMflmJ8cAVqgtgedEff2n8OKDnzRZlz4IPLI73MVSITGZkM7PmYmYsLm2Bs
 mDPnmyqR8kzcd1RRtSeZTvqOpAIZG+QUOmD2jiKrchmp54Sz0V/HJWRs3aQybD6Q
 ph0yAcbkvgp/ewe5IFgaI0kcyH7zYdL6GtiI2WUE3/8DObgrgsA=
 =E2PP
 -----END PGP SIGNATURE-----

Merge tag 'for-5.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - regression fix: transaction commit can run away due to delayed ref
   waiting heuristic, this is not necessary now because of the proper
   reservation mechanism introduced in 5.0

 - regression fix: potential crash due to use-before-check of an ERR_PTR
   return value

 - fix for transaction abort during transaction commit that needs to
   properly clean up pending block groups

 - fix deadlock during b-tree node/leaf splitting, when this happens on
   some of the fundamental trees, we must prevent new tree block
   allocation to re-enter indirectly via the block group flushing path

 - potential memory leak after errors during mount

* tag 'for-5.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: On error always free subvol_name in btrfs_mount
  btrfs: clean up pending block groups when transaction commit aborts
  btrfs: fix potential oops in device_list_add
  btrfs: don't end the transaction for delayed refs in throttle
  Btrfs: fix deadlock when allocating tree block during leaf/node split
2019-02-03 08:48:33 -08:00
Linus Torvalds
12491ed354 A single fix for building DT bindings in-tree.
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAlxVtl0QHHJvYmhAa2Vy
 bmVsLm9yZwAKCRD6+121jbxhw7wHD/4qMBS9JpAki/RV3Y9NN/FwWRYC4HS59K7s
 XnCt0oQBdGeiu6hxEaZLsUMurIxsfP87YE6sM9iwiE1JqwKnPl8727BNMd8AGeGP
 dT0Q8MXY/VYksegw+IG2E2L7ILAh2TDuHBt5t8ZKFgAzxvebVo9Cjj/hCp/c9j4l
 AUAONZ/sSvyl/ejmkQWTuJpA1ns+GN/OUzHgiNytGat/1gfcNaiKJlZxChVdN7SX
 Vqwxoih/J44LYQibUEIM/WyeNwqfEA440bI0xxoIVomfEH1lV3RNmXg2lrmZrdgi
 yax9+iUzH81wzWOIlWYYLV06bqt77wyxUTOMA4thFH5pXoI9FoH4IvqC+W75Z4xN
 DSQ9rSz7Gx5uDa01ttWI5kiFMcWVqPSuzsx+6dzpmJWII5WPEqSGFA3WniiJe+fG
 pc/s2xNVXQxhvfXi+WSjFZBxv7cKtvBtIOi2oOLlmOy4MDu1BiJ8St3+1eMv4laN
 7/ZmyDMVhIL45cRn28rLZ0NZY5340nit6QcihOwpQ9LElBXPOQ1NVTEjtNlw6SPd
 RHhYIF4bVGuJKYaIB8aU4QobIKIsL0rHfr2NY0xnJNzcJ02Jr+dv1brX6Nqd9msH
 ybnQByAIBG5zrxQPZvik5v4BxEHr4JPITUrxNLkP1HNc7/Vq0HBcCJjsw/ruRLQi
 67/Zo8Ijvg==
 =cHtH
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-fixes-for-5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull Devicetree fix from Rob Herring:
 "A single fix for building DT bindings in-tree"

* tag 'devicetree-fixes-for-5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: Fix dt_binding_check target for in tree builds
2019-02-02 10:34:32 -08:00
Linus Torvalds
74b13e7efe RISC-V Fixes for 5.0-rc5
This patch set contains a handful of mostly-independent patches:
 
 * A patch that causes our port to respect TIF_NEED_RESCHED, which fixes
   CONFIG_PREEMPT=y kernels.
 * A fix to avoid double-put on OF nodes.
 * Fix a misspelling of target in our Kconfig.
 * Generic PCIe is enabled in our defconfig.
 * A fix to our SBI early console to properly handle line endings.
 * A fix such that max_low_pfn is counted in PFNs.
 * A change to TASK_UNMAPPED_BASE to match what other arches do.
 
 This has passed by standard "boot Fedora" flow.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlxR5B4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQf7aEACfy+2EHqgTo5lzRZCwxmjzb4rm5gyH
 1GuR+2NPiwjPADxCqpO7tLFwSoLWU46K7GKfTrjXWbW2hNfBiHfPv62mPSX7dWR0
 kbUMRK0z5qm4mE1K8mmDFWPntGchuuRz1rfPI2Dv8McVybz2MbrvlrJ3zXXK/Qcu
 07cWbmeGd2RWIfj+tSYo+i5Aq4VPWQ6X00ToZlIZ4/4v69mFVgjqTIcc3Jx7unt1
 vGFRjP8jDVibwFAE1yDcUl9OCxqW/wUuwTpPW2HgN+55dktjBxvESl/pMl0AeGYk
 ehKdsaBVk/BkQT2XkYDIPxqK4Yb0VsuPOoH0Jsj+l3tSGjSeRA/LwlHiY7hALIVj
 LTg2FvEKs++hLnzwV6D5Qqj8TcdowSm/hDJttjEewM+IhLPOVGLz16yRcM9jIC7E
 1Jc6ByTo6HgVxhasF/ehT5SowJc1nOR3nS9VVd2P0yYysdN0m/kcfSgxQ1GoXMmC
 2tnMlwJO7pm8akuV2ifxUsM6KlRP47auLtTbRxuebpYnd6YKng7NRLF8Um2tgck4
 wtULBEXZCLYhHB55Lyy+wWDr7Pi3XK7qyvoTNP+AEJHcPtI23TsGKzB0Hu2unvxE
 s9r+0NGuiiuLAsR6Jba9FFSeZR7mJ9u1Bm0fz89Ob6dr1lYf0UUjqqQfHTfVZr40
 F7SqTyD+pe7IoQ==
 =Z3YY
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux

Pull RISC-V fixes from Palmer Dabbelt:
 "This contains a handful of mostly-independent patches:

   - make our port respect TIF_NEED_RESCHED, which fixes
     CONFIG_PREEMPT=y kernels

   - fix double-put of OF nodes

   - fix a misspelling of target in our Kconfig

   - generic PCIe is enabled in our defconfig

   - fix our SBI early console to properly handle line
     endings

   - fix max_low_pfn being counted in PFNs

   - a change to TASK_UNMAPPED_BASE to match what other
     arches do

  This has passed my standard 'boot Fedora' flow"

* tag 'riscv-for-linus-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
  riscv: Adjust mmap base address at a third of task size
  riscv: fixup max_low_pfn with PFN_DOWN.
  tty/serial: use uart_console_write in the RISC-V SBL early console
  RISC-V: defconfig: Add CRYPTO_DEV_VIRTIO=y
  RISC-V: defconfig: Enable Generic PCIE by default
  RISC-V: defconfig: Move CONFIG_PCI{,E_XILINX}
  RISC-V: Kconfig: fix spelling mistake "traget" -> "target"
  RISC-V: asm/page.h: fix spelling mistake "CONFIG_64BITS" -> "CONFIG_64BIT"
  RISC-V: fix bad use of of_node_put
  RISC-V: Add _TIF_NEED_RESCHED check for kernel thread when CONFIG_PREEMPT=y
2019-02-02 10:26:14 -08:00
Linus Torvalds
c8864cb70f for-linus-20190202
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlxVxeIQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpq35EADCoVRFF8mi7wBhNvfN/yLqC9sgnqJM7vF3
 gPpm/E4MsTzCXrhgRmulrikfM8ywOT5ZBgIQp+BrhQgDZIlJ9fcyinFjU7o/gRm5
 R32IMJ4o7uh+YKWQyRSQu1WCaF3hvbNGUT7duMfTKjZ6t9TxTBgy46wYhb7YmNAP
 Ur8C1+4NfF/aHp59n6COM70KdYfswRxtgdEHGfmHAbmKDpvAeC+7I2LPpjTf5bH/
 YULnxk5sQFvCINDJH4zprZ0lSDy+63qk+Q1xxqpFNR1tnmRIVNvKhPPB5xfRLvzB
 mw3JdtwnvoY8Yv6eCs0u7mZs5L3I8zTI+4RtHH9nPD7ykIiUHpejgaRc4TGy2tox
 Dpgfc/Nvdsscpuy4QcFNHbWWBUu5pa5Li+KVS0FEP9FmD5UhcvVmZaZK+V6NuGO4
 9G9wraASFasPK7I0FMHlWLMIWKdj4s4n/H55QnP6yvFsnsrFaqnDwybUFiicjFkv
 hmNQmq8+5p0n3ZBVGQ/SI//vPUjuaFUKU2MhhW0NVz+KkgmEnOJ+W+C77//U33l+
 zhBURtdlnqfImFejEayhhtCtMATJcf2E1rHlA3nM6JyVWMRvQR6asb78QhAKZd6w
 el7rqRdWMwryUYFaROVurfOBMPFdhhyDB1qrOzAZIkhFqWwM9GX6+MH/y8+00HSA
 aA/rXg/daw==
 =8Zz+
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20190202' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A few fixes that should go into this release. This contains:

   - MD pull request from Song, fixing a recovery OOM issue (Alexei)

   - Fix for a sync related stall (Jianchao)

   - Dummy callback for timeouts (Tetsuo)

   - IDE atapi sense ordering fix (me)"

* tag 'for-linus-20190202' of git://git.kernel.dk/linux-block:
  ide: ensure atapi sense request aren't preempted
  blk-mq: fix a hung issue when fsync
  block: pass no-op callback to INIT_WORK().
  md/raid5: fix 'out of memory' during raid cache recovery
2019-02-02 10:16:28 -08:00
Linus Torvalds
3cde55ee79 SCSI fixes on 20190201
Five minor bug fixes.  The libfc one is a tiny memory leak, the zfcp
 one is an incorrect user visible parameter and the rest are on error
 legs or obscure features.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXFTUrCYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishe/hAQCDic6S
 FcD8TPv4jAt3qWun5vVCFAGCEuGet73p89sxpQEA51KcGrFjR9oCtZkd9nIzJL7M
 enGaQ0RtT8wPMSUNklE=
 =t6p4
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Five minor bug fixes.

  The libfc one is a tiny memory leak, the zfcp one is an incorrect user
  visible parameter and the rest are on error legs or obscure features"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: 53c700: pass correct "dev" to dma_alloc_attrs()
  scsi: bnx2fc: Fix error handling in probe()
  scsi: scsi_debug: fix write_same with virtual_gb problem
  scsi: libfc: free skb when receiving invalid flogi resp
  scsi: zfcp: fix sysfs block queue limit output for max_segment_size
2019-02-02 10:12:53 -08:00
Linus Torvalds
b9de6efed2 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "24 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (24 commits)
  autofs: fix error return in autofs_fill_super()
  autofs: drop dentry reference only when it is never used
  fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
  mm: migrate: don't rely on __PageMovable() of newpage after unlocking it
  psi: clarify the Kconfig text for the default-disable option
  mm, memory_hotplug: __offline_pages fix wrong locking
  mm: hwpoison: use do_send_sig_info() instead of force_sig()
  kasan: mark file common so ftrace doesn't trace it
  init/Kconfig: fix grammar by moving a closing parenthesis
  lib/test_kmod.c: potential double free in error handling
  mm, oom: fix use-after-free in oom_kill_process
  mm/hotplug: invalid PFNs from pfn_to_online_page()
  mm,memory_hotplug: fix scan_movable_pages() for gigantic hugepages
  psi: fix aggregation idle shut-off
  mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone
  mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone
  oom, oom_reaper: do not enqueue same task twice
  mm: migrate: make buffer_migrate_page_norefs() actually succeed
  kernel/exit.c: release ptraced tasks before zap_pid_ns_processes
  x86_64: increase stack size for KASAN_EXTRA
  ...
2019-02-02 09:32:58 -08:00
Qian Cai
74c953ca5f efi/arm64: Fix debugfs crash by adding a terminator for ptdump marker
When reading 'efi_page_tables' debugfs triggers an out-of-bounds access here:

  arch/arm64/mm/dump.c: 282
  if (addr >= st->marker[1].start_address) {

called from:

  arch/arm64/mm/dump.c: 331
  note_page(st, addr, 2, pud_val(pud));

because st->marker++ is is called after "UEFI runtime end" which is the
last element in addr_marker[]. Therefore, add a terminator like the one
for kernel_page_tables, so it can be skipped to print out non-existent
markers.

Here's the KASAN bug report:

  # cat /sys/kernel/debug/efi_page_tables
  ---[ UEFI runtime start ]---
  0x0000000020000000-0x0000000020010000          64K PTE       RW NX SHD AF ...
  0x0000000020200000-0x0000000021340000       17664K PTE       RW NX SHD AF ...
  ...
  0x0000000021920000-0x0000000021950000         192K PTE       RW x  SHD AF ...
  0x0000000021950000-0x00000000219a0000         320K PTE       RW NX SHD AF ...
  ---[ UEFI runtime end ]---
  ---[ (null) ]---
  ---[ (null) ]---

   BUG: KASAN: global-out-of-bounds in note_page+0x1f0/0xac0
   Read of size 8 at addr ffff2000123f2ac0 by task read_all/42464
   Call trace:
    dump_backtrace+0x0/0x298
    show_stack+0x24/0x30
    dump_stack+0xb0/0xdc
    print_address_description+0x64/0x2b0
    kasan_report+0x150/0x1a4
    __asan_report_load8_noabort+0x30/0x3c
    note_page+0x1f0/0xac0
    walk_pgd+0xb4/0x244
    ptdump_walk_pgd+0xec/0x140
    ptdump_show+0x40/0x50
    seq_read+0x3f8/0xad0
    full_proxy_read+0x9c/0xc0
    __vfs_read+0xfc/0x4c8
    vfs_read+0xec/0x208
    ksys_read+0xd0/0x15c
    __arm64_sys_read+0x84/0x94
    el0_svc_handler+0x258/0x304
    el0_svc+0x8/0xc

  The buggy address belongs to the variable:
   __compound_literal.0+0x20/0x800

  Memory state around the buggy address:
   ffff2000123f2980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   ffff2000123f2a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
  >ffff2000123f2a80: fa fa fa fa 00 00 00 00 fa fa fa fa 00 00 00 00
                                            ^
   ffff2000123f2b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
   ffff2000123f2b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0

[ ardb: fix up whitespace ]
[ mingo: fix up some moar ]

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 9d80448ac9 ("efi/arm64: Add debugfs node to dump UEFI runtime page tables")
Link: http://lkml.kernel.org/r/20190202095017.13799-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-02-02 11:27:29 +01:00
Johannes Weiner
e6d429313e x86/resctrl: Avoid confusion over the new X86_RESCTRL config
"Resource Control" is a very broad term for this CPU feature, and a term
that is also associated with containers, cgroups etc. This can easily
cause confusion.

Make the user prompt more specific. Match the config symbol name.

 [ bp: In the future, the corresponding ARM arch-specific code will be
   under ARM_CPU_RESCTRL and the arch-agnostic bits will be carved out
   under the CPU_RESCTRL umbrella symbol. ]

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Babu Moger <Babu.Moger@amd.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: linux-doc@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190130195621.GA30653@cmpxchg.org
2019-02-02 10:34:52 +01:00
Linus Torvalds
cd984a5be2 xtensa fixes for v5.0-rc5
- fix ccount_timer_shutdown for secondary CPUs;
 - fix secondary CPU initialization;
 - fix secondary CPU reset vector clash with double exception vector;
 - fix present CPUs when booting with 'maxcpus' parameter;
 - limit possible CPUs by configured NR_CPUS;
 - issue a warning if xtensa PIC is asked to retrigger anything other
   than software IRQ;
 - fix masking/unmasking of the first two IRQs on xtensa MX PIC;
 - fix typo in Kconfig description for user space unaligned access
   feature;
 - fix Kconfig warning for selecting BUILTIN_DTB.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEK2eFS5jlMn3N6xfYUfnMkfg/oEQFAlxUn8MTHGpjbXZia2Jj
 QGdtYWlsLmNvbQAKCRBR+cyR+D+gRAX6D/4oOJd7TghbVdSC82rIQXRBxmlWB1YV
 tPtb/W3qrkkM8c693FbAoLFNoikYFftzn4EomTz1KtkBxq7HjZmmphiTU5E23Zg5
 nonnBkcFnm4Yfr4gLaTJl3rMqJNbDTMg6EyCPRHVI43Ux1jA9j/T2MN/dMZox+5a
 PU2q8k/HHDAlumOPj93MIKBb8XA9Sq9Jfpw2Jnlc0r8b4fR/9pKfVPOcsqs/jv3x
 BFIIH/vPvl2/j+DShpFcYnK8VgRo6zj2ny343J4zYqXspky43ZMMIaE/ZpkT592b
 uheDQYHAHvpZT+FD8waE5P5quBS5P+CmZIbuz7YTxB1VTcoV+OGGCpAvpj5CqmNr
 Mj2f3Yar+4q/QczHP+/42zGVDoJ/3dLBIu9IqSjWkY90qgncd9TD+dMWzI2ejJU7
 LPMIAw//Y/L4m3TAg84GFfGkOjzGQUXGQGl+9sqIr3eOWgoouXatq6L4F2CxCnz2
 zMKT0HFzdxs1gt13oRngugyfK1xRF0H5DW2eNt4dsFURIeIUP5cqou0v+b+CA3Li
 sbvI6yJ0g9tRf1f0yDnSRlvm0nB56zsXKGz5uuD6MiMOleUM3N41+IqCPHwuH0F1
 wNyYSWZEkt1t88rQxnter1+sDpi4brW0BZMVUSHsf+USqcwMrwu9ZmFKj08nihMP
 dCEHHlVGizOEsQ==
 =Pim9
 -----END PGP SIGNATURE-----

Merge tag 'xtensa-20190201' of git://github.com/jcmvbkbc/linux-xtensa

Pull xtensa fixes from Max Filippov:

 - fix ccount_timer_shutdown for secondary CPUs

 - fix secondary CPU initialization

 - fix secondary CPU reset vector clash with double exception vector

 - fix present CPUs when booting with 'maxcpus' parameter

 - limit possible CPUs by configured NR_CPUS

 - issue a warning if xtensa PIC is asked to retrigger anything other
   than software IRQ

 - fix masking/unmasking of the first two IRQs on xtensa MX PIC

 - fix typo in Kconfig description for user space unaligned access
   feature

 - fix Kconfig warning for selecting BUILTIN_DTB

* tag 'xtensa-20190201' of git://github.com/jcmvbkbc/linux-xtensa:
  xtensa: SMP: limit number of possible CPUs by NR_CPUS
  xtensa: rename BUILTIN_DTB to BUILTIN_DTB_SOURCE
  xtensa: Fix typo use space=>user space
  drivers/irqchip: xtensa-mx: fix mask and unmask
  drivers/irqchip: xtensa: add warning to irq_retrigger
  xtensa: SMP: mark each possible CPU as present
  xtensa: smp_lx200_defconfig: fix vectors clash
  xtensa: SMP: fix secondary CPU initialization
  xtensa: SMP: fix ccount_timer_shutdown
2019-02-01 16:56:30 -08:00
Linus Torvalds
8b050fe42d arm64 fixes for -rc5
- Fix module loading when KASLR is configured but disabled at runtime
 
 - Fix accidental IPI when mapping user executable pages
 
 - Ensure hyp-stub and KVM world switch code cannot be kprobed
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAlxUWvEACgkQt6xw3ITB
 YzTNGwgAgAk1D4xfe01rasMK8dnogIMNbpxeeAuiwwFuXq/lBqo2UDX8UMt6qW9k
 b0rLpzyVelhPRJ3lQ90XvIv279cFBjHQT42YyVLlAnzkYz89kPKZweE7cWetO3sV
 kB7uJIaCfstGAp13p2ztN1KzHeIFKxk14gUa9jgmVdbHdJt+L5WdNUExs4+0jgnq
 lxBPWwvluzEc6mmpniIpF7em+svIci5M4al2gjxKVO7DI5slphOEZ4+VSGeSDxTI
 jOzTilS8UfldEnwjndl/hKes9r65afeLSDXvkJTTpfXB9jnKS54FJzy1QCibSRLT
 SHY9xkA3VEgJBifmpCEuJoNs2XaBFQ==
 =Eh1r
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "Although we're still debugging a few minor arm64-specific issues in
  mainline, I didn't want to hold this lot up in the meantime.

  We've got an additional KASLR fix after the previous one wasn't quite
  complete, a fix for a performance regression when mapping executable
  pages into userspace and some fixes for kprobe blacklisting. All
  candidates for stable.

  Summary:

   - Fix module loading when KASLR is configured but disabled at runtime

   - Fix accidental IPI when mapping user executable pages

   - Ensure hyp-stub and KVM world switch code cannot be kprobed"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: hibernate: Clean the __hyp_text to PoC after resume
  arm64: hyp-stub: Forbid kprobing of the hyp-stub
  arm64: kprobe: Always blacklist the KVM world-switch code
  arm64: kaslr: ensure randomized quantities are clean also when kaslr is off
  arm64: Do not issue IPIs for user executable ptes
2019-02-01 16:54:25 -08:00
Linus Torvalds
33640d718c SMB3 fixes, some from this week's SMB3 test evemt, 5 for stable and a particularly important one for queryxattr (see xfstests 70 and 117)
-----BEGIN PGP SIGNATURE-----
 
 iQGyBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAlxUf8kACgkQiiy9cAdy
 T1Fy8Qv4rlgVRBPcSJpvZKhFjMd5KskVIDnP0tOc5od+Mg+547eAg7UKKnJVDgKS
 OUtiP8uC/UErWvQ214hoXF1sMwoG+rDdTRdIOVDLD2a2gIIJzaIPADhYt4w5kiAX
 nO9NwPwk03KTzNUQmpdFPofbLK4csmJUR4kBNE+XhZTulmw9BkplGXnNkQGjsEdf
 nY6YdfqofE//DmR/KB1BylD0lxDtnfuB5zoELPmM6X4iWtD9W6pKVg23huEv+Dmh
 80LdYt77vI4RKEzOAmsYEpROdAmlCksVQC1nlh5EHOuMN/9TCdpdWRahJWWhbbKr
 /b8TuD+0QUUKEQdGvwM8DM/OCjnwoJphzI8CK1VzRpM4XkZxpQTzng2zVFPNzs4Y
 wwKCsxxTO7ePGg6hE/DTnRn9XJtKbR1nQsIjHcw6rE0ydl+P6YRZDUEehqyRU+jd
 Z09XCPS1+zeUPE6PRE9wlrIt7QdxOynufhaWRKf18b2b4cfKbNGcd+rMMZPagUQe
 x2CMkEU=
 =qX/z
 -----END PGP SIGNATURE-----

Merge tag '5.0-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb3 fixes from Steve French:
 "SMB3 fixes, some from this week's SMB3 test evemt, 5 for stable and a
  particularly important one for queryxattr (see xfstests 70 and 117)"

* tag '5.0-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal module version number
  CIFS: fix use-after-free of the lease keys
  CIFS: Do not consider -ENODATA as stat failure for reads
  CIFS: Do not count -ENODATA as failure for query directory
  CIFS: Fix trace command logging for SMB2 reads and writes
  CIFS: Fix possible oops and memory leaks in async IO
  cifs: limit amount of data we request for xattrs to CIFSMaxBufSize
  cifs: fix computation for MAX_SMB2_HDR_SIZE
2019-02-01 16:53:01 -08:00
Linus Torvalds
b7bd29b530 Bug Fixes
- Fix aa_label_build() error handling for failed merges
 - Fix warning about unused function apparmor_ipv6_postroute
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE7cSDD705q2rFEEf7BS82cBjVw9gFAlxUbasACgkQBS82cBjV
 w9hEyA//XLzkJ6mdreIWLSO0KO8dwCwpWyJ5EFIx3R+2Nrz3BVN2/Ysg+GZ2oEdD
 iF8OEVW/RBjapHbtZeruu8tqfU9i3WYyc/cIpsdZTZ2yIfRikyV/bT21/XggAe4g
 cGhcNVvTpVxwDFHT1YdQFHeDrcJ4nHeGVcErQ0oDV3KJRIUII8uKUmJzxiFqFQjq
 iPZHcaU3b65RrPIaxfEIk6anBW9UwInnXBdOQqjxLCg1efLBfmpT3HpUbfQrTmIq
 zh2xSgTFVS8jRbg8N7HIQDR8GZTew1x6yWx6v84fE+8NfqXEuOwo3+EPGcJWfB4i
 9W+nLlmyrJ5wyFJ4dpWzj7ZMAwuZYKSOSgXIVh2ew9bI13KUthymQ+pWJ6S91Rxf
 Z3zPzjHG3RuyeHuZqiNknM6iXgmfR+PaVk2aq22No40BsXiNjTO8l0kSH36HQKCh
 vpOC1VVXq7B0+Xe6FVQd5R1KydXwO2DrYkO4ygyrYqWXD0F6bFhQyZJlbf2las7B
 DtGjC6+nhh9eQPaYDQ5IIJEhFmNSqE+q+yRqNgcN1vj+gPEayw2KqG67Nt50UjmW
 6Lz3tTZJIw8NRSgUYqRzsJfocCADRamHgysWD+etd3mPnQbFXU5Nx6qwztcpzISP
 fHZY75etD+fkg7pR4FvzqO7EpACtZnk/Yx2BTEjUOnTuUY9hjc0=
 =dMPT
 -----END PGP SIGNATURE-----

Merge tag 'apparmor-pr-2019-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor

Pull apparmor bug fixes from John Johansen:
 "Two bug fixes for apparmor:

   - Fix aa_label_build() error handling for failed merges

   - Fix warning about unused function apparmor_ipv6_postroute"

* tag 'apparmor-pr-2019-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
  apparmor: Fix aa_label_build() error handling for failed merges
  apparmor: Fix warning about unused function apparmor_ipv6_postroute
2019-02-01 16:18:38 -08:00
Ian Kent
f585b283e3 autofs: fix error return in autofs_fill_super()
In autofs_fill_super() on error of get inode/make root dentry the return
should be ENOMEM as this is the only failure case of the called
functions.

Link: http://lkml.kernel.org/r/154725123240.11260.796773942606871359.stgit@pluto-themaw-net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01 15:46:24 -08:00
Pan Bian
63ce5f552b autofs: drop dentry reference only when it is never used
autofs_expire_run() calls dput(dentry) to drop the reference count of
dentry.  However, dentry is read via autofs_dentry_ino(dentry) after
that.  This may result in a use-free-bug.  The patch drops the reference
count of dentry only when it is never used.

Link: http://lkml.kernel.org/r/154725122396.11260.16053424107144453867.stgit@pluto-themaw-net
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01 15:46:24 -08:00
Jan Kara
c27d82f52f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
When superblock has lots of inodes without any pagecache (like is the
case for /proc), drop_pagecache_sb() will iterate through all of them
without dropping sb->s_inode_list_lock which can lead to softlockups
(one of our customers hit this).

Fix the problem by going to the slow path and doing cond_resched() in
case the process needs rescheduling.

Link: http://lkml.kernel.org/r/20190114085343.15011-1-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01 15:46:24 -08:00
David Hildenbrand
e0a352fabc mm: migrate: don't rely on __PageMovable() of newpage after unlocking it
We had a race in the old balloon compaction code before b1123ea6d3
("mm: balloon: use general non-lru movable page feature") refactored it
that became visible after backporting 195a8c43e9 ("virtio-balloon:
deflate via a page list") without the refactoring.

The bug existed from commit d6d86c0a7f ("mm/balloon_compaction:
redesign ballooned pages management") till b1123ea6d3 ("mm: balloon:
use general non-lru movable page feature").  d6d86c0a7f
("mm/balloon_compaction: redesign ballooned pages management") was
backported to 3.12, so the broken kernels are stable kernels [3.12 -
4.7].

There was a subtle race between dropping the page lock of the newpage in
__unmap_and_move() and checking for __is_movable_balloon_page(newpage).

Just after dropping this page lock, virtio-balloon could go ahead and
deflate the newpage, effectively dequeueing it and clearing PageBalloon,
in turn making __is_movable_balloon_page(newpage) fail.

This resulted in dropping the reference of the newpage via
putback_lru_page(newpage) instead of put_page(newpage), leading to
page->lru getting modified and a !LRU page ending up in the LRU lists.
With 195a8c43e9 ("virtio-balloon: deflate via a page list")
backported, one would suddenly get corrupted lists in
release_pages_balloon():

- WARNING: CPU: 13 PID: 6586 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
- list_del corruption. prev->next should be ffffe253961090a0, but was dead000000000100

Nowadays this race is no longer possible, but it is hidden behind very
ugly handling of __ClearPageMovable() and __PageMovable().

__ClearPageMovable() will not make __PageMovable() fail, only
PageMovable().  So the new check (__PageMovable(newpage)) will still
hold even after newpage was dequeued by virtio-balloon.

If anybody would ever change that special handling, the BUG would be
introduced again.  So instead, make it explicit and use the information
of the original isolated page before migration.

This patch can be backported fairly easy to stable kernels (in contrast
to the refactoring).

Link: http://lkml.kernel.org/r/20190129233217.10747-1-david@redhat.com
Fixes: d6d86c0a7f ("mm/balloon_compaction: redesign ballooned pages management")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Vratislav Bendel <vbendel@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Vratislav Bendel <vbendel@redhat.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org>	[3.12 - 4.7]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01 15:46:24 -08:00
Johannes Weiner
7b2489d37e psi: clarify the Kconfig text for the default-disable option
The current help text caused some confusion in online forums about
whether or not to default-enable or default-disable psi in vendor
kernels.  This is because it doesn't communicate the reason for why we
made this setting configurable in the first place: that the overhead is
non-zero in an artificial scheduler stress test.

Since this isn't representative of real workloads, and the effect was
not measurable in scheduler-heavy real world applications such as the
webservers and memcache installations at Facebook, it's fair to point
out that this is a pretty cautious option to select.

Link: http://lkml.kernel.org/r/20190129233617.16767-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01 15:46:24 -08:00
Michal Hocko
e3df4c6e48 mm, memory_hotplug: __offline_pages fix wrong locking
Jan has noticed that we do double unlock on some failure paths when
offlining a page range.  This is indeed the case when
test_pages_in_a_zone respp.  start_isolate_page_range fail.  This was an
omission when forward porting the debugging patch from an older kernel.

Fix the issue by dropping mem_hotplug_done from the failure condition
and keeping the single unlock in the catch all failure path.

Link: http://lkml.kernel.org/r/20190115120307.22768-1-mhocko@kernel.org
Fixes: 7960509329 ("mm, memory_hotplug: print reason for the offlining failure")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Tested-by: Jan Kara <jack@suse.cz>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01 15:46:23 -08:00
Naoya Horiguchi
6376360ecb mm: hwpoison: use do_send_sig_info() instead of force_sig()
Currently memory_failure() is racy against process's exiting, which
results in kernel crash by null pointer dereference.

The root cause is that memory_failure() uses force_sig() to forcibly
kill asynchronous (meaning not in the current context) processes.  As
discussed in thread https://lkml.org/lkml/2010/6/8/236 years ago for OOM
fixes, this is not a right thing to do.  OOM solves this issue by using
do_send_sig_info() as done in commit d2d393099d ("signal:
oom_kill_task: use SEND_SIG_FORCED instead of force_sig()"), so this
patch is suggesting to do the same for hwpoison.  do_send_sig_info()
properly accesses to siglock with lock_task_sighand(), so is free from
the reported race.

I confirmed that the reported bug reproduces with inserting some delay
in kill_procs(), and it never reproduces with this patch.

Note that memory_failure() can send another type of signal using
force_sig_mceerr(), and the reported race shouldn't happen on it because
force_sig_mceerr() is called only for synchronous processes (i.e.
BUS_MCEERR_AR happens only when some process accesses to the corrupted
memory.)

Link: http://lkml.kernel.org/r/20190116093046.GA29835@hori1.linux.bs1.fc.nec.co.jp
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reported-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01 15:46:23 -08:00