Commit Graph

7153 Commits

Author SHA1 Message Date
Naveen N. Rao
370011a270 powerpc/ftrace: Enable HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
This associates entries in the ftrace_ret_stack with corresponding stack
frames, enabling more robust stack unwinding. Also update the only user
of ftrace_graph_ret_addr() to pass the stack pointer.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0224f2d0971b069c678e2ff678cfc2cd1e114cfe.1567707399.git.naveen.n.rao@linux.vnet.ibm.com
2019-09-18 12:24:55 +10:00
Ganesh Goudar
e7ca44ed3b powerpc: dump kernel log before carrying out fadump or kdump
Since commit 4388c9b3a6 ("powerpc: Do not send system reset request
through the oops path"), pstore dmesg file is not updated when dump is
triggered from HMC. This commit modified system reset (sreset) handler
to invoke fadump or kdump (if configured), without pushing dmesg to
pstore. This leaves pstore to have old dmesg data which won't be much
of a help if kdump fails to capture the dump. This patch fixes that by
calling kmsg_dump() before heading to fadump ot kdump.

Fixes: 4388c9b3a6 ("powerpc: Do not send system reset request through the oops path")
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190904075949.15607-1-ganeshgr@linux.ibm.com
2019-09-18 00:03:51 +10:00
Linus Torvalds
d0a16fe934 Merge branch 'parisc-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:

 - Make the powerpc implementation to read elf files available as a
   public kexec interface so it can be re-used on other architectures
   (Sven)

 - Implement kexec on parisc (Sven)

 - Add kprobes on ftrace on parisc (Sven)

 - Fix kernel crash with HSC-PCI cards based on card-mode Dino

 - Add assembly implementations for memset, strlen, strcpy, strncpy and
   strcat

 - Some cleanups, documentation updates, warning fixes, ...

* 'parisc-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (25 commits)
  parisc: Have git ignore generated real2.S and firmware.c
  parisc: Disable HP HSC-PCI Cards to prevent kernel crash
  parisc: add support for kexec_file_load() syscall
  parisc: wire up kexec_file_load syscall
  parisc: add kexec syscall support
  parisc: add __pdc_cpu_rendezvous()
  kprobes/parisc: remove arch_kprobe_on_func_entry()
  kexec_elf: support 32 bit ELF files
  kexec_elf: remove unused variable in kexec_elf_load()
  kexec_elf: remove Elf_Rel macro
  kexec_elf: remove PURGATORY_STACK_SIZE
  kexec_elf: remove parsing of section headers
  kexec_elf: change order of elf_*_to_cpu() functions
  kexec: add KEXEC_ELF
  parisc: Save some bytes in dino driver
  parisc: Drop comments which are already in pci.h
  parisc: Convert eisa_enumerator to use pr_cont()
  parisc: Avoid warning when loading hppb driver
  parisc: speed up flush_tlb_all_local with qemu
  parisc: Add ALTERNATIVE_CODE() and ALT_COND_RUN_ON_QEMU
  ...
2019-09-16 15:38:31 -07:00
Hari Bathini
7dee93a9a8 powerpc/fadump: support holes in kernel boot memory area
With support to copy multiple kernel boot memory regions owing to copy
size limitation, also handle holes in the memory area to be preserved.
Support as many as 128 kernel boot memory regions. This allows having
an adequate FADump capture kernel size for different scenarios.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821385448.5656.6124791213910877759.stgit@hbathini.in.ibm.com
2019-09-14 00:04:46 +10:00
Hari Bathini
becd91d9c5 powerpc/fadump: remove RMA_START and RMA_END macros
RMA_START is defined as '0' and there is even a BUILD_BUG_ON() to
make sure it is never anything else. Remove this macro and use '0'
instead as code change is needed anyway when it has to be something
else. Also, remove unused RMA_END macro.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821384096.5656.15026984053970204652.stgit@hbathini.in.ibm.com
2019-09-14 00:04:46 +10:00
Hari Bathini
7b1b3b4825 powerpc/fadump: consider f/w load area
OPAL loads kernel & initrd at 512MB offset (256MB size), also exported
as ibm,opal/dump/fw-load-area. So, if boot memory size of FADump is
less than 768MB, kernel memory to be exported as '/proc/vmcore' would
be overwritten by f/w while loading kernel & initrd. To avoid such a
scenario, enforce a minimum boot memory size of 768MB on OPAL platform
and skip using FADump if a newer F/W version loads kernel & initrd
above 768MB.

Also, irrespective of RMA size, set the minimum boot memory size
expected on pseries platform at 320MB. This is to avoid inflating the
minimum memory requirements on systems with 512M/1024M RMA size.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821381414.5656.1592867278535469652.stgit@hbathini.in.ibm.com
2019-09-14 00:04:45 +10:00
Hari Bathini
bec53196ad powerpc/fadump: add support to preserve crash data on FADUMP disabled kernel
Add a new kernel config option, CONFIG_PRESERVE_FA_DUMP that ensures
that crash data, from previously crash'ed kernel, is preserved. This
helps in cases where FADump is not enabled but the subsequent memory
preserving kernel boot is likely to process this crash data. One
typical usecase for this config option is petitboot kernel.

As OPAL allows registering address with it in the first kernel and
retrieving it after MPIPL, use it to store the top of boot memory.
A kernel that intends to preserve crash data retrieves it and avoids
using memory beyond this address.

Move arch_reserved_kernel_pages() function as it is needed for both
FA_DUMP and PRESERVE_FA_DUMP configurations.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821375751.5656.11459483669542541602.stgit@hbathini.in.ibm.com
2019-09-14 00:04:45 +10:00
Hari Bathini
b2a815a554 powerpc/fadump: improve how crashed kernel's memory is reserved
The size parameter to fadump_reserve_crash_area() function is not needed
as all the memory above boot memory size must be preserved anyway. Update
the function by dropping this redundant parameter.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821374440.5656.2945512543806951766.stgit@hbathini.in.ibm.com
2019-09-14 00:04:45 +10:00
Hari Bathini
dda9dbfeeb powerpc/fadump: consider reserved ranges while releasing memory
Commit 0962e8004e ("powerpc/prom: Scan reserved-ranges node for
memory reservations") enabled support to parse 'reserved-ranges' DT
node to reserve kernel memory falling in these ranges for firmware
purposes. Along with the preserved area memory, ensure memory in
reserved ranges is not overlapped with memory released by capture
kernel aftering saving vmcore. Also, fix the off-by-one error in
fadump_release_reserved_area function while releasing memory.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821371358.5656.6061214942558818661.stgit@hbathini.in.ibm.com
2019-09-14 00:04:44 +10:00
Hari Bathini
e4fc48fb4d powerpc/fadump: make crash memory ranges array allocation generic
Make allocate_crash_memory_ranges() and free_crash_memory_ranges()
functions generic to reuse them for memory management of all types of
dynamic memory range arrays. This change helps in memory management
of reserved ranges array to be added later.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821369863.5656.4375667005352155892.stgit@hbathini.in.ibm.com
2019-09-14 00:04:44 +10:00
Hari Bathini
579ca1a276 powerpc/fadump: make use of memblock's bottom up allocation mode
Earlier, memblock_find_in_range() was not used to find the memory to
be reserved for FADump as bottom up allocation mode was not supported.
But since commit 79442ed189 ("mm/memblock.c: introduce bottom-up
allocation mode") bottom up allocation mode is supported for memblock.
So, use it to find the memory to be reserved for FADump.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821364211.5656.14336025460336135194.stgit@hbathini.in.ibm.com
2019-09-14 00:04:44 +10:00
Hari Bathini
a4e2e2ca2f powerpc/fadump: handle invalidation of crashdump and re-registraion
Make OPAL call to indicate that the dump is processed and the metadata
area in OPAL can be cleared/released. Also, setup/initialize FADump
for re-registration.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821356046.5656.12270927048195494911.stgit@hbathini.in.ibm.com
2019-09-14 00:04:44 +10:00
Hari Bathini
2790d01d1e powerpc/fadump: reset metadata address during clean up
During kexec boot, metadata address needs to be reset to avoid running
into errors interpreting stale metadata address, in case the kexec'ed
kernel crashes before metadata address could be setup again.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821346629.5656.10783321582005237813.stgit@hbathini.in.ibm.com
2019-09-14 00:04:43 +10:00
Hari Bathini
742a265acc powerpc/fadump: register kernel metadata address with opal
OPAL allows registering address with it in the first kernel and
retrieving it after MPIPL. Setup kernel metadata and register its
address with OPAL to use it for processing the crash dump.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821345011.5656.13567765019032928471.stgit@hbathini.in.ibm.com
2019-09-14 00:04:43 +10:00
Hari Bathini
6abec12c65 powerpc/fadump: improve fadump_reserve_mem()
Some code clean-up like using minimal assignments and updating printk
messages. Also, add an 'error_out' label for handling error cleanup
at one place.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821343485.5656.10202857091553646948.stgit@hbathini.in.ibm.com
2019-09-14 00:04:43 +10:00
Hari Bathini
41df592872 powerpc/fadump: add fadump support on powernv
Add basic callback functions for FADump on PowerNV platform.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821342072.5656.4346362203141486452.stgit@hbathini.in.ibm.com
2019-09-14 00:04:43 +10:00
Hari Bathini
f35120115b pseries/fadump: move out platform specific support from generic code
Move code that supports processing the crash'ed kernel's memory
preserved by firmware to platform specific callback functions.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821337690.5656.13050665924800177744.stgit@hbathini.in.ibm.com
2019-09-14 00:04:42 +10:00
Hari Bathini
8255da95e5 powerpc/fadump: release all the memory above boot memory size
Except for Reserved dump area (see Documentation/powerpc/firmware-
assisted-dump.rst) which is permanent reserved, all memory above boot
memory size, where boot memory size is the memory required for the
kernel to boot successfully when booted with restricted memory (memory
for capture kernel), is released when the dump is invalidated. Make
this a bit more explicit in the code.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821336092.5656.1079046285366041687.stgit@hbathini.in.ibm.com
2019-09-14 00:04:42 +10:00
Hari Bathini
41a65d1618 pseries/fadump: define RTAS register/un-register callback functions
Move platform specific register/un-register code, the RTAS calls, to
register/un-register callback functions. This would also mean moving
code that initializes and prints the platform specific FADump data.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821332856.5656.16380417702046411631.stgit@hbathini.in.ibm.com
2019-09-14 00:04:42 +10:00
Hari Bathini
d3833a7010 powerpc/fadump: introduce callbacks for platform specific operations
Introduce callback functions for platform specific operations like
register, unregister, invalidate & such. Also, define place-holders
for the same on pSeries platform.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821330286.5656.15538934400074110770.stgit@hbathini.in.ibm.com
2019-09-14 00:04:42 +10:00
Hari Bathini
0226e55275 powerpc/fadump: move rtas specific definitions to platform code
Currently, FADump is only supported on pSeries but that is going to
change soon with FADump support being added on PowerNV platform. So,
move rtas specific definitions to platform code to allow FADump
to have multiple platforms support.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821328494.5656.16219929140866195511.stgit@hbathini.in.ibm.com
2019-09-14 00:04:41 +10:00
Hari Bathini
72aa651795 powerpc/fadump: use helper functions to reserve/release cpu notes buffer
Use helper functions to simplify memory allocation, pinning down and
freeing the memory used for CPU notes buffer.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821323555.5656.2486038022572739622.stgit@hbathini.in.ibm.com
2019-09-14 00:04:41 +10:00
Hari Bathini
7f0ad11d3f powerpc/fadump: declare helper functions in internal header file
Declare helper functions, that can be reused by multiple platforms,
in the internal header file.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821320487.5656.2660730464212209984.stgit@hbathini.in.ibm.com
2019-09-14 00:04:41 +10:00
Hari Bathini
961cf26a98 powerpc/fadump: add helper functions
Add helper functions to setup & free CPU notes buffer and to find if a
given memory area is contiguous. Also, use boolean as return type for
the function that finds if boot memory area is contiguous. While at
it, save the virtual address of CPU notes buffer instead of physical
address as virtual address is used often.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821318971.5656.9281936950510635858.stgit@hbathini.in.ibm.com
2019-09-14 00:04:41 +10:00
Hari Bathini
ca986d7fa7 powerpc/fadump: move internal macros/definitions to a new header
Though asm/fadump.h is meant to be used by other components dealing
with FADump, it also has macros/definitions internal to FADump code.
Move them to a new header file used within FADump code. This also
makes way for refactoring platform specific FADump code.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/156821313134.5656.6597770626574392140.stgit@hbathini.in.ibm.com
2019-09-14 00:04:41 +10:00
Masahiro Yamada
1fdfa4c6af powerpc: improve prom_init_check rule
This slightly improves the prom_init_check rule.

[1] Avoid needless check

Currently, prom_init_check.sh is invoked every time you run 'make'
even if you have changed nothing in prom_init.c. With this commit,
the script is re-run only when prom_init.o is recompiled.

[2] Beautify the build log

Currently, the O= build shows the absolute path to the script:

  CALL    /abs/path/to/source/of/linux/arch/powerpc/kernel/prom_init_check.sh

With this commit, it is always a relative path to the timestamp file:

  PROMCHK arch/powerpc/kernel/prom_init_check

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190912074037.13813-1-yamada.masahiro@socionext.com
2019-09-14 00:04:41 +10:00
Michael Ellerman
caff52dc0b powerpc/kvm: Add ifdefs around template code
Some of the templates used for KVM patching are only used on certain
platforms, but currently they are always built-in, fix that.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190911115746.12433-4-mpe@ellerman.id.au
2019-09-14 00:04:40 +10:00
Michael Ellerman
731dade128 powerpc/kvm: Explicitly mark kvm guest code as __init
All the code in kvm.c can be marked __init. Most of it is already
inlined into the initcall, but not all. So instead of relying on the
inlining, mark it all as __init. This saves ~280 bytes of text for my
configuration.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190911115746.12433-3-mpe@ellerman.id.au
2019-09-14 00:04:40 +10:00
Michael Ellerman
0cb0837f9d powerpc/kvm: Move kvm_tmp into .text, shrink to 64K
In some configurations of KVM, guests binary patch themselves to
avoid/reduce trapping into the hypervisor. For some instructions this
requires replacing one instruction with a sequence of instructions.

For those cases we need to write the sequence of instructions
somewhere and then patch the location of the original instruction to
branch to the sequence. That requires that the location of the
sequence be within 32MB of the original instruction.

The current solution for this is that we create a 1MB array in BSS,
write sequences into there, and then free the remainder of the array.

This has a few problems:
 - it confuses kmemleak.
 - it confuses lockdep.
 - it requires mapping kvm_tmp executable, which can cause adjacent
   areas to also be mapped executable if we're using 16M pages for the
   linear mapping.
 - the 32MB limit can be exceeded if the kernel is big enough,
   especially with STRICT_KERNEL_RWX enabled, which then prevents the
   patching from working at all.

We can fix all those problems by making kvm_tmp just a region of
regular .text. However currently it's 1MB in size, and we don't want
to waste 1MB of text. In practice however I only see ~30KB of kvm_tmp
being used even for an allyes_config. So shrink kvm_tmp to 64K, which
ought to be enough for everyone, and move it into .text.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190911115746.12433-1-mpe@ellerman.id.au
2019-09-14 00:04:40 +10:00
Michael Ellerman
1b7f3b6c43 powerpc/eeh: Fix build with STACKTRACE=n
The build breaks when STACKTRACE=n, eg. skiroot_defconfig:

  arch/powerpc/kernel/eeh_event.c:124:23: error: implicit declaration of function 'stack_trace_save'

Fix it with some ifdefs for now.

Fixes: 25baf3d816 ("powerpc/eeh: Defer printing stack trace")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-09-14 00:01:14 +10:00
Ravi Bangoria
bc01bdf6c5 powerpc/watchpoint: Disable watchpoint hit by larx/stcx instructions
If watchpoint exception is generated by larx/stcx instructions, the
reservation created by larx gets lost while handling exception, and
thus stcx instruction always fails. Generally these instructions are
used in a while(1) loop, for example spinlocks. And because stcx
never succeeds, it loops forever and ultimately hangs the system.

Note that ptrace anyway works in one-shot mode and thus for ptrace
we don't change the behaviour. It's up to ptrace user to take care
of this.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190910131513.30499-1-ravi.bangoria@linux.ibm.com
2019-09-12 09:27:00 +10:00
Sven Schnelle
175fca3bf9 kexec: add KEXEC_ELF
Right now powerpc provides an implementation to read elf files
with the kexec_file_load() syscall. Make that available as a public
kexec interface so it can be re-used on other architectures.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-06 23:58:43 +02:00
Linus Torvalds
13da6ac106 powerpc fixes for 5.3 #5
One fix for a boot hang on some Freescale machines when PREEMPT is enabled.
 
 Two CVE fixes for bugs in our handling of FP registers and transactional memory,
 both of which can result in corrupted FP state, or FP state leaking between
 processes.
 
 Thanks to:
   Chris Packham, Christophe Leroy, Gustavo Romero, Michael Neuling.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl1x06oTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgCZzD/90EyaWJVS8WPZopoIdnuOfB/F7EZFY
 Lhgd640S1p4o8BUZaQ1T19JOzp6HlO38myOptBufY0BsIJW0M2GwngnBPzSPW8r7
 ImTTf5cU0CDe2m3OJdfBrVpnGmUsmoWxwrsFJZ9wbsXhCwbbUzOUuxD/B9wBIGi/
 sPpTlaYZBhu3cKs9EWPKAODJhtEf55Q1c62gftfj8Y5u8uxQGinYInCghAUr+3Zv
 uCw1CSxOV7yGxfgc1sbOptidOiG4Pljw4EDCUFLpjWTYgPVERASbPHs3C4xuAHGq
 IYuNDUJbwrxMU9BKLFzvL4MKWa5XtzLE34oY8SuyyVAbIQTszgCn2rIwlJXH88PO
 UtId9accmS+dy2lRI+90dC0qeTgUUIZXS1NF0cl5YNRN0TlMyjHL2/sRxCZF2svF
 EaGNjTQLAsfX0ccO9xQr8+KBSfFURMEkO8QQAR0lzJmIgbvSuzfjlZpbcYd2Nqfe
 EiYU4GeAQSn14vi0ZMdRWxc1rki9pPhGkrUwToDALsiEedRB03olM955uecf7fra
 S8MzHFBYh8Apd/lsAj53uAbL2rIHDJ5+6/eezYp7bRbo6FlvWDs9kmYTX3p3ixq1
 Q4gDHfbwnWxxhjUBri5QNZF9YHgkyGPURGpIbdXk9R4Hc7ihQWwDBcSrueca51Ug
 m97SLF5/+yWx0A==
 =C+wa
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "One fix for a boot hang on some Freescale machines when PREEMPT is
  enabled.

  Two CVE fixes for bugs in our handling of FP registers and
  transactional memory, both of which can result in corrupted FP state,
  or FP state leaking between processes.

  Thanks to: Chris Packham, Christophe Leroy, Gustavo Romero, Michael
  Neuling"

* tag 'powerpc-5.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts
  powerpc/tm: Fix FP/VMX unavailable exceptions inside a transaction
  powerpc/64e: Drop stale call to smp_processor_id() which hangs SMP startup
2019-09-06 08:54:45 -07:00
Oliver O'Halloran
bd6461cc7b powerpc/eeh: Add a eeh_dev_break debugfs interface
Add an interface to debugfs for generating an EEH event on a given device.
This works by disabling memory accesses to and from the device by setting
the PCI_COMMAND register (or the VF Memory Space Enable on the parent PF).

This is a somewhat portable alternative to using the platform specific
error injection mechanisms since those tend to be either hard to use, or
straight up broken. For pseries the interfaces also requires the use of
/dev/mem which is probably going to go away in a post-LOCKDOWN world
(and it's a horrific hack to begin with) so moving to a kernel-provided
interface makes sense and provides a sane, cross-platform interface for
userspace so we can write more generic testing scripts.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190903101605.2890-14-oohall@gmail.com
2019-09-05 14:22:39 +10:00
Oliver O'Halloran
22cda7c168 powerpc/eeh: Add debugfs interface to run an EEH check
Detecting an frozen EEH PE usually occurs when an MMIO load returns a 0xFFs
response. When performing EEH testing using the EEH error injection feature
available on some platforms there is no simple way to kick-off the kernel's
recovery process since any accesses from userspace (usually /dev/mem) will
bypass the MMIO helpers in the kernel which check if a 0xFF response is due
to an EEH freeze or not.

If a device contains a 0xFF byte in it's config space it's possible to
trigger the recovery process via config space read from userspace, but this
is not a reliable method. If a driver is bound to the device an in use it
will frequently trigger the MMIO check, but this is also inconsistent.

To solve these problems this patch adds a debugfs file called
"eeh_dev_check" which accepts a <domain>:<bus>:<dev>.<fn> string and runs
eeh_dev_check_failure() on it. This is the same check that's done when the
kernel gets a 0xFF result from an config or MMIO read with the added
benifit that it can be reliably triggered from userspace.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190903101605.2890-13-oohall@gmail.com
2019-09-05 14:22:39 +10:00
Oliver O'Halloran
aeff27c121 powerpc/eeh: Set attention indicator while recovering
I am the RAS team. Hear me roar.

Roar.

On a more serious note, being able to locate failed devices can be helpful.
Set the attention indicator if the slot supports it once we've determined
the device is present and only clear it if the device is fully recovered.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190903101605.2890-12-oohall@gmail.com
2019-09-05 14:22:39 +10:00
Oliver O'Halloran
25baf3d816 powerpc/eeh: Defer printing stack trace
Currently we print a stack trace in the event handler to help with
debugging EEH issues. In the case of suprise hot-unplug this is unneeded,
so we want to prevent printing the stack trace unless we know it's due to
an actual device error. To accomplish this, we can save a stack trace at
the point of detection and only print it once the EEH recovery handler has
determined the freeze was due to an actual error.

Since the whole point of this is to prevent spurious EEH output we also
move a few prints out of the detection thread, or mark them as pr_debug
so anyone interested can get output from the eeh_check_dev_failure()
if they want.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190903101605.2890-6-oohall@gmail.com
2019-09-05 14:22:38 +10:00
Oliver O'Halloran
b104af5a76 powerpc/eeh: Check slot presence state in eeh_handle_normal_event()
When a device is surprise removed while undergoing IO we will probably
get an EEH PE freeze due to MMIO timeouts and other errors. When a freeze
is detected we send a recovery event to the EEH worker thread which will
notify drivers, and perform recovery as needed.

In the event of a hot-remove we don't want recovery to occur since there
isn't a device to recover. The recovery process is fairly long due to
the number of wait states (required by PCIe) which causes problems when
devices are removed and replaced (e.g. hot swapping of U.2 NVMe drives).

To determine if we need to skip the recovery process we can use the
get_adapter_state() operation of the hotplug_slot to determine if the
slot contains a device or not, and if the slot is empty we can skip
recovery entirely.

One thing to note is that the slot being EEH frozen does not prevent the
hotplug driver from working. We don't have the EEH recovery thread
remove any of the devices since it's assumed that the hotplug driver
will handle tearing down the slot state.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190903101605.2890-5-oohall@gmail.com
2019-09-05 14:22:38 +10:00
Oliver O'Halloran
38ddc01147 powerpc/eeh: Make permanently failed devices non-actionable
If a device is torn down by a hotplug slot driver it's marked as removed
and marked as permaantly failed. There's no point in trying to recover a
permernantly failed device so it should be considered un-actionable.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190903101605.2890-4-oohall@gmail.com
2019-09-05 14:22:38 +10:00
Oliver O'Halloran
5ef753ae43 powerpc/eeh: Fix race when freeing PDNs
When hot-adding devices we rely on the hotplug driver to create pci_dn's
for the devices under the hotplug slot. Converse, when hot-removing the
driver will remove the pci_dn's that it created. This is a problem because
the pci_dev is still live until it's refcount drops to zero. This can
happen if the driver is slow to tear down it's internal state. Ideally, the
driver would not attempt to perform any config accesses to the device once
it's been marked as removed, but sometimes it happens. As a result, we
might attempt to access the pci_dn for a device that has been torn down and
the kernel may crash as a result.

To fix this, don't free the pci_dn unless the corresponding pci_dev has
been released.  If the pci_dev is still live, then we mark the pci_dn with
a flag that indicates the pci_dev's release function should free it.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190903101605.2890-3-oohall@gmail.com
2019-09-05 14:22:37 +10:00
Oliver O'Halloran
799abe283e powerpc/eeh: Clean up EEH PEs after recovery finishes
When the last device in an eeh_pe is removed the eeh_pe structure itself
(and any empty parents) are freed since they are no longer needed. This
results in a crash when a hotplug driver is involved since the following
may occur:

1. Device is suprise removed.
2. Driver performs an MMIO, which fails and queues and eeh_event.
3. Hotplug driver receives a hotplug interrupt and removes any
   pci_devs that were under the slot.
4. pci_dev is torn down and the eeh_pe is freed.
5. The EEH event handler thread processes the eeh_event and crashes
   since the eeh_pe pointer in the eeh_event structure is no
   longer valid.

Crashing is generally considered poor form. Instead of doing that use
the fact PEs are marked as EEH_PE_INVALID to keep them around until the
end of the recovery cycle, at which point we can safely prune any empty
PEs.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190903101605.2890-2-oohall@gmail.com
2019-09-05 14:22:37 +10:00
Masahiro Yamada
858805b336 kbuild: add $(BASH) to run scripts with bash-extension
CONFIG_SHELL falls back to sh when bash is not installed on the system,
but nobody is testing such a case since bash is usually installed.
So, shell scripts invoked by CONFIG_SHELL are only tested with bash.

It makes it difficult to test whether the hashbang #!/bin/sh is real.
For example, #!/bin/sh in arch/powerpc/kernel/prom_init_check.sh is
false. (I fixed it up)

Besides, some shell scripts invoked by CONFIG_SHELL use bash-extension
and #!/bin/bash is specified as the hashbang, while CONFIG_SHELL may
not always be set to bash.

Probably, the right thing to do is to introduce BASH, which is bash by
default, and always set CONFIG_SHELL to sh. Replace $(CONFIG_SHELL)
with $(BASH) for bash scripts.

If somebody tries to add bash-extension to a #!/bin/sh script, it will
be caught in testing because /bin/sh is a symlink to dash on some major
distributions.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-09-04 22:54:13 +09:00
Gustavo Romero
a8318c13e7 powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts
When in userspace and MSR FP=0 the hardware FP state is unrelated to
the current process. This is extended for transactions where if tbegin
is run with FP=0, the hardware checkpoint FP state will also be
unrelated to the current process. Due to this, we need to ensure this
hardware checkpoint is updated with the correct state before we enable
FP for this process.

Unfortunately we get this wrong when returning to a process from a
hardware interrupt. A process that starts a transaction with FP=0 can
take an interrupt. When the kernel returns back to that process, we
change to FP=1 but with hardware checkpoint FP state not updated. If
this transaction is then rolled back, the FP registers now contain the
wrong state.

The process looks like this:
   Userspace:                      Kernel

               Start userspace
                with MSR FP=0 TM=1
                  < -----
   ...
   tbegin
   bne
               Hardware interrupt
                   ---- >
                                    <do_IRQ...>
                                    ....
                                    ret_from_except
                                      restore_math()
				        /* sees FP=0 */
                                        restore_fp()
                                          tm_active_with_fp()
					    /* sees FP=1 (Incorrect) */
                                          load_fp_state()
                                        FP = 0 -> 1
                  < -----
               Return to userspace
                 with MSR TM=1 FP=1
                 with junk in the FP TM checkpoint
   TM rollback
   reads FP junk

When returning from the hardware exception, tm_active_with_fp() is
incorrectly making restore_fp() call load_fp_state() which is setting
FP=1.

The fix is to remove tm_active_with_fp().

tm_active_with_fp() is attempting to handle the case where FP state
has been changed inside a transaction. In this case the checkpointed
and transactional FP state is different and hence we must restore the
FP state (ie. we can't do lazy FP restore inside a transaction that's
used FP). It's safe to remove tm_active_with_fp() as this case is
handled by restore_tm_state(). restore_tm_state() detects if FP has
been using inside a transaction and will set load_fp and call
restore_math() to ensure the FP state (checkpoint and transaction) is
restored.

This is a data integrity problem for the current process as the FP
registers are corrupted. It's also a security problem as the FP
registers from one process may be leaked to another.

Similarly for VMX.

A simple testcase to replicate this will be posted to
tools/testing/selftests/powerpc/tm/tm-poison.c

This fixes CVE-2019-15031.

Fixes: a7771176b4 ("powerpc: Don't enable FP/Altivec if not checkpointed")
Cc: stable@vger.kernel.org # 4.15+
Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190904045529.23002-2-gromero@linux.vnet.ibm.com
2019-09-04 22:31:13 +10:00
Gustavo Romero
8205d5d98e powerpc/tm: Fix FP/VMX unavailable exceptions inside a transaction
When we take an FP unavailable exception in a transaction we have to
account for the hardware FP TM checkpointed registers being
incorrect. In this case for this process we know the current and
checkpointed FP registers must be the same (since FP wasn't used
inside the transaction) hence in the thread_struct we copy the current
FP registers to the checkpointed ones.

This copy is done in tm_reclaim_thread(). We use thread->ckpt_regs.msr
to determine if FP was on when in userspace. thread->ckpt_regs.msr
represents the state of the MSR when exiting userspace. This is setup
by check_if_tm_restore_required().

Unfortunatley there is an optimisation in giveup_all() which returns
early if tsk->thread.regs->msr (via local variable `usermsr`) has
FP=VEC=VSX=SPE=0. This optimisation means that
check_if_tm_restore_required() is not called and hence
thread->ckpt_regs.msr is not updated and will contain an old value.

This can happen if due to load_fp=255 we start a userspace process
with MSR FP=1 and then we are context switched out. In this case
thread->ckpt_regs.msr will contain FP=1. If that same process is then
context switched in and load_fp overflows, MSR will have FP=0. If that
process now enters a transaction and does an FP instruction, the FP
unavailable will not update thread->ckpt_regs.msr (the bug) and MSR
FP=1 will be retained in thread->ckpt_regs.msr.  tm_reclaim_thread()
will then not perform the required memcpy and the checkpointed FP regs
in the thread struct will contain the wrong values.

The code path for this happening is:

       Userspace:                      Kernel
                   Start userspace
                    with MSR FP/VEC/VSX/SPE=0 TM=1
                      < -----
       ...
       tbegin
       bne
       fp instruction
                   FP unavailable
                       ---- >
                                        fp_unavailable_tm()
					  tm_reclaim_current()
					    tm_reclaim_thread()
					      giveup_all()
					        return early since FP/VMX/VSX=0
						/* ckpt MSR not updated (Incorrect) */
					      tm_reclaim()
					        /* thread_struct ckpt FP regs contain junk (OK) */
                                              /* Sees ckpt MSR FP=1 (Incorrect) */
					      no memcpy() performed
					        /* thread_struct ckpt FP regs not fixed (Incorrect) */
					  tm_recheckpoint()
					     /* Put junk in hardware checkpoint FP regs */
                                         ....
                      < -----
                   Return to userspace
                     with MSR TM=1 FP=1
                     with junk in the FP TM checkpoint
       TM rollback
       reads FP junk

This is a data integrity problem for the current process as the FP
registers are corrupted. It's also a security problem as the FP
registers from one process may be leaked to another.

This patch moves up check_if_tm_restore_required() in giveup_all() to
ensure thread->ckpt_regs.msr is updated correctly.

A simple testcase to replicate this will be posted to
tools/testing/selftests/powerpc/tm/tm-poison.c

Similarly for VMX.

This fixes CVE-2019-15030.

Fixes: f48e91e87e ("powerpc/tm: Fix FP and VMX register corruption")
Cc: stable@vger.kernel.org # 4.12+
Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190904045529.23002-1-gromero@linux.vnet.ibm.com
2019-09-04 22:31:13 +10:00
Christoph Hellwig
f9f3232a7d dma-mapping: explicitly wire up ->mmap and ->get_sgtable
While the default ->mmap and ->get_sgtable implementations work for the
majority of our dma_map_ops impementations they are inherently safe
for others that don't use the page allocator or CMA and/or use their
own way of remapping not covered by the common code.  So remove the
defaults if these methods are not wired up, but instead wire up the
default implementations for all safe instances.

Fixes: e1c7e32453 ("dma-mapping: always provide the dma_map_ops based implementation")
Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-09-04 11:13:18 +02:00
Nicholas Piggin
9b123d1ea2 powerpc/64s/exception: reduce page fault unnecessary loads
This avoids 3 loads in the radix page fault case, 1 load in the
hash fault case, and 2 loads in the hash miss page fault case.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-37-npiggin@gmail.com
2019-08-30 11:14:59 +10:00
Nicholas Piggin
05f97d94dd powerpc/64s/exception: Remove pointless KVM handler name bifurcation
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-36-npiggin@gmail.com
2019-08-30 11:14:59 +10:00
Nicholas Piggin
1b3599829a powerpc/64s/exception: program check handler do not branch into a macro
It is clever, but the small code saving is not worth the spaghetti of
jumping to a label in an expanded macro, particularly when the label
is just a number rather than a descriptive name.

So expand the INT_COMMON macro twice, once for the stack and no stack
cases, and branch to those. The slight code size increase is worth
the improved clarity of branches for this non-performance critical
code.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-35-npiggin@gmail.com
2019-08-30 11:14:58 +10:00
Nicholas Piggin
c7c5cbb42d powerpc/64s/exception: move interrupt entry code above the common handler
This better reflects the order in which the code is executed.

No generated code change except BUG line number constants.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-34-npiggin@gmail.com
2019-08-30 11:14:58 +10:00
Nicholas Piggin
d1a8471888 powerpc/64s/exception: INT_COMMON add DAR, DSISR, reconcile options
Move DAR and DSISR saving to pt_regs into INT_COMMON. Also add an
option to expand RECONCILE_IRQ_STATE.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-33-npiggin@gmail.com
2019-08-30 11:14:58 +10:00
Nicholas Piggin
8c9fb5d4f3 powerpc/64s/exception: Expand EXCEPTION_PROLOG_COMMON_1 and 2 into caller
No generated code change except BUG line number constants.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-32-npiggin@gmail.com
2019-08-30 11:14:58 +10:00
Nicholas Piggin
5d5e0edfd5 powerpc/64s/exception: Expand EXCEPTION_COMMON macro into caller
No generated code change except BUG line number constants.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-31-npiggin@gmail.com
2019-08-30 11:14:58 +10:00
Nicholas Piggin
bcbceed40a powerpc/64s/exception: Add INT_COMMON gas macro to generate common exception code
No generated code change except BUG line number constants.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-30-npiggin@gmail.com
2019-08-30 11:14:57 +10:00
Nicholas Piggin
9a9c739aa8 powerpc/64s/exception: Merge EXCEPTION_PROLOG_COMMON_2/3
Merge EXCEPTION_PROLOG_COMMON_3 into EXCEPTION_PROLOG_COMMON_2.

No generated code change except BUG line number constants.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-29-npiggin@gmail.com
2019-08-30 11:14:57 +10:00
Nicholas Piggin
7027d53d1a powerpc/64s/exception: KVM_HANDLER reorder arguments to match other macros
Also change argument name (n -> vec) to match others.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-28-npiggin@gmail.com
2019-08-30 11:14:57 +10:00
Nicholas Piggin
141fed2669 powerpc/64s/exception: Add INT_KVM_HANDLER gas macro
Replace the 4 variants of cpp macros with one gas macro.

No generated code change except BUG line number constants.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-27-npiggin@gmail.com
2019-08-30 11:14:57 +10:00
Nicholas Piggin
4515c5fa41 powerpc/64s/exception: INT_HANDLER support HDAR/HDSISR and use it in HDSI
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-26-npiggin@gmail.com
2019-08-30 11:14:57 +10:00
Nicholas Piggin
52b989231c powerpc/64s/exception: Add the virt variant of the denorm interrupt handler
All other virt handlers have the prolog code in the virt vector rather
than branch to the real vector. Follow this pattern in the denorm virt
handler.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-25-npiggin@gmail.com
2019-08-30 11:14:56 +10:00
Nicholas Piggin
d29768e13c powerpc/64s/exception: remove EXCEPTION_PROLOG_0/1, rename _2
EXCEPTION_PROLOG_0 and _1 have only a single caller, so expand them
into it.

Rename EXCEPTION_PROLOG_2_REAL to INT_SAVE_SRR_AND_JUMP and
EXCEPTION_PROLOG_2_VIRT to INT_VIRT_SAVE_SRR_AND_JUMP, which are
more descriptive.

No generated code change except BUG line number constants.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-24-npiggin@gmail.com
2019-08-30 11:14:56 +10:00
Michael Ellerman
9b40f62b8a powerpc/64s/exceptions: Use keyword params to shorten arg lists
The argument lists for the INT_HANDLER macro are getting a bit
unwieldy. Use keyword parameters with default values to shorten them.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190830011426.16810-1-mpe@ellerman.id.au
2019-08-30 11:14:51 +10:00
Nicholas Piggin
7299417c82 powerpc/64s/exception: Replace PROLOG macros and EXC helpers with a gas macro
This creates a single macro that generates the exception prolog code,
with variants specified by arguments, rather than assorted nested
macros for different variants.

The increasing length of macro argument list is not nice to read or
modify, but this is a temporary condition that will be improved in
later changes.

No generated code change except BUG line number constants and label
names.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-23-npiggin@gmail.com
2019-08-30 10:32:37 +10:00
Nicholas Piggin
5ff79a5ea6 powerpc/64s/exception: remove 0xb00 handler
This vector is not used by any supported processor, and has been
implemented as an unknown exception going back to 2.6. There is
nothing special about 0xb00, so remove it like other unused
vectors.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-22-npiggin@gmail.com
2019-08-30 10:32:37 +10:00
Nicholas Piggin
9a7a0773d7 powerpc/64s/exception: Fix performance monitor virt handler
The perf virt handler uses EXCEPTION_PROLOG_2_REAL rather than _VIRT.
In practice this is okay because the _REAL variant is usable by virt
mode interrupts, but should be fixed (and is a performance win).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-21-npiggin@gmail.com
2019-08-30 10:32:36 +10:00
Nicholas Piggin
def0db4f9d powerpc/64s/exception: Add EXC_HV_OR_STD, which selects HSRR if HVMODE
Add EXC_HV_OR_STD and use it to consolidate the 0x500 external
interrupt.

Executed code is unchanged.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-20-npiggin@gmail.com
2019-08-30 10:32:36 +10:00
Nicholas Piggin
a243281195 powerpc/64s/exception: move head-64.h exception code to exception-64s.S
The head-64.h code should deal only with the head code sections
and offset calculations.

No generated code change except BUG line number constants.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-19-npiggin@gmail.com
2019-08-30 10:32:36 +10:00
Nicholas Piggin
c31f7134dc powerpc/64s/exception: Fix DAR load for handle_page_fault error case
This buglet goes back to before the 64/32 arch merge, but it does not
seem to have had practical consequences because bad_page_fault does
not use the 2nd argument, but rather regs->dar/nip.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-18-npiggin@gmail.com
2019-08-30 10:32:36 +10:00
Nicholas Piggin
b3fe35261e powerpc/64s/exception: machine check improve labels and comments
Short forward and backward branches can be given number labels,
but larger significant divergences in code path a more readable
if they're given descriptive names.

Also adjusts a comment to account for guest delivery.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-17-npiggin@gmail.com
2019-08-30 10:32:36 +10:00
Nicholas Piggin
fce16d4822 powerpc/64s/exception: untangle early machine check handler branch
machine_check_early_common now branches to machine_check_handle_early
which is its only caller.

Move interleaving code out of the way, and remove the branch.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-16-npiggin@gmail.com
2019-08-30 10:32:36 +10:00
Nicholas Piggin
b7d9ccec30 powerpc/64s/exception: machine check move unrecoverable handling out of line
Similarly to the previous change, all callers of the unrecoverable
handler run relocated so can reach it with a direct branch. This makes
it easy to move out of line, which makes the "normal" path less
cluttered and easier to follow.

MSR[ME] manipulation still requires the rfi, so that is moved out of
line to its own function.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-15-npiggin@gmail.com
2019-08-30 10:32:36 +10:00
Nicholas Piggin
296e753fb4 powerpc/64s/exception: simplify machine check early path
machine_check_handle_early_common can reach machine_check_handle_early
directly now that it runs at the relocated address, so just branch
directly.

The rfi sequence is required to enable MSR[ME] but that step is moved
into a helper function, making the code easier to follow.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-14-npiggin@gmail.com
2019-08-30 10:32:35 +10:00
Nicholas Piggin
abd1f4ca2b powerpc/64s/exception: machine check move tramp code
Following convention, move the tramp code (unrelocated) above the
common handlers (relocated).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-13-npiggin@gmail.com
2019-08-30 10:32:35 +10:00
Nicholas Piggin
c8eb54dbc8 powerpc/64s/exception: machine check restructure to reuse common macros
Follow the pattern of sreset and HMI handlers more closely: use
EXCEPTION_PROLOG_COMMON_1 rather than open-coding it, and run the
handler at the relocated location.

This helps later simplification and code sharing.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-12-npiggin@gmail.com
2019-08-30 10:32:35 +10:00
Nicholas Piggin
272f636445 powerpc/64s/exception: machine check pseries should skip the late handler for kernel MCEs
The powernv machine check handler copes with taking a MCE from one of
three contexts, guest, kernel, and user. In each case the early
handler runs first on a special stack, then:

- The guest case branches to the KVM interrupt handler (via standard
  interrupt macros).
- The user case will run the "late" handler which is like a normal
  interrupt that runs in virtual mode and uses the regular kernel
  stack.
- The kernel case queues the event and schedules it for processing
  with irq work.

The last case is important, it must not enable virtual memory because
the MMU state may not be set up to deal with that (e.g., SLB might be
clear), it must not use the regular kernel stack for similar reasons
(e.g., might be in OPAL with OPAL stack in r1), and the kernel does
not expect anything to touch its stack if interrupts are disabled.

The pseries handler does not do this queueing, but instead it always
runs the late handler for host MCEs, which has some of the same
problems.

Now that pseries is using machine_check_events, change it to do the
same as powernv and queue events for kernel MCEs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-11-npiggin@gmail.com
2019-08-30 10:32:35 +10:00
Nicholas Piggin
9ca766f989 powerpc/64s/pseries: machine check convert to use common event code
The common machine_check_event data structures and queues are mostly
platform independent, with powernv decoding SRR1/DSISR/etc., into
machine_check_event objects.

This patch converts pseries to use this infrastructure by decoding
fwnmi/rtas data into machine_check_event objects.

This allows queueing to be used by a subsequent change to delay the
virtual mode handling of machine checks that occur in kernel space
where it is unsafe to switch immediately to virtual mode, similarly
to powernv.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fix implicit fallthrough warnings in mce_handle_error()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-10-npiggin@gmail.com
2019-08-30 10:32:35 +10:00
Nicholas Piggin
7290f3b3d3 powerpc/64s/powernv: machine check dump SLB contents
Re-use the code introduced in pseries to save and dump the contents
of the SLB in the case of an SLB involved machine check exception.

This patch also avoids allocating the SLB save array on pseries radix.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-9-npiggin@gmail.com
2019-08-30 10:32:35 +10:00
Nicholas Piggin
0b66370c61 powerpc/64s/exception: machine check use correct cfar for late handler
Bare metal machine checks run an "early" handler in real mode before
running the main handler which reports the event.

The main handler runs exactly as a normal interrupt handler, after the
"windup" which sets registers back as they were at interrupt entry.
CFAR does not get restored by the windup code, so that will be wrong
when the handler is run.

Restore the CFAR to the saved value before running the late handler.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-8-npiggin@gmail.com
2019-08-30 10:32:34 +10:00
Nicholas Piggin
fa2760eca5 powerpc/64s/exception: machine check remove machine_check_pSeries_0 branch
This label has only one caller, so unwind the branch and move it
inline. The location of the comment is adjusted to match similar
one in system reset.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-7-npiggin@gmail.com
2019-08-30 10:32:34 +10:00
Nicholas Piggin
b5c27f7c56 powerpc/64s/exception: machine check pseries should always run the early handler
Now that pseries with fwnmi registered runs the early machine check
handler, there is no good reason to special case the non-fwnmi case
and skip the early handler. Reducing the code and number of paths is
a top priority for asm code, it's better to handle this in C where
possible (and the pseries early handler is a no-op if fwnmi is not
registered).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-6-npiggin@gmail.com
2019-08-30 10:32:34 +10:00
Nicholas Piggin
fe9d482b1d powerpc/64s/exception: machine check adjust RFI target
The host kernel delivery case for powernv does RFI_TO_USER_OR_KERNEL,
but should just use RFI_TO_KERNEL which makes it clear this is not a
user case.

This is not a bug because RFI_TO_USER_OR_KERNEL deals with kernel
returns just fine.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-5-npiggin@gmail.com
2019-08-30 10:32:34 +10:00
Nicholas Piggin
19dbe673e6 powerpc/64s/exception: machine check fix KVM guest test
The machine_check_handle_early hypervisor guest test is skipped if
!HVMODE or MSR[HV]=0, which is wrong for PR or nested hypervisors
that could be running a guest in this state.

Test HSTATE_IN_GUEST up front and use that to branch out to the KVM
handler, then MSR[PR] alone can test for this kernel's userspace.
This matches all other interrupt handling.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-4-npiggin@gmail.com
2019-08-30 10:32:34 +10:00
Nicholas Piggin
1039f62431 powerpc/64s/exception: machine check remove bitrotted comment
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-3-npiggin@gmail.com
2019-08-30 10:32:34 +10:00
Nicholas Piggin
0be9f7fd5d powerpc/64s/exception: machine check fwnmi remove HV case
fwnmi does not trigger in HV mode, so remove always-true feature test.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-2-npiggin@gmail.com
2019-08-30 10:32:34 +10:00
Ryan Grimm
734560ac39 powerpc/pseries/svm: Export guest SVM status to user space via sysfs
User space might want to know it's running in a secure VM.  It can't do
a mfmsr because mfmsr is a privileged instruction.

The solution here is to create a cpu attribute:

/sys/devices/system/cpu/svm

which will read 0 or 1 based on the S bit of the current CPU.

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190820021326.6884-12-bauerman@linux.ibm.com
2019-08-30 09:55:41 +10:00
Ram Pai
256ba2c168 powerpc/pseries/svm: Unshare all pages before kexecing a new kernel
A new kernel deserves a clean slate. Any pages shared with the hypervisor
is unshared before invoking the new kernel. However there are exceptions.
If the new kernel is invoked to dump the current kernel, or if there is a
explicit request to preserve the state of the current kernel, unsharing
of pages is skipped.

NOTE: While testing crashkernel, make sure at least 256M is reserved for
crashkernel. Otherwise SWIOTLB allocation will fail and crash kernel will
fail to boot.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190820021326.6884-11-bauerman@linux.ibm.com
2019-08-30 09:55:40 +10:00
Anshuman Khandual
bd104e6db6 powerpc/pseries/svm: Use shared memory for LPPACA structures
LPPACA structures need to be shared with the host. Hence they need to be in
shared memory. Instead of allocating individual chunks of memory for a
given structure from memblock, a contiguous chunk of memory is allocated
and then converted into shared memory. Subsequent allocation requests will
come from the contiguous chunk which will be always shared memory for all
structures.

While we are able to use a kmem_cache constructor for the Debug Trace Log,
LPPACAs are allocated very early in the boot process (before SLUB is
available) so we need to use a simpler scheme here.

Introduce helper is_svm_platform() which uses the S bit of the MSR to tell
whether we're running as a secure guest.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190820021326.6884-9-bauerman@linux.ibm.com
2019-08-30 09:55:40 +10:00
Thiago Jung Bauermann
e311a92da1 powerpc/pseries: Add and use LPPACA_SIZE constant
Helps document what the hard-coded number means.

Also take the opportunity to fix an #endif comment.

Suggested-by: Alexey Kardashevskiy <aik@linux.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190820021326.6884-8-bauerman@linux.ibm.com
2019-08-30 09:55:40 +10:00
Ram Pai
6a9c930bd7 powerpc/prom_init: Add the ESM call to prom_init
Make the Enter-Secure-Mode (ESM) ultravisor call to switch the VM to secure
mode. Pass kernel base address and FDT address so that the Ultravisor is
able to verify the integrity of the VM using information from the ESM blob.

Add "svm=" command line option to turn on switching to secure mode.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
[ andmike: Generate an RTAS os-term hcall when the ESM ucall fails. ]
Signed-off-by: Michael Anderson <andmike@linux.ibm.com>
[ bauerman: Cleaned up the code a bit. ]
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190820021326.6884-5-bauerman@linux.ibm.com
2019-08-30 09:54:35 +10:00
Thiago Jung Bauermann
136bc0397a powerpc/pseries: Introduce option to build secure virtual machines
Introduce CONFIG_PPC_SVM to control support for secure guests and include
Ultravisor-related helpers when it is selected

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190820021326.6884-3-bauerman@linux.ibm.com
2019-08-30 09:53:28 +10:00
Michael Ellerman
9044adca78 Merge branch 'topic/ppc-kvm' into next
Merge our ppc-kvm topic branch to bring in the Ultravisor support
patches.
2019-08-30 09:52:57 +10:00
Sukadev Bhattiprolu
6c85b7bc63 powerpc/kvm: Use UV_RETURN ucall to return to ultravisor
When an SVM makes an hypercall or incurs some other exception, the
Ultravisor usually forwards (a.k.a. reflects) the exceptions to the
Hypervisor. After processing the exception, Hypervisor uses the
UV_RETURN ultracall to return control back to the SVM.

The expected register state on entry to this ultracall is:

* Non-volatile registers are restored to their original values.
* If returning from an hypercall, register R0 contains the return value
  (unlike other ultracalls) and, registers R4 through R12 contain any
  output values of the hypercall.
* R3 contains the ultracall number, i.e UV_RETURN.
* If returning with a synthesized interrupt, R2 contains the
  synthesized interrupt number.

Thanks to input from Paul Mackerras, Ram Pai and Mike Anderson.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190822034838.27876-8-cclaudio@linux.ibm.com
2019-08-30 09:40:16 +10:00
Claudio Carvalho
bb04ffe85e powerpc/powernv: Introduce FW_FEATURE_ULTRAVISOR
In PEF enabled systems, some of the resources which were previously
hypervisor privileged are now ultravisor privileged and controlled by
the ultravisor firmware.

This adds FW_FEATURE_ULTRAVISOR to indicate if PEF is enabled.

The host kernel can use FW_FEATURE_ULTRAVISOR, for instance, to skip
accessing resources (e.g. PTCR and LDBAR) in case PEF is enabled.

Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
[ andmike: Device node name to "ibm,ultravisor" ]
Signed-off-by: Michael Anderson <andmike@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190822034838.27876-4-cclaudio@linux.ibm.com
2019-08-30 09:40:15 +10:00
Claudio Carvalho
a49dddbdb0 powerpc/kernel: Add ucall_norets() ultravisor call handler
The ultracalls (ucalls for short) allow the Secure Virtual Machines
(SVM)s and hypervisor to request services from the ultravisor such as
accessing a register or memory region that can only be accessed when
running in ultravisor-privileged mode.

This patch adds the ucall_norets() ultravisor call handler.

The specific service needed from an ucall is specified in register
R3 (the first parameter to the ucall). Other parameters to the
ucall, if any, are specified in registers R4 through R12.

Return value of all ucalls is in register R3. Other output values
from the ucall, if any, are returned in registers R4 through R12.

Each ucall returns specific error codes, applicable in the context
of the ucall. However, like with the PowerPC Architecture Platform
Reference (PAPR), if no specific error code is defined for a particular
situation, then the ucall will fallback to an erroneous
parameter-position based code. i.e U_PARAMETER, U_P2, U_P3 etc depending
on the ucall parameter that may have caused the error.

Every host kernel (powernv) needs to be able to do ucalls in case it
ends up being run in a machine with ultravisor enabled. Otherwise, the
kernel may crash early in boot trying to access ultravisor resources,
for instance, trying to set the partition table entry 0. Secure guests
also need to be able to do ucalls and its kernel may not have
CONFIG_PPC_POWERNV=y. For that reason, the ucall.S file is placed under
arch/powerpc/kernel.

If ultravisor is not enabled, the ucalls will be redirected to the
hypervisor which must handle/fail the call.

Thanks to inputs from Ram Pai and Michael Anderson.

Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190822034838.27876-3-cclaudio@linux.ibm.com
2019-08-30 09:40:15 +10:00
Claudio Carvalho
70ed86f4de powerpc: Add PowerPC Capabilities ELF note
Add the PowerPC name and the PPC_ELFNOTE_CAPABILITIES type in the
kernel binary ELF note. This type is a bitmap that can be used to
advertise kernel capabilities to userland.

This patch also defines PPCCAP_ULTRAVISOR_BIT as being the bit zero.

Suggested-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
[ maxiwell: Define the 'PowerPC' type in the elfnote.h ]
Signed-off-by: Maxiwell S. Garcia <maxiwell@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190829155021.2915-2-maxiwell@linux.ibm.com
2019-08-30 09:40:15 +10:00
Alexey Kardashevskiy
a102f139aa powerpc/powernv/ioda: Remove obsolete iommu_table_ops::exchange callbacks
As now we have xchg_no_kill/tce_kill, these are not used anymore so
remove them.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190829085252.72370-6-aik@ozlabs.ru
2019-08-30 09:40:15 +10:00
Alexey Kardashevskiy
35872480da powerpc/powernv/ioda: Split out TCE invalidation from TCE updates
At the moment updates in a TCE table are made by iommu_table_ops::exchange
which update one TCE and invalidates an entry in the PHB/NPU TCE cache
via set of registers called "TCE Kill" (hence the naming).
Writing a TCE is a simple xchg() but invalidating the TCE cache is
a relatively expensive OPAL call. Mapping a 100GB guest with PCI+NPU
passed through devices takes about 20s.

Thankfully we can do better. Since such big mappings happen at the boot
time and when memory is plugged/onlined (i.e. not often), these requests
come in 512 pages so we call call OPAL 512 times less which brings 20s
from the above to less than 10s. Also, since TCE caches can be flushed
entirely, calling OPAL for 512 TCEs helps skiboot [1] to decide whether
to flush the entire cache or not.

This implements 2 new iommu_table_ops callbacks:
- xchg_no_kill() to update a single TCE with no TCE invalidation;
- tce_kill() to invalidate multiple TCEs.
This uses the same xchg_no_kill() callback for IODA1/2.

This implements 2 new wrappers on top of the new callbacks similar to
the existing iommu_tce_xchg().

This does not use the new callbacks yet, the next patches will;
so this should not cause any behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190829085252.72370-2-aik@ozlabs.ru
2019-08-30 09:40:14 +10:00
Alexey Kardashevskiy
bc605cd79e powerpc/of/pci: Rewrite pci_parse_of_flags
The existing code uses bunch of hardcoded values from the PCI Bus
Binding to IEEE Std 1275 spec; and it does so in quite non-obvious
way.

This defines fields from the cell#0 of the "reg" property of a PCI
device and uses them for parsing.

This should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[mpe: Unsplit some 80/81 char lines, space the code with some newlines]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190829084417.71873-1-aik@ozlabs.ru
2019-08-29 20:24:05 +10:00
Nicholas Piggin
555e28179d powerpc/64: remove support for kernel-mode syscalls
There is support for the kernel to execute the 'sc 0' instruction and
make a system call to itself. This is a relic that is unused in the
tree, therefore untested. It's also highly questionable for modules to
be doing this.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190827033010.28090-3-npiggin@gmail.com
2019-08-28 23:19:34 +10:00
Nicholas Piggin
facd04a904 powerpc: convert to copy_thread_tls
Commit 3033f14ab7 ("clone: support passing tls argument via C rather
than pt_regs magic") introduced the HAVE_COPY_THREAD_TLS option. Use it
to avoid a subtle assumption about the argument ordering of clone type
syscalls.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190827033010.28090-2-npiggin@gmail.com
2019-08-28 23:19:34 +10:00
Christophe Leroy
c7bf1252d5 powerpc/32: don't use CPU_FTR_COHERENT_ICACHE
Only 601 and E200 have CPU_FTR_COHERENT_ICACHE.

Just use #ifdefs instead of feature fixup.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5f3e92ccd64d06477b27626f6007a9da3b8da157.1566834712.git.christophe.leroy@c-s.fr
2019-08-28 23:19:34 +10:00
Christophe Leroy
e0291f1dec powerpc/32: drop CPU_FTR_UNIFIED_ID_CACHE
Only 601 and e200 have unified I/D cache.

Drop the feature and use CONFIG_PPC_BOOK3S_601 and CONFIG_E200.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b5902144266d2f4eed1ffea53915bd0245841e02.1566834712.git.christophe.leroy@c-s.fr
2019-08-28 23:19:33 +10:00
Christophe Leroy
39097b9c6d powerpc/32s: use CONFIG_PPC_BOOK3S_601 instead of reading PVR
Use CONFIG_PPC_BOOK3S_601 instead of reading PVR to know if
it is a 601 or not.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/909c26db9facd7fe454695b303f952e019dd9eda.1566834712.git.christophe.leroy@c-s.fr
2019-08-28 23:19:33 +10:00
Christophe Leroy
88fb309409 powerpc/32s: drop CPU_FTR_USE_RTC feature
CPU_FTR_USE_RTC feature only applies to powerpc601.

Drop this feature and replace it with tests on CONFIG_PPC_BOOK3S_601.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/170411e2360861f4a95c21faad43519a08bc4040.1566834712.git.christophe.leroy@c-s.fr
2019-08-28 23:19:33 +10:00
Christophe Leroy
12c3f1fd87 powerpc/32s: get rid of CPU_FTR_601 feature
Now that 601 is exclusive from other 6xx, CPU_FTR_601 and
associated fixups are useless.

Drop this feature and use #ifdefs instead.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ecdb7194a17dbfa01865df6a82979533adc2c70b.1566834712.git.christophe.leroy@c-s.fr
2019-08-28 23:19:33 +10:00
Christophe Leroy
3bbd234373 powerpc/8xx: set STACK_END_MAGIC earlier on the init_stack
Today, the STACK_END_MAGIC is set on init_stack in start_kernel().

To avoid a false 'Thread overran stack, or stack corrupted' message
on early Oopses, setup STACK_END_MAGIC as soon as possible.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/54f67bb7ac486c1350f2fa8905cd279f94b9dfb1.1566382841.git.christophe.leroy@c-s.fr
2019-08-28 11:31:18 +10:00
Christophe Leroy
a045657412 powerpc/8xx: drop unused self-modifying code alternative to FixupDAR.
The code which fixups the DAR on TLB errors for dbcX instructions
has a self-modifying code alternative that has never been used.

Drop it.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b095e12c82fcba1ac4c09fc3b85d969f36614746.1566417610.git.christophe.leroy@c-s.fr
2019-08-28 11:31:18 +10:00
Christophe Leroy
63ce271b5e powerpc/prom: convert PROM_BUG() to standard trap
Prior to commit 1bd98d7fbaf5 ("ppc64: Update BUG handling based on
ppc32"), BUG() family was using BUG_ILLEGAL_INSTRUCTION which
was an invalid instruction opcode to trap into program check
exception.

That commit converted them to using standard trap instructions,
but prom/prom_init and their PROM_BUG() macro were left over.
head_64.S and exception-64s.S were left aside as well.

Convert them to using the standard BUG infrastructure.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/cdaf4bbbb64c288a077845846f04b12683f8875a.1566817807.git.christophe.leroy@c-s.fr
2019-08-28 11:31:18 +10:00
Christophe Leroy
d7fb5b18a5 powerpc/64: optimise LOAD_REG_IMMEDIATE_SYM()
Optimise LOAD_REG_IMMEDIATE_SYM() using a temporary register to
parallelise operations.

It reduces the path from 5 to 3 instructions.

Suggested-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bad41ed02531bb0382420cbab50a0d7153b71767.1566311636.git.christophe.leroy@c-s.fr
2019-08-27 13:03:36 +10:00
Christophe Leroy
ba18025fb0 powerpc/32: replace LOAD_MSR_KERNEL() by LOAD_REG_IMMEDIATE()
LOAD_MSR_KERNEL() and LOAD_REG_IMMEDIATE() are doing the same thing
in the same way. Drop LOAD_MSR_KERNEL()

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8f04a6df0bc8949517fd8236d50c15008ccf9231.1566311636.git.christophe.leroy@c-s.fr
2019-08-27 13:03:36 +10:00
Christophe Leroy
c691b4b83b powerpc: rewrite LOAD_REG_IMMEDIATE() as an intelligent macro
Today LOAD_REG_IMMEDIATE() is a basic #define which loads all
parts on a value into a register, including the parts that are NUL.

This means always 2 instructions on PPC32 and always 5 instructions
on PPC64. And those instructions cannot run in parallele as they are
updating the same register.

Ex: LOAD_REG_IMMEDIATE(r1,THREAD_SIZE) in head_64.S results in:

3c 20 00 00     lis     r1,0
60 21 00 00     ori     r1,r1,0
78 21 07 c6     rldicr  r1,r1,32,31
64 21 00 00     oris    r1,r1,0
60 21 40 00     ori     r1,r1,16384

Rewrite LOAD_REG_IMMEDIATE() with GAS macro in order to skip
the parts that are NUL.

Rename existing LOAD_REG_IMMEDIATE() as LOAD_REG_IMMEDIATE_SYM()
and use that one for loading value of symbols which are not known
at compile time.

Now LOAD_REG_IMMEDIATE(r1,THREAD_SIZE) in head_64.S results in:

38 20 40 00     li      r1,16384

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d60ce8dd3a383c7adbfc322bf1d53d81724a6000.1566311636.git.christophe.leroy@c-s.fr
2019-08-27 13:03:36 +10:00
Christophe Leroy
14b4d97669 powerpc/mm: rework io-workaround invocation.
ppc_md.ioremap() is only used for I/O workaround on CELL platform,
so indirect function call can be avoided.

This patch reworks the io-workaround and ioremap() functions to
use the global 'io_workaround_inited' flag for the activation
of io-workaround.

When CONFIG_PPC_IO_WORKAROUNDS or CONFIG_PPC_INDIRECT_MMIO are not
selected, the I/O workaround ioremap() voids and the global flag is
not used.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5fa3ef069fbd0f152512afaae19e7a60161454cf.1566309262.git.christophe.leroy@c-s.fr
2019-08-27 13:03:34 +10:00
Christopher M. Riedl
d8f0e0b073 powerpc/64s: support nospectre_v2 cmdline option
Add support for disabling the kernel implemented spectre v2 mitigation
(count cache flush on context switch) via the nospectre_v2 and
mitigations=off cmdline options.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Christopher M. Riedl <cmr@informatik.wtf>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190524024647.381-1-cmr@informatik.wtf
2019-08-27 13:03:32 +10:00
Christoph Hellwig
cdfee56232 driver core: initialize a default DMA mask for platform device
We still treat devices without a DMA mask as defaulting to 32-bits for
both mask, but a few releases ago we've started warning about such
cases, as they require special cases to work around this sloppyness.
Add a dma_mask field to struct platform_device so that we can initialize
the dma_mask pointer in struct device and initialize both masks to
32-bits by default, replacing similar functionality in m68k and
powerpc.  The arch_setup_pdev_archdata hooks is now unused and removed.

Note that the code looks a little odd with the various conditionals
because we have to support platform_device structures that are
statically allocated.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20190816062435.881-7-hch@lst.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-22 09:41:55 -07:00
Sam Bobroff
27d4396ed5 powerpc/eeh: Slightly simplify eeh_add_to_parent_pe()
Simplify some needlessly complicated boolean logic in
eeh_add_to_parent_pe().

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/09259a50308f10aa764695912bc87dc1d1cf654c.1565930772.git.sbobroff@linux.ibm.com
2019-08-22 23:12:47 +10:00
Sam Bobroff
cef50c67c1 powerpc/eeh: Remove unused return path from eeh_pe_dev_traverse()
There are no users of the early-out return value from
eeh_pe_dev_traverse(), so remove it.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c648070f5b28fe8ca1880b48e64b267959ffd369.1565930772.git.sbobroff@linux.ibm.com
2019-08-22 23:12:47 +10:00
Sam Bobroff
2e25505147 powerpc/eeh: Fix crash when edev->pdev changes
If a PCI device is removed during eeh_pe_report_edev(), between the
calls to device_lock() and device_unlock(), edev->pdev will change and
cause a crash as the wrong mutex is released.

To correct this, hold the PCI rescan/remove lock while taking a copy
of edev->pdev and performing a get_device() on it.  Use this value to
release the mutex, but also pass it through to the device driver's EEH
handlers so that they always see the same device.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3c590579a0faa24d20c826dcd26c739eb4d454e6.1565930772.git.sbobroff@linux.ibm.com
2019-08-22 23:12:47 +10:00
Sam Bobroff
1ff8f36fc7 powerpc/eeh: Convert log messages to eeh_edev_* macros
Convert existing messages, where appropriate, to use the eeh_edev_*
logging macros.

The only effect should be minor adjustments to the log messages, apart
from:

- A new message in pseries_eeh_probe() "Probing device" to match the
powernv case.
- The "Probing device" message in pnv_eeh_probe() is now generated
slightly later, which will mean that it is no longer emitted for
devices that aren't probed due to the initial checks.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ce505a0a7a4a5b0367f0f40f8b26e7c0a9cf4cb7.1565930772.git.sbobroff@linux.ibm.com
2019-08-22 23:12:47 +10:00
Sam Bobroff
b093f2cbed powerpc/eeh: Introduce EEH edev logging macros
Now that struct eeh_dev includes the BDFN of it's PCI device, make use
of it to replace eeh_edev_info() with a set of dev_dbg()-style macros
that only need a struct edev.

With the BDFN available without the struct pci_dev, eeh_pci_name() is
now unnecessary, so remove it.

While only the "info" level function is used here, the others will be
used in followup work.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f90ae9a53d762be7b0ccbad79e62b5a1b4f4996e.1565930772.git.sbobroff@linux.ibm.com
2019-08-22 23:12:46 +10:00
Oliver O'Halloran
7c33a994d3 powerpc/eeh: Add bdfn field to eeh_dev
Preparation for removing pci_dn from the powernv EEH code. The only
thing we really use pci_dn for is to get the bdfn of the device for
config space accesses, so adding that information to eeh_dev reduces
the need to carry around the pci_dn.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[SB: Re-wrapped commit message, fixed whitespace damage.]
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e458eb69a1f591d8a120782f23a8506b15d3c654.1565930772.git.sbobroff@linux.ibm.com
2019-08-22 23:12:46 +10:00
Sam Bobroff
c44e4ccada powerpc/eeh: Refactor around eeh_probe_devices()
Now that EEH support for all devices (on PowerNV and pSeries) is
provided by the pcibios bus add device hooks, eeh_probe_devices() and
eeh_addr_cache_build() are redundant and can be removed.

Move the EEH enabled message into it's own function so that it can be
called from multiple places.

Note that previously on pSeries, useless EEH sysfs files were created
for some devices that did not have EEH support and this change
prevents them from being created.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/33b0a6339d5ac88693de092d6fba984f2a5add66.1565930772.git.sbobroff@linux.ibm.com
2019-08-22 23:12:46 +10:00
Sam Bobroff
b905f8cdca powerpc/eeh: EEH for pSeries hot plug
On PowerNV and pSeries, devices currently acquire EEH support from
several different places: Boot-time devices from eeh_probe_devices()
and eeh_addr_cache_build(), Virtual Function devices from the pcibios
bus add device hooks and hot plugged devices from pci_hp_add_devices()
(with other platforms using other methods as well).  Unfortunately,
pSeries machines currently discover hot plugged devices using
pci_rescan_bus(), not pci_hp_add_devices(), and so those devices do
not receive EEH support.

Rather than adding another case for pci_rescan_bus(), this change
widens the scope of the pcibios bus add device hooks so that they can
handle all devices. As a side effect this also supports devices
discovered after manually rescanning via /sys/bus/pci/rescan.

Note that on PowerNV, this change allows the EEH subsystem to become
enabled after boot as long as it has not been forced off, which was
not previously possible (it was already possible on pSeries).

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/72ae8ae9c54097158894a52de23690448de38ea9.1565930772.git.sbobroff@linux.ibm.com
2019-08-22 23:12:40 +10:00
Sam Bobroff
685a0bc00a powerpc/eeh: Initialize EEH address cache earlier
The EEH address cache is currently initialized and populated by a
single function: eeh_addr_cache_build().  While the initial population
of the cache can only be done once resources are allocated,
initialization (just setting up a spinlock) could be done much
earlier.

So move the initialization step into a separate function and call it
from a core_initcall (rather than a subsys initcall).

This will allow future work to make use of the cache during boot time
PCI scanning.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0557206741bffee76cdfff042f65321f6f7a5b41.1565930772.git.sbobroff@linux.ibm.com
2019-08-22 23:11:48 +10:00
Sam Bobroff
617082a481 powerpc/eeh: Improve debug messages around device addition
Also remove useless comment.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/59db84f4bf94718a12f206bc923ac797d47e4cc1.1565930772.git.sbobroff@linux.ibm.com
2019-08-22 23:11:48 +10:00
Sam Bobroff
aa06e3d60e powerpc/eeh: Clear stale EEH_DEV_NO_HANDLER flag
The EEH_DEV_NO_HANDLER flag is used by the EEH system to prevent the
use of driver callbacks in drivers that have been bound part way
through the recovery process. This is necessary to prevent later stage
handlers from being called when the earlier stage handlers haven't,
which can be confusing for drivers.

However, the flag is set for all devices that are added after boot
time and only cleared at the end of the EEH recovery process. This
results in hot plugged devices erroneously having the flag set during
the first recovery after they are added (causing their driver's
handlers to be incorrectly ignored).

To remedy this, clear the flag at the beginning of recovery
processing. The flag is still cleared at the end of recovery
processing, although it is no longer really necessary.

Also clear the flag during eeh_handle_special_event(), for the same
reasons.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b8ca5629d27de74c957d4f4b250177d1b6fc4bbd.1565930772.git.sbobroff@linux.ibm.com
2019-08-22 23:11:48 +10:00
Sam Bobroff
3f068aae7a powerpc/64: Adjust order in pcibios_init()
The pcibios_init() function for PowerPC 64 currently calls
pci_bus_add_devices() before pcibios_resource_survey(). This means
that at boot time, when the pcibios_bus_add_device() hooks are called
by pci_bus_add_devices(), device resources have not been allocated and
they are unable to perform EEH setup, so a separate pass is needed.

This patch adjusts that order so that it will become possible to
consolidate the EEH setup work into a single location.

The only functional change is to execute pcibios_resource_survey()
(excepting ppc_md.pcibios_fixup(), see below) before
pci_bus_add_devices() instead of after it.

Because pcibios_scan_phb() and pci_bus_add_devices() are called
together in a loop, this must be broken into one loop for each call.
Then the call to pcibios_resource_survey() is moved up in between
them. This changes the ordering but because pcibios_resource_survey()
also calls ppc_md.pcibios_fixup(), that call is extracted out into
pcibios_init() to where pcibios_resource_survey() was, so that it is
not moved.

The only other caller of pcibios_resource_survey() is the PowerPC 32
version of pcibios_init(), and therefore, that is modified to call
ppc_md.pcibios_fixup() right after pcibios_resource_survey() so that
there is no functional change there at all.

The re-arrangement will cause very few side-effects because at this
stage in the boot, pci_bus_add_devices() does very little:
- pci_create_sysfs_dev_files() does nothing (no sysfs yet)
- pci_proc_attach_device() does nothing (no proc yet)
- device_attach() does nothing (no drivers yet)
This leaves only the pci_final_fixup calls, D3 support, and marking
the device as added. Of those, only the pci_final_fixup calls have the
potential to be affected by resource allocation.

The only pci_final_fixup handlers that touch resources seem to be one
for x86 (pci_amd_enable_64bit_bar()), and a PowerPC 32 platform driver
(quirk_final_uli1575()), neither of which use this pcibios_init()
function. Even if they did, it would almost certainly be a bug, under
the current ordering, to rely on or make changes to resources before
they were allocated.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4506b0489eabd0921a3587d90bd44c7683f3472d.1565930772.git.sbobroff@linux.ibm.com
2019-08-22 23:11:48 +10:00
Balbir Singh
895e3dceeb powerpc/mce: Handle UE event for memcpy_mcsafe
If we take a UE on one of the instructions with a fixup entry, set nip
to continue execution at the fixup entry. Stop processing the event
further or print it.

Co-developed-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190820081352.8641-6-santosh@fossix.org
2019-08-21 22:23:48 +10:00
Reza Arbab
1a1715f516 powerpc/mce: Make machine_check_ue_event() static
The function doesn't get used outside this file, so make it static.

Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190820081352.8641-4-santosh@fossix.org
2019-08-21 22:23:47 +10:00
Balbir Singh
99ead78afd powerpc/mce: Fix MCE handling for huge pages
The current code would fail on huge pages addresses, since the shift would
be incorrect. Use the correct page shift value returned by
__find_linux_pte() to get the correct physical address. The code is more
generic and can handle both regular and compound pages.

Fixes: ba41e1e1cc ("powerpc/mce: Hookup derror (load/store) UE errors")
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[arbab@linux.ibm.com: Fixup pseries_do_memory_failure()]
Signed-off-by: Reza Arbab <arbab@linux.ibm.com>
Tested-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190820081352.8641-3-santosh@fossix.org
2019-08-21 22:23:47 +10:00
Santosh Sivaraj
b5bda6263c powerpc/mce: Schedule work from irq_work
schedule_work() cannot be called from MCE exception context as MCE can
interrupt even in interrupt disabled context.

Fixes: 733e4a4c44 ("powerpc/mce: hookup memory_failure for UE errors")
Cc: stable@vger.kernel.org # v4.15+
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190820081352.8641-2-santosh@fossix.org
2019-08-21 22:23:03 +10:00
Nathan Lynch
10e4850d7c powerpc/rtas: allow rescheduling while changing cpu states
rtas_cpu_state_change_mask() potentially operates on scores of cpus,
so explicitly allow rescheduling in the loop body.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802192926.19277-3-nathanl@linux.ibm.com
2019-08-20 21:22:27 +10:00
Nathan Lynch
a6717c01dd powerpc/rtas: use device model APIs and serialization during LPM
The LPAR migration implementation and userspace-initiated cpu hotplug
can interleave their executions like so:

1. Set cpu 7 offline via sysfs.

2. Begin a partition migration, whose implementation requires the OS
   to ensure all present cpus are online; cpu 7 is onlined:

     rtas_ibm_suspend_me -> rtas_online_cpus_mask -> cpu_up

   This sets cpu 7 online in all respects except for the cpu's
   corresponding struct device; dev->offline remains true.

3. Set cpu 7 online via sysfs. _cpu_up() determines that cpu 7 is
   already online and returns success. The driver core (device_online)
   sets dev->offline = false.

4. The migration completes and restores cpu 7 to offline state:

     rtas_ibm_suspend_me -> rtas_offline_cpus_mask -> cpu_down

This leaves cpu7 in a state where the driver core considers the cpu
device online, but in all other respects it is offline and
unused. Attempts to online the cpu via sysfs appear to succeed but the
driver core actually does not pass the request to the lower-level
cpuhp support code. This makes the cpu unusable until the cpu device
is manually set offline and then online again via sysfs.

Instead of directly calling cpu_up/cpu_down, the migration code should
use the higher-level device core APIs to maintain consistent state and
serialize operations.

Fixes: 120496ac2d ("powerpc: Bring all threads online prior to migration/hibernation")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802192926.19277-2-nathanl@linux.ibm.com
2019-08-20 21:22:27 +10:00
Christophe Leroy
415480dce2 powerpc/603: Fix handling of the DIRTY flag
If a page is already mapped RW without the DIRTY flag, the DIRTY
flag is never set and a TLB store miss exception is taken forever.

This is easily reproduced with the following app:

void main(void)
{
	volatile char *ptr = mmap(0, 4096, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	*ptr = *ptr;
}

When DIRTY flag is not set, bail out of TLB miss handler and take
a minor page fault which will set the DIRTY flag.

Fixes: f8b58c64ea ("powerpc/603: let's handle PAGE_DIRTY directly")
Cc: stable@vger.kernel.org # v5.1+
Reported-by: Doug Crawford <doug.crawford@intelight-its.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/80432f71194d7ee75b2f5043ecf1501cf1cca1f3.1566196646.git.christophe.leroy@c-s.fr
2019-08-20 21:22:21 +10:00
Christophe Leroy
7ab0b7cb89 powerpc/32: Add warning on misaligned copy_page() or clear_page()
copy_page() and clear_page() expect page aligned destination, and
use dcbz instruction to clear entire cache lines based on the
assumption that the destination is cache aligned.

As shown during analysis of a bug in BTRFS filesystem, a misaligned
copy_page() can create bugs that are difficult to locate (see Link).

Add an explicit WARNING when copy_page() or clear_page() are called
with misaligned destination.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=204371
Link: https://lore.kernel.org/r/c6cea38f90480268d439ca44a645647e260fff09.1565941808.git.christophe.leroy@c-s.fr
2019-08-20 21:22:15 +10:00
Christophe Leroy
9d6d712fbf powerpc/32s: Fix boot failure with DEBUG_PAGEALLOC without KASAN.
When KASAN is selected, the definitive hash table has to be
set up later, but there is already an early temporary one.

When KASAN is not selected, there is no early hash table,
so the setup of the definitive hash table cannot be delayed.

Fixes: 72f208c6a8 ("powerpc/32s: move hash code patching out of MMU_init_hw()")
Cc: stable@vger.kernel.org # v5.2+
Reported-by: Jonathan Neuschafer <j.neuschaefer@gmx.net>
Tested-by: Jonathan Neuschafer <j.neuschaefer@gmx.net>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b7860c5e1e784d6b96ba67edf47dd6cbc2e78ab6.1565776892.git.christophe.leroy@c-s.fr
2019-08-20 21:22:13 +10:00
Christophe Leroy
658d029df0 powerpc/hw_breakpoint: move instruction stepping out of hw_breakpoint_handler()
On 8xx, breakpoints stop after executing the instruction, so
stepping/emulation is not needed. Move it into a sub-function and
remove the #ifdefs.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f8cdc3f1c66ad3c43ebc568abcc6c39ed4676284.1561737231.git.christophe.leroy@c-s.fr
2019-08-20 21:22:10 +10:00
Alexey Kardashevskiy
201ed7f327 powerpc/powernv/ioda2: Create bigger default window with 64k IOMMU pages
At the moment we create a small window only for 32bit devices, the window
maps 0..2GB of the PCI space only. For other devices we either use
a sketchy bypass or hardware bypass but the former can only work if
the amount of RAM is no bigger than the device's DMA mask and the latter
requires devices to support at least 59bit DMA.

This extends the default DMA window to the maximum size possible to allow
a wider DMA mask than just 32bit. The default window size is now limited
by the the iommu_table::it_map allocation bitmap which is a contiguous
array, 1 bit per an IOMMU page.

This increases the default IOMMU page size from hard coded 4K to
the system page size to allow wider DMA masks.

This increases the level number to not exceed the max order allocation
limit per TCE level. By the same time, this keeps minimal levels number
as 2 in order to save memory.

As the extended window now overlaps the 32bit MMIO region, this adds
an area reservation to iommu_init_table().

After this change the default window size is 0x80000000000==1<<43 so
devices limited to DMA mask smaller than the amount of system RAM can
still use more than just 2GB of memory for DMA.

This is an optimization and not a bug fix for DMA API usage.

With the on-demand allocation of indirect TCE table levels enabled and
2 levels, the first TCE level size is just
1<<ceil((log2(0x7ffffffffff+1)-16)/2)=16384 TCEs or 2 system pages.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190718051139.74787-5-aik@ozlabs.ru
2019-08-19 13:20:23 +10:00
Alexey Kardashevskiy
4f7e0babbc powerpc/iommu: Allow bypass-only for DMA
POWER8 and newer support a bypass mode which maps all host memory to
PCI buses so an IOMMU table is not always required. However if we fail to
create such a table, the DMA setup fails and the kernel does not boot.

This skips the 32bit DMA setup check if the bypass is selected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190718051139.74787-3-aik@ozlabs.ru
2019-08-19 13:20:23 +10:00
Michael Ellerman
4215fa2d7d Merge branch 'fixes' into next
Merge in our fixes branch, which brings in clone3() as well as some
implicit fallthrough fixes we want in next.
2019-08-19 12:43:13 +10:00
Nicholas Piggin
2b87a2553a powerpc/64s: Make boot look nice(r)
Radix boot looks like this:

 -----------------------------------------------------
 phys_mem_size     = 0x200000000
 dcache_bsize      = 0x80
 icache_bsize      = 0x80
 cpu_features      = 0x0000c06f8f5fb1a7
   possible        = 0x0000fbffcf5fb1a7
   always          = 0x00000003800081a1
 cpu_user_features = 0xdc0065c2 0xaee00000
 mmu_features      = 0xbc006041
 firmware_features = 0x0000000010000000
 hash-mmu: ppc64_pft_size    = 0x0
 hash-mmu: kernel vmalloc start   = 0xc008000000000000
 hash-mmu: kernel IO start        = 0xc00a000000000000
 hash-mmu: kernel vmemmap start   = 0xc00c000000000000
 -----------------------------------------------------

Fix:

 -----------------------------------------------------
 phys_mem_size     = 0x200000000
 dcache_bsize      = 0x80
 icache_bsize      = 0x80
 cpu_features      = 0x0000c06f8f5fb1a7
   possible        = 0x0000fbffcf5fb1a7
   always          = 0x00000003800081a1
 cpu_user_features = 0xdc0065c2 0xaee00000
 mmu_features      = 0xbc006041
 firmware_features = 0x0000000010000000
 vmalloc start     = 0xc008000000000000
 IO start          = 0xc00a000000000000
 vmemmap start     = 0xc00c000000000000
 -----------------------------------------------------

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190516020437.11783-1-npiggin@gmail.com
2019-08-15 22:52:55 +10:00
Christoph Hellwig
33dcb37cef dma-mapping: fix page attributes for dma_mmap_*
All the way back to introducing dma_common_mmap we've defaulted to mark
the pages as uncached.  But this is wrong for DMA coherent devices.
Later on DMA_ATTR_WRITE_COMBINE also got incorrect treatment as that
flag is only treated special on the alloc side for non-coherent devices.

Introduce a new dma_pgprot helper that deals with the check for coherent
devices so that only the remapping cases ever reach arch_dma_mmap_pgprot
and we thus ensure no aliasing of page attributes happens, which makes
the powerpc version of arch_dma_mmap_pgprot obsolete and simplifies the
remaining ones.

Note that this means arch_dma_mmap_pgprot is a bit misnamed now, but
we'll phase it out soon.

Fixes: 64ccc9c033 ("common: dma-mapping: add support for generic dma_mmap_* calls")
Reported-by: Shawn Anastasio <shawn@anastas.io>
Reported-by: Gavin Li <git@thegavinli.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> # arm64
2019-08-10 19:52:45 +02:00
Nathan Lynch
ae2e953fdc powerpc/rtas: Unexport rtas_online_cpus_mask, rtas_offline_cpus_mask
These aren't used by modular code, nor should they be.

Fixes: 120496ac2d ("powerpc: Bring all threads online prior to migration/hibernation")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190718162214.5694-1-nathanl@linux.ibm.com
2019-08-05 18:53:03 +10:00
Michael Ellerman
7db57e7758 powerpc/spe: Mark expected switch fall-throughs
Mark switch cases where we are expecting to fall through.

Fixes errors such as below, seen with mpc85xx_defconfig:

  arch/powerpc/kernel/align.c: In function 'emulate_spe':
  arch/powerpc/kernel/align.c:178:8: error: this statement may fall through
    ret |= __get_user_inatomic(temp.v[3], p++);
        ^~

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190730141917.21817-1-mpe@ellerman.id.au
2019-07-31 00:19:34 +10:00
Michael Ellerman
cee3536d24 powerpc: Wire up clone3 syscall
Wire up the new clone3 syscall added in commit 7f192e3cd3 ("fork:
add clone3").

This requires a ppc_clone3 wrapper, in order to save the non-volatile
GPRs before calling into the generic syscall code. Otherwise we hit
the BUG_ON in CHECK_FULL_REGS in copy_thread().

Lightly tested using Christian's test code on a Power8 LE VM.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Christian Brauner <christian@brauner.io>
Link: https://lore.kernel.org/r/20190724140259.23554-1-mpe@ellerman.id.au
2019-07-29 09:34:27 +10:00
Linus Torvalds
3ea54d9b0d This is mostly a set of follow-on fixes from Mauro fixing various fallout
from the massive RST conversion; a few other small fixes as well.
 -----BEGIN PGP SIGNATURE-----
 
 iQFCBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl07C9gPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YQrgH+L7kPaRWcH+lorh7WxiIhlAvqFsR4xjRKTZV
 qZwlv3+87qjPu86DBdjRlos8PsoIkdD4+jH0u+7swYtZ5hsS8DYcfZ42i+nHqP7D
 k/XwVPW71ikpoi4vQVsO9zrwaOKPyL8AoNzlj+VpiVGAu/PYApGUfdkdOD0xxlOW
 hst6GKNAKrsoQ8fW3AIuCVi4oEEYru2FHMKLcpHBFDZtR7zgC4dEgs7UqEttAzHU
 559nkH2LKekya4pQaFMdSb6Wgj+JeIkI2XzujXES6QEdmf8vrvjkkvHeByoOYjNY
 NNdKNGhPUWrEUMfyzRQpEIcF+Dk3lYtBEDQO8+XCh6mpa9h7kQ==
 =BUU+
 -----END PGP SIGNATURE-----

Merge tag 'docs-5.3-1' of git://git.lwn.net/linux

Pull documentation fixes from Jonathan Corbet:
 "This is mostly a set of follow-on fixes from Mauro fixing various
  fallout from the massive RST conversion; a few other small fixes as
  well"

* tag 'docs-5.3-1' of git://git.lwn.net/linux: (21 commits)
  docs: phy: Drop duplicate 'be made'
  doc:it_IT: translations in process/
  docs/vm: transhuge: fix typo in madvise reference
  doc:it_IT: rephrase statement
  doc:it_IT: align translation to mainline
  docs: load_config.py: ensure subdirs end with "/"
  docs: virtual: add it to the documentation body
  docs: remove extra conf.py files
  docs: load_config.py: avoid needing a conf.py just due to LaTeX docs
  scripts/sphinx-pre-install: seek for Noto CJK fonts for pdf output
  scripts/sphinx-pre-install: cleanup Gentoo checks
  scripts/sphinx-pre-install: fix latexmk dependencies
  scripts/sphinx-pre-install: don't use LaTeX with CentOS 7
  scripts/sphinx-pre-install: fix script for RHEL/CentOS
  docs: conf.py: only use CJK if the font is available
  docs: conf.py: add CJK package needed by translations
  docs: pdf: add all Documentation/*/index.rst to PDF output
  docs: fix broken doc references due to renames
  docs: power: add it to to the main documentation index
  docs: powerpc: convert docs to ReST and rename to *.rst
  ...
2019-07-26 11:29:24 -07:00
Linus Torvalds
bed38c3e2d powerpc fixes for 5.3 #2
An assortment of non-regression fixes that have accumulated since the start of
 the merge window.
 
 A fix for a user triggerable oops on machines where transactional memory is
 disabled, eg. Power9 bare metal, Power8 with TM disabled on the command line, or
 all Power7 or earlier machines.
 
 Three fixes for handling of PMU and power saving registers when running nested
 KVM on Power9.
 
 Two fixes for bugs found while stress testing the XIVE interrupt controller
 code, also on Power9.
 
 A fix to allow guests to boot under Qemu/KVM on Power9 using the the Hash MMU
 with >= 1TB of memory.
 
 Two fixes for bugs in the recent DMA cleanup, one of which could lead to
 checkstops.
 
 And finally three fixes for the PAPR SCM nvdimm driver.
 
 Thanks to:
   Alexey Kardashevskiy, Andrea Arcangeli, Cédric Le Goater, Christoph Hellwig,
   David Gibson, Gautham R. Shenoy, Michael Neuling, Oliver O'Halloran,, Satheesh
   Rajendran, Shawn Anastasio, Suraj Jitindar Singh, Vaibhav Jain.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdOF2iAAoJEFHr6jzI4aWAVmsQAJ//UY1a+lz39y/5jmkybJbH
 HVnja6ZhsKd3+ZAnljGmqr1zuwDmy8+X3pT+1832zBBm4Z1cNKj1c0wuK5fuhAfq
 o0XkO0N9GFcQu8HUPb5wSBOoyXwK0qUhExfCVobl7YsDAyAI2//nSQTwxNX3W4Hv
 P7hz48pBbiqRzQAOSHV8ZlcOBETbSVAXeNalSXrXqSJmXQbVWCQcd6vucMSwZ7S5
 ZiiL/gCBoO0kd0ZQRsGXCbwcjcR4NlTDN0M40og8Y9KTDkId8HdmJyXW3tMcZo/g
 W3LeMR94bUh/KrK88lMBrRXKUlxL+loZKWZaeNlA5+ShCYk/ZafkKri/QUX/glOq
 ahm8uqokdZ5VS1tgSYoJIKdA5qMGvv8V+CpHRJnZqaEhUCduQa5XmWPnDnEKkDt0
 94VBsk0D2vHYKyygv5JMgYHQVlU7XrQF8fw2pKShpqLMY7ZMpeDDmKN9AuzxhawF
 9b7HigbwNt5LvNJ0xn097KW+svCK7i3ZgiQe83W36wjSl2ystgjJ3T7yrH6Q1rKH
 o4loEGA4gASTDjTmWQM20lHT1xQHY4fQBC/wi/67as3m0TDeGXYI0fOZC5qtEBFr
 Ln/0e78VMhut/RWicDlRveszef1MCi1warR9R4I/bQ8M6O1BzHYsQ9zr6H111uxL
 vQ92Yp8G2PoqN7wlFlSG
 =Yc9T
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "An assortment of non-regression fixes that have accumulated since the
  start of the merge window.

   - A fix for a user triggerable oops on machines where transactional
     memory is disabled, eg. Power9 bare metal, Power8 with TM disabled
     on the command line, or all Power7 or earlier machines.

   - Three fixes for handling of PMU and power saving registers when
     running nested KVM on Power9.

   - Two fixes for bugs found while stress testing the XIVE interrupt
     controller code, also on Power9.

   - A fix to allow guests to boot under Qemu/KVM on Power9 using the
     the Hash MMU with >= 1TB of memory.

   - Two fixes for bugs in the recent DMA cleanup, one of which could
     lead to checkstops.

   - And finally three fixes for the PAPR SCM nvdimm driver.

  Thanks to: Alexey Kardashevskiy, Andrea Arcangeli, Cédric Le Goater,
  Christoph Hellwig, David Gibson, Gautham R. Shenoy, Michael Neuling,
  Oliver O'Halloran, Satheesh Rajendran, Shawn Anastasio, Suraj Jitindar
  Singh, Vaibhav Jain"

* tag 'powerpc-5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/papr_scm: Force a scm-unbind if initial scm-bind fails
  powerpc/papr_scm: Update drc_pmem_unbind() to use H_SCM_UNBIND_ALL
  powerpc/pseries: Update SCM hcall op-codes in hvcall.h
  powerpc/tm: Fix oops on sigreturn on systems without TM
  powerpc/dma: Fix invalid DMA mmap behavior
  KVM: PPC: Book3S HV: XIVE: fix rollback when kvmppc_xive_create fails
  powerpc/xive: Fix loop exit-condition in xive_find_target_in_mask()
  powerpc: fix off by one in max_zone_pfn initialization for ZONE_DMA
  KVM: PPC: Book3S HV: Save and restore guest visible PSSCR bits on pseries
  powerpc/pmu: Set pmcregs_in_use in paca when running as LPAR
  KVM: PPC: Book3S HV: Always save guest pmu for guest capable of nesting
  powerpc/mm: Limit rma_size to 1TB when running without HV mode
2019-07-24 09:58:39 -07:00
Michael Neuling
f16d80b75a powerpc/tm: Fix oops on sigreturn on systems without TM
On systems like P9 powernv where we have no TM (or P8 booted with
ppc_tm=off), userspace can construct a signal context which still has
the MSR TS bits set. The kernel tries to restore this context which
results in the following crash:

  Unexpected TM Bad Thing exception at c0000000000022fc (msr 0x8000000102a03031) tm_scratch=800000020280f033
  Oops: Unrecoverable exception, sig: 6 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in:
  CPU: 0 PID: 1636 Comm: sigfuz Not tainted 5.2.0-11043-g0a8ad0ffa4 #69
  NIP:  c0000000000022fc LR: 00007fffb2d67e48 CTR: 0000000000000000
  REGS: c00000003fffbd70 TRAP: 0700   Not tainted  (5.2.0-11045-g7142b497d8)
  MSR:  8000000102a03031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[E]>  CR: 42004242  XER: 00000000
  CFAR: c0000000000022e0 IRQMASK: 0
  GPR00: 0000000000000072 00007fffb2b6e560 00007fffb2d87f00 0000000000000669
  GPR04: 00007fffb2b6e728 0000000000000000 0000000000000000 00007fffb2b6f2a8
  GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR12: 0000000000000000 00007fffb2b76900 0000000000000000 0000000000000000
  GPR16: 00007fffb2370000 00007fffb2d84390 00007fffea3a15ac 000001000a250420
  GPR20: 00007fffb2b6f260 0000000010001770 0000000000000000 0000000000000000
  GPR24: 00007fffb2d843a0 00007fffea3a14a0 0000000000010000 0000000000800000
  GPR28: 00007fffea3a14d8 00000000003d0f00 0000000000000000 00007fffb2b6e728
  NIP [c0000000000022fc] rfi_flush_fallback+0x7c/0x80
  LR [00007fffb2d67e48] 0x7fffb2d67e48
  Call Trace:
  Instruction dump:
  e96a0220 e96a02a8 e96a0330 e96a03b8 394a0400 4200ffdc 7d2903a6 e92d0c00
  e94d0c08 e96d0c10 e82d0c18 7db242a6 <4c000024> 7db243a6 7db142a6 f82d0c18

The problem is the signal code assumes TM is enabled when
CONFIG_PPC_TRANSACTIONAL_MEM is enabled. This may not be the case as
with P9 powernv or if `ppc_tm=off` is used on P8.

This means any local user can crash the system.

Fix the problem by returning a bad stack frame to the user if they try
to set the MSR TS bits with sigreturn() on systems where TM is not
supported.

Found with sigfuz kernel selftest on P9.

This fixes CVE-2019-13648.

Fixes: 2b0a576d15 ("powerpc: Add new transactional memory state to the signal context")
Cc: stable@vger.kernel.org # v3.9
Reported-by: Praveen Pandey <Praveen.Pandey@in.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190719050502.405-1-mikey@neuling.org
2019-07-22 13:05:23 +10:00
Shawn Anastasio
b4fc36e60f powerpc/dma: Fix invalid DMA mmap behavior
The refactor of powerpc DMA functions in commit 6666cc17d7
("powerpc/dma: remove dma_nommu_mmap_coherent") incorrectly
changes the way DMA mappings are handled on powerpc.
Since this change, all mapped pages are marked as cache-inhibited
through the default implementation of arch_dma_mmap_pgprot.
This differs from the previous behavior of only marking pages
in noncoherent mappings as cache-inhibited and has resulted in
sporadic system crashes in certain hardware configurations and
workloads (see Bugzilla).

This commit restores the previous correct behavior by providing
an implementation of arch_dma_mmap_pgprot that only marks
pages in noncoherent mappings as cache-inhibited. As this behavior
should be universal for all powerpc platforms a new file,
dma-generic.c, was created to store it.

Fixes: 6666cc17d7 ("powerpc/dma: remove dma_nommu_mmap_coherent")
# NOTE: fixes commit 6666cc17d7 released in v5.1.
# Consider a stable tag:
# Cc: stable@vger.kernel.org # v5.1+
# NOTE: fixes commit 6666cc17d7 released in v5.1.
# Consider a stable tag:
# Cc: stable@vger.kernel.org # v5.1+
Cc: stable@vger.kernel.org # v5.1+
Signed-off-by: Shawn Anastasio <shawn@anastas.io>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190717235437.12908-1-shawn@anastas.io
2019-07-19 21:29:49 +10:00
Mauro Carvalho Chehab
4d2e26a38f docs: powerpc: convert docs to ReST and rename to *.rst
Convert docs to ReST and add them to the arch-specific
book.

The conversion here was trivial, as almost every file there
was already using an elegant format close to ReST standard.

The changes were mostly to mark literal blocks and add a few
missing section title identifiers.

One note with regards to "--": on Sphinx, this can't be used
to identify a list, as it will format it badly. This can be
used, however, to identify a long hyphen - and "---" is an
even longer one.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> # cxl
2019-07-17 06:57:51 -03:00
Linus Torvalds
3c69914b4c for-linus-20190715
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXSxgkQAKCRCRxhvAZXjc
 opJ3AP9TWIWhEC0dzbNmzh0STj/Vyl5KYEpdMbi7HNeAmIEAfQD6A3HI4bVJbN08
 jH44U7DLzHJyHefKlB8jHEKEVYJWqgo=
 =74Bg
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20190715' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull pidfd and clone3 fixes from Christian Brauner:
 "This contains a bugfix for CLONE_PIDFD when used with the legacy clone
  syscall, two fixes to ensure that syscall numbering and clone3
  entrypoint implementations will stay consistent, and an update for the
  maintainers file:

   - The addition of clone3 broke CLONE_PIDFD for legacy clone on all
     architectures that use do_fork() directly instead of calling the
     clone syscall itself. (Fwiw, cleaning do_fork() up is on my todo.)

     The reason this happened was that during conversion of _do_fork()
     to use struct kernel_clone_args we missed that do_fork() is called
     directly by various architectures. This is fixed by making sure
     that the pidfd argument in struct kernel_clone_args is correctly
     initialized with the parent_tidptr argument passed down from
     do_fork(). Additionally, do_fork() missed a check to make
     CLONE_PIDFD and CLONE_PARENT_SETTID mutually exclusive just a
     clone() does. This is now fixed too.

   - When clone3() was introduced we skipped architectures that require
     special handling for fork-like syscalls. Their syscall tables did
     not contain any mention of clone3().

     To make sure that Arnd's work to make syscall numbers on all
     architectures identical (minus alpha) was not for naught we are
     placing a comment in all syscall tables that do not yet implement
     clone3(). The comment makes it clear that 435 is reserved for
     clone3 and should not be used.

   - Also, this contains a patch to make the clone3() syscall definition
     in asm-generic/unist.h conditional on __ARCH_WANT_SYS_CLONE3. This
     lets us catch new architectures that implicitly make use of clone3
     without setting __ARCH_WANT_SYS_CLONE3 which is a good indicator
     that they did not check whether it needs special treatment or not.

   - Finally, this contains a patch to add me as maintainer for pidfd
     stuff so people can start blaming me (more)"

* tag 'for-linus-20190715' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  MAINTAINERS: add new entry for pidfd api
  unistd: protect clone3 via __ARCH_WANT_SYS_CLONE3
  arch: mark syscall number 435 reserved for clone3
  clone: fix CLONE_PIDFD support
2019-07-16 11:30:07 -07:00
Christian Brauner
1a271a68e0
arch: mark syscall number 435 reserved for clone3
A while ago Arnd made it possible to give new system calls the same
syscall number on all architectures (except alpha). To not break this
nice new feature let's mark 435 for clone3 as reserved on all
architectures that do not yet implement it.
Even if an architecture does not plan to implement it this ensures that
new system calls coming after clone3 will have the same number on all
architectures.

Signed-off-by: Christian Brauner <christian@brauner.io>
Cc: linux-arch@vger.kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Link: https://lore.kernel.org/r/20190714192205.27190-2-christian@brauner.io
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christian Brauner <christian@brauner.io>
2019-07-15 00:39:33 +02:00
Linus Torvalds
192f0f8e9d powerpc updates for 5.3
Notable changes:
 
  - Removal of the NPU DMA code, used by the out-of-tree Nvidia driver, as well
    as some other functions only used by drivers that haven't (yet?) made it
    upstream.
 
  - A fix for a bug in our handling of hardware watchpoints (eg. perf record -e
    mem: ...) which could lead to register corruption and kernel crashes.
 
  - Enable HAVE_ARCH_HUGE_VMAP, which allows us to use large pages for vmalloc
    when using the Radix MMU.
 
  - A large but incremental rewrite of our exception handling code to use gas
    macros rather than multiple levels of nested CPP macros.
 
 And the usual small fixes, cleanups and improvements.
 
 Thanks to:
   Alastair D'Silva, Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju
   T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Cédric Le Goater,
   Christian Lamparter, Christophe Leroy, Christophe Lombard, Christoph Hellwig,
   Daniel Axtens, Denis Efremov, Enrico Weigelt, Frederic Barrat, Gautham R.
   Shenoy, Geert Uytterhoeven, Geliang Tang, Gen Zhang, Greg Kroah-Hartman, Greg
   Kurz, Gustavo Romero, Krzysztof Kozlowski, Madhavan Srinivasan, Masahiro
   Yamada, Mathieu Malaterre, Michael Neuling, Nathan Lynch, Naveen N. Rao,
   Nicholas Piggin, Nishad Kamdar, Oliver O'Halloran, Qian Cai, Ravi Bangoria,
   Sachin Sant, Sam Bobroff, Satheesh Rajendran, Segher Boessenkool, Shaokun
   Zhang, Shawn Anastasio, Stewart Smith, Suraj Jitindar Singh, Thiago Jung
   Bauermann, YueHaibing.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdKVoLAAoJEFHr6jzI4aWA0kIP/A6shIbbE7H5W2hFrqt/PPPK
 3+VrvPKbOFF+W6hcE/RgSZmEnUo0svdNjHUd/eMfFS1vb/uRt2QDdrsHUNNwURQL
 M2mcLXFwYpnjSjb/XMgDbHpAQxjeGfTdYLonUIejN7Rk8KQUeLyKQ3SBn6kfMc46
 DnUUcPcjuRGaETUmVuZZ4e40ZWbJp8PKDrSJOuUrTPXMaK5ciNbZk5mCWXGbYl6G
 BMQAyv4ld/417rNTjBEP/T2foMJtioAt4W6mtlgdkOTdIEZnFU67nNxDBthNSu2c
 95+I+/sML4KOp1R4yhqLSLIDDbc3bg3c99hLGij0d948z3bkSZ8bwnPaUuy70C4v
 U8rvl/+N6C6H3DgSsPE/Gnkd8DnudqWY8nULc+8p3fXljGwww6/Qgt+6yCUn8BdW
 WgixkSjKgjDmzTw8trIUNEqORrTVle7cM2hIyIK2Q5T4kWzNQxrLZ/x/3wgoYjUa
 1KwIzaRo5JKZ9D3pJnJ5U+knE2/90rJIyfcp0W6ygyJsWKi2GNmq1eN3sKOw0IxH
 Tg86RENIA/rEMErNOfP45sLteMuTR7of7peCG3yumIOZqsDVYAzerpvtSgip2cvK
 aG+9HcYlBFOOOF9Dabi8GXsTBLXLfwiyjjLSpA9eXPwW8KObgiNfTZa7ujjTPvis
 4mk9oukFTFUpfhsMmI3T
 =3dBZ
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Notable changes:

   - Removal of the NPU DMA code, used by the out-of-tree Nvidia driver,
     as well as some other functions only used by drivers that haven't
     (yet?) made it upstream.

   - A fix for a bug in our handling of hardware watchpoints (eg. perf
     record -e mem: ...) which could lead to register corruption and
     kernel crashes.

   - Enable HAVE_ARCH_HUGE_VMAP, which allows us to use large pages for
     vmalloc when using the Radix MMU.

   - A large but incremental rewrite of our exception handling code to
     use gas macros rather than multiple levels of nested CPP macros.

  And the usual small fixes, cleanups and improvements.

  Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Andreas Schwab,
  Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Arnd Bergmann,
  Athira Rajeev, Cédric Le Goater, Christian Lamparter, Christophe
  Leroy, Christophe Lombard, Christoph Hellwig, Daniel Axtens, Denis
  Efremov, Enrico Weigelt, Frederic Barrat, Gautham R. Shenoy, Geert
  Uytterhoeven, Geliang Tang, Gen Zhang, Greg Kroah-Hartman, Greg Kurz,
  Gustavo Romero, Krzysztof Kozlowski, Madhavan Srinivasan, Masahiro
  Yamada, Mathieu Malaterre, Michael Neuling, Nathan Lynch, Naveen N.
  Rao, Nicholas Piggin, Nishad Kamdar, Oliver O'Halloran, Qian Cai, Ravi
  Bangoria, Sachin Sant, Sam Bobroff, Satheesh Rajendran, Segher
  Boessenkool, Shaokun Zhang, Shawn Anastasio, Stewart Smith, Suraj
  Jitindar Singh, Thiago Jung Bauermann, YueHaibing"

* tag 'powerpc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (163 commits)
  powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state.
  powerpc/eeh: Handle hugepages in ioremap space
  ocxl: Update for AFU descriptor template version 1.1
  powerpc/boot: pass CONFIG options in a simpler and more robust way
  powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h
  powerpc/irq: Don't WARN continuously in arch_local_irq_restore()
  powerpc/module64: Use symbolic instructions names.
  powerpc/module32: Use symbolic instructions names.
  powerpc: Move PPC_HA() PPC_HI() and PPC_LO() to ppc-opcode.h
  powerpc/module64: Fix comment in R_PPC64_ENTRY handling
  powerpc/boot: Add lzo support for uImage
  powerpc/boot: Add lzma support for uImage
  powerpc/boot: don't force gzipped uImage
  powerpc/8xx: Add microcode patch to move SMC parameter RAM.
  powerpc/8xx: Use IO accessors in microcode programming.
  powerpc/8xx: replace #ifdefs by IS_ENABLED() in microcode.c
  powerpc/8xx: refactor programming of microcode CPM params.
  powerpc/8xx: refactor printing of microcode patch name.
  powerpc/8xx: Refactor microcode write
  powerpc/8xx: refactor writing of CPM microcode arrays
  ...
2019-07-13 16:08:36 -07:00
Oliver O'Halloran
3343962068 powerpc/eeh: Handle hugepages in ioremap space
In commit 4a7b06c157a2 ("powerpc/eeh: Handle hugepages in ioremap
space") support for using hugepages in the vmalloc and ioremap areas was
enabled for radix. Unfortunately this broke EEH MMIO error checking.

Detection works by inserting a hook which checks the results of the
ioreadXX() set of functions.  When a read returns a 0xFFs response we
need to check for an error which we do by mapping the (virtual) MMIO
address back to a physical address, then mapping physical address to a
PCI device via an interval tree.

When translating virt -> phys we currently assume the ioremap space is
only populated by PAGE_SIZE mappings. If a hugepage mapping is found we
emit a WARN_ON(), but otherwise handles the check as though a normal
page was found. In pathalogical cases such as copying a buffer
containing a lot of 0xFFs from BAR memory this can result in the system
not booting because it's too busy printing WARN_ON()s.

There's no real reason to assume huge pages can't be present and we're
prefectly capable of handling them, so do that.

Fixes: 4a7b06c157a2 ("powerpc/eeh: Handle hugepages in ioremap space")
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190710150517.27114-1-oohall@gmail.com
2019-07-12 01:02:09 +10:00
Linus Torvalds
5450e8a316 pidfd-updates-v5.3
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXSMhUgAKCRCRxhvAZXjc
 okkiAQC3Hlg/O2JoIb4PqgEvBkpHSdVxyuWagn0ksjACW9ANKQEAl5OadMhvOq16
 UHGhKlpE/M8HflknIffoEGlIAWHrdwU=
 =7kP5
 -----END PGP SIGNATURE-----

Merge tag 'pidfd-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull pidfd updates from Christian Brauner:
 "This adds two main features.

   - First, it adds polling support for pidfds. This allows process
     managers to know when a (non-parent) process dies in a race-free
     way.

     The notification mechanism used follows the same logic that is
     currently used when the parent of a task is notified of a child's
     death. With this patchset it is possible to put pidfds in an
     {e}poll loop and get reliable notifications for process (i.e.
     thread-group) exit.

   - The second feature compliments the first one by making it possible
     to retrieve pollable pidfds for processes that were not created
     using CLONE_PIDFD.

     A lot of processes get created with traditional PID-based calls
     such as fork() or clone() (without CLONE_PIDFD). For these
     processes a caller can currently not create a pollable pidfd. This
     is a problem for Android's low memory killer (LMK) and service
     managers such as systemd.

  Both patchsets are accompanied by selftests.

  It's perhaps worth noting that the work done so far and the work done
  in this branch for pidfd_open() and polling support do already see
  some adoption:

   - Android is in the process of backporting this work to all their LTS
     kernels [1]

   - Service managers make use of pidfd_send_signal but will need to
     wait until we enable waiting on pidfds for full adoption.

   - And projects I maintain make use of both pidfd_send_signal and
     CLONE_PIDFD [2] and will use polling support and pidfd_open() too"

[1] https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.9+backport%22
    https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.14+backport%22
    https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.19+backport%22

[2] aab6e3eb73/src/lxc/start.c (L1753)

* tag 'pidfd-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  tests: add pidfd_open() tests
  arch: wire-up pidfd_open()
  pid: add pidfd_open()
  pidfd: add polling selftests
  pidfd: add polling support
2019-07-10 22:17:21 -07:00
Michael Ellerman
0fc12c022a powerpc/irq: Don't WARN continuously in arch_local_irq_restore()
When CONFIG_PPC_IRQ_SOFT_MASK_DEBUG is enabled (uncommon), we have a
series of WARN_ON's in arch_local_irq_restore().

These are "should never happen" conditions, but if they do happen they
can flood the console and render the system unusable. So switch them
to WARN_ON_ONCE().

Fixes: e2b36d5917 ("powerpc/64: Don't trace code that runs with the soft irq mask unreconciled")
Fixes: 9b81c0211c ("powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closely")
Fixes: 7c0482e3d0 ("powerpc/irq: Fix another case of lazy IRQ state getting out of sync")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190708061046.7075-1-mpe@ellerman.id.au
2019-07-10 13:20:43 +10:00
Linus Torvalds
cf2d213e49 Power management updates for 5.3-rc1
- Improve the handling of shared ACPI power resources in the PCI
    bus type layer (Mika Westerberg).
 
  - Make the PCI layer take link delays required by the PCIe spec
    into account as appropriate and avoid polling devices in D3cold
    for PME (Mika Westerberg).
 
  - Fix some corner case issues in ACPI device power management and
    in the PCI bus type layer, optimiza and clean up the handling of
    runtime-suspended PCI devices during system-wide transitions to
    sleep states (Rafael Wysocki).
 
  - Rework hibernation handling in the ACPI core and the PCI bus type
    to resume runtime-suspended devices before hibernation (which
    allows some functional problems to be avoided) and fix some ACPI
    power management issues related to hiberation (Rafael Wysocki).
 
  - Extend the operating performance points (OPP) framework to support
    a wider range of devices (Rajendra Nayak, Stehpen Boyd).
 
  - Fix issues related to genpd_virt_devs and issues with platforms
    using the set_opp() callback in the OPP framework (Viresh Kumar,
    Dmitry Osipenko).
 
  - Add new cpufreq driver for Raspberry Pi (Nicolas Saenz Julienne).
 
  - Add new cpufreq driver for imx8m and imx7d chips (Leonard Crestez).
 
  - Fix and clean up the pcc-cpufreq, brcmstb-avs-cpufreq, s5pv210,
    and armada-37xx cpufreq drivers (David Arcari, Florian Fainelli,
    Paweł Chmiel, YueHaibing).
 
  - Clean up and fix the cpufreq core (Viresh Kumar, Daniel Lezcano).
 
  - Fix minor issue in the ACPI system sleep support code and export
    one function from it (Lenny Szubowicz, Dexuan Cui).
 
  - Clean up assorted pieces of PM code and documentation (Kefeng Wang,
    Andy Shevchenko, Bart Van Assche, Greg Kroah-Hartman, Fuqian Huang,
    Geert Uytterhoeven, Mathieu Malaterre, Rafael Wysocki).
 
  - Update the pm-graph utility to v5.4 (Todd Brandt).
 
  - Fix and clean up the cpupower utility (Abhishek Goel, Nick Black).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl0jK18SHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxgEAP/RbPe71Y9ufKu64L3EtgV6mS9iuLEhux
 /Ad9laLNeM1b0oceT3QxGk7xQCacnZcBlcaqXVWI4NRsn4RBZp1cYZngpgJ9DP6E
 ONr8hzyzDOMVReba3XJIF8H+WoTKjywMYtFutjdx6dRe2ZJLutqnuZ0JbH1YSSK7
 IxOt0mJVALf2M4Zz7F17d+n3yGE/4xAPBVbj/rBRcTEsGYlR/Hoxs7iF6EBau7fy
 R5drUH6XSrWk8adc+z7l3BTGqMMYj9deRSfAWB3wpM4YK7Fv7msX/amBoGINkdn6
 xP/ZcrHvhKKzE89MS8OUGP4rGVwq+7tu6mktnYL/tpKgutJqqx5LVvrLsGDSWr+W
 /aJExN8Eb4Jh98C6vog3XUJoqBxkVGbU8qoCBU3jlFsaznFEWjW9IKhBHs5CIaqz
 MXte6AsJ8lvFzxILjvx0m2206wNpRJRXYLX3a/BSBxa4OgOESjIpBTmbPfOwbxwj
 8z9hIVlDTTDtnF6BEyDQr1fjPi3Mxl7ibGnoqRrJm36VKBy9VZNNwG/0Y2oSvm6k
 Es8CiTWA3A/46dCZxGr18/9Vbfxn1Yvg9QZ1lCE5Fqij+0F2yRApbHZgUro+1rji
 6J8OWC5r5JccdKuGHh4RH/asMFhD0cAR/gUsRzS4dz/wz2jVIN1FstdD1aKN5GBy
 d0lchx/AKR5H
 =aBN3
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These update PCI and ACPI power management (improved handling of ACPI
  power resources and PCIe link delays, fixes related to corner cases,
  hibernation handling rework), fix and extend the operating performance
  points (OPP) framework, add new cpufreq drivers for Raspberry Pi and
  imx8m chips, update some other cpufreq drivers, clean up assorted
  pieces of PM code and documentation and update tools.

  Specifics:

   - Improve the handling of shared ACPI power resources in the PCI bus
     type layer (Mika Westerberg).

   - Make the PCI layer take link delays required by the PCIe spec into
     account as appropriate and avoid polling devices in D3cold for PME
     (Mika Westerberg).

   - Fix some corner case issues in ACPI device power management and in
     the PCI bus type layer, optimiza and clean up the handling of
     runtime-suspended PCI devices during system-wide transitions to
     sleep states (Rafael Wysocki).

   - Rework hibernation handling in the ACPI core and the PCI bus type
     to resume runtime-suspended devices before hibernation (which
     allows some functional problems to be avoided) and fix some ACPI
     power management issues related to hiberation (Rafael Wysocki).

   - Extend the operating performance points (OPP) framework to support
     a wider range of devices (Rajendra Nayak, Stehpen Boyd).

   - Fix issues related to genpd_virt_devs and issues with platforms
     using the set_opp() callback in the OPP framework (Viresh Kumar,
     Dmitry Osipenko).

   - Add new cpufreq driver for Raspberry Pi (Nicolas Saenz Julienne).

   - Add new cpufreq driver for imx8m and imx7d chips (Leonard Crestez).

   - Fix and clean up the pcc-cpufreq, brcmstb-avs-cpufreq, s5pv210, and
     armada-37xx cpufreq drivers (David Arcari, Florian Fainelli, Paweł
     Chmiel, YueHaibing).

   - Clean up and fix the cpufreq core (Viresh Kumar, Daniel Lezcano).

   - Fix minor issue in the ACPI system sleep support code and export
     one function from it (Lenny Szubowicz, Dexuan Cui).

   - Clean up assorted pieces of PM code and documentation (Kefeng Wang,
     Andy Shevchenko, Bart Van Assche, Greg Kroah-Hartman, Fuqian Huang,
     Geert Uytterhoeven, Mathieu Malaterre, Rafael Wysocki).

   - Update the pm-graph utility to v5.4 (Todd Brandt).

   - Fix and clean up the cpupower utility (Abhishek Goel, Nick Black)"

* tag 'pm-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (57 commits)
  ACPI: PM: Make acpi_sleep_state_supported() non-static
  PM: sleep: Drop dev_pm_skip_next_resume_phases()
  ACPI: PM: Unexport acpi_device_get_power()
  Documentation: ABI: power: Add missing newline at end of file
  ACPI: PM: Drop unused function and function header
  ACPI: PM: Introduce "poweroff" callbacks for ACPI PM domain and LPSS
  ACPI: PM: Simplify and fix PM domain hibernation callbacks
  PCI: PM: Simplify bus-level hibernation callbacks
  PM: ACPI/PCI: Resume all devices during hibernation
  cpufreq: Avoid calling cpufreq_verify_current_freq() from handle_update()
  cpufreq: Consolidate cpufreq_update_current_freq() and __cpufreq_get()
  kernel: power: swap: use kzalloc() instead of kmalloc() followed by memset()
  cpufreq: Don't skip frequency validation for has_target() drivers
  PCI: PM/ACPI: Refresh all stale power state data in pci_pm_complete()
  PCI / ACPI: Add _PR0 dependent devices
  ACPI / PM: Introduce concept of a _PR0 dependent device
  PCI / ACPI: Use cached ACPI device state to get PCI device power state
  ACPI: PM: Allow transitions to D0 to occur in special cases
  ACPI: PM: Avoid evaluating _PS3 on transitions from D3hot to D3cold
  cpufreq: Use has_target() instead of !setpolicy
  ...
2019-07-09 10:05:22 -07:00
Linus Torvalds
5ad18b2e60 Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull force_sig() argument change from Eric Biederman:
 "A source of error over the years has been that force_sig has taken a
  task parameter when it is only safe to use force_sig with the current
  task.

  The force_sig function is built for delivering synchronous signals
  such as SIGSEGV where the userspace application caused a synchronous
  fault (such as a page fault) and the kernel responded with a signal.

  Because the name force_sig does not make this clear, and because the
  force_sig takes a task parameter the function force_sig has been
  abused for sending other kinds of signals over the years. Slowly those
  have been fixed when the oopses have been tracked down.

  This set of changes fixes the remaining abusers of force_sig and
  carefully rips out the task parameter from force_sig and friends
  making this kind of error almost impossible in the future"

* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
  signal/x86: Move tsk inside of CONFIG_MEMORY_FAILURE in do_sigbus
  signal: Remove the signal number and task parameters from force_sig_info
  signal: Factor force_sig_info_to_task out of force_sig_info
  signal: Generate the siginfo in force_sig
  signal: Move the computation of force into send_signal and correct it.
  signal: Properly set TRACE_SIGNAL_LOSE_INFO in __send_signal
  signal: Remove the task parameter from force_sig_fault
  signal: Use force_sig_fault_to_task for the two calls that don't deliver to current
  signal: Explicitly call force_sig_fault on current
  signal/unicore32: Remove tsk parameter from __do_user_fault
  signal/arm: Remove tsk parameter from __do_user_fault
  signal/arm: Remove tsk parameter from ptrace_break
  signal/nds32: Remove tsk parameter from send_sigtrap
  signal/riscv: Remove tsk parameter from do_trap
  signal/sh: Remove tsk parameter from force_sig_info_fault
  signal/um: Remove task parameter from send_sigtrap
  signal/x86: Remove task parameter from send_sigtrap
  signal: Remove task parameter from force_sig_mceerr
  signal: Remove task parameter from force_sig
  signal: Remove task parameter from force_sigsegv
  ...
2019-07-08 21:48:15 -07:00
Linus Torvalds
e0e86b111b Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP/hotplug updates from Thomas Gleixner:
 "A small set of updates for SMP and CPU hotplug:

   - Abort disabling secondary CPUs in the freezer when a wakeup is
     pending instead of evaluating it only after all CPUs have been
     offlined.

   - Remove the shared annotation for the strict per CPU cfd_data in the
     smp function call core code.

   - Remove the return values of smp_call_function() and on_each_cpu()
     as they are unconditionally 0. Fixup the few callers which actually
     bothered to check the return value"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  smp: Remove smp_call_function() and on_each_cpu() return values
  smp: Do not mark call_function_data as shared
  cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending
  cpu/hotplug: Fix notify_cpu_starting() reference in bringup_wait_for_ap()
2019-07-08 10:39:56 -07:00
Linus Torvalds
dfd437a257 arm64 updates for 5.3:
- arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP}
 
 - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to
   manage the permissions of executable vmalloc regions more strictly
 
 - Slight performance improvement by keeping softirqs enabled while
   touching the FPSIMD/SVE state (kernel_neon_begin/end)
 
 - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new XAFLAG
   and AXFLAG instructions for floating point comparison flags
   manipulation) and FRINT (rounding floating point numbers to integers)
 
 - Re-instate ARM64_PSEUDO_NMI support which was previously marked as
   BROKEN due to some bugs (now fixed)
 
 - Improve parking of stopped CPUs and implement an arm64-specific
   panic_smp_self_stop() to avoid warning on not being able to stop
   secondary CPUs during panic
 
 - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI
   platforms
 
 - perf: DDR performance monitor support for iMX8QXP
 
 - cache_line_size() can now be set from DT or ACPI/PPTT if provided to
   cope with a system cache info not exposed via the CPUID registers
 
 - Avoid warning on hardware cache line size greater than
   ARCH_DMA_MINALIGN if the system is fully coherent
 
 - arm64 do_page_fault() and hugetlb cleanups
 
 - Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep)
 
 - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the 'arm_boot_flags'
   introduced in 5.1)
 
 - CONFIG_RANDOMIZE_BASE now enabled in defconfig
 
 - Allow the selection of ARM64_MODULE_PLTS, currently only done via
   RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill
   over into the vmalloc area
 
 - Make ZONE_DMA32 configurable
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl0eHqcACgkQa9axLQDI
 XvFyNA/+L+bnkz8m3ncydlqqfXomQn4eJJVQ8Uksb0knJz+1+3CUxxbO4ry4jXZN
 fMkbggYrDPRKpDbsUl0lsRipj7jW9bqan+N37c3SWqCkgb6HqDaHViwxdx6Ec/Uk
 gHudozDSPh/8c7hxGcSyt/CFyuW6b+8eYIQU5rtIgz8aVY2BypBvS/7YtYCbIkx0
 w4CFleRTK1zXD5mJQhrc6jyDx659sVkrAvdhf6YIymOY8nBTv40vwdNo3beJMYp8
 Po/+0Ixu+VkHUNtmYYZQgP/AGH96xiTcRnUqd172JdtRPpCLqnLqwFokXeVIlUKT
 KZFMDPzK+756Ayn4z4huEePPAOGlHbJje8JVNnFyreKhVVcCotW7YPY/oJR10bnc
 eo7yD+DxABTn+93G2yP436bNVa8qO1UqjOBfInWBtnNFJfANIkZweij/MQ6MjaTA
 o7KtviHnZFClefMPoiI7HDzwL8XSmsBDbeQ04s2Wxku1Y2xUHLx4iLmadwLQ1ZPb
 lZMTZP3N/T1554MoURVA1afCjAwiqU3bt1xDUGjbBVjLfSPBAn/25IacsG9Li9AF
 7Rp1M9VhrfLftjFFkB2HwpbhRASOxaOSx+EI3kzEfCtM2O9I1WHgP3rvCdc3l0HU
 tbK0/IggQicNgz7GSZ8xDlWPwwSadXYGLys+xlMZEYd3pDIOiFc=
 =0TDT
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP}

 - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to
   manage the permissions of executable vmalloc regions more strictly

 - Slight performance improvement by keeping softirqs enabled while
   touching the FPSIMD/SVE state (kernel_neon_begin/end)

 - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new
   XAFLAG and AXFLAG instructions for floating point comparison flags
   manipulation) and FRINT (rounding floating point numbers to integers)

 - Re-instate ARM64_PSEUDO_NMI support which was previously marked as
   BROKEN due to some bugs (now fixed)

 - Improve parking of stopped CPUs and implement an arm64-specific
   panic_smp_self_stop() to avoid warning on not being able to stop
   secondary CPUs during panic

 - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI
   platforms

 - perf: DDR performance monitor support for iMX8QXP

 - cache_line_size() can now be set from DT or ACPI/PPTT if provided to
   cope with a system cache info not exposed via the CPUID registers

 - Avoid warning on hardware cache line size greater than
   ARCH_DMA_MINALIGN if the system is fully coherent

 - arm64 do_page_fault() and hugetlb cleanups

 - Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep)

 - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the
   'arm_boot_flags' introduced in 5.1)

 - CONFIG_RANDOMIZE_BASE now enabled in defconfig

 - Allow the selection of ARM64_MODULE_PLTS, currently only done via
   RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill
   over into the vmalloc area

 - Make ZONE_DMA32 configurable

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits)
  perf: arm_spe: Enable ACPI/Platform automatic module loading
  arm_pmu: acpi: spe: Add initial MADT/SPE probing
  ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  ACPI/PPTT: Modify node flag detection to find last IDENTICAL
  x86/entry: Simplify _TIF_SYSCALL_EMU handling
  arm64: rename dump_instr as dump_kernel_instr
  arm64/mm: Drop [PTE|PMD]_TYPE_FAULT
  arm64: Implement panic_smp_self_stop()
  arm64: Improve parking of stopped CPUs
  arm64: Expose FRINT capabilities to userspace
  arm64: Expose ARMv8.5 CondM capability to userspace
  arm64: defconfig: enable CONFIG_RANDOMIZE_BASE
  arm64: ARM64_MODULES_PLTS must depend on MODULES
  arm64: bpf: do not allocate executable memory
  arm64/kprobes: set VM_FLUSH_RESET_PERMS on kprobe instruction pages
  arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP
  arm64: module: create module allocations without exec permissions
  arm64: Allow user selection of ARM64_MODULE_PLTS
  acpi/arm64: ignore 5.1 FADTs that are reported as 5.0
  arm64: Allow selecting Pseudo-NMI again
  ...
2019-07-08 09:54:55 -07:00
Rafael J. Wysocki
3dbeb44854 Merge branch 'pm-sleep'
* pm-sleep:
  PM: sleep: Drop dev_pm_skip_next_resume_phases()
  ACPI: PM: Drop unused function and function header
  ACPI: PM: Introduce "poweroff" callbacks for ACPI PM domain and LPSS
  ACPI: PM: Simplify and fix PM domain hibernation callbacks
  PCI: PM: Simplify bus-level hibernation callbacks
  PM: ACPI/PCI: Resume all devices during hibernation
  kernel: power: swap: use kzalloc() instead of kmalloc() followed by memset()
  PM: sleep: Update struct wakeup_source documentation
  drivers: base: power: remove wakeup_sources_stats_dentry variable
  PM: suspend: Rename pm_suspend_via_s2idle()
  PM: sleep: Show how long dpm_suspend_start() and dpm_suspend_end() take
  PM: hibernate: powerpc: Expose pfn_is_nosave() prototype
2019-07-08 10:51:25 +02:00
Christophe Leroy
a2b6f26c26 powerpc/module64: Use symbolic instructions names.
To increase readability/maintainability, replace hard coded
instructions values by symbolic names.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Fix R_PPC64_ENTRY case, the addi reads from r2 not r12]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-06 00:29:50 +10:00
Christophe Leroy
4eb4516ead powerpc/module32: Use symbolic instructions names.
To increase readability/maintainability, replace hard coded
instructions values by symbolic names.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-06 00:29:50 +10:00
Christophe Leroy
7f9c929a7f powerpc: Move PPC_HA() PPC_HI() and PPC_LO() to ppc-opcode.h
PPC_HA() PPC_HI() and PPC_LO() macros are nice macros. Move them
from module64.c to ppc-opcode.h in order to use them in other places.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Clean up formatting in new code, drop duplicates in ftrace.c]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-06 00:29:50 +10:00
Michael Ellerman
2fb0a2c989 powerpc/module64: Fix comment in R_PPC64_ENTRY handling
The comment here is wrong, the addi reads from r2 not r12. The code is
correct, 0x38420000 = addi r2,r2,0.

Fixes: a61674bdfc ("powerpc/module: Handle R_PPC64_ENTRY relocations")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-06 00:29:50 +10:00
Christophe Leroy
22e9c88d48 powerpc/64: reuse PPC32 static inline flush_dcache_range()
This patch drops the assembly PPC64 version of flush_dcache_range()
and re-uses the PPC32 static inline version.

With GCC 8.1, the following code is generated:

void flush_test(unsigned long start, unsigned long stop)
{
	flush_dcache_range(start, stop);
}

0000000000000130 <.flush_test>:
 130:	3d 22 00 00 	addis   r9,r2,0
			132: R_PPC64_TOC16_HA	.data+0x8
 134:	81 09 00 00 	lwz     r8,0(r9)
			136: R_PPC64_TOC16_LO	.data+0x8
 138:	3d 22 00 00 	addis   r9,r2,0
			13a: R_PPC64_TOC16_HA	.data+0xc
 13c:	80 e9 00 00 	lwz     r7,0(r9)
			13e: R_PPC64_TOC16_LO	.data+0xc
 140:	7d 48 00 d0 	neg     r10,r8
 144:	7d 43 18 38 	and     r3,r10,r3
 148:	7c 00 04 ac 	hwsync
 14c:	4c 00 01 2c 	isync
 150:	39 28 ff ff 	addi    r9,r8,-1
 154:	7c 89 22 14 	add     r4,r9,r4
 158:	7c 83 20 50 	subf    r4,r3,r4
 15c:	7c 89 3c 37 	srd.    r9,r4,r7
 160:	41 82 00 1c 	beq     17c <.flush_test+0x4c>
 164:	7d 29 03 a6 	mtctr   r9
 168:	60 00 00 00 	nop
 16c:	60 00 00 00 	nop
 170:	7c 00 18 ac 	dcbf    0,r3
 174:	7c 63 42 14 	add     r3,r3,r8
 178:	42 00 ff f8 	bdnz    170 <.flush_test+0x40>
 17c:	7c 00 04 ac 	hwsync
 180:	4c 00 01 2c 	isync
 184:	4e 80 00 20 	blr
 188:	60 00 00 00 	nop
 18c:	60 00 00 00 	nop

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-05 02:06:37 +10:00
Christophe Leroy
1cfb725fb1 powerpc/64: flush_inval_dcache_range() becomes flush_dcache_range()
On most arches having function flush_dcache_range(), including PPC32,
this function does a writeback and invalidation of the cache bloc.

On PPC64, flush_dcache_range() only does a writeback while
flush_inval_dcache_range() does the invalidation in addition.

In addition it looks like within arch/powerpc/, there are no PPC64
platforms using flush_dcache_range()

This patch drops the existing 64 bits version of flush_dcache_range()
and renames flush_inval_dcache_range() into flush_dcache_range().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-05 01:35:10 +10:00
Alexey Kardashevskiy
dead1c845d powerpc/pci/of: Parse unassigned resources
The pseries platform uses the PCI_PROBE_DEVTREE method of PCI probing
which reads "assigned-addresses" of every PCI device and initializes
the device resources. However if the property is missing or zero sized,
then there is no fallback of any kind and the PCI resources remain
undiscovered, i.e. pdev->resource[] array remains empty.

This adds a fallback which parses the "reg" property in pretty much same
way except it marks resources as "unset" which later make Linux assign
those resources proper addresses.

This has an effect when:
1. a hypervisor failed to assign any resource for a device;
2. /chosen/linux,pci-probe-only=0 is in the DT so the system may try
assigning a resource.
Neither is likely to happen under PowerVM.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-03 15:19:35 +10:00
Alexey Kardashevskiy
efd176a04b powerpc/pseries/dma: Allow SWIOTLB
The commit 8617a5c5bc ("powerpc/dma: handle iommu bypass in
dma_iommu_ops") merged direct DMA ops into the IOMMU DMA ops allowing
SWIOTLB as well but only for mapping; the unmapping and bouncing parts
were left unmodified.

This adds missing direct unmapping calls to .unmap_page() and
.unmap_sg().

This adds missing sync callbacks and directs them to the direct DMA
hooks.

Fixes: 8617a5c5bc ("powerpc/dma: handle iommu bypass in dma_iommu_ops")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-03 15:19:35 +10:00
Christoph Hellwig
24911acd64 powerpc: remove device_to_mask()
Use the dma_get_mask() helper from dma-mapping.h instead, as they are
functionally identical.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-03 15:19:35 +10:00
Michael Neuling
a278e7ea60 powerpc: Fix compile issue with force DAWR
If you compile with KVM but without CONFIG_HAVE_HW_BREAKPOINT you fail
at linking with:
  arch/powerpc/kvm/book3s_hv_rmhandlers.o:(.text+0x708): undefined reference to `dawr_force_enable'

This was caused by commit c1fe190c06 ("powerpc: Add force enable of
DAWR on P9 option").

This moves a bunch of code around to fix this. It moves a lot of the
DAWR code in a new file and creates a new CONFIG_PPC_DAWR to enable
compiling it.

Fixes: c1fe190c06 ("powerpc: Add force enable of DAWR on P9 option")
Signed-off-by: Michael Neuling <mikey@neuling.org>
[mpe: Minor formatting in set_dawr()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-03 15:19:35 +10:00
Mathieu Malaterre
548c54acba powerpc: silence a -Wcast-function-type warning in dawr_write_file_bool
In commit c1fe190c06 ("powerpc: Add force enable of DAWR on P9
option") the following piece of code was added:

   smp_call_function((smp_call_func_t)set_dawr, &null_brk, 0);

Since GCC 8 this triggers the following warning about incompatible
function types:

  arch/powerpc/kernel/hw_breakpoint.c:408:21: error: cast between incompatible function types from 'int (*)(struct arch_hw_breakpoint *)' to 'void (*)(void *)' [-Werror=cast-function-type]

Since the warning is there for a reason, and should not be hidden behind
a cast, provide an intermediate callback function to avoid the warning.

Fixes: c1fe190c06 ("powerpc: Add force enable of DAWR on P9 option")
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-03 15:19:35 +10:00
Nicholas Piggin
fe7946ce08 powerpc/64s: Rename PPC_INVALIDATE_ERAT to PPC_ISA_3_0_INVALIDATE_ERAT
This makes it clear to the caller that it can only be used on POWER9
and later CPUs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Use "ISA_3_0" rather than "ARCH_300"]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-03 15:19:35 +10:00
Nicholas Piggin
293c2e27b9 powerpc/64s/exception: simplify hmi control flow
Branch to the relocated 0xc000 address early (still in real mode), to
simplify subsequent branches. Have the virt mode handler avoid just
'windup' and redo the exception from scratch, rather than branching
back to the trampoline.

Rearrange the stack setup instruction location to match the system
reset handler (e.g., right before EXCEPTION_PROLOG_COMMON).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-03 15:19:35 +10:00
Nicholas Piggin
f34c9675ca powerpc/64s/exception: hmi remove special case macro
No code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-03 15:19:35 +10:00
Nicholas Piggin
acc8da4492 powerpc/64s/exception: sreset move trampoline ahead of common code
Follow convention and move tramp ahead of common.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-03 15:19:35 +10:00
Nicholas Piggin
0e10be2bb9 powerpc/64s/exception: optimise system_reset for idle, clean up non-idle case
The idle wake up code in the system reset interrupt is not very
optimal. There are two requirements: perform idle wake up quickly;
and save everything including CFAR for non-idle interrupts, with
no performance requirement.

The problem with placing the idle test in the middle of the handler
and using the normal handler code to save CFAR, is that it's quite
costly (e.g., mfcfar is serialising, speculative workarounds get
applied, SRR1 has to be reloaded, etc). It also prevents the standard
interrupt handler boilerplate being used.

This pain can be avoided by using a dedicated idle interrupt handler
at the start of the interrupt handler, which restores all registers
back to the way they were in case it was not an idle wake up. CFAR
is preserved without saving it before the non-idle case by making that
the fall-through, and idle is a taken branch.

Performance seems to be in the noise, but possibly around 0.5% faster,
the executed instructions certainly look better. The bigger benefit is
being able to drop in standard interrupt handlers after the idle code,
which helps with subsequent cleanup and consolidation.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fixup BE by using DOTSYM for idle_return_gpr_loss call]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-03 15:18:46 +10:00
Nicholas Piggin
0a882e2846 powerpc/64s/exception: remove bad stack branch
The bad stack test in interrupt handlers has a few problems. For
performance it is taken in the common case, which is a fetch bubble
and a waste of i-cache.

For code development and maintainence, it requires yet another stack
frame setup routine, and that constrains all exception handlers to
follow the same register save pattern which inhibits future
optimisation.

Remove the test/branch and replace it with a trap. Teach the program
check handler to use the emergency stack for this case.

This does not result in quite so nice a message, however the SRR0 and
SRR1 of the crashed interrupt can be seen in r11 and r12, as is the
original r1 (adjusted by INT_FRAME_SIZE). These are the most important
parts to debugging the issue.

The original r9-12 and cr0 is lost, which is the main downside.

  kernel BUG at linux/arch/powerpc/kernel/exceptions-64s.S:847!
  Oops: Exception in kernel mode, sig: 5 [#1]
  BE SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted
  NIP:  c000000000009108 LR: c000000000cadbcc CTR: c0000000000090f0
  REGS: c0000000fffcbd70 TRAP: 0700   Not tainted
  MSR:  9000000000021032 <SF,HV,ME,IR,DR,RI>  CR: 28222448  XER: 20040000
  CFAR: c000000000009100 IRQMASK: 0
  GPR00: 000000000000003d fffffffffffffd00 c0000000018cfb00 c0000000f02b3166
  GPR04: fffffffffffffffd 0000000000000007 fffffffffffffffb 0000000000000030
  GPR08: 0000000000000037 0000000028222448 0000000000000000 c000000000ca8de0
  GPR12: 9000000002009032 c000000001ae0000 c000000000010a00 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: c0000000f00322c0 c000000000f85200 0000000000000004 ffffffffffffffff
  GPR24: fffffffffffffffe 0000000000000000 0000000000000000 000000000000000a
  GPR28: 0000000000000000 0000000000000000 c0000000f02b391c c0000000f02b3167
  NIP [c000000000009108] decrementer_common+0x18/0x160
  LR [c000000000cadbcc] .vsnprintf+0x3ec/0x4f0
  Call Trace:
  Instruction dump:
  996d098a 994d098b 38610070 480246ed 48005518 60000000 38200000 718a4000
  7c2a0b78 3821fd00 41c20008 e82d0970 <0981fd00> f92101a0 f9610170 f9810178

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:49 +10:00
Nicholas Piggin
f30a5e68f0 powerpc/tm: update comment about interrupt re-entrancy
Since the system reset interrupt began to use its own stack, and
machine check interrupts have done so for some time, r1 can be
changed without clearing MSR[RI], provided no other interrupts
(including SLB misses) are taken.

MSR[RI] does have to be cleared when using SCRATCH0, however.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:49 +10:00
Nicholas Piggin
d7fb34c704 powerpc/64s/exception: move SET_SCRATCH0 into EXCEPTION_PROLOG_0
No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:49 +10:00
Nicholas Piggin
904f81f3f3 powerpc/64s/exception: denorm handler use standard scratch save macro
Although the 0x1500 interrupt only applies to bare metal, it is better
to just use the standard macro for scratch save.

Runtime code path remains unchanged (due to instruction patching).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:49 +10:00
Nicholas Piggin
02a1258154 powerpc/64s/exception: machine check use standard macros to save dar/dsisr
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:49 +10:00
Nicholas Piggin
5312c4941e powerpc/64s/exception: add dar and dsisr options to exception macro
Some exception entry requires DAR and/or DSISR to be saved into the
paca exception save area. Add options to the standard exception
macros for these.

Generated code changes slightly due to code structure.

-     554:      a6 02 72 7d     mfdsisr r11
-     558:      a8 00 4d f9     std     r10,168(r13)
-     55c:      b0 00 6d 91     stw     r11,176(r13)
+     554:      a8 00 4d f9     std     r10,168(r13)
+     558:      a6 02 52 7d     mfdsisr r10
+     55c:      b0 00 4d 91     stw     r10,176(r13)

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:49 +10:00
Nicholas Piggin
391e941b89 powerpc/64s/exception: use common macro for windup
No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:48 +10:00
Nicholas Piggin
b113c08341 powerpc/64s/exception: shuffle windup code around
Restore all SPRs and CR up-front, these are longer latency
instructions. Move register restore around to maximise pairs of
adjacent loads (e.g., restore r0 next to r1).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:48 +10:00
Nicholas Piggin
67d4160a61 powerpc/64s/exception: simplify hmi windup code
Duplicate the hmi windup code for both cases, rather than to put a
special case branch in the middle of it. Remove unused label. This
helps with later code consolidation.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:48 +10:00
Nicholas Piggin
ad73d8d4f4 powerpc/64s/exception: move machine check windup in_mce handling
Move in_mce decrement earlier before registers are restored (but
still after RI=0). This helps with later consolidation.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:48 +10:00
Nicholas Piggin
9592b29a9c powerpc/64s/exception: windup use r9 consistently to restore SPRs
Trivial code change, r3->r9.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:48 +10:00
Nicholas Piggin
fbc50063a2 powerpc/64s/exception: mtmsrd L=1 cleanup
All supported 64s CPUs support mtmsrd L=1 instruction, so a cleanup
can be made in sreset and mce handlers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:48 +10:00
Nicholas Piggin
63d60d0c69 powerpc/64s/exception: avoid SPR RAW scoreboard stall in real mode entry
Move SPR reads ahead of writes. Real mode entry that is not a KVM
guest is rare these days, but bad practice propagates.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:48 +10:00
Nicholas Piggin
b0b2a93da4 powerpc/64s/exception: clean up system call entry
syscall / hcall entry unnecessarily differs between KVM and non-KVM
builds. Move the SMT priority instruction to the same location
(after INTERRUPT_TO_KERNEL).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:48 +10:00
Nicholas Piggin
1582009113 powerpc/64s/exception: move paca save area offsets into exception-64s.S
No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:39:20 +10:00
Nicholas Piggin
d064151fd3 powerpc/64s/exception: remove pointless EXCEPTION_PROLOG macro indirection
No generated code change. Final vmlinux is changed only due to change
in bug table line numbers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 21:38:28 +10:00
Nicholas Piggin
f3c8b6c63e powerpc/64s/exception: generate regs clear instructions using .rept
No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:43 +10:00
Nicholas Piggin
bf66e3c4cf powerpc/64s/exception: fix indenting irregularities
Generally, macros that result in instructions being expanded are
indented by a tab, and those that don't have no indent. Fix the
obvious cases that go contrary to style.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:43 +10:00
Nicholas Piggin
1b4d4a7933 powerpc/64s/exception: use a gas macro for system call handler code
No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:43 +10:00
Nicholas Piggin
f945478d5c powerpc/64s/exception: remove unused BRANCH_TO_COMMON
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:43 +10:00
Nicholas Piggin
64e413515c powerpc/64s/exception: remove __BRANCH_TO_KVM
No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:42 +10:00
Nicholas Piggin
a0502434bb powerpc/64s/exception: move head-64.h code to exception-64s.S where it is used
No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:42 +10:00
Nicholas Piggin
12a0480990 powerpc/64s/exception: move exception-64s.h code to exception-64s.S where it is used
No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:42 +10:00
Nicholas Piggin
80bd9177de powerpc/64s/exception: improve 0x500 handler code
After the previous cleanup, it becomes possible to consolidate some
common code outside the runtime alternate patching. Also remove
unused labels.

This results in some code change, but unchanged runtime instruction
sequence.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:42 +10:00
Nicholas Piggin
fc557537f2 powerpc/64s/exception: unwind exception-64s.h macros
Many of these macros just specify 1-4 lines which are only called a
few times each at most, and often just once. Remove this indirection.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:42 +10:00
Nicholas Piggin
47169fba3a powerpc/64s/exception: Move EXCEPTION_COMMON additions into callers
More cases of code insertion via macros that does not add a great
deal. All the additions have to be specified in the macro arguments,
so they can just as well go after the macro.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:42 +10:00
Nicholas Piggin
c06075f3d3 powerpc/64s/exception: Move EXCEPTION_COMMON handler and return branches into callers
The aim is to reduce the amount of indirection it takes to get through
the exception handler macros, particularly where it provides little
code sharing.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:42 +10:00
Nicholas Piggin
5dba1d50ba powerpc/64s/exception: Make EXCEPTION_PROLOG_0 a gas macro for consistency with others
No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:42 +10:00
Nicholas Piggin
17bdc064a1 powerpc/64s/exception: merge KVM handler and skip variants
Conditionally expand the skip case if it is specified.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:42 +10:00
Nicholas Piggin
fa4cf6b703 powerpc/64s/exception: consolidate maskable and non-maskable prologs
Conditionally expand the soft-masking test if a mask is passed in.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:42 +10:00
Nicholas Piggin
a7c1ca19c2 powerpc/64s/exception: remove the "extra" macro parameter
Rather than pass in the soft-masking and KVM tests via macro that is
passed to another macro to expand it, switch to usig gas macros and
conditionally expand the soft-masking and KVM tests.

The system reset with its idle test is open coded as it is a one-off.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:41 +10:00
Nicholas Piggin
8f528359ef powerpc/64s/exception: fix sreset KVM test code
The sreset handler KVM test theoretically should not depend on P7.
In practice KVM now only supports P7 and up so no real bug fix, but
this change is made now so the quirk is not propagated through
cleanup patches.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:41 +10:00
Nicholas Piggin
2d046308d0 powerpc/64s/exception: move and tidy EXCEPTION_PROLOG_2 variants
- Re-name the macros to _REAL and _VIRT suffixes rather than no and
  _RELON suffix.

- Move the macro definitions together in the file.

- Move RELOCATABLE ifdef inside the _VIRT macro.

Further consolidation between variants does not buy much here.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:41 +10:00
Nicholas Piggin
bd7b6d1334 powerpc/64s/exception: consolidate EXCEPTION_PROLOG_2 with _NORI variant
Switch to a gas macro that conditionally expands the RI clearing
instruction.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:24:41 +10:00
Nicholas Piggin
4508a74a63 powerpc/64s/exception: remove H concatenation for EXC_HV variants
Replace all instances of this with gas macros that test the hsrr
parameter and use the appropriate register names / labels.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Remove extraneous 2nd check for 0xea0 in SOFTEN_TEST]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-02 20:23:51 +10:00
Qian Cai
3becd11dff powerpc/eeh_cache: fix a W=1 kernel-doc warning
The opening comment mark "/**" is reserved for kernel-doc comments, so
it will generate a warning with "make W=1".

arch/powerpc/kernel/eeh_cache.c:37: warning: cannot understand function
prototype: 'struct pci_io_addr_range

Since this is not a kernel-doc for the struct below, but rather an
overview of this source eeh_cache.c, just use the free-form comments
kernel-doc syntax instead.

Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-01 16:26:54 +10:00
Nathan Lynch
9fb603050f powerpc/rtas: retry when cpu offline races with suspend/migration
The protocol for suspending or migrating an LPAR requires all present
processor threads to enter H_JOIN. So if we have threads offline, we
have to temporarily bring them up. This can race with administrator
actions such as SMT state changes. As of dfd718a2ed ("powerpc/rtas:
Fix a potential race between CPU-Offline & Migration"),
rtas_ibm_suspend_me() accounts for this, but errors out with -EBUSY
for what almost certainly is a transient condition in any reasonable
scenario.

Callers of rtas_ibm_suspend_me() already retry when -EAGAIN is
returned, and it is typical during a migration for that to happen
repeatedly for several minutes polling the H_VASI_STATE hcall result
before proceeding to the next stage.

So return -EAGAIN instead of -EBUSY when this race is
encountered. Additionally: logging this event is still appropriate but
use pr_info instead of pr_err; and remove use of unlikely() while here
as this is not a hot path at all.

Fixes: dfd718a2ed ("powerpc/rtas: Fix a potential race between CPU-Offline & Migration")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-07-01 16:26:54 +10:00
Michael Ellerman
8b8dc69514 Merge branch 'fixes' into next
Merge our fixes branch into next, this brings in a number of commits
that fix bugs we don't want to hit in next, in particular the fix for
CVE-2019-12817.
2019-07-01 14:04:39 +10:00
Linus Torvalds
39132f746e powerpc fixes for 5.2 #7
One fix for a regression in my commit adding KUAP (Kernel User Access
 Prevention) on Radix, which incorrectly touched the AMR in the early machine
 check handler.
 
 Thanks to:
   Nicholas Piggin.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdF00aAAoJEFHr6jzI4aWAT2UQAJCnXrBsNJd7WikZE8NzwdmM
 G6bioGCSPgNuWDwaxgpi6RSilET3poBBt+NpttgOslZtzif/5mrLIuYqwQYgOTbL
 Oa4CnzVBHnBDFKqcqe/Sm7cKuvd7KO8RVbyfhNuQbm1y9Nqr3vPYKwQ6CTz7bth4
 AatNvjP12Ag8hDwk3VpOOiG88jKpj/N3V7PLNWOt9jn8B3rCWm5/7xZ84VSNWdRQ
 /MvdGAcFAboywZMj44u8mBpT7+EueFa/vVbpCj8gv9QhRSSGwSL1jZ5wNu2Iv6D+
 IxxZqdO3KHJVixEAC4fs5KWCuA84uhjlRMkP2BXTgKNZT3qXaLx0e8Qv9okg/xAU
 dAuZEQ0cv+gxdCblEiVZ+jjG0LQsntwXJwnsCeWjcHQr6S0umd2utFLl1N3HTqfx
 QhgatD5pTGvGU2WHO4+dhXeh0nITVfcB2E3cM0DHUgCESc1BGmK0MtS1kHYiQptt
 BMY5Y92D3vndmnoLTZzQ2DFj5of2u49+y0Cpti7RhJN9yV836bPGm1K8GnropHz8
 7HHYS4hV3HBFUlYH7zHLp4BMNg3nkdTK+WTR6HwFFSREzM59NZtVg5xJVk0j66GK
 mZIJoVOSQ0Sac03xYqwtdxdupxoulXy+khBcjC56OxxOEMIfjS66ZnawTDhI2jVf
 EI7VE3Y4hzrA4pMTw9fp
 =I22i
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.2-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:
 "One fix for a regression in my commit adding KUAP (Kernel User Access
  Prevention) on Radix, which incorrectly touched the AMR in the early
  machine check handler.

  Thanks to Nicholas Piggin"

* tag 'powerpc-5.2-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s/exception: Fix machine check early corrupting AMR
2019-06-30 11:20:52 +08:00
Christian Brauner
7615d9e178
arch: wire-up pidfd_open()
This wires up the pidfd_open() syscall into all arches at once.

Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-api@vger.kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linux-arch@vger.kernel.org
Cc: x86@kernel.org
2019-06-28 12:17:55 +02:00
Nicholas Piggin
e13e7cd4c0 powerpc/64s/exception: Fix machine check early corrupting AMR
The early machine check runs in real mode, so locking is unnecessary.
Worse, the windup does not restore AMR, so this can result in a false
KUAP fault after a recoverable machine check hits inside a user copy
operation.

Fix this similarly to HMI by just avoiding the kuap lock in the
early machine check handler (it will be set by the late handler that
runs in virtual mode if that runs). If the virtual mode handler is
reached, it will lock and restore the AMR.

Fixes: 890274c2dc ("powerpc/64s: Implement KUAP for Radix MMU")
Cc: Russell Currey <ruscur@russell.cc>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-25 21:04:27 +10:00
Nadav Amit
caa759323c smp: Remove smp_call_function() and on_each_cpu() return values
The return value is fixed. Remove it and amend the callers.

[ tglx: Fixup arm/bL_switcher and powerpc/rtas ]

Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20190613064813.8102-2-namit@vmware.com
2019-06-23 14:26:26 +02:00
Linus Torvalds
a8282bf087 powerpc fixes for 5.2 #5
Seven fixes, all for bugs introduced this cycle.
 
 The commit to add KASAN support broke booting on 32-bit SMP machines, due to a
 refactoring that moved some setup out of the secondary CPU path.
 
 A fix for another 32-bit SMP bug introduced by the fast syscall entry
 implementation for 32-bit BOOKE. And a build fix for the same commit.
 
 Our change to allow the DAWR to be force enabled on Power9 introduced a bug in
 KVM, where we clobber r3 leading to a host crash.
 
 The same commit also exposed a previously unreachable bug in the nested KVM
 handling of DAWR, which could lead to an oops in a nested host.
 
 One of the DMA reworks broke the b43legacy WiFi driver on some people's
 powermacs, fix it by enabling a 30-bit ZONE_DMA on 32-bit.
 
 A fix for TLB flushing in KVM introduced a new bug, as it neglected to also
 flush the ERAT, this could lead to memory corruption in the guest.
 
 Thanks to:
   Aaro Koskinen, Christoph Hellwig, Christophe Leroy, Larry Finger, Michael
   Neuling, Suraj Jitindar Singh.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdDg4YAAoJEFHr6jzI4aWACKYP/RG1cqYDjEWz0N9bxjAOanx6
 z//hPZqZrObEORx0mek07LNj6JDy4eL7CB9WaEudJjHt7mYugLYq0g7hUMVvBWnB
 irFEuzGJ8EgWl1aMbmz+fgf49PBIuroy2o/4pyzzQXoDaw44QyUaCke2VEBskQNG
 RW64C2rDVrPgpRHzBB9EZVNv7svmo6ERJsEpRvqP3PZG1ZxgXW+DXbEdSmJCcgAt
 8oI+z6frRv+0ez+nge7TULo8DuheShfxc7l0jFrd48i35v2qB/IowPr8cof9fRwM
 TqnB+3dZXHPKPz6J9mz80p9ZDe1omLzg6i9EiR2/7a3XGpRBo7kCg3Iri7N5pu0j
 LotK9l1+mXWLy5P6lOHH5/tEHv52Wqsvh5IetpNJ2tgXp3MzbOc1/Ut9h7Ag7cRw
 WRa7tNXQ5Ud8uPM1Pds8Ymhd+/nZ9RItjGcu6S095/OGpM1FJR9a0QnfUHMyfyuX
 kAGrJDWcAkCd/Q9tKHsQotuZAFmRCQe4JFkzTiGzwdjYWYgtTA1c/eIv3+SG7eLV
 1dsaIYzIS56b+Qz2Qc/pKHwho+I9o505Y7LFXxlCGXDDjyI72ioTQDwiSBzaZdc9
 ORwNchLfpXNpiNXRoRqAnqmhWxavYmA6oJ13RDBiMBxIUWHynVbEzLlX9fPNdBFj
 Kw3Zd15znokXBzU+1mDE
 =Ju1y
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "This is a frustratingly large batch at rc5. Some of these were sent
  earlier but were missed by me due to being distracted by other things,
  and some took a while to track down due to needing manual bisection on
  old hardware. But still we clearly need to improve our testing of KVM,
  and of 32-bit, so that we catch these earlier.

  Summary: seven fixes, all for bugs introduced this cycle.

   - The commit to add KASAN support broke booting on 32-bit SMP
     machines, due to a refactoring that moved some setup out of the
     secondary CPU path.

   - A fix for another 32-bit SMP bug introduced by the fast syscall
     entry implementation for 32-bit BOOKE. And a build fix for the same
     commit.

   - Our change to allow the DAWR to be force enabled on Power9
     introduced a bug in KVM, where we clobber r3 leading to a host
     crash.

   - The same commit also exposed a previously unreachable bug in the
     nested KVM handling of DAWR, which could lead to an oops in a
     nested host.

   - One of the DMA reworks broke the b43legacy WiFi driver on some
     people's powermacs, fix it by enabling a 30-bit ZONE_DMA on 32-bit.

   - A fix for TLB flushing in KVM introduced a new bug, as it neglected
     to also flush the ERAT, this could lead to memory corruption in the
     guest.

  Thanks to: Aaro Koskinen, Christoph Hellwig, Christophe Leroy, Larry
  Finger, Michael Neuling, Suraj Jitindar Singh"

* tag 'powerpc-5.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  KVM: PPC: Book3S HV: Invalidate ERAT when flushing guest TLB entries
  powerpc: enable a 30-bit ZONE_DMA for 32-bit pmac
  KVM: PPC: Book3S HV: Only write DAWR[X] when handling h_set_dawr in real mode
  KVM: PPC: Book3S HV: Fix r3 corruption in h_set_dabr()
  powerpc/32: fix build failure on book3e with KVM
  powerpc/booke: fix fast syscall entry on SMP
  powerpc/32s: fix initial setup of segment registers on secondary CPU
2019-06-22 09:09:42 -07:00
Alexey Kardashevskiy
df5be5be87 powerpc/pci/of: Fix OF flags parsing for 64bit BARs
When the firmware does PCI BAR resource allocation, it passes the assigned
addresses and flags (prefetch/64bit/...) via the "reg" property of
a PCI device device tree node so the kernel does not need to do
resource allocation.

The flags are stored in resource::flags - the lower byte stores
PCI_BASE_ADDRESS_SPACE/etc bits and the other bytes are IORESOURCE_IO/etc.
Some flags from PCI_BASE_ADDRESS_xxx and IORESOURCE_xxx are duplicated,
such as PCI_BASE_ADDRESS_MEM_PREFETCH/PCI_BASE_ADDRESS_MEM_TYPE_64/etc.
When parsing the "reg" property, we copy the prefetch flag but we skip
on PCI_BASE_ADDRESS_MEM_TYPE_64 which leaves the flags out of sync.

The missing IORESOURCE_MEM_64 flag comes into play under 2 conditions:
1. we remove PCI_PROBE_ONLY for pseries (by hacking pSeries_setup_arch()
or by passing "/chosen/linux,pci-probe-only");
2. we request resource alignment (by passing pci=resource_alignment=
via the kernel cmd line to request PAGE_SIZE alignment or defining
ppc_md.pcibios_default_alignment which returns anything but 0). Note that
the alignment requests are ignored if PCI_PROBE_ONLY is enabled.

With 1) and 2), the generic PCI code in the kernel unconditionally
decides to:
- reassign the BARs in pci_specified_resource_alignment() (works fine)
- write new BARs to the device - this fails for 64bit BARs as the generic
code looks at IORESOURCE_MEM_64 (not set) and writes only lower 32bits
of the BAR and leaves the upper 32bit unmodified which breaks BAR mapping
in the hypervisor.

This fixes the issue by copying the flag. This is useful if we want to
enforce certain BAR alignment per platform as handling subpage sized BARs
is proven to cause problems with hotplug (SLOF already aligns BARs to 64k).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Shawn Anastasio <shawn@anastas.io>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-20 15:13:12 +10:00
Thomas Gleixner
7f904d7e1f treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 505
Based on 1 normalized pattern(s):

  gplv2

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 58 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081207.556988620@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:11:22 +02:00
Thomas Gleixner
d2912cb15b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:55 +02:00
Thomas Gleixner
40b0b3f8fb treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 230
Based on 2 normalized pattern(s):

  this source code is licensed under the gnu general public license
  version 2 see the file copying for more details

  this source code is licensed under general public license version 2
  see

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 52 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.449021192@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:06 +02:00
Ravi Bangoria
f474c28fbc powerpc/watchpoint: Restore NV GPRs while returning from exception
powerpc hardware triggers watchpoint before executing the instruction.
To make trigger-after-execute behavior, kernel emulates the
instruction. If the instruction is 'load something into non-volatile
register', exception handler should restore emulated register state
while returning back, otherwise there will be register state
corruption. eg, adding a watchpoint on a list can corrput the list:

  # cat /proc/kallsyms | grep kthread_create_list
  c00000000121c8b8 d kthread_create_list

Add watchpoint on kthread_create_list->prev:

  # perf record -e mem:0xc00000000121c8c0

Run some workload such that new kthread gets invoked. eg, I just
logged out from console:

  list_add corruption. next->prev should be prev (c000000001214e00), \
	but was c00000000121c8b8. (next=c00000000121c8b8).
  WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0
  CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69
  ...
  NIP __list_add_valid+0xb4/0xc0
  LR __list_add_valid+0xb0/0xc0
  Call Trace:
  __list_add_valid+0xb0/0xc0 (unreliable)
  __kthread_create_on_node+0xe0/0x260
  kthread_create_on_node+0x34/0x50
  create_worker+0xe8/0x260
  worker_thread+0x444/0x560
  kthread+0x160/0x1a0
  ret_from_kernel_thread+0x5c/0x70

List corruption happened because it uses 'load into non-volatile
register' instruction:

Snippet from __kthread_create_on_node:

  c000000000136be8:     addis   r29,r2,-19
  c000000000136bec:     ld      r29,31424(r29)
        if (!__list_add_valid(new, prev, next))
  c000000000136bf0:     mr      r3,r30
  c000000000136bf4:     mr      r5,r28
  c000000000136bf8:     mr      r4,r29
  c000000000136bfc:     bl      c00000000059a2f8 <__list_add_valid+0x8>

Register state from WARN_ON():

  GPR00: c00000000059a3a0 c000007ff23afb50 c000000001344e00 0000000000000075
  GPR04: 0000000000000000 0000000000000000 0000001852af8bc1 0000000000000000
  GPR08: 0000000000000001 0000000000000007 0000000000000006 00000000000004aa
  GPR12: 0000000000000000 c000007ffffeb080 c000000000137038 c000005ff62aaa00
  GPR16: 0000000000000000 0000000000000000 c000007fffbe7600 c000007fffbe7370
  GPR20: c000007fffbe7320 c000007fffbe7300 c000000001373a00 0000000000000000
  GPR24: fffffffffffffef7 c00000000012e320 c000007ff23afcb0 c000000000cb8628
  GPR28: c00000000121c8b8 c000000001214e00 c000007fef5b17e8 c000007fef5b17c0

Watchpoint hit at 0xc000000000136bec.

  addis   r29,r2,-19
   => r29 = 0xc000000001344e00 + (-19 << 16)
   => r29 = 0xc000000001214e00

  ld      r29,31424(r29)
   => r29 = *(0xc000000001214e00 + 31424)
   => r29 = *(0xc00000000121c8c0)

0xc00000000121c8c0 is where we placed a watchpoint and thus this
instruction was emulated by emulate_step. But because handle_dabr_fault
did not restore emulated register state, r29 still contains stale
value in above register state.

Fixes: 5aae8a5370 ("powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: stable@vger.kernel.org # 2.6.36+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-19 20:05:08 +10:00
Christophe Leroy
6ecb78ef56 powerpc/32s: fix suspend/resume when IBATs 4-7 are used
Previously, only IBAT1 and IBAT2 were used to map kernel linear mem.
Since commit 63b2bc6195 ("powerpc/mm/32s: Use BATs for
STRICT_KERNEL_RWX"), we may have all 8 BATs used for mapping
kernel text. But the suspend/restore functions only save/restore
BATs 0 to 3, and clears BATs 4 to 7.

Make suspend and restore functions respectively save and reload
the 8 BATs on CPUs having MMU_FTR_USE_HIGH_BATS feature.

Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-19 20:05:07 +10:00
Linus Torvalds
fa1827d773 powerpc fixes for 5.2 #4
One fix for a regression introduced by our 32-bit KASAN support, which broke
 booting on machines with "bootx" early debugging enabled.
 
 A fix for a bug which broke kexec on 32-bit, introduced by changes to the 32-bit
 STRICT_KERNEL_RWX support in v5.1.
 
 Finally two fixes going to stable for our THP split/collapse handling,
 discovered by Nick. The first fixes random crashes and/or corruption in guests
 under sufficient load.
 
 Thanks to:
   Nicholas Piggin, Christophe Leroy, Aaro Koskinen, Mathieu Malaterre.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdBOivAAoJEFHr6jzI4aWA/T0P/1pmr4JLWzXOPQOGOwS2mmu5
 HhXBFuDflpGiWS31syOhfKhiE2qlwdIcGaclSo1wAgUnMKp+sxVEagF6DEt484r3
 DXs3eRyrGu5vQT7Q6yReuT3Kw2ZR474a5ob00WGAQosBKyJF4ZHWz16ETVWMdAMQ
 TknEEU3hOUnMWWIEvLnZOKT7eJcmzj5IYy1OtLHBjWiHHizGC8PSxdVhiRcD/O6R
 F6C7XrFb7RRj5ran6gxwMbcTvjgu922TSQPOCw93qnXYWLfvWDUXC4yCqY21oHnr
 b3zgJNgIdSoMYxE8pfOH7Y+eaJrbzgnhlS5OJNEz/4NOfGQnJXYcSF8QO6eeVoKM
 L2SkT1Ov+QMmZQjMC5e9OAe7DFHfM59RYFg11eaUqfiaObRsmwu8rqjngITV5Ede
 Ydq3W39XQkjB3aQ4qb0MnEBbVgyQ/y6/T5hoHRlvnb5byFk5Pd3jpub56sq87UGM
 M4GD3YdmT5eBqzGxApddyIiS839PZpdw7g/Ivtp2GYCtiNNZcJqaXvN7IwfcVF3c
 YJrCfhNTUJPjICIL9k9v2RQovGuGZqtM3BHc8PmyVvDxTqtEbh8B9GEg+QC58xp/
 s+UI2sEd/Vg6NVKluIJHau7aEmJlaxYIshqsIhYvPgJRN6KrFMhdfPNx7D+zMlu8
 nbM6Z1n2VFk9Cxb0vA9K
 =MXJI
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "One fix for a regression introduced by our 32-bit KASAN support, which
  broke booting on machines with "bootx" early debugging enabled.

  A fix for a bug which broke kexec on 32-bit, introduced by changes to
  the 32-bit STRICT_KERNEL_RWX support in v5.1.

  Finally two fixes going to stable for our THP split/collapse handling,
  discovered by Nick. The first fixes random crashes and/or corruption
  in guests under sufficient load.

  Thanks to: Nicholas Piggin, Christophe Leroy, Aaro Koskinen, Mathieu
  Malaterre"

* tag 'powerpc-5.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/32s: fix booting with CONFIG_PPC_EARLY_DEBUG_BOOTX
  powerpc/64s: __find_linux_pte() synchronization vs pmdp_invalidate()
  powerpc/64s: Fix THP PMD collapse serialisation
  powerpc: Fix kexec failure on book3s/32
2019-06-15 07:29:32 -10:00
Christophe Leroy
82f6e266f8 powerpc/32: fix build failure on book3e with KVM
Build failure was introduced by the commit identified below,
due to missed macro expension leading to wrong called function's name.

arch/powerpc/kernel/head_fsl_booke.o: In function `SystemCall':
arch/powerpc/kernel/head_fsl_booke.S:416: undefined reference to `kvmppc_handler_BOOKE_INTERRUPT_SYSCALL_SPRN_SRR1'
Makefile:1052: recipe for target 'vmlinux' failed

The called function should be kvmppc_handler_8_0x01B(). This patch fixes it.

Reported-by: Paul Mackerras <paulus@ozlabs.org>
Fixes: 1a4b739bbb ("powerpc/32: implement fast entry for syscalls on BOOKE")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-16 00:03:38 +10:00
Christophe Leroy
9c4e4c90ec powerpc/64: mark start_here_multiplatform as __ref
Otherwise, the following warning is encountered:

WARNING: vmlinux.o(.text+0x3dc6): Section mismatch in reference from the variable start_here_multiplatform to the function .init.text:.early_setup()
The function start_here_multiplatform() references
the function __init .early_setup().
This is often because start_here_multiplatform lacks a __init
annotation or the annotation of .early_setup is wrong.

Fixes: 56c46bba9b ("powerpc/64: Fix booting large kernels with STRICT_KERNEL_RWX")
Cc: Russell Currey <ruscur@russell.cc>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-16 00:00:30 +10:00
Christophe Leroy
e8732ffa2e powerpc/booke: fix fast syscall entry on SMP
Use r10 instead of r9 to calculate CPU offset as r9 contains
the value from SRR1 which is used later.

Fixes: 1a4b739bbb ("powerpc/32: implement fast entry for syscalls on BOOKE")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-15 23:44:41 +10:00
Christophe Leroy
b7f8b440f3 powerpc/32s: fix initial setup of segment registers on secondary CPU
The patch referenced below moved the loading of segment registers
out of load_up_mmu() in order to do it earlier in the boot sequence.
However, the secondary CPU still needs it to be done when loading up
the MMU.

Reported-by: Erhard F. <erhard_f@mailbox.org>
Fixes: 215b823707 ("powerpc/32s: set up an early static hash table for KASAN")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-15 23:43:54 +10:00
Nathan Lynch
d4aa219a07 powerpc/cacheinfo: add cacheinfo_teardown, cacheinfo_rebuild
Allow external callers to force the cacheinfo code to release all its
references to cache nodes, e.g. before processing device tree updates
post-migration, and to rebuild the hierarchy afterward.

CPU online/offline must be blocked by callers; enforce this.

Fixes: 410bccf978 ("powerpc/pseries: Partition migration in the kernel")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-15 16:52:06 +10:00
Mathieu Malaterre
1ec0cd8286 PM: hibernate: powerpc: Expose pfn_is_nosave() prototype
The declaration for pfn_is_nosave is only available in
kernel/power/power.h. Since this function can be override in arch,
expose it globally. Having a prototype will make sure to avoid warning
(sometime treated as error with W=1) such as:

  arch/powerpc/kernel/suspend.c:18:5: error: no previous prototype for 'pfn_is_nosave' [-Werror=missing-prototypes]

This moves the declaration into a globally visible header file and add
missing include to avoid a warning on powerpc.

Also remove the duplicated prototypes since not required anymore.

Signed-off-by: Mathieu Malaterre <malat@debian.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-06-14 10:48:56 +02:00
Christophe Leroy
c21f5a9ed8 powerpc/32s: fix booting with CONFIG_PPC_EARLY_DEBUG_BOOTX
When booting through OF, setup_disp_bat() does nothing because
disp_BAT are not set. By change, it used to work because BOOTX
buffer is mapped 1:1 at address 0x81000000 by the bootloader, and
btext_setup_display() sets virt addr same as phys addr.

But since commit 215b823707 ("powerpc/32s: set up an early static
hash table for KASAN."), a temporary page table overrides the
bootloader mapping.

This 0x81000000 is also problematic with the newly implemented
Kernel Userspace Access Protection (KUAP) because it is within user
address space.

This patch fixes those issues by properly setting disp_BAT through
a call to btext_prepare_BAT(), allowing setup_disp_bat() to
properly setup BAT3 for early bootx screen buffer access.

Reported-by: Mathieu Malaterre <malat@debian.org>
Fixes: 215b823707 ("powerpc/32s: set up an early static hash table for KASAN.")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Tested-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-07 19:00:14 +10:00
Christophe Leroy
6c284228eb powerpc: Fix kexec failure on book3s/32
In the old days, _PAGE_EXEC didn't exist on 6xx aka book3s/32.
Therefore, allthough __mapin_ram_chunk() was already mapping kernel
text with PAGE_KERNEL_TEXT and the rest with PAGE_KERNEL, the entire
memory was executable. Part of the memory (first 512kbytes) was
mapped with BATs instead of page table, but it was also entirely
mapped as executable.

In commit 385e89d5b2 ("powerpc/mm: add exec protection on
powerpc 603"), we started adding exec protection to some 6xx, namely
the 603, for pages mapped via pagetables.

Then, in commit 63b2bc6195 ("powerpc/mm/32s: Use BATs for
STRICT_KERNEL_RWX"), the exec protection was extended to BAT mapped
memory, so that really only the kernel text could be executed.

The problem here is that kexec is based on copying some code into
upper part of memory then executing it from there in order to install
a fresh new kernel at its definitive location.

However, the code is position independant and first part of it is
just there to deactivate the MMU and jump to the second part. So it
is possible to run this first part inplace instead of running the
copy. Once the MMU is off, there is no protection anymore and the
second part of the code will just run as before.

Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Fixes: 63b2bc6195 ("powerpc/mm/32s: Use BATs for STRICT_KERNEL_RWX")
Cc: stable@vger.kernel.org # v5.1+
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-07 16:24:47 +10:00
Sudeep Holla
15532fd6f5 ptrace: move clearing of TIF_SYSCALL_EMU flag to core
While the TIF_SYSCALL_EMU is set in ptrace_resume independent of any
architecture, currently only powerpc and x86 unset the TIF_SYSCALL_EMU
flag in ptrace_disable which gets called from ptrace_detach.

Let's move the clearing of TIF_SYSCALL_EMU flag to __ptrace_unlink
which gets executed from ptrace_detach and also keep it along with
or close to clearing of TIF_SYSCALL_TRACE.

Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-05 17:51:17 +01:00
Thomas Gleixner
767a67b0b3 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 430
Based on 1 normalized pattern(s):

  distribute under gplv2

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 8 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190114.475576622@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:16 +02:00
Thomas Gleixner
4505153954 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not write to the free
  software foundation inc 59 temple place suite 330 boston ma 02111
  1307 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 136 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190530000436.384967451@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:06 +02:00
Thomas Gleixner
8e8e69d67e treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 285
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation version 2 of the license this program
  is distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 100 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141900.918357685@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:37 +02:00
Thomas Gleixner
d94d71cb45 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 266
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not write to the free
  software foundation 51 franklin street fifth floor boston ma 02110
  1301 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 67 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141333.953658117@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:30:28 +02:00
Linus Torvalds
460b48a0fe powerpc fixes for 5.2 #3
A minor fix to our IMC PMU code to print a less confusing error message when the
 driver can't initialise properly.
 
 A fix for a bug where a user requesting an unsupported branch sampling filter
 can corrupt PMU state, preventing the PMU from counting properly.
 
 And finally a fix for a bug in our support for kexec_file_load(), which
 prevented loading a kernel and initramfs. Most versions of kexec don't yet use
 kexec_file_load().
 
 Thanks to:
   Anju T Sudhakar, Dave Young, Madhavan Srinivasan, Ravi Bangoria, Thiago Jung
   Bauermann.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJc86afAAoJEFHr6jzI4aWAIC8P/R/1Y8ro94tmu58MvqNmG80F
 Rc/+/1dY023X/itMHikSzPzs1UnQ1sjSKOOk3WOJcgVBE6Ks9Q7peeWD6wnSwYUq
 PwvXAZNmsD0Bgh9Oazj7M4dMSZHq7zJzQ8WzopltVueKgx941BLwYPRZG5zFXfcV
 k2Zaln87wfOuyfl/gQyWuG2ikziHJl9EvS3IIFejAnnogkFMti/FAe5bYXP3IT4W
 c4whLQJMNLeaeA7zeFBUvotWCZVfCq+Li9QTKWYJIPIZqN3kSBP3wHuUC8wEFuXa
 32+qu7RW5NfbuJERtCJYXqjO9C9A/QgOAm2n0LF2vBzo5CwBIqiR69PRuVzzUekU
 NhYsHbbwviRl1By7pxUCNm48TW+3utm903etEp+PPV3FYdG+E2hlhs8NoUEdiN5X
 PWzEMqUqneN6+HnwpEeeiDQef19XTaGr3PUd8CtJYJQ4rSeyvnBASjGKfVuBW6FL
 85aCwsmAYdapJOyhK5OSaDEWwMnpLBClJKqNrdH7y9fFm2lFHSJSNxHERBtSQcfx
 Ra+F7M0tyPRJFmVfXiX8u2ZYvYBLUIYUV1+mJHokEFOR9tW+Oy1lV3UMlXzWZXqb
 CY3qpCk0ix1rLjv8VpDoVQE2LkMhhUGqsgymSFueIwsrQhfhb/ymiVmaEH202tTM
 DyvBWmVuI9vyqR3Vt9GQ
 =ilhV
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "A minor fix to our IMC PMU code to print a less confusing error
  message when the driver can't initialise properly.

  A fix for a bug where a user requesting an unsupported branch sampling
  filter can corrupt PMU state, preventing the PMU from counting
  properly.

  And finally a fix for a bug in our support for kexec_file_load(),
  which prevented loading a kernel and initramfs. Most versions of kexec
  don't yet use kexec_file_load().

  Thanks to: Anju T Sudhakar, Dave Young, Madhavan Srinivasan, Ravi
  Bangoria, Thiago Jung Bauermann"

* tag 'powerpc-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/kexec: Fix loading of kernel + initramfs with kexec_file_load()
  powerpc/perf: Fix MMCRA corruption by bhrb_filter
  powerpc/powernv: Return for invalid IMC domain
2019-06-02 10:21:04 -07:00
Greg Kurz
a3bf9fbdad powerpc/pseries: Fix xive=off command line
On POWER9, if the hypervisor supports XIVE exploitation mode, the
guest OS will unconditionally requests for the XIVE interrupt mode
even if XIVE was deactivated with the kernel command line xive=off.
Later on, when the spapr XIVE init code handles xive=off, it disables
XIVE and tries to fall back on the legacy mode XICS.

This discrepency causes a kernel panic because the hypervisor is
configured to provide the XIVE interrupt mode to the guest :

  kernel BUG at arch/powerpc/sysdev/xics/xics-common.c:135!
  ...
  NIP xics_smp_probe+0x38/0x98
  LR  xics_smp_probe+0x2c/0x98
  Call Trace:
    xics_smp_probe+0x2c/0x98 (unreliable)
    pSeries_smp_probe+0x40/0xa0
    smp_prepare_cpus+0x62c/0x6ec
    kernel_init_freeable+0x148/0x448
    kernel_init+0x2c/0x148
    ret_from_kernel_thread+0x5c/0x68

Look for xive=off during prom_init and don't ask for XIVE in this
case. One exception though: if the host only supports XIVE, we still
want to boot so we ignore xive=off.

Similarly, have the spapr XIVE init code to looking at the interrupt
mode negotiated during CAS, and ignore xive=off if the hypervisor only
supports XIVE.

Fixes: eac1e731b5 ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
Cc: stable@vger.kernel.org # v4.20
Reported-by: Pavithra R. Prakash <pavrampu@in.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-02 19:39:36 +10:00
Mathieu Malaterre
c806a6fde1 powerpc: Remove variable ‘path’ since not used
In commit eab00a208e ("powerpc: Move `path` variable inside
DEBUG_PROM") DEBUG_PROM sentinels were added to silence a warning
(treated as error with W=1):

  arch/powerpc/kernel/prom_init.c:1388:8: error: variable ‘path’ set but not used [-Werror=unused-but-set-variable]

Rework the original patch and simplify the code, by removing the
variable ‘path’ completely. Fix line over 90 characters.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-02 19:39:36 +10:00
Thomas Gleixner
f50a7f3d92 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 191
Based on 1 normalized pattern(s):

  licensed under gplv2

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 99 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190528170027.163048684@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:29:21 -07:00
Thomas Gleixner
1a59d1b8e0 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details you
  should have received a copy of the gnu general public license along
  with this program if not write to the free software foundation inc
  59 temple place suite 330 boston ma 02111 1307 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 1334 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:35 -07:00
Thomas Gleixner
2874c5fd28 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:32 -07:00
Eric W. Biederman
2e1661d267 signal: Remove the task parameter from force_sig_fault
As synchronous exceptions really only make sense against the current
task (otherwise how are you synchronous) remove the task parameter
from from force_sig_fault to make it explicit that is what is going
on.

The two known exceptions that deliver a synchronous exception to a
stopped ptraced task have already been changed to
force_sig_fault_to_task.

The callers have been changed with the following emacs regular expression
(with obvious variations on the architectures that take more arguments)
to avoid typos:

force_sig_fault[(]\([^,]+\)[,]\([^,]+\)[,]\([^,]+\)[,]\W+current[)]
->
force_sig_fault(\1,\2,\3)

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2019-05-29 09:31:43 -05:00
Eric W. Biederman
3cf5d076fb signal: Remove task parameter from force_sig
All of the remaining callers pass current into force_sig so
remove the task parameter to make this obvious and to make
misuse more difficult in the future.

This also makes it clear force_sig passes current into force_sig_info.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2019-05-27 09:36:28 -05:00
Thomas Gleixner
74ba9207e1 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 61
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details you
  should have received a copy of the gnu general public license along
  with this program if not write to the free software foundation inc
  675 mass ave cambridge ma 02139 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 441 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190520071858.739733335@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24 17:36:45 +02:00
Thiago Jung Bauermann
8b909e3548 powerpc/kexec: Fix loading of kernel + initramfs with kexec_file_load()
Commit b6664ba42f ("s390, kexec_file: drop arch_kexec_mem_walk()")
changed kexec_add_buffer() to skip searching for a memory location if
kexec_buf.mem is already set, and use the address that is there.

In powerpc code we reuse a kexec_buf variable for loading both the
kernel and the initramfs by resetting some of the fields between those
uses, but not mem. This causes kexec_add_buffer() to try to load the
kernel at the same address where initramfs will be loaded, which is
naturally rejected:

  # kexec -s -l --initrd initramfs vmlinuz
  kexec_file_load failed: Invalid argument

Setting the mem field before every call to kexec_add_buffer() fixes
this regression.

Fixes: b6664ba42f ("s390, kexec_file: drop arch_kexec_mem_walk()")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-23 14:00:32 +10:00
Thomas Gleixner
457c899653 treewide: Add SPDX license identifier for missed files
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:45 +02:00
Linus Torvalds
cb6f8739fb Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:
 "A few final bits:

   - large changes to vmalloc, yielding large performance benefits

   - tweak the console-flush-on-panic code

   - a few fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  panic: add an option to replay all the printk message in buffer
  initramfs: don't free a non-existent initrd
  fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount
  mm/compaction.c: correct zone boundary handling when isolating pages from a pageblock
  mm/vmap: add DEBUG_AUGMENT_LOWEST_MATCH_CHECK macro
  mm/vmap: add DEBUG_AUGMENT_PROPAGATE_CHECK macro
  mm/vmalloc.c: keep track of free blocks for vmap allocation
2019-05-19 12:15:32 -07:00
Linus Torvalds
86a78a8b8d powerpc fixes for 5.2 #2
One fix going back to stable, for a bug on 32-bit introduced when we added
 support for THREAD_INFO_IN_TASK.
 
 A fix for a typo in a recent rework of our hugetlb code that leads to crashes on
 64-bit when using hugetlbfs with a 4K PAGE_SIZE.
 
 Two fixes for our recent rework of the address layout on 64-bit hash CPUs, both
 only triggered when userspace tries to access addresses outside the user or
 kernel address ranges.
 
 Finally a fix for a recently introduced double free in an error path in our
 cacheinfo code.
 
 Thanks to:
   Aneesh Kumar K.V, Christophe Leroy, Sachin Sant, Tobin C. Harding.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJc3+WRAAoJEFHr6jzI4aWAlN4QAIP7UnQUApYzQX8YrMJEiip7
 1Jtez373pgDEX21McmaznyYRy04gtLWVRV0D5vRsTCG5RHBOVt2KPXe0PrzyjJws
 /AxS9aRuMgM+VkLkD9c6DzWsuBLP8kJMfuTbmY7+C0tByQQT9Xfp+K7/gC5kGlK4
 igyTGZvALlEVilsjTQO/FhADWisYIHM+y2/e0ocxLlK5F+TxdJKpESyUzjMOT78i
 SmAn1qufZedV7ZH91VmvLm8f8MSqg+t1NTwpAnXuS58z2eSNisRhUcP43K4XdAez
 QTGODTgUGGzLsbnIq8KJB/X4YVLwtTR2UM3p6ubTwlDuVkBfrFPUHowywDmxoPHZ
 Kh6KWlRFAL49sGmYMJpfkaEiyN+J+VLN2HMdNLW2HgnROzDSOXCXcjuAuBE9nAKe
 kvMV7Zwb4mQ0j0QUxHuugbBRzUpAcDg+PNZKKb9P/jIMWlMhGyb9fjGxC1yJfDtB
 Kq03zJ+0lzvDCr1XbcDhSv4Fu6Dxxfi7GX9BXlzHTzsGFwZKICj9sLXDXXjFh4h8
 CDNAeWtol1WE2xj315LL6WTlONUMQJ3wDAzd5VJw7SewadDzpK57vuwba88IN9hd
 2PmH7FF8VyrksKmqt19ZsF0X7n7JL3+xbOcV9OT0r0DjkkM1fC6zo+JT6YYIER6A
 FE+pFQRQJndKRzHqvyBJ
 =2265
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "One fix going back to stable, for a bug on 32-bit introduced when we
  added support for THREAD_INFO_IN_TASK.

  A fix for a typo in a recent rework of our hugetlb code that leads to
  crashes on 64-bit when using hugetlbfs with a 4K PAGE_SIZE.

  Two fixes for our recent rework of the address layout on 64-bit hash
  CPUs, both only triggered when userspace tries to access addresses
  outside the user or kernel address ranges.

  Finally a fix for a recently introduced double free in an error path
  in our cacheinfo code.

  Thanks to: Aneesh Kumar K.V, Christophe Leroy, Sachin Sant, Tobin C.
  Harding"

* tag 'powerpc-5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/cacheinfo: Remove double free
  powerpc/mm/hash: Fix get_region_id() for invalid addresses
  powerpc/mm: Drop VM_BUG_ON in get_region_id()
  powerpc/mm: Fix crashes with hugepages & 4K pages
  powerpc/32s: fix flush_hash_pages() on SMP
2019-05-19 10:10:15 -07:00
Feng Tang
de6da1e8bc panic: add an option to replay all the printk message in buffer
Currently on panic, kernel will lower the loglevel and print out pending
printk msg only with console_flush_on_panic().

Add an option for users to configure the "panic_print" to replay all
dmesg in buffer, some of which they may have never seen due to the
loglevel setting, which will help panic debugging .

[feng.tang@intel.com: keep the original console_flush_on_panic() inside panic()]
  Link: http://lkml.kernel.org/r/1556199137-14163-1-git-send-email-feng.tang@intel.com
[feng.tang@intel.com: use logbuf lock to protect the console log index]
  Link: http://lkml.kernel.org/r/1556269868-22654-1-git-send-email-feng.tang@intel.com
Link: http://lkml.kernel.org/r/1556095872-36838-1-git-send-email-feng.tang@intel.com
Signed-off-by: Feng Tang <feng.tang@intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Aaro Koskinen <aaro.koskinen@nokia.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-18 15:52:26 -07:00
Linus Torvalds
bf8a9a4755 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs mount updates from Al Viro:
 "Propagation of new syscalls to other architectures + cosmetic change
  from Christian (fscontext didn't follow the convention for anon inode
  names)"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]
  uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2]
  uapi, fsopen: use square brackets around "fscontext" [ver #2]
2019-05-17 09:46:31 -07:00
Tobin C. Harding
672eaf37db powerpc/cacheinfo: Remove double free
kfree() after kobject_put(). Who ever wrote this was on crack.

Fixes: 7e8039795a ("powerpc/cacheinfo: Fix kobject memleak")
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-17 23:28:00 +10:00
David Howells
d8076bdb56 uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]
Wire up the mount API syscalls on non-x86 arches.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-05-16 12:23:45 -04:00
Sinan Kaya
efb463cc16 powerpc: replace CONFIG_DEBUG_KERNEL with CONFIG_DEBUG_MISC
CONFIG_DEBUG_KERNEL should not impact code generation.  Use the newly
defined CONFIG_DEBUG_MISC instead to keep the current code.

Link: http://lkml.kernel.org/r/20190413224438.10802-3-okaya@kernel.org
Signed-off-by: Sinan Kaya <okaya@kernel.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc:  Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Florian Westphal <fw@strlen.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 19:52:50 -07:00
Masahiro Yamada
480795a095 powerpc/prom_init: mark prom_getprop() and prom_getproplen() as __init
This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
place.  We need to eliminate potential issues beforehand.

If it is enabled for powerpc, the following modpost warnings are
reported:

  WARNING: vmlinux.o(.text.unlikely+0x20): Section mismatch in reference from the function .prom_getprop() to the function .init.text:.call_prom()
  The function .prom_getprop() references the function __init .call_prom().
  This is often because .prom_getprop lacks a __init annotation or the annotation of .call_prom is wrong.

  WARNING: vmlinux.o(.text.unlikely+0x3c): Section mismatch in reference from the function .prom_getproplen() to the function .init.text:.call_prom()
  The function .prom_getproplen() references the function __init .call_prom().
  This is often because .prom_getproplen lacks a __init annotation or the annotation of .call_prom is wrong.

Link: http://lkml.kernel.org/r/20190423034959.13525-9-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Brezillon <bbrezillon@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Norris <computersforpeace@gmail.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Marek Vasut <marek.vasut@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 19:52:48 -07:00
Linus Torvalds
b970afcfca powerpc updates for 5.2
Highlights:
 
  - Support for Kernel Userspace Access/Execution Prevention (like
    SMAP/SMEP/PAN/PXN) on some 64-bit and 32-bit CPUs. This prevents the kernel
    from accidentally accessing userspace outside copy_to/from_user(), or
    ever executing userspace.
 
  - KASAN support on 32-bit.
 
  - Rework of where we map the kernel, vmalloc, etc. on 64-bit hash to use the
    same address ranges we use with the Radix MMU.
 
  - A rewrite into C of large parts of our idle handling code for 64-bit Book3S
    (ie. power8 & power9).
 
  - A fast path entry for syscalls on 32-bit CPUs, for a 12-17% speedup in the
    null_syscall benchmark.
 
  - On 64-bit bare metal we have support for recovering from errors with the time
    base (our clocksource), however if that fails currently we hang in __delay()
    and never crash. We now have support for detecting that case and short
    circuiting __delay() so we at least panic() and reboot.
 
  - Add support for optionally enabling the DAWR on Power9, which had to be
    disabled by default due to a hardware erratum. This has the effect of
    enabling hardware breakpoints for GDB, the downside is a badly behaved
    program could crash the machine by pointing the DAWR at cache inhibited
    memory. This is opt-in obviously.
 
  - xmon, our crash handler, gets support for a read only mode where operations
    that could change memory or otherwise disturb the system are disabled.
 
 Plus many clean-ups, reworks and minor fixes etc.
 
 Thanks to:
   Christophe Leroy, Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Andrew
   Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Ben Hutchings,
   Bo YU, Breno Leitao, Cédric Le Goater, Christopher M. Riedl, Christoph
   Hellwig, Colin Ian King, David Gibson, Ganesh Goudar, Gautham R. Shenoy,
   George Spelvin, Greg Kroah-Hartman, Greg Kurz, Horia Geantă, Jagadeesh
   Pagadala, Joel Stanley, Joe Perches, Julia Lawall, Laurentiu Tudor, Laurent
   Vivier, Lukas Bulwahn, Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu
   Malaterre, Michael Neuling, Mukesh Ojha, Nathan Fontenot, Nathan Lynch,
   Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Peng Hao, Qian Cai, Ravi
   Bangoria, Rick Lindsley, Russell Currey, Sachin Sant, Stewart Smith, Sukadev
   Bhattiprolu, Thomas Huth, Tobin C. Harding, Tyrel Datwyler, Valentin
   Schneider, Wei Yongjun, Wen Yang, YueHaibing.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJc1WbwAAoJEFHr6jzI4aWAv5cP/iDskai4Az/GCa6yLj4b+det
 7mc7tTOaEzhUtvfrYYfHgvvdNNzo1ETv7rqTdZqtWJ3xfwdeowLFXXZwSywZKUDB
 bi4pcl2v55Qlf9kxgx9RDr6+4fTwGG4nhO2qPDJDR1umEih9mG/2HJ7d+Wnq6Va2
 E9srd+R6Fa0ty88+9vzBtdyllnDK1XHu3ahsxCH62aRm79ucuVrxyydWmbbs5lJe
 a7g/OQIPgZmObHhfXvw9DFkOvkp5Pm6hfHOeyQH2nTB5X6k0judWv00uoHTJgOuP
 DKxZtDhaGnajUfuhQYboDPOuFjY7lkfgEXaagyZsjdudqridTMmv1iU1o7iy8BT4
 AId4DyJbvFFgqRJkCwKzhKRRHPfFMfM7KTJ38GPZuPmniuULk9uiIy6JyY0tXO+l
 UQEclPzOTPkAE12FBaOBuqZqTRuBQuokWQF8ZDPOxbNAixHgFoRd4Z9diNwCPpLu
 +KoyCwd2Gm5DyX+mC85sWG28IPKi9Hhhw2XBOA5F4A2kH6uFa1BnERSRGYomx+pc
 BvEXHglf/vgV0XUQZfDCsiOecIKYuWxgre0/liLhhU5qMss2pxHczzffH4KtdykS
 9y7o3mVRcS7Moitbmb6SAJoQxbR5QhzfN832DbSd6jEfKdg1ytZlfHTG0WZYHKDs
 PHs6V1N+cQANdukutrJz
 =cUkd
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.2-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Slightly delayed due to the issue with printk() calling
  probe_kernel_read() interacting with our new user access prevention
  stuff, but all fixed now.

  The only out-of-area changes are the addition of a cpuhp_state, small
  additions to Documentation and MAINTAINERS updates.

  Highlights:

   - Support for Kernel Userspace Access/Execution Prevention (like
     SMAP/SMEP/PAN/PXN) on some 64-bit and 32-bit CPUs. This prevents
     the kernel from accidentally accessing userspace outside
     copy_to/from_user(), or ever executing userspace.

   - KASAN support on 32-bit.

   - Rework of where we map the kernel, vmalloc, etc. on 64-bit hash to
     use the same address ranges we use with the Radix MMU.

   - A rewrite into C of large parts of our idle handling code for
     64-bit Book3S (ie. power8 & power9).

   - A fast path entry for syscalls on 32-bit CPUs, for a 12-17% speedup
     in the null_syscall benchmark.

   - On 64-bit bare metal we have support for recovering from errors
     with the time base (our clocksource), however if that fails
     currently we hang in __delay() and never crash. We now have support
     for detecting that case and short circuiting __delay() so we at
     least panic() and reboot.

   - Add support for optionally enabling the DAWR on Power9, which had
     to be disabled by default due to a hardware erratum. This has the
     effect of enabling hardware breakpoints for GDB, the downside is a
     badly behaved program could crash the machine by pointing the DAWR
     at cache inhibited memory. This is opt-in obviously.

   - xmon, our crash handler, gets support for a read only mode where
     operations that could change memory or otherwise disturb the system
     are disabled.

  Plus many clean-ups, reworks and minor fixes etc.

  Thanks to: Christophe Leroy, Akshay Adiga, Alastair D'Silva, Alexey
  Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar,
  Anton Blanchard, Ben Hutchings, Bo YU, Breno Leitao, Cédric Le Goater,
  Christopher M. Riedl, Christoph Hellwig, Colin Ian King, David Gibson,
  Ganesh Goudar, Gautham R. Shenoy, George Spelvin, Greg Kroah-Hartman,
  Greg Kurz, Horia Geantă, Jagadeesh Pagadala, Joel Stanley, Joe
  Perches, Julia Lawall, Laurentiu Tudor, Laurent Vivier, Lukas Bulwahn,
  Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu Malaterre, Michael
  Neuling, Mukesh Ojha, Nathan Fontenot, Nathan Lynch, Nicholas Piggin,
  Nick Desaulniers, Oliver O'Halloran, Peng Hao, Qian Cai, Ravi
  Bangoria, Rick Lindsley, Russell Currey, Sachin Sant, Stewart Smith,
  Sukadev Bhattiprolu, Thomas Huth, Tobin C. Harding, Tyrel Datwyler,
  Valentin Schneider, Wei Yongjun, Wen Yang, YueHaibing"

* tag 'powerpc-5.2-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (205 commits)
  powerpc/64s: Use early_mmu_has_feature() in set_kuap()
  powerpc/book3s/64: check for NULL pointer in pgd_alloc()
  powerpc/mm: Fix hugetlb page initialization
  ocxl: Fix return value check in afu_ioctl()
  powerpc/mm: fix section mismatch for setup_kup()
  powerpc/mm: fix redundant inclusion of pgtable-frag.o in Makefile
  powerpc/mm: Fix makefile for KASAN
  powerpc/kasan: add missing/lost Makefile
  selftests/powerpc: Add a signal fuzzer selftest
  powerpc/booke64: set RI in default MSR
  ocxl: Provide global MMIO accessors for external drivers
  ocxl: move event_fd handling to frontend
  ocxl: afu_irq only deals with IRQ IDs, not offsets
  ocxl: Allow external drivers to use OpenCAPI contexts
  ocxl: Create a clear delineation between ocxl backend & frontend
  ocxl: Don't pass pci_dev around
  ocxl: Split pci.c
  ocxl: Remove some unused exported symbols
  ocxl: Remove superfluous 'extern' from headers
  ocxl: read_pasid never returns an error, so make it void
  ...
2019-05-10 05:29:27 -07:00
Linus Torvalds
0a499fc5c3 Merge branch 'core-speculation-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull speculation mitigation update from Ingo Molnar:
 "This adds the "mitigations=" bootline option, which offers a
  cross-arch set of options that will work on x86, PowerPC and s390 that
  will map to the arch specific option internally"

* 'core-speculation-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  s390/speculation: Support 'mitigations=' cmdline option
  powerpc/speculation: Support 'mitigations=' cmdline option
  x86/speculation: Support 'mitigations=' cmdline option
  cpu/speculation: Add 'mitigations=' cmdline option
2019-05-06 13:01:16 -07:00
Mahesh Salgaonkar
de269129a4 powerpc/hmi: Fix kernel hang when TB is in error state.
On TOD/TB errors timebase register stops/freezes until HMI error recovery
gets TOD/TB back into running state. On successful recovery, TB starts
running again and udelay() that relies on TB value continues to function
properly. But in case when HMI fails to recover from TOD/TB errors, the
TB register stay freezed. With TB not running the __delay() function
keeps looping and never return. If __delay() is called while in panic
path then system hangs and never reboots after panic.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 02:54:57 +10:00
Valentin Schneider
90437bffa5 powerpc/entry: Remove unneeded need_resched() loop
Since the enabling and disabling of IRQs within preempt_schedule_irq()
is contained in a need_resched() loop, we don't need the outer arch
code loop.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
[mpe: Rebase since CURRENT_THREAD_INFO() removal]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 02:54:57 +10:00
Christophe Leroy
d7fbe2a043 powerpc/prom_init: get rid of PROM_SCRATCH_SIZE
PROM_SCRATCH_SIZE is same as sizeof(prom_scratch)

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 02:54:56 +10:00
Michael Ellerman
398af57112 powerpc/security: Show powerpc_security_features in debugfs
This can be helpful for debugging problems with the security feature
flags, especially on guests where the flags come from the hypervisor
via an hcall and so can't be observed in the device tree.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 02:54:56 +10:00
Nicholas Piggin
e2b36d5917 powerpc/64: Don't trace code that runs with the soft irq mask unreconciled
"Reconciling" in terms of interrupt handling, is to bring the soft irq
mask state in to synch with the hardware, after an interrupt causes
MSR[EE] to be cleared (while the soft mask may be enabled, and hard
irqs not marked disabled).

General kernel code should not be called while unreconciled, because
local_irq_disable, etc. manipulations can cause surprising irq traces,
and it's fragile because the soft irq code does not really expect to
be called in this situation.

When exiting from an interrupt, MSR[EE] is cleared to prevent races,
but soft irq state is enabled for the returned-to context, so this is
now an unreconciled state. restore_math is called in this state, and
that can be ftraced, and the ftrace subsystem disables local irqs.

Mark restore_math and its callees as notrace. Restore a sanity check
in the soft irq code that had to be disabled for this case, by commit
4da1f79227 ("powerpc/64: Disable irq restore warning for now").

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:58:11 +10:00
Christophe Leroy
502523fd1d powerpc/irq: drop __irq_offset_value
This patch drops__irq_offset_value which has not been used since
commit 9c4cb82515 ("powerpc: Remove use of CONFIG_PPC_MERGE")

This removes a sparse warning.

Fixes: 9c4cb82515 ("powerpc: Remove use of CONFIG_PPC_MERGE")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:58:11 +10:00
Christophe Leroy
65184f2f04 powerpc/setup: replace ifdefs by IS_ENABLED() wherever possible.
Compared to ifdefs, IS_ENABLED() provide a cleaner code and allows
to detect compilation failure regardless of the selected options.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:58:11 +10:00
Christophe Leroy
48018e42e5 powerpc/setup: cleanup the #ifdef CONFIG_TAU block
Use cpu_has_feature() instead of opencoding

Use IS_ENABLED() instead of #ifdef for CONFIG_TAU_AVERAGE

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:58:11 +10:00
Christophe Leroy
b5064efee2 powerpc/setup: cleanup ifdef mess in check_cache_coherency()
Use IS_ENABLED() instead of #ifdefs

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:58:11 +10:00
Christophe Leroy
e9e9b25a4c powerpc/setup: Remove unnecessary #ifdef CONFIG_ALTIVEC
CPU_FTR_ALTIVEC is only set when CONFIG_ALTIVEC is selected, so
the ifdef is unnecessary.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:58:11 +10:00
Christophe Leroy
93f2cd8137 powerpc/mm: define an empty mm_iommu_init()
To avoid ifdefs, define a empty static inline mm_iommu_init() function
when CONFIG_SPAPR_TCE_IOMMU is not selected.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:58:11 +10:00
Christophe Leroy
9c1d38b34e powerpc/fadump: define an empty fadump_cleanup()
To avoid #ifdefs, define an static inline fadump_cleanup() function
when CONFIG_FADUMP is not selected

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:58:10 +10:00
Christophe Leroy
d1865e71cd powerpc/32: Don't add dummy frames when calling trace_hardirqs_on/off
No need to add dummy frames when calling trace_hardirqs_on or
trace_hardirqs_off. GCC properly handles empty stacks.

In addition, powerpc doesn't set CONFIG_FRAME_POINTER, therefore
__builtin_return_address(1..) returns NULL at all time. So the
dummy frames are definitely unneeded here.

In the meantime, avoid reading memory for loading r1 with a value
we already know.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:28 +10:00
Christophe Leroy
38b4564cf0 powerpc/32: don't do syscall stuff in transfer_to_handler
As syscalls are now handled via a fast entry path, syscall related
actions can be removed from the generic transfer_to_handler path.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
1a4b739bbb powerpc/32: implement fast entry for syscalls on BOOKE
This patch implements a fast entry for syscalls.

Syscalls don't have to preserve non volatile registers except LR.

This patch then implement a fast entry for syscalls, where
volatile registers get clobbered.

As this entry is dedicated to syscall it always sets MSR_EE
and warns in case MSR_EE was previously off

It also assumes that the call is always from user, system calls are
unexpected from kernel.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
b86fb88855 powerpc/32: implement fast entry for syscalls on non BOOKE
This patch implements a fast entry for syscalls.

Syscalls don't have to preserve non volatile registers except LR.

This patch then implement a fast entry for syscalls, where
volatile registers get clobbered.

As this entry is dedicated to syscall it always sets MSR_EE
and warns in case MSR_EE was previously off

It also assumes that the call is always from user, system calls are
unexpected from kernel.

The overall series improves null_syscall selftest by 12,5% on an 83xx
and by 17% on a 8xx.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
40530db7c6 powerpc: Fix 32-bit handling of MSR_EE on exceptions
[text mostly copied from benh's RFC/WIP]

ppc32 are still doing something rather gothic and wrong on 32-bit
which we stopped doing on 64-bit a while ago.

We have that thing where some handlers "copy" the EE value from the
original stack frame into the new MSR before transferring to the
handler.

Thus for a number of exceptions, we enter the handlers with interrupts
enabled.

This is rather fishy, some of the stuff that handlers might do early
on such as irq_enter/exit or user_exit, context tracking, etc...
should be run with interrupts off afaik.

Generally our handlers know when to re-enable interrupts if needed.

The problem we were having is that we assumed these interrupts would
return with interrupts enabled. However that isn't the case.

Instead, this patch changes things so that we always enter exception
handlers with interrupts *off* with the notable exception of syscalls
which are special (and get a fast path).

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
1ae99b4b92 powerpc/32: get rid of COPY_EE in exception entry
EXC_XFER_TEMPLATE() is not called with COPY_EE anymore so
we can get rid of copyee parameters and related COPY_EE and NOCOPY
macros.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[splited out from benh RFC patch]

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
642770dd96 powerpc/32: Enter exceptions with MSR_EE unset
All exceptions handlers know when to reenable interrupts, so
it is safer to enter all of them with MSR_EE unset, except
for syscalls.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[splited out from benh RFC patch]

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
f97dec21a3 powerpc/32: enter syscall with MSR_EE inconditionaly set
syscalls are expected to be entered with MSR_EE set. Lets
make it inconditional by forcing MSR_EE on syscalls.

This patch adds EXC_XFER_SYS for that.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[splited out from benh RFC patch]

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
ef4291243f powerpc/fsl_booke: ensure SPEFloatingPointException() reenables interrupts
SPEFloatingPointException() is the only exception handler which 'forgets' to
re-enable interrupts. This patch makes sure it does.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
90f204b9a1 powerpc/40x: Refactor exception entry macros by using head_32.h
Refactor exception entry macros by using the ones defined in head_32.h

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
7271fc9604 powerpc/40x: Split and rename NORMAL_EXCEPTION_PROLOG
This patch splits NORMAL_EXCEPTION_PROLOG in the same way as in
head_8xx.S and head_32.S and renames it EXCEPTION_PROLOG() as well
to match head_32.h

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
bd82904d46 powerpc/40x: add exception frame marker
This patch adds STACK_FRAME_REGS_MARKER in the stack at exception entry
in order to see interrupts in call traces as below:

[    0.013964] Call Trace:
[    0.014014] [c0745db0] [c007a9d4] tick_periodic.constprop.5+0xd8/0x104 (unreliable)
[    0.014086] [c0745dc0] [c007aa20] tick_handle_periodic+0x20/0x9c
[    0.014181] [c0745de0] [c0009cd0] timer_interrupt+0xa0/0x264
[    0.014258] [c0745e10] [c000e484] ret_from_except+0x0/0x14
[    0.014390] --- interrupt: 901 at console_unlock.part.7+0x3f4/0x528
[    0.014390]     LR = console_unlock.part.7+0x3f0/0x528
[    0.014455] [c0745ee0] [c0050334] console_unlock.part.7+0x114/0x528 (unreliable)
[    0.014542] [c0745f30] [c00524e0] register_console+0x3d8/0x44c
[    0.014625] [c0745f60] [c0675aac] cpm_uart_console_init+0x18/0x2c
[    0.014709] [c0745f70] [c06614f4] console_init+0x114/0x1cc
[    0.014795] [c0745fb0] [c0658b68] start_kernel+0x300/0x3d8
[    0.014864] [c0745ff0] [c00022cc] start_here+0x44/0x98

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
57bc13acbe powerpc/40x: Don't use SPRN_SPRG_SCRATCH2 in EXCEPTION_PROLOG
Unlike said in the comment, r1 is not reused by the critical
exception handler, as it uses a dedicated critirq_ctx stack.
Decrementing r1 early is then unneeded.

Should the above be valid, the code is crap buggy anyway as
r1 gets some intermediate values that would jeopardise the
whole process (for instance after mfspr   r1,SPRN_SPRG_THREAD)

Using SPRN_SPRG_SCRATCH2 to save r1 is then not needed, r11 can be
used instead. This avoids one mtspr and one mfspr and makes the
prolog closer to what's done on 6xx and 8xx.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
1d3034aed4 powerpc/32: make the 6xx/8xx EXC_XFER_TEMPLATE() similar to the 40x/booke one
6xx/8xx EXC_XFER_TEMPLATE() macro adds a i##n symbol which is
unused and can be removed.
40x and booke EXC_XFER_TEMPLATE() macros takes msr from the caller
while the 6xx/8xx version uses only MSR_KERNEL as msr value.

This patch modifies the 6xx/8xx version to make it similar to the
40x and booke versions.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
37737a2afd powerpc/32: move LOAD_MSR_KERNEL() into head_32.h and use it
As preparation for using head_32.h for head_40x.S, move
LOAD_MSR_KERNEL() there and use it to load r10 with MSR_KERNEL value.

In the mean time, this patch modifies it so that it takes into account
the size of the passed value to determine if 'li' can be used or if
'lis/ori' is needed instead of using the size of MSR_KERNEL. This is
done by using gas macro.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:27 +10:00
Christophe Leroy
8a23fdec3d powerpc/32: Refactor EXCEPTION entry macros for head_8xx.S and head_32.S
EXCEPTION_PROLOG is similar in head_8xx.S and head_32.S

This patch creates head_32.h and moves EXCEPTION_PROLOG macro
into it. It also converts it from a GCC macro to a GAS macro
in order to ease refactorisation with 40x later, since
GAS macros allows the use of #ifdef/#else/#endif inside it.
And it also has the advantage of not requiring the uggly "; \"
at the end of each line.

This patch also moves EXCEPTION() and EXC_XFER_XXXX() macros which
are also similar while adding START_EXCEPTION() out of EXCEPTION().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:26 +10:00
Christophe Leroy
e4dccf9092 powerpc/mm: print hash info in a helper
Reduce #ifdef mess by defining a helper to print
hash info at startup.

In the meantime, remove the display of hash table address
to reduce leak of non necessary information.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:26 +10:00
Christophe Leroy
215b823707 powerpc/32s: set up an early static hash table for KASAN.
KASAN requires early activation of hash table, before memblock()
functions are available.

This patch implements an early hash_table statically defined in
__initdata.

During early boot, a single page table is used.

For hash32, when doing the final init, one page table is allocated
for each PGD entry because of the _PAGE_HASHPTE flag which can't be
common to several virt pages. This is done after memblock get
available but before switching to the final hash table, otherwise
there are issues with TLB flushing due to the shared entries.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:26 +10:00
Christophe Leroy
72f208c6a8 powerpc/32s: move hash code patching out of MMU_init_hw()
For KASAN, hash table handling will be activated early for
accessing to KASAN shadow areas.

In order to avoid any modification of the hash functions while
they are still used with the early hash table, the code patching
is moved out of MMU_init_hw() and put close to the big-bang switch
to the final hash table.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:26 +10:00
Christophe Leroy
2edb16efc8 powerpc/32: Add KASAN support
This patch adds KASAN support for PPC32. The following patch
will add an early activation of hash table for book3s. Until
then, a warning will be raised if trying to use KASAN on an
hash 6xx.

To support KASAN, this patch initialises that MMU mapings for
accessing to the KASAN shadow area defined in a previous patch.

An early mapping is set as soon as the kernel code has been
relocated at its definitive place.

Then the definitive mapping is set once paging is initialised.

For modules, the shadow area is allocated at module_alloc().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:26 +10:00
Christophe Leroy
f072015c7b powerpc: disable KASAN instrumentation on early/critical files.
All files containing functions run before kasan_early_init() is called
must have KASAN instrumentation disabled.

For those file, branch profiling also have to be disabled otherwise
each if () generates a call to ftrace_likely_update().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:26 +10:00
Christophe Leroy
7934cea7f0 powerpc/32: use memset() instead of memset_io() to zero BSS
Since commit 400c47d81c ("powerpc32: memset: only use dcbz once cache is
enabled"), memset() can be used before activation of the cache,
so no need to use memset_io() for zeroing the BSS.

Acked-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:26 +10:00
Christophe Leroy
adcf59187e powerpc: don't use direct assignation during early boot.
In kernel/cputable.c, explicitly use memcpy() instead of *y = *x;
This will allow GCC to replace it with __memcpy() when KASAN is
selected.

Acked-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:25 +10:00
Christophe Leroy
450e7dd400 powerpc/prom_init: don't use string functions from lib/
When KASAN is active, the string functions in lib/ are doing the
KASAN checks. This is too early for prom_init.

This patch implements dedicated string functions for prom_init,
which will be compiled in with KASAN disabled.

Size of prom_init before the patch:
   text	   data	    bss	    dec	    hex	filename
  12060	    488	   6960	  19508	   4c34	arch/powerpc/kernel/prom_init.o

Size of prom_init after the patch:
   text	   data	    bss	    dec	    hex	filename
  12460	    488	   6960	  19908	   4dc4	arch/powerpc/kernel/prom_init.o

This increases the size of prom_init a bit, but as prom_init is
in __init section, it is freed after boot anyway.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:25 +10:00
Christophe Leroy
cbe46bd4f5 powerpc: remove CONFIG_CMDLINE #ifdef mess
This patch makes CONFIG_CMDLINE defined at all time. It avoids
having to enclose related code inside #ifdef CONFIG_CMDLINE

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:25 +10:00
Christophe Leroy
26deb04342 powerpc: prepare string/mem functions for KASAN
CONFIG_KASAN implements wrappers for memcpy() memmove() and memset()
Those wrappers are doing the verification then call respectively
__memcpy() __memmove() and __memset(). The arches are therefore
expected to rename their optimised functions that way.

For files on which KASAN is inhibited, #defines are used to allow
them to directly call optimised versions of the functions without
going through the KASAN wrappers.

See commit 393f203f5f ("x86_64: kasan: add interceptors for
memset/memmove/memcpy functions") for details.

Other string / mem functions do not (yet) have kasan wrappers,
we therefore have to fallback to the generic versions when
KASAN is active, otherwise KASAN checks will be skipped.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Fixups to keep selftests working]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:25 +10:00
Christophe Leroy
d69ca6bab3 powerpc/32: Move early_init() in a separate file
In preparation of KASAN, move early_init() into a separate
file in order to allow deactivation of KASAN for that function.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:25 +10:00
Christophe Leroy
45d0ba527b powerpc/mm: move hugetlb_disabled into asm/hugetlb.h
No need to have this in asm/page.h, move it into asm/hugetlb.h

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:24 +10:00
Christophe Leroy
5ba666d56c powerpc/mm: fix erroneous duplicate slb_addr_limit init
Commit 67fda38f0d ("powerpc/mm: Move slb_addr_linit to
early_init_mmu") moved slb_addr_limit init out of setup_arch().

Commit 701101865f ("powerpc/mm: Reduce memory usage for mm_context_t
for radix") brought it back into setup_arch() by error.

This patch reverts that erroneous regress.

Fixes: 701101865f ("powerpc/mm: Reduce memory usage for mm_context_t for radix")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-03 01:20:22 +10:00
Breno Leitao
e620d45065 powerpc/tm: Avoid machine crash on rt_sigreturn()
There is a kernel crash that happens if rt_sigreturn() is called inside
a transactional block.

This crash happens if the kernel hits an in-kernel page fault when
accessing userspace memory, usually through copy_ckvsx_to_user(). A
major page fault calls might_sleep() function, which can cause a task
reschedule. A task reschedule (switch_to()) reclaim and recheckpoint
the TM states, but, in the signal return path, the checkpointed memory
was already reclaimed, thus the exception stack has MSR that points to
MSR[TS]=0.

When the code returns from might_sleep() and a task reschedule
happened, then this task is returned with the memory recheckpointed,
and CPU MSR[TS] = suspended.

This means that there is a side effect at might_sleep() if it is
called with CPU MSR[TS] = 0 and the task has regs->msr[TS] != 0.

This side effect can cause a TM bad thing, since at the exception
entrance, the stack saves MSR[TS]=0, and this is what will be used at
RFID, but, the processor has MSR[TS] = Suspended, and this transition
will be invalid and a TM Bad thing will be raised, causing the
following crash:

  Unexpected TM Bad Thing exception at c00000000000e9ec (msr 0x8000000302a03031) tm_scratch=800000010280b033
  cpu 0xc: Vector: 700 (Program Check) at [c00000003ff1fd70]
      pc: c00000000000e9ec: fast_exception_return+0x100/0x1bc
      lr: c000000000032948: handle_rt_signal64+0xb8/0xaf0
      sp: c0000004263ebc40
     msr: 8000000302a03031
    current = 0xc000000415050300
    paca    = 0xc00000003ffc4080	 irqmask: 0x03	 irq_happened: 0x01
      pid   = 25006, comm = sigfuz
  Linux version 5.0.0-rc1-00001-g3bd6e94bec12 (breno@debian) (gcc version 8.2.0 (Debian 8.2.0-3)) #899 SMP Mon Jan 7 11:30:07 EST 2019
  WARNING: exception is not recoverable, can't continue
  enter ? for help
  [c0000004263ebc40] c000000000032948 handle_rt_signal64+0xb8/0xaf0 (unreliable)
  [c0000004263ebd30] c000000000022780 do_notify_resume+0x2f0/0x430
  [c0000004263ebe20] c00000000000e844 ret_from_except_lite+0x70/0x74
  --- Exception: c00 (System Call) at 00007fffbaac400c
  SP (7fffeca90f40) is in userspace

The solution for this problem is running the sigreturn code with
regs->msr[TS] disabled, thus, avoiding hitting the side effect above.
This does not seem to be a problem since regs->msr will be replaced by
the ucontext value, so, it is being flushed already. In this case, it
is flushed earlier.

Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-01 23:03:29 +10:00