Pull devicetree updates from Rob Herring:
- Refactor powerpc and arm64 kexec DT handling to common code. This
enables IMA on arm64.
- Add kbuild support for applying DT overlays at build time. The first
user are the DT unittests.
- Fix kerneldoc formatting and W=1 warnings in drivers/of/
- Fix handling 64-bit flag on PCI resources
- Bump dtschema version required to v2021.2.1
- Enable undocumented compatible checks for dtbs_check. This allows
tracking of missing binding schemas.
- DT docs improvements. Regroup the DT docs and add the example schema
and DT kernel ABI docs to the doc build.
- Convert Broadcom Bluetooth and video-mux bindings to schema
- Add QCom sm8250 Venus video codec binding schema
- Add vendor prefixes for AESOP, YIC System Co., Ltd, and Siliconfile
Technologies Inc.
- Cleanup of DT schema type references on common properties and
standard unit properties
* tag 'devicetree-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (64 commits)
powerpc: If kexec_build_elf_info() fails return immediately from elf64_load()
powerpc: Free fdt on error in elf64_load()
of: overlay: Fix kerneldoc warning in of_overlay_remove()
of: linux/of.h: fix kernel-doc warnings
of/pci: Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses
dt-bindings: bcm4329-fmac: add optional brcm,ccode-map
docs: dt: update writing-schema.rst references
dt-bindings: media: venus: Add sm8250 dt schema
of: base: Fix spelling issue with function param 'prop'
docs: dt: Add DT API documentation
of: Add missing 'Return' section in kerneldoc comments
of: Fix kerneldoc output formatting
docs: dt: Group DT docs into relevant sub-sections
docs: dt: Make 'Devicetree' wording more consistent
docs: dt: writing-schema: Include the example schema in the doc build
docs: dt: writing-schema: Remove spurious indentation
dt-bindings: Fix reference in submitting-patches.rst to the DT ABI doc
dt-bindings: ddr: Add optional manufacturer and revision ID to LPDDR3
dt-bindings: media: video-interfaces: Drop the example
devicetree: bindings: clock: Minor typo fix in the file armada3700-tbg-clock.txt
...
The code relies on constant folding of cpu_has_feature() based
on possible and always true values as defined per
CPU_FTRS_ALWAYS and CPU_FTRS_POSSIBLE.
Build failure is encountered with for instance
book3e_all_defconfig on kisskb in the AMDGPU driver which uses
cpu_has_feature(CPU_FTR_VSX_COMP) to decide whether calling
kernel_enable_vsx() or not.
The failure is due to cpu_has_feature() not being inlined with
that configuration with gcc 4.9.
In the same way as commit acdad8fb4a ("powerpc: Force inlining of
mmu_has_feature to fix build failure"), for inlining of
cpu_has_feature().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b231dfa040ce4cc37f702f5c3a595fdeabfe0462.1615378209.git.christophe.leroy@csgroup.eu
kexec_fdt_totalsize_ppc64() includes the base FDT size in its size
calculation, but commit 3c985d31ad ("powerpc: Use common
of_kexec_alloc_and_setup_fdt()") changed the kexec code to use the
generic function of_kexec_alloc_and_setup_fdt() which already includes
the base FDT size. That change made the code overestimate the size a bit
by counting twice the space required for the kernel command line and
/chosen properties.
Therefore change kexec_fdt_totalsize_ppc64() to calculate just the extra
space needed by the kdump kernel, and change the function name so that it
better reflects what the function is now doing.
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
[robh: reword commit msg as no longer a fix from merging to branches]
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210220005204.1417200-1-bauerman@linux.ibm.com
Add stub instances of enable_kernel_vsx() and disable_kernel_vsx()
when CONFIG_VSX is not set, to avoid following build failure.
CC [M] drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.o
In file included from ./drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services_types.h:29,
from ./drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services.h:37,
from drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:27:
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c: In function 'dcn_bw_apply_registry_override':
./drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:64:3: error: implicit declaration of function 'enable_kernel_vsx'; did you mean 'enable_kernel_fp'? [-Werror=implicit-function-declaration]
64 | enable_kernel_vsx(); \
| ^~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:640:2: note: in expansion of macro 'DC_FP_START'
640 | DC_FP_START();
| ^~~~~~~~~~~
./drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:75:3: error: implicit declaration of function 'disable_kernel_vsx'; did you mean 'disable_kernel_fp'? [-Werror=implicit-function-declaration]
75 | disable_kernel_vsx(); \
| ^~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:676:2: note: in expansion of macro 'DC_FP_END'
676 | DC_FP_END();
| ^~~~~~~~~
cc1: some warnings being treated as errors
make[5]: *** [drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.o] Error 1
This works because the caller is checking if VSX is available using
cpu_has_feature():
#define DC_FP_START() { \
if (cpu_has_feature(CPU_FTR_VSX_COMP)) { \
preempt_disable(); \
enable_kernel_vsx(); \
} else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP)) { \
preempt_disable(); \
enable_kernel_altivec(); \
} else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE)) { \
preempt_disable(); \
enable_kernel_fp(); \
} \
When CONFIG_VSX is not selected, cpu_has_feature(CPU_FTR_VSX_COMP)
constant folds to 'false' so the call to enable_kernel_vsx() is
discarded and the build succeeds.
Fixes: 16a9dea110 ("amdgpu: Enable initial DCN support on POWER")
Cc: stable@vger.kernel.org # v5.6+
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Incorporate some discussion comments into the change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8d7d285a027e9d21f5ff7f850fa71a2655b0c4af.1615279170.git.christophe.leroy@csgroup.eu
In ppc_function_entry() we look for a specific set of instructions by
masking the instructions and comparing with a known value. Currently
those known values are just literal hex values, and we recently
discovered one of them was wrong.
Instead construct the values using the existing constants we have for
defining various fields of instructions.
Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20210309071544.515303-1-mpe@ellerman.id.au
The functions defined in "arch/powerpc/kexec/ima.c" handle setting up
and freeing the resources required to carry over the IMA measurement
list from the current kernel to the next kernel across kexec system call.
These functions do not have architecture specific code, but are
currently limited to powerpc.
Move remove_ima_buffer() and setup_ima_buffer() calls into
of_kexec_alloc_and_setup_fdt() defined in "drivers/of/kexec.c".
Move the remaining architecture independent functions from
"arch/powerpc/kexec/ima.c" to "drivers/of/kexec.c".
Delete "arch/powerpc/kexec/ima.c" and "arch/powerpc/include/asm/ima.h".
Remove references to the deleted files and functions in powerpc and
in ima.
Co-developed-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Tested-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210221174930.27324-11-nramas@linux.microsoft.com
The fields ima_buffer_addr and ima_buffer_size in "struct kimage_arch"
for powerpc are used to carry forward the IMA measurement list across
kexec system call. These fields are not architecture specific, but are
currently limited to powerpc.
arch_ima_add_kexec_buffer() defined in "arch/powerpc/kexec/ima.c"
sets ima_buffer_addr and ima_buffer_size for the kexec system call.
This function does not have architecture specific code, but is
currently limited to powerpc.
Move ima_buffer_addr and ima_buffer_size to "struct kimage".
Set ima_buffer_addr and ima_buffer_size in ima_add_kexec_buffer()
in security/integrity/ima/ima_kexec.c.
Co-developed-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Will Deacon <will@kernel.org>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210221174930.27324-9-nramas@linux.microsoft.com
The code for setting up the /chosen node in the device tree
and updating the memory reservation for the next kernel has been
moved to of_kexec_alloc_and_setup_fdt() defined in "drivers/of/kexec.c".
Use the common of_kexec_alloc_and_setup_fdt() to setup the device tree
and update the memory reservation for kexec for powerpc.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210221174930.27324-8-nramas@linux.microsoft.com
The test robot has managed to generate a random config leading
to following build failure:
LD .tmp_vmlinux.kallsyms1
powerpc64-linux-ld: arch/powerpc/mm/pgtable.o: in function `ptep_set_access_flags':
pgtable.c:(.text.ptep_set_access_flags+0xf0): undefined reference to `hash__flush_tlb_page'
powerpc64-linux-ld: arch/powerpc/mm/book3s32/mmu.o: in function `MMU_init_hw_patch':
mmu.c:(.init.text+0x452): undefined reference to `patch__hash_page_A0'
powerpc64-linux-ld: mmu.c:(.init.text+0x45e): undefined reference to `patch__hash_page_A0'
powerpc64-linux-ld: mmu.c:(.init.text+0x46a): undefined reference to `patch__hash_page_A1'
powerpc64-linux-ld: mmu.c:(.init.text+0x476): undefined reference to `patch__hash_page_A1'
powerpc64-linux-ld: mmu.c:(.init.text+0x482): undefined reference to `patch__hash_page_A2'
powerpc64-linux-ld: mmu.c:(.init.text+0x48e): undefined reference to `patch__hash_page_A2'
powerpc64-linux-ld: mmu.c:(.init.text+0x49e): undefined reference to `patch__hash_page_B'
powerpc64-linux-ld: mmu.c:(.init.text+0x4aa): undefined reference to `patch__hash_page_B'
powerpc64-linux-ld: mmu.c:(.init.text+0x4b6): undefined reference to `patch__hash_page_C'
powerpc64-linux-ld: mmu.c:(.init.text+0x4c2): undefined reference to `patch__hash_page_C'
powerpc64-linux-ld: mmu.c:(.init.text+0x4ce): undefined reference to `patch__flush_hash_A0'
powerpc64-linux-ld: mmu.c:(.init.text+0x4da): undefined reference to `patch__flush_hash_A0'
powerpc64-linux-ld: mmu.c:(.init.text+0x4e6): undefined reference to `patch__flush_hash_A1'
powerpc64-linux-ld: mmu.c:(.init.text+0x4f2): undefined reference to `patch__flush_hash_A1'
powerpc64-linux-ld: mmu.c:(.init.text+0x4fe): undefined reference to `patch__flush_hash_A2'
powerpc64-linux-ld: mmu.c:(.init.text+0x50a): undefined reference to `patch__flush_hash_A2'
powerpc64-linux-ld: mmu.c:(.init.text+0x522): undefined reference to `patch__flush_hash_B'
powerpc64-linux-ld: mmu.c:(.init.text+0x532): undefined reference to `patch__flush_hash_B'
powerpc64-linux-ld: arch/powerpc/mm/book3s32/mmu.o: in function `update_mmu_cache':
mmu.c:(.text.update_mmu_cache+0xa0): undefined reference to `add_hash_page'
powerpc64-linux-ld: mm/memory.o: in function `zap_pte_range':
memory.c:(.text.zap_pte_range+0x160): undefined reference to `flush_hash_pages'
powerpc64-linux-ld: mm/memory.o: in function `handle_pte_fault':
memory.c:(.text.handle_pte_fault+0x180): undefined reference to `hash__flush_tlb_page'
This is due to mmu_has_feature() not being inlined. See extract of build of
mmu.c with -Winline:
In file included from ./include/linux/mm_types.h:19,
from ./include/linux/mmzone.h:21,
from ./include/linux/gfp.h:6,
from ./include/linux/mm.h:10,
from arch/powerpc/mm/book3s32/mmu.c:21:
./arch/powerpc/include/asm/mmu.h: In function 'find_free_bat':
./arch/powerpc/include/asm/mmu.h:231:20: warning: inlining failed in call to 'early_mmu_has_feature': call is unlikely and code size would grow [-Winline]
231 | static inline bool early_mmu_has_feature(unsigned long feature)
| ^~~~~~~~~~~~~~~~~~~~~
./arch/powerpc/include/asm/mmu.h:291:9: note: called from here
291 | return early_mmu_has_feature(feature);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The code relies on constant folding of MMU_FTRS_POSSIBLE at buildtime
and elimination of non possible parts of code at compile time.
For this to work, mmu_has_feature() and early_mmu_has_feature()
must be inlined.
Fixes: 259149cf7c ("powerpc/32s: Only build hash code when CONFIG_PPC_BOOK3S_604 is selected")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/cf61345912c078c96f171afd0fcc48ef27cbdc3f.1614443418.git.christophe.leroy@csgroup.eu
The driver core ignores the return value of struct bus_type::remove()
because there is only little that can be done. To simplify the quest to
make this function return void, let struct vio_driver::remove() return
void, too. All users already unconditionally return 0, this commit makes
it obvious that returning an error code is a bad idea.
Note there are two nominally different implementations for a vio bus:
one in arch/sparc/kernel/vio.c and the other in
arch/powerpc/platforms/pseries/vio.c. This patch only adapts the powerpc
one.
Before this patch for a device that was bound to a driver without a
remove callback vio_cmo_bus_remove(viodev) wasn't called. As the device
core still considers the device unbound after vio_bus_remove() returns
calling this unconditionally is the consistent behaviour which is
implemented here.
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Acked-by: Lijun Pan <ljp@linux.ibm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[mpe: Drop unneeded hvcs_remove() forward declaration, squash in
change from sfr to drop ibmvnic_remove() forward declaration]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210225221834.160083-1-uwe@kleine-koenig.org
lkp reported a build error in fsp2.o:
CC arch/powerpc/platforms/44x/fsp2.o
{standard input}:577: Error: unsupported relocation against base
Which comes from:
pr_err("GESR0: 0x%08x\n", mfdcr(base + PLB4OPB_GESR0));
Where our mfdcr() macro is stringifying "base + PLB4OPB_GESR0", and
passing that to the assembler, which obviously doesn't work.
The mfdcr() macro already checks that the argument is constant using
__builtin_constant_p(), and if not calls the out-of-line version of
mfdcr(). But in this case GCC is smart enough to notice that "base +
PLB4OPB_GESR0" will be constant, even though it's not something we can
immediately stringify into a register number.
Segher pointed out that passing the register number to the inline asm
as a constant would be better, and in fact it fixes the build error,
presumably because it gives GCC a chance to resolve the value.
While we're at it, change mtdcr() similarly.
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Feng Tang <feng.tang@intel.com>
Link: https://lore.kernel.org/r/20210218123058.748882-1-mpe@ellerman.id.au
Pull x86 irq entry updates from Thomas Gleixner:
"The irq stack switching was moved out of the ASM entry code in course
of the entry code consolidation. It ended up being suboptimal in
various ways.
This reworks the X86 irq stack handling:
- Make the stack switching inline so the stackpointer manipulation is
not longer at an easy to find place.
- Get rid of the unnecessary indirect call.
- Avoid the double stack switching in interrupt return and reuse the
interrupt stack for softirq handling.
- A objtool fix for CONFIG_FRAME_POINTER=y builds where it got
confused about the stack pointer manipulation"
* tag 'x86-entry-2021-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix stack-swizzle for FRAME_POINTER=y
um: Enforce the usage of asm-generic/softirq_stack.h
x86/softirq/64: Inline do_softirq_own_stack()
softirq: Move do_softirq_own_stack() to generic asm header
softirq: Move __ARCH_HAS_DO_SOFTIRQ to Kconfig
x86: Select CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
x86/softirq: Remove indirection in do_softirq_own_stack()
x86/entry: Use run_sysvec_on_irqstack_cond() for XEN upcall
x86/entry: Convert device interrupts to inline stack switching
x86/entry: Convert system vectors to irq stack macro
x86/irq: Provide macro for inlining irq stack switching
x86/apic: Split out spurious handling code
x86/irq/64: Adjust the per CPU irq stack pointer by 8
x86/irq: Sanitize irq stack tracking
x86/entry: Fix instrumentation annotation
Pull powerpc updates from Michael Ellerman:
- A large series adding wrappers for our interrupt handlers, so that
irq/nmi/user tracking can be isolated in the wrappers rather than
spread in each handler.
- Conversion of the 32-bit syscall handling into C.
- A series from Nick to streamline our TLB flushing when using the
Radix MMU.
- Switch to using queued spinlocks by default for 64-bit server CPUs.
- A rework of our PCI probing so that it happens later in boot, when
more generic infrastructure is available.
- Two small fixes to allow 32-bit little-endian processes to run on
64-bit kernels.
- Other smaller features, fixes & cleanups.
Thanks to: Alexey Kardashevskiy, Ananth N Mavinakayanahalli, Aneesh
Kumar K.V, Athira Rajeev, Bhaskar Chowdhury, Cédric Le Goater, Chengyang
Fan, Christophe Leroy, Christopher M. Riedl, Fabiano Rosas, Florian
Fainelli, Frederic Barrat, Ganesh Goudar, Hari Bathini, Jiapeng Chong,
Joseph J Allen, Kajol Jain, Markus Elfring, Michal Suchanek, Nathan
Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Pingfan Liu,
Po-Hsu Lin, Qian Cai, Ram Pai, Randy Dunlap, Sandipan Das, Stephen
Rothwell, Tyrel Datwyler, Will Springer, Yury Norov, and Zheng Yongjun.
* tag 'powerpc-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (188 commits)
powerpc/perf: Adds support for programming of Thresholding in P10
powerpc/pci: Remove unimplemented prototypes
powerpc/uaccess: Merge raw_copy_to_user_allowed() into raw_copy_to_user()
powerpc/uaccess: Merge __put_user_size_allowed() into __put_user_size()
powerpc/uaccess: get rid of small constant size cases in raw_copy_{to,from}_user()
powerpc/64: Fix stack trace not displaying final frame
powerpc/time: Remove get_tbl()
powerpc/time: Avoid using get_tbl()
spi: mpc52xx: Avoid using get_tbl()
powerpc/syscall: Avoid storing 'current' in another pointer
powerpc/32: Handle bookE debugging in C in syscall entry/exit
powerpc/syscall: Do not check unsupported scv vector on PPC32
powerpc/32: Remove the counter in global_dbcr0
powerpc/32: Remove verification of MSR_PR on syscall in the ASM entry
powerpc/syscall: implement system call entry/exit logic in C for PPC32
powerpc/32: Always save non volatile GPRs at syscall entry
powerpc/syscall: Change condition to check MSR_RI
powerpc/syscall: Save r3 in regs->orig_r3
powerpc/syscall: Use is_compat_task()
powerpc/syscall: Make interrupt.c buildable on PPC32
...
Pull KVM updates from Paolo Bonzini:
"x86:
- Support for userspace to emulate Xen hypercalls
- Raise the maximum number of user memslots
- Scalability improvements for the new MMU.
Instead of the complex "fast page fault" logic that is used in
mmu.c, tdp_mmu.c uses an rwlock so that page faults are concurrent,
but the code that can run against page faults is limited. Right now
only page faults take the lock for reading; in the future this will
be extended to some cases of page table destruction. I hope to
switch the default MMU around 5.12-rc3 (some testing was delayed
due to Chinese New Year).
- Cleanups for MAXPHYADDR checks
- Use static calls for vendor-specific callbacks
- On AMD, use VMLOAD/VMSAVE to save and restore host state
- Stop using deprecated jump label APIs
- Workaround for AMD erratum that made nested virtualization
unreliable
- Support for LBR emulation in the guest
- Support for communicating bus lock vmexits to userspace
- Add support for SEV attestation command
- Miscellaneous cleanups
PPC:
- Support for second data watchpoint on POWER10
- Remove some complex workarounds for buggy early versions of POWER9
- Guest entry/exit fixes
ARM64:
- Make the nVHE EL2 object relocatable
- Cleanups for concurrent translation faults hitting the same page
- Support for the standard TRNG hypervisor call
- A bunch of small PMU/Debug fixes
- Simplification of the early init hypercall handling
Non-KVM changes (with acks):
- Detection of contended rwlocks (implemented only for qrwlocks,
because KVM only needs it for x86)
- Allow __DISABLE_EXPORTS from assembly code
- Provide a saner follow_pfn replacements for modules"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (192 commits)
KVM: x86/xen: Explicitly pad struct compat_vcpu_info to 64 bytes
KVM: selftests: Don't bother mapping GVA for Xen shinfo test
KVM: selftests: Fix hex vs. decimal snafu in Xen test
KVM: selftests: Fix size of memslots created by Xen tests
KVM: selftests: Ignore recently added Xen tests' build output
KVM: selftests: Add missing header file needed by xAPIC IPI tests
KVM: selftests: Add operand to vmsave/vmload/vmrun in svm.c
KVM: SVM: Make symbol 'svm_gp_erratum_intercept' static
locking/arch: Move qrwlock.h include after qspinlock.h
KVM: PPC: Book3S HV: Fix host radix SLB optimisation with hash guests
KVM: PPC: Book3S HV: Ensure radix guest has no SLB entries
KVM: PPC: Don't always report hash MMU capability for P9 < DD2.2
KVM: PPC: Book3S HV: Save and restore FSCR in the P9 path
KVM: PPC: remove unneeded semicolon
KVM: PPC: Book3S HV: Use POWER9 SLBIA IH=6 variant to clear SLB
KVM: PPC: Book3S HV: No need to clear radix host SLB before loading HPT guest
KVM: PPC: Book3S HV: Fix radix guest SLB side channel
KVM: PPC: Book3S HV: Remove support for running HPT guest on RPT host without mixed mode support
KVM: PPC: Book3S HV: Introduce new capability for 2nd DAWR
KVM: PPC: Book3S HV: Add infrastructure to support 2nd DAWR
...
Pull oprofile and dcookies removal from Viresh Kumar:
"Remove oprofile and dcookies support
The 'oprofile' user-space tools don't use the kernel OPROFILE support
any more, and haven't in a long time. User-space has been converted to
the perf interfaces.
The dcookies stuff is only used by the oprofile code. Now that
oprofile's support is getting removed from the kernel, there is no
need for dcookies as well.
Remove kernel's old oprofile and dcookies support"
* tag 'oprofile-removal-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux:
fs: Remove dcookies support
drivers: Remove CONFIG_OPROFILE support
arch: xtensa: Remove CONFIG_OPROFILE support
arch: x86: Remove CONFIG_OPROFILE support
arch: sparc: Remove CONFIG_OPROFILE support
arch: sh: Remove CONFIG_OPROFILE support
arch: s390: Remove CONFIG_OPROFILE support
arch: powerpc: Remove oprofile
arch: powerpc: Stop building and using oprofile
arch: parisc: Remove CONFIG_OPROFILE support
arch: mips: Remove CONFIG_OPROFILE support
arch: microblaze: Remove CONFIG_OPROFILE support
arch: ia64: Remove rest of perfmon support
arch: ia64: Remove CONFIG_OPROFILE support
arch: hexagon: Don't select HAVE_OPROFILE
arch: arc: Remove CONFIG_OPROFILE support
arch: arm: Remove CONFIG_OPROFILE support
arch: alpha: Remove CONFIG_OPROFILE support
KVM/arm64 updates for Linux 5.12
- Make the nVHE EL2 object relocatable, resulting in much more
maintainable code
- Handle concurrent translation faults hitting the same page
in a more elegant way
- Support for the standard TRNG hypervisor call
- A bunch of small PMU/Debug fixes
- Allow the disabling of symbol export from assembly code
- Simplification of the early init hypercall handling
Merge in the recent paravirt changes to resolve conflicts caused
by objtool annotations.
Conflicts:
arch/x86/xen/xen-asm.S
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull powerpc fix from Michael Ellerman:
"One fix for a regression seen in io_uring, introduced by our support
for KUAP (Kernel User Access Prevention) with the Hash MMU.
Thanks to Aneesh Kumar K.V, and Zorro Lang"
* tag 'powerpc-5.11-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/kuap: Allow kernel thread to access userspace after kthread_use_mm
Thresholding, a performance monitoring unit feature, can be
used to identify marked instructions which take more than
expected cycles between start event and end event.
Threshold compare (thresh_cmp) bits are programmed in MMCRA
register. In Power9, thresh_cmp bits were part of the
event code. But in case of P10, thresh_cmp are not part of
event code due to inclusion of MMCR3 bits.
Patch here adds an option to use attr.config1 variable
to be used to pass thresh_cmp value to be programmed in
MMCRA register. A new ppmu flag called PPMU_HAS_ATTR_CONFIG1
has been added and this flag is used to notify the use of
attr.config1 variable.
Patch has extended the parameter list of 'compute_mmcr',
to include power_pmu's 'flags' element and parameter list of
get_constraint to include attr.config1 value. It also extend
parameter list of power_check_constraints inorder to pass
perf_event list.
As stated by commit ef0e3b650f ("powerpc/perf: Fix Threshold
Event Counter Multiplier width for P10"), constraint bits for
thresh_cmp is also needed to be increased to 11 bits, which is
handled as part of this patch. We added bit number 53 as part
of constraint bits of thresh_cmp for power10 to make it an
11 bit field.
Updated layout for p10:
/*
* Layout of constraint bits:
*
* 60 56 52 48 44 40 36 32
* | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - |
* [ fab_match ] [ thresh_cmp ] [ thresh_ctl ] [ ]
* | |
* [ thresh_cmp bits for p10] thresh_sel -*
*
* 28 24 20 16 12 8 4 0
* | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - |
* [ ] | [ ] | [ sample ] [ ] [6] [5] [4] [3] [2] [1]
* | | | | |
* BHRB IFM -* | | |*radix_scope | Count of events for each PMC.
* EBB -* | | p1, p2, p3, p4, p5, p6.
* L1 I/D qualifier -* |
* nc - number of counters -*
*
* The PMC fields P1..P6, and NC, are adder fields. As we accumulate constraints
* we want the low bit of each field to be added to any existing value.
*
* Everything else is a value field.
*/
Result:
command#: cat /sys/devices/cpu/format/thresh_cmp
config1:0-17
ex. usage:
command#: perf record -I --weight -d -e
cpu/event=0x67340101EC,thresh_cmp=500/ ./ebizzy -S 2 -t 1 -s 4096
1826636 records/s
real 2.00 s
user 2.00 s
sys 0.00 s
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.038 MB perf.data (61 samples) ]
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210209095234.837356-1-kjain@linux.ibm.com
In commit bf13718bc5 ("powerpc: show registers when unwinding
interrupt frames") we changed our stack dumping logic to show the full
registers whenever we find an interrupt frame on the stack.
However we didn't notice that on 64-bit this doesn't show the final
frame, ie. the interrupt that brought us in from userspace, whereas on
32-bit it does.
That is due to confusion about the size of that last frame. The code
in show_stack() calls validate_sp(), passing it STACK_INT_FRAME_SIZE
to check the sp is at least that far below the top of the stack.
However on 64-bit that size is too large for the final frame, because
it includes the red zone, but we don't allocate a red zone for the
first frame.
So add a new define that encodes the correct size for 32-bit and
64-bit, and use it in show_stack().
This results in the full trace being shown on 64-bit, eg:
sysrq: Trigger a crash
Kernel panic - not syncing: sysrq triggered crash
CPU: 0 PID: 83 Comm: sh Not tainted 5.11.0-rc2-gcc-8.2.0-00188-g571abcb96b10-dirty #649
Call Trace:
[c00000000a1c3ac0] [c000000000897b70] dump_stack+0xc4/0x114 (unreliable)
[c00000000a1c3b00] [c00000000014334c] panic+0x178/0x41c
[c00000000a1c3ba0] [c00000000094e600] sysrq_handle_crash+0x40/0x50
[c00000000a1c3c00] [c00000000094ef98] __handle_sysrq+0xd8/0x210
[c00000000a1c3ca0] [c00000000094f820] write_sysrq_trigger+0x100/0x188
[c00000000a1c3ce0] [c0000000005559dc] proc_reg_write+0x10c/0x1b0
[c00000000a1c3d10] [c000000000479950] vfs_write+0xf0/0x360
[c00000000a1c3d60] [c000000000479d9c] ksys_write+0x7c/0x140
[c00000000a1c3db0] [c00000000002bf5c] system_call_exception+0x19c/0x2c0
[c00000000a1c3e10] [c00000000000d35c] system_call_common+0xec/0x278
--- interrupt: c00 at 0x7fff9fbab428
NIP: 00007fff9fbab428 LR: 000000001000b724 CTR: 0000000000000000
REGS: c00000000a1c3e80 TRAP: 0c00 Not tainted (5.11.0-rc2-gcc-8.2.0-00188-g571abcb96b10-dirty)
MSR: 900000000280f033 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 22002884 XER: 00000000
IRQMASK: 0
GPR00: 0000000000000004 00007fffc3cb8960 00007fff9fc59900 0000000000000001
GPR04: 000000002a4b32d0 0000000000000002 0000000000000063 0000000000000063
GPR08: 000000002a4b32d0 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000000000 00007fff9fcca9a0 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 00000000100b8fd0
GPR20: 000000002a4b3485 00000000100b8f90 0000000000000000 0000000000000000
GPR24: 000000002a4b0440 00000000100e77b8 0000000000000020 000000002a4b32d0
GPR28: 0000000000000001 0000000000000002 000000002a4b32d0 0000000000000001
NIP [00007fff9fbab428] 0x7fff9fbab428
LR [000000001000b724] 0x1000b724
--- interrupt: c00
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210209141627.2898485-1-mpe@ellerman.id.au
On systems with large amount of memory, loading kdump kernel through
kexec_file_load syscall may fail with the below error:
"Failed to update fdt with linux,drconf-usable-memory property"
This happens because the size estimation for kdump kernel's FDT does
not account for the additional space needed to setup usable memory
properties. Fix it by accounting for the space needed to include
linux,usable-memory & linux,drconf-usable-memory properties while
estimating kdump kernel's FDT size.
Fixes: 6ecd0163d3 ("powerpc/kexec_file: Add appropriate regions for memory reserve map")
Cc: stable@vger.kernel.org # v5.9+
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/161243826811.119001.14083048209224609814.stgit@hbathini
THP config results in compound pages. Make sure the kernel enables
the PageCompound() check with CONFIG_HUGETLB_PAGE disabled and
CONFIG_TRANSPARENT_HUGEPAGE enabled.
This makes sure we correctly flush the icache with THP pages.
flush_dcache_icache_page only matter for platforms that don't support
COHERENT_ICACHE.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210203045812.234439-1-aneesh.kumar@linux.ibm.com
SLB faults should not be taken while the PACA save areas are live, all
memory accesses should be fetches from the kernel text, and access to
PACA and the current stack, before C code is called or any other
accesses are made.
All of these have pinned SLBs so will not take a SLB fault. Therefore
EXSLB is not be required.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210208063406.331655-1-npiggin@gmail.com
The amount of code executed with enabled user space access (unlocked
KUAP) should be minimal. However with CONFIG_PROVE_LOCKING or
CONFIG_DEBUG_ATOMIC_SLEEP enabled, might_fault() calls into various
parts of the kernel, and may even end up replaying interrupts which in
turn may access user space and forget to restore the KUAP state.
The problem places are:
1. strncpy_from_user (and similar) which unlock KUAP and call
unsafe_get_user -> __get_user_allowed -> __get_user_nocheck()
with do_allow=false to skip KUAP as the caller took care of it.
2. __unsafe_put_user_goto() which is called with unlocked KUAP.
eg:
WARNING: CPU: 30 PID: 1 at arch/powerpc/include/asm/book3s/64/kup.h:324 arch_local_irq_restore+0x160/0x190
NIP arch_local_irq_restore+0x160/0x190
LR lock_is_held_type+0x140/0x200
Call Trace:
0xc00000007f392ff8 (unreliable)
___might_sleep+0x180/0x320
__might_fault+0x50/0xe0
filldir64+0x2d0/0x5d0
call_filldir+0xc8/0x180
ext4_readdir+0x948/0xb40
iterate_dir+0x1ec/0x240
sys_getdents64+0x80/0x290
system_call_exception+0x160/0x280
system_call_common+0xf0/0x27c
Change __get_user_nocheck() to look at `do_allow` to decide whether to
skip might_fault(). Since strncpy_from_user/etc call might_fault()
anyway before unlocking KUAP, there should be no visible change.
Drop might_fault() in __unsafe_put_user_goto() as it is only called
from unsafe_put_user(), which already has KUAP unlocked.
Since keeping might_fault() is still desirable for debugging, add
calls to it in user_[read|write]_access_begin(). That also allows us
to drop the is_kernel_addr() test, because there should be no code
using user_[read|write]_access_begin() in order to access a kernel
address.
Fixes: de78a9c42a ("powerpc: Add a framework for Kernel Userspace Access Protection")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[mpe: Combine with related patch from myself, merge change logs]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210204121612.32721-1-aik@ozlabs.ru
Currently unsafe_put_user() expands to __put_user_goto(), which
expands to __put_user_nocheck_goto().
There are no other uses of __put_user_nocheck_goto(), and although
there are some other uses of __put_user_goto() those could just use
unsafe_put_user().
Every layer of indirection introduces the possibility that some code
is calling that layer, and makes keeping track of the required
semantics at each point more complicated.
So drop __put_user_goto(), and rename __put_user_nocheck_goto() to
__unsafe_put_user_goto(). The "nocheck" is implied by "unsafe".
Replace the few uses of __put_user_goto() with unsafe_put_user().
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210208135717.2618798-1-mpe@ellerman.id.au
The allyesconfig ppc64 kernel fails to link with relocations unable to
fit after commit 3a96570ffc ("powerpc: convert interrupt handlers to
use wrappers"), which is due to the interrupt handler functions being
put into the .noinstr.text section, which the linker script places on
the opposite side of the main .text section from the interrupt entry
asm code which calls the handlers.
This results in a lot of linker stubs that overwhelm the 252-byte sized
space we allow for them, or in the case of BE a .opd relocation link
error for some reason.
It's not required to put interrupt handlers in the .noinstr section,
previously they used NOKPROBE_SYMBOL, so take them out and replace
with a NOKPROBE_SYMBOL in the wrapper macro. Remove the explicit
NOKPROBE_SYMBOL macros in the interrupt handler functions. This makes
a number of interrupt handlers nokprobe that were not prior to the
interrupt wrappers commit, but since that commit they were made
nokprobe due to being in .noinstr.text, so this fix does not change
that.
The fixes tag is different to the commit that first exposes the problem
because it is where the wrapper macros were introduced.
Fixes: 8d41fc618a ("powerpc: interrupt handler wrapper functions")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Slightly fix up comment wording]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210211063636.236420-1-npiggin@gmail.com
To prepare for inlining do_softirq_own_stack() replace
__ARCH_HAS_DO_SOFTIRQ with a Kconfig switch and select it in the affected
architectures.
This allows in the next step to move the function prototype and the inline
stub into a seperate asm-generic header file which is required to avoid
include recursion.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002513.181713427@linutronix.de
These machines don't support running both MMU types at the same time,
so remove the KVM_CAP_PPC_MMU_HASH_V3 capability when the host is
using Radix MMU.
[paulus@ozlabs.org - added defensive check on
kvmppc_hv_ops->hash_v3_possible]
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This reverts much of commit c01015091a ("KVM: PPC: Book3S HV: Run HPT
guests on POWER9 radix hosts"), which was required to run HPT guests on
RPT hosts on early POWER9 CPUs without support for "mixed mode", which
meant the host could not run with MMU on while guests were running.
This code has some corner case bugs, e.g., when the guest hits a machine
check or HMI the primary locks up waiting for secondaries to switch LPCR
to host, which they never do. This could all be fixed in software, but
most CPUs in production have mixed mode support, and those that don't
are believed to be all in installations that don't use this capability.
So simplify things and remove support.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Introduce KVM_CAP_PPC_DAWR1 which can be used by QEMU to query whether
KVM supports 2nd DAWR or not. The capability is by default disabled
even when the underlying CPU supports 2nd DAWR. QEMU needs to check
and enable it manually to use the feature.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
KVM code assumes single DAWR everywhere. Add code to support 2nd DAWR.
DAWR is a hypervisor resource and thus H_SET_MODE hcall is used to set/
unset it. Introduce new case H_SET_MODE_RESOURCE_SET_DAWR1 for 2nd DAWR.
Also, KVM will support 2nd DAWR only if CPU_FTR_DAWR1 is set.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>