Commit Graph

8942 Commits

Author SHA1 Message Date
Heiko Carstens
55d169c87d s390/vx: remove __uint128_t type from __vector128 struct again
The __uint128_t member was only added for future convenience to the
__vector128 struct. However this is a uapi header file, 31/32 bit (aka
compat layer) is still supported, but doesn't know anything about this
type:

/usr/include/asm/types.h:27:17: error: unknown type name __uint128_t
   27 |                 __uint128_t v;

Therefore remove it again.

Fixes: b0b7b43fcc ("s390/vx: add 64 and 128 bit members to __vector128 struct")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-14 11:45:40 +01:00
Gerald Schaefer
0807b85652 s390/mm: add support for RDP (Reset DAT-Protection)
RDP instruction allows to reset DAT-protection bit in a PTE, with less
CPU synchronization overhead than IPTE instruction. In particular, IPTE
can cause machine-wide synchronization overhead, and excessive IPTE usage
can negatively impact machine performance.

RDP can be used instead of IPTE, if the new PTE only differs in SW bits
and _PAGE_PROTECT HW bit, for PTE protection changes from RO to RW.
SW PTE bit changes are allowed, e.g. for dirty and young tracking, but none
of the other HW-defined part of the PTE must change. This is because the
architecture forbids such changes to an active and valid PTE, which
is why invalidation with IPTE is always used first, before writing a new
entry.

The RDP optimization helps mainly for fault-driven SW dirty-bit tracking.
Writable PTEs are initially always mapped with HW _PAGE_PROTECT bit set,
to allow SW dirty-bit accounting on first write protection fault, where
the DAT-protection would then be reset. The reset is now done with RDP
instead of IPTE, if RDP instruction is available.

RDP cannot always guarantee that the DAT-protection reset is propagated
to all CPUs immediately. This means that spurious TLB protection faults
on other CPUs can now occur. For this, common code provides a
flush_tlb_fix_spurious_fault() handler, which will now be used to do a
CPU-local TLB flush. However, this will clear the whole TLB of a CPU, and
not just the affected entry. For more fine-grained flushing, by simply
doing a (local) RDP again, flush_tlb_fix_spurious_fault() would need to
also provide the PTE pointer.

Note that spurious TLB protection faults cannot really be distinguished
from racing pagetable updates, where another thread already installed the
correct PTE. In such a case, the local TLB flush would be unnecessary
overhead, but overall reduction of CPU synchronization overhead by not
using IPTE is still expected to be beneficial.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-14 11:45:39 +01:00
Peter Xu
d939474b3d s390/mm: define private VM_FAULT_* reasons from top bits
The current definition already collapse with the generic definition of
vm_fault_reason.  Move the private definitions to allocate bits from the
top of uint so they won't collapse anymore.

Signed-off-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20230205231704.909536-4-peterx@redhat.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-14 11:45:39 +01:00
Halil Pasic
a2522c80f0 s390/ap: fix status returned by ap_qact()
Since commit 159491f3b5 ("s390/ap: rework assembler functions to use
unions for in/out register variables") the  function ap_qact() tries to
grab the status from the wrong part of the register. Thus we always end
up with zeros. Which is wrong, among others, because we detect failures
via status.response_code.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Harald Freudenberger <freude@linux.ibm.com>
Fixes: 159491f3b5 ("s390/ap: rework assembler functions to use unions for in/out register variables")
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-10 10:55:30 +01:00
Halil Pasic
394740d764 s390/ap: fix status returned by ap_aqic()
There function ap_aqic() tries to grab the status from the
wrong part of the register. Thus we always end up with
zeros. Which is wrong, among others, because we detect
failures via status.response_code.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Janosch Frank <frankja@linux.ibm.com>
Fixes: 159491f3b5 ("s390/ap: rework assembler functions to use unions for in/out register variables")
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-10 10:55:29 +01:00
Heiko Carstens
2f09c2ea6c Revert "s390/mem_detect: do not update output parameters on failure"
This reverts commit cbc29f107e.

Get rid of the following smatch warnings:

arch/s390/include/asm/mem_detect.h:86 get_mem_detect_end() error: uninitialized symbol 'end'.
arch/s390/include/asm/mem_detect.h:86 get_mem_detect_end() error: uninitialized symbol 'end'.
arch/s390/boot/vmem.c:256 setup_vmem() error: uninitialized symbol 'start'.
arch/s390/boot/vmem.c:258 setup_vmem() error: uninitialized symbol 'end'.

Note that there is no bug in the code. This is purely to silence smatch.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-09 20:11:24 +01:00
Heiko Carstens
be76ea6144 s390/idle: remove arch_cpu_idle_time() and corresponding code
arch_cpu_idle_time() returns the idle time of any given cpu if it is in
idle, or zero if not. All if this is racy and partially incorrect. Time
stamps taken with store clock extended and store clock fast from different
cpus are compared, while the architecture states that this is nothing which
can be relied on (see Principles of Operation; Chapter 4, "Setting and
Inspecting the Clock").

A more fundamental problem is that the timestamp when a cpu is leaving idle
is taken early in the assembler part of the interrupt handler, and this
value is only transferred many cycles later to the cpu's per-cpu idle data
structure.

This per cpu data structure is read by arch_cpu_idle() to tell for which
period of time a remote cpu is idle: if only an idle_enter value is
present, the assumed idle time of the cpu is calculated by taking a local
timestamp and returning the difference of the local timestamp and the
idle_enter value. This is potentially incorrect, since the remote cpu may
have already left idle, but the taken timestamp may not have been
transferred to the per-cpu data structure. This in turn means that too much
idle time may be reported for a cpu, and a subsequent calculation of system
idle time may result in a smaller value.

Instead of coming up with even more complex code trying to fix this, just
remove this code, and only account idle time of a cpu, after idle state is
left.

Another minor bug is that it is assumed that timestamps are non-zero, which
is not necessarily the case for timestamps taken with store clock
fast. This however is just a very minor problem, since this can only happen
when the epoch increases.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-09 20:11:23 +01:00
Heiko Carstens
a02d584e72 s390/vx: use simple assignments to access __vector128 members
Use simple assignments to access __vector128 members instead of hard
to read casts.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-09 20:11:23 +01:00
Heiko Carstens
b0b7b43fcc s390/vx: add 64 and 128 bit members to __vector128 struct
Add 64 and 128 bit members to __vector128 struct in order to allow reading
of the complete value, or the higher or lower part of vector register
contents instead of having to use casts.

Add an explicit __aligned(4) statement to avoid that the alignment of the
structure changes from 4 to 8. This should make sure that no breakage
happens because of this change.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-09 20:11:23 +01:00
Heiko Carstens
87f79d886d s390/processor: always inline cpu flag helper functions
arch_cpu_idle() is marked noinstr and therefore must only call functions
which are also not instrumented.

Make sure that cpu flag helper functions are always inlined to avoid that
the compiler generates an out-of-line function for e.g. the call within
arch_cpu_idle().

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-09 20:11:22 +01:00
Heiko Carstens
a9cbc1b471 s390/idle: mark arch_cpu_idle() noinstr
linux-next commit ("cpuidle: tracing: Warn about !rcu_is_watching()")
adds a new warning which hits on s390's arch_cpu_idle() function:

RCU not on for: arch_cpu_idle+0x0/0x28
WARNING: CPU: 2 PID: 0 at include/linux/trace_recursion.h:162 arch_ftrace_ops_list_func+0x24c/0x258
Modules linked in:
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 6.2.0-rc6-next-20230202 #4
Hardware name: IBM 8561 T01 703 (z/VM 7.3.0)
Krnl PSW : 0404d00180000000 00000000002b55c0 (arch_ftrace_ops_list_func+0x250/0x258)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
Krnl GPRS: c0000000ffffbfff 0000000080000002 0000000000000026 0000000000000000
           0000037ffffe3a28 0000037ffffe3a20 0000000000000000 0000000000000000
           0000000000000000 0000000000f4acf6 00000000001044f0 0000037ffffe3cb0
           0000000000000000 0000000000000000 00000000002b55bc 0000037ffffe3bb8
Krnl Code: 00000000002b55b0: c02000840051        larl    %r2,0000000001335652
           00000000002b55b6: c0e5fff512d1        brasl   %r14,0000000000157b58
          #00000000002b55bc: af000000            mc      0,0
          >00000000002b55c0: a7f4ffe7            brc     15,00000000002b558e
           00000000002b55c4: 0707                bcr     0,%r7
           00000000002b55c6: 0707                bcr     0,%r7
           00000000002b55c8: eb6ff0480024        stmg    %r6,%r15,72(%r15)
           00000000002b55ce: b90400ef            lgr     %r14,%r15
Call Trace:
 [<00000000002b55c0>] arch_ftrace_ops_list_func+0x250/0x258
([<00000000002b55bc>] arch_ftrace_ops_list_func+0x24c/0x258)
 [<0000000000f5f0fc>] ftrace_common+0x1c/0x20
 [<00000000001044f6>] arch_cpu_idle+0x6/0x28
 [<0000000000f4acf6>] default_idle_call+0x76/0x128
 [<00000000001cc374>] do_idle+0xf4/0x1b0
 [<00000000001cc6ce>] cpu_startup_entry+0x36/0x40
 [<0000000000119d00>] smp_start_secondary+0x140/0x150
 [<0000000000f5d2ae>] restart_int_handler+0x6e/0x90

Mark arch_cpu_idle() noinstr like all other architectures with
CONFIG_ARCH_WANTS_NO_INSTR (should) have it to fix this.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-09 20:11:22 +01:00
Heiko Carstens
c01016299d s390/idle: move idle time accounting to account_idle_time_irq()
There is no reason to do idle time accounting in arch_cpu_idle().
Do idle time accounting in account_idle_time_irq(), where it belongs
to. The accounted values don't change between account_idle_time_irq() and
arch_cpu_idle(); so the result is the same.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-09 20:11:22 +01:00
Heiko Carstens
740d63b5a0 Merge branch 'cmpxchg_user_key' into features
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-09 20:10:35 +01:00
Heiko Carstens
83089c8f50 Merge branch 'fixes' into features
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 15:13:45 +01:00
Vasily Gorbik
6bddf115d0 s390/boot: avoid potential amode31 truncation
Fixes: bb1520d581 ("s390/mm: start kernel with DAT enabled")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:55 +01:00
Vasily Gorbik
d1725ca60e s390/boot: move detect_facilities() after cmd line parsing
Facilities setup has to be done after "facilities" command line option
parsing, it might set extra or remove existing facilities bits for
testing purposes.

Fixes: bb1520d581 ("s390/mm: start kernel with DAT enabled")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:55 +01:00
Vasily Gorbik
26ced8124a s390/kasan: avoid mapping KASAN shadow for standby memory
KASAN common code is able to handle memory hotplug and create KASAN shadow
memory on a fly. Online memory ranges are available from mem_detect,
use this information to avoid mapping KASAN shadow for standby memory.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:55 +01:00
Vasily Gorbik
8382c96324 s390/boot: avoid page tables memory in kaslr
If kernel is build without KASAN support there is a chance that kernel
image is going to be positioned by KASLR code to overlap with identity
mapping page tables.

When kernel is build with KASAN support enabled memory which
is potentially going to be used for page tables and KASAN
shadow mapping is accounted for in KASLR with the use of
kasan_estimate_memory_needs(). Split this function and introduce
vmem_estimate_memory_needs() to cover decompressor's vmem identity
mapping page tables.

Fixes: bb1520d581 ("s390/mm: start kernel with DAT enabled")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:55 +01:00
Vasily Gorbik
3615d01114 s390/mem_detect: add get_mem_detect_online_total()
Add a function to get online memory in total. It is supposed to be used
in the decompressor as well as during early kernel startup.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:54 +01:00
Vasily Gorbik
bf64f0517e s390/mem_detect: handle online memory limit just once
Introduce mem_detect_truncate() to cut any online memory ranges above
established identity mapping size, so that mem_detect users wouldn't
have to do it over and over again.

Suggested-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:54 +01:00
Vasily Gorbik
22476f47b6 s390/boot: fix mem_detect extended area allocation
Allocation of mem_detect extended area was not considered neither
in commit 9641b8cc73 ("s390/ipl: read IPL report at early boot")
nor in commit b2d24b97b2 ("s390/kernel: add support for kernel address
space layout randomization (KASLR)"). As a result mem_detect extended
theoretically may overlap with ipl report or randomized kernel image
position. But as mem_detect code will allocate extended area only
upon exceeding 255 online regions (which should alternate with offline
memory regions) it is not seen in practice.

To make sure mem_detect extended area does not overlap with ipl report
or randomized kernel position extend usage of "safe_addr". Make initrd
handling and mem_detect extended area allocation code move it further
right and make KASLR takes in into consideration as well.

Fixes: 9641b8cc73 ("s390/ipl: read IPL report at early boot")
Fixes: b2d24b97b2 ("s390/kernel: add support for kernel address space layout randomization (KASLR)")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:54 +01:00
Vasily Gorbik
eb33f9eb30 s390/mem_detect: rely on diag260() if sclp_early_get_memsize() fails
In case sclp_early_get_memsize() fails but diag260() succeeds make sure
some sane value is returned. This error scenario is highly unlikely,
but this change makes system able to boot in such case.

Suggested-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:54 +01:00
Heiko Carstens
18e5cb7a5c s390/diag: make __diag8c_tmp_amode31 static
Get rid of this sparse warning:

arch/s390/kernel/diag.c:69:29: warning: symbol '__diag8c_tmp_amode31' was not declared. Should it be static?

Fixes: fbaee7464f ("s390/tty3270: add support for diag 8c")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:54 +01:00
Heiko Carstens
1e2eb49bb1 s390/rethook: add local rethook header file
Compiling the kernel with CONFIG_KPROBES disabled, but CONFIG_RETHOOK
enabled, results in this sparse warning:

arch/s390/kernel/rethook.c:26:15: warning: no previous prototype for 'arch_rethook_trampoline_callback' [-Wmissing-prototypes]
    26 | unsigned long arch_rethook_trampoline_callback(struct pt_regs *regs)
       |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Add a local rethook header file similar to riscv to address this.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 1a280f48c0 ("s390/kprobes: replace kretprobe with rethook")
Link: https://lore.kernel.org/all/202302030102.69dZIuJk-lkp@intel.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:53 +01:00
Vasily Gorbik
fb9293b9f3 s390/vmem: remove unnecessary KASAN checks
Kasan shadow memory area has been moved to the end of kernel address
space since commit 9a39abb7c9 ("s390/boot: simplify and fix kernel
memory layout setup"), therefore skipping any memory ranges above
VMALLOC_START in empty page tables cleanup code already handles
KASAN shadow memory intersection case and explicit checks could be
removed.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:53 +01:00
Vasily Gorbik
108303b0a2 s390/vmem: fix empty page tables cleanup under KASAN
Commit b9ff81003c ("s390/vmem: cleanup empty page tables") introduced
empty page tables cleanup in vmem code, but when the kernel is built
with KASAN enabled the code has no effect due to wrong KASAN shadow
memory intersection condition, which effectively ignores any memory
range below KASAN shadow. Fix intersection condition to make code
work as anticipated.

Fixes: b9ff81003c ("s390/vmem: cleanup empty page tables")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:53 +01:00
Vasily Gorbik
dfca37d36b s390/kasan: update kasan memory layout note
Kasan shadow memory area has been moved to the end of kernel address
space since commit 9a39abb7c9 ("s390/boot: simplify and fix kernel
memory layout setup"). Change kasan memory layout note accordingly.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:53 +01:00
Vasily Gorbik
3400c35a40 s390/mem_detect: fix detect_memory() error handling
Currently if for some reason sclp_early_read_info() fails,
sclp_early_get_memsize() will not set max_physmem_end and it
will stay uninitialized. Any garbage value other than 0 will lead
to detect_memory() taking wrong path or returning a garbage value
as max_physmem_end. To avoid that simply initialize max_physmem_end.

Fixes: 73045a08cf ("s390: unify identity mapping limits handling")
Reported-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:53 +01:00
Sven Schnelle
6bb361d5d8 s390/ipl: add loadparm parameter to eckd ipl/reipl data
commit 87fd22e0ae ("s390/ipl: add eckd support") missed to add the
loadparm attribute to the new eckd ipl/reipl data.

Fixes: 87fd22e0ae ("s390/ipl: add eckd support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:52 +01:00
Sven Schnelle
c676aac66f s390/ipl: add DEFINE_GENERIC_LOADPARM()
In the current code each reipl type implements its own pair of loadparm
show/store functions. Add a macro to deduplicate the code a bit.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Fixes: 87fd22e0ae ("s390/ipl: add eckd support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06 11:13:52 +01:00
Alexander Gordeev
cbc29f107e s390/mem_detect: do not update output parameters on failure
Function __get_mem_detect_block() resets start and end
output parameters in case of invalid mem_detect array
index is provided. That violates the rule of sparing
the output on fail path and leads e.g to a below anomaly:

	for_each_mem_detect_block(i, &start, &end)
		continue;

One would expect start and end contain addresses of the
last memory block (if available), but in fact the two
will be reset to zeroes. That is not how an iterator is
expected to work.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-31 18:56:36 +01:00
Vineeth Vijayan
0c6924c262 s390/cio: introduce locking for register/unregister functions
Unbinding an I/O subchannel with a child-CCW device in disconnected
state sometimes causes a kernel-panic. The race condition was seen
mostly during testing, when setting all the CHPIDs of a device to
offline and at the same time, the unbinding the I/O subchannel driver.

The kernel-panic occurs because of double delete, the I/O subchannel
driver calls device_del on the CCW device while another device_del
invocation for the same device is in-flight.  For instance, disabling
all the CHPIDs will trigger the ccw_device_remove function, which will
call a ccw_device_unregister(), which ends up calling the device_del()
which is asynchronous via cdev's todo workqueue. And unbinding the I/O
subchannel driver calls io_subchannel_remove() function which calls the
ccw_device_unregister() and device_del().

This double delete can be prevented by serializing all CCW device
registration/unregistration calls into the driver core. This patch
introduces a mutex which will be used for this purpose.

Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Reported-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-31 18:56:36 +01:00
Vasily Gorbik
05178996e1 s390/mm,ptdump: avoid Kasan vs Memcpy Real markers swapping
---[ Real Memory Copy Area Start ]---
0x001bfffffffff000-0x001c000000000000         4K PTE I
---[ Kasan Shadow Start ]---
---[ Real Memory Copy Area End ]---
0x001c000000000000-0x001c000200000000         8G PMD RW NX
...
---[ Kasan Shadow End ]---

ptdump does a stable sort of markers. Move kasan markers after
memcpy real to avoid swapping.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-31 18:56:36 +01:00
Vasily Gorbik
39da9a979c s390/boot: remove pgtable_populate_end
setup_vmem() already calls populate for all online memory regions.
pgtable_populate_end() could be removed.

Also rename pgtable_populate_begin() to pgtable_populate_init().

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-31 18:56:36 +01:00
Vasily Gorbik
e966ccf836 s390/boot: avoid mapping standby memory
Commit bb1520d581 ("s390/mm: start kernel with DAT enabled")
doesn't consider online memory holes due to potential memory offlining
and erroneously creates pgtables for stand-by memory, which bear RW+X
attribute and trigger a warning:

RANGE                                 SIZE   STATE REMOVABLE BLOCK
0x0000000000000000-0x0000000c3fffffff  49G  online       yes  0-48
0x0000000c40000000-0x0000000c7fffffff   1G offline              49
0x0000000c80000000-0x0000000fffffffff  14G  online       yes 50-63
0x0000001000000000-0x00000013ffffffff  16G offline           64-79

    s390/mm: Found insecure W+X mapping at address 0xc40000000
    WARNING: CPU: 14 PID: 1 at arch/s390/mm/dump_pagetables.c:142 note_page+0x2cc/0x2d8

Map only online memory ranges which fit within identity mapping limit.

Fixes: bb1520d581 ("s390/mm: start kernel with DAT enabled")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-31 18:56:36 +01:00
Vasily Gorbik
7ab41c2c08 s390/decompressor: specify __decompress() buf len to avoid overflow
Historically calls to __decompress() didn't specify "out_len" parameter
on many architectures including s390, expecting that no writes beyond
uncompressed kernel image are performed. This has changed since commit
2aa14b1ab2 ("zstd: import usptream v1.5.2") which includes zstd library
commit 6a7ede3dfccb ("Reduce size of dctx by reutilizing dst buffer
(#2751)"). Now zstd decompression code might store literal buffer in
the unwritten portion of the destination buffer. Since "out_len" is
not set, it is considered to be unlimited and hence free to use for
optimization needs. On s390 this might corrupt initrd or ipl report
which are often placed right after the decompressor buffer. Luckily the
size of uncompressed kernel image is already known to the decompressor,
so to avoid the problem simply specify it in the "out_len" parameter.

Link: https://github.com/facebook/zstd/commit/6a7ede3dfccb
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Link: https://lore.kernel.org/r/patch-1.thread-41c676.git-41c676c2d153.your-ad-here.call-01675030179-ext-9637@work.hours
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-31 18:54:21 +01:00
Heiko Carstens
2213d44e14 s390/syscalls: get rid of system call alias functions
bpftrace and friends only consider functions present in
/sys/kernel/tracing/available_filter_functions.

For system calls there is the s390 specific problem that the system call
function itself is present via __se_sys##name() while the system call
itself is wired up via an __s390x_sys##name() alias. The required DWARF
debug information however is only available for the original function, not
the alias, but within available_filter_functions only the functions with
__s390x_ prefix are available. Which means the required DWARF debug
information cannot be found.
While this could be solved via tooling, it is easier to change the s390
specific system call wrapper handling.

Therefore get rid of this alias handling and implement system call wrappers
like most other architectures are doing. In result the implementation
generates the following functions:

long __s390x_sys##name(struct pt_regs *regs)
static inline long __se_sys##name(...)
static inline long __do_sys##name(...)

__s390x_sys##name() is the visible system call function which is also wired
up in the system call table. Its only parameter is a pt_regs variable.

This function calls the corresponding __se_sys##name() function, which has
as many parameters like the system call definition. This function in turn
performs all zero and sign extensions of all system call parameters, taken
from the pt_regs structure, and finally calls __do_sys##name().

__do_sys##name() is the actual inlined system call function implementation.

For all 64 bit system calls there is a 31/32 bit system call function
__s390_sys##name() generated, which handles all system call parameters
correctly as required by compat handling. This function may be wired
up within the compat system call table, unless there exists an
explicit compat system call function, which is then used instead.

Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-25 20:51:11 +01:00
Heiko Carstens
0efc5d58bd s390/syscalls: remove trailing semicolon
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-25 20:51:11 +01:00
Heiko Carstens
2e4532d4ac s390/syscalls: move __S390_SYS_STUBx() macro
Move __S390_SYS_STUBx() the end of the CONFIG_COMPAT section, so both
variants (compat and non-compat) are close together and can be easily
compared.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-25 20:51:11 +01:00
Heiko Carstens
82c1b3e7e5 s390/syscalls: remove __SC_COMPAT_TYPE define
Remove __SC_COMPAT_TYPE define which is an unused leftover.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-25 20:51:11 +01:00
Heiko Carstens
7be215ba35 s390/syscalls: remove SYSCALL_METADATA() from compat syscalls
SYSCALL_METADATA() is only supposed to be used for non-compat system
calls. Otherwise there would be a name clash.

This also removes the inconsistency that s390 is the only architecture
which uses SYSCALL_METADATA() for compat system calls, and even that only
for compat system calls without parameters. Only two such compat system
calls exist.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-25 20:51:11 +01:00
Ilya Leoshkevich
e9c9cb90e7 s390: discard .interp section
When debugging vmlinux with QEMU + GDB, the following GDB error may
occur:

    (gdb) c
    Continuing.
    Warning:
    Cannot insert breakpoint -1.
    Cannot access memory at address 0xffffffffffff95c0

    Command aborted.
    (gdb)

The reason is that, when .interp section is present, GDB tries to
locate the file specified in it in memory and put a number of
breakpoints there (see enable_break() function in gdb/solib-svr4.c).
Sometimes GDB finds a bogus location that matches its heuristics,
fails to set a breakpoint and stops. This makes further debugging
impossible.

The .interp section contains misleading information anyway (vmlinux
does not need ld.so), so fix by discarding it.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-25 20:51:10 +01:00
Thomas Richter
0d5f0dc830 s390/cpum_cf: simplify PMC_INIT and PMC_RELEASE usage
Simplify the use of constants PMC_INIT and PMC_RELEASE.

Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-25 20:51:10 +01:00
Thomas Richter
1e99c242ac s390/cpum_cf: merge source files for CPU Measurement counter facility
With no in-kernel user, the source files can be merged.

Move all functions and the variable definitions to file perf_cpum_cf.c
This file now contains all the necessary functions and definitions
for the CPU Measurement counter facility device driver.

The files cpu_mcf.h and perf_cpum_cf_common.c are deleted.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-25 20:51:10 +01:00
Thomas Richter
ea53e6995f s390/cpum_cf: remove in-kernel counting facility interface
Commit 17bebcc68e ("s390/cpum_cf: Add minimal in-kernel interface for
counter measurements") introduced a small in-kernel interface for CPU
Measurement counter facility.
There are no users of this interface, therefore remove it.

The following functions are removed:
 kernel_cpumcf_alert(),
 kernel_cpumcf_begin(),
 kernel_cpumcf_end(),
 kernel_cpumcf_avail()
there is no need for them anymore.
With the removal of function kernel_cpumcf_alert(), also remove
member alert in struct cpu_cf_events. Its purpose was to counter
measurement alert interrupts for the in-kernel interface.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-25 20:51:10 +01:00
Thomas Richter
7a8f09ac18 s390/cpum_cf: move stccm_avail()
Function stccm_avail() is defined in a header file and the
only user is one single source file. Move this function to the source
file where it is also used and remove it from the header file.
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-25 20:51:10 +01:00
Thomas Richter
345d2a4dcd s390/cpum_cf: move cpum_cf_ctrset_size()
Function cpum_cf_ctrset_size() is defined in one source file and the
only user is in another source file. Move this function to the source
file where it is used and remove its prototype from the header file.
No functional change.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-25 20:51:09 +01:00
Thomas Richter
1ce357cb82 s390/cpum_cf: simplify hw_perf_event_destroy()
To remove an event from the CPU Measurement counter facility
use the lock/unlock scheme as done in event creation. Remove
the atomic_add_unless function to make the code easier.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-25 20:51:09 +01:00
Heiko Carstens
7a725b7702 s390/cache: change type from unsigned long long to unsigned long
The unsigned long long type is a leftover of the 31 bit area.
Get rid of it.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-25 20:51:09 +01:00
Vasily Gorbik
1a280f48c0 s390/kprobes: replace kretprobe with rethook
That's an adaptation of commit f3a112c0c4 ("x86,rethook,kprobes:
Replace kretprobe with rethook on x86") to s390.

Replaces the kretprobe code with rethook on s390. With this patch,
kretprobe on s390 uses the rethook instead of kretprobe specific
trampoline code.

Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-22 18:42:35 +01:00