Commit Graph

8892 Commits

Author SHA1 Message Date
Claudio Imbrenda
8c516b25d6 KVM: s390: pv: add KVM_CAP_S390_PROTECTED_ASYNC_DISABLE
Add KVM_CAP_S390_PROTECTED_ASYNC_DISABLE to signal that the
KVM_PV_ASYNC_DISABLE and KVM_PV_ASYNC_DISABLE_PREPARE commands for the
KVM_S390_PV_COMMAND ioctl are available.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20221111170632.77622-4-imbrenda@linux.ibm.com
Message-Id: <20221111170632.77622-4-imbrenda@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-11-23 09:06:50 +00:00
Claudio Imbrenda
fb491d5500 KVM: s390: pv: asynchronous destroy for reboot
Until now, destroying a protected guest was an entirely synchronous
operation that could potentially take a very long time, depending on
the size of the guest, due to the time needed to clean up the address
space from protected pages.

This patch implements an asynchronous destroy mechanism, that allows a
protected guest to reboot significantly faster than previously.

This is achieved by clearing the pages of the old guest in background.
In case of reboot, the new guest will be able to run in the same
address space almost immediately.

The old protected guest is then only destroyed when all of its memory
has been destroyed or otherwise made non protected.

Two new PV commands are added for the KVM_S390_PV_COMMAND ioctl:

KVM_PV_ASYNC_CLEANUP_PREPARE: set aside the current protected VM for
later asynchronous teardown. The current KVM VM will then continue
immediately as non-protected. If a protected VM had already been
set aside for asynchronous teardown, but without starting the teardown
process, this call will fail. There can be at most one VM set aside at
any time. Once it is set aside, the protected VM only exists in the
context of the Ultravisor, it is not associated with the KVM VM
anymore. Its protected CPUs have already been destroyed, but not its
memory. This command can be issued again immediately after starting
KVM_PV_ASYNC_CLEANUP_PERFORM, without having to wait for completion.

KVM_PV_ASYNC_CLEANUP_PERFORM: tears down the protected VM previously
set aside using KVM_PV_ASYNC_CLEANUP_PREPARE. Ideally the
KVM_PV_ASYNC_CLEANUP_PERFORM PV command should be issued by userspace
from a separate thread. If a fatal signal is received (or if the
process terminates naturally), the command will terminate immediately
without completing. All protected VMs whose teardown was interrupted
will be put in the need_cleanup list. The rest of the normal KVM
teardown process will take care of properly cleaning up all remaining
protected VMs, including the ones on the need_cleanup list.

Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Link: https://lore.kernel.org/r/20221111170632.77622-2-imbrenda@linux.ibm.com
Message-Id: <20221111170632.77622-2-imbrenda@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-11-23 09:06:50 +00:00
Niklas Schnelle
21c1f9021f s390/pci: use lock-free I/O translation updates
I/O translation tables on s390 use 8 byte page table entries and tables
which are allocated lazily but only freed when the entire I/O
translation table is torn down. Also each IOVA can at any time only
translate to one physical address Furthermore I/O table accesses by the
IOMMU hardware are cache coherent. With a bit of care we can thus use
atomic updates to manipulate the translation table without having to use
a global lock at all. This is done analogous to the existing I/O
translation table handling code used on Intel and AMD x86 systems.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20221109142903.4080275-6-schnelle@linux.ibm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-19 10:28:18 +01:00
Niklas Schnelle
2ba8336dab iommu/s390: Use RCU to allow concurrent domain_list iteration
The s390_domain->devices list is only added to when new devices are
attached but is iterated through in read-only fashion for every mapping
operation as well as for I/O TLB flushes and thus in performance
critical code causing contention on the s390_domain->list_lock.
Fortunately such a read-mostly linked list is a standard use case for
RCU. This change closely follows the example fpr RCU protected list
given in Documentation/RCU/listRCU.rst.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20221109142903.4080275-4-schnelle@linux.ibm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-19 10:28:16 +01:00
Niklas Schnelle
59bbf59679 iommu/s390: Make attach succeed even if the device is in error state
If a zPCI device is in the error state while switching IOMMU domains
zpci_register_ioat() will fail and we would end up with the device not
attached to any domain. In this state since zdev->dma_table == NULL
a reset via zpci_hot_reset_device() would wrongfully re-initialize the
device for DMA API usage using zpci_dma_init_device(). As automatic
recovery is currently disabled while attached to an IOMMU domain this
only affects slot resets triggered through other means but will affect
automatic recovery once we switch to using dma-iommu.

Additionally with that switch common code expects attaching to the
default domain to always work so zpci_register_ioat() should only fail
if there is no chance to recover anyway, e.g. if the device has been
unplugged.

Improve the robustness of attach by specifically looking at the status
returned by zpci_mod_fc() to determine if the device is unavailable and
in this case simply ignore the error. Once the device is reset
zpci_hot_reset_device() will then correctly set the domain's DMA
translation tables.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20221109142903.4080275-2-schnelle@linux.ibm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-19 10:28:15 +01:00
Linus Torvalds
ab290eaddc s390 updates for 6.1-rc6
- Fix deadlock in discontiguous saved segments (DCSS) block device
   driver. When adding a disk and scanning partitions the scan would
   not break out early without a missed flag.
 
 - Avoid using global register variable for current_stack_pointer
   due to an old bug in gcc versions prior to gcc-8.4. Due to this
   bug a broken code is generated, which leads to stack corruptions.
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYIADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCY3edLRccYWdvcmRlZXZA
 bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8MtqAQCIla7FAmztUpqFlW1yzOAb7JLd
 9VtTX9lxiAwmKeL3rQD+Ow6nNhJiT7VX1+7YS+1a5lbq0fcckEbHoMtiWxRJVwc=
 =eAUy
 -----END PGP SIGNATURE-----

Merge tag 's390-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Alexander Gordeev:

 - Fix deadlock in discontiguous saved segments (DCSS) block device
   driver. When adding a disk and scanning partitions the scan would not
   break out early without a missed flag.

 - Avoid using global register variable for current_stack_pointer due to
   an old bug in gcc versions prior to gcc-8.4. Due to this bug a broken
   code is generated, which leads to stack corruptions.

* tag 's390-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: avoid using global register for current_stack_pointer
  s390/dcssblk: fix deadlock when adding a DCSS
2022-11-18 12:30:23 -08:00
Mark Rutland
94d095ffa0 ftrace: abstract DYNAMIC_FTRACE_WITH_ARGS accesses
In subsequent patches we'll arrange for architectures to have an
ftrace_regs which is entirely distinct from pt_regs. In preparation for
this, we need to minimize the use of pt_regs to where strictly necessary
in the core ftrace code.

This patch adds new ftrace_regs_{get,set}_*() helpers which can be used
to manipulate ftrace_regs. When CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y,
these can always be used on any ftrace_regs, and when
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=n these can be used when regs are
available. A new ftrace_regs_has_args(fregs) helper is added which code
can use to check when these are usable.

Co-developed-by: Florent Revest <revest@chromium.org>
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20221103170520.931305-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-18 13:56:41 +00:00
Mark Rutland
0ef86097f1 ftrace: rename ftrace_instruction_pointer_set() -> ftrace_regs_set_instruction_pointer()
In subsequent patches we'll add a sew of ftrace_regs_{get,set}_*()
helpers. In preparation, this patch renames
ftrace_instruction_pointer_set() to
ftrace_regs_set_instruction_pointer().

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20221103170520.931305-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-18 13:56:41 +00:00
Mark Rutland
9705bc7096 ftrace: pass fregs to arch_ftrace_set_direct_caller()
In subsequent patches we'll arrange for architectures to have an
ftrace_regs which is entirely distinct from pt_regs. In preparation for
this, we need to minimize the use of pt_regs to where strictly
necessary in the core ftrace code.

This patch changes the prototype of arch_ftrace_set_direct_caller() to
take ftrace_regs rather than pt_regs, and moves the extraction of the
pt_regs into arch_ftrace_set_direct_caller().

On x86, arch_ftrace_set_direct_caller() can be used even when
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=n, and <linux/ftrace.h> defines
struct ftrace_regs. Due to this, it's necessary to define
arch_ftrace_set_direct_caller() as a macro to avoid using an incomplete
type. I've also moved the body of arch_ftrace_set_direct_caller() after
the CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y defineidion of struct
ftrace_regs.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20221103170520.931305-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-18 13:56:41 +00:00
Jason A. Donenfeld
b9b01a5625 random: use random.trust_{bootloader,cpu} command line option only
It's very unusual to have both a command line option and a compile time
option, and apparently that's confusing to people. Also, basically
everybody enables the compile time option now, which means people who
want to disable this wind up having to use the command line option to
ensure that anyway. So just reduce the number of moving pieces and nix
the compile time option in favor of the more versatile command line
option.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-11-18 02:18:10 +01:00
Jason A. Donenfeld
8032bf1233 treewide: use get_random_u32_below() instead of deprecated function
This is a simple mechanical transformation done by:

@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
  (E)

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Reviewed-by: SeongJae Park <sj@kernel.org> # for damon
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-11-18 02:15:15 +01:00
Heiko Carstens
2e71df9469 s390: use generic vga.h header file
The generic vga.h contains a couple of defines, which do no harm on
s390. Therefore use the generic version and git rid of the s390
specific empty header file.

Suggested-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-16 11:41:15 +01:00
Heiko Carstens
438d43d252 s390: use generic shmparam.h header file
Use generic shmparam.h header file since the contents are identical.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-16 11:41:14 +01:00
Heiko Carstens
a3a6f55cc6 s390: use generic bugs.h header file
Use the generic bugs.h header file. Except for an
excellent comment the header files are identical.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-16 11:41:14 +01:00
Heiko Carstens
b381d047aa s390: use generic serial.h header file
There is no serial driver on s390, especially none that relies on a
bogus BASE_BAUD define. Therefore use the generic header file.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-16 11:41:14 +01:00
Vasily Gorbik
e3c11025bc s390: avoid using global register for current_stack_pointer
Commit 30de14b188 ("s390: current_stack_pointer shouldn't be a
function") made current_stack_pointer a global register variable like
on many other architectures. Unfortunately on s390 it uncovers old
gcc bug which is fixed only since gcc-9.1 [gcc commit 3ad7fed1cc87
("S/390: Fix PR89775. Stackpointer save/restore instructions removed")]
and backported to gcc-8.4 and later. Due to this bug gcc versions prior
to 8.4 generate broken code which leads to stack corruptions.

Current minimal gcc version required to build the kernel is declared
as 5.1. It is not possible to fix all old gcc versions, so work
around this problem by avoiding using global register variable for
current_stack_pointer.

Fixes: 30de14b188 ("s390: current_stack_pointer shouldn't be a function")
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-16 11:39:12 +01:00
Linus Torvalds
f5020a08b2 s390 updates for 6.1-rc5
- fix memcpy warning about field-spanning write in zcrypt driver.
 
 - minor updates to defconfigs.
 
 - Remove CONFIG_DEBUG_INFO_BTF from all defconfigs and add btf.config
   addon config file. It significantly decreases compile time and allows
   quickly enabling that option into the current kernel config.
 
 - Add kasan.config addon config file which allows to easily enable
   KASAN into the current kernel config.
 
 - binutils commit 906f69cf65da ("IBM zSystems: Issue error for *DBL
   relocs on misaligned symbols") caused several link errors.
   Always build relocatable kernel to avoid this problem.
 
 - Raise the minimum clang version to 15.0.0 to avoid silent generation
   of a corrupted code.
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYIADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCY2566RccYWdvcmRlZXZA
 bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8M0nAQDRrcyOb3BILKYNlYkb9H8Cw/0x
 TMl/zEqcjD14XBGXXAD+L6M9nNuWd8GKLYzE7AxEeQ80kAQLqUPGGxhjc09hqAI=
 =gkiM
 -----END PGP SIGNATURE-----

Merge tag 's390-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Alexander Gordeev:

 - fix memcpy warning about field-spanning write in zcrypt driver

 - minor updates to defconfigs

 - remove CONFIG_DEBUG_INFO_BTF from all defconfigs and add btf.config
   addon config file. It significantly decreases compile time and allows
   quickly enabling that option into the current kernel config

 - add kasan.config addon config file which allows to easily enable
   KASAN into the current kernel config

 - binutils commit 906f69cf65da ("IBM zSystems: Issue error for *DBL
   relocs on misaligned symbols") caused several link errors. Always
   build relocatable kernel to avoid this problem

 - raise the minimum clang version to 15.0.0 to avoid silent generation
   of a corrupted code

* tag 's390-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  scripts/min-tool-version.sh: raise minimum clang version to 15.0.0 for s390
  s390: always build relocatable kernel
  s390/configs: add kasan.config addon config file
  s390/configs: move CONFIG_DEBUG_INFO_BTF into btf.config addon config
  s390: update defconfigs
  s390/zcrypt: fix warning about field-spanning write
2022-11-11 11:49:20 -08:00
Gerald Schaefer
00a34d5a99 s390: select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
Enable HUGETLB_PAGE_OPTIMIZE_VMEMMAP for s390.

With this, vmemmap pages used to back struct pages for compound tail
pages of hugetlb pages are freed and remapped to compound head page
frame as RO, see also Documentation/vm/vmemmap_dedup.rst.

For 1M hugetlb pages, this results in freeing 3 of 4 vmemmap pages,
saving 12K of memory for each 1M hugetlb page (~1.2%).
/sys/kernel/debug/kernel_page_tables will show the impact:

---[ vmemmap Area Start ]---
[...]
0x0000037202d84000-0x0000037202d85000         4K PTE RW NX
0x0000037202d85000-0x0000037202d88000        12K PTE RO NX

For 2G hugetlb pages, this results in freeing 8191 of 8192 vmemmap
pages, saving 32764K of memory for each 2G hugetlb page (~1.6%)
/sys/kernel/debug/kernel_page_tables will show the impact:

---[ vmemmap Area Start ]---
[...]
0x000003720a000000-0x000003720a001000         4K PTE RW NX
0x000003720a001000-0x000003720c000000     32764K PTE RO NX

The memory savings come with some costs:
- vmemmap mapping for compound hugetlb pages is not a PMD mapping any
  more, but split to 4K PTE mappings, and it will not be coalesced back
  to PMD mapping after freeing hugetlb pages from the pool.
  Apart from theoretical performance impact, this will also (slightly)
  relativize the memory savings because of additional 2K PTE pagetable
  allocations.
- Workload using "on the fly" hugetlb allocations via
  "nr_overcommit_hugepages" instead of using the hugetlb pool via
  "nr_hugepages" will suffer from considerably increased fault handling
  time, see also description from commit 78f39084b4
  ("mm: hugetlb_vmemmap: add hugetlb_optimize_vmemmap sysctl").
- Freeing hugetlb pages from the pool will require re-allocation of the
  freed struct pages, and therefore needs some memory available to the
  kernel. This might fail in memory constrained scenarios.
- For the same reason, memory offline might fail even for ZONE_MOVABLE
  when hugetlb pages are present (but not for s390, since we do not
  support ARCH_ENABLE_HUGEPAGE_MIGRATION, and therefore cannot have
  hugetlb pages in ZONE_MOVABLE).
- General increased complexity and overhead in kernel handling of
  compound (head) pages.

Therefore, this feature is disabled by default, and has to be enabled
explicitly either by adding "hugetlb_free_vmemmap=on" kernel parameter,
or during run-time via "/proc/sys/vm/hugetlb_optimize_vmemmap" sysctl.

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-10 08:00:41 +01:00
Paolo Bonzini
d663b8a285 KVM: replace direct irq.h inclusion
virt/kvm/irqchip.c is including "irq.h" from the arch-specific KVM source
directory (i.e. not from arch/*/include) for the sole purpose of retrieving
irqchip_in_kernel.

Making the function inline in a header that is already included,
such as asm/kvm_host.h, is not possible because it needs to look at
struct kvm which is defined after asm/kvm_host.h is included.  So add a
kvm_arch_irqchip_in_kernel non-inline function; irqchip_in_kernel() is
only performance critical on arm64 and x86, and the non-inline function
is enough on all other architectures.

irq.h can then be deleted from all architectures except x86.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-09 12:31:37 -05:00
Kefeng Wang
e025ab842e mm: remove kern_addr_valid() completely
Most architectures (except arm64/x86/sparc) simply return 1 for
kern_addr_valid(), which is only used in read_kcore(), and it calls
copy_from_kernel_nofault() which could check whether the address is a
valid kernel address.  So as there is no need for kern_addr_valid(), let's
remove it.

Link: https://lkml.kernel.org/r/20221018074014.185687-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Acked-by: Heiko Carstens <hca@linux.ibm.com>		[s390]
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Helge Deller <deller@gmx.de>			[parisc]
Acked-by: Michael Ellerman <mpe@ellerman.id.au>		[powerpc]
Acked-by: Guo Ren <guoren@kernel.org>			[csky]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: <aou@eecs.berkeley.edu>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Xuerui Wang <kernel@xen0n.name>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08 17:37:18 -08:00
Heiko Carstens
80ddf5ce1c s390: always build relocatable kernel
Nathan Chancellor reported several link errors on s390 with
CONFIG_RELOCATABLE disabled, after binutils commit 906f69cf65da ("IBM
zSystems: Issue error for *DBL relocs on misaligned symbols"). The binutils
commit reveals potential miscompiles that might have happened already
before with linker script defined symbols at odd addresses.

A similar bug was recently fixed in the kernel with commit c9305b6c1f
("s390: fix nospec table alignments").

See https://github.com/ClangBuiltLinux/linux/issues/1747 for an analysis
from Ulich Weigand.

Therefore always build a relocatable kernel to avoid this problem. There is
hardly any use-case for non-relocatable kernels, so this shouldn't be
controversial.

Link: https://github.com/ClangBuiltLinux/linux/issues/1747
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20221030182202.2062705-1-hca@linux.ibm.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-08 19:32:32 +01:00
Heiko Carstens
9afea696a0 s390/configs: add kasan.config addon config file
Add kasan.config addon config file which allows to easily enable KASAN
into the current kernel config.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-08 19:32:32 +01:00
Heiko Carstens
6191de8b17 s390/configs: move CONFIG_DEBUG_INFO_BTF into btf.config addon config
CONFIG_DEBUG_INFO_BTF significantly increases compile time for the
kernel. E.g. when changing a single C file compile time for a new bzImage
is increased by ~50% if BTF debug info is generated.

Therefore remove CONFIG_DEBUG_INFO_BTF from all defconfigs and introduce a
btf.config addon config file. Quickly enabling CONFIG_DEBUG_INFO_BTF into
the current kernel config can be done by simply invoking

make btf.config

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-08 19:32:32 +01:00
Nico Boehr
58635d6615 s390/mm: fix virtual-physical address confusion for swiotlb
swiotlb passes virtual addresses to set_memory_encrypted() and
set_memory_decrypted(), but uv_remove_shared() and uv_set_shared()
expect physical addresses. This currently works, because virtual
and physical addresses are the same.

Add virt_to_phys() to resolve the virtual-physical confusion.

Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Link: https://lore.kernel.org/r/20221107121221.156274-2-nrb@linux.ibm.com
Message-Id: <20221107121221.156274-2-nrb@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-11-07 14:33:40 +01:00
Rafael Mendonca
b6662e3777 KVM: s390: pci: Fix allocation size of aift kzdev elements
The 'kzdev' field of struct 'zpci_aift' is an array of pointers to
'kvm_zdev' structs. Allocate the proper size accordingly.

Reported by Coccinelle:
  WARNING: Use correct pointer type argument for sizeof

Fixes: 98b1d33dac ("KVM: s390: pci: do initial setup for AEN interpretation")
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20221026013234.960859-1-rafaelmendsr@gmail.com
Message-Id: <20221026013234.960859-1-rafaelmendsr@gmail.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-11-07 10:14:15 +01:00
Nico Boehr
6973091d1b KVM: s390: pv: don't allow userspace to set the clock under PV
When running under PV, the guest's TOD clock is under control of the
ultravisor and the hypervisor isn't allowed to change it. Hence, don't
allow userspace to change the guest's TOD clock by returning
-EOPNOTSUPP.

When userspace changes the guest's TOD clock, KVM updates its
kvm.arch.epoch field and, in addition, the epoch field in all state
descriptions of all VCPUs.

But, under PV, the ultravisor will ignore the epoch field in the state
description and simply overwrite it on next SIE exit with the actual
guest epoch. This leads to KVM having an incorrect view of the guest's
TOD clock: it has updated its internal kvm.arch.epoch field, but the
ultravisor ignores the field in the state description.

Whenever a guest is now waiting for a clock comparator, KVM will
incorrectly calculate the time when the guest should wake up, possibly
causing the guest to sleep for much longer than expected.

With this change, kvm_s390_set_tod() will now take the kvm->lock to be
able to call kvm_s390_pv_is_protected(). Since kvm_s390_set_tod_clock()
also takes kvm->lock, use __kvm_s390_set_tod_clock() instead.

The function kvm_s390_set_tod_clock is now unused, hence remove it.
Update the documentation to indicate the TOD clock attr calls can now
return -EOPNOTSUPP.

Fixes: 0f30350471 ("KVM: s390: protvirt: Do only reset registers that are accessible")
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20221011160712.928239-2-nrb@linux.ibm.com
Message-Id: <20221011160712.928239-2-nrb@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-11-07 10:14:15 +01:00
Niklas Schnelle
1a3a7d64bb iommu/s390: Get rid of s390_domain_device
The struct s390_domain_device serves the sole purpose as list entry for
the devices list of a struct s390_domain. As it contains no additional
information besides a list_head and a pointer to the struct zpci_dev we
can simplify things and just thread the device list through struct
zpci_dev directly. This removes the need to allocate during domain
attach and gets rid of one level of indirection during mapping
operations.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Link: https://lore.kernel.org/r/20221025115657.1666860-3-schnelle@linux.ibm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03 15:40:53 +01:00
Heiko Carstens
bb8738876b s390: update defconfigs
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-11-02 22:15:57 +01:00
Peter Zijlstra
bd27568117 perf: Rewrite core context handling
There have been various issues and limitations with the way perf uses
(task) contexts to track events. Most notable is the single hardware
PMU task context, which has resulted in a number of yucky things (both
proposed and merged).

Notably:
 - HW breakpoint PMU
 - ARM big.little PMU / Intel ADL PMU
 - Intel Branch Monitoring PMU
 - AMD IBS PMU
 - S390 cpum_cf PMU
 - PowerPC trace_imc PMU

*Current design:*

Currently we have a per task and per cpu perf_event_contexts:

  task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context
       ^                                 |    ^     |           ^
       `---------------------------------'    |     `--> pmu ---'
                                              v           ^
                                         perf_event ------'

Each task has an array of pointers to a perf_event_context. Each
perf_event_context has a direct relation to a PMU and a group of
events for that PMU. The task related perf_event_context's have a
pointer back to that task.

Each PMU has a per-cpu pointer to a per-cpu perf_cpu_context, which
includes a perf_event_context, which again has a direct relation to
that PMU, and a group of events for that PMU.

The perf_cpu_context also tracks which task context is currently
associated with that CPU and includes a few other things like the
hrtimer for rotation etc.

Each perf_event is then associated with its PMU and one
perf_event_context.

*Proposed design:*

New design proposed by this patch reduce to a single task context and
a single CPU context but adds some intermediate data-structures:

  task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context
       ^                           |   ^ ^
       `---------------------------'   | |
                                       | |    perf_cpu_pmu_context <--.
                                       | `----.    ^                  |
                                       |      |    |                  |
                                       |      v    v                  |
                                       | ,--> perf_event_pmu_context  |
                                       | |                            |
                                       | |                            |
                                       v v                            |
                                  perf_event ---> pmu ----------------'

With the new design, perf_event_context will hold all events for all
pmus in the (respective pinned/flexible) rbtrees. This can be achieved
by adding pmu to rbtree key:

  {cpu, pmu, cgroup, group_index}

Each perf_event_context carries a list of perf_event_pmu_context which
is used to hold per-pmu-per-context state. For example, it keeps track
of currently active events for that pmu, a pmu specific task_ctx_data,
a flag to tell whether rotation is required or not etc.

Additionally, perf_cpu_pmu_context is used to hold per-pmu-per-cpu
state like hrtimer details to drive the event rotation, a pointer to
perf_event_pmu_context of currently running task and some other
ancillary information.

Each perf_event is associated to it's pmu, perf_event_context and
perf_event_pmu_context.

Further optimizations to current implementation are possible. For
example, ctx_resched() can be optimized to reschedule only single pmu
events.

Much thanks to Ravi for picking this up and pushing it towards
completion.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Co-developed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221008062424.313-1-ravi.bangoria@amd.com
2022-10-27 20:12:16 +02:00
Thomas Richter
8b1e6a3fb3 s390/pai: fix raw data collection for PMU pai_ext
Commit 838d9bb62d ("perf: Use sample_flags for raw_data")
changed the way the raw data of an event is collected.
Adjust the PMU pai_ext to the new scheme.

Fixes: 838d9bb62d ("perf: Use sample_flags for raw_data")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-10-26 14:47:31 +02:00
Peter Oberparleiter
aa127a069e s390/boot: add secure boot trailer
This patch enhances the kernel image adding a trailer as required for
secure boot by future firmware versions.

Cc: <stable@vger.kernel.org> # 5.2+
Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-10-26 14:47:31 +02:00
Heiko Carstens
6ec803025c s390/pci: add missing EX_TABLE entries to __pcistg_mio_inuser()/__pcilg_mio_inuser()
For some exception types the instruction address points behind the
instruction that caused the exception. Take that into account and add
the missing exception table entry.

Cc: <stable@vger.kernel.org>
Fixes: f058599e22 ("s390/pci: Fix s390_mmio_read/write with MIO")
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-10-26 14:47:31 +02:00
Heiko Carstens
a262d3ad6a s390/futex: add missing EX_TABLE entry to __futex_atomic_op()
For some exception types the instruction address points behind the
instruction that caused the exception. Take that into account and add
the missing exception table entry.

Cc: <stable@vger.kernel.org>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-10-26 14:47:31 +02:00
Heiko Carstens
4e1b5a86a5 s390/uaccess: add missing EX_TABLE entries to __clear_user()
For some exception types the instruction address points behind the
instruction that caused the exception. Take that into account and add
the missing exception table entries.

Cc: <stable@vger.kernel.org>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-10-26 14:47:30 +02:00
Thomas Richter
58354c7d35 s390/pai: rename structure member users to active_events
Rename structure member users to active_events to make it consistent
with PMU pai_ext. Also use the same prefix syntax for increment and
decrement operators in both PMUs.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-10-26 14:46:52 +02:00
Thomas Richter
d3db4ac3c7 s390/pai: rework pai_crypto mapped buffer reference count
Rework the mapped buffer reference count in PMU pai_crypto
to match the same technique as in PMU pai_ext.
This simplifies the logic.
Do not count the individual number of counter and sampling
processes. Remember the type of access and the total number of
references to the buffer.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-10-26 14:46:51 +02:00
Thomas Richter
4c78796301 s390/pai: move enum definition to header file
Move enum definition to header file. This is done in preparation
for a follow on patch where this enum will be used in another source
file.
Also change the enum name from paiext_mode to paievt_mode
to indicate this enum is now used for several events.
Make naming consistent and rename PAI_MODE_COUNTER to PAI_MODE_COUNTING.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-10-26 14:46:51 +02:00
Nico Boehr
77b5334115 KVM: s390: VSIE: sort out virtual/physical address in pin_guest_page
pin_guest_page() used page_to_virt() to calculate the hpa of the pinned
page. This currently works, because virtual and physical addresses are
the same. Use page_to_phys() instead to resolve the virtual-real address
confusion.

One caller of pin_guest_page() actually expected the hpa to be a hva, so
add the missing phys_to_virt() conversion here.

Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Link: https://lore.kernel.org/r/20221025082039.117372-2-nrb@linux.ibm.com
Message-Id: <20221025082039.117372-2-nrb@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-10-26 14:28:55 +02:00
Nico Boehr
4435b79a36 KVM: s390: pv: sort out physical vs virtual pointers usage
Fix virtual vs physical address confusion (which currently are the same).

Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20221020143159.294605-6-nrb@linux.ibm.com
Message-Id: <20221020143159.294605-6-nrb@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-10-26 14:27:41 +02:00
Nico Boehr
b99f451219 KVM: s390: sida: sort out physical vs virtual pointers usage
All callers of the sida_origin() macro actually expected a virtual
address, so rename it to sida_addr() and hand out a virtual address.

At some places, the macro wasn't used, potentially creating problems
if the sida size ever becomes nonzero (not currently the case), so let's
start using it everywhere now while at it.

Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20221020143159.294605-5-nrb@linux.ibm.com
Message-Id: <20221020143159.294605-5-nrb@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-10-26 14:27:41 +02:00
Nico Boehr
fe0ef00304 KVM: s390: sort out physical vs virtual pointers usage
Fix virtual vs physical address confusion (which currently are the same).

Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20221020143159.294605-4-nrb@linux.ibm.com
Message-Id: <20221020143159.294605-4-nrb@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-10-26 14:27:41 +02:00
Nico Boehr
6b33e68ab3 s390/entry: sort out physical vs virtual pointers usage in sie64a
Fix virtual vs physical address confusion (which currently are the
same).

sie_block is accessed in entry.S and passed it to hardware, which is why
both its physical and virtual address are needed. To avoid every caller
having to do the virtual-physical conversion, add a new function sie64a()
which converts the virtual address to physical.

Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20221020143159.294605-3-nrb@linux.ibm.com
Message-Id: <20221020143159.294605-3-nrb@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-10-26 14:27:41 +02:00
Nico Boehr
079f0c21ef s390/mm: gmap: sort out physical vs virtual pointers usage
Fix virtual vs physical address confusion (which currently are the same).

Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20221020143159.294605-2-nrb@linux.ibm.com
Message-Id: <20221020143159.294605-2-nrb@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-10-26 14:27:41 +02:00
Paul E. McKenney
85bf37855c arch/s390: Add ARCH_HAS_NMI_SAFE_THIS_CPU_OPS Kconfig option
The s390 architecture uses either a cmpxchg loop (old systems)
or the laa add-to-memory instruction (new systems) to implement
this_cpu_add(), both of which are NMI safe.  This means that the old
and more-efficient srcu_read_lock() may be used in NMI context, without
the need for srcu_read_lock_nmisafe().  Therefore, add the new Kconfig
option ARCH_HAS_NMI_SAFE_THIS_CPU_OPS to arch/s390/Kconfig, which will
cause NEED_SRCU_NMI_SAFE to be deselected, thus preserving the current
srcu_read_lock() behavior.

[ paulmck: Apply Christian Borntraeger feedback. ]

Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/

Suggested-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Suggested-by: Frederic Weisbecker <frederic@kernel.org>
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: <linux-s390@vger.kernel.org>
2022-10-21 10:15:40 -07:00
Linus Torvalds
f1947d7c8a Random number generator fixes for Linux 6.1-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmNHYD0ACgkQSfxwEqXe
 A655AA//dJK0PdRghqrKQsl18GOCffV5TUw5i1VbJQbI9d8anfxNjVUQiNGZi4et
 qUwZ8OqVXxYx1Z1UDgUE39PjEDSG9/cCvOpMUWqN20/+6955WlNZjwA7Fk6zjvlM
 R30fz5CIJns9RFvGT4SwKqbVLXIMvfg/wDENUN+8sxt36+VD2gGol7J2JJdngEhM
 lW+zqzi0ABqYy5so4TU2kixpKmpC08rqFvQbD1GPid+50+JsOiIqftDErt9Eg1Mg
 MqYivoFCvbAlxxxRh3+UHBd7ZpJLtp1UFEOl2Rf00OXO+ZclLCAQAsTczucIWK9M
 8LCZjb7d4lPJv9RpXFAl3R1xvfc+Uy2ga5KeXvufZtc5G3aMUKPuIU7k28ZyblVS
 XXsXEYhjTSd0tgi3d0JlValrIreSuj0z2QGT5pVcC9utuAqAqRIlosiPmgPlzXjr
 Us4jXaUhOIPKI+Musv/fqrxsTQziT0jgVA3Njlt4cuAGm/EeUbLUkMWwKXjZLTsv
 vDsBhEQFmyZqxWu4pYo534VX2mQWTaKRV1SUVVhQEHm57b00EAiZohoOvweB09SR
 4KiJapikoopmW4oAUFotUXUL1PM6yi+MXguTuc1SEYuLz/tCFtK8DJVwNpfnWZpE
 lZKvXyJnHq2Sgod/hEZq58PMvT6aNzTzSg7YzZy+VabxQGOO5mc=
 =M+mV
 -----END PGP SIGNATURE-----

Merge tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull more random number generator updates from Jason Donenfeld:
 "This time with some large scale treewide cleanups.

  The intent of this pull is to clean up the way callers fetch random
  integers. The current rules for doing this right are:

   - If you want a secure or an insecure random u64, use get_random_u64()

   - If you want a secure or an insecure random u32, use get_random_u32()

     The old function prandom_u32() has been deprecated for a while
     now and is just a wrapper around get_random_u32(). Same for
     get_random_int().

   - If you want a secure or an insecure random u16, use get_random_u16()

   - If you want a secure or an insecure random u8, use get_random_u8()

   - If you want secure or insecure random bytes, use get_random_bytes().

     The old function prandom_bytes() has been deprecated for a while
     now and has long been a wrapper around get_random_bytes()

   - If you want a non-uniform random u32, u16, or u8 bounded by a
     certain open interval maximum, use prandom_u32_max()

     I say "non-uniform", because it doesn't do any rejection sampling
     or divisions. Hence, it stays within the prandom_*() namespace, not
     the get_random_*() namespace.

     I'm currently investigating a "uniform" function for 6.2. We'll see
     what comes of that.

  By applying these rules uniformly, we get several benefits:

   - By using prandom_u32_max() with an upper-bound that the compiler
     can prove at compile-time is ≤65536 or ≤256, internally
     get_random_u16() or get_random_u8() is used, which wastes fewer
     batched random bytes, and hence has higher throughput.

   - By using prandom_u32_max() instead of %, when the upper-bound is
     not a constant, division is still avoided, because
     prandom_u32_max() uses a faster multiplication-based trick instead.

   - By using get_random_u16() or get_random_u8() in cases where the
     return value is intended to indeed be a u16 or a u8, we waste fewer
     batched random bytes, and hence have higher throughput.

  This series was originally done by hand while I was on an airplane
  without Internet. Later, Kees and I worked on retroactively figuring
  out what could be done with Coccinelle and what had to be done
  manually, and then we split things up based on that.

  So while this touches a lot of files, the actual amount of code that's
  hand fiddled is comfortably small"

* tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  prandom: remove unused functions
  treewide: use get_random_bytes() when possible
  treewide: use get_random_u32() when possible
  treewide: use get_random_{u8,u16}() when possible, part 2
  treewide: use get_random_{u8,u16}() when possible, part 1
  treewide: use prandom_u32_max() when possible, part 2
  treewide: use prandom_u32_max() when possible, part 1
2022-10-16 15:27:07 -07:00
Linus Torvalds
676cb49573 - hfs and hfsplus kmap API modernization from Fabio Francesco
- Valentin Schneider makes crash-kexec work properly when invoked from
   an NMI-time panic.
 
 - ntfs bugfixes from Hawkins Jiawei
 
 - Jiebin Sun improves IPC msg scalability by replacing atomic_t's with
   percpu counters.
 
 - nilfs2 cleanups from Minghao Chi
 
 - lots of other single patches all over the tree!
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0Yf0gAKCRDdBJ7gKXxA
 joapAQDT1d1zu7T8yf9cQXkYnZVuBKCjxKE/IsYvqaq1a42MjQD/SeWZg0wV05B8
 DhJPj9nkEp6R3Rj3Mssip+3vNuceAQM=
 =lUQY
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - hfs and hfsplus kmap API modernization (Fabio Francesco)

 - make crash-kexec work properly when invoked from an NMI-time panic
   (Valentin Schneider)

 - ntfs bugfixes (Hawkins Jiawei)

 - improve IPC msg scalability by replacing atomic_t's with percpu
   counters (Jiebin Sun)

 - nilfs2 cleanups (Minghao Chi)

 - lots of other single patches all over the tree!

* tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits)
  include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype
  proc: test how it holds up with mapping'less process
  mailmap: update Frank Rowand email address
  ia64: mca: use strscpy() is more robust and safer
  init/Kconfig: fix unmet direct dependencies
  ia64: update config files
  nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure
  fork: remove duplicate included header files
  init/main.c: remove unnecessary (void*) conversions
  proc: mark more files as permanent
  nilfs2: remove the unneeded result variable
  nilfs2: delete unnecessary checks before brelse()
  checkpatch: warn for non-standard fixes tag style
  usr/gen_init_cpio.c: remove unnecessary -1 values from int file
  ipc/msg: mitigate the lock contention with percpu counter
  percpu: add percpu_counter_add_local and percpu_counter_sub_local
  fs/ocfs2: fix repeated words in comments
  relay: use kvcalloc to alloc page array in relay_alloc_page_array
  proc: make config PROC_CHILDREN depend on PROC_FS
  fs: uninline inode_maybe_inc_iversion()
  ...
2022-10-12 11:00:22 -07:00
Jason A. Donenfeld
a251c17aa5 treewide: use get_random_u32() when possible
The prandom_u32() function has been a deprecated inline wrapper around
get_random_u32() for several releases now, and compiles down to the
exact same code. Replace the deprecated wrapper with a direct call to
the real function. The same also applies to get_random_int(), which is
just a wrapper around get_random_u32(). This was done as a basic find
and replace.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake
Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Acked-by: Helge Deller <deller@gmx.de> # for parisc
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-11 17:42:58 -06:00
Jason A. Donenfeld
f743f16c54 treewide: use get_random_{u8,u16}() when possible, part 2
Rather than truncate a 32-bit value to a 16-bit value or an 8-bit value,
simply use the get_random_{u8,u16}() functions, which are faster than
wasting the additional bytes from a 32-bit value. This was done by hand,
identifying all of the places where one of the random integer functions
was used in a non-32-bit context.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-11 17:42:58 -06:00
Jason A. Donenfeld
81895a65ec treewide: use prandom_u32_max() when possible, part 1
Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
the minimum required bytes from the RNG and avoids divisions. This was
done mechanically with this coccinelle script:

@basic@
expression E;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u64;
@@
(
- ((T)get_random_u32() % (E))
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ((E) - 1))
+ prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2)
|
- ((u64)(E) * get_random_u32() >> 32)
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ~PAGE_MASK)
+ prandom_u32_max(PAGE_SIZE)
)

@multi_line@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
identifier RAND;
expression E;
@@

-       RAND = get_random_u32();
        ... when != RAND
-       RAND %= (E);
+       RAND = prandom_u32_max(E);

// Find a potential literal
@literal_mask@
expression LITERAL;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
position p;
@@

        ((T)get_random_u32()@p & (LITERAL))

// Add one to the literal.
@script:python add_one@
literal << literal_mask.LITERAL;
RESULT;
@@

value = None
if literal.startswith('0x'):
        value = int(literal, 16)
elif literal[0] in '123456789':
        value = int(literal, 10)
if value is None:
        print("I don't know how to handle %s" % (literal))
        cocci.include_match(False)
elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1:
        print("Skipping 0x%x for cleanup elsewhere" % (value))
        cocci.include_match(False)
elif value & (value + 1) != 0:
        print("Skipping 0x%x because it's not a power of two minus one" % (value))
        cocci.include_match(False)
elif literal.startswith('0x'):
        coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1))
else:
        coccinelle.RESULT = cocci.make_expr("%d" % (value + 1))

// Replace the literal mask with the calculated result.
@plus_one@
expression literal_mask.LITERAL;
position literal_mask.p;
expression add_one.RESULT;
identifier FUNC;
@@

-       (FUNC()@p & (LITERAL))
+       prandom_u32_max(RESULT)

@collapse_ret@
type T;
identifier VAR;
expression E;
@@

 {
-       T VAR;
-       VAR = (E);
-       return VAR;
+       return E;
 }

@drop_var@
type T;
identifier VAR;
@@

 {
-       T VAR;
        ... when != VAR
 }

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: KP Singh <kpsingh@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-11 17:42:55 -06:00
Linus Torvalds
27bc50fc90 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any negative
   reports (or any positive ones, come to that).
 
 - Also the Maple Tree from Liam R.  Howlett.  An overlapping range-based
   tree for vmas.  It it apparently slight more efficient in its own right,
   but is mainly targeted at enabling work to reduce mmap_lock contention.
 
   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.
 
   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   (https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
   This has yet to be addressed due to Liam's unfortunately timed
   vacation.  He is now back and we'll get this fixed up.
 
 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer.  It uses
   clang-generated instrumentation to detect used-unintialized bugs down to
   the single bit level.
 
   KMSAN keeps finding bugs.  New ones, as well as the legacy ones.
 
 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.
 
 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
   file/shmem-backed pages.
 
 - userfaultfd updates from Axel Rasmussen
 
 - zsmalloc cleanups from Alexey Romanov
 
 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
 
 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.
 
 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.
 
 - memcg cleanups from Kairui Song.
 
 - memcg fixes and cleanups from Johannes Weiner.
 
 - Vishal Moola provides more folio conversions
 
 - Zhang Yi removed ll_rw_block() :(
 
 - migration enhancements from Peter Xu
 
 - migration error-path bugfixes from Huang Ying
 
 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths.  For optimizations by PMEM drivers, DRM
   drivers, etc.
 
 - vma merging improvements from Jakub Matěn.
 
 - NUMA hinting cleanups from David Hildenbrand.
 
 - xu xin added aditional userspace visibility into KSM merging activity.
 
 - THP & KSM code consolidation from Qi Zheng.
 
 - more folio work from Matthew Wilcox.
 
 - KASAN updates from Andrey Konovalov.
 
 - DAMON cleanups from Kaixu Xia.
 
 - DAMON work from SeongJae Park: fixes, cleanups.
 
 - hugetlb sysfs cleanups from Muchun Song.
 
 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
 joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
 bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
 =xfWx
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It it apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-unintialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěn.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added aditional userspace visibility into KSM merging
   activity.

 - THP & KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
2022-10-10 17:53:04 -07:00
Linus Torvalds
3604a7f568 This update includes the following changes:
API:
 
 - Feed untrusted RNGs into /dev/random.
 - Allow HWRNG sleeping to be more interruptible.
 - Create lib/utils module.
 - Setting private keys no longer required for akcipher.
 - Remove tcrypt mode=1000.
 - Reorganised Kconfig entries.
 
 Algorithms:
 
 - Load x86/sha512 based on CPU features.
 - Add AES-NI/AVX/x86_64/GFNI assembler implementation of aria cipher.
 
 Drivers:
 
 - Add HACE crypto driver aspeed.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmM785cACgkQxycdCkmx
 i6dveBAAmGVYtrPmcGfA6CmzZ8ps9KdZxhjHjzLKwuqrOMulZvE2IYeUV4QtNqpQ
 6NLY2+TkqL0XIbCXoByIk32lMYIlXBaJdMYdHHDTeo7E2wqZn/46SPSWeNKazyJx
 dkL8Oj62nqDc2s0LOi3vLvod+sENFQ69R+vkHOa0fZhX0UBsac3NIXo+74Y2A7bE
 0+iQFKTWdNnoQzQ0j4q8WMiolKYh21iPZ9l5sjgMgichLCaE6PrITlRcaWrtPhey
 U1OmJtbTPsg+5X1r9KyLtoAXtBDONl66GQyne+p/ZYD8cMhxomjJaPlMhwWE/n4d
 d2KJKvoXoPPo4c+yNIS9hBav07ZriPl0q0jd2M1rd6oYTmFpaodTgIBfjvxO+wfV
 GoqDS8PEc42U1uwkuKC/cvfr6pB8WiybfXy+vSXBm/jUgIOO3y+eqsC8Jx9ZoQeG
 F+d34PYfJrJbmDRtcA6ZKdzN0OmKq7aCilx1kGKGPg0D+uq64FBo7zsT6XzTK8HL
 2Za9AACPn87xLQwGrKDSBfyrlSSIJm2FaIIPayUXHEo7cyoiZwbTpXRRJ1mDR+v9
 jzI+xPEXCthtjysuRmufNhTkiZUv3lZ8ORfQ0QFKR53tjZUm+dVQo0V/N/ZSXoSV
 SyRvXYO+ToXePAofNWl1LcO1grX/vxtFNedMkDLHXooRcnCaIYo=
 =rq2f
 -----END PGP SIGNATURE-----

Merge tag 'v6.1-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Feed untrusted RNGs into /dev/random
   - Allow HWRNG sleeping to be more interruptible
   - Create lib/utils module
   - Setting private keys no longer required for akcipher
   - Remove tcrypt mode=1000
   - Reorganised Kconfig entries

  Algorithms:
   - Load x86/sha512 based on CPU features
   - Add AES-NI/AVX/x86_64/GFNI assembler implementation of aria cipher

  Drivers:
   - Add HACE crypto driver aspeed"

* tag 'v6.1-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (124 commits)
  crypto: aspeed - Remove redundant dev_err call
  crypto: scatterwalk - Remove unused inline function scatterwalk_aligned()
  crypto: aead - Remove unused inline functions from aead
  crypto: bcm - Simplify obtain the name for cipher
  crypto: marvell/octeontx - use sysfs_emit() to instead of scnprintf()
  hwrng: core - start hwrng kthread also for untrusted sources
  crypto: zip - remove the unneeded result variable
  crypto: qat - add limit to linked list parsing
  crypto: octeontx2 - Remove the unneeded result variable
  crypto: ccp - Remove the unneeded result variable
  crypto: aspeed - Fix check for platform_get_irq() errors
  crypto: virtio - fix memory-leak
  crypto: cavium - prevent integer overflow loading firmware
  crypto: marvell/octeontx - prevent integer overflows
  crypto: aspeed - fix build error when only CRYPTO_DEV_ASPEED is enabled
  crypto: hisilicon/qm - fix the qos value initialization
  crypto: sun4i-ss - use DEFINE_SHOW_ATTRIBUTE to simplify sun4i_ss_debugfs
  crypto: tcrypt - add async speed test for aria cipher
  crypto: aria-avx - add AES-NI/AVX/x86_64/GFNI assembler implementation of aria cipher
  crypto: aria - prepare generic module for optimized implementations
  ...
2022-10-10 13:04:25 -07:00
Linus Torvalds
8afc66e8d4 Kbuild updates for v6.1
- Remove potentially incomplete targets when Kbuid is interrupted by
    SIGINT etc. in case GNU Make may miss to do that when stderr is piped
    to another program.
 
  - Rewrite the single target build so it works more correctly.
 
  - Fix rpm-pkg builds with V=1.
 
  - List top-level subdirectories in ./Kbuild.
 
  - Ignore auto-generated __kstrtab_* and __kstrtabns_* symbols in kallsyms.
 
  - Avoid two different modules in lib/zstd/ having shared code, which
    potentially causes building the common code as build-in and modular
    back-and-forth.
 
  - Unify two modpost invocations to optimize the build process.
 
  - Remove head-y syntax in favor of linker scripts for placing particular
    sections in the head of vmlinux.
 
  - Bump the minimal GNU Make version to 3.82.
 
  - Clean up misc Makefiles and scripts.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmM+4vcVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGY2IQAInr0JUNnkkxwUSXtOcQuA3IK8RJ
 FbU9HXJRoV9H+7+l3SMlN7mIbrs5eE5fTY3iwQ3CVe139d1+1q7nvTMRv8owywJx
 GBgzswncuu1lk7iQQ//CxiqMwSCG8GJdYn1uDVy4I5jg3o+DtFZJtyq2Wb7pqsMm
 ZhZ4PozRN+idYQJSF6Vx/zEVLHI7quMBwfe4CME8/0Kg2+hnYzbXV/aUf0ED2emq
 zdCMDQgIOK5AhY+8qgMXKYnBUJMTqBp6LoR4p3ApfUkwRFY0sGa0/LK3U/B22OE7
 uWyR4fCUExGyerlcHEVev+9eBfmsLLPyqlchNwpSDOPf5OSdnKmgqJEBR/Cvx0eh
 URerPk7EHxyH3G8yi+cU2GtofNTGc5RHPRgJE2ADsQEi5TAUKGmbXMlsFRL/51Vn
 lTANZObBNa1d4enljF6TfTL5nuccOa+DKvXnH9fQ49t0QdtSikv6J/lGwilwm1Sr
 BctmCsySPuURZfkpI9OQnLuouloMXl9f7Q/+S39haS/tSgvPpyITyO71nxDnXn/s
 BbFObZJUk9QkqOACjBP1hNErTLt83uBxQ9z+rDCw/SbLIe4nw0wyneuygfHI5rI8
 3RZB2DbGauuJHX2Zs6YGS14SLSY33IsLqKR1/Vy3LrPvOHuEvNiOR8LITq5E0YCK
 OffK2Y5cIlXR0QWf
 =DHiN
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Remove potentially incomplete targets when Kbuid is interrupted by
   SIGINT etc in case GNU Make may miss to do that when stderr is piped
   to another program.

 - Rewrite the single target build so it works more correctly.

 - Fix rpm-pkg builds with V=1.

 - List top-level subdirectories in ./Kbuild.

 - Ignore auto-generated __kstrtab_* and __kstrtabns_* symbols in
   kallsyms.

 - Avoid two different modules in lib/zstd/ having shared code, which
   potentially causes building the common code as build-in and modular
   back-and-forth.

 - Unify two modpost invocations to optimize the build process.

 - Remove head-y syntax in favor of linker scripts for placing
   particular sections in the head of vmlinux.

 - Bump the minimal GNU Make version to 3.82.

 - Clean up misc Makefiles and scripts.

* tag 'kbuild-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (41 commits)
  docs: bump minimal GNU Make version to 3.82
  ia64: simplify esi object addition in Makefile
  Revert "kbuild: Check if linker supports the -X option"
  kbuild: rebuild .vmlinux.export.o when its prerequisite is updated
  kbuild: move modules.builtin(.modinfo) rules to Makefile.vmlinux_o
  zstd: Fixing mixed module-builtin objects
  kallsyms: ignore __kstrtab_* and __kstrtabns_* symbols
  kallsyms: take the input file instead of reading stdin
  kallsyms: drop duplicated ignore patterns from kallsyms.c
  kbuild: reuse mksysmap output for kallsyms
  mksysmap: update comment about __crc_*
  kbuild: remove head-y syntax
  kbuild: use obj-y instead extra-y for objects placed at the head
  kbuild: hide error checker logs for V=1 builds
  kbuild: re-run modpost when it is updated
  kbuild: unify two modpost invocations
  kbuild: move vmlinux.o rule to the top Makefile
  kbuild: move .vmlinux.objs rule to Makefile.modpost
  kbuild: list sub-directories in ./Kbuild
  Makefile.compiler: replace cc-ifversion with compiler-specific macros
  ...
2022-10-10 12:00:45 -07:00
Linus Torvalds
3871d93b82 Perf events updates for v6.1:
- PMU driver updates:
 
      - Add AMD Last Branch Record Extension Version 2 (LbrExtV2)
        feature support for Zen 4 processors.
 
      - Extend the perf ABI to provide branch speculation information,
        if available, and use this on CPUs that have it (eg. LbrExtV2).
 
      - Improve Intel PEBS TSC timestamp handling & integration.
 
      - Add Intel Raptor Lake S CPU support.
 
      - Add 'perf mem' and 'perf c2c' memory profiling support on
        AMD CPUs by utilizing IBS tagged load/store samples.
 
      - Clean up & optimize various x86 PMU details.
 
  - HW breakpoints:
 
      - Big rework to optimize the code for systems with hundreds of CPUs and
        thousands of breakpoints:
 
         - Replace the nr_bp_mutex global mutex with the bp_cpuinfo_sem
 	  per-CPU rwsem that is read-locked during most of the key operations.
 
 	- Improve the O(#cpus * #tasks) logic in toggle_bp_slot()
 	  and fetch_bp_busy_slots().
 
 	- Apply micro-optimizations & cleanups.
 
   - Misc cleanups & enhancements.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmM/2pMRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iIMA/+J+MCEVTt9kwZeBtHoPX7iZ5gnq1+McoQ
 f6ALX19AO/ZSuA7EBA3cS3Ny5eyGy3ofYUnRW+POezu9CpflLW/5N27R2qkZFrWC
 A09B86WH676ZrmXt+oI05rpZ2y/NGw4gJxLLa4/bWF3g9xLfo21i+YGKwdOnNFpl
 DEdCVHtjlMcOAU3+on6fOYuhXDcYd7PKGcCfLE7muOMOAtwyj0bUDBt7m+hneZgy
 qbZHzDU2DZ5L/LXiMyuZj5rC7V4xUbfZZfXglG38YCW1WTieS3IjefaU2tREhu7I
 rRkCK48ULDNNJR3dZK8IzXJRxusq1ICPG68I+nm/K37oZyTZWtfYZWehW/d/TnPa
 tUiTwimabz7UUqaGq9ZptxwINcAigax0nl6dZ3EseeGhkDE6j71/3kqrkKPz4jth
 +fCwHLOrI3c4Gq5qWgPvqcUlUneKf3DlOMtzPKYg7sMhla2zQmFpYCPzKfm77U/Z
 BclGOH3FiwaK6MIjPJRUXTePXqnUseqCR8PCH/UPQUeBEVHFcMvqCaa15nALed8x
 dFi76VywR9mahouuLNq6sUNePlvDd2B124PygNwegLlBfY9QmKONg9qRKOnQpuJ6
 UprRJjLOOucZ/N/jn6+ShHkqmXsnY2MhfUoHUoMQ0QAI+n++e+2AuePo251kKWr8
 QlqKxd9PMQU=
 =LcGg
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf events updates from Ingo Molnar:
 "PMU driver updates:

   - Add AMD Last Branch Record Extension Version 2 (LbrExtV2) feature
     support for Zen 4 processors.

   - Extend the perf ABI to provide branch speculation information, if
     available, and use this on CPUs that have it (eg. LbrExtV2).

   - Improve Intel PEBS TSC timestamp handling & integration.

   - Add Intel Raptor Lake S CPU support.

   - Add 'perf mem' and 'perf c2c' memory profiling support on AMD CPUs
     by utilizing IBS tagged load/store samples.

   - Clean up & optimize various x86 PMU details.

  HW breakpoints:

   - Big rework to optimize the code for systems with hundreds of CPUs
     and thousands of breakpoints:

      - Replace the nr_bp_mutex global mutex with the bp_cpuinfo_sem
        per-CPU rwsem that is read-locked during most of the key
        operations.

      - Improve the O(#cpus * #tasks) logic in toggle_bp_slot() and
        fetch_bp_busy_slots().

      - Apply micro-optimizations & cleanups.

  - Misc cleanups & enhancements"

* tag 'perf-core-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits)
  perf/hw_breakpoint: Annotate tsk->perf_event_mutex vs ctx->mutex
  perf: Fix pmu_filter_match()
  perf: Fix lockdep_assert_event_ctx()
  perf/x86/amd/lbr: Adjust LBR regardless of filtering
  perf/x86/utils: Fix uninitialized var in get_branch_type()
  perf/uapi: Define PERF_MEM_SNOOPX_PEER in kernel header file
  perf/x86/amd: Support PERF_SAMPLE_PHY_ADDR
  perf/x86/amd: Support PERF_SAMPLE_ADDR
  perf/x86/amd: Support PERF_SAMPLE_{WEIGHT|WEIGHT_STRUCT}
  perf/x86/amd: Support PERF_SAMPLE_DATA_SRC
  perf/x86/amd: Add IBS OP_DATA2 DataSrc bit definitions
  perf/mem: Introduce PERF_MEM_LVLNUM_{EXTN_MEM|IO}
  perf/x86/uncore: Add new Raptor Lake S support
  perf/x86/cstate: Add new Raptor Lake S support
  perf/x86/msr: Add new Raptor Lake S support
  perf/x86: Add new Raptor Lake S support
  bpf: Check flags for branch stack in bpf_read_branch_records helper
  perf, hw_breakpoint: Fix use-after-free if perf_event_open() fails
  perf: Use sample_flags for raw_data
  perf: Use sample_flags for addr
  ...
2022-10-10 09:27:46 -07:00
Linus Torvalds
03785a69ae s390 updates for the 6.1 merge window
- Make use of the IBM z16 processor activity instrumentation facility
   extension to count neural network processor assist operations: add a new
   PMU device driver so that perf can make use of this.
 
 - Rework memcpy_real() to avoid DAT-off mode.
 
 - Rework absolute lowcore access code.
 
 - Various small fixes and improvements all over the code.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmNB0rUACgkQjYWKoQLX
 FBjIwAgAkxPvjr50iK5/exlZR7S2bcEwoCKBUX+D7563SE4jQQp3o5Gt2MZ3hdL0
 IKRkPPA/xRV0W7Oj57SvU4xXvxeXINoCPsQQlLrHg77L6B3TlWpmVWXjseunMyfM
 zhZTbv1sZ/XcdrnRwMq3P5fb8TnNHiU0/6lBz4Efi92K5NerQ0JBY7w7xbrZwiXD
 2R6ZpokXNcaBBtaETpu+PWu6QXyUOpvIeT+b6DG+4/ud2GmY7KNM75l2lzgWXajh
 7yHTLTkQTvUF+3jZS2JiRm9rrH/FY3X5PXVg17Q10KmzmVXp5WT75pVyDEuUq0a8
 nOj2QsFSX/xDJTcPvrAWZxMuxzLtRg==
 =6LMp
 -----END PGP SIGNATURE-----

Merge tag 's390-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Make use of the IBM z16 processor activity instrumentation facility
   extension to count neural network processor assist operations: add a
   new PMU device driver so that perf can make use of this.

 - Rework memcpy_real() to avoid DAT-off mode.

 - Rework absolute lowcore access code.

 - Various small fixes and improvements all over the code.

* tag 's390-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/pci: remove unused bus_next field from struct zpci_dev
  s390/cio: remove unused ccw_device_force_console() declaration
  s390/pai: Add support for PAI Extension 1 NNPA counters
  s390/mm: fix no previous prototype warnings in maccess.c
  s390/mm: uninline copy_oldmem_kernel() function
  s390/mm,ptdump: add real memory copy page markers
  s390/mm: rework memcpy_real() to avoid DAT-off mode
  s390/dump: save IPL CPU registers once DAT is available
  s390/pci: convert high_memory to physical address
  s390/smp,ptdump: add absolute lowcore markers
  s390/smp: rework absolute lowcore access
  s390/smp: call smp_reinit_ipl_cpu() before scheduler is available
  s390/ptdump: add missing amode31 markers
  s390/mm: split lowcore pages with set_memory_4k()
  s390/mm: remove unused access parameter from do_fault_error()
  s390/delay: sync comment within __delay() with reality
  s390: move from strlcpy with unused retval to strscpy
2022-10-09 13:51:40 -07:00
Linus Torvalds
ef688f8b8c The first batch of KVM patches, mostly covering x86, which I
am sending out early due to me travelling next week.  There is a
 lone mm patch for which Andrew gave an informal ack at
 https://lore.kernel.org/linux-mm/20220817102500.440c6d0a3fce296fdf91bea6@linux-foundation.org.
 
 I will send the bulk of ARM work, as well as other
 architectures, at the end of next week.
 
 ARM:
 
 * Account stage2 page table allocations in memory stats.
 
 x86:
 
 * Account EPT/NPT arm64 page table allocations in memory stats.
 
 * Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR accesses.
 
 * Drop eVMCS controls filtering for KVM on Hyper-V, all known versions of
   Hyper-V now support eVMCS fields associated with features that are
   enumerated to the guest.
 
 * Use KVM's sanitized VMCS config as the basis for the values of nested VMX
   capabilities MSRs.
 
 * A myriad event/exception fixes and cleanups.  Most notably, pending
   exceptions morph into VM-Exits earlier, as soon as the exception is
   queued, instead of waiting until the next vmentry.  This fixed
   a longstanding issue where the exceptions would incorrecly become
   double-faults instead of triggering a vmexit; the common case of
   page-fault vmexits had a special workaround, but now it's fixed
   for good.
 
 * A handful of fixes for memory leaks in error paths.
 
 * Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow.
 
 * Never write to memory from non-sleepable kvm_vcpu_check_block()
 
 * Selftests refinements and cleanups.
 
 * Misc typo cleanups.
 
 Generic:
 
 * remove KVM_REQ_UNHALT
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmM2zwcUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNpbwf+MlVeOlzE5SBdrJ0TEnLmKUel1lSz
 QnZzP5+D65oD0zhCilUZHcg6G4mzZ5SdVVOvrGJvA0eXh25ruLNMF6jbaABkMLk/
 FfI1ybN7A82hwJn/aXMI/sUurWv4Jteaad20JC2DytBCnsW8jUqc49gtXHS2QWy4
 3uMsFdpdTAg4zdJKgEUfXBmQviweVpjjl3ziRyZZ7yaeo1oP7XZ8LaE1nR2l5m0J
 mfjzneNm5QAnueypOh5KhSwIvqf6WHIVm/rIHDJ1HIFbgfOU0dT27nhb1tmPwAcE
 +cJnnMUHjZqtCXteHkAxMClyRq0zsEoKk0OGvSOOMoq3Q0DavSXUNANOig==
 =/hqX
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "The first batch of KVM patches, mostly covering x86.

  ARM:

   - Account stage2 page table allocations in memory stats

  x86:

   - Account EPT/NPT arm64 page table allocations in memory stats

   - Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR
     accesses

   - Drop eVMCS controls filtering for KVM on Hyper-V, all known
     versions of Hyper-V now support eVMCS fields associated with
     features that are enumerated to the guest

   - Use KVM's sanitized VMCS config as the basis for the values of
     nested VMX capabilities MSRs

   - A myriad event/exception fixes and cleanups. Most notably, pending
     exceptions morph into VM-Exits earlier, as soon as the exception is
     queued, instead of waiting until the next vmentry. This fixed a
     longstanding issue where the exceptions would incorrecly become
     double-faults instead of triggering a vmexit; the common case of
     page-fault vmexits had a special workaround, but now it's fixed for
     good

   - A handful of fixes for memory leaks in error paths

   - Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow

   - Never write to memory from non-sleepable kvm_vcpu_check_block()

   - Selftests refinements and cleanups

   - Misc typo cleanups

  Generic:

   - remove KVM_REQ_UNHALT"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits)
  KVM: remove KVM_REQ_UNHALT
  KVM: mips, x86: do not rely on KVM_REQ_UNHALT
  KVM: x86: never write to memory from kvm_vcpu_check_block()
  KVM: x86: Don't snapshot pending INIT/SIPI prior to checking nested events
  KVM: nVMX: Make event request on VMXOFF iff INIT/SIPI is pending
  KVM: nVMX: Make an event request if INIT or SIPI is pending on VM-Enter
  KVM: SVM: Make an event request if INIT or SIPI is pending when GIF is set
  KVM: x86: lapic does not have to process INIT if it is blocked
  KVM: x86: Rename kvm_apic_has_events() to make it INIT/SIPI specific
  KVM: x86: Rename and expose helper to detect if INIT/SIPI are allowed
  KVM: nVMX: Make an event request when pending an MTF nested VM-Exit
  KVM: x86: make vendor code check for all nested events
  mailmap: Update Oliver's email address
  KVM: x86: Allow force_emulation_prefix to be written without a reload
  KVM: selftests: Add an x86-only test to verify nested exception queueing
  KVM: selftests: Use uapi header to get VMX and SVM exit reasons/codes
  KVM: x86: Rename inject_pending_events() to kvm_check_and_inject_events()
  KVM: VMX: Update MTF and ICEBP comments to document KVM's subtle behavior
  KVM: x86: Treat pending TRIPLE_FAULT requests as pending exceptions
  KVM: x86: Morph pending exceptions to pending VM-Exits at queue time
  ...
2022-10-09 09:39:55 -07:00
Linus Torvalds
6181073dd6 TTY/Serial driver update for 6.1-rc1
Here is the big set of TTY and Serial driver updates for 6.1-rc1.
 
 Lots of cleanups in here, no real new functionality this time around,
 with the diffstat being that we removed more lines than we added!
 
 Included in here are:
 	- termios unification cleanups from Al Viro, it's nice to
 	  finally get this work done
 	- tty serial transmit cleanups in various drivers in preparation
 	  for more cleanup and unification in future releases (that work
 	  was not ready for this release.)
 	- n_gsm fixes and updates
 	- ktermios cleanups and code reductions
 	- dt bindings json conversions and updates for new devices
 	- some serial driver updates for new devices
 	- lots of other tiny cleanups and janitorial stuff.  Full
 	  details in the shortlog.
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY0BSdA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylucQCfaXIrYuh2AHcb6+G+Nqp1xD2BYaEAoIdLyOCA
 a2yziLrDF6us2oav6j4x
 =Wv+X
 -----END PGP SIGNATURE-----

Merge tag 'tty-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial driver updates from Greg KH:
 "Here is the big set of TTY and Serial driver updates for 6.1-rc1.

  Lots of cleanups in here, no real new functionality this time around,
  with the diffstat being that we removed more lines than we added!

  Included in here are:

   - termios unification cleanups from Al Viro, it's nice to finally get
     this work done

   - tty serial transmit cleanups in various drivers in preparation for
     more cleanup and unification in future releases (that work was not
     ready for this release)

   - n_gsm fixes and updates

   - ktermios cleanups and code reductions

   - dt bindings json conversions and updates for new devices

   - some serial driver updates for new devices

   - lots of other tiny cleanups and janitorial stuff. Full details in
     the shortlog.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'tty-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (102 commits)
  serial: cpm_uart: Don't request IRQ too early for console port
  tty: serial: do unlock on a common path in altera_jtaguart_console_putc()
  tty: serial: unify TX space reads under altera_jtaguart_tx_space()
  tty: serial: use FIELD_GET() in lqasc_tx_ready()
  tty: serial: extend lqasc_tx_ready() to lqasc_console_putchar()
  tty: serial: allow pxa.c to be COMPILE_TESTed
  serial: stm32: Fix unused-variable warning
  tty: serial: atmel: Add COMMON_CLK dependency to SERIAL_ATMEL
  serial: 8250: Fix restoring termios speed after suspend
  serial: Deassert Transmit Enable on probe in driver-specific way
  serial: 8250_dma: Convert to use uart_xmit_advance()
  serial: 8250_omap: Convert to use uart_xmit_advance()
  MAINTAINERS: Solve warning regarding inexistent atmel-usart binding
  serial: stm32: Deassert Transmit Enable on ->rs485_config()
  serial: ar933x: Deassert Transmit Enable on ->rs485_config()
  tty: serial: atmel: Use FIELD_PREP/FIELD_GET
  tty: serial: atmel: Make the driver aware of the existence of GCLK
  tty: serial: atmel: Only divide Clock Divisor if the IP is USART
  tty: serial: atmel: Separate mode clearing between UART and USART
  dt-bindings: serial: atmel,at91-usart: Add gclk as a possible USART clock
  ...
2022-10-07 16:36:24 -07:00
Linus Torvalds
513389809e for-6.1/block-2022-10-03
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmM67XkQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpiHoD/9eN+6YnNRPu5+2zeGnnm1Nlwic6YMZeORr
 KFIeC0COMWoFhNBIPFkgAKT+0qIH+uGt5UsHSM3Y5La7wMR8yLxD4PAnvTZ/Ijtt
 yxVIOmonJoQ0OrQ2kTbvDXL/9OCUrzwXXyUIEPJnH0Ca1mxeNOgDHbE7VGF6DMul
 0D3pI8qs2WLnHlDi1V/8kH5qZ6WoAJSDcb8sTzOUVnyveZPNaZhGQJuHA2XAYMtg
 fqKMDJqgmNk6jdTMUgdF5B+rV64PQoCy28I7fXqGkEe+RE5TBy57vAa0XY84V8XR
 /a8CEuwMts2ypk1hIcJG8Vv8K6u5war9yPM5MTngKsoMpzNIlhrhaJQVyjKdcs+E
 Ixwzexu6xTYcrcq+mUARgeTh79FzTBM/uXEdbCG2G3S6HPd6UZWUJZGfxw/l0Aem
 V4xB7lj6SQaJDU1iJCYUaHcekNXhQAPvyVG+R2ED1SO3McTpTPIM1aeigxw6vj7u
 bH3Kfdr94Z8HNuoLuiS6YYfjNt2Shf4LEB6GxKJ9TYHtyhdOyO0H64jGHpygrWqN
 cSnkWPUqUUNpF7srKM0ZgbliCshvmyJc4aMOFd0gBY/kXf5J/j7IXvh8TFCi9rHH
 0KyZH3/3Zsu9geUn3ynznlr4FXU+BcqE6boaa/iWb9sN1m+Rvaahv8cSch/dh44a
 vQNj/iOBQA==
 =R05e
 -----END PGP SIGNATURE-----

Merge tag 'for-6.1/block-2022-10-03' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:

 - NVMe pull requests via Christoph:
      - handle number of queue changes in the TCP and RDMA drivers
        (Daniel Wagner)
      - allow changing the number of queues in nvmet (Daniel Wagner)
      - also consider host_iface when checking ip options (Daniel
        Wagner)
      - don't map pages which can't come from HIGHMEM (Fabio M. De
        Francesco)
      - avoid unnecessary flush bios in nvmet (Guixin Liu)
      - shrink and better pack the nvme_iod structure (Keith Busch)
      - add comment for unaligned "fake" nqn (Linjun Bao)
      - print actual source IP address through sysfs "address" attr
        (Martin Belanger)
      - various cleanups (Jackie Liu, Wolfram Sang, Genjian Zhang)
      - handle effects after freeing the request (Keith Busch)
      - copy firmware_rev on each init (Keith Busch)
      - restrict management ioctls to admin (Keith Busch)
      - ensure subsystem reset is single threaded (Keith Busch)
      - report the actual number of tagset maps in nvme-pci (Keith
        Busch)
      - small fabrics authentication fixups (Christoph Hellwig)
      - add common code for tagset allocation and freeing (Christoph
        Hellwig)
      - stop using the request_queue in nvmet (Christoph Hellwig)
      - set min_align_mask before calculating max_hw_sectors (Rishabh
        Bhatnagar)
      - send a rediscover uevent when a persistent discovery controller
        reconnects (Sagi Grimberg)
      - misc nvmet-tcp fixes (Varun Prakash, zhenwei pi)

 - MD pull request via Song:
      - Various raid5 fix and clean up, by Logan Gunthorpe and David
        Sloan.
      - Raid10 performance optimization, by Yu Kuai.

 - sbitmap wakeup hang fixes (Hugh, Keith, Jan, Yu)

 - IO scheduler switching quisce fix (Keith)

 - s390/dasd block driver updates (Stefan)

 - support for recovery for the ublk driver (ZiyangZhang)

 - rnbd drivers fixes and updates (Guoqing, Santosh, ye, Christoph)

 - blk-mq and null_blk map fixes (Bart)

 - various bcache fixes (Coly, Jilin, Jules)

 - nbd signal hang fix (Shigeru)

 - block writeback throttling fix (Yu)

 - optimize the passthrough mapping handling (me)

 - prepare block cgroups to being gendisk based (Christoph)

 - get rid of an old PSI hack in the block layer, moving it to the
   callers instead where it belongs (Christoph)

 - blk-throttle fixes and cleanups (Yu)

 - misc fixes and cleanups (Liu Shixin, Liu Song, Miaohe, Pankaj,
   Ping-Xiang, Wolfram, Saurabh, Li Jinlin, Li Lei, Lin, Li zeming,
   Miaohe, Bart, Coly, Gaosheng

* tag 'for-6.1/block-2022-10-03' of git://git.kernel.dk/linux: (162 commits)
  sbitmap: fix lockup while swapping
  block: add rationale for not using blk_mq_plug() when applicable
  block: adapt blk_mq_plug() to not plug for writes that require a zone lock
  s390/dasd: use blk_mq_alloc_disk
  blk-cgroup: don't update the blkg lookup hint in blkg_conf_prep
  nvmet: don't look at the request_queue in nvmet_bdev_set_limits
  nvmet: don't look at the request_queue in nvmet_bdev_zone_mgmt_emulate_all
  blk-mq: use quiesced elevator switch when reinitializing queues
  block: replace blk_queue_nowait with bdev_nowait
  nvme: remove nvme_ctrl_init_connect_q
  nvme-loop: use the tagset alloc/free helpers
  nvme-loop: store the generic nvme_ctrl in set->driver_data
  nvme-loop: initialize sqsize later
  nvme-fc: use the tagset alloc/free helpers
  nvme-fc: store the generic nvme_ctrl in set->driver_data
  nvme-fc: keep ctrl->sqsize in sync with opts->queue_size
  nvme-rdma: use the tagset alloc/free helpers
  nvme-rdma: store the generic nvme_ctrl in set->driver_data
  nvme-tcp: use the tagset alloc/free helpers
  nvme-tcp: store the generic nvme_ctrl in set->driver_data
  ...
2022-10-07 09:19:14 -07:00
Alexander Potapenko
33b75c1d88 instrumented.h: allow instrumenting both sides of copy_from_user()
Introduce instrument_copy_from_user_before() and
instrument_copy_from_user_after() hooks to be invoked before and after the
call to copy_from_user().

KASAN and KCSAN will be only using instrument_copy_from_user_before(), but
for KMSAN we'll need to insert code after copy_from_user().

Link: https://lkml.kernel.org/r/20220915150417.722975-4-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:03:18 -07:00
Masahiro Yamada
ce697ccee1 kbuild: remove head-y syntax
Kbuild puts the objects listed in head-y at the head of vmlinux.
Conventionally, we do this for head*.S, which contains the kernel entry
point.

A counter approach is to control the section order by the linker script.
Actually, the code marked as __HEAD goes into the ".head.text" section,
which is placed before the normal ".text" section.

I do not know if both of them are needed. From the build system
perspective, head-y is not mandatory. If you can achieve the proper code
placement by the linker script only, it would be cleaner.

I collected the current head-y objects into head-object-list.txt. It is
a whitelist. My hope is it will be reduced in the long run.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-10-02 18:06:03 +09:00
Masahiro Yamada
3216484550 kbuild: use obj-y instead extra-y for objects placed at the head
The objects placed at the head of vmlinux need special treatments:

 - arch/$(SRCARCH)/Makefile adds them to head-y in order to place
   them before other archives in the linker command line.

 - arch/$(SRCARCH)/kernel/Makefile adds them to extra-y instead of
   obj-y to avoid them going into built-in.a.

This commit gets rid of the latter.

Create vmlinux.a to collect all the objects that are unconditionally
linked to vmlinux. The objects listed in head-y are moved to the head
of vmlinux.a by using 'ar m'.

With this, arch/$(SRCARCH)/kernel/Makefile can consistently use obj-y
for builtin objects.

There is no *.o that is directly linked to vmlinux. Drop unneeded code
in scripts/clang-tools/gen_compile_commands.py.

$(AR) mPi needs 'T' to workaround the llvm-ar bug. The fix was suggested
by Nathan Chancellor [1].

[1]: https://lore.kernel.org/llvm/YyjjT5gQ2hGMH0ni@dev-arch.thelio-3990X/

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-10-02 18:04:05 +09:00
Peter Zijlstra
a1ebcd5943 Linux 6.0-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmMwwY4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGdlwH/0ESzdb6F9zYWwHR
 E08har56/IfwjOsn1y+JuHibpwUjzskLzdwIfI5zshSZAQTj5/UyC0P7G/wcYh/Z
 INh1uHGazmDUkx4O3lwuWLR+mmeUxZRWdq4NTwYDRNPMSiPInVxz+cZJ7y0aPr2e
 wii7kMFRHgXmX5DMDEwuHzehsJF7vZrp8zBu2DqzVUGnbwD50nPbyMM3H4g9mute
 fAEpDG0X3+smqMaKL+2rK0W/Av/87r3U8ZAztBem3nsCJ9jT7hqMO1ICcKmFMviA
 DTERRMwWjPq+mBPE2CiuhdaXvNZBW85Ds81mSddS6MsO6+Tvuzfzik/zSLQJxlBi
 vIqYphY=
 =NqG+
 -----END PGP SIGNATURE-----

Merge branch 'v6.0-rc7'

Merge upstream to get RAPTORLAKE_S

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-09-29 12:20:50 +02:00
Masahiro Yamada
2df8220cc5 kbuild: build init/built-in.a just once
Kbuild builds init/built-in.a twice; first during the ordinary
directory descending, second from scripts/link-vmlinux.sh.

We do this because UTS_VERSION contains the build version and the
timestamp. We cannot update it during the normal directory traversal
since we do not yet know if we need to update vmlinux. UTS_VERSION is
temporarily calculated, but omitted from the update check. Otherwise,
vmlinux would be rebuilt every time.

When Kbuild results in running link-vmlinux.sh, it increments the
version number in the .version file and takes the timestamp at that
time to really fix UTS_VERSION.

However, updating the same file twice is a footgun. To avoid nasty
timestamp issues, all build artifacts that depend on init/built-in.a
are atomically generated in link-vmlinux.sh, where some of them do not
need rebuilding.

To fix this issue, this commit changes as follows:

[1] Split UTS_VERSION out to include/generated/utsversion.h from
    include/generated/compile.h

    include/generated/utsversion.h is generated just before the
    vmlinux link. It is generated under include/generated/ because
    some decompressors (s390, x86) use UTS_VERSION.

[2] Split init_uts_ns and linux_banner out to init/version-timestamp.c
    from init/version.c

    init_uts_ns and linux_banner contain UTS_VERSION. During the ordinary
    directory descending, they are compiled with __weak and used to
    determine if vmlinux needs relinking. Just before the vmlinux link,
    they are compiled without __weak to embed the real version and
    timestamp.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-09-29 04:40:15 +09:00
Niklas Schnelle
8fb65e05bd s390/pci: remove unused bus_next field from struct zpci_dev
This field was added in commit 44510d6fa0 ("s390/pci: Handling
multifunctions") but is an unused remnant of an earlier version where
the devices on the virtual bus were connected in a linked list instead
of a fixed 256 entry array of pointers.

It is also not used for the list of busses as that is threaded through
struct zpci_bus not through struct zpci_dev.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-28 11:12:44 +02:00
Gaosheng Cui
4b39d40ea1 s390/cio: remove unused ccw_device_force_console() declaration
ccw_device_force_console() has been removed by
commit 8cc0dcfdc1 ("s390/cio: remove pm support from
ccw bus driver"), so remove the declaration, too.

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Acked-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-28 11:12:44 +02:00
Namhyung Kim
838d9bb62d perf: Use sample_flags for raw_data
Use the new sample_flags to indicate whether the raw data field is
filled by the PMU driver.  Although it could check with the NULL,
follow the same rule with other fields.

Remove the raw field from the perf_sample_data_init() to minimize
the number of cache lines touched.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220921220032.2858517-2-namhyung@kernel.org
2022-09-27 22:50:24 +02:00
Matthew Wilcox (Oracle)
e7b6b990e5 s390: remove vma linked list walks
Use the VMA iterator instead.

Link: https://lkml.kernel.org/r/20220906194824.2110408-35-Liam.Howlett@oracle.com
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Tested-by: Yu Zhao <yuzhao@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26 19:46:19 -07:00
Paolo Bonzini
c59fb12758 KVM: remove KVM_REQ_UNHALT
KVM_REQ_UNHALT is now unnecessary because it is replaced by the return
value of kvm_vcpu_block/kvm_vcpu_halt.  Remove it.

No functional change intended.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20220921003201.1441511-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26 12:37:21 -04:00
Greg Kroah-Hartman
a12c689209 Merge 7e2cd21e02 ("Merge tag 'tty-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty") into tty-next
We need the tty fixes and api additions in this branch.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-25 09:22:13 +02:00
Stefan Haberland
1fca631a11 s390/dasd: suppress generic error messages for PPRC secondary devices
Suppress generic command reject messages and dump of sense data for
Peer-To-Peer-Remote-Copy (PPRC) secondary errors.
If IO is issued on a PPRC secondary device, a specific
error message is printed instead.

Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Link: https://lore.kernel.org/r/20220920192616.808070-7-sth@linux.ibm.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21 08:32:51 -06:00
Stefan Haberland
112ff512fc s390/dasd: add ioctl to perform a swap of the drivers copy pair
The newly defined ioctl BIODASDCOPYPAIRSWAP takes a structure that
specifies a copy pair that should be swapped. It will call the device
discipline function to perform the swap operation.

The structure looks as followed:

struct dasd_copypair_swap_data_t {
       char primary[20];
       char secondary[20];
       __u8 reserved[64];
};

where primary is the old primary device that will be replaced by the
secondary device. The old primary will become a secondary device
afterwards.

Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Link: https://lore.kernel.org/r/20220920192616.808070-6-sth@linux.ibm.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21 08:32:51 -06:00
Matthew Rosato
189e7d876e KVM: s390: pci: register pci hooks without interpretation
The kvm registration hooks must be registered even if the facilities
necessary for zPCI interpretation are unavailable, as vfio-pci-zdev will
expect to use the hooks regardless.
This fixes an issue where vfio-pci-zdev will fail its open function
because of a missing kvm_register when running on hardware that does not
support zPCI interpretation.

Fixes: ca922fecda ("KVM: s390: pci: Hook to access KVM lowlevel from VFIO")
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Link: https://lore.kernel.org/r/20220920193025.135655-1-mjrosato@linux.ibm.com
Message-Id: <20220920193025.135655-1-mjrosato@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-09-21 16:18:38 +02:00
Matthew Rosato
70ba8fae27 KVM: s390: pci: fix GAIT physical vs virtual pointers usage
The GAIT and all of its entries must be represented by physical
addresses as this structure is shared with underlying firmware.
We can keep a virtual address of the GAIT origin in order to
handle processing in the kernel, but when traversing the entries
we must again convert the physical AISB stored in that GAIT entry
into a virtual address in order to process it.

Note: this currently doesn't fix a real bug, since virtual addresses
are indentical to physical ones.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Nico Boehr <nrb@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20220907155952.87356-1-mjrosato@linux.ibm.com
Message-Id: <20220907155952.87356-1-mjrosato@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-09-21 16:18:38 +02:00
Janis Schoetterl-Glausch
b3cefd6bf1 KVM: s390: Pass initialized arg even if unused
This silences smatch warnings reported by kbuild bot:
arch/s390/kvm/gaccess.c:859 guest_range_to_gpas() error: uninitialized symbol 'prot'.
arch/s390/kvm/gaccess.c:1064 access_guest_with_key() error: uninitialized symbol 'prot'.

This is because it cannot tell that the value is not used in this case.
The trans_exc* only examine prot if code is PGM_PROTECTION.
Pass a dummy value for other codes.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20220825192540.1560559-1-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-09-21 16:18:35 +02:00
Matthew Rosato
e8c924a4fb KVM: s390: pci: fix plain integer as NULL pointer warnings
Fix some sparse warnings that a plain integer 0 is being used instead of
NULL.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220915175514.167899-1-mjrosato@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-09-21 16:18:30 +02:00
Thomas Richter
c432fefe8e s390/pai: Add support for PAI Extension 1 NNPA counters
PMU device driver perf_paiext supports Processor Activity
Instrumentation Extension (PAIE1), available with IBM z16:
- maps a 512 byte block to lowcore address 0x1508 called PAIE1 control
  block.
- maps a 1024 byte block at PAIE1 control block entry with index 2.
- uses control register bit 14 to enable PAIE1 control block lookup.
- turn PAIE1 nnpa counting on and off by setting bit 63 in
  PAIE1 control block entry with index 2.
- creates a sample with raw data on each context switch out when
  at context switch some mapped counters have a value of nonzero.
This device driver only supports CPU wide context, no task context
is allowed.

Support for counting:
- one or more counters can be specified using
  perf stat -e pai_ext/xxx/
  where xxx stands for the counter event name. Multiple invocation
  of this command is possible. The counter names are listed in
  /sys/devices/pai_ext/events directory.
- one special counters can be specified using
  perf stat -e pai_ext/NNPA_ALL/
  which returns the sum of all incremented nnpa counters.
- multiple counting events can run in parallel.

Support for Sampling:
- one event pai_ext/NNPA_ALL/ is reserved for sampling.
  The event collects data at context switch out and saves them in
  the ring buffer.
- no multiple invocations are possible.

The PAIE1 nnpa counter events are system wide. No task context is
supported.  Therefore some restrictions documented in function
paiext_busy() apply.

Extend qpaci assembly instruction to query supported memory mapped nnpa
counters. It returns the number of counters (no holes allowed in that
range).

PAIE1 nnpa counter events can not be created when a CPU hot plug
add is processed. This means a CPU hot plug add does not get
the necessary PAIE1 event to record PAIE1 nnpa counter increments
on the newly added CPU. CPU hot plug remove removes the event and
terminates the counting of PAIE1 counters immediately.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-16 18:36:25 +02:00
Alexander Gordeev
9267bdd819 s390/mm: fix no previous prototype warnings in maccess.c
Fix -Wmissing-prototypes warnings caused by missing maccess.h include.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 2f0e8aae26 ("s390/mm: rework memcpy_real() to avoid DAT-off mode")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-16 18:36:25 +02:00
Alexander Gordeev
fba07cd4dd s390/mm: uninline copy_oldmem_kernel() function
Uninline copy_oldmem_kernel() function and make it consistent
with a very similar memcpy_real() implementation, by moving
to code to crash_dump.c, where it actually belongs.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14 16:46:01 +02:00
Alexander Gordeev
c0ceb94403 s390/mm,ptdump: add real memory copy page markers
Add "Real Memory Copy Area Start" and "Real Memory Copy Area End"
markers that fence the page used for real memory copying.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14 16:46:01 +02:00
Alexander Gordeev
2f0e8aae26 s390/mm: rework memcpy_real() to avoid DAT-off mode
Function memcpy_real() is an univeral data mover that does not
require DAT mode to be able reading from a physical address.
Its advantage is an ability to read from any address, even
those for which no kernel virtual mapping exists.

Although memcpy_real() is interrupt-safe, there are no handlers
that make use of this function. The compiler instrumentation
have to be disabled and separate no-DAT stack used to allow
execution of the function once DAT mode is disabled.

Rework memcpy_real() to overcome these shortcomings. As result,
data copying (which is primarily reading out a crashed system
memory by a user process) is executed on a regular stack with
enabled interrupts. Also, use of memcpy_real_buf swap buffer
becomes unnecessary and the swapping is eliminated.

The above is achieved by using a fixed virtual address range
that spans a single page and remaps that page repeatedly when
memcpy_real() is called for a particular physical address.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14 16:46:01 +02:00
Alexander Gordeev
14a3a26242 s390/dump: save IPL CPU registers once DAT is available
Function smp_save_dump_cpus() collects CPU state of a crashed
system for secondary CPUs and for the IPL CPU very differently.
The Signal Processor stop-and-store-status orders are used for
the former while Hardware System Area requests and memcpy_real()
routine are called for the latter. In addition a system reset is
triggered, which pins smp_save_dump_cpus() function call before
CPU and device initialization.

Move the collection of IPL CPU state to a later stage when DAT
becomes available. That is needed to allow a follow-up rework of
memcpy_real() routine.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14 16:46:00 +02:00
Niklas Schnelle
2187582c36 s390/pci: convert high_memory to physical address
We use high_memory as a measure for amount of memory available in
determining the required minimum size of our IOVA space with the
assumption that one rarely maps more than the available memory for DMA.
In special cases like mapping significant amounts of memory more than
once this can still be tuned with the s390_iommu_apterture kernel
parameter. In this use case high_memory is treated as a physical
address. As high_memory is a virtual address however this means we need
to convert it using virt_to_phys() before use

Note that at the moment physical and virtual addresses are identical so
this mismatch does not currently cause trouble.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14 16:46:00 +02:00
Alexander Gordeev
5078775531 s390/smp,ptdump: add absolute lowcore markers
Add "Lowcore Area Start" and "Lowcore Area End" markers
that fence pages where absolute lowcore resides.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14 16:46:00 +02:00
Alexander Gordeev
4df29d2b90 s390/smp: rework absolute lowcore access
Temporary unsetting of the prefix page in memcpy_absolute() routine
poses a risk of executing code path with unexpectedly disabled prefix
page. This rework avoids the prefix page uninstalling and disabling
of normal and machine check interrupts when accessing the absolute
zero memory.

Although memcpy_absolute() routine can access the whole memory, it is
only used to update the absolute zero lowcore. This rework therefore
introduces a new mechanism for the absolute zero lowcore access and
scraps memcpy_absolute() routine for good.

Instead, an area is reserved in the virtual memory that is used for
the absolute lowcore access only. That area holds an array of 8KB
virtual mappings - one per CPU. Whenever a CPU is brought online, the
corresponding item is mapped to the real address of the previously
installed prefix page.

The absolute zero lowcore access works like this: a CPU calls the
new primitive get_abs_lowcore() to obtain its 8KB mapping as a
pointer to the struct lowcore. Virtual address references to that
pointer get translated to the real addresses of the prefix page,
which in turn gets swapped with the absolute zero memory addresses
due to prefixing. Once the pointer is not needed it must be released
with put_abs_lowcore() primitive:

	struct lowcore *abs_lc;
	unsigned long flags;

	abs_lc = get_abs_lowcore(&flags);
	abs_lc->... = ...;
	put_abs_lowcore(abs_lc, flags);

To ensure the described mechanism works large segment- and region-
table entries must be avoided for the 8KB mappings. Failure to do
so results in usage of Region-Frame Absolute Address (RFAA) or
Segment-Frame Absolute Address (SFAA) large page fields. In that
case absolute addresses would be used to address the prefix page
instead of the real ones and the prefixing would get bypassed.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14 16:46:00 +02:00
Alexander Gordeev
6cbd7cc2eb s390/smp: call smp_reinit_ipl_cpu() before scheduler is available
Currently smp_reinit_ipl_cpu() is a pre-SMP early initcall.
That ensures no CPU is running in parallel, but still not
enough to assume the code is exclusive, since the scheduling
is already available.

Move the function call to arch_call_rest_init() callback
to ensure no thread could be preempted and allow lockless
allocation of the kernel page tables. That is needed to
allow a follow-up rework of the absolute lowcore access
mechanism.

Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14 16:46:00 +02:00
Vasily Gorbik
d61bb30e43 Merge branch 'fixes' into features
* fixes:
  s390/smp: enforce lowcore protection on CPU restart
  s390/boot: fix absolute zero lowcore corruption on boot
  s390/hugetlb: fix prepare_hugepage_range() check for 2 GB hugepages
  s390: update defconfigs
  s390: fix nospec table alignments
  s390/mm: remove useless hugepage address alignment

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14 16:41:21 +02:00
Kefeng Wang
2be9880dc8 kernel: exit: cleanup release_thread()
Only x86 has own release_thread(), introduce a new weak release_thread()
function to clean empty definitions in other ARCHs.

Link: https://lkml.kernel.org/r/20220819014406.32266-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Guo Ren <guoren@kernel.org>				[csky]
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Brian Cain <bcain@quicinc.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>			[powerpc]
Acked-by: Stafford Horne <shorne@gmail.com>			[openrisc]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>		[arm64]
Acked-by: Huacai Chen <chenhuacai@kernel.org>			[LoongArch]
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org> [csky]
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Xuerui Wang <kernel@xen0n.name>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 21:55:07 -07:00
Gerald Schaefer
1a6baaa0db s390/hugetlb: switch to generic version of follow_huge_pud()
When pud-sized hugepages were introduced for s390, the generic version of
follow_huge_pud() was using pte_page() instead of pud_page().  This would
be wrong for s390, see also commit 9753412701 ("mm/hugetlb: use
pmd_page() in follow_huge_pmd()").  Therefore, and probably because not
all archs were supporting pud_page() at that time, a private version of
follow_huge_pud() was added for s390, correctly using pud_page().

Since commit 3a194f3f8a ("mm/hugetlb: make pud_huge() and
follow_huge_pud() aware of non-present pud entry"), the generic version of
follow_huge_pud() is now also using pud_page(), and in general behaves
similar to follow_huge_pmd().

Therefore we can now switch to the generic version and get rid of the
s390-specific follow_huge_pud().

Link: https://lkml.kernel.org/r/20220818135717.609eef8a@thinkpad
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Haiyue Wang <haiyue.wang@intel.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 20:26:01 -07:00
Linus Torvalds
b96fbd602d s390 updates for 6.0-rc5
- Fix absolute zero lowcore corruption on kdump when CPU0 is offline.
 
 - Fix lowcore protection setup for offline CPU restart.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmMca+IACgkQjYWKoQLX
 FBjE4Af+N/s30gWrJT1qY5ACxNyVH+UUbND+J/1Avg/PHMS1fLY/YN48clsPDimD
 WqnmYWqMCSxRIg29qY6Inj+7ZMI8lrRwVO46JU+4a9kujKzppjsWxVrotaIYS+5n
 EdBHPqdchxAhymAD9BkbffeuelFe/LKRY6R/JywbXMPbzNtx/KTO6wHNmPj3ekqK
 04jPtgcVl0bziwvAVGhCxPqrHPUKxkW0EliyDWPvSYdTuroR6m8qKzKzaG2EpXwk
 7eOJqIoZ88UVZYHPux+JSlaFMQzZS4a3kMHXzpc0X5DnHmyfYx0why2LFQKWX4fG
 mZ59CIaZOV084t8zm98TiPs78sdHsg==
 =JuWY
 -----END PGP SIGNATURE-----

Merge tag 's390-6.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

 - Fix absolute zero lowcore corruption on kdump when CPU0 is offline

 - Fix lowcore protection setup for offline CPU restart

* tag 's390-6.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/smp: enforce lowcore protection on CPU restart
  s390/boot: fix absolute zero lowcore corruption on boot
2022-09-10 13:19:31 -04:00
Linus Torvalds
f448dda895 asm-generic: SOFTIRQ_ON_OWN_STACK rework
Just one fixup patch, reworking the softirq_on_own_stack logic for
 preempt-rt kernels as discussed in
 https://lore.kernel.org/all/CAHk-=wgZSD3W2y6yczad2Am=EfHYyiPzTn3CfXxrriJf9i5W5w@mail.gmail.com/
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmMZ/FUACgkQmmx57+YA
 GNl4DBAAutecM/RGJnK//0+NEcCsuHoIWusifeEIhR8yNFxm1UOiwBNYV4M/73UA
 x2lLGQCu4ze1zSU3BXX1Ug3MfpmrBOhlsW2+x7j0b3tG9vXpvojitTsCPo6lgNDu
 nUx58rtN0wfZgwT7c4DaueEL1aIZ1yUOgCE+SCGw3subgpZeoaYUi3sbZxI2K+T8
 MnzcVVpmccMaxLh0xwCn6u4csYrarj0XodY11a8+DAN9Afek9nEq4UNhIewjvQw0
 2OTbGuxIBZoXNmWdjcDBO97eUtq0oJcRIgPkRi0xlas2hzRJMwO6liqsA34r4r7r
 Irxyyw4//jlCtJOKwrv+rSxR6ULvGw2+UBk/Q/nYeN9hL2qKnwJIa8hryj+cQ/os
 KWclMGEYtKvOzciMIvViXAOY4sa7jZO1Q9rntyUIVkNiqQqLDRxocW+GjsqG5KZI
 bl+akScQhX3a7bEZePO0EJqfitjaxk5JtxF3KcV5apKaWto+EwOpvRf6eUVT97/n
 Zpoy2dBygOtWiMJR7o143xeOQlhutDCOh6VeDfJiUA57ekABRY/8+R+ZltkmDP84
 4iCCtE4oWuMXz390ybBHxKcNjiknLUMvDUNq5wfjsyJ8ftBRF+1+vKR+BdLqsx8O
 d5/FjRXC5MWaW4GLO89l+MZh+BrOkobdq333VxIStpS9Jd0pUa4=
 =0A1T
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-fixes-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull SOFTIRQ_ON_OWN_STACK rework from Arnd Bergmann:
 "Just one fixup patch, reworking the softirq_on_own_stack logic for
  preempt-rt kernels as discussed in

    https://lore.kernel.org/all/CAHk-=wgZSD3W2y6yczad2Am=EfHYyiPzTn3CfXxrriJf9i5W5w@mail.gmail.com/"

* tag 'asm-generic-fixes-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig.
2022-09-09 07:23:29 -04:00
Al Viro
ccf3a57041 termios: kill uapi termios.h that are identical to generic one
mandatory-y will have the generic picked for architectures that
don't have uapi/asm/termios.h of their own.  ia64, parisc and
s390 ones are identical to generic, so...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/YxGVXpS2dWoTwoa0@ZenIV
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-09 10:44:35 +02:00
Al Viro
89bbeb7e31 termios: get rid of non-UAPI asm/termios.h
All non-UAPI asm/termios.h consist of include of UAPI counterpart
and, possibly, include of linux/uaccess.h

	The latter can't be simply removed, even though nothing in
linux/termios.h doesn't depend upon it anymore - there are several
places that rely upon that indirect chain of includes to pull
linux/uaccess.h.  So the include needs to be lifted out of there -
we lift into tty_driver.h, serdev.h and places that pull asm/termios.h,
but none of
	* linux/uaccess.h (obvious)
	* net/sock.h (pulls uaccess.h)
	* linux/{tty,tty_driver,serdev}.h (tty.h pulls tty_driver.h)

That leaves us just with the include of UAPI asm/termios.h, which is
what <asm/termios.h> will resolve to if we simply remove non-UAPI header.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/YxDnKvYCHn/ogBUv@ZenIV
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-09 10:44:35 +02:00
Al Viro
c9874d3ffe termios: start unifying non-UAPI parts of asm/termios.h
* new header (linut/termios_internal.h), pulled by the users of those
suckers
* defaults for INIT_C_CC and externs for conversion helpers moved over
there
* remove termios-base.h (empty now)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/YxDmptU7dNGZ+/Hn@ZenIV
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-09 10:44:34 +02:00
Heiko Carstens
edcfc9c71b s390/ptdump: add missing amode31 markers
Add amode31 markers which makes a ro mapping in the middle of
nowhere in the kernel_page_tables output less magic.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-07 14:04:52 +02:00
Heiko Carstens
b193d2d4d0 s390/mm: split lowcore pages with set_memory_4k()
Use set_memory_4k() to split lowcore pages within the kernel mapping
instead of using the quite subtle !addr check within modify_pmd_table()
and modify_pud_table() to prevent large pages for address zero.

With this lowcore might be mapped with 1MB / 2GB frames and only later
will be split. This way this mapping is handled like every other.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-07 14:04:52 +02:00
Alexander Gordeev
8d96bba75a s390/smp: enforce lowcore protection on CPU restart
As result of commit 915fea04f9 ("s390/smp: enable DAT before
CPU restart callback is called") the low-address protection bit
gets mistakenly unset in control register 0 save area of the
absolute zero memory. That area is used when manual PSW restart
happened to hit an offline CPU. In this case the low-address
protection for that CPU will be dropped.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Fixes: 915fea04f9 ("s390/smp: enable DAT before CPU restart callback is called")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-07 14:04:01 +02:00
Alexander Gordeev
12dd19c159 s390/boot: fix absolute zero lowcore corruption on boot
Crash dump always starts on CPU0. In case CPU0 is offline the
prefix page is not installed and the absolute zero lowcore is
used. However, struct lowcore::mcesad is never assigned and
stays zero. That leads to __machine_kdump() -> save_vx_regs()
call silently stores vector registers to the absolute lowcore
at 0x11b0 offset.

Fixes: a62bc07392 ("s390/kdump: add support for vector extension")
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-07 14:04:01 +02:00
Sebastian Andrzej Siewior
8cbb2b50ee asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig.
Remove the CONFIG_PREEMPT_RT symbol from the ifdef around
do_softirq_own_stack() and move it to Kconfig instead.

Enable softirq stacks based on SOFTIRQ_ON_OWN_STACK which depends on
HAVE_SOFTIRQ_ON_OWN_STACK and its default value is set to !PREEMPT_RT.
This ensures that softirq stacks are not used on PREEMPT_RT and avoids
a 'select' statement on an option which has a 'depends' statement.

Link: https://lore.kernel.org/YvN5E%2FPrHfUhggr7@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-09-05 17:20:55 +02:00
Linus Torvalds
685ed983e2 s390:
* PCI interpretation compile fixes
 
 RISC-V:
 
 * Fix unused variable warnings in vcpu_timer.c
 
 * Move extern sbi_ext declarations to a header
 
 x86:
 
 * check validity of argument to KVM_SET_MP_STATE
 
 * use guest's global_ctrl to completely disable guest PEBS
 
 * fix a memory leak on memory allocation failure
 
 * mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES
 
 * fix build failure with Clang integrated assembler
 
 * fix MSR interception
 
 * Always flush TLBs when enabling dirty logging
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmMUdO4UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPpMQf6Aqcvg4zFz3Ph/5RnakAxQris163b
 g63AyDXaVIZ1NHKdhBzlvw9t75UacgYOuB/+utDHQ4eUa1W6bMDA2zEGCQS3HfdW
 A+u5lrYfS4qUM4V8gCDZTuZUyK9EhKrQ6C/aTGge+q8YdG7P1zv72QOIZyQ/+WmU
 4aVuX3GehzdlhgJyzLG/g6NDA+n9fOB1lIlg4GSt9hcvUXcYFqZ+oZCPZgWDNZCP
 LXil43sgolw4y7FxVMSRlB6LhamWV4vI0u4nXVFC9tG0CSCmCrTkaFSMB7RixF1r
 TarYt7BSbb1ie5uXJpETSZL66DxeBSvLbtBUm98nt3Ym9aBYmAcN8GK7sg==
 =zUwO
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "s390:

   - PCI interpretation compile fixes

  RISC-V:

   - fix unused variable warnings in vcpu_timer.c

   - move extern sbi_ext declarations to a header

  x86:

   - check validity of argument to KVM_SET_MP_STATE

   - use guest's global_ctrl to completely disable guest PEBS

   - fix a memory leak on memory allocation failure

   - mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES

   - fix build failure with Clang integrated assembler

   - fix MSR interception

   - always flush TLBs when enabling dirty logging"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: check validity of argument to KVM_SET_MP_STATE
  perf/x86/core: Completely disable guest PEBS via guest's global_ctrl
  KVM: x86: fix memoryleak in kvm_arch_vcpu_create()
  KVM: x86: Mask off unsupported and unknown bits of IA32_ARCH_CAPABILITIES
  KVM: s390: pci: Hook to access KVM lowlevel from VFIO
  riscv: kvm: move extern sbi_ext declarations to a header
  riscv: kvm: vcpu_timer: fix unused variable warnings
  KVM: selftests: Fix ambiguous mov in KVM_ASM_SAFE()
  KVM: selftests: Fix KVM_EXCEPTION_MAGIC build with Clang
  KVM: VMX: Heed the 'msr' argument in msr_write_intercepted()
  kvm: x86: mmu: Always flush TLBs when enabling dirty logging
  kvm: x86: mmu: Drop the need_remote_flush() function
2022-09-04 11:27:14 -07:00
Heiko Carstens
bf2ce3855c s390/mm: remove unused access parameter from do_fault_error()
Remove unused access parameter from do_fault_error() which also makes the
code a bit more readable since quite some callers can be simplified.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-30 22:00:33 +02:00
Heiko Carstens
9aa10e791c s390/delay: sync comment within __delay() with reality
The comment within __delay() is outdated and does not reflect anymore
what the function is doing. Therefore replace the comment.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-30 22:00:33 +02:00
Wolfram Sang
820109fb11 s390: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Benjamin Block <bblock@linux.ibm.com>
Acked-by: Alexandra Winter <wintera@linux.ibm.com>
Link: https://lore.kernel.org/r/20220818205948.6360-1-wsa+renesas@sang-engineering.com
Link: https://lore.kernel.org/r/20220818210102.7301-1-wsa+renesas@sang-engineering.com
[gor@linux.ibm.com: squashed two changes linked above together]
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-30 22:00:33 +02:00
Gerald Schaefer
7c8d42fdf1 s390/hugetlb: fix prepare_hugepage_range() check for 2 GB hugepages
The alignment check in prepare_hugepage_range() is wrong for 2 GB
hugepages, it only checks for 1 MB hugepage alignment.

This can result in kernel crash in __unmap_hugepage_range() at the
BUG_ON(start & ~huge_page_mask(h)) alignment check, for mappings
created with MAP_FIXED at unaligned address.

Fix this by correctly handling multiple hugepage sizes, similar to the
generic version of prepare_hugepage_range().

Fixes: d08de8e2d8 ("s390/mm: add support for 2GB hugepages")
Cc: <stable@vger.kernel.org> # 4.8+
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-30 21:57:07 +02:00
Heiko Carstens
bdbf57bca6 s390: update defconfigs
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-30 21:57:07 +02:00
Josh Poimboeuf
c9305b6c1f s390: fix nospec table alignments
Add proper alignment for .nospec_call_table and .nospec_return_table in
vmlinux.

[hca@linux.ibm.com]: The problem with the missing alignment of the nospec
tables exist since a long time, however only since commit e6ed91fd07
("s390/alternatives: remove padding generation code") and with
CONFIG_RELOCATABLE=n the kernel may also crash at boot time.

The above named commit reduced the size of struct alt_instr by one byte,
so its new size is 11 bytes. Therefore depending on the number of cpu
alternatives the size of the __alt_instructions array maybe odd, which
again also causes that the addresses of the nospec tables will be odd.

If the address of __nospec_call_start is odd and the kernel is compiled
With CONFIG_RELOCATABLE=n the compiler may generate code that loads the
address of __nospec_call_start with a 'larl' instruction.

This will generate incorrect code since the 'larl' instruction only works
with even addresses. In result the members of the nospec tables will be
accessed with an off-by-one offset, which subsequently may lead to
addressing exceptions within __nospec_revert().

Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/8719bf1ce4a72ebdeb575200290094e9ce047bcc.1661557333.git.jpoimboe@kernel.org
Cc: <stable@vger.kernel.org> # 4.16
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-30 21:57:07 +02:00
Gerald Schaefer
ff03b88467 s390/mm: remove useless hugepage address alignment
The failing address alignment to HPAGE_MASK in do_exception(), for
hugetlb faults, was useless from the beginning. With 2 GB hugepage
support it became wrong, but w/o further negative impact. Now it
could have negative performance impact because it breaks the cacheline
optimization for process_huge_page().

Therefore, remove it.

Note that we still have failing address alignment by HW to PAGE_SIZE,
for all page faults, not just hugetlb faults. So this patch will not
fix UFFD_FEATURE_EXACT_ADDRESS for userfaultfd handling. It will just
move the failing address for hugetlb faults a bit closer to the real
address, at 4K page granularity, similar to normal page faults.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-30 21:57:07 +02:00
Pierre Morel
ca922fecda KVM: s390: pci: Hook to access KVM lowlevel from VFIO
We have a cross dependency between KVM and VFIO when using
s390 vfio_pci_zdev extensions for PCI passthrough
To be able to keep both subsystem modular we add a registering
hook inside the S390 core code.

This fixes a build problem when VFIO is built-in and KVM is built
as a module.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Fixes: 09340b2fca ("KVM: s390: pci: add routines to start/stop interpretive execution")
Cc: <stable@vger.kernel.org>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/r/20220819122945.9309-1-pmorel@linux.ibm.com
Message-Id: <20220819122945.9309-1-pmorel@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-08-29 13:29:28 +02:00
Linus Torvalds
dee1873796 s390 updates for 6.0-rc3
- Fix double free of guarded storage and runtime instrumentation control
   blocks on fork() failure.
 
 - Fix triggering write fault when VMA does not allow VM_WRITE.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmMJ6sYACgkQjYWKoQLX
 FBiH2Af/YzyZO+MgkXEGpMDp7zp7G1H3m7OYfA9onnRruf5PKppWYR/0eQRRYbat
 C1+9LyIQO2PjMG0RoWroxX7DYG9HR027nt1D+J1xPk82UgGvegDkky3oH3c75wFb
 oxjjdSI+Rsb4VEo1yIaKfl+lEsxa+rqXB3wvSsrwMFs9OHJYQo4eK62WUAFwcDoq
 saZ5xu82GhACTETpzCFnVHdGyWdq4BfpQPV1xIQDt8V+8y/yK4/dnB8coGEJL6I1
 cBVPUXltCwXeVoeMTVzzD6qUU0TK9nwHmJneSJ/Afj2w2XfAfJlNYpBuSEUSwHS4
 pD9DzBKy8YQn9ecDmqnDN7zY5n1fFw==
 =wHSL
 -----END PGP SIGNATURE-----

Merge tag 's390-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

 - Fix double free of guarded storage and runtime instrumentation
   control blocks on fork() failure

 - Fix triggering write fault when VMA does not allow VM_WRITE

* tag 's390-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: do not trigger write fault when vma does not allow VM_WRITE
  s390: fix double free of GS and RI CBs on fork() failure
2022-08-27 15:40:51 -07:00
Mikulas Patocka
d6ffe6067a provide arch_test_bit_acquire for architectures that define test_bit
Some architectures define their own arch_test_bit and they also need
arch_test_bit_acquire, otherwise they won't compile.  We also clean up
the code by using the generic test_bit if that is equivalent to the
arch-specific version.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 8238b45798 ("wait_on_bit: add an acquire memory barrier")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-27 09:49:54 -07:00
Robert Elliott
cf514b2a59 crypto: Kconfig - simplify cipher entries
Shorten menu titles and make them consistent:
- acronym
- name
- architecture features in parenthesis
- no suffixes like "<something> algorithm", "support", or
  "hardware acceleration", or "optimized"

Simplify help text descriptions, update references, and ensure that
https references are still valid.

Signed-off-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-08-26 18:50:43 +08:00
Robert Elliott
3f342a2325 crypto: Kconfig - simplify hash entries
Shorten menu titles and make them consistent:
- acronym
- name
- architecture features in parenthesis
- no suffixes like "<something> algorithm", "support", or
  "hardware acceleration", or "optimized"

Simplify help text descriptions, update references, and ensure that
https references are still valid.

Signed-off-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-08-26 18:50:43 +08:00
Robert Elliott
ec84348da4 crypto: Kconfig - simplify CRC entries
Shorten menu titles and make them consistent:
- acronym
- name
- architecture features in parenthesis
- no suffixes like "<something> algorithm", "support", or
  "hardware acceleration", or "optimized"

Simplify help text descriptions, update references, and ensure that
https references are still valid.

Signed-off-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-08-26 18:50:42 +08:00
Robert Elliott
c9d24c97c8 crypto: Kconfig - move s390 entries to a submenu
Move CPU-specific crypto/Kconfig entries to arch/xxx/crypto/Kconfig
and create a submenu for them under the Crypto API menu.

Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-08-26 18:50:41 +08:00
Gerald Schaefer
41ac42f137 s390/mm: do not trigger write fault when vma does not allow VM_WRITE
For non-protection pXd_none() page faults in do_dat_exception(), we
call do_exception() with access == (VM_READ | VM_WRITE | VM_EXEC).
In do_exception(), vma->vm_flags is checked against that before
calling handle_mm_fault().

Since commit 92f842eac7 ("[S390] store indication fault optimization"),
we call handle_mm_fault() with FAULT_FLAG_WRITE, when recognizing that
it was a write access. However, the vma flags check is still only
checking against (VM_READ | VM_WRITE | VM_EXEC), and therefore also
calling handle_mm_fault() with FAULT_FLAG_WRITE in cases where the vma
does not allow VM_WRITE.

Fix this by changing access check in do_exception() to VM_WRITE only,
when recognizing write access.

Link: https://lkml.kernel.org/r/20220811103435.188481-3-david@redhat.com
Fixes: 92f842eac7 ("[S390] store indication fault optimization")
Cc: <stable@vger.kernel.org>
Reported-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-25 15:12:32 +02:00
Brian Foster
13cccafe0e s390: fix double free of GS and RI CBs on fork() failure
The pointers for guarded storage and runtime instrumentation control
blocks are stored in the thread_struct of the associated task. These
pointers are initially copied on fork() via arch_dup_task_struct()
and then cleared via copy_thread() before fork() returns. If fork()
happens to fail after the initial task dup and before copy_thread(),
the newly allocated task and associated thread_struct memory are
freed via free_task() -> arch_release_task_struct(). This results in
a double free of the guarded storage and runtime info structs
because the fields in the failed task still refer to memory
associated with the source task.

This problem can manifest as a BUG_ON() in set_freepointer() (with
CONFIG_SLAB_FREELIST_HARDENED enabled) or KASAN splat (if enabled)
when running trinity syscall fuzz tests on s390x. To avoid this
problem, clear the associated pointer fields in
arch_dup_task_struct() immediately after the new task is copied.
Note that the RI flag is still cleared in copy_thread() because it
resides in thread stack memory and that is where stack info is
copied.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Fixes: 8d9047f8b9 ("s390/runtime instrumentation: simplify task exit handling")
Fixes: 7b83c6297d ("s390/guarded storage: simplify task exit handling")
Cc: <stable@vger.kernel.org> # 4.15
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20220816155407.537372-1-bfoster@redhat.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-08-25 15:12:32 +02:00
Juergen Gross
7b6670b036 s390/hypfs: avoid error message under KVM
When booting under KVM the following error messages are issued:

hypfs.7f5705: The hardware system does not support hypfs
hypfs.7a79f0: Initialization of hypfs failed with rc=-61

Demote the severity of first message from "error" to "info" and issue
the second message only in other error cases.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Link: https://lore.kernel.org/r/20220620094534.18967-1-jgross@suse.com
[arch/s390/hypfs/hypfs_diag.c changed description]
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-15 17:19:51 +02:00
Linus Torvalds
4e23eeebb2 Bitmap patches for v6.0-rc1
This branch consists of:
 
 Qu Wenruo:
 lib: bitmap: fix the duplicated comments on bitmap_to_arr64()
 https://lore.kernel.org/lkml/0d85e1dbad52ad7fb5787c4432bdb36cbd24f632.1656063005.git.wqu@suse.com/
 
 Alexander Lobakin:
 bitops: let optimize out non-atomic bitops on compile-time constants
 https://lore.kernel.org/lkml/20220624121313.2382500-1-alexandr.lobakin@intel.com/T/
 
 Yury Norov:
 lib: cleanup bitmap-related headers
 https://lore.kernel.org/linux-arm-kernel/YtCVeOGLiQ4gNPSf@yury-laptop/T/#m305522194c4d38edfdaffa71fcaaf2e2ca00a961
 
 Alexander Lobakin:
 x86/olpc: fix 'logical not is only applied to the left hand side'
 https://www.spinics.net/lists/kernel/msg4440064.html
 
 Yury Norov:
 lib/nodemask: inline wrappers around bitmap
 https://lore.kernel.org/all/20220723214537.2054208-1-yury.norov@gmail.com/
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmLpVvwACgkQsUSA/Tof
 vsiAHgwAwS9pl8GJ+fKYnue2CYo9349d2oT6BBUs/Rv8uqYEa4QkpYsR7NS733TG
 pos0hhoRvSOzrUP4qppXUjfJ+NkzLgpnKFOeWfFoNAKlHuaaMRvF3Y0Q/P8g0/Kg
 HPWcCQLHyCH9Wjs3e2TTgRjxTrHuruD2VJ401/PX/lw0DicUhmev5mUFa10uwFkP
 ZJRprjoFn9HJ0Hk16pFZDi36d3YumhACOcWRiJdoBDrEPV3S6lm9EeOy/yHBNp5k
 9bKj+RboeT2t70KaZcKv+M5j1nu0cAhl7kRkjcxcmGyimI0l82Vgq9yFxhGqvWg8
 RnCrJ5EaO08FGCAKG9GEwzdiNa24Gdq5XZSpQA7JZHmhmchpnnlNenJicyv0gOQi
 abChZeWSEsyA+78l2+kk9nezfVKUOnKDEZQxBVTOyWsmZYxHZV94oam340VjQDaY
 4/fETdOy/qqPIxnpxAeFGWxZjcVaYiYPLj7KLPMsB0aAAF7pZrem465vSfgbrE81
 +gCdqrWd
 =4dTW
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-6.0-rc1' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - fix the duplicated comments on bitmap_to_arr64() (Qu Wenruo)

 - optimize out non-atomic bitops on compile-time constants (Alexander
   Lobakin)

 - cleanup bitmap-related headers (Yury Norov)

 - x86/olpc: fix 'logical not is only applied to the left hand side'
   (Alexander Lobakin)

 - lib/nodemask: inline wrappers around bitmap (Yury Norov)

* tag 'bitmap-6.0-rc1' of https://github.com/norov/linux: (26 commits)
  lib/nodemask: inline next_node_in() and node_random()
  powerpc: drop dependency on <asm/machdep.h> in archrandom.h
  x86/olpc: fix 'logical not is only applied to the left hand side'
  lib/cpumask: move some one-line wrappers to header file
  headers/deps: mm: align MANITAINERS and Docs with new gfp.h structure
  headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h>
  headers/deps: mm: Optimize <linux/gfp.h> header dependencies
  lib/cpumask: move trivial wrappers around find_bit to the header
  lib/cpumask: change return types to unsigned where appropriate
  cpumask: change return types to bool where appropriate
  lib/bitmap: change type of bitmap_weight to unsigned long
  lib/bitmap: change return types to bool where appropriate
  arm: align find_bit declarations with generic kernel
  iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE)
  lib/test_bitmap: test the tail after bitmap_to_arr64()
  lib/bitmap: fix off-by-one in bitmap_to_arr64()
  lib: test_bitmap: add compile-time optimization/evaluations assertions
  bitmap: don't assume compiler evaluates small mem*() builtins calls
  net/ice: fix initializing the bitmap in the switch code
  bitops: let optimize out non-atomic bitops on compile-time constants
  ...
2022-08-07 17:52:35 -07:00
Linus Torvalds
24cb958695 s390 updates for 5.20 merge window
- Rework copy_oldmem_page() callback to take an iov_iter.
   This includes few prerequisite updates and fixes to the
   oldmem reading code.
 
 - Rework cpufeature implementation to allow for various CPU feature
   indications, which is not only limited to hardware capabilities,
   but also allows CPU facilities.
 
 - Use the cpufeature rework to autoload Ultravisor module when CPU
   facility 158 is available.
 
 - Add ELF note type for encrypted CPU state of a protected virtual CPU.
   The zgetdump tool from s390-tools package will decrypt the CPU state
   using a Customer Communication Key and overwrite respective notes to
   make the data accessible for crash and other debugging tools.
 
 - Use vzalloc() instead of vmalloc() + memset() in ChaCha20 crypto test.
 
 - Fix incorrect recovery of kretprobe modified return address in stacktrace.
 
 - Switch the NMI handler to use generic irqentry_nmi_enter() and
   irqentry_nmi_exit() helper functions.
 
 - Rework the cryptographic Adjunct Processors (AP) pass-through design
   to support dynamic changes to the AP matrix of a running guest as well
   as to implement more of the AP architecture.
 
 - Minor boot code cleanups.
 
 - Grammar and typo fixes to hmcdrv and tape drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYIADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCYu4dRBccYWdvcmRlZXZA
 bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8DnlAP45Sk4cE35T+Z0vdHE2f0uMXE/p
 uHNjS3fDZOQVFJ2jZwEA99xPF5qPCttbR/b1VHsMSb30684IT1A4PC7y05kgfAw=
 =jCc3
 -----END PGP SIGNATURE-----

Merge tag 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Alexander Gordeev:

 - Rework copy_oldmem_page() callback to take an iov_iter.

   This includes a few prerequisite updates and fixes to the oldmem
   reading code.

 - Rework cpufeature implementation to allow for various CPU feature
   indications, which is not only limited to hardware capabilities, but
   also allows CPU facilities.

 - Use the cpufeature rework to autoload Ultravisor module when CPU
   facility 158 is available.

 - Add ELF note type for encrypted CPU state of a protected virtual CPU.
   The zgetdump tool from s390-tools package will decrypt the CPU state
   using a Customer Communication Key and overwrite respective notes to
   make the data accessible for crash and other debugging tools.

 - Use vzalloc() instead of vmalloc() + memset() in ChaCha20 crypto
   test.

 - Fix incorrect recovery of kretprobe modified return address in
   stacktrace.

 - Switch the NMI handler to use generic irqentry_nmi_enter() and
   irqentry_nmi_exit() helper functions.

 - Rework the cryptographic Adjunct Processors (AP) pass-through design
   to support dynamic changes to the AP matrix of a running guest as
   well as to implement more of the AP architecture.

 - Minor boot code cleanups.

 - Grammar and typo fixes to hmcdrv and tape drivers.

* tag 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (46 commits)
  Revert "s390/smp: enforce lowcore protection on CPU restart"
  Revert "s390/smp: rework absolute lowcore access"
  Revert "s390/smp,ptdump: add absolute lowcore markers"
  s390/unwind: fix fgraph return address recovery
  s390/nmi: use irqentry_nmi_enter()/irqentry_nmi_exit()
  s390: add ELF note type for encrypted CPU state of a PV VCPU
  s390/smp,ptdump: add absolute lowcore markers
  s390/smp: rework absolute lowcore access
  s390/setup: rearrange absolute lowcore initialization
  s390/boot: cleanup adjust_to_uv_max() function
  s390/smp: enforce lowcore protection on CPU restart
  s390/tape: fix comment typo
  s390/hmcdrv: fix Kconfig "its" grammar
  s390/docs: fix warnings for vfio_ap driver doc
  s390/docs: fix warnings for vfio_ap driver lock usage doc
  s390/crash: support multi-segment iterators
  s390/crash: use static swap buffer for copy_to_user_real()
  s390/crash: move copy_to_user_real() to crash_dump.c
  s390/zcore: fix race when reading from hardware system area
  s390/crash: fix incorrect number of bytes to copy to user space
  ...
2022-08-06 17:05:21 -07:00
Linus Torvalds
a9cf69d0e7 VFIO updates for v6.0-rc1
- Cleanup use of extern in function prototypes (Alex Williamson)
 
  - Simplify bus_type usage and convert to device IOMMU interfaces
    (Robin Murphy)
 
  - Check missed return value and fix comment typos (Bo Liu)
 
  - Split migration ops from device ops and fix races in mlx5 migration
    support (Yishai Hadas)
 
  - Fix missed return value check in noiommu support (Liam Ni)
 
  - Hardening to clear buffer pointer to avoid use-after-free (Schspa Shi)
 
  - Remove requirement that only the same mm can unmap a previously
    mapped range (Li Zhe)
 
  - Adjust semaphore release vs device open counter (Yi Liu)
 
  - Remove unused arg from SPAPR support code (Deming Wang)
 
  - Rework vfio-ccw driver to better fit new mdev framework (Eric Farman,
    Michael Kawano)
 
  - Replace DMA unmap notifier with callbacks (Jason Gunthorpe)
 
  - Clarify SPAPR support comment relative to iommu_ops (Alexey Kardashevskiy)
 
  - Revise page pinning API towards compatibility with future iommufd support
    (Nicolin Chen)
 
  - Resolve issues in vfio-ccw, including use of DMA unmap callback
    (Eric Farman)
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmLqvYMbHGFsZXgud2ls
 bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsiHM0P/1n/bszel20PRC7x+NLI
 P7b/0aonW4Qtei2HORwowmaznb4NgRE5GCm5RU+a9+AwQKnK44j3lqy0skcfgZXr
 f4viFlxOyd0H4blOhUZ+FuPNkUMAyz6HerzvJ9jQFG426pL5vr7UKWBuJPYB5RCT
 4jEy3EUTSH8/Zt8ApLysFTyR64xN3Sk7vSUcj9rEhu5T3FWq8t9+jb3tE/HW/Xaw
 pMwdC+ctYzYaBD/oA7Ns2IebNS9AUIUjKMXC25oCmc83WGgGOqgLB2mAthQ2NKB5
 5capKBYuYl7PWERvpGpsPILEWvR6m+Rxh8r4Pqjcoyfq4k7vp+A/AFKiD7AEYBdy
 BtfLWO59w6vuRQ5XXOa6Hu4ef6BcMvH4StrHxlHkKcgI4PJA0QscIXiJPQSt7Crr
 m+kCNgPPgrfZDu7lmZTiWbXOYSkJR3Mxkhf2iNHudW9SsJT9pUAVEiGVVA/kC1Y/
 fNBziRQeVF6JUW8M4pveXEWEbA8iE1HQeJA6aVRonxAkJk1KBaQgm/GKJlPXCHIR
 R6lI90NXZHz/3ndIX1znKOm0qli+8auX/FH8iWUffZxGmtINOGGMYebD6YxFdCCJ
 sWalL8vlQNCams2MZdovu/5BowXWtwOMm6KNG9RXSyWIWZEcNVbAzhTr+rrDdHZd
 AJiUNCGO9UlO9FZM+ntfQTSr
 =4BE8
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v6.0-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

 - Cleanup use of extern in function prototypes (Alex Williamson)

 - Simplify bus_type usage and convert to device IOMMU interfaces (Robin
   Murphy)

 - Check missed return value and fix comment typos (Bo Liu)

 - Split migration ops from device ops and fix races in mlx5 migration
   support (Yishai Hadas)

 - Fix missed return value check in noiommu support (Liam Ni)

 - Hardening to clear buffer pointer to avoid use-after-free (Schspa
   Shi)

 - Remove requirement that only the same mm can unmap a previously
   mapped range (Li Zhe)

 - Adjust semaphore release vs device open counter (Yi Liu)

 - Remove unused arg from SPAPR support code (Deming Wang)

 - Rework vfio-ccw driver to better fit new mdev framework (Eric Farman,
   Michael Kawano)

 - Replace DMA unmap notifier with callbacks (Jason Gunthorpe)

 - Clarify SPAPR support comment relative to iommu_ops (Alexey
   Kardashevskiy)

 - Revise page pinning API towards compatibility with future iommufd
   support (Nicolin Chen)

 - Resolve issues in vfio-ccw, including use of DMA unmap callback (Eric
   Farman)

* tag 'vfio-v6.0-rc1' of https://github.com/awilliam/linux-vfio: (40 commits)
  vfio/pci: fix the wrong word
  vfio/ccw: Check return code from subchannel quiesce
  vfio/ccw: Remove FSM Close from remove handlers
  vfio/ccw: Add length to DMA_UNMAP checks
  vfio: Replace phys_pfn with pages for vfio_pin_pages()
  vfio/ccw: Add kmap_local_page() for memcpy
  vfio: Rename user_iova of vfio_dma_rw()
  vfio/ccw: Change pa_pfn list to pa_iova list
  vfio/ap: Change saved_pfn to saved_iova
  vfio: Pass in starting IOVA to vfio_pin/unpin_pages API
  vfio/ccw: Only pass in contiguous pages
  vfio/ap: Pass in physical address of ind to ap_aqic()
  drm/i915/gvt: Replace roundup with DIV_ROUND_UP
  vfio: Make vfio_unpin_pages() return void
  vfio/spapr_tce: Fix the comment
  vfio: Replace the iommu notifier with a device list
  vfio: Replace the DMA unmapping notifier with a callback
  vfio/ccw: Move FSM open/close to MDEV open/close
  vfio/ccw: Refactor vfio_ccw_mdev_reset
  vfio/ccw: Create a CLOSE FSM event
  ...
2022-08-06 08:59:35 -07:00
Alexander Gordeev
953503751a Revert "s390/smp: enforce lowcore protection on CPU restart"
This reverts commit 6f5c672d17.

This breaks normal crash dump when CPU0 is offline.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-06 09:29:46 +02:00
Alexander Gordeev
5e441f61f5 Revert "s390/smp: rework absolute lowcore access"
This reverts commit 7d06fed77b.

This introduced vmem_mutex locking from vmem_map_4k_page()
function called from smp_reinit_ipl_cpu() with interrupts
disabled. While it is a pre-SMP early initcall no other CPUs
running in parallel nor other code taking vmem_mutex on this
boot stage - it still needs to be fixed.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-06 09:24:07 +02:00
Alexander Gordeev
3fb39cb7c5 Revert "s390/smp,ptdump: add absolute lowcore markers"
This reverts commit e409b7f191.

Commit 7d06fed77b ("s390/smp: rework absolute lowcore access")
introduced mutex lock with interrupts disabled. This commit is
a follow-up that needs to be reverted as well.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-06 09:13:28 +02:00
Linus Torvalds
6614a3c316 - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
Lin, Yang Shi, Anshuman Khandual and Mike Rapoport
 
 - Some kmemleak fixes from Patrick Wang and Waiman Long
 
 - DAMON updates from SeongJae Park
 
 - memcg debug/visibility work from Roman Gushchin
 
 - vmalloc speedup from Uladzislau Rezki
 
 - more folio conversion work from Matthew Wilcox
 
 - enhancements for coherent device memory mapping from Alex Sierra
 
 - addition of shared pages tracking and CoW support for fsdax, from
   Shiyang Ruan
 
 - hugetlb optimizations from Mike Kravetz
 
 - Mel Gorman has contributed some pagealloc changes to improve latency
   and realtime behaviour.
 
 - mprotect soft-dirty checking has been improved by Peter Xu
 
 - Many other singleton patches all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYuravgAKCRDdBJ7gKXxA
 jpqSAQDrXSdII+ht9kSHlaCVYjqRFQz/rRvURQrWQV74f6aeiAD+NHHeDPwZn11/
 SPktqEUrF1pxnGQxqLh1kUFUhsVZQgE=
 =w/UH
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Most of the MM queue. A few things are still pending.

  Liam's maple tree rework didn't make it. This has resulted in a few
  other minor patch series being held over for next time.

  Multi-gen LRU still isn't merged as we were waiting for mapletree to
  stabilize. The current plan is to merge MGLRU into -mm soon and to
  later reintroduce mapletree, with a view to hopefully getting both
  into 6.1-rc1.

  Summary:

   - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
     Lin, Yang Shi, Anshuman Khandual and Mike Rapoport

   - Some kmemleak fixes from Patrick Wang and Waiman Long

   - DAMON updates from SeongJae Park

   - memcg debug/visibility work from Roman Gushchin

   - vmalloc speedup from Uladzislau Rezki

   - more folio conversion work from Matthew Wilcox

   - enhancements for coherent device memory mapping from Alex Sierra

   - addition of shared pages tracking and CoW support for fsdax, from
     Shiyang Ruan

   - hugetlb optimizations from Mike Kravetz

   - Mel Gorman has contributed some pagealloc changes to improve
     latency and realtime behaviour.

   - mprotect soft-dirty checking has been improved by Peter Xu

   - Many other singleton patches all over the place"

 [ XFS merge from hell as per Darrick Wong in

   https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ]

* tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits)
  tools/testing/selftests/vm/hmm-tests.c: fix build
  mm: Kconfig: fix typo
  mm: memory-failure: convert to pr_fmt()
  mm: use is_zone_movable_page() helper
  hugetlbfs: fix inaccurate comment in hugetlbfs_statfs()
  hugetlbfs: cleanup some comments in inode.c
  hugetlbfs: remove unneeded header file
  hugetlbfs: remove unneeded hugetlbfs_ops forward declaration
  hugetlbfs: use helper macro SZ_1{K,M}
  mm: cleanup is_highmem()
  mm/hmm: add a test for cross device private faults
  selftests: add soft-dirty into run_vmtests.sh
  selftests: soft-dirty: add test for mprotect
  mm/mprotect: fix soft-dirty check in can_change_pte_writable()
  mm: memcontrol: fix potential oom_lock recursion deadlock
  mm/gup.c: fix formatting in check_and_migrate_movable_page()
  xfs: fail dax mount if reflink is enabled on a partition
  mm/memcontrol.c: remove the redundant updating of stats_flush_threshold
  userfaultfd: don't fail on unrecognized features
  hugetlb_cgroup: fix wrong hugetlb cgroup numa stat
  ...
2022-08-05 16:32:45 -07:00
Linus Torvalds
3bd6e5854b asm-generic: updates for 6.0
There are three independent sets of changes:
 
  - Sai Prakash Ranjan adds tracing support to the asm-generic
    version of the MMIO accessors, which is intended to help
    understand problems with device drivers and has been part
    of Qualcomm's vendor kernels for many years.
 
  - A patch from Sebastian Siewior to rework the handling of
    IRQ stacks in softirqs across architectures, which is
    needed for enabling PREEMPT_RT.
 
  - The last patch to remove the CONFIG_VIRT_TO_BUS option and
    some of the code behind that, after the last users of this
    old interface made it in through the netdev, scsi, media and
    staging trees.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmLqPPEACgkQmmx57+YA
 GNlUbQ/+NpIsiA0JUrCGtySt8KrLHdA2dH9lJOR5/iuxfphscPFfWtpcPvcXQWmt
 a8u7wyI8SHW1ku4U0Y5sO0dBSldDnoIqJ5t4X5d7YNU9yVtEtucqQhZf+GkrPlVD
 1HkRu05B7y0k2BMn7BLhSvkpafs3f1lNGXjs8oFBdOF1/zwp/GjcrfCK7KFzqjwU
 dYrX0SOFlKFd4BZC75VfK+XcKg4LtwIOmJraRRl7alz2Q5Oop2hgjgZxXDPf//vn
 SPOhXJN/97i1FUpY2TkfHVH1NxbPfjCV4pUnjmLG0Y4NSy9UQ/ZcXHcywIdeuhfa
 0LySOIsAqBeccpYYYdg2ubiMDZOXkBfANu/sB9o/EhoHfB4svrbPRDhBIQZMFXJr
 MJYu+IYce2rvydA/nydo4q++pxR8v1ES1ZIo8bDux+q1CI/zbpQV+f98kPVRA0M7
 ajc+5GTIqNIsvHzzadq7eYxcj5Bi8Li2JA9sVkAQ+6iq1TVyeYayMc9eYwONlmqw
 MD+PFYc651pKtXZCfkLXPIKSwS0uPqBndAibuVhpZ0hxWaCBBdKvY9mrWcPxt0kA
 tMR8lrosbbrV2K48BFdWTOHvCs2FhHQxPGVPZ/iWuxTA0hHZ9tUlaEkSX+VM57IU
 KCYQLdWzT8J9vrgqSbgYKlb6pSPz6FIjTfut6NZMmshIbavHV/Q=
 =aTR0
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "There are three independent sets of changes:

   - Sai Prakash Ranjan adds tracing support to the asm-generic version
     of the MMIO accessors, which is intended to help understand
     problems with device drivers and has been part of Qualcomm's vendor
     kernels for many years

   - A patch from Sebastian Siewior to rework the handling of IRQ stacks
     in softirqs across architectures, which is needed for enabling
     PREEMPT_RT

   - The last patch to remove the CONFIG_VIRT_TO_BUS option and some of
     the code behind that, after the last users of this old interface
     made it in through the netdev, scsi, media and staging trees"

* tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  uapi: asm-generic: fcntl: Fix typo 'the the' in comment
  arch/*/: remove CONFIG_VIRT_TO_BUS
  soc: qcom: geni: Disable MMIO tracing for GENI SE
  serial: qcom_geni_serial: Disable MMIO tracing for geni serial
  asm-generic/io: Add logging support for MMIO accessors
  KVM: arm64: Add a flag to disable MMIO trace for nVHE KVM
  lib: Add register read/write tracing support
  drm/meson: Fix overflow implicit truncation warnings
  irqchip/tegra: Fix overflow implicit truncation warnings
  coresight: etm4x: Use asm-generic IO memory barriers
  arm64: io: Use asm-generic high level MMIO accessors
  arch/*: Disable softirq stacks on PREEMPT_RT.
2022-08-05 10:07:23 -07:00
Linus Torvalds
eff0cb3d91 pci-v5.20-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmLr+2wUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxfZg//eChkC2EUdT6K3zuQDbJJhsGcuOQF
 lnZuUyDn4xw7BkEoZf8V6YdAnp7VvgKhLOq1/q3Geu/LBbCaczoEogOCaR/WcVOs
 C+MsN0RWZQtgfuZKncQoqp25NeLPK9PFToeiIX/xViAYZF7NVjDY7XQiZHQ6JkEA
 /7cUqv/4nS3KCMsKjfmiOxGnqohMWtICiw9qjFvJ40PEDnNB1b53rkiVTxBFePpI
 ePfsRfi/C7klE3xNfoiEgrPp+Jfw+oShsCwXUsId7bEL2oLBc7ClqP05ZYZD3bTK
 QQYyZ12Cq8TysciYpUGBjBnywUHS5DIO5YaV3wxyVAR2Z+6GY2/QVjOa2kKvoK0o
 Hba6TJf8bL58AhSI8Q62pBM0sS7dqJSff+9c2BGpZvII5spP/rQQLlJO56TJjwkw
 Dlf0d3thhZOc9vSKjKw+0v0FdAyc4L11EOwUsw95jZeT5WWgqJYGFnWPZwqBI1KM
 DI1E5wVO5tA2H3NEn+BTTHbLWL+UppqyXPXBHiW52b2q5Bt8fJWMsFvnEEjclxmG
 pYCI7VgF8jqbYKxjobxPFY2x6PH9hfaGMxwzZSdOX6e/Eh+1esgyyaC5APpCO+Pp
 e4OkJaOzCmggrD0jYeLWu+yDm5KRrYo5cdfKHrKgAof0Am41lAa1OhJ2iH4ckNqP
 1qmHereDOe0zNVw=
 =9TAR
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:

   - Consolidate duplicated 'next function' scanning and extend to allow
     'isolated functions' on s390, similar to existing hypervisors
     (Niklas Schnelle)

  Resource management:
   - Implement pci_iobar_pfn() for sparc, which allows us to remove the
     sparc-specific pci_mmap_page_range() and pci_mmap_resource_range().

     This removes the ability to map the entire PCI I/O space using
     /proc/bus/pci, but we believe that's already been broken since
     v2.6.28 (Arnd Bergmann)

   - Move common PCI definitions to asm-generic/pci.h and rework others
     to be be more specific and more encapsulated in arches that need
     them (Stafford Horne)

  Power management:

   - Convert drivers to new *_PM_OPS macros to avoid need for '#ifdef
     CONFIG_PM_SLEEP' or '__maybe_unused' (Bjorn Helgaas)

  Virtualization:

   - Add ACS quirk for Broadcom BCM5750x multifunction NICs that isolate
     the functions but don't advertise an ACS capability (Pavan Chebbi)

  Error handling:

   - Clear PCI Status register during enumeration in case firmware left
     errors logged (Kai-Heng Feng)

   - When we have native control of AER, enable error reporting for all
     devices that support AER. Previously only a few drivers enabled
     this (Stefan Roese)

   - Keep AER error reporting enabled for switches. Previously we
     enabled this during enumeration but immediately disabled it (Stefan
     Roese)

   - Iterate over error counters instead of error strings to avoid
     printing junk in AER sysfs counters (Mohamed Khalfella)

  ASPM:

   - Remove pcie_aspm_pm_state_change() so ASPM config changes, e.g.,
     via sysfs, are not lost across power state changes (Kai-Heng Feng)

  Endpoint framework:

   - Don't stop an EPC when unbinding an EPF from it (Shunsuke Mie)

  Endpoint embedded DMA controller driver:

   - Simplify and clean up support for the DesignWare embedded DMA
     (eDMA) controller (Frank Li, Serge Semin)

  Broadcom STB PCIe controller driver:

   - Avoid config space accesses when link is down because we can't
     recover from the CPU aborts these cause (Jim Quinlan)

   - Look for power regulators described under Root Ports in DT and
     enable them before scanning the secondary bus (Jim Quinlan)

   - Disable/enable regulators in suspend/resume (Jim Quinlan)

  Freescale i.MX6 PCIe controller driver:

   - Simplify and clean up clock and PHY management (Richard Zhu)

   - Disable/enable regulators in suspend/resume (Richard Zhu)

   - Set PCIE_DBI_RO_WR_EN before writing DBI registers (Richard Zhu)

   - Allow speeds faster than Gen2 (Richard Zhu)

   - Make link being down a non-fatal error so controller probe doesn't
     fail if there are no Endpoints connected (Richard Zhu)

  Loongson PCIe controller driver:

   - Add ACPI and MCFG support for Loongson LS7A (Huacai Chen)

   - Avoid config reads to non-existent LS2K/LS7A devices because a
     hardware defect causes machine hangs (Huacai Chen)

   - Work around LS7A integrated devices that report incorrect Interrupt
     Pin values (Jianmin Lv)

  Marvell Aardvark PCIe controller driver:

   - Add support for AER and Slot capability on emulated bridge (Pali
     Rohár)

  MediaTek PCIe controller driver:

   - Add Airoha EN7532 to DT binding (John Crispin)

   - Allow building of driver for ARCH_AIROHA (Felix Fietkau)

  MediaTek PCIe Gen3 controller driver:

   - Print decoded LTSSM state when the link doesn't come up (Jianjun
     Wang)

  NVIDIA Tegra194 PCIe controller driver:

   - Convert DT binding to json-schema (Vidya Sagar)

   - Add DT bindings and driver support for Tegra234 Root Port and
     Endpoint mode (Vidya Sagar)

   - Fix some Root Port interrupt handling issues (Vidya Sagar)

   - Set default Max Payload Size to 256 bytes (Vidya Sagar)

   - Fix Data Link Feature capability programming (Vidya Sagar)

   - Extend Endpoint mode support to devices beyond Controller-5 (Vidya
     Sagar)

  Qualcomm PCIe controller driver:

   - Rework clock, reset, PHY power-on ordering to avoid hangs and
     improve consistency (Robert Marko, Christian Marangi)

   - Move pipe_clk handling to PHY drivers (Dmitry Baryshkov)

   - Add IPQ60xx support (Selvam Sathappan Periakaruppan)

   - Allow ASPM L1 and substates for 2.7.0 (Krishna chaitanya chundru)

   - Add support for more than 32 MSI interrupts (Dmitry Baryshkov)

  Renesas R-Car PCIe controller driver:

   - Convert DT binding to json-schema (Herve Codina)

   - Add Renesas RZ/N1D (R9A06G032) to rcar-gen2 DT binding and driver
     (Herve Codina)

  Samsung Exynos PCIe controller driver:

   - Fix phy-exynos-pcie driver so it follows the 'phy_init() before
     phy_power_on()' PHY programming model (Marek Szyprowski)

  Synopsys DesignWare PCIe controller driver:

   - Simplify and clean up the DWC core extensively (Serge Semin)

   - Fix an issue with programming the ATU for regions that cross a 4GB
     boundary (Serge Semin)

   - Enable the CDM check if 'snps,enable-cdm-check' exists; previously
     we skipped it if 'num-lanes' was absent (Serge Semin)

   - Allocate a 32-bit DMA-able page to be MSI target instead of using a
     driver data structure that may not be addressable with 32-bit
     address (Will McVicker)

   - Add DWC core support for more than 32 MSI interrupts (Dmitry
     Baryshkov)

  Xilinx Versal CPM PCIe controller driver:

   - Add DT binding and driver support for Versal CPM5 Gen5 Root Port
     (Bharat Kumar Gogada)"

* tag 'pci-v5.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (150 commits)
  PCI: imx6: Support more than Gen2 speed link mode
  PCI: imx6: Set PCIE_DBI_RO_WR_EN before writing DBI registers
  PCI: imx6: Reformat suspend callback to keep symmetric with resume
  PCI: imx6: Move the imx6_pcie_ltssm_disable() earlier
  PCI: imx6: Disable clocks in reverse order of enable
  PCI: imx6: Do not hide PHY driver callbacks and refine the error handling
  PCI: imx6: Reduce resume time by only starting link if it was up before suspend
  PCI: imx6: Mark the link down as non-fatal error
  PCI: imx6: Move regulator enable out of imx6_pcie_deassert_core_reset()
  PCI: imx6: Turn off regulator when system is in suspend mode
  PCI: imx6: Call host init function directly in resume
  PCI: imx6: Disable i.MX6QDL clock when disabling ref clocks
  PCI: imx6: Propagate .host_init() errors to caller
  PCI: imx6: Collect clock enables in imx6_pcie_clk_enable()
  PCI: imx6: Factor out ref clock disable to match enable
  PCI: imx6: Move imx6_pcie_clk_disable() earlier
  PCI: imx6: Move imx6_pcie_enable_ref_clk() earlier
  PCI: imx6: Move PHY management functions together
  PCI: imx6: Move imx6_pcie_grp_offset(), imx6_pcie_configure_type() earlier
  PCI: imx6: Convert to NOIRQ_SYSTEM_SLEEP_PM_OPS()
  ...
2022-08-04 19:30:35 -07:00
Linus Torvalds
7447691ef9 xen: branch for v6.0-rc1
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYuooOQAKCRCAXGG7T9hj
 vmmlAPoCfYBh4jKwRnvGvyn+sPQed/r0TH0wnsGK1ccONhyIvAD+IZcSTPsnp4Cj
 m1URGGff2PvAyjOIAzQZbKZomtfICwM=
 =z2e5
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:

 - a series fine tuning virtio support for Xen guests, including removal
   the now again unused "platform_has()" feature.

 - a fix for host admin triggered reboot of Xen guests

 - a simple spelling fix

* tag 'for-linus-6.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: don't require virtio with grants for non-PV guests
  kernel: remove platform_has() infrastructure
  virtio: replace restricted mem access flag with callback
  xen: Fix spelling mistake
  xen/manage: Use orderly_reboot() to reboot
2022-08-04 15:10:55 -07:00
Linus Torvalds
7c5c3a6177 ARM:
* Unwinder implementations for both nVHE modes (classic and
   protected), complete with an overflow stack
 
 * Rework of the sysreg access from userspace, with a complete
   rewrite of the vgic-v3 view to allign with the rest of the
   infrastructure
 
 * Disagregation of the vcpu flags in separate sets to better track
   their use model.
 
 * A fix for the GICv2-on-v3 selftest
 
 * A small set of cosmetic fixes
 
 RISC-V:
 
 * Track ISA extensions used by Guest using bitmap
 
 * Added system instruction emulation framework
 
 * Added CSR emulation framework
 
 * Added gfp_custom flag in struct kvm_mmu_memory_cache
 
 * Added G-stage ioremap() and iounmap() functions
 
 * Added support for Svpbmt inside Guest
 
 s390:
 
 * add an interface to provide a hypervisor dump for secure guests
 
 * improve selftests to use TAP interface
 
 * enable interpretive execution of zPCI instructions (for PCI passthrough)
 
 * First part of deferred teardown
 
 * CPU Topology
 
 * PV attestation
 
 * Minor fixes
 
 x86:
 
 * Permit guests to ignore single-bit ECC errors
 
 * Intel IPI virtualization
 
 * Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS
 
 * PEBS virtualization
 
 * Simplify PMU emulation by just using PERF_TYPE_RAW events
 
 * More accurate event reinjection on SVM (avoid retrying instructions)
 
 * Allow getting/setting the state of the speaker port data bit
 
 * Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent
 
 * "Notify" VM exit (detect microarchitectural hangs) for Intel
 
 * Use try_cmpxchg64 instead of cmpxchg64
 
 * Ignore benign host accesses to PMU MSRs when PMU is disabled
 
 * Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior
 
 * Allow NX huge page mitigation to be disabled on a per-vm basis
 
 * Port eager page splitting to shadow MMU as well
 
 * Enable CMCI capability by default and handle injected UCNA errors
 
 * Expose pid of vcpu threads in debugfs
 
 * x2AVIC support for AMD
 
 * cleanup PIO emulation
 
 * Fixes for LLDT/LTR emulation
 
 * Don't require refcounted "struct page" to create huge SPTEs
 
 * Miscellaneous cleanups:
 ** MCE MSR emulation
 ** Use separate namespaces for guest PTEs and shadow PTEs bitmasks
 ** PIO emulation
 ** Reorganize rmap API, mostly around rmap destruction
 ** Do not workaround very old KVM bugs for L0 that runs with nesting enabled
 ** new selftests API for CPUID
 
 Generic:
 
 * Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache
 
 * new selftests API using struct kvm_vcpu instead of a (vm, id) tuple
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmLnyo4UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMtQQf/XjVWiRcWLPR9dqzRM/vvRXpiG+UL
 jU93R7m6ma99aqTtrxV/AE+kHgamBlma3Cwo+AcWk9uCVNbIhFjv2YKg6HptKU0e
 oJT3zRYp+XIjEo7Kfw+TwroZbTlG6gN83l1oBLFMqiFmHsMLnXSI2mm8MXyi3dNB
 vR2uIcTAl58KIprqNNsYJ2dNn74ogOMiXYx9XzoA9/5Xb6c0h4rreHJa5t+0s9RO
 Gz7Io3PxumgsbJngjyL1Ve5oxhlIAcZA8DU0PQmjxo3eS+k6BcmavGFd45gNL5zg
 iLpCh4k86spmzh8CWkAAwWPQE4dZknK6jTctJc0OFVad3Z7+X7n0E8TFrA==
 =PM8o
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "Quite a large pull request due to a selftest API overhaul and some
  patches that had come in too late for 5.19.

  ARM:

   - Unwinder implementations for both nVHE modes (classic and
     protected), complete with an overflow stack

   - Rework of the sysreg access from userspace, with a complete rewrite
     of the vgic-v3 view to allign with the rest of the infrastructure

   - Disagregation of the vcpu flags in separate sets to better track
     their use model.

   - A fix for the GICv2-on-v3 selftest

   - A small set of cosmetic fixes

  RISC-V:

   - Track ISA extensions used by Guest using bitmap

   - Added system instruction emulation framework

   - Added CSR emulation framework

   - Added gfp_custom flag in struct kvm_mmu_memory_cache

   - Added G-stage ioremap() and iounmap() functions

   - Added support for Svpbmt inside Guest

  s390:

   - add an interface to provide a hypervisor dump for secure guests

   - improve selftests to use TAP interface

   - enable interpretive execution of zPCI instructions (for PCI
     passthrough)

   - First part of deferred teardown

   - CPU Topology

   - PV attestation

   - Minor fixes

  x86:

   - Permit guests to ignore single-bit ECC errors

   - Intel IPI virtualization

   - Allow getting/setting pending triple fault with
     KVM_GET/SET_VCPU_EVENTS

   - PEBS virtualization

   - Simplify PMU emulation by just using PERF_TYPE_RAW events

   - More accurate event reinjection on SVM (avoid retrying
     instructions)

   - Allow getting/setting the state of the speaker port data bit

   - Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls
     are inconsistent

   - "Notify" VM exit (detect microarchitectural hangs) for Intel

   - Use try_cmpxchg64 instead of cmpxchg64

   - Ignore benign host accesses to PMU MSRs when PMU is disabled

   - Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior

   - Allow NX huge page mitigation to be disabled on a per-vm basis

   - Port eager page splitting to shadow MMU as well

   - Enable CMCI capability by default and handle injected UCNA errors

   - Expose pid of vcpu threads in debugfs

   - x2AVIC support for AMD

   - cleanup PIO emulation

   - Fixes for LLDT/LTR emulation

   - Don't require refcounted "struct page" to create huge SPTEs

   - Miscellaneous cleanups:
      - MCE MSR emulation
      - Use separate namespaces for guest PTEs and shadow PTEs bitmasks
      - PIO emulation
      - Reorganize rmap API, mostly around rmap destruction
      - Do not workaround very old KVM bugs for L0 that runs with nesting enabled
      - new selftests API for CPUID

  Generic:

   - Fix races in gfn->pfn cache refresh; do not pin pages tracked by
     the cache

   - new selftests API using struct kvm_vcpu instead of a (vm, id)
     tuple"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (606 commits)
  selftests: kvm: set rax before vmcall
  selftests: KVM: Add exponent check for boolean stats
  selftests: KVM: Provide descriptive assertions in kvm_binary_stats_test
  selftests: KVM: Check stat name before other fields
  KVM: x86/mmu: remove unused variable
  RISC-V: KVM: Add support for Svpbmt inside Guest/VM
  RISC-V: KVM: Use PAGE_KERNEL_IO in kvm_riscv_gstage_ioremap()
  RISC-V: KVM: Add G-stage ioremap() and iounmap() functions
  KVM: Add gfp_custom flag in struct kvm_mmu_memory_cache
  RISC-V: KVM: Add extensible CSR emulation framework
  RISC-V: KVM: Add extensible system instruction emulation framework
  RISC-V: KVM: Factor-out instruction emulation into separate sources
  RISC-V: KVM: move preempt_disable() call in kvm_arch_vcpu_ioctl_run
  RISC-V: KVM: Make kvm_riscv_guest_timer_init a void function
  RISC-V: KVM: Fix variable spelling mistake
  RISC-V: KVM: Improve ISA extension by using a bitmap
  KVM, x86/mmu: Fix the comment around kvm_tdp_mmu_zap_leafs()
  KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog
  KVM: x86/mmu: Treat NX as a valid SPTE bit for NPT
  KVM: x86: Do not block APIC write for non ICR registers
  ...
2022-08-04 14:59:54 -07:00
Bjorn Helgaas
c4f36c3ab0 Merge branch 'pci/header-cleanup-immutable'
- Remove pci_get_legacy_ide_irq(); use ATA_PRIMARY_IRQ() and
  ATA_SECONDARY_IRQ() instead (Stafford Horne)

- Remove isa_dma_bridge_buggy, except for x86_32, the only place it's used
  (Stafford Horne)

- Define ARCH_GENERIC_PCI_MMAP_RESOURCE for csky (Stafford Horne)

- Move common PCI definitions that arches sometimes override to
  asm-generic/pci.h (Stafford Horne)

- Include <linux/isa-dma.h> for 'isa_dma_bridge_buggy' when needed
  (bisection hole here) (Randy Dunlap)

* pci/header-cleanup-immutable:
  PCI: Stub __pci_ioport_map() for arches that don't support it at all
  x86/cyrix: include header linux/isa-dma.h
  asm-generic: Add new pci.h and use it
  csky: PCI: Define ARCH_GENERIC_PCI_MMAP_RESOURCE
  PCI: Move isa_dma_bridge_buggy out of asm/dma.h
  PCI: Remove pci_get_legacy_ide_irq() and asm-generic/pci.h
2022-08-04 11:46:53 -05:00
Linus Torvalds
5264406cdb iov_iter work, part 1 - isolated cleanups and optimizations.
One of the goals is to reduce the overhead of using ->read_iter()
 and ->write_iter() instead of ->read()/->write(); new_sync_{read,write}()
 has a surprising amount of overhead, in particular inside iocb_flags().
 That's why the beginning of the series is in this pile; it's not directly
 iov_iter-related, but it's a part of the same work...
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYurGOQAKCRBZ7Krx/gZQ
 6ysyAP91lvBfMRepcxpd9kvtuzWkU8A3rfSziZZteEHANB9Q7QEAiPn2a2OjWkcZ
 uAyUWfCkHCNx+dSMkEvUgR5okQ0exAM=
 =9UCV
 -----END PGP SIGNATURE-----

Merge tag 'pull-work.iov_iter-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs iov_iter updates from Al Viro:
 "Part 1 - isolated cleanups and optimizations.

  One of the goals is to reduce the overhead of using ->read_iter() and
  ->write_iter() instead of ->read()/->write().

  new_sync_{read,write}() has a surprising amount of overhead, in
  particular inside iocb_flags(). That's the explanation for the
  beginning of the series is in this pile; it's not directly
  iov_iter-related, but it's a part of the same work..."

* tag 'pull-work.iov_iter-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  first_iovec_segment(): just return address
  iov_iter: massage calling conventions for first_{iovec,bvec}_segment()
  iov_iter: first_{iovec,bvec}_segment() - simplify a bit
  iov_iter: lift dealing with maxpages out of first_{iovec,bvec}_segment()
  iov_iter_get_pages{,_alloc}(): cap the maxsize with MAX_RW_COUNT
  iov_iter_bvec_advance(): don't bother with bvec_iter
  copy_page_{to,from}_iter(): switch iovec variants to generic
  keep iocb_flags() result cached in struct file
  iocb: delay evaluation of IS_SYNC(...) until we want to check IOCB_DSYNC
  struct file: use anonymous union member for rcuhead and llist
  btrfs: use IOMAP_DIO_NOSYNC
  teach iomap_dio_rw() to suppress dsync
  No need of likely/unlikely on calls of check_copy_size()
2022-08-03 13:50:22 -07:00
Linus Torvalds
e2b5421007 flexible-array transformations in UAPI for 6.0-rc1
Hi Linus,
 
 Please, pull the following treewide patch that replaces zero-length arrays
 with flexible-array members in UAPI. This patch has been baking in
 linux-next for 5 weeks now.
 
 -fstrict-flex-arrays=3 is coming and we need to land these changes
 to prevent issues like these in the short future:
 
 ../fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0,
 but the source string has length 2 (including NUL byte) [-Wfortify-source]
 		strcpy(de3->name, ".");
 		^
 
 Since these are all [0] to [] changes, the risk to UAPI is nearly zero. If
 this breaks anything, we can use a union with a new member name.
 
 Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836
 
 Thanks
 --
 Gustavo
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAmLoNdcACgkQRwW0y0cG
 2zEVeg//QYJ3j2pbKt9zB6muO3SkrNoMPc5wpY/SITUeiDscukLvGzJG88eIZskl
 NaEjbmacHmdlQrBkUdr10i1+hkb2zRd6/j42GIDXEhhKTMoT2UxJCBp47KSvd7VY
 dKNLGsgQs3kwmmxLEGu6w6vywWpI5wxXTKWL1Q/RpUXoOnLmsMEbzKTjf12a1Edl
 9gPNY+tMHIHyB0pGIRXDY/ZF5c+FcRFn6kKeMVzJL0bnX7FI4UmYe83k9ajEiLWA
 MD3JAw/mNv2X0nizHHuQHIjtky8Pr+E8hKs5ni88vMYmFqeABsTw4R1LJykv/mYa
 NakU1j9tHYTKcs2Ju+gIvSKvmatKGNmOpti/8RAjEX1YY4cHlHWNsigVbVRLqfo7
 SKImlSUxOPGFS3HAJQCC9P/oZgICkUdD6sdLO1PVBnE1G3Fvxg5z6fGcdEuEZkVR
 PQwlYDm1nlTuScbkgVSBzyU/AkntVMJTuPWgbpNo+VgSXWZ8T/U8II0eGrFVf9rH
 +y5dAS52/bi6OP0la7fNZlq7tcPfNG9HJlPwPb1kQtuPT4m6CBhth/rRrDJwx8za
 0cpJT75Q3CI0wLZ7GN4yEjtNQrlAeeiYiS4LMQ/SFFtg1KzvmYYVmWDhOf0+mMDA
 f7bq4cxEg2LHwrhRgQQWowFVBu7yeiwKbcj9sybfA27bMqCtfto=
 =8yMq
 -----END PGP SIGNATURE-----

Merge tag 'flexible-array-transformations-UAPI-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull uapi flexible array update from Gustavo Silva:
 "A treewide patch that replaces zero-length arrays with flexible-array
  members in UAPI. This has been baking in linux-next for 5 weeks now.

  '-fstrict-flex-arrays=3' is coming and we need to land these changes
  to prevent issues like these in the short future:

    fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0, but the source string has length 2 (including NUL byte) [-Wfortify-source]
		strcpy(de3->name, ".");
		^

  Since these are all [0] to [] changes, the risk to UAPI is nearly
  zero. If this breaks anything, we can use a union with a new member
  name"

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

* tag 'flexible-array-transformations-UAPI-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  treewide: uapi: Replace zero-length arrays with flexible-array members
2022-08-02 19:50:47 -07:00
Linus Torvalds
a0b09f2d6f Random number generator updates for Linux 6.0-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmLnDOwACgkQSfxwEqXe
 A65Fiw//Z0YaPejSslQIGitQ1b0XzdWBhyJArYDieaaiQRXMqlaSKlIUqHz38xb7
 +FykUY51/SJLjHV2riPxq1OK3/MPmk6VlTd0HHihcHVmg77oZcFcv2tPnDpZoqND
 TsBOujLbXKwxP8tNFedRY/4+K7w+ue9BTfDjuH7aCtz7uWd+4cNJmPg3x9FCfkMA
 +hbcRluwE9W3Pg4OCKwv+qxL0JF3qQtNKEOp1wpnjGAZZW/I9gFNgFBEkykvcAsj
 TkIRDc3agPFj6QgDeRIgLdnf9KCsLubKAg5oJneeCvQztJJUCSkn8nQXxpx+4sLo
 GsRgvCdfL/GyJqfSAzQJVYDHKtKMkJiCiWCC/oOALR8dzHJfSlULDAjbY1m/DAr9
 at+vi4678Or7TNx2ZSaUlCXXKZ+UT7yWMlQWax9JuxGk1hGYP5/eT1AH5SGjqUwF
 w1q8oyzxt1vUcnOzEddFXPFirnqqhAk4dQFtu83+xKM4ZssMVyeB4NZdEhAdW0ng
 MX+RjrVj4l5gWWuoS0Cx3LUxDCgV6WT0dN+Vl9axAZkoJJbcXLEmXwQ6NbzTLPWg
 1/MT7qFTxNcTCeAArMdZvvFbeh7pOBXO42pafrK/7vDRnTMUIw9tqXNLQUfvdFQp
 F5flPgiVRHDU2vSzKIFtnPTyXU0RBBGvNb4n0ss2ehH2DSsCxYE=
 =Zy3d
 -----END PGP SIGNATURE-----

Merge tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:
 "Though there's been a decent amount of RNG-related development during
  this last cycle, not all of it is coming through this tree, as this
  cycle saw a shift toward tackling early boot time seeding issues,
  which took place in other trees as well.

  Here's a summary of the various patches:

   - The CONFIG_ARCH_RANDOM .config option and the "nordrand" boot
     option have been removed, as they overlapped with the more widely
     supported and more sensible options, CONFIG_RANDOM_TRUST_CPU and
     "random.trust_cpu". This change allowed simplifying a bit of arch
     code.

   - x86's RDRAND boot time test has been made a bit more robust, with
     RDRAND disabled if it's clearly producing bogus results. This would
     be a tip.git commit, technically, but I took it through random.git
     to avoid a large merge conflict.

   - The RNG has long since mixed in a timestamp very early in boot, on
     the premise that a computer that does the same things, but does so
     starting at different points in wall time, could be made to still
     produce a different RNG state. Unfortunately, the clock isn't set
     early in boot on all systems, so now we mix in that timestamp when
     the time is actually set.

   - User Mode Linux now uses the host OS's getrandom() syscall to
     generate a bootloader RNG seed and later on treats getrandom() as
     the platform's RDRAND-like faculty.

   - The arch_get_random_{seed_,}_long() family of functions is now
     arch_get_random_{seed_,}_longs(), which enables certain platforms,
     such as s390, to exploit considerable performance advantages from
     requesting multiple CPU random numbers at once, while at the same
     time compiling down to the same code as before on platforms like
     x86.

   - A small cleanup changing a cmpxchg() into a try_cmpxchg(), from
     Uros.

   - A comment spelling fix"

More info about other random number changes that come in through various
architecture trees in the full commentary in the pull request:

  https://lore.kernel.org/all/20220731232428.2219258-1-Jason@zx2c4.com/

* tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  random: correct spelling of "overwrites"
  random: handle archrandom with multiple longs
  um: seed rng using host OS rng
  random: use try_cmpxchg in _credit_init_bits
  timekeeping: contribute wall clock to rng on time change
  x86/rdrand: Remove "nordrand" flag in favor of "random.trust_cpu"
  random: remove CONFIG_ARCH_RANDOM
2022-08-02 17:31:35 -07:00
Linus Torvalds
043402495d integrity-v6.0
-----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCYulqTBQcem9oYXJAbGlu
 dXguaWJtLmNvbQAKCRDLwZzRsCrn5SBBAP9nbAW1SPa/hDqbrclHdDrS59VkSVwv
 6ZO2yAmxJAptHwD+JzyJpJiZsqVN/Tu85V1PqeAt9c8az8f3CfDBp2+w7AA=
 =Ad+c
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity updates from Mimi Zohar:
 "Aside from the one EVM cleanup patch, all the other changes are kexec
  related.

  On different architectures different keyrings are used to verify the
  kexec'ed kernel image signature. Here are a number of preparatory
  cleanup patches and the patches themselves for making the keyrings -
  builtin_trusted_keyring, .machine, .secondary_trusted_keyring, and
  .platform - consistent across the different architectures"

* tag 'integrity-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  kexec, KEYS, s390: Make use of built-in and secondary keyring for signature verification
  arm64: kexec_file: use more system keyrings to verify kernel image signature
  kexec, KEYS: make the code in bzImage64_verify_sig generic
  kexec: clean up arch_kexec_kernel_verify_sig
  kexec: drop weak attribute from functions
  kexec_file: drop weak attribute from functions
  evm: Use IS_ENABLED to initialize .enabled
2022-08-02 15:21:18 -07:00
Linus Torvalds
22a39c3d86 This was a fairly quiet cycle for the locking subsystem:
- lockdep: Fix a handful of the more complex lockdep_init_map_*() primitives
    that can lose the lock_type & cause false reports. No such mishap was
    observed in the wild.
 
  - jump_label improvements: simplify the cross-arch support of
    initial NOP patching by making it arch-specific code (used on MIPS only),
    and remove the s390 initial NOP patching that was superfluous.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmLn3jERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hzeg/7BTC90XeMANhTiL23iiH7dOYZwqdFeB12
 VBqdaPaGC8i+mJzVAdGyPFwCFDww6Ak6P33PcHkemuIO5+DhWis8hfw5krHEOO1k
 AyVSMOZuWJ8/g6ZenjgNFozQ8C+3NqURrpdqN55d7jhMazPWbsNLLqUgvSSqo6DY
 Ah2O+EKrDfGNCxT6/YaTAmUryctotxafSyFDQxv3RKPfCoIIVv9b3WApYqTOqFIu
 VYTPr+aAcMsU20hPMWQI4kbQaoCxFqr3bZiZtAiS/IEunqi+PlLuWjrnCUpLwVTC
 +jOCkNJHt682FPKTWelUnCnkOg9KhHRujRst5mi1+2tWAOEvKltxfe05UpsZYC3b
 jhzddREMwBt3iYsRn65LxxsN4AMK/C/41zjejHjZpf+Q5kwDsc6Ag3L5VifRFURS
 KRwAy9ejoVYwnL7CaVHM2zZtOk4YNxPeXmiwoMJmOufpdmD1LoYbNUbpSDf+goIZ
 yPJpxFI5UN8gi8IRo3DMe4K2nqcFBC3wFn8tNSAu+44gqDwGJAJL6MsLpkLSZkk8
 3QN9O11UCRTJDkURjoEWPgRRuIu9HZ4GKNhiblDy6gNM/jDE/m5OG4OYfiMhojgc
 KlMhsPzypSpeApL55lvZ+AzxH8mtwuUGwm8lnIdZ2kIse1iMwapxdWXWq9wQr8eW
 jLWHgyZ6rcg=
 =4B89
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:
 "This was a fairly quiet cycle for the locking subsystem:

   - lockdep: Fix a handful of the more complex lockdep_init_map_*()
     primitives that can lose the lock_type & cause false reports. No
     such mishap was observed in the wild.

   - jump_label improvements: simplify the cross-arch support of initial
     NOP patching by making it arch-specific code (used on MIPS only),
     and remove the s390 initial NOP patching that was superfluous"

* tag 'locking-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/lockdep: Fix lockdep_init_map_*() confusion
  jump_label: make initial NOP patching the special case
  jump_label: mips: move module NOP patching into arch code
  jump_label: s390: avoid pointless initial NOP patching
2022-08-01 12:15:27 -07:00
Paolo Bonzini
63f4b21041 Merge remote-tracking branch 'kvm/next' into kvm-next-5.20
KVM/s390, KVM/x86 and common infrastructure changes for 5.20

x86:

* Permit guests to ignore single-bit ECC errors

* Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache

* Intel IPI virtualization

* Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS

* PEBS virtualization

* Simplify PMU emulation by just using PERF_TYPE_RAW events

* More accurate event reinjection on SVM (avoid retrying instructions)

* Allow getting/setting the state of the speaker port data bit

* Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent

* "Notify" VM exit (detect microarchitectural hangs) for Intel

* Cleanups for MCE MSR emulation

s390:

* add an interface to provide a hypervisor dump for secure guests

* improve selftests to use TAP interface

* enable interpretive execution of zPCI instructions (for PCI passthrough)

* First part of deferred teardown

* CPU Topology

* PV attestation

* Minor fixes

Generic:

* new selftests API using struct kvm_vcpu instead of a (vm, id) tuple

x86:

* Use try_cmpxchg64 instead of cmpxchg64

* Bugfixes

* Ignore benign host accesses to PMU MSRs when PMU is disabled

* Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior

* x86/MMU: Allow NX huge pages to be disabled on a per-vm basis

* Port eager page splitting to shadow MMU as well

* Enable CMCI capability by default and handle injected UCNA errors

* Expose pid of vcpu threads in debugfs

* x2AVIC support for AMD

* cleanup PIO emulation

* Fixes for LLDT/LTR emulation

* Don't require refcounted "struct page" to create huge SPTEs

x86 cleanups:

* Use separate namespaces for guest PTEs and shadow PTEs bitmasks

* PIO emulation

* Reorganize rmap API, mostly around rmap destruction

* Do not workaround very old KVM bugs for L0 that runs with nesting enabled

* new selftests API for CPUID
2022-08-01 03:21:00 -04:00
Juergen Gross
a603002eea virtio: replace restricted mem access flag with callback
Instead of having a global flag to require restricted memory access
for all virtio devices, introduce a callback which can select that
requirement on a per-device basis.

For convenience add a common function returning always true, which can
be used for use cases like SEV.

Per default use a callback always returning false.

As the callback needs to be set in early init code already, add a
virtio anchor which is builtin in case virtio is enabled.

Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 guest using Xen
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20220622063838.8854-2-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-01 07:42:49 +02:00
Sumanth Korikkar
ded466e180 s390/unwind: fix fgraph return address recovery
When HAVE_FUNCTION_GRAPH_RET_ADDR_PTR is defined, the return
address to the fgraph caller is recovered by tagging it along with the
stack pointer of ftrace stack. This makes the stack unwinding more
reliable.

When the fgraph return address is modified to return_to_handler,
ftrace_graph_ret_addr tries to restore it to the original
value using tagged stack pointer.

Fix this by passing tagged sp to ftrace_graph_ret_addr.

Fixes: d81675b60d ("s390/unwind: recover kretprobe modified return address in stacktrace")
Cc: <stable@vger.kernel.org> # 5.18
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-28 18:05:24 +02:00
Sven Schnelle
520763a327 s390/nmi: use irqentry_nmi_enter()/irqentry_nmi_exit()
With generic entry in place switch the nmi handler to use
the generic entry helper functions.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-28 18:05:24 +02:00
Alexander Gordeev
e409b7f191 s390/smp,ptdump: add absolute lowcore markers
Add "Lowcore Area Start" and "Lowcore Area End" markers
that fence pages where absolute lowcore resides.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-28 18:05:24 +02:00
Alexander Gordeev
7d06fed77b s390/smp: rework absolute lowcore access
Temporary unsetting of the prefix page in memcpy_absolute() routine
poses a risk of executing code path with unexpectedly disabled prefix
page. This rework avoids the prefix page uninstalling and disabling
of normal and machine check interrupts when accessing the absolute
zero memory.

Although memcpy_absolute() routine can access the whole memory, it is
only used to update the absolute zero lowcore. This rework therefore
introduces a new mechanism for the absolute zero lowcore access and
scraps memcpy_absolute() routine for good.

Instead, an area is reserved in the virtual memory that is used for
the absolute lowcore access only. That area holds an array of 8KB
virtual mappings - one per CPU. Whenever a CPU is brought online, the
corresponding item is mapped to the real address of the previously
installed prefix page.

The absolute zero lowcore access works like this: a CPU calls the
new primitive get_abs_lowcore() to obtain its 8KB mapping as a
pointer to the struct lowcore. Virtual address references to that
pointer get translated to the real addresses of the prefix page,
which in turn gets swapped with the absolute zero memory addresses
due to prefixing. Once the pointer is not needed it must be released
with put_abs_lowcore() primitive:

	struct lowcore *abs_lc;
	unsigned long flags;

	abs_lc = get_abs_lowcore(&flags);
	abs_lc->... = ...;
	put_abs_lowcore(abs_lc, flags);

To ensure the described mechanism works large segment- and region-
table entries must be avoided for the 8KB mappings. Failure to do
so results in usage of Region-Frame Absolute Address (RFAA) or
Segment-Frame Absolute Address (SFAA) large page fields. In that
case absolute addresses would be used to address the prefix page
instead of the real ones and the prefixing would get bypassed.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-28 18:05:23 +02:00
Alexander Gordeev
2e2493c675 s390/setup: rearrange absolute lowcore initialization
Make the absolute lowcore assignments immediately follow
the boot CPU lowcore same member assignments. This way
readability improves when reading from up to down, with
no out of order mcck stack allocation in-between.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-28 18:05:23 +02:00
Alexander Gordeev
57ad19bcde s390/boot: cleanup adjust_to_uv_max() function
Uncouple input and output arguments by making the latter
the function return value.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-28 18:05:23 +02:00
Alexander Gordeev
6f5c672d17 s390/smp: enforce lowcore protection on CPU restart
As result of commit 915fea04f9 ("s390/smp: enable DAT before
CPU restart callback is called") the low-address protection bit
gets mistakenly unset in control register 0 save area of the
absolute zero memory. That area is used when manual PSW restart
happened to hit an offline CPU. In this case the low-address
protection for that CPU will be dropped.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Fixes: 915fea04f9 ("s390/smp: enable DAT before CPU restart callback is called")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-28 18:05:23 +02:00
Alexander Gordeev
2f089a3846 Merge branch 'vmcore-iov_iter' into features
Pull changes that finalize switching of copy_oldmem_page() callback
to iov_iter interface. These changes were pulled in work.iov_iter of
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-28 17:53:11 +02:00
Linus Torvalds
5de64d4496 s390 updates for 5.19
- Prevent relatively slow PRNO TRNG random number operation
   from being called from interrupt context. That could for
   example cause some network loads to timeout.
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYIADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCYt/ishccYWdvcmRlZXZA
 bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8DbGAP9O4BV4hAtjiOG+l3WSKprzwNKZ
 EpOmd8ivJyvox3GAMAEAziwGuBogmIhXsyiz7wrMUGaFpBkST3yVI7WMV4qaMAU=
 =mAUq
 -----END PGP SIGNATURE-----

Merge tag 's390-5.19-7' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fix from Alexander GordeevL

 - Prevent relatively slow PRNO TRNG random number operation from being
   called from interrupt context. That could for example cause some
   network loads to timeout.

* tag 's390-5.19-7' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/archrandom: prevent CPACF trng invocations in interrupt context
2022-07-26 10:03:53 -07:00
Jason A. Donenfeld
d349ab99ee random: handle archrandom with multiple longs
The archrandom interface was originally designed for x86, which supplies
RDRAND/RDSEED for receiving random words into registers, resulting in
one function to generate an int and another to generate a long. However,
other architectures don't follow this.

On arm64, the SMCCC TRNG interface can return between one and three
longs. On s390, the CPACF TRNG interface can return arbitrary amounts,
with four longs having the same cost as one. On UML, the os_getrandom()
interface can return arbitrary amounts.

So change the api signature to take a "max_longs" parameter designating
the maximum number of longs requested, and then return the number of
longs generated.

Since callers need to check this return value and loop anyway, each arch
implementation does not bother implementing its own loop to try again to
fill the maximum number of longs. Additionally, all existing callers
pass in a constant max_longs parameter. Taken together, these two things
mean that the codegen doesn't really change much for one-word-at-a-time
platforms, while performance is greatly improved on platforms such as
s390.

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-07-25 13:26:14 +02:00
Nicolin Chen
10e19d492a vfio/ap: Pass in physical address of ind to ap_aqic()
The ap_aqic() is called by vfio_ap_irq_enable() where it passes in a
virt value that's casted from a physical address "h_nib". Inside the
ap_aqic(), it does virt_to_phys() again.

Since ap_aqic() needs a physical address, let's just pass in a pa of
ind directly. So change the "ind" to "pa_ind".

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20220723020256.30081-4-nicolinc@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-23 07:29:10 -06:00
Stafford Horne
abb4970ac3 PCI: Move isa_dma_bridge_buggy out of asm/dma.h
The isa_dma_bridge_buggy symbol is only used for x86_32, and only x86_32
platforms or quirks ever set it.

Add a new linux/isa-dma.h header that #defines isa_dma_bridge_buggy to 0
except on x86_32, where we keep it as a variable, and remove all the arch-
specific definitions.

[bhelgaas: commit log]
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/r/20220722214944.831438-3-shorne@gmail.com
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2022-07-22 17:24:47 -05:00
Stafford Horne
ae85b23c65 PCI: Remove pci_get_legacy_ide_irq() and asm-generic/pci.h
pci_get_legacy_ide_irq() is only used on platforms that support PNP, so
many architectures define it but never use it.  Replace uses of it with
ATA_PRIMARY_IRQ() and ATA_SECONDARY_IRQ(), which provide the same
functionality.

Since pci_get_legacy_ide_irq() is no longer used, remove all the
architecture-specific definitions of it as well as asm-generic/pci.h, which
only provides pci_get_legacy_ide_irq()

[bhelgaas: commit log]
Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20220722214944.831438-2-shorne@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-07-22 17:23:45 -05:00
Niklas Schnelle
960ac36264 s390/pci: allow zPCI zbus without a function zero
Currently the zPCI code blocks PCI bus creation and probing of a zPCI zbus
unless there is a PCI function with devfn 0. This is always the case for
the PCI functions with hidden RID, but may keep PCI functions from a
multi-function PCI device with RID information invisible until the function
0 becomes visible. Worse, as a PCI bus is necessary to even present a PCI
hotplug slot, even that remains invisible.

With the probing of these so-called isolated PCI functions enabled for s390
in common code, this restriction is no longer necessary. On network cards
with multiple ports and a PF per port this also allows using each port on
its own while still providing the physical PCI topology information in the
devfn needed to associate VFs with their parent PF.

Link: https://lore.kernel.org/r/20220628143100.3228092-6-schnelle@linux.ibm.com
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
2022-07-22 16:06:13 -05:00
Harald Freudenberger
918e75f77a s390/archrandom: prevent CPACF trng invocations in interrupt context
This patch slightly reworks the s390 arch_get_random_seed_{int,long}
implementation: Make sure the CPACF trng instruction is never
called in any interrupt context. This is done by adding an
additional condition in_task().

Justification:

There are some constrains to satisfy for the invocation of the
arch_get_random_seed_{int,long}() functions:
- They should provide good random data during kernel initialization.
- They should not be called in interrupt context as the TRNG
  instruction is relatively heavy weight and may for example
  make some network loads cause to timeout and buck.

However, it was not clear what kind of interrupt context is exactly
encountered during kernel init or network traffic eventually calling
arch_get_random_seed_long().

After some days of investigations it is clear that the s390
start_kernel function is not running in any interrupt context and
so the trng is called:

Jul 11 18:33:39 t35lp54 kernel:  [<00000001064e90ca>] arch_get_random_seed_long.part.0+0x32/0x70
Jul 11 18:33:39 t35lp54 kernel:  [<000000010715f246>] random_init+0xf6/0x238
Jul 11 18:33:39 t35lp54 kernel:  [<000000010712545c>] start_kernel+0x4a4/0x628
Jul 11 18:33:39 t35lp54 kernel:  [<000000010590402a>] startup_continue+0x2a/0x40

The condition in_task() is true and the CPACF trng provides random data
during kernel startup.

The network traffic however, is more difficult. A typical call stack
looks like this:

Jul 06 17:37:07 t35lp54 kernel:  [<000000008b5600fc>] extract_entropy.constprop.0+0x23c/0x240
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b560136>] crng_reseed+0x36/0xd8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b5604b8>] crng_make_state+0x78/0x340
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b5607e0>] _get_random_bytes+0x60/0xf8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b56108a>] get_random_u32+0xda/0x248
Jul 06 17:37:07 t35lp54 kernel:  [<000000008aefe7a8>] kfence_guarded_alloc+0x48/0x4b8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008aeff35e>] __kfence_alloc+0x18e/0x1b8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008aef7f10>] __kmalloc_node_track_caller+0x368/0x4d8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b611eac>] kmalloc_reserve+0x44/0xa0
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b611f98>] __alloc_skb+0x90/0x178
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b6120dc>] __napi_alloc_skb+0x5c/0x118
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b8f06b4>] qeth_extract_skb+0x13c/0x680
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b8f6526>] qeth_poll+0x256/0x3f8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b63d76e>] __napi_poll.constprop.0+0x46/0x2f8
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b63dbec>] net_rx_action+0x1cc/0x408
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b937302>] __do_softirq+0x132/0x6b0
Jul 06 17:37:07 t35lp54 kernel:  [<000000008abf46ce>] __irq_exit_rcu+0x13e/0x170
Jul 06 17:37:07 t35lp54 kernel:  [<000000008abf531a>] irq_exit_rcu+0x22/0x50
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b922506>] do_io_irq+0xe6/0x198
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b935826>] io_int_handler+0xd6/0x110
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b9358a6>] psw_idle_exit+0x0/0xa
Jul 06 17:37:07 t35lp54 kernel: ([<000000008ab9c59a>] arch_cpu_idle+0x52/0xe0)
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b933cfe>] default_idle_call+0x6e/0xd0
Jul 06 17:37:07 t35lp54 kernel:  [<000000008ac59f4e>] do_idle+0xf6/0x1b0
Jul 06 17:37:07 t35lp54 kernel:  [<000000008ac5a28e>] cpu_startup_entry+0x36/0x40
Jul 06 17:37:07 t35lp54 kernel:  [<000000008abb0d90>] smp_start_secondary+0x148/0x158
Jul 06 17:37:07 t35lp54 kernel:  [<000000008b935b9e>] restart_int_handler+0x6e/0x90

which confirms that the call is in softirq context. So in_task() covers exactly
the cases where we want to have CPACF trng called: not in nmi, not in hard irq,
not in soft irq but in normal task context and during kernel init.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Link: https://lore.kernel.org/r/20220713131721.257907-1-freude@linux.ibm.com
Fixes: e4f7440030 ("s390/archrandom: simplify back to earlier design and initialize earlier")
[agordeev@linux.ibm.com changed desc, added Fixes and Link, removed -stable]
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-21 20:03:03 +02:00
Peter Zijlstra
1e9fdf21a4 mmu_gather: Remove per arch tlb_{start,end}_vma()
Scattered across the archs are 3 basic forms of tlb_{start,end}_vma().
Provide two new MMU_GATHER_knobs to enumerate them and remove the per
arch tlb_{start,end}_vma() implementations.

 - MMU_GATHER_NO_FLUSH_CACHE indicates the arch has flush_cache_range()
   but does *NOT* want to call it for each VMA.

 - MMU_GATHER_MERGE_VMAS indicates the arch wants to merge the
   invalidate across multiple VMAs if possible.

With these it is possible to capture the three forms:

  1) empty stubs;
     select MMU_GATHER_NO_FLUSH_CACHE and MMU_GATHER_MERGE_VMAS

  2) start: flush_cache_range(), end: empty;
     select MMU_GATHER_MERGE_VMAS

  3) start: flush_cache_range(), end: flush_tlb_range();
     default

Obviously, if the architecture does not have flush_cache_range() then
it also doesn't need to select MMU_GATHER_NO_FLUSH_CACHE.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-21 10:50:13 -07:00