Commit Graph

260 Commits

Author SHA1 Message Date
Ard Biesheuvel
c1fd78a777 arm64: mm: add missing P4D definitions and use them consistently
Even though level 0, 1 and 2 descriptors share the same attribute
encodings, let's be a bit more consistent about using the right one at
the right level. So add new macros for level 0/P4D definitions, and
clean up some inconsistencies involving these macros.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20210310104942.174584-2-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-03-19 18:47:15 +00:00
Ard Biesheuvel
7ba8f2b2d6 arm64: mm: use a 48-bit ID map when possible on 52-bit VA builds
52-bit VA kernels can run on hardware that is only 48-bit capable, but
configure the ID map as 52-bit by default. This was not a problem until
recently, because the special T0SZ value for a 52-bit VA space was never
programmed into the TCR register anwyay, and because a 52-bit ID map
happens to use the same number of translation levels as a 48-bit one.

This behavior was changed by commit 1401bef703 ("arm64: mm: Always update
TCR_EL1 from __cpu_set_tcr_t0sz()"), which causes the unsupported T0SZ
value for a 52-bit VA to be programmed into TCR_EL1. While some hardware
simply ignores this, Mark reports that Amberwing systems choke on this,
resulting in a broken boot. But even before that commit, the unsupported
idmap_t0sz value was exposed to KVM and used to program TCR_EL2 incorrectly
as well.

Given that we already have to deal with address spaces being either 48-bit
or 52-bit in size, the cleanest approach seems to be to simply default to
a 48-bit VA ID map, and only switch to a 52-bit one if the placement of the
kernel in DRAM requires it. This is guaranteed not to happen unless the
system is actually 52-bit VA capable.

Fixes: 90ec95cda9 ("arm64: mm: Introduce VA_BITS_MIN")
Reported-by: Mark Salter <msalter@redhat.com>
Link: http://lore.kernel.org/r/20210310003216.410037-1-msalter@redhat.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20210310171515.416643-2-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-11 13:04:28 +00:00
Catalin Marinas
d15dfd3138 arm64: mte: Map hotplugged memory as Normal Tagged
In a system supporting MTE, the linear map must allow reading/writing
allocation tags by setting the memory type as Normal Tagged. Currently,
this is only handled for memory present at boot. Hotplugged memory uses
Normal non-Tagged memory.

Introduce pgprot_mhp() for hotplugged memory and use it in
add_memory_resource(). The arm64 code maps pgprot_mhp() to
pgprot_tagged().

Note that ZONE_DEVICE memory should not be mapped as Tagged and
therefore setting the memory type in arch_add_memory() is not feasible.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Fixes: 0178dc7613 ("arm64: mte: Use Normal Tagged attributes for the linear map")
Reported-by: Patrick Daly <pdaly@codeaurora.org>
Tested-by: Patrick Daly <pdaly@codeaurora.org>
Link: https://lore.kernel.org/r/1614745263-27827-1-git-send-email-pdaly@codeaurora.org
Cc: <stable@vger.kernel.org> # 5.10.x
Cc: Will Deacon <will@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20210309122601.5543-1-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-10 10:56:46 +00:00
Linus Torvalds
8f47d753d4 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
 "The big one is a fix for the VHE enabling path during early boot,
  where the code enabling the MMU wasn't necessarily in the identity map
  of the new page-tables, resulting in a consistent crash with 64k
  pages. In fixing that, we noticed some missing barriers too, so we
  added those for the sake of architectural compliance.

  Other than that, just the usual merge window trickle. There'll be more
  to come, too.

  Summary:

   - Fix lockdep false alarm on resume-from-cpuidle path

   - Fix memory leak in kexec_file

   - Fix module linker script to work with GDB

   - Fix error code when trying to use uprobes with AArch32 instructions

   - Fix late VHE enabling with 64k pages

   - Add missing ISBs after TLB invalidation

   - Fix seccomp when tracing syscall -1

   - Fix stacktrace return code at end of stack

   - Fix inconsistent whitespace for pointer return values

   - Fix compiler warnings when building with W=1"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: stacktrace: Report when we reach the end of the stack
  arm64: ptrace: Fix seccomp of traced syscall -1 (NO_SYSCALL)
  arm64: Add missing ISB after invalidating TLB in enter_vhe
  arm64: Add missing ISB after invalidating TLB in __primary_switch
  arm64: VHE: Enable EL2 MMU from the idmap
  KVM: arm64: make the hyp vector table entries local
  arm64/mm: Fixed some coding style issues
  arm64: uprobe: Return EOPNOTSUPP for AARCH32 instruction probing
  kexec: move machine_kexec_post_load() to public interface
  arm64 module: set plt* section addresses to 0x0
  arm64: kexec_file: fix memory leakage in create_dtb() when fdt_open_into() fails
  arm64: spectre: Prevent lockdep splat on v4 mitigation enable path
2021-02-26 10:19:03 -08:00
Marco Elver
840b239863 arm64, kfence: enable KFENCE for ARM64
Add architecture specific implementation details for KFENCE and enable
KFENCE for the arm64 architecture. In particular, this implements the
required interface in <asm/kfence.h>.

KFENCE requires that attributes for pages from its memory pool can
individually be set. Therefore, force the entire linear map to be mapped
at page granularity. Doing so may result in extra memory allocated for
page tables in case rodata=full is not set; however, currently
CONFIG_RODATA_FULL_DEFAULT_ENABLED=y is the default, and the common case
is therefore not affected by this change.

[elver@google.com: add missing copyright and description header]
  Link: https://lkml.kernel.org/r/20210118092159.145934-3-elver@google.com

Link: https://lkml.kernel.org/r/20201103175841.3495947-4-elver@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Co-developed-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joern Engel <joern@purestorage.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: SeongJae Park <sjpark@amazon.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 09:41:02 -08:00
Anshuman Khandual
03aaf83fba arm64/mm: define arch_get_mappable_range()
This overrides arch_get_mappable_range() on arm64 platform which will be
used with recently added generic framework.  It drops
inside_linear_region() and subsequent check in arch_add_memory() which are
no longer required.  It also adds a VM_BUG_ON() check that would ensure
that mhp_range_allowed() has already been called.

Link: https://lkml.kernel.org/r/1612149902-7867-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 09:41:00 -08:00
Zhiyuan Dai
2e8acca191 arm64/mm: Fixed some coding style issues
Adjust whitespace for fixmap_pXd() functions returning pointers for
consistency with the kernel coding style.

Signed-off-by: Zhiyuan Dai <daizhiyuan@phytium.com.cn>
Link: https://lore.kernel.org/r/1613958231-5474-1-git-send-email-daizhiyuan@phytium.com.cn
Signed-off-by: Will Deacon <will@kernel.org>
2021-02-23 10:40:54 +00:00
Will Deacon
1d32854ea7 Merge branch 'for-next/misc' into for-next/core
Miscellaneous arm64 changes for 5.12.

* for-next/misc:
  arm64: Make CPU_BIG_ENDIAN depend on ld.bfd or ld.lld 13.0.0+
  arm64: vmlinux.ld.S: add assertion for tramp_pg_dir offset
  arm64: vmlinux.ld.S: add assertion for reserved_pg_dir offset
  arm64/ptdump:display the Linear Mapping start marker
  arm64: ptrace: Fix missing return in hw breakpoint code
  KVM: arm64: Move __hyp_set_vectors out of .hyp.text
  arm64: Include linux/io.h in mm/mmap.c
  arm64: cacheflush: Remove stale comment
  arm64: mm: Remove unused header file
  arm64/sparsemem: reduce SECTION_SIZE_BITS
  arm64/mm: Add warning for outside range requests in vmemmap_populate()
  arm64: Drop workaround for broken 'S' constraint with GCC 4.9
2021-02-12 15:07:34 +00:00
Marc Zyngier
93ad55b785 arm64: cpufeatures: Allow disabling of BTI from the command-line
In order to be able to disable BTI at runtime, whether it is
for testing purposes, or to work around HW issues, let's add
support for overriding the ID_AA64PFR1_EL1.BTI field.

This is further mapped on the arm64.nobti command-line alias.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: David Brazdil <dbrazdil@google.com>
Tested-by: Srinivas Ramana <sramana@codeaurora.org>
Link: https://lore.kernel.org/r/20210208095732.3267263-21-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-02-09 13:50:57 +00:00
Anshuman Khandual
edb739eed8 arm64/mm: Add warning for outside range requests in vmemmap_populate()
vmemmap_populate() does not validate the requested vmemmap address range to
be inside the platform assigned space i.e [VMEMMAP_START..VMEMMAP_END] for
vmemmap. Instead it would just go ahead and create the mapping which might
then overlap with other sections in the kernel virtual address space.

Just adding an warning here for range overrun which would help detect the
problem earlier on, before a potential struct page corruption. This also
makes vmemmap_populate() symmetrical with vmemmap_free() which already has
a similar warning.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1609845851-25064-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-20 12:27:57 +00:00
Catalin Marinas
d889797530 Merge remote-tracking branch 'arm64/for-next/fixes' into for-next/core
* arm64/for-next/fixes: (26 commits)
  arm64: mte: fix prctl(PR_GET_TAGGED_ADDR_CTRL) if TCF0=NONE
  arm64: mte: Fix typo in macro definition
  arm64: entry: fix EL1 debug transitions
  arm64: entry: fix NMI {user, kernel}->kernel transitions
  arm64: entry: fix non-NMI kernel<->kernel transitions
  arm64: ptrace: prepare for EL1 irq/rcu tracking
  arm64: entry: fix non-NMI user<->kernel transitions
  arm64: entry: move el1 irq/nmi logic to C
  arm64: entry: prepare ret_to_user for function call
  arm64: entry: move enter_from_user_mode to entry-common.c
  arm64: entry: mark entry code as noinstr
  arm64: mark idle code as noinstr
  arm64: syscall: exit userspace before unmasking exceptions
  arm64: pgtable: Ensure dirty bit is preserved across pte_wrprotect()
  arm64: pgtable: Fix pte_accessible()
  ACPI/IORT: Fix doc warnings in iort.c
  arm64/fpsimd: add <asm/insn.h> to <asm/kprobes.h> to fix fpsimd build
  arm64: cpu_errata: Apply Erratum 845719 to KRYO2XX Silver
  arm64: proton-pack: Add KRYO2XX silver CPUs to spectre-v2 safe-list
  arm64: kpti: Add KRYO2XX gold/silver CPU cores to kpti safelist
  ...

# Conflicts:
#	arch/arm64/include/asm/exception.h
#	arch/arm64/kernel/sdei.c
2020-12-09 18:04:55 +00:00
Catalin Marinas
ba4259a6f8 Merge branch 'for-next/misc' into for-next/core
* for-next/misc:
  : Miscellaneous patches
  arm64: vmlinux.lds.S: Drop redundant *.init.rodata.*
  kasan: arm64: set TCR_EL1.TBID1 when enabled
  arm64: mte: optimize asynchronous tag check fault flag check
  arm64/mm: add fallback option to allocate virtually contiguous memory
  arm64/smp: Drop the macro S(x,s)
  arm64: consistently use reserved_pg_dir
  arm64: kprobes: Remove redundant kprobe_step_ctx

# Conflicts:
#	arch/arm64/kernel/vmlinux.lds.S
2020-12-09 18:04:48 +00:00
Catalin Marinas
3c09ec59cd Merge branches 'for-next/kvm-build-fix', 'for-next/va-refactor', 'for-next/lto', 'for-next/mem-hotplug', 'for-next/cppc-ffh', 'for-next/pad-image-header', 'for-next/zone-dma-default-32-bit', 'for-next/signal-tag-bits' and 'for-next/cmdline-extended' into for-next/core
* for-next/kvm-build-fix:
  : Fix KVM build issues with 64K pages
  KVM: arm64: Fix build error in user_mem_abort()

* for-next/va-refactor:
  : VA layout changes
  arm64: mm: don't assume struct page is always 64 bytes
  Documentation/arm64: fix RST layout of memory.rst
  arm64: mm: tidy up top of kernel VA space
  arm64: mm: make vmemmap region a projection of the linear region
  arm64: mm: extend linear region for 52-bit VA configurations

* for-next/lto:
  : Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO
  arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y
  arm64: alternatives: Remove READ_ONCE() usage during patch operation
  arm64: cpufeatures: Add capability for LDAPR instruction
  arm64: alternatives: Split up alternative.h
  arm64: uaccess: move uao_* alternatives to asm-uaccess.h

* for-next/mem-hotplug:
  : Memory hotplug improvements
  arm64/mm/hotplug: Ensure early memory sections are all online
  arm64/mm/hotplug: Enable MEM_OFFLINE event handling
  arm64/mm/hotplug: Register boot memory hot remove notifier earlier
  arm64: mm: account for hotplug memory when randomizing the linear region

* for-next/cppc-ffh:
  : Add CPPC FFH support using arm64 AMU counters
  arm64: abort counter_read_on_cpu() when irqs_disabled()
  arm64: implement CPPC FFH support using AMUs
  arm64: split counter validation function
  arm64: wrap and generalise counter read functions

* for-next/pad-image-header:
  : Pad Image header to 64KB and unmap it
  arm64: head: tidy up the Image header definition
  arm64/head: avoid symbol names pointing into first 64 KB of kernel image
  arm64: omit [_text, _stext) from permanent kernel mapping

* for-next/zone-dma-default-32-bit:
  : Default to 32-bit wide ZONE_DMA (previously reduced to 1GB for RPi4)
  of: unittest: Fix build on architectures without CONFIG_OF_ADDRESS
  mm: Remove examples from enum zone_type comment
  arm64: mm: Set ZONE_DMA size based on early IORT scan
  arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges
  of: unittest: Add test for of_dma_get_max_cpu_address()
  of/address: Introduce of_dma_get_max_cpu_address()
  arm64: mm: Move zone_dma_bits initialization into zone_sizes_init()
  arm64: mm: Move reserve_crashkernel() into mem_init()
  arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
  arm64: Ignore any DMA offsets in the max_zone_phys() calculation

* for-next/signal-tag-bits:
  : Expose the FAR_EL1 tag bits in siginfo
  arm64: expose FAR_EL1 tag bits in siginfo
  signal: define the SA_EXPOSE_TAGBITS bit in sa_flags
  signal: define the SA_UNSUPPORTED bit in sa_flags
  arch: provide better documentation for the arch-specific SA_* flags
  signal: clear non-uapi flag bits when passing/returning sa_flags
  arch: move SA_* definitions to generic headers
  parisc: start using signal-defs.h
  parisc: Drop parisc special case for __sighandler_t

* for-next/cmdline-extended:
  : Add support for CONFIG_CMDLINE_EXTENDED
  arm64: Extend the kernel command line from the bootloader
  arm64: kaslr: Refactor early init command line parsing
2020-12-09 18:04:35 +00:00
Catalin Marinas
2687275a58 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
mem_init() currently relies on knowing the boundaries of the crashkernel
reservation to map such region with page granularity for later
unmapping via set_memory_valid(..., 0). If the crashkernel reservation
is deferred, such boundaries are not known when the linear mapping is
created. Simply parse the command line for "crashkernel" and, if found,
create the linear map with NO_BLOCK_MAPPINGS.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Acked-by: James Morse <james.morse@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Link: https://lore.kernel.org/r/20201119175556.18681-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-20 09:32:46 +00:00
Sudarshan Rajagopalan
9f84f39f55 arm64/mm: add fallback option to allocate virtually contiguous memory
When section mappings are enabled, we allocate vmemmap pages from
physically continuous memory of size PMD_SIZE using
vmemmap_alloc_block_buf(). Section mappings are good to reduce TLB
pressure. But when system is highly fragmented and memory blocks are
being hot-added at runtime, its possible that such physically continuous
memory allocations can fail. Rather than failing the memory hot-add
procedure, add a fallback option to allocate vmemmap pages from
discontinuous pages using vmemmap_populate_basepages().

Signed-off-by: Sudarshan Rajagopalan <sudaraja@codeaurora.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/d6c06f2ef39bbe6c715b2f6db76eb16155fdcee6.1602722808.git.sudaraja@codeaurora.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-17 18:21:01 +00:00
Ard Biesheuvel
e2a073dde9 arm64: omit [_text, _stext) from permanent kernel mapping
In a previous patch, we increased the size of the EFI PE/COFF header
to 64 KB, which resulted in the _stext symbol to appear at a fixed
offset of 64 KB into the image.

Since 64 KB is also the largest page size we support, this completely
removes the need to map the first 64 KB of the kernel image, given that
it only contains the arm64 Image header and the EFI header, neither of
which we ever access again after booting the kernel. More importantly,
we should avoid an executable mapping of non-executable and not entirely
predictable data, to deal with the unlikely event that we inadvertently
emitted something that looks like an opcode that could be used as a
gadget for speculative execution.

So let's limit the kernel mapping of .text to the [_stext, _etext)
region, which matches the view of generic code (such as kallsyms) when
it reasons about the boundaries of the kernel's .text section.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201117124729.12642-2-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-17 16:14:20 +00:00
Anshuman Khandual
58284a901b arm64/mm: Validate hotplug range before creating linear mapping
During memory hotplug process, the linear mapping should not be created for
a given memory range if that would fall outside the maximum allowed linear
range. Else it might cause memory corruption in the kernel virtual space.

Maximum linear mapping region is [PAGE_OFFSET..(PAGE_END -1)] accommodating
both its ends but excluding PAGE_END. Max physical range that can be mapped
inside this linear mapping range, must also be derived from its end points.

This ensures that arch_add_memory() validates memory hot add range for its
potential linear mapping requirements, before creating it with
__create_pgd_mapping().

Fixes: 4ab2150615 ("arm64: Add memory hotplug support")
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1605252614-761-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-13 09:45:08 +00:00
Anshuman Khandual
fdd99a4103 arm64/mm/hotplug: Ensure early memory sections are all online
This adds a validation function that scans the entire boot memory and makes
sure that all early memory sections are online. This check is essential for
the memory notifier to work properly, as it cannot prevent any boot memory
from offlining, if all sections are not online to begin with. Although the
boot section scanning is selectively enabled with DEBUG_VM.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1604896137-16644-4-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-10 18:44:12 +00:00
Anshuman Khandual
9fb3d4a303 arm64/mm/hotplug: Enable MEM_OFFLINE event handling
This enables MEM_OFFLINE memory event handling. It will help intercept any
possible error condition such as if boot memory some how still got offlined
even after an explicit notifier failure, potentially by a future change in
generic hot plug framework. This would help detect such scenarios and help
debug further. While here, also call out the first section being attempted
for offline or got offlined.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1604896137-16644-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-10 18:44:12 +00:00
Anshuman Khandual
cb45babe1b arm64/mm/hotplug: Register boot memory hot remove notifier earlier
This moves memory notifier registration earlier in the boot process from
device_initcall() to early_initcall() which will help in guarding against
potential early boot memory offline requests. Even though there should not
be any actual offlinig requests till memory block devices are initialized
with memory_dev_init() but then generic init sequence might just change in
future. Hence an early registration for the memory event notifier would be
helpful. While here, just skip the registration if CONFIG_MEMORY_HOTREMOVE
is not enabled and also call out when memory notifier registration fails.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1604896137-16644-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-10 18:44:11 +00:00
Joe Perches
33def8498f treewide: Convert macro and uses of __section(foo) to __section("foo")
Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.

Remove the quote operator # from compiler_attributes.h __section macro.

Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.

Conversion done using the script at:

    https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-25 14:51:49 -07:00
Mike Rapoport
b10d6bca87 arch, drivers: replace for_each_membock() with for_each_mem_range()
There are several occurrences of the following pattern:

	for_each_memblock(memory, reg) {
		start = __pfn_to_phys(memblock_region_memory_base_pfn(reg);
		end = __pfn_to_phys(memblock_region_memory_end_pfn(reg));

		/* do something with start and end */
	}

Using for_each_mem_range() iterator is more appropriate in such cases and
allows simpler and cleaner code.

[akpm@linux-foundation.org: fix arch/arm/mm/pmsa-v7.c build]
[rppt@linux.ibm.com: mips: fix cavium-octeon build caused by memblock refactoring]
  Link: http://lkml.kernel.org/r/20200827124549.GD167163@linux.ibm.com

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-13-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13 18:38:35 -07:00
Linus Torvalds
34eb62d868 Merge tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull orphan section checking from Ingo Molnar:
 "Orphan link sections were a long-standing source of obscure bugs,
  because the heuristics that various linkers & compilers use to handle
  them (include these bits into the output image vs discarding them
  silently) are both highly idiosyncratic and also version dependent.

  Instead of this historically problematic mess, this tree by Kees Cook
  (et al) adds build time asserts and build time warnings if there's any
  orphan section in the kernel or if a section is not sized as expected.

  And because we relied on so many silent assumptions in this area, fix
  a metric ton of dependencies and some outright bugs related to this,
  before we can finally enable the checks on the x86, ARM and ARM64
  platforms"

* tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  x86/boot/compressed: Warn on orphan section placement
  x86/build: Warn on orphan section placement
  arm/boot: Warn on orphan section placement
  arm/build: Warn on orphan section placement
  arm64/build: Warn on orphan section placement
  x86/boot/compressed: Add missing debugging sections to output
  x86/boot/compressed: Remove, discard, or assert for unwanted sections
  x86/boot/compressed: Reorganize zero-size section asserts
  x86/build: Add asserts for unwanted sections
  x86/build: Enforce an empty .got.plt section
  x86/asm: Avoid generating unused kprobe sections
  arm/boot: Handle all sections explicitly
  arm/build: Assert for unwanted sections
  arm/build: Add missing sections
  arm/build: Explicitly keep .ARM.attributes sections
  arm/build: Refactor linker script headers
  arm64/build: Assert for unwanted sections
  arm64/build: Add missing DWARF sections
  arm64/build: Use common DISCARDS in linker script
  arm64/build: Remove .eh_frame* sections due to unwind tables
  ...
2020-10-12 13:39:19 -07:00
Catalin Marinas
0178dc7613 arm64: mte: Use Normal Tagged attributes for the linear map
Once user space is given access to tagged memory, the kernel must be
able to clear/save/restore tags visible to the user. This is done via
the linear mapping, therefore map it as such. The new MT_NORMAL_TAGGED
index for MAIR_EL1 is initially mapped as Normal memory and later
changed to Normal Tagged via the cpufeature infrastructure. From a
mismatched attribute aliases perspective, the Tagged memory is
considered a permission and it won't lead to undefined behaviour.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
2020-09-03 17:26:31 +01:00
Kees Cook
b4ca91027d arm64/mm: Remove needless section quotes
Fix a case of needless quotes in __section(), which Clang doesn't like.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200821194310.3089815-9-keescook@chromium.org
2020-09-01 09:50:36 +02:00
Anshuman Khandual
eee07935bb arm64/mm: enable vmem_altmap support for vmemmap mappings
Device memory ranges when getting hot added into ZONE_DEVICE, might
require their vmemmap mapping's backing memory to be allocated from their
own range instead of consuming system memory.  This prevents large system
memory usage for potentially large device memory ranges.  Device driver
communicates this request via vmem_altmap structure.  Architecture needs
to take this request into account while creating and tearing down vemmmap
mappings.

This enables vmem_altmap support in vmemmap_populate() and vmemmap_free()
which includes vmemmap_populate_basepages() used for ARM64_16K_PAGES and
ARM64_64K_PAGES configs.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jia He <justin.he@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1594004178-8861-4-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:27 -07:00
Anshuman Khandual
56993b4e14 mm/sparsemem: enable vmem_altmap support in vmemmap_alloc_block_buf()
There are many instances where vmemap allocation is often switched between
regular memory and device memory just based on whether altmap is available
or not.  vmemmap_alloc_block_buf() is used in various platforms to
allocate vmemmap mappings.  Lets also enable it to handle altmap based
device memory allocation along with existing regular memory allocations.
This will help in avoiding the altmap based allocation switch in many
places.  To summarize there are two different methods to call
vmemmap_alloc_block_buf().

vmemmap_alloc_block_buf(size, node, NULL)   /* Allocate from system RAM */
vmemmap_alloc_block_buf(size, node, altmap) /* Allocate from altmap */

This converts altmap_alloc_block_buf() into a static function, drops it's
entry from the header and updates Documentation/vm/memory-model.rst.

Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jia He <justin.he@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yu Zhao <yuzhao@google.com>
Link: http://lkml.kernel.org/r/1594004178-8861-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:27 -07:00
Anshuman Khandual
1d9cfee753 mm/sparsemem: enable vmem_altmap support in vmemmap_populate_basepages()
Patch series "arm64: Enable vmemmap mapping from device memory", v4.

This series enables vmemmap backing memory allocation from device memory
ranges on arm64.  But before that, it enables vmemmap_populate_basepages()
and vmemmap_alloc_block_buf() to accommodate struct vmem_altmap based
alocation requests.

This patch (of 3):

vmemmap_populate_basepages() is used across platforms to allocate backing
memory for vmemmap mapping.  This is used as a standard default choice or
as a fallback when intended huge pages allocation fails.  This just
creates entire vmemmap mapping with base pages (PAGE_SIZE).

On arm64 platforms, vmemmap_populate_basepages() is called instead of the
platform specific vmemmap_populate() when ARM64_SWAPPER_USES_SECTION_MAPS
is not enabled as in case for ARM64_16K_PAGES and ARM64_64K_PAGES configs.

At present vmemmap_populate_basepages() does not support allocating from
driver defined struct vmem_altmap while trying to create vmemmap mapping
for a device memory range.  It prevents ARM64_16K_PAGES and
ARM64_64K_PAGES configs on arm64 from supporting device memory with
vmemap_altmap request.

This enables vmem_altmap support in vmemmap_populate_basepages() unlocking
device memory allocation for vmemap mapping on arm64 platforms with 16K or
64K base page configs.

Each architecture should evaluate and decide on subscribing device memory
based base page allocation through vmemmap_populate_basepages().  Hence
lets keep it disabled on all archs in order to preserve the existing
semantics.  A subsequent patch enables it on arm64.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jia He <justin.he@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Yu Zhao <yuzhao@google.com>
Link: http://lkml.kernel.org/r/1594004178-8861-1-git-send-email-anshuman.khandual@arm.com
Link: http://lkml.kernel.org/r/1594004178-8861-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:27 -07:00
Mike Rapoport
ca15ca406f mm: remove unneeded includes of <asm/pgalloc.h>
Patch series "mm: cleanup usage of <asm/pgalloc.h>"

Most architectures have very similar versions of pXd_alloc_one() and
pXd_free_one() for intermediate levels of page table.  These patches add
generic versions of these functions in <asm-generic/pgalloc.h> and enable
use of the generic functions where appropriate.

In addition, functions declared and defined in <asm/pgalloc.h> headers are
used mostly by core mm and early mm initialization in arch and there is no
actual reason to have the <asm/pgalloc.h> included all over the place.
The first patch in this series removes unneeded includes of
<asm/pgalloc.h>

In the end it didn't work out as neatly as I hoped and moving
pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require
unnecessary changes to arches that have custom page table allocations, so
I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local
to mm/.

This patch (of 8):

In most cases <asm/pgalloc.h> header is required only for allocations of
page table memory.  Most of the .c files that include that header do not
use symbols declared in <asm/pgalloc.h> and do not require that header.

As for the other header files that used to include <asm/pgalloc.h>, it is
possible to move that include into the .c file that actually uses symbols
from <asm/pgalloc.h> and drop the include from the header file.

The process was somewhat automated using

	sed -i -E '/[<"]asm\/pgalloc\.h/d' \
                $(grep -L -w -f /tmp/xx \
                        $(git grep -E -l '[<"]asm/pgalloc\.h'))

where /tmp/xx contains all the symbols defined in
arch/*/include/asm/pgalloc.h.

[rppt@linux.ibm.com: fix powerpc warning]

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:26 -07:00
Shyam Thombre
8dd4daa042 arm64: mm: reset address tag set by kasan sw tagging
KASAN sw tagging sets a random tag of 8 bits in the top byte of the pointer
returned by the memory allocating functions. So for the functions unaware
of this change, the top 8 bits of the address must be reset which is done
by the function arch_kasan_reset_tag().

Signed-off-by: Shyam Thombre <sthombre@codeaurora.org>
Link: https://lore.kernel.org/r/1591787384-5823-1-git-send-email-sthombre@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-06-15 16:58:13 +01:00
Mike Rapoport
974b9b2c68 mm: consolidate pte_index() and pte_offset_*() definitions
All architectures define pte_index() as

	(address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)

and all architectures define pte_offset_kernel() as an entry in the array
of PTEs indexed by the pte_index().

For the most architectures the pte_offset_kernel() implementation relies
on the availability of pmd_page_vaddr() that converts a PMD entry value to
the virtual address of the page containing PTEs array.

Let's move x86 definitions of the PTE accessors to the generic place in
<linux/pgtable.h> and then simply drop the respective definitions from the
other architectures.

The architectures that didn't provide pmd_page_vaddr() are updated to have
that defined.

The generic implementation of pte_offset_kernel() can be overridden by an
architecture and alpha makes use of this because it has special ordering
requirements for its version of pte_offset_kernel().

[rppt@linux.ibm.com: v2]
  Link: http://lkml.kernel.org/r/20200514170327.31389-11-rppt@kernel.org
[rppt@linux.ibm.com: update]
  Link: http://lkml.kernel.org/r/20200514170327.31389-12-rppt@kernel.org
[rppt@linux.ibm.com: update]
  Link: http://lkml.kernel.org/r/20200514170327.31389-13-rppt@kernel.org
[akpm@linux-foundation.org: fix x86 warning]
[sfr@canb.auug.org.au: fix powerpc build]
  Link: http://lkml.kernel.org/r/20200607153443.GB738695@linux.ibm.com

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-10-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:14 -07:00
Mike Rapoport
e9f6376858 arm64: add support for folded p4d page tables
Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate, replace 5level-fixup.h with pgtable-nop4d.h and
remove __ARCH_USE_5LEVEL_HACK.

[arnd@arndb.de: fix gcc-10 shift warning]
  Link: http://lkml.kernel.org/r/20200429185657.4085975-1-arnd@arndb.de
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: James Morse <james.morse@arm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200414153455.21744-4-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04 19:06:21 -07:00
Mark Brown
c8027285e3 arm64: Set GP bit in kernel page tables to enable BTI for the kernel
Now that the kernel is built with BTI annotations enable the feature by
setting the GP bit in the stage 1 translation tables.  This is done
based on the features supported by the boot CPU so that we do not need
to rewrite the translation tables.

In order to avoid potential issues on big.LITTLE systems when there are
a mix of BTI and non-BTI capable CPUs in the system when we have enabled
kernel mode BTI we change BTI to be a _STRICT_BOOT_CPU_FEATURE when we
have kernel BTI.  This will prevent any CPUs that don't support BTI
being started if the boot CPU supports BTI rather than simply not using
BTI as we do when supporting BTI only in userspace.  The main concern is
the possibility of BTYPE being preserved by a CPU that does not
implement BTI when a thread is migrated to it resulting in an incorrect
state which could generate an exception when the thread migrates back to
a CPU that does support BTI.  If we encounter practical systems which
mix BTI and non-BTI CPUs we will need to revisit this implementation.

Since we currently do not generate landing pads in the BPF JIT we only
map the base kernel text in this way.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200506195138.22086-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-05-07 17:53:20 +01:00
Logan Gunthorpe
bfeb022f8f mm/memory_hotplug: add pgprot_t to mhp_params
devm_memremap_pages() is currently used by the PCI P2PDMA code to create
struct page mappings for IO memory.  At present, these mappings are
created with PAGE_KERNEL which implies setting the PAT bits to be WB.
However, on x86, an mtrr register will typically override this and force
the cache type to be UC-.  In the case firmware doesn't set this
register it is effectively WB and will typically result in a machine
check exception when it's accessed.

Other arches are not currently likely to function correctly seeing they
don't have any MTRR registers to fall back on.

To solve this, provide a way to specify the pgprot value explicitly to
arch_add_memory().

Of the arches that support MEMORY_HOTPLUG: x86_64, and arm64 need a
simple change to pass the pgprot_t down to their respective functions
which set up the page tables.  For x86_32, set the page tables
explicitly using _set_memory_prot() (seeing they are already mapped).

For ia64, s390 and sh, reject anything but PAGE_KERNEL settings -- this
should be fine, for now, seeing these architectures don't support
ZONE_DEVICE.

A check in __add_pages() is also added to ensure the pgprot parameter
was set for all arches.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10 15:36:21 -07:00
Logan Gunthorpe
f5637d3b42 mm/memory_hotplug: rename mhp_restrictions to mhp_params
The mhp_restrictions struct really doesn't specify anything resembling a
restriction anymore so rename it to be mhp_params as it is a list of
extended parameters.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-3-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10 15:36:21 -07:00
Anshuman Khandual
bbd6ec605c arm64/mm: Enable memory hot remove
The arch code for hot-remove must tear down portions of the linear map and
vmemmap corresponding to memory being removed. In both cases the page
tables mapping these regions must be freed, and when sparse vmemmap is in
use the memory backing the vmemmap must also be freed.

This patch adds unmap_hotplug_range() and free_empty_tables() helpers which
can be used to tear down either region and calls it from vmemmap_free() and
___remove_pgd_mapping(). The free_mapped argument determines whether the
backing memory will be freed.

It makes two distinct passes over the kernel page table. In the first pass
with unmap_hotplug_range() it unmaps, invalidates applicable TLB cache and
frees backing memory if required (vmemmap) for each mapped leaf entry. In
the second pass with free_empty_tables() it looks for empty page table
sections whose page table page can be unmapped, TLB invalidated and freed.

While freeing intermediate level page table pages bail out if any of its
entries are still valid. This can happen for partially filled kernel page
table either from a previously attempted failed memory hot add or while
removing an address range which does not span the entire page table page
range.

The vmemmap region may share levels of table with the vmalloc region.
There can be conflicts between hot remove freeing page table pages with
a concurrent vmalloc() walking the kernel page table. This conflict can
not just be solved by taking the init_mm ptl because of existing locking
scheme in vmalloc(). So free_empty_tables() implements a floor and ceiling
method which is borrowed from user page table tear with free_pgd_range()
which skips freeing page table pages if intermediate address range is not
aligned or maximum floor-ceiling might not own the entire page table page.

Boot memory on arm64 cannot be removed. Hence this registers a new memory
hotplug notifier which prevents boot memory offlining and it's removal.

While here update arch_add_memory() to handle __add_pages() failures by
just unmapping recently added kernel linear mapping. Now enable memory hot
remove on arm64 platforms by default with ARCH_ENABLE_MEMORY_HOTREMOVE.

This implementation is overall inspired from kernel page table tear down
procedure on X86 architecture and user page table tear down method.

[Mike and Catalin added P4D page table level support]

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-04 15:35:44 +00:00
Steven Price
102f45fdbe arm64: mm: convert mm/dump.c to use walk_page_range()
Now walk_page_range() can walk kernel page tables, we can switch the arm64
ptdump code over to using it, simplifying the code.

Link: http://lkml.kernel.org/r/20191218162402.45610-22-steven.price@arm.com
Signed-off-by: Steven Price <steven.price@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zong Li <zong.li@sifive.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-04 03:05:25 +00:00
David Hildenbrand
feee6b2989 mm/memory_hotplug: shrink zones when offlining memory
We currently try to shrink a single zone when removing memory.  We use
the zone of the first page of the memory we are removing.  If that
memmap was never initialized (e.g., memory was never onlined), we will
read garbage and can trigger kernel BUGs (due to a stale pointer):

    BUG: unable to handle page fault for address: 000000000000353d
    #PF: supervisor write access in kernel mode
    #PF: error_code(0x0002) - not-present page
    PGD 0 P4D 0
    Oops: 0002 [#1] SMP PTI
    CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
    Workqueue: kacpi_hotplug acpi_hotplug_work_fn
    RIP: 0010:clear_zone_contiguous+0x5/0x10
    Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
    RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
    RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
    RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
    R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
    FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     __remove_pages+0x4b/0x640
     arch_remove_memory+0x63/0x8d
     try_remove_memory+0xdb/0x130
     __remove_memory+0xa/0x11
     acpi_memory_device_remove+0x70/0x100
     acpi_bus_trim+0x55/0x90
     acpi_device_hotplug+0x227/0x3a0
     acpi_hotplug_work_fn+0x1a/0x30
     process_one_work+0x221/0x550
     worker_thread+0x50/0x3b0
     kthread+0x105/0x140
     ret_from_fork+0x3a/0x50
    Modules linked in:
    CR2: 000000000000353d

Instead, shrink the zones when offlining memory or when onlining failed.
Introduce and use remove_pfn_range_from_zone(() for that.  We now
properly shrink the zones, even if we have DIMMs whereby

 - Some memory blocks fall into no zone (never onlined)

 - Some memory blocks fall into multiple zones (offlined+re-onlined)

 - Multiple memory blocks that fall into different zones

Drop the zone parameter (with a potential dubious value) from
__remove_pages() and __remove_section().

Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
Fixes: f1dd2cd13c ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e86b]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: <stable@vger.kernel.org>	[5.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-04 13:55:08 -08:00
Linus Torvalds
6e9f879684 Merge tag 'acpi-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
 "These update the ACPICA code in the kernel to upstream revision
  20191018, add support for EFI specific purpose memory, update the ACPI
  EC driver to make it work on systems with hardware-reduced ACPI,
  improve ACPI-based device enumeration for some platforms, rework the
  lid blacklist handling in the button driver and add more lid quirks to
  it, unify ACPI _HID/_UID matching, fix assorted issues and clean up
  the code and documentation.

  Specifics:

   - Update the ACPICA code in the kernel to upstream revision 20191018
     including:
      * Fixes for Clang warnings (Bob Moore)
      * Fix for possible overflow in get_tick_count() (Bob Moore)
      * Introduction of acpi_unload_table() (Bob Moore)
      * Debugger and utilities updates (Erik Schmauss)
      * Fix for unloading tables loaded via configfs (Nikolaus Voss)

   - Add support for EFI specific purpose memory to optionally allow
     either application-exclusive or core-kernel-mm managed access to
     differentiated memory (Dan Williams)

   - Fix and clean up processing of the HMAT table (Brice Goglin, Qian
     Cai, Tao Xu)

   - Update the ACPI EC driver to make it work on systems with
     hardware-reduced ACPI (Daniel Drake)

   - Always build in support for the Generic Event Device (GED) to allow
     one kernel binary to work both on systems with full hardware ACPI
     and hardware-reduced ACPI (Arjan van de Ven)

   - Fix the table unload mechanism to unregister platform devices
     created when the given table was loaded (Andy Shevchenko)

   - Rework the lid blacklist handling in the button driver and add more
     lid quirks to it (Hans de Goede)

   - Improve ACPI-based device enumeration for some platforms based on
     Intel BayTrail SoCs (Hans de Goede)

   - Add an OpRegion driver for the Cherry Trail Crystal Cove PMIC and
     prevent handlers from being registered for unhandled PMIC OpRegions
     (Hans de Goede)

   - Unify ACPI _HID/_UID matching (Andy Shevchenko)

   - Clean up documentation and comments (Cao jin, James Pack, Kacper
     Piwiński)"

* tag 'acpi-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits)
  ACPI: OSI: Shoot duplicate word
  ACPI: HMAT: use %u instead of %d to print u32 values
  ACPI: NUMA: HMAT: fix a section mismatch
  ACPI: HMAT: don't mix pxm and nid when setting memory target processor_pxm
  ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device
  ACPI: NUMA: HMAT: Register HMAT at device_initcall level
  device-dax: Add a driver for "hmem" devices
  dax: Fix alloc_dax_region() compile warning
  lib: Uplevel the pmem "region" ida to a global allocator
  x86/efi: Add efi_fake_mem support for EFI_MEMORY_SP
  arm/efi: EFI soft reservation to memblock
  x86/efi: EFI soft reservation to E820 enumeration
  efi: Common enable/disable infrastructure for EFI soft reservation
  x86/efi: Push EFI_MEMMAP check into leaf routines
  efi: Enumerate EFI_MEMORY_SP
  ACPI: NUMA: Establish a new drivers/acpi/numa/ directory
  ACPICA: Update version to 20191018
  ACPICA: debugger: remove leading whitespaces when converting a string to a buffer
  ACPICA: acpiexec: initialize all simple types and field units from user input
  ACPICA: debugger: add field unit support for acpi_db_get_next_token
  ...
2019-11-26 19:25:25 -08:00
Dan Williams
16993c0f0a arm/efi: EFI soft reservation to memblock
UEFI 2.8 defines an EFI_MEMORY_SP attribute bit to augment the
interpretation of the EFI Memory Types as "reserved for a specific
purpose".

The proposed Linux behavior for specific purpose memory is that it is
reserved for direct-access (device-dax) by default and not available for
any kernel usage, not even as an OOM fallback.  Later, through udev
scripts or another init mechanism, these device-dax claimed ranges can
be reconfigured and hot-added to the available System-RAM with a unique
node identifier. This device-dax management scheme implements "soft" in
the "soft reserved" designation by allowing some or all of the
reservation to be recovered as typical memory. This policy can be
disabled at compile-time with CONFIG_EFI_SOFT_RESERVE=n, or runtime with
efi=nosoftreserve.

For this patch, update the ARM paths that consider
EFI_CONVENTIONAL_MEMORY to optionally take the EFI_MEMORY_SP attribute
into account as a reservation indicator. Publish the soft reservation as
IORES_DESC_SOFT_RESERVED memory, similar to x86.

(Based on an original patch by Ard)

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-11-07 15:44:19 +01:00
Masahiro Yamada
32d1870877 arm64: mm: simplify the page end calculation in __create_pgd_mapping()
Calculate the page-aligned end address more simply.

The local variable, "length" is unneeded.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-06 11:17:09 +00:00
Mark Rutland
b4ed71f557 mm: treewide: clarify pgtable_page_{ctor,dtor}() naming
The naming of pgtable_page_{ctor,dtor}() seems to have confused a few
people, and until recently arm64 used these erroneously/pointlessly for
other levels of page table.

To make it incredibly clear that these only apply to the PTE level, and to
align with the naming of pgtable_pmd_page_{ctor,dtor}(), let's rename them
to pgtable_pte_page_{ctor,dtor}().

These changes were generated with the following shell script:

----
git grep -lw 'pgtable_page_.tor' | while read FILE; do
    sed -i '{s/pgtable_page_ctor/pgtable_pte_page_ctor/}' $FILE;
    sed -i '{s/pgtable_page_dtor/pgtable_pte_page_dtor/}' $FILE;
done
----

... with the documentation re-flowed to remain under 80 columns, and
whitespace fixed up in macros to keep backslashes aligned.

There should be no functional change as a result of this patch.

Link: http://lkml.kernel.org/r/20190722141133.3116-1-mark.rutland@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-26 10:10:44 -07:00
Will Deacon
ac12cf85d6 Merge branches 'for-next/52-bit-kva', 'for-next/cpu-topology', 'for-next/error-injection', 'for-next/perf', 'for-next/psci-cpuidle', 'for-next/rng', 'for-next/smpboot', 'for-next/tbi' and 'for-next/tlbi' into for-next/core
* for-next/52-bit-kva: (25 commits)
  Support for 52-bit virtual addressing in kernel space

* for-next/cpu-topology: (9 commits)
  Move CPU topology parsing into core code and add support for ACPI 6.3

* for-next/error-injection: (2 commits)
  Support for function error injection via kprobes

* for-next/perf: (8 commits)
  Support for i.MX8 DDR PMU and proper SMMUv3 group validation

* for-next/psci-cpuidle: (7 commits)
  Move PSCI idle code into a new CPUidle driver

* for-next/rng: (4 commits)
  Support for 'rng-seed' property being passed in the devicetree

* for-next/smpboot: (3 commits)
  Reduce fragility of secondary CPU bringup in debug configurations

* for-next/tbi: (10 commits)
  Introduce new syscall ABI with relaxed requirements for pointer tags

* for-next/tlbi: (6 commits)
  Handle spurious page faults arising from kernel space
2019-08-30 12:46:12 +01:00
Mark Rutland
b333b0ba23 arm64: fix fixmap copy for 16K pages and 48-bit VA
With 16K pages and 48-bit VAs, the PGD level of table has two entries,
and so the fixmap shares a PGD with the kernel image. Since commit:

  f9040773b7 ("arm64: move kernel image to base of vmalloc area")

... we copy the existing fixmap to the new fine-grained page tables at
the PUD level in this case. When walking to the new PUD, we forgot to
offset the PGD entry and always used the PGD entry at index 0, but this
worked as the kernel image and fixmap were in the low half of the TTBR1
address space.

As of commit:

  14c127c957 ("arm64: mm: Flip kernel VA space")

... the kernel image and fixmap are in the high half of the TTBR1
address space, and hence use the PGD at index 1, but we didn't update
the fixmap copying code to account for this.

Thus, we'll erroneously try to copy the fixmap slots into a PUD under
the PGD entry at index 0. At the point we do so this PGD entry has not
been initialised, and thus we'll try to write a value to a small offset
from physical address 0, causing a number of potential problems.

Fix this be correctly offsetting the PGD. This is split over a few steps
for legibility.

Fixes: 14c127c957 ("arm64: mm: Flip kernel VA space")
Reported-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Steve Capper <Steve.Capper@arm.com>
Tested-by: Steve Capper <Steve.Capper@arm.com>
Tested-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-28 12:10:03 +01:00
Hsin-Yi Wang
e112b032a7 arm64: map FDT as RW for early_init_dt_scan()
Currently in arm64, FDT is mapped to RO before it's passed to
early_init_dt_scan(). However, there might be some codes
(eg. commit "fdt: add support for rng-seed") that need to modify FDT
during init. Map FDT to RO after early fixups are done.

Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-23 16:39:16 +01:00
Mark Rutland
77ad4ce693 arm64: memory: rename VA_START to PAGE_END
Prior to commit:

  14c127c957 ("arm64: mm: Flip kernel VA space")

... VA_START described the start of the TTBR1 address space for a given
VA size described by VA_BITS, where all kernel mappings began.

Since that commit, VA_START described a portion midway through the
address space, where the linear map ends and other kernel mappings
begin.

To avoid confusion, let's rename VA_START to PAGE_END, making it clear
that it's not the start of the TTBR1 address space and implying that
it's related to PAGE_OFFSET. Comments and other mnemonics are updated
accordingly, along with a typo fix in the decription of VMEMMAP_SIZE.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14 17:06:58 +01:00
Steve Capper
2c624fe687 arm64: mm: Remove vabits_user
Previous patches have enabled 52-bit kernel + user VAs and there is no
longer any scenario where user VA != kernel VA size.

This patch removes the, now redundant, vabits_user variable and replaces
usage with vabits_actual where appropriate.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:27 +01:00
Steve Capper
5383cc6efe arm64: mm: Introduce vabits_actual
In order to support 52-bit kernel addresses detectable at boot time, one
needs to know the actual VA_BITS detected. A new variable vabits_actual
is introduced in this commit and employed for the KVM hypervisor layout,
KASAN, fault handling and phys-to/from-virt translation where there
would normally be compile time constants.

In order to maintain performance in phys_to_virt, another variable
physvirt_offset is introduced.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:21 +01:00
Steve Capper
14c127c957 arm64: mm: Flip kernel VA space
In order to allow for a KASAN shadow that changes size at boot time, one
must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the
start address. Also, it is highly desirable to maintain the same
function addresses in the kernel .text between VA sizes. Both of these
requirements necessitate us to flip the kernel address space halves s.t.
the direct linear map occupies the lower addresses.

This patch puts the direct linear map in the lower addresses of the
kernel VA range and everything else in the higher ranges.

We need to adjust:
 *) KASAN shadow region placement logic,
 *) KASAN_SHADOW_OFFSET computation logic,
 *) virt_to_phys, phys_to_virt checks,
 *) page table dumper.

These are all small changes, that need to take place atomically, so they
are bundled into this commit.

As part of the re-arrangement, a guard region of 2MB (to preserve
alignment for fixed map) is added after the vmemmap. Otherwise the
vmemmap could intersect with IS_ERR pointers.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:16:51 +01:00
David Hildenbrand
80ec922dbd mm/memory_hotplug: allow arch_remove_memory() without CONFIG_MEMORY_HOTREMOVE
We want to improve error handling while adding memory by allowing to use
arch_remove_memory() and __remove_pages() even if
CONFIG_MEMORY_HOTREMOVE is not set to e.g., implement something like:

	arch_add_memory()
	rc = do_something();
	if (rc) {
		arch_remove_memory();
	}

We won't get rid of CONFIG_MEMORY_HOTREMOVE for now, as it will require
quite some dependencies for memory offlining.

Link: http://lkml.kernel.org/r/20190527111152.16324-7-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Mark Brown <broonie@kernel.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "mike.travis@hpe.com" <mike.travis@hpe.com>
Cc: Andrew Banman <andrew.banman@hpe.com>
Cc: Arun KS <arunks@codeaurora.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chintan Pandya <cpandya@codeaurora.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jun Yao <yaojun8558363@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-18 17:08:06 -07:00