Add userspace support for the Memory Tagging Extension introduced by
Armv8.5.
(Catalin Marinas and others)
* for-next/mte: (30 commits)
arm64: mte: Fix typo in memory tagging ABI documentation
arm64: mte: Add Memory Tagging Extension documentation
arm64: mte: Kconfig entry
arm64: mte: Save tags when hibernating
arm64: mte: Enable swap of tagged pages
mm: Add arch hooks for saving/restoring tags
fs: Handle intra-page faults in copy_mount_options()
arm64: mte: ptrace: Add NT_ARM_TAGGED_ADDR_CTRL regset
arm64: mte: ptrace: Add PTRACE_{PEEK,POKE}MTETAGS support
arm64: mte: Allow {set,get}_tagged_addr_ctrl() on non-current tasks
arm64: mte: Restore the GCR_EL1 register after a suspend
arm64: mte: Allow user control of the generated random tags via prctl()
arm64: mte: Allow user control of the tag check mode via prctl()
mm: Allow arm64 mmap(PROT_MTE) on RAM-based files
arm64: mte: Validate the PROT_MTE request via arch_validate_flags()
mm: Introduce arch_validate_flags()
arm64: mte: Add PROT_MTE support to mmap() and mprotect()
mm: Introduce arch_calc_vm_flag_bits()
arm64: mte: Tags-aware aware memcmp_pages() implementation
arm64: Avoid unnecessary clear_user_page() indirection
...
Fix and subsequently rewrite Spectre mitigations, including the addition
of support for PR_SPEC_DISABLE_NOEXEC.
(Will Deacon and Marc Zyngier)
* for-next/ghostbusters: (22 commits)
arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option
arm64: Pull in task_stack_page() to Spectre-v4 mitigation code
KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled
arm64: Get rid of arm64_ssbd_state
KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state()
KVM: arm64: Get rid of kvm_arm_have_ssbd()
KVM: arm64: Simplify handling of ARCH_WORKAROUND_2
arm64: Rewrite Spectre-v4 mitigation code
arm64: Move SSBD prctl() handler alongside other spectre mitigation code
arm64: Rename ARM64_SSBD to ARM64_SPECTRE_V4
arm64: Treat SSBS as a non-strict system feature
arm64: Group start_thread() functions together
KVM: arm64: Set CSV2 for guests on hardware unaffected by Spectre-v2
arm64: Rewrite Spectre-v2 mitigation code
arm64: Introduce separate file for spectre mitigations and reporting
arm64: Rename ARM64_HARDEN_BRANCH_PREDICTOR to ARM64_SPECTRE_V2
KVM: arm64: Simplify install_bp_hardening_cb()
KVM: arm64: Replace CONFIG_KVM_INDIRECT_VECTORS with CONFIG_RANDOMIZE_BASE
arm64: Remove Spectre-related CONFIG_* options
arm64: Run ARCH_WORKAROUND_2 enabling code on all CPUs
...
Remove unused functions and parameters from ACPI IORT code.
(Zenghui Yu via Lorenzo Pieralisi)
* for-next/acpi:
ACPI/IORT: Remove the unused inline functions
ACPI/IORT: Drop the unused @ops of iort_add_device_replay()
Remove redundant code and fix documentation of caching behaviour for the
HVC_SOFT_RESTART hypercall.
(Pingfan Liu)
* for-next/boot:
Documentation/kvm/arm: improve description of HVC_SOFT_RESTART
arm64/relocate_kernel: remove redundant code
Improve reporting of unexpected kernel traps due to BPF JIT failure.
(Will Deacon)
* for-next/bpf:
arm64: Improve diagnostics when trapping BRK with FAULT_BRK_IMM
Improve robustness of user-visible HWCAP strings and their corresponding
numerical constants.
(Anshuman Khandual)
* for-next/cpuinfo:
arm64/cpuinfo: Define HWCAP name arrays per their actual bit definitions
Cleanups to handling of SVE and FPSIMD register state in preparation
for potential future optimisation of handling across syscalls.
(Julien Grall)
* for-next/fpsimd:
arm64/sve: Implement a helper to load SVE registers from FPSIMD state
arm64/sve: Implement a helper to flush SVE registers
arm64/fpsimdmacros: Allow the macro "for" to be used in more cases
arm64/fpsimdmacros: Introduce a macro to update ZCR_EL1.LEN
arm64/signal: Update the comment in preserve_sve_context
arm64/fpsimd: Update documentation of do_sve_acc
Miscellaneous changes.
(Tian Tao and others)
* for-next/misc:
arm64/mm: return cpu_all_mask when node is NUMA_NO_NODE
arm64: mm: Fix missing-prototypes in pageattr.c
arm64/fpsimd: Fix missing-prototypes in fpsimd.c
arm64: hibernate: Remove unused including <linux/version.h>
arm64/mm: Refactor {pgd, pud, pmd, pte}_ERROR()
arm64: Remove the unused include statements
arm64: get rid of TEXT_OFFSET
arm64: traps: Add str of description to panic() in die()
Memory management updates and cleanups.
(Anshuman Khandual and others)
* for-next/mm:
arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD
arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op
arm64/mm: Unify CONT_PMD_SHIFT
arm64/mm: Unify CONT_PTE_SHIFT
arm64/mm: Remove CONT_RANGE_OFFSET
arm64/mm: Enable THP migration
arm64/mm: Change THP helpers to comply with generic MM semantics
arm64/mm/ptdump: Add address markers for BPF regions
Allow prefetchable PCI BARs to be exposed to userspace using normal
non-cacheable mappings.
(Clint Sbisa)
* for-next/pci:
arm64: Enable PCI write-combine resources under sysfs
Perf/PMU driver updates.
(Julien Thierry and others)
* for-next/perf:
perf: arm-cmn: Fix conversion specifiers for node type
perf: arm-cmn: Fix unsigned comparison to less than zero
arm_pmu: arm64: Use NMIs for PMU
arm_pmu: Introduce pmu_irq_ops
KVM: arm64: pmu: Make overflow handler NMI safe
arm64: perf: Defer irq_work to IPI_IRQ_WORK
arm64: perf: Remove PMU locking
arm64: perf: Avoid PMXEV* indirection
arm64: perf: Add missing ISB in armv8pmu_enable_counter()
perf: Add Arm CMN-600 PMU driver
perf: Add Arm CMN-600 DT binding
arm64: perf: Add support caps under sysfs
drivers/perf: thunderx2_pmu: Fix memory resource error handling
drivers/perf: xgene_pmu: Fix uninitialized resource struct
perf: arm_dsu: Support DSU ACPI devices
arm64: perf: Remove unnecessary event_idx check
drivers/perf: hisi: Add missing include of linux/module.h
arm64: perf: Add general hardware LLC events for PMUv3
Support for the Armv8.3 Pointer Authentication enhancements.
(By Amit Daniel Kachhap)
* for-next/ptrauth:
arm64: kprobe: clarify the comment of steppable hint instructions
arm64: kprobe: disable probe of fault prone ptrauth instruction
arm64: cpufeature: Modify address authentication cpufeature to exact
arm64: ptrauth: Introduce Armv8.3 pointer authentication enhancements
arm64: traps: Allow force_signal_inject to pass esr error code
arm64: kprobe: add checks for ARMv8.3-PAuth combined instructions
Tonnes of cleanup to the SDEI driver.
(Gavin Shan)
* for-next/sdei:
firmware: arm_sdei: Remove _sdei_event_unregister()
firmware: arm_sdei: Remove _sdei_event_register()
firmware: arm_sdei: Introduce sdei_do_local_call()
firmware: arm_sdei: Cleanup on cross call function
firmware: arm_sdei: Remove while loop in sdei_event_unregister()
firmware: arm_sdei: Remove while loop in sdei_event_register()
firmware: arm_sdei: Remove redundant error message in sdei_probe()
firmware: arm_sdei: Remove duplicate check in sdei_get_conduit()
firmware: arm_sdei: Unregister driver on error in sdei_init()
firmware: arm_sdei: Avoid nested statements in sdei_init()
firmware: arm_sdei: Retrieve event number from event instance
firmware: arm_sdei: Common block for failing path in sdei_event_create()
firmware: arm_sdei: Remove sdei_is_err()
Selftests for Pointer Authentication and FPSIMD/SVE context-switching.
(Mark Brown and Boyan Karatotev)
* for-next/selftests:
selftests: arm64: Add build and documentation for FP tests
selftests: arm64: Add wrapper scripts for stress tests
selftests: arm64: Add utility to set SVE vector lengths
selftests: arm64: Add stress tests for FPSMID and SVE context switching
selftests: arm64: Add test for the SVE ptrace interface
selftests: arm64: Test case for enumeration of SVE vector lengths
kselftests/arm64: add PAuth tests for single threaded consistency and differently initialized keys
kselftests/arm64: add PAuth test for whether exec() changes keys
kselftests/arm64: add nop checks for PAuth tests
kselftests/arm64: add a basic Pointer Authentication test
Implementation of ARCH_STACKWALK for unwinding.
(Mark Brown)
* for-next/stacktrace:
arm64: Move console stack display code to stacktrace.c
arm64: stacktrace: Convert to ARCH_STACKWALK
arm64: stacktrace: Make stack walk callback consistent with generic code
stacktrace: Remove reliable argument from arch_stack_walk() callback
Support for ASID pinning, which is required when sharing page-tables with
the SMMU.
(Jean-Philippe Brucker)
* for-next/svm:
arm64: cpufeature: Export symbol read_sanitised_ftr_reg()
arm64: mm: Pin down ASIDs for sharing mm with devices
Rely on firmware tables for establishing CPU topology.
(Valentin Schneider)
* for-next/topology:
arm64: topology: Stop using MPIDR for topology information
Spelling fixes.
(Xiaoming Ni and Yanfei Xu)
* for-next/tpyos:
arm64/numa: Fix a typo in comment of arm64_numa_init
arm64: fix some spelling mistakes in the comments by codespell
vDSO cleanups.
(Will Deacon)
* for-next/vdso:
arm64: vdso: Fix unusual formatting in *setup_additional_pages()
arm64: vdso32: Remove a bunch of #ifdef CONFIG_COMPAT_VDSO guards
TCR_EL1.HD is permitted to be cached in a TLB, so invalidate the local
TLB after setting the bit when detected support for the feature. Although
this isn't strictly necessary, since we can happily operate with the bit
effectively clear, the current code uses an ISB in a half-hearted attempt
to make the change effective, so let's just fix that up.
Link: https://lore.kernel.org/r/20201001110405.18617-1-will@kernel.org
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Jonathan reports that the strict policy for memory mapped by the
ACPI core breaks the use case of passing ACPI table overrides via
initramfs. This is due to the fact that the memory type used for
loading the initramfs in memory is not recognized as a memory type
that is typically used by firmware to pass firmware tables.
Since the purpose of the strict policy is to ensure that no AML or
other ACPI code can manipulate any memory that is used by the kernel
to keep its internal state or the state of user tasks, we can relax
the permission check, and allow mappings of memory that is reserved
and marked as NOMAP via memblock, and therefore not covered by the
linear mapping to begin with.
Fixes: 1583052d11 ("arm64/acpi: disallow AML memory opregions to access kernel memory")
Fixes: 325f5585ec ("arm64/acpi: disallow writeable AML opregion mapping for EFI code regions")
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Link: https://lore.kernel.org/r/20200929132522.18067-1-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add hyp percpu section to linker script and rename the corresponding ELF
sections of hyp/nvhe object files. This moves all nVHE-specific percpu
variables to the new hyp percpu section.
Allocate sufficient amount of memory for all percpu hyp regions at global KVM
init time and create corresponding hyp mappings.
The base addresses of hyp percpu regions are kept in a dynamically allocated
array in the kernel.
Add NULL checks in PMU event-reset code as it may run before KVM memory is
initialized.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-10-dbrazdil@google.com
Host CPU context is stored in a global per-cpu variable `kvm_host_data`.
In preparation for introducing independent per-CPU region for nVHE hyp,
create two separate instances of `kvm_host_data`, one for VHE and one
for nVHE.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-9-dbrazdil@google.com
Hyp keeps track of which cores require SSBD callback by accessing a
kernel-proper global variable. Create an nVHE symbol of the same name
and copy the value from kernel proper to nVHE as KVM is being enabled
on a core.
Done in preparation for separating percpu memory owned by kernel
proper and nVHE.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-8-dbrazdil@google.com
Minor cleanup that only creates __kvm_ex_table ELF section and
related symbols if CONFIG_KVM is enabled. Also useful as more
hyp-specific sections will be added.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-4-dbrazdil@google.com
Minor cleanup to move all macros related to prefixing nVHE hyp section
and symbol names into one place: hyp_image.h.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200922204910.7265-3-dbrazdil@google.com
The PR_SPEC_DISABLE_NOEXEC option to the PR_SPEC_STORE_BYPASS prctl()
allows the SSB mitigation to be enabled only until the next execve(),
at which point the state will revert back to PR_SPEC_ENABLE and the
mitigation will be disabled.
Add support for PR_SPEC_DISABLE_NOEXEC on arm64.
Reported-by: Anthony Steinhauser <asteinhauser@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
The kbuild robot reports that we're relying on an implicit inclusion to
get a definition of task_stack_page() in the Spectre-v4 mitigation code,
which is not always in place for some configurations:
| arch/arm64/kernel/proton-pack.c:329:2: error: implicit declaration of function 'task_stack_page' [-Werror,-Wimplicit-function-declaration]
| task_pt_regs(task)->pstate |= val;
| ^
| arch/arm64/include/asm/processor.h:268:36: note: expanded from macro 'task_pt_regs'
| ((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1)
| ^
| arch/arm64/kernel/proton-pack.c:329:2: note: did you mean 'task_spread_page'?
Add the missing include to fix the build error.
Fixes: a44acf477220 ("arm64: Move SSBD prctl() handler alongside other spectre mitigation code")
Reported-by: Anthony Steinhauser <asteinhauser@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/202009260013.Ul7AD29w%lkp@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
Patching the EL2 exception vectors is integral to the Spectre-v2
workaround, where it can be necessary to execute CPU-specific sequences
to nobble the branch predictor before running the hypervisor text proper.
Remove the dependency on CONFIG_RANDOMIZE_BASE and allow the EL2 vectors
to be patched even when KASLR is not enabled.
Fixes: 7a132017e7a5 ("KVM: arm64: Replace CONFIG_KVM_INDIRECT_VECTORS with CONFIG_RANDOMIZE_BASE")
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/202009221053.Jv1XsQUZ%lkp@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
Owing to the fact that the host kernel is always mitigated, we can
drastically simplify the WA2 handling by keeping the mitigation
state ON when entering the guest. This means the guest is either
unaffected or not mitigated.
This results in a nice simplification of the mitigation space,
and the removal of a lot of code that was never really used anyway.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Rewrite the Spectre-v4 mitigation handling code to follow the same
approach as that taken by Spectre-v2.
For now, report to KVM that the system is vulnerable (by forcing
'ssbd_state' to ARM64_SSBD_UNKNOWN), as this will be cleared up in
subsequent steps.
Signed-off-by: Will Deacon <will@kernel.org>
As part of the spectre consolidation effort to shift all of the ghosts
into their own proton pack, move all of the horrible SSBD prctl() code
out of its own 'ssbd.c' file.
Signed-off-by: Will Deacon <will@kernel.org>
In a similar manner to the renaming of ARM64_HARDEN_BRANCH_PREDICTOR
to ARM64_SPECTRE_V2, rename ARM64_SSBD to ARM64_SPECTRE_V4. This isn't
_entirely_ accurate, as we also need to take into account the interaction
with SSBS, but that will be taken care of in subsequent patches.
Signed-off-by: Will Deacon <will@kernel.org>
If all CPUs discovered during boot have SSBS, then spectre-v4 will be
considered to be "mitigated". However, we still allow late CPUs without
SSBS to be onlined, albeit with a "SANITY CHECK" warning. This is
problematic for userspace because it means that the system can quietly
transition to "Vulnerable" at runtime.
Avoid this by treating SSBS as a non-strict system feature: if all of
the CPUs discovered during boot have SSBS, then late arriving secondaries
better have it as well.
Signed-off-by: Will Deacon <will@kernel.org>
The Spectre-v2 mitigation code is pretty unwieldy and hard to maintain.
This is largely due to it being written hastily, without much clue as to
how things would pan out, and also because it ends up mixing policy and
state in such a way that it is very difficult to figure out what's going
on.
Rewrite the Spectre-v2 mitigation so that it clearly separates state from
policy and follows a more structured approach to handling the mitigation.
Signed-off-by: Will Deacon <will@kernel.org>
The spectre mitigation code is spread over a few different files, which
makes it both hard to follow, but also hard to remove it should we want
to do that in future.
Introduce a new file for housing the spectre mitigations, and populate
it with the spectre-v1 reporting code to start with.
Signed-off-by: Will Deacon <will@kernel.org>
For better or worse, the world knows about "Spectre" and not about
"Branch predictor hardening". Rename ARM64_HARDEN_BRANCH_PREDICTOR to
ARM64_SPECTRE_V2 as part of moving all of the Spectre mitigations into
their own little corner.
Signed-off-by: Will Deacon <will@kernel.org>
Use is_hyp_mode_available() to detect whether or not we need to patch
the KVM vectors for branch hardening, which avoids the need to take the
vector pointers as parameters.
Signed-off-by: Will Deacon <will@kernel.org>
The removal of CONFIG_HARDEN_BRANCH_PREDICTOR means that
CONFIG_KVM_INDIRECT_VECTORS is synonymous with CONFIG_RANDOMIZE_BASE,
so replace it.
Signed-off-by: Will Deacon <will@kernel.org>
The spectre mitigations are too configurable for their own good, leading
to confusing logic trying to figure out when we should mitigate and when
we shouldn't. Although the plethora of command-line options need to stick
around for backwards compatibility, the default-on CONFIG options that
depend on EXPERT can be dropped, as the mitigations only do anything if
the system is vulnerable, a mitigation is available and the command-line
hasn't disabled it.
Remove CONFIG_HARDEN_BRANCH_PREDICTOR and CONFIG_ARM64_SSBD in favour of
enabling this code unconditionally.
Signed-off-by: Will Deacon <will@kernel.org>
Commit 606f8e7b27 ("arm64: capabilities: Use linear array for
detection and verification") changed the way we deal with per-CPU errata
by only calling the .matches() callback until one CPU is found to be
affected. At this point, .matches() stop being called, and .cpu_enable()
will be called on all CPUs.
This breaks the ARCH_WORKAROUND_2 handling, as only a single CPU will be
mitigated.
In order to address this, forcefully call the .matches() callback from a
.cpu_enable() callback, which brings us back to the original behaviour.
Fixes: 606f8e7b27 ("arm64: capabilities: Use linear array for detection and verification")
Cc: <stable@vger.kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
The SMMUv3 driver would like to read the MMFR0 PARANGE field in order to
share CPU page tables with devices. Allow the driver to be built as
module by exporting the read_sanitized_ftr_reg() cpufeature symbol.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20200918101852.582559-7-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
When handling events, armv8pmu_handle_irq() calls perf_event_overflow(),
and subsequently calls irq_work_run() to handle any work queued by
perf_event_overflow(). As perf_event_overflow() raises IPI_IRQ_WORK when
queuing the work, this isn't strictly necessary and the work could be
handled as part of the IPI_IRQ_WORK handler.
In the common case the IPI handler will run immediately after the PMU IRQ
handler, and where the PE is heavily loaded with interrupts other handlers
may run first, widening the window where some counters are disabled.
In practice this window is unlikely to be a significant issue, and removing
the call to irq_work_run() would make the PMU IRQ handler NMI safe in
addition to making it simpler, so let's do that.
[Alexandru E.: Reworded commit message]
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-5-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
The PMU is disabled and enabled, and the counters are programmed from
contexts where interrupts or preemption is disabled.
The functions to toggle the PMU and to program the PMU counters access the
registers directly and don't access data modified by the interrupt handler.
That, and the fact that they're always called from non-preemptible
contexts, means that we don't need to disable interrupts or use a spinlock.
[Alexandru E.: Explained why locking is not needed, removed WARN_ONs]
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-4-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Currently we access the counter registers and their respective type
registers indirectly. This requires us to write to PMSELR, issue an ISB,
then access the relevant PMXEV* registers.
This is unfortunate, because:
* Under virtualization, accessing one register requires two traps to
the hypervisor, even though we could access the register directly with
a single trap.
* We have to issue an ISB which we could otherwise avoid the cost of.
* When we use NMIs, the NMI handler will have to save/restore the select
register in case the code it preempted was attempting to access a
counter or its type register.
We can avoid these issues by directly accessing the relevant registers.
This patch adds helpers to do so.
In armv8pmu_enable_event() we still need the ISB to prevent the PE from
reordering the write to PMINTENSET_EL1 register. If the interrupt is
enabled before we disable the counter and the new event is configured,
we might get an interrupt triggered by the previously programmed event
overflowing, but which we wrongly attribute to the event that we are
enabling. Execute an ISB after we disable the counter.
In the process, remove the comment that refers to the ARMv7 PMU.
[Julien T.: Don't inline read/write functions to avoid big code-size
increase, remove unused read_pmevtypern function,
fix counter index issue.]
[Alexandru E.: Removed comment, removed trailing semicolons in macros,
added ISB]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-3-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Writes to the PMXEVTYPER_EL0 register are not self-synchronising. In
armv8pmu_enable_event(), the PE can reorder configuring the event type
after we have enabled the counter and the interrupt. This can lead to an
interrupt being asserted because of the previous event type that we were
counting using the same counter, not the one that we've just configured.
The same rationale applies to writes to the PMINTENSET_EL1 register. The PE
can reorder enabling the interrupt at any point in the future after we have
enabled the event.
Prevent both situations from happening by adding an ISB just before we
enable the event counter.
Fixes: 030896885a ("arm64: Performance counters support")
Reported-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-2-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
ARMv8.4-PMU introduces the PMMIR_EL1 registers and some new PMU events,
like STALL_SLOT etc, are related to it. Let's add a caps directory to
/sys/bus/event_source/devices/armv8_pmuv3_0/ and support slots from
PMMIR_EL1 registers in this entry. The user programs can get the slots
from sysfs directly.
/sys/bus/event_source/devices/armv8_pmuv3_0/caps/slots is exposed
under sysfs. Both ARMv8.4-PMU and STALL_SLOT event are implemented,
it returns the slots from PMMIR_EL1, otherwise it will return 0.
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1600754025-53535-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
The minimal compiler versions, GCC 4.9 and Clang 10 support this flag.
Here is the godbolt:
https://godbolt.org/z/xvjcMa
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Will Deacon <will@kernel.org>
The minimal compiler versions, GCC 4.9 and Clang 10 support this flag.
Here is the godbolt:
https://godbolt.org/z/odq8h9
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Will Deacon <will@kernel.org>
There was a request to preprocess the module linker script like we
do for the vmlinux one. (https://lkml.org/lkml/2020/8/21/512)
The difference between vmlinux.lds and module.lds is that the latter
is needed for external module builds, thus must be cleaned up by
'make mrproper' instead of 'make clean'. Also, it must be created
by 'make modules_prepare'.
You cannot put it in arch/$(SRCARCH)/kernel/, which is cleaned up by
'make clean'. I moved arch/$(SRCARCH)/kernel/module.lds to
arch/$(SRCARCH)/include/asm/module.lds.h, which is included from
scripts/module.lds.S.
scripts/module.lds is fine because 'make clean' keeps all the
build artifacts under scripts/.
You can add arch-specific sections in <asm/module.lds.h>.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
Currently the code for displaying a stack trace on the console is located
in traps.c rather than stacktrace.c, using the unwinding code that is in
stacktrace.c. This can be confusing and make the code hard to find since
such output is often referred to as a stack trace which might mislead the
unwary. Due to this and since traps.c doesn't interact with this code
except for via the public interfaces move the code to stacktrace.c to
make it easier to find.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20200921122341.11280-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Commit 73f3816609 ("arm64: Advertise mitigation of Spectre-v2, or lack
thereof") changed the way we deal with ARCH_WORKAROUND_1, by moving most
of the enabling code to the .matches() callback.
This has the unfortunate effect that the workaround gets only enabled on
the first affected CPU, and no other.
In order to address this, forcefully call the .matches() callback from a
.cpu_enable() callback, which brings us back to the original behaviour.
Fixes: 73f3816609 ("arm64: Advertise mitigation of Spectre-v2, or lack thereof")
Cc: <stable@vger.kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
We seem to be pretending that we don't have any firmware mitigation
when KVM is not compiled in, which is not quite expected.
Bring back the mitigation in this case.
Fixes: 4db61fef16 ("arm64: kvm: Modernize __smccc_workaround_1_smc_start annotations")
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
In a follow-up patch, we may save the FPSIMD rather than the full SVE
state when the state has to be zeroed on return to userspace (e.g
during a syscall).
Introduce an helper to load SVE vectors from FPSIMD state and zero the rest
of SVE registers.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20200828181155.17745-7-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Introduce a new helper that will zero all SVE registers but the first
128-bits of each vector. This will be used by subsequent patches to
avoid costly store/maipulate/reload sequences in places like do_sve_acc().
Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20200828181155.17745-6-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
The SVE state is saved by fpsimd_signal_preserve_current_state() and not
preserve_fpsimd_context(). Update the comment in preserve_sve_context to
reflect the current behavior.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20200828181155.17745-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
fpsimd_restore_current_state() enables and disables the SVE access trap
based on TIF_SVE, not task_fpsimd_load(). Update the documentation of
do_sve_acc to reflect this behavior.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20200828181155.17745-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
arch_scale_freq_invariant() is used by schedutil to determine whether
the scheduler's load-tracking signals are frequency invariant. Its
definition is overridable, though by default it is hardcoded to 'true'
if arch_scale_freq_capacity() is defined ('false' otherwise).
This behaviour is not overridden on arm, arm64 and other users of the
generic arch topology driver, which is somewhat precarious:
arch_scale_freq_capacity() will always be defined, yet not all cpufreq
drivers are guaranteed to drive the frequency invariance scale factor
setting. In other words, the load-tracking signals may very well *not*
be frequency invariant.
Now that cpufreq can be queried on whether the current driver is driving
the Frequency Invariance (FI) scale setting, the current situation can
be improved. This combines the query of whether cpufreq supports the
setting of the frequency scale factor, with whether all online CPUs are
counter-based FI enabled.
While cpufreq FI enablement applies at system level, for all CPUs,
counter-based FI support could also be used for only a subset of CPUs to
set the invariance scale factor. Therefore, if cpufreq-based FI support
is present, we consider the system to be invariant. If missing, we
require all online CPUs to be counter-based FI enabled in order for the
full system to be considered invariant.
If the system ends up not being invariant, a new condition is needed in
the counter initialization code that disables all scale factor setting
based on counters.
Precedence of counters over cpufreq use is not important here. The
invariant status is only given to the system if all CPUs have at least
one method of setting the frequency scale factor.
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The passed cpumask arguments to arch_set_freq_scale() and
arch_freq_counters_available() are only iterated over, so reflect this
in the prototype. This also allows to pass system cpumasks like
cpu_online_mask without getting a warning.
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If CONFIG_HOTPLUG_CPU is n, gcc warns:
arch/arm64/kernel/smp.c:967:13: warning: ‘ipi_teardown’ defined but not used [-Wunused-function]
static void ipi_teardown(int cpu)
^~~~~~~~~~~~
Use #ifdef guard this.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200918123318.23764-1-yuehaibing@huawei.com
When generating instructions at runtime, for example due to kernel text
patching or the BPF JIT, we can emit a trapping BRK instruction if we
are asked to encode an invalid instruction such as an out-of-range]
branch. This is indicative of a bug in the caller, and will result in a
crash on executing the generated code. Unfortunately, the message from
the crash is really unhelpful, and mumbles something about ptrace:
| Unexpected kernel BRK exception at EL1
| Internal error: ptrace BRK handler: f2000100 [#1] SMP
We can do better than this. Install a break handler for FAULT_BRK_IMM,
which is the immediate used to encode the "I've been asked to generate
an invalid instruction" error, and triage the faulting PC to determine
whether or not the failure occurred in the BPF JIT.
Link: https://lore.kernel.org/r/20200915141707.GB26439@willie-the-truck
Reported-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Fix the following warnings.
arch/arm64/kernel/fpsimd.c:935:6: warning: no previous prototype for
‘do_sve_acc’ [-Wmissing-prototypes]
arch/arm64/kernel/fpsimd.c:962:6: warning: no previous prototype for
‘do_fpsimd_acc’ [-Wmissing-prototypes]
arch/arm64/kernel/fpsimd.c:971:6: warning: no previous prototype for
‘do_fpsimd_exc’ [-Wmissing-prototypes]
arch/arm64/kernel/fpsimd.c:1266:6: warning: no previous prototype for
‘kernel_neon_begin’ [-Wmissing-prototypes]
arch/arm64/kernel/fpsimd.c:1292:6: warning: no previous prototype for
‘kernel_neon_end’ [-Wmissing-prototypes]
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/1600157999-14802-1-git-send-email-tiantao6@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
Historically architectures have had duplicated code in their stack trace
implementations for filtering what gets traced. In order to avoid this
duplication some generic code has been provided using a new interface
arch_stack_walk(), enabled by selecting ARCH_STACKWALK in Kconfig, which
factors all this out into the generic stack trace code. Convert arm64
to use this common infrastructure.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/20200914153409.25097-4-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
As with the generic arch_stack_walk() code the arm64 stack walk code takes
a callback that is called per stack frame. Currently the arm64 code always
passes a struct stackframe to the callback and the generic code just passes
the pc, however none of the users ever reference anything in the struct
other than the pc value. The arm64 code also uses a return type of int
while the generic code uses a return type of bool though in both cases the
return value is a boolean value and the sense is inverted between the two.
In order to reduce code duplication when arm64 is converted to use
arch_stack_walk() change the signature and return sense of the arm64
specific callback to match that of the generic code.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/20200914153409.25097-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Steal time initialization requires mapping a memory region which
invokes a memory allocation. Doing this at CPU starting time results
in the following trace when CONFIG_DEBUG_ATOMIC_SLEEP is enabled:
BUG: sleeping function called from invalid context at mm/slab.h:498
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0-rc5+ #1
Call trace:
dump_backtrace+0x0/0x208
show_stack+0x1c/0x28
dump_stack+0xc4/0x11c
___might_sleep+0xf8/0x130
__might_sleep+0x58/0x90
slab_pre_alloc_hook.constprop.101+0xd0/0x118
kmem_cache_alloc_node_trace+0x84/0x270
__get_vm_area_node+0x88/0x210
get_vm_area_caller+0x38/0x40
__ioremap_caller+0x70/0xf8
ioremap_cache+0x78/0xb0
memremap+0x9c/0x1a8
init_stolen_time_cpu+0x54/0xf0
cpuhp_invoke_callback+0xa8/0x720
notify_cpu_starting+0xc8/0xd8
secondary_start_kernel+0x114/0x180
CPU1: Booted secondary processor 0x0000000001 [0x431f0a11]
However we don't need to initialize steal time at CPU starting time.
We can simply wait until CPU online time, just sacrificing a bit of
accuracy by returning zero for steal time until we know better.
While at it, add __init to the functions that are only called by
pv_time_init() which is __init.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Fixes: e0685fa228 ("arm64: Retrieve stolen time as paravirtualized guest")
Cc: stable@vger.kernel.org
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20200916154530.40809-1-drjones@redhat.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Let's switch the arm64 code to the core accounting, which already
does everything we need.
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
The old IPI registration interface is now unused on arm64, so let's
get rid of it.
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
The function __{pgd, pud, pmd, pte}_error() are introduced so that
they can be called by {pgd, pud, pmd, pte}_ERROR(). However, some
of the functions could never be called when the corresponding page
table level isn't enabled. For example, __{pud, pmd}_error() are
unused when PUD and PMD are folded to PGD.
This removes __{pgd, pud, pmd, pte}_error() and call pr_err() from
{pgd, pud, pmd, pte}_ERROR() directly, similar to what x86/powerpc
are doing. With this, the code looks a bit simplified either.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20200913234730.23145-1-gshan@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
The existing comment about steppable hint instruction is not complete
and only describes NOP instructions as steppable. As the function
aarch64_insn_is_steppable_hint allows all white-listed instruction
to be probed so the comment is updated to reflect this.
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Dave Martin <dave.martin@arm.com>
Link: https://lore.kernel.org/r/20200914083656.21428-7-amit.kachhap@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
With the addition of ARMv8.3-FPAC feature, the probe of authenticate
ptrauth instructions (AUT*) may cause ptrauth fault exception in case of
authenticate failure so they cannot be safely single stepped.
Hence the probe of authenticate instructions is disallowed but the
corresponding pac ptrauth instruction (PAC*) is not affected and they can
still be probed. Also AUTH* instructions do not make sense at function
entry points so most realistic probes would be unaffected by this change.
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Dave Martin <dave.martin@arm.com>
Link: https://lore.kernel.org/r/20200914083656.21428-6-amit.kachhap@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
The current address authentication cpufeature levels are set as LOWER_SAFE
which is not compatible with the different configurations added for Armv8.3
ptrauth enhancements as the different levels have different behaviour and
there is no tunable to enable the lower safe versions. This is rectified
by setting those cpufeature type as EXACT.
The current cpufeature framework also does not interfere in the booting of
non-exact secondary cpus but rather marks them as tainted. As a workaround
this is fixed by replacing the generic match handler with a new handler
specific to ptrauth.
After this change, if there is any variation in ptrauth configurations in
secondary cpus from boot cpu then those mismatched cpus are parked in an
infinite loop.
Following ptrauth crash log is observed in Arm fastmodel with simulated
mismatched cpus without this fix,
CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64ISAR1_EL1. Boot CPU: 0x11111110211402, CPU4: 0x11111110211102
CPU features: Unsupported CPU feature variation detected.
GICv3: CPU4: found redistributor 100 region 0:0x000000002f180000
CPU4: Booted secondary processor 0x0000000100 [0x410fd0f0]
Unable to handle kernel paging request at virtual address bfff800010dadf3c
Mem abort info:
ESR = 0x86000004
EC = 0x21: IABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
[bfff800010dadf3c] address between user and kernel address ranges
Internal error: Oops: 86000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 4 PID: 29 Comm: migration/4 Tainted: G S 5.8.0-rc4-00005-ge658591d66d1-dirty #158
Hardware name: Foundation-v8A (DT)
pstate: 60000089 (nZCv daIf -PAN -UAO BTYPE=--)
pc : 0xbfff800010dadf3c
lr : __schedule+0x2b4/0x5a8
sp : ffff800012043d70
x29: ffff800012043d70 x28: 0080000000000000
x27: ffff800011cbe000 x26: ffff00087ad37580
x25: ffff00087ad37000 x24: ffff800010de7d50
x23: ffff800011674018 x22: 0784800010dae2a8
x21: ffff00087ad37000 x20: ffff00087acb8000
x19: ffff00087f742100 x18: 0000000000000030
x17: 0000000000000000 x16: 0000000000000000
x15: ffff800011ac1000 x14: 00000000000001bd
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 71519a147ddfeb82
x9 : 825d5ec0fb246314 x8 : ffff00087ad37dd8
x7 : 0000000000000000 x6 : 00000000fffedb0e
x5 : 00000000ffffffff x4 : 0000000000000000
x3 : 0000000000000028 x2 : ffff80086e11e000
x1 : ffff00087ad37000 x0 : ffff00087acdc600
Call trace:
0xbfff800010dadf3c
schedule+0x78/0x110
schedule_preempt_disabled+0x24/0x40
__kthread_parkme+0x68/0xd0
kthread+0x138/0x160
ret_from_fork+0x10/0x34
Code: bad PC value
After this fix, the mismatched CPU4 is parked as,
CPU features: CPU4: Detected conflict for capability 39 (Address authentication (IMP DEF algorithm)), System: 1, CPU: 0
CPU4: will not boot
CPU4: failed to come online
CPU4: died during early boot
[Suzuki: Introduce new matching function for address authentication]
Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200914083656.21428-5-amit.kachhap@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Some Armv8.3 Pointer Authentication enhancements have been introduced
which are mandatory for Armv8.6 and optional for Armv8.3. These features
are,
* ARMv8.3-PAuth2 - An enhanced PAC generation logic is added which hardens
finding the correct PAC value of the authenticated pointer.
* ARMv8.3-FPAC - Fault is generated now when the ptrauth authentication
instruction fails in authenticating the PAC present in the address.
This is different from earlier case when such failures just adds an
error code in the top byte and waits for subsequent load/store to abort.
The ptrauth instructions which may cause this fault are autiasp, retaa
etc.
The above features are now represented by additional configurations
for the Address Authentication cpufeature and a new ESR exception class.
The userspace fault received in the kernel due to ARMv8.3-FPAC is treated
as Illegal instruction and hence signal SIGILL is injected with ILL_ILLOPN
as the signal code. Note that this is different from earlier ARMv8.3
ptrauth where signal SIGSEGV is issued due to Pointer authentication
failures. The in-kernel PAC fault causes kernel to crash.
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20200914083656.21428-4-amit.kachhap@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Some error signal need to pass proper ARM esr error code to userspace to
better identify the cause of the signal. So the function
force_signal_inject is extended to pass this as a parameter. The
existing code is not affected by this change.
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20200914083656.21428-3-amit.kachhap@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Currently the ARMv8.3-PAuth combined branch instructions (braa, retaa
etc.) are not simulated for out-of-line execution with a handler. Hence the
uprobe of such instructions leads to kernel warnings in a loop as they are
not explicitly checked and fall into INSN_GOOD categories. Other combined
instructions like LDRAA and LDRBB can be probed.
The issue of the combined branch instructions is fixed by adding
group definitions of all such instructions and rejecting their probes.
The instruction groups added are br_auth(braa, brab, braaz and brabz),
blr_auth(blraa, blrab, blraaz and blrabz), ret_auth(retaa and retab) and
eret_auth(eretaa and eretab).
Warning log:
WARNING: CPU: 0 PID: 156 at arch/arm64/kernel/probes/uprobes.c:182 uprobe_single_step_handler+0x34/0x50
Modules linked in:
CPU: 0 PID: 156 Comm: func Not tainted 5.9.0-rc3 #188
Hardware name: Foundation-v8A (DT)
pstate: 804003c9 (Nzcv DAIF +PAN -UAO BTYPE=--)
pc : uprobe_single_step_handler+0x34/0x50
lr : single_step_handler+0x70/0xf8
sp : ffff800012af3e30
x29: ffff800012af3e30 x28: ffff000878723b00
x27: 0000000000000000 x26: 0000000000000000
x25: 0000000000000000 x24: 0000000000000000
x23: 0000000060001000 x22: 00000000cb000022
x21: ffff800012065ce8 x20: ffff800012af3ec0
x19: ffff800012068d50 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000
x9 : ffff800010085c90 x8 : 0000000000000000
x7 : 0000000000000000 x6 : ffff80001205a9c8
x5 : ffff80001205a000 x4 : ffff80001233db80
x3 : ffff8000100a7a60 x2 : 0020000000000003
x1 : 0000fffffffff008 x0 : ffff800012af3ec0
Call trace:
uprobe_single_step_handler+0x34/0x50
single_step_handler+0x70/0xf8
do_debug_exception+0xb8/0x130
el0_sync_handler+0x138/0x1b8
el0_sync+0x158/0x180
Fixes: 74afda4016 ("arm64: compile the kernel with ptrauth return address signing")
Fixes: 04ca3204fa ("arm64: enable pointer authentication")
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20200914083656.21428-2-amit.kachhap@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
The GIC's internal view of the priority mask register and the assigned
interrupt priorities are based on whether GIC security is enabled and
whether firmware routes Group 0 interrupts to EL3. At the moment, we
support priority masking when ICC_PMR_EL1 and interrupt priorities are
either both modified by the GIC, or both left unchanged.
Trusted Firmware-A's default interrupt routing model allows Group 0
interrupts to be delivered to the non-secure world (SCR_EL3.FIQ == 0).
Unfortunately, this is precisely the case that the GIC driver doesn't
support: ICC_PMR_EL1 remains unchanged, but the GIC's view of interrupt
priorities is different from the software programmed values.
Support pseudo-NMIs when SCR_EL3.FIQ == 0 by using a different value to
mask regular interrupts. All the other values remain the same.
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200912153707.667731-3-alexandru.elisei@arm.com
In order to deal with IPIs as normal interrupts, let's add
a new way to register them with the architecture code.
set_smp_ipi_range() takes a range of interrupts, and allows
the arch code to request them as if the were normal interrupts.
A standard handler is then called by the core IRQ code to deal
with the IPI.
This means that we don't need to call irq_enter/irq_exit, and
that we don't need to deal with set_irq_regs either. So let's
move the dispatcher into its own function, and leave handle_IPI()
as a compatibility function.
On the sending side, let's make use of ipi_send_mask, which
already exists for this purpose.
One of the major difference is that we end up, in some cases
(such as when performing IRQ time accounting on the scheduler
IPI), end up with nested irq_enter()/irq_exit() pairs.
Other than the (relatively small) overhead, there should be
no consequences to it (these pairs are designed to nest
correctly, and the accounting shouldn't be off).
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Now that we allow CPUs affected by erratum 1418040 to come in late,
this prevents their unaffected sibblings from coming in late (or
coming back after a suspend or hotplug-off, which amounts to the
same thing).
To allow this, we need to add ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU,
which amounts to set .type to ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE.
Fixes: bf87bb0881 ("arm64: Allow booting of late CPUs affected by erratum 1418040")
Reported-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Tested-by: Matthias Kaehlcke <mka@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200911181611.2073183-1-maz@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Kernel startup entry point requires disabling MMU and D-cache.
As for kexec-reboot, taking a close look at "msr sctlr_el1, x12" in
__cpu_soft_restart as the following:
-1. booted at EL1
The instruction is enough to disable MMU and I/D cache for
EL1 regime.
-2. booted at EL2, using VHE
Access to SCTLR_EL1 is redirected to SCTLR_EL2 in EL2. So the instruction
is enough to disable MMU and clear I+C bits for EL2 regime.
-3. booted at EL2, not using VHE
The instruction itself can not affect EL2 regime. But The hyp-stub doesn't
enable the MMU and I/D cache for EL2 regime. And KVM also disable them for EL2
regime when its unloaded, or execute a HVC_SOFT_RESTART call. So when
kexec-reboot, the code in KVM has prepare the requirement.
As a conclusion, disabling MMU and clearing I+C bits in
SYM_CODE_START(arm64_relocate_new_kernel) is redundant, and can be removed
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: James Morse <james.morse@arm.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Remi Denis-Courmont <remi.denis.courmont@huawei.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: kvmarm@lists.cs.columbia.edu
Link: https://lore.kernel.org/r/1598621998-20563-1-git-send-email-kernelfans@gmail.com
To: linux-arm-kernel@lists.infradead.org
Signed-off-by: Will Deacon <will@kernel.org>
HWCAP name arrays (hwcap_str, compat_hwcap_str, compat_hwcap2_str) that are
scanned for /proc/cpuinfo are detached from their bit definitions making it
vulnerable and difficult to correlate. It is also bit problematic because
during /proc/cpuinfo dump these arrays get traversed sequentially assuming
they reflect and match actual HWCAP bit sequence, to test various features
for a given CPU. This redefines name arrays per their HWCAP bit definitions
. It also warns after detecting any feature which is not expected on arm64.
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1599630535-29337-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
In the absence of ACPI or DT topology data, we fallback to haphazardly
decoding *something* out of MPIDR. Sadly, the contents of that register are
mostly unusable due to the implementation leniancy and things like Aff0
having to be capped to 15 (despite being encoded on 8 bits).
Consider a simple system with a single package of 32 cores, all under the
same LLC. We ought to be shoving them in the same core_sibling mask, but
MPIDR is going to look like:
| CPU | 0 | ... | 15 | 16 | ... | 31 |
|------+---+-----+----+----+-----+----+
| Aff0 | 0 | ... | 15 | 0 | ... | 15 |
| Aff1 | 0 | ... | 0 | 1 | ... | 1 |
| Aff2 | 0 | ... | 0 | 0 | ... | 0 |
Which will eventually yield
core_sibling(0-15) == 0-15
core_sibling(16-31) == 16-31
NUMA woes
=========
If we try to play games with this and set up NUMA boundaries within those
groups of 16 cores via e.g. QEMU:
# Node0: 0-9; Node1: 10-19
$ qemu-system-aarch64 <blah> \
-smp 20 -numa node,cpus=0-9,nodeid=0 -numa node,cpus=10-19,nodeid=1
The scheduler's MC domain (all CPUs with same LLC) is going to be built via
arch_topology.c::cpu_coregroup_mask()
In there we try to figure out a sensible mask out of the topology
information we have. In short, here we'll pick the smallest of NUMA or
core sibling mask.
node_mask(CPU9) == 0-9
core_sibling(CPU9) == 0-15
MC mask for CPU9 will thus be 0-9, not a problem.
node_mask(CPU10) == 10-19
core_sibling(CPU10) == 0-15
MC mask for CPU10 will thus be 10-19, not a problem.
node_mask(CPU16) == 10-19
core_sibling(CPU16) == 16-19
MC mask for CPU16 will thus be 16-19... Uh oh. CPUs 16-19 are in two
different unique MC spans, and the scheduler has no idea what to make of
that. That triggers the WARN_ON() added by commit
ccf74128d6 ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap")
Fixing MPIDR-derived topology
=============================
We could try to come up with some cleverer scheme to figure out which of
the available masks to pick, but really if one of those masks resulted from
MPIDR then it should be discarded because it's bound to be bogus.
I was hoping to give MPIDR a chance for SMT, to figure out which threads are
in the same core using Aff1-3 as core ID, but Sudeep and Robin pointed out
to me that there are systems out there where *all* cores have non-zero
values in their higher affinity fields (e.g. RK3288 has "5" in all of its
cores' MPIDR.Aff1), which would expose a bogus core ID to userspace.
Stop using MPIDR for topology information. When no other source of topology
information is available, mark each CPU as its own core and its NUMA node
as its LLC domain.
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20200829130016.26106-1-valentin.schneider@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
event_idx is obtained from armv8pmu_get_event_idx(), and this idx must be
between ARMV8_IDX_CYCLE_COUNTER and cpu_pmu->num_events. So it's unnecessary
to do this check. Let's remove it.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1599213458-28394-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
TEXT_OFFSET serves no purpose, and for this reason, it was redefined
as 0x0 in the v5.8 timeframe. Since this does not appear to have caused
any issues that require us to revisit that decision, let's get rid of the
macro entirely, along with any references to it.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20200825135440.11288-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
The vdso linker script is preprocessed on demand.
Adding it to 'targets' is enough to include the .cmd file.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Greentime Hu <green.hu@gmail.com>
This patch is to add the general hardware last level cache (LLC) events
for PMUv3: one event is for LLC access and another is for LLC miss.
With this change, perf tool can support last level cache profiling,
below is an example to demonstrate the usage on Arm64:
$ perf stat -e LLC-load-misses -e LLC-loads -- \
perf bench mem memcpy -s 1024MB -l 100 -f default
[...]
Performance counter stats for 'perf bench mem memcpy -s 1024MB -l 100 -f default':
35,534,262 LLC-load-misses # 2.16% of all LL-cache hits
1,643,946,443 LLC-loads
[...]
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20200811053505.21223-1-leo.yan@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Currently, there are different description strings in die() such as
die("Oops",,), die("Oops - BUG",,). And panic() called by die() will
always show "Fatal exception" or "Fatal exception in interrupt".
Note that panic() will run any panic handler via panic_notifier_list.
And the string above will be formatted and placed in static buf[]
which will be passed to handler.
So panic handler can not distinguish which Oops it is from the buf if
we want to do some things like reserve the string in memory or panic
statistics. It's not benefit to debug. We need to add more codes to
troubleshoot. Let's utilize existing resource to make debug much simpler.
Signed-off-by: Yue Hu <huyue2@yulong.com>
Link: https://lore.kernel.org/r/20200804085347.10720-1-zbestahu@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
When hibernating the contents of all pages in the system are written to
disk, however the MTE tags are not visible to the generic hibernation
code. So just before the hibernation image is created copy the tags out
of the physical tag storage into standard memory so they will be
included in the hibernation image. After hibernation apply the tags back
into the physical tag storage.
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
When swapping pages out to disk it is necessary to save any tags that
have been set, and restore when swapping back in. Make use of the new
page flag (PG_ARCH_2, locally named PG_mte_tagged) to identify pages
with tags. When swapping out these pages the tags are stored in memory
and later restored when the pages are brought back in. Because shmem can
swap pages back in without restoring the userspace PTE it is also
necessary to add a hook for shmem.
Signed-off-by: Steven Price <steven.price@arm.com>
[catalin.marinas@arm.com: move function prototypes to mte.h]
[catalin.marinas@arm.com: drop '_tags' from arch_swap_restore_tags()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Will Deacon <will@kernel.org>
This regset allows read/write access to a ptraced process
prctl(PR_SET_TAGGED_ADDR_CTRL) setting.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Alan Hayward <Alan.Hayward@arm.com>
Cc: Luis Machado <luis.machado@linaro.org>
Cc: Omair Javaid <omair.javaid@linaro.org>
Add support for bulk setting/getting of the MTE tags in a tracee's
address space at 'addr' in the ptrace() syscall prototype. 'data' points
to a struct iovec in the tracer's address space with iov_base
representing the address of a tracer's buffer of length iov_len. The
tags to be copied to/from the tracer's buffer are stored as one tag per
byte.
On successfully copying at least one tag, ptrace() returns 0 and updates
the tracer's iov_len with the number of tags copied. In case of error,
either -EIO or -EFAULT is returned, trying to follow the ptrace() man
page.
Note that the tag copying functions are not performance critical,
therefore they lack optimisations found in typical memory copy routines.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Alan Hayward <Alan.Hayward@arm.com>
Cc: Luis Machado <luis.machado@linaro.org>
Cc: Omair Javaid <omair.javaid@linaro.org>
In preparation for ptrace() access to the prctl() value, allow calling
these functions on non-current tasks.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
The CPU resume/suspend routines only take care of the common system
registers. Restore GCR_EL1 in addition via the __cpu_suspend_exit()
function.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
The IRG, ADDG and SUBG instructions insert a random tag in the resulting
address. Certain tags can be excluded via the GCR_EL1.Exclude bitmap
when, for example, the user wants a certain colour for freed buffers.
Since the GCR_EL1 register is not accessible at EL0, extend the
prctl(PR_SET_TAGGED_ADDR_CTRL) interface to include a 16-bit field in
the first argument for controlling which tags can be generated by the
above instruction (an include rather than exclude mask). Note that by
default all non-zero tags are excluded. This setting is per-thread.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
By default, even if PROT_MTE is set on a memory range, there is no tag
check fault reporting (SIGSEGV). Introduce a set of option to the
exiting prctl(PR_SET_TAGGED_ADDR_CTRL) to allow user control of the tag
check fault mode:
PR_MTE_TCF_NONE - no reporting (default)
PR_MTE_TCF_SYNC - synchronous tag check fault reporting
PR_MTE_TCF_ASYNC - asynchronous tag check fault reporting
These options translate into the corresponding SCTLR_EL1.TCF0 bitfield,
context-switched by the kernel. Note that the kernel accesses to the
user address space (e.g. read() system call) are not checked if the user
thread tag checking mode is PR_MTE_TCF_NONE or PR_MTE_TCF_ASYNC. If the
tag checking mode is PR_MTE_TCF_SYNC, the kernel makes a best effort to
check its user address accesses, however it cannot always guarantee it.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
When the Memory Tagging Extension is enabled, two pages are identical
only if both their data and tags are identical.
Make the generic memcmp_pages() a __weak function and add an
arm64-specific implementation which returns non-zero if any of the two
pages contain valid MTE tags (PG_mte_tagged set). There isn't much
benefit in comparing the tags of two pages since these are normally used
for heap allocations and likely to differ anyway.
Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Pages allocated by the kernel are not guaranteed to have the tags
zeroed, especially as the kernel does not (yet) use MTE itself. To
ensure the user can still access such pages when mapped into its address
space, clear the tags via set_pte_at(). A new page flag - PG_mte_tagged
(PG_arch_2) - is used to track pages with valid allocation tags.
Since the zero page is mapped as pte_special(), it won't be covered by
the above set_pte_at() mechanism. Clear its tags during early MTE
initialisation.
Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
The Memory Tagging Extension has two modes of notifying a tag check
fault at EL0, configurable through the SCTLR_EL1.TCF0 field:
1. Synchronous raising of a Data Abort exception with DFSC 17.
2. Asynchronous setting of a cumulative bit in TFSRE0_EL1.
Add the exception handler for the synchronous exception and handling of
the asynchronous TFSRE0_EL1.TF0 bit setting via a new TIF flag in
do_notify_resume().
On a tag check failure in user-space, whether synchronous or
asynchronous, a SIGSEGV will be raised on the faulting thread.
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Add the cpufeature and hwcap entries to detect the presence of MTE. Any
secondary CPU not supporting the feature, if detected on the boot CPU,
will be parked.
Add the minimum SCTLR_EL1 and HCR_EL2 bits for enabling MTE. The Normal
Tagged memory type is configured in MAIR_EL1 before the MMU is enabled
in order to avoid disrupting other CPUs in the CnP domain.
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
In the arm64 module linker script, the section .text.ftrace_trampoline
is specified unconditionally regardless of whether CONFIG_DYNAMIC_FTRACE
is enabled (this is simply due to the limitation that module linker
scripts are not preprocessed like the vmlinux one).
Normally, for .plt and .text.ftrace_trampoline, the section flags
present in the module binary wouldn't matter since module_frob_arch_sections()
would assign them manually anyway. However, the arm64 module loader only
sets the section flags for .text.ftrace_trampoline when CONFIG_DYNAMIC_FTRACE=y.
That's only become problematic recently due to a recent change in
binutils-2.35, where the .text.ftrace_trampoline section (along with the
.plt section) is now marked writable and executable (WAX).
We no longer allow writable and executable sections to be loaded due to
commit 5c3a7db0c7 ("module: Harden STRICT_MODULE_RWX"), so this is
causing all modules linked with binutils-2.35 to be rejected under arm64.
Drop the IS_ENABLED(CONFIG_DYNAMIC_FTRACE) check in module_frob_arch_sections()
so that the section flags for .text.ftrace_trampoline get properly set to
SHF_EXECINSTR|SHF_ALLOC, without SHF_WRITE.
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: http://lore.kernel.org/r/20200831094651.GA16385@linux-8ccs
Link: https://lore.kernel.org/r/20200901160016.3646-1-jeyu@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit eaecca9e77 ("arm64: Fix __cpu_logical_map undefined issue")
exported cpu_logical_map in order to fix tegra194-cpufreq module build
failure.
As this might potentially cause problem while supporting physical CPU
hotplug, tegra194-cpufreq module was reworded to avoid use of
cpu_logical_map() via the commit 93d0c1ab23 ("cpufreq: replace
cpu_logical_map() with read_cpuid_mpir()")
Since cpu_logical_map was exported to fix the module build temporarily,
let us remove the same before it gains any user again.
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20200901095229.56793-1-sudeep.holla@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
There's really no need to put every parameter on a new line when calling
a function with a long name, so reformat the *setup_additional_pages()
functions in the vDSO setup code to follow the usual conventions.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Most of the compat vDSO code can be built and guarded using IS_ENABLED,
so drop the unnecessary #ifdefs.
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Use the common DISCARDS rule for the linker script in an effort to
regularize the linker script to prepare for warning on orphaned
sections. Additionally clean up left-over no-op macros.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200821194310.3089815-12-keescook@chromium.org
Avoid .eh_frame* section generation by making sure both CFLAGS and AFLAGS
contain -fno-asychronous-unwind-tables and -fno-unwind-tables.
With all sources of .eh_frame now removed from the build, drop this
DISCARD so we can be alerted in the future if it returns unexpectedly
once orphan section warnings have been enabled.
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200821194310.3089815-11-keescook@chromium.org
Remove last instance of an .eh_frame section by removing the needless Call
Frame Information annotations which were likely leftovers from 32-bit ARM.
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200821194310.3089815-10-keescook@chromium.org