Currently we require that software at a higher exception level initialise
all registers at the exception level the kernel will be entered prior to
starting the kernel in order to ensure that there is nothing uninitialised
which could result in an UNKNOWN state while running the kernel. The
expectation is that the software running at the highest exception levels
will be tightly coupled to the system and can ensure that all available
features are appropriately initialised and that the kernel can initialise
anything else.
There is a gap here in the case where new registers are added to lower
exception levels that require initialisation but the kernel does not yet
understand them. Extend the requirement to also include exception levels
below the one where the kernel is entered to cover this.
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210401180942.35815-4-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The arm64 FEAT_FGT extension introduces a set of traps to EL2 for accesses
to small sets of registers and instructions from EL1 and EL0, access to
which is controlled by EL3. Require access to it so that it is
available to us in future and so that we can ensure these traps are
disabled during boot.
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210401180942.35815-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Memory controller drivers for v5.13 - Tegra SoC
1. Few cleanups.
2. Add debug statistics to Tegra20 memory controller.
3. Update bindings and convert to dtschema. This update is not
backwards compatible (ABI break) however the broken part was added
recently (v5.11) and there are no users of it yet.
* tag 'memory-controller-drv-tegra-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl:
dt-bindings: memory: tegra20: mc: Convert to schema
dt-bindings: memory: tegra124: emc: Replace core regulator with power domain
dt-bindings: memory: tegra30: emc: Replace core regulator with power domain
dt-bindings: memory: tegra20: emc: Replace core regulator with power domain
memory: tegra: Print out info-level once per driver probe
memory: tegra20: Protect debug code with a lock
memory: tegra20: Correct comment to MC_STAT registers writes
memory: tegra20: Add debug statistics
memory: tegra: replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE
Link: https://lore.kernel.org/r/20210407161333.73013-2-krzysztof.kozlowski@canonical.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Qualcomm driver updates for 5.13
This introduces SC7280 and SM8350 support in the RPMH power-domain
driver, SC7280 support to the LLCC driver, SC7280 support tot he AOSS
QMP driver, cleanups to the RPMH driver and a few smaller fixes to the
SMEM, QMI and EBI2 drivers.
* tag 'qcom-drivers-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
bus: qcom: Put child node before return
dt-bindings: firmware: scm: Add sc7280 support
soc: qcom: rpmh-rsc: Fold WARN_ON() into if condition
soc: qcom: rpmh-rsc: Loop over fewer bits in irq handler
soc: qcom: rpmh-rsc: Remove tcs_is_free() API
soc: qcom: smem: Update max processor count
soc: qcom: aoss: Add AOSS QMP support for SC7280
dt-bindings: soc: qcom: aoss: Add SC7280 compatible
soc: qcom: llcc: Add configuration data for SC7280
dt-bindings: arm: msm: Add LLCC for SC7280
soc: qcom: Fix typos in the file qmi_encdec.c
soc: qcom: rpmhpd: Add sc7280 powerdomains
dt-bindings: power: rpmpd: Add sc7280 to rpmpd binding
soc: qcom: rpmhpd: Add SM8350 power domains
dt-bindings: power: Add rpm power domain bindings for SM8350
Link: https://lore.kernel.org/r/20210404164951.713045-1-bjorn.andersson@linaro.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The documentation build legitimately screams about the PTP
documentation table being misformated.
Fix it by adjusting the table width guides.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Marc Zyngier <maz@kernel.org>
On newer Microsoft Surface models (specifically 7th-generation, i.e.
Surface Pro 7, Surface Book 3, Surface Laptop 3, and Surface Laptop Go),
battery and AC status/information is no longer handled via standard ACPI
devices, but instead directly via the Surface System Aggregator Module
(SSAM), i.e. the embedded controller on those devices.
While on previous generation models, battery status is also handled via
SSAM, an ACPI shim was present to translate the standard ACPI battery
interface to SSAM requests. The SSAM interface itself, which is modeled
closely after the ACPI interface, has not changed.
This commit introduces a new SSAM client device driver to support
battery status/information via the aforementioned interface on said
Surface models. It is in parts based on the standard ACPI battery
driver.
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
This provides the ability for architectures to enable kernel stack base
address offset randomization. This feature is controlled by the boot
param "randomize_kstack_offset=on/off", with its default value set by
CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT.
This feature is based on the original idea from the last public release
of PaX's RANDKSTACK feature: https://pax.grsecurity.net/docs/randkstack.txt
All the credit for the original idea goes to the PaX team. Note that
the design and implementation of this upstream randomize_kstack_offset
feature differs greatly from the RANDKSTACK feature (see below).
Reasoning for the feature:
This feature aims to make harder the various stack-based attacks that
rely on deterministic stack structure. We have had many such attacks in
past (just to name few):
https://jon.oberheide.org/files/infiltrate12-thestackisback.pdfhttps://jon.oberheide.org/files/stackjacking-infiltrate11.pdfhttps://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html
As Linux kernel stack protections have been constantly improving
(vmap-based stack allocation with guard pages, removal of thread_info,
STACKLEAK), attackers have had to find new ways for their exploits
to work. They have done so, continuing to rely on the kernel's stack
determinism, in situations where VMAP_STACK and THREAD_INFO_IN_TASK_STRUCT
were not relevant. For example, the following recent attacks would have
been hampered if the stack offset was non-deterministic between syscalls:
https://repositorio-aberto.up.pt/bitstream/10216/125357/2/374717.pdf
(page 70: targeting the pt_regs copy with linear stack overflow)
https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
(leaked stack address from one syscall as a target during next syscall)
The main idea is that since the stack offset is randomized on each system
call, it is harder for an attack to reliably land in any particular place
on the thread stack, even with address exposures, as the stack base will
change on the next syscall. Also, since randomization is performed after
placing pt_regs, the ptrace-based approach[1] to discover the randomized
offset during a long-running syscall should not be possible.
Design description:
During most of the kernel's execution, it runs on the "thread stack",
which is pretty deterministic in its structure: it is fixed in size,
and on every entry from userspace to kernel on a syscall the thread
stack starts construction from an address fetched from the per-cpu
cpu_current_top_of_stack variable. The first element to be pushed to the
thread stack is the pt_regs struct that stores all required CPU registers
and syscall parameters. Finally the specific syscall function is called,
with the stack being used as the kernel executes the resulting request.
The goal of randomize_kstack_offset feature is to add a random offset
after the pt_regs has been pushed to the stack and before the rest of the
thread stack is used during the syscall processing, and to change it every
time a process issues a syscall. The source of randomness is currently
architecture-defined (but x86 is using the low byte of rdtsc()). Future
improvements for different entropy sources is possible, but out of scope
for this patch. Further more, to add more unpredictability, new offsets
are chosen at the end of syscalls (the timing of which should be less
easy to measure from userspace than at syscall entry time), and stored
in a per-CPU variable, so that the life of the value does not stay
explicitly tied to a single task.
As suggested by Andy Lutomirski, the offset is added using alloca()
and an empty asm() statement with an output constraint, since it avoids
changes to assembly syscall entry code, to the unwinder, and provides
correct stack alignment as defined by the compiler.
In order to make this available by default with zero performance impact
for those that don't want it, it is boot-time selectable with static
branches. This way, if the overhead is not wanted, it can just be
left turned off with no performance impact.
The generated assembly for x86_64 with GCC looks like this:
...
ffffffff81003977: 65 8b 05 02 ea 00 7f mov %gs:0x7f00ea02(%rip),%eax
# 12380 <kstack_offset>
ffffffff8100397e: 25 ff 03 00 00 and $0x3ff,%eax
ffffffff81003983: 48 83 c0 0f add $0xf,%rax
ffffffff81003987: 25 f8 07 00 00 and $0x7f8,%eax
ffffffff8100398c: 48 29 c4 sub %rax,%rsp
ffffffff8100398f: 48 8d 44 24 0f lea 0xf(%rsp),%rax
ffffffff81003994: 48 83 e0 f0 and $0xfffffffffffffff0,%rax
...
As a result of the above stack alignment, this patch introduces about
5 bits of randomness after pt_regs is spilled to the thread stack on
x86_64, and 6 bits on x86_32 (since its has 1 fewer bit required for
stack alignment). The amount of entropy could be adjusted based on how
much of the stack space we wish to trade for security.
My measure of syscall performance overhead (on x86_64):
lmbench: /usr/lib/lmbench/bin/x86_64-linux-gnu/lat_syscall -N 10000 null
randomize_kstack_offset=y Simple syscall: 0.7082 microseconds
randomize_kstack_offset=n Simple syscall: 0.7016 microseconds
So, roughly 0.9% overhead growth for a no-op syscall, which is very
manageable. And for people that don't want this, it's off by default.
There are two gotchas with using the alloca() trick. First,
compilers that have Stack Clash protection (-fstack-clash-protection)
enabled by default (e.g. Ubuntu[3]) add pagesize stack probes to
any dynamic stack allocations. While the randomization offset is
always less than a page, the resulting assembly would still contain
(unreachable!) probing routines, bloating the resulting assembly. To
avoid this, -fno-stack-clash-protection is unconditionally added to
the kernel Makefile since this is the only dynamic stack allocation in
the kernel (now that VLAs have been removed) and it is provably safe
from Stack Clash style attacks.
The second gotcha with alloca() is a negative interaction with
-fstack-protector*, in that it sees the alloca() as an array allocation,
which triggers the unconditional addition of the stack canary function
pre/post-amble which slows down syscalls regardless of the static
branch. In order to avoid adding this unneeded check and its associated
performance impact, architectures need to carefully remove uses of
-fstack-protector-strong (or -fstack-protector) in the compilation units
that use the add_random_kstack() macro and to audit the resulting stack
mitigation coverage (to make sure no desired coverage disappears). No
change is visible for this on x86 because the stack protector is already
unconditionally disabled for the compilation unit, but the change is
required on arm64. There is, unfortunately, no attribute that can be
used to disable stack protector for specific functions.
Comparison to PaX RANDKSTACK feature:
The RANDKSTACK feature randomizes the location of the stack start
(cpu_current_top_of_stack), i.e. including the location of pt_regs
structure itself on the stack. Initially this patch followed the same
approach, but during the recent discussions[2], it has been determined
to be of a little value since, if ptrace functionality is available for
an attacker, they can use PTRACE_PEEKUSR/PTRACE_POKEUSR to read/write
different offsets in the pt_regs struct, observe the cache behavior of
the pt_regs accesses, and figure out the random stack offset. Another
difference is that the random offset is stored in a per-cpu variable,
rather than having it be per-thread. As a result, these implementations
differ a fair bit in their implementation details and results, though
obviously the intent is similar.
[1] https://lore.kernel.org/kernel-hardening/2236FBA76BA1254E88B949DDB74E612BA4BC57C1@IRSMSX102.ger.corp.intel.com/
[2] https://lore.kernel.org/kernel-hardening/20190329081358.30497-1-elena.reshetova@intel.com/
[3] https://lists.ubuntu.com/archives/ubuntu-devel/2019-June/040741.html
Co-developed-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210401232347.2791257-4-keescook@chromium.org
Add compatible strings to support the system timer, clocksource, OST,
watchdog and PWM blocks of the JZ4760 and JZ4760B SoCs.
Newer SoCs which behave like the JZ4760 or JZ4760B now see their
compatible string require a fallback compatible string that corresponds
to one of these two SoCs.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210308212302.10288-1-paul@crapouillou.net
Apple SoCs run firmware that sets up a simplefb-compatible framebuffer
for us. Add a compatible for it, and two missing supported formats.
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
AIC is the Apple Interrupt Controller found on Apple ARM SoCs, such as
the M1.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
This documents the newly introduced ioremap_np() along with all the
other common ioremap() variants, and some higher-level abstractions
available.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
This adds more detailed descriptions of the various read/write
primitives available for use with I/O memory/ports.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hector Martin <marcan@marcan.st>
ARM64 currently defaults to posted MMIO (nGnRE), but some devices
require the use of non-posted MMIO (nGnRnE). Introduce a new ioremap()
variant to handle this case. ioremap_np() returns NULL on arches that
do not implement this variant.
sparc64 is the only architecture that needs to be touched directly,
because it includes neither of the generic io.h or iomap.h headers.
This adds the IORESOURCE_MEM_NONPOSTED flag, which maps to this
variant and marks a given resource as requiring non-posted mappings.
This is implemented in the resource system because it is a SoC-level
requirement, so existing drivers do not need special-case code to pick
this ioremap variant.
Then this is implemented in devres by introducing devm_ioremap_np(),
and making devm_ioremap_resource() automatically select this variant
when the resource has the IORESOURCE_MEM_NONPOSTED flag set.
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
Not all platforms provide the same set of timers/interrupts, and Linux
only needs one (plus kvm/guest ones); some platforms are working around
this by using dummy fake interrupts. Implementing interrupt-names allows
the devicetree to specify an arbitrary set of available interrupts, so
the timer code can pick the right one.
This also adds the hyp-virt timer/interrupt, which was previously not
expressed in the fixed 4-interrupt form.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
This introduces bindings for all three 2020 Apple M1 devices:
* apple,j274 - Mac mini (M1, 2020)
* apple,j293 - MacBook Pro (13-inch, M1, 2020)
* apple,j313 - MacBook Air (M1, 2020)
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
This is different from the legacy AAPL prefix used on PPC, but
consensus is that we prefer `apple` for these new platforms.
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
This point in gregkh's tty-next tree includes all the samsung_tty
changes that were part of v3 of the M1 bring-up series, and have
already been merged in.
Signed-off-by: Hector Martin <marcan@marcan.st>
Chanwoo writes:
Update extcon next for v5.13
Detailed description for this pull request:
1. Update extcon provider driver
- Add the support of charging interrupt to detect charger connector
for extcon-max8997.c
- Detect OTG when USB_ID pin is connected to ground for extcon-sm5502.c
- Add the support for VBUS detection for extcon-qcom-spmi-misc.c
and replace qcom,pm8941-misc binding document with yaml style.
* tag 'extcon-next-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon:
extcon: qcom-spmi: Add support for VBUS detection
bindings: pm8941-misc: Add support for VBUS detection
bindings: pm8941-misc: Convert bindings to YAML
extcon: sm5502: Detect OTG when USB_ID is connected to ground
extcon: max8997: Add CHGINS and CHGRM interrupt handling
Moritz writes:
Second set of FPGA Manager changes for 5.13-rc1
FPGA Manager:
- Russ' first change improves port_enable reliability
- Russ' second change adds a new device ID for a DFL device
- Geert's change updates the examples in binding with dt overlay sugar
syntax
All patches have been reviewed on the mailing list, and have been in the
last linux-next releases (as part of my for-next branch) without issues.
Signed-off-by: Moritz Fischer <mdf@kernel.org>
* tag 'fpga-late-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga:
fpga: dfl: pci: add DID for D5005 PAC cards
dt-bindings: fpga: fpga-region: Convert to sugar syntax
fpga: dfl: afu: harden port enable logic
fpga: Add support for Xilinx DFX AXI Shutdown manager
dt-bindings: fpga: Add compatible value for Xilinx DFX AXI shutdown manager
fpga: xilinx-pr-decoupler: Simplify code by using dev_err_probe()
fpga: fpga-mgr: xilinx-spi: fix error messages on -EPROBE_DEFER
First of all, no_central_polling was removed since
commit 7e6fdd4bad ("PM / devfreq: Core updates to support devices
which can idle")
Secondly, get_target_freq() is not only called only with update_devfreq()
notified by OPP now, but also min/max freq qos notifier.
So remove this invalid description now to avoid confusing.
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
The general trend is to have devicetree bindings in YAML format, to
allow automatic validation of bindings and devicetrees.
Convert the NPCM SoC family's binding to YAML before it accumulates more
entries.
The nuvoton,npcm750-evb compatible string is introduced to keep the
structure of the binding a little simpler.
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210320164023.614059-1-j.neuschaefer@gmx.net
Signed-off-by: Joel Stanley <joel@jms.id.au>
As of commit 359a376081 ("kunit: support failure from dynamic analysis
tools"), we can use current->kunit_test to find the current kunit test.
Mention this in tips.rst and give an example of how this can be used in
conjunction with `test->priv` to pass around state and specifically
implement something like mocking.
There's a lot more we could go into on that topic, but given that
example is already longer than every other "tip" on this page, we just
point to the API docs and leave filling in the blanks as an exercise to
the reader.
Also give an example of kunit_fail_current_test().
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
This is only done once, we don't need to generate code to initialize a
structure stored in the ELF .data segment. Fill in the three required
.driver members directly instead of copying data into them during
mdev_register_driver().
Further the to_mdev_driver() function doesn't belong in a public header,
just inline it into the two places that need it. Finally, we can now
clearly see that 'drv' derived from dev->driver cannot be NULL, firstly
because the driver core forbids it, and secondly because NULL won't pass
through the container_of(). Remove the dead code.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <4-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The mdev API should accept and pass a 'struct mdev_device *' in all
places, not pass a 'struct device *' and cast it internally with
to_mdev_device(). Particularly in its struct mdev_driver functions, the
whole point of a bus's struct device_driver wrapper is to provide type
safety compared to the default struct device_driver.
Further, the driver core standard is for bus drivers to expose their
device structure in their public headers that can be used with
container_of() inlines and '&foo->dev' to go between the class levels, and
'&foo->dev' to be used with dev_err/etc driver core helper functions. Move
'struct mdev_device' to mdev.h
Once done this allows moving some one instruction exported functions to
static inlines, which in turns allows removing one of the two grotesque
symbol_get()'s related to mdev in the core code.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Message-Id: <3-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>