commit 07c6b2d01d upstream.
Since the switch from mmp_pdma to pxa_dma driver for pxa architectures,
the pxa_dma requires 2 arguments, namely the requestor line and the
requested priority.
Fix the only left device node which was still passing only one argument,
making the pxa3xx-nand driver misbehave in a device-tree configuration,
ie. failing all data transfers.
Fixes: c943646d1f ("ARM: dts: pxa: add dma engine node to pxa3xx-nand")
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b3a7f31eb7 upstream.
The Armada 375 has the same SATA IP as Armada 370 and Armada XP, which
requires the PHY speed to be set in the LP_PHY_CTL register for SATA
hotplug to work.
Therefore, this commit updates the compatible string used to describe
the SATA IP in Armada 375 from marvell,orion-sata to
marvell,armada-370-sata.
Fixes: 4de5908509 ("ARM: mvebu: add Device Tree description of the Armada 375 SoC")
Signed-off-by: Lior Amsalem <alior@marvell.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc7eb9d589 upstream.
We cannot select a symbol that has disabled dependencies, so
we get a warning if we ever enable EXYNOS_THERMAL without
also turning on THERMAL_OF:
warning: (ARCH_EXYNOS) selects EXYNOS_THERMAL which has unmet direct dependencies (THERMAL && (ARCH_EXYNOS || COMPILE_TEST) && THERMAL_OF)
This adds another 'select' in the platform code to avoid that
case. Alternatively, we could decide to not select EXYNOS_THERMAL
here and instead make it a user option.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: f87e6bd3f7 ("thermal: exynos: Add the dependency of CONFIG_THERMAL_OF instead of CONFIG_OF")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef2b1d777d upstream.
The atlas7 clock controller driver registers a reset controller
for itself, which causes a link error when the subsystem is
disabled:
drivers/built-in.o: In function `atlas7_clk_init':
drivers/clk/sirf/clk-atlas7.c:1681: undefined reference to `reset_controller_register'
As the clk driver does not have a Kconfig symbol for itself
but it always built-in when the platform is enabled, we have
to ensure that the reset controller subsystem is also built-in
in this case.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Philipp Zabel <p.zabel@pengutronix.de>
Fixes: 301c5d2940 ("clk: sirf: add CSR atlas7 clk and reset support")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98f4222150 upstream.
Based on CPU type choose generic omap3 or omap3430 specific cpuidle
parameters. Parameters for omap3430 were measured on Nokia N900 device and
added by commit 5a1b1d3a9e ("OMAP3: RX-51: Pass cpu idle parameters")
which were later removed by commit 231900afba ("ARM: OMAP3: cpuidle -
remove rx51 cpuidle parameters table") due to huge code complexity.
This patch brings cpuidle parameters for omap3430 devices again, but uses
simple condition based on CPU type.
Fixes: 231900afba ("ARM: OMAP3: cpuidle - remove rx51 cpuidle
parameters table")
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1bdb897039 upstream.
If x86_vector_alloc_irq() fails x86_vector_free_irqs() is invoked to cleanup
the already allocated vectors. This subsequently calls clear_vector_irq().
The failed irq has no vector assigned, which triggers the BUG_ON(!vector) in
clear_vector_irq().
We cannot suppress the call to x86_vector_free_irqs() for the failed
interrupt, because the other data related to this irq must be cleaned up as
well. So calling clear_vector_irq() with vector == 0 is legitimate.
Remove the BUG_ON and return if vector is zero,
[ tglx: Massaged changelog ]
Fixes: b5dc8e6c21 "x86/irq: Use hierarchical irqdomain to manage CPU interrupt vectors"
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4705e02498 upstream.
We need to update the user TM feature bits (PPC_FEATURE2_HTM and
PPC_FEATURE2_HTM) to mirror what we do with the kernel TM feature
bit.
At the moment, if firmware reports TM is not available we turn off
the kernel TM feature bit but leave the userspace ones on. Userspace
thinks it can execute TM instructions and it dies trying.
This (together with a QEMU patch) fixes PR KVM, which doesn't currently
support TM.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit beff82374b upstream.
scan_features() updates cpu_user_features but not cpu_user_features2.
Amongst other things, cpu_user_features2 contains the user TM feature
bits which we must keep in sync with the kernel TM feature bit.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6997e57d69 upstream.
The REAL_LE feature entry in the ibm_pa_feature struct is missing an MMU
feature value, meaning all the remaining elements initialise the wrong
values.
This means instead of checking for byte 5, bit 0, we check for byte 0,
bit 0, and then we incorrectly set the CPU feature bit as well as MMU
feature bit 1 and CPU user feature bits 0 and 2 (5).
Checking byte 0 bit 0 (IBM numbering), means we're looking at the
"Memory Management Unit (MMU)" feature - ie. does the CPU have an MMU.
In practice that bit is set on all platforms which have the property.
This means we set CPU_FTR_REAL_LE always. In practice that seems not to
matter because all the modern cpus which have this property also
implement REAL_LE, and we've never needed to disable it.
We're also incorrectly setting MMU feature bit 1, which is:
#define MMU_FTR_TYPE_8xx 0x00000002
Luckily the only place that looks for MMU_FTR_TYPE_8xx is in Book3E
code, which can't run on the same cpus as scan_features(). So this also
doesn't matter in practice.
Finally in the CPU user feature mask, we're setting bits 0 and 2. Bit 2
is not currently used, and bit 0 is:
#define PPC_FEATURE_PPC_LE 0x00000001
Which says the CPU supports the old style "PPC Little Endian" mode.
Again this should be harmless in practice as no 64-bit CPUs implement
that mode.
Fix the code by adding the missing initialisation of the MMU feature.
Also add a comment marking CPU user feature bit 2 (0x4) as reserved. It
would be unsafe to start using it as old kernels incorrectly set it.
Fixes: 44ae3ab335 ("powerpc: Free up some CPU feature bits by moving out MMU-related features")
Signed-off-by: Anton Blanchard <anton@samba.org>
[mpe: Flesh out changelog, add comment reserving 0x4]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0851561d9c upstream.
In sha_complete_job, incorrect mcryptd_hash_request_ctx pointer is used
when check and complete other jobs. If the memory of first completed req
is freed, while still completing other jobs in the func, kernel will
crash since NULL pointer is assigned to RIP.
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d89d9e61d upstream.
Newer machines might use a different (larger) format for function
measurement blocks. To ensure that we comply with the alignment
requirement on these machines and prevent memory corruption (when
firmware writes more data than we expect) add 16 padding bytes
at the end of the fmb.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 103f6112f2 upstream.
Huge pages are not normally available to PV guests. Not suppressing
hugetlbfs use results in an endless loop of page faults when user mode
code tries to access a hugetlbfs mapped area (since the hypervisor
denies such PTEs to be created, but error indications can't be
propagated out of xen_set_pte_at(), just like for various of its
siblings), and - once killed in an oops like this:
kernel BUG at .../fs/hugetlbfs/inode.c:428!
invalid opcode: 0000 [#1] SMP
...
RIP: e030:[<ffffffff811c333b>] [<ffffffff811c333b>] remove_inode_hugepages+0x25b/0x320
...
Call Trace:
[<ffffffff811c3415>] hugetlbfs_evict_inode+0x15/0x40
[<ffffffff81167b3d>] evict+0xbd/0x1b0
[<ffffffff8116514a>] __dentry_kill+0x19a/0x1f0
[<ffffffff81165b0e>] dput+0x1fe/0x220
[<ffffffff81150535>] __fput+0x155/0x200
[<ffffffff81079fc0>] task_work_run+0x60/0xa0
[<ffffffff81063510>] do_exit+0x160/0x400
[<ffffffff810637eb>] do_group_exit+0x3b/0xa0
[<ffffffff8106e8bd>] get_signal+0x1ed/0x470
[<ffffffff8100f854>] do_signal+0x14/0x110
[<ffffffff810030e9>] prepare_exit_to_usermode+0xe9/0xf0
[<ffffffff814178a5>] retint_user+0x8/0x13
This is CVE-2016-3961 / XSA-174.
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <JGross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>
Link: http://lkml.kernel.org/r/57188ED802000078000E431C@prv-mh.provo.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fdc69e7df3 upstream.
The set_pte_at() function must update the hardware PTE_RDONLY bit
depending on the state of the PTE_WRITE and PTE_DIRTY bits of the given
entry value. However, it currently only performs this for pte_valid()
entries, ignoring PTE_PROT_NONE. The side-effect is that PROT_NONE
mappings would not have the PTE_RDONLY bit set. Without
CONFIG_ARM64_HW_AFDBM, this is not an issue since such PROT_NONE pages
are not accessible anyway.
With commit 2f4b829c62 ("arm64: Add support for hardware updates of
the access and dirty pte bits"), the ptep_set_wrprotect() function was
re-written to cope with automatic hardware updates of the dirty state.
As an optimisation, only PTE_RDONLY is checked to assess the "dirty"
status. Since set_pte_at() does not set this bit for PROT_NONE mappings,
such pages may be considered "dirty" as a result of
ptep_set_wrprotect().
This patch updates the pte_valid() check to pte_present() in
set_pte_at(). It also adds PTE_PROT_NONE to the swap entry bits comment.
Fixes: 2f4b829c62 ("arm64: Add support for hardware updates of the access and dirty pte bits")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Tested-by: Ganapatrao Kulkarni <gkulkarni@cavium.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac15bd63bb upstream.
Currently, set_pte_at() only checks the software PTE_WRITE bit for user
mappings when it sets or clears the hardware PTE_RDONLY accordingly. The
kernel ptes are written directly without any modification, relying
solely on the protection bits in macros like PAGE_KERNEL. However,
modifying kernel pte attributes via pte_wrprotect() would be ignored by
set_pte_at(). Since pte_wrprotect() does not set PTE_RDONLY (it only
clears PTE_WRITE), the new permission is not taken into account.
This patch changes set_pte_at() to adjust the read-only permission for
kernel ptes as well. As a side effect, existing PROT_* definitions used
for kernel ioremap*() need to include PTE_DIRTY | PTE_WRITE.
(additionally, white space fix for PTE_KERNEL_ROX)
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ca4a23810 upstream.
Commit 127500ccb7 ("ARM: OMAP2+: Only write the sysconfig on idle
when necessary") talks about verification of sysconfig cache value before
updating it, only during idle path. But the patch is adding the
verification in the enable path. So, adding the check in a proper place
as per the commit description.
Not keeping this check during enable path as there is a chance of losing
context and it is safe to do on idle as the context of the register will
never be lost while the device is active.
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Acked-by: Tero Kristo <t-kristo@ti.com>
Cc: Jon Hunter <jonathanh@nvidia.com>
Fixes: commit 127500ccb7 "ARM: OMAP2+: Only write the sysconfig on idle when necessary"
[paul@pwsan.com: appears to have been caused by my own mismerge of the
originally posted patch]
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 456e8d5348 upstream.
The following commits:
commit 3fa609755c ("ARM: omap2: restore OMAP4 barrier behaviour")
commit f746929ffd ("Revert "ARM: OMAP4: remove dead kconfig option OMAP4_ERRATA_I688"")
and
commit ea827ad5ff ("ARM: DRA7: Provide proper IO map table")
came in around the same time, unfortunately this seem to have missed
initializing the barrier for DRA7 platforms - omap5_map_io was reused
for dra7 till it was split out by the last patch. barrier_init
needs to be hence carried forward as it is valid for DRA7 family of
processors as they are for OMAP5.
Fixes: ea827ad5ff ("ARM: DRA7: Provide proper IO map table")
Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cfe1580a64 upstream.
commit 55ee7017ee ("arm: omap2: board-generic: use omap4_local_timer_init
for AM437x") makes synctimer32k as the clocksource on AM43xx. By default
the synctimer32k is clocked by 32K RTC OSC on AM43xx. But this 32K RTC OSC
is not available on epos boards which makes it fail to boot.
Synctimer32k can also be clocked by a peripheral PLL, so making this as
clock parent for synctimer3k on epos boards.
Fixes: 55ee7017ee ("arm: omap2: board-generic: use omap4_local_timer_init for AM437x")
Reported-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fc5b7f3bf1 upstream.
An interrupt handler that uses the fpu can kill a KVM VM, if it runs
under the following conditions:
- the guest's xcr0 register is loaded on the cpu
- the guest's fpu context is not loaded
- the host is using eagerfpu
Note that the guest's xcr0 register and fpu context are not loaded as
part of the atomic world switch into "guest mode". They are loaded by
KVM while the cpu is still in "host mode".
Usage of the fpu in interrupt context is gated by irq_fpu_usable(). The
interrupt handler will look something like this:
if (irq_fpu_usable()) {
kernel_fpu_begin();
[... code that uses the fpu ...]
kernel_fpu_end();
}
As long as the guest's fpu is not loaded and the host is using eager
fpu, irq_fpu_usable() returns true (interrupted_kernel_fpu_idle()
returns true). The interrupt handler proceeds to use the fpu with
the guest's xcr0 live.
kernel_fpu_begin() saves the current fpu context. If this uses
XSAVE[OPT], it may leave the xsave area in an undesirable state.
According to the SDM, during XSAVE bit i of XSTATE_BV is not modified
if bit i is 0 in xcr0. So it's possible that XSTATE_BV[i] == 1 and
xcr0[i] == 0 following an XSAVE.
kernel_fpu_end() restores the fpu context. Now if any bit i in
XSTATE_BV == 1 while xcr0[i] == 0, XRSTOR generates a #GP. The
fault is trapped and SIGSEGV is delivered to the current process.
Only pre-4.2 kernels appear to be vulnerable to this sequence of
events. Commit 653f52c ("kvm,x86: load guest FPU context more eagerly")
from 4.2 forces the guest's fpu to always be loaded on eagerfpu hosts.
This patch fixes the bug by keeping the host's xcr0 loaded outside
of the interrupts-disabled region where KVM switches into guest mode.
Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: David Matlack <dmatlack@google.com>
[Move load after goto cancel_injection. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6c777e8799 upstream.
991de2e590 ("PCI, x86: Implement pcibios_alloc_irq() and
pcibios_free_irq()") appeared in v4.3 and helps support IOAPIC hotplug.
Олег reported that the Elcus-1553 TA1-PCI driver worked in v4.2 but not
v4.3 and bisected it to 991de2e590. Sunjin reported that the RocketRAID
272x driver worked in v4.2 but not v4.3. In both cases booting with
"pci=routirq" is a workaround.
I think the problem is that after 991de2e590, we no longer call
pcibios_enable_irq() for upstream bridges. Prior to 991de2e590, when a
driver called pci_enable_device(), we recursively called
pcibios_enable_irq() for upstream bridges via pci_enable_bridge().
After 991de2e590, we call pcibios_enable_irq() from pci_device_probe()
instead of the pci_enable_device() path, which does *not* call
pcibios_enable_irq() for upstream bridges.
Revert 991de2e590 to fix these driver regressions.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=111211
Fixes: 991de2e590 ("PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()")
Reported-and-tested-by: Олег Мороз <oleg.moroz@mcc.vniiem.ru>
Reported-by: Sunjin Yang <fan4326@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
CC: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fa8ff601d7 upstream.
Copying the content of an MSA vector from user memory may involve TLB
faults & mapping in pages. This will fail when preemption is disabled
due to an inability to acquire mmap_sem from do_page_fault, which meant
such vector loads to unmapped pages would always fail to be emulated.
Fix this by disabling preemption later only around the updating of
vector register state.
This change does however introduce a race between performing the load
into thread context & the thread being preempted, saving its current
live context & clobbering the loaded value. This should be a rare
occureence, so optimise for the fast path by simply repeating the load if
we are preempted.
Additionally if the copy failed then the failure path was taken with
preemption left disabled, leading to the kernel typically encountering
further issues around sleeping whilst atomic. The change to where
preemption is disabled avoids this issue.
Fixes: e4aa1f153a "MIPS: MSA unaligned memory access support"
Reported-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: James Cowgill <James.Cowgill@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/12345/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 14ebda3394 upstream.
Windows lets applications choose the frequency of the timer tick,
and in Windows 10 the maximum rate was changed from 1024 Hz to
2048 Hz. Unfortunately, because of the way the Windows API
works, most applications who need a higher rate than the default
64 Hz will just do
timeGetDevCaps(&tc, sizeof(tc));
timeBeginPeriod(tc.wPeriodMin);
and pick the maximum rate. This causes very high CPU usage when
playing media or games on Windows 10, even if the guest does not
actually use the CPU very much, because the frequent timer tick
causes halt_poll_ns to kick in.
There is no really good solution, especially because Microsoft
could sooner or later bump the limit to 4096 Hz, but for now
the best we can do is lower a bit the upper limit for
halt_poll_ns. :-(
Reported-by: Jon Panozzo <jonp@lime-technology.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 321c5658c5 upstream.
Non maskable interrupts (NMI) are preferred to interrupts in current
implementation. If a NMI is pending and NMI is blocked by the result
of nmi_allowed(), pending interrupt is not injected and
enable_irq_window() is not executed, even if interrupts injection is
allowed.
In old kernel (e.g. 2.6.32), schedule() is often called in NMI context.
In this case, interrupts are needed to execute iret that intends end
of NMI. The flag of blocking new NMI is not cleared until the guest
execute the iret, and interrupts are blocked by pending NMI. Due to
this, iret can't be invoked in the guest, and the guest is starved
until block is cleared by some events (e.g. canceling injection).
This patch injects pending interrupts, when it's allowed, even if NMI
is blocked. And, If an interrupts is pending after executing
inject_pending_event(), enable_irq_window() is executed regardless of
NMI pending counter.
Signed-off-by: Yuki Shibuya <shibuya.yk@ncos.nec.co.jp>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a6002ec5a8 upstream.
arm and arm64 use different config options to specify big endian. This
needs taking into account when including code/headers between the two
architectures.
A case in point is PAN, which uses the __instr_arm() macro to output
instructions. The macro comes from opcodes.h, which lives under arch/arm.
On a big-endian build the mismatched config options mean the instruction
isn't byte swapped correctly, resulting in undefined instruction exceptions
during boot:
| alternatives: patching kernel code
| kdevtmpfs[87]: undefined instruction: pc=ffffffc0004505b4
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| kdevtmpfs[87]: undefined instruction: pc=ffffffc00076231c
| Internal error: Oops - undefined instruction: 0 [#1] SMP
| Modules linked in:
| CPU: 0 PID: 87 Comm: kdevtmpfs Not tainted 4.1.16+ #5
| Hardware name: Hisilicon PhosphorHi1382 EVB (DT)
| task: ffffffc336591700 ti: ffffffc3365a4000 task.ti: ffffffc3365a4000
| PC is at dump_instr+0x68/0x100
| LR is at do_undefinstr+0x1d4/0x2a4
| pc : [<ffffffc00076231c>] lr : [<ffffffc0000811d4>] pstate: 604001c5
| sp : ffffffc3365a6450
Reported-by: Hanjun Guo <guohanjun@huawei.com>
Tested-by: Xuefeng Wang <wxf.wang@hisilicon.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 08a5bb2921 upstream.
hugepd_free() used __get_cpu_var() once. Nothing ensured that the code
accessing the variable did not migrate from one CPU to another and soon
this was noticed by Tiejun Chen in 94b09d7554 ("powerpc/hugetlb:
Replace __get_cpu_var with get_cpu_var"). So we had it fixed.
Christoph Lameter was doing his __get_cpu_var() replaces and forgot
PowerPC. Then he noticed this and sent his fixed up batch again which
got applied as 69111bac42 ("powerpc: Replace __get_cpu_var uses").
The careful reader will noticed one little detail: get_cpu_var() got
replaced with this_cpu_ptr(). So now we have a put_cpu_var() which does
a preempt_enable() and nothing that does preempt_disable() so we
underflow the preempt counter.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ef4dfd9d9 upstream.
Handling exceptions from modules never worked on parisc.
It was just masked by the fact that exceptions from modules
don't happen during normal use.
When a module triggers an exception in get_user() we need to load the
main kernel dp value before accessing the exception_data structure, and
afterwards restore the original dp value of the module on exit.
Noticed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef72f3110d upstream.
The kernel module testcase (lib/test_user_copy.c) exhibited a kernel
crash on parisc if the parameters for copy_from_user were reversed
("illegal reversed copy_to_user" testcase).
Fix this potential crash by checking the fault handler if the faulting
address is in the exception table.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e3893027a3 upstream.
We want to avoid the kernel module loader to create function pointers
for the kernel fixup routines of get_user() and put_user(). Changing
the external reference from function type to int type fixes this.
This unbreaks exception handling for get_user() and put_user() when
called from a kernel module.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3d266c8a9 upstream.
This patch tries to fix a PEBS warning found in my stress test. The
following perf command can easily trigger the pebs warning or spurious
NMI error on Skylake/Broadwell/Haswell platforms:
sudo perf record -e 'cpu/umask=0x04,event=0xc4/pp,cycles,branches,ref-cycles,cache-misses,cache-references' --call-graph fp -b -c1000 -a
Also the NMI watchdog must be enabled.
For this case, the events number is larger than counter number. So
perf has to do multiplexing.
In perf_mux_hrtimer_handler, it does perf_pmu_disable(), schedule out
old events, rotate_ctx, schedule in new events and finally
perf_pmu_enable().
If the old events include precise event, the MSR_IA32_PEBS_ENABLE
should be cleared when perf_pmu_disable(). The MSR_IA32_PEBS_ENABLE
should keep 0 until the perf_pmu_enable() is called and the new event is
precise event.
However, there is a corner case which could restore PEBS_ENABLE to
stale value during the above period. In perf_pmu_disable(), GLOBAL_CTRL
will be set to 0 to stop overflow and followed PMI. But there may be
pending PMI from an earlier overflow, which cannot be stopped. So even
GLOBAL_CTRL is cleared, the kernel still be possible to get PMI. At
the end of the PMI handler, __intel_pmu_enable_all() will be called,
which will restore the stale values if old events haven't scheduled
out.
Once the stale pebs value is set, it's impossible to be corrected if
the new events are non-precise. Because the pebs_enabled will be set
to 0. x86_pmu.enable_all() will ignore the MSR_IA32_PEBS_ENABLE
setting. As a result, the following NMI with stale PEBS_ENABLE
trigger pebs warning.
The pending PMI after enabled=0 will become harmless if the NMI handler
does not change the state. This patch checks cpuc->enabled in pmi and
only restore the state when PMU is active.
Here is the dump:
Call Trace:
<NMI> [<ffffffff813c3a2e>] dump_stack+0x63/0x85
[<ffffffff810a46f2>] warn_slowpath_common+0x82/0xc0
[<ffffffff810a483a>] warn_slowpath_null+0x1a/0x20
[<ffffffff8100fe2e>] intel_pmu_drain_pebs_nhm+0x2be/0x320
[<ffffffff8100caa9>] intel_pmu_handle_irq+0x279/0x460
[<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
[<ffffffff811f290d>] ? vunmap_page_range+0x20d/0x330
[<ffffffff811f2f11>] ? unmap_kernel_range_noflush+0x11/0x20
[<ffffffff8148379f>] ? ghes_copy_tofrom_phys+0x10f/0x2a0
[<ffffffff814839c8>] ? ghes_read_estatus+0x98/0x170
[<ffffffff81005a7d>] perf_event_nmi_handler+0x2d/0x50
[<ffffffff810310b9>] nmi_handle+0x69/0x120
[<ffffffff810316f6>] default_do_nmi+0xe6/0x100
[<ffffffff810317f2>] do_nmi+0xe2/0x130
[<ffffffff817aea71>] end_repeat_nmi+0x1a/0x1e
[<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
[<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
[<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
<<EOE>> <IRQ> [<ffffffff81006df8>] ? x86_perf_event_set_period+0xd8/0x180
[<ffffffff81006eec>] x86_pmu_start+0x4c/0x100
[<ffffffff8100722d>] x86_pmu_enable+0x28d/0x300
[<ffffffff811994d7>] perf_pmu_enable.part.81+0x7/0x10
[<ffffffff8119cb70>] perf_mux_hrtimer_handler+0x200/0x280
[<ffffffff8119c970>] ? __perf_install_in_context+0xc0/0xc0
[<ffffffff8110f92d>] __hrtimer_run_queues+0xfd/0x280
[<ffffffff811100d8>] hrtimer_interrupt+0xa8/0x190
[<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0
[<ffffffff81051bd8>] local_apic_timer_interrupt+0x38/0x60
[<ffffffff817af01d>] smp_apic_timer_interrupt+0x3d/0x50
[<ffffffff817ad15c>] apic_timer_interrupt+0x8c/0xa0
<EOI> [<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0
[<ffffffff81123de5>] ? smp_call_function_single+0xd5/0x130
[<ffffffff81123ddb>] ? smp_call_function_single+0xcb/0x130
[<ffffffff81199080>] ? __perf_read_group_add.part.61+0x1a0/0x1a0
[<ffffffff8119765a>] event_function_call+0x10a/0x120
[<ffffffff8119c660>] ? ctx_resched+0x90/0x90
[<ffffffff811971e0>] ? cpu_clock_event_read+0x30/0x30
[<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60
[<ffffffff8119772b>] _perf_event_enable+0x5b/0x70
[<ffffffff81197388>] perf_event_for_each_child+0x38/0xa0
[<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60
[<ffffffff811a0ffd>] perf_ioctl+0x12d/0x3c0
[<ffffffff8134d855>] ? selinux_file_ioctl+0x95/0x1e0
[<ffffffff8124a3a1>] do_vfs_ioctl+0xa1/0x5a0
[<ffffffff81036d29>] ? sched_clock+0x9/0x10
[<ffffffff8124a919>] SyS_ioctl+0x79/0x90
[<ffffffff817ac4b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
---[ end trace aef202839fe9a71d ]---
Uhhuh. NMI received for unknown reason 2d on CPU 2.
Do you have a strange power saving mode enabled?
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1457046448-6184-1-git-send-email-kan.liang@intel.com
[ Fixed various typos and other small details. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8077eca079 upstream.
This patch fixes an issue with the GLOBAL_OVERFLOW_STATUS bits on
Haswell, Broadwell and Skylake processors when using PEBS.
The SDM stipulates that when the PEBS iterrupt threshold is crossed,
an interrupt is posted and the kernel is interrupted. The kernel will
find GLOBAL_OVF_SATUS bit 62 set indicating there are PEBS records to
drain. But the bits corresponding to the actual counters should NOT be
set. The kernel follows the SDM and assumes that all PEBS events are
processed in the drain_pebs() callback. The kernel then checks for
remaining overflows on any other (non-PEBS) events and processes these
in the for_each_bit_set(&status) loop.
As it turns out, under certain conditions on HSW and later processors,
on PEBS buffer interrupt, bit 62 is set but the counter bits may be
set as well. In that case, the kernel drains PEBS and generates
SAMPLES with the EXACT tag, then it processes the counter bits, and
generates normal (non-EXACT) SAMPLES.
I ran into this problem by trying to understand why on HSW sampling on
a PEBS event was sometimes returning SAMPLES without the EXACT tag.
This should not happen on user level code because HSW has the
eventing_ip which always point to the instruction that caused the
event.
The workaround in this patch simply ensures that the bits for the
counters used for PEBS events are cleared after the PEBS buffer has
been drained. With this fix 100% of the PEBS samples on my user code
report the EXACT tag.
Before:
$ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase
$ perf report -D | fgrep SAMPLES
PERF_RECORD_SAMPLE(IP, 0x2): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y
\--- EXACT tag is missing
After:
$ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase
$ perf report -D | fgrep SAMPLES
PERF_RECORD_SAMPLE(IP, 0x4002): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y
\--- EXACT tag is set
The problem tends to appear more often when multiple PEBS events are used.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: adrian.hunter@intel.com
Cc: kan.liang@intel.com
Cc: namhyung@kernel.org
Link: http://lkml.kernel.org/r/1457034642-21837-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b02acd4e62 upstream.
If enabling the hsmci regulator on card detection, the board can reboot
on sd card insertion. Keeping the regulator always enabled fixes this
issue.
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Fixes: 8d545f32bd ("ARM: at91/dt: sama5d4 xplained: add regulators for v(q)mmc1 supplies")
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ae3fc8ea08 upstream.
If enabling the hsmci regulator on card detection, the board can reboot
on sd card insertion. Keeping the regulator always enabled fixes this
issue.
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Fixes: 1b53e3416d ("ARM: at91/dt: sama5d3 xplained: add fixed regulator for vmmc0")
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 378c6520e7 upstream.
This commit fixes the following security hole affecting systems where
all of the following conditions are fulfilled:
- The fs.suid_dumpable sysctl is set to 2.
- The kernel.core_pattern sysctl's value starts with "/". (Systems
where kernel.core_pattern starts with "|/" are not affected.)
- Unprivileged user namespace creation is permitted. (This is
true on Linux >=3.8, but some distributions disallow it by
default using a distro patch.)
Under these conditions, if a program executes under secure exec rules,
causing it to run with the SUID_DUMP_ROOT flag, then unshares its user
namespace, changes its root directory and crashes, the coredump will be
written using fsuid=0 and a path derived from kernel.core_pattern - but
this path is interpreted relative to the root directory of the process,
allowing the attacker to control where a coredump will be written with
root privileges.
To fix the security issue, always interpret core_pattern for dumps that
are written under SUID_DUMP_ROOT relative to the root directory of init.
Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f778cc6571 upstream.
read{l,w}() write{l,w}() primitives should use le{16,32}_to_cpu() and
cpu_to_le{16,32}() respectively to ensure device registers are read
correctly in Big Endian CPU configuration.
Per Arnd Bergmann
| Most drivers using readl() or readl_relaxed() expect those to perform byte
| swaps on big-endian architectures, as the registers tend to be fixed endian
This was needed for getting UART to work correctly on a Big Endian ARC.
The ARC accessors originally were fine, and the bug got introduced
inadventently by commit b8a0330239 ("ARCv2: barriers")
Fixes: b8a0330239 ("ARCv2: barriers")
Link: http://lkml.kernel.org/r/201603100845.30602.arnd@arndb.de
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Lada Trimasova <ltrimas@synopsys.com>
[vgupta: beefed up changelog, added Fixes/stable tags]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7de7ac785a upstream.
There are XCHAL_NUM_DBREAK registers, clear them all.
This also fixes cryptic assembler error message with binutils 2.25 when
XCHAL_NUM_DBREAK is 0:
as: out of memory allocating 18446744073709551575 bytes after a total
of 495616 bytes
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a67cc9aa2d upstream.
Disabling pagefault makes little sense there, preemption disabling is
what was meant.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 362014c8d9 upstream.
Simulator stdin may be connected to a file, when its end is reached
kernel hangs in infinite loop inside rs_poll, because simc_poll always
signals that descriptor 0 is readable and simc_read always returns 0.
Check simc_read return value and exit loop if it's not positive. Also
don't rewind polling timer if it's zero.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b0f84ac352 upstream.
All architectures now need ioremap_uc(), ia64 seems defines this already
through its ioremap_nocache() and it already ensures it *only* uses UC.
This is needed since v4.3 to complete an allyesconfig compile on ia64,
there were others archs that needed this, and this one seems to have
fallen through the cracks.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>