This patch enables the use of the Advanced SIMD (NEON) extension on
ARMv7. The NEON technology is a 64/128-bit hybrid SIMD architecture
for accelerating the performance of multimedia and signal processing
applications. The extension shares the registers with the VFP unit and
enabling/disabling and saving/restoring follow the same rules. In
addition, there are instructions that do not have the appropriate CP
number encoded, the checks being made in the call_fpe function.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The ARM __kuser_cmpxchg routine is meant to implement an atomic cmpxchg
in user space. It however can produce spurious false negative if a
processor exception occurs in the middle of the operation. Normally
this is not a problem since cmpxchg is typically called in a loop until
it succeeds to implement an atomic increment for example.
Some use cases which don't involve a loop require that the operation be
100% reliable though. This patch changes the implementation so to
reattempt the operation after an exception has occurred in the critical
section rather than abort it.
Here's a simple program to test the fix (don't use CONFIG_NO_HZ in your
kernel as this depends on a sufficiently high interrupt rate):
#include <stdio.h>
typedef int (__kernel_cmpxchg_t)(int oldval, int newval, int *ptr);
#define __kernel_cmpxchg (*(__kernel_cmpxchg_t *)0xffff0fc0)
int main()
{
int i, x = 0;
for (i = 0; i < 100000000; i++) {
int v = x;
if (__kernel_cmpxchg(v, v+1, &x))
printf("failed at %d: %d vs %d\n", i, v, x);
}
printf("done with %d vs %d\n", i, x);
return 0;
}
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
get_irqnr_preamble allows machines to take some action before entering the
get_irqnr_and_base loop. On iop we enable cp6 access.
arch_ret_to_user is added to the userspace return path to allow individual
architectures to take actions, like disabling coprocessor access, before
the final return to userspace.
Per Nicolas Pitre's note, there is no need to cp_wait on the return to user
as the latency to return is sufficient.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
bad_mode() currently prints the mode which caused the exception, and
then causes an oops dump to be printed which again displays this
information (since the CPSR in the struct pt_regs is correct.) This
leads to processor_modes[] being shared between traps.c and process.c
with a local declaration of it.
We can clean this up by moving processor_modes[] to process.c and
removing the duplication, resulting in processor_modes[] becoming
static.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
If the kernel attempts to execute a CP1 or CP2 instruction and it
aborts, and a FP emulator is not loaded, we try to return as if to
a user context, instead of the proper kernel context. Since the
fault came from kernel mode, we must use the kernel return paths.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
XScale cores either have a DSP coprocessor (which contains a single
40 bit accumulator register), or an iWMMXt coprocessor (which contains
eight 64 bit registers.)
Because of the small amount of state in the DSP coprocessor, access to
the DSP coprocessor (CP0) is always enabled, and DSP context switching
is done unconditionally on every task switch. Access to the iWMMXt
coprocessor (CP0/CP1) is enabled only when an iWMMXt instruction is
first issued, and iWMMXt context switching is done lazily.
CONFIG_IWMMXT is supposed to mean 'the cpu we will be running on will
have iWMMXt support', but boards are supposed to select this config
symbol by hand, and at least one pxa27x board doesn't get this right,
so on that board, proc-xscale.S will incorrectly assume that we have a
DSP coprocessor, enable CP0 on boot, and we will then only save the
first iWMMXt register (wR0) on context switches, which is Bad.
This patch redefines CONFIG_IWMMXT as 'the cpu we will be running on
might have iWMMXt support, and we will enable iWMMXt context switching
if it does.' This means that with this patch, running a CONFIG_IWMMXT=n
kernel on an iWMMXt-capable CPU will no longer potentially corrupt iWMMXt
state over context switches, and running a CONFIG_IWMMXT=y kernel on a
non-iWMMXt capable CPU will still do DSP context save/restore.
These changes should make iWMMXt work on PXA3xx, and as a side effect,
enable proper acc0 save/restore on non-iWMMXt capable xsc3 cores such
as IOP13xx and IXP23xx (which will not have CONFIG_CPU_XSCALE defined),
as well as setting and using HWCAP_IWMMXT properly.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
The userspace helpers in clean/arch/arm/kernel/entry-armv.S are called
directly in/from userspace. They need to cope with being called from
Thumb code.
Patch below uses the bx interworking instruction when
CONFIG_ARM_THUMB=y.
Based on an earlier patch from Paul Brook <paul@codesourcery.com>
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (44 commits)
[ARM] 3541/2: workaround for PXA27x erratum E7
[ARM] nommu: provide a way for correct control register value selection
[ARM] 3705/1: add supersection support to ioremap()
[ARM] 3707/1: iwmmxt: use the generic thread notifier infrastructure
[ARM] 3706/2: ep93xx: add cirrus logic edb9315a support
[ARM] 3704/1: format IOP Kconfig with tabs, create more consistency
[ARM] 3703/1: Add help description for ARCH_EP80219
[ARM] 3678/1: MMC: Make OMAP MMC work
[ARM] 3677/1: OMAP: Update H2 defconfig
[ARM] 3676/1: ARM: OMAP: Fix dmtimers and timer32k to compile on OMAP1
[ARM] Add section support to ioremap
[ARM] Fix sa11x0 SDRAM selection
[ARM] Set bit 4 on section mappings correctly depending on CPU
[ARM] 3666/1: TRIZEPS4 [1/5] core
ARM: OMAP: Multiplexing for 24xx GPMC wait pin monitoring
ARM: OMAP: Fix SRAM to use MT_MEMORY instead of MT_DEVICE
ARM: OMAP: Update dmtimers
ARM: OMAP: Make clock variables static
ARM: OMAP: Fix GPMC compilation when DEBUG is defined
ARM: OMAP: Mux updates for external DMA and GPIO
...
Patch from Lennert Buytenhek
This patch makes the iWMMXt context switch hook use the generic
thread notifier infrastructure that was recently merged in commit
d6551e884c.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Lennert Buytenhek
Add the necessary kernel bits for crunch task switching.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Some machine classes need to allow VFP support to be built into the
kernel, but still allow the kernel to run even though VFP isn't
present. Unfortunately, the kernel hard-codes VFP instructions
into the thread switch, which prevents this being run-time selectable.
Solve this by introducing a notifier which things such as VFP can
hook into to be informed of events which affect the VFP subsystem
(eg, creation and destruction of threads, switches between threads.)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Paul Brook
The example code in the source documentation for __kernel_dmb
clobbers r0 but doesn't list it the asm clobber list.
Signed-off-by: Paul Brook <paul@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Allow the individual coprocessor handlers to decide when to enable
interrupts, rather than unconditionally enabling them.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
negative
Patch from Nicolas Pitre
The pre ARMv5 implementation can be aborted if an exception occurs in
the middle of it. Because of that, the ARMv6 implementation doesn't
re-attempt the operation on a failed strex either. Let's make this
transient nature of such a false positive more explicit in the
definition.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
The cmpxchg emulation on pre-ARMv5 relies on user code executed from a
kernel address. If the operation cannot complete atomically, it is
aborted from the usr_entry macro by clearing the Z flag. This clearing
of the Z flag is done whenever the user pc is above TASK_SIZE.
However this "pc >= TASK_SIZE" test cannot work in the non MMU case.
Worse: the current code will corrupt the Z flag on every entry to the
kernel.
Let's disable it in the non MMU case for now. Using NPTL on non MMU
targets needs to be worked out anyway.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
This is kernel provided user space code.
Since a syscall is used, it has to be updated to work with EABI.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
The ARM EABI says that the stack pointer has to be 64-bit aligned for
reasons already mentioned in patch #3101 when calling C functions.
We therefore must verify and adjust sp accordingly when taking an
exception from kernel mode since sp might not necessarily be 64-bit
aligned if the exception occurs in the middle of a kernel function.
If the exception occurs while in user mode then no sp fixup is needed as
long as sizeof(struct pt_regs) as well as any additional syscall data
stack space remain multiples of 8.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds register switch support in nommu mode.
Signed-off-by: Hyok S. Choi <hyok.choi@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
arch/arm/kernel/entry-armv.S has contained a comment suggesting
that asm/hardware.h and asm/arch/irqs.h should be moved into the
asm/arch/entry-macro.S include. So move the includes to these
two files as required.
Add missing includes (asm/hardware.h, asm/io.h) to asm/arch/system.h
includes which use those facilities, and remove asm/io.h from
kernel/process.c.
Remove other unnecessary includes from arch/arm/kernel, arch/arm/mm
and arch/arm/mach-footbridge.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
Strictly speaking, the NPTL kernel helpers are required for pre ARMv6
only. They are available on ARMv6+ as well for obvious compatibility
reasons. However there are cases where extra memory barriers are needed
when using an SMP ARMv6 machine but not on pre-ARMv6.
This patch adds a memory barrier kernel helper that glibc can use as
needed for pre-ARMv6 binaries to be forward compatible with an SMP
kernel on ARMv6, as well as the necessary dmb instructions to the
cmpxchg helper.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Acked-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add infrastructure for supporting per-cpu local timers to update
the profiling information and update system time accounting.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
Since we know the value of cpsr on entry, we can replace the bic+orr with
a single eor. Also remove a possible result delay (at least on XScale).
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
This patch allows for assorted type of cleanups by letting assembly code
use the same set of defines for constant values and avoid duplicated
definitions that might not always be in sync, or that might simply be
confusing due to the different names for the same thing.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We accidentally corrupted the TLS value when clearing out the ARMv6
exclusive monitor. Avoid doing so.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
Not that there might be many of them on the planet, but at least RMK
apparently has one.
Signed-off-by: Nicolas Pitre
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The current vector entry system does not allow for SMP. In
order to work around this, we need to eliminate our reliance
on the fixed save areas, which breaks the way we enable
alignment traps. This patch changes the way we handle the
save areas such that we can have one per CPU.
Signed-off-by: Russell King <rmk@arm.linux.org.uk>
The current vector entry system does not allow for SMP. In
order to work around this, we need to eliminate our reliance
on the fixed save areas, which breaks the way we enable
alignment traps. This patch makes the alignment trap enable
code independent of the way we handle the save areas.
Signed-off-by: Russell King <rmk@arm.linux.org.uk>
By changing r9 -> r8 and r8 to 'tsk' (r9) we are able to remove
one instruction from the preempt path.
Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Patch from Nicolas Pitre
This better express things, and should cover RMK's weird SMP toys.
Signed-off-by: Nicolas Pitre
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Nicolas Pitre
This patch entirely reworks the kernel assistance for NPTL on ARM.
In particular this provides an efficient way to retrieve the TLS
value and perform atomic operations without any instruction emulation
nor special system call. This even allows for pre ARMv6 binaries to
be forward compatible with SMP systems without any penalty.
The problematic and performance critical operations are performed
through segment of kernel provided user code reachable from user space
at a fixed address in kernel memory. Those fixed entry points are
within the vector page so we basically get it for free as no extra
memory page is required and nothing else may be mapped at that
location anyway.
This is different from (but doesn't preclude) a full blown VDSO
implementation, however a VDSO would prevent some assembly tricks with
constants that allows for efficient branching to those code segments.
And since those code segments only use a few cycles before returning to
user code, the overhead of a VDSO far call would add a significant
overhead to such minimalistic operations.
The ARM_NR_set_tls syscall also changed number. This is done for two
reasons:
1) this patch changes the way the TLS value was previously meant to be
retrieved, therefore we ensure whatever library using the old way
gets fixed (they only exist in private tree at the moment since the
NPTL work is still progressing).
2) the previous number was allocated in a range causing an undefined
instruction trap on kernels not supporting that syscall and it was
determined that allocating it in a range returning -ENOSYS would be
much nicer for libraries trying to determine if the feature is
present or not.
Signed-off-by: Nicolas Pitre
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
SVC_MODE reflects the MODE_SVC definition in asm/ptrace.h. Use
the asm/ptrace.h definition instead, and remove SVC_MODE.
Signed-off-by: Russell King <rmk@arm.linux.org.uk>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!