Rewrite cmpxchg8b() to not use %edi register but a generic "+m"
constraint, to increase compiler freedom in code generation and
possibly better code.
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus noticed that the 32-bit version of atomic64_read() was
being overly complex with re-reading the value and doing a
retry loop over that.
Instead we can just rely on cmpxchg8b returning either the new
value or returning the current value.
We can use any 'old' value, which will be faster as it can be
loaded via immediates. Using some value that is not equal to
the real value in memory the instruction gets faster.
This also has the advantage that the CPU could avoid dirtying
the cacheline.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus noted that the atomic64_t primitives are all inlines
currently which is crazy because these functions have a large
register footprint anyway.
Move them to a separate file: arch/x86/lib/atomic64_32.c
Also, while at it, rename all uses of 'unsigned long long' to
the much shorter u64.
This makes the appearance of the prototypes a lot nicer - and
it also uncovered a few bugs where (yet unused) API variants
had 'long' as their return type instead of u64.
[ More intrusive changes are not yet done in this patch. ]
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
LKML-Reference: <alpine.LFD.2.01.0907021653030.3210@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The discussion about using "access_ok()" in get_user_pages_fast() (see
commit 7f81890687: "x86: don't use
'access_ok()' as a range check in get_user_pages_fast()" for details and
end result), made us notice that x86-64 was really being very sloppy
about virtual address checking.
So be way more careful and straightforward about masking x86-64 virtual
addresses:
- All the VIRTUAL_MASK* variants now cover half of the address
space, it's not like we can use the full mask on a signed
integer, and the larger mask just invites mistakes when
applying it to either half of the 48-bit address space.
- /proc/kcore's kc_offset_to_vaddr() becomes a lot more
obvious when it transforms a file offset into a
(kernel-half) virtual address.
- Unify/simplify the 32-bit and 64-bit USER_DS definition to
be based on TASK_SIZE_MAX.
This cleanup and more careful/obvious user virtual address checking also
uncovered a buglet in the x86-64 implementation of strnlen_user(): it
would do an "access_ok()" check on the whole potential area, even if the
string itself was much shorter, and thus return an error even for valid
strings. Our sloppy checking had hidden this.
So this fixes 'strnlen_user()' to do this properly, the same way we
already handled user strings in 'strncpy_from_user()'. Namely by just
checking the first byte, and then relying on fault handling for the
rest. That always works, since we impose a guard page that cannot be
mapped at the end of the user space address space (and even if we
didn't, we'd have the address space hole).
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Provide for concurrent MSR writes on all the CPUs in the cpumask. Also,
add a temporary workaround for smp_call_function_many which skips the
CPU we're executing on.
Bart: zero out rv struct which is allocated on stack.
CC: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add a struct representing a 64bit MSR pair consisting of a low and high
register part and convert msr_info to use it. Also, rename msr-on-cpu.c
to msr.c.
Side note: Put the cpumask.h include in __KERNEL__ space thus fixing an
allmodconfig build failure in the headers_check target.
CC: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Impact: micro-optimization
This should slightly improve its performance.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <49B8F641.76E4.0078.0@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In general, the only definitions that assembly files can use
are in _types.S headers (where available), so convert them.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Impact: fix rare (but currently harmless) miscompile with certain configs and gcc versions
Hugh Dickins noticed that strncpy_from_user() was miscompiled
in some circumstances with gcc 4.3.
Thanks to Hugh's excellent analysis it was easy to track down.
Hugh writes:
> Try building an x86_64 defconfig 2.6.29-rc1 kernel tree,
> except not quite defconfig, switch CONFIG_PREEMPT_NONE=y
> and CONFIG_PREEMPT_VOLUNTARY off (because it expands a
> might_fault() there, which hides the issue): using a
> gcc 4.3.2 (I've checked both openSUSE 11.1 and Fedora 10).
>
> It generates the following:
>
> 0000000000000000 <__strncpy_from_user>:
> 0: 48 89 d1 mov %rdx,%rcx
> 3: 48 85 c9 test %rcx,%rcx
> 6: 74 0e je 16 <__strncpy_from_user+0x16>
> 8: ac lods %ds:(%rsi),%al
> 9: aa stos %al,%es:(%rdi)
> a: 84 c0 test %al,%al
> c: 74 05 je 13 <__strncpy_from_user+0x13>
> e: 48 ff c9 dec %rcx
> 11: 75 f5 jne 8 <__strncpy_from_user+0x8>
> 13: 48 29 c9 sub %rcx,%rcx
> 16: 48 89 c8 mov %rcx,%rax
> 19: c3 retq
>
> Observe that "sub %rcx,%rcx; mov %rcx,%rax", whereas gcc 4.2.1
> (and many other configs) say "sub %rcx,%rdx; mov %rdx,%rax".
> Isn't it returning 0 when it ought to be returning strlen?
The asm constraints for the strncpy_from_user() result were missing an
early clobber, which tells gcc that the last output arguments
are written before all input arguments are read.
Also add more early clobbers in the rest of the file and fix 32-bit
usercopy.c in the same way.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
[ since this API is rarely used and no in-kernel user relies on a 'len'
return value (they only rely on negative return values) this miscompile
was never noticed in the field. But it's worth fixing it nevertheless. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
copy_to/from_user and all its variants (except the atomic ones) can take a
page fault and perform non-trivial work like taking mmap_sem and entering
the filesyste/pagecache.
Unfortunately, this often escapes lockdep because a common pattern is to
use it to read in some arguments just set up from userspace, or write data
back to a hot buffer. In those cases, it will be unlikely for page reclaim
to get a window in to cause copy_*_user to fault.
With the new might_lock primitives, add some annotations to x86. I don't
know if I caught all possible faulting points (it's a bit of a maze, and I
didn't really look at 32-bit). But this is a starting point.
Boots and runs OK so far.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: performance optimization
I did some rebenchmarking with modern compilers and dropping
-funroll-loops makes the function consistently go faster by a few
percent. So drop that flag.
Thanks to Richard Guenther for a hint.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Remove an unnecessary level of abstraction in the msr-on-cpu library.
Although this duplicates some code, the duplicated code is less than
the additional code, and this way should be faster.
Additionally, change the order of the functions to make the regular
structure of this file more obvious.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Propagate error (-ENXIO) from smp_call_function_single(). These
errors can happen when a CPU is unplugged while the MSR driver is
open.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
movsl_mask is currently defined in arch/x86/kernel/cpu/intel.c, which
contains code specific to Intel CPUs. However, movsl_mask is used in
the non-CPU specific code in arch/x86/lib/usercopy_32.c, which breaks
the compilation when support for Intel CPUs is compiled out.
This patch solves this problem by moving movsl_mask's definition close
to its users in arch/x86/lib/usercopy_32.c.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: michael@free-electrons.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
New ALIGN_DESTINATION macro has sad typo: r8d register was used instead
of ecx in fixup section. This can be considered as a regression.
Register ecx was also wrongly loaded with value in r8d in
copy_user_nocache routine.
Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gas 2.15 complains about 32-bit registers being used in lea.
AS arch/x86/lib/copy_user_64.o
/local/scratch-2/jeremy/hg/xen/paravirt/linux/arch/x86/lib/copy_user_64.S: Assembler messages:
/local/scratch-2/jeremy/hg/xen/paravirt/linux/arch/x86/lib/copy_user_64.S:188: Error: `(%edx,%ecx,8)' is not a valid 64 bit base/index expression
/local/scratch-2/jeremy/hg/xen/paravirt/linux/arch/x86/lib/copy_user_64.S:257: Error: `(%edx,%ecx,8)' is not a valid 64 bit base/index expression
AS arch/x86/lib/copy_user_nocache_64.o
/local/scratch-2/jeremy/hg/xen/paravirt/linux/arch/x86/lib/copy_user_nocache_64.S: Assembler messages:
/local/scratch-2/jeremy/hg/xen/paravirt/linux/arch/x86/lib/copy_user_nocache_64.S:107: Error: `(%edx,%ecx,8)' is not a valid 64 bit base/index expression
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
putuser_32.S and putuser_64.S are merged into putuser.S.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In putuser_32.S and putuser_64.S, replace things like .quad, .long,
and explicit references to [r|e]ax for the apropriate macros
in asm/asm.h.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Remove them where unambiguous.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In putuser_64.S, do it the i386 way, and replace the code
in beginning and end of functions with macros, since it's
always the same thing. Save lines.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of operating over a register we need to put back
into normal state afterwards (the memory position), just
sub from rbx, which is trashed anyway. We can save a few instructions.
Also, this is the i386 way.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is consistent with i386 usage.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of clobbering r8, clobber rbx, which is the i386 way.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Clobber it in the inline asm macros, and let the compiler do this for us.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
getuser_32.S and getuser_64.S are merged into getuser.S.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Switch .long and .quad with _ASM_PTR in getuser*.S.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There are situations in which the architecture wants to use the
register that represents its word-size, whatever it is. For those,
introduce __ASM_REG in asm.h, along with the first users _ASM_AX
and _ASM_DX. They have users waiting for it, namely the getuser
functions.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The instructions access registers, so the size is unambiguous.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is for consistency with i386.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of doing a sub after the addition, use the
offset directly at the memory operand of the mov instructions.
This is the way i386 do.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since the instructions refer to registers, they'll be able
to figure it out.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There's really no reason to clobber r8 or pass the address in rcx.
We can safely use only two registers (which we already have to touch anyway)
to do the job.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
delay_32.c, delay_64.c are now equal, and are integrated into delay.c.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For x86_64, we can't just use %0, as it would
generate a mul against rdx, which is not really what we
want (note the ">> 32" in x86_64 version).
Using a u64 variable with a shift in i386 generates bad code,
so the solution is to explicitly use %%edx in inline assembly
for both.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This way we achieve the same code for both arches.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is for consistency with i386. We call use_tsc_delay()
at tsc initialization for x86_64, so we'll be always using it.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>