Right now there's a confusing mixture of 'offset' and 'size' parameters:
- __copy_xstate_to_*() input parameter 'end_pos' not not really an offset,
but the full size of the copy to be performed.
- input parameter 'count' to copy_xstate_to_*() shadows that of
__copy_xstate_to_*()'s 'count' parameter name - but the roles
are different: the first one is the total number of bytes to
be copied, while the second one is a partial copy size.
To unconfuse all this, use a consistent set of parameter names:
- 'size' is the partial copy size within a single xstate component
- 'size_total' is the total copy requested
- 'offset_start' is the requested starting offset.
- 'offset' is the offset within an xstate component.
No change in functionality.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/20170923130016.21448-9-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The 'copyin/copyout' nomenclature needlessly departs from what the modern FPU code
uses, which is:
copy_fpregs_to_fpstate()
copy_fpstate_to_sigframe()
copy_fregs_to_user()
copy_fxregs_to_kernel()
copy_fxregs_to_user()
copy_kernel_to_fpregs()
copy_kernel_to_fregs()
copy_kernel_to_fxregs()
copy_kernel_to_xregs()
copy_user_to_fregs()
copy_user_to_fxregs()
copy_user_to_xregs()
copy_xregs_to_kernel()
copy_xregs_to_user()
I.e. according to this pattern, the following rename should be done:
copyin_to_xsaves() -> copy_user_to_xstate()
copyout_from_xsaves() -> copy_xstate_to_user()
or, if we want to be pedantic, denote that that the user-space format is ptrace:
copyin_to_xsaves() -> copy_user_ptrace_to_xstate()
copyout_from_xsaves() -> copy_xstate_to_user_ptrace()
But I'd suggest the shorter, non-pedantic name.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/20170923130016.21448-2-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reported by syzkaller:
BUG: unable to handle kernel paging request at ffffffffc07f6a2e
IP: report_bug+0x94/0x120
PGD 348e12067
P4D 348e12067
PUD 348e14067
PMD 3cbd84067
PTE 80000003f7e87161
Oops: 0003 [#1] SMP
CPU: 2 PID: 7091 Comm: kvm_load_guest_ Tainted: G OE 4.11.0+ #8
task: ffff92fdfb525400 task.stack: ffffbda6c3d04000
RIP: 0010:report_bug+0x94/0x120
RSP: 0018:ffffbda6c3d07b20 EFLAGS: 00010202
do_trap+0x156/0x170
do_error_trap+0xa3/0x170
? kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm]
? mark_held_locks+0x79/0xa0
? retint_kernel+0x10/0x10
? trace_hardirqs_off_thunk+0x1a/0x1c
do_invalid_op+0x20/0x30
invalid_op+0x1e/0x30
RIP: 0010:kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm]
? kvm_load_guest_fpu.part.175+0x1c/0x170 [kvm]
kvm_arch_vcpu_ioctl_run+0xed6/0x1b70 [kvm]
kvm_vcpu_ioctl+0x384/0x780 [kvm]
? kvm_vcpu_ioctl+0x384/0x780 [kvm]
? sched_clock+0x13/0x20
? __do_page_fault+0x2a0/0x550
do_vfs_ioctl+0xa4/0x700
? up_read+0x1f/0x40
? __do_page_fault+0x2a0/0x550
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x23/0xc2
SDM mentioned that "The MXCSR has several reserved bits, and attempting to write
a 1 to any of these bits will cause a general-protection exception(#GP) to be
generated". The syzkaller forks' testcase overrides xsave area w/ random values
and steps on the reserved bits of MXCSR register. The damaged MXCSR register
values of guest will be restored to SSEx MXCSR register before vmentry. This
patch fixes it by catching userspace override MXCSR register reserved bits w/
random values and bails out immediately.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
We are going to split <linux/sched/task_stack.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/task_stack.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We are going to split <linux/sched/task.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/task.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull x86 fpu updates from Ingo Molnar:
"The main changes relate to fixes between (lack of) CPUID and FPU
detection that should only affect old or weird CPUs, by Andy
Lutomirski"
* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu: Fix the "Giving up, no FPU found" test
x86/fpu: Fix CPUID-less FPU detection
x86/fpu: Fix "x86/fpu: Legacy x87 FPU detected" message
x86/cpu: Re-apply forced caps every time CPU caps are re-read
x86/cpu: Factor out application of forced CPU caps
x86/cpu: Add X86_FEATURE_CPUID
x86/fpu/xstate: Move XSAVES state init to a function
Pull x86 cpufeature updates from Ingo Molnar:
"The main changes in this cycle were related to enable ring-3
MONITOR/MWAIT instructions support on supported CPUs, by Grzegorz
Andrejczuk and Piotr Luc"
* 'x86-cpufeature-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpufeature: Move RING3MWAIT feature to avoid conflicts
x86/cpufeature: Enable RING3MWAIT for Knights Mill
x86/cpufeature: Enable RING3MWAIT for Knights Landing
x86/cpufeature: Add RING3MWAIT to CPU features
x86/elf: Add HWCAP2 to expose ring 3 MONITOR/MWAIT
x86/msr: Add MSR_MISC_FEATURE_ENABLES and RING3MWAIT bit
x86/cpufeature: Add AVX512_VPOPCNTDQ feature
The compacted-format XSAVES area is determined at boot time and
never changed after. The field xsave.header.xcomp_bv indicates
which components are in the fixed XSAVES format.
In fpstate_init() we did not set xcomp_bv to reflect the XSAVES
format since at the time there is no valid data.
However, after we do copy_init_fpstate_to_fpregs() in fpu__clear(),
as in commit:
b22cbe404a x86/fpu: Fix invalid FPU ptrace state after execve()
and when __fpu_restore_sig() does fpu__restore() for a COMPAT-mode
app, a #GP occurs. This can be easily triggered by doing valgrind on
a COMPAT-mode "Hello World," as reported by Joakim Tjernlund and
others:
https://bugzilla.kernel.org/show_bug.cgi?id=190061
Fix it by setting xcomp_bv correctly.
This patch also moves the xcomp_bv initialization to the proper
place, which was in copyin_to_xsaves() as of:
4c833368f0 x86/fpu: Set the xcomp_bv when we fake up a XSAVES area
which fixed the bug too, but it's more efficient and cleaner to
initialize things once per boot, not for every signal handling
operation.
Reported-by: Kevin Hao <haokexin@gmail.com>
Reported-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: haokexin@gmail.com
Link: http://lkml.kernel.org/r/1485212084-4418-1-git-send-email-yu-cheng.yu@intel.com
[ Combined it with 4c833368f0. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Vector population count instructions for dwords and qwords are going to be
available in future Intel Xeon & Xeon Phi processors. Bit 14 of
CPUID[level:0x07, ECX] indicates that the instructions are supported by a
processor.
The specification can be found in the Intel Software Developer Manual (SDM)
and in the Instruction Set Extensions Programming Reference (ISE).
Populate the feature bit and clear it when xsave is disabled.
Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Link: http://lkml.kernel.org/r/20170110173403.6010-2-piotr.luc@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull x86 FPU updates from Ingo Molnar:
"The main changes in this cycle were:
- do a large round of simplifications after all CPUs do 'eager' FPU
context switching in v4.9: remove CR0 twiddling, remove leftover
eager/lazy bts, etc (Andy Lutomirski)
- more FPU code simplifications: remove struct fpu::counter, clarify
nomenclature, remove unnecessary arguments/functions and better
structure the code (Rik van Riel)"
* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu: Remove clts()
x86/fpu: Remove stts()
x86/fpu: Handle #NM without FPU emulation as an error
x86/fpu, lguest: Remove CR0.TS support
x86/fpu, kvm: Remove host CR0.TS manipulation
x86/fpu: Remove irq_ts_save() and irq_ts_restore()
x86/fpu: Stop saving and restoring CR0.TS in fpu__init_check_bugs()
x86/fpu: Get rid of two redundant clts() calls
x86/fpu: Finish excising 'eagerfpu'
x86/fpu: Split old_fpu & new_fpu handling into separate functions
x86/fpu: Remove 'cpu' argument from __cpu_invalidate_fpregs_state()
x86/fpu: Split old & new FPU code paths
x86/fpu: Remove __fpregs_(de)activate()
x86/fpu: Rename lazy restore functions to "register state valid"
x86/fpu, kvm: Remove KVM vcpu->fpu_counter
x86/fpu: Remove struct fpu::counter
x86/fpu: Remove use_eager_fpu()
x86/fpu: Remove the XFEATURE_MASK_EAGER/LAZY distinction
x86/fpu: Hard-disable lazy FPU mode
x86/crypto, x86/fpu: Remove X86_FEATURE_EAGER_FPU #ifdef from the crc32c code
Pull x86 asm updates from Ingo Molnar:
"The main changes in this development cycle were:
- a large number of call stack dumping/printing improvements: higher
robustness, better cross-context dumping, improved output, etc.
(Josh Poimboeuf)
- vDSO getcpu() performance improvement for future Intel CPUs with
the RDPID instruction (Andy Lutomirski)
- add two new Intel AVX512 features and the CPUID support
infrastructure for it: AVX512IFMA and AVX512VBMI. (Gayatri Kammela,
He Chen)
- more copy-user unification (Borislav Petkov)
- entry code assembly macro simplifications (Alexander Kuleshov)
- vDSO C/R support improvements (Dmitry Safonov)
- misc fixes and cleanups (Borislav Petkov, Paul Bolle)"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
scripts/decode_stacktrace.sh: Fix address line detection on x86
x86/boot/64: Use defines for page size
x86/dumpstack: Make stack name tags more comprehensible
selftests/x86: Add test_vdso to test getcpu()
x86/vdso: Use RDPID in preference to LSL when available
x86/dumpstack: Handle NULL stack pointer in show_trace_log_lvl()
x86/cpufeatures: Enable new AVX512 cpu features
x86/cpuid: Provide get_scattered_cpuid_leaf()
x86/cpuid: Cleanup cpuid_regs definitions
x86/copy_user: Unify the code by removing the 64-bit asm _copy_*_user() variants
x86/unwind: Ensure stack grows down
x86/vdso: Set vDSO pointer only after success
x86/prctl/uapi: Remove #ifdef for CHECKPOINT_RESTORE
x86/unwind: Detect bad stack return address
x86/dumpstack: Warn on stack recursion
x86/unwind: Warn on bad frame pointer
x86/decoder: Use stderr if insn sanity test fails
x86/decoder: Use stdout if insn decoder test is successful
mm/page_alloc: Remove kernel address exposure in free_reserved_area()
x86/dumpstack: Remove raw stack dump
...
AVX512_4VNNIW - Vector instructions for deep learning enhanced word
variable precision.
AVX512_4FMAPS - Vector instructions for deep learning floating-point
single precision.
These new instructions are to be used in future Intel Xeon & Xeon Phi
processors. The bits 2&3 of CPUID[level:0x07, EDX] inform that new
instructions are supported by a processor.
The spec can be found in the Intel Software Developer Manual (SDM) or in
the Instruction Set Extensions Programming Reference (ISE).
Define new feature flags to enumerate the new instructions in /proc/cpuinfo
accordingly to CPUID bits and add the required xsave extensions which are
required for proper operation.
Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20161018150111.29926-1-piotr.luc@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull protection keys syscall interface from Thomas Gleixner:
"This is the final step of Protection Keys support which adds the
syscalls so user space can actually allocate keys and protect memory
areas with them. Details and usage examples can be found in the
documentation.
The mm side of this has been acked by Mel"
* 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/pkeys: Update documentation
x86/mm/pkeys: Do not skip PKRU register if debug registers are not used
x86/pkeys: Fix pkeys build breakage for some non-x86 arches
x86/pkeys: Add self-tests
x86/pkeys: Allow configuration of init_pkru
x86/pkeys: Default to a restrictive init PKRU
pkeys: Add details of system call use to Documentation/
generic syscalls: Wire up memory protection keys syscalls
x86: Wire up protection keys system calls
x86/pkeys: Allocation/free syscalls
x86/pkeys: Make mprotect_key() mask off additional vm_flags
mm: Implement new pkey_mprotect() system call
x86/pkeys: Add fault handling for PF_PK page fault bit