Wire up the io_pgetevents syscall that was introduced by commit
7a074e96de ("aio: implement io_pgetevents").
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/19593/
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Wire up the restartable sequences (rseq) syscall for MIPS. This was
introduced by commit d7822b1e24 ("rseq: Introduce restartable
sequences system call") & MIPS now supports the prerequisites.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reviewed-by: James Hogan <jhogan@kernel.org>
Patchwork: https://patchwork.linux-mips.org/patch/19525/
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
MIPS is the weirdest case for sysvipc, because each of the
three data structures is done differently:
* msqid64_ds has padding in the right place so we could in theory
extend this one to just have 64-bit values instead of time_t.
As this does not work for most of the other combinations,
we just handle it in the common manner though.
* semid64_ds has no padding for 64-bit time_t, but has two reserved
'long' fields, which are sufficient to extend the sem_otime
and sem_ctime fields to 64 bit. In order to do this, the libc
implementation will have to copy the data into another structure
that has the fields in a different order. MIPS is the only
architecture with this problem, so this is best done in MIPS
specific libc code.
* shmid64_ds is slightly worse than that, because it has three
time_t fields but only two unused 32-bit words. As a workaround,
we extend each field only by 16 bits, ending up with 48-bit
timestamps that user space again has to work around by itself.
The compat versions of the data structures are changed in the
same way.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Patch series "mm: introduce MAP_FIXED_NOREPLACE", v2.
This has started as a follow up discussion [3][4] resulting in the
runtime failure caused by hardening patch [5] which removes MAP_FIXED
from the elf loader because MAP_FIXED is inherently dangerous as it
might silently clobber an existing underlying mapping (e.g. stack).
The reason for the failure is that some architectures enforce an
alignment for the given address hint without MAP_FIXED used (e.g. for
shared or file backed mappings).
One way around this would be excluding those archs which do alignment
tricks from the hardening [6]. The patch is really trivial but it has
been objected, rightfully so, that this screams for a more generic
solution. We basically want a non-destructive MAP_FIXED.
The first patch introduced MAP_FIXED_NOREPLACE which enforces the given
address but unlike MAP_FIXED it fails with EEXIST if the given range
conflicts with an existing one. The flag is introduced as a completely
new one rather than a MAP_FIXED extension because of the backward
compatibility. We really want a never-clobber semantic even on older
kernels which do not recognize the flag. Unfortunately mmap sucks
wrt flags evaluation because we do not EINVAL on unknown flags. On
those kernels we would simply use the traditional hint based semantic so
the caller can still get a different address (which sucks) but at least
not silently corrupt an existing mapping. I do not see a good way
around that. Except we won't export expose the new semantic to the
userspace at all.
It seems there are users who would like to have something like that.
Jemalloc has been mentioned by Michael Ellerman [7]
Florian Weimer has mentioned the following:
: glibc ld.so currently maps DSOs without hints. This means that the kernel
: will map right next to each other, and the offsets between them a completely
: predictable. We would like to change that and supply a random address in a
: window of the address space. If there is a conflict, we do not want the
: kernel to pick a non-random address. Instead, we would try again with a
: random address.
John Hubbard has mentioned CUDA example
: a) Searches /proc/<pid>/maps for a "suitable" region of available
: VA space. "Suitable" generally means it has to have a base address
: within a certain limited range (a particular device model might
: have odd limitations, for example), it has to be large enough, and
: alignment has to be large enough (again, various devices may have
: constraints that lead us to do this).
:
: This is of course subject to races with other threads in the process.
:
: Let's say it finds a region starting at va.
:
: b) Next it does:
: p = mmap(va, ...)
:
: *without* setting MAP_FIXED, of course (so va is just a hint), to
: attempt to safely reserve that region. If p != va, then in most cases,
: this is a failure (almost certainly due to another thread getting a
: mapping from that region before we did), and so this layer now has to
: call munmap(), before returning a "failure: retry" to upper layers.
:
: IMPROVEMENT: --> if instead, we could call this:
:
: p = mmap(va, ... MAP_FIXED_NOREPLACE ...)
:
: , then we could skip the munmap() call upon failure. This
: is a small thing, but it is useful here. (Thanks to Piotr
: Jaroszynski and Mark Hairgrove for helping me get that detail
: exactly right, btw.)
:
: c) After that, CUDA suballocates from p, via:
:
: q = mmap(sub_region_start, ... MAP_FIXED ...)
:
: Interestingly enough, "freeing" is also done via MAP_FIXED, and
: setting PROT_NONE to the subregion. Anyway, I just included (c) for
: general interest.
Atomic address range probing in the multithreaded programs in general
sounds like an interesting thing to me.
The second patch simply replaces MAP_FIXED use in elf loader by
MAP_FIXED_NOREPLACE. I believe other places which rely on MAP_FIXED
should follow. Actually real MAP_FIXED usages should be docummented
properly and they should be more of an exception.
[1] http://lkml.kernel.org/r/20171116101900.13621-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/20171129144219.22867-1-mhocko@kernel.org
[3] http://lkml.kernel.org/r/20171107162217.382cd754@canb.auug.org.au
[4] http://lkml.kernel.org/r/1510048229.12079.7.camel@abdul.in.ibm.com
[5] http://lkml.kernel.org/r/20171023082608.6167-1-mhocko@kernel.org
[6] http://lkml.kernel.org/r/20171113094203.aofz2e7kueitk55y@dhcp22.suse.cz
[7] http://lkml.kernel.org/r/87efp1w7vy.fsf@concordia.ellerman.id.au
This patch (of 2):
MAP_FIXED is used quite often to enforce mapping at the particular range.
The main problem of this flag is, however, that it is inherently dangerous
because it unmaps existing mappings covered by the requested range. This
can cause silent memory corruptions. Some of them even with serious
security implications. While the current semantic might be really
desiderable in many cases there are others which would want to enforce the
given range but rather see a failure than a silent memory corruption on a
clashing range. Please note that there is no guarantee that a given range
is obeyed by the mmap even when it is free - e.g. arch specific code is
allowed to apply an alignment.
Introduce a new MAP_FIXED_NOREPLACE flag for mmap to achieve this
behavior. It has the same semantic as MAP_FIXED wrt. the given address
request with a single exception that it fails with EEXIST if the requested
address is already covered by an existing mapping. We still do rely on
get_unmaped_area to handle all the arch specific MAP_FIXED treatment and
check for a conflicting vma after it returns.
The flag is introduced as a completely new one rather than a MAP_FIXED
extension because of the backward compatibility. We really want a
never-clobber semantic even on older kernels which do not recognize the
flag. Unfortunately mmap sucks wrt. flags evaluation because we do not
EINVAL on unknown flags. On those kernels we would simply use the
traditional hint based semantic so the caller can still get a different
address (which sucks) but at least not silently corrupt an existing
mapping. I do not see a good way around that.
[mpe@ellerman.id.au: fix whitespace]
[fail on clashing range with EEXIST as per Florian Weimer]
[set MAP_FIXED before round_hint_to_min as per Khalid Aziz]
Link: http://lkml.kernel.org/r/20171213092550.2774-2-mhocko@kernel.org
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Russell King - ARM Linux <linux@armlinux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Jason Evans <jasone@google.com>
Cc: David Goldblatt <davidtgoldblatt@gmail.com>
Cc: Edward Tomasz Napierała <trasz@FreeBSD.org>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Indicate that CRC32 and CRC32C instuctions are supported by the CPU
through elf_hwcap flags.
This will be used by a follow-up commit that introduces crc32(c) crypto
acceleration modules and is required by GENERIC_CPU_AUTOPROBE feature.
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@mips.com>
Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/18600/
except, again, POLLFREE and POLL_BUSY_LOOP.
With this, we finally get to the promised end result:
- POLL{IN,OUT,...} are plain integers and *not* in __poll_t, so any
stray instances of ->poll() still using those will be caught by
sparse.
- eventpoll.c and select.c warning-free wrt __poll_t
- no more kernel-side definitions of POLL... - userland ones are
visible through the entire kernel (and used pretty much only for
mangle/demangle)
- same behavior as after the first series (i.e. sparc et.al. epoll(2)
working correctly).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull poll annotations from Al Viro:
"This introduces a __bitwise type for POLL### bitmap, and propagates
the annotations through the tree. Most of that stuff is as simple as
'make ->poll() instances return __poll_t and do the same to local
variables used to hold the future return value'.
Some of the obvious brainos found in process are fixed (e.g. POLLIN
misspelled as POLL_IN). At that point the amount of sparse warnings is
low and most of them are for genuine bugs - e.g. ->poll() instance
deciding to return -EINVAL instead of a bitmap. I hadn't touched those
in this series - it's large enough as it is.
Another problem it has caught was eventpoll() ABI mess; select.c and
eventpoll.c assumed that corresponding POLL### and EPOLL### were
equal. That's true for some, but not all of them - EPOLL### are
arch-independent, but POLL### are not.
The last commit in this series separates userland POLL### values from
the (now arch-independent) kernel-side ones, converting between them
in the few places where they are copied to/from userland. AFAICS, this
is the least disruptive fix preserving poll(2) ABI and making epoll()
work on all architectures.
As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
it will trigger only on what would've triggered EPOLLWRBAND on other
architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
at all on sparc. With this patch they should work consistently on all
architectures"
* 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
make kernel-side POLL... arch-independent
eventpoll: no need to mask the result of epi_item_poll() again
eventpoll: constify struct epoll_event pointers
debugging printk in sg_poll() uses %x to print POLL... bitmap
annotate poll(2) guts
9p: untangle ->poll() mess
->si_band gets POLL... bitmap stored into a user-visible long field
ring_buffer_poll_wait() return value used as return value of ->poll()
the rest of drivers/*: annotate ->poll() instances
media: annotate ->poll() instances
fs: annotate ->poll() instances
ipc, kernel, mm: annotate ->poll() instances
net: annotate ->poll() instances
apparmor: annotate ->poll() instances
tomoyo: annotate ->poll() instances
sound: annotate ->poll() instances
acpi: annotate ->poll() instances
crypto: annotate ->poll() instances
block: annotate ->poll() instances
x86: annotate ->poll() instances
...
... having taught the latter that si_errno and si_code might be
swapped.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Commit 0515e5999a ("bpf: introduce BPF_PROG_TYPE_PERF_EVENT
program type") introduced the bpf_perf_event_data structure which
exports the pt_regs structure. This is OK for multiple architectures
but fail for s390 and arm64 which do not export pt_regs. Programs
using them, for example, the bpf selftest fail to compile on these
architectures.
For s390, exporting the pt_regs is not an option because s390 wants
to allow changes to it. For arm64, there is a user_pt_regs structure
that covers parts of the pt_regs structure for use by user space.
To solve the broken uapi for s390 and arm64, introduce an abstract
type for pt_regs and add an asm/bpf_perf_event.h file that concretes
the type. An asm-generic header file covers the architectures that
export pt_regs today.
The arch-specific enablement for s390 and arm64 follows in separate
commits.
Reported-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Fixes: 0515e5999a ("bpf: introduce BPF_PROG_TYPE_PERF_EVENT program type")
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-and-tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* Introduce MAP_SYNC and MAP_SHARED_VALIDATE, a mechanism to enable
'userspace flush' of persistent memory updates via filesystem-dax
mappings. It arranges for any filesystem metadata updates that may be
required to satisfy a write fault to also be flushed ("on disk") before
the kernel returns to userspace from the fault handler. Effectively
every write-fault that dirties metadata completes an fsync() before
returning from the fault handler. The new MAP_SHARED_VALIDATE mapping
type guarantees that the MAP_SYNC flag is validated as supported by the
filesystem's ->mmap() file operation.
* Add support for the standard ACPI 6.2 label access methods that
replace the NVDIMM_FAMILY_INTEL (vendor specific) label methods. This
enables interoperability with environments that only implement the
standardized methods.
* Add support for the ACPI 6.2 NVDIMM media error injection methods.
* Add support for the NVDIMM_FAMILY_INTEL v1.6 DIMM commands for latch
last shutdown status, firmware update, SMART error injection, and
SMART alarm threshold control.
* Cleanup physical address information disclosures to be root-only.
* Fix revalidation of the DIMM "locked label area" status to support
dynamic unlock of the label area.
* Expand unit test infrastructure to mock the ACPI 6.2 Translate SPA
(system-physical-address) command and error injection commands.
Acknowledgements that came after the commits were pushed to -next:
957ac8c421 dax: fix PMD faults on zero-length files
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
a39e596baa xfs: support for synchronous DAX faults
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
7b565c9f96 xfs: Implement xfs_filemap_pfn_mkwrite() using __xfs_filemap_fault()
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJaDfvcAAoJEB7SkWpmfYgCk7sP/2qJhBH+VTTdg2osDnhAdAhI
co/AGEmsHFlUCMBb/Ek7UnMAmhBYiJU2q4ywPsNFBpusXpMlqNy5Iwo7k4/wQHE/
SJcIM0g4zg0ViFuUhwV+C2T0R5UzFR8JLd9EYWj/YS6aJpurtotm5l4UStaM0Hzo
AhxSXJLrBDuqCpbOxbctfiGEmdRL7aRfBEAARTNRKBn/iXxJUcYHlp62rtXQS+t4
I6LC/URCWTNTTMGmzW6TRsgSD9WMfd19xKcGzN3qL6ee0KFccxN4ctFqHA/sFGOh
iYLeR0XJUjJxyp+PkWGteXPVZL0Kj3bD/lSTG+Co5bm/ra8a/sh3TSFfgFyoBZD1
EqMN8Ryf80hGp3FabeH2Iw2SviYPZpHSWgjddjxLD0RA6OmpzINc+Wm8eqApjMME
sbZDTOijiab4QMQ0XamF4GuDHyQtawv5Y/w2Ehhl1tmiqW+5tKhsKqxkQt+/V3Yt
RTVSRe2Pkway66b+cD64IdQ6L2tyonPnmi5IzgkKOhlOEGomy+4/U2Jt2bMbhzq6
ymszKmXp2XI8P06wU8sHrIUeXO5I9qoKn/fZA73Eb8aIzgJe3tBE/5+Ab7RG6HB9
1OVfcMWoXU1gNgNktTs63X1Lsg4aW9kt/K4fPHHcqUcaliEJpJTlAbg9GLF2buoW
nQ+0fTRgMRihE3ZA0Fs3
=h2vZ
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm and dax updates from Dan Williams:
"Save for a few late fixes, all of these commits have shipped in -next
releases since before the merge window opened, and 0day has given a
build success notification.
The ext4 touches came from Jan, and the xfs touches have Darrick's
reviewed-by. An xfstest for the MAP_SYNC feature has been through
a few round of reviews and is on track to be merged.
- Introduce MAP_SYNC and MAP_SHARED_VALIDATE, a mechanism to enable
'userspace flush' of persistent memory updates via filesystem-dax
mappings. It arranges for any filesystem metadata updates that may
be required to satisfy a write fault to also be flushed ("on disk")
before the kernel returns to userspace from the fault handler.
Effectively every write-fault that dirties metadata completes an
fsync() before returning from the fault handler. The new
MAP_SHARED_VALIDATE mapping type guarantees that the MAP_SYNC flag
is validated as supported by the filesystem's ->mmap() file
operation.
- Add support for the standard ACPI 6.2 label access methods that
replace the NVDIMM_FAMILY_INTEL (vendor specific) label methods.
This enables interoperability with environments that only implement
the standardized methods.
- Add support for the ACPI 6.2 NVDIMM media error injection methods.
- Add support for the NVDIMM_FAMILY_INTEL v1.6 DIMM commands for
latch last shutdown status, firmware update, SMART error injection,
and SMART alarm threshold control.
- Cleanup physical address information disclosures to be root-only.
- Fix revalidation of the DIMM "locked label area" status to support
dynamic unlock of the label area.
- Expand unit test infrastructure to mock the ACPI 6.2 Translate SPA
(system-physical-address) command and error injection commands.
Acknowledgements that came after the commits were pushed to -next:
- 957ac8c421 ("dax: fix PMD faults on zero-length files"):
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
- a39e596baa ("xfs: support for synchronous DAX faults") and
7b565c9f96 ("xfs: Implement xfs_filemap_pfn_mkwrite() using __xfs_filemap_fault()")
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>"
* tag 'libnvdimm-for-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (49 commits)
acpi, nfit: add 'Enable Latch System Shutdown Status' command support
dax: fix general protection fault in dax_alloc_inode
dax: fix PMD faults on zero-length files
dax: stop requiring a live device for dax_flush()
brd: remove dax support
dax: quiet bdev_dax_supported()
fs, dax: unify IOMAP_F_DIRTY read vs write handling policy in the dax core
tools/testing/nvdimm: unit test clear-error commands
acpi, nfit: validate commands against the device type
tools/testing/nvdimm: stricter bounds checking for error injection commands
xfs: support for synchronous DAX faults
xfs: Implement xfs_filemap_pfn_mkwrite() using __xfs_filemap_fault()
ext4: Support for synchronous DAX faults
ext4: Simplify error handling in ext4_dax_huge_fault()
dax: Implement dax_finish_sync_fault()
dax, iomap: Add support for synchronous faults
mm: Define MAP_SYNC and VM_SYNC flags
dax: Allow tuning whether dax_insert_mapping_entry() dirties entry
dax: Allow dax_iomap_fault() to return pfn
dax: Fix comment describing dax_iomap_fault()
...
The mmap(2) syscall suffers from the ABI anti-pattern of not validating
unknown flags. However, proposals like MAP_SYNC need a mechanism to
define new behavior that is known to fail on older kernels without the
support. Define a new MAP_SHARED_VALIDATE flag pattern that is
guaranteed to fail on all legacy mmap implementations.
It is worth noting that the original proposal was for a standalone
MAP_VALIDATE flag. However, when that could not be supported by all
archs Linus observed:
I see why you *think* you want a bitmap. You think you want
a bitmap because you want to make MAP_VALIDATE be part of MAP_SYNC
etc, so that people can do
ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED
| MAP_SYNC, fd, 0);
and "know" that MAP_SYNC actually takes.
And I'm saying that whole wish is bogus. You're fundamentally
depending on special semantics, just make it explicit. It's already
not portable, so don't try to make it so.
Rename that MAP_VALIDATE as MAP_SHARED_VALIDATE, make it have a value
of 0x3, and make people do
ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED_VALIDATE
| MAP_SYNC, fd, 0);
and then the kernel side is easier too (none of that random garbage
playing games with looking at the "MAP_VALIDATE bit", but just another
case statement in that map type thing.
Boom. Done.
Similar to ->fallocate() we also want the ability to validate the
support for new flags on a per ->mmap() 'struct file_operations'
instance basis. Towards that end arrange for flags to be generically
validated against a mmap_supported_flags exported by 'struct
file_operations'. By default all existing flags are implicitly
supported, but new flags require MAP_SHARED_VALIDATE and
per-instance-opt-in.
Cc: Jan Kara <jack@suse.cz>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Many user space API headers have licensing information, which is either
incomplete, badly formatted or just a shorthand for referring to the
license under which the file is supposed to be. This makes it hard for
compliance tools to determine the correct license.
Update these files with an SPDX license identifier. The identifier was
chosen based on the license information in the file.
GPL/LGPL licensed headers get the matching GPL/LGPL SPDX license
identifier with the added 'WITH Linux-syscall-note' exception, which is
the officially assigned exception identifier for the kernel syscall
exception:
NOTE! This copyright does *not* cover user programs that use kernel
services by normal system calls - this is merely considered normal use
of the kernel, and does *not* fall under the heading of "derived work".
This exception makes it possible to include GPL headers into non GPL
code, without confusing license compliance tools.
Headers which have either explicit dual licensing or are just licensed
under a non GPL license are updated with the corresponding SPDX
identifier and the GPLv2 with syscall exception identifier. The format
is:
((GPL-2.0 WITH Linux-syscall-note) OR SPDX-ID-OF-OTHER-LICENSE)
SPDX license identifiers are a legally binding shorthand, which can be
used instead of the full boiler plate text. The update does not remove
existing license information as this has to be done on a case by case
basis and the copyright holders might have to be consulted. This will
happen in a separate step.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne. See the previous patch in this series for the
methodology of how this patch was researched.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Many user space API headers are missing licensing information, which
makes it hard for compliance tools to determine the correct license.
By default are files without license information under the default
license of the kernel, which is GPLV2. Marking them GPLV2 would exclude
them from being included in non GPLV2 code, which is obviously not
intended. The user space API headers fall under the syscall exception
which is in the kernels COPYING file:
NOTE! This copyright does *not* cover user programs that use kernel
services by normal system calls - this is merely considered normal use
of the kernel, and does *not* fall under the heading of "derived work".
otherwise syscall usage would not be possible.
Update the files which contain no license information with an SPDX
license identifier. The chosen identifier is 'GPL-2.0 WITH
Linux-syscall-note' which is the officially assigned identifier for the
Linux syscall exception. SPDX license identifiers are a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne. See the previous patch in this series for the
methodology of how this patch was researched.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull MIPS updates from Ralf Baechle:
"This is the main pull request for 4.14 for MIPS; below a summary of
the non-merge commits:
CM:
- Rename mips_cm_base to mips_gcr_base
- Specify register size when generating accessors
- Use BIT/GENMASK for register fields, order & drop shifts
- Add cluster & block args to mips_cm_lock_other()
CPC:
- Use common CPS accessor generation macros
- Use BIT/GENMASK for register fields, order & drop shifts
- Introduce register modify (set/clear/change) accessors
- Use change_*, set_* & clear_* where appropriate
- Add CM/CPC 3.5 register definitions
- Use GlobalNumber macros rather than magic numbers
- Have asm/mips-cps.h include CM & CPC headers
- Cluster support for topology functions
- Detect CPUs in secondary clusters
CPS:
- Read GIC_VL_IDENT directly, not via irqchip driver
DMA:
- Consolidate coherent and non-coherent dma_alloc code
- Don't use dma_cache_sync to implement fd_cacheflush
FPU emulation / FP assist code:
- Another series of 14 commits fixing corner cases such as NaN
propgagation and other special input values.
- Zero bits 32-63 of the result for a CLASS.D instruction.
- Enhanced statics via debugfs
- Do not use bools for arithmetic. GCC 7.1 moans about this.
- Correct user fault_addr type
Generic MIPS:
- Enhancement of stack backtraces
- Cleanup from non-existing options
- Handle non word sized instructions when examining frame
- Fix detection and decoding of ADDIUSP instruction
- Fix decoding of SWSP16 instruction
- Refactor handling of stack pointer in get_frame_info
- Remove unreachable code from force_fcr31_sig()
- Convert to using %pOF instead of full_name
- Remove the R6000 support.
- Move FP code from *_switch.S to *_fpu.S
- Remove unused ST_OFF from r2300_switch.S
- Allow platform to specify multiple its.S files
- Add #includes to various files to ensure code builds reliable and
without warning..
- Remove __invalidate_kernel_vmap_range
- Remove plat_timer_setup
- Declare various variables & functions static
- Abstract CPU core & VP(E) ID access through accessor functions
- Store core & VP IDs in GlobalNumber-style variable
- Unify checks for sibling CPUs
- Add CPU cluster number accessors
- Prevent direct use of generic_defconfig
- Make CONFIG_MIPS_MT_SMP default y
- Add __ioread64_copy
- Remove unnecessary inclusions of linux/irqchip/mips-gic.h
GIC:
- Introduce asm/mips-gic.h with accessor functions
- Use new GIC accessor functions in mips-gic-timer
- Remove counter access functions from irq-mips-gic.c
- Remove gic_read_local_vp_id() from irq-mips-gic.c
- Simplify shared interrupt pending/mask reads in irq-mips-gic.c
- Simplify gic_local_irq_domain_map() in irq-mips-gic.c
- Drop gic_(re)set_mask() functions in irq-mips-gic.c
- Remove gic_set_polarity(), gic_set_trigger(), gic_set_dual_edge(),
gic_map_to_pin() and gic_map_to_vpe() from irq-mips-gic.c.
- Convert remaining shared reg access, local int mask access and
remaining local reg access to new accessors
- Move GIC_LOCAL_INT_* to asm/mips-gic.h
- Remove GIC_CPU_INT* macros from irq-mips-gic.c
- Move various definitions to the driver
- Remove gic_get_usm_range()
- Remove __gic_irq_dispatch() forward declaration
- Remove gic_init()
- Use mips_gic_present() in place of gic_present and remove
gic_present
- Move gic_get_c0_*_int() to asm/mips-gic.h
- Remove linux/irqchip/mips-gic.h
- Inline __gic_init()
- Inline gic_basic_init()
- Make pcpu_masks a per-cpu variable
- Use pcpu_masks to avoid reading GIC_SH_MASK*
- Clean up mti, reserved-cpu-vectors handling
- Use cpumask_first_and() in gic_set_affinity()
- Let the core set struct irq_common_data affinity
microMIPS:
- Fix microMIPS stack unwinding on big endian systems
MIPS-GIC:
- SYNC after enabling GIC region
NUMA:
- Remove the unused parent_node() macro
R6:
- Constify r2_decoder_tables
- Add accessor & bit definitions for GlobalNumber
SMP:
- Constify smp ops
- Allow boot_secondary SMP op to return errors
VDSO:
- Drop gic_get_usm_range() usage
- Avoid use of linux/irqchip/mips-gic.h
Platform changes:
Alchemy:
- Add devboard machine type to cpuinfo
- update cpu feature overrides
- Threaded carddetect irqs for devboards
AR7:
- allow NULL clock for clk_get_rate
BCM63xx:
- Fix ENETDMA_6345_MAXBURST_REG offset
- Allow NULL clock for clk_get_rate
CI20:
- Enable GPIO and RTC drivers in defconfig
- Add ethernet and fixed-regulator nodes to DTS
Generic platform:
- Move Boston and NI 169445 FIT image source to their own files
- Include asm/bootinfo.h for plat_fdt_relocated()
- Include asm/time.h for get_c0_*_int()
- Include asm/bootinfo.h for plat_fdt_relocated()
- Include asm/time.h for get_c0_*_int()
- Allow filtering enabled boards by requirements
- Don't explicitly disable CONFIG_USB_SUPPORT
- Bump default NR_CPUS to 16
JZ4700:
- Probe the jz4740-rtc driver from devicetree
Lantiq:
- Drop check of boot select from the spi-falcon driver.
- Drop check of boot select from the lantiq-flash MTD driver.
- Access boot cause register in the watchdog driver through regmap
- Add device tree binding documentation for the watchdog driver
- Add docs for the RCU DT bindings.
- Convert the fpi bus driver to a platform_driver
- Remove ltq_reset_cause() and ltq_boot_select(
- Switch to a proper reset driver
- Switch to a new drivers/soc GPHY driver
- Add an USB PHY driver for the Lantiq SoCs using the RCU module
- Use of_platform_default_populate instead of __dt_register_buses
- Enable MFD_SYSCON to be able to use it for the RCU MFD
- Replace ltq_boot_select() with dummy implementation.
Loongson 2F:
- Allow NULL clock for clk_get_rate
Malta:
- Use new GIC accessor functions
NI 169445:
- Add support for NI 169445 board.
- Only include in 32r2el kernels
Octeon:
- Add support for watchdog of 78XX SOCs.
- Add support for watchdog of CN68XX SOCs.
- Expose support for mips32r1, mips32r2 and mips64r1
- Enable more drivers in config file
- Add support for accessing the boot vector.
- Remove old boot vector code from watchdog driver
- Define watchdog registers for 70xx, 73xx, 78xx, F75xx.
- Make CSR functions node aware.
- Allow access to CIU3 IRQ domains.
- Misc cleanups in the watchdog driver
Omega2+:
- New board, add support and defconfig
Pistachio:
- Enable Root FS on NFS in defconfig
Ralink:
- Add Mediatek MT7628A SoC
- Allow NULL clock for clk_get_rate
- Explicitly request exclusive reset control in the pci-mt7620 PCI driver.
SEAD3:
- Only include in 32 bit kernels by default
VoCore:
- Add VoCore as a vendor t0 dt-bindings
- Add defconfig file"
* '4.14-features' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (167 commits)
MIPS: Refactor handling of stack pointer in get_frame_info
MIPS: Stacktrace: Fix microMIPS stack unwinding on big endian systems
MIPS: microMIPS: Fix decoding of swsp16 instruction
MIPS: microMIPS: Fix decoding of addiusp instruction
MIPS: microMIPS: Fix detection of addiusp instruction
MIPS: Handle non word sized instructions when examining frame
MIPS: ralink: allow NULL clock for clk_get_rate
MIPS: Loongson 2F: allow NULL clock for clk_get_rate
MIPS: BCM63XX: allow NULL clock for clk_get_rate
MIPS: AR7: allow NULL clock for clk_get_rate
MIPS: BCM63XX: fix ENETDMA_6345_MAXBURST_REG offset
mips: Save all registers when saving the frame
MIPS: Add DWARF unwinding to assembly
MIPS: Make SAVE_SOME more standard
MIPS: Fix issues in backtraces
MIPS: jz4780: DTS: Probe the jz4740-rtc driver from devicetree
MIPS: Ci20: Enable RTC driver
watchdog: octeon-wdt: Add support for 78XX SOCs.
watchdog: octeon-wdt: Add support for cn68XX SOCs.
watchdog: octeon-wdt: File cleaning.
...
Pull namespace updates from Eric Biederman:
"Life has been busy and I have not gotten half as much done this round
as I would have liked. I delayed it so that a minor conflict
resolution with the mips tree could spend a little time in linux-next
before I sent this pull request.
This includes two long delayed user namespace changes from Kirill
Tkhai. It also includes a very useful change from Serge Hallyn that
allows the security capability attribute to be used inside of user
namespaces. The practical effect of this is people can now untar
tarballs and install rpms in user namespaces. It had been suggested to
generalize this and encode some of the namespace information
information in the xattr name. Upon close inspection that makes the
things that should be hard easy and the things that should be easy
more expensive.
Then there is my bugfix/cleanup for signal injection that removes the
magic encoding of the siginfo union member from the kernel internal
si_code. The mips folks reported the case where I had used FPE_FIXME
me is impossible so I have remove FPE_FIXME from mips, while at the
same time including a return statement in that case to keep gcc from
complaining about unitialized variables.
I almost finished the work to get make copy_siginfo_to_user a trivial
copy to user. The code is available at:
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git neuter-copy_siginfo_to_user-v3
But I did not have time/energy to get the code posted and reviewed
before the merge window opened.
I was able to see that the security excuse for just copying fields
that we know are initialized doesn't work in practice there are buggy
initializations that don't initialize the proper fields in siginfo. So
we still sometimes copy unitialized data to userspace"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
Introduce v3 namespaced file capabilities
mips/signal: In force_fcr31_sig return in the impossible case
signal: Remove kernel interal si_code magic
fcntl: Don't use ambiguous SIG_POLL si_codes
prctl: Allow local CAP_SYS_ADMIN changing exe_file
security: Use user_namespace::level to avoid redundant iterations in cap_capable()
userns,pidns: Verify the userns for new pid namespaces
signal/testing: Don't look for __SI_FAULT in userspace
signal/mips: Document a conflict with SI_USER with SIGFPE
signal/sparc: Document a conflict with SI_USER with SIGFPE
signal/ia64: Document a conflict with SI_USER with SIGFPE
signal/alpha: Document a conflict with SI_USER for SIGTRAP
Merge updates from Andrew Morton:
- various misc bits
- DAX updates
- OCFS2
- most of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (119 commits)
mm,fork: introduce MADV_WIPEONFORK
x86,mpx: make mpx depend on x86-64 to free up VMA flag
mm: add /proc/pid/smaps_rollup
mm: hugetlb: clear target sub-page last when clearing huge page
mm: oom: let oom_reap_task and exit_mmap run concurrently
swap: choose swap device according to numa node
mm: replace TIF_MEMDIE checks by tsk_is_oom_victim
mm, oom: do not rely on TIF_MEMDIE for memory reserves access
z3fold: use per-cpu unbuddied lists
mm, swap: don't use VMA based swap readahead if HDD is used as swap
mm, swap: add sysfs interface for VMA based swap readahead
mm, swap: VMA based swap readahead
mm, swap: fix swap readahead marking
mm, swap: add swap readahead hit statistics
mm/vmalloc.c: don't reinvent the wheel but use existing llist API
mm/vmstat.c: fix wrong comment
selftests/memfd: add memfd_create hugetlbfs selftest
mm/shmem: add hugetlbfs support to memfd_create()
mm, devm_memremap_pages: use multi-order radix for ZONE_DEVICE lookups
mm/vmalloc.c: halve the number of comparisons performed in pcpu_get_vm_areas()
...
Introduce MADV_WIPEONFORK semantics, which result in a VMA being empty
in the child process after fork. This differs from MADV_DONTFORK in one
important way.
If a child process accesses memory that was MADV_WIPEONFORK, it will get
zeroes. The address ranges are still valid, they are just empty.
If a child process accesses memory that was MADV_DONTFORK, it will get a
segmentation fault, since those address ranges are no longer valid in
the child after fork.
Since MADV_DONTFORK also seems to be used to allow very large programs
to fork in systems with strict memory overcommit restrictions, changing
the semantics of MADV_DONTFORK might break existing programs.
MADV_WIPEONFORK only works on private, anonymous VMAs.
The use case is libraries that store or cache information, and want to
know that they need to regenerate it in the child process after fork.
Examples of this would be:
- systemd/pulseaudio API checks (fail after fork) (replacing a getpid
check, which is too slow without a PID cache)
- PKCS#11 API reinitialization check (mandated by specification)
- glibc's upcoming PRNG (reseed after fork)
- OpenSSL PRNG (reseed after fork)
The security benefits of a forking server having a re-inialized PRNG in
every child process are pretty obvious. However, due to libraries
having all kinds of internal state, and programs getting compiled with
many different versions of each library, it is unreasonable to expect
calling programs to re-initialize everything manually after fork.
A further complication is the proliferation of clone flags, programs
bypassing glibc's functions to call clone directly, and programs calling
unshare, causing the glibc pthread_atfork hook to not get called.
It would be better to have the kernel take care of this automatically.
The patch also adds MADV_KEEPONFORK, to undo the effects of a prior
MADV_WIPEONFORK.
This is similar to the OpenBSD minherit syscall with MAP_INHERIT_ZERO:
https://man.openbsd.org/minherit.2
[akpm@linux-foundation.org: numerically order arch/parisc/include/uapi/asm/mman.h #defines]
Link: http://lkml.kernel.org/r/20170811212829.29186-3-riel@redhat.com
Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Florian Weimer <fweimer@redhat.com>
Reported-by: Colm MacCártaigh <colm@allcosts.net>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Cc: <linux-api@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A non-default huge page size can be encoded in the flags argument of the
mmap system call. The definitions for these encodings are in arch
specific header files. However, all architectures use the same values.
Consolidate all the definitions in the primary user header file
(uapi/linux/mman.h). Include definitions for all known huge page sizes.
Use the generic encoding definitions in hugetlb_encode.h as the basis
for these definitions.
Link: http://lkml.kernel.org/r/1501527386-10736-3-git-send-email-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the immediate encoded in the instruction is accessed, it is sign
extended due to being a signed value being assigned to a signed integer.
The ISA specifies that this operation is an unsigned operation.
The sign extension leads us to incorrectly decode:
801e9c8e: cbf1 sw ra,68(sp)
As having an immediate of 1073741809.
Since the instruction format does not specify signed/unsigned, and this
is currently the only location to use this instuction format, change it
to an unsigned immediate.
Fixes: bb9bc4689b ("MIPS: Calculate microMIPS ra properly when unwinding the stack")
Suggested-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: Miodrag Dinic <miodrag.dinic@imgtec.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/16957/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
In a recent discussion Maciej Rozycki reported that this case is
impossible.
Handle the impossible case by just returning instead of trying to
handle it. This makes static analysis simpler as it means nothing
needs to consider the impossible case after the return statement.
As the code no longer has to deal with this case remove FPE_FIXME from
the mips siginfo.h
Cc: "Maciej W. Rozycki" <macro@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Link: http://lkml.kernel.org/r/20170718140651.15973-4-ebiederm@xmission.com
Ref: ea1b75cf91 ("signal/mips: Document a conflict with SI_USER with SIGFPE")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
The send call ignores unknown flags. Legacy applications may already
unwittingly pass MSG_ZEROCOPY. Continue to ignore this flag unless a
socket opts in to zerocopy.
Introduce socket option SO_ZEROCOPY to enable MSG_ZEROCOPY processing.
Processes can also query this socket option to detect kernel support
for the feature. Older kernels will return ENOPROTOOPT.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct siginfo is a union and the kernel since 2.4 has been hiding a union
tag in the high 16bits of si_code using the values:
__SI_KILL
__SI_TIMER
__SI_POLL
__SI_FAULT
__SI_CHLD
__SI_RT
__SI_MESGQ
__SI_SYS
While this looks plausible on the surface, in practice this situation has
not worked well.
- Injected positive signals are not copied to user space properly
unless they have these magic high bits set.
- Injected positive signals are not reported properly by signalfd
unless they have these magic high bits set.
- These kernel internal values leaked to userspace via ptrace_peek_siginfo
- It was possible to inject these kernel internal values and cause the
the kernel to misbehave.
- Kernel developers got confused and expected these kernel internal values
in userspace in kernel self tests.
- Kernel developers got confused and set si_code to __SI_FAULT which
is SI_USER in userspace which causes userspace to think an ordinary user
sent the signal and that it was not kernel generated.
- The values make it impossible to reorganize the code to transform
siginfo_copy_to_user into a plain copy_to_user. As si_code must
be massaged before being passed to userspace.
So remove these kernel internal si codes and make the kernel code simpler
and more maintainable.
To replace these kernel internal magic si_codes introduce the helper
function siginfo_layout, that takes a signal number and an si_code and
computes which union member of siginfo is being used. Have
siginfo_layout return an enumeration so that gcc will have enough
information to warn if a switch statement does not handle all of union
members.
A couple of architectures have a messed up ABI that defines signal
specific duplications of SI_USER which causes more special cases in
siginfo_layout than I would like. The good news is only problem
architectures pay the cost.
Update all of the code that used the previous magic __SI_ values to
use the new SIL_ values and to call siginfo_layout to get those
values. Escept where not all of the cases are handled remove the
defaults in the switch statements so that if a new case is missed in
the future the lack will show up at compile time.
Modify the code that copies siginfo si_code to userspace to just copy
the value and not cast si_code to a short first. The high bits are no
longer used to hold a magic union member.
Fixup the siginfo header files to stop including the __SI_ values in
their constants and for the headers that were missing it to properly
update the number of si_codes for each signal type.
The fixes to copy_siginfo_from_user32 implementations has the
interesting property that several of them perviously should never have
worked as the __SI_ values they depended up where kernel internal.
With that dependency gone those implementations should work much
better.
The idea of not passing the __SI_ values out to userspace and then
not reinserting them has been tested with criu and criu worked without
changes.
Ref: 2.4.0-test1
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Setting si_code to __SI_FAULT results in a userspace seeing
an si_code of 0. This is the same si_code as SI_USER. Posix
and common sense requires that SI_USER not be a signal specific
si_code. As such this use of 0 for the si_code is a pretty
horribly broken ABI.
This use of of __SI_FAULT is only a decade old. Which compared
to the other pieces of kernel code that has made this mistake
is almost yesterday.
This is probably worth fixing but I don't know mips well enough
to know what si_code to would be the proper one to use.
Cc: Ralf Baechle <ralf@linux-mips.org>
Ref: 948a34cf39 ("[MIPS] Maintain si_code field properly for FP exceptions")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
This ioctl does nothing to justify an _IOC_READ or _IOC_WRITE flag
because it doesn't copy anything from/to userspace to access the
argument.
Fixes: 54ebbfb160 ("tty: add TIOCGPTPEER ioctl")
Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Acked-by: Aleksa Sarai <asarai@suse.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull MIPS updates from Ralf Baechle:
"Boston platform support:
- Document DT bindings
- Add CLK driver for board clocks
CM:
- Avoid per-core locking with CM3 & higher
- WARN on attempt to lock invalid VP, not BUG
CPS:
- Select CONFIG_SYS_SUPPORTS_SCHED_SMT for MIPSr6
- Prevent multi-core with dcache aliasing
- Handle cores not powering down more gracefully
- Handle spurious VP starts more gracefully
DSP:
- Add lwx & lhx missaligned access support
eBPF:
- Add MIPS support along with many supporting change to add the
required infrastructure
Generic arch code:
- Misc sysmips MIPS_ATOMIC_SET fixes
- Drop duplicate HAVE_SYSCALL_TRACEPOINTS
- Negate error syscall return in trace
- Correct forced syscall errors
- Traced negative syscalls should return -ENOSYS
- Allow samples/bpf/tracex5 to access syscall arguments for sane
traces
- Cleanup from old Kconfig options in defconfigs
- Fix PREF instruction usage by memcpy for MIPS R6
- Fix various special cases in the FPU eulation
- Fix some special cases in MIPS16e2 support
- Fix MIPS I ISA /proc/cpuinfo reporting
- Sort MIPS Kconfig alphabetically
- Fix minimum alignment requirement of IRQ stack as required by
ABI / GCC
- Fix special cases in the module loader
- Perform post-DMA cache flushes on systems with MAARs
- Probe the I6500 CPU
- Cleanup cmpxchg and add support for 1 and 2 byte operations
- Use queued read/write locks (qrwlock)
- Use queued spinlocks (qspinlock)
- Add CPU shared FTLB feature detection
- Handle tlbex-tlbp race condition
- Allow storing pgd in C0_CONTEXT for MIPSr6
- Use current_cpu_type() in m4kc_tlbp_war()
- Support Boston in the generic kernel
Generic platform:
- yamon-dt: Pull YAMON DT shim code out of SEAD-3 board
- yamon-dt: Support > 256MB of RAM
- yamon-dt: Use serial* rather than uart* aliases
- Abstract FDT fixup application
- Set RTC_ALWAYS_BCD to 0
- Add a MAINTAINERS entry
core kernel:
- qspinlock.c: include linux/prefetch.h
Loongson 3:
- Add support
Perf:
- Add I6500 support
SEAD-3:
- Remove GIC timer from DT
- Set interrupt-parent per-device, not at root node
- Fix GIC interrupt specifiers
SMP:
- Skip IPI setup if we only have a single CPU
VDSO:
- Make comment match reality
- Improvements to time code in VDSO"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (86 commits)
locking/qspinlock: Include linux/prefetch.h
MIPS: Fix MIPS I ISA /proc/cpuinfo reporting
MIPS: Fix minimum alignment requirement of IRQ stack
MIPS: generic: Support MIPS Boston development boards
MIPS: DTS: img: Don't attempt to build-in all .dtb files
clk: boston: Add a driver for MIPS Boston board clocks
dt-bindings: Document img,boston-clock binding
MIPS: Traced negative syscalls should return -ENOSYS
MIPS: Correct forced syscall errors
MIPS: Negate error syscall return in trace
MIPS: Drop duplicate HAVE_SYSCALL_TRACEPOINTS select
MIPS16e2: Provide feature overrides for non-MIPS16 systems
MIPS: MIPS16e2: Report ASE presence in /proc/cpuinfo
MIPS: MIPS16e2: Subdecode extended LWSP/SWSP instructions
MIPS: MIPS16e2: Identify ASE presence
MIPS: VDSO: Fix a mismatch between comment and preprocessor constant
MIPS: VDSO: Add implementation of gettimeofday() fallback
MIPS: VDSO: Add implementation of clock_gettime() fallback
MIPS: VDSO: Fix conversions in do_monotonic()/do_monotonic_coarse()
MIPS: Use current_cpu_type() in m4kc_tlbp_war()
...
Pull networking updates from David Miller:
"Reasonably busy this cycle, but perhaps not as busy as in the 4.12
merge window:
1) Several optimizations for UDP processing under high load from
Paolo Abeni.
2) Support pacing internally in TCP when using the sch_fq packet
scheduler for this is not practical. From Eric Dumazet.
3) Support mutliple filter chains per qdisc, from Jiri Pirko.
4) Move to 1ms TCP timestamp clock, from Eric Dumazet.
5) Add batch dequeueing to vhost_net, from Jason Wang.
6) Flesh out more completely SCTP checksum offload support, from
Davide Caratti.
7) More plumbing of extended netlink ACKs, from David Ahern, Pablo
Neira Ayuso, and Matthias Schiffer.
8) Add devlink support to nfp driver, from Simon Horman.
9) Add RTM_F_FIB_MATCH flag to RTM_GETROUTE queries, from Roopa
Prabhu.
10) Add stack depth tracking to BPF verifier and use this information
in the various eBPF JITs. From Alexei Starovoitov.
11) Support XDP on qed device VFs, from Yuval Mintz.
12) Introduce BPF PROG ID for better introspection of installed BPF
programs. From Martin KaFai Lau.
13) Add bpf_set_hash helper for TC bpf programs, from Daniel Borkmann.
14) For loads, allow narrower accesses in bpf verifier checking, from
Yonghong Song.
15) Support MIPS in the BPF selftests and samples infrastructure, the
MIPS eBPF JIT will be merged in via the MIPS GIT tree. From David
Daney.
16) Support kernel based TLS, from Dave Watson and others.
17) Remove completely DST garbage collection, from Wei Wang.
18) Allow installing TCP MD5 rules using prefixes, from Ivan
Delalande.
19) Add XDP support to Intel i40e driver, from Björn Töpel
20) Add support for TC flower offload in nfp driver, from Simon
Horman, Pieter Jansen van Vuuren, Benjamin LaHaise, Jakub
Kicinski, and Bert van Leeuwen.
21) IPSEC offloading support in mlx5, from Ilan Tayari.
22) Add HW PTP support to macb driver, from Rafal Ozieblo.
23) Networking refcount_t conversions, From Elena Reshetova.
24) Add sock_ops support to BPF, from Lawrence Brako. This is useful
for tuning the TCP sockopt settings of a group of applications,
currently via CGROUPs"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1899 commits)
net: phy: dp83867: add workaround for incorrect RX_CTRL pin strap
dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strap
cxgb4: Support for get_ts_info ethtool method
cxgb4: Add PTP Hardware Clock (PHC) support
cxgb4: time stamping interface for PTP
nfp: default to chained metadata prepend format
nfp: remove legacy MAC address lookup
nfp: improve order of interfaces in breakout mode
net: macb: remove extraneous return when MACB_EXT_DESC is defined
bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP case
bpf: fix return in load_bpf_file
mpls: fix rtm policy in mpls_getroute
net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t
net, ax25: convert ax25_route.refcount from atomic_t to refcount_t
net, ax25: convert ax25_uid_assoc.refcount from atomic_t to refcount_t
net, sctp: convert sctp_ep_common.refcnt from atomic_t to refcount_t
net, sctp: convert sctp_transport.refcnt from atomic_t to refcount_t
net, sctp: convert sctp_chunk.refcnt from atomic_t to refcount_t
net, sctp: convert sctp_datamsg.refcnt from atomic_t to refcount_t
net, sctp: convert sctp_auth_bytes.refcnt from atomic_t to refcount_t
...
This adds the new getsockopt(2) option SO_PEERGROUPS on SOL_SOCKET to
retrieve the auxiliary groups of the remote peer. It is designed to
naturally extend SO_PEERCRED. That is, the underlying data is from the
same credentials. Regarding its syntax, it is based on SO_PEERSEC. That
is, if the provided buffer is too small, ERANGE is returned and @optlen
is updated. Otherwise, the information is copied, @optlen is set to the
actual size, and 0 is returned.
While SO_PEERCRED (and thus `struct ucred') already returns the primary
group, it lacks the auxiliary group vector. However, nearly all access
controls (including kernel side VFS and SYSVIPC, but also user-space
polkit, DBus, ...) consider the entire set of groups, rather than just
the primary group. But this is currently not possible with pure
SO_PEERCRED. Instead, user-space has to work around this and query the
system database for the auxiliary groups of a UID retrieved via
SO_PEERCRED.
Unfortunately, there is no race-free way to query the auxiliary groups
of the PID/UID retrieved via SO_PEERCRED. Hence, the current user-space
solution is to use getgrouplist(3p), which itself falls back to NSS and
whatever is configured in nsswitch.conf(3). This effectively checks
which groups we *would* assign to the user if it logged in *now*. On
normal systems it is as easy as reading /etc/group, but with NSS it can
resort to quering network databases (eg., LDAP), using IPC or network
communication.
Long story short: Whenever we want to use auxiliary groups for access
checks on IPC, we need further IPC to talk to the user/group databases,
rather than just relying on SO_PEERCRED and the incoming socket. This
is unfortunate, and might even result in dead-locks if the database
query uses the same IPC as the original request.
So far, those recursions / dead-locks have been avoided by using
primitive IPC for all crucial NSS modules. However, we want to avoid
re-inventing the wheel for each NSS module that might be involved in
user/group queries. Hence, we would preferably make DBus (and other IPC
that supports access-management based on groups) work without resorting
to the user/group database. This new SO_PEERGROUPS ioctl would allow us
to make dbus-daemon work without ever calling into NSS.
Cc: Michal Sekletar <msekleta@redhat.com>
Cc: Simon McVittie <simon.mcvittie@collabora.co.uk>
Reviewed-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When opening the slave end of a PTY, it is not possible for userspace to
safely ensure that /dev/pts/$num is actually a slave (in cases where the
mount namespace in which devpts was mounted is controlled by an
untrusted process). In addition, there are several unresolvable
race conditions if userspace were to attempt to detect attacks through
stat(2) and other similar methods [in addition it is not clear how
userspace could detect attacks involving FUSE].
Resolve this by providing an interface for userpace to safely open the
"peer" end of a PTY file descriptor by using the dentry cached by
devpts. Since it is not possible to have an open master PTY without
having its slave exposed in /dev/pts this interface is safe. This
interface currently does not provide a way to get the master pty (since
it is not clear whether such an interface is safe or even useful).
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Valentin Rothberg <vrothberg@suse.com>
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A definition was only provided for asm-generic/socket.h
using platforms, define it for the others as well
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Regularly, when a new header is created in include/uapi/, the developer
forgets to add it in the corresponding Kbuild file. This error is usually
detected after the release is out.
In fact, all headers under uapi directories should be exported, thus it's
useless to have an exhaustive list.
After this patch, the following files, which were not exported, are now
exported (with make headers_install_all):
asm-arc/kvm_para.h
asm-arc/ucontext.h
asm-blackfin/shmparam.h
asm-blackfin/ucontext.h
asm-c6x/shmparam.h
asm-c6x/ucontext.h
asm-cris/kvm_para.h
asm-h8300/shmparam.h
asm-h8300/ucontext.h
asm-hexagon/shmparam.h
asm-m32r/kvm_para.h
asm-m68k/kvm_para.h
asm-m68k/shmparam.h
asm-metag/kvm_para.h
asm-metag/shmparam.h
asm-metag/ucontext.h
asm-mips/hwcap.h
asm-mips/reg.h
asm-mips/ucontext.h
asm-nios2/kvm_para.h
asm-nios2/ucontext.h
asm-openrisc/shmparam.h
asm-parisc/kvm_para.h
asm-powerpc/perf_regs.h
asm-sh/kvm_para.h
asm-sh/ucontext.h
asm-tile/shmparam.h
asm-unicore32/shmparam.h
asm-unicore32/ucontext.h
asm-x86/hwcap2.h
asm-xtensa/kvm_para.h
drm/armada_drm.h
drm/etnaviv_drm.h
drm/vgem_drm.h
linux/aspeed-lpc-ctrl.h
linux/auto_dev-ioctl.h
linux/bcache.h
linux/btrfs_tree.h
linux/can/vxcan.h
linux/cifs/cifs_mount.h
linux/coresight-stm.h
linux/cryptouser.h
linux/fsmap.h
linux/genwqe/genwqe_card.h
linux/hash_info.h
linux/kcm.h
linux/kcov.h
linux/kfd_ioctl.h
linux/lightnvm.h
linux/module.h
linux/nbd-netlink.h
linux/nilfs2_api.h
linux/nilfs2_ondisk.h
linux/nsfs.h
linux/pr.h
linux/qrtr.h
linux/rpmsg.h
linux/sched/types.h
linux/sed-opal.h
linux/smc.h
linux/smc_diag.h
linux/stm.h
linux/switchtec_ioctl.h
linux/vfio_ccw.h
linux/wil6210_uapi.h
rdma/bnxt_re-abi.h
Note that I have removed from this list the files which are generated in every
exported directories (like .install or .install.cmd).
Thanks to Julien Floret <julien.floret@6wind.com> for the tip to get all
subdirs with a pure makefile command.
For the record, note that exported files for asm directories are a mix of
files listed by:
- include/uapi/asm-generic/Kbuild.asm;
- arch/<arch>/include/uapi/asm/Kbuild;
- arch/<arch>/include/asm/Kbuild.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Mark Salter <msalter@redhat.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
support; virtual interrupt controller performance improvements; support
for userspace virtual interrupt controller (slower, but necessary for
KVM on the weird Broadcom SoCs used by the Raspberry Pi 3)
* MIPS: basic support for hardware virtualization (ImgTec
P5600/P6600/I6400 and Cavium Octeon III)
* PPC: in-kernel acceleration for VFIO
* s390: support for guests without storage keys; adapter interruption
suppression
* x86: usual range of nVMX improvements, notably nested EPT support for
accessed and dirty bits; emulation of CPL3 CPUID faulting
* generic: first part of VCPU thread request API; kvm_stat improvements
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJZEHUkAAoJEL/70l94x66DBeYH/09wrpJ2FjU4Rqv7FxmqgWfH
9WGi4wvn/Z+XzQSyfMJiu2SfZVzU69/Y67OMHudy7vBT6knB+ziM7Ntoiu/hUfbG
0g5KsDX79FW15HuvuuGh9kSjUsj7qsQdyPZwP4FW/6ZoDArV9mibSvdjSmiUSMV/
2wxaoLzjoShdOuCe9EABaPhKK0XCrOYkygT6Paz1pItDxaSn8iW3ulaCuWMprUfG
Niq+dFemK464E4yn6HVD88xg5j2eUM6bfuXB3qR3eTR76mHLgtwejBzZdDjLG9fk
32PNYKhJNomBxHVqtksJ9/7cSR6iNPs7neQ1XHemKWTuYqwYQMlPj1NDy0aslQU=
=IsiZ
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"ARM:
- HYP mode stub supports kexec/kdump on 32-bit
- improved PMU support
- virtual interrupt controller performance improvements
- support for userspace virtual interrupt controller (slower, but
necessary for KVM on the weird Broadcom SoCs used by the Raspberry
Pi 3)
MIPS:
- basic support for hardware virtualization (ImgTec P5600/P6600/I6400
and Cavium Octeon III)
PPC:
- in-kernel acceleration for VFIO
s390:
- support for guests without storage keys
- adapter interruption suppression
x86:
- usual range of nVMX improvements, notably nested EPT support for
accessed and dirty bits
- emulation of CPL3 CPUID faulting
generic:
- first part of VCPU thread request API
- kvm_stat improvements"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
kvm: nVMX: Don't validate disabled secondary controls
KVM: put back #ifndef CONFIG_S390 around kvm_vcpu_kick
Revert "KVM: Support vCPU-based gfn->hva cache"
tools/kvm: fix top level makefile
KVM: x86: don't hold kvm->lock in KVM_SET_GSI_ROUTING
KVM: Documentation: remove VM mmap documentation
kvm: nVMX: Remove superfluous VMX instruction fault checks
KVM: x86: fix emulation of RSM and IRET instructions
KVM: mark requests that need synchronization
KVM: return if kvm_vcpu_wake_up() did wake up the VCPU
KVM: add explicit barrier to kvm_vcpu_kick
KVM: perform a wake_up in kvm_make_all_cpus_request
KVM: mark requests that do not need a wakeup
KVM: remove #ifndef CONFIG_S390 around kvm_vcpu_wake_up
KVM: x86: always use kvm_make_request instead of set_bit
KVM: add kvm_{test,clear}_request to replace {test,clear}_bit
s390: kvm: Cpu model support for msa6, msa7 and msa8
KVM: x86: remove irq disablement around KVM_SET_CLOCK/KVM_GET_CLOCK
kvm: better MWAIT emulation for guests
KVM: x86: virtualize cpuid faulting
...
Conflicts were simply overlapping changes. In the net/ipv4/route.c
case the code had simply moved around a little bit and the same fix
was made in both 'net' and 'net-next'.
In the net/sched/sch_generic.c case a fix in 'net' happened at
the same time that a new argument was added to qdisc_hash_add().
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce a new getsockopt operation to retrieve the socket cookie
for a specific socket based on the socket fd. It returns a unique
non-decreasing cookie for each socket.
Tested: https://android-review.googlesource.com/#/c/358163/
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Chenbo Feng <fengc@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Its value has never changed; we might as well make it part of the ABI instead
of using the return value of KVM_CHECK_EXTENSION(KVM_CAP_COALESCED_MMIO).
Because PPC does not always make MMIO available, the code has to be made
dependent on CONFIG_KVM_MMIO rather than KVM_COALESCED_MMIO_PAGE_OFFSET.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Add emulation of Memory Accessibility Attribute Registers (MAARs) when
necessary. We can't actually do anything with whatever the guest
provides, but it may not be possible to clear Guest.Config5.MRP so we
have to emulate at least a pair of MAARs.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Emulate the HYPCALL instruction added in the VZ ASE and used by the MIPS
paravirtualised guest support that is already merged. The new hypcall.c
handles arguments and the return value. No actual hypercalls are yet
supported, but this still allows us to safely step over hypercalls and
set an error code in the return value for forward compatibility.
Non-zero HYPCALL codes are not handled.
We also document the hypercall ABI which asm/kvm_para.h uses.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
This socket option returns the NAPI ID associated with the queue on which
the last frame is received. This information can be used by the apps to
split the incoming flows among the threads based on the Rx queue on which
they are received.
If the NAPI ID actually represents a sender_cpu then the value is ignored
and 0 is returned.
Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allows reading of SK_MEMINFO_VARS via socket option. This way an
application can get all meminfo related information in single socket
option call instead of multiple calls.
Adds helper function, sk_get_meminfo(), and uses that for both
getsockopt and sock_diag_put_meminfo().
Suggested by Eric Dumazet.
Signed-off-by: Josh Hunt <johunt@akamai.com>
Reviewed-by: Jason Baron <jbaron@akamai.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wire up the statx system call for MIPS, which was introduced in commit
a528d35e8b ("statx: Add a system call to make enhanced file info
available").
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15387/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Now that load/store faults due to read only memory regions are treated
as MMIO accesses it is safe to claim support for read only memory
regions (KVM_CAP_READONLY_MEM).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
This patch exports the sender chronograph stats via the socket
SO_TIMESTAMPING channel. Currently we can instrument how long a
particular application unit of data was queued in TCP by tracking
SOF_TIMESTAMPING_TX_SOFTWARE and SOF_TIMESTAMPING_TX_SCHED. Having
these sender chronograph stats exported simultaneously along with
these timestamps allow further breaking down the various sender
limitation. For example, a video server can tell if a particular
chunk of video on a connection takes a long time to deliver because
TCP was experiencing small receive window. It is not possible to
tell before this patch without packet traces.
To prepare these stats, the user needs to set
SOF_TIMESTAMPING_OPT_STATS and SOF_TIMESTAMPING_OPT_TSONLY flags
while requesting other SOF_TIMESTAMPING TX timestamps. When the
timestamps are available in the error queue, the stats are returned
in a separate control message of type SCM_TIMESTAMPING_OPT_STATS,
in a list of TLVs (struct nlattr) of types: TCP_NLA_BUSY_TIME,
TCP_NLA_RWND_LIMITED, TCP_NLA_SNDBUF_LIMITED. Unit is microsecond.
Signed-off-by: Francis Yan <francisyyan@gmail.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds two new system calls:
int pkey_alloc(unsigned long flags, unsigned long init_access_rights)
int pkey_free(int pkey);
These implement an "allocator" for the protection keys
themselves, which can be thought of as analogous to the allocator
that the kernel has for file descriptors. The kernel tracks
which numbers are in use, and only allows operations on keys that
are valid. A key which was not obtained by pkey_alloc() may not,
for instance, be passed to pkey_mprotect().
These system calls are also very important given the kernel's use
of pkeys to implement execute-only support. These help ensure
that userspace can never assume that it has control of a key
unless it first asks the kernel. The kernel does not promise to
preserve PKRU (right register) contents except for allocated
pkeys.
The 'init_access_rights' argument to pkey_alloc() specifies the
rights that will be established for the returned pkey. For
instance:
pkey = pkey_alloc(flags, PKEY_DENY_WRITE);
will allocate 'pkey', but also sets the bits in PKRU[1] such that
writing to 'pkey' is already denied.
The kernel does not prevent pkey_free() from successfully freeing
in-use pkeys (those still assigned to a memory range by
pkey_mprotect()). It would be expensive to implement the checks
for this, so we instead say, "Just don't do it" since sane
software will never do it anyway.
Any piece of userspace calling pkey_alloc() needs to be prepared
for it to fail. Why? pkey_alloc() returns the same error code
(ENOSPC) when there are no pkeys and when pkeys are unsupported.
They can be unsupported for a whole host of reasons, so apps must
be prepared for this. Also, libraries or LD_PRELOADs might steal
keys before an application gets access to them.
This allocation mechanism could be implemented in userspace.
Even if we did it in userspace, we would still need additional
user/kernel interfaces to tell userspace which keys are being
used by the kernel internally (such as for execute-only
mappings). Having the kernel provide this facility completely
removes the need for these additional interfaces, or having an
implementation of this in userspace at all.
Note that we have to make changes to all of the architectures
that do not use mman-common.h because we use the new
PKEY_DENY_ACCESS/WRITE macros in arch-independent code.
1. PKRU is the Protection Key Rights User register. It is a
usermode-accessible register that controls whether writes
and/or access to each individual pkey is allowed or denied.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: linux-arch@vger.kernel.org
Cc: Dave Hansen <dave@sr71.net>
Cc: arnd@arndb.de
Cc: linux-api@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/20160729163015.444FE75F@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull MIPS updates from Ralf Baechle:
"This is the main pull request for MIPS for 4.8. Also includes is a
minor SSB cleanup as SSB code traditionally is merged through the MIPS
tree:
ATH25:
- MIPS: Add default configuration for ath25
Boot:
- For zboot, copy appended dtb to the end of the kernel
- store the appended dtb address in a variable
BPF:
- Fix off by one error in offset allocation
Cobalt code:
- Fix typos
Core code:
- debugfs_create_file returns NULL on error, so don't use IS_ERR for
testing for errors.
- Fix double locking issue in RM7000 S-cache code. This would only
affect RM7000 ARC systems on reboot.
- Fix page table corruption on THP permission changes.
- Use compat_sys_keyctl for 32 bit userspace on 64 bit kernels.
David says, there are no compatibility issues raised by this fix.
- Move some signal code around.
- Rewrite r4k count/compare clockevent device registration such that
min_delta_ticks/max_delta_ticks files are guaranteed to be
initialized.
- Only register r4k count/compare as clockevent device if we can
assume the clock to be constant.
- Fix MSA asm warnings in control reg accessors
- uasm and tlbex fixes and tweaking.
- Print segment physical address when EU=1.
- Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO.
- CP: Allow booting by VP other than VP 0
- Cache handling fixes and optimizations for r4k class caches
- Add hotplug support for R6 processors
- Cleanup hotplug bits in kconfig
- traps: return correct si code for accessing nonmapped addresses
- Remove cpu_has_safe_index_cacheops
Lantiq:
- Register IRQ handler for virtual IRQ number
- Fix EIU interrupt loading code
- Use the real EXIN count
- Fix build error.
Loongson 3:
- Increase HPET_MIN_PROG_DELTA and decrease HPET_MIN_CYCLES
Octeon:
- Delete built-in DTB pruning code for D-Link DSR-1000N.
- Clean up GPIO definitions in dlink_dsr-1000n.dts.
- Add more LEDs to the DSR-100n DTS
- Fix off by one in octeon_irq_gpio_map()
- Typo fixes
- Enable SATA by default in cavium_octeon_defconfig
- Support readq/writeq()
- Remove forced mappings of USB interrupts.
- Ensure DMA descriptors are always in the low 4GB
- Improve USB reset code for OCTEON II.
Pistachio:
- Add maintainers entry for pistachio SoC Support
- Remove plat_setup_iocoherency
Ralink:
- Fix pwm UART in spis group pinmux.
SSB:
- Change bare unsigned to unsigned int to suit coding style
Tools:
- Fix reloc tool compiler warnings.
Other:
- Delete use of ARCH_WANT_OPTIONAL_GPIOLIB"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (61 commits)
MIPS: mm: Fix definition of R6 cache instruction
MIPS: tools: Fix relocs tool compiler warnings
MIPS: Cobalt: Fix typo
MIPS: Octeon: Fix typo
MIPS: Lantiq: Fix build failure
MIPS: Use CPHYSADDR to implement mips32 __pa
MIPS: Octeon: Dlink_dsr-1000n.dts: add more leds.
MIPS: Octeon: Clean up GPIO definitions in dlink_dsr-1000n.dts.
MIPS: Octeon: Delete built-in DTB pruning code for D-Link DSR-1000N.
MIPS: store the appended dtb address in a variable
MIPS: ZBOOT: copy appended dtb to the end of the kernel
MIPS: ralink: fix spis group pinmux
MIPS: Factor o32 specific code into signal_o32.c
MIPS: non-exec stack & heap when non-exec PT_GNU_STACK is present
MIPS: Use per-mm page to execute branch delay slot instructions
MIPS: Modify error handling
MIPS: c-r4k: Use SMP calls for CM indexed cache ops
MIPS: c-r4k: Avoid small flush_icache_range SMP calls
MIPS: c-r4k: Local flush_icache_range cache op override
MIPS: c-r4k: Split r4k_flush_kernel_vmap_range()
...
AT_VECTOR_SIZE_ARCH should be defined with the maximum number of
NEW_AUX_ENT entries that ARCH_DLINFO can contain, but it wasn't defined
for MIPS at all even though ARCH_DLINFO will contain one NEW_AUX_ENT for
the VDSO address.
This shouldn't be a problem as AT_VECTOR_SIZE_BASE includes space for
AT_BASE_PLATFORM which MIPS doesn't use, but lets define it now and add
the comment above ARCH_DLINFO as found in several other architectures to
remind future modifiers of ARCH_DLINFO to keep AT_VECTOR_SIZE_ARCH up to
date.
Fixes: ebb5e78cc6 ("MIPS: Initial implementation of a VDSO")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13823/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The opcodes currently defined in inst.h as cbcond0_op & cbcond1_op are
actually defined in the MIPS base instruction set manuals as pop10 &
pop30 respectively. Rename them as such, for consistency with the
documentation.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The opcodes currently defined in inst.h as beqzcjic_op & bnezcjialc_op
are actually defined in the MIPS base instruction set manuals as pop66 &
pop76 respectively. Rename them as such, for consistency with the
documentation.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the R6 MUL instruction encoding for 3 operand signed multiply to
uasm so that KVM can use uasm for generating its entry point code at
runtime on R6.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add MTHI/MTLO instructions for writing to the hi & lo registers to uasm
so that KVM can use uasm for generating its entry point code at runtime.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add DI instruction for disabling interrupts to uasm so that KVM can use
uasm for generating its entry point code at runtime.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add CFCMSA/CTCMSA instructions for accessing MSA control registers to
uasm so that KVM can use uasm for generating its entry point code at
runtime.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Convert various MIPS KVM guest instruction emulation functions to decode
instructions (and encode translations) using the union mips_instruction
and related enumerations in asm/inst.h rather than #defines and
hardcoded values.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Declare the opcode for the MIPSr6 sel.fmt instruction, as fsel_op in
order to match other FP op names.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Maciej W. Rozycki <macro@imgtec.com>
Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13152/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Loongson-3A R2 has pwbase/pwfield/pwsize/pwctl registers in CP0 (this
is very similar to HTW) and lwdir/lwpte/lddir/ldpte instructions which
can be used for fast TLB refill.
[ralf@linux-mips.org: Resolve conflict.]
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Steven J . Hill <sjhill@realitydiluted.com>
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12754/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Since commit 8cb48fe169 ("MIPS: Provide correct siginfo_t.si_stime"),
MIPS' uapi/asm/siginfo.h has included uapi/asm-generic/siginfo.h
directly before defining MIPS' struct siginfo, in order to get the
necessary definitions needed for the siginfo struct without the generic
copy_siginfo() hitting compiler errors due to struct siginfo not yet
being defined.
Now that the generic copy_siginfo() is moved out to linux/signal.h we
can safely include asm-generic/siginfo.h before defining the MIPS
specific struct siginfo, which avoids the uapi/ include as well as
breakage due to generic copy_siginfo() being defined before struct
siginfo.
Reported-by: Christopher Ferris <cferris@google.com>
Fixes: 8cb48fe169 ("MIPS: Provide correct siginfo_t.si_stime")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Petr Malat <oss@malat.biz>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.0-
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 85efde6f4e ("make exported headers use strict posix types")
changed the asm-generic siginfo.h to use the __kernel_* types, and
commit 3a471cbc08 ("remove __KERNEL_STRICT_NAMES") make the internal
types accessible only to the kernel, but the MIPS implementation hasn't
been updated to match.
Switch to proper types now so that the exported asm/siginfo.h won't
produce quite so many compiler errors when included alone by a user
program.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Christopher Ferris <cferris@google.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 2.6.30-
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/12477/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull x86 protection key support from Ingo Molnar:
"This tree adds support for a new memory protection hardware feature
that is available in upcoming Intel CPUs: 'protection keys' (pkeys).
There's a background article at LWN.net:
https://lwn.net/Articles/643797/
The gist is that protection keys allow the encoding of
user-controllable permission masks in the pte. So instead of having a
fixed protection mask in the pte (which needs a system call to change
and works on a per page basis), the user can map a (handful of)
protection mask variants and can change the masks runtime relatively
cheaply, without having to change every single page in the affected
virtual memory range.
This allows the dynamic switching of the protection bits of large
amounts of virtual memory, via user-space instructions. It also
allows more precise control of MMU permission bits: for example the
executable bit is separate from the read bit (see more about that
below).
This tree adds the MM infrastructure and low level x86 glue needed for
that, plus it adds a high level API to make use of protection keys -
if a user-space application calls:
mmap(..., PROT_EXEC);
or
mprotect(ptr, sz, PROT_EXEC);
(note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice
this special case, and will set a special protection key on this
memory range. It also sets the appropriate bits in the Protection
Keys User Rights (PKRU) register so that the memory becomes unreadable
and unwritable.
So using protection keys the kernel is able to implement 'true'
PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies
PROT_READ as well. Unreadable executable mappings have security
advantages: they cannot be read via information leaks to figure out
ASLR details, nor can they be scanned for ROP gadgets - and they
cannot be used by exploits for data purposes either.
We know about no user-space code that relies on pure PROT_EXEC
mappings today, but binary loaders could start making use of this new
feature to map binaries and libraries in a more secure fashion.
There is other pending pkeys work that offers more high level system
call APIs to manage protection keys - but those are not part of this
pull request.
Right now there's a Kconfig that controls this feature
(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled
(like most x86 CPU feature enablement code that has no runtime
overhead), but it's not user-configurable at the moment. If there's
any serious problem with this then we can make it configurable and/or
flip the default"
* 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
mm/pkeys: Fix siginfo ABI breakage caused by new u64 field
x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA
mm/core, x86/mm/pkeys: Add execute-only protection keys support
x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags
x86/mm/pkeys: Allow kernel to modify user pkey rights register
x86/fpu: Allow setting of XSAVE state
x86/mm: Factor out LDT init from context init
mm/core, x86/mm/pkeys: Add arch_validate_pkey()
mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()
x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU
x86/mm/pkeys: Add Kconfig prompt to existing config option
x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps
x86/mm/pkeys: Dump PKRU with other kernel registers
mm/core, x86/mm/pkeys: Differentiate instruction fetches
x86/mm/pkeys: Optimize fault handling in access_error()
mm/core: Do not enforce PKEY permissions on remote mm access
um, pkeys: Add UML arch_*_access_permitted() methods
mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
x86/mm/gup: Simplify get_user_pages() PTE bit handling
...
Stephen Rothwell reported this linux-next build failure:
http://lkml.kernel.org/r/20160226164406.065a1ffc@canb.auug.org.au
... caused by the Memory Protection Keys patches from the tip tree triggering
a newly introduced build-time sanity check on an ARM build, because they changed
the ABI of siginfo in an unexpected way.
If u64 has a natural alignment of 8 bytes (which is the case on most mainstream
platforms, with the notable exception of x86-32), then the leadup to the
_sifields union matters:
typedef struct siginfo {
int si_signo;
int si_errno;
int si_code;
union {
...
} _sifields;
} __ARCH_SI_ATTRIBUTES siginfo_t;
Note how the first 3 fields give us 12 bytes, so _sifields is not 8
naturally bytes aligned.
Before the _pkey field addition the largest element of _sifields (on
32-bit platforms) was 32 bits. With the u64 added, the minimum alignment
requirement increased to 8 bytes on those (rare) 32-bit platforms. Thus
GCC padded the space after si_code with 4 extra bytes, and shifted all
_sifields offsets by 4 bytes - breaking the ABI of all of those
remaining fields.
On 64-bit platforms this problem was hidden due to _sifields already
having numerous fields with natural 8 bytes alignment (pointers).
To fix this, we replace the u64 with an '__u32'. The __u32 does not
increase the minimum alignment requirement of the union, and it is
also large enough to store the 16-bit pkey we have today on x86.
Reported-by: Stehen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Stehen Rothwell <sfr@canb.auug.org.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Helge Deller <deller@gmx.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-next@vger.kernel.org
Fixes: cd0ea35ff5 ("signals, pkeys: Notify userspace about protection key faults")
Link: http://lkml.kernel.org/r/20160301125451.02C7426D@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch add the SO_CNX_ADVICE socket option (setsockopt only). The
purpose is to allow an application to give feedback to the kernel about
the quality of the network path for a connected socket. The value
argument indicates the type of quality report. For this initial patch
the only supported advice is a value of 1 which indicates "bad path,
please reroute"-- the action taken by the kernel is to call
dst_negative_advice which will attempt to choose a different ECMP route,
reset the TX hash for flow label and UDP source port in encapsulation,
etc.
This facility should be useful for connected UDP sockets where only the
application can provide any feedback about path quality. It could also
be useful for TCP applications that have additional knowledge about the
path outside of the normal TCP control loop.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ia64 and mips have separate definitions for siginfo from the
generic one. Patch them to have the pkey fields.
Note that this is exactly what we did for MPX as well.
[ This fixes a compile error that Ingo was hitting with MIPS when the
x86 pkeys patch set is applied. ]
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Malat <oss@malat.biz>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20160217181703.E99B6656@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull MIPS updates from Ralf Baechle:
"This is the main pull request for MIPS for 4.5 plus some 4.4 fixes.
The executive summary:
- ATH79 platform improvments, use DT bindings for the ATH79 USB PHY.
- Avoid useless rebuilds for zboot.
- jz4780: Add NEMC, BCH and NAND device tree nodes
- Initial support for the MicroChip's DT platform. As all the device
drivers are missing this is still of limited use.
- Some Loongson3 cleanups.
- The unavoidable whitespace polishing.
- Reduce clock skew when synchronizing the CPU cycle counters on CPU
startup.
- Add MIPS R6 fixes.
- Lots of cleanups across arch/mips as fallout from KVM.
- Lots of minor fixes and changes for IEEE 754-2008 support to the
FPU emulator / fp-assist software.
- Minor Ralink, BCM47xx and bcm963xx platform support improvments.
- Support SMP on BCM63168"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (84 commits)
MIPS: zboot: Add support for serial debug using the PROM
MIPS: zboot: Avoid useless rebuilds
MIPS: BMIPS: Enable ARCH_WANT_OPTIONAL_GPIOLIB
MIPS: bcm63xx: nvram: Remove unused bcm63xx_nvram_get_psi_size() function
MIPS: bcm963xx: Update bcm_tag field image_sequence
MIPS: bcm963xx: Move extended flash address to bcm_tag header file
MIPS: bcm963xx: Move Broadcom BCM963xx image tag data structure
MIPS: bcm63xx: nvram: Use nvram structure definition from header file
MIPS: bcm963xx: Add Broadcom BCM963xx board nvram data structure
MAINTAINERS: Add KVM for MIPS entry
MIPS: KVM: Add missing newline to kvm_err()
MIPS: Move KVM specific opcodes into asm/inst.h
MIPS: KVM: Use cacheops.h definitions
MIPS: Break down cacheops.h definitions
MIPS: Use EXCCODE_ constants with set_except_vector()
MIPS: Update trap codes
MIPS: Move Cause.ExcCode trap codes to mipsregs.h
MIPS: KVM: Make kvm_mips_{init,exit}() static
MIPS: KVM: Refactor added offsetof()s
MIPS: KVM: Convert EXPORT_SYMBOL to _GPL
...
The header arch/mips/kvm/opcode.h defines a few extra opcodes which
aren't in arch/mips/include/uapi/asm/inst.h. There's nothing KVM
specific about them, so lets move them into inst.h where they belong and
delete the header.
Note that mfmcz_op is renamed to mfmc0_op to match the instruction set
manual, and wait_op was already added to inst.h in commit b0a3eae2b9
("MIPS: inst.h: define COP0 wait op"), merged in v3.16-rc1.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/11895/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Fix the description of the microMIPS NOP16 encoding or MM_NOP16, which
is not equivalent to the MIPS16 NOP instruction. This is 0x0c00 and
represents the microMIPS `MOVE16 $0, $0' operation, whereas MIPS16 NOP
is encoded as 0x6500, representing `MOVE $0, $16'.
Also fix a typo in `mm_fp0_format' description.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12177/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Emulate the microMIPS ADDIUPC instruction directly in `mips_dsemul'. If
executed in the emulation frame, this instruction produces an incorrect
result, because the value of the PC there is not the same as where the
instruction originated.
Reshape code so as to handle all microMIPS cases together.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12175/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commits 21f55b018b ("arch/*/include/uapi/asm/mman.h: : let MADV_FREE
have same value for all architectures") and ef58978f1e ("mm: define
MADV_FREE for some arches") both defined MADV_FREE, but did not use the
same values. This results in build errors such as
./arch/alpha/include/uapi/asm/mman.h:53:0: error: "MADV_FREE" redefined
./arch/alpha/include/uapi/asm/mman.h:50:0: note: this is the location of the previous definition
for the affected architectures.
Fixes: 21f55b018b ("arch/*/include/uapi/asm/mman.h: : let MADV_FREE have same value for all architectures")
Fixes: ef58978f1e ("mm: define MADV_FREE for some arches")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Acked-by: Helge Deller <deller@gmx.de> [parisc]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For uapi, need try to let all macros have same value, and MADV_FREE is
added into main branch recently, so need redefine MADV_FREE for it.
At present, '8' can be shared with all architectures, so redefine it to
'8'.
[sudipm.mukherjee@gmail.com: correct uniform value of MADV_FREE]
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Shaohua Li <shli@kernel.org>
Cc: <yalin.wang2010@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Jason Evans <je@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mika Penttil <mika.penttila@nextfour.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Shaohua Li <shli@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Most architectures use asm-generic, but alpha, mips, parisc, xtensa need
their own definitions.
This patch defines MADV_FREE for them so it should fix build break for
their architectures.
Maybe, I should split and feed pieces to arch maintainers but included
here for mmotm convenience.
[gang.chen.5i5j@gmail.com: let MADV_FREE have same value for all architectures]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Shaohua Li <shli@kernel.org>
Cc: <yalin.wang2010@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Evans <je@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mika Penttil <mika.penttila@nextfour.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Shaohua Li <shli@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Expose socket options for setting a classic or extended BPF program
for use when selecting sockets in an SO_REUSEPORT group. These options
can be used on the first socket to belong to a group before bind or
on any socket in the group after bind.
This change includes refactoring of the existing sk_filter code to
allow reuse of the existing BPF filter validation checks.
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull MIPS updates from Ralf Baechle:
"These are the highlists of the main MIPS pull request for 4.4:
- Add latencytop support
- Support appended DTBs
- VDSO support and initially use it for gettimeofday.
- Drop the .MIPS.abiflags and ELF NOTE sections from vmlinux
- Support for the 5KE, an internal test core.
- Switch all MIPS platfroms to libata drivers.
- Improved support, cleanups for ralink and Lantiq platforms.
- Support for the new xilfpga platform.
- A number of DTB improvments for BMIPS.
- Improved support for CM and CPS.
- Minor JZ4740 and BCM47xx enhancements"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (120 commits)
MIPS: idle: add case for CPU_5KE
MIPS: Octeon: Support APPENDED_DTB
MIPS: vmlinux: create a section for appended DTB
MIPS: Clean up compat_siginfo_t
MIPS: Fix PAGE_MASK definition
MIPS: BMIPS: Enable GZIP ramdisk and timed printks
MIPS: Add xilfpga defconfig
MIPS: xilfpga: Add mipsfpga platform code
MIPS: xilfpga: Add xilfpga device tree files.
dt-bindings: MIPS: Document xilfpga bindings and boot style
MIPS: Make MIPS_CMDLINE_DTB default
MIPS: Make the kernel arguments from dtb available
MIPS: Use USE_OF as the guard for appended dtb
MIPS: BCM63XX: Use pr_* instead of printk
MIPS: Loongson: Cleanup CONFIG_LOONGSON_SUSPEND.
MIPS: lantiq: Disable xbar fpi burst mode
MIPS: lantiq: Force the crossbar to big endian
MIPS: lantiq: Initialize the USB core on boot
MIPS: lantiq: Return correct value for fpi clock on ar9
MIPS: ralink: Add missing clock on rt305x
...
Add an initial implementation of a proper (i.e. an ELF shared library)
VDSO. With this commit it does not export any symbols, it only replaces
the current signal return trampoline page. A later commit will add user
implementations of gettimeofday()/clock_gettime().
To support both new toolchains and old ones which don't generate ABI
flags section, we define its content manually and then use a tool
(genvdso) to patch up the section to have the correct name and type.
genvdso also extracts symbol offsets ({,rt_}sigreturn) needed by the
kernel, and generates a C file containing a "struct mips_vdso_image"
containing both the VDSO data and these offsets. This C file is
compiled into the kernel.
On 64-bit kernels we require a different VDSO for each supported ABI,
so we may build up to 3 different VDSOs. The VDSO to use is selected by
the mips_abi structure.
A kernel/user shared data page is created and mapped below the VDSO
image. This is currently empty, but will be used by the user time
function implementations which are added later.
[markos.chandras@imgtec.com:
- Add more comments
- Move abi detection in genvdso.h since it's the get_symbol function
that needs it.
- Add an R6 specific way to calculate the base address of VDSO in order
to avoid the branch instruction which affects performance.
- Do not patch .gnu.attributes since it's not needed for dynamic linking.
- Simplify Makefile a little bit.
- checkpatch fixes
- Restrict VDSO support for binutils < 2.25 for pre-R6
- Include atomic64.h for O32 variant on MIPS64]
Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: Matthew Fortune <matthew.fortune@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11337/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The previous patch introduced a flag that specified pages in a VMA should
be placed on the unevictable LRU, but they should not be made present when
the area is created. This patch adds the ability to set this state via
the new mlock system calls.
We add MLOCK_ONFAULT for mlock2 and MCL_ONFAULT for mlockall.
MLOCK_ONFAULT will set the VM_LOCKONFAULT modifier for VM_LOCKED.
MCL_ONFAULT should be used as a modifier to the two other mlockall flags.
When used with MCL_CURRENT, all current mappings will be marked with
VM_LOCKED | VM_LOCKONFAULT. When used with MCL_FUTURE, the mm->def_flags
will be marked with VM_LOCKED | VM_LOCKONFAULT. When used with both
MCL_CURRENT and MCL_FUTURE, all current mappings and mm->def_flags will be
marked with VM_LOCKED | VM_LOCKONFAULT.
Prior to this patch, mlockall() will unconditionally clear the
mm->def_flags any time it is called without MCL_FUTURE. This behavior is
maintained after adding MCL_ONFAULT. If a call to mlockall(MCL_FUTURE) is
followed by mlockall(MCL_CURRENT), the mm->def_flags will be cleared and
new VMAs will be unlocked. This remains true with or without MCL_ONFAULT
in either mlockall() invocation.
munlock() will unconditionally clear both vma flags. munlockall()
unconditionally clears for VMA flags on all VMAs and in the mm->def_flags
field.
Signed-off-by: Eric B Munson <emunson@akamai.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some GCC versions (e.g. 4.8.3) can incorrectly inline a function with
MIPS32 instructions into another function with MIPS16 code [1], causing
the assembler to genereate incorrect binary code or fail right away
complaining about unrecognized opcode.
In the case of __arch_swab{16,32}, when inlined by the compiler with
flags `-mips32r2 -mips16 -Os', the assembler can fail with the following
error.
{standard input}:79: Error: unrecognized opcode `wsbh $2,$2'
For performance concerns and to workaround the issue already existing in
older compilers, just ignore these 2 functions when compiling with
mips16 enabled.
[1] Inlining nomips16 function into mips16 function can result in
undefined builtins, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55777
Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11241/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This reverts commit e0d8b2ec53.
For at least GCC 4.8.3, adding nomips16 function attribute still cannot
prevent it from being inlined in mips16 context. So revert it first in
preparation for a better workaround.
[1] Inlining nomips16 function into mips16 function can result in
undefined builtins, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55777
Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11240/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The nomips16 has to be added both as function attribute and assembler
directive.
When only function attribute is specified, the compiler will inline the
function with -Os optimization. The generated assembly code cannot be
correctly assembled because ISA mode switch has to be done through jump
instruction.
When only ".set nomips16" directive is used, the generated assembly code
will use MIPS32 code for the inline assembly template and MIPS16 for the
function return. The compiled binary is invalid:
00403100 <__arch_swab16>:
403100: 7c0410a0 wsbh v0,a0
403104: e820ea31 swc2 $0,-5583(at)
while correct code should be:
00402650 <__arch_swab16>:
402650: 7c0410a0 wsbh v0,a0
402654: 03e00008 jr ra
402658: 3042ffff andi v0,v0,0xffff
Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com>
Cc: Chen Jie <chenj@lemote.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/11087/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
In order for userland to determine whether various features are safe to
use, it will need to know both that the hardware supports those features
and that the kernel is recent enough & configured appropriately to
support them. For example under the O32 modeless FP proposal the dynamic
linker & ifunc resolvers will need this information. The kernel is the
only thing in a position to know availability accurately, so the kernel
needs to provide the information to userland. This patch introduces the
infrastructure to provide the AT_HWCAP aux vector to userland in order
to provide that information. It also defines the 2 currently specified
flags, which indicate MIPSr6 & MSA support.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Matthew Fortune <matthew.fortune@imgtec.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Alex Smith <alex@alex-smith.me.uk>
Cc: linux-kernel@vger.kernel.org
Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Ingo Molnar <mingo@kernel.org>
Patchwork: https://patchwork.linux-mips.org/patch/10797/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The context introduced by MSA needs to be saved around signals. However,
we can't increase the size of struct sigcontext because that will change
the offset of the signal mask in struct sigframe or struct ucontext.
This patch instead places the new context immediately after the struct
sigframe for traditional signals, or similarly after struct ucontext for
RT signals. The layout of struct sigframe & struct ucontext is identical
from their sigcontext fields onwards, so the offset from the sigcontext
to the extended context will always be the same regardless of the type
of signal.
Userland will be able to search through the extended context by using
the magic values to detect which types of context are present. Any
unrecognised context can be skipped over using the size field of struct
extcontext. Once the magic value END_EXTCONTEXT_MAGIC is seen it is
known that there are no further extended context structures to examine.
This approach is somewhat similar to that taken by ARM to save VFP &
other context at the end of struct ucontext.
Userland can determine whether extended context is present by checking
for the USED_EXTCONTEXT bit in the sc_used_math field of struct
sigcontext. Whilst this could potentially change the historic semantics
of sc_used_math if further extended context which does not imply FP
context were to be introduced in the future, I have been unable to find
any userland code making use of sc_used_math at all. Using one of the
fields described as unused in struct sigcontext was considered, but the
kernel does not already write to those fields so there would be no
guarantee of the field being clear on older kernels. Other alternatives
would be to have userland check the kernel version, or to have a HWCAP
bit indicating presence of extended context. However there is a desire
to have the context & information required to decode it be self
contained such that, for example, debuggers could decode the saved
context easily.
[ralf@linux-mips.org: Fixed conflict.]
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Matthew Fortune <matthew.fortune@imgtec.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com>
Cc: Alex Smith <alex@alex-smith.me.uk>
Cc: linux-kernel@vger.kernel.org
Cc: Richard Weinberger <richard@nod.at>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Maciej W. Rozycki <macro@codesourcery.com>
Patchwork: https://patchwork.linux-mips.org/patch/10795/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The sc_used_math field of struct sigcontext & its variants has
traditionally been used as a boolean value indicating only whether or
not floating point context is saved within the sigcontext. With various
supported FP modes & the ability to switch between them this information
will no longer be enough to decode the meaning of the data stored in the
sc_fpregs fields of struct sigcontext.
To make that possible 3 bits are defined within sc_used_math:
- Bit 0 (USED_FP) represents whether FP was used, essentially
providing the boolean flag which sc_used_math as a whole provided
previously.
- Bit 1 (USED_FR1) provides the value of the Status.FR bit at the time
the FP context was saved.
- Bit 2 (USED_HYBRID_FPRS) indicates whether the FP context was saved
under the hybrid FPR scheme. Essentially, when set the odd singles
are located in bits 63:32 of the preceding even indexed sc_fpregs
element.
Any userland that tests whether the sc_used_math field is zero or
non-zero will continue to function as expected. Having said that, I
could not find any userland which uses the sc_used_math field at all.
[ralf@linux-mips.org: Fixed rejects.]
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Matthew Fortune <matthew.fortune@imgtec.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linux-kernel@vger.kernel.org
Cc: Richard Weinberger <richard@nod.at>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Maciej W. Rozycki <macro@codesourcery.com>
Patchwork: https://patchwork.linux-mips.org/patch/10794/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Declare a struct describing the MSA MI10 instruction format used for ld &
st instructions, for use by subsequent patches.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-kernel@vger.kernel.org
Cc: Markos Chandras <markos.chandras@imgtec.com>
Patchwork: https://patchwork.linux-mips.org/patch/10571/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add KVM register numbers for the MIPS SIMD Architecture (MSA) registers,
and implement access to them with the KVM_GET_ONE_REG / KVM_SET_ONE_REG
ioctls when the MSA capability is enabled (exposed in a later patch) and
present in the guest according to its Config3.MSAP bit.
The MSA vector registers use the same register numbers as the FPU
registers except with a different size (128bits). Since MSA depends on
Status.FR=1, these registers are inaccessible when Status.FR=0. These
registers are returned as a single native endian 128bit value, rather
than least significant half first with each 64-bit half native endian as
the kernel uses internally.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-api@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Add KVM register numbers for the MIPS FPU registers, and implement
access to them with the KVM_GET_ONE_REG / KVM_SET_ONE_REG ioctls when
the FPU capability is enabled (exposed in a later patch) and present in
the guest according to its Config1.FP bit.
The registers are accessible in the current mode of the guest, with each
sized access showing what the guest would see with an equivalent access,
and like the architecture they may become UNPREDICTABLE if the FR mode
is changed. When FR=0, odd doubles are inaccessible as they do not exist
in that mode.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-api@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Clean up KVM_GET_ONE_REG / KVM_SET_ONE_REG register definitions for
MIPS, to prepare for adding a new group for FPU & MSA vector registers.
Definitions are added for common bits in each group of registers, e.g.
KVM_REG_MIPS_CP0 = KVM_REG_MIPS | 0x10000, for the coprocessor 0
registers.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Pull MIPS updates from Ralf Baechle:
"This is the main pull request for MIPS:
- a number of fixes that didn't make the 3.19 release.
- a number of cleanups.
- preliminary support for Cavium's Octeon 3 SOCs which feature up to
48 MIPS64 R3 cores with FPU and hardware virtualization.
- support for MIPS R6 processors.
Revision 6 of the MIPS architecture is a major revision of the MIPS
architecture which does away with many of original sins of the
architecture such as branch delay slots. This and other changes in
R6 require major changes throughout the entire MIPS core
architecture code and make up for the lion share of this pull
request.
- finally some preparatory work for eXtendend Physical Address
support, which allows support of up to 40 bit of physical address
space on 32 bit processors"
[ Ahh, MIPS can't leave the PAE brain damage alone. It's like
every CPU architect has to make that mistake, but pee in the snow
by changing the TLA. But whether it's called PAE, LPAE or XPA,
it's horrid crud - Linus ]
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (114 commits)
MIPS: sead3: Corrected get_c0_perfcount_int
MIPS: mm: Remove dead macro definitions
MIPS: OCTEON: irq: add CIB and other fixes
MIPS: OCTEON: Don't do acknowledge operations for level triggered irqs.
MIPS: OCTEON: More OCTEONIII support
MIPS: OCTEON: Remove setting of processor specific CVMCTL icache bits.
MIPS: OCTEON: Core-15169 Workaround and general CVMSEG cleanup.
MIPS: OCTEON: Update octeon-model.h code for new SoCs.
MIPS: OCTEON: Implement DCache errata workaround for all CN6XXX
MIPS: OCTEON: Add little-endian support to asm/octeon/octeon.h
MIPS: OCTEON: Implement the core-16057 workaround
MIPS: OCTEON: Delete unused COP2 saving code
MIPS: OCTEON: Use correct instruction to read 64-bit COP0 register
MIPS: OCTEON: Save and restore CP2 SHA3 state
MIPS: OCTEON: Fix FP context save.
MIPS: OCTEON: Save/Restore wider multiply registers in OCTEON III CPUs
MIPS: boot: Provide more uImage options
MIPS: Remove unneeded #ifdef __KERNEL__ from asm/processor.h
MIPS: ip22-gio: Remove legacy suspend/resume support
mips: pci: Add ifdef around pci_proc_domain
...
Provide correct siginfo_t.si_stime on MIPS64
Bug description:
MIPS version of copy_siginfo() is not aware of alignment on platforms with
64-bit long integers, which leads to an incorrect si_stime passed to signal
handlers, because the last element (si_stime) of _sifields._sigchld is not
copied. If _MIPS_SZLONG is 64, then the _sifields starts at the offset of
4 * sizeof(int).
Patch description:
Use the generic copy_siginfo, which doesn't have this problem.
Signed-off-by: Petr Malat <oss@malat.biz>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/8671/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>