Remove the old ccwgroup_create_from_string interface since all
drivers have been converted to ccwgroup_create_dev. Also remove
now unused members of ccwgroup_driver.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Instead of finding devices via driver_find_device use the bus_find_device
wrapper get_ccwdev_by_dev_id. This allows us to get rid of the ccw_driver
argument of ccwgroup_create_dev and thus simplify the interface.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add a new interface for drivers to create a group device. Via the old
interface ccwgroup_create_from_string we would create a virtual device
in a way that only the caller of this function would match and bind to.
Via the new ccwgroup_create_dev we stop playing games with the driver
core and directly set the driver of the new group device. For drivers
which have todo additional setup steps (like setting driver_data)
provide a new setup driver callback.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There is a small race window in the __switch_to code in regard to
the transfer of the TIF_MCCK_PENDING bit from the previous to the
next task. The bit is transferred before the task struct pointer
and the thread-info pointer for the next task has been stored to
lowcore. If a machine check sets the TIF_MCCK_PENDING bit between
the transfer code and the store of current/thread_info the bit
is still set for the previous task. And if the previous task has
terminated it can get lost. The effect is that a pending CRW is
not retrieved until the next machine checks sets TIF_MCCK_PENDING.
To fix this reorder __switch_to to first store the task struct
and thread-info pointer and then do the transfer of the bit.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If the kernel gets compiled for at least z196, make use of
the fast-BCR facility.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
s390 really has no eieio instruction, so get rid of the implied ppc
semantics and in addition change mb() into a function.
Also remove SYNC_OTHER_CORES() since it is unused.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use HAVE_MARCH_Z9_109_FEATURES to figure out if stckf is available
at compile time.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add HAVE_MARCH_[ARCH]_FEATURES Kconfig symbols. Whenever there is code that
needs an instruction that is only present beginning with a hardware generation
the #ifdef chain can become quite long.
To avoid this add the new Kconfig symbols which are selected if the kernel
gets compiled for at least the specified symbol.
If for example the kernel gets compiled for z196 this means that also all
symbols for all previous architure features are set.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If the task that was found on an initial interrupt doesn't match the
current task execute a WARN_ON_ONCE() and don't put the task to sleep.
When this happened something went wrong between the interface of the
hypervisor and the kernel. In such a case keep the tasks alive to
avoid a hanging system.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use __set_task_state() instead of set_task_state(). Saves a couple of
instructions, since the memory barrier is not needed here.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Make the code a bit more symmetric and always search for the task of the
reported pid. This simplifies the code a bit.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When setting the current task state to TASK_UNINTERRUPTIBLE this can
race with a different cpu. The other cpu could set the task state after
it inspected it (while it was still TASK_RUNNING) to TASK_RUNNING which
would change the state from TASK_UNINTERRUPTIBLE to TASK_RUNNING again.
This race was always present in the pfault interrupt code but didn't
cause anything harmful before commit f2db2e6c "[S390] pfault: cpu hotplug
vs missing completion interrupts" which relied on the fact that after
setting the task state to TASK_UNINTERRUPTIBLE the task would really
sleep.
Since this is not necessarily the case the result may be a list corruption
of the pfault_list or, as observed, a use-after-free bug while trying to
access the task_struct of a task which terminated itself already.
To fix this, we need to get a reference of the affected task when receiving
the initial pfault interrupt and add special handling if we receive yet
another initial pfault interrupt when the task is already enqueued in the
pfault list.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <stable@vger.kernel.org> # needed for v3.0 and newer
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The store clock fast instruction saves a couple of instructions compared
to the store clock instruction. Always use stckf instead of stck if it
is available.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
commit c3f0327f8e
mm: add rss counters consistency check
detected the following problem with kvm on s390:
BUG: Bad rss-counter state mm:00000004f73ef000 idx:0 val:-10
BUG: Bad rss-counter state mm:00000004f73ef000 idx:1 val:-5
We have to make sure that we accumulate all rss values into
the mm before we replace the mm to avoid triggering this (harmless)
bug message.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the builtin tape ipl code. If somebody really wants to create a
tape which can be ipl'ed from, then this can be achieved by using zipl.
zipl can write an ipl record to a tape device and aftwards the kernel
image must be written to tape.
The steps are described in the "Linux on System z - Device Drivers,
Features, and Commands" book (SC33-8411).
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The code in entry[64].S calls do_signal only on return to user space.
user_mode(regs) is true for every calls to do_signal, it is unnecessary
to recheck user_mode at the start of do_signal and the legacy signal
stack switching path in get_sigframe is never reached.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The software large page emulation on s390 did not clear the the
pre-allocated page table in arch_release_hugepage() before freeing
it. This could trigger the WARN_ON(!pte_none(*pte) in mm/vmalloc.c:106
and make vmap_pte_range() fail, because the page table could be reused
in page_table_alloc(). This is fixed now by calling clear_table()
before page_table_free().
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Replace the check for TIF_SIE in the fault handler by a check for PF_VCPU.
With the last user of TIF_SIE gone we can now remove the bit.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Currently dev/mem for s390 provides only real memory access. This means
that the CPU prefix pages are swapped. The prefix swap for real memory
works as follows:
Each CPU owns a prefix register that points to a page aligned memory
location "P". If this CPU accesses the address range [0,0x1fff], it is
translated by the hardware to [P,P+0x1fff]. Accordingly if this CPU
accesses the address range [P,P+0x1fff], it is translated by the hardware
to [0,0x1fff]. Therefore, if [P,P+0x1fff] or [0,0x1fff] is read from
the current /dev/mem device, the incorrectly swapped memory content is
returned.
With this patch the /dev/mem architecture code is modified to provide
absolute memory access. This is done via the arch specific functions
xlate_dev_mem_ptr() and unxlate_dev_mem_ptr(). For swapped pages on
s390 the function xlate_dev_mem_ptr() now returns a new buffer with a
copy of the requested absolute memory. In case the buffer was allocated,
the unxlate_dev_mem_ptr() function frees it after /dev/mem code has
called copy_to_user().
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
HANDLE_SIE_INTERCEPT is called early, use supervisor state and
instruction address to decide if the reset of the PSW to sie_loop
is required.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
To allow correct stack backtraces the backchain for the external
interrupt handler is now initialized with zero like it is already
done for example by io_int_handler().
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix this:
arch/s390/kernel/crash_dump.c:296:14:
error: 'zfcpdump_save_areas' undeclared (first use in this function)
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add missing #ifdep CONFIG_HOTPLUG_CPU to get rid of this one:
arch/s390/kernel/smp.c:229:13: warning: 'pcpu_free_lowcore'
defined but not used [-Wunused-function]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The inline assembly in__arch_swab16p causes compile errors of the form:
*error*: *invalid* '*asm*': operand number missing after %-*letter*
The assembly uses the %O<n>/%R<n> notation but the first operand misses
the operand number, it needs to be "%O1" instead of "%O".
Reported-by: Gil Peleg <gilpeleg@servframe.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The stfle() function writes into lowcore memory when stfl_fac_list
is initialized with "S390_lowcore.stfl_fac_list = 0". For older
compilers this triggers a lowcore exception. With newer compilers
and "-OXX" compile option the bug does not show up because
the "S390_lowcore.stfl_fac_list" initialization is removed by the
compiler. The reason for thatis the incorrect "=m"
(S390_lowcore.stfl_fac_list) constraint in the stfl inline assembly.
The following shows the disassembly of the stfle() optimized code
that is inlined in the lgr_info_get() function:
000000000011325c <lgr_info_get>:
11325c: eb 9f f0 60 00 24 stmg %r9,%r15,96(%r15)
113262: c0 d0 00 29 0e 47 larl %r13,634ef0 <servi..>
113268: a7 f1 3f c0 tml %r15,16320
11326c: b9 04 00 ef lgr %r14,%r15
113270: a7 84 00 01 je 113272 <lgr_info_g..>
113274: a7 fb ff c0 aghi %r15,-64
113278: b9 04 00 c2 lgr %r12,%r2
11327c: a7 29 00 01 lghi %r2,1
113280: e3 e0 f0 98 00 24 stg %r14,152(%r15)
113286: d7 97 c0 00 c0 00 xc 0(152,%r12),0(%r12)
11328c: c0 e5 00 28 db 4c brasl %r14,62e924 <add_e..>
113292: b2 b1 00 00 stfl 0
To fix the problem we now clear the S390_lowcore.stfl_fac_list at
startup in "head.S" for all machine types before lowcore protection
is enabled.
In addition to that the "=m" constraint is replaced by "+m".
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix these:
arch/s390/kernel/perf_cpum_cf.c:180:3: warning: format '%lx'
expects argument of type 'long unsigned int',
but argument 2 has type 'int' [-Wformat]
arch/s390/kernel/perf_cpum_cf.c: In function 'cpumf_pmu_disable':
arch/s390/kernel/perf_cpum_cf.c:205:3: warning: format '%lx'
expects argument of type 'long unsigned int',
but argument 2 has type 'int' [-Wformat]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use braces for if/else/list_for_each_entry bodies if the body consists
of more than a single line. Otherwise I get confused and check if there
is something broken whenever I see these code snippets.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add TASKSTATS options, enable CRASH_DUMP, and regenerate defconfig
file with "make savedefconfig".
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Git commit 36409f6353 "use generic RCU
page-table freeing code" introduced a tlb flushing bug. Partially revert
the above git commit and go back to s390 specific page table flush code.
For s390 the TLB can contain three types of entries, "normal" TLB
page-table entries, TLB combined region-and-segment-table (CRST) entries
and real-space entries. Linux does not use real-space entries which
leaves normal TLB entries and CRST entries. The CRST entries are
intermediate steps in the page-table translation called translation paths.
For example a 4K page access in a three-level page table setup will
create two CRST TLB entries and one page-table TLB entry. The advantage
of that approach is that a page access next to the previous one can reuse
the CRST entries and needs just a single read from memory to create the
page-table TLB entry. The disadvantage is that the TLB flushing rules are
more complicated, before any page-table may be freed the TLB needs to be
flushed.
In short: the generic RCU page-table freeing code is incorrect for the
CRST entries, in particular the check for mm_users < 2 is troublesome.
This is applicable to 3.0+ kernels.
Cc: <stable@vger.kernel.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Currently in the memcpy_real() function interrupts are disabled with
__arch_local_irq_stnsm(). In order to notify lockdep that interrupts
are disabled, with this patch local_irq_save() is used instead. The
function __arch_local_irq_stnsm() is still used for switching to
real mode.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull x32 support for x86-64 from Ingo Molnar:
"This tree introduces the X32 binary format and execution mode for x86:
32-bit data space binaries using 64-bit instructions and 64-bit kernel
syscalls.
This allows applications whose working set fits into a 32 bits address
space to make use of 64-bit instructions while using a 32-bit address
space with shorter pointers, more compressed data structures, etc."
Fix up trivial context conflicts in arch/x86/{Kconfig,vdso/vma.c}
* 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
x32: Fix alignment fail in struct compat_siginfo
x32: Fix stupid ia32/x32 inversion in the siginfo format
x32: Add ptrace for x32
x32: Switch to a 64-bit clock_t
x32: Provide separate is_ia32_task() and is_x32_task() predicates
x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls
x86/x32: Fix the binutils auto-detect
x32: Warn and disable rather than error if binutils too old
x32: Only clear TIF_X32 flag once
x32: Make sure TS_COMPAT is cleared for x32 tasks
fs: Remove missed ->fds_bits from cessation use of fd_set structs internally
fs: Fix close_on_exec pointer in alloc_fdtable
x32: Drop non-__vdso weak symbols from the x32 VDSO
x32: Fix coding style violations in the x32 VDSO code
x32: Add x32 VDSO support
x32: Allow x32 to be configured
x32: If configured, add x32 system calls to system call tables
x32: Handle process creation
x32: Signal-related system calls
x86: Add #ifdef CONFIG_COMPAT to <asm/sys_ia32.h>
...
Pull arch/tile (really asm-generic) update from Chris Metcalf:
"These are a couple of asm-generic changes that apply to tile."
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
compat: use sys_sendfile64() implementation for sendfile syscall
[PATCH v3] ipc: provide generic compat versions of IPC syscalls
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAT3NKzROxKuMESys7AQKElw/+JyDxJSlj+g+nymkx8IVVuU8CsEwNLgRk
8KEnRfLhGtkXFLSJYWO6jzGo16F8Uqli1PdMFte/wagSv0285/HZaKlkkBVHdJ/m
u40oSjgT013bBh6MQ0Oaf8pFezFUiQB5zPOA9QGaLVGDLXCmgqUgd7exaD5wRIwB
ZmyItjZeAVnDfk1R+ZiNYytHAi8A5wSB+eFDCIQYgyulA1Igd1UnRtx+dRKbvc/m
rWQ6KWbZHIdvP1ksd8wHHkrlUD2pEeJ8glJLsZUhMm/5oMf/8RmOCvmo8rvE/qwl
eDQ1h4cGYlfjobxXZMHqAN9m7Jg2bI946HZjdb7/7oCeO6VW3FwPZ/Ic75p+wp45
HXJTItufERYk6QxShiOKvA+QexnYwY0IT5oRP4DrhdVB/X9cl2MoaZHC+RbYLQy+
/5VNZKi38iK4F9AbFamS7kd0i5QszA/ZzEzKZ6VMuOp3W/fagpn4ZJT1LIA3m4A9
Q0cj24mqeyCfjysu0TMbPtaN+Yjeu1o1OFRvM8XffbZsp5bNzuTDEvviJ2NXw4vK
4qUHulhYSEWcu9YgAZXvEWDEM78FXCkg2v/CrZXH5tyc95kUkMPcgG+QZBB5wElR
FaOKpiC/BuNIGEf02IZQ4nfDxE90QwnDeoYeV+FvNj9UEOopJ5z5bMPoTHxm4cCD
NypQthI85pc=
=G9mT
-----END PGP SIGNATURE-----
Merge tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system
Pull "Disintegrate and delete asm/system.h" from David Howells:
"Here are a bunch of patches to disintegrate asm/system.h into a set of
separate bits to relieve the problem of circular inclusion
dependencies.
I've built all the working defconfigs from all the arches that I can
and made sure that they don't break.
The reason for these patches is that I recently encountered a circular
dependency problem that came about when I produced some patches to
optimise get_order() by rewriting it to use ilog2().
This uses bitops - and on the SH arch asm/bitops.h drags in
asm-generic/get_order.h by a circuituous route involving asm/system.h.
The main difficulty seems to be asm/system.h. It holds a number of
low level bits with no/few dependencies that are commonly used (eg.
memory barriers) and a number of bits with more dependencies that
aren't used in many places (eg. switch_to()).
These patches break asm/system.h up into the following core pieces:
(1) asm/barrier.h
Move memory barriers here. This already done for MIPS and Alpha.
(2) asm/switch_to.h
Move switch_to() and related stuff here.
(3) asm/exec.h
Move arch_align_stack() here. Other process execution related bits
could perhaps go here from asm/processor.h.
(4) asm/cmpxchg.h
Move xchg() and cmpxchg() here as they're full word atomic ops and
frequently used by atomic_xchg() and atomic_cmpxchg().
(5) asm/bug.h
Move die() and related bits.
(6) asm/auxvec.h
Move AT_VECTOR_SIZE_ARCH here.
Other arch headers are created as needed on a per-arch basis."
Fixed up some conflicts from other header file cleanups and moving code
around that has happened in the meantime, so David's testing is somewhat
weakened by that. We'll find out anything that got broken and fix it..
* tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits)
Delete all instances of asm/system.h
Remove all #inclusions of asm/system.h
Add #includes needed to permit the removal of asm/system.h
Move all declarations of free_initmem() to linux/mm.h
Disintegrate asm/system.h for OpenRISC
Split arch_align_stack() out from asm-generic/system.h
Split the switch_to() wrapper out of asm-generic/system.h
Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h
Create asm-generic/barrier.h
Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h
Disintegrate asm/system.h for Xtensa
Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt]
Disintegrate asm/system.h for Tile
Disintegrate asm/system.h for Sparc
Disintegrate asm/system.h for SH
Disintegrate asm/system.h for Score
Disintegrate asm/system.h for S390
Disintegrate asm/system.h for PowerPC
Disintegrate asm/system.h for PA-RISC
Disintegrate asm/system.h for MN10300
...
Pull kvm updates from Avi Kivity:
"Changes include timekeeping improvements, support for assigning host
PCI devices that share interrupt lines, s390 user-controlled guests, a
large ppc update, and random fixes."
This is with the sign-off's fixed, hopefully next merge window we won't
have rebased commits.
* 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits)
KVM: Convert intx_mask_lock to spin lock
KVM: x86: fix kvm_write_tsc() TSC matching thinko
x86: kvmclock: abstract save/restore sched_clock_state
KVM: nVMX: Fix erroneous exception bitmap check
KVM: Ignore the writes to MSR_K7_HWCR(3)
KVM: MMU: make use of ->root_level in reset_rsvds_bits_mask
KVM: PMU: add proper support for fixed counter 2
KVM: PMU: Fix raw event check
KVM: PMU: warn when pin control is set in eventsel msr
KVM: VMX: Fix delayed load of shared MSRs
KVM: use correct tlbs dirty type in cmpxchg
KVM: Allow host IRQ sharing for assigned PCI 2.3 devices
KVM: Ensure all vcpus are consistent with in-kernel irqchip settings
KVM: x86 emulator: Allow PM/VM86 switch during task switch
KVM: SVM: Fix CPL updates
KVM: x86 emulator: VM86 segments must have DPL 3
KVM: x86 emulator: Fix task switch privilege checks
arch/powerpc/kvm/book3s_hv.c: included linux/sched.h twice
KVM: x86 emulator: correctly mask pmc index bits in RDPMC instruction emulation
KVM: mmu_notifier: Flush TLBs before releasing mmu_lock
...
Pull s390 patches part 2 from Martin Schwidefsky:
"Some minor improvements and one additional feature for the 3.4 merge
window: Hendrik added perf support for the s390 CPU counters."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
[S390] register cpu devices for SMP=n
[S390] perf: add support for s390x CPU counters
[S390] oprofile: Allow multiple users of the measurement alert interrupt
[S390] qdio: log all adapter characteristics
[S390] Remove unncessary export of arch_pick_mmap_layout
The motivation for this patchset was that I was looking at a way for a
qemu-kvm process, to exclude the guest memory from its core dump, which
can be quite large. There are already a number of filter flags in
/proc/<pid>/coredump_filter, however, these allow one to specify 'types'
of kernel memory, not specific address ranges (which is needed in this
case).
Since there are no more vma flags available, the first patch eliminates
the need for the 'VM_ALWAYSDUMP' flag. The flag is used internally by
the kernel to mark vdso and vsyscall pages. However, it is simple
enough to check if a vma covers a vdso or vsyscall page without the need
for this flag.
The second patch then replaces the 'VM_ALWAYSDUMP' flag with a new
'VM_NODUMP' flag, which can be set by userspace using new madvise flags:
'MADV_DONTDUMP', and unset via 'MADV_DODUMP'. The core dump filters
continue to work the same as before unless 'MADV_DONTDUMP' is set on the
region.
The qemu code which implements this features is at:
http://people.redhat.com/~jbaron/qemu-dump/qemu-dump.patch
In my testing the qemu core dump shrunk from 383MB -> 13MB with this
patch.
I also believe that the 'MADV_DONTDUMP' flag might be useful for
security sensitive apps, which might want to select which areas are
dumped.
This patch:
The VM_ALWAYSDUMP flag is currently used by the coredump code to
indicate that a vma is part of a vsyscall or vdso section. However, we
can determine if a vma is in one these sections by checking it against
the gate_vma and checking for a non-NULL return value from
arch_vma_name(). Thus, freeing a valuable vma bit.
Signed-off-by: Jason Baron <jbaron@redhat.com>
Acked-by: Roland McGrath <roland@hack.frob.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Avi Kivity <avi@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A kernel compiled with SMP=n will not register any cpu devices.
The resulting kernel image will not boot with this error message:
kernel BUG at fs/sysfs/group.c:65!
Use GENERIC_CPU_DEVICES=y if SMP=n to get the missing cpu device.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add a perf PMU to access the CPU-measurement counter facility CPUM CF.
CPUM CF provides multiple counter sets for measuring generic,
problem-state, and crypto activaties. Also an extended counter set for
the IBM System z10 and IBM z196 mainframes is available.
Counters from the basic and problem-state counter set are mapped to
generic perf hardware events. Other counters are accessible through
raw events.
For a list of available counter sets and counters, see:
- The Load-Program-Parameter and the CPU-Measurement Facilities (SA23-2260)
- The CPU-Measurement Facility Extended Counters Definition for
z10 and z196 (SA23-2261)
Reviewed-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Prepare the measurement facility which is currently only used by oprofile
for multiple users. To achieve that the measurement alert interrupt control
bit needs to be protected. The measurement alert definitions are moved
to a header file and an interrupt mask is added so that users can discard
interrupts if they are for a different measurement subsystem.
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This function is defined for use in exec, not in modules.
No other architecture exports its implementation.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull s390 patches from Martin Schwidefsky:
"The biggest patch is the rework of the smp code, something I wanted to
do for some time. There are some patches for our various dump methods
and one new thing: z/VM LGR detection. LGR stands for linux-guest-
relocation and is the guest migration feature of z/VM. For debugging
purposes we keep a log of the systems where a specific guest has lived."
Fix up trivial conflict in arch/s390/kernel/smp.c due to the scheduler
cleanup having removed some code next to removed s390 code.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
[S390] kernel: Pass correct stack for smp_call_ipl_cpu()
[S390] Ensure that vmcore_info pointer is never accessed directly
[S390] dasd: prevent validate server for offline devices
[S390] Remove monolithic build option for zcrypt driver.
[S390] stack dump: fix indentation in output
[S390] kernel: Add OS info memory interface
[S390] Use block_sigmask()
[S390] kernel: Add z/VM LGR detection
[S390] irq: external interrupt code passing
[S390] irq: set __ARCH_IRQ_EXIT_IRQS_DISABLED
[S390] zfcpdump: Implement async sdias event processing
[S390] Use copy_to_absolute_zero() instead of "stura/sturg"
[S390] rework idle code
[S390] rework smp code
[S390] rename lowcore field
[S390] Fix gcc 4.6.0 compile warning