Need to create sysfs only for cpus that are present. Without which we see
NR_CPUS entries created when we have CONFIG_HOTPLUG and CONFIG_HOTPLUG_CPU
enabled.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Need to ensure we dont get prempted when we clear ourself from mask when using
clustered mode genapic code.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Up to date I've been using the GS value to determine the processor number
in dumps from show_regs, however this can be cumbersome to do if you don't
have the vmlinux to verify with the address of cpu_pda, how about the
following? I considered using hard_smp_processor_id for robustness but we
already dereference current so we're already relying on MSR_GS_BASE being
sane.
Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When handling writes to /proc/irq, current code is re-programming rte
entries directly. This is not recommended and could potentially cause
chipset's to lockup, or cause missing interrupts.
CONFIG_IRQ_BALANCE does this correctly, where it re-programs only when the
interrupt is pending. The same needs to be done for /proc/irq handling as well.
Otherwise user space irq balancers are really not doing the right thing.
- Changed pending_irq_balance_cpumask to pending_irq_migrate_cpumask for
lack of a generic name.
- added move_irq out of IRQ_BALANCE, and added this same to X86_64
- Added new proc handler for write, so we can do deferred write at irq
handling time.
- Display of /proc/irq/XX/smp_affinity used to display CPU_MASKALL, instead
it now shows only active cpu masks, or exactly what was set.
- Provided a common move_irq implementation, instead of duplicating
when using generic irq framework.
Tested on i386/x86_64 and ia64 with CONFIG_PCI_MSI turned on and off.
Tested UP builds as well.
MSI testing: tbd: I have cards, need to look for a x-over cable, although I
did test an earlier version of this patch. Will test in a couple days.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Zwane Mwaikambo <zwane@holomorphy.com>
Grudgingly-acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Coywolf Qi Hunt <coywolf@lovecn.org>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Ben Dooks
This has been broken for a while now, so fix the
problems with the code, test and bring up to date.
This also makes the code conditional on an
Kconfig option
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Adam Brooks
The current gettimeofday implementation for the IOP3xx processors reads the contents of the timer interrupt register and does math on the value to figure out exactly what time it is. To do this it multiplies the contents of the timer register with a large constant. The result is then divided by a large constant. Unfortunately the result of the first multiplication is often too large for the register to hold. The solution is to combine the two large constants to a single smaller constant at compile time. Then the timer value can be divided by single smaller constant without any overflow issues.
Signed-off-by: Adam Brooks <adam.j.brooks@intel.com>
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Richard Purdie
Add machine detection routines for several new models of PXA based Zaurii
(SL-C3000 - Spitz, SL-C1000 - Akita, SL-C3100 - Borzoi, SL-C6000 - Tosa).
Sharp continue to use broken bootloaders in ROM.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Tony Lindgren
This patch updates H2 defconfig
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Tony Lindgren
This patch syncs the mainline kernel with linux-omap tree.
The highlights of the patch are:
- Clock updates by Tuukka Tikkanen, Juha Yrjola,
Daniel Petrini and Tony Lindgren
- DMA fixes by Imre Deak, Juha Yrjola and Daniel Petrini
- Add support to dual-mode hardware timers by Lauri Leukkunen
- GPIO support for 24xx by Paul Mundt
- GPIO wake-up support by Tony Lindgren
- Better GPIO interrupt handler to not lose interrupts by
Ralph Walden and Ladislav Michl
- Power Management updates by Tuukka Tikkanen
- Make Power Management code use new SRAM functions by
Tony Lindgren
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Add the Simtec Anubis to the list of supported
machines in the arch/arm/mach-s3c2410 directory.
This ensures the core peripherals are registered,
the timer source is configured and the correct
power-management is enabled.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Need to use compat struct sizes and compat_sys_ioctl().
Reported by Adrian Bunk via kernel bugzilla #2683
Signed-off-by: David S. Miller <davem@davemloft.net>
Changing CONFIG_LOCALVERSION rebuilds too much, for no apparent reason.
Use system_utsname for progress and debug header.
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently, we set the class bit in kernel SLB entries, and clear it on
user SLB entries. On POWER5, ERAT entries created in real mode have
the class bit clear. So to avoid flushing kernel ERAT entries on each
context switch, this patch inverts our usage of the class bit, setting
it on user SLB entries and clearing it on kernel SLB entries.
Booted on POWER5 and G5.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Move oprofile_impl.h into include/asm-ppc64 in preparation for moving
oprofile_model into cpu feature struct.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Change oprofile to use num_pmcs from the cpu feature struct.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Remove the CPU_FTR_PMC8 feature now we encode the number of PMCs
directly.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add a field in the cputable struct to store the number of PMCs.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
I would like to be able to read the lparcfg data from any user so we
can make "intelligent" decisions based on underlying attributes when
running in lpars. Yes there's software that likes to do this :) and
runs as non-root.
It's very similar to say VM where you can get CP to provide feedback
of the real hardware inside a VM guest.
Signed-off-by: Wim Coekaerts <wim.coekaerts@oracle.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Removed PPC64 architecture specific users of asm/segment.h.
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Poison initmem after we free it so we catch use after free issues.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The following patch fixes 2 issues:
1) use PLATFORM_LPAR bit to test if running in LPAR mode
2) systemcfg pointer is assigned from static data in
arch/ppc64/kernel/pacaData.c. The file arch/ppc64/kernel/head.S
now refers to is using the GOT binding to the pointer and hence
must deref it.
Signed-off-by: Jimi Xenidis <jimix@watson.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Consolidate the early console and PPCDBG code in udbg.c
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Trim some no longer needed includes from udbg.c and friends.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Take udbg out of ppc_md. Allows us to not overwrite early udbg inits
when assigning ppc_md.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Split scc and 15550 functions from udbg each into their own file.
This makes them more symetric with the lpar and btext code.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
make udbg_init_uart set the ppc_md udbg methods.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Make the 16550 and real mode 16550 use tail recursion like the scc code
instead of repeating the routine except for the character sent.
Gcc recoginizes the tail recursion and handles it efficently without
stack allocations. The maple real putc shrinks from 188 to 104 bytes
of instructions. udbg_putc drops from 188 to 140 bytes.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Now that xmon is fixed we should not need the dummy getc routines.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
udbg_getc_poll is a ppc_md function. don't call directly into udbg.c
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Patch from Richard Purdie
This patch updates the PCMCIA pxa2xx_sharpsl driver to support multiple scoop
devices by adding a scoop to pcmcia slot mapping structure. It adds platform
support for poodle, is known to work on spitz (which is dual slot) and
should also support collie with a minor amount of further work.
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
The n30 machine file is using a kernel thread to change the
state of the USB D+ pull-up resistor, which is not the proper
way to do this. Once the usb-gadget support for the 24xx is
merge, the proper access method will be added.
This patch also removes the problem of using HZ for the
msleep calls, from Nishanth Aravamudan.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Use TIF bit to tell if a process is running in 31 bit mode instead of checking
the addressing mode bits of the PSW.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There is a race in pfault_interrupt. That function gets called two times for
each pfault notification. Once with a subcode of 0 to indicate that a real
page is not available and once with a subcode of 0x80 to indicate that the
page is present again.
Since the two external interrupts can be delivered on two different cpus the
order in which the two calls are made is unpredictable. It is possible that
the subcode 0x80 interrupt is completed before the subcode 0x00 interrupt has
done the wake_up() call.
To avoid calling wake_up() on an already removed task structure proper task
structure reference counting is needed. Increase the reference counter in the
subcode 0x00 interrupt before setting pfault_wait to zero and return the
reference after the wake_up call.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
debug feature changes/bug fixes:
- Use get_clock() function instead of private inline assembly.
- Use 'struct timeval' instead of 'struct timespec' for call to
tod_to_timeval(). Now the microsecond part of the timestamp is correct
again.
- Fix a locking problem: when creating a snapshot of the current content
of the debug areas, lock the entire debug_info object.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The new machine check handler still has a few bugs.
1) The system entry time has to be stored in the machine check handler,
2) the machine check return psw may not be stored at the usual place
because it might overwrite the return psw of the interrupted context,
3) the return address for the call to s390_handle_mcck in the i/o interrupt
handler is not correct,
4) the system call cleanup has to take the different save area of the
machine check handler into account,
5) the machine check handler may not call UPDATE_VTIME before
CREATE_STACK_FRAME, and
6) the io leave path needs a critical section cleanup to make sure that the
TIF_MCCK_PENDING bit is really checked before switching back to user space.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We were leaking pmd pages when 3_LEVEL_PGTABLES was enabled. This fixes that.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
cleanup and fix the check for advanced sysemu (PTRACE_SYSEMU_SINGLESTEP
option)
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add new cmdline setups:
- noprocmm
- noptracefaultinfo
In case of testing, they can be used to switch off usage of
/proc/mm and PTRACE_FAULTINFO independently.
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Change syscall-stub's data to include a "expected retval".
Stub now checks syscalls retval and aborts execution of syscall list, if
retval != expected retval.
run_syscall_stub prints the data of the failed syscall, using the data pointer
and retval written by the stub to the beginning of the stack.
one_syscall_stub is removed, to simplify code, because only some instructions
are saved by one_syscall_stub, no host-syscall.
Using the stub with additional data (modify_ldt via stub)
is prepared also.
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This change enables SKAS0/SKAS3 to work with all combinations of /proc/mm and
PTRACE_FAULTINFO being available or not.
Also it changes the initialization of proc_mm and ptrace_faultinfo slightly,
to ease forcing SKAS0 on a patched host. Forcing UML to run without /proc/mm
or PTRACE_FAULTINFO by cmdline parameter can be implemented with a setup
resetting the related variable.
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The serial UML OS-abstraction layer patch (um/kernel dir).
This moves all systemcalls from process.c file under os-Linux dir and join
process.c and process_kern.c files.
Signed-off-by: Gennady Sharapov <gennady.v.sharapov@intel.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds AIO support to the ubd driver.
The driver breaks a struct request into IO requests to the host, based on the
hardware segments in the request and on any COW blocks covered by the request.
The ubd IO thread is gone, since there is now an equivalent thread in the AIO
module.
There is provision for multiple outstanding requests now. Requests aren't
retired until all pieces of it have been completed. The AIO requests have a
shared count, which is decremented as IO operations come in until it reaches
0. This can be possibly moved to the request struct - haven't looked at this
yet.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch makes UML use host AIO support when it (and
/usr/include/linux/aio_abi.h) are present. This is only the support, with no
consumers - a consumer is coming in the next patch.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Added missing include list to uml AFLAGS
Killed magic for stubs. [So] - it was needed only because of messed AFLAGS
Switched segv_stubs.c to kernel CFLAGS sans profile, instead of user ones
Killed STUBS_CFLAGS - it's not needed and the only remaining use had been
gratitious - it only polluted CFLAGS
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This merges two sets of files which had no business being split apart in the
first place.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds VM op batching to skas0. Rather than having a context switch to and
from the userspace stub for each address space change, we write a number of
operations to the stub data page and invoke a different stub which loops over
them and executes them all in one go.
The operations are stored as [ system call number, arg1, arg2, ... ] tuples.
The set is terminated by a system call number of 0. Single operations, i.e.
page faults, are handled in the old way, since that is slightly more
efficient.
For a kernel build, a minority (~1/4) of the operations are part of a set.
These sets averaged ~100 in length, so for this quarter, the context switching
overhead is greatly reduced.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro <viro@parcelfarce.linux.theplanet.co.uk> spotted a bunch of duplicated
exports - this removes them.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Noticed by Al Viro <viro@parcelfarce.linux.theplanet.co.uk> - SMP on x86_64 is
fundamentally broken due to UML's reuse of the host arch's percpu stuff. This
is OK on x86, but the x86_64 pda stuff just won't work for UML.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove an unneeded reference to libc.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Build cleanups
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This cleans up the error path in ubd_open, causing it now to call ubd_close
appropriately when something fails.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix a macro typo which could break if the macro is passed arguments with
side-effects.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The copy_user stuff in the signal frame code was broke.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Avoid chomping low bits of address for functions doing it by themselves,
fix whitespace, add a correctness checking.
I did this for remap-file-pages protection support, it was useful on its
own too.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If a SIGWINCH comes in, while winch_thread() isn't waiting in wait(),
winch_thread could miss signals. It isn't very probable, that anyone will
see this causing trouble, as it would need a very special timing, that a
missed SIGWINCH results in a wrong window size.
So, this is a minor problem. But why not fix, as it can be done so easy?
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Apparently, GDB gets confused when we do an execvp() on ourselves.
Since it's simply done to allocate further space for command line arguments
(which we'll use to allow gathering the startup command line for guest
processes through the host), allow the user to disable that to get a
debuggable UML binary.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As a follow-up to "UML Support - Ptrace: adds the host SYSEMU support, for
UML and general usage" (i.e. uml-support-* in current mm).
Avoid unconditionally jumping to work_pending and code copying, just reuse
the already existing resume_userspace path.
One interesting note, from Charles P. Wright, suggested that the API is
improvable with no downsides for UML (except that it will have to support
yet another host API, since dropping support for the current API, for UML,
is not reasonable from users' point of view).
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
CC: Charles P. Wright <cwright@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
This is simply an adjustment for "Ptrace - i386: fix Syscall Audit interaction
with singlestep" to work on top of SYSEMU patches, too. On this patch, I have
some doubts: I wonder why we need to alter that way ptrace_disable().
I left the patch this way because it has been extensively tested, but I don't
understand the reason.
The current PTRACE_DETACH handling simply clears child->ptrace; actually this
is not enough because entry.S just looks at the thread_flags; actually,
do_syscall_trace checks current->ptrace but I don't think depending on that is
good, at least for performance, so I think the clearing is done elsewhere.
For instance, on PTRACE_CONT it's done, but doing PTRACE_DETACH without
PTRACE_CONT is possible (and happens when gdb crashes and one kills it
manually).
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
CC: Roland McGrath <roland@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch implements the new ptrace option PTRACE_SYSEMU_SINGLESTEP, which
can be used by UML to singlestep a process: it will receive SINGLESTEP
interceptions for normal instructions and syscalls, but syscall execution will
be skipped just like with PTRACE_SYSEMU.
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With this patch, we change the way we handle switching from PTRACE_SYSEMU to
PTRACE_{SINGLESTEP,SYSCALL}, to free TIF_SYSCALL_EMU from double use as a
preparation for PTRACE_SYSEMU_SINGLESTEP extension, without changing the
behavior of the host kernel.
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Dike <jdike@addtoit.com>,
Paolo 'Blaisorblade' Giarrusso <blaisorblade_spam@yahoo.it>,
Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Adds a new ptrace(2) mode, called PTRACE_SYSEMU, resembling PTRACE_SYSCALL
except that the kernel does not execute the requested syscall; this is useful
to improve performance for virtual environments, like UML, which want to run
the syscall on their own.
In fact, using PTRACE_SYSCALL means stopping child execution twice, on entry
and on exit, and each time you also have two context switches; with SYSEMU you
avoid the 2nd stop and so save two context switches per syscall.
Also, some architectures don't have support in the host for changing the
syscall number via ptrace(), which is currently needed to skip syscall
execution (UML turns any syscall into getpid() to avoid it being executed on
the host). Fixing that is hard, while SYSEMU is easier to implement.
* This version of the patch includes some suggestions of Jeff Dike to avoid
adding any instructions to the syscall fast path, plus some other little
changes, by myself, to make it work even when the syscall is executed with
SYSENTER (but I'm unsure about them). It has been widely tested for quite a
lot of time.
* Various fixed were included to handle the various switches between
various states, i.e. when for instance a syscall entry is traced with one of
PT_SYSCALL / _SYSEMU / _SINGLESTEP and another one is used on exit.
Basically, this is done by remembering which one of them was used even after
the call to ptrace_notify().
* We're combining TIF_SYSCALL_EMU with TIF_SYSCALL_TRACE or TIF_SINGLESTEP
to make do_syscall_trace() notice that the current syscall was started with
SYSEMU on entry, so that no notification ought to be done in the exit path;
this is a bit of a hack, so this problem is solved in another way in next
patches.
* Also, the effects of the patch:
"Ptrace - i386: fix Syscall Audit interaction with singlestep"
are cancelled; they are restored back in the last patch of this series.
Detailed descriptions of the patches doing this kind of processing follow (but
I've already summed everything up).
* Fix behaviour when changing interception kind #1.
In do_syscall_trace(), we check the status of the TIF_SYSCALL_EMU flag
only after doing the debugger notification; but the debugger might have
changed the status of this flag because he continued execution with
PTRACE_SYSCALL, so this is wrong. This patch fixes it by saving the flag
status before calling ptrace_notify().
* Fix behaviour when changing interception kind #2:
avoid intercepting syscall on return when using SYSCALL again.
A guest process switching from using PTRACE_SYSEMU to PTRACE_SYSCALL
crashes.
The problem is in arch/i386/kernel/entry.S. The current SYSEMU patch
inhibits the syscall-handler to be called, but does not prevent
do_syscall_trace() to be called after this for syscall completion
interception.
The appended patch fixes this. It reuses the flag TIF_SYSCALL_EMU to
remember "we come from PTRACE_SYSEMU and now are in PTRACE_SYSCALL", since
the flag is unused in the depicted situation.
* Fix behaviour when changing interception kind #3:
avoid intercepting syscall on return when using SINGLESTEP.
When testing 2.6.9 and the skas3.v6 patch, with my latest patch and had
problems with singlestepping on UML in SKAS with SYSEMU. It looped
receiving SIGTRAPs without moving forward. EIP of the traced process was
the same for all SIGTRAPs.
What's missing is to handle switching from PTRACE_SYSCALL_EMU to
PTRACE_SINGLESTEP in a way very similar to what is done for the change from
PTRACE_SYSCALL_EMU to PTRACE_SYSCALL_TRACE.
I.e., after calling ptrace(PTRACE_SYSEMU), on the return path, the debugger is
notified and then wake ups the process; the syscall is executed (or skipped,
when do_syscall_trace() returns 0, i.e. when using PTRACE_SYSEMU), and
do_syscall_trace() is called again. Since we are on the return path of a
SYSEMU'd syscall, if the wake up is performed through ptrace(PTRACE_SYSCALL),
we must still avoid notifying the parent of the syscall exit. Now, this
behaviour is extended even to resuming with PTRACE_SINGLESTEP.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Avoid giving two traps for singlestep instead of one, when syscall auditing is
enabled.
In fact no singlestep trap is sent on syscall entry, only on syscall exit, as
can be seen in entry.S:
# Note that in this mask _TIF_SINGLESTEP is not tested !!! <<<<<<<<<<<<<<
testb $(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP),TI_flags(%ebp)
jnz syscall_trace_entry
...
syscall_trace_entry:
...
call do_syscall_trace
But auditing a SINGLESTEP'ed process causes do_syscall_trace to be called, so
the tracer will get one more trap on the syscall entry path, which it
shouldn't.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
CC: Roland McGrath <roland@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
To the extent that sub-Kconfig files exist elsewhere in the tree, they are
named Kconfig.foo, rather than the Kconfig_foo that UML has. This patch
brings the names in line with the rest of the tree.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This eliminates the segfault info ring buffer, which added a system call to
each page fault, and which hadn't been useful for debugging in ages.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch converts arch/cris/Kconfig.debug to using lib/Kconfig.debug.
This should fix a compile error in 2.6.13-rc4 caused by a missing
CONFIG_LOG_BUF_SHIFT definition.
While I was editing this file, I also converted some spaces to tabs.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Mikael Starvik <starvik@axis.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the builtin functions for memset/memclr/memcpy, special optimizations for
page operations have dedicated functions now. Uninline memmove/memchr and
move all functions into a single file and clean it up a little.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move a few cache functions into its own file and fix flush_icache_range() so
it can handle both kernel and user addresses correctly (assuming context is
set correctly).
Turn copy_to_user_page/copy_from_user_page into inline functions and add a
missing cache flush.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- create helper function singlestep_disable()
- move variable definitions to the top of the function
- use "out_eio" label as common error destination
- don't clear failure value for PTRACE_SETREGS/PTRACE_GETREGS
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The timers lack .suspend/.resume methods. Because of this, jiffies got a
big compensation after a S3 resume. And then softlockup watchdog reports
an oops. This occured with HPET enabled, but it's also possible for other
timers.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds type-checking to pm_message_t, so that people can't confuse it
with int or u32. It also allows us to fix "disk yoyo" during suspend (disk
spinning down/up/down).
[We've tried that before; since that cpufreq problems were fixed and I've
tried make allyes config and fixed resulting damage.]
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix remaining bits of u32 vs. pm_message confusion. Should not break
anything.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Reset the ISA DMA controller into a known state after a suspend. Primary
concern was reenabling the cascading DMA channel (4).
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Reset the ISA DMA controller into a known state after a suspend. Primary
concern was reenabling the cascading DMA channel (4).
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch moves the common code in x86 and x86-64's semaphore.c into a
single file in lib/semaphore-sleepers.c. The arch specific asm stubs are
left in the arch tree (in semaphore.c for i386 and in the asm for x86-64).
There should be no changes in code/functionality with this patch.
Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
for_each_cpu walks through all processors in cpu_possible_map, which is
defined as cpu_callout_map on i386 and isn't initialised until all
processors have been booted. This breaks things which do for_each_cpu
iterations early during boot. So, define cpu_possible_map as a bitmap with
NR_CPUS bits populated. This was triggered by a patch i'm working on which
does alloc_percpu before bringing up secondary processors.
From: Alexander Nyberg <alexn@telia.com>
i386-boottime-for_each_cpu-broken.patch
i386-boottime-for_each_cpu-broken-fix.patch
The SMP version of __alloc_percpu checks the cpu_possible_map before
allocating memory for a certain cpu. With the above patches the BSP cpuid
is never set in cpu_possible_map which breaks CONFIG_SMP on uniprocessor
machines (as soon as someone tries to dereference something allocated via
__alloc_percpu, which in fact is never allocated since the cpu is not set
in cpu_possible_map).
Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a clone operation for pgd updates.
This helps complete the encapsulation of updates to page tables (or pages
about to become page tables) into accessor functions rather than using
memcpy() to duplicate them. This is both generally good for consistency
and also necessary for running in a hypervisor which requires explicit
updates to page table entries.
The new function is:
clone_pgd_range(pgd_t *dst, pgd_t *src, int count);
dst - pointer to pgd range anwhere on a pgd page
src - ""
count - the number of pgds to copy.
dst and src can be on the same page, but the range must not overlap
and must not cross a page boundary.
Note that I ommitted using this call to copy pgd entries into the
software suspend page root, since this is not technically a live paging
structure, rather it is used on resume from suspend. CC'ing Pavel in case
he has any feedback on this.
Thanks to Chris Wright for noticing that this could be more optimal in
PAE compiles by eliminating the memset.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds a notify to the die_nmi notify that the system is about to
be taken down. If the notify is handled with a NOTIFY_STOP return, the
system is given a new lease on life.
We also change the nmi watchdog to carry on if die_nmi returns.
This give debug code a chance to a) catch watchdog timeouts and b) possibly
allow the system to continue, realizing that the time out may be due to
debugger activities such as single stepping which is usually done with
"other" cpus held.
Signed-off-by: George Anzinger<george@mvista.com>
Cc: Keith Owens <kaos@ocs.com.au>
Signed-off-by: George Anzinger <george@mvista.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The pushf/popf in switch_to are ONLY used to switch IOPL. Making this
explicit in C code is more clear. This pushf/popf pair was added as a
bugfix for leaking IOPL to unprivileged processes when using
sysenter/sysexit based system calls (sysexit does not restore flags).
When requesting an IOPL change in sys_iopl(), it is just as easy to change
the current flags and the flags in the stack image (in case an IRET is
required), but there is no reason to force an IRET if we came in from the
SYSENTER path.
This change is the minimal solution for supporting a paravirtualized Linux
kernel that allows user processes to run with I/O privilege. Other
solutions require radical rewrites of part of the low level fault / system
call handling code, or do not fully support sysenter based system calls.
Unfortunately, this added one field to the thread_struct. But as a bonus,
on P4, the fastest time measured for switch_to() went from 312 to 260
cycles, a win of about 17% in the fast case through this performance
critical path.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Privilege checking cleanup. Originally, these diffs were much greater, but
recent cleanups in Linux have already done much of the cleanup. I added
some explanatory comments in places where the reasoning behind certain
tests is rather subtle.
Also, in traps.c, we can skip the user_mode check in handle_BUG(). The
reason is, there are only two call chains - one via die_if_kernel() and one
via do_page_fault(), both entering from die(). Both of these paths already
ensure that a kernel mode failure has happened. Also, the original check
here, if (user_mode(regs)) was insufficient anyways, since it would not
rule out BUG faults from V8086 mode execution.
Saving the %ss segment in show_regs() rather than assuming a fixed value
also gives better information about the current kernel state in the
register dump.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some more assembler cleanups I noticed along the way.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Also, setting PDPEs in PAE mode does not require atomic operations, since the
PDPEs are cached by the processor, and only reloaded on an explicit or
implicit reload of CR3.
Since the four PDPEs must always be present in an active root, and the kernel
PDPE is never updated, we are safe even from SMIs and interrupts / NMIs using
task gates (which reload CR3). Actually, much of this is moot, since the user
PDPEs are never updated either, and the only usage of task gates is by the
doublefault handler. It appears the only place PGDs get updated in PAE mode
is in init_low_mappings() / zap_low_mapping() for initial page table creation
and recovery from ACPI sleep state, and these sites are safe by inspection.
Getting rid of the cmpxchg8b saves code space and 720 cycles in pgd_alloc on
P4.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Subtle fix: load_TLS has been moved after saving %fs and %gs segments to avoid
creating non-reversible segments. This could conceivably cause a bug if the
kernel ever needed to save and restore fs/gs from the NMI handler. It
currently does not, but this is the safest approach to avoiding fs/gs
corruption. SMIs are safe, since SMI saves the descriptor hidden state.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
i386 inline assembler cleanup.
This change encapsulates descriptor and task register management. Also,
it is possible to improve assembler generation in two cases; savesegment
may store the value in a register instead of a memory location, which
allows GCC to optimize stack variables into registers, and MOV MEM, SEG
is always a 16-bit write to memory, making the casting in math-emu
unnecessary.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
i386 arch cleanup. Introduce the serialize macro to serialize processor
state. Why the microcode update needs it I am not quite sure, since wrmsr()
is already a serializing instruction, but it is a microcode update, so I will
keep the semantic the same, since this could be a timing workaround. As far
as I can tell, this has always been there since the original microcode update
source.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
i386 Inline asm cleanup. Use cr/dr accessor functions.
Also, a potential bugfix. Also, some CR accessors really should be volatile.
Reads from CR0 (numeric state may change in an exception handler), writes to
CR4 (flipping CR4.TSD) and reads from CR2 (page fault) prevent instruction
re-ordering. I did not add memory clobber to CR3 / CR4 / CR0 updates, as it
was not there to begin with, and in no case should kernel memory be clobbered,
except when doing a TLB flush, which already has memory clobber.
I noticed that page invalidation does not have a memory clobber. I can't find
a bug as a result, but there is definitely a potential for a bug here:
#define __flush_tlb_single(addr) \
__asm__ __volatile__("invlpg %0": :"m" (*(char *) addr))
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This makes the vDSO use nops for all its padding around instructions,
rather than sometimes zeros, and nop-pads the end of the area containing
instructions to a 32-byte cache line, to keep text and data in separate
lines.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>