Lockdep (on 5.10-rc) points out that we're delivering IRQs while IRQs
are not even enabled, which clearly shouldn't happen. Defer the time
event IRQ delivery until they actually are enabled.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
With all the previous bits in place, we can now also support
suspend to RAM, in the sense that everything is suspended,
not just most, including userspace, processes like in s2idle.
Since um_idle_sleep() now waits forever, we can simply call
that to "suspend" the system.
As before, you can wake it up using SIGUSR1 since we're just
in a pause() call that only needs to return.
In order to implement selective resume from certain devices,
and not have any arbitrary device interrupt wake up, suspend
interrupts by removing SIGIO notification (O_ASYNC) from all
the FDs that are not supposed to wake up the system. However,
swap out the handler so we don't actually handle the SIGIO as
an interrupt.
Since we're in pause(), the mere act of receiving SIGIO wakes
us up, and then after things have been restored enough, re-set
O_ASYNC for all previously suspended FDs, reinstall the proper
SIGIO handler, and send SIGIO to self to process anything that
might now be pending.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
In order to be able to experiment with suspend in UML, add the
minimal work to be able to suspend (s2idle) an instance of UML,
and be able to wake it back up from that state with the USR1
signal sent to the main UML process.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Convert files to use SPDX header. All files are licensed under the GPLv2.
Signed-off-by: Alex Dewar <alex.dewar@gmx.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
UML enables TRACE_IRQFLAGS_SUPPORT but doesn't actually implement
it. It seems to have been added for lockdep support, but that can't
actually really work well without IRQ flags tracing, as is also
very noisily reported when enabling CONFIG_DEBUG_LOCKDEP.
Implement it now.
Fixes: 711553efa5 ("[PATCH] uml: declare in Kconfig our partial LOCKDEP support")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
In timer_real_alarm_handler(), regs is only initialized if
the context argument is non-NULL, also initialize in the
other case.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
This entry is misleading, the actual signal handler is
another one that never uses sig_info.
Also remove the SIGALRM if inside sig_handler() for the
same reason.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
Reverts commit b6024b21fe and
adjusts default stack sizing to cope with larger size of
floating point save registers on the newer Intel CPUs.
b6024b21fe replaced storing the
register state on the stack with kmalloc-ed storage. That has
a number of issues and a panic if that fails.
1. kmalloc/ATOMIC can fail. There was a latent hard crash
in all interrupt and fault handling as a result.
2. kmalloc in the interrupt path introduces a considerable
performance penalty for networking ~ 14% on iperf.
This commit restores uml to a stable state until a better
solution is found.
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Recent libcs have gotten a bit more strict, so we actually need to
include the right headers and use the right types. This enables UML to
compile again.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
glibc 2.26 removed the 'struct ucontext' to "improve" POSIX compliance
and break programs, including User Mode Linux. Fix User Mode Linux
by using POSIX ucontext_t.
This fixes:
arch/um/os-Linux/signal.c: In function 'hard_handler':
arch/um/os-Linux/signal.c:163:22: error: dereferencing pointer to incomplete type 'struct ucontext'
mcontext_t *mc = &uc->uc_mcontext;
arch/x86/um/stub_segv.c: In function 'stub_segv_handler':
arch/x86/um/stub_segv.c:16:13: error: dereferencing pointer to incomplete type 'struct ucontext'
&uc->uc_mcontext);
Cc: stable@vger.kernel.org
Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net>
Signed-off-by: Richard Weinberger <richard@nod.at>
We are in atomic context and must not sleep.
Sleeping here is possible since malloc() maps
to kmalloc() with GFP_KERNEL.
Cc: stable@vger.kernel.org
Fixes: b6024b21 ("um: extend fpstate to _xstate to support YMM registers")
Signed-off-by: Richard Weinberger <richard@nod.at>
Extends fpstate to _xstate, in order to hold AVX/YMM registers.
To avoid oversized stack frame, the following functions have been
refactored by using malloc.
- sig_handler_common
- timer_real_alarm_handler
Signed-off-by: Eli Cooper <elicooper@gmx.com>
The existing IRQ handler design in UML does not prevent reentrancy
This is mitigated by fd-enable/fd-disable semantics for the IO
portion of the UML subsystem. The timer, however, can and is
re-entered resulting in very deep stack usage and occasional
stack exhaustion.
This patch prevents this by checking if there is a timer
interrupt in-flight before processing any pending timer interrupts.
Signed-off-by: Anton Ivanov <aivanov@brocade.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
UML is using an obsolete itimer call for
all timers and "polls" for kernel space timer firing
in its userspace portion resulting in a long list
of bugs and incorrect behaviour(s). It also uses
ITIMER_VIRTUAL for its timer which results in the
timer being dependent on it running and the cpu
load.
This patch fixes this by moving to posix high resolution
timers firing off CLOCK_MONOTONIC and relaying the timer
correctly to the UML userspace.
Fixes:
- crashes when hosts suspends/resumes
- broken userspace timers - effecive ~40Hz instead
of what they should be. Note - this modifies skas behavior
by no longer setting an itimer per clone(). Timer events
are relayed instead.
- kernel network packet scheduling disciplines
- tcp behaviour especially under load
- various timer related corner cases
Finally, overall responsiveness of userspace is better.
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Anton Ivanov <aivanov@brocade.com>
[rw: massaged commit message]
Signed-off-by: Richard Weinberger <richard@nod.at>
__ptr_t type is a glibc-specific type, while the generally
documented type is a void*. That's what other C libraries use,
too.
Signed-off-by: Hans-Werner Hilse <hwhilse@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
As UML uses an alternative signal stack we cannot use
the current stack pointer for stack dumping if UML itself
dies by SIGSEGV. To bypass this issue we save regs taken
from mcontext in our segv handler into thread_struct and
use these regs to obtain the stack pointer in show_stack().
Signed-off-by: Richard Weinberger <richard@nod.at>
Currently we use both struct siginfo and siginfo_t.
Let's use struct siginfo internally to avoid ongoing
compiler warning. We are allowed to do so because
struct siginfo and siginfo_t are equivalent.
Signed-off-by: Richard Weinberger <richard@nod.at>
arch/um/os-Linux/signal.c:18:8: error: conflicting types for 'sig_info'
In file included from /home/slyfox/linux-2.6/arch/um/os-Linux/signal.c:12:0:
arch/um/include/shared/as-layout.h:64:15: note: previous declaration of 'sig_info' was here
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
CC: Jeff Dike <jdike@addtoit.com>
CC: Richard Weinberger <richard@nod.at>
CC: "Martin Pärtel" <martin.partel@gmail.com>
CC: Al Viro <viro@zeniv.linux.org.uk>
CC: user-mode-linux-devel@lists.sourceforge.net
CC: user-mode-linux-user@lists.sourceforge.net
CC: linux-kernel@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
UML guest processes now get correct siginfo_t for SIGTRAP, SIGFPE,
SIGILL and SIGBUS. Specifically, si_addr and si_code are now correct
where previously they were si_addr = NULL and si_code = 128.
Signed-off-by: Martin Pärtel <martin.partel@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
now we don't mix host and guest signal frame layouts anymore; moreover,
we don't need host's struct sigcontext at all.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
For one thing, we always block the same signals (IRQ ones - IO, WINCH, VTALRM),
so there's no need to pass sa_mask elements in arguments. For another, the
flags depend only on whether it's an IRQ signal or not (we add SA_RESTART
for them).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
We used to generate those, but we hadn't done that for a long
time. No need to bother blocking them for signal handlers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
- Make some variables and functions static, since they don't need to be
global.
- Remove an unused function - arch/um/kernel/time.c::sched_clock().
- Clean the style a bit as complained by checkpatch.pl.
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alarm delivery could be noticably late in the !CONFIG_NOHZ case because lost
ticks weren't being taken into account. This is now treated more carefully,
with the time between ticks being calculated and the appropriate number of
ticks delivered to the timekeeping system.
Cc: Nix <nix@esperi.org.uk>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Style changes under arch/um/os-Linux:
include trimming
CodingStyle fixes
some printks needed severity indicators
make_tempfile turns out not to be used outside of mem.c, so it is now static.
Its declaration in tempfile.h is no longer needed, and tempfile.h itself is no
longer needed.
create_tmp_file was also made static.
checkpatch moans about an EXPORT_SYMBOL in user_syms.c which is part of a
macro definition - this is copying a bit of kernel infrastructure into the
libc side of UML because the kernel headers can't be included there.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch tidies the signal handling code slightly.
pending is renamed to signals_pending for symmetry with signals_enabled.
remove_sigstack was unused, so can be deleted.
The value of change_sig was never used, so it is now void and the
return value is not calculated any more.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sig_handler_common_skas needs significant modernization, starting with
its name and storage class.
There is no need to hide the true type of the sigcontext pointer, so
the void * dummy parameter can be replaced with a sigcontext *sc.
The array of uml_pt_regs structs used in the page fault case are gone,
replaced by a local variable. This is also used in the non-segfault
case instead of the copy in the task_struct. Since it's local, the
special handling of the is_user flag can go away.
There hasn't been any special treatment of SIGUSR1 in ages, so the
line that enables it can be deleted.
The special treatment of SIGSEGV similarly goes away, but to
compensate, SA_NODEFER is added to sa_mask when registering a signal
handler.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch moves sig_handler_common_skas from
arch/um/os-Linux/skas/trap.c to its only caller in
arch/um/os-Linux/signal.c. trap.c is now empty, so it can be removed.
This is code movement only - the significant cleanup needed here is
done in the next patch.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
signals_enabled and pending have requirements on the order in which they are
modified. This used to be done by declaring them volatile and putting an mb()
where the ordering requirements were in effect.
After getting a better (I hope) understanding of how to do this correctly, the
volatile declarations are gone and the mb()'s replaced by barrier()'s.
One of the mb()'s was deleted because I see no problematic writes that could
be re-ordered past that point.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tidy kern_util.h. It turns out that most of the function declarations
aren't used, so they can go away. os.h no longer includes
kern_util.h, so files which got it through os.h now need to include it
directly. A number of other files never needed it, so these includes
are deleted.
The structure which was used to pass signal handlers from the kernel
side to the userspace side is gone. Instead, the handlers are
declared here, and used directly from libc code. This allows
arch/um/os-Linux/trap.c to be deleted, with its remnants being moved
to arch/um/os-Linux/skas/trap.c.
arch/um/os-Linux/tty.c had its inclusions changed, and it needed some
style attention, so it got tidied.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch contains varied fixes and improvements for some files under
arch/um/os-Linux/, such as a typo fix in a perror message, a missing
argument fix for a printf, some constifying for pointers and so on.
[ jdike - made sigprocmask failure return -errno instead of -1 ]
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that ITIMER_REAL is no longer used, there is no need for any use of
SIGALRM whatsoever. This patch removes all mention of it.
In addition, real_alarm_handler took a signal argument which is now always
SIGVTALRM. So, that is gone.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now, the idle loop now longer needs SIGALRM firing - it can just sleep for the
requisite amount of time and fake a timer interrupt when it finishes.
Any use of ITIMER_REAL now goes away. disable_timer only turns off
ITIMER_VIRTUAL. switch_timers is no longer needed, so it, and all calls, goes
away.
disable_timer now returns the amount of time remaining on the timer.
default_idle uses this to tell idle_sleep how long to sleep. idle_sleep will
call alarm_handler if nanosleep returns 0, which is the case if it didn't
return early due to an interrupt. Otherwise, it just returns.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move timer signal initialization from init_irq_signals to a new function,
timer_init.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix up the switching between virtual and real timers. The idle loop sleeps,
so the timer at that point must be real time. At all other times, the timer
must be virtual. Even when userspace is running, and the kernel is asleep,
the virtual timer is correct because the process timer will be running and the
process timer will be firing.
The timer switch used to be in the context switch and timer handler code.
This is moved to the idle loop and the signal handler, making it much more
clear why it is happening.
switch_timers now returns the old timer type so that it may be restored. The
signal handler uses this in order to restore the previous timer type when it
returns.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Formatting changes in the files which have been changed in the course
of folding foo_skas functions into their callers. These include:
copyright updates
header file trimming
style fixes
adding severity to printks
These changes should be entirely non-functional.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch makes a number of simplifications enabled by the removal of
CHOOSE_MODE. There were lots of functions that looked like
int foo(args){
foo_skas(args);
}
The bodies of foo_skas are now folded into foo, and their declarations (and
sometimes entire header files) are deleted.
In addition, the union uml_pt_regs, which was a union between the tt and skas
register formats, is now a struct, with the tt-mode arm of the union being
removed.
It turns out that usr2_handler was unused, so it is gone.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The next stage after removing code which depends on CONFIG_MODE_TT is removing
the CHOOSE_MODE abstraction, which provided both compile-time and run-time
branching to either tt-mode or skas-mode code.
This patch removes choose-mode.h and all inclusions of it, and replaces all
CHOOSE_MODE invocations with the skas branch. This leaves a number of trivial
functions which will be dealt with in a later patch.
There are some changes in the uaccess and tls support which go somewhat beyond
this and eliminate some of the now-redundant functions.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes a crash caused by an interrupt coming in when an IRQ stack
is being torn down. When this happens, handle_signal will loop, setting up
the IRQ stack again because the tearing down had finished, and handling
whatever signals had come in.
However, to_irq_stack returns a mask of pending signals to be handled, plus
bit zero is set if the IRQ stack was already active, and thus shouldn't be
torn down. This causes a problem because when handle_signal goes around
the loop, sig will be zero, and to_irq_stack will duly set bit zero in the
returned mask, faking handle_signal into believing that it shouldn't tear
down the IRQ stack and return thread_info pointers back to their original
values.
This will eventually cause a crash, as the IRQ stack thread_info will
continue pointing to the original task_struct and an interrupt will look
into it after it has been freed.
The fix is to stop passing a signal number into to_irq_stack. Rather, the
pending signals mask is initialized beforehand with the bit for sig already
set. References to sig in to_irq_stack can be replaced with references to
the mask.
[akpm@linux-foundation.org: use UL]
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a separate IRQ stack. This differs from i386 in having the entire
interrupt run on a separate stack rather than starting on the normal kernel
stack and switching over once some preparation has been done. The underlying
mechanism, is of course, sigaltstack.
Another difference is that interrupts that happen in userspace are handled on
the normal kernel stack. These cause a wait wakeup instead of a signal
delivery so there is no point in trying to switch stacks for these. There's
no other stuff on the stack, so there is no extra stack consumption.
This quirk makes it possible to have the entire interrupt run on a separate
stack - process preemption (and calls to schedule()) happens on a normal
kernel stack. If we enable CONFIG_PREEMPT, this will need to be rethought.
The IRQ stack for CPU 0 is declared in the same way as the initial kernel
stack. IRQ stacks for other CPUs will be allocated dynamically.
An extra field was added to the thread_info structure. When the active
thread_info is copied to the IRQ stack, the real_thread field points back to
the original stack. This makes it easy to tell where to copy the thread_info
struct back to when the interrupt is finished. It also serves as a marker of
a nested interrupt. It is NULL for the first interrupt on the stack, and
non-NULL for any nested interrupts.
Care is taken to behave correctly if a second interrupt comes in when the
thread_info structure is being set up or taken down. I could just disable
interrupts here, but I don't feel like giving up any of the performance gained
by not flipping signals on and off.
If an interrupt comes in during these critical periods, the handler can't run
because it has no idea what shape the stack is in. So, it sets a bit for its
signal in a global mask and returns. The outer handler will deal with this
signal itself.
Atomicity is had with xchg. A nested interrupt that needs to bail out will
xchg its signal mask into pending_mask and repeat in case yet another
interrupt hit at the same time, until the mask stabilizes.
The outermost interrupt will set up the thread_info and xchg a zero into
pending_mask when it is done. At this point, nested interrupts will look at
->real_thread and see that no setup needs to be done. They can just continue
normally.
Similar care needs to be taken when exiting the outer handler. If another
interrupt comes in while it is copying the thread_info, it will drop a bit
into pending_mask. The outer handler will check this and if it is non-zero,
will loop, set up the stack again, and handle the interrupt.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some tidying of the irq code before introducing irq stacks. Mostly
style fixes, but the timer handler calls the timer code directly
rather than going through the generic sig_handler_common_skas.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
user_util.h isn't needed any more, so delete it and remove all includes of it.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>