Commit Graph

244110 Commits

Author SHA1 Message Date
Peter Zijlstra
057f3fadb3 sched: Fix sched_domain iterations vs. RCU
Vladis Kletnieks reported a new RCU debug warning in the scheduler.

Since commit dce840a087 ("sched: Dynamically allocate sched_domain/
sched_group data-structures") the sched_domain trees are protected by
RCU instead of RCU-sched.

This means that we need to include rcu_read_lock() protection when we
iterate them since disabling preemption doesn't suffice anymore.

Reported-by: Valdis.Kletnieks@vt.edu
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1302882741.2388.241.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-19 10:56:54 +02:00
Venkatesh Pallipadi
2f36825b17 sched: Next buddy hint on sleep and preempt path
When a task in a taskgroup sleeps, pick_next_task starts all the way back at
the root and picks the task/taskgroup with the min vruntime across all
runnable tasks.

But when there are many frequently sleeping tasks across different taskgroups,
it makes better sense to stay with same taskgroup for its slice period (or
until all tasks in the taskgroup sleeps) instead of switching cross taskgroup
on each sleep after a short runtime.

This helps specifically where taskgroups corresponds to a process with
multiple threads. The change reduces the number of CR3 switches in this case.

Example:

Two taskgroups with 2 threads each which are running for 2ms and
sleeping for 1ms. Looking at sched:sched_switch shows:

BEFORE: taskgroup_1 threads [5004, 5005], taskgroup_2 threads [5016, 5017]
      cpu-soaker-5004  [003]  3683.391089
      cpu-soaker-5016  [003]  3683.393106
      cpu-soaker-5005  [003]  3683.395119
      cpu-soaker-5017  [003]  3683.397130
      cpu-soaker-5004  [003]  3683.399143
      cpu-soaker-5016  [003]  3683.401155
      cpu-soaker-5005  [003]  3683.403168
      cpu-soaker-5017  [003]  3683.405170

AFTER: taskgroup_1 threads [21890, 21891], taskgroup_2 threads [21934, 21935]
      cpu-soaker-21890 [003]   865.895494
      cpu-soaker-21935 [003]   865.897506
      cpu-soaker-21934 [003]   865.899520
      cpu-soaker-21935 [003]   865.901532
      cpu-soaker-21934 [003]   865.903543
      cpu-soaker-21935 [003]   865.905546
      cpu-soaker-21891 [003]   865.907548
      cpu-soaker-21890 [003]   865.909560
      cpu-soaker-21891 [003]   865.911571
      cpu-soaker-21890 [003]   865.913582
      cpu-soaker-21891 [003]   865.915594
      cpu-soaker-21934 [003]   865.917606

Similar problem is there when there are multiple taskgroups and say a task A
preempts currently running task B of taskgroup_1. On schedule, pick_next_task
can pick an unrelated task on taskgroup_2. Here it would be better to give some
preference to task B on pick_next_task.

A simple (may be extreme case) benchmark I tried was tbench with 2 tbench
client processes with 2 threads each running on a single CPU. Avg throughput
across 5 50 sec runs was:

 BEFORE: 105.84 MB/sec
 AFTER:  112.42 MB/sec

Signed-off-by: Venkatesh Pallipadi <venki@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1302802253-25760-1-git-send-email-venki@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-19 10:08:38 +02:00
Venkatesh Pallipadi
69c80f3e9d sched: Make set_*_buddy() work on non-task entities
Make set_*_buddy() work on non-task sched_entity, to facilitate the
use of next_buddy to cache a group entity in cases where one of the
tasks within that entity sleeps or gets preempted.

set_skip_buddy() was incorrectly comparing the policy of task that is
yielding to be not equal to SCHED_IDLE. Yielding should happen even
when task yielding is SCHED_IDLE. This change removes the policy check
on the yielding task.

Signed-off-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1302744070-30079-2-git-send-email-venki@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-19 10:08:37 +02:00
Ingo Molnar
6ddafdaab3 Merge branch 'sched/locking' into sched/core
Merge reason: the rq locking changes are stable,
              propagate them into the .40 queue.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-18 14:53:33 +02:00
Peter Zijlstra
bd8e7dded8 sched: Remove need_migrate_task()
Oleg noticed that need_migrate_task() doesn't need the ->on_cpu check
now that ttwu() doesn't do remote enqueues for !->on_rq && ->on_cpu,
so remove the helper and replace the single instance with a direct
->on_rq test.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110405152729.556674812@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-14 08:52:41 +02:00
Peter Zijlstra
317f394160 sched: Move the second half of ttwu() to the remote cpu
Now that we've removed the rq->lock requirement from the first part of
ttwu() and can compute placement without holding any rq->lock, ensure
we execute the second half of ttwu() on the actual cpu we want the
task to run on.

This avoids having to take rq->lock and doing the task enqueue
remotely, saving lots on cacheline transfers.

As measured using: http://oss.oracle.com/~mason/sembench.c

  $ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i; done
  $ echo 4096 32000 64 128 > /proc/sys/kernel/sem
  $ ./sembench -t 2048 -w 1900 -o 0

  unpatched: run time 30 seconds 647278 worker burns per second
  patched:   run time 30 seconds 816715 worker burns per second

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152729.515897185@chello.nl
2011-04-14 08:52:41 +02:00
Peter Zijlstra
c05fbafba1 sched: Restructure ttwu() some more
Factor our helper functions to make the inner workings of try_to_wake_up()
more obvious, this also allows for adding remote queues.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152729.475848012@chello.nl
2011-04-14 08:52:40 +02:00
Peter Zijlstra
23f41eeb42 sched: Rename ttwu_post_activation() to ttwu_do_wakeup()
The ttwu_post_activation() code does the core wakeup, it sets TASK_RUNNING
and performs wakeup-preemption, so give is a more descriptive name.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152729.434609705@chello.nl
2011-04-14 08:52:40 +02:00
Peter Zijlstra
b84cb5df1f sched: Remove rq argument from ttwu_stat()
In order to call ttwu_stat() without holding rq->lock we must remove
its rq argument. Since we need to change rq stats, account to the
local rq instead of the task rq, this is safe since we have IRQs
disabled.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152729.394638826@chello.nl
2011-04-14 08:52:40 +02:00
Peter Zijlstra
e4a52bcb9a sched: Remove rq->lock from the first half of ttwu()
Currently ttwu() does two rq->lock acquisitions, once on the task's
old rq, holding it over the p->state fiddling and load-balance pass.
Then it drops the old rq->lock to acquire the new rq->lock.

By having serialized ttwu(), p->sched_class, p->cpus_allowed with
p->pi_lock, we can now drop the whole first rq->lock acquisition.

The p->pi_lock serializing concurrent ttwu() calls protects p->state,
which we will set to TASK_WAKING to bridge possible p->pi_lock to
rq->lock gaps and serialize set_task_cpu() calls against
task_rq_lock().

The p->pi_lock serialization of p->sched_class allows us to call
scheduling class methods without holding the rq->lock, and the
serialization of p->cpus_allowed allows us to do the load-balancing
bits without races.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152729.354401150@chello.nl
2011-04-14 08:52:39 +02:00
Peter Zijlstra
8f42ced974 sched: Drop rq->lock from sched_exec()
Since we can now call select_task_rq() and set_task_cpu() with only
p->pi_lock held, and sched_exec() load-balancing has always been
optimistic, drop all rq->lock usage.

Oleg also noted that need_migrate_task() will always be true for
current, so don't bother calling that at all.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110405152729.314204889@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-14 08:52:39 +02:00
Peter Zijlstra
ab2515c4b9 sched: Drop rq->lock from first part of wake_up_new_task()
Since p->pi_lock now protects all things needed to call
select_task_rq() avoid the double remote rq->lock acquisition and rely
on p->pi_lock.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110405152729.273362517@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-14 08:52:38 +02:00
Peter Zijlstra
0122ec5b02 sched: Add p->pi_lock to task_rq_lock()
In order to be able to call set_task_cpu() while either holding
p->pi_lock or task_rq(p)->lock we need to hold both locks in order to
stabilize task_rq().

This makes task_rq_lock() acquire both locks, and have
__task_rq_lock() validate that p->pi_lock is held. This increases the
locking overhead for most scheduler syscalls but allows reduction of
rq->lock contention for some scheduler hot paths (ttwu).

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110405152729.232781355@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-14 08:52:38 +02:00
Peter Zijlstra
2acca55ed9 sched: Also serialize ttwu_local() with p->pi_lock
Since we now serialize ttwu() using p->pi_lock, we also need to
serialize ttwu_local() using that, otherwise, once we drop the
rq->lock from ttwu() it can race with ttwu_local().

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152729.192366907@chello.nl
2011-04-14 08:52:37 +02:00
Peter Zijlstra
a8e4f2eaec sched: Delay task_contributes_to_load()
In prepratation of having to call task_contributes_to_load() without
holding rq->lock, we need to store the result until we do and can
update the rq accounting accordingly.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152729.151523907@chello.nl
2011-04-14 08:52:37 +02:00
Peter Zijlstra
3fe1698b7f sched: Deal with non-atomic min_vruntime reads on 32bits
In order to avoid reading partial updated min_vruntime values on 32bit
implement a seqcount like solution.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110405152729.111378493@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-14 08:52:37 +02:00
Peter Zijlstra
74f8e4b233 sched: Remove rq argument to sched_class::task_waking()
In preparation of calling this without rq->lock held, remove the
dependency on the rq argument.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110405152729.071474242@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-14 08:52:36 +02:00
Peter Zijlstra
7608dec2ce sched: Drop the rq argument to sched_class::select_task_rq()
In preparation of calling select_task_rq() without rq->lock held, drop
the dependency on the rq argument.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110405152729.031077745@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-14 08:52:36 +02:00
Peter Zijlstra
013fdb8086 sched: Serialize p->cpus_allowed and ttwu() using p->pi_lock
Currently p->pi_lock already serializes p->sched_class, also put
p->cpus_allowed and try_to_wake_up() under it, this prepares the way
to do the first part of ttwu() without holding rq->lock.

By having p->sched_class and p->cpus_allowed serialized by p->pi_lock,
we prepare the way to call select_task_rq() without holding rq->lock.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110405152728.990364093@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-14 08:52:35 +02:00
Peter Zijlstra
fd2f4419b4 sched: Provide p->on_rq
Provide a generic p->on_rq because the p->se.on_rq semantics are
unfavourable for lockless wakeups but needed for sched_fair.

In particular, p->on_rq is only cleared when we actually dequeue the
task in schedule() and not on any random dequeue as done by things
like __migrate_task() and __sched_setscheduler().

This also allows us to remove p->se usage from !sched_fair code.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152728.949545047@chello.nl
2011-04-14 08:52:35 +02:00
Peter Zijlstra
d7c01d27ab sched: Clean up ttwu() stats
Collect all ttwu() stat code into a single function and ensure its
always called for an actual wakeup (changing p->state to
TASK_RUNNING).

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152728.908177058@chello.nl
2011-04-14 08:52:34 +02:00
Peter Zijlstra
893633817f sched: Change the ttwu() success details
try_to_wake_up() would only return a success when it would have to
place a task on a rq, change that to every time we change p->state to
TASK_RUNNING, because that's the real measure of wakeups.

This results in that success is always true for the tracepoints.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152728.866866929@chello.nl
2011-04-14 08:52:34 +02:00
Peter Zijlstra
c2f7115e2e sched: Move wq_worker_waking to the correct site
wq_worker_waking_up() needs to match wq_worker_sleeping(), since the
latter is only called on deactivate, move the former near activate.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/n/top-t3m7n70n9frmv4pv2n5fwmov@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-14 08:52:33 +02:00
Peter Zijlstra
c6eb3dda25 mutex: Use p->on_cpu for the adaptive spin
Since we now have p->on_cpu unconditionally available, use it to
re-implement mutex_spin_on_owner.

Requested-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152728.826338173@chello.nl
2011-04-14 08:52:33 +02:00
Peter Zijlstra
3ca7a440da sched: Always provide p->on_cpu
Always provide p->on_cpu so that we can determine if its on a cpu
without having to lock the rq.

Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110405152728.785452014@chello.nl
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-14 08:52:32 +02:00
Peter Zijlstra
184748cc50 sched: Provide scheduler_ipi() callback in response to smp_send_reschedule()
For future rework of try_to_wake_up() we'd like to push part of that
function onto the CPU the task is actually going to run on.

In order to do so we need a generic callback from the existing scheduler IPI.

This patch introduces such a generic callback: scheduler_ipi() and
implements it as a NOP.

BenH notes: PowerPC might use this IPI on offline CPUs under rare conditions!

Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152728.744338123@chello.nl
2011-04-14 08:52:32 +02:00
Ingo Molnar
a4c98f8bbe Merge branch 'linus' into sched/locking
Merge reason: Pick up this upstream commit:

  6631e635c6: block: don't flush plugged IO on forced preemtion scheduling

As it modifies the scheduler and we'll queue up dependent patches.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-14 08:51:07 +02:00
Linus Torvalds
85f2e689a5 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86:
  x86 platform drivers: Build fix for intel_pmic_gpio
2011-04-13 09:15:55 -07:00
Linus Torvalds
66bbf58b55 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/avr32-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/avr32-2.6:
  avr32: add ATAG_BOARDINFO
  don't check platform_get_irq's return value against zero
  avr32: init cannot ignore signals sent by force_sig_info()
  avr32: fix deadlock when reading clock list in debugfs
  avr32: Fix .size directive for cpu_enter_idle
  avr32: At32ap: pio fix typo "))" on gpio_irq_unmask prototype
  fix the wrong argument of the functions definition
2011-04-13 09:15:40 -07:00
Linus Torvalds
1f4f8eeaec Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (22 commits)
  Revert "i915: restore only the mode of this driver on lastclose"
  Revert "ttm: Utilize the DMA API for pages that have TTM_PAGE_FLAG_DMA32 set."
  i915: select VIDEO_OUTPUT_CONTROL for ACPI_VIDEO
  drm/radeon/kms: properly program vddci on evergreen+
  drm/radeon/kms: add voltage type to atom set voltage function
  drm/radeon/kms: fix pcie_p callbacks on btc and cayman
  drm/radeon/kms: fix suspend on rv530 asics
  drm/radeon/kms: clean up gart dummy page handling
  drm/radeon/kms: make radeon i2c put/get bytes less noisy
  drm/radeon/kms: pll tweaks for rv6xx
  drm/radeon: Fix KMS legacy backlight support if CONFIG_BACKLIGHT_CLASS_DEVICE=m.
  radeon: Fix KMS CP writeback on big endian machines.
  i915: restore only the mode of this driver on lastclose
  drm/nvc0: improve vm flush function
  drm/nv50-nvc0: remove some code that doesn't belong here
  drm/nv50: use "nv86" tlb flush method on everything except 0x50/0xac
  drm/nouveau: quirk for XFX GT-240X-YA
  drm/nv50-nvc0: work around an evo channel hang that some people see
  drm/nouveau: implement init table opcode 0x5c
  drm/nouveau: fix oops on unload with disabled LVDS panel
  ...
2011-04-13 09:10:25 -07:00
Matthew Garrett
21a8d026e0 x86 platform drivers: Build fix for intel_pmic_gpio
Fix an incorrect function name so the driver builds.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
2011-04-13 11:52:16 -04:00
Linus Torvalds
6631e635c6 block: don't flush plugged IO on forced preemtion scheduling
We really only want to unplug the pending IO when the process actually
goes to sleep.  So move the test for flushing the plug up to the place
where we actually deactivate the task - where we have properly checked
for preemption and for the process really sleeping.

Acked-by: Jens Axboe <jaxboe@fusionio.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-13 08:08:20 -07:00
Linus Torvalds
a626ca6a65 vm: fix vm_pgoff wrap in stack expansion
Commit 982134ba62 ("mm: avoid wrapping vm_pgoff in mremap()") fixed
the case of a expanding mapping causing vm_pgoff wrapping when you used
mremap.  But there was another case where we expand mappings hiding in
plain sight: the automatic stack expansion.

This fixes that case too.

This one also found by Robert Święcki, using his nasty system call
fuzzer tool.  Good job.

Reported-and-tested-by: Robert Święcki <robert@swiecki.net>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-13 08:07:28 -07:00
Andreas Bießmann
24a1a47562 avr32: add ATAG_BOARDINFO
The ATAG_BOARDINFO is intended to hand over the information
bd->bi_board_number from u-boot to the kernel.

This piece of information can be used to implement some kind of board
identification while booting the kernel. Therefore it is placed in .initdata
section and can be accessed via the new symbol board_number only while
initializing the kernel.

Signed-off-by: Andreas Bießmann <biessmann@corscience.de>
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
2011-04-13 15:46:59 +02:00
Uwe Kleine-König
c7d876321f don't check platform_get_irq's return value against zero
platform_get_irq returns -ENXIO on failure, so !int_irq was probably
always true.  Better use (int)int_irq <= 0.  Note that a return value of
zero is still handled as error even though this could mean irq0.

This is a followup to 305b3228f9 that
changed the return value of platform_get_irq from 0 to -ENXIO on error.

Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
2011-04-13 15:46:57 +02:00
Matt Fleming
9f0d15aac9 avr32: init cannot ignore signals sent by force_sig_info()
We can delete the code that checks to see if we're sending an ignored
signal to init because force_sig_info() already handles this case.
force_sig_info() will kill init even if the signal handler is SIG_DFL
and the scenario described in the comment where init might "generate
the same exception over and over again" cannot occur (force_sig_info()
clears SIGNAL_UNKILLABLE to ensure that init will die).

Also, the use of is_global_init() is not correct in the multhreaded
case, as Oleg Nesterov explains,

	"is_global_init() is not right in theory, /sbin/init can be
	multithreaded. And, this doesn't cover the sub-namespace
	inits... I'd suggest to check SIGNAL_UNKILLABLE, but looking
	closer I think you can simply remove this code."

It seems this code was copied from arch/powerpc in March 2007 in commit

  623b0355d5 "[AVR32] Clean up exception handling code"

but the code was deleted from arch/powerpc in November 2009 in commit

  a0592d42fe "powerpc: kill the obsolete code under is_global_init()"

So catch up with powerpc and delete the bogus code.

Signed-off-by: Matt Fleming <matt.fleming@linux.intel.com>
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
2011-04-13 15:46:55 +02:00
Ole Henrik Jahren
6e2ad51190 avr32: fix deadlock when reading clock list in debugfs
When writing out /sys/kernel/debug/at32ap_clk, clock list lock is being
held while clk_get() is called. clk_get() attempts to take the same
lock, which results in deadlock. Introduce and call lock free version,
__clk_get(), instead.

Signed-off-by: Ole Henrik Jahren <olehenja@alumni.ntnu.no>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
2011-04-13 15:46:52 +02:00
Ben Hutchings
51ef85d8f9 avr32: Fix .size directive for cpu_enter_idle
gas used to accept (and ignore?) .size directives which referred to
undefined symbols, as this does.  In binutils 2.21 these are treated
as errors.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
2011-04-13 15:46:49 +02:00
Jean-Christophe PLAGNIOL-VILLARD
024b3f2936 avr32: At32ap: pio fix typo "))" on gpio_irq_unmask prototype
introduce in commit d75f1bfdbc

Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Patrice Vilchez <patrice.vilchez@atmel.com>
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
2011-04-13 15:46:45 +02:00
Wanlong Gao
8faf9e3838 fix the wrong argument of the functions definition
The functions of eic_chip's memebers use the wrong argument .

Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com>
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
2011-04-13 15:46:37 +02:00
Geert Uytterhoeven
60d48c1e67 m68k,m68knommu: Wire up name_to_handle_at, open_by_handle_at, clock_adjtime, syncfs
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-12 19:02:03 -07:00
Linus Torvalds
aaa119a3d4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  fix XEN_SAVE_RESTORE Kconfig dependencies
  PM / Hibernate: Introduce CONFIG_HIBERNATE_CALLBACKS
2011-04-12 17:18:05 -07:00
Dave Airlie
2582b6efce Revert "i915: restore only the mode of this driver on lastclose"
This reverts commit 0a0883c843.

this was in my tree by accident, I meant to rebase it out and
didn't realise in time.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-04-13 09:20:24 +10:00
Dave Airlie
d87dfdbfc9 Revert "ttm: Utilize the DMA API for pages that have TTM_PAGE_FLAG_DMA32 set."
This reverts commit 69a07f0b11.

We've tracked a number of problems back to this, and Thomas
thinks we should redesign this for .40/41 anyways so I'm
happy to revert it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-04-13 09:15:09 +10:00
Konstantin Khlebnikov
cbf15bdbbd i915: select VIDEO_OUTPUT_CONTROL for ACPI_VIDEO
fix Kconfig warning:

(DRM_I915 && STUB_POULSBO) selects ACPI_VIDEO which has unmet direct dependencies
(ACPI && X86 && BACKLIGHT_CLASS_DEVICE && VIDEO_OUTPUT_CONTROL && INPUT)

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-04-13 09:10:25 +10:00
Alex Deucher
2feea49ae3 drm/radeon/kms: properly program vddci on evergreen+
Change vddci as well as vddc when changing power modes
on evergreen/ni.  Also, properly set vddci on boot up
for ni cards.  The vbios only sets the limited clocks
and voltages on boot until the mc ucode is loaded.  This
should fix stability problems on some btc cards.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-04-13 09:09:44 +10:00
Alex Deucher
8a83ec5ee8 drm/radeon/kms: add voltage type to atom set voltage function
This is needed for setting voltages other than vddc.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-04-13 09:09:42 +10:00
Alex Deucher
b4df8be104 drm/radeon/kms: fix pcie_p callbacks on btc and cayman
btc and cayman asics use the same callback for
pcie port registers.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-04-13 09:09:30 +10:00
Alex Deucher
71e16bfbd2 drm/radeon/kms: fix suspend on rv530 asics
Apparently only rv515 asics need the workaround
added in f24d86f1a4
(drm/radeon/kms: fix resume regression for some r5xx laptops).

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=34709

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-04-13 09:09:27 +10:00
Alex Deucher
92656d707e drm/radeon/kms: clean up gart dummy page handling
As per Konrad's original patch, the dummy page used
by the gart code and allocated in radeon_gart_init()
was not freed properly in radeon_gart_fini().

At the same time r6xx and newer allocated and freed the
dummy page on their own.  So to do Konrad's patch one
better, just remove the allocation and freeing of the
dummy page in the r6xx, 7xx, evergreen, and ni code and
allocate and free in the gart_init/fini() functions for
all asics.

Cc: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-04-13 09:09:22 +10:00