linux/kernel/sched
Alex Shi a75cdaa915 sched: Set an initial value of runnable avg for new forked task
We need to initialize the se.avg.{decay_count, load_avg_contrib} for a
new forked task. Otherwise random values of above variables cause a
mess when a new task is enqueued:

    enqueue_task_fair
        enqueue_entity
            enqueue_entity_load_avg

and make fork balancing imbalance due to incorrect load_avg_contrib.

Further more, Morten Rasmussen notice some tasks were not launched at
once after created. So Paul and Peter suggest giving a start value for
new task runnable avg time same as sched_slice().

PeterZ said:

> So the 'problem' is that our running avg is a 'floating' average; ie. it
> decays with time. Now we have to guess about the future of our newly
> spawned task -- something that is nigh impossible seeing these CPU
> vendors keep refusing to implement the crystal ball instruction.
>
> So there's two asymptotic cases we want to deal well with; 1) the case
> where the newly spawned program will be 'nearly' idle for its lifetime;
> and 2) the case where its cpu-bound.
>
> Since we have to guess, we'll go for worst case and assume its
> cpu-bound; now we don't want to make the avg so heavy adjusting to the
> near-idle case takes forever. We want to be able to quickly adjust and
> lower our running avg.
>
> Now we also don't want to make our avg too light, such that it gets
> decremented just for the new task not having had a chance to run yet --
> even if when it would run, it would be more cpu-bound than not.
>
> So what we do is we make the initial avg of the same duration as that we
> guess it takes to run each task on the system at least once -- aka
> sched_slice().
>
> Of course we can defeat this with wakeup/fork bombs, but in the 'normal'
> case it should be good enough.

Paul also contributed most of the code comments in this commit.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reviewed-by: Paul Turner <pjt@google.com>
[peterz; added explanation of sched_slice() usage]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1371694737-29336-4-git-send-email-alex.shi@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-06-27 10:07:30 +02:00
..
auto_group.c sched/autogroup: Fix race with task_groups list 2013-05-28 09:40:22 +02:00
auto_group.h Revert "sched/autogroup: Fix crash on reboot when autogroup is disabled" 2012-12-11 10:23:45 +01:00
clock.c sched_clock: Prevent 64bit inatomicity on 32bit systems 2013-04-08 11:50:44 +02:00
core.c sched: Set an initial value of runnable avg for new forked task 2013-06-27 10:07:30 +02:00
cpuacct.c sched/cpuacct/UML: Fix header file dependency bug on the UML build 2013-04-10 15:12:41 +02:00
cpuacct.h sched/cpuacct: Initialize root cpuacct earlier 2013-04-10 13:54:20 +02:00
cpupri.c sched/rt: Move rt specific bits into new header file 2013-02-07 20:51:08 +01:00
cpupri.h sched: Move all scheduler bits into kernel/sched/ 2011-11-17 12:20:22 +01:00
cputime.c sched: Use swap() macro in scale_stime() 2013-05-28 11:58:10 +02:00
debug.c Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-02-26 19:42:08 -08:00
fair.c sched: Set an initial value of runnable avg for new forked task 2013-06-27 10:07:30 +02:00
features.h mutex: Move mutex spinning code from sched/core.c back to mutex.c 2013-04-19 09:33:34 +02:00
idle_task.c sched: Keep at least 1 tick per second for active dynticks tasks 2013-05-04 08:32:02 +02:00
Makefile sched: Factor out load calculation code from sched/core.c --> sched/proc.c 2013-05-07 13:14:50 +02:00
proc.c sched: Factor out load calculation code from sched/core.c --> sched/proc.c 2013-05-07 13:14:50 +02:00
rt.c sched/rt: Simplify pull_rt_task() logic and remove .leaf_rt_rq_list 2013-06-19 12:58:40 +02:00
sched.h sched: Set an initial value of runnable avg for new forked task 2013-06-27 10:07:30 +02:00
stats.c fix a leak in /proc/schedstats 2013-04-29 15:41:45 -04:00
stats.h sched: Use an accessor to read the rq clock 2013-05-28 09:40:27 +02:00
stop_task.c sched: Use an accessor to read the rq clock 2013-05-28 09:40:27 +02:00