writeback: control dirty pause time

The dirty pause time shall ultimately be controlled by adjusting
nr_dirtied_pause, since there is the relationship

	pause = pages_dirtied / task_ratelimit

Assuming

	pages_dirtied ~= nr_dirtied_pause
	task_ratelimit ~= dirty_ratelimit

We get

	nr_dirtied_pause ~= dirty_ratelimit * desired_pause

Here dirty_ratelimit is preferred over task_ratelimit because it's
more stable.
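
As a rough worked example of the formula above, the following user-space
sketch computes the target nr_dirtied_pause for a given rate and desired
pause. The helper name target_nr_dirtied_pause() and the HZ value of 1000
are assumptions made for illustration; neither is taken from the kernel
source. The division by HZ converts a pause given in ticks into seconds,
since dirty_ratelimit is in pages per second.

```c
#define HZ 1000	/* assumption: ticks per second; configurable in a real kernel */

/*
 * Hypothetical helper, not a kernel function.
 *
 * nr_dirtied_pause ~= dirty_ratelimit * desired_pause
 *
 * dirty_ratelimit is in pages per second, desired_pause is in ticks.
 */
static unsigned long target_nr_dirtied_pause(unsigned long dirty_ratelimit,
					     unsigned long desired_pause)
{
	return dirty_ratelimit * desired_pause / HZ;
}
```

With these assumed numbers, a rate of 25600 pages/s (100 MB/s with 4 KB
pages) and a desired pause of 100 ticks (100 ms) give
target_nr_dirtied_pause(25600, 100) == 2560 pages.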

It's also important to limit possible large transitional errors:

- bw is changing quickly
- pages_dirtied << nr_dirtied_pause on entering dirty exceeded area
- pages_dirtied >> nr_dirtied_pause on btrfs (to be improved by a
  separate fix, but still expect non-trivial errors)

So we end up using the above formula inside clamp_val().
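
The clamped update can be modelled in user space to try out numbers. The
sketch below mirrors the branches of this patch, but clamp_ul() and
next_nr_dirtied_pause() are hypothetical names, HZ = 1000 is an assumed
value, and poll_interval stands in for the result of
dirty_poll_interval(). It is a simplified model, not the kernel code.

```c
#define HZ 1000	/* assumption: ticks per second */

/* Simplified stand-in for the kernel's clamp_val() macro. */
static unsigned long clamp_ul(unsigned long val, unsigned long lo,
			      unsigned long hi)
{
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

/* Hypothetical model of how the patch picks the next nr_dirtied_pause. */
static unsigned long next_nr_dirtied_pause(long pause, long max_pause,
					   unsigned long pages_dirtied,
					   unsigned long nr_dirtied_pause,
					   unsigned long dirty_ratelimit,
					   unsigned long poll_interval)
{
	/* Formula value, aiming at a pause of max_pause / 2. */
	unsigned long target = dirty_ratelimit *
			       (unsigned long)(max_pause / 2) / HZ;

	if (pause == 0)		/* freerun area: fall back to the poll interval */
		return poll_interval;

	if (pause <= max_pause / 4 && pages_dirtied >= nr_dirtied_pause)
		/* pause too short: grow by at least 1/8, at most 4x */
		return clamp_ul(target, pages_dirtied + pages_dirtied / 8,
				pages_dirtied * 4);

	if (pause >= max_pause)
		/* pause too long: shrink by at least 1/8, at most to 1/4 */
		return 1 | clamp_ul(target, pages_dirtied / 4,
				    pages_dirtied - pages_dirtied / 8);

	return nr_dirtied_pause;	/* pause is acceptable: keep as is */
}
```

For example, with max_pause = 200 and dirty_ratelimit = 25600 the formula
value is 2560. A task that dirtied 32 pages and paused for 10 ticks is
raised only to 128 (the 4x cap), while a task that dirtied 20000 pages and
hit max_pause is lowered only to 5001 (the 1/4 floor, with the low bit set).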

The best test case for this code is to run 100 "dd bs=4M" tasks on
btrfs and check its pause time distribution.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Wu Fengguang 2011-06-11 19:32:32 -06:00
parent c8462cc9de
commit 57fc978cfb

@@ -1086,6 +1086,10 @@ static void balance_dirty_pages(struct address_space *mapping,
 		task_ratelimit = (u64)dirty_ratelimit *
 					pos_ratio >> RATELIMIT_CALC_SHIFT;
 		pause = (HZ * pages_dirtied) / (task_ratelimit | 1);
+		if (unlikely(pause <= 0)) {
+			pause = 1; /* avoid resetting nr_dirtied_pause below */
+			break;
+		}
 		pause = min(pause, max_pause);
 pause:
@@ -1107,7 +1111,21 @@ pause:
 		bdi->dirty_exceeded = 0;
 	current->nr_dirtied = 0;
-	current->nr_dirtied_pause = dirty_poll_interval(nr_dirty, dirty_thresh);
+	if (pause == 0) { /* in freerun area */
+		current->nr_dirtied_pause =
+				dirty_poll_interval(nr_dirty, dirty_thresh);
+	} else if (pause <= max_pause / 4 &&
+		   pages_dirtied >= current->nr_dirtied_pause) {
+		current->nr_dirtied_pause = clamp_val(
+					dirty_ratelimit * (max_pause / 2) / HZ,
+					pages_dirtied + pages_dirtied / 8,
+					pages_dirtied * 4);
+	} else if (pause >= max_pause) {
+		current->nr_dirtied_pause = 1 | clamp_val(
+					dirty_ratelimit * (max_pause / 2) / HZ,
+					pages_dirtied / 4,
+					pages_dirtied - pages_dirtied / 8);
+	}
 	if (writeback_in_progress(bdi))
 		return;