block, bfq: check also in-flight I/O in dispatch plugging

Consider a sync bfq_queue Q that remains empty while in service, and
suppose that, when this happens, there is a fair amount of already
in-flight I/O not belonging to Q. In such a situation, I/O dispatching
may need to be plugged (until new I/O arrives for Q), for the
following reason.

The drive may decide to serve in-flight non-Q's I/O requests before
Q's ones, thereby delaying the arrival of new I/O requests for Q
(recall that Q is sync). If I/O-dispatching is not plugged, then,
while Q remains empty, a basically uncontrolled amount of I/O from
other queues may be dispatched too, possibly causing the service of
Q's I/O to be delayed even longer in the drive. This problem gets more
and more serious as the speed and the queue depth of the drive grow,
because, as these two quantities grow, the probability of finding no
queue busy but many requests in flight grows too.

If Q has the same weight and priority as the other queues, then the
above delay is unlikely to cause any issue, because all queues tend to
undergo the same treatment. So, since not plugging I/O dispatching is
convenient for throughput, it is better not to plug. Things change in
case Q has a higher weight or priority than some other queue, because
Q's service guarantees may simply be violated. For this reason,
commit 1de0c4cd9e ("block, bfq: reduce idling only in symmetric
scenarios") does plug I/O in such an asymmetric scenario. Plugging
minimizes the delay induced by already in-flight I/O, and enables Q to
recover the bandwidth it may lose because of this delay.

Yet the above commit does not cover the case of weight-raised queues,
for efficiency concerns. For weight-raised queues, I/O-dispatch
plugging is activated simply if not all bfq_queues are
weight-raised. But this check does not handle the case of in-flight
requests, because a bfq_queue may become non-busy *before* all its
in-flight requests are completed.

This commit also performs I/O-dispatch plugging for a weight-raised
queue if the requests in flight in the drive exceed those dispatched
from that queue by at least four.

As a practical example of the resulting recovery of control, under
write load on a Samsung SSD 970 PRO, gnome-terminal starts in 1.5
seconds after this fix, against 15 seconds before the fix (as a
reference, gnome-terminal takes about 35 seconds to start with any of
the other I/O schedulers).

Fixes: 1de0c4cd9e ("block, bfq: reduce idling only in symmetric scenarios")
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Paolo Valente 2019-07-18 09:08:52 +02:00 committed by Jens Axboe
parent 1624b0b200
commit b5e02b484d

@@ -3354,38 +3354,57 @@ static void bfq_dispatch_remove(struct request_queue *q, struct request *rq)
  * there is no active group, then the primary expectation for
  * this device is probably a high throughput.
  *
- * We are now left only with explaining the additional
- * compound condition that is checked below for deciding
- * whether the scenario is asymmetric. To explain this
- * compound condition, we need to add that the function
+ * We are now left only with explaining the two sub-conditions in the
+ * additional compound condition that is checked below for deciding
+ * whether the scenario is asymmetric. To explain the first
+ * sub-condition, we need to add that the function
  * bfq_asymmetric_scenario checks the weights of only
- * non-weight-raised queues, for efficiency reasons (see
- * comments on bfq_weights_tree_add()). Then the fact that
- * bfqq is weight-raised is checked explicitly here. More
- * precisely, the compound condition below takes into account
- * also the fact that, even if bfqq is being weight-raised,
- * the scenario is still symmetric if all queues with requests
- * waiting for completion happen to be
- * weight-raised. Actually, we should be even more precise
- * here, and differentiate between interactive weight raising
- * and soft real-time weight raising.
+ * non-weight-raised queues, for efficiency reasons (see comments on
+ * bfq_weights_tree_add()). Then the fact that bfqq is weight-raised
+ * is checked explicitly here. More precisely, the compound condition
+ * below takes into account also the fact that, even if bfqq is being
+ * weight-raised, the scenario is still symmetric if all queues with
+ * requests waiting for completion happen to be
+ * weight-raised. Actually, we should be even more precise here, and
+ * differentiate between interactive weight raising and soft real-time
+ * weight raising.
+ *
+ * The second sub-condition checked in the compound condition is
+ * whether there is a fair amount of already in-flight I/O not
+ * belonging to bfqq. If so, I/O dispatching is to be plugged, for the
+ * following reason. The drive may decide to serve in-flight
+ * non-bfqq's I/O requests before bfqq's ones, thereby delaying the
+ * arrival of new I/O requests for bfqq (recall that bfqq is sync). If
+ * I/O-dispatching is not plugged, then, while bfqq remains empty, a
+ * basically uncontrolled amount of I/O from other queues may be
+ * dispatched too, possibly causing the service of bfqq's I/O to be
+ * delayed even longer in the drive. This problem gets more and more
+ * serious as the speed and the queue depth of the drive grow,
+ * because, as these two quantities grow, the probability to find no
+ * queue busy but many requests in flight grows too. By contrast,
+ * plugging I/O dispatching minimizes the delay induced by already
+ * in-flight I/O, and enables bfqq to recover the bandwidth it may
+ * lose because of this delay.
  *
  * As a side note, it is worth considering that the above
- * device-idling countermeasures may however fail in the
- * following unlucky scenario: if idling is (correctly)
- * disabled in a time period during which all symmetry
- * sub-conditions hold, and hence the device is allowed to
- * enqueue many requests, but at some later point in time some
- * sub-condition stops to hold, then it may become impossible
- * to let requests be served in the desired order until all
- * the requests already queued in the device have been served.
+ * device-idling countermeasures may however fail in the following
+ * unlucky scenario: if I/O-dispatch plugging is (correctly) disabled
+ * in a time period during which all symmetry sub-conditions hold, and
+ * therefore the device is allowed to enqueue many requests, but at
+ * some later point in time some sub-condition stops to hold, then it
+ * may become impossible to make requests be served in the desired
+ * order until all the requests already queued in the device have been
+ * served. The last sub-condition commented above somewhat mitigates
+ * this problem for weight-raised queues.
  */
 static bool idling_needed_for_service_guarantees(struct bfq_data *bfqd,
 						 struct bfq_queue *bfqq)
 {
 	return (bfqq->wr_coeff > 1 &&
-		bfqd->wr_busy_queues <
-		bfq_tot_busy_queues(bfqd)) ||
+		(bfqd->wr_busy_queues <
+		 bfq_tot_busy_queues(bfqd) ||
+		 bfqd->rq_in_driver >=
+		 bfqq->dispatched + 4)) ||
 		bfq_asymmetric_scenario(bfqd, bfqq);
 }
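
The effect of the extra sub-condition can be checked outside the kernel with a minimal user-space sketch. It is an illustration under stated assumptions, not the kernel code: the struct names, the `tot_busy_queues` field (standing in for bfq_tot_busy_queues(bfqd)) and the `asymmetric` flag (standing in for bfq_asymmetric_scenario()) are hypothetical simplifications. Only the shape of the two predicates and the constant 4 are taken from the hunk above.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for struct bfq_data and bfq_queue. */
struct bfq_data_sketch {
	int wr_busy_queues;  /* busy queues that are weight-raised */
	int tot_busy_queues; /* assumed stand-in for bfq_tot_busy_queues() */
	int rq_in_driver;    /* requests dispatched to the drive, not yet completed */
	bool asymmetric;     /* assumed stand-in for bfq_asymmetric_scenario() */
};

struct bfq_queue_sketch {
	int wr_coeff;   /* greater than 1 while the queue is weight-raised */
	int dispatched; /* in-flight requests belonging to this queue */
};

/* Predicate as it was before this commit. */
static bool plug_before(const struct bfq_data_sketch *d,
			const struct bfq_queue_sketch *q)
{
	return (q->wr_coeff > 1 &&
		d->wr_busy_queues < d->tot_busy_queues) ||
		d->asymmetric;
}

/* Predicate after this commit: also plug when at least four in-flight
 * requests do not belong to the weight-raised queue. */
static bool plug_after(const struct bfq_data_sketch *d,
		       const struct bfq_queue_sketch *q)
{
	return (q->wr_coeff > 1 &&
		(d->wr_busy_queues < d->tot_busy_queues ||
		 d->rq_in_driver >= q->dispatched + 4)) ||
		d->asymmetric;
}
```

Take a weight-raised queue that is the only busy queue (wr_busy_queues == tot_busy_queues == 1) with none of its own requests dispatched, while 8 requests left behind by queues that are no longer busy are still in the drive. plug_before() returns false, because the busy-queue comparison cannot see that in-flight I/O, while plug_after() returns true. With only 3 requests in the drive, both return false.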