linux/include
Tejun Heo 06b285bd11 blkcg: fix blkcg_policy_data allocation bug
e48453c386 ("block, cgroup: implement policy-specific per-blkcg
data") updated per-blkcg policy data to be dynamically allocated.
When a policy is registered, its policy data aren't created.  Instead,
when the policy is activated on a queue, the policy data are allocated
if there are blkg's (blkcg_gq's) which are attached to a given blkcg.
This is buggy.  Consider the following scenario.

1. A blkcg is created.  No blkg's attached yet.

2. The policy is registered.  No policy data is allocated.

3. The policy is activated on a queue.  As the above blkcg doesn't
   have any blkg's, it won't allocate the matching blkcg_policy_data.

4. An IO is issued from the blkcg and blkg is created and the blkcg
   still doesn't have the matching policy data allocated.

With cfq-iosched, this leads to an oops.

It also doesn't free policy data on policy unregistration assuming
that freeing of all policy data on blkcg destruction should take care
of it; however, this also is incorrect.

1. A blkcg has policy data.

2. The policy gets unregistered but the policy data remains.

3. Another policy gets registered on the same slot.

4. Later, the new policy tries to allocate policy data on the previous
   blkcg but the slot is already occupied and gets skipped.  The
   policy ends up operating on the policy data of the previous policy.

There's no reason to manage blkcg_policy_data lazily.  The reason we
do lazy allocation of blkg's is that the number of all possible blkg's
is the product of cgroups and block devices which can reach a
surprising level.  blkcg_policy_data is contrained by the number of
cgroups and shouldn't be a problem.

This patch makes blkcg_policy_data to be allocated for all existing
blkcg's on policy registration and freed on unregistration and removes
blkcg_policy_data handling from policy [de]activation paths.  This
makes that blkcg_policy_data are created and removed with the policy
they belong to and fixes the above described problems.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: e48453c386 ("block, cgroup: implement policy-specific per-blkcg data")
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Arianna Avanzini <avanzini.arianna@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-07-09 14:41:11 -06:00
..
acpi Additional ACPICA material for v4.2-rc1 2015-07-02 17:11:28 -07:00
asm-generic - Support for HS38 cores based on ARCv2 ISA 2015-07-01 09:24:26 -07:00
clocksource ARM: 8366/1: move Dual-Timer SP804 driver to drivers/clocksource 2015-06-02 09:58:18 +01:00
crypto crypto: rng - Do not free default RNG when it becomes unused 2015-06-22 15:49:18 +08:00
drm drm: use kvfree() in drm_free_large() 2015-06-30 19:44:59 -07:00
dt-bindings The changes to the common clock framework for 4.2 are dominated by new 2015-07-01 19:22:00 -07:00
keys
kvm
linux blkcg: fix blkcg_policy_data allocation bug 2015-07-09 14:41:11 -06:00
math-emu
media Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds 2015-07-01 19:09:11 -07:00
memory
misc cxl: Add AFU virtual PHB and kernel API 2015-06-03 13:27:20 +10:00
net net: Kill sock->sk_protinfo 2015-06-28 16:55:44 -07:00
pcmcia
ras tracing: add trace event for memory-failure 2015-06-24 17:49:43 -07:00
rdma IB/mad: Add final OPA MAD processing 2015-06-12 14:49:18 -04:00
rxrpc
scsi SCSI misc on 20150622 2015-06-23 15:55:44 -07:00
soc Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm 2015-06-26 12:20:00 -07:00
sound ASoC: Further updates for v4.2 2015-06-22 11:32:41 +02:00
target Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2015-07-04 14:13:43 -07:00
trace Merge branch 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2015-06-30 20:07:45 -07:00
uapi virtio/vhost: cross endian support 2015-07-03 16:02:25 -07:00
video Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2015-06-26 13:18:51 -07:00
xen
Kbuild