linux/drivers/md
Takahiro Yasui 558569aa9d dm raid1: fix null pointer dereference in suspend
When suspending a failed mirror, bios are completed by mirror_end_io() and
__rh_lookup() in dm_rh_dec() returns NULL where a non-NULL return value is
required by design.  Fix this by not changing the state of the recovery failed
region from DM_RH_RECOVERING to DM_RH_NOSYNC in dm_rh_recovery_end().

Issue

On 2.6.33-rc1 kernel, I hit the bug when I suspended the failed
mirror by dmsetup command.

BUG: unable to handle kernel NULL pointer dereference at 00000020
IP: [<f94f38e2>] dm_rh_dec+0x35/0xa1 [dm_region_hash]
...
EIP: 0060:[<f94f38e2>] EFLAGS: 00010046 CPU: 0
EIP is at dm_rh_dec+0x35/0xa1 [dm_region_hash]
EAX: 00000286 EBX: 00000000 ECX: 00000286 EDX: 00000000
ESI: eff79eac EDI: eff79e80 EBP: f6915cd4 ESP: f6915cc4
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process dmsetup (pid: 2849, ti=f6914000 task=eff03e80 task.ti=f6914000)
 ...
Call Trace:
 [<f9530af6>] ? mirror_end_io+0x53/0x1b1 [dm_mirror]
 [<f9413104>] ? clone_endio+0x4d/0xa2 [dm_mod]
 [<f9530aa3>] ? mirror_end_io+0x0/0x1b1 [dm_mirror]
 [<f94130b7>] ? clone_endio+0x0/0xa2 [dm_mod]
 [<c02d6bcb>] ? bio_endio+0x28/0x2b
 [<f952f303>] ? hold_bio+0x2d/0x62 [dm_mirror]
 [<f952f942>] ? mirror_presuspend+0xeb/0xf7 [dm_mirror]
 [<c02aa3e2>] ? vmap_page_range+0xb/0xd
 [<f9414c8d>] ? suspend_targets+0x2d/0x3b [dm_mod]
 [<f9414ca9>] ? dm_table_presuspend_targets+0xe/0x10 [dm_mod]
 [<f941456f>] ? dm_suspend+0x4d/0x150 [dm_mod]
 [<f941767d>] ? dev_suspend+0x55/0x18a [dm_mod]
 [<c0343762>] ? _copy_from_user+0x42/0x56
 [<f9417fb0>] ? dm_ctl_ioctl+0x22c/0x281 [dm_mod]
 [<f9417628>] ? dev_suspend+0x0/0x18a [dm_mod]
 [<f9417d84>] ? dm_ctl_ioctl+0x0/0x281 [dm_mod]
 [<c02c3c4b>] ? vfs_ioctl+0x22/0x85
 [<c02c422c>] ? do_vfs_ioctl+0x4cb/0x516
 [<c02c42b7>] ? sys_ioctl+0x40/0x5a
 [<c0202858>] ? sysenter_do_call+0x12/0x28

Analysis

When recovery process of a region failed, dm_rh_recovery_end() function
changes the state of the region from RM_RH_RECOVERING to DM_RH_NOSYNC.
When recovery_complete() is executed between dm_rh_update_states() and
dm_writes() in do_mirror(), bios are processed with the region state,
DM_RH_NOSYNC. However, the region data is freed without checking its
pending count when dm_rh_update_states() is called next time.

When bios are finished by mirror_end_io(), __rh_lookup() in dm_rh_dec()
returns NULL even though a valid return value are expected.

Solution

Remove the state change of the recovery failed region from DM_RH_RECOVERING
to DM_RH_NOSYNC in dm_rh_recovery_end(). We can remove the state change
because:

  - If the region data has been released by dm_rh_update_states(),
    a new region data is created with the state of DM_RH_NOSYNC, and
    bios are processed according to the DM_RH_NOSYNC state.

  - If the region data has not been released by dm_rh_update_states(),
    a state of the region is DM_RH_RECOVERING and bios are put in the
    delayed_bio list.

The flag change from DM_RH_RECOVERING to DM_RH_NOSYNC in dm_rh_recovery_end()
was added in the following commit:
  dm raid1: handle resync failures
  author  Jonathan Brassow <jbrassow@redhat.com>
    Thu, 12 Jul 2007 16:29:04 +0000 (17:29 +0100)
  http://git.kernel.org/linus/f44db678edcc6f4c2779ac43f63f0b9dfa28b724

Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2010-02-16 18:42:58 +00:00
..
raid6test md: drivers/md/unroll.pl replaced with awk analog 2009-10-16 16:25:19 +11:00
.gitignore
bitmap.c md/bitmap: update dirty flag when bitmap bits are explicitly set. 2009-12-14 12:51:41 +11:00
bitmap.h md: Support write-intent bitmaps with externally managed metadata. 2009-12-14 12:51:41 +11:00
dm-bio-record.h dm: preserve bi_io_vec when resubmitting bios 2009-04-02 19:55:23 +01:00
dm-crypt.c dm crypt: add plain64 iv 2009-12-10 23:52:25 +00:00
dm-delay.c dm table: pass correct dev area size to device_area_is_valid 2009-07-23 20:30:42 +01:00
dm-exception-store.c dm snapshot: move cow ref from exception store to snap core 2009-12-10 23:52:12 +00:00
dm-exception-store.h dm snapshot: add merging 2009-12-10 23:52:32 +00:00
dm-io.c dm io: handle empty barriers 2009-12-10 23:52:22 +00:00
dm-ioctl.c dm: rename dm_suspended to dm_suspended_md 2009-12-10 23:52:26 +00:00
dm-kcopyd.c dm kcopyd: accept zero size jobs 2009-12-10 23:52:13 +00:00
dm-linear.c dm table: pass correct dev area size to device_area_is_valid 2009-07-23 20:30:42 +01:00
dm-log-userspace-base.c dm log: userspace fix incorrect luid cast in userspace_ctr 2009-10-16 23:18:15 +01:00
dm-log-userspace-transfer.c dm log: userspace fix overhead_size calcuations 2010-02-16 18:42:53 +00:00
dm-log-userspace-transfer.h dm log: userspace add luid to distinguish between concurrent log instances 2009-09-04 20:40:34 +01:00
dm-log.c dm raid1: report flush errors separately in status 2009-12-10 23:52:02 +00:00
dm-mpath.c dm mpath: reject messages when device is suspended 2009-12-10 23:52:27 +00:00
dm-mpath.h dm mpath: remove is_active from struct dm_path 2008-10-10 13:36:58 +01:00
dm-path-selector.c dm: path selector use module refcount directly 2009-04-02 19:55:27 +01:00
dm-path-selector.h dm mpath: add start_io and nr_bytes to path selectors 2009-06-22 10:12:27 +01:00
dm-queue-length.c dm mpath: add queue length load balancer 2009-06-22 10:12:27 +01:00
dm-raid1.c dm raid1: fail writes if errors are not handled and log fails 2010-02-16 18:42:55 +00:00
dm-region-hash.c dm raid1: fix null pointer dereference in suspend 2010-02-16 18:42:58 +00:00
dm-round-robin.c dm mpath: add start_io and nr_bytes to path selectors 2009-06-22 10:12:27 +01:00
dm-service-time.c dm mpath: add service time load balancer 2009-06-22 10:12:28 +01:00
dm-snap-persistent.c dm snapshot: persistent annotate work_queue as on stack 2010-02-16 18:42:51 +00:00
dm-snap-transient.c dm snapshot: move cow ref from exception store to snap core 2009-12-10 23:52:12 +00:00
dm-snap.c dm snapshot: use merge origin if snapshot invalid 2009-12-10 23:52:36 +00:00
dm-stripe.c dm stripe: avoid divide by zero with invalid stripe count 2010-02-16 18:42:47 +00:00
dm-sysfs.c dm: rename dm_suspended to dm_suspended_md 2009-12-10 23:52:26 +00:00
dm-table.c DM: Fix device mapper topology stacking 2010-01-11 14:29:20 +01:00
dm-target.c dm target: remove struct tt_internal 2009-04-02 19:55:28 +01:00
dm-uevent.c dm: avoid _hash_lock deadlock 2009-12-10 23:51:52 +00:00
dm-uevent.h
dm-zero.c dm: consolidate target deregistration error handling 2009-01-06 03:04:58 +00:00
dm.c dm: export suspended state to targets 2009-12-10 23:52:27 +00:00
dm.h dm: rename dm_suspended to dm_suspended_md 2009-12-10 23:52:26 +00:00
faulty.c md: add MODULE_DESCRIPTION for all md related modules. 2009-12-14 12:51:41 +11:00
Kconfig md: revise Kconfig help for MD_MULTIPATH 2009-12-14 12:51:41 +11:00
linear.c md: add MODULE_DESCRIPTION for all md related modules. 2009-12-14 12:51:41 +11:00
linear.h md/linear: use call_rcu to free obsolete 'conf' structures. 2009-06-18 08:49:42 +10:00
Makefile md: drivers/md/unroll.pl replaced with awk analog 2009-10-16 16:25:19 +11:00
md.c md: fix some lockdep issues between md and sysfs. 2010-02-10 11:26:09 +11:00
md.h raid: improve MD/raid10 handling of correctable read errors. 2009-12-14 12:51:41 +11:00
mktables.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
multipath.c md: add MODULE_DESCRIPTION for all md related modules. 2009-12-14 12:51:41 +11:00
multipath.h md: remove mddev_to_conf "helper" macro 2009-06-16 16:54:21 +10:00
raid0.c md: add MODULE_DESCRIPTION for all md related modules. 2009-12-14 12:51:41 +11:00
raid0.h md: remove mddev_to_conf "helper" macro 2009-06-16 16:54:21 +10:00
raid1.c md: add MODULE_DESCRIPTION for all md related modules. 2009-12-14 12:51:41 +11:00
raid1.h md/raid1: add takeover support for raid5->raid1 2009-12-14 12:51:41 +11:00
raid5.c md: fix some lockdep issues between md and sysfs. 2010-02-10 11:26:09 +11:00
raid5.h md: fix problems with RAID6 calculations for DDF. 2009-10-16 16:27:34 +11:00
raid6algos.c md: add MODULE_DESCRIPTION for all md related modules. 2009-12-14 12:51:41 +11:00
raid6altivec.uc md: drivers/md/unroll.pl replaced with awk analog 2009-10-16 16:25:19 +11:00
raid6int.uc md: drivers/md/unroll.pl replaced with awk analog 2009-10-16 16:25:19 +11:00
raid6mmx.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6recov.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6sse1.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6sse2.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6x86.h md: fix typo in FSF address 2009-03-31 14:57:37 +11:00
raid10.c md: add MODULE_DESCRIPTION for all md related modules. 2009-12-14 12:51:41 +11:00
raid10.h md: remove mddev_to_conf "helper" macro 2009-06-16 16:54:21 +10:00
unroll.awk md: drivers/md/unroll.pl replaced with awk analog 2009-10-16 16:25:19 +11:00