linux/fs/ocfs2/dlm
Tariq Saeed bba1cb17d9 ocfs2: race between umount and unfinished remastering during recovery
Orabug: 19074140

When umount is issued during recovery on the new master that has not
finished remastering locks, it triggers BUG() in
dlm_send_mig_lockres_msg().  Here is the situation:

 1) node A has a lock on resource X mastered by node B.

 2) node B dies ->  node A sets recovering flag for res X

 3) Node C becomes the new master for resources owned by the
    dead node and is remastering locks of the dead node but
    has not finished the remastering process yet.

 4) umount is issued on node C.

 5) During processing of umount, ignoring unfished recovery,
    node C attempts to migrate resource X to node A.

 6) node A finds res X in DLM_LOCK_RES_RECOVERING state, considers
    it a logic error and sends back -EFAULT.

 7) node C asserts BUG() upon seeing EFAULT resp from node B.

Fix is to delay migrating res X till remastering is finished at which
point recovering flag will be cleared on both A and C.

Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-06 18:01:13 -07:00
..
dlmapi.h
dlmast.c ocfs2: use list_for_each_entry() instead of list_for_each() 2013-09-11 15:56:36 -07:00
dlmcommon.h ocfs2/dlm: do not purge lockres that is queued for assert master 2014-06-23 16:47:45 -07:00
dlmconvert.c ocfs2: use list_for_each_entry() instead of list_for_each() 2013-09-11 15:56:36 -07:00
dlmconvert.h
dlmdebug.c ocfs2: remove NULL assignments on static 2014-06-04 16:53:53 -07:00
dlmdebug.h
dlmdomain.c ocfs2: remove conversion of total_backoff in dlm_join_domain() 2014-08-06 18:01:13 -07:00
dlmdomain.h
dlmlock.c ocfs2: remove NULL assignments on static 2014-06-04 16:53:53 -07:00
dlmmaster.c ocfs2: race between umount and unfinished remastering during recovery 2014-08-06 18:01:13 -07:00
dlmrecovery.c ocfs2/dlm: do not purge lockres that is queued for assert master 2014-06-23 16:47:45 -07:00
dlmthread.c ocfs2/dlm: do not purge lockres that is queued for assert master 2014-06-23 16:47:45 -07:00
dlmunlock.c ocfs2: fix deadlock when two nodes are converting same lock from PR to EX and idletimeout closes conn 2014-06-23 16:47:45 -07:00
Makefile ocfs2: remove versioning information 2014-01-21 16:19:41 -08:00