Fix for bug 596453: multiple mirror image failures cause lvm repair...
I believe the lvm repair issues are only the superficial symptoms of
this bug; there are worse problems that are not as easily seen. From my
inline comments (condensed code sketches of the failing write path and
of the fix follow the quoted comment):
* If the mirror was successfully recovered, we want to always
* force every machine to write to all devices - otherwise,
* corruption will occur.  Here's how:
*   Node1 suffers a failure and marks a region out-of-sync.
*   Node2 attempts a write, gets past is_remote_recovering(),
*         and queries the sync status of the region - finding
*         it out-of-sync.
*   Node2 concludes the write should be a nosync write, but it
*         has not yet suffered the drive failure that Node1 has.
*         It therefore issues the write via generic_make_request()
*         directly to the primary image only - which is exactly
*         the device that has suffered the failure.
*   Node2 thus suffers a lost write - one that completely bypasses
*         the mirror layer because it went through generic_make_request().
* The file system will likely explode at this point due to
* I/O errors.  If it wasn't the primary image that failed, it is
* just as possible in this case for writes to reach only one
* of the remaining images - also leaving the mirror inconsistent.
*
* We let in_sync() return 1 in a cluster regardless of what is
* in the bitmap once recovery has successfully completed on a
* mirror. This ensures the mirroring code will continue to
* attempt to write to all mirror images.  The worst that can
* happen for reads is that additional read attempts may be
* needed.
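
To make the hazard concrete, here is a condensed sketch of the
kernel-side write routing that the scenario above abuses. It is
paraphrased from do_writes() in drivers/md/dm-raid1.c, not verbatim
kernel code; the DM_RH_RECOVERING case is omitted and requeue_bio() is
a hypothetical stand-in for the real requeue logic.

    /*
     * Condensed sketch (not verbatim) of the dm-raid1 write path.
     * Writes to regions the log reports as out-of-sync ("nosync") are
     * mapped to the default (primary) mirror image only and submitted
     * straight down with generic_make_request(), so a stale answer
     * from the cluster log sends the write to exactly one leg -
     * possibly the failed one.
     */
    static void do_writes_sketch(struct mirror_set *ms, struct bio *bio)
    {
        struct dm_dirty_log *log = dm_rh_dirty_log(ms->rh);
        region_t region = dm_rh_bio_to_region(ms->rh, bio);

        /* Defer the write while another node is recovering this region. */
        if (log->type->is_remote_recovering &&
            log->type->is_remote_recovering(log, region)) {
            requeue_bio(ms, bio);    /* hypothetical stand-in */
            return;
        }

        switch (dm_rh_get_state(ms->rh, region, 1)) {
        case DM_RH_CLEAN:
        case DM_RH_DIRTY:
            do_write(ms, bio);    /* in-sync: write to every image */
            break;
        case DM_RH_NOSYNC:
            /*
             * Out-of-sync: write the default image only, bypassing the
             * mirror's end_io accounting.  This is the path that loses
             * Node2's write in the scenario above.
             */
            map_bio(get_default_mirror(ms), bio);
            generic_make_request(bio);
            break;
        }
    }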
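
On the cmirrord side, the fix lands in the handler that answers the
kernel's in_sync queries (daemons/cmirrord/functions.c). A condensed
sketch follows; it is a paraphrase, and recovery_complete is
illustrative shorthand for however the log records that recovery has
finished - the actual field name may differ.

    /*
     * Condensed sketch (not verbatim) of the fixed in_sync handler in
     * cmirrord.  Once recovery has completed on the mirror, report
     * every region as in-sync regardless of the bitmap, so every node
     * keeps writing to all images.
     */
    static int clog_in_sync_sketch(struct dm_ulog_request *rq)
    {
        int64_t region = *((int64_t *)rq->data);
        struct log_c *lc = get_log(rq->uuid, rq->luid);

        if (!lc)
            return -EINVAL;

        if (lc->recovery_complete) {
            /*
             * Recovery finished at some point: claim in-sync even if
             * the bitmap says otherwise, forcing writes to all images.
             * ('recovery_complete' is an illustrative flag name.)
             */
            *((int64_t *)rq->data) = 1;
        } else {
            /* Still recovering: answer from the sync bitmap as before. */
            *((int64_t *)rq->data) = log_test_bit(lc->sync_bits, region);
        }
        rq->data_size = sizeof(int64_t);

        return 0;
    }

The trade-off is exactly the one noted in the comment above: a read may
be served from, and then retried past, an image that is genuinely
behind, but a write can no longer be steered to a single - possibly
failed - image.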