cluster: STABLE3 - rgmanager: Fix VM restart issue

Lon Hohberger lon@fedoraproject.org
Thu Mar 19 16:05:00 GMT 2009


Gitweb:        http://git.fedorahosted.org/git/cluster.git?p=cluster.git;a=commitdiff;h=b9856ad462de761165e93989a762d0bc94a2965a
Commit:        b9856ad462de761165e93989a762d0bc94a2965a
Parent:        dd17e0f04345d1193f4f5045763a8985be8a081c
Author:        Lon Hohberger <lhh@redhat.com>
AuthorDate:    Thu Mar 19 11:56:57 2009 -0400
Committer:     Lon Hohberger <lhh@redhat.com>
CommitterDate: Thu Mar 19 12:04:09 2009 -0400

rgmanager: Fix VM restart issue

Problem description:

* Node A starts vm:foo.  Before starting it, node A asks
the rest of the cluster whether they have seen vm:foo.

* node B receives a status inquiry request from node A.
It then executes a status check on that VM to see if it
is running.  It's not, so status returns 1.  At this
point, node B sets a NEEDSTOP flag.

* Suppose you disable the VM on node A and start it on
node B now.  At this point, the NEEDSTOP flag still
persists on node B, but is ignored by the start/status
checks.

* If you then do a configuration update, the NEEDSTOP flag
is -still- there.  After a configuration update (or during
a special "recover" operation), the NEEDSTOP flag is used
by rgmanager to decide which resources need to be stopped.
Presence of this flag does NOT alter service state.

* Rgmanager does its reconfiguration, sees the NEEDSTOP flag,
and stops the virtual machine.  Because the state has not
actually changed according to rgmanager (NEEDSTOP is
followed by NEEDSTART if a resource's parameters have changed,
for example), the next status check causes a recovery of
the VM and then the VM is restarted.

Solution:

* Don't set NEEDSTOP during STATUS_INQUIRY
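
The sequence above can be replayed with a small toy model.  This is
only a sketch based on the description in this message: every name
in it (toy_node, toy_check, toy_reconfig, toy_scenario, TOY_NEEDSTOP
and the toy_op values) is hypothetical and is not taken from the
rgmanager sources.  It assumes that a failed full status check marks
the resource with a stop flag, and that a reconfiguration pass stops
anything still carrying that flag.

```c
/* Hypothetical flag and op names for this toy model; they are not
 * copied from rgmanager headers. */
#define TOY_NEEDSTOP 0x1

enum toy_op { TOY_STATUS, TOY_STATUS_INQUIRY };

struct toy_node {
	int running;	/* 1 if the VM is running on this node */
	int flags;	/* TOY_NEEDSTOP or 0 */
	int stops;	/* times a reconfig pass stopped the VM */
};

/* Returns 0 if the VM is running on this node, 1 otherwise. */
static int toy_check(struct toy_node *n, enum toy_op op)
{
	if (n->running)
		return 0;
	/* Assumption: a full status check flags the failed resource,
	 * while the quick inquiry leaves the flags untouched. */
	if (op == TOY_STATUS)
		n->flags |= TOY_NEEDSTOP;
	return 1;
}

/* Reconfiguration pass: stop anything still marked TOY_NEEDSTOP. */
static void toy_reconfig(struct toy_node *n)
{
	if (n->flags & TOY_NEEDSTOP) {
		n->running = 0;
		n->stops++;
		n->flags &= ~TOY_NEEDSTOP;
	}
}

/* Replays the problem description on "node B" and returns how many
 * times the reconfiguration pass stopped the VM. */
int toy_scenario(enum toy_op inquiry_op)
{
	struct toy_node b = { 0, 0, 0 };

	toy_check(&b, inquiry_op);	/* node A asks; VM not on B */
	b.running = 1;			/* VM is later started on B */
	toy_reconfig(&b);		/* configuration update */
	return b.stops;
}
```

In this model, toy_scenario(TOY_STATUS) returns 1 (the stale flag
causes one unwanted stop) and toy_scenario(TOY_STATUS_INQUIRY)
returns 0, which is the behaviour the solution aims for.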

Signed-off-by: Lon Hohberger <lhh@redhat.com>
---
 rgmanager/include/res-ocf.h      |    1 +
 rgmanager/src/daemons/groups.c   |    3 +++
 rgmanager/src/daemons/restree.c  |   28 +++++++++++++++++++++++++++-
 rgmanager/src/daemons/rg_state.c |    2 +-
 4 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/rgmanager/include/res-ocf.h b/rgmanager/include/res-ocf.h
index 74549e3..6e71ac2 100644
--- a/rgmanager/include/res-ocf.h
+++ b/rgmanager/include/res-ocf.h
@@ -46,5 +46,6 @@
 #define RS_VALIDATE	(12)
 #define RS_MIGRATE	(13)
 #define RS_RECONFIG	(14)
+#define RS_STATUS_INQUIRY (15)	/** Quick status */
 
 #endif
diff --git a/rgmanager/src/daemons/groups.c b/rgmanager/src/daemons/groups.c
index d91d6e6..4d0fac7 100644
--- a/rgmanager/src/daemons/groups.c
+++ b/rgmanager/src/daemons/groups.c
@@ -964,6 +964,9 @@ group_op(char *groupname, int op)
 	case RG_STATUS:
 		ret = res_status(&_tree, res, NULL);
 		break;
+	case RG_STATUS_INQUIRY:
+		ret = res_status_inquiry(&_tree, res, NULL);
+		break;
 	case RG_CONDSTOP:
 		ret = res_condstop(&_tree, res, NULL);
 		break;
diff --git a/rgmanager/src/daemons/restree.c b/rgmanager/src/daemons/restree.c
index ecef62c..1cec468 100644
--- a/rgmanager/src/daemons/restree.c
+++ b/rgmanager/src/daemons/restree.c
@@ -1337,7 +1337,19 @@ _res_op_internal(resource_node_t __attribute__ ((unused)) **tree,
 			++node->rn_resource->r_incarnations;
 			node->rn_state = RES_STARTED;
 		}
-	} else if (me && (op == RS_STATUS)) {
+	} else if (me && (op == RS_STATUS || op == RS_STATUS_INQUIRY)) {
+
+		/* Special quick-check for status inquiry */
+		if (op == RS_STATUS_INQUIRY) {
+			if (res_exec(node, RS_STATUS, NULL, 0) != 0)
+				return SFL_FAILURE;
+
+			/* XXX: A migratable service (the only place this
+			 * check can be used) cannot have child dependencies
+			 * anyway, so this is a short-circuit. */
+			return 0;
+		}
+
 		/* Check status before children*/
 		rv = do_status(node);
 		if (rv != 0) {
@@ -1523,6 +1535,20 @@ res_status(resource_node_t **tree, resource_t *res, void *ret)
 
 
 /**
+   Check status of all occurrences of a resource in a tree
+
+   @param tree		Tree to search for our resource.
+   @param res		Resource to start/stop
+   @param ret		Unused
+ */
+int
+res_status_inquiry(resource_node_t **tree, resource_t *res, void *ret)
+{
+	return _res_op(tree, res, NULL, ret, RS_STATUS_INQUIRY);
+}
+
+
+/**
    Grab resource info for all occurrences of a resource in a tree
 
    @param tree		Tree to search for our resource.
diff --git a/rgmanager/src/daemons/rg_state.c b/rgmanager/src/daemons/rg_state.c
index b73a08c..4808ddf 100644
--- a/rgmanager/src/daemons/rg_state.c
+++ b/rgmanager/src/daemons/rg_state.c
@@ -1259,7 +1259,7 @@ svc_status_inquiry(char *svcName)
 	if (svcStatus.rs_flags & RG_FLAG_FROZEN)
 		return 0;
 	
-	return group_op(svcName, RG_STATUS);
+	return group_op(svcName, RG_STATUS_INQUIRY);
 }
 
 


