From e5b9338ada7152672000016861ef06f4af2160aa Mon Sep 17 00:00:00 2001
From: Jonathan Earl Brassow
Date: Thu, 26 Apr 2012 17:30:49 +0000
Subject: [PATCH] Fix bug in cmirror that caused incorrect status info to print on some nodes.

Looking at the code in cmirrord/local.c, we can see that the different
request types are handled in different ways. Information that never
changes does not need to go around the cluster, so those requests can
be short-circuited. For example, once the cluster mirror is in-sync, it
is pointless to continue sending that query around the cluster. We can
save network bandwidth and reply directly back to the kernel.

When it comes to status information, there are two types: 'TABLE' and
'INFO'. The 'TABLE' information never changes and belongs to the group
of requests that can be safely short-circuited. The 'INFO' information
can change - and will change if a device fails. Thus it must not be
short-circuited, but that is exactly what was happening. The 'INFO'
status request was being short-circuited and therefore never reported
the failure condition to anyone other than the "server" that
experienced it directly.
---
 WHATS_NEW                  | 1 +
 daemons/cmirrord/cluster.c | 4 ++--
 daemons/cmirrord/local.c   | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/WHATS_NEW b/WHATS_NEW
index 758657605..636fae3f2 100644
--- a/WHATS_NEW
+++ b/WHATS_NEW
@@ -1,5 +1,6 @@
 Version 2.02.96 - 
 ================================
+  Fix bug in cmirror that caused incorrect status info to print on some nodes.
   Remove statement that snapshots cannot be tagged from lvm man page.
   Disallow changing cluster attribute of VG while RAID LVs are active.
   Fix lvconvert error message for non-mergeable volumes.
diff --git a/daemons/cmirrord/cluster.c b/daemons/cmirrord/cluster.c
index 0a782401e..3a6bb038d 100644
--- a/daemons/cmirrord/cluster.c
+++ b/daemons/cmirrord/cluster.c
@@ -1231,11 +1231,11 @@ out:
 					   _RQ_TYPE(rq->u_rq.request_type),
 					   rq->originator, (response) ? "YES" : "NO");
 			else
-				LOG_SPRINT(match, "SEQ#=%u, UUID=%s, TYPE=%s, ORIG=%u, RESP=%s, RSPR=%u",
+				LOG_SPRINT(match, "SEQ#=%u, UUID=%s, TYPE=%s, ORIG=%u, RESP=%s, RSPR=%u, error=%d",
 					   rq->u_rq.seq, SHORT_UUID(rq->u_rq.uuid),
 					   _RQ_TYPE(rq->u_rq.request_type),
 					   rq->originator, (response) ? "YES" : "NO",
-					   nodeid);
+					   nodeid, rq->u_rq.error);
 		}
 	}
 
diff --git a/daemons/cmirrord/local.c b/daemons/cmirrord/local.c
index 26dbff9ce..8601cfd27 100644
--- a/daemons/cmirrord/local.c
+++ b/daemons/cmirrord/local.c
@@ -237,7 +237,6 @@ static int do_local_work(void *data __attribute__((unused)))
 	case DM_ULOG_GET_REGION_SIZE:
 	case DM_ULOG_IN_SYNC:
 	case DM_ULOG_GET_SYNC_COUNT:
-	case DM_ULOG_STATUS_INFO:
 	case DM_ULOG_STATUS_TABLE:
 	case DM_ULOG_PRESUSPEND:
 		/* We do not specify ourselves as server here */
@@ -273,6 +272,7 @@ static int do_local_work(void *data __attribute__((unused)))
 	case DM_ULOG_MARK_REGION:
 	case DM_ULOG_GET_RESYNC_WORK:
 	case DM_ULOG_SET_REGION_SYNC:
+	case DM_ULOG_STATUS_INFO:
 	case DM_ULOG_IS_REMOTE_RECOVERING:
 	case DM_ULOG_POSTSUSPEND:
 		r = cluster_send(rq);
-- 
2.43.5