Zdenek Kabelac [Thu, 27 Mar 2014 10:27:42 +0000 (11:27 +0100)]
pvchange: fix exit code regression
Commit 1a832398a7caa96faab2ccbac773cc4047c8f9c1 moved
some code from _pvchange_single() to main pvchange() and
introduced exit code regression as return codes have not
been properly changed, thus pvchange command exited
with '0' exit code, even though it has reported error.
Also there is a missing vg unlock in error path.
Fix it by counting the total number of expected calls before
checking for pvname and also unlock and relase vg when
pv is not found.
Peter Rajnoha [Tue, 25 Mar 2014 12:06:59 +0000 (13:06 +0100)]
lvmdump: add lvm dumpconfig --type diff/missing
For a quick overview of config when debugging and to quickly check
which values are different from defaults and which are not defined
in the config and for which defaults are used.
Zdenek Kabelac [Tue, 25 Mar 2014 09:17:34 +0000 (10:17 +0100)]
clvmd: validate open device state
If clvmd does not hold any lock, it should also not keep any opened
device.
The reason for this patch is, that refresh_toolcontext calls
dev_cache_exit() which destroys whole device cache (even those with
opened file) - previous patch added recovery path to avoid memory
corruption, but opened files are still bugs that need to be fixed.
So this patch certainly kills many internal mirror & raid tests,
since they leak opened file descriptors (when tests are executed
with 'abort_on_error').
Zdenek Kabelac [Tue, 25 Mar 2014 09:50:36 +0000 (10:50 +0100)]
clvmd: hardening leak on exit
Operate with lvm_thread_exit while holding lvm_thread_mutex.
Don't leave unfinished work in the lvm thread queue
and always finish all queued tasks before exit,
so no cmd struct is left in the list.
Zdenek Kabelac [Tue, 25 Mar 2014 09:53:42 +0000 (10:53 +0100)]
dev-cache: fix mem corruption on refresh context
When lvm2 command works with clvmd and uses locking in wrong way,
it may 'leak' certain file descriptors in opened (incorrect) state.
dev_cache_exit then destroys memory pool of cached devices, while
_open_devices list in dev-io.c was still referencing them if they
were still opened.
Patch properly calls _close() function to 'self-heal' from this
invalid state, but it will report internal error (so execution
with abort_on_internal_error causes immediate death). On the
normal 'execution', error is only reported, but memory state is
corrected, and linked list is not referencing devices from
released mempool.
For crash see: https://bugzilla.redhat.com/show_bug.cgi?id=1073886
Peter Rajnoha [Mon, 24 Mar 2014 15:47:08 +0000 (16:47 +0100)]
libdaemon: fix misleading "WARNING: Ignoring unsupported value for expected." when communicating with daemon
When we're trying to search for certain tree node
in daemon's reply, we default to a blank string ""
if the node is not found. This happens during lvmetad
initialization.
However, when the default blank string is used, we
can't use dm_config_find_str at the same time - the
dm_config_find_str_allow_empty should be used instead.
Otherwise a a warning message:
Also cleanup-up associated pthreads by using cleanup_zombie() function.
Since this function may change the list, restart scanning always from
the list header.
Note: couple following patches are necessary to make this working properly.
Zdenek Kabelac [Thu, 20 Mar 2014 12:53:39 +0000 (13:53 +0100)]
memlock: drop locked mem in critical section
When daemon releases memory and it is still in critical
section, issue an error message and drop memory.
We cannot do anything better for now and we at least
release allocated resource.
FIXME:
This code is triggered when i.e. clvmd is killed while
some LVs are suspend - in this case suspended devices leak,
so if this happens during i.e. clvmd upgrade we have
unresolved problem - even locked rootfs...
Zdenek Kabelac [Thu, 20 Mar 2014 09:31:21 +0000 (10:31 +0100)]
activate: report release with critical section
This function is typically called for cmd context refresh or destroy.
On the non-clustered case we already unlocked all messages,
however when i.e. 'clvmd' gets break signal it may have
still couple messages queued.
Zdenek Kabelac [Wed, 19 Mar 2014 22:35:36 +0000 (23:35 +0100)]
lvmcache: fix debug trace
Recent debug tracing commit introduce read of uninitialized memory,
since VGID is not really a proper string which ends with '\0'.
Enforce at most 32 (ID_LEN) chars are read from vgid.
(in release fix)
Zdenek Kabelac [Fri, 21 Mar 2014 21:26:39 +0000 (22:26 +0100)]
lvmcache: handle reinit without error
Since commit f12ee43f2eaba5d38b1925a5a34b1746c1d66985 call destroy,
it start to check all VGs are unlocked. However when we become_daemon,
we simply reset locking (since lock is still kept by parent process).
So implement a simple 'reset' flag.
There are two types of CPG communications in a corosync cluster:
messages and state transitions. Cmirrord processes the state
transitions first.
When a cluster mirror issues a POSTSUSPEND, it signals the end of
cluster communication with the rest of the nodes in the cluster.
The POSTSUSPEND marks the last communication of the 'message'
type that will go around the cluster. The node then calls
cpg_leave which causes a final 'state transition' communication to
all of the nodes. Once the out-going node receives its own state
transition notice from the cluster, it finalizes the leave. At this
point, the state of the log is 'INVALID'; but it is possible that
there remains some cluster trafic that was queued up behind the
state transition that still wants to be processed. It is harmless
to attempt to dispatch any remaining messages - they won't be
delivered because the node is no longer in the cluster. However,
there was a warning message that was being printed in this case
that is now removed by this patch. The failure of the dispatch
created a false positive condition that triggered the message.
Zdenek Kabelac [Tue, 18 Mar 2014 23:25:51 +0000 (00:25 +0100)]
clvmd: avoid resending local sync commands
Instead of sending repeatedly LOCAL_SYNC commands to clvmds
like 'lvs', rememeber the last sent commmand, and if there was no other
clvmd command, drop this redundant SYNC call message.
This introduced correct synchronisation of name, when user requests to know
open_count (needs to wait for udev), however it is also executed for
read-only cases like 'lvs' command.
For now implement very simple solution, which is only monitoring
outgoing clvmd command, and when sequence of LOCAL sync names are
recognized, they are skipped automatically.
TODO:
Future solution might move this variable info 'cmd_context' and
use 'needs_sync' flag also i.e. in file locking code.
Zdenek Kabelac [Tue, 18 Mar 2014 23:20:39 +0000 (00:20 +0100)]
archiver: drop unneeded backup check
When the backup is disabled, avoid testing backup presence.
This only leads to errors being logged in debug trace and the missing
backup can't be fixed, since it's disabled.
Peter Rajnoha [Tue, 18 Mar 2014 10:04:21 +0000 (11:04 +0100)]
dumpconfig: fix memleak when using --mergedconfig
Check whether lvm dumpconfig --mergedconfig is used only
with --type current (where we're merging current config and
the config supplied on command line). With other types
the config was merged, but it was thrown away since we're
generating other type of config anyway. This lead to a memleak.
Error out if --mergedconfig is used with anything else than
--type current (or without specifying --type in which case
the --type current is used by default).