Dave Wysochanski [Wed, 24 Feb 2010 18:15:05 +0000 (18:15 +0000)]
Refactor _vgchange_tag() to vg_change_tag() library function.
Pull out common code to be called from tools as well as lvm2app.
Leave archive() at tool level so we can use it from vgcreate
as well as vgchange. Should be no functional change.
- add stack macro in vgchange
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Mike Snitzer [Wed, 17 Feb 2010 22:59:46 +0000 (22:59 +0000)]
Refactor snapshot-merge deptree and device removal to support info-by-uuid
Add a merging snapshot to the deptree, using the "error" target, rather
than avoid adding it entirely. This allows proper cleanup of the -cow
device without having to rename the -cow to use the origin's name as a
prefix.
Move the preloading of the origin LV, after a merge, from
lv_remove_single() to vg_remove_snapshot(). Having vg_remove_snapshot()
preload the origin allows the -cow device to be released so that it can
be removed via deactivate_lv(). lv_remove_single()'s deactivate_lv()
reliably removes the -cow device because the associated snapshot LV,
that is to be removed when a snapshot-merge completes, is always added
to the deptree (and kernel -- via "error" target).
Now when the snapshot LV is removed both the -cow and -real devices
get removed using uuid rather than device name. This paves the way
for us to switch over to info-by-uuid queries.
Peter Rajnoha [Mon, 15 Feb 2010 16:38:22 +0000 (16:38 +0000)]
Remove hard-coded rule to skip _mimage devices in 11-dm-lvm.rules.
There's a tiny period of time when the _mimage device is visible during
downconversion from mirror to linear. Since it is visible, we need to
create the symlinks, otherwise warning messages will be issued about udev
not creating those symlinks. We have to rely on udev flags completely.
Peter Rajnoha [Mon, 15 Feb 2010 16:21:33 +0000 (16:21 +0000)]
Several changes in dmsetup and libdevmapper:
- add DM_UDEV_DISABLE_LIBRARY_FALLBACK udev flag to rely on udev only
- export dm_udev_create_cookie function to create new cookies on demand
- add --udevcookie, udevcreatecookie and udevreleasecookie for dmsetup
(to support "udev transactions" where one cookie value can be used for
several dmsetup calls)
- don't use DM_UDEV_DISABLE_CHECKING env. var. anymore and set the state
automatically (based on udev and libdevmapper dev path comparison)
Dave Wysochanski [Sun, 14 Feb 2010 03:21:06 +0000 (03:21 +0000)]
Fix off by 512 sizes for lvm2app.
Internally we store sizes in sectors, but lvm2app exports sizes
in bytes. We could get fancier and allow units configuration but
this fix should do for now.
Fixes rhbz561422.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
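A minimal sketch of the unit conversion described above, assuming
512-byte sectors. SECTOR_SHIFT and sectors_to_bytes() are hypothetical
names used for illustration, not the identifiers used in the patch.

```c
#include <stdint.h>

/* Assumption: one sector is 512 bytes, i.e. a shift of 9 bits. */
#define SECTOR_SHIFT 9

/* Hypothetical helper: convert an internal sector count to bytes. */
uint64_t sectors_to_bytes(uint64_t sectors)
{
	return sectors << SECTOR_SHIFT;
}
```

Returning the raw sector count where bytes are expected under-reports
every size by a factor of 512, which is the symptom the subject line
describes.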
Mike Snitzer [Wed, 10 Feb 2010 14:38:24 +0000 (14:38 +0000)]
Add 'fail_if_percent_unsupported' arg to _percent() and _percent_run().
We unfortunately don't yet _know_, in dev_manager_snapshot_percent(), if
a snapshot-merge target is active (activation is deferred if dev is
open); so we can't short-circuit origin devices based purely on existing
LVM LV attributes.
Set 'fail_if_percent_unsupported' in dev_manager_snapshot_percent() for
a merging origin LV, otherwise passing unsupported LV types to _percent
will lead to a default successful return with percent_range as
PERCENT_100.
For a merging origin, PERCENT_100 will result in a polldaemon that runs
infinitely (because completion is PERCENT_0).
Mike Snitzer [Mon, 8 Feb 2010 23:28:06 +0000 (23:28 +0000)]
Remove false "failed to find tree node for <lv>" error from _cached_info().
When activating a merging origin it is valid, and expected, to not have
a node in the deptree for both the origin and its merging snapshot. The
_cached_info() caller is only concerned with whether a device is open.
If there isn't a node in the tree the associated device is definitely
not open.
Mike Snitzer [Fri, 5 Feb 2010 22:47:22 +0000 (22:47 +0000)]
Switch lvconvert_single() over to using get_vg_lock_and_logical_volume()
This change was deferred to help ease the review of previous refactoring
related to using process_each_lv() for lvconvert's merge support. Not
that doing so _really_ helped but...
Mike Snitzer [Fri, 5 Feb 2010 22:44:37 +0000 (22:44 +0000)]
lvconvert --merge @tag support
Switch lvconvert's --merge code over to using process_each_lv(). Doing
so adds support for a single 'lvconvert --merge' to start merging
multiple LVs (which includes @tag expansion).
Add 'lvconvert --merge @tag' testing to test/t-snapshot-merge.sh
Adjust man/lvconvert.8.in to reflect these expanded capabilities.
The lvconvert.c implementation requires rereading the VG each iteration
of process_each_lv(). Otherwise a stale VG instance associated with
the LV passed to lvconvert_single_merge() would result in stale VG
metadata being written back out to disk. This overwrote new metadata
that was written when a previous snapshot LV finished merging (via
lvconvert_poll). This is only an issue when merging multiple LVs that
share the same VG (a single VG is typical for most LVM configurations on
system disks).
In the end this new support is very useful for performing a "system
rollback" that requires multiple snapshot LVs be merged to their
respective origin LV.
The yum-utils 'fs-snapshot' plugin tags all snapshot LVs that it creates
with a common 'snapshot_tag' that is unique to the yum transaction.
Rolling back a yum transaction, that created LVM snapshots with the tag
'yum_20100129133223', is as simple as:
lvconvert --merge @yum_20100129133223
Adding a new mimage (leg/copy) to a mirror behaves differently
depending on whether the mirror has a 'core' or 'disk' log. When there
is a disk log, the new leg is added by stacking a new mirror on
top of the old (one leg is the old mirror and the other leg is the newly
added device). When the log is a 'core' log, the new leg is simply added
to the existing mirror and all the devices are re-synced.
The logic that handles collapsing the stacked 'disk' log mirror was
having the effect of causing 'core' logged mirrors to begin resync'ing
for a second time. I have used the 'CONVERTING' flag to indicate that
a mirror is converting by way of stacking. This is no longer set for
up-converting core logs. The final 'collapse' logic can safely be skipped
for 'core' log mirrors - getting rid of the second resync.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Peter Rajnoha [Wed, 3 Feb 2010 14:08:39 +0000 (14:08 +0000)]
This is related to liblvm and its lvm_list_vg_names() and lvm_list_vg_uuids() functions
where we should not expose internal VG names/uuids (the ones with "#" prefix) through the
interface. Otherwise, we could end up with library users opening internal VGs which will
initiate locking mechanism that won't be cleaned up properly.
"#orphans_{lvm1, lvm2, pool}" names are treated in a special way, they are truncated first
to "orphans" and this is used as a part of the lock name then (e.g. while calling lvm_vg_open()).
When a library user calls lvm_vg_close(), the original name "orphans_{lvm1, lvm2, pool}"
is used directly and therefore no unlock occurs.
We should exclude internal VG names and uuids in the lists provided by lvmcache:
lvmcache_get_vgids() and lvmcache_get_vgnames().
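The filtering described above could look roughly like the sketch below.
It assumes that internal VG names are exactly those beginning with "#";
is_internal_vg_name() and drop_internal_vg_names() are hypothetical
helpers, not the lvmcache functions themselves.

```c
#include <stddef.h>

/* Assumption: internal VG names are exactly those starting with '#'. */
int is_internal_vg_name(const char *name)
{
	return name != NULL && name[0] == '#';
}

/*
 * Hypothetical filter: compact names[] in place, keeping only the
 * public names, and return how many remain.
 */
size_t drop_internal_vg_names(const char **names, size_t count)
{
	size_t kept = 0;

	for (size_t i = 0; i < count; i++)
		if (!is_internal_vg_name(names[i]))
			names[kept++] = names[i];

	return kept;
}
```

Filtering at the point where the list is built means a library user
never sees a name such as "#orphans_lvm2" and so cannot open it.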
Mike Snitzer [Wed, 3 Feb 2010 03:58:08 +0000 (03:58 +0000)]
Add %ORIGIN support to lv{create,extend,reduce,resize} --extents option
Allow the number of logical extents to be expressed (for a snapshot) as
a percentage of the total space in the Origin Logical Volume with the
suffix %ORIGIN.
Update the relevant man pages accordingly. Eliminate inconsistencies
between the man pages and tools/commands.h
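As a rough illustration of the arithmetic behind a value such as
"--extents 20%ORIGIN", the sketch below scales the origin's extent
count by a percentage. The function name and the choice to round down
are assumptions; the commit message does not specify the rounding.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of extents for a snapshot sized as a
 * percentage of its origin. Rounding down is an assumption.
 */
uint32_t extents_from_origin_percent(uint32_t origin_extents,
				     uint32_t percent)
{
	/* Widen to 64 bits so the multiplication cannot overflow. */
	return (uint32_t)(((uint64_t)origin_extents * percent) / 100);
}
```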
Was using dm_list_iterate_items when I should have been using
*_safe. This had the effect of segfaulting the log daemon when
converting a mirror from one log type to another.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
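The difference between a plain and a "safe" list walk can be sketched
with a minimal stand-in list. The types and the list_iterate_safe macro
below are illustrative only and are not the libdevmapper definitions;
the point is that the successor is saved before the loop body runs.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled loosely on dm_list. */
struct list {
	struct list *n, *p;
};

void list_init(struct list *head)
{
	head->n = head->p = head;
}

void list_add(struct list *head, struct list *elem)
{
	elem->n = head;
	elem->p = head->p;
	head->p->n = elem;
	head->p = elem;
}

void list_del(struct list *elem)
{
	elem->n->p = elem->p;
	elem->p->n = elem->n;
	elem->n = elem->p = NULL;	/* a stale walk would now crash */
}

/*
 * Safe walk: the successor is saved in 'next' before the body runs,
 * so the body may unlink (or free) 'cur' without breaking the loop.
 */
#define list_iterate_safe(cur, next, head) \
	for ((cur) = (head)->n, (next) = (cur)->n; \
	     (cur) != (head); \
	     (cur) = (next), (next) = (cur)->n)

/* Unlink every element and return how many were removed. */
unsigned list_remove_all(struct list *head)
{
	struct list *cur, *next;
	unsigned removed = 0;

	list_iterate_safe(cur, next, head) {
		list_del(cur);
		removed++;
	}

	return removed;
}
```

A walk that reads cur->n only after the body has run would follow the
NULL pointer left by list_del(), which matches the kind of segfault
described above.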
Milan Broz [Wed, 27 Jan 2010 13:29:11 +0000 (13:29 +0000)]
Fix pvmove abort when temporary mirror fails to be cluster-aware.
When activation of the pvmove mirror fails on a cluster, some nodes
may still have succeeded in activation.
- Explicitly deactivate that mirror to be sure
- properly pair suspend/resume calls to not cause memory lock problems in clvmd
Code cannot simply call _finish_pvmove on cluster in this situation, because
changed LVs are suspended twice (causing memory imbalance) and also temporary
mirror is activated when it is not expected (and we know that it failed already).
Patch prepares special function which removes temporary mirror references from
metadata and then resumes changed LVs.
Mike Snitzer [Fri, 22 Jan 2010 21:59:42 +0000 (21:59 +0000)]
Default to checking LV's progress before waiting in _wait_for_single_lv.
Support "wait before testing" using '+' in pvmove and lvconvert
interval. Doing so overrides the new default of sleeping after checking
the LV's progress.
Sleeping before checking progress can lead to extraneous polldaemons
being left running. These polldaemons would have otherwise exited had
they checked before sleeping. Checking progress before sleeping helps
work around the subtly unreliable nature of "finished" state checking
in _percent_run.
Update test/t-mirror-names.sh to use '+' when providing its lvconvert
interval.
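The ordering change can be sketched as below. This is not the
_wait_for_single_lv code: the callback types, struct poll_state and the
two fake_* helpers are hypothetical stand-ins for the real progress
check and sleep, used only to show the control flow.

```c
#include <stdbool.h>

/* Hypothetical callbacks standing in for the progress check and sleep. */
typedef bool (*finished_fn)(void *ctx);
typedef void (*sleep_fn)(unsigned seconds, void *ctx);

/*
 * Poll until finished() reports completion. By default progress is
 * checked first; wait_before_testing (the '+' interval prefix) sleeps
 * first instead. Returns the number of sleeps performed.
 */
unsigned wait_for_completion(finished_fn finished, sleep_fn do_sleep,
			     unsigned interval, bool wait_before_testing,
			     void *ctx)
{
	unsigned sleeps = 0;

	if (wait_before_testing) {
		do_sleep(interval, ctx);
		sleeps++;
	}

	while (!finished(ctx)) {
		do_sleep(interval, ctx);
		sleeps++;
	}

	return sleeps;
}

/* Fake LV state for demonstration: done after checks_left more checks. */
struct poll_state {
	unsigned checks_left;
	unsigned slept;
};

bool fake_finished(void *ctx)
{
	struct poll_state *s = ctx;

	if (s->checks_left == 0)
		return true;
	s->checks_left--;
	return false;
}

void fake_sleep(unsigned seconds, void *ctx)
{
	struct poll_state *s = ctx;

	(void) seconds;		/* no real delay in this sketch */
	s->slept++;
}
```

With an LV that has already finished, the check-first default returns
without sleeping at all, whereas the '+' variant always sleeps for one
interval before it notices.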
Dave Wysochanski [Thu, 21 Jan 2010 21:04:44 +0000 (21:04 +0000)]
Remove useless memory allocation for pv->vg_name in _alloc_pv().
All this seems to do is provide a memory leak so remove it.
The only caller of _alloc_pv() later explicitly sets
pv->vg_name = fmt->orphan_vg_name so clearly this allocation
should be removed. I also saw nowhere in the code where
strncpy was used to assign pv->vg_name - only direct assignments
and strdup's.
Zdenek Kabelac [Thu, 21 Jan 2010 13:41:39 +0000 (13:41 +0000)]
Reset released pointer and counters.
The DSO is currently not dl_close-ing the plugin during its unregister
handling, so clear the structure and related counter to avoid memory problems.
Further fixes are needed.
Mike Snitzer [Wed, 20 Jan 2010 21:53:10 +0000 (21:53 +0000)]
Preload the origin prior to suspend IFF snapshot(s) still exist after a
merge completes. This narrows the scope of this "hack" (which still
needs a proper fix within the deptree).
This stops dmeventd from trying to access snapshot devices that were
already removed.
Mike Snitzer [Tue, 19 Jan 2010 16:44:57 +0000 (16:44 +0000)]
Add a common way to establish a scsi_debug-based 4K drive for use by an
LVM2 test (rather than using the traditional loop device).
prepare_scsi_debug_dev currently assumes exclusive access to the
scsi_debug module. Any script that tries to use prepare_scsi_debug_dev
when scsi_debug is unavailable or already loaded into the kernel will be
skipped.
t-topology-support.sh shows how prepare_scsi_debug_dev function can be
used repeatedly (within a script) to test LVM2 on top of a ramdisk-based
SCSI device w/ arbitrary scsi_debug features.
Mike Snitzer [Tue, 19 Jan 2010 15:59:34 +0000 (15:59 +0000)]
Update test/t-pvcreate-operation-md.sh to attempt loading raid0.ko if raid0
isn't already available (in /proc/mdstat).
Switch to requiring 2.6.33 for the alignment_offset tests; 2.6.{31,32}
alignment_offset values aren't reliable. 2.6.33 _should_ have mkp's
alignment_offset fixes but so far it doesn't (as of 2.6.33-rc4).
Mike Snitzer [Fri, 15 Jan 2010 22:58:25 +0000 (22:58 +0000)]
Change dev_manager_mirror_percent()'s 'struct logical_volume *' to be
'const'. Be consistent with its use (and dev_manager_snapshot_percent()).
Pass 'lv' from dev_manager_snapshot_percent() to _percent() to
_percent_run(). _percent_run() always dereferenced 'lv' (when
initializing segh) even though it may have been NULL (as was the case
until now for dev_manager_snapshot_percent()).
If a "snapshot-origin" LV (snapshot-merge whose merge was deferred
because it was open) was passed to _percent_run() it would always return
100%.
Update _percent_run() to NOT return PERCENT_100 et al. if
->target_percent() wasn't ever called and supplied 'lv' is a merging
origin. A default return of 100% does not work for snapshot-merge.
Also tweak a related lvconvert log_error() to include "Aborting merge."
When moving the cluster log server into the LVM tree, the in memory
bitmap tracking was switched from the e2fsprogs implementation to
the device-mapper implementation (dm_bitset_t). The latter has a
leading uint32_t field designed to hold the number of bits that are
being tracked. The code was not properly handling this change in
all places. Specifically, when getting the bitmap to/from disk.
Endian adjustments will likely need to be made on the accounting
field as well, since bitmaps are passed between machines on
start-up.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
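A sketch of the layout issue described above, under two assumptions:
word 0 of the in-memory bitset holds the bit count with the bit data
starting at word 1, and the on-disk image holds only the bit data. The
helper names are hypothetical and byte order is ignored here.

```c
#include <stdint.h>
#include <string.h>

#define BITS_PER_WORD 32u

/* Number of 32-bit data words needed to hold nbits bits. */
size_t bitset_data_words(uint32_t nbits)
{
	return (nbits + BITS_PER_WORD - 1) / BITS_PER_WORD;
}

/* Copy only the bit data (not the count word) to the disk buffer. */
void bitset_to_disk(const uint32_t *bs, uint32_t *disk)
{
	memcpy(disk, bs + 1, bitset_data_words(bs[0]) * sizeof(uint32_t));
}

/* Load bit data from disk, leaving the in-memory count word intact. */
void bitset_from_disk(uint32_t *bs, const uint32_t *disk)
{
	memcpy(bs + 1, disk, bitset_data_words(bs[0]) * sizeof(uint32_t));
}
```

Copying from bs rather than bs + 1 would shift every bit by one word
and write the count into the first data word, which is the kind of
mismatch the message refers to.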