Mike Snitzer [Fri, 15 Jan 2010 22:58:25 +0000 (22:58 +0000)]
Change dev_manager_mirror_percent()'s 'struct logical_volume *' to be
'const'. Be consistent with its use (and dev_manager_snapshot_percent()).
Pass 'lv' from dev_manager_snapshot_percent() to _percent() to
_percent_run(). _percent_run() always dereferenced 'lv' (when
initializing segh) even though it may have been NULL (as was the case
until now for dev_manager_snapshot_percent()).
If a "snapshot-origin" LV (snapshot-merge whose merge was deferred
because it was open) was passed to _percent_run() it would always return
100%.
Update _percent_run() to NOT return PERCENT_100 et al. if
->target_percent() wasn't ever called and supplied 'lv' is a merging
origin. A default return of 100% does not work for snapshot-merge.
Also tweak a related lvconvert log_error() to include "Aborting merge."
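The default-return problem described above can be sketched in a few lines.
This is an illustrative Python model, not the LVM2 C code: percent_run(),
its 'reports' argument and the use of None for "no percentage available"
are all assumptions made for the example.

```python
from typing import Iterable, Optional, Tuple


def percent_run(reports: Iterable[Tuple[int, int]],
                is_merging_origin: bool) -> Optional[float]:
    """Combine (numerator, denominator) pairs, one per target_percent() call.

    Hypothetical model: returns None when no percentage can be reported.
    """
    total_num = 0
    total_den = 0
    called = False
    for num, den in reports:
        called = True
        total_num += num
        total_den += den

    if not called and is_merging_origin:
        # No target reported anything: a deferred merge must not look complete.
        return None
    if total_den == 0:
        # Stand-in for the old unconditional PERCENT_100 default.
        return 100.0
    return 100.0 * total_num / total_den
```

With no reports at all, an ordinary LV still yields 100.0 in this model,
while a merging origin yields None instead of a misleading 100%.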
When moving the cluster log server into the LVM tree, the in memory
bitmap tracking was switched from the e2fsprogs implementation to
the device-mapper implementation (dm_bitset_t). The latter has a
leading uint32_t field designed to hold the number of bits that are
being tracked. The code was not properly handling this change in
all places, specifically when getting the bitmap to/from disk.
Endian adjustments will likely need to be made on the accounting
field as well, since bitmaps are passed between machines on
start-up.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
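A minimal sketch of the layout issue, in Python rather than the C used by
the log server. It assumes that only the bit words (not the leading count
word) are stored on disk and that the on-disk words are little-endian;
both are assumptions for illustration, as are all function names.

```python
import struct

BITS_PER_WORD = 32


def bitset_create(num_bits):
    """Word 0 holds the bit count; the remaining words hold the bits."""
    words = (num_bits + BITS_PER_WORD - 1) // BITS_PER_WORD
    return [num_bits] + [0] * words


def bitset_set(bitset, bit):
    # The +1 skips the leading accounting word.
    bitset[1 + bit // BITS_PER_WORD] |= 1 << (bit % BITS_PER_WORD)


def bitset_to_disk(bitset):
    """Serialize only the bit words (assumed little-endian on disk)."""
    payload = bitset[1:]
    return struct.pack("<%dI" % len(payload), *payload)


def bitset_from_disk(num_bits, raw):
    """Rebuild the in-memory form, restoring the leading count word."""
    words = (num_bits + BITS_PER_WORD - 1) // BITS_PER_WORD
    payload = struct.unpack("<%dI" % words, raw[:words * 4])
    return [num_bits] + list(payload)
```

Forgetting the one-word offset in either direction shifts every bit by 32
positions, which is the class of bug the message above describes.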
An off-by-one parameter count meant that not all of the necessary
mirror table parameters were passed on to userspace.
The cluster mirror table (log portion only) used to look like this:
clustered-disk <parm_count> <disk> <region_size> <uuid> \
[[no]sync] [block_on_error]
Now it looks like this:
userspace <parm_count> <uuid> clustered-disk <disk> <region_size> \
[[no]sync]
So, there is one extra argument in the latter case - this was
unaccounted for.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
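One way to avoid this kind of miscount is to derive <parm_count> from the
argument list itself. The Python sketch below assumes that <parm_count>
counts every argument that follows it in the new format; the function name
and the sample values are hypothetical.

```python
def userspace_log_params(uuid, disk, region_size, sync=None):
    """Build the log portion of the table in the new 'userspace' format.

    sync: None (omit), True ("sync") or False ("nosync").
    """
    args = [uuid, "clustered-disk", disk, str(region_size)]
    if sync is not None:
        args.append("sync" if sync else "nosync")
    # Counting the list avoids forgetting the extra "clustered-disk" argument.
    return "userspace %d %s" % (len(args), " ".join(args))
```

For example, userspace_log_params("UUID", "254:3", 1024) produces a count
of 4, one more than the three mandatory arguments of the old format.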
Mike Snitzer [Wed, 13 Jan 2010 01:56:18 +0000 (01:56 +0000)]
Rename segment and lv status flag from SNAPSHOT_MERGE to MERGING.
Eliminate 'merging_snapshot' from 'struct logical_volume' and just use
'snapshot' for origin lv's reference to the merging snapshot; also set
MERGING in the origin lv's status.
Mike Snitzer [Wed, 13 Jan 2010 01:54:34 +0000 (01:54 +0000)]
Merge on activate support.
If either the origin or snapshot that is to be merged is open the merge
will not start; only the merge metadata will be written. The merge will
start on the next activation of the origin (or via lvchange --refresh)
IFF both the origin and snapshot are closed.
Merge on activate is particularly important if we want to merge over a
mounted filesystem that cannot be unmounted (until next boot) --- for
example root.
Mike Snitzer [Wed, 13 Jan 2010 01:52:58 +0000 (01:52 +0000)]
When turning a merging origin into a non-merging origin, there is a bad sequence:
snapshots are suspended, new origin is created, snapshots are resumed, new
origin is resumed. So it allocates memory while suspended.
To fix it, move vg_commit after suspend_lv, so that the suspend code will
treat it as precommitted vg and will preload new origin prior to suspend.
NOTE: agk doesn't like this "hack"; need to revisit and fix
Mike Snitzer [Wed, 13 Jan 2010 01:49:22 +0000 (01:49 +0000)]
When there is a merging snapshot, report percentage on the origin LV.
Because the snapshot LV will be hidden this is needed so the user can
see merging progress with "lvs" command.
Mike Snitzer [Wed, 13 Jan 2010 01:48:38 +0000 (01:48 +0000)]
Report merging snapshot as 'S' instead of 's':
This is useful for when the snapshot is still active and merging hasn't
started yet; it shows a merge is pending. Once merging starts the
merging snapshot will be hidden but can still be displayed with 'lvs -a'
Report snapshot origin with merging snapshot as 'O' instead of 'o':
Before merge starts this shows that a merge is pending. While merging
the snapshot will be hidden, 'O' enables a user to see that there is a
snapshot merging.
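The attribute choice above amounts to a small mapping. The Python sketch
below is illustrative only: it models just the four characters named here,
and the '-' fallback and the function name are assumptions.

```python
def volume_type_char(is_origin, is_snapshot, merging):
    """Pick the volume-type character for the four cases described above."""
    if is_origin:
        return "O" if merging else "o"
    if is_snapshot:
        return "S" if merging else "s"
    return "-"  # placeholder for every other LV type (not modelled here)
```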
Mike Snitzer [Wed, 13 Jan 2010 01:44:37 +0000 (01:44 +0000)]
Merging device is loaded with "-cow" suffix and with base name of the
origin. This is needed so that "-cow" device can be found and removed
when lvremove is performed.
Mike Snitzer [Wed, 13 Jan 2010 01:39:44 +0000 (01:39 +0000)]
Add support for "snapshot-merge" target.
Introduces new libdevmapper function dm_tree_node_add_snapshot_merge_target
Verifies that the kernel (dm-snapshot) provides the 'snapshot-merge'
target.
Activate origin LV as snapshot-merge target. Using snapshot-origin
target would be pointless because the origin contains volatile data
while a merge is in progress.
Because snapshot-merge target is activated in place of the
snapshot-origin target it must be resumed after all other snapshots
(just like snapshot-origin does) --- otherwise a small window for data
corruption would exist.
Ideally the merging snapshot would not be activated at all but if it is
to be activated (because snapshot was already active) it _must_ be done
after the snapshot-merge. This ensures that DM's snapshot-merge target
will perform exception handover in the proper order (new->resume before
old->resume). DM's snapshot-merge does support handover if the reverse
sequence is used (old->resume before new->resume) but DM will fail to
resume the old snapshot; leaving it suspended.
To ensure the proper activation sequence dm_tree_activate_children() was
updated to accommodate an additional 'activation_priority' level. All
regular snapshots are 0, snapshot-merge is 1, and merging snapshot is 2.
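The ordering rule can be sketched as a stable sort on the priority level.
This is a Python illustration, not the dm_tree_activate_children() code;
the 'kind' labels and the function name are hypothetical, and it assumes
lower priority values are handled first.

```python
ACTIVATION_PRIORITY = {
    "snapshot": 0,          # regular snapshots
    "snapshot-merge": 1,    # origin loaded as snapshot-merge
    "merging-snapshot": 2,  # the snapshot being merged
}


def activation_order(children):
    """children: list of (name, kind) pairs; returns names in resume order."""
    ordered = sorted(children, key=lambda c: ACTIVATION_PRIORITY[c[1]])
    return [name for name, _kind in ordered]
```

Because the sort is stable, regular snapshots keep their relative order,
the snapshot-merge device follows them, and the merging snapshot is last.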
Alasdair Kergon [Tue, 12 Jan 2010 20:53:20 +0000 (20:53 +0000)]
Fix allocation code not to stop at the first area of a PV that fits.
This spurious 'break' has been here since this code was first committed
in June 2005 and stopped the algorithm behaving as described in the
comment above it and rendered the variable 'already_found_one' useless.
Testsuite updates and fixes for recently added features.
1. Found bug in 'redundant log' implementation that caused
problems when converting a linear that spanned multiple
devices to a mirror (wasn't checking for NULL value of
provided parameter in _alloc_parallel_area)
2. Testsuite was failing to perform tests when 'not' modifier
was used. This allowed a couple issues to slip through.
Added a 'not_sh' modifier that negates tests performed by
functions defined in the shell source file.
3. Was initializing a variable too far down, which caused a
previously set value to be overridden. (This was the
result of the collision of the "redundant log" and
lvconvert fix patches.)
Mike Snitzer [Mon, 11 Jan 2010 19:08:18 +0000 (19:08 +0000)]
Reset _vgs_locked in lvmcache_init()
Upon successful fork(), _become_daemon() must assert that the locks that
are currently held belong to the parent, not the child. All of the
child's internal state saying 'this process holds a lock' has to be
reset.
A proper lvmcache_locking_reset() should follow later.
date: 2010/01/07 20:42:55; author: jbrassow; state: Exp; lines: +11 -0
The patch fixes some lvconvert issues (WRT mirror <-> mirror).
1) 'existing_mirrors' and 'lp->mirrors' were taken to be in 'n-1'
notation (i.e. a 2-way mirror is '1' and a linear is '0'), but the
variables were in 'n' notation.
2) After adding the redundant mirror log support, I was calculating
log_count by looking at the mirror log LV, but didn't take into
account the fact that there could be no mirror log!
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
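Both issues are easy to state in code. The Python sketch below is only a
model: the helper names are hypothetical, and a log LV is represented as a
dict with an optional "images" entry purely for illustration.

```python
def images_from_mirrors_option(mirrors_opt):
    """Convert 'n-1' notation (1 == 2-way mirror) to 'n' notation."""
    return mirrors_opt + 1


def mirrors_option_from_images(image_count):
    """Convert 'n' notation (2 == 2-way mirror) to 'n-1' notation."""
    return image_count - 1


def log_count(log_lv):
    """Number of log copies; log_lv is None when the mirror has no log."""
    if log_lv is None:
        return 0  # nothing to inspect, so do not dereference
    return log_lv.get("images", 1)
```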
Peter Rajnoha [Mon, 11 Jan 2010 15:36:24 +0000 (15:36 +0000)]
Add support to disable udev checking: DM_UDEV_DISABLE_CHECKING=1 env. variable.
Sometimes it is necessary to switch off udev checking and the warnings shown when
we detect that udev has not done its job properly, i.e. messages like "Udev should have done
this and that. Falling back to direct node creation/removal." etc.
This is especially handy when the DM_DEV_DIR env var is set to a location other than the
standard /dev (udev can't create nodes/symlinks outside the one directory that is
configured into udevd). The exact same situation arises while we're
running our tests.
Add the new mirror log type "redundant". The options are now:
--mirrorlog core: in-memory log
--mirrorlog disk: persistent log
--mirrorlog redundant: redundant persistent log
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
This patch adds the capability to split off mirror legs.
It is pretty much the same as reducing the number of
mirror legs, but we just don't delete them afterwards.
The following command line interface is enforced:
prompt> lvconvert --splitmirrors <n> -n <name> <VG>/<LV>
where 'n' is the number of images to split off, and
where 'name' is the name of the newly split off logical volume.
If more than one leg is split off, a new mirror will be the
result. The newly split off mirror will have a 'core' log.
Example:
[root@bp-01 LVM2]# !lvs
lvs -a -o name,copy_percent,devices
  LV            Copy%  Devices
  lv            100.00 lv_mimage_0(0),lv_mimage_1(0),lv_mimage_2(0),lv_mimage_3(0)
  [lv_mimage_0]        /dev/sdb1(0)
  [lv_mimage_1]        /dev/sdc1(0)
  [lv_mimage_2]        /dev/sdd1(0)
  [lv_mimage_3]        /dev/sde1(0)
  [lv_mlog]            /dev/sdi1(0)
[root@bp-01 LVM2]# lvconvert --splitmirrors 2 --name split vg/lv /dev/sd[ce]1
Logical volume lv converted.
[root@bp-01 LVM2]# !lvs
lvs -a -o name,copy_percent,devices
  LV               Copy%  Devices
  lv               100.00 lv_mimage_0(0),lv_mimage_2(0)
  [lv_mimage_0]           /dev/sdb1(0)
  [lv_mimage_2]           /dev/sdd1(0)
  [lv_mlog]               /dev/sdi1(0)
  split            100.00 split_mimage_0(0),split_mimage_1(0)
  [split_mimage_0]        /dev/sde1(0)
  [split_mimage_1]        /dev/sdc1(0)
It can be seen that '--splitmirrors <n>' is exactly the same
as '--mirrors -<n>' (note the minus sign), except there is the
additional notion to keep the image being detached from the
mirror instead of just throwing it away.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
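The relationship between splitting and reducing can be modelled as follows.
This Python sketch is an illustration under assumptions: split_mirror() is
a hypothetical name, images are plain strings, and the rule for picking
images when no PVs are named (take the last ones) is invented for the
example. The naming and ordering of the split-off images is not modelled.

```python
def split_mirror(images, n, candidates=None):
    """Detach n images from a mirror and keep them.

    Returns (remaining, split_off). 'candidates' optionally restricts
    which images may be detached, like the PV arguments in the example.
    """
    if not 0 < n < len(images):
        raise ValueError("must split at least one image and leave at least one")
    pool = [img for img in images if candidates is None or img in candidates]
    if len(pool) < n:
        raise ValueError("not enough candidate images to split off")
    split_off = pool[-n:]
    remaining = [img for img in images if img not in split_off]
    # A plain reduction would discard split_off here instead of returning it.
    return remaining, split_off
```

Applied to the four-image mirror from the example with /dev/sdc1 and
/dev/sde1 as candidates, the model leaves /dev/sdb1 and /dev/sdd1 behind.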
Petr Rockai [Fri, 8 Jan 2010 13:04:10 +0000 (13:04 +0000)]
In lvconvert --repair --use-policies, for the allocate policies, return success
even if allocation fails, as long as the downconversion or corelog conversion
succeeded.
The patch fixes some lvconvert issues (WRT mirror <-> mirror).
The default log option for a mirror is 'disk'. If the log
type is not explicitly stated on the command line when
converting from an X-way mirror to a Y-way mirror, 'disk'
is chosen. So, if you have a 'core' log mirror and you
convert, your result will contain a 'disk' log.
This patch remembers what the old log type was. If the
user is merely trying to switch the number of mirror
images, the log type is now kept the same.
There is one historical behaviour I left in place...
If you have a 2-way, core-log mirror and you use lvconvert to
specify you want a 2-way mirror - without specifying the
log type - you will get a 2-way, disk-log mirror.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Informal-IRC-ACK-by: agk
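The log-type selection described above can be summarised in a short model.
This is a Python sketch, not the lvconvert code; choose_log_type() and its
parameters are hypothetical names.

```python
def choose_log_type(old_log, requested_log, old_images, new_images):
    """Pick the log type for a mirror -> mirror conversion.

    requested_log is None when no log type was given on the command line.
    """
    if requested_log is not None:
        return requested_log      # an explicit request always wins
    if new_images != old_images:
        return old_log            # only the image count changes: keep the log
    return "disk"                 # historical behaviour left in place
```

So a 3-way core-log mirror converted to 2-way keeps its 'core' log, while
a 2-way core-log mirror "converted" to 2-way still ends up with 'disk'.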
Milan Broz [Wed, 6 Jan 2010 13:26:21 +0000 (13:26 +0000)]
Remove "repaired" devices if empty in lvconvert.
The logic was that lvconvert repairs volumes, marking
the PV as MISSING, and a following vgreduce --removemissing
removes these missing devices.
Previously dmeventd mirror DSO removed all LV and PV
from VG by simply relying on
vgreduce --removemissing --force.
Now, there are two subsequent calls:
lvconvert --repair --use-policies
vgreduce --removemissing
So the VG is locked twice, opening a window for races
with other running lvm processes. If the PV reappears
with old metadata on it (so the winner performs autorepair,
if locking VG for update) the situation is even worse.
The patch simply adds removemissing PV functionality into
lvconvert BUT ONLY if running with --repair and --use-policies,
and it removes only those empty missing PVs which are
involved in the repair.
(This combination is expected to run only from dmeventd.)
Mike Snitzer [Tue, 5 Jan 2010 21:14:04 +0000 (21:14 +0000)]
Use snapshot metadata usage to determine if snapshot is empty
Version >= 1.8.0 of the DM snapshot target appends metadata sectors used
to a snapshot's status. This patch allows LVM2 to accurately determine
if the snapshot store is empty. Knowing when a snapshot store is empty
is important in the context of snapshot-merge (means merge is complete).
Also update LVM2 to be aware of the possibility for "Merge failed" in
the snapshot-merge target's status.
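A minimal Python sketch of the emptiness check, assuming the status
parameters have the form "<sectors_allocated>/<total_sectors>
<metadata_sectors>" (the third field being absent on older targets) or are
one of the literal strings "Invalid" and "Merge failed". The function name
and the returned dict layout are hypothetical.

```python
def parse_snapshot_status(params):
    """Parse snapshot status parameters and decide whether the store is empty."""
    params = params.strip()
    if params in ("Invalid", "Merge failed"):
        return {"state": params, "empty": None}
    fields = params.split()
    used, total = (int(x) for x in fields[0].split("/"))
    # Older targets do not report metadata sectors; emptiness is then unknown.
    metadata = int(fields[1]) if len(fields) > 1 else None
    return {
        "state": "ok",
        "used": used,
        "total": total,
        "metadata": metadata,
        # Empty means every allocated sector is metadata, i.e. no data left.
        "empty": None if metadata is None else used == metadata,
    }
```

In the snapshot-merge case, "empty" becoming true is what signals that the
merge has completed.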
Mike Snitzer [Tue, 5 Jan 2010 20:56:51 +0000 (20:56 +0000)]
Add a [--poll {y|n}] flag to vgchange and lvchange to control whether
the background polldaemon is allowed to start. It can be used
standalone or in conjunction with --refresh or --available y.
Control over when the background polldaemon starts will be particularly
important for snapshot-merge of a root filesystem.
Dracut will be updated to activate all LVs with: --poll n
The lvm2-monitor initscript will start polling with: --poll y
NOTE: Because we currently have no way of knowing if a background
polldaemon is active for a given LV the following limitations exist and
have been deemed acceptable:
1) it is not possible to stop an active polldaemon; so the lvm2-monitor
initscript doesn't stop running polldaemon(s)
2) redundant polldaemon instances will be started for all specified LVs
if vgchange or lvchange are repeatedly used with '--poll y'