Peter Rajnoha [Thu, 28 Jun 2012 08:15:07 +0000 (04:15 -0400)]
lvcreate: add --activate ay (autoactivate)
One can use "lvcreate -aay" to have the newly created volume activated
or left inactive, depending on whether it passes the
activation/auto_activation_volume_list filter.
Note: -Z/--zero is not compatible with -aay; zeroing is not done in this case!
When using lvcreate -aay, a warning message is also issued by default to say
that zeroing is not done.
Peter Rajnoha [Wed, 27 Jun 2012 13:35:11 +0000 (09:35 -0400)]
pvscan: add --activate ay option (autoactivate)
Define auto_activation_handler that activates VGs/LVs automatically
based on the activation/auto_activation_volume_list (activating all
volumes by default if the list is not defined).
The autoactivation is done within the pvscan call in 69-dm-lvmetad.rules
that watches for udev events (device appearance/removal).
For now, this works for non-clustered and complete VGs only.
Peter Rajnoha [Wed, 27 Jun 2012 14:21:15 +0000 (10:21 -0400)]
vgchange: add --activate ay option (autoactivate)
Normally, the 'vgchange -ay' activates all volume groups (that pass
the activation/volume_list filter if set).
This call can appear in two scenarios:
- system boot (so activation within a script in general)
- manual call on command line (so activation on user's direct request)
For the former one, we would like to select which VGs should be actually
activated. One can define the list of VGs directly to do that. But that
would require the same list to be provided in all the scripts.
The 'vgchange -aay' will additionally check the
activation/auto_activation_volume_list and activate only those VGs/LVs
that pass this filter (assuming all are to be activated if the list is
not defined - the same logic we already have for activation/volume_list).
Init/boot scripts should use this form of activation primarily
(which, anyway, becomes only a fallback now with autoactivation done
on PV appearance in tandem with lvmetad in place).
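As an illustration of the filter logic described above, here is a minimal
Python sketch (not LVM2's actual C code). The function name, its parameters
and the three entry forms it understands (a VG name, a VG/LV pair and an
@tag) are assumptions made for the example; the only behaviour taken from
the text is that an undefined list lets every volume through.

```python
from typing import Iterable, Optional, Sequence


def passes_auto_activation(vg: str, lv: str, tags: Iterable[str],
                           volume_list: Optional[Sequence[str]]) -> bool:
    """Return True if vg/lv would be autoactivated (simplified model)."""
    # An undefined list means "activate everything".
    if volume_list is None:
        return True
    tag_set = set(tags)
    for entry in volume_list:
        if entry.startswith("@"):
            # Assumed form: "@tag" matches any volume carrying that tag.
            if entry[1:] in tag_set:
                return True
        elif entry == vg or entry == f"{vg}/{lv}":
            # Assumed forms: "vgname" or "vgname/lvname".
            return True
    return False
```

With a hypothetical list ["vg00", "vg01/lvol1"], every LV in vg00 passes,
while in vg01 only lvol1 does.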
Peter Rajnoha [Wed, 27 Jun 2012 12:59:34 +0000 (08:59 -0400)]
activate: add autoactivation hooks
Define an 'activation_handler' that gets called automatically on
PV appearance/disappearance while processing the lvmetad_pv_found
and lvmetad_pv_gone functions that are supposed to update the
lvmetad state based on PV availability state. For now, the actual
support is for PV appearance only, leaving room for PV disappearance
support as well (which is a more complex problem to solve as it
needs to take a possible device stack into account).
Add a new activation change mode - CHANGE_AAY exposed as
'--activate ay/-aay' argument ('activate automatically').
Factor out the vgchange activation functionality for use in other
tools (like pvscan...).
Peter Rajnoha [Wed, 27 Jun 2012 11:48:31 +0000 (07:48 -0400)]
args: add --activate synonym for --available arg
We're referring to 'activation' all over the code and we're talking
about 'LVs being activated' all the time so let's use 'activation/activate'
everywhere for clarity and consistency (still providing the old
'available' keyword as a synonym for backward compatibility with
existing environments).
Update release_lv_segment_area not to discard any PV extents,
as it also gets used when moving extents between LVs.
Instead, call a new function release_and_discard_lv_segment_area() in
the two places where data should be discarded - lv_reduce() and
remove_mirrors_from_segments().
Peter Rajnoha [Fri, 22 Jun 2012 09:50:02 +0000 (05:50 -0400)]
udev: udev rules cleanup
Remove executable path detection in udev rules and use sbindir that
is configured, but still provide the original functionality by means
of 'configure --enable-udev-rule-exec-detection'.
Normally, the exec path for the tools called in udev rules should
not differ from the sbindir used; however, there are cases where this
is necessary. For example, different environments could be assembled
in a way that makes these paths differ for some reason (distribution
installer, initrd ...).
This functionality is kept for compatibility only. Any environment
moving the binaries around and using different paths should be fixed
eventually!
Peter Rajnoha [Thu, 21 Jun 2012 12:41:52 +0000 (08:41 -0400)]
configure: run directory configuration cleanup
There were several hard-coded values for run directory around the code.
Also, some tools are DM specific only, others are LVM specific and there
was no distinction made here before. With this patch applied, we have
this cleaned up a bit (subsystem in brackets, defaults in parentheses):
  [common]    configurable PID_DIR (/var/run)
  lvm [lvm]   configurable RUN_DIR (/var/run/lvm)
              configurable locking dir (/var/lock/lvm)
The changes briefly:
- added configure --with-default-pid-dir
- added configure --with-default-dm-run-dir
- added configure --with-lvmetad-pidfile
- by default, using one common pid directory for everything
(only lvmetad was not following this before)
Peter Rajnoha [Mon, 25 Jun 2012 09:34:21 +0000 (11:34 +0200)]
dev-io: open device read-only to obtain readahead value
There's no need to have the device open RW while obtaining the readahead value.
The RW open used before caused the CHANGE udev event to be generated if the
WATCH udev rule was set for the underlying device (and that is normally the
case both for non-dm and dm devices by default).
This did not cause any problems before since we were not interested in
*underlying* devices. However, with upcoming changes (autoactivation), we're
watching for events on underlying devices marked as PVs and such a spurious
event could cause the autoactivation code to be triggered. So when trying
to deactivate the volume, we could end up with immediate activation just after
that because of the CHANGE event originated in the WATCH udev rule since the
underlying device was open RW during the deactivation process.
Though a better solution might be to filter such spurious events out of the
autoactivation process completely, it's still useful to have as few spurious
events generated in the system as possible.
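The read-only open can be sketched as follows. This is an illustrative
Python example for Linux, not the dev-io code itself: get_readahead_sectors
is a hypothetical helper, and the numeric value of the BLKRAGET ioctl
(declared in <linux/fs.h>) is the one used on x86 and ARM, which is an
assumption about the target platform.

```python
import fcntl
import os
import struct

# BLKRAGET is _IO(0x12, 99); this value holds on x86/ARM Linux (assumption).
BLKRAGET = 0x1263


def get_readahead_sectors(path: str) -> int:
    """Return the readahead of a block device in 512-byte sectors."""
    # O_RDONLY is enough for this query; closing a descriptor that was
    # opened for writing is what makes a WATCH rule emit a CHANGE event.
    fd = os.open(path, os.O_RDONLY)
    try:
        # The kernel writes a C long into the buffer we pass in.
        buf = fcntl.ioctl(fd, BLKRAGET, struct.pack("l", 0))
        return struct.unpack("l", buf)[0]
    finally:
        os.close(fd)
```

Calling it on a block device node (which usually requires root) returns the
same figure that "blockdev --getra" reports.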
Zdenek Kabelac [Fri, 22 Jun 2012 09:15:14 +0000 (11:15 +0200)]
fix: limit preallocate stack size
If the user set a reserved stack size bigger than what is allowed
by the resource limits (ulimit -s), lvm would dump core.
So avoid the coredump and skip preallocating such a large stack
(lvm should work properly with just 64KB, so the option could
be eliminated).
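A minimal Python sketch of such a check, under the assumption that the
requested size is simply compared against the soft RLIMIT_STACK limit (the
value "ulimit -s" shows). Both function names are hypothetical, and the
real code is C and may use a different comparison.

```python
import resource


def current_stack_limit() -> int:
    """Return the soft stack limit in bytes (may be RLIM_INFINITY)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    return soft


def stack_reservation_allowed(requested_bytes: int, soft_limit: int) -> bool:
    """Return True if preallocating requested_bytes of stack looks safe."""
    # With no limit set, any size is acceptable.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    # Otherwise refuse anything that would reach the limit.
    return requested_bytes < soft_limit
```

A caller would skip the preallocation, rather than fail, when
stack_reservation_allowed(size, current_stack_limit()) is false.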
Peter Rajnoha [Tue, 29 May 2012 08:09:10 +0000 (08:09 +0000)]
Remove unsupported udev_get_dev_path libudev call used for checking udev dir.
With the latest changes in udev, some deprecated functions were removed
from libudev, among them the "udev_get_dev_path" function that we used
to compare the device directory used in udev with the directory set in
libdevmapper. The "/dev" is hardcoded in udev now (udev version >= 183).
Amongst other changes and from a packager's point of view, it's also
important to note that the libudev development library ("libudev-devel")
could now be a part of the systemd development library ("systemd-devel")
because of the udev + systemd merge.
Alasdair Kergon [Wed, 16 May 2012 12:50:14 +0000 (12:50 +0000)]
Re-enable partial activation of non-thin LVs until it can be fixed. (2.02.90)
- The test should be checking the LV as a whole, not just individual segments.
Alasdair Kergon [Fri, 11 May 2012 22:19:12 +0000 (22:19 +0000)]
Fix the allocation policy loop so that, when --alloc cling is specified but
no tags are defined, it stops at cling instead of continuing with later
policies it shouldn't be using.
Zdenek Kabelac [Wed, 9 May 2012 12:17:06 +0000 (12:17 +0000)]
Initial support for lvconvert for thin pool volumes.
Support has many limitations and lots of FIXMEs inside;
however, it already makes usable the initial task where the user creates
separate LVs for thin pool data and thin metadata, so let's enable
it for testing.
Easiest API:
lvconvert --chunksize XX --thinpool data_lv metadata_lv
More functionality extensions will follow up.
TODO: Code needs some rework since a lot of the same code is being copied.
Fix up-convert when mirror activation is controlled by volume_list and tags.
When mirrors are up-converted, a transient mirror layer is put in so that
only the new devices are sync'ed. That transient layer must carry the tags
of the original mirror LV, otherwise it will fail to activate when activation
is regulated by lvm.conf:activation/volume_list. The conversion would then
fail.
The fix is to do exactly the same thing that is being done for linear ->
mirror converting (lib/metadata/mirror.c:_init_mirror_log()). We copy the
tags temporarily for the new LV and remove them after the activation.
Snapshots of RAID logical volumes are allowed (including "raid1"). However,
snapshots of "mirror" logical volumes have been disallowed due to unsolvable
issues inherent to the design. The fact that mirroring (dm-raid1.c) must
stop all I/O as the result of a failure and wait for userspace intervention
can lead to a circular dependency if userspace is simultaneously waiting for
snapshots (on mirrors) to make an I/O update before proceeding.
Various snapshot on mirror tests have been removed as a result.
Fix bug in cmirror that caused incorrect status info to print on some nodes.
Looking at the code in cmirrord/local.c, we can see the various different
request types handled in different ways. Some information that is non-changing
does not need to go around the cluster and can be short-circuited. For
example, once the cluster mirror is in-sync, it is pointless to continue
sending that query around the cluster. We can save network bandwidth and reply
directly back to the kernel. When it comes to status information, there are
two types: 'TABLE' and 'INFO'. The 'TABLE' information never changes and
belongs to the group of requests that can be safely short-circuited. The
'INFO' information can change - and will change if a device fails. Thus it
cannot be short-circuited, but this is exactly what was found. The 'INFO'
information request was being short-circuited and therefore never reporting the
failure condition to anyone other than the "server" that experienced it
directly.
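The short-circuit decision described above can be sketched in Python as
follows. The enum members and the function name are hypothetical and do not
come from cmirrord/local.c; STATUS_INFO stands for the status request whose
content can change when a device fails.

```python
from enum import Enum, auto


class Request(Enum):
    IS_IN_SYNC = auto()    # "is the mirror in sync?" query
    STATUS_TABLE = auto()  # status that never changes
    STATUS_INFO = auto()   # status that changes on device failure


def can_short_circuit(req: Request, mirror_in_sync: bool) -> bool:
    """Return True if the request may be answered without the cluster."""
    if req is Request.STATUS_TABLE:
        return True
    if req is Request.IS_IN_SYNC:
        # Only pointless to ask the cluster once the mirror is in sync.
        return mirror_in_sync
    # Everything else, including changeable status, must go around.
    return False
```

The bug corresponds to the changeable status request being treated like
STATUS_TABLE, so a failure was visible only on the node that experienced it.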
Allow a subset of failed devices to be replaced in RAID LVs.
If two devices in an array failed, it was previously impossible to replace
just one of them. This patch allows for the replacement of some, but perhaps
not all, failed devices.
Prevent resume from creating error devices that already exist from suspend.
Thanks to agk for providing the patch that prevents resume from attempting
(and then failing) to create error devices which already exist, having been
created by a corresponding suspend operation.
Occasionally, a dmeventd restart had problems obtaining the pid lockfile.
So this patch tries to send several more kill messages until the daemon
has killed itself and no longer responds.
With this small loop the restart seems to work reliably,
although the loop size and usleep value are just arbitrarily picked for now.
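The retry loop has roughly the following shape. This is a generic Python
sketch rather than the dmeventd code: stop_daemon and send_die are
hypothetical names, send_die stands for "send the kill message and report
whether the daemon still answered", and the attempt count and delay are
arbitrary placeholders.

```python
import time
from typing import Callable


def stop_daemon(send_die: Callable[[], bool],
                attempts: int = 10, delay_s: float = 0.01) -> bool:
    """Keep asking the daemon to exit until it stops answering.

    Returns True once the daemon is silent, False if it still answers
    after the last attempt.
    """
    for _ in range(attempts):
        if not send_die():
            return True  # no response any more: the daemon is gone
        time.sleep(delay_s)  # give it a moment before asking again
    return False
```

Only after stop_daemon() returns True would the restart go on to take the
pid lockfile and start the new daemon.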
Peter Rajnoha [Tue, 24 Apr 2012 08:00:55 +0000 (08:00 +0000)]
Rename (Blk)DevNames header to (Blk)DevNamesUsed in dmsetup info -c output.
Just to make it clearer since there is the "dmsetup info -c -o blkdevname"
as well that shows the "block device name for this mapping", having a
"BlkDevName" header on output.
It's a bit confusing then if the "dmsetup info -c -o devs_used,blkdevs_used"
is named with a plural "DevNames"/"BlkDevNames" but at the same time having
a totally different meaning than the singular form "BlkDevName".
Unlike 'mirror' segtype, 'raid1' should perform flush on suspend.
The 'mirror' segtype and 'raid1' segtype both set the 'MIRRORED' flag.
However, due to differences in the way these device-mapper targets behave,
'mirror' must be suspended with the 'noflush' option, whereas 'raid1' does
not have to be.
This patch ensures that, when the 'MIRRORED' flag is checked to see if
'noflush' is needed, it does not also set it for 'raid1' by mistake.
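A small Python sketch of the resulting decision, assuming the segment type
is available by name. The flag value and the function name are hypothetical;
only the 'MIRRORED' flag and the 'mirror' and 'raid1' segtype names come
from the text.

```python
MIRRORED = 0x1  # hypothetical bit value for the 'MIRRORED' status flag


def needs_noflush(segtype: str, status: int) -> bool:
    """Return True if the LV must be suspended with 'noflush'."""
    # Both 'mirror' and 'raid1' set MIRRORED, so the flag alone is not
    # enough: 'raid1' has to be excluded explicitly.
    return bool(status & MIRRORED) and segtype != "raid1"
```

Here needs_noflush("mirror", MIRRORED) is true, while
needs_noflush("raid1", MIRRORED) is false, so 'raid1' is flushed on suspend.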