Peter Rajnoha [Tue, 29 May 2012 08:09:10 +0000 (08:09 +0000)]
Remove unsupported udev_get_dev_path libudev call used for checking udev dir.
With latest changes in the udev, some deprecated functions were removed
from libudev amongst which there was the "udev_get_dev_path" function
we used to compare a device directory used in udev and directore set in
libdevmapper. The "/dev" is hardcoded in udev now (udev version >= 183).
Amongst other changes and from packager's point of view, it's also
important to note that the libudev development library ("libudev-devel")
could now be a part of the systemd development library ("systemd-devel")
because of the udev + systemd merge.
Alasdair Kergon [Wed, 16 May 2012 12:50:14 +0000 (12:50 +0000)]
Re-enable partial activation of non-thin LVs until it can be fixed. (2.02.90)
- The test should be checking the LV as a whole, not just individual segments.
Alasdair Kergon [Fri, 11 May 2012 22:19:12 +0000 (22:19 +0000)]
Fix allocation policy loop so it doesn't continue beyond cling using later
policies it shouldn't be using when --alloc cling is specified but no tags
are defined.
Zdenek Kabelac [Wed, 9 May 2012 12:17:06 +0000 (12:17 +0000)]
Initial support for lvconvert for thin pool volumes.
Support has many limitations and lots of FIXMEs inside,
however it makes initial task when user creates a separate LV for
thin pool data and thin metadata already usable, so let's enable
it for testing.
Easiest API:
lvconvert --chunksize XX --thinpool data_lv metadata_lv
More functionality extensions will follow up.
TODO: Code needs some rework since a lot of same code is getting copied.
Fix up-convert when mirror activation is controled by volume_list and tags.
When mirrors are up-converted, a transient mirror layer is put in so that
only the new devices are sync'ed. That transient layer must carry the tags
of the original mirror LV, otherwise it will fail to activate when activation
is regulated by lvm.conf:activation/volume_list. The conversion would then
fail.
The fix is to do exactly the same thing that is being done for linear ->
mirror converting (lib/metadata/mirror.c:_init_mirror_log()). We copy the
tags temporarily for the new LV and remove them after the activation.
Snapshots of RAID logical volumes are allowed (including "raid1"). However,
snapshots of "mirror" logical volumes has been disallowed due to unsolvable
issues inherent to the design. The fact that mirroring (dm-raid1.c) must
stop all I/O as the result of a failure and wait for userspace intervention
can lead to a circular dependency if userspace is simultaneously waiting for
snapshots (on mirrors) to make an I/O update before proceeding.
Various snapshot on mirror tests have been removed as a result.
Fix bug in cmirror that caused incorrect status info to print on some nodes.
Looking at the code in cmirrord/local.c, we can see the various different
request types handled in different ways. Some information that is non-changing
does not need to go around the cluster and can be short-circuited. For
example, once the cluster mirror is in-sync, it is pointless to continue
sending that query around the cluster. We can save network bandwidth and reply
directly back to the kernel. When it comes to status information, there are
two types 'TABLE' and 'INFO'. The 'TABLE' information never changes and
belongs to the group of requests that can be safely short-circuited. The
'STATUS' information can change - and will change if a device fails. Thus it
cannot be short-circuited, but this is exactly what was found. The 'STATUS'
information request was being short-circuited and therefore never reporting the
failure condition to anyone other than the "server" that experienced it
directly.
Allow a subset of failed devices to be replaced in RAID LVs.
If two devices in an array failed, it was previously impossible to replace
just one of them. This patch allows for the replacement of some, but perhaps
not all, failed devices.
Prevent resume from creating error devices that already exist from suspend.
Thanks to agk for providing the patch that prevents resume from attempting
(and then failing) to create error devices which already exist; having been
created by a corresponding suspend operation.
In some occasional case dmevent restart was experiencing problems
with obtaining pid lockfile. So this patch tries to send several more kill
message until daemon kills itself so there is would reponse.
With this small loop the restart seems to work reliable,
although the loopsize and usleep are just randomly picked for now.
Peter Rajnoha [Tue, 24 Apr 2012 08:00:55 +0000 (08:00 +0000)]
Rename (Blk)DevNames header to (Blk)DevNamesUsed in dmsetup info -c output.
Just to make it clearer since there is the "dmsetup info -c -o blkdevname"
as well that shows the "block device name for this mapping", having a
"BlkDevName" header on output.
It's a bit confusing then if the "dmsetup info -c -o devs_used,blkdevs_used"
is named with a plural "DevNames"/"BlkDevNames" but at the same time having
a totally different meaning than the singular form "BlkDevName".
Unlike 'mirror' segtype, 'raid1' should perform flush on suspend.
The 'mirror' segtype and 'raid1' segtype both set the 'MIRRORED' flag.
However, due to differences in the way these device-mapper targets behave
'mirror' must be suspended with the 'noflush' option and 'raid1' does not
have to be.
This patch ensures that when the 'MIRRORED' flag is checked to see if
'noflush' is needed that it does not also set it for 'raid1' by mistake.
Remove 'up' from rounding message that sometimes rounds down.
Detect reduction of 0 after rounding for stripes and avoid warning of potential data loss.
Fix inability to split RAID1 image while specifying a particular PV.
The logic for resuming the original and newly split LVs was not properly
done to handle situations where anything but the last device in the array
was split. It did not take into account the possible name collisions that
might occur when the original LV undergoes the shifting and renaming of its
sub-LVs.
When resizing thin pool - we need to use strip info from _tdata volume.
In future more generic solution will be necessary once we start to support
lvconvert (resize of stacked devices and stay properly aligned).
For now we just allow striped or linear LV so this code will work.
Peter Rajnoha [Wed, 11 Apr 2012 09:12:02 +0000 (09:12 +0000)]
Change message severity to log_very_verbose for missing dev info in udev db.
Libudev does not provide transactions when querying udev database - once we
get the list of block devices (devices/obtain_device_list_from_udev=1) and
we iterate over the list to get more detailed information about device node
and symlink names used etc., the device could be removed just in between we
get the list and put a query for more info. In this case, libudev returns
NULL value as the device does not exist anymore.
Recently, we've added a warning message to reveal such situations. However,
this could be misleading if the device is not related to the LVM action
we're just processing - the non-related block device could be removed in
parallel and this is not an error but a possible and normal operation.
(N.B. This "missing info" should not happen when devices are related to
the LVM action we're just processing since all such processing should be
synchronized with udev and the udev db must always be in consistent state
after the sync point. But we can't filter this situation out from others,
non-related devices, so we have to lower the message verbosity here for a
general solution.)
RAID LVs could not handle a down-convert if a device other than the last one
in the array was specified for removal. This change addresses that (bz806111).
Commit ID 46a75dedb4f6aa815a804f27cafbd3fd16a62011 consolidated code from the
various dmeventd plug-ins into a new function called 'dmeventd_lvm2_command',
but the new function did not strip off the "_mlog" extentions that the
mirror plug-in had been doing. This created bug 794904 - failure to replace
devices in a redundant log.
The test suite did catch this scenario because it performs repair tests (mainly)
through the CLI and not dmeventd. It's also not easy to test because the test
itself will hang if the bug is encountered.
Add a couple new functions to gdbinit file and decode a couple lv->status flags
New functions:
- seg_pvs: Print a list of PVs in a seg_pvs list
- pv_dev_name: print name of a PV
Zdenek Kabelac [Fri, 30 Mar 2012 14:59:35 +0000 (14:59 +0000)]
Fix unlocking in error path of vgreduce
When vg_read fails, it internally unlocks VG if it's been locked,
so in error path we should skip unlock_vg for this case.
(user would see ugly internal warning)
devel/~ # lvconvert --splitmirrors 1 -n abc/splitted_one vg/mirrored_one
Please use a single volume group name ("vg" or "abc")
Run `lvconvert --help' for more information.