Jonathan Brassow [Wed, 10 Oct 2012 16:47:04 +0000 (11:47 -0500)]
TEST: Add lvchange-partial.sh and vgchange-partial.sh to the test suite
Commit 3501f17fd0fcec2a1fbb8aeecf228e86ee022d99 enables a limited set
of metadata updates for partial LV/VGs when issuing lvchange or vgchange.
These tests verify those changes operate as intended.
Jonathan Brassow [Wed, 10 Oct 2012 16:33:10 +0000 (11:33 -0500)]
[lv|vg]change: Allow limited metadata changes when PVs are missing
A while back, the behavior of LVM changed from allowing metadata changes
when PVs were missing to not allowing changes. Until recently, this
change was tolerated by HA-LVM by forcing a 'vgreduce --removemissing'
before trying (again) to add tags to an LV and then activate it. LVM
mirroring requires that failed devices are removed anyway, so this was
largely harmless. However, RAID LVs do not require devices to be removed
from the array in order to be activated. In fact, in an HA-LVM
environment this would be very undesirable. Device failures in such an
environment can often be transient and it would be much better to restore
the device to the array than synchronize an entirely new device.
There are two methods that can be used to set up an HA-LVM environment:
"clvm" or "tagging". For RAID LVs, "clvm" is out of the question because
RAID LVs are not supported in clustered VGs - not even in an exclusively
activated manner. That leaves "tagging". HA-LVM uses tagging - coupled
with 'volume_list' - to ensure that only one machine can have an LV active
at a time. If updates are not allowed when a PV is missing, it is
impossible to add or remove tags to allow for activation. This removes
one of the most basic functionalities of HA-LVM - site redundancy. If
mirroring or RAID is used to replicate the storage in two data centers
and one of them goes down, a server and a storage device are lost. When
the service fails over to the alternate site, the VG will be "partial".
If no tag can be added to the VG/LV, the RAID device cannot be
activated.
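The interaction between tags and 'volume_list' described above can be
sketched roughly as follows. This is only an illustration, not LVM code:
the function name, the node tags and the VG/LV names are hypothetical, and
the matching is simplified to "vg", "vg/lv" and "@tag" entries (the "@*"
form is left out).

```python
def activation_allowed(vg, lv, tags, volume_list):
    """Return True if any volume_list entry matches the LV.

    Simplified model: an entry is either "vg", "vg/lv", or "@tag".
    """
    for entry in volume_list:
        if entry.startswith("@"):
            # "@tag" matches when the VG/LV carries that tag.
            if entry[1:] in tags:
                return True
        elif entry == vg or entry == f"{vg}/{lv}":
            return True
    return False


# Hypothetical per-node volume_list settings: each node may only
# activate volumes that carry its own tag.
NODE_A = ["rootvg", "@node-a"]
NODE_B = ["rootvg", "@node-b"]

# While node A owns the service, the LV carries node A's tag.
tags = {"node-a"}
print(activation_allowed("havg", "data", tags, NODE_A))
print(activation_allowed("havg", "data", tags, NODE_B))

# Failover: node B has to swap the tag before it can activate the LV.
# That swap is a metadata update, which is what fails on a partial VG.
tags = {"node-b"}
print(activation_allowed("havg", "data", tags, NODE_B))
```

In this model, node B cannot activate the LV until the tag is changed, so
a VG that refuses tag updates while a PV is missing can never fail over.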
The solution is to allow vgchange and lvchange to alter the LVM metadata
for a limited set of options - --[add|del]tag included. The allowed
options are those that neither change the DM kernel target (as --resync
would) nor alter the structure of the LV (as allocation or conversion
would).
Peter Rajnoha [Wed, 10 Oct 2012 15:03:47 +0000 (17:03 +0200)]
dmsetup: also apply 'mangle' command for UUIDs
Unlike names, UUIDs can't be changed once they are created
for a device. The 'mangle' command will just issue an error message
about a need for manual intervention in this case - reactivating the
device (remove + create) does the job as the default mangling mode
used is "auto" and that will assign a correctly mangled form to the UUID.
Peter Rajnoha [Wed, 10 Oct 2012 14:59:47 +0000 (16:59 +0200)]
libdm: add dm_task_get_uuid_mangled/unmangled
Just like we already have existing mangling support for
device-mapper names, we need exactly the same for device-mapper
UUIDs as their character whitelist is wider than what udev supports.
In case udev is used to create entries in /dev based on UUIDs
and these UUIDs contain characters not supported by udev,
we'll end up with incorrect /dev content for such devices.
So we need to mangle them to a form that is supported by udev.
The mangling used for UUIDs follows the mangling used for names
(that is already supported and used throughout). That means,
setting the name mangling mode via dm_set_name_mangling_mode
affects mangling used for UUIDs in exactly the same manner.
A new and separate dm_set_uuid_mangling_mode function would be
redundant, so the existing interface is reused.
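As a rough illustration of what mangling means here, the sketch below
hex-escapes characters outside a udev-safe set. It is not the libdm
implementation: the whitelist, the \xNN escape form and the simplified
handling of the "none", "auto" and "hex" modes are assumptions made for
the example, and the function name is hypothetical.

```python
import re

# Assumed udev-safe character set (alphanumerics plus "#+-.:=@_").
SAFE = frozenset(
    b"0123456789"
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"#+-.:=@_"
)
# An already mangled character, e.g. \x20 for a space.
HEX_ESCAPE = re.compile(rb"\\x[0-9a-fA-F]{2}")


def mangle(text, mode="auto"):
    """Return text with unsupported bytes replaced by \\xNN escapes.

    "none" leaves the text alone, "hex" escapes every unsupported byte,
    and "auto" additionally keeps existing \\xNN sequences untouched so
    that a mangled string is not mangled twice.
    """
    if mode not in ("none", "auto", "hex"):
        raise ValueError("unknown mangling mode: %r" % (mode,))
    if mode == "none":
        return text
    raw = text.encode("utf-8")
    out = []
    i = 0
    while i < len(raw):
        if mode == "auto" and HEX_ESCAPE.match(raw, i):
            out.append(raw[i:i + 4].decode("ascii"))
            i += 4
            continue
        byte = raw[i]
        out.append(chr(byte) if byte in SAFE else "\\x%02x" % byte)
        i += 1
    return "".join(out)


# The same mode setting drives both names and UUIDs in this sketch.
print(mangle("my vol"))
print(mangle("LVM-abc def", mode="hex"))
```

Applying one function and one mode to both names and UUIDs mirrors the
choice above to reuse the existing mode setting rather than add a second.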
Peter Rajnoha [Mon, 8 Oct 2012 14:49:54 +0000 (16:49 +0200)]
systemd: remove ExecStartPost from lvm2-lvmetad.service.
The ExecStartPost with pvscan --cache in lvm2-lvmetad.service
is not needed now as this is called transparently within the
first LVM command that queries lvmetad.
Zdenek Kabelac [Mon, 8 Oct 2012 12:46:44 +0000 (14:46 +0200)]
test: move raid test to separate tests
Revert changes to the original lvcreate-large test and use separate
test scripts for raid - so they can be properly skipped when the
kernel doesn't support raid targets.
Zdenek Kabelac [Fri, 5 Oct 2012 09:06:08 +0000 (11:06 +0200)]
lvconvert: disable conversion of thin to mirrors
For now this conversion is not supported, thus disabled.
The only supported conversion for now is to create mirrored thin pools
from mirrored devices.
RAID: Do not allow RAID LVs in a cluster volume group.
It would be possible to activate a RAID LV exclusively in a cluster
volume group, but for now we do not allow RAID LVs to exist in a
clustered volume group at all. This has two components:
1) Do not allow RAID LVs to be created in a clustered VG
2) Do not allow changing a VG from single-machine to clustered
if there are RAID LVs present.
Zdenek Kabelac [Mon, 14 May 2012 11:57:30 +0000 (13:57 +0200)]
thin: lvconvert
Update code for lvconvert.
Change the lvconvert user interface a bit - now we require 2 specifiers:
--thinpool takes the LV name for the data device (and makes the name)
--poolmetadata takes the LV name for the metadata device.
Fix typo in thin help text -z -> -Z.
The new --discards flag for thin pools is also supported.
The patch clears the flag if the thin pool is stacked over a mirror.
Since a thin pool could be used to stack a device over mirrors,
it needs to properly resume mirrors with corelog, which are otherwise
unconditionally skipped (for pvmove functionality).
Jonathan Brassow [Thu, 27 Sep 2012 21:51:22 +0000 (16:51 -0500)]
RAID: Fix problems with creating, extending and converting large RAID LVs
MD's bitmaps can handle 2^21 regions at most. The RAID code has always
used a region_size of 1024 sectors. That means the size of a RAID LV was
limited to 1TiB. (The user can adjust the region_size when creating a
RAID LV, which can affect the maximum size.) Thus, creating, extending or
converting to a RAID LV greater than 1TiB would result in a failure to
load the new device-mapper table.
Again, the size of the RAID LV is not limited by how much space is allocated
for the metadata area, but by the limitations of the MD bitmap. Therefore,
we must adjust the 'region_size' to ensure that the number of regions does
not exceed the limit. I've added code to do this when extending a RAID LV
(which covers 'create' and 'extend' operations) and when up-converting -
specifically from linear to RAID1.
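The arithmetic behind the 1TiB limit, and one way the region_size could
be adjusted, can be sketched as below. This is an illustration only: the
function name is hypothetical, 512-byte sectors are assumed, and doubling
region_size until the LV fits is an assumed strategy, not necessarily
what the patch does.

```python
MAX_REGIONS = 2 ** 21        # MD bitmap limit stated above
DEFAULT_REGION_SIZE = 1024   # sectors, the value the RAID code has used
SECTOR_BYTES = 512           # assumption: 512-byte sectors


def adjusted_region_size(lv_sectors, region_size=DEFAULT_REGION_SIZE):
    """Double region_size until the LV needs at most MAX_REGIONS regions."""
    # -(-a // b) is ceiling division: a partial region still needs a bit.
    while -(-lv_sectors // region_size) > MAX_REGIONS:
        region_size *= 2
    return region_size


# With the default region_size the limit is 2^21 * 1024 * 512 bytes = 1TiB.
limit_bytes = MAX_REGIONS * DEFAULT_REGION_SIZE * SECTOR_BYTES
print(limit_bytes == 2 ** 40)

TIB_SECTORS = 2 ** 40 // SECTOR_BYTES
print(adjusted_region_size(TIB_SECTORS))        # exactly 1TiB still fits
print(adjusted_region_size(TIB_SECTORS + 1))    # one sector more does not
print(adjusted_region_size(4 * TIB_SECTORS))
```

Doubling keeps region_size a power of two; under these assumptions a 4TiB
LV ends up with a region_size of 4096 sectors.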
Petr Rockai [Sat, 11 Aug 2012 08:37:28 +0000 (10:37 +0200)]
lib/cache/lvmetad: Refactor to use dm_config_tree in requests.
We were using daemon_send_simple until now, but it is no longer adequate, since
we need to manipulate requests in a generic way (adding a validity token to each
request), and the tree-based request interface is much more suitable for this.
Petr Rockai [Sat, 11 Aug 2012 08:33:53 +0000 (10:33 +0200)]
libdaemon: Extend and refactor APIs.
- move common dm_config_tree manipulation functions from lvmetad-core to
daemon-shared
- add config-tree-based request manipulation APIs to daemon-client
- factor out _v (va_list) variants of most variadic functions in libdaemon