Commit bf2741376d47411994d4065863acab8e405ff5c7 started to use
lv_is_active() instead of call for lv_info & info.exists so
we cover also cluster activated devices.
For snapshost the conversion was not correct and introduced
regression by blocking creation of snapshot of inactive LV.
Fix it by assigning lv_is_active() directly.
Note: we still have minor issue to fix - to make
lv_is_???? function able to return error states since
lv_info() may fail.
Zdenek Kabelac [Thu, 15 Nov 2012 13:48:32 +0000 (14:48 +0100)]
thin: add common pool functions
Move common functions for lvcreate and lvconvert.
get_pool_params() - read thin pool args.
update_pool_params() - updates/validates some thin args.
It is getting complicated and even few more things will be
implemented, so to avoid reimplementing things differently
in lvcreate and lvconvert code has been splitted
into 2 common functions that allow some future extension.
Jonathan Brassow [Wed, 14 Nov 2012 20:58:47 +0000 (14:58 -0600)]
mirror: Mirrored log should be fixed before mirror when double fault occurs
This patch is intended to fix bug 825323 - FS turns read-only during a double
fault of a mirror leg and mirrored log's leg at the same time. It only
affects a 2-way mirror with a mirrored log. 3+-way mirrors and mirrors
without a mirrored log are not affected.
The problem resulted from the fact that the top level mirror was not
using 'noflush' when suspending before its "down-convert". When a
mirror image fails, the bios are queue until a suspend is recieved. If
it is a 'noflush' suspend, the bios can be safely requeued in the DM
core. If 'noflush' is not used, the bios must be pushed through the
target and if a device is failed for a mirror, that means issuing an
error. When an error is received by a file system, it results in it
turning read-only (depending on the FS).
Part of the problem was is due to the nature of the stacking involved in
using a mirror as a mirror's log. When an image in each fail, the top
level mirror stalls because it is waiting for a log flush. The other
stalls waiting for corrective action. When the repair command is issued,
the entire stacked arrangement is collapsed to a linear LV. The log
flush then fails (somewhat uncleanly) and the top-level mirror is suspended
without 'noflush' because it is a linear device.
This patch allows the log to be repaired first, which in turn allows the
top-level mirror's log flush to complete cleanly. The top-level mirror
is then secondarily reduced to a linear device - at which time this mirror
is suspended properly with 'noflush'.
Peter Rajnoha [Thu, 1 Nov 2012 12:33:49 +0000 (13:33 +0100)]
systemd: do not remove lvm2-activation.service
Fix previous commit 360c569ce8f0bfe936d59ca91de2716958550524.
Remove only fedora-storage-init/fedora-storage-init-late.service, but
not lvm2-activation.service.
fedora-storage-init.service fedora-storage-init-late.service
Peter Rajnoha [Tue, 30 Oct 2012 19:36:49 +0000 (20:36 +0100)]
systemd: various updates and fixes
Don't use lvmetad in lvm2-monitor.service ExecStop to avoid a systemd issue.
- a systemd design issue while processing dependencies
with socket-based activation that ends up with a hang
- https://bugzilla.redhat.com/show_bug.cgi?id=843587
(also tracker bug https://bugzilla.redhat.com/show_bug.cgi?id=871527)
- not using lvmetad in this case is just a workaround, once the bug
above is resolved, we should enable the lvmetad in that specific case
Remove dependency on fedora-storage-init.service in lvm2 systemd units.
- fedora-storage-init.service and fedora-storage-init-late.service is
going to be separated into respective units that belong to each block
device subsystem:
- mpath + mdraid activated via udev solely
- dmraid with its own dmraid-activation.service unit
- lvm2 with the lvm2-activation-generator to generate the
activation units runtime if lvmetad disabled
(global/use_lvmetad=0 set in lvm.conf) and activation done
via udev+lvmetad if lvmetad enabled (global/use_lvmetad=1 set
in lvm.conf)
Depend on lvm2-lvmetad.socket in lvm2-monitor.service systemd unit.
- as lvm2-monitor uses lvmetad if lvmetad is enabled
Tony Asleson [Thu, 25 Oct 2012 22:31:11 +0000 (17:31 -0500)]
python-lvm: Memory leaks & seg. fault fixes
Issues found (thus far) in unit test developemnt for python bindings.
Added Py_DECREF(ptr) in liblvm_lvm_vg_open & liblvm_lvm_vg_create
in error paths so that we correctly clean up memory.
Added a call to lvm_vg_close when we remove a vg. The code was
clearing out the vg pointer which prevented us from actually
calling lvm_vg_close in the close path.
liblvm_lvm_vg_create_lv_linear was not initializing
lvobj->parent_vgobj and if lvm_vg_create_lv_linear failed
we went through liblvm_lv_dealloc on clean up and tried to
Py_DECREF an invalid pointer.
Jonathan Brassow [Thu, 25 Oct 2012 05:42:45 +0000 (00:42 -0500)]
mirror: Avoid reading mirrors with failed devices in mirrored log
Commit 9fd7ac7d035f0b2f8dcc3cb19935eb181816bd76 did not handle mirrors
that contained mirrored logs. This is because the status line of the
mirror does not give an indication of the health of the mirrored log,
as you can see here:
[root@bp-01 lvm2]# dmsetup status vg-lv vg-lv_mlog
vg-lv: 0 409600 mirror 2 253:6 253:7 400/400 1 AA 3 disk 253:5 A
vg-lv_mlog: 0 8192 mirror 2 253:3 253:4 7/8 1 AD 1 core
Thus, the possibility for LVM commands to hang still persists when mirror
have mirrored logs. I discovered this while performing some testing that
does polling with 'pvs' while doing I/O and killing devices. The 'pvs'
managed to get between the mirrored log device failure and the attempt
by dmeventd to repair it. The result was a very nasty block in LVM
commands that is very difficult to remove - even for someone who knows
what is going on. Thus, it is absolutely essential that the log of a
mirror be recursively checked for mirror devices which may be failed
as well.
Despite what the code comment says in the aforementioned commit...
+ * _mirrored_transient_status(). FIXME: It is unable to handle mirrors
+ * with mirrored logs because it does not have a way to get the status of
+ * the mirror that forms the log, which could be blocked.
... it is possible to get the status of the log because the log device
major/minor is given to us by the status output of the top-level mirror.
We can use that to query the log device for any DM status and see if it
is a mirror that needs to be bypassed. This patch does just that and is
now able to avoid reading from mirrors that have failed devices in a
mirrored log.
Jonathan Brassow [Wed, 24 Oct 2012 04:10:33 +0000 (23:10 -0500)]
mirror: Avoid reading from mirrors that have failed devices
Addresses: rhbz855398 (Allow VGs to be built on cluster mirrors),
and other issues.
The LVM code attempts to avoid reading labels from devices that are
suspended to try to avoid situations that may cause the commands to
block indefinitely. When scanning devices, 'ignore_suspended_devices'
can be set so the code (lib/activate/dev_manager.c:device_is_usable())
checks any DM devices it finds and avoids them if they are suspended.
The mirror target has an additional mechanism that can cause I/O to
be blocked. If a device in a mirror fails, all I/O will be blocked
by the kernel until a new table (a linear target or a mirror with
replacement devices) is loaded. The mirror indicates that this condition
has happened by marking a 'D' for the faulty device in its status
output. This condition must also be checked by 'device_is_usable()' to
avoid the possibility of blocking LVM commands indefinitely due to an
attempt to read the blocked mirror for labels.
Until now, mirrors were avoided if the 'ignore_suspended_devices'
condition was set. This check seemed to suggest, "if we are concerned
about suspended devices, then let's ignore mirrors altogether just
in case". This is insufficient and doesn't solve any problems. All
devices that are suspended are already avoided if
'ignore_suspended_devices' is set; and if a mirror is blocking because
of an error condition, it will block the LVM command regardless of the
setting of that variable.
Rather than avoiding mirrors whenever 'ignore_suspended_devices' is
set, this patch causes mirrors to be avoided whenever they are blocking
due to an error. (As mentioned above, the case where a DM device is
suspended is already covered.) This solves a number of issues that weren't
handled before. For example, pvcreate (or any command that does a
pv_read or vg_read, which eventually call device_is_usable()) will be
protected from blocked mirrors regardless of how
'ignore_suspended_devices' is set. Additionally, a mirror that is
neither suspended nor blocking is /allowed/ to be read regardless
of how 'ignore_suspended_devices' is set. (The latter point being the
source of the fix for rhbz855398.)
Jonathan Brassow [Wed, 24 Oct 2012 02:19:27 +0000 (21:19 -0500)]
RAID: Make RAID 4/5/6 display sync status under heading s/Copy%/Cpy%Sync
The heading 'Copy%' is specific to PVMOVE volumes, but can be generalized
to apply to LVM mirrors also. It is a bit awkward to use 'Copy%' for
RAID 4/5/6, however - 'Sync%' would be more appropriate. This is why
RAID 4/5/6 have not displayed their sync status by any means available to
'lvs' yet.
Jonathan Brassow [Wed, 24 Oct 2012 01:33:54 +0000 (20:33 -0500)]
mirror/raid: Move 'copy_percent' to common code (mirror.c -> lv_manip.c)
The 'copy_percent' function takes the 'extents_copied' field from each
segment in an LV to create the numerator for the ratio that is to
become the copy_percent. (Otherwise known as the 'sync' percent for
non-pvmove uses, like mirror LVs and RAID LVs.) This function safely
works on RAID - not just mirrors - so it is better to have it in
lv_manip.c rather than mirror.c.
There's a lot of different functions that do a lot of different things
in lv_manip.c, so I placed the function near a function in lv_manip.c
that it was close to in metadata-exported.h. Different placement in the
file or a different name for the function may be useful.
Tony Asleson [Tue, 23 Oct 2012 19:50:14 +0000 (14:50 -0500)]
python-lvm: seg. fault in liblvm_lvm_percent_to_float
The first parameter needs to be present even if it isn't being
used, otherwise the one any only parameter we get is null
which causes PyArg_ParseTuple to seg. fault.
Peter Rajnoha [Tue, 23 Oct 2012 09:40:53 +0000 (11:40 +0200)]
udev_sync: cookie_set=1 on each dm_task_set_cookie call
cookie_set variable found in the struct dm_task should be always
set to 1 after dm_task_set_cookie_call, even if udev_sync is disabled
as the cookie itself carries synchronization informations *as well as*
extra flags to control other aspects of udev support.
For example, one could disable the synchronization itself, but still
direct the libdm code to disable library fallback via
DM_UDEV_DISABLE_LIBRARY_FALLBACK flag. These extra flags still need
to be carried out!
A concrete example:
$ dmsetup create test --table "0 1 zero" --noudevsync
This disables synchronization with udev. As the --verifyudev option is
not used, we don't want to do any corrections. In other words, we
need DM_UDEV_DISABLE_LIBRARY_FALLBACK flag to be used. However,
with --noudevsync this was not the case - the flag was ignored!
This patch fixes the case when noudevsync is used but there are still
some extra flags passed within the cookie flag part. The synchronization
part of the cookie stays zero (which is ok as dm_udev_wait call on such a
cookie is simply a NOOP).
Andy Grover [Wed, 17 Oct 2012 19:55:25 +0000 (12:55 -0700)]
python-lvm: Implement proper refcounting for parent objects
Our object nesting:
lib -> VG -> LV -> lvseg
-> PV -> pvseg
Implement refcounting and checks to ensure parent objects are not
dealloced before their children. Also ensure calls to self or child's
methods are handled cleanly for objects that have been closed or removed.
Ensure all functions that are object methods have a first parameter named
'self', for consistency
Rename local vars that reference a Python object to '*obj', in order to
differentiate from liblvm handles
Andy Grover [Mon, 15 Oct 2012 20:26:01 +0000 (13:26 -0700)]
python-lvm: Remove liblvm object
Instead of requiring users to create a liblvm object, and then calling
methods on it, the module acquires a liblvm handle as part of
initialization. This makes it impossible to instantiate a liblvm object
with a different systemdir, but there is an alternate envvar method for
that obscure use case.
Jonathan Brassow [Mon, 15 Oct 2012 20:43:15 +0000 (15:43 -0500)]
TEST: Re-add testing of lvconvert-raid for kernels < 3.2
I'm not sure what 'BUG's were being encountered when the restriction
to limit lvconvert-raid.sh tests to kernels > 3.2 was added. I do know
that there were BUG's that could be triggered when testing snapshots and
some of the earliest DM RAID available in the kernel. I've taken out
the 3.2 kernel restriction and added a dm-raid >= 1.2 restriction instead.
This will allow the test to run on patched production kernels.
Jonathan Brassow [Mon, 15 Oct 2012 20:09:05 +0000 (15:09 -0500)]
Clean-up: Adjust message to be clearer on action taken and why
A message is printed when the region_size of a RAID LV is adjusted
to allow for large (> ~1TB) LVs. The message wasn't very clear.
Hopefully, this is better.
Try to bring the lvmetad usage text and man page closer to the code.
There seem to be 3 useful ways to use -d with lvmetad at the moment:
-d all
-d wire
-d debug
(They can also be comma-separated like -d wire,debug.)
Prior to the last release, -d, -dd and -ddd were supported.
Fail if an unrecognised debug arg is supplied on the command line.
Change -V to report the same version as the lvm binary: previously it
just reported version 0.
Peter Rajnoha [Fri, 12 Oct 2012 12:37:57 +0000 (14:37 +0200)]
scripts: introduce blkdeactivate
blkdeactivate - utility to deactivate block devices
Traverses the tree of block devices and tries to deactivate them.
Currently, it supports device-mapper-based devices together with LVM.
See man/blkdeactivate.8 for more info.
It is targeted for use during shutdown to properly deactivate the
whole block device stack - systemd and init scripts are provided as
well. However, it might be used directly on command line too.
Please, see the commentary at the top of the blkdeactivate script
for dependencies and versions of other utilities required.
Peter Rajnoha [Fri, 12 Oct 2012 12:21:25 +0000 (14:21 +0200)]
systemd: add deps to order units more properly
lvm2-activation-early.service (generated by activation generator) should
be ordered before cryptsetup.target.
lvm2-monitor.service should be ordered after lvm2-activation.service,
if used. The lvm2-activation.service will replace fedora-storage-init.service
and fedora-storage-init-late.service in the end, but let's have it
prepared now.
Peter Rajnoha [Fri, 12 Oct 2012 09:53:04 +0000 (11:53 +0200)]
libdm: remove dm dev without error even with malformed UUID
On each ioctl return, the device UUID is decoded from \xNN format.
If the UUID of the device being *removed* is malformed (e.g. it
hasn't been corrected before), just remove it without any error
as the UUID is not needed anymore - the device is gone anyway.
Otherwise a misleading error message would be issued just after
the removal:
# dmsetup remove test
The UUID "a b" should be mangled but it contains blacklisted characters.
Command failed