Zdenek Kabelac [Tue, 23 Feb 2016 18:48:01 +0000 (19:48 +0100)]
coverity: check for info pointer existence
Since we already check in a few other places that 'info' is not NULL,
do the same for the remaining ones - however, if 'info' were NULL
there, it would more or less indicate an internal error.
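A minimal sketch of the pattern, with a stand-in struct and a hypothetical call site (the real checks live in the activation code):

#include <stdio.h>

struct lvinfo { int exists; };  /* stand-in for the real lvinfo */

/* Hypothetical call site: report a NULL 'info' as an internal
 * error instead of dereferencing it (what Coverity flagged). */
static int _lv_exists(const struct lvinfo *info)
{
        if (!info) {
                fprintf(stderr, "Internal error: missing info.\n");
                return 0;
        }
        return info->exists;
}

int main(void)
{
        struct lvinfo info = { 1 };
        printf("%d %d\n", _lv_exists(&info), _lv_exists(NULL));
        return 0;
}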
Zdenek Kabelac [Tue, 23 Feb 2016 11:15:42 +0000 (12:15 +0100)]
cache: enforce header check
Currently the 'zero' metadata header check is done only for thin-pool,
but let's always use it for cache as well - it's a relatively 'cheap'
way to detect read 'error' problems, since the thin/cache tools
currently do not work fast enough in this case.
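A minimal sketch of such a check, assuming the start of the metadata device has already been read into a buffer (names here are illustrative, not the actual lvm2 functions):

#include <stdio.h>

/* Illustrative check: an all-zero header means there is no
 * metadata at all - cheaper to catch here than by waiting for
 * the thin/cache tools to fail on it. */
static int _is_zero_header(const unsigned char *buf, size_t len)
{
        size_t i;

        for (i = 0; i < len; i++)
                if (buf[i])
                        return 0;

        return 1;
}

int main(void)
{
        unsigned char sector[512] = { 0 };

        if (_is_zero_header(sector, sizeof(sector)))
                fprintf(stderr, "Metadata device has an empty header.\n");

        return 0;
}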
Tony Asleson [Mon, 22 Feb 2016 20:28:11 +0000 (14:28 -0600)]
lvmdbusd: Add env variable to use session bus
export LVMDBUSD_SESSION=True to run on the session bus instead
of the system bus so that we can run the unit test without
installing the dbus conf file.
Tony Asleson [Mon, 22 Feb 2016 20:00:30 +0000 (14:00 -0600)]
lvmdbusd: background.py, fix stdout parse error
It appears that the output of lvconvert --merge can vary. The code
was blowing up while trying to parse a line of stdout to retrieve the
% complete, but the line did not have the needed format and an exception
was thrown. The uncaught exception caused the background thread to exit
without updating the job object, which caused the client to hang forever
waiting. Added a default exception handler to prevent unhandled
exceptions from causing hangs, and removed the skip_first_line parameter
as it's no longer needed - the code now checks whether a line can be
parsed before doing so.
David Teigland [Mon, 22 Feb 2016 15:32:39 +0000 (09:32 -0600)]
lvmlockd: invalidate name in lockspace struct after remove
After the lockspace has been successfully removed,
invalidate the name field in the lockspace struct.
The struct remains on the list of lockspaces until
it can be freed later. Until the struct is freed,
its stale name would otherwise prevent a new lockspace
from being created with the same name.
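A simplified sketch of the idea, with a stand-in struct (the real lockspace struct in lvmlockd has many more fields):

#include <stdio.h>
#include <string.h>

#define MAX_NAME 64

/* Simplified stand-in for the lockspace struct. */
struct lockspace {
        char name[MAX_NAME + 1];
        /* ... remains on the lockspaces list until freed ... */
};

/* Clear the name after a successful remove so lookups by name no
 * longer match this not-yet-freed struct, and a new lockspace with
 * the same name can be created immediately. */
static void invalidate_lockspace_name(struct lockspace *ls)
{
        memset(ls->name, 0, sizeof(ls->name));
}

int main(void)
{
        struct lockspace ls;

        snprintf(ls.name, sizeof(ls.name), "lvm_vg1");
        invalidate_lockspace_name(&ls);
        printf("name now: '%s'\n", ls.name);   /* empty */

        return 0;
}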
Zdenek Kabelac [Fri, 19 Feb 2016 10:18:41 +0000 (11:18 +0100)]
thin: fix update_pool_lv error path
When the update fails in suspend() (sending of messages
fails because the metadata space is full), call resume(),
so the locking sequence works properly for clustering.
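A hedged sketch of the fixed error path, with hypothetical stand-ins for the real suspend/resume calls:

#include <stdio.h>

/* Hypothetical stand-ins for the real calls. */
static int suspend_and_send_messages(void) { return 0; /* pool full */ }
static int resume_pool(void) { printf("resumed\n"); return 1; }

/* If the metadata update fails inside suspend, still resume so the
 * suspend/resume sequence stays balanced for clustered locking. */
static int update_pool(void)
{
        if (!suspend_and_send_messages()) {
                (void) resume_pool();   /* keep lock sequence intact */
                return 0;
        }

        return resume_pool();
}

int main(void)
{
        return update_pool() ? 0 : 1;
}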
Zdenek Kabelac [Mon, 15 Feb 2016 15:33:38 +0000 (16:33 +0100)]
libdm: thin status update
Fix parsing of the 'Fail' status (with a capital letter) for thin-pool.
Also add parsing of the 'Error' state for thin-pool.
Add a needs_check test for thin-pool.
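A rough sketch of what the parser now has to accept, assuming 'params' holds the dm status line for the thin-pool target (the real parsing lives in libdm's status code and is more involved):

#include <stdio.h>
#include <string.h>

struct thin_pool_status {
        unsigned fail:1;         /* kernel reported 'Fail'  */
        unsigned error:1;        /* kernel reported 'Error' */
        unsigned needs_check:1;  /* 'needs_check' flag set  */
};

/* The kernel capitalizes 'Fail' and 'Error', so match those exact
 * tokens; otherwise scan the flags at the end of the line for
 * 'needs_check'. */
static void parse_thin_pool_params(const char *params,
                                   struct thin_pool_status *s)
{
        memset(s, 0, sizeof(*s));

        if (!strcmp(params, "Fail"))
                s->fail = 1;
        else if (!strcmp(params, "Error"))
                s->error = 1;
        else if (strstr(params, "needs_check"))
                s->needs_check = 1;
}

int main(void)
{
        struct thin_pool_status s;

        parse_thin_pool_params("Fail", &s);
        printf("fail=%u error=%u needs_check=%u\n",
               s.fail, s.error, s.needs_check);

        return 0;
}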
Peter Rajnoha [Thu, 18 Feb 2016 13:30:14 +0000 (14:30 +0100)]
metadata: ask for confirmation before really initializing/removing PV that is marked as belonging to a VG
Ask for confirmation when using pvcreate/pvremove on a PV which is
marked as belonging to a VG, just like we do in the case of a PV which
belongs to a known VG:
$ pvcreate -ff /dev/sda
Really INITIALIZE physical volume "/dev/sda" that is marked as belonging to a VG [y/n]? n
/dev/sda: physical volume not initialized
$ pvremove -ff /dev/sda
Really WIPE LABELS from physical volume "/dev/sda" that is marked as belonging to a VG [y/n]? n
/dev/sda: physical volume label not removed
Before this patch (lv_snapshot_invalid and lv_merge_failed were not
switched to a numeric value where -1 represents the 'unknown' value):
$ lvs -o lv_name,lv_active_locally,lv_snapshot_invalid,lv_merge_failed vg/lvol0 --binary
  LV    ActLocal SnapInvalid MergeFailed
  lvol0        1     unknown     unknown
With this patch applied:
$ lvs -o lv_name,lv_active_locally,lv_snapshot_invalid,lv_merge_failed vg/lvol0 --binary
  LV    ActLocal SnapInvalid MergeFailed
  lvol0        1          -1          -1
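A minimal sketch of the tri-state mapping, under the assumption that report fields carry a 'known' flag (illustrative only, not the real reporting code):

#include <stdio.h>

/* Map a tri-state report field to its --binary value: 1, 0, or -1
 * when the state cannot be determined ('unknown'). */
static int binary_value(int known, int value)
{
        if (!known)
                return -1;

        return value ? 1 : 0;
}

int main(void)
{
        printf("%d\n", binary_value(1, 1));   /* 1  */
        printf("%d\n", binary_value(0, 0));   /* -1 */

        return 0;
}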
Peter Rajnoha [Mon, 15 Feb 2016 14:50:11 +0000 (15:50 +0100)]
pv: use pv->fmt to check for fake PVs, not pv->vg
pv->vg is not set yet during pvcreate processing. Use pv->fmt instead
to check for these fake PVs (all normal PVs have a format defined;
devices which are not PVs don't have this set).
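A sketch of the fixed test, with simplified types (the real struct physical_volume has many more fields):

#include <stddef.h>

struct format_type;   /* opaque, as in the metadata code */

struct physical_volume {
        const struct format_type *fmt;
        /* ... */
};

/* During pvcreate pv->vg is not set yet, but every real PV has a
 * format - so a NULL pv->fmt identifies the fake PVs that merely
 * represent non-PV devices. */
static int is_fake_pv(const struct physical_volume *pv)
{
        return !pv->fmt;
}

int main(void)
{
        struct physical_volume pv = { NULL };

        return is_fake_pv(&pv) ? 0 : 1;
}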
Peter Rajnoha [Mon, 15 Feb 2016 14:11:54 +0000 (15:11 +0100)]
toollib: skip PV if system ID is used and PV marked as used but metadata missing
If we know that a PV belongs to some VG and we're missing its metadata
(because the only PVs from that VG present in the system are the ones
without metadata areas), we should skip such a PV when processing
under a system ID.
This is because we know that the PV belongs to some VG, but we
really can't decide whether it matches the system ID unless the VG
metadata is present again.
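A hedged sketch of the skip rule, using hypothetical field names (is_used and vg_metadata_found are illustrative, not the real lvm2 names):

#include <stdio.h>

/* Illustrative fields - not the real lvm2 names. */
struct pv_state {
        int is_used;            /* PV_EXT_USED set in the PV header */
        int vg_metadata_found;  /* any metadata for its VG present? */
};

/* If the PV is flagged as used but none of its VG's metadata is
 * available, the system ID cannot be compared - skip the PV rather
 * than guess. */
static int skip_pv_under_system_id(const struct pv_state *pv)
{
        return pv->is_used && !pv->vg_metadata_found;
}

int main(void)
{
        struct pv_state pv = { 1, 0 };

        if (skip_pv_under_system_id(&pv))
                printf("Skipping PV: used, but VG metadata missing.\n");

        return 0;
}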
Peter Rajnoha [Mon, 15 Feb 2016 13:46:31 +0000 (14:46 +0100)]
pv: mark fake PVs as not used
Some of the PVs are not even orphan PVs - they're fake PVs - this can
happen if we're listing all devices with "pvs -a". Such PVs must not
be marked as used.
Peter Rajnoha [Thu, 26 Mar 2015 13:20:46 +0000 (14:20 +0100)]
backup: backup_restore_vg: register PVs that need writing via vg->pvs_to_write list
The backup_restore_vg function is used directly for restoring a VG
from backup. It's also used to do VG conversions from one metadata
format to another, which means vgconvert calls backup_restore_vg too.
When restoring a VG from backup, we need to rewrite/write PV headers,
as the PVs may have been orphans before and are now becoming part of
some VG - we need to write at least the PV_EXT_USED flag.
When using backup_restore_vg for vgconvert, we need to write a
completely new PV header in a different format.
Avoid the special "pv_write" call and handling that was used before
this patch in vgconvert (in the vgconvert_single function, to be more
precise) and reuse the existing internal interface to register a PV
header for writing (or rewriting) via the vg->pvs_to_write list
instead, like we do elsewhere in the code.
This patch also resolves a problem in which PV headers in the target
format were written in the vgconvert_single fn as orphans and the VG
metadata were added later on - this was actually a tiny hack.
We can't do this now - we need to write the PV as belonging
to a VG, because otherwise the PV_EXT_USED flag won't be written
properly (if the PV header is written as an orphan, PV_EXT_USED
is set to 0, of course, even though metadata are attached later).
So this patch removes this tiny inconsistency, which passed
just fine before because the PV header previously had no relation
to the VG. Now we have the PV_EXT_USED flag, which says
"the PV is used in some VG".
Peter Rajnoha [Thu, 19 Mar 2015 06:53:22 +0000 (07:53 +0100)]
metadata: _vg_read: check if PV_EXT_USED flag is set correctly for non-orphan PVs and do a repair if needed
The same check as we already do for orphan PVs, just the other way
round now: if a PV is surely part of some VG but does not have the
PV_EXT_USED flag set, repair it.
For example - /dev/sda here is in VG vg and is incorrectly not
marked as used by the PV_EXT_USED flag:
pvs --binary -o pv_ext_vsn,pv_in_use
  WARNING: Volume Group vg is not consistent.
  WARNING: Repairing Physical Volume /dev/sda that is in Volume Group vg but not marked as used.
  PV       VG Fmt  Attr PSize   PFree   ExtVsn PInUse
  /dev/sda vg lvm2 a--  124.00m 124.00m      2      1
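A simplified sketch of the repair pass, with illustrative types (the real code runs inside _vg_read and also schedules the PV header for rewriting):

#include <stdio.h>

#define PV_EXT_USED 0x1   /* the extension flag described above */

struct pv { const char *name; unsigned ext_flags; };

/* Every PV that is part of a VG must carry PV_EXT_USED - set the
 * flag when it is missing. */
static void repair_used_flags(const char *vgname, struct pv *pvs,
                              int n_pvs)
{
        int i;

        for (i = 0; i < n_pvs; i++) {
                if (pvs[i].ext_flags & PV_EXT_USED)
                        continue;
                fprintf(stderr, "WARNING: Repairing Physical Volume %s "
                        "that is in Volume Group %s but not marked "
                        "as used.\n", pvs[i].name, vgname);
                pvs[i].ext_flags |= PV_EXT_USED;
        }
}

int main(void)
{
        struct pv pvs[] = { { "/dev/sda", 0 } };

        repair_used_flags("vg", pvs, 1);
        return 0;
}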
Peter Rajnoha [Wed, 11 Mar 2015 15:18:42 +0000 (16:18 +0100)]
metadata: _vg_read: check if PV_EXT_USED flag is set correctly for orphan PVs and do a repair if needed
If we know that the PV is an orphan, meaning there's at least one MDA
on that PV which does not reference any VG, and at the same time the
PV_EXT_USED flag is set, we're certainly in an inconsistent state and
we need to fix this.
For example, such a situation can happen during vgremove/vgreduce if
we removed/reduced the VG but haven't written the PV headers yet,
because vgremove stopped abruptly for whatever reason just before
writing the new PV headers with the updated state, including the PV
extension flags (and so the PV_EXT_USED flag).
However, in case the PV has no MDAs at all, we can't double-check
whether PV_EXT_USED is correct or not - if such a PV is marked
as used, it's either:
- really used (but the other disks with MDAs are missing)
- or the error state described above has been hit
The user needs to overwrite the PV header directly if it's really
clear that the PV having no MDAs does not belong to any VG and at the
same time it's still marked as being in use (pvcreate -ff <dev_name>
will fix this).
For example - /dev/sda here has 1 MDA, is an orphan and is incorrectly
marked with the PV_EXT_USED flag:
Peter Rajnoha [Tue, 10 Mar 2015 10:25:14 +0000 (11:25 +0100)]
pv: check for the PV_EXT_USED flag and deny pvcreate/pvchange/pvremove/vgcreate on such PV (unless forced)
Make sure we won't use a PV that is already marked as used. Normally,
VG metadata would stop us from doing that, but we can run into a
situation where such metadata is missing because the PVs with MDAs
are missing and the only PVs left are the ones with 0 MDAs.
(/dev/sda in this example has 0 MDAs and belongs to a VG,
but the other PVs with MDAs are missing)
$ pvcreate /dev/sda
PV '/dev/sda' is marked as belonging to a VG but its metadata is missing.
Can't initialize PV '/dev/sda' without -ff.
$ pvchange -u /dev/sda
PV '/dev/sda' is marked as belonging to a VG but its metadata is missing.
Can't change PV '/dev/sda' without -ff.
Physical volume /dev/sda not changed
0 physical volumes changed / 1 physical volume not changed
$ pvremove /dev/sda
PV '/dev/sda' is marked as belonging to a VG but its metadata is missing.
(If you are certain you need pvremove, then confirm by using --force twice.)
$ vgcreate vg /dev/sda
Physical volume '/dev/sda' is marked as belonging to a VG but its metadata is missing.
Unable to add physical volume '/dev/sda' to volume group 'vg'.
We'll use this struct in subsequent patches for PVs which should
be rewritten, not just created. So rename struct pv_to_create to
struct pv_to_write for clarity.
The bug description: first we allocate memory for the
processing handle (at address 1), then we allocate some
memory on the same pool for later use in the pvmove_poll
function inside the process_each_pv function (at address 2).
After we jump out of process_each_pv, destroy_processing_handle
is called. As a result of destroying the handle, the memory
pool could deallocate all memory at address 1 or higher. The
pvmove_poll function then tried to copy memory allocated
at address 2 that could already have been returned to the
system - if it was, this led to a segfault.
We need to rethink the proper fix, but at the same time the
cmd->mem pool is recreated for each lvm command, so
this should not cause problems even when we run
multiple commands in the lvm shell.
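A minimal compilable illustration of the pool semantics behind the bug, using the real libdevmapper pool API (variable names are illustrative): dm_pool_free(mem, p) releases p and everything allocated from the pool after p.

#include <libdevmapper.h>
#include <string.h>

int main(void)
{
        struct dm_pool *mem = dm_pool_create("demo", 1024);
        char *a, *b;

        if (!mem)
                return 1;

        a = dm_pool_alloc(mem, 64);    /* 'address 1': the handle     */
        b = dm_pool_alloc(mem, 64);    /* 'address 2': later-use data */
        if (!a || !b)
                return 1;

        strcpy(b, "pvmove state");

        /* Frees 'a' AND everything allocated after it, i.e. 'b' too. */
        dm_pool_free(mem, a);

        /* strlen(b) here would be the invalid read valgrind caught. */

        dm_pool_destroy(mem);
        return 0;
}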
A valgrind snapshot of the corruption:
Invalid read of size 1
   at 0x4C29F92: strlen (mc_replace_strmem.c:403)
   by 0x5495F2E: dm_pool_strdup (pool.c:51)
   by 0x1592A7: _create_id (pvmove.c:774)
   by 0x159409: pvmove_poll (pvmove.c:796)
   by 0x1599E3: pvmove (pvmove.c:931)
   by 0x15105B: lvm_run_command (lvmcmdline.c:1655)
   by 0x1523C3: lvm2_main (lvmcmdline.c:2121)
   by 0x1754F3: main (lvm.c:22)
Address 0xf15df8a is 138 bytes inside a block of size 8,192 free'd
   at 0x4C28430: free (vg_replace_malloc.c:446)
   by 0x5494E73: dm_free_wrapper (dbg_malloc.c:357)
   by 0x5495DE2: _free_chunk (pool-fast.c:318)
   by 0x549561C: dm_pool_free (pool-fast.c:151)
   by 0x164451: destroy_processing_handle (toollib.c:1837)
   by 0x1598C1: pvmove (pvmove.c:903)
   by 0x15105B: lvm_run_command (lvmcmdline.c:1655)
   by 0x1523C3: lvm2_main (lvmcmdline.c:2121)
   by 0x1754F3: main (lvm.c:22)