Fix rename operation for snapshot (cow) LV.
Only the snapshot's origin has the lock and by mistake suspend
and resume has been called for the snapshot LV.
This further made volumes unusable in cluster.
So instead of suspend and resuming list of LVs,
we need to just suspend and resume origin.
As the sequence write/suspend/commit/resume
is widely used in lvm2 code base - move it to
new lv_update_and_reload function.
Since there is no easy way to do a macro evaluation for (PATH_MAX-1)
and string concatation of this number to get resulting (%4095s) - let's
go with easiest path and restore extra byte for 0.
Other option would be to prepare sscanf parsing string in runtime.
But lets resolve it when we look at PATH_MAX handling later...
Zdenek Kabelac [Thu, 21 Aug 2014 13:35:29 +0000 (15:35 +0200)]
cleanup: never return uninitialized buffer
Coverity noticed this function may return untouched buffer,
however in this state can't really happen, but anyway
ensure on error path the buffer will have zero lenght string.
Zdenek Kabelac [Tue, 26 Aug 2014 10:10:29 +0000 (12:10 +0200)]
thin: fix volume_list support
Fixing problem, when user sets volume_list and excludes thin pools
from activation. In this case pool return 'success' for skipped activation.
We need to really check the volume it is actually active to properly
to remove queued pool messages. Otherwise the lvm2 and kernel
metadata started to go async since lvm2 believed, messages were submitted.
Add also better check for threshold when create a new thin volume.
In this case we require local activation of thin pool so we are able
to check pool fullness.
Peter Rajnoha [Mon, 25 Aug 2014 08:02:32 +0000 (10:02 +0200)]
cleanup: consolidate lv_layout and lv_role reporting
This patch makes the keyword combinations found in "lv_layout" and
"lv_role" much more understandable - there were some ambiguities
for some of the combinations which lead to confusion before.
Now, the scheme used is:
LAYOUTS ("how the LV is laid out"):
===================================
[linear] (all segments have number of stripes = 1)
[striped] (all segments have number of stripes > 1)
[linear,striped] (mixed linear and striped)
raid (raid layout always reported together with raid level, raid layout == image + metadata LVs underneath that make up raid LV)
[raid,raid1]
[raid,raid10]
[raid,raid4]
[raid,raid5] (exact sublayout not specified during creation - default one used - raid5_ls)
[raid,raid5,raid5_ls]
[raid,raid5,raid6_rs]
[raid,raid5,raid5_la]
[raid,raid5,raid5_ra]
[raid6,raid] (exact sublayout not specified during creation - default one used - raid6_zr)
[raid,raid6,raid6_zr]
[raid,raid6,raid6_nc]
[raid,raid6,raid6_ns]
[mirror] (mirror layout == log + image LVs underneath that make up mirror LV)
thin (thin layout always reported together with sublayout)
[thin,sparse] (thin layout == allocated out of thin pool)
[thin,pool] (thin pool layout == data + metadata volumes underneath that make up thin pool LV, not supposed to be used for direct use!!!)
[cache] (cache layout == allocated out of cache pool in conjunction with cache origin)
[cache,pool] (cache pool layout == data + metadata volumes underneath that make up cache pool LV, not supposed to be used for direct use!!!)
[virtual] (virtual layout == not hitting disk underneath, currently this layout denotes only 'zero' device used for origin,thickorigin role)
[unknown] (either error state or missing recognition for such layout)
ROLES ("what's the purpose or use of the LV - what is its role"):
=================================================================
- each LV has either of these two roles at least: [public] (public LV that users may use freely to write their data to)
[public] (public LV that users may use freely to write their data to)
[private] (private LV that LVM maintains; not supposed to be directly used by user to write his data to)
- and then some special-purpose roles in addition to that:
[origin,thickorigin] (origin for thick-style snapshot; "thick" as opposed to "thin")
[origin,multithickorigin] (there are more than 2 thick-style snapshots for this origin)
[origin,thinorigin] (origin for thin snapshot)
[origin,multithinorigin] (there are more than 2 thin snapshots for this origin)
[origin,extorigin] (external origin for thin snapshot)
[origin,multiextoriginl (there are more than 2 thin snapshots using this external origin)
[origin,cacheorigin] (cache origin)
[snapshot,thicksnapshot] (thick-style snapshot; "thick" as opposed to "thin")
[snapshot,thinsnapshot] (thin-style snapshot)
Peter Rajnoha [Mon, 25 Aug 2014 08:05:27 +0000 (10:05 +0200)]
report: use dm_report_field_string_list_unordered for reporting lv_layout and lv_role fields
This makes it a bit more readable since we can report more general
layouts/roles first and keywords describing the LV more precisely
afterwards in the list.
Peter Rajnoha [Mon, 25 Aug 2014 07:07:03 +0000 (09:07 +0200)]
refactor: rename 'lv_type' field to 'lv_role'
The 'lv_type' field name was a bit misleading. Better one is 'lv_role'
since this fields describes what's the actual use of the LV currently -
its 'role'.
Sort out the lvresize calculation code to handle size changes
specified as physical extents as well as logical extents
and to process mirror resizing and raid extensions correctly.
The 'approx alloc' option was masking the underlying problem.
Zdenek Kabelac [Wed, 20 Aug 2014 12:35:57 +0000 (14:35 +0200)]
tests: proper /dev access
Commit 5ebff6cc9f631b7409d99b72fa0b39ccec30bf1f seemed to introduce
new 'for' loop but the mode is not yet used.
But the access to /dev dir needs to go through $DM_DEV_DIR
and whole path needs to be in "".
Peter Rajnoha [Wed, 20 Aug 2014 08:05:51 +0000 (10:05 +0200)]
lvconvert: snapshot: allow using raid1 for snapshot and snapshot origin
When testing conversion sanity, we checked lv->status & MIRRORED
which encompasses both old mirrors and raid1 mirrors. But we need to
ban only the old mirrors here hence allow raid1 mirrors.
Zdenek Kabelac [Tue, 19 Aug 2014 09:35:18 +0000 (11:35 +0200)]
cleanup: use just PATH_MAX size
Avoid playing with +1.
PATH_MAX code needs probably more thinking anyway, since
there is no MAX path in Linux - user may easily create path
with 64kB chars - so 4kB buffer is surelly not enough for
such dirs.
Peter Rajnoha [Tue, 19 Aug 2014 12:16:39 +0000 (14:16 +0200)]
lv: remove lv_type_name fn
The lv_type_name function is remnant from old code that reported
only single string for the LV type. LV types are now reported
in a more extended way as keyword list that describe the type
precisely (using lv_layout_and_type fn).
The lv_type_name was used in some error messages to display the
type of the LV so just reinstate the old messages back referencing
the type directly with a string - this is enough for error messages.
They don't need to display the LV type as precisely as it's used
on lvs output (which is optimized for selection anyway).
Peter Rajnoha [Mon, 18 Aug 2014 13:58:48 +0000 (15:58 +0200)]
report: fix thin external snapshot identification for lv_layout and lv_type fields
Thin snapshots having external origins missed the "snapshot" keyword for
lv_type field. Also, thin external origins which are thin devices (from
another pool) were not recognized properly.
For example, external origin itself can be either non-thin volume (lvol0
below) or it can be a thin volume from another pool (lvol3 below):
Before this patch:
$ lvs -o name,vg_name,attr,pool_lv,origin,layout,type
Internal error: Failed to properly detect layout and type for for LV vg/lvol3
Internal error: Failed to properly detect layout and type for for LV vg/lvol3
LV VG Attr Pool Origin Layout Type
lvol0 vg ori------- linear external,origin,thin
lvol2 vg Vwi-a-tz-- pool lvol0 thin thin
lvol3 vg ori---tz-- pool unknown external,origin,thin,thin
lvol4 vg Vwi-a-tz-- pool1 lvol3 thin thin
pool vg twi-a-tz-- pool,thin pool,thin
pool1 vg twi-a-tz-- pool,thin pool,thin
- lvol2 as well as lvol4 have missing "snapshot" in type field
- lvol3 has unrecognized layout (should be "thin"), but has double
"thin" in lv_type which is incorrect
- (also there's double "for" in the internal error message)
With this patch applied:
$ lvs -o name,vg_name,attr,pool_lv,origin,layout,type
LV VG Attr Pool Origin Layout Type
lvol0 vg ori------- linear external,origin,thin
lvol2 vg Vwi-a-tz-- pool lvol0 thin snapshot,thin
lvol3 vg ori---tz-- pool thin external,origin,thin
lvol4 vg Vwi-a-tz-- pool1 lvol3 thin snapshot,thin
pool vg twi-a-tz-- pool,thin pool,thin
pool1 vg twi-a-tz-- pool,thin pool,thin
Jonathan Brassow [Sat, 16 Aug 2014 02:15:34 +0000 (21:15 -0500)]
RAID: Fail RAID4/5/6 creation if PE size is less than STRIPE_SIZE_MIN
The maximum stripe size is equal to the volume group PE size. If that
size falls below the STRIPE_SIZE_MIN, the creation of RAID 4/5/6 volumes
becomes impossible. (The kernel will fail to load a RAID 4/5/6 mapping
table with a stripe size less than STRIPE_SIZE_MIN.) So, we report an
error if it is attempted.
This is very rare because reducing the PE size down that far limits the
size of the PV below that of modern devices.
This patch adds a new flag --deferred to dmsetup remove. If this flag is
specified and the device is open, it is scheduled to be deleted on
close.
struct dm_info is extended.
The existing dm_task_get_info() is converted into a wrapper around the
new version dm_task_get_info_with_deferred_remove() so existing binaries
can still use the old smaller structure.
Recompiled code will pick up the new larger structure.
Zdenek Kabelac [Fri, 15 Aug 2014 11:53:04 +0000 (13:53 +0200)]
toollib: print ignoring vorigin
When ignoring 'listed' volume, print info message.
(So the final command error message is a bit less confusing,
i.e. when user tries to deactive virtual origin:
> lvchange -an vg/lvol2_vorigin
Ignoring virtual origin logical volume vg/lvol2_vorigin.
One or more specified logical volume(s) not found.
Peter Rajnoha [Wed, 13 Aug 2014 08:03:45 +0000 (10:03 +0200)]
Add lv_layout_and_type fn, lv_layout and lv_type reporting fields.
The lv_layout and lv_type fields together help with LV identification.
We can do basic identification using the lv_attr field which provides
very condensed view. In contrast to that, the new lv_layout and lv_type
fields provide more detialed information on exact layout and type used
for LVs.
For top-level LVs which are pure types not combined with any
other LV types, the lv_layout value is equal to lv_type value.
For non-top-level LVs which may be combined with other types,
the lv_layout describes the underlying layout used, while the
lv_type describes the use/type/usage of the LV.
These two new fields are both string lists so selection (-S/--select)
criteria can be defined using the list operators easily:
[] for strict matching
{} for subset matching.
For example, let's consider this:
$ lvs -a -o name,vg_name,lv_attr,layout,type
LV VG Attr Layout Type
[lvol1_pmspare] vg ewi------- linear metadata,pool,spare
pool vg twi-a-tz-- pool,thin pool,thin
[pool_tdata] vg rwi-aor--- level10,raid data,pool,thin
[pool_tdata_rimage_0] vg iwi-aor--- linear image,raid
[pool_tdata_rimage_1] vg iwi-aor--- linear image,raid
[pool_tdata_rimage_2] vg iwi-aor--- linear image,raid
[pool_tdata_rimage_3] vg iwi-aor--- linear image,raid
[pool_tdata_rmeta_0] vg ewi-aor--- linear metadata,raid
[pool_tdata_rmeta_1] vg ewi-aor--- linear metadata,raid
[pool_tdata_rmeta_2] vg ewi-aor--- linear metadata,raid
[pool_tdata_rmeta_3] vg ewi-aor--- linear metadata,raid
[pool_tmeta] vg ewi-aor--- level1,raid metadata,pool,thin
[pool_tmeta_rimage_0] vg iwi-aor--- linear image,raid
[pool_tmeta_rimage_1] vg iwi-aor--- linear image,raid
[pool_tmeta_rmeta_0] vg ewi-aor--- linear metadata,raid
[pool_tmeta_rmeta_1] vg ewi-aor--- linear metadata,raid
thin_snap1 vg Vwi---tz-k thin snapshot,thin
thin_snap2 vg Vwi---tz-k thin snapshot,thin
thin_vol1 vg Vwi-a-tz-- thin thin
thin_vol2 vg Vwi-a-tz-- thin multiple,origin,thin
Which is a situation with thin pool, thin volumes and thin snapshots.
We can see internal 'pool_tdata' volume that makes up thin pool has
actually a level10 raid layout and the internal 'pool_tmeta' has
level1 raid layout. Also, we can see that 'thin_snap1' and 'thin_snap2'
are both thin snapshots while 'thin_vol1' is thin origin (having
multiple snapshots).
Such reporting scheme provides much better base for selection criteria
in addition to providing more detailed information, for example:
$ lvs -a -o name,vg_name,lv_attr,layout,type -S 'type=metadata'
LV VG Attr Layout Type
[lvol1_pmspare] vg ewi------- linear metadata,pool,spare
[pool_tdata_rmeta_0] vg ewi-aor--- linear metadata,raid
[pool_tdata_rmeta_1] vg ewi-aor--- linear metadata,raid
[pool_tdata_rmeta_2] vg ewi-aor--- linear metadata,raid
[pool_tdata_rmeta_3] vg ewi-aor--- linear metadata,raid
[pool_tmeta] vg ewi-aor--- level1,raid metadata,pool,thin
[pool_tmeta_rmeta_0] vg ewi-aor--- linear metadata,raid
[pool_tmeta_rmeta_1] vg ewi-aor--- linear metadata,raid
(selected all LVs which are related to metadata of any type)
lvs -a -o name,vg_name,lv_attr,layout,type -S 'type={metadata,thin}'
LV VG Attr Layout Type
[pool_tmeta] vg ewi-aor--- level1,raid metadata,pool,thin
(selected all LVs which hold metadata related to thin)
pvcreate: Fix cache state with filters/sig wiping.
_pvcreate_check() has two missing requirements:
After refreshing filters there must be a rescan.
(Otherwise the persistent filter may remain empty.)
After wiping a signature, the filters must be refreshed.
(A device that was previously excluded by the filter due to
its signature might now need to be included.)
If several devices are added at once, the repeated scanning isn't
strictly needed, but we can address that later as part of the command
processing restructuring (by grouping the devices).
Peter Rajnoha [Wed, 13 Aug 2014 13:39:03 +0000 (15:39 +0200)]
select: add support for selection to match string list subset, recognize { } operator
Using "[ ]" operator together with "&&" (or ",") inside causes the
string list to be matched if and only if all the items given match
the value reported and the number of items also match. This is
strict list matching and the original behaviour we already have.
In contrast to that, the new "{ }" operator together with "&&" inside
causes the string list to be matched if and only if all the items given
match the value reported but the number of items don't need to match.
So we can provide a subset in selection criteria and if the subset
is found, it matches.
Peter Rajnoha [Fri, 8 Aug 2014 08:49:19 +0000 (10:49 +0200)]
filter-mpath: fix primary device lookup failure for partition when processing mpath filter
If using persistent filter and we're refreshing filters (just like we
do for pvcreate now after commit 54685c20fc9dfb155a2e5bc9d8cf5f0aad944305),
we can't rely on getting the primary device of the partition from the cache
as such device could be already filtered by persistent filter and we get
a device cache lookup failure for such device.
$pvcreate /dev/sda1
dev_is_mpath: failed to get device for 8:1
Physical volume "/dev/sda1" successfully created
The problematic part of the code called dev_cache_get_by_devt
to get the device for the device number supplied. Then the code
used dev_name(dev) to get the name which is then used in check
whether there's any mpath on top of this dev...
This patch uses sysfs to get the base name for the partition
instead, hence avoiding the device cache which is a correct
approach here.
Peter Rajnoha [Thu, 7 Aug 2014 14:44:09 +0000 (16:44 +0200)]
activation: if LV inactive and non-clustered, do not issue "Cannot deactivate" on -aln
The message "Cannot deactivate remotely exclusive device locally." makes
sense only for clustered LV. If the LV is non-clustered, then it's
always exclusive by definition and if it's already deactivated, this
message pops up inappropriately as those two conditions are met.
So issue the message only if the conditions are met AND we have clustered VG.
Peter Rajnoha [Thu, 7 Aug 2014 13:23:58 +0000 (15:23 +0200)]
pvmove: remove spurious "Skipping mirror LV" message on pvmove of clustered mirror
With cmirrord, we can do pvmove of clustered mirror. The code checking
suitability of LVs on the PV being moved issued a message if a mirror
LV was found and the VG was clustered. However, the actual pvmove did
work correctly.
The top-level mirror LV is actually skipped in the code since it's
always layered on top of internal LVs making up the mirror LV and for pvmove
we consider these internal devices only as they're actually layered on
top of concrete PVs then. But we don't need to issue any message here
about skipping the top-level mirror LV - it's misleading here.
Commit e80884cd080cad7e10be4588e3493b9000649426 tried to dump filters
for them to be reevaluated when creating a PV to avoid overwriting
any existing signature that may have been created after last
scan/filtering.
However, we need to call refresh_filters instead of
persistent_filter->dump since dump requires proper rescannig to fill
up the persistent filter again. However, this is true only for pvcreate
but not for vgcreate with PV creation where the scanning happens before
this PV creation and hence the next rescan (if not full scan), does not
fill the persistent filter.
Also, move refresh_filters so that it's called sooner and only for
pvcreate, vgcreate already calls lvmcache_label_scan(cmd, 2) which
then calls refresh_filters itself, so no need to reevaluate this again.
This caused the persistent filter (/etc/lvm/cache/.cache file) to be
wrong and contain only the PV just being processed with
vgcreate <vg_name> <pv_name_to_create>.
This regression caused other block devices to be filtered out in case
the vgcreate with PV creation was used and then the persistent filter
is used by any other LVM command afterwards.
There was one more "not found" message issued in case the device
could not be found in device cache (commit 2e82a07 fixed this only
for PV lookup itself). But if we "allow_unformatted" for
find_pv_by_name, we should not issue this message even in case
the device can't be found in dev cache as we just need to know
whether there's a PV or not for the code to decide on next steps
and we don't want to issue any messages if either device itself
is not found or PV is not found.
For example, when we were creating a new PV (and so allow_unformatted = 1)
and the device had a signature on it which caused it to be filtered
by device filter (e.g. MD signature if md filtering is enabled),
or it was part of some other subsystem (e.g. multipath), this message
was issued on find_pv_by_name call which was misleading.
Also, remove misleading "stack" call in case find_pv_by_name
returns NULL in pvcreate_check - any error state is reported
later by pvcreate_check code so no need to "stack" here.
There's one more and proper check to issue "not found" message if
the device can't be found in device cache within pvcreate_check fn
so this situation is still covered properly later in the code.
Before this patch (/dev/sda contains MD signature and is therefore filtered):
$ pvcreate /dev/sda
Physical volume /dev/sda not found
WARNING: linux_raid_member signature detected on /dev/sda at offset 4096. Wipe it? [y/n]:
With this patch applied:
$ pvcreate /dev/sda
WARNING: linux_raid_member signature detected on /dev/sda at offset 4096. Wipe it? [y/n]:
Non-existent devices are still caught properly:
$ pvcreate /dev/sdx
Device /dev/sdx not found (or ignored by filtering).