Peter Rajnoha [Wed, 18 Jun 2014 11:26:47 +0000 (13:26 +0200)]
cleanup: gcc warnings and report-select test vs snap_percent 0%
Fix gcc warnings:
libdm-report.c:1952:5: warning: "end_op_flag_hit" may be used uninitialized in this function [-Wmaybe-uninitialized]
libdm-report.c:2232:28: warning: "custom" may be used uninitialized in this function [-Wmaybe-uninitialized]
And snap_percent is not 0% in dm < 1.10.0 so
don't test comparison with 0% here.
Ordering string list items on reports is also new compared to
previous state where items were not ordered at all and they
got reported simply as they appeared/were processed.
Jonathan Brassow [Wed, 18 Jun 2014 03:59:36 +0000 (22:59 -0500)]
pvmove: Enable all-or-nothing (atomic) pvmoves
pvmove can be used to move single LVs by name or multiple LVs that
lie within the specified PV range (e.g. /dev/sdb1:0-1000). When
moving more than one LV, the portions of those LVs that are in the
range to be moved are added to a new temporary pvmove LV. The LVs
then point to the range in the pvmove LV, rather than the PV
range.
Example 1:
We have two LVs in this example. After they were
created, the first LV was grown, yeilding two segments
in LV1. So, there are two LVs with a total of three
segments.
Each of the affected LV segments now point to a
range of blocks in the pvmove LV, which purposefully
corresponds to the segments moved from the original
LVs into the temporary pvmove LV.
The current implementation goes on from here to mirror the temporary
pvmove LV by segment. Further, as the pvmove LV is activated, only
one of its segments is actually mirrored (i.e. "moving") at a time.
The rest are either complete or not addressed yet. If the pvmove
is aborted, those segments that are completed will remain on the
destination and those that are not yet addressed or in the process
of moving will stay on the source PV. Thus, it is possible to have
a partially completed move - some LVs (or certain segments of LVs)
on the source PV and some on the destination.
Example 2:
What 'example 1' might look if it was half-way
through the move.
--------- --------- ---------
| LV1s0 | | LV2s0 | | LV1s1 |
--------- --------- ---------
| | |
-------------------------------------
pvmove0 | seg 0 | seg 1 | seg 2 |
-------------------------------------
| | |
| -------------------------
source PV | | 256 - 511 | 512 - 767 |
| -------------------------
| ||
-------------------------
dest PV | 000 - 255 | 256 - 511 |
-------------------------
This update allows the user to specify that they would like the
pvmove mirror created "by LV" rather than "by segment". That is,
the pvmove LV becomes an image in an encapsulating mirror along
with the allocated copy image.
Example 3:
A pvmove that is performed "by LV" rather than "by segment".
The thing that differentiates a pvmove done in this way and a simple
"up-convert" from linear to mirror is the preservation of the
distinct segments. A normal up-convert would simply allocate the
necessary space with no regard for segment boundaries. The pvmove
operation must preserve the segments because they are the critical
boundary between the segments of the LVs being moved. So, when the
pvmove copy image is allocated, all corresponding segments must be
allocated. The code that merges ajoining segments that are part of
the same LV when the metadata is written must also be avoided in
this case. This method of mirroring is unique enough to warrant its
own definitional macro, MIRROR_BY_SEGMENTED_LV. This joins the two
existing macros: MIRROR_BY_SEG (for original pvmove) and MIRROR_BY_LV
(for user created mirrors).
The advantages of performing pvmove in this way is that all of the
LVs affected can be moved together. It is an all-or-nothing approach
that leaves all LV segments on the source PV if the move is aborted.
Additionally, a mirror log can be used (in the future) to provide tracking
of progress; allowing the copy to continue where it left off in the event
there is a deactivation.
Peter Rajnoha [Tue, 17 Jun 2014 16:19:59 +0000 (18:19 +0200)]
tests: update lvcreate-thin for latest changes
With recent changes introduced with the report selection support,
the content of lv_modules field is of string list type (before
it was just string type).
String list elements are always ordered now so update lvcreate-thin
test to expect the elements to be ordered.
Peter Rajnoha [Tue, 17 Jun 2014 16:14:38 +0000 (18:14 +0200)]
prop: update FIELD macro to accomodate the differentiation of number, size and percent field values
The differentiation of the original number field into number, size and
percent field types has been introduced with recent changes for report
selection support.
Peter Rajnoha [Thu, 12 Jun 2014 11:57:22 +0000 (13:57 +0200)]
report: add support for implicit fields, add implicit "selected" field
Implicit fields are fields that are registered with the report
and reported internally by libdevmapper itself (compared to explicit
fields that are registered by the layer above libdevmapper - e.g. LVM,
dmsetup...).
The "selected" field is the implicit field (for now the only one)
that reports the result of the selection. Since the selection itself
is the property of the libdevmapper, the upper layer using dm_report_init
can't register this field itself and it must be done directly at
libdevmapper layer.
The "selected" field is internally registered as part of the "common"
report type with id 0x80000000 (the last bit in uin32_t) which is then
reserved (the explicit report types are then checked if they do not
contain this id and if yes, we error out).
This way, the "selected" field is recognized by all libdevmapper users
that initialize the reporting with "dm_report_init_with_selection".
If reporting is initialized with the classical "dm_report_init",
there's no functional change (so the "selected" field is not defined
and it's not recognized).
Peter Rajnoha [Fri, 30 May 2014 13:02:21 +0000 (15:02 +0200)]
report: select: add support for reserved value recognition in report selection string - add struct dm_report_reserved_value
Make dm_report_init_with_selection to accept an argument with an
array of reserved values where each element contains a triple:
{dm report field type, reserved value, array of strings representing this value}
When the selection is parsed, we always check whether a string
representation of some reserved value is not hit and if it is,
we use the reserved value assigned for this string instead of
trying to parse it as a value of certain field type.
This makes it possible to define selections like:
... --select lv_major=undefined (or -1 or unknown or undef or whatever string representations are registered for this reserved value in the future)
... --select lv_read_ahead=auto
... --select vg_mda_copies=unmanaged
With this, each time the field value of certain type is hit
and when we compare it with the selection, we use the proper
value for comparison.
For now, register these reserved values that are used at the moment
(also more descriptive names are used for the values):
Same reserved value of different field types do not collide.
All arrays are null-terminated.
The list of reserved values is automatically displayed within
selection help output:
Selection operands
------------------
...
Reserved values
---------------
-1, undefined, undef, unknown - Reserved value for undefined numeric value. [number]
unmanaged - Reserved value for unmanaged number of metadata copies in VG. [number]
auto - Reserved value for size that is automatically calculated. [size]
Peter Rajnoha [Mon, 16 Jun 2014 12:06:04 +0000 (14:06 +0200)]
report: select: show field type in field list if in context of selection
When the field list is displayed as help for constructing selection
criteria, show also the field value type. This is useful for users
to know what set of operators are allowed for the type - the subsequent
"Selection operands" section in the help output summarize all known
types that can be used in selection.
Peter Rajnoha [Thu, 29 May 2014 07:42:14 +0000 (09:42 +0200)]
report: select: add help for creating selections
The "<lvm command> -S/--select help" shows help (including list of fields to match against):
...field list here including the field type name...
Selection operands
------------------
field - Reporting field.
number - Non-negative integer value.
size - Floating point value with units specified.
string - Characters quoted by ' or " or unquoted.
string list - Strings enclosed by [ ] and elements delimited by either
"all items must match" or "at least one item must match" operator.
regular expression - Characters quoted by ' or " or unquoted.
Selection operators
-------------------
Comparison operators:
=~ - Matching regular expression.
!~ - Not matching regular expression.
= - Equal to.
!= - Not equal to.
>= - Greater than or equal to.
> - Greater than
<= - Less than or equal to.
< - Less than.
Logical and grouping operators:
&& - All fields must match
, - All fields must match
|| - At least one field must match
# - At least one field must match
! - Logical negation
( - Left parenthesis
) - Right parenthesis
[ - List start
] - List end
Peter Rajnoha [Thu, 29 May 2014 07:41:50 +0000 (09:41 +0200)]
report: select: add support for processing string lists in selection
Selection list items are enclosed in '[' and ']' (if there's only
one item, the '[' and ']' can be omitted). Each element of the list
is a string (either quoted or unquoted, like the usual string operand
used in selection) and each element is delimited either by conjunction
(meaining "match all") or disjunction operator (meaning "match any").
For example, if "," is the conjuction operator and "/" is the
disjunction operator then:
lv_tags=[a,b,c]
...will match all fields where tags contain *all* a, b and c.
lv_tags=[a/b/c]
...will match all fields where tags contain *any* of a, b, or c.
Mixing operators within the list is not supported:
lv_tags=[a,b/c]
...will give an error.
The order in which items are defined in the selection do not matter.
This patch enhances the selection parsing functionality to recognize
such lists.
Peter Rajnoha [Thu, 29 May 2014 07:41:36 +0000 (09:41 +0200)]
report: select: add DM_REPORT_FIELD_TYPE_STRING_LIST to make a difference between STRING and STRING_LIST
The {pv,vg,lv,seg}_tags and lv_modules fields are reported as string
lists using the new dm_report_field_string_list - so we just pass
the list to the fn that takes care of reporting and item sorting itself.
Peter Rajnoha [Thu, 29 May 2014 07:41:18 +0000 (09:41 +0200)]
report: select: add dm_report_field_string_list to libdm
Add a separate dm_report_field_string_list fn to libdevmapper to
support reporting string lists. Before, the code used libdevmappers's
dm_report_field_string fn which required formatting the list to a
single string. This functionality is now moved to libdevmapper
and the code that needs to report the string list just needs
to pass the list itself and libdevmapper will take care of this.
This also enhances code reuse.
The dm_report_field_string_list also accepts an argument to define
custom delimiter to use. If not defined, a default "," (comma) is
used as item delimiter in the string list reported.
The dm_report_field_string_list automatically sorts the items in
the list before formatting it to a final string. It also encodes
the position and length within the final string where each element
can be found. This can be used to support checking against each
list item reported since since when formatted as a single string
for the actual report, we would lose this information otherwise
(we don't want to copy each item, the position and length within
the final string is enough for us to get the original items back).
When such lists are checked against the selection tree, we can check
each item individually this way and we can support operators like
"match any" and "match all".
Peter Rajnoha [Thu, 29 May 2014 07:41:03 +0000 (09:41 +0200)]
report: select: refactor: move str_list to libdm
The list of strings is used quite frequently and we'd like to reuse
this simple structure for report selection support too. Make it part
of libdevmapper for general reuse throughout the code.
This also simplifies the LVM code a bit since we don't need to
include and manage lvm-types.h anymore (the string list was the
only structure defined there).
Peter Rajnoha [Thu, 29 May 2014 07:38:36 +0000 (09:38 +0200)]
report: select: use _check_report_selection in dm_report_object to report only objects that satisfy the report selection
This is rebased and edited version of the original design and
patch proposed by Jun'ichi Nomura:
http://www.redhat.com/archives/dm-devel/2007-April/msg00025.html
This activates the actual selection process in dm_report_object.
Peter Rajnoha [Thu, 29 May 2014 07:38:22 +0000 (09:38 +0200)]
report: select: add _check_selection fn to support checking fields against given selections
This is rebased and edited versions of the original design and
patch proposed by Jun'ichi Nomura:
http://www.redhat.com/archives/dm-devel/2007-April/msg00025.html
The _check_selection implements the actual field checking against the
selection tree.
Peter Rajnoha [Thu, 29 May 2014 07:38:08 +0000 (09:38 +0200)]
report: select: add dm_report_init_with_selection to libdm
This is rebased and edited version of the original design and
patch proposed by Jun'ichi Nomura:
http://www.redhat.com/archives/dm-devel/2007-April/msg00025.html
The dm_report_init_with_selection is the same as dm_report_init
but it contains an additional argument to set the selection
in the form of a string that contains field names to check against and
selection operators. The selection string is parsend and a selection
tree is composed for use in the checks against individual fields when
the report is processed. The parsed selection tree is stored in dm_report
structure as "selection_root".
Peter Rajnoha [Thu, 29 May 2014 07:37:54 +0000 (09:37 +0200)]
report: select: add supporting infrastucture for token parsing in report selections
This is rebased and edited version of the original design and
patch proposed by Jun'ichi Nomura:
http://www.redhat.com/archives/dm-devel/2007-April/msg00025.html
Add support for parsing numbers, strings (quoted or unquoted), regexes
and operators amogst these operands in selection condition supplied.
Peter Rajnoha [Thu, 29 May 2014 07:37:41 +0000 (09:37 +0200)]
report: select: add structs for report selection
This is rebased and edited version of the original design and
patch proposed by Jun'ichi Nomura:
http://www.redhat.com/archives/dm-devel/2007-April/msg00025.html
This patch defines operators and structures that will be used
to store the report selection against which the actual values
reported will be checked.
Selection operators
-------------------
Comparison operators:
=~ - Matching regular expression.
!~ - Not matching regular expression.
= - Equal to.
!= - Not equal to.
>= - Greater than or equal to.
> - Greater than
<= - Less than or equal to.
< - Less than.
Logical and grouping operators:
&& - All fields must match
, - All fields must match
|| - At least one field must match
# - At least one field must match
! - Logical negation
( - Left parenthesis
) - Right parenthesis
Peter Rajnoha [Thu, 29 May 2014 07:37:22 +0000 (09:37 +0200)]
report: select: add DM_REPORT_FIELD_TYPE_SIZE to make a difference between NUMBER and SIZE
This makes it easier to check against the fields (following patches for
report selection) and check whether size units are allowed or not
with the field value.
Zdenek Kabelac [Mon, 16 Jun 2014 10:39:32 +0000 (12:39 +0200)]
cleanup: we already know max device name size
Use NAME_LEN constant to simplify creation of device name.
Since the max size should be already tested in validation,
throw INTERNAL_ERROR if the size of vg/lv is bigger then NAME_LEN.
Zdenek Kabelac [Tue, 17 Jun 2014 11:13:23 +0000 (13:13 +0200)]
snapshot: report proper error message
Expresing -lXX%LV is not valid for snapshot, but error message for
snapshost case was not complete and missed %ORIGIN.
Also document correct settings for in manpage properly where
it missed %PVS.
Zdenek Kabelac [Mon, 16 Jun 2014 11:38:35 +0000 (13:38 +0200)]
snapshot: check it's still snapshot
While polling for snapshot, detect first the snapshot still
exits. It's valid to have multiple polling threads watching
for the same thing and just 1 can 'win' the finish part.
All others should nicely 'fail'.
Jonathan Brassow [Mon, 16 Jun 2014 23:56:32 +0000 (18:56 -0500)]
poll_daemon: Cleanly exit polling if the LV is no longer active
If the we are polling an LV due to some sort of conversion and it
becomes inactive, a rather worrisome message is produced, e.g.:
" ABORTING: Mirror percentage check failed."
We can cleanly exit if we do a simple check to see if the LV is
active before performing the check. This eliminates the scary
message.
Jonathan Brassow [Mon, 16 Jun 2014 23:15:39 +0000 (18:15 -0500)]
cache: Properly rename origin LV tree when adding "_corig"
When creating a cache LV with a RAID origin, we need to ensure that
the sub-LVs of that origin properly change their names to include
the "_corig" extention of the top-level LV. We do this by first
performing a 'lv_rename_update' before making the call to
'insert_layer_for_lv'.
Peter Rajnoha [Fri, 13 Jun 2014 07:45:26 +0000 (09:45 +0200)]
profile: add thin-generic.profile
The thin-generic.profile contains settings for thin/thin pool volumes
suitable for generic environment/use containing default settings.
This allows users to change the global lvm.conf settings at will
and still keep the original settings for volumes that have this
thin profile assigned already.
Zdenek Kabelac [Wed, 11 Jun 2014 08:52:16 +0000 (10:52 +0200)]
man: dmsetup manglename
More updates to manglename option.
Add reference to LVM2 resource page, since for a long time,
this is the right places for sources for libdevmapper....
Zdenek Kabelac [Fri, 6 Jun 2014 15:36:38 +0000 (17:36 +0200)]
tests: make timeouts longer
Add more time for tests, since debug kernels are getting slower...
and we add more and more tests.
However many test should be shortened to avoid testing disk-perfomance
and focus on lvm functionality...
(Often we should probably test with inactive volumes when we check
metadata operation of lvm2)
We may need to support option for 'DEEP' longer testing.
Also something like LVM_TEST_TIMEOUT_FACTOR might be useful
though it would be much better if test suite could approximete
from system perfomance test lenght...
Zdenek Kabelac [Mon, 9 Jun 2014 08:58:57 +0000 (10:58 +0200)]
activation: retry cleanup deactivation
Enable 'retry' deactivation also in 'cleanup' phase.
It shouldn't be mostly needed - however udev now produces
more and more completelny non-synchronizable device opens,
so even for orphan devices we can't easily predict where
udevd opens devices.
So it's more preferable here to log error about device being open
and retry clean, but let the command proceed.
Peter Rajnoha [Fri, 6 Jun 2014 12:21:09 +0000 (14:21 +0200)]
cleanup: move the "daemon is running" checks to lvm-wrappers
And use ifdefs there, not exposing it in the tool code itself.
Later in the future, we should probably make the PIDFILE and
daemon checking code available also in case the daemon itself
is not built.
If the user supplies a '--yes' argument, then don't bother them with
a question to confirm whether to change the cluster attribute (even
if clvmd isn't running).
vgchange: Prompt when setting VG cluster attr if cluster is not setup
If clvmd is not running or the locking type is not clustered and someone
attempts to set the cluster attribute on a volume group, prompt them to
see if they are sure. (Only prompt for one though. If neither are true,
simply ask them once.)
Zdenek Kabelac [Thu, 5 Jun 2014 15:28:03 +0000 (17:28 +0200)]
configure: cleanups
Replace AC_PATH_PROG with AC_PATH_TOOL.
Drop 'x' when already using "" around shell variable.
Simlify some long line and ifs.
Merge multiple test evaluation with '-a', '-o'.
Use 'case' instead if several ifs when it's more elegant.
Improve usage of pkg_config_init and add it where it's been missing.
Check for UDEV_HAS_BUILTIN_BLKID and when building udev-rules.
Zdenek Kabelac [Thu, 5 Jun 2014 12:38:10 +0000 (14:38 +0200)]
configure: accept 'none' as mangling mode
Since we advertise 'none' as mangling name, accept it.
Keep it backward compatible and leave disabled option still working
(though I guess there is likely no user of this option...)
Jonathan Brassow [Fri, 30 May 2014 22:26:10 +0000 (17:26 -0500)]
test: use direct I/O when injecting bad data into RAID images
When directly corrupting RAID images for the purpose of testing,
we must use direct I/O (or a 'sync' after the 'dd') to ensure that
the writes are not caught in the buffer cache in a way that is not
reachable by the top-level RAID device.
Jonathan Brassow [Wed, 28 May 2014 15:17:15 +0000 (10:17 -0500)]
activation: Remove empty DM device when table fails to load.
As part of better error handling, remove DM devices that have been
sucessfully created but failed to load a table. This can happen
when pvmove'ing in a cluster and the cluster mirror daemon is not
running on a remote node - the mapping table failing to load as a
result. In this case, any revert would work on other nodes running
cmirrord because the DM devices on those nodes did succeed in loading.
However, because no table was able to load on the non-cmirrord nodes,
there is no table present that points to what needs to be reverted.
This causes the empty DM device to remain on the system without being
present in any LVM representation.
This patch should only be considered a partial fix to the overall
problem. This is because only the device which failed to load a
table is removed. Any LVs that may have been loaded as requirements
to the DM device that failed to load may be left in place. Complete
clean-up will require tracking those devices which have been created
as dependencies and removing them along with the device that failed
to load a table.