Ondrej Kozina [Tue, 7 Jul 2015 12:03:15 +0000 (14:03 +0200)]
lvmpolld: fix possible memory corruption with mem debug
if lvm2 is built with debug memory options dm_free() is not
mapped directly to std library's free(). This may cause memory corruption
as a line buffer may get reallocated in getline with realloc.
This is a temporary hotfix. Other debug memory failure needs to
be investigated and explained.
Existing messaging intarface for thin-pool has a few 'weak' points:
* Message were posted with each 'resume' operation, thus not allowing
activation of thin-pool with the existing state.
* Acceleration skipped suspend step has not worked in cluster,
since clvmd resumes only nodes which are suspended (have proper lock
state).
* Resume may fail and code is not really designed to 'fail' in this
phase (generic rule here is resume DOES NOT fail unless something serious
is wrong and lvm2 tool usually doesn't handle recovery path in this case.)
* Full thin-pool suspend happened, when taken a thin-volume snapshot.
With this patch the new method relocates message passing into suspend
state.
This has a few drawbacks with current API, but overal it performs
better and gives are more posibilities to deal with errors.
Patch introduces a new logic for 'origin-only' suspend of thin-pool and
this also relates to thin-volume when taking snapshot.
When suspend_origin_only operation is invoked on a pool with
queued messages then only those messages are posted to thin-pool and
actual suspend of thin pool and data and metadata volume is skipped.
This makes taking a snapshot of thin-volume lighter operation and
avoids blocking of other unrelated active thin volumes.
Also fail now happens in 'suspend' state where the 'Fail' is more expected
and it is better handled through error paths.
Activation of thin-pool is now not sending any message and leaves upto a tool
to decided later how to finish unfinished double-commit transaction.
Problem which needs some API improvements relates to the lvm2 tree
construction. For the suspend tree we do not add target table line
into the tree, but only a device is inserted into a tree.
Current mechanism to attach messages for thin-pool requires the libdm
to know about thin-pool target, so lvm2 currently takes assumption, node
is really a thin-pool and fills in the table line for this node (which
should be ensured by the PRELOAD phase, but it's a misuse of internal API)
we would possibly need to be able to attach message to 'any' node.
Other thing to notice - current messaging interface in thin-pool
target requires to suspend thin volume origin first and then send
a create message, but this could not have any 'nice' solution on lvm2
side and IMHO we should introduce something like 'create_after_resume'
message.
Patch also changes the moment, where lvm2 transaction id is increased.
Now it happens only after successful finish of kernel transaction id
change. This change was needed to handle properly activation of pool,
which is in the middle of unfinished transaction, and also this corrects
usage of thin-pool by external apps like Docker.
Add support for sending message in suspend tree for thin-pools.
When this operation is requested whole subtree suspend is then skipped.
This is experimantal support for new lvm2 code for sending message
in suspend phase where 'thin-pool origin-only suspend' will send
messages instead of really suspending thin-pool tree.
When suspening thin volume origin-only - only thin volume is suspended,
then messages are posted and thin-pool suspend is skipped.
Peter Rajnoha [Fri, 3 Jul 2015 08:43:07 +0000 (10:43 +0200)]
report: select: add handler to recognize fuzzy time specification
Recognize date and time specification within selection criteria
that is formulated in a more free-form way besides to the original
basic YYYY-MM-DD HH:MM format that libdevmapper supports.
Currently, this free-form format is recognized for lv_time field.
Users are able to use expressions from this set:
- weekday names ("Sunday" - "Saturday" or abbreviated as "Sun" - "Sat")
- labels for points in time ("noon", "midnight")
- labels for a day relative to current day ("today", "yesterday")
- points back in time with relative offset from today (N is a number)
( "N" "seconds"/"minutes"/"hours"/"days"/"weeks"/"years" "ago")
( "N" "secs"/"mins"/"hrs" ... "ago")
( "N" "s"/"m"/"h" ... "ago")
- time specification either in hh:mm:ss format or with AM/PM suffixes
- month names ("January" - "December" or abbreviated as "Jan" - "Dec")
Peter Rajnoha [Wed, 20 May 2015 16:47:54 +0000 (18:47 +0200)]
report: call appropriate handler to evaluate fuzzy reserved names and dynamic reserved values
Wire the dm_report_reserved_handler instance call in reporting/selection
infrastructure to handle reserved value actions (currently only
DM_REPORT_RESERVED_PARSE_FUZZY_NAME and DM_REPORT_RESERVED_GET_DYNAMIC_VALUE
actions).
Peter Rajnoha [Tue, 19 May 2015 11:01:48 +0000 (13:01 +0200)]
report: add infrastructure to recognize fuzzy reserved names and returning dynamic reserved values
With fuzzy names we mean the names for which it's hard or even impossible
to enumerate all possible variations of the name - the name needs to
be evaluated. An example of fuzzy name is a name which has a base
(substring) which matches and it can contain arbitrary variations
around this base. We can cover human language better with fuzzy
names as people may use several different names (or sentences) to
denote the same thing.
With dynamic values we mean the values which are not constants
and they need to be evaluated in runtime. An example of dynamic
value is a value which depends on current system state (e.g. time,
current configuration or any other state which may change and it
needs runtime evaluation).
There's a handler that can be registered with reporting/selection
using dm_report_reserved_handler instance. This is a central point
in which the computation/evaluation happens when processing reserved
values. Currently, there are two actions declared:
DM_REPORT_RESERVED_PARSE_FUZZY_NAME
(translates fuzzy name into canonical name)
DM_REPORT_RESERVED_GET_DYNAMIC_VALUE
(gets value for canonical name)
The handler is then registered as value in struct
dm_report_reserved_value (see explaining comments besided
the struct dm_report_reserved_value in libdevmapper.h).
Also, this patch provides support for simple caching of values
used during report/selection via dm_report_value_cache_{set,get}.
This is supposed to be used mainly in the dm_report_reserved_handler
instances to save values among calls so all the handler calls work
with the same base value used in computation/evaluation and/or
possibly to save resources if the evaluation is more time-consuming.
The cache is attached to the dm_report handle and so the cache is
dropped one dm_report is dropped.
Peter Rajnoha [Thu, 2 Jul 2015 14:05:25 +0000 (16:05 +0200)]
report: adjust shared flags based on expected type for reserved values
Generic numbers and time values share some operators so make sure
we have the flags correctly adjusted based on expected type if
we're using reserved values.
Peter Rajnoha [Thu, 21 May 2015 13:19:03 +0000 (15:19 +0200)]
report: add support for time (basic)
This patch adds support for time values used in reporting fields.
The raw values are always stored as number of seconds since epoch.
The support that comes with this patch is the basic one which allows
only for recognition of strictly formatted date and time in selection
criteria (the format follows a subset of formats defined by ISO 8601):
date time timezone
date:
YYYY-MM-DD (or shortly YYYYMMDD)
YYYY-MM (shortly YYYYMM), auto DD=1
YYYY, auto MM=01 and DD=01
time:
hh:mm:ss (or shortly hhmmss)
hh:mm (or shortly hhmm), auto ss=0
hh (or shortly hh), auto mm=0, auto ss=0
timezone (always with + or - sign):
+hh:mm or -hh:mm (or shortly +hhmm or -hhmm)
+hh or -hh
Or directly the time (number of seconds) since Epoch (1970-01-01 00:00:00 UTC)
when the number value is prefixed by "@":
@number_of_seconds_since_epoch
This patch also adds aliases for comparison operators
used together with time values which are more intuitive
to use:
since (as alias for >=)
after (as alias for >)
until (as alias for <=)
before (as alias for <)
For example:
$ lvmconfig --type full report/time_format
time_format="%Y-%m-%d %T %z %Z [%s]"
$ lvs -o name,time vg
LV Time
lvol0 2015-06-28 21:25:41 +0200 CEST [1435519541]
lvol1 2015-06-30 03:25:43 +0200 CEST [1435627543]
lvol2 2015-04-26 14:52:20 +0200 CEST [1430052740]
lvol3 2015-06-30 14:52:23 +0200 CEST [1435668743]
$ lvs vg -o name,time -S 'time since "2015-04-26 15:00" && time until "2015-06-30"'
LV Time
lvol0 2015-06-28 21:25:41 +0200 CEST [1435519541]
lvol1 2015-06-30 03:25:43 +0200 CEST [1435627543]
lvol3 2015-06-30 14:52:23 +0200 CEST [1435668743]
$ lvs vg -o name,time -S 'time since "2015-04-26 15:00" && time until "2015-06-30 6:00"'
LV Time
lvol0 2015-06-28 21:25:41 +0200 CEST [1435519541]
lvol1 2015-06-30 03:25:43 +0200 CEST [1435627543]
$ lvs vg -o name,time -S 'time since @1435519541'
LV Time
lvol0 2015-06-28 21:25:41 +0200 CEST [1435519541]
lvol1 2015-06-30 03:25:43 +0200 CEST [1435627543]
lvol3 2015-06-30 14:52:23 +0200 CEST [1435668743]
This is basic time recognition support that is directly a part of
libdevmapper. Recognition of more free-form expressions will be a
part of subsequent patches.
Peter Rajnoha [Tue, 30 Jun 2015 12:09:00 +0000 (14:09 +0200)]
configure: set DEFAULT_FALLBACK_TO_LVM1 in configure and use it in config_settings.h
Just like we have DEFAULT_USE_LVMETAD (or DEFUALT_USE_LVMPOLLD), use
fallback_to_lvm1=1 lvm.conf setting if we configured lvm2 with
--enable-lvm1-fallback and use fallback_to_lvm1=0 otherwise.
Also, generate proper lvm.conf.in with unconfigured value.
Peter Rajnoha [Tue, 30 Jun 2015 08:13:35 +0000 (10:13 +0200)]
select: add support for range reserved values and flagging named-only values
This patch allows for registration and recognition of reserved
values which are ranges, so they're composed of two values actually
to denote the lower and upper bound for the range (stored as an array
with exactly two items to define the boundaries).
Also, this patch allows for flagging reserved values as named-only
which means that such values are not strictly reserved. The strictly
reserved values are reserved values as used before this patch.
Distinction between strictly-reserved and named-only values
is clearly visible with comparisons. Normally, strictly reserved
value is not accounted for if we do "greater than" or "lower than"
comparisons, for example:
1 2 3 ....
|
abc
- we have "abc" as reserved value for field with value "2"
- the value reported for the field is "abc" (or "2", it doesn't matter here)
- the selection we're processing is -S 'field < abc'
- the result of the selection gives nothing as "abc" is strictly
reserved value (bound to "2") and there's no order defined for
it and it would only match if we directly compared the value
(so -S 'field = abc' would match)
With named-only values, the "abc" is named-only value for "2",
so selection -S 'field < abc" is the same as using -S 'field < 2'.
The "abc" is just an alias for some value so the value or its
assigned name can be used equally in selection criteria.
Peter Rajnoha [Mon, 25 May 2015 14:13:07 +0000 (16:13 +0200)]
conf: make time format configurable
Make it possible to define format for time that is displayed.
The way the format is defined is equal to the way that is used
for strftime function, although not all formatting options as
used in strftime are available for LVM2 - the set is restricted
(e.g. we do not allow newline to be printed). The lvm.conf
comments contain the whole list that LVM2 accepts for time format
together with brief description (copied from strftime man page).
For example:
(defaults used - the format is the same as used before this patch)
$ lvs -o+time vg/lvol0 vg/lvol1
LV VG Attr LSize Time
lvol0 vg -wi-a----- 4.00m 2015-06-25 16:18:34 +0200
lvol1 vg -wi-a----- 4.00m 2015-06-29 09:17:11 +0200
(using 'time_format = "@%s"' in lvm.conf - number of seconds
since the Epoch)
$ lvs -o+time vg/lvol0 vg/lvol1
LV VG Attr LSize Time
lvol0 vg -wi-a----- 4.00m @1435241914
lvol1 vg -wi-a----- 4.00m @1435562231
Origin patch by Ferenc Wágner <wferi@niif.hu> changed later to
avoid using shell call, so makefile add 'server' target when
one of metad or polld daemon is requested.
Peter Rajnoha [Thu, 25 Jun 2015 10:49:59 +0000 (12:49 +0200)]
lvmconfig: do not display settings with undefined default values
Do not display settings with undefined default values, but do display
these settings in case the value is defined directly in any part of
the existing config cascade.
For example, the lvmconfig --type current always displays these settings
(as it's somewhere in "current" configuration cascade that makes it defined).
The lvmconfig --type full displays these settings only if it's defined
somewhere in the cascade, but not if default value is used instead
The lvmconfig --type default never displays these settings...
More concrete example - let's have activation/volume_list directly
set in lvm.conf and activation/read_only_volume_list not set.
Both of these settings have *undefined default* values.
$lvmconfig --type full activation/volume_list activation/read_only_volume_list
volume_list="/dev/vg/lv"
(...only volume_list is defined, hence it's printed)
However, the comments will display more info (see also previous commit):
$lvmconfig --type full activation/volume_list activation/read_only_volume_list --withsummary
# Configuration option activation/volume_list.
# Only LVs selected by this list are activated.
# This configuration option does not have a default value defined.
# Value defined in existing configuration has been used for this setting.
volume_list="/dev/vg/lv"
# Configuration option activation/read_only_volume_list.
# LVs in this list are activated in read-only mode.
# This configuration option does not have a default value defined.
Peter Rajnoha [Thu, 25 Jun 2015 11:03:31 +0000 (13:03 +0200)]
lvmconfig: display comment about value from existing config being used
Display comment abour value from existing config being used. For example:
$ lvmconfig --type full --withsummary report/compact_output report/buffered
# Configuration option report/compact_output.
# Do not print empty report fields.
# Value defined in existing configuration has been used for this setting.
compact_output=1
Peter Rajnoha [Thu, 25 Jun 2015 08:51:30 +0000 (10:51 +0200)]
lvmconfig: add --type full to display full tree of settings
The lvmconfig --type full is actually a combination of --type current
and --type missing together with --mergedconfig options used.
The overall outcome is a configuration tree with settings as LVM sees
it when it looks for the values - that means, if the setting is defined
in some config source (lvm.conf, --config, lvmlocal.conf or any profile
that is used), the setting is used. Otherwise, if the setting is not
defined in any part of the config cascade, the defaults are used.
The --type full displays exactly this final tree with all the values
defined, either coming from configuration tree or from defaults.
Zdenek Kabelac [Wed, 24 Jun 2015 13:12:43 +0000 (15:12 +0200)]
snapshot: add synchronization point
Synchronize with udev logic before reusing device as snapshot.
This patch tries to fix the problem with udev, where we manage
to 'active' LV for clearing, then we deactivate such device and
active again as member of 'origin&snapshot' tree all in 1 step.
There needs to be a sync point where udev has time to remove all links,
otherwise we race with scans and we may end-up with mysterious 'free'
links in the system pointing to wrong dm names.
This patch tries to fix failing topology cluster tests..
Peter Rajnoha [Wed, 24 Jun 2015 11:14:30 +0000 (13:14 +0200)]
lvmconfig: add --withspaces option
We shouldn't be adding spaces by default in output as that
may be be used already in scripts and especially for the eval
in shell scripts where spaces are not allowed between key
and value!
Add --withspaces option to lvmconfig for pretty output with
more space in for readability.
Peter Rajnoha [Tue, 23 Jun 2015 13:35:24 +0000 (15:35 +0200)]
config: allow empty values for {thin,cache}_{check,repair}_options
It's not an error to define empty values for
{thin,cache}_{check,repair}_options - such empty value means no
options are passed when these external commands are called.
Peter Rajnoha [Tue, 23 Jun 2015 13:20:51 +0000 (15:20 +0200)]
configure: add DEFAULT_USE_BLKID_WIPING
If blkid wiping is possible, than set use_blkid_wiping=1 and
use_blkid_wiping=0 otherwise for its default value. If blkid wiping
is disabled during configure and use_blkid_wiping=1 is set by chance,
it's simply ignored - this patch is just a cleanup that makes it more
obvious for the user (we use similar logic for use_lvmetad and
use_lvmpolld settings).
Note that these settings are not of string type. Recent change (the
DM_CONFIG_VALUE_FMT_STRING_NO_QUOTES formatting flag) makes it
possible to recognize that the setting is not of string type and if
there's unconfigured value defined for it, the enclosing " " is
automatically removed on output.
Peter Rajnoha [Tue, 23 Jun 2015 11:18:53 +0000 (13:18 +0200)]
config: cleanup default values for some configuration settings with array values
Do not use "#S" (blank string) as default value as that ends up with
'key = [ "" ]' to be generated which is not what we want in most cases.
Also, fix default values for global/{thin,cache}_{check,repair}_options
and avoid assigning blank values. For example, the thin_check_options
had this set as default value previously:
Peter Rajnoha [Tue, 23 Jun 2015 11:02:45 +0000 (13:02 +0200)]
config: add support for config value formatting flags
There are two basic groups of formatting flags (32 bits):
- common ones applicable for all config value types (lower 16 bits)
- type-related formatting flags (higher 16 bits)
With this patch, we initially support four new flags that
modify the the way the config value is displayed:
Common flags:
=============
DM_CONFIG_VALUE_FMT_COMMON_ARRAY - causes array config values
to be enclosed in "[ ]" even if there's only one item
(previously, there was no way to recognize an array with one
item and scalar value, hence array values with one member
were always displayed without "[ ]" which libdm accepted
when reading, but it may have been misleading for users)
DM_CONFIG_VALUE_FMT_COMMON_EXTRA_SPACE - causes extra spaces to
be inserted in "key = value" (or key = [ value, value, ... ] in
case of arrays), compared to "key=value" seen on output before.
This makes the output more readable for users.
Type-related flags:
===================
DM_CONFIG_VALUE_FMT_INT_OCTAL - prints integers in octal form with
"0" as a prefix (libdm's config reading code can handle this via
strtol just fine so it's properly recognized as number in octal
form already if there's "0" used as prefix)
DM_CONFIG_VALUE_FMT_STRING_NO_QUOTES - makes it possible to print
strings without enclosing " "
This patch also adds dm_config_value_set_format_flags and
dm_config_value_get_format_flags functions to set and get
these formatting flags.
The function added here:
. checks if the global state in lvmetad is invalid
. if so, scans disks to update the state in lvmetad
. clears the global_invalid flag in lvmetad
. updates the local udev db to reflect any changes
David Teigland [Tue, 21 Oct 2014 14:40:13 +0000 (09:40 -0500)]
lvmetad: add invalidation method
Add the ability to invalidate global or individual VG metadata.
The invalid state is returned to lvm commands along with the metadata.
This allows lvm commands to detect stale metadata from the cache and
reread the latest metadata from disk (in a subsequent patch.)
These changes do not change the protocol or compatibility between
lvm commands and lvmetad.
Global information
------------------
Global information refers to metadata that is not isolated
to a single VG , e.g. the list of vg names, or the list of pvs.
When an external system, e.g. a locking system, detects that global
information has been changed from another host (e.g. a new vg has been
created) it sends lvmetad the message: set_global_info: global_invalid=1.
lvmetad sets the global invalid flag to indicate that its cached data is
stale.
When lvm commands request information from lvmetad, lvmetad returns the
cached information, along with an additional top-level config node called
"global_invalid". This new info tells the lvm command that the cached
information is stale.
When an lvm command sees global_invalid from lvmated, it knows it should
rescan devices and update lvmetad with the latest information. When this
is complete, it sends lvmetad the message: set_global_info:
global_invalid=0, and lvmetad clears the global invalid flag. Further lvm
commands will use the lvmetad cache until it is invalidated again.
The most common commands that cause global invalidation are vgcreate and
vgextend. These are uncommon compared to commands that report global
information, e.g. vgs. So, the percentage of lvmetad replies containing
global_invalid should be very small.
VG information
--------------
VG information refers to metadata that is isolated to a single VG,
e.g. an LV or the size of an LV.
When an external system determines that VG information has been changed
from another host (e.g. an lvcreate or lvresize), it sends lvmetad the
message: set_vg_info: uuid=X version=N. X is the VG uuid, and N is the
latest VG seqno that was written. lvmetad checks the seqno of its cached
VG, and if the version from the message is newer, it sets an invalid flag
for the cached VG. The invalid flag, along with the newer seqno are saved
in a new vg_info struct.
When lvm commands request VG metadata from lvmetad, lvmetad includes the
invalid flag along with the VG metadata. The lvm command checks for this
flag, and rereads the VG from disk if set. The VG read from disk is sent
to lvmetad. lvmetad sees that the seqno in the new version matches the
seqno from the last set_vg_info message, and clears the vg invalid flag.
Further lvm commands will use the VG metadata from lvmetad until it is
next invalidated.
Zdenek Kabelac [Tue, 16 Jun 2015 11:47:43 +0000 (13:47 +0200)]
display: drop allocation from display_lvname
Use of display_lvname() in plain log_debug() may accumulate memory in
command context mempool. Use instead small ringbuffer which allows to
store cuple (10 ATM) names so upto 10 full names can be used at one.
We are not keeping full VG/LV names as it may eventually consume larger
amount of RAM resouces if vgname is longer and lots of LVs are in use.
Note: if there would be ever needed for displaing more names at once,
the limit should be raised (e.g. log_debug() would need to print more
then 10 LVs on a single line).