Peter Rajnoha [Mon, 5 Sep 2016 09:33:08 +0000 (11:33 +0200)]
lvmetad: check udev for mpath component several times if udev record not initialized yet
It's possible (mainly during boot) that udev has not finished
processing the device and hence the udev database record for that
device is still marked as uninitialized when we're trying to look
at it as part of multipath component check in pvscan --cache code.
So check several times with a short delay to wait for the udev db
record to be initialized before giving up completely.
David Teigland [Wed, 31 Aug 2016 18:05:53 +0000 (13:05 -0500)]
lvmetad: use udev to ignore multipath components during scan
When scanning devs to populate lvmetad during system startup,
filter-mpath with native sysfs multipath component detection
may not detect that a dev is multipath component. This is
because the multipath devices may not be set up yet.
Because of this, pvscan will scan multipath components during
startup, will see them as duplicate PVs, and will disable
lvmetad. This will leave lvmetad disabled on systems using
multipath, unless something or someone runs pvscan --cache
to rescan.
To avoid this problem, the code that is scanning devices to
populate lvmetad will now check the udev db to see if a
dev is a multipath component that should be skipped.
(This may not be perfect due to inherent udev races, but will
cover most cases and will be at least as good as it's ever
been.)
Peter Rajnoha [Tue, 30 Aug 2016 13:33:50 +0000 (15:33 +0200)]
lvmdump: use lsblk -s and lsblk -O in lvmdump only if these options are supported
The lsblk is just a nice helper here - it's not crucial for lvmdump so
do best effort here and use the most we can from current version of
lsblk that is installed on system. The lsblk -s option was added a bit
later after lsblk introduction and lsblk -O support even more later -
so if these are not available, use only pure lsblk output without any
extras.
Tony Asleson [Mon, 29 Aug 2016 19:26:16 +0000 (14:26 -0500)]
lvmdbusd: Handle different daemon options
- Prevent --lvmshell with --nojson, not a valid combination
- If user is preventing json, then no lvmshell usage
- Return boolean on Manager.UseLvmShell
Tony Asleson [Wed, 24 Aug 2016 23:29:35 +0000 (18:29 -0500)]
lvmdbusd: Use udev until ExternalEvent occurs
The normal mode of operation will be to monitor for udev events until an
ExternalEvent occurs. In that case the service will disable monitoring
for udev events and use ExternalEvent exclusively.
Note: User specifies --udev the service will always monitor udev regardless
if ExternalEvent is being called too.
Tony Asleson [Fri, 12 Aug 2016 20:23:05 +0000 (15:23 -0500)]
lvmdbusd: Add support for using lvm shell
With the addition of JSON and the ability to get output which is known to
not contain any extraneous text we can now leverage lvm shell, so that we
don't fork and exec lvm command line repeatedly.
Tony Asleson [Fri, 12 Aug 2016 20:20:49 +0000 (15:20 -0500)]
lvmdbusd: Add date & ts if running in a terminal
When we are running in a terminal it's useful to have a date & ts on log
output like you get when output goes to the journal. Check if we are
running on a tty and if we are, add it in.
Zdenek Kabelac [Wed, 24 Aug 2016 08:05:09 +0000 (10:05 +0200)]
cache: do not monitor cache-pool
Avoid monitoring of activated cache-pool - where the only purpose ATM
is to clear metadata volume which is actually activate in place
of cache-pool name (using public LV name).
Since VG lock is held across whole clear operation, dmeventd cannot
be used anyway - however in case of appliction crash we may
leave unmonitored device.
In future we may provide better mechanism as the current name
replacemnet is creating 'uncommon' table setups in case the metadata
LV is more complex type like raid (needs some futher thinking about
error path results).
Another point to think about is the fact we should not clear device
while holding lock (i.e. dmeventd mirror repair cannot work in cases
like this).
Peter Rajnoha [Thu, 25 Aug 2016 12:53:32 +0000 (14:53 +0200)]
conf: fix typo in report/columns_as_rows config option name recognition
Commit e947c362dd0ae1da2d76925fe6b38244c5f46d25 introduced
config_settings.h file for central place to store all definitions for
config options. By mistake, it used report/colums_as_rows instead
of report/columns_as_rows (missing "n" in "columns").
a579ba2ac27d fixed a regression causing a segfault if no external
origin existed but broke the logic leading to erroneous error
messages and creations of split off exported VGs in case the
external origin and the pool LVs were allocated on different PVs.
Creating a RaidLV in VGs with very small extent sizes caused
late failure in the kernel giving a not very informative error
message. Catch the attempt early and display failure message
'Unable to create RAID LV: requires minimum VG extent size 4.00 KiB'.
pvmove: prohibit non-resilient collocation of RAID SubLVs
'pvmove -n name pv1 pv2' allows to collocate multiple RAID SubLVs
on pv2 (e.g. results in collocated raidlv_rimage_0 and raidlv_rimage_1),
thus causing loss of resilence and/or performance of the RaidLV.
Fix this pvmove flaw leading to potential data loss in case of PV failure
by preventing any SubLVs from collocation on any PVs of the RaidLV.
Still allow to collocate any DataLVs of a RaidLV with their sibling MetaLVs
and vice-versa though (e.g. raidlv_rmeta_0 on pv1 may still be moved to pv2
already holding raidlv_rimage_0).
Because access to the top-level RaidLV name is needed,
promote local _top_level_lv_name() from raid_manip.c
to global top_level_lv_name().
raid_manip: add missing code avoiding MetaLV collocation on the same PV
Adding MetaLVs to given DataLVs (e.g. raid0 -> raid0_meta takeover),
_avoid_pvs_with_other_images_of_lv() was missing code to prohibit
allocation when called with a just allocated MetaLV to prohibit
collaocation of the next allocated MetaLV on the same PV.
Tony Asleson [Fri, 12 Aug 2016 19:31:06 +0000 (14:31 -0500)]
notify: Fix hang with lvm shell & --enable-notify-dbus
When lvm is compiled with --enable-notify-dbus and a user uses lvm
shell, after they issue 200+ commands the lvm shell will hang for
~30 seconds trying to notify the lvm dbus service that a change
has occurred. This appears to be caused by resource exhaustion,
because the sockets used for dbus communication are not be closed.
Peter Rajnoha [Mon, 8 Aug 2016 08:43:18 +0000 (10:43 +0200)]
config: add support for CFG_DISALLOW_INTERACTIVE flag to mark settings as not suitable for override in interactive mode
Some settings are not suitable for override in interactive/shell
mode because such settings may confuse the code and it may end
up with unexpected behaviour. This is because of the fact that
once we're in the interactive/shell mode, we have already applied
some settings for the shell itself and we can't override them
further because we're already using those settings to drive the
interactive/shell mode. Such settings would get ignored silently
or, in worse case, they would mess up the existing configuration.
Peter Rajnoha [Thu, 4 Aug 2016 11:42:57 +0000 (13:42 +0200)]
lvm: shell: extend log report to cover whole lvm shell's main loop
When lvm commands are executed in lvm shell, we cover the whole lvm
command execution within this shell now. That means, all messages logged
and status caught during each command execution is now recorded in the
log report, including overall command's return code.
The dm_report_group_output_and_pop_all calls dm_report_output and
dm_report_group_pop for all the items that are currently in report
group. This is just a shortcut that makes it easier to output and
pop group's content so the group handle can be reused again without
a need to initialize and configure it again.
The functionality of dm_report_group_output_and_pop_all is the
same as dm_report_destroy but without destroying the report group
handle.
Peter Rajnoha [Thu, 4 Aug 2016 09:05:20 +0000 (11:05 +0200)]
libdm: report: postpone printing of JSON starting '{' character till it's needed
This patch moves printing of starting '{' character for JSON output up
untili it's known there's any further output following - either the
content or ending '}' character.
Also, remove unnecessary switch for different report group types and
calling individual functions to handle dm_report_group_create as that
code is shared for all existing types at the moment.
Peter Rajnoha [Thu, 4 Aug 2016 07:42:33 +0000 (09:42 +0200)]
libdm: report: add dm_report_destroy_rows
Calling dm_report_destroy_rows makes it possible to destroy any report
content we have but at the same time it doesn't destroy the report
handle itself, thus it's possible to reuse that handle again for new
report content.
Functionally, this is the same as calling dm_report_output with the
report handle but omitting the output iself. This functionality may
be useful if we, for whatever reason, need to discard the report
content and start a fresh new one but with the same report configuration
and initialization and thus we can just reuse the existing handle.
Peter Rajnoha [Thu, 4 Aug 2016 07:32:05 +0000 (09:32 +0200)]
lvmcmdline: return 0/NULL if cmd->arg_values not set and arg_count/grouped_arg_count/arg_value called
We may call arg_count/grouped_arg_count/arg_value soon enough that
cmd->arg_values is not set yet.
Normally, when running a command, we execute lvm_run_command which in
turn calls _process_command_line to allocate and parse the command line
values and stores them in cmd->arg_values.
However, if we run lvm shell, this one doesn't accept any command line
options and we parse the command line for each command that is executed
within the lvm shell then. If we used any code that tries to access
cmd->arg_values through any of the the arg handling functions too
early, we could end up with a segfault due to uninitialized (NULL)
cmd->arg_values.
This patch just saves extra checks in all the code where arg handling
may be called too early so that the cmd->arg_values is not set up yet.
This does not apply to any of existing code, but subsequent patches
will need that.
Peter Rajnoha [Wed, 3 Aug 2016 13:37:14 +0000 (15:37 +0200)]
refactor: move report grouping and log reporting handles from processing_handle to cmd_context
With patches that will follow, this will make it possible to widen log
report coverage when commands are executed from lvm shell so the amount
of messages that may end up in stderr/stdout instead of log report are
minimized.
Peter Rajnoha [Mon, 25 Jul 2016 10:20:22 +0000 (12:20 +0200)]
shell: also collect last command's return code for subsequent 'lastlog' invocation
Add new log_context=shell and with log_object_type=cmd and
log_object_name=<command_name> for command log report to collect
overall return code from last command (this is reported under
log_type=status).
Peter Rajnoha [Fri, 8 Jul 2016 14:47:51 +0000 (16:47 +0200)]
log: separate output and make it possible to use given FDs
Currently, the output is separated in 3 parts and each part can go into
a separate and user-defined file descriptor:
- common output (stdout by default, customizable by LVM_OUT_FD environment variable)
- error output (stderr by default, customizable by LVM_ERR_FD environment variable)
- report output (stdout by default, customizable by LVM_REPORT_FD environment variable)
For example, each type of output goes to different output file:
[0] fedora/~ # cat err
Volume group "vg" not found
Cannot process volume group vg
[0] fedora/~ # cat report
LV VG Attr LSize Layout Role CTime
root fedora -wi-ao---- 19.00g linear public Wed May 27 2015 08:09:21
swap fedora -wi-ao---- 500.00m linear public Wed May 27 2015 08:09:21
Another example in LVM shell where the report goes to "report" file:
RAID6 LVs may not be created with --nosync or data corruption
may occur in case of device failures. The underlying MD raid6
personality used to drive the RaidLV performs read-modify-write
updates on stripes and thus relies on properly written parity
(P and Q Syndromes) during initial synchronization.
Once on it, enhance test to create/extend more and
larger RaidLVs and check sync/nosync status.
Tests exposed a kernel semantics change freezing resynchronization
on conversions from raid0[_meta] -> raid4 or adding raid1 legs
because kernel kept the RAID mapped device in 'frozen' state unless
an 'idle' message was sent or the table was reloaded (kernel patch pending).
Bryn M. Reeves [Mon, 8 Aug 2016 18:29:12 +0000 (19:29 +0100)]
man: explain deletion of 1st group member in dmstats.8.in
Although the use of the first region_id in a group to store the
DMS_GROUP=... aux_data tag is an internal implementation detail,
it has a user visible consequence in that deleting this region will
cause the group to disappear: add an explanation of this to the
'group' command and 'Regions, areas, and groups' section.