Tony Asleson [Thu, 22 Sep 2022 22:10:13 +0000 (17:10 -0500)]
lvmdbustest: Re-work setUp
Place the addCleanup at the end as we don't want to go through clean up
if we don't make it through setUp. If we don't do this we can remove VGs
that we didn't create in the unit test.
Tony Asleson [Wed, 21 Sep 2022 15:05:36 +0000 (10:05 -0500)]
lvmdbusd: Correct lvm shell signal & child process handling
Previously when the __del__ method ran on LVMShellProxy we would blindly
call terminate(). This was a race condition as the underlying process
may/maynot be present. When the process is still present the SIGTERM will
end up being seen by lvmdbusd too. Re-work the code so that we
first try to wait for the child process to exit and only then if it hasn't
exited will we send it a SIGTERM. We also ensure that when this is
executed we will briefly ignore a SIGTERM that arrives for the daemon.
If we plan to use dm throttling for mirror targets - we actually
have to check whether kernel runs with CONFIG_HZ_1000 - if it does
not the whole idea of throttling is actually not working in the
testsuite as within a single 'tick' with HZ 100 way too much date
is being moved on any modern hardware - and since there is no plan
to change this in kernel - we simply avoid using throttling on such
kernel and test needs to work differently - either ignore results
or use much larger mirror sizes...
Tony Asleson [Tue, 20 Sep 2022 16:40:15 +0000 (11:40 -0500)]
lvmdbusd: Correct get_object_path_by_uuid_lvm_id
When checking to see if the PV is missing we incorrectly checked that the
path_create was equal to PV creation. However, there are cases where we
are doing a lookup where the path_create == None. In this case, we would
fail to set lvm_id == None which caused a problem as we had more than 1
PV that was missing. When this occurred, the second lookup matched the
first missing PV that was added to the object manager. This resulted in
the following:
Traceback (most recent call last):
File "/usr/lib/python3.9/site-packages/lvmdbusd/utils.py", line 667, in _run
self.rc = self.f(*self.args)
File "/usr/lib/python3.9/site-packages/lvmdbusd/fetch.py", line 25, in _main_thread_load
(changes, remove) = load_pvs(
File "/usr/lib/python3.9/site-packages/lvmdbusd/pv.py", line 46, in load_pvs
return common(
File "/usr/lib/python3.9/site-packages/lvmdbusd/loader.py", line 55, in common
del existing_paths[dbus_object.dbus_object_path()]
Because we expect to find the object in existing_paths if we found it in
the lookup.
Tony Asleson [Tue, 6 Sep 2022 21:23:06 +0000 (16:23 -0500)]
lvmdbusd: Use pseudo tty to get "lvm>" prompt again
When lvm is compiled with editline, if the file descriptors don't look like
a tty, then no "lvm> " prompt is done. Having lvm output the shell prompt
when consuming JSON on a report file descriptor is very useful in
determining if lvm command is complete.
Tony Asleson [Wed, 31 Aug 2022 20:04:59 +0000 (15:04 -0500)]
lvmdbusd: Instruct lvm to output debug to file for fullreport
Historically we have seen a few different errors which occur when we call
fullreport. Failing exit code and JSON which is missing one or more keys.
Instruct lvm to dump the debug to a file during fullreport calls when we
fork & exec lvm. If we encounter an error, ouput the debug data.
The reason this isn't being done when lvmshell is used is because we
don't have an easy way to test the error paths.
This change is complicated by the following:
1. We don't know if fullreport was good until we evaluate all the JSON.
This is done a bit after we have called into lvm and returned.
2. We don't want to orphan the debug file used by lvm if the daemon is
killed. Thus we try to minimize the window where the debug file hasn't
already been unlinked. A RFE to pass an open FD to lvm for this
purpose is outstanding.
The temp. file is:
-rw------. 1 root root /tmp/lvmdbusd.lvm.debug.XXXXXXXX.log
Tony Asleson [Wed, 31 Aug 2022 16:20:49 +0000 (11:20 -0500)]
lvmdbusd: Re-work error handling
Introduce an exception which is used for known existing issues with lvm.
This is used to distinguish between errors between lvm itself and lvmdbusd.
In the case of lvm bugs, when we simply retry the operation we will log
very little. Otherwise, we will dump a full traceback for investigation
when we do the retry.
Tony Asleson [Fri, 26 Aug 2022 18:01:05 +0000 (13:01 -0500)]
lvmdbusd: Re-work error handling for run_cmd
Instead of lumping all the exceptions, break them out to handle the dbus
exceptions separately, to reduce the amount of debug information that ends
up in the journal that has questionable value.
Tony Asleson [Thu, 25 Aug 2022 14:33:50 +0000 (09:33 -0500)]
lvmdbusd: Don't report recoverable error
Lvm occasionally fails to return all the request JSON keys in the output of
"fullreport". This happens very rarely. When it does the daemon was reporting
the resulting informational exception:
MThreadRunner: exception
Traceback (most recent call last):
File "/usr/lib/python3.9/site-packages/lvmdbusd/utils.py", line 667, in _run
self.rc = self.f(*self.args)
File "/usr/lib/python3.9/site-packages/lvmdbusd/fetch.py", line 40, in _main_thread_load
(lv_changes, remove) = load_lvs(
File "/usr/lib/python3.9/site-packages/lvmdbusd/lv.py", line 143, in load_lvs
return common(
File "/usr/lib/python3.9/site-packages/lvmdbusd/loader.py", line 37, in common
objects = retrieve(search_keys, cache_refresh=False)
File "/usr/lib/python3.9/site-packages/lvmdbusd/lv.py", line 95, in lvs_state_retrieve
l['vdo_operating_mode'],
KeyError: 'vdo_operating_mode'
The daemon retries the operation, which usually works and the daemon continues.
However, simply reporting this informational stack trace is causing CI and other
automated tests to fail as they expect no tracebacks in the log output.
Remove the reporting of this code path unless it persists and causes the daemon
to give up and exit.
Tony Asleson [Wed, 17 Aug 2022 22:24:08 +0000 (17:24 -0500)]
lvmdbusd: Add debug circular buffer
When the daemon isn't started with --debug we will keep a circular
buffer of the past N number of debug messages which we will output
when we encounter an issue.
Tony Asleson [Wed, 17 Aug 2022 22:21:19 +0000 (17:21 -0500)]
lvmdbusd: fix hangs on SIGINT
Rather than trying to bubble up return codes that get us to exit cleanly
it's better to just raise an exception to bail. In some cases functions
don't have return codes, so they cannot be checked.
Tony Asleson [Wed, 17 Aug 2022 17:08:16 +0000 (12:08 -0500)]
lvmdbusd: Make sure to set cfg.got_external_event
We were incorrectly only setting this if --udev wasn't present on the
command line. In all cases when we see a manager.ExternalEvent we want
to set this.
Tony Asleson [Wed, 17 Aug 2022 17:05:06 +0000 (12:05 -0500)]
lvmdbusd: Handle no lastlog
Depending on when an occurs, it maynot have any information available for
lastlog. In this case try to grab an error message from the original
response.
Tony Asleson [Tue, 9 Aug 2022 22:43:00 +0000 (17:43 -0500)]
lvmdbusd: Re-work flight recorder data
Introduce a new lock for the flight recorder, so that we can dump it when
a command is block waiting for lvm to complete. Also in all paths we will
addthe metadata to the flight recorder before it's done, so we will have
it when a command hangs and we dump the flight recorder. Add the missing
bits after the command has finished.
Tony Asleson [Tue, 9 Aug 2022 22:36:54 +0000 (17:36 -0500)]
lvmdbustest: Remove force exception in _wait_for_job
We put this in to test one of the paths in the damon, but unfortunately
if we hit the race condition where the job actually is done we will try
to call j.Wait(1) after the remove. This fails with:
dbus.exceptions.DBusException: org.freedesktop.DBus.Error.UnknownMethod:
Method "Wait" with signature "i" on interface "com.redhat.lvmdbus1.Job"
doesn't exist
Which is caused by the dbus object no longer existing. We could handle
this, but the issue is we no longer have the ability to get the result to
return, they have been lost.
A better solution would be to write a specific unit test to force this code
path and handle all the possible outcomes.
David Teigland [Wed, 14 Sep 2022 14:13:58 +0000 (09:13 -0500)]
vgremove: remove online files in run dir
These files are automatically cleared on reboot given
that /run is tmpfs, and that remains the primary way
of clearing online files.
But, if there's extreme use of vgcreate+pvscan+vgremove
between reboots, then removing online files in vgremove
will limit the number of unused online files using space
in /run.
David Teigland [Tue, 14 Jun 2022 20:20:21 +0000 (15:20 -0500)]
lvresize: add new options and defaults for fs handling
The new option "--fs String" for lvresize/lvreduce/lvextend
controls the handling of file systems before/after resizing
the LV. --resizefs is the same as --fs resize.
The new option "--fsmode String" can be used to control
mounting and unmounting of the fs during resizing.
Possible --fs values:
checksize
Only applies to reducing size; does nothing for extend.
Check the fs size and reduce the LV if the fs is not using
the affected space, i.e. the fs does not need to be shrunk.
Fail the command without reducing the fs or LV if the fs is
using the affected space.
resize
Resize the fs using the fs-specific resize command.
This may include mounting, unmounting, or running fsck.
See --fsmode to control mounting behavior, and --nofsck to
disable fsck.
resize_fsadm
Use the old method of calling fsadm to handle the fs
(deprecated.) Warning: this option does not prevent lvreduce
from destroying file systems that are unmounted (or mounted
if prompts are skipped.)
ignore
Resize the LV without checking for or handling a file system.
Warning: using ignore when reducing the LV size may destroy the
file system.
Possible --fsmode values:
manage
Mount or unmount the fs as needed to resize the fs,
and attempt to restore the original mount state at the end.
nochange
Do not mount or unmount the fs. If mounting or unmounting
is required to resize the fs, then do not resize the fs or
the LV and fail the command.
offline
Unmount the fs if it is mounted, and resize the fs while it
is unmounted. If mounting is required to resize the fs,
then do not resize the fs or the LV and fail the command.
Notes on lvreduce:
When no --fs or --resizefs option is specified:
. lvextend default behavior is fs ignore.
. lvreduce default behavior is fs checksize
(includes activating the LV.)
With the exception of --fs resize_fsadm|ignore, lvreduce requires
the recent libblkid fields FSLASTBLOCK and FSBLOCKSIZE.
FSLASTBLOCK*FSBLOCKSIZE is the last byte used by the fs on the LV,
which determines if reducing the fs is necessary.
mdraid has some breakage - so 5.19 is crashing
(possibly even some more older version - that can be added as well)
Test seems to pass with 6.0-rc kernel.
Add new define 'newline' for use in 'foreach()'
Add new $(SHOW) for makefile printing output
Add 'make print-VAR' for easier debugging of Makefiles' variables.
Peter Rajnoha [Tue, 6 Sep 2022 12:40:06 +0000 (14:40 +0200)]
report: fix lv_active column type from STR to BIN
Fix lv_active to be of BIN type instead of STR. This allows lv_active to
follow the report/binary_values_as_numeric setting as well as --binary
cmd line switch. Also, it makes it possible to use -S|--select with
either textual or numeric representation of the value, like 'lvs -S
active=active' but also 'lvs -S active=1'.
David Teigland [Tue, 30 Aug 2022 19:40:48 +0000 (14:40 -0500)]
vgimportdevices: fix locking when creating devices file
Take the devices file lock before creating a new devices file.
(Was missed by the change to preemptively create the devices
file prior to setup_devices(), which was done to improve the
error path.)