David Teigland [Fri, 1 Nov 2024 01:29:00 +0000 (20:29 -0500)]
lvmlockd: optimize new lv lease search
When converting a VG to locktype sanlock, a new
lease is allocated for each existing lv. Finding
a new lease location involved searching the lvmlock
LV from the start for an unused location, which
would be very slow with many LVs. Improve this by
starting each search from the last used location.
David Teigland [Thu, 31 Oct 2024 21:31:35 +0000 (16:31 -0500)]
lvmlockd: fix vgchange --locktype sanlock
Fix regression from commit 7f29afdb06d
"lvmlockd: configurable sanlock lease sizes on 4K disks"
That change failed to recognize that a running lockspace will not
exist in lvmlockd when converting a local VG to a sanlock VG, i.e.
vgchange --locktype sanlock vgname. When the vgchange attempted
to initialize new lv leases for existing LVs, lvmlockd would
return an error when it found no lockspace.
Zdenek Kabelac [Sat, 26 Oct 2024 20:37:00 +0000 (22:37 +0200)]
vg: add radix_tree for lv uuids
When searching for committed LV by uuid, this search can
be expensive for commands like 'vgremove' - so for
this part introduce 'lv_uuids' radix_tree that is
build with first access to lv_committed().
Zdenek Kabelac [Sat, 26 Oct 2024 20:18:19 +0000 (22:18 +0200)]
metadata: use radix_tree for find_lv_in_vg
Since there is a group of commands that need to access 'lv_list'
while still need to search for LV by its name, make the whole
struct lv_list a member of logical_volume structure.
This makes it easy to return also 'lv_list' this list this LV
within VG.
Also the patch should not use more memory, since we were allocating
lv_list for each LV anyway when linkin LV to VG.
Since find_lv_by_name() is now using radix_tree(),
use the same 'search for /' in LV in name for both
find_lv() & find_lv_in_vg().
TODO: Possibly refactor code and use only dm_list
instead of lv_list and dereference LV with container_of()
(thus saving pointer within struct logical_volume) - but
we use 'lv_list' currently in many places...
Zdenek Kabelac [Thu, 31 Oct 2024 13:42:16 +0000 (14:42 +0100)]
config: introduce validate_metadata
Add lvm.conf config/validate_metadata configurable setting.
Allows to disable validation of volume_group structure before
writing to disk.
Call of vg_validate() is supposed to catch any inconsistency
of in-memory volume group structure and possibly early aborting
commnand before making any more 'damage' in case the VG struct
is found insistent after some metadata manipulation.
This is almost always useful for devel - and also for normal user
as for small metadata size this doesn't add too much overhead.
However if the volume_group size is large and operations are just
adding removing simple LVs - this validation time may add noticable
to final command running time.
So if the user seeks the highest perfomance of command and does
not do any 'complex' metadata manipulation - it's reasonably safe
to disable validation (with the use of setting "none") here.
Zdenek Kabelac [Tue, 29 Oct 2024 17:44:55 +0000 (18:44 +0100)]
metadata: lv_set_name use uniq_insert
With presence of uniq_insert, use this function also
here for extra protection and check for duplicate lv_name
when inserting a new name into radix_tree.
Zdenek Kabelac [Mon, 28 Oct 2024 20:41:30 +0000 (21:41 +0100)]
tests: use longer tag
Avoid config 'grep' with actual 'randomly' generated path name
which may eventually contain 'cc' as part the path and
causing a mismatch of the grep test.
Zdenek Kabelac [Fri, 25 Oct 2024 12:44:50 +0000 (14:44 +0200)]
lvresize: fix regression when resizing with fs
When 'lvresize -r' is used to resize the volume, it's valid to
resize even to the same size of an LV, as the command then runs
fs-resize utility to eventually upsize the fs to the current
volume size.
Return code of such command then reflects the return value
of this fs-resize tool.
This fixes the regression introduced when the support
for option --fs was added (2.03.17).
There are no fuction named print_common_options_cmd()
and print_common_options_lvm(). So, rename them to the
real function named print_usage_common_cmd() and
print_usage_common_lvm().
Zdenek Kabelac [Thu, 24 Oct 2024 14:12:18 +0000 (16:12 +0200)]
metadata: use radix tree to find lv_names
Replace usage of dm_hash with radix_tree to quickly find LV name
with a vg and also index PV names with set of available PVs.
This PV index is only needed during the import, but instead
of passing 'radix_tree *' everywhere, just keep this within
a VG struct as well and once the parsing is finished, release
this PV index radix_tree.
This also makes it easier to replace this structure
in the future if needed.
lv_set_name now uses radix_tree remove+insert to keep lv_names
tree in-sync and usable for find_lv queries.
Zdenek Kabelac [Thu, 24 Oct 2024 12:04:07 +0000 (14:04 +0200)]
radix_tree: add radix_tree_uniq_insert
When using radix_tree to identify duplicate entries we may
avoid to call an extra 'lookup()' prior the insert() operation
add radix_tree_uniq_insert/_ptr() that is able to report -1 if
there was already set a value for the given key.
Zdenek Kabelac [Wed, 23 Oct 2024 17:22:10 +0000 (19:22 +0200)]
vgcfgrestore: validate complete VG
Avoid finding problems in vg_validate when restoring
invalid VG metadata as that would lead to internal error.
i.e. adding unsupported METADATA_FLAG to zero segtype
can trigger such thing.
Zdenek Kabelac [Wed, 23 Oct 2024 11:30:55 +0000 (13:30 +0200)]
export: change to read_segtype_and_lvflags
Instead of duplicating whole segtype string with flags and
using 2 calls read_segtype_lvflags() + get_segtype_from_string(),
merge the functionality into a single read_segtype_and_lvflags().
This allows to make only a local string copy (no allocs) and eventually
to not copy segtype string at all, when there are no flags.
Zdenek Kabelac [Sun, 20 Oct 2024 20:14:39 +0000 (22:14 +0200)]
export: reduce emit_to_buffer calls
As the 'emit_to_buffer' uses relatively complex
vsnprintf() call inside, try to reduce number
of unnecessary calls and try replace some more
complex string build with a single call instead.
Zdenek Kabelac [Wed, 23 Oct 2024 09:49:55 +0000 (11:49 +0200)]
dev-cache: enhance usability of dm cache
With existing code, the cache was working only to the 2nd. locking.
So i.e. when 'lvs' scans system with more then one VG, the caching
was effectively not working.
Update the code, so the label invalidate code is able to update DM
cache - so whenever we take a new lock - we will refresh the cache.
TODO: the refresh ATM does a very simple compare of old a new list
of cached DM device, and with the first spotted difference, it just
fallback to the full rebuild of DM cache - with large amount of active
devices this might not the most efficient way....
Zdenek Kabelac [Sun, 20 Oct 2024 18:48:56 +0000 (20:48 +0200)]
debug: use just LV name for debug message
Since we detect 'debug' level after calling 'log_debug()' - all
the arguments are evaluated, so in this case display_lvname() was
preparing a string that is not used in case debugging is not enabled.
So since these string are on 'hot-path' and it's already known
which VG is being worked on, in these few cases just use lv->name.
Zdenek Kabelac [Thu, 17 Oct 2024 21:10:01 +0000 (23:10 +0200)]
device_mapper: add dm_config_parse_only_section
This function call is able to setup config parser so it stops
parsing 'subsection' nodes after parsing named section node.
Only nodes at 'level' 0 will be still processed. And this nodes
are found by searching for last \n}\n sequence from the end of
buffer (instead of trying to analyze all the text in buffer).
Zdenek Kabelac [Fri, 18 Oct 2024 22:05:45 +0000 (00:05 +0200)]
check_lv_segment: split into incomplete complete
Split single check_lv_segments() into 2 separate
versions so they can be called independently.
This allow to 'skip' already checked segment
check after it's been imported to VG and also
avoid another repeated checking when validating
segment with complete vg.
**
check_lv_segments_incomplete_vg()
this check just basic LV segment properties and does not
validate those requiring full VG.
**
check_lv_segments_complete_vg()
Remaining check that expects complete VG is present.
Zdenek Kabelac [Sat, 12 Oct 2024 23:11:04 +0000 (01:11 +0200)]
thin: check only for profiled config vars
ATM this rather save a lot of unncessary log entries as it grabs
the global autoextend_threshold (profile == NULL) just once instead
of revealing it every time with NULL profile.
Zdenek Kabelac [Sat, 12 Oct 2024 18:52:27 +0000 (20:52 +0200)]
device_mapper: increase mem pool chunk size
Use bigger memory pool chunk size and reduces amount of
memory pool extensions when handling larger metadata, but do not
make it noticable bigger when handling small ones...
Use same large value also when allocating VG memory pool.
Zdenek Kabelac [Sat, 12 Oct 2024 19:31:11 +0000 (21:31 +0200)]
device_mapper: store string on stack
Instead of allocating string from a pool, for shorted strings
use buffer on stack since the string after the use in _find_or_make_node()
as no longer needed.
Eventually we may enhance code also for TOK_STRING_ESCAPED and TOK_STRING,
but they appear to be unused for _section().
Zdenek Kabelac [Wed, 9 Oct 2024 12:23:51 +0000 (14:23 +0200)]
device_mapper: nodes and values with strings
Avoid double dm_pool allocation call by copying string
for node name and config value directly after the end
of node/value structure.
It would be likely better to not copy these strings at all
and derefence it from the original string however that
needs futher changes in the code base.
Zdenek Kabelac [Fri, 4 Oct 2024 12:03:16 +0000 (14:03 +0200)]
crc: add newer zlib code
This code is faster when calculating crc32 checksum for larger
block areas. There is also SIMD variant present in the code,
however ATM the influence on performance of lvm2 is not that big..
Peter Rajnoha [Fri, 18 Oct 2024 08:43:08 +0000 (10:43 +0200)]
lv_manip: fall back to direct zeroing on any BLKZEROOUT ioctl failure
When BLKZEROOUT ioctl fails, it should not stop us from trying the direct
zeroing as a fallback action, since this is an optimization only.
We should be able to continue with new LV creation if we succeed
with that direct fallback then.
Related report: https://issues.redhat.com/browse/RHEL-58737
David Teigland [Wed, 16 Oct 2024 17:29:13 +0000 (12:29 -0500)]
lvremove: fix failed remove of all LVs in shared VG
commit a125a3bb505cc "lv_remove: reduce commits for removed LVs"
changed "lvremove <vgname>" from removing one LV at a time,
to removing all LVs in one vg write/commit. It also changed
the behavior if some of the LVs could not be removed, from
removing those LVs that could be removed, to removing nothing
if any LV could not be removed. This caused a regression in
shared VGs using sanlock, in which the on-disk lease was
removed for any LV that could be removed, even if the command
decided to remove nothing. This would leave LVs without a
valid ondisk lease, and "lock failed: error -221" would be
returned for any command attempting to lock the LV.
Fix this by not freeing the on-disk leases until after the
command has decided to go ahead and remove everything, and
has written the VG metadata.