There are no functions named print_common_options_cmd()
and print_common_options_lvm(). So rename them to the
real function names print_usage_common_cmd() and
print_usage_common_lvm().
Zdenek Kabelac [Thu, 24 Oct 2024 14:12:18 +0000 (16:12 +0200)]
metadata: use radix tree to find lv_names
Replace usage of dm_hash with radix_tree to quickly find an LV name
within a VG, and also index PV names within the set of available PVs.
This PV index is only needed during the import, but instead
of passing 'radix_tree *' everywhere, just keep this within
a VG struct as well and once the parsing is finished, release
this PV index radix_tree.
This also makes it easier to replace this structure
in the future if needed.
lv_set_name now uses radix_tree remove+insert to keep the lv_names
tree in sync and usable for find_lv queries.
Zdenek Kabelac [Thu, 24 Oct 2024 12:04:07 +0000 (14:04 +0200)]
radix_tree: add radix_tree_uniq_insert
When using radix_tree to identify duplicate entries, we can
avoid calling an extra 'lookup()' prior to the insert() operation.
Add radix_tree_uniq_insert/_ptr() that is able to report -1 if
a value was already set for the given key.
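The idea of folding the duplicate check into the insert can be sketched as below. This is a toy byte-wise trie, not lvm2's radix_tree; toy_node, toy_node_create() and toy_uniq_insert() are hypothetical names, and the 1 / -1 / 0 return convention is an assumption modelled on the text above. Freeing the nodes is omitted for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy byte-wise trie; NOT lvm2's radix_tree. All names are hypothetical. */
struct toy_node {
	struct toy_node *child[256];
	int has_value;
	void *value;
};

/* Allocates a zeroed node; returns NULL on allocation failure. */
static struct toy_node *toy_node_create(void)
{
	return calloc(1, sizeof(struct toy_node));
}

/*
 * Walks the key once. Returns 1 if the value was stored, -1 if the key
 * already had a value (the existing value is kept), 0 on allocation failure.
 */
static int toy_uniq_insert(struct toy_node *root, const char *key, void *value)
{
	struct toy_node *n = root;
	const unsigned char *p;

	assert(root && key);

	for (p = (const unsigned char *) key; *p; p++) {
		if (!n->child[*p] && !(n->child[*p] = toy_node_create()))
			return 0;
		n = n->child[*p];
	}

	if (n->has_value)
		return -1;	/* duplicate found without a separate lookup */

	n->has_value = 1;
	n->value = value;

	return 1;
}
```

A caller that only wants to detect duplicates can then test for -1 after a single walk of the key instead of doing a lookup followed by an insert.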
Zdenek Kabelac [Wed, 23 Oct 2024 17:22:10 +0000 (19:22 +0200)]
vgcfgrestore: validate complete VG
Avoid finding problems in vg_validate when restoring
invalid VG metadata, as that would lead to an internal error.
E.g. adding an unsupported METADATA_FLAG to the zero segtype
can trigger this.
Zdenek Kabelac [Wed, 23 Oct 2024 11:30:55 +0000 (13:30 +0200)]
export: change to read_segtype_and_lvflags
Instead of duplicating the whole segtype string with flags and
using 2 calls, read_segtype_lvflags() + get_segtype_from_string(),
merge the functionality into a single read_segtype_and_lvflags().
This allows making only a local string copy (no allocs) and eventually
not copying the segtype string at all when there are no flags.
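A minimal sketch of the "copy only when flags are present" idea follows. It assumes flags are appended to the type name with a '+' separator; the flag names, the bit values and toy_read_type_and_flags() itself are hypothetical and do not reproduce the lvm2 function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_FLAG_A UINT64_C(0x1)	/* hypothetical flag bits */
#define TOY_FLAG_B UINT64_C(0x2)

/* Maps one flag name of the given length to its bit; returns 0 if unknown. */
static uint64_t toy_flag_from_name(const char *name, size_t len)
{
	if (len == 6 && !strncmp(name, "FLAG_A", len))
		return TOY_FLAG_A;
	if (len == 6 && !strncmp(name, "FLAG_B", len))
		return TOY_FLAG_B;
	return 0;
}

/*
 * Parses "type" or "type+FLAG+FLAG" in one call.
 * The type name is copied into the caller's local buffer only when flags
 * follow it; otherwise *type points at the input and nothing is copied.
 * Returns 1 on success, 0 on an unknown flag or a too-small buffer.
 */
static int toy_read_type_and_flags(const char *str, char *buf, size_t buflen,
				   const char **type, uint64_t *flags)
{
	const char *p, *end;
	size_t len;
	uint64_t f;

	assert(str && buf && type && flags);

	*flags = 0;

	if (!(p = strchr(str, '+'))) {
		*type = str;	/* no flags: no copy at all */
		return 1;
	}

	len = (size_t) (p - str);
	if (len >= buflen)
		return 0;
	memcpy(buf, str, len);
	buf[len] = '\0';
	*type = buf;

	while (p) {
		p++;	/* skip '+' */
		end = strchr(p, '+');
		len = end ? (size_t) (end - p) : strlen(p);
		if (!(f = toy_flag_from_name(p, len)))
			return 0;
		*flags |= f;
		p = end;
	}

	return 1;
}
```

The caller supplies a small buffer on its own stack, so no allocation is needed in either case.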
Zdenek Kabelac [Sun, 20 Oct 2024 20:14:39 +0000 (22:14 +0200)]
export: reduce emit_to_buffer calls
As 'emit_to_buffer' uses a relatively complex
vsnprintf() call inside, try to reduce the number
of unnecessary calls and replace some more
complex string builds with a single call instead.
Zdenek Kabelac [Wed, 23 Oct 2024 09:49:55 +0000 (11:49 +0200)]
dev-cache: enhance usability of dm cache
With the existing code, the cache was working only until the 2nd locking.
So e.g. when 'lvs' scans a system with more than one VG, the caching
was effectively not working.
Update the code so the label invalidate code is able to update the DM
cache, so whenever we take a new lock, we refresh the cache.
TODO: the refresh ATM does a very simple compare of the old and new list
of cached DM devices, and at the first spotted difference it just
falls back to a full rebuild of the DM cache. With a large number of active
devices this might not be the most efficient way.
Zdenek Kabelac [Sun, 20 Oct 2024 18:48:56 +0000 (20:48 +0200)]
debug: use just LV name for debug message
Since we detect the 'debug' level after calling 'log_debug()', all
the arguments are evaluated, so in this case display_lvname() was
preparing a string that is not used when debugging is not enabled.
Since these strings are on the 'hot-path' and it's already known
which VG is being worked on, in these few cases just use lv->name.
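The cost described above comes from C evaluating function arguments before the callee can check the log level. A small sketch with stand-in helpers (toy_log_debug() and toy_display_name() are hypothetical, not the lvm2 functions) makes the effect countable:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

static int debug_enabled = 0;	/* hypothetical log level switch */
static int format_calls = 0;	/* counts how often the costly helper runs */

/* Stand-in for a helper like display_lvname(): builds "vg/lv" into buf. */
static const char *toy_display_name(char *buf, size_t len,
				    const char *vg, const char *lv)
{
	format_calls++;
	snprintf(buf, len, "%s/%s", vg, lv);
	return buf;
}

/* The level check happens here, after the caller evaluated all arguments. */
static void toy_log_debug(const char *fmt, ...)
{
	va_list ap;

	assert(fmt != NULL);

	if (!debug_enabled)
		return;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
}

/* Costly variant: the full name is built even when debugging is off. */
static void log_with_full_name(const char *vg, const char *lv)
{
	char buf[256];

	toy_log_debug("Processing LV %s.\n",
		      toy_display_name(buf, sizeof(buf), vg, lv));
}

/* Cheap variant: pass the name string that already exists. */
static void log_with_plain_name(const char *lv)
{
	toy_log_debug("Processing LV %s.\n", lv);
}
```

With debug_enabled left at 0, log_with_full_name() still increments format_calls, while log_with_plain_name() does no formatting work at all.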
Zdenek Kabelac [Thu, 17 Oct 2024 21:10:01 +0000 (23:10 +0200)]
device_mapper: add dm_config_parse_only_section
This function call is able to set up the config parser so it stops
parsing 'subsection' nodes after parsing the named section node.
Only nodes at 'level' 0 will still be processed. These nodes
are found by searching for the last \n}\n sequence from the end of the
buffer (instead of trying to analyze all the text in the buffer).
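The backward search for the last \n}\n sequence can be sketched as follows. This shows only the search step, not the config parser; find_last_section_end() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns a pointer to the last "\n}\n" in buf[0..len), or NULL if absent.
 * Scans from the end, so text before the match is never examined.
 */
static const char *find_last_section_end(const char *buf, size_t len)
{
	static const char pat[] = "\n}\n";
	const size_t plen = sizeof(pat) - 1;
	size_t i;

	assert(buf != NULL);

	if (len < plen)
		return NULL;

	/* i takes the values len - plen, ..., 1, 0 inside the loop body. */
	for (i = len - plen + 1; i-- > 0; )
		if (!memcmp(buf + i, pat, plen))
			return buf + i;

	return NULL;
}
```

Everything after the returned position plus three bytes is the trailing top-level text, which is typically short compared to the whole metadata buffer.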
Zdenek Kabelac [Fri, 18 Oct 2024 22:05:45 +0000 (00:05 +0200)]
check_lv_segment: split into incomplete complete
Split single check_lv_segments() into 2 separate
versions so they can be called independently.
This allows skipping the check of an already checked segment
after it's been imported to the VG, and also
avoids another repeated check when validating
a segment with the complete VG.
**
check_lv_segments_incomplete_vg()
This checks just basic LV segment properties and does not
validate those requiring the full VG.
**
check_lv_segments_complete_vg()
Remaining checks that expect a complete VG to be present.
Zdenek Kabelac [Sat, 12 Oct 2024 23:11:04 +0000 (01:11 +0200)]
thin: check only for profiled config vars
ATM this saves a lot of unnecessary log entries, as it grabs
the global autoextend_threshold (profile == NULL) just once instead
of retrieving it every time with a NULL profile.
Zdenek Kabelac [Sat, 12 Oct 2024 18:52:27 +0000 (20:52 +0200)]
device_mapper: increase mem pool chunk size
Use a bigger memory pool chunk size to reduce the number of
memory pool extensions when handling larger metadata, but do not
make it noticeably bigger when handling small ones.
Use the same large value also when allocating the VG memory pool.
Zdenek Kabelac [Sat, 12 Oct 2024 19:31:11 +0000 (21:31 +0200)]
device_mapper: store string on stack
Instead of allocating the string from a pool, for shorter strings
use a buffer on the stack, since the string is no longer needed after
its use in _find_or_make_node().
Eventually we may enhance the code also for TOK_STRING_ESCAPED and TOK_STRING,
but they appear to be unused for _section().
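A rough sketch of the stack-buffer approach, under stated assumptions: the threshold TOY_STACK_LEN is invented, malloc() stands in for the pool allocation, and toy_use_name() is a hypothetical consumer that needs the NUL-terminated name only for the duration of the call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_STACK_LEN 128	/* hypothetical threshold, not taken from lvm2 */

static int heap_copies = 0;	/* counts fallbacks to heap allocation */

/* Hypothetical consumer that needs a NUL-terminated name during the call. */
static size_t toy_use_name(const char *name)
{
	return strlen(name);
}

/*
 * The token is a (pointer, length) slice without a terminating NUL.
 * Short tokens are copied to a stack buffer; long ones fall back to malloc().
 * Returns the value from toy_use_name(), or (size_t) -1 on allocation failure.
 */
static size_t toy_process_token(const char *tok, size_t len)
{
	char stack_buf[TOY_STACK_LEN];
	char *name = stack_buf;
	size_t r;

	assert(tok != NULL);

	if (len >= sizeof(stack_buf)) {
		if (!(name = malloc(len + 1)))
			return (size_t) -1;
		heap_copies++;
	}

	memcpy(name, tok, len);
	name[len] = '\0';

	r = toy_use_name(name);	/* the copy is not needed after this call */

	if (name != stack_buf)
		free(name);

	return r;
}
```

Section names in metadata are usually short, so in this sketch the heap path is expected to be rare.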
Zdenek Kabelac [Wed, 9 Oct 2024 12:23:51 +0000 (14:23 +0200)]
device_mapper: nodes and values with strings
Avoid a double dm_pool allocation call by copying the string
for the node name and config value directly after the end
of the node/value structure.
It would likely be better to not copy these strings at all
and dereference them from the original string, however that
needs further changes in the code base.
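Placing the string directly behind the structure turns two allocations into one. A minimal sketch, with malloc() standing in for the dm_pool allocator and toy_config_node as a hypothetical, simplified structure:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy node; the key points into the same allocation, right behind the struct. */
struct toy_config_node {
	struct toy_config_node *sib;
	const char *key;
};

/* One allocation holds both the node and a NUL-terminated copy of key[0..len). */
static struct toy_config_node *toy_node_create_named(const char *key, size_t len)
{
	struct toy_config_node *n;
	char *str;

	assert(key != NULL);

	if (!(n = malloc(sizeof(*n) + len + 1)))
		return NULL;

	str = (char *) (n + 1);	/* first byte after the structure */
	memcpy(str, key, len);
	str[len] = '\0';

	n->sib = NULL;
	n->key = str;

	return n;
}
```

Releasing the node also releases its name, which matches how pool memory is dropped as a whole.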
Zdenek Kabelac [Fri, 4 Oct 2024 12:03:16 +0000 (14:03 +0200)]
crc: add newer zlib code
This code is faster when calculating the crc32 checksum for larger
block areas. There is also a SIMD variant present in the code,
however ATM the influence on the performance of lvm2 is not that big.
Peter Rajnoha [Fri, 18 Oct 2024 08:43:08 +0000 (10:43 +0200)]
lv_manip: fall back to direct zeroing on any BLKZEROOUT ioctl failure
When BLKZEROOUT ioctl fails, it should not stop us from trying the direct
zeroing as a fallback action, since this is an optimization only.
We should be able to continue with new LV creation if we succeed
with that direct fallback then.
Related report: https://issues.redhat.com/browse/RHEL-58737
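The fallback logic can be sketched as below for Linux. It is an illustration only: toy_zero_range() and toy_zero_direct() are hypothetical names, the 4096-byte chunk size is arbitrary, and the real lvm2 code path is more involved.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/fs.h>	/* BLKZEROOUT */

/* Fallback path: write zeros directly in fixed-size chunks. */
static int toy_zero_direct(int fd, uint64_t offset, uint64_t len)
{
	const size_t bufsz = 4096;
	char *buf;
	ssize_t w;
	size_t chunk;
	int r = 1;

	if (!(buf = calloc(1, bufsz)))	/* calloc() returns zeroed memory */
		return 0;

	while (len) {
		chunk = len < bufsz ? (size_t) len : bufsz;
		if ((w = pwrite(fd, buf, chunk, (off_t) offset)) <= 0) {
			r = 0;
			break;
		}
		offset += (uint64_t) w;
		len -= (uint64_t) w;
	}

	free(buf);

	return r;
}

/*
 * Try the BLKZEROOUT ioctl first; on any failure fall back to direct
 * writes instead of giving up. Returns 1 on success, 0 on failure.
 */
static int toy_zero_range(int fd, uint64_t offset, uint64_t len)
{
	uint64_t range[2] = { offset, len };

	assert(fd >= 0);

	if (!ioctl(fd, BLKZEROOUT, range))
		return 1;	/* the optimized path worked */

	return toy_zero_direct(fd, offset, len);
}
```

Only a failure of the direct write is treated as fatal here, since the ioctl is just an optimization.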
David Teigland [Wed, 16 Oct 2024 17:29:13 +0000 (12:29 -0500)]
lvremove: fix failed remove of all LVs in shared VG
commit a125a3bb505cc "lv_remove: reduce commits for removed LVs"
changed "lvremove <vgname>" from removing one LV at a time,
to removing all LVs in one vg write/commit. It also changed
the behavior if some of the LVs could not be removed, from
removing those LVs that could be removed, to removing nothing
if any LV could not be removed. This caused a regression in
shared VGs using sanlock, in which the on-disk lease was
removed for any LV that could be removed, even if the command
decided to remove nothing. This would leave LVs without a
valid on-disk lease, and "lock failed: error -221" would be
returned for any command attempting to lock the LV.
Fix this by not freeing the on-disk leases until after the
command has decided to go ahead and remove everything, and
has written the VG metadata.
Peter Rajnoha [Thu, 3 Oct 2024 07:38:11 +0000 (09:38 +0200)]
dev-type: detect mixed dos partition with gpt's PMBR
Detect when we have mixed dos partition with gpt's PMBR partition.
This is not a sane configuration, but detect it anyway, just in case
someone configures such partition layout manually and forcefully and
incorrectly defines one of the partition types to be the GPT's PMBR.
For example:
❯ fdisk -l /dev/sdc
Device Boot Start End Sectors Size Id Type
/dev/sdc1 2048 67583 65536 32M 83 Linux
/dev/sdc2 67584 262143 194560 95M ee GPT
Before:
(The partition filter passes even though there's real existing dos
partition - the empty GPT PMBR overrides it.)
❯ pvcreate /dev/sdc
WARNING: PMBR signature detected on /dev/sdc at offset 510. Wipe it? [y/n]:
Wiping PMBR signature on /dev/sdc.
Physical volume "/dev/sdc" successfully created.
With this patch applied:
(The GPT PMBR does not override the existence of the dos partition.)
❯ pvcreate /dev/sdc
Cannot use /dev/sdc: device is partitioned
❯ lvextend -L72m vg/swap
Size of logical volume vg/swap changed from 60.00 MiB (15 extents) to 72.00 MiB (18 extents).
Logical volume vg/swap successfully resized.
❯ lvreduce -L60m vg/swap
File system swap found on vg/swap.
File system device usage is not available from libblkid.
❯ lvreduce -L50m vg/swap
Rounding size to boundary between physical extents: 52.00 MiB.
File system swap found on vg/swap.
File system device usage is not available from libblkid.
After:
❯ lvextend -L72m vg/swap
Size of logical volume vg/swap changed from 60.00 MiB (15 extents) to 72.00 MiB (18 extents).
Logical volume vg/swap successfully resized.
❯ lvreduce -L60m vg/swap
File system swap found on vg/swap.
File system size (60.00 MiB) is equal to the requested size (60.00 MiB).
File system reduce is not needed, skipping.
Size of logical volume vg/swap changed from 72.00 MiB (18 extents) to 60.00 MiB (15 extents).
Logical volume vg/swap successfully resized.
❯ lvreduce -L50m vg/swap
Rounding size to boundary between physical extents: 52.00 MiB.
File system swap found on vg/swap.
File system size (60.00 MiB) is larger than the requested size (52.00 MiB).
File system reduce is required and not supported (swap).
Peter Rajnoha [Thu, 19 Sep 2024 10:39:46 +0000 (12:39 +0200)]
dev-type: get swap device size from blkid using FSSIZE
blkid does not report FSLASTBLOCK for a swap device. However, blkid
does report FSSIZE for swap devices, so use this field instead (plus
the header size, which is FSBLOCKSIZE for swap) to
set the "filesystem last block", which is used subsequently for
further calculations and conditions.
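The arithmetic can be sketched as follows, assuming FSSIZE excludes the swap header and the header occupies exactly one FSBLOCKSIZE block. The helper names are hypothetical and the exact expression used in lvm2 may differ.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed reading of the commit text: FSSIZE excludes the swap header,
 * which occupies one block of FSBLOCKSIZE bytes.
 * Returns the number of bytes the swap area occupies on the device.
 */
static uint64_t toy_swap_bytes_used(uint64_t fssize, uint64_t fsblocksize)
{
	assert(fsblocksize > 0);

	return fssize + fsblocksize;
}

/* Same quantity expressed as a count of FSBLOCKSIZE-sized blocks. */
static uint64_t toy_swap_last_block(uint64_t fssize, uint64_t fsblocksize)
{
	return toy_swap_bytes_used(fssize, fsblocksize) / fsblocksize;
}
```

As an illustration with made-up values: a swap area created on a 60 MiB LV with a 4096-byte block size would have FSSIZE = 62910464, giving 62910464 + 4096 = 62914560 bytes (60 MiB), or 15360 blocks.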
Peter Rajnoha [Wed, 4 Sep 2024 13:30:42 +0000 (15:30 +0200)]
filter: partitioned: also detect non-empty GPT partition table
We already detect an msdos partition table. If it is empty, that is, there
is just the partition header and no actual partitions defined, then the
filter-partitioned passes, otherwise not.
In cases where the user is sure they are not using 'rootfs' or 'swap' on LVs
managed with this command, it is possible to completely bypass the pinning
of the process to RAM, which may slightly speed up command execution
(however at the risk that the process can be delayed by swapping).
Basically, use this only at your own risk.
David Teigland [Wed, 25 Sep 2024 21:18:32 +0000 (16:18 -0500)]
lvmlockd: use lvmlock LV size
Previously, lvmlockd detected the end of the lvmlock LV
by doing i/o to it until an i/o error was returned.
This triggered sanlock warning messages, so use the LV
size to avoid accessing beyond the end of the device.
Previously, every lvcreate would refresh the lvmlock LV
in case another machine had extended it. This involves
a lot of unnecessary work in most cases, so now compare
the LV size and device size to detect when a refresh is
needed.
David Teigland [Tue, 10 Sep 2024 16:51:15 +0000 (11:51 -0500)]
metadata: use lv_hash in segment-specific metadata parsing
The lv_hash wasn't being passed to the seg-specific text import
functions, so they were doing many find_lv() calls which consumes
a lot of time when there are many LVs in the metadata.
Peter Rajnoha [Mon, 12 Aug 2024 12:31:19 +0000 (14:31 +0200)]
libdm: do not fail if GETVAL semctl fails for udev sync inc and dec
While performing the udev sync semaphore's inc/dec operation, we use the
result from GETVAL semctl just to print a debug message with the current
value of that semaphore, nothing else.
If the GETVAL fails for whatever reason while the actual inc/dec
completes successfully, just log a warning message about the GETVAL
(and print the debug messages without the actual semaphore value)
and return success for the inc/dec operation as a whole.
Peter Rajnoha [Mon, 12 Aug 2024 12:16:32 +0000 (14:16 +0200)]
libdm: clean up udev sync semaphore on fail path during its creation
Clean up udev sync semaphore on fail path during its creation, otherwise
the caller will have no handle returned to clean it up itself and the
semaphore will keep staying in the system. The only way to clean it up
would be to call `dmsetup udevcomplete_all` which would destroy all
udev sync semaphores, not just the failed one, which we don't want.
Peter Rajnoha [Mon, 12 Aug 2024 12:01:25 +0000 (14:01 +0200)]
libdm: add 'cookie create/inc/dec' log prefix if GETVAL fails for udev sync ops
The same message is printed while performing create/inc/dec operation and
the GETVAL semctl fails. Add a prefix so we know exactly in which of
these functions the issue actually happened.