lvm metadata writes, commits and activations are performed
for (newly) allocated RAID metadata SubLVs to wipe any preexisiting
data thus avoid false raid superblock positives on RaidLV activation.
This process can be interrupted by command or system crashs
thus leaving stale SubLVs in the lvm metadata as a problem.
Because we hold an exclusive lock in this metadata SubLV wiping
process, we can address this problem by avoiding aforementioned
commits/writes/activations altogether wiping the respective first
sector of the first physical extent allocated to any metadata SubLV
directly via the existing dev_set() API.
Zdenek Kabelac [Mon, 15 Oct 2018 13:17:07 +0000 (15:17 +0200)]
cov: fix failing filter initialization
When persistent_filter_create() fails, the existing passed filter
should be preserved, so it could be properly deleted on
error path - so new pfilter is assigned instead.
Zdenek Kabelac [Mon, 27 Aug 2018 08:18:26 +0000 (10:18 +0200)]
dmeventd: lvm2 plugin uses envvar registry
Thin plugin started to use configuble setting to allow to configure
usage of external scripts - however to read this value it needed to
execute internal command as dmeventd itself has no access to lvm.conf
and the API for dmeventd plugin has been kept stable.
The call of command itself was not normally 'a big issue' until users
started to use higher number of monitored LVs and execution of command
got stuck because other monitored resource already started to execute
some other lvm2 command and become blocked waiting on VG lock.
This scenario revealed necesity to somehow avoid calling lvm2 command
during resource registration - but this requires bigger changes - so
meanwhile this patch tries to minimize the possibility to hit this race
by obtaining any configurable setting just once - such patch is small
and covers majority of problem - yet better solution needs to be
introduced likely with bigger rework of dmeventd.
TODO: avoid blocking registration of resource with execution of lvm2
commands since those can get stuck waiting on mutexes.
David Teigland [Mon, 24 Sep 2018 19:41:58 +0000 (14:41 -0500)]
metadata: add direct size limit
Previously the size was limited by checking if the
old and new copies of the metadata overlapped.
This generally limited the size to about half of
the total space, but it could be larger given the
size differences between old and new. Now add a
direct check to limit the size to half the space.
David Teigland [Thu, 20 Sep 2018 18:53:50 +0000 (13:53 -0500)]
metadata: remove an unused and incorrect overflow check
Remove another instance of an invalid check for metadata
overflow during read. The previous instance was removed
in commit 5fb15b193.
This was checking for metadata that that overflowed the
circular disk metadata buffer during read, but such metadata
cannot be written, so it shouldn't be possible to find see.
Also, the check was incorrect and could trigger when there
was no overflow.
David Teigland [Wed, 12 Sep 2018 20:59:47 +0000 (15:59 -0500)]
fix readonly activation override options
This fixes a problem in commit e6bb780d242, in which the
back compat handling for the old locking_type=4 was
incorrectly translated to mean the same thing as --readonly,
which prevented activation because activation uses an
exclusive vg lock. Previously, locking_type=4 allowed
activation.
If we see locking_type 4 in an old config, translate it to
the new combination of --readonly and --sysinit, which we
now define to mean the --readonly behavior with an exception
to allow activation.
David Teigland [Thu, 30 Aug 2018 15:44:53 +0000 (10:44 -0500)]
metadata: improve write and commit code
The vg_write/vg_commit code was imprecise, uncommented, and
hard to understand. Rewrite it with clearer, cleaner code,
extensive comments, descriptions of how it works, and add
more info in debugging output.
The minor changes in behavior are to things that were
either incorrect or probably unintended:
- vg_write/vg_commit no longer check that the current vgname at
the start of the text metadata matches the vgname being written.
This has already been done at least twice by the time they are
called, and repeating it again against the same cached data has
no use.
- A fragment of old removed code had been left behind that checked
if the old unused alignment policy would wrap. It was still
being checked to decide if the metadata area was full, which
could possibly cause an incorrect full metadata failure.
- vg_remove now clears both the raw_locns in the mda_header that
point to committed metadata (raw_locn slot 0) and precommitted
metadata (raw_locn slot 1). Previously it fully cleared the
committed slot, and would only clear the offset field in the
precommitted slot if it saw a problem with the metadata in the
vg being removed.
- read_metadata_location_summary was wrongly comparing the number
of wrapped bytes with an offset to report an error about the
metadata being too large. This wrong check is removed, it
could have resulted in erroneous errors.
Commit 989626926c98cd00f0236c4fcac883107d76899d
introduced 2 new tests
lvconvert-raid-takeover-linear_to_raid4.sh and
lvconvert-raid-takeover-raid4_to_linear.sh
which involve raid reshaping.
Bump the checked dm-raid target version to 1.14.0
which has reshape kernel fixes to avoid test suite
runs to hang.
lvconvert: fix interim segtype regression on raid6 conversions
When converting from striped/raid0/raid0_meta
to raid6 with > 2 stripes, allow possible
direct conversion (to raid6_n_6).
In case of 2 stripes, first convert to raid5_n to restripe
to at least 3 data stripes (the raid6 minimum in lvm2) in
a second conversion before finally converting to raid6_n_6.
As before, raid6_n_6 then can be converted
to any other raid6 layout.
Enhance lvconvert-raid-takeover.sh to test the
2 stripes conversions to raid6.
David Teigland [Wed, 29 Aug 2018 18:14:18 +0000 (13:14 -0500)]
filter: add config setting to skip scanning LVs
devices/scan_lvs (default 1) determines whether lvm
will scan LVs for layered PVs. The lvm behavior has
always been to scan LVs, but it's rare for LVs to have
layered PVs, and much more common for there to be many
LVs that substantially slow down scanning with no benefit.
This is implemented in the usable filter, and has the
same effect as listing all LVs in the global_filter.
Peter Rajnoha [Thu, 30 Aug 2018 10:46:41 +0000 (12:46 +0200)]
scripts: lvm2-activation-generator: add prefix for all kmsg messages
Add "lvm2-activation-generator: " prefix for all kmsg messages written by
lvm2-activation-generator so we can identify the message in global system log.
Peter Rajnoha [Thu, 30 Aug 2018 10:35:58 +0000 (12:35 +0200)]
scripts: add After=rbdmap.service to {lvm2-activation-net,blk-availability}.service
We need to have Ceph RBD devices mapped first before use in a stack
where LVM is on top so make sure rbdmap.service is called before
generated lvm2-activation-net.service.
On shutdown, we need to stop blk-availability first before we stop the
rbdmap.service.
David Teigland [Fri, 24 Aug 2018 19:46:51 +0000 (14:46 -0500)]
bcache: reduce MAX_IO to 256
This is the number of concurrent async io requests that
the scan layer will submit to the bcache layer. There
will be an open fd for each of these, so it is best to
keep this well below the default limit for max open files
(1024), otherwise lvm may get EMFILE from open(2) when
there are around 1024 devices to scan on the system.
"lvconvert --type linear RaidLV" on striped and raid4/5/6/10
have to provide the convenient interim layouts. Fix involves
a cleanup to the convenience type function.
As a result of testing, add missing sync waits to
lvconvert-raid-reshape-linear_to_raid6-single-type.sh.