There're several reasons to split it. First, there's no support for conditional
stop in systemd and AFAIK they don't plan to support it. In other words:
if the deactivation fails for some reason, systemd doesn't care and will simply
kill all remaining processes in original cgroup (by default). Killing the
remaining procs can be suppressed however it doesn't solve the following problem:
You can't repeat the stop command of a failed service. The repeated stop command
is simply not propagated to the service in a failed state. You would have to start
and then try to stop the service again. Unfortunately, this can't be done while
the daemon is still running (and we need the daemon to stay active until all
clustered VGs are deactivated properly).
In a separated setup we need only to restart the failed activation service and
that's fine.
Peter Rajnoha [Mon, 10 Feb 2014 15:18:48 +0000 (16:18 +0100)]
systemd: cleanup for lvmetad systemd unit
No need to fork lvmetad when running under systemd.
Also, the "lvmetad -R" support has been removed in lvm2 v2.02.98
so remove the ExecReload line that called it on "systemctl reload".
Peter Rajnoha [Mon, 10 Feb 2014 12:28:13 +0000 (13:28 +0100)]
wiping: wipe DM_snapshot_cow signature without prompt in newly created LVs
The libblkid can detect DM_snapshot_cow signature and when creating
new LVs with blkid wiping used (allocation/use_blkid_wiping=1 lvm.conf
setting and --wipe y used at the same time - which it is by default).
Do not issue any prompts about this signature when new LV is created
and just wipe it right away without asking questions. Still keep the
log in verbose mode though.
Peter Rajnoha [Mon, 10 Feb 2014 08:03:08 +0000 (09:03 +0100)]
cleanup: missing parentheses in a condition
gcc reports:
metadata/merge.c:229:58: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
metadata/merge.c:232:58: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
Peter Rajnoha [Mon, 3 Feb 2014 15:07:06 +0000 (16:07 +0100)]
dmeventd: fix dmeventd -R to work properly with systemd
Trying to restart dmeventd as a reload action is causing problems
under systemd environment. The systemd loses track of new dmeventd
this way. See also https://bugzilla.redhat.com/show_bug.cgi?id=1060134
for more info.
We need to call dmeventd -R directly instead of "systemctl reload dm-event.service"
that was used before (the reload is aimed at configuration reload anyway,
not stateful restart of the daemon - we did this before just because
there's no ExecRestart in systemd and there's only ExecStart and
ExecStop with which we'd lose the state).
Also, use ExecStart="dmeventd -f" to run dmeventd in foreground
(and let's rely on systemd to daemonize it) and change the
service type from "forking" to "simple".
cache: Code to allow the create/remove of cache LVs
This patch allows users to create cache LVs with 'lvcreate'. An origin
or a cache pool LV must be created first. Then, while supplying the
origin or cache pool to the lvcreate command, the cache can be created.
Ex1:
Here the cache pool is created first, followed by the origin which will
be cached.
~> lvcreate --type cache_pool -L 500M -n cachepool vg /dev/small_n_fast
~> lvcreate --type cache -L 1G -n lv vg/cachepool /dev/large_n_slow
Ex2:
Here the origin is created first, followed by the cache pool - allowing
a cache LV to be created covering the origin.
~> lvcreate -L 1G -n lv vg /dev/large_n_slow
~> lvcreate --type cache -L 500M -n cachepool vg/lv /dev/small_n_fast
The code determines which type of LV was supplied (cache pool or origin)
by checking its type. It ensures the right argument was given by ensuring
that the origin is larger than the cache pool.
If the user wants to remove just the cache for an LV. They specify
the LV's associated cache pool when removing:
~> lvremove vg/cachepool
If the user wishes to remove the origin, but leave the cachepool to be
used for another LV, they specify the cache LV.
~> lvremove vg/lv
In order to remove it all, specify both LVs.
This patch also includes tests to create and remove cache pools and
cache LVs.
cache: Code changes to allow creation of cache pools
This patch allows the creation and removal of cache pools. Users are not
yet able to create cache LVs. They are only able to define the space used
for the cache and its characteristics (chunk_size and cache mode ATM) by
creating the cache pool.
A cache LV - from LVM's perpective - is a user accessible device that
links the cachepool LV and the origin LV. The following functions
were added to facilitate the creation and removal of this top-level
LV:
1) 'lv_cache_create' - takes a cachepool and an origin device and links
them into a new top-level LV of 'cache' segment type. No allocation
is necessary in this function, as the sub-LVs contain all of the
necessary allocated space. Only the top-level layer needs to be
created.
2) 'lv_cache_remove' - this function removes the top-level LV of a
cache LV - promoting the cachepool and origin sub-LVs to top-level
devices and leaving them exposed to the user. That is, the
cachepool is unlinked and free to be used with another origin to
form a new cache LV; and the origin is no longer cached.
(Currently, if the cache needs to be flushed, it is done in this
function and the function waits for it to complete before proceeding.
This will be taken out in a future patch in favor of polling.)
Jonathan Brassow [Tue, 28 Jan 2014 18:25:02 +0000 (12:25 -0600)]
cache: Allocation code changes necessary to support cache_pool
Cache pools require a data and metadata area (like thin pools). Unlike
thin pool, if 'cache_pool_metadata_require_separate_pvs' is not set to
'1', the metadata and data area will be allocated from the same device.
It is also done in a manner similar to RAID, where a single chunk of
space is allocated and then split to form the metadata and data device -
ensuring that they are together.
Jonathan Brassow [Tue, 28 Jan 2014 18:24:51 +0000 (12:24 -0600)]
cache: New functions for gathering info on cache devices
Building on the new DM function that parses DM cache status, we
introduce the following LVM level functions to aquire information
about cache devices:
- lv_cache_block_info: retrieves information on the cache's block/chunk usage
- lv_cache_policy_info: retrieves information on the cache's policy
I am reverting the commit below - removing the new 'dm_config_get_int'
function and simply calling 'dm_config_get_uint32' while casting the
'int *' pointer parameter.
Zdenek Kabelac [Wed, 29 Jan 2014 13:27:13 +0000 (14:27 +0100)]
thin: validate external origin size
Avoid use of external origin with size unaligned/incompatible with
thin pool chunk size, since the last chunk is not correctly provisioned
when it is overwritten.
Zdenek Kabelac [Tue, 28 Jan 2014 12:21:39 +0000 (13:21 +0100)]
thin: more validation of thin name
Avoid starting conversion of the LV to the thin pool and thin volume
at the same time. Since this is mostly a user mistake, do not try
to just convert to one of those type, since we cannot assume if the
user wanted LV to become thin volume or thin pool.
Before the fix tool reported pretty strange internal error:
Internal error: Referenced LV lvol1_tdata not listed in VG mvg.
Fixed output:
lvconvert --thinpool lvol0 -T mvg/lvol0
Can't use same LV mvg/lvol0 for thin pool and thin volume.
Jonathan Brassow [Mon, 27 Jan 2014 11:30:10 +0000 (05:30 -0600)]
cache: Add DM interface for retrieving a cache's status
This patch defines a structure for holding all of the device-mapper
cache target's status information. The associated function provides
an easy way for higher levels (LVM) to consume the information.
This patch finishes the device-mapper interface for the cache and
cachepool segment types (i.e. the cache target).
Jonathan Brassow [Mon, 27 Jan 2014 11:29:35 +0000 (05:29 -0600)]
cache: New 'cache' segment type
This patch adds the cache segment type - the second of two necessary
to create cache logical volumes. This segment type references the
cachepool (the small fast device) and the origin (the large slow device);
linking them to create the cache device. The cache device is the
hierarchical device-mapper device that the user ulitmately makes use
of.
The cache segment sources the information necessary to construct the
device-mapper cache target from the origin and cachepool segments to
which it links.
Jonathan Brassow [Mon, 27 Jan 2014 11:27:16 +0000 (05:27 -0600)]
cache: New 'cachepool' segment type
This patch adds the new cachepool segment type - the first of two
necessary to eventually create 'cache' logical volumes. In addition
to the new segment type, updates to makefiles, configure files, the
lv_segment struct, and some necessary libdevmapper flags.
The cachepool is the LV and corresponding segment type that will hold
all information pertinent to the cache itself - it's size, cachemode,
cache policy, core arguments (like migration_threshold), etc.
Zdenek Kabelac [Fri, 24 Jan 2014 14:59:38 +0000 (15:59 +0100)]
lvmetad: respect LVM_LVMETAD_PIDFILE settings in lvm
Test LVM_LVMETAD_PIDFILE for pid for lvm command.
Fix WHATS_NEW envvar name usage
Fix init order in prepare_lvmetad to respect set vars
and avoid clash with system settings.
Update test to really test the 'is running' message.
Jonathan Brassow [Wed, 22 Jan 2014 16:30:55 +0000 (10:30 -0600)]
Misc: Change name of lvcreate_params field - s/create_thin_pool/create_pool/
In preparation for other segment types that create and use "pools", we
s/create_thin_pool/create_pool/. This way it is not awkward when creating
a cachepool, for example, to use "create_thin_pool".
Jonathan Brassow [Wed, 22 Jan 2014 16:11:29 +0000 (10:11 -0600)]
Misc: Move some thin pool functions to a new file
Functions that handle set-up, tear-down and creation of thin pool
volumes will be more generally applicable when more targets exist
that make use of device-mapper's persistent data format. One of
these targets is the dm-cache target. I've selected some functions
that will be useful for the cache segment type to be moved, since
they will no longer be thin pool specific but are more broadly
useful to any segment type that makes use of a 'pool' LV.
Peter Rajnoha [Wed, 22 Jan 2014 15:26:49 +0000 (16:26 +0100)]
wiping: issue error if libblkid detects signature and fails to return offset/length
We need both offset and length when trying to wipe detected signatures.
The libblkid can fail so it's good to have an error message issued for
this state instead of being silent (libblkid does not issue any error
messages here). We just issued "stack" here before but that was not
quite useful if some error occurs...
Peter Rajnoha [Mon, 20 Jan 2014 11:54:10 +0000 (12:54 +0100)]
udev: clear temporary variable properly
Clear temporary DM_DISABLE_OTHER_RULES_FLAG properly. This did not
cause any bug or problem as the temporary variable is overwritten next
time it's used again, but we should still clean it properly!
Peter Rajnoha [Mon, 20 Jan 2014 11:38:21 +0000 (12:38 +0100)]
thin: fix thin LV flagging for udev to skip scanning
Only flag thin LV for no scanning in udev if this LV is about
to be wiped. This happens only in case the thin LV's pool was not
created with zeroing of the new blocks enabled.
Zdenek Kabelac [Mon, 20 Jan 2014 10:57:39 +0000 (11:57 +0100)]
fsadm: use xfs_repair when available
Since support for xfs_check is going to be obsoleted,
replace its usage with xfs_repair -n tool.
However this tool needs further intrumentation, since for really small
xfs devices (having just 1 allocation group) it needs to pass in
flag: "-o force_geometry". As we run the tool with '-n', it should
be safe to pass this flag always.
FIXME: figure way without always passing this flag.
Zdenek Kabelac [Wed, 15 Jan 2014 11:42:47 +0000 (12:42 +0100)]
libdm: preload revert after failing callback
Revert activated volumes if callback fails.
This is currently used only for thin_check failure support.
When thin_check detects failure in thin metadata device, it deactivate
volumes in reversed order that have been preloaded for thin pool activation.
After this change lvm command will not leave active pool subvolumes
in dm table.
Peter Rajnoha [Tue, 14 Jan 2014 16:49:39 +0000 (17:49 +0100)]
udev: do not drop SYSTEMD_READY for non-activating events
Do not drop device's flag to report readiness for systemd
processing if there's any event that follows the activatiion
event itself. Otherwise, systemd would lost track of this device
on any other event that follows the activating event (IOW, we
need to make SYSTEMD_READY variable change level-based, not edge-based).
This patch applies for MD and loop devices used as PVs.
format1: Mark obsolete and do not use with lvmetad.
DO NOT USE LVMETAD IF YOU HAVE ANY LVM1-FORMATTED PVS.
You may continue to use it without lvmetad, but do please schedule
an upgrade to the lvm2 format (with 'vgconvert').
Sending the original LVM1 formatted metadata to lvmetad is breaking
assumptions made by the code, so I am marking the format as obsolete for
now and no longer sending it to lvmetad.
This means that if you are using lvmetad, lvm1 volumes will usually
appear invisible - though not always: it depends on exactly what
sequence of commands you run!
The current situation is not satisfactory.
We'll either fix lvmetad and reenable this or we'll fix the code to
issue appropriate warning messages when lvm1 PVs are encountered
to avoid accidents.
(The latest unfixed problem is that lvmetad assumes metadata sequence
numbers exist and always increase - but the lvm1 format does not define
or store any sequence number, confusing both the daemon and client
when default values get passed to-and-fro.)
Several fields used to display 0 if undefined. Recent changes
to the way the fields are reported threw away some tests for
valid pointers, leading to segfaults with 'pvs -o all'.
If a PV in an existing VG becomes orphaned (with 'pvcreate -ff', for
example) the VG struct cached against its vginfo must be invalidated.
This is because the struct device it references no longer contains
the PV label so becomes incorrect.
This triggers the error:
Internal error: PV $dev unexpectedly not in cache.
when the PV from the cached VG metadata is subsequently looked up
in the cache.
Zdenek Kabelac [Wed, 8 Jan 2014 09:30:25 +0000 (10:30 +0100)]
libdm: pass dnode to callback
Pass dnode pointer instead of rather unknown child pointer.
The pointer is currently unused and passing child pointer
is quite undefined, while dnode has at least some usability.
Peter Rajnoha [Wed, 8 Jan 2014 09:56:05 +0000 (10:56 +0100)]
thin: cleanup _thin_pool_add_message
Make this code a bit more readable for Coverity as otherwise
it marks the "type" variable in the "_thin_pool_add_message" fn
as undefined for certain path (...which is normally unreachable anyway,
but let's clean this up).
Introduce FMT_OBSOLETE to identify pool metadata and use it and FMT_MDAS
instead of hard-coded format names.
Explain device accesses on pvscan --cache man page.
Tony Asleson [Tue, 17 Dec 2013 22:51:11 +0000 (16:51 -0600)]
liblvm: Save off and restore umask values
lvm has a default umask value in the config file that defaults
to 0077 which lvm changes to during normal operation. This
causes a problem when the code is used as a library with
liblvm as it is changing the umask for the process. This
patch saves off the current umask, sets to what is specified
in the config file and restores it what it was on library
function call exit.
The user is now free to change the umask in their application at
anytime including between library calls.
This fix address BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1012113
Tested by setting umask to 0777 and running the python unit
test and verifying that umask is still same value as expected
at the test completion and with a successful run.
In the first case, we need to be sure to have an exact matching line
in the filter for the device to be used, no aliases are considered
So for example even if we have accept rule for "/dev/sda" present,
this won't apply for "/dev/block/8:0" even though it's the same device!
This is because we're comparing the path used on command line directly
with the path written in the rule.
For the second one, any alias mentioned in the filter will apply
as we're comparing the major and minor pair, not looking at actual
device names - so any alias mentioned in the rules will suffice for
the filtering rule to apply.
For the global_filter to be properly used, we need to call the
second one in the lvm2-pvscan@.service - nobody is able to tell
what value of major:minor the kernel assignes next time, hence
this bug makes the use of global_filter quite unusable!
Zdenek Kabelac [Wed, 18 Dec 2013 09:52:09 +0000 (10:52 +0100)]
device: if BLKPBSZGET is unavailable, enforce 512
If there is no define for BLKPBSZGET - we have hard time how to
decrypt physical block size - we can't use here block_size,
since this is usually 4k while we need to use 512b.
FIXME: find some better way, until that enforce value 512.
Eventually we could also try to put in: