Zdenek Kabelac [Wed, 1 Mar 2017 23:15:11 +0000 (00:15 +0100)]
cache: enhance lvdisplay for cache volumes
Better support for lvdisplay.
By default info about running (in kernel) cache status is printed.
To get 'segtype' info, user runs: 'lvdisplay -m', example:
--- Logical volume ---
LV Path /dev/vg/lvol0
LV Name lvol0
VG Name vg
LV UUID Y4uWuN-TBGk-duer-aPWl-yBWn-iFFR-RU1gg1
LV Write Access read/write
LV Creation host, time linux, 2017-03-01 20:52:39 +0100
LV Cache pool name lvol2
LV Cache origin name lvol0_corig
LV Status available
# open 0
LV Size 12,00 MiB
Cache used blocks 10,42%
Cache metadata blocks 0,49%
Cache dirty blocks 0,00%
Cache read hits/misses 112 / 34
Cache wrt hits/misses 133 / 0
Cache demotions 0
Cache promotions 20
Current LE 3
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:0
--- Segments ---
Logical extents 0 to 2:
Type cache
Chunk size 64,00 KiB
Metadata format 1
Mode writethrough
Policy smq
Setting migration_threshold=100000
Zdenek Kabelac [Sun, 26 Feb 2017 20:22:43 +0000 (21:22 +0100)]
cache: validation for cache_metadata_format
Only cache-pool segtype may store cache_metadata_format.
Only supported values are 0,1,2
Format 2 requires LV status uses LV_METADATA_FORMAT.
Format 0 (unselected) or 1 shall not set this 'incompatible' status.
Zdenek Kabelac [Wed, 1 Mar 2017 11:26:56 +0000 (12:26 +0100)]
cache: LV supports cache segs with metadata format
Cache pool read/writes metadata_format within its segment type..
For CachePoolLV unselected metadata format is NOT stored in metadata.
For CacheLV when metadata format is not present/selected in lvm2 metadata,
it's automatically assumed to be the version 1 (backward compatible).
To ensure older lvm2 will not 'miss-read' metadata with new version 2,
such LV is marked with METADATA_FORMAT status flag (segment is
specifying metadata format). So when cache uses metadata format 2,
it will become inaccesible on older system without such support.
(kernel dm cache < 1.10, lvm2 < 2.02.169).
Zdenek Kabelac [Fri, 24 Feb 2017 21:41:42 +0000 (22:41 +0100)]
libdm: maintain binary interface for new FEATURE flag
Older library version was not detecting unknown 'feature' bits
and could let start target without needed option.
New versioned symbol now checks for supported feature bits.
_Base version keeps accepting only previously known features and
mask/ignores unknown bits.
NB: if the older binary passed in 'random' bits, it will not get
metadata2 by chance. New linked binary get new validation function.
Library user is required to not pass 'trash' for unsupported bits,
as such calls will be rejected.
Zdenek Kabelac [Thu, 23 Feb 2017 22:38:06 +0000 (23:38 +0100)]
libdm: support cache metadata2 feature flag
Dm cache target version 1.10 introduces new cache metadata format
(upstream kernel >=4.11).
New format is enable by passing new target feature flag metadata2.
Interace side on libdm uses DM_CACHE_FEATURE_METADATA2.
This feature bit is now also recognized on status
and set in 'feature_flags' field of dm_status_cache structure.
Code also adds check for 'highest' supported feature flag bit.
So it rejects properly any 'unknown' feature bit set by application.
Zdenek Kabelac [Thu, 23 Feb 2017 22:35:40 +0000 (23:35 +0100)]
libdm: better code to enforce writethrough
Better code to enforce writethrough caching for cleaner policy.
Only check for cleaner when DM_CACHE_FEATURE_PASSTHROUGH or
DM_CACHE_FEATURE_WRITEBACK is set.
Zdenek Kabelac [Thu, 9 Mar 2017 15:24:28 +0000 (16:24 +0100)]
pool: rework handling of passed args
As now we can properly recognize all paramerters for pool creation,
we may drop PASS_ARG_ defines and rely on '_UNSELECTED' or 0 entries
as being those without user given args.
When setting are not given on command line - 'update' function
fill them from profiles or configuration. For this 'profile' arg
was needed to be passed around and since 'VG' itself is not needed,
it's been all replaced with 'cmd, profile, extents_size' args.
Zdenek Kabelac [Mon, 27 Feb 2017 13:54:08 +0000 (14:54 +0100)]
lvconvert: share function for cache params
Reuse same code for getting/setting cache parameters with lvcreate.
So there is single one place how to get vars from profiles and configs.
Variables declarations are moved to start of function and
there is no need to initialize them as they are always
defined by functions get_cache_params() and get_pool_params().
Zdenek Kabelac [Thu, 9 Mar 2017 16:15:56 +0000 (17:15 +0100)]
lvcreate: do not round cache volumes on cache chunks
Since cache chunk might be huge and there is no technical need
to enforce rounding and there is actually more 'real' VG space
used then necessary - keep rounding on 'chunk' bounrary only
for thin volumes - where it's the space used anyway.
NB: we support conversion of any-size 'existing' LV into cached LV.
Zdenek Kabelac [Mon, 27 Feb 2017 13:53:45 +0000 (14:53 +0100)]
cache: extend usability of cache_set_params
Fix missing reset of '*settings' pointer when no args were given.
Handle cache_chunk settings like all other settings, so it is properly
updated only with non-zero settings and the existing cache-pool
chunk_size is not being reconfigured.
Zdenek Kabelac [Thu, 9 Mar 2017 17:10:35 +0000 (18:10 +0100)]
lvconvert: replacing with goto
Split-off patch that just replaces returns with 'goto bad'
so there is single place to release policy_settings.
In the next patch we will start to use some shared
function call between lvconvert and lvcreate
(effectively restoring previous logic which has got lost).
Zdenek Kabelac [Wed, 1 Mar 2017 10:23:26 +0000 (11:23 +0100)]
thin: getting default chunk_size from single place
Basically code moving operation to have a single place resolving
thin_pool_chunk_size_policy.
Supported are generic & performance profiles.
Function is now shared between thin manipulation code and configuration
_CFG logic to obtain defaults and handle correct reporting upward coding
stack.
Bryn M. Reeves [Fri, 10 Mar 2017 11:45:08 +0000 (11:45 +0000)]
libdm: move initialisation of group_id in _aggregate_histogram()
Older compilers are not able to determine that although group_id
is only assigned in one branch of a conditional, it is never used
used when the other branch is taken:
libdm-stats.c:3319: warning: "group_id" may be used uninitialized in this function
Avoid this by always initialising the variable when it is
declared.
Tony Asleson [Wed, 8 Mar 2017 21:35:53 +0000 (15:35 -0600)]
lvmdbusd: Disable notify_dbus for service command use
Utilizing the --config option we will utilize global/notify_dbus=0 so
that the service itself doesn't generate change events which it then needs to
process.
Tony Asleson [Wed, 8 Mar 2017 21:31:45 +0000 (15:31 -0600)]
lvmdbusd: Use work queue for queries too
We need to place query operations in the queue to prevent the case where
a client knows of something before the service does. For example if a
client creates a PV/VG/LV outside of the dbus API and then immediately
tries to lookup and use that resource in the lvm dbus service it should
be present. By placing the queries in the work queue any previous
refresh operation will complete before we process the query.
In addition to the already supported conversion between 2-legged
raid1 and raid5, raid1 and raid4 can be also converted into each
other with 2 legs (raid4/5 are limited to map a 2-legged raid1).
This patch supports the missing raid4 conversion in the sequence
linear -> 2-legged raid1 -> raid4/5, then restripe to more than one
data stripes for performance and resilience reasons and optionally
convert to striped/raid0.
The other conversion sequence is also possible by converting N-way
striped/raid0 to raid4/5, then restripe to 2 legs followed by a
conversion to raid1 and optionally to linear (loosing all resilience).
Bryn M. Reeves [Thu, 9 Mar 2017 21:14:50 +0000 (21:14 +0000)]
dmfilemapd: always set the program_id after listing regions
The filemap daemon takes its program_id from the regions it is
managing: use DM_STATS_ALL_PROGRAMS when retrieving an initial
listing and then obtain the correct program_id from the group
leader.
On conversion from striped to raid0, data LVs are created
and all segments and their respective areas of the striped
LV are moved across to new segments allocated for the raid0
image LVs. This can cause non-canonical segments to be added
to the image LVs.
Add a call to lv_merge_segments() once all segments have been
added to an image LV to compensate for that. This avoids
unsafe table loads on activation.
dmeventd: (workaround) fix mirror DSO to work with lvmetad
Automatic dmeventd repair of mirrors with active lvmetad configured
(mirror_image_fault_policy = "allocate") fails because the lvscan
run before the repair in the mirror DSO does not update the
lvmetad cache properly thus "lvconvert --repair ..." fails.
Need to scan the mirror LV before and after the repair
to have proper cache content after the repair finished.
The cache can't be relied on or the repair will fail.
Bryn M. Reeves [Fri, 16 Dec 2016 13:42:08 +0000 (13:42 +0000)]
dmstats: start dmfilemapd when creating or updating file maps
Launch an instance of the filemap monitoring daemon when creating,
or updating, a file mapped group, unless the --nomonitor switch is
given.
Unless --foreground is given the daemon will detach from the
terminal and run in the background until it is signaled or the
daemon termination conditions are met.
The --follow={inode|path} switch is added to control the daemon
behaviour when files are moved, unlinked, or renamed while they
are being monitored.
The daemon runs with the same verbosity as the dmstats command
that starts it.
Bryn M. Reeves [Thu, 15 Dec 2016 20:10:27 +0000 (20:10 +0000)]
daemons: add dmfilemapd
Add a daemon that can be launched to monitor a group of regions
corresponding to the extents of a file, and to update the regions as the
file's allocation changes.
The daemon is intended to be started from a library interface, but can
also be run from the command line:
Where fd is a file descriptor open on the mapped file, group_id is the
group identifier of the mapped group and mode is either "inode" or
"path". E.g.:
# dmfilemapd 3 0 vm.img inode 1 3 3<vm.img
...
If foreground is non-zero, the daemon will not fork to run in the
background. If verbose is non-zero, libdm and daemon log messages will
be printed.
It is possible for the group identifier to change when regions are
re-mapped: this occurs when the group leader is deleted (regroup=1 in
dm_stats_update_regions_from_fd()), and another region is created before
the daemon has a chance to recreate the leader region.
The operation is inherently racey since there is currently no way to
atomically move or resize a dm_stats region while retaining its
region_id.
Detect this condition and update the group_id value stored in the
filemap monitor.
A function is also provided in the the stats API to launch the filemap
monitoring daemon:
This carries out the first fork and execs dmfilemapd with the arguments
specified.
A dm_filemapd_mode_t value is specified by the mode argument: either
DM_FILEMAPD_FOLLOW_INODE, or DM_FILEMAPD_FOLLOW_PATH. A helper function,
dm_filemapd_mode_from_string(), is provided to parse a string containing
a valid mode name into the appropriate dm_filemapd_mode_t value.
lvconvert: prompt when splitting off a tracked LV of a 2-legged raid1 LV
Splitting off an image LV of a 2-legged raid1 LV tracking changes
causes loosing partial resilience for any newly written data set.
Full resilience will be provided again after the split off image LV
got merged back in and the new data set got fully synchronized.
Reason being that the data is only stored on the remaining single
writable image during the split.
Ask user to avoid uninformed loss of such partial resilience.