Ondrej Kozina [Fri, 30 Jan 2015 14:15:24 +0000 (15:15 +0100)]
libdaemon: set CLOEXEC flag on systemd socket
all sockets opened by a daemon or handed over by systemd
have to have CLOEXEC flag set. Otherwise we get nasty
warnings about leaking descriptors in processes spawned by
daemon.
Zdenek Kabelac [Fri, 30 Jan 2015 15:22:11 +0000 (16:22 +0100)]
thin: fix upgrade regression
Older lvm2 tools where always providing linear mapping for thin pool.
Recent lvm2 version however support external usage of thin pool and
empty/unused pools are loaded without such external linear mapping.
So this patch covers 'upgrade' problem, where older tool has activated
thin-pool with 'linear' layer mapping, and newer tools didn't expected
such mapping to exist and were not able to deactivate such table.
So before checking for new layout in dm-table, check if there is not
an old one already there.
Zdenek Kabelac [Fri, 30 Jan 2015 14:29:39 +0000 (15:29 +0100)]
thin: report proper status for thin pool
After commit 158e9988768be344c0c97acdf52d1c020ab8c83e where we may
start to readlv_attr with a 'shared' ioctl call for a single lvs line
we where obtaing single status for thin pools.
However this is not properly reflecting lvm2 reality.
Correcting this by reading lv status from layered thin pool, but lv info
from non-layered (linear) mapped device which is maintained for proper
cluster locking.
Peter Rajnoha [Tue, 9 Sep 2014 13:05:57 +0000 (15:05 +0200)]
filters: add firmware RAID filter
Just like MD filtering that detects components of software RAID (md),
add detection for firmware RAID.
We're not adding any native code to detect this - there are lots of
firmware RAIDs out there which is just out of LVM scope. However,
with current changes with which we're able to get device info from
external sources (e.g. external_device_info_source="udev"), we can
do this easily if the external device status source has this kind
of information - which is the case of "udev" source where the results
of blkid scans are stored.
This detection should cover all firmware RAIDs that blkid can detect and
which are identified as:
ID_FS_TYPE = {adaptec,ddf,hpt45x,hpt37x,isw,jmicron,lsi_mega,nvidia,promise_fasttrack,silicon_medley,via}_raid_member
Peter Rajnoha [Wed, 3 Sep 2014 13:49:36 +0000 (15:49 +0200)]
filter-partitioned: use new 'udev' device status source to get partition status
Partitioned devices are marked in udev db as:
ID_PART_TABLE="<partition table type name>"
and at the same time they are *not* marked with:
ID_PART_ENTRY_DISK="<parent disk major:minor>"
Where partition table type name is dos/gpt/... But checking the presence
of this variable is enough for LVM here - it just needs to know whether
there's a partition table or not, not interested in the actual type.
The same applies for parent disk major:minor.
Peter Rajnoha [Thu, 29 Jan 2015 13:24:06 +0000 (14:24 +0100)]
filter-usable: move check for pv_min_size from filter-partitioned to filter-usable and use new 'udev' external device info source for this check
The filter-partitioned code should contain only checks in "partition" domain.
The check for pv_min_size should actually be a part of filter-usable.
If the device size is less than pv_min_size, such device is not usable
as a PV so this check clearly belongs here logically.
With udev external info source, we can get device size via libudev's
sysfs reading interface and we can avoid opening the device this way
effectively.
Peter Rajnoha [Mon, 15 Dec 2014 14:12:42 +0000 (15:12 +0100)]
filter-composite: add external device info hooks
Composite filter is a filter that can put several filters in one set.
This patch adds a switch when creating the composite filter which will
enable or disable external device info handles for all the filters
the composite filter encompasses.
We want to use this external device info for majority of the filters
which are in the "lvmetad filter chain" (or the respective part if
we're not using lvmetad).
Following patches will use the enabled external device handle in
concrete filters from the composite filter...
Peter Rajnoha [Fri, 30 Jan 2015 10:13:49 +0000 (11:13 +0100)]
properties: also recognize LVSINFO, LVSSTATUS and LVSINFOSTATUS as subtypes of LVS
LVSINFO, LVSSTATUS and LVSINFOSTATUS is the same as LVS, just with some
extra info/status decoration attached to it. Recognize this when looking
for properties for lvm2app. This fixes lvm_lv_get_property lvm2app call
for fields which already use LVS{INFO,STATUS,INFOSTATUS} - currently,
this is lv_attr field which was converted to LVSINFOSTATUS from
pure LVS type.
This is a regression from v115 where some of the fields/properties
were converted to using the common "struct lvinfo" and
"struct lv_seg_status" so we don't need to issue info and status
ioctl several times per one reported line. Not all fields are
converted yet, but one that *is* converted is the lv_attr field
with the lv_attr_dup counterpart used in lvm_lv_get_attr lvm2app fn.
These changes were introduced with e34b004422f0d51263e0d34f4064556cfc9148f6
and later - this patch introduced the "info_ok" field in the
lv_with_info_and_seg_status structure which encapsulates the lvinfo
and lv_seg_status struct.
For the lv_attr_dup, the lv_attr_dup code missed the
assignment for the "info_ok" flag which saves the result of the
lv_info_with_seg_status call. Hence such info was marked
as unusable - unknown and it was returned as such via lvm_lv_get_attr
lvm2app fn.
Zdenek Kabelac [Wed, 28 Jan 2015 17:30:08 +0000 (18:30 +0100)]
raid: preload splitted LV only when active
Check splitted leg is active before preload.
(Since splitmirrors currently only does work active raid volumes
it's not a change for current code flow).
Minor optimization included - when already positively checked
for raid image don't check again for raid metadata.
Zdenek Kabelac [Wed, 28 Jan 2015 14:12:38 +0000 (15:12 +0100)]
thin: preserve chunksize with lvconvert
When repairing thin pool or swapping thin pool metadata,
preserve chunk_size property and avoid to be automatically changed
later in the code to better match thin pool metadata size.
Zdenek Kabelac [Wed, 28 Jan 2015 12:39:41 +0000 (13:39 +0100)]
raid: fix raid image splitting
When raid leg is extracted, now the preload code handles this state
correctly and put proper new table entry into dm tree,
so the activation of extracted leg and removed metadata works
after commit.
Zdenek Kabelac [Wed, 28 Jan 2015 12:36:25 +0000 (13:36 +0100)]
raid: fix tree preload for splitting raid images
When raid is being splitted, extracted leg & metadata
is still floating in the table - and thus we need to
detect this case and properly preload their matching
table so consequent activation of extracted LVs properly
renames (and FREES) existing raid images, so ongoing
image name shifting will work.
Peter Rajnoha [Wed, 21 Jan 2015 09:43:40 +0000 (10:43 +0100)]
report: rename lv_error_when_full field to lv_when_full and display either "error", "queue" or ""
Rename original lv_error_when_full field to lv_when_full and also
convert it from binary field to string field displaying three
possible values: "error", "queueu" or "" (blank for undefined).
David Teigland [Tue, 20 Jan 2015 19:08:22 +0000 (13:08 -0600)]
vgimportclone: remove arg check that uses pvs
The arg check using pvs is unnecessary. If the arg is not a PV,
the command will just fail later. Using the pvs command at this
point in the command is a problem when lvmetad is running, because
the pvs command does not report duplicate PVs when using lvmetad.
(Alternatively, use_lvmetad could be disabled by adding a --config
override to this pvs command.)
Peter Rajnoha [Tue, 20 Jan 2015 15:02:48 +0000 (16:02 +0100)]
report: add separate LVSINFOSTATUS field type for info+status combined fields
Add separate LVSINFOSTATUS field type for fields which display both
dm info-like and dm status-like information.
The internal interface is there with the introduction of LVSSTATUS
field type which can cope with the combination of LVSSTATUS
and LVSINFO field types (several fields).
However, till now, we considered that *single* field can display
either LVSINFO or LVSSTATUS, but not both at the same time.
Till now, we haven't had single field which needs both - hence
add LVSINFOSTATUS field type for such fields as we currently
need this for the lv_attr field which requires combination of
info and status.
This patch just adds interface for an ability to register such fields
(the code that copes with this is already in).
Zdenek Kabelac [Tue, 20 Jan 2015 12:14:16 +0000 (13:14 +0100)]
report: use same info also for lv_attr
Recently the single 'status' code has been used for number of cache
features.
Extend the API a little bit to allow usage also for lv_attr_dup.
As the function itself is used in lvm2api - add a new function:
lv_attr_dup_with_info_and_seg_status() that is able to use
grabbed info & status information.
report_init() is now using directly passed lvdm struct pointer
which holds the infomation whether lv_info() was correctly obtained or
there was some error when trying to read it.
Move 'healt' attribute to status.
TODO convert raid function to use the already known status.
raid_manip: v2 fix multi-segment misallocation on 'lvconvert --repair'
The previous patch felt short WRT disabling allocation on PVs holding other
legs of the RAID LV persistently; this patch introduces an internal,
transient PV flag PV_ALLOCATION_PROHIBITED to address this very problem.
General problem description for completeness:
An 'lvconvert --repair $RAID_LV" to replace a failed leg of a multi-segment
RAID10/4/5/6 logical volume can lead to allocation of (parts of) the replacement
image component pair on the physical volume of another image component
(e.g. image 0 allocated on the same PV as image 1 silently impeding resilience).
Patch fixes this severe resilince issue by prohibiting allocation on PVs
already holding other legs of the RAID set. It allows to allocate free space
on any operational PV already holding parts of the image component pair.
David Teigland [Wed, 14 Jan 2015 20:38:05 +0000 (14:38 -0600)]
toollib: search for duplicate PVs only when needed
A full search for duplicate PVs in the case of pvs -a
is only necessary when duplicates have previously been
detected in lvmcache. Use a global variable from lvmcache
to indicate that duplicate PVs exist, so we can skip the
search for duplicates when none exist.
David Teigland [Wed, 14 Jan 2015 20:16:03 +0000 (14:16 -0600)]
toollib: pvs -a should display VG name for each duplicate PV
Previously, 'pvs -a' displayed the VG name for only the device
associated with the cached PV (pv->dev), and other duplicate
devices would have a blank VG name. This commit displays the
VG name for each of the duplicate devices. The cost of doing
this is not small: for each PV processed, the list of all
devices must be searched for duplicates.
David Teigland [Tue, 13 Jan 2015 22:16:22 +0000 (16:16 -0600)]
toollib: override the PV device with duplicates
When multiple duplicate devices are specified on the
command line, the PV is processed once for each of them,
but pv->dev is the device used each time.
This overrides the PV device to reflect the duplicate
device that was specified on the command line. This is
done by hacking the lvmcache to replace pv->dev with the
device of the duplicate being processed. (It would be
preferable to override pv->dev without munging the content
of the cache, and without sprinkling special cases throughout
the code.)
This override only applies when multiple duplicate devices are
specified on the command line. When only a single duplicate
device of pv->dev is specified, the priority is to display the
cached pv->dev, so pv->dev is not overridden by the named
duplicate device.
In the examples below, loop3 is the cached device referenced
by pv->dev, and is given priority for processing. Only after
loop3 is processed/displayed, will other duplicate devices
loop0/loop1 appear (when requested on the command line.)
With two duplicate devices, loop0 and loop3:
# pvs
Found duplicate PV XhLbpVo0hmuwrMQLjfxuAvPFUFZqD4vr: using /dev/loop3 not /dev/loop0
PV VG Fmt Attr PSize PFree
/dev/loop3 loopa lvm2 a-- 12.00m 12.00m
# pvs /dev/loop3
Found duplicate PV XhLbpVo0hmuwrMQLjfxuAvPFUFZqD4vr: using /dev/loop3 not /dev/loop0
PV VG Fmt Attr PSize PFree
/dev/loop3 loopa lvm2 a-- 12.00m 12.00m
# pvs /dev/loop0
Found duplicate PV XhLbpVo0hmuwrMQLjfxuAvPFUFZqD4vr: using /dev/loop3 not /dev/loop0
PV VG Fmt Attr PSize PFree
/dev/loop3 loopa lvm2 a-- 12.00m 12.00m
Zdenek Kabelac [Tue, 13 Jan 2015 14:23:03 +0000 (15:23 +0100)]
thin: errrorwhenfull support
Support error_if_no_space feature for thin pools.
Report more info about thinpool status:
(out_of_data (D), metadata_read_only (M), failed (F) also as health
attribute.)
Zdenek Kabelac [Wed, 14 Jan 2015 09:31:24 +0000 (10:31 +0100)]
cleanup: update API for segment reporting
API for seg reporting is breaking internal lvm coding - it cannot
use vgmem mem pool for allocation of reported value.
So use separate pool instead of 'vgmem' for non vg related allocations
Add consts for many function params - but still many other are left
for now as non-const - needs deeper level of change even on libdm side.
raid_manip: fix multi-segment misallocation on 'lvconvert --repair'
An 'lvconvert --repair $RAID_LV" to replace a failed leg of a multi-segment
RAID10/4/5/6 logical volume can lead to allocation of (parts of) the replacement
image component pair on the physical volume of another image component
(e.g. image 0 allocated on the same PV as image 1 silently impeding resilience).
Patch fixes this severe resilince issue by prohibiting allocation on PVs
already holding other legs of the RAID set. It allows to allocate free space
on any operational PV already holding parts of the image component pair.
Peter Rajnoha [Mon, 12 Jan 2015 14:16:57 +0000 (15:16 +0100)]
tests: pvscan --cache DevicePath does not fail if the device is just filtered
It's not an error if the device is filtered out and hence cleared from
lvmetad cache - "pvscan --cache DevPath" has now the same behaviour in
this case as "pvscan --cache major:minor" (which is more consistent).
Before, the tests expected failure return code for "pvscan --cache DevicePath"
if the device was filtered (which is a different situation if the device
is missing in the system completely!).
Peter Rajnoha [Mon, 12 Jan 2015 13:02:57 +0000 (14:02 +0100)]
dev-type: filter out partitioned device-mapper devices as unsuitable for use as PVs
Normally, if there are partitions defined on top of device-mapper
device, there should be a device-mapper device created for each
partiton on top of the old one and once the underlying DM device
is used by another devices (partition mappings in this case),
it can't be used as a PV anymore.
However, sometimes, it may happen the partition mappings are
missing - either the partitioning tool is not creating them if
it does not contain full support for device-mapper devices or
the mappings were removed.
Better safe than sorry - check for partition header on DM devs
and filter them out as unsuitable for PVs in case the check is
positive. Whatever the user is doing, let's do our best to prevent
unwanted corruption (...by running pvcreate on top of such device
that would corrupt the partition header).
Peter Rajnoha [Mon, 12 Jan 2015 12:50:11 +0000 (13:50 +0100)]
pvscan: notify lvmetad about device that is gone and pvscan is run with device path instead of major:minor pair
If pvscan is run with device path instead of major:minor pair and this
device still exists in the system and the device is not visible anymore
(due to a filter that is applied), notify lvmetad properly about this.
This makes it more consistent with respect to existing pvscan with
major:minor which already notifies lvmetad about device that is gone
due to filters.
However, if the device is not in the system anymore, we're not able
to translate the original device path into major:minor pair which
lvmetad needs for its action (lvmetad_pv_gone fn). So in this case,
we still need to use major:minor pair only, not device path. But at
least make "pvscan --cache DevicePath" as near as possible to "pvscan
--cahce <major>:<minor>" functionality.
Also add a note to pvscan man page about this difference when using
pvscan --cache with DevicePath and major:minor pair.
Peter Rajnoha [Fri, 9 Jan 2015 15:34:16 +0000 (16:34 +0100)]
scripts: clvmd: replace awk functionality with LVM's selection
No need to use awk now to get appropriate VGs/LVs, use LVM's
own --select - it's quicker, it removes a need for external
dependency on awk and it's also more readable.
Peter Rajnoha [Fri, 9 Jan 2015 13:04:44 +0000 (14:04 +0100)]
metadata: add "Failed to write VG <vg_name>." on failed vg_write and revert previous patch
Better than previous patch which changed log_warn to log_error -
we can have multiple MDAs and if one of them fails to be written,
we can still continue with other MDAs if we're in a mode where
we can handle missing PVs - so keep the log_warn for single
failed MDA write as it was before.
However, add log_error with "Failed to write VG <vg_name>." in
case we're not handling missing PVs or no MDA was written at all
during VG write process. This also prevents an internal error in
which the vg_write fails and we're not issuing any other log_error
in vg_write caller or above, so we end up with:
"Internal error: Failed command did not use log_error".