Zdenek Kabelac [Wed, 25 Jan 2012 09:10:13 +0000 (09:10 +0000)]
Thin add support for origin_only suspend of thin volumes
Pass in the origin_only flag also for thin volumes - but curently the flag
is not used to its best.
FIXME: achieve the state where only thin volume snapshot origin is
suspended without its childrens - let's explore whether this may
happen automatically inside libdm (might be generic for other targets).
So the code would not need to annotate the node for this.
Zdenek Kabelac [Wed, 25 Jan 2012 09:06:43 +0000 (09:06 +0000)]
Thin add messages only for activation tree
Extend lv_activate_opts with bool flag to know for which purpose
dtree is created - and add message only for activation tree
(since that's the only place that may send them).
Extend validation check for thin snapshot creation and test whether
active snapshot origin is suspended before its snapshot is created
(useful in recover scenarios) - in this case also detect, whether
transaction has been already completed and avoid such suspend check
failure in that case.
Zdenek Kabelac [Wed, 25 Jan 2012 08:55:19 +0000 (08:55 +0000)]
Thin fix transaction_id incrementation and code refactoring
Add pool_has_message and use it in attach_pool_message.
Also update header to make more obvious which segment type is
expected as parameter.
Rename 'read_only' to 'no_update' (no auto update transaction_id)
to better fit how it's used.
Fix problem when there was only one stacked message replaced with delete
message that caused unwanted transaction_id increase.
Zdenek Kabelac [Wed, 25 Jan 2012 08:46:21 +0000 (08:46 +0000)]
Thin send messages on activation resume code path
Using PRELOAD part would lead to problems when the problem
would happen before vg_write and vg_commit.
Also this change is necessary for snapshot creation sequence.
Use suspend|resume_origin_only when up-converting RAID LVs, as mirrors do.
Failure to do so results in "Performing unsafe table load while X device(s) are
known to be suspended" errors. While fixing the problem in this way works and
is consistent with the way the mirror segment type does it, it would be nice
to find a solution that uses the generic suspend/resume calls.
Also included in this check-in are additions to the test suite that perform
conversions on RAID LVs under a snapshot. These tests are disabled for the
time being due to a kernel bug that is yet to be tracked down.
Fix the way RAID meta LVs are added to the dependency tree.
Similar to the "mirror" segment type's log device, _add_dev_to_dtree should
be called and not _add_lv_to_dtree when adding metadata sub-LVs to the deptree.
Since _add_lv_to_dtree was being called, 'origin_only' could be set if a
snapshot sits on top of the RAID device. This would cause the actual device
that needed to be added to be skipped in favor of the non-existant device,
"<foo>-real".
Zdenek Kabelac [Fri, 20 Jan 2012 16:59:58 +0000 (16:59 +0000)]
Update lvdisplay to show more info about thin LVs
Reformat name and path how the LV is represented with lvm1 compatible option,
to switch to the old way - which had number of problem - i.e. many links
do not exist - since for private devices we are not creating them.
Add more info about thin pools and volumes.
Preserve exclusive activation of cluster mirror when converting.
This patch to the suspend code - like the similar change for resume -
queries the lock mode of a cluster volume and records whether it is active
exclusively. This is necessary for suspend due to the possibility of
preloading targets. Failure to check to exclusivity causes the cluster target
of an exclusively activated mirror to be used when converting - rather than
the single machine target.
Zdenek Kabelac [Thu, 19 Jan 2012 15:39:41 +0000 (15:39 +0000)]
Thin disable snapshot creation when pool is over the threshold.
Since snapshot needs to suspend origin - it might lead to pool userspace
deadlock (as the pool will wait for new space in case it would be overfilled,
but dmeventd would not be able to resize it, as the lvcreate operation would
have kept the VG lock.)
To minimize the risk of such scenario - we prevent to create new snapshot
in case we are over the threshold - but beware, there is still small timewindow,
so keep threshold at some reasonable level!
Zdenek Kabelac [Thu, 19 Jan 2012 15:27:54 +0000 (15:27 +0000)]
Thin add function to read thin volume percent
This value returns percentage of 'mapped' size compared with total LV size.
(Without passed seg pointer it return highest mapped size - but it's
not used yet.)
Petr Rockai [Mon, 16 Jan 2012 08:25:32 +0000 (08:25 +0000)]
Beef up the lvmetad code with more functionality and a bunch of bugfixes. There
used to be a few mis-ordered memory accesses (release and access in the next
block). Fix that set_flag could have sometimes corrupted the flags being
modified.
A few issues with metadata tracking are sorted out as well now, and there are
only a few problems remaining before we can integrate lvmetad, mostly on the
client side:
- metadata areas need to be tracked in lvmetad (most likely to be addressed
through an extension of metadata, meaning no special support in lvmetad would
be needed)
- non-udev scanning code needs to be taught about telling lvmetad about device
disappearance (pvscan most importantly)
- this last item also needs to mesh with metadata inconsistencies and
suddenly-incomplete volume groups (aux disable_dev in tests); udev-based
scanning should address this separately and more elegantly
Petr Rockai [Sun, 15 Jan 2012 11:17:16 +0000 (11:17 +0000)]
Unfortunately, blank lines are sometimes produced by config serializer, and
this interferes with their role as message separator in the lvmetad
protocol. Switch to using "##" on an otherwise blank line as a separator.
Peter Rajnoha [Wed, 11 Jan 2012 12:46:19 +0000 (12:46 +0000)]
Support different device name types on output of dmsetup deps, ls and info -c command.
Add 'blkdevname' and 'blkdevs_used' field to dmsetup info -c -o.
Add 'blkdevname' option to dmsetup ls --tree to see block device names.
Add '-o options' to dmsetup deps and ls to select device name type on output.
Peter Rajnoha [Wed, 11 Jan 2012 12:34:44 +0000 (12:34 +0000)]
Add dm_device_get_name to get map name or block device name for given devno.
This is accomplished by reading associated sysfs information. For a dm device,
this is /sys/dev/block/major:minor/dm/name (supported in kernel version >= 2.6.29,
for older kernels, the behaviour is the same as for non-dm devices).
For a non-dm device, this is a readlink on /sys/dev/block/major:minor, e.g.
/sys/dev/block/253:0 --> ../../devices/virtual/block/dm-0.
The last component of the path is a proper kernel name (block device name).
One can request to read only kernel names by setting the 'prefer_kernel_name'
argument if needed.
Zdenek Kabelac [Mon, 9 Jan 2012 12:26:14 +0000 (12:26 +0000)]
Use sysfs to set/get of read-ahead
If we know major:minor number of device (which is known after resume) we will
try to use sysfs to set/get read ahead parameters of device.
This avoid potential problem of blocking commands like 'dmsetup info' awaiting
for device being usable for open/close - i.e. overfilled thin pool may block
such command.
Zdenek Kabelac [Thu, 5 Jan 2012 15:38:18 +0000 (15:38 +0000)]
Support rounding of percentage upward
We want to keep this logic -
when LV is extend - extend the LV by at least given amount,
when LV is reduced - reduce the LV by at most given amount.
So for this the rounding needs to be used.
Current logic which seems to satisfy give rule is to round up all
extent values for LV resize upward except for values with '-' sign
that are round downward.
This patch also fixes the problem when lvextend --use-polices tried
to extend LV the by i.e. 20% - but the resulting 20% were smaller
the extent size thus before this patch no extension happened.
Zdenek Kabelac [Thu, 22 Dec 2011 16:37:01 +0000 (16:37 +0000)]
Use new dmeventd_lvm2_command function in dmeventd plugins.
For snapshot, prepare whole command in front into private buffer.
Add also some missing '\n' for syslog messages.
For raid and mirror only convert creation of command line string.
This should avoid any unbound growth of mempool for dm_split_names.
Zdenek Kabelac [Thu, 22 Dec 2011 15:57:29 +0000 (15:57 +0000)]
Thin use helper function
Fix some minor outstading issue from thin plugin introduction -
Call dmeventd_lvm2_exit() in failpath for registration.
Add some missing '\n' in syslog messages.
Zdenek Kabelac [Wed, 21 Dec 2011 13:24:24 +0000 (13:24 +0000)]
Drop extra stat before open of device
Since the !(dev->flags & DEV_REGULAR) code path just called
dev_name_confirmed() which has just called 'stat()' inside,
remove duplicate second stat() call here.
Zdenek Kabelac [Wed, 21 Dec 2011 13:21:09 +0000 (13:21 +0000)]
Do not lstat common path prefix
When both path have identical prefix i.e. /dev/disk/by-id
skip 2 x lstat() for /dev /dev/disk /dev/disk/by-id
and directly lstat() only different part of the path.
Reduces amount of lstat calls on system with lots of devices.