Peter Rajnoha [Fri, 22 Apr 2011 12:05:32 +0000 (12:05 +0000)]
Obtain device list from udev by default if LVM2 is compiled with udev support.
Also, add a new 'obtain_device_list_from_udev' setting to lvm.conf with which
we can turn this feature on or off if needed.
If set, the cache of block device nodes with all associated symlinks
will be constructed out of the existing udev database content.
This avoids using and opening any inapplicable non-block devices or
subdirectories found in the device directory. This setting is applied
to udev-managed device directory only, other directories will be scanned
fully. LVM2 needs to be compiled with udev support for this setting to
take effect. N.B. Any device node or symlink not managed by udev in
udev directory will be ignored with this setting on.
Dave Wysochanski [Thu, 21 Apr 2011 17:03:38 +0000 (17:03 +0000)]
Add nightly test for vgimportclone and querying of vgnames with duplicate pvs.
Related to rhbz 697959.
This test fails prior to these two commits:
commit af112eb2c9a62c5d794df920218bd3ee291d5b25
Author: Zdenek Kabelac <zkabelac@redhat.com>
Date: Thu Apr 21 13:15:26 2011 +0000
Avoid using already released memory when a duplicated MDA is found.
As get_pv_from_vg_by_id() may call lvmcache_label_scan(), use the local copy
of the vgname and vgid on the stack, as vginfo may disappear and the code was
then accessing garbage in memory.
i.e. pvs /dev/loop0
(when /dev/loop0 and /dev/loop1 have the same MDA content)
Invalid read of size 1
at 0x523C986: dm_hash_lookup (hash.c:325)
by 0x440C8C: vginfo_from_vgname (lvmcache.c:399)
by 0x4605C0: _create_vg_text_instance (format-text.c:1882)
by 0x46140D: _text_create_text_instance (format-text.c:2243)
by 0x47EB49: _vg_read (metadata.c:2887)
by 0x47FBD8: vg_read_internal (metadata.c:3231)
by 0x477594: get_pv_from_vg_by_id (metadata.c:344)
by 0x45F07A: _get_pv_if_in_vg (format-text.c:1400)
by 0x45F0B9: _populate_pv_fields (format-text.c:1414)
by 0x45F40F: _text_pv_read (format-text.c:1493)
by 0x480431: _pv_read (metadata.c:3500)
by 0x4802B2: pv_read (metadata.c:3462)
Address 0x652ab80 is 0 bytes inside a block of size 4 free'd
at 0x4C2756E: free (vg_replace_malloc.c:366)
by 0x442277: _free_vginfo (lvmcache.c:963)
by 0x44235E: _drop_vginfo (lvmcache.c:992)
by 0x442B23: _lvmcache_update_vgname (lvmcache.c:1165)
by 0x443449: lvmcache_update_vgname_and_id (lvmcache.c:1358)
by 0x443C07: lvmcache_add (lvmcache.c:1492)
by 0x46588C: _text_read (text_label.c:271)
by 0x466A65: label_read (label.c:289)
by 0x4413FC: lvmcache_label_scan (lvmcache.c:635)
by 0x4605AD: _create_vg_text_instance (format-text.c:1881)
by 0x46140D: _text_create_text_instance (format-text.c:2243)
by 0x47EB49: _vg_read (metadata.c:2887)
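A minimal sketch of the pattern behind this fix, not the LVM2 code itself: take a stack copy of the name before calling anything that may free the cache entry, and use only the copy afterwards. struct vg_entry, entry_create(), rescan(), name_across_rescan() and SKETCH_NAME_LEN are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_NAME_LEN 128	/* assumed limit for this sketch */

struct vg_entry {		/* hypothetical stand-in for a cached vginfo */
	char *vgname;
};

struct vg_entry *entry_create(const char *name)
{
	struct vg_entry *e = malloc(sizeof(*e));
	size_t len = strlen(name) + 1;

	if (!e)
		return NULL;
	if (!(e->vgname = malloc(len))) {
		free(e);
		return NULL;
	}
	memcpy(e->vgname, name, len);
	return e;
}

/* Stand-in for a label scan: frees the old entry and returns a new one. */
static struct vg_entry *rescan(struct vg_entry *old)
{
	free(old->vgname);
	free(old);
	return entry_create("rescanned");
}

/*
 * Copy the name to the stack first, then rescan. Returns 1 and stores the
 * original name in 'out', or 0 if the name does not fit.
 */
int name_across_rescan(struct vg_entry **entry, char *out, size_t out_size)
{
	char vgname[SKETCH_NAME_LEN];
	size_t len;

	assert(entry && *entry && out);
	len = strlen((*entry)->vgname) + 1;
	if (len > sizeof(vgname) || len > out_size)
		return 0;
	memcpy(vgname, (*entry)->vgname, len);	/* local copy */

	*entry = rescan(*entry);		/* the old vgname is freed here */

	memcpy(out, vgname, len);		/* safe: reads the stack copy */
	return 1;
}
```

Reading (*entry)->vgname through a pointer saved before the rescan would be the same kind of invalid read that the valgrind trace above reports.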
Mike Snitzer [Wed, 13 Apr 2011 18:26:39 +0000 (18:26 +0000)]
Improve the discard documentation. Also improve discard code in
pv_manip.c to properly account for case when pe_start=0 and the first
physical extent is to be released (currently skip the first extent to
avoid discarding the PV label).
My previous patch fixed incorrect error check for dm_snprintf.
However in this particular case - dm_snprintf has been used differently -
just like strncpy + setting last char with '\0' - so the code had to return
error - because the buffer was too short for the whole string.
Patch replaces it with real strncpy.
Also test for alloca() failure is removed - as the program behaviour
is rather undefined in this case - it never returns NULL.
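A short sketch of the "strncpy + setting last char with '\0'" idiom mentioned above, using only standard C. copy_truncated() is a hypothetical helper name, not a function from LVM2 or libdevmapper. It truncates silently, which is the behaviour wanted when a too-long source string is not an error.

```c
#include <assert.h>
#include <string.h>

/* Copy at most dst_size - 1 bytes of src and always terminate dst. */
void copy_truncated(char *dst, size_t dst_size, const char *src)
{
	assert(dst && src);

	if (!dst_size)
		return;

	strncpy(dst, src, dst_size - 1);	/* may leave dst unterminated */
	dst[dst_size - 1] = '\0';		/* so terminate it explicitly */
}
```

A formatting function that reports truncation as a failure is the wrong tool here, because the caller would have to treat an expected, harmless truncation as an error.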
Thanks to Zdenek Kabelac (kabi) for pointing out that I was using
dm_pool_free incorrectly. This check-in fixes that incorrect usage.
I've also added a WHATS_NEW line to reflect the changes I made to allow
lv_extend to operate on 0 length intrinsically layered LVs (i.e. mirrors
and RAID). I forgot that in the last commit.
This patch adds the ability to extend 0 length layered LVs. This
allows us to allocate all images of a mirror (or RAID array) at one
time during create.
The current mirror implementation still requires a separate allocation
for the log, however.
Peter Rajnoha [Fri, 1 Apr 2011 14:54:20 +0000 (14:54 +0000)]
Cleanup fid finalization code in free_vg and allow exactly the same fid to be set again for a PV/VG.
Actually, we can call vg_set_fid(vg, NULL) instead of calling
destroy_instance for all PV structs and a VG struct - it's the same
code we already have in the vg_set_fid.
Also, allow exactly the same fid to be set again for the same PV/VG.
Before, this could end up with the fid destroyed because we destroyed
existing fid first and then we used the new one and we didn't care
whether existing one == new one by chance.
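The "same fid set again" problem can be sketched as follows. This is a simplified illustration that ignores reference counting; struct fmt_instance, struct holder and holder_set_fid() are hypothetical names, not the vg_set_fid code.

```c
#include <assert.h>
#include <stdlib.h>

struct fmt_instance {		/* hypothetical stand-in for a fid */
	int id;
};

struct holder {			/* hypothetical stand-in for a PV or VG */
	struct fmt_instance *fid;
	int destroyed;		/* number of instances destroyed so far */
};

/* Replace the holder's instance; setting the same one again is a no-op. */
void holder_set_fid(struct holder *h, struct fmt_instance *fid)
{
	assert(h);

	if (h->fid == fid)
		return;		/* without this, 'fid' would be freed and then stored */

	if (h->fid) {
		free(h->fid);
		h->destroyed++;
	}
	h->fid = fid;
}
```

Passing NULL releases the current instance through the same code path, which is the property the cleanup above relies on.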
Zdenek Kabelac [Wed, 30 Mar 2011 13:14:34 +0000 (13:14 +0000)]
Keep the cache content when the exported vg buffer is matching
Instead of regenerating the config tree and parsing the same data again,
check whether export_vg_to_buffer produces the same string as
the one already cached - in this case keep it, otherwise throw the cached
content away.
For code simplicity, _free_cached_vgmetadata() is called even with
vgmetadata == NULL, as the function handles this itself.
Note: sometimes export_vg_to_buffer() generates almost the same data
with just different time stamp, but for the patch simplicity,
data are reparsed in this case.
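A rough sketch of the "keep the cache if the exported text is unchanged" idea. struct md_cache and cache_update() are hypothetical names, and the parses counter merely stands in for rebuilding the config tree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct md_cache {	/* hypothetical cache of exported metadata text */
	char *text;	/* last exported text, or NULL */
	int parses;	/* how often the text had to be (re)parsed */
};

/* Returns 1 if the cache was kept, 0 if replaced, -1 on allocation failure. */
int cache_update(struct md_cache *c, const char *exported)
{
	char *copy;
	size_t len;

	assert(c && exported);

	if (c->text && !strcmp(c->text, exported))
		return 1;		/* identical: keep the cached content */

	len = strlen(exported) + 1;
	if (!(copy = malloc(len)))
		return -1;
	memcpy(copy, exported, len);

	free(c->text);			/* free(NULL) is a no-op */
	c->text = copy;
	c->parses++;			/* stand-in for reparsing */
	return 0;
}
```

As the note above says, text that differs only in a time stamp still counts as different in this scheme and is reparsed.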
Zdenek Kabelac [Wed, 30 Mar 2011 12:57:03 +0000 (12:57 +0000)]
Word alignment for strings
Align strdup char* allocation just on 2 bytes.
It looks like wasting space to align strings on 8 bytes.
(Could be even 1 byte - but for hashing it might eventually get better
performance - but probably hardly measurable).
TODO: check on various architectures it's not making any problems.
Zdenek Kabelac [Wed, 30 Mar 2011 12:36:19 +0000 (12:36 +0000)]
Better shutdown for clvmd
'a small step' towards cleaner shutdown sequence.
Normally clvmd doesn't care about unreleased memory on exit -
but for valgrind testing it's better to have them cleaned all.
So - few things are left on exit path - this patch starts to remove
just some of them.
1. lvm_thread_fs is made as a thread which could be joined on exit()
2. memory allocated to local_client_head list is released.
(this part is somewhat more complex if the proper reaction is
needed - and as it requires some heavier code moving - it will
be resolved later.)
Idea of the fix is rather defensive - to allocate one extra element
to 'map' array which is then used in _area_length() - where the
loop checks, whether next map entry is continuous.
By placing there always one extra zero entry -
we fix the read of unallocated memory, and we make sure the data would
not make a continuous block.
FIXME: there could be a problem if some special broken lvm1 data would be imported.
As the format1 is currently not really used - leave it for future fix
and use this small hotfix for now.
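The sentinel idea can be sketched like this. struct pe_map, map_alloc() and area_length() are hypothetical names loosely modelled on the description above, not the format1 code. The caller must pass a start index below the number of real entries.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct pe_map {			/* hypothetical map entry */
	const void *lv;		/* owner of the extent, NULL if unused */
	uint32_t le;		/* logical extent number */
};

/* Allocate 'count' entries plus one zeroed sentinel at index 'count'. */
struct pe_map *map_alloc(size_t count)
{
	return calloc(count + 1, sizeof(struct pe_map));
}

/* Length of the continuous run starting at 'start' (start < count). */
uint32_t area_length(const struct pe_map *map, uint32_t start)
{
	uint32_t len = 0;

	assert(map);

	/* map[start + len] may be the sentinel, which is always readable. */
	do
		len++;
	while (map[start + len].lv == map[start].lv &&
	       map[start].lv &&
	       map[start + len].le == map[start].le + len);

	return len;
}
```

The zeroed sentinel has a NULL owner and extent 0, so it does not extend a run of normal entries. An extent counter that wraps around to 0 would still match, which is the kind of broken input the FIXME above is concerned with.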
Zdenek Kabelac [Tue, 29 Mar 2011 21:49:18 +0000 (21:49 +0000)]
Const warning fixes
With the recent update of the dm_report_field_string() API call to accept
completely const objects, we no longer need to lose constness here
and can keep forwarding it.
Zdenek Kabelac [Tue, 29 Mar 2011 21:34:18 +0000 (21:34 +0000)]
Fix access to released memory
Invalid primary_vginfo was supposed to move all its lvmcache_infos to
orphan_vginfo - however it has called _drop_vginfo() inside the loop
that released primary_vginfo itself - thus made the loop using released
memory.
Use _vginfo_detach_info() instead and call _drop_vginfo after
the loop is finished.
Valgrind trace it should fix:
Invalid read of size 8
at 0x41E960: _lvmcache_update_vgname (lvmcache.c:1229)
by 0x41EF86: lvmcache_update_vgname_and_id (lvmcache.c:1360)
by 0x441393: _text_read (text_label.c:329)
by 0x442221: label_read (label.c:289)
by 0x41CF92: lvmcache_label_scan (lvmcache.c:635)
by 0x45B303: _vg_read_by_vgid (metadata.c:3342)
by 0x45B4A6: lv_from_lvid (metadata.c:3381)
by 0x41B555: lv_activation_filter (activate.c:1346)
by 0x415868: do_activate_lv (lvm-functions.c:343)
by 0x415E8C: do_lock_lv (lvm-functions.c:532)
by 0x40FD5F: do_command (clvmd-command.c:120)
by 0x413D7B: process_local_command (clvmd.c:1686)
Address 0x63eba10 is 16 bytes inside a block of size 160 free'd
at 0x4C2756E: free (vg_replace_malloc.c:366)
by 0x41DE70: _free_vginfo (lvmcache.c:980)
by 0x41DEDA: _drop_vginfo (lvmcache.c:998)
by 0x41E854: _lvmcache_update_vgname (lvmcache.c:1238)
by 0x41EF86: lvmcache_update_vgname_and_id (lvmcache.c:1360)
by 0x441393: _text_read (text_label.c:329)
by 0x442221: label_read (label.c:289)
by 0x41CF92: lvmcache_label_scan (lvmcache.c:635)
by 0x45B303: _vg_read_by_vgid (metadata.c:3342)
by 0x45B4A6: lv_from_lvid (metadata.c:3381)
by 0x41B555: lv_activation_filter (activate.c:1346)
by 0x415868: do_activate_lv (lvm-functions.c:343)
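A simplified sketch of the fix pattern: detach members inside the loop without ever freeing the container, and free the container only once the loop is done. struct group, struct info and the three functions are hypothetical names, not the lvmcache code.

```c
#include <assert.h>
#include <stdlib.h>

struct group;

struct info {			/* hypothetical stand-in for an lvmcache_info */
	struct info *next;
	struct group *owner;
};

struct group {			/* hypothetical stand-in for a vginfo */
	struct info *infos;
};

void group_attach(struct group *g, struct info *i)
{
	i->next = g->infos;
	i->owner = g;
	g->infos = i;
}

/* Unlink the first info from its group; never frees the group. */
struct info *group_detach_first(struct group *g)
{
	struct info *i = g->infos;

	if (i) {
		g->infos = i->next;
		i->next = NULL;
		i->owner = NULL;
	}
	return i;
}

/* Move every info to 'orphan', then free 'from'. Returns the number moved. */
int move_all_and_drop(struct group *from, struct group *orphan)
{
	struct info *i;
	int moved = 0;

	assert(from && orphan && from != orphan);

	while ((i = group_detach_first(from))) {
		group_attach(orphan, i);
		moved++;
	}

	free(from);	/* only after the loop no longer touches 'from' */
	return moved;
}
```

If the detach step freed the group as soon as it became empty, the next loop test would read from released memory.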
Zdenek Kabelac [Tue, 29 Mar 2011 21:05:39 +0000 (21:05 +0000)]
Fix sending uninitialised bytes in cluster messages
Fix 2 more functions sending cluster messages to avoid passing uninitialised bytes
and compensate for 1 extra byte attached to the message from the clvm_header.args[1]
member variable.
Alasdair Kergon [Fri, 25 Mar 2011 23:50:35 +0000 (23:50 +0000)]
Use hard-coded /dev/mapper/control details for 2.6.36+ kernels and simplify
associated code. (Some obscure configurations that happened to work before
are no longer supported.)
If _move_lv_segments is passed a 'lv_from' that does not yet
have any segments, it will screw things up because the code
that does the segment copy assumes there is at least one
segment. See copy code here:
lv_to->segments = lv_from->segments;
lv_to->segments.n->p = &lv_to->segments;
lv_to->segments.p->n = &lv_to->segments;
If 'segments' is an empty list, the first statement copies over
the values, but the next two reset those values to point to the
other LV's list structure. 'lv_to' now appears to have one
segment, but it is really an ill-set pointer.
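The problem and one possible guard can be sketched with a stand-alone circular list. struct list and its helpers are hypothetical and only mirror the n/p field names used above. With an empty 'from', from->n points at 'from' itself, so after the struct copy the two fix-up statements would rewrite 'from' instead of real elements.

```c
#include <assert.h>

struct list {			/* circular doubly linked list head/element */
	struct list *n, *p;
};

void list_init(struct list *head)
{
	head->n = head->p = head;
}

int list_empty(const struct list *head)
{
	return head->n == head;
}

/* Append 'elem' at the tail of 'head'. */
void list_add(struct list *head, struct list *elem)
{
	elem->n = head;
	elem->p = head->p;
	head->p->n = elem;
	head->p = elem;
}

/* Move all elements of 'from' to 'to'; 'to' is assumed to hold nothing. */
void list_move_all(struct list *to, struct list *from)
{
	assert(to && from && to != from);

	if (list_empty(from)) {
		list_init(to);	/* nothing to move: keep both lists valid */
		return;
	}

	*to = *from;		/* copy first/last pointers */
	to->n->p = to;		/* first element now points back at 'to' */
	to->p->n = to;		/* last element now points forward to 'to' */
	list_init(from);
}
```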
When I see 'seg_is_mirrored', I expect the argument to be an lv_segment.
In this case, it is lvcreate_params. Both structures have a 'segtype'
entry which the macro dereferences. However, it just seems easier to
understand if we do 'segtype_is_mirrored' instead.
Petr Rockai [Thu, 24 Mar 2011 12:28:02 +0000 (12:28 +0000)]
In some cases, we could end up with a mirrored LV without a MIRRORED flag. In
other cases, the code could wind up removing wrong number of mirrors. In yet
other cases, we could remove the right number of mirrors, but fail to respect
the removal preferences (i.e. keep an image that was requested to be removed
while removing an image that was requested to be kept). Under some
circumstances, remove_mirror_images could also get stuck in an infinite loop.
This patch should fix all of the above undesirable behaviours.
Signed-off-by: Petr Rockai <prockai@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Milan Broz [Fri, 18 Mar 2011 12:17:57 +0000 (12:17 +0000)]
Mitigate some warnings if running as non-root user.
LVM doesn't behave correctly when run as a non-root user and
prints a warning when it detects this.
Despite this, it also produces many error messages that say nothing useful.
See https://bugzilla.redhat.com/show_bug.cgi?id=620571
This patch fixes two things:
1) Removes error message from device_is_usable() which has no
information value anyway (real warning is printed inside it).
2) it fixes device-mapper initialization, if we support
core dm module autoload and device node is present, it should
fail early and not try to recreate the existing and correct node.
(non-root == permission denied here)
N.B. In future the code should support user roles, so some more
drastic checks in the code are probably counterproductive now.
Zdenek Kabelac [Mon, 14 Mar 2011 17:00:57 +0000 (17:00 +0000)]
Add missing \0 for grown debug object
Attach \0 for proper char* display - otherwise a somewhat random message could
be displayed in debug mode and unpredictable reads of uninitialized
memory values could happen.
Zdenek Kabelac [Sun, 13 Mar 2011 23:05:48 +0000 (23:05 +0000)]
Fix allocation of system_id
The code uses strncpy(system_id, NAME_LEN) and doesn't set '\0'.
Fix it by always allocating a NAME_LEN + 1 buffer with zalloc, so
we always get '\0' as the last byte.
This bug may trigger some unexpected behavior of the string operation
code - depends on the pool allocator.
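A sketch of the allocation pattern described above in plain C, with calloc() standing in for the pool zalloc. alloc_system_id() is a hypothetical helper, and the NAME_LEN value of 128 is an assumption made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 128	/* value assumed for this sketch */

/* Return a zeroed NAME_LEN + 1 buffer holding at most NAME_LEN bytes of src. */
char *alloc_system_id(const char *src)
{
	char *id;

	assert(src);

	if (!(id = calloc(1, NAME_LEN + 1)))
		return NULL;

	/* strncpy() never writes byte NAME_LEN, so it stays '\0'. */
	strncpy(id, src, NAME_LEN);
	return id;
}
```

With a buffer of exactly NAME_LEN bytes, a source of NAME_LEN or more characters would leave the copy unterminated.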
Zdenek Kabelac [Sun, 13 Mar 2011 22:57:51 +0000 (22:57 +0000)]
Fix buffer allocation size for uuid string
We have 3 components and trailing '\0' so allocate proper room for all of them.
Problem was nicely hidden by allocation from pool and allocation alignment
offset - so to trigger real problem with this one is actually hard.
Zdenek Kabelac [Sun, 13 Mar 2011 22:52:16 +0000 (22:52 +0000)]
Fix usage of readlink
Return value of readlink limits valid string size.
Characters after returned size present some garbage to printf.
Fix it by placing '\0' on the return size value.
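A minimal sketch of terminating a readlink() result, using only the POSIX call. read_link_target() is a hypothetical wrapper name. readlink() does not append '\0', so the wrapper reserves one byte and writes the terminator at the returned length.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the target of symlink 'path' into 'buf' as a terminated string.
 * Returns the target length, or -1 on error. A result of size - 1 may
 * mean the target was truncated.
 */
ssize_t read_link_target(const char *path, char *buf, size_t size)
{
	ssize_t len;

	assert(path && buf);

	if (size < 2)
		return -1;

	len = readlink(path, buf, size - 1);	/* leave room for '\0' */
	if (len < 0)
		return -1;

	buf[len] = '\0';	/* bytes past 'len' are not part of the target */
	return len;
}
```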
Peter Rajnoha [Fri, 11 Mar 2011 15:08:31 +0000 (15:08 +0000)]
Various cleanups for fid mem and ref_count changes.
Missing free_vg on error_path in lvmcache_get_vg fn. Call destroy_instance
only if the fid is not part of the vg in backup_read_vg fn (otherwise it's
part of the VG we're returning and we definitely don't want to destroy it!).
Peter Rajnoha [Fri, 11 Mar 2011 15:06:13 +0000 (15:06 +0000)]
Call destroy_instance for any PVs found in VG structure during vg_free call.
This is necessary for proper format instance ref_count support. We iterate
over vg->pvs and vg->removed_pvs list and the ref_count is decremented and
then it is destroyed if not referenced anymore.
Peter Rajnoha [Fri, 11 Mar 2011 14:56:56 +0000 (14:56 +0000)]
Add new free_pv_fid fn and use it throughout to free all attached fids.
Since format instances will use own memory pool, it's necessary to properly
deallocate it. For now, only fid is deallocated. The PV structure itself
still uses cmd mempool mostly, but anytime we'd like to add a mempool
in the struct physical_volume, we can just rename this fn to free_pv and
add the code (like we have free_vg fn for VGs).
Peter Rajnoha [Fri, 11 Mar 2011 14:50:13 +0000 (14:50 +0000)]
Use only vg_set_fid and new pv_set_fid fn to assign the format instance.
This is essential for proper format instance ref_count support. We must
use these functions to set the fid everywhere from now on, even the NULL
value!
Peter Rajnoha [Fri, 11 Mar 2011 14:45:17 +0000 (14:45 +0000)]
Make create_text_context fn static and move it inside create_instance fn.
We'd like to use the fid mempool for text_context that is stored
in the instance (we used cmd mempool before, so the order of
initialisation was not a matter, but now it is since we need to
create the fid mempool first which happens in create_instance fn).
The text_context initialisation is not needed anywhere outside the
create_instance fn so move it there.
Peter Rajnoha [Fri, 11 Mar 2011 14:38:38 +0000 (14:38 +0000)]
Add mem and ref_count fields to struct format_instance for own mempool use.
Format instances can be created anytime on demand and it contains
metadata area information mostly (at least for now, but in the future,
we may store more things here to update/edit in a PV/VG). In case we
have lots of metadata areas, memory consumption will rise. Using cmd
context mempool is not quite optimal here because it is destroyed too
late. So let's use a separate mempool for format instances.
Reference counting is used because fids could be shared, e.g. each PV
has either a PV-based fid or VG-based fid. If it's VG-based, each PV has
a shared fid with the VG - a reference to VG's fid.
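Sharing one instance between a VG and its PVs via a reference count could look roughly like this. struct fid, fid_create(), fid_get() and fid_put() are hypothetical names for illustration; the real code would also tear down the instance's own mempool at the point where this sketch calls free().

```c
#include <assert.h>
#include <stdlib.h>

struct fid {			/* hypothetical format instance */
	int ref_count;
	int *destroyed;		/* set to 1 when the instance is freed */
};

struct fid *fid_create(int *destroyed_flag)
{
	struct fid *f = calloc(1, sizeof(*f));

	if (f)
		f->destroyed = destroyed_flag;
	return f;
}

/* Take a reference, e.g. when a PV or VG starts using the instance. */
struct fid *fid_get(struct fid *f)
{
	assert(f);
	f->ref_count++;
	return f;
}

/* Drop a reference; the last user frees the instance. */
void fid_put(struct fid *f)
{
	assert(f && f->ref_count > 0);

	if (--f->ref_count)
		return;

	*f->destroyed = 1;
	free(f);
}
```

For a VG-based instance, the VG and each of its PVs would hold one reference, so the instance goes away only when the last of them lets go.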
Zdenek Kabelac [Thu, 10 Mar 2011 14:51:35 +0000 (14:51 +0000)]
Optimise _eat_space and _get_token
Makes the code more readable and reduces the number of memory
accesses, so it's a small optimisation as well.
For _get_token() optimize number parsing. Check for '.' char only
if it's not a digit. Move pointer incrementation into one place.
For _eat_space() check only p->te for '\0' in skipping of comment line.
Avoid check for '\0' when we know it is space. Also master while loop
doesn't need checking p->tb for '\0'. We just need to check p->tb
isn't already at the end of buffer. This could give 'extra' loop cycle
if we are already there - but saves a memory access in every other case.
Zdenek Kabelac [Thu, 10 Mar 2011 14:40:32 +0000 (14:40 +0000)]
Refactor code for _lv_postorder
Add _lv_postorder_vg() - for calling _lv_postorder() for every LV from VG.
We use this in 2 places - vg_mark_partial_lvs() and vg_validate()
so make it one function.
Benefit here is - to use only one cleanup code and avoid
potentially duplicate scans of same LVs.
gdbinit - A GDB init file to help while debugging LVM.
Copy this file as '.gdbinit' to your home directory or your working
directory. It adds the following commands to gdb:
- first_seg
- lv_status
- lv_status_r
- lv_is_mirrored
- seg_item
- seg_status
- segs_using_this_lv
You can get a list of these user-defined commands by typing:
(gdb) help user-defined
You can get more information on each command by typing:
(gdb) help <command>
Zdenek Kabelac [Thu, 10 Mar 2011 13:11:59 +0000 (13:11 +0000)]
Use hash tables for validating names
Accelerate validation loop by using lvname, lvid, pvid hash tables.
Also merge pvl loop into one cycle now - no need to scan the list twice.
List scan is stopped when dm_hash_insert fails.
The error message with loop_counter1 is no longer provided - however
the message has been misleading anyway.
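The duplicate check can be sketched with a tiny fixed-size hash set in place of the nested loops. This is an illustration only: struct name_set, name_set_insert() and first_duplicate() are hypothetical, and the real code uses the dm_hash tables rather than this open-addressing table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOTS 64	/* power of two; fixed capacity assumed for the sketch */

struct name_set {
	const char *slot[SLOTS];
	size_t used;
};

static size_t hash_name(const char *s)
{
	size_t h = 5381;

	while (*s)
		h = h * 33 + (unsigned char) *s++;
	return h & (SLOTS - 1);
}

/* Returns 1 if added, 0 if the name was already present, -1 if full. */
int name_set_insert(struct name_set *set, const char *name)
{
	size_t i;

	assert(set && name);

	/* One slot always stays empty, so the probe loop terminates. */
	for (i = hash_name(name); set->slot[i]; i = (i + 1) & (SLOTS - 1))
		if (!strcmp(set->slot[i], name))
			return 0;

	if (set->used == SLOTS - 1)
		return -1;

	set->slot[i] = name;
	set->used++;
	return 1;
}

/* Index of the first repeated name, -1 if all are unique, -2 if full. */
int first_duplicate(const char *const *names, size_t count)
{
	struct name_set set = { { NULL }, 0 };
	size_t i;
	int r;

	for (i = 0; i < count; i++) {
		if (!(r = name_set_insert(&set, names[i])))
			return (int) i;
		if (r < 0)
			return -2;
	}
	return -1;
}
```

Each name is looked up once on average instead of being compared against every other name, and the scan stops at the first failed insert.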
Zdenek Kabelac [Thu, 10 Mar 2011 12:43:29 +0000 (12:43 +0000)]
Refactor vg allocation code
Create new function alloc_vg() to allocate VG structure.
It takes pool_name (for easier debugging)
and also takes vg_name to further simplify code.
Move remainder of _build_vg_from_pds to _pool_vg_read
and use vg memory pool for import functions.
(it's been using smem -> fid mempool -> cmd mempool)
(FIXME: remove mempool parameter for import functions and use vg).
Move remainder of the _build_vg to _format1_vg_read