Mike Snitzer [Wed, 17 Jun 2009 15:47:01 +0000 (15:47 +0000)]
Various vgimportclone fixes:
- validate the specified device is a PV and that it is in a VG
- automatically enable DEBUG (-d) if >= 4 -v instances were supplied
- preserve TMP_LVM_SYSTEM_DIR if it contains an lvm.conf and -d was
specified
- fix handling of special-case where PV is listed as "unknown device"
- more descriptive error when a PV is missing ("unknown device")
- unset LVM_SYSTEM_DIR if it was not originally set
- skip final vgscan if no changes were made
Dave Wysochanski [Mon, 15 Jun 2009 17:09:32 +0000 (17:09 +0000)]
Cleanup pvs, vgs, and lvs "-o" section in man pages (rhbz 500861).
http://bugzilla.redhat.com/show_bug.cgi?id=500861
- Update list of fields/columns for each command (a few missing).
- Update list order to match "-o help" output (easier to verify field list)
- Add "{pv|vg|lv}_all" description.
- Move "-o help" sentence above column/field list.
New sample man page for lvs (pvs and vgs are similar):
-o, --options
Comma-separated ordered list of columns. Precede the list with ’+’ to append to the
default selection of columns instead of replacing it.
Use -o lv_all to select all logical volume columns, and -o seg_all to select all logical
segment columns.
Use -o help to view the full list of columns available.
Milan Broz [Mon, 15 Jun 2009 11:56:31 +0000 (11:56 +0000)]
Fix memory leaks in toolcontext error path.
E.g.
# vgscan
Parse error at byte 2360 (line 54): expected a value
Failed to load config file /etc/lvm/lvm.conf
You have a memory leak (not released memory pool):
[0x818c788] library (12 bytes)
Dave Wysochanski [Wed, 10 Jun 2009 16:14:40 +0000 (16:14 +0000)]
In the new _vg_read_for_update(), we always do the check for CLUSTERED vg
status flag after reading the volume group. Thus, no need to set the flag
in vg_read() or clear it later before calling _vg_bad_status_bits().
Also, add back in the !lockingfailed() as part of the CLUSTERED flag check.
It's unclear why it was removed when the check was moved from
_vg_bad_status_bits() to inside _vg_lock_and_read().
There was an open question about the last check in the 'if' stmt for
lockingfailed() with a previous patch submitted. However, I would
defer that right now as it is a separate item and this patch should
be no functional change by including the !lockingfailed().
Petr acked this patch on 5/26 I just forgot to check it in.
Milan Broz [Tue, 9 Jun 2009 16:10:20 +0000 (16:10 +0000)]
Support crypt segment in libdevmapper tree.
- it can support multiple segments, but note that
to work properly, correct IV (initialization vector)
offset parameter must be set properly.
Because most usage of IV start offset is when we join
several crypto segments together (so iv_offset is the segment
start offset), DM_CRYPT_IV_DEFAULT is defined to simplify
the process.
Function accepts the string in cipher agrument (already
including chainmode and iv type; chainmode and iv parameters are NULL
in this case) or user can provide split parameters which will
join into dm-crypt cipher specification "cipher-chainmode-iv".
All these parameters must be supplied in correct dm-crypt format.
Various tools need to check for existence of a VG before doing something
(vgsplit, vgrename, vgcreate). Currently we don't have an interface to
check for existence, but the existence check is part of the vg_read* call(s).
This patch is an attempt to pull out some of that functionality into a
separate function, and hopefully simplify our vg_read interface, and
move those patches along.
vg_lock_newname() is only concerned about checking whether a vg exists in
the system. Unfortunately, we cannot just scan the system, but we must first
obtain a lock. Since we are reserving a vgname, we take a WRITE lock on
the vgname. Once obtained, we scan the system to ensure the name does
not exist. The return codes and behavior is in the function header.
You might think of this function as similar to an open() call with
O_CREAT and O_EXCL flags (returns failure with -EEXIST if file already
exists).
NOTE: I think including the word "lock" in the function name is important,
as it clearly states the function obtains a lock and makes the code more
readable, especially when it comes to cleanup / unlocking. The ultimate
function name is somewhat open for debate though so later we may rename.
Milan Broz [Sat, 6 Jun 2009 16:40:39 +0000 (16:40 +0000)]
Fix various problems in tests
- PPC uses 64k page, some caculations are wrong in tests
- file utility is buggy on PPC and cannot detect swap, use blkid instead
- read ahead is quietly rounded down according to page size
(for now use some common divisor value in test)
Milan Broz [Sat, 6 Jun 2009 16:37:15 +0000 (16:37 +0000)]
Suspend virtual origin before real snapshot.
Because preload of table for snapshot can produce snapshot
metadata (in kernel cow header) read.
Code should suspend origin first to avoid possible deadlock
when preloading (thus calling snapshot in-kernel constructor)
for origin with suspended cow device.
Milan Broz [Fri, 5 Jun 2009 20:00:52 +0000 (20:00 +0000)]
Fix double releasing of vg when repairing of vg is requested.
Several commands calls process_each_vg() and in provided
callback it explicitly recovers VG if inconsistent.
(vgchange, vgconvert, vgscan)
It means that old VG is released and reread but the function
above (process_one_vg) tries to unlock and release old VG.
Patch moves the repair logic into _process_one_vg() function.
It always tries to read vg (even inconsistent) and then decides
what to do according new defined parameter.
Also patch unifies inconsistent error messages.
The only slight change if for vgremove command, where
it now tries to repair VG before it removes if force arg is given.
(It works similar way before, just the order of operation changed).
Milan Broz [Wed, 3 Jun 2009 11:31:06 +0000 (11:31 +0000)]
Build shared parts with 'make' command (mpatocka)
When some parts of lvm are built as shared libraries (for example with
--with-snapshots=shared), the 'make' command does not build these parts.
The shared parts are built with 'make install' command.
This bug can be seen if you go to 'lib' subdirectory and type 'make'.
If you type 'make', the shared libraries are not built, if you type
'make all', the shared libraries are built.
The reason for the bug is the line $(SUBDIRS): $(LIB_STATIC)
If make is executed without any arguments, it makes the first target
in the Makefile. If the first target is '$(SUBDIRS): $(LIB_STATIC)',
it only builds static libraries.
This patch moves '$(SUBDIRS): $(LIB_STATIC)' after
include $(top_srcdir)/make.tmpl. make.tmpl contains the 'all' target
as its first target, so 'make' will be equivalent to 'make all' and
shared libraries will be build with 'make' command.
Milan Broz [Mon, 1 Jun 2009 14:43:27 +0000 (14:43 +0000)]
Fix convert polling to ignore LV with different UUID.
When mirror convert polling is started (mainly as backgound process,
in lvchange -a y or in lvconvert itself) it tries to read VG
and LV identified by its name.
Unfortunatelly, the VG can have already different LV under the same name,
and various more or less funny things can happen (note that
_finish_lvconvert_mirror suspends the volume for example).
(the typical example is our testing script which continuously recreates
LVs under the same name in the same VG.)
This patch adds optional uuid parameter which helps to properly
select the monitoring object. For lvconvert polling it is set to LV UUID
and both _get_lvconvert_vg and _get_lvconvert_lv uses it to read proper VG/LV.
(In the pvmove case it is NULL, here we poll for physical volume name).
Milan Broz [Mon, 1 Jun 2009 12:43:31 +0000 (12:43 +0000)]
Fix readahead calculation problems.
During vgreduce is failed mirror image replaced with error segment,
this segmant type has always area_count == 0.
Current code expects that there is at least one area with device,
patch fixes it by additional check (fixes segfault during vgreduce).
Also do not calculate readahead in every lv_info call, we only need
to cache PV readahead before activation calls which locks memory.
Alasdair Kergon [Wed, 27 May 2009 18:19:21 +0000 (18:19 +0000)]
Suppress 'removed' messages displayed when internal LVs are removed.
Fix lvchange -a and -p for sparse LVs.
Fix lvcreate --virtualsize to activate the new device immediately.
Milan Broz [Wed, 20 May 2009 11:09:49 +0000 (11:09 +0000)]
Use readahead of underlying device and not default (smaller) one.
When we are stacking LV over device, which has for some reason
increased read_ahead (e.g. MD RAID), the read_ahead hint
for libdevmapper is wrong (it is zero).
If the calculated read_ahead hint is zero, patch uses read_ahead of underlying device
(if first segment is PV) when setting DM_READ_AHEAD_MINIMUM_FLAG.
Because we are using dev-cache, it also store this value to cache for future use
(if several LVs are over one PV, BLKRAGET is called only once for underlying device.)
This should fix all the reamining problems with readahead mismatch reported
for DM over MD configurations (and similar cases).
Milan Broz [Tue, 19 May 2009 10:25:16 +0000 (10:25 +0000)]
Use lvconvert --repair in dmeventd DSO (mornfall)
This means two things:
1) Non-mirrored LVs will be no longer affected by mirror monitoring. (Before,
if you had a LV that went partially missing on a VG where a mirror leg failed,
this LV would be removed automatically by dmeventd... Probably not an
unrecoverable dataloss bug, but still quite unpleasant.)
2) If enough parallel PV space is available at the time of the mirror failure,
the failed devices will be automatically replaced using this spare space. Which
(and whether) free space may be used is still not configurable, but is a
planned feature. Since it is relatively easy to undo the action by converting
the mirror manually, I don't consider this to be a showstopper. In fact, I
think the compromise is much better than what we have now.
Milan Broz [Tue, 19 May 2009 09:48:32 +0000 (09:48 +0000)]
Use PV UUID in hash for device name when exporting metadata.
Currently code uses pv_dev_name() for hash when getting internal
"pvX" name.
This produce corrupted metadata if PVs are missing, pv->dev
is NULL and all these missing devices returns one name
(using "unknown device" for all missing devices as hash key).