Milan Broz [Wed, 20 May 2009 11:09:49 +0000 (11:09 +0000)]
Use readahead of underlying device and not default (smaller) one.
When we are stacking LV over device, which has for some reason
increased read_ahead (e.g. MD RAID), the read_ahead hint
for libdevmapper is wrong (it is zero).
If the calculated read_ahead hint is zero, patch uses read_ahead of underlying device
(if first segment is PV) when setting DM_READ_AHEAD_MINIMUM_FLAG.
Because we are using dev-cache, it also store this value to cache for future use
(if several LVs are over one PV, BLKRAGET is called only once for underlying device.)
This should fix all the reamining problems with readahead mismatch reported
for DM over MD configurations (and similar cases).
Milan Broz [Tue, 19 May 2009 10:25:16 +0000 (10:25 +0000)]
Use lvconvert --repair in dmeventd DSO (mornfall)
This means two things:
1) Non-mirrored LVs will be no longer affected by mirror monitoring. (Before,
if you had a LV that went partially missing on a VG where a mirror leg failed,
this LV would be removed automatically by dmeventd... Probably not an
unrecoverable dataloss bug, but still quite unpleasant.)
2) If enough parallel PV space is available at the time of the mirror failure,
the failed devices will be automatically replaced using this spare space. Which
(and whether) free space may be used is still not configurable, but is a
planned feature. Since it is relatively easy to undo the action by converting
the mirror manually, I don't consider this to be a showstopper. In fact, I
think the compromise is much better than what we have now.
Milan Broz [Tue, 19 May 2009 09:48:32 +0000 (09:48 +0000)]
Use PV UUID in hash for device name when exporting metadata.
Currently code uses pv_dev_name() for hash when getting internal
"pvX" name.
This produce corrupted metadata if PVs are missing, pv->dev
is NULL and all these missing devices returns one name
(using "unknown device" for all missing devices as hash key).
Milan Broz [Wed, 13 May 2009 21:24:12 +0000 (21:24 +0000)]
Tidy format1 import LV function.
Later patch initializes lv->vg after the LV structure is prepared,
so pass through cmd context and do not use vg->cmd here.
Also move LV id calculation (which uses lv->vg too).
Also properly free memory pool if operation fails.
Dave Wysochanski [Wed, 13 May 2009 13:02:52 +0000 (13:02 +0000)]
Remove NON_BLOCKING lock flag from tools and set a policy to auto-set.
As a simplification to the tools and further liblvm, this patch pushes
the setting of NON_BLOCKING lock flag inside the lock_vol() call.
The policy we set is if any existing VGs are currently locked, we
set the NON_BLOCKING flag.
Milan Broz [Tue, 12 May 2009 19:09:21 +0000 (19:09 +0000)]
Fix first_seg() call for empty segment list.
The seg variable is temporary variable for list iterator,
code cannot expect that after iteration it remains NULL
(it contains non-NULL pointer here id list is empty).
Patch fixes first_seg function so it now correctly returns NULL
for empty segment list.
Milan Broz [Mon, 11 May 2009 10:28:45 +0000 (10:28 +0000)]
Introduce lvm2_install target.
Buildsystem support device-mapper only install,
but generic install tagret includes both dm+lvm2.
For distribution which uses separate install_device-mapper,
there is no way how to install lvm2 only
(so after installing lvm2 for packaging purposes
built system must remove installed device-mapper files).
Fix it by allowing lvm2_install target, similarily like
install_cluster for clvmd.
Milan Broz [Thu, 7 May 2009 12:11:50 +0000 (12:11 +0000)]
Fix PV datalign when for values starting prior to MDA area.
The dataalign value must always be aligned according
to MDA area.
The currect code checks if calculated value collides with
MDA area but not if the value is so small that it is
located before MDA starts.
Unfortunatelly there can be also MDA in the end of the device.
The patch adds simple check to avoid this miscalculation.
Patch expects that first MDA always starts on <= pagesize boundary
(this is true for all allowed label sector parameters).
Petr Rockai [Fri, 24 Apr 2009 08:00:48 +0000 (08:00 +0000)]
Avoid scanning non-PV devices in functional tests, otherwise lvconvert --repair
breaks for some reason -- possibly needs investagation, but this should fix it
in the meantime.
Milan Broz [Fri, 10 Apr 2009 09:59:18 +0000 (09:59 +0000)]
Introduce memory pool per volume group.
Since now, all code reading volume group is responsible for releasing
the memory allocated by calling vg_release(vg).
(For simplicity of use, vg_releae can be called for vg == NULL,
the same logic like free(NULL)).
Also providing simple macro for unlocking & releasing in one step,
tools usualy uses this approach.
The global memory pool (cmd->mem) should be used only for global
physical volume operations.
This patch have to be applied with all subsequent patches to complete
memory pool per vg logic.
Using separate memory pool has quite bit memory saving impact when
using large VGs, this is mainly needed when we have to use
preallocated and locked memory (and should not overflow from that
memory space).
Milan Broz [Fri, 10 Apr 2009 09:56:00 +0000 (09:56 +0000)]
Properly copy the whole pv structure for later use.
The all_pvs list, used in vg_read, should make its own private
copy of pv structures, otherwise (when vg will use its own pool)
it will point to released memory pool.
The same applies for get_pvs() call.
Patch adds pv_list copy helper and adds explicit memory pool
parameter into _copy_pv.
(Please note that all these helper functions cannot guarantee that
vg related fields are valid - proper vg read & lock must be used
if it is requested.)
Milan Broz [Wed, 8 Apr 2009 12:53:20 +0000 (12:53 +0000)]
Enable use of cached metadata for pvs & pvdisplay.
Currently PV commands, which performs full device scan, repeatly
re-reads PVs and scans for all devices.
This behaviour can lead to OOM for large VG.
This patch allows using internal metadata cache for pvs & pvdisplay,
so the commands scan the PVs only once.
(We have to use VG_GLOBAL otherwise cache is invalidated on every
VG unlock in process_single PV call.)