Zdenek Kabelac [Wed, 13 Oct 2010 12:18:53 +0000 (12:18 +0000)]
Don't use floor() in _bitset_with_random_bits
Use _even_rand() function instead of floor() in _bitset_with_random_bits().
floor() function is missing in dietlibc (on architectures other than x86).
Moreover using floor() to clip rand results does not assure even result
distribution. _even_rand() uses integer arithmetic only and is designed to
return evenly distributed results.
> Looks OK to me. It took a while to decipher what is the exact meaning of
> the loop in _even_rand (to a non-pseudorandomness-expert) but I am
> fairly comfortable with it now. If I understand this correctly, it
> rejects numbers that come from an "incomplete" slice of the RAND_MAX
> space (considering the number space [0, RAND_MAX] is divided into some
> "max"-sized slices and at most a single smaller slice, between [n*max,
> RAND_MAX] for suitable n -- numbers from this last slice are discarded
> because they could distort the distribution in favour of smaller
> numbers).
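The slice-rejection idea described in the review above can be sketched as follows. This is a minimal illustration, not the LVM2 source: the name even_rand_sketch() and the use of plain rand() are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical sketch of rejection sampling (not the LVM2 _even_rand()).
 * Returns a value in [0, max) using integer arithmetic only.
 * Each result is equally likely provided rand() itself is uniform
 * over [0, RAND_MAX].
 */
static unsigned even_rand_sketch(unsigned max)
{
	unsigned r, limit;

	assert(max > 0 && max <= (unsigned) RAND_MAX);

	/*
	 * [0, RAND_MAX] holds RAND_MAX + 1 values. 'limit' is the end of
	 * the last complete max-sized slice; anything at or above it lies
	 * in the incomplete slice and is thrown away.
	 */
	limit = (((unsigned) RAND_MAX + 1u) / max) * max;

	do {
		r = (unsigned) rand();
	} while (r >= limit);

	return r % max;
}
```

Without the loop, a plain "rand() % max" would return the low values slightly more often whenever RAND_MAX + 1 is not a multiple of max.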
Petr Rockai [Wed, 13 Oct 2010 10:34:31 +0000 (10:34 +0000)]
Implement vgextend --restoremissing (BZ 537913), which makes it possible to
re-add a physical volume that has gone missing previously, due to a transient
device failure, without re-initialising it.
Signed-off-by: Petr Rockai <prockai@redhat.com>
Reviewed-by: Alasdair Kergon <agk@redhat.com>
Zdenek Kabelac [Fri, 8 Oct 2010 15:02:05 +0000 (15:02 +0000)]
Add support for noninteractive shell execution
Try to distinguish between running in an interactive shell and running
noninteractively - different combinations of the '-y' and '-p' options
need to be used for fsck.
Zdenek Kabelac [Fri, 8 Oct 2010 14:55:19 +0000 (14:55 +0000)]
Fix detection of mounted filesystem.
Update how fsadm detects a mounted filesystem.
With udev, /dev/dm-XXX paths are now returned - but mount and /proc/mounts
print names in the form /dev/mapper/vg-lv - so the match was not found.
Fixes RHBZ #638050.
The current solution uses the same trick as mount and detects the vg-lv name
through /sys where available - this should be reasonably safe.
Instead of calling mount without parameter to get actual mount table,
switch to use /proc/mounts directly.
Zdenek Kabelac [Fri, 8 Oct 2010 12:35:56 +0000 (12:35 +0000)]
Fix a serious bug in the behavior of the fsadm tool when interrupted.
Under certain conditions it was possible to interrupt (^C) fsadm before it
actually resized the filesystem, but lvresize, which executed fsadm, would
think the resize was successful and shrink the partition with the unresized
filesystem still on it.
Fix by returning error (1) for this case - this stops lvresize from
proceeding further with the resize operation.
Rename 'flags' to 'status' for struct metadata_area.
In other LVM memory structures such as volume_group, the field
used to store flags is called "status", and on-disk fields are called
'flags', so rename the one inside metadata_area to be consistent.
Not only is it more consistent with existing code but is cleaner
to say "the status of this mda is ignored".
Background for this patch - prajnoha pinged me on IRC this morning
about a fix he was working on related to metadataignore when
metadata/dirs was set. I was reviewing my patches from this year
and realized the 'flags' field was probably not the best choice
when I originally did the metadataignore patches.
Dave Wysochanski [Thu, 30 Sep 2010 14:09:45 +0000 (14:09 +0000)]
Add pv_get_property and create generic internal _get_property function.
We need to use a similar function for pv and lv properties, so just make
a generic _get_property() function that contains most of the required
functionality. Also, add a check to ensure the field name matches the
object passed in by re-using report_type_t enum. For pv properties,
the report_type might be either PVS or LABEL.
In addition, add 'const' to 'get' functions object parameter, but not
'set' functions. Add _not_implemented_set() and _not_implemented_get()
functions.
Dave Wysochanski [Thu, 30 Sep 2010 14:08:58 +0000 (14:08 +0000)]
Add 'get' functions for vg fields.
Add 'get' functions based on generic macros for VG, PV, and LV.
Add 'get' functions for vg string fields, vg_name, vg_fmt, vg_sysid,
vg_uuid, vg_attr, and vg_tags, and all numeric fields.
Add supporting functions for vg_name, vg_fmt, vg_system_id.
Append "_dup" to end of supporting functions to make clear the strings
are dup'd and to avoid namespace conflict with vg_name.
Dave Wysochanski [Thu, 30 Sep 2010 14:07:47 +0000 (14:07 +0000)]
Add pv_uuid_dup, vg_uuid_dup, and lv_uuid_dup, and call id_format_and_copy.
Add supporting functions for pv_uuid, vg_uuid, and lv_uuid.
Call new function id_format_and_copy. Use 'const' where appropriate.
Add "_dup" suffix to indicate memory is being allocated.
Call {pv|vg|lv}_uuid_dup from lvm2app uuid functions.
Dave Wysochanski [Thu, 30 Sep 2010 14:07:19 +0000 (14:07 +0000)]
Simplify logic to create 'attr' strings.
This patch addresses code review request to simplify creation of 'attr'
strings. The simplification is done in this separate patch to more
easily review and ensure the simplification is done without error.
Dave Wysochanski [Thu, 30 Sep 2010 13:52:55 +0000 (13:52 +0000)]
Add {pv|vg|lv}_attr_dup() functions and refactor 'disp' functions.
Move the creating of the 'attr' strings into a common function so
they can be called from the 'disp' functions as well as the new
'get' property functions.
Add "_dup" suffix to indicate memory is allocated.
Refactor pvstatus_disp to take pv argument and call pv_attr_dup().
Dave Wysochanski [Thu, 30 Sep 2010 13:05:45 +0000 (13:05 +0000)]
Refactor metadata.[ch] into lv.[ch] for lv functions.
This patch is similar to the other patches for pv and vg
functionality, and separates lv functionality into separate
files, concentrating on reporting fields and simple functions.
Dave Wysochanski [Thu, 30 Sep 2010 13:05:20 +0000 (13:05 +0000)]
Refactor metadata.[ch] into pv.[ch] for pv functions.
The metadata.[ch] files are very large. This patch makes a first
attempt at separating out pv functions and data, particularly
related to the reporting fields calculations.
More code could be moved here but for now I'm stopping at the reporting
'get' / 'set' functions.
Dave Wysochanski [Thu, 30 Sep 2010 13:04:55 +0000 (13:04 +0000)]
Refactor metadata.[ch] into vg.[ch] for vg functions.
The metadata.[ch] files are very large. This patch makes a first
attempt at separating out vg functions and data, particularly
related to the reporting fields calculations.
Read complete content of /proc/self/maps into one buffer without
reallocation in the middle of reading and before doing any m/unlock
operation with these lines - as some of them get changed.
With previous implementation we've read some mappings twice ([stack])
Milan Broz [Wed, 22 Sep 2010 13:45:21 +0000 (13:45 +0000)]
Fix handling of partial VG for lvm1 format metadata
If some lvm1 device is missing, lvm fails on all operations:
# vgcfgbackup -f bck -P vg_test
Partial mode. Incomplete volume groups will be activated read-only.
3 PV(s) found for VG vg_test: expected 4
PV segment VG free_count mismatch: 152599 != 228909
PV segment VG extent_count mismatch: 152600 != 228910
Internal error: PV segments corrupted in vg_test.
Volume group "vg_test" not found
Allow loading of lvm1 partial VG by allocating "new" missing PV,
which covers lost space. This fake missing PV also informs the code
that it is a partial VG.
Peter Rajnoha [Mon, 20 Sep 2010 14:25:27 +0000 (14:25 +0000)]
Revert to old glibc behaviour for vsnprintf used in emit_to_buffer function.
Revert to old glibc behaviour for vsnprintf used in emit_to_buffer fn.
Otherwise, the check that follows would be wrong for new glibc versions.
This caused the rh bug #633033 to go undetected and pass through the check,
corrupting the metadata!
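The kind of check involved can be sketched as follows. This is an illustration under stated assumptions, not the LVM2 emit_to_buffer() code: emit_sketch() and its cursor-style parameters are hypothetical. It relies only on the documented difference that C99 vsnprintf() returns the length the full output would have had, while very old C libraries returned -1 on truncation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not the LVM2 function).
 * Appends formatted text at *buf and advances the cursor.
 * Returns 1 on success, 0 if the text did not fit.
 */
static int emit_sketch(char **buf, size_t *size, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf && *buf && size && fmt);

	va_start(ap, fmt);
	n = vsnprintf(*buf, *size, fmt, ap);
	va_end(ap);

	/*
	 * Fold the C99 result back into the old convention: output that
	 * needed n characters plus the terminating NUL but had less room
	 * is reported as -1, exactly like a genuine error.
	 */
	if (n < 0 || (size_t) n >= *size)
		n = -1;

	if (n < 0)
		return 0;

	*buf += n;
	*size -= (size_t) n;
	return 1;
}
```

A caller that only tests for a negative return value would treat truncated output as complete on a C99 library, which is the class of mistake the commit message describes.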
Peter Rajnoha [Thu, 9 Sep 2010 13:13:12 +0000 (13:13 +0000)]
Add random suffix to archive file names to prevent races when being created.
In certain configurations, we're not under a VG rw lock while trying to write
a new archive file with VG metadata. A common example is using "vgs" while
having the content of backup and archive directories empty. The code scans the
content of these directories and tries to determine the final index that should
be used in archive name. Since we're not under a lock, we can get into a race
while choosing the index which could end up showing errors about not being able
to rename to final archive name. Let's add random number suffix to these archive
file names so we can avoid the race.
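A minimal sketch of the idea, assuming a file name layout of <dir>/<vg>_<index>-<random>.vg. The layout, the function name archive_name_sketch() and the use of rand() are illustrative assumptions, not taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: build an archive file name whose index may be
 * chosen by two unlocked processes at once. The random suffix makes
 * it unlikely that both end up with the same final name.
 * Returns 1 on success, 0 if the name did not fit into 'out'.
 */
static int archive_name_sketch(char *out, size_t len, const char *dir,
			       const char *vg, unsigned index)
{
	int n;

	assert(out && dir && vg);

	n = snprintf(out, len, "%s/%s_%05u-%d.vg", dir, vg, index, rand());

	return n >= 0 && (size_t) n < len;
}
```

The suffix does not remove the race on the index itself; it only keeps two racing writers from colliding on the rename.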
Peter Rajnoha [Thu, 9 Sep 2010 13:07:13 +0000 (13:07 +0000)]
Reinitialize archive and backup handling on toolcontext refresh.
For example, when using '--config "backup { ... }"' line, the values from
lvm.conf (or default values) should be overridden. This patch adds
reinitialisation of archive and backup handling on toolcontext refresh
so that these settings are applied.
This patch fixes an issue where cluster mirror write I/O
can be opprobriously slow if created with '--nosync'.
One of the ways cluster mirrors coordinate I/O and recovery
among the different machines is by the use of the log
function 'is_remote_recovering()' which lets nodes know if
a region they wish to perform a write on is currently being
recovered on another node. If the region is being recovered,
the I/O is delayed.
The 'is_remote_recovering' routine has been optimized to
avoid the deluge of requests that would be issued to the
userspace log server by maintaining a marker of how far
the recovery has gotten. It can then immediately return
'not recovering' if the region being inquired about is
less than this mark. Additionally, if the region of
concern is greater than the mark, the function will
limit the number of transmissions to userspace by assuming
the region /is/ being recovered when skipping the
transmission. This limits the amount of processing
and updates the mark in 1/4 sec time steps.
This patch fixes a problem where 'the mark' is not being
updated because of faulty logic in the userspace log
daemon. When '--nosync' is used to create a cluster
mirror, the userspace log daemon never has a chance
to update the mark in the normal way. The fix is to set
the mark to "complete" if the mirror was created with
the --nosync flag.
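The marker logic described above can be sketched roughly as follows. Everything here is a simplified illustration: the struct, the function names, the callback standing in for the round trip to the userspace log server, and the millisecond clock passed in by the caller are all assumptions, not the kernel or daemon code.

```c
#include <assert.h>
#include <stdint.h>

#define QUERY_INTERVAL_MS 250	/* the 1/4 sec step from the description */

struct recovery_hint {
	uint64_t in_sync_mark;	/* regions below this are known recovered */
	uint64_t next_query_ms;	/* earliest time to ask userspace again */
};

/* Stand-in for the request to the userspace log server. */
typedef int (*query_fn)(uint64_t region, uint64_t *new_mark);

static int is_remote_recovering_sketch(struct recovery_hint *h,
				       uint64_t region, uint64_t now_ms,
				       query_fn query)
{
	int recovering;

	assert(h && query);

	if (region < h->in_sync_mark)
		return 0;	/* behind the mark: no request needed */

	if (now_ms < h->next_query_ms)
		return 1;	/* too soon: assume it /is/ recovering */

	recovering = query(region, &h->in_sync_mark);
	h->next_query_ms = now_ms + QUERY_INTERVAL_MS;

	return recovering;
}

/* What the fix amounts to for --nosync: the mark starts out complete. */
static void mark_complete_sketch(struct recovery_hint *h)
{
	assert(h);
	h->in_sync_mark = UINT64_MAX;
}

/* Demo stand-in: pretends recovery has reached region 100. */
static unsigned demo_queries;

static int demo_query(uint64_t region, uint64_t *new_mark)
{
	demo_queries++;
	*new_mark = 100;
	return region == 100;
}
```

If the mark is never advanced, every write lands in the second or third branch, so each one is either delayed or costs a request. That matches the slowdown the patch addresses.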
This patch fixes a problem where the mirror polling process
may never complete.
If you convert from a linear to a mirror and then convert that
mirror back to linear /while/ the previous (up)convert is
taking place, the mirror polling process will never complete.
This is because the function that polls the mirror for
completion doesn't check if it is still polling a mirror and
the copy_percent that it gets back from the linear device is
certainly never 100%.
The fix is simply to check if the daemon is still looking at
a mirror device - if not, return PROGRESS_CHECK_FAILED.
The user sees the following output from the first (up)convert
if someone else sneaks in and does a down-convert shortly
after their convert:
[root@bp-01 ~]# lvconvert -m1 vg/lv
vg/lv: Converted: 43.4%
ABORTING: Mirror percentage check failed.
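The shape of the added check can be sketched as below. Only the name PROGRESS_CHECK_FAILED comes from the description above; the other enum values, the struct and the function name are hypothetical.

```c
#include <assert.h>

typedef enum {
	PROGRESS_CHECK_FAILED,	/* named in the commit message */
	PROGRESS_UNFINISHED,	/* hypothetical */
	PROGRESS_FINISHED_ALL	/* hypothetical */
} progress_sketch_t;

/* Hypothetical stand-in for what the poller learns about the LV. */
struct lv_sketch {
	int is_mirrored;
	double copy_percent;
};

static progress_sketch_t poll_mirror_sketch(const struct lv_sketch *lv)
{
	assert(lv);

	/*
	 * A linear device never reports 100%, so without this test the
	 * poller would wait forever after a concurrent down-convert.
	 */
	if (!lv->is_mirrored)
		return PROGRESS_CHECK_FAILED;

	return lv->copy_percent >= 100.0 ? PROGRESS_FINISHED_ALL
					 : PROGRESS_UNFINISHED;
}
```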
This patch fixes a potential for I/O to hang and LVM commands
to block when a mirror under a snapshot suffers a failure.
The problem has to do with label scanning. When a mirror suffers
a failure, the kernel blocks I/O to prevent corruption. When
LVM attempts to repair the mirror, it scans the devices on the
system for LVM labels. While mirrors are skipped during this
scanning process, snapshot-origins are not. When the origin is
scanned, it kicks up I/O to the mirror (which is blocked)
underneath - causing the label scan (and thus the repair operation)
to hang.
This patch simply bypasses snapshot-origin devices when doing
label scans (while ignore_suspended_devices() is set). This
fixes the issue.
Milan Broz [Mon, 23 Aug 2010 11:34:10 +0000 (11:34 +0000)]
Fix pvmove --abort to work even for empty pvmove LV
If pvmove crashed and metadata contains pvmove LV
but without mirrored segments, pvmove --abort
will not repair the situation (and finish with success!).
Fix it by allowing metadata update if aborting
(thus removing pvmove LV) even if no moved LVs detected.
(Tested on real metadata provided by an lvm user:-)
Mike Snitzer [Sat, 21 Aug 2010 15:43:45 +0000 (15:43 +0000)]
Verify that pvcreate --dataalignment really does override the topology
detected alignment.
NOTE: lvm2 doesn't detect MD 1.2 metadata (now the default on RHEL6) so
for now I'm forcing 1.0 metadata. This was needed to be able to reuse
the existing loop devices but recreate the md device with different
raid0 striping.
Mike Snitzer [Fri, 20 Aug 2010 20:59:05 +0000 (20:59 +0000)]
Update heuristic used for default and detected data alignment.
Add "devices/default_data_alignment" to lvm.conf to control the internal
default that LVM2 uses: 0==64k, 1==1MB, 2==2MB, etc.
If --dataalignment (or lvm.conf's "devices/data_alignment") is specified
then it is always used to align the start of the data area. This means
the md_chunk_alignment and data_alignment_detection are disabled if set.
(Same now applies to pvcreate --dataalignmentoffset, the specified value
will be used instead of the result from data_alignment_offset_detection)
set_pe_align() still looks to use the determined default alignment
(based on lvm.conf's default_data_alignment) if the default is a
multiple of the MD or topology detected values.
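The selection order described above can be sketched like this. The function names and the use of KiB units are assumptions for the example, and the final fallback (use the detected value when the default is not a multiple of it) is inferred from the description rather than stated in it.

```c
#include <assert.h>

/* devices/default_data_alignment: 0==64k, 1==1MB, 2==2MB, etc. */
static unsigned long default_alignment_kib(unsigned setting)
{
	return setting ? setting * 1024UL : 64UL;
}

/*
 * Hypothetical sketch of the precedence, all values in KiB:
 *   explicit_kib  from --dataalignment or devices/data_alignment (0 = unset)
 *   default_kib   from default_alignment_kib()
 *   detected_kib  from MD chunk or topology detection (0 = nothing found)
 */
static unsigned long choose_alignment_kib(unsigned long explicit_kib,
					  unsigned long default_kib,
					  unsigned long detected_kib)
{
	assert(default_kib > 0);

	if (explicit_kib)
		return explicit_kib;	/* always wins, detection is skipped */

	if (!detected_kib)
		return default_kib;

	/* Keep the default when it already satisfies the detected value. */
	if (default_kib % detected_kib == 0)
		return default_kib;

	return detected_kib;		/* assumed fallback */
}
```

For example, with a 1MB default and a detected 512KiB stripe, the default is kept because 1024 is a multiple of 512.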
Dave Wysochanski [Fri, 20 Aug 2010 12:44:47 +0000 (12:44 +0000)]
Add properties.[ch] to lib/report, defined based on columns.h.
Extend the existing reporting infrastructure definitions and structures
to include a 'get' and 'set' function for each field. We will provide
a 'get' and 'set' function for each of these fields, which will be utilized
by exported lvm2app functions.
Define a default _not_implemented 'get' and 'set' function that just sets
an errno and returns 0. Future patches will actually implement the
specific 'get' and 'set' functions for each property. For read-only
properties, only the 'get' function will be implemented.
Define vg_get_property() function to query a property. We will call
this from a lvm2app function.
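A rough sketch of such a table of 'get' and 'set' functions with a not-implemented default follows. The struct layout, the function names and the choice of ENOSYS as the errno value are assumptions for illustration; the description above only says that an errno is set and 0 is returned.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

struct prop_sketch;
typedef int (*prop_get_fn)(const void *obj, struct prop_sketch *p);
typedef int (*prop_set_fn)(void *obj, struct prop_sketch *p);

struct prop_sketch {
	const char *id;			/* report field name */
	unsigned long long value;	/* filled in by 'get' */
	prop_get_fn get;
	prop_set_fn set;
};

/* Defaults: flag the call as unsupported and return 0 (failure). */
static int not_implemented_get(const void *obj, struct prop_sketch *p)
{
	(void) obj;
	(void) p;
	errno = ENOSYS;		/* assumed errno value */
	return 0;
}

static int not_implemented_set(void *obj, struct prop_sketch *p)
{
	(void) obj;
	(void) p;
	errno = ENOSYS;		/* assumed errno value */
	return 0;
}

/* Hypothetical stand-in for a volume group. */
struct vg_sketch {
	unsigned long long extent_count;
};

static int vg_extent_count_get(const void *obj, struct prop_sketch *p)
{
	const struct vg_sketch *vg = obj;

	p->value = vg->extent_count;
	return 1;
}

/* Read-only field: real 'get', default 'set'. */
static struct prop_sketch props[] = {
	{ "vg_extent_count", 0, vg_extent_count_get, not_implemented_set },
	{ "vg_name", 0, not_implemented_get, not_implemented_set },
};

static int vg_get_property_sketch(const struct vg_sketch *vg,
				  const char *name, struct prop_sketch *out)
{
	size_t i;

	assert(vg && name && out);

	for (i = 0; i < sizeof(props) / sizeof(props[0]); i++)
		if (!strcmp(props[i].id, name)) {
			*out = props[i];
			return out->get(vg, out);
		}

	errno = EINVAL;		/* unknown field name */
	return 0;
}
```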
Dave Wysochanski [Fri, 20 Aug 2010 12:44:30 +0000 (12:44 +0000)]
Add macro definitions to report infrastructure for character array length.
Rather than hard code the size of the field, use a #define, so we can re-use.
The #define will be needed in a future patch when we extend the reporting
infrastructure to have 'get' and 'set' functions for each field, allowing
lvm2app functions which query any report field. In order to provide a
generic lookup based on the field id, we will define a type containing this
field id, and thus, we will need to re-use the length of this string as
it's defined inside libdevmapper.h.
Dave Wysochanski [Fri, 20 Aug 2010 12:44:17 +0000 (12:44 +0000)]
Remove explicit double quotes from columns.h 'id' entries.
The 'id' entries in columns.h are the report field names. Since these are
unique, we'd like to use them in generation of 'get' / 'set' functions.
As a step towards using them for this purpose, remove the explicit double
quotes and use the macro '#' character to add the double quotes back when
placing them into the '_fields' array 'id' member.
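The technique can be sketched with a small self-contained field list. The list, the struct and the generated function names below are made up for the example and are much simpler than columns.h; only the use of '#' to restore the quotes is taken from the description above.

```c
#include <assert.h>
#include <string.h>

/* Field list with bare, unquoted ids (hypothetical entries). */
#define FIELD_LIST \
	FIELD(vg_name, "VG") \
	FIELD(vg_uuid, "VG UUID")

struct field_sketch {
	const char *id;
	const char *heading;
};

/* Pass 1: '#id' turns the bare token vg_name into the string "vg_name". */
#define FIELD(id, heading) { #id, heading },
static const struct field_sketch fields[] = { FIELD_LIST };
#undef FIELD

/*
 * Pass 2: because the id is a bare token, it can also be pasted into
 * an identifier. This generates vg_name_heading() and vg_uuid_heading().
 */
#define FIELD(id, heading) \
	static const char *id ## _heading(void) { return heading; }
FIELD_LIST
#undef FIELD

static const struct field_sketch *find_field(const char *id)
{
	size_t i;

	assert(id);

	for (i = 0; i < sizeof(fields) / sizeof(fields[0]); i++)
		if (!strcmp(fields[i].id, id))
			return &fields[i];

	return NULL;
}
```

With the quotes left in the list, the second pass would not be possible, since a string literal cannot be pasted into a function name.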