Zdenek Kabelac [Thu, 11 Aug 2011 17:34:30 +0000 (17:34 +0000)]
Lock memory for shared VG
Use debug pool locking functionality. So the command could check,
whether the memory in the pool has not been modified.
For lv_postoder() instead of unlocking and locking for every changed
struct status member do it once when entering and leaving function.
(mprotect would trap each such memory access).
Currently lv_postoder() does not modify other part of vg structure
then status flags of each LV with flags that are reverted back to
its original state after function exit.
Zdenek Kabelac [Thu, 11 Aug 2011 17:29:04 +0000 (17:29 +0000)]
Add memory pool locking functions
Adding debuging functionality to lock and unlock memory pool.
2 ways to debug code:
crc - is default checksum/hash of the locked pool.
It gets slower when the pool is larger - so the check is only
made when VG is finaly released and it has been used more then
once.Thus the result is rather informative.
mprotect - quite fast all the time - but requires more memory and
currently it is using posix_memalign() - this could be
later modified to use dm_malloc() and align internally.
Tool segfaults when locked memory is modified and core
could be examined for faulty code section (backtrace).
Only fast memory pools could use mprotect for now -
so such debug builds cannot be combined with DEBUG_POOL.
Zdenek Kabelac [Thu, 11 Aug 2011 17:24:23 +0000 (17:24 +0000)]
Cache and share generated VG structs
Extend vginfo cache with cached VG structure. So if the same metadata
are use, skip mda decoding in the case, the same data are in use.
This helps for operations like activation of all LVs in one VG,
where same data were decoded giving the same output result.
Patch adds 1-to-1 connection between volume_group and lvmcache_vginfo.
Compiler complaining that meta_lv could be used uninitialized. (Not true
because it is protected by 'clear_metadata'.) I switched to using 'lv->vg',
as it makes no difference to vg_[write|commit].
Milan Broz [Wed, 10 Aug 2011 16:07:53 +0000 (16:07 +0000)]
If anything bad happens and unlocking fails
(here clvmd crashed in the middle of operation),
lock is not removed from cache - here is one example:
locking/cluster_locking.c:497 Locking VG V_vg_test UN (VG) (0x6)
locking/cluster_locking.c:113 Error writing data to clvmd: Broken pipe
locking/locking.c:399 <backtrace>
locking/locking.c:461 <backtrace>
Internal error: Volume Group vg_test was not unlocked
Code should always remove lock info from lvmcache and update counters
on unlock, even if unlock fails.
Peter Rajnoha [Tue, 9 Aug 2011 11:44:57 +0000 (11:44 +0000)]
Suppress low-level locking errors and warnings while using --sysinit.
Today, we use "suppress_messages" flag (set internally in init_locking fn based
on 'ignorelockingfailure() && getenv("LVM_SUPPRESS_LOCKING_FAILURE_MESSAGES")'.
This way, we can suppress high level messages like "File-based locking
initialisation failed" or "Internal cluster locking initialisation failed".
However, each locking has its own sequence of initialization steps and these
could log some errors as well. It's quite misleading for the user to see such
errors and warnings if the "--sysinit" is used (and so the ignorelockingfailure
&& LVM_SUPPRESS_LOCKING_FAILURE_MESSAGES environment variable). Errors and
warnings from these intermediary steps should be suppressed as well if requested.
This patch propagates the "suppress_messages" flag deeper into locking init
functions. I've also added these flags for other locking types for consistency,
though it's not actually used for no_locking and readonly_locking.
Zdenek Kabelac [Thu, 4 Aug 2011 15:18:10 +0000 (15:18 +0000)]
Remove unused inconsistent_seqno
Last usage was removed in Petr's commit related to VG mda repair fix
where relaxed check starts to ignore inconsistencies coming from
PVs that are marked MISSING - thus removing unused variable.
Zdenek Kabelac [Thu, 4 Aug 2011 12:40:24 +0000 (12:40 +0000)]
Minor memory leak fix
Defer the test of the function return value after the string memory is released.
Otherwise in this error path the string would present memory leak.
(Thought in this case we are already out of memory...)
Basic support includes:
- ability to create RAID 1/4/5/6 arrays
- ability to delete RAID arrays
- ability to display RAID arrays
Notable missing features (not included in this patch):
- ability to clean-up/repair failures
- ability to convert RAID segment types
- ability to monitor RAID segment types
Peter Rajnoha [Tue, 2 Aug 2011 10:49:57 +0000 (10:49 +0000)]
Change DEFAULT_UDEV_SYNC to 1 so udev_sync is used even without any config.
This should be set by default! Normally we have "activation/udev_sync = 1"
in lvm.conf (example.conf.in). But if we use lvm2 without any config file
(or without a definition within '--config' option) the DEFAULT_UDEV_SYNC
is used instead. Together with verify_udev_operations=0 (when we rely on
udev fully), this can cause races as the node could be missing when needed.
(See also https://bugzilla.redhat.com/show_bug.cgi?id=723144)
Peter Rajnoha [Thu, 28 Jul 2011 13:06:50 +0000 (13:06 +0000)]
Add support for systemd file descriptor handover in dmeventd.
Systemd preloads file descriptors for us and passes them in for
newly spawned daemon when using on-demand fifo (or socket)
based activation.
This patch adds checks for file descriptors preloaded by
systemd and uses them instead of opening the FIFOs again
to properly support on-demand FIFO-based activation.
(We'll change FIFOs to sockets soon - but still this
part of the code will stay almost the same.)
Peter Rajnoha [Thu, 28 Jul 2011 13:03:37 +0000 (13:03 +0000)]
Add support for new oom killer adjustment interface (oom_score_adj).
The filename to adjust the oom score was changed in 2.6.36.
We should use oom_score_adj instead of oom_adj (which is still
there under /proc, but it's scheduled for removal in August 2012).
New oom_score_adj uses a range from -1000 (OOM_SCORE_ADJ_MIN,
disable oom killing) to 1000 (OOM_SCORE_ADJ_MAX).
Petr Rockai [Mon, 25 Jul 2011 17:59:50 +0000 (17:59 +0000)]
lvmetad: Edit the MISSING_PV flags only after making a "reply" copy of the
metadata, which is then serialised and discarded. This fixes a couple of
outstanding TODO items about handling the MISSING flags correctly.
Petr Rockai [Mon, 25 Jul 2011 15:51:51 +0000 (15:51 +0000)]
lvmetad: Check integrity of multiple metadata copies, i.e. ensure that seqno
equality implies metadata equality. Signal error in response to any update
requests that try to overwrite metadata without providing a higher seqno.
Compare also file size to detect changed config file
Clvmd detects modifed config file before it takes lv_lock.
If the config file is changed rapidly - the change was ignored within
a seocnd ranged. This patch adds also compare of file size.
So change like some flag for 0 to 1 would pass unnoticed - but
it's quick fix for failing test suite.
Petr Rockai [Wed, 20 Jul 2011 21:33:41 +0000 (21:33 +0000)]
lvmetad: Obliterate vg_status by returning the same information from
update_pv_status, saving a dozen lines of code and execution time of one
walkthrough of the PV list.
Petr Rockai [Wed, 20 Jul 2011 21:23:43 +0000 (21:23 +0000)]
First stab at making lvmetad-core threadsafe. The current design should allow
very reasonable amount of parallel access, although the hash tables may become
a point of contention under heavy loads. Nevertheless, there should be orders
of magnitude less contention on the hash table locks than we currently have on
block device scanning.
Petr Rockai [Wed, 20 Jul 2011 16:46:40 +0000 (16:46 +0000)]
Make lvmetad also report VGID in reply when adding a PV without MDAs (this
obviously only works for VGs that already had at least some MDA discovered).
Petr Rockai [Mon, 18 Jul 2011 14:42:44 +0000 (14:42 +0000)]
Improve format_buffer in daemon-shared.c, adding block formatting in addition
to string/integer (this propagates to the *simple* family of request/response
functionality).
Petr Rockai [Mon, 18 Jul 2011 14:34:33 +0000 (14:34 +0000)]
Revert the #include changes. Need to fix this at the #include site for now, and
eventually refactor the way we structure #includes in the all of the library.
where the naming is left completely on lvm.
(Commited code has been different version of test).
So here it should be able to figure out new free name and create a new LV.
Petr Rockai [Mon, 11 Jul 2011 12:13:07 +0000 (12:13 +0000)]
Fix t-vgreduce-usage to stop relying on the persistent cache not seeing a
device that has been brought back from the dead: this sometimes fails with
clvmd (the cache is updated "too soon"). Instead, force a pvscan and rely on an
up-to-date cache as usual.
Move snapshot deactivation logic into lib/activate, fixing the
teardown sequence. (Previously the snapshot was deactivated while its
origin was active and before its removal was committed to disk, so
restarting after a crash at the point would leave corruption.)
Always perform preload logic before suspending - not only in the case when we
have precommitted metadata. (Necessary to avoid loading tables
while suspend in lvchange --refresh.)