Petr Rockai [Sun, 26 Feb 2012 08:49:40 +0000 (08:49 +0000)]
- Improve error reporting on lvmetad connection failure.
- Fix a couple of memory leaks in the lvmetad client code.
- Avoid an error in lvmetad_pv_gone when we aren't using lvmetad.
Peter Rajnoha [Fri, 24 Feb 2012 09:53:12 +0000 (09:53 +0000)]
Add skeleton for lvmetad udev rules - 69-dm-lvm-metad.rules.
Why using the order 69:
- Storage processing in general happens in 60-persistent-storage.rules,
including the blkid call that adds some usable information we can use
for filtering and speedup (these rules are part of upstream udev and
the order is preserved on most distros)
- There's still some other storage-related processing done after
60-persistent-storage.rules in general. These might add some detailed
storage-related information we might use to filter devices effectively
(e.g. MD udev rules, ...).
- We need lvmetad rules to be processed before any consumers can use the
output - so the metadata cache is ready soon enough (e.g. udisks rules).
- There's no official (upstream udev) document about assigning the order,
so this number is chosen in best belief it will suit all scenarios.
Petr Rockai [Thu, 23 Feb 2012 23:52:11 +0000 (23:52 +0000)]
Couple of improvements in the daemon (common + lvmetad) code:
- some client-side memory leak fixes
- announce and check protocols and protocol versions
Zdenek Kabelac [Thu, 23 Feb 2012 22:45:43 +0000 (22:45 +0000)]
Introduce dm_strncpy
Should be faster then strncpy - since we could avoid clearing 4KB pages
with each strncpy(...,PATH_MAX).
Also it's easy to check whether string fit - and eventually avoid
to continue working we incomplete string.
Zdenek Kabelac [Thu, 23 Feb 2012 18:05:12 +0000 (18:05 +0000)]
Limit number of mem allocs and copies
If we have good enough glibc to return number of needed chars, do not
loop try to reach good size, but use this size directly for allocation,
saving also last strdup.
Since now we start with 16 bytes - skip buffer realloc for shorter string.
Petr Rockai [Thu, 23 Feb 2012 14:55:29 +0000 (14:55 +0000)]
Add a vgscan to lvcreate-repair.sh. The old test applied device filter hacks to
make devices invisible to lvm, but the behaviour of those is slightly different
than of actual missing devices. Running vgscan after re-enabling the device
triggers a metadata repair which is not done by vgremove -ff. This is not a
regression, merely an odd behaviour that has been around even before lvmetad.
Peter Rajnoha [Thu, 23 Feb 2012 13:31:49 +0000 (13:31 +0000)]
Use DEFAULT_RUN_DIR instead of hardcoded value in lvmetad systemd units
and add ExecStartPost=vgscan to actually run the first scan that will
fill the metadata daemon with metadata information.
Fix allocation code to allow replacement of single RAID 4/5/6 device.
The code fail to account for the case where we just need a single device
in a RAID 4/5/6 array. There is no good way to tell the allocation functions
that we don't need parity devices when we are allocating just a single device.
So, I've used a bit of a hack. If we are allocating an area_count that is <=
the parity count, then we can assume we are simply allocating a replacement
device (i.e. no need to include parity devices in the calculations). This
should make sense in most cases. If we need to allocate replacement devices
due to failure (or moving), we will never allocate more than the parity count;
or we would cause the array to become unusable. If we are creating a new device,
we should always create more stripes than parity devices.
Peter Rajnoha [Wed, 22 Feb 2012 17:55:10 +0000 (17:55 +0000)]
Add configure --with-tmpfilesdir and lvm2 tmpfiles.d configuration file itself.
/etc/tmpfiles.d directory holds configuration files for temporary/volatile
files and directories that should be automatically managed. For example,
if we have some parts of the fs hierarchy on tmpfs, we'd like to recreate
some files or directories on every boot so they're always prepared for use.
Systemd can read such configuration files. For now, the lock and run directory
are the ones that are most probably placed on tmpfs. If this is the case, we
can install the configuration by 'make install_tmpfiles_configuration'.
Add some messages that indicate completion of RAID device replacement.
There were no messages printed upon completiion of RAID device replacement.
This could cause confusion/concern during automated recovery, because the
user sees the failure messages but no other messages indicating correction.
Petr Rockai [Tue, 21 Feb 2012 09:19:08 +0000 (09:19 +0000)]
Tweak lvmetad a bit more:
- allow at most one PV on any given device
- allow PV lookup by device
- merge the pvmeta info into VG metadata when responding to vg_lookup
Peter Rajnoha [Mon, 20 Feb 2012 19:36:27 +0000 (19:36 +0000)]
Check whether udev supports built-in blkid.
Built-in blkid is supported since udev v176 - set the UDEV_HAS_BUILTIN_BLKID
variable appropriately so we can use it in the rules to call the built-in
blkid conditionaly.
Petr Rockai [Wed, 15 Feb 2012 17:30:07 +0000 (17:30 +0000)]
Update lvmetad: use device major/minor pair to track devices. Keep a pvmeta
config tree per PV which is mostly provided by the client, so it can be used to
keep track of things like label_sector, PV format, mda count / offsets and so
on.
Zdenek Kabelac [Wed, 15 Feb 2012 15:18:43 +0000 (15:18 +0000)]
Initialize dmeventd monitoring for every command
Read lvm.conf setting for monitoring for each command. So we should not
activate monitoring if the default compilation is set to monitor during
lvconvert commnads.
Patch also removes check for clustered VG and allows to disable monitoring
for clustered VG with the assumption, the problem with monitoring and dmeventd
flag passing for INGNORE is already fixed.
Peter Rajnoha [Wed, 15 Feb 2012 14:50:33 +0000 (14:50 +0000)]
Add watch rule to 13-dm-disk.rules.
We don't have anything better yet...
The problems the watch rule caused when removing devices should be covered
now with the "retry remove" logic. It's also better to have this maintained
by us, rather than having this rule anywhere else without proper control.
Peter Rajnoha [Wed, 15 Feb 2012 12:17:34 +0000 (12:17 +0000)]
Replace any '\' char with '\\' in table specification on input.
Device-mapper in kernel uses '\' as escape character so it's better
to double it to avoid any confusion when using existing device names
with '\' in the table specification.
For example:
dmsetup create x --table "0 8 linear /dev/mapper/a\x20b 0"
should pass just fine now without a need to explicitly escape the '\' char
like this:
dmsetup create x --table "0 8 linear /dev/mapper/a\\x20b 0"
Peter Rajnoha [Wed, 15 Feb 2012 12:01:28 +0000 (12:01 +0000)]
Unamngle dm device name list automatically on ioctl return.
If dm_task_get_name or dm_task_get_names gets called, these will return
unmangled form of the names so the name mangling stays totally transparent
to any libdevmapper user (unless DM_STRING_MANGLING_NONE is used in which
case the name is not touched and it is is returned as it is in kernel).
For example:
dmsetup create "a b" - will create a\x20b device in kernel and so udev will
create /dev/mapper/a\x20b
dm_task_get_name/names will still return "a b"
In AUTO mode, the libdevmapper user can still query the device by using
the mangled ("a\x20b") or unmangled form of the name when calling dm_task_set_name.
If mangled name is provided, it's detected and the name is kept as it is.
If unmangled name is provided, it will be mangled. IOW in AUTO mode it's
totally transparent and it should not require any changes in the code
using libdevmapper.
However, any libdevmapper user must be aware of the fact that the mangled form
of the name appears in /dev/mapper (udev just can't deal with those blacklisted
characters).
Petr Rockai [Wed, 15 Feb 2012 11:43:06 +0000 (11:43 +0000)]
lvmetad server-side update:
- rename the hashes to be explicit about the mapping
- add VG/PV listing calls to the protocol
- cache slightly more of the per-PV state
- filter cached metadata
- compare the metadata upon metadata_update
Peter Rajnoha [Wed, 15 Feb 2012 11:39:38 +0000 (11:39 +0000)]
Add dm_task_get_name_mangled/unmangled to libdevmapper.
dm_task_get_name_mangled will always return mangled form of the name while
the dm_task_get_name_unmangled will always return unmangled form of the name
irrespective of the global setting (dm_set/get_name_mangling_mode).
This is handy in situations where we need to detect whether the name is already
mangled or not. Also display functions make use of it.
Peter Rajnoha [Wed, 15 Feb 2012 11:33:53 +0000 (11:33 +0000)]
Add DEV_NAME macro.
Use the DEV_NAME macro to use the mangled form of the name if present,
use normal name otherwise (we store both forms - mangled and unmangled in
struct dm_task). Mangled form should be always preferred over unmangled
with the exception of the situations where we divide one task into several
others (like "create and load") - we need to avoid mangling the name twice
(because of multiple dm_task_set_name calls)!
Peter Rajnoha [Wed, 15 Feb 2012 11:27:01 +0000 (11:27 +0000)]
Mangle device name on dm_task_set_name/newname call if necessary.
If dm_task_set_name/newname is called, the name provided will be
automatically translated to correct encoded form with the hex enconding
so any character not on udev whitelist will be mangled with \xNN
format where NN is hex value of the character used.
By default, the name mangling mode used is the one set during
configure with the '--with-default-name-mangling' option.
Peter Rajnoha [Wed, 15 Feb 2012 11:17:57 +0000 (11:17 +0000)]
Add configure --with-default-name-mangling.
This option configures the default name mangling mode used, one of:
AUTO, NONE and HEX.
The name mangling is primarily used to support udev character whitelist
(0-9, A-Z, a-z, #*-.:=@_) so any character that is not on udev whitelist
will get translated into an encoded form \xNN where NN is the hex value
of the character.
Make conversion from a synced 'mirror' to 'raid1' not cause a full resync.
It was not possible to pass down the DM_[FORCE|NO]SYNC flags to
'dm_tree_node_add_raid_target'. This meant that converting to 'raid1' from
'mirror' would cause a full resync. (It also meant that '--nosync' was
ineffective when creating a 'raid1' LV.)
I've taken the 'reserved' parameter in 'dm_tree_node_add_raid_target' and
used it for the "flags" parameter. Now it is possible to pass the sync
flags and any other flags that may come up.
Change confusing message that is printed when a RAID device fails.
s/Issue/Use/, otherwise it is easy to misread "Issue" as "Issuing" - causing
the user confusion as to whether the action was performed automatically or
whether they need to issue the command.
Fix possible NULL pointer dereferences when updating mirror log.
'_lv_update_log_type' takes a lvconvert_params argument so that it can pass
down the user's preference of 'region_size' and allocation_policy. When
'mirror_remove_missing' was introduced (commit ID 95986e42a18ca98c9b1d777346978b7297c85558) it didn't make sense to pass down
user preferences - so NULL was given instead. While it may never happen in
practice, static analysis reveals that this argument could be dereferenced.
So, if the user preferences were not passed in, glean the necessary fields
from what is already set in the LV.
Reported-by: Zdenek Kabelac <zkabelac@redhat.com> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
(Not updating WHATSNEW for this simple clean-up.)
Fix bug that caused RAID devices to be unable to activate if sub-LV was missing.
Commit 02f6f4902fd90709f55e2e97e969ee28c2945c81 introduced a bug that caused
RAID devices to fail to activate if the device for a single sub-LV failed.
The special case of LVM mirror was handled, but not LVM RAID.
EXAMPLE:
[root@bp-01 ~]# devices vg
LV Copy% Devices
lv 100.00 lv_rimage_0(0),lv_rimage_1(0)
[lv_rimage_0] /dev/sde1(1)
[lv_rimage_1] /dev/sdh1(1)
[lv_rmeta_0] /dev/sde1(0)
[lv_rmeta_1] /dev/sdh1(0)
[root@bp-01 ~]# vgchange -an vg
0 logical volume(s) in volume group "vg" now active
[root@bp-01 ~]# off.sh sdh
Turning off sdh
[root@bp-01 ~]# vgchange -ay vg --partial
Partial mode. Incomplete logical volumes will be processed.
Couldn't find device with uuid fbI0YO-GX7x-firU-Vy5o-vzwx-vAKZ-feRxfF.
Cannot activate vg/lv_rimage_1: all segments missing.
0 logical volume(s) in volume group "vg" now active
AFTER this patch:
[root@bp-01 ~]# vgchange -ay vg --partial
Partial mode. Incomplete logical volumes will be processed.
Couldn't find device with uuid fbI0YO-GX7x-firU-Vy5o-vzwx-vAKZ-feRxfF.
1 logical volume(s) in volume group "vg" now active
[root@bp-01 ~]# devices vg
Couldn't find device with uuid fbI0YO-GX7x-firU-Vy5o-vzwx-vAKZ-feRxfF.
LV Copy% Devices
lv 100.00 lv_rimage_0(0),lv_rimage_1(0)
[lv_rimage_0] /dev/sde1(1)
[lv_rimage_1] unknown device(1)
[lv_rmeta_0] /dev/sde1(0)
[lv_rmeta_1] unknown device(0)
[root@bp-01 ~]# dmsetup table vg-lv; dmsetup status vg-lv
0 1024000 raid raid1 3 0 region_size 1024 2 253:2 253:3 - -
0 1024000 raid raid1 2 AD 1024000/1024000
No WHATSNEW update necessary because this is an intrarelease fix.