Commit | Line | Data |
---|---|---|
10c7d802 PR |
1 | Automatic device assembly by udev |
2 | ================================= | |
3 | ||
4 | We want to asynchronously assemble and activate devices as their components | |
5 | become available. Eventually, the complete storage stack should be covered, | |
6 | including: multipath, cryptsetup, LVM, mdadm. Each of these can be addressed | |
7 | more or less separately. | |
8 | ||
9 | The general plan of action is to simply provide udev rules for each of the | |
10 | device "types": for MD component devices, PVs, LUKS/crypto volumes and for | |
11 | multipathed SCSI devices. There's no compelling reason to have a daemon do these | |
12 | things: all systems that actually need to assemble multiple devices into a | |
13 | single entity already either support incremental assembly or will do so shortly. | |
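The idea of incremental assembly can be illustrated with a toy model. This is only a sketch under stated assumptions, not how mdadm or lvmetad are implemented: the class name IncrementalAssembler and its add() method are hypothetical, and it assumes every component event carries the full list of components its aggregate expects.

```python
class IncrementalAssembler:
    """Toy model: remember which components of each aggregate have appeared."""

    def __init__(self):
        # aggregate name -> set of component names seen so far
        self._seen = {}

    def add(self, aggregate, component, expected):
        """Record one component; return True once the aggregate is complete.

        `expected` is the full collection of components the aggregate needs.
        It is assumed here (hypothetically) to be readable from the metadata
        of any single component.
        """
        seen = self._seen.setdefault(aggregate, set())
        seen.add(component)
        return seen >= set(expected)


if __name__ == "__main__":
    asm = IncrementalAssembler()
    members = ["sda1", "sdb1"]
    print(asm.add("md0", "sda1", members))  # first member only
    print(asm.add("md0", "sdb1", members))  # second member completes md0
```

Because the state is updated one event at a time, no long-running process has to poll for devices: each udev event only needs to report one component and read back whether the aggregate became complete.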
14 | ||
15 | Whenever in this document we talk about udev rules, these may include helper | |
16 | programs that implement a multi-step process. In many cases, it can be expected | |
17 | that the functionality can be implemented in a couple of lines of shell (or a | |
18 | couple hundred lines of C). | |
19 | ||
20 | Multipath | |
21 | --------- | |
22 | ||
23 | For multipath, we will need to rely on SCSI IDs for now, until we have a better | |
24 | scheme of things, since multipath devices can't be identified until the second | |
25 | path appears, and unfortunately we need to decide whether a device is multipath | |
26 | when the *first* path appears. Anyway, the multipath folks need to sort this | |
27 | out, but it shouldn't be too hard. Just bring up multipathing on anything that | |
28 | appears and is set up for multipathing. | |
29 | ||
30 | LVM | |
31 | --- | |
32 | ||
33 | For LVM, the crucial piece of the puzzle is lvmetad, which allows us to build up | |
34 | VGs from PVs as they appear, and at the same time collect information on what is | |
d742cdf3 | 35 | already available. A command, pvscan --cache, is expected to be used to |
10c7d802 PR |
36 | implement udev rules. It is relatively easy to make this command print out a |
37 | list of VGs (and possibly LVs) that have been made available by adding any | |
38 | particular device to the set of visible devices. In other words, udev says "hey, | |
d742cdf3 | 39 | /dev/sdb just appeared", calls pvscan --cache, which talks to lvmetad, which |
10c7d802 PR |
40 | says "cool, that makes vg0 complete". pvscan takes this info and prints it out, |
41 | and the udev rule can then somehow decide whether anything needs to be done | |
42 | about this "vg0". Presumably a table of devices that need to be activated | |
43 | automatically is made available somewhere in /etc (probably just a simple list | |
44 | of volume groups or logical volumes, given by name or UUID, globbing | |
45 | possible). The udev rule can then consult this file. | |
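The lookup against such a table could be sketched as follows. This is a minimal illustration and everything in it is an assumption: the table format (one name or shell-style glob per line, with # comments), the function names, and the idea that the list of completed VGs has already been parsed from whatever pvscan prints.

```python
from fnmatch import fnmatchcase


def parse_activation_table(text):
    """Return the patterns from a hypothetical activation table.

    Assumed format: one VG/LV name or glob per line; '#' starts a comment.
    """
    patterns = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            patterns.append(entry)
    return patterns


def volumes_to_activate(completed, patterns):
    """Keep only the completed volumes that match at least one pattern."""
    return [name for name in completed
            if any(fnmatchcase(name, pattern) for pattern in patterns)]


# Hypothetical table contents; the real location and syntax are undecided.
TABLE = """\
# volume groups to activate automatically
vg0
data_*
"""

if __name__ == "__main__":
    completed = ["vg0", "scratch", "data_01"]
    print(volumes_to_activate(completed, parse_activation_table(TABLE)))
    # prints ['vg0', 'data_01']
```

Keeping the matching separate from the parsing means the same check could be applied to names or to UUIDs, whichever the table ends up listing.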
46 | ||
47 | Cryptsetup | |
48 | ---------- | |
49 | ||
50 | This may be the trickiest of the lot: the obvious hurdle here is that crypto | |
51 | volumes need to somehow obtain a key (passphrase, physical token or such), | |
52 | meaning there is interactivity involved. On the upside, dm-crypt is a 1:1 | |
53 | system: one encrypted device results in one decrypted device, so no assembly or | |
54 | notification needs to be done. While interactivity is a challenge, there are at | |
55 | least partial solutions around. (TODO: Milan should probably elaborate here.) | |
56 | ||
57 | (LUKS devices can probably be detected automatically. I suppose that | |
58 | non-LUKS devices can be looked up in crypttab by the rule, to decide on the | |
59 | appropriate action.) | |
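The decision a rule helper would have to make can be sketched like this. The LUKS on-disk header begins with the six magic bytes "LUKS" 0xba 0xbe, and crypttab lines list a mapping name followed by the source device; the function names, the returned action strings and the exact-match device comparison are hypothetical simplifications (a real helper would also have to resolve UUID= style entries).

```python
LUKS_MAGIC = b"LUKS\xba\xbe"  # first six bytes of a LUKS header


def is_luks(header):
    """Return True if the given leading bytes of a device look like LUKS."""
    return header[:len(LUKS_MAGIC)] == LUKS_MAGIC


def parse_crypttab(text):
    """Map source device -> mapping name from crypttab-style text.

    Only the first two fields (name, device) are used in this sketch.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 2:
            entries[fields[1]] = fields[0]
    return entries


def choose_action(device, header, crypttab):
    """Pick a (hypothetical) action label for a newly appeared device."""
    if is_luks(header):
        return "open-luks"
    if device in crypttab:
        return "open-plain:" + crypttab[device]
    return "ignore"


if __name__ == "__main__":
    tab = parse_crypttab("# name device keyfile options\nswap0 /dev/sdb2 /dev/urandom swap\n")
    print(choose_action("/dev/sda1", LUKS_MAGIC + b"\x00\x01", tab))
    print(choose_action("/dev/sdb2", b"\x00" * 8, tab))
    print(choose_action("/dev/sdc1", b"\x00" * 8, tab))
```

How the key is then obtained is outside the scope of this sketch; it only covers the non-interactive part of the decision.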
60 | ||
61 | MD | |
62 | -- | |
63 | ||
64 | Fortunately, MD (namely mdadm) already comes with a mechanism for incremental | |
65 | assembly (mdadm -I or such). We can assume that this fits with the rest of the | |
66 | stack nicely. | |
67 | ||
68 | ||
69 | Filesystem &c. discovery | |
70 | ======================== | |
71 | ||
72 | Considering other requirements that exist for storage systems (namely | |
73 | large-scale storage deployments), it is absolutely not feasible to have the | |
74 | system hunt automatically for filesystems based on their UUIDs. In a number of | |
75 | cases, this could mean activating tens of thousands of volumes. On small | |
76 | systems, asking for all volumes to be brought up automatically is probably the | |
77 | best route anyway, and once all storage devices are activated, scanning for | |
78 | filesystems is no different from today. | |
79 | ||
80 | In effect, no action is required on this count: only filesystems that are | |
81 | available on already active devices can be mounted by their UUID. Activating | |
82 | volumes by naming a filesystem UUID is useless, since to read the UUID the | |
83 | volume needs to be active first. |