This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH 1/3] Reshuffle manual sections on cryptography and random numbers.

From: Rical Jasan <rj at 2c3t dot io>
To: Zack Weinberg <zackw at panix dot com>
Cc: libc-alpha at sourceware dot org, carlos at redhat dot com, fweimer at redhat dot com, nmav at redhat dot com
Date: Mon, 30 Apr 2018 19:46:16 -0700
Subject: Re: [PATCH 1/3] Reshuffle manual sections on cryptography and random numbers.
References: <20180430162731.2971-1-zackw@panix.com> <20180430162731.2971-2-zackw@panix.com>
On 04/30/2018 09:27 AM, Zack Weinberg wrote:
> The "unpredictable bytes" section of crypt.texi and the "pseudo-random
> numbers" section of math.texi are moved to a new chapter, random.texi;
> the obsolete DES encryption functions are also moved to a new chapter,
> des.texi.  This leaves crypt.texi devoted entirely to password handling.
> 
> I completely dropped the "Legal Problems" section, because it is more
> than a little out-of-date and we don't have the manpower or the
> expertise to keep it up to date.  Better not to say anything at all
> than to say something misleading.  All discussion of FIPS compliance
> has also been dropped.
> 
> I also revised the description of the 'crypt' function a little, and
> wrote new introductory text for crypt.texi, des.texi, and
> random.texi.  More could be done.

Overall, this looks like a good start to me, content-wise.  However, I
have some concern over the creation of a whole chapter for "Obsolete
Encryption".  I don't think that sets a desirable precedent for chapter
topics.  I think combining the disparate sections on randomness is a
good call, though.

What do you think about continuing to use crypt.texi for cryptographic
topics and creating sections for "Obsolete Encryption" and "Random
Number Generation" there (while otherwise keeping the rest of the
changes, such as dropping "Legal Problems" and FIPS compliance, updating
the description of crypt, etc.)?  "Password Handling" could likewise be
made its own section.

>         * manual/des.texi, manual/random.texi: New manual chapters.
>         * manual/crypt.texi: "DES encryption" section moved to
>         des.texi; "unpredictable bytes" section moved to random.texi;
>         "legal problems" section removed.  New introductory text.
>         Improve explanation of salt formats.  Mention support for
>         SHA-2-based hashes.
>         * manual/math.texi: "Pseudo-Random Numbers" section moved to
>         random.texi.
> 	* manual/arith.texi, manual/conf.texi, manual/time.texi:
>         Adjust next/prev links.
>         * manual/string.texi (memfrob, strfry): Drop cross-references
>         to DES encryption.

[snip Makefile]

[snip arith.texi (chapter adjustment)]

[snip conf.texi (chapter adjustment)]

> diff --git a/manual/crypt.texi b/manual/crypt.texi
> index 99d2d8e092..d967ca32c2 100644
> --- a/manual/crypt.texi
> +++ b/manual/crypt.texi
> @@ -1,95 +1,25 @@
> -@c This node must have no pointers.
> -@node Cryptographic Functions
> -@c @node Cryptographic Functions, Debugging Support, System Configuration, Top
> -@chapter DES Encryption and Password Handling
> -@c %MENU% DES encryption and password handling
> +@node Password Handling, Obsolete Encryption, System Configuration, Top
> +@chapter Password Handling
> +@c %MENU% Password Handling

...

The changes to the password handling section (including crypt and
crypt_r) look good to me.

[snip DES Encryption (removed)]

> diff --git a/manual/des.texi b/manual/des.texi
> new file mode 100644
> index 0000000000..51db650e0e
> --- /dev/null
> +++ b/manual/des.texi
> @@ -0,0 +1,195 @@
> +@node Obsolete Encryption, Debugging Support, Password Handling, Top
> +@chapter Obsolete Encryption
> +@c %MENU% Obsolete Encryption
> +@cindex DES
> +@cindex Data Encryption Standard
> +
> +For historical reasons, @theglibc{} includes several functions which
> +perform encryption using the obsolete Data Encryption Standard (DES).
> +None of these functions should be used in new programs.  Instead, use
> +one of the many free encryption libraries that use modern ciphers.
> +
> +DES is a block cipher standardized by the US government in 1977.
> +It is no longer considered to be secure, and has been withdrawn as a
> +standard, because it only has @math{2@sup{56}} possible keys, so
> +testing all of them is practical.
> +@c
> +@c In 1998, it cost US$250,000 to build a massively parallel computer
> +@c that could test all the keys in three days.
> +@c
> +@c It would be nice to say how much a similar machine would cost now
> +@c (2018), and how fast it would be.

I thought [0] was pretty interesting (I didn't see how much their
machine costs, but the homepage says how much cracking a DES key costs
on various platforms and how long it usually takes).

> +@deftypefun void setkey (const char *@var{key})
> +@standards{SVID, stdlib.h}
> +@standards{GNU, crypt.h}

I see the @standards for setkey and encrypt were changed.  Previously,
both were attributed to BSD and SVID, with crypt.h as the header.
Perhaps someone else could confirm that BSD was not an appropriate
standard for these functions, and that it was GNU that used crypt.h for
the declarations.

> +@safety{@prelim{}@mtunsafe{@mtasurace{:crypt}}@asunsafe{@asucorrupt{} @asulock{}}@acunsafe{@aculock{}}}
> +@c The static buffer stores the key, making it fundamentally
> +@c thread-unsafe.  The locking issues are only in the initialization
> +@c path; cancelling the initialization will leave the lock held, it
> +@c would otherwise repeat the initialization on the next call.
> +
> +The @code{setkey} function prepares to perform DES encryption or
> +decryption using the key @var{key}.  @var{key} should point to an
> +array of 64 @code{char}s, each of which must be set to either @samp{0}
> +or @samp{1}; that is, each byte stores one bit of the key.  Every
> +eighth byte (array indices 7, 15, 23, @dots{}) must be set to give it
> +plus the preceding group of seven bytes odd parity.  For instance, if
> +there are three bytes set to @samp{1} among bytes 0 through 6, then
> +byte 7 must be set to @samp{0}, and similarly if there are four bytes
> +set to @samp{1} among bytes 8 through 14, then byte 15 must be set to
> +@samp{1}, and so on.  Thus, of the 64 bytes, only 56 can be used to
> +supply key data.

A much improved description, IMO.
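
For anyone following along, here is a rough sketch of that layout in C.
It assumes each array entry holds the numeric value 0 or 1 (my reading
of the description), and expand_key_with_parity is a hypothetical
helper, not something glibc provides.  With odd parity, a group with
three ones gets a parity entry of 0 and a group with four ones gets 1.

```c
#include <stddef.h>

/* Hypothetical helper (not part of glibc): expand 56 key bits into the
   64-entry array layout described for setkey.  Each entry holds one
   bit as the numeric value 0 or 1 (an assumption about the format).
   Entries 7, 15, 23, ... are parity entries.  */
void
expand_key_with_parity (const unsigned char key_bits[56], char out[64])
{
  for (size_t group = 0; group < 8; group++)
    {
      int ones = 0;
      for (size_t j = 0; j < 7; j++)
        {
          int bit = key_bits[group * 7 + j] & 1;
          out[group * 8 + j] = (char) bit;
          ones += bit;
        }
      /* Odd parity: the eight entries of the group must contain an odd
         number of ones in total.  */
      out[group * 8 + 7] = (ones % 2 == 0) ? 1 : 0;
    }
}
```

The resulting array is what one would hand to setkey; I have not tried
to call setkey itself here.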

> +The @code{setkey} function is declared in @file{stdlib.h}.
> +@end deftypefun
> +
> +@deftypefun void encrypt (char *@var{block}, int @var{edflag})
> +@standards{SVID, unistd.h}
> +@standards{GNU, crypt.h}
> +@safety{@prelim{}@mtunsafe{@mtasurace{:crypt}}@asunsafe{@asucorrupt{} @asulock{}}@acunsafe{@aculock{}}}
> +@c Same issues as setkey.
> +
> +The @code{encrypt} function encrypts @var{block} if @var{edflag} is 0,
> +otherwise it decrypts @var{block}, using a key previously set by
> +@code{setkey}.  The result overwrites the previous value of
> +@var{block}.
> +
> +Like @code{setkey}, @var{block} is an array of 64 @code{char}s,
> +each of which stores one bit of the block to be encrypted.  Unlike
> +@code{setkey}, there are no parity bits.  All 64 of the bits are
> +treated as data.

Again, improved.

> +The @code{encrypt} function is declared in @file{unistd.h}.
> +@end deftypefun
> +
> +
> +@deftypefun void setkey_r (const char *@var{key}, {struct crypt_data *} @var{data})
> +@deftypefunx void encrypt_r (char *@var{block}, int @var{edflag}, {struct crypt_data *} @var{data})
> +@standards{GNU, crypt.h}
> +@c setkey_r: @safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{} @asulock{}}@acunsafe{@aculock{}}}
> +@safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{} @asulock{}}@acunsafe{@aculock{}}}
> +
> +These are reentrant versions of @code{setkey} and @code{encrypt}.  The
> +only difference is the extra parameter, which stores the expanded
> +version of @var{key}.  Before calling @code{setkey_r} the first time,
> +@code{data->initialized} must be cleared to zero.
> +
> +Both of these functions are declared in @file{crypt.h}.

I think we should continue to say, "These functions are GNU
extensions.", within the text for the time being.  Once @standards are
rendered in the descriptions, we can cut that out, but it's nice to keep
that information readily available here for now.

> +@end deftypefun
> +
> +@deftypefun int ecb_crypt (char *@var{key}, char *@var{blocks}, unsigned int @var{len}, unsigned int @var{mode})
> +@standards{SUNRPC, rpc/des_crypt.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +
> +The function @code{ecb_crypt} encrypts or decrypts one or more blocks
> +using DES.  Each block is encrypted independently, which means that if
> +any two input blocks are the same, then their encryptions will also be
> +the same.  This is an additional weakness in the encryption, on top of
> +the weakness of DES itself.
> +
> +The @var{blocks} and the @var{key} are stored packed in 8-bit bytes, so
> +that the first bit of the key is the most-significant bit of
> +@code{key[0]} and the 63rd bit of the key is stored as the
> +least-significant bit of @code{key[7]}.  The least-significant bit of
> +each byte must be chosen to give each byte odd parity, as with
> +@code{setkey}.

Good.
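
To illustrate the packed format, here is a small sketch of giving each
byte odd parity through its least-significant bit.  set_odd_parity is a
hypothetical name; this is my own illustration of the rule in the text,
not the code behind des_setparity.

```c
#include <stddef.h>

/* Hypothetical helper, not the glibc implementation: give each byte of
   a packed 8-byte DES key odd parity by adjusting its
   least-significant bit.  */
void
set_odd_parity (unsigned char key[8])
{
  for (size_t i = 0; i < 8; i++)
    {
      unsigned int ones = 0;
      /* Count the ones among the seven most-significant bits.  */
      for (int bit = 1; bit < 8; bit++)
        ones += (key[i] >> bit) & 1u;
      /* Clear the low bit, then set it only if that is needed to make
         the total number of ones in the byte odd.  */
      key[i] = (unsigned char) ((key[i] & 0xFEu)
                                | ((ones % 2 == 0) ? 1u : 0u));
    }
}
```

Only the seven high bits of each byte carry key data, which matches the
56-bit key size mentioned in the introduction.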

> +@var{len} is the number of bytes in @var{blocks}.  It should be a
> +multiple of 8 (so that there are a whole number of blocks to encrypt).
> +@var{len} is limited to a maximum of @code{DES_MAXDATA} bytes.
> +
> +The result of the encryption replaces the input in @var{blocks}.
> +
> +The @var{mode} parameter is the bitwise OR of two of the following:
> +
> +@vtable @code
> +@item DES_ENCRYPT
> +@standards{SUNRPC, rpc/des_crypt.h}
> +This constant, used in the @var{mode} parameter, specifies that
> +@var{blocks} is to be encrypted.
> +
> +@item DES_DECRYPT
> +@standards{SUNRPC, rpc/des_crypt.h}
> +This constant, used in the @var{mode} parameter, specifies that
> +@var{blocks} is to be decrypted.
> +
> +@item DES_HW
> +@standards{SUNRPC, rpc/des_crypt.h}
> +This constant, used in the @var{mode} parameter, asks to use a hardware
> +device.  If no hardware device is available, encryption happens anyway,
> +but in software.
> +
> +@item DES_SW
> +@standards{SUNRPC, rpc/des_crypt.h}
> +This constant, used in the @var{mode} parameter, specifies that no
> +hardware device is to be used.
> +@end vtable
> +
> +The result of the function will be one of these values:
> +
> +@vtable @code
> +@item DESERR_NONE
> +@standards{SUNRPC, rpc/des_crypt.h}
> +The encryption succeeded.
> +
> +@item DESERR_NOHWDEVICE
> +@standards{SUNRPC, rpc/des_crypt.h}
> +The encryption succeeded, but there was no hardware device available.
> +
> +@item DESERR_HWERROR
> +@standards{SUNRPC, rpc/des_crypt.h}
> +The encryption failed because of a hardware problem.
> +
> +@item DESERR_BADPARAM
> +@standards{SUNRPC, rpc/des_crypt.h}
> +The encryption failed because of a bad parameter, for instance @var{len}
> +is not a multiple of 8 or @var{len} is larger than @code{DES_MAXDATA}.
> +@end vtable
> +@end deftypefun

OK (rest of ecb_crypt looks like a copy).

> +@deftypefun int cbc_crypt (char *@var{key}, char *@var{blocks}, unsigned int @var{len}, unsigned int @var{mode}, char *@var{ivec})
> +@standards{SUNRPC, rpc/des_crypt.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +
> +The function @code{cbc_crypt} encrypts or decrypts one or more blocks
> +using DES in Cipher Block Chaining mode.
> +
> +For encryption in CBC mode, each block is exclusive-ored with @var{ivec}
> +before being encrypted, then @var{ivec} is replaced with the result of
> +the encryption, then the next block is processed.  Decryption is the
> +reverse of this process.
> +
> +This has the advantage that blocks which are the same before being
> +encrypted are very unlikely to be the same after being encrypted, making
> +it much harder to detect patterns in the data.
> +
> +Usually, @var{ivec} is set to 8 random bytes before encryption starts.
> +Then the 8 random bytes are transmitted along with the encrypted data
> +(without themselves being encrypted), and passed back in as @var{ivec}
> +for decryption.  Another possibility is to set @var{ivec} to 8 zeroes
> +initially, and have the first block encrypted consist of 8 random
> +bytes.
> +
> +Otherwise, all the parameters are similar to those for @code{ecb_crypt}.
> +@end deftypefun

OK (looks like copy).
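
The chaining step may be easier to see in code.  The sketch below uses
a toy XOR "cipher" as a stand-in for DES (it is NOT a real cipher and
offers no security), and the function names are made up for
illustration.  It only shows how ivec is folded into each block and
then replaced.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK 8

/* Toy stand-in for DES (NOT a real cipher): XOR each byte with the
   key.  It is its own inverse, which keeps the sketch short.  */
static void
toy_cipher (unsigned char block[BLOCK], const unsigned char key[BLOCK])
{
  for (size_t i = 0; i < BLOCK; i++)
    block[i] ^= key[i];
}

/* Encrypt LEN bytes (a multiple of BLOCK) in place, CBC style.  */
void
cbc_encrypt_sketch (const unsigned char key[BLOCK], unsigned char *blocks,
                    size_t len, unsigned char ivec[BLOCK])
{
  for (size_t off = 0; off + BLOCK <= len; off += BLOCK)
    {
      /* XOR the plaintext block with ivec, then encrypt it.  */
      for (size_t i = 0; i < BLOCK; i++)
        blocks[off + i] ^= ivec[i];
      toy_cipher (blocks + off, key);
      /* The ciphertext becomes ivec for the next block.  */
      memcpy (ivec, blocks + off, BLOCK);
    }
}

/* Reverse of the above.  */
void
cbc_decrypt_sketch (const unsigned char key[BLOCK], unsigned char *blocks,
                    size_t len, unsigned char ivec[BLOCK])
{
  for (size_t off = 0; off + BLOCK <= len; off += BLOCK)
    {
      unsigned char saved[BLOCK];
      /* Remember the ciphertext; it is the next block's ivec.  */
      memcpy (saved, blocks + off, BLOCK);
      toy_cipher (blocks + off, key);
      for (size_t i = 0; i < BLOCK; i++)
        blocks[off + i] ^= ivec[i];
      memcpy (ivec, saved, BLOCK);
    }
}
```

Two identical plaintext blocks come out different here because the
second one is XORed with the first block's ciphertext rather than with
the original ivec.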

> +@deftypefn Macro int DES_FAILED (int @var{err})
> +@standards{SUNRPC, rpc/des_crypt.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +This macro returns 1 if @var{err} is a `failure' result code from
> +@code{ecb_crypt} or @code{cbc_crypt}, and 0 otherwise.
> +@end deftypefn

OK (copied, but moved below both ecb- and cbc_crypt).

> +@deftypefun void des_setparity (char *@var{key})
> +@standards{SUNRPC, rpc/des_crypt.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +
> +The function @code{des_setparity} changes the 64-bit @var{key}, stored
> +packed in 8-bit bytes, to have odd parity by altering the low bits of
> +each byte.
> +@end deftypefun
> +
> +The @code{ecb_crypt}, @code{cbc_crypt}, and @code{des_setparity}
> +functions and their accompanying macros are all defined in the header
> +@file{rpc/des_crypt.h}.
> diff --git a/manual/math.texi b/manual/math.texi

OK. (Pseudo-Random Numbers section moved elsewhere.)

> diff --git a/manual/random.texi b/manual/random.texi
> new file mode 100644
> index 0000000000..f14b9e6e82
> --- /dev/null
> +++ b/manual/random.texi
> @@ -0,0 +1,753 @@
> +@node Random Number Generation, Date and Time, Arithmetic, Top
> +@chapter Random Number Generation
> +@c %MENU% Various ways to generate random values.
> +
> +Many algorithms require a source of @dfn{random numbers}, or to be
> +more precise, sequences of numbers chosen uniformly at random from

I think the only thing missing from this introduction is a definition of
uniform, which occurs several times.  Perhaps wrap "uniformly" in an
@dfn and then follow up this sentence with something like: "Uniform
random number distribution means any number stands an equal chance of
being selected."  (I'm not sure how technical we want to be, but I think
that conveys the idea.)  Uniform randomness is a little oxymoronic, so I
think it would be good to say something about it.
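
As a concrete illustration of "equal chance", here is a sketch that
draws a value below n so that every result is equally likely.  It
assumes random() returns values spread evenly over 0 through 2^31 - 1,
and uniform_below is a hypothetical name.  Taking random() % n alone
would slightly favor small results whenever n does not divide 2^31, so
the sketch discards the leftover values and retries.

```c
#include <stdlib.h>

/* Hypothetical helper: return a value in [0, n) with every value
   equally likely, assuming random() is uniform over 0 .. 2^31 - 1.
   n must be between 1 and 2^31 inclusive.  */
unsigned long
uniform_below (unsigned long n)
{
  const unsigned long range = 2147483648UL;   /* 2^31 possible outputs */
  /* Largest multiple of n that fits in the range; outputs at or above
     it would make some results more likely than others.  */
  const unsigned long limit = range - (range % n);
  unsigned long r;
  do
    r = (unsigned long) random ();
  while (r >= limit);
  return r % n;
}
```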

> +some subset of the integers or reals.  There are several different
> +ways to generate random numbers, depending on how stringent your
> +requirements are.
> +
> +A @dfn{pseudo-random generator} (PRNG) produces a sequence of numbers

@dfn{pseudo-random number generator}

> +that @emph{appears} to be random, and has statistical properties
> +matching what we expect of numbers chosen uniformly at random.
> +However, an ordinary PRNG doesn't guarantee that its output is
> +unguessable.  Also, the output of a PRNG depends on a relatively small
> +@dfn{seed} value, and so there are only a small number of sequences
> +that it can produce; astronomically small, relative to the total
> +number of random sequences.  If the seed is reused, the output will
> +be exactly the same, which is sometimes exactly what you want, and
> +sometimes disastrous.
> +
> +A @dfn{cryptographically strong pseudo-random generator} (CSPRNG) is a

@dfn{cryptographically strong pseudo-random number generator}

(I also see "cryptographically secure" in use; not sure if there's a
preference/standard.)

> +PRNG that @emph{does} guarantee its output is unguessable.  Formally,
> +there is no deterministic, polynomial-time algorithm@footnote{Assuming
> +@iftex
> +@c Don't typeset NP as if multiplying N by P. Use text italic for both.
> +@math{\hbox{\it P} \neq \hbox{\it NP}}@c
> +@end iftex
> +@ifnottex
> +@math{P ≠ NP}@c
> +@end ifnottex
> +.} that can tell the difference between the output of
> +a CSPRNG and a sequence of numbers that really were chosen uniformly
> +at random.  A CSPRNG still uses a seed and can only produce an
> +astronomically small number of random sequences.
> +
> +Finally, a @dfn{true random generator} (TRNG) produces random numbers

@dfn{true random number generator}

> +not from a seed, but from some unpredictable physical process.  In
> +principle, a TRNG's output is unguessable, and is not limited to an
> +astronomically small number of sequences.  However, TRNGs are very
> +slow, and because there is no seed, there is no way to make one repeat
> +its output.  Usually, it is best to use a TRNG only to choose the seed
> +for a PRNG or CSPRNG.
> +
> +At present, @theglibc{} offers a variety of ordinary PRNGs, and on
> +some operating systems it also offers access to an OS-provided TRNG.
> +We may add a CSPRNG in the future.
> +
> +@menu
> +* Pseudo-Random Numbers::       Sequences of numbers with apparently
> +                                 random distribution, but not difficult
> +                                 to predict.
> +* Unpredictable Bytes::         Asking the operating system for truly
> +				 unpredictable bytes.

Inconsistent use of spaces vs. tabs in the menu.

[snip Pseudo-random Numbers]

> +@node Unpredictable Bytes
> +@section Generating Unpredictable Bytes
> +
> +Some cryptographic applications need truly unpredictable bytes.
> +@Theglibc{} provides two functions for this purpose, both of which
> +access a true random generator implemented by the operating system.
> +Not all operating systems support these functions; programs that use
> +them must be prepared for them to fail.  They are slow, and can only
> +produce short sequences of unpredictable bytes.  Most programs should
> +use these functions only to seed a cryptographically strong
> +pseudo-random generator.
> +
> +Most programs should use @code{getentropy}.  The @code{getrandom}
> +function is intended for low-level applications which need additional
> +control over the blocking behavior.

OK.
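
A short usage sketch might help readers of this section.  The one below
asks getentropy for a few bytes and uses them as a seed, checking for
failure as the text advises.  Since glibc has no CSPRNG to seed yet, it
seeds the ordinary PRNG with srandom, which is only a stand-in for the
purpose of illustration; seed_from_os is a hypothetical name.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: seed the ordinary PRNG from the operating
   system.  Returns 0 on success, or -1 if getentropy failed (for
   example, because the OS does not support it).  */
int
seed_from_os (void)
{
  unsigned char buf[sizeof (unsigned int)];
  unsigned int seed;

  /* getentropy fills the buffer completely or fails.  */
  if (getentropy (buf, sizeof buf) != 0)
    return -1;

  memcpy (&seed, buf, sizeof seed);
  /* srandom seeds a non-cryptographic PRNG; a real program that needs
     unguessable output would seed a CSPRNG from another library.  */
  srandom (seed);
  return 0;
}
```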

[snip getentropy and getrandom]

> diff --git a/manual/string.texi b/manual/string.texi
> index b07cfb4550..f113777f2d 100644
> --- a/manual/string.texi
> +++ b/manual/string.texi
> @@ -2469,8 +2469,6 @@ Unlike Rot13, @code{memfrob} works on arbitrary binary data, not just
>  text.
>  @cindex Rot13
>  
> -For true encryption, @xref{Cryptographic Functions}.
> -
>  This function is declared in @file{string.h}.
>  @pindex string.h
>  
> @@ -2487,9 +2485,7 @@ Note that @code{memfrob} a second time on the same data structure
>  returns it to its original state.
>  
>  This is a good function for hiding information from someone who doesn't
> -want to see it or doesn't want to see it very much.  To really prevent
> -people from retrieving the information, use stronger encryption such as
> -that described in @xref{Cryptographic Functions}.
> +want to see it or doesn't want to see it very much.
>  
>  @strong{Portability Note:}  This function is unique to @theglibc{}.

OK.

[snip time.texi (chapter adjustment)]


There was a lot of cross-referencing, so I may have missed some things,
but I tried to catch changes in sections cut and pasted.  It would be
easier to review these changes in two steps: first cut/paste, then
modify content.  The commit could squash the two steps.  Probably best
to adjust headings/links in the first diff, but I don't have a strong
opinion there if it gets squashed eventually anyway.  It was just a
little difficult since there was so much unchanged content mixed with
modified content.  My problem is that I'm not sure I actually caught all
of it (though what I did see, I liked).

Thank you,
Rical

[0] https://crack.sh/