This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH 2/4] Revise manual sections on cryptography and random numbers.

From: Zack Weinberg <zackw at panix dot com>
To: Florian Weimer <fweimer at redhat dot com>
Cc: GNU C Library <libc-alpha at sourceware dot org>
Date: Tue, 8 May 2018 11:58:59 -0400
Subject: Re: [PATCH 2/4] Revise manual sections on cryptography and random numbers.
References: <20180506175153.22629-1-zackw@panix.com> <20180506175153.22629-3-zackw@panix.com> <e8f62598-f580-7e75-38dc-5c99e53caee2@redhat.com>

On Tue, May 8, 2018 at 10:23 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 05/06/2018 07:51 PM, Zack Weinberg wrote:
>> +Many countries have historically placed legal restrictions on the
>> +import, export, possession, or use of cryptographic tools.  These
>> +restrictions change frequently, so we cannot give you specific advice
>> +about what the situation might be in your country.  To the best of our
>> +knowledge, no one has objected to the export or import of @theglibc{}
>> +in many years,
>
> I think people routinely file notices for glibc with the appropriate
> authorities.  The whole thing doesn't really work the way the text implies.
> It's not that nobody complains, it's that all this work is done proactively.
> I don't think we need to get into detail here.
>
>> but if you are writing a program that performs
>> +cryptographic operations, we advise you to find out what your
>> +country's laws might be, first.  The one-way hash functions for
>> +password storage are less likely to be restricted than the DES-based
>> +encryption functions, because they cannot be used to send secret
>> +messages---the recipient would not be able to decrypt them.
>
> It's fairly easy to turn a one-way hash function into a stream cipher, so
> I'm not sure if the final sentence is correct.
>
> The HMAC construction can also be used with this and give you a cipher:
>
>   <https://people.csail.mit.edu/rivest/Chaffing.txt>
>
> I'm also not sure if DES qualifies as a cryptographic algorithm anymore, so
> the actual situation might well be the opposite of what you state. Sorry.

I will think about this some more, and possibly consult one of my
actual lawyer friends over at the EFF.  Do you have any thoughts on
more appropriate wording?  Or, do you think we should not discuss this
at all, which is how I had it in the previous revision?

>> +The function @code{crypt} converts a password string, @var{phrase},
>> +into a one-way hash suitable for storage in the user database.  The
>> +hash will consist entirely of printable ASCII characters.  It will not
>> +contain whitespace, nor any of the characters @kbd{:}, @kbd{;},
>> +@kbd{*}, @kbd{!}, or @kbd{\}.
>
> @code{':'} etc.?  These are bytes, not keyboard characters.

I always have problems deciding which of @samp/@code/@kbd/... to use
for this sort of thing.  The current Texinfo manual says specifically
to use @samp for single characters "unless @kbd or @key is more
appropriate", and I think you're right that it shouldn't be @kbd, so
I'll change them to @samp in the next revision.

>> +@tab @samp{$6$}
>
> Maybe use @code{"$6$"} etc. for the prefixes?  No strong preference on my
> part though.

In this case I think we need to use @samp, because the prefix for DES
is the empty string, which will come out as ‘’ with @samp but nothing
at all with @code.

> The code checks [that the salt uses only certain characters], right?  Or perhaps only for DES.

It looks like the code only checks for DES, but I think it's
appropriate to document that the salt should use only those characters
anyway.

>> +will begin with a @kbd{*} and will not be the same as @var{salt}.  In
> @code{'*'}?

@samp, again, I think.

>> -The @code{crypt_r} function does the same thing as @code{crypt}, but
>> -takes an extra parameter which includes space for its result (among
>> -other things), so it can be reentrant.  @code{data@w{->}initialized} must
>> be
>> -cleared to zero before the first time @code{crypt_r} is called.
>> -
>> -The @code{crypt_r} function is a GNU extension.
>> +The function @code{crypt_r} is a reentrant version of @code{crypt}.
>
> I don't think crypt_r is reentrant, it is merely thread-safe.  But no strong
> opinion.

The term 'reentrant' was used by the text this replaced (see above
quoted diff) but I think I will change it to 'thread-safe' here and
elsewhere.

>> +@samp{@w{memset(data, 0, sizeof data)}}
>
> Missing space before parentheses.

Will fix.

> Perhaps add a warning that struct crypt_data is very large and should not be
> allocated on the stack?

Yeah, good idea.

> Okay.  Maybe also say that these functions are not thread-safe?

Only encrypt and setkey are not thread-safe.  I will make sure this is
clear, though.

>> +The @code{setkey} function prepares to perform DES encryption or
>> +decryption using the key @var{key}.  @var{key} should point to an
>> +array of 64 @code{char}s, each of which must be set to either @samp{0}
>> +or @samp{1}; that is, each byte stores one bit of the key.  Every
>> +eighth byte (array indices 7, 15, 23, @dots{}) must be set to give it
>> +plus the preceding group of seven bytes odd parity.
>
> “must be set to the negation of the parity of the preceding seven
> bits/bytes”?

What I wrote is confusing, but I'm not sure this suggestion is correct
either.  I'll think about it some more.

>> +@samp{@w{memset(data, 0, sizeof data)}}
>
> Missing space before parenthesis.

Will fix.

>>     The function @code{ecb_crypt} encrypts or decrypts one or more blocks
>> -using DES.  Each block is encrypted independently.
>> +using DES.  Each block is encrypted independently, which means that if
>> +any two input blocks are the same, then their encryptions will also be
>> +the same.  This is an additional weakness in the encryption, on top of
>> +the weakness of DES itself.
>
> Maybe mention Electronic Codebook Mode (ECB)?

Good point, not everyone already knows what "ecb" means :)

>> +  if (getentropy (ubytes, sizeof ubytes))
>> +    {
>> +      perror ("getentropy");
>> +      return 1;
>> +    }
>
> Explicit check of getentropy return value against zero?  It's not exactly a
> boolean flag, after all.

getentropy is documented to return 0 on success or -1 on failure, so I
think 'if (getentropy (...))' is the way to go here.  If it returned
the number of bytes read, that would be a different story, but it's
getrandom that does that.

>> +  /* Retrieve the password for user argv[1]
>> +     from the user database.  */
>> +  pw = getpwnam (argv[1]);
>
> Check error?

Doh! Will fix.

>> -  ok = strcmp (result, pass) == 0;
>> +  ok = strcmp (result, pw->pw_passwd) == 0;
>
> Maybe add a comment that this could result in a timing oracle?

It doesn't, because we are comparing the hashes, not the raw
passwords.  Maybe I should say _that_?

>> +A @dfn{cryptographically strong pseudo-random number generator}
>> +(CSPRNG) is a PRNG that @emph{does} guarantee its output is
>> +unguessable.  Formally, there is no deterministic, polynomial-time
>> +algorithm@footnote{Assuming
>> +@iftex
>> +@c Don't typeset NP as if multiplying N by P. Use text italic for both.
>> +@math{\hbox{\it P} \neq \hbox{\it NP}}@c
>> +@end iftex
>> +@ifnottex
>> +@math{P ≠ NP}@c
>> +@end ifnottex
>> +.} that can tell the difference between the output of
>> +a CSPRNG and a sequence of numbers that really were chosen uniformly
>> +at random.  A CSPRNG still uses a seed and can only produce an
>> +astronomically small number of random sequences.
>
> I'm not sure if this detail is necessary.  If it is, you need to add
> “independent” somewhere.

I'm afraid I don't understand which bit of the above text is the
detail you think might not be necessary, or where the word
"independent" should appear.  This is probably because I have so
thoroughly internalized how the academic cryptography literature uses
the term "CSPRNG" that I no longer know how to explain it to someone
who's new to the topic. Could you clarify, please?

>> +Finally, a @dfn{true random number generator} (TRNG) produces random
>> +numbers not from a seed, but from some unpredictable physical process.
>> +In principle, a TRNG's output is unguessable, and is not limited to an
>> +astronomically small number of sequences.  However, TRNGs are very
>> +slow, and because there is no seed, there is no way to make one repeat
>> +its output.  Usually, it is best to use a TRNG only to choose the seed
>> +for a PRNG or CSPRNG.
>> +
>> +At present, @theglibc{} offers a variety of ordinary PRNGs, and on
>> +some operating systems it also offers access to an OS-provided TRNG.
>> +We may add a CSPRNG in the future.
>
> If this refers to arc4random, I don't think it's a CSPRNG under your
> definition.

Yes, that was the intent; why would it not qualify?

> And we shouldn't claim it has formal properties initially
> because that would make glibc subject to FIPS certification again, which is
> something we should avoid.

I meant to use "formal" in the mathematical sense -- "these are the
details of how the guarantee is made accessible to mathematical
proof".  Can you point me at some explanation of what phrasing might
or might not make a FIPS certification requirement apply?  I only know
the barest outline of that process.


>> +Some cryptographic applications need truly unpredictable bytes.
>> +@Theglibc{} provides two functions for this purpose, both of which
>> +access a true random generator implemented by the operating system.
>
> I don't think the Linux urandom generator qualifies as a TRNG, sorry.

Hmm.  I see why you say that, but for most programs' purposes it is
good enough.  Maybe that's what I should say, and also that if you
need something stronger you need to get yourself an actual piece of
hardware?

>
>> +Not all operating systems support these functions; programs that use
>> +them must be prepared for them to fail.  They are slow, and can only
>> +produce short sequences of unpredictable bytes.  Most programs should
>> +use these functions only to seed a cryptographically strong
>> +pseudo-random generator.
>
> The functions aren't *that* slow .  But I don't have a proposal for
> improving the wording.

I think maybe I just won't say they're slow.  The major reason for
wanting to use them only for seeds is the 256-byte limit on the output
of getentropy.

>>   This function is declared in @file{string.h}.
>>   @pindex string.h
>>   @@ -2487,9 +2485,7 @@ Note that @code{memfrob} a second time on the same
>> data structure
>>   returns it to its original state.
>>     This is a good function for hiding information from someone who
>> doesn't
>> -want to see it or doesn't want to see it very much.  To really prevent
>> -people from retrieving the information, use stronger encryption such as
>> -that described in @xref{Cryptographic Functions}.
>> +want to see it or doesn't want to see it very much.
>
> Hmm.  Okay I guess.

The point of this change is the manual shouldn't be endorsing DES even
by implication as a way of "really prevent[ing]" people from
retrieving information.  How about I mention libgcrypt again, instead
of saying nothing at all?

zw

Follow-Ups:
- Re: [PATCH 2/4] Revise manual sections on cryptography and random numbers.
  - From: Florian Weimer

References:
- [PATCH 0/4 v3] Revise crypto documentation, deprecate DES, add --disable-crypt
  - From: Zack Weinberg
- [PATCH 2/4] Revise manual sections on cryptography and random numbers.
  - From: Zack Weinberg
- Re: [PATCH 2/4] Revise manual sections on cryptography and random numbers.
  - From: Florian Weimer

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]