This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 2/4] Revise manual sections on cryptography and random numbers.


On 05/08/2018 05:58 PM, Zack Weinberg wrote:
I will think about this some more, and possibly consult one of my
actual lawyer friends over at the EFF.  Do you have any thoughts on
more appropriate wording?  Or, do you think we should not discuss this
at all, which is how I had it in the previous revision?

Not discussing it at all is fine with me. I can't provide any concrete guidance for the current U.S. situation.

+The function @code{crypt} converts a password string, @var{phrase},
+into a one-way hash suitable for storage in the user database.  The
+hash will consist entirely of printable ASCII characters.  It will not
+contain whitespace, nor any of the characters @kbd{:}, @kbd{;},
+@kbd{*}, @kbd{!}, or @kbd{\}.

@code{':'} etc.?  These are bytes, not keyboard characters.

I always have problems deciding which of @samp/@code/@kbd/... to use
for this sort of thing.  The current Texinfo manual says specifically
to use @samp for single characters "unless @kbd or @key is more
appropriate", and I think you're right that it shouldn't be @kbd, so
I'll change them to @samp in the next revision.

I meant ':' as a C expression, hence wrapping it in @code.
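
To make the character restriction in the quoted patch text concrete, here is a minimal sketch of a checker that spells the excluded bytes as C character constants. The name hash_chars_ok is hypothetical and not part of glibc; the function only restates the rule from the proposed manual text and says nothing about whether a string is a valid hash.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of glibc: return true if S consists only
   of printable, non-whitespace ASCII bytes and contains none of
   ':', ';', '*', '!', or '\\'.  */
static bool
hash_chars_ok (const char *s)
{
  for (; *s != '\0'; s++)
    {
      unsigned char c = (unsigned char) *s;

      /* 0x21..0x7e is printable ASCII with the space character excluded.  */
      if (c < 0x21 || c > 0x7e)
        return false;

      /* C is nonzero here, so strchr cannot match the terminator.  */
      if (strchr (":;*!\\", c) != NULL)
        return false;
    }
  return true;
}
```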

+@tab @samp{$6$}

Maybe use @code{"$6$"} etc. for the prefixes?  No strong preference on my
part though.

In this case I think we need to use @samp, because the prefix for DES
is the empty string, which will come out as ‘’ with @samp but nothing
at all with @code.

Again, with "…" as a wrapper, so that the empty string is actually visible. No strong opinion though.

The code checks [that the salt uses only certain characters], right?  Or perhaps only for DES.

It looks like the code only checks for DES, but I think it's
appropriate to document that the salt should use only those characters
anyway.

Right.
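
Since the library apparently enforces this only for DES, a caller that wants the documented behaviour for every method could check the salt itself. The sketch below assumes the traditional crypt salt alphabet of '.', '/', digits, and ASCII letters; that alphabet is an assumption here, because the exact set is not spelled out in this message. The name salt_chars_ok is hypothetical, and the function looks only at the salt characters, not at a method prefix.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of glibc: return true if every byte of
   SALT is in the assumed alphabet ./0-9A-Za-z.  */
static bool
salt_chars_ok (const char *salt)
{
  static const char alphabet[] =
    "./0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz";

  /* strspn returns the length of the leading run of allowed bytes; the
     salt is acceptable only if that run reaches the terminator.  */
  return salt[strspn (salt, alphabet)] == '\0';
}
```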

+  if (getentropy (ubytes, sizeof ubytes))
+    {
+      perror ("getentropy");
+      return 1;
+    }

Explicit check of getentropy return value against zero?  It's not exactly a
boolean flag, after all.

getentropy is documented to return 0 on success or -1 on failure, so I
think 'if (getentropy (...))' is the way to go here.

Hmm. I thought that the C/POSIX 0/-1 return value does not really qualify as a boolean because the meaning is reversed and the bit pattern is non-standard, but I'm not sure if that's part of our coding guidelines.
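
For comparison, this is what the explicit form would look like. It is only a sketch of the style under discussion, not the text of the patch; the wrapper name fill_seed is hypothetical, and <sys/random.h> is assumed to declare getentropy, as it does in glibc 2.25 and later.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/random.h>   /* getentropy; assumes glibc 2.25 or later */

/* Hypothetical wrapper: fill BUF with LEN unpredictable bytes.
   Returns 0 on success and -1 on failure, like getentropy itself.  */
static int
fill_seed (void *buf, size_t len)
{
  /* Compare against zero explicitly instead of treating the 0/-1
     result as a truth value.  */
  if (getentropy (buf, len) != 0)
    {
      perror ("getentropy");
      return -1;
    }
  return 0;
}
```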

-  ok = strcmp (result, pass) == 0;
+  ok = strcmp (result, pw->pw_passwd) == 0;

Maybe add a comment that this could result in a timing oracle?

It doesn't, because we are comparing the hashes, not the raw
passwords.  Maybe I should say _that_?

It still makes me nervous. 8-/

In some cases, the comparison could go in the other direction.
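
If the manual wanted to sidestep the question entirely, the example could compare the two hashes without stopping at the first mismatch. The following is a minimal sketch under that assumption, not something the patch contains. The name hashes_equal is hypothetical, the string lengths remain observable, and an optimizing compiler is not guaranteed to keep the loop free of early exits.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: return 1 if A and B are
   equal strings, 0 otherwise.  The loop always scans the whole common
   prefix, so its running time does not depend on where the first
   differing byte is.  */
static int
hashes_equal (const char *a, const char *b)
{
  size_t la = strlen (a);
  size_t lb = strlen (b);
  size_t n = la < lb ? la : lb;

  /* Start with a mismatch already recorded if the lengths differ.  */
  unsigned char diff = (la != lb);

  for (size_t i = 0; i < n; i++)
    diff |= (unsigned char) a[i] ^ (unsigned char) b[i];

  return diff == 0;
}
```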

+A @dfn{cryptographically strong pseudo-random number generator}
+(CSPRNG) is a PRNG that @emph{does} guarantee its output is
+unguessable.  Formally, there is no deterministic, polynomial-time
+algorithm@footnote{Assuming
+@iftex
+@c Don't typeset NP as if multiplying N by P. Use text italic for both.
+@math{\hbox{\it P} \neq \hbox{\it NP}}@c
+@end iftex
+@ifnottex
+@math{P ≠ NP}@c
+@end ifnottex
+.} that can tell the difference between the output of
+a CSPRNG and a sequence of numbers that really were chosen uniformly
+at random.  A CSPRNG still uses a seed and can only produce an
+astronomically small number of random sequences.

I'm not sure if this detail is necessary.  If it is, you need to add
“independent” somewhere.

I'm afraid I don't understand which bit of the above text is the
detail you think might not be necessary, or where the word
"independent" should appear.

The sequence of random variables against which the CSPRNG is compared must be uniformly distributed and independent, or something like that. This is something I last thought about probably fifteen years ago, so do not take my word for it.

However, I think this definition still has two problems: It is hard to prove anything in this area, so the definition isn't really helpful for documentation (because we don't know if our algorithm actually has that property). And I'm wary of asymptotics (“polynomial-time”) in this context anyway because our algorithms have fixed-size inputs and internal state, so everything is O(1) anyway.

+Finally, a @dfn{true random number generator} (TRNG) produces random
+numbers not from a seed, but from some unpredictable physical process.
+In principle, a TRNG's output is unguessable, and is not limited to an
+astronomically small number of sequences.  However, TRNGs are very
+slow, and because there is no seed, there is no way to make one repeat
+its output.  Usually, it is best to use a TRNG only to choose the seed
+for a PRNG or CSPRNG.
+
+At present, @theglibc{} offers a variety of ordinary PRNGs, and on
+some operating systems it also offers access to an OS-provided TRNG.
+We may add a CSPRNG in the future.

If this refers to arc4random, I don't think it's a CSPRNG under your
definition.

Yes, that was the intent; why would it not qualify?

We don't have proof for any NIST-approved algorithm that it is a CSPRNG, do we?

And existing proofs are often useless, see section 6 of <https://eprint.iacr.org/2006/229> (Neal Koblitz and Alfred Menezes, “Another Look at "Provable Security" II”).

+Some cryptographic applications need truly unpredictable bytes.
+@Theglibc{} provides two functions for this purpose, both of which
+access a true random generator implemented by the operating system.

I don't think the Linux urandom generator qualifies as a TRNG, sorry.

Hmm.  I see why you say that, but for most programs' purposes it is
good enough.  Maybe that's what I should say, and also that if you
need something stronger you need to get yourself an actual piece of
hardware?

I think we should steer clear of crypto politics (including proof politics) and just say it's unpredictable.

I think maybe I just won't say they're slow.  The major reason for
wanting to use them only for seeds is the 256-byte limit on the output
of getentropy.

That's a good point.
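
To illustrate what the 256-byte limit means for a caller that wants more than a seed, here is a sketch that fills a larger buffer in chunks. It is an illustration only; the names fill_random and ENTROPY_CHUNK_MAX are hypothetical, and <sys/random.h> is assumed to declare getentropy as in glibc 2.25 and later.

```c
#include <stddef.h>
#include <sys/random.h>   /* getentropy; assumes glibc 2.25 or later */

/* Largest request a single getentropy call accepts.  */
#define ENTROPY_CHUNK_MAX 256

/* Hypothetical helper: fill BUF with LEN unpredictable bytes, calling
   getentropy as many times as the per-call limit requires.
   Returns 0 on success and -1 if any call fails.  */
static int
fill_random (void *buf, size_t len)
{
  unsigned char *p = buf;

  while (len > 0)
    {
      size_t n = len < ENTROPY_CHUNK_MAX ? len : ENTROPY_CHUNK_MAX;

      if (getentropy (p, n) != 0)
        return -1;

      p += n;
      len -= n;
    }
  return 0;
}
```

Needing a loop like this is a hint that the caller probably wants a PRNG seeded from getentropy rather than getentropy itself.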

   This function is declared in @file{string.h}.
   @pindex string.h
   @@ -2487,9 +2485,7 @@ Note that @code{memfrob} a second time on the same data structure
   returns it to its original state.
     This is a good function for hiding information from someone who doesn't
-want to see it or doesn't want to see it very much.  To really prevent
-people from retrieving the information, use stronger encryption such as
-that described in @xref{Cryptographic Functions}.
+want to see it or doesn't want to see it very much.

Hmm.  Okay I guess.

The point of this change is the manual shouldn't be endorsing DES even
by implication as a way of "really prevent[ing]" people from
retrieving information.  How about I mention libgcrypt again, instead
of saying nothing at all?

I like this idea.

Thanks,
Florian

