Implement C11 annex K?

David A. Wheeler dwheeler@dwheeler.com
Thu Aug 21 22:45:00 GMT 2014


On Mon, 18 Aug 2014 01:03:30 -0700, Paul Eggert <eggert@cs.ucla.edu> wrote:
> OpenSSH's authors have strongly 
> advocated strlcpy and have much invested in it over the years.  Even if 
> they conceded that strlcpy is not that helpful now (admittedly 
> unlikely), inertia would probably induce them to keep it.

There are now *hundreds* of packages that use strlcpy/strlcat.
It is NOT just the original OpenSSH authors that use them.
Every package has to keep re-implementing them, typically in less-efficient
portable ways, because glibc fails to include them.

Here's a list put together by deraadt; there are probably others:
http://marc.info/?l=openbsd-tech&m=138733933417096&w=2

I'd like to convince you to think about *risk*.  There's no doubt that it
is *possible* to write secure software using only strcpy, strlen, etc.
But while that is *true*, it is also *irrelevant*.

The probability of making a mistake using routines like strcpy is very high.
This is amply demonstrated by the continuing march of buffer overflows.
We need alternatives, ones that automatically *prevent* buffer overflows
and can be easily applied.

So with that, let me comment on the comments...


> > addrmatch.c:321:
> > ... The one-line snprintf version is this horror:
> 
> That's because you wrote it in a horrible way.  This is better:
> 
>     if (snprintf(addrbuf, sizeof addrbuf, "%s", p) >= sizeof addrbuf)
>       return -1;

The spec says snprintf can return <0, which this code fails to handle.
It's still not better, anyway; it sure isn't obvious that this is a string copy.
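
For what it's worth, here is a minimal sketch of what handling both cases
explicitly might look like.  The helper name copy_checked is made up for
illustration and is not from OpenSSH.  Spelling out the n < 0 test means the
reader doesn't have to reason about what the signed-to-unsigned conversion
in the comparison does with a negative return value:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from OpenSSH): copy src into dst, which has room
 * for dstsize bytes, using snprintf.
 * Returns 0 on success, -1 on an snprintf error or on truncation. */
static int copy_checked(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL);

    int n = snprintf(dst, dstsize, "%s", src);
    if (n < 0)                  /* snprintf reported an error */
        return -1;
    if ((size_t)n >= dstsize)   /* result did not fit; dst holds a truncated copy */
        return -1;
    return 0;
}
```
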
But to continue...

> Though I wouldn't use snprintf here, as the following distinguishes the 
> check from the action more clearly:
> 
>     if (strlen(p) >= sizeof addrbuf)
>       return -1;
>     strcpy(addrbuf, p);

No. That wastes a lot of *people* time and is dangerous during maintenance.
In many projects, every strcpy() will (correctly) set off warning bells,
requiring multiple people to *prove* that it can't overflow the buffer.
And it's not just when the initial code is written.  It's also easy to turn
this kind of code into a vulnerability when the code is maintained
(which is one reason this kind of code wastes a lot of people-time),
so every time the function is changed, people will have to re-check it, and
over time this gets painful.

So every strcpy(), even if it's safe, increases *development* and *maintenance* time.
Developer and reviewer time is often MORE important than execution time
(even in C code, only some paths usually matter).
It's not TOO hard in this case to show that the strcpy cannot overflow
the buffer, sure, but that's not the end of it.

A lot of developers who care about secure code are trying to *eliminate*
the use of strcpy and functions like it, because it's just too easy to make a
disastrous mistake. Use strcpy() where it's provably safe and on a path where
performance is very important, sure.  But it should NOT be used elsewhere.
Telling people "don't make any mistakes" hasn't worked so far, and it won't in the future.

So NO, this is NOT better.  This is WORSE.  The call to strcpy() is
*faster*, but it's only justifiable if it's on a fast path where the speed matters.

> Regardless of the form one prefers, the use of strlcpy here does not fix 
> any bugs or make the code significantly clearer, compared to using 
> standard functions.

The use of strlcpy *does* help, quite substantially.

It eliminates the risk that a later modification will cause a buffer overflow;
at worst it becomes a truncation instead.  That *matters*.
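
To make the "truncation instead of overflow" point concrete, here is a rough
sketch.  Since glibc doesn't ship strlcpy, the block carries its own
stand-in, sketch_strlcpy; that name and the store_addr caller are
hypothetical, and the stand-in is only meant to show the documented
behavior (copy at most size-1 bytes, always terminate, return strlen(src)),
not to reproduce the OpenBSD implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in for strlcpy (hypothetical name).  Copies at most
 * dstsize-1 bytes, always NUL-terminates when dstsize > 0, and returns
 * strlen(src) so the caller can detect truncation. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
    assert(dst != NULL && src != NULL);

    size_t srclen = strlen(src);
    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Hypothetical caller following the one-line check-and-copy pattern.
 * If a later change passes the wrong source or a smaller buffer, the
 * worst outcome is a truncated string and a -1 return, not an overflow. */
static int store_addr(char *addrbuf, size_t bufsize, const char *p)
{
    if (sketch_strlcpy(addrbuf, p, bufsize) >= bufsize)
        return -1;
    return 0;
}
```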

So I reach radically different conclusions here.  I think that
pattern will continue... :-).



> > auth.c:486:
> >  strlcpy(buf, cp, sizeof(buf));
> >  ... So.. do you really believe that MAXPATHLEN really is the max length?
> 
> It's not a matter of belief.  It's obvious from the code that sets 'cp', 
> four lines earlier.

(I was actually complaining about MAXPATHLEN nonsense in the standard,
and not really about the subject-at-hand, so let's skip that.)

>  Worse, this use of strlcpy has undefined behavior 
> when cp points into buf.

I don't think so.  strlcpy is required to copy the source left-to-right,
since it stops at the \0, and it's copying to the beginning of "buf",
so I don't see any undefined behavior.


>  A fix would be:
> 
>     memmove(buf, cp, strlen(cp) + 1);

That's not a fix.  That's horrific.

That way of using memmove checks the length of the *source* and copies it,
but it fails to do *any* check that the destination buffer is long enough.
That kind of construct is *BEGGING* to be part of a buffer overflow.

It's true that in *this particular case* it won't overflow a buffer, as long as
nothing ever changes in the rest of the code surrounding it.
But anyone who keeps writing code this way, using functions
that do not *always* check the destination length, will eventually make a mistake.
Actually, in my experience, they'll make dozens of them.
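
If overlap really is the concern, a copy that tolerates overlap *and* still
checks the destination is not much longer.  The following is only a sketch;
move_string_checked is a hypothetical name, not an existing libc or OpenSSH
function:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: move the NUL-terminated string at src into dst,
 * which has room for dstsize bytes.  src may overlap dst.
 * Returns 0 on success, -1 (leaving dst untouched) if it would not fit. */
static int move_string_checked(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t need = strlen(src) + 1;   /* string plus terminating NUL */
    if (need > dstsize)              /* check the *destination*, every time */
        return -1;
    memmove(dst, src, need);         /* memmove is defined for overlapping regions */
    return 0;
}
```

The destination check costs one comparison, and it keeps working even if
the surrounding code changes later.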

We need simple mechanisms that people can use *every time* that
prevent buffer overflows.  Ones that don't require deep analysis and
deep re-analysis every time someone changes a line of code.

Most code in most C programs is *not* on a speed-critical path, but is instead
doing setup, special-case error handling, etc.  It should be *easy* to write
code that obviously cannot ever cause a buffer overflow, when it's
okay to spend a few extra cycles to do it.


> > authfd.c:107:
> >  strlcpy(sunaddr.sun_path, authsocket, sizeof(sunaddr.sun_path));
> > ... Truncation isn't checked... but it's not clear what else you
> >  could do when truncation occurs.
> 
> No, it's quite clear.  You could return -1, which is what strlcpy's 
> caller is supposed to do on failure.  Here, strlcpy might be 
> contributing to a bug, and it certainly isn't helping: a programmer who 
> had used strlen + strcpy would likely have done better here and returned 
> -1 on overlong inputs.

Or, a programmer using strcpy could just allow a buffer overflow, the usual result :-).

I agree that returning -1 would be a good idea. But note that even when
you don't check the return, which you could call a mistake, the usual result
of strlcpy is much safer - it is merely truncation (which often isn't exploitable).
A common result of strcpy misuse is a CVE id :-).

But let's go with your thought.  Changing this to return -1 is also easy:

if (strlcpy(sunaddr.sun_path, authsocket, sizeof(sunaddr.sun_path)) >= sizeof(sunaddr.sun_path))
  return -1;
 

> > auth-pam.c:742:
> >   mlen = strlen(msg);
> >   ...
> >   len = plen + mlen + 1;
> >   **prompts = xrealloc(**prompts, 1, len);
> >   strlcpy(**prompts + plen, msg, len - plen);
> >   plen += mlen;
> > ....
> >   Advantage strlcpy, due to a philosophical preference
> 
> I'm afraid that veers too closely to "I like strlcpy because I like 
> strlcpy".  strlcpy does not fix any bugs here compared to strcpy, and 
> this was the point I originally made.  And strcpy would be simpler here.

No. strlcpy/strlcat, and other routines like strcpy_s, have the advantage
that they significantly reduce the likelihood that the inevitable programming
mistakes will become a serious vulnerability.

It's a *risk* decision.  C should have routines that give you the fastest
possible code when you need it... but it should also have some
less-risky functions for common cases.

--- David A. Wheeler




More information about the Libc-alpha mailing list