This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug libc/16640] string/strtok.c: undefined behaviour inconsistent between x86 and other generic code


https://sourceware.org/bugzilla/show_bug.cgi?id=16640

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING
                 CC|                            |carlos at redhat dot com

--- Comment #1 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Kyle McMartin from comment #0)
> Created attachment 7444 [details]
> make string/strtok match x86
> 
> The strtok.S implementations for x86_64 and i386 vary from the generic
> string/strtok.c version. In the former case, if str == NULL, and the saved
> string is also NULL, the strtok call returns NULL.
> 
> In contrast, the string/strtok.c call proceeds to pass s = olds = NULL to
> strspn which consequently crashes.
> 
> While this behaviour is probably permissible, it results in odd portability
> issues where the behaviour can't be reproduced on x86_64. As well, the
> generic versions in the BSD libc I looked at (which appears to date back to
> 4.3BSD or earlier...) also checks for the (s = olds) == NULL condition and
> handles it, so we have a bit of precedent here.
> 
> Attached is a patch which brings the generic string/strtok.c in-line with
> i386, x86_64 and BSD. It seems better to do that, rather than suddenly make
> working code SIGSEGV on x86_64...

The x86_64 and i386 implementations are wrong, they should also fault.

This issue hs come up so many times we've enshrined it into our coding style
and guidlines:
https://sourceware.org/glibc/wiki/Style_and_Conventions#Invalid_pointers
~~~
9. Invalid pointers

The GNU C library considers it a QoI feature not to mask user bugs by detecting
invalid pointers and returning EINVAL (unless the API is standardized and says
it does that). If passing a bad pointer has undefined behavior, it is far more
useful in the long run if it crashes quickly rather than diagnosing an error
that is probably ignored by the flaky caller.

9.1. NULL pointers

If you're going to check for NULL pointer arguments where you have not entered
into a contract to accept and interpret them, do so with an assert, not a
conditional error return. This way the bugs in the caller will be immediately
detected and can be fixed, and it makes it easy to disable the overhead in
production builds. The assert can be valuable as code documentation. However, a
segfault from dereferencing the NULL pointer is just as effective for
debugging. If you return an error code to a caller which has already proven
itself buggy, the most likely result is that the caller will ignore the error,
and bad things will happen much later down the line when the original cause of
the error has become difficult or impossible to track down. Why is it
reasonable to assume the caller will ignore the error you return? Because the
caller already ignored the error return of malloc or fopen or some other
library-specific allocation function which returned NULL to indicate an error.

In summary:

    If you have no contract to accept NULL and you don't immediately
dereference the pointer then use an assert to raise an error when NULL is
passed as an invalid argument.
    If you have no contract to accept NULL and immediately dereference the
pointer then the segfault is sufficient to indicate the error.
    If you have a contract to accept NULL then do so. 
~~~

In this case strtok has no contract to accept a NULL for s2 and therefore
segfaulting is the best option.

I'd accept a patch to have the x86 and x86-64 assembly crash since a NULL s2
has no valid meaning.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]