This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: memcmp-sse4.S EqualHappy bug
- From: "Dr. David Alan Gilbert" <dgilbert at redhat dot com>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: Simo Sorce <ssorce at redhat dot com>, Andrea Arcangeli <aarcange at redhat dot com>, "Carlos O'Donell" <carlos at redhat dot com>, Szabolcs Nagy <nsz at port70 dot net>, libc-alpha at sourceware dot org, "H.J. Lu" <hongjiu dot lu at intel dot com>
- Date: Fri, 19 Jun 2015 19:06:36 +0100
- Subject: Re: memcmp-sse4.S EqualHappy bug
- Authentication-results: sourceware.org; auth=none
- References: <20150618161943 dot GN14955 at redhat dot com> <20150618172231 dot GS14955 at redhat dot com> <1434649785 dot 30819 dot 37 dot camel at localhost dot localdomain> <20150618181219 dot GL2248 at work-vm> <55841B6B dot 10104 at redhat dot com> <20150619140710 dot GA14955 at redhat dot com> <1434724946 dot 2716 dot 88 dot camel at willson dot usersys dot redhat dot com> <1434727945 dot 3061 dot 51 dot camel at localhost dot localdomain> <20150619155907 dot GF2147 at work-vm> <1434733019 dot 3061 dot 68 dot camel at localhost dot localdomain>
* Torvald Riegel (triegel@redhat.com) wrote:
> On Fri, 2015-06-19 at 16:59 +0100, Dr. David Alan Gilbert wrote:
> > * Torvald Riegel (triegel@redhat.com) wrote:
> > > On Fri, 2015-06-19 at 10:42 -0400, Simo Sorce wrote:
> > > > On Fri, 2015-06-19 at 16:07 +0200, Andrea Arcangeli wrote:
> > > > > On Fri, Jun 19, 2015 at 09:38:51AM -0400, Carlos O'Donell wrote:
> > > > > > I agree with aborting, but only as long as the hot path's performance
> > > > > > is not impacted and I haven't thought about how to do that.
> > > > > >
> >
> > > > Clearly people are better using atomic comparisons on canary values
> > > > instead, but it seem easy to avoid false positives (returning 0 when
> > > > memory is clearly different) and keep these things working, so why not
> > > > do it ?
> > >
> > > I see two separate issues here. First, where do we draw the line, and
> > > what do we guarantee. I strongly believe that programs must not have
> > > data races, and that they should use atomics or other synchronization
> > > properly (this doesn't mean locks, but relaxed memory order atomics, for
> > > example).
> > > Second, do we try to keep buggy programs working. If it has no cost to
> > > do so (e.g., like it might be true in this case), then doing that can
> > > help to trigger less errors. But that doesn't mean the buggy programs
> > > should get fixed eventually.
> >
> > I find it difficult to understand the boundaries of what the C library
> > is allowed to do in this type of optimisation.
> >
> > For example, consider the following:
> >
> > char a[128];
> > char b[128];
> >
> > put some static data in a[0-63]
> > put some static data in b[0-63]
> > a[64]=0;
> > b[64]=0;
> > start a thread doing stuff in a[65..]
> > start a thread doing stuff in b[65..]
> >
> > if (!strcmp(a,b)) {
> > /* Do something */
> > }
> >
> > a) Is that behaviour defined?
>
> Good question. I think it should be. This depends on both the data
> race definition and what strcmp/strncmp/memcmp are specified to do using
> the abstract machine.
>
> The data race definition uses memory locations as granularity, which is
> in 3.14 in C11. Separate characters in an array should be separate
> memory locations.
>
> C11 isn't very specific regarding strcmp, and just says that it
> "compares the string[s]" (7.24.4.2). C++14 is a bit more specific
> regarding basic_string::compare (21.4.7.9), saying that first the length
> of the strings are determined, and then a strncmp is run using the
> smaller of the two lengths.
>
> Assuming the C++ specs, only the array indices [0..64] should be
> accessed by the abstract machine. So no data race with the stuff going
> on in [65..).
>
> > b) Is it defined with strncmp(a,b,64) ?
>
> Yes.
>
> > c) Is it defined with strncmp(a,b,128)?
>
> Not sure. C11 says that not more than "n characters" are compared, and
> characters that follow the a null character aren't compared either.
> This indicates it wouldn't be different from strncmp(a,b,64) in the
> particular case.
> Regarding C++11, I'm not sure. The closest copies a substring
> (conceptually) and then compares, but the substring copying has to
> determine length of the string and then subtracting the max length.
> This would do a strlen first, which wouldn't access past index 64.
> Thus, should be fine too.
>
> > d) Is it defined with memcmp(a,b,64)?
>
> No data race, IMO.
OK, so you mean that's defined behavior?
> > e) Is it defined with memcmp(a,b,128)?
>
> Data race. Undefined behavior.
OK, so it's interesting your answer to (e) is different from (c);
aren't there compiler optimisations that replace some strcmp s with
memcmp s? (I'm not sure whether it would in this case?)
> > f) If I moved that boundary off a nice round % 8 mark would
> > it matter?
>
> No difference as far as the standard is concerned.
OK, and I've not looked at this x86 code in detail, but are we sure
that where the memcmp length finishes in the middle of a word
the bytes after the end can't cause a match/mismatch that would
later change and be classed as undefined?
> > I can imagine there may be lots of things that terminate
> > a string and let other stuff happen in the allocated space
> > after the end of the string in the belief that at the end
> > of that string all is unknown. Andrea's case is a bit different
> > in that it's the later data that's static, but that doesn't
> > sound like it should change the answer as to what's allowed.
>
> I think it does, because the question is whether there is a data race on
> the memory locations that the abstract machine would access.
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK