This is the mail archive of the
mailing list for the glibc project.
Re: Potential issue with strstr on x86 with sse4.2 in glibc-2.18
- From: "Joseph S. Myers" <joseph at codesourcery dot com>
- To: Rich Felker <dalias at aerifal dot cx>
- Cc: Allan McRae <allan at archlinux dot org>, Alexander Monakov <amonakov at ispras dot ru>, <libc-alpha at sourceware dot org>
- Date: Tue, 20 Aug 2013 19:47:43 +0000
- Subject: Re: Potential issue with strstr on x86 with sse4.2 in glibc-2.18
On Tue, 20 Aug 2013, Rich Felker wrote:
> > The old 4-byte alignment case should only apply to very old binaries, but
> > of course an old binary using strstr still ought to work on a new system.
> Or a new binary built with gcc 3.4. While compiling glibc with gcc 3.4
> is not supported, I don't think it's reasonable to tell people they
> can't compile application code with it...

My understanding was that 2.95 and later defaulted to
-mpreferred-stack-boundary=4, at least in the absence of -Os, so it would
be saying that particular ABI-breaking options can't be used to build new
binaries with the old compiler, rather than that it can't be used to build
new binaries at all.

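For reference, the argument to -mpreferred-stack-boundary is an exponent: GCC documents it as keeping the stack aligned to 2 raised to that number of bytes, so =4 means 16 bytes and =2 means the old 4-byte alignment. A minimal sketch of that arithmetic follows; the helper names boundary_bytes and is_aligned are hypothetical and come from neither GCC nor glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: alignment in bytes requested by
   -mpreferred-stack-boundary=N, i.e. 2^N. */
static size_t boundary_bytes(unsigned n)
{
    assert(n < sizeof(size_t) * 8);
    return (size_t)1 << n;
}

/* Hypothetical helper: nonzero if addr is a multiple of align.
   align must be a power of two. */
static int is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}
```

An address such as 0x1004 passes is_aligned for 4 bytes but not for boundary_bytes(4), which is the situation a routine assuming 16-byte stack alignment would face when entered from a caller that only keeps 4-byte alignment.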
> > (In the case of strstr, bug 12100 for asymptotic slowness of the SSE4.2
> > implementation is also still open - though the preference was to use a
> > hybrid approach for a fix rather than completely removing the SSE4.2
> > version, so I suppose the realignment issue will remain even with a fix
> > for that bug.)
> I question the reasoning for this. If the "short needle" version of
> two-way were removed and the "long needle" version (with bad character
> table) always used, I expect it would outperform the SSE code in
> almost all cases. SSE is not at all well-suited to strstr since you
> have to keep bitshifting and check all alignments. At best, the SSE
> code will do one vector comparison per byte of the haystack (up until
> the match, if any, is found). Two-way with the bad character table can
> do much better, on average inspecting only C*n/m positions for some
> constant C, where n is the haystack length up to the first match and m
> is the needle length.
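To illustrate the skipping argument above, here is a minimal Horspool-style sketch of a bad character table. It is not glibc's two-way implementation (which additionally guarantees linear worst-case time; this simplified version can degrade to O(n*m)), and the name skip_search and the positions_tried counter are hypothetical, added only to make the number of inspected alignments visible.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration: Horspool-style search with a bad
   character table.  Returns a pointer to the first match or NULL.
   If positions_tried is non-NULL, it receives the number of
   haystack alignments that were actually compared. */
static const char *skip_search(const char *hay, size_t n,
                               const char *needle, size_t m,
                               size_t *positions_tried)
{
    size_t shift[UCHAR_MAX + 1];
    size_t i, pos = 0, tried = 0;
    const char *found = NULL;

    assert(hay != NULL && needle != NULL);
    if (positions_tried != NULL)
        *positions_tried = 0;
    if (m == 0)
        return hay;
    if (n < m)
        return NULL;

    /* A byte that does not occur in the needle allows a full skip of m. */
    for (i = 0; i <= UCHAR_MAX; i++)
        shift[i] = m;
    /* Otherwise skip so the byte's last occurrence (excluding the final
       needle byte) lines up with the haystack byte just inspected. */
    for (i = 0; i + 1 < m; i++)
        shift[(unsigned char)needle[i]] = m - 1 - i;

    while (pos <= n - m) {
        tried++;
        if (memcmp(hay + pos, needle, m) == 0) {
            found = hay + pos;
            break;
        }
        /* Skip based on the haystack byte under the needle's last position. */
        pos += shift[(unsigned char)hay[pos + m - 1]];
    }
    if (positions_tried != NULL)
        *positions_tried = tried;
    return found;
}
```

With a 20-byte haystack of sixteen 'a' bytes followed by "wxyz" and the 4-byte needle "wxyz", this tries only 5 alignments before matching at offset 16, where a byte-by-byte scan would try 17.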

In any case, a fix will need benchmark data, which was lacking for the
original addition of this SSE4.2 strstr code as well as for the patch to
remove it that Liubov sent in June 2012 - anyone picking up this bug will
need to do such benchmarking.

Joseph S. Myers