This is the mail archive of the
mailing list for the glibc project.
RE: [PATCH 4/*] Generic string memchr and strnlen
- From: "Wilco Dijkstra" <wdijkstr at arm dot com>
- To: 'OndÅej BÃlka' <neleai at seznam dot cz>, "Chris Metcalf" <cmetcalf at ezchip dot com>
- Cc: "'GNU C Library'" <libc-alpha at sourceware dot org>
- Date: Tue, 28 Jul 2015 14:07:35 +0100
- Subject: RE: [PATCH 4/*] Generic string memchr and strnlen
- Authentication-results: sourceware.org; auth=none
- References: <003b01d0c622$dd96f530$98c4df90$ at com> <20150724153758 dot GA855 at domone> <003c01d0c62f$3440aa00$9cc1fe00$ at com> <20150724170425 dot GA10041 at domone> <003d01d0c63b$0080b730$01822590$ at com> <20150724191248 dot GA2889 at domone> <004201d0c873$bd3b9fe0$37b2dfa0$ at com> <20150727165642 dot GA22842 at domone> <55B67BA7 dot 7030606 at ezchip dot com> <20150727232224 dot GA21851 at domone>
> OndÅej BÃlka wrote:
> On Mon, Jul 27, 2015 at 02:42:47PM -0400, Chris Metcalf wrote:
> > On 07/27/2015 12:56 PM, OndÅej BÃlka wrote:
> > >Best example is strrchr where almost all architectures got it wrong
> > >first time until I tell them otherwise. Problem is that they calculate
> > >position of byte at each iteration instead just at end which is
> > >ineffective. Now tile and powerpc have this problem. My proposed
> > >alternative
> > >
> > >return memrchr (s, c, strlen (s) + 1);
> > >
> > >would beat these on suitable inputs.
> > Your comment and your code snippet don't really match, so can I ask
> > you to elucidate a little what your concern is?
> It is mix of two concerns, so its both First a snippet was there
> mainly to show that generic code could be faster. I mentioned that
> it applies for some inputs, as it would need larger inputs than usual
> to overcome initialization and call overhead of assembly one.
I believe the above sequence is the right one to use in the generic code,
especially when you have an optimized memrchr that scans backwards
(simpler inner loop as you quit at first match). The current generic
implementation repeatedly calls strchr so will be very slow if there
are multiple matches.
> For example if application uses strrchr to find last / in path name then
> its likely that it would be among last 8 characters and occurs
> frequently previously.
> Here you would need a conditional move instead of branch to avoid that
Agreed - the AArch64 implementation does that. It's hard to say what
is best if you don't have such an instruction - it'll depend on the
statistics of real inputs and the relative performance of strlen+partial
backwards memrchr scan vs a full forward strrchr scan.
> > So I'm not sure which point you are making but unless you know
> > something about the average distribution of the characters in the
> > strrchr() string to suggest they are likely to occur in the last third
> > of the string more than 50% of the time, I don't think I'm convinced.
So what we really need is better statistics on strrchr and friends. Looking
at GLIBC sources, it seems that >90% search for '/' in a path.
Do you have actual stats for strrchr OndÅej? That would really help solving
this issue. What I'd like to know is:
1. Average length of the strings
2. What percentage fails to match
3. Average number of matches per string if it matches at least once
4. Average relative position within string of last match
With that info we could create a simple patch for bench-strrchr.c to make
it use realistic inputs. Then based on that we can fix the generic code and
let maintainers further tune their optimized implementations if they do not
beat the generic code.