This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 4/*] Generic string memchr and strnlen


On Fri, Jul 24, 2015 at 05:38:43PM +0100, Wilco Dijkstra wrote:
> > Ondřej Bílka wrote:
> > On Fri, Jul 24, 2015 at 04:10:24PM +0100, Wilco Dijkstra wrote:
> > > Getting back to this, if you don't have an optimized strnlen then
> > > it is always better to try to use memchr (there are 14 optimized
> > > implementations of memchr but only 6 for strnlen).
> > >
> > > So I'd suggest changing strnlen in an independent patch as:
> > >
> > > size_t
> > > __strnlen (const char *str, size_t n)
> > > {
> > >   const char *ret = __memchr (str, 0, n);
> > >   return ret ? ret - str : n;
> > > }
> > >
> > > It also looks worthwhile to express strlen and rawmemchr as memchr
> > > so that you only need one highly optimized function rather than many.
> > > Deferring to more widely implemented optimized assembler functions
> > > should result in better performance than trying to optimize these
> > > functions in C.
> > >
> > No, that is a bad idea. Unless you inline strnlen or memchr, you add
> > extra call overhead.
> 
> The goal is to call the optimized assembler version of memchr when there 
> isn't one for strnlen - you could inline the above in headers if a target
> decides that there will only be an optimized memchr and not a strnlen
> (assuming that strnlen shows similar performance as memchr on a particular
> target).
>
Which, as I explained, is worse than the alternatives unless you are saving size.
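
For reference, the header-inlined wrapper being debated could look roughly like the sketch below. It assumes the public memchr rather than the internal __memchr, and the name sketch_strnlen is hypothetical, not a glibc symbol. Inlining removes the extra call into strnlen, but the call into memchr remains.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical header-level wrapper; sketch_strnlen is not a glibc name.
   Defers the scan to whatever memchr the target provides.  */
static inline size_t
sketch_strnlen (const char *str, size_t n)
{
  const char *ret = memchr (str, 0, n);
  /* NUL found: length is its offset.  Not found: length is capped at n.  */
  return ret != NULL ? (size_t) (ret - str) : n;
}
```

With this sketch, sketch_strnlen ("hello", 6) yields 5 and sketch_strnlen ("hello", 3) yields 3.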
 
> > That is unless you want to claim that you want to save size.
> > 
> > As for optimized implementations of strnlen vs memchr it isn't clear
> > that we will delete all of them as they are slower.
> 
> Delete what? We could certainly decide on a core set of functions which
> every target should implement in assembler. Candidates are memcpy, memset,
> memmove, memchr, strchr, strlen. Then for those we do not try to provide
> an optimized C implementation as it won't ever be used. But deleting them
> seems a bridge too far.
> 
This patch is about generic string functions. When they perform well,
they will replace the current architecture-specific ones. So soon
there won't be an architecture where that holds.

> > Also it's the wrong way to solve it. An architecture maintainer should add
> > an optimized strnlen implementation. That is quite easy when you have a
> > memchr implementation: add a few macros for the different start and end
> > handling.
> 
> The problem with the non-standard functions that are rarely used is that 
> there are very few optimized implementations. We can't force maintainers to
> implement all string functions in assembler, so the generic code should use
> the fastest possible alternative if there isn't an optimized implementation. 
> And that is pretty much always a more commonly used function which does 
> have an optimized implementation.
>
But that isn't what I said. I said that if there is an optimized
memchr implementation, then assembly for the other functions is trivial
for a maintainer to add. That gives you better performance.
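
As an illustration only, the idea of sharing one scanning loop and swapping the end handling through macros can be sketched in C. A real port would do this in assembly; every name here (BOUNDED_SCAN, HIT_POINTER, macro_memchr, macro_strnlen, and so on) is hypothetical and chosen for this example.

```c
#include <stddef.h>

/* One shared bounded scan.  ON_HIT and ON_MISS are the per-function
   end handling; everything else is common.  Must be the whole body of
   the enclosing function, because it returns from it.  */
#define BOUNDED_SCAN(s, c, n, ON_HIT, ON_MISS)                     \
  do                                                               \
    {                                                              \
      const unsigned char *p_ = (const unsigned char *) (s);       \
      for (size_t i_ = 0; i_ < (n); i_++)                          \
        if (p_[i_] == (unsigned char) (c))                         \
          return ON_HIT (p_, i_);                                  \
      return ON_MISS (n);                                          \
    }                                                              \
  while (0)

/* memchr-style end handling: pointer to the match, or NULL.  */
#define HIT_POINTER(p, i) ((void *) ((p) + (i)))
#define MISS_NULL(n)      NULL

/* strnlen-style end handling: offset of the NUL, or the limit.  */
#define HIT_INDEX(p, i)   (i)
#define MISS_LIMIT(n)     (n)

void *
macro_memchr (const void *s, int c, size_t n)
{
  BOUNDED_SCAN (s, c, n, HIT_POINTER, MISS_NULL);
}

size_t
macro_strnlen (const char *s, size_t n)
{
  BOUNDED_SCAN (s, 0, n, HIT_INDEX, MISS_LIMIT);
}
```

Only the two end-handling macros differ between the functions, which is the kind of small delta the argument relies on.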


> > The suggestion to express strlen as memchr would just cause a regression.
> > On my system there were 9535682 calls to strlen, while memchr was called
> > just 11633 times and rawmemchr 1742 times.
> 
> Why would it cause a regression? If you don't have an optimized strlen,
> what other implementation would be the fastest alternative?
> 
It would be my generic strlen implementation. If you don't have an
optimized strlen, then you certainly don't have an optimized memchr,
which is called about 819 times less often.

> > Also the purpose of strlen and rawmemchr is to be faster than memchr. Again,
> > these should be implemented by the architecture maintainer by removing the
> > size checks from the memchr implementation.
> 
> Yes it would be perfect if we had optimized assembler implementations for
> all functions. However that's unfortunately not the case given there is a
> high cost for creating assembler implementations.

No, there isn't. If you have an optimized memchr, then deriving these is
simple mechanical work. Just do the equivalent of dead code elimination on
memchr and you will get strlen.
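
A byte-at-a-time C sketch can illustrate the derivation. It is not glibc code and not an optimized implementation; the names sketch_memchr and sketch_strlen are hypothetical. Fixing the searched byte to 0 and treating the limit as unbounded makes the size check and the not-found path dead, and removing them leaves strlen.

```c
#include <stddef.h>

/* Bounded search: each iteration tests both the limit and the byte.  */
void *
sketch_memchr (const void *s, int c, size_t n)
{
  const unsigned char *p = s;
  for (size_t i = 0; i < n; i++)        /* size check */
    if (p[i] == (unsigned char) c)
      return (void *) (p + i);
  return NULL;                          /* not-found path */
}

/* The same loop with c fixed to 0 and no limit: the size check can
   never fail and the not-found path is unreachable, so both are gone.  */
size_t
sketch_strlen (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  size_t i = 0;
  while (p[i] != 0)
    i++;
  return i;
}
```

The same elimination on an assembly memchr would drop the length register updates and the bounds branches, which is where the speed advantage over calling memchr would come from.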

