This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.



Re: [PATCH v2] Add __pure2 to __locale_ctype_ptr(_l)


Corinna Vinschen wrote: 
> On Oct 31 13:34, Wilco Dijkstra wrote:
> > The newlib ctype functions are extremely inefficient due to repeatedly
> > calling __locale_ctype_ptr for every single use of a ctype macro, even
> > in a tight loop.  Improve this by adding the missing __pure2 attribute so
> > the pointer can be cached just like in GLIBC, resulting in > 2x speedup
> > in loops.
>
> Apart from Craig's comment, how did you test this?  I checked this on
> x86_64 Cygwin with input of 300 Megs, and I didn't see any difference
> between using the current code and an additional __pure2.  A tight loop
> around these 300 Megs of input always took 2.3 - 2.4 secs.
>
> My testcase is attached.  Input file as parameter with no punctuation
> chars (so the printf is deliberately never called, but GCC's optimization
> doesn't discard the result of ispunct).

I tried the inner loop from your benchmark on an array and I get a speedup
of 3.1x on an AArch64 server with -O3. There seems to be an optimization bug
in GCC at -O2: it won't hoist the pure call out of the loop, apparently because
of the side effect in c = *p++. If I change it to a for loop with the p++ done
separately, I get the expected code with -O2.
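
For reference, a minimal sketch of the two loop shapes being compared. This
is not the attached benchmark and not newlib's actual code: sketch_ctype_ptr,
sketch_ispunct, SKETCH_PUNCT and the tiny table are hypothetical stand-ins,
and the GCC const attribute is written out directly on the assumption that
this is what __pure2 expands to (as it does in BSD-derived cdefs.h headers).

```c
#include <stddef.h>

/* Hypothetical flag bit and table, standing in for newlib's ctype table.
   Slot 0 is reserved for EOF (-1), so characters are looked up at c + 1. */
#define SKETCH_PUNCT 0x10

static const char sketch_table[1 + 256] = {
    [1 + '!'] = SKETCH_PUNCT,
    [1 + ','] = SKETCH_PUNCT,
    [1 + '.'] = SKETCH_PUNCT,
};

/* Hypothetical stand-in for __locale_ctype_ptr(). The const attribute tells
   GCC the result depends only on the (empty) argument list, so a call inside
   a loop may be hoisted. noinline imitates a call into a separate library. */
__attribute__((const, noinline))
const char *sketch_ctype_ptr(void)
{
    return sketch_table;
}

/* Simplified ispunct-style macro: one table-pointer call per use. */
#define sketch_ispunct(c) ((sketch_ctype_ptr() + 1)[(int)(c)] & SKETCH_PUNCT)

/* Shape 1: load and increment in one expression (c = *p++). */
size_t count_punct_while(const unsigned char *p, size_t n)
{
    const unsigned char *end = p + n;
    size_t count = 0;
    while (p < end) {
        int c = *p++;
        if (sketch_ispunct(c))
            count++;
    }
    return count;
}

/* Shape 2: for loop with the increment done separately. */
size_t count_punct_for(const unsigned char *p, size_t n)
{
    const unsigned char *end = p + n;
    size_t count = 0;
    for (; p < end; p++) {
        int c = *p;
        if (sketch_ispunct(c))
            count++;
    }
    return count;
}
```

Both functions return the same count; the only thing to compare is the
generated code, e.g. by building with gcc -O2 -S and checking whether the
call to sketch_ctype_ptr ends up inside or outside each loop.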

Btw, is there a simple recipe for linking against newlib on a Linux system? I
wasn't able to get it to link with the libc.a, so I had to hack up a
__locale_ctype_ptr function by copying bits of code from various newlib files...

Wilco
