This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [RFC] improve tls access for tolower table and errno


On Sun, Jun 07, 2015 at 04:14:02PM +0200, Florian Weimer wrote:
> On 06/06/2015 02:40 PM, Ondřej Bílka wrote:
> > Hi, as I mentioned before, an inline strcasecmp would be problematic as
> > it needs to make a call to tolower, which is suboptimal.
> > 
> > On architectures with a tls register you don't need to do a call for tls 
> > access, but start small. 
> 
> Making the offset part of the ABI (like we do for the stack canary) has
> been discussed before:
> 
> <https://sourceware.org/ml/libc-alpha/2015-03/msg00132.html>
> 
> > A sample implementation would be the following. Where should I add the
> > initializer, and is there another way to get %fs than assembly?
> > 
> > #include <errno.h>
> > #include <stdio.h>
> > 
> > static long __errno_offset;
> > __attribute__((constructor))
> > void  get_offset ()
> > {
> >   char *offset;
> >   char *location = (char *) &errno;
> >   __asm__ ("mov %%fs:0, %0" : "=r" (offset));
> > 
> >   __errno_offset = location - offset;
> > }
> > 
> > static __always_inline 
> > int *
> > __ep()
> > {
> >   char *__offset;
> >   __asm__ ("mov %%fs:0, %0" : "=r" (__offset));
> > 
> >   return (int *)(__offset + __errno_offset);
> > }
> > 
> > #define errno2 (*__ep())
> 
> Constructor functions in header files are a nightmare.  C++ has
> something similar for <iostream>, and the overhead from that is
> substantial.  Many projects ban inclusion of <iostream> as a result.
>
From previous mail:

For x64 and other architectures, as these variables are statically
allocated, we could just save the offset in a variable. The linker would
fill it with the correct value at initialization, and applications would
just need to access a normal variable.
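A minimal sketch of what that can look like from C, assuming GCC or Clang on an ELF target. The names demo_tls_errno, demo_errno_location and demo_errno are hypothetical stand-ins, not glibc symbols. With the initial-exec TLS model, x86-64 compilers typically load the variable's offset from a GOT slot that the linker or dynamic loader fills in, then do a %fs-relative access, so no constructor and no call is needed:

```c
/* Hypothetical stand-in for a libc thread-local such as errno.
   The initial-exec model asks the toolchain to resolve the offset
   from the thread pointer at link/load time.  */
__thread int demo_tls_errno __attribute__ ((tls_model ("initial-exec")));

/* Inline accessor: no function call, no header-level constructor.
   The address is computed by the compiler from the thread pointer
   plus the linker-provided offset.  */
static inline int *
demo_errno_location (void)
{
  return &demo_tls_errno;
}

/* Hypothetical macro in the style of errno2 from the quoted code.  */
#define demo_errno (*demo_errno_location ())
```

Whether this beats a call to __errno_location in code size depends on the target and TLS model, so it is worth checking the generated assembly before relying on it.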


 
> The problem remains that errno is mostly used on error paths and
> 
>   call __errno_location
>   movl (%rax), %eax
> 
> is much shorter than
> 
>   movq	__errno_offset(%rip), %rax
>   movq %fs:0, %rdx
>   movl (%rax, %rdx), %eax
> 
> (7 versus 19 bytes).  On paths which are supposed to be executed rarely,
> this is not desirable.  

I used errno as an example and didn't consider size.

> With the thread locale, performance concerns are different, but the
> constructor issue is still valid.
>
It isn't, as I explained in the previous mail.
 
> Furthermore, future C++ versions may make caching the addresses of
> thread-local variables invalid, so we should wait until the fate of
> resumable functions and coroutines is decided, and what shape they take.
> 
That is irrelevant unless it forces us to change our tls
implementation. This produces the value for the current thread, which is
correct: if a coroutine switches threads, a tolower call would also
return a value according to the new locale.
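
To illustrate the per-thread point with errno, here is a small sketch assuming POSIX threads. The helper names record_errno_address and errno_is_per_thread are hypothetical. It only shows that the address is evaluated for whichever thread runs the access, and says nothing about how coroutines will end up being specified:

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: runs in a second thread and records where
   that thread's errno lives.  */
static void *
record_errno_address (void *out)
{
  errno = EINTR;                          /* touches only this thread's errno */
  *(uintptr_t *) out = (uintptr_t) &errno;
  return NULL;
}

/* Returns 1 if the worker thread saw a different errno location than
   the calling thread, 0 if not, and -1 if the thread could not be run.  */
static int
errno_is_per_thread (void)
{
  uintptr_t worker_errno = 0;
  pthread_t worker;

  if (pthread_create (&worker, NULL, record_errno_address, &worker_errno) != 0)
    return -1;
  if (pthread_join (worker, NULL) != 0)
    return -1;

  /* &errno is re-evaluated here, in the calling thread.  */
  return worker_errno != 0 && worker_errno != (uintptr_t) &errno;
}
```

Code that caches the result of &errno across a thread switch would keep using the old thread's location, which is the case the inline access avoids by recomputing the address at each use.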

