This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [RFC] ifunc suck, use ufunc.


On Mon, May 25, 2015 at 12:40 PM, Ondřej Bílka <neleai@seznam.cz> wrote:
> On Sun, May 24, 2015 at 11:15:10PM -0400, Rich Felker wrote:
>> On Mon, May 25, 2015 at 04:36:52AM +0200, Ondřej Bílka wrote:
>> > On Mon, May 25, 2015 at 03:43:23AM +0200, Szabolcs Nagy wrote:
>> > > * Ondřej Bílka <neleai@seznam.cz> [2015-05-24 23:38:58 +0200]:
>> > > > A main benefit would be interlibrary constant folding. Why waste cycles
>> > > > reinitializing a constant? Just save it in the ufunc structure. The resolver
>> > > > could then precompute tables to improve speed.
>> > > >
>> > > > As for interposing these, you would need to interpose the resolver.
>> > > >
>> > > > GCC support is not needed, but we could gain something from an alternate
>> > > > calling convention, as passing the resolver struct is common and it could be
>> > > > preserved across loops with tail calls.
>> > > >
>> > > > A future direction could be to replace the PLT and linker with ufunc; it would
>> > > > require adding a function string pointer to the structure and first calling a
>> > > > generic resolver to select the specific resolver.
>> > > >
>> > > > Comments?
>> > > >
>> > >
>> > > this makes memset non-async-signal-safe. (qoi issue)
>> > >
>> > Did I explicitly say that it's an architecture-specific optimization, or did
>> > I forget?
>>
>> AS-safety is broken regardless of arch. Only the barrier stuff is
>> arch-specific.
>>
>> > > it is not thread-safe either and would need an acquire
>> > > load barrier on every invocation of memset to fix that
>> > > or the use of thread local storage. (conformance issue)
>> > >
>> > > (in the example only resolve->fn is modified and idempotently,
>> > > this would work in practice but as soon as ->data is accessed
>> > > too the memory ordering guarantees are required.. which can
>> > > be made efficient on some archs but only in asm)
>> > >
>> > > in the example memset is called through the wrong type
>> > > of function pointer: the resolver and resolvee are
>> > > incompatible so this is invalid c, only works in asm.
>> > >
>> > That's why I intended it as architecture-specific. On x64 it will work
>> > along with the memset prototype. Adding atomics/locking in the resolver would be
>> > unnecessary overhead.
>> >
>> > Could make this generic by defining macros that expand to atomic read on
>> > archs that don't act as pram.
>>
>> Do you realize the relative cost of an atomic read (barrier) versus a
>> small memset? This is like driving an extra mile to a cheaper gas
>> station to save $0.01 per gallon...
>>
> Since when does an atomic read on x64 generate a barrier?
> And you don't need to establish any happens-before relationship here,
> so a relaxed atomic read suffices. With an atomic write, at worst you
> call the function and are OK, or run the resolver multiple times. A resolver
> would already take a lock, so it's handled.

Since when is glibc x64-specific :).  There are other targets which
use ifuncs, and that is what he is talking about.

Thanks,
Andrew

>
> This is only a problem for archs that don't do atomic writes of pointers. Chips
> like that are typically so slow that, if you care about performance, the
> best you could do is sell them and buy something decent. Nobody would try
> to improve performance there by making several variants.
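
For reference, the relaxed-load scheme argued over above can be sketched in
portable C11 roughly as follows. This is only an illustration, not code from
the original proposal: fill, fill_generic, pick_variant, fill_impl and
resolve_count are hypothetical names, and the sketch assumes the resolver is
idempotent and that callers read nothing but the function pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

typedef void *(*fill_fn)(void *, int, size_t);

/* Hypothetical variant; a real resolver would choose among several. */
static void *fill_generic(void *dst, int c, size_t n)
{
    return memset(dst, c, n);
}

/* Counts resolver runs. Plain int: only meaningful in a single-threaded demo. */
static int resolve_count;

/* Hypothetical resolver. Assumed idempotent: every call returns the same pointer. */
static fill_fn pick_variant(void)
{
    resolve_count++;
    return fill_generic;
}

/* Cached implementation; static storage starts out as a null pointer. */
static _Atomic(fill_fn) fill_impl;

void *fill(void *dst, int c, size_t n)
{
    /* Relaxed load: atomic with respect to the pointer itself, no ordering. */
    fill_fn f = atomic_load_explicit(&fill_impl, memory_order_relaxed);
    if (f == NULL) {
        /* Several threads may get here; each stores the same value. */
        f = pick_variant();
        atomic_store_explicit(&fill_impl, f, memory_order_relaxed);
    }
    assert(f != NULL);
    return f(dst, c, n);
}
```

On x86-64 the relaxed load is an ordinary load. As noted earlier in the
thread, once callers also read data that the resolver wrote, the store would
need release ordering and the load acquire ordering, and that is where the
cost differs between targets.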

