This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: RFC: Automatically test IFUNC implementations


On 09/24/2012 08:55 AM, H.J. Lu wrote:
> extern __typeof (memcpy) __memcpy_sse2;
> extern __typeof (memcpy) __memcpy_ssse3;
> extern __typeof (memcpy) __memcpy_ssse3_back;
> 
> static const struct libc_func_test memcpy_list[] =
> {
>   LIBC_FUNC_INIT (__memcpy_ssse3_back),
>   LIBC_FUNC_INIT (__memcpy_ssse3),
>   LIBC_FUNC_INIT (__memcpy_sse2),
>   { NULL, NULL },
> };
...
> const struct libc_func_test *
> __libc_func (const char *name)


Given that we're talking about building all shipped libcs with __libc_func included, I think we should consider a different interface: one that does not require so many runtime relocations in the library image and, preferably, one in which we don't have to duplicate information between libc-func.c and all of the individual multiarch files.

Let's first simply state that we'd like to avoid these relative relocs.  Let's hide the detail of how we avoid them behind a target-specific wall, by not returning pre-allocated const arrays:

  int __libc_func(const char *name, void **array, size_t n)

where N says how large ARRAY is.  The return value is the number of implementations placed in ARRAY, or -1 for ENOSPC (i.e. increase the size of ARRAY).  This adds extra overhead to __libc_func itself, but since that's for testing only, we shouldn't care.
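
To make the calling convention concrete, here is a minimal sketch of that interface in plain C.  The name libc_func_sketch and the two stand-in memcpy variants are hypothetical, and returning 0 for an unknown NAME is an assumption the proposal does not spell out.  The point is only that the pointers are stored by code at call time rather than sitting in a pre-initialized const array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for real multiarch variants such as
   __memcpy_sse2; they simply forward to the library routines.  */
static void *
memcpy_variant_a (void *dst, const void *src, size_t len)
{
  return memcpy (dst, src, len);
}

static void *
memcpy_variant_b (void *dst, const void *src, size_t len)
{
  return memmove (dst, src, len);
}

/* Sketch of the proposed interface.  Returns the number of
   implementations stored in ARRAY, or -1 if N is too small.
   Function pointers are converted to void * as in the proposed
   signature; that conversion is a POSIX-style extension, not ISO C.  */
int
libc_func_sketch (const char *name, void **array, size_t n)
{
  if (strcmp (name, "memcpy") == 0)
    {
      if (n < 2)
        return -1;      /* Caller should retry with a larger ARRAY.  */
      array[0] = (void *) memcpy_variant_a;
      array[1] = (void *) memcpy_variant_b;
      return 2;
    }
  return 0;             /* Assumption: unknown names yield no entries.  */
}
```

A test harness would call this once with a small ARRAY and grow it on -1, then run the same test body over every returned pointer.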

As a next step we can encode these function pointers as PC relative.  We probably can't do this directly in C, but it's easy enough in assembler.  And given that these files are already target-specific that's trivial too.  Then __libc_func merely has to decode the pointers while copying to ARRAY.
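
The decode side of that idea can be sketched in C, even though the static table itself would be written in assembler.  Everything named here (encode_entry, decode_table, the self-relative layout) is a hypothetical illustration: each slot holds the target address minus the slot's own address, so the stored value does not depend on where the image is loaded.  The run-time encode_entry only exists so the sketch can be exercised; a real table would be emitted by the assembler instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Store TARGET into SLOT as an offset relative to SLOT itself.
   Stand-in for what an assembler directive along the lines of
   ".long symbol - ." would emit at build time (assumed syntax).  */
void
encode_entry (intptr_t *slot, const void *target)
{
  *slot = (intptr_t) ((uintptr_t) target - (uintptr_t) slot);
}

/* Decode COUNT self-relative entries from TABLE into ARRAY as
   absolute pointers.  Returns the number of entries decoded.  */
size_t
decode_table (const intptr_t *table, size_t count, void **array)
{
  for (size_t i = 0; i < count; i++)
    array[i] = (void *) ((uintptr_t) &table[i] + (uintptr_t) table[i]);
  return count;
}
```

The arithmetic goes through uintptr_t so that wraparound is well defined; the final conversion back to a pointer is implementation-defined but behaves as expected on the flat-address targets glibc supports.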

A final step would be to come up with some data structure that's usable for both the multi-arch implementation and for testing.  These will likely be target-specific.

One possibility is to have all of the multi-arch entry points pass the common data structure to a common resolver routine.  Done right, this would add no (or only a small) extra overhead during initial function resolution.
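
A rough sketch of the common-resolver idea follows.  The struct impl_entry layout, the feature-mask encoding, and both function names are assumptions made for illustration, not an existing glibc structure.  The table is ordered from most to least preferred, and the same table serves both the resolver and the test enumeration.  For brevity it stores absolute pointers; a real version would presumably combine it with the PC-relative encoding.

```c
#include <stddef.h>

/* One candidate implementation and the feature bits it requires.
   A table is terminated by an entry whose IMPL is NULL.  */
struct impl_entry
{
  unsigned long required;
  void *impl;
};

/* Common resolver: return the first entry whose required bits are
   all present in FEATURES, or NULL if none matches.  */
void *
resolve_common (const struct impl_entry *table, unsigned long features)
{
  for (; table->impl != NULL; table++)
    if ((features & table->required) == table->required)
      return table->impl;
  return NULL;
}

/* Test-side enumeration over the same table.  Returns the number of
   entries copied to ARRAY, or -1 if N is too small.  */
int
list_impls (const struct impl_entry *table, void **array, size_t n)
{
  size_t count = 0;
  while (table[count].impl != NULL)
    count++;
  if (count > n)
    return -1;
  for (size_t i = 0; i < count; i++)
    array[i] = table[i].impl;
  return (int) count;
}
```

The only work added to resolution is the short loop over the table, which runs once per symbol.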

Another possibility is some sort of macro solution that constructs the code for the runtime resolver and the data structure at the same time.  Given that we're generally testing bits at constant offsets from __cpu_features (or just bits in dl_hwcap), this should be trivial.
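
One way the macro approach might look is an X-macro list that is expanded twice, once into the resolver's tests and once into the enumeration used for testing.  All names here (FEAT_*, copy_*, COPY_IMPLS) are hypothetical, and the feature word is passed as a plain argument instead of being read from __cpu_features or dl_hwcap.

```c
#include <stddef.h>
#include <string.h>

#define FEAT_SSSE3     0x1u   /* Hypothetical feature bits.  */
#define FEAT_FAST_BACK 0x2u

typedef void *(*copy_fn) (void *, const void *, size_t);

/* Hypothetical stand-ins for the per-CPU variants.  */
static void *copy_back (void *d, const void *s, size_t n) { return memmove (d, s, n); }
static void *copy_ssse3 (void *d, const void *s, size_t n) { return memcpy (d, s, n); }
static void *copy_base (void *d, const void *s, size_t n) { return memcpy (d, s, n); }

/* Single source of truth: required bits and implementation, listed
   from most to least preferred.  */
#define COPY_IMPLS(X)                        \
  X (FEAT_SSSE3 | FEAT_FAST_BACK, copy_back) \
  X (FEAT_SSSE3, copy_ssse3)                 \
  X (0u, copy_base)

/* Expansion 1: the resolver body.  */
#define AS_TEST(bits, fn) \
  if ((features & (bits)) == (bits)) return fn;
copy_fn
copy_resolve (unsigned int features)
{
  COPY_IMPLS (AS_TEST)
  return copy_base;     /* Not reached; the last entry always matches.  */
}
#undef AS_TEST

/* Expansion 2: count the entries, then enumerate them for testing.  */
#define AS_COUNT(bits, fn) + 1
enum { COPY_IMPL_COUNT = 0 COPY_IMPLS (AS_COUNT) };
#undef AS_COUNT

#define AS_STORE(bits, fn) array[i++] = fn;
int
copy_list (copy_fn *array, size_t n)
{
  size_t i = 0;
  if (n < (size_t) COPY_IMPL_COUNT)
    return -1;
  COPY_IMPLS (AS_STORE)
  return (int) i;
}
#undef AS_STORE
```

Because both expansions come from one list, adding a variant in one place updates the resolver and the test enumeration together, which is the duplication the proposal wants to avoid.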


r~

