This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: V4 [PATCH] x86: Add <sys/platform/x86.h>


* H. J. Lu:

> On Thu, Sep 20, 2018 at 5:43 AM, Florian Weimer <fweimer@redhat.com> wrote:
>> * H. J. Lu:
>>
>>> On Thu, Sep 20, 2018 at 4:41 AM, Florian Weimer <fweimer@redhat.com> wrote:
>>>> * H. J. Lu:
>>>>
>>>>> Add <sys/platform/x86.h> to provide an API to access x86 specific platform
>>>>> features.  It makes raw and derived CPUID information available to
>>>>> applications so that programmers can avoid complicated steps to retrieve
>>>>> the same information with CPUID instructions.
>>>>
>>>> Sorry, I still think that this patch semantically depends on delayed
>>>> IFUNC processing.
>
> I don't believe this is an issue at all since new functions are in
> ld.so.  Can you show this is a real issue with a testcase? I can
> add it to my test.

It's the relocation for x86_get_arch_feature in the IFUNC resolver I'm
worried about.  Without delayed processing, the IFUNC resolver may be
called at any time, and the relocation for x86_get_arch_feature may not
have been processed yet.  So it's not about calling into an unrelocated
or uninitialized ld.so (which I agree would not happen), but the call
itself going wrong because it's not been properly relocated yet.

Usually, dependency order as established by DT_NEEDED entries will
ensure that IFUNC resolvers are only called in already-relocated
objects, but LD_PRELOAD, symbol interposition, or missing DT_NEEDED
entries can invalidate that assumption and make the dependency ordering
insufficient.

I hope this makes it clear what I think will break when this feature is
used from IFUNC resolvers.

>>>> Furthermore, I strongly prefer a simplified interface, like:
>>>>
>>>>   enum {
>>>>     X86_FEATURE_MMX,
>>>>     X86_FEATURE_SSE,
>>>>     X86_FEATURE_SSE2,
>>>>     …
>>>>     X86_FEATURE_AVX512F,
>>>>     …
>>>>   };
>>>>
>>>>   _Bool x86_feature_usable (int index) __attribute__ ((pure));
>>>>
>>>> Applications could then do something like this:
>>>>
>>>>   if (x86_feature_usable (X86_FEATURE_AVX512F))
>>>>     return &__matmul_avx512f;
>>>>
>>>> I don't think there is value in exposing the CPUID data directly.  If
>>>> programmers need it, they can use <cpuid.h> from GCC instead, without
>>>> needing to worry how glibc populates its data structures.
>
> There are several advantages of my approach:
>
> 1. When a new feature is added to previously reserved bit, the
> program using the new feature will work with the older libc.so
> whose CPUID data has the reserved bit.

It depends on the CPUID feature bit whether that's a good idea.  For the
AVX512F, that would be wrong because it the CPUID data does not tell you
whether the feature is actually usable.

There is no ABI impact for future feature flags, so we can backport them
easily enough.

> 2. We only need to add 2 small functions:
>
> (gdb) disass x86_get_arch_feature
> Dump of assembler code for function x86_get_arch_feature:
>    0x0000000000017100 <+0>: xor    %eax,%eax
>    0x0000000000017102 <+2>: cmp    $0x2,%edi
>    0x0000000000017105 <+5>: ja     0x17117 <x86_get_arch_feature+23>
>    0x0000000000017107 <+7>: mov    %edi,%edi
>    0x0000000000017109 <+9>: lea    0x105f0(%rip),%rax        # 0x27700
> <_rtld_local_ro>
>    0x0000000000017110 <+16>: mov    0xb4(%rax,%rdi,4),%eax
>    0x0000000000017117 <+23>: retq
> End of assembler dump.
> (gdb) disass x86_get_cpuid_registers
> Dump of assembler code for function x86_get_cpuid_registers:
>    0x00000000000170d0 <+0>: lea    0xc479(%rip),%rax        # 0x23550
> <zero_cpuid_registers.8564>
>    0x00000000000170d7 <+7>: cmp    $0x2,%edi
>    0x00000000000170da <+10>: ja     0x170f0 <x86_get_cpuid_registers+32>
>    0x00000000000170dc <+12>: mov    %edi,%edi
>    0x00000000000170de <+14>: lea    0x1061b(%rip),%rax        #
> 0x27700 <_rtld_local_ro>
>    0x00000000000170e5 <+21>: add    $0x7,%rdi
>    0x00000000000170e9 <+25>: shl    $0x4,%rdi
>    0x00000000000170ed <+29>: add    %rdi,%rax
>    0x00000000000170f0 <+32>: retq
> End of assembler dump.
> (gdb)

The implementation of x86_feature_usable would be quite similar, but it
would be necessary to readjust the way the usable bits are stored in
ld.so.  That is true.

> 3. Write CPUID program isn't easy.   You can try to write a program
> to detect a CPUID feature to see what it takes.  I can't remember it
> exactly without looking at glibc sources which I wrote.

I completely agree, which is why I think we should *not* expose CPUID
data, but higher-level information about which features are available.
In some cases, it may be possible to use CPUID bits directly (for CMOV
on i386, for example), but in many other cases, it will be necessary to
take environmental factors into account, e.g. how the kernel has set up
XCR0 when it created the process.

As far as I know, it's the latter aspect that is not known in advance:
Future CPUID bits might well have dependencies on XCR0 (or other CPU
state) as well.  An application which tries to do detection with older
glibcs without the corresponding *_Usable support would necessarily go
wrong.  My API proposal avoids that because it just cannot happen with
this interface, at the cost that you need to upgrade glibc first before
you can use any previously-unknown bits.

Thanks,
Florian


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]