V3 [PATCH] <sys/platform/x86.h>: Remove the C preprocessor magic
H.J. Lu
hjl.tools@gmail.com
Thu Jan 21 00:19:14 GMT 2021
On Wed, Jan 20, 2021 at 7:58 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * H. J. Lu via Libc-alpha:
>
> > On Wed, Jan 20, 2021 at 10:53:07AM +0100, Florian Weimer wrote:
> >> * H. J. Lu via Libc-alpha:
> >>
> >> > +enum
> >> > +{
> >> > + COMMON_CPUID_INDEX_1 = 0,
> >> > + COMMON_CPUID_INDEX_7,
> >> > + COMMON_CPUID_INDEX_80000001,
> >> > + COMMON_CPUID_INDEX_D_ECX_1,
> >> > + COMMON_CPUID_INDEX_80000007,
> >> > + COMMON_CPUID_INDEX_80000008,
> >> > + COMMON_CPUID_INDEX_7_ECX_1,
> >> > + COMMON_CPUID_INDEX_19,
> >> > + /* Keep the following line at the end. */
> >> > + COMMON_CPUID_INDEX_MAX
> >> > +};
> >>
> >> COMMON_CPUID_INDEX_MAX should not be in the public header because it
> >> subject to changes.
> >
> > Fixed.
> >
> >>
> >> > +struct cpuid_registers
> >> > +{
> >> > + unsigned int eax;
> >> > + unsigned int ebx;
> >> > + unsigned int ecx;
> >> > + unsigned int edx;
> >> > +};
> >>
> >> Should the non-public interfaces use __?
> >
> > Moved to sysdeps/x86/include/cpu-features.h.
> >
> >>
> >> > +enum cpu_features_kind
> >> > +{
> >> > + arch_kind_unknown = 0,
> >> > + arch_kind_intel,
> >> > + arch_kind_amd,
> >> > + arch_kind_zhaoxin,
> >> > + arch_kind_other
> >> > +};
> >> > +
> >> > +struct cpu_features_basic
> >> > +{
> >> > + enum cpu_features_kind kind;
> >> > + int max_cpuid;
> >> > + unsigned int family;
> >> > + unsigned int model;
> >> > + unsigned int stepping;
> >> > +};
> >>
> >> Should we really expose all this? It's not documented. enum
> >> cpu_features_kind doesn't seem to be compatible with how people use
> >> virtualization.
> >
> > Moved to sysdeps/x86/include/cpu-features.h.
> >
> >>
> >> > diff --git a/sysdeps/x86/dl-get-cpu-features.c b/sysdeps/x86/dl-get-cpu-features.c
> >> > index 349472d99f..4636d9f4a7 100644
> >> > --- a/sysdeps/x86/dl-get-cpu-features.c
> >> > +++ b/sysdeps/x86/dl-get-cpu-features.c
> >> > @@ -46,9 +46,9 @@ __ifunc (__x86_cpu_features, __x86_cpu_features, NULL, void,
> >> > #undef __x86_get_cpu_features
> >> >
> >> > const struct cpu_features *
> >> > -__x86_get_cpu_features (unsigned int max)
> >> > +__x86_get_cpu_features (unsigned int index)
> >> > {
> >> > - if (max > COMMON_CPUID_INDEX_MAX)
> >> > + if (index > COMMON_CPUID_INDEX_MAX * 8 * sizeof (unsigned int) * 4)
> >> > return NULL;
> >> > return &GLRO(dl_x86_cpu_features);
> >> > }
> >>
> >> This should work in principle (but I haven't verified the boundary
> >> condition).
> >>
> >> Could you pass the COMMON_CPUID_INDEX_* value (without the register and
> >> bit selectors)? Then more calls can be subject to common subexpression
> >> elimination in the caller, leading to more compact code.
> >
> > My current public API uses a single enum index for each CPU feature
> > detection. Use 2 indices for each CPU feature makes it less user
> > friendly. Internally, we use the C preprocessor magic so that more
> > calls can be subject to common subexpression elimination in the caller,
> > leading to more compact code for glibc.
>
> I had this in mind for the implementation:
>
> struct cpuid_feature
> {
> unsigned int cpuid_array[4];
> unsigned int usable_array[4];
> };
>
> const struct cpuid_feature *
> __x86_get_cpu_features (unsigned int leaf)
> {
> const struct cpu_feature future = { };
> if (leaf < CPUID_INDEX_MAX)
> return &GLRO(dl_x86_cpu_features).features[leaf];
> else
> return &future;
> }
>
> And in the header:
>
> unsigned int leaf = index / (8 * sizeof (unsigned int) * 4);
>
> const struct cpuid_feature * __x86_get_cpu_features (unsigned int leaf)
> __attribute__ ((pure));
>
> static inline const struct cpuid_feature *
> __x86_get_cpu_leaf (unsigned int index)
> {
> return __x86_get_cpu_features (index / (8 * sizeof (unsigned int) * 4));
> }
> #define HAS_CPU_FEATURE(name) \
> (x86_cpu_has_feature (__x86_get_cpu_leaf (x86_cpu_##name), \
> x86_cpu_##name))
> #define HAS_CPU_FEATURE(name) \
> (x86_cpu_is_usable (__x86_get_cpu_leaf (x86_cpu_##name), \
> x86_cpu_##name))
>
>
> I expect that the index arguments to __x86_get_cpu_features are
> constant-folded by GCC, and many of the __x86_get_cpu_features calls can
> be CSE'ed as a result.
>
> In the public interface, the max_cpuid field appears redundant, so this
> is why I think this will work.
>
>
Here is the updated patch. The public function is changed to
/* Get a pointer to the CPU feature structure. */
extern const struct cpuid_feature *__x86_get_cpuid_feature_leaf (unsigned int)
__attribute__ ((pure));
OK for master?
Thanks.
--
H.J.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-sys-platform-x86.h-Remove-the-C-preprocessor-magic.patch
Type: text/x-patch
Size: 80586 bytes
Desc: not available
URL: <https://sourceware.org/pipermail/libc-alpha/attachments/20210120/c689b06c/attachment-0001.bin>
More information about the Libc-alpha
mailing list