[PATCH v6 1/2] x86: Add comment explaining no Slow_SSE4_2 check in ifunc-sse4_2

H.J. Lu hjl.tools@gmail.com
Fri Jul 1 22:38:23 GMT 2022


On Thu, Jun 30, 2022 at 5:01 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Thu, Jun 30, 2022 at 5:01 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > On Thu, Jun 30, 2022 at 4:20 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > On Thu, Jun 30, 2022 at 1:13 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> > > >
> > > > Just for clarity's sake and so that if a future implementation is
> > > > added we remember to add the check.
> > > > ---
> > > >  sysdeps/x86_64/multiarch/ifunc-sse4_2.h | 4 ++++
> > > >  1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h b/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> > > > index ee36525bcf..752798278c 100644
> > > > --- a/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> > > > +++ b/sysdeps/x86_64/multiarch/ifunc-sse4_2.h
> > > > @@ -27,6 +27,10 @@ IFUNC_SELECTOR (void)
> > > >  {
> > > >    const struct cpu_features* cpu_features = __get_cpu_features ();
> > > >
> > > > +  /* This function uses slow sse4.2 instructions (pcmpstri) but since
> > > > +     there is no other optimized implementation keep using it.  If an
> > > > +     optimized fallback is added add a X86_ISA_CPU_FEATURE_ARCH_P
> > > > +     (cpu_features, Slow_SSE4_2) check.  */
> > >
> > > This function always uses sse4.2 instructions (pcmpstri) since there
> > > is no other optimized implementation.  If an ...
> >
> > Is it all sse4.2 instructions that count for Slow_SSE4.2 or other the
> > microcode string ones?
> other->only*

No.  Only the string instructions are slow.
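
To make the selection logic under discussion concrete, here is a minimal
sketch of how the selector behaves today and how it might behave if an
optimized fallback were ever added.  All names below (toy_cpu_features,
toy_impl, toy_select_*) are hypothetical stand-ins, not glibc's
cpu_features struct or its IFUNC macros, and the sketch assumes the
hypothetical fallback needs no ISA extension beyond the baseline.

```c
#include <stdbool.h>

/* Hypothetical model of the two CPU properties discussed in this thread.
   This is NOT glibc's struct cpu_features.  */
struct toy_cpu_features
{
  bool sse4_2_usable;  /* Models CPU_FEATURE_USABLE_P (..., SSE4_2).  */
  bool slow_sse4_2;    /* Models the Slow_SSE4_2 preference bit.  */
};

enum toy_impl
{
  TOY_IMPL_BASELINE,   /* Non-optimized default.  */
  TOY_IMPL_SSE42,      /* pcmpstri-based version.  */
  TOY_IMPL_FALLBACK    /* Hypothetical optimized version without pcmpstri.  */
};

/* Behaviour described by the patch comment: Slow_SSE4_2 is not consulted
   because there is nothing better to fall back to.  */
static enum toy_impl
toy_select_current (const struct toy_cpu_features *f)
{
  if (f->sse4_2_usable)
    return TOY_IMPL_SSE42;
  return TOY_IMPL_BASELINE;
}

/* Possible future behaviour, assuming an optimized fallback exists:
   skip the pcmpstri version on CPUs where the string instructions
   are slow.  */
static enum toy_impl
toy_select_with_fallback (const struct toy_cpu_features *f)
{
  if (f->sse4_2_usable && !f->slow_sse4_2)
    return TOY_IMPL_SSE42;
  return TOY_IMPL_FALLBACK;
}
```

On a CPU with usable but slow SSE4.2 string instructions, the first
function still picks the sse42 version, while the second would pick the
hypothetical fallback.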

> > >
> > > >    if (CPU_FEATURE_USABLE_P (cpu_features, SSE4_2))
> > > >      return OPTIMIZE (sse42);
> > > >
> > > > --
> > > > 2.34.1
> > > >
> > >
> > >
> > > --
> > > H.J.



-- 
H.J.