[committed, PATCH] x86: Don't disable SSE4a when disabling SSE4

H.J. Lu hjl.tools@gmail.com
Mon Feb 17 17:05:00 GMT 2020


On Mon, Feb 17, 2020 at 9:01 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 17.02.2020 17:52, H.J. Lu wrote:
> > On Mon, Feb 17, 2020 at 7:49 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 17.02.2020 16:44, H.J. Lu wrote:
> >>> On Mon, Feb 17, 2020 at 7:32 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 17.02.2020 16:30, H.J. Lu wrote:
> >>>>> On Mon, Feb 17, 2020 at 7:27 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>
> >>>>>> On 16.02.2020 17:47, H.J. Lu wrote:
> >>>>>>> On Wed, Feb 12, 2020 at 9:18 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> On Wed, Feb 12, 2020 at 9:08 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Since ".arch sse4a" enables SSE3 and earlier, disabling SSE3 should also
> >>>>>>>>> disable SSE4a. And as per its name, ".arch .nosse4" should also do so.
> >>>>>>>>>
> >>>>>>>>> gas/
> >>>>>>>>> 2020-02-XX  Jan Beulich  <jbeulich@suse.com>
> >>>>>>>>>
> >>>>>>>>>         * config/tc-i386.c (cpu_noarch): Use CPU_ANY_SSE4_FLAGS in
> >>>>>>>>>         "nosse4" entry.
> >>>>>>>>>
> >>>>>>>>> opcodes/
> >>>>>>>>> 2020-02-XX  Jan Beulich  <jbeulich@suse.com>
> >>>>>>>>>
> >>>>>>>>>         * i386-gen.c (cpu_flag_init): Move CpuSSE4a from
> >>>>>>>>>         CPU_ANY_SSE_FLAGS entry to CPU_ANY_SSE3_FLAGS one. Add
> >>>>>>>>>         CPU_ANY_SSE4_FLAGS entry.
> >>>>>>>>>         * i386-init.h: Re-generate.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> OK.
> >>>>>>>>
> >>>>>>>> Thanks.
> >>>>>>>
> >>>>>>> commit 7deea9aad8 changed nosse4 to include CpuSSE4a.  But AMD SSE4a is
> >>>>>>> a superset of SSE3 and Intel SSE4 is a superset of SSSE3.  Disabling Intel
> >>>>>>> SSE4 shouldn't disable AMD SSE4a.  This patch restores nosse4.  It also
> >>>>>>> adds .sse4a and nosse4a.
> >>>>>>
> >>>>>> And where is it said that "nosse4" means only the Intel flavors? As
> >>>>>> said in the commit message of said change, to me the clear implication
> >>>>>> is that anything called SSE4* will get disabled.
> >>>>>>
> >>>>>
> >>>>> SSE4 refers to SSE4 from Intel, which includes SSE4.1 and SSE4.2.
> >>>>> SSE4a from AMD is unrelated to Intel SSE4.
> >>>>
> >>>> Repeating my question then: Where is this being said? (Best imo
> >>>> would be to delete ".arch .nosse4" support then, eliminating
> >>>> the ambiguity.)
> >>>
> >>> We have both .sse4 and nosse4 which are aliases for SSE4.2.  Please
> >>> feel free to add documentation.
> >>
> >> If it's not documented, then it's not clear at all what the intention
> >> is. I'm certainly not going to add documentation saying something that
> >> I don't believe should be said. I.e. if I were to add documentation
> >> here, it'd say .nosse4 covers all three SSE4* variants (and it would
> >> then be a bug of the implementation that this isn't the case).
> >
> > From gcc/config/i386/i386.opt:
> >
> > msse4.1
> > Target Report Mask(ISA_SSE4_1) Var(ix86_isa_flags) Save
> > Support MMX, SSE, SSE2, SSE3, SSSE3 and SSE4.1 built-in functions and
> > code generation.
> >
> > msse4.2
> > Target Report Mask(ISA_SSE4_2) Var(ix86_isa_flags) Save
> > Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2 built-in
> > functions and code generation.
> >
> > msse4
> > Target RejectNegative Report Mask(ISA_SSE4_2) Var(ix86_isa_flags) Save
> > Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2 built-in
> > functions and code generation.
> >
> > mno-sse4
> > Target RejectNegative Report InverseMask(ISA_SSE4_1) Var(ix86_isa_flags) Save
> > Do not support SSE4.1 and SSE4.2 built-in functions and code generation.
> >
> > SSE4 is for Intel SSE4 only.
>
> Hmm, okay, that's gcc, not gas, but at least something.
>

Can you add a sentence about SSE4 to the gas manual?
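
For reference, the intended semantics can be sketched roughly as below.
This is only an illustration under stated assumptions: the FEAT_* bits
and the enable_*/disable_* helpers are hypothetical names made up for
this sketch, not the real Cpu* flags or CPU_ANY_*_FLAGS sets in
opcodes/i386-gen.c.  It models .sse4/nosse4 as covering Intel SSE4.1
and SSE4.2 only, with SSE4a handled by its own .sse4a/nosse4a pair.

```c
#include <stdint.h>

/* Hypothetical feature bits for illustration; NOT the real gas Cpu* flags. */
enum {
    FEAT_SSE3   = 1u << 0,
    FEAT_SSSE3  = 1u << 1,
    FEAT_SSE4_1 = 1u << 2,
    FEAT_SSE4_2 = 1u << 3,
    FEAT_SSE4A  = 1u << 4
};

/* Model of ".arch .sse4": an alias for SSE4.2, which builds on SSE4.1,
   SSSE3 and SSE3 (earlier SSE levels are omitted from this sketch).  */
static inline uint32_t enable_sse4(uint32_t f)
{
    return f | FEAT_SSE3 | FEAT_SSSE3 | FEAT_SSE4_1 | FEAT_SSE4_2;
}

/* Model of ".arch .sse4a": AMD SSE4a builds on SSE3, not on SSSE3.  */
static inline uint32_t enable_sse4a(uint32_t f)
{
    return f | FEAT_SSE3 | FEAT_SSE4A;
}

/* Model of ".arch .nosse4": clears Intel SSE4.1 and SSE4.2 only and
   leaves AMD SSE4a untouched.  */
static inline uint32_t disable_sse4(uint32_t f)
{
    return f & ~(uint32_t)(FEAT_SSE4_1 | FEAT_SSE4_2);
}

/* Model of ".arch .nosse4a": clears AMD SSE4a only.  */
static inline uint32_t disable_sse4a(uint32_t f)
{
    return f & ~(uint32_t)FEAT_SSE4A;
}
```

With this model, enabling both extensions and then applying
disable_sse4() leaves FEAT_SSE4A set while FEAT_SSE4_1 and FEAT_SSE4_2
are cleared.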

-- 
H.J.


