[committed, PATCH] x86: Don't disable SSE4a when disabling SSE4
H.J. Lu
hjl.tools@gmail.com
Mon Feb 17 17:05:00 GMT 2020
On Mon, Feb 17, 2020 at 9:01 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 17.02.2020 17:52, H.J. Lu wrote:
> > On Mon, Feb 17, 2020 at 7:49 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 17.02.2020 16:44, H.J. Lu wrote:
> >>> On Mon, Feb 17, 2020 at 7:32 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 17.02.2020 16:30, H.J. Lu wrote:
> >>>>> On Mon, Feb 17, 2020 at 7:27 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>
> >>>>>> On 16.02.2020 17:47, H.J. Lu wrote:
> >>>>>>> On Wed, Feb 12, 2020 at 9:18 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> On Wed, Feb 12, 2020 at 9:08 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Since ".arch sse4a" enables SSE3 and earlier, disabling SSE3 should also
> >>>>>>>>> disable SSE4a. And as per its name, ".arch .nosse4" should also do so.
> >>>>>>>>>
> >>>>>>>>> gas/
> >>>>>>>>> 2020-02-XX Jan Beulich <jbeulich@suse.com>
> >>>>>>>>>
> >>>>>>>>> * config/tc-i386.c (cpu_noarch): Use CPU_ANY_SSE4_FLAGS in
> >>>>>>>>> "nosse4" entry.
> >>>>>>>>>
> >>>>>>>>> opcodes/
> >>>>>>>>> 2020-02-XX Jan Beulich <jbeulich@suse.com>
> >>>>>>>>>
> >>>>>>>>> * i386-gen.c (cpu_flag_init): Move CpuSSE4a from
> >>>>>>>>> CPU_ANY_SSE_FLAGS entry to CPU_ANY_SSE3_FLAGS one. Add
> >>>>>>>>> CPU_ANY_SSE4_FLAGS entry.
> >>>>>>>>> * i386-init.h: Re-generate.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> OK.
> >>>>>>>>
> >>>>>>>> Thanks.
> >>>>>>>
> >>>>>>> commit 7deea9aad8 changed nosse4 to include CpuSSE4a. But AMD SSE4a is
> >>>>>>> a superset of SSE3 and Intel SSE4 is a superset of SSSE3. Disabling Intel
> >>>>>>> SSE4 shouldn't disable AMD SSE4a. This patch restores nosse4. It also
> >>>>>>> adds .sse4a and nosse4a.
> >>>>>>
> >>>>>> And where is it said that "nosse4" means only the Intel flavors? As
> >>>>>> said in the commit message of said change, to me the clear implication
> >>>>>> is that anything called SSE4* will get disabled.
> >>>>>>
> >>>>>
> >>>>> SSE4 refers to SSE4 from Intel, which includes SSE4.1 and SSE4.2.
> >>>>> SSE4a from AMD is unrelated to Intel SSE4.
> >>>>
> >>>> Repeating my question then: Where is this being said? (Best imo
> >>>> would be to delete ".arch .nosse4" support then, eliminating
> >>>> the ambiguity.)
> >>>
> >>> We have both .sse4 and nosse4 which are aliases for SSE4.2. Please
> >>> feel free to add documentation.
> >>
> >> If it's not documented, then it's not clear at all what the intention
> >> is. I'm certainly not going to add documentation saying something that
> >> I don't believe should be said. I.e. if I were to add documentation
> >> here, it'd say .nosse4 covers all three SSE4* variants (and it would
> >> then be a bug of the implementation that this isn't the case).
> >
> > From gcc/config/i386/i386.opt:
> >
> > msse4.1
> > Target Report Mask(ISA_SSE4_1) Var(ix86_isa_flags) Save
> > Support MMX, SSE, SSE2, SSE3, SSSE3 and SSE4.1 built-in functions and
> > code generation.
> >
> > msse4.2
> > Target Report Mask(ISA_SSE4_2) Var(ix86_isa_flags) Save
> > Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2 built-in
> > functions and code generation.
> >
> > msse4
> > Target RejectNegative Report Mask(ISA_SSE4_2) Var(ix86_isa_flags) Save
> > Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1 and SSE4.2 built-in
> > functions and code generation.
> >
> > mno-sse4
> > Target RejectNegative Report InverseMask(ISA_SSE4_1) Var(ix86_isa_flags) Save
> > Do not support SSE4.1 and SSE4.2 built-in functions and code generation.
> >
> > SSE4 is for Intel SSE4 only.
>
> Hmm, okay, that's gcc, not gas, but at least something.
>
Can you add a sentence about SSE4 to the gas manual?
--
H.J.
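
The two readings of "nosse4" argued over in this thread can be sketched as a small bitmask model. This is only an illustration under stated assumptions: the F_* bits and the helper functions are hypothetical names made up for the sketch, not the CpuXXX flags or CPU_ANY_*_FLAGS sets from opcodes/, and the model does not claim to match what gas implements. It only encodes the relationships stated above: sse4a enables SSE3 and earlier, Intel SSE4 enables SSSE3 and earlier, and disabling SSE3 removes everything built on it.

```c
/* Hypothetical feature bits for illustration only; these are NOT the
   real CpuXXX flags used by gas/opcodes. */
enum {
    F_SSE    = 1u << 0,
    F_SSE2   = 1u << 1,
    F_SSE3   = 1u << 2,
    F_SSSE3  = 1u << 3,
    F_SSE4_1 = 1u << 4,
    F_SSE4_2 = 1u << 5,
    F_SSE4A  = 1u << 6
};

/* Intel SSE4 (an alias for SSE4.2 per the thread): SSSE3 and earlier,
   plus SSE4.1 and SSE4.2. */
unsigned enable_sse4(unsigned flags)
{
    return flags | F_SSE | F_SSE2 | F_SSE3 | F_SSSE3 | F_SSE4_1 | F_SSE4_2;
}

/* AMD SSE4a: SSE3 and earlier, plus SSE4a. No SSSE3. */
unsigned enable_sse4a(unsigned flags)
{
    return flags | F_SSE | F_SSE2 | F_SSE3 | F_SSE4A;
}

/* Reading 1 (H.J. Lu, matching gcc's -mno-sse4): only the Intel
   SSE4.1/SSE4.2 bits are cleared; SSE4a is left alone. */
unsigned disable_sse4_intel_only(unsigned flags)
{
    return flags & ~(unsigned)(F_SSE4_1 | F_SSE4_2);
}

/* Reading 2 (Jan Beulich): anything named SSE4* is cleared. */
unsigned disable_sse4_all(unsigned flags)
{
    return flags & ~(unsigned)(F_SSE4_1 | F_SSE4_2 | F_SSE4A);
}

/* Disabling SSE3 clears SSE3 and everything that depends on it,
   including SSE4a, as described in the original patch. */
unsigned disable_sse3(unsigned flags)
{
    return flags & ~(unsigned)(F_SSE3 | F_SSSE3 | F_SSE4_1 | F_SSE4_2 | F_SSE4A);
}
```

In this model, disable_sse4_intel_only(enable_sse4a(0)) still has F_SSE4A set, while disable_sse4_all(enable_sse4a(0)) does not; that difference is the whole disagreement.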