optimal encoding of SIMD insns

Jan Beulich JBeulich@suse.com
Wed Nov 22 07:44:00 GMT 2017


>>> On 21.11.17 at 23:14, <hjl.tools@gmail.com> wrote:
> On Mon, Nov 20, 2017 at 6:18 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 20.11.17 at 14:51, <hjl.tools@gmail.com> wrote:
>>> On Mon, Nov 20, 2017 at 5:41 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>> On 20.11.17 at 14:22, <hjl.tools@gmail.com> wrote:
>>>>> On Mon, Nov 20, 2017 at 5:09 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>>> On 20.11.17 at 13:55, <hjl.tools@gmail.com> wrote:
>>>>>>> On Mon, Nov 20, 2017 at 4:08 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>> H.J.,
>>>>>>>>
>>>>>>>> one more sub-optimal thing I've come across, in the context of my
>>>>>>>> analysis of whether Vec_Disp8 really is fully redundant with
>>>>>>>> DispMemShift: Certain instructions, for example
>>>>>>>>
>>>>>>>>         vaddps  xmm0, xmm0, [eax+0x80]
>>>>>>>>
>>>>>>>> are encodable with both VEX and EVEX. Generally the assembler
>>>>>>>> tries to pick the shortest encoding. Obviously the VEX encoding,
>>>>>>>> due to requiring the Disp32 ModR/M form, is longer than the
>>>>>>>> EVEX one in the specific example above. Clearly such a
>>>>>>>> conversion can't be done unilaterally, as that could break code
>>>>>>>> meant to run on AVX-512-incapable hardware. However,
>>>>>>>> does anything speak against doing so after having seen a
>>>>>>>> command line option or directive explicitly enabling AVX-512
>>>>>>>> insns? Or should this instead be made even more explicit, by
>>>>>>>> introducing something paralleling the automatic SSE->AVX
>>>>>>>> conversion?
>>>>>>>>
>>>>>>>
>>>>>>> Have you looked at
>>>>>>>
>>>>>>> commit 86fa6981e7487e2c2df4337aa75ed2d93c32eaf2
>>>>>>> Author: H.J. Lu <hjl.tools@gmail.com>
>>>>>>> Date:   Thu Mar 9 09:58:46 2017 -0800
>>>>>>>
>>>>>>>     X86: Add pseudo prefixes to control encoding
>>>>>>>
>>>>>>>     Many x86 instructions have more than one encodings.  Assembler picks
>>>>>>>     the default one, usually the shortest one.  Although the ".s", ".d8"
>>>>>>>     and ".d32" suffixes can be used to swap register operands or specify
>>>>>>>     displacement size, they aren't very flexible.  This patch adds pseudo
>>>>>>>     prefixes, {xxx}, to control instruction encoding.  The available
>>>>>>>     pseudo prefixes are {disp8}, {disp32}, {load}, {store}, {vex2}, {vex3}
>>>>>>>     and {evex}.  Pseudo prefixes are preferred over the ".s", ".d8" and
>>>>>>>     ".d32" suffixes, which are deprecated.
>>>>>>
>>>>>> Yes, but that requires explicit action on part of the programmer or
>>>>>> compiler on every individual affected insn. Especially for the latter I
>>>>>> doubt it will (or even should) emit such pseudo prefixes.
>>>>>>
>>>>>
>>>>> You were asking about a "command line option or directive".  Does the
>>>>> above cover the "directive" part?
>>>>
>>>> No - the pseudo prefixes need to be used per insn. A directive would
>>>> be something similar to .sse2avx, covering all subsequent insns.
>>>>
>>>>> We currently have
>>>>>
>>>>>   -mavxscalar=[128|256]   encode scalar AVX instructions with specific vector
>>>>>                            length
>>>>>   -mevexlig=[128|256|512] encode scalar EVEX instructions with specific vector
>>>>>                            length
>>>>>   -mevexwig=[0|1]         encode EVEX instructions with specific EVEX.W value
>>>>>                            for EVEX.W bit ignored instructions
>>>>>   -mevexrcig=[rne|rd|ru|rz]
>>>>>                           encode EVEX instructions with specific EVEX.RC value
>>>>>                            for SAE-only ignored instructions
>>>>>
>>>>> We can add one mapped to {vex2}, {vex3}, {evex} directives.
>>>>
>>>> I'm afraid I don't understand what you're trying to suggest, or
>>>> how that is related to the original question.
>>>>
>>>
>>> Yes, we can add a command option as well as .vex2/.vex3/.evex
>>> directives.
>>
>> Oh, you mean to have a way to effect e.g. {vex2} globally? How
>> would that work for things where the pseudo prefix would cause
>> an error because a given insn isn't representable?
> 
> It can be used to cancel vex3 command-line option.

Which still goes against my argument of specifically not wanting
anyone to have to deal with individual insns here. I want a way
to improve encoding _without_ incurring any diagnostics that
then require changes to the original code.
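
To make the size argument concrete, here is a rough sketch of the
arithmetic behind the vaddps example from the start of the thread. It is
not gas code: the function names are made up for illustration, and it
assumes 32-bit mode, a plain [reg+disp] operand without a SIB byte, no
extra prefixes, the 2-byte VEX form, and a 128-bit full-vector insn
(so the EVEX compressed Disp8 scale N is 16). The selection function
mirrors the idea of falling back silently to VEX rather than raising a
diagnostic.

```python
# Illustrative only: helper names are hypothetical, not part of binutils.
# Assumptions: 32-bit mode, [reg+disp] with no SIB byte, no legacy prefixes,
# 2-byte VEX is usable, 128-bit full-vector insn (EVEX Disp8*N with N = 16).

def disp_bytes(disp, scale=1):
    """Bytes needed for the displacement of [reg+disp].

    scale is 1 for VEX (plain Disp8) and N for EVEX compressed Disp8*N.
    """
    if disp == 0:
        return 0  # no displacement byte (ignores the ebp special case)
    if disp % scale == 0 and -128 <= disp // scale <= 127:
        return 1  # fits the (possibly scaled) 8-bit form
    return 4      # needs Disp32


def vex_len(disp):
    # 2-byte VEX prefix + opcode + ModR/M + displacement
    return 2 + 1 + 1 + disp_bytes(disp)


def evex_len(disp, n=16):
    # 4-byte EVEX prefix + opcode + ModR/M + displacement
    return 4 + 1 + 1 + disp_bytes(disp, scale=n)


def pick_encoding(disp, avx512_enabled):
    """Prefer EVEX only when it is allowed and strictly shorter."""
    if avx512_enabled and evex_len(disp) < vex_len(disp):
        return "evex"
    return "vex"  # silent fallback, no diagnostic


if __name__ == "__main__":
    for d in (0x10, 0x80, 0x800):
        print(hex(d), vex_len(d), evex_len(d), pick_encoding(d, True))
```

Under these assumptions the 0x80 displacement costs 8 bytes with VEX
(Disp32) and 7 with EVEX (Disp8*16), while a small displacement such as
0x10 stays shorter with VEX, so the choice really has to be made per
operand rather than per mnemonic.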

>> And then - does your reply mean you don't think we could derive
>> the encoding selection from the most recent -march/.arch (maybe
>> also -mtune) setting?
> 
> No, I don't think so.

Would you mind being a little less terse here? I'd really
like to understand the "why" behind your answer.

Jan


