This is the mail archive of the
binutils@sourceware.org
mailing list for the binutils project.
Re: [PATCH] x86: Expand Broadcast to 3 bits
On Thu, Jul 26, 2018 at 9:13 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 26.07.18 at 18:03, <hjl.tools@gmail.com> wrote:
>> On Thu, Jul 26, 2018 at 8:58 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 26.07.18 at 17:52, <hjl.tools@gmail.com> wrote:
>>>> On Thu, Jul 26, 2018 at 8:47 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>> On 26.07.18 at 17:02, <hjl.tools@gmail.com> wrote:
>>>>>> On Thu, Jul 26, 2018 at 7:58 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>>>> On 26.07.18 at 00:05, <hongjiu.lu@intel.com> wrote:
>>>>>>>> @@ -5008,6 +5010,22 @@ optimize_disp (void)
>>>>>>>> }
>>>>>>>> }
>>>>>>>>
>>>>>>>> +/* Return 1 if there is a match in broadcast bytes between operand
>>>>>>>> + GIVEN and instruction template T. */
>>>>>>>> +
>>>>>>>> +static INLINE int
>>>>>>>> +match_broadcast_size (const insn_template *t, unsigned int given)
>>>>>>>> +{
>>>>>>>> + return ((t->opcode_modifier.broadcast == BYTE_BROADCAST
>>>>>>>> + && i.types[given].bitfield.byte)
>>>>>>>> + || (t->opcode_modifier.broadcast == WORD_BROADCAST
>>>>>>>> + && i.types[given].bitfield.word)
>>>>>>>> + || (t->opcode_modifier.broadcast == DWORD_BROADCAST
>>>>>>>> + && i.types[given].bitfield.dword)
>>>>>>>> + || (t->opcode_modifier.broadcast == QWORD_BROADCAST
>>>>>>>> + && i.types[given].bitfield.qword));
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> /* Check if operands are valid for the instruction. */
>>>>>>>>
>>>>>>>> static int
>>>>>>>> @@ -5126,23 +5144,29 @@ check_VecOperands (const insn_template *t)
>>>>>>>> i386_operand_type type, overlap;
>>>>>>>>
>>>>>>>> /* Check if specified broadcast is supported in this instruction,
>>>>>>>> - and it's applied to memory operand of DWORD or QWORD type. */
>>>>>>>> + and its broadcast bytes match the memory operand. */
>>>>>>>> op = i.broadcast->operand;
>>>>>>>> if (!t->opcode_modifier.broadcast
>>>>>>>> || !i.types[op].bitfield.mem
>>>>>>>> || (!i.types[op].bitfield.unspecified
>>>>>>>> - && (t->operand_types[op].bitfield.dword
>>>>>>>> - ? !i.types[op].bitfield.dword
>>>>>>>> - : !i.types[op].bitfield.qword)))
>>>>>>>> + && !match_broadcast_size (t, op)))
>>>>>>>> {
>>>>>>>> bad_broadcast:
>>>>>>>> i.error = unsupported_broadcast;
>>>>>>>> return 1;
>>>>>>>> }
>>>>>>>>
>>>>>>>> + i.broadcast->bytes = ((1 << (t->opcode_modifier.broadcast - 1))
>>>>>>>> + * i.broadcast->type);
>>>>>>>
>>>>>>> So if you moved this up ahead of the earlier if(), and if you used
>>>>>>> i.broadcast->bytes in place of t->opcode_modifier.broadcast in
>>>>>>> match_broadcast_size(), I think you could get away without the
>>>>>>> extension to 3 bits in the templates.
>>>>>>
>>>>>> i.broadcast->bytes is set from t->opcode_modifier.broadcast.
>>>>>> I'd like to avoid checking byte, word, dword, qword to compute
>>>>>> i.broadcast->bytes.
>>>>>
>>>>> And this is because of what? This is exactly the kind of redundancy
>>>>> I'm talking about. Or are there going to be cases where the
>>>>> broadcast element size is not the smallest among multiple possible
>>>>> ones for a single template (but then your logic in i386-gen would
>>>>> be wrong too)?
>>>>
>>>> By definition, the broadcast element size is the smallest non-vector
>>>> size.
>>>
>>> In which case my question stands - what you've said in your earlier
>>> reply is because of what?
>>
>> I want to avoid checking byte, word, dword, qword when all I need
>> is the broadcast element size.
>
> Hmm, moving in circles? You just repeat what you've said before. Are
> you suggesting you view this as a performance issue?
>
Yes.
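
To make the two options in this thread concrete, here is a minimal,
standalone sketch. It is not the code from the patch. The enum values
are an assumption inferred from the "1 << (broadcast - 1)" shift in the
quoted hunk, "struct size_bits" is a hypothetical stand-in for the size
bits of i386_operand_type, and all three function names are made up for
illustration.

```c
/* Assumed encodings: chosen so that 1 << (value - 1) yields the element
   size in bytes, as the shift in the quoted hunk implies.  Four non-zero
   values no longer fit in 2 bits, hence the expansion to 3 bits.  */
enum
{
  NO_BROADCAST = 0,
  BYTE_BROADCAST = 1,
  WORD_BROADCAST = 2,
  DWORD_BROADCAST = 3,
  QWORD_BROADCAST = 4
};

/* Hypothetical stand-in for the operand size bits of a template.  */
struct size_bits
{
  unsigned int byte:1;
  unsigned int word:1;
  unsigned int dword:1;
  unsigned int qword:1;
};

/* Option 1 (the patch): derive the element size directly from the
   template's Broadcast value with a single shift.  */
static unsigned int
elem_bytes_from_modifier (unsigned int broadcast)
{
  return broadcast == NO_BROADCAST ? 0 : 1u << (broadcast - 1);
}

/* Option 2 (the review suggestion): find the smallest non-vector size
   bit set in the template, testing each bit in turn.  */
static unsigned int
elem_bytes_from_sizes (struct size_bits s)
{
  if (s.byte)
    return 1;
  if (s.word)
    return 2;
  if (s.dword)
    return 4;
  if (s.qword)
    return 8;
  return 0;
}

/* Total broadcast bytes: element size times the {1toN} count, mirroring
   the i.broadcast->bytes computation in the quoted hunk.  */
static unsigned int
broadcast_bytes (unsigned int broadcast, unsigned int count)
{
  return elem_bytes_from_modifier (broadcast) * count;
}
```

Under these assumptions both options give the same element size for a
template whose smallest size bit is dword, and a {1to16} broadcast of
dword elements comes to 64 bytes. The difference is only in how the
size is obtained: one shift versus up to four bit tests.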
--
H.J.