This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.
Re: [PATCH] x86: Expand Broadcast to 3 bits
- From: "Jan Beulich" <JBeulich at suse dot com>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>
- Cc: "Igor V Tsimbalist" <igor dot v dot tsimbalist at intel dot com>, <binutils at sourceware dot org>
- Date: Thu, 26 Jul 2018 09:47:02 -0600
- Subject: Re: [PATCH] x86: Expand Broadcast to 3 bits
- References: <20180725220507.GA3533@intel.com> <5B59E18402000078001D83EA@prv1-mh.provo.novell.com> <CAMe9rOq4S2RO3q2_0Jttu48StAOs=xev4KnveEs0drdxbwxKGQ@mail.gmail.com>
>>> On 26.07.18 at 17:02, <hjl.tools@gmail.com> wrote:
> On Thu, Jul 26, 2018 at 7:58 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 26.07.18 at 00:05, <hongjiu.lu@intel.com> wrote:
>>> @@ -5008,6 +5010,22 @@ optimize_disp (void)
>>> }
>>> }
>>>
>>> +/* Return 1 if there is a match in broadcast bytes between operand
>>> +   GIVEN and instruction template T. */
>>> +
>>> +static INLINE int
>>> +match_broadcast_size (const insn_template *t, unsigned int given)
>>> +{
>>> +  return ((t->opcode_modifier.broadcast == BYTE_BROADCAST
>>> +           && i.types[given].bitfield.byte)
>>> +          || (t->opcode_modifier.broadcast == WORD_BROADCAST
>>> +              && i.types[given].bitfield.word)
>>> +          || (t->opcode_modifier.broadcast == DWORD_BROADCAST
>>> +              && i.types[given].bitfield.dword)
>>> +          || (t->opcode_modifier.broadcast == QWORD_BROADCAST
>>> +              && i.types[given].bitfield.qword));
>>> +}
>>> +
>>> /* Check if operands are valid for the instruction. */
>>>
>>> static int
>>> @@ -5126,23 +5144,29 @@ check_VecOperands (const insn_template *t)
>>>        i386_operand_type type, overlap;
>>>
>>>        /* Check if specified broadcast is supported in this instruction,
>>> -         and it's applied to memory operand of DWORD or QWORD type. */
>>> +         and its broadcast bytes match the memory operand. */
>>>        op = i.broadcast->operand;
>>>        if (!t->opcode_modifier.broadcast
>>>            || !i.types[op].bitfield.mem
>>>            || (!i.types[op].bitfield.unspecified
>>> -              && (t->operand_types[op].bitfield.dword
>>> -                  ? !i.types[op].bitfield.dword
>>> -                  : !i.types[op].bitfield.qword)))
>>> +              && !match_broadcast_size (t, op)))
>>>          {
>>>          bad_broadcast:
>>>            i.error = unsupported_broadcast;
>>>            return 1;
>>>          }
>>>
>>> +      i.broadcast->bytes = ((1 << (t->opcode_modifier.broadcast - 1))
>>> +                            * i.broadcast->type);
>>
>> So if you moved this up ahead of the earlier if(), and if you used
>> i.broadcast->bytes in place of t->opcode_modifier.broadcast in
>> match_broadcast_size(), I think you could get away without the
>> extension to 3 bits in the templates.
>
> i.broadcast->bytes is set from t->opcode_modifier.broadcast.
> I'd like to avoid checking byte, word, dword, qword to compute
> i.broadcast->bytes.

Why is that? This is exactly the kind of redundancy I'm talking
about. Or are there going to be cases where the broadcast element
size is not the smallest among multiple possible ones for a single
template? (In that case your logic in i386-gen would be wrong too.)

Jan
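
For readers following the thread, here is a minimal, self-contained sketch of the arithmetic under discussion. It is not code from the patch. The enum values are an assumption, chosen so that `1 << (value - 1)` gives the element size in bytes, as the quoted `i.broadcast->bytes` expression implies. `struct size_bits` and the helper functions `broadcast_elem_bytes`, `broadcast_total_bytes`, `smallest_elem_bytes` and `match_elem_size` are hypothetical stand-ins for the assembler's operand-type bitfields and are named here only for illustration.

```c
#include <stdbool.h>

/* Assumed encoding of the template's broadcast field: the values are
   chosen so that 1 << (value - 1) is the element size in bytes.  */
enum broadcast_kind
{
  NO_BROADCAST = 0,
  BYTE_BROADCAST,   /* 1: 1-byte elements */
  WORD_BROADCAST,   /* 2: 2-byte elements */
  DWORD_BROADCAST,  /* 3: 4-byte elements */
  QWORD_BROADCAST   /* 4: 8-byte elements */
};

/* Hypothetical stand-in for the size bits of an operand type.  */
struct size_bits
{
  unsigned int byte:1;
  unsigned int word:1;
  unsigned int dword:1;
  unsigned int qword:1;
};

/* Element size in bytes for a broadcast kind, 0 for NO_BROADCAST.  */
static unsigned int
broadcast_elem_bytes (enum broadcast_kind kind)
{
  return kind == NO_BROADCAST ? 0 : 1u << (kind - 1);
}

/* Total bytes covered by a {1toN} broadcast: element size times N.  */
static unsigned int
broadcast_total_bytes (enum broadcast_kind kind, unsigned int n)
{
  return broadcast_elem_bytes (kind) * n;
}

/* Smallest size permitted by a set of size bits, 0 if none is set.
   This is one reading of deriving the element size from the
   template's operand type instead of a dedicated template field.  */
static unsigned int
smallest_elem_bytes (struct size_bits s)
{
  if (s.byte)
    return 1;
  if (s.word)
    return 2;
  if (s.dword)
    return 4;
  if (s.qword)
    return 8;
  return 0;
}

/* Return true if the size given on the operand agrees with
   ELEM_BYTES.  */
static bool
match_elem_size (unsigned int elem_bytes, struct size_bits given)
{
  return ((elem_bytes == 1 && given.byte)
          || (elem_bytes == 2 && given.word)
          || (elem_bytes == 4 && given.dword)
          || (elem_bytes == 8 && given.qword));
}
```

Under these assumptions, a dword broadcast with {1to16} covers 4 * 16 = 64 bytes, and so does a qword broadcast with {1to8}. Whether the element size can always be recovered as the smallest size the template operand allows is exactly the open question raised above; the sketch only shows what that derivation would look like if it holds.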