[Patch] [gas][aarch64][SVE2] Fix pmull{t,b} requirement on SVE2-AES

Matthew Malcomson Matthew.Malcomson@arm.com
Mon Jul 1 09:40:00 GMT 2019


On 01/07/19 08:27, Tamar Christina wrote:
> Hi Matthew,
> 
> Quick question: the patch doesn't mention what turns on these instructions now.
> I am assuming that it is just +sve2? In which case, does it mean that HaveSVE2PMULL128 is mandatory for SVE2?
> 

Hi Tamar,

+sve2 on its own turns on pmull{t,b} for the  .B -> .H  and  .S -> .D  sizes.
In the spec these sizes are not gated on any feature check.

The  .D -> .Q  size is gated on HaveSVE2PMULL128 in the spec, and that is
turned on with the +sve2-aes extension, so HaveSVE2PMULL128 is not
mandatory for SVE2.
(This is the line
if size == '00' && !HaveSVE2PMULL128() then UNDEFINED;
in the pseudocode.)

The mistake before was that all sizes were put under the +sve2-aes 
extension instead of just the  .D -> .Q size.
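
As a rough illustration of the gating described above (a sketch only, not
the gas implementation): the size field is assumed to sit in bits 23:22 of
the instruction word, which is what the encodings in the quoted sve2.d test
suggest, and the enum and function names are made up for this example.

```c
#include <stdint.h>

/* Hypothetical names, used only for this illustration.  */
enum pmull_gate
{
  PMULL_INVALID_SIZE,   /* size == 2: not a pmull{t,b} size.  */
  PMULL_NEEDS_SVE2,     /* enabled by +sve2 alone.  */
  PMULL_NEEDS_SVE2_AES  /* additionally needs +sve2-aes.  */
};

/* Assumption: the two-bit size field is bits 23:22 of the word.  */
static unsigned
pmull_size_field (uint32_t word)
{
  return (word >> 22) & 3u;
}

/* Map the size field to the extension that gates it.  */
static enum pmull_gate
pmull_gate_for_size (unsigned size)
{
  switch (size & 3u)
    {
    case 0:
      return PMULL_NEEDS_SVE2_AES;  /* .D -> .Q, HaveSVE2PMULL128 check.  */
    case 1:
      return PMULL_NEEDS_SVE2;      /* .B -> .H, no extra check.  */
    case 3:
      return PMULL_NEEDS_SVE2;      /* .S -> .D, no extra check.  */
    default:
      return PMULL_INVALID_SIZE;    /* The old decoder rejected size 2.  */
    }
}
```

For example, 0x45006800 (pmullb z0.q, z0.d, z0.d in the tests) has size 0
and so falls under +sve2-aes, while 0x45406800 and 0x45c06800 have sizes 1
and 3 and need only +sve2.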

N.B. we put PMULL128 under the +sve2-aes extension flag to mirror the
PMULL and PMULL2 instructions in NEON, where the 128-bit variant is
enabled when +aes is specified.
Also, PMULL128 can only be implemented together with SVE2-AES, so this
isn't an unrelated flag.
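
To make the split concrete, here is a small sketch that checks instruction
words against the two pmullb opcode/mask pairs from the aarch64-tbl.h hunk
of the quoted patch. The struct, enum and function names are hypothetical;
only the hex values come from the patch, and the real opcodes code is not
organised like this.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature tags for this illustration.  */
enum pmull_feature { FEAT_NONE, FEAT_SVE2, FEAT_SVE2_AES };

struct pmull_entry
{
  uint32_t opcode;
  uint32_t mask;
  enum pmull_feature feature;
};

/* Opcode/mask values copied from the patched aarch64-tbl.h entries.  */
static const struct pmull_entry pmullb_entries[] =
{
  { 0x45406800u, 0xff60fc00u, FEAT_SVE2 },      /* .H <- .B and .D <- .S  */
  { 0x45006800u, 0xffe0fc00u, FEAT_SVE2_AES },  /* .Q <- .D  */
};

/* Return the feature of the first entry that matches WORD, or
   FEAT_NONE if no pmullb entry matches.  */
static enum pmull_feature
pmullb_feature (uint32_t word)
{
  size_t n = sizeof pmullb_entries / sizeof pmullb_entries[0];
  for (size_t i = 0; i < n; i++)
    if ((word & pmullb_entries[i].mask) == pmullb_entries[i].opcode)
      return pmullb_entries[i].feature;
  return FEAT_NONE;
}
```

The first mask leaves bit 23 free and requires bit 22 to be set, so one
entry covers both small sizes; the second mask fixes both size bits to
zero, so only the .Q form lands on the SVE2-AES entry.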

Cheers,
Matthew

> Kind Regards,
> Tamar
> 
>> -----Original Message-----
>> From: binutils-owner@sourceware.org <binutils-owner@sourceware.org>
>> On Behalf Of Matthew Malcomson
>> Sent: Friday, June 28, 2019 14:45
>> To: binutils@sourceware.org
>> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Marcus Shawcroft
>> <Marcus.Shawcroft@arm.com>; Richard Earnshaw
>> <Richard.Earnshaw@arm.com>; nd <nd@arm.com>
>> Subject: [Patch] [gas][aarch64][SVE2] Fix pmull{t,b} requirement on SVE2-
>> AES
>>
>> I had mistakenly given all variants of the new SVE2 instructions pmull{t,b} a
>> dependency on the feature +sve2-aes.
>>
>> Only the variant specifying .D -> .Q  sizes should have that restriction.
>>
>> This patch fixes that mistake and updates the testsuite to have extra tests
>> (matching the given set of tests per line in aarch64-tbl.h that the rest of the
>> SVE2 tests follow).
>>
>> We also add a line in the documentation of the command line to clarify how
>> to enable `pmull{t,b}` of this larger size.  This is needed because all other
>> instructions gated under the `sve2-aes` architecture extension are marked in
>> the instruction documentation by an `HaveSVE2AES` check while pmull{t,b} is
>> gated under the `HaveSVE2PMULL128` check.
>>
>> Regtested targeting aarch64-linux.
>>
>> gas/ChangeLog:
>>
>> 2019-06-28  Matthew Malcomson  <matthew.malcomson@arm.com>
>>
>> 	* testsuite/gas/aarch64/illegal-sve2-aes.d: Update tests.
>> 	* testsuite/gas/aarch64/illegal-sve2.l: Update tests.
>> 	* doc/c-aarch64.texi: Add special note of pmull{t,b}
>> 	instructions under the sve2-aes architecture extension.
>> 	* testsuite/gas/aarch64/illegal-sve2.s: Add small size
>> 	pmull{t,b} instructions.
>> 	* testsuite/gas/aarch64/sve2.d: Add small size pmull{t,b}
>> 	disassembly.
>> 	* testsuite/gas/aarch64/sve2.s: Add small size pmull{t,b}
>> 	instructions.
>>
>> include/ChangeLog:
>>
>> 2019-06-28  Matthew Malcomson  <matthew.malcomson@arm.com>
>>
>> 	* opcode/aarch64.h (enum aarch64_insn_class): sve_size_013
>> 	renamed to sve_size_13.
>>
>> opcodes/ChangeLog:
>>
>> 2019-06-28  Matthew Malcomson  <matthew.malcomson@arm.com>
>>
>> 	* aarch64-asm.c (aarch64_encode_variant_using_iclass): Use new
>> 	sve_size_13 icode to account for variant behaviour of
>> 	pmull{t,b}.
>> 	* aarch64-dis-2.c: Regenerate.
>> 	* aarch64-dis.c (aarch64_decode_variant_using_iclass): Use new
>> 	sve_size_13 icode to account for variant behaviour of
>> 	pmull{t,b}.
>> 	* aarch64-tbl.h (OP_SVE_VVV_HD_BS): Add new qualifier.
>> 	(OP_SVE_VVV_Q_D): Add new qualifier.
>> 	(OP_SVE_VVV_QHD_DBS): Remove now unused qualifier.
>> 	(struct aarch64_opcode): Split pmull{t,b} into those requiring
>> 	AES and those not.
>>
>>
>>
>> ###############     Attachment also inlined for ease of reply
>> ###############
>>                  Inline version does not contain generated files
>>
>>
>> diff --git a/gas/doc/c-aarch64.texi b/gas/doc/c-aarch64.texi index
>> e6630610010744b75fb2b6a3a2cb008d08c9c97a..6844f5980214919bec6d1a37b
>> 76d0a69ebb12983 100644
>> --- a/gas/doc/c-aarch64.texi
>> +++ b/gas/doc/c-aarch64.texi
>> @@ -203,7 +203,8 @@ automatically cause those extensions to be disabled.
>>   @item @code{sve2-sm4} @tab ARMv8-A @tab No
>>    @tab Enable SVE2 SM4 Extension.
>>   @item @code{sve2-aes} @tab ARMv8-A @tab No
>> - @tab Enable SVE2 AES Extension.
>> + @tab Enable SVE2 AES Extension.  This also enables the .Q->.D form of
>> + the @code{pmullt} and @code{pmullb} instructions.
>>   @item @code{sve2-sha3} @tab ARMv8-A @tab No
>>    @tab Enable SVE2 SHA3 Extension.
>>   @end multitable
>> diff --git a/gas/testsuite/gas/aarch64/illegal-sve2-aes.d
>> b/gas/testsuite/gas/aarch64/illegal-sve2-aes.d
>> index
>> 8e6daa2c3509a143067dcf8d90c1b3db8aed321f..926db22e4d2b38823d9f9d232
>> 8f1052f6397d05c 100644
>> --- a/gas/testsuite/gas/aarch64/illegal-sve2-aes.d
>> +++ b/gas/testsuite/gas/aarch64/illegal-sve2-aes.d
>> @@ -12,9 +12,5 @@
>>   #error: [^ :]+:[0-9]+: Error: selected processor does not support `aesmc
>> z0\.b,z0\.b'
>>   #error: [^ :]+:[0-9]+: Error: selected processor does not support `pmullb
>> z17\.q,z21\.d,z27\.d'
>>   #error: [^ :]+:[0-9]+: Error: selected processor does not support `pmullb
>> z0\.q,z0\.d,z0\.d'
>> -#error: [^ :]+:[0-9]+: Error: selected processor does not support `pmullb
>> z0\.h,z0\.b,z0\.b'
>> -#error: [^ :]+:[0-9]+: Error: selected processor does not support `pmullb
>> z0\.d,z0\.s,z0\.s'
>>   #error: [^ :]+:[0-9]+: Error: selected processor does not support `pmullt
>> z17\.q,z21\.d,z27\.d'
>>   #error: [^ :]+:[0-9]+: Error: selected processor does not support `pmullt
>> z0\.q,z0\.d,z0\.d'
>> -#error: [^ :]+:[0-9]+: Error: selected processor does not support `pmullt
>> z0\.h,z0\.b,z0\.b'
>> -#error: [^ :]+:[0-9]+: Error: selected processor does not support `pmullt
>> z0\.d,z0\.s,z0\.s'
>> diff --git a/gas/testsuite/gas/aarch64/illegal-sve2.l
>> b/gas/testsuite/gas/aarch64/illegal-sve2.l
>> index
>> 7d93a0902632f8c7f119fb6c3b41e40b703b1664..01c68479c4c7fd7afd07d68f744
>> 87b33130addba 100644
>> --- a/gas/testsuite/gas/aarch64/illegal-sve2.l
>> +++ b/gas/testsuite/gas/aarch64/illegal-sve2.l
>> @@ -756,18 +756,24 @@
>>   [^ :]+:[0-9]+: Error: operand mismatch -- `pmullb z0\.d,z0\.d,z0\.d'
>>   [^ :]+:[0-9]+: Info:    did you mean this\?
>>   [^ :]+:[0-9]+: Info:    	pmullb z0\.q, z0\.d, z0\.d
>> -[^ :]+:[0-9]+: Info:    other valid variant\(s\):
>> -[^ :]+:[0-9]+: Info:    	pmullb z0\.h, z0\.b, z0\.b
>> -[^ :]+:[0-9]+: Info:    	pmullb z0\.d, z0\.s, z0\.s
>> +[^ :]+:[0-9]+: Error: operand 1 must be an SVE vector register -- `pmullb
>> z32\.h,z0\.b,z0\.b'
>> +[^ :]+:[0-9]+: Error: operand 2 must be an SVE vector register -- `pmullb
>> z0\.h,z32\.b,z0\.b'
>> +[^ :]+:[0-9]+: Error: operand 3 must be an SVE vector register -- `pmullb
>> z0\.h,z0\.b,z32\.b'
>> +[^ :]+:[0-9]+: Error: operand mismatch -- `pmullb z0\.b,z0\.b,z0\.b'
>> +[^ :]+:[0-9]+: Info:    did you mean this\?
>> +[^ :]+:[0-9]+: Info:    	pmullb z0\.q, z0\.d, z0\.d
>>   [^ :]+:[0-9]+: Error: operand 1 must be an SVE vector register -- `pmullt
>> z32\.q,z0\.d,z0\.d'
>>   [^ :]+:[0-9]+: Error: operand 2 must be an SVE vector register -- `pmullt
>> z0\.q,z32\.d,z0\.d'
>>   [^ :]+:[0-9]+: Error: operand 3 must be an SVE vector register -- `pmullt
>> z0\.q,z0\.d,z32\.d'
>>   [^ :]+:[0-9]+: Error: operand mismatch -- `pmullt z0\.d,z0\.d,z0\.d'
>>   [^ :]+:[0-9]+: Info:    did you mean this\?
>>   [^ :]+:[0-9]+: Info:    	pmullt z0\.q, z0\.d, z0\.d
>> -[^ :]+:[0-9]+: Info:    other valid variant\(s\):
>> -[^ :]+:[0-9]+: Info:    	pmullt z0\.h, z0\.b, z0\.b
>> -[^ :]+:[0-9]+: Info:    	pmullt z0\.d, z0\.s, z0\.s
>> +[^ :]+:[0-9]+: Error: operand 1 must be an SVE vector register -- `pmullt
>> z32\.h,z0\.b,z0\.b'
>> +[^ :]+:[0-9]+: Error: operand 2 must be an SVE vector register -- `pmullt
>> z0\.h,z32\.b,z0\.b'
>> +[^ :]+:[0-9]+: Error: operand 3 must be an SVE vector register -- `pmullt
>> z0\.h,z0\.b,z32\.b'
>> +[^ :]+:[0-9]+: Error: operand mismatch -- `pmullt z0\.b,z0\.b,z0\.b'
>> +[^ :]+:[0-9]+: Info:    did you mean this\?
>> +[^ :]+:[0-9]+: Info:    	pmullt z0\.q, z0\.d, z0\.d
>>   [^ :]+:[0-9]+: Error: operand mismatch -- `raddhnb z0\.h,z0\.h,z0\.h'
>>   [^ :]+:[0-9]+: Info:    did you mean this\?
>>   [^ :]+:[0-9]+: Info:    	raddhnb z0\.b, z0\.h, z0\.h
>> diff --git a/gas/testsuite/gas/aarch64/illegal-sve2.s
>> b/gas/testsuite/gas/aarch64/illegal-sve2.s
>> index
>> c6c408c196821cae34bf03442d915e9b87d1ac53..c963a5c2710876df39c641ad90
>> 8859585cec0dc1 100644
>> --- a/gas/testsuite/gas/aarch64/illegal-sve2.s
>> +++ b/gas/testsuite/gas/aarch64/illegal-sve2.s
>> @@ -519,11 +519,21 @@ pmullb z0.q, z32.d, z0.d  pmullb z0.q, z0.d, z32.d
>> pmullb z0.d, z0.d, z0.d
>>
>> +pmullb z32.h, z0.b, z0.b
>> +pmullb z0.h, z32.b, z0.b
>> +pmullb z0.h, z0.b, z32.b
>> +pmullb z0.b, z0.b, z0.b
>> +
>>   pmullt z32.q, z0.d, z0.d
>>   pmullt z0.q, z32.d, z0.d
>>   pmullt z0.q, z0.d, z32.d
>>   pmullt z0.d, z0.d, z0.d
>>
>> +pmullt z32.h, z0.b, z0.b
>> +pmullt z0.h, z32.b, z0.b
>> +pmullt z0.h, z0.b, z32.b
>> +pmullt z0.b, z0.b, z0.b
>> +
>>   raddhnb z0.h, z0.h, z0.h
>>   raddhnb z32.b, z0.h, z0.h
>>   raddhnb z0.b, z32.h, z0.h
>> diff --git a/gas/testsuite/gas/aarch64/sve2.d
>> b/gas/testsuite/gas/aarch64/sve2.d
>> index
>> efa9b270ff49b16fcd544ff8236461ca852204d4..5324583020fda14912f799ac15a
>> 41ab5dfebc1ab 100644
>> --- a/gas/testsuite/gas/aarch64/sve2.d
>> +++ b/gas/testsuite/gas/aarch64/sve2.d
>> @@ -264,10 +264,12 @@ Disassembly of section \.text:
>>    *[0-9a-f]+:	04206400 	pmul	z0\.b, z0\.b, z0\.b
>>    *[0-9a-f]+:	451b6ab1 	pmullb	z17\.q, z21\.d, z27\.d
>>    *[0-9a-f]+:	45006800 	pmullb	z0\.q, z0\.d, z0\.d
>> + *[0-9a-f]+:	455b6ab1 	pmullb	z17\.h, z21\.b, z27\.b
>>    *[0-9a-f]+:	45406800 	pmullb	z0\.h, z0\.b, z0\.b
>>    *[0-9a-f]+:	45c06800 	pmullb	z0\.d, z0\.s, z0\.s
>>    *[0-9a-f]+:	451b6eb1 	pmullt	z17\.q, z21\.d, z27\.d
>>    *[0-9a-f]+:	45006c00 	pmullt	z0\.q, z0\.d, z0\.d
>> + *[0-9a-f]+:	455b6eb1 	pmullt	z17\.h, z21\.b, z27\.b
>>    *[0-9a-f]+:	45406c00 	pmullt	z0\.h, z0\.b, z0\.b
>>    *[0-9a-f]+:	45c06c00 	pmullt	z0\.d, z0\.s, z0\.s
>>    *[0-9a-f]+:	457b6ab1 	raddhnb	z17\.b, z21\.h, z27\.h
>> diff --git a/gas/testsuite/gas/aarch64/sve2.s
>> b/gas/testsuite/gas/aarch64/sve2.s
>> index
>> 13d2e2a2421a3c71e2a2faa403dccc510e1d4f1f..9417a0d27b1c6fc2b962f196c62
>> 706536a929a3c 100644
>> --- a/gas/testsuite/gas/aarch64/sve2.s
>> +++ b/gas/testsuite/gas/aarch64/sve2.s
>> @@ -338,11 +338,15 @@ pmul z0.b, z0.b, z0.b
>>
>>   pmullb z17.q, z21.d, z27.d
>>   pmullb z0.q, z0.d, z0.d
>> +
>> +pmullb z17.h, z21.b, z27.b
>>   pmullb z0.h, z0.b, z0.b
>>   pmullb z0.d, z0.s, z0.s
>>
>>   pmullt z17.q, z21.d, z27.d
>>   pmullt z0.q, z0.d, z0.d
>> +
>> +pmullt z17.h, z21.b, z27.b
>>   pmullt z0.h, z0.b, z0.b
>>   pmullt z0.d, z0.s, z0.s
>>
>> diff --git a/include/opcode/aarch64.h b/include/opcode/aarch64.h index
>> a4520da8d150f4ff638bb0c2c7529d94ef4d150e..d0bbe01be6cd4084054a56461
>> 07d2bc6ad82bd07 100644
>> --- a/include/opcode/aarch64.h
>> +++ b/include/opcode/aarch64.h
>> @@ -599,7 +599,7 @@ enum aarch64_insn_class
>>     sve_size_sd,
>>     sve_size_bh,
>>     sve_size_sd2,
>> -  sve_size_013,
>> +  sve_size_13,
>>     sve_shift_tsz_hsd,
>>     sve_shift_tsz_bhsd,
>>     sve_size_tsz_bhs,
>> diff --git a/opcodes/aarch64-asm.c b/opcodes/aarch64-asm.c index
>> afb0e5b4d2ad0e62307e97858062f8599169248e..67ebad687cc7473842e258858
>> c0838dd5459cf2e 100644
>> --- a/opcodes/aarch64-asm.c
>> +++ b/opcodes/aarch64-asm.c
>> @@ -1679,8 +1679,8 @@ aarch64_encode_variant_using_iclass (struct
>> aarch64_inst *inst)
>>   		     0, 2, FLD_SVE_tszl_19, FLD_SVE_sz);
>>         break;
>>
>> -    case sve_size_013:
>> -      variant = aarch64_get_variant (inst);
>> +    case sve_size_13:
>> +      variant = aarch64_get_variant (inst) + 1;
>>         if (variant == 2)
>>   	  variant = 3;
>>         insert_field (FLD_size, &inst->value, variant, 0); diff --git
>> a/opcodes/aarch64-dis.c b/opcodes/aarch64-dis.c index
>> 6b53a2c3228aea3117e0f45450dc9f192eedd6de..7ae844a601c409b23f0a53735
>> 5aec6b54f7ef385 100644
>> --- a/opcodes/aarch64-dis.c
>> +++ b/opcodes/aarch64-dis.c
>> @@ -2822,14 +2822,11 @@ aarch64_decode_variant_using_iclass
>> (aarch64_inst *inst)
>>         variant = i - 1;
>>         break;
>>
>> -    case sve_size_013:
>> -      i = extract_field (FLD_size, inst->value, 0);
>> -      if (i == 2)
>> -	return FALSE;
>> -      if (i == 3)
>> -	variant = 2;
>> -      else
>> -	variant = i;
>> +    case sve_size_13:
>> +      /* Ignore low bit of this field since that is set in the opcode for
>> +	 instructions of this iclass.  */
>> +      i = (extract_field (FLD_size, inst->value, 0) & 2);
>> +      variant = (i >> 1);
>>         break;
>>
>>       case sve_shift_tsz_bhsd:
>> diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h index
>> 60255d93647cb21186c44cbd4f25cbdb8bd23225..9ee92ea35b226d7ca375d8b8
>> 1e324220256c6b89 100644
>> --- a/opcodes/aarch64-tbl.h
>> +++ b/opcodes/aarch64-tbl.h
>> @@ -1958,15 +1958,18 @@
>>   {                                                       \
>>     QLF3(S_S,S_S,S_S),                                    \
>>   }
>> +#define OP_SVE_VVV_HD_BS				\
>> +{                                                       \
>> +  QLF3(S_H,S_B,S_B),                                    \
>> +  QLF3(S_D,S_S,S_S),                                    \
>> +}
>>   #define OP_SVE_VVV_S_B                                  \
>>   {                                                       \
>>     QLF3(S_S,S_B,S_B),                                    \
>>   }
>> -#define OP_SVE_VVV_QHD_DBS                              \
>> +#define OP_SVE_VVV_Q_D					\
>>   {                                                       \
>>     QLF3(S_Q,S_D,S_D),                                    \
>> -  QLF3(S_H,S_B,S_B),                                    \
>> -  QLF3(S_D,S_S,S_S),                                    \
>>   }
>>   #define OP_SVE_VVV_HSD_BHS                              \
>>   {                                                       \
>> @@ -4673,6 +4676,8 @@ struct aarch64_opcode aarch64_opcode_table[] =
>>     SVE2_INSNC ("nbsl", 0x04e03c00, 0xffe0fc00, sve_misc, 0, OP4 (SVE_Zd,
>> SVE_Zd, SVE_Zm_16, SVE_Zn), OP_SVE_DDDD, 0, C_SCAN_MOVPRFX, 1),
>>     SVE2_INSN ("nmatch", 0x45208010, 0xffa0e010,  sve_size_bh, 0, OP4
>> (SVE_Pd, SVE_Pg3, SVE_Zn, SVE_Zm_16), OP_SVE_VZVV_BH, 0, 0),
>>     SVE2_INSN ("pmul", 0x04206400, 0xffe0fc00,  sve_misc, 0, OP3 (SVE_Zd,
>> SVE_Zn, SVE_Zm_16), OP_SVE_BBB, 0, 0),
>> +  SVE2_INSN ("pmullb", 0x45406800, 0xff60fc00, sve_size_13, 0, OP3
>> + (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_VVV_HD_BS, 0, 0),  SVE2_INSN
>> + ("pmullt", 0x45406c00, 0xff60fc00, sve_size_13, 0, OP3 (SVE_Zd,
>> + SVE_Zn, SVE_Zm_16), OP_SVE_VVV_HD_BS, 0, 0),
>>     SVE2_INSN ("raddhnb", 0x45206800, 0xff20fc00,  sve_size_hsd, 0, OP3
>> (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_VVV_BHS_HSD, 0, 0),
>>     SVE2_INSN ("raddhnt", 0x45206c00, 0xff20fc00,  sve_size_hsd, 0, OP3
>> (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_VVV_BHS_HSD, 0, 0),
>>     SVE2_INSN ("rshrnb", 0x45201800, 0xffa0fc00,  sve_shift_tsz_hsd, 0, OP3
>> (SVE_Zd, SVE_Zn, SVE_SHRIMM_UNPRED_22), OP_SVE_VVU_BHS_HSD, 0,
>> 0), @@ -4892,8 +4897,8 @@ struct aarch64_opcode aarch64_opcode_table[]
>> =
>>     SVE2AES_INSN ("aese", 0x4522e000, 0xfffffc00, sve_misc, 0, OP3 (SVE_Zd,
>> SVE_Zd, SVE_Zn), OP_SVE_BBB, 0, 1),
>>     SVE2AES_INSN ("aesimc", 0x4520e400, 0xffffffe0, sve_misc, 0, OP2 (SVE_Zd,
>> SVE_Zd), OP_SVE_BB, 0, 1),
>>     SVE2AES_INSN ("aesmc", 0x4520e000, 0xffffffe0, sve_misc, 0, OP2 (SVE_Zd,
>> SVE_Zd), OP_SVE_BB, 0, 1),
>> -  SVE2AES_INSN ("pmullb", 0x45006800, 0xff20fc00,  sve_size_013, 0, OP3
>> (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_VVV_QHD_DBS, 0, 0),
>> -  SVE2AES_INSN ("pmullt", 0x45006c00, 0xff20fc00,  sve_size_013, 0, OP3
>> (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_VVV_QHD_DBS, 0, 0),
>> +  SVE2AES_INSN ("pmullb", 0x45006800, 0xffe0fc00, sve_misc, 0, OP3
>> + (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_VVV_Q_D, 0, 0),  SVE2AES_INSN
>> + ("pmullt", 0x45006c00, 0xffe0fc00, sve_misc, 0, OP3 (SVE_Zd, SVE_Zn,
>> + SVE_Zm_16), OP_SVE_VVV_Q_D, 0, 0),
>>     /* SVE2_SHA3 instructions.  */
>>     SVE2SHA3_INSN ("rax1", 0x4520f400, 0xffe0fc00,  sve_misc, 0, OP3 (SVE_Zd,
>> SVE_Zn, SVE_Zm_16), OP_SVE_DDD, 0, 0),
>>     /* SVE2_BITPERM instructions. */
> 
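
The sve_size_13 handling in the quoted aarch64-asm.c and aarch64-dis.c
hunks can be checked in isolation with a sketch like the one below. The
two helper names are invented for this example; the arithmetic is copied
from the patch, with variant 0 taken to be the .H <- .B qualifier and
variant 1 the .D <- .S qualifier, following the order in OP_SVE_VVV_HD_BS.

```c
/* Encoder side: variant 0 -> size 1, variant 1 -> size 3.  */
static unsigned
size13_encode (unsigned variant)
{
  unsigned size = variant + 1;
  if (size == 2)
    size = 3;
  return size;
}

/* Decoder side: ignore the low bit of the size field (it is already
   fixed by the opcode) and use the high bit as the variant.  */
static unsigned
size13_decode (unsigned size)
{
  return (size & 2u) >> 1;
}
```

Encoding and then decoding returns the original variant for both 0 and 1,
which is the property the two hunks need to agree on.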


