[PATCH v2 1/2] gas, aarch64: Add AdvSIMD lut extension
Saurabh Jha
saurabh.jha@arm.com
Wed May 22 10:17:54 GMT 2024
Hi Andrew,
Thanks for the comments. Please find responses inline.
On 5/21/2024 2:57 PM, Andrew Carlotti wrote:
> On Thu, May 16, 2024 at 11:35:18AM +0100, Saurabh Jha wrote:
>> Introduces instructions for the Advanced SIMD lut extension for AArch64.
>> They are documented in the following links:
>> * luti2: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI2--Lookup-table-read-with-2-bit-indices-?lang=en
>> * luti4: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI4--Lookup-table-read-with-4-bit-indices-?lang=en
>>
>> These instructions needed definition of some new operands. We will first
>> discuss operands for the third operand of the instructions and then
>> discuss a vector register list operand needed for the second operand.
>>
>> The third operands are vectors with bit indices and without type
>> qualifiers. They are called Em_INDEX1_14, Em_INDEX2_13, and Em_INDEX3_12
>> and they have 1 bit, 2 bit, and 3 bit indices respectively. For these
>> new operands, we defined new parsing case branch and a new instruction
>> class. We also modified the existing reglane inserters and extractors
>> to handle the new operands. The lsb and width of these operands are
>> the same as many existing operands but the convention is to give
>> different names to fields that serve different purpose so we
>> introduced new fields in aarch64-opc.c and aarch64-opc.h for these
>> operands.
>>
>> For the second operand of these instructions, we introduced a new
>> operand called LVn_LUT. This represents a vector register list with
>> stride 1. We defined new inserter and extractor for this new operand and
>> it is encoded in FLD_Rn. We are enforcing the number of registers in the
>> reglist using opcode flag rather than operand flag as this is what other
>> SIMD vector register list operands are doing. The disassembly also uses
>> opcode flag to print the correct number of registers.
>> ---
>> Hi,
>>
>> Regression tested for aarch64-none-elf and found no regressions.
>>
>> Ok for binutils-master? I don't have commit access so can someone please
>> commit on my behalf?
>>
>> Regards,
>> Saurabh
>
>> diff --git a/gas/config/tc-aarch64.c b/gas/config/tc-aarch64.c
>> index 6ad4fae8b0ece71e5ac448be889846369c657420..bfba6efc6417e15887b0349c671e074e2238adc0 100644
>> --- a/gas/config/tc-aarch64.c
>> +++ b/gas/config/tc-aarch64.c
>> @@ -1513,6 +1513,54 @@ parse_vector_reg_list (char **ccp, aarch64_reg_type type,
>> return error ? PARSE_FAIL : (ret_val << 2) | (nb_regs - 1);
>> }
>>
>> +/* Parse a SIMD vector register with a bit index. The SIMD vectors with
>> + bit indices don't have type qualifiers.
>> +
>> + Return null if the string pointed to by *CCP is not a valid AdvSIMD
>> + vector register with a bit index.
>> +
>> + Otherwise return the register and the bit index information
>> + in *typeinfo.
>> +
>> + The validity of the bit index itself is checked separately in encoding.
>> + */
>> +
>> +static const reg_entry *
>> +parse_simd_vector_with_bit_index (char **ccp, struct vector_type_el *typeinfo)
>> +{
>> + char *str = *ccp;
>> + const reg_entry *reg = parse_reg (&str);
>> + struct vector_type_el atype;
>> +
>> + // Setting it here as this is the convention followed in the
>> + // rest of the code with indices.
>> + atype.defined = NTA_HASINDEX;
>> + // This will be set to correct value in parse_index_expressions.
>> + atype.index = 0;
>> + // The rest of the fields are not applicable for this operand.
>> + atype.type = NT_invtype;
>> + atype.width = -1;
>> + atype.element_size = 0;
>> +
>> + if (reg == NULL)
>> + return NULL;
>> +
>> + if (reg->type != REG_TYPE_V)
>> + return NULL;
>> +
>> + // Parse the bit index.
>> + if (!skip_past_char (&str, '['))
>> + return NULL;
>> + if (!parse_index_expression (&str, &atype.index))
>> + return NULL;
>> + if (!skip_past_char (&str, ']'))
>> + return NULL;
>> +
>> + *typeinfo = atype;
>> + *ccp = str;
>> + return reg;
>> +}
>> +
>> /* Directives: register aliases. */
>>
>> static reg_entry *
>> @@ -6761,6 +6809,23 @@ parse_operands (char *str, const aarch64_opcode *opcode)
>> reg_type = REG_TYPE_Z;
>> goto vector_reg_index;
>>
>> + case AARCH64_OPND_Em_INDEX1_14:
>> + case AARCH64_OPND_Em_INDEX2_13:
>> + case AARCH64_OPND_Em_INDEX3_12:
>> + // These are SIMD vector operands with bit indices. For example,
>> + // 'V27[3]'. These operands don't have type qualifiers before
>> + // indices.
>> + reg = parse_simd_vector_with_bit_index(&str, &vectype);
>> +
>> + if (!reg)
>> + goto failure;
>> + gas_assert (vectype.defined & NTA_HASINDEX);
>> +
>> + info->qualifier = AARCH64_OPND_QLF_NIL;
>> + info->reglane.regno = reg->number;
>> + info->reglane.index = vectype.index;
>> + break;
>> +
>
> Is the new function and separate handling necessary? There's already support
> in the section below for indexed operands without qualifiers on SVE registers.
> I tested removing the reg->type check from the below line, and nothing broke in
> the testsuite, so maybe that's an option.
>> if (reg->type == REG_TYPE_Z && vectype.type == NT_invtype)
>
> If that's not an option, could you move this block of code so it isn't in the
> middle of the vector_reg_index cases?
I tried removing the reg->type check but couldn't get it to work: that
code path seems to assume that the register has a qualifier. So I have
kept the new function and the separate handling, but I have moved the
block of code further down, as you suggested.
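
For illustration, here is a rough standalone sketch of the operand form
that the separate path accepts: a V register followed directly by a
bracketed index, for example 'V27[3]', with no type qualifier in between.
This is not the gas code. The function name is hypothetical, it does not
use parse_reg or parse_index_expression, and (unlike gas) it expects the
operand to be the whole string.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical standalone parser, not the gas implementation.
   Accepts "v<N>[<index>]" where N is 0-31 and no type qualifier sits
   between the register and the index.  The range of the index is not
   checked here; in the patch that check happens later, during
   constraint checking and encoding.  */
static bool
parse_unqualified_vreg_index (const char *s, unsigned *regno, unsigned *index)
{
  char *end;
  unsigned long reg, idx;

  assert (s != NULL && regno != NULL && index != NULL);

  if (*s != 'v' && *s != 'V')
    return false;
  s++;

  if (!isdigit ((unsigned char) *s))
    return false;
  reg = strtoul (s, &end, 10);
  if (reg > 31)
    return false;
  s = end;

  /* A qualified operand such as "v27.b[3]" is rejected here.  */
  if (*s != '[')
    return false;
  s++;

  if (!isdigit ((unsigned char) *s))
    return false;
  idx = strtoul (s, &end, 10);
  if (*end != ']' || end[1] != '\0')
    return false;

  *regno = (unsigned) reg;
  *index = (unsigned) idx;
  return true;
}
```

With this sketch, "v27[3]" yields register 27 and index 3, while
"v27.b[3]" and "v32[0]" are rejected.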
>
>> case AARCH64_OPND_Ed:
>> case AARCH64_OPND_En:
>> case AARCH64_OPND_Em:
>> @@ -6812,6 +6877,7 @@ parse_operands (char *str, const aarch64_opcode *opcode)
>> goto vector_reg_list;
>>
>> case AARCH64_OPND_LVn:
>> + case AARCH64_OPND_LVn_LUT:
>> case AARCH64_OPND_LVt:
>> case AARCH64_OPND_LVt_AL:
>> case AARCH64_OPND_LEt:
>> @@ -10477,6 +10543,7 @@ static const struct aarch64_option_cpu_value_table aarch64_features[] = {
>> {"rcpc3", AARCH64_FEATURE (RCPC3), AARCH64_FEATURE (RCPC2)},
>> {"cpa", AARCH64_FEATURE (CPA), AARCH64_NO_FEATURES},
>> {"faminmax", AARCH64_FEATURE (FAMINMAX), AARCH64_FEATURE (SIMD)},
>> + {"lut", AARCH64_FEATURE (LUT), AARCH64_FEATURE (SIMD)},
>> {NULL, AARCH64_NO_FEATURES, AARCH64_NO_FEATURES},
>> };
>>
> ...
>> diff --git a/include/opcode/aarch64.h b/include/opcode/aarch64.h
>> index 2fca9528c2012be983c2414a30fa5930e57e5c92..63456021a1d167c747ca913a355dd02cf90fc726 100644
>> --- a/include/opcode/aarch64.h
>> +++ b/include/opcode/aarch64.h
>> @@ -232,6 +232,8 @@ enum aarch64_feature_bit {
>> AARCH64_FEATURE_CPA,
>> /* FAMINMAX instructions. */
>> AARCH64_FEATURE_FAMINMAX,
>> + /* LUT instructions. */
>> + AARCH64_FEATURE_LUT,
>> AARCH64_NUM_FEATURES
>> };
>>
>> @@ -518,10 +520,14 @@ enum aarch64_opnd
>> AARCH64_OPND_Em, /* AdvSIMD Vector Element Vm. */
>> AARCH64_OPND_Em16, /* AdvSIMD Vector Element Vm restricted to V0 - V15 when
>> qualifier is S_H. */
>> + AARCH64_OPND_Em_INDEX1_14, /* AdvSIMD 1-bit encoded index in Vm at [14] */
>> + AARCH64_OPND_Em_INDEX2_13, /* AdvSIMD 2-bit encoded index in Vm at [14:13] */
>> + AARCH64_OPND_Em_INDEX3_12, /* AdvSIMD 3-bit encoded index in Vm at [14:12] */
>> AARCH64_OPND_LVn, /* AdvSIMD Vector register list used in e.g. TBL. */
>> AARCH64_OPND_LVt, /* AdvSIMD Vector register list used in ld/st. */
>> AARCH64_OPND_LVt_AL, /* AdvSIMD Vector register list for loading single
>> structure to all lanes. */
>> + AARCH64_OPND_LVn_LUT, /* AdvSIMD Vector register list used in lut. */
>> AARCH64_OPND_LEt, /* AdvSIMD Vector Element list. */
>>
>> AARCH64_OPND_CRn, /* Co-processor register in CRn field. */
>> @@ -1018,7 +1024,8 @@ enum aarch64_insn_class
>> the,
>> sve2_urqvs,
>> sve_index1,
>> - rcpc3
>> + rcpc3,
>> + lut
>> };
>>
>> /* Opcode enumerators. */
>> diff --git a/opcodes/aarch64-asm.h b/opcodes/aarch64-asm.h
>> index 88e389bfebda001efbb578a6e144dd5e2513cf78..edeb6d8de7e2c3e117e0ad91a02b93c0e040a061 100644
>> --- a/opcodes/aarch64-asm.h
>> +++ b/opcodes/aarch64-asm.h
>> @@ -47,6 +47,7 @@ AARCH64_DECL_OPD_INSERTER (ins_reglane);
>> AARCH64_DECL_OPD_INSERTER (ins_reglist);
>> AARCH64_DECL_OPD_INSERTER (ins_ldst_reglist);
>> AARCH64_DECL_OPD_INSERTER (ins_ldst_reglist_r);
>> +AARCH64_DECL_OPD_INSERTER (ins_lut_reglist);
>> AARCH64_DECL_OPD_INSERTER (ins_ldst_elemlist);
>> AARCH64_DECL_OPD_INSERTER (ins_advsimd_imm_shift);
>> AARCH64_DECL_OPD_INSERTER (ins_imm);
>> diff --git a/opcodes/aarch64-asm.c b/opcodes/aarch64-asm.c
>> index 5a55ca2f86db2d45b6cb54b5ee22606ec27c51fd..338ed54165d26cec2f0634bc62c1d7355ca4956a 100644
>> --- a/opcodes/aarch64-asm.c
>> +++ b/opcodes/aarch64-asm.c
>> @@ -168,6 +168,27 @@ aarch64_ins_reglane (const aarch64_operand *self, const aarch64_opnd_info *info,
>> assert (reglane_index < 4);
>> insert_field (FLD_SM3_imm2, code, reglane_index, 0);
>> }
>> + else if (inst->opcode->iclass == lut)
>> + {
>> + unsigned reglane_index = info->reglane.index;
>> + switch (info->type)
>> + {
>> + case AARCH64_OPND_Em_INDEX1_14:
>> + assert (reglane_index < 2);
>> + insert_field (FLD_imm1_14, code, reglane_index, 0);
>> + break;
>> + case AARCH64_OPND_Em_INDEX2_13:
>> + assert (reglane_index < 4);
>> + insert_field (FLD_imm2_13, code, reglane_index, 0);
>> + break;
>> + case AARCH64_OPND_Em_INDEX3_12:
>> + assert (reglane_index < 8);
>> + insert_field (FLD_imm3_12, code, reglane_index, 0);
>> + break;
>> + default:
>> + return false;
>> + }
>> + }
>> else
>> {
>> /* index for e.g. SQDMLAL <Va><d>, <Vb><n>, <Vm>.<Ts>[<index>]
>> @@ -286,6 +307,17 @@ aarch64_ins_ldst_reglist_r (const aarch64_operand *self ATTRIBUTE_UNUSED,
>> return true;
>> }
>>
>> +/* Insert regnos of register list operand for AdvSIMD lut instructions. */
>> +bool
>> +aarch64_ins_lut_reglist (const aarch64_operand *self, const aarch64_opnd_info *info,
>> + aarch64_insn *code,
>> + const aarch64_inst *inst ATTRIBUTE_UNUSED,
>> + aarch64_operand_error *errors ATTRIBUTE_UNUSED)
>> +{
>> + insert_field (self->fields[0], code, info->reglist.first_regno, 0);
>> + return true;
>> +}
>> +
>> /* Insert Q, opcode<2:1>, S, size and Rt fields for a register element list
>> operand e.g. Vt in AdvSIMD load/store single element instructions. */
>> bool
>> diff --git a/opcodes/aarch64-dis.h b/opcodes/aarch64-dis.h
>> index 86494cc30937b1d7e4caf90630caec30c8b31d3e..9e8f7c214d70390a72f93e38655a5ac0f562d085 100644
>> --- a/opcodes/aarch64-dis.h
>> +++ b/opcodes/aarch64-dis.h
>> @@ -70,6 +70,7 @@ AARCH64_DECL_OPD_EXTRACTOR (ext_reglane);
>> AARCH64_DECL_OPD_EXTRACTOR (ext_reglist);
>> AARCH64_DECL_OPD_EXTRACTOR (ext_ldst_reglist);
>> AARCH64_DECL_OPD_EXTRACTOR (ext_ldst_reglist_r);
>> +AARCH64_DECL_OPD_EXTRACTOR (ext_lut_reglist);
>> AARCH64_DECL_OPD_EXTRACTOR (ext_ldst_elemlist);
>> AARCH64_DECL_OPD_EXTRACTOR (ext_advsimd_imm_shift);
>> AARCH64_DECL_OPD_EXTRACTOR (ext_shll_imm);
>> diff --git a/opcodes/aarch64-dis.c b/opcodes/aarch64-dis.c
>> index 96f42ae862a395bf3aa498c495fdcea9a3d12a41..130d2c1fae005c25a4615a88190b62ffd059cdb1 100644
>> --- a/opcodes/aarch64-dis.c
>> +++ b/opcodes/aarch64-dis.c
>> @@ -398,6 +398,23 @@ aarch64_ext_reglane (const aarch64_operand *self, aarch64_opnd_info *info,
>> /* index for e.g. SM3TT2A <Vd>.4S, <Vn>.4S, <Vm>S[<imm2>]. */
>> info->reglane.index = extract_field (FLD_SM3_imm2, code, 0);
>> }
>> + else if (inst->opcode->iclass == lut)
>> + {
>> + switch (info->type)
>> + {
>> + case AARCH64_OPND_Em_INDEX1_14:
>> + info->reglane.index = extract_field (FLD_imm1_14, code, 0);
>> + break;
>> + case AARCH64_OPND_Em_INDEX2_13:
>> + info->reglane.index = extract_field (FLD_imm2_13, code, 0);
>> + break;
>> + case AARCH64_OPND_Em_INDEX3_12:
>> + info->reglane.index = extract_field (FLD_imm3_12, code, 0);
>> + break;
>> + default:
>> + return false;
>> + }
>> + }
>> else
>> {
>> /* Index only for e.g. SQDMLAL <Va><d>, <Vb><n>, <Vm>.<Ts>[<index>]
>> @@ -533,6 +550,21 @@ aarch64_ext_ldst_reglist_r (const aarch64_operand *self ATTRIBUTE_UNUSED,
>> return true;
>> }
>>
>> +/* Decode AdvSIMD vector register list for AdvSIMD lut instructions.
>> + The number of registers in the list is determined by the opcode
>> + flag. */
>> +bool
>> +aarch64_ext_lut_reglist (const aarch64_operand *self, aarch64_opnd_info *info,
>> + const aarch64_insn code,
>> + const aarch64_inst *inst ATTRIBUTE_UNUSED,
>> + aarch64_operand_error *errors ATTRIBUTE_UNUSED)
>> +{
>> + info->reglist.first_regno = extract_field (self->fields[0], code, 0);
>> + info->reglist.num_regs = get_opcode_dependent_value (inst->opcode);
>> + info->reglist.stride = 1;
>> + return true;
>> +}
>> +
>> /* Decode Q, opcode<2:1>, S, size and Rt fields of Vt in AdvSIMD
>> load/store single element instructions. */
>> bool
>> diff --git a/opcodes/aarch64-opc.h b/opcodes/aarch64-opc.h
>> index 4e781f000cc38c12058530e5851b08083d42af52..23e634f1250de579661bbeb14d611b868b76bc8d 100644
>> --- a/opcodes/aarch64-opc.h
>> +++ b/opcodes/aarch64-opc.h
>> @@ -147,6 +147,7 @@ enum aarch64_field_kind
>> FLD_imm1_2,
>> FLD_imm1_8,
>> FLD_imm1_10,
>> + FLD_imm1_14,
>> FLD_imm1_15,
>> FLD_imm1_16,
>> FLD_imm2_0,
>> @@ -154,6 +155,7 @@ enum aarch64_field_kind
>> FLD_imm2_8,
>> FLD_imm2_10,
>> FLD_imm2_12,
>> + FLD_imm2_13,
>> FLD_imm2_15,
>> FLD_imm2_16,
>> FLD_imm2_19,
>> diff --git a/opcodes/aarch64-opc.c b/opcodes/aarch64-opc.c
>> index e88c616f4a9f3657756b919dc1196c08831c3cc5..61ab4c14a6393150f29a3fa1679a30b642bf8844 100644
>> --- a/opcodes/aarch64-opc.c
>> +++ b/opcodes/aarch64-opc.c
>> @@ -337,6 +337,7 @@ const aarch64_field fields[] =
>> { 2, 1 }, /* imm1_2: general immediate in bits [2]. */
>> { 8, 1 }, /* imm1_8: general immediate in bits [8]. */
>> { 10, 1 }, /* imm1_10: general immediate in bits [10]. */
>> + { 14, 1 }, /* imm1_14: general immediate in bits [14]. */
>> { 15, 1 }, /* imm1_15: general immediate in bits [15]. */
>> { 16, 1 }, /* imm1_16: general immediate in bits [16]. */
>> { 0, 2 }, /* imm2_0: general immediate in bits [1:0]. */
>> @@ -344,6 +345,7 @@ const aarch64_field fields[] =
>> { 8, 2 }, /* imm2_8: general immediate in bits [9:8]. */
>> { 10, 2 }, /* imm2_10: 2-bit immediate, bits [11:10] */
>> { 12, 2 }, /* imm2_12: 2-bit immediate, bits [13:12] */
>> + { 13, 2 }, /* imm2_13: 2-bit immediate, bits [14:13] */
>> { 15, 2 }, /* imm2_15: 2-bit immediate, bits [16:15] */
>> { 16, 2 }, /* imm2_16: 2-bit immediate, bits [17:16] */
>> { 19, 2 }, /* imm2_19: 2-bit immediate, bits [20:19] */
>> @@ -2554,6 +2556,10 @@ operand_general_constraint_met_p (const aarch64_opnd_info *opnds, int idx,
>> num = get_opcode_dependent_value (opcode);
>> switch (type)
>> {
>> + case AARCH64_OPND_LVn_LUT:
>> + if (!check_reglist (opnd, mismatch_detail, idx, num, 1))
>> + return 0;
>> + break;
>> case AARCH64_OPND_LVt:
>> assert (num >= 1 && num <= 4);
>> /* Unless LD1/ST1, the number of registers should be equal to that
>> @@ -3165,6 +3171,14 @@ operand_general_constraint_met_p (const aarch64_opnd_info *opnds, int idx,
>> and is halfed because complex numbers take two elements. */
>> num = aarch64_get_qualifier_nelem (opnds[0].qualifier)
>> * aarch64_get_qualifier_esize (opnds[0].qualifier) / 2;
>> + else if (opcode->iclass == lut)
>> + {
>> + size = get_operand_fields_width (get_operand_from_code (type)) - 5;
>> + if (!check_reglane (opnd, mismatch_detail, idx, "v", 0, 31,
>> + 0, (1 << size) - 1))
>> + return 0;
>> + break;
>> + }
>> else
>> num = 16;
>> num = num / aarch64_get_qualifier_esize (qualifier) - 1;
>> @@ -4069,6 +4083,14 @@ aarch64_print_operand (char *buf, size_t size, bfd_vma pc,
>> style_imm (styler, "%" PRIi64, opnd->reglane.index));
>> break;
>>
>> + case AARCH64_OPND_Em_INDEX1_14:
>> + case AARCH64_OPND_Em_INDEX2_13:
>> + case AARCH64_OPND_Em_INDEX3_12:
>> + snprintf (buf, size, "%s[%s]",
>> + style_reg (styler, "v%d", opnd->reglane.regno),
>> + style_imm (styler, "%" PRIi64, opnd->reglane.index));
>> + break;
>> +
>> case AARCH64_OPND_VdD1:
>> case AARCH64_OPND_VnD1:
>> snprintf (buf, size, "%s[%s]",
>> @@ -4077,6 +4099,7 @@ aarch64_print_operand (char *buf, size_t size, bfd_vma pc,
>> break;
>>
>> case AARCH64_OPND_LVn:
>> + case AARCH64_OPND_LVn_LUT:
>> case AARCH64_OPND_LVt:
>> case AARCH64_OPND_LVt_AL:
>> case AARCH64_OPND_LEt:
>> diff --git a/opcodes/aarch64-tbl.h b/opcodes/aarch64-tbl.h
>> index 5b1c8561ac6147e64ba99b6e9fba85ed8ee712c4..6d7aa3d770ad34071bfe67f95d974eaa7b6cdbbd 100644
>> --- a/opcodes/aarch64-tbl.h
>> +++ b/opcodes/aarch64-tbl.h
>> @@ -1004,6 +1004,24 @@
>> QLF3(V_16B, V_16B, V_16B), \
>> }
>>
>> +/* e.g. luti2 <Vd>.16B, { <Vn>.16B }, <Vm>[index]. */
>> +/* The third operand is an AdvSIMD vector with a bit index
>> + and without a type qualifier and is checked separately
>> + based on operand enum. */
>> +#define QL_VVUB \
>> +{ \
>> + QLF3(V_16B , V_16B , NIL), \
>> +}
>> +
>> +/* e.g. luti2 <Vd>.8H, { <Vn>.8H }, <Vm>[index]. */
>> +/* The third operand is an AdvSIMD vector with a bit index
>> + and without a type qualifier and is checked separately
>> + based on operand enum. */
>> +#define QL_VVUH \
>> +{ \
>> + QLF3(V_8H , V_8H , NIL), \
>> +}
>> +
>> /* e.g. EXT <Vd>.<T>, <Vn>.<T>, <Vm>.<T>, #<index>. */
>> #define QL_VEXT \
>> { \
>> @@ -2669,6 +2687,8 @@ static const aarch64_feature_set aarch64_feature_faminmax_sve2 =
>> AARCH64_FEATURES (2, FAMINMAX, SVE2);
>> static const aarch64_feature_set aarch64_feature_faminmax_sme2 =
>> AARCH64_FEATURES (3, SVE2, FAMINMAX, SME2);
>> +static const aarch64_feature_set aarch64_feature_lut =
>> + AARCH64_FEATURE (LUT);
>>
>> #define CORE &aarch64_feature_v8
>> #define FP &aarch64_feature_fp
>> @@ -2740,6 +2760,7 @@ static const aarch64_feature_set aarch64_feature_faminmax_sme2 =
>> #define FAMINMAX &aarch64_feature_faminmax
>> #define FAMINMAX_SVE2 &aarch64_feature_faminmax_sve2
>> #define FAMINMAX_SME2 &aarch64_feature_faminmax_sme2
>> +#define LUT &aarch64_feature_lut
>>
>> #define CORE_INSN(NAME,OPCODE,MASK,CLASS,OP,OPS,QUALS,FLAGS) \
>> { NAME, OPCODE, MASK, CLASS, OP, CORE, OPS, QUALS, FLAGS, 0, 0, NULL }
>> @@ -2925,6 +2946,8 @@ static const aarch64_feature_set aarch64_feature_faminmax_sme2 =
>> #define FAMINMAX_SME2_INSN(NAME,OPCODE,MASK,OPS,QUALS) \
>> { NAME, OPCODE, MASK, sme_size_22_hsd, 0, FAMINMAX_SME2, OPS, QUALS, \
>> F_STRICT | 0, 0, 1, NULL }
>> +#define LUT_INSN(NAME,OPCODE,MASK,OPS,QUALS,FLAGS) \
>> + { NAME, OPCODE, MASK, lut, 0, LUT, OPS, QUALS, FLAGS, 0, 0, NULL }
>>
>> #define MOPS_CPY_OP1_OP2_PME_INSN(NAME, OPCODE, MASK, FLAGS, CONSTRAINTS) \
>> MOPS_INSN (NAME, OPCODE, MASK, 0, \
>> @@ -4275,6 +4298,11 @@ const struct aarch64_opcode aarch64_opcode_table[] =
>> FAMINMAX_SME2_INSN ("famax", 0xc120b940, 0xff23ffe3, OP3 (SME_Zdnx4, SME_Zdnx4, SME_Zmx4), OP_SVE_VVV_HSD),
>> FAMINMAX_SME2_INSN ("famin", 0xc120b141, 0xff21ffe1, OP3 (SME_Zdnx2, SME_Zdnx2, SME_Zmx2), OP_SVE_VVV_HSD),
>> FAMINMAX_SME2_INSN ("famin", 0xc120b941, 0xff23ffe3, OP3 (SME_Zdnx4, SME_Zdnx4, SME_Zmx4), OP_SVE_VVV_HSD),
>> + /* AdvSIMD lut. */
>> + LUT_INSN ("luti2", 0x4e801000, 0xffe09c00, OP3 (Vd, LVn_LUT, Em_INDEX2_13), QL_VVUB, F_OD(1)),
>> + LUT_INSN ("luti2", 0x4ec00000, 0xffe08c00, OP3 (Vd, LVn_LUT, Em_INDEX3_12), QL_VVUH, F_OD(1)),
>> + LUT_INSN ("luti4", 0x4e402000, 0xffe0bc00, OP3 (Vd, LVn_LUT, Em_INDEX1_14), QL_VVUB, F_OD(1)),
>> + LUT_INSN ("luti4", 0x4e401000, 0xffe09c00, OP3 (Vd, LVn_LUT, Em_INDEX2_13), QL_VVUH, F_OD(2)),
>> /* Move wide (immediate). */
>> CORE_INSN ("movn", 0x12800000, 0x7f800000, movewide, OP_MOVN, OP2 (Rd, HALF), QL_DST_R, F_SF | F_HAS_ALIAS),
>> CORE_INSN ("mov", 0x12800000, 0x7f800000, movewide, OP_MOV_IMM_WIDEN, OP2 (Rd, IMM_MOV), QL_DST_R, F_SF | F_ALIAS | F_CONV),
>> @@ -6531,12 +6559,20 @@ const struct aarch64_opcode aarch64_opcode_table[] =
>> "a SIMD vector element") \
>> Y(SIMD_ELEMENT, reglane, "Em16", 0, F(FLD_Rm), \
>> "a SIMD vector element limited to V0-V15") \
>> + Y(SIMD_ELEMENT, reglane, "Em_INDEX1_14", 0, F(FLD_Rm, FLD_imm1_14), \
>> + "a SIMD vector without a type qualifier encoding a bit index") \
>> + Y(SIMD_ELEMENT, reglane, "Em_INDEX2_13", 0, F(FLD_Rm, FLD_imm2_13), \
>> + "a SIMD vector without a type qualifier encoding a bit index") \
>> + Y(SIMD_ELEMENT, reglane, "Em_INDEX3_12", 0, F(FLD_Rm, FLD_imm3_12), \
>> + "a SIMD vector without a type qualifier encoding a bit index") \
>
> I think this is a better fit for simple_index (instead of reglane). See also
> how the existing SME luti operands work. Unless I've missed something, using
> simple_index would mean that you don't need to edit the inserter or extractor
> functions.
>
Yes, you are right. I switched to simple_index and it worked. I have
also removed the references to the added inserters and extractors from
the cover letter. The new version of this patch is here:
https://sourceware.org/pipermail/binutils/2024-May/134230.html
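
To show why a generic "register field plus index field" scheme is enough
for these operands, here is a small standalone sketch. It is not the
opcodes code: the struct, the function names and the fld_* constants are
hypothetical stand-ins. The lsb/width pairs for the index fields and the
luti2 opcode and mask are the ones quoted in this patch; the Rm field is
assumed to be the usual 5 bits at [20:16].

```c
#include <assert.h>
#include <stdint.h>

/* A bit-field inside a 32-bit instruction word.  */
struct field
{
  unsigned lsb;
  unsigned width;
};

/* Hypothetical stand-ins for FLD_Rm and the three index fields.  */
static const struct field fld_rm = { 16, 5 };
static const struct field fld_imm1_14 = { 14, 1 };
static const struct field fld_imm2_13 = { 13, 2 };
static const struct field fld_imm3_12 = { 12, 3 };

/* Place VALUE into field F of CODE, replacing any bits already there.  */
static uint32_t
insert_bits (uint32_t code, struct field f, uint32_t value)
{
  uint32_t mask = (UINT32_C (1) << f.width) - 1;

  assert (f.width > 0 && f.width < 32 && value <= mask);
  return (code & ~(mask << f.lsb)) | (value << f.lsb);
}

/* Read field F back out of CODE.  */
static uint32_t
extract_bits (uint32_t code, struct field f)
{
  return (code >> f.lsb) & ((UINT32_C (1) << f.width) - 1);
}

/* Generic encoder: the first field takes the register number and the
   second takes the index, so no per-operand switch is needed.  */
static uint32_t
encode_reg_index (uint32_t code, struct field reg, struct field idx,
                  unsigned regno, unsigned index)
{
  code = insert_bits (code, reg, regno);
  return insert_bits (code, idx, index);
}

/* Largest valid index: total operand width minus the 5 register bits
   gives the index width, as in the constraint check in this patch.  */
static unsigned
max_index (struct field reg, struct field idx)
{
  unsigned size = reg.width + idx.width - 5;

  return (1u << size) - 1;
}
```

For example, starting from the luti2 (16B) opcode 0x4e801000, encoding
register 3 with index 1 through fld_rm and fld_imm2_13 sets only bits
that are clear in the mask 0xffe09c00, so the result still matches that
table entry.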
>> Y(SIMD_REGLIST, reglist, "LVn", 0, F(FLD_Rn), \
>> "a SIMD vector register list") \
>> Y(SIMD_REGLIST, ldst_reglist, "LVt", 0, F(), \
>> "a SIMD vector register list") \
>> Y(SIMD_REGLIST, ldst_reglist_r, "LVt_AL", 0, F(), \
>> "a SIMD vector register list") \
>> + Y(SIMD_REGLIST, lut_reglist, "LVn_LUT", 0, F(FLD_Rn), \
>> + "a SIMD vector register list") \
>> Y(SIMD_REGLIST, ldst_elemlist, "LEt", 0, F(), \
>> "a SIMD vector element list") \
>> Y(IMMEDIATE, imm, "CRn", 0, F(FLD_CRn), \
>