This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.
RE: [PATCH 3/4] Arm64: correct {su,us}dot SIMD encodings
- From: Tamar Christina <Tamar dot Christina at arm dot com>
- To: Jan Beulich <JBeulich at suse dot com>, "binutils at sourceware dot org" <binutils at sourceware dot org>
- Cc: Marcus Shawcroft <Marcus dot Shawcroft at arm dot com>, Mihail Ionescu <Mihail dot Ionescu at arm dot com>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>, nd <nd at arm dot com>
- Date: Mon, 30 Dec 2019 11:58:09 +0000
- Subject: RE: [PATCH 3/4] Arm64: correct {su,us}dot SIMD encodings
- References: <37213fea-ae2e-0293-a042-9db2274cd061@suse.com> <48d49025-1f28-5238-9ea0-1181c22b403c@suse.com>
Hi Jan,
Thanks! As with the rest of the series, this one looks OK to me too, but you still need a maintainer to approve it.
Cheers,
Tamar
> -----Original Message-----
> From: binutils-owner@sourceware.org <binutils-owner@sourceware.org>
> On Behalf Of Jan Beulich
> Sent: Friday, December 27, 2019 10:40
> To: binutils@sourceware.org
> Cc: Marcus Shawcroft <Marcus.Shawcroft@arm.com>; Mihail Ionescu
> <Mihail.Ionescu@arm.com>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>
> Subject: [PATCH 3/4] Arm64: correct {su,us}dot SIMD encodings
>
> According to the specification, these instructions permit the Q bit to
> control the vector length operated on, so this bit should not already be
> set in the opcode table entries (it instead needs to be set dynamically).
> Note that the test case's expected output also did not match its input.
> Besides correcting the test case, also extend it to cover both forms.
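
To illustrate the Q bit point, here is a minimal sketch in C. It assumes the
three-register vector form of USDOT with base opcode 0x0e809c00 (the value
this patch puts in the table), Rm in bits 20:16, Rn in bits 9:5, Rd in bits
4:0, and Q in bit 30. The helper name encode_usdot_vec is hypothetical and is
not part of binutils; the real assembler derives Q from the operand
qualifiers (via F_SIZEQ) rather than from a flag argument.

```c
#include <stdint.h>

/* Bit 30 (Q) selects the 128-bit form (.4s/.16b) over the 64-bit
   form (.2s/.8b).  */
#define Q_BIT          (UINT32_C(1) << 30)

/* Base opcode with Q clear, as in the corrected table entry.  */
#define USDOT_VEC_BASE UINT32_C(0x0e809c00)

/* Hypothetical helper, for illustration only: build the encoding of
   "usdot Vd, Vn, Vm" and set Q dynamically from the requested width.  */
static uint32_t
encode_usdot_vec (unsigned rd, unsigned rn, unsigned rm, int is_128bit)
{
  uint32_t insn = USDOT_VEC_BASE;

  insn |= (uint32_t) (rm & 0x1f) << 16;  /* Rm, bits 20:16 */
  insn |= (uint32_t) (rn & 0x1f) << 5;   /* Rn, bits 9:5 */
  insn |= (uint32_t) (rd & 0x1f);        /* Rd, bits 4:0 */

  if (is_128bit)
    insn |= Q_BIT;

  return insn;
}
```

With these assumptions, encode_usdot_vec (17, 21, 27, 0) gives 0x0e9b9eb1 and
encode_usdot_vec (17, 21, 27, 1) gives 0x4e9b9eb1, which are the two encodings
the adjusted i8mm.d expects for v17/v21/v27.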
>
> gas/
> 2020-01-XX Jan Beulich <jbeulich@suse.com>
>
> * testsuite/gas/aarch64/i8mm.s: Add 128-bit form tests for
> by-element usdot. Add 64-bit form tests for by-element sudot.
> * testsuite/gas/aarch64/i8mm.d: Adjust expectations.
>
> opcodes/
> 2020-01-XX Jan Beulich <jbeulich@suse.com>
>
> * aarch64-tbl.h (aarch64_opcode_table): Correct SIMD
> forms of SUDOT and USDOT.
>
> --- a/gas/testsuite/gas/aarch64/i8mm.d
> +++ b/gas/testsuite/gas/aarch64/i8mm.d
> @@ -29,15 +29,23 @@ Disassembly of section \.text:
> *[0-9a-f]+: 6e80a400 ummla v0\.4s, v0\.16b, v0\.16b
> *[0-9a-f]+: 4e80ac00 usmmla v0\.4s, v0\.16b, v0\.16b
> *[0-9a-f]+: 4e9baeb1 usmmla v17\.4s, v21\.16b, v27\.16b
> - *[0-9a-f]+: 4e9b9eb1 usdot v17\.2s, v21\.8b, v27\.8b
> - *[0-9a-f]+: 4e809c00 usdot v0\.2s, v0\.8b, v0\.8b
> - *[0-9a-f]+: 4e9b9eb1 usdot v17\.2s, v21\.8b, v27\.8b
> - *[0-9a-f]+: 4e809c00 usdot v0\.2s, v0\.8b, v0\.8b
> - *[0-9a-f]+: 4fbbfab1 usdot v17\.2s, v21\.8b, v27\.4b\[3\]
> - *[0-9a-f]+: 4fa0f800 usdot v0\.2s, v0\.8b, v0\.4b\[3\]
> - *[0-9a-f]+: 4f9bf2b1 usdot v17\.2s, v21\.8b, v27\.4b\[0\]
> - *[0-9a-f]+: 4f80f000 usdot v0\.2s, v0\.8b, v0\.4b\[0\]
> - *[0-9a-f]+: 4f3bfab1 sudot v17\.2s, v21\.8b, v27\.4b\[3\]
> - *[0-9a-f]+: 4f20f800 sudot v0\.2s, v0\.8b, v0\.4b\[3\]
> - *[0-9a-f]+: 4f1bf2b1 sudot v17\.2s, v21\.8b, v27\.4b\[0\]
> - *[0-9a-f]+: 4f00f000 sudot v0\.2s, v0\.8b, v0\.4b\[0\]
> + *[0-9a-f]+: 0e9b9eb1 usdot v17\.2s, v21\.8b, v27\.8b
> + *[0-9a-f]+: 0e809c00 usdot v0\.2s, v0\.8b, v0\.8b
> + *[0-9a-f]+: 4e9b9eb1 usdot v17\.4s, v21\.16b, v27\.16b
> + *[0-9a-f]+: 4e809c00 usdot v0\.4s, v0\.16b, v0\.16b
> + *[0-9a-f]+: 0fbbfab1 usdot v17\.2s, v21\.8b, v27\.4b\[3\]
> + *[0-9a-f]+: 0fa0f800 usdot v0\.2s, v0\.8b, v0\.4b\[3\]
> + *[0-9a-f]+: 0f9bf2b1 usdot v17\.2s, v21\.8b, v27\.4b\[0\]
> + *[0-9a-f]+: 0f80f000 usdot v0\.2s, v0\.8b, v0\.4b\[0\]
> + *[0-9a-f]+: 4fbbfab1 usdot v17\.4s, v21\.16b, v27\.4b\[3\]
> + *[0-9a-f]+: 4fa0f800 usdot v0\.4s, v0\.16b, v0\.4b\[3\]
> + *[0-9a-f]+: 4f9bf2b1 usdot v17\.4s, v21\.16b, v27\.4b\[0\]
> + *[0-9a-f]+: 4f80f000 usdot v0\.4s, v0\.16b, v0\.4b\[0\]
> + *[0-9a-f]+: 0f3bfab1 sudot v17\.2s, v21\.8b, v27\.4b\[3\]
> + *[0-9a-f]+: 0f20f800 sudot v0\.2s, v0\.8b, v0\.4b\[3\]
> + *[0-9a-f]+: 0f1bf2b1 sudot v17\.2s, v21\.8b, v27\.4b\[0\]
> + *[0-9a-f]+: 0f00f000 sudot v0\.2s, v0\.8b, v0\.4b\[0\]
> + *[0-9a-f]+: 4f3bfab1 sudot v17\.4s, v21\.16b, v27\.4b\[3\]
> + *[0-9a-f]+: 4f20f800 sudot v0\.4s, v0\.16b, v0\.4b\[3\]
> + *[0-9a-f]+: 4f1bf2b1 sudot v17\.4s, v21\.16b, v27\.4b\[0\]
> + *[0-9a-f]+: 4f00f000 sudot v0\.4s, v0\.16b, v0\.4b\[0\]
> --- a/gas/testsuite/gas/aarch64/i8mm.s
> +++ b/gas/testsuite/gas/aarch64/i8mm.s
> @@ -49,7 +49,15 @@ usdot v17.2s, v21.8b, v27.4b[3]
> usdot v0.2s, v0.8b, v0.4b[3]
> usdot v17.2s, v21.8b, v27.4b[0]
> usdot v0.2s, v0.8b, v0.4b[0]
> +usdot v17.4s, v21.16b, v27.4b[3]
> +usdot v0.4s, v0.16b, v0.4b[3]
> +usdot v17.4s, v21.16b, v27.4b[0]
> +usdot v0.4s, v0.16b, v0.4b[0]
>
> +sudot v17.2s, v21.8b, v27.4b[3]
> +sudot v0.2s, v0.8b, v0.4b[3]
> +sudot v17.2s, v21.8b, v27.4b[0]
> +sudot v0.2s, v0.8b, v0.4b[0]
> sudot v17.4s, v21.16b, v27.4b[3]
> sudot v0.4s, v0.16b, v0.4b[3]
> sudot v17.4s, v21.16b, v27.4b[0]
> --- a/opcodes/aarch64-tbl.h
> +++ b/opcodes/aarch64-tbl.h
> @@ -5092,9 +5092,9 @@ struct aarch64_opcode aarch64_opcode_tab
> INT8MATMUL_INSN ("smmla", 0x4e80a400, 0xffe0fc00, aarch64_misc, OP3 (Vd, Vn, Vm), QL_MMLA64, 0),
> INT8MATMUL_INSN ("ummla", 0x6e80a400, 0xffe0fc00, aarch64_misc, OP3 (Vd, Vn, Vm), QL_MMLA64, 0),
> INT8MATMUL_INSN ("usmmla", 0x4e80ac00, 0xffe0fc00, aarch64_misc, OP3 (Vd, Vn, Vm), QL_MMLA64, 0),
> - INT8MATMUL_INSN ("usdot", 0x4e809c00, 0xffe0fc00, aarch64_misc, OP3 (Vd, Vn, Vm), QL_V3DOT, F_SIZEQ),
> - INT8MATMUL_INSN ("usdot", 0x4f80f000, 0xffc0f400, dotproduct, OP3 (Vd, Vn, Em), QL_V2DOT, F_SIZEQ),
> - INT8MATMUL_INSN ("sudot", 0x4f00f000, 0xffc0f400, dotproduct, OP3 (Vd, Vn, Em), QL_V2DOT, F_SIZEQ),
> + INT8MATMUL_INSN ("usdot", 0x0e809c00, 0xbfe0fc00, aarch64_misc, OP3 (Vd, Vn, Vm), QL_V3DOT, F_SIZEQ),
> + INT8MATMUL_INSN ("usdot", 0x0f80f000, 0xbfc0f400, dotproduct, OP3 (Vd, Vn, Em), QL_V2DOT, F_SIZEQ),
> + INT8MATMUL_INSN ("sudot", 0x0f00f000, 0xbfc0f400, dotproduct, OP3 (Vd, Vn, Em), QL_V2DOT, F_SIZEQ),
>
> /* BFloat instructions. */
> BFLOAT16_SVE_INSNC ("bfdot", 0x64608000, 0xffe0fc00, sve_misc, OP3 (SVE_Zd, SVE_Zn, SVE_Zm_16), OP_SVE_SHH, 0, C_SCAN_MOVPRFX, 0),
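
The effect of the opcode and mask change in the aarch64-tbl.h hunk can be
sketched as follows. This assumes the usual conceptual rule that an
instruction word matches a table entry when (insn & mask) == opcode; the
actual binutils decoder is generated from the table and is more involved. The
struct and function names here are hypothetical stand-ins, not the real
struct aarch64_opcode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for one opcode table entry.  */
struct entry
{
  uint32_t opcode;
  uint32_t mask;
};

/* Conceptual match rule: only bits selected by the mask are compared.  */
static bool
entry_matches (const struct entry *e, uint32_t insn)
{
  return (insn & e->mask) == e->opcode;
}

/* Three-register USDOT before the patch: Q (bit 30) is part of the mask
   and hard-wired to 1 in the opcode, so only the 128-bit encoding is
   recognised.  */
static const struct entry usdot_old = { 0x4e809c00, 0xffe0fc00 };

/* After the patch: bit 30 is cleared in both opcode and mask, so the
   entry covers Q=0 and Q=1 alike.  */
static const struct entry usdot_new = { 0x0e809c00, 0xbfe0fc00 };
```

Under this rule the old entry matches 0x4e9b9eb1 but not 0x0e9b9eb1, while the
new entry matches both, which is consistent with the 64-bit and 128-bit
expectations in the updated i8mm.d.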