Bug 24348 - GNU (g)as is confusing about vmovdqu mnemonics
Summary: GNU (g)as is confusing about vmovdqu mnemonics
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: gas
Version: 2.30
Importance: P2 normal
Target Milestone: 2.33
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-03-15 13:40 UTC by Hendrik Greving
Modified: 2019-03-18 01:52 UTC

See Also:
Host: Linux version 4.19.20-1rodete1-amd64
Target: x86/-march=skylake-avx512
Build:
Last reconfirmed: 2019-03-15 00:00:00


Attachments
A patch (5.14 KB, patch)
2019-03-15 22:52 UTC, H.J. Lu

Description Hendrik Greving 2019-03-15 13:40:10 UTC
1) vmovdqu %ymm0, %ymm1
as --64 -o test.o test.s
Assembles ok.

2) vmovdqu %ymm0, %ymm16
as --64 -o test.o test.s
test.s: Assembler messages:
test.s:1: Error: unsupported instruction `vmovdqu'

Case 2) requires a vmovdqu<8/16/32/64> mnemonic. I understand that vmovdqu is the VEX version, while vmovdqu<8/16/32/64> encodes as EVEX. I also understand that 2) requires EVEX. However, I don't see a reason why 2) could not default to one version of vmovdqu<8/16/32/64> with writemask k0. If it doesn't, the consequence is that inline asm, e.g. written in C, needs to use vmovdqu for ymm <= 15 and vmovdqu32 for ymm > 15. This is inconvenient, e.g. for macros.

#ifdef __AVX__
vmovdqu ymm0, []
[..]
vmovdqu ymm15, []
#ifdef __AVX512F__
vmovdqu32 ymm16, []
[..]
vmovdqu32 ymm31, []
#endif
#endif
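
To make the inconvenience concrete, here is a minimal C sketch of the per-register mnemonic selection that a code generator or macro layer has to carry around with gas 2.30. The helper names `ymm_move_mnemonic` and `format_ymm_move` are hypothetical and not part of binutils or any library; the rule they encode is only the one described above (vmovdqu for ymm0-ymm15, vmovdqu32 for ymm16-ymm31).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a binutils API): returns the move mnemonic that
   gas 2.30 accepts for a given ymm register index. */
static const char *ymm_move_mnemonic(int regno)
{
    assert(regno >= 0 && regno <= 31);
    /* ymm0-ymm15 work with the VEX-encoded vmovdqu;
       ymm16-ymm31 need an EVEX mnemonic such as vmovdqu32. */
    return regno <= 15 ? "vmovdqu" : "vmovdqu32";
}

/* Hypothetical helper: formats a register-to-register move in AT&T syntax.
   The mnemonic is chosen from the higher of the two register indices. */
static int format_ymm_move(char *buf, size_t len, int src, int dst)
{
    int highest = src > dst ? src : dst;
    return snprintf(buf, len, "%s %%ymm%d, %%ymm%d",
                    ymm_move_mnemonic(highest), src, dst);
}
```

Calling `format_ymm_move(buf, sizeof buf, 0, 16)` writes the vmovdqu32 form into `buf`, while the same call with registers 0 and 1 writes the plain vmovdqu form. Every caller has to go through this kind of switch.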

as --version
GNU assembler (GNU Binutils for Debian) 2.30
Copyright (C) 2018 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `x86_64-linux-gnu'.

Thanks in advance!
Comment 1 H.J. Lu 2019-03-15 22:52:57 UTC
Created attachment 11683
A patch

You can use

vmovdqu32 %reg, %reg

and pass -O2 or -Os to the assembler.  The assembler will then encode vmovdqu32 as
vmovdqu if possible.
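
As a rough illustration of this suggestion from the C side, the sketch below uses a single macro that always emits vmovdqu32 and leaves the choice of encoding to the assembler. The macro `YMM_MOVE` and the helper `same_mnemonic` are hypothetical names invented for this example. The doubled `%%` assumes the string is used as a GCC extended-asm template.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macro: asm template for a ymm register-to-register move.
   It always uses vmovdqu32, whatever the register numbers are. */
#define YMM_MOVE(src, dst) "vmovdqu32 %%ymm" #src ", %%ymm" #dst

static const char move_low[]  = YMM_MOVE(0, 1);   /* registers 0-15 only */
static const char move_high[] = YMM_MOVE(0, 16);  /* touches ymm16 */

/* Checks that two templates start with the same "vmovdqu32 " mnemonic,
   i.e. that no per-register mnemonic switch was needed. */
static int same_mnemonic(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strncmp(a, b, strlen("vmovdqu32 ")) == 0;
}
```

With an assembler that contains the attached patch, passing an optimization option through the compiler driver (for example `-Wa,-O2` with GCC) should let the low-register form be encoded as VEX, while the ymm16 form stays EVEX.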
Comment 2 cvs-commit@gcc.gnu.org 2019-03-18 00:59:45 UTC
The master branch has been updated by H.J. Lu <hjl@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=97ed31ae00ea83410f9daf61ece8a606044af365

commit 97ed31ae00ea83410f9daf61ece8a606044af365
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon Mar 18 08:56:10 2019 +0800

    x86: Optimize EVEX vector load/store instructions
    
    When there is no write mask, we can encode lower 16 128-bit/256-bit
    EVEX vector register load and store instructions as VEX vector register
    load and store instructions with -O1.
    
    gas/
    
    	PR gas/24348
    	* config/tc-i386.c (optimize_encoding): Encode 128-bit and
    	256-bit EVEX vector register load/store instructions as VEX
    	vector register load/store instructions for -O1.
    	* doc/c-i386.texi: Update -O1 documentation.
    	* testsuite/gas/i386/i386.exp: Run PR gas/24348 tests.
    	* testsuite/gas/i386/optimize-1.s: Add tests for EVEX vector
    	load/store instructions.
    	* testsuite/gas/i386/optimize-2.s: Likewise.
    	* testsuite/gas/i386/optimize-3.s: Likewise.
    	* testsuite/gas/i386/optimize-5.s: Likewise.
    	* testsuite/gas/i386/x86-64-optimize-2.s: Likewise.
    	* testsuite/gas/i386/x86-64-optimize-3.s: Likewise.
    	* testsuite/gas/i386/x86-64-optimize-4.s: Likewise.
    	* testsuite/gas/i386/x86-64-optimize-5.s: Likewise.
    	* testsuite/gas/i386/x86-64-optimize-6.s: Likewise.
    	* testsuite/gas/i386/optimize-1.d: Updated.
    	* testsuite/gas/i386/optimize-2.d: Likewise.
    	* testsuite/gas/i386/optimize-3.d: Likewise.
    	* testsuite/gas/i386/optimize-4.d: Likewise.
    	* testsuite/gas/i386/optimize-5.d: Likewise.
    	* testsuite/gas/i386/x86-64-optimize-2.d: Likewise.
    	* testsuite/gas/i386/x86-64-optimize-3.d: Likewise.
    	* testsuite/gas/i386/x86-64-optimize-4.d: Likewise.
    	* testsuite/gas/i386/x86-64-optimize-5.d: Likewise.
    	* testsuite/gas/i386/x86-64-optimize-6.d: Likewise.
    	* testsuite/gas/i386/optimize-7.d: New file.
    	* testsuite/gas/i386/optimize-7.s: Likewise.
    	* testsuite/gas/i386/x86-64-optimize-8.d: Likewise.
    	* testsuite/gas/i386/x86-64-optimize-8.s: Likewise.
    
    opcodes/
    
    	PR gas/24348
    	* i386-opc.tbl: Add Optimize to vmovdqa32, vmovdqa64, vmovdqu8,
    	vmovdqu16, vmovdqu32 and vmovdqu64.
    	* i386-tbl.h: Regenerated.
Comment 3 H.J. Lu 2019-03-18 01:52:57 UTC
Fixed for 2.33 with -O.
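
For readers who want the condition from the commit message in one place, it can be paraphrased as a small C predicate. This is only a sketch of the rule as stated in the commit log (no write mask, 128-bit or 256-bit vector, registers among the lower 16). It is not the code in `optimize_encoding`, and the function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate paraphrasing the commit message: may an EVEX
   vector load/store be emitted as VEX when optimization is enabled? */
static bool evex_move_can_use_vex(int vector_bits, bool has_write_mask,
                                  int highest_regno)
{
    assert(vector_bits == 128 || vector_bits == 256 || vector_bits == 512);
    assert(highest_regno >= 0 && highest_regno <= 31);

    return !has_write_mask                  /* no write mask in use */
        && vector_bits != 512               /* only 128-bit and 256-bit forms */
        && highest_regno <= 15;             /* only the lower 16 registers */
}
```

Under this reading, `vmovdqu32 %ymm0, %ymm1` qualifies, while `vmovdqu32 %ymm0, %ymm16` and any masked or 512-bit form keep the EVEX encoding.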