Bug 28078 - arm: fails to build when using armv8 neon with dotprod extension
Status: RESOLVED MOVED
Alias: None
Product: binutils
Classification: Unclassified
Component: gas
Version: 2.36
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-07-12 04:50 UTC by Alok Parlikar
Modified: 2022-06-22 06:31 UTC
CC List: 3 users

See Also:
Host:
Target: arm
Build:
Last reconfirmed: 2021-07-13 00:00:00


Description Alok Parlikar 2021-07-12 04:50:04 UTC
I was trying to build TensorFlow Lite v2.5 with a custom toolchain that uses binutils 2.36.1. The build failed while compiling the XNNPACK project, with this error:

/tmp/ccMSJOfk.s:380: Error: selected processor does not support `vsdot.s8 q12,q9,d11[0]' in ARM mode

Some of my notes about this issue are here: https://github.com/google/XNNPACK/issues/1465#issuecomment-877910701

Following is a minimal example to reproduce this:


// file: test.c
#include <arm_neon.h>

int32x2_t test(int32x2_t a, int8x8_t b, int8x8_t c) {
        return vdot_lane_s32(a, b, c, 1);
}
// EOF



$ arm-unknown-linux-gnueabihf-gcc  -march=armv8.2-a+dotprod -mfpu=neon-fp-armv8 test.c -S -o test.s

$ arm-unknown-linux-gnueabihf-as  -march=armv8.2-a+dotprod -mfpu=neon-fp-armv8 test.s
/tmp/cc7vCQog.s: Assembler messages:
/tmp/cc7vCQog.s:42: Error: selected processor does not support `vsdot.s8 d17,d16,d7[1]' in ARM mode

This happens with gcc 8, 9 and 10. Binutils 2.32 and 2.33.1 work. Binutils 2.34 and later fail to assemble the file.

This seems to be related to commit https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=f439988037a589de3798f44e7268301adaec21a9, which changed the behavior of .fpu directives.

The generated assembly has the following snippet:

        .arch armv8.2-a
        .arch_extension dotprod
        .syntax unified
        .arm
        .fpu neon-fp-armv8

The .fpu directive effectively cancels the dotprod extension that the .arch_extension directive enabled just above it.

I also see later commits that have changed MVE and CRC to the CORE_HIGH section; I am not sure whether something similar is relevant for dotprod.
Comment 1 Tamar Christina 2021-07-13 10:04:13 UTC
Hi Alok,

This is intentional, as it became increasingly difficult to mix FPU and arch_extensions in a sane manner.

For that reason, using specific -mfpu flags is deprecated; you should instead use -mfpu=auto, which then does the right thing based on the architecture and its extensions.
Comment 2 Alok Parlikar 2021-07-13 10:29:06 UTC
Ah. Thanks, Tamar.

Doing the same as you suggested with binutils 2.36.1 gives:

arm-unknown-linux-musleabihf-as -march=armv8.2-a+dotprod -mfpu=auto  test.s
Assembler messages:
Error: unknown floating point format `auto'

Error: unrecognized option -mfpu=auto

Is the "auto" option added in a later release?

Also, when gcc calls the assembler itself, trying:

arm-unknown-linux-musleabihf-gcc -march=armv8.2-a+dotprod -mfpu=auto test.c

This works, but gcc still invokes the assembler with -mfpu=neon-fp-armv8.

Am I doing something wrong?
Comment 3 Tamar Christina 2021-07-13 12:36:21 UTC
Hi,

>arm-unknown-linux-musleabihf-as -march=armv8.2-a+dotprod -mfpu=auto  test.s
> Assembler messages:
> Error: unknown floating point format `auto'
> 
> Error: unrecognized option -mfpu=auto

hmm no it looks like we don't have -mfpu=auto at the assembler level yet, only at the compiler level.

But trying it locally it does seem like something has broken..
Comment 4 Richard Earnshaw 2021-07-13 13:02:41 UTC
(In reply to Tamar Christina from comment #3)
> Hi,
> 
> >arm-unknown-linux-musleabihf-as -march=armv8.2-a+dotprod -mfpu=auto  test.s
> > Assembler messages:
> > Error: unknown floating point format `auto'
> > 
> > Error: unrecognized option -mfpu=auto
> 
> hmm no it looks like we don't have -mfpu=auto at the assembler level yet,
> only at the compiler level.
> 
> But trying it locally it does seem like something has broken..

This is down to a change in binutils (gas), but that change was made to fix a different bug, so I don't think it can be reverted.

I think the solution is that we need to re-issue the architecture extensions that relate to the FPU after emitting a .fpu directive.
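
For illustration only, the re-issuing idea above can be sketched as a small C filter over the generated assembly text. This is not the GCC change itself: the function name reissue_extensions, the helper names and the buffer limits are all hypothetical, and the sketch naively repeats every .arch_extension line seen so far after each .fpu line, rather than only the extensions that relate to the FPU.

```c
#include <stddef.h>
#include <string.h>

#define MAX_EXT  16   /* hypothetical limit on remembered extension lines */
#define MAX_LINE 256  /* hypothetical limit on the length of one line */

/* True if the first token on `line` is `directive` followed by a blank. */
static int is_directive(const char *line, const char *directive) {
    size_t n = strlen(directive);
    while (*line == ' ' || *line == '\t')
        line++;
    return strncmp(line, directive, n) == 0 &&
           (line[n] == ' ' || line[n] == '\t');
}

/* Append `line` plus a newline to `out`; returns -1 if it does not fit. */
static int append_line(char *out, size_t outsz, size_t *used,
                       const char *line) {
    size_t len = strlen(line);
    if (*used + len + 2 > outsz)  /* line + '\n' + '\0' */
        return -1;
    memcpy(out + *used, line, len);
    out[*used + len] = '\n';
    out[*used + len + 1] = '\0';
    *used += len + 1;
    return 0;
}

/*
 * Copy the assembly text `in` to `out` line by line. After every .fpu
 * line, repeat each .arch_extension line seen earlier, so the extension
 * is enabled again after the .fpu directive.
 * Returns 0 on success, -1 if any hypothetical limit is exceeded.
 */
int reissue_extensions(const char *in, char *out, size_t outsz) {
    char ext[MAX_EXT][MAX_LINE];
    char line[MAX_LINE];
    size_t next = 0, used = 0, i;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    while (*in) {
        size_t len = strcspn(in, "\n");  /* length without the newline */
        if (len >= MAX_LINE)
            return -1;
        memcpy(line, in, len);
        line[len] = '\0';
        in += len;
        if (*in == '\n')
            in++;

        if (append_line(out, outsz, &used, line) != 0)
            return -1;

        if (is_directive(line, ".arch_extension")) {
            if (next == MAX_EXT)
                return -1;
            strcpy(ext[next++], line);  /* remember it for later .fpu lines */
        } else if (is_directive(line, ".fpu")) {
            for (i = 0; i < next; i++)
                if (append_line(out, outsz, &used, ext[i]) != 0)
                    return -1;
        }
    }
    return 0;
}
```

Fed the five-directive snippet from the description, the sketch copies it through unchanged and then adds one more .arch_extension dotprod line after the .fpu neon-fp-armv8 line.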
Comment 5 Richard Earnshaw 2021-08-02 11:19:16 UTC
Moving this issue to GCC as the problem is the order in which the compiler emits the directives.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101723
Comment 6 Richard Earnshaw 2021-08-25 11:13:25 UTC
For completeness, GCC has now been fixed on master and all maintained releases (back to gcc-9).
Comment 7 Alok Parlikar 2021-08-25 12:15:26 UTC
Thank you so much! I've also posted this status on the XNNPACK issue that motivated this bug report: https://github.com/google/XNNPACK/issues/1465