Bug 18500 - VLDR immediate not using VMOV
Summary: VLDR immediate not using VMOV
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: gas (show other bugs)
Version: unspecified
: P2 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-06-07 17:25 UTC by Alessandro Marzocchi
Modified: 2015-10-22 02:24 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
Patch (456 bytes, patch)
2015-06-07 17:25 UTC, Alessandro Marzocchi
Details | Diff
Revised patch (737 bytes, patch)
2015-06-07 18:32 UTC, Alessandro Marzocchi
Details | Diff
Revised patch including test cases (2.21 KB, patch)
2015-06-08 01:01 UTC, Alessandro Marzocchi
Details | Diff
Revised patch including test cases (2.17 KB, patch)
2015-06-08 17:33 UTC, Alessandro Marzocchi
Details | Diff
Revised patch (2.21 KB, patch)
2015-06-16 23:06 UTC, Alessandro Marzocchi
Details | Diff
Revised patch (2.15 KB, patch)
2015-06-16 23:08 UTC, Alessandro Marzocchi
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alessandro Marzocchi 2015-06-07 17:25:08 UTC
Created attachment 8348 [details]
Patch

The vldr Rx,=imm always translate to a ldr from literal pool. This patch tries to correctly translate it to a vmov when possible. Please double check it as I have never done any development on GAS source.
Comment 1 Alessandro Marzocchi 2015-06-07 18:32:40 UTC
Created attachment 8349 [details]
Revised patch

Corrected code for double precision values.
Comment 2 Alessandro Marzocchi 2015-06-08 01:01:31 UTC
Created attachment 8350 [details]
Revised patch including test cases

Added 3 test cases (vfpv2 which has no vmov, vfpv3xd which has only single vmov and vfpv3 which has both single and double vmov).

Note that while making the patch a problem was discovered: VMOV with immediate value of -0.125 compiles correctly for single but not for double registers:
vmov s0,#0xbe000000 ; OK
vmov d0,#0xbe000000 ; OK

When this problem will be corrected the test case for vfpv3 will give an error on that vldr
Comment 3 Alessandro Marzocchi 2015-06-08 17:33:35 UTC
Created attachment 8353 [details]
Revised patch including test cases

Corrected functions is_double_a_single and double_to_single
Comment 4 Nick Clifton 2015-06-16 14:10:08 UTC
Hi Alessandro,

  There are three problems with this patch:

  1. It checks for X_op being O_bignum, but then it uses the value in X_add_number.  If X_op is O_bignum then the value is in the generic_bignum array, and the value is only a floating point value if X_add_number is -1.  If X_add_number is zero (or any value other than -1) then generic_bignum holds a large integer value, not a large floating point value.

  2. It is not clear to me that we actually want to convert VLDR to VMOV.  If the same constant is being used multiple times then putting it in the literal pool will save space.  Ideally we could assume that the producer of the VLDR instructions (either the compiler or the assembly programmer), knows what they are doing and not modify their code for them.  But if you still feel that this conversion is a useful idea then it ought to be controllable via a command line option.

  3. It does not follow the GNU Coding Standards in terms of the layout of the code.  (See: http://www.gnu.org/prep/standards/)

Cheers
  Nick
Comment 5 Alessandro Marzocchi 2015-06-16 23:06:20 UTC
Created attachment 8366 [details]
Revised patch
Comment 6 Alessandro Marzocchi 2015-06-16 23:08:00 UTC
Created attachment 8367 [details]
Revised patch
Comment 7 Alessandro Marzocchi 2015-06-16 23:29:48 UTC
Hi Nick,
   thank for your patience. 
The latest patch should solve the problem indicated by you and match GNU coding standards (I hope at least). 
2) I took that idea from ARM specs:

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489c/CJAEFGHE.html 
"VLDR pseudo-instruction

The VLDR pseudo-instruction loads a constant value into every element of a 64-bit NEON vector, or into a VFP single-precision or double-precision register.
[omissis]
If an instruction (for example, VMOV) is available that can generate the constant directly into the register, the assembler uses it. Otherwise, it generates a doubleword literal pool entry containing the constant and loads the constant using a VLDR instruction."

Also the conversion should not take more space (both VMOV and VLDR instruction get converted to 32 bit instruction in ARM and THUMB modes) while performing better (1 cycle vs 2/3 for Cortex M4). 

Regards
   Alessandro


By the way... what is the correct way to load 1.0 in a double register? I tried this but it gives an error:
.arm
.syntax unified
.fpu vfpv3
vmov d0,#0x3f800000
Comment 8 Nick Clifton 2015-06-17 11:54:19 UTC
(In reply to Alessandro Marzocchi from comment #7)

> 2) I took that idea from ARM specs:

A good point.  I rescind my earlier comment on this point.

> By the way... what is the correct way to load 1.0 in a double register? I
> tried this but it gives an error:
> .arm
> .syntax unified
> .fpu vfpv3
> vmov d0,#0x3f800000

You need to explicitly state the size of the floating point being stored.  IE:

  vmov.f64 d0,#0x3f800000

or:

  vmov.f64 d0, 1.0

Note - when this instruction is disassembled via objdump it will be displayed as:

   eeb70b00 	vmov.f64	d0, #112	; 0x70

Which is not very helpful.  Ideally objdump should decode the 112 value and display the floating point equivalent.  If you do this by hand by the way you will find that the actual bit pattern being stored into the d0 register is:

  0x3ff0000000000000

which is not quite the same as the 0x3f80000 value that was provided in the assembler.  This is because the input value (0x3f80000) is treated as a 32-bit floating point bit pattern, but the value stored in d0 is a 64-bit floating point value.


Anyway, I have now checked your patch in.  I made a few small modifications however:

  * I fixed the test cases so that they would work with arm-wince and arm-aout targets.  I also updated the names of the tests so that they reflect the vfp version being tested.

  * I added code to handle bignum floating point constants.

  * I added changelog entries for the patch and the new tests.

Cheers
  Nick
Comment 9 cvs-commit@gcc.gnu.org 2015-06-17 11:57:55 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=ba592044bc04610d6fa14d0a95931bac303ace37

commit ba592044bc04610d6fa14d0a95931bac303ace37
Author: Alessandro Marzocchi <alessandro.marzocchi@gmail.com>
Date:   Wed Jun 17 12:56:17 2015 +0100

    Add support for converting VLDR <reg>,=<constant> to a VMOV instruction when appropriate.
    
    	PR gas/18500
    gas	* config/tc-arm.c (is_double_a_single): New function.
    	(double_to_single): New function.
    	(move_or_literal_pool): Add support for	converting VLDR to VMOV.
    
    tests	* gas/arm/vfpv2-ldr_immediate.s: New test case.
    	* gas/arm/vfpv2-ldr_immediate.d: Expected disassembly.
    	* gas/arm/vfpv3-ldr_immediate.s: New test case.
    	* gas/arm/vfpv3-ldr_immediate.d: Expected disassembly.
    	* gas/arm/vfpv3xd-ldr_immediate.s: New test case.
    	* gas/arm/vfpv3xd-ldr_immediate.d: Expected disassembly.
Comment 10 Nick Clifton 2015-06-17 12:10:10 UTC
Fixed
Comment 11 cvs-commit@gcc.gnu.org 2015-06-17 15:11:54 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=5fc177c89526035dfa5ac5329f88b02af01d8ca5

commit 5fc177c89526035dfa5ac5329f88b02af01d8ca5
Author: Nick Clifton <nickc@redhat.com>
Date:   Wed Jun 17 16:10:36 2015 +0100

    Fix compile warnings building previous delta in a 32-bit environment.
    
    	* config/tc-arm.c (is_double_a_single): Make conditional upon the
    	availablity of a 64-bit type.  Use this type for the argument and
    	mantissa.
    	(double_to_single): Likewise.
    	* config/tc-arm.c (move_or_literal_pool): Use a 64-bit type for
    	the constant value, if available.  Generate a 64-bit value from a
    	bignum if supported.  Only perform the second optimization for
    	PR 18500 if the 64-bit type is available.
Comment 12 Jiong Wang 2015-10-20 13:06:06 UTC
The testcases includes failed on arm big-endian. Can be reproduced on x86 cross --target=armeb-none-eabi

FAIL: VFPv2 vldr to vmov
FAIL: VFPv3 vldr to vmov
FAIL: VFPv3xd vldr to vmov
Comment 13 cvs-commit@gcc.gnu.org 2015-10-21 16:26:39 UTC
The master branch has been updated by Nick Clifton <nickc@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=a846e9c1872038b0d7bac1fe5bb134668ae5e697

commit a846e9c1872038b0d7bac1fe5bb134668ae5e697
Author: Nick Clifton <nickc@redhat.com>
Date:   Wed Oct 21 17:25:28 2015 +0100

    Fix tests for PR 18500 so that they will pass for big-endian ARM toolchains.
    
    	PR gas/18500
    	* gas/arm/vfpv3xd-ldr_immediate.d: Update test for big-endian ARM
    	toolchains.
    	* gas/arm/vfpv3-ldr_immediate.d: Likewise.
    	* gas/arm/vfpv2-ldr_immediate.d: Likewise.
Comment 14 Nick Clifton 2015-10-21 16:28:17 UTC
Hi Jiong,

> The testcases includes failed on arm big-endian. Can be reproduced on x86
> cross --target=armeb-none-eabi
> 
> FAIL: VFPv2 vldr to vmov
> FAIL: VFPv3 vldr to vmov
> FAIL: VFPv3xd vldr to vmov

Doh.  Sorry about that.  I have checked in a patch to fix it, and added a big-endian arm toolchain to my local array of toolchains to test.

Cheers
  Nick
Comment 15 cvs-commit@gcc.gnu.org 2015-10-22 02:24:55 UTC
The master branch has been updated by Alan Modra <amodra@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=4ee3febded6c95ef47e8982a406aa3549cb6b59b

commit 4ee3febded6c95ef47e8982a406aa3549cb6b59b
Author: Alan Modra <amodra@gmail.com>
Date:   Thu Oct 22 10:58:47 2015 +1030

    Fix tests for PR 18500, revisited
    
    Correct commit a846e9c1.
    
    	PR gas/18500
    	* gas/arm/vfpv2-ldr_immediate.d: Use parentheses, not brackets,
    	to select alternatives.
    	* gas/arm/vfpv3-ldr_immediate.d: Likewise.
    	* gas/arm/vfpv3xd-ldr_immediate.d: Likewise.