Bug 20030 - STM32L4XX : --fix-stm32l4xx-629360 fails to create vldm/vpop veneers for double-precision registers
Summary: STM32L4XX : --fix-stm32l4xx-629360 fails to create vldm/vpop veneers for doub...
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld (show other bugs)
Version: 2.27
: P2 normal
Target Milestone: ---
Assignee: Christophe Monat
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-05-02 09:20 UTC by Christophe Monat
Modified: 2016-05-18 13:23 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:


Attachments
Sequence of vldm/vpop with dp registers that should be patched by the linker. (304 bytes, text/plain)
2016-05-02 09:20 UTC, Christophe Monat
Details
Proposed patch (3.93 KB, patch)
2016-05-03 13:53 UTC, Nick Clifton
Details | Diff
Alternate patch (4.09 KB, patch)
2016-05-03 14:09 UTC, Christophe Monat
Details | Diff
Updated patch (4.31 KB, patch)
2016-05-04 15:26 UTC, Christophe Monat
Details | Diff
ChangeLog contribution corresponding to the updated patch (377 bytes, text/plain)
2016-05-04 15:27 UTC, Christophe Monat
Details
Patch corresponding to final patch created on top of origin/binutils-2_26-branch (4.31 KB, patch)
2016-05-09 13:27 UTC, Christophe Monat
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Christophe Monat 2016-05-02 09:20:37 UTC
Created attachment 9228 [details]
Sequence of vldm/vpop with dp registers that should be patched by the linker.

The --fix-stm32l4xx-629360 ld option should create veneers for vldm/vpop instructions when the number of words transferred is greater than 8.

This is the case for vldm/vpop when single precision register are used (eg in the s0-s31 range), but not when the aliases to double precision registers are used (eg the d0-d15 corresponding to s0-s31).

It turns out that, despite the fact that there is not much use for dp registers in fpv4-sp-d16 mode, gcc uses dp sequences for generating its save/restore sequences.

The original STM32L4XX hardware bug work-around fix did not account for that case, an oversight.

To reproduce, use the attached .s file:
$ arm-none-eabi-as -EL -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 stm32l4xx-fix-vldm-dp.s -o stm32l4xx-fix-vldm-dp.o
$ arm-none-eabi-ld -EL --fix-stm32l4xx-629360 -Ttext=0x8000 stm32l4xx-fix-vldm-dp.o -o stm32l4xx-fix-vldm-dp
$ arm-none-eabi-objdump -d stm32l4xx-fix-vldm-dp | grep veneer

is empty, this should not be the case since all instructions used are subject to the bug.
Comment 1 Nick Clifton 2016-05-03 13:53:29 UTC
Created attachment 9232 [details]
Proposed patch

Hi Christophe,

  Will something like this do ?

Cheers
  Nick
Comment 2 Christophe Monat 2016-05-03 14:09:11 UTC
Created attachment 9234 [details]
Alternate patch
Comment 3 Christophe Monat 2016-05-03 14:22:38 UTC
(In reply to Nick Clifton from comment #1)

Hi Nick,
>   Will something like this do ?

Thanks for this, it looks good, but I am seeing that the DP registers are grouped by pairs whereas they could be group by four, meaning that we could keep only one veneer size for VLDM veneers.

I have attached an alternate patch (maybe a little bit rough, but to get the idea rapidly) that sticks to the original VLDM veneers size, certainly the same idea, but enabling a denser packing of the DP registers. By proceeding like this (dealing with chunks instead of registers) the change in bdf/elf32-arm.c is less intrusive.

In you patch, I am seeing also some factorization when creating the new_insn for instance, that certainly are worth doing.

So if we could agree on a mix of the two changes, it would be great.

Best,
--C
Comment 4 Nick Clifton 2016-05-03 14:36:50 UTC
Hi Christophe,
 
> Thanks for this, it looks good, but I am seeing that the DP registers are
> grouped by pairs whereas they could be group by four,  

True - there must be a bug in my arithmetic somewhere. :-(

> So if we could agree on a mix of the two changes, it would be great.

Certainly.

Would it be OK if you created the final draft of the patch ?  I have just started working on an x86 bug...

Cheers
  Nick
Comment 5 Christophe Monat 2016-05-03 14:52:02 UTC
Hi Nick,

(In reply to Nick Clifton from comment #4)
> Would it be OK if you created the final draft of the patch ?  I have just
> started working on an x86 bug...

Yes, I agree to propose a draft for the final patch that I will post here. Thanks for you help.

Best,
--C
Comment 6 Christophe Monat 2016-05-04 15:26:34 UTC
Created attachment 9235 [details]
Updated patch
Comment 7 Christophe Monat 2016-05-04 15:27:14 UTC
Created attachment 9236 [details]
ChangeLog contribution corresponding to the updated patch
Comment 8 Nick Clifton 2016-05-05 09:28:30 UTC
Hi Christophe,

  Patch approved - please apply.

Cheers
  Nick
Comment 9 cvs-commit@gcc.gnu.org 2016-05-09 13:16:16 UTC
The master branch has been updated by Christophe Lyon <clyon@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=9239bbd3a6bf901dba1c0170622c50c78f6d1096

commit 9239bbd3a6bf901dba1c0170622c50c78f6d1096
Author: Christophe Monat <christophe.monat@st.com>
Date:   Mon May 9 15:10:37 2016 +0200

    [ARM/STM32L4XX] PR 20030: --fix-stm32l4xx-629360 fails to create vldm/vpop veneers for double-precision registers
    
    bfd/
    	PR ld/20030
    	* elf32-arm.c (is_thumb2_vldm): Account for T1 (DP) encoding.
    	(stm32l4xx_need_create_replacing_stub): Rename ambiguous nb_regs
    	to nb_words.
    	(create_instruction_vldmia): Add is_dp to disambiguate SP/DP
    	encoding.
    	(create_instruction_vldmdb): Likewise.
    	(stm32l4xx_create_replacing_stub_vldm): is_dp detects DP encoding,
    	uses it to re-encode.
    
    ld/
    	PR ld/20030
    	* testsuite/ld-arm/arm-elf.exp: Run new stm32l4xx-fix-vldm-dp
    	tests. Fix misnamed stm32l4xx-fix-all.
    	* testsuite/ld-arm/stm32l4xx-fix-vldm-dp.s: New tests for multiple
    	loads with DP registers.
    	* testsuite/ld-arm/stm32l4xx-fix-vldm-dp.d: New reference file.
    	* testsuite/ld-arm/stm32l4xx-fix-vldm.s: Add missing comment.
    	* testsuite/ld-arm/stm32l4xx-fix-all.s: Add tests for multiple
    	loads with DP registers.
    	* testsuite/ld-arm/stm32l4xx-fix-all.d: Update reference.
Comment 10 Christophe Monat 2016-05-09 13:27:54 UTC
Created attachment 9244 [details]
Patch corresponding to final patch created on top of origin/binutils-2_26-branch

Patch created on top of origin/binutils-2_26-branch
Comment 11 Christophe Monat 2016-05-09 13:30:04 UTC
Hi Nick,

(In reply to Nick Clifton from comment #8)
>   Patch approved - please apply.

Thanks for your approval, Christophe Lyon has applied it.

As it is a bug which is also present in 2_26, would you accept that it is applied on binutils-2_26-branch ? I have attached the corresponding patch.

Best,
--C
Comment 12 cvs-commit@gcc.gnu.org 2016-05-18 13:16:18 UTC
The binutils-2_26-branch branch has been updated by Christophe Lyon <clyon@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=ec0d6fa0d832ca4c703ba8349f525a5cf02c3972

commit ec0d6fa0d832ca4c703ba8349f525a5cf02c3972
Author: Christophe Monat <christophe.monat@st.com>
Date:   Wed May 18 15:14:09 2016 +0200

        [ARM/STM32L4XX] PR 20030: --fix-stm32l4xx-629360 fails to create vldm/vpop veneers for double-precision registers
    
    	Backport from master
        bfd/
            PR ld/20030
            * elf32-arm.c (is_thumb2_vldm): Account for T1 (DP) encoding.
            (stm32l4xx_need_create_replacing_stub): Rename ambiguous nb_regs
            to nb_words.
            (create_instruction_vldmia): Add is_dp to disambiguate SP/DP
            encoding.
            (create_instruction_vldmdb): Likewise.
            (stm32l4xx_create_replacing_stub_vldm): is_dp detects DP encoding,
            uses it to re-encode.
    
        ld/
            PR ld/20030
            * testsuite/ld-arm/arm-elf.exp: Run new stm32l4xx-fix-vldm-dp
            tests. Fix misnamed stm32l4xx-fix-all.
            * testsuite/ld-arm/stm32l4xx-fix-vldm-dp.s: New tests for multiple
            loads with DP registers.
            * testsuite/ld-arm/stm32l4xx-fix-vldm-dp.d: New reference file.
            * testsuite/ld-arm/stm32l4xx-fix-vldm.s: Add missing comment.
            * testsuite/ld-arm/stm32l4xx-fix-all.s: Add tests for multiple
            loads with DP registers.
            * testsuite/ld-arm/stm32l4xx-fix-all.d: Update reference.
Comment 13 Christophe Monat 2016-05-18 13:23:58 UTC
Fixed in HEAD (2.27) and 2.26