Bug 12494 - Relaxation leads to wrong code optimization (computed goto)
Summary: Relaxation leads to wrong code optimization (computed goto)
Status: RESOLVED FIXED
Alias: None
Product: binutils
Classification: Unclassified
Component: ld (show other bugs)
Version: 2.21
: P2 normal
Target Milestone: ---
Assignee: Senthil Kumar Selvaraj
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-02-16 19:48 UTC by Georg-Johann Lay
Modified: 2013-04-09 15:52 UTC (History)
5 users (show)

See Also:
Host: all
Target: avr
Build: all
Last reconfirmed:


Attachments
GNU-C source using computed goto (155 bytes, text/plain)
2011-02-16 19:48 UTC, Georg-Johann Lay
Details
.s as compiled by avr-gcc (1.28 KB, text/plain)
2011-02-16 19:54 UTC, Georg-Johann Lay
Details
Disassembly (1.19 KB, text/plain)
2011-02-16 19:57 UTC, Georg-Johann Lay
Details
Patch that loops over all sections (528 bytes, patch)
2013-03-12 18:03 UTC, Senthil Kumar Selvaraj
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Georg-Johann Lay 2011-02-16 19:48:54 UTC
Created attachment 5247 [details]
GNU-C source using computed goto

The following GNU-C source uses computed goto in function fb. avr-gcc produces correct code, but the relaxation optimizes rcall+ret to rkump even though between rcall and ret there is a jump label.

Maybe someone can confirm with an up to date and unpatched version of binutils.

e:/WinAVR/20100110/bin/avr-gcc -Wl,--relax -v -Wl,-v -save-temps -fverbose-asm -Wall -W -Os -o computed-goto.elf -mmcu=atmega8 computed-goto.c
Using built-in specs.
Target: avr
Configured with: ../gcc-4.3.3/configure --enable-win32-registry=WinAVR-20100110 --with-gmp=/usr/local --with-mpfr=/usr/local --prefix=/c/WinAVR --target=avr --enable-languages=c,c++,objc --with-dwarf2 --enable-doc --disable-shared --disable-libada --disable-libssp --disable-nls --with-pkgversion='WinAVR 20100110' --with-bugurl='URL:http://sourceforge.net/tracker/?atid=520074&group_id=68108&func=browse'
Thread model: single
gcc version 4.3.3 (WinAVR 20100110) 
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-fverbose-asm' '-Wall' '-W' '-Os' '-o' 'computed-goto.elf' '-mmcu=atmega8'
 e:/winavr/20100110/bin/../libexec/gcc/avr/4.3.3/cc1.exe -E -quiet -v -imultilib avr4 -iprefix e:\winavr\20100110\bin\../lib/gcc/avr/4.3.3/ computed-goto.c -mmcu=atmega8 -Wall -W -fverbose-asm -Os -fpch-preprocess -o computed-goto.i
ignoring nonexistent directory "e:/winavr/20100110/lib/gcc/../../avr/sys-include"
#include "..." search starts here:
#include <...> search starts here:
 e:\winavr\20100110\bin\../lib/gcc/avr/4.3.3/include
 e:\winavr\20100110\bin\../lib/gcc/avr/4.3.3/include-fixed
 e:/winavr/20100110/lib/gcc/../../lib/gcc/avr/4.3.3/include
 e:/winavr/20100110/lib/gcc/../../lib/gcc/avr/4.3.3/include-fixed
 e:/winavr/20100110/lib/gcc/../../avr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-fverbose-asm' '-Wall' '-W' '-Os' '-o' 'computed-goto.elf' '-mmcu=atmega8'
 e:/winavr/20100110/bin/../libexec/gcc/avr/4.3.3/cc1.exe -fpreprocessed computed-goto.i -quiet -dumpbase computed-goto.c -mmcu=atmega8 -auxbase computed-goto -Os -Wall -W -version -fverbose-asm -o computed-goto.s
GNU C (WinAVR 20100110) version 4.3.3 (avr)
	compiled by GNU C version 3.4.5 (mingw-vista special r3), GMP version 4.2.3, MPFR version 2.4.1.
GGC heuristics: --param ggc-min-expand=47 --param ggc-min-heapsize=32702
Compiler executable checksum: 61d68a374065d489330774d2533cbbdf
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-fverbose-asm' '-Wall' '-W' '-Os' '-o' 'computed-goto.elf' '-mmcu=atmega8'
 e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/bin/as.exe -mmcu=atmega8 -o computed-goto.o computed-goto.s
COMPILER_PATH=e:/winavr/20100110/bin/../libexec/gcc/avr/4.3.3/;e:/winavr/20100110/bin/../libexec/gcc/;e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/bin/
LIBRARY_PATH=e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/avr4/;e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/lib/avr4/;e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/;e:/winavr/20100110/bin/../lib/gcc/;e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/lib/
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-fverbose-asm' '-Wall' '-W' '-Os' '-o' 'computed-goto.elf' '-mmcu=atmega8'
 e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/bin/ld.exe -m avr4 -o computed-goto.elf e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/lib/avr4/crtm8.o -Le:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/avr4 -Le:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/lib/avr4 -Le:/winavr/20100110/bin/../lib/gcc/avr/4.3.3 -Le:/winavr/20100110/bin/../lib/gcc -Le:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/lib --relax -v computed-goto.o -lgcc -lc -lgcc
GNU ld (WinAVR 20100110) 2.19
Comment 1 Georg-Johann Lay 2011-02-16 19:54:37 UTC
Created attachment 5248 [details]
.s as compiled by avr-gcc

avr-gcc output according to GNU-C source
Comment 2 Georg-Johann Lay 2011-02-16 19:57:33 UTC
Created attachment 5249 [details]
Disassembly

The wrong relaxation optimization. Note that fa gets called no matter what value parameter x of fb has.

Disassembly as of
avr-objdump -z -h -S -j .data -j .text computed-goto.elf > computed-goto.lst
Comment 3 Georg-Johann Lay 2011-03-11 13:35:09 UTC
Unpatched binutils 2.20 also trigger this bug.
Comment 4 Anitha Boyapati 2011-03-12 08:35:04 UTC
(In reply to comment #2)
> Created attachment 5249 [details]
> Disassembly
> 
> The wrong relaxation optimization. Note that fa gets called no matter what
> value parameter x of fb has.
> 
> Disassembly as of
> avr-objdump -z -h -S -j .data -j .text computed-goto.elf > computed-goto.lst


If label 'b' is not empty, I think this will not happen. I just tried a small change. Lets say:
 
<code>
 goto *l[x];

 a:fa();
 b: z--;
</code>

The code generated will be :
...
8e:   09 94           ijmp
90:   e9 df           rcall   .-46            ; 0x64 <fa>
92:   80 91 64 00     lds     r24, 0x0064
96:   81 50           subi    r24, 0x01       ; 1
98:   80 93 64 00     sts     0x0064, r24
9c:   08 95           ret
...


(In reply to comment #1)
> Created attachment 5248 [details]
> .s as compiled by avr-gcc
> 
> avr-gcc output according to GNU-C source

Also, I find it a little strange that 2 ret are emitted in fb()

	ijmp
.L6:
	rcall fa	 ; 
	ret
.L7:
	ret
	

Although it does not make any semantic difference why 2 return statements?
Comment 5 Georg-Johann Lay 2011-06-14 19:50:27 UTC
(In reply to comment #4)

It's definitely abinutils bug.

Rewriting and tweaking C-source will not fix a binutils bug.

The simplistic source just shows that avr-gcc actually can generate code that binutils linker relaxation will shred. For sure there are more realistic examles out in the world, maybe even assembler or auto-generated by some tool.

BTW, avr-gcc may generate a function wil several RETs if it detects that an expilogue is "simple". That's preferred over jumping around.
Comment 6 Anitha Boyapati 2011-06-15 05:19:03 UTC
(In reply to comment #5)
> (In reply to comment #4)
> 
> It's definitely abinutils bug.

Yes, ofcourse.

> 
> Rewriting and tweaking C-source will not fix a binutils bug.

The comment is more of an observation that it might be one of the corner cases that is not properly handled in binutils.


> 
> BTW, avr-gcc may generate a function wil several RETs if it detects that an
> expilogue is "simple". That's preferred over jumping around.

OK.
Comment 7 Georg-Johann Lay 2011-07-19 17:50:07 UTC
Confirmed for binutils 2.21
Comment 8 Bjoern Haase 2011-07-26 15:26:13 UTC
Maybe the two following hints are helpful:

1.) As a fast fix, there is a command line switch for the linker that disables the problematic rcall ret optimization. Execute your avr-ld with "-h" and you get the information on the options to configure the relaxation optimization.

2.) The source of the problem is that the present relaxation algorithm does not check wether relocations point to the return instruction. The place to fix the issue is the file

bfd/elf32-avr.c

and the function to place the fix is

static bfd_boolean
elf32_avr_relax_section (bfd *abfd,
			 asection *sec,
                         struct bfd_link_info *link_info,
                         bfd_boolean *again)


The part of the function to fix is around line 1912 where the relaxation algorithm tries to identify call/ret and rcall/ret sequences.

#code fragment to look for

            /* Here we look for rcall/ret or call/ret sequences that could be
               safely replaced by rjmp/ret or jmp/ret.  */

#end of code fragment

What is missing is code that checks wether there is somebody actually needing the seemingly useless ret instruction before it is deleted. 

Code that does this job can be taken from the code part some lines later that deletes ret instructions that immediately follow jumps and rjmps. Note that one needs to check for three possible uses of the "ret" instruction
 a) global labels refering to the ret
 b) local labels refering to the ret
 c) relocations refering to the ret

The required code already exists and the method for these checks could be taken from about line 2062

#code fragment to look for

                       /* We now only have to make sure that there is no
                           local label defined at the address of the ret
                           instruction and that there is no local relocation
                           in this section pointing to the ret.  */

                        int deleting_ret_is_safe = 1;
                        unsigned int section_offset_of_ret_insn =
                                          irel->r_offset + insn_size;
                        Elf_Internal_Sym *isym, *isymend;
                        unsigned int sec_shndx;

                        sec_shndx =
			  _bfd_elf_section_from_bfd_section (abfd, sec);

                        /* Check for local symbols.  */

                        ....

                        /* Now check for global symbols.  */
                        ....

                        /* Now we check for relocations pointing to ret.  */
                        .... 

#end of code fragment

Since these checks obviously are required at three places, it might be useful to move the checking code above to a seperate function, in order to avoid code duplication.

HTH,

Björn.
Comment 9 Senthil Kumar Selvaraj 2013-03-12 18:02:51 UTC
Actually, the linker relaxation code already considers cases where deletion of
the ret in call/ret or jmp/ret might be unsafe - just that it missed this
specific case.

The case being that GCC puts the array containing the addresses of the labels
in a separate section (.rodata). The safe ret deletion logic assumes that only
relocations in the local section (containing the ret) could potentially have
symbols whose values evaluate to the address containing the ret, but that is
not true in this case.

Looping over all sections in the BFD, and repeating the check for each section
fixes the problem (see attached patch).

Changelog

2013-03-12  Senthil Kumar Selvaraj  <senthil_kumar.selvaraj@atmel.com>

	* bfd/elf32-avr.c: Consider all sections to determine if linker
	  relaxation can safely delete a ret after a call/jmp
Comment 10 Senthil Kumar Selvaraj 2013-03-12 18:03:45 UTC
Created attachment 6932 [details]
Patch that loops over all sections
Comment 11 Sourceware Commits 2013-04-09 15:50:40 UTC
CVSROOT:	/cvs/src
Module name:	src
Changes by:	nickc@sourceware.org	2013-04-09 15:50:38

Modified files:
	bfd            : ChangeLog elf32-avr.c 

Log message:
	PR ld/12494
	* bfd/elf32-avr.c: Consider all sections to determine if linker
	relaxation can safely delete a ret after a call/jmp

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/ChangeLog.diff?cvsroot=src&r1=1.6024&r2=1.6025
http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/elf32-avr.c.diff?cvsroot=src&r1=1.61&r2=1.62
Comment 12 Nick Clifton 2013-04-09 15:52:19 UTC
Hi Senthil,

  Thanks for the patch - I have checked it in now.

Cheers
  Nick