Created attachment 5247 [details] GNU-C source using computed goto The following GNU-C source uses computed goto in function fb. avr-gcc produces correct code, but the relaxation optimizes rcall+ret to rkump even though between rcall and ret there is a jump label. Maybe someone can confirm with an up to date and unpatched version of binutils. e:/WinAVR/20100110/bin/avr-gcc -Wl,--relax -v -Wl,-v -save-temps -fverbose-asm -Wall -W -Os -o computed-goto.elf -mmcu=atmega8 computed-goto.c Using built-in specs. Target: avr Configured with: ../gcc-4.3.3/configure --enable-win32-registry=WinAVR-20100110 --with-gmp=/usr/local --with-mpfr=/usr/local --prefix=/c/WinAVR --target=avr --enable-languages=c,c++,objc --with-dwarf2 --enable-doc --disable-shared --disable-libada --disable-libssp --disable-nls --with-pkgversion='WinAVR 20100110' --with-bugurl='URL:http://sourceforge.net/tracker/?atid=520074&group_id=68108&func=browse' Thread model: single gcc version 4.3.3 (WinAVR 20100110) COLLECT_GCC_OPTIONS='-v' '-save-temps' '-fverbose-asm' '-Wall' '-W' '-Os' '-o' 'computed-goto.elf' '-mmcu=atmega8' e:/winavr/20100110/bin/../libexec/gcc/avr/4.3.3/cc1.exe -E -quiet -v -imultilib avr4 -iprefix e:\winavr\20100110\bin\../lib/gcc/avr/4.3.3/ computed-goto.c -mmcu=atmega8 -Wall -W -fverbose-asm -Os -fpch-preprocess -o computed-goto.i ignoring nonexistent directory "e:/winavr/20100110/lib/gcc/../../avr/sys-include" #include "..." search starts here: #include <...> search starts here: e:\winavr\20100110\bin\../lib/gcc/avr/4.3.3/include e:\winavr\20100110\bin\../lib/gcc/avr/4.3.3/include-fixed e:/winavr/20100110/lib/gcc/../../lib/gcc/avr/4.3.3/include e:/winavr/20100110/lib/gcc/../../lib/gcc/avr/4.3.3/include-fixed e:/winavr/20100110/lib/gcc/../../avr/include End of search list. COLLECT_GCC_OPTIONS='-v' '-save-temps' '-fverbose-asm' '-Wall' '-W' '-Os' '-o' 'computed-goto.elf' '-mmcu=atmega8' e:/winavr/20100110/bin/../libexec/gcc/avr/4.3.3/cc1.exe -fpreprocessed computed-goto.i -quiet -dumpbase computed-goto.c -mmcu=atmega8 -auxbase computed-goto -Os -Wall -W -version -fverbose-asm -o computed-goto.s GNU C (WinAVR 20100110) version 4.3.3 (avr) compiled by GNU C version 3.4.5 (mingw-vista special r3), GMP version 4.2.3, MPFR version 2.4.1. GGC heuristics: --param ggc-min-expand=47 --param ggc-min-heapsize=32702 Compiler executable checksum: 61d68a374065d489330774d2533cbbdf COLLECT_GCC_OPTIONS='-v' '-save-temps' '-fverbose-asm' '-Wall' '-W' '-Os' '-o' 'computed-goto.elf' '-mmcu=atmega8' e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/bin/as.exe -mmcu=atmega8 -o computed-goto.o computed-goto.s COMPILER_PATH=e:/winavr/20100110/bin/../libexec/gcc/avr/4.3.3/;e:/winavr/20100110/bin/../libexec/gcc/;e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/bin/ LIBRARY_PATH=e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/avr4/;e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/lib/avr4/;e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/;e:/winavr/20100110/bin/../lib/gcc/;e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/lib/ COLLECT_GCC_OPTIONS='-v' '-save-temps' '-fverbose-asm' '-Wall' '-W' '-Os' '-o' 'computed-goto.elf' '-mmcu=atmega8' e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/bin/ld.exe -m avr4 -o computed-goto.elf e:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/lib/avr4/crtm8.o -Le:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/avr4 -Le:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/lib/avr4 -Le:/winavr/20100110/bin/../lib/gcc/avr/4.3.3 -Le:/winavr/20100110/bin/../lib/gcc -Le:/winavr/20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/lib --relax -v computed-goto.o -lgcc -lc -lgcc GNU ld (WinAVR 20100110) 2.19
Created attachment 5248 [details] .s as compiled by avr-gcc avr-gcc output according to GNU-C source
Created attachment 5249 [details] Disassembly The wrong relaxation optimization. Note that fa gets called no matter what value parameter x of fb has. Disassembly as of avr-objdump -z -h -S -j .data -j .text computed-goto.elf > computed-goto.lst
Unpatched binutils 2.20 also trigger this bug.
(In reply to comment #2) > Created attachment 5249 [details] > Disassembly > > The wrong relaxation optimization. Note that fa gets called no matter what > value parameter x of fb has. > > Disassembly as of > avr-objdump -z -h -S -j .data -j .text computed-goto.elf > computed-goto.lst If label 'b' is not empty, I think this will not happen. I just tried a small change. Lets say: <code> goto *l[x]; a:fa(); b: z--; </code> The code generated will be : ... 8e: 09 94 ijmp 90: e9 df rcall .-46 ; 0x64 <fa> 92: 80 91 64 00 lds r24, 0x0064 96: 81 50 subi r24, 0x01 ; 1 98: 80 93 64 00 sts 0x0064, r24 9c: 08 95 ret ... (In reply to comment #1) > Created attachment 5248 [details] > .s as compiled by avr-gcc > > avr-gcc output according to GNU-C source Also, I find it a little strange that 2 ret are emitted in fb() ijmp .L6: rcall fa ; ret .L7: ret Although it does not make any semantic difference why 2 return statements?
(In reply to comment #4) It's definitely abinutils bug. Rewriting and tweaking C-source will not fix a binutils bug. The simplistic source just shows that avr-gcc actually can generate code that binutils linker relaxation will shred. For sure there are more realistic examles out in the world, maybe even assembler or auto-generated by some tool. BTW, avr-gcc may generate a function wil several RETs if it detects that an expilogue is "simple". That's preferred over jumping around.
(In reply to comment #5) > (In reply to comment #4) > > It's definitely abinutils bug. Yes, ofcourse. > > Rewriting and tweaking C-source will not fix a binutils bug. The comment is more of an observation that it might be one of the corner cases that is not properly handled in binutils. > > BTW, avr-gcc may generate a function wil several RETs if it detects that an > expilogue is "simple". That's preferred over jumping around. OK.
Confirmed for binutils 2.21
Maybe the two following hints are helpful: 1.) As a fast fix, there is a command line switch for the linker that disables the problematic rcall ret optimization. Execute your avr-ld with "-h" and you get the information on the options to configure the relaxation optimization. 2.) The source of the problem is that the present relaxation algorithm does not check wether relocations point to the return instruction. The place to fix the issue is the file bfd/elf32-avr.c and the function to place the fix is static bfd_boolean elf32_avr_relax_section (bfd *abfd, asection *sec, struct bfd_link_info *link_info, bfd_boolean *again) The part of the function to fix is around line 1912 where the relaxation algorithm tries to identify call/ret and rcall/ret sequences. #code fragment to look for /* Here we look for rcall/ret or call/ret sequences that could be safely replaced by rjmp/ret or jmp/ret. */ #end of code fragment What is missing is code that checks wether there is somebody actually needing the seemingly useless ret instruction before it is deleted. Code that does this job can be taken from the code part some lines later that deletes ret instructions that immediately follow jumps and rjmps. Note that one needs to check for three possible uses of the "ret" instruction a) global labels refering to the ret b) local labels refering to the ret c) relocations refering to the ret The required code already exists and the method for these checks could be taken from about line 2062 #code fragment to look for /* We now only have to make sure that there is no local label defined at the address of the ret instruction and that there is no local relocation in this section pointing to the ret. */ int deleting_ret_is_safe = 1; unsigned int section_offset_of_ret_insn = irel->r_offset + insn_size; Elf_Internal_Sym *isym, *isymend; unsigned int sec_shndx; sec_shndx = _bfd_elf_section_from_bfd_section (abfd, sec); /* Check for local symbols. */ .... /* Now check for global symbols. */ .... /* Now we check for relocations pointing to ret. */ .... #end of code fragment Since these checks obviously are required at three places, it might be useful to move the checking code above to a seperate function, in order to avoid code duplication. HTH, Björn.
Actually, the linker relaxation code already considers cases where deletion of the ret in call/ret or jmp/ret might be unsafe - just that it missed this specific case. The case being that GCC puts the array containing the addresses of the labels in a separate section (.rodata). The safe ret deletion logic assumes that only relocations in the local section (containing the ret) could potentially have symbols whose values evaluate to the address containing the ret, but that is not true in this case. Looping over all sections in the BFD, and repeating the check for each section fixes the problem (see attached patch). Changelog 2013-03-12 Senthil Kumar Selvaraj <senthil_kumar.selvaraj@atmel.com> * bfd/elf32-avr.c: Consider all sections to determine if linker relaxation can safely delete a ret after a call/jmp
Created attachment 6932 [details] Patch that loops over all sections
CVSROOT: /cvs/src Module name: src Changes by: nickc@sourceware.org 2013-04-09 15:50:38 Modified files: bfd : ChangeLog elf32-avr.c Log message: PR ld/12494 * bfd/elf32-avr.c: Consider all sections to determine if linker relaxation can safely delete a ret after a call/jmp Patches: http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/ChangeLog.diff?cvsroot=src&r1=1.6024&r2=1.6025 http://sourceware.org/cgi-bin/cvsweb.cgi/src/bfd/elf32-avr.c.diff?cvsroot=src&r1=1.61&r2=1.62
Hi Senthil, Thanks for the patch - I have checked it in now. Cheers Nick