Bug in Instruction Combination procedure and RTL generation.
J.J. Garcia
stigmatedbrain@gmail.com
Mon Sep 4 06:37:00 GMT 2006
Hi all,
I'm trying to debug a code optimization problem in gcc for a specific
architecture: gcc 2.95.3 for the Metaware ARC target. I know this is an
old release of the compiler and that there will not be much support for
it, but I'm going to keep trying anyway.
I noticed that with the -O0 flag the generated code is correct, but with
-Ox, where x = 1, 2, 3, there is the following issue:
Calling the compiler with the following debug flags:
-dr -dj -ds -dG -dL -df -dc -dN -dS -dl -dg -dR -dJ -dd
and inspecting the corresponding RTL dump files, I noticed badly built
RTL in the following files (the dumps not listed below look fine):
.combine
.dbr
.greg
.jump2
.lreg
.regmove
.sched
.sched2
First, I don't know the exact order of processing. I guess that
'.combine' comes right after '.flow', and '.flow' appears to be built
correctly, which is why I suspect that the problem is in the Instruction
Combination pass.
The problem:
Using the following source code:
#include <stdlib.h>   /* for exit() */

#define SINGULARP 10
#define SINGULARP_LOW 9
#define SINGULARP_HIGH 11

unsigned int
uifprb04503_A (unsigned int n)
{
  unsigned int test_value = 69;
  if ( n >= SINGULARP )
    {
      test_value = 20;
    }
  return test_value;
}

int main ()
{
  /* 9 is below SINGULARP, so 69 is expected here */
  if (uifprb04503_A (SINGULARP_LOW) != 69)
    exit (1);
  exit (0);
}
With the -O2 flag, I get the following after running objdump on the
object:
000000ec <uifprb04503_A>:
ec: 00 36 0e 10 100e3600 st fp,[sp]
f0: 00 38 6e 63 636e3800 mov fp,sp
f4: 08 7a e0 57 57e07a08 sub.f 0,r0,8
f8: 14 fe 1f 60 601ffe14 mov r0,20
fc: 0e 7c 1f 60 601f7c0e mov.ls r0,69
100: 45 00 00 00
104: 20 80 0f 38 380f8020 j.d [blink]
108: 00 10 6e 0b 0b6e1000 ld.a fp,[sp]
0000010c <main>:
10c: 04 3e 0e 10 100e3e04 st blink,[sp,4]
110: 00 36 0e 10 100e3600 st fp,[sp]
114: 00 38 6e 63 636e3800 mov fp,sp
118: 10 7e 8e 53 538e7e10 sub sp,sp,16
11c: a0 f9 ff 2f 2ffff9a0 bl.d ec <uifprb04503_A>
<...>
As you can see, the third instruction of <uifprb04503_A> is not generated
correctly; it should be 'sub.f 0,r0,9'.
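To make the off-by-one concrete, here is a minimal C sketch of the
behaviour. It assumes that 'sub.f 0,r0,K' followed by 'mov.ls r0,69'
selects 69 when r0 <= K as an unsigned comparison. The function names are
made up for illustration and are not part of the test case.

```c
/* What the C source asks for: 20 when n >= 10, otherwise 69. */
unsigned int model_source(unsigned int n)
{
    return (n >= 10u) ? 20u : 69u;
}

/* Assumed reading of "sub.f 0,r0,K ; mov r0,20 ; mov.ls r0,69":
   r0 ends up as 69 when n <= K (unsigned), otherwise 20. */
unsigned int model_object(unsigned int n, unsigned int k)
{
    return (n <= k) ? 69u : 20u;
}

/* Return 1 when the modelled object code agrees with the source for
   every n in [0, limit], 0 otherwise. Keep limit well below UINT_MAX
   so the loop terminates. */
int model_matches(unsigned int k, unsigned int limit)
{
    unsigned int n;

    for (n = 0; n <= limit; n++)
        if (model_object(n, k) != model_source(n))
            return 0;
    return 1;
}
```

Under these assumptions model_matches (9, 1000) returns 1 and
model_matches (8, 1000) returns 0. The only disagreement is at n = 9,
which is exactly the value that main () passes.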
If you look at the '.flow' file, you can see the following 'good
structure', which keeps the correct value in the comparison:
----------- (.flow)
(insn 4 37 5 (set (reg/v:SI 67)
(reg:SI 0 %r0)) 7 {*movsi_insn} (nil)
(expr_list:REG_DEAD (reg:SI 0 %r0)
(nil)))
<...>
(insn 12 11 32 (set (reg:CC 61 %cc)
(compare:CC (reg/v:SI 67)
(const_int 9 [0x9]))) 70 {*cmpsi_cc_insn} (insn_list 4
(nil))
(expr_list:REG_DEAD (reg/v:SI 67)
(nil)))
(insn 32 12 34 (set (reg:SI 71)
(const_int 20 [0x14])) 7 {*movsi_insn} (nil)
(expr_list:REG_EQUAL (const_int 20 [0x14])
(nil)))
(insn 34 32 14 (set (reg/v:SI 68)
(if_then_else (ltu (reg:CC 61 %cc)
(const_int 0 [0x0]))
(const_int 69 [0x45])
(reg:SI 71))) 29 {*movsicc_insn} (insn_list 12 (insn_list 32
(nil)))
(expr_list:REG_DEAD (reg:CC 61 %cc)
(expr_list:REG_DEAD (reg:SI 71)
(nil))))
<...>
------------
But in the '.combine' dump, the constant becomes '8', which is not the
expected value:
------------ (.combine)
(note 4 37 5 "" NOTE_INSN_DELETED)
<..>
(insn 12 11 32 (set (reg:CC 61 %cc)
(compare:CC (reg:SI 0 %r0)
(const_int 8 [0x8]))) 70 {*cmpsi_cc_insn} (nil)
(expr_list:REG_DEAD (reg:SI 0 %r0)
(nil)))
(insn 32 12 34 (set (reg:SI 71)
(const_int 20 [0x14])) 7 {*movsi_insn} (nil)
(expr_list:REG_EQUAL (const_int 20 [0x14])
(nil)))
(insn 34 32 14 (set (reg/v:SI 68)
(if_then_else (leu (reg:CC 61 %cc)
(const_int 0 [0x0]))
(const_int 69 [0x45])
(reg:SI 71))) 29 {*movsicc_insn} (insn_list 12 (insn_list 32
(nil)))
(expr_list:REG_DEAD (reg:CC 61 %cc)
(expr_list:REG_DEAD (reg:SI 71)
(nil))))
<...>
-----------
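A small sketch for comparing the two dumps at the boundary values. It
assumes that the (ltu ...) and (leu ...) conditions map directly onto
unsigned C comparisons of n against the constant in the preceding
compare; the helper names are hypothetical.

```c
/* Assumed reading of the '.flow' dump:
   compare n with 9, then (ltu) selects 69, otherwise 20. */
unsigned int rtl_flow(unsigned int n)
{
    return (n < 9u) ? 69u : 20u;
}

/* Assumed reading of the '.combine' dump:
   compare n with 8, then (leu) selects 69, otherwise 20. */
unsigned int rtl_combine(unsigned int n)
{
    return (n <= 8u) ? 69u : 20u;
}

/* The C source: 20 when n >= 10, otherwise 69. */
unsigned int c_source(unsigned int n)
{
    return (n >= 10u) ? 20u : 69u;
}

/* Count the inputs in [0, limit] on which two models disagree.
   Keep limit well below UINT_MAX so the loop terminates. */
unsigned int count_diffs(unsigned int (*a)(unsigned int),
                         unsigned int (*b)(unsigned int),
                         unsigned int limit)
{
    unsigned int n, diffs = 0;

    for (n = 0; n <= limit; n++)
        if (a(n) != b(n))
            diffs++;
    return diffs;
}
```

With this reading, count_diffs (rtl_flow, rtl_combine, 1000) is 0, while
each of them differs from c_source at n = 9 only. If the reading is
right, it may be worth checking the dumps that come before '.flow' as
well, not only '.combine'.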
I'm not an expert on RTL generation/optimization, but I imagine the
problem is more likely in the gcc internals than in the arc.md rule
definitions. I'm not sure about it, though, so feel free to ask for the
affected rules in the .md file.
By the way, there is a workaround for this situation. If you add another
source line to the statement block of the 'if' test, you get the proper
result, i.e.:
uifprb04503_A (unsigned int n)
{
  unsigned int test_value = 69;
  unsigned int dummy_value = 69;
  if ( n >= SINGULARP )
    {
      test_value = 20;
      dummy_value = test_value + 4; /* added to avoid bad object */
    }
<...>
Hints and help appreciated.
Have a good day
Jose.