optimizing gcc output for ARM
Jens-Christian Lache
lache@tu-harburg.de
Wed Nov 22 09:34:00 GMT 2000
If I compile the following lines
for (i = 0; i < 10; i++) {
    asm volatile ("mla %0,%3,%2,%1" : "=r" (sum) : "r" (coeff), "r" (sample), "r" (sum));
}
I get the assembler output
.L3:
        ldr r3, [fp, #-28]
        cmp r3, #9
        ble .L6
        b .L4
.L6:
        ldr r3, [fp, #-16]
        ldr r2, [fp, #-20]
        ldr r1, [fp, #-24]
        ldr ip, [fp, #-16]
        mla r3,r1,r2,r3
        mov r2, r3
        str r2, [fp, #-16]
.L5:
        ldr r3, [fp, #-28]
        add r2, r3, #1
        str r2, [fp, #-28]
        b .L3
(Explanation: [fp, #-16]: sum; [fp, #-20]: coeff; [fp, #-24]: sample; [fp, #-28]: i)
I can live with the fact that gcc doesn't recognize the
ARM's special MAC instruction by itself, but the line
after the mla (mov r2, r3) is redundant:
str r3, [fp, #-16]
would work just as well and is one instruction shorter.
The line above the mla (ldr ip, [fp, #-16]) is not necessary either,
since ip is never used afterwards.
1.) Why does gcc produce such output?
2.) How can I avoid this?
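One possible direction for both questions, as a sketch rather than a confirmed diagnosis: the listing reloads every variable from the frame on each pass, which is typical of a build without optimization, so compiling with -O2 may already remove the extra mov and ldr. Separately, sum can be handed to the asm as a single read-write operand with the "+r" constraint instead of being listed once as output and once as input. The function name mac_loop, the int types, and the plain C fallback for non-ARM builds are assumptions made for illustration; they are not taken from the code above.

```c
/* Hypothetical helper (not from the original code): adds
   coeff * sample to sum ten times, like the loop above. */
int mac_loop(int sum, int coeff, int sample)
{
    int i;

    for (i = 0; i < 10; i++) {
#if defined(__arm__) && !defined(__thumb__)
        /* mla Rd, Rm, Rs, Rn computes Rd = Rm * Rs + Rn.
           "+r" marks sum as read and written by the asm, so it
           serves as both destination and addend here. */
        asm volatile ("mla %0, %1, %2, %0"
                      : "+r" (sum)
                      : "r" (coeff), "r" (sample));
#else
        /* Assumed portable fallback so the sketch also builds
           on a non-ARM host; same arithmetic in plain C. */
        sum += coeff * sample;
#endif
    }
    return sum;
}
```

With these assumptions, mac_loop(0, 3, 4) adds 3*4 ten times and returns 120. With optimization enabled, gcc may also choose mla on its own for the plain C line, which would make the inline asm unnecessary; comparing the -S output of both variants would show whether it does.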
Thanks for your hints!
Jens-Christian
--
Jens-Christian Lache
Technische Universitaet Hamburg-Harburg
www.tu-harburg.de/~sejl1601
Mail:
lache@tu-harburg.de
lache@ngi.de
Tel.:
+0491759610756