optimizing gcc output for ARM

Jens-Christian Lache lache@tu-harburg.de
Wed Nov 22 09:34:00 GMT 2000

If I compile the following lines

  for (i=0;i<10;i++) {
    asm volatile ("mla %0,%3,%2,%1":"=r" (sum):"r" (coeff), "r" (sample), "r" (sum));
  }

I get the following assembler output:
	ldr	r3, [fp, #-28]
	cmp	r3, #9
	ble	.L6
	b	.L4
	ldr	r3, [fp, #-16]
	ldr	r2, [fp, #-20]
	ldr	r1, [fp, #-24]
	ldr	ip, [fp, #-16]
	mla r3,r1,r2,r3
	mov	r2, r3
	str	r2, [fp, #-16]
	ldr	r3, [fp, #-28]
	add	r2, r3, #1
	str	r2, [fp, #-28]
	b	.L3

(Explanation: [fp, #-16] is sum, [fp, #-20] is coeff, [fp, #-24] is sample, and [fp, #-28] is the loop counter i.)

I can live with the fact that gcc doesn't generate the
ARM's special MAC instruction (mla) by itself, but the line
after it (mov r2, r3) is pointless:
	str	r3, [fp, #-16]
would work just as well and is one instruction shorter.
The line above the mla (ldr ip, [fp, #-16]) is not necessary either.
1.) Why does gcc produce such output?
2.) How can I avoid this?
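
One thing that may be worth trying, as a minimal sketch only: the listing above reloads and stores every variable through fp, which is typical of a build without optimization, although the post does not say which flags were used. The function below is a hypothetical test case (the names mac, coeff, sample and n are not from the original code) that expresses the multiply-accumulate in plain C, so the compiler is free to keep sum in a register and pick the instruction itself.

```c
/* Hypothetical test case, not from the original post.
   The names mac, coeff, sample and n are illustrative only. */
int mac(const int *coeff, const int *sample, int n)
{
    int sum = 0;
    int i;

    /* sum += a * b is the multiply-accumulate pattern; an optimizing
       ARM build can usually map it to a single mla instruction. */
    for (i = 0; i < n; i++)
        sum += coeff[i] * sample[i];

    return sum;
}
```

Compiling this with optimization enabled and stopping at the assembler stage (for example gcc's -O2 and -S options, using whatever cross-compiler name the toolchain provides) makes it easy to compare the result against the listing above.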

Thanks for your hints!


Jens-Christian Lache
Technische Universitaet Hamburg-Harburg

