optimizing gcc output for ARM

Wed Nov 22 09:34:00 GMT 2000

If I compile the following lines

  for (i=0;i<10;i++) {
    asm volatile ("mla %0,%3,%2,%1":"=r" (sum):"r" (coeff), "r" (sample), "r" (sum));
  }

I get the following assembler output:
.L3:
	ldr	r3, [fp, #-28]
	cmp	r3, #9
	ble	.L6
	b	.L4
.L6:
	ldr	r3, [fp, #-16]
	ldr	r2, [fp, #-20]
	ldr	r1, [fp, #-24]
	ldr	ip, [fp, #-16]
	mla r3,r1,r2,r3
	mov	r2, r3
	str	r2, [fp, #-16]
.L5:
	ldr	r3, [fp, #-28]
	add	r2, r3, #1
	str	r2, [fp, #-28]
	b	.L3

(Explanation: [fp, #-16]: sum; [fp, #-20]: coeff; [fp, #-24]: sample; [fp, #-28]: i)
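
To make the emitted instruction easier to read against the stack slots listed above, here is a minimal C model of it. ARM's MLA Rd, Rm, Rs, Rn computes Rd = Rm * Rs + Rn, truncated to 32 bits. The names mla_model and mac_step are hypothetical and only serve this illustration; the register-to-variable mapping follows the explanation above (r1 = sample, r2 = coeff, r3 = sum).

```c
#include <stdint.h>

/* Hypothetical helper: models ARM "MLA Rd, Rm, Rs, Rn",
   i.e. Rd = Rm * Rs + Rn, truncated to 32 bits. */
uint32_t mla_model(uint32_t rm, uint32_t rs, uint32_t rn)
{
    return rm * rs + rn;   /* unsigned arithmetic wraps modulo 2^32 */
}

/* One loop iteration as emitted in the listing: mla r3, r1, r2, r3
   with r1 = sample, r2 = coeff, r3 = sum (assumed from the explanation). */
uint32_t mac_step(uint32_t sum, uint32_t coeff, uint32_t sample)
{
    return mla_model(sample, coeff, sum);
}
```

For example, mac_step(10, 2, 3) yields 3 * 2 + 10 = 16, which is the new value of sum after one pass through the loop body.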

I can live with the fact that gcc doesn't recognize the
special MAC instruction of the ARM, but the line
after the mla (mov r2, r3) is pointless:
	str	r3, [fp, #-16]
would work just as well and is one instruction shorter.
The line above the mla (ldr ip, [fp, #-16]) is not necessary either.
1.) Why does gcc produce such output?
2.) How can I avoid this?
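
One possible direction for question 2, sketched under assumptions rather than offered as a definitive answer: the listing has the typical shape of unoptimized output, where every variable is reloaded from the stack frame and stored back after each statement. Compiling with optimization enabled (for example -O2) may remove the redundant load and move. Writing the multiply-accumulate in plain C also leaves the instruction choice to the compiler, which may or may not select mla depending on the gcc version and target options; an optimizing compiler may even simplify the loop away, since coeff and sample do not change. The function name mac_loop and its signature are hypothetical.

```c
/* Hypothetical function name and signature, for illustration only. */
int mac_loop(int sum, int coeff, int sample)
{
    int i;

    for (i = 0; i < 10; i++) {
        sum += coeff * sample;   /* multiply-accumulate in plain C */
    }
    return sum;
}
```

Calling mac_loop(1, 2, 3) adds 2 * 3 to the initial sum ten times and returns 61.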

Thanks for your hints!
Jens-Christian

-- 

Jens-Christian Lache
Technische Universitaet Hamburg-Harburg
www.tu-harburg.de/~sejl1601
Mail:
lache@tu-harburg.de
lache@ngi.de
Tel.:
+0491759610756
