Optimization question
Richard Earnshaw
rearnsha@arm.com
Thu Aug 24 06:16:00 GMT 2000
>
> I'm using an arm-elf cross compiler built from 2.95.2 for a device with
> a StrongARM CPU core. I've noticed that the compiler tries to avoid
> multiply instructions by transforming a 32 bit multiply by a constant
> into a sequence of adds and subtracts with shifts. This is probably
> desirable if the processor has a slow multiply instruction, but the
> StrongARM core I'm using has a fast multiply (1 clock issue, 1-3 clock
> result delay depending on early termination). So I'd really prefer for
> the compiler to use the multiply instruction. A quick glance through
> arm.c in the GCC sources indicates that when -mcpu=strongarm is used,
> then a flag (arm_fast_multiply) gets set. Should this cause the use of
> the multiply instructions (or at least make them more favorable)? Any
> hints on how to get the compiler to cooperate?
>
Well, when multiplying by a constant, it is nearly always faster to build
the operation up from shift and add instructions, even on a StrongARM. Remember
that to use the multiply instruction a constant first has to be loaded
into a register; that takes at least one cycle and may take many more if
the value has to be synthesised or fetched from an area of memory that
might be outside the cache (though that can sometimes be moved outside of
a loop at the expense of increasing register pressure). It then takes at
least two cycles to perform the multiply itself, so we have an absolute
minimum of 3 cycles before it could be possible to save time by using the
multiply instruction. A very large number of constant multiplications in
normal code can be synthesised in three or fewer shift+add insns (each taking
one cycle), so there are only a small number of cases where it would be
better to use the multiply instruction even on a StrongARM.
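As a rough illustration of the synthesis described above, here is a minimal
sketch in C. The function names and the chosen constants (7, 10, 45) are
made up for this example and are not taken from GCC or from the original
question. Each statement is written so that it could map onto one ARM
data-processing instruction with a shifted second operand, although the
code a given compiler version actually emits may differ.

```c
#include <stdint.h>

/* Hypothetical helpers: unsigned arithmetic is used so that the results
 * wrap modulo 2^32, matching the low 32 bits of a 32 bit multiply. */

/* x * 7 = (x << 3) - x: one subtract with a shifted operand. */
uint32_t mul7(uint32_t x)
{
    return (x << 3) - x;
}

/* x * 10 = (x * 5) * 2: one shifted add, then one shift. */
uint32_t mul10(uint32_t x)
{
    uint32_t t = x + (x << 2);   /* t = x * 5 */
    return t << 1;               /* t * 2 = x * 10 */
}

/* x * 45 = (x * 5) * 9: two shifted adds. */
uint32_t mul45(uint32_t x)
{
    uint32_t t = x + (x << 2);   /* t = x * 5 */
    return t + (t << 3);         /* t * 9 = x * 45 */
}
```

None of these needs more than two operations, and none needs the constant
to be loaded into a register first, which is where the multiply
instruction loses its advantage.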
The costings in gcc are set up to take the above into account, so I'm not
surprised that you are not seeing the use of the multiply insn. Do you
have a specific example where the compiler is definitely generating slower
code? If so, I'd be interested in taking a look at it.
Richard