This is the mail archive of the crossgcc@sources.redhat.com mailing list for the crossgcc project.
See the CrossGCC FAQ for lots more information.
Richard Earnshaw wrote:
>
> >
> > I'm using an arm-elf cross compiler built from 2.95.2 for a device with
> > a StrongARM CPU core. I've noticed that the compiler tries to avoid
> > multiply instructions by transforming a 32 bit multiply by a constant
> > into a sequence of adds and subtracts with shifts. This is probably
> > desirable if the processor has a slow multiply instruction, but the
> > StrongARM core I'm using has a fast multiply (1 clock issue, 1-3 clock
> > result delay depending on early termination). So I'd really prefer for
> > the compiler to use the multiply instruction. A quick glance through
> > arm.c in the GCC sources indicates that when -mcpu=strongarm is used,
> > then a flag (arm_fast_multiply) gets set. Should this cause the use of
> > the multiply instructions (or at least make them more favorable)? Any
> > hints on how to get the compiler to cooperate?
> >
>
> Well, when multiplying by a constant, it is nearly always faster to build
> the operation up from shift instructions, even on a StrongARM. Remember
> that to use the multiply instruction a constant first has to be loaded
> into a register; that takes at least one cycle and may take many more if
> the value has to be synthesised or fetched from an area of memory that
> might be outside the cache (though that can sometimes be moved outside of
> a loop at the expense of increasing register pressure). It then takes at
> least two cycles to perform the multiply itself, so we have an absolute
> minimum of 3 cycles before it could be possible to save time by using the
> multiply instruction. A very large number of constant multiplications in
> normal code can be synthesised in 3 or less shift+add insns (each taking
> one cycle), so there are only a small number of cases where it would be
> better to use the multiply instruction even on a StrongARM.
>
> The costings in gcc are set up to take the above into account, so I'm not
> surprised that you are not seeing the use of the multiply insn. Do you
> have a specific example where the compiler is definitely generating slower
> code? If so, I'd be interested in taking a look at it.
>
> Richard

Thanks Richard,

Your arguments are compelling. I'll just let the compiler do its "thing".

Art

------
Want more information? See the CrossGCC FAQ, http://www.objsw.com/CrossGCC/
Want to unsubscribe? Send a note to crossgcc-unsubscribe@sourceware.cygnus.com
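
As an illustration of the shift+add synthesis Richard describes, here is a minimal C sketch. It is not taken from the thread or from the GCC sources: the function names are hypothetical and the constants 10 and 7 are arbitrary examples. It shows two constant multiplications written out by hand as shift and add/subtract steps, alongside the plain multiply they replace. On ARM each such step can typically be encoded as a single data-processing instruction with a shifted operand, which is where the "3 or less insns" comparison comes from. A real compiler may of course pick either form for any of these functions, depending on its cost model.

```c
#include <stdint.h>

/* Plain multiply by a constant. If the compiler emits a multiply
 * instruction here, the constant 10 must first be placed in a register. */
uint32_t mul10_mul(uint32_t x)
{
    return x * 10u;
}

/* x * 10 built from two shift+add style steps:
 *   x5 = x + (x << 2)    gives x * 5
 *   x5 << 1              gives x * 10
 * Unsigned arithmetic wraps modulo 2^32, so the result matches
 * mul10_mul() for every input. */
uint32_t mul10_shift_add(uint32_t x)
{
    uint32_t x5 = x + (x << 2);
    return x5 << 1;
}

/* x * 7 built from one shift and one subtract: (x << 3) - x. */
uint32_t mul7_shift_sub(uint32_t x)
{
    return (x << 3) - x;
}
```

Compiling these with optimisation and inspecting the assembly (for example with -S) is one way to see which form a given compiler and -mcpu setting actually prefers.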