This is the mail archive of the mailing list for the libc-ports project.

Re: [patch, mips] Improved memset for MIPS

On 12/12/2013 07:14 PM, Steve Ellcey wrote:
> On Thu, 2013-12-12 at 19:01 -0500, Carlos O'Donell wrote:
>>> I noticed this patch causes some performance regressions on Octeon,
>>> because Octeon has 128-byte cache lines.
>>> Changing PREFETCH_CHUNK/PREFETCH_FOR_STORE to assume a 128-byte cache
>>> line gives us the performance back and improves on the original code
>>> by at least 15%.
>>> That is:
>>> #  define PREFETCH_CHUNK 128
>>> #  define PREFETCH_FOR_STORE(chunk, reg) \
>>>     pref PREFETCH_STORE_HINT, (chunk)*128(reg);
>> Submit a patch for that?
>> We have microbenchmarks now, but the next hardest
>> part is going to be archiving data by device so that
>> the community can help track performance and point
>> out regressions like this.
>> Cheers,
>> Carlos.
> Unless the change is under some kind of ifdef for Octeon, changing this
> will probably slow down other MIPS chips.  Most of them have 32-byte
> cache lines.

Absolutely. I'm not suggesting he just change it; Andrew would have to add
enough framework for Octeon to be enabled with an optimal implementation.
For example, you could compile an alternate version with 128-byte cache
line support and select it via IFUNC based on AT_HWCAP?
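
To make the idea concrete, here is a rough sketch of the selection logic
in plain C. It is only an illustration under stated assumptions, not a
proposed patch: it uses an ordinary function pointer rather than a real
IFUNC resolver, the HWCAP bit is made up (I am not aware of an existing
MIPS AT_HWCAP bit that reports the cache line size), and the two variants
are stand-ins for two builds of the assembly memset with PREFETCH_CHUNK
set to 32 and 128. All names other than getauxval and AT_HWCAP are
hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/auxv.h>   /* getauxval, AT_HWCAP (glibc 2.16 or later) */

/* HYPOTHETICAL bit: not a real MIPS HWCAP flag, used only for this sketch. */
#define HWCAP_HYPOTHETICAL_LINE128 (1UL << 0)

typedef void *(*memset_fn)(void *, int, size_t);

/* Stand-ins for two builds of the assembly memset: one tuned for 32-byte
   cache lines, one for 128-byte lines.  Both simply defer to memset here. */
static void *memset_line32(void *dst, int c, size_t n)
{
    return memset(dst, c, n);
}

static void *memset_line128(void *dst, int c, size_t n)
{
    return memset(dst, c, n);
}

/* Pick a variant from the hardware capability word.  An IFUNC resolver
   would make the same decision once, at symbol binding time. */
static memset_fn select_memset(unsigned long hwcap)
{
    return (hwcap & HWCAP_HYPOTHETICAL_LINE128) ? memset_line128
                                                : memset_line32;
}

/* Lazy dispatch: resolve on first call, then reuse the chosen variant. */
void *dispatch_memset(void *dst, int c, size_t n)
{
    static memset_fn impl;

    if (impl == NULL)
        impl = select_memset(getauxval(AT_HWCAP));
    return impl(dst, c, n);
}
```

With something like this, chips with 32-byte lines keep the existing
code path, and only hardware that advertises the (assumed) capability
gets the 128-byte variant. A real implementation would do the selection
in the IFUNC resolver so that calls pay no per-call check.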

