This is the mail archive of the mailing list for the libc-ports project.


Re: [PATCH] sysdeps/arm/armv7/multiarch/memcpy_impl.S: Improve performance.

On Wed, Sep 4, 2013 at 6:03 AM, Ondřej Bílka <> wrote:
> On Wed, Sep 04, 2013 at 01:00:09PM +0530, Siddhesh Poyarekar wrote:
>> 2. Scale with size
> Not very important for several reasons. One is that big sizes are cold
> (oprofile output shows that the loops run less often than the header).
> The second reason is that, looking at the callers, large sizes are
> unlikely to be the bottleneck.

In my experience, extremely large data sizes are not very common.
Optimizing for those yields diminishing returns.  I believe that at very
large sizes the pressure is all on the hardware anyway.  Prefetching
large amounts of data in a loop takes a fixed amount of time and given
a large enough amount of data, the overhead introduced by most other
factors is negligible.
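
To put rough numbers on that argument, here is a minimal sketch of a
cost model, assuming a call costs a fixed overhead plus a streaming time
proportional to the size.  The function name and both parameters are
hypothetical and are not measurements of any real memcpy:

```c
#include <stddef.h>

/* Hypothetical cost model: total time = fixed_ns + size / bytes_per_ns.
 * Returns the share of the total spent in the fixed part (header,
 * alignment checks, dispatch), which shrinks as size grows. */
double overhead_fraction(double fixed_ns, size_t size, double bytes_per_ns)
{
    double stream_ns = (double)size / bytes_per_ns;  /* time moving data */
    return fixed_ns / (fixed_ns + stream_ns);
}
```

With made-up values of 20 ns of overhead and 8 bytes/ns of bandwidth,
a 16-byte call spends roughly 0.91 of its time in overhead, while a
1 MiB call spends well under 0.001 of it there.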

>> 4. Measure the effect of dcache pressure on function performance
>> 5. Measure effect of icache pressure on function performance.
> Here you really need to base weights on function usage patterns.
> A bigger code size is acceptable for functions that are called more
> often. You need to see the distribution of how calls are clustered to
> get the full picture. A strcmp is least sensitive to icache concerns, as
> when it is called it's mostly 100 times over in a tight loop, so size is
> not a big issue. If the same number of calls is uniformly spread through
> the program, we need stricter criteria.

Icache pressure is probably one of the more difficult things to
measure with a benchmark.  I suppose it'd be easier with a pipeline

Can you explain how usage pattern analysis might reveal icache pressure?

I'm not sure how useful 'usage patterns' are when considering dcache
pressure.  On Power we have data-cache prefetch instructions, and since
we know that dcache pressure is a reality, we prefetch if the data size
is large enough to outweigh the overhead of prefetching, e.g., when the
data size exceeds the cacheline size.
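
As a rough illustration of that size test (not the actual Power
implementation), here is a minimal C sketch.  It assumes GCC or Clang
for __builtin_prefetch; the 128-byte line size, the prefetch distance,
and both function names are assumptions chosen for the example:

```c
#include <stddef.h>
#include <string.h>

#define CACHELINE_BYTES 128                    /* assumed line size */
#define PREFETCH_AHEAD  (4 * CACHELINE_BYTES)  /* assumed distance */

/* Prefetch only when the copy is larger than one cache line. */
int should_prefetch(size_t n)
{
    return n > CACHELINE_BYTES;
}

void *copy_with_prefetch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t off = 0;

    if (!should_prefetch(n))
        return memcpy(dst, src, n);            /* small copy: no prefetch */

    while (off < n) {
        size_t left = n - off;
        size_t chunk = left < CACHELINE_BYTES ? left : CACHELINE_BYTES;

        /* Hint a line further ahead, staying inside the source buffer. */
        if (left > PREFETCH_AHEAD)
            __builtin_prefetch(s + off + PREFETCH_AHEAD, 0, 0);

        memcpy(d + off, s + off, chunk);
        off += chunk;
    }
    return dst;
}
```

Whether the hint pays off at a given threshold depends on the hardware,
so the constants above would need to be tuned by measurement.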

