This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PING][PATCHv3 1/2] aarch64: Hoist ZVA check out of the memset function


On Thursday 12 October 2017 03:14 AM, Andrew Pinski wrote:
> For at least the micro-archs I work with, reading dczid_el0 can and
> will most likely be faster than reading from global memory.
> Especially if the global memory is not in the L1 cache.  This is one
> case where a micro-benchmark can fall down.  If the global memory is
> in L1 cache the read is 3 cycles while reading from dczid_el0 is 4
> cycles, but once it is out of L1 cache, reading becomes 10x worse plus
> it pollutes the L1 cache.

This is a Falkor caveat: mrs ends up being significantly slower there.  Also,
the question of reading a global is moot, since I've dropped that
code path.  It only affected non-standard ZVA sizes anyway, so it doesn't
affect current cores.

Siddhesh
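
For readers following the thread, here is a minimal sketch of what reading
dczid_el0 yields and how the ZVA block size is derived from it.  It is not
taken from the patch: the helper names zva_block_size and read_dczid_el0 are
hypothetical, and the field layout (BS in bits [3:0], DZP in bit 4) is assumed
from the Arm architecture description of DCZID_EL0.

```c
#include <stdint.h>

/* Assumed DCZID_EL0 layout: bits [3:0] (BS) hold log2 of the block size
   in 4-byte words; bit 4 (DZP) set means DC ZVA is prohibited.  */
#define DCZID_BS_MASK 0xfu
#define DCZID_DZP_BIT (1u << 4)

/* Hypothetical helper: return the ZVA block size in bytes for a raw
   DCZID_EL0 value, or 0 if DC ZVA must not be used.  */
static unsigned int
zva_block_size (uint64_t dczid)
{
  if (dczid & DCZID_DZP_BIT)
    return 0;
  /* 4 bytes per word, shifted left by BS.  */
  return 4u << (dczid & DCZID_BS_MASK);
}

#if defined (__aarch64__)
/* Hypothetical wrapper around the mrs instruction discussed above;
   only meaningful when built for AArch64.  */
static inline uint64_t
read_dczid_el0 (void)
{
  uint64_t val;
  __asm__ ("mrs %0, dczid_el0" : "=r" (val));
  return val;
}
#endif
```

With this decoding, a raw value of 0x4 gives 64 bytes and 0x5 gives 128
bytes.  Hoisting the check, as the subject line puts it, would mean calling
something like read_dczid_el0 once outside memset and selecting a variant
from the result, so the cost of mrs is not paid on every call.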

