This is the mail archive of the mailing list for the glibc project.


Re: glibc.cpu.cached_memopt (was Re: [PATCH] Rename the glibc.tune namespace to glibc.cpu)

On 08/06/2018 07:03 PM, Tulio Magno Quites Machado Filho wrote:
> Yes, for cacheable memory.  A safe execution uses only naturally aligned
> memory accesses and doesn't provide the best performance we have.

> However, an unsafe execution on cache-inhibited memory is catastrophic:
> every access that is not naturally aligned generates an alignment
> interrupt that must be handled by the kernel, causing an even greater
> performance impact than a safe execution on cacheable memory.

There seem to be two slightly orthogonal discussions here: first, the issue of using memcpy for volatile objects, because overlapping writes may not work correctly without barriers; second, the question of ensuring aligned accesses for device memory that may have been mapped in as cache-inhibited and does not tolerate misaligned accesses.

It seems to me the issue with Power w.r.t. cache-inhibited memory access is only the latter. Is that correct?

>> does it make sense to fix this in glibc?

> IMHO, yes.  I have yet to see a good explanation of why userspace
> programs should not use memcpy under these conditions; e.g., AFAIK,
> ISO C11 does not prohibit it.

If it is a question of misaligned accesses only, then there may be a case for adding a memcpy variant that strictly performs aligned accesses, but a better name for that would be glibc.cpu.misaligned_access rather than cached_memopt, since the latter has slightly different implications.

If volatile (and overlapping) access is also an issue, then there seems to be some clarity that we need not attempt to support it in memcpy by default. I don't know whether having support only on Power makes sense, but if there is a strong need for it, then the tunable name should change to something more precise, e.g. glibc.cpu.ppc_allow_volatile_memcpy.

> I still believe this could help, but there is still one open issue: how
> do we know whether a memcpy call is accessing cache-inhibited memory?
> I'm afraid this property is not that easy to detect.

It's not; it has to be annotated by the developer.

