This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: glibc.cpu.cached_memopt (was Re: [PATCH] Rename the glibc.tune namespace to glibc.cpu)
- From: Siddhesh Poyarekar <siddhesh at sourceware dot org>
- To: Tulio Magno Quites Machado Filho <tuliom at ascii dot art dot br>, Carlos O'Donell <carlos at redhat dot com>, libc-alpha at sourceware dot org, "H.J. Lu" <hjl dot tools at gmail dot com>
- Date: Tue, 7 Aug 2018 13:16:17 +0530
- Subject: Re: glibc.cpu.cached_memopt (was Re: [PATCH] Rename the glibc.tune namespace to glibc.cpu)
- References: <20180716141633.6948-1-siddhesh@sourceware.org> <902a4076-7b87-ea27-bab4-3740ab0a04ec@redhat.com> <25d88c07-1e8f-bd73-cc28-989930a55933@sourceware.org> <87tvozx83k.fsf@linux.ibm.com> <56f276ad-f0da-a075-b5d1-0d03520ea4fd@sourceware.org> <87o9ejgpgf.fsf@linux.ibm.com> <7348e04a-e922-4fca-b8af-52a7c1408d76@sourceware.org> <87k1p3hb63.fsf@linux.ibm.com>
On 08/06/2018 07:03 PM, Tulio Magno Quites Machado Filho wrote:
> Yes, for cacheable memory. A safe execution uses only naturally aligned memory
> accesses and doesn't provide the best performance we have.
> However an unsafe execution on cached inhibited memory is catastrophic because
> every naturally unaligned memory access generates an alignment interruption
> that is treated by the kernel, causing an even greater performance impact than
> a safe execution on cacheable memory.
There seem to be two slightly orthogonal discussions here. One is the
issue of using memcpy for volatile objects, where overlapping writes may
not work correctly without barriers. The other is the question of
ensuring aligned accesses for device memory that may have been mapped in
as cache-inhibited and does not tolerate misaligned access.
It seems to me the issue with Power w.r.t. cache-inhibited memory access
is only the latter. Is that correct?
>> does it make sense to fix this in glibc?
> IMHO, yes. I haven't seen yet a good explanation on why userspace programs
> should not be using memcpy in these conditions, e.g. AFAIK, ISO C 11 does not
> prohibit this.
If it is a question of misaligned accesses only, then there may be a case
for adding a memcpy that strictly does aligned accesses only. A better
name for that tunable would be glibc.cpu.misaligned_access rather than
cached_memopt, since the latter has slightly different implications.
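
To make "strictly does aligned accesses only" concrete, here is a rough
sketch of such a copy routine. It is not glibc code and not a proposed
patch: the name copy_aligned_only is made up for illustration, the
may_alias attribute is a GNU C extension, and the volatile qualifiers are
only there to keep the compiler from merging the accesses or turning the
loops back into a memcpy call.

```c
#include <stddef.h>
#include <stdint.h>

/* GNU C extension: lets us access arbitrary memory through this type
   without violating the aliasing rules.  */
typedef uint64_t __attribute__ ((__may_alias__)) word_t;

/* Hypothetical helper, not a glibc interface.  Copies N bytes from SRC
   to DST (non-overlapping) using only naturally aligned accesses.  */
static void *
copy_aligned_only (void *dst, const void *src, size_t n)
{
  volatile unsigned char *d = dst;
  const volatile unsigned char *s = src;

  /* Word accesses are usable only when both pointers have the same
     offset within a word.  Otherwise stay with byte accesses, which
     are always naturally aligned.  */
  if ((((uintptr_t) d ^ (uintptr_t) s) % sizeof (word_t)) == 0)
    {
      /* Byte copies until both pointers reach a word boundary.  */
      while (n > 0 && ((uintptr_t) d % sizeof (word_t)) != 0)
        {
          *d++ = *s++;
          n--;
        }

      /* Naturally aligned word copies.  */
      while (n >= sizeof (word_t))
        {
          *(volatile word_t *) d = *(const volatile word_t *) s;
          d += sizeof (word_t);
          s += sizeof (word_t);
          n -= sizeof (word_t);
        }
    }

  /* Remaining bytes, or the whole buffer if the offsets differ.  */
  while (n > 0)
    {
      *d++ = *s++;
      n--;
    }

  return dst;
}
```

A real implementation would presumably shift and merge words to handle
differing offsets instead of dropping to bytes, but the constraint it has
to respect is the same.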
If volatile (and overlapping) access is also an issue, then it seems
fairly clear that we need not attempt to support it in memcpy by
default. I don't know if having support only in Power makes sense, but
if there is a strong need for it then the tunable name should change to
something more precise, e.g. glibc.cpu.ppc_allow_volatile_memcpy.
> I still believe this could help, but there is still one open issue: how do we
> know a memcpy call is accessing cached inhibited memory?
> I'm afraid this property is not that easy to detect.
It's not; it has to be annotated by the developer.
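
One possible shape for such an annotation, purely as an illustration:
the code that creates a mapping records what kind of memory it is, and
copies go through a wrapper that consults that record. Everything here
(struct region, enum mem_kind, region_write) is hypothetical and not an
existing or proposed glibc interface, and the byte-wise loop is just the
most conservative choice for the cache-inhibited case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical annotation, set by whoever created the mapping.  */
enum mem_kind { MEM_CACHEABLE, MEM_CACHE_INHIBITED };

struct region
{
  unsigned char *base;
  size_t len;
  enum mem_kind kind;
};

/* Copy N bytes from SRC into region R at offset OFF.  Returns 0 on
   success and -1 if the range does not fit in the region.  */
static int
region_write (const struct region *r, size_t off, const void *src, size_t n)
{
  if (off > r->len || n > r->len - off)
    return -1;

  if (r->kind == MEM_CACHEABLE)
    {
      /* Ordinary memory: let memcpy use whatever accesses it likes.  */
      memcpy (r->base + off, src, n);
      return 0;
    }

  /* Cache-inhibited memory: byte stores are always naturally aligned.
     The volatile qualifier keeps the compiler from widening them.  */
  volatile unsigned char *d = r->base + off;
  const unsigned char *s = src;
  while (n-- > 0)
    *d++ = *s++;

  return 0;
}
```

The point is only that the cacheable versus cache-inhibited decision is
made where the mapping is created, since memcpy itself cannot tell the
two apart from a pointer.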
Siddhesh