This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
[PATCH] powerpc: unaligned memcpy and DMA
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Adhemerval Zanella <azanella at linux dot vnet dot ibm dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Tue, 6 Jan 2015 21:35:48 +0100
- Subject: [PATCH] powerpc: unaligned memcpy and DMA
- Authentication-results: sourceware.org; auth=none
- References: <54A59CAC dot 1070303 at linux dot vnet dot ibm dot com> <20150106185317 dot GA27726 at domone> <54AC3381 dot 9040808 at linux dot vnet dot ibm dot com>
On Tue, Jan 06, 2015 at 05:12:01PM -0200, Adhemerval Zanella wrote:
> On 06-01-2015 16:53, Ondřej Bílka wrote:
> >
> > The main question is why there is no POWER8 memcpy using unaligned loads yet?
> >
> > Memcpy is called about a hundred times more often than strcpy (and there is no
> > strncpy call) on my computer, so the possible gains are bigger, and with an
> > optimized memcpy a generic strncpy will be faster as well.
>
> Mainly because powerpc still triggers kernel traps when issuing VMX/VSX instructions
> on non-cacheable memory. That's why I pushed 87868c2418fb74357757e3b739ce5b76b17a8929,
> by the way.
>
> Although it is not really an issue for 99% of cases, where memory will be cacheable,
> some code (especially libdrm and xorg) uses memcpy (and possibly memset) on DMA-mapped
> memory. And that's why memcpy/memset for POWER8 are still using aligned accesses all
> the time.
That looks like overkill. A better way would be to add a variable that
records whether the application can safely use unaligned accesses.
Probably the simplest way would be to add a variable in the vDSO that the
kernel sets to 1 when it handles such a trap.
Otherwise it would be more complicated, as we would need to set it when
the application allocates non-cacheable memory. Is mmap the only way to do that?
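
To make the vDSO idea concrete, here is a rough sketch of how memcpy could
dispatch on such a flag. Everything in it is an assumption for illustration:
`trap_seen` is an ordinary process-local variable standing in for the proposed
kernel-set vDSO variable (no such interface exists today), and the two copy
routines are portable placeholders rather than the real POWER7/POWER8
assembly implementations.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the proposed vDSO variable: stays 0 until the
   kernel has had to trap on a VMX/VSX access to non-cacheable memory.  */
static atomic_int trap_seen;

typedef void *(*copy_fn) (void *, const void *, size_t);

/* Placeholder for the conservative memcpy that only issues accesses
   which are safe on non-cacheable memory.  */
static void *
copy_aligned_only (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  while (n--)
    *d++ = *s++;
  return dst;
}

/* Placeholder for a memcpy built on unaligned vector loads and stores.  */
static void *
copy_unaligned (void *dst, const void *src, size_t n)
{
  return memcpy (dst, src, n);
}

/* Pick the fast variant until a trap has been recorded.  */
static copy_fn
select_copy (void)
{
  if (atomic_load_explicit (&trap_seen, memory_order_relaxed))
    return copy_aligned_only;
  return copy_unaligned;
}

/* Entry point: check the flag on every call, then copy.  */
static void *
dispatch_memcpy (void *dst, const void *src, size_t n)
{
  return select_copy () (dst, src, n);
}
```

The flag is read on each call, so a process that never touches non-cacheable
memory keeps the fast path, and one that does pays for a single trap before
falling back. Whether the extra load and branch are cheap enough on the hot
path is something that would have to be measured.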