This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PATCH] powerpc: Use aligned stores in memset


Florian Weimer <fweimer@redhat.com> writes:

> On 08/18/2017 11:10 AM, Florian Weimer wrote:
>> On 08/18/2017 08:51 AM, Rajalakshmi Srinivasaraghavan wrote:
>>>
>>>
>>> On 08/18/2017 11:51 AM, Florian Weimer wrote:
>>>> On 08/18/2017 07:11 AM, Rajalakshmi Srinivasaraghavan wrote:
>>>>>     * sysdeps/powerpc/powerpc64/power8/memset.S: Store byte by byte
>>>>>     for unaligned inputs if size is less than 8.
>>>>
>>>> This makes me rather nervous.  powerpc64le was supposed to have
>>>> reasonably efficient unaligned loads and stores.  GCC happily generates
>>>> them, too.
>>>
>>> This is meant ONLY for caching inhibited accesses.  Caching Inhibited
>>> accesses are required to be Guarded and properly aligned.
>> 
>> The intent is to support memset for such memory regions, right?  This
>> change is insufficient.  You have to fix GCC as well because it will
>> inline memset of unaligned pointers, like this:
>
> Here's a more complete example:
>
>
> #include <assert.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
>
> typedef long __attribute__ ((aligned(1))) long_unaligned;
>
> __attribute__ ((noinline, noclone, weak))
> void
> clear (long_unaligned *p)
> {
>   memset (p, 0, sizeof (*p));
> }
>
> struct data
> {
>   char misalign;
>   long_unaligned data;
> };
>
> int
> main (void)
> {
>   struct data *data = malloc (sizeof (*data));
>   assert (data != NULL);
>   long_unaligned *p = &data->data;
>   printf ("pointer: %p\n", p);
>   clear (p);
>   return 0;
> }
>
> The clear function compiles to:
>
> typedef long __attribute__ ((aligned(1))) long_unaligned;
>
> void
> clear (long_unaligned *p)
> {
>   memset (p, 0, sizeof (*p));
> }
>
> At run time, I get:
>
> pointer: 0x10003c10011
>
> This means that GCC introduced an unaligned store, no matter how memset
> was implemented.

Which isn't necessarily a problem.  The performance penalty only appears
when the memory access refers to an address that isn't on the
instruction's natural boundary.

In this case, memset should use stb to avoid an alignment interrupt.

Note that when the memory access is not on the natural boundary, an
alignment interrupt is generated, but it does not result in an error: the
access still happens, only with a performance penalty.
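
To make the idea concrete, here is a rough C sketch of the behaviour
described in the ChangeLog entry (byte stores for short fills to an
unaligned destination).  It is only an illustration, not the actual
power8 assembly: the name sketch_memset is hypothetical, and the
threshold of 8 bytes and the 8-byte alignment check are assumptions taken
from the ChangeLog wording.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration, not glibc code: fill N bytes at DST with C.
   Short fills to a destination that is not 8-byte aligned are done one
   byte at a time, so no store is wider than its alignment allows.  */
void *
sketch_memset (void *dst, int c, size_t n)
{
  assert (dst != NULL || n == 0);

  if (n < 8 && ((uintptr_t) dst & 7) != 0)
    {
      /* volatile keeps the compiler from merging the byte stores back
         into a single wider (and possibly unaligned) store.  */
      volatile unsigned char *p = dst;
      for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char) c;
      return dst;
    }

  /* Everything else is left to the regular implementation.  */
  return memset (dst, c, n);
}
```

Note that a sketch like this only controls the stores it issues itself;
it says nothing about stores the compiler emits when it inlines a memset
call at the call site.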

> So I think the implementation constraint on the mem* functions is wrong.
> It leads to a slower implementation of the mem* functions for most of
> userspace which does not access device memory, and even for device
> memory, it is probably not what you want.

Makes sense.  But as there is nothing in the standard allowing or prohibiting
the usage of mem* functions to access caching-inhibited memory, I thought it
would make sense to provide functions that are as generic as possible.

IMHO, it's easier for programmers to use generic functions in most scenarios
and still have access to specialized functions, e.g. a function for data
that is already aligned to 16 bytes.
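
As a sketch of what such a specialized function could look like in C
(the name memset_aligned16 is hypothetical and no such function exists in
glibc; __builtin_assume_aligned is a GCC/Clang builtin):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical specialized variant: the caller promises that DST is
   aligned to 16 bytes.  The promise is checked by the assert and then
   passed on to the compiler as an alignment hint.  */
void *
memset_aligned16 (void *dst, int c, size_t n)
{
  assert (((uintptr_t) dst & 15) == 0);
  return memset (__builtin_assume_aligned (dst, 16), c, n);
}
```

The generic memset stays the default; a caller would only reach for a
variant like this when it already knows the alignment of its buffer.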

-- 
Tulio Magno

