This is the mail archive of the
mailing list for the libc-ports project.
Re: [PATCH] ARM: Add pointer guard support.
- From: Will Newton <will dot newton at linaro dot org>
- To: "Carlos O'Donell" <carlos at redhat dot com>
- Cc: "libc-ports at sourceware dot org" <libc-ports at sourceware dot org>, Patch Tracking <patches at linaro dot org>
- Date: Wed, 25 Sep 2013 17:23:43 +0100
- Subject: Re: [PATCH] ARM: Add pointer guard support.
- Authentication-results: sourceware.org; auth=none
- References: <5242A79D dot 1030709 at linaro dot org> <52430AA4 dot 70703 at redhat dot com>
On 25 September 2013 17:09, Carlos O'Donell <carlos at redhat dot com> wrote:
> On 09/25/2013 05:06 AM, Will Newton wrote:
>> Add support for pointer mangling in glibc internal structures in C
>> and assembler code.
>> Tested on armv7 with hard and soft thread pointers.
> Have you measured the performance versus using the existing
> global variable?
No, but I'll put together a patch for that approach and see how it looks.
> TLS access on ARM is quite slow and it looks to me like it
> may be faster to use the global variable. Keep in mind that
> the pointer guard and stack guard do not vary by thread.
From a back-of-the-envelope calculation, accessing TLS is one cycle
faster than accessing a global in the best case (e.g. Cortex-A15),
considerably slower in the common case (50-60 cycles, Cortex-A9) and
slower still in the worst case (a function call into libgcc and the
kernel, ARMv4/ARMv5).
Pointer guard looks to be on slower code paths anyway as compared to
stack guard, but you may be right that the global variable solution is
the best way to go.
> 32-bit ARM is currently using a global variable e.g.
> __pointer_chk_guard, all you need to do to make it work
> is adjust the definitions of PTR_MANGLE and PTR_DEMANGLE
> to reference the global symbol.
> This is the second proposal for ARM (the first was for
> AArch64) to support storing a guard in the TCB, but
> nobody has responded yet to my question about performance.
For AArch64 the equation is different: all AArch64 cores have a TLS
register, and while it is not general purpose I suspect accessing it
will be much faster than on the worst-performing 32-bit cores. I don't
have any numbers though.
Toolchain Working Group, Linaro
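
For illustration, the global-variable approach discussed above (PTR_MANGLE and PTR_DEMANGLE referencing a process-wide guard rather than a value in the TCB) can be sketched roughly as follows. This is a simplified sketch, not the glibc implementation: the names demo_pointer_chk_guard, demo_ptr_mangle, demo_ptr_demangle and demo_roundtrip are hypothetical, the guard is a fixed constant instead of a random value chosen at startup, and the mangling is assumed to be a plain XOR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the process-wide guard.  The real
   __pointer_chk_guard is expected to hold a random value set at
   startup; a fixed constant is used here only to keep the sketch
   deterministic.  */
static uintptr_t demo_pointer_chk_guard = (uintptr_t) 0x5bd1e995u;

/* Simplified mangle: XOR the pointer bits with the global guard.
   Only a load of a global variable is needed, no TLS access.  */
static uintptr_t
demo_ptr_mangle (uintptr_t ptr)
{
  return ptr ^ demo_pointer_chk_guard;
}

/* XOR is its own inverse, so demangling applies the same operation.  */
static uintptr_t
demo_ptr_demangle (uintptr_t ptr)
{
  return ptr ^ demo_pointer_chk_guard;
}

typedef int (*demo_callback) (int);

static int
demo_add_one (int x)
{
  return x + 1;
}

/* Store a callback in mangled form, then recover it and call it.  */
static int
demo_roundtrip (int arg)
{
  uintptr_t stored = demo_ptr_mangle ((uintptr_t) &demo_add_one);

  /* The stored value differs from the raw address while the guard
     is non-zero.  */
  assert (stored != (uintptr_t) &demo_add_one);

  demo_callback fn = (demo_callback) demo_ptr_demangle (stored);
  return fn (arg);
}
```

Calling demo_roundtrip (41) mangles the address of demo_add_one, demangles it again and invokes the recovered pointer. Because the guard does not vary by thread, a single global is enough for this scheme.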