This is the mail archive of the
mailing list for the libc-ports project.
Re: [PATCH] ARM: Add pointer guard support.
- From: "Carlos O'Donell" <carlos at redhat dot com>
- To: Will Newton <will dot newton at linaro dot org>
- Cc: "libc-ports at sourceware dot org" <libc-ports at sourceware dot org>, Patch Tracking <patches at linaro dot org>
- Date: Wed, 25 Sep 2013 12:25:50 -0400
- Subject: Re: [PATCH] ARM: Add pointer guard support.
- References: <5242A79D dot 1030709 at linaro dot org> <52430AA4 dot 70703 at redhat dot com> <CANu=DmhCMr-LpHLaDRsLOHDVXsWm-sxHzRTMJxNCJs3Ae0uPZg at mail dot gmail dot com>
On 09/25/2013 12:23 PM, Will Newton wrote:
> On 25 September 2013 17:09, Carlos O'Donell <email@example.com> wrote:
>> On 09/25/2013 05:06 AM, Will Newton wrote:
>>> Add support for pointer mangling in glibc internal structures in C
>>> and assembler code.
>>> Tested on armv7 with hard and soft thread pointers.
>> Have you measured the performance versus using the existing
>> global variable?
> No, but I'll put together a patch for that approach and see how it looks.
>> TLS access on ARM is quite slow and it looks to me like it
>> may be faster to use the global variable. Keep in mind that
>> the pointer guard and stack guard do not vary by thread.
> From a back-of-the-envelope calculation the cost of accessing TLS is
> one cycle faster than accessing a global in the best case (e.g.
> Cortex-A15), considerably slower in the common case (50-60 cycles,
> Cortex-A9) and slower still in the worst case (function call to libgcc
> and the kernel, ARMv4/ARMv5).
> Pointer guard looks to be on slower code paths anyway as compared to
> stack guard, but you may be right that the global variable solution is
> the best way to go.
Thanks for exploring this solution.
>> 32-bit ARM is currently using a global variable e.g.
>> __pointer_chk_guard, all you need to do to make it work
>> is adjust the definitions of PTR_MANGLE and PTR_DEMANGLE
>> to reference the global symbol.
>> This is the second proposal for ARM (the first was for
>> AArch64) to support storing a guard in the TCB, but
>> nobody has responded yet to my question about performance.
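To make the global-variable suggestion concrete, here is a minimal
sketch of mangling against a single process-wide guard. It assumes a
plain XOR; the real PTR_MANGLE and PTR_DEMANGLE definitions are
per-architecture and may do more than this. Every demo_* name is
hypothetical and stands in for the glibc internals, with
demo_pointer_guard playing the role of __pointer_chk_guard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the global guard.  A fixed constant is used
   here only so the sketch is deterministic; a real guard would be
   filled in at startup with random data.  */
static uintptr_t demo_pointer_guard = 0x5bd1e995u;

/* XOR-only sketch (an assumption, not the real macro bodies).
   Mangling and demangling are the same operation under XOR.  */
static inline uintptr_t
demo_ptr_mangle (uintptr_t p)
{
  return p ^ demo_pointer_guard;
}

static inline uintptr_t
demo_ptr_demangle (uintptr_t p)
{
  return p ^ demo_pointer_guard;
}

typedef void (*demo_callback) (void);

static int demo_calls;

static void
demo_hello (void)
{
  demo_calls++;
}

/* Store a callback in mangled form, as an internal structure would.  */
static uintptr_t
demo_store (demo_callback fn)
{
  return demo_ptr_mangle ((uintptr_t) fn);
}

/* Recover the callback from its mangled form and call it.  */
static void
demo_invoke (uintptr_t stored)
{
  demo_callback fn = (demo_callback) demo_ptr_demangle (stored);
  assert (fn != NULL);
  fn ();
}
```

Because the guard is one ordinary global, both operations cost a
normal data load plus an XOR, with no thread pointer access involved.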
> For AArch64 the equation is different - all AArch64 cores have a TLS
> register, and while it is not general purpose I suspect accessing it
> will be much faster than on the worst performing 32-bit cores. I don't
> have any numbers though.
I don't disagree with you, but I'd like to see some due diligence
in testing out the two alternatives and reporting back the performance
numbers. You need not implement both, just test the two access methods
using a small test program and report the data.
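
As a rough starting point, a small test program along these lines
might look like the sketch below. It rests on several assumptions: it
needs GCC or Clang for the noinline attribute, all names are
hypothetical, and a _Thread_local variable only approximates a guard
stored in the TCB, since the compiler picks the TLS access sequence
for it. The volatile qualifiers and noinline helpers are there so
that each iteration performs a real load rather than a hoisted one.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-ins for the two places a guard could live.  */
static volatile uintptr_t demo_guard_global = 0x1234abcdu;
static _Thread_local volatile uintptr_t demo_guard_tls = 0x1234abcdu;

/* XOR against the guard held in an ordinary global.  */
__attribute__ ((noinline)) static uintptr_t
mangle_with_global (uintptr_t p)
{
  return p ^ demo_guard_global;
}

/* XOR against the guard held in thread-local storage.  */
__attribute__ ((noinline)) static uintptr_t
mangle_with_tls (uintptr_t p)
{
  return p ^ demo_guard_tls;
}

static double
elapsed_ns (struct timespec a, struct timespec b)
{
  return (b.tv_sec - a.tv_sec) * 1e9 + (b.tv_nsec - a.tv_nsec);
}

/* Call FN ITERS times in a dependent chain and return the average
   nanoseconds per call.  The final value is written to *SINK so the
   loop cannot be discarded.  */
static double
time_mangle (uintptr_t (*fn) (uintptr_t), unsigned long iters,
             uintptr_t *sink)
{
  struct timespec start, end;
  uintptr_t v = 1;

  assert (iters > 0);
  clock_gettime (CLOCK_MONOTONIC, &start);
  for (unsigned long i = 0; i < iters; i++)
    v = fn (v);
  clock_gettime (CLOCK_MONOTONIC, &end);
  *sink = v;
  return elapsed_ns (start, end) / (double) iters;
}
```

Calling time_mangle once with mangle_with_global and once with
mangle_with_tls, using the same large iteration count, gives a
per-call figure for each access method. Both paths pay the same
indirect-call overhead, so the difference between the two figures is
the interesting number, and it should be collected on each core of
interest (for example Cortex-A9 and Cortex-A15).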