This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] powerpc: New feature - HWCAP/HWCAP2 bits in the TCB


On Tue, Jun 09, 2015 at 01:43:09PM -0500, Steven Munroe wrote:
> On Tue, 2015-06-09 at 13:42 -0400, Rich Felker wrote:
> > On Tue, Jun 09, 2015 at 12:37:04PM -0500, Steven Munroe wrote:
> > > On Tue, 2015-06-09 at 12:50 -0400, Rich Felker wrote:
> > > > On Tue, Jun 09, 2015 at 04:48:10PM +0100, Szabolcs Nagy wrote:
> > > > > >> if hwcap is useful abi between compiler and libc
> > > > > >> then why is this done in a powerpc specific way?
> > > > > > 
> > > > > > > Other platforms are free to use this technique.
> > > > > 
> > > > > i think this is not a sustainable approach for
> > > > > compiler abi extensions.
> > > > > 
> > > > > (it means juggling with magic offsets on the order
> > > > > of compilers * libcs * targets).
> > > > > 
> > > > > unfortunately accessing the ssp canary is already
> > > > > broken this way, i'm not sure what's a better abi,
> > > > > but it's probably worth thinking about one before
> > > > > the tcb code gets too messy.
> > > > 
> > > > For the canary I think it makes sense, even though it's ugly -- the
> > > > compiler has to generate a reference in every single function (for
> > > > 'all' mode, or just most non-trivial functions in 'strong' mode).
> > > > That's much different from a feature (hwcap) that should only be used
> > > > at init-time and where, even if programmers did abuse it and use it
> > > > over and over at runtime, it's only going to be a small constant
> > > > overhead in a presumably medium to large sized function, and the cost
> > > > is only the need to setup the GOT register and load from the GOT,
> > > > anyway.
> > > 
> > > > You are entitled to your own opinion but you are not accounting for the
> > > > aggressive out of order execution of the POWER processors and specifics of
> > > > the PowerISA. In the time it takes to load indirect via the TOC (4 cycles
> > > > minimum) and compare/branch, we could have executed 12-16 useful
> > > > instructions.
> > > 
> > > Any indirection exposes the sequences to hazards (like cache miss) which
> > > only make things worse.
> > > 
> > > As stated before I have thought about this and understand the options in
> > > the context of the PowerISA, POWER micro-architecture, and the PowerPC
> > > ABIs. This information is publicly available (if a little hard to find)
> > > but I doubt you have taken the time to study it in detail, if at all.
> > > 
> > > I suspect you base your opinion on other architectures and hardware
> > > implementations that do not apply to this situation. 
> > 
> > That's nice but all theoretical. I've seen countless such theoretical
> > claims from people who are coming from a standpoint of the vendor
> > manuals for the ISA they're working with, and more often than not,
> > they don't translate into measurable benefits. (I've been guilty of
> > this myself too, going to great lengths to tweak x86 codegen or even
> > > write the asm by hand, only to find the resulting code ran at the
> > > exact same speed.) Creating a permanent ABI is an extremely high cost,
> > and unless you can justify the cost with actual measurements and a
> > reason to believe those measurements have anything to do with
> > real-world usage needs, I believe it's an unjustified cost.
> 
> This is not theory, I am thinking at the level of pipeline cycle timing
> for P7/P8. I have been at this so long I can do this in my head.
> 
> Now experience does tell me that adding an indirection and the
> associated exposure to cache miss hazard can mean the performance
> optimization gets lost in the hazard when it is measured.
> 
> I have been to this movie, I don't need to see it again.

Doing this in your head is EXACTLY what I mean by theoretical.
Non-theoretical would be having a test program that demonstrates the
timing difference, i.e. empirical.
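
A minimal sketch of what such a test program might look like, assuming
a POSIX system with clock_gettime. Everything named here (fake_hwcap,
fake_hwcap_ptr, count_direct, count_indirect, time_ns) is hypothetical
and not part of glibc or the proposed patch. The pointer only loosely
models the extra dependent load of a GOT/TOC access; it does not
reproduce a real TCB read or a cache miss, so numbers from it would be
a starting point for discussion rather than evidence either way.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the hwcap word; NOT the real TCB field. */
static volatile uint64_t fake_hwcap = 0x1;
/* Volatile pointer forces one extra dependent load per access,
   loosely modelling an access that goes through the GOT/TOC. */
static volatile uint64_t *volatile fake_hwcap_ptr = &fake_hwcap;

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Test the feature bit with a single load per iteration. */
uint64_t count_direct(uint64_t iters, uint64_t mask)
{
    uint64_t hits = 0;
    for (uint64_t i = 0; i < iters; i++)
        if (fake_hwcap & mask)
            hits++;
    return hits;
}

/* Same test, but load the pointer first and then the word. */
uint64_t count_indirect(uint64_t iters, uint64_t mask)
{
    uint64_t hits = 0;
    for (uint64_t i = 0; i < iters; i++)
        if (*fake_hwcap_ptr & mask)
            hits++;
    return hits;
}

/* Run one variant and return the elapsed wall-clock nanoseconds. */
uint64_t time_ns(uint64_t (*fn)(uint64_t, uint64_t),
                 uint64_t iters, uint64_t mask, uint64_t *hits_out)
{
    uint64_t t0 = now_ns();
    *hits_out = fn(iters, mask);
    return now_ns() - t0;
}

#ifdef BENCH_MAIN
int main(void)
{
    const uint64_t iters = 100000000ull;
    uint64_t h_direct, h_indirect;
    uint64_t t_direct = time_ns(count_direct, iters, 0x1, &h_direct);
    uint64_t t_indirect = time_ns(count_indirect, iters, 0x1, &h_indirect);

    /* Both variants must agree before the timings mean anything. */
    assert(h_direct == h_indirect);
    printf("direct:   %.2f ns/iter\n", (double)t_direct / (double)iters);
    printf("indirect: %.2f ns/iter\n", (double)t_indirect / (double)iters);
    return 0;
}
#endif
```

Built with something like "cc -O2 -DBENCH_MAIN bench.c", this prints a
per-iteration cost for each variant. It only covers the hot-cache case;
a more convincing test would put the check inside a real workload on
the hardware in question.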

Rich

