This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf

From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
To: Siddhesh Poyarekar <siddhesh at gotplt dot org>
Cc: Szabolcs Nagy <szabolcs dot nagy at arm dot com>, libc-alpha at sourceware dot org, Ashwin Sekhar T K <ashwin dot sekhar at caviumnetworks dot com>, nd at arm dot com
Date: Tue, 13 Jun 2017 13:53:16 -0300
Subject: Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
Authentication-results: sourceware.org; auth=none
References: <20170613071707.43396-1-ashwin.sekhar@caviumnetworks.com> <593FC77A.6050609@arm.com> <1de74f07-dac3-3e01-11fc-48e3787e0f7e@gotplt.org> <593FE870.8000801@arm.com> <b2f1868b-a944-afdf-c570-d74ced0d5d3e@gotplt.org>


> On 13 Jun 2017, at 11:19, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> 
>> On Tuesday 13 June 2017 06:58 PM, Szabolcs Nagy wrote:
>> i didnt say i rejected his code, but that duplicated
>> effort is not good.
> 
> OK, but that wasn't clear from the context of the message.  I agree that
> duplication is wasteful, but it's not really that bad if it brings out
> different implementations that can be weighed and improved upon.
> 
>> asm is not acceptable even if it's slightly faster.
>> (fix the compiler in that case)
>> 
>> asm code maintenance is a huge problem in glibc,
>> in the long term generic code is better in a lot
>> of domains, the sinf/cosf code is such a case,
>> there is no special instruction that helps them
>> that the compiler cannot easily generate.
> 
> I'm going to disagree with this even though I agree with the general
> premise that C > assembly for maintenance.  If someone comes up with an
> assembly implementation that is significantly (the definition of
> 'significant' may vary from function to function) faster that cannot be
> implemented in the compiler in the current release, it makes sense to
> carry that implementation in glibc until the compiler can catch up
> provided that all other criteria (accuracy, readability, etc.) are met.
> 
> Additionally, the source of the implementation is important.  Now if
> this patch came from a university student who does not intend to follow
> up and maintain her patch then I would be (slightly, again it depends on
> the magnitude of the improvement) inclined to agree with you since it
> puts the maintenance overhead on us, but in this case the source is
> reliable, so that is an added advantage.
> 
>> i didn't say it's a glibc requirement, you have to use
>> common sense here: there are algorithms that are so
>> useful outside of glibc and so generic that it is just
>> unnecessary complication to develop them within glibc
>> (obviously it's not a complication for glibc, but for
>> everybody else, and i cant impose this procedure on
>> others, but i still think this is the better for the
>> larger community).
> 
> Yes, I did not disagree on the merit of the requirement, I am arguing
> about our ability to gate that effectively.  It might be useful to come
> up with a wiki doc (or enhancing the contribution checklist) to specify
> this.
> 
> But then, as a project we are also ideologically bound to LGPL, so again
> I wonder if doing this conflicts with that ideology.  I personally am
> more liberal about this, but I don't know if that is the general opinion
> of the community.
> 
>> if one tests the same input in a loop, that does
>> not measure the effect of branches, and thus we end
>> up breaking up the input space into many special
>> ranges; in practice, however, that's not optimal.
> 
> Currently the microbenchmark framework tests the same input in a loop a
> specific number of times, large enough that the mean time per iteration
> is stable, and then loops over the inputs - I agree that this is
> cheating a bit since it eliminates cache effects as well as branches.
> It will need a pretty straightforward fix to run only once for a single
> input and it will do what you want, i.e. measure the effect of
> branches.
> 
> Maybe Ashwin could patch the framework as well when he posts his patch.
> 
> Siddhesh

I think a good starting point would be if Ashwin could provide us with a C skeleton of the same implementation done in assembly.
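
To make the benchmarking point quoted above concrete, here is a minimal sketch of the two timing styles: repeating one input in a tight loop versus running each input once per pass. It is not the glibc benchtests code. The names now_ns, toy_branchy, bench_same_input and bench_varied_inputs are hypothetical, and toy_branchy is only a stand-in with input-dependent branches, not an accurate sine.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Current time in nanoseconds from a monotonic clock. */
static double now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Hypothetical stand-in for a routine that picks a code path by input
   range.  It is NOT an accurate sine; it only has the branch structure. */
static float toy_branchy(float x)
{
    float ax = x < 0.0f ? -x : x;
    if (ax < 1e-4f)
        return x;                       /* tiny inputs: result is x itself */
    if (ax < 1.0f)
        return x - x * x * x / 6.0f;    /* small inputs: two-term Taylor */
    return 0.0f;                        /* placeholder for a range-reduction path */
}

/* Style 1: call fn on ONE input iters times.  Every call takes the same
   branches, so the predictor learns the path and branch cost is hidden. */
static double bench_same_input(float (*fn)(float), float x, size_t iters)
{
    volatile float sink = 0.0f;         /* keeps the calls from being optimized out */
    assert(iters > 0);
    double start = now_ns();
    for (size_t i = 0; i < iters; i++)
        sink = fn(x);
    (void)sink;
    return (now_ns() - start) / (double)iters;
}

/* Style 2: call fn once per input, cycling through the whole array, so
   consecutive calls may take different branches. */
static double bench_varied_inputs(float (*fn)(float), const float *xs,
                                  size_t n, size_t passes)
{
    volatile float sink = 0.0f;
    assert(n > 0 && passes > 0);
    double start = now_ns();
    for (size_t p = 0; p < passes; p++)
        for (size_t i = 0; i < n; i++)
            sink = fn(xs[i]);
    (void)sink;
    return (now_ns() - start) / (double)(passes * n);
}
```

Both functions return the mean time per call in nanoseconds. Passing an array that mixes the input ranges to bench_varied_inputs, and comparing against bench_same_input on a single value, gives a rough idea of how much the first style hides; the absolute numbers depend on the machine and compiler flags.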

Follow-Ups:
- Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
  - From: Joseph Myers

References:
- [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
  - From: Ashwin Sekhar T K
- Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
  - From: Szabolcs Nagy
- Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
  - From: Siddhesh Poyarekar
- Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
  - From: Szabolcs Nagy
- Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
  - From: Siddhesh Poyarekar
