This is the mail archive of the
mailing list for the glibc project.
Re: [RFC][PATCH 0/2] aarch64: Add optimized ASIMD versions of sinf/cosf
> On 13 Jun 2017, at 11:19, Siddhesh Poyarekar <firstname.lastname@example.org> wrote:
>> On Tuesday 13 June 2017 06:58 PM, Szabolcs Nagy wrote:
>> i didn't say i rejected his code, but that duplicated
>> effort is not good.
> OK, but that wasn't clear from the context of the message. I agree that
> duplication is wasteful, but it's not really that bad if it brings out
> different implementations that can be weighed and improved upon.
>> asm is not acceptable even if it's slightly faster.
>> (fix the compiler in that case)
>> asm code maintenance is a huge problem in glibc,
>> in the long term generic code is better in a lot
>> of domains, the sinf/cosf code is such a case,
>> there is no special instruction that helps them
>> that the compiler cannot easily generate.
> I'm going to disagree with this even though I agree with the general
> premise that C > assembly for maintenance. If someone comes up with an
> assembly implementation that is significantly (the definition of
> 'significant' may vary from function to function) faster that cannot be
> implemented in the compiler in the current release, it makes sense to
> carry that implementation in glibc until the compiler can catch up
> provided that all other criteria (accuracy, readability, etc.) are met.
> Additionally, the source of the implementation is important. Now if
> this patch came from a university student who does not intend to follow
> up and maintain her patch then I would be (slightly, again it depends on
> the magnitude of the improvement) inclined to agree with you since it
> puts the maintenance overhead on us, but in this case the source is
> reliable, so that is an added advantage.
>> i didn't say it's a glibc requirement, you have to use
>> common sense here: there are algorithms that are so
>> useful outside of glibc and so generic that it is just
>> unnecessary complication to develop them within glibc
>> (obviously it's not a complication for glibc, but for
>> everybody else, and i can't impose this procedure on
>> others, but i still think this is better for the
>> larger community).
> Yes, I did not disagree on the merit of the requirement, I am arguing
> about our ability to gate that effectively. It might be useful to come
> up with a wiki doc (or enhancing the contribution checklist) to specify
> But then, as a project we are also ideologically bound to LGPL, so again
> I wonder if doing this conflicts with that ideology. I personally am
> more liberal about this, but I don't know if that is the general opinion
> of the community.
>> if one tests the same input in a loop, that does
>> not measure the effect of branches, and thus we end
>> up breaking up the input space into many special
>> ranges; however, in practice that's not optimal.
> Currently the microbenchmark framework tests the same input in a loop a
> specific number of times, enough that the mean time per iteration is
> stable, and then goes through the inputs in a loop - I agree
> that this is cheating a bit since it eliminates cache effects as well as
> branches. It will need a pretty straightforward fix to run only once
> for a single input and it will do what you want, i.e. measure the effect
> of branches.
> Maybe Ashwin could patch the framework as well when he posts his patch.
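To make the difference between the two measurement styles concrete, here is a rough sketch in C. It is not the glibc benchtests code: toy_func, bench_same_input and bench_varied_inputs are hypothetical names made up for illustration, toy_func is a branchy stand-in rather than the sinf/cosf from the patch, and clock() is used only because it is standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the function under test.  It is NOT the
   sinf/cosf code from the patch, just a toy routine that takes a
   different branch for each input range.  */
float
toy_func (float x)
{
  if (x < 0.0f)
    x = -x;
  if (x < 0.5f)
    return x;                           /* small-input path */
  if (x < 2.0f)
    return x - x * x * x / 6.0f;        /* mid-range path */
  return 1.0f / x;                      /* large-input path */
}

/* Call toy_func ITERS times on one fixed input.  Every call takes the
   same branches, so the branch predictor is always warm.
   Returns elapsed CPU time in seconds.  */
double
bench_same_input (float x, size_t iters)
{
  volatile float in = x;        /* volatile: keep the call inside the loop */
  volatile float sink = 0.0f;   /* volatile: keep the result alive */
  clock_t start = clock ();
  for (size_t i = 0; i < iters; i++)
    sink = toy_func (in);
  clock_t end = clock ();
  return (double) (end - start) / CLOCKS_PER_SEC;
}

/* Call toy_func once per element of INPUTS, PASSES times over the whole
   array, so consecutive calls may take different branches.
   Returns elapsed CPU time in seconds.  */
double
bench_varied_inputs (const float *inputs, size_t n, size_t passes)
{
  assert (inputs != NULL && n > 0);
  volatile float sink = 0.0f;
  clock_t start = clock ();
  for (size_t p = 0; p < passes; p++)
    for (size_t i = 0; i < n; i++)
      sink = toy_func (inputs[i]);
  clock_t end = clock ();
  return (double) (end - start) / CLOCKS_PER_SEC;
}
```

Calling both with the same total number of calls (say bench_same_input (1.0f, n * passes) against bench_varied_inputs (inputs, n, passes), with inputs spread across the ranges) should give some idea of how much the single-input loop hides. A real patch to the framework would presumably use its existing timing macros instead of clock().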
I think a good starting point would be if Ashwin could provide us with a C skeleton of the same implementation done in assembly.