x86 *rint* functions: fastmath or not fastmath.
Jeff Johnston
jjohnstn@redhat.com
Mon Jan 7 17:09:00 GMT 2008
Dave Korn wrote:
> Hi Jeff, happy new year and hope you had a pleasant break,
>
>
> When we left off before Christmas, I wrote:
>
>
>> I added the prototypes to math.h unconditionally - it's just
>> occurred to me as an afterthought that I should probably have put them in
>> fastmath.h
>>
>
> and you wrote:
>
>
>> After looking at it, I tend to agree with you that you have really
>> provided fastmath versions of the functions.
>>
>
>
> I'd now like to revisit that discussion. First off, I was seriously sleep
> deprived at the time I wrote my email, and not thinking straight; the separate
> concepts of fast math and hard float were somewhat tangled up in my head.
> What I /meant/ to suggest was that the prototypes should be moved to an
> x86-specific header file (since the corresponding functions don't exist on
> other platforms), and due to confusion suggested fastmath.h when it's not
> actually what I really wanted or meant at all, but my addled brain seized on
> it just because it's a header file and it's x86-specific. Sorry for the
> confusion.
>
> So, when you replied agreeing with me, I'm not sure if I've just confused
> the issue with my own lack of clarity, or if there's some other reason to say
> these are fastmath functions.
>
>
The discussion and resolution were kind of rushed due to my self-imposed
1.16.0 deadline.
> Which brings me to the nub of the issue: are these fast math functions, or
> are they good enough to be first-class implementations?
>
> I haven't seen any formal definition of the difference between fast and
> non-fast math, but my understanding is that fast math might not be entirely
> accurate or rounded correctly, might or might not handle exceptions correctly,
> and might or might not optimise using assumptions such as associativity and
> distributivity that aren't entirely valid for FP; that is, the issues are
> accuracy and IEEE compliance.
>
>
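To make the associativity point concrete, here is a minimal sketch in
plain C. The function names are made up for illustration, and it
assumes IEEE double arithmetic in round-to-nearest mode. Regrouping the
same three operands changes the result, which is why an optimiser that
reassociates FP expressions is not strictly value-preserving.

```c
#include <assert.h>

/* Sum left to right: (a + b) + c. The volatile temporary forces the
   intermediate result to be rounded to double. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* Sum right to left: a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}

/* Illustrative check: the two groupings disagree for these inputs,
   because -1e16 + 1.0 is not representable and rounds back to -1e16. */
void check_reassociation(void)
{
    assert(sum_left(1e16, -1e16, 1.0) == 1.0);
    assert(sum_right(1e16, -1e16, 1.0) == 0.0);
}
```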
Pretty much. A fast math routine usually uses a hardware instruction or
a coding trick. It often isn't prepared for all possible inputs (e.g.
NaNs or Infs or extreme values). It sometimes has accuracy implications
when compared to the IEEE soft-float versions (e.g. a sin instruction
might not handle extremely large or extremely small values properly, or
the hardware might limit the accuracy). The odd situations usually
require special-casing, as exemplified in the soft-float routines, that
might not be present in the instruction logic. Because fastmath is a
purely optional optimization, it is let off the hook for full
compliance.
> To the best of my knowledge, I can't see why these functions would meet any
> of those conditions. The x87 produces IEEE-conformant results for single
> operations like these, the excess precision between stages of a prolonged
> series of fpu insns shouldn't come into play here (obviously this
> consideration might be different if we were discussing anything except the
> round-to-integer instructions), and the x87 handles all the exceptions and
> status codes properly - and we don't even have support for fenv.h and the
> fe{set,get}* exception and status handling parts of the library yet.
>
>
> So, as far as I can see, these functions ought to be good enough for
> first-class library functions. I wonder whether you agree with my reasoning
> or not here, and whether you'd consider a patch that completely overrode the
> common/ soft implementations of rint/rintf/lrint/lrintf altogether? If not,
> we might still decide that they're "good enough" for cygwin and I'll supply a
> patch that only affects cygwin, but I can't decide which yet until I
> understand the reasoning for calling them fast-math or not.
>
>
>
If they match or surpass the accuracy of the soft-float versions for the
full range of inputs and set errno appropriately, then they are
certainly first-class and can override.
> BTW, my plan is to provide fenv.h and implement the related functions. I
> care most about cygwin, which does use the fast math implementations (and my
> major motivator is to speed up FP in cygwin and provide control over the x87
> hardware fpu features), so I'm not really keen to provide soft
> implementations. For that reason, would you prefer if I generate my patches
> to target cygwin only, or would you be happy with patches that add new
> functionality to all i386 builds, but only when using hard float?
>
>
I'm ok with adding such functionality to all i386 builds.
> cheers,
> DaveK
>