This is the mail archive of the gsl-discuss@sources.redhat.com mailing list for the GSL project.



Re: specfunc error analysis - generated and propagated errors.


On Wed, 2004-09-08 at 11:17, Jonathan G. Underwood wrote:
> 
> So, if I understand what you're saying correctly, the current model for 
> error analysis is:
> 
> (1) Assume exact input(s). [except for a few functions where input 
> errors are explicitly taken into account]

Yes.

> (2) Assume no errors on constants used in calculations.

Yes. I think this never matters, but there might be some
boundary case cancellations where it does. They would be
very hard to find.

> (3)  Each "primitive" operation {+,-,/,*} between two numbers adds a 
> _relative_ error of 2*GSL_DBL_EPSILON to the (intermediate) result. 
> [This is what I meant by generated error]

Yes.

> (4) Errors at each step are propagated to the next step according to 
> normal error analysis i.e.
> (i) Addition and subtraction of A and B with absolute errors a and b 
> respectively gives an absolute propagated error in the result of a+b
> (ii) Multiplication and division of A and B with absolute errors a and b 
> respectively gives a relative propagated error in the result of [|a/A| + 
> |b/B|]

Yes. There are a few additional rules I use for elementary functions,
which also follow simple propagation rules. And as I said, sometimes
I use more generous error expressions for intermediate results, if
those expressions are simpler.

> (5) Errors from (3) and (4) are added for each step.

Yes.
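
To make rules (3)-(5) concrete, here is a tiny sketch (not from the
library, the names are made up) of how they combine for the toy
expression f = a*b + c with exact inputs:

   /* Sketch only: forward error model of rules (3)-(5) for f = a*b + c,
      assuming exact inputs.  Not GSL source.                            */
   #include <math.h>
   #include <gsl/gsl_machine.h>   /* GSL_DBL_EPSILON */

   static double toy_forward_err(double a, double b, double c)
   {
     const double p = a * b;
     /* (3): the multiplication generates 2*eps relative error */
     double p_err = 2.0 * GSL_DBL_EPSILON * fabs(p);

     const double f = p + c;
     /* (4i): absolute errors add under addition (c is exact, so only
        p_err survives); (3): the addition adds its own 2*eps term;
        (5): the contributions are summed.                             */
     double f_err = p_err + 2.0 * GSL_DBL_EPSILON * fabs(f);

     return f_err;
   }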

> (6) At the end, if the error calculated by this analysis means that the 
> computed value is not equal to the known real value within the computed 
> error, add a Jungman relative error factor of a few GSL_DBL_EPSILON 
> until it is :)

One of the things I almost always do as a last step is:

   result->err += 2.0 * GSL_DBL_EPSILON * fabs(result->val);

I guess that's the "just in case" clause. The fiddling is
then typically done on that last multiplicative factor.
For some simple functions that is the only error estimate.
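
As a sketch (this function is hypothetical, not a GSL routine), a
simple _e-style function where that final term is the whole estimate
looks like:

   /* Hypothetical _e-style function: the final "just in case" term is
      the entire error estimate.  Illustration only, not GSL source.   */
   #include <math.h>
   #include <gsl/gsl_errno.h>
   #include <gsl/gsl_machine.h>
   #include <gsl/gsl_sf_result.h>

   static int my_cheap_e(const double x, gsl_sf_result * result)
   {
     result->val = x / (1.0 + x * x);   /* a few well-conditioned operations */
     result->err = 2.0 * GSL_DBL_EPSILON * fabs(result->val);
     return GSL_SUCCESS;
   }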

> Certainly the presence of point (6) above indicates that 
> the current approach is somewhat subjective and imprecise in some 
> circumstances.

I think in each of those cases I must have simply forgotten
something, or there may be explicit bugs in the expressions.
So it's not so much imprecision as some logical error.

> I wonder if a cleaner design might be to move to a backward error 
> analysis philosophy, whereby the functions are evaluated as they are 
> currently, but without the forward error analysis, and the test suite is 
> used to characterise/establish the accuracy of the implementation by 
> comparing known values of the functions to computed values.

I like this idea. This would work well for the functions of
a single variable. On the other hand, those are often the
easiest cases anyway. For example, I often use Chebyshev
fits for functions of a single variable. In these cases
the error is absolutely controlled, so the backwards
analysis is not needed.
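
For reference, the Clenshaw-style evaluator behind those Chebyshev
fits looks roughly like the sketch below. The struct and function
names here are illustrative, and the exact error expression in
specfunc may differ in detail:

   /* Sketch of a Chebyshev series evaluator with an error estimate:
      Clenshaw recurrence, rounding error accumulated from the term
      magnitudes, plus the size of the last coefficient as a crude
      truncation bound.  Illustration only, not pasted from specfunc.  */
   #include <math.h>
   #include <gsl/gsl_errno.h>
   #include <gsl/gsl_machine.h>
   #include <gsl/gsl_sf_result.h>

   struct cheb_series_sketch {
     const double * c;     /* coefficients c[0..order]    */
     int order;            /* order of the expansion      */
     double a, b;          /* interval of validity [a,b]  */
   };

   static int
   cheb_eval_sketch(const struct cheb_series_sketch * cs, const double x,
                    gsl_sf_result * result)
   {
     int j;
     double d  = 0.0;
     double dd = 0.0;
     const double y  = (2.0 * x - cs->a - cs->b) / (cs->b - cs->a); /* map to [-1,1] */
     const double y2 = 2.0 * y;
     double e = 0.0;                   /* accumulated term magnitudes */

     for (j = cs->order; j >= 1; j--) {
       const double temp = d;
       d  = y2 * d - dd + cs->c[j];
       e += fabs(y2 * temp) + fabs(dd) + fabs(cs->c[j]);
       dd = temp;
     }

     {
       const double temp = d;
       d  = y * d - dd + 0.5 * cs->c[0];
       e += fabs(y * temp) + fabs(dd) + 0.5 * fabs(cs->c[0]);
     }

     result->val = d;
     result->err = GSL_DBL_EPSILON * e + fabs(cs->c[cs->order]);
     return GSL_SUCCESS;
   }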

The hard part is the set of functions with extra parameters.
These are going to be hard to analyze, for obvious reasons.

Still, we may be saved in some cases.

It may be possible to prove in some cases that the errors for a large
range of parameter values are characterized by those for a
single value, perhaps times a simple bounding function which
depends on the parameter. I'm thinking of things like the
incomplete gamma function, where there are obvious
monotonicity properties.
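
To spell out the kind of monotonicity I mean, take the normalized
lower incomplete gamma function (in TeX notation, illustration only):

   P(a,x) = \frac{1}{\Gamma(a)} \int_0^x t^{a-1} e^{-t} \, dt ,
   \qquad
   \frac{\partial P}{\partial x}(a,x) = \frac{x^{a-1} e^{-x}}{\Gamma(a)} > 0 .

It is strictly increasing in x, and for fixed x it decreases as a
grows, so one could hope to bound the behavior over a whole parameter
range by what happens at the endpoints.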

Also, if the direct comparison
shows that the error is stable as a function of a parameter,
then that is probably good enough. As you point out below though,
we have to be very careful about test coverage in such cases.

One of the things I struggled with was the oscillating
functions. I don't know the right way to characterize
the error near a zero. Think of Bessel functions and
how you will parametrize the results of a backward
analysis for J(nu, x), with variable nu. The damn
zeroes keep moving around. If this kind of problem
has not been solved yet, and we make progress, I
think that is probably an actual research result.
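
To make the problem concrete, here is a small usage sketch (the zero
location is quoted from memory) showing why a relative error measure
falls apart near a zero of J_0, even though the absolute error is
perfectly reasonable there:

   /* Illustration: near a zero of J_0 the absolute error r.err stays
      tiny, but r.err/|r.val| is essentially unbounded, so "relative
      accuracy" is the wrong question in that neighborhood.            */
   #include <stdio.h>
   #include <math.h>
   #include <gsl/gsl_sf_bessel.h>

   int main(void)
   {
     gsl_sf_result r;
     const double x = 2.404825557695773;     /* ~ first zero of J_0 */
     gsl_sf_bessel_Jnu_e(0.0, x, &r);
     printf("J_0(x) = % .3e   abs err = %.1e   rel err = %.1e\n",
            r.val, r.err, r.err / fabs(r.val));
     return 0;
   }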


> For backwards compatibility, a typical backward-analysed error could be 
> returned by the _e functions. It would make for a much cleaner design of 
> the specfun code, and also remove the runtime computational overhead of 
> the (most often unused) forward error analysis.

Yes. That's fine. Also, I'm not wedded to the _e functions in the
long run. One day there may be a GSL 2.0, and I have no qualms
about changing the interfaces then.


> This would of 
> course require further development of the test suite, and relies upon 
> testing over the full range of use, and of course, that the "true" 
> values of the functions are known for comparison and error calculation.

And it may require some kind of automation. And we may need to separate
a "conditioning" suite from the test suite. Right now there is no
distinction, but there is definitely a difference between the full
error diagnosis process and some poor end user who just wants to
run 'make check'.


> I'm not really questioning the value of the forward error analysis 
> during the design and implementation/coding of the functions, but am 
> questioning the worth of it to the library user, and also the cost in 
> terms of code complexity, maintainability, and efficiency.

I agree. I guess this is the kind of thing that happens when you
are 2 years into a 10 year project and somebody says "ok, it's
time to release 1.0".


> A backward 
> error analysis I think could be (a) more accurate/representative, (b) 
> more efficient, and (c) lend itself to cleaner, simpler code. Of course, the 
> error may vary substantially over the range/domain of the function, but 
> that shouldn't be too difficult to deal with.

I tentatively agree. The open issue is whether or not there is
a clear procedure for a backward analysis on functions with more
than one parameter.


> Just looking through NAG and numerical recipes type equivalents, I see 
> they don't implement any error analysis returned to the user. Many 
> conclusions could be reached from that, though :)

Hah. I laugh at their 1000s-of-man-years efforts.
Well... not really. 


> I agree on your point of avoiding having more than a single version of 
> the compiled library entirely; we'll write that off as an idea based on 
> 1 am porridge-brain.

Nevertheless, it is a real issue. We have this problem already
with other switches. I fought hard to get the gslcblas library
separated so that the end user could choose any conformant cblas
at link time. This seems like a clean solution to that problem.
It probably won't work for everything, but that's how I tend
to think about modularizing things. A similar thing will have
to be done for linear algebra, i.e. lapack functionality.

And the sad truth is that I never even know myself whether or not
I am getting what I want, with regards to things like bounds-checking
and (especially) inlining.

I think I will start another thread about the library
configuration, so we don't get sidetracked here.


> Googling around a bit on these matters is interesting. Many people seem 
> to favour proper interval arithmetic for this sort of thing, but that 
> would be a massive headache to implement.

I decided that it was inconceivable without a hardware
implementation of interval arithmetic. Not-in-our-lifetime
kind of stuff...

Just getting an extra bit into the new IEEE-754 is probably going
to be impossible. Muller argues for a bit indicating whether
the value represents an exact quantity or has already been rounded,
a kind of "dirty/clean" bit. I like this idea very much.
Good luck to him.


> Interestingly people always 
> argue against forward error analysis on the basis of some statement by 
> von Neumann who showed that it was overly pessimistic for some matrix 
> problems; I couldn't find the details on that, though.

I have heard the same thing, but I also have no idea of the details.


> I also noticed something else about this code - although Legendre 
> polynomials are "hard coded" up to P_3, in the general function only 
> up to P_2 is hard coded.

Odd, isn't it?

> Anyway, if you let me know which of the two 
> error analyses you think is more appropriate, I'll happily cook up a 
> patch to sort it out.

The one in the explicit function (gsl_sf_legendre_P2_e) is correct.
I just committed a fix to CVS. The same sort of carelessness may
appear elsewhere...
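
For reference, here is a sketch of how P_2 comes out under the error
model described at the top of this message. This is written from the
model, not pasted from CVS, so the committed expression may differ in
detail:

   /* Sketch: P_2(x) = (3x^2 - 1)/2 with a per-term rounding bound plus
      the usual final "just in case" term.  Not pasted from CVS.        */
   #include <math.h>
   #include <gsl/gsl_errno.h>
   #include <gsl/gsl_machine.h>
   #include <gsl/gsl_sf_result.h>

   static int legendre_P2_sketch(const double x, gsl_sf_result * result)
   {
     result->val  = 0.5 * (3.0 * x * x - 1.0);
     /* rounding of the terms 3x^2 and 1, bounded by their magnitudes */
     result->err  = GSL_DBL_EPSILON * (fabs(3.0 * x * x) + 1.0);
     /* final "just in case" term on the result itself */
     result->err += 2.0 * GSL_DBL_EPSILON * fabs(result->val);
     return GSL_SUCCESS;
   }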


-- 
Gerard Jungman <jungman@lanl.gov>
Los Alamos National Laboratory

