minimization using the Brent algorithm (brent.c)

Pedro Gonnet
Mon Mar 17 22:27:00 GMT 2003

Hello all,

Maybe I'm getting something wrong here, but here's my two cent's worth:

	- First of all, isn't the requirement for a return value of
	  Brent's algorithm F(a) >= F(x) <= F(b), which is a bit laxer
	  than F(a) > F(x) < F(b)?

	- Although we like to consider a, b and x to be (vectors of)
	  real values, they are, on any computer, for all practical
	  puposes, of finite precision. Therefore, if there is an
	  _exact_ x_n for which F(a) >= F(x_n) <= F(b).

The first argument is rather easy to understand if, for instance, we are
looking for the minimum of F(x) = x^2 in the interval 0..1.

The second argument is almost as easy. With every step of Brent's nice
little algorithm we reduce the search space by a factor of 0.6182 or
something, right? Let delta_x be b - a. Let delta_x_max be the largest
component of delta_x (this is just in case a and b are vectors). Let's
also assume, now that we're at it, that we are using 64-bit double
precision values. If all these assumptions are correct, delta_x_max +/-
delta_x_max * 2.2e-16 = delta_x_max, right? Therefore, after
log[golden](2.2e-16) steps, no improvement should be possible, since
there is no more interval.

All this, although being really nice, is only peripheral to the main
discussion here, namely, what happens if F of both points selected in
]a,b[ are larger than F(a) and F(b).

In most cases where Brent's Algorithm is used, b - a is just a search
direction and the choice of b is usualy arbitrary. To be honest, I have
never used GSL's Brent because it's too easy to implement one's self
and, well, it's also somewhat a matter of taste how it is done...

Usually one sets out with (the vector) a and some direction delta_x. The
vector d is usually chosen as a + delta_x. The nomenclature "d" is not a
typo -- I call the four vectors used in Brent's a, b, c and d (I can't
think of a good reason, but somehow it makes sense...). I then calculate
c as a + 0.6182 * delta_x, and this is where the fun starts:

	alpha = 1.0;
	while (F(c) > F(d)) {
		alpha /= 0.6182;
		c = d; d = a + alpha * delta_x;

This way we guarantee that the function has at least one local minimum
in the interval ]a,d], unless you are unlucky enough for F(x) = e^(-x)
or something of the sort, in which case the minimum is somewhere out in
infinity and/or beyond (or until F(d) - F(c) disappears since, as
mentioned earlier, doubles are finite).

If, however, you are not doing a line search, this gets to be somewhat
more complicated, since if F(a) < F(b) > F(d) and F(a) < F(c) > F(d) you
are not guaranteed to have a local minimum in [a,d]. The naive approach
would be to apply Brent's Algorithm recursively on the resulting three
intervals, but this can go quite wrong, e.g. for F(x) = -x^2 in the
interval -1..1. An easier approach is to work with what the case F(a) <
F(b) > F(d) and F(a) < F(c) > F(d) gives us, namely the guarantee of at
least one local _maximum_. In such a case we can use Brent's to find the
local maximum, x_max, and then do the recursion on ]a,x_max] and
[x_max,d[. The number of Brent searches is proportional to the number of
local maxima.

I hope that in the best case this rather long mail helps any, and in the
worst case that I haven't made too much of an ass of myself...


On Mon, 2003-03-17 at 22:37, Z F wrote:
> Dear Fabrice Rossi and gsl-ers..
> First of all let me ask the question: Which problem is being solved?
> The answer (i.e the formulation of the problem) which I see is the
> following:
> On an open interval (a,b) find such a point x0 inside (a,b) that
> F(x0) is minimal. (from this it follows that F(a)>F(x0)<F(b).)
> Numerically, exact x0 can never be found so it is acceted that the
> output is the best approximation x to x0 and/or the interval (a',b')
> on which the correct x0 lies.
> Thus, the input to the problem is the open interval (a,b) and the
> function F(x). Nothing more, there is no "first' guess here...
> How the method proceeds is up to the method.
> It is a common misconseption, in my opinion, that a method (golden rule
> or Brent) start with a triplet, quartet or sixtet or N-tet...
> The algorithm itself should construct N-tet and make calculations.
> The motivation for the golden section or for brent is that for any
> given stage of calculation given the search interval (a_n,b_n)
> a new approximation x_n for x0 is found (according to the rules of the
> method) and the following should be true F(a_n)>F(x_n)<F(b_n).
> If on any stage this inequality is false, the golden/Brent is in deep
> trouble. Note that x_n is not given by the user, it is obtained within 
> the algorithm.
> Thus, the scenario should be the following:
> Given the interval (a,b) and F(x) each method should find the first 
> iteration x_1 and test the inequality above if the algorithm requires
> it. If the test is required and not satisfied and the method can not
> proceed -- abort, if the test is satisfied or not required and the
> method can proceed, a new interval (a_1,b_1) is constructed and the
> problem returns to the initial one of finding extremum on (a_1,b_1)
> given F(x).
> There is no need for a "third" point supplied by the user in the
> initial step.
> The use of "user" points only worsens(does not improve) convergence of
> the algorithm without any other effect. Unfotunately, the common
> partice is to use that useless "initial" guess making the user to
> execute the first stage of the algorithm or face the loss of one
> function evaluation or even convergence.
> This is my opinion.
> Lazar
> __________________________________________________
> Do you Yahoo!?
> Yahoo! Platinum - Watch CBS' NCAA March Madness, live on your desktop!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 232 bytes
Desc: This is a digitally signed message part
URL: <>

More information about the Gsl-discuss mailing list