It is of great importance to solve equations of the form

$$f(x) = 0$$

in many applications in mathematics, physics, chemistry, ... and of course in the computation of some important mathematical constants or functions, such as square roots. In this essay, we are only interested in one family of methods: Newton's methods.
Around 1669, Isaac Newton (1643-1727) gave a new algorithm [3] to solve a polynomial equation, and it was illustrated on the example $y^3 - 2y - 5 = 0$. To find an accurate root of this equation, one must first guess a starting value, here $y \approx 2$. Then just write $y = 2 + p$; after the substitution the equation becomes

$$p^3 + 6p^2 + 10p - 1 = 0.$$

Because $p$ is supposed to be small, we neglect $p^3 + 6p^2$ compared to $10p - 1$, and the previous equation gives $p \approx 0.1$; therefore a better approximation of the root is $y \approx 2.1$. It's possible to repeat this process and write $p = 0.1 + q$; the substitution gives

$$q^3 + 6.3q^2 + 11.23q + 0.061 = 0,$$

hence $q \approx -0.061/11.23 = -0.0054...$, and a new approximation is $y \approx 2.0946...$ The process should be repeated until the required number of digits is reached.
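This substitution process is easy to mechanize. Here is a minimal sketch in Python with the sympy library (the number of rounds and the printed precision are arbitrary choices):

```python
# Newton's original scheme on y^3 - 2y - 5 = 0: substitute y = guess + p,
# expand, neglect the p^2 and p^3 terms and solve the linear part for p.
import sympy as sp

y, p = sp.symbols('y p')
f = y**3 - 2*y - 5
guess = sp.Rational(2)              # starting value y ~ 2

for _ in range(4):
    poly = sp.Poly(sp.expand(f.subs(y, guess + p)), p)
    c0 = poly.coeff_monomial(1)     # constant term
    c1 = poly.coeff_monomial(p)     # linear coefficient
    guess += -c0 / c1               # p ~ -c0/c1 from c1*p + c0 = 0
    print(sp.N(guess, 15))          # 2.1, 2.09456..., ...
```

(The sketch keeps the exact rational value of $p$ at each round instead of Newton's rounded $0.1$.)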
In his method, Newton doesn't explicitly use the notion of derivative, and he applies it only to polynomial equations.
A few years later, in 1690, a new step was made by Joseph Raphson (1678-1715), who proposed a method [6] which avoided the substitutions in Newton's approach. He illustrated his algorithm on the equation $x^3 - bx + c = 0$: starting with an approximation $x \approx g$ of a root of this equation, a better approximation is given by

$$x \approx g + \frac{c + g^3 - bg}{b - 3g^2}.$$

Observe that the denominator of the fraction is the opposite of the derivative of the numerator.
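As a quick numerical check of Raphson's formula (a sketch; Newton's equation above has this form with $b = 2$ and $c = -5$, and the starting value $g = 2$ is his):

```python
# Raphson's update g -> g + (c + g^3 - b*g) / (b - 3*g^2),
# applied repeatedly to x^3 - 2x - 5 = 0 (b = 2, c = -5).
b, c = 2.0, -5.0
g = 2.0
for _ in range(5):
    g += (c + g**3 - b*g) / (b - 3*g**2)
    print(g)    # 2.1, 2.0945681..., then 2.0945514815... (the root)
```

One application of the formula reproduces Newton's improved value $2.1$, without any substitution.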
This was the historical beginning of the very important Newton's (or, as it is sometimes called, Newton-Raphson) algorithm. The method was then studied and generalized by other mathematicians like Simpson (1710-1761), Mourraille (1720-1808), Cauchy (1789-1857), Kantorovich (1912-1986), ... The important question of the choice of the starting point was first approached by Mourraille in 1768, and the difficulty of making this choice is the main drawback of the algorithm (see also [5]).
Nowadays, Newton's method is a general process to find an accurate root of a single equation (or a system of equations) $f(x) = 0$. We suppose that $f$ is a $C^2$ function on a given interval; then, using Taylor's expansion near $x$,

$$f(x+h) = f(x) + hf'(x) + O(h^2),$$

and if we stop at the first order (linearization of the equation), we are looking for a small $h$ such that

$$f(x+h) = 0 \approx f(x) + hf'(x), \qquad \text{that is} \qquad h = -\frac{f(x)}{f'(x)}.$$

The Newton iteration is then given by the following procedure: start with an initial guess of the root, $x_0$, then compute the limit of the recurrence

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)},$$
and Figure 1
is a geometrical interpretation of a single iteration of this formula.
Figure 1: One Iteration of Newton's method
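A minimal sketch of this procedure in Python (the test function, its derivative, the tolerance and the iteration cap are arbitrary choices for the example):

```python
# Newton's iteration x_{n+1} = x_n - f(x_n) / f'(x_n).
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:         # stop when the correction is tiny
            return x
    raise RuntimeError("no convergence: try a better starting point")

# Example: the square root of 2 as the positive root of x^2 - 2 = 0.
print(newton(lambda x: x*x - 2, lambda x: 2*x, 1.5))   # 1.4142135623730951
```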
Unfortunately, this iteration may fail to converge, or may be unstable (we sometimes say chaotic), in many cases. However, we have two important theorems giving conditions that ensure the convergence of the process (see [2]).
Theorem 1. Let $x^*$ be a root of $f(x) = 0$, where $f$ is a $C^2$ function on an interval containing $x^*$, and suppose that $|f'(x^*)| > 0$; then Newton's iteration will converge to $x^*$ if the starting point $x_0$ is close enough to $x^*$.
This theorem ensures that Newton's method will always converge if the initial point is sufficiently close to the root and if this root is not singular (that is, $f'(x^*)$ is nonzero). This process has the local convergence property. A more constructive theorem was given by Kantorovich; we give it in the particular case of a function of a single variable $x$.
Theorem 2 (Kantorovich). Let $f$ be a $C^2$ numerical function on an open interval $I$ of the real line, and let $x_0$ be a point in this interval. If there exist finite positive constants $(m_0, M_0, K_0)$ such that

$$\left| \frac{1}{f'(x_0)} \right| < m_0, \qquad \left| \frac{f(x_0)}{f'(x_0)} \right| < M_0, \qquad |f''(x)| < K_0 \quad \text{on } I,$$

and if $h_0 = 2 m_0 M_0 K_0 \le 1$, then Newton's iteration will converge to a root $x^*$ of $f(x) = 0$, and

$$|x_n - x^*| < 2^{1-n} M_0 h_0^{2^n - 1}.$$
This theorem gives sufficient conditions to ensure the existence of a root and the convergence of Newton's process. Moreover, if $h_0 < 1$, the last inequality shows that the convergence is quadratic (the number of correct digits doubles at each iteration). Note that if the starting point $x_0$ tends to the root $x^*$, the constant $M_0$ will tend to zero and $h_0$ will become smaller than 1, so the local convergence theorem is a consequence of Kantorovich's theorem.
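For instance, here is a small numeric check of the condition (a sketch; the function $f(x) = x^2 - 2$ and the point $x_0 = 1.5$ are arbitrary choices, and the constants are simple bounds valid around $x_0$):

```python
# Kantorovich's test h0 = 2*m0*M0*K0 <= 1 for f(x) = x^2 - 2 at x0 = 1.5.
x0 = 1.5
f, fp = x0**2 - 2, 2*x0     # f(x0) = 0.25, f'(x0) = 3
m0 = abs(1 / fp)            # bound on |1/f'(x0)|: 1/3
M0 = abs(f / fp)            # bound on |f(x0)/f'(x0)|: 1/12
K0 = 2.0                    # |f''(x)| = 2 everywhere
h0 = 2 * m0 * M0 * K0       # = 1/9 <= 1: convergence is guaranteed
print(h0 <= 1)              # True: Newton's iteration from 1.5 reaches sqrt(2)
```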
Newton's iteration may be seen as a first order method (or linearization method); it's possible to go one step further and write the Taylor expansion of $f$ to a higher order:

$$f(x+h) = f(x) + hf'(x) + \frac{h^2}{2} f''(x) + O(h^3),$$

and we are looking for $h$ such that

$$f(x+h) = 0 \approx f(x) + hf'(x) + \frac{h^2}{2} f''(x).$$

We take the smallest solution for $h$ (we have to suppose that $f'(x)$ and $f''(x)$ are nonzero):

$$h = -\frac{f'(x)}{f''(x)} \left( 1 - \sqrt{1 - \frac{2 f(x) f''(x)}{f'(x)^2}} \right).$$
It's not necessary to compute the square root, because if $f(x)$ is small, using the expansion

$$1 - \sqrt{1 - a} = \frac{a}{2} + \frac{a^2}{8} + O(a^3),$$

$h$ becomes

$$h = -\frac{f(x)}{f'(x)} \left( 1 + \frac{f(x) f''(x)}{2 f'(x)^2} + \cdots \right).$$
The first attempt to use the second order expansion is due to the astronomer
E. Halley (1656-1743) in 1694.
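Truncating the series in the bracket after its second term already gives a cubically convergent method (the same one that reappears below as Householder's cubic iteration). A minimal sketch (the test function and start value are arbitrary):

```python
# Cubic iteration: Newton's step corrected with the second derivative,
# h = -(f/f') * (1 + f*f'' / (2*f'^2)).
def cubic(f, fp, fpp, x0, tol=1e-15, max_iter=50):
    x = x0
    for _ in range(max_iter):
        fx, fpx = f(x), fp(x)
        h = -(fx / fpx) * (1 + fx * fpp(x) / (2 * fpx**2))
        x += h
        if abs(h) < tol:
            return x
    raise RuntimeError("no convergence")

# The cube root of 2 as the root of x^3 - 2 = 0.
print(cubic(lambda x: x**3 - 2, lambda x: 3*x*x, lambda x: 6*x, 1.0))
# 1.2599210498948732
```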
Under some conditions of regularity of $f$ and its derivatives, Householder gave in [1] the general iteration

$$x_{n+1} = x_n + (p+1) \left( \frac{(1/f)^{(p)}}{(1/f)^{(p+1)}} \right)_{x_n},$$

where $p$ is an integer and $(1/f)^{(p)}$ is the derivative of order $p$ of the reciprocal of the function $f$. This iteration has convergence of order $(p+2)$. For example, $p = 0$ has quadratic convergence (order 2) and the formula gives back Newton's iteration, while $p = 1$ has cubic convergence (order 3) and gives again Halley's method. Just like Newton's method, a good starting point is required to ensure convergence.
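The derivatives of $1/f$ quickly become heavy to compute by hand, but a computer algebra system handles them readily. A sketch with sympy (the test function $x^2 - 2$ and the start value are arbitrary):

```python
# Householder's iteration of order p+2:
# x_{n+1} = x_n + (p+1) * (1/f)^(p) / (1/f)^(p+1), evaluated at x_n.
import sympy as sp

x = sp.symbols('x')
f = x**2 - 2                       # any sufficiently regular function

def householder_map(f, p):
    g = 1 / f
    return sp.simplify(x + (p + 1) * sp.diff(g, x, p) / sp.diff(g, x, p + 1))

for p in (0, 1, 2):                # Newton, Halley, quartic Householder
    step = sp.lambdify(x, householder_map(f, p))
    xn = 1.5
    for _ in range(3):
        xn = step(xn)
    print(p, xn)                   # all three converge to sqrt(2) = 1.41421356...
```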
Using the iteration with $p = 2$ gives the following iteration, which has quartic convergence (order 4):

$$x_{n+1} = x_n - \frac{f(x_n)\left( 6 f'(x_n)^2 - 3 f(x_n) f''(x_n) \right)}{6 f'(x_n)^3 - 6 f(x_n) f'(x_n) f''(x_n) + f(x_n)^2 f^{(3)}(x_n)}.$$

Another way to build high order iterations is to correct Newton's step with a series,

$$x_{n+1} = x_n + h_n + a_{2n} \frac{h_n^2}{2!} + a_{3n} \frac{h_n^3}{3!} + \cdots,$$

where $h_n = -f(x_n)/f'(x_n)$ is given by the simple Newton iteration and $(a_{2n}, a_{3n}, ...)$ are real parameters which we will estimate in order to minimize the value of $f(x_{n+1})$:
$$f(x_{n+1}) = f\left( x_n + h_n + a_{2n} \frac{h_n^2}{2!} + a_{3n} \frac{h_n^3}{3!} + \cdots \right);$$

we assume that $f$ is regular enough and that $h_n + a_{2n} h_n^2/2! + a_{3n} h_n^3/3! + \cdots$ is small, hence, using the expansion of $f$ near $x_n$,

$$f(x_{n+1}) = f(x_n) + \left( h_n + a_{2n} \frac{h_n^2}{2!} + a_{3n} \frac{h_n^3}{3!} + \cdots \right) f'(x_n) + \left( h_n + a_{2n} \frac{h_n^2}{2!} + a_{3n} \frac{h_n^3}{3!} + \cdots \right)^2 \frac{f''(x_n)}{2} + \cdots,$$

and because $f(x_n) + h_n f'(x_n) = 0$, we have

$$f(x_{n+1}) = \left( a_{2n} f'(x_n) + f''(x_n) \right) \frac{h_n^2}{2!} + \left( a_{3n} f'(x_n) + 3 a_{2n} f''(x_n) + f^{(3)}(x_n) \right) \frac{h_n^3}{3!} + O(h_n^4).$$
A good choice for the $a_{in}$ is clearly to cancel as many terms as possible in the previous expansion, so we impose

$$a_{2n} = -\frac{f_n''}{f_n'}, \qquad a_{3n} = \frac{3 (f_n'')^2 - f_n' f_n^{(3)}}{(f_n')^2},$$

with, for the sake of brevity, the notation $f_n^{(k)} = f^{(k)}(x_n)$. The formal values of the $a_{in}$ may be computed for much larger values of $i$.
Finally, the general iteration is

$$x_{n+1} = x_n + h_n \left( 1 + a_{2n} \frac{h_n}{2!} + a_{3n} \frac{h_n^2}{3!} + a_{4n} \frac{h_n^3}{4!} + \cdots \right) = x_n - \frac{f(x_n)}{f'(x_n)} \left( 1 + \frac{f''(x_n)}{2!\, f'(x_n)} \left( \frac{f(x_n)}{f'(x_n)} \right) + \frac{3 f''(x_n)^2 - f'(x_n) f^{(3)}(x_n)}{3!\, f'(x_n)^2} \left( \frac{f(x_n)}{f'(x_n)} \right)^2 + \cdots \right).$$
For example, if we stop at $a_{3n}$ and set $a_{4n} = a_{5n} = \cdots = 0$, we have the helpful quartic modified iteration (note that this iteration is different from the previous Householder quartic method)

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \left( 1 + \frac{f(x_n) f''(x_n)}{2!\, f'(x_n)^2} + \frac{f(x_n)^2 \left( 3 f''(x_n)^2 - f'(x_n) f^{(3)}(x_n) \right)}{3!\, f'(x_n)^4} \right),$$

and if we omit $a_{3n}$, we retrieve Householder's cubic iteration.
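A direct transcription of the quartic modified iteration (a sketch; the test function $x^2 - 2$ and the start value are arbitrary):

```python
# Quartic modified iteration: Newton's step times a truncated correction
# series built from f'' and f'''.
def quartic_step(f, f1, f2, f3, x):
    fx, f1x, f2x, f3x = f(x), f1(x), f2(x), f3(x)
    return x - (fx / f1x) * (1 + fx * f2x / (2 * f1x**2)
                               + fx**2 * (3 * f2x**2 - f1x * f3x) / (6 * f1x**4))

# The square root of 2 again: f(x) = x^2 - 2.
xn = 1.5
for _ in range(3):
    xn = quartic_step(lambda x: x*x - 2, lambda x: 2*x,
                      lambda x: 2.0, lambda x: 0.0, xn)
print(xn)    # 1.4142135623730951
```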
It's also possible to find the expressions for $(a_{4n}, a_{5n}, a_{6n}, a_{7n}, ...)$ and define quintic, sextic, septic, octic, ... iterations.
In the essay on the square root of two some applications of those methods are given, and also in the essay on how to compute the inverse or the square root of a number. Here, let us compare the different iterations on the computation of the Golden Ratio

$$\varphi = \frac{1 + \sqrt{5}}{2}.$$

First we write $\varphi = 1/2 + x$, and $x = \sqrt{5}/2$ is a root of the equation

$$\frac{1}{x^2} - \frac{4}{5} = 0;$$

then it's convenient to set $h_n = 1 - (4/5) x_n^2$, so that the root satisfies $x = x_n (1 - h_n)^{-1/2}$ exactly, and we have the 5 algorithms, deduced from the general previous methods (left as an exercise!), with respectively quadratic, cubic, quartic, sextic and octic convergence:

$$x_{n+1} = x_n \left( 1 + \frac{h_n}{2} \right), \qquad x_{n+1} = x_n \left( 1 + \frac{h_n}{2} + \frac{3 h_n^2}{8} \right), \qquad x_{n+1} = x_n \left( 1 + \frac{h_n}{2} + \frac{3 h_n^2}{8} + \frac{5 h_n^3}{16} \right), \quad \ldots$$

the sextic and octic versions continuing this binomial series of $(1 - h_n)^{-1/2}$ up to the $h_n^5$ and $h_n^7$ terms,
and with a good starting value $x_0$, two steps give for $x_2$ respectively 17, 39, 69, 154 and 273 good digits. If we iterate just three more steps, $x_5$ will give respectively 139, 1049, 4406, 33321 and 140053 correct digits ... Observe that $h_n$ tends to zero as $n$ increases, which may be used to accelerate the computations; the choice of the best algorithm depends on your implementation (algorithm used for multiplication, squaring, ...).
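A sketch of the quartic variant in Python with the mpmath library (the working precision, start value and iteration count are arbitrary choices):

```python
# Golden ratio via the quartic iteration on x = sqrt(5)/2:
# h = 1 - (4/5) x^2,  x <- x (1 + h/2 + 3h^2/8 + 5h^3/16),  phi = 1/2 + x.
from mpmath import mp, mpf

mp.dps = 1000                # work with 1000 decimal digits
x = mpf(1)                   # rough start for sqrt(5)/2 = 1.118...
for _ in range(6):           # order 4: correct digits multiply by ~4 per step
    h = 1 - mpf(4) / 5 * x * x
    x *= 1 + h/2 + 3*h**2/8 + 5*h**3/16
print(mpf(1)/2 + x)          # 1.61803398874989484820458683436563811...
```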
Newton's method may also be used to find a root of a system of two or more nonlinear equations

$$f(x,y) = 0, \qquad g(x,y) = 0,$$

where $f$ and $g$ are $C^2$ functions on a given domain. Using Taylor's expansion of the two functions near $(x,y)$, we find

$$f(x+h, y+k) = f(x,y) + h \frac{\partial f}{\partial x} + k \frac{\partial f}{\partial y} + O(h^2 + k^2),$$
$$g(x+h, y+k) = g(x,y) + h \frac{\partial g}{\partial x} + k \frac{\partial g}{\partial y} + O(h^2 + k^2),$$
and if we keep only the first order terms, we are looking for a couple $(h,k)$ such that

$$f(x+h, y+k) = 0 \approx f(x,y) + h \frac{\partial f}{\partial x} + k \frac{\partial f}{\partial y},$$
$$g(x+h, y+k) = 0 \approx g(x,y) + h \frac{\partial g}{\partial x} + k \frac{\partial g}{\partial y};$$
hence it's equivalent to the linear system

$$\begin{pmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} \end{pmatrix} \begin{pmatrix} h \\ k \end{pmatrix} = - \begin{pmatrix} f(x,y) \\ g(x,y) \end{pmatrix}.$$
The $(2 \times 2)$ matrix is called the Jacobian matrix (or Jacobian) and is sometimes denoted

$$J(x,y) = \begin{pmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \\ \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} \end{pmatrix}$$

(its generalization as an $(N \times N)$ matrix for a system of $N$ equations in $N$ variables is immediate). This suggests defining the new process
$$\begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix} = \begin{pmatrix} x_n \\ y_n \end{pmatrix} - J^{-1}(x_n, y_n) \begin{pmatrix} f(x_n, y_n) \\ g(x_n, y_n) \end{pmatrix},$$
starting with an initial guess $(x_0, y_0)$; under certain conditions (which are not so easy to check, and this is again the main disadvantage of the method), it's possible to show that this process converges to a root of the system. The convergence remains quadratic.
Example 3. We are looking for a root near $(x_0 = -0.6, y_0 = 0.6)$ of the following system, the real and imaginary parts of the complex equation $z^3 = 1$ with $z = x + iy$:

$$f(x,y) = x^3 - 3xy^2 - 1 = 0, \qquad g(x,y) = 3x^2 y - y^3 = 0.$$
Depending on your initial guess, Newton's process may converge to one of the three roots of the system,

$$\left( -\frac{1}{2}, \frac{\sqrt{3}}{2} \right), \quad \left( -\frac{1}{2}, -\frac{\sqrt{3}}{2} \right), \quad (1, 0),$$

and for some values of $(x_0, y_0)$ the convergence of the process may be tricky! The study of the influence of this initial guess leads to aesthetic fractal pictures.
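A sketch of the two-dimensional process on this system, with NumPy for the $2 \times 2$ linear algebra (the iteration count is arbitrary):

```python
# Newton's method for f(x,y) = x^3 - 3xy^2 - 1 = 0, g(x,y) = 3x^2*y - y^3 = 0:
# at each step solve J * (h, k)^T = -(f, g)^T and update the point.
import numpy as np

def F(v):
    x, y = v
    return np.array([x**3 - 3*x*y**2 - 1, 3*x**2*y - y**3])

def J(v):
    x, y = v
    return np.array([[3*x**2 - 3*y**2, -6*x*y],
                     [6*x*y,            3*x**2 - 3*y**2]])

v = np.array([-0.6, 0.6])                 # the initial guess of Example 3
for _ in range(10):
    v = v + np.linalg.solve(J(v), -F(v))  # (h, k) from the linear system
print(v)    # [-0.5, 0.8660254...]: the root (-1/2, sqrt(3)/2)
```

(Colouring each starting point of the plane by the root it converges to produces the fractal pictures mentioned above.)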
Cubic convergence also exists for systems of equations in several variables: Chebyshev's method.
H. Qiu, A Robust Examination of the Newton-Raphson Method with Strong
Global Convergence Properties, Master's Thesis, University of Central
Florida, (1993)