Article

A Simplified Algorithm for a Full-Rank Update Quasi-Newton Method

Peter Berzi

Doctoral School of Applied Informatics and Applied Mathematics, Óbuda University, Bécsi út 96/B, 1034 Budapest, Hungary
AppliedMath 2025, 5(1), 15; https://doi.org/10.3390/appliedmath5010015
Submission received: 23 December 2024 / Revised: 18 January 2025 / Accepted: 5 February 2025 / Published: 8 February 2025

Abstract

An efficient linearization method for solving a system of nonlinear equations was developed, showing good stability and convergence properties. It uses an unconventional and simple strategy to improve the performance of classic methods by a full-rank update of the Jacobian approximates. It can be considered either as a discretized Newton's method or as a quasi-Newton method with a full-rank update of the Jacobian approximates. A solution to the secant equation presented earlier was based on the Wolfe–Popper procedure: the secant equation was split into two equations by introducing an auxiliary variable. This paper gives a simplified algorithm for the full-rank update procedure, which directly solves the secant equation with the pseudoinverse of the Jacobian approximate matrix. Numerical examples are shown for demonstration purposes. The convergence and efficiency of the suggested method are discussed and compared with those of classic linearization methods.

1. Introduction

One of the most common ideas for finding the zero of a nonlinear function is to replace it with a series of suitably chosen linear functions, whose zeros can easily be determined and whose sequence of zeros approximates the zero of the nonlinear function. The widely used classic methods are Newton's method and a large family of quasi-Newton methods (secant, Broyden's, discretized Newton, Steffensen's, …). This strategy can be called "linearization", and such methods may be called "linearization methods".
A mathematical model $\phi(\omega, x)$ with parameters $x = [x_i]$ ($i = 1, \dots, n$) is constructed for an observed system that gives the observable response $D = [D_j]$, sampled at locations $\omega_j$ ($j = 1, \dots, m$, $m > n$), to an observable external effect. It is assumed that the simulated system response $\phi(\omega, x)$ is sensitive to perturbations of all parameters $x_i$. The $n$ adjustable parameters $x$ of the mathematical model are determined so that the distance
$$ l_q = \| f(x) \|_q = \| \phi(\omega, x) - D \|_q \tag{1} $$
between the observed and the simulated system responses ($D_j$ and $\phi(\omega, x)$) is minimized, where $l_q : \mathbb{R}^m \to \mathbb{R}$ is a $q$-norm
$$ l_q = \left( \sum_{j=1}^{m} | f_j(x) |^q \right)^{1/q}. \tag{2} $$
If $q = 2$ is chosen, then we obtain the Euclidean or least-squares norm
$$ l_2 = \left( \sum_{j=1}^{m} f_j(x)^2 \right)^{1/2}, \tag{3} $$
which is easy to compute and safe to use for a large class of problems. Then, minimizing the distance (1) leads to the problem of minimizing the least-squares norm
$$ l_2 = \| f(x) \|_2 = \| \phi(\omega, x) - D \|_2 \tag{4} $$
($x \in \mathbb{R}^n$, $f : \mathbb{R}^n \to \mathbb{R}^m$, and $n < m$). The minimum value of this norm is zero, but for "real-life" problems it is not reachable in most cases due to modeling inaccuracies and measurement noise. The suggested procedure gives a least-squares solution to the overdetermined system of nonlinear equations
$$ f(x) = 0, \tag{5} $$
where the solution $x^*$ minimizes the least-squares norm (4) of the nonlinear residual function
$$ f(x) = \phi(\omega, x) - D. \tag{6} $$
Least-squares minimizations are performed successively within the iteration steps. The basic concept of solving Equation (5) is that the function $f(x)$ is "linearized" by repeatedly replacing it with a linear function $y_p(x)$ as
$$ f(x) \approx y_p = f_p + J_p (x - x_p) \tag{7} $$
for Newton's method, where $f_p = f(x_p)$ is the function value and $J_p = J(x_p)$ is the Jacobian matrix of the function $f(x)$ ($p$ is the iteration counter), and
$$ f(x) \approx y_p = f_p + S_p (x - x_p) \tag{8} $$
for the quasi-Newton methods, where $S_p = S(x_p)$ is the finite-difference Jacobian approximate of the function $f(x)$. Then, the nonlinear problem is solved by solving a series of linear problems
$$ y_p = 0, \tag{9} $$
and it follows that the new approximate $x_{p+1}$ to the solution $x^*$ can be given from the equations
$$ J_p \, \Delta x_p = -f_p \tag{10} $$
(Newton's method) and
$$ S_p \, \Delta x_p = -f_p \tag{11} $$
(quasi-Newton methods), where the iteration stepsize is
$$ \Delta x_p = x_{p+1} - x_p \tag{12} $$
in both cases.
The system of simultaneous multi-variable nonlinear Equations (5) can be solved by Newton's method when the derivatives $J_p$ of $f(x)$ are available analytically, and a new iterate $x_{p+1}$ can be determined. In many cases, explicit formulas for the function $f(x)$ are not available, and the Jacobian $J_p$ can only be approximated. The partial derivatives of the analytic Jacobian may be replaced by suitable finite-difference quotients (discretized Newton's iteration [1,2]) with properly chosen stepsizes. For most problems, Newton's method using analytic derivatives and Newton's method using properly chosen divided differences are virtually indistinguishable [1]. However, the determination of the finite-difference stepsizes is not clearly defined. The suggested full-rank update procedure helps to overcome this deficiency.
Let $S_p$ be a full-column-rank matrix (all column vectors of matrix $S_p$ are linearly independent). As $n < m$ (there are more rows than columns), $S_p$ is not invertible, and an exact solution to the overdetermined system of Equation (11) does not exist in general. However, $S_p^T S_p$ is invertible, and the pseudoinverse
$$ S_p^{+} = \left( S_p^T S_p \right)^{-1} S_p^T \tag{13} $$
of matrix $S_p$ is a unique left inverse, which allows finding a least-squares approximate solution $\tilde{x}$ to Equation (11). The pseudoinverse can be determined in different ways (e.g., rank factorization [3], singular value decomposition [4]). Singular value decomposition is a widely used technique, and it has the advantage that it gives a solution even if $S_p$ is singular (not a full-column-rank matrix). Matrix $S_p$ can be factorized as
$$ S_p = U_p \Sigma_p V_p^T, \tag{14} $$
where $U_p$ (an $m \times n$ matrix) and $V_p$ (an $n \times n$ matrix) are orthogonal matrices with orthonormal columns, and
$$ \Sigma_p = \mathrm{diag}(\sigma_{i,p}), \tag{15} $$
$i = 1, \dots, n$, with the singular values $\sigma_{i,p}$ of matrix $S_p$ ordered as $\sigma_1 \ge \dots \ge \sigma_n$. Then, the least-squares solution to Equation (11) exists, and the iteration stepsize is
$$ \Delta x_p = -S_p^{+} f_p, \tag{16} $$
where
$$ S_p^{+} = V_p \Sigma_p^{-1} U_p^T \tag{17} $$
is the pseudoinverse of $S_p$ and
$$ \Sigma_p^{-1} = \mathrm{diag}\left( \frac{1}{\sigma_{i,p}} \right). \tag{18} $$
The pseudoinverse $S_p^{+}$ gives a unique least-squares solution to Equation (11), for which the $q = 2$ norm (Euclidean norm) (4) of the residual function (6) is minimal. If the rank of matrix $S_p$ is less than $n$ (the column vectors of $S_p$ are not linearly independent and some singular values are zero), then a unique solution to Equation (11) does not exist. The spectral condition number
$$ \kappa(S_p) = \frac{\sigma_1}{\sigma_n} \ge 1 \tag{19} $$
of matrix $S_p$ measures the linear dependency of the column vectors. If they are linearly independent (none of the singular values are zero) and the condition number $\kappa$ is not much larger than one, then the matrix is well conditioned. This is the desirable situation throughout the whole iteration process. Then, the column vectors of matrix $S_p$ are in "general position", and the Ortega–Rheinboldt condition [2] is satisfied. If $\kappa \gg 1$, then the matrix is ill-conditioned, and the solution $\Delta x_p$ (iteration stepsize) of Equation (11) may be sensitive to small changes in $S_p$ or $f_p$.
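To make the SVD-based step concrete, a minimal Python/NumPy sketch is given below. It is an illustration of Equations (14)–(19), not the author's implementation; the truncation tolerance for small singular values is an assumption of this sketch.

```python
import numpy as np

def secant_step_svd(S, f):
    """Least-squares iteration step dx = -S^+ f via the thin SVD (14)-(18).

    Returns the step and the spectral condition number (19) so the caller
    can monitor ill-conditioning during the iteration.
    """
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)  # S = U diag(sigma) V^T (14)
    tol = max(S.shape) * np.finfo(float).eps * sigma[0]   # truncation threshold (assumption)
    inv_sigma = np.where(sigma > tol, 1.0 / sigma, 0.0)   # Sigma^{-1}, safely truncated (18)
    S_pinv = Vt.T @ np.diag(inv_sigma) @ U.T              # pseudoinverse (17)
    kappa = sigma[0] / sigma[-1]                          # condition number (19)
    return -S_pinv @ f, kappa
```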
Classic quasi-Newton linearization methods generate a sequence of improved iterates $x_p$ ($p = 0, 1, 2, \dots$) so that the next approximate $x_{p+1}$ to the solution $x^*$ is determined on the basis of a rank-1 update $S_{p+1}$ of the Jacobian approximate $S_p$. Such a linearization update is not unique if only a single new approximate $x_{p+1}$ exists. In the single-variable case ($n = 1$), there are two possible updates (secant lines), between which the selection is not obvious (see details in Section 3). In the multi-variable case, only a rank-1 update is possible if one single new approximate $x_{p+1}$ exists.
The solution to Equation (11) presented in [5] and applied to the identification of physically nonlinear dynamic systems [6,7,8] was based on the Wolfe–Popper procedure [9,10]: the secant Equation (11) was split into two equations by introducing an auxiliary variable. The simplified solution presented in Section 4 directly solves Equation (11) with the pseudoinverse (13) of matrix $S_p$. The simplified algorithm given in Section 5 provides an unconventional and simple strategy to improve the performance of Newton and quasi-Newton iterations for the solution of the system of nonlinear Equations (5). It can be considered either as a discretized Newton's method or as a quasi-Newton method with a full-rank update of the Jacobian approximates.
The rate of convergence $\alpha$ of the classic secant method equals the golden section ratio $\varphi$ ($\alpha = \varphi \approx 1.618$) for a simple root in the single-variable case. The suggested method gives an additional new independent approximate $x_{p+1}^B$ to the solution $x^*$; then, a rank-$n$ linearization update can be made with the classic approximate $x_{p+1}$ (denoted $x_{p+1}^A$ in the following) and the suggested new approximate $x_{p+1}^B$. The results of single- and multi-variable numerical examples are given in Section 6. The efficiency of the proposed method is discussed in Section 7, where it is compared with other classic rank-one update and line-search methods on the basis of available test data. Concluding remarks are summarized in Section 8.

2. Notations

Vectors and matrices are denoted by bold-face letters. Subscripts refer to components of vectors and matrices; superscripts $A$ and $B$ refer to interpolation base points. Notations $A$ and $B$ are introduced to clearly distinguish between the two new approximates $x^A$ and $x^B$. Vectors and matrices may also be given by their general elements. $\Delta$ refers to a difference between two elements. $x$ and $X$ denote unknown quantities, and $f$ and $F$ denote function values and matrices. $t$ and $T$ denote multiplier scalars and scaling transformation matrices. $e$, $\varepsilon$, and $E$ denote approximate errors, $p$ is the iteration counter, $\alpha$ is the convergence rate, and $\varepsilon^*$ is the termination criterion. $n$ is the number of unknowns, $m$ is the number of function values, and $i$, $j$, and $k$ are running indices of matrix columns and rows. Superscripts $S$ and $TS$ refer to the traditional secant method and to the suggested full-rank update method (T-secant), respectively.

3. Linearization Methods

The origin of the secant method can be traced back to ancient times, to the "rule of double false position" described in the 18th century B.C. on the Egyptian Rhind Papyrus [11]; it predates Newton's method by more than 3000 years. It is also well known that the local convergence rates of the classic Newton's and the classic secant methods for simple roots are quadratic and superlinear, respectively [12,13,14,15]. Weerakoon and Fernando's third-order Newton variant [16] involves the evaluation of a function value and two derivatives, similarly to Traub's suggestion [15]. Secant method variations with improved convergence have been reported in the literature [16,17,18,19,20,21,22,23,24]. Muller's method [25] uses a quadratic approximation. A class of variations employs two function values and a derivative evaluation [26]. Third-order methods are obtained by a second derivative evaluation, such as Halley's method [21] and the "super Halley" method [27]. Further improvements with higher-degree polynomial approximations are proposed by Chen [28], Kanwar [20], Zhang [29] and Wang [30].

3.1. Single-Variable Case

The zero $x^*$ of a scalar nonlinear function $f(x)$ ($x \mapsto f(x)$, $x \in \mathbb{R}^1$, $f : \mathbb{R}^1 \to \mathbb{R}^1$) has to be determined, where
$$ f(x^*) = 0. \tag{20} $$
Linearization means that $f(x)$ is locally replaced by a linear function $y(x)$ ($x \mapsto y(x)$, $x \in \mathbb{R}^1$, $y : \mathbb{R}^1 \to \mathbb{R}^1$) (a secant or tangent line), as shown in Figure 1 (left), and operations are made on the linear function $y_p(x)$ in the $p$th iteration. $y_p(x)$ may be defined through two points $A_p(x_p^A, f_p^A)$ and $B_p(x_p^B, f_p^B)$ of the nonlinear function $f(x)$, where $f_p^A = f(x_p^A)$ and $f_p^B = f(x_p^B)$ are function values, or through one point $A_p(x_p^A, f_p^A)$ and the "slope"
$$ S_p = \frac{\Delta f_p}{\Delta x_p} \tag{21} $$
of the secant line $y_p(x)$, where $\Delta x_p = x_p^B - x_p^A$ and $\Delta f_p = f_p^B - f_p^A$ are differences, or by the "slope"
$$ S_p = f'(x_p^A) = \frac{\mathrm{d}f}{\mathrm{d}x} \bigg|_{x_p^A} \tag{22} $$
of the tangent line $y_p(x)$ at $x_p^A$, where $f'$ is the derivative function of $f(x)$. The successive local replacement of the nonlinear function $f(x)$ by a secant or tangent line $y_p(x)$ gives a simple and efficient numerical root-finding procedure. It follows from the condition
$$ y_p(x_{p+1}^A) = 0 \tag{23} $$
that the zero
$$ x_{p+1}^A = x_p^A - \frac{f_p^A}{S_p} \tag{24} $$
of the linear function $y_p(x)$ approximates the zero $x^*$ of the nonlinear function $f(x)$, and the new point $A_{p+1}(x_{p+1}^A, f_{p+1}^A)$ can be determined for the next iteration, as shown in Figure 1 (right). With the iteration step length
$$ \Delta x_p^A = x_{p+1}^A - x_p^A \tag{25} $$
from point $A_p$, Equation (24) can be rewritten as
$$ S_p \, \Delta x_p^A = -f_p^A, \tag{26} $$
which is called the secant equation; we may also call it a "linearization" equation, as it includes Newton's method as a special case (tangent line). The secant-based procedure has the advantage that it does not need the calculation of function derivatives; it only uses function values, and the order of asymptotic convergence is superlinear with convergence rate $\alpha^S \approx 1.618$. However, the "slope" of the updated secant line $y_{p+1}(x)$ can be determined in two ways from the secant (quasi-Newton) conditions, as
$$ S_{p+1} \, \Delta x_p^A = f_{p+1}^A - f_p^A \tag{27} $$
or
$$ S_{p+1} \, \Delta x_p^B = f_{p+1}^A - f_p^B, \tag{28} $$
where
$$ \Delta x_p^B = x_{p+1}^A - x_p^B \tag{29} $$
is the iteration step length from point $B_p$. The decision is far from obvious, as shown in Figure 1 (right). The tangent-line-based iteration (Newton's method) has the advantage that the "slope"
$$ S_{p+1} = f'(x_{p+1}^A) \tag{30} $$
of the updated tangent line is always well defined. The iteration then continues with the updated secant or tangent line $y_{p+1}(x)$.
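As a minimal illustration of the iteration (24) with slope (21), the following Python sketch implements the classic secant method; replacing the older base point is one of the two possible update choices (27) and (28) discussed above, and all names are illustrative assumptions rather than a reference implementation.

```python
import numpy as np

def classic_secant(f, xA, xB, eps=1e-12, p_max=50):
    """Classic single-variable secant iteration (Equations (21) and (24))."""
    fA, fB = f(xA), f(xB)
    for p in range(p_max):
        S = (fB - fA) / (xB - xA)   # secant "slope" (21)
        x_new = xA - fA / S         # zero of the secant line (24)
        xB, fB = xA, fA             # keep old A as the new B (one of the two choices)
        xA, fA = x_new, f(x_new)
        if abs(fA) < eps:
            break
    return xA

# Example with test function (85): f(x) = cos(x) - x, root ~ 0.7390851
root = classic_secant(lambda x: np.cos(x) - x, -2.0, 2.0)
```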

3.2. Multi-Variable Case

The scalar linearization procedure can be extended to multiple dimensions. The zero $x^*$ of a nonlinear vector-valued function
$$ f(x) = \begin{bmatrix} f_1(x) \\ \vdots \\ f_m(x) \end{bmatrix} \tag{31} $$
of $n$ variables
$$ x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \tag{32} $$
($x \mapsto f(x)$, $x \in \mathbb{R}^n$, $f : \mathbb{R}^n \to \mathbb{R}^m$, $n \le m$) has to be determined, where
$$ f(x^*) = 0. \tag{33} $$
Then, $f(x)$ is locally replaced by an $n$-dimensional hyperplane
$$ y(x) \quad (x \mapsto y(x),\ x \in \mathbb{R}^n,\ y : \mathbb{R}^n \to \mathbb{R}^m,\ n \le m), \tag{34} $$
and operations are made on the hyperplane $y_p(x)$ in the $p$th iteration. $y_p(x)$ may be defined through $n + 1$ points $A_p(x_p^A, f_p^A)$ and $B_{k,p}(x_{k,p}^B, f_{k,p}^B)$ ($k = 1, \dots, n$) of the nonlinear function $f(x)$, where $f_p^A = f(x_p^A)$ and $f_{k,p}^B = f(x_{k,p}^B)$ ($k = 1, \dots, n$) are function values, or by one point $A_p(x_p^A, f_p^A)$ and the "slope" (divided differences)
$$ S_p = \Delta F_p \, \Delta X_p^{-1} \tag{35} $$
($i = 1, \dots, n$, $j = 1, \dots, m$, $k = 1, \dots, n$) of the $n$-dimensional secant hyperplane $y_p(x)$, where $\Delta x_{k,p} = x_{k,p}^B - x_p^A$ and $\Delta f_{k,p} = f_{k,p}^B - f_p^A$ are difference vectors with $n$ and $m$ components, respectively, and
$$ \Delta F_p = \left[ \Delta f_{k,p} \right] = \left[ f_{k,p}^B - f_p^A \right] = \begin{bmatrix} \Delta f_{1,1,p} & \cdots & \Delta f_{n,1,p} \\ \vdots & & \vdots \\ \Delta f_{1,m,p} & \cdots & \Delta f_{n,m,p} \end{bmatrix}, \tag{36} $$
$$ \Delta X_p = \left[ \Delta x_{k,p} \right] = \left[ x_{k,p}^B - x_p^A \right] = \begin{bmatrix} \Delta x_{1,1,p} & \cdots & \Delta x_{n,1,p} \\ \vdots & & \vdots \\ \Delta x_{1,n,p} & \cdots & \Delta x_{n,n,p} \end{bmatrix}. \tag{37} $$
The $n$-dimensional tangent hyperplane $y_p(x)$ through point $A_p(x_p^A, f_p^A)$ may be defined by the "slope"
$$ S_p = \left[ \frac{\Delta f_{k,j,p}}{\Delta x_{k,i,p}} \right] = \begin{bmatrix} \frac{\Delta f_{1,1,p}}{\Delta x_{1,p}} & \cdots & \frac{\Delta f_{n,1,p}}{\Delta x_{n,p}} \\ \vdots & & \vdots \\ \frac{\Delta f_{1,m,p}}{\Delta x_{1,p}} & \cdots & \frac{\Delta f_{n,m,p}}{\Delta x_{n,p}} \end{bmatrix}, \tag{38} $$
which corresponds to the Jacobian matrix $J_p$ of the nonlinear vector-valued function $f(x)$; accordingly, Definition (35) corresponds to an approximation of the Jacobian matrix. It follows from the condition
$$ y_p(x_{p+1}^A) = 0 \tag{39} $$
that the zero
$$ x_{p+1}^A = x_p^A - S_p^{+} f_p^A \tag{40} $$
of the $n$-dimensional hyperplane $y(x)$ approximates the zero $x^*$ of the nonlinear vector-valued function $f(x)$, where $(\cdot)^{+}$ stands for the pseudoinverse. Then, the $i$th element of the new approximate $x_{p+1}^A$ in the $p$th iteration will be
$$ x_{i,p+1}^A = x_{i,p}^A - \sum_{j=1}^{m} S_{i,j,p}^{+} f_{j,p}^A \tag{41} $$
($i = 1, \dots, n$, $j = 1, \dots, m$). With the iteration stepsize
$$ \Delta x_p^A = x_{p+1}^A - x_p^A = -S_p^{+} f_p^A = \left[ -\sum_{j=1}^{m} S_{i,j,p}^{+} f_{j,p}^A \right]_i, \tag{42} $$
Equation (40) can be rewritten as
$$ S_p \, \Delta x_p^A = -f_p^A, \tag{43} $$
which is called the "secant equation" or "linearization equation" in the multi-variable case. With the new approximate $x_{p+1}^A$ and the corresponding function value $f_{p+1}^A$, an updated Jacobian or Jacobian approximate $S_{p+1}$ has to be determined. Table 1 summarizes the basic equations of the above-detailed linearization methods.
If all partial derivatives of the function $f(x)$ are known, then the Jacobian update can easily be calculated. Newton's method is one of the most widely used algorithms, with very attractive theoretical and practical properties and with some limitations. The computational costs of Newton's method are high, since the Jacobian $J_p$ and the solution to the linear system (43) must be computed at each iteration. It is well known that the local convergence of Newton's method is $q$-quadratic if the initial trial approximate $x_0$ is close enough to the solution $x^*$, if $J(x^*)$ is nonsingular, and if $J(x)$ satisfies the Lipschitz condition
$$ \| J(x) - J(x^*) \| \le L \, \| x - x^* \| \tag{44} $$
for all $x$ close enough to $x^*$.
However, in many cases, the function $f(x)$ is not an analytical function, the partial derivatives are not known or are difficult to evaluate, and Newton's method cannot be applied. Quasi-Newton methods are widely used for solving systems of nonlinear equations when the Jacobian is not known or is difficult to determine. The Jacobian is approximated by divided differences (35), and the system of nonlinear Equations (5) is solved by repeatedly solving systems of linear Equations (43), providing the new approximates $x_{p+1}^A$.
The Jacobian approximate (35) should be updated according to the fundamental equation of the quasi-Newton methods (the "quasi-Newton condition" or "secant condition")
$$ S_{p+1} \, \Delta x_p^A = f_{p+1}^A - f_p^A \tag{45} $$
for all $p = 0, 1, 2, \dots$. However, condition (45) does not uniquely specify the update $S_{p+1}$, so the iterative procedure is not well defined, and further constraints are needed. Different methods offer their own specific solutions, but a single new quasi-Newton approximate $x_{p+1}$ will never allow a full-rank update of the Jacobian approximate $S_{p+1}$ (Equation (45) is an underdetermined system of $m$ linear equations with $mn$ unknowns). A large family of methods is available with different additional conditions or assumptions. Martínez [31] made a thorough survey of the family of practical quasi-Newton methods.
The partial derivatives of the Jacobian matrix may be replaced by suitable difference quotients (discretized Newton iteration; see [1,2])
$$ \frac{\partial f_j(x)}{\partial x_k} \approx \frac{f_j(x + \Delta x_k d_k) - f_j(x)}{\Delta x_k} = \frac{\Delta f_j(x)}{\Delta x_k} \tag{46} $$
($k = 1, \dots, n$, $j = 1, \dots, m$) with $n$ additional function value evaluations, where $d_k$ is the $k$th Cartesian unit vector. However, it is difficult to choose the stepsizes $\Delta x$. If any $\Delta x_k$ is too large, then Expression (46) can be a bad approximation to the Jacobian, so the iteration converges much more slowly, if it converges at all. On the other hand, if any $\Delta x_k$ is too small, then $\Delta f_j(x) \approx 0$, and cancellations can occur, which reduces the accuracy of the difference quotients (46) (see [32]). Another modification is the inexact-Newton approach, in which the Newton linear system is solved approximately by an iterative linear solver (see [33,34,35]). Wolfe [10] and Popper [9] suggested the column update of matrix (36).
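A minimal sketch of the divided-difference approximation (46) follows; it is illustrative only, and the stepsize vector `dx` is deliberately left to the caller, since its choice is exactly the difficulty discussed above.

```python
import numpy as np

def fd_jacobian(f, x, dx):
    """Finite-difference Jacobian approximate (46).

    dx is the vector of stepsizes (too large: poor approximation;
    too small: cancellation errors).
    """
    x = np.asarray(x, dtype=float)
    f0 = np.asarray(f(x), dtype=float)
    S = np.empty((f0.size, x.size))
    for k in range(x.size):
        xk = x.copy()
        xk[k] += dx[k]                          # step along the k-th unit vector d_k
        S[:, k] = (np.asarray(f(xk)) - f0) / dx[k]
    return S
```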
One of the most widely used formulas is that of Broyden [36], which makes a rank-one update of the Jacobian approximate as
$$ S_{p+1} = S_p + Q_p = S_p + \frac{\Delta f_p^A - S_p \, \Delta x_p^A}{\| \Delta x_p^A \|^2} \left( \Delta x_p^A \right)^T, \tag{47} $$
where $Q_p$ is a rank-one matrix, and by using the Sherman–Morrison formula, the inverse Jacobian approximate update is given as
$$ S_{p+1}^{+} = S_p^{+} + \frac{\Delta x_p^A - S_p^{+} \Delta f_p^A}{\left( \Delta x_p^A \right)^T S_p^{+} \Delta f_p^A} \left( \Delta x_p^A \right)^T S_p^{+}. \tag{48} $$
Broyden's secant condition (45) can be rewritten as
$$ S_p \, \Delta x_p^A + Q_p \, \Delta x_p^A = f_{p+1}^A - f_p^A, \tag{49} $$
and with the secant Equation (43), we have
$$ Q_p \, \Delta x_p^A = f_{p+1}^A. \tag{50} $$
As $Q_p$ is a rank-one matrix, this equation has infinitely many solutions (the left nullspace of matrix $Q_p$ is an $(n-1)$-dimensional vector space [37]), the Ortega–Rheinboldt condition [2] will not be satisfied (the column vectors of matrix $S_p$ should be linearly independent and have to be "in general position" throughout the whole iteration process), and the result may be an ill-conditioned Broyden update $S_{p+1}$.
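For reference, Broyden's rank-one update (47) takes only a few lines; the sketch below is illustrative, with `dx` and `df` standing for $\Delta x_p^A$ and $\Delta f_p^A = f_{p+1}^A - f_p^A$.

```python
import numpy as np

def broyden_update(S, dx, df):
    """Broyden rank-one update (47): S + (df - S dx) dx^T / ||dx||^2."""
    r = df - S @ dx                  # residual of the secant condition (45)
    return S + np.outer(r, dx) / (dx @ dx)
```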

4. T-Secant Method

A numerical procedure has been developed for solving an overdetermined system of nonlinear Equations (5). It can be considered either as a discretized Newton's method or as a quasi-Newton method with a full-rank update of the Jacobian approximates. The suggested full-rank update procedure (the "T-secant" method [5]) provides an unconventional and simple strategy to improve the performance of quasi-Newton iterations. It is based on the classic secant linearization, which is completed with a new independent approximate $x_{p+1}^B$. The secant equation is modified by a suitably chosen nonuniform scaling transformation, and the new independent approximate $x_{p+1}^B$ is determined from the modified secant equation. The solution to the secant equation presented in [5] was based on the Wolfe–Popper procedure [9,10], so that the secant equation was split into two equations by introducing an auxiliary variable. The simplified algorithm presented in this paper directly solves the secant equation by using the pseudoinverse of the transformed Jacobian approximate.

4.1. Single-Variable Case

The suggested procedure uses the information on the improvement of the classic secant procedure and gives a new approximate $x_{p+1}^B$ in the vicinity of the classic secant approximate $x_{p+1}^A$, as follows. Given two initial approximates $x_p^A$ and $x_p^B$ with function values $f_p^A$ and $f_p^B$, the new approximate $x_{p+1}^A$ with function value $f_{p+1}^A$ is known from the solution of the classic secant Equation (26). An independent new approximate
$$ x_{p+1}^B = x_{p+1}^A + \Delta x_{p+1} \tag{51} $$
is to be determined in the vicinity of the classic secant approximate $x_{p+1}^A$. Let the ratio
$$ t_p^f = \frac{f_{p+1}^A}{f_p^A} \tag{52} $$
($f_p^A \ne 0$) of the function value improvement and the ratio
$$ t_p^x = \frac{x_{p+1}^B - x_{p+1}^A}{x_{p+1}^A - x_p^A} = \frac{\Delta x_{p+1}}{\Delta x_p^A} \tag{53} $$
($\Delta x_p^A \ne 0$) of the desired iteration stepsize change be defined. The single-variable secant Equation (26) is then modified as
$$ S_{Tp} \, \Delta x_p^A = -f_p^A, \tag{54} $$
where
$$ S_{Tp} = \frac{t_p^f}{t_p^x} \, S_p. \tag{55} $$
Then, the new approximate $x_{p+1}^B$ can be expressed from Equation (54) as
$$ \frac{t_p^f \left( x_{p+1}^A - x_p^A \right)}{x_{p+1}^B - x_{p+1}^A} \, S_p \, \Delta x_p^A = -f_p^A. \tag{56} $$
After rearrangement, we obtain
$$ x_{p+1}^A - x_{p+1}^B = \frac{t_p^f \, S_p \left( \Delta x_p^A \right)^2}{f_p^A}, \tag{57} $$
and with the secant Equation (26), it turns into
$$ x_{p+1}^B = x_{p+1}^A - \frac{t_p^f \, S_p \left( \Delta x_p^A \right)^2}{f_p^A} = x_{p+1}^A + t_p^f \, \Delta x_p^A, \tag{58} $$
and the desired new iteration stepsize
$$ \Delta x_{p+1} = t_p^f \, \Delta x_p^A \tag{59} $$
can be given. The suggested procedure can also be applied to Newton's method (T-Newton method). Then, the "slope" $S_p$ corresponds to the derivative $f'$ of the function $f(x)$.
The geometrical representation of the suggested method can be derived as follows [5]. Let the new approximate $x_{p+1}^B$ be the zero of a function $z_p(x)$:
$$ z_p(x_{p+1}^B) = 0. \tag{60} $$
Then, replacing $x_{p+1}^B$ with $x$ in Equation (58) gives the function
$$ z_p(x) = \frac{t_p^f \, S_p \left( \Delta x_p^A \right)^2}{x - x_{p+1}^A} + f_p^A = 0 \tag{61} $$
with zero at
$$ x = x_{p+1}^B. \tag{62} $$
Equation (61) is a hyperbolic function with vertical and horizontal asymptotes $x_{p+1}^A$ and $f_p^A$, and its root $x_{p+1}^B$ will be in the vicinity of $x_{p+1}^A$ at an "appropriate distance" regulated by the function value $f_{p+1}^A$ (see Figure 2). This virtue of the suggested procedure provides an automatic mechanism for keeping the actual approximates $x_p^A$ and $x_p^B$ in general positions throughout the whole iteration process, giving stable and efficient numerical performance. The suggested method's performance is demonstrated with the results of numerical tests on test functions in Section 6.
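One single-variable T-secant step, combining the classic secant approximate (24) with the second approximate (58), can be sketched as follows (an illustration under the stated formulas, not the author's code):

```python
def t_secant_step_1d(f, xA, xB):
    """One T-secant step: classic secant (24), then x^B via (52) and (58)."""
    fA, fB = f(xA), f(xB)
    S = (fB - fA) / (xB - xA)              # secant slope (21)
    xA_new = xA - fA / S                   # classic secant approximate (24)
    t_f = f(xA_new) / fA                   # improvement ratio (52); fA assumed nonzero
    xB_new = xA_new + t_f * (xA_new - xA)  # T-secant approximate (58)
    return xA_new, xB_new
```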

4.2. Multi-Variable Case

Let two independent approximates $x_p^A = [x_{k,p}^A]$ and $x_p^B = [x_{k,p}^B]$ to the zero $x^*$ of the nonlinear vector-valued function $f(x)$ be given in the $p$th iteration ($k = 1, \dots, n$). Let the series of $n$ approximates $x_{k,p}^B$ be constructed by individually incrementing the elements $x_{k,p}^A$ of the approximate $x_p^A$ by an increment
$$ \Delta x_{k,p} = x_{k,p}^B - x_{k,p}^A \tag{63} $$
as
$$ x_{k,p}^B = x_p^A + \Delta x_{k,p} \, d_k, \tag{64} $$
where $d_k$ is the $k$th Cartesian unit vector. It follows from this special construction of the approximates $x_{k,p}^B$ that $x_{k,i,p}^B - x_{k,p}^A = 0$ for $i \ne k$ and $x_{k,i,p}^B - x_{k,p}^A = \Delta x_{k,p}$ for $i = k$, and matrix (37) will be a diagonal matrix:
$$ \Delta X_p = \left[ \Delta x_{k,p} \right] = \left[ x_{k,p}^B - x_p^A \right] = \begin{bmatrix} \Delta x_{1,p} & & 0 \\ & \ddots & \\ 0 & & \Delta x_{n,p} \end{bmatrix} = \mathrm{diag}(\Delta x_{k,p}). \tag{65} $$
Let the ratios
$$ T_p^F = \mathrm{diag}\left( t_{j,p}^F \right) = \mathrm{diag}\left( \frac{f_{j,p+1}^A}{f_{j,p}^A} \right) \tag{66} $$
($f_{j,p}^A \ne 0$) of the function value improvements ($j = 1, \dots, m$) and the ratios
$$ T_p^X = \mathrm{diag}\left( t_{i,p}^X \right) = \mathrm{diag}\left( \frac{x_{i,p+1}^B - x_{i,p+1}^A}{x_{i,p+1}^A - x_{i,p}^A} \right) = \mathrm{diag}\left( \frac{\Delta x_{i,p+1}}{\Delta x_{i,p}^A} \right) \tag{67} $$
($\Delta x_{i,p}^A \ne 0$) of the desired iteration stepsize changes ($i = 1, \dots, n$) be defined similarly to the single-variable case. Let the secant Equation (43) be modified as
$$ S_{Tp} \, \Delta x_p^A = -f_p^A, \tag{68} $$
where
$$ S_{Tp} = T_p^F \, S_p \left( T_p^X \right)^{-1} = \mathrm{diag}\left( t_{j,p}^F \right) \left[ \frac{\Delta f_{k,j,p}}{\Delta x_{i,p}} \right] \mathrm{diag}\left( \frac{1}{t_{i,p}^X} \right) = \left[ \frac{t_{j,p}^F}{t_{i,p}^X} \frac{\Delta f_{k,j,p}}{\Delta x_{i,p}} \right] \tag{69} $$
($i = 1, \dots, n$, $j = 1, \dots, m$, $k = 1, \dots, n$), or in explicit form:
$$ S_{Tp} = \begin{bmatrix} \frac{f_{1,p+1}^A}{f_{1,p}^A} & & 0 \\ & \ddots & \\ 0 & & \frac{f_{m,p+1}^A}{f_{m,p}^A} \end{bmatrix} \begin{bmatrix} \frac{\Delta f_{1,1,p}}{\Delta x_{1,p}} & \cdots & \frac{\Delta f_{n,1,p}}{\Delta x_{n,p}} \\ \vdots & & \vdots \\ \frac{\Delta f_{1,m,p}}{\Delta x_{1,p}} & \cdots & \frac{\Delta f_{n,m,p}}{\Delta x_{n,p}} \end{bmatrix} \begin{bmatrix} \frac{\Delta x_{1,p}^A}{\Delta x_{1,p+1}} & & 0 \\ & \ddots & \\ 0 & & \frac{\Delta x_{n,p}^A}{\Delta x_{n,p+1}} \end{bmatrix}, \tag{70} $$
and
$$ S_{Tp}^{+} = T_p^X \, S_p^{+} \left( T_p^F \right)^{-1} \tag{71} $$
is the pseudoinverse of $S_{Tp}$. Then, Equation (68) can be rewritten as
$$ \Delta x_p^A = -S_{Tp}^{+} f_p^A. \tag{72} $$
Then, the $i$th element $x_{i,p+1}^B$ of the new approximate $x_{p+1}^B$ in the $p$th iteration will be
$$ x_{i,p+1}^B = x_{i,p+1}^A + \Delta x_{i,p+1} = x_{i,p+1}^A - \frac{\left( \Delta x_{i,p}^A \right)^2}{\sum_{j=1}^{m} S_{i,j,p}^{+} \, f_{j,p}^A / t_{j,p}^F}, \tag{73} $$
where $t_{j,p}^F \ne 0$ ($j = 1, \dots, m$) and $i = 1, \dots, n$. Let the ratios
$$ \mu_{i,p} = \frac{\sum_{j=1}^{m} S_{i,j,p}^{+} \, f_{j,p}^A}{\sum_{j=1}^{m} S_{i,j,p}^{+} \, f_{j,p}^A / t_{j,p}^F} = \frac{-\Delta x_{i,p}^A}{\sum_{j=1}^{m} S_{i,j,p}^{+} \, f_{j,p}^A / t_{j,p}^F} \tag{74} $$
be introduced. By using Equation (42), the new iteration stepsize can be expressed from Equation (73) as
$$ \Delta x_{i,p+1} = x_{i,p+1}^B - x_{i,p+1}^A = -\frac{\left( \Delta x_{i,p}^A \right)^2}{\sum_{j=1}^{m} S_{i,j,p}^{+} \, f_{j,p}^A / t_{j,p}^F} = \mu_{i,p} \, \Delta x_{i,p}^A. \tag{75} $$
Table 2 summarizes the basic equations of the above-detailed multi-variable update method. The suggested procedure can also be applied to Newton's method (T-Newton method); then, matrix $S_p$ corresponds to the Jacobian matrix $J_p$ of the function $f(x)$. The function value improvement parameter $t_{j,p}^F$ ($j = 1, \dots, m$) in Equation (74), defined in (66), has a key role in the suggested iteration process, and its absolute value has to be bounded from below by a pregiven value $T_{\min}$ to avoid division by near-zero quantities. When the values $f_{j,p}^A$ tend to zero with increasing iteration counter $p$, then
$$ \mu_{i,p} \to T_{\min}. \tag{76} $$
Figure 3 shows the effect of the iteration parameter $T_{\min}$ on the iteration process ($N = 1000$). It is clear that too high a value of $T_{\min}$ has a negative effect on the iteration efficiency. It can be seen in the example that a value of $0.001$ causes the iteration process to fail, while lower values $T_{\min} \le 0.0001$ have almost no effect.
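The second approximate can be computed in a few vectorized lines; the Python sketch below illustrates Equations (66) and (73)–(75) under the assumption that the $T_{\min}$ clamp is applied directly to the ratios $t^F$ (cf. Step 4 of Section 5):

```python
import numpy as np

def t_secant_xB(S_pinv, fA, fA_new, xA, xA_new, T_min=1e-4):
    """Second (T-secant) approximate x_{p+1}^B from Equations (66), (73)-(75).

    S_pinv : pseudoinverse of the Jacobian approximate S_p (n x m)
    fA, fA_new : f(x_p^A) and f(x_{p+1}^A); |fA| assumed pre-bounded by f_min
    """
    tF = fA_new / fA                              # improvement ratios (66)
    tF = np.where(np.abs(tF) < T_min, T_min, tF)  # crude clamp |t^F| >= T_min
    dxA = xA_new - xA                             # classic stepsize (42)
    denom = S_pinv @ (fA / tF)                    # sum_j S+_{ij} f_j^A / t_j^F
    return xA_new - dxA**2 / denom                # Equation (73), elementwise
```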
The suggested procedure may also be repeated without using the classic secant approximates and without updating the previously known Jacobian approximate $S_p$ as
$$ \Delta x_{i,p+1+h} = x_{i,p+1+h}^B - x_{i,p+1+h}^A = -\frac{\left( \sum_{j=1}^{m} S_{i,j,p}^{+} \, f_{j,p+h}^A \right)^2}{\sum_{j=1}^{m} S_{i,j,p}^{+} \, f_{j,p+h}^A / t_{j,p+h}^F}, \tag{77} $$
where
$$ t_{j,p+h}^F = \frac{f_{j,p+1+h}^A}{f_{j,p+h}^A} \tag{78} $$
($h = 1, 2, \dots$), until the new approximates
$$ x_{i,p+1+h}^B = x_{i,p+1+h}^A + \Delta x_{i,p+1+h} \tag{79} $$
($i = 1, \dots, n$) are sufficiently improved. A numerical example with the test function
$$ f(x) = x^3 - 2x - 5 \tag{80} $$
is shown in Table 3 for demonstration purposes. The results indicate linear convergence with convergence rate $\alpha = 1$. In Table 3, $\alpha$ is the computed convergence rate, $N_f$ is the number of function value evaluations, and $L$ is the mean convergence rate suggested by Broyden [36].

5. Algorithm

Let $\varepsilon^*$ be the error bound for the termination criterion; if $x^*$ is known, then
$$ e_p^A = x_p^A - x^* \tag{81} $$
is the error vector of the approximate $x_p^A$ in the $p$th iteration, with elements $e_{i,p}^A$ ($i = 1, \dots, n$). Let the error norm
$$ \varepsilon_p = \frac{\| e_p^A \|_2}{\sqrt{n}} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( e_{i,p}^A \right)^2 } \tag{82} $$
be defined, where $\| \cdot \|_2$ is the Euclidean norm, and let the iteration be terminated when
$$ \varepsilon_p < \varepsilon^* \tag{83} $$
holds. Choose $T_{\min}$ as the lower bound for $t_{j,p}^F$ ($j = 1, \dots, m$), and let $f_{\min}$ be a lower bound for $f_{j,p}^A$ ($j = 1, \dots, m$). Let $p = 0$, and let the approximate $x_p^A$ and the difference vector $\Delta x_p$ be given. Calculate the corresponding function values $f_p^A$ and ensure that $f_{\min} < |f_{j,0}^A|$ ($j = 1, \dots, m$). The iteration constants $f_{\min}$ and $T_{\min}$ are necessary to avoid division by zero and to keep computed values away from the numerical precision limit.
  • Step 1: Generate a set of $n$ additional approximates $x_{k,p}^B$ (Equation (64)) and evaluate the function values $f_{k,p}^B$ ($k = 1, \dots, n$). Ensure that $f_{\min} < |f_{k,j,p}^B|$ ($j = 1, \dots, m$).
  • Step 2 (classic secant update method): Construct the Jacobian approximate matrix $S_p$, determine its pseudoinverse $S_p^{+}$ [4], and calculate $x_{p+1}^A$ from Equation (40) and $\varepsilon_p$ from Equation (82).
  • Step 3: If $\varepsilon_p < \varepsilon^*$, then terminate the iteration; else continue with Step 4.
  • Step 4 (suggested update method): Calculate $f_{p+1}^A$ (ensure that $f_{\min} < |f_{j,p}^A|$), $T_p^F$ from Equation (66), and $\mu_i$ from Equation (74). Let $T_{\min} < |t_{j,p}^F|$ and determine $x_{p+1}^B$ from Equation (75).
  • Step 5: Continue the iteration from Step 1 with $p = p + 1$, $x_{p+1}^A$, and $x_{p+1}^B$.
If $p_{\max}$ is the number of iterations necessary to satisfy the termination criterion $\varepsilon_p < \varepsilon^*$ and $n$ is the number of unknowns to be determined, then the suggested update method needs $n + 1$ function evaluations in each iteration and altogether
$$ N_f = p_{\max} (n + 1) \tag{84} $$
function evaluations to reach the desired termination criterion. $p_{\max}$ depends on many circumstances, such as the nature of the function $f(x)$, the termination criteria ($\varepsilon^*$ or others), the distance of the initial approximate $x^A$ from the solution $x^*$, and the iteration constants $f_{\min}$ and $T_{\min}$.
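For concreteness, the five steps can be assembled into a short driver loop. The Python sketch below is a hedged illustration: it terminates on the residual norm rather than on the error norm (82) (which would require the unknown solution $x^*$), and the guard against small $|f_{j,p}^A|$ is a simplistic stand-in for the $f_{\min}$ rule.

```python
import numpy as np

def t_secant(f, xA0, dx0, eps=1e-10, f_min=1e-14, T_min=1e-4, p_max=100):
    """Sketch of the simplified T-secant algorithm (Steps 1-5, Section 5)."""
    xA = np.asarray(xA0, dtype=float)
    dx = np.asarray(dx0, dtype=float)
    n = xA.size
    for p in range(p_max):
        fA = np.asarray(f(xA), dtype=float)
        # Step 1: n additional approximates x_k^B = x^A + dx_k d_k (64)
        dF = np.empty((fA.size, n))
        for k in range(n):
            xk = xA.copy()
            xk[k] += dx[k]
            dF[:, k] = np.asarray(f(xk)) - fA
        S = dF / dx                        # S = dF dX^{-1}, dX diagonal (35), (65)
        # Step 2: classic secant update (40), (42)
        S_pinv = np.linalg.pinv(S)
        dxA = -S_pinv @ fA
        xA_new = xA + dxA
        fA_new = np.asarray(f(xA_new), dtype=float)
        # Step 3: termination (on the residual norm in this sketch)
        if np.linalg.norm(fA_new) < eps:
            return xA_new
        # Step 4: suggested update, second approximate (66), (73)-(75)
        fA_safe = np.where(np.abs(fA) < f_min, f_min, fA)  # crude f_min guard
        tF = fA_new / fA_safe
        tF = np.where(np.abs(tF) < T_min, T_min, tF)       # T_min clamp
        dx = -dxA**2 / (S_pinv @ (fA_safe / tF))           # new stepsizes (75)
        # Step 5: continue with x^A_{p+1}; dx encodes x^B_{p+1} - x^A_{p+1}
        xA = xA_new
    return xA
```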

6. Numerical Test Results

6.1. Single-Variable Test Function

A numerical example is given with the single-variable test function
$$ f(x) = \cos x - x, \tag{85} $$
with root $x^* \approx 0.73908513$. The results of the classic secant iteration process are shown in Table 4 and Figure 4. Iterations were made with the initial approximates $x_0^A = -2.0$ and $x_0^B = 2.0$ ($p = 0$), providing $f_0^A = 1.584$. The first secant approximate $x_1^A = -0.416$ is found as the zero of the first secant line $y_0(x)$, providing $f_1^A = 1.331$ ($p = 1$). The next iteration ($p = 2$) continues with the approximate $x_2^A = 0.442$ and with $f_2^A = 0.462$.
The results of the suggested iteration process are shown in Table 5 and Figure 5. Iterations were made with the same initial approximates as in the case of the classic secant iteration, and the first secant approximate $x_1^A = -0.416$ is found. The first T-secant approximate $x_1^B = 0.915$ is given as the zero of the first hyperbola function $z_0(x)$ (Figure 5, left). The iteration then goes on with the approximates $x_1^A = -0.416$ and $x_1^B = 0.915$, providing $f_1^A = 1.331$ ($p = 1$), and gives the new approximates $x_2^A = 0.667$ and $x_2^B = 0.764$ as the zeros of the second secant and hyperbola functions $y_1(x)$ and $z_1(x)$, respectively (Figure 5, left). The next iteration ($p = 2$) then continues with the approximates $x_2^A = 0.667$ and $x_2^B = 0.764$ and with $f_2^A = 0.119$, and gives $x_3^A = 0.7387$ and $x_3^B = 0.7391$ (Figure 5, right).

6.2. Solution of an Inverse Problem

An example is given for an imaginary modeling problem. The modeling of systems on the basis of observed or specified system responses plays a key role in scientific research. A central problem of modeling is to fit a structured mathematical model to available data so that the values of the unknown model parameters are determined providing that the simulated data are as near to the available data as possible. A system response can generally be represented by a “curve” in a two (or three)-dimensional space and can be simulated by a computer program. Parameter identification corresponds to minimizing the distance between observed and simulated system responses. Parameter identification problems can generally be formulated as nonlinear least-squares problems so that some unknown model parameters are determined providing minimal deviation between observed and simulated system responses. Solving nonlinear least-squares problems is often a difficult task, especially if the number of unknowns is high.
A rational system under investigation produces a measurable response to a known, measurable external effect. A mathematical model with $n$ unknown parameters $p_1, \dots, p_n$ simulates the behavior of the system, providing a response to the same external effect as acted on the system during observation. Let the observed and the simulated system responses be represented by two-dimensional curves, sampled at $k$ discrete points with coordinates $x_i$ and $y_i$ ($i = 1, \dots, k$). The coordinates of a synthetic observed response were generated as
$$ x(z) = (a + b) \cos(z + \alpha) - c \cos\left( \frac{a + b}{b} z + \alpha \right) + x_0, \tag{86} $$
$$ y(z) = (a + b) \sin(z + \alpha) - c \sin\left( \frac{a + b}{b} z + \alpha \right) + y_0 \tag{87} $$
(an epicycloid) with parameters $x_0 = 10$, $y_0 = 8$, $a = 4$, $b = 2$, $c = 3.5$, and $\alpha = 1$, and with $0 \le z \le 2\pi$. The parameters
$$ p = \begin{bmatrix} x_0 & y_0 & a & b & c \end{bmatrix}^T \tag{88} $$
were considered to be unknown, and the system response was simulated with the initial approximate
$$ p_0 = \begin{bmatrix} x_0 & y_0 & a & b & c \end{bmatrix}^T = \begin{bmatrix} 8 & 11 & 3.5 & 2.5 & 3 \end{bmatrix}^T. \tag{89} $$
The distance between observed and simulated system responses was defined for arbitrary two-dimensional curves. This definition can be extended to $n$ dimensions, making it possible to formulate a wide range of parameter identification problems. The value of the defined distance is dimensionless, and it expresses the ratio of the area between the system response graphs in a normalized coordinate system to the area of a rectangle with unit area.
The distance between the observed and simulated system responses was quantified by the area $D(p)$ between the graphs of the system responses in a normalized coordinate system. A partial area $r_j(p)$ was defined as the area of the quadrangle between the $(j-1)$th and $j$th division points on the system response graphs ($j = 1, \dots, m$), as shown in Figure 6. These division points were selected equidistantly along the system response graphs.
Thus, parameter identification problems can be formulated as solving a system of nonlinear equations
$$ r(p) = 0, \tag{90} $$
where $p \mapsto r(p)$, $p \in \mathbb{R}^5$, $r : \mathbb{R}^5 \to \mathbb{R}^m$, and $r$ is the residual vector with components $r_j$ ($j = 1, \dots, m$). The unknown parameters $p$ of the epicycloid (86) and (87) were determined according to the suggested update algorithm. The variation in the parameters $p$ through the iterations and the observed and simulated system responses for the initial approximate are shown in Figure 6. The observed and simulated system response pairs after 7, 13, 19, 25, 31, and 43 simulations ($S$) are shown in Figure 7.
The results of numerical tests with different initial approximates show that the optimal solution was not reached in all cases. The convergence was especially sensitive to the initial values of parameters $a$, $b$, and $c$. Several local optimal solutions were detected, as shown in Figure 8.

7. Efficiency

The efficiency of an algorithm for the solution of nonlinear equations is thoroughly discussed by Traub [15], as follows. Let $\beta$ be the order of the iteration sequence such that, for the approximate errors $e_i = x_i - x^*$, there exists a nonzero constant $C$ (the asymptotic error constant) for which
$$ \frac{| e_{i+1} |}{| e_i |^{\beta}} \to C. \tag{91} $$
A natural measure of the information used by an algorithm is the "informational usage" $d$, defined as the number of new pieces of information (values of the function and its derivatives) required per iteration (called a "horner" by Ostrowski [14]). Then, the efficiency of the algorithm within one iteration can be measured by the "informational efficiency"
$$ \mathrm{EFF} = \frac{\beta}{d}. \tag{92} $$
An alternative definition of efficiency is
$$ {}^{*}\mathrm{EFF} = \beta^{1/d}, \tag{93} $$
called the "efficiency index" by Ostrowski [14]. Another measure of efficiency, called "computational efficiency", takes into account the "cost" of calculating different derivatives. The concepts of informational efficiency ($\mathrm{EFF}$) and efficiency index (${}^{*}\mathrm{EFF}$) take into account neither the cost of evaluating $f$ and its derivatives nor the total number of pieces of information needed to achieve a certain accuracy in the root of the function. If $f$ is composed of elementary functions, then the derivatives are also composed of elementary functions; thus, the cost of evaluating the derivatives is merely the cost of combining the elementary functions.
Broyden [36] suggested the mean convergence rate
$$ L = \frac{1}{N_f} \ln \frac{R(x_0^A)}{R(x_{p_{\max}}^A)} \tag{94} $$
as a measure of the efficiency of an algorithm for solving a particular problem, where $N_f$ is the total number of function evaluations, $x_0^A$ is the initial approximate, and $x_{p_{\max}}^A$ is the last approximate to the solution $x^*$ when the termination criterion is satisfied after $p_{\max}$ iterations. $R(x)$ is the Euclidean norm of $f(x)$.
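As a quick worked check of Equation (94) (and of the scaled variant (95) introduced below), the following snippet reproduces row 1 of Table 7; it is a trivial illustration, not part of the published algorithm.

```python
import numpy as np

def mean_convergence_rate(R0, R_final, N_f, N=None):
    """Broyden's mean convergence rate L (94); with N given, returns L_N = N*L (95)."""
    L = np.log(R0 / R_final) / N_f
    return L if N is None else N * L

# Row 1 of Table 7 (Broyden 1., N = 2): L = ln(4.92 / 4.7e-10) / 59 ~ 0.391
L = mean_convergence_rate(4.92, 4.7e-10, 59)
```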
Zhang's two-step method [29] is similar to the King–Werner method [38,39], with convergence rate 2.414 [40]. Chen's [28] and Wang's [30] three-step methods show asymptotic convergence order 2.732. Chen's method [28] requires two function evaluations and one derivative evaluation using a second-order polynomial proposed in [30], and shows an Ostrowski [14] efficiency index of 1.174. Wang's method [30] requires three function evaluations and two derivative evaluations, showing an Ostrowski index of 1.101. Zhang [29] published a finite-difference-based secant method with asymptotic convergence order 2.618. Later, Ren [40] showed that Zhang's [29] method has only a 2.414 convergence order due to some mistakes in the derivation. Table 6 compares the efficiencies of classic algorithms (rows 1–2: secant, Newton), improved algorithms with the suggested full-rank update (rows 3 and 5: T-secant, T-Newton), the T-secant method with constant Jacobian approximate $S_p$ (row 4: TS-const. $S_p$; see the data in Table 3), and Chen's [28] and Wang's [30] methods.
Very limited data are available to compare the performance of the suggested update method (T-secant) with other classic methods, especially for a large number of unknowns. Efficiency results were given by Broyden [36] for the Rosenbrock function for $N = 2$. The calculated convergence rates for the two Broyden method variants [36], for Powell's method [41], for the adaptive coordinate descent method [42], and for the Nelder–Mead simplex method [43] are compared with the calculated values for the T-secant method in Table 7 [5]. Rows 1–5 are data from the referenced papers, rows 6–8 are T-secant results with the referenced initial approximates, and rows 9–15 are calculated data for $N > 2$. If the value of $R(x_{p_{\max}}^A)$ is zero, then the mean convergence rates ($L$ and $L_N$) cannot be computed (zero in the denominator). A substitute value of $10^{-25}$ was used when $R(x_{p_{\max}}^A) = 0$ in rows 6, 7, 10, and 13.
Results show that the mean convergence rate $L$ (Equation (94)) for $N = 2$ is much higher for the T-secant method ($5.5$–$6.9$) than for the other listed methods ($0.1$–$0.6$). However, it is obvious that the mean convergence rate values decrease rapidly with increasing $N$ (more unknowns need more function evaluations). A modified convergence rate
$$ L_N = N \, L = \frac{N}{N_f} \ln \frac{R(x_0^A)}{R(x_{p_{\max}}^A)} \tag{95} $$
is suggested as an $N$-independent measure of efficiency (see Table 7). The values of $L$ and $L_N$ are at least 10 times larger for the T-secant method than for the referenced classic methods for $N = 2$ (see Table 7).
The efficiency measures ($L$ and $L_N$) also depend on the initial conditions (the distance of the initial approximate from the optimal solution and the termination criterion). Results from a large number of numerical tests indicate an average $L_N \approx 7.4$, with a standard deviation of around $3.7$, for the T-secant method, even for large $N$.

8. Conclusions

A numerical procedure has been developed for solving an overdetermined system of nonlinear Equations (5). It can be considered either as a discretized Newton method or as a quasi-Newton method with a full-rank update of the Jacobian approximates. Quasi-Newton methods are widely used for solving systems of nonlinear equations when the function derivatives (Jacobian) are not known or are difficult to determine. The derivatives (Jacobian) are approximated by divided differences (35), and the system of nonlinear Equations (5) is solved by repeatedly solving systems of linear Equations (43). The new approximate $x_{p+1}^A$ is determined from the classic secant equation, and the divided differences are updated according to the secant condition (45). However, the secant condition does not uniquely specify the Jacobian approximate, so the update procedure is not well defined, and further constraints are needed. Different methods offer specific update solutions.
The suggested numerical procedure (the "T-secant" method [5]) provides an unconventional and simple strategy to improve the performance of quasi-Newton iterations by a full-rank update of the Jacobian approximates. It is based on the classic secant linearization and allows a full-rank update of the Jacobian approximates. The classic secant iteration is completed with a new independent approximate $x_{p+1}^B$, which is determined from a modified secant Equation (68). The modification is made by a suitably chosen nonuniform scaling transformation: scaling is made by the quotients of the classic secant function value improvements (66) and by the quotients of the desired new iteration stepsizes and the classic secant stepsizes (67). The solution to the secant equation presented in [5] was based on the Wolfe–Popper procedure [9,10], so that the secant equation was split into two equations by introducing an auxiliary variable. The simplified algorithm presented in this paper directly solves the secant equation by using the pseudoinverse (71) of the transformed Jacobian approximate (69).
It has been shown that the new T-secant approximate $x_{p+1}^B$ will be in the vicinity of the classic secant approximate $x_{p+1}^A$ if the classic secant iterates converge to the root of the nonlinear function [5]. The Jacobian approximate is then full-rank updated by constructing new divided differences (Jacobian approximates) from the classic secant and T-secant approximates.
It was shown that the iterative procedure possesses a superquadratic asymptotic convergence property with convergence rate $\alpha = \varphi + 1 = \varphi^2 \approx 2.618$ for a simple root in the single-variable case [5], where $\varphi$ is the golden section ratio. The suggested procedure can also be applied to Newton's method (matrix $S_p$ then corresponds to the Jacobian matrix $J_p$). Numerical test results indicate that the quadratic convergence ($\alpha = 2$) of Newton's method increases to cubic ($\alpha = 3$).
The efficiency has been studied in the multi-variable case and compared with that of other classic rank-one update and line-search methods on the basis of available data. The results show that its efficiency is considerably better than the efficiency of other classic low-rank update methods. The suggested method's performance was demonstrated by the results of numerical tests with widely used single- and multi-variable benchmark test functions. A Rosenbrock test function was used with up to 1000 variables [5]. The method has also been successfully applied to the solution of different "real-life" inverse problems [6,7,8] on physically nonlinear dynamic systems. Further studies can be carried out on the convergence properties in the case of multiple roots and on the efficiency characteristics for more benchmark test functions. The suggested procedure may be used for other applications, especially with a large number of unknowns and when a low number of function evaluations is crucial.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Acknowledgments

A considerable part of the research work has been performed between the years 1988 and 1992 at Technical University of Budapest (Hungary), at TNO-BOUW Structural Division (The Netherlands) and at Technical High-school of Lulea (Sweden). The work has been sponsored by the Technical University of Budapest (Hungary), by the Hungarian Academy of Sciences (Hungary), by TNO-BOUW (The Netherlands), by Sandvik Rock Tools (Sweden), by CP Test a/s (Denmark), and by Óbuda University (Hungary). Valuable discussions and personal support from Géza Petrasovits, György Popper, Peter Middendorp, Rikard Skov, Bengt Lundberg, Mario Martinez and Csaba J. Hegedűs are greatly appreciated.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Dennis, J.E., Jr.; Schnabel, R.B. Numerical Methods for Unconstrained Optimization and Nonlinear Equations; Prentice-Hall: Englewood Cliffs, NJ, USA, 1983. [Google Scholar]
  2. Ortega, J.M.; Rheinboldt, W.C. Iterative Solution of Nonlinear Equations in Several Variables; Academic Press: New York, NY, USA, 1970. [Google Scholar]
  3. Hegedus, C. Numerical Methods I; ELTE, Faculty of Informatics: Budapest, Hungary, 2015. [Google Scholar]
  4. Press, W.H.; Flannery, B.P.; Teukolsky, S.A.; Vetterling, W.T. Numerical Recipes; Cambridge University Press: Cambridge, UK, 1986. [Google Scholar]
  5. Berzi, P. Convergence and Stability Improvement of Quasi-Newton Methods by Full-Rank Update of the Jacobian Approximates. AppliedMath 2024, 4, 143–181. [Google Scholar] [CrossRef]
  6. Berzi, P.; Beccu, R.; Lundberg, B. Identification of a Percussive Drill Rod Joint from its Response to Stress Wave Loading. Int. J. Impact Eng. 1994, 18, 281–290. [Google Scholar] [CrossRef]
  7. Berzi, P. Pile-Soil Interaction due to Static and Dynamic Load. In Proceedings of the 13th International Conference on Soil Mechanics and Foundation Engineering, New Delhi, India, 5–10 January 1994; pp. 609–612. [Google Scholar]
  8. Berzi, P.; Popper, G. Evaluation of dynamic load test results on piles. In Proceedings of the International Symposium on Identification of Nonlinear Mechanical Systems from Dynamic Tests (Euromech 280), Ecully, France, 29–31 October 1991; pp. 121–128. [Google Scholar]
  9. Popper, G. Numerical method for least square solving of nonlinear equations. Period. Polytech. 1985, 29, 67–69. [Google Scholar]
  10. Wolfe, P. The Secant Method for Simultaneous Nonlinear Equations. Commun. ACM 1959, 2, 12–13. [Google Scholar] [CrossRef]
  11. Papakonstantinou, J.M.; Tapia, R.A. Origin and evolution of the secant method in one dimension. Am. Math. Mon. 2013, 120, 500–518. [Google Scholar] [CrossRef]
  12. Broyden, C.G.; Dennis, J.E.; Moré, J.J. On the local and superlinear convergence of quasi-Newton methods. J. Inst. Math. Appl. 1973, 12, 223–245. [Google Scholar] [CrossRef]
  13. Dennis, J.E.; Moré, J.J. A characterization of superlinear convergence and its application to quasi-Newton methods. Math. Comput. 1974, 28, 543–560. [Google Scholar] [CrossRef]
  14. Ostrowski, A.M. Solution of Equations and Systems of Equations; Academic Press: New York, NY, USA, 1966. [Google Scholar]
  15. Traub, J.F. Iterative Methods for the Solution of Equations, 1st ed.; Prentice-Hall, Inc.: Englewood Cliffs, NJ, USA, 1964. [Google Scholar]
  16. Weerakoon, S.; Fernando, T.G.I. A variant of Newton’s method with accelerated third-order convergence. Appl. Math. Lett. 2000, 13, 87–93. [Google Scholar] [CrossRef]
  17. Gerlach, J. Accelerated convergence in Newton’s method. SIAM Rev. 1994, 36, 272–276. [Google Scholar] [CrossRef]
  18. Homeier, H.H.H. On Newton-type methods with cubic convergence. J. Comput. Appl. Math. 2005, 176, 425–432. [Google Scholar] [CrossRef]
  19. Kou, J.; Li, Y.; Wang, X. Third-order modification of Newton’s method. J. Comput. Appl. Math. 2007, 205, 1–5. [Google Scholar]
  20. Kanwar, V.; Sharma, J.R.; Mamta, J. A new family of Secant-like method with super-linear convergence. Appl. Math. Comput. 2005, 171, 104–107. [Google Scholar] [CrossRef]
  21. Melman, A. Geometry and convergence of Euler’s and Halley’s methods. SIAM Rev. 1997, 39, 728–735. [Google Scholar] [CrossRef]
  22. Özban, A.Y. Some new variants of Newton’s method. Appl. Math. Letter. 2004, 17, 677–682. [Google Scholar] [CrossRef]
  23. Scavo, T.R.; Thoo, J.B. On the geometry of Halley’s method. Am. Math. Mon. 1995, 102, 417–426. [Google Scholar] [CrossRef]
  24. Shaw, S.; Mukhopadhyay, B. An improved regula falsi method for finding simple roots of nonlinear equations. Appl. Math. Comput. 2015, 254, 370–374. [Google Scholar] [CrossRef]
  25. Muller, D.E. A Method for Solving Algebraic Equations Using an Automatic Computer. Math. Tables Other Aids Comput. 1956, 10, 208–215. [Google Scholar] [CrossRef]
  26. Thukral, R. A New Secant-type method for solving nonlinear equations. Am. J. Comput. Appl. Math. 2018, 8, 32–36. [Google Scholar]
  27. Amat, S.; Busquier, S.; Gutiérrez, J.M. Geometric constructions of iterative functions to solve nonlinear equations. J. Comput. Appl. Math. 2003, 157, 197–205. [Google Scholar] [CrossRef]
  28. Chen, L.; Ma, Y. A new modified King–Werner method for solving nonlinear equations. Comput. Math. Appl. 2011, 62, 3700–3705. [Google Scholar] [CrossRef]
  29. Zhang, H.; Li, D.-S.; Liu, Y.-Z. A new method of secant-like for nonlinear equations. Commun. Nonlinear Sci. Numer. Simul. 2009, 14, 2923–2927. [Google Scholar]
  30. Wang, X.; Kou, J.; Gu, C. A new modified secant-like method for solving nonlinear equations. Comput. Math. Appl. 2010, 60, 1633–1638. [Google Scholar] [CrossRef]
  31. Martínez, J.M. Practical quasi-Newton methods for solving nonlinear systems. J. Comput. Appl. Math. 2000, 124, 97–121. [Google Scholar] [CrossRef]
  32. Stoer, J.; Bulirsch, R. Introduction to Numerical Analysis; Springer: Berlin/Heidelberg, Germany, 2002. [Google Scholar]
  33. Birgin, E.G.; Krejic, N.; Martinez, J.M. Globally convergent inexact quasi-Newton methods for solving nonlinear systems. Num. Algorithms 2003, 32, 249–260. [Google Scholar] [CrossRef]
  34. Dembo, R.S.; Eisenstat, S.C.; Steihaug, T. Inexact Newton methods. SIAM J. Numer. Anal. 1982, 19, 400–408. [Google Scholar] [CrossRef]
  35. Martinez, J.M.; Qi, L. Inexact Newton methods for solving non-smooth equations. J. Comput. Appl. Math. 1995, 60, 127–145. [Google Scholar] [CrossRef]
  36. Broyden, C.G. A class of Methods for Solving Nonlinear Simultaneous Equations. Math. Comput. Am. Math. 1965, 19, 577–593. [Google Scholar] [CrossRef]
  37. Strang, G. Introduction to Linear Algebra, revised international ed.; Wellesley-Cambridge Press: Wellesley, MA, USA, 2005. [Google Scholar]
  38. King, R.F. Tangent method for nonlinear equations. Numer. Math. 1972, 18, 298–304. [Google Scholar] [CrossRef]
  39. Werner, W. Über ein Verfahren der Ordnung 1 + √2 zur Nullstellenbestimmung. Numer. Math. 1979, 32, 333–342. [Google Scholar] [CrossRef]
  40. Ren, H.; Wu, Q.; Bi, W. On convergence of a new secant-like method for solving nonlinear equations. Appl. Math. Comput. 2010, 217, 583–589. [Google Scholar] [CrossRef]
  41. Powell, M.J.D. An efficient method for finding the minimum of a function of several variables without calculating derivatives. Comput. J. 1964, 7, 155–162. [Google Scholar] [CrossRef]
  42. Loshchilov, I.; Schoenauer, M.; Sebag, M. Adaptive Coordinate Descent. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), Dublin, Ireland, 12–16 July 2011; ACM Press: New York, NY, USA, 2011; pp. 885–892. [Google Scholar]
  43. Nelder, J.A.; Mead, R. A simplex method for function minimization. Comput. J. 1965, 7, 308–313. [Google Scholar] [CrossRef]
Figure 1. Left: linearization of a nonlinear function; Right: classic secant method.
Figure 2. Suggested full-rank update method (T-secant).
Figure 3. Effect of the iteration parameter $T_{\min}$ on the iteration process ($N = 1000$).
Figure 4. Classic secant iterations with test Function (85) (see Table 4). (Left: $x_1^A = -0.416$ is the root of $y_0(x)$, then $x_2^A = 0.442$ is the root of $y_1(x)$. Middle: $x_3^A = 0.898$ is the root of $y_2(x)$ and $x_4^A = 0.728$ is the root of $y_3(x)$. Right: $x_5^A = 0.7387$ is the root of $y_4(x)$ and $x_6^A = 0.739086$ is the root of $y_5(x)$.)
Figure 5. T-secant iterations with test Function (85) (see Table 5). (Left: $x_1^A = -0.416$ is the root of $y_0(x)$, $x_1^B = 0.915$ is the root of $z_0(x)$; then $x_2^A = 0.667$ is the root of $y_1(x)$, $x_2^B = 0.764$ is the root of $z_1(x)$. Right: $x_3^A = 0.7387$ is the root of $y_2(x)$, $x_3^B = 0.7391$ is the root of $z_2(x)$.)
Figure 6. Definition of the distance between system responses (left) and variations in parameters through iterations (right).
Figure 7. Observed and simulated system response pairs after "S" function value evaluations.
Figure 8. Local optimal solutions.
Table 1. Linearization method's basic equations (single- and multi-variable cases).

| # | Single-Variable ($m = n = 1$) | Multi-Variable ($m \ge n > 1$) | Equations |
|---|---|---|---|
| 1 | $\Delta x_p = x_p^B - x_p^A$ | $\Delta X_p = [\Delta x_{k,i,p}]$ | (37) |
| 2 | $\Delta f_p = f_p^B - f_p^A$ | $\Delta F_p = [\Delta f_{k,j,p}]$ | (36) |
| 3 | $S_p = \Delta f_p / \Delta x_p$ | $S_p = \Delta F_p \, \Delta X_p^{-1}$ | (21) and (35) |
| 4 | $x_{p+1}^A = x_p^A - f_p^A / S_p$ | $x_{p+1}^A = x_p^A - S_p^{+} f_p^A$ | (24) and (40) |
| 5 | $S_{p+1} \Delta x_p^A = f_{p+1}^A - f_p^A$ | $S_{p+1} \Delta x_p^A = f_{p+1}^A - f_p^A$ | (27) and (45) |
Table 2. Basic equations of classic and suggested multi-variable update methods.

| # | Classic Secant Method | Suggested Update Method | Equations |
|---|---|---|---|
| 1 | | $x_{k,p}^B = x_p^A + \Delta x_{k,p} d_k$ | (64) |
| 2 | $\Delta X_p = [\Delta x_{k,p}]$ | $\Delta X_p = \mathrm{diag}(\Delta x_{k,p})$ | (37) and (65) |
| 3 | $\Delta F_p = [f_{k,p}^B - f_p^A]$ | $\Delta F_p = [f_{k,p}^B - f_p^A]$ | (36) |
| 4 | | $T_p^X = \mathrm{diag}(t_{i,p}^X) = \mathrm{diag}(\Delta x_{i,p+1} / \Delta x_{i,p}^A)$ | (67) |
| 5 | | $T_p^F = \mathrm{diag}(t_{j,p}^F) = \mathrm{diag}(f_{j,p+1}^A / f_{j,p}^A)$ | (66) |
| 6 | $S_p = \Delta F_p \Delta X_p^{-1}$ | $S_{Tp} = T_p^F S_p (T_p^X)^{-1}$ | (35) and (69) |
| 7 | $S_p \Delta x_p^A = -f_p^A$ | $S_{Tp} \Delta x_p^A = -f_p^A$ | (43) and (68) |
| 8 | $\Delta x_p^A = -S_p^{+} f_p^A$ | $\Delta x_p^A = -S_{Tp}^{+} f_p^A$ | (42) and (72) |
| 9 | $x_{i,p+1}^A = x_{i,p}^A - \sum_{j=1}^{m} S_{i,j,p}^{+} f_{j,p}^A$ | $x_{i,p+1}^B = x_{i,p+1}^A - (\Delta x_{i,p}^A)^2 \big/ \sum_{j=1}^{m} (S_{i,j,p}^{+} f_{j,p}^A / t_{j,p}^F)$ | (41) and (73) |
| 10 | $x_{i,p+1}^B = x_{i,p}^B$ | $\mu_{i,p} = \sum_{j=1}^{m} S_{i,j,p}^{+} f_{j,p}^A \big/ \sum_{j=1}^{m} (S_{i,j,p}^{+} f_{j,p}^A / t_{j,p}^F)$ | (74) |
| 11 | $\Delta x_{i,p+1} = x_{i,p+1}^B - x_{i,p+1}^A$ | $\Delta x_{i,p+1} = \mu_{i,p} \, \Delta x_{i,p}^A$ | (75) |
Table 3. T-secant iteration with constant Jacobian approximate $S_p$ (Equation (77)).

| $h$ | $x_{p+h}^A$ | $x_{p+h}^B$ | $\Delta x_{p+h}^A$ | $t_{p+h}^f$ | $\Delta x_{p+1+h}$ | $e_{p+1+h}^A$ | $\alpha_{p+h}^T$ | $N_f$ | $L$ |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.700 | 2.000 | 0.421 | −0.085 | 0.036 | 2.6×10⁻² | | 2 | 1.23 |
| 1 | 2.121 | 2.156 | −0.036 | −0.359 | −0.013 | 2.2×10⁻² | | 4 | 0.87 |
| 2 | 2.085 | 2.072 | 0.013 | −0.342 | 0.0044 | 7.6×10⁻³ | 1.08 | 6 | 0.76 |
| 3 | 2.098 | 2.102 | −0.0044 | −0.348 | −0.0015 | 2.7×10⁻³ | 0.97 | 8 | 0.70 |
| 4 | 2.093 | 2.092 | 0.0015 | −0.346 | 0.00053 | 9.2×10⁻⁴ | 1.01 | 10 | 0.67 |
| 5 | 2.0949 | 2.0955 | −0.00053 | −0.347 | −0.00018 | 3.2×10⁻⁴ | 0.997 | 12 | 0.65 |
| 6 | 2.0944 | 2.0942 | 0.00018 | −0.346 | 0.000063 | 1.1×10⁻⁴ | 1.001 | 14 | 0.63 |
| 7 | 2.0946 | 2.0947 | −0.000063 | −0.346 | 0.000022 | 3.8×10⁻⁵ | 0.9997 | 16 | 0.62 |
Table 4. Classic secant iterations with test Function (85) (see Figure 4).

| $p$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| $x_p^A$ | −2.000 | 2.000 | −0.416 | 0.442 | 0.898 | 0.7279 | 0.7387 |
| $\Delta x_p$ | 4.000 | −2.416 | 0.858 | 0.456 | −0.170 | 0.011 | 3.6×10⁻⁴ |
| $f_p^A$ | 1.584 | −2.416 | 1.331 | 0.462 | −0.275 | 0.019 | 6.1×10⁻⁴ |
| $x_{p+1}^A$ | −0.416 | 0.442 | 0.898 | 0.728 | 0.7387 | 0.739086 | 0.739085 |
| $e_{p+1}^A$ | −1.155 | −0.297 | 0.159 | −0.011 | −0.0004 | 9.1×10⁻⁷ | 7.3×10⁻¹¹ |
| $\alpha^S$ | | 0.113 | 15.5 | 0.459 | 4.249 | 1.292 | |
| $f_{p+1}^A$ | 1.331 | 0.462 | −0.275 | 0.019 | 6.1×10⁻⁴ | −1.5×10⁻⁶ | 1.2×10⁻¹⁰ |
Table 5. T-secant iterations with test Function (85) (see Figure 5).

| $p$ | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| $x_p^A$ | −2.000 | −0.416 | 0.6668 | 0.7387 | 0.7390851328 |
| $\Delta x_p$ | 4.000 | 1.331 | 0.0968 | 4.1×10⁻⁴ | 3.9×10⁻¹⁰ |
| $f_p^A$ | 1.584 | 1.331 | 0.1190 | 6.7×10⁻⁴ | 6.5×10⁻¹⁰ |
| $x_{p+1}^A$ | −0.416 | 0.667 | 0.7387 | 0.7390851328 | 0.7390851332 |
| $e_{p+1}^A$ | −1.155 | −0.072 | −4.0×10⁻⁴ | −3.8×10⁻¹⁰ | |
| $\alpha^{TS}$ | | 3.210 | 1.874 | 2.668 | |
| $f_{p+1}^A$ | 1.331 | 0.119 | 6.7×10⁻⁴ | 6.5×10⁻¹⁰ | |
| $t_p^F$ | 0.840 | 0.089 | 0.0057 | 9.6×10⁻⁷ | |
| $x_{p+1}^B$ | 0.915 | 0.764 | 0.7391 | 0.7390851332 | |
Table 6. Efficiencies of classic and improved algorithms.

| # | Method | $d$ | $\beta$ | EFF [15] | *EFF [14] | $L_{\max}$ [36] |
|---|---|---|---|---|---|---|
| 1 | Secant | 1 | 1.618… | 1.618… | 1.618… | 4.0 |
| 2 | Newton | 2 | 2.0 | 1.0 | 1.414… | 3.0 |
| 3 | T-Secant | 2 | 2.618… | 1.309… | 1.618… | 4.5 |
| 4 | TS-const. $S_p$ | 2 | 1.0 | 0.5 | 1.0 | 0.6 |
| 5 | T-Newton | 3 | 3.0 | 1.0 | 1.442… | 3.0 |
| 6 | Chen [28] | 3 | 1.618… | 0.539… | 1.173… | |
| 7 | Wang [30] | 5 | 1.618… | 0.323… | 1.101… | |
Table 7. Calculated values of the mean convergence rates.

| # | $N$ | Method | $R(x_0^A)$ | $R(x_{p_{\max}}^A)$ | $p_{\max}$ | $N_f$ | $L$ | $L_N$ |
|---|---|---|---|---|---|---|---|---|
| 1 | 2 | Broyden 1. [36] | 4.92 | 4.7×10⁻¹⁰ | – | 59 | 0.391 | 0.78 |
| 2 | 2 | Broyden 2. [36] | 4.92 | 2.6×10⁻¹⁰ | – | 39 | 0.607 | 1.22 |
| 3 | 2 | Powell [41] | 4.92 | 7.0×10⁻¹⁰ | – | 151 | 0.150 | 0.30 |
| 4 | 2 | ACD [42] | 130.1 | 1.0×10⁻¹⁰ | – | 325 | 0.086 | 0.17 |
| 5 | 2 | Nelder–Mead [43] | 2.00 | 1.4×10⁻¹⁰ | – | 185 | 0.127 | 0.25 |
| 6 | 2 | T-secant [36,41] | 4.92 | 1.0×10⁻²⁵ | 3 | 9 | 6.573 | 13.15 |
| 7 | 2 | T-secant [42] | 130.1 | 1.0×10⁻²⁵ | 3 | 9 | 6.937 | 13.87 |
| 8 | 2 | T-secant [43] | 2.00 | 6.7×10⁻¹⁵ | 2 | 6 | 5.556 | 11.11 |
| 9 | 3 | T-secant | 72.72 | 1.4×10⁻¹⁴ | 5 | 20 | 1.809 | 5.43 |
| 10 | 3 | T-secant | 32.47 | 1.0×10⁻²⁵ | 4 | 16 | 3.815 | 11.45 |
| 11 | 5 | T-secant | 93.53 | 1.3×10⁻¹⁴ | 8 | 48 | 0.760 | 3.80 |
| 12 | 5 | T-secant | 7.19 | 5.9×10⁻¹⁴ | 4 | 24 | 1.351 | 6.76 |
| 13 | 10 | T-secant | 202.6 | 1.0×10⁻²⁵ | 14 | 154 | 0.408 | 4.08 |
| 14 | 200 | T-secant | 92.78 | 9.0×10⁻¹⁵ | 10 | 2010 | 0.042 | 8.44 |
| 15 | 1000 | T-secant | 212.4 | 3.6×10⁻¹³ | 6 | 6006 | 0.006 | 5.66 |