Numerical Methods for Solving Nonlinear Equations

Edited by Maria Isabel Berenguer and Manuel Ruiz Galán

mdpi.com/journal/mathematics

## **Numerical Methods for Solving Nonlinear Equations**


Editors

**Maria Isabel Berenguer and Manuel Ruiz Galán**

Basel • Beijing • Wuhan • Barcelona • Belgrade • Novi Sad • Cluj • Manchester

*Editors*

Maria Isabel Berenguer
Department of Applied Mathematics, University of Granada, Granada, Spain

Manuel Ruiz Galán
Department of Applied Mathematics, University of Granada, Granada, Spain

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Mathematics* (ISSN 2227-7390) (available at: https://www.mdpi.com/si/mathematics/Numerical Methods for Solving Nonlinear Equations).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

Lastname, A.A.; Lastname, B.B. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-9214-5 (Hbk) ISBN 978-3-0365-9215-2 (PDF) doi.org/10.3390/books978-3-0365-9215-2**

© 2023 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license. The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons Attribution-NonCommercial-NoDerivs (CC BY-NC-ND) license.

## **Contents**


## **About the Editors**

#### **Maria Isabel Berenguer**

Maria Isabel Berenguer is an Associate Professor in the Department of Applied Mathematics at the University of Granada, Spain; since 2015, she has also been a member of the university's Institute of Mathematics. She is a notable reviewer and has published several articles in well-known journals with high JCR rankings. She has also actively participated in different projects with national or regional funding. Additionally, she has trained young researchers and tutored international doctoral students during their terms abroad at the University of Granada. Currently, she is a member of the Editorial Board of MDPI's journal *Mathematics*. Her research interests include applied mathematics, numerical analysis, fixed point theory and inverse problems.

#### **Manuel Ruiz Galán**

Manuel Ruiz Galán received his Ph.D. in 1999 from the University of Granada, Spain. Currently, he serves as a Full Professor in the Mathematics Department at the University of Granada, Spain. He has authored more than 60 research papers and book chapters, particularly on topics related to convex and numerical analysis and their applications. He has been a member of, and principal investigator in, several nationally funded projects (Spanish Government). He has served as a Guest Editor for several Special Issues of journals such as *Optimization and Engineering*, *Frontiers in Psychology*, *Mathematical Problems in Engineering*, and the *Journal of Function Spaces and Applications*. Additionally, he is a member of the Editorial Boards of MDPI's journal *Mathematics* and the journal *Minimax Inequalities and Applications*.

## **Preface**

Many problems that emerge in areas such as medicine, biology, economics, finance, or engineering can be described in terms of nonlinear equations or systems of such equations, which can take different forms, from algebraic, differential, integral or integro-differential models to variational inequalities or equilibrium problems. For this reason, nonlinear problems are one of the most interesting fields of study in pure and applied mathematics.

However, there is a lack of direct methods that can effectively solve nonlinear problems, and hence research interest in their numerical treatment has consolidated further. This Special Issue has collated manuscripts that address recent advancements in the aforementioned area. It contains the 10 articles accepted for publication from among the 24 submitted.

#### **Maria Isabel Berenguer and Manuel Ruiz Galán**

*Editors*

## *Article* **On the Convergence of a New Family of Multi-Point Ehrlich-Type Iterative Methods for Polynomial Zeros**

**Petko D. Proinov \* and Milena D. Petkova**

Faculty of Mathematics and Informatics, University of Plovdiv Paisii Hilendarski, 24 Tzar Asen, 4000 Plovdiv, Bulgaria; milenapetkova@uni-plovdiv.bg

**\*** Correspondence: proinov@uni-plovdiv.bg

**Abstract:** In this paper, we construct and study a new family of multi-point Ehrlich-type iterative methods for approximating all the zeros of a univariate polynomial simultaneously. The first member of this family is the two-point Ehrlich-type iterative method introduced and studied by Tričković and Petković in 1999. The main purpose of the paper is to provide local and semilocal convergence analysis of the multi-point Ehrlich-type methods. Our local convergence theorem is obtained by an approach that was introduced by the authors in 2020. Two numerical examples are presented to show the applicability of our semilocal convergence theorem.

**Keywords:** multi-point iterative methods; iteration functions; polynomial zeros; local convergence; error estimates; semilocal convergence

**MSC:** 65H04

#### **1. Introduction**

This work deals with multi-point iterative methods for approximating all the zeros of a polynomial simultaneously. Let us recall that an iterative method for solving a nonlinear equation is called a multi-point method if it can be defined by an iteration of the form

$$x^{(k+1)} = \varrho(x^{(k)}, x^{(k-1)}, \dots, x^{(k-N)}), \quad k = 0, 1, 2, \dots,$$

where $N$ is a fixed natural number, and $x^{(0)}, x^{(-1)}, \dots, x^{(-N)}$ are $N + 1$ initial approximations. In the literature, there are multi-point iterative methods for finding a single zero of a nonlinear equation (see, e.g., [1–7]). This study is devoted to multi-point iterative methods for approximating all the zeros of a polynomial simultaneously (see, e.g., [8–11]).

Let us recall the two most popular iterative methods for simultaneous computation of all the zeros of a polynomial *f* of degree *n* ≥ 2. These are Weierstrass' method [12] and Ehrlich's method [13].

Weierstrass' method is defined by the following iteration:

$$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - \mathcal{W}\_f(\mathbf{x}^{(k)}), \qquad k = 0, 1, 2, \dots, \tag{1}$$

where the function $\mathcal{W}_f \colon \mathcal{D} \subset \mathbb{K}^n \to \mathbb{K}^n$ is defined by $\mathcal{W}_f(x) = (\mathcal{W}_1(x), \dots, \mathcal{W}_n(x))$ with

$$\mathcal{W}\_{i}(\mathbf{x}) = \frac{f(\mathbf{x}\_{i})}{a\_{0} \prod\_{j \neq i} (\mathbf{x}\_{i} - \mathbf{x}\_{j})} \qquad (i = 1, \dots, n), \tag{2}$$

where $a_0 \in \mathbb{K}$ is the leading coefficient of $f$, and $\mathcal{D}$ denotes the set of all vectors in $\mathbb{K}^n$ with pairwise distinct components. Weierstrass' method (1) has second order of convergence (provided that $f$ has only simple zeros).
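For readers who want to experiment, Weierstrass' method (1)–(2) can be sketched numerically over the complex numbers. This is a minimal illustration, not code from the paper; the test cubic and the starting points are our own choices:

```python
import numpy as np

def weierstrass_step(x, coeffs):
    """One Weierstrass step, following (1)-(2).

    x      : current approximations to all n zeros (complex array)
    coeffs : polynomial coefficients, highest degree first (a0 = coeffs[0])
    """
    f = np.polyval(coeffs, x)
    W = np.empty_like(x)
    for i in range(len(x)):
        # W_i(x) = f(x_i) / (a0 * prod_{j != i} (x_i - x_j)), cf. (2)
        W[i] = f[i] / (coeffs[0] * np.prod(x[i] - np.delete(x, i)))
    return x - W

# Example: f(z) = z^3 - 6z^2 + 11z - 6 = (z - 1)(z - 2)(z - 3)
coeffs = [1.0, -6.0, 11.0, -6.0]
x = np.array([0.8 + 0.2j, 2.2 - 0.1j, 3.3 + 0.2j])
for _ in range(20):
    x = weierstrass_step(x, coeffs)
print(np.sort_complex(np.round(x, 8)))  # approximations of the zeros 1, 2, 3
```

With reasonably separated starting components the iterates settle quadratically onto all three zeros simultaneously.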

**Citation:** Proinov, P.D.; Petkova, M.D. On the Convergence of a New Family of Multi-Point Ehrlich-Type Iterative Methods for Polynomial Zeros. *Mathematics* **2021**, *9*, 1640. https://doi.org/10.3390/ math9141640

Academic Editors: Maria Isabel Berenguer and Manuel Ruiz Galán

Received: 17 June 2021 Accepted: 8 July 2021 Published: 12 July 2021


**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Ehrlich's method is defined by the following fixed point iteration:

$$\mathbf{x}^{(k+1)} = T(\mathbf{x}^{(k)}), \qquad k = 0, 1, 2, \dots, \tag{3}$$

where the iteration function $T \colon \mathbb{K}^n \to \mathbb{K}^n$ is defined by $T(x) = (T_1(x), \dots, T_n(x))$ with

$$T\_i(\mathbf{x}) = \mathbf{x}\_i - \frac{f(\mathbf{x}\_i)}{f'(\mathbf{x}\_i) - f(\mathbf{x}\_i) \sum\_{j \neq i} \frac{1}{\mathbf{x}\_i - \mathbf{x}\_j}} \qquad (i = 1, \dots, n). \tag{4}$$

Ehrlich's method has third-order convergence. In 1973, this method was rediscovered by Aberth [14]. In 1970, Börsch-Supan [15] constructed another third-order method for simultaneously computing all the zeros of a polynomial. However, in 1982, Werner [16] proved that Ehrlich's and Börsch-Supan's methods are identical.
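As with Weierstrass' method, iteration (3)–(4) is straightforward to sketch numerically. The following is our own illustration (the test polynomial and starting points are not from the paper):

```python
import numpy as np

def ehrlich_step(x, coeffs, dcoeffs):
    """One Ehrlich step, following (3)-(4)."""
    f = np.polyval(coeffs, x)
    fp = np.polyval(dcoeffs, x)
    new = np.empty_like(x)
    for i in range(len(x)):
        # T_i(x) = x_i - f(x_i) / (f'(x_i) - f(x_i) * sum_{j != i} 1/(x_i - x_j))
        s = np.sum(1.0 / (x[i] - np.delete(x, i)))
        new[i] = x[i] - f[i] / (fp[i] - f[i] * s)
    return new

coeffs = [1.0, -6.0, 11.0, -6.0]   # f(z) = (z - 1)(z - 2)(z - 3)
dcoeffs = np.polyder(coeffs)       # f'(z)
x = np.array([0.8 + 0.2j, 2.2 - 0.1j, 3.3 + 0.2j])
for _ in range(10):
    x = ehrlich_step(x, coeffs, dcoeffs)
print(np.sort_complex(np.round(x, 8)))  # approximations of the zeros 1, 2, 3
```

The third-order convergence is visible in practice: far fewer iterations are needed than for the Weierstrass sketch above at the same accuracy.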

In 1999, Tričković and Petković [9] constructed and studied a two-point version of Ehrlich's method. They proved that the two-point Ehrlich-type method has the order of convergence $r = 1 + \sqrt{2}$.

In the present paper, we introduce an infinite sequence of multi-point Ehrlich-type iterative methods. We note that the first member of this family of iterative methods is the two-point Ehrlich-type method constructed in [9]. The main purpose of this paper is to provide a local and semilocal convergence analysis of the multi-point Ehrlich-type methods.

Our local convergence result (Theorem 2) contains the following information: convergence domain, a priori and a posteriori error estimates, and the convergence order of every method of the family. For instance, we prove that for a given natural number $N$, the order of convergence of the $N$th multi-point Ehrlich-type method is $r = r(N)$, where $r$ is the unique positive solution of the equation

$$1 + 2(t + \ldots + t^N) = t^{N+1}.\tag{5}$$

It follows from this result that the first iterative method ($N = 1$) has the order of convergence $r(1) = 1 + \sqrt{2}$, which coincides with the above-mentioned result of Tričković and Petković. We note that each method of the new family has super-quadratic convergence of order $r \in [1 + \sqrt{2}, 3)$. The semilocal convergence result (Theorem 4) states a computer-verifiable initial condition that guarantees fast convergence of the corresponding method of the family.
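Equation (5) has a unique positive solution that can be bracketed in $[2, 3]$: writing $g(t) = t^{N+1} - 2(t + \dots + t^N) - 1$, one has $g(2) = 3 - 2^{N+1} < 0$ and $g(3) = 2$ for every $N \ge 1$. A short bisection sketch (our own, not from the paper) makes $r(N)$ concrete:

```python
def convergence_order(N, tol=1e-12):
    """Unique positive root r(N) of 1 + 2(t + ... + t^N) = t^(N+1), eq. (5)."""
    def g(t):
        return t ** (N + 1) - 2 * sum(t ** k for k in range(1, N + 1)) - 1

    # g(2) < 0 and g(3) = 2 > 0 for every N >= 1, so bisect on [2, 3].
    lo, hi = 2.0, 3.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
    return (lo + hi) / 2

for N in (1, 2, 3, 10):
    print(N, round(convergence_order(N), 6))
# r(1) = 1 + sqrt(2) ~ 2.414214, and r(N) increases toward 3 as N grows
```

This reproduces both facts quoted above: $r(1) = 1 + \sqrt{2}$ and $r(N) \in [1 + \sqrt{2}, 3)$ for every $N$.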

The paper is structured as follows: In Section 2, we introduce the new family of multi-point iterative methods. Section 3 contains some auxiliary results that underlie the proofs of the main results. In Section 4, we present a local convergence result (Theorem 2) for the iterative methods of the new family; this result contains initial conditions as well as a priori and a posteriori error estimates. In Section 5, we provide a semilocal convergence result (Theorem 4) with computer-verifiable initial conditions. Section 6 provides two numerical examples that show the applicability of our semilocal convergence theorem and the convergence behavior of the proposed multi-point iterative methods. The paper ends with a conclusion section.

#### **2. A New Family of Multi-Point Ehrlich-Type Iterative Methods**

Throughout the paper, $(\mathbb{K}, |\cdot|)$ stands for a valued field with a nontrivial absolute value $|\cdot|$, and $\mathbb{K}[z]$ denotes the ring of univariate polynomials over $\mathbb{K}$. The vector space $\mathbb{K}^n$ is equipped with the product topology.

For a given vector $u \in \mathbb{K}^n$, $u_i$ always denotes the $i$th component of $u$. For example, if $F$ is a map with values in $\mathbb{K}^n$, then $F_i(x)$ denotes the $i$th component of the vector $F(x) \in \mathbb{K}^n$. Let us define a binary relation $\#$ on $\mathbb{K}^n$ as follows [17]:

$$u \,\#\, v \quad \Leftrightarrow \quad u_i \neq v_j \ \text{ for all } i, j \in I_n \text{ with } i \neq j.$$

Here and throughout the paper, $I_n$ is defined by

$$I_n = \{1, 2, \ldots, n\}.$$

Suppose $f \in \mathbb{K}[z]$ is a polynomial of degree $n \ge 2$. A vector $\xi \in \mathbb{K}^n$ is called a *root vector* of the polynomial $f$ if

$$f(z) = a_0 \prod_{i=1}^{n} (z - \xi_i) \quad \text{for all} \quad z \in \mathbb{K},$$

where $a_0 \in \mathbb{K}$. It is obvious that $f$ possesses a root vector in $\mathbb{K}^n$ if and only if it splits over $\mathbb{K}$.

In the following definition, we introduce an iteration function of two vector variables that plays an essential role in the present study.

**Definition 1.** *Suppose $f \in \mathbb{K}[z]$ is a polynomial of degree $n \ge 2$. We define an iteration function $\Phi \colon D_\Phi \subset \mathbb{K}^n \times \mathbb{K}^n \to \mathbb{K}^n$ of two vector variables as follows:*

$$\Phi\_i(\mathbf{x}, y) = \mathbf{x}\_i - \frac{f(\mathbf{x}\_i)}{f'(\mathbf{x}\_i) - f(\mathbf{x}\_i) \sum\_{j \neq i} \frac{1}{\mathbf{x}\_i - y\_j}} \qquad (i = 1, \dots, n), \tag{6}$$

*where D*Φ *is defined by*

$$D_\Phi = \left\{ (x, y) \in \mathbb{K}^n \times \mathbb{K}^n \;:\; x \,\#\, y, \;\; f'(x_i) - f(x_i) \sum_{j \neq i} \frac{1}{x_i - y_j} \neq 0 \quad \text{for} \quad i \in I_n \right\}. \tag{7}$$

Now the two-point Ehrlich-type root-finding method introduced by Tričković and Petković [9] can be defined by the following iteration

$$\mathbf{x}^{(k+1)} = \Phi(\mathbf{x}^{(k)}, \mathbf{x}^{(k-1)}), \qquad k = 0, 1, \dots \tag{8}$$

with initial approximations $x^{(0)}, x^{(-1)} \in \mathbb{K}^n$.

**Theorem 1** (Petković and Tričković [9])**.** *The convergence order of the two-point Ehrlich-type method* (8) *is $r = 1 + \sqrt{2} \approx 2.414$.*
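In computations, iteration (8) only has to remember one previous iterate. The sketch below is our own illustration over the complex numbers (the cubic and both starting vectors are hypothetical choices, not data from the paper):

```python
import numpy as np

def phi(x, y, coeffs, dcoeffs):
    """Iteration function Phi from (6): Ehrlich-type correction at x
    with the lagged approximations y in the sum."""
    f = np.polyval(coeffs, x)
    fp = np.polyval(dcoeffs, x)
    out = np.empty_like(x)
    for i in range(len(x)):
        s = np.sum(1.0 / (x[i] - np.delete(y, i)))
        out[i] = x[i] - f[i] / (fp[i] - f[i] * s)
    return out

coeffs = [1.0, -6.0, 11.0, -6.0]   # f(z) = (z - 1)(z - 2)(z - 3)
dcoeffs = np.polyder(coeffs)
x_prev = np.array([0.8 + 0.1j, 1.9 - 0.1j, 3.2 + 0.1j])    # x^(-1)
x_curr = np.array([0.9 - 0.05j, 2.1 + 0.05j, 2.9 - 0.1j])  # x^(0)
for _ in range(12):
    x_curr, x_prev = phi(x_curr, x_prev, coeffs, dcoeffs), x_curr
print(np.sort_complex(np.round(x_curr, 8)))  # approximations of 1, 2, 3
```

The trade-off behind Theorem 1: each step reuses the already computed vector $x^{(k-1)}$ inside the sum, so a step is cheaper than a full Ehrlich step, at the price of dropping from order 3 to order $1 + \sqrt{2}$.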

Based on the function $\Phi$, we define a sequence $(\Phi^{(N)})_{N=1}^{\infty}$ of vector-valued functions such that the $N$th function $\Phi^{(N)}$ is a function of $N + 1$ vector variables.

**Definition 2.** *We define a sequence* $(\Phi^{(N)})_{N=0}^{\infty}$ *of iteration functions*

$$\Phi^{(N)} \colon D\_N \subset \underbrace{\mathbb{K}^n \times \ldots \times \mathbb{K}^n}\_{N+1} \to \mathbb{K}^n$$

*recursively by setting* $\Phi^{(0)}(x) = x$ *and*

$$\Phi^{(N)}(\mathbf{x}, y, \dots, z) = \Phi(\mathbf{x}, \Phi^{(N-1)}(y, \dots, z)).\tag{9}$$

*The sequence* $(D_N)_{N=0}^{\infty}$ *of domains is also defined recursively by setting* $D_0 = \mathbb{K}^n$ *and*

$$D_N = \left\{ (x, y, \dots, z) \in \underbrace{\mathbb{K}^n \times \dots \times \mathbb{K}^n}_{N+1} \;:\; (y, \dots, z) \in D_{N-1}, \; x \,\#\, \Phi^{(N-1)}(y, \dots, z) \right.$$

$$\left. \text{and } f'(x_i) - f(x_i) \sum_{j \neq i} \frac{1}{x_i - \Phi_j^{(N-1)}(y, \dots, z)} \neq 0 \quad \text{for} \quad i \in I_n \right\}. \tag{10}$$

Clearly, the iteration function $\Phi^{(1)}$ coincides with the function $\Phi$.

**Definition 3.** *Let $N$ be a given natural number, and let $x^{(0)}, x^{(-1)}, \dots, x^{(-N)} \in \mathbb{K}^n$ be $N + 1$ initial approximations. We define the $N$th iterative method of an infinite sequence of multi-point Ehrlich-type methods by the following iteration:*

$$\mathbf{x}^{(k+1)} = \Phi^{(N)}(\mathbf{x}^{(k)}, \mathbf{x}^{(k-1)}, \dots, \mathbf{x}^{(k-N)}), \qquad k = 0, 1, \dots \tag{11}$$

Note that in the case $N = 1$, the iterative method (11) coincides with the two-point Ehrlich-type method (8).

In Section 4, we present a local convergence theorem (Theorem 2) for the methods (11) with initial conditions that guarantee the convergence to a root vector of $f$. In the case $N = 1$, this result extends Theorem 1 in several directions.

In Section 5, we present a semilocal convergence theorem (Theorem 4) for the family (11), which is of practical importance.

#### **3. Preliminaries**

In this section, we present two basic properties of the iteration function Φ defined in Definition 1, which play an important role in obtaining the main result in Section 4.

In what follows, we assume that $\mathbb{K}^n$ is endowed with the norm $\|\cdot\|_\infty$ defined by

$$\|u\|_{\infty} = \max\{|u_1|, \dots, |u_n|\}$$

and with the cone norm $\|\cdot\| \colon \mathbb{K}^n \to \mathbb{R}^n$ defined by

$$\|u\| = (|u_1|, \dots, |u_n|),$$

assuming that $\mathbb{R}^n$ is endowed with the component-wise ordering $\preceq$ defined by

$$u \preceq v \quad \Leftrightarrow \quad u_i \le v_i \ \text{ for all } i \in I_n.$$

Furthermore, for two vectors $u \in \mathbb{K}^n$ and $v \in \mathbb{R}^n$, we denote by $u/v$ the vector

$$\frac{u}{v} = \left(\frac{|u\_1|}{v\_1}, \dots, \frac{|u\_n|}{v\_n}\right).$$

We define a function $d \colon \mathbb{K}^n \to \mathbb{R}^n$ by $d(u) = (d_1(u), \dots, d_n(u))$ with

$$d\_i(\boldsymbol{u}) = \min\_{j \neq i} |\boldsymbol{u}\_i - \boldsymbol{u}\_j| \qquad (i = 1, \dots, n).$$

**Lemma 1** ([11])**.** *Suppose $x, y, \xi \in \mathbb{K}^n$ and $\xi$ is a vector with pairwise distinct components. Then*

$$|x_i - y_j| \ge (1 - E(x) - E(y)) \, |\xi_i - \xi_j| \quad \text{for all } i, j \in I_n, \tag{12}$$

*where the function $E \colon \mathbb{K}^n \to \mathbb{R}_+$ is defined by*

$$E(\boldsymbol{x}) = \left\| \frac{\boldsymbol{x} - \boldsymbol{\xi}}{d(\boldsymbol{\xi})} \right\|\_{\infty}.\tag{13}$$
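The quantity $E(x)$ of (13) (the largest componentwise error relative to the separation of the zeros) is what all subsequent convergence conditions are phrased in. A small numerical sketch of it, our own illustration over $\mathbb{C}$:

```python
import numpy as np

def E(x, xi):
    """E(x) = || (x - xi) / d(xi) ||_inf from (13), where
    d_i(xi) = min_{j != i} |xi_i - xi_j|."""
    x = np.asarray(x, dtype=complex)
    xi = np.asarray(xi, dtype=complex)
    n = len(xi)
    d = np.array([min(abs(xi[i] - xi[j]) for j in range(n) if j != i)
                  for i in range(n)])
    return float(np.max(np.abs(x - xi) / d))

xi = [1.0, 2.0, 3.0]        # root vector of (z - 1)(z - 2)(z - 3)
x = [1.1, 2.05, 2.9]        # approximations
print(E(x, xi))             # max(0.1/1, 0.05/1, 0.1/1) = 0.1
```

For this cubic, $d(\xi) = (1, 1, 1)$, so $E(x)$ is simply the largest absolute error; for clustered zeros the normalization by $d(\xi)$ makes the condition correspondingly stricter.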

**Lemma 2.** *Suppose $f \in \mathbb{K}[z]$ is a polynomial of degree $n \ge 2$ which splits over $\mathbb{K}$, and $\xi \in \mathbb{K}^n$ is a root vector of $f$. Let $x, y \in \mathbb{K}^n$ be two vectors such that $x \,\#\, y$. If $f(x_i) \neq 0$ for some $i \in I_n$, then*

$$\frac{f'(x_i)}{f(x_i)} - \sum_{j \neq i} \frac{1}{x_i - y_j} = \frac{1 - \tau_i}{x_i - \xi_i}, \tag{14}$$

*where <sup>τ</sup><sup>i</sup>* <sup>∈</sup> <sup>K</sup> *is defined by*

$$\pi\_i = (\boldsymbol{\pi}\_i - \boldsymbol{\xi}\_i) \sum\_{j \neq i} \frac{y\_j - \boldsymbol{\tilde{\xi}}\_j}{(\boldsymbol{\pi}\_i - \boldsymbol{\tilde{\xi}}\_j)(\boldsymbol{\pi}\_i - y\_j)}. \tag{15}$$

**Proof.** Since *ξ* is a root vector of *f* , we obtain

$$\frac{f'(x_i)}{f(x_i)} - \sum_{j \neq i} \frac{1}{x_i - y_j} = \sum_{j=1}^n \frac{1}{x_i - \xi_j} - \sum_{j \neq i} \frac{1}{x_i - y_j} = \frac{1}{x_i - \xi_i} + \sum_{j \neq i} \left( \frac{1}{x_i - \xi_j} - \frac{1}{x_i - y_j} \right)$$

$$= \frac{1}{x_i - \xi_i} - \sum_{j \neq i} \frac{y_j - \xi_j}{(x_i - \xi_j)(x_i - y_j)} = \frac{1 - \tau_i}{x_i - \xi_i},$$

which proves (14).

Define the function $\sigma \colon \mathcal{D} \subset \mathbb{K}^n \times \mathbb{K}^n \to \mathbb{R}_+$ by

$$\sigma(\mathbf{x}, y) = \frac{(n - 1)E(\mathbf{x})E(y)}{(1 - E(\mathbf{x}))(1 - E(\mathbf{x}) - E(y)) - (n - 1)E(\mathbf{x})E(y)} \tag{16}$$

with domain

$$\mathcal{D} = \left\{ (x, y) \in \mathbb{K}^n \times \mathbb{K}^n \;:\; (1 - E(x))(1 - E(x) - E(y)) > (n - 1)E(x)E(y) \ \text{and}\ E(x) + E(y) < 1 \right\}, \tag{17}$$

where $E \colon \mathbb{K}^n \to \mathbb{R}_+$ is defined by (13).

**Lemma 3.** *Let $f \in \mathbb{K}[z]$ be a polynomial of degree $n \ge 2$ with $n$ simple zeros in $\mathbb{K}$, and let $\xi \in \mathbb{K}^n$ be a root vector of $f$. Suppose $x, y \in \mathbb{K}^n$ are two vectors such that $(x, y) \in \mathcal{D}$. Then:*

(i) $(x, y) \in D_\Phi$;

(ii) $\|\Phi(x, y) - \xi\| \preceq \sigma(x, y) \, \|x - \xi\|$;

(iii) $E(\Phi(x, y)) \le \sigma(x, y) \, E(x)$,

*where the functions $\Phi$, $E$ and $\sigma$ are defined by* (6)*,* (13) *and* (16)*, respectively.*

**Proof.** (i) According to (17), we have $E(x) + E(y) < 1$. Then it follows from Lemma 1 that

$$|x_i - y_j| \ge (1 - E(x) - E(y)) \, d_j(\xi) > 0 \tag{18}$$

for every $j \neq i$. This yields $x \,\#\, y$. In view of (7), it remains to prove that

$$f'(x_i) - f(x_i) \sum_{j \neq i} \frac{1}{x_i - y_j} \neq 0 \tag{19}$$

for $i \in I_n$. Let $i \in I_n$ be fixed. We shall consider only the non-trivial case $f(x_i) \neq 0$. In this case, (19) is equivalent to

$$\frac{f'(x\_i)}{f(x\_i)} - \sum\_{j \neq i} \frac{1}{x\_i - y\_j} \neq 0. \tag{20}$$

On the other hand, it follows from Lemma 2 that (20) is equivalent to $\tau_i \neq 1$, where $\tau_i$ is defined by (15). By Lemma 1 with $y = \xi$, we obtain

$$|x_i - \xi_j| \ge (1 - E(x)) \, d_i(\xi) > 0 \tag{21}$$

for every $j \neq i$. From (15), (18) and (21), we obtain

$$|\tau_i| \le |x_i - \xi_i| \sum_{j \neq i} \frac{|y_j - \xi_j|}{|x_i - \xi_j| \, |x_i - y_j|} \tag{22}$$

$$\le \frac{1}{(1 - E(x))(1 - E(x) - E(y))} \, \frac{|x_i - \xi_i|}{d_i(\xi)} \sum_{j \neq i} \frac{|y_j - \xi_j|}{d_j(\xi)}$$

$$\le \frac{(n - 1)E(x)E(y)}{(1 - E(x))(1 - E(x) - E(y))} < 1.$$

This implies that $\tau_i \neq 1$, which proves the first claim.

(ii) The second claim is equivalent to

$$|\Phi_i(x, y) - \xi_i| \le \sigma(x, y) \, |x_i - \xi_i| \tag{23}$$

for all $i \in I_n$. If $x_i = \xi_i$, then (23) holds trivially. Let $x_i \neq \xi_i$. Then it follows from (21) that $f(x_i) \neq 0$. It follows from (6), (20) and (14) that

$$\begin{split} \Phi_i(x, y) - \xi_i &= x_i - \xi_i - \left( \frac{f'(x_i)}{f(x_i)} - \sum_{j \neq i} \frac{1}{x_i - y_j} \right)^{-1} \\ &= x_i - \xi_i - \frac{x_i - \xi_i}{1 - \tau_i} = -\frac{\tau_i}{1 - \tau_i} (x_i - \xi_i). \end{split} \tag{24}$$

By (24) and the estimate (22), we obtain

$$\begin{split} |\Phi_i(x, y) - \xi_i| &= \frac{|\tau_i|}{|1 - \tau_i|} \, |x_i - \xi_i| \le \frac{|\tau_i|}{1 - |\tau_i|} \, |x_i - \xi_i| \\ &\le \frac{(n - 1)E(x)E(y)}{(1 - E(x))(1 - E(x) - E(y)) - (n - 1)E(x)E(y)} \, |x_i - \xi_i| \\ &= \sigma(x, y) \, |x_i - \xi_i|. \end{split}$$

Therefore, (23) holds, which proves the second claim.

(iii) By dividing both sides of (23) by $d_i(\xi)$ and taking the max-norm, we obtain the third claim.

**Lemma 4.** *Let $f \in \mathbb{K}[z]$ be a polynomial of degree $n \ge 2$ with $n$ simple zeros in $\mathbb{K}$, and let $\xi \in \mathbb{K}^n$ be a root vector of $f$. Suppose $x, y \in \mathbb{K}^n$ are two vectors satisfying*

$$\max\{E(\mathbf{x}), E(\mathbf{y})\} \le R = \frac{2}{3 + \sqrt{8n - 7}},\tag{25}$$

*where the function $E \colon \mathbb{K}^n \to \mathbb{R}_+$ is defined by* (13)*. Then:*

(i) $(x, y) \in \mathcal{D}$;

(ii) $\sigma(x, y) \le \dfrac{E(x)E(y)}{R^2}$;

(iii) $E(\Phi(x, y)) \le \dfrac{E(x)^2 E(y)}{R^2}$.

**Proof.** It follows from (25) that $E(x) + E(y) \le 2R < 1$ and

$$(1 - E(\mathbf{x}))(1 - E(\mathbf{x}) - E(\mathbf{y})) - (n - 1)E(\mathbf{x})E(\mathbf{y}) \ge (1 - R)(1 - 2R) - (n - 1)R^2 > 0. \tag{26}$$

Hence, it follows from (17) that $(x, y) \in \mathcal{D}$, which proves claim (i). It is easy to show that $R$ is the unique positive solution of the equation $\phi(t) = 1$, where the function $\phi$ is defined by

$$\phi(t) = \frac{(n-1)t^2}{(1-t)(1-2t) - (n-1)t^2} \,. \tag{27}$$

Then, from (16) and (26), we obtain

$$\begin{split} \sigma(\mathbf{x}, \mathbf{y}) &\leq \frac{(n-1)E(\mathbf{x})E(\mathbf{y})}{(1-R)(1-2R)-(n-1)R^2} \\ &= \frac{(n-1)R^2}{(1-R)(1-2R)-(n-1)R^2} \frac{E(\mathbf{x})E(\mathbf{y})}{R^2} \\ &= \phi(R)\frac{E(\mathbf{x})E(\mathbf{y})}{R^2} = \frac{E(\mathbf{x})E(\mathbf{y})}{R^2}, \end{split} \tag{28}$$

which proves (ii). The claim (iii) follows from Lemma 3 (iii) and claim (ii).
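The last equality in (28) rests on the identity $\phi(R) = 1$ for $R = 2/(3 + \sqrt{8n - 7})$, which is easy to confirm numerically (our own check, not code from the paper):

```python
import math

def phi_fn(t, n):
    """phi(t) from (27)."""
    return (n - 1) * t * t / ((1 - t) * (1 - 2 * t) - (n - 1) * t * t)

for n in (2, 3, 5, 10, 100):
    R = 2.0 / (3.0 + math.sqrt(8 * n - 7))
    print(n, round(R, 6), phi_fn(R, n))   # phi(R) = 1 for every n >= 2
```

For instance, $n = 2$ gives $R = 1/3$ and $\phi(1/3) = (1/9)\big/\big((2/3)(1/3) - 1/9\big) = 1$. Note also that $R$ shrinks like $O(n^{-1/2})$, so the guaranteed convergence region tightens as the degree grows.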

#### **4. Local Convergence Analysis**

In this section, we present a local convergence theorem for the multi-point iterative methods (11). More precisely, we study the local convergence of the multi-point Ehrlich-type methods (11) with respect to the function of the initial conditions $E \colon \mathbb{K}^n \to \mathbb{R}_+$ defined by (13), where $\xi \in \mathbb{K}^n$ is a root vector of a polynomial $f \in \mathbb{K}[z]$.

**Definition 4.** *We define a sequence* $(\sigma_N)_{N=1}^{\infty}$ *of functions* $\sigma_N \colon \mathcal{D}_N \subset \underbrace{\mathbb{K}^n \times \dots \times \mathbb{K}^n}_{N+1} \to \mathbb{R}$ *by*

$$
\sigma\_N(\mathbf{x}, y, \dots, z) = \sigma(\mathbf{x}, \Phi^{(N-1)}(y, \dots, z)),
\tag{29}
$$

*where σ is defined by* (16)*. The domain $\mathcal{D}_N$ is defined by*

$$\begin{aligned} \mathcal{D}_N = \{ (x, y, \dots, z) \;:\; & x \in \mathbb{K}^n, \; (y, \dots, z) \in D_{N-1}, \\ & (1 - E(x))(1 - E(x) - E(\Phi^{(N-1)}(y, \dots, z))) > (n - 1)E(x)E(\Phi^{(N-1)}(y, \dots, z)), \\ & E(x) + E(\Phi^{(N-1)}(y, \dots, z)) < 1 \}, \end{aligned}$$

*and $D_{N-1}$ is defined by* (10)*.*

**Lemma 5.** *Let $f \in \mathbb{K}[z]$ be a polynomial of degree $n \ge 2$ with $n$ simple zeros in $\mathbb{K}$, and let $\xi \in \mathbb{K}^n$ be a root vector of $f$. Assume $N \ge 1$ and $(x, y, \dots, z) \in \mathcal{D}_N$. Then:*

(i) $(x, y, \dots, z) \in D_N$;

(ii) $\|\Phi^{(N)}(x, y, \dots, z) - \xi\| \preceq \sigma_N(x, y, \dots, z) \, \|x - \xi\|$;

(iii) $E(\Phi^{(N)}(x, y, \dots, z)) \le \sigma_N(x, y, \dots, z) \, E(x)$,

*where* Φ(*N*) *and σ<sup>N</sup> are defined by* (9) *and* (29)*, respectively.*

**Proof.** Applying Lemma 3 (i) with $y = \Phi^{(N-1)}(y, \dots, z)$, we obtain (i). It follows from Definition 2, Lemma 3 (ii) and Definition 4 that

$$\begin{aligned} \|\Phi^{(N)}(x, y, \dots, z) - \xi\| &= \|\Phi(x, \Phi^{(N-1)}(y, \dots, z)) - \xi\| \\ &\preceq \sigma(x, \Phi^{(N-1)}(y, \dots, z)) \, \|x - \xi\| = \sigma_N(x, y, \dots, z) \, \|x - \xi\|, \end{aligned}$$

which proves (ii). From Definition 2, Lemma 3 (iii) and Definition 4, we obtain

$$\begin{aligned} E(\Phi^{(N)}(x, y, \dots, z)) &= E(\Phi(x, \Phi^{(N-1)}(y, \dots, z))) \\ &\le \sigma(x, \Phi^{(N-1)}(y, \dots, z)) \, E(x) = \sigma_N(x, y, \dots, z) \, E(x), \end{aligned}$$

which proves (iii).

**Lemma 6.** *Let $f \in \mathbb{K}[z]$ be a polynomial of degree $n \ge 2$ with $n$ simple zeros in $\mathbb{K}$, and let $\xi \in \mathbb{K}^n$ be a root vector of $f$. Assume $N \ge 1$ and that $x, y, \dots, t, z$ are $N + 1$ vectors in $\mathbb{K}^n$ such that*

$$\max\{E(x), E(y), \dots, E(z)\} \le R = \frac{2}{3 + \sqrt{8n - 7}},\tag{30}$$

*where the function $E \colon \mathbb{K}^n \to \mathbb{R}_+$ is defined by* (13)*. Then:*

(i) $(x, y, \dots, t, z) \in \mathcal{D}_N$;

(ii) $\sigma_N(x, y, \dots, t, z) \le \dfrac{E(x)E(y)^2 \cdots E(t)^2 E(z)}{R^{2N}}$;

(iii) $E(\Phi^{(N)}(x, y, \dots, t, z)) \le \dfrac{E(x)^2 E(y)^2 \cdots E(t)^2 E(z)}{R^{2N}}$.

**Proof.** The proof goes by induction on $N$. In the case $N = 1$, Lemma 6 coincides with Lemma 4. Suppose that for some $N \ge 1$ the three claims of the lemma hold for every $N + 1$ vectors $x, y, \dots, t, z \in \mathbb{K}^n$ satisfying (30). Let $x, y, \dots, t, z \in \mathbb{K}^n$ be $N + 2$ vectors satisfying

$$\max \{ E(x), E(y), \dots, E(t), E(z) \} \le R.$$

We must prove the following three claims:

$$(x, y, \dots, t, z) \in \mathcal{D}_{N+1}, \tag{31}$$

$$
\sigma\_{N+1}(\mathbf{x}, y, \dots, \mathbf{t}, z) \le \frac{E(\mathbf{x})E(y)^2 \dots E(\mathbf{t})^2 E(z)}{R^{2(N+1)}},\tag{32}
$$

$$E(\Phi^{(N+1)}(x, y, \dots, t, z)) \le \frac{E(x)^2 E(y)^2 \cdots E(t)^2 E(z)}{R^{2(N+1)}}. \tag{33}$$

By the induction assumption, we obtain $(y, \dots, t, z) \in D_N$. By induction assumption (iii) and (30), we obtain

$$E(\mathbf{x}) + E(\Phi^{(N)}(y, \dots, t, z)) \le E(\mathbf{x}) + E(y)^2 \dots E(t)^2 E(z) / R^{2N} \le 2R < 1. \tag{34}$$

By induction assumption, we also have

$$\begin{aligned} &(1 - E(\mathbf{x}))(1 - E(\mathbf{x}) - E(\Phi^{(N)}(y, \dots, z))) - (n - 1)E(\mathbf{x})E(\Phi^{(N)}(y, \dots, z)) \\ &> (1 - R)(1 - 2R) - (n - 1)R^2 > 0 \end{aligned} \tag{35}$$

The inequalities (34) and (35) yield $(x, y, \dots, t, z) \in \mathcal{D}_{N+1}$, which proves (31). From Definition 4, Lemma 4 (ii) and induction assumption (iii), we obtain

$$\begin{aligned} \sigma_{N+1}(x, y, \dots, t, z) &= \sigma(x, \Phi^{(N)}(y, \dots, t, z)) \le E(x) \, E(\Phi^{(N)}(y, \dots, t, z)) / R^2 \\ &\le E(x)E(y)^2 \cdots E(t)^2 E(z) / R^{2(N+1)}, \end{aligned}$$

which proves (32). Claim (33) follows from Lemma 5 (iii) and claim (32).

Now we are ready to state the first main result in this paper.

**Theorem 2.** *Suppose $f \in \mathbb{K}[z]$ is a polynomial of degree $n \ge 2$ which has $n$ simple zeros in $\mathbb{K}$, $\xi \in \mathbb{K}^n$ is a root vector of $f$, and $N \in \mathbb{N}$. Let $x^{(0)}, x^{(-1)}, \dots, x^{(-N)} \in \mathbb{K}^n$ be initial approximations such that*

$$\max\_{-N \le k \le 0} E(\mathfrak{x}^{(k)}) < R = \frac{2}{3 + \sqrt{8n - 7}},\tag{36}$$

*where the function E*: K<sup>*n*</sup> → **R**<sub>+</sub> *is defined by* (13)*. Then the multi-point Ehrlich-type iteration* (11) *is well defined and converges to ξ with order r and with the following error estimates:*

$$\|x^{(k+1)} - \xi\| \preceq \lambda^{r^{k+N+1} - r^{k+N}} \|x^{(k)} - \xi\| \quad \text{for all } k \ge 0,\tag{37}$$

$$\|x^{(k)} - \xi\| \preceq \lambda^{r^{k+N} - r^{N}} \|x^{(0)} - \xi\| \quad \text{for all } k \ge 0,\tag{38}$$

*where r* = *r*(*N*) *is the unique positive root of the Equation* (5)*, and λ is defined by*

$$\lambda = \max\_{-N \le k \le 0} \left( \frac{E(x^{(k)})}{R} \right)^{1/r^{k+N}}.\tag{39}$$

**Proof.** First, we will show that the iterative sequence (*x*<sup>(*k*)</sup>)<sub>*k*=−*N*</sub><sup>∞</sup> generated by (11) is well defined and that the inequality

$$E(x^{(\nu)}) \le R\lambda^{r^{\nu+N}}\tag{40}$$

holds for every integer *ν* ≥ −*N*. The proof is by induction. It follows from (39) that (40) holds for −*N* ≤ *ν* ≤ 0. Suppose that for some *k* ≥ 0 the iterates *x*<sup>(*k*)</sup>, *x*<sup>(*k*−1)</sup>, ..., *x*<sup>(*k*−*N*)</sup> are well defined and

$$E(\mathbf{x}^{(\nu)}) \le R\lambda^{r^{\nu+N}} \quad \text{for all} \quad k-N \le \nu \le k. \tag{41}$$

We shall prove that the iterate *x*(*k*+1) is well defined and that it satisfies the inequality (40) with *ν* = *k* + 1. It follows from (39) that 0 ≤ *λ* < 1. Hence, from (41) we obtain

$$\max_{k-N \le \nu \le k} E(x^{(\nu)}) \le R.$$

Then by (11), Lemma 6 (iii), (41) and the definition of *r*, we obtain

$$\begin{aligned} E(x^{(k+1)}) &= E(\Phi^{(N)}(x^{(k)}, x^{(k-1)}, \dots, x^{(k-N)})) \\ &\le \left( E(x^{(k)}) E(x^{(k-1)}) \cdots E(x^{(k-N+1)}) \right)^2 E(x^{(k-N)}) / R^{2N} \\ &\le R \left( \lambda^{r^{k+N}} \lambda^{r^{k+N-1}} \cdots \lambda^{r^{k+1}} \right)^2 \lambda^{r^{k}} = R\, \lambda^{r^k (1 + 2r + \dots + 2r^{N-1} + 2r^{N})} = R\, \lambda^{r^{k+N+1}}, \end{aligned}$$

which completes the induction. By Lemma 6 (ii), (40) and the definition of *r*, we obtain the following estimate

$$\begin{split} \sigma\_N(\mathbf{x}^{(k)}, \mathbf{x}^{(k-1)}, \dots, \mathbf{x}^{(k-N)}) &\leq E(\mathbf{x}^{(k)}) \left( E(\mathbf{x}^{(k-1)}) \cdots E(\mathbf{x}^{(k-N+1)}) \right)^2 E(\mathbf{x}^{(k-N)}) / R^{2N} \\ &\leq \lambda^{r^{k+N}} \left( \lambda^{r^{k+N-1}} \cdots \lambda^{r^{k+1}} \right)^2 \lambda^{r^k} = \lambda^{r^k(1+2r+\ldots+2r^{N-1}+r^N)} = \lambda^{r^{k+N+1}-r^{k+N}} .\end{split}$$

From (11), Lemma 5 (ii) and the last estimate, we obtain

$$\begin{aligned} \|x^{(k+1)} - \xi\| &= \|\Phi^{(N)}(x^{(k)}, x^{(k-1)}, \dots, x^{(k-N)}) - \xi\| \\ &\preceq \sigma_N(x^{(k)}, x^{(k-1)}, \dots, x^{(k-N)}) \, \|x^{(k)} - \xi\| \\ &\preceq \lambda^{r^{k+N+1} - r^{k+N}} \, \|x^{(k)} - \xi\|, \end{aligned}$$

which proves the a posteriori estimate (37). The a priori estimate (38) can easily be proved by induction using the estimate (37). Finally, the convergence of the sequence (*x*<sup>(*k*)</sup>) to the root vector *ξ* follows from the estimate (38).

**Remark 1.** *It can be proved that the sequence r*(*N*)*, N* = 1, 2, . . .*, of orders of the multi-point Ehrlich-type methods* (11) *is an increasing sequence which converges to* 3 *as N* → ∞*. In Table 1, one can see the order of convergence r* = *r*(*N*) *for N* = 1, 2, . . . , 10*.*

**Table 1.** Values of the convergence order *r* = *r*(*N*) for *N* = 1, 2, . . . , 10.
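The values in Table 1 can be reproduced numerically. Equation (5) is not reproduced in this excerpt, but the exponent computation in the proof of Theorem 2 uses the identity *r*<sup>*N*+1</sup> = 1 + 2*r* + ... + 2*r*<sup>*N*</sup>; under that reading, a minimal Python bisection sketch (the function name `order` is ours) recovers the orders of convergence:

```python
def order(N, lo=1.0, hi=3.0, tol=1e-12):
    """Unique positive root of r**(N+1) = 1 + 2r + ... + 2r**N
    (our reading of Equation (5), reconstructed from the proof above)."""
    g = lambda r: r**(N + 1) - 1 - 2*sum(r**j for j in range(1, N + 1))
    # g(1) = -2N < 0 and g(3) = 2 > 0 for every N, so bisection applies
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
    return (lo + hi) / 2
```

For instance, `order(1)` returns 1 + √2 ≈ 2.41421, the order of the Trićković–Petković method, and since *g*(3) = 2 > 0 for every *N*, the computed orders increase toward 3, in agreement with Remark 1.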


#### **5. Semilocal Convergence Analysis**

In this section, we present a semilocal convergence result for the multi-point Ehrlich-type methods (11) with respect to the function of initial conditions *E<sub>f</sub>* : D ⊂ K<sup>*n*</sup> → **R**<sub>+</sub> defined by

$$E_f(x) = \left\| \frac{W_f(x)}{d(x)} \right\|_{\infty},\tag{42}$$

where the function *W<sub>f</sub>* : D ⊂ K<sup>*n*</sup> → K<sup>*n*</sup> is defined by (2). We note that in the last decade this has been the function most frequently used to formulate initial conditions in semilocal convergence results for simultaneous methods for polynomial zeros (see, e.g., [10,11,17–22]).

The new result is obtained as a consequence of the local convergence Theorem 2 by using the following transformation theorem:

**Theorem 3** (Proinov [19])**.** *Let* <sup>K</sup> *be an algebraically closed field, <sup>f</sup>* <sup>∈</sup> <sup>K</sup>[*z*] *be a polynomial of degree n* <sup>≥</sup> <sup>2</sup>*, and let x* <sup>∈</sup> <sup>K</sup>*<sup>n</sup> be a vector with pairwise distinct components such that*

$$\left\| \frac{W_f(x)}{d(x)} \right\|_{\infty} < \frac{R(1+R)}{(1+2R)(1+nR)},\tag{43}$$

*where* 0 < *R* ≤ 1/(1 + √(*n* − 1))*. Then f has only simple zeros in* K *and there exists a root vector ξ* ∈ K<sup>*n*</sup> *of f such that*

$$\left\|\frac{x-\xi}{d(\xi)}\right\|_{\infty} < R.\tag{44}$$

Each iterative method for finding all roots of a polynomial *f* ∈ K[*z*] of degree *n* ≥ 2 simultaneously is an iterative method in K<sup>*n*</sup>. It seeks the roots *ξ*<sub>1</sub>, ..., *ξ*<sub>*n*</sub> of the polynomial *f* in the form of a vector *ξ* = (*ξ*<sub>1</sub>, ..., *ξ*<sub>*n*</sub>) ∈ K<sup>*n*</sup>. We noted in Section 2 that such a vector *ξ* is called a root vector of *f*. Clearly, a polynomial can have more than one root vector; on the other hand, a root vector is unique up to a permutation of its components.

A natural question arises regarding how to measure the distance from an approximation *x* ∈ K<sup>*n*</sup> to the zeros of a polynomial. The first step is to identify all vectors whose components coincide up to permutation. Namely, we define an equivalence relation ≡ on K<sup>*n*</sup> by *x* ≡ *y* if the components of *x* and *y* are the same up to permutation. Then, following [11,20], we define a distance between two vectors *x*, *y* ∈ K<sup>*n*</sup> as follows:

$$\rho(x, y) = \min_{v \equiv y} \|x - v\|_{\infty}.\tag{45}$$

Note that *ρ* is a metric on the set of classes of equivalence. For simplicity, we shall identify equivalence classes with their representatives.

In what follows, we consider convergence in K<sup>*n*</sup> with respect to the metric *ρ*. Clearly, if a sequence (*x*<sup>(*k*)</sup>) in K<sup>*n*</sup> converges to a vector *x* ∈ K<sup>*n*</sup> with respect to the norm ‖·‖, then it converges to *x* with respect to the metric *ρ*. The converse is not true (see [11]).
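For small *n*, the minimum in (45) can be evaluated by brute force over all *n*! permutations; a Python sketch (exponential cost, for illustration only — the function name is ours):

```python
from itertools import permutations

def rho(x, y):
    """Distance (45): the minimum, over all permutations v of y,
    of the max-norm of x - v."""
    return min(max(abs(a - b) for a, b in zip(x, v))
               for v in permutations(y))
```

For example, `rho((1, 2, 3), (3, 1, 2))` evaluates to 0, reflecting that equivalent vectors are identified; the components may equally be complex numbers.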

Before formulating the main result, we recall a technical lemma.

**Lemma 7** ([11])**.** *Let x*, *ξ*, *ξ̄* ∈ K<sup>*n*</sup> *be such that ξ* ≡ *ξ̄. Then there exists a vector x̄* ∈ K<sup>*n*</sup> *such that x̄* ≡ *x and*

$$\left\| \frac{x - \xi}{d(\xi)} \right\|_{\infty} = \left\| \frac{\bar{x} - \bar{\xi}}{d(\bar{\xi})} \right\|_{\infty}.\tag{46}$$

Now we can formulate and prove the second main result of this paper.

**Theorem 4.** *Suppose* K *is an algebraically closed field, f* ∈ K[*z*] *is a polynomial of degree n* ≥ 2*, and N* ∈ **N***. Let x*<sup>(0)</sup>, *x*<sup>(−1)</sup>, ..., *x*<sup>(−*N*)</sup> ∈ K<sup>*n*</sup> *be initial approximations satisfying the following condition:*

$$\max_{-N \le k \le 0} E_f(x^{(k)}) < R_n = \frac{2(5 + \sqrt{8n - 7})}{(2n + 3 + \sqrt{8n - 7})(7 + \sqrt{8n - 7})},\tag{47}$$

*where the function Ef is defined by* (42)*. Then the polynomial f has only simple zeros and the multi-point Ehrlich-type iteration* (11) *is well defined and converges (with respect to the metric ρ*) *to a root vector ξ of f with order of convergence r* = *r*(*N*)*, where r is the unique positive solution of the Equation* (5)*.*

**Proof.** The condition (47) can be represented in the form

$$\max_{-N \le k \le 0} \left\| \frac{W_f(x^{(k)})}{d(x^{(k)})} \right\|_{\infty} < \frac{R(1+R)}{(1+2R)(1+nR)},\tag{48}$$

where *R* is defined in (36). From Theorem 3 and the inequality (48), we conclude that *f* has *n* simple zeros in K and that there exist root vectors *ξ*<sup>(0)</sup>, *ξ*<sup>(−1)</sup>, ..., *ξ*<sup>(−*N*)</sup> ∈ K<sup>*n*</sup> such that

$$\max_{-N \le k \le 0} \left\| \frac{x^{(k)} - \xi^{(k)}}{d(\xi^{(k)})} \right\|_{\infty} < R.\tag{49}$$

Let us put *ξ*<sup>(0)</sup> = *ξ*. Since *ξ*<sup>(0)</sup>, *ξ*<sup>(−1)</sup>, ..., *ξ*<sup>(−*N*)</sup> are root vectors of *f*, we have *ξ*<sup>(*k*)</sup> ≡ *ξ* for all *k* = 0, −1, ..., −*N*. It follows from Lemma 7 that there exist vectors *x̄*<sup>(0)</sup>, *x̄*<sup>(−1)</sup>, ..., *x̄*<sup>(−*N*)</sup> such that *x̄*<sup>(*k*)</sup> ≡ *x*<sup>(*k*)</sup> and (49) can be rewritten in the form

$$\max\_{-N \le k \le 0} \left\| \frac{\overline{x}^{(k)} - \overline{\xi}}{d(\overline{\xi})} \right\|\_{\infty} < R. \tag{50}$$

It follows from Theorem 2 and inequality (50) that the multi-point iterative method (11) with initial approximations *x̄*<sup>(0)</sup>, *x̄*<sup>(−1)</sup>, ..., *x̄*<sup>(−*N*)</sup> is well defined and converges to *ξ*. Hence, the iteration (11) with initial approximations *x*<sup>(0)</sup>, *x*<sup>(−1)</sup>, ..., *x*<sup>(−*N*)</sup> converges with respect to the metric *ρ* to the root vector *ξ* of *f*.
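The passage from (47) to (48) rests on the algebraic identity *R<sub>n</sub>* = *R*(1 + *R*)/((1 + 2*R*)(1 + *nR*)) with *R* from (36); a quick Python check of this identity (helper names ours):

```python
from math import sqrt

def R_loc(n):
    """Radius R in the local condition (36)."""
    return 2 / (3 + sqrt(8*n - 7))

def R_semi(n):
    """Radius R_n in the semilocal condition (47)."""
    s = sqrt(8*n - 7)
    return 2*(5 + s) / ((2*n + 3 + s) * (7 + s))

# the identity behind the passage from (47) to (48)
for n in range(2, 100):
    R = R_loc(n)
    assert abs(R_semi(n) - R*(1 + R)/((1 + 2*R)*(1 + n*R))) < 1e-12
```

In particular, `R_semi(3)` evaluates to 0.125, the value *R<sub>n</sub>* reported in Tables 3 and 4 for the cubic and degree-7 examples below use *n*-dependent values; for every *n*, *R<sub>n</sub>* < *R*.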

The following criterion guarantees the convergence of the methods (11). It is an immediate consequence of Theorem 4.

**Corollary 1** (Convergence criterion)**.** *If there exists an integer m* ≥ 0 *such that*

$$E_m = \max\left\{ E_f(x^{(m)}), E_f(x^{(m-1)}), \dots, E_f(x^{(m-N)}) \right\} < R_n,\tag{51}$$

*then f has only simple zeros and the multi-point Ehrlich-type iteration* (11) *converges to a root vector ξ of f .*

The next result is an immediate consequence of Theorem 5.1 of [19]. It can be used as a stopping criterion for a large class of iterative methods for approximating all zeros of a polynomial simultaneously.

**Theorem 5** (Proinov [19])**.** *Suppose* K *is an algebraically closed field, f* ∈ K[*z*] *is a polynomial of degree n* ≥ 2 *with simple zeros, and* (*x*<sup>(*k*)</sup>)<sub>*k*=0</sub><sup>∞</sup> *is a sequence in* K<sup>*n*</sup> *consisting of vectors with pairwise distinct components. If k* ≥ 0 *is such that*

$$E\_f(\mathfrak{x}^{(k)}) < \mu\_n = 1/(n + 2\sqrt{n-1}),\tag{52}$$

*then the following a posteriori error estimate holds:*

$$\rho(x^{(k)}, \xi) \le \varepsilon_k = \alpha(E_f(x^{(k)})) \, \| W_f(x^{(k)}) \|_{\infty},\tag{53}$$

*where the metric ρ is defined by* (45)*, the function Ef is defined by* (42)*, and the function α is defined by*

$$\alpha(t) = 2/\left(1 - (n - 2)t + \sqrt{\left(1 - (n - 2)t\right)^2 - 4t}\right).\tag{54}$$
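Theorem 5 translates directly into a computable stopping test. The following Python sketch assumes the standard definitions of the Weierstrass correction *W<sub>f</sub>* (the function (2)) and of *d*(*x*) as the distance from each component to its nearest other component, as used in (42); the helper names are ours:

```python
from math import prod, sqrt

def weierstrass(f, x):
    """Weierstrass corrections W_f(x)_i = f(x_i) / prod_{j != i} (x_i - x_j)."""
    return [f(xi) / prod(xi - xj for j, xj in enumerate(x) if j != i)
            for i, xi in enumerate(x)]

def nearest_dist(x):
    """d(x)_i = distance from x_i to its nearest other component."""
    return [min(abs(xi - xj) for j, xj in enumerate(x) if j != i)
            for i, xi in enumerate(x)]

def stopping_bound(f, x):
    """Return (E_f(x), mu_n, eps_k) following (42) and (52)-(54).
    eps_k is a valid error bound only when E_f(x) < mu_n."""
    n = len(x)
    W, d = weierstrass(f, x), nearest_dist(x)
    Ef = max(abs(w) / di for w, di in zip(W, d))                   # (42)
    mu = 1 / (n + 2*sqrt(n - 1))                                   # (52)
    if Ef < mu:                                                    # discriminant in (54) is positive here
        a = 1 - (n - 2)*Ef
        eps = 2 / (a + sqrt(a*a - 4*Ef)) * max(abs(w) for w in W)  # (53), (54)
    else:
        eps = float('inf')
    return Ef, mu, eps
```

As a sanity check, for *f*(*z*) = *z*² − 1 and *x* = (1.01, −0.99) the bound *ε<sub>k</sub>* ≈ 0.0101 indeed dominates the true max-norm distance 0.01 to the root vector (1, −1).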

#### **6. Numerical Examples**

In this section, we present two numerical examples in order to show the applicability of Theorem 4. Using the convergence criterion (51), we show that at the beginning of the iterative process it can be proven numerically that the method is convergent under the given initial approximations.

We apply the first four methods of the family (11) to calculate simultaneously all the zeros of the selected polynomials. In each example, we calculate the smallest *m* > 0 that satisfies the convergence criterion (51). In accordance with Theorem 5, we use the following stopping criterion:

$$E\_f(\mathbf{x}^{(k)}) < \mu\_n \quad \text{and} \quad \varepsilon\_k < 10^{-12},\tag{55}$$

where *μ<sup>n</sup>* and *ε<sup>k</sup>* are defined by (52) and (53), respectively. To see the convergence behavior of the methods, we show in the tables *εk*+<sup>1</sup> in addition to *εk*.

In both examples, we take the same polynomials and initial approximations as in [11], where the initial approximations are chosen quite randomly. This choice gives the opportunity to compare numerically the convergence behavior of the multi-point Ehrlich-type methods with those of the multi-point Weierstrass-type methods which are studied in [11].

To present the calculated approximations of high accuracy, we implemented the corresponding algorithms using the programming package Wolfram Mathematica 10.0 with multiple precision arithmetic.

**Example 1.** *The first polynomial is*

$$f(z) = z^3 - (2 + 5i)z^2 - (3 - 10i)z + 15i \tag{56}$$

*with zeros* −1*,* 3 *and* 5*i (marked in blue in Figure 1). For N* ∈ {1, 2, 3, 4}*, the initial approximations x*(0), *x*(−1),..., *x*(−*N*) *in* C<sup>3</sup> *are given in Table 2, where*

$$a = (5 + i, 7 - i, -4.5i), \quad b = (1, -2.7, 4.5i), \quad c = (-5i, 2, 8),$$

$$u = (-10, -5i, 8), \quad v = (i, 3 + i, 8).$$

*In the case N* = 3*, the initial approximations are marked in red in Figure 1.*

**Table 2.** Initial approximations for Example 1.


The numerical results for Example 1 are presented in Table 3. For instance, for the multi-point Ehrlich-type method (11) with *N* = 3, one can see that the convergence condition (51) is satisfied for *m* = 6, which guarantees that the considered method is convergent with order of convergence *r* = 2.94771. The stopping criterion (55) is satisfied for *k* = 6, and at the sixth iteration the guaranteed accuracy is 10<sup>−16</sup>. At the next, seventh, iteration the zeros of the polynomial *f* are calculated with accuracy 10<sup>−47</sup>.

**Table 3.** Convergence behavior for Example 1 (*Rn* = 0.125, *τ<sup>n</sup>* = 0.171573).


In Figure 1, we present the trajectories of the approximations generated by the first six iterations of the method (11) for *N* = 3. We observe how each initial approximation, moving along an intricate trajectory, finds a zero of the polynomial.
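For readers who want to experiment, the following Python sketch applies the classical one-point Ehrlich simultaneous iteration (not the multi-point method (11) itself, whose corrections also involve previous iterates) to the polynomial (56), starting from the vector *b* of Table 2:

```python
def ehrlich_step(f, df, x):
    """One step of the classical Ehrlich simultaneous iteration,
    written in the division-safe form x_i - f/(f' - f*S_i)."""
    y = []
    for i, xi in enumerate(x):
        s = sum(1/(xi - xj) for j, xj in enumerate(x) if j != i)
        y.append(xi - f(xi)/(df(xi) - f(xi)*s))
    return y

# polynomial (56) with zeros -1, 3, 5i
f  = lambda z: z**3 - (2 + 5j)*z**2 - (3 - 10j)*z + 15j
df = lambda z: 3*z**2 - 2*(2 + 5j)*z - (3 - 10j)

x = [1, -2.7, 4.5j]          # the vector b of Table 2
for _ in range(100):
    x = ehrlich_step(f, df, x)
```

From this starting vector the three components converge simultaneously to the zeros −1, 3 and 5*i*; the division-safe form leaves a component unchanged once it lands exactly on a simple zero.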

**Example 2.** *The second polynomial is*

$$f(z) = z^7 - 28z^6 + 322z^5 - 1960z^4 + 6769z^3 - 13132z^2 + 13068z - 5040 \tag{57}$$

*with zeros* 1, 2, 3, 4, 5, 6, 7 *(marked in blue in Figure 2). For given N* ∈ {1, 2, 3, 4}*, the initial approximations x*<sup>(*k*)</sup> ∈ C<sup>*n*</sup> (*k* = −*N*, ..., −1, 0) *are chosen as Aberth's initial approximations:*

$$\mathbf{x}\_{\nu}^{(k)} = -\frac{a\_1}{n} + R\_k \exp\left(i\theta\_{\nu}\right), \quad \theta\_{\nu} = \frac{\pi}{n} \left(2\nu - \frac{3}{2}\right), \quad \nu = 1, \ldots, n,\tag{58}$$

*where a*<sup>1</sup> = −28*, n* = 7*, Rk* = *R* + 2 − *k and R* = 13.7082*. In the case N* = 3*, the initial approximations are marked in red in Figure 2.*
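Since −*a*<sub>1</sub>/*n* = 4 is the arithmetic mean of the zeros 1, ..., 7, formula (58) places the *n* initial approximations on a circle of radius *R<sub>k</sub>* around that centroid; a minimal Python sketch (function name ours):

```python
from cmath import exp
from math import pi

def aberth_start(a1, n, Rk):
    """Initial approximations (58): n points on the circle |z + a1/n| = R_k."""
    return [-a1/n + Rk*exp(1j*(pi/n)*(2*v - 1.5)) for v in range(1, n + 1)]

x0 = aberth_start(-28, 7, 13.7082 + 2)   # k = 0, so R_0 = R + 2
```

The angles θ<sub>ν</sub> are staggered so that no approximation falls on the real axis, where all the zeros of (57) lie.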

**Figure 1.** Trajectories of the approximations for Example 1 (*N* = 3).

The numerical results for Example 2 are presented in Table 4. For example, for the multi-point Ehrlich-type method (11) with *N* = 3, the convergence condition (51) is satisfied for *m* = 7 and the stopping criterion (55) is satisfied for *k* = 8, which guarantees an accuracy of 10<sup>−22</sup>. At the next, ninth, iteration the zeros of the polynomial *f* are calculated with accuracy 10<sup>−65</sup>. In Figure 2, we present the trajectories of the approximations generated by the first seven iterations of the method (11) for *N* = 3. One can see that the trajectories are quite regular in the case of Aberth's initial approximations.


**Table 4.** Convergence behavior for Example 2 (*Rn* = 0.125, *τ<sup>n</sup>* = 0.171573).

**Figure 2.** Trajectories of the approximations for Example 2 (*N* = 3).

#### **7. Conclusions**

In this paper, we introduced a new family of multi-point iterative methods for approximating all the zeros of a polynomial simultaneously. Let us note that the first member of this family is the two-point Ehrlich-type method introduced in 1999 by Trićković and Petković [9]. Its convergence order is *r* = 1 + √2.

We provide a local and semilocal convergence analysis of the new iterative methods. Our local convergence result (Theorem 2) contains the following information for each method: convergence order; initial conditions that guarantee the convergence; a priori and a posteriori error estimates. In particular, each method of the family has super-quadratic convergence of order *<sup>r</sup>* <sup>∈</sup> [<sup>1</sup> <sup>+</sup> <sup>√</sup>2, 3). Our semilocal convergence result (Theorem 4) can be used to numerically prove the convergence of each method for a given polynomial and initial approximation.

Finally, we would like to note that the local convergence theorem was obtained by a new approach developed in our previous article [11]. We believe that this approach can be applied to obtain convergence results for other multi-point iterative methods.

**Author Contributions:** The authors contributed equally to the writing and approved the final manuscript of this paper. Both authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the National Science Fund of the Bulgarian Ministry of Education and Science under Grant DN 12/12.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Li Zhang 1, Jin Huang 1,\*, Hu Li <sup>2</sup> and Yifei Wang <sup>1</sup>**


**Abstract:** This paper proposes an extrapolation method to solve a class of non-linear weakly singular kernel Volterra integral equations with vanishing delay. After the existence and uniqueness of the solution to the original equation are proved, we combine an improved trapezoidal quadrature formula with an interpolation technique to obtain an approximate equation, and then we enhance the error accuracy of the approximate solution using the Richardson extrapolation, on the basis of the asymptotic error expansion. Simultaneously, a posteriori error estimate for the method is derived. Some illustrative examples demonstrating the efficiency of the method are given.

**Keywords:** weakly singular kernel Volterra integral equation; proportional delay; improved trapezoidal quadrature formula; Richardson extrapolation; a posteriori error estimate

#### **1. Introduction**

Delay functional equations are often encountered in biological processes, such as the growth of the population and the spread of an epidemic with immigration into the population [1,2], and a time delay can cause the population to fluctuate. In general, some complicated dynamics systems are also modeled by delay integral equations since the delay argument could cause a stable equilibrium to become unstable. The motivation of our work is twofold: one of the reasons is based on the first-kind delay Volterra integral equation (VIE) of the form [3]

$$\int_{qt}^{t} k(t,s)y(s)\,\mathrm{d}s = f(t), \qquad t \in I := [0,T],$$

which was discussed and transformed into the second-kind equivalent form

$$k(t,t)y(t) - qk(t,qt)y(qt) + \int\_{qt}^{t} \frac{\partial k(t,s)}{\partial t} y(s)ds = f'(t),$$

if *k*(*t*, *t*) ≠ 0 for *t* ∈ *I*, the normal form is given by

$$y(t) = f(t) + y(qt) + \int\_0^t K\_1(t,s)y(s)ds + \int\_0^{qt} K(t,s)y(s)ds, \qquad t \in I.$$

Some research [4–6] has been devoted to the following form:

$$y(t) = f(t) + \int\_0^t K\_1(t, s)y(s)ds + \int\_0^{qt} K(t, s)y(s)ds, \qquad t \in I.$$

**Citation:** Zhang, L.; Huang, J.; Li, H.; Wang, Y. Extrapolation Method for Non-Linear Weakly Singular Volterra Integral Equation with Time Delay. *Mathematics* **2021**, *9*, 1856. https:// doi.org/10.3390/math9161856

Academic Editors: Alicia Cordero Barbero and Maria Isabel Berenguer

Received: 15 June 2021 Accepted: 3 August 2021 Published: 5 August 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Another source of motivation comes from the weakly singular delay VIE [7–9]

$$y(t) = f(t) + \int\_0^{qt} \frac{K(t,s)}{(qt-s)^\lambda} G(s, y(s))ds, \quad t \in [0,1],$$

where *λ* ∈ (0, 1), *K*(*t*, *s*) is smooth, and *G*(*s*, *y*(*s*)) is a smooth non-linear function. However, the case where two integral terms are present has not yet been investigated: the first term is a weakly singular Volterra integral, while the second not only has a weak singularity at the left endpoint but also has a delay function as its upper limit, which makes it challenging to compute. It is the aim of this paper to fill this gap.

With theoretical and computational advances, some numerical methods for delay differential equations [10–13], delay integral equations [14], delay integral–differential equations [15–18], and fractional differential equations with time delay [19–22] have been investigated widely. Here, we consider the following non-linear weakly singular kernel VIE with vanishing delay

$$y(t) = f(t) + \int\_0^t s^\lambda k\_1(t, s; y(s)) \mathrm{d}s + \int\_0^{\theta(t)} s^\mu k\_2(t, s; y(s)) \mathrm{d}s, \qquad t \in I,\tag{1}$$

where *θ*(*t*) := *qt*, *q* ∈ (0, 1), *λ*, *μ* ∈ (−1, 0), *f*(*t*), *k*1(*t*,*s*; *y*(*s*)), *k*2(*t*,*s*; *y*(*s*)) are *r*(*r* ≥ 1, *r* ∈ **N**) times continuously differentiable on *I*, *D* × **R**, *D<sup>θ</sup>* × **R**, respectively, *D* := {(*t*,*s*) : 0 ≤ *s* ≤ *t* ≤ *T*} and *D<sup>θ</sup>* := {(*t*,*s*) : 0 ≤ *s* ≤ *θ*(*t*) ≤ *θ*(*T*), *t* ∈ *I*}. Additionally, *ki*(*t*,*s*; *y*(*s*)) (*i* = 1, 2) satisfy the Lipschitz conditions with respect to *y*(*s*) on the domains, respectively. That is, for fixed *s* and *t*, there are two positive constants *Lj* (*j* = 1, 2) which are independent of *s* and *t*, such that

$$|k\_j(t, s; y(s)) - k\_j(t, s; v(s))| \le L\_j |y(s) - v(s)|.\tag{2}$$

Then, Equation (1) possesses a unique solution (see Theorem 1). In this paper, we consider the case where the solution is smooth.

Some numerical investigations of delay VIEs have been conducted, such as discontinuous Galerkin methods [23], collocation methods [24–26], an iterative numerical method [27], and the least squares approximation method [28]. In [29], an *hp* version of the pseudospectral method was analyzed, based on the variational form of a non-linear VIE with vanishing variable delays. The algorithm increased the accuracy by refining the mesh and/or increasing the degree of the polynomial. Mokhtary et al. [7] used a well-conditioned Jacobi spectral Galerkin method for a VIE with weakly singular kernels and proportional delay by solving sparse upper triangular non-linear algebraic systems. In [8], the Chebyshev spectral-collocation method was investigated for the numerical solution of a class of weakly singular VIEs with proportional delay. An error analysis showed that the approximation method attains spectral accuracy. Zhang et al. [9] used some variable transformations to change the weakly singular VIE with pantograph delays into new equations defined on [−1, 1], and then combined them with Jacobi orthogonal polynomials.

The extrapolation method has been used extensively [30,31]. We apply the extrapolation method to the solution of the non-linear weakly singular kernel VIE with proportional delay. We prove the existence of the solution to the original equation using an iterative method, while uniqueness is demonstrated by the Gronwall integral inequality. We obtain the approximate equation by using the quadrature method based on the improved trapezoidal quadrature formula, combining the floor technique and the interpolation technique. Then, we solve the approximate equation through an iterative method. The existence of the approximate solution is validated by analyzing the convergence of the iterative sequence, while uniqueness is shown using a discrete Gronwall inequality. In addition, we provide an analysis of the convergence of the approximate solution and obtain the asymptotic expansion of the error. Based on the error asymptotic expansion, the Richardson extrapolation method is applied to enhance the numerical accuracy of the approximate solution. Furthermore, we obtain an a posteriori error estimate for the method. Numerical

experiments effectively support the theoretical analysis, and all the calculations can be easily implemented.

This paper is organized as follows: In Section 2, the existence and uniqueness of the solution for (1) are proven. The numerical algorithm is introduced in Section 3. In Section 4, we prove the existence and uniqueness of the approximate solution. In Section 5, we provide the convergence analysis of the approximate solution. In Section 6, we obtain the asymptotic expansion of the error, use the corresponding extrapolation technique to achieve high precision, and derive an a posteriori error estimate. Numerical examples are described in Section 7. Finally, we outline the conclusions of the paper in Section 8.

#### **2. Existence and Uniqueness of Solution of the Original Equation**

In this section, we discuss the existence and uniqueness of the solution of the original equation. There are two cases, 0 ≤ *t* ≤ *T* ≤ 1 and 1 < *t* ≤ *T*, which we discuss in what follows.

**Lemma 1** ([32])**.** *Let y*(*t*) *and g*(*t*) *be non-negative integrable functions, t* ∈ [0, *T*]*, A* ≥ 0*, satisfying*

$$y(t) \le A + \int_0^t g(s) y(s)\,\mathrm{d}s,$$

*then, for all* 0 ≤ *t* ≤ *T,*

$$y(t) \le Ae^{\int\_0^t g(s)ds}.$$
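Lemma 1 is sharp: for *A* = 1 and *g* ≡ 1, the function *y*(*t*) = *e<sup>t</sup>* satisfies the hypothesis with equality, and the Gronwall bound is attained. A quick numerical illustration (trapezoidal rule; all names ours):

```python
from math import exp

def trapz(vals, h):
    """Composite trapezoidal rule on equally spaced samples."""
    return h*(sum(vals) - 0.5*(vals[0] + vals[-1]))

h = 1e-3
ts = [i*h for i in range(1001)]          # grid on [0, 1]
ys = [exp(t) for t in ts]                # y(t) = e^t, with A = 1, g = 1

for k in range(1, len(ts), 100):
    lhs = 1 + trapz(ys[:k + 1], h)       # A + int_0^{t_k} g(s) y(s) ds
    assert ys[k] <= lhs                  # hypothesis of Lemma 1 holds
    assert ys[k] <= exp(ts[k]) + 1e-12   # ... and so does the Gronwall bound
```

The design point is that the bound *Ae*<sup>∫*g*</sup> cannot be improved in general, which is why the uniqueness argument below works only with *A* = 0.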

**Theorem 1.** *Suppose f*(*t*), *k*1(*t*, *s*; *y*(*s*)), *k*2(*t*, *s*; *y*(*s*)) *are r* (*r* ≥ 1, *r* ∈ **N**) *times continuously differentiable on I*, *D* × **R**, *D<sup>θ</sup>* × **R***, respectively. Additionally, assume that ki*(*t*, *s*; *y*(*s*)) (*i* = 1, 2) *satisfy the Lipschitz conditions (2). Then, Equation (1) has a unique solution.*

**Proof.** We first construct the sequence {*yn*(*t*), *n* ∈ **N**} as follows:

$$\begin{aligned} y\_0(t) &= f(t), \\ y\_n(t) &= f(t) + \int\_0^t s^\lambda k\_1(t, s; y\_{n-1}(s)) \mathbf{d}s + \int\_0^{qt} s^\mu k\_2(t, s; y\_{n-1}(s)) \mathbf{d}s. \end{aligned}$$

Let *b* = max{|*y*1(*t*) − *y*0(*t*)| : 0 ≤ *t* ≤ *T*}, *L* = max{*L*1, *L*2}, *γ* = min{*λ*, *μ*}.

• **Case I.** For 0 ≤ *s* ≤ *t* ≤ *T* ≤ 1, we proceed by mathematical induction. When *n* = 1,

$$\begin{split} |y_2(t) - y_1(t)| &= \left| \int_0^t s^\lambda \big( k_1(t, s; y_1(s)) - k_1(t, s; y_0(s)) \big)\,\mathrm{d}s + \int_0^{qt} s^\mu \big( k_2(t, s; y_1(s)) - k_2(t, s; y_0(s)) \big)\,\mathrm{d}s \right| \\ &\le \int_0^t s^\lambda \big| k_1(t, s; y_1(s)) - k_1(t, s; y_0(s)) \big|\,\mathrm{d}s + \int_0^{qt} s^\mu \big| k_2(t, s; y_1(s)) - k_2(t, s; y_0(s)) \big|\,\mathrm{d}s \\ &\le \int_0^t L_1 s^\lambda |y_1(s) - y_0(s)|\,\mathrm{d}s + \int_0^{qt} L_2 s^\mu |y_1(s) - y_0(s)|\,\mathrm{d}s \\ &\le \int_0^t (s^\lambda L b + s^\mu L b)\,\mathrm{d}s \\ &\le 2Lb \int_0^t s^\gamma\,\mathrm{d}s = 2Lb\, \frac{t^{\gamma + 1}}{\gamma + 1}. \end{split} \tag{3}$$

Suppose that the following expression is established when *n* = *k*,

$$|y\_k(t) - y\_{k-1}(t)| \le b \frac{(2L)^{k-1}}{(k-1)!(\gamma+1)^{k-1}} t^{(k-1)(\gamma+1)}.\tag{4}$$

Let *n* = *k* + 1; then,

$$\begin{split} |y_{k+1}(t) - y_k(t)| &\le \int_0^t s^{\lambda} \big| k_1(t, s; y_k(s)) - k_1(t, s; y_{k-1}(s)) \big|\,\mathrm{d}s + \int_0^{qt} s^{\mu} \big| k_2(t, s; y_k(s)) - k_2(t, s; y_{k-1}(s)) \big|\,\mathrm{d}s \\ &\le \int_0^t L_1 s^{\lambda} |y_k(s) - y_{k-1}(s)|\,\mathrm{d}s + \int_0^{qt} L_2 s^{\mu} |y_k(s) - y_{k-1}(s)|\,\mathrm{d}s \\ &\le 2L \int_0^t s^{\gamma} |y_k(s) - y_{k-1}(s)|\,\mathrm{d}s \\ &\le b \frac{(2L)^k}{k!(\gamma + 1)^k} t^{k(\gamma + 1)}, \end{split}$$

that is, the recurrence relation holds for *n* = *k* + 1, so the inequality (4) is established for all *n*. Next, we prove that (*yn*(*t*)) is a Cauchy sequence:

$$\begin{split} |y_n(t) - y_{n+m}(t)| &\le |y_{n+1}(t) - y_n(t)| + |y_{n+2}(t) - y_{n+1}(t)| + \dots + |y_{n+m}(t) - y_{n+m-1}(t)| \\ &\le b \frac{(2L)^n}{n!(\gamma + 1)^n} t^{n(\gamma + 1)} + \dots + b \frac{(2L)^{n+m-1}}{(n+m-1)!(\gamma + 1)^{n+m-1}} t^{(n+m-1)(\gamma + 1)} \\ &\le b \sum_{i=n}^{n+m-1} \Big(\frac{2L}{\gamma + 1}\Big)^i T^{i(\gamma + 1)} \frac{1}{i!}. \end{split}$$

The series
$$\sum_{i=0}^{\infty} \Big(\frac{2L}{\gamma + 1}\Big)^i T^{i(\gamma + 1)} \frac{1}{i!}$$
is convergent, so its tails tend to zero and (*yn*)*n*∈**N** is a uniformly Cauchy sequence; hence it converges uniformly to a function *y*(*t*). Passing to the limit in the iteration shows that *y*(*t*) is a solution to Equation (1), and the existence is proved.

• **Case II.** For 1 < *s* ≤ *t* ≤ *T*, the process is similar. Let *γ̄* = max{*λ*, *μ*}; when *n* = 1,

$$|y\_2(t) - y\_1(t)| \le 2Lb \frac{t^{\bar{\gamma} + 1}}{\bar{\gamma} + 1}.\tag{5}$$

Suppose that the following expression is established when *n* = *k*,

$$|y\_k(t) - y\_{k-1}(t)| \le b \frac{(2L)^{k-1}}{(k-1)!(\overline{\gamma} + 1)^{k-1}} t^{(k-1)(\overline{\gamma}+1)}.\tag{6}$$

Let *n* = *k* + 1. Then, we have

$$\left|y\_{k+1}(t) - y\_k(t)\right| \le b \frac{(2L)^k}{k!(\bar{\gamma}+1)^k} t^{k(\bar{\gamma}+1)},$$

i.e., the recurrence relation holds for *n* = *k* + 1, so the inequality (6) is also established. For the sequence (*yn*(*t*)), we have

$$|y_n(t) - y_{n+m}(t)| \le b \sum_{i=n}^{n+m-1} \Big(\frac{2L}{\bar{\gamma} + 1}\Big)^i T^{i(\bar{\gamma} + 1)} \frac{1}{i!}.$$

Since the series
$$\sum_{i=0}^{\infty} \Big(\frac{2L}{\bar{\gamma} + 1}\Big)^i T^{i(\bar{\gamma} + 1)} \frac{1}{i!}$$
is convergent, the sequence (*yn*)*n*∈**N** is again uniformly Cauchy and converges uniformly to *y*(*t*). Thus, *y*(*t*) is a solution to Equation (1), and the existence is proved.

Now, we prove that the solution to Equation (1) is unique. Let *y*(*t*) and *v*(*t*) be two distinct solutions to Equation (1), and denote the difference between them by *w*(*t*) = |*y*(*t*) − *v*(*t*)|. We obtain

$$\begin{split} w(t) &= \left| \int_0^t s^{\lambda} \left( k_1(t,s;y(s)) - k_1(t,s;v(s)) \right) \mathrm{d}s + \int_0^{qt} s^{\mu} \left( k_2(t,s;y(s)) - k_2(t,s;v(s)) \right) \mathrm{d}s \right| \\ &\leq \int_0^t s^{\lambda} \left| k_1(t,s;y(s)) - k_1(t,s;v(s)) \right| \mathrm{d}s + \int_0^{qt} s^{\mu} \left| k_2(t,s;y(s)) - k_2(t,s;v(s)) \right| \mathrm{d}s \\ &\leq \int_0^t L_1 s^{\lambda} w(s)\, \mathrm{d}s + \int_0^{qt} L_2 s^{\mu} w(s)\, \mathrm{d}s \\ &\leq \int_0^t (L s^{\lambda} + L s^{\mu}) w(s)\, \mathrm{d}s. \end{split}$$

Let $g(s) = Ls^{\lambda} + Ls^{\mu}$; then $g(s)$ is a non-negative integrable function. According to Lemma 1, we obtain $w(t) = 0$, i.e., $y(t) = v(t)$, so the solution to Equation (1) is unique.

#### **3. The Numerical Algorithm**

In this section, we first provide some essential lemmas which are useful for the derivation of the approximate equation. Next, the discrete form of Equation (1) is obtained by combining an improved trapezoidal quadrature formula with linear interpolation. Finally, we solve the approximate equation by an iterative method. The process avoids evaluating any integrals directly; hence, the method is easy to implement.

#### *3.1. Some Lemmas*

**Lemma 2** ([32])**.** *Let $u \in C^3(0,1)$ and $z = \beta x + (1 - \beta)y$ with $\beta \in [0,1]$, $x, y \in [0,T]$. Then,*

$$
u(z) = \beta u(x) + (1 - \beta)u(y) - \frac{\beta(1 - \beta)}{2}(x - y)^2 u''(z) + O((x - y)^3). \tag{7}
$$

**Proof.** The Taylor expansion of function *u*(*x*) at the point *z* is

$$\begin{split} u(x) &= u(\beta x + (1 - \beta)x) \\ &= u(\beta x + (1 - \beta)y + (1 - \beta)(x - y)) \\ &= u(z + (1 - \beta)(x - y)) \\ &= u(z) + (1 - \beta)(x - y)u'(z) + \frac{(1 - \beta)^2}{2}(x - y)^2 u''(z) + O((x - y)^3). \end{split} \tag{8}$$

Similarly, the Taylor expansion of function *u*(*y*) at point *z* is

$$u(y) = u(z - \beta(x - y)) = u(z) - \beta(x - y)u'(z) + \frac{\beta^2}{2}(x - y)^2 u''(z) + O((x - y)^3), \tag{9}$$

and multiplying (8) by $\beta$, (9) by $1 - \beta$, and adding the results completes the proof.
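As a quick numerical sanity check (not part of the paper), the expansion (7) can be verified by comparing the interpolation error against its predicted leading term; the choices $u = \exp$, $\beta = 0.3$, and the sample points below are arbitrary test assumptions:

```python
import math

# Verify (7): beta*u(x) + (1-beta)*u(y) - u(z) equals
# (beta*(1-beta)/2) * (x-y)^2 * u''(z) up to O((x-y)^3).
u = math.exp          # test function; u'' = u for exp
beta, y0 = 0.3, 0.5   # arbitrary beta in [0,1] and base point
for d in (1e-1, 1e-2):
    x = y0 + d
    z = beta * x + (1 - beta) * y0
    err = beta * u(x) + (1 - beta) * u(y0) - u(z)
    predicted = beta * (1 - beta) / 2 * (x - y0) ** 2 * u(z)
    assert abs(err - predicted) < d ** 3   # remainder is O((x-y)^3)
```

Halving the gap $|x - y|$ by a factor of ten shrinks the residual by roughly a factor of a thousand, which is exactly the cubic remainder predicted by (7).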

**Lemma 3** ([33,34])**.** *Let $g(t) \in C^{2r}[a,b]$ ($r \ge 1$, $r \in \mathbb{N}$), $G(t) = (b-t)^{\lambda} g(t)$, $h = \frac{b-a}{N}$, and $t_k = a + kh$ for $k = 0, \cdots, N$, and consider the integral $\int_a^b G(t)\,\mathrm{d}t$. Then, the error of the modified trapezoidal integration rule*

$$T_N(G) = \frac{h}{2}G(t_0) + h\sum_{k=1}^{N-1} G(t_k) - \zeta(-\lambda)g(b)h^{1+\lambda},\tag{10}$$

*has an asymptotic expansion*

$$E_N(G) = \sum_{j=1}^{r-1} \frac{B_{2j}}{(2j)!} G^{(2j-1)}(a) h^{2j} + \sum_{j=1}^{2r-1} (-1)^j \zeta(-\lambda - j) \frac{g^{(j)}(b)}{j!} h^{j+\lambda+1} + O(h^{2r}),\tag{11}$$

*where* $-1 < \lambda < 0$, $\zeta$ *is the Riemann zeta function, and* $B_{2j}$ *denotes the Bernoulli numbers.*
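To make the rule concrete, the following sketch applies a zeta-corrected trapezoidal rule of the same structure to an integrand with a power singularity at the left endpoint, which is how it is used in Section 3.2. The value $\zeta(1/2) \approx -1.4603545$ is hardcoded because the Python standard library has no Riemann zeta function for arguments below 1; the function names and the test integrand are our own assumptions, not from the paper:

```python
import math

ZETA_HALF = -1.4603545088095868  # zeta(1/2), hardcoded for the test below

def singular_trapezoid(g, T, N, lam, zeta_minus_lam):
    """Modified trapezoidal rule for int_0^T s^lam * g(s) ds, -1 < lam < 0.
    The (infinite) weight at s = 0 is replaced by the zeta correction term,
    mirroring the structure of formulas (10) and (13)."""
    h = T / N
    total = -zeta_minus_lam * g(0.0) * h ** (1 + lam)   # endpoint correction
    total += sum(h * (k * h) ** lam * g(k * h) for k in range(1, N))
    total += 0.5 * h * T ** lam * g(T)                  # half weight at s = T
    return total

# int_0^1 s^{-1/2} ds = 2 exactly; lam = -1/2, so zeta(-lam) = zeta(1/2).
approx = singular_trapezoid(lambda s: 1.0, 1.0, 200, -0.5, ZETA_HALF)
```

Without the correction term, the plain trapezoidal rule would miss the contribution near $s = 0$ by an $O(h^{1+\lambda})$ amount; with it, the error drops to $O(h^{2+\lambda})$.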

#### *3.2. The Approximation Process*

In this subsection, we describe the numerical method used to find the approximate solution to Equation (1). Let $y(t)$ have continuous derivatives up to order 3 on $I$, and let $f(t)$, $k_1(t,s;y(s))$, $k_2(t,s;y(s))$ be four times continuously differentiable on $I$, $D \times \mathbb{R}$, $D_{\theta} \times \mathbb{R}$, respectively. Let $y(t_i)$ and $y_i$ denote the exact and approximate solutions at $t = t_i$, respectively. We divide $I = [0,T]$ into $N$ subintervals with uniform step size $h = \frac{T}{N}$, $t_i = ih$, $i = 0, 1, \cdots, N$. Let $t = t_i$ in Equation (1). Then,

$$\begin{split} y(t_i) &= f(t_i) + \int_0^{t_i} s^{\lambda} k_1(t_i, s; y(s)) \, \mathrm{d}s + \int_0^{qt_i} s^{\mu} k_2(t_i, s; y(s)) \, \mathrm{d}s \\ &= f(t_i) + \int_0^{t_i} s^{\lambda} k_1(t_i, s; y(s)) \, \mathrm{d}s + \int_0^{t_{[qi]}} s^{\mu} k_2(t_i, s; y(s)) \, \mathrm{d}s + \int_{t_{[qi]}}^{qt_i} s^{\mu} k_2(t_i, s; y(s)) \, \mathrm{d}s \\ &= f(t_i) + I_1 + I_2 + I_3, \end{split} \tag{12}$$

where $[qi]$ denotes the largest integer not exceeding $qi$. According to Lemma 3, we have

$$I_1 = \int_0^{t_i} s^{\lambda} k_1(t_i, s; y(s)) \, \mathrm{d}s \approx -\zeta(-\lambda) k_1(t_i, t_0; y(t_0)) h^{1+\lambda} + h \sum_{k=1}^{i-1} t_k^{\lambda} k_1(t_i, t_k; y(t_k)) + \frac{h}{2} t_i^{\lambda} k_1(t_i, t_i; y(t_i)). \tag{13}$$
 

For *I*<sup>2</sup> and *I*3, there are two cases.

• **Case I.** If [*qi*] = 0, then

$$\begin{aligned} I\_2 &= 0; \\ I\_3 &= \int\_0^{qt\_i} s^\mu k\_2 \left( t\_i, s; y(s) \right) ds \approx -\zeta(-\mu) \left( qt\_i \right)^{1+\mu} k\_2 \left( t\_i, t\_0; y(t\_0) \right) + \frac{qt\_i}{2} \left( qt\_i \right)^\mu k\_2 \left( t\_i, qt\_i; y(qt\_i) \right). \end{aligned} \tag{14}$$

• **Case II.** If [*qi*] ≥ 1, we obtain

$$I_2 \approx \begin{cases} -\zeta(-\mu)h^{1+\mu}k_2\left(t_i, t_0; y(t_0)\right) + \frac{h}{2}t_1^{\mu}k_2\left(t_i, t_1; y(t_1)\right), & [qi] = 1, \\ -\zeta(-\mu)h^{1+\mu}k_2\left(t_i, t_0; y(t_0)\right) + h\sum_{k=1}^{[qi]-1} t_k^{\mu}k_2\left(t_i, t_k; y(t_k)\right) + \frac{h}{2}t_{[qi]}^{\mu}k_2\left(t_i, t_{[qi]}; y(t_{[qi]})\right), & [qi] > 1. \end{cases}$$

$$I_3 \approx \frac{qt_i - t_{[qi]}}{2}\left(t_{[qi]}^{\mu}k_2\left(t_i, t_{[qi]}; y(t_{[qi]})\right) + (qt_i)^{\mu}k_2\left(t_i, qt_i; y(qt_i)\right)\right). \tag{15}$$

$y(qt_i)$ can be represented by linear interpolation between the adjacent values $y(t_{[qi]})$ and $y(t_{[qi]+1})$. For the node $t_i = ih$, $i = 0, 1, \cdots, N$, since $[qi] \le qi \le [qi]+1$, we obtain $t_{[qi]} \le qt_i \le t_{[qi]+1}$; according to Lemma 2, there exists $\beta_i \in [0,1]$ such that $qt_i = \beta_i t_{[qi]} + (1 - \beta_i)t_{[qi]+1}$. The value $\beta_i = 1 + [qi] - qi$ can be calculated easily. Then, the approximate expression of $y(qt_i)$ is

$$y(qt\_i) \approx \beta\_i y(t\_{[qi]}) + (1 - \beta\_i) y(t\_{[qi]+1}).\tag{16}$$

Then, (15) can be written as

$$I_3 \approx \frac{qt_i - t_{[qi]}}{2} \left( t_{[qi]}^{\mu} k_2\left( t_i, t_{[qi]}; y(t_{[qi]}) \right) + (qt_i)^{\mu} k_2\left( t_i, qt_i; \beta_i y(t_{[qi]}) + (1 - \beta_i) y(t_{[qi]+1}) \right) \right). \tag{17}$$
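The weight $\beta_i = 1 + [qi] - qi$ and the identity $qt_i = \beta_i t_{[qi]} + (1 - \beta_i)t_{[qi]+1}$ can be checked directly; this small sketch (our own, with arbitrary $q$ and $h$) does exactly that:

```python
import math

def delay_interp_weight(i, q):
    """Return ([qi], beta_i) with beta_i = 1 + [qi] - qi, so that the delayed
    point q*t_i is the convex combination beta_i*t_[qi] + (1-beta_i)*t_([qi]+1)."""
    qi_floor = math.floor(q * i)
    return qi_floor, 1 + qi_floor - q * i

h, q = 0.1, 0.37   # arbitrary uniform step and delay parameter, 0 < q < 1
for i in range(1, 11):
    j, beta = delay_interp_weight(i, q)
    assert 0.0 <= beta <= 1.0
    t_interp = beta * (j * h) + (1 - beta) * ((j + 1) * h)
    assert abs(t_interp - q * (i * h)) < 1e-12   # reproduces q*t_i exactly
```

Because the interpolation *node* is reproduced exactly, the only error in (16) is the $O(h^2)$ interpolation error of $y$ itself quantified by Lemma 2.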

The approximation equations are as follows

• **Case I.** When [*qi*] = 0,

$$\begin{split} y_0 &= f(t_0); \\ y_i &\approx f(t_i) - \zeta(-\lambda)k_1(t_i, t_0; y_0)h^{1+\lambda} + h\sum_{k=1}^{i-1} t_k^{\lambda} k_1(t_i, t_k; y_k) + \frac{h}{2}t_i^{\lambda} k_1(t_i, t_i; y_i) \\ &\quad - \zeta(-\mu)(qt_i)^{1+\mu}k_2(t_i, t_0; y_0) + \frac{qt_i}{2}(qt_i)^{\mu} k_2(t_i, qt_i; \beta_i y_{[qi]} + (1-\beta_i)y_{[qi]+1}). \end{split} \tag{18}$$

• **Case II.** When [*qi*] ≥ 1,

$$y\_0 = f(t\_0);$$

$$\begin{split} y_i &\approx f(t_i) - \zeta(-\lambda)k_1(t_i, t_0; y_0)h^{1+\lambda} + h\sum_{k=1}^{i-1} t_k^{\lambda} k_1(t_i, t_k; y_k) + \frac{h}{2}t_i^{\lambda} k_1(t_i, t_i; y_i) \\ &\quad - \zeta(-\mu)h^{1+\mu}k_2(t_i, t_0; y_0) + \delta_i + \frac{h}{2}t_{[qi]}^{\mu}k_2(t_i, t_{[qi]}; y_{[qi]}) \\ &\quad + \frac{qt_i - t_{[qi]}}{2}\left(t_{[qi]}^{\mu}k_2(t_i, t_{[qi]}; y_{[qi]}) + (qt_i)^{\mu}k_2(t_i, qt_i; \beta_i y_{[qi]} + (1-\beta_i)y_{[qi]+1})\right), \end{split} \tag{19}$$

where

$$\delta_i = \begin{cases} 0, & [qi] = 1, \\ h \sum_{k=1}^{[qi]-1} t_k^{\mu} k_2\left(t_i, t_k; y_k\right), & [qi] \ge 2. \end{cases}$$

#### *3.3. Iterative Scheme*

Now, the approximate equation can be solved by an iterative algorithm.

**Iterative algorithm**

**Step 1.** Take a sufficiently small $\epsilon > 0$ and set $\tilde{y}_0 = f(t_0)$, $i := 1$.

**Step 2.** Let $y_i^0 = \tilde{y}_{i-1}$, $m := 0$; then we compute $y_i^{m+1}$ ($i \le N$) as follows:

• **Case I.** When [*qi*] = 0,

$$\begin{split} y_0 &= f(t_0); \\ y_i^{m+1} &\approx f(t_i) - \zeta(-\lambda)k_1(t_i, t_0; \tilde{y}_0)h^{1+\lambda} + h \sum_{k=1}^{i-1} t_k^{\lambda} k_1(t_i, t_k; \tilde{y}_k) + \frac{h}{2} t_i^{\lambda} k_1(t_i, t_i; y_i^m) \\ &\quad - \zeta(-\mu)(qt_i)^{1+\mu} k_2(t_i, t_0; \tilde{y}_0) + \frac{qt_i}{2} (qt_i)^{\mu} k_2(t_i, qt_i; \beta_i \tilde{y}_{[qi]} + (1-\beta_i)y_{[qi]+1}^{m}). \end{split} \tag{20}$$
 
• **Case II.** When $[qi] \ge 1$,

$$\begin{split} y_0 &= f(t_0); \\ y_i^{m+1} &\approx f(t_i) - \zeta(-\lambda)k_1(t_i, t_0; \tilde{y}_0)h^{1+\lambda} + h\sum_{k=1}^{i-1} t_k^{\lambda} k_1(t_i, t_k; \tilde{y}_k) + \frac{h}{2}t_i^{\lambda} k_1(t_i, t_i; y_i^m) \\ &\quad - \zeta(-\mu)h^{1+\mu}k_2\left(t_i, t_0; \tilde{y}_0\right) + \tilde{\delta}_i + \frac{h}{2}t_{[qi]}^{\mu}k_2\left(t_i, t_{[qi]}; \tilde{y}_{[qi]}\right) \\ &\quad + \frac{qt_i - t_{[qi]}}{2}\left(t_{[qi]}^{\mu}k_2\left(t_i, t_{[qi]}; \tilde{y}_{[qi]}\right) + (qt_i)^{\mu}k_2\left(t_i, qt_i; \beta_i \tilde{y}_{[qi]} + (1-\beta_i)y_{[qi]+1}^{m}\right)\right), \end{split} \tag{21}$$

where

$$\tilde{\delta}_i = \begin{cases} 0, & [qi] = 1, \\ h \sum_{k=1}^{[qi]-1} t_k^{\mu} k_2\left(t_i, t_k; \tilde{y}_k\right), & [qi] \ge 2. \end{cases}$$

**Step 3.** If $|y_i^{m+1} - y_i^m| \le \epsilon$, then let $\tilde{y}_i := y_i^{m+1}$ and $i := i + 1$, and return to Step 2. Otherwise, let $m := m + 1$ and return to Step 2.
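The structure of Steps 1–3 can be illustrated on a simplified model without the singular weights and the delay term: a regular Volterra equation $y(t) = f(t) + \int_0^t k(t,s,y(s))\,\mathrm{d}s$ discretized by the plain trapezoidal rule, with the current node resolved by fixed-point iteration. This is a sketch under those simplifying assumptions, not the full scheme (20)–(21):

```python
import math

def solve_volterra(f, k, T, N, eps=1e-12, max_iter=100):
    """Node-by-node fixed-point iteration (Steps 1-3) for the regular model
    problem y(t) = f(t) + int_0^t k(t,s,y(s)) ds, discretized with the
    plain trapezoidal rule (no singular weights, no delay term)."""
    h = T / N
    y = [f(0.0)]                         # Step 1: y~_0 = f(t_0)
    for i in range(1, N + 1):
        ti = i * h
        # history part: already-converged nodes, fixed during the iteration
        hist = f(ti) + 0.5 * h * k(ti, 0.0, y[0])
        for j in range(1, i):
            hist += h * k(ti, j * h, y[j])
        yi = y[i - 1]                    # Step 2: start from the previous node
        for _ in range(max_iter):
            yi_new = hist + 0.5 * h * k(ti, ti, yi)
            if abs(yi_new - yi) <= eps:  # Step 3: stop when increment <= eps
                yi = yi_new
                break
            yi = yi_new
        y.append(yi)
    return y

# Model check: k(t,s,y) = y with f = 1 gives y(t) = exp(t).
y = solve_volterra(lambda t: 1.0, lambda t, s, v: v, 1.0, 50)
```

Each inner loop is a contraction as soon as $\frac{h}{2}$ times the Lipschitz constant of $k$ in $y$ is below 1, which is exactly the smallness condition on $h$ used throughout Section 4.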

**Remark 1.** *In Section 3.2, we considered the regularity of $k_i(t,s;y(s))$ ($i = 1, 2$) only up to $r = 2$ in Lemma 3, since this already attains the desired accuracy and is sufficient for the subsequent convergence analysis and extrapolation algorithm.*

#### **4. Existence and Uniqueness of the Solution to the Approximate Equation**

In this section, we investigate the existence and uniqueness of the solution to the approximate equation. We first introduce the following discrete Gronwall inequality.

**Lemma 4** ([35,36])**.** *Suppose that the non-negative sequence $\{w_n\}$, $n = 0, \cdots, N$, satisfies*

$$w_n \le h \sum_{k=1}^{n-1} B_k w_k + A, \quad 0 \le n \le N, \tag{22}$$

*where $A$ and $B_k$, $k = 1, \cdots, N$, are non-negative constants and $h = 1/N$. When $h \max_{1 \le k \le N} B_k \le \frac{1}{2}$, we have*

$$\max_{0 \le n \le N} w_n \le A \exp\left(2h \sum_{k=1}^{N} B_k\right).$$
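A quick numerical illustration (our own, with constant $B_k = B$ as a simplifying assumption) confirms that a sequence built to satisfy (22) with equality stays below the stated exponential bound:

```python
import math

# Numerical check of the discrete Gronwall bound (Lemma 4) in the extremal
# case where (22) holds with equality and B_k = B is constant.
N, A, B = 100, 1.0, 1.0
h = 1.0 / N
assert h * B <= 0.5                      # smallness condition of the lemma

w = [A]                                  # w_0 = A (the sum over k=1..n-1 is empty)
for n in range(1, N + 1):
    w.append(A + h * B * sum(w[1:n]))    # equality version of (22)

bound = A * math.exp(2 * h * B * N)      # right-hand side of the conclusion
assert max(w) <= bound
```

Here the sequence grows roughly like $A(1 + hB)^{n}$, i.e., geometrically, while the bound $A e^{2hBN}$ leaves a comfortable safety factor of 2 in the exponent.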

**Theorem 2.** *Let $f(t)$, $k_1(t,s;y(s))$, $k_2(t,s;y(s))$ be four times continuously differentiable on $I$, $D \times \mathbb{R}$, $D_{\theta} \times \mathbb{R}$, respectively. Additionally, $y(t)$ has continuous derivatives up to order 3 on $I$, and $k_i(t,s;y(s))$ ($i = 1, 2$) satisfy the Lipschitz conditions (2). Assume that $h$ is sufficiently small; then the solution to Equation (21) exists and is unique.*

**Proof.** We discuss the existence of the approximate solution under two cases.

• **Case I.** When [*qi*] = 0,

$$\begin{aligned} \left| y_i^{m+1} - y_i^m \right| &= \left| \frac{h}{2} t_i^{\lambda} \left( k_1\left( t_i, t_i; y_i^m \right) - k_1\left( t_i, t_i; y_i^{m-1} \right) \right) \right| \\ &\le L_1 \frac{h}{2} t_i^{\lambda} \left| y_i^m - y_i^{m-1} \right|. \end{aligned}$$

When $h$ is sufficiently small that $L_1 \frac{h}{2} t_i^{\lambda} \le \frac{1}{2}$, the bound $|y_i^{m+1} - y_i^m| \le \frac{1}{2}|y_i^m - y_i^{m-1}|$ holds. Therefore, the iterative algorithm converges and its limit is the solution to the approximation equation. This proves existence when $[qi] = 0$. Now, we prove uniqueness. Suppose $y_i$ and $x_i$ are both solutions to Equation (20). Denote the absolute differences by $w_i = |y_i - x_i|$. We have

$$\begin{split} w_0 ={}& 0; \\ w_i \le{}& \left|\zeta(-\lambda)\right| \left|k_1(t_i, t_0; y_0) - k_1(t_i, t_0; x_0)\right| h^{1+\lambda} + h \sum_{k=1}^{i-1} t_k^{\lambda} \left|k_1(t_i, t_k; y_k) - k_1(t_i, t_k; x_k)\right| \\ & + \frac{h}{2} t_i^{\lambda} \left|k_1(t_i, t_i; y_i) - k_1(t_i, t_i; x_i)\right| + \left|\zeta(-\mu)\right| (qt_i)^{1+\mu} \left|k_2(t_i, t_0; y_0) - k_2(t_i, t_0; x_0)\right| \\ & + \frac{qt_i}{2}(qt_i)^{\mu} \left|k_2(t_i, qt_i; \beta_i y_{[qi]} + (1-\beta_i)y_{[qi]+1}) - k_2(t_i, qt_i; \beta_i x_{[qi]} + (1-\beta_i)x_{[qi]+1})\right| \\ \le{}& L_1 h \sum_{k=1}^{i-1} t_k^{\lambda} w_k + L_1 \frac{h}{2} t_i^{\lambda} w_i + L_2 \frac{qt_i}{2}(qt_i)^{\mu}\left(\beta_i w_{[qi]} + (1-\beta_i)w_{[qi]+1}\right) \\ \le{}& Lh \sum_{k=1}^{i-1} t_k^{\lambda} w_k + L\frac{h}{2} t_i^{\lambda} w_i + L\frac{qt_i}{2}(qt_i)^{\mu}(1-\beta_i)w_1, \end{split} \tag{23}$$

where $L = \max\{L_1, L_2\}$ and we used $w_0 = 0$ with $w_{[qi]} = w_0$, $w_{[qi]+1} = w_1$ since $[qi] = 0$. When $h$ is sufficiently small that $L_1 \frac{h}{2} t_i^{\lambda} \le \frac{1}{2}$, we have

$$\begin{aligned} w_i &\le 2Lh \sum_{k=1}^{i-1} t_k^{\lambda} w_k + Lqt_i(qt_i)^{\mu}(1-\beta_i)w_1 \\ &\le \left[2Lt_1^{\lambda} + L(qt_i)^{\mu}(1-\beta_i)\right] h w_1 + 2Lh \sum_{k=2}^{i-1} t_k^{\lambda} w_k \\ &= h \sum_{k=1}^{i-1} B_k w_k, \end{aligned}$$

where

$$B_k = \begin{cases} 2Lt_1^{\lambda} + L(qt_i)^{\mu}(1-\beta_i), & k = 1, \\ 2Lt_k^{\lambda}, & k = 2, \dots, i-1. \end{cases}$$

According to Lemma 4 with $A = 0$, we have $w_i = 0$, i.e., $y_i = x_i$, so the solution of Equation (20) is unique.

• **Case II.** When $[qi] \ge 1$, we distinguish two situations.

(1) The first situation is $[qi] + 1 = i$, namely, when $i \le \frac{1}{1-q}$. We have

$$\begin{split} \left|y_i^{m+1} - y_i^m\right| &= \bigg| \frac{h}{2}t_i^{\lambda}\left(k_1(t_i, t_i; y_i^m) - k_1(t_i, t_i; y_i^{m-1})\right) \\ &\quad + \frac{qt_i - t_{[qi]}}{2}(qt_i)^{\mu}\left(k_2\left(t_i, qt_i; \beta_i\tilde{y}_{[qi]} + (1-\beta_i)y_{[qi]+1}^{m}\right) - k_2\left(t_i, qt_i; \beta_i\tilde{y}_{[qi]} + (1-\beta_i)y_{[qi]+1}^{m-1}\right)\right) \bigg| \\ &\le L_1\frac{h}{2}t_i^{\lambda}\left|y_i^m - y_i^{m-1}\right| + L_2\frac{qt_i - t_{[qi]}}{2}(qt_i)^{\mu}(1-\beta_i)\left|y_{[qi]+1}^{m} - y_{[qi]+1}^{m-1}\right| \\ &\le L\frac{h}{2}\left(t_i^{\lambda} + (qt_i)^{\mu}(1-\beta_i)\right)\left|y_i^m - y_i^{m-1}\right|, \end{split}$$

using $[qi] + 1 = i$ and $qt_i - t_{[qi]} \le h$.

Let the step size $h$ be small enough that $L\frac{h}{2}\left(t_i^{\lambda} + (qt_i)^{\mu}(1-\beta_i)\right) \le \frac{1}{2}$. Then $|y_i^{m+1} - y_i^m| \le \frac{1}{2}|y_i^m - y_i^{m-1}|$ holds.

(2) The second situation is $[qi] + 1 < i$, namely, when $i > \frac{1}{1-q}$. We obtain

$$\begin{aligned} \left| y_i^{m+1} - y_i^m \right| &= \left| \frac{h}{2} t_i^{\lambda} \left( k_1\left( t_i, t_i; y_i^m \right) - k_1\left( t_i, t_i; y_i^{m-1} \right) \right) \right| \\ &\le L_1 \frac{h}{2} t_i^{\lambda} \left| y_i^m - y_i^{m-1} \right| \\ &\le L \frac{h}{2} t_i^{\lambda} \left| y_i^m - y_i^{m-1} \right|. \end{aligned}$$

Let $L\frac{h}{2}t_i^{\lambda} \le \frac{1}{2}$ for a sufficiently small $h$; then $|y_i^{m+1} - y_i^m| \le \frac{1}{2}|y_i^m - y_i^{m-1}|$ holds.

The above two situations show that the iterative algorithm is convergent and that the limit is the solution to Equation (21).

Next, we prove that the solution to Equation (21) is unique. Suppose $y_i$ and $x_i$ are both solutions to Equation (21). Denote the differences by $w_i = |y_i - x_i|$, $i = 1, \cdots, N$. Then, we have

$$\begin{split} w_0 ={}& 0; \\ w_i \le{}& \left|\zeta(-\lambda)\right| \left|k_1(t_i, t_0; y_0) - k_1(t_i, t_0; x_0)\right| h^{1+\lambda} + h \sum_{k=1}^{i-1} t_k^{\lambda} \left|k_1(t_i, t_k; y_k) - k_1(t_i, t_k; x_k)\right| \\ & + \frac{h}{2} t_i^{\lambda} \left|k_1(t_i, t_i; y_i) - k_1(t_i, t_i; x_i)\right| + \left|\zeta(-\mu)\right| h^{1+\mu} \left|k_2(t_i, t_0; y_0) - k_2(t_i, t_0; x_0)\right| \\ & + h \sum_{k=1}^{[qi]-1} t_k^{\mu} \left|k_2(t_i, t_k; y_k) - k_2(t_i, t_k; x_k)\right| + \frac{h}{2} t_{[qi]}^{\mu} \left|k_2(t_i, t_{[qi]}; y_{[qi]}) - k_2(t_i, t_{[qi]}; x_{[qi]})\right| \\ & + \frac{qt_i - t_{[qi]}}{2}\Big(t_{[qi]}^{\mu} \left|k_2(t_i, t_{[qi]}; y_{[qi]}) - k_2(t_i, t_{[qi]}; x_{[qi]})\right| \\ & + (qt_i)^{\mu} \left|k_2\left(t_i, qt_i; \beta_i y_{[qi]} + (1-\beta_i)y_{[qi]+1}\right) - k_2\left(t_i, qt_i; \beta_i x_{[qi]} + (1-\beta_i)x_{[qi]+1}\right)\right|\Big) \\ \le{}& h \sum_{k=1}^{i-1} t_k^{\lambda} L_1 w_k + \frac{h}{2} t_i^{\lambda} L_1 w_i + h \sum_{k=1}^{[qi]-1} t_k^{\mu} L_2 w_k + \frac{h}{2} t_{[qi]}^{\mu} L_2 w_{[qi]} + \frac{qt_i - t_{[qi]}}{2}\left(t_{[qi]}^{\mu} L_2 w_{[qi]} + (qt_i)^{\mu} L_2\left(\beta_i w_{[qi]} + (1-\beta_i)w_{[qi]+1}\right)\right) \\ \le{}& Lh \sum_{k=1}^{i-1} t_k^{\gamma} w_k + L\frac{h}{2} t_i^{\gamma} w_i + Lh \sum_{k=1}^{[qi]-1} t_k^{\gamma} w_k + L\frac{h}{2} t_{[qi]}^{\gamma} w_{[qi]} + L\frac{h}{2}\left(t_{[qi]}^{\gamma} w_{[qi]} + t_{[qi]}^{\gamma}\left(\beta_i w_{[qi]} + (1-\beta_i)w_{[qi]+1}\right)\right). \end{split} \tag{24}$$

(1) The first situation is $[qi] + 1 = i$ (i.e., when $i \le \frac{1}{1-q}$). Then, (24) entails

$$\begin{split} w_i \le{}& Lh \sum_{k=1}^{[qi]-1} t_k^{\gamma} w_k + Lht_{[qi]}^{\gamma} w_{[qi]} + L\frac{h}{2} t_i^{\gamma} w_i + Lh \sum_{k=1}^{[qi]-1} t_k^{\gamma} w_k + L\frac{h}{2} t_{[qi]}^{\gamma} w_{[qi]} \\ & + L\frac{h}{2}\left(t_{[qi]}^{\gamma} w_{[qi]} + t_{[qi]}^{\gamma}\left(\beta_i w_{[qi]} + (1-\beta_i) w_{[qi]+1}\right)\right) \\ ={}& 2Lh \sum_{k=1}^{[qi]-1} t_k^{\gamma} w_k + \left(2Lht_{[qi]}^{\gamma} + L\frac{h}{2}\beta_i t_{[qi]}^{\gamma}\right) w_{[qi]} + \left(L\frac{h}{2} t_i^{\gamma} + L\frac{h}{2} t_{[qi]}^{\gamma}(1-\beta_i)\right) w_{[qi]+1}. \end{split} \tag{25}$$

By letting $h$ be so small that $L\frac{h}{2}t_i^{\gamma} + L\frac{h}{2}t_{[qi]}^{\gamma}(1-\beta_i) \le \frac{1}{2}$, we can easily derive

$$w_i \le 4Lh \sum_{k=1}^{[qi]-1} t_k^{\gamma} w_k + \left(4Lht_{[qi]}^{\gamma} + Lh\beta_i t_{[qi]}^{\gamma}\right) w_{[qi]} = h \sum_{k=1}^{i-1} B_k w_k,$$

where

$$B_k = \begin{cases} 4Lt_k^{\gamma}, & k = 1, \cdots, [qi]-1, \\ 4Lt_{[qi]}^{\gamma} + L\beta_i t_{[qi]}^{\gamma}, & k = [qi]. \end{cases}$$

According to Lemma 4 with *A* = 0, we have *wi* = 0, and the solution of Equation (21) is unique.

(2) The second situation is $[qi] + 1 < i$ (i.e., when $i > \frac{1}{1-q}$). Then, (24) implies

$$\begin{split} w_i \le{}& Lh \sum_{k=1}^{[qi]-1} t_k^{\gamma} w_k + Lht_{[qi]}^{\gamma} w_{[qi]} + Lht_{[qi]+1}^{\gamma} w_{[qi]+1} + Lh \sum_{k=[qi]+2}^{i-1} t_k^{\gamma} w_k + L\frac{h}{2} t_i^{\gamma} w_i \\ & + Lh \sum_{k=1}^{[qi]-1} t_k^{\gamma} w_k + L\frac{h}{2} t_{[qi]}^{\gamma} w_{[qi]} + L\frac{h}{2}\left(t_{[qi]}^{\gamma} w_{[qi]} + t_{[qi]}^{\gamma}\left(\beta_i w_{[qi]} + (1-\beta_i) w_{[qi]+1}\right)\right) \\ ={}& 2Lh \sum_{k=1}^{[qi]-1} t_k^{\gamma} w_k + \left(2Lht_{[qi]}^{\gamma} + L\frac{h}{2}\beta_i t_{[qi]}^{\gamma}\right) w_{[qi]} + \left(Lht_{[qi]+1}^{\gamma} + L\frac{h}{2} t_{[qi]}^{\gamma}(1-\beta_i)\right) w_{[qi]+1} \\ & + Lh \sum_{k=[qi]+2}^{i-1} t_k^{\gamma} w_k + L\frac{h}{2} t_i^{\gamma} w_i. \end{split} \tag{26}$$

Letting $h$ be so small that $L\frac{h}{2}t_i^{\gamma} \le \frac{1}{2}$, then

$$\begin{split} w_i &\le 4Lh \sum_{k=1}^{[qi]-1} t_k^{\gamma} w_k + \left(4Lht_{[qi]}^{\gamma} + Lh\beta_i t_{[qi]}^{\gamma}\right) w_{[qi]} + \left(2Lht_{[qi]+1}^{\gamma} + Lht_{[qi]}^{\gamma}(1-\beta_i)\right) w_{[qi]+1} + 2Lh \sum_{k=[qi]+2}^{i-1} t_k^{\gamma} w_k \\ &= h \sum_{k=1}^{i-1} \widetilde{B}_k w_k, \end{split}$$

where

$$
\widetilde{B}_k = \begin{cases} 4Lt_k^{\gamma}, & k = 1, \cdots, [qi]-1, \\ 4Lt_{[qi]}^{\gamma} + L\beta_i t_{[qi]}^{\gamma}, & k = [qi], \\ 2Lt_{[qi]+1}^{\gamma} + Lt_{[qi]}^{\gamma}(1-\beta_i), & k = [qi]+1, \\ 2Lt_k^{\gamma}, & k = [qi]+2, \cdots, i-1. \end{cases}
$$

According to Lemma 4 with $A = 0$, we have $w_i = 0$, i.e., $y_i = x_i$, so the solution of Equation (21) is unique. Combining the above situations completes the proof of Theorem 2.

#### **5. Convergence Analysis**

In this section, we discuss the errors introduced by deriving the discrete equations via the quadrature formula and interpolation technique, as well as the errors introduced by solving the discrete equations with the iterative algorithm. According to the quadrature rule, Equation (12) can be expressed as

$$y(t_0) = f(t_0),$$

$$\begin{split} y(t_i) &= f(t_i) - \zeta(-\lambda)k_1(t_i, t_0; y(t_0))h^{1+\lambda} + h\sum_{k=1}^{i-1} t_k^{\lambda} k_1(t_i, t_k; y(t_k)) + \frac{h}{2} t_i^{\lambda} k_1(t_i, t_i; y(t_i)) + E_{1,i} \\ &\quad - \zeta(-\mu)h^{1+\mu} k_2(t_i, t_0; y(t_0)) + h\sum_{k=1}^{[qi]-1} t_k^{\mu} k_2(t_i, t_k; y(t_k)) + \frac{h}{2} t_{[qi]}^{\mu} k_2(t_i, t_{[qi]}; y(t_{[qi]})) + E_{2,i} \\ &\quad + \frac{qt_i - t_{[qi]}}{2} \left( t_{[qi]}^{\mu} k_2\left(t_i, t_{[qi]}; y(t_{[qi]}) \right) + (qt_i)^{\mu} k_2\left(t_i, qt_i; \beta_i y(t_{[qi]}) + (1 - \beta_i) y(t_{[qi]+1}) \right) \right) + E_{3,i}. \end{split} \tag{27}$$

From Lemmas 2 and 3, the remainders are

$$\begin{aligned} E_{1,i} &= \left[ k_1(t_i, s; y(s)) \right]'\big|_{s=0}\, \zeta(-\lambda - 1) h^{2+\lambda} + \frac{\left[ k_1(t_i, s; y(s)) \right]''\big|_{s=0}}{2!}\, \zeta(-\lambda - 2) h^{3+\lambda} + O(h^{4+\lambda}) \\ &= T_1(t_i) h^{2+\lambda} + O(h^{3+\lambda}), \end{aligned}$$

$$\begin{split} E_{2,i} &= \left[ k_2(t_i, s; y(s)) \right]'\big|_{s=0}\, \zeta(-\mu - 1) h^{2+\mu} + \frac{\left[ k_2(t_i, s; y(s)) \right]''\big|_{s=0}}{2!}\, \zeta(-\mu - 2) h^{3+\mu} + O(h^{4+\mu}) \\ &= T_2(t_i) h^{2+\mu} + O(h^{3+\mu}), \\ E_{3,i} &= -\frac{\beta_i(1 - \beta_i)}{2} h^2 y''(qt_i)(qt_i)^{\mu} k_2\left( t_i, qt_i; \beta_i y(t_{[qi]}) + (1 - \beta_i) y(t_{[qi]+1}) \right) \\ &\quad + \frac{(qt_i - t_{[qi]})^2}{12} \int_{t_{[qi]}}^{qt_i} \frac{\partial^2}{\partial s^2} \left( k_2(t_i, s; y(s)) s^{\mu} \right) \mathrm{d}s + O(h^3) \\ &= T_3(t_i) h^2 + \frac{(qt_i - t_{[qi]})^2 - h^2}{12} \int_{t_{[qi]}}^{qt_i} \frac{\partial^2}{\partial s^2} \left( k_2(t_i, s; y(s)) s^{\mu} \right) \mathrm{d}s + O(h^3) \\ &= T_3(t_i) h^2 + O(h^3), \end{split}$$

where

$$\begin{aligned} T_1(t_i) &= \left[ k_1(t_i, s; y(s)) \right]'\big|_{s=0}\, \zeta(-\lambda - 1), \\ T_2(t_i) &= \left[ k_2(t_i, s; y(s)) \right]'\big|_{s=0}\, \zeta(-\mu - 1), \\ T_3(t_i) &= -\frac{\beta_i(1-\beta_i)}{2} y''(qt_i)(qt_i)^{\mu} k_2\left( t_i, qt_i; \beta_i y(t_{[qi]}) + (1 - \beta_i) y(t_{[qi]+1}) \right) + \frac{1}{12} \int_{t_{[qi]}}^{qt_i} \frac{\partial^2}{\partial s^2} \left( k_2(t_i, s; y(s)) s^{\mu} \right) \mathrm{d}s. \end{aligned}$$

In order to investigate the error between the exact solution and the approximate solution of Equation (1), we first give the following theorem.

**Theorem 3.** *Under the conditions of Theorem 2, $y(t_i)$ is the exact solution of Equation (1) at $t = t_i$ and $y_i$ is the solution of the discrete Equation (19) at $t_i$. Assume that $h$ is sufficiently small; then the absolute error, denoted by $e_{1,i} = |y(t_i) - y_i|$, has the estimate*

$$\max\_{1 \le i \le N} |e\_{1,i}| \le O(h^{2+\gamma}).$$

**Proof.** Subtracting (19) from (27),

$$\begin{split} e_{1,0} ={}& 0, \\ e_{1,i} \le{}& \left|\zeta(-\lambda)\right| \left|k_1(t_i, t_0; y(t_0)) - k_1(t_i, t_0; y_0)\right| h^{1+\lambda} + h \sum_{k=1}^{i-1} t_k^{\lambda} \left|k_1(t_i, t_k; y(t_k)) - k_1(t_i, t_k; y_k)\right| \\ & + \frac{h}{2} t_i^{\lambda} \left|k_1(t_i, t_i; y(t_i)) - k_1(t_i, t_i; y_i)\right| + \left|\zeta(-\mu)\right| h^{1+\mu} \left|k_2(t_i, t_0; y(t_0)) - k_2(t_i, t_0; y_0)\right| \\ & + h \sum_{k=1}^{[qi]-1} t_k^{\mu} \left|k_2(t_i, t_k; y(t_k)) - k_2(t_i, t_k; y_k)\right| + \frac{h}{2} t_{[qi]}^{\mu} \left|k_2(t_i, t_{[qi]}; y(t_{[qi]})) - k_2(t_i, t_{[qi]}; y_{[qi]})\right| \\ & + \frac{qt_i - t_{[qi]}}{2}\Big(t_{[qi]}^{\mu} \left|k_2(t_i, t_{[qi]}; y(t_{[qi]})) - k_2(t_i, t_{[qi]}; y_{[qi]})\right| \\ & + (qt_i)^{\mu} \left|k_2\left(t_i, qt_i; \beta_i y(t_{[qi]}) + (1-\beta_i)y(t_{[qi]+1})\right) - k_2\left(t_i, qt_i; \beta_i y_{[qi]} + (1-\beta_i)y_{[qi]+1}\right)\right|\Big) \\ & + T_1(t_i)h^{2+\lambda} + T_2(t_i)h^{2+\mu} + T_3(t_i)h^2 + O(h^{3+\gamma}) \\ \le{}& h \sum_{k=1}^{i-1} t_k^{\lambda} L_1 |e_{1,k}| + \frac{h}{2} t_i^{\lambda} L_1 |e_{1,i}| + h \sum_{k=1}^{[qi]-1} t_k^{\mu} L_2 |e_{1,k}| + \frac{h}{2} t_{[qi]}^{\mu} L_2 |e_{1,[qi]}| \\ & + \frac{qt_i - t_{[qi]}}{2}\left(t_{[qi]}^{\mu} L_2 |e_{1,[qi]}| + (qt_i)^{\mu} L_2 \left|\beta_i e_{1,[qi]} + (1-\beta_i)e_{1,[qi]+1}\right|\right) \\ & + T_1(t_i)h^{2+\lambda} + T_2(t_i)h^{2+\mu} + T_3(t_i)h^2 + O(h^{3+\gamma}). \end{split} \tag{28}$$

Letting $h$ be so small that $\frac{h}{2}t_i^{\lambda}L_1 \le \frac{1}{2}$, it is easy to derive

$$|e_{1,i}| \le A + h\sum_{j=1}^{i-1} B_j |e_{1,j}|, \qquad 0 \le i \le N,$$

where

$$A = 2\left|T_1(t_i)h^{2+\lambda} + T_2(t_i)h^{2+\mu} + T_3(t_i)h^2 + O(h^{3+\gamma})\right| = O(h^{2+\gamma}).$$

By Lemma 4, we have

$$\max\_{1 \le i \le N} |e\_{1,i}| \le O(h^{2+\gamma}).$$

The proof is complete.
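A convergence rate of this kind can be checked empirically by halving $h$ and measuring $p \approx \log_2(e(h)/e(h/2))$. The sketch below (our own test, not from the paper) does this for the zeta-corrected quadrature alone with $\lambda = -1/2$, where the expected rate is $2+\lambda = 1.5$; the integrand $s^{-1/2}e^s$, the reference value $\int_0^1 s^{-1/2}e^s\,\mathrm{d}s = \sqrt{\pi}\,\mathrm{erfi}(1) \approx 2.9253035$, and the hardcoded $\zeta(1/2)$ are our own assumptions:

```python
import math

ZETA_HALF = -1.4603545088095868   # zeta(1/2)
EXACT = 2.9253034918              # int_0^1 s^{-1/2} e^s ds = sqrt(pi)*erfi(1)

def corrected_rule(N, lam=-0.5):
    """Zeta-corrected trapezoidal rule for int_0^1 s^lam * exp(s) ds,
    with the weight at the singular endpoint s = 0 replaced by the
    zeta correction term."""
    h = 1.0 / N
    total = -ZETA_HALF * math.exp(0.0) * h ** (1 + lam)
    total += sum(h * (k * h) ** lam * math.exp(k * h) for k in range(1, N))
    total += 0.5 * h * math.exp(1.0)
    return total

e1 = abs(corrected_rule(100) - EXACT)
e2 = abs(corrected_rule(200) - EXACT)
p = math.log(e1 / e2, 2)          # empirical order, expected close to 1.5
```

Doubling $N$ should shrink the error by roughly $2^{1.5} \approx 2.83$, so the measured $p$ lands near $1.5$ rather than the $2$ one would see for a smooth integrand.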

Next, we evaluate the error arising from the iterative process.

**Theorem 4.** *Under the conditions of Theorem 2, $y_i$ is the solution of Equation (19) and $\tilde{y}_i$ is the approximate solution obtained from the iterative scheme (21). The absolute error is denoted by $e_{2,i} = |y_i - \tilde{y}_i|$. Assume that $h$ is sufficiently small; then there exist two positive constants, $C_1$ and $C_2$, independent of $h = \frac{T}{N}$, such that*

$$e_{2,i} \le \begin{cases} C_1 h\epsilon, & [qi] + 1 = i, \\ C_2 h\epsilon, & [qi] + 1 < i. \end{cases}$$

**Proof.** Subtracting (21) from (19), we have *e*2,0 = 0. We consider two cases.

(1) The first case is $[qi] + 1 = i$ (i.e., when $i \le \frac{1}{1-q}$). Then, we have

$$\begin{split} e_{2,i} &\le h \sum_{k=1}^{i-1} t_k^{\lambda} L_1 e_{2,k} + \frac{h}{2} t_i^{\lambda} L_1 \epsilon + h \sum_{k=1}^{[qi]-1} t_k^{\mu} L_2 e_{2,k} + \frac{h}{2} t_{[qi]}^{\mu} L_2 e_{2,[qi]} \\ &\quad + \frac{qt_i - t_{[qi]}}{2}\left(t_{[qi]}^{\mu} L_2 e_{2,[qi]} + (qt_i)^{\mu} L_2\left(\beta_i e_{2,[qi]} + (1-\beta_i)\epsilon\right)\right) \\ &\le h \sum_{k=1}^{i-1} t_k^{\lambda} L e_{2,k} + h \sum_{k=1}^{[qi]-1} t_k^{\mu} L e_{2,k} + \frac{h}{2} t_{[qi]}^{\mu} L e_{2,[qi]} + \frac{h}{2}\left(t_{[qi]}^{\mu} L + (qt_i)^{\mu} L \beta_i\right) e_{2,[qi]} \\ &\quad + \left(\frac{h}{2}(qt_i)^{\mu} L (1-\beta_i) + \frac{h}{2} t_i^{\lambda} L\right)\epsilon \\ &= h \sum_{k=1}^{i-1} B_k e_{2,k} + \left(\frac{1}{2}(qt_i)^{\mu} L (1-\beta_i) + \frac{1}{2} t_i^{\lambda} L\right) h\epsilon. \end{split} \tag{29}$$

According to Lemma 4, we have $e_{2,i} \le C_1 h\epsilon$.

(2) The second case is $[qi] + 1 < i$ (i.e., when $i > \frac{1}{1-q}$). Then, we obtain

$$\begin{split} e_{2,i} &\le h \sum_{k=1}^{i-1} t_k^\lambda L_1 e_{2,k} + \frac{h}{2} t_i^\lambda L_1 \varepsilon + h \sum_{k=1}^{[qi]-1} t_k^\mu L_2 e_{2,k} + \frac{h}{2} t_{[qi]}^\mu L_2 e_{2,[qi]} \\ &\quad + \frac{qt_i - t_{[qi]}}{2} \left( t_{[qi]}^\mu L_2 e_{2,[qi]} + (qt_i)^\mu L_2 (\beta_i e_{2,[qi]} + (1 - \beta_i) e_{2,[qi]+1}) \right) \\ &\le h \sum_{k=1}^{i-1} t_k^\lambda L e_{2,k} + \frac{h}{2} t_i^\lambda L \varepsilon + h \sum_{k=1}^{[qi]-1} t_k^\mu L e_{2,k} + \frac{h}{2} t_{[qi]}^\mu L e_{2,[qi]} \\ &\quad + \frac{h}{2} \left( t_{[qi]}^\mu L e_{2,[qi]} + (qt_i)^\mu L (\beta_i e_{2,[qi]} + (1 - \beta_i) e_{2,[qi]+1}) \right) \\ &= h \sum_{k=1}^{i-1} B_k e_{2,k} + \frac{1}{2} t_i^\lambda L h \varepsilon. \end{split} \tag{30}$$

According to Lemma 4, we have $e_{2,i} \le C_2 h\varepsilon$.

**Theorem 5.** *Under the conditions of Theorem 2, let $y(t_i)$ be the exact solution of Equation* (1) *and $\overline{y}_i$ the approximate solution of Equation* (1) *at $t = t_i$. Then we have*

$$|y(t_i) - \overline{y}_i| \le \begin{cases} C_1 h\varepsilon + O(h^{2+\gamma}), & [qi] + 1 = i, \\ C_2 h\varepsilon + O(h^{2+\gamma}), & [qi] + 1 \le i. \end{cases}$$

**Proof.** By Theorems 3 and 4, the absolute error between $y(t_i)$ and $\overline{y}_i$ satisfies

$$\begin{split} |y(t\_i) - \overline{y}\_i| &= |y(t\_i) - y\_i + y\_i - \overline{y}\_i| \\ &\le |y(t\_i) - y\_i| + |y\_i - \overline{y}\_i|. \end{split} \tag{31}$$

Combining the bound on $|y(t_i) - y_i|$ from Theorem 3 with the bound on $|y_i - \overline{y}_i|$ from Theorem 4, we obtain the conclusion of Theorem 5.

#### **6. Extrapolation Method**

In this section, we first describe the asymptotic error expansion and then present an extrapolation technique for achieving high precision. Finally, a posterior error estimate is derived.

**Theorem 6.** *Let $f(t)$, $k_1(t,s;y(s))$, and $k_2(t,s;y(s))$ be four times continuously differentiable on $I$, $D \times \mathbf{R}$, and $D_\theta \times \mathbf{R}$, respectively. Additionally, assume that $y(t)$ has continuous derivatives up to order 3 on $I$ and that $k_i(t,s;y(s))$ $(i = 1, 2)$ satisfy the Lipschitz conditions (2). Then there exist functions $\hat{W}_i(t)$ $(i = 1, 2, 3)$, independent of $h$, such that the following asymptotic expansion holds:*

$$y_i = y(t_i) + \hat{W}_1(t_i)h^{2+\lambda} + \hat{W}_2(t_i)h^{2+\mu} + \hat{W}_3(t_i)h^2 + O(h^{3+\gamma}), \quad -1 < \lambda < 0, \quad -1 < \mu \le 0. \tag{32}$$

**Proof.** Assume that $\{\hat{W}_k(t),\, k = 1, 2, 3\}$ satisfy the auxiliary delay equations

$$\hat{W}_k(t) = W_k(t) + \int_0^t s^\lambda k_1(t, s; y(s)) \hat{W}_k(s)\,\mathrm{d}s + \int_0^{qt} s^\mu k_2(t, s; y(s)) \hat{W}_k(s)\,\mathrm{d}s,$$

and that the grid values $\Psi_k(t_i)$, $i = 1, \cdots, N$, satisfy the approximation equations

$$\begin{split} \Psi_k(t_i) &= -\zeta(-\lambda)h^{1+\lambda}k_1(t_i,t_0;y(t_0))\Psi_k(t_0) + h\sum_{l=1}^{i-1}t_l^{\lambda}k_1(t_i,t_l;y(t_l))\Psi_k(t_l) \\ &\quad + \frac{h}{2}t_i^{\lambda}k_1(t_i,t_i;y(t_i))\Psi_k(t_i) - \zeta(-\mu)h^{1+\mu}k_2(t_i,t_0;y(t_0))\Psi_k(t_0) \\ &\quad + h\sum_{l=1}^{[qi]-1}t_l^{\mu}k_2(t_i,t_l;y(t_l))\Psi_k(t_l) + \frac{h}{2}t_{[qi]}^{\mu}k_2(t_i,t_{[qi]};y(t_{[qi]}))\Psi_k(t_{[qi]}) \\ &\quad + \frac{qt_i-t_{[qi]}}{2}\Big(t_{[qi]}^{\mu}k_2(t_i,t_{[qi]};y(t_{[qi]}))\Psi_k(t_{[qi]}) + (qt_i)^{\mu}k_2(t_i,qt_i;y(qt_i))\big(\beta_i\Psi_k(t_{[qi]}) \\ &\quad + (1-\beta_i)\Psi_k(t_{[qi]+1})\big)\Big) + W_k(t_i). \end{split} \tag{33}$$

The analysis procedure is similar to the proof of Theorem 3. We obtain

$$\max_{1 \le i \le N} \left|\hat{W}_k(t_i) - \Psi_k(t_i)\right| \le L h^{2+\gamma}.$$

Let

$$E_i = e_i - \left(\hat{W}_1(t_i)h^{2+\lambda} + \hat{W}_2(t_i)h^{2+\mu} + \hat{W}_3(t_i)h^2\right).$$

Then, we obtain

$$\begin{split} E_i &= -\zeta(-\lambda)h^{1+\lambda}k_1(t_i,t_0;y(t_0))E_0 + h\sum_{k=1}^{i-1}t_k^{\lambda}k_1(t_i,t_k;y(t_k))E_k + \frac{h}{2}t_i^{\lambda}k_1(t_i,t_i;y(t_i))E_i \\ &\quad - \zeta(-\mu)h^{1+\mu}k_2(t_i,t_0;y(t_0))E_0 + h\sum_{k=1}^{[qi]-1}t_k^{\mu}k_2(t_i,t_k;y(t_k))E_k + \frac{h}{2}t_{[qi]}^{\mu}k_2(t_i,t_{[qi]};y(t_{[qi]}))E_{[qi]} \\ &\quad + \frac{qt_i-t_{[qi]}}{2}\left(t_{[qi]}^{\mu}k_2(t_i,t_{[qi]};y(t_{[qi]}))E_{[qi]} + (qt_i)^{\mu}k_2(t_i,qt_i;y(qt_i))\left(\beta_i E_{[qi]} + (1-\beta_i)E_{[qi]+1}\right)\right). \end{split}$$

According to Lemma 4, there exists a constant *d* such that

$$\max\_{1 \le i \le N} |E\_i| \le dh^{3+\gamma}.$$

The asymptotic expansion is

$$
\overline{y}_i = y(t_i) + \hat{W}_1(t_i)h^{2+\lambda} + \hat{W}_2(t_i)h^{2+\mu} + \hat{W}_3(t_i)h^2 + O(h^{3+\gamma}).
$$

Based on Theorem 6, we can apply the Richardson extrapolation method to achieve higher accuracy.

#### **Extrapolation algorithm**

**Step 1.** Assume *γ* = min (*λ*, *μ*) = *λ*, and halve the step length to obtain

$$\overline{y}_i^{\frac{h}{2}} = y(t_i) + \hat{W}_1(t_i)\left(\frac{h}{2}\right)^{2+\lambda} + \hat{W}_2(t_i)\left(\frac{h}{2}\right)^{2+\mu} + \hat{W}_3(t_i)\left(\frac{h}{2}\right)^2 + O\left(\left(\frac{h}{2}\right)^{3+\lambda}\right). \tag{34}$$

Then, the term $\hat{W}_1(t_i)h^{2+\lambda}$ can be removed:

$$\overline{y}_i^{1,h} = \frac{2^{2+\lambda}\overline{y}_i^{\frac{h}{2}} - \overline{y}_i^{h}}{2^{2+\lambda}-1} = y(t_i) + \hat{W}_2(t_i)h^{2+\mu} + \hat{W}_3(t_i)h^2 + O(h^{3+\lambda}). \tag{35}$$

**Step 2.** To eliminate $\hat{W}_2(t_i)h^{2+\mu}$, we apply the Richardson $h^{2+\mu}$ extrapolation:

$$\overline{y}_i^{1,\frac{h}{2}} = y(t_i) + \hat{W}_2(t_i)\left(\frac{h}{2}\right)^{2+\mu} + \hat{W}_3(t_i)\left(\frac{h}{2}\right)^2 + O\left(\left(\frac{h}{2}\right)^{3+\lambda}\right). \tag{36}$$

Combining (35) and (36), we have

$$
\overline{y}_i^{2,h} = \frac{2^{2+\mu}\overline{y}_i^{1,\frac{h}{2}} - \overline{y}_i^{1,h}}{2^{2+\mu} - 1} = y(t_i) + \hat{W}_3(t_i)h^2 + O(h^{3+\lambda}).\tag{37}
$$

A posterior asymptotic error estimate is

$$\begin{split} \left|\overline{y}_i^{\frac{h}{2}} - y(t_i)\right| &= \left|\frac{2^{2+\lambda}\overline{y}_i^{\frac{h}{2}} - \overline{y}_i^{h}}{2^{2+\lambda}-1} - y(t_i) + \frac{\overline{y}_i^{h} - \overline{y}_i^{\frac{h}{2}}}{2^{2+\lambda}-1}\right| \le \left|\frac{2^{2+\lambda}\overline{y}_i^{\frac{h}{2}} - \overline{y}_i^{h}}{2^{2+\lambda}-1} - y(t_i)\right| + \left|\frac{\overline{y}_i^{h} - \overline{y}_i^{\frac{h}{2}}}{2^{2+\lambda}-1}\right| \\ &= \left|\overline{y}_i^{1,h} - y(t_i)\right| + \left|\frac{\overline{y}_i^{h} - \overline{y}_i^{\frac{h}{2}}}{2^{2+\lambda}-1}\right| = \left|\frac{\overline{y}_i^{h} - \overline{y}_i^{\frac{h}{2}}}{2^{2+\lambda}-1}\right| + O(h^2). \end{split} \tag{38}$$

The error $\left|\overline{y}_i^{\frac{h}{2}} - y(t_i)\right|$ is thus bounded, up to higher-order terms, by the computable quantity $\frac{\left|\overline{y}_i^{h} - \overline{y}_i^{\frac{h}{2}}\right|}{2^{2+\lambda}-1}$, which is important for constructing adaptive algorithms.
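The two-step extrapolation and the posterior estimate (38) can be checked on a synthetic error expansion. The following Python sketch (the paper's computations were done in MATLAB; all constants below are illustrative stand-ins, not values from the paper) mimics an approximation with expansion $y^h = y + \hat{W}_1 h^{2+\lambda} + \hat{W}_2 h^{2+\mu} + \hat{W}_3 h^2$ and applies (35) and (37):

```python
# Sketch of the Richardson extrapolation steps (35) and (37) on a synthetic
# expansion  y^h = y + W1*h**(2+lam) + W2*h**(2+mu) + W3*h**2.
# All constants are illustrative stand-ins, not values from the paper.
y_exact = 1.0
lam, mu = -0.5, -0.25            # weak-singularity exponents, gamma = lam
W1, W2, W3 = 0.3, 0.2, 0.1       # stand-ins for the coefficient functions

def approx(h):
    """Model of the basic approximation y_i^h."""
    return y_exact + W1 * h**(2 + lam) + W2 * h**(2 + mu) + W3 * h**2

def step1(h):
    """First extrapolation (35): eliminates the h^(2+lam) term."""
    return (2**(2 + lam) * approx(h / 2) - approx(h)) / (2**(2 + lam) - 1)

def step2(h):
    """Second extrapolation (37): eliminates the h^(2+mu) term as well."""
    return (2**(2 + mu) * step1(h / 2) - step1(h)) / (2**(2 + mu) - 1)

h = 0.1
print("E_h =", abs(approx(h) - y_exact))
print("E_1 =", abs(step1(h) - y_exact))
print("E_2 =", abs(step2(h) - y_exact))

# Posterior estimate (38): the error of y^(h/2) is controlled by the
# computable quantity |y^h - y^(h/2)| / (2^(2+lam) - 1), up to O(h^2).
posterior = abs(approx(h) - approx(h / 2)) / (2**(2 + lam) - 1)
print("posterior estimate:", posterior)
```

Each extrapolation step raises the observed order: the basic approximation converges like $h^{2+\lambda}$, the first extrapolation like $h^{2+\mu}$, and the second like $h^2$.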

#### **7. Numerical Experiments**

In this section, we illustrate the performance and accuracy of the quadrature method using the improved trapezoid formula. For ease of notation, we define

$$E_h = |y(t_i) - \overline{y}_i^h|, \quad E_{k,i} = |y(t_i) - \overline{y}_i^{k,h}| \;\; (k = 1, 2), \quad \mathrm{Rate} = \log_2\left(\frac{E_h}{E_{\frac{h}{2}}}\right),$$

where *<sup>y</sup><sup>h</sup> <sup>i</sup>* is the approximate solution of Equation (1), *<sup>y</sup><sup>k</sup>*,*<sup>h</sup> <sup>i</sup>* is the approximate solution of *k*-th extrapolation, *Ek*,*<sup>i</sup>* is the absolute error between the exact solution and the approximate solution of *k*-th extrapolation when *t* = *ti*. The procedure was implemented in MATLAB.

**Example 1.** *Consider the following equation*

$$y(t) = f(t) - \int\_0^t s^\lambda \sin(y(s)) \mathrm{d}s + \int\_0^{qt} (t+s) \sin(y(s)) \mathrm{d}s, \qquad t \in [0, T], \tag{39}$$

*with $T = 1$, $\lambda = -\frac{1}{2}$, and $q = 0.95$. The exact solution is given by $y(t) = t$ and $f(t)$ is determined by the exact solution.*

*Applying the algorithm with $N = 2^4, 2^5, 2^6, 2^7, 2^8$, the numerical results at $t = 0.4$ are presented in Table 1; the corresponding CPU times are 0.34, 0.55, 0.98, 1.62, and 3.01 s, respectively. By comparing $E_h$ and $E_{1,i}$, we can observe that the accuracy was improved and the extrapolation algorithm was effective. In the third column, the rate values show that the convergence order is consistent with the theoretical analysis.*

**Table 1.** Numerical results at *t* = 0.4 of Example 1.


**Example 2.** *Consider the following equation*

$$y(t) = f(t) - \int_0^t s^\lambda (t^2 + s) (y(s))^2 \,\mathrm{d}s + \int_0^{qt} s^\mu \sin(y(s)) \,\mathrm{d}s, \qquad t \in [0, T], \tag{40}$$

*where $T = 1$, $\lambda = \mu = -\frac{1}{2}$, $q = 0.8$, and the analytical solution is $y(t) = t$. Then, $f(t)$ is determined by the exact solution.*

*By applying the numerical method for $N = 2^4, 2^5, 2^6, 2^7, 2^8$, the obtained results at $t = 0.2$ are shown in Table 2. By comparing $E_h$ and $E_{1,i}$, we can observe that the accuracy was improved, showing that the extrapolation algorithm is effective. The results verify the theoretical convergence order, which is $O(h^{1.5})$.*


**Table 2.** Numerical results at *t* = 0.2 of Example 2.

**Example 3.** *We consider the following equation*

$$y(t) = f(t) - \int_0^t s^\lambda(t+s)\sin(y(s))\,\mathrm{d}s + \int_0^{qt} s^\mu(t+s)(y(s))^2 \,\mathrm{d}s, \qquad t \in [0, T], \tag{41}$$

*where $T = 1$, $\lambda = -\frac{1}{3}$, $\mu = -\frac{1}{4}$, $q = 0.9$, and the analytical solution is $y(t) = t$. Then, $f(t)$ is determined by the exact solution.*

*By applying the numerical method for $N = 2^4, 2^5, 2^6, 2^7$, and $2^8$, the obtained results at $t = 0.4$ are shown in Table 3. As $\lambda$ is not equal to $\mu$, we first applied the Richardson $h^{2+\lambda}$ extrapolation, and then adopted the Richardson $h^{2+\mu}$ extrapolation. Comparing $E_h$, $E_{1,i}$, and $E_{2,i}$, these results verify the theoretical results, and we can see that the extrapolation improved the accuracy dramatically. When $N = 8, 16, 32, 64, 128$, the CPU times are 1.43, 2.41, 3.99, 17.46, and 21.36 s, respectively. The exact solution and the approximation when $N = 8$ are plotted in Figure 1.*

**Table 3.** Numerical results at *t* = 0.4 of Example 3.


**Figure 1.** The absolute errors and the approximations when $N = 2^3$.

#### **8. Conclusions**

In this paper, by using the improved trapezoidal quadrature formula and linear interpolation, we obtained the approximate equation for nonlinear Volterra integral equations with vanishing delay and weakly singular kernels. The approximate solutions were obtained by an iterative algorithm, which possesses a high accuracy order $O(h^{2+\gamma})$. Additionally, we analyzed the existence and uniqueness of both the exact and approximate solutions. The significance of this work is that it demonstrates the efficiency and reliability of Richardson extrapolation. The computational findings were compared with the exact solution: our methods possess high accuracy and low computational complexity, and the results show good agreement with the theoretical analysis. In future work, we plan to apply this method to two-dimensional delay integral equations.

**Author Contributions:** Conceptualization, J.H. and L.Z.; methodology, J.H. and L.Z.; validation, J.H. and H.L.; writing—review and editing, L.Z. and Y.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Program of Chengdu Normal University, grant number CS18ZDZ02.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors would like to thank the editor and referees for their careful comments and fruitful suggestions.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## **An Iterative Algorithm for Approximating the Fixed Point of a Contractive Affine Operator**

**María Isabel Berenguer 1,2,\*,† and Manuel Ruiz Galán 1,2,†**


**Abstract:** First of all, in this paper we obtain a perturbed version of the geometric series theorem, which allows us to present an iterative numerical method to approximate the fixed point of a contractive affine operator. This result requires some approximations that we obtain using the projections associated with certain Schauder bases. Next, an algorithm is designed to approximate the solution of Fredholm's linear integral equation, and we illustrate the behavior of the method with some numerical examples.

**Keywords:** iterative numerical methods; Schauder bases; Fredholm integral equation

**MSC:** 65R20; 46B15; 45B05

#### **1. Introduction**

The idea of iterative numerical methods is, given a complete metric space $X$ (typically a Banach space) and a contractive operator $T : X \longrightarrow X$, or at least one which guarantees the convergence of the Picard iterates, to construct a sequence of approximations of the fixed point $x_0 = T(x_0)$ of that operator. The calculation of the Picard iterates is generally not easy or even feasible, so several methods which allow us to approximate the elements of the Picard sequence have been proposed. Therefore, a part of the Picard-type iterative algorithms focuses on determining, for an element $x \in X$, a value close to $T(x)$ and, in this way, successively approximating the iterates. The numerical techniques used are very diverse, and the resulting algorithms have numerous applications, as attested by the recent references [1–16].

However, our approach here is completely different: given $x$, instead of successively approximating $T(x), T^2(x), T^3(x), \dots$, which necessarily involves an accumulation of errors, in this paper we directly approximate $T^n(x)$ by means of suitable Schauder bases, transforming it into a simple calculation which, for example, does not involve solving systems of algebraic equations or using any quadrature formulae, because only linear combinations of certain values associated with the operator are calculated. What is more, motivated by its application to the numerical resolution of the linear Fredholm integral equation, the operator $T$ is considered to be affine and continuous. This affine and continuous nature means that, instead of using a fixed-point language, we opt for an equivalent version based on the geometric series theorem; more specifically, our first contribution is a perturbed version of that theorem, which admits approximations by means of certain Schauder bases related to the operator. Such an approximation implies a low computational cost, as mentioned above. Thus, we design an iterative-type algorithm which allows the approximation of the fixed point of a suitable continuous affine operator.

As we have mentioned, the application that we are presenting consists of a numerical algorithm to solve the linear Fredholm integral equation, which is chosen for its great versatility.

**Citation:** Berenguer, M.I.; Ruiz Galán, M. An Iterative Algorithm for Approximating the Fixed Point of a Contractive Affine Operator. *Mathematics* **2022**, *10*, 1012. https:// doi.org/10.3390/math10071012

Academic Editor: Ioannis K. Argyros

Received: 27 February 2022 Accepted: 18 March 2022 Published: 22 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

The structure of this paper is as follows. In Section 2 we establish an analytical–numerical result, which provides us with an approximation of the fixed point of a suitable continuous affine operator in a Banach space. To continue, Section 3 interprets the previous result in terms of an algorithm when a Schauder basis is introduced into the considered space. Section 4 derives a specific algorithm in the case of the linear Fredholm integral equation in two distinct contexts. Next, Section 5 shows some illustrative examples, including a classic model of electrostatics (Love's equation), and finally, Section 6 rounds off with some conclusions.

#### **2. Approximating Fixed Points of Affine Operators**

The following result provides us with an approximation of the fixed point of a suitable continuous affine operator, as well as an estimation of the error. It addresses a version of the geometric series theorem, which we can label as perturbed: it presents the possibility of converting the precise calculations into approximate ones, in exchange for making the calculations possible.

Before establishing this, we present some standard notation. Given a (real) Banach space $X$, $L(X)$ will denote the Banach space (with the usual operator norm) of bounded linear operators from $X$ to $X$. For $T \in L(X)$ and $n \in \mathbb{N}$, $T^n$ denotes the $n$-fold composition $T \circ \cdots \circ T$, while $T^0 = I$, the identity map on $X$.

**Theorem 1.** *Let $X$ be a Banach space, $y \in X$ and $L \in L(X)$ with $\|L\| < 1$, and consider the continuous affine operator $A : X \longrightarrow X$ defined by*

$$Ax := y + Lx, \quad (x \in X).$$

*Let $y_0 \in X$, $n \in \mathbb{N}$ and $L_0, L_1, \dots, L_n \in L(X)$. Then, the equation $Ax = x$ has a unique solution $x^\bullet \in X$ and*

$$\left\| \sum_{j=0}^{n} L_j y_0 - x^{\bullet} \right\| \le \sum_{j=0}^{n} \left\| L_j y_0 - L^j y_0 \right\| + \left( \frac{1 - \|L\|^{n+1}}{1 - \|L\|} \right) \|y_0 - y\| + \frac{\|L\|^{n+1}}{1 - \|L\|} \|y\|.$$

**Proof.** Let us first observe that, according to the geometric series theorem, there exists a unique solution *x*• ∈ *X* for the equation *Ax* = *x*,

$$x^\bullet = (I - L)^{-1} y,$$

which satisfies for any *<sup>k</sup>* <sup>∈</sup> <sup>N</sup>,

$$\left\| \sum\_{j=0}^{k} L^j y - x^{\bullet} \right\| \le \frac{||L||^{k+1}}{1 - ||L||} ||y||.$$

Therefore,

$$\begin{split} \left\| \sum_{j=0}^{n} L_j y_0 - x^{\bullet} \right\| &\le \left\| \sum_{j=0}^{n} L_j y_0 - \sum_{j=0}^{n} L^j y_0 \right\| + \left\| \sum_{j=0}^{n} L^j y_0 - \sum_{j=0}^{n} L^j y \right\| + \left\| \sum_{j=0}^{n} L^j y - x^{\bullet} \right\| \\ &\le \sum_{j=0}^{n} \left\| L_j y_0 - L^j y_0 \right\| + \sum_{j=0}^{n} \|L\|^j \|y_0 - y\| + \frac{\|L\|^{n+1}}{1 - \|L\|} \|y\| \\ &= \sum_{j=0}^{n} \left\| L_j y_0 - L^j y_0 \right\| + \left( \frac{1 - \|L\|^{n+1}}{1 - \|L\|} \right) \|y_0 - y\| + \frac{\|L\|^{n+1}}{1 - \|L\|} \|y\|, \end{split}$$

as announced.

It is worth mentioning that when $y_0 = y$ and, for all $j = 0, 1, \dots, n$, we have $L_j = L^j$, we recover a well-known algorithm associated with the geometric series theorem. However, iterative procedures such as this tend to involve difficult, and even impossible, calculations from a practical perspective, so the idea behind this theorem is to choose the operators $L_0, L_1, \dots, L_n$ in such a way that $L_0y_0, L_1y_0, \dots, L_ny_0$ are not only computable, but also have a low computational cost. In addition, if $y_0$ represents an approximation of $y$ (normally due to a certain type of error), the previous result shows how $y_0$ influences the final approximation. Finally, we can obtain an approximation of $x^\bullet$ provided that, for some adequate $n \in \mathbb{N}$ and each $j = 0, 1, \dots, n$, $L_jy_0$ is close to $L^j y_0$. More specifically:

**Corollary 1.** *Suppose that $X$ is a Banach space, $L \in L(X)$ with $\|L\| < 1$, $y \in X$, and that $A : X \longrightarrow X$ is the continuous affine operator $A(\cdot) := y + L(\cdot)$, whose unique fixed point is denoted by $x^\bullet \in X$. Additionally, assume that for some $y_0 \in X$, $n \in \mathbb{N}$, $L_0, L_1, \dots, L_n \in L(X)$ and $\varepsilon, \varepsilon_0, \dots, \varepsilon_n > 0$, we have that*

$$\sum\_{j=0}^{n} \varepsilon\_j < \frac{\varepsilon}{2},$$

$$j = 0, \dots, n \implies \|L_j y_0 - L^j y_0\| < \varepsilon_j, \tag{1}$$

*and that*

$$\left(\frac{1-\|L\|^{n+1}}{1-\|L\|}\right)\|y\_0-y\| + \frac{\|L\|^{n+1}}{1-\|L\|}\|y\| < \frac{\varepsilon}{2}.\tag{2}$$

*Then*

$$\left\| \sum\_{j=0}^{n} L\_j y\_0 - x^{\bullet} \right\| < \varepsilon.$$

Obviously, (2) is valid as soon as *n* is large enough and *y*<sup>0</sup> − *y* is small. For condition (1), we present some analytical tools in the next section.
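Theorem 1 can also be verified numerically in a finite-dimensional toy setting. The sketch below (a Python illustration, not part of the paper's method; the paper's computations use Mathematica) takes $X = \mathbb{R}^2$ with the max norm, a contractive matrix $L$, a perturbed datum $y_0 \approx y$, and $L_jy_0$ obtained by rounding $L^jy_0$, and checks that the partial sum stays within the stated bound of the fixed point:

```python
# Toy check of the perturbed geometric series bound of Theorem 1:
# X = R^2 with the max norm, L a contractive 2x2 matrix, and L_j y0
# computed as L^j y0 rounded to 6 decimals (the "approximate operator").
# All concrete numbers are illustrative, not taken from the paper.
L = [[0.3, 0.1], [0.0, 0.2]]           # ||L||_inf = 0.4 < 1
y = [1.0, 1.0]                          # datum of the affine operator Ax = y + Lx
y0 = [1.001, 0.999]                     # perturbed starting datum

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def norm(v):                            # max norm on R^2
    return max(abs(v[0]), abs(v[1]))

normL = max(abs(L[0][0]) + abs(L[0][1]), abs(L[1][0]) + abs(L[1][1]))

# Exact fixed point x* = (I - L)^{-1} y; here I - L is upper triangular,
# so back-substitution suffices.
x2 = y[1] / (1 - L[1][1])
x1 = (y[0] + L[0][1] * x2) / (1 - L[0][0])
x_star = [x1, x2]

n = 20
S = [0.0, 0.0]                          # running partial sum of the L_j y0
v = y0[:]                               # current exact power L^j y0
eps_sum = 0.0                           # accumulates ||L_j y0 - L^j y0||
for j in range(n + 1):
    vj = [round(c, 6) for c in v]       # the perturbed term L_j y0
    eps_sum += norm([vj[0] - v[0], vj[1] - v[1]])
    S = [S[0] + vj[0], S[1] + vj[1]]
    v = matvec(L, v)

err = norm([S[0] - x_star[0], S[1] - x_star[1]])
bound = (eps_sum
         + (1 - normL**(n + 1)) / (1 - normL) * norm([y0[0] - y[0], y0[1] - y[1]])
         + normL**(n + 1) / (1 - normL) * norm(y))
print("error:", err, " bound of Theorem 1:", bound)
```

Here the dominant contribution to the bound comes from the perturbation $y_0 - y$, exactly as the middle term of the estimate in Theorem 1 predicts.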

#### **3. Numerical Ideas behind the Algorithm for the Equation** *y* **+** *Lx* **=** *x*

In view of Corollary 1 and under its hypotheses, we can approximate the fixed point $x^\bullet$ of $A$ by a series close to the geometric one:

$$y_0 \in X \leadsto \begin{array}{ccccccccc} y_0 & Ly_0 & L^2y_0 & \dots & L^ny_0 & \leadsto & \sum_{j=0}^n L^jy_0 & \approx & x^\bullet \\ L_0y_0 & L_1y_0 & L_2y_0 & \dots & L_ny_0 & \leadsto & \sum_{j=0}^n L_jy_0 & \approx & \sum_{j=0}^n L^jy_0 \end{array}$$

In order to derive

$$\sum\_{j=0}^{n} L\_j y\_0 \approx \sum\_{j=0}^{n} L^j y\_0$$

an approximation as that given in (1) is required. To this end, a possible tool appears provided by the Schauder bases, since they give an explicit linear approximation of any element of a Banach space by means of the associated projections, which is compatible with the continuity and affinity of the operator. What is more, in the case of classic bases, we easily obtain approximations of (the linear part of) *A* and its powers.

Thus, before continuing, we revise some of the basic notions of Schauder bases that we are going to need in the design of our algorithm. A sequence $\{e_j\}_{j\in\mathbb{N}}$ in a Banach space $X$ is a Schauder basis if every element $x \in X$ can be uniquely represented as

$$x = \sum_{j=1}^{\infty} \alpha_j e_j,$$

for a sequence of real numbers $\{\alpha_j\}_{j\in\mathbb{N}}$. If we define for each $j \in \mathbb{N}$ the linear operator $P_j : X \longrightarrow X$, known as the $j$-th projection associated with the basis, as

$$P_j x := \sum_{k=1}^{j} \alpha_k e_k,$$

for such an *x*, it is easy to prove, as a consequence of the Baire lemma, that it is a continuous operator and, in view of the representation of *x* in terms of the elements of the basis,

$$\lim\_{j \to \infty} ||P\_j x - x|| = 0.$$

With the aid of a Schauder basis, we can approximate $Lx$ by $L(P_jx)$, which, on occasion, is easy to calculate. Making all of this concrete for one type of affine equation, the linear Fredholm integral equation, is the objective of the following section.
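For the Faber–Schauder basis of $C[0,1]$ recalled in the next section, the projection $P_jx$ is just the piecewise linear interpolant of $x$ at the first $j$ dyadic nodes, so the convergence $\|P_jx - x\| \to 0$ can be observed directly. A Python sketch (the paper's computations use Mathematica; the test function $x(t) = t^2$ is an arbitrary choice):

```python
# Sketch: projections P_j associated with the Faber-Schauder basis of C[0,1].
# For this basis, P_j x is the piecewise linear interpolant of x at the first
# j dyadic nodes, so ||P_j x - x||_inf -> 0.  Test function chosen arbitrarily.

def dyadic_nodes(m):
    """First m nodes in the usual ordering: 0, 1, 1/2, 1/4, 3/4, 1/8, ..."""
    nodes = [0.0, 1.0]
    level = 1
    while len(nodes) < m:
        for num in range(1, 2**level, 2):        # odd numerators only
            nodes.append(num / 2**level)
            if len(nodes) == m:
                break
        level += 1
    return nodes[:m]

def project(x, j, t):
    """Evaluate P_j x at t: piecewise linear interpolation at the first j nodes."""
    pts = sorted((s, x(s)) for s in dyadic_nodes(j))
    for (s0, v0), (s1, v1) in zip(pts, pts[1:]):
        if s0 <= t <= s1:
            return v0 + (v1 - v0) * (t - s0) / (s1 - s0)
    raise ValueError("t outside [0, 1]")

x = lambda t: t * t                              # element of C[0,1]
grid = [i / 200 for i in range(201)]
for j in (3, 5, 9, 17):                          # j = 2^k + 1 nodes
    err = max(abs(project(x, j, t) - x(t)) for t in grid)
    print(j, err)
```

The printed maximum errors decrease with $j$, in agreement with the convergence of the projections stated above.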

#### **4. Algorithm to Approximate the Solution of a Linear Fredholm Integral Equation**

In the rest of this paper, we focus our efforts on realizing everything that we explained thus far in order to address the study of a specific problem, the numerical resolution of a linear Fredholm integral equation, in two distinct settings.

Let $X = C[a, b]$ or $X = L_p[a, b]$ $(1 < p < \infty)$, and let $k \in C([a, b]^2)$ or $k \in L_\infty([a, b]^2)$, respectively, and $y \in X$. Then we consider the corresponding linear Fredholm integral equation

$$\mathbf{x}(t) = \mathbf{y}(t) + \int\_{a}^{b} k(t, s)\mathbf{x}(s)ds,\tag{3}$$

where *x* ∈ *X* is the unknown function. In view of the previous results, we consider the continuous and linear operator *L* : *X* −→ *X* defined at each *y*<sup>0</sup> ∈ *X* as

$$L y\_0 := \int\_a^b k(\cdot, s) y\_0(s) ds.$$

Then, given *<sup>j</sup>* <sup>∈</sup> <sup>N</sup>,

$$L^j y_0 = \int_a^b \left( \cdots \int_a^b k(\cdot, t_1) k(t_1, t_2) \cdots k(t_{j-1}, t_j) y_0(t_j)\, dt_j \right) \cdots dt_1.$$

From now on, in both cases (*X* = *C*[*a*, *b*] or *X* = *Lp*[*a*, *b*]), we assume that

$$\|k\| (b-a) < 1,$$

since such a condition is sufficient for the validity of $\|L\| < 1$ and is very easy to check. Furthermore, for each $d \in \mathbb{N}$, we fix a Schauder basis $\{e_j^{(d)}\}_{j\in\mathbb{N}}$ in $C([a, b]^d)$ (if $X = C[a, b]$) or in $L_p([a, b]^d)$ (if $X = L_p[a, b]$), and we denote the projections associated with this basis by $\{P_j^{(d)}\}_{j\in\mathbb{N}}$.

With all of this, we are now ready to define the approximate operators $L_j$: for each $y_0 \in X$ and $j \in \mathbb{N}$, we take

$$\Phi_j(y_0)(t, t_1, \dots, t_j) := k(t, t_1)k(t_1, t_2)\cdots k(t_{j-1}, t_j)y_0(t_j)$$

and, for fixed $r_j \in \mathbb{N}$, $L_j : X \longrightarrow X$ is given by

$$L_j y_0 := \int_a^b \left( \cdots \int_a^b P_{r_j}^{(j+1)}\left(\Phi_j(y_0)(\cdot, t_1, \dots, t_j)\right) dt_j \right) \cdots dt_1. \tag{4}$$

Now we can apply Corollary 1, since, provided that each $r_j$ is big enough, $\|L_jy_0 - L^j y_0\| < \varepsilon_j$.

**Corollary 2.** *For any ε* > 0 *and y*<sup>0</sup> ∈ *X, there are natural numbers n and r*0, ... ,*rn in such a way that if x*• *is the unique solution to the linear Fredholm integral Equation* (3)*, then*

$$\left\| \sum_{j=0}^{n} L_j y_0 - x^{\bullet} \right\| < \varepsilon,$$

*where L*<sup>0</sup> = *I and for each j* ≥ 1*, the operator Lj is defined by* (4)*.*

Thus, we have established the following algorithm (Algorithm 1).

**Algorithm 1:** Algorithm for approximating the solution of the linear Fredholm integral equation.

Choose $y_0$, $k$, $n$, $\varepsilon$, $\varepsilon_0, \dots, \varepsilon_n$, $r_0, \dots, r_n \in \mathbb{N}$, and $\{e_i^{(d)}\}_{i\in\mathbb{N}}$, $d = 1, \dots, n+1$;
$L_0 \leftarrow I$; $j \leftarrow 1$;
**while** $\left\|\sum_{j=0}^{n} L_j y_0 - x^\bullet\right\| \ge \varepsilon$ **and** $j \le n$:
  $\Phi_j(y_0)(t, t_1, \dots, t_j) \leftarrow k(t, t_1)k(t_1, t_2)\cdots k(t_{j-1}, t_j)y_0(t_j)$;
  $L_j y_0 \leftarrow \int_a^b \left(\cdots \int_a^b P_{r_j}^{(j+1)}\left(\Phi_j(y_0)(\cdot, t_1, \dots, t_j)\right) dt_j\right)\cdots dt_1$;
  $j \leftarrow j + 1$;
**end (while)**
**sol_approx** $\leftarrow \sum_{j=0}^{n} L_j y_0$.

Observe that

$$\|\mathbf{sol\_approx} - x^\bullet\| < \varepsilon$$

and that for an appropriate choice of the bases {*e* (*d*) *<sup>j</sup>* }*j*∈N, the calculations are immediate, as justified below.
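As a simplified sanity check of this construction, one can replace the Schauder-basis projections by a plain composite trapezoidal rule (an assumption made only for this sketch, not the paper's method) and sum the resulting approximations of $L^j y_0$. For the illustrative kernel $k(t,s) = \frac{1}{2}$ and $y(t) = 1$ on $[0,1]$, so that $\|k\|(b-a) = \frac{1}{2} < 1$, the exact solution of (3) is the constant $x^\bullet \equiv 2$:

```python
# Minimal sketch of Algorithm 1 with the projections P_r replaced by a plain
# composite trapezoidal rule (an illustrative simplification, not the paper's
# Schauder-basis construction).  Kernel and datum chosen so that the exact
# solution of  x = y + integral of k*x  is known:  k = 1/2, y = 1  =>  x* = 2.

m = 64                                   # number of quadrature subintervals
grid = [i / m for i in range(m + 1)]
w = [(0.5 if i in (0, m) else 1.0) / m for i in range(m + 1)]  # trapezoid weights

def k(t, s):
    return 0.5

def apply_L(u):
    """Approximate (L u)(t) = integral_0^1 k(t, s) u(s) ds on the grid."""
    return [sum(wi * k(t, s) * us for wi, s, us in zip(w, grid, u)) for t in grid]

y0 = [1.0 for _ in grid]
S = y0[:]                                # partial sum, j = 0 term included
v = y0[:]                                # current iterate L^j y0
n = 30
for _ in range(n):
    v = apply_L(v)
    S = [a + b for a, b in zip(S, v)]

print(max(abs(val - 2.0) for val in S))  # distance to the exact solution x* = 2
```

Since the kernel is constant, the trapezoidal rule integrates each iterate exactly, and the partial sum reproduces the geometric series $\sum_j (1/2)^j = 2$ up to the truncation tail $2^{-n}$.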

Returning to the spaces considered in order to study the linear Fredholm integral equation, $X = C[a, b]$ or $X = L_p[a, b]$, we recall how it is possible to construct bases $\{e_j^{(d)}\}_{j\in\mathbb{N}}$ in $C([a, b]^d)$ or $L_p([a, b]^d)$, respectively, tensorially from a basis $\{e_j^{(1)}\}_{j\in\mathbb{N}}$ of the corresponding one-dimensional space.

Specifically, given $d \in \mathbb{N}$, $d \ge 2$, we consider in $\mathbb{N}^d$ the square ordering introduced in [17] in an inductive form: for $d = 2$, $(1,1), (1,2), (2,2), (2,1), (1,3), (2,3), (3,3), (3,2), \dots$, and given the ordering $o_1, o_2, \dots$ of $\mathbb{N}^{d-1}$, the order in $\mathbb{N}^d$ is $(o_1, 1), (o_1, 2), (o_2, 2), (o_2, 1), (o_1, 3), (o_2, 3), (o_3, 3), \dots$. Graphically,

$$\begin{array}{cccc} (o_1,1) & (o_1,2) & (o_1,3) & (o_1,4) \\ (o_2,1) & (o_2,2) & (o_2,3) & (o_2,4) \\ (o_3,1) & (o_3,2) & (o_3,3) & (o_3,4) \\ (o_4,1) & (o_4,2) & (o_4,3) & (o_4,4) \end{array}$$

where each new square $r$ is visited downwards along the column $(o_1,r), \dots, (o_r,r)$ and then leftwards along the row $(o_r,r-1), \dots, (o_r,1)$.

Thus, we establish a bijection $\tau : \mathbb{N} \longrightarrow \mathbb{N}^d$ that assigns to each $j \in \mathbb{N}$ a $d$-tuple

$$\tau(j) = (\alpha_1, \dots, \alpha_d),$$

and for such a *j*, we define

$$e_j^{(d)}(t_1, \dots, t_d) := e_{\alpha_1}(t_1)\cdots e_{\alpha_d}(t_d), \qquad (t_1, \dots, t_d) \in [a, b]^d. \tag{5}$$
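The square ordering is easy to generate programmatically; the following sketch (an illustration in Python; the paper's computations use Mathematica) produces it for $d = 2$, with the case $d > 2$ following the inductive rule above:

```python
# Generate the square ordering of N^2 described in the text:
# (1,1), (1,2), (2,2), (2,1), (1,3), (2,3), (3,3), (3,2), (3,1), ...
# Each new "square" r first fills column r downwards, then row r leftwards.

def square_ordering(count):
    order = [(1, 1)]
    r = 2
    while len(order) < count:
        for i in range(1, r + 1):            # (1,r), (2,r), ..., (r,r)
            order.append((i, r))
        for j in range(r - 1, 0, -1):        # (r,r-1), ..., (r,1)
            order.append((r, j))
        r += 1
    return order[:count]

print(square_ordering(9))
```

After $r$ full squares, exactly $r^2$ tuples have been enumerated, which is what makes the reordering by squares used in Section 5 natural.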

The usual Schauder basis in $C[a, b]$ is the Faber–Schauder system, and in $L_p[a, b]$, the Haar system [18]. More specifically, and assuming without loss of generality that $a = 0$ and $b = 1$, for the Faber–Schauder system we start from the nodes $\{t_j\}_{j\in\mathbb{N}}$, which are the points of $[a, b]$ arranged dyadically, and the basis functions $\{e_j^{(1)}\}_{j\in\mathbb{N}}$ are continuous piecewise linear functions, the so-called hat functions, satisfying for each $j \in \mathbb{N}$

$$e\_j^{(1)}(t\_j) = 1$$

and

$$1 \le k < j \implies e_j^{(1)}(t_k) = 0.$$

On the other hand, if $A$ is a non-empty subset of $[0, 1]$ and $\delta_A : [0, 1] \longrightarrow \mathbb{R}$ is the function defined for each $0 \le t \le 1$ as

$$\delta\_A(t) := \begin{cases} 1, & \text{if } t \in A \\ 0, & \text{if } t \notin A \end{cases}$$

and $\varphi : [0, 1] \longrightarrow \mathbb{R}$ is the function such that for each $0 \le t \le 1$

$$\varphi(t) := \delta_{[0,0.5)}(t) - \delta_{[0.5,1]}(t),$$

then the Haar system is given by

$$e\_1^{(1)} := 1$$

and for *<sup>j</sup>* <sup>≥</sup> 2, written uniquely as *<sup>j</sup>* <sup>=</sup> <sup>2</sup>*<sup>k</sup>* <sup>+</sup> *<sup>r</sup>* <sup>+</sup> 1, with *<sup>k</sup>* <sup>=</sup> 0, 1, . . . and *<sup>r</sup>* <sup>=</sup> 0, 1, . . . , 2*<sup>k</sup>* <sup>−</sup> 1,

$$e\_j^{(1)}(\cdot) := \varphi(2^k(\cdot) - r).$$

In both cases, the tensorial sequences defined as in (5) constitute Schauder bases in their respective spaces, *C*[*a*, *b*]*<sup>d</sup>* and *Lp*[*a*, *b*]*<sup>d</sup>* [17,19]. However, what really makes these bases useful in our Algorithm 1 is precisely that the calculation of the approximate operators *Lj* is very easy, since the basis functions *e*(*d*)*<sup>j</sup>* have separated variables and each factor is immediately integrable. Let us mention that these Schauder bases allow us to preserve the linear convergence guaranteed by the geometric series theorem.
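As an illustration, the hat functions and their tensor products in (5) are straightforward to code. The following Python sketch (the function names are ours, not the paper's) builds the dyadically ordered nodes of [0, 1] and evaluates a Faber–Schauder basis function and a tensor-product basis element:

```python
def dyadic_nodes(m):
    """First 2**m + 1 nodes of [0, 1] in dyadic order: 0, 1, 1/2, 1/4, 3/4, ..."""
    nodes = [0.0, 1.0]
    k = 1
    while len(nodes) < 2**m + 1:
        nodes += [(2*r + 1) / 2**k for r in range(2**(k - 1))]
        k += 1
    return nodes

def hat(j, nodes, t):
    """Faber-Schauder function e_j^(1): equals 1 at nodes[j-1] and 0 at every
    earlier node (1-indexed j, as in the text)."""
    if j == 1:
        return 1.0            # the first basis function is the constant 1
    if j == 2:
        return t              # linear: 0 at t_1 = 0, 1 at t_2 = 1
    tj = nodes[j - 1]
    prev = sorted(nodes[:j - 1])
    # support of the hat: the gap between the nearest earlier nodes around t_j
    left = max(p for p in prev if p < tj)
    right = min(p for p in prev if p > tj)
    if left <= t <= tj:
        return (t - left) / (tj - left)
    if tj < t <= right:
        return (right - t) / (right - tj)
    return 0.0

def e_d(alphas, nodes, ts):
    """Tensor-product basis element of (5): a product of univariate factors."""
    out = 1.0
    for a, t in zip(alphas, ts):
        out *= hat(a, nodes, t)
    return out
```

The separated-variable structure is visible in `e_d`: each factor depends on a single coordinate, which is what makes the integrals in the operators *Lj* immediate.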

#### **5. Numerical Examples**

We now show the numerical results obtained in several specific examples. Beforehand, let us mention that reordering a finite number of Schauder basis elements produces another Schauder basis, which can be interesting from a computational point of view. Thus, for each *r* ∈ N, we reordered the bases of *C*[*a*, *b*]*<sup>d</sup>* and *Lp*[*a*, *b*]*<sup>d</sup>* so that the first *r<sup>d</sup>* elements correspond to the tuples (*α*1, *α*2, ... , *αd*) with 1 ≤ *α<sup>i</sup>* ≤ *r*. For these reordered bases, we maintain the same previous notation, {*e*(*d*)*<sup>j</sup>*}*j*∈N for the basis and {*P*(*d*)*<sup>j</sup>*}*j*∈N for the sequence of projections. Furthermore, given *n*, *r* ∈ N, we write

$$x^{(n,r)} := \sum\_{j=0}^{n} L\_j y\_0,$$

where the indices *rj* involved in the definition of *Lj* are given by *τ*(*rj*) = (*r*, ... , *r*), with *r* repeated *j* + 1 times.

In each example, we consider *y*<sup>0</sup> = *y*, since another choice of *y*<sup>0</sup> is of more theoretical interest: as indicated previously, it addresses the case in which some kind of error is present in the function *y*. All calculations were carried out with the Mathematica 12 software.

**Example 1.** *We consider the equation of Example 1 in [10]:*

$$\mathbf{x}(t) = \frac{30\pi t - \sin(\pi t)}{15} + \frac{1}{15} \int\_0^1 t \cos(\pi t s^2) \mathbf{x}(s) \, ds$$

*whose solution is x*•(*t*) = 2*πt.*

*The errors obtained with our method are comparable to those obtained in the reference with m* = 4 *and p* = 2*, as shown in Table 1. The advantage in our case is that it is not necessary to start with an approximate solution "close enough" to the exact solution, nor is it necessary to solve any system of linear equations.*
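Since the integral operator in Example 1 has norm at most 1/15 < 1, plain successive approximations already converge linearly, as the geometric series argument guarantees. The following Python sketch is a simplified numerical stand-in for the paper's Schauder-basis algorithm (trapezoidal quadrature replaces the basis projections; the grid size and iteration count are our choices):

```python
import numpy as np

# Example 1:  x(t) = (30*pi*t - sin(pi*t))/15 + (1/15) * int_0^1 t*cos(pi*t*s^2) x(s) ds,
# with exact solution x(t) = 2*pi*t.  The integral operator has norm <= 1/15 < 1,
# so successive approximations x_{k+1} = y + K x_k converge linearly.
n = 201
t = np.linspace(0.0, 1.0, n)
s = t                                              # shared quadrature grid
y = (30*np.pi*t - np.sin(np.pi*t)) / 15            # the affine part

# kernel matrix K[i, j] = (1/15) * t_i * cos(pi * t_i * s_j**2)
K = t[:, None] * np.cos(np.pi * t[:, None] * s[None, :]**2) / 15

# trapezoidal weights for int_0^1 (.) ds
w = np.full(n, 1.0 / (n - 1))
w[0] = w[-1] = 0.5 / (n - 1)

x = y.copy()                                       # x_0 := y, as in the experiments
for _ in range(30):
    x = y + (K * x[None, :]) @ w                   # x_{k+1} = y + K x_k

err = np.max(np.abs(x - 2*np.pi*t))                # quadrature-limited error
```

After a handful of iterations the error stagnates at the quadrature level, reflecting the linear (geometric) convergence of the fixed-point scheme itself.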

**Table 1.** *<sup>x</sup>*• <sup>−</sup> *<sup>x</sup>*(*n*,*r*) for Example <sup>1</sup> using the usual basis in *<sup>C</sup>*[0, 1].


**Example 2.** *The following equation is also extracted from the same reference (Example 2, [10]):*

$$\mathbf{x}(t) = t^2 - t + 1 + \frac{1}{4} \int\_0^1 e^{t s} \mathbf{x}(s) \, ds.$$

*As in the referenced paper, since the solution of this equation is not known, we consider the operator F* : *C*[0, 1] → *C*[0, 1] *given by*

$$F(\mathbf{x})(t) = \mathbf{x}(t) - t^2 + t - 1 - \frac{1}{4} \int\_0^1 e^{t\mathbf{s}} \mathbf{x}(\mathbf{s}) d\mathbf{s}$$

*and we show <sup>F</sup>*(*x*(*n*,*r*)) *for different values of n and r in Table 2.*

*The errors obtained are similar to those reported in Table 2 of [10] but with the same advantage mentioned above.*

**Table 2.** *<sup>F</sup>*(*x*(*n*,*r*)) for Example <sup>2</sup> using the usual basis in *<sup>C</sup>*[0, 1].


**Example 3.** *The following equation,*

$$x(t) = \frac{2t^2 - 1}{3} + \frac{2}{3}e^t(t - 1) + \frac{1}{3}\int\_0^1 t^3 e^{ts} x(s)\, ds$$

*is taken from [20], Example 3. Its solution is x*•(*t*) = *t*<sup>2</sup> − 1*. See Table 3 for the error generated by Algorithm 1.*


**Table 3.** *<sup>x</sup>*• <sup>−</sup> *<sup>x</sup>*(*n*,*r*) for Example <sup>3</sup> using the usual basis in *<sup>C</sup>*[0, 1].

**Example 4.** *This is a standard test problem, and it arises in electrostatics (see [21]) where it is called Love's equation.*

$$x(t) = y(t) + \frac{\delta}{\pi} \int\_0^1 \frac{x(s)}{\delta^2 + (t - s)^2} ds.$$

*We consider δ* = −1 *and y*(*t*) = 1 + (1/*π*)(arctan(1 − *t*) + arctan(*t*)) *as in Example 3.2 of [22]. In this case, the exact solution is x*•(*t*) = 1*.*

*The errors—see Tables 4 and 5—are similar to those obtained by the Haar wavelet method and the rationalized Haar functions method (see Table 1 in [22]), although their computation requires solving some high-order systems of linear equations.*

**Table 4.** *<sup>x</sup>*• <sup>−</sup> *<sup>x</sup>*(*n*,*r*) for Example <sup>4</sup> using the usual basis in *<sup>C</sup>*[0, 1].


**Table 5.** *<sup>x</sup>*• <sup>−</sup> *<sup>x</sup>*(*n*,*r*) for Example <sup>4</sup> using the usual basis in *<sup>L</sup>*2[0, 1].


**Example 5.** *We now consider Example 2 of [23], whose solution is x*•(*t*) = sin(2*πt*)*:*

$$\mathbf{x}(t) = \sin(2\pi t) + \int\_0^1 (t^2 - t - s^2 - s)\mathbf{x}(s)ds.$$

*We observe that the numerical results obtained with our method (Table 6) significantly improve those obtained in the reference.*

**Table 6.** *<sup>x</sup>*• <sup>−</sup> *<sup>x</sup>*(*n*,*r*) for Example <sup>5</sup> using the usual basis in *<sup>L</sup>*2[0, 1].


#### **6. Conclusions**

In this paper, we present an algorithm for iteratively approximating the fixed point of a continuous coercive affine operator. Its design is based on a perturbed version of the classic geometric series theorem, the error control that this provides, and the use of certain Schauder bases. We illustrate all of this, by means of several examples, for a wide class of affine problems: the linear Fredholm integral equations. The low computational cost that our algorithm entails makes it particularly efficient. We consider that future research could focus on extending the algorithm to other types of integral and even integro-differential equations.

**Author Contributions:** Conceptualization, M.I.B. and M.R.G.; methodology, M.I.B. and M.R.G.; software, M.I.B. and M.R.G.; validation, M.I.B. and M.R.G.; formal analysis, M.I.B. and M.R.G.; investigation, M.I.B. and M.R.G.; writing—original draft preparation, M.I.B. and M.R.G.; writing—review and editing, M.I.B. and M.R.G.; supervision, M.I.B. and M.R.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was partially supported by Junta de Andalucía, Project "Convex and numerical analysis", reference FQM359, and by the "María de Maeztu" Excellence Unit IMAG, reference CEX2020-001105-M, funded by MCIN/AEI/10.13039/501100011033/.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## **Gradient-Based Optimization Algorithm for Solving Sylvester Matrix Equation**

**Juan Zhang 1,\* and Xiao Luo <sup>2</sup>**


**Abstract:** In this paper, we transform the problem of solving the Sylvester matrix equation into an optimization problem through the Kronecker product primarily. We utilize the adaptive accelerated proximal gradient and Newton accelerated proximal gradient methods to solve the constrained non-convex minimization problem. Their convergent properties are analyzed. Finally, we offer numerical examples to illustrate the effectiveness of the derived algorithms.

**Keywords:** Sylvester matrix equation; Kronecker product; adaptive accelerated proximal gradient method; Newton-accelerated proximal gradient method

**MSC:** 15A24; 65F45

#### **1. Introduction**

Matrix equations are ubiquitous in signal processing [1], control theory [2], and linear systems [3]. Most time-dependent models accounting for the prediction, simulation, and control of real-world phenomena may be represented as linear or nonlinear dynamical systems. Therefore, the relevance of matrix equations within engineering applications largely explains the great effort put forth by the scientific community into their numerical solution. Linear matrix equations have an important role in the stability analysis of linear dynamical systems and the theoretical development of the nonlinear system. The Sylvester matrix equation was first proposed by Sylvester and produced from the research of relevant fields in applied mathematical cybernetics. It is a famous matrix equation that occurs in linear and generalized eigenvalue problems for the computation of invariant subspaces using Riccati equations [4–6]. The Sylvester matrix equation takes part in linear algebra [7–9], image processing [10], model reduction [11], and numerical methods for differential equations [12,13].

We consider the Sylvester matrix equation of the form

$$AX + XB = C, \tag{1}$$

where *<sup>A</sup>* <sup>∈</sup> <sup>R</sup>*m*×*m*, *<sup>B</sup>* <sup>∈</sup> <sup>R</sup>*n*×*n*, *<sup>C</sup>* <sup>∈</sup> <sup>R</sup>*m*×*<sup>n</sup>* are given matrices, and *<sup>X</sup>* <sup>∈</sup> <sup>R</sup>*m*×*<sup>n</sup>* is an unknown matrix to be solved. We discuss a special form of the Sylvester matrix equation, in which *A* and *B* are symmetric positive definite.

Recently, there has been a lot of discussion on the solution and numerical calculation of the Sylvester matrix equation. The standard methods for solving this equation are the Bartels–Stewart method [14] and the Hessenberg–Schur method [15], which are efficient for small and dense system matrices. When system matrices are small, the block Krylov subspace methods [16,17] and global Krylov subspace methods [18] are proposed. These methods use the global Arnoldi process, block Arnoldi process, or nonsymmetric block

**Citation:** Zhang, J.; Luo, X.; Gradient-Based Optimization Algorithm for Solving Sylvester Matrix Equation. *Mathematics* **2022**, *10*, 1040. https://doi.org/10.3390/ math10071040

Academic Editors: Maria Isabel Berenguer and Manuel Ruiz Galán

Received: 14 February 2022 Accepted: 22 March 2022 Published: 24 March 2022


Lanczos process to produce low-dimensional Sylvester matrix equations. More feasible methods for solving large and sparse problems are iterative methods. When system matrices are large, there are some effective methods such as the alternating direction implicit (ADI) method [19], global full orthogonalization method, global generalized minimum residual method [20], gradient-based iterative method [21], and global Hessenberg and changing minimal residual with Hessenberg process method [22]. When system matrices are low-rank, the ADI method [23], block Arnoldi method [17], preconditioned block Arnoldi method [24], and extended block Arnoldi method [25] and its variants [26,27], including the global Arnoldi method [28,29] and extended global Arnoldi method [25], are proposed to obtain the low-rank solution.

The adaptive accelerated proximal gradient (A-APG) method [30] is an efficient numerical method for calculating the steady states of the minimization problem, motivated by the accelerated proximal gradient (APG) method [31], which has wide applications in image processing and machine learning. In each iteration, the A-APG method takes the step size by using a line search initialized with the Barzilai–Borwein (BB) step [32] to accelerate the numerical speed. Moreover, as the traditional APG method is proposed for the convex problem and its oscillation phenomenon slows down the convergence, the restart scheme has been used for speeding up the convergence. For more details, one can refer to [30] and the references therein.

The main contribution is to study gradient-based optimization methods such as the A-APG and Newton-APG methods for solving the Sylvester matrix equation through transforming this equation into an optimization problem by using Kronecker product. The A-APG and Newton-APG methods are theoretically guaranteed to converge to a global solution from an arbitrary initial point and achieve high precision. These methods are especially efficient for large and sparse coefficient matrices.

The rest of this paper is organized as follows. In Section 2, we transform this equation into an optimization problem by using the Kronecker product. In Section 3, we apply A-APG and Newton-APG algorithms to solve the optimization problem and compare them with other methods. In Section 4, we focus on the convergence analysis of the A-APG method. In Section 5, the computational complexity of these algorithms is analyzed exhaustively. In Section 6, we offer corresponding numerical examples to illustrate the effectiveness of the derived methods.

Throughout this paper, let R*n*×*<sup>m</sup>* be the set of all *n* × *m* real matrices. *In* is the identity matrix of order *n*. If *A* ∈ R*n*×*n*, the symbols *A<sup>T</sup>*, *A*<sup>−1</sup>, ‖*A*‖, and *tr*(*A*) denote the transpose, the inverse, the 2-norm, and the trace of *A*, respectively. The inner product in the matrix space E is ⟨*x*, *y*⟩ = *tr*(*x<sup>T</sup>y*), ∀*x*, *y* ∈ E.

#### **2. The Variant of an Optimization Problem**

In this section, we transform the Sylvester equation into an optimization problem. We recall some definitions and lemmas.

**Definition 1.** *Let <sup>Y</sup>* = (*yij*) <sup>∈</sup> <sup>R</sup>*m*×*n*, *<sup>Z</sup>* <sup>∈</sup> <sup>R</sup>*p*×*q, the Kronecker product of <sup>Y</sup> and <sup>Z</sup> be defined by*

$$Y \otimes Z = \begin{bmatrix} y\_{11}Z & y\_{12}Z & \cdots & y\_{1n}Z \\ y\_{21}Z & y\_{22}Z & \cdots & y\_{2n}Z \\ \vdots & \vdots & \vdots & \vdots \\ y\_{m1}Z & y\_{m2}Z & \cdots & y\_{mn}Z \end{bmatrix}.$$

**Definition 2.** *If Y* <sup>∈</sup> <sup>R</sup>*m*×*n, then the straightening operator vec* : <sup>R</sup>*m*×*<sup>n</sup>* −→ <sup>R</sup>*mn of Y is*

$$vec(\boldsymbol{Y}) = (y\_1^T, y\_2^T, \dots, y\_n^T)^T.$$

**Lemma 1.** *Let Y* <sup>∈</sup> <sup>R</sup>*l*×*m*, *<sup>Z</sup>* <sup>∈</sup> <sup>R</sup>*m*×*n*, *<sup>W</sup>* <sup>∈</sup> <sup>R</sup>*n*×*k*, *then*

$$vec(\mathbf{Y}ZW) = (\mathcal{W}^T \otimes \mathbf{Y})vec(Z).$$
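The identity in Lemma 1 is easy to verify numerically. Note that *vec* stacks columns, which in NumPy corresponds to Fortran ordering; a small sketch with illustrative random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.standard_normal((3, 4))   # l x m
Z = rng.standard_normal((4, 5))   # m x n
W = rng.standard_normal((5, 2))   # n x k

# column-stacking straightening operator, matching Definition 2
vec = lambda M: M.flatten(order="F")

lhs = vec(Y @ Z @ W)              # vec(YZW)
rhs = np.kron(W.T, Y) @ vec(Z)    # (W^T kron Y) vec(Z)
print(np.allclose(lhs, rhs))      # True
```

The `order="F"` flag matters: the default C (row-major) flattening would implement a row-stacking operator, for which the Kronecker identity takes the transposed form.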

From Lemma 1, the Sylvester Equation (1) can be rewritten as

$$(I\_n \otimes A + B^T \otimes I\_m)\, vec(X) = vec(C). \tag{2}$$

**Lemma 2.** *Let A be a symmetric positive definite matrix; solving the equation Ax* = *b is equivalent to obtaining the minimum of ϕ*(*x*) = *x<sup>T</sup>Ax* − 2*b<sup>T</sup>x.*

According to Lemma 2 and Equation (2), define

$$\bar{A} = I\_n \otimes A + B^T \otimes I\_m, \qquad \bar{x} = vec(X), \qquad \bar{b} = vec(C).$$

Therefore, Equation (2) becomes *A*¯ *x*¯ = *b*¯. Obviously, if *A* and *B* are symmetric positive definite, then *A*¯ is symmetric positive definite. The variant of the Sylvester Equation (2) reduces to the optimization problem:

$$\begin{split} \min \varphi(\bar{x}) &= \min \left\{ \bar{x}^T \bar{A} \bar{x} - 2 \bar{b}^T \bar{x} \right\} \\ &= \min \left\{ vec(X)^T (I\_n \otimes A + B^T \otimes I\_m)\, vec(X) - 2\, vec(X)^T vec(C) \right\} \\ &= \min \left\{ vec(X)^T vec(AX) + vec(X)^T vec(XB) - 2\, vec(X)^T vec(C) \right\} \\ &= \min \left\{ tr(X^T A X) + tr(X^T X B) - 2\, tr(X^T C) \right\}. \end{split} \tag{3}$$

Using the calculation of the matrix differential from [33], we have the following propositions immediately.

**Proposition 1.** *If A* = (*aij*) ∈ R*m*×*n*, *X* = (*xij*) ∈ R*m*×*n*, *then* $\frac{\partial tr(A^T X)}{\partial X} = \frac{\partial tr(X^T A)}{\partial X} = A$.

**Proposition 2.** *If A* = (*aij*) <sup>∈</sup> <sup>R</sup>*m*×*m*, *<sup>X</sup>* = (*xij*) <sup>∈</sup> <sup>R</sup>*m*×*n, then <sup>∂</sup>tr*(*X<sup>T</sup> AX*) *<sup>∂</sup><sup>X</sup>* = *AX* + *<sup>A</sup>TX.*

**Proposition 3.** *If B* = (*bij*) ∈ R*n*×*n*, *X* = (*xij*) ∈ R*m*×*n*, *then* $\frac{\partial tr(X^T X B)}{\partial X} = XB + XB^T$.

Using Propositions 2 and 3, the gradient of the objective function (3) is

$$\nabla \varphi(X) = AX + XB + A^T X + XB^T - 2C. \tag{4}$$

By (4), the Hessian matrix is

$$\nabla^2 \varphi(X) = A + A^T + B + B^T. \tag{5}$$
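For symmetric *A* and *B*, the gradient (4) reduces to 2(*AX* + *XB* − *C*), so it vanishes precisely at the solution of Equation (1). A small NumPy check, solving the vectorized system (2) directly (dimensions and matrices are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 3
M = rng.standard_normal((m, m)); A = M @ M.T + m*np.eye(m)   # symmetric positive definite
N = rng.standard_normal((n, n)); B = N @ N.T + n*np.eye(n)   # symmetric positive definite
C = rng.standard_normal((m, n))

# solve the vectorized system (2): (I_n kron A + B^T kron I_m) vec(X) = vec(C)
Abar = np.kron(np.eye(n), A) + np.kron(B.T, np.eye(m))
X = np.linalg.solve(Abar, C.flatten(order="F")).reshape((m, n), order="F")

residual = np.linalg.norm(A @ X + X @ B - C)                 # Sylvester residual
grad = A @ X + X @ B + A.T @ X + X @ B.T - 2*C               # gradient (4)
print(residual, np.linalg.norm(grad))                        # both close to zero
```

Forming `Abar` explicitly costs *O*(*m*²*n*²) memory, which is exactly why the paper pursues gradient-based iterations that only ever touch *AX* and *XB*.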

#### **3. Iterative Methods**

In this section, we will introduce the adaptive accelerated proximal gradient (A-APG) method and the Newton-APG method to solve the Sylvester equation. Moreover, we compare the A-APG and Newton-APG methods with other existing methods.

#### *3.1. APG Method*

The traditional APG method [31] is designed for solving the composite convex problem:

$$\min\_{x \in \mathbb{H}} H(x) = g(x) + f(x),$$

where H is a finite-dimensional Hilbert space equipped with the inner product ⟨·, ·⟩, *g* and *f* are both continuous and convex, and ∇*f* is Lipschitz continuous with constant *L*. Given the initializations *x*<sup>1</sup> = *x*<sup>0</sup> and *t*<sup>0</sup> = 1, the APG method is

$$\begin{aligned} t\_k &= \left(1 + \sqrt{4t\_{k-1}^2 + 1}\right)/2, \\ Y\_k &= X\_k + \frac{t\_{k-1} - 1}{t\_k}(X\_k - X\_{k-1}), \\ X\_{k+1} &= \operatorname{Prox}\_g^{\alpha}(Y\_k - \alpha \nabla f(Y\_k)), \end{aligned}$$

where *α* ∈ (0, 1/*L*] and the mapping Prox*<sup>α</sup><sub>g</sub>*(·) : R*<sup>n</sup>* → R*<sup>n</sup>* is defined as

$$\operatorname{Prox}\_g^{\alpha}(x) = \operatorname\*{argmin}\_{y} \left\{ g(y) + \frac{1}{2\alpha}\|y - x\|^2 \right\}.$$

Since our minimization problem is smooth (there is no nonsmooth term *g*), the proximal step reduces to the explicit scheme, a simple but effective approach for the minimization problem. Given an initial value *Y*<sup>0</sup> and the step *αk*, the explicit scheme is

$$Y\_{k+1} = Y\_k - \alpha\_k \nabla \varphi(Y\_k),\tag{6}$$

where *Yk* is the approximation solution. The explicit scheme satisfies the sufficient decrease property using the gradient descent (GD) method.

Let *Xk* and *Xk*−<sup>1</sup> be the current and previous states and the extrapolation weight be *wk*. Using the explicit method (6), the APG iterative scheme is

$$\begin{aligned} w\_k &= \frac{k-2}{k+1}, \\ Y\_k &= (1 + w\_k)X\_k - w\_k X\_{k-1}, \\ Y\_{k+1} &= Y\_k - \alpha\_k \nabla \varphi(Y\_k). \end{aligned} \tag{7}$$

Together with the standard backtracking, we adopt the step size *α<sup>k</sup>* when the following condition holds:

$$\varphi(Y\_k) - \varphi(Y\_{k+1}) \ge \eta \|Y\_{k+1} - Y\_k\|^2,\tag{8}$$

for some *η* > 0.

Combining (7) and (8), the APG algorithm is summarized in Algorithm 1.

#### **Algorithm 1** APG algorithm.

**Require:** *X*0, *tol*, *α*0, *η* > 0, *β* ∈ (0, 1), and *k* = 1.
1: **while** the stop condition is not satisfied **do**
2: Update *Yk* via Equation (7);
3: **if** Equation (8) holds **then**
4: *break*
5: **else**
6: *α<sup>k</sup>* = *βαk*;
7: Calculate *Yk*<sup>+</sup><sup>1</sup> via (7);
8: *k* = *k* + 1.
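A minimal Python sketch of this loop for the objective (3) may clarify how the extrapolation (7) and the backtracking condition (8) interact. The parameter defaults and stopping rule below are our illustrative choices, not the paper's:

```python
import numpy as np

def phi(X, A, B, C):
    """Objective (3): tr(X^T A X) + tr(X^T X B) - 2 tr(X^T C)."""
    return np.trace(X.T @ A @ X) + np.trace(X.T @ X @ B) - 2*np.trace(X.T @ C)

def grad_phi(X, A, B, C):
    """Gradient (4); for symmetric A, B it equals 2(AX + XB - C)."""
    return A @ X + X @ B + A.T @ X + X @ B.T - 2*C

def apg(A, B, C, alpha0=1e-2, eta=1e-4, beta=0.5, tol=1e-7, max_iter=5000):
    m, n = C.shape
    Xp = X = np.zeros((m, n))
    for k in range(1, max_iter + 1):
        w = (k - 2) / (k + 1)                      # extrapolation weight in (7)
        Y = (1 + w) * X - w * Xp
        alpha, g = alpha0, grad_phi(Y, A, B, C)
        Ynew = Y - alpha * g
        # shrink alpha until the sufficient-decrease condition (8) holds
        while phi(Y, A, B, C) - phi(Ynew, A, B, C) < eta * np.linalg.norm(Ynew - Y)**2:
            alpha *= beta
            Ynew = Y - alpha * g
        Xp, X = X, Ynew
        if np.linalg.norm(grad_phi(X, A, B, C)) < tol:
            break
    return X
```

On small symmetric positive definite test matrices this drives the Sylvester residual ‖*AX* + *XB* − *C*‖ to the stopping tolerance.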

#### *3.2. Restart APG Method*

Recently, an efficient and convergent numerical algorithm has been developed for solving a discretized phase-field model by combining the APG method with the restart technique [30]. Unlike the APG method, the restart technique involves choosing *Xk*<sup>+</sup><sup>1</sup> = *Yk*<sup>+</sup><sup>1</sup> whenever the following condition holds:

$$\left\|\varphi(X\_k) - \varphi(Y\_{k+1})\right\| \geqslant \gamma \left\|X\_k - Y\_{k+1}\right\|^2,\tag{9}$$

for some *γ* > 0. If the condition is not met, we restart the APG by setting *wk* = 0. The restart APG method (RAPG) is summarized in Algorithm 2.


#### *3.3. A-APG Method*

In RAPG Algorithm 2, we can adaptively estimate the step size *α<sup>k</sup>* by using the line search technique. Define

$$s\_k := X\_k - X\_{k-1}, \qquad g\_k := \nabla \varphi(X\_k) - \nabla \varphi(X\_{k-1}).$$

We initialize the search step by the Barzilai–Borwein (BB) method, i.e.,

$$\alpha\_k = \frac{tr(s\_k^T s\_k)}{tr(s\_k^T g\_k)} \quad \text{or} \quad \alpha\_k = \frac{tr(g\_k^T s\_k)}{tr(g\_k^T g\_k)}. \tag{10}$$

Therefore, we obtain the A-APG algorithm summarized in Algorithm 3.
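The BB initialization (10) translates directly into matrix form. A short sketch (the "long"/"short" variant naming is ours; `G`, `Gprev` are the gradients at the two iterates):

```python
import numpy as np

def bb_step(X, Xprev, G, Gprev, long_variant=True):
    """Barzilai-Borwein initial step (10), with s_k = X_k - X_{k-1} and
    g_k = grad(X_k) - grad(X_{k-1})."""
    s = X - Xprev
    g = G - Gprev
    if long_variant:
        return np.trace(s.T @ s) / np.trace(s.T @ g)   # tr(s^T s) / tr(s^T g)
    return np.trace(g.T @ s) / np.trace(g.T @ g)       # tr(g^T s) / tr(g^T g)
```

For the quadratic gradient (4) with symmetric positive definite *A* and *B*, one has *g<sub>k</sub>* = 2(*As<sub>k</sub>* + *s<sub>k</sub>B*), so *tr*(*s<sub>k</sub><sup>T</sup>g<sub>k</sub>*) > 0 and both ratios are positive; by the Cauchy–Schwarz inequality the first ("long") step is never smaller than the second.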


*3.4. Newton-APG Method*

Despite the fast initial convergence speed of the gradient-based methods, the tail convergence speed becomes slow. Therefore, we use a practical Newton method to solve the minimization problem. We obtain the initial value from A-APG Algorithm 3, and then choose the Newton direction as the gradient in the explicit scheme in RAPG Algorithm 2. Then we have the Newton-APG method shown in Algorithm 4.

**Algorithm 4** Newton-APG algorithm.

**Require:** *X*0, *α*0, *γ* > 0, *η* > 0, *β* ∈ (0, 1), *tol*, and *k* = 1.
1: Obtain the initial value from A-APG Algorithm 3 by the precision;
2: **while** the stop condition is not satisfied **do**
3: Initialize *α<sup>k</sup>* by the BB step Equation (10);
4: Update *Xk*<sup>+</sup><sup>1</sup> by RAPG Algorithm 2 using the Newton direction.

#### *3.5. Gradient Descent (GD) and Line Search (LGD) Methods*

Moreover, we present the gradient descent (GD) and line search gradient descent (LGD) methods for comparison with the A-APG and Newton-APG methods. The GD and LGD methods are summarized in Algorithm 5.

#### **Algorithm 5** GD and LGD algorithms.


#### *3.6. Computational Complexity Analysis*

Further, we analyze the computational complexity of each iteration of the derived algorithms.

The computation of APG is mainly controlled by matrix multiplication and addition operations in three main parts. The iterative scheme needs 4*m*<sup>2</sup>*n* + 4*mn*<sup>2</sup> + *O*(*mn*) operations. The backtracking line search defined by Equation (8) needs 14*m*<sup>2</sup>*n* + 20*n*<sup>2</sup>*m* + 6*n*<sup>3</sup> + *O*(*mn*) + *O*(*n*<sup>2</sup>) operations. The extrapolation defined by Equation (7) needs *O*(*mn*) operations. The total computational complexity of Algorithm 1 is 18*m*<sup>2</sup>*n* + 24*n*<sup>2</sup>*m* + 6*n*<sup>3</sup> + *O*(*mn*) + *O*(*n*<sup>2</sup>).

The computation of RAPG is mainly controlled by matrix multiplication and addition operations in four main parts. The iterative scheme needs 4*m*<sup>2</sup>*n* + 4*mn*<sup>2</sup> + *O*(*mn*) operations. The backtracking line search defined by Equation (8) needs 14*m*<sup>2</sup>*n* + 20*n*<sup>2</sup>*m* + 6*n*<sup>3</sup> + *O*(*mn*) + *O*(*n*<sup>2</sup>) operations. The extrapolation defined by Equation (7) needs *O*(*mn*) operations. The restart defined by Equation (9) needs 4*m*<sup>2</sup>*n* + 14*n*<sup>2</sup>*m* + 4*n*<sup>3</sup> + *O*(*mn*) + *O*(*n*<sup>2</sup>) operations. The total computational complexity of Algorithm 2 is 22*m*<sup>2</sup>*n* + 38*n*<sup>2</sup>*m* + 10*n*<sup>3</sup> + *O*(*mn*) + *O*(*n*<sup>2</sup>).

The computation of A-APG is mainly controlled by matrix multiplication and addition operations in four main parts. The iterative scheme needs 4*m*<sup>2</sup>*n* + 4*mn*<sup>2</sup> + *O*(*mn*) operations. The BB step and the backtracking line search defined by Equations (8) and (10) need *mn*, 4*m*<sup>2</sup>*n* + 4*mn*<sup>2</sup> + 6*mn*, 2*n*<sup>2</sup>(2*m* − 1) + 2*n*, and 14*m*<sup>2</sup>*n* + 20*n*<sup>2</sup>*m* + 6*n*<sup>3</sup> + *O*(*mn*) + *O*(*n*<sup>2</sup>) operations. The extrapolation defined by Equation (7) needs *O*(*mn*) operations. The restart defined by Equation (9) needs 4*m*<sup>2</sup>*n* + 14*n*<sup>2</sup>*m* + 4*n*<sup>3</sup> + *O*(*mn*) + *O*(*n*<sup>2</sup>) operations. The total computational complexity of Algorithm 3 is 26*m*<sup>2</sup>*n* + 46*n*<sup>2</sup>*m* + 10*n*<sup>3</sup> + *O*(*mn*) + *O*(*n*<sup>2</sup>).

The computation of Newton-APG is mainly controlled by matrix multiplication and addition operations in four main parts, different from the A-APG method. The iterative scheme needs 8*n*<sup>3</sup> + 3*n*<sup>2</sup> + *O*(*n*<sup>2</sup>) + *O*(*n*<sup>3</sup>) operations. The BB step and the backtracking line search defined by Equations (8) and (10) need *n*<sup>2</sup>, 8*n*<sup>3</sup> + 6*n*<sup>2</sup>, 2*n*<sup>2</sup>(2*n* − 1) + 2*n*, and 10*n*<sup>2</sup>(2*n* − 1) + 8*n*<sup>3</sup> + 3*n*<sup>2</sup> + *O*(*n*<sup>3</sup>) + *O*(*n*<sup>2</sup>) operations. The extrapolation defined by Equation (7) needs *O*(*n*<sup>2</sup>) operations. The restart defined by Equation (9) needs 5*n*<sup>2</sup>(2*n* − 1) + *n*<sup>2</sup> + *O*(*n*<sup>3</sup>) operations. The total computational complexity of Algorithm 4 is 50*n*<sup>3</sup> − 10*n*<sup>2</sup> + 2*n* + *O*(*n*<sup>2</sup>) + *O*(*n*<sup>3</sup>).

The computation of GD is mainly controlled by the matrix multiplication and addition operations in Equations (4) and (6). It requires *mn*(2*m* − 1), *mn*(2*n* − 1), *mn*(2*m* − 1), and *mn*(2*n* − 1) operations to compute *AX*, *XB*, *A<sup>T</sup>X*, and *XB<sup>T</sup>*, respectively. The total computational complexity of Algorithm 5 using GD is 4*m*<sup>2</sup>*n* + 4*mn*<sup>2</sup> + *O*(*mn*).

The computation of LGD is mainly controlled by matrix multiplication and addition operations in the calculation of *s*, *g* defined by Equations (8) and (10), and the calculation of GD, which require *mn*, 4*m*<sup>2</sup>*n* + 4*mn*<sup>2</sup> + 6*mn*, 2*n*<sup>2</sup>(2*m* − 1) + 2*n*, 14*m*<sup>2</sup>*n* + 20*n*<sup>2</sup>*m* + 6*n*<sup>3</sup> + *O*(*mn*) + *O*(*n*<sup>2</sup>), and 4*m*<sup>2</sup>*n* + 4*mn*<sup>2</sup> + *O*(*mn*) operations, respectively. The total computational complexity of Algorithm 5 using LGD is 22*m*<sup>2</sup>*n* + 32*n*<sup>2</sup>*m* + 6*n*<sup>3</sup> + *O*(*mn*) + *O*(*n*<sup>2</sup>).

#### **4. Convergent Analysis**

In this section, we focus on the convergence analysis of A-APG Algorithm 3. The following proposition is required.

**Proposition 4.** *Let M be a bounded region in* R*n*×*<sup>n</sup>* *that contains the sublevel set* {*X* : *ϕ*(*X*) ≤ *ϕ*(*X*0)}*; then* ∇*ϕ*(*X*) *satisfies the Lipschitz condition in M, i.e., there exists LM* > 0 *such that*

$$\|\nabla \varphi(X) - \nabla \varphi(Y)\| \le L\_M \|X - Y\| \quad \text{for } X, Y \in M.$$

**Proof.** Using the continuity of ∇<sup>2</sup>*ϕ*(*X*), note that

$$\left\| \left| \nabla^2 \boldsymbol{\varrho}(\mathbf{X}) \right| \right\| = \left\| \left( \boldsymbol{A} + \boldsymbol{A}^T \right) + \left( \boldsymbol{B} + \boldsymbol{B}^T \right) \right\|$$

defined by (5) is bounded. Then ∇*ϕ*(*X*) satisfies the Lipschitz condition in *M*.

In recent years, the proximal method based on the Bregman distance has been applied for solving optimization problems. The proximal operator is

$$\operatorname{Prox}\_{\varphi}^{\alpha}(X\_k) := \operatorname\*{argmin}\_{X} \left\{ \varphi(X) + \frac{1}{2\alpha}\|X - X\_k\|^2 \right\}.$$

Basically, given the current estimation *Xk* and step size *α<sup>k</sup>* > 0, update *Xk*<sup>+</sup><sup>1</sup> via

$$X\_{k+1} = \operatorname{Prox}\_0^{\alpha\_k}(X\_k - \alpha\_k \nabla \varphi(X\_k)) = \operatorname\*{argmin}\_{X} \left\{ \frac{1}{2\alpha\_k} \left\| X - (X\_k - \alpha\_k \nabla \varphi(X\_k)) \right\|^2 \right\}.\tag{11}$$

Thus we obtain

$$\frac{1}{\alpha\_k} \left( X\_{k+1} - (X\_k - \alpha\_k \nabla \varphi(X\_k)) \right) = 0,$$

which implies that

$$X\_{k+1} = X\_k - \alpha\_k \nabla \varphi(X\_k).$$

This is exactly the explicit scheme in our algorithm.

*4.1. Line Search Is Well-Defined*

Using the optimization from Equation (11), it is evident that

$$\begin{split} X\_{k+1} &= \underset{\mathcal{X}}{\operatorname{argmin}} \{ \frac{1}{2\alpha\_{k}} \| \boldsymbol{X} - (\boldsymbol{X}\_{k} - \boldsymbol{\alpha}\_{k} \nabla \boldsymbol{\varrho}(\boldsymbol{X}\_{k})) \|^{2} \} \\ &= \underset{\mathcal{X}}{\operatorname{argmin}} \{ \frac{1}{2\alpha\_{k}} \| \boldsymbol{X} - \boldsymbol{X}\_{k} \| ^{2} + \langle \boldsymbol{X} - \boldsymbol{X}\_{k}, \nabla \boldsymbol{\varrho}(\boldsymbol{X}\_{k}) \rangle \} \\ &= \underset{\mathcal{X}}{\operatorname{argmin}} \{ \frac{1}{2\alpha\_{k}} \| \boldsymbol{X} - \boldsymbol{X}\_{k} \| ^{2} + \langle \boldsymbol{X} - \boldsymbol{X}\_{k}, \nabla \boldsymbol{\varrho}(\boldsymbol{X}\_{k}) \rangle + \boldsymbol{\varrho}(\boldsymbol{X}\_{k}) \}. \end{split}$$

Then we obtain

$$\begin{split} \varphi(\mathbf{X}\_{k}) &\geqslant \frac{1}{2a\_{k}} \| \mathbf{X}\_{k+1} - \mathbf{X}\_{k} \|^{2} + \langle \mathbf{X}\_{k+1} - \mathbf{X}\_{k}, \nabla \varphi(\mathbf{X}\_{k}) \rangle + \varphi(\mathbf{X}\_{k}) \\ &\geqslant \varphi(\mathbf{X}\_{k+1}) + \frac{1}{2a\_{k}} \| \mathbf{X}\_{k} - \mathbf{X}\_{k+1} \|^{2} - \frac{\left\| \nabla^{2} \varrho(\mathbf{X}) \right\|}{2} \| \mathbf{X}\_{k} - \mathbf{X}\_{k+1} \|^{2} \\ &\geqslant \varphi(\mathbf{X}\_{k+1}) + (\frac{1}{2a\_{k}} - \frac{L\_{M}}{2}) \| \mathbf{X}\_{k} - \mathbf{X}\_{k+1} \|^{2} .\end{split} \tag{12}$$

where the second inequality follows from Taylor expansion of *ϕ*(*Xk*<sup>+</sup>1). By Equation (12), set

$$0 < \alpha\_k < \overline{\alpha} := \min \left\{ \frac{1}{L\_M + 2\eta}, \frac{1}{L\_M + 2\gamma} \right\}, \tag{13}$$

then the conditions in the line search Equation (8) and the non-restart Equation (9) are both satisfied. Therefore, the backtracking line search is well-defined.

#### *4.2. Sufficient Decrease Property*

In this section, we show the sufficient decrease property of the sequence generated by A-APG Algorithm 3. If *α<sup>k</sup>* satisfies the condition Equation (13), then

$$\varphi(X\_k) - \varphi(Y\_{k+1}) \geqslant \rho\_1 \left\| X\_k - Y\_{k+1} \right\|^2,$$

where *ρ*<sup>1</sup> = min{*η*, *γ*} > 0. Since *ϕ* is bounded below, there exists *ϕ*<sup>∗</sup> such that *ϕ*(*Xk*) ≥ *ϕ*<sup>∗</sup> and *ϕ*(*Xk*) → *ϕ*<sup>∗</sup> as *k* → +∞. This implies

$$\rho\_1 \sum\_{k=0}^{\infty} \left\| X\_{k+1} - X\_k \right\|^2 \le \varphi(X\_0) - \varphi^\* < +\infty,$$

which shows that

$$\lim\_{k \to +\infty} ||X\_{k+1} - X\_k|| = 0.$$

#### *4.3. Bounded Gradient*

Define the two sets Ω<sup>2</sup> = {*k* : *wk* = 0} and Ω<sup>1</sup> = N \ Ω2, where *wk* = (*k* − 2)/(*k* + 1). For any *k* ∈ Ω2, we have *Xk*<sup>+</sup><sup>1</sup> = *Yk*<sup>+</sup><sup>1</sup> since *wk* = 0. There exists *w* = (*kmax* − 2)/(*kmax* + 1) ∈ [0, 1) such that *wk* ≤ *w* as *k* increases. If *k* ∈ Ω1, since

$$Y\_{k+1} = \operatorname\*{argmin}\_{X} \left\{ \frac{1}{2\alpha\_k} \| X - (Y\_k - \alpha\_k \nabla \varphi(Y\_k)) \|^2 \right\},$$

we have

$$0 = \nabla \varphi(Y_k) + \frac{1}{\alpha_k} (X_{k+1} - Y_k).$$

Thus,

$$\nabla \varphi(\mathbf{Y}\_k) = \frac{1}{\alpha\_k} (\mathbf{Y}\_k - \mathbf{X}\_{k+1}) .$$

Note that *Y*<sub>*k*</sub> = (1 + *w*<sub>*k*</sub>)*X*<sub>*k*</sub> − *w*<sub>*k*</sub>*X*<sub>*k*−1</sub>; then

$$\begin{aligned} \|\nabla\varphi(Y_k)\| &= \frac{1}{\alpha_k} \|(1+w_k)X_k - w_k X_{k-1} - X_{k+1}\| \\ &= \frac{1}{\alpha_k} \|w_k(X_k - X_{k-1}) + (X_k - X_{k+1})\| \\ &\leqslant \frac{1}{\alpha_{\min}} (\overline{w}\|X_k - X_{k-1}\| + \|X_k - X_{k+1}\|) \\ &= c_1 (\|X_{k+1} - X_k\| + \overline{w}\|X_k - X_{k-1}\|), \end{aligned} \tag{14}$$

where *c*<sub>1</sub> = 1/*α*<sub>min</sub> > 0. If *k* ∈ Ω<sub>2</sub>, then

$$X_{k+1} = \operatorname*{argmin}_{X} \Big\{ \frac{1}{2\alpha_k} \left\| X - (X_k - \alpha_k \nabla \varphi(X_k)) \right\|^2 \Big\},$$

which implies that

$$0 = \nabla \varphi(X\_k) + \frac{1}{\alpha\_k} (X\_{k+1} - X\_k).$$

Thus

$$\|\nabla\varphi(X_k)\| = \frac{1}{\alpha_k} \|X_k - X_{k+1}\| \leqslant \frac{1}{\alpha_{\min}} \|X_k - X_{k+1}\| = c_1 \|X_k - X_{k+1}\|. \tag{15}$$

Combining Equations (14) and (15), it follows that

$$\|\nabla \varphi(X_k)\| \leqslant c_1 (\|X_{k+1} - X_k\| + \overline{w} \|X_k - X_{k-1}\|).$$

#### *4.4. Subsequence Convergence*

Since {*X*<sub>*k*</sub>} ⊂ *M* and *M* is compact, there exist a subsequence {*X*<sub>*k*<sub>*j*</sub></sub>} and *X*<sup>∗</sup> ∈ *M* such that lim<sub>*j*→+∞</sub> *X*<sub>*k*<sub>*j*</sub></sub> = *X*<sup>∗</sup>. Moreover, *φ* is bounded below, i.e., *φ*(*X*) > −∞, and {*φ*(*X*<sub>*k*</sub>)} is decreasing. Hence, there exists *φ*<sup>∗</sup> such that lim<sub>*k*→+∞</sub> *φ*(*X*<sub>*k*</sub>) = *φ*<sup>∗</sup>. Note that

$$\varphi(X_k) - \varphi(X_{k+1}) \geqslant c_0 \left\| X_k - X_{k+1} \right\|^2, \ k = 1, 2, \dots \tag{16}$$

Summation over *k* yields

$$c_0 \sum_{k=0}^{\infty} \left\| X_k - X_{k+1} \right\|^2 \leqslant \varphi(X_0) - \varphi^* < +\infty.$$

Therefore,

$$\lim_{k \to +\infty} \|X_k - X_{k+1}\| = 0.$$

Combining this with the bounded gradient property of Section 4.3, it follows that

$$\lim_{j \to +\infty} \left\| \nabla \varphi(X_{k_j}) \right\| = 0.$$

Considering the continuity of *φ* and ∇*φ*, we have

$$\lim_{j \to +\infty} \varphi(X_{k_j}) = \varphi(X^*), \qquad \lim_{j \to +\infty} \nabla \varphi(X_{k_j}) = \nabla \varphi(X^*) = 0,$$

which implies that ∇*φ*(*X*<sup>∗</sup>) = 0.

#### *4.5. Sequence Convergence*

In this section, the subsequence convergence is strengthened to convergence of the whole sequence by using the Kurdyka–Łojasiewicz property.

**Proposition 5.** *For x̄* ∈ *dom ∂φ* := {*x* : *∂φ*(*x*) ≠ ∅}*, there exist η* > 0*, a neighborhood U of x̄, and ψ* ∈ Ψ<sub>*η*</sub> = {*ψ* ∈ *C*[0, *η*) ∩ *C*<sup>1</sup>(0, *η*) : *ψ is concave*, *ψ*(0) = 0, *ψ*′ > 0 *on* (0, *η*)} *such that for all x* ∈ Γ<sub>*η*</sub>(*x̄*, *U*) := *U* ∩ {*x* : *φ*(*x̄*) < *φ*(*x*) < *φ*(*x̄*) + *η*}*, we have*

$$\psi'(\varphi(x) - \varphi(\overline{x})) \|\nabla \varphi(x)\| \geqslant 1.$$

*Then we say φ*(*x*) *satisfies the Kurdyka–Łojasiewicz property.*

**Theorem 1.** *Assume that Propositions 4 and 5 are met. Let* {*X*<sub>*k*</sub>} *be the sequence generated by the A-APG method (Algorithm 3). Then, there exists a point X*<sup>∗</sup> ∈ *M such that* lim<sub>*k*→+∞</sub> *X*<sub>*k*</sub> = *X*<sup>∗</sup> *and* ∇*φ*(*X*<sup>∗</sup>) = 0*.*

**Proof.** Let *ω*(*X*<sub>0</sub>) be the set of limit points of the sequence {*X*<sub>*k*</sub>}. Based on the boundedness of {*X*<sub>*k*</sub>} and the fact that *ω*(*X*<sub>0</sub>) = ∩<sub>*q*∈ℕ</sub> ∪<sub>*k*>*q*</sub> {*X*<sub>*k*</sub>}, it follows that *ω*(*X*<sub>0</sub>) is a non-empty and compact set. In addition, by Equation (16), we know that *φ* is constant on *ω*(*X*<sub>0</sub>), denoted by *φ*<sup>∗</sup>. If there exists some *k*<sub>0</sub> such that *φ*(*X*<sub>*k*<sub>0</sub></sub>) = *φ*<sup>∗</sup>, then *φ*(*X*<sub>*k*</sub>) = *φ*<sup>∗</sup> for all *k* > *k*<sub>0</sub>. Next, we assume that *φ*(*X*<sub>*k*</sub>) > *φ*<sup>∗</sup> for all *k*. Therefore, for

any *ε*, *η* > 0, there exists *l* > 0 such that for all *k* > *l* we have dist(*ω*(*X*<sub>0</sub>), *X*<sub>*k*</sub>) < *ε* and *φ*<sup>∗</sup> < *φ*(*X*<sub>*k*</sub>) < *φ*<sup>∗</sup> + *η*, i.e., *X*<sub>*k*</sub> ∈ Γ<sub>*η*</sub>(*X*<sup>∗</sup>, *ε*) for every *X*<sup>∗</sup> ∈ *ω*(*X*<sub>0</sub>). Applying Proposition 5, for all *k* > *l*, we have

$$\psi'(\varphi(X_k) - \varphi^*) \|\nabla \varphi(X_k)\| \geqslant 1.$$

Then

$$\psi'(\varphi(X_k) - \varphi^*) \geqslant \frac{1}{c_1(\|X_k - X_{k-1}\| + \overline{w}\|X_{k-1} - X_{k-2}\|)}. \tag{17}$$

By the concavity of *ψ*, it is obvious that

$$\psi(\varphi(X_k) - \varphi^*) - \psi(\varphi(X_{k+1}) - \varphi^*) \geqslant \psi'(\varphi(X_k) - \varphi^*)(\varphi(X_k) - \varphi(X_{k+1})). \tag{18}$$

Define

$$\triangle_{p,q} = \psi(\varphi(X_p) - \varphi^*) - \psi(\varphi(X_q) - \varphi^*), \quad c = (1 + \overline{w})c_1/c_0 > 0.$$

Combining with Equations (16)–(18), for ∀*k* > *l*, we obtain

$$\begin{split} \triangle_{k,k+1} &\geqslant \frac{c_0 \|X_{k+1} - X_k\|^2}{c_1 (\|X_k - X_{k-1}\| + \overline{w} \|X_{k-1} - X_{k-2}\|)} \\ &\geqslant \frac{\|X_{k+1} - X_k\|^2}{c (\|X_k - X_{k-1}\| + \|X_{k-1} - X_{k-2}\|)}. \end{split} \tag{19}$$

Applying the geometric inequality 2√(*uv*) ⩽ *u* + *v* to Equation (19), we obtain

$$2\|X_{k+1} - X_k\| \leqslant \frac{1}{2}(\|X_k - X_{k-1}\| + \|X_{k-1} - X_{k-2}\|) + 2c\triangle_{k,k+1}.$$

Therefore, for any *k* > *l*, summing the above inequality over *i* = *l* + 1, . . . , *k*, we obtain

$$\begin{split} 2\sum_{i=l+1}^{k} \left\| X_{i+1} - X_{i} \right\| &\leqslant \frac{1}{2} \sum_{i=l+1}^{k} \left( \left\| X_{i} - X_{i-1} \right\| + \left\| X_{i-1} - X_{i-2} \right\| \right) + 2c \sum_{i=l+1}^{k} \triangle_{i,i+1} \\ &\leqslant \sum_{i=l+1}^{k} \left\| X_{i+1} - X_{i} \right\| + \left\| X_{l+1} - X_{l} \right\| + \frac{1}{2} \left\| X_{l} - X_{l-1} \right\| + 2c\triangle_{l+1,k+1}. \end{split}$$

Since *ψ* ⩾ 0, for any *k* > *l*, it is evident that

$$\sum_{i=l+1}^{k} \left\| X_{i+1} - X_i \right\| \leqslant \left\| X_{l+1} - X_l \right\| + \frac{1}{2} \left\| X_l - X_{l-1} \right\| + 2c\psi(\varphi(X_l) - \varphi^*),$$

which implies that

$$\sum_{k=1}^{\infty} \|X_{k+1} - X_k\| < \infty.$$

Hence, {*X*<sub>*k*</sub>} is a Cauchy sequence, and we have lim<sub>*k*→+∞</sub> *X*<sub>*k*</sub> = *X*<sup>∗</sup>.

#### **5. Numerical Results**

In this section, we offer two numerical examples to illustrate the efficiency of the derived algorithms. All code is written in Python. In the tables, "iteration" and "error" denote the iteration count and the error of the objective function, respectively. We take the matrix order *n* as 128, 1024, 2048, and 4096.

**Example 1.** *Let*

$$A_1 = \begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & \ddots & \ddots & -1 \\ & & & -1 & 2 \end{pmatrix}, \quad B_1 = \begin{pmatrix} 1 & 0.5 & & & \\ 0.5 & 1 & 0.5 & & \\ & \ddots & \ddots & \ddots & \\ & & \ddots & \ddots & 0.5 \\ & & & 0.5 & 1 \end{pmatrix}$$

*be tridiagonal matrices in the Sylvester Equation (1). Set the matrix C*<sub>1</sub> *as the identity matrix. The initial step size is 0.01, which is small enough for the iteration to proceed. The parameters η*<sub>1</sub> = 0.25 *and ω*<sub>1</sub> = 0.2 *are taken randomly from (0, 1). Table 1 and Figure 1 show the numerical results of Algorithms 1–5. It can be seen that the LGD, A-APG, and Newton-APG algorithms are more efficient than the other methods. Moreover, the iteration count does not increase as the matrix order increases, since the same initial value is used. The A-APG method achieves higher accuracy than the other methods. The Newton-APG method takes more CPU time but fewer iteration steps than the A-APG method; the Newton step requires computing a matrix inverse, but it has quadratic convergence. From Figure 1, the error curves of the LGD, A-APG, and Newton-APG algorithms are hard to distinguish, so we offer another example below.*

**Table 1.** Numerical results for Example 1.


**Figure 1.** The error curves when *n* = 128, 1024, 2048, 4096 for Example 1.
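To make Example 1 concrete, the following sketch runs plain gradient descent on *φ*(*X*) = ½‖*A*<sub>1</sub>*X* + *XB*<sub>1</sub> − *C*<sub>1</sub>‖²<sub>F</sub> (the LGD baseline in spirit, not the A-APG algorithm itself); a smaller order *n* and a fixed step size are assumptions made here for speed:

```python
import numpy as np

def tridiag(n, d, off):
    # n x n tridiagonal Toeplitz matrix with diagonal d and off-diagonals off
    return (np.diag([d] * n) + np.diag([off] * (n - 1), 1)
            + np.diag([off] * (n - 1), -1))

n = 64                       # smaller than the paper's n = 128, for speed
A1 = tridiag(n, 2.0, -1.0)
B1 = tridiag(n, 1.0, 0.5)
C1 = np.eye(n)

X = np.zeros((n, n))
alpha = 0.01                 # the initial step size used in Example 1
for _ in range(1500):
    R = A1 @ X + X @ B1 - C1            # Sylvester residual
    X -= alpha * (A1.T @ R + R @ B1.T)  # gradient of 0.5 * ||R||_F^2

residual = np.linalg.norm(A1 @ X + X @ B1 - C1)
```

With *C*<sub>1</sub> = *I*, the residual couples only well-conditioned eigenmodes of the Kronecker operator, so even this unaccelerated iteration drives the residual to machine precision.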

**Example 2.** *Let A*<sub>2</sub> = *A*<sub>1</sub>*A*<sub>1</sub><sup>T</sup>, *B*<sub>2</sub> = *B*<sub>1</sub>*B*<sub>1</sub><sup>T</sup> *be positive semi-definite matrices in the Sylvester Equation (1). Set the matrix C*<sub>2</sub> *as the identity matrix. The initial step size is 0.009. The parameters η*<sub>2</sub> = 0.28 *and ω*<sub>2</sub> = 0.25 *are taken randomly from (0, 1). Table 2 and Figure 2 show the numerical results of Algorithms 1–5. It can be seen that the LGD, A-APG, and Newton-APG algorithms take less CPU time than the other methods. Additionally, the error curves of the LGD, A-APG, and Newton-APG algorithms can be distinguished in Figure 2.*

**Remark 1.** *The difference in iteration counts between Examples 1 and 2 is due to the different initial values. It can be seen that the LGD, A-APG, and Newton-APG algorithms require fewer iteration steps. Whether the A-APG or the Newton-APG method yields fewer iteration steps varies from problem to problem. From Examples 1 and 2, we observe that the A-APG method has higher accuracy, although it takes more time and more iteration steps than the LGD method.*

**Remark 2.** *Moreover, we compare the performance of our methods with other methods, such as the conjugate gradient (CG) method, in Tables 1 and 2. We take the same initial values and set the error tolerance to 1 × 10<sup>−14</sup>. From Tables 1 and 2, it can be seen that the LGD and A-APG methods are more efficient for solving the Sylvester matrix equation when the order n is small. When n is large, the LGD and A-APG methods have nearly the same convergence rate as the CG method.*

**Figure 2.** The error curves when *n* = 128, 1024, 2048, 4096 for Example 2.


**Table 2.** Numerical results for Example 2.

#### **6. Conclusions**

In this paper, we have introduced the A-APG and Newton-APG methods for solving the Sylvester matrix equation. The key idea is to change the Sylvester matrix equation to an optimization problem by using the Kronecker product. Moreover, we have analyzed the computation complexity and proved the convergence of the A-APG method. Convergence results and preliminary numerical examples have shown that the schemes are promising in solving the Sylvester matrix equation.

**Author Contributions:** J.Z. (methodology, review, and editing); X.L. (software, visualization, data curation). All authors have read and agreed to the published version of the manuscript.

**Funding:** The work was supported in part by the National Natural Science Foundation of China (12171412, 11771370), Natural Science Foundation for Distinguished Young Scholars of Hunan Province (2021JJ10037), Hunan Youth Science and Technology Innovation Talents Project (2021RC3110), the Key Project of the Education Department of Hunan Province (19A500, 21A0116).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Generalized Three-Step Numerical Methods for Solving Equations in Banach Spaces**

**Michael I. Argyros 1, Ioannis K. Argyros 2,\*, Samundra Regmi 3,\* and Santhosh George <sup>4</sup>**


**Abstract:** In this article, we propose a new methodology to construct and study generalized three-step numerical methods for solving nonlinear equations in Banach spaces. These methods are very general and include other methods already in the literature as special cases. The convergence analysis of the specialized methods has usually been given by assuming the existence of high-order derivatives that do not appear in the methods themselves. Such constraints limit the applicability of the methods to equations involving operators that are sufficiently many times differentiable, although the methods may converge. Moreover, the convergence is shown under a different set of conditions. Motivated by optimization considerations and the above concerns, we present a unified convergence analysis for the generalized numerical methods relying on conditions involving only the operators appearing in the method. This is the novelty of the article. Special cases and examples are presented to conclude this article.

**Keywords:** generalized three-step numerical method; convergence; Banach space

**MSC:** 49M15; 47H17; 65J15; 65G99; 47H17; 41A25; 49M15

#### **1. Introduction**

A plethora of applications from diverse disciplines of computational sciences are converted to nonlinear equations such as

$$F(\mathbf{x}) = \mathbf{0} \tag{1}$$

using mathematical modeling [1–4]. The nonlinear operator *F* is defined on an open and convex subset Ω of a Banach space *X* with values in *X*. The solution of the equation is denoted by *x*<sup>∗</sup>. Numerical methods are mainly used to find *x*<sup>∗</sup>, since the analytic form of the solution can be obtained only in special cases.

Researchers, as well as practitioners, have proposed numerous numerical methods under a different set of convergence conditions using high-order derivatives, which are not present in the methods.

Let us consider an example.

**Example 1.** *Define the function F on X* = [−0.5, 1.5] *by*

$$F(t) = \begin{cases} t^3 \ln t^2 + t^5 - t^4, & t \neq 0 \\ 0, & t = 0. \end{cases}$$

**Citation:** Argyros, M.I.; Argyros, I.K.; Regmi, S.; George, S. Generalized Three-Step Numerical Methods for Solving Equations in Banach Spaces. *Mathematics* **2022**, *10*, 2621. https://doi.org/10.3390/math10152621

Academic Editors: Maria Isabel Berenguer and Manuel Ruiz Galán

Received: 8 July 2022 Accepted: 26 July 2022 Published: 27 July 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

*Clearly, the point t*<sup>∗</sup> = 1 *solves the equation F*(*t*) = 0. *It follows that*

$$F'''(t) = 6\ln t^2 + 60t^2 - 24t + 22.$$

Hence, the function *F* does not have a bounded third derivative on *X*.
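A quick numerical check (a sketch, using the expression for *F*''' computed above) confirms the blow-up near *t* = 0:

```python
import math

def F_third(t):
    # F'''(t) = 6 ln(t^2) + 60 t^2 - 24 t + 22, valid for t != 0
    return 6 * math.log(t * t) + 60 * t * t - 24 * t + 22

# the logarithmic term drives F''' to -infinity as t -> 0,
# so F''' is unbounded on X = [-0.5, 1.5]
values = [F_third(10.0 ** (-k)) for k in range(1, 8)]
```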

Hence, many high-order convergence results cannot be applied to show convergence, although the methods themselves may converge. In order to address these concerns, we propose a unified approach for dealing with the convergence of these numerical methods that takes into account only the operators appearing in them. Hence, the usage of these methods becomes possible under weaker conditions.

Let *x*<sup>0</sup> ∈ Ω be a starting point. Define the generalized numerical method ∀*n* = 0, 1, 2, . . . by

$$\begin{aligned} y_n &= a_n = a(x_n) \\ z_n &= b_n = b(x_n, y_n) \\ x_{n+1} &= c_n = c(x_n, y_n, z_n) \end{aligned} \tag{2}$$

where *a* : Ω −→ *X*, *b* : Ω × Ω −→ *X* and *c* : Ω × Ω × Ω −→ *X* are given operators chosen so that lim*n*−→<sup>∞</sup> *xn* = *x*∗.
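For a concrete feel for method (2), here is a minimal sketch for the scalar case *X* = ℝ, with the operators *a*, *b*, *c* passed as callables; the Newton instantiation below (recovering method (7)) is our own illustration:

```python
def generalized_three_step(a, b, c, x0, tol=1e-12, max_iter=100):
    """Method (2): y_n = a(x_n), z_n = b(x_n, y_n), x_{n+1} = c(x_n, y_n, z_n)."""
    x = x0
    for _ in range(max_iter):
        y = a(x)
        z = b(x, y)
        x_new = c(x, y, z)
        if abs(x_new - x) < tol:   # stop when the iterates stabilize
            return x_new
        x = x_new
    return x

# recover Newton's method by letting a be the Newton step and
# b, c simply pass the current iterate through
F = lambda t: t * t - 2.0
dF = lambda t: 2.0 * t
newton_step = lambda t: t - F(t) / dF(t)
root = generalized_three_step(newton_step, lambda x, y: y, lambda x, y, z: z, 1.5)
```

Other choices of *b* and *c* turn the same driver into any of the three-step schemes listed below.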

The specialization of (2) is

$$\begin{aligned} y_n &= x_n + \alpha_n F(x_n) \\ z_n &= u_n + \beta_n F(x_n) + \gamma_n F(y_n) \\ x_{n+1} &= v_n + \delta_n F(x_n) + \epsilon_n F(y_n) + \theta_n F(z_n) \end{aligned} \tag{3}$$

where *u*<sub>*n*</sub> = *x*<sub>*n*</sub> or *u*<sub>*n*</sub> = *y*<sub>*n*</sub>; *v*<sub>*n*</sub> = *x*<sub>*n*</sub>, *v*<sub>*n*</sub> = *y*<sub>*n*</sub> or *v*<sub>*n*</sub> = *z*<sub>*n*</sub>; and *α*<sub>*n*</sub>, *β*<sub>*n*</sub>, *γ*<sub>*n*</sub>, *δ*<sub>*n*</sub>, *ε*<sub>*n*</sub>, *θ*<sub>*n*</sub> are linear operators on Ω, Ω × Ω and Ω × Ω × Ω, with values in *X*, respectively. By choosing some of these linear operators equal to the *O* linear operators in (3), we obtain the methods studied in [5]. Moreover, if *X* = ℝ<sup>*k*</sup>, then we obtain the methods studied in [6,7]. In particular, the methods in [5] are of the special form

$$\begin{aligned} y_n &= x_n - \mathcal{O}_{1,n}^{-1} F(x_n) \\ z_n &= y_n - \mathcal{O}_{2,n}^{-1} F(y_n) \\ x_{n+1} &= z_n - \mathcal{O}_{3,n}^{-1} F(z_n) \end{aligned} \tag{4}$$

or

$$\begin{array}{rcl} y_n &=& x_n - sF'(x_n)^{-1}F(x_n) \\ z_n &=& x_n - \mathcal{O}_{4,n}F(x_n) \\ x_{n+1} &=& z_n - \mathcal{O}_{5,n}F(z_n), \end{array} \tag{5}$$

whereas the methods in [7,8] are of the form

$$\begin{array}{rcl} y_n &=& x_n - F'(x_n)^{-1} F(x_n) \\ z_n &=& y_n - \mathcal{O}_{6,n} F'(x_n)^{-1} F(y_n) \\ x_{n+1} &=& z_n - \mathcal{O}_{7,n} F'(x_n)^{-1} F(z_n), \end{array} \tag{6}$$

where *<sup>s</sup>* <sup>∈</sup> <sup>R</sup> is a given parameter, and <sup>O</sup>*k*,*n*, *<sup>k</sup>* <sup>=</sup> 1, 2, ... , 7 are linear operators acting between Ω and *X*. In particular, operators must have a special form to obtain the fourth, seventh or eighth order of convergence.

Further specializations of the operators "O" lead to well-studied methods, a few of which are listed below (other choices can be found in [6,7,9,10]):

**Newton method (second order) [1,4,11,12]:**

$$y\_n = \mathbf{x}\_n - F'(\mathbf{x}\_n)^{-1} F(\mathbf{x}\_n). \tag{7}$$

**Jarratt method (second order) [13]:**

$$y\_n = x\_n - \frac{2}{3} F'(x\_n)^{-1} F(x\_n). \tag{8}$$

**Traub-type method (fifth order) [14]:**

$$\begin{array}{rcl} y\_n &=& \mathbf{x}\_n - F'(\mathbf{x}\_n)^{-1} F(\mathbf{x}\_n) \\ z\_n &=& \mathbf{x}\_n - F'(\mathbf{x}\_n)^{-1} F(y\_n) \\ \mathbf{x}\_{n+1} &=& \mathbf{x}\_n - F'(\mathbf{x}\_n)^{-1} F(z\_n) .\end{array} \tag{9}$$

**Homeier method (third order) [15]:**

$$\begin{array}{rcl} y_n &=& x_n - \frac{1}{2}F'(x_n)^{-1}F(x_n) \\ x_{n+1} &=& y_n - F'(x_n)^{-1}F(y_n). \end{array} \tag{10}$$

**Cordero–Torregrosa (third Order) [2]:**

$$\begin{array}{rcl} y_n &=& x_n - F'(x_n)^{-1} F(x_n) \\ x_{n+1} &=& x_n - 6 \left[ F'(x_n) + 4F'\big(\frac{x_n + y_n}{2}\big) + F'(y_n) \right]^{-1} F(x_n). \end{array} \tag{11}$$

or

$$y_n = x_n - F'(x_n)^{-1} F(x_n) \tag{12}$$

$$x_{n+1} = x_n - 2\left[2F'\Big(\frac{3x_n + y_n}{4}\Big) - F'\Big(\frac{x_n + y_n}{2}\Big) + 2F'\Big(\frac{x_n + 3y_n}{4}\Big)\right]^{-1} F(x_n).$$

**Noor–Waseem method (third order) [3]:**

$$\begin{array}{rcl} y_n &=& x_n - F'(x_n)^{-1} F(x_n) \\ x_{n+1} &=& x_n - 4 \left[ 3F'\Big(\frac{2x_n + y_n}{3}\Big) + F'(y_n) \right]^{-1} F(x_n). \end{array} \tag{13}$$

**Xiao–Yin method (third order) [16]:**

$$\begin{aligned} y\_n &=& \mathbf{x}\_n - F'(\mathbf{x}\_n)^{-1} F(\mathbf{x}\_n) \\ \mathbf{x}\_{n+1} &=& \mathbf{x}\_n - \frac{2}{3} \left[ \left( 3F'(y\_n) - F'(\mathbf{x}\_n) \right)^{-1} + F'(\mathbf{x}\_n)^{-1} \right] F(\mathbf{x}\_n) . \end{aligned} \tag{14}$$

**Cordero–Torregrosa method (fifth order) [2]:**

$$\begin{aligned} y\_{\boldsymbol{n}} &=& \mathbf{x}\_{\boldsymbol{n}} - \frac{2}{3} F'(\mathbf{x}\_{\boldsymbol{n}})^{-1} F(\mathbf{x}\_{\boldsymbol{n}}) \\ z\_{\boldsymbol{n}} &=& \mathbf{x}\_{\boldsymbol{n}} - \frac{1}{2} (3F'(y\_{\boldsymbol{n}}) - F'(\mathbf{x}\_{\boldsymbol{n}}))^{-1} (3F'(y\_{\boldsymbol{n}}) + F'(\mathbf{x}\_{\boldsymbol{n}})) F'(\mathbf{x}\_{\boldsymbol{n}})^{-1} F(\mathbf{x}\_{\boldsymbol{n}}) \\ \mathbf{x}\_{\boldsymbol{n}+1} &=& z\_{\boldsymbol{n}} - (\frac{1}{2} F'(y\_{\boldsymbol{n}}) + \frac{1}{2} F'(\mathbf{x}\_{\boldsymbol{n}}))^{-1} F(\mathbf{z}\_{\boldsymbol{n}}). \end{aligned} \tag{15}$$

or

$$\begin{array}{rcl} y\_n &=& \mathbf{x}\_n - F'(\mathbf{x}\_n)^{-1} F(\mathbf{x}\_n) \\ z\_n &=& \mathbf{x}\_n - 2(F'(y\_n) + F'(\mathbf{x}\_n))^{-1} F(\mathbf{x}\_n) \\ \mathbf{x}\_{n+1} &=& z\_n - F'(y\_n)^{-1} F(z\_n) .\end{array} \tag{16}$$

**Sharma–Arora method (fifth order) [17,18]:**

$$\begin{array}{rcl} y\_n &=& \mathbf{x}\_n - F'(\mathbf{x}\_n)^{-1} F(\mathbf{x}\_n) \\ \mathbf{x}\_{n+1} &=& \mathbf{x}\_n - (2F'(y\_n)^{-1} - F'(\mathbf{x}\_n)^{-1}) F(\mathbf{x}\_n). \end{array} \tag{17}$$

**Xiao–Yin method (fifth order) [16]:**

$$\begin{aligned} y\_n &=& x\_n - \frac{2}{3} F'(\mathbf{x}\_n)^{-1} F(\mathbf{x}\_n) \\ z\_n &=& \mathbf{x}\_n - \frac{1}{4} (3F'(\mathbf{y}\_n)^{-1} + F'(\mathbf{x}\_n)^{-1}) F(\mathbf{x}\_n) \\ x\_{n+1} &=& x\_n - \frac{1}{3} \left[ (3F'(\mathbf{y}\_n) - F'(\mathbf{x}\_n))^{-1} \right] F(\mathbf{x}\_n) .\end{aligned} \tag{18}$$

**Traub-type method (second order) [14]:**

$$\begin{array}{rcl} y_n &=& x_n - [w_n, x_n; F]^{-1} F(x_n) \\ w_n &=& x_n + dF(x_n), \end{array} \tag{19}$$

where [·, ·; *F*] : Ω × Ω −→ *L*(*X*, *X*) is a divided difference of order one.

**Moccari–Lotfi method (fourth order) [19]:**

$$\begin{array}{rcl} y_n &=& x_n - [x_n, w_n; F]^{-1} F(x_n) \\ x_{n+1} &=& y_n - ([y_n, w_n; F] + [y_n, x_n; F] - [x_n, w_n; F])^{-1} F(y_n). \end{array} \tag{20}$$

**Wang–Zhang method (seventh order) [8,16,20]:**

$$\begin{array}{rcl} y_n &=& x_n - [w_n, x_n; F]^{-1} F(x_n) \\ z_n &=& M_8(x_n, y_n) \\ x_{n+1} &=& z_n - ([z_n, x_n; F] + [z_n, y_n; F] - [y_n, x_n; F])^{-1} F(z_n), \end{array} \tag{21}$$

where *M*<sub>8</sub> is any fourth-order Steffensen-type iterative method.

**Sharma–Arora method (seventh order) [17]:**

$$\begin{array}{rcl} y_n &=& x_n - [w_n, x_n; F]^{-1} F(x_n) \\ z_n &=& y_n - \big(3I - [w_n, x_n; F]^{-1}([y_n, x_n; F] + [y_n, w_n; F])\big) [w_n, x_n; F]^{-1} F(y_n) \\ x_{n+1} &=& z_n - [z_n, y_n; F]^{-1} \big([w_n, x_n; F] + [y_n, x_n; F] - [z_n, x_n; F]\big) [w_n, x_n; F]^{-1} F(z_n). \end{array} \tag{22}$$
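As scalar (*X* = ℝ) sketches of two entries in this catalogue, the code below implements the Traub-type method (9) and the derivative-free Traub-type step (19), with the divided difference [*w*, *x*; *F*] = (*F*(*w*) − *F*(*x*))/(*w* − *x*); the test equations and stopping rule are our own illustrative choices:

```python
import math

def traub_three_step(F, dF, x0, tol=1e-12, max_iter=100):
    """Traub-type method (9): three substeps reusing the derivative F'(x_n)."""
    x = x0
    for _ in range(max_iter):
        inv = 1.0 / dF(x)
        y = x - inv * F(x)
        z = x - inv * F(y)
        x_new = x - inv * F(z)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def traub_divided_difference(F, x0, d=1.0, tol=1e-12, max_iter=100):
    """Derivative-free Traub-type step (19): w_n = x_n + d*F(x_n), and F'(x_n)
    is replaced by the divided difference [w_n, x_n; F]."""
    x = x0
    for _ in range(max_iter):
        w = x + d * F(x)
        if w == x:                      # F(x) == 0: x is already a root
            return x
        dd = (F(w) - F(x)) / (w - x)    # divided difference of order one
        x_new = x - F(x) / dd
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

r1 = traub_three_step(lambda t: math.exp(t) - 1.0, math.exp, 0.5)
r2 = traub_divided_difference(lambda t: t * t - 2.0, 1.5)
```

The first call solves e^t − 1 = 0 (the example used later in the paper); the second solves t² − 2 = 0 without evaluating any derivative.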

The local, as well as the semi-local, convergence for methods (4) and (5) was presented in [17], using hypotheses relating only to the operators in these methods. However, the local convergence analysis of method (6) requires the usage of derivatives or divided differences of order higher than two, which do not appear in method (6). These high-order derivatives restrict the applicability of method (6) to equations whose operator *F* has high-order derivatives, although method (6) may converge (see Example 1).

Similar restrictions exist for the convergence of the aforementioned methods of order three or above.

It is also worth noticing that the fifth convergence order method by Sharma [18]

$$\begin{array}{rcl} y_n &=& x_n - F'(x_n)^{-1} F(x_n) \\ z_n &=& y_n - 5F'(x_n)^{-1} F(y_n) \\ x_{n+1} &=& y_n - \frac{1}{5} [9F'(x_n)^{-1} F(y_n) + F'(x_n)^{-1} F(z_n)] \end{array} \tag{23}$$

cannot be handled with the analyses given previously [5–7] for method (4), method (5), or method (6).

Based on all of the above, clearly, it is important to study the convergence of method (2) and its specialization method (3) with the approach employed for method (4) or (5). This way, the resulting unified convergence criteria can apply to their specialized methods listed or not listed previously. Hence, this is the motivation as well as the novelty of the article.

There are two important types of convergence: the semi-local and the local. The semi-local uses information involving the initial point to provide criteria, assuring the convergence of the numerical method, while the local one is based on the information about the solution to find the radii of the convergence balls.

The local convergence results are vital, although the solution is unknown in general since the convergence order of the numerical method can be found. This kind of result also demonstrates the degree of difficulty in selecting starting points. There are cases when the radius of convergence of the numerical method can be determined without the knowledge of the solution.

As an example, let *X* = ℝ. Suppose the function *F* satisfies an autonomous differential equation [5,21] of the form

$$H(F(t)) = F'(t),$$

where *H* is a continuous function. Notice that *H*(*F*(*t*<sup>∗</sup>)) = *F*′(*t*<sup>∗</sup>), so *F*′(*t*<sup>∗</sup>) = *H*(0). In the case of *F*(*t*) = *e*<sup>*t*</sup> − 1, we can choose *H*(*t*) = *t* + 1 (see also the numerical section).
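The claim for *F*(*t*) = *e*<sup>*t*</sup> − 1 with *H*(*t*) = *t* + 1 can be verified directly, since *H*(*F*(*t*)) = *e*<sup>*t*</sup> = *F*′(*t*); a small sketch (the sample points are arbitrary):

```python
import math

F = lambda t: math.exp(t) - 1.0
dF = lambda t: math.exp(t)
H = lambda s: s + 1.0

# H(F(t)) = (e^t - 1) + 1 = e^t = F'(t) for every t, and in particular
# F'(t*) = H(F(t*)) = H(0) = 1 without knowing the root t* explicitly
max_gap = max(abs(H(F(t)) - dF(t)) for t in (-1.0, 0.0, 0.5, 2.0))
```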

Moreover, the local results can apply to projection numerical methods, such as Arnoldi's, the generalized minimum residual numerical method (GMRES), the generalized conjugate numerical method (GCS) for combined Newton/finite projection numerical methods, and in relation to the mesh independence principle to develop the cheapest and most efficient mesh refinement techniques [1,5,11,21].

In this article, we introduce a majorant sequence and use our idea of recurrent functions to extend the applicability of the numerical method (2). Our analysis includes error bounds and results on the uniqueness of *x*<sup>∗</sup> based on computable Lipschitz constants not given before in [5,13,21–24] and in other similar studies using the Taylor series. This idea is very general. Hence, it applies also to other numerical methods [10,14,22,25].

The convergence analysis of method (2) and method (3) is given in Section 2. Moreover, the special choices of operators appear in the method in Sections 3 and 4. Concluding remarks, open problems, and future work complete this article.

#### **2. Convergence Analysis of Method**

The local convergence analysis is followed by the semi-local one. Let *S* = [0, ∞) and *S*<sub>0</sub> = [0, *ρ*<sub>0</sub>) for some *ρ*<sub>0</sub> > 0. Let *h*<sub>1</sub> : *S*<sub>0</sub> −→ ℝ, *h*<sub>2</sub> : *S*<sub>0</sub> × *S*<sub>0</sub> −→ ℝ and *h*<sub>3</sub> : *S*<sub>0</sub> × *S*<sub>0</sub> × *S*<sub>0</sub> −→ ℝ be continuous and nondecreasing in each variable.

Suppose that equations

$$h\_i(t) - 1 = 0, \; i = 1, 2, 3\tag{24}$$

have the smallest solutions, *ρ<sup>i</sup>* ∈ *S* − {0}. The parameter *ρ* defined by

$$
\rho = \min \{ \rho\_i \} \tag{25}
$$

shall be shown to be a radius of convergence for method (2). Let *S*<sup>1</sup> = [0, *ρ*). It follows by the definition of radius *ρ* that for all *t* ∈ *S*<sup>1</sup>

$$0 \le h\_i(t) < 1.\tag{26}$$

The notation *U*(*x*, *ς*) denotes an open ball with center *x* ∈ *X* and of radius *ς* > 0. By *U*[*x*, *ς*], we denote the closure of *U*(*x*, *ς*).

The following conditions are used in the local convergence analysis of the method (2). Suppose the following:

(H1) Equation *F*(*x*) = 0 has a solution *x*<sup>∗</sup> ∈ Ω.

$$\text{(H2) } \|a(x) - x_*\| \le h_1(\|x - x_*\|)\|x - x_*\|,$$

$$\|b(x, y) - x_*\| \le h_2(\|x - x_*\|, \|y - x_*\|)\|x - x_*\|,$$

and

$$\|c(x, y, z) - x_*\| \le h_3(\|x - x_*\|, \|y - x_*\|, \|z - x_*\|)\|x - x_*\|$$

for all *x*, *y*, *z* ∈ Ω<sup>0</sup> = Ω ∩ *U*(*x*∗, *ρ*0).


Next, the main local convergence analysis is presented for method (2).

**Theorem 1.** *Suppose that the conditions (H1)–(H4) hold and x*<sub>0</sub> ∈ *U*(*x*<sup>∗</sup>, *ρ*) − {*x*<sup>∗</sup>}. *Then, the sequence* {*x*<sub>*n*</sub>} *generated by method (2) is well defined and converges to x*<sup>∗</sup>. *Moreover, the following estimates hold for all n* = 0, 1, 2, . . .

$$\|y\_n - \mathbf{x}\_\*\| \le h\_1(\|\mathbf{x}\_n - \mathbf{x}\_\*\|) \|\mathbf{x}\_n - \mathbf{x}\_\*\| \le \|\mathbf{x}\_n - \mathbf{x}\_\*\| < \rho \tag{27}$$

$$\|\|z\_n - \mathbf{x}\_\*\|\| \le h\_2(\|\mathbf{x}\_n - \mathbf{x}\_\*\|, \|\|y\_n - \mathbf{x}\_\*\|) \|\mathbf{x}\_n - \mathbf{x}\_\*\| \le \|\mathbf{x}\_n - \mathbf{x}\_\*\|\tag{28}$$

*and*

$$\|x_{n+1} - x_*\| \le h_3(\|x_n - x_*\|, \|y_n - x_*\|, \|z_n - x_*\|) \|x_n - x_*\| \le \|x_n - x_*\|. \tag{29}$$

**Proof.** Let *x*<sub>0</sub> ∈ *U*(*x*<sup>∗</sup>, *ρ*<sub>0</sub>). Then, it follows from (H1), the definition of *ρ*, (26) (for *i* = 1) and the first substep of method (2) for *n* = 0 that

$$\|\|y\_0 - \mathbf{x}\_\*\|\| \le h\_1(\|\|\mathbf{x}\_0 - \mathbf{x}\_\*\|) \|\|\mathbf{x}\_0 - \mathbf{x}\_\*\|\| \le \|\|\mathbf{x}\_0 - \mathbf{x}\_\*\|\| < \rho,\tag{30}$$

showing estimate (27) for *n* = 0 and the iterate *y*<sup>0</sup> ∈ *U*(*x*∗, *ρ*). Similarly,

$$\begin{aligned} \|z_0 - x_*\| &\leq h_2(\|x_0 - x_*\|, \|y_0 - x_*\|) \|x_0 - x_*\| \\ &\leq h_2(\|x_0 - x_*\|, \|x_0 - x_*\|) \|x_0 - x_*\| \leq \|x_0 - x_*\| \end{aligned} \tag{31}$$

and

$$\begin{array}{rcl} \|\mathbf{x}\_{1} - \mathbf{x}\_{\*}\| & \leq & h\_{3}(\|\mathbf{x}\_{0} - \mathbf{x}\_{\*}\|, \|\mathbf{y}\_{0} - \mathbf{x}\_{\*}\|, \|\mathbf{z}\_{0} - \mathbf{x}\_{\*}\|) \|\mathbf{x}\_{0} - \mathbf{x}\_{\*}\| \\ & \leq & h\_{3}(\|\mathbf{x}\_{0} - \mathbf{x}\_{\*}\|, \|\mathbf{x}\_{0} - \mathbf{x}\_{\*}\|, \|\mathbf{x}\_{0} - \mathbf{x}\_{\*}\|) \|\mathbf{x}\_{0} - \mathbf{x}\_{\*}\| \\ & \leq & \|\mathbf{x}\_{0} - \mathbf{x}\_{\*}\|. \end{array}$$

showing estimates (28) and (29), respectively, and that the iterates *z*<sub>0</sub>, *x*<sub>1</sub> ∈ *U*(*x*<sup>∗</sup>, *ρ*). By simply replacing *x*<sub>0</sub>, *y*<sub>0</sub>, *z*<sub>0</sub>, *x*<sub>1</sub> with *x*<sub>*k*</sub>, *y*<sub>*k*</sub>, *z*<sub>*k*</sub>, *x*<sub>*k*+1</sub> in the preceding calculations, the induction for estimates (27)–(29) is completed. Then, from the estimate

$$\|x_{k+1} - x_*\| \le d\|x_k - x_*\| < \rho,$$

where

$$d = h_3(\|x_0 - x_*\|, \|x_0 - x_*\|, \|x_0 - x_*\|) \in [0, 1), \tag{32}$$

we conclude *xk*<sup>+</sup><sup>1</sup> ∈ *<sup>U</sup>*[*x*∗, *<sup>ρ</sup>*] and lim*k*−→<sup>∞</sup> *xk* = *<sup>x</sup>*∗.

**Remark 1.** *It follows from the proof of Theorem 1 that $y_n$, $z_n$ can be chosen in particular as $y_n = a(x_n)$ and $z_n = b(x_n, y_n)$. Thus, condition (H2) need only hold for all $x$, $a(x)$, $b(x, y) \in \Omega_0$ and not for all $x$, $y$, $z \in \Omega_0$. Clearly, in this case, the resulting functions $\bar{h}_i$ are at least as tight as the functions $h_i$, leading to a radius of convergence $\bar{\rho}$ at least as large as $\rho$ (see the numerical section).*

Concerning the semi-local convergence of method (2), let us introduce scalar sequences $\{t_n\}$, $\{s_n\}$ and $\{u_n\}$ defined by $t_0 = 0$, $s_0 = \eta \ge 0$, with the rest of the iterates depending on the operators $a$, $b$, $c$ and $F$ (see how in the next section). These sequences shall be shown to be majorizing for method (2). However, first, a convergence result for these sequences is needed.

**Lemma 1.** *Suppose that* ∀ *n* = 0, 1, 2, . . .

$$t_n \le s_n \le u_n \le t_{n+1} \tag{33}$$

*and*

$$t\_n \le \lambda \tag{34}$$

*for some λ* ≥ 0. *Then, the sequence* {*tn*} *is convergent to its unique least upper bound t*<sup>∗</sup> ∈ [0, *λ*].

**Proof.** It follows from conditions (33) and (34) that sequence {*tn*} is nondecreasing and bounded from above by *λ*, and as such, it converges to *t*∗.

#### **Theorem 2.** *Suppose the following:*

*(H5) Iterates $\{x_n\}$, $\{y_n\}$, $\{z_n\}$ generated by method (2) exist, belong in $U(x_0, t_*)$ and satisfy the conditions of Lemma 1 for all $n = 0, 1, 2, \ldots$*

*(H6)* $\|a(x_n) - x_n\| \le s_n - t_n$,

$$\|b(x_n, y_n) - y_n\| \le u_n - s_n$$

*and*

$$\|c(x_n, y_n, z_n) - z_n\| \le t_{n+1} - u_n$$

*for all $n = 0, 1, 2, \ldots$ and*

*(H7) $U[x_0, t_*] \subset \Omega$. Then, there exists $x_* \in U[x_0, t_*]$ such that $\lim_{n\to\infty} x_n = x_*$.*

**Proof.** It follows from condition (H5) that the sequence $\{t_n\}$ is Cauchy (being convergent). Thus, by condition (H6), the sequence $\{x_n\}$ is also Cauchy in the Banach space $X$, and as such, it converges to some $x_* \in U[x_0, t_*]$ (since $U[x_0, t_*]$ is a closed set).

**Remark 2.** *(i) Additional conditions are needed to show F*(*x*∗) = 0. *The same is true for the results on the uniqueness of the solution.*

*(ii) The limit point $t_*$ is not given in closed form, so it can be replaced by $\lambda$ in Theorem 2.*

#### **3. Special Cases I**

The iterates of method (3) are assumed to exist, and operator *F* has a divided difference of order one.
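Here, $[x, y; F]$ denotes a divided difference of order one, i.e., a linear operator satisfying $[x, y; F](x - y) = F(x) - F(y)$. As a minimal illustration (the scalar test function below is our own choice, not one from the text), for a scalar function this reduces to the familiar difference quotient:

```python
import math

def divided_difference(F, x, y, h=1e-8):
    """First-order divided difference [x, y; F] for a scalar F.

    Satisfies [x, y; F] * (x - y) = F(x) - F(y); when x == y it
    degenerates to a central-difference estimate of F'(x).
    """
    if x == y:
        return (F(x + h) - F(x - h)) / (2 * h)
    return (F(x) - F(y)) / (x - y)

F = lambda t: math.exp(t) - 1          # hypothetical test function, root at 0
dd = divided_difference(F, 0.2, 0.1)
# consistency check: [x, y; F](x - y) = F(x) - F(y)
assert abs(dd * (0.2 - 0.1) - (F(0.2) - F(0.1))) < 1e-12
```

For operators between Banach spaces, $[x, y; F]$ is instead a linear operator (e.g., a matrix of difference quotients on $\mathbb{R}^k$).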

#### **Local Convergence**

Three possibilities are presented for the local cases based on different estimates for the determination of the functions *hi*. It follows by method (3) that

$$(\text{P1})\quad y_n - x_* = x_n - x_* + \alpha_n F(x_n) = (I + \alpha_n[x_n, x_*; F])(x_n - x_*),$$

$$\begin{aligned} z_n - x_* &= (I + \gamma_n[y_n, x_*; F])(y_n - x_*) + \beta_n[x_n, x_*; F](x_n - x_*) \\ &= \big[(I + \gamma_n[y_n, x_*; F])(I + \alpha_n[x_n, x_*; F]) + \beta_n[x_n, x_*; F]\big](x_n - x_*) \end{aligned}$$

and

$$\begin{aligned} x_{n+1} - x_* &= (I + \theta_n[z_n, x_*; F])(z_n - x_*) + \delta_n[x_n, x_*; F](x_n - x_*) + \epsilon_n[y_n, x_*; F](y_n - x_*) \\ &= \big[(I + \theta_n[z_n, x_*; F])\big((I + \gamma_n[y_n, x_*; F])(I + \alpha_n[x_n, x_*; F]) + \beta_n[x_n, x_*; F]\big) \\ &\quad + \delta_n[x_n, x_*; F] + \epsilon_n[y_n, x_*; F](I + \alpha_n[x_n, x_*; F])\big](x_n - x_*). \end{aligned}$$

Hence, the functions *hi* are selected to satisfy ∀*xn*, *yn*, *zn* ∈ Ω

$$\|I + \alpha_n[x_n, x_*; F]\| \le h_1(\|x_n - x_*\|),$$

$$\big\|(I + \gamma_n[y_n, x_*; F])(I + \alpha_n[x_n, x_*; F]) + \beta_n[x_n, x_*; F]\big\| \le h_2(\|x_n - x_*\|, \|y_n - x_*\|)$$

$$\begin{aligned} &\big\|(I + \theta_n[z_n, x_*; F])\big((I + \gamma_n[y_n, x_*; F])(I + \alpha_n[x_n, x_*; F]) + \beta_n[x_n, x_*; F]\big) \\ &\quad + \delta_n[x_n, x_*; F] + \epsilon_n[y_n, x_*; F](I + \alpha_n[x_n, x_*; F])\big\| \\ &\le h_3(\|x_n - x_*\|, \|y_n - x_*\|, \|z_n - x_*\|). \end{aligned}$$

A practical non-discrete choice for the function *h*<sup>1</sup> is given by

$$\|I + \alpha(x)[x, x_*; F]\| \le h_1(\|x - x_*\|) \;\; \forall x \in \Omega.$$

Another choice is given by

$$h_1(t) = \sup_{x \in \Omega,\, \|x - x_*\| \le t} \|I + \alpha(x)[x, x_*; F]\|.$$

The choices of functions *h*<sup>2</sup> and *h*<sup>3</sup> can follow similarly.
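For instance, the sup-based choice of $h_1$ can be estimated numerically by sampling. The sketch below uses the hypothetical Newton-type choice $\alpha(x) = -F'(x)^{-1}$ and the scalar test function $F(x) = e^x - 1$ with $x_* = 0$; both are our assumptions for illustration only:

```python
import math

F  = lambda x: math.exp(x) - 1      # hypothetical test function, x_* = 0
dF = lambda x: math.exp(x)

def h1(t, samples=400):
    """Sampled estimate of h1(t) = sup_{|x - x_*| <= t} |1 + a(x)[x, x_*; F]|
    with the Newton-type choice a(x) = -1/F'(x) and x_* = 0."""
    best = 0.0
    for k in range(1, samples + 1):
        for sign in (1.0, -1.0):
            x = sign * t * k / samples      # sample 0 < |x| <= t
            dd = F(x) / x                   # [x, 0; F] = (F(x) - F(0)) / x
            best = max(best, abs(1.0 - dd / dF(x)))
    return best

# h1 is non-decreasing and stays below 1 on a neighbourhood of x_* = 0,
# which is exactly what the radius computation needs.
print(h1(0.1), h1(0.5))
```

A root-finder applied to $h_1(t) - 1 = 0$ then yields the corresponding convergence radius.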

(P2) Let $M^i : \Omega \longrightarrow L(X, X)$, $i = 1, 2, 3$, be given mappings whose values are linear operators. By $M^i_n$, we denote $M^i(x_n)$ $\forall n = 0, 1, 2, \ldots$. Then, it follows from method (3) that

$$\begin{aligned} y_n - x_* &= x_n - x_* - M_n^1 F(x_n) + (\alpha_n + M_n^1)F(x_n) \\ &= \big((I - M_n^1[x_n, x_*; F]) + (\alpha_n + M_n^1)[x_n, x_*; F]\big)(x_n - x_*), \\ z_n - x_* &= \big((I - M_n^2[y_n, x_*; F]) + (\gamma_n + M_n^2)[y_n, x_*; F]\big)(y_n - x_*) \end{aligned}$$

and

$$x_{n+1} - x_* = \big((I - M_n^3[z_n, x_*; F]) + (\theta_n + M_n^3)[z_n, x_*; F]\big)(z_n - x_*).$$

Thus, the functions *hi* must satisfy

$$\|I + \alpha_n\| \le h_1(\|x_n - x_*\|),$$

$$\|(I + \gamma_n)(I + \alpha_n)\| \le h_2(\|x_n - x_*\|, \|y_n - x_*\|)$$

and

$$\|x_{n+1} - x_*\| \le \|(I + \theta_n)(I + \gamma_n)(I + \alpha_n)\|\,\|x_n - x_*\| \le h_3(\|x_n - x_*\|, \|y_n - x_*\|, \|z_n - x_*\|)\|x_n - x_*\|.$$

Clearly, the function *h*<sup>1</sup> can be chosen again as in case (P1). The functions *h*<sup>2</sup> and *h*<sup>3</sup> can be defined similarly.

(P3) Assume there exists a continuous and non-decreasing function $q_0 : [0, \infty) \longrightarrow \mathbb{R}$ such that

$$\|F'(x_*)^{-1}(F'(x) - F'(x_*))\| \le q_0(\|x - x_*\|) \;\; \forall x \in \Omega.$$

Then, we can write

$$F(x_n) = F(x_n) - F(x_*) = \int_0^1 F'(x_* + \theta(x_n - x_*))d\theta\,(x_n - x_*),$$

leading to

$$\|F'(x_*)^{-1}F(x_n)\| \le \Big(1 + \int_0^1 q_0(\theta\|x_n - x_*\|)d\theta\Big)\|x_n - x_*\|.$$

Then, by method (3) we obtain, in turn, that

$$y_n - x_* = \Big[I + \alpha_n F'(x_*)\,F'(x_*)^{-1}\Big(\int_0^1 F'(x_* + \theta(x_n - x_*))d\theta - F'(x_*) + F'(x_*)\Big)\Big](x_n - x_*),$$

so, the function *h*<sup>1</sup> must satisfy

$$\Big\|I + \alpha_n \int_0^1 F'(x_* + \theta(x_n - x_*))d\theta\Big\| \le h_1(\|x_n - x_*\|)$$

or

$$h_1(t) = \sup_{\|x - x_*\| \le t,\, x \in \Omega} \Big\|I + \alpha(x)\int_0^1 F'(x_* + \theta(x - x_*))d\theta\Big\|$$

or

$$\|I + \alpha_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|x_n - x_*\|)d\theta\Big) \le h_1(\|x_n - x_*\|)$$

or

$$h_1(t) = \sup_{\|x - x_*\| \le t,\, x \in \Omega} \|I + \alpha(x)F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta t)d\theta\Big).$$

Similarly, for the other two steps, we obtain in the last choice

$$\begin{aligned} \|z_n - x_*\| &\le \|I + \gamma_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|y_n - x_*\|)d\theta\Big)\|y_n - x_*\| \\ &\quad + \|\beta_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|x_n - x_*\|)d\theta\Big)\|x_n - x_*\| \end{aligned}$$

and

$$\begin{aligned} \|x_{n+1} - x_*\| &\le \|I + \theta_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|z_n - x_*\|)d\theta\Big)\|z_n - x_*\| \\ &\quad + \|\delta_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|x_n - x_*\|)d\theta\Big)\|x_n - x_*\| \\ &\quad + \|\epsilon_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|y_n - x_*\|)d\theta\Big)\|y_n - x_*\|. \end{aligned}$$

Thus, the function *h*<sup>2</sup> satisfies

$$\begin{aligned} &\|I + \gamma_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|y_n - x_*\|)d\theta\Big)\|y_n - x_*\| \\ &\quad + \|\beta_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|x_n - x_*\|)d\theta\Big) \\ &\le h_2(\|x_n - x_*\|, \|y_n - x_*\|) \end{aligned}$$

or

$$\begin{aligned} h_2(s, t) &= \sup_{\|x - x_*\| \le s,\, \|y - x_*\| \le t} \Big[\|I + \gamma(x)F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta t)d\theta\Big)t \\ &\quad + \|\beta(x)F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta s)d\theta\Big)\Big]. \end{aligned}$$

Finally, concerning the choice of the function *h*3, by the third substep of method (3)

$$\begin{aligned} \|x_{n+1} - x_*\| &\le \|I + \theta_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|z_n - x_*\|)d\theta\Big)\|z_n - x_*\| \\ &\quad + \|\delta_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|x_n - x_*\|)d\theta\Big)\|x_n - x_*\| \\ &\quad + \|\epsilon_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|y_n - x_*\|)d\theta\Big)\|y_n - x_*\|, \end{aligned}$$

so the function *h*<sup>3</sup> must satisfy

$$\begin{aligned} &\|I + \theta_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|z_n - x_*\|)d\theta\Big)h_2(\|x_n - x_*\|, \|y_n - x_*\|) \\ &\quad + \|\delta_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|x_n - x_*\|)d\theta\Big) \\ &\quad + \|\epsilon_n F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta\|y_n - x_*\|)d\theta\Big)h_1(\|x_n - x_*\|) \\ &\le h_3(\|x_n - x_*\|, \|y_n - x_*\|, \|z_n - x_*\|) \end{aligned}$$

or

$$h_3(s, t, u) = \sup_{\|x - x_*\| \le s,\, \|y - x_*\| \le t,\, \|z - x_*\| \le u} \mu(x, s, t, u),$$

where

$$\begin{aligned} \mu(x, s, t, u) &= \|I + \theta(x)F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta u)d\theta\Big)h_2(s, t) \\ &\quad + \|\delta(x)F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta s)d\theta\Big) \\ &\quad + \|\epsilon(x)F'(x_*)\|\Big(1 + \int_0^1 q_0(\theta t)d\theta\Big)h_1(s). \end{aligned}$$

The functions *h*<sup>2</sup> and *h*<sup>3</sup> can also be defined with the other two choices as those of function *h*<sup>1</sup> given previously.

#### **Semi-local Convergence**

Concerning this case, instead of the conditions of Theorem 2 (see (H6)), we can require for method (3) that

$$\|\alpha_n F(x_n)\| \le s_n - t_n,$$

$$\|\beta_n F(x_n) + \gamma_n F(y_n)\| \le u_n - s_n$$

and

$$\|\delta_n F(x_n) + \epsilon_n F(y_n) + \theta_n F(z_n)\| \le t_{n+1} - u_n \;\; \forall n = 0, 1, 2, \ldots$$

Notice that under these choices,

$$\|y_n - x_n\| \le s_n - t_n,$$

$$\|z_n - y_n\| \le u_n - s_n$$

and

$$\|\mathfrak{x}\_{n+1} - z\_n\| \le t\_{n+1} - u\_n.$$

Then, the conclusions of Theorem 2 hold for method (3). Even more specialized choices of the linear operators appearing in these methods, as well as of the functions $h_i$, can be found in the Introduction, the next section, or in [1,2,11,21] and the references therein.

#### **4. Special Cases II**

This section contains even more specialized cases of method (2) and method (3). In particular, we study the local and semi-local convergence first of method (22) and second of method (20). Notice that to obtain method (22), we set in method (3)

$$\begin{aligned} \alpha_n &= -F'(x_n)^{-1},\; u_n = y_n,\; \beta_n = O,\; \gamma_n = -5F'(x_n)^{-1}, \\ v_n &= y_n,\; \delta_n = O,\; \epsilon_n = -\frac{9}{5}F'(x_n)^{-1} \text{ and } \theta_n = -\frac{1}{5}F'(x_n)^{-1}. \end{aligned} \tag{35}$$

Moreover, for method (20), we let

$$\begin{aligned} \alpha_n &= -[x_n, w_n; F]^{-1},\; u_n = y_n,\; \beta_n = O,\; z_n = x_{n+1}, \\ \gamma_n &= ([y_n, w_n; F] + [y_n, x_n; F] - [x_n, w_n; F])^{-1},\; \delta_n = \epsilon_n = \theta_n = O \end{aligned} \tag{36}$$

and *vn* = *zn*.

#### **5. Local Convergence of Method (23)**

The local convergence analysis of method (23) utilizes some scalar functions and parameters. Let $S = [0, \infty)$.

Suppose the following:

(i) There exists a continuous and non-decreasing function $w_0 : S \longrightarrow \mathbb{R}$ such that the equation

$$w\_0(t) - 1 = 0$$

has a smallest solution $\rho_0 \in S - \{0\}$. Let $S_0 = [0, \rho_0)$.

(ii) There exists a continuous and non-decreasing function $w : S_0 \longrightarrow \mathbb{R}$ such that the equation

$$h\_1(t) - 1 = 0$$

has a smallest solution $\rho_1 \in S_0 - \{0\}$, where the function $h_1 : S_0 \longrightarrow \mathbb{R}$ is defined by

$$h\_1(t) = \frac{\int\_0^1 w((1-\theta)t)d\theta}{1-w\_0(t)}.$$

(iii) Equation

$$w\_0(h\_1(t)t) - 1 = 0$$

has a smallest solution $\bar{\rho}_1 \in S_0 - \{0\}$. Let

$$\bar{\rho}_0 = \min\{\rho_0, \bar{\rho}_1\}$$

and $\tilde{S}_1 = [0, \bar{\rho}_0)$.

(iv) Equation

$$h\_2(t) - 1 = 0$$

has a smallest solution $\rho_2 \in \tilde{S}_1 - \{0\}$, where the function $h_2 : \tilde{S}_1 \longrightarrow \mathbb{R}$ is defined as

$$\begin{aligned} h_2(t) &= \Bigg[\frac{\int_0^1 w((1-\theta)h_1(t)t)d\theta}{1 - w_0(h_1(t)t)} + \frac{w((1 + h_1(t))t)\big(1 + \int_0^1 w_0(\theta h_1(t)t)d\theta\big)}{(1 - w_0(t))(1 - w_0(h_1(t)t))} \\ &\quad + \frac{4\big(1 + \int_0^1 w_0(\theta h_1(t)t)d\theta\big)}{1 - w_0(t)}\Bigg]h_1(t). \end{aligned}$$

(v) Equation

$$h\_3(t) - 1 = 0$$

has a smallest solution $\rho_3 \in \tilde{S}_1 - \{0\}$, where the function $h_3 : \tilde{S}_1 \longrightarrow \mathbb{R}$ is defined by

$$h_3(t) = h_1(t) + \frac{1}{5}\Bigg[\frac{9\big(1 + \int_0^1 w_0(\theta h_1(t)t)d\theta\big)h_1(t) + \big(1 + \int_0^1 w_0(\theta h_2(t)t)d\theta\big)h_2(t)}{1 - w_0(t)}\Bigg].$$

The parameter *ρ* defined by

$$\rho = \min\{\rho_j\}, \;\; j = 1, 2, 3, \tag{37}$$

is proven to be a radius of convergence for method (23) in Theorem 3. Let $S_1 = [0, \rho)$. Then, it follows from these definitions that $\forall\, t \in S_1$

$$0 \le w\_0(t) < 1\tag{38}$$

$$0 \le w\_0(h\_1(t)t) < 1\tag{39}$$

and

$$0 \le h\_i(t) < 1.\tag{40}$$

The conditions required are as follows:

(C1) The equation $F(x) = 0$ has a simple solution $x_* \in \Omega$.

(C2) $\|F'(x_*)^{-1}(F'(x) - F'(x_*))\| \le w_0(\|x - x_*\|)$ $\forall\, x \in \Omega$. Set $\Omega_1 = U(x_*, \rho_0) \cap \Omega$.

(C3) $\|F'(x_*)^{-1}(F'(y) - F'(x))\| \le w(\|y - x\|)$ $\forall\, x, y \in \Omega_1$ and

(C4) $U[x_*, \rho] \subset \Omega$.

Next, the main local convergence result follows for method (23).

**Theorem 3.** *Suppose that conditions (C1)–(C4) hold and $x_0 \in U(x_*, \rho) - \{x_*\}$. Then, the sequence $\{x_n\}$ generated by method (23) is well defined in $U(x_*, \rho)$, remains in $U(x_*, \rho)$ $\forall n = 0, 1, 2, \ldots$ and is convergent to $x_*$. Moreover, the following assertions hold:*

$$\|y_n - x_*\| \le h_1(\|x_n - x_*\|)\|x_n - x_*\| \le \|x_n - x_*\| < \rho, \tag{41}$$

$$\|z_n - x_*\| \le h_2(\|x_n - x_*\|)\|x_n - x_*\| \le \|x_n - x_*\|, \tag{42}$$

*and*

$$\|x_{n+1} - x_*\| \le h_3(\|x_n - x_*\|)\|x_n - x_*\| \le \|x_n - x_*\|, \tag{43}$$

*where functions hi are defined previously and the radius ρ is given by Formula (37).*

**Proof.** Let *u* ∈ *U*(*x*∗, *ρ*) − {*x*∗}. By using conditions (C1), (C2) and (37), we have that

$$\|F'(x_*)^{-1}(F'(u) - F'(x_*))\| \le w_0(\|u - x_*\|) \le w_0(\rho) < 1. \tag{44}$$

It follows from (44) and the Banach lemma on invertible operators [11,15] that $F'(u)^{-1} \in L(X, X)$ and

$$\|F'(u)^{-1}F'(x_*)\| \le \frac{1}{1 - w_0(\|u - x_*\|)}. \tag{45}$$

If *u* = *x*0, then the iterate *y*<sup>0</sup> is well defined by the first substep of method (23) and we can write

$$\begin{aligned} y_0 - x_* &= x_0 - x_* - F'(x_0)^{-1}F(x_0) \\ &= F'(x_0)^{-1}\int_0^1 \big(F'(x_0) - F'(x_* + \theta(x_0 - x_*))\big)d\theta\,(x_0 - x_*). \end{aligned} \tag{46}$$

In view of (C1)–(C3), (45) (for *u* = *x*0), (40) (for *i* = 1) and (46), we obtain in turn that

$$\begin{aligned} \|y_0 - x_*\| &\le \frac{\int_0^1 w((1-\theta)\|x_0 - x_*\|)d\theta\,\|x_0 - x_*\|}{1 - w_0(\|x_0 - x_*\|)} \\ &\le h_1(\|x_0 - x_*\|)\|x_0 - x_*\| \le \|x_0 - x_*\| < \rho. \end{aligned} \tag{47}$$

Thus, the iterate $y_0 \in U(x_*, \rho)$ and (41) holds for $n = 0$. The iterate $z_0$ is well defined by the second substep of method (23), so we can write

$$\begin{aligned} z_0 - x_* &= y_0 - x_* - 5F'(x_0)^{-1}F(y_0) \\ &= y_0 - x_* - F'(y_0)^{-1}F(y_0) \\ &\quad + F'(y_0)^{-1}\big(F'(x_0) - F'(y_0)\big)F'(x_0)^{-1}F(y_0) \\ &\quad - 4F'(x_0)^{-1}F(y_0). \end{aligned} \tag{48}$$

Notice that the linear operator $F'(y_0)^{-1}$ exists by (45) (for $u = y_0$). It follows from (37), (40) (for $j = 1$), (C3) and (45) (for $u = x_0, y_0$), in turn, that

$$\begin{aligned} \|z_0 - x_*\| &\le \Bigg[\frac{\int_0^1 w((1-\theta)\|y_0 - x_*\|)d\theta}{1 - w_0(\|y_0 - x_*\|)} \\ &\quad + \frac{w(\|y_0 - x_0\|)\big(1 + \int_0^1 w_0(\theta\|y_0 - x_*\|)d\theta\big)}{(1 - w_0(\|x_0 - x_*\|))(1 - w_0(\|y_0 - x_*\|))} \\ &\quad + \frac{4\big(1 + \int_0^1 w_0(\theta\|y_0 - x_*\|)d\theta\big)}{1 - w_0(\|x_0 - x_*\|)}\Bigg]\|y_0 - x_*\| \\ &\le h_2(\|x_0 - x_*\|)\|x_0 - x_*\| \le \|x_0 - x_*\|. \end{aligned} \tag{49}$$

Thus, the iterate *z*<sup>0</sup> ∈ *U*(*x*∗, *ρ*) and (42) holds for *n* = 0, where we also used (C1) and (C2) to obtain the estimate

$$\begin{aligned} \|F'(x_*)^{-1}F(y_0)\| &= \Big\|F'(x_*)^{-1}\Big[\int_0^1 F'(x_* + \theta(y_0 - x_*))d\theta - F'(x_*)\Big](y_0 - x_*) + (y_0 - x_*)\Big\| \\ &\le \Big(1 + \int_0^1 w_0(\theta\|y_0 - x_*\|)d\theta\Big)\|y_0 - x_*\|. \end{aligned}$$

Moreover, the iterate *x*<sup>1</sup> is well defined by the third substep of method (23), so we can have

$$x_1 - x_* = y_0 - x_* - \frac{1}{5}F'(x_0)^{-1}\big(9F(y_0) + F(z_0)\big),$$

leading to

$$\begin{aligned} \|x_1 - x_*\| &\le \|y_0 - x_*\| + \frac{1}{5}\Bigg[\frac{9\big(1 + \int_0^1 w_0(\theta\|y_0 - x_*\|)d\theta\big)\|y_0 - x_*\|}{1 - w_0(\|x_0 - x_*\|)} \\ &\quad + \frac{\big(1 + \int_0^1 w_0(\theta\|z_0 - x_*\|)d\theta\big)\|z_0 - x_*\|}{1 - w_0(\|x_0 - x_*\|)}\Bigg] \\ &\le h_3(\|x_0 - x_*\|)\|x_0 - x_*\| \le \|x_0 - x_*\| < \rho. \end{aligned} \tag{50}$$

Therefore, the iterate *x*<sup>1</sup> ∈ *U*(*x*∗, *ρ*) and (43) holds for *n* = 0.

Replace $x_0, y_0, z_0, x_1$ with $x_m, y_m, z_m, x_{m+1}$ $\forall m = 0, 1, 2, \ldots$ in the preceding calculations to complete the induction for the estimates (41)–(43). Then, by the estimate

$$||\mathfrak{x}\_{m+1} - \mathfrak{x}\_\*|| \le d||\mathfrak{x}\_m - \mathfrak{x}\_\*|| < \rho,\tag{51}$$

where $d = h_3(\|x_0 - x_*\|) \in [0, 1)$, we obtain that $x_{m+1} \in U(x_*, \rho)$ and $\lim_{m\to\infty} x_m = x_*$.
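Collecting the three substeps used in the proof (see (46), (48) and the identity preceding (50)), method (23) reads $y_n = x_n - F'(x_n)^{-1}F(x_n)$, $z_n = y_n - 5F'(x_n)^{-1}F(y_n)$ and $x_{n+1} = y_n - \frac{1}{5}F'(x_n)^{-1}\big(9F(y_n) + F(z_n)\big)$. A minimal scalar sketch follows; the test equation $e^x - 1 = 0$ is our own choice, and for systems each division would become a linear solve with $F'(x_n)$:

```python
import math

def method23_step(F, dF, x):
    """One step of method (23) as read off the proof of Theorem 3:
    y = x - F'(x)^{-1} F(x), z = y - 5 F'(x)^{-1} F(y),
    x_next = y - (1/5) F'(x)^{-1} (9 F(y) + F(z))."""
    fp = dF(x)                      # for systems: factor F'(x) once, reuse it
    y = x - F(x) / fp
    z = y - 5 * F(y) / fp
    return y - (9 * F(y) + F(z)) / (5 * fp)

F  = lambda t: math.exp(t) - 1      # hypothetical test equation, root x_* = 0
dF = lambda t: math.exp(t)

x = 0.1                             # x_0 inside the convergence ball of Example 2
for _ in range(4):
    x = method23_step(F, dF, x)
print(abs(x))                       # rapidly approaches the solution x_* = 0
```

Note that only one derivative evaluation $F'(x_n)$ is needed per full step, which is the practical appeal of this composite scheme.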

The uniqueness of the solution result for method (23) follows.

**Proposition 1.** *Suppose the following: (i) Equation F*(*x*) = 0 *has a simple solution x*<sup>∗</sup> ∈ *U*(*x*∗,*r*) ⊂ Ω *for some r* > 0. *(ii) Condition (C2) holds. (iii) There exists r*<sup>1</sup> ≥ *r such that*

$$\int\_{0}^{1} w\_{0}(\theta r\_{1})d\theta < 1. \tag{52}$$

*Set* Ω<sup>2</sup> = *U*[*x*∗,*r*1] ∩ Ω. *Then, the only solution of equation F*(*x*) = 0 *in the set* Ω<sup>2</sup> *is x*∗.

**Proof.** Let $y_* \in \Omega_2$ be such that $F(y_*) = 0$. Define the linear operator $J = \int_0^1 F'(x_* + \theta(y_* - x_*))d\theta$. It then follows from (ii) and (52) that

$$\begin{aligned} \|F'(x_*)^{-1}(J - F'(x_*))\| &\le \int_0^1 w_0(\theta\|y_* - x_*\|)d\theta \\ &\le \int_0^1 w_0(\theta r_1)d\theta < 1. \end{aligned}$$

Hence, we deduce *x*<sup>∗</sup> = *y*<sup>∗</sup> by the invertibility of *J* and the estimate *J*(*x*<sup>∗</sup> − *y*∗) = *F*(*x*∗) − *F*(*y*∗) = 0.

**Remark 3.** *Under all conditions of Theorem 3, we can set ρ* = *r*.

**Example 2.** *Consider the motion system*

$$F\_1'(v\_1) = e^{v\_1}, \; F\_2'(v\_2) = (e - 1)v\_2 + 1, \; F\_3'(v\_3) = 1$$

*with $F_1(0) = F_2(0) = F_3(0) = 0$. Let $F = (F_1, F_2, F_3)^{tr}$, $X = \mathbb{R}^3$, $\Omega = U[0, 1]$ and $x_* = (0, 0, 0)^{tr}$. The function $F$ on $\Omega$ is then given for $v = (v_1, v_2, v_3)^{tr}$ as*

$$F(v) = (e^{v\_1} - 1, \frac{e - 1}{2}v\_2^2 + v\_2, v\_3)^{tr}.$$

*Using this definition, we obtain the derivative as*

$$F'(v) = \begin{bmatrix} e^{v\_1} & 0 & 0 \\ 0 & (e-1)v\_2 + 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

*Hence, $F'(x_*) = I$. Let $v = (v_1, v_2, v_3)^{tr} \in \mathbb{R}^3$. Moreover, the norm for a matrix $N = (n_{j,i}) \in \mathbb{R}^{3\times3}$ is*

$$\|N\| = \max_{1 \le j \le 3} \sum_{i=1}^{3} |n_{j,i}|.$$

*Conditions (C1)–(C3) are verified for $w_0(t) = (e - 1)t$ and $w(t) = 2\left(1 + \frac{1}{e-1}\right)t$. Then, the radii are*

$$\rho_1 = 0.3030\ldots, \quad \rho_2 = 0.1033\ldots = \rho \quad \text{and} \quad \rho_3 = 0.1461\ldots$$

**Example 3.** *Let $X = C[0, 1]$ be equipped with the max-norm and $\Omega = U[0, 1]$. Consider $G : \Omega \longrightarrow X$ given as*

$$G(\lambda)(x) = \lambda(x) - 6\int_0^1 x\tau\lambda(\tau)^3\,d\tau. \tag{53}$$

*We obtain*

$$G'(\lambda)(\xi)(x) = \xi(x) - 18\int_0^1 x\tau\lambda(\tau)^2\xi(\tau)\,d\tau, \text{ for each } \xi \in \Omega.$$

*Clearly, x*<sup>∗</sup> = 0 *and the conditions (C1)–(C3) hold for w*0(*t*) = 9*t and w*(*t*) = 18*t*. *Then, the radii are*

*ρ*<sup>1</sup> = 0.0556, *ρ*<sup>2</sup> = 0.0089 = *ρ* and *ρ*<sup>3</sup> = 0.0206.
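The radii reported in Examples 2 and 3 can be checked numerically from the definition of $h_1$ in Section 5: $\rho_1$ is the smallest positive solution of $h_1(t) - 1 = 0$. A sketch using bisection and a midpoint quadrature (only $\rho_1$ is computed here; $\rho_2$ and $\rho_3$ follow in the same way from $h_2$ and $h_3$):

```python
import math

def h1(t, w0, w, m=400):
    """h1(t) = (∫_0^1 w((1-θ)t) dθ) / (1 - w0(t)), midpoint quadrature."""
    integral = sum(w((1 - (k + 0.5) / m) * t) for k in range(m)) / m
    return integral / (1 - w0(t))

def smallest_root(f, lo, hi, iters=200):
    """Bisection for the root of f on (lo, hi); f is increasing here."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Example 2: w0(t) = (e-1)t, w(t) = 2(1 + 1/(e-1))t
w0_a = lambda t: (math.e - 1) * t
w_a  = lambda t: 2 * (1 + 1 / (math.e - 1)) * t
rho1_a = smallest_root(lambda t: h1(t, w0_a, w_a) - 1,
                       1e-12, 1 / (math.e - 1) - 1e-12)

# Example 3: w0(t) = 9t, w(t) = 18t
w0_b = lambda t: 9 * t
w_b  = lambda t: 18 * t
rho1_b = smallest_root(lambda t: h1(t, w0_b, w_b) - 1, 1e-12, 1 / 9 - 1e-12)

print(rho1_a, rho1_b)   # ≈ 0.3030 and ≈ 0.0556
```

The search interval is capped just below the pole of $h_1$, i.e., the smallest root $\rho_0$ of $w_0(t) - 1 = 0$, on which interval $h_1$ is increasing for these linear choices of $w_0$ and $w$.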

#### **6. Semi-Local Convergence of Method (23)**

As in the local case, we use some scalar functions and parameters for method (23). Suppose the following:

There exists function *<sup>v</sup>*<sup>0</sup> : *<sup>S</sup>* −→ <sup>R</sup> that is continuous and non-decreasing such that equation

$$v\_0(t) - 1 = 0$$

has a smallest solution $\tau_0 \in S - \{0\}$. Consider a continuous and non-decreasing function $v : [0, \tau_0) \longrightarrow \mathbb{R}$. Define the scalar sequences for $\eta \ge 0$ and $\forall\, n = 0, 1, 2, \ldots$ by

$$\begin{aligned} t_0 &= 0, \quad s_0 = \eta, \\ u_n &= s_n + \frac{5\int_0^1 v(\theta(s_n - t_n))d\theta\,(s_n - t_n)}{1 - v_0(t_n)}, \\ t_{n+1} &= u_n + \frac{1}{1 - v_0(t_n)}\Big[\Big(1 + \int_0^1 v_0(u_n + \theta(u_n - s_n))d\theta\Big)(u_n - s_n) \\ &\quad + 3\int_0^1 v(\theta(s_n - t_n))d\theta\,(s_n - t_n)\Big], \\ s_{n+1} &= t_{n+1} + \frac{1}{1 - v_0(t_{n+1})}\Big[\int_0^1 v(\theta(t_{n+1} - t_n))d\theta\,(t_{n+1} - t_n) \\ &\quad + \Big(1 + \int_0^1 v_0(\theta t_n)d\theta\Big)(t_{n+1} - s_n)\Big]. \end{aligned} \tag{54}$$

This sequence is proven to be majorizing for method (23) in Theorem 4. However, first, we provide a general convergence result for sequence (54).
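As a quick numerical illustration (the constants $L_0$, $L$ and $\eta$ below are our own, chosen small enough for conditions (55) and (56) to hold), the recursion (54) can be iterated for the linear choices $v_0(t) = L_0 t$ and $v(t) = Lt$, for which $\int_0^1 v(\theta a)\,d\theta = La/2$ and $\int_0^1 v_0(\theta a)\,d\theta = L_0 a/2$; each update adds a nonnegative correction, so $t_n \le s_n \le u_n \le t_{n+1}$:

```python
# Iterating the majorizing recursion (54) for linear v0(t) = L0*t, v(t) = L*t.
# L0, L and eta are illustrative constants of our choosing, small enough that
# v0(t_n) < 1 throughout (condition (55)).
L0, L, eta = 0.5, 0.6, 0.01

t, s = 0.0, eta
seq = [t]
for _ in range(25):
    d = 1 - L0 * t                                   # 1 - v0(t_n) > 0
    u = s + 5 * (L * (s - t) / 2) * (s - t) / d
    t_next = u + ((1 + L0 * (u + (u - s) / 2)) * (u - s)
                  + 3 * (L * (s - t) / 2) * (s - t)) / d
    s_next = t_next + ((L * (t_next - t) / 2) * (t_next - t)
                       + (1 + L0 * t / 2) * (t_next - s)) / (1 - L0 * t_next)
    assert t <= s <= u <= t_next <= s_next           # monotonicity, cf. (33)
    t, s = t_next, s_next
    seq.append(t)

# The sequence settles to its least upper bound t_*, as Lemma 2 predicts.
print(seq[-1])
```

For these constants the iterates stagnate after a few steps, giving a numerical value for the limit $t_*$ that bounds the region containing all iterates of method (23).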

**Lemma 2.** *Suppose that* ∀ *n* = 0, 1, 2, . . .

$$v\_0(t\_n) < 1\tag{55}$$

*and there exists τ* ∈ [0, *τ*0) *such that*

$$t_n \le \tau. \tag{56}$$

*Then, sequence* {*tn*} *converges to some t*<sup>∗</sup> ∈ [0, *τ*].

**Proof.** It follows by (54)–(56) that sequence {*tn*} is non-decreasing and bounded from above by *τ*. Hence, it converges to its unique least upper bound *t*∗.

Next, the operator *F* is related to the scalar functions. Suppose the following:


(h4) Conditions of Lemma 2 hold.

and

(h5) *U*[*x*0, *t*∗] ⊂ Ω.

We present the semi-local convergence result for the method (23).

**Theorem 4.** *Suppose that conditions (h1)–(h5) hold. Then, sequence* {*xn*} *given by method (23) is well defined, remains in U*[*x*0, *t*∗] *and converges to a solution x*<sup>∗</sup> ∈ *U*[*x*0, *t*∗] *of equation F*(*x*) = 0. *Moreover, the following assertions hold:*

$$\|y_n - x_n\| \le s_n - t_n, \tag{57}$$

$$\|z_n - y_n\| \le u_n - s_n \tag{58}$$

*and*

$$\|\mathbf{x}\_{n+1} - z\_n\| \le t\_{n+1} - u\_n. \tag{59}$$

**Proof.** Mathematical induction is utilized to show estimates (57)–(59). Using (h1) and method (23) for *n* = 0

$$\|y_0 - x_0\| = \|F'(x_0)^{-1}F(x_0)\| \le \eta = s_0 - t_0 \le t_*.$$

Thus, the iterate *y*<sup>0</sup> ∈ *U*[*x*0, *t*∗] and (57) holds for *n* = 0. Let *u* ∈ *U*[*x*0, *t*∗]. Then, as in Theorem 3, we get

$$\left\| \left| F'(u)^{-1} F'(x\_0) \right| \right\| \le \frac{1}{1 - v\_0(\left| \left| u - x\_0 \right|)}. \tag{60}$$

Hence, if we set *u* = *x*0, iterates *y*0, *z*<sup>0</sup> and *x*<sup>1</sup> are well defined by method (23) for *n* = 0. Suppose iterates *xk*, *yk*, *zk*, *xk*<sup>+</sup><sup>1</sup> also exist for all integer values *k* smaller than *n*. Then, we have the estimates

$$\begin{aligned} \|z_n - y_n\| &= 5\|F'(x_n)^{-1}F(y_n)\| \\ &\le \frac{5\int_0^1 v(\theta\|y_n - x_n\|)d\theta\,\|y_n - x_n\|}{1 - v_0(\|x_n - x_0\|)} \\ &\le \frac{5\int_0^1 v(\theta(s_n - t_n))d\theta\,(s_n - t_n)}{1 - v_0(t_n)} = u_n - s_n, \\ \|x_{n+1} - z_n\| &= \Big\|\frac{1}{5}F'(x_n)^{-1}(F(y_n) - F(z_n)) + 3F'(x_n)^{-1}F(y_n)\Big\| \\ &\le \frac{1}{1 - v_0(\|x_n - x_0\|)}\Big[\Big(1 + \frac{1}{5}\int_0^1 v_0(\|z_n - x_0\| + \theta\|z_n - y_n\|)d\theta\Big)\|z_n - y_n\| \\ &\quad + 3\int_0^1 v(\theta\|y_n - x_n\|)d\theta\,\|y_n - x_n\|\Big] \\ &\le t_{n+1} - u_n \end{aligned}$$

and

$$\begin{aligned} \|y_{n+1} - x_{n+1}\| &= \|F'(x_{n+1})^{-1}F(x_{n+1})\| \\ &\le \|F'(x_{n+1})^{-1}F'(x_0)\|\,\|F'(x_0)^{-1}F(x_{n+1})\| \\ &\le \frac{1}{1 - v_0(\|x_{n+1} - x_0\|)}\Big[\int_0^1 v(\theta\|x_{n+1} - x_n\|)\,d\theta\,\|x_{n+1} - x_n\| \\ &\quad + \Big(1 + \int_0^1 v_0(\theta\|x_n - x_0\|)\,d\theta\Big)\|x_{n+1} - y_n\|\Big] \\ &\le s_{n+1} - t_{n+1}, \end{aligned}$$

where we also used

$$\begin{aligned} F(y_n) &= F(y_n) - F(x_n) - F'(x_n)(y_n - x_n) \\ &= \int_0^1 [F'(x_n + \theta(y_n - x_n)) - F'(x_n)](y_n - x_n)\,d\theta, \end{aligned}$$

so

$$\|F'(x_0)^{-1}F(y_n)\| \le \int_0^1 v(\theta\|y_n - x_n\|)\,d\theta\,\|y_n - x_n\|$$

and

$$\begin{aligned} F(x_{n+1}) &= F(x_{n+1}) - F(x_n) - F'(x_n)(y_n - x_n) \\ &\quad - F'(x_n)(x_{n+1} - x_n) + F'(x_n)(x_{n+1} - x_n) \\ &= F(x_{n+1}) - F(x_n) - F'(x_n)(x_{n+1} - x_n) + F'(x_n)(x_{n+1} - y_n), \end{aligned}$$

so

$$\begin{aligned} \|F'(x_0)^{-1}F(x_{n+1})\| &\le \int_0^1 v(\theta\|x_{n+1} - x_n\|)\,d\theta\,\|x_{n+1} - x_n\| \\ &\quad + (1 + v_0(\|x_n - x_0\|))\|x_{n+1} - y_n\| \\ &\le \int_0^1 v(\theta(t_{n+1} - t_n))\,d\theta\,(t_{n+1} - t_n) \\ &\quad + (1 + v_0(t_n))(t_{n+1} - s_n), \\ \|z_n - x_0\| &\le \|z_n - y_n\| + \|y_n - x_0\| \\ &\le u_n - s_n + s_n - t_0 \le t_* \end{aligned} \tag{61}$$

and

$$\begin{aligned} \|x_{n+1} - x_0\| &\le \|x_{n+1} - z_n\| + \|z_n - x_0\| \\ &\le t_{n+1} - u_n + u_n - t_0 \le t_*. \end{aligned}$$

Hence, sequence {*tn*} is majorizing for method (23) and the iterates {*xn*}, {*yn*}, {*zn*} belong to *U*[*x*0, *t*∗]. The sequence {*xn*} is Cauchy in the Banach space *X* and, as such, it converges to some *x*<sup>∗</sup> ∈ *U*[*x*0, *t*∗]. By using the continuity of *F* and letting *n* −→ ∞ in (61), we deduce *F*(*x*∗) = 0.

**Proposition 2.** *Suppose:*

*(i) There exists a solution $x_* \in U(x_0, \rho_2)$ of equation $F(x) = 0$ for some $\rho_2 > 0$.*

*(ii) Condition (h2) holds.*

*(iii) There exists $\rho_3 \ge \rho_2$ such that*

$$\int\_{0}^{1} v\_{0}((1-\theta)\rho\_{2} + \theta\rho\_{3})d\theta < 1. \tag{62}$$

*Set* Ω<sup>4</sup> = Ω ∩ *U*[*x*0, *ρ*3]. *Then, x*<sup>∗</sup> *is the only solution of equation F*(*x*) = 0 *in the region* Ω4.

**Proof.** Let $y_* \in \Omega_4$ with $F(y_*) = 0$. Define the linear operator $Q = \int_0^1 F'(x_* + \theta(y_* - x_*))\,d\theta$. Then, by (h2) and (62), we obtain in turn that

$$\begin{aligned} \|F'(x_0)^{-1}(Q - F'(x_0))\| &\le \int_0^1 v_0((1-\theta)\|x_0 - x_*\| + \theta\|x_0 - y_*\|)\,d\theta \\ &\le \int_0^1 v_0((1-\theta)\rho_2 + \theta\rho_3)\,d\theta < 1. \end{aligned}$$

Thus, *x*<sup>∗</sup> = *y*∗.

The next two examples show how to choose the functions *v*0, *v*, and the parameter *η*.

**Example 4.** *Set $X = \mathbb{R}$. Let us consider a scalar function $F$ defined on the set $\Omega = U[x_0, 1 - \mu]$ for $\mu \in (0, 1)$ by*

$$F(x) = x^3 - \mu.$$

*Choose $x_0 = 1$. Then, the conditions (h1)–(h3) are verified for $\eta = \frac{1-\mu}{3}$, $v_0(t) = (3 - \mu)t$ and $v(t) = 2\left(1 + \frac{1}{3-\mu}\right)t$.*
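As a quick numerical check of Example 4, the sketch below evaluates $\eta$, $v_0$, $v$ and runs the Newton iteration on $F(x) = x^3 - \mu$ from $x_0 = 1$; the value of $\mu$ is an illustrative choice, not taken from the text.

```python
# Illustrative check of Example 4: F(x) = x^3 - mu on Omega = U[x0, 1 - mu],
# with x0 = 1; mu = 0.5 is a sample value.
mu = 0.5
F = lambda x: x**3 - mu
dF = lambda x: 3 * x**2

eta = (1 - mu) / 3                        # |F'(x0)^{-1} F(x0)| at x0 = 1
v0 = lambda t: (3 - mu) * t               # center-Lipschitz majorant function
v = lambda t: 2 * (1 + 1 / (3 - mu)) * t  # restricted Lipschitz majorant function

x = 1.0                                   # starting point x0
for _ in range(20):
    x -= F(x) / dF(x)                     # Newton step x - F'(x)^{-1} F(x)

assert abs(x - mu ** (1 / 3)) < 1e-12     # converged to the root mu^{1/3}
```

With these values the iteration settles on the root $\mu^{1/3}$ after only a few steps, consistent with the quadratic convergence the theory predicts.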

**Example 5.** *Consider $X = C[0, 1]$ and $\Omega = U[0, 1]$. Then the problem [5]*

$$\Xi(0) = 0, \quad \Xi(1) = 1,$$

$$\Xi'' = -\Xi^3 - \iota\,\Xi^2$$

*can also be written as an integral equation of the form*

$$\Xi(q\_2) = q\_2 + \int\_0^1 \Theta(q\_2, q\_1) (\Xi^3(q\_1) + \iota \Xi^2(q\_1)) dq\_1$$

*where ι is a constant and* Θ(*q*2, *q*1) *is the Green's function*

$$\Theta(q_2, q_1) = \begin{cases} q_1(1 - q_2), & q_1 \le q_2 \\ q_2(1 - q_1), & q_2 < q_1. \end{cases}$$

*Consider F* : Ω −→ *X as*

$$[F(x)](q_2) = x(q_2) - q_2 - \int_0^1 \Theta(q_2, q_1)\big(x^3(q_1) + \iota\,x^2(q_1)\big)\,dq_1.$$

*Choose $\Xi_0(q_2) = q_2$ and $\Omega = U(\Xi_0, \rho_0)$. Then, clearly $U(\Xi_0, \rho_0) \subset U(0, \rho_0 + 1)$, since $\|\Xi_0\| = 1$. If $2\iota < 5$, then conditions (C1)–(C3) are satisfied for*

$$w_0(t) = \frac{2\iota + 3\rho_0 + 6}{8}\,t, \qquad w(t) = \frac{\iota + 6\rho_0 + 3}{4}\,t.$$

*Hence, w*0(*t*) ≤ *w*(*t*).
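Example 5 can also be explored numerically. The following sketch (the value of $\iota$ and the grid size are illustrative assumptions) discretizes the integral equation with the Green's function $\Theta$ by trapezoidal quadrature and applies fixed-point iteration from $\Xi_0(q_2) = q_2$.

```python
import numpy as np

# Discretization of the integral equation of Example 5; iota = 0.5 and
# m = 101 grid points are illustrative choices.
iota, m = 0.5, 101
q = np.linspace(0.0, 1.0, m)
h = q[1] - q[0]
Q2, Q1 = np.meshgrid(q, q, indexing="ij")
Theta = np.where(Q1 <= Q2, Q1 * (1 - Q2), Q2 * (1 - Q1))  # Green's function
w = np.full(m, h)
w[0] = w[-1] = h / 2                       # trapezoidal quadrature weights

x = q.copy()                               # initial guess Xi_0(q2) = q2
for _ in range(200):
    x_new = q + Theta @ (w * (x**3 + iota * x**2))
    done = np.max(np.abs(x_new - x)) < 1e-13
    x = x_new
    if done:
        break
```

The computed solution reproduces the boundary conditions, $x(0) = 0$ and $x(1) = 1$, since the corresponding rows of $\Theta$ vanish; for moderate $\iota$ the iteration is a contraction and converges rapidly.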

#### **7. Local Convergence of Method**

The local analysis relies on certain parameters and real functions. Let $L_0$, $L$ and $\alpha$ be positive parameters. Set $T_1 = \left[0, \frac{1}{(2+\alpha)L_0}\right)$, so that $(2+\alpha)L_0 t < 1$ for each $t \in T_1$.

Define the function *<sup>h</sup>*<sup>1</sup> : *<sup>T</sup>*<sup>1</sup> −→ <sup>R</sup> by

$$h_1(t) = \frac{(1+\alpha)Lt}{1 - (2+\alpha)L_0 t}.$$

Notice that parameter *ρ*

$$\rho = \frac{1}{(1+\alpha)L + (2+\alpha)L_0}$$

is the only solution of equation

$$h\_1(t) - 1 = 0$$

in the set *T*1.

Define the parameter *ρ*<sup>0</sup> by

$$\rho_0 = \frac{1}{(2+\alpha)(L_0+L)}.$$

Notice that *ρ*<sup>0</sup> < *ρ*. Set *T*<sup>0</sup> = [0, *ρ*0]. Define the function *<sup>h</sup>*<sup>2</sup> : *<sup>T</sup>*<sup>0</sup> −→ <sup>R</sup> by

$$h\_2(t) = \frac{(2+2\alpha+h\_1(t))Lh\_1(t)t}{1-(2+\alpha)(L\_0+L)t}.$$

The equation

$$h\_2(t) - 1 = 0$$

has a smallest solution $\rho \in T_0 - \{0\}$ by the intermediate value theorem, since $h_2(0) - 1 = -1$ and $h_2(t) \longrightarrow +\infty$ as $t \longrightarrow \rho_0^-$. It shall be shown that $\rho$ is a radius of convergence for method (20). It follows from these definitions that $\forall\, t \in T_0$

$$0 \le (L_0 + L)(2 + \alpha)t < 1 \tag{63}$$

$$0 \le h\_1(t) < 1\tag{64}$$

and

$$0 \le h\_2(t) < 1.\tag{65}$$
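For concrete parameter values, the radius $\rho$ can be computed as the smallest positive solution of $h_2(t) - 1 = 0$ on $(0, \rho_0)$ by bisection, since $h_2(0) - 1 = -1$ and $h_2$ blows up at $\rho_0^-$; the constants $L_0$, $L$, $\alpha$ below are sample values, not taken from the text.

```python
# Bisection for the smallest positive solution of h2(t) - 1 = 0 on (0, rho0).
# L0, L, alpha are illustrative sample values.
L0, L, alpha = 1.0, 1.5, 0.25

def h1(t):
    return (1 + alpha) * L * t / (1 - (2 + alpha) * L0 * t)

def h2(t):
    return (2 + 2 * alpha + h1(t)) * L * h1(t) * t / (1 - (2 + alpha) * (L0 + L) * t)

rho0 = 1 / ((2 + alpha) * (L0 + L))   # right endpoint of T0
lo, hi = 0.0, rho0 * (1 - 1e-12)      # h2 - 1 changes sign on this interval
for _ in range(200):                  # bisection; h2 is increasing on (0, rho0)
    mid = 0.5 * (lo + hi)
    if h2(mid) < 1:
        lo = mid
    else:
        hi = mid
rho = 0.5 * (lo + hi)
```

Since $h_2$ is increasing on $(0, \rho_0)$ (increasing numerator, decreasing positive denominator), bisection locates the unique root reliably.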

The following conditions are used:

(C1) There exists a solution $x_* \in \Omega$ of equation $F(x) = 0$ such that $F'(x_*)^{-1} \in L(X, X)$.

(C2) There exist positive parameters $L_0$ and $\alpha$ such that $\forall\, v, z \in \Omega$

$$\|F'(x_*)^{-1}([v, z; F] - F'(x_*))\| \le L_0(\|v - x_*\| + \|z - x_*\|)$$

and

$$\|F(x)\| \le \alpha\|x - x_*\|.$$

Set $\Omega_1 = U(x_*, \rho) \cap \Omega$.

(C3) There exists a positive constant $L$ such that $\forall\, x, y, v, z \in \Omega_1$

$$\|F'(x_*)^{-1}([x, y; F] - [v, z; F])\| \le L(\|x - v\| + \|y - z\|)$$

(C4) $U[x_*, \rho] \subset \Omega$.

Next, the local convergence of method (20) is presented using the preceding terminology and conditions.

**Theorem 5.** *Under conditions (C1)–(C4), further suppose that x*<sup>0</sup> ∈ *U*(*x*∗, *ρ*). *Then, the sequence* {*xn*} *generated by method (20) is well defined in U*(*x*∗, *ρ*), *stays in U*(*x*∗, *ρ*) ∀*n* = 0, 1, 2, ... *and is convergent to x*<sup>∗</sup> *so that*

$$\|y_n - x_*\| \le h_1(\|x_n - x_*\|)\|x_n - x_*\| \le \|x_n - x_*\| < \rho \tag{66}$$

*and*

$$\|x_{n+1} - x_*\| \le h_2(\|x_n - x_*\|)\|x_n - x_*\| \le \|x_n - x_*\|, \tag{67}$$

*where the functions h*1, *h*<sup>2</sup> *and the radius ρ are defined previously.*

**Proof.** It follows by method (20), (C1), (C2) and *x*<sup>0</sup> ∈ *U*(*x*∗, *ρ*) in turn that

$$\begin{aligned} \|F'(x_*)^{-1}(A_0 - F'(x_*))\| &= \|F'(x_*)^{-1}([x_0, x_0 + F(x_0); F] - F'(x_*))\| \\ &\le L_0(2\|x_0 - x_*\| + \|F(x_0) - F(x_*)\|) \\ &\le L_0(2 + \alpha)\|x_0 - x_*\| \\ &< L_0(2 + \alpha)\rho. \end{aligned} \tag{68}$$

It follows by (68) and the Banach lemma on invertible operators [24] that $A_0^{-1} \in L(X, X)$ and

$$\|A_0^{-1}F'(x_*)\| \le \frac{1}{1 - (2 + \alpha)L_0\|x_0 - x_*\|}. \tag{69}$$

Hence, the iterate *y*<sup>0</sup> exists by the first substep of method (20) for *n* = 0. It follows from the first substep of method (20), (C2) and (C3), that

$$\begin{aligned} \|y_0 - x_*\| &= \|x_0 - x_* - A_0^{-1}F(x_0)\| \\ &= \|A_0^{-1}F'(x_*)\,F'(x_*)^{-1}(A_0 - [x_0, x_*; F])(x_0 - x_*)\| \\ &\le \|A_0^{-1}F'(x_*)\|\,\|F'(x_*)^{-1}(A_0 - [x_0, x_*; F])\|\,\|x_0 - x_*\| \\ &\le \frac{L(\|x_0 - x_*\| + \|F(x_0) - F(x_*)\|)}{1 - L_0(2 + \alpha)\|x_0 - x_*\|}\,\|x_0 - x_*\| \\ &\le h_1(\|x_0 - x_*\|)\|x_0 - x_*\| \le \|x_0 - x_*\| < \rho. \end{aligned} \tag{70}$$

Thus, the iterate *y*<sup>0</sup> ∈ *U*(*x*∗, *ρ*) and (66) holds for *n* = 0. Similarly, by the second substep of method (20), we have

$$\begin{aligned} \|F'(x_*)^{-1}(B_0 - F'(x_*))\| &= \|F'(x_*)^{-1}([y_0, w_0; F] + [y_0, x_0; F] - [x_0, w_0; F] - [x_*, x_*; F])\| \\ &\le L\|y_0 - w_0\| + L_0(\|y_0 - x_*\| + \|w_0 - x_*\|) \\ &\le L(\|y_0 - x_*\| + \|w_0 - x_*\|) + L_0(\|y_0 - x_*\| + \|w_0 - x_*\|) \\ &\le (L + L_0)(2 + \alpha)\rho \le \frac{L + L_0}{L + L_0} = 1. \end{aligned} \tag{71}$$

Hence, $B_0^{-1} \in L(X, X)$ and

$$\|B_0^{-1}F'(x_*)\| \le \frac{1}{1 - (L + L_0)(2 + \alpha)\|x_0 - x_*\|}. \tag{72}$$

Thus, the iterate *x*<sup>1</sup> exists by the second sub-step of method (20). Then, as in (70) we obtain in turn that

$$\begin{aligned} \|x_1 - x_*\| &= \|y_0 - x_* - B_0^{-1}F(y_0)\| \\ &\le \|B_0^{-1}F'(x_*)\|\,\|F'(x_*)^{-1}(B_0 - [y_0, x_*; F])\|\,\|y_0 - x_*\| \\ &\le \frac{\|F'(x_*)^{-1}([y_0, w_0; F] + [y_0, x_0; F] - [x_0, w_0; F] - [y_0, x_*; F])\|}{1 - (L + L_0)(2 + \alpha)\|x_0 - x_*\|}\,\|y_0 - x_*\| \\ &\le \frac{L(2 + 2\alpha + h_1(\|x_0 - x_*\|))\|x_0 - x_*\|}{1 - (L + L_0)(2 + \alpha)\|x_0 - x_*\|}\,h_1(\|x_0 - x_*\|)\,\|x_0 - x_*\| \\ &\le h_2(\|x_0 - x_*\|)\|x_0 - x_*\| \le \|x_0 - x_*\| < \rho. \end{aligned} \tag{73}$$

Therefore, the iterate *x*<sup>1</sup> ∈ *U*(*x*∗, *ρ*) and (67) holds for *n* = 0.

Simply replace *x*0, *y*0, *x*<sup>1</sup> by *xm*, *ym*, *xm*+<sup>1</sup> ∀*m* = 0, 1, 2 ... in the preceding calculations to complete the induction for (66) and (67). It then follows from the estimate

$$\|x_{m+1} - x_*\| \le \mu\|x_m - x_*\| < \rho, \tag{74}$$

where $\mu = h_2(\|x_0 - x_*\|) \in [0, 1)$, that $x_{m+1} \in U(x_*, \rho)$ and $\lim_{m\to\infty} x_m = x_*$.

Concerning the uniqueness of the solution *x*<sup>∗</sup> (not given in [9]), we provide the following result.

**Proposition 3.** *Suppose:*

*(i) The point $x_* \in U(x_*, r) \subset \Omega$ is a simple solution of equation $F(x) = 0$ for some $r > 0$.*

*(ii) There exists a positive parameter $L_1$ such that $\forall\, y \in \Omega$*

$$\|F'(x_*)^{-1}([x_*, y; F] - F'(x_*))\| \le L_1\|y - x_*\|. \tag{75}$$

*(iii) There exists r*<sup>1</sup> ≥ *r such that*

$$L\_1 r\_1 < 1.\tag{76}$$

*Set* Ω<sup>2</sup> = *U*[*x*∗,*r*1] ∩ Ω. *Then, x*<sup>∗</sup> *is the only solution of equation F*(*x*) = 0 *in the set* Ω2.

**Proof.** Set $P = [x_*, y_*; F]$ for some $y_* \in \Omega_2$ with $F(y_*) = 0$. It follows by (i), (75) and (76) that

$$\|F'(x_*)^{-1}(P - F'(x_*))\| \le L_1\|y_* - x_*\| < 1.$$

Thus, we conclude $x_* = y_*$ by the invertibility of $P$ and the identity $P(x_* - y_*) = F(x_*) - F(y_*) = 0$.

**Remark 4.** *(i) Notice that not all conditions of Theorem 5 are used in Proposition 3. If they were, then we could set $r_1 = \rho$.*

*(ii) By the definition of set* Ω<sup>1</sup> *we have*

$$
\Omega\_1 \subset \Omega.\tag{77}
$$

*Therefore, the parameter*

$$L \le L_2, \tag{78}$$

*where L*<sup>2</sup> *is the corresponding Lipschitz constant in [1,3,9,19] appearing in the condition* ∀*x*, *y*, *z* ∈ Ω

$$\|F'(\mathbf{x}\_\*)^{-1}([\mathbf{x}, y; F] - [v, z; F])\| \le L\_2(\|\mathbf{x} - \mathbf{v}\| + \|y - z\|). \tag{79}$$

*Thus, the radius of convergence $R_0$ in [1,7,8,20] uses $L_2$ instead of $L$. That is, by (78),*

$$R\_0 \le \rho.\tag{80}$$

*Examples where (77), (78) and (80) are strict can be found in [2,5,11–13,15,21–24].*

#### **8. Majorizing Sequences for Method**

Let $K_0$, $K$ be given positive parameters with $K_0 \le K$, let $\delta \in [0, 1)$, $\eta \ge 0$, and set $T = [0, 1)$. Consider recurrent polynomials defined on the interval $T$ for $n = 1, 2, \ldots$ by

$$\begin{aligned} f_n^{(1)}(t) &= Kt^{2n}\eta + Kt^{2n-1}\eta + 2K_0(1 + t + \ldots + t^{2n+1})\eta \\ &\quad + K_0(t^{2n+1} + 2t^{2n})t^{2n+1}\eta + \delta - 1, \\ f_n^{(2)}(t) &= Kt^{2n+1}\eta + K(t^{2n+1} + 2t^{2n})t^{2n}\eta \\ &\quad + 2K_0(1 + t + \ldots + t^{2n+2})\eta + \delta - 1, \\ g_n^{(1)}(t) &= Kt^3 + Kt^2 - Kt - K + 2K_0(t^3 + t^4) \\ &\quad + K_0(t^{2n+3} + 2t^{2n+2})t^4\eta - K_0(t^{2n+1} + 2t^{2n})t^2\eta, \\ g_n^{(2)}(t) &= Kt^3 + K(t^3 + 2t^2)t^{2n+2}\eta \\ &\quad + 2K_0(t^3 + t^4) - Kt - K(t + 2)t^{2n}\eta, \\ h_{n+1}^{(1)}(t) &= g_{n+1}^{(1)}(t) - g_n^{(1)}(t), \\ h_{n+1}^{(2)}(t) &= g_{n+1}^{(2)}(t) - g_n^{(2)}(t), \end{aligned}$$

and polynomials

$$g_1(t) = Kt^3 + Kt^2 - Kt - K + 2K_0(t^3 + t^4),$$

$$g_2(t) = Kt^3 + 2K_0(t^3 + t^4) - Kt = g_3(t)\,t, \quad \text{where } g_3(t) = Kt^2 + 2K_0(t^2 + t^3) - K,$$

and

$$g(t) = (t-1)^2(t^5 + 4t^4 + 6t^3 + 6t^2 + 5t + 2).$$

Then, the following auxiliary result connecting these polynomials can be shown.

**Lemma 3.** *The following assertions hold:*

$$f_{n+1}^{(1)}(t) = f_n^{(1)}(t) + g_n^{(1)}(t)t^{2n-1}\eta, \tag{81}$$

$$f_{n+1}^{(2)}(t) = f_n^{(2)}(t) + g_n^{(2)}(t)t^{2n}\eta, \tag{82}$$

$$h_{n+1}^{(1)}(t) = g(t)K_0t^{2n+2}\eta, \tag{83}$$

$$h_{n+1}^{(2)}(t) = g(t)Kt^{2n}\eta, \tag{84}$$

*polynomials $g_1$ and $g_2$ have smallest zeros in the interval $T - \{0\}$, denoted by $\xi_1$ and $\xi_2$, respectively,*

$$h_{n+1}^{(1)}(t) \ge 0 \;\forall\; t \in [0, \xi_1) \tag{85}$$

*and*

$$h\_{n+1}^{(2)}(t) \ge 0 \; \forall \; t \in [0, \xi\_2). \tag{86}$$

*Moreover, define functions on the interval T by*

$$g_\infty^{(1)}(t) = \lim_{n\to\infty} g_n^{(1)}(t) \tag{87}$$

*and*

$$g_\infty^{(2)}(t) = \lim_{n\to\infty} g_n^{(2)}(t). \tag{88}$$

*Then,*

$$g_\infty^{(1)}(t) = g_1(t) \;\forall\; t \in [0, \xi_1), \tag{89}$$

$$g_\infty^{(2)}(t) = g_2(t) \;\forall\; t \in [0, \xi_2), \tag{90}$$

$$f_{n+1}^{(1)}(t) \le f_n^{(1)}(t) + g_1(t)t^{2n-1}\eta \;\forall\; t \in [0, \xi_1), \tag{91}$$

$$f_{n+1}^{(2)}(t) \le f_n^{(2)}(t) + g_2(t)t^{2n}\eta \;\forall\; t \in [0, \xi_2), \tag{92}$$

$$f\_{n+1}^{(1)}(\xi\_1) \le f\_n^{(1)}(\xi\_1),\tag{93}$$

*and*

$$f\_{n+1}^{(2)}(\xi\_2) \le f\_n^{(2)}(\xi\_2). \tag{94}$$

**Proof.** Assertions (81)–(84) hold by the definition of these functions and basic algebra. By the intermediate value theorem, polynomials $g_1$ and $g_3$ have zeros in the interval $T - \{0\}$, since $g_1(0) = -K$, $g_1(1) = 4K_0$, $g_3(0) = -K$ and $g_3(1) = 4K_0$. Then, assertions (85) and (86) follow by the definition of these polynomials and zeros $\xi_1$ and $\xi_2$. Next, assertions (91)–(94) also follow from (87), (88) and the definition of these polynomials.
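Identities (83) and (84) can be spot-checked numerically from the definitions of $g_n^{(1)}$, $g_n^{(2)}$ and $g$ above (reading the second exponent in $g_n^{(1)}$ as $2t^{2n+2}$); the values of $K$, $K_0$, $\eta$ and the sample points are illustrative.

```python
# Numerical spot-check of the Lemma 3 identities (83)-(84) at sample values.
K, K0, eta = 0.7, 0.3, 0.2

def g1n(n, t):  # g_n^{(1)}(t)
    return (K*t**3 + K*t**2 - K*t - K + 2*K0*(t**3 + t**4)
            + K0*(t**(2*n + 3) + 2*t**(2*n + 2))*t**4*eta
            - K0*(t**(2*n + 1) + 2*t**(2*n))*t**2*eta)

def g2n(n, t):  # g_n^{(2)}(t)
    return (K*t**3 + K*(t**3 + 2*t**2)*t**(2*n + 2)*eta
            + 2*K0*(t**3 + t**4) - K*t - K*(t + 2)*t**(2*n)*eta)

def g(t):
    return (t - 1)**2 * (t**5 + 4*t**4 + 6*t**3 + 6*t**2 + 5*t + 2)

for n in (1, 2, 3):
    for t in (0.1, 0.5, 0.9):
        # h_{n+1}^{(1)}(t) = g(t) K0 t^{2n+2} eta   and
        # h_{n+1}^{(2)}(t) = g(t) K  t^{2n}   eta
        assert abs((g1n(n + 1, t) - g1n(n, t)) - g(t)*K0*t**(2*n + 2)*eta) < 1e-12
        assert abs((g2n(n + 1, t) - g2n(n, t)) - g(t)*K*t**(2*n)*eta) < 1e-12
```

Both differences agree with the closed forms to machine precision, consistent with the "basic algebra" step of the proof.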

The preceding result is connected to the scalar sequence defined $\forall\, n = 0, 1, 2, \ldots$ by $t_0 = 0$, $s_0 = \eta$,

$$\begin{aligned} t_1 &= s_0 + \frac{K(\eta + \delta)\eta}{1 - K_0(2\eta + \delta)}, \\ s_{n+1} &= t_{n+1} + \frac{K(t_{n+1} - t_n + s_n - t_n)(t_{n+1} - s_n)}{1 - K_0(2t_{n+1} + \gamma_n + \delta)}, \\ t_{n+2} &= s_{n+1} + \frac{K(s_{n+1} - t_{n+1} + \gamma_n)(s_{n+1} - t_{n+1})}{1 - K_0(2s_{n+1} + \delta)}, \end{aligned} \tag{95}$$

where $\gamma_n = K(t_{n+1} - t_n + s_n - t_n)(t_{n+1} - s_n)$ and $\delta \ge \gamma_0$.

Moreover, define the parameters $\xi_1 = \frac{K(s_1 - t_1 + \gamma_0)}{1 - K_0(2s_1 + \delta)}$, $\xi_2 = \frac{K(t_1 + s_0)}{1 - K_0(2t_1 + \gamma_0 + \delta)}$ and $a = \max\{\xi_1, \xi_2\}$. Then, the first convergence result for sequence $\{t_n\}$ follows.
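For illustration, sequence (95) is straightforward to generate numerically; the parameters $K_0$, $K$, $\eta$, $\delta$ below are sample values chosen so that all denominators stay positive.

```python
# Computation of the majorizing sequence (95) for sample parameter values.
K0, K, eta, delta = 0.4, 0.5, 0.1, 0.05

t = [0.0]   # t_0
s = [eta]   # s_0
t.append(s[0] + K * (eta + delta) * eta / (1 - K0 * (2 * eta + delta)))  # t_1
for n in range(50):
    gamma = K * (t[n + 1] - t[n] + s[n] - t[n]) * (t[n + 1] - s[n])      # gamma_n
    s.append(t[n + 1]
             + K * (t[n + 1] - t[n] + s[n] - t[n]) * (t[n + 1] - s[n])
             / (1 - K0 * (2 * t[n + 1] + gamma + delta)))
    t.append(s[n + 1]
             + K * (s[n + 1] - t[n + 1] + gamma) * (s[n + 1] - t[n + 1])
             / (1 - K0 * (2 * s[n + 1] + delta)))

t_star = t[-1]   # numerical approximation of the limit t_*
```

For these values the increments shrink rapidly and the sequence is non-decreasing and interlaced, $t_n \le s_n \le t_{n+1}$, as Lemma 4 asserts.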

**Lemma 4.** *Suppose*

$$K\eta \le 1, \quad 0 < \xi_1, \quad 0 < \xi_2, \quad a < \xi < 1, \tag{96}$$

$$f_1^{(1)}(\xi_1) \le 0 \tag{97}$$

*and*

$$f_1^{(2)}(\xi_2) \le 0. \tag{98}$$

*Then, scalar sequence $\{t_n\}$ is non-decreasing, bounded from above by $t_{**} = \frac{\eta}{1 - \xi}$, and converges to its unique least upper bound $t_* \in [0, t_{**}]$. Moreover, the following error bounds hold*

$$0 < t_{n+1} - s_n \le \xi(s_n - t_n) \le \xi^{2n+1}\eta, \tag{99}$$

$$0 < s_n - t_n \le \xi(t_n - s_{n-1}) \le \xi^{2n}\eta \tag{100}$$

*and*

$$\gamma_{n+1} \le \gamma_n \le \gamma_0. \tag{101}$$

**Proof.** Assertions (99)–(101) hold if we show using induction that

$$0 < \frac{K(t_{n+1} - t_n + s_n - t_n)}{1 - K_0(2t_{n+1} + \gamma_n + \delta)} \le \xi_1, \tag{102}$$

$$0 < \frac{K(s_{n+1} - t_{n+1} + \gamma_n)}{1 - K_0(2s_{n+1} + \delta)} \le \xi_2, \tag{103}$$

and

$$t\_n \le s\_n \le t\_{n+1}.\tag{104}$$

By the definition of *t*1, we obtain

$$\frac{t_1}{s_0} = 1 + \frac{K(\eta + \delta)}{1 - K_0(2\eta + \delta)} > 1,$$

so *s*<sup>0</sup> < *t*1, and (103) holds for *n* = 0. Suppose assertions (101)–(103) hold for each *m* = 0, 1, 2, 3, . . . , *n*. By (99) and (100) we have

$$\begin{aligned} s_m &\le t_m + \xi^{2m}\eta \le s_{m-1} + \xi^{2m-1}\eta + \xi^{2m}\eta \\ &\le \eta + \xi\eta + \cdots + \xi^{2m}\eta \\ &= \frac{1 - \xi^{2m+1}}{1 - \xi}\eta \le t_{**} \end{aligned} \tag{105}$$

and

$$\begin{aligned} t_{m+1} &\le s_m + \xi^{2m+1}\eta \le t_m + \xi^{2m+1}\eta + \xi^{2m}\eta \\ &\le \eta + \xi\eta + \cdots + \xi^{2m+1}\eta \\ &= \frac{1 - \xi^{2m+2}}{1 - \xi}\eta \le t_{**}. \end{aligned} \tag{106}$$

By the induction hypotheses, sequences $\{t_m\}$, $\{s_m\}$ are increasing. Evidently, estimate (102) holds if

$$K\xi^{2m+1}\eta + K\xi^{2m}\eta + 2K_0\xi\frac{1 - \xi^{2m+2}}{1 - \xi}\eta + K_0\xi\delta + \xi\gamma_m K_0 - \xi \le 0$$

or

$$f_m^{(1)}(t) \le 0 \ \text{at} \ t = \xi_1, \tag{107}$$

where $\gamma_m \le K(\xi^{2m+1} + 2\xi^{2m})\xi^{2m+1}\eta^2$. By (91), (93), and (97), estimate (107) holds. Similarly, assertion (103) holds if

$$K\xi^{2m+2}\eta + K^2(\xi^{2m+1}\eta + 2\xi^{2m}\eta)\xi^{2m+1}\eta + 2\xi K_0(1 + \xi + \cdots + \xi^{2m+2})\eta + \delta\xi - \xi \le 0$$

or

$$f_m^{(2)}(t) \le 0 \ \text{at} \ t = \xi_2. \tag{108}$$

By (92) and (94), assertion (108) holds. Hence, (100) and (103) also hold. Notice that $\gamma_n$ can be written as $\gamma_n = K(E_n + E_n^1)E_n^2$, where $E_n = t_{n+1} - t_n > 0$, $E_n^1 = s_n - t_n$, and $E_n^2 = t_{n+1} - s_n > 0$. Hence, we get

$$E_{n+1} - E_n = t_{n+2} - 2t_{n+1} + t_n \le \xi^{2n}(\xi^2 - 1)(\xi + 1)\eta < 0,$$

$$E_{n+1}^1 - E_n^1 = s_{n+1} - t_{n+1} - (s_n - t_n) \le \xi^{2n}(\xi^2 - 1)\eta < 0,$$

and

$$E_{n+1}^2 - E_n^2 = t_{n+2} - s_{n+1} - (t_{n+1} - s_n) \le \xi^{2n+1}(\xi^2 - 1)\eta < 0,$$

so

$$
\gamma\_{n+1} \le \gamma\_n \le \gamma\_0.
$$

It follows that sequence {*tn*} is non-decreasing, bounded from above by *t*∗∗. Thus, it converges to *t*∗.

Next, a second convergence result for sequence (95) is presented, whose sufficient criteria are weaker but more difficult to verify than those of Lemma 4.

**Lemma 5.** *Suppose*

$$K\_0 \delta < 1,\tag{109}$$

$$K_0(2t_{n+1} + \gamma_n + \delta) < 1, \tag{110}$$

*and*

$$K_0(2s_{n+1} + \delta) < 1 \tag{111}$$

*hold. Then, sequence $\{t_n\}$ is increasing and bounded from above by $t_1^{**} = \frac{1 - K_0\delta}{2K_0}$, so it converges to its unique least upper bound $t_1^* \in [0, t_1^{**}]$.*

**Proof.** It follows from the definition of sequence (95), and conditions (109)–(111).

#### **9. Semi-Local Convergence of Method**

The conditions (C) shall be used in the semi-local convergence analysis of method (20).

Suppose

(C1) There exist $x_0 \in \Omega$, $\eta \ge 0$, $\delta \in [0, 1)$ such that $A_0^{-1} \in L(X, X)$, $\|A_0^{-1}F(x_0)\| \le \eta$, and $\|F(x_0)\| \le \delta$.

(C2) There exists *K*<sup>0</sup> > 0 such that for all *u*, *v* ∈ Ω

$$\|A_0^{-1}([u, v; F] - A_0)\| \le K_0(\|u - x_0\| + \|v - w_0\|).$$

Set $\Omega_0 = U\!\left(x_0, \frac{1 - K_0\delta}{2K_0}\right) \cap \Omega$ for $K_0\delta < 1$.

(C3) There exists *K* > 0 such that for all *u*, *v*, *u*¯, *v*¯ ∈ Ω<sup>0</sup>

$$\|A_0^{-1}([u, v; F] - [\bar{u}, \bar{v}; F])\| \le K(\|u - \bar{u}\| + \|v - \bar{v}\|).$$

(C4) $U[x_0, \rho + \delta] \subset \Omega$, where $\rho = t_* + \gamma_0$ or $t_{**}$ if the conditions of Lemma 4 hold, and $\rho = t_1^* + \gamma_0$ or $t_1^{**}$ if the conditions of Lemma 5 hold.

**Remark 5.** *The results in [19] are given in non-affine form. The benefits of using affine invariant results over non-affine ones are well known [1,5,11,21]. In particular, they assumed $\|A_0^{-1}\| \le \beta$ and that (C3)' $\|[x, y; F] - [\bar{x}, \bar{y}; F]\| \le \bar{K}(\|x - \bar{x}\| + \|y - \bar{y}\|)$ holds for all $x, y, \bar{x}, \bar{y} \in \Omega$. By the definition of the set $\Omega_0$, we get*

$$
\Omega\_0 \subset \Omega,\tag{112}
$$

*so*

$$K_0 \le \beta\bar{K} \tag{113}$$

*and*

$$K \le \beta\bar{K}. \tag{114}$$

*Hence, $K_0$ and $K$ can replace $\beta\bar{K}$ in the results in [19]. Notice also that using (C3)' they estimated*

$$\|B_{n+1}^{-1}A_0\| \le \frac{1}{1 - \beta\bar{K}(2\bar{s}_{n+1} + \delta)} \tag{115}$$

*and*

$$\|A_{n+1}^{-1}A_0\| \le \frac{1}{1 - \beta\bar{K}(\bar{t}_{n+1} - \bar{t}_0 + \bar{\gamma}_n + \delta)}, \tag{116}$$

*where $\{\bar{t}_n\}$, $\{\bar{s}_n\}$ are defined for $n = 0, 1, 2, \ldots$ by $\bar{t}_0 = 0$, $\bar{s}_0 = \eta$,*

$$\begin{aligned} \bar{t}_1 &= \bar{s}_0 + \frac{\beta\bar{K}(\eta + \delta)\eta}{1 - \beta\bar{K}(2\bar{s}_0 + \delta)}, \\ \bar{s}_{n+1} &= \bar{t}_{n+1} + \frac{\beta\bar{K}(\bar{t}_{n+1} - \bar{t}_n + \bar{s}_n - \bar{t}_n)(\bar{t}_{n+1} - \bar{s}_n)}{1 - \beta\bar{K}(2\bar{t}_{n+1} + \bar{\gamma}_n + \delta)}, \\ \bar{t}_{n+2} &= \bar{s}_{n+1} + \frac{\beta\bar{K}(\bar{s}_{n+1} - \bar{t}_{n+1} + \bar{\gamma}_n)(\bar{s}_{n+1} - \bar{t}_{n+1})}{1 - \beta\bar{K}(2\bar{s}_{n+1} + \delta)}, \end{aligned} \tag{117}$$

*where $\bar{\gamma}_n = \bar{K}(\bar{t}_{n+1} - \bar{t}_n + \bar{s}_n - \bar{t}_n)(\bar{t}_{n+1} - \bar{s}_n)$, $\delta \ge \bar{\gamma}_0$. But using the weaker condition (C2), we obtain, respectively,*

$$\|B_{n+1}^{-1}A_0\| \le \frac{1}{1 - K_0(2s_{n+1} + \delta)} \tag{118}$$

*and*

$$\|A_{n+1}^{-1}A_0\| \le \frac{1}{1 - K_0(t_{n+1} - t_0 + \gamma_n + \delta)}, \tag{119}$$

*which are tighter estimates than (115) and (116), respectively. Hence, $K_0$ and $K$ can replace $\beta\bar{K}$, and (118), (119) can replace (115), (116), respectively, in the proof of Theorem 3 in [19]. Examples where (112)–(114) are strict can be found in [1,5,11,21]. Simple induction shows that*

$$0 < s\_n - t\_n \le \overline{s}\_n - \overline{t}\_n \tag{120}$$

$$0 < t\_{n+1} - s\_n \le \overline{t}\_{n+1} - \overline{s}\_n \tag{121}$$

*and*

$$t\_\* \le \overline{t}^\* = \lim\_{n \to \infty} \overline{t}\_n. \tag{122}$$

*These estimates justify the claims made in the introduction of this work. The local results in [19] can also be extended along the same lines using our technique.*

Next, we present the semi-local convergence result for the method (20).

**Theorem 6.** *Suppose that conditions (C) hold. Then, iteration* {*xn*} *generated by method (20) exists in U*[*x*0, *t*∗], *remains in U*[*x*0, *t*∗] *and* lim*n*−→<sup>∞</sup> *xn* = *x*<sup>∗</sup> ∈ *U*[*x*0, *t*∗] *with F*(*x*∗) = 0, *so that*

$$\|x_n - x_*\| \le t_* - t_{n-1}.$$

**Proof.** It follows from the comment above Theorem 6.

Next, we present the uniqueness of the solution result, where conditions (C) are not necessarily utilized.

**Proposition 4.** *Suppose the following:*

*(i) There exists a simple solution $x_* \in U(x_0, r) \subset \Omega$ for some $r > 0$.*

*(ii) Condition (C2) holds.*

*(iii) There exists $r^* \ge r$ such that $K_0(r + r^* + \delta) < 1$.*

*Set $\Omega_1 = U\!\left(x_0, \frac{1 - K_0(\delta + r)}{K_0}\right) \cap \Omega$. Then, the element $x_*$ is the only solution of equation $F(x) = 0$ in the region $\Omega_1$.*

**Proof.** Let *z*<sup>∗</sup> ∈ Ω<sup>1</sup> with *F*(*z*∗) = 0. Define *Q* = [*x*∗, *z*∗; *F*]. Then, in view of (ii) and (iii),

$$\|A_0^{-1}(Q - A_0)\| \le K_0(\|x_* - x_0\| + \|z_* - w_0\|) \le K_0(r + r^* + \delta) < 1.$$

Therefore, we conclude $z_* = x_*$ as a consequence of the invertibility of $Q$ and the identity $Q(x_* - z_*) = F(x_*) - F(z_*) = 0$.

**Remark 6.** *(i) Notice that r can be chosen to be t*∗.

*(ii) The results can be extended further as follows. Replace (C3) by $\|A_0^{-1}([u, v; F] - [\bar{u}, \bar{v}; F])\| \le \tilde{K}(\|u - \bar{u}\| + \|v - \bar{v}\|)$, $\forall\, u, \bar{u} \in \Omega_0$, where $v = u - A(u)^{-1}F(u)$ and $\bar{v} = \bar{u} - A(\bar{u})^{-1}F(\bar{u})$. Then, we have (iii) $\tilde{K} \le K$.*

*Another way is to define the set $\Omega_2 = U\!\left(x_1, \frac{1 - K_0(\delta + \gamma_0)}{2K_0} - \eta\right)$ provided that $K_0(\delta + \gamma_0) < 1$. Moreover, suppose $\Omega_2 \subset \Omega$. Then, $\Omega_2 \subset \Omega_0$, and condition (C3) holds on $\Omega_2$, say, with constant $\tilde{K}_0$. Then, we have that*

$$\tilde{K}_0 \le K$$

*also holds. Hence, the tighter $\tilde{K}$ or $\tilde{K}_0$ can replace $K$ in Theorem 6.*

#### **10. Conclusions**

The convergence analysis is developed for generalized three-step numerical methods. The advantages of the new approach include weaker convergence criteria and a uniform set of conditions utilizing information on these methods, in contrast to earlier works on special cases of these methods, where the existence of high-order derivatives is assumed to prove convergence. The methodology is very general and does not depend on the particular method, which is why it can also be applied to multi-step and other numerical methods; this shall be the topic of future work.

The weak point of this methodology is that the computation of the majorant functions "*h*" at this level of generality is hard in general. Notice that this is not the case for the special cases of method (2) or method (3) given below them (see, for example, Examples 4 and 5). As far as we know, there is no other methodology that can be compared to the one introduced in this article to handle the semi-local or the local convergence of method (2) or method (3) at this generality.

**Author Contributions:** Conceptualization, M.I.A., I.K.A., S.R. and S.G.; methodology, M.I.A., I.K.A., S.R. and S.G.; software, M.I.A., I.K.A., S.R. and S.G.; validation, M.I.A., I.K.A., S.R. and S.G.; formal analysis, M.I.A., I.K.A., S.R. and S.G.; investigation, M.I.A., I.K.A., S.R. and S.G.; resources, M.I.A., I.K.A., S.R. and S.G.; data curation, M.I.A., I.K.A., S.R. and S.G.; writing—original draft preparation, M.I.A., I.K.A., S.R. and S.G.; writing—review and editing, M.I.A., I.K.A., S.R. and S.G.; visualization, M.I.A., I.K.A., S.R. and S.G.; supervision, M.I.A., I.K.A., S.R. and S.G.; project administration, M.I.A., I.K.A., S.R. and S.G.; funding acquisition, M.I.A., I.K.A., S.R. and S.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **A Methodology for Obtaining the Different Convergence Orders of Numerical Method under Weaker Conditions**

**Ioannis K. Argyros 1, Samundra Regmi 2, Stepan Shakhno 3,\* and Halyna Yarmola <sup>4</sup>**


**Abstract:** A process for solving an algebraic equation was presented by Newton in 1669 and later by Raphson in 1690. This technique is called Newton's method or Newton–Raphson method and is even today a popular technique for solving nonlinear equations in abstract spaces. The objective of this article is to update developments in the convergence of this method. In particular, it is shown that the Kantorovich theory for solving nonlinear equations using Newton's method can be replaced by a finer one with no additional and even weaker conditions. Moreover, the convergence order two is proven under these conditions. Furthermore, the new ratio of convergence is at least as small. The same methodology can be used to extend the applicability of other numerical methods. Numerical experiments complement this study.

**Keywords:** nonlinear equation; criterion; integral equation; convergence

**MSC:** 49M15; 47H17; 65G99; 65H10; 65N12; 58C15

#### **1. Introduction**

Let U and V be Banach spaces, and let *L*(U, V) stand for the space of all continuous linear operators mapping U into V. Consider a Fréchet-differentiable operator L : *D* ⊆ U −→ V and its corresponding nonlinear equation

$$
\mathcal{L}(\mathbf{x}) = \mathbf{0},
\tag{1}
$$

with *D* denoting a nonempty open set. The task of determining a solution *x*∗ ∈ *D* is very challenging but important, since applications from numerous computational disciplines can be brought into the form (1) [1,2]. The analytic form of *x*∗ is rarely attainable. That is why numerical methods are mainly used to generate approximations to the solution *x*∗. Most of them are based on Newton's method [3–7]. Moreover, authors have developed efficient high-order and multi-step algorithms using derivatives [8–13] and divided differences [14–18].

Among these processes, the most widely used is Newton's method and its variants. In particular, Newton's Method (NM) is defined as

$$\mathbf{x}\_0 \in D, \mathbf{x}\_{n+1} = \mathbf{x}\_n - \mathcal{L}'(\mathbf{x}\_n)^{-1} \mathcal{L}(\mathbf{x}\_n) \; \forall \; n = 0, 1, 2, \dots \tag{2}$$
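Iteration (2) is straightforward to sketch for a finite-dimensional system. The following minimal Python illustration (not the article's code; the test operator is a two-component version of the diagonal system used in Example 4 below) solves the linearized equation at each step rather than forming an explicit inverse:

```python
import numpy as np

def newton(L, dL, x0, tol=1e-12, max_iter=50):
    """Iteration (2): x_{n+1} = x_n - L'(x_n)^{-1} L(x_n)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Solve L'(x_n) * step = L(x_n) instead of inverting L'(x_n)
        step = np.linalg.solve(dL(x), L(x))
        x = x - step
        if np.linalg.norm(step, np.inf) <= tol:
            break
    return x

# Illustrative system with solution (0, 0): e^{x1} - 1 = 0, x2^3 + x2 = 0
L = lambda x: np.array([np.exp(x[0]) - 1.0, x[1]**3 + x[1]])
dL = lambda x: np.array([[np.exp(x[0]), 0.0], [0.0, 3.0*x[1]**2 + 1.0]])

x_star = newton(L, dL, [0.1, 0.1])
print(np.allclose(x_star, [0.0, 0.0]))  # True
```

The linear solve is the standard way to realize L′(*x*ₙ)⁻¹L(*x*ₙ) in practice, since forming the inverse is both slower and less stable.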

There exists a plethora of results related to the study of NM [3,5–7,19–21]. These papers are based on the theory inaugurated by Kantorovich and its variants [21]. Basically, the conditions (K) are used in non-affine or affine invariant form. Suppose:

(K1) ∃ point *x*₀ ∈ *D* and parameter *s* ≥ 0 such that L′(*x*₀)⁻¹ ∈ *L*(V, U) and

**Citation:** Argyros, I.K.; Regmi, S.; Shakhno, S.; Yarmola, H. A Methodology for Obtaining the Different Convergence Orders of Numerical Method under Weaker Conditions. *Mathematics* **2022**, *10*, 2931. https://doi.org/10.3390/ math10162931

Academic Editors: Maria Isabel Berenguer and Manuel Ruiz Galán

Received: 22 July 2022 Accepted: 10 August 2022 Published: 14 August 2022


$$\|\mathcal{L}'(x_0)^{-1}\mathcal{L}(x_0)\| \le s,$$

(K2) ∃ parameter *M*<sup>1</sup> > 0 : Lipschitz condition

$$\|\mathcal{L}'(x_0)^{-1}(\mathcal{L}'(w_1) - \mathcal{L}'(w_2))\| \le M_1 \|w_1 - w_2\|$$

holds ∀*w*₁ ∈ *D* and *w*₂ ∈ *D*,

(K3)

$$s \le \frac{1}{2M\_1}$$

and

(K4) *B*[*x*0, *ρ*] ⊂ *D*, where parameter *ρ* > 0 is given later.

$$\text{Denote } B[\mathbf{x}\_0, r] := \{ \mathbf{x} \in D : \|\mathbf{x} - \mathbf{x}\_0\| \le r \} \text{ for } r > 0. \text{ Set } \rho = r\_1 = \frac{1 - \sqrt{1 - 2M\_1 \mathbf{s}}}{M\_1}.$$

There are many variants of Kantorovich's convergence result for NM. One of these results follows [4,7,20].

**Theorem 1.** *Under conditions (K) for ρ* = *r*1; *NM is contained in B*(*x*0,*r*1), *convergent to a solution x*<sup>∗</sup> ∈ *B*[*x*0,*r*1] *of Equation (1), and*

$$||\mathfrak{x}\_{n+1} - \mathfrak{x}\_n|| \le \mathfrak{u}\_{n+1} - \mathfrak{u}\_n.$$

*Moreover, the convergence is linear if s* = 1/(2*M*₁) *and quadratic if s* < 1/(2*M*₁). *Furthermore, the solution is unique in B*[*x*₀, *r*₁] *in the first case and in B*(*x*₀, *r*₂) *in the second case, where*

$$
r_2 = \frac{1 + \sqrt{1 - 2M_1 s}}{M_1}
$$

*and the scalar sequence* {*uₙ*} *is given by*

$$
u_0 = 0, \quad u_1 = s, \quad u_{n+1} = u_n + \frac{M_1(u_n - u_{n-1})^2}{2(1 - M_1 u_n)}.
$$

A plethora of studies have used conditions (K) [3–5,19,21–23].

**Example 1.** *Consider the cubic polynomial*

$$c(\mathbf{x}) = \mathbf{x}^3 - a$$

*for D* = *B*(*x*₀, 1 − *a*) *and parameter a* ∈ (0, 1/2). *Select the initial point x*₀ = 1. *Conditions (K) give s* = (1 − *a*)/3 *and M*₁ = 2(2 − *a*). *It follows that the estimate*

$$\frac{1-a}{3} > \frac{1}{4(2-a)}$$

*holds* ∀*a* ∈ (0, 1/2). *That is, condition (K3) is not satisfied. Therefore, convergence is not assured by this theorem. However, NM may converge. Hence, clearly, there is a need to improve the results based on the conditions (K).*
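Both claims, that (K3) fails for every *a* ∈ (0, 1/2) while NM nevertheless converges, are easy to confirm numerically; a brief sketch (the sample value *a* = 0.25 is an arbitrary choice):

```python
# (K3) reads s <= 1/(2*M1); for c(x) = x^3 - a with x0 = 1 we have
# s = (1 - a)/3 and M1 = 2*(2 - a), and the criterion fails on (0, 1/2):
for k in range(1, 50):
    a = k / 100                      # a = 0.01, 0.02, ..., 0.49
    s, M1 = (1 - a) / 3, 2 * (2 - a)
    assert s > 1 / (2 * M1)          # (K3) violated at every sampled a

# ...yet Newton's method on c(x) = x^3 - a still converges from x0 = 1
a, x = 0.25, 1.0
for _ in range(50):
    x -= (x**3 - a) / (3 * x**2)
print(abs(x - a ** (1 / 3)) < 1e-12)  # True
```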

By looking at the crucial sufficient condition (K3) for convergence, at (K4) and at the majorizing sequence given by Kantorovich in the preceding Theorem 1, one sees that if the Lipschitz constant *M*₁ is replaced by a smaller one, say *L* > 0, then the convergence domain will be extended, the error distances ‖*x*ₙ₊₁ − *x*ₙ‖, ‖*x*ₙ − *x*∗‖ will be tighter and the location of the solution more accurate. This replacement will also lead to fewer Newton iterates being needed to reach a certain predecided accuracy (see the numerical section). That is why, with the new methodology, a new domain is obtained inside *D* that also contains the Newton iterates. Then, *L* can replace *M*₁ in Theorem 1 to obtain the aforementioned extensions and benefits.

In this paper several avenues are presented for achieving this goal. The idea is to replace Lipschitz parameter *M*<sup>1</sup> by smaller ones. (K5) Consider the center Lipschitz condition

$$\|\mathcal{L}'(x_0)^{-1}(\mathcal{L}'(w_1) - \mathcal{L}'(x_0))\| \le M_0 \|w_1 - x_0\| \ \forall w_1 \in D,$$

the set *D*₀ = *B*[*x*₀, 1/*M*₀] ∩ *D* and the Lipschitz-2 condition (K6)

$$\|\mathcal{L}'(x_0)^{-1}(\mathcal{L}'(w_1) - \mathcal{L}'(w_2))\| \le M \|w_1 - w_2\| \ \forall w_1, w_2 \in D_0.$$

These Lipschitz parameters are related as

$$M\_0 \le M\_1. \tag{3}$$

$$M \le M\_1 \tag{4}$$

since

$$D\_0 \subset D.\tag{5}$$

Notice also that parameters *M*₀ and *M* are specializations of parameter *M*₁: *M*₁ = *M*₁(*D*), *M*₀ = *M*₀(*D*), but *M* = *M*(*D*₀). Therefore, no additional work is required to find *M*₀ and *M* (see also [22,23]). Moreover, the ratio *M*₀/*M*₁ can be arbitrarily small. Indeed,

**Example 2.** *Define scalar function*

$$F(t) = b\_0 t + b\_1 + b\_2 \sin e^{b\_3 t}.$$

*for t*₀ = 0, *where bⱼ*, *j* = 0, 1, 2, 3 *are real parameters. It follows from this definition that, for b*₃ *sufficiently large and b*₂ *sufficiently small, M*₀/*M*₁ *can be arbitrarily small, i.e., M*₀/*M*₁ → 0.

Then, clearly, there can be a significant extension if parameters *M*₁ and *M*₀, or *M* and *M*₀, can replace *M*₁ in condition (K3). In this direction, the following replacements were presented in a series of papers [19,22,23], respectively:

(N2): *s* ≤ 1/*q*₂,

(N3): *s* ≤ 1/*q*₃,

and

(N4): *s* ≤ 1/*q*₄,

where

$$
q_1 = 2M_1, \quad q_2 = M_1 + M_0, \quad q_3 = \frac{1}{4}\left(4M_0 + M_1 + \sqrt{M_1^2 + 8M_0M_1}\right)
$$

and

$$
q_4 = \frac{1}{4}\left(4M_0 + \sqrt{M_0M_1 + 8M_0^2} + \sqrt{M_0M_1}\right).
$$

These items are related as follows:

$$q_4 \le q_3 \le q_2 \le q_1,$$

$$(N2) \Rightarrow (N3) \Rightarrow (N4),$$

and, as the ratio *M*₀/*M*₁ → 0,

$$\frac{q_2}{q_1} \longrightarrow \frac{1}{2}, \quad \frac{q_3}{q_1} \longrightarrow \frac{1}{4}, \quad \frac{q_4}{q_3} \longrightarrow 0$$

and

$$\frac{q_4}{q_2} \longrightarrow 0.$$

The preceding items indicate by how much (at most) each criterion improves on the other. These are the extensions given in the aforementioned references. However, it turns out that the parameter *L* can replace *M*₁ in these papers (see Section 3). Denote by *Ñ*, *q̃* the corresponding items. It follows that

$$\frac{\tilde{q}_1}{q_1} = \frac{M}{M_1} \longrightarrow 0, \quad \frac{\tilde{q}_2}{q_2} \longrightarrow 0, \quad \frac{\tilde{q}_3}{q_3} \longrightarrow 0$$

for *M*₀/*M*₁ → 0 and *M*/*M*₁ → 0. Hence, the new results also extend the ones in the aforementioned references. Other extensions involve tighter majorizing sequences for NM (see Section 2) and improved uniqueness results for the solution *x*∗ (Section 3). The applications appear in Section 4, followed by conclusions in Section 5.

#### **2. Majorizations**

Let *K*₀, *M*₀, *K*, *M* be given positive parameters and *s* a positive variable. The real sequence {*tₙ*} defined by *t*₀ = 0, *t*₁ = *s*,

$$
t_2 = t_1 + \frac{K(t_1 - t_0)^2}{2(1 - K_0 t_1)}
$$

and, ∀*n* = 1, 2, . . . ,

$$
t_{n+2} = t_{n+1} + \frac{M(t_{n+1} - t_n)^2}{2(1 - M_0 t_{n+1})} \tag{6}
$$

plays an important role in the study of NM. When the dependence on *s* matters, we adopt the notation *tₙ*(*s*) = *tₙ* ∀*n* = 1, 2, . . . . Some convergence results for this sequence are listed next.

**Lemma 1.** *Suppose conditions*

$$K\_0 t\_1 < 1 \text{ and } \ t\_{n+1} < \frac{1}{M\_0} \tag{7}$$

*hold* ∀ *n* = 1, 2, . . . . *Then, the following assertions hold*

$$t\_n < t\_{n+1} < \frac{1}{M\_0} \tag{8}$$

*and* ∃ *t*∗ ∈ [*s*, 1/*M*₀] *such that* lim*ₙ*→∞ *tₙ* = *t*∗.

**Proof.** The definition of sequence {*tₙ*} and condition (7) imply (8). Moreover, the increasing sequence {*tₙ*} has 1/*M*₀ as an upper bound. Hence, it converges to its (unique) least upper bound *t*∗.

Next, stronger convergence criteria are presented. However, these criteria are easier to verify than conditions of Lemma 1. Define parameter *δ* by

$$\delta = \frac{2M}{M + \sqrt{M^2 + 8M\_0M}}.\tag{9}$$

This parameter plays a role in the following results.

**Case:** *K*<sup>0</sup> = *M*<sup>0</sup> and *K* = *M*.

Part (i) of the next auxiliary result relates to the Lemma in [19].

**Lemma 2.** *Suppose condition*

$$s \le \frac{1}{2M\_2} \tag{10}$$

*holds, where*

$$M\_2 = \frac{1}{4}(M + 4M\_0 + \sqrt{M^2 + 8M\_0M}).\tag{11}$$

*Then, the following assertions hold*

*(i) Estimates*

$$t\_{n+1} - t\_n \le \delta (t\_n - t\_{n-1}) \tag{12}$$

$$t\_n < \frac{1 - \delta^{n+1}}{1 - \delta}s < \frac{s}{1 - \delta} \tag{13}$$

*hold. Moreover, the conclusions of Lemma 1 are true for sequence* {*tₙ*}. *The sequence* {*tₙ*} *converges linearly to t*∗ ∈ (0, *s*/(1 − *δ*)]. *Furthermore, if for some μ* > 0

$$s < \frac{\mu}{(1+\mu)M\_2}.\tag{14}$$

*Then, the following assertions hold*

*(ii)*

$$t\_{n+1} - t\_n \le \frac{M}{2}(1+\mu)(t\_n - t\_{n-1})^2\tag{15}$$

*and*

$$t\_{n+1} - t\_n \le \frac{1}{\alpha} (\alpha \text{s})^{2^n},\tag{16}$$

*where α* = (*M*/2)(1 + *μ*), *and the conclusions of Lemma 1 for sequence* {*tₙ*} *are true. The sequence* {*tₙ*} *converges quadratically to t*∗.

**Proof.** (i) It is given in [19].

(ii) Notice that condition (14) implies (10) by the choice of parameter *μ*. Assertion (15) holds if the estimate

$$0 < \frac{M}{2(1 - M_0 t_{n+1})} \le \frac{M}{2}(1 + \mu) \tag{17}$$

is true. This estimate is true for *n* = 1, since it is equivalent to *M*₀*s* ≤ *μ*/(1 + *μ*); this holds because *M*₀ ≤ *M*₂ (see (11)), so condition (14) gives *M*₀*s* < *μM*₀/((1 + *μ*)*M*₂) ≤ *μ*/(1 + *μ*). Then, in view of estimate (13), estimate (17) certainly holds provided that

$$(1+\mu)M_0(1 + \delta + \ldots + \delta^{n+1})s - \mu \le 0. \tag{18}$$

This estimate motivates the introduction of recurrent polynomials *pn* which are defined by

$$p_n(t) = (1+\mu)M_0(1 + t + \ldots + t^{n+1})s - \mu, \tag{19}$$

∀*t* ∈ [0, 1). In view of polynomial *pn* assertion (18) holds if

$$p\_{\mathfrak{n}}(t) \le 0 \text{ at } t = \delta. \tag{20}$$

The polynomials *pₙ* are connected:

$$p_{n+1}(t) - p_n(t) = (1 + \mu)M_0 t^{n+2}s > 0,$$

so

$$p_n(t) < p_{n+1}(t) \ \forall t \in [0, 1). \tag{21}$$

Define function *<sup>p</sup>*<sup>∞</sup> : [0, 1) −→ <sup>R</sup> by

$$p\_{\infty}(t) = \lim\_{n \to \infty} p\_n(t). \tag{22}$$

It follows by definitions (19) and (20) that

$$p\_{\infty}(t) = \frac{(1+\mu)M\_0s}{1-t} - \mu. \tag{23}$$

Hence, assertion (20) holds if

$$p\_{\infty}(t) \le 0 \text{ at } t = \delta,\tag{24}$$

or equivalently

$$M_0 s \le \frac{\mu}{1+\mu}\,\frac{\sqrt{M^2 + 8M_0M} - M}{\sqrt{M^2 + 8M_0M} + M},$$

which can be rewritten as condition (14). Therefore, the induction for assertion (17) is completed. Thus, assertion (15) holds by the definition of sequence {*tₙ*} and estimate (17). It follows that

$$
\alpha(t_{n+1} - t_n) \le (\alpha(t_n - t_{n-1}))^2 \le (\alpha(t_{n-1} - t_{n-2}))^{2^2} \le \ldots \le (\alpha(t_1 - t_0))^{2^n} = (\alpha s)^{2^n},
$$

so

$$
t_{n+1} - t_n \le \frac{1}{\alpha}(\alpha s)^{2^n}.
$$

Notice also that *M* < 2*M*₂, so condition (14) gives *αs* = (*M*/2)(1 + *μ*)*s* < *Mμ*/(2*M*₂) < *μ*.



*Part (i) of the next auxiliary result relates to a Lemma in [19]. The case M*₀ = *M has been studied above. So, in the next Lemma we assume M*₀ ≠ *M in part (ii).*

**Lemma 3.** *Suppose condition*

$$s \le \frac{1}{2M\_3} \tag{25}$$

*holds, where*

$$M\_3 = \frac{1}{8}(4M\_0 + \sqrt{M\_0M + 8M\_0^2} + \sqrt{M\_0M}).$$

*Then, the following assertions hold*

*(i)*

$$t\_{n+1} - t\_n \le \delta(t\_n - t\_{n-1}) \le \frac{\delta^{n-1} M\_0 s^2}{2(1 - M\_0 s)}\tag{26}$$

*and*

$$t_{n+2} \le s + \frac{1 - \delta^{n+1}}{1 - \delta}(t_2 - t_1) < t^{**} = s + \frac{t_2 - t_1}{1 - \delta}, \ \forall n = 1, 2, \ldots \tag{27}$$

*Moreover, conclusions of Lemma 1 are true for sequence* {*tn*}. *The sequence* {*tn*} *converges linearly to t*<sup>∗</sup> ∈ (0, *t* ∗∗]. *Define parameters h*<sup>0</sup> *by*

$$h_0 = \frac{2(\sqrt{M_0M + 8M_0^2} + \sqrt{M_0M})}{M(\sqrt{M_0M + 8M_0^2} + \sqrt{M_0M} + 4M_0)}, \quad \bar{M}_3 = \frac{1}{2h_0},$$

$$\gamma = 1 + \mu, \quad \beta = \frac{\mu}{1 + \mu}, \quad d = \frac{1}{2(1 - \delta)}$$

*and*

$$
\mu = \frac{M\_0}{2M\_3 - M\_0}.
$$

*(ii) Suppose*

$$M\_0 < M \le \frac{M\_0}{\theta} \tag{28}$$

*and (25) hold, where <sup>θ</sup>* <sup>≈</sup> 0.6478 *is the smallest solution of scalar equation* <sup>2</sup>*z*<sup>4</sup> <sup>+</sup> *<sup>z</sup>* <sup>−</sup> <sup>1</sup> <sup>=</sup> 0. *Then, the conclusions of Lemma 2 also hold for sequence* {*tn*}. *The sequence converges quadratically to t*<sup>∗</sup> ∈ (0, *t* ∗∗].

*(iii) Suppose*

$$M \ge \frac{1}{\theta} M_0 \ \text{and} \ s < \frac{1}{2\bar{M}_3} \tag{29}$$

*hold. Then, the conclusions of Lemma 2 are true for sequence* {*tn*}. *The sequence* {*tn*} *converges quadratically to t*<sup>∗</sup> ∈ (0, *t* ∗∗].

*(iv) Suppose M*₀ > *M and (25) hold. Then, M̄*₃ ≤ *M*₃ *and the conclusions of Lemma 2 are true for sequence* {*tₙ*}. *The sequence* {*tₙ*} *converges quadratically to t*∗ ∈ (0, *t*∗∗].

**Proof.** (i) It is given in Lemma 2.1 in [23].

(ii) As in Lemma 2 but using estimate (27) instead of (13) to show

$$\frac{M}{2(1 - M\_0 t\_{n+1})} \le \frac{M\gamma}{2}.$$

It suffices

$$
\gamma M\_0 \left( s + \frac{1 - \delta^n}{1 - \delta} (t\_2 - t\_1) \right) + 1 - \gamma \le 0
$$

or

$$p_n(t) \le 0 \ \text{at} \ t = \delta, \tag{30}$$

where

$$p_n(t) = \gamma M_0 (1 + t + \ldots + t^{n-1})(t_2 - t_1) + \gamma M_0 s + 1 - \gamma.$$

Notice that

$$p\_{n+1}(t) - p\_n(t) = \gamma M\_0 t^n (t\_2 - t\_1) > 0.$$

Define function *<sup>p</sup>*<sup>∞</sup> : [0, 1) −→ <sup>R</sup> by

$$p\_{\infty}(t) = \lim\_{n \to \infty} p\_n(t).$$

It follows that

$$p\_{\infty}(t) = \frac{\gamma M\_0 (t\_2 - t\_1)}{1 - t} + \gamma M\_0 s + 1 - \gamma.$$

So, (30) holds provided that

$$p\_{\infty}(t) \le 0 \text{ at } t = \delta. \tag{31}$$

By the definition of parameters *γ*, *d*, *β* and for *M*0*s* = *x*, (31) holds if

$$\frac{x^2}{2(1-x)(1-\delta)} + x \le \beta$$

or

$$(d-1)\mathfrak{x}^2 + (1+\beta)\mathfrak{x} - \beta \le 0$$

or

$$x \le \frac{1 + \beta - \sqrt{(1 - \beta)^2 + 4\beta d}}{2(1 - d)}$$

or

$$s \le \frac{1 + \beta - \sqrt{(1 - \beta)^2 + 4\beta d}}{2(1 - d)M_0}. \tag{32}$$

**Claim.** The right-hand side of assertion (32) equals 1/(2*M*₃). Indeed, this is true if

$$1 + \beta - \sqrt{(1 - \beta)^2 + 4\beta d} = \frac{2M_0(1 - d)}{2M_3}$$

or

$$1 + \beta - \frac{2M_0(1 - d)}{2M_3} = \sqrt{(1 - \beta)^2 + 4\beta d}$$

or by squaring both sides

$$1 + \beta^2 + \frac{4M\_0^2(1-d)^2}{4M\_3^2} + 2\beta - \frac{4M\_0(1-d)}{2M\_3} - \frac{4\beta M\_0(1-d)}{2M\_3} = 1 + \beta^2 - 2\beta + 4\beta d$$

or

$$\beta \left( 1 - \frac{M\_0 (1 - d)}{2M\_3} - d \right) = \frac{M\_0 (1 - d)}{2M\_3} \left( 1 - \frac{M\_0}{2M\_3} \right)$$

or

$$\beta\left(1 - \frac{M_0(1 - d)}{2M_3}\right) = \left(1 - \frac{M_0(1 - d)}{2M_3}\right)\frac{M_0}{2M_3}$$

or

$$\beta = \frac{M_0}{2M_3}, \quad \text{i.e.,} \quad \frac{\mu}{1 + \mu} = \frac{M_0}{2M_3}$$

or

$$\mu = \frac{M_0}{2M_3 - M_0},$$

which is true. Notice also that

$$
2M_3 - M_0 = \frac{1}{4}(4M_0 + \sqrt{M_0M} + \sqrt{M_0M + 8M_0^2}) - M_0 = \frac{1}{4}(\sqrt{M_0M} + \sqrt{M_0M + 8M_0^2}) > 0
$$

and 2*M*₃ − 2*M*₀ > 0, since 2*M*₃ − 2*M*₀ = (√(*M*₀*M*) + √(*M*₀*M* + 8*M*₀²) − 4*M*₀)/4, with *M*₀ < √(*M*₀*M*) and 3*M*₀ < √(*M*₀*M* + 8*M*₀²) (both by condition (28)). Thus, *μ* ∈ (0, 1). It remains to show

$$\alpha s = \frac{M}{2}(1 + \mu)s < 1$$

or, by the choice of *μ* and *M̄*₃,

$$
\frac{M}{2}\left(1 + \frac{M_0}{2M_3 - M_0}\right)s < 1
$$

or

$$
s < \frac{1}{2\bar{M}_3}. \tag{33}
$$

**Claim.** *M̄*₃ ≤ *M*₃. By the definition of the parameters *M*₃ and *M̄*₃, it must be shown that

$$\frac{M(\sqrt{M_0M} + \sqrt{M_0M + 8M_0^2} + 4M_0)}{2(\sqrt{M_0M} + \sqrt{M_0M + 8M_0^2})} \le \frac{\sqrt{M_0M} + \sqrt{M_0M + 8M_0^2} + 4M_0}{4}$$
or, setting *y* = *M*₀/*M*, if

$$2 - \sqrt{y} \le \sqrt{y + 8y^2}. \tag{34}$$

By (28), 2 − √*y* > 0, so estimate (34) holds if 2*y*² + √*y* − 1 ≥ 0 or

$$2z^4 + z - 1 \ge 0 \text{ for } z = \sqrt{y}.$$

However, the last inequality holds by (28). The claim is justified. So, estimate (33) holds by (25) and this claim.

(iii) It follows from the proof in part (ii). However, this time *M*₃ ≤ *M̄*₃ follows from (29). Notice also that in part (ii) condition (25) implies (29), while in part (iii) condition (29) implies (25).


Comments similar to Remark 1 can follow for Lemma 3.

**Case.** Parameters *K*₀ and *K* are not necessarily equal to *M*₀ and *M*, respectively.

It is convenient to define parameter *δ*<sup>0</sup> by

$$\delta\_0 = \frac{K(t\_2 - t\_1)}{2(1 - K\_0 t\_2)}$$

and the quadratic polynomial *ϕ* by

$$\varphi(t) = (MK + 2\delta M_0(K - 2K_0))t^2 + 4\delta(M_0 + K_0)t - 4\delta.$$

The discriminant of polynomial *ϕ* can be written as

$$
\triangle = 16\delta\left(\delta(M_0 - K_0)^2 + (M + 2\delta M_0)K\right) > 0.
$$

It follows that the root <sup>1</sup> *h*1 given by the quadratic formula can be written as

$$\frac{1}{2h_1} = \frac{2\delta}{\delta(M_0 + K_0) + \sqrt{(\delta(M_0 + K_0))^2 + \delta(MK + 2\delta M_0(K - 2K_0))}}.$$

Denote by <sup>1</sup> *h*2 the unique positive zero of equation

$$M\_0(K - 2K\_0)t^2 + 2M\_0t - 1 = 0.$$

This root can be written as

$$\frac{1}{2h_2} = \frac{1}{M_0 + \sqrt{M_0^2 + M_0(K - 2K_0)}}.$$

Define parameter *M*<sup>4</sup> by

$$\frac{1}{M\_4} = \min\left\{\frac{1}{h\_1}, \frac{1}{h\_2}\right\}.\tag{35}$$


Part (i) of the next auxiliary result relates to Lemma 2.1 in [22].

#### **Lemma 4.** *Suppose*

$$s \le \frac{1}{2M\_4} \tag{36}$$

*holds, where parameter M*<sup>4</sup> *is given by Formula (35). Then, the following assertions hold (i) Estimates*

$$t\_{n+2} - t\_{n+1} \le \delta\_0 \delta^{n-1} \frac{Ks^2}{2(1 - K\_0s)},$$

*and*

$$t_{n+2} \le s + \left(1 + \delta_0 \frac{1 - \delta^n}{1 - \delta}\right)(t_2 - t_1) < \bar{t} = s + \left(1 + \frac{\delta_0}{1 - \delta}\right)(t_2 - t_1).$$

*Moreover, conclusions of Lemma 2 are true for sequence* {*tn*}. *The sequence* {*tn*} *converges linearly to t*<sup>∗</sup> ∈ (0, ¯*t*].

*(ii) Suppose*

$$M_0 \left(\frac{\delta_0 (t_2 - t_1)}{1 - \delta} + s\right) \le \beta, \tag{37}$$

$$s < \frac{2}{(1+\mu)M} \tag{38}$$

*and (36) hold for some μ* > 0. *Then, the conclusions of Lemma 3 are true for sequence* {*tn*}. *The sequence* {*tn*} *converges quadratically to t*<sup>∗</sup> ∈ (0, ¯*t*].

**Proof.** (i) It is given in Lemma 2.1 in [22]. (ii) Define polynomial *pn* by

$$p_n(t) = \gamma M_0 \delta_0 (1 + t + \ldots + t^{n-1})(t_2 - t_1) + \gamma M_0 s + 1 - \gamma.$$

By this definition it follows

$$p\_{n+1}(t) - p\_n(t) = \gamma M\_0 \delta\_0 (t\_2 - t\_1) t^n > 0.$$

As in the proof of Lemma 3 (ii), estimate

$$\frac{M}{2(1 - M\_0 t\_{n+1})} \le \frac{M}{2}\gamma$$

holds provided that

$$p\_{\mathfrak{n}}(t) \le 0 \text{ at } t = \delta. \tag{39}$$

Define function *<sup>p</sup>*<sup>∞</sup> : [0, 1) −→ <sup>R</sup> by

$$p\_{\infty}(t) = \lim\_{n \to \infty} p\_n(t).$$

It follows by the definition of function *p*∞ and polynomial *pn* that

$$p_{\infty}(t) = \frac{\gamma M_0 \delta_0 (t_2 - t_1)}{1 - t} + \gamma M_0 s + 1 - \gamma.$$

Hence, estimate (39) holds provided that

$$p\_{\infty}(t) \le 0 \text{ at } t = \delta.$$

However, this assertion holds, since *μ* ∈ (0, 1). Moreover, the definition of *α* and condition (38) of Lemma 4 imply

$$\alpha s = \frac{M}{2}(1 + \mu)s < 1.$$

Hence, the sequence {*tn*} converges quadratically to *t* ∗.

**Remark 2.** *Conditions (36)–(38) can be condensed, and a specific choice for μ can be given, as follows. Define the function f* : [0, 1/*K*₀) −→ ℝ *by*

$$f(t) = 1 - M\_0 \left( \frac{\delta\_0(t) \left( t\_2(t) - t\_1(t) \right)}{1 - \delta} + t \right).$$

*It follows by this definition*

$$f(0) = 1 > 0, \quad f(t) \longrightarrow -\infty \ \text{as} \ t \longrightarrow \frac{1}{K_0}^-.$$

*Denote by μ*₂ *the smallest solution of the equation f*(*t*) = 0 *in* (0, 1/*K*₀). *Then, by choosing μ* = *μ*₂, *condition (37) holds as an equality. It follows that if we solve the first condition in (37) for s, then conditions (36)–(38) can be condensed as*

$$s \le s_1 := \min\left\{\frac{1}{2M_4}, \frac{2}{(2 + \mu_2)M}\right\}. \tag{40}$$

*If s*₁ = 2/((2 + *μ*₂)*M*), *then condition (40) should hold as a strict inequality to show quadratic convergence.*
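The root *μ*₂ of Remark 2 can be located by bisection, since *f*(0) = 1 > 0 and *f* drops below zero before *t* reaches 1/*K*₀. A sketch with purely illustrative (assumed) parameter values, guarding against the blow-up of *δ*₀(*t*):

```python
from math import sqrt, inf

# Illustrative (assumed) parameters; none are taken from the article
K0, K, M0, M = 1.0, 1.2, 1.0, 1.2
delta = 2 * M / (M + sqrt(M * M + 8 * M0 * M))     # Formula (9)

def f(t):
    """f from Remark 2, with t1(t) = t and t2(t) as in sequence (6)."""
    if 1 - K0 * t <= 0:
        return -inf
    t2 = t + K * t * t / (2 * (1 - K0 * t))
    if 1 - K0 * t2 <= 0:
        return -inf                                # delta_0 blows up
    delta0 = K * (t2 - t) / (2 * (1 - K0 * t2))
    return 1 - M0 * (delta0 * (t2 - t) / (1 - delta) + t)

# Bracket the smallest zero mu_2 by bisection (f is decreasing here)
lo, hi = 0.0, 1 / K0
for _ in range(80):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
mu2 = lo
print(0 < mu2 < 1 / K0 and abs(f(mu2)) < 1e-9)  # True
```

Bisection is a reasonable choice here because only a sign change, not smoothness of *f*, is needed.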

#### **3. Semi-Local Convergence**

Sequence {*tₙ*} given by (6) was shown to be majorizing for {*xₙ*} and tighter than {*uₙ*} under the conditions of the Lemmas in [19,22,23], respectively. These Lemmas correspond to part (i) of Lemma 1, Lemma 3 and Lemma 4, respectively. However, by requiring the parameter *s* to satisfy a slightly stronger (strict) bound, the quadratic order of convergence is recovered. Hence, the preceding Lemmas can replace the earlier ones, respectively, in the semi-local proofs for NM in these references. The parameters *K*₀ and *K* are connected to *x*₀ and L as follows:

(K7) ∃ parameter *K*₀ > 0 such that for *x*₁ = *x*₀ − L′(*x*₀)⁻¹L(*x*₀)

$$\|\mathcal{L}'(x_0)^{-1}(\mathcal{L}'(x_1) - \mathcal{L}'(x_0))\| \le K_0 \|x_1 - x_0\|,$$

(K8) ∃ parameter *K* such that ∀*ξ* ∈ [0, 1], ∀*x*, *y* ∈ *D*0,

$$\left\|\int_0^1 \mathcal{L}'(x_0)^{-1}(\mathcal{L}'(x + \xi(y - x)) - \mathcal{L}'(x))d\xi\right\| \le \frac{K}{2}\|y - x\|.$$

Note that *K*<sup>0</sup> ≤ *M*<sup>0</sup> and *K* ≤ *M*. The convergence criteria in Lemmas 1, 3 and 4 do not necessarily imply each other in each case. That is why we do not only rely on Lemma 4 to show the semi-local convergence of NM. Consider the following three sets of conditions:

(A1): (K1), (K4), (K5), (K6) and the conditions of Lemma 1 hold for *ρ* = *t*∗, or
(A2): (K1), (K4), (K5), (K6) and the conditions of Lemma 2 hold with *ρ* = *t*∗, or
(A3): (K1), (K4), (K5), (K6) and the conditions of Lemma 3 hold with *ρ* = *t*∗, or
(A4): (K1), (K4), (K5), (K6) and the conditions of Lemma 4 hold with *ρ* = *t*∗.

The upper bounds of the limit point given in the Lemmas in closed form can replace *ρ* in condition (K4). The proofs of the semi-local convergence of NM are omitted, since they are given in the aforementioned references [19,20,22,23], with the exception of the quadratic convergence established in part (ii) of the presented Lemmas.

**Theorem 2.** *Suppose any of the conditions (Ai), i* = 1, 2, 3, 4 *hold. Then, the sequence* {*xₙ*} *generated by NM is well defined in B*[*x*₀, *ρ*], *remains in B*[*x*₀, *ρ*] ∀*n* = 0, 1, 2, . . . *and converges to a solution x*∗ ∈ *B*[*x*₀, *ρ*] *of equation* L(*x*) = 0. *Moreover, the following assertions hold* ∀*n* = 0, 1, 2, . . . *:*

$$\|x_{n+1} - x_n\| \le t_{n+1} - t_n$$

*and*

$$\|x^* - x_n\| \le t^* - t_n.$$

The convergence ball is given next. Notice, however, that we do not use all of the conditions (Ai).

**Proposition 1.** *Suppose: there exists a solution x*<sup>∗</sup> ∈ *B*(*x*0, *ρ*0) *of equation* L(*x*) = 0 *for some ρ*<sup>0</sup> > 0; *condition (K5) holds and* ∃ *ρ*<sup>1</sup> ≥ *ρ*<sup>0</sup> *such that*

$$\frac{M\_0}{2}(\rho\_0 + \rho\_1) < 1.\tag{41}$$

*Set D*<sup>1</sup> = *D* ∩ *B*[*x*0, *ρ*1]. *Then, the only solution of equation* L(*x*) = 0 *in the set D*<sup>1</sup> *is x*∗.

**Proof.** Let *x*\_∗ ∈ *D*₁ be a solution of equation L(*x*) = 0. Define the linear operator *J* = ∫₀¹ L′(*x*^∗ + *τ*(*x*\_∗ − *x*^∗))*dτ*. Then, using (K5) and (41),

$$\begin{split} \|\mathcal{L}'(x_0)^{-1}(\mathcal{L}'(x_0) - J)\| &\le M_0 \int_0^1 ((1-\tau)\|x_0 - x^*\| + \tau\|x_0 - x_*\|) d\tau \\ &\le \frac{M_0}{2}(\rho_0 + \rho_1) < 1. \end{split} \tag{42}$$

Therefore, *x*^∗ = *x*\_∗ is implied by the invertibility of *J* and

$$J(\mathbf{x}^\* - \mathbf{x}\_\*) = \mathcal{L}(\mathbf{x}^\*) - \mathcal{L}(\mathbf{x}\_\*) = 0.$$

If conditions of Theorem 2 hold, set *ρ*<sup>0</sup> = *ρ*.

#### **4. Numerical Experiments**

Two experiments are presented in this Section.

**Example 3.** *Recall Example 1 (with* L(*x*) = *c*(*x*)*). Then, the parameters are s* = (1 − *a*)/3, *K*₀ = (*a* + 5)/3, *M*₀ = 3 − *a*, *M*₁ = 2(2 − *a*). *It also follows that D*₀ = *B*(1, 1 − *a*) ∩ *B*[1, 1/*M*₀] = *B*[1, 1/*M*₀], *so K* = *M* = 2(1 + 1/(3 − *a*)). *Denote by Tᵢ*, *i* = 1, 2, 3, 4 *the set of values of a for which conditions (K3), (N2)–(N4) are satisfied. Then, by solving these inequalities for a: T*₁ = ∅, *T*₂ = [0.4648, 0.5), *T*₃ = [0.4503, 0.5), *and T*₄ = [0.4272, 0.5), *respectively.*
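The sets *Tᵢ* can also be recovered numerically by scanning a grid of values of *a*; a minimal sketch (the grid resolution is an arbitrary choice):

```python
from math import sqrt

def t_start(i, grid=20000):
    """Smallest a in (0, 1/2) with s <= 1/q_i for Example 3 (None if empty)."""
    for k in range(1, grid):
        a = 0.5 * k / grid
        s, M0, M1 = (1 - a) / 3, 3 - a, 2 * (2 - a)
        q = (2 * M1,
             M1 + M0,
             (4 * M0 + M1 + sqrt(M1**2 + 8 * M0 * M1)) / 4,
             (4 * M0 + sqrt(M0 * M1 + 8 * M0**2) + sqrt(M0 * M1)) / 4)
        if s <= 1 / q[i - 1]:
            return a
    return None

print(t_start(1))                      # None: (K3) never holds on (0, 1/2)
# lower endpoints of T2, T3, T4, close to 0.4648, 0.4503, 0.4272
print([t_start(i) for i in (2, 3, 4)])
```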

*The domain can be further extended. Choose a* = 0.4; *then,* 1/*M*₀ ≈ 0.3846. *The following Table 1 shows that the conditions of Lemma 1 hold, since K*₀*t*₁ < 1 *and M*₀*tₙ*₊₁ < 1 ∀*n* = 1, 2, . . . .

**Table 1.** Sequence (6) for Example 1.


**Example 4.** *Let $U = V = \mathbb{R}^3$, $D = B(x_0, 0.5)$ and*

$$\mathcal{L}(x) = \left( e^{x_1} - 1, \; x_2^3 + x_2, \; x_3 \right)^T.$$

*The equation $\mathcal{L}(x) = 0$ has the solution $x^* = (0, 0, 0)^T$ and $\mathcal{L}'(x) = \mathrm{diag}(e^{x_1},\, 3x_2^2 + 1,\, 1)$. Let $x_0 = (0.1, 0.1, 0.1)^T$. Then $s = \|\mathcal{L}'(x_0)^{-1}\mathcal{L}(x_0)\|_\infty \approx 0.1569$,*

$$M\_0 = \max\left\{\frac{e^{0.6}}{e^{0.1}}, \frac{3(0.6 + 0.1)}{1.03}\right\} \approx 2.7183,$$

$$M\_1 = \max\left\{\frac{e^{0.6}}{e^{0.1}}, \frac{3(0.6 + 0.6)}{1.03}\right\} \approx 3.49513.$$

$$\text{It also follows that } \frac{1}{M_0} \approx 0.3679, \; D_0 = D \cap B\Big[x_0, \frac{1}{M_0}\Big] = B[x_0, 0.3679] \text{ and }$$

$$K\_0 = \max\left\{\frac{e^{p\_1}}{e^{0.1}}, \frac{3(p\_2 + 0.1)}{1.03}\right\} \approx 2.3819,$$

$$M = K = \max\left\{\frac{e^{p\_1}}{e^{0.1}}, \frac{6p\_1}{1.03}\right\} \approx 2.7255,$$

*where $p_1 = 0.1 + \frac{1}{M_0} \approx 0.4679$ and $p_2 \approx 0.0019$.*

*Notice that $M_0 < M_1$ and $M < M_1$. The Kantorovich convergence condition (K3) is not fulfilled, since $2M_1 s \approx 1.0968 > 1$. Hence, convergence of NM is not assured by the Kantorovich criterion. However, the new conditions (N2)–(N4) are fulfilled, since $q_2 s \approx 0.9749 < 1$, $q_3 s \approx 0.9320 < 1$, and $q_4 s \approx 0.8723 < 1$.*

*Table 2 shows that the conditions of Lemma 1 are fulfilled, since $K_0 t_n < 1$ and $M_0 t_{n+1} < 1$ for all $n = 1, 2, \ldots$.*

**Table 2.** Sequence (6) for Example 4.


**Example 5.** *Let $U = V = C[0,1]$ be the space of continuous real functions defined on the interval $[0,1]$. Set $D = B[x_0, 3]$, and define the operator $\mathcal{L}$ on $D$ as*

$$\mathcal{L}(v)(v_1) = v(v_1) - y(v_1) - \int_0^1 N(v_1, t)\, v^3(t)\, dt, \quad v \in C[0,1],\; v_1 \in [0,1], \tag{43}$$

*where y is given in C*[0, 1], *and N is a kernel given by Green's function as*

$$N(v\_1, t) = \begin{cases} (1 - v\_1)t, & t \le v\_1 \\ v\_1(1 - t), & v\_1 \le t. \end{cases} \tag{44}$$

*By applying this definition the derivative of* L *is*

$$[\mathcal{L}'(v)(z)](v\_1) = z(v\_1) - 3 \int\_0^1 N(v\_1, t) v^2(t) z(t) dt\tag{45}$$

$z \in C[0,1]$, $v_1 \in [0,1]$. *Pick $x_0(v_1) = y(v_1) = 1$. The max-norm is used. It then follows from (43)–(45) that $\mathcal{L}'(x_0)^{-1} \in L(B_2, B_1)$,*

$$\|I - \mathcal{L}'(\mathbf{x}\_0)\| < 0.375, \ \|\mathcal{L}'(\mathbf{x}\_0)^{-1}\| \le 1.6,$$

$$s = 0.2, \ M\_0 = 2.4, \ M\_1 = 3.6,$$

*and $D_0 = B(x_0, 3) \cap B[x_0, 0.4167] = B[x_0, 0.4167]$, so $M = 1.5$. Notice that $M_0 < M_1$ and $M < M_1$. Choose $K_0 = K = M_0$. The Kantorovich convergence condition (K3) is not fulfilled, since $2M_1 s = 1.44 > 1$. Hence, convergence of NM is not assured by the Kantorovich criterion. However, the new condition (36) is fulfilled, since $2Ms = 0.6 < 1$.*

**Example 6.** *Let $U = V = \mathbb{R}$, $D = (-1, 1)$ and*

$$\mathcal{L}(x) = e^x + 2x - 1.$$

*The equation $\mathcal{L}(x) = 0$ has the solution $x^* = 0$. The parameters are $s = \frac{e^{x_0} + 2x_0 - 1}{e^{x_0} + 2}$, $M_0 = M_1 = e$, $K_0 = K = M = e^{x_0 + \frac{1}{e}}$ and*

$$D_0 = (-1, 1) \cap \left[x_0 - \frac{1}{e},\, x_0 + \frac{1}{e}\right] = \left[x_0 - \frac{1}{e},\, x_0 + \frac{1}{e}\right].$$

*Let us choose x*<sup>0</sup> = 0.15*. Then, s* ≈ 0.1461*. Conditions (K3) and (N2) are fulfilled. The majorizing sequences* {*tn*} *(6) and* {*un*} *from Theorem 1 are:*

$$\{t_n\} = \{0, 0.1461, 0.1698, 0.1707, 0.1707, 0.1707, 0.1707, \ldots\},$$

{*un*} = {0, 0.1461, 0.1942, 0.2008, 0.2009, 0.2009, 0.2009, 0.2009}.
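The behaviour in this example can be reproduced in a few lines. The sketch below (ours, not from the paper) runs Newton's method on $\mathcal{L}(x) = e^x + 2x - 1$ from $x_0 = 0.15$ and records the step sizes, which the differences $t_{n+1} - t_n$ of a majorizing sequence are designed to dominate; the first step size equals $s \approx 0.1461$ from the example.

```python
import math

def newton_scalar(x0, n_steps=6):
    """Newton's method for L(x) = exp(x) + 2x - 1 (Example 6); L'(x) = exp(x) + 2."""
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        xs.append(x - (math.exp(x) + 2 * x - 1) / (math.exp(x) + 2))
    return xs

iterates = newton_scalar(0.15)
steps = [abs(b - a) for a, b in zip(iterates, iterates[1:])]
# steps[0] is s = |L(x0)| / |L'(x0)|, roughly 0.1461 as in the example.
```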

*Table 3 lists the error bounds. Notice that the new error bounds are tighter than the ones in Theorem 1.*


**Table 3.** Results for *x*<sup>0</sup> = 0.15 for Example 6.

*Let us choose $x_0 = 0.2$. Then, $s \approx 0.1929$. In this case, condition (K3) does not hold, but (N2) holds. The majorizing sequence $\{t_n\}$ (6) is:*

{*tn*} = {0, 0.1929, 0.2427, 0.2491, 0.2492, 0.2492, 0.2492, 0.2492 }.

*Table 4 shows the error bounds from Theorem 2.*

**Table 4.** Results for *x*<sup>0</sup> = 0.2 for Example 6.


#### **5. Conclusions**

We developed a comparison between results on the semi-local convergence of NM. There exists an extensive literature on the convergence analysis of NM. Most convergence results are based on recurrent relations, where the Lipschitz conditions are given in affine or non-affine invariant forms. The new methodology uses recurrent functions. The idea is to construct a domain, included in the one used before, which also contains the Newton iterates. This is important, since the new results do not require additional conditions. In this way, the new sufficient convergence conditions are weaker in the Lipschitz case, since they rely on smaller constants. Other benefits include tighter error bounds and more precise results on the uniqueness of the solution. The new constants are special cases of earlier ones. The methodology is very general, making it suitable for extending the applicability of other numerical methods under Hölder or more general majorant conditions. This will be the topic of our future work.

**Author Contributions:** Conceptualization I.K.A.; Methodology I.K.A.; Investigation S.R., I.K.A., S.S. and H.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Constructing a Class of Frozen Jacobian Multi-Step Iterative Solvers for Systems of Nonlinear Equations**

**R. H. Al-Obaidi and M. T. Darvishi** *∗*

Department of Mathematics, Faculty of Science, Razi University, Kermanshah 67149, Iran

**\*** Correspondence: darvishi@razi.ac.ir

**Abstract:** In this paper, in order to solve systems of nonlinear equations, a new class of frozen Jacobian multi-step iterative methods is presented. Our proposed algorithms are characterized by a highly convergent order and an excellent efficiency index. The theoretical analysis is presented in detail. Finally, numerical experiments are presented for showing the performance of the proposed methods, when compared with known algorithms taken from the literature.

**Keywords:** iterative method; frozen Jacobian multi-step iterative method; system of nonlinear equations; high-order convergence

**MSC:** 65Hxx

#### **1. Introduction**

Approximating a locally unique solution *α* of the nonlinear system

$$F(\mathbf{x}) = 0 \tag{1}$$

has many applications in engineering and mathematics [1–4]. In (1), we have *n* equations in *n* variables; in fact, *F* is a vector-valued function of *n* variables. Several problems arising in different areas of the natural and applied sciences take the form of a system of nonlinear equations (1) that needs to be solved, where *F*(**x**)=(*f*1(**x**), *f*2(**x**), ··· , *fn*(**x**)) such that, for all *k* = 1, 2, ··· , *n*, *fk* is a scalar nonlinear function. Additionally, there are many real-life problems for which, in the process of finding their solutions, one needs to solve a system of nonlinear equations; see, for example, [5–9]. It is known that finding an exact solution *<sup>α</sup><sup>t</sup>* = (*α*1, *<sup>α</sup>*2, ··· , *<sup>α</sup>n*) of the nonlinear system (1) is not an easy task, especially when the equations contain logarithmic, trigonometric and exponential terms, or a combination of transcendental terms. Hence, in general, one cannot find the solution of Equation (1) analytically; therefore, we have to use iterative methods. Any iterative method starts from one approximation and constructs a sequence that converges to the solution of Equation (1) (for more details, see [10]).

The most commonly used iterative method to solve (1) is the classical Newton method, given by

$$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - J\_F(\mathbf{x}^{(k)})^{-1} F(\mathbf{x}^{(k)}),$$

where *JF*(**x**) (or *F'*(**x**)) is the Jacobian matrix of the function *F*, and **x**(*k*) is the *k*-th approximation of the root of (1) with the initial guess **x**(0). It is well known that Newton's method converges quadratically, with efficiency index <sup>√</sup><sup>2</sup> [11]. Third- and higher-order methods such as the Halley and Chebyshev methods [12] have little practical value because of the evaluation of the second Fréchet derivative. However, third- and higher-order multi-step methods can be good substitutes because they only require the evaluation of the function and its first derivative at different points.
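As an illustration (ours, not part of the paper), the classical Newton iteration can be written in a few lines; the 2×2 test system below is hypothetical.

```python
import numpy as np

def newton(F, J, x0, tol=1e-10, max_iter=50):
    """Classical Newton method: one Jacobian evaluation and one linear solve per step."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Solve J_F(x) * delta = F(x) rather than forming the inverse explicitly.
        delta = np.linalg.solve(J(x), F(x))
        x = x - delta
        if np.linalg.norm(delta) <= tol:
            break
    return x

# Hypothetical test system: x^2 + y^2 = 4, x*y = 1.
F = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0]*v[1] - 1.0])
J = lambda v: np.array([[2*v[0], 2*v[1]], [v[1], v[0]]])
root = newton(F, J, [2.0, 0.5])
```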

In recent decades, many authors have tried to design iterative procedures with better efficiency and a higher order of convergence than the Newton scheme; see, for example, refs. [13–24]

**Citation:** Al-Obaidi, R.H.; Darvishi, M.T. Constructing a Class of Frozen Jacobian Multi-Step Iterative Solvers for Systems of Nonlinear Equations. *Mathematics* **2022**, *10*, 2952. https:// doi.org/10.3390/math10162952

Academic Editors: Maria Isabel Berenguer and Manuel Ruiz Galán

Received: 1 July 2022 Accepted: 8 August 2022 Published: 16 August 2022

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

and references therein. However, the accuracy of the solutions is highly dependent on the efficiency of the utilized algorithm. Furthermore, at each step of any iterative method, we must find the exact solution of a linear system, which is expensive in actual applications, especially when the system size *n* is very large. Moreover, higher-order iterative methods are of little use unless their additional cost is compensated by the gain in convergence order. Therefore, the important aim in developing any new algorithm is to achieve a high convergence order while requiring as few evaluations of functions, derivatives and matrix inversions as possible. Thus, here, we focus on the technique of frozen Jacobian multi-step iterative algorithms. This idea is computationally attractive and economical for constructing iterative solvers because the inversion of the Jacobian matrix (via *LU* decomposition) is performed only once. Many researchers have reduced the computational cost of such algorithms by frozen Jacobian multi-step iterative techniques [25–28].

In this work, we construct a new class of frozen Jacobian multi-step iterative methods for solving the nonlinear systems of equations. This is a high-order convergent algorithm with an excellent efficiency index. The theoretical analysis is presented completely. Further, by solving some nonlinear systems, the ability of the methods is compared with some known algorithms.

The rest of this paper is organized as follows. In the following section, we present our new methods and obtain their order of convergence. Additionally, their computational efficiency is discussed in general. Some numerical examples are considered in Sections 3 and 4 to show the asymptotic behavior of these methods. Finally, a brief concluding remark is presented in Section 5.

#### **2. Constructing New Methods**

In this section, two high-order frozen Jacobian multi-step iterative methods to solve systems of nonlinear equations are presented. These are obtained by increasing the convergence order of Newton's method while simultaneously decreasing its computational cost. The framework of these Frozen Jacobian multi-step iterative Algorithms (FJA) can be described as

$$\begin{cases} \text{No. of steps} = m > 1,\\ \text{Order of convergence} = m + 1,\\ \text{Function evaluations} = m,\\ \text{Jacobian evaluations} = 1,\\ \text{No. of } LU \text{ decompositions} = 1; \end{cases} \qquad \text{FJA}: \begin{cases} \mathbf{y}_0 = \text{initial guess},\\ \mathbf{y}_1 = \mathbf{y}_0 - J_F(\mathbf{y}_0)^{-1}F(\mathbf{y}_0),\\ \text{for } i = 1:m-1\\ \quad E_i = J_F(\mathbf{y}_0)^{-1}\big(F(\mathbf{y}_i) + F(\mathbf{y}_{i-1})\big),\\ \quad \mathbf{y}_{i+1} = \mathbf{y}_{i-1} - E_i,\\ \text{end}\\ \mathbf{y}_0 = \mathbf{y}_m. \end{cases} \tag{2}$$

In (2), for an *m*-step method (*m* > 1), one needs *m* function evaluations and only one Jacobian evaluation. Further, the number of *LU* decompositions is one. The order of convergence of such an FJA method is *m* + 1. In the right-hand column of (2), the algorithm is briefly described.
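A direct translation of scheme (2) might look as follows (a sketch, not the authors' code); for simplicity `np.linalg.solve` stands in for reusing a single *LU* factorization of the frozen Jacobian.

```python
import numpy as np

def fja(F, J, x0, m=3, tol=1e-12, max_outer=50):
    """Sketch of the frozen Jacobian m-step scheme (2): per outer iteration,
    one Jacobian evaluation, m function evaluations, order m + 1."""
    y0 = np.asarray(x0, dtype=float)
    for _ in range(max_outer):
        J0 = J(y0)  # frozen Jacobian; in practice its LU factors would be reused
        y_prev, y_curr = y0, y0 - np.linalg.solve(J0, F(y0))
        for _ in range(m - 1):
            E = np.linalg.solve(J0, F(y_curr) + F(y_prev))
            y_prev, y_curr = y_curr, y_prev - E
        y0 = y_curr
        if np.linalg.norm(F(y0)) <= tol:
            break
    return y0

# Cyclic system f_i = x_i^2 x_{i+1} - 1 (cf. Experiment 3 below) with n = 3, root (1, 1, 1).
def F(x):
    return np.array([x[0]**2 * x[1] - 1, x[1]**2 * x[2] - 1, x[2]**2 * x[0] - 1])

def J(x):
    return np.array([[2*x[0]*x[1], x[0]**2, 0.0],
                     [0.0, 2*x[1]*x[2], x[1]**2],
                     [x[2]**2, 0.0, 2*x[2]*x[0]]])

root = fja(F, J, [3.0, 3.0, 3.0], m=3)
```

With `m = 2` the same routine reduces to the two-step method (3) below, and with `m = 3` to the three-step method (9).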

In the following subsections, by choosing two different values for *m*, a third- and a fourth-order frozen Jacobian multi-step iterative algorithm are presented.

#### *2.1. The Third-Order FJA*

First, we investigate case *m* = 2, that is,

$$\begin{aligned} \mathbf{y}^{(k)} &= \mathbf{x}^{(k)} - J_F(\mathbf{x}^{(k)})^{-1}F(\mathbf{x}^{(k)}),\\ \mathbf{x}^{(k+1)} &= \mathbf{x}^{(k)} - J_F(\mathbf{x}^{(k)})^{-1}\big(F(\mathbf{y}^{(k)}) + F(\mathbf{x}^{(k)})\big), \end{aligned} \tag{3}$$

we denote this by *M*3.

#### 2.1.1. Convergence Analysis

In this part, we prove that the order of convergence of method (3) is three. First, we need the definition of the Fréchet derivative.

**Definition 1** ([29])**.** *Let F be an operator which maps a Banach space X into a Banach space Y. If there exists a bounded linear operator T from X into Y such that*

$$\lim\_{\mathbf{y}\to 0} \frac{\|F(\mathbf{x} + \mathbf{y}) - F(\mathbf{x}) - T(\mathbf{y})\|}{\|\mathbf{y}\|} = 0,$$

*then F is said to be Fréchet differentiable at* **x** *and F'*(**x**) = *T.*

*For more details on Fréchet differentiability and the Fréchet derivative, we refer the interested reader to the review article by Emmanuel [30] and references therein.*

**Theorem 1.** *Let $F : I \subseteq \mathbb{R}^n \to \mathbb{R}^n$ be a Fréchet differentiable function at each point of an open convex neighborhood I of α, the solution of the system F*(**x**) = 0*. Suppose that JF*(**x**(*k*)) *is continuous and nonsingular at <sup>α</sup>. Then, the sequence* $\{\mathbf{x}^{(k)}\}_{k \ge 0}$ *obtained using the iterative method (3) converges to α and its rate of convergence is three.*

**Proof.** Let *En* <sup>=</sup> **<sup>x</sup>**(*n*) <sup>−</sup> *<sup>α</sup>*. Using Taylor's expansion [31], we obtain

$$F(\mathbf{x}^{(n)}) = F(a) + F'(a)E\_n + \frac{1}{2!}F''(a)E\_n^2 + \frac{1}{3!}F^{\prime\prime\prime}(a)E\_n^3 + \frac{1}{4!}F^{\prime\prime\prime\prime}(a)E\_n^4 + \dots$$

As *α* is a root of *F*, *F*(*α*) = 0. In fact, one may obtain the following expansions of *F*(**x**(*n*)) and *F'*(**x**(*n*)) in a neighborhood of *α* by using Taylor series [32],

$$F(\mathbf{x}^{(n)}) = F'(\boldsymbol{\alpha}) \left[ E\_n + \mathbb{C}\_2 E\_n^2 + \mathbb{C}\_3 E\_n^3 + \mathbb{C}\_4 E\_n^4 + \mathbb{C}\_5 E\_n^5 + O(|E\_n^6|) \right],\tag{4}$$

$$F'(\mathbf{x}^{(n)}) = F'(\boldsymbol{\mathfrak{a}}) \left[ I + 2\mathbf{C}\_2\boldsymbol{E}\_n + 3\mathbf{C}\_3\boldsymbol{E}\_n^2 + 4\mathbf{C}\_4\boldsymbol{E}\_n^3 + 5\mathbf{C}\_5\boldsymbol{E}\_n^4 + 6\mathbf{C}\_6\boldsymbol{E}\_n^5 + \mathcal{O}(|\boldsymbol{E}\_n^6|) \right], \tag{5}$$

wherein $C_n = \frac{[F'(\alpha)]^{-1}F^{(n)}(\alpha)}{n!}$ and $I$ is the identity matrix whose order is the same as that of the Jacobian matrix. Note that $iC_iE_n^{i-1} \in \mathcal{L}(\mathbb{R}^n)$. Using (4) and (5), we obtain

$$\begin{split} F'(\mathbf{x}^{(n)})^{-1}F(\mathbf{x}^{(n)}) &= E_n - C_2E_n^2 + (2C_2^2 - 2C_3)E_n^3 + (-4C_2^3 + 7C_2C_3 - 3C_4)E_n^4 \\ &\quad + (-32C_2^5 + 8C_2^4 - 20C_2^2C_3 + 10C_2C_4 + 6C_3^2 - 4C_5)E_n^5 + O(\|E_n\|^6). \end{split}$$

Since **<sup>y</sup>**(*n*) <sup>=</sup> **<sup>x</sup>**(*n*) <sup>−</sup> *<sup>F</sup>* (**x**(*n*))−1*F*(**x**(*n*)), we find

$$\begin{aligned} \mathbf{y}^{(n)} &= \mathbf{a} + \mathbf{C}\_2 \mathbf{E}\_n^2 + (-2\mathbf{C}\_2^2 + 2\mathbf{C}\_3)\mathbf{E}\_n^3 + (4\mathbf{C}\_2^3 - 7\mathbf{C}\_2\mathbf{C}\_3 + 3\mathbf{C}\_4)\mathbf{E}\_n^4 \\ &+ (32\mathbf{C}\_2^5 - 8\mathbf{C}\_2^4 + 20\mathbf{C}\_2^2\mathbf{C}\_3 - 10\mathbf{C}\_2\mathbf{C}\_4 - 6\mathbf{C}\_3^2 + 4\mathbf{C}\_5)\mathbf{E}\_n^5 + O||\mathbf{E}\_n^6||. \end{aligned} \tag{6}$$

By the definition of error term *En*, the error term of **y**(*n*) as an approximation of *α*, that is, **<sup>y</sup>**(*n*) <sup>−</sup> *<sup>α</sup>* is obtained from the second term of the right-hand side of Equation (6). Similarly, the Taylor's expansion of the function *F*(**y**(*n*)) is

$$\begin{split} F(\mathbf{y}^{(n)}) = F'(\alpha)\Big[ &C_2E_n^2 + (-2C_2^2 + 2C_3)E_n^3 + (5C_2^3 - 7C_2C_3 + 3C_4)E_n^4 \\ &+ (32C_2^5 - 12C_2^4 + 24C_2^2C_3 - 10C_2C_4 - 6C_3^2 + 4C_5)E_n^5 + O(\|E_n\|^6)\Big]. \end{split} \tag{7}$$

From (4) and (7), we obtain

$$\begin{split} F(\mathbf{x}^{(n)}) + F(\mathbf{y}^{(n)}) = F'(\alpha)\Big[ &E_n + 2C_2E_n^2 + (-2C_2^2 + 3C_3)E_n^3 + (5C_2^3 - 7C_2C_3 + 4C_4)E_n^4 \\ &+ (32C_2^5 - 12C_2^4 + 24C_2^2C_3 - 10C_2C_4 - 6C_3^2 + 6C_5)E_n^5 + O(\|E_n\|^6)\Big]. \end{split}$$

Thus,

$$\begin{aligned} F'(\mathbf{x}^{(n)})^{-1}(F(\mathbf{x}^{(n)}) + F(\mathbf{y}^{(n)})) &= E\_n - (2\mathbf{C}\_2^2)E\_n^3 + (9\mathbf{C}\_2^3 - 7\mathbf{C}\_2\mathbf{C}\_3)E\_n^4 \\ &+ (-30\mathbf{C}\_2^4 + 44\mathbf{C}\_2^2\mathbf{C}\_3 - 10\mathbf{C}\_2\mathbf{C}\_4 - 6\mathbf{C}\_3^2 + \mathbf{C}\_5)E\_n^5 + O(|E\_n^6|). \end{aligned}$$

Finally, since

$$\mathbf{x}^{(n+1)} = \mathbf{x}^{(n)} - J\_F(\mathbf{x}^{(n)})^{-1} (F(\mathbf{x}^{(n)}) + F(\mathbf{y}^{(n)})),$$

we have

$$\mathbf{x}^{(n+1)} = \alpha + 2C_2^2E_n^3 - (9C_2^3 - 7C_2C_3)E_n^4 - (-30C_2^4 + 44C_2^2C_3 - 10C_2C_4 - 6C_3^2 + C_5)E_n^5 + O(\|E_n\|^6). \tag{8}$$

Clearly, the error Equation (8) shows that the order of convergence of the frozen Jacobian multi-step iterative method (3) is three. This completes the proof.
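The cubic rate established above is easy to check numerically in the scalar case. The sketch below (our illustration, not from the paper) applies one step of method (3) to the hypothetical function f(x) = e^x − 1, whose root is 0, and estimates the order from two starting errors.

```python
import math

def one_step_m3(f, df, x):
    """One iteration of the two-step method (3) in the scalar case."""
    y = x - f(x) / df(x)
    return x - (f(y) + f(x)) / df(x)

f, df = lambda x: math.exp(x) - 1.0, lambda x: math.exp(x)
e1, e2 = 1e-2, 1e-3                       # two initial errors (the root is x* = 0)
err1 = abs(one_step_m3(f, df, e1))
err2 = abs(one_step_m3(f, df, e2))
order = math.log(err1 / err2) / math.log(e1 / e2)   # should be close to 3
```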

#### 2.1.2. The Computational Efficiency

In this section, we compare the computational efficiency of our third-order scheme (3), denoted as *M*3, with some existing third-order methods. We will assess the efficiency index of our new frozen Jacobian multi-step iterative method in contrast with the existing methods for systems of nonlinear equations, using two famous efficiency indices. The first one is the classical efficiency index [33] as

$$IE = p^{\frac{1}{c}},$$

where *p* is the rate of convergence and *c* stands for the total computational cost per iteration in terms of the number of functional evaluations, such that *c* = (*rn* + *mn*2) where *r* refers to the number of function evaluations needed per iteration and *m* is the number of Jacobian matrix evaluations needed per iteration.

It is well known that the computation of *LU* factorization by any of the existing methods in the literature normally needs 2*n*3/3 flops in floating point operations, while the floating point operations to solve two triangular systems needs 2*n*<sup>2</sup> flops.

The second criterion is the flops-like efficiency index (*FLEI*) which was defined by Montazeri et al. [34] as

$$FLEI = p^{\frac{1}{c}},$$

where *p* is the order of convergence of the method and *c* denotes the total computational cost per loop in terms of the number of functional evaluations, as well as the cost of the *LU* factorization and of solving the two triangular systems (based on the flops).
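For concreteness, both indices can be evaluated as functions of *n*. The cost model in the sketch below (ours, not the authors') follows the counts quoted above: *rn* + *mn*² functional-evaluation cost, one *LU* factorization at 2*n*³/3 flops, and 2*n*² flops per pair of triangular solves; the per-method values of *r*, *m* and the number of solves are our reading of scheme (3).

```python
def classical_ie(p, r, m, n):
    """Classical efficiency index IE = p**(1/c) with c = r*n + m*n**2."""
    return p ** (1.0 / (r * n + m * n ** 2))

def flops_like_ei(p, r, m, n, n_solves):
    """Flops-like index: the cost additionally counts one LU factorization
    (2n^3/3 flops) and 2n^2 flops per pair of triangular solves."""
    c = r * n + m * n ** 2 + 2 * n ** 3 / 3 + n_solves * 2 * n ** 2
    return p ** (1.0 / c)

# M3 (scheme (3)): p = 3, two F evaluations, one Jacobian, two solves per iteration.
n = 50
ie_m3 = classical_ie(3, r=2, m=1, n=n)
flei_m3 = flops_like_ei(3, r=2, m=1, n=n, n_solves=2)
```

As expected, the flops-like index is smaller than the classical one, since it charges for the linear algebra as well.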

As the first comparison, we compare *M*<sup>3</sup> with the third-order method given by Darvishi [35], which is denoted as *M*3,1

$$\begin{array}{l} \mathbf{y}^{(k)} = \mathbf{x}^{(k)} - J\_F(\mathbf{x}^{(k)})^{-1} F(\mathbf{x}^{(k)}),\\ \mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - 2(J\_F(\mathbf{x}^{(k)}) + J\_F(\mathbf{y}^{(k)}))^{-1} F(\mathbf{x}^{(k)}). \end{array}$$

The second iterative method shown by *M*3,2 is the following third-order method introduced by Hernández [36]

$$\begin{cases} \mathbf{y}^{(k)} = \mathbf{x}^{(k)} - \frac{1}{2}J_F(\mathbf{x}^{(k)})^{-1}F(\mathbf{x}^{(k)}),\\ \mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + J_F(\mathbf{x}^{(k)})^{-1}\big(J_F(\mathbf{y}^{(k)}) - 2J_F(\mathbf{x}^{(k)})\big) \times J_F(\mathbf{x}^{(k)})^{-1}F(\mathbf{x}^{(k)}). \end{cases}$$

Another method is the following third-order iterative method given by Babajee et al. [37], *M*3,3,

$$\begin{array}{l} \mathbf{y}^{(k)} = \mathbf{x}^{(k)} - J\_F(\mathbf{x}^{(k)})^{-1} F(\mathbf{x}^{(k)}),\\ \mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \frac{1}{2} J\_F(\mathbf{x}^{(k)})^{-1} (J\_F(\mathbf{y}^{(k)}) - 3J\_F(\mathbf{x}^{(k)})) \times J\_F(\mathbf{x}^{(k)})^{-1} F(\mathbf{x}^{(k)}). \end{array}$$

Finally, the following third-order iterative method, *M*3,4, ref. [38] is considered

$$\begin{array}{l} \mathbf{y}^{(k)} = \mathbf{x}^{(k)} - \frac{2}{3} J\_F(\mathbf{x}^{(k)})^{-1} F(\mathbf{x}^{(k)}),\\ \mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - 4(J\_F(\mathbf{x}^{(k)}) + 3J\_F(\mathbf{y}^{(k)}))^{-1} F(\mathbf{x}^{(k)}).\end{array}$$

The computational efficiency comparison reveals that our method, *M*3, is the best among the methods *M*3,1, *M*3,2, *M*3,3 and *M*3,4, as presented in Table 1 and Figures 1 and 2.

**Table 1.** Comparison of efficiency indices between *M*<sup>3</sup> and other third-order methods.


**Figure 1.** The classical efficiency index for methods *M*3, *M*3,1, *M*3,2, *M*3,3 and *M*3,4.

**Figure 2.** The flops-like efficiency index for methods *M*3, *M*3,1, *M*3,2, *M*3,3 and *M*3,4.

#### *2.2. The Fourth-Order FJA*

By setting *m* = 3 in FJA, the following three-step algorithm is deduced

$$\begin{array}{l} \mathbf{y}^{(k)} = \mathbf{x}^{(k)} - f\_{\mathcal{F}}(\mathbf{x}^{(k)})^{-1} F(\mathbf{x}^{(k)}),\\ \mathbf{z}^{(k)} = \mathbf{x}^{(k)} - f\_{\mathcal{F}}(\mathbf{x}^{(k)})^{-1} (F(\mathbf{y}^{(k)}) + F(\mathbf{x}^{(k)})),\\ \mathbf{x}^{(k+1)} = \mathbf{y}^{(k)} - f\_{\mathcal{F}}(\mathbf{x}^{(k)})^{-1} (F(\mathbf{z}^{(k)}) + F(\mathbf{y}^{(k)})).\end{array} \tag{9}$$

In the following subsections, the order of convergence and efficiency indices are obtained for the method described in (9).

#### 2.2.1. Convergence Analysis

The frozen Jacobian three-step iterative process (9) has convergence order four, using three evaluations of the function *F* and one first-order Fréchet derivative *F'* per full iteration. To avoid repetition, we only sketch the proof. Similar to the proof of Theorem 1, by setting **z**(*k*) = **x**(*k*+1) in (8), we obtain

$$F(\mathbf{z}^{(k)}) = F'(\alpha)\Big[2C_2^2E_n^3 + (-9C_2^3 + 7C_2C_3)E_n^4 + (30C_2^4 - 44C_2^2C_3 + 10C_2C_4 + 6C_3^2 - C_5)E_n^5 + O(\|E_n\|^6)\Big].$$

Hence,

$$\begin{split} \left( F(\mathbf{z}^{(k)}) + F(\mathbf{y}^{(k)}) \right) &= F'(\boldsymbol{\alpha}) \left[ \mathbf{C}\_2 \boldsymbol{E}\_n^2 + 2 \mathbf{C}\_3 \mathbf{E}\_n^3 + (-4 \mathbf{C}\_2^3 + 3 \mathbf{C}\_4) \mathbf{E}\_n^4 \\ &+ (32 \mathbf{C}\_2^5 + 18 \mathbf{C}\_2^4 - 20 \mathbf{C}\_2^2 \mathbf{C}\_3 + 3 \mathbf{C}\_5) \mathbf{E}\_n^5 + O(|\boldsymbol{E}\_n^6|) \right]. \end{split} \tag{10}$$

Therefore, from (5) and (10), we find

$$F'(\mathbf{x}^{(k)})^{-1}\big(F(\mathbf{z}^{(k)}) + F(\mathbf{y}^{(k)})\big) = C_2E_n^2 + (-2C_2^2 + 2C_3)E_n^3 + (-7C_2C_3 + 3C_4)E_n^4 + (18C_2^4 - 10C_2C_4 - 6C_3^2 + 3C_5)E_n^5 + O(\|E_n\|^6). \tag{11}$$

Since **<sup>x</sup>**(*k*+1) <sup>=</sup> **<sup>y</sup>**(*k*) <sup>−</sup> *JF*(**x**(*k*))−1(*F*(**z**(*k*)) + *<sup>F</sup>*(**y**(*k*))), from (6) and (11) the following result is obtained

$$\mathbf{x}^{(k+1)} = \mathbf{a} + (4\mathbf{C}\_2^3)\mathbf{E}\_n^4 + (32\mathbf{C}\_2^5 - 26\mathbf{C}\_2^4 + 20\mathbf{C}\_2^2\mathbf{C}\_3 + \mathbf{C}\_5)\mathbf{E}\_n^5 + O||\mathbf{E}\_n^6||.\tag{12}$$

This completes the proof, since error Equation (12) shows that the order of convergence of the frozen Jacobian multi-step iterative method (9) is four.
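The fourth-order rate can also be checked numerically in the scalar case; the sketch below (our illustration, not from the paper) applies one step of the three-step method (9) to the hypothetical function f(x) = e^x − 1 and estimates the order from two starting errors.

```python
import math

def one_step_m4(f, df, x):
    """One iteration of the three-step method (9) in the scalar case."""
    d = df(x)
    y = x - f(x) / d
    z = x - (f(y) + f(x)) / d
    return y - (f(z) + f(y)) / d

f, df = lambda x: math.exp(x) - 1.0, lambda x: math.exp(x)
e1, e2 = 1e-1, 1e-2                       # two initial errors (the root is x* = 0)
order = math.log(abs(one_step_m4(f, df, e1)) / abs(one_step_m4(f, df, e2))) \
        / math.log(e1 / e2)               # should be close to 4
```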

#### 2.2.2. The Computational Efficiency

Now, we compare the computational efficiency of our fourth-order scheme (9), denoted by *M*4, with some existing fourth-order methods. The considered methods are: the fourth-order method *M*4,1 given by Sharma et al. [39],

$$\begin{aligned} \mathbf{y}^{(k)} &= \mathbf{x}^{(k)} - \tfrac{2}{3}J_F(\mathbf{x}^{(k)})^{-1}F(\mathbf{x}^{(k)}),\\ \mathbf{x}^{(k+1)} &= \mathbf{x}^{(k)} - \tfrac{1}{2}\left[-I + \tfrac{9}{4}J_F(\mathbf{y}^{(k)})^{-1}J_F(\mathbf{x}^{(k)}) + \tfrac{3}{4}J_F(\mathbf{x}^{(k)})^{-1}J_F(\mathbf{y}^{(k)})\right] \times J_F(\mathbf{x}^{(k)})^{-1}F(\mathbf{x}^{(k)}), \end{aligned}$$

the fourth-order iterative method *M*4,2 given by Darvishi and Barati [40],

$$\begin{aligned} \mathbf{y}^{(k)} &= \mathbf{x}^{(k)} - J\_F(\mathbf{x}^{(k)})^{-1} F(\mathbf{x}^{(k)}),\\ \mathbf{z}^{(k)} &= \mathbf{x}^{(k)} - J\_F(\mathbf{x}^{(k)})^{-1} \left( F(\mathbf{y}^{(k)}) + F(\mathbf{x}^{(k)}) \right),\\ \mathbf{x}^{(k+1)} &= \mathbf{x}^{(k)} - \left[ \frac{1}{6} J\_F(\mathbf{x}^{(k)}) + \frac{2}{3} J\_F(\frac{(\mathbf{x}^{(k)} + \mathbf{z}^{(k)})}{2}) + \frac{1}{6} J\_F(\mathbf{z}^{(k)}) \right]^{-1} F(\mathbf{x}^{(k)}), \end{aligned}$$

the fourth-order iterative method *M*4,3 given by Soleymani et al. [34,41],

$$\begin{aligned} \mathbf{y}^{(k)} &= \mathbf{x}^{(k)} - \tfrac{2}{3}J_F(\mathbf{x}^{(k)})^{-1}F(\mathbf{x}^{(k)}),\\ \mathbf{x}^{(k+1)} &= \mathbf{x}^{(k)} - \left[I - \tfrac{3}{8}\left(I - \big(J_F(\mathbf{y}^{(k)})^{-1}J_F(\mathbf{x}^{(k)})\big)^2\right)\right]J_F(\mathbf{x}^{(k)})^{-1}F(\mathbf{x}^{(k)}), \end{aligned}$$

and the following Jarratt fourth-order method *M*4,4 [42],

$$\begin{aligned} \mathbf{y}^{(k)} &= \mathbf{x}^{(k)} - \tfrac{2}{3}J_F(\mathbf{x}^{(k)})^{-1}F(\mathbf{x}^{(k)}),\\ \mathbf{x}^{(k+1)} &= \mathbf{x}^{(k)} - \tfrac{1}{2}\big(3J_F(\mathbf{y}^{(k)}) - J_F(\mathbf{x}^{(k)})\big)^{-1}\big(3J_F(\mathbf{y}^{(k)}) + J_F(\mathbf{x}^{(k)})\big) \times J_F(\mathbf{x}^{(k)})^{-1}F(\mathbf{x}^{(k)}). \end{aligned}$$

The computational efficiency comparison shows that our method *M*<sup>4</sup> is better than methods *M*4,1, *M*4,2, *M*4,3 and *M*4,4, as presented in Table 2 and Figures 3 and 4. As we can see from Table 2, the indices of our method *M*<sup>4</sup> are better than the corresponding ones of methods *M*4,1, *M*4,2, *M*4,3 and *M*4,4. Furthermore, Figures 3 and 4 show the superiority of our method with respect to the other schemes.

**Table 2.** Comparison of efficiency indices between *M*<sup>4</sup> and other fourth-order methods.


**Figure 3.** The classical efficiency index for methods *M*4, *M*4,1, *M*4,2, *M*4,3 and *M*4,4.

**Figure 4.** The Flops-like efficiency index for methods *M*4, *M*4,1, *M*4,2, *M*4,3 and *M*4,4.

#### **3. Numerical Results**

In order to check the validity and efficiency of our proposed frozen Jacobian multi-step iterative methods, three test problems are considered to illustrate the convergence and computational behavior, such as the efficiency index and some other indices, of the frozen Jacobian multi-step iterative methods. Numerical computations have been performed using variable precision arithmetic with a floating point representation of 100 decimal digits of mantissa in MATLAB. The computer specifications are: Intel(R) Core(TM) i7-1065G7 CPU 1.30 GHz with 16.00 GB of RAM on Windows 10 Pro.

**Experiment 1.** We begin with the following nonlinear system of *n* equations [43],

$$f\_i(\mathbf{x}) = \cos(\mathbf{x}\_i) - 1, \quad i = 1, 2, \dots, n. \tag{13}$$

The exact zero of $F(\mathbf{x}) = (f_1(\mathbf{x}), f_2(\mathbf{x}), \ldots, f_n(\mathbf{x}))^t = \mathbf{0}$ is $(0, 0, \ldots, 0)^t$. To solve (13), we set the initial guess as $(0.78, 0.78, \ldots, 0.78)^t$. The stopping criterion is selected as $\|F(\mathbf{x}^{(k)})\| \le 10^{-3}$.
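The setup of this experiment can be reproduced in a few lines; the sketch below (ours, not the authors' MATLAB harness) builds system (13) with its diagonal Jacobian and runs plain Newton as a baseline, with the paper's initial guess and stopping criterion.

```python
import numpy as np

n = 50
F = lambda x: np.cos(x) - 1.0          # system (13), componentwise
J = lambda x: np.diag(-np.sin(x))      # its (diagonal) Jacobian

x = np.full(n, 0.78)                   # initial guess from Experiment 1
it = 0
while np.linalg.norm(F(x), np.inf) > 1e-3 and it < 25:
    x = x - np.linalg.solve(J(x), F(x))
    it += 1
```

Note that the zero of cos(x) − 1 is a multiple root, so plain Newton only converges linearly here; the iteration counts of the paper's methods should be read against that backdrop.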

**Experiment 2.** The next test problem is the following system of nonlinear equations [44],

$$f\_i(\mathbf{x}) = (1 - \mathbf{x}\_i^2) + \mathbf{x}\_i (1 + \mathbf{x}\_i \mathbf{x}\_{n-2} \mathbf{x}\_{n-1} \mathbf{x}\_n) - 2, \quad i = 1, 2, \dots, n. \tag{14}$$

The exact root of $F(\mathbf{x}) = \mathbf{0}$ is $(1, 1, \ldots, 1)^t$. To solve (14), the initial guess is taken as $(2, 2, \ldots, 2)^t$. The stopping criterion is selected as $\|F(\mathbf{x}^{(k)})\| \le 10^{-8}$.

**Experiment 3.** The last test problem is the following nonlinear system [9],

$$\begin{cases} f_i(\mathbf{x}) = x_i^2 x_{i+1} - 1, & i = 1, 2, \ldots, n-1,\\ f_n(\mathbf{x}) = x_n^2 x_1 - 1, \end{cases} \tag{15}$$

with the exact solution $(1, 1, \ldots, 1)^t$. To solve (15), the initial guess and the stopping criterion are respectively taken as $(3, 3, \ldots, 3)^t$ and $\|F(\mathbf{x}^{(k)})\| \le 10^{-8}$.

Table 3 shows the comparison results between our third-order frozen Jacobian two-step iterative method *M*<sup>3</sup> and some third-order frozen Jacobian iterative methods, namely, *M*3,1, *M*3,2, *M*3,3 and *M*3,4. For all test problems, two different values of *n* are considered, namely, *n* = 50, 100. As this table shows, in all cases, our method works better than the others. Similarly, in Table 4, CPU time and number of iterations are presented for our fourth-order method *M*<sup>4</sup> and methods *M*4,1, *M*4,2, *M*4,3 and *M*4,4. Similar to *M*3, the CPU time for *M*<sup>4</sup> is less than the CPU time for the other methods. These tables show the superiority of our methods with respect to the other ones. In Tables 3 and 4, *it* denotes the number of iterations.

**Table 3.** Comparison results between *M*<sup>3</sup> and other third-order methods.


**Table 4.** Comparison results between *M*<sup>4</sup> and other fourth-order methods.


#### **4. Another Comparison**

In the previous sections, we compared our methods *M*<sup>3</sup> and *M*<sup>4</sup> with some other third- and fourth-order frozen Jacobian multi-step iterative methods. In this section, we compare our methods with three further methods of fourth and fifth order. As Tables 5 and 6 and Figures 5 and 6 show, our methods also outperform these methods.

**First.** The fourth-order method given by Qasim et al. [25], *MA*,

$$\begin{array}{l} J_F(\mathbf{x}^{(k)})\, \theta_1 = F(\mathbf{x}^{(k)}), \\ \mathbf{y}^{(k)} = \mathbf{x}^{(k)} - \theta_1, \\ J_F(\mathbf{x}^{(k)})\, \theta_2 = F(\mathbf{y}^{(k)}), \\ J_F(\mathbf{x}^{(k)})\, \theta_3 = J_F(\mathbf{y}^{(k)})\, \theta_2, \\ \mathbf{x}^{(k+1)} = \mathbf{y}^{(k)} - 2\theta_2 + \theta_3. \end{array}$$

**Second.** The fourth-order Newton-like method by Amat et al. [26], *MB*,

$$\begin{array}{l} \mathbf{y}^{(k)} = \mathbf{x}^{(k)} - J_F(\mathbf{x}^{(k)})^{-1} F(\mathbf{x}^{(k)}), \\ \mathbf{z}^{(k)} = \mathbf{y}^{(k)} - J_F(\mathbf{x}^{(k)})^{-1} F(\mathbf{y}^{(k)}), \\ \mathbf{x}^{(k+1)} = \mathbf{z}^{(k)} - J_F(\mathbf{x}^{(k)})^{-1} F(\mathbf{z}^{(k)}). \end{array}$$

**Third.** The fifth-order iterative method by Ahmad et al. [28], *MC*,

$$\begin{array}{l} J\_F(\mathbf{x}^{(k)}) \theta\_1 = F(\mathbf{x}^{(k)}), \\ \mathbf{y}^{(k)} = \mathbf{x}^{(k)} - \theta\_1, \\ J\_F(\mathbf{x}^{(k)}) \theta\_2 = F(\mathbf{y}^{(k)}), \\ \mathbf{z}^{(k)} = \mathbf{y}^{(k)} - 3 \theta\_2, \\ J\_F(\mathbf{x}^{(k)}) \theta\_3 = J\_F(\mathbf{z}^{(k)}) \theta\_2, \\ J\_F(\mathbf{x}^{(k)}) \theta\_4 = J\_F(\mathbf{z}^{(k)}) \theta\_3, \\ \mathbf{x}^{(k+1)} = \mathbf{y}^{(k)} - \frac{7}{4} \theta\_2 + \frac{1}{2} \theta\_3 + \frac{1}{4} \theta\_4. \end{array}$$
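The schemes *MA* and *MB* above are explicit enough to sketch directly. The following minimal implementation applies both to test problem (15); the dimension, tolerance and starting point are illustrative choices, and in production code the frozen Jacobian would be LU-factored once per outer iteration rather than re-solved:

```python
import numpy as np

def F(x):
    # Test problem (15): f_i = x_i^2 * x_{i+1} - 1 cyclically, root (1, ..., 1)
    return x**2 * np.roll(x, -1) - 1.0

def J(x):
    n = len(x)
    Jm = np.zeros((n, n))
    nxt = np.roll(x, -1)
    for i in range(n):
        Jm[i, i] = 2.0 * x[i] * nxt[i]
        Jm[i, (i + 1) % n] = x[i] ** 2
    return Jm

def MA_step(x):
    # Qasim et al. fourth-order step: Jacobian frozen at x, re-evaluated only at y
    Jx = J(x)
    y = x - np.linalg.solve(Jx, F(x))
    t2 = np.linalg.solve(Jx, F(y))
    t3 = np.linalg.solve(Jx, J(y) @ t2)
    return y - 2.0 * t2 + t3

def MB_step(x):
    # Amat et al. fourth-order Newton-like step: three corrections, one frozen Jacobian
    Jx = J(x)
    y = x - np.linalg.solve(Jx, F(x))
    z = y - np.linalg.solve(Jx, F(y))
    return z - np.linalg.solve(Jx, F(z))

def solve(step, x0, tol=1e-8, max_it=100):
    x, it = x0.copy(), 0
    while np.linalg.norm(F(x)) > tol and it < max_it:
        x = step(x)
        it += 1
    return x, it

results = {}
for step in (MA_step, MB_step):
    xs, its = solve(step, np.full(50, 3.0))
    results[step.__name__] = (xs, its)
```

Both schemes reach $\|F(\mathbf{x}^{(k)})\| \le 10^{-8}$ in a few outer iterations; reusing the single factorization of $J_F(\mathbf{x}^{(k)})$ across the inner corrections is where the frozen Jacobian saving comes from.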

**Figure 5.** The classical efficiency index for *M*3, *M*4, *MA*, *MB* and *MC*.

**Figure 6.** The Flops-like efficiency index for *M*3, *M*4, *MA*, *MB* and *MC*.
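The efficiency indices plotted in Figures 5 and 6 are not defined in this excerpt; the classical efficiency index commonly used in such comparisons (due to Ostrowski) is

```latex
E = p^{1/d},
```

where $p$ is the order of convergence and $d$ is the number of functional evaluations per iteration; flops-like variants replace $d$ with an estimate of the total arithmetic cost per iteration, including the linear solves.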

**Table 5.** Numerical results comparing *M*<sup>3</sup> and *M*<sup>4</sup> with *MA*, *MB* and *MC*.


The comparison results of computational efficiency between our methods *M*<sup>3</sup> and *M*<sup>4</sup> with selected methods *MA*, *MB* and *MC* are presented in Table 5. Additionally, Figures 5 and 6 show the graphical comparisons between these methods. Finally, Table 6 shows CPU time and number of iterations to solve our test problems by methods *M*3, *M*4, *MA*, *MB* and *MC*. These numerical and graphical reports show the quality of our algorithms.

**Table 6.** Comparison results between *M*3, *M*4, *MA*, *MB* and *MC*.


#### **5. Conclusions**

In this article, two new frozen Jacobian two- and three-step iterative methods for solving systems of nonlinear equations are presented. For the first method, we proved third-order convergence, while for the second one, fourth-order convergence is proved. Solving three different examples shows that our methods work well. Further, the CPU time of our methods is lower than that of selected frozen Jacobian multi-step iterative methods from the literature. Moreover, other indices of our methods, such as the number of steps, the number of functional evaluations and the classical efficiency index, are better than those of the other methods. This class of frozen Jacobian multi-step iterative methods can serve as a pattern for new research on frozen Jacobian iterative algorithms.

**Author Contributions:** Investigation, R.H.A.-O. and M.T.D.; Project administration, M.T.D.; Resources, R.H.A.-O.; Supervision, M.T.D.; Writing—original draft, M.T.D. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors would like to thank the editor of the journal and three anonymous reviewers for their generous time in providing detailed comments and suggestions that helped us to improve the paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Approximation of the Fixed Point of the Product of Two Operators in Banach Algebras with Applications to Some Functional Equations**

**Khaled Ben Amara 1, Maria Isabel Berenguer 2,3,\* and Aref Jeribi <sup>1</sup>**


**Abstract:** Making use of the Boyd-Wong fixed point theorem, we establish a new existence and uniqueness result and an approximation process of the fixed point for the product of two nonlinear operators in Banach algebras. This provides an adequate tool for deriving the existence and uniqueness of solutions of two interesting type of nonlinear functional equations in Banach algebras, as well as for developing an approximation method of their solutions. In addition, to illustrate the applicability of our results we give some numerical examples.

**Keywords:** Banach algebras; fixed point theory; functional equations; Schauder bases

**MSC:** 65J15; 47J25; 46B15; 32A65

#### **1. Introduction**

Many phenomena in physics, chemistry, mechanics, electricity, and so on, can be formulated using the following nonlinear differential equation with a nonlocal initial condition:

$$\begin{cases} \dfrac{d}{dt} \left( \dfrac{x(t)}{f(t, x(t))} \right) = g(t, x(t)), & t \in J := [0, \rho], \\ x(0) = \mu(x), \end{cases} \tag{1}$$

where $\rho > 0$ is a real constant, $f : J \times \mathbb{R} \to \mathbb{R} \setminus \{0\}$ and $g : J \times \mathbb{R} \to \mathbb{R}$ are supposed to be D-Lipschitzian with respect to the second variable, and the operator $\mu : C(J) \to \mathbb{R}$ represents the nonlocal initial condition; see [1,2]. Here, $C(J)$ is the space of all continuous functions from $J$ into $\mathbb{R}$ endowed with the norm $\|x\|_\infty = \sup_{t \in J} |x(t)|$.

The nonlocal condition *x*(0) = *μ*(*x*) can be more descriptive in physics with better effect than the classical initial condition *x*(0) = *x*0, (see, e.g., [2–5]). In the last case, i.e., *x*(0) = *x*0, the problem (1) has been studied by Dhage [6] and O'Regan [7]. Therefore it is of interest to discuss and to approximate the solution of (1) with a nonlocal initial condition.

Similarly another class of nonlinear equations is used frequently to describe many phenomena in different fields of applied sciences such as physics, control theory, chemistry, biology, and so forth (see [8–11]). This class is generated by the nonlinear integral equations of the form:

$$\mathbf{x}(t) = f(t, \mathbf{x}(\sigma(t))) \cdot \left[ q(t) + \int\_0^{\eta(t)} K(t, s, \mathbf{x}(\tau(s))) ds \right], \quad t \in J := [0, \rho], \tag{2}$$

**Citation:** Ben Amara, K; Berenguer, M.I.; Jeribi, A. Approximation of the Fixed Point of the Product of Two Operators in Banach Algebras with Applications to Some Functional Equations. *Mathematics* **2022**, *10*, 4179. https://doi.org/10.3390/ math10224179

Academic Editor: Salvador Romaguera

Received: 19 September 2022 Accepted: 3 November 2022 Published: 9 November 2022



where $\rho > 0$ is a real constant, $\sigma, \tau, \eta : J \to J$ and $q : J \to \mathbb{R}$ are supposed to be continuous, and the functions $f : J \times \mathbb{R} \to \mathbb{R}$ and $K : J \times J \times \mathbb{R} \to \mathbb{R}$ are supposed to be D-Lipschitzian with respect to the second and the third variable, respectively.

Both, (1) and (2), can be interpreted as fixed point problems in which the equation involved is a nonlinear hybrid equation on a Banach algebra *E* of the type

$$\mathfrak{x} = A(\mathfrak{x}) \cdot B(\mathfrak{x}),\tag{3}$$

where *A* and *B* are nonlinear operators mapping a nonempty closed convex subset Ω ⊂ *E* into *E*.

A hybrid fixed point result for (3) was proved by Dhage in [12], and since then several extensions and generalizations of this result have been achieved; see [13–15] and the references therein. These results can be used to establish the existence of solutions. Although the explicit calculation of the fixed point is difficult in most cases, the previously cited results are among the most powerful tools for approximating the fixed point by a computational method and for developing numerical methods that approximate the solutions of these equations.

In Banach spaces, several works deal with developing numerical techniques to approximate the solutions of integral and integro-differential equations, using different methods such as Chebyshev polynomials [16], secant-like methods [17], Schauder bases [18,19], the parameterization method [20], wavelet methods [21], a collocation method combined with operational matrices of Bernstein polynomials [22], the contraction principle together with a suitable quadrature formula [23], the variational iteration method [24], etc.

Since Banach algebras represent a practical framework for several equations such as (1) and (2), and in general (3), the purposes of this paper are twofold: firstly, to present, under suitable conditions, a method to approximate the fixed point of a hybrid equation of type (3), defined by means of the product and composition of operators in a Banach algebra; secondly, to set forth and apply the proposed method to obtain an approximation of the solutions of (1) and (2).

The structure of this work is as follows: in Section 2 we present some definitions and auxiliary results; in Section 3 we derive an approximation method for the fixed point of the hybrid Equation (3); in Sections 4 and 5, we apply our results to prove the existence and the uniqueness of solution of (1) and (2), we give an approximation method for these solutions and moreover, we establish some numerical examples to illustrate the applicability of our results. Finally, some conclusions are quoted in Section 6.

#### **2. Analytical Tools**

In this section, we provide some concepts and results that we will need in the following sections. The first analytical tool to be used comes from fixed point theory. Let *X* be a Banach space with norm $\|\cdot\|$ and zero element *θ*. We denote by *B*(*x*, *r*) the closed ball centered at *x* with radius *r*, and we write *Br* for *B*(*θ*, *r*). For any bounded subset Ω of *X*, the symbol $\|\Omega\|$ denotes the norm of the set Ω, i.e., $\|\Omega\| = \sup\{\|x\| : x \in \Omega\}$.

Let us introduce the concept of D-Lipschitzian mappings which will be used in the sequel.

**Definition 1.** *Let X be a Banach space. A mapping A* : *X* −→ *X is said to be* D*-Lipschitzian if*

$$||Ax - Ay|| \le \phi(||x - y||) \quad \forall x, y \in X$$

*with $\varphi : \mathbb{R}^+ \to \mathbb{R}^+$ a continuous nondecreasing function such that $\varphi(0) = 0$. The mapping $\varphi$ is called the* D*-function associated with A. When $\varphi(r) < r$ for $r > 0$, the mapping A is called a nonlinear contraction on X.*

The class of D-Lipschitzian mappings on *X* contains the class of Lipschitzian mappings on *X*: indeed, if *φ*(*r*) = *α r* for some *α* > 0, then *A* is called a Lipschitzian mapping with Lipschitz constant *α*, or an *α*-Lipschitzian mapping. When 0 ≤ *α* < 1, we say that *A* is a contraction.

The Banach fixed point theorem ensures that every contraction operator *A* on a complete metric space *X* has a unique fixed point *x*˜ ∈ *X*, and, for every *x*<sup>0</sup> ∈ *X*, the sequence {*An*(*x*0)}*n*∈<sup>N</sup> converges to *x*˜. Much attention has been paid to the Banach principle, and it has been generalized in different works (we quote, for instance, [25,26]). In [25], Boyd and Wong established the following result.

**Theorem 1.** *Let* (*X*, *d*) *be a complete metric space, and let A* : *X* → *X be a mapping satisfying*

$$d(A(x), A(y)) \le \varphi(d(x, y)), \ \forall x, y \in X,$$

*where ϕ* : [0, ∞) → [0, ∞) *is a continuous function such that ϕ*(*r*) < *r if r* > 0*. Then A has a unique fixed point <sup>x</sup>*˜ <sup>∈</sup> *X and for any x*<sup>0</sup> <sup>∈</sup> *<sup>X</sup>*, *the sequence* {*An*(*x*0)}*n*∈<sup>N</sup> *converges to <sup>x</sup>*˜.
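A concrete scalar illustration (a standard textbook example, not taken from this paper): the map $A(x) = x/(1+x)$ on $[0, \infty)$ satisfies the Boyd–Wong hypothesis with $\varphi(t) = t/(1+t)$, yet is not a Banach contraction, since $\sup |A'| = 1$:

```python
# Boyd-Wong nonlinear contraction on [0, inf): A(x) = x/(1+x),
# with comparison function phi(r) = r/(1+r) < r for r > 0; the fixed point is 0.
# No constant k < 1 works globally, yet the Picard iterates still converge
# (slowly: x_n = 1/(n + 1) when x_0 = 1), as Theorem 1 asserts.

def A(x):
    return x / (1.0 + x)

x = 1.0
for _ in range(2000):
    x = A(x)
```

After 2000 iterations the iterate equals $1/2001 \approx 5 \times 10^{-4}$, illustrating the sublinear (but guaranteed) convergence typical of nonlinear contractions.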

On the other hand, Schauder bases will constitute the second essential tool. We recall that a Schauder basis in a Banach space *E* is a sequence {*en*}*n*∈<sup>N</sup> ⊂ *E* such that for every *<sup>x</sup>* <sup>∈</sup> *<sup>E</sup>*, there is a unique sequence {*an*}*n*∈<sup>N</sup> <sup>⊂</sup> <sup>R</sup> such that

$$\mathfrak{x} = \sum\_{m \ge 1} a\_m e\_m.$$

This notion produces the concept of the sequence of projections *Pn* : *E* → *E*, defined by the formula

$$P_n\left( \sum_{k \ge 1} a_k e_k \right) = \sum_{k=1}^{n} a_k e_k,$$

and the sequence of coordinate functionals *e*∗ *<sup>n</sup>* ∈ *E*<sup>∗</sup> defined as

$$e\_n^\* \left(\sum\_{k\ge 1} a\_k e\_k\right) = a\_n.$$

Moreover, in view of the Baire category theorem [27], for all *n* ≥ 1, *e*<sup>∗</sup> *<sup>n</sup>* and *Pn* are continuous. This yields, in particular, that for every *x* ∈ *E*,

$$\lim\_{n \to \infty} ||P\_n(\mathbf{x}) - \mathbf{x}|| = 0.$$
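For the classical Faber–Schauder basis of $C[0,1]$, the projection onto the span of the first $2^k + 1$ basis elements coincides with piecewise-linear interpolation at the dyadic nodes $j/2^k$, so the convergence of $P_n$ can be observed numerically (a sketch for this particular basis):

```python
import numpy as np

def P_n(f, k, t):
    """Projection onto the span of the first 2^k + 1 Faber-Schauder basis
    elements of C[0,1]: equals piecewise-linear interpolation of f at the
    dyadic nodes j/2^k (a classical fact about this basis)."""
    nodes = np.linspace(0.0, 1.0, 2**k + 1)
    return np.interp(t, nodes, f(nodes))

f = lambda t: np.sin(3.0 * t) + t**2          # an arbitrary continuous test function
t = np.linspace(0.0, 1.0, 2001)               # fine grid to estimate the sup norm
errs = [np.max(np.abs(P_n(f, k, t) - f(t))) for k in range(1, 8)]
# errs decreases toward 0, illustrating ||P_n(x) - x|| -> 0
```

For smooth functions the sup-norm error decays like $O(4^{-k})$, consistent with the uniform convergence of the projections.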

#### **3. Existence, Uniqueness and Approximation of a Fixed Point of the Product of Two Operators in Banach Algebras**

Based on the Boyd-Wong Theorem, we establish the following fixed point result for the product of two nonlinear operators in Banach algebras.

**Theorem 2.** *Let X be a nonempty closed convex subset of a Banach algebra E*. *Let A*, *B* : *X* → *E be two operators satisfying the following conditions:*


*Then, if $\|A(X)\|\,\psi(r) + \|B(X)\|\,\varphi(r) < r$ when $r > 0$, there is a unique point $\tilde{x} \in X$ such that $A(\tilde{x}) \cdot B(\tilde{x}) = \tilde{x}$. In addition, for each $x_0 \in X$, the sequence $\{(A \cdot B)^n(x_0)\}_{n \in \mathbb{N}}$ converges to $\tilde{x}$.*

**Proof.** Let *x*, *y* ∈ *X*. We have

$$\begin{aligned} \|A(x) \cdot B(x) - A(y) \cdot B(y)\| &\le \|A(x)\| \, \|B(x) - B(y)\| + \|B(y)\| \, \|A(x) - A(y)\| \\ &\le \|A(X)\| \, \psi(\|x - y\|) + \|B(X)\| \, \varphi(\|x - y\|). \end{aligned}$$

This implies that *A* · *B* defines a nonlinear contraction with D-function

$$\phi(r) = \|A(X)\| \, \psi(r) + \|B(X)\| \, \varphi(r), \ r > 0.$$

Applying the cited Boyd-Wong's fixed point Theorem, we obtain the desired result.

Boyd–Wong's fixed point theorem expresses the fixed point of *A* · *B* as the limit of the sequence {(*A* · *B*)*n*(*x*0)}*n*∈<sup>N</sup> with *x*<sup>0</sup> ∈ *X*. If it were possible to explicitly compute (*A* · *B*)*n*(*x*0), then for each *n*, the expression (*A* · *B*)*n*(*x*0) would be an approximation of the fixed point. In practice, however, this explicit calculation is usually not possible. For that reason, our aim is to propose another approximation of the fixed point which is simple to calculate. We will need the following lemma.
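To make the Picard iterates concrete, here is a toy instance of (3) on the Banach algebra $\mathbb{R}$, with hypothetical operators $A$ and $B$ chosen so that the product map is a (nonlinear) contraction:

```python
import math

# Toy hybrid fixed point problem x = A(x) * B(x) on the Banach algebra R.
# A and B are illustrative choices: the derivative of x -> A(x)*B(x) is
# bounded by roughly 0.26 < 1, so the Picard iterates (A.B)^n(x0) converge
# as Theorem 2 asserts.
A = lambda x: 0.5 + 0.1 * math.sin(x)
B = lambda x: 1.0 + 0.2 * math.cos(x)

x = 0.0
for _ in range(100):
    x = A(x) * B(x)

residual = abs(x - A(x) * B(x))   # fixed point residual, ~0 after 100 iterates
```

Here the limit is roughly $x \approx 0.65$; the point of the section is that when $B$ involves an integral operator, each Picard step itself must be approximated, which is where the operators $T_p$ below come in.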

**Lemma 1.** *Let X be a nonempty closed convex subset of a Banach algebra E*. *Let A*, *B* : *X* → *E be two* D*-Lipschitzian operators with* D*-functions ϕ and ψ*, *respectively, and A* · *B maps X into X*. *Moreover, suppose that*

$$
\phi(r) < r, \ r > 0.
$$

*Let <sup>x</sup>*˜ *be the unique fixed point of <sup>A</sup>* · *<sup>B</sup> and <sup>x</sup>*<sup>0</sup> <sup>∈</sup> *<sup>X</sup>*. *Let <sup>ε</sup>* <sup>&</sup>gt; 0, *<sup>m</sup>* <sup>∈</sup> <sup>N</sup>*, and <sup>T</sup>*0, *<sup>T</sup>*1, ... , *Tm* : *<sup>E</sup>* <sup>→</sup> *<sup>E</sup>*, *with T*<sup>0</sup> ≡ *I, I being the identity operator on E, such that*

$$\left\| \tilde{x} - (A \cdot B)^m(x_0) \right\| \le \frac{\varepsilon}{2} \tag{4}$$

*and*

$$\sum_{p=1}^{m-1} \phi^{m-p}\left( \left\| (A \cdot B) \circ T_{p-1} \circ \dots \circ T_1(x_0) - T_p \circ \dots \circ T_1(x_0) \right\| \right) + \left\| (A \cdot B) \circ T_{m-1} \circ \dots \circ T_1(x_0) - T_m \circ \dots \circ T_1(x_0) \right\| \le \frac{\varepsilon}{2}. \tag{5}$$

*Then,*

$$\left\| \tilde{x} - T_m \circ \dots \circ T_1(x_0) \right\| \le \varepsilon.$$

**Proof.** Arguing as in the proof of Theorem 2, it follows that *A* · *B* is a nonlinear contraction with D-function *φ*, and by an induction argument, it is easy to show that

$$\left\| (A \cdot B)^n(x) - (A \cdot B)^n(y) \right\| \le \phi^n(\|x - y\|), \ \forall x, y \in X. \tag{6}$$

By using the triangle inequality, we have

$$\begin{split} \left\|(A \cdot B)^{m}(\mathbf{x}\_{0}) - T\_{\mathfrak{m}} \circ \dots \circ T\_{1}(\mathbf{x}\_{0})\right\| &\leq \\ \left\|(A \cdot B)^{m-1} \circ (A \cdot B)(\mathbf{x}\_{0}) - (A \cdot B)^{m-1} \circ T\_{1}(\mathbf{x}\_{0})\right\| \\ &+ \left\|(A \cdot B)^{m-2} \circ (A \cdot B) \circ T\_{1}(\mathbf{x}\_{0}) - (A \cdot B)^{m-2} \circ T\_{2} \circ T\_{1}(\mathbf{x}\_{0})\right\| + \dots + \\ &+ \left\|(A \cdot B) \circ (A \cdot B) \circ T\_{m-2} \circ \dots \circ T\_{1}(\mathbf{x}\_{0}) - (A \cdot B) \circ T\_{m-1} \circ \dots \circ T\_{1}(\mathbf{x}\_{0})\right\| \\ &+ \left\|(A \cdot B) \circ T\_{m-1} \circ \dots \circ T\_{1}(\mathbf{x}\_{0}) - T\_{m} \circ \dots \circ T\_{1}(\mathbf{x}\_{0})\right\|. \end{split}$$

Taking into account (6), we obtain

$$\begin{aligned} \left\|(A \cdot B)^{m}(\mathbf{x}\_{0}) - T\_{m} \circ \dots \circ T\_{1}(\mathbf{x}\_{0})\right\| &\leq \\ \sum\_{p=1}^{m-1} \phi^{m-p} \left( \left\|(A \cdot B) \circ T\_{p-1} \circ \dots \circ T\_{1}(\mathbf{x}\_{0}) - T\_{p} \circ \dots \circ T\_{1}(\mathbf{x}\_{0})\right\| \right) \\ &+ \left\|(A \cdot B) \circ T\_{m-1} \circ \dots \circ T\_{1}(\mathbf{x}\_{0}) - T\_{m} \circ \dots \circ T\_{1}(\mathbf{x}\_{0})\right\|. \end{aligned}$$

This implies, by using the triangle inequality again, that

$$\begin{aligned} \|\tilde{x} - T_m \circ \dots \circ T_1(x_0)\| &\le \sum_{p=1}^{m-1} \phi^{m-p}\left( \left\| (A \cdot B) \circ T_{p-1} \circ \dots \circ T_1(x_0) - T_p \circ \dots \circ T_1(x_0) \right\| \right) \\ &\quad + \left\| (A \cdot B) \circ T_{m-1} \circ \dots \circ T_1(x_0) - T_m \circ \dots \circ T_1(x_0) \right\| + \left\| \tilde{x} - (A \cdot B)^m(x_0) \right\| \le \varepsilon. \end{aligned} \tag{7}$$
 $\square$

Taking into account the above lemma, observe that, under the previous hypotheses,

$$x^* = T_m \circ \dots \circ T_1(x_0) \approx \tilde{x}.$$

In order to obtain the approximation *x*<sup>∗</sup> = *Tm* ◦ ... ◦ *T*1(*x*0) of the fixed point *x*˜, it is evident that, given *ε* > 0, by Theorem 2, condition (4) is satisfied for *m* sufficiently large. So, we are interested in building *T*1, *T*2, . . ., *Tm* satisfying (5), i.e., with the idea that

$$(A \cdot B)^m(x_0) \approx T_m \circ \dots \circ T_1(x_0).$$

Schauder bases are the tool we will use next to build such operators. Concretely, for problems (1) and (2), which can be written as a fixed point problem *x* = *A*(*x*) · *B*(*x*) where *B* is given by an integral operator, we will choose to approximate only the operator *B*, which is difficult to compute in general, unlike the operator *A*, which is easy to calculate and needs no approximation. For this reason, we specifically propose the following scheme, in which we will construct *S*1, *S*2, ··· , *Sm*:

$$\begin{array}{ccl} & & x_0 \\ & & \downarrow \\ (A \cdot B)(x_0) & \approx & T_1(x_0) = A(x_0) \cdot S_1(x_0) \\ & & \downarrow \\ (A \cdot B)^2(x_0) & \approx & T_2 \circ T_1(x_0) = (A \cdot S_2) \circ T_1(x_0) \\ & & \vdots \\ & & \downarrow \\ (A \cdot B)^m(x_0) & \approx & T_m \circ \dots \circ T_1(x_0) = (A \cdot S_m) \circ T_{m-1} \circ \dots \circ T_1(x_0) \approx \tilde{x} \end{array}$$

**Remark 1.** *The above scheme is constructed as follows. In the first term, we approximate B*(*x*0) *by S*1(*x*0); *then we obtain T*1(*x*0) := *A*(*x*0) · *S*1(*x*0) *as an approximation of the first Picard iterate, A*(*x*0) · *B*(*x*0). *In the second term of our scheme, we approximate the second Picard iterate,* (*A* · *B*)2(*x*0) = *A*((*A* · *B*)(*x*0)) · *B*((*A* · *B*)(*x*0)). *We do so by combining the first term T*1(*x*0) *with an approximation of the operator B*, *which is denoted by S*2, *and consequently we obtain the second term of our scheme, T*<sup>2</sup> ◦ *T*1(*x*0) = (*A* · *S*2)(*T*1(*x*0)), *which approximates* (*A* · *B*)2(*x*0).

#### **4. Nonlinear Differential Equations with Nonlocal Initial Condition**

In this section we focus our attention on the nonlinear differential equation with nonlocal initial condition (1). This equation will be studied when the mappings *f*, *g* : *J* × R → R are such that:

*(i)* The partial mappings *t* → *f*(*t*, *x*), *t* → *g*(*t*, *x*) are continuous and the mapping *<sup>μ</sup>* : *<sup>C</sup>*(*J*) <sup>→</sup> <sup>R</sup> is *<sup>L</sup>μ*-Lipschitzian.

*(ii)* There exist *<sup>r</sup>* <sup>&</sup>gt; 0, *<sup>α</sup>*, *<sup>γ</sup>* : *<sup>J</sup>* <sup>→</sup> <sup>R</sup> two continuous functions and *<sup>ϕ</sup>*, *<sup>ψ</sup>* : <sup>R</sup><sup>+</sup> −→ <sup>R</sup><sup>+</sup> two nondecreasing, continuous functions such that

$$|f(t, x) - f(t, y)| \le \alpha(t)\, \varphi(|x - y|), \quad t \in J, \ x, y \in \mathbb{R} \text{ with } |x|, |y| \le r,$$

and

$$|g(t, x) - g(t, y)| \le \gamma(t)\, \psi(|x - y|), \quad t \in J, \ x, y \in \mathbb{R} \text{ with } |x|, |y| \le r.$$

*(iii)* There is a constant $\delta > 0$ such that $\sup_{x \in \mathbb{R},\, |x| \le r} |f(0, x)|^{-1} \le \delta$.

Throughout this section, Ω will denote the closed ball *Br* of *C*(*J*), where *r* is defined in the above assumption (*ii*). Observe that Ω is a non-empty, closed, convex and bounded subset of *C*(*J*).

#### *4.1. Existence and Uniqueness of Solutions*

In this subsection, we prove the existence and the uniqueness of a solution to the functional differential problem (1).

**Theorem 3.** *Assume that the assumptions* (*i*)*,* (*ii*) *and* (*iii*) *hold. If*

$$M_A M_B \le r \quad \text{and}$$

$$M_A \delta L_\mu t + \left( M_A \delta^2 |\alpha(0)|\, M_\mu + M_B \|\alpha\|_\infty \right) \varphi(t) + M_A \|\gamma\|_{L^1}\, \psi(t) < t, \ \forall t > 0,$$

*where $M_A = \|\alpha\|_\infty\, \varphi(r) + \|f(\cdot, 0)\|_\infty$, $M_B = \delta M_\mu + \rho \|\gamma\|_\infty\, \psi(r) + \rho \|g(\cdot, 0)\|_\infty$ and $M_\mu = L_\mu r + |\mu(0)|$, then the nonlinear differential problem* (1) *has a unique solution in* Ω*.*

**Proof.** Notice that the problem of the existence of a solution to (1) can be formulated in the following fixed point problem *x* = *A*(*x*) · *B*(*x*), where *A*, *B* are given for *x* ∈ *C*(*J*) by

$$\begin{array}{ll} (A(x))(t) &= f(t, x(t)), \\ (B(x))(t) &= \left[ \dfrac{1}{f(0, x(0))}\, \mu(x) + \displaystyle\int_0^t g(s, x(s))\, ds \right], \quad t \in J. \end{array} \tag{8}$$

Let *x* ∈ Ω and *t*, *t*′ ∈ *J*. Since *f* is D-Lipschitzian with respect to the second variable and continuous with respect to the first variable, by using the inequality

$$|f(t, \mathbf{x}(t)) - f(t', \mathbf{x}(t'))| \le |f(t, \mathbf{x}(t)) - f(t', \mathbf{x}(t))| + |f(t', \mathbf{x}(t)) - f(t', \mathbf{x}(t'))|.$$

we can show that *A* maps Ω into *C*(*J*).

Now, let us claim that *B* maps Ω into *C*(*J*). In fact, let *x* ∈ Ω and *t*, *t*′ ∈ *J* be arbitrary. Taking into account that *t* → *g*(*t*, *x*) is a continuous mapping, it follows from assumption (*ii*) that

$$|(B(x))(t) - (B(x))(t')| \le \int_{t'}^{t} |g(s, x(s)) - g(s, 0)|\, ds + (t - t')\, \|g(\cdot, 0)\|_\infty \le (t - t') \left( \|\gamma\|_\infty\, \psi(r) + \|g(\cdot, 0)\|_\infty \right).$$

This proves the claim. Our strategy is to apply Theorem 2 to show the existence and the uniqueness of a fixed point for the product *A* · *B* in Ω which in turn is a continuous solution for problem (1).

For this purpose, we will claim, first, that *A* and *B* are D-Lipschitzian mappings on Ω. The claim regarding *A* is clear in view of assumption (*ii*); that is, *A* is D-Lipschitzian with D-function Φ such that

$$\Phi(t) = \|\alpha\|_\infty\, \varphi(t), \quad t > 0.$$

We now verify the claim for *B*. Let *x*, *y* ∈ Ω, and let *t* ∈ *J*. By using our assumptions, we obtain

$$\begin{aligned} |(B(x))(t) - (B(y))(t)| &= \left| \frac{1}{f(0, x(0))}\, \mu(x) - \frac{1}{f(0, y(0))}\, \mu(y) + \int_0^t \big( g(s, x(s)) - g(s, y(s)) \big)\, ds \right| \\ &\le \frac{L_\mu}{|f(0, x(0))|}\, \|x - y\| + \frac{|f(0, y(0)) - f(0, x(0))|}{|f(0, x(0))\, f(0, y(0))|} \big( L_\mu r + |\mu(0)| \big) + \int_0^t |\gamma(s)|\, \psi(|x(s) - y(s)|)\, ds \\ &\le \delta L_\mu \|x - y\| + \delta^2 |\alpha(0)| \big( L_\mu r + |\mu(0)| \big)\, \varphi(\|x - y\|) + \|\gamma\|_{L^1}\, \psi(\|x - y\|). \end{aligned}$$

Taking the supremum over *t*, we obtain that *B* is D-lipschitzian with D-function Ψ such that

$$\Psi(t) = \delta L_\mu t + \delta^2 |\alpha(0)| \left( L_\mu r + |\mu(0)| \right) \varphi(t) + \|\gamma\|_{L^1}\, \psi(t), \quad t > 0.$$

On the other hand, bearing in mind assumption (*i*), by using the above discussion we can see that *A*(Ω) and *B*(Ω) are bounded with bounds *MA* and *MB* respectively. Taking into account the estimate *MAMB* ≤ *r*, we obtain that *A* · *B* maps Ω into Ω. Since

$$\begin{aligned} |(\mathcal{B}(\mathbf{x}))(t)| &\leq \left| \frac{1}{f(0, \mathbf{x}(0))} \mu(\mathbf{x}) \right| + \int\_0^t |\mathbf{g}(s, \mathbf{x}(s))| ds \\ &\leq \delta (|\mu(\mathbf{x}) - \mu(0)| + |\mu(0)|) + \int\_0^t |\mathbf{g}(s, \mathbf{x}(s)) - \mathbf{g}(s, 0)| ds + \int\_0^t |\mathbf{g}(s, 0)| ds \\ &\leq \delta (L\_\mu ||\mathbf{x}|| + |\mu(0)|) + \int\_0^t |\gamma(s)| \psi(|\mathbf{x}(s)|) ds + \int\_0^t |\mathbf{g}(s, 0)| ds, \end{aligned}$$

and using the fact that $|\gamma(s)|\, \psi(|x(s)|) \le \|\gamma\|_\infty\, \psi(\|x\|) \le \|\gamma\|_\infty\, \psi(r)$, we have that

$$\|B(x)\| \le \delta \left( L_\mu r + |\mu(0)| \right) + \rho \|\gamma\|_\infty\, \psi(r) + \rho \|g(\cdot, 0)\|_\infty = M_B.$$

On the other hand, $\|A(x)\| \le M_A$ since

$$|(A(x))(t)| = |f(t, x(t))| \le |f(t, x(t)) - f(t, 0)| + |f(t, 0)| \le |\alpha(t)|\, \varphi(|x(t)|) + |f(t, 0)| \le \|\alpha\|_\infty\, \varphi(r) + \|f(\cdot, 0)\|_\infty = M_A.$$

Taking into account that

$$\|(A \cdot B)(x) - (A \cdot B)(y)\| \le \|A(x)\| \, \|B(x) - B(y)\| + \|B(y)\| \, \|A(x) - A(y)\|,$$

we can notice that *A* · *B* is a nonlinear contraction with D-function Θ(·) := *MA*Ψ(·) + *MB*Φ(·), i.e.,

$$\|(A \cdot B)(\mathbf{x}) - (A \cdot B)(y)\| \le \Theta(\|\mathbf{x} - y\|), \, \mathbf{x}, y \in \Omega. \tag{9}$$

Now, applying Theorem 2, we infer that (1) has one and only one solution *x*˜ in Ω, and for each *x*<sup>0</sup> ∈ Ω we have

$$\lim_{n \to \infty} (A \cdot B)^n(x_0) = \tilde{x}. \tag{10}$$

In what follows we will assume that the hypotheses of Theorem 3 are satisfied.

#### *4.2. Numerical Method to Approximate the Solution*

In this subsection we find a numerical approximation of the solution to the nonlinear Equation (1) using a Schauder basis {*en*}*n*≥<sup>1</sup> in *C*(*J*) and the sequence of associated projections {*Pn*}*n*≥1. Let *<sup>p</sup>* <sup>∈</sup> <sup>N</sup> and *np* <sup>∈</sup> <sup>N</sup>. We consider

$$\begin{array}{ccc} S\_p: \mathcal{C}(J) & \longrightarrow & \mathcal{C}(J) \\ \mathtt{x} & \longrightarrow & S\_p(\mathtt{x}) \end{array}$$

defined as

$$S_p(x)(t) = \frac{1}{f(0, x(0))}\, \mu(x) + \int_0^t P_{n_p}(U_0(x))(s)\, ds,$$

where *U*<sup>0</sup> : *C*(*J*) −→ *C*(*J*) is given by *U*0(*x*)(*s*) = *g*(*s*, *x*(*s*)).

Now consider the operator *Tp* : *C*(*J*) −→ *C*(*J*) such that for each *x* ∈ *C*(*J*), *Tp*(*x*) is defined by

$$T_p(x)(t) = A(x)(t)\, S_p(x)(t), \ t \in J, \tag{11}$$

with *A* : *C*(*J*) −→ *C*(*J*), *A*(*x*)(*t*) = *f*(*t*, *x*(*t*)).
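The operators $T_p$ can be realized numerically on a grid. The following sketch uses hypothetical data $f$, $g$, $\mu$ (illustrative choices, small enough that the contraction hypotheses of Theorem 3 plausibly hold), realizes $P_{n_p}$ as piecewise-linear interpolation at dyadic nodes (a Schauder-type projection), and computes the integral in $S_p$ by the trapezoidal rule:

```python
import numpy as np

# Hypothetical data for problem (1) on J = [0, 1]: f(t,x) = 2 + 0.1*sin(x),
# g(t,x) = 0.1*cos(x), mu(x) = 0.5 (constant, so L_mu = 0). All constants
# here are illustrative, not taken from the paper.
rho = 1.0
f  = lambda t, x: 2.0 + 0.1 * np.sin(x)
g  = lambda t, x: 0.1 * np.cos(x)
mu = lambda x: 0.5

t = np.linspace(0.0, rho, 257)          # fine evaluation grid

def P(k, values):
    """Schauder-type projection standing in for P_{n_p}: piecewise-linear
    interpolation of the grid function `values` at the 2^k + 1 dyadic nodes."""
    nodes = np.linspace(0.0, rho, 2**k + 1)
    return np.interp(t, nodes, np.interp(nodes, t, values))

def T(k, x):
    """One step T_p(x) = A(x) * S_p(x); the integral of the projected
    integrand is computed with the trapezoidal rule."""
    integrand = P(k, g(t, x))
    increments = (integrand[1:] + integrand[:-1]) * 0.5 * np.diff(t)
    integral = np.concatenate(([0.0], np.cumsum(increments)))
    S = mu(x) / f(0.0, x[0]) + integral
    return f(t, x) * S

x = np.zeros_like(t)                    # x0 = 0
for p in range(1, 9):
    x = T(min(p + 2, 8), x)             # refine the projection as p grows

# Residual of the fixed point equation x = A(x)*B(x), with B's integral
# evaluated by the same trapezoidal rule
gt = g(t, x)
Bx = mu(x) / f(0.0, x[0]) + np.concatenate(
    ([0.0], np.cumsum((gt[1:] + gt[:-1]) * 0.5 * np.diff(t))))
residual = np.max(np.abs(x - f(t, x) * Bx))
```

Note that the computed iterate satisfies the nonlocal condition exactly at $t = 0$ (here $x(0) = \mu(x) = 0.5$), and the fixed point residual shrinks geometrically with the number of composed steps, as Theorem 4 predicts.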

**Remark 2.** *For $p \ge 1$ and any $n_p \in \mathbb{N}$ used in defining $T_p$, the operator $T_p$ maps* Ω *into* Ω. *Indeed, just keep in mind that for x* ∈ Ω, *we have*

$$\left| T_p(x)(t) \right| = \left| A(x)(t) \left( \frac{1}{f(0, x(0))}\, \mu(x) + \int_0^t P_{n_p}(U_0(x))(s)\, ds \right) \right| \le \left| f(t, x(t)) \right| \left( \delta\, |\mu(x)| + \int_0^t \left| P_{n_p}(U_0(x))(s) \right| ds \right),$$

*and proceeding as in the above subsection and using the fact that $P_{n_p}$ is a bounded linear operator on C*(*J*), *we get*

$$\left| T_p(x)(t) \right| \le M_A \left[ \delta\, |\mu(x)| + \rho \left\| P_{n_p}(U_0(x)) \right\| \right] \le M_A \left[ \delta \big( L_\mu r + |\mu(0)| \big) + \rho \sup_{s \in J} |g(s, x(s))| \right] \le M_A M_B \le r.$$

*In particular, for m* ≥ 1, *the operator Tm* ◦ ... ◦ *T*<sup>1</sup> *maps* Ω *into* Ω.

Our goal is to prove that we can choose $n_1, n_2, \ldots \in \mathbb{N}$ so that the operators $T_1, T_2, \ldots$ defined above can be used to approximate the solution of (1).

**Theorem 4.** *Let x*˜ *be the unique solution to the nonlinear problem (1), and let x*0 ∈ Ω *and ε* > 0. *Then there exist m* ∈ N *and ni* ∈ N *to construct Ti for i* = 1, . . . , *m, in such a way that*

$$\left\|\tilde{x} - T\_m \circ \dots \circ T\_1(x\_0)\right\| \le \varepsilon.$$

**Proof.** Let *<sup>x</sup>*<sup>0</sup> <sup>∈</sup> <sup>Ω</sup> and *<sup>ε</sup>* <sup>&</sup>gt; 0. By using (10), there is *<sup>m</sup>* <sup>∈</sup> <sup>N</sup> such that

$$\left\|(A \cdot B)^{m}(x\_0) - \tilde{x}\right\| \le \varepsilon/2.$$

For that *m*, and for *p* ∈ {1, . . . , *m*}, we define *Up* : *C*(*J*) → *C*(*J*) by

$$\mathcal{U}\_p(x)(s) := g(s, T\_p \circ \dots \circ T\_1(x)(s)), \ s \in J, \ x \in C(J)$$

and *Ap* : *C*(*J*) → *C*(*J*) by

$$A\_p(x)(s) := f\left(s, T\_p \circ \dots \circ T\_1(x)(s)\right), \ s \in J, \ x \in C(J).$$

According to inequality (9), in view of (5) of Lemma 1, it suffices to show that

$$\sum\_{p=1}^{m-1} \Theta^{m-p}\left(\left\|(A \cdot B) \circ T\_{p-1} \circ \dots \circ T\_1(x\_0) - T\_p \circ \dots \circ T\_1(x\_0)\right\|\right) +$$

$$\left\|(A \cdot B) \circ T\_{m-1} \circ \dots \circ T\_1(x\_0) - T\_m \circ \dots \circ T\_1(x\_0)\right\| \le \varepsilon/2.$$

In view of (11), we have

$$\begin{aligned} \left| (A \cdot B) \circ T\_{p-1} \circ \dots \circ T\_1(\mathbf{x}\_0)(t) - T\_p \circ T\_{p-1} \circ \dots \circ T\_1(\mathbf{x}\_0)(t) \right| &= \\ \left| (A \cdot B) \circ T\_{p-1} \circ \dots \circ T\_1(\mathbf{x}\_0)(t) - (A \cdot S\_p) \circ T\_{p-1} \circ \dots \circ T\_1(\mathbf{x}\_0)(t) \right| &= \\ \left| A\_{p-1}(\mathbf{x}\_0)(t) \left( B \circ T\_{p-1} \circ \dots \circ T\_1(\mathbf{x}\_0)(t) - S\_p \circ T\_{p-1} \circ \dots \circ T\_1(\mathbf{x}\_0)(t) \right) \right|. \end{aligned}$$

Taking into account Remark 2, we infer that *Ap*−1(*x*0) is bounded, and consequently we get

$$\left|(A \cdot B) \circ T\_{p-1} \circ \dots \circ T\_1(x\_0)(t) - T\_p \circ T\_{p-1} \circ \dots \circ T\_1(x\_0)(t)\right| =$$

$$\left|A\_{p-1}(x\_0)(t)\left(\int\_0^t g\left(s, T\_{p-1} \circ \dots \circ T\_1(x\_0)(s)\right)ds - \int\_0^t P\_{n\_p}\left(\mathcal{U}\_{p-1}(x\_0)\right)(s)\, ds\right)\right| \le$$

$$\left|A\_{p-1}(x\_0)(t)\right| \int\_0^t \left|\left(P\_{n\_p}(\mathcal{U}\_{p-1}(x\_0)) - \mathcal{U}\_{p-1}(x\_0)\right)(s)\right| ds \le$$

$$\rho \left\|A\_{p-1}(x\_0)\right\| \left\|P\_{n\_p}(\mathcal{U}\_{p-1}(x\_0)) - \mathcal{U}\_{p-1}(x\_0)\right\|.$$

Taking the supremum over *t*, we get

$$\left\|(A \cdot B) \circ T\_{p-1} \circ \dots \circ T\_1(x\_0) - T\_p \circ T\_{p-1} \circ \dots \circ T\_1(x\_0)\right\| \le \rho M\_A \left\|P\_{n\_p}(\mathcal{U}\_{p-1}(x\_0)) - \mathcal{U}\_{p-1}(x\_0)\right\|.$$

Since Θ is a nondecreasing continuous mapping, and taking into account the convergence of the projection operators associated to the Schauder basis, for all 1 ≤ *p* ≤ *m* we obtain

$$\Theta^{m-p}\left(\rho M\_A \left\|P\_{n\_p}(\mathcal{U}\_{p-1}(x\_0)) - \mathcal{U}\_{p-1}(x\_0)\right\|\right) \le \varepsilon/(2m)$$

for *np* sufficiently large. Consequently, we take such *n*1, ... , *nm* ∈ N to define *T*1, *T*2, ..., *Tm*, respectively, and we obtain

$$\sum\_{p=1}^{m-1} \Theta^{m-p}\left(\left\|(A \cdot B) \circ T\_{p-1} \circ \dots \circ T\_1(x\_0) - T\_p \circ \dots \circ T\_1(x\_0)\right\|\right) +$$

$$\left\|(A \cdot B) \circ T\_{m-1} \circ \dots \circ T\_1(x\_0) - T\_m \circ \dots \circ T\_1(x\_0)\right\| \le$$

$$\sum\_{p=1}^{m-1} \Theta^{m-p}\left(\rho M\_A \left\|P\_{n\_p}\left(\mathcal{U}\_{p-1}(x\_0)\right) - \mathcal{U}\_{p-1}(x\_0)\right\|\right) + \rho M\_A \left\|P\_{n\_m}\left(\mathcal{U}\_{m-1}(x\_0)\right) - \mathcal{U}\_{m-1}(x\_0)\right\| \le \varepsilon/2.$$

Now apply Lemma 1 in order to get ‖*x*˜ − *Tm* ◦ ... ◦ *T*1(*x*0)‖ < *ε*.

#### *4.3. Numerical Experiments*

This subsection is devoted to providing some examples and their numerical results to illustrate the theorems of the above sections. We will consider *J* = [0, 1] and the classical Faber-Schauder system in *C*(*J*), where the nodes are the naturally ordered dyadic numbers (see Table 1 in [18] and [28,29] for details). In the following examples, we will denote *x*<sup>∗</sup> = *Tm* ◦ ... ◦ *T*1(*x*0) with *m* = 4 and *n*<sup>1</sup> = ··· = *nm* = *l*, with *l* = 9 or *l* = 33.
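As an illustration of how the projection operators can be realized in practice (our own sketch, not code from the paper): for the Faber-Schauder system, *Pn* amounts to piecewise-linear interpolation at the first *n* dyadic nodes.

```python
import numpy as np

def dyadic_nodes(n):
    """First n naturally ordered dyadic nodes of [0, 1]: 0, 1, 1/2, 1/4, 3/4, 1/8, ..."""
    nodes = [0.0, 1.0]
    level = 1
    while len(nodes) < n:
        denom = 2 ** level
        nodes.extend(k / denom for k in range(1, denom, 2))
        level += 1
    return np.array(sorted(nodes[:n]))

def schauder_projection(x_fun, n, t):
    """P_n(x): the piecewise-linear interpolant of x at the first n dyadic nodes,
    evaluated on the grid t."""
    nodes = dyadic_nodes(n)
    return np.interp(t, nodes, x_fun(nodes))
```

For *n* = 2<sup>*k*</sup> + 1 (such as *l* = 9 or *l* = 33) the first *n* dyadic nodes form the uniform grid of step 2<sup>−*k*</sup>, and *Pn* reproduces exactly every piecewise-linear function with breakpoints on that grid.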

**Example 1.** *Consider the nonlinear differential equation with a nonlocal initial condition*

$$\begin{cases} \quad \dfrac{d}{dt}\left(\dfrac{x(t)}{f(t, x(t))}\right) = a e^{-x(t)}, & t \in J, \\\\ \quad x(0) = b\left(\sup\_{t \in J} |x(t)| + \dfrac{3}{4}\right), \end{cases} \tag{12}$$

*where* 0 < *a* < 1/ log(2) *and f*(*t*, *x*) = *b*/(1 + *ae*<sup>−*b*</sup>*t*). *Let us define the mappings g* : *J* × R → R *and μ* : *C*(*J*) → R *by*

$$g(t, \mathbf{x}) = ae^{-\mathbf{x}}, \ t \in J, \mathbf{x} \in \mathbb{R}$$

*and*

$$\mu(u) = b \left( \sup\_{t \in \mathcal{J}} |u(t)| + 3/4 \right), \ u \in \mathcal{C}(\mathcal{J}).$$

*Let R be small enough such that a*(log(2) + *R*) < 1, *and let x*, *y* ∈ [−*R*, *R*]. *By elementary calculus we can show that the functions f and g satisfy condition (ii), with α*(*t*) = *ϕ*(*t*) = 0, *γ*(*t*) = *ae*<sup>*R*</sup>, *and ψ*(*t*) = 1 − *e*<sup>−*t*</sup>.

*On the other hand, μ is Lipschitzian with Lipschitz constant L<sup>μ</sup>* = *b*, *and*

$$\sup\_{x \in \mathbb{R}, \ |x| \le R} [f(0, x)]^{-1} \le \delta = \frac{1}{b}.$$

*Applying Theorem 3, we obtain that (12) has a unique solution in BR* = {*x* ∈ *C*(*J*); ‖*x*‖ ≤ *R*} *with R* = 3/4 *when a is small enough. In fact, the solution is x*˜(*t*) = *b*. *We apply the numerical method for a* = 0.1, *b* = 1/4 *and the initial guess x*0(*t*) = 1/(4√(*bt* + 1)). *Table 1 collects the obtained results.*
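To make the construction concrete, the following Python sketch (ours, not the authors' code) iterates x ↦ *Tp*(x) for this example on a fine grid, realizing the Faber-Schauder projection as piecewise-linear interpolation at *l* = 9 dyadic nodes. We read the coefficient as *f*(*t*, *x*) = *b*/(1 + *ae*<sup>−*b*</sup>*t*), an assumption consistent with *δ* = 1/*b* and with *x*˜(*t*) = *b* being the exact solution.

```python
import numpy as np

a, b, l, m = 0.1, 0.25, 9, 4
t = np.linspace(0.0, 1.0, 1025)           # fine evaluation grid on J = [0, 1]
nodes = np.linspace(0.0, 1.0, l)          # the first 9 dyadic nodes, in increasing order

f = b / (1.0 + a * np.exp(-b) * t)        # assumed coefficient f(t, x), independent of x

def project(values):
    """Faber-Schauder projection P_9: piecewise-linear interpolation at the dyadic nodes."""
    return np.interp(t, nodes, np.interp(nodes, t, values))

def cumtrapz(values):
    """Cumulative trapezoidal integral over [0, t] on the fine grid."""
    dt = t[1] - t[0]
    return np.concatenate(([0.0], np.cumsum(0.5 * (values[1:] + values[:-1]) * dt)))

def T_op(x):
    mu = b * (np.max(np.abs(x)) + 0.75)   # nonlocal condition mu(x)
    g = project(a * np.exp(-x))           # P_{n_p}(U_0(x))
    return f * (mu / f[0] + cumtrapz(g))  # operator T_p from (11)

x = 1.0 / (4.0 * np.sqrt(b * t + 1.0))    # initial guess x_0
for _ in range(m):
    x = T_op(x)
# x now approximates the exact solution x~(t) = b = 1/4
```

After *m* = 4 sweeps the iterate is already close to the constant solution *x*˜ = 1/4, matching the behavior reported in Table 1.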

**Table 1.** Numerical results for (12) with initial *x*0(*t*) = 1/(4√(*bt* + 1)).


**Example 2.** *Consider the nonlinear differential equation with a nonlocal initial condition*

$$\begin{cases} \quad \dfrac{d}{dt}\left(\dfrac{x(t)}{f(t, x(t))}\right) = a(x(t))^2, & t \in J, \\\\ \quad x(0) = \dfrac{1}{4b} \sup\_{t \in J} |x(t)|^2, \end{cases} \tag{13}$$


*where a*, *b are positive constants such that ab*<sup>2</sup> < 3 *and*

$$f(t, x) = \frac{b(t+1)}{1 + \frac{ab^2}{3}\left(\frac{x^3}{b^3} - 1\right)}.$$

*Let us define the mappings g* : *<sup>J</sup>* <sup>×</sup> <sup>R</sup> <sup>→</sup> <sup>R</sup> *and <sup>μ</sup>* : *<sup>C</sup>*(*J*) <sup>→</sup> <sup>R</sup> *by*

$$g(t, x) = ax^2, \ t \in J, \ x \in \mathbb{R} \quad \text{and} \quad \mu(u) = \frac{1}{4b} \sup\_{t \in J} |u(t)|^2, \ u \in C(J).$$

*Let R* > 0 *be such that* 2*b* ≤ *R and a*(*b*<sup>3</sup> + *R*<sup>3</sup>)/(3*b*) < 1, *and let x*, *y* ∈ [−*R*, *R*]. *By elementary calculus we can show that f and g satisfy condition (ii) with*

$$\alpha(t) = \frac{a(t+1)R^2}{\left(1 - \frac{a}{3b}(R^3 + b^3)\right)^2}, \quad \gamma(t) = 2aR, \quad \varphi(t) = \psi(t) = t.$$

*On the other hand, we have that*

$$|\mu(u) - \mu(v)| \le \frac{R}{2b}\|u - v\|.$$

*Consequently, μ is Lipschitzian with Lipschitz constant L<sup>μ</sup>* = *R*/(2*b*). *It is easy to prove that*

$$\sup\_{x \in \mathbb{R}, \ |x| \le R} [f(0, x)]^{-1} \le \delta = \frac{aR^3}{3b^2} + \frac{1}{b}.$$

*Now, applying Theorem 3, we obtain that (13), for a small enough, has a unique solution in BR with R* = 1/2. *We can check that the solution is x*˜(*t*) = *b*(*t* + 1). *Table 2 shows the numerical results of the proposed method for a* = 0.05, *b* = 1/4 *and x*0(*t*) = *t*/2.
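A quick symbolic sanity check (ours, with SymPy) that *x*˜(*t*) = *b*(*t* + 1) satisfies the differential equation and the nonlocal condition in (13), under our reading *f*(*t*, *x*) = *b*(*t* + 1)/(1 + (*ab*<sup>2</sup>/3)(*x*<sup>3</sup>/*b*<sup>3</sup> − 1)):

```python
import sympy as sp

t, a, b = sp.symbols('t a b', positive=True)
x = b * (t + 1)                                              # candidate solution
f = b * (t + 1) / (1 + (a * b**2 / 3) * (x**3 / b**3 - 1))   # assumed coefficient

# the equation: d/dt ( x / f(t, x) ) = a x^2
assert sp.simplify(sp.diff(x / f, t) - a * x**2) == 0

# nonlocal condition: x(0) = (1/(4b)) sup_{t in [0,1]} |x(t)|^2 = (2b)^2 / (4b) = b
assert sp.simplify(x.subs(t, 0) - (2 * b)**2 / (4 * b)) == 0
```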



#### **5. Nonlinear Integral Equations**

This section deals with the nonlinear integral Equation (2). More precisely, we prove the existence and the uniqueness of a solution to Equation (2) under the hypothesis that the mappings *<sup>f</sup>* : *<sup>J</sup>* <sup>×</sup> <sup>R</sup> <sup>→</sup> <sup>R</sup> and *<sup>K</sup>* : *<sup>J</sup>* <sup>×</sup> *<sup>J</sup>* <sup>×</sup> <sup>R</sup> <sup>→</sup> <sup>R</sup> are such that:

*(i)* The partial mappings *t* → *f*(*t*, *x*) and (*t*,*s*) → *K*(*t*,*s*, *x*) are continuous.

*(ii)* There exist *r* > 0, two continuous functions *γ* : *J* × *J* → R *and α* : *J* → R, and two nondecreasing continuous functions *ϕ*, *ψ* : R<sup>+</sup> −→ R<sup>+</sup> such that

$$|f(t, x) - f(t, y)| \le \alpha(t)\varphi(|x - y|), \ t \in J, \text{ and } x, y \in \mathbb{R} \text{ with } |x|, |y| \le r,$$

and

$$|K(t, s, \mathbf{x}) - K(t, s, y)| \le \gamma(t, s)\psi(|\mathbf{x} - y|), t, s \in J \text{ and } \mathbf{x}, y \in \mathbb{R} \text{ with } |\mathbf{x}|, |y| \le r.$$

Throughout this section, Ω will denote the closed ball *Br* of *C*(*J*), where *r* is defined in the above assumption (*ii*).

#### *5.1. Existence and Uniqueness of Solutions*

To allow the abstract formulation of Equation (2), we define the following operators on *C*(*J*) by

$$\begin{aligned} (Ax)(t) &= f(t, \mathbf{x}(\sigma(t))), \\\\ (B\mathbf{x})(t) &= \left[ q(t) + \int\_0^{\eta(t)} \mathbf{K}(t, s, \mathbf{x}(\tau(s))) ds \right], t \in J. \end{aligned} \tag{14}$$

First, we will establish the following result which shows the existence and uniqueness of a solution.

**Theorem 5.** *Assume that the assumptions* (*i*) *and* (*ii*) *hold. If*

$$M\_A M\_B \le r \ \text{ and } \ M\_A \rho \|\gamma\|\_\infty \psi(t) + M\_B \|\alpha\|\_\infty \varphi(t) < t, \ \forall t > 0,$$

*where*

$$M\_A = \|\alpha\|\_{\infty} \varphi(r) + \|f(\cdot, 0)\|\_{\infty} \ \text{ and } \ M\_B = \|q\|\_{\infty} + \rho\left(\|K(\cdot, \cdot, 0)\|\_{\infty} + \|\gamma\|\_{\infty} \psi(r)\right),$$

*then the nonlinear integral Equation* (2) *has a unique solution in* Ω*.*

**Proof.** By using similar arguments to those in the above section, we can show that *A* and *B* define D-Lipschitzian mappings from Ω into *C*(*J*), with D-functions ‖*α*‖∞ *ϕ* and *ρ*‖*γ*‖∞*ψ*, respectively. It is also easy to see that *A*(Ω) and *B*(Ω) are bounded, with bounds *MA* and *MB*, respectively. Taking into account our assumptions, we deduce that *A* · *B* maps Ω into Ω.

Notice that *A* · *B* defines a nonlinear contraction with D-function

$$\Theta(t) := \rho \|\gamma\|\_{\infty} M\_A\, \psi(t) + \|\alpha\|\_{\infty} M\_B\, \varphi(t), \ t \ge 0, \text{ i.e.,}$$

$$\|(A \cdot B)(x) - (A \cdot B)(y)\| \le \Theta(\|x - y\|), \ x, y \in \Omega. \tag{15}$$

Now, an application of Theorem 2 yields that (2) has one and only one solution *x*˜ in Ω, and for each *x*<sup>0</sup> ∈ Ω we have

$$\lim\_{n \to \infty} (A \cdot B)^n (x\_0) = \tilde{x}.\tag{16}$$

#### *5.2. A Numerical Method to Approximate the Solution*

Now we consider a Schauder basis {*en*}*n*≥<sup>1</sup> in *C*(*J* × *J*) and the sequence of associated projections {*Pn*}*n*≥1. Let *<sup>p</sup>* <sup>∈</sup> <sup>N</sup>, *np* <sup>∈</sup> <sup>N</sup> and consider

$$\begin{array}{rcl} S\_p: C(J) & \longrightarrow & C(J) \\\\ x & \longmapsto & S\_p(x)(t) = q(t) + \int\_0^{\eta(t)} P\_{n\_p}(\mathcal{U}\_0(x))(t, s)\, ds, \end{array}$$

where *U*<sup>0</sup> : *C*(*J*) −→ *C*(*J* × *J*) is defined as *U*0(*x*)(*t*,*s*) = *K*(*t*,*s*, *x*(*τ*(*s*))). Also, we consider the operator *Tp* : *C*(*J*) −→ *C*(*J*), which assigns to each *x* ∈ *C*(*J*) the value *Tp*(*x*) ∈ *C*(*J*) such that

$$T\_p(x)(t) = A(x)(t)\, S\_p(x)(t), \ t \in J,$$

where *A* : *C*(*J*) −→ *C*(*J*) is defined as *A*(*x*)(*t*) = *f*(*t*, *x*(*σ*(*t*))).
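Numerically, *P*<sub>*np*</sub> for the tensor Schauder basis of *C*(*J* × *J*) with square ordering and *np* = *l*<sup>2</sup> can be realized as piecewise-bilinear interpolation at an *l* × *l* grid of dyadic nodes; this identification is our reading of [18], so the Python sketch below is illustrative rather than the authors' implementation.

```python
import numpy as np

def project_2d(u, l, T, S):
    """P_{l^2}(u): piecewise-bilinear interpolant of u at the l x l dyadic grid,
    evaluated on meshgrid arrays T, S (built with indexing='ij')."""
    nodes = np.linspace(0.0, 1.0, l)                 # uniform dyadic grid when l = 2^k + 1
    U = u(*np.meshgrid(nodes, nodes, indexing='ij')) # samples at the grid nodes
    # interpolate along the second variable, one node row at a time
    rows = np.vstack([np.interp(S[0], nodes, U[i]) for i in range(l)])
    # then along the first variable, one output column at a time
    out = np.empty_like(T, dtype=float)
    for j in range(T.shape[1]):
        out[:, j] = np.interp(T[:, 0], nodes, rows[:, j])
    return out
```

Since a bilinear function is reproduced exactly on each cell, *P*<sub>*l*²</sub> leaves, e.g., *u*(*t*, *s*) = *ts* unchanged, while smooth kernels are approximated with the usual piecewise-linear accuracy in each variable.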

**Remark 3.** *Since for p* ≥ 1*,*

$$\begin{aligned} \left|T\_p(x)(t)\right| = \left|A(x)(t)\left(q(t) + \int\_0^{\eta(t)} P\_{n\_p}(\mathcal{U}\_0(x))(t, s)\, ds\right)\right| &\le \\ \left|f(t, x(\sigma(t)))\right|\left(|q(t)| + \int\_0^{\eta(t)} \left|P\_{n\_p}(\mathcal{U}\_0(x))(t, s)\right| ds\right), \end{aligned}$$

*proceeding essentially as in the above section and using the fact that Pnp is a bounded linear operator on C*(*J* × *J*), *we get*

$$\begin{aligned} \left|T\_p(x)(t)\right| \le M\_A\left(|q(t)| + \rho\left\|P\_{n\_p}(\mathcal{U}\_0(x))\right\|\right) &\le \\ M\_A\left(\|q\|\_{\infty} + \rho \sup\_{t, s \in J} \left|K(t, s, x(\tau(s)))\right|\right) &\le M\_A M\_B. \end{aligned}$$

*Accordingly, under the hypotheses of Theorem 5, the mapping Tp maps* Ω *into* Ω. *In particular, for m* ≥ 1, *the operator Tm* ◦ ... ◦ *T*<sup>1</sup> *maps* Ω *into* Ω.

Analogously to the previous section, the following result allows us to justify that it is possible to choose *n*1, *n*2, ... so that *T*1, *T*2, ... can be used to approximate the unique solution to Equation (2).

**Theorem 6.** *Let x*˜ *be the unique solution to the nonlinear Equation (2), and let x*<sup>0</sup> ∈ Ω *and ε* > 0. *Then there exist m* ∈ N *and ni* ∈ N *to construct Ti for i* = 1, . . . , *m, such that*

$$\left\|\tilde{x} - T\_m \circ \dots \circ T\_1(x\_0)\right\| \le \varepsilon.$$

**Proof.** Let *ε* > 0. By using (16), there is *m* ∈ N such that

$$\left\|(A \cdot B)^{m}(x\_0) - \tilde{x}\right\| \le \varepsilon/2.$$

For that *m*, and for *p* ∈ {1, . . . , *m*}, we define *Up* : *C*(*J*) → *C*(*J* × *J*) by

$$\mathcal{U}\_p(x)(t, s) := K(t, s, T\_p \circ \dots \circ T\_1(x)(s)), \ t, s \in J, \ x \in C(J)$$

and *Ap* : *C*(*J*) → *C*(*J*) by

$$A\_p(x)(s) := f\left(s, T\_p \circ \dots \circ T\_1(x)(s)\right), \ s \in J, \ x \in C(J).$$

Proceeding essentially as in Theorem 4, and taking into account (15) together with Remark 3, the desired conclusion can be proved.

#### *5.3. Numerical Experiments*

This section is devoted to giving some numerical examples to illustrate the previous results, using the usual Schauder basis in *C*([0, 1]<sup>2</sup>) with the well-known square ordering (see Table 1 in [18] and [28,29]). In each example, we will denote *x*<sup>∗</sup> = *Tm* ◦ ... ◦ *T*1(*x*0) for *m* = 4 and *n*<sup>1</sup> = ··· = *nm* = *l*<sup>2</sup> with *l* = 9 or *l* = 33.

**Example 3.** *Consider the nonlinear integral equation*

$$x(t) = a(t+1)\left[\frac{b}{a} - \frac{b^2}{3}\left((t+1)^3 - 1\right) + \int\_0^t (x(s))^2 ds\right], \quad t \in J.\tag{17}$$

*Now we consider the mappings q* : *J* → R, *f* : *J* × R → R *and K* : *J* × *J* × R → R *such that q*(*t*) = *b*/*a* − (*b*<sup>2</sup>/3)((*t* + 1)<sup>3</sup> − 1), *f*(*t*, *x*) = *a*(*t* + 1) *and K*(*t*,*s*, *x*) = *x*<sup>2</sup>. *Let R* > 0 *and let x*, *y* ∈ [−*R*, *R*]. *We have that*

$$|K(t, s, x) - K(t, s, y)| \le \gamma(t, s)\psi(|x - y|),$$

*where γ*(*t*,*s*) = 2*R and ψ*(*t*) = *t*. *An application of Theorem 5 yields that (17) has a unique solution in BR with R* = 3. *In fact, the solution is x*˜(*t*) = *b*(*t* + 1).

*Using the proposed method with a* = 0.1, *b* = 0.1 *and x*0(*t*) = *t*<sup>2</sup>, *we obtain Table 3.*
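The closed-form solution is easy to confirm symbolically; the following SymPy snippet (our illustration) checks that *x*˜(*t*) = *b*(*t* + 1) satisfies (17):

```python
import sympy as sp

t, s, a, b = sp.symbols('t s a b', positive=True)
x = lambda u: b * (u + 1)    # candidate solution x~(t) = b(t + 1)

# right-hand side of (17) evaluated at the candidate solution
rhs = a * (t + 1) * (b / a - b**2 / 3 * ((t + 1)**3 - 1)
                     + sp.integrate(x(s)**2, (s, 0, t)))
assert sp.simplify(rhs - x(t)) == 0   # the operator returns x~ itself
```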


**Table 3.** Numerical results for (17).

**Example 4.** *Consider the nonlinear integral equation*

$$\mathbf{x}(t) = \left(ae^{-\mathbf{x}(t)} + b\right) \left[\frac{t}{ae^{-t} + b} + \frac{1}{1 - c} \log(\cos(1 - c)t) + \int\_0^t \tan((1 - c)\mathbf{x}(s))ds\right]. \tag{18}$$

*Similarly to the above, (18) can be written as a fixed point problem with the same notation as in (14). Let R* > 0 *and let x*, *y* ∈ [−*R*, *R*]. *By elementary calculus we can show that the functions f and K satisfy condition (ii), with α*(*t*) = *ae*<sup>*R*</sup>, *γ*(*t*,*s*) = 1 + tan<sup>2</sup>((1 − *c*)*R*), *ϕ*(*t*) = 1 − *e*<sup>−*t*</sup> *and ψ*(*t*) = tan((1 − *c*)*t*).

*Applying Theorem 5, (18), with a small enough and c* = 1 − *a*, *has a unique solution in BR with R* = 3; *in fact, the solution is x*˜(*t*) = *t*. *We obtain the results given in Table 4 for a* = 0.01, *b* = 1, *R* = 3, *and x*0(*t*) = sin(*t*).


**Table 4.** Numerical results for (18) with initial *x*0(*t*) = sin(*t*).

**Example 5.** *Consider the problem (2) with*

$$\begin{array}{rcl}f(t,x) &=& at\left[(b+t)^2 + \dfrac{1}{t+1}\displaystyle\int\_0^t \left(1 - e^{-(t+1)(as+1)}\right)ds\right]^{-1}, \\\\ K(t,s,x) &=& \displaystyle\int\_0^{x+1} e^{-(t+1)u}\, du, \\\\ q(t) &=& (b+t)^2. \end{array} \tag{19}$$

*Let* 0 < *R* < 1 *and let x*, *y* ∈ [−*R*, *R*]. *By elementary calculus, we can show that f and K satisfy the condition* (*ii*), *with α*(*t*) = *ϕ*(*t*) = 0,

$$\psi(t) = \int\_0^{2t} e^{-s}\, ds, \quad \text{and} \quad \gamma(t, s) = \frac{1}{t+1}e^{(t+1)(R-1)}.$$

*Taking a* = 0.1, *b* = 1, *and applying Theorem 5, the problem has a unique solution in BR* = {*x* ∈ *C*([0, 1]); ‖*x*‖ ≤ *R*}; *in fact, the solution is x*˜(*t*) = *at*. *We obtain the results given in Table 5.*

**Table 5.** Numerical results for (19) with initial *x*0(*t*) = (1/2) cos(10*πt*).


#### **6. Conclusions**

In this paper we have presented a numerical method, based on the use of Schauder bases, to solve hybrid nonlinear equations in Banach algebras. To do this, we have used the Boyd-Wong theorem to establish the existence and uniqueness of a fixed point for the product of two nonlinear operators in a Banach algebra (Theorem 2). The method applies to a wide class of nonlinear hybrid equations, as we have illustrated by means of several numerical examples.

The possibility of applying this process or a similar idea to other types of hybrid equations or systems of such equations is open and we hope to discuss this in the near future.

**Author Contributions:** Conceptualization, K.B.A. and M.I.B.; methodology, K.B.A., M.I.B. and A.J.; software, K.B.A. and M.I.B.; validation, K.B.A. and M.I.B.; formal analysis, K.B.A., M.I.B. and A.J.; investigation, K.B.A. and M.I.B.; writing—original draft preparation, K.B.A. and M.I.B.; writing—review and editing, K.B.A. and M.I.B.; supervision, K.B.A., M.I.B. and A.J. All authors have read and agreed to the published version of the manuscript.

**Funding:** The research of Aref Jeribi and Khaled Ben Amara has been partially supported by the University of Sfax (Tunisia). The research of María Isabel Berenguer has been partially supported by Junta de Andalucía (Spain), Project *Convex and numerical analysis*, reference FQM359, and by the *María de Maeztu* Excellence Unit IMAG, reference CEX2020-001105-M, funded by MCIN/AEI/10.13039/ 501100011033/.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** This work was partially carried out during the first author's visit to the Department of Applied Mathematics, University of Granada. The authors wish to thank the anonymous referees for their useful comments. They also acknowledge the financial support of the University of Sfax (Tunisia), the Consejería de Conocimiento, Investigación y Universidad, Junta de Andalucía (Spain) and the *María de Maeztu* Excellence Unit IMAG (Spain).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Finding an Efficient Computational Solution for the Bates Partial Integro-Differential Equation Utilizing the RBF-FD Scheme**

**Gholamreza Farahmand 1, Taher Lotfi 1,\*, Malik Zaka Ullah <sup>2</sup> and Stanford Shateyi 3,\***


**Abstract:** This paper proposes a computational solver via the localized radial basis function finite difference (RBF-FD) scheme and the use of graded meshes for solving the time-dependent Bates partial integro-differential equation (PIDE) arising in computational finance. In order to avoid facing a large system of discretization systems, we employ graded meshes along both of the spatial variables, which results in constructing a set of ordinary differential equations (ODEs) of lower sizes. Moreover, an explicit time integrator is used because it can bypass the need to solve the large discretized linear systems in each time level. The stability of the numerical method is discussed in detail based on the eigenvalues of the system matrix. Finally, numerical tests revealed the accuracy and reliability of the presented solver.

**Keywords:** PIDE; stochastic volatility; semi-discretization; RBF-FD; Bates model

**MSC:** 65M22; 91G60; 91B25

#### **1. Introductory Notes**

The Bates model for option pricing assumes that the underlying asset *St*, the volatility *Vt*, the riskless rate *r* and the Poisson process *Nt* satisfy the following system of stochastic differential equations (SDEs) [1]:

$$\begin{aligned} dS\_t &= \sqrt{V\_t}\, S\_t\, dW\_t^1 + (-\lambda\xi - q + r) S\_t\, dt + (\varrho - 1) S\_t\, dN\_t, \\\\ dV\_t &= \sigma \sqrt{V\_t}\, dW\_t^2 + \kappa (-V\_t + \theta)\, dt, \end{aligned} \tag{1}$$

wherein *W*<sup>2</sup>*<sup>t</sup>* and *W*<sup>1</sup>*<sup>t</sup>* are standard Brownian motions with *dW*<sup>1</sup>*<sup>t</sup> dW*<sup>2</sup>*<sup>t</sup>* = *ρdt*. Here *κ* is the reversion rate of the variance *Vt*, *λ* is the Poisson process intensity, *ξ* is the mean jump, *q* is the dividend, *ϱ* is the jump size, while *θ* is the mean level and *σ* stands for the fixed volatility-of-variance value.

Financial derivatives such as European call or put options play pivotal roles in the risk management of portfolios, and pricing them as efficiently as possible is of importance. On the other hand, since analytical relations for the financial derivative price are available only in limited settings, one needs fast and stable numerical solvers. More concretely, starting from the initial time zero, we must numerically solve a second-order high-dimensional time-dependent partial integro-differential equation (PIDE) or a partial differential equation (PDE) and then compute the present value of the financial derivative [2–4].

**Citation:** Farahmand, G.; Lotfi, T.; Ullah, M.Z.; Shateyi, S. Finding an Efficient Computational Solution for the Bates Partial Integro-Differential Equation Utilizing the RBF-FD Scheme. *Mathematics* **2023**, *11*, 1123. https://doi.org/10.3390/ math11051123

Academic Editors: Maria Isabel Berenguer and Manuel Ruiz Galán

Received: 18 January 2023 Revised: 3 February 2023 Accepted: 21 February 2023 Published: 23 February 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

The Heston model, which could be considered as a generalization of the Black–Scholes model [5], can be extended further if one follows the consideration of Bates [1,6] by imposing the jump component into the modeling. In fact, in the stochastic volatility jump (SVJ) model, the price of an option is computed by solving a time-dependent 2D PIDE [7,8]. It is requisite to recall some related models [9,10] discussing stochastic volatility for PDEs in control theory and AI.

The Bates PIDE based on the price function *u*(*x*, *y*, *τ*) for European options is expressed by the following [11]:

$$\begin{split} \frac{\partial u(x, y, \tau)}{\partial \tau} &= \frac{1}{2} y x^2 \frac{\partial^2 u(x, y, \tau)}{\partial x^2} + \frac{1}{2} \sigma^2 y \frac{\partial^2 u(x, y, \tau)}{\partial y^2} + \rho \sigma y x \frac{\partial^2 u(x, y, \tau)}{\partial x \partial y} \\ &\quad + (-\lambda\xi - q + r) x \frac{\partial u(x, y, \tau)}{\partial x} + \kappa(\theta - y) \frac{\partial u(x, y, \tau)}{\partial y} \\ &\quad - (\lambda + r) u(x, y, \tau) + \lambda \int\_0^\infty u(x\varrho, y, \tau) b(\varrho)\, d\varrho =: \mathcal{A}u(x, y, \tau), \end{split} \tag{2}$$

wherein *T* is the time to maturity and *τ* = *T* − *t* is a time transformation that makes the PIDE formulation forward in time, unlike the original Bates PIDE, which is backward in time. Besides, both the differential and integral parts of the PIDE problem have been encapsulated in the operator A; that is to say, we can also write

$$\mathcal{A}u(x, y, \tau) = \mathcal{A}\_D u(x, y, \tau) + \lambda \mathcal{A}\_I u(x, y, \tau), \tag{3}$$

in which A*<sup>D</sup>* and A*<sup>I</sup>* stand for the differential and integral portions of the PIDE problem. The probability density function is

$$b(\varrho) = \frac{1}{\sqrt{2\pi}\,\hat{\sigma}\varrho} \exp\left(-\frac{(\ln(\varrho) - \gamma)^2}{2\hat{\sigma}^2}\right),$$

where it reads

$$\int\_0^\infty b(\varrho)\, d\varrho = 1.$$

Here, *σ*ˆ and *γ* are the standard deviation and the mean, respectively, which are positive constants. Additionally, we have

$$\xi = \exp\left(\gamma + \frac{1}{2}\hat{\sigma}^2\right) - 1.$$
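Both the normalization of *b*(*ϱ*) and the relation between *ξ* and the mean of *b* can be checked numerically; in the sketch below (ours, not from the paper), the parameter values *γ* = −0.5 and *σ*ˆ = 0.4 are illustrative only:

```python
import math
from scipy.integrate import quad

gamma_, sigma_hat = -0.5, 0.4   # illustrative jump parameters (not from the paper)

def b_density(rho):
    """Log-normal jump density b(rho): ln(rho) has mean gamma_ and std sigma_hat."""
    return math.exp(-(math.log(rho) - gamma_) ** 2 / (2 * sigma_hat ** 2)) \
        / (math.sqrt(2 * math.pi) * sigma_hat * rho)

total, _ = quad(b_density, 0.0, math.inf)                 # normalization: equals 1
mean, _ = quad(lambda r: r * b_density(r), 0.0, math.inf) # E[rho] = exp(gamma + sigma^2/2)
xi = math.exp(gamma_ + 0.5 * sigma_hat ** 2) - 1.0        # mean jump: xi = E[rho] - 1
```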

The so-called payoff, which is the initial condition for the PIDE problem in call-type option pricing, can be expressed as [12]:

$$
u(x, y, 0) = \max(x - K, 0),\tag{4}
$$

wherein *K* is the strike price. The payoff for a put option can be written similarly. The point is that the initial condition depends only on *x* and not on the second independent variable of the PIDE, i.e., *y*.

The side conditions for *x* and *y* could be given as follows [12]:

$$u(x, y, \tau) \simeq 0, \qquad x \to 0,\tag{5}$$

$$u(x, y, \tau) \simeq x\_{\text{max}} \exp\left(-q\tau\right) - K \exp\left(-r\tau\right), \qquad x \to x\_{\text{max}},\tag{6}$$

$$\frac{\partial u(x, y, \tau)}{\partial y} \simeq 0, \qquad y \to y\_{\text{max}}.\tag{7}$$

Note that for the case *y* = 0, the PIDE (2) is degenerate and no boundary condition should be imposed there, while *x*max and *y*max are large constants. Similarly, for the put option the boundary conditions are described by the following:

$$u(x, y, \tau) \simeq K \exp\left(-r\tau\right) - x\_{\text{max}} \exp\left(-q\tau\right), \qquad x \to 0,\tag{8}$$

$$u(x, y, \tau) \simeq 0, \qquad x \to x\_{\text{max}},\tag{9}$$

$$\frac{\partial u(x, y, \tau)}{\partial y} \simeq 0, \qquad y \to y\_{\text{max}}.\tag{10}$$

The Bates PIDE (2) is given on (*x*, *y*, *τ*) ∈ [0, +∞) × [0, +∞) × (0, *T*]. To solve our high-dimensional linear PIDE, we must truncate the unbounded domain while quite delicately keeping negligible the error caused by imposing the boundary conditions. This can be pursued as follows:

$$
\Omega = [0, x\_{\text{max}}] \times [0, y\_{\text{max}}],\tag{11}
$$

wherein *x*max and *y*max are fixed values. They should be taken large enough that the effect of imposing artificial boundary conditions on the truncated domain can be neglected. Some choices are Ω = [0, 4*K*] × [0, 1] or Ω = [0, 3*K*] × [0, 1].

Assume that {*xi*}<sup>*m*</sup><sub>*i*=1</sub> is a mesh of nodes for *x*. The hyperbolic stretching of nodes [13] can be expressed as follows (1 ≤ *i* ≤ *m*):

$$x\_{i} = c \sinh(\beta\_{i}) + K, \tag{12}$$

wherein *c* > 0 stands for a fixed value that controls the density around *x* = *K* and *m* ≥ 3. In implementations, one can employ *c* as in [14], i.e., *c* = *K*/5. This puts a focus around the strike price, at which the initial condition of the PIDE is nonsmooth. Moreover, {*βi*}<sup>*m*</sup><sub>*i*=1</sub> stands for the uniform points given by the following:

$$\beta\_i = (i - 1)\Delta \beta + \sinh^{-1}\left(-\frac{K}{c}\right), \qquad 1 \le i \le m,\tag{13}$$

wherein

$$\Delta\beta = \frac{1}{m-1}\left[\sinh^{-1}\left(\frac{S - K}{c}\right) - \sinh^{-1}\left(\frac{-K}{c}\right)\right],$$

with *S* the right endpoint of the truncated asset interval.

Also, if {*yj*}<sup>*n*</sup><sub>*j*=1</sub> is a partition for *y*, then this stretching strategy can be expressed by the following:

$$
y\_j = \nu \sinh(\varsigma\_j), \qquad 1 \le j \le n, \tag{14}
$$

wherein *ν* > 0 is a fixed value that controls the density around *y* = 0 and *n* ≥ 3. Basically, we use *ν* = *K*/500 [14]. Additionally, the *ς<sub>j</sub>* are equidistant nodes provided by *ς<sub>j</sub>* = (Δ*ς*)(*j* − 1), where for any 1 ≤ *j* ≤ *n* we have

$$\Delta\varsigma = \frac{1}{n-1}\sinh^{-1}\left(\frac{K}{\nu}\right).$$
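A compact sketch of the two graded meshes (12)-(14) together with the call payoff (4); we take *S* = *x*max in Δ*β* and stretch the *y*-grid up to *y*max (so that the last node reaches the top of the truncated domain), both of which are our assumptions:

```python
import numpy as np

def graded_x(m, K, x_max, c=None):
    """Hyperbolic x-mesh (12)-(13), clustered around the strike x = K."""
    c = K / 5 if c is None else c                      # focus parameter, c = K/5 as in [14]
    beta0 = np.arcsinh(-K / c)
    dbeta = (np.arcsinh((x_max - K) / c) - beta0) / (m - 1)   # assumes S = x_max
    beta = beta0 + dbeta * np.arange(m)
    return c * np.sinh(beta) + K

def graded_y(n, K, y_max, nu=None):
    """Hyperbolic y-mesh (14), clustered around y = 0; stretched to y_max (assumption)."""
    nu = K / 500 if nu is None else nu                 # density parameter, nu = K/500 as in [14]
    zeta = np.arcsinh(y_max / nu) / (n - 1) * np.arange(n)
    return nu * np.sinh(zeta)

K = 100.0
x = graded_x(41, K, 4 * K)       # Omega = [0, 4K] x [0, 1]
y = graded_y(21, K, 1.0)
payoff = np.maximum(x - K, 0.0)  # call payoff (4); constant along y
```

The first and last nodes hit the boundaries of Ω exactly, the spacing is smallest near *x* = *K* and *y* = 0, and the payoff kink at the strike therefore falls in the densest part of the grid.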

Numerical solution methods generally rely on discretization to compute approximations. When the computational domain/interval is partitioned more finely, the computed result is closer to the theoretical solution; however, the time required for the calculation increases. For high-dimensional PIDE problems with kink behavior in the initial conditions, special solvers such as high-order sparse numerical methods are sometimes necessary; see, e.g., [15]. Note that finite difference (FD) methods are discussed in [16,17].

In this paper, the main aim is to propose a novel computational method for solving (2) via the radial basis function generated finite difference (RBF-FD) methodology [18]. This is mostly because (2) is a (1+2)D problem with variable coefficients and one cross-derivative term, so computational solvers must be constructed with care. In fact, the motivation of this work lies in the fact that the literature lacks an application of the efficient RBF-FD methodology that results in fast and sparse procedures for solving the Bates PIDE model. Such an application, together with an investigation of the theoretical stability issues, will help price options under stochastic volatility in equity markets.

The RBF-FD formulations in this paper, see, e.g., [19], are written so they can be applied to graded meshes with a clear concentration of nodes on the hot zone. The procedure taken here is to employ tensor grids and then apply a time discretization to the resulting semi-discretized problem. We note that the present work is related to the pioneering works in [20–22]; these works motivate us to propose a new variant of the RBF-FD scheme for the Bates PIDE problem that competes with these efficient methods.

Having reviewed the well-established maps for generating graded meshes along the spatial variables with a clear focus around the hot area, the rest of this article is organized as follows. The RBF-FD formulas associated with the GMQ RBF are given in Section 2. The semi-discretization of the two-dimensional PIDE (2) is described in Section 3. Then, in Section 4, an explicit quadratically convergent method is considered; an explicit time integrator is used because it avoids the need to solve large discretized linear systems at each time level. It is shown that the proposed solver is fast and conditionally stable. Numerical evidence is given in Section 5, which strongly supports the theoretical discussion. Concluding notes are provided in Section 6.

#### **2. RBF-FD: The Weights**

Generally speaking, to compute the weights *α<sub>i</sub>* in the RBF-FD methodology, one considers a linear operator *L* and, at *x* = *x<sub>p</sub>*, for the node locations *x<sub>i</sub>*, writes down the following [20]:

$$
\begin{bmatrix}
\Lambda_1(\underline{x}_1) & \Lambda_1(\underline{x}_2) & \cdots & \Lambda_1(\underline{x}_m) \\
\Lambda_2(\underline{x}_1) & \Lambda_2(\underline{x}_2) & \cdots & \Lambda_2(\underline{x}_m) \\
\vdots & \vdots & & \vdots \\
\Lambda_m(\underline{x}_1) & \Lambda_m(\underline{x}_2) & \cdots & \Lambda_m(\underline{x}_m)
\end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{bmatrix} = \begin{bmatrix}
L\Lambda_1(\underline{x})|_{\underline{x}=\underline{x}_p} \\
L\Lambda_2(\underline{x})|_{\underline{x}=\underline{x}_p} \\
\vdots \\
L\Lambda_m(\underline{x})|_{\underline{x}=\underline{x}_p}
\end{bmatrix},\tag{15}
$$

where the underlined *x* denotes a vector quantity in dimension *d*, and Λ<sub>k</sub>(*x*), 1 ≤ *k* ≤ *m*, is a set of test functions. It is noted that the extension of the RBF-FD methodology to computational finance models was revived by the works of Soleymani and co-authors; see, for instance, [23,24].

Now, we consider the famous generalized multiquadric RBF (GMQ RBF) as follows ([25] Chapter 4):

$$\Lambda(r\_i) = (p^2 + r\_i^2)^l, \qquad i = 1, 2, \dots, m,\tag{16}$$

where *l* is a suitable parameter, *p* is the shape parameter, and *r<sub>i</sub>* = ‖**y** − **y**<sub>*i*</sub>‖ denotes the Euclidean distance.

We now focus on computing the weights for the GMQ RBF (in the 1D case, without loss of generality). So, we consider a graded mesh including three points along the first spatial variable. To find the weights of the RBF-FD methodology, taking *L* as an operator, we write down [26]:

$$L[\Lambda(y\_j)] \simeq \sum\_{i=1}^{\psi} \alpha\_i \Lambda(y\_i), \qquad j = 1, 2, \dots, \psi. \tag{17}$$

This gives *ψ* equations in the *ψ* unknowns *α<sub>i</sub>*. For computing the 1st derivative, three graded nodes (*ψ* = 3) are considered as follows: {*y<sub>i</sub>* − *h*, *y<sub>i</sub>*, *y<sub>i</sub>* + *ωh*}, *ω* > 0, *h* > 0, and (17) becomes the following:

$$g'(y\_i) \simeq \alpha\_{i-1}g(y\_{i-1}) + \alpha\_i g(y\_i) + \alpha\_{i+1}g(y\_{i+1}).\tag{18}$$

We assume that the function *g* is sufficiently smooth. For estimating the 1st derivative of a function, the analytical weighting coefficients associated with this RBF can be given as follows [22]:

$$\alpha\_{i-1} = \frac{\omega \left(p^2(9-6l) - h^2(l-1)(4(l-5)\omega - 10l + 29)\right)}{3p^2h(2l-3)(\omega+1)},\tag{19}$$

$$\alpha\_{i} = \frac{(\omega - 1)\left(p^{2}(6l - 9) + 4h^{2}(l - 5)(l - 1)\omega\right)}{3p^{2}h(2l - 3)\omega},\tag{20}$$

$$\alpha\_{i+1} = \frac{p^2(6l - 9) - h^2(l - 1)\omega(2l(5\omega - 2) - 29\omega + 20)}{3p^2h(2l - 3)\omega(\omega + 1)}.\tag{21}$$
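Alternatively, the weights can be obtained numerically by solving the small linear system (15) directly. The sketch below does this for the first derivative on a graded 3-point stencil with the GMQ RBF (16) and *l* = 1/2 (the classical multiquadric); all concrete values (*y<sub>i</sub>*, *h*, *ω*, *p*) are illustrative choices of ours:

```python
import numpy as np

# Numerical RBF-FD weights from the linear system (15) for g'(y0) on the
# graded stencil {y0 - h, y0, y0 + w*h}, using the GMQ RBF (16) with l = 1/2.
y0, h, w = 1.0, 0.01, 1.3
nodes = np.array([y0 - h, y0, y0 + w * h])
p = 0.5  # shape parameter (illustrative)

def phi(r):                       # GMQ RBF (16) with l = 1/2
    return np.sqrt(p**2 + r**2)

def dphi(xval, center):           # d/dx of phi(|x - center|)
    return (xval - center) / np.sqrt(p**2 + (xval - center)**2)

# A[k, j] = Lambda_k(y_j); rhs[k] = L Lambda_k evaluated at the center y0.
A = phi(nodes[None, :] - nodes[:, None])
rhs = dphi(y0, nodes)
alpha = np.linalg.solve(A, rhs)

# The weights now approximate g'(y0) for smooth g, as in (18).
approx = alpha @ np.sin(nodes)
print(approx, np.cos(y0))  # the two values agree closely
```

The computed weights have the expected finite-difference-like sign pattern (negative on the left node, positive on the right one).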

Similarly, in estimating the function's second derivative, we can obtain

$$g''(y\_i) \simeq \sum\_{j=i-1}^{i+1} \Theta\_j g(y\_j), \tag{22}$$

along with the following weighting coefficients:

$$\Theta\_{i-1} = \frac{2\left(p^2(6l-9) - h^2(l-1)\left(4(l-5)\omega^2 + (34-8l)\omega + 10l - 29\right)\right)}{3p^2h^2(2l-3)(\omega+1)},\tag{23}$$

$$\Theta\_i = \frac{2\left(p^2(9-6l) + h^2(l-1)\left(4(l-5)\omega^2 + (25-2l)\omega + 4(l-5)\right)\right)}{3p^2h^2(2l-3)\omega},\tag{24}$$

$$\Theta\_{i+1} = \frac{2\left(p^2(6l-9) - h^2(l-1)\left(2l(\omega(5\omega -4) + 2) + \omega(34 - 29\omega) - 20\right)\right)}{3p^2h^2(2l-3)\omega(\omega+1)}.\tag{25}$$

Note that the given RBF-FD formulations are valid for the interior nodes; at boundary points, similar formulations must be constructed. We give the derivation for the independent variable *y*; it is similar for the other cases. The formulations (19)–(21) and (23)–(25) apply to rows two through the penultimate row, while for the 1st and last rows of the derivative matrices (30) and (31) these weighting coefficients are not valid on the boundaries, and one-sided estimations must be incorporated. Hence, following [21], on the stencil {*y*<sub>1</sub>, *y*<sub>2</sub>, *y*<sub>3</sub>} we have:

$$g'(y\_1) = g[y\_2, y\_1] - g[y\_3, y\_2] + g[y\_3, y\_1] + \mathcal{O}\left((y\_2 - y\_1)^2\right),\tag{26}$$

and

$$g'(y\_m) = -g[y\_{m-1}, y\_{m-2}] + g[y\_{m-2}, y\_m] + g[y\_m, y\_{m-1}] + \mathcal{O}\left(\left(y\_{m-1} - y\_m\right)^2\right),\tag{27}$$

wherein *g*[*l*, *p*]=(*g*(*l*) − *g*(*p*))/(*l* − *p*). In a similar manner, for the four nodes {{*y*1, *g*(*y*1)}, {*y*2, *g*(*y*2)}, {*y*3, *g*(*y*3)}, {*y*4, *g*(*y*4)}}, we can obtain

$$\begin{split} g''(y\_1) &= \frac{2(\delta y\_{1,2} + \delta y\_{1,3} + \delta y\_{1,4})}{\delta y\_{1,2}\delta y\_{1,3}\delta y\_{1,4}} g(y\_1) + \frac{2(\delta y\_{3,1} + \delta y\_{4,1})}{\delta y\_{1,2}\delta y\_{2,3}\delta y\_{2,4}} g(y\_2) \\ &+ \frac{2(\delta y\_{2,1} + \delta y\_{4,1})}{\delta y\_{1,3}\delta y\_{3,2}\delta y\_{3,4}} g(y\_3) + \frac{2(\delta y\_{2,1} + \delta y\_{3,1})}{\delta y\_{1,4}\delta y\_{4,2}\delta y\_{4,3}} g(y\_4) + \mathcal{O}\left(h^2\right), \end{split} \tag{28}$$

where *δy<sub>l,q</sub>* = *y<sub>l</sub>* − *y<sub>q</sub>*, and *h* is the maximum spatial width over the considered stencil nodes. Similarly, we have:

$$\begin{split} g''(y\_{m}) &= \frac{2(\delta y\_{m-3,m} + \delta y\_{m-2,m} + \delta y\_{m-1,m})}{\delta y\_{m-3,m}\delta y\_{m,m-2}\delta y\_{m,m-1}} g(y\_{m}) + \frac{2(\delta y\_{m-3,m} + \delta y\_{m-2,m})}{\delta y\_{m-3,m-1}\delta y\_{m-1,m-2}\delta y\_{m-1,m}} g(y\_{m-1}) \\ &+ \frac{2(\delta y\_{m-3,m} + \delta y\_{m-1,m})}{\delta y\_{m-3,m-2}\delta y\_{m-2,m-1}\delta y\_{m-2,m}} g(y\_{m-2}) + \frac{2(\delta y\_{m-2,m} + \delta y\_{m-1,m})}{\delta y\_{m-2,m-3}\delta y\_{m-1,m-3}\delta y\_{m,m-3}} g(y\_{m-3}) \\ &+ \mathcal{O}\left(h^{2}\right). \end{split} \tag{29}$$
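The one-sided boundary formulas (26) and (27) are straightforward to implement with divided differences; the sketch below checks them on an illustrative smooth function and illustrative node values:

```python
import math

# Check of the one-sided first-derivative formulas (26) and (27),
# built from divided differences g[l, p] = (g(l) - g(p)) / (l - p).
def dd(g, a, b):                    # divided difference g[a, b]
    return (g(a) - g(b)) / (a - b)

def left_deriv(g, y1, y2, y3):      # formula (26) at the left boundary
    return dd(g, y2, y1) - dd(g, y3, y2) + dd(g, y3, y1)

def right_deriv(g, ym2, ym1, ym):   # formula (27) at the right boundary
    return -dd(g, ym1, ym2) + dd(g, ym2, ym) + dd(g, ym, ym1)

g = math.exp
print(left_deriv(g, 0.0, 0.01, 0.025))   # approximates g'(0) = 1
print(right_deriv(g, 0.975, 0.99, 1.0))  # approximates g'(1) = e
```

Both formulas are exact for quadratics, consistent with the stated second-order error terms.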

#### **3. A New Solution Method**

Let us use the well-known method of lines (MOL) for the semi-discretization of the time-dependent problem [27,28] and convert the PIDE problem into a set of linear ordinary differential equations (ODEs). Hence, the derivative matrices for the 1st and 2nd derivatives used in approximating the PIDE problem (2) via semi-discretization are built on the non-uniform stencils of Section 2 as follows:

$$M\_{\text{x}} = \begin{cases} \alpha\_{i,j} \text{ using (19)} & i-j=1, \\ \alpha\_{i,j} \text{ using (20)} & i-j=0, \\ \alpha\_{i,j} \text{ using (21)} & j-i=1, \\ 0 & \text{otherwise,} \end{cases} \tag{30}$$

and

$$M\_{\text{xx}} = \begin{cases} \Theta\_{i,j} \text{ using (23)} & i-j=1, \\ \Theta\_{i,j} \text{ using (24)} & i-j=0, \\ \Theta\_{i,j} \text{ using (25)} & j-i=1, \\ 0 & \text{otherwise.} \end{cases} \tag{31}$$

Consider the *N* × *N* identity matrix *I* = *I<sub>x</sub>* ⊗ *I<sub>y</sub>*, where *N* = *m* × *n* and *I<sub>x</sub>*, *I<sub>y</sub>* are identity matrices of appropriate sizes. The MOL results in the following coefficient matrix for the (1+2)-dimensional PIDE:

$$\begin{split} B &= \frac{1}{2}\mathcal{Y}\mathcal{X}^2(M\_{\text{xx}} \otimes I\_{n}) + \frac{1}{2}\sigma^2\mathcal{Y}(I\_{m} \otimes M\_{\text{yy}}) + \rho\sigma\mathcal{Y}\mathcal{X}M\_{\text{x,y}} \\ &+ (-\lambda\xi - q + r)\mathcal{X}(M\_{\text{x}} \otimes I\_{n}) + \kappa(\theta I\_{N} - \mathcal{Y})(I\_{m} \otimes M\_{\text{y}}) - (-\lambda + r)I\_{N}, \end{split} \tag{32}$$

where ⊗ stands for the Kronecker product. The square matrices *M<sub>x</sub>*, *M<sub>y</sub>*, *M<sub>xx</sub>*, and *M<sub>yy</sub>* are constructed from the associated weights similarly. Additionally, the sparse diagonal matrices 𝒴 and 𝒳 are written as:

$$\mathcal{Y} = I\_{\text{x}} \otimes \text{diag}(y\_1, y\_2, \dots, y\_n), \tag{33}$$

$$\mathcal{X} = \text{diag}(\mathbf{x}\_1, \mathbf{x}\_2, \dots, \mathbf{x}\_m) \otimes I\_{\mathcal{Y}}.\tag{34}$$

Here, the weights corresponding to the cross-derivative term in the structure of the PIDE (2) can be obtained by employing the Kronecker product as follows:

$$M\_{\text{x,y}} = M\_{\text{x}} \otimes M\_{\text{y}}.\tag{35}$$
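The Kronecker-product assembly of (30)–(35) can be sketched as follows. For brevity, the 1D matrices below use simple finite-difference weights on small uniform meshes instead of the RBF-FD weights; the grid sizes are illustrative:

```python
import numpy as np

# Promoting 1D differentiation matrices to a tensor grid via np.kron,
# as in (30)-(35). M_x acts along x and M_y along y; their Kronecker
# product M_x (x) M_y discretizes the cross derivative d^2/(dx dy).
def diff_matrix(t):
    """First-derivative matrix: central in the interior, one-sided at the ends."""
    n, ht = len(t), t[1] - t[0]
    M = np.zeros((n, n))
    for i in range(1, n - 1):
        M[i, i - 1], M[i, i + 1] = -1 / (2 * ht), 1 / (2 * ht)
    M[0, 0], M[0, 1] = -1 / ht, 1 / ht
    M[-1, -2], M[-1, -1] = -1 / ht, 1 / ht
    return M

x = np.linspace(0.0, 1.0, 6)
y = np.linspace(0.0, 1.0, 5)
Mx, My = diff_matrix(x), diff_matrix(y)

# Row-major vectorization u[i*n + j] = f(x_i, y_j) matches Mx (x) My.
X, Y = np.meshgrid(x, y, indexing="ij")
u = (X * Y).ravel()
mixed = np.kron(Mx, My) @ u   # approximates d^2(xy)/(dx dy) = 1
print(mixed.reshape(len(x), len(y)))
```

Since the test function *f*(*x*, *y*) = *xy* is bilinear, all the difference formulas are exact here and the result is identically one, which makes the ordering convention easy to verify.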

Now it is possible to find the following system of ODEs for pricing (2):

$$u'(\tau) = Bu(\tau).\tag{36}$$

Now, note that we can use the works [22,29] to discretize the integral part as follows. Using a linear interpolation of *u*(*x*, *y*, *τ*) among the adaptive numerical grid nodes, the nonlocal integral in (2) can be computed from

$$\mathcal{A}\_I(u) = \int\_0^\infty u(x\varrho, y, \tau)\, b(\varrho)\, d\varrho. \tag{37}$$

Employing the change of variable *z* = *xϱ*, one can transform (37) into the integral below:

$$\mathcal{A}\_I(u) = \int\_0^\infty u(z, y, \tau)\, b\left(\frac{z}{x}\right) \left(\frac{1}{x}\right) dz. \tag{38}$$

Using a linear interpolation for (38), we can find the following:

$$\mathcal{A}\_I(u) \simeq \sum\_{l=1}^{m-1} Q\_{i,l}, \tag{39}$$

for every node *xi*, *i* = 2, . . . , *m* − 1, wherein

$$Q\_{i,l} = \int\_{x\_{l}}^{x\_{l+1}} \left( \frac{x\_{l+1} - z}{\Delta x\_{l}} u(x\_{l}, y, \tau) + \frac{z - x\_{l}}{\Delta x\_{l}} u(x\_{l+1}, y, \tau) \right) b\left(\frac{z}{x\_{i}}\right) \left(\frac{1}{x\_{i}}\right) dz,\tag{40}$$

wherein Δ*xl* = *xl*<sup>+</sup><sup>1</sup> − *xl* is the graded step size. Hence, we have

$$\begin{split} Q\_{i,l} &= \frac{1}{2\Delta x\_{l}} \Bigg( \exp\left(\gamma + \frac{\vartheta^{2}}{2}\right) \Bigg( -\text{erf}\left(\frac{-\ln\left(\frac{x\_{l}}{x\_{i}}\right) + \gamma + \vartheta^{2}}{\sqrt{2}\vartheta}\right) + \text{erf}\left(\frac{-\ln\left(\frac{x\_{l+1}}{x\_{i}}\right) + \gamma + \vartheta^{2}}{\sqrt{2}\vartheta}\right) \Bigg) x\_{i}\left(u(x\_{l}, y, \tau) - u(x\_{l+1}, y, \tau)\right) \\ &+ \left(\text{erf}\left(\frac{\gamma - \ln\left(\frac{x\_{l}}{x\_{i}}\right)}{\sqrt{2}\vartheta}\right) - \text{erf}\left(\frac{\gamma - \ln\left(\frac{x\_{l+1}}{x\_{i}}\right)}{\sqrt{2}\vartheta}\right)\right) \left(x\_{l+1}u(x\_{l}, y, \tau) - x\_{l}u(x\_{l+1}, y, \tau)\right) \Bigg), \end{split} \tag{41}$$

wherein erf(·) stands for the Gauss error function.
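As a sanity check of the nonlocal term, the sketch below evaluates (37) by simple quadrature for a lognormal jump density with parameters *γ* and *ϑ* (consistent with the erf expressions in (41)) and the test choice *u*(*x*, *y*, *τ*) = *x*, for which the integral at *x<sub>i</sub>* = 1 equals exp(*γ* + *ϑ*²/2); the parameter values and the quadrature settings are illustrative:

```python
import math

# Quadrature check of the nonlocal integral (37) for the lognormal density
# b(rho) = exp(-(ln rho - gamma)^2 / (2 theta^2)) / (sqrt(2 pi) theta rho)
# and u(x, y, tau) = x at x_i = 1, whose exact value is E[rho].
gamma, theta = -0.5, 0.4

def b(rho):
    return math.exp(-(math.log(rho) - gamma)**2 / (2 * theta**2)) \
           / (math.sqrt(2 * math.pi) * theta * rho)

# Composite trapezoidal rule on [1e-4, 10]; the truncated tails are negligible.
N, lo, hi = 20000, 1e-4, 10.0
hstep = (hi - lo) / N
total = 0.5 * (lo * b(lo) + hi * b(hi))
for k in range(1, N):
    rho = lo + k * hstep
    total += rho * b(rho)   # integrand u(x_i * rho) * b(rho) with x_i = 1
total *= hstep

print(total, math.exp(gamma + theta**2 / 2))  # both ~ 0.657
```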

So, (36) turns into

$$u'(\tau) = \bar{B}u(\tau),\tag{42}$$

where *B*¯ is the system matrix after imposing the integral part. Finally, after considering the boundary conditions, a set of ODEs can be attained as follows:

$$u'(\tau) = F(\tau, u(\tau)) = \bar{B}u(\tau) + b,\tag{43}$$

wherein *b* consists of the boundary conditions.

#### **4. The Time-Stepping Solver**

Time-stepping schemes must be used to solve (43). Although some optimal time-stepping solvers have very recently been proposed in the literature [30–32] for solving systems of ODEs, here we focus on a basic but efficient one. Let **u**<sup>ι</sup> denote an approximation to *u*(*τ<sub>ι</sub>*); we can then derive our final (explicit) time-integrator method. Select *k* + 1 uniform temporal nodes with 0 ≤ *ι* ≤ *k*, *τ*<sub>*ι*+1</sub> = *τ<sub>ι</sub>* + *ζ*, *ζ* = *T*/*k* > 0, and **u**<sup>0</sup> given by the initial condition (4). The second-order RK solver (RK2), also known as the explicit midpoint method, is given by [33] (p. 95):

$$\mathbf{u}^{\iota+1} = \mathbf{u}^{\iota} + \psi\_2 + \mathcal{O}(\zeta^3),\tag{44}$$

where

$$\psi\_2 = \zeta F\left(\tau\_\iota + \frac{1}{2}\zeta, \mathbf{u}^\iota + \frac{1}{2}\psi\_1\right), \quad \psi\_1 = \zeta F(\tau\_\iota, \mathbf{u}^\iota). \tag{45}$$

The approach (44) is useful because its explicit procedure leads to lower computational time than many of its competitors among the RK methods. This motivates choosing (44) rather than higher-order members of the RK family, whose computational cost per time level is larger. The search for the best time-stepping solver of optimal order from the RK family for our specific PIDE problem remains an open question that could be addressed in forthcoming works. The most important task now is to investigate under what conditions stability can be maintained.
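A minimal sketch of the integrator (44) and (45) follows; it is applied here to an illustrative scalar test problem standing in for the full system (43):

```python
import math

# Explicit midpoint (RK2) integrator (44)-(45) with uniform step zeta = T/k.
def rk2(F, u0, T, k):
    zeta, u, tau = T / k, u0, 0.0
    for _ in range(k):
        psi1 = zeta * F(tau, u)                            # psi_1 in (45)
        psi2 = zeta * F(tau + 0.5 * zeta, u + 0.5 * psi1)  # psi_2 in (45)
        u, tau = u + psi2, tau + zeta                      # update (44)
    return u

# Scalar test problem u' = -3u + 1, u(0) = 1, with exact solution
# u(tau) = (1 - 1/3) exp(-3 tau) + 1/3.
F = lambda tau, u: -3.0 * u + 1.0
approx = rk2(F, u0=1.0, T=1.0, k=100)
exact = (1.0 - 1.0 / 3.0) * math.exp(-3.0) + 1.0 / 3.0
print(approx, exact)
```

For the linear system (43), the same loop applies with **u** a vector and *F* the affine map of (43).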

**Theorem 1.** *Let us assume that (43) satisfies the Lipschitz condition, then we have a conditional time-stable iteration process using (44) for solving (43).*

**Proof.** Considering the time-stepping method (44) on the set of ODEs (43) gives:

$$\mathbf{u}^{\iota+1} = \left(I + \zeta \bar{B} + \frac{(\zeta \bar{B})^2}{2}\right) \mathbf{u}^{\iota}. \tag{46}$$

The explicit method (46) is clearly time-stable if the matrix eigenvalues of

$$\left(I + \zeta \bar{B} + \frac{(\zeta \bar{B})^2}{2}\right) \tag{47}$$

have modulus less than or equal to one. Viewing (46) as an iterative map, the eigenvalues of this matrix are $1 + \zeta\bar{B}_i + \frac{(\zeta\bar{B}_i)^2}{2}$, where $\bar{B}_i$ are the eigenvalues of the matrix $\bar{B}$. Thus, for *i* = 1, 2, . . . , *N*, the A-stability condition simplifies to

$$\left|1+\zeta \bar{B}\_i + \frac{(\zeta \bar{B}\_i)^2}{2}\right| \le 1. \tag{48}$$

Therefore, our proposed method is time-stable if the time step size *ζ* satisfies (48). The stability function in (48) shows conditionally stable behavior for (44). Using (48) along with *ζ* > 0, we have the following:

$$0 < \zeta \le \frac{2}{\left|\text{Re}(\lambda\_{\text{max}}(\bar{B}))\right|},\tag{49}$$

where Re(·) is the real part and *λ*max(·) is the largest eigenvalue (in the absolute value sense). Note that we also obtain

$$-\xi\_{i} \le \text{Im}(\bar{B}\_{i}) \le \xi\_{i},\tag{50}$$

wherein

$$\xi\_{i} = \left( 2 \left( -\frac{\text{Re}(\bar{B}\_{i})(\zeta \text{Re}(\bar{B}\_{i}) + 2)}{\zeta^{3}} \right)^{1/2} - \frac{\text{Re}(\bar{B}\_{i})(\zeta \text{Re}(\bar{B}\_{i}) + 2)}{\zeta} \right)^{1/2}.\tag{51}$$

These inequalities on the real and the imaginary parts of the eigenvalues will determine the conditional time stability bounds of the proposed solver when pricing (2). This ends the proof.
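For a single real eigenvalue, the bound (49) can be verified directly from the stability function in (48); the eigenvalue and step sizes below are illustrative:

```python
# Check of the stability bound (49) via the RK2 stability function
# R(z) = 1 + z + z^2/2 on the scalar test problem u' = lam * u
# with a real negative eigenvalue lam.
lam = -50.0
zeta_max = 2.0 / abs(lam)        # bound (49) for a real eigenvalue

def amplification(zeta):
    z = zeta * lam
    return abs(1.0 + z + z * z / 2.0)

print(amplification(0.9 * zeta_max))  # <= 1: stable step size
print(amplification(1.1 * zeta_max))  # > 1: unstable step size
```

Stepping just below the bound keeps the amplification factor at most one, while stepping just above it produces a growing, unstable iteration, consistent with Figure 2.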

To discuss the advantages of the proposed approach, we briefly note that our solver has been expressed entirely in matrix notation as in (43), which is a system of linear ODEs. When it is coupled with the ODE solver (44) under the stability condition (49), it solves (2), and the stability relies only on the largest eigenvalue of the system matrix.

#### **5. Numerical Aspects**

The goal here is to solve (2) for at-the-money options, i.e., the value of *u* at *x*<sub>0</sub> = *K*, with *y*<sub>0</sub> = 0.04 and *K* = $100. The compared methods are given below:


Note that all the programs have been written carefully under similar conditions in Mathematica 13 [35,36]. Here, the whole CPU time (for constructing the meshes, the derivative matrices, and the set of ODEs, and for employing the time-stepping method) is given in seconds. We use more nodes along *x* than along *y*, since its computational domain is larger. The criterion given below is used for computing the errors:

$$\varepsilon\_{i,j,\iota} = \left| u\_{\text{approx}}(x\_{i}, y\_{j}, \tau\_{\iota}) - u\_{\text{ref}}(x, y, \tau) \right|,\tag{52}$$

where *u*approx and *u*ref are the approximate and exact solutions. *u*ref is selected from the already well-known literature [14,34].

It is remarked that one efficient way to compute the shape parameter is to calculate it adaptively via the number of discretization points, the numerical domain as well as the structure of the PIDE problem. Hence, here we use (1 ≤ *i* ≤ *m* − 1):

$$p = 4 \max\{\Delta x\_i\},\tag{53}$$

where Δ*x<sub>i</sub>* are the increments along the variable mesh. We can define and use (53) similarly for the other variable. Throughout the tables of this paper, aE−b stands for the scientific notation *a* × 10<sup>−*b*</sup>.

**Example 1** ([14])**.** *Let us investigate the computational results for the call option of (2) using the following settings:*

$$
\rho = -0.9, \ r = 0.025, \ \lambda = 0, \ \sigma = 0.3, \ \kappa = 1.5, \ \theta = 0.04, \ q = 0, \ T = 1. \tag{54}
$$

The reference price, obtained by the FFT approach [14], is 8.894869 at the point (*x*<sub>0</sub>, *y*<sub>0</sub>) = (100, 0.04). The numerical truncated domain is Ω = [0, 3*K*] × [0, 1] and *ψ* = 1.5. Economically speaking, values of the variance larger than one are not significant (for domain truncation). The results in this case are provided in Table 1, which shows the superiority of the proposed solver RBF-FD-PM.

**Example 2** ([34])**.** *Let us investigate the computational results of a European put option for (2) using the following settings:*

$$\gamma = -0.5, \ \hat{\sigma} = 0.4, \ \rho = -0.5, \ \lambda = 0.2, \ \sigma = 0.25, \ r = 0.03, \ T = 0.5, \ \kappa = 2.0, \ \theta = 0.04, \ q = 0. \tag{55}$$

**Table 1.** Numerical results for Example 1.




The reference prices for specific locations of the domain are 11.302917 at (90, 0.04, 0.5), 6.589881 at (100, 0.04, 0.5) and 4.191455 at (110, 0.04, 0.5) using [34]. The convergence results are provided in Tables 2 and 3 and confirm the superiority of the proposed solver with *ψ* = 2 in this paper.

The FDM solver yields only rough, back-of-the-envelope accuracy, since uniform grids for this PIDE problem do not produce highly accurate prices quickly. To check the stability and positivity of the numerical solution of RBF-FD-PM, the numerical solution for Example 2 is plotted in Figure 1, which shows the stable behavior of RBF-FD-PM using *m* = 16, *n* = 8 and *k* = 1001.

**Figure 1.** Numerical solution of Example 2 using the RBF-FD-PM solver when *τ* = 0 on the left and *τ* = 0.5 on the right. Green points show the location of the graded discretization points on the red curve, which is the numerical solution.

The reason for providing Figure 1 is twofold. First, we must show that the numerical solution obtained by RBF-FD-PM for given *m* and *n* is stable and free of oscillations. This is important since the PIDE model has a mixed derivative term, which can lead to oscillations in the numerical solution if a careless numerical method is employed. Second, we must show how the graded meshes (the green points in Figure 1) are located on the numerical solution, obtained by applying an automatic interpolation to the computed solutions.

An inquiry might arise on analyzing the results in Tables 1 and 2. It is not easy to see the advantages of the proposed approach, since the numerical values are given for different values of the parameters *m*, *n*, *N* and *k* + 1. In fact, larger time step sizes (lower *k*) are taken for SM and RBF-FD-PM since their ODE solver, i.e., (44), has a larger stability region; the overall solvers must be compared by fixing an accuracy for the errors and then checking the computational times.


**Table 2.** Numerical results of the different solvers in Example 2.

To also show how instability may ruin the numerical pricing when the stability bound (49) is violated, we provide the numerical results of solving (2) by RBF-FD-PM using *m* = 16 and *n* = 8, but with *k* = 25 uniform discretization nodes along time, in Figure 2. This shows that all the involved solvers have some limitations, but the proposed solver appears more efficient than the others.

**Figure 2.** The unstable numerical solution of Example 2 at *τ* = 0.5 using *k* = 25 nodes.

However, due to the nonsmoothness at the strike price in the initial condition, it might be useful to employ a time-stepping solver that works on meshes graded in time, with more nodes near the initial time, i.e., zero (the solution near the initial time has a weak singularity). One such method is the Rannacher time-marching method [37]. Although such an extension would benefit our solver considerably, we leave it for forthcoming related works.


**Table 3.** Numerical results of the AFFT solver in Example 2.

#### **6. Concluding Remarks**

PIDEs arise in the mathematical modeling of many processes in different fields of engineering and finance. This paper has presented an approximate solution of the linear Bates PIDE, which involves a nonlocal integral term and has a clear application in financial option pricing. The solution method was considered on graded meshes with a clear concentration of the discretization nodes on the financially important area of the problem. Then, an RBF-FD solver using semi-discretization via sparse arrays was constructed for solving the Bates PIDE. The numerical results were furnished and supported the theoretical discussion. These results, provided in Tables 1 and 2, implicitly show that the proposed approach can compete with the most efficient solver (SM) for the same purpose. Additionally, future research can focus on how to obtain RBF-FD weights on stencils having five/six adjacent nodes on graded meshes, or on employing the Rannacher time-marching method, in order to obtain higher accuracies for solving the PIDE problem (2).

**Author Contributions:** Conceptualization, T.L. and G.F.; formal analysis, G.F.; T.L. and S.S.; funding acquisition, M.Z.U. and T.L.; investigation, T.L., M.Z.U. and S.S.; methodology, G.F. and S.S.; supervision, T.L.; validation, M.Z.U. and S.S.; writing—original draft, G.F., M.Z.U. and S.S.; writing—review and editing, T.L., M.Z.U. and S.S. All authors have read and agreed to the published version of the manuscript.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** For the data-availability statement, we state that data sharing is not applicable to this article as no new data were created in this study.

**Acknowledgments:** The third author states that: The Deanship of Scientific Research (DSR) at King Abdulaziz University (KAU), Jeddah, Saudi Arabia, has funded this project, under grant no. (KEP-MSc: 65-130-1443). The authors are very much thankful to three anonymous referees for their suggestions, which helped to improve this paper.

**Conflicts of Interest:** The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **Hybrid Newton–Sperm Swarm Optimization Algorithm for Nonlinear Systems**

**Obadah Said Solaiman 1, Rami Sihwail 2,\*, Hisham Shehadeh 3, Ishak Hashim 1,4 and Kamal Alieyan <sup>2</sup>**


**Abstract:** Several problems have been solved by nonlinear equation systems (NESs), including real-life issues in chemistry and neurophysiology. However, the accuracy of solutions is highly dependent on the efficiency of the algorithm used. In this paper, a Modified Sperm Swarm Optimization Algorithm called MSSO is introduced to solve NESs. MSSO combines Newton's second-order iterative method with the Sperm Swarm Optimization Algorithm (SSO). Through this combination, MSSO's search mechanism is improved, its convergence rate is accelerated, local optima are avoided, and more accurate solutions are provided. The method overcomes several drawbacks of Newton's method, such as the initial points' selection, falling into the trap of local optima, and divergence. In this study, MSSO was evaluated using eight NES benchmarks that are commonly used in the literature, three of which are from real-life applications. Furthermore, MSSO was compared with several well-known optimization algorithms, including the original SSO, Harris Hawk Optimization (HHO), Butterfly Optimization Algorithm (BOA), Ant Lion Optimizer (ALO), Particle Swarm Optimization (PSO), and Equilibrium Optimization (EO). According to the results, MSSO outperformed the compared algorithms across all selected benchmark systems in four aspects: stability, fitness values, best solutions, and convergence speed.

**Keywords:** nonlinear systems; Newton's method; iterative methods; sperm swarm optimization algorithm; optimization algorithm

**MSC:** 65D99; 65H10; 65K10

#### **1. Introduction**

Many issues in the natural and applied sciences are represented by systems of nonlinear equations *F*(*X*) = 0 that require solving, where *F*(*X*) = (*f*1, *f*2,..., *fn* ) such that *fi* is nonlinear for all *i* = 1, 2, ... , *n*. It is well known that determining the precise solution *α* = (*α*1, *α*2,..., *αn*) *<sup>t</sup>* to the nonlinear system *F*(*X*) = 0 is a difficult undertaking, especially when the equation comprises terms made up of logarithmic, exponential, trigonometric, or a mix of any transcendental terms. Thus, finding approximate solutions to this type of problem has emerged as a need. The iterative methods, including Newton's method, are some of the most famous methods for finding approximate solutions to nonlinear equation systems (NESs) [1]. Alternatively, optimization algorithms have been applied in attempts to extract the root solution of nonlinear systems.
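For reference, Newton's method for a system *F*(*X*) = 0 can be sketched in a few lines; the 2 × 2 test system and the starting point below are illustrative choices of ours:

```python
import numpy as np

# Newton's method for a nonlinear system F(X) = 0: at each step, solve
# J(x) s = F(x) for the Newton step s and update x <- x - s.
def newton(F, J, x0, tol=1e-10, max_iter=50):
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(J(x), F(x))
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x

# Example system: x^2 + y^2 - 4 = 0, e^x + y - 1 = 0.
F = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0,
                        np.exp(v[0]) + v[1] - 1.0])
J = lambda v: np.array([[2.0 * v[0], 2.0 * v[1]],
                        [np.exp(v[0]), 1.0]])

root = newton(F, J, x0=[1.0, -1.0])
print(root, F(root))  # residual ~ 0
```

This baseline converges quadratically from a good starting point but may diverge or stall from a poor one, which is precisely the weakness the hybrid schemes discussed below aim to remove.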

In the last ten years, various optimization algorithms have been developed. These methods can be divided into four primary categories: human-based methods, swarm-based methods, physical-based methods, and evolutionary-based methods [2]. Human perception, attitude, or lifestyle influences human-based methods; examples are the "Harmony Search Algorithm (HSA)" [3] and the "Fireworks Algorithm (FA)" [4]. Swarm-based methods mimic the behavior of swarms or animals to reproduce or survive; examples are "Sperm Swarm Optimization (SSO)" [5–8], "Harris Hawks Optimization (HHO)" [9], "The Ant Lion Optimizer (ALO)" [10], and the "Butterfly Optimization Algorithm (BOA)" [11]. Some representative swarm intelligence optimization methods and applications have also been proposed; see, for example, [12]. Physical-based methods are inspired by physical theories and the universe's rules; examples are the "Gravitational Search Algorithm (GSA)" [2] and the "Equilibrium Optimizer (EO)" [13]. Evolutionary-based methods are inspired by the Darwinian theory of evolution; an example is the "Genetic Algorithm (GA)" [14]. Finally, some advanced optimization methods with real-life applications have been proposed; see, for example, [15,16].

**Citation:** Said Solaiman, O.; Sihwail, R.; Shehadeh, H.; Hashim, I.; Alieyan, K. Hybrid Newton–Sperm Swarm Optimization Algorithm for Nonlinear Systems. *Mathematics* **2023**, *11*, 1473. https://doi.org/10.3390/math11061473

Academic Editors: Maria Isabel Berenguer and Manuel Ruiz Galán

Received: 6 February 2023; Revised: 11 March 2023; Accepted: 14 March 2023; Published: 17 March 2023

The primary objectives of these methods are to yield the optimal solution and a higher convergence rate. Meta-heuristic optimization should be based on exploration and exploitation concepts to achieve global optimum solutions. The exploitation concept indicates the ability of a method to converge to the optimal potential solution. In contrast, exploration refers to the power of algorithms to search the entire space of a problem domain. Therefore, the main goal of meta-heuristic methods is to balance the two concepts.

Different meta-heuristic methods have been developed to find solutions for various real-life tasks, and the use of optimization algorithms for solving NESs is significant and critical. The various optimization algorithms used in the solution of nonlinear systems may be summarized as follows:

By improving the performance of optimization algorithms, researchers have been able to target more accurate solutions. For example, Zhou and Li [17] provided a unified solution to nonlinear equations using a modified CSA version. FA was modified by Ariyaratne et al. [18], who made it possible to make the root approximation simultaneously with continuity, differentiation, and initial assumptions. Ren et al. [19] proposed another variation by combining GA with harmonic and symmetric individuals. Chang [20] also revised the GA to estimate better parameters for NESs.

Furthermore, complex systems were handled by Grosan and Abraham [21] by putting them in the form of multi-objective optimization problems. Jaberipour et al. [22] addressed NESs using a modified PSO method; the modification aims to overcome the core PSO's drawbacks, such as delayed convergence and trapping in local minima. Further, NESs have been addressed by Mo and Liu [23], who added the "Conjugate Direction Method (CDM)" to the PSO algorithm. The algorithm's efficiency for solving high-dimensional problems and overcoming local minima was increased by using CDM [24].

Several studies have combined two population-based algorithms (PBAs) to achieve more precise results for nonlinear systems. Such combinations produce hybrid algorithms that inherit the benefits of both techniques while reducing their downsides [25]. Hybrid ABC [26], hybrid ABC and PSO [27], hybrid FA [28], hybrid GA [29], hybrid KHA [30], hybrid PSO [31], and many others [32–36] are examples of hybridized PBAs.

NESs have often been solved using optimization techniques, either a "Single Optimization Algorithm (SOA)" or a hybrid algorithm that combines two optimization procedures. Only a few researchers have attempted to combine an iterative method with an optimization approach. Karr et al. [37] presented a hybrid method combining Newton's method and GA for solving nonlinear testbed problems: after GA identified the most efficient starting solution, Newton's method was applied. To solve systems of nonlinear equations, the hybrid algorithm described by Luo et al. [38] can be utilized; it combines GA, the Powell algorithm, and Newton's method. Luo et al. [39] provided a method for solving NESs by integrating chaos and quasi-Newton techniques. Most of the previous research has concentrated on a specific topic or issue rather than attempting to examine NESs in general.

In a relatively recent study, Sihwail et al. [40] developed a hybrid algorithm known as NHHO, which combines Harris Hawks Optimization and Newton's method to solve arbitrary NESs. Very recently, Sihwail et al. [41] proposed a new algorithm for solving NESs in which Jarratt's iterative approach and the Butterfly Optimization Algorithm were combined to create a new scheme known as JBOA.

A hybrid algorithm can leverage the benefits of one method while overcoming the drawbacks of the other. However, most hybrid methods suffer from premature convergence inherited from the techniques used in the original algorithms [42]. As a result, choosing a dependable combination of algorithms to produce an efficient hybrid algorithm is a crucial step.

One of the more recent swarm-based methods is Sperm Swarm Optimization (SSO), which is based on the motility of a flock of sperm fertilizing an ovum. SSO has various benefits, which can be listed as follows [2,5,6]:


However, most NESs model data science and engineering problems that have more than one solution, so it is difficult to obtain accurate solutions to these problems. Like other optimization algorithms, SSO may fall into a local minimum instead of reaching the optimal solution. As a result, we developed a hybrid approach that incorporates Newton's iterative scheme into the SSO algorithm to mitigate this drawback. It is worth mentioning that Newton's method is the first known iterative scheme for solving nonlinear equations by successive approximation. In Newton's method, the number of correct digits roughly doubles with each step, a property referred to as second-order convergence.

Newton's method is highly dependent on choosing a suitable initial point. To achieve good convergence toward the root, the starting point, as in other iterative approaches, must be close enough to the root. The scheme may converge slowly or diverge if the initial point is poorly chosen. Consequently, Newton's method can only perform a limited local search in some cases.
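This sensitivity can be seen in a few lines of code. The sketch below (in Python rather than the MATLAB used later in the paper; the test function arctan(*x*) and the starting points are our own illustrative choices, not from the paper) runs plain Newton iteration from a close start, which converges, and from a distant start, whose iterates overshoot and grow:

```python
import math

def newton_1d(f, df, x0, tol=1e-12, max_iter=50):
    """Plain Newton iteration: x <- x - f(x)/f'(x)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x

f = math.atan                        # single root at x = 0
df = lambda x: 1.0 / (1.0 + x * x)   # derivative of atan

good = newton_1d(f, df, 0.5)         # close start: converges to the root

# A start too far from the root makes each Newton step overshoot,
# so the magnitude of the iterate grows instead of shrinking:
bad = 2.0
for _ in range(4):
    bad -= f(bad) / df(bad)
```

After the loop, `good` is essentially zero while `|bad|` has grown by several orders of magnitude, which is exactly the divergence behavior described above.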

For the reasons outlined above, a hybrid SSO algorithm (MSSO) has been proposed to solve NESs, where Newton's method is applied to improve the search technique and SSO is used to enhance the selection of initial solutions and make global search more efficient.

It is not the concern of this study to demonstrate that hybridizing SSO with Newton's method performs better than other optimization algorithms such as PSO or genetic algorithms. Rather, this work aims to highlight the benefits of hybridizing an optimization algorithm with an iterative method, which enhances the iterative method's accuracy in solving nonlinear systems and reduces its complexity. It is also able to overcome several drawbacks of Newton's method, such as initial point selection, trapping in local optima, and divergence. Moreover, the hybridization in MSSO is beneficial in finding better roots for the selected NESs. Optimization algorithms alone are unlikely to provide solutions as precise as those of iterative methods such as Newton's method and Jarratt's method.

The proposed modification improves the initial solution distribution in the search space domain. Moreover, compared to the random distribution used by the original technique, Newton's approach improves the computational accuracy of SSO and accelerates its convergence rate. Hence, this research paper aims to improve the accuracy of NES solutions. The following are the main contributions of this paper:


The rest of the paper is organized as follows: Section 2 discusses SSO algorithms and Newton's iterative method. Section 3 describes the proposed MSSO. Section 4 describes the experiments on the benchmark systems and their results. Further discussion of the findings is provided in Section 5. Finally, Section 6 presents the study's conclusion.

#### **2. Background**

#### *2.1. Standard Sperm Swarm Optimization (SSO) Algorithm*

SSO is a recently created swarm-based technique proposed by Shehadeh et al. [2,5,6] that draws inspiration from the behavior of a group of sperm fertilizing an ovum. In the process of fertilization, a single sperm navigates a path against overwhelming odds to merge with an egg (ovum). In general, roughly 130 million sperm are involved in the insemination process, and eventually one of them fertilizes the ovum. Based on Shehadeh et al. [6], the procedure of fertilization can be summarized as follows:

The male reproductive system releases the sperm into the cervix, where the fertilization process starts. In the algorithm, each sperm is accordingly assigned a random location inside the cervix at which to begin the fertilization process, together with a two-component velocity on the Cartesian plane; this is denoted by the sperm's initial velocity value. The procedure of fertilization is demonstrated in Figure 1.

**Figure 1.** The procedure of fertilization [6].

From this point, every sperm in the swarm swims until it reaches the outer surface of the ovum. Scientists have found that the sperm float toward the surface as a flock or swarm, moving from the zone of low temperature to the area of high temperature. Moreover, they observed that the ovum releases a chemical to attract the swarm, a phenomenon known as chemotaxis. According to researchers, these cells also beat their tails at the same frequency while moving through the group. The ovum and its location in the fallopian tubes are illustrated in Figure 1. Based on Shehadeh et al. [6], this velocity is denoted by the personal best velocity of the sperm.

Usually, in a typical scenario, one sperm fertilizes the ovum. Based on that, Shehadeh et al. [2,5–8] call this sperm the winner. The winner and the flock of sperm are illustrated in Figure 2.

**Figure 2.** A flock of sperm and the winner [2].

In this strategy, the best answer is obtained using a group of sperm (potential solutions) floating over the whole search area. Concurrently, the potential solutions consider the most suitable sperm in their path, namely the winner (the sperm closest to the egg). The flock thus takes into account the winner's position as well as the position of its own prior best solution. Every sperm refines its position toward the optimal area by considering its current velocity, its current location, and the locations of both the global best solution (the winner) and its own personal best solution. Mathematically speaking, in SSO the flock updates its positions according to the following formula:

$$
x\_i(t+1) = x\_i(t) + V\_i(t) \tag{1}
$$

where


The sperm's velocity is computed from three components: the initial velocity of the potential solution, the personal best solution, and the global best solution.

First is the initial velocity of the sperm, which takes a random value based on the velocity damping parameter and the pH value of the initial location. This term is calculated by applying the following formula:

$$Initial\\_Velocity = D \cdot V\_i(t) \cdot Log\_{10}(pH\\_Rand\_1) \tag{2}$$

Second is the personal best location of the potential solution, stored in memory and updated whenever a location closer to the optimal value is found. This velocity changes based on the pH and temperature values and may be calculated with the following formula:

$$\text{Current\\_Best\\_Solution} = \text{Log}\_{10}(pH\\_Rand\_2) \cdot \text{Log}\_{10}(Temp\\_Rand\_1) \cdot \left(\mathbf{x}\_{sbest\_i}[\,] - \mathbf{x}\_i[\,] \right) \tag{3}$$

Third, the global best solution is represented by the winner, i.e., the sperm closest to the ovum. The mathematical model of the winning velocity of the potential solution *Vi*(*t*) is given in Equation (4). The flock of sperm and the winner are depicted in Figure 2.

$$\text{Global\\_Best\\_Solution}(\text{the\\_winner}) = \text{Log}\_{10}(pH\\_Rand\_3) \cdot \text{Log}\_{10}(Temp\\_Rand\_2) \cdot \left(\mathbf{x}\_{sgbest\_i}[\,] - \mathbf{x}\_i[\,]\right) \tag{4}$$

The symbols of the prior equations are as follows:


Based on the equations above, the total velocity rule *Vi*(*t*) can be formalized from the initial velocity value, the personal best solution, and the global best solution as follows [2,5–8]:

$$V\_i(t) = D \cdot \text{Log}\_{10}(pH\\_Rand\_1) \cdot V\_i(t) + \text{Log}\_{10}(pH\\_Rand\_2) \cdot \text{Log}\_{10}(Temp\\_Rand\_1) \cdot \left(\mathbf{x}\_{sbest\_i} - \mathbf{x}\_i(t)\right) + \text{Log}\_{10}(pH\\_Rand\_3) \cdot \text{Log}\_{10}(Temp\\_Rand\_2) \cdot \left(\mathbf{x}\_{sgbest\_i} - \mathbf{x}\_i(t)\right) \tag{5}$$

Based on the theory of SSO, both pH and temperature affect the velocity rule. The pH varies with the woman's mood, whether depressed or happy, and with the food consumed; its value falls in a range between seven and fourteen. The temperature, in turn, ranges from 35.1 to 38.5 °C according to blood circulation in the reproductive system [7].
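Equations (1)–(5) can be sketched compactly in code. The following Python fragment is an illustrative sketch only (the function and variable names are ours, the update is shown for a single scalar dimension, and the damping factor *D* is assumed to be supplied by the caller); it draws the pH and temperature values from the ranges just described and combines the three velocity terms:

```python
import math
import random

def sso_velocity(v, x, x_sbest, x_sgbest, D=1.0):
    """Total velocity rule of Eq. (5) for one scalar dimension:
    log10 factors of random pH values in [7, 14] and temperatures
    in [35.1, 38.5] scale the initial, personal, and global terms."""
    pH = [random.uniform(7.0, 14.0) for _ in range(3)]
    temp = [random.uniform(35.1, 38.5) for _ in range(2)]
    initial = D * math.log10(pH[0]) * v                                 # Eq. (2)
    personal = math.log10(pH[1]) * math.log10(temp[0]) * (x_sbest - x)  # Eq. (3)
    social = math.log10(pH[2]) * math.log10(temp[1]) * (x_sgbest - x)   # Eq. (4)
    return initial + personal + social

def sso_position_update(x, v, x_sbest, x_sgbest):
    """Position update of Eq. (1): x(t+1) = x(t) + V(t)."""
    v_new = sso_velocity(v, x, x_sbest, x_sgbest)
    return x + v_new, v_new
```

Note that when a sperm already sits at both its personal and global best locations, the personal and social terms vanish and only the damped initial velocity remains, so the sperm keeps drifting and exploring rather than freezing in place.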

Further, although SSO successfully simulates the metaphor of natural fertilization, it has a few disadvantages in terms of efficiency. When applied to a broad search domain, SSO is prone to getting trapped in local optima [2], which is one of its main drawbacks. Therefore, improvements are needed to enhance the method's exploration process.

#### *2.2. Newton's Method*

An iterative technique is a method for finding an approximate solution by successive approximation. Iterative approaches usually cannot deliver exact answers; accordingly, researchers generally select a tolerance level to decide how close the approximate answer obtained must be to the exact one. Newton's method, also known as the Newton–Raphson method, was proposed by Isaac Newton and is the most widely used iterative method. Newton's scheme is described by

$$X\_{n+1} = X\_n - F'(X\_n)^{-1}\, F(X\_n), \tag{6}$$

where *F*(*X*) is the nonlinear system of equations and *F* (*Xn*) represents the Jacobian of *F*(*X*) evaluated at *Xn*. Newton's second-order convergent method may be easily applied to various nonlinear algebraic problems [1]. As a result, mathematical tools such as Mathematica and MATLAB provide built-in routines for finding the roots of nonlinear equations based on Newton's scheme.
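As a concrete sketch of Equation (6) (in Python/NumPy rather than the MATLAB used in the paper), the iteration below solves the linear system *J*·Δ*x* = −*F* at each step instead of forming the Jacobian inverse explicitly, which is the numerically preferable formulation. It is applied, for illustration, to the two-equation system that appears as Problem 1 in Section 4:

```python
import numpy as np

def newton_system(F, J, x0, tol=1e-12, max_iter=100):
    """Newton's scheme, Eq. (6): X <- X - F'(X)^{-1} F(X),
    implemented by solving J(X) dx = -F(X) at each iteration."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        dx = np.linalg.solve(J(x), -F(x))
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x

# Problem 1 from Section 4: x1 + 1 - e^{x2} = 0,  x1 + cos(x2) - 2 = 0
F = lambda x: np.array([x[0] + 1.0 - np.exp(x[1]),
                        x[0] + np.cos(x[1]) - 2.0])
J = lambda x: np.array([[1.0, -np.exp(x[1])],
                        [1.0, -np.sin(x[1])]])

root = newton_system(F, J, [1.0, 1.0])
```

Starting from (1, 1), the iterates converge quadratically to the root (1.34019..., 0.85023...) quoted for Problem 1.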

Many studies and refinements of Newton's method have been performed to improve approximate solutions to nonlinear problems as well as the order of convergence, which affects the speed at which the desired solution can be reached; see, for example, [43–47] and the references therein.

#### **3. Modified Sperm Swarm Optimization (MSSO)**

SSO is a powerful optimization technique that can address various issues. However, according to the "No Free Lunch (NFL)" theorem [48], no algorithm is suitable for tackling all problems. By using Newton's method, the proposed MSSO outperforms the original SSO in solving nonlinear equation systems. In MSSO, Newton's method is used as a local search to enhance the search process, as shown in Figure 3.

**Figure 3.** The framework of the proposed MSSO.

When Newton's method is applied to the sperm position, at each iteration the fitness value of the potential solution is compared with the fitness of the location calculated by Newton's scheme. The location newly computed by Newton's method is shown in Figure 3 as (*Xn*+1).

In each iteration, MSSO employs both the SSO algorithm and Newton's method. The SSO first determines the best sperm location among the twenty initial locations as an optimal candidate location. This candidate is then fed into Newton's method; in other words, the output of SSO is treated as a temporary solution that serves as the input to Newton's method, which calculates the next candidate solution based on Equation (6). Newton's method is very likely to find a better candidate, since it converges with second order. However, in order to avoid a local optimal solution, the candidate obtained from Newton's method (*Xn*+1) is compared with the solution calculated by SSO (*Xsperm*), and the location with the lower fitness value becomes the potential solution to the problem. The next iteration is then performed from the current most promising solution. Algorithm 1 shows the pseudocode of the suggested MSSO algorithm.
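The selection step at the heart of MSSO can be expressed in a few lines. In this sketch (the naming is ours; `newton_update` stands for one application of Equation (6) and `fitness` for whatever fitness function ranks the candidates), the lower-fitness location survives into the next iteration:

```python
def msso_select(x_sperm, fitness, newton_update):
    """One MSSO selection step (Figure 3): refine the SSO candidate with
    a Newton step, then keep whichever location has the lower fitness."""
    x_newton = newton_update(x_sperm)
    return x_newton if fitness(x_newton) < fitness(x_sperm) else x_sperm
```

With a toy fitness `abs` and a Newton update that halves the distance to the root at zero, `msso_select(4.0, abs, lambda x: x / 2.0)` returns `2.0`; if the Newton step happened to move away from the root, the SSO candidate would be kept instead, which is how MSSO guards against a harmful Newton step.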


The algorithm shows the initialization, exploitation, and exploration phases of the SSO method. The alterations, marked by the red box, are implemented at the end of each iteration: Newton's location is compared with the sperm's optimal location based on their fitness values, and the one with the better fitness value is selected.

#### *Computational Complexity*

The complexity of the new MSSO can be obtained by adding the SSO's complexity and Newton's method's complexity. At first glance, Newton's technique is overly complicated compared to optimization methods: at each iteration, one has to solve an *N* × *N* system of linear equations, which is time-consuming because every Jacobian evaluation requires *N*<sup>2</sup> scalar function evaluations. As a result, combining Newton's approach with any optimization process is likely to make it more complicated.

On the other hand, combining SSO with Newton's technique did not significantly increase processing time. However, the MSSO can overcome Newton's method limitations, including selecting the starting points and divergence difficulties. As a result, the MSSO is superior at solving nonlinear equation systems.

The MSSO's time complexity is influenced by the initial phase, the process of updating the sperm positions, and the use of Newton's scheme. The complexity of the initialization process is O(S), where S is the total number of sperm. The updating process, which includes determining the optimal solution and updating the sperm positions, has a complexity of O(I × S) + O(I × S × M), where I and M represent the maximum number of iterations and the complexity of the tested benchmark equation, respectively. Furthermore, the complexity of Newton's scheme is O(I × T), where T is the computation time. Consequently, the proposed MSSO has an overall computational complexity of O(S × (I + IM + 1) + IT).

Every improvement certainly has a cost. The principal objective of the proposed hybrid algorithm is to enhance the fitness value and the convergence speed of the existing algorithms. However, as a result of adding one algorithm to another, the complexity and the time cost of the hybrid algorithm are increased compared to the original algorithm. Eventually, a tradeoff between the merits and disadvantages should be considered while using any algorithm.

#### **4. Numerical Tests**

Eight nonlinear systems of various orders were selected to illustrate the efficiency and capability of the new hybrid MSSO scheme. Comparisons between MSSO and six other well-known optimization algorithms were performed: the original SSO [2], HHO [9], PSO [49], ALO [10], BOA [11], and EO [13]. For consistency, all systems used in the comparisons are arbitrary problems that are common in the literature; see, for instance, [19,21,40,44,50–53].

The comparison between the optimization algorithms is based on the fitness value each algorithm attains on each benchmark. A solution with a lower fitness value is more accurate than one with a higher fitness value; hence, the most effective optimization algorithm is the one attaining the lowest fitness value. The fitness function used in the comparison is the Euclidean norm, also called the 2-norm, which measures the distance of the residual vector from the origin and is expressed as follows:

$$Fitness = \|F(\mathbf{x})\|\_2 = \sqrt{f\_1^2 + f\_2^2 + \dots + f\_n^2} \tag{7}$$
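In code, this fitness function is a one-liner; the sketch below (plain Python, names ours) evaluates a candidate by the Euclidean norm of its residuals:

```python
import math

def fitness(residuals):
    """Euclidean norm (2-norm) of the residual vector F(x), Eq. (7)."""
    return math.sqrt(sum(f * f for f in residuals))
```

For instance, `fitness([3.0, 4.0])` evaluates to `5.0`, while an exact root of the system gives a fitness of zero, which is the value the comparisons below look for.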

Identical settings have been used in all benchmarks to guarantee a fair comparison of the selected algorithms. The parameter values of all optimization algorithms were fine-tuned to improve their performance. Every optimization method was run 30 times, and its best solution was recorded. The number of search agents (population size) was set to 20 and the maximum number of iterations to 50. Furthermore, if a particular benchmark has more than one solution, the solution with the lowest fitness value is chosen. Finally, for lack of space, answers are truncated to 11 decimal places.

Calculations were conducted using MATLAB version R2020a with the default variable precision of 16 digits, on an Intel Core i5 processor running at 2.2 GHz with 8 GB of RAM under the Microsoft Windows 8 operating system.

**Problem 1:** Let us consider the first problem to be the following nonlinear system of two equations:

$$F\_1(X) = \begin{cases} \ x\_1 + 1 - \mathbf{e}^{x\_2} = 0, \\\ x\_1 + \cos(x\_2) - 2 = 0, \end{cases}$$

For this system, the precise solution is given by α = (1.3401918575555883401 . . . , 0.8502329164169513268 . . .)*<sup>t</sup>*. After running the algorithms 30 times, MSSO significantly surpassed all other optimization algorithms in the comparison. Table 1 shows that the proposed hybrid MSSO algorithm attained the best solution, with a fitness value equal to zero; that is, the solution obtained by MSSO is an exact solution of the given system.

**Table 1.** Comparison of different optimization algorithms for Problem 1.


**Problem 2:** The second benchmark is the system of two nonlinear equations given by:

$$F\_2(X) = \begin{cases} 2 - e^{x\_1} + \tan^{-1} x\_2 = 0, \\ \tan^{-1} \left( x\_1^2 + x\_2^2 - 5 \right) = 0, \end{cases}$$

Here, the exact zero of the system in this problem is given by *α* = (1.1290650391602 ... , 1.9300808629035 ...)*<sup>t</sup>*. As shown in Table 2, MSSO achieved the exact solution of this system with a fitness value of zero. It also outperformed all other algorithms by a substantial margin, especially in comparison with SSO, BOA, and HHO.


**Table 2.** Comparison of different optimization algorithms for Problem 2.

**Problem 3:** The third system of nonlinear equations is given by:

$$F\_3(X) = \begin{cases} \cos(x\_2) - \sin(x\_1) = 0, \\\ x\_3^{x\_1} - \frac{1}{x\_2} = 0, \\\ e^{x\_1} - x\_3^2 = 0. \end{cases}$$

This NES of three equations has the exact solution α = (0.9095694945200448838 . . . , 0.6612268322748517354 ..., 1.575834143906999036 . . .)*<sup>t</sup>*. According to Table 3, the proposed MSSO achieved a zero fitness value. The superiority of MSSO is evident in this example, with a significant difference between MSSO and all the other compared optimization algorithms.

**Table 3.** Comparison of different optimization algorithms for Problem 3.


**Problem 4:** Consider the following system of three nonlinear equations:

$$F\_4(X) = \begin{cases} x\_2 + x\_3 - e^{-x\_1} = 0, \\\ x\_1 + x\_3 - e^{-x\_2} = 0, \\\ x\_1 + x\_2 - e^{-x\_3} = 0. \end{cases}$$

The precise solution of the nonlinear system in this problem is *α* = (0.351733711249 ... , 0.351733711249 ... , 0.351733711249 ...)*<sup>t</sup>*. The best solutions achieved by the compared schemes for the given system are illustrated in Table 4. The proposed MSSO found a precise answer, with zero as its fitness value. ALO recorded the second-best solution with a fitness value of 2.27 × 10<sup>−6</sup>, while the rest of the compared algorithms were far from the exact answer. Again, the proposed MSSO has proved that it has an efficient local search mechanism; hence, it can achieve more accurate solutions for nonlinear systems.

**Table 4.** Comparison of different optimization algorithms for Problem 4.


**Problem 5:** The next benchmark is the following system of two nonlinear equations:

$$F\_5(X) = \begin{cases} \ x\_1 + e^{x\_2} - \cos(x\_2) = 0, \\ 3x\_1 - \sin(x\_1) - x\_2 = 0, \end{cases}$$

This nonlinear system has the trivial solution *α* = (0, 0)*<sup>t</sup>*. Table 5 illustrates the comparison between the different optimization algorithms for the given system. Compared with the other algorithms, the original SSO and HHO achieved excellent results, with fitness values of 5.36 × 10<sup>−15</sup> and 6.92 × 10<sup>−14</sup>, respectively. However, MSSO outperformed both of them and delivered the exact solution of the given system.

**Table 5.** Comparison of different optimization algorithms for Problem 5.


**Problem 6:** The sixth system considered for the comparison is an interval arithmetic benchmark [53] given by the following system of ten equations:


In this benchmark, MSSO again proved its efficiency. Table 6 clearly shows the significant differences between MSSO and the other compared algorithms. MSSO achieved the best solution with a fitness value of 5.21 × 10<sup>−17</sup>, while all the other algorithms produced solutions far from the exact answer. Comparing the fitness values of the hybrid MSSO and the original SSO shows how substantially the modification of the original SSO's local search mechanism improved the hybrid MSSO.

**Table 6.** Comparison of different optimization algorithms for Problem 6.


**Problem 7:** Consider the combustion chemistry problem for a temperature of 3000 °C [21], which can be described by the following nonlinear system of equations:

$$F\_7(X) = \begin{cases} x\_2 + 2x\_6 + x\_9 + 2x\_{10} - 10^{-5} = 0, \\ x\_3 + x\_8 - 3 \times 10^{-5} = 0, \\ x\_1 + x\_3 + 2x\_5 + 2x\_8 + x\_9 + x\_{10} - 5 \times 10^{-5} = 0, \\ x\_4 + 2x\_7 - 10^{-5} = 0, \\ 0.5140437 \times 10^{-7} x\_5 - x\_1^2 = 0, \\ 0.1066932 \times 10^{-6} x\_6 - 2x\_2^2 = 0, \\ 0.7816278 \times 10^{-15} x\_7 - x\_4^2 = 0, \\ 0.1496236 \times 10^{-6} x\_8 - x\_1 x\_3 = 0, \\ 0.6194411 \times 10^{-7} x\_9 - x\_1 x\_2 = 0, \\ 0.2089296 \times 10^{-14} x\_{10} - x\_1 x\_2^2 = 0, \\ -10 \le x\_1, x\_2, \dots, x\_{10} \le 10. \end{cases}$$

In Table 7, the comparison for this system shows that MSSO has the lowest fitness value of 7.09 × 10<sup>−21</sup>, while PSO and EO have fitness values of 2.85 × 10<sup>−9</sup> and 3.45 × 10<sup>−8</sup>, respectively.



**Problem 8:** The last benchmark is an application from neurophysiology [52], described by the nonlinear system of six equations:

$$F\_8(X) = \begin{cases} x\_1^2 + x\_2^2 - 1 = 0\\ x\_2^2 + x\_4^2 - 1 = 0\\ x\_5 x\_3^3 + x\_6 x\_4^3 = 0\\ x\_5 x\_1^3 + x\_6 x\_2^3 = 0\\ x\_5 x\_1 x\_2^2 + x\_6 x\_2 x\_4^2 = 0,\\ x\_5 x\_3 x\_1^2 + x\_6 x\_4 x\_2^2 = 0 \end{cases} \quad -10 \le x\_1, x\_2, \dots, x\_6 \le 10.$$

There is more than one exact solution to this system. Table 8 shows that the proposed MSSO algorithm achieved the most accurate solution with a fitness value of 1.18 × 10<sup>−24</sup>, and the PSO algorithm achieved second place with a fitness value of 5.26 × 10<sup>−7</sup>. In contrast, the rest of the algorithms recorded answers that differ significantly from the exact solution. Further, the NESs in Problems 6–8 demonstrate the flexibility of the proposed hybrid MSSO, as it remains efficient even over the wide interval [−10, 10].

**Table 8.** Comparison of different optimization algorithms for Problem 8.


The comparison results on all benchmarks confirm the hypothesis stated in the first section: the hybridization of two algorithms inherits the merits of both (here, SSO and Newton's method). This can be seen in the comparison between MSSO and the original SSO, where MSSO outperformed the original SSO on all selected benchmarks. The reason for this remarkable performance is the use of Newton's method as a local search, which strengthens the hybrid's ability to avoid local optima in Problems 1–5 (where MSSO obtained the exact solution) and significantly improves the obtained fitness values in Problems 6–8. The comparisons indicate that, unlike the majority of the other algorithms, the proposed hybrid MSSO avoided being trapped in local optima in all problems.

#### **5. Results and Analysis**

#### *5.1. Stability and Consistency of MSSO*

Table 9 shows the average fitness values of MSSO and the other algorithms compared in the previous benchmarks when each problem is run 30 times, illustrating the sustained efficiency and power of the proposed MSSO algorithm.



According to Table 9, MSSO has surpassed all other compared algorithms. The average fitness values of MSSO and the original SSO show a significant difference in all benchmarks. Consequently, this improvement confirms the flexibility of the hybrid MSSO in seeking the best solution without being entrapped by local optima. Furthermore, as shown in Table 9, MSSO outperforms all of the other compared algorithms, particularly for problems 2, 4, 6, and 8.

Additionally, an algorithm is considered consistent and stable if it produces similar results over the 30 runs; the average of the solutions must therefore be the same as, or very close to, the best solution. This study demonstrates that MSSO maintained its consistency on all selected problems. Moreover, the average standard deviation achieved by each algorithm is shown in Table 10, where smaller values of standard deviation indicate greater stability. The hybrid MSSO showed stable results on most of the selected problems.

**Table 10.** The average standard deviation for all problems.


Furthermore, the significance of the MSSO improvements was examined using the statistical *t*-test in Table 11. Improvements were considered significant if the *p*-value was less than 0.05; otherwise, they were not. The results show that all algorithms have *p*-values lower than 0.05 on all tested problems, except for HHO, which has a single value above 0.05 on Problem 5. It is evident from this that MSSO is more reliable than the competing algorithms. Further, MSSO's solutions are significantly more accurate than those of the other algorithms, since the majority of its *p*-values are close to 0. The results demonstrate that MSSO is a robust search method capable of finding precise solutions. Moreover, it is able to avoid local optimal traps and premature convergence.

**Table 11.** *p*-values for the fitness based on the *t*-test.


Moreover, one of the criteria that is considered when comparing algorithms is their speed of convergence. Figure 4 indicates that MSSO enhanced the convergence speed of the original SSO in all problems. It also shows that MSSO achieves the best solution with much fewer iterations than the other algorithms. Consequently, the superiority of the proposed MSSO is confirmed.

It is known that any optimization method has constraints that slow it down in finding the optimum solution or, in some cases, prevent it from finding the solution at all. HHO, for instance, may settle in local optima instead of the optimal answer. SSO quickly falls into local minima when solving systems of nonlinear equations [2]. PSO has drawbacks such as a lack of population diversity and an inability to balance local and global search [54]. The EO method, on the other hand, does not function well in large-scale situations [55].

The convergence speed of the novel hybrid algorithm MSSO is attributed to combining Newton's iterative method, used as a local search, with the SSO algorithm. On the one hand, MSSO benefits from Newton's method, which was originally developed to find solutions of nonlinear equation systems. On the other hand, SSO ensures appropriate initial solutions for Newton's method by employing search agents. Furthermore, Newton's method features second-order convergence, which implies that the scheme approximately doubles the number of correct significant digits in each iteration [1]. Thus, the hybridization of Newton's method and the SSO algorithm inherits the merits of both sides, producing an efficient algorithm that overcomes their main disadvantages [56,57].

It is worth noting that the default variable precision in MATLAB, 16 digits, was used for all calculations in this study. This precision is time-saving compared with using more significant digits; however, in some situations it may affect the outcome. In MATLAB, the function "vpa" may be used to increase variable precision. Increasing the number of digits can thus improve the accuracy of the findings, but it is a time-consuming operation. More details and examples of this case can be found in [40]. In this research, the use of "vpa" increased the accuracy of the results in Problems 5, 7, and 8.

**Figure 4.** The convergence speed for the eight problems based on an average of 30 runs.

#### *5.2. Comparison between MSSO and Newton's Method*

The effectiveness of MSSO is demonstrated by the correctness of the generated solutions and by its ability to avoid local optima compared to Newton's method. Accordingly, both strategies were examined on Problems 1–4. Tables 12–15 compare the fitness values achieved by MSSO and Newton's method using three randomly chosen starting points; both strategies were examined at iterations 5, 7, and 10. In addition, variables with 1200-digit precision were used in all selected problems to clarify the solutions' accuracy. As noted earlier, increasing the number of digits may further improve the findings.


**Table 12.** A comparison of Newton's method and MSSO for Problem 1.


**Table 14.** A comparison of Newton's method and MSSO for Problem 3.


**Table 15.** A comparison of Newton's method and MSSO for Problem 4.


MSSO surpassed Newton's approach on all of the chosen problems, as shown in Tables 12–15. Newton's method, like all other iterative methods, is extremely sensitive to the starting value *x*0. Choosing an incorrect starting point can slow down the convergence of Newton's method (see Tables 12 and 14) or cause it to diverge (see Table 13). Further, an improper selection of the initial point can produce a singular Jacobian in Newton's method, in which case the Jacobian's inverse does not exist and Newton's approach cannot be utilized (refer to Tables 14 and 15).

Tables 12–15 show a considerable improvement in the MSSO outcomes compared with Newton's method. The main weakness of Newton's method, its dependence on the starting point, is addressed by relying on 20 search agents in the early stages of the hybrid MSSO rather than on a single point. Unlike Newton's method, MSSO selects several random starting points, called search agents, examines each agent's fitness value, and then chooses the agent with the lowest fitness as the initial guess. Selecting the starting point in this manner is crucial for improving the accuracy of the solution.
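The initial-guess selection just described can be sketched as follows. This is a simplified sketch, not the authors' implementation: the illustrative system and the search interval [−2, 2]<sup>2</sup> are assumptions, and the SSO update phase that improves the agents before the selection is omitted.

```python
# Sketch of MSSO's initial-guess selection: rank 20 random search agents by
# the Euclidean-norm fitness ||F(x)|| and hand the best one to Newton's method.
import numpy as np

rng = np.random.default_rng(0)

def F(x):
    # Illustrative 2x2 system (not one of the paper's benchmarks).
    return np.array([x[0]**2 + x[1]**2 - 4.0,
                     np.exp(x[0]) + x[1] - 1.0])

def fitness(x):
    # Euclidean norm of the residual, the fitness function used in the paper.
    return float(np.linalg.norm(F(x)))

# Step 1: generate 20 random search agents in the search interval [-2, 2]^2.
agents = rng.uniform(-2.0, 2.0, size=(20, 2))

# Step 2: choose the agent with the lowest fitness as Newton's initial guess.
# (In the full MSSO, the SSO update rules first improve the agents; that
#  phase is omitted in this sketch.)
x0 = min(agents, key=fitness)
print(x0, fitness(x0))
```

Starting Newton's method from this best-of-20 point, rather than from a single arbitrary guess, is the mechanism the paper credits for avoiding the slow-convergence and divergence cases in Tables 12–15.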

The previous experiments show that the proposed MSSO outperforms Newton's method on the selected problems. Whereas Newton's method normally starts from a single initial point, MSSO starts with 20 search agents. The superiority of MSSO is demonstrated by the accuracy of its solutions. In addition, MSSO needs less time to reach the convergence criterion: restarting Newton's method from 20 random initial points would clearly require more time. This is another reason why hybridizing SSO and Newton's method is better than relying on either of them alone.

Moreover, the speed of convergence towards the best solution is remarkable. MSSO can choose the best initial point within a few iterations and move quickly toward the global optimum. Figure 5 shows the convergence speed for Problems 1–4 over the first five iterations, averaged over 30 runs.

**Figure 5.** The convergence speed of problems 1–4 for five iterations based on an average of 30 runs.

To clarify the significant improvements of MSSO over Newton's iterative method, a comparison between the two was performed for Problems 1, 2, 3, and 4. Table 16 shows the CPU time needed for Newton's method and MSSO to attain the stopping criterion (*ε* ≤ 1 × 10<sup>−15</sup>).

**Table 16.** Comparing Newton's method and MSSO in terms of average time (in seconds).


Based on the results, the hybridized MSSO yields a clear enhancement over Newton's method. The CPU time needed to satisfy the selected stopping criterion is much lower for MSSO than for Newton's method. Even though Newton's method is part of the proposed MSSO, MSSO performs better because the SSO mechanism selects the best initial guess for Newton's method, which acts as a local search inside the hybrid algorithm.

It is well known that choosing a starting point far from the root of the system can harm the convergence of Newton's method, and since Newton's method is embedded in the MSSO, this could affect MSSO's convergence as well. However, the MSSO randomly selects 20 agents as initial points within a specific interval, so it has many more starting choices than Newton's method. Iterative methods can therefore benefit from hybridization in the selection of initial points, because optimization algorithms maintain many candidate starting points; in turn, optimization algorithms can benefit from the fast and accurate convergence of iterative methods.

#### **6. Conclusions**

In this work, a hybrid method known as MSSO was introduced for solving systems of nonlinear equations, using Newton's iterative method as a local search for the Sperm Swarm Optimization (SSO) algorithm. The main goal of the MSSO is to overcome the problem of Newton's method's initial guess; doing so results in a better selection of initial points and enables the method to be applied to a wider variety of real-world applications. Moreover, Newton's scheme, used in MSSO as a local search, improved the accuracy of the tested solutions. In addition, the MSSO's convergence speed is substantially improved.

Eight nonlinear systems of varying orders were utilized to illustrate the effectiveness of the proposed MSSO. The novel MSSO was also compared to six well-known optimization methods: the original SSO, BOA, ALO, EO, HHO, and PSO. The Euclidean norm was utilized as the fitness function in all benchmarks. According to the results, MSSO outperforms all of the compared algorithms in four metrics: fitness value, solution accuracy, stability, and speed of convergence. The consistency of the MSSO was confirmed by running the methods thirty times, and the standard deviation showed that MSSO was the most stable of the optimization algorithms.

Additionally, we compared the performance of MSSO and Newton's method on four problems from the benchmarks. Across all four problems, MSSO outperformed Newton's method and overcame some of Newton's scheme's limitations, such as divergence and the selection of initial guesses.

Future work can address some related issues, such as how the suggested method performs on common optimization benchmarks. Future research will also focus on solving nonlinear equations arising from real-world applications, such as Burgers' equation, and on the efficiency of the proposed algorithm when solving large systems. Finally, using a derivative-free iterative method instead of Newton's method would reduce the computational cost of evaluating Newton's scheme in each iteration; this is an interesting topic for future study.

**Author Contributions:** Conceptualization, O.S.S. and R.S.; methodology, O.S.S., R.S. and H.S.; validation, K.A. and I.H.; formal analysis, O.S.S. and R.S.; investigation, O.S.S. and R.S.; resources, O.S.S., R.S. and H.S.; data curation, O.S.S. and R.S.; writing—original draft preparation, O.S.S., R.S. and H.S.; writing—review and editing, K.A. and I.H.; visualization, R.S.; supervision, O.S.S., R.S. and I.H.; project administration, O.S.S., R.S. and I.H.; funding acquisition, I.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research and the APC were funded by Universiti Kebangsaan Malaysia, grant number DIP-2021-018.

**Data Availability Statement:** The data that support the findings of this study are available on request from the corresponding author, Sihwail, R.

**Conflicts of Interest:** The authors declare that there is no conflict of interest regarding the publication of this paper.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
