Article

A q-Gradient Descent Algorithm with Quasi-Fejér Convergence for Unconstrained Optimization Problems

by Shashi Kant Mishra 1,†, Predrag Rajković 2,†, Mohammad Esmael Samei 3,4,*,†, Suvra Kanti Chakraborty 5,†, Bhagwat Ram 6,7,† and Mohammed K. A. Kaabar 8,9,*,†
1 Department of Mathematics, Institute of Science, Banaras Hindu University, Varanasi 221005, India
2 Faculty of Mechanical Engineering, University of Niš, 18000 Niš, Serbia
3 Department of Mathematics, Faculty of Basic Science, Bu-Ali Sina University, Hamedan 65178, Iran
4 Department of Medical Research, China Medical University Hospital, China Medical University, Taichung 40402, Taiwan
5 Department of Mathematics, Ramakrishna Mission Vidyamandira Belur Math, Howrah 711202, India
6 Centre for Digital Transformation, Indian Institute of Management Ahmadabad (IIMA), Vastrapur 380015, India
7 DST-Centre for Interdisciplinary Mathematical Sciences, Institute of Science, Banaras Hindu University, Varanasi 221005, India
8 Institute of Mathematical Sciences, Faculty of Science, University of Malaya, Kuala Lumpur 50603, Malaysia
9 Department of Mathematics and Statistics, Washington State University, Pullman, WA 99163, USA
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Fractal Fract. 2021, 5(3), 110; https://doi.org/10.3390/fractalfract5030110
Submission received: 7 June 2021 / Revised: 18 August 2021 / Accepted: 27 August 2021 / Published: 3 September 2021
(This article belongs to the Special Issue The Materials Structure and Fractal Nature)

Abstract:
We present an algorithm for solving unconstrained optimization problems based on the q-gradient vector. The main idea used in the algorithm construction is the approximation of the classical gradient by a q-gradient vector. For a convex objective function, the quasi-Fejér convergence of the algorithm is proved. The proposed method does not require the boundedness assumption on any level set. Further, numerical experiments are reported to show the performance of the proposed method.
MSC:
41A60; 05A30; 65F08

1. Introduction

The descent direction plays a central role in the development of optimization algorithms. The classical gradient descent method was first proposed by Cauchy [1] in 1847. The optimization problem is a significant mathematical model in a wide class of disciplines [2]. In applications such as image processing [3], data analysis [4], and machine learning [5], in which one needs to quickly provide an approximate solution, several gradient-based algorithms [6,7,8,9] have been proposed based on the iterative technique of the gradient descent method [2,10]. Quantum calculus is the modern study of calculus without limits. Quantum calculus, or q-calculus, began with the work of Jackson in the early twentieth century [11], but similar kinds of calculus had already been developed by Euler and Jacobi in the eighteenth and nineteenth centuries, respectively. Recently, it has attracted deep interest due to the high demand for mathematics that models quantum computing. Besides appearing as a connection between mathematics and physics, q-calculus has many applications in different mathematical areas such as operator theory [12], combinatorics [13], orthogonal polynomials [14], basic hyper-geometric functions [15], and other sciences like quantum theory [16,17], mechanics [18], and fractional calculus [19,20,21,22,23,24,25]. For more recent studies about fractional calculus analysis and applications, refer to [26,27,28,29,30,31].
The q-Taylor formula for functions of several variables with mean value theorems in q-calculus was first used to develop a new method for solving systems of equations [32]. The advantage of q-calculus is shown in an example from [32], a system of two nonlinear equations involving absolute-value, exponential, and logarithmic terms, for which a scheme involving q-derivatives finds the solution but the classical Newton–Kantorovich method fails to do so. For $q = 0.9$, the iterations converge to the exact solution
$$(x_1, x_2)^T = \left(3, \tfrac{36}{7}\right)^T.$$
The concept of the q-analog of the gradient was first introduced in [33] for solving systems of equations. It has some advantages with respect to the classical method when the functions are not differentiable. Further, the same concept of the q-gradient was introduced in the steepest descent method [34] to optimize single objective functions [35]. The parameter q was generated by a Gaussian distribution with standard deviation σ that decreases during the iterative process as
$$\sigma^{(k+1)} = \beta\, \sigma^{(k)},$$
with a starting standard deviation $\sigma_0$ and the reduction factor β. The step length was generated using the golden section search method [34]. However, the convergence properties of the steepest descent method with inexact line searches have been studied under several strategies for the choice of the step length $\alpha_k$ [36,37,38]. Recently, several modified unconstrained optimization algorithms using the q-gradient have been proposed to solve unconstrained optimization problems [10,19,39,40,41].
In this paper, we propose a q-gradient line search scheme that provides a q-descent direction at every kth iteration. For this, a sequence $q^{(k)}$ [39] is taken to generate the values of q, and the backtracking technique is utilized to find the step length without requiring bounded level sets or a Lipschitz condition on the gradient of the function. We also provide a theoretical convergence proof when the step length is fixed, without any hypothesis on the level sets of the objective function. The advantage of using the q-gradient is shown by comparing our method with the method given in [36] based on the number of iterations and function evaluations.
The paper is organized as follows. In the next section, some notations and definitions for q-calculus and other prerequisites are provided, which are used throughout the paper. In Section 3, the q-gradient descent algorithm is given, and its convergence analysis is provided in Section 4. Numerical experiments are reported in Section 5, which is followed by a section of concluding remarks.

2. Essential Preliminaries

We assume that $\mathbb{R}_+$ stands for the nonnegative real line, $q \in (0,1)$ is a real number, and the q-integer $[n]_q$ is given as:
$$[n]_q = \begin{cases} \dfrac{1 - q^n}{1 - q}, & q \neq 1, \\ n, & q = 1, \end{cases}$$
for all $n \in \mathbb{N}$. The expansion of $(1+x)_q^n$ is:
$$(1+x)_q^n = \begin{cases} 1, & n = 0, \\ (1+x)(1+qx)\cdots(1+q^{n-1}x), & n \geq 1. \end{cases}$$
The q-derivative of $x^n$ with respect to x is:
$$D_q x^n = [n]_q\, x^{n-1}.$$
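For instance, $D_q x^3 = [3]_q\, x^2 = (1 + q + q^2)\, x^2$, which reduces to the classical derivative $3x^2$ as $q \to 1$.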
The q-derivative of a function $\psi : \mathbb{R} \to \mathbb{R}$ is given by
$$D_q \psi(x) = \frac{\psi(qx) - \psi(x)}{qx - x}, \qquad x \neq 0.$$
In the special case, $D_q \psi(0) = \frac{d\psi(0)}{dx}$.
If ψ is differentiable, then
$$\lim_{q \to 1} D_q \psi(x) = \lim_{q \to 1} \frac{\psi(qx) - \psi(x)}{(q-1)x} = \frac{d\psi(x)}{dx}.$$
The higher-order q-derivatives of ψ are:
$$D_q^0 \psi := \psi, \qquad D_q^n \psi := D_q\big(D_q^{n-1}\psi\big), \qquad n = 1, 2, 3, \ldots$$
The q-derivative of a function is a linear operator [42]: for any constants $c_1$ and $c_2$,
$$D_q\big[c_1 \psi_1(x) + c_2 \psi_2(x)\big] = c_1 D_q \psi_1(x) + c_2 D_q \psi_2(x).$$
Let $\psi(x)$ be a continuous function on $[a,b]$, where $a, b \in \mathbb{R}$. Then, there exist $\hat{q} \in (0,1)$ and $x \in (a,b)$ [43] such that
$$\psi(b) - \psi(a) = D_q \psi(x)\,(b - a),$$
for all $q \in (\hat{q}, 1) \cup (1, \hat{q}^{-1})$. The q-partial derivative of a function $\psi : \mathbb{R}^n \to \mathbb{R}$ at $x \in \mathbb{R}^n$ with respect to $x_i$ is defined as (see [35]):
$$D_{q_i, x_i} \psi(x) = \begin{cases} \dfrac{\psi(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_n) - \psi(x_1, \ldots, x_{i-1}, q_i x_i, x_{i+1}, \ldots, x_n)}{(1 - q_i)\, x_i}, & x_i \neq 0,\ q_i \neq 1, \\[2mm] \dfrac{\partial}{\partial x_i}\, \psi(x_1, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_n), & x_i = 0, \\[2mm] \dfrac{\partial}{\partial x_i}\, \psi(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_n), & q_i = 1. \end{cases}$$
A function is called q-differentiable at a point if all of its q-partial derivatives exist at that point, and continuously q-differentiable if these q-partial derivatives are, in addition, continuous there. We now choose the parameter q as a vector; that is,
$$q = (q_1, \ldots, q_i, \ldots, q_n)^T \in \mathbb{R}^n.$$
Then, the q-gradient vector [35] of $\psi : \mathbb{R}^n \to \mathbb{R}$ is:
$$\nabla_q \psi(x)^T = \big[\, D_{q_1, x_1} \psi(x) \;\; \cdots \;\; D_{q_i, x_i} \psi(x) \;\; \cdots \;\; D_{q_n, x_n} \psi(x) \,\big].$$
Let $\{q_i^{(k)}\}$ [10,41] be a real sequence defined by
$$q_i^{(k+1)} = 1 - \frac{q_i^{(k)}}{(k+1)^2},$$
for each $i = 1, \ldots, n$, where $k = 0, 1, 2, \ldots$ and $0 < q_i^{(0)} < 1$. Of course, the sequence $\{q^{(k)}\}$ converges to $(1, \ldots, 1)^T$ as $k \to \infty$, so that the q-gradient reduces to the classical gradient vector.
Proposition 1.
If $\psi(x) = a_0 + x^T a$, where $a_0 \in \mathbb{R}$ and $a \in \mathbb{R}^n$, then for any $x, q \in \mathbb{R}^n$,
$$\nabla_{q^{(k)}} \psi(x) = \nabla \psi(x) = a.$$
Example 1.
Consider a function $\psi : \mathbb{R}^2 \to \mathbb{R}$ such that $\psi(x) = \frac{1}{x_1 x_2}$. Then, the q-gradient is given as
$$\nabla_{q^{(k)}} \psi(x) = \frac{1}{x_1 x_2} \begin{pmatrix} -\dfrac{1}{q_1^{(k)} x_1} \\[2mm] -\dfrac{1}{q_2^{(k)} x_2} \end{pmatrix}.$$
Definition 1
(q-Integral [42]). The q-analog of the integral is given by
$$\int_0^b \psi(x)\, d_q x = b\,(1-q) \sum_{k=0}^{\infty} \psi(b q^k)\, q^k.$$
In the special case $q \to 1$, it reduces to the classical integral $\int_0^b \psi(x)\, dx$. For example [44],
$$\int_0^1 x\, d_q x = (1-q) \sum_{k=0}^{\infty} q^k \cdot q^k = \frac{1}{1+q}.$$
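As a quick numerical check (an illustrative sketch, not part of the original paper), the Jackson integral can be approximated by truncating the series; the helper name jackson_integral is ours.

```python
def jackson_integral(f, b, q, terms=200):
    """Truncated Jackson q-integral: b (1 - q) * sum_{k>=0} f(b q^k) q^k."""
    return b * (1.0 - q) * sum(f(b * q**k) * q**k for k in range(terms))

q = 0.9
print(jackson_integral(lambda t: t, 1.0, q))   # ~ 1 / (1 + q) = 0.5263...
print(1.0 / (1.0 + q))
```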
Definition 2
(q-Newton–Leibniz formula [42]). The q-anti-derivative of $\psi(x)$ is
$$F_q(x) = \int_0^x \psi(t)\, d_q t.$$
In that manner, the q-Newton–Leibniz formula is
$$\int_a^b \psi(t)\, d_q t = F_q(b) - F_q(a).$$
Definition 3
(Quasi-Fejér Convergence [36]). A sequence $\{x^{(k)}\}$ is quasi-Fejér convergent to a set $U \subseteq \mathbb{R}^n$ if for every $u \in U$, there exists a non-negative, summable sequence $\{\epsilon_k\} \subset \mathbb{R}$ such that $\epsilon_k \geq 0$, $\sum_{k=0}^{\infty} \epsilon_k < \infty$ and
$$\|x^{(k+1)} - u\|^2 \leq \|x^{(k)} - u\|^2 + \epsilon_k,$$
for all k.
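For instance, any Fejér monotone sequence with respect to U (take $\epsilon_k \equiv 0$) is quasi-Fejér convergent to U; the summable sequence $\{\epsilon_k\}$ simply permits a controlled violation of monotonicity at each step.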
In the next section, the q-gradient descent algorithm is presented for solving unconstrained optimization problems.

3. A q-Gradient Descent Algorithm

We consider the unconstrained optimization problem:
$$\min\ \psi(x), \qquad x \in \mathbb{R}^n,$$
where $\psi : \mathbb{R}^n \to \mathbb{R}$ is a continuously q-differentiable convex function. Note that minimizing $\psi(x)$ is equivalent to maximizing $-\psi(x)$. We choose a starting point $x^{(0)} \in \mathbb{R}^n$. The general iterative scheme to solve (9) using the q-gradient is of the following form:
$$x^{(k+1)} = x^{(k)} + \alpha_k\, d_{q^{(k)}}^{(k)},$$
where $x^{(k+1)}$ is the new iterate, $x^{(k)}$ is the previous iterate, and $d_{q^{(k)}}^{(k)}$ is a q-descent direction given as:
$$d_{q^{(k)}}^{(k)} = -\nabla_{q^{(k)}} \psi\big(x^{(k)}\big),$$
and $\alpha_k \in \mathbb{R}$, called the step length, can be computed by two main line search strategies: exact line search and inexact line search. In practical computation, the exact optimal step length generally cannot be found, and it is expensive to generate such a value of $\alpha_k$. Therefore, the most frequently used technique in practice is an inexact line search. When inexact line searches are performed, $\alpha_k$ is assigned either a predetermined value or a value obtained through some finite iterative method. Note that the existence of a solution to the minimization problem (9) is implicitly assumed. The condition for the existence of a solution [2] in the context of q-calculus is:
$$x^{(k)} \to x^* \in \Omega := \big\{ x : \nabla_{q^{(k)}} \psi(x) = 0 \big\}.$$
The following result presents the first-order necessary condition in the light of q-calculus.
Theorem 1.
Let Ω be a subset of $\mathbb{R}^n$ and ψ be a first-order q-differentiable real-valued function on Ω. If $x^*$ is a local minimizer of ψ over Ω, then for any feasible q-direction $d_{q^{(k)}}^{(k)}$ at $x^*$, we have
$$d_{q^{(k)}}^{(k)T}\, \nabla_{q^{(k)}} \psi(x^*) \geq 0.$$
Proof. 
We consider $x(\alpha) := x^* + \alpha\, d_{q^{(k)}}^{(k)} \in \Omega$. For $\alpha = 0$, we obtain $x(0) = x^*$. Define the composite function $\psi(\alpha) := \psi(x(\alpha))$. Applying q-Taylor's theorem to $\psi(\alpha)$ about $\alpha = 0$, we obtain
$$\psi(\alpha) = \psi(0) + \alpha\, \psi'(0) + O(\alpha).$$
We can write this as
$$\psi(\alpha) - \psi(0) = \alpha\, \psi'(0) + O(\alpha), \qquad \text{that is,} \qquad \psi\big(x^* + \alpha\, d_{q^{(k)}}^{(k)}\big) - \psi(x^*) = \alpha\, \psi'(0) + O(\alpha),$$
and
$$\psi'(\alpha) = \frac{d_q}{d_q \alpha}\, \psi(x(\alpha)) = \frac{d_q}{d_q \alpha}\, \psi\big(x^* + \alpha\, d_{q^{(k)}}^{(k)}\big) = d_{q^{(k)}}^{(k)T}\, \nabla_{q^{(k)}} \psi\big(x^* + \alpha\, d_{q^{(k)}}^{(k)}\big).$$
At $\alpha = 0$, where $x^* = x(0)$, we have
$$\psi'(0) = d_{q^{(k)}}^{(k)T}\, \nabla_{q^{(k)}} \psi(x(0)).$$
Therefore,
$$\psi\big(x^* + \alpha\, d_{q^{(k)}}^{(k)}\big) - \psi(x^*) = \alpha\, d_{q^{(k)}}^{(k)T}\, \nabla_{q^{(k)}} \psi(x(0)) + O(\alpha),$$
where $\alpha \geq 0$. Since $x^*$ is a local minimizer of ψ over Ω, for sufficiently small $\alpha > 0$, we can write
$$\psi(x^*) \leq \psi\big(x^* + \alpha\, d_{q^{(k)}}^{(k)}\big).$$
From (11) and (12),
$$\alpha\, d_{q^{(k)}}^{(k)T}\, \nabla_{q^{(k)}} \psi(x(0)) = \psi\big(x^* + \alpha\, d_{q^{(k)}}^{(k)}\big) - \psi(x^*) \geq 0,$$
that is,
$$\alpha\, d_{q^{(k)}}^{(k)T}\, \nabla_{q^{(k)}} \psi(x(0)) \geq 0.$$
Since $\alpha > 0$, then
$$d_{q^{(k)}}^{(k)T}\, \nabla_{q^{(k)}} \psi(x(0)) \geq 0.$$
Since $x(0) = x^*$, then
$$d_{q^{(k)}}^{(k)T}\, \nabla_{q^{(k)}} \psi(x^*) \geq 0.$$
This completes the proof. □
We present the following result for the interior case in the context of q-calculus.
Corollary 1.
Let Ω be a subset of $\mathbb{R}^n$ and ψ be a first-order q-differentiable real-valued function on Ω. If $x^*$ is a local minimizer of ψ over Ω and if $x^*$ is an interior point of Ω, then $\nabla_{q^{(k)}} \psi(x^*) = 0$.
Proof. 
Since $x^*$ is a local minimizer of ψ over Ω, then for any feasible q-direction, we have
$$d_{q^{(k)}}^{(k)T}\, \nabla_{q^{(k)}} \psi(x^*) \geq 0.$$
Since $x^*$ is an interior point of Ω, every q-direction is a feasible direction; in particular, $-d_{q^{(k)}}^{(k)}$ is also feasible, therefore
$$d_{q^{(k)}}^{(k)T}\, \nabla_{q^{(k)}} \psi(x^*) \leq 0.$$
From (13) and (14), we obtain
$$d_{q^{(k)}}^{(k)T}\, \nabla_{q^{(k)}} \psi(x^*) = 0.$$
Since this holds for all $d_{q^{(k)}}^{(k)} \in \mathbb{R}^n$, we obtain $\nabla_{q^{(k)}} \psi(x^*) = 0$. This completes the proof. □
Before writing an algorithm for the q-gradient descent method, we need the following assumptions.
Assumption 1.
We consider the two following assumptions:
1.
Let ψ : R n R be convex and continuously q-differentiable.
2.
The q-gradient of ψ is Lipschitz continuous with constant $L > 0$; that is,
$$\|\nabla_q \psi(x) - \nabla_q \psi(y)\| \leq L\, \|x - y\|,$$
for all $x, y \in \mathbb{R}^n$.
Assumption 2.
Let $\phi : \mathbb{R}_+ \to \mathbb{R}$ be such that:
1.
ϕ is convex and continuously q-differentiable on $[0, \infty)$,
2.
$\phi(0) = 0$ and $\phi'(0) < 1$,
3.
$\lim_{v \to 0^+} \dfrac{\phi(v)}{v^2} > 0$.
Note that (3) implies that $\phi'(0) \geq 0$; thus, from (2), we obtain $0 \leq \phi'(0) < 1$, and from (1), ϕ is convex and continuously q-differentiable. Therefore, ϕ is non-decreasing. The statement of the theorem given in [45] can be presented in the light of q-calculus as given below.
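For instance, $\phi(v) = c\, v^2$ with a constant $c > 0$ satisfies Assumption 2: it is convex and continuously q-differentiable, $\phi(0) = 0$ with $\phi'(0) = 0 < 1$, and $\phi(v)/v^2 = c > 0$.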
Theorem 2.
Let $F : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ be such that
1.
There exists $(\bar{x}, \bar{u}) \in \mathbb{R}^n \times \mathbb{R}$ such that $F(\bar{x}, \bar{u}) = 0$,
2.
F is continuous in a neighborhood of $(\bar{x}, \bar{u})$,
3.
F is q-differentiable with respect to the variable u at $(\bar{x}, \bar{u})$ and
$$\frac{\partial_q F}{\partial_q u}(\bar{x}, \bar{u}) \neq 0.$$
Then, there exists a neighborhood $U(\bar{x})$ of $\bar{x}$ and at least one function $\varrho : U(\bar{x}) \to \mathbb{R}$ such that $\varrho(\bar{x}) = \bar{u}$ and
$$F(x, \varrho(x)) = 0,$$
for any $x \in U(\bar{x})$.
4.
If $\frac{\partial_q F}{\partial_q u}(\cdot, \cdot)$ is continuous at $(\bar{x}, \bar{u})$, then the function ϱ is the only one that satisfies (16) and is continuous at $(\bar{x}, \bar{u})$.
Note that the proof of the q-analogue of the above implicit function theorem is beyond the scope of the present research. For developing the algorithm, we need the following proposition, whose proof is given in the light of [36].
Proposition 2.
Let
$$G = \big\{ x \in \mathbb{R}^n \,\big|\, \nabla_{q^{(k)}} \psi(x) \neq 0 \big\},$$
and ϕ satisfy Assumption 2. Then,
1.
For all $x \in G$, there exists a unique $\varrho(x) > 0$ such that
$$\psi(x) - \psi\big(x - \varrho(x)\, \nabla_{q^{(k)}} \psi(x)\big) = \phi(\varrho(x))\, \big\|\nabla_{q^{(k)}} \psi(x)\big\|^2,$$
and
$$\psi\big(x - u\, \nabla_{q^{(k)}} \psi(x)\big) + \phi(u)\, \big\|\nabla_{q^{(k)}} \psi(x)\big\|^2 \leq \psi(x),$$
if and only if $0 \leq u \leq \varrho(x)$.
2.
$\varrho : G \to \mathbb{R}_+$ is continuous on G.
Proof. 
  • We first prove (1). Fix $x \in G$, $u \in \mathbb{R}_+$ and define the following function in the context of q-calculus:
$$F(x, u) = \psi\big(x - u\, \nabla_{q^{(k)}} \psi(x)\big) - \psi(x) + \phi(u)\, \big\|\nabla_{q^{(k)}} \psi(x)\big\|^2.$$
From (1) of Assumption 2, $F(x, \cdot)$ is convex and continuously q-differentiable, and substituting $u = 0$ in (19), we obtain
$$F(x, 0) = \phi(0)\, \big\|\nabla_{q^{(k)}} \psi(x)\big\|^2.$$
From (2) of Assumption 2, we have $\phi(0) = 0$, thus
$$F(x, 0) = 0.$$
Applying the q-derivative with respect to u to (19), we get
$$\frac{\partial_q F}{\partial_q u} = -\nabla_q \psi\big(x - u\, \nabla_q \psi(x)\big)^T \nabla_q \psi(x) + \phi'(u)\, \big\|\nabla_q \psi(x)\big\|^2.$$
Substituting $u = 0$ in the right-hand side of the above equation, we obtain
$$\frac{\partial_q F}{\partial_q u}\Big|_{u=0} = \big\|\nabla_{q^{(k)}} \psi(x)\big\|^2 \big(\phi'(0) - 1\big).$$
Since $\phi'(0) < 1$, then
$$\frac{\partial_q F}{\partial_q u}\Big|_{u=0} < 0.$$
In addition,
$$F(x, u) \geq \psi^* - \psi(x) + \phi(u)\, \big\|\nabla_{q^{(k)}} \psi(x)\big\|^2,$$
where $\psi^*$ is the minimum function value of ψ. From (20) and (23), we conclude that $F(x, \cdot)$ is negative in some interval to the right of zero, and from (24) and (1) and (2) of Assumption 2, we obtain
$$\lim_{u \to \infty} F(x, u) = +\infty.$$
From Theorem 2 it follows that there exists $\varrho(x) > 0$ such that $F(x, \varrho(x)) = 0$. Using this value in (19), we obtain (17). Since $F(x, \cdot)$ is convex, $\varrho(x)$ is unique. Note that a convex function of a real variable can attain a given value other than its minimum value at no more than two points, while
$$F(x, 0) = F(x, \varrho(x)) = 0,$$
and from (20) and (23), the minimum point of $F(x, \cdot)$ is not zero. Thus, (1) of this proposition is proved.
  • Let $u^{(0)} = \varrho(x^{(0)})$ be given by (1), for a given $x^{(0)} \in G$. Then, we have that
$$F\big(x^{(0)}, u^{(0)}\big) = 0,$$
$F(\cdot, \cdot)$ is continuous in a neighborhood of $(x^{(0)}, u^{(0)})$ and, from (21),
$$\frac{\partial_q F}{\partial_q u}\big(x^{(0)}, u^{(0)}\big) = -\nabla_{q^{(k)}} \psi\big(x^{(0)} - u^{(0)}\, \nabla_{q^{(k)}} \psi(x^{(0)})\big)^T \nabla_{q^{(k)}} \psi\big(x^{(0)}\big) + \phi'\big(u^{(0)}\big)\, \big\|\nabla_{q^{(k)}} \psi(x^{(0)})\big\|^2.$$
As $F(x^{(0)}, \cdot)$ is strictly increasing at $u^{(0)}$, we have that
$$\frac{\partial_q F}{\partial_q u}\big(x^{(0)}, u^{(0)}\big) > 0.$$
From (26) we observe that $\frac{\partial_q F}{\partial_q u}(\cdot, \cdot)$ is continuous at $(x^{(0)}, u^{(0)})$ and all the hypotheses of Theorem 2 hold. Thus ϱ is continuous at $x^{(0)}$.
We present the following Algorithm 1.
Algorithm 1: q-Gradient Descent (q-GD) Algorithm
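The pseudocode of Algorithm 1 appears only as a figure in the original article. The following Python fragment is therefore only a sketch of the iteration as described in the surrounding text (q-descent direction, acceptance inequality (30), backtracking reduction $s_j = s_0/(1+\hat{M}_k)^j$ with $\delta_1 < s_0 < \delta_2$, and the update of $q^{(k)}$); it reuses the q_gradient helper sketched in Section 2, and all names and default parameter values are our own choices, not the authors' R code.

```python
import numpy as np

def q_gd(psi, phi, x0, q0, delta1=1e-4, delta2=0.9, eps=1e-6, max_iter=500):
    """Sketch of the q-gradient descent iteration described in the text:
    accept the first s_j = s_0 / (1 + M_k)^j satisfying
    psi(x - s_j g) <= psi(x) - phi(s_j) * ||g||^2, where g is the q-gradient."""
    x = np.asarray(x0, dtype=float)
    q = np.asarray(q0, dtype=float)
    s0 = 0.5 * (delta1 + delta2)           # starting trial step with delta1 < s0 < delta2
    for k in range(max_iter):
        g = q_gradient(psi, x, q)          # q-gradient helper from the Section 2 sketch
        if np.linalg.norm(g) <= eps:       # stopping criterion used in Section 5
            break
        M_hat = np.max(q)                  # M_k = max_i q_i^{(k)}
        s, j = s0, 0
        # backtracking: shrink s_j = s_0 / (1 + M_k)^j until the acceptance test holds
        while psi(x - s * g) > psi(x) - phi(s) * np.dot(g, g) and j < 60:
            j += 1
            s = s0 / (1.0 + M_hat) ** j
        x = x - s * g                      # q-descent step x^{(k+1)} = x^{(k)} - alpha_k g
        q = 1.0 - q / (k + 1) ** 2         # q_i^{(k+1)} = 1 - q_i^{(k)} / (k + 1)^2
    return x, k

# an admissible phi under Assumption 2, e.g. phi(v) = c v^2 with a small c > 0
phi = lambda v: 1e-3 * v * v
```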
Remark 1
([36]). Two important facts about computing α k are presented below:
1.
In Algorithm 1, the modified backtracking technique finds $\alpha_k$ using only one inequality instead of the two inequalities required in [46].
2.
We can find $\alpha_k$ by another technique; we take positive numbers $\delta_1$ and $\delta_2$ such that
$$\frac{L}{1 + \hat{M}_k}\, \delta_1 < 1 - \delta_2,$$
where
$$\hat{M}_k = \max_{1 \leq i \leq n} q_i^{(k)},$$
so that the step length $\alpha_k$ can be computed using
$$\delta_1 \leq \alpha_k \leq \frac{1}{L}\big(1 + \hat{M}_k\big)\big(1 - \delta_2\big),$$
where $L > 0$.
Note that we start Algorithm 1 by taking $s_0$ such that
$$\delta_1 < s_0 < \delta_2,$$
where $\delta_1$ and $\delta_2$ are two positive numbers. For the proposed algorithm to be well defined, it must be established that the following inequality used in Algorithm 1,
$$\psi\big(x^{(k)} - s_k\, \nabla_{q^{(k)}} \psi(x^{(k)})\big) \leq \psi\big(x^{(k)}\big) - \phi(s_k)\, \big\|\nabla_{q^{(k)}} \psi(x^{(k)})\big\|^2,$$
is satisfied after some finite number of steps. Note that every accumulation point of $\{x^{(k)}\}$ is a minimizer of ψ. Since $\{\psi(x^{(k)})\}$ is non-increasing, we have
$$\psi(x^*) \leq \psi\big(x^{(k)}\big),$$
for all k with fixed $x^*$. The content of Theorem 5 will be argued later. The analysis of the backtracking used in Algorithm 1 to compute $\alpha_k$ is shown below, following [36].
Proposition 3.
The backtracking method of Algorithm 1 defined by Equations (29) and (30) stops after a finite number of iterations with
$$\min\left\{\delta_1, \frac{\varrho(x^{(k)})}{1 + \hat{M}_k}\right\} \leq \alpha_k \leq \min\big\{\delta_2, \varrho(x^{(k)})\big\}.$$
Proof. 
We consider two cases for the value of $s_0$:
  • $s_0 \in \big(0, \varrho(x^{(k)})\big]$,
  • $s_0 > \varrho\big(x^{(k)}\big)$.
  • In the first case, from Equations (29) and (30) and (1) of Proposition 2, $\alpha_k = s_0$ and the iteration stops at $j = 0$. Since $s_0 < \delta_2$ and $s_0 \leq \varrho(x^{(k)})$, then
$$\alpha_k = s_0 \leq \min\big\{\delta_2, \varrho(x^{(k)})\big\},$$
and since $s_0 > \delta_1$, then
$$s_0 = \alpha_k \geq \min\left\{\delta_1, \frac{\varrho(x^{(k)})}{1 + \hat{M}_k}\right\}.$$
  • In the second case, there exists a unique $t \in \mathbb{N}$, $t \geq 1$, such that
$$\big(1 + \hat{M}_k\big)^{t-1}\, \varrho\big(x^{(k)}\big) < s_0 \leq \big(1 + \hat{M}_k\big)^t\, \varrho\big(x^{(k)}\big).$$
Then,
$$\frac{\varrho(x^{(k)})}{1 + \hat{M}_k} < \frac{s_0}{(1 + \hat{M}_k)^t} \leq \varrho\big(x^{(k)}\big).$$
From Equations (29) and (30), we have
$$s_j = \frac{s_0}{(1 + \hat{M}_k)^j},$$
used in Algorithm 1, so the above inequality establishes that
$$\frac{\varrho(x^{(k)})}{1 + \hat{M}_k} < s_t \leq \varrho\big(x^{(k)}\big).$$
We now claim that $\alpha_k = s_t$. From Equation (34), we obtain $s_t \leq \varrho(x^{(k)})$, and from (33), $s_{t-1} > \varrho(x^{(k)})$ (for $t = 1$ this is exactly Case 2). Using Proposition 2 and $\alpha_k = s_t$, the inequality (30) is satisfied, but it is not satisfied for $\alpha_k = s_{t-1}$. Note that (31) follows from (34) and, in fact, we have that $s_t \leq s_0 < \delta_2$.
The proof is complete. □
The following proposition, whose proof is very similar to that of [2] (Proposition 6.1), states that the q-gradient descent method moves in orthogonal steps.
Proposition 4.
If $\{x^{(k)}\}_{k=0}^{\infty}$ is a q-gradient descent sequence for a given function $\psi : \mathbb{R}^n \to \mathbb{R}$, then for every k, the vector
$$x^{(k+1)} - x^{(k)}$$
is orthogonal to the vector $x^{(k+2)} - x^{(k+1)}$.
Proof. 
The iterative formula of the q-gradient descent is:
$$x^{(k+1)} - x^{(k)} = -\alpha_k\, \nabla_{q^{(k)}} \psi\big(x^{(k)}\big).$$
Replacing k by $k+1$, we obtain
$$x^{(k+2)} - x^{(k+1)} = -\alpha_{k+1}\, \nabla_{q^{(k)}} \psi\big(x^{(k+1)}\big).$$
From (35) and (36), we obtain
$$\big\langle x^{(k+1)} - x^{(k)},\, x^{(k+2)} - x^{(k+1)} \big\rangle = \big\langle -\alpha_k\, \nabla_{q^{(k)}} \psi(x^{(k)}),\, -\alpha_{k+1}\, \nabla_{q^{(k)}} \psi(x^{(k+1)}) \big\rangle.$$
We need to show that
$$\big\langle \nabla_{q^{(k)}} \psi(x^{(k)}),\, \nabla_{q^{(k)}} \psi(x^{(k+1)}) \big\rangle = 0.$$
Since $\alpha_k \geq 0$ is a minimizer of $\phi_k(\alpha) = \psi\big(x^{(k)} + \alpha\, d_{q^{(k)}}^{(k)}\big)$ subject to the acceptance condition
$$\phi_k(\alpha) \leq \psi\big(x^{(k)}\big) + \phi(s_k)\, d_{q^{(k)}}^{(k)T}\, g_{q^{(k)}}^{(k)},$$
used in Algorithm 1, then from the first-order necessary condition, we have
$$\frac{d_q \phi_k(\alpha)}{d_q \alpha}\Big|_{\alpha = \alpha_k} = 0,$$
that is,
$$\nabla_{q^{(k)}} \psi\big(x^{(k)} - \alpha_k\, \nabla_{q^{(k)}} \psi(x^{(k)})\big)^T\, \nabla_{q^{(k)}} \psi\big(x^{(k)}\big) = 0,$$
that is,
$$\nabla_{q^{(k)}} \psi\big(x^{(k+1)}\big)^T\, \nabla_{q^{(k)}} \psi\big(x^{(k)}\big) = 0.$$
We obtain the desired result after substituting the above equation into (37). □
The above proposition implies that $\nabla_{q^{(k)}} \psi(x^{(k)})$ is parallel to the tangent plane to the level set
$$\psi(x) = \psi\big(x^{(k+1)}\big),$$
at $x^{(k+1)}$ when $q^{(k)} \to (1, \ldots, 1)^T$ as $k \to \infty$. Note that each approximate point generated by the q-gradient descent algorithm decreases the corresponding objective function value ψ.
Proposition 5.
If $\{x^{(k)}\}_{k=0}^{\infty}$ is the q-gradient descent sequence for $\psi : \mathbb{R}^n \to \mathbb{R}$ and
$$\nabla_{q^{(k)}} \psi\big(x^{(k)}\big) \neq 0,$$
then
$$\psi\big(x^{(k+1)}\big) < \psi\big(x^{(k)}\big).$$
Proof. 
We know that
$$x^{(k+1)} = x^{(k)} - \alpha_k\, \nabla_{q^{(k)}} \psi\big(x^{(k)}\big),$$
where $\alpha_k \geq 0$ is the minimizer of $\phi_k(\alpha) = \psi\big(x^{(k)} + \alpha\, d_{q^{(k)}}^{(k)}\big)$ subject to the acceptance condition $\phi_k(\alpha) \leq \psi\big(x^{(k)}\big) + \phi(s_k)\, d_{q^{(k)}}^{(k)T}\, g_{q^{(k)}}^{(k)}$ used in Algorithm 1. Therefore, for $\alpha \geq 0$, we have $\phi_k(\alpha_k) \leq \phi_k(\alpha)$ for all α. For $\alpha = 0$, we have
$$\frac{d_q \phi_k}{d_q \alpha}\Big|_{\alpha=0} = -\big\|\nabla_{q^{(k)}} \psi\big(x^{(k)}\big)\big\|^2.$$
Since $\nabla_{q^{(k)}} \psi(x^{(k)}) \neq 0$, then $-\|\nabla_{q^{(k)}} \psi(x^{(k)})\|^2 < 0$. Thus,
$$\frac{d_q \phi_k}{d_q \alpha}\Big|_{\alpha=0} < 0.$$
There exists an $\bar{\alpha} > 0$ such that $\phi_k(0) > \phi_k(\alpha)$ for each $\alpha \in (0, \bar{\alpha}]$. That is,
$$\psi\big(x^{(k+1)}\big) = \phi_k(\alpha_k) \leq \phi_k(\alpha) < \phi_k(0) = \psi\big(x^{(k)}\big).$$
Thus,
$$\psi\big(x^{(k+1)}\big) < \psi\big(x^{(k)}\big).$$
This completes the proof. □

4. Convergence Analysis

In this section, we present the convergence analysis of the proposed method with inexact line searches and the concept of quasi-Fejér convergence.
Theorem 3.
[36] If $\{x^{(k)}\}$ is quasi-Fejér convergent to a nonempty set $U \subseteq \mathbb{R}^n$, then $\{x^{(k)}\}$ is bounded. Furthermore, if an accumulation point $\bar{x}$ of $\{x^{(k)}\}$ belongs to U, then $\lim_{k \to \infty} x^{(k)} = \bar{x}$.
Proof. 
The proof of the above theorem can be seen in [36]. □
Theorem 4.
For Algorithm 1 it holds that
1.
There exists $\gamma > 0$ such that
$$\psi\big(x^{(k+1)}\big) \leq \psi\big(x^{(k)}\big) - \gamma\, \big\|x^{(k+1)} - x^{(k)}\big\|^2,$$
for all $k \in \mathbb{N}$,
2.
The sequence $\{\psi(x^{(k)})\}$ is nonincreasing and convergent,
3.
$\sum_{k=0}^{\infty} \big\|x^{(k+1)} - x^{(k)}\big\|^2 < \infty$.
Proof. 
  • For Algorithm 1, we compute $\alpha_k$ using inequalities (29) and (30). Note that
$$\big\|x^{(k+1)} - x^{(k)}\big\|^2 = \alpha_k^2\, \big\|\nabla_q \psi\big(x^{(k)}\big)\big\|^2.$$
Since $\alpha_k > 0$, we also have
$$\frac{\phi(\alpha_k)}{\alpha_k^2}\, \big\|x^{(k+1)} - x^{(k)}\big\|^2 = \phi(\alpha_k)\, \big\|\nabla_q \psi\big(x^{(k)}\big)\big\|^2.$$
Since
$$\phi(\alpha_k)\, \big\|\nabla_q \psi\big(x^{(k)}\big)\big\|^2 \leq \psi\big(x^{(k)}\big) - \psi\big(x^{(k+1)}\big),$$
thus
$$\frac{\phi(\alpha_k)}{\alpha_k^2}\, \big\|x^{(k+1)} - x^{(k)}\big\|^2 \leq \psi\big(x^{(k)}\big) - \psi\big(x^{(k+1)}\big).$$
We take
$$0 < \xi < \liminf_{v \to 0^+} \frac{\phi(v)}{v^2}.$$
By definition of ξ, there exists $\theta > 0$ such that if the step length satisfies $\alpha \in (0, \theta)$, then
$$\xi < \frac{\phi(\alpha)}{\alpha^2}.$$
For every k, we consider the following two cases for the step length:
(i)
If $\alpha_k \in (0, \theta)$, then from (39), we have $\phi(\alpha_k) \geq \xi\, \alpha_k^2$.
(ii)
If $\alpha_k \geq \theta$. In this case, by Proposition 3, we have that
$$\alpha_k \leq \min\big\{\delta_2, \varrho(x^{(k)})\big\} \leq \delta_2,$$
and it follows from (1) and (2) of Assumption 2 that ϕ is increasing, implying $\phi(\theta) \leq \phi(\alpha_k)$. Thus, we have
$$\frac{\phi(\theta)}{\delta_2^2} \leq \frac{\phi(\alpha_k)}{\alpha_k^2}.$$
From (38), we have
$$\psi\big(x^{(k+1)}\big) \leq \psi\big(x^{(k)}\big) - \frac{\phi(\alpha_k)}{\alpha_k^2}\, \big\|x^{(k+1)} - x^{(k)}\big\|^2.$$
From Equations (39) and (40) and using the above inequality, we obtain
$$\psi\big(x^{(k+1)}\big) \leq \psi\big(x^{(k)}\big) - \gamma\, \big\|x^{(k+1)} - x^{(k)}\big\|^2,$$
where
$$\gamma = \min\left\{\xi, \frac{\phi(\theta)}{\delta_2^2}\right\}.$$
Thus, (1) of this theorem is proved for this choice of step length. If, instead, we use Equations (27) and (28) to compute the step length, then from the q-Newton–Leibniz formula [42],
$$\psi\big(x^{(k+1)}\big) = \psi\big(x^{(k)}\big) - \alpha_k\, \big\|\nabla_{q^{(k)}} \psi\big(x^{(k)}\big)\big\|^2 - \alpha_k \int_0^1 \Big[\nabla_{q^{(k)}} \psi\big(x^{(k)} - u\, \alpha_k\, \nabla_{q^{(k)}} \psi(x^{(k)})\big) - \nabla_{q^{(k)}} \psi\big(x^{(k)}\big)\Big]^T \nabla_{q^{(k)}} \psi\big(x^{(k)}\big)\, d_q u.$$
From (2) of Assumption 1, we obtain
$$\psi\big(x^{(k+1)}\big) \leq \psi\big(x^{(k)}\big) - \alpha_k\, \big\|\nabla_{q^{(k)}} \psi\big(x^{(k)}\big)\big\|^2 + L\, \alpha_k^2\, \big\|\nabla_{q^{(k)}} \psi\big(x^{(k)}\big)\big\|^2 \int_0^1 u\, d_q u.$$
Using a special case of the q-integral [44], we obtain
$$\psi\big(x^{(k+1)}\big) \leq \psi\big(x^{(k)}\big) - \alpha_k \left(1 - \frac{L\, \alpha_k}{1 + \hat{M}_k}\right) \big\|\nabla_{q^{(k)}} \psi\big(x^{(k)}\big)\big\|^2,$$
where we assume
$$\hat{M}_k = \max_{1 \leq i \leq n} q_i^{(k)}.$$
Since
$$\big\|\nabla_{q^{(k)}} \psi\big(x^{(k)}\big)\big\|^2 = \frac{1}{\alpha_k^2}\, \big\|x^{(k+1)} - x^{(k)}\big\|^2,$$
then
$$\psi\big(x^{(k+1)}\big) \leq \psi\big(x^{(k)}\big) - \frac{1}{\alpha_k} \left(1 - \frac{L\, \alpha_k}{1 + \hat{M}_k}\right) \big\|x^{(k+1)} - x^{(k)}\big\|^2.$$
From (28), we have
$$\delta_1 \leq \alpha_k \leq \frac{1 + \hat{M}_k}{L}\, (1 - \delta_2),$$
and then we obtain
$$\frac{1}{\alpha_k} \left(1 - \frac{L\, \alpha_k}{1 + \hat{M}_k}\right) \geq \frac{\delta_2\, L}{(1 + \hat{M}_k)(1 - \delta_2)}.$$
From (41), we obtain
$$\psi\big(x^{(k+1)}\big) \leq \psi\big(x^{(k)}\big) - \frac{\delta_2\, L}{(1 + \hat{M}_k)(1 - \delta_2)}\, \big\|x^{(k+1)} - x^{(k)}\big\|^2,$$
where
$$\gamma = \frac{\delta_2\, L}{(1 + \hat{M}_k)(1 - \delta_2)}.$$
Thus, (1) is proved.
  • Item (2) follows from (1): since $\gamma > 0$, the sequence $\{\psi(x^{(k)})\}$ is nonincreasing, and it is bounded below by $\psi(x^*)$, hence convergent.
  • By (1), there exists $\gamma > 0$ such that
$$\sum_{k=0}^{r} \big\|x^{(k+1)} - x^{(k)}\big\|^2 \leq \frac{1}{\gamma}\, \Big[\psi\big(x^{(0)}\big) - \psi\big(x^{(r)}\big)\Big] \leq \frac{1}{\gamma}\, \Big[\psi\big(x^{(0)}\big) - \psi\big(x^*\big)\Big].$$
Letting $r \to \infty$, we obtain
$$\sum_{k=0}^{\infty} \big\|x^{(k+1)} - x^{(k)}\big\|^2 < \infty.$$
This completes the proof. □
We also need to present the following proposition to prove the convergence of Algorithm 1.
Proposition 6.
Let
$$T = \Big\{ z \in \mathbb{R}^n : \psi(z) \leq \liminf_{k \to \infty} \psi\big(x^{(k)}\big) \Big\}.$$
Every point generated by Algorithm 1 belongs to T, so T is nonempty. For any $z \in T$, we have
$$\big\|x^{(k+1)} - z\big\|^2 \leq \big\|x^{(k)} - z\big\|^2 + \big\|x^{(k+1)} - x^{(k)}\big\|^2;$$
consequently, the sequence $\{x^{(k)}\}$ generated by Algorithm 1 is quasi-Fejér convergent to a point $x^* \in T$ for any $\alpha_k > 0$.
Proof. 
Given that $z \in T$, then
$$\big\|x^{(k+1)} - z\big\|^2 - \big\|x^{(k)} - z\big\|^2 - \big\|x^{(k+1)} - x^{(k)}\big\|^2 = -2\, \big(z - x^{(k)}\big)^T \big(x^{(k+1)} - x^{(k)}\big).$$
Since $x^{(k+1)} - x^{(k)} = -\alpha_k\, \nabla_{q^{(k)}} \psi(x^{(k)})$, then
$$\big\|x^{(k+1)} - z\big\|^2 - \big\|x^{(k)} - z\big\|^2 - \big\|x^{(k+1)} - x^{(k)}\big\|^2 = 2\, \alpha_k\, \big(z - x^{(k)}\big)^T \nabla_{q^{(k)}} \psi\big(x^{(k)}\big).$$
A function ψ will be called q-convex if it satisfies the following inequality in the light of q-calculus:
$$\psi(z) - \psi\big(x^{(k)}\big) \geq \big(z - x^{(k)}\big)^T \nabla_{q^{(k)}} \psi\big(x^{(k)}\big).$$
With the above inequality and (42), we obtain
$$\big\|x^{(k+1)} - z\big\|^2 - \big\|x^{(k)} - z\big\|^2 - \big\|x^{(k+1)} - x^{(k)}\big\|^2 \leq 2\, \alpha_k\, \Big[\psi(z) - \psi\big(x^{(k)}\big)\Big] \leq 0,$$
and from (3) of Theorem 4, we have
$$\sum_{k=0}^{\infty} \big\|x^{(k+1)} - x^{(k)}\big\|^2 < \infty,$$
so $\{x^{(k)}\}$ is quasi-Fejér convergent to $T \subseteq \mathbb{R}^n$ with
$$\epsilon_k = \big\|x^{(k+1)} - x^{(k)}\big\|^2,$$
and from Theorem 3, $\{x^{(k)}\}$ is bounded. Further, $\{x^{(k)}\}$ has an accumulation point $\bar{x}$, which is in T. Thus, $\lim_{k \to \infty} x^{(k)} = \bar{x}$. □
Theorem 5.
The sequence $\{x^{(k)}\}$ generated by Algorithm 1 converges in the sense of quasi-Fejér convergence to a minimizer of the function $\psi : \mathbb{R}^n \to \mathbb{R}$.
Proof. 
From Proposition 6, $\lim_{k \to \infty} x^{(k)} = x^* \in T$, where T is the set of accumulation points along which the objective function decreases in every iteration. However, we need to prove that $x^* \in T^*$, where $T^*$ is the set of minimizers of the objective function. Suppose, on the contrary, that $x^* \notin T^*$; then, from the convexity of ψ, $x^* \in G$ and
$$\big\|\nabla_{q^{(k)}} \psi\big(x^*\big)\big\| > 0.$$
From Proposition 2, $\varrho(x^*) > 0$ and $\varrho(x^{(k)})$ converges to $\varrho(x^*)$. Thus, there exists $k_0$ such that for all $k \geq k_0$, we have
$$\varrho\big(x^{(k)}\big) \geq \frac{\varrho(x^*)}{1 + \hat{M}_k},$$
$$\big\|\nabla_q \psi\big(x^{(k)}\big)\big\|^2 \geq \frac{1}{1 + \hat{M}_k}\, \big\|\nabla_{q^{(k)}} \psi\big(x^*\big)\big\|^2.$$
Let
$$\sigma = \min\left\{\delta_1, \frac{\varrho(x^*)}{(1 + \hat{M}_k)^2}\right\}^2 \frac{\big\|\nabla_q \psi(x^*)\big\|^2}{1 + \hat{M}_k}.$$
Then, for any $k \geq k_0$,
$$\big\|x^{(k+1)} - x^{(k)}\big\|^2 = \alpha_k^2\, \big\|\nabla_{q^{(k)}} \psi\big(x^{(k)}\big)\big\|^2.$$
Since
$$\alpha_k^2 \geq \min\left\{\delta_1, \frac{\varrho(x^{(k)})}{1 + \hat{M}_k}\right\}^2,$$
then
$$\big\|x^{(k+1)} - x^{(k)}\big\|^2 \geq \min\left\{\delta_1, \frac{\varrho(x^{(k)})}{1 + \hat{M}_k}\right\}^2 \big\|\nabla_{q^{(k)}} \psi\big(x^{(k)}\big)\big\|^2.$$
From Equations (43) and (45), we obtain
$$\big\|x^{(k+1)} - x^{(k)}\big\|^2 \geq \min\left\{\delta_1, \frac{\varrho(x^*)}{(1 + \hat{M}_k)^2}\right\}^2 \frac{\big\|\nabla_{q^{(k)}} \psi(x^*)\big\|^2}{1 + \hat{M}_k} = \sigma > 0.$$
This contradicts (3) of Theorem 4. Thus it is proved that $x^* \in T^*$. If we choose $\alpha_k$ using (27) and (28), then we have
$$\big\|x^{(k+1)} - x^{(k)}\big\|^2 = \alpha_k^2\, \big\|\nabla_{q^{(k)}} \psi\big(x^{(k)}\big)\big\|^2.$$
Since $\alpha_k^2 \geq \delta_1^2$, then
$$\big\|x^{(k+1)} - x^{(k)}\big\|^2 \geq \delta_1^2\, \big\|\nabla_{q^{(k)}} \psi\big(x^{(k)}\big)\big\|^2.$$
From (3) of Theorem 4, the continuity of $\nabla_{q^{(k)}} \psi(\cdot)$, and $\delta_1 > 0$ used in the above inequality, we obtain $\nabla_{q^{(k)}} \psi(x^{(k)}) \to 0$ as $k \to \infty$. Moreover, as $q^{(k)} \to (1, \ldots, 1)^T$, this implies that
$$\nabla_{q^{(k)}} \psi\big(x^{(k)}\big)$$
is a good approximation of $\nabla \psi\big(x^{(k)}\big)$ [2]. Further, see the proof of Proposition 2.2 in [47] for an affine function, where the classical gradient and the q-gradient of ψ coincide. For other functions, Example 3.2 and Remark 3.3 can again be seen in [47]. Hence, the accumulation point $x^* = \lim_{k \to \infty} x^{(k)}$ is a minimizer of the function. □

5. Experimental Results

We compared the numerical performance of our algorithm with the methodology used in [36]. The stopping criterion was
$$\big\|\nabla_{q^{(k)}} \psi\big(x^{(k)}\big)\big\| \leq \epsilon,$$
where $\epsilon = 10^{-6}$, used to terminate both algorithms. Numerical results were compared based on the number of iterations and the number of function evaluations. The iteration was stopped either when the stopping criterion was satisfied or when the iteration count reached 500. All problems were taken from [48], and the computer codes were written in the R language.
The numerical experiments were performed on an Intel Core i5-3210M CPU with 2.5 GHz, 4 GB of RAM and a 64-bit XXX (Intel, Santa Clara, CA, USA) to solve the unconstrained minimization problems.
Example 2
([49]). Consider a function $\psi : \mathbb{R}^3 \to \mathbb{R}$ given by
$$\psi(x) = \frac{1}{2}\, x^T Q\, x - b^T x,$$
where
$$Q = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \tau & 0 \\ 0 & 0 & \tau^2 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$
We apply the q-gradient descent algorithm with the starting point
$$x^{(0)} = (3, 2, 1)^T,$$
for different values of $\tau = 2, 5, 10, 20, 50$. We compared our method with the methodology used in [36], and the numerical results are reported in Table 1.
It is worth mentioning that the method proposed in this paper requires the fewest iterations, while the minimizer $x^*$ and the minimum function value $\psi(x^*) = \psi^*$ are almost the same for both methods.
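As a usage illustration under the same assumptions as the earlier sketches (and again not the authors' R implementation), the quadratic of Example 2 can be passed to the q_gd routine above; the initial vector $q^{(0)} = (0.9, 0.9, 0.9)^T$ is our own choice, since the paper does not report it.

```python
import numpy as np

tau = 2.0
Q = np.diag([1.0, tau, tau**2])
b = np.ones(3)
psi = lambda x: 0.5 * x @ Q @ x - b @ x     # objective of Example 2
x0 = np.array([3.0, 2.0, 1.0])              # starting point as listed in the text
x_bar, iters = q_gd(psi, phi, x0, q0=np.full(3, 0.9))
print(x_bar)   # expected to approach Q^{-1} b = (1, 1/tau, 1/tau^2)^T, cf. Table 1
```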
Example 3
([50]). Consider a function $\psi : \mathbb{R}^2 \to \mathbb{R}$ such that
$$\psi(x) = x_1^4 - 2\, x_1^2 x_2 + x_1^2 + x_2^2 - 2\, x_1 + 1.$$
With the starting point
$$x^{(0)} = (-2, -2)^T,$$
Algorithm 1 shows improvement over the algorithm given in [36]. The numerical results are shown in Table 2 and Table 3, with the column abbreviations '$g_n$' for the norm of the gradient and '$nf$' for the number of function evaluations; the step length is computed using the backtracking technique. We observe that our proposed algorithm converges to the solution point in 28 iterations, while the algorithm used in [36] converges to the same solution point in 40 iterations. The graphs for both methods in terms of the number of function evaluations versus the logarithm of the function value are provided in Figure 1 and Figure 2. The three-dimensional pictorial representation of Example 3 is shown in Figure 3.
Table 2. Numerical output of Example 3 using Algorithm 1.
q-Gradient Descent
It   ψ   g_n   nf
0   4.50000E+01   5.53114E+01   1
1   3.05531E+00   3.83381E+00   4
2   1.44344E+00   2.45969E+00   7
3   5.37577E-01   1.48052E+00   10
4   3.32689E-01   8.22955E-01   12
5   2.09224E-01   7.95222E-01   15
6   1.50867E-01   5.00475E-01   17
7   1.10048E-01   5.35543E-01   20
8   8.48946E-02   3.57301E-01   22
9   6.60685E-02   3.64658E-01   24
10   5.33656E-02   2.90936E-01   26
11   4.33762E-02   2.65104E-01   28
12   3.57854E-02   2.51345E-01   30
13   2.96786E-02   1.99254E-01   32
14   2.45786E-02   2.26142E-01   34
15   2.04572E-02   1.51693E-01   36
16   1.66695E-02   2.10239E-01   38
17   1.36586E-02   1.14514E-01   40
18   1.05164E-02   2.01242E-01   42
19   8.15978E-03   8.24237E-02   44
20   5.09294E-03   2.12029E-01   45
21   2.88677E-03   4.64793E-02   47
22   4.05728E-04   8.80875E-02   50
23   6.74352E-05   7.16943E-03   52
24   1.98233E-05   1.75357E-02   55
25   5.78355E-06   2.33610E-03   57
26   3.56220E-06   6.21160E-03   60
27   1.61834E-06   1.41408E-03   62
28   1.32663E-06   3.28145E-03   65
Figure 1. Number of function evaluations due to q-gradient descent algorithm for Example 3.
Table 3. Numerical results of Example 3 using Classical Gradient Descent [36].
Classical Gradient Descent
It   ψ   g_n   nf
0   4.50000E+01   5.53173E+01   1
1   3.05623E+00   3.83810E+00   4
2   1.44520E+00   2.46074E+00   7
3   5.38450E-01   1.48336E+00   10
4   3.33100E-01   8.23444E-01   12
5   2.09391E-01   7.95583E-01   15
6   1.51000E-01   5.00572E-01   17
7   1.10457E-01   4.84543E-01   19
8   8.61656E-02   3.86837E-01   21
9   6.78803E-02   3.30047E-01   23
10   5.44785E-02   3.30855E-01   25
11   4.40901E-02   2.35325E-01   27
12   3.51020E-02   3.02819E-01   29
13   2.81958E-02   1.68607E-01   31
14   2.07562E-02   2.94811E-01   33
15   1.54894E-02   1.13592E-01   35
16   6.31703E-03   2.74534E-01   37
17   2.67696E-03   4.64201E-02   39
18   1.88478E-03   1.04102E-01   41
19   1.35417E-03   3.27922E-02   43
20   9.76185E-04   7.34138E-02   45
21   7.13476E-04   2.36981E-02   47
22   5.22365E-04   5.29427E-02   49
23   3.86176E-04   1.73832E-02   51
24   2.85774E-04   3.87568E-02   53
25   2.12958E-04   1.28821E-02   55
26   1.58791E-04   2.86713E-02   57
27   1.19008E-04   9.61596E-03   59
28   8.92260E-05   2.13709E-02   61
29   6.71499E-05   7.21554E-03   63
30   5.05489E-05   1.60174E-02   65
31   3.81590E-05   5.43511E-03   67
32   2.88110E-05   1.20540E-02   69
33   2.17989E-05   4.10562E-03   71
34   1.64954E-05   9.09882E-03   73
35   1.25020E-05   3.10790E-03   75
36   9.47621E-06   6.88380E-03   77
37   7.19130E-06   2.35637E-03   79
38   5.45769E-06   5.21692E-03   81
39   4.14574E-06   1.78870E-03   83
40   3.14931E-06   3.95878E-03   85
Dolan and Moré [51] presented an appropriate technique to demonstrate performance profiles, which is a statistical process. The performance ratio is presented as:
$$\rho_{p,s} = \frac{r_{p,s}}{\min\{ r_{p,s} : 1 \leq s \leq n_s \}},$$
where $r_{p,s}$ refers to the number of iterations or function evaluations required by solver s on problem p, $n_s$ is the number of solvers, and $n_p$ is the number of problems in the test set. The cumulative distribution function is given as:
$$P_s(\tau) = \frac{1}{n_p}\, \mathrm{size}\big\{ p : \rho_{p,s} \leq \tau \big\},$$
where $P_s(\tau)$ is the probability that the performance ratio $\rho_{p,s}$ is within a factor $\tau \in \mathbb{R}$ of the best. That is, for a subset of the methods being analyzed, we plot the fraction $P_s(\tau)$ of problems for which any given method is within a factor τ of the best. We use this tool to show the performance of Algorithm 1. First, we solved 28 test problems with different starting points and recorded the number of iterations and the number of function evaluations in Table 4. Figure 4 and Figure 5 show that the q-Gradient Descent method solves about 86% and 79% of the 28 test problems [48] with the least number of iterations and function evaluations, respectively. We can conclude that the proposed method is superior.
Figure 2. Number of function evaluations due to Classical Gradient Descent algorithm for Example 3.
Figure 3. Performance profile for number of function evaluations based on Table 4.
Figure 4. Performance profile for number of iterations based on Table 3.
Figure 5. Graphics of Example 3.
Table 4. Numerical results of 28 test problems.
Sl. No. | Problem Name | Starting Point | q-Gradient Descent (q-GD): NI FE | Classical Gradient Descent (CSD) [36]: NI FE
1Booth ( 3.000 E 01 , 6.000 E + 00 ) T 716715
2 Aluffi Pentini ( 1.000 E + 00 , 7.000 E 01 ) T 3849
3 Bohachevsky ( 8.000 E 01 , 5.000 E 01 ) T 930930
4 Branin ( 1.800 E + 00 , 1.500 E + 00 ) T 19432052
5Colville ( 1.000 E + 00 , 1.000 E + 00 , 1.000 E + 00 , 8.000 E 01 ) T 1633474981012
6Csendes ( 1.300 E + 00 , 2.500 E + 00 ) T 519627
7Ackley2 ( 2.300 E + 00 , 1.600 E + 00 ) T 226236
8Csendes ( 1.600 E + 00 , 1.600 E + 00 ) T 314315
9Cubic ( 1.300 E + 00 , 1.300 E + 00 ) T 61197251645
10Deckkers Aarts ( 2.000 E + 00 , 1.200 E + 00 ) T 84873348
11Dixon Price ( 1.000 E + 00 , 2.000 E + 00 ) T 19442453
12Himmelblau ( 1.000 E + 00 , 4.000 E + 00 ) T 10521136
13Leon ( 8.000 E 01 , 1.400 E + 00 ) T 64153284595
14diagonal4 ( 1.300 E + 00 , 1.500 E + 00 , 1.300 E + 00 , 1.500 E + 00 ) T 4836
15Zakharov ( 2.300 E + 00 , 1.500 E + 00 ) T 818818
16FH1 ( 1.000 E 02 , 1.000 E 02 ) T 19513376
17Zakharov ( 2.000 E + 00 , 2.500 E + 00 ) T 818818
18Three Hump Camel ( 3.000 E + 00 , 5.000 E + 00 ) T 18451945
19Six Hump Camel ( 1.000 E + 00 , 3.000 E + 00 ) T 924924
20Matyas ( 1.000 E + 00 , 3.000 E + 00 ) T 3726
21FH2 ( 1.010 E + 00 , 1.000 E 02 ) T 22482551
22Raydan 1 ( 1.000 E + 00 , 1.000 E + 00 ) T 3999
23Raydan 2 ( 1.000 E + 00 , 1.000 E + 00 ) T 4747
24Hager ( 1.000 E + 00 , 1.000 E + 00 ) T 511612
25Generalized Tridiagonal 1 ( 1.200 E + 00 , 1.700 E + 00 ) T 26742960
26Extended Tridiagonal 1 ( 1.200 E + 00 , 2.700 E + 00 ) T 28633473
27BDEXP ( 1.000 E + 00 , 1.000 E + 00 , 1.000 E + 00 , 1.000 E + 00 ) T 416416
28BDQRTIC ( 1.000 E + 00 , 1 , 1.2 , 1 , 1.3 , 1 ) T 15421644

6. Conclusions

A q-gradient descent optimization algorithm was presented to solve unconstrained optimization problems. The approach replaces the classical gradient with the q-gradient vector. The quasi-Fejér convergence of the algorithm was proved. Because the q-derivative is used, the algorithm may occasionally declare a point that is not the exact solution but is very close to it. Further, the examples and numerical results demonstrate the improvement of the proposed method over the classical method. Our future work will include applying the q-gradient to the accelerated gradient descent method for unconstrained optimization.

Author Contributions

Conceptualization, S.K.M., P.R., S.K.C., M.K.A.K., B.R. and M.E.S.; methodology, S.K.M., P.R., S.K.C., M.K.A.K., B.R. and M.E.S.; software, M.E.S.; validation, M.K.A.K., B.R. and M.E.S.; formal analysis, S.K.M., P.R., S.K.C. and B.R.; investigation, S.K.M., P.R., S.K.C., M.K.A.K., B.R. and M.E.S.; resources, S.K.M., P.R., S.K.C., M.K.A.K., B.R. and M.E.S.; data curation, M.E.S.; writing—original draft preparation, S.K.M., P.R., S.K.C., M.K.A.K., B.R. and M.E.S.; writing—review and editing, M.K.A.K., B.R. and M.E.S.; visualization, M.E.S.; supervision, M.K.A.K. and M.E.S.; project administration, M.K.A.K. and M.E.S.; funding acquisition, M.K.A.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing not applicable.

Acknowledgments

The first author was supported by the Science and Engineering Research Board (Grant No. DST-SERBMTR-2018/000121) and the research grant for faculty (Institute of Eminence-BHU Grant No. 3254). The second author was supported by the Ministry Education, Science and Technological Development, Grant No. 174013. The third author was supported by Bu-Ali Sina University. The authors would like to thank the editors and the reviewers for their helpful comments and suggestions which have led to the improvement of the earlier version of this article.

Conflicts of Interest

The authors declare that they have no competing interests.

References

  1. Cauchy, A. Méthode générale pour la résolution des systemes d’équations simultanées. Comp. Rend. Sci. Paris 1847, 25, 536–538. [Google Scholar]
  2. Mishra, S.K.; Ram, B. Introduction to Unconstrained Optimization with R; Springer Nature: Gateway East, Singapore, 2019. [Google Scholar]
  3. El Mouatasim, A. Fast gradient descent algorithm for image classification with neural networks. Signal Image Video Process. 2020, 14, 1565–1572. [Google Scholar] [CrossRef]
  4. Chen, Y.; Miao, D. Granular regression with a gradient descent method. Inf. Sci. 2020, 537, 246–260. [Google Scholar] [CrossRef]
  5. Pan, H.; Niu, X.; Li, R.; Dou, Y.; Jiang, H. Annealed gradient descent for deep learning. Neurocomputing 2020, 380, 201–211. [Google Scholar] [CrossRef]
  6. Koshak, W.J.; Krider, E.P. A linear method for analyzing lightning field changes. J. Atmos. Sci. 1994, 51, 473–488. [Google Scholar] [CrossRef] [Green Version]
  7. Liao, L.Z.; Qi, L.; Tam, H.W. A gradient-based continuous method for large-scale optimization problems. J. Glob. Optim. 2005, 31, 271–286. [Google Scholar] [CrossRef]
  8. Nesterov, Y. Universal gradient methods for convex optimization problems. Math. Program. 2015, 152, 381–404. [Google Scholar] [CrossRef] [Green Version]
  9. Nezhadhosein, S. A Modified Descent Spectral Conjugate Gradient Method for Unconstrained Optimization. Iran. J. Sci. Technol. Trans. A Sci. 2021, 45, 209–220. [Google Scholar] [CrossRef]
  10. Mishra, S.K.; Samei, M.E.; Chakraborty, S.K.; Ram, B. On q-variant of Dai–Yuan conjugate gradient algorithm for unconstrained optimization problems. Nonlinear Dyn. 2021, 104, 2471–2496. [Google Scholar] [CrossRef]
  11. Jackson, F.H. q-Difference equations. Am. J. Math. 1910, 32, 305–314. [Google Scholar] [CrossRef]
  12. Aral, A.; Gupta, V.; Agarwal, R.P. Applications of q-Calculus in Operator Theory; Springer: New York, NY, USA, 2013. [Google Scholar]
  13. Ernst, T. The different tongues of q-calculus. Proc. Est. Acad. Sci. 2008, 57, 81–99. [Google Scholar] [CrossRef]
  14. Ernst, T. A method for q-calculus. J. Nonlinear Math. Phys. 2003, 10, 487–525. [Google Scholar] [CrossRef] [Green Version]
  15. Srivastava, H. Operators of basic (or q-) calculus and fractional q-calculus and their applications in geometric function theory of complex analysis. Iran. J. Sci. Technol. Trans. A Sci. 2020, 44, 327–344. [Google Scholar] [CrossRef]
  16. Piejko, K.; Sokół, J.; Trąbka-Więcław, K. On q-Calculus and Starlike Functions. Iran. J. Sci. Technol. Trans. A Sci. 2019, 43, 2879–2883. [Google Scholar] [CrossRef] [Green Version]
  17. Mursaleen, M.; Ansari, K.J.; Nasiruzzaman, M. Approximation by q-analogue of Jakimovski–Leviatan operators involving q-Appell polynomials. Iran. J. Sci. Technol. Trans. A Sci. 2017, 41, 891–900. [Google Scholar] [CrossRef]
  18. Dimakis, A.; Müller-Hoissen, F. Quantum mechanics on a lattice and q-deformations. Phys. Lett. B 1992, 295, 242–248. [Google Scholar] [CrossRef]
  19. Lai, K.K.; Mishra, S.K.; Panda, G.; Chakraborty, S.K.; Samei, M.E.; Ram, B. A limited memory q-BFGS algorithm for unconstrained optimization problems. J. Appl. Math. Comput. 2020, 1–20. [Google Scholar] [CrossRef]
  20. Samei, M.E.; Hedayati, V.; Rezapour, S. Existence results for a fraction hybrid differential inclusion with Caputo-Hadamard type fractional derivative. Adv. Differ. Equ. 2019, 2019, 163. [Google Scholar] [CrossRef]
  21. Rezapour, S.; Imran, A.; Hussain, A.; Martínez, F.; Etemad, S.; Kaabar, M.K.A. Condensing Functions and Approximate Endpoint Criterion for the Existence Analysis of Quantum Integro-Difference FBVPs. Symmetry 2021, 13, 469. [Google Scholar] [CrossRef]
  22. Samei, M.E.; Ghaffari, R.; Yao, S.W.; Kaabar, M.K.A.; Martínez, F.; Inc, M. Existence of Solutions for a Singular Fractional q-Differential Equations under Riemann–Liouville Integral Boundary Condition. Symmetry 2021, 13, 1235. [Google Scholar] [CrossRef]
  23. Ntouyas, S.K.; Samei, M.E. Existence and uniqueness of solutions for multi-term fractional q-integro-differential equations via quantum calculus. Adv. Differ. Equ. 2019, 2019, 475. [Google Scholar] [CrossRef]
  24. Samei, M.E.; Hedayati, V.; Ranjbar, G.K. The existence of solution for k-dimensional system of Langevin Hadamard-type fractional differential inclusions with 2k different fractional orders. Mediterr. J. Math. 2020, 17, 37. [Google Scholar] [CrossRef]
  25. Etemad, S.; Rezapour, S.; Samei, M.E. On fractional hybrid and non-hybrid multi-term integro-differential inclusions with three-point integral hybrid boundary conditions. Adv. Differ. Equ. 2020, 2020, 161. [Google Scholar] [CrossRef]
  26. Kaabar, M.K.A.; Martínez, F.; Gómez-Aguilar, J.F.; Ghanbari, B.; Kaplan, M.; Günerhan, H. New approximate analytical solutions for the nonlinear fractional Schrödinger equation with second-order spatio-temporal dispersion via double Laplace transform method. Math. Methods Appl. Sci. 2021, 44, 11138–11156. [Google Scholar] [CrossRef]
  27. Alzabut, J.; Selvam, A.; Dhineshbabu, R.; Kaabar, M.K.A. The Existence, Uniqueness, and Stability Analysis of the Discrete Fractional Three-Point Boundary Value Problem for the Elastic Beam Equation. Symmetry 2021, 13, 789. [Google Scholar] [CrossRef]
  28. Etemad, S.; Souid, M.S.; Telli, B.; Kaabar, M.K.A.; Rezapour, S. Investigation of the neutral fractional differential inclusions of Katugampola-type involving both retarded and advanced arguments via Kuratowski MNC technique. Adv. Differ. Equ. 2021, 2021, 1–20. [Google Scholar] [CrossRef]
  29. Mohammadi, H.; Kaabar, M.K.A.; Alzabut, J.; Selvam, A.G.M.; Rezapour, S. A Complete Model of Crimean-Congo Hemorrhagic Fever (CCHF) Transmission Cycle with Nonlocal Fractional Derivative. J. Funct. Spaces 2021, 2021, 1–12. [Google Scholar] [CrossRef]
  30. Matar, M.M.; Abbas, M.I.; Alzabut, J.; Kaabar, M.K.A.; Etemad, S.; Rezapour, S. Investigation of the p-Laplacian nonperiodic nonlinear boundary value problem via generalized Caputo fractional derivatives. Adv. Differ. Equ. 2021, 2021, 1–18. [Google Scholar] [CrossRef]
  31. Alam, M.; Zada, A.; Popa, I.L.; Kheiryan, A.; Rezapour, S.; Kaabar, M.K.A. A fractional differential equation with multi-point strip boundary condition involving the Caputo fractional derivative and its Hyers–Ulam stability. Bound. Value Probl. 2021, 2021, 1–18. [Google Scholar] [CrossRef]
  32. Rajković, P.M.; Marinković, S.D.; Stanković, M.S. On q-Newton–Kantorovich method for solving systems of equations. Appl. Math. Comput. 2005, 168, 1432–1448. [Google Scholar] [CrossRef]
  33. Rajković, P.M.; Marinković, S.D.; Stanković, M. The q-gradient Method. In Proceedings of the International Symposium “Geometric Function Theory and Applications”, Sofia, Bulgaria, 27–31 August 2010; pp. 240–244. [Google Scholar]
  34. Kiefer, J. Sequential minimax search for a maximum. Proc. Am. Math. Soc. 1953, 4, 502–506. [Google Scholar] [CrossRef]
  35. Soterroni, A.C.; Galski, R.L.; Ramos, F.M. The q-gradient vector for unconstrained continuous optimization problems. In Operations Research Proceedings 2010; Springer: Berlin/Heidelberg, Germany, 2011; pp. 365–370. [Google Scholar]
  36. Burachik, R.; Graña Drummond, L.; Iusem, A.N.; Svaiter, B. Full convergence of the steepest descent method with inexact line searches. Optimization 1995, 32, 137–146. [Google Scholar] [CrossRef]
  37. Kiwiel, K.C.; Murty, K. Convergence of the steepest descent method for minimizing quasiconvex functions. J. Optim. Theory Appl. 1996, 89, 221–226. [Google Scholar] [CrossRef]
  38. Yuan, Y.X. A new stepsize for the steepest descent method. J. Comput. Math. 2006, 24, 149–156. [Google Scholar]
  39. Mishra, S.K.; Panda, G.; Ansary, M.A.T.; Ram, B. On q-Newton’s method for unconstrained multiobjective optimization problems. J. Appl. Math. Comput. 2020, 1–20. [Google Scholar] [CrossRef]
  40. Lai, K.K.; Mishra, S.K.; Ram, B. On q-Quasi-Newton’s Method for Unconstrained Multiobjective Optimization Problems. Mathematics 2020, 8, 616. [Google Scholar] [CrossRef] [Green Version]
  41. Chakraborty, S.K.; Panda, G. Newton like line search method using q-calculus. In International Conference on Mathematics and Computing; Springer: Singapore, 2017; pp. 196–208. [Google Scholar]
  42. Kac, V.; Cheung, P. Quantum Calculus; Springer Science & Business Media: New York, NY, USA, 2001. [Google Scholar]
  43. Rajković, P.; Stanković, M.; Marinković, D.S. Mean value theorems in q-calculus. Mat. Vesn. 2002, 54, 171–178. [Google Scholar]
  44. Andrews, G.E. q-Series: Their Development and Application in Analysis, Number Theory, Combinatorics, Physics and Computer Algebra: Their Development and Application in Analysis, Number Theory, Combinatorics, Physics, and Computer Algebra; Number 66; American Mathematical Soc.: Rhode Island, USA, 1986. [Google Scholar]
  45. Pastor, J.R.; Calleja, P.P.; Trejo, C.A. Análisis Matemático Volúmen: Cálculo Infinitesimal de Varias Variables Aplicaciones; McGraw-Hill: Madrid, Spain, 1963. [Google Scholar]
  46. Dennis, J.E., Jr.; Schnabel, R.B. Numerical Methods for Unconstrained Optimization and Nonlinear Equations; SIAM: Philadelphia, PA, USA, 1996. [Google Scholar]
  47. Lai, K.K.; Mishra, S.K.; Ram, B. A q-conjugate gradient algorithm for unconstrained optimization problems. Pac. J. Optim. 2021, 17, 57–76. [Google Scholar]
  48. Andrei, N. An unconstrained optimization test functions collection. Adv. Model. Optim. 2008, 10, 147–161. [Google Scholar]
  49. Meza, J.C. Steepest descent. Wiley Interdiscip. Rev. Comput. Stat. 2010, 2, 719–722. [Google Scholar] [CrossRef]
  50. Snyman, J.; Hay, A. The spherical quadratic steepest descent (SQSD) method for unconstrained minimization with no explicit line searches. Comput. Math. Appl. 2001, 42, 169–178. [Google Scholar] [CrossRef] [Green Version]
  51. Dolan, E.D.; Moré, J.J. Benchmarking optimization software with performance profiles. Math. Program. 2002, 91, 201–213. [Google Scholar] [CrossRef]
Table 1. Numerical results of Example 2.
q-Gradient Descent
τ | x̄ | ψ(x̄) | GN | IT
2 | (1.001E+00, 5.000E-01, 2.501E-01)^T | -8.750E-01 | 3.683E-07 | 16
5 | (1.002E+00, 2.000E-01, 4.019E-02)^T | -6.200E-01 | 8.270E-07 | 56
10 | (1.003E+00, 1.000E-01, 9.991E-03)^T | -5.550E-01 | 8.859E-07 | 97
20 | (1.004E+00, 5.000E-02, 2.499E-03)^T | -5.262E-01 | 9.984E-07 | 139
50 | (1.004E+00, 2.000E-02, 3.999E-04)^T | -5.102E-01 | 9.913E-07 | 353
Classical Gradient Descent [36]
τ | x̄ | ψ(x̄) | GN | IT
2 | (1.001E+00, 5.000E-01, 2.498E-01)^T | -8.750E-01 | 4.068E-07 | 15
5 | (1.003E+00, 2.000E-01, 3.993E-02)^T | -6.200E-01 | 8.345E-07 | 71
10 | (1.004E+00, 1.000E-01, 1.012E-02)^T | -5.550E-01 | 9.782E-07 | 134
20 | (1.006E+00, 5.000E-02, 2.497E-03)^T | -5.262E-01 | 9.640E-07 | 215
50 | (1.010E+00, 2.000E-02, 4.272E-04)^T | -5.101E-01 | 9.843E-07 | 588
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
