Article

Robust and Sparse Portfolio: Optimization Models and Algorithms

1
School of Mathematics and Statistics, Beijing Jiaotong University, Beijing 100044, China
2
Department of Economic Management, Shijiazhuang Institute of Railway Technology, Shijiazhuang 050000, China
3
Personnel Department, Shijiazhuang University, Shijiazhuang 050035, China
*
Author to whom correspondence should be addressed.
Mathematics 2023, 11(24), 4925; https://doi.org/10.3390/math11244925
Submission received: 8 November 2023 / Revised: 7 December 2023 / Accepted: 8 December 2023 / Published: 11 December 2023

Abstract

The robust and sparse portfolio selection problem is one of the most popular and frequently studied problems in the optimization and financial literature. Taking the uncertainty of the parameters into account, the goal is to construct a sparse portfolio with low volatility and decent returns, subject to other investment constraints. In this paper, we propose a new portfolio selection model, which considers the perturbation in the asset return matrix and the parameter uncertainty in the expected asset return. We define three types of stationary points of the penalty problem: the Karush–Kuhn–Tucker point, the strong Karush–Kuhn–Tucker point, and the partial minimizer. We analyze the relationship between these stationary points and the local/global minimizer of the penalty model under mild conditions. We design a penalty alternating-direction method to obtain the solutions. Extensive numerical experiments on seven real-world datasets, in which our model is compared with several existing portfolio models, demonstrate its robustness and its effectiveness in generating lower volatility.

1. Introduction

In 1952, Harry M. Markowitz [1] published the classic “Portfolio Selection” in The Journal of Finance, which ushered in a new era of financial mathematical analysis. Markowitz pointed out that investors who care about return and risk should hold portfolios located on the mean-variance efficient frontier, which leads to the famous mean-variance portfolio (MVP) selection model. Since then, many portfolio selection strategies have been proposed by referring to the MVP and its variants. However, MVPs exhibit instability due to estimation errors in the input parameters [2], especially in large-scale settings. This instability means that a solution computed from a fluctuating sample may be optimal for that sample, but not optimal from the perspective of risk. For more comments on this model, we refer to [3,4,5,6] and the references therein.
This paper focuses on sample fluctuations and parameter uncertainty in the portfolio selection problem. We first review some relevant methods for handling parameter uncertainty. Among the various approaches, an attractive one is the robust portfolio (RP), which corresponds to robust optimization, since it does not use any information about the probability distribution of the uncertain parameters. The RP we consider is a conservative approach that minimizes the loss function over an uncertainty set, that is, it solves the problem under the worst-case scenario. In the last two decades, robust portfolio selection problems have attracted increasing interest from researchers, who have constructed well-known optimal portfolios from the perspective of robust optimization [7,8,9,10]. Along this line, Goldfarb and Iyengar [11] formulated and solved RP problems. They introduced uncertainty structures for the input parameters and showed that the corresponding RP problems can be reformulated as second-order cone programs and that these uncertainty structures correspond to the confidence regions used to estimate the market parameters. Given the uncertainty in the mean and covariance matrix of the asset return, Lobo and Boyd [12] computed the maximum risk of a portfolio in a numerically efficient way. They proved that this is a semi-definite programming problem that is readily solved by interior-point methods for convex optimization. Min et al. [13] proposed hybrid RP models under ellipsoidal uncertainty sets and considered both the best-case and the worst-case counterparts. Won and Kim [14] considered RP problems involving a trade-off between the worst-case utility and the worst-case regret, that is, the largest difference between the best utility achievable under the model and that achieved by a given portfolio. They showed that the entire optimal trade-off curve can be found by solving a series of semi-definite programs under the ellipsoidal uncertainty model.
Some research works [15,16] concentrated on the application of robust optimization on basic mean-variance, mean value-at-risk (mean-VaR), and mean conditional-value-at-risk (mean-CVaR) problems, but did not consider variants of the problem like robust index tracking, robust and sparse portfolio selection problems, and so on. More relevant works can be found in [17,18,19,20] and the references therein.
RPs have a wide range of applications, and one essential step in all of them is the construction of uncertainty sets. Two types of uncertainty sets are widely used, namely the box uncertainty set and the ellipsoidal uncertainty set. Tütüncü and Koenig [21] used symmetric box uncertainty sets defined as $U_\mu = \{\mu \in \mathbb{R}^n \mid \mu^L \le \mu \le \mu^U\}$ and $U_\Sigma = \{\Sigma \in \mathbb{R}^{n \times n} \mid \Sigma^L \le \Sigma \le \Sigma^U,\ \Sigma \succeq 0\}$, where $\mu^L \in \mathbb{R}^n$ and $\mu^U \in \mathbb{R}^n$ are the lower and upper bounds of the mean vector $\mu$, $\Sigma^L \in \mathbb{R}^{n \times n}$ and $\Sigma^U \in \mathbb{R}^{n \times n}$ are the lower and upper bounds of the covariance matrix $\Sigma$, respectively, and $\Sigma \succeq 0$ means that $\Sigma$ is positive semi-definite. Khodamoradi et al. [22] used box uncertainty sets for a cardinality-constrained mean-variance portfolio problem that allows short selling. Swain and Ojha [10] analyzed the robust versions of the mean-variance and mean-semi-variance portfolio problems with box uncertainty sets. Alternatively, Fabozzi et al. [23] defined an ellipsoidal uncertainty set for the expected asset return as $U_\mu = \{\mu \mid (\mu - \bar{\mu})^\top \Sigma^{-1} (\mu - \bar{\mu}) \le \epsilon^2\}$, where $\bar{\mu}$ is the nominal asset return and $\epsilon^2$ is a small scalar that controls the size of the uncertainty set. However, they did not consider the uncertainty of the covariance matrix; thus, the solution was robust only against perturbations in the asset return vector. Pınar [24] developed a multi-period robust mean-variance portfolio problem with an ellipsoidal uncertainty set while allowing short selling. It is well known that the portfolio is more sensitive to estimation errors in the mean vector than to those in the covariance matrix. Moreover, dealing with the uncertainty in the covariance matrix is more complicated than dealing with the uncertainty in the mean vector. Thus, in this paper, we consider two types of uncertainty sets for the mean vector.
Financial data have some remarkable features, such as multicollinearity and heavy tails. Therefore, the perturbations of these data should not be underestimated. Following Brodie et al. [2], who recast the MVP as a Lasso-type portfolio, we consider the perturbations in the asset return matrix and design the corresponding uncertainty set. In addition, from the perspective of transaction costs and administrative expenses, more assets are not always better. Therefore, it is also necessary to consider sparsity when constructing a portfolio [25,26,27]. After these discussions, a natural question follows: How do we find better RPs that not only reduce the undesired impact of parameter uncertainty, but also improve sparsity and reduce cost?
Following the above considerations, this paper proposes a sparsity-constrained robust portfolio optimization model with parameter uncertainty and data perturbation. Specifically, we consider the perturbation in the asset return matrix and the parameter uncertainty in the expected asset return. By using the equivalence of robustness and regularization, the Lasso-type objective function can be converted into the sum of a square-root loss and the $\ell_1$ norm. We consider two kinds of uncertainty sets: the box uncertainty set and the ellipsoidal uncertainty set. For the penalty model, we define three types of stationary points: the Karush–Kuhn–Tucker (KKT) point, the strong KKT point, and the partial minimizer. Under a mild constraint qualification (CQ), we prove that any local minimizer of the penalty model is a KKT point. Moreover, under Slater’s CQ, any global minimizer of the penalty model is proven to be a partial minimizer and, hence, a strong KKT point. Finally, a penalty alternating direction method is proposed to obtain a portfolio, and its convergence is established. We confirm the effectiveness of our approach by comparing it with nine widely studied portfolio models on seven real-world data sets. The numerical results show that the proposed portfolios have lower volatility, that is, less risk. Moreover, our portfolio strategies can yield higher Sharpe ratios when appropriate parameters are selected.
This paper is organized as follows. Some notations and preliminaries used in this paper are given in the next section. The model of robust and sparse portfolios and the analysis of their optimization theory are stated in Section 3. Two types of uncertainty sets of mean vectors are presented in Section 4. The optimization algorithm named the penalty alternating direction method is established in Section 5. Extensive numerical experiments are conducted in Section 6. Conclusions are drawn in Section 7.

2. Notation and Preliminaries

We use $\mathbb{R}$, $\mathbb{R}^n$, and $\mathbb{R}^{m \times n}$ to denote the set of real numbers and the $n$-dimensional and $m \times n$-dimensional Euclidean spaces, respectively. We use small letters to denote vectors, e.g., $w \in \mathbb{R}^n$ is a column vector with $n$ elements $w_i$, $i = 1, \dots, n$. The transpose of $w$ is denoted as $w^\top$, which is a row vector. In particular, $1_n$ is the vector of all ones of size $n$. For a vector $a \in \mathbb{R}^n$, we define its absolute value vector by $|a| := (|a_1|, \dots, |a_n|)$. We use capital letters to denote matrices, e.g., $A \in \mathbb{R}^{m \times n}$, and $a_{ij}$ denotes the $(i,j)$-th entry of $A$. Given an index set $\Gamma \subseteq \{1, \dots, n\}$, $a_\Gamma$ denotes the sub-vector of $a$ indexed by $\Gamma$. We write the Euclidean norm of $w$ as $\|w\|_2$, the $\ell_1$ norm as $\|w\|_1$, and the infinity norm as $\|w\|_\infty$. For two vectors $a \in \mathbb{R}^n$ and $b \in \mathbb{R}^n$, $\langle a, b \rangle$ denotes the standard inner product.
We now provide some existing results from optimization that are crucial for the theory of this paper. For convenience, we define the following convex program:
$$\begin{aligned} \min_{x \in \mathbb{R}^n}\ & f(x) \\ \text{s.t.}\ & g_i(x) \ge 0,\ i = 1, \dots, m, \\ & x \in \Omega, \end{aligned} \tag{1}$$
where $\Omega$ is a nonempty convex set, $f$ is a convex function, and each $g_i$ is a concave function. For problem (1), Slater’s CQ builds a bridge between its solution and the KKT point (the point satisfying the conditions in Theorem 1).
Definition 1 ([28], Definition 4.17). Slater’s CQ holds in problem (1) if there exists $u \in \Omega$ such that $g_i(u) > 0$ for all $i = 1, \dots, m$.
Theorem 1 ([28], Theorem 4.18). Suppose that Slater’s CQ holds in problem (1). Then, $x^*$ is an optimal solution to problem (1) if and only if there exist non-negative Lagrange multipliers $(\lambda_1, \dots, \lambda_m) \in \mathbb{R}^m$ such that
$$0 \in \partial f(x^*) - \sum_{i=1}^{m} \lambda_i\, \partial g_i(x^*) + N(x^*; \Omega)$$
and $\lambda_i g_i(x^*) = 0$ for all $i = 1, \dots, m$, where $\partial f(x^*)$ denotes the classical sub-differential set ([28], Definition 2.30) of $f$ at $x^*$ and $N(x^*; \#)$ denotes the classical normal cone ([28], Definition 2.9) of a set $\#$ at $x^*$.
We also introduce some crucial terminology and results for sparse nonlinear programming:
$$\begin{aligned} \min_{x \in \mathbb{R}^n}\ & f(x) \\ \text{s.t.}\ & g_i(x) \ge 0,\ i = 1, \dots, m, \\ & h_j(x) = 0,\ j = 1, \dots, l, \\ & \|x\|_0 \le s, \end{aligned} \tag{2}$$
where $f$ is a convex function and the $g_i$ and $h_j$ are continuously differentiable. A restricted linear independence constraint qualification (R-LICQ) for the sparse nonlinear program (2) was defined in [29] as follows.
Definition 2 ([29], Definition 2.4). We say that the R-LICQ holds at a feasible point $x^*$ of problem (2) if:
  • When $\|x^*\|_0 = s$, $\nabla g_i(x^*)$, $i \in I(x^*)$, and $\nabla h_j(x^*)$, $j = 1, \dots, l$, are linearly independent.
  • When $\|x^*\|_0 < s$, $\nabla_{\Gamma^*} g_i(x^*)$, $i \in I(x^*)$, and $\nabla_{\Gamma^*} h_j(x^*)$, $j = 1, \dots, l$, are linearly independent.
Based on the R-LICQ, the following decomposition result holds.
Theorem 2 ([29], Proposition 2.5). Let $x^*$ be a feasible point of problem (2), and let the R-LICQ hold at $x^*$. Then,
$$\hat{N}(x^*; S \cap Q) = \hat{N}(x^*; S) + \hat{N}(x^*; Q),$$
where $S := \{x : \|x\|_0 \le s\}$, $Q := \{x : g_i(x) \ge 0,\ i = 1, \dots, m,\ h_j(x) = 0,\ j = 1, \dots, l\}$, and $\hat{N}(x^*; \#)$ denotes the Fréchet normal cone ([30], Definition 6.3) of a set $\#$ at $x^*$, which reduces to the classical normal cone described in Theorem 1 if $\#$ is a convex set.
For the partial problem (10) of the portfolio model (6) in Section 3.1, the R-LICQ holds automatically at $x^*$, where $m = 0$, $l = 1$, and $h(x) := 1^\top x - 1$. Next, we establish the relationship between the local minimizer of problem (2) and its KKT point (the point satisfying the KKT system in Theorem 3).
Theorem 3. Suppose that $x^*$ is a local minimizer of problem (2) and the R-LICQ holds at $x^*$. Then, there exist Lagrange multipliers $(\lambda_1^*, \dots, \lambda_m^*) \in \mathbb{R}_+^m$ and $(\mu_1^*, \dots, \mu_l^*) \in \mathbb{R}^l$ such that
$$\begin{cases} 0 \in \partial f(x^*) - \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) + \sum_{j=1}^{l} \mu_j^* \nabla h_j(x^*) + \hat{N}(x^*; S), \\ g_i(x^*) \ge 0,\ \lambda_i^* g_i(x^*) = 0,\ i = 1, \dots, m, \\ h_j(x^*) = 0,\ j = 1, \dots, l, \\ \|x^*\|_0 \le s. \end{cases}$$
Proof. 
It follows from Theorem 6.12 of [30] that
$$0 \in \partial f(x^*) + \hat{N}(x^*; S \cap Q).$$
Combining Theorem 2 with the proof of Theorem 3.2 of [29], we obtain the result.    □
This result differs from Theorem 3.2 of [29] in that we allow the objective function of problem (2) to be non-differentiable. The analysis is otherwise the same as that of Theorem 3.2 of [29].

3. Model and Optimization Theory

In this section, we first propose a robust and sparse portfolio model (4) with an uncertainty set constraint and a sparsity constraint. For the convenience of numerical computation, we consider its $\ell_1$ norm penalization variant (6). We define three types of stationary points of the penalization variant: the KKT point, the strong KKT point, and the partial minimizer. The relationships between these stationary points and the local/global minimizers of the penalization problem (6) are established in Section 3.2.

3.1. Robust and Sparse Portfolio Model

Consider $n$ risky assets, and denote the asset returns at period $t$ by $r_t = (r_1, \dots, r_n) \in \mathbb{R}^n$. The expected return vector of the assets is denoted by $E(r_t) = \mu$, and the covariance matrix is denoted by $E[(r_t - \mu)(r_t - \mu)^\top] = V$. In the traditional Markowitz portfolio selection problem, the portfolio construction is based on the trade-off between risk and return. For a given level of acceptable portfolio return $\rho = w^\top \mu$, the mean-variance optimization can be formulated as
$$\begin{aligned} \min_{w \in \mathbb{R}^n}\ & \frac{1}{2} w^\top V w \\ \text{s.t.}\ & w^\top \mu = \rho,\ w^\top 1_n = 1, \end{aligned}$$
and its aim is to find a portfolio that has minimal risk for a given expected return. A significant model that has been developed from the Markowitz model is the Lasso-type portfolio proposed by Brodie et al. [2], which is given as:
$$\begin{aligned} \min_{w \in \mathbb{R}^n}\ & \frac{1}{T} \|\rho 1_T - R w\|_2^2 + \alpha \|w\|_1 \\ \text{s.t.}\ & w^\top \bar{\mu} = \rho,\ w^\top 1_n = 1, \end{aligned}$$
where $\bar{\mu} = \frac{1}{T} \sum_{t=1}^{T} r_t$, $\alpha$ is the penalty parameter, and $R \in \mathbb{R}^{T \times n}$ is the asset return matrix. Brodie et al. [2] confirmed that the $\ell_1$ norm can produce a sparse portfolio and that this approach stabilizes the problem. In this paper, we start with the square-root Lasso-type portfolio and additionally take into account the perturbation in the asset return matrix $R$ and the parameter uncertainty in $\mu$. We propose the following robust and sparse portfolio selection model:
$$\begin{aligned} \min_{w \in \mathbb{R}^n} \max_{\Delta \in U_0}\ & \|\rho 1_T - (R + \Delta) w\|_2 \\ \text{s.t.}\ & \min_{\mu \in U} w^\top \mu \ge \rho, \\ & w^\top 1_n = 1,\ w \ge 0,\ \|w\|_0 \le s, \end{aligned} \tag{4}$$
where $\Delta$ is the data perturbation matrix with columns $\Delta_i$ and $U_0 = \{\Delta \in \mathbb{R}^{T \times n} : \|\Delta_i\|_2 \le \alpha,\ i \in \{1, \dots, n\}\}$. The uncertainty set of the asset return is denoted by $U$, and we discuss two choices of $U$ in Section 4.
The authors of [31] (Chapter 2) showed the equivalence of robustness and regularization. Specifically, they precisely characterized the conditions on the model of uncertainty and the loss function under which robustness is equivalent to regularization for linear regression.
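Definition 3. Let $g : \mathbb{R}^T \to \mathbb{R}$ and $h : \mathbb{R}^n \to \mathbb{R}$ be norms. Then the induced norm $\|\cdot\|_{(h,g)}$ is defined as
$$\|\Delta\|_{(h,g)} = \max_{w \in \mathbb{R}^n,\ w \ne 0} \frac{g(\Delta w)}{h(w)}.$$
Theorem 4 ([31], Chapter 2). If $r, q \in [1, \infty]$, then
$$\min_{w} \max_{\Delta \in U_{(q,r)}} \|y - (R + \Delta) w\|_r = \min_{w} \|y - R w\|_r + \alpha \|w\|_q,$$
where $U_{(q,r)} = \{\Delta : \|\Delta\|_{(q,r)} \le \alpha\}$. Moreover, if $U_0 = \{\Delta : \|\Delta_i\|_2 \le \alpha,\ i \in \{1, \dots, n\}\}$, then $U_{(1,2)} = U_0$, and this implies
$$\min_{w} \max_{\|\Delta_i\|_2 \le \alpha} \|y - (R + \Delta) w\|_2 = \min_{w} \|y - R w\|_2 + \alpha \|w\|_1.$$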
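The inner maximization in Theorem 4 can be checked numerically for a fixed $w$. The following is a minimal sketch, not the authors' code: it builds one perturbation whose columns have 2-norm at most $\alpha$ and verifies that it attains $\|y - Rw\|_2 + \alpha\|w\|_1$. The data, the helper names `matvec`, `norm2`, and `worst_case_delta`, and the assumption of a nonzero residual are all illustrative.

```python
import math

def matvec(M, x):
    # Matrix-vector product for a list-of-rows matrix.
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def norm2(x):
    return math.sqrt(sum(t * t for t in x))

def worst_case_delta(R, y, w, alpha):
    # Residual direction u = (y - R w) / ||y - R w||_2 (assumes a nonzero residual).
    res = [yi - ri for yi, ri in zip(y, matvec(R, w))]
    nr = norm2(res)
    u = [t / nr for t in res]
    sign = [(x > 0) - (x < 0) for x in w]
    # Column i of Delta is -alpha * sign(w_i) * u, so its 2-norm is at most alpha.
    return [[-alpha * s * ut for s in sign] for ut in u]

# Illustrative data (made up): T = 4 periods, n = 3 assets.
R = [[0.02, -0.01, 0.03],
     [0.01, 0.04, -0.02],
     [-0.03, 0.02, 0.01],
     [0.05, 0.00, 0.02]]
y = [0.01, 0.01, 0.01, 0.01]
w = [0.5, 0.3, 0.2]
alpha = 0.1

Delta = worst_case_delta(R, y, w, alpha)
R_pert = [[r + d for r, d in zip(r_row, d_row)] for r_row, d_row in zip(R, Delta)]
robust_value = norm2([yi - ri for yi, ri in zip(y, matvec(R_pert, w))])
regularized_value = (norm2([yi - ri for yi, ri in zip(y, matvec(R, w))])
                     + alpha * sum(abs(x) for x in w))
column_norms = [norm2([row[i] for row in Delta]) for i in range(len(w))]
```

The check covers only the attainment of the inner maximum at one $w$; the outer minimization over $w$ is left to the algorithms of Section 5.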
From the relationship between robustness and regularization, problem (4) can be rewritten as
$$\begin{aligned} \min_{w \in \mathbb{R}^n}\ & \|\rho 1_T - R w\|_2 + \alpha \|w\|_1 \\ \text{s.t.}\ & \min_{\mu \in U} w^\top \mu \ge \rho, \\ & w^\top 1_n = 1,\ w \ge 0,\ \|w\|_0 \le s. \end{aligned} \tag{5}$$
Under this transformation, problem (5) inherits the robustness of problem (4). We plan to use an alternating penalty method to solve problem (5). To make the alternating penalty method applicable, we add a copy constraint $w = v$ to problem (5) and then move it to the objective function by means of an $\ell_1$ norm penalty. The resulting penalization formulation is
$$\begin{aligned} \min_{w, v \in \mathbb{R}^n}\ & f(w, v) := \|\rho 1_T - R w\|_2 + \alpha \|w\|_1 + \beta \|w - v\|_1 \\ \text{s.t.}\ & w \in \Omega_1 := \{w \mid \min_{\mu \in U} w^\top \mu \ge \rho,\ w \ge 0\}, \\ & v \in \Omega_2 := \{v \mid v^\top 1_n = 1,\ \|v\|_0 \le s\}. \end{aligned} \tag{6}$$
We conduct its optimality analysis in the next subsection.

3.2. Optimization Theory

We now analyze the optimality of the penalization problem (6). The objective function of problem (6) is lower semi-continuous and coercive, and Theorem 5 below establishes the existence of optimal solutions. The selection of the uncertainty set $U$ is discussed in Section 4.
This subsection provides several theoretical results for problem (6), including the existence of a solution and three classes of first-order necessary optimality conditions.
Theorem 5. For any given $\alpha \in \mathbb{R}_+$ and $\beta \in \mathbb{R}_+$, the optimal solutions of problem (6) can be attained.
Proof. 
It is clear that $f$ is a proper, closed, and coercive function and $\Omega_1 \times \Omega_2$ is a nonempty closed set satisfying $\Omega_1 \times \Omega_2 \subseteq \mathrm{dom}(f)$. It follows from Theorem 2.14 of [32] that this theorem holds.    □
We now define a class of KKT points of problem (6). For convenience and generality, we write $\min_{\mu \in U} w^\top \mu \ge \rho$ as $g(w) \ge 0$ and suppose that $g$ is a concave function that is not necessarily differentiable. Indeed, the quadratic uncertainty set and the absolute uncertainty set introduced in Section 4 satisfy these assumptions.
Definition 4. The point $(w^*, v^*) \in \Omega_1 \times \Omega_2$ is called a KKT point of problem (6) if there exist Lagrange multipliers $\lambda_1^* \in \mathbb{R}_+$ and $\lambda_2^* \in \mathbb{R}$ such that the following system holds:
$$\begin{cases} 0 \in \partial_w f(w^*, v^*) - \lambda_1^* \partial_w g(w^*) + N(w^*; \mathbb{R}_+^n), \\ 0 \in \partial_v f(w^*, v^*) - \lambda_2^* 1_n + \hat{N}(v^*; S), \\ g(w^*) \ge 0,\ \lambda_1^* g(w^*) = 0, \\ 1_n^\top v^* = 1,\ \|v^*\|_0 \le s. \end{cases} \tag{7}$$
The functions $g$ corresponding to the quadratic and absolute uncertainty sets introduced in Section 4 are concave, possibly non-differentiable, and satisfy Slater’s CQ automatically. Nevertheless, we state Slater’s CQ as an assumption of Theorem 6 for the sake of generality. Moreover, as stated in Section 2, the R-LICQ of $\Omega_2$ holds at every point. Hence, Slater’s CQ alone suffices to establish the relationship between the local minimizers and the KKT points of problem (6).
Theorem 6. Let $(w^*, v^*) \in \Omega_1 \times \Omega_2$ be a local minimizer of problem (6). If Slater’s CQ holds on $\Omega_1$, then it is a KKT point of problem (6).
Proof. 
On the one hand, since $(w^*, v^*) \in \Omega_1 \times \Omega_2$ is a local minimizer of problem (6), $w^*$ is a local minimizer of the following optimization problem:
$$\begin{aligned} \min_{w \in \mathbb{R}^n}\ & f(w, v^*) = \|\rho 1_T - R w\|_2 + \alpha \|w\|_1 + \beta \|w - v^*\|_1 \\ \text{s.t.}\ & w \in \Omega_1. \end{aligned} \tag{8}$$
Notice that $f(w, v^*)$ is a convex function of $w$ and $\Omega_1$ is a convex set. Then, problem (8) is a convex optimization problem. Since Slater’s CQ holds on $\Omega_1$, it follows from Theorem 1 that there exists a Lagrange multiplier $\lambda_1^* \in \mathbb{R}_+$ such that
$$0 \in \partial_w f(w^*, v^*) - \lambda_1^* \partial_w g(w^*) + N(w^*; \mathbb{R}_+^n),\quad g(w^*) \ge 0,\quad \lambda_1^* g(w^*) = 0. \tag{9}$$
On the other hand, $v^*$ is a local minimizer of the following optimization problem:
$$\begin{aligned} \min_{v \in \mathbb{R}^n}\ & f(w^*, v) = \|\rho 1_T - R w^*\|_2 + \alpha \|w^*\|_1 + \beta \|w^* - v\|_1 \\ \text{s.t.}\ & v \in \Omega_2. \end{aligned} \tag{10}$$
Since the R-LICQ of $\Omega_2$ holds at every point, it follows from Theorem 3 that there exists a Lagrange multiplier $\lambda_2^* \in \mathbb{R}$ such that
$$0 \in \partial_v f(w^*, v^*) - \lambda_2^* 1_n + \hat{N}(v^*; S),\quad 1_n^\top v^* = 1,\quad \|v^*\|_0 \le s. \tag{11}$$
Combining the systems (9) and (11), we conclude that the theorem holds.    □
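Moreover, since the first two terms of $f(w^*, v)$ do not depend on $v$, problem (10) can be simply written as
$$\begin{aligned} \min_{v \in \mathbb{R}^n}\ & \|w^* - v\|_1 \\ \text{s.t.}\ & v \in \Omega_2, \end{aligned}$$
and it has a closed-form solution (see [33]), i.e.,
$$v_i^* = \begin{cases} \dfrac{w_i^*}{(w_s^*)^\top 1_s}, & \text{if } i \in I_s^*, \\ 0, & \text{otherwise}, \end{cases} \tag{12}$$
where $I_s^*$ collects the indices of the $s$ largest entries of $w^*$ in absolute value, $w_s^*$ denotes the corresponding sub-vector of $w^*$, and the entries are ordered by the $i$-th largest absolute value among the $n$ elements of $w^*$. Thus, we can define a class of strong KKT points of problem (6) as follows.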
Definition 5. The point $(w^*, v^*) \in \Omega_1 \times \Omega_2$ is called a strong KKT point of problem (6) if there exists a Lagrange multiplier $\lambda^* \in \mathbb{R}_+$ such that the following system holds:
$$\begin{cases} 0 \in \partial_w f(w^*, v^*) - \lambda^* \partial_w g(w^*) + N(w^*; \mathbb{R}_+^n), \\ g(w^*) \ge 0,\ \lambda^* g(w^*) = 0, \\ v_i^* = \begin{cases} \dfrac{w_i^*}{(w_s^*)^\top 1_s}, & \text{if } i \in I_s^*, \\ 0, & \text{otherwise}. \end{cases} \end{cases} \tag{13}$$
It is easy to prove that, if $(w^*, v^*)$ is a strong KKT point of problem (6), then it is a KKT point of problem (6). The following result provides the relationship between the global minimizers and the strong KKT points of problem (6).
Theorem 7. Let $(w^*, v^*) \in \Omega_1 \times \Omega_2$ be a global minimizer of problem (6). If Slater’s CQ holds on $\Omega_1$, then it is a strong KKT point of problem (6).
Proof. 
Since a global minimizer is also a local minimizer, the $w^*$-part of (13) follows from (7). We only need to discuss the $v^*$-part of (13). Since $v^*$ is a global minimizer of (10), it follows from (12) that the $v^*$-part of (13) holds.    □
Note that the local minimizer of the problem (6) cannot be guaranteed to be a strong KKT point.
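The $v$-part of a strong KKT point is given by the closed-form update (12). A minimal sketch of that update follows, under the reading that (12) keeps the $s$ entries of $w$ with the largest absolute values and rescales them to sum to one; the function name `v_update` and the sample vector are hypothetical, and the zero-sum guard is an addition of this sketch.

```python
def v_update(w, s):
    # Indices of the s entries of w with the largest absolute values.
    idx = sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)[:s]
    total = sum(w[i] for i in idx)
    if total == 0:
        # Guard added in this sketch: rescaling is undefined in this case.
        raise ValueError("selected entries sum to zero")
    v = [0.0] * len(w)
    for i in idx:
        v[i] = w[i] / total  # rescale so that the kept entries sum to one
    return v

# Illustrative input (made up): five assets, at most three held.
v = v_update([0.40, 0.05, 0.30, 0.00, 0.25], s=3)
```

By construction the output satisfies both constraints defining $\Omega_2$: the budget constraint $v^\top 1_n = 1$ and the cardinality bound $\|v\|_0 \le s$.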
Finally, we introduce the third type of stationary point of problem (6), which is called the partial minimizer.
Definition 6. The point $(w^*, v^*) \in \Omega_1 \times \Omega_2$ is called a partial minimizer of problem (6) if it satisfies
$$f(w^*, v^*) \le f(w, v^*),\ \forall w \in \Omega_1, \qquad f(w^*, v^*) \le f(w^*, v),\ \forall v \in \Omega_2.$$
Clearly, any global minimizer of problem (6) is a partial minimizer. Moreover, on the one hand, the partial problem (8) is a convex optimization problem, and Slater’s CQ ensures that its KKT points and global minimizers coincide. On the other hand, the partial problem (10) has a closed-form solution. Thus, the equivalence between the strong KKT points and the partial minimizers of problem (6) can be established under Slater’s CQ.
Theorem 8. Let $(w^*, v^*) \in \Omega_1 \times \Omega_2$ be a feasible point of problem (6). Suppose that Slater’s CQ holds on $\Omega_1$. Then, $(w^*, v^*)$ is a partial minimizer of problem (6) if and only if $(w^*, v^*)$ is a strong KKT point of problem (6).
Proof. 
Suppose that $(w^*, v^*)$ is a strong KKT point of problem (6). Then
$$0 \in \partial_w f(w^*, v^*) - \lambda^* \partial_w g(w^*) + N(w^*; \mathbb{R}_+^n),\quad g(w^*) \ge 0,\quad \text{and}\quad \lambda^* g(w^*) = 0.$$
Since Slater’s CQ holds on $\Omega_1$, Theorem 1 implies that $w^*$ is a global minimizer of the convex problem (8). Then, we have that $f(w^*, v^*) \le f(w, v^*)$ for all $w \in \Omega_1$. Moreover, it follows from the definition of the strong KKT point of problem (6) that $v^*$ is a global minimizer of problem (10). Then, we have that $f(w^*, v^*) \le f(w^*, v)$ for all $v \in \Omega_2$. Thus, $(w^*, v^*)$ is a partial minimizer of problem (6). The converse follows by reversing these arguments.    □

4. The Uncertainty Set U

In Section 3.2, we rewrote the uncertainty set constraint as $g(w) \ge 0$, where $g$ is a concave function that is not necessarily differentiable. This section introduces two mainstream formulations of the uncertainty set for the asset mean return vector $\mu$ (see [34]), which correspond to the quadratic uncertainty set and the absolute uncertainty set, respectively.

4.1. The Quadratic Uncertainty Set

The first one is the quadratic formulation, $U = \{\mu \mid (\mu - \bar{\mu})^\top \Omega^{-1} (\mu - \bar{\mu}) \le \kappa^2\}$, where $\bar{\mu}$ is the nominal expected return and $\kappa$ bounds the estimation error. Assume that asset returns are independent and identically distributed and that $\mu - \bar{\mu}$ follows a normal distribution with mean $0$ and covariance matrix $\Omega$, where $\Omega$ is the covariance matrix of errors in the expected asset return. Yin et al. [35] discussed the choice of the uncertainty matrix $\Omega$ in the quadratic uncertainty set and proposed selection criteria. In the quadratic uncertainty case, $\min_{\mu \in U} w^\top \mu$ in problem (5) is equivalent to the following problem:
$$\max_{\mu \in U}\ w^\top \bar{\mu} - w^\top \mu.$$
Solving the above problem, we obtain
$$\mu = \bar{\mu} - \sqrt{\frac{\kappa^2}{w^\top \Omega w}}\, \Omega w.$$
Then, problem (5) is rewritten as
$$\begin{aligned} \min_{w \in \mathbb{R}^n}\ & \|\rho 1_T - R w\|_2 + \alpha \|w\|_1 \\ \text{s.t.}\ & \bar{\mu}^\top w - \kappa \sqrt{w^\top \Omega w} \ge \rho, \\ & w^\top 1_n = 1,\ w \ge 0,\ \|w\|_0 \le s. \end{aligned} \tag{14}$$
Here, $g(w) = \bar{\mu}^\top w - \kappa \sqrt{w^\top \Omega w} - \rho$. The penalization form of problem (14) can be written as
$$\begin{aligned} \min_{w, v \in \mathbb{R}^n}\ & \|\rho 1_T - R w\|_2 + \alpha \|w\|_1 + \beta \|w - v\|_1 \\ \text{s.t.}\ & w \in \Omega_1 = \{w \mid \kappa \|\Omega^{1/2} w\|_2 \le \bar{\mu}^\top w - \rho,\ w \ge 0\}, \\ & v \in \Omega_2 = \{v \mid v^\top 1_n = 1,\ \|v\|_0 \le s\}. \end{aligned}$$
Following the proposal of Yin et al. [35], we take the uncertainty matrix $\Omega$ from the diagonal of $V$, i.e., $\mathrm{diag}(V)$. This choice is expected to reduce the sensitivity to the inputs while keeping the original volatility unchanged.
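The worst-case mean of the quadratic uncertainty set can be verified numerically. The sketch below assumes a diagonal $\Omega$ stored as a list of its diagonal entries and uses made-up numbers; the function names are hypothetical. It checks that the worst-case $\mu$ lies on the boundary of the ellipsoid and that $w^\top \mu$ equals $\bar{\mu}^\top w - \kappa \sqrt{w^\top \Omega w}$.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def worst_case_mean(mu_bar, omega_diag, w, kappa):
    # mu = mu_bar - kappa * Omega w / sqrt(w' Omega w), for a diagonal Omega.
    omega_w = [o * x for o, x in zip(omega_diag, w)]
    scale = math.sqrt(dot(w, omega_w))
    return [m - kappa * ow / scale for m, ow in zip(mu_bar, omega_w)]

# Illustrative inputs (made up).
mu_bar = [0.010, 0.015, 0.008]
omega_diag = [0.0004, 0.0009, 0.0001]
w = [0.5, 0.3, 0.2]
kappa = 0.5

mu = worst_case_mean(mu_bar, omega_diag, w, kappa)
worst_return = dot(w, mu)
closed_form = dot(mu_bar, w) - kappa * math.sqrt(
    sum(o * x * x for o, x in zip(omega_diag, w)))
# Ellipsoid value (mu - mu_bar)' Omega^{-1} (mu - mu_bar); should equal kappa^2.
ellipsoid_value = sum((m - mb) ** 2 / o
                      for m, mb, o in zip(mu, mu_bar, omega_diag))
```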

4.2. The Absolute Uncertainty Set

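Fabozzi et al. [23] used the absolute uncertainty set for the mean returns, which requires that the sum of the absolute deviations between the estimated and the possible mean returns not be too large. The absolute formulation is $U = \{\mu \mid \sum_i |\mu_i - \bar{\mu}_i| \le \frac{\kappa \bar{\sigma}}{T}\}$. In this case,
$$\mu^\top w - \bar{\mu}^\top w \ge -\sum_i |\mu_i - \bar{\mu}_i| \max(|w_i|) \ge -\frac{\kappa \bar{\sigma}}{T} \max(|w_i|);$$
thus,
$$\mu^\top w \ge \bar{\mu}^\top w - \frac{\kappa \bar{\sigma}}{T} \max(|w_i|).$$
Then, problem (5) is equivalent to
$$\begin{aligned} \min_{w \in \mathbb{R}^n}\ & \|\rho 1_T - R w\|_2 + \alpha \|w\|_1 \\ \text{s.t.}\ & \bar{\mu}^\top w - \frac{\kappa \bar{\sigma}}{T} \max(|w_i|) \ge \rho, \\ & w^\top 1_n = 1,\ w \ge 0,\ \|w\|_0 \le s. \end{aligned} \tag{15}$$
Here, $g(w) = \bar{\mu}^\top w - \kappa \bar{\sigma} \max(|w_i|) / T - \rho$. Similarly, the penalization form of problem (15) can be written as
$$\begin{aligned} \min_{w, v \in \mathbb{R}^n}\ & \|\rho 1_T - R w\|_2 + \alpha \|w\|_1 + \beta \|w - v\|_1 \\ \text{s.t.}\ & w \in \Omega_1 = \{w \mid |w_i| \le \tfrac{T}{\kappa \bar{\sigma}} (w^\top \bar{\mu} - \rho),\ i = 1, \dots, n,\ w \ge 0\}, \\ & v \in \Omega_2 = \{v \mid v^\top 1_n = 1,\ \|v\|_0 \le s\}. \end{aligned}$$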
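The lower bound behind the absolute uncertainty set can be illustrated by sampling. In this minimal sketch, `radius` is a hypothetical stand-in for the right-hand side of the set, the numbers are made up, and the sampler is only one simple way to draw points whose $\ell_1$ distance from $\bar{\mu}$ does not exceed the radius.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def worst_case_bound(mu_bar, w, radius):
    # Lower bound on mu' w over {mu : sum_i |mu_i - mu_bar_i| <= radius}.
    return dot(mu_bar, w) - radius * max(abs(x) for x in w)

def sample_from_set(mu_bar, radius, rng):
    # Draw a point whose l1 distance from mu_bar is at most radius.
    d = [rng.uniform(-1.0, 1.0) for _ in mu_bar]
    l1 = sum(abs(x) for x in d)
    t = rng.uniform(0.0, 1.0) * radius / l1
    return [m + t * x for m, x in zip(mu_bar, d)]

rng = random.Random(0)
mu_bar = [0.010, 0.015, 0.008]
w = [0.5, 0.3, 0.2]
radius = 0.004  # stands in for the radius of the absolute uncertainty set

bound = worst_case_bound(mu_bar, w, radius)
sampled = [dot(sample_from_set(mu_bar, radius, rng), w) for _ in range(1000)]
# The bound is attained by shifting only the asset with the largest |w_i|.
mu_worst = [mu_bar[0] - radius, mu_bar[1], mu_bar[2]]
```

No sampled return falls below the bound, and the bound is tight because `mu_worst` belongs to the set.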

5. Optimization

This section introduces a penalty alternating direction method (PADM) to solve problem (5).

5.1. Alternating Direction Methods

We first discuss the optimization of problem (6). Due to the complexity of this problem, we solve it with alternating direction methods (ADMs). The framework of ADMs is described in Algorithm 1.
Algorithm 1 ADM: Alternating Direction Method.
1: Set the problem parameters: $\alpha, \kappa, \rho, T > 0$, asset return matrix $R \in \mathbb{R}^{T \times n}$, and nominal expected return vector $\bar{\mu} \in \mathbb{R}^n$. Initialize $\varepsilon > 0$, $(w^0, v^0)$, and penalty parameter $\beta > 0$. Set the iteration index $k := 0, 1, \dots$.
2: Compute
$$w^{k+1} \in \arg\min_{w} \{f(w, v^k) : w \in \Omega_1\},$$
and
$$v^{k+1} \in \arg\min_{v} \{f(w^{k+1}, v) : v \in \Omega_2\}.$$
3: If $\dfrac{\|w^{k+1} - w^k\|_2 + \|v^{k+1} - v^k\|_2}{\|w^k\|_2 + \|v^k\|_2} \le \varepsilon$, then stop with $(w^k, v^k)$ being an output point of (6). Otherwise, return to Step 2.
Next, we state the general convergence result of Algorithm 1; we refer to Geissler et al. [36] (Theorem 8) for a proof and for further details about this method.
Theorem 9. Let $\{(w^k, v^k)\}$ be a sequence generated by Algorithm 1. Then, the following hold:
(a) $\{(w^k, v^k)\}$ is bounded.
(b) Any limit point $(w^*, v^*)$ of $\{(w^k, v^k)\}$ is a partial minimizer of problem (6).
(c) If Slater’s CQ holds on $\Omega_1$, any limit point of $\{(w^k, v^k)\}$ is also a strong KKT point of problem (6).
Proof. 
(a) It follows from Algorithm 1 that
$$f(w^{k+1}, v^{k+1}) \le f(w^{k+1}, v^k) \le f(w^k, v^k).$$
Since $f$ is a coercive function, the level sets of $f$ are bounded. Thus, $\{(w^k, v^k)\}$ is bounded.
(b) Clearly, $\{f(w^k, v^k)\}$ is a nonincreasing sequence and $f(w^k, v^k) \ge 0$; hence, there exists a value $f^*$ such that $\lim_{k \to \infty} f(w^k, v^k) = f^*$. Suppose that $(w^*, v^*)$ is a limit point of $\{(w^k, v^k)\}$. Then, there exists a subsequence $\{k_j\}$ such that $\lim_{j \to \infty} k_j = \infty$, $\lim_{j \to \infty} (w^{k_j}, v^{k_j}) = (w^*, v^*)$, and $\lim_{j \to \infty} f(w^{k_j}, v^{k_j}) = f(w^*, v^*) = f^*$, where the last equality holds since $\lim_{k \to \infty} f(w^k, v^k) = f^*$. Without loss of generality, let $\lim_{k \to \infty} (w^k, v^k) = (w^*, v^*)$. It follows from Algorithm 1 that
$$w^{k+1} \in \arg\min_{w \in \Omega_1} f(w, v^k).$$
Since $f$ is continuous with respect to $w$, taking $k \to \infty$, we have that
$$w^* \in \arg\min_{w \in \Omega_1} f(w, v^*).$$
Similarly,
$$v^* \in \arg\min_{v \in \Omega_2} f(w^*, v).$$
Thus, the limit point $(w^*, v^*)$ of $\{(w^k, v^k)\}$ is a partial minimizer of problem (6).
(c) Under Slater’s CQ, a partial minimizer of problem (6) is a strong KKT point of problem (6), and the converse also holds (Theorem 8). Thus, this result holds.    □
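The loop structure of Algorithm 1 and its relative-change stopping rule can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the two partial minimizations are passed in as hypothetical callables `argmin_w` and `argmin_v`, the guard against a zero denominator is an addition, the latest iterate is returned, and the demonstration uses an unconstrained toy objective $(w-1)^2 + (w-v)^2 + (v-3)^2$ in place of problem (6).

```python
import math

def norm2(x):
    return math.sqrt(sum(t * t for t in x))

def adm(argmin_w, argmin_v, w0, v0, eps=1e-10, max_iter=1000):
    w, v = list(w0), list(v0)
    for _ in range(max_iter):
        w_new = argmin_w(v)      # minimize over w with v fixed
        v_new = argmin_v(w_new)  # minimize over v with the new w fixed
        change = (norm2([a - b for a, b in zip(w_new, w)])
                  + norm2([a - b for a, b in zip(v_new, v)]))
        scale = norm2(w) + norm2(v)
        w, v = w_new, v_new
        # Relative-change test; the max(...) guard avoids division issues at zero.
        if change <= eps * max(scale, 1e-16):
            break
    return w, v

# Toy partial minimizers for f(w, v) = (w - 1)^2 + (w - v)^2 + (v - 3)^2.
w_star, v_star = adm(lambda v: [(1.0 + v[0]) / 2.0],
                     lambda w: [(w[0] + 3.0) / 2.0],
                     [0.0], [0.0])
```

For this toy objective the iterates converge to $(5/3, 7/3)$, which is a partial minimizer in the sense of Definition 6 (and, because the toy is jointly convex, also its global minimizer).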

5.2. The Optimization for the Partial Problem (8)

We now discuss the optimization of the partial problem (8) at the $k$-th iteration of the ADM. Some non-exact penalty methods and smoothing methods can be used to solve this problem. Here, we obtain $w^{k+1}$ by solving the following optimization problem:
$$\begin{aligned} \min_{w \in \mathbb{R}^n}\ & f(w, v^k) = \|\rho 1_T - R w\|_2 + \alpha \|w\|_1 + \beta \|w - v^k\|_1 + \gamma |g(w)| \\ \text{s.t.}\ & w \ge 0, \end{aligned} \tag{18}$$
where $\gamma > 0$ is a penalty parameter. Let
$$\psi_\mu(t) = \begin{cases} |t|, & |t| \ge \mu, \\ \dfrac{t^2}{2\mu} + \dfrac{\mu}{2}, & |t| < \mu, \end{cases} \qquad \phi_\mu(t) = \frac{1}{2}\left(t + \sqrt{t^2 + \mu}\right),$$
where $\mu > 0$ is a smoothing parameter. Then, a smoothed version of problem (18) can be given as follows:
$$\begin{aligned} \min_{w \in \mathbb{R}^n}\ & f_\mu(w, v^k) = \sqrt{\|\rho 1_T - R w\|_2^2 + \mu} + \alpha \sum_{i=1}^{n} \psi_\mu(w_i) + \beta \sum_{i=1}^{n} \psi_\mu(w_i - v_i^k) + \gamma \phi_\mu(g(w)) \\ \text{s.t.}\ & w \ge 0. \end{aligned} \tag{19}$$
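The two smoothing functions are easy to inspect numerically. The sketch below implements them as read from the definitions above (with the square root in $\phi_\mu$) and records two elementary facts: $\psi_\mu$ stays within $\mu/2$ of $|t|$, and $\phi_\mu$ stays within $\sqrt{\mu}/2$ of $\max(t, 0)$. The function names and the grid of test points are illustrative.

```python
import math

def psi(t, mu):
    # Huber-type smoothing of |t|: quadratic near zero, |t| elsewhere.
    return abs(t) if abs(t) >= mu else t * t / (2.0 * mu) + mu / 2.0

def phi(t, mu):
    # Smooth function that approaches max(t, 0) as mu tends to zero.
    return 0.5 * (t + math.sqrt(t * t + mu))

mu = 0.04
grid = [k / 100.0 for k in range(-50, 51)]
psi_gap = max(abs(psi(t, mu) - abs(t)) for t in grid)
phi_gap = max(abs(phi(t, mu) - max(t, 0.0)) for t in grid)
```

Both gaps are largest at $t = 0$ and shrink as $\mu$ decreases, which is why a small smoothing parameter keeps (19) close to (18).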
The projection gradient method (PGM) can be used to solve problem (19), and its iteration formula is
$$w^{j+1,k} = P_{\mathbb{R}_+^n}\left(w^{j,k} - \eta \nabla f_\mu(w^{j,k}, v^k)\right),$$
where $\eta > 0$ denotes the step length at the $j$-th iteration of the PGM within the $k$-th iteration of the ADM and $P_{\#}(t)$ denotes the projection of $t$ onto the set $\#$. The resulting method is called the penalty projection gradient method (PPGM) and is described in Algorithm 2.
Algorithm 2 PPGM: Penalty Projection Gradient Method.
1: Set the problem parameters: $\alpha, \kappa, \rho, T, \beta, \gamma_{\max} > 0$, asset return matrix $R \in \mathbb{R}^{T \times n}$, and nominal expected return vector $\bar{\mu} \in \mathbb{R}^n$. Initialize penalty parameters $\gamma_0 > 0$, $\tau > 1$. Set the iteration index $j = 0, 1, \dots$.
2: Compute $w^{j+1,k} = P_{\mathbb{R}_+^n}\left(w^{j,k} - \eta \nabla f_\mu(w^{j,k}, v^k)\right)$.
3: If $w^{j+1,k}$ satisfies $\dfrac{\|w^{j+1,k} - w^{j,k}\|_2}{\|w^{j,k}\|_2} \le \varepsilon$ and $g(w^{j,k}) \ge -\varepsilon$, then stop and set $w^k = w^{j,k}$.
4: If $\dfrac{\|w^{j+1,k} - w^{j,k}\|_2}{\|w^{j,k}\|_2} \le \varepsilon$ and $g(w^{j,k}) < -\varepsilon$, then choose the new penalty parameter $\gamma = \min\{\tau \gamma, \gamma_{\max}\}$. Otherwise, return to Step 2.
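The projected gradient step at the core of Algorithm 2 can be sketched on a toy problem. The example below is a simplified illustration, not the PPGM itself: it omits the penalty update of $\gamma$, takes a descent step (subtracting the gradient), and minimizes the hypothetical smooth objective $\frac{1}{2}\|w - c\|_2^2$ over $w \ge 0$ instead of $f_\mu$. The names `project_nonneg` and `projected_gradient` are assumptions of this sketch.

```python
import math

def project_nonneg(x):
    # Projection onto the nonnegative orthant: clip negative entries to zero.
    return [max(t, 0.0) for t in x]

def projected_gradient(grad, w0, eta, eps=1e-10, max_iter=10000):
    w = project_nonneg(w0)
    for _ in range(max_iter):
        g = grad(w)
        w_new = project_nonneg([wi - eta * gi for wi, gi in zip(w, g)])
        step = math.sqrt(sum((a - b) ** 2 for a, b in zip(w_new, w)))
        scale = math.sqrt(sum(t * t for t in w))
        w = w_new
        # Relative-change stopping test, guarded against a zero iterate.
        if step <= eps * max(scale, 1e-16):
            break
    return w

# Toy objective 0.5 * ||w - c||^2 with gradient w - c; the constrained
# minimizer is the projection of c onto the nonnegative orthant.
c = [1.0, -2.0, 0.5]
w_hat = projected_gradient(lambda w: [wi - ci for wi, ci in zip(w, c)],
                           [0.0, 0.0, 0.0], eta=0.5)
```

In the full method, the gradient callable would return $\nabla f_\mu(\cdot, v^k)$, and the loop would be wrapped in the penalty update of Step 4.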

5.3. Penalty Alternating Direction Method

At the end of this section, we describe the PADM for the general problem (5). At iteration $l$, we set the value of the penalty parameter $\beta_l$ and obtain $(w^l, v^l)$ by the ADM with $\beta_l$. If the inequality $\|w^l - v^l\|_1 \le tol$ holds, where $tol$ is a small positive constant, we stop with a feasible solution of problem (6). Otherwise, the penalty parameter $\beta_l$ is updated to $\beta_{l+1}$. In this way, the PADM generates a sequence of partial minimizers of problem (6) with $\beta_l$. The framework of the PADM is formally stated in Algorithm 3.
Algorithm 3 PADM: Penalty Alternating Direction Method.
1: Set the problem parameters $\alpha, \kappa, \rho, T, \beta_{\max} > 0$, the asset return matrix $R \in \mathbb{R}^{T \times n}$, and the nominal expected return vector $\bar{\mu} \in \mathbb{R}^n$. Initialize the penalty parameters $\beta_0 > 0$ and $\tau > 1$. Set the iteration index $l = 0, 1, \ldots$.
2: Obtain $(w^l, v^l)$ by the ADM with $\beta_l$.
3: If $(w^l, v^l)$ satisfies $\|w^l - v^l\|_1 \le tol$, then stop with $(w^l, v^l)$. Otherwise, choose a new penalty parameter $\beta_{l+1} = \min\{\tau\beta_l, \beta_{\max}\}$, and return to Step 2.
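The outer loop of Algorithm 3 can be sketched on a toy problem. The following Python sketch is only an illustration under stated assumptions: instead of the paper's model, it minimizes $\tfrac{1}{2}\|w - a\|_2^2$ subject to $w \ge 0$ and at most $s$ nonzero entries, using the splitting $w = v$ with the penalty $\beta\|w - v\|_1$. The function names (`soft_threshold`, `keep_largest`, `adm`, `padm`) and the vector `a_demo` are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    """Componentwise soft-thresholding operator."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def keep_largest(w, s):
    """Keep the s entries of w with largest magnitude, zero out the rest."""
    v = np.zeros_like(w)
    idx = np.argsort(-np.abs(w))[:s]
    v[idx] = w[idx]
    return v

def adm(a, v, beta, s, max_iter=100, eps=1e-10):
    """Alternate exact minimization in w (v fixed) and in v (w fixed)."""
    for _ in range(max_iter):
        # w-step: argmin_{w >= 0} 0.5*||w - a||^2 + beta*||w - v||_1 (separable, closed form).
        w = np.maximum(v + soft_threshold(a - v, beta), 0.0)
        # v-step: closest point to w (in the l1 sense) with at most s nonzeros.
        v_new = keep_largest(w, s)
        converged = np.linalg.norm(v_new - v, 1) <= eps
        v = v_new
        if converged:
            break
    return w, v

def padm(a, s, beta0=0.1, tau=2.0, beta_max=1e6, tol=1e-8):
    """Outer penalty loop: increase beta until ||w - v||_1 <= tol."""
    beta = beta0
    v = keep_largest(np.maximum(a, 0.0), s)
    while True:
        w, v = adm(a, v, beta, s)
        if np.linalg.norm(w - v, 1) <= tol or beta >= beta_max:
            return w, v, beta
        beta = min(tau * beta, beta_max)

a_demo = np.array([0.5, 0.3, 0.15, 0.05, -0.2])
w_demo, v_demo, beta_demo = padm(a_demo, s=2)
```

The penalty parameter is increased geometrically, as in Step 3, and each ADM call is warm-started from the previous $v$. This mirrors the structure of the PADM but says nothing about the behaviour of the authors' solver on the actual model.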

6. Numerical Results

This section reports extensive numerical experiments. In Section 6.1, we first present seven real data sets, explain the existing models used for comparison, and describe the performance measures. In Section 6.2, we demonstrate that our methods lead to robust and sparse portfolios. In Section 6.3, we compare our portfolios with nine popular portfolios in terms of out-of-sample (OOS) performance measures. Finally, in Section 6.4, we show the cumulative return of different portfolio strategies. All computations are conducted in Matlab R2019a on a PC with an Intel(R) Core(TM) i5-7200U CPU (2.50 GHz, 4 CPUs) and 4 GB of RAM.

6.1. Models of Comparison, Data, and Performance Measures

(a) Eleven portfolio models compared. We compare the OOS performance of 11 portfolio models across six real data sets of weekly and monthly returns. Those models are well studied, and we divide them into four groups, which are summarized in Table 1. The first group is the robust and sparse portfolio strategies developed in this paper. The second group includes some well-studied portfolio strategies. The third group includes three benchmark portfolio strategies. The last group consists of two portfolios that use the shrinkage technique to estimate the covariance matrix.
(b) Seven data sets tested. Table 2 lists some real-world data sets: DJIA [42], NASDAQ [43], S&P [44,45], Russell2000 [46], Russell3000 [47], and FF100 [38]. All the data are obtained from Yahoo finance (https://finance.yahoo.com/, accessed on 10 January 2023) and Ken French’s website (https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html, accessed on 10 January 2023). In all cases, we remove those assets that have missing values.
(c) Measuring the OOS performance and its setup. We largely follow the “rolling-window” procedure in [2,37] to conduct our comparison. Let $T$ be the length of a data set and $\tau$ be the window length (e.g., $\tau = 120$) used to construct the optimal portfolio by a model. In each period $t + 1$, $t = \tau, \ldots, T - 1$, we compute the different portfolios over the previous $\tau$ periods. We then compute the OOS return in the $(t+1)$-th period based on the obtained portfolio. We repeat this procedure until we reach the end of the data set. In this way, we obtain a series of $(T - \tau)$ portfolio vectors for each model listed in Table 1. To be precise, let $w_t^s$ be the optimal portfolio obtained by the portfolio strategy $s$ over the periods $t - \tau + 1, \ldots, t$. The OOS return in period $t + 1$ is computed as $r_{t+1}^s = (w_t^s)^\top r_{t+1}$, where $r_{t+1}$ is the vector of asset returns in the $(t+1)$-th period. Thus, we obtain a time series of $(T - \tau - 1)$ periods of OOS returns for all strategies. Note that we use the traditional “rolling-window” procedure for the numerical analysis; some newer methods could provide new ideas for the analysis of portfolio selection problems, see [48].
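The rolling-window procedure can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' Matlab code: the return matrix is synthetic, the function names are hypothetical, and the equally weighted (1/N) rule stands in for any strategy $s$ that maps a window of past returns to a weight vector.

```python
import numpy as np

def equal_weight(window_returns):
    """1/N benchmark: ignores the data and spreads wealth evenly."""
    n = window_returns.shape[1]
    return np.full(n, 1.0 / n)

def rolling_oos_returns(returns, window, strategy):
    """Fit on the previous `window` rows, then evaluate on the next row."""
    T = returns.shape[0]
    weights, oos = [], []
    for t in range(window, T):
        w = strategy(returns[t - window:t])   # estimation window of length `window`
        weights.append(w)
        oos.append(float(w @ returns[t]))     # OOS return of the following period
    return np.array(weights), np.array(oos)

# Synthetic data (assumption): 150 periods, 5 assets.
rng = np.random.default_rng(0)
returns = rng.normal(0.001, 0.02, size=(150, 5))
weights, oos = rolling_oos_returns(returns, window=120, strategy=equal_weight)
```

Any of the models in Table 1 could be plugged in through the `strategy` argument, provided it returns one weight per asset.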
The OOS performance of each portfolio strategy is assessed by using four quantities: (i) the OOS portfolio variance ($\hat{\sigma}^2$), (ii) the OOS portfolio Sharpe ratio ($\widehat{SR}$), (iii) the portfolio turnover ($TURN$), and (iv) the average short positions ($ASP$). The specific definitions can be found in DeMiguel et al. [6], Yen and Yen [38], and Zhao et al. [37]. We also evaluate the cumulative return (CR). The CR of a portfolio scores the total payoffs yielded by the investment strategy across the investment periods without considering any risk or cost, see Shen et al. [49]. Finally, we consider two quantities studied in [38] that profile the portfolio weights: PAP is the proportion of active positions and PZP is the proportion of zero positions, defined as $PAP_t = |S_t^1| / N$ and $PZP_t = |S_t^0| / N$, where $S_t^1 = \{i : w_{i,t} \neq 0\}$ and $S_t^0 = \{i : w_{i,t} = 0\}$.
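As a rough illustration of these measures, the following Python sketch computes PAP and PZP exactly as defined above, together with the OOS variance and Sharpe ratio. The variance and Sharpe ratio use the usual sample definitions (sample variance, and mean divided by sample standard deviation), which is an assumption here since the exact formulas are deferred to [6,37,38]. The function names and the `zero_tol` parameter are hypothetical.

```python
import numpy as np

def oos_variance(oos_returns):
    """Sample variance of the OOS return series (assumed definition)."""
    return float(np.var(oos_returns, ddof=1))

def oos_sharpe(oos_returns):
    """Mean OOS return divided by its sample standard deviation (assumed definition)."""
    return float(np.mean(oos_returns) / np.std(oos_returns, ddof=1))

def pap_pzp(w, zero_tol=0.0):
    """Proportion of active (nonzero) and zero positions in a weight vector."""
    w = np.asarray(w, dtype=float)
    active = np.abs(w) > zero_tol
    pap = active.sum() / w.size
    pzp = (~active).sum() / w.size
    return float(pap), float(pzp)

r_demo = np.array([0.01, 0.03, -0.01, 0.02])
w_demo_weights = np.array([0.5, 0.3, 0.2, 0.0, 0.0])
var_demo = oos_variance(r_demo)
sr_demo = oos_sharpe(r_demo)
pap_demo, pzp_demo = pap_pzp(w_demo_weights)
```

In floating-point practice a small positive `zero_tol` may be preferable to an exact zero test, because solvers often return entries such as 1e-18 (compare the ASP values in Table 3).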

6.2. Robust and Sparse Portfolio

This section shows the weight of robust and sparse portfolios. We use the DJIA data set and the sparse levels s 1 = 30 % n and s 2 = 50 % n . The parameter α = β = 10 λ , and the value of λ varies from 10 2 to 10 1 .
Figure 1 shows the portfolio weights, PAP, and PZP. The two plots in the top panel correspond to a robust portfolio under the quadratic uncertainty set with sparsity $s_1$. The two plots in the bottom panel correspond to a robust portfolio under the absolute uncertainty set with sparsity $s_2$. As the penalty parameter $\lambda$ increases, the portfolio weights become sparser. The PAP and PZP indicate that we can obtain sparse portfolios that satisfy the specified sparsity.
Figure 2 shows the sparse portfolios on four different data sets. The sparsity level is $s_1 = 30\% n$ on DJIA, $s_2 = 10\% n$ on NASDAQ and FF100, and $s_3 = 1\% n$ on Russell2000. We solve the robust portfolio under the quadratic uncertainty set to show the results. In each case, we obtain a portfolio with the specified sparsity, and the figure shows the distribution of the asset weight values.

6.3. Out-of-Sample Performance

The Sharpe ratio considers return and risk at the same time; it is a comprehensive measurement for us to observe the performance of a portfolio. Thus, we first test the Sharpe ratio of different portfolio strategies. We use the SP100 data set. The parameter α = 2 β and the value of β varies from 10 to 10 1.5 . The sparsity level s 1 = 15 % n and s 2 = 5 % n .
By comparing with two benchmark portfolios, Figure 3 shows that the RSQ and RSA can produce a higher Sharpe ratio when choosing a suitable penalty parameter.
Table 3 reports the OOS performance by using the four quantities defined in Section 6.1. We set $\alpha = \beta = 10$ and the sparsity level $s_1 = 30\% n$ (on the DJIA, NASDAQ, SP500, and FF100 data sets) and $s_2 = 30\% n$ (on the Russell2000 and Russell3000 data sets). We can observe that the RSA and RSQ portfolios achieve the smallest variances across all portfolio strategies, i.e., on average $10.84\,(\%)^2$ and $11.09\,(\%)^2$, respectively. This means they are less volatile, i.e., less risky. SU, SC1F, and SCID have the highest variances on average, $995.73\,(\%)^2$, $442.81\,(\%)^2$, and $404.20\,(\%)^2$ in this setting. The average variances of the remaining portfolio strategies are $11.94\,(\%)^2$ (L1), $11.98\,(\%)^2$ (EN), $16\,(\%)^2$ (L12), $27.78\,(\%)^2$ (SC), $28.71\,(\%)^2$ (EW), and $71.11\,(\%)^2$ (Box), respectively. In addition, we observe that the Sharpe ratios of the various portfolios on average are 12.34% (SC), 11.82% (EW), 11.77% (RSA), 11.70% (RSQ), 11.61% (L12), 11.21% (EN), 11.18% (L1), 10.58% (SCID), 10.09% (SC1F), 9.23% (Box), and 7.92% (SU). We see that the RSA and RSQ portfolios do not result in a significantly different OOS Sharpe ratio when compared with SC and EW; however, their Sharpe ratios are higher than those of the rest of the portfolio strategies.
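The averages quoted above are simple means over the six data sets in Table 3. As a small worked check, the following Python snippet recomputes the average OOS variance of RSA and RSQ from the `var` rows of Table 3. The variable names are chosen here for illustration.

```python
import numpy as np

# OOS variances from Table 3, in the column order
# DJIA, NASDAQ, SP500, Russell2000, Russell3000, FF100.
var_rsa = np.array([5.3741, 5.9065, 11.5604, 12.1813, 12.7837, 17.2215])
var_rsq = np.array([5.3949, 5.8633, 12.5936, 12.7041, 12.7702, 17.2097])

avg_rsa = round(float(var_rsa.mean()), 2)
avg_rsq = round(float(var_rsq.mean()), 2)
print(avg_rsa)  # 10.84
print(avg_rsq)  # 11.09
```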
As for the portfolio turnover, unsurprisingly, the EW portfolio strategy exhibits the lowest turnover of all portfolio strategies, amounting to 3.20%. The RSA and RSQ portfolio strategies have moderate levels of turnover on average, 13.49% and 13.20%. The highest average turnover is generated by the SU portfolio, followed by the Box portfolio, amounting on average to 225.81% and 193.10%, meaning that they are very costly. The turnovers of the remaining portfolio strategies are 11.85% (L12), 25.76% (L2), 16.66% (L1), and 13.98% (EN), respectively. The high turnover of SU and Box is reflected in their enormous average short positions of 283.52% and 333.52% on average across the six data sets. The next two highest average short positions are those of SCID and SC1F, amounting to 164.29% and 132.81%, respectively. The average short positions of the SC and EW portfolios are approximately 0% across the six data sets. The average short positions of the RSQ and RSA portfolio strategies also tend to zero. Therefore, considering the moderate turnover and the average short positions, the proposed RSQ and RSA strategies represent a practically implementable method that outperforms the other portfolio strategies listed in Table 1.

6.4. Cumulative Return

In this subsection, we show the CR of several portfolio strategies. We use the FF100 data set with the sparsity level $s = 10\% n$ and the parameters $\alpha = \beta = 10$. According to the OOS performance, we choose RSQ, RSA, L12, L1, EN, EW, and SC to compare the CR.
Figure 4 shows the curves of the CR over the corresponding investment periods for the different portfolio strategies. RSQ and RSA outperform the others by visible margins, whereas RSA and RSQ do not differ significantly from each other. This result suggests that, compared with the other portfolios, the sparse portfolios RSA and RSQ grow more steadily, with reduced volatility, across most of the investment periods.
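A cumulative-return curve of this kind can be computed from a series of OOS returns as follows. The compounding convention used here (wealth starts at 1 and is multiplied by $1 + r_t$ in each period, with returns expressed as decimal fractions) is an assumption, since the text only describes the CR verbally; the function name and the sample returns are hypothetical.

```python
import numpy as np

def cumulative_return(oos_returns):
    """Cumulative wealth path: prod_{u <= t} (1 + r_u), assuming decimal returns."""
    return np.cumprod(1.0 + np.asarray(oos_returns, dtype=float))

# Hypothetical OOS returns for three periods.
cr_path = cumulative_return([0.10, -0.05, 0.02])
```

The last entry of `cr_path` is the terminal wealth per unit invested; plotting the whole path for each strategy gives curves comparable in shape to those in Figure 4.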

7. Conclusions

Portfolio selection has been a fertile area for robust optimization techniques. We proposed a robust and sparse portfolio selection optimization model by considering the perturbation in the asset return matrix and the parameter uncertainty in the expected asset return. We used the equivalence of robustness and regularization to deal with the perturbation in the asset return matrix. To deal with the uncertainty in the expected asset return, we considered two kinds of uncertainty sets and solved the worst-case scenario. We defined three types of stationary points of the penalty problem and then analyzed the relationship between these stationary points and local/global minimizers. Then, we designed the penalty alternating direction method to solve each problem. Although there is no theoretical guarantee for the equivalence between problems (5) and (6), as well as problems (8) and (18), we confirmed the effectiveness of our approach by comparing it with nine widely studied portfolio models on seven real-world data sets. Extensive numerical experiments confirm that the portfolios we proposed have lower volatility, that is, less risk. Moreover, our portfolio strategies can yield higher Sharpe ratios when the appropriate parameters are selected.
We note that robust optimization (RO) mainly considers the uncertainty sets of parameters and thus does not use any distributional information about the data. This characteristic makes RO attractive, but at the same time, the method loses a comprehensive characterization of the data. Recently, distributionally robust optimization (DRO) has attracted widespread attention and research. Although DRO takes into account the distributional information of the data, the price paid is that it is difficult to solve. We will consider how to apply DRO to sparse portfolio problems, taking into account the distributional information of financial data while improving the sparsity. The most direct extension is distributionally robust portfolio optimization with the $\ell_0$-norm constraint, which is a worthwhile and challenging issue.

Author Contributions

Conceptualization, H.Z.; methodology, H.Z. and Y.J.; software, H.Z., Y.J., and Y.Y.; formal analysis, H.Z. and Y.J.; writing—original draft preparation, H.Z.; writing—review and editing, H.Z., Y.J., and Y.Y.; visualization, H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data source has been presented in Section 6 of the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Markowitz, H. Portfolio selection. J. Financ. 1952, 7, 142–149.
  2. Brodie, J.; Daubechies, I.; De Mol, C.; Giannone, D.; Loris, I. Sparse and stable markowitz portfolios. Proc. Natl. Acad. Sci. USA 2009, 106, 12267–12272.
  3. Ledoit, O.; Wolf, M. Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. J. Empir. Financ. 2003, 10, 603–621.
  4. Ledoit, O.; Wolf, M. Nonlinear shrinkage of the covariance matrix for portfolio selection: Markowitz meets goldilocks. Rev. Financ. Stud. 2017, 30, 4349–4388.
  5. Jagannathan, R.; Ma, T. Risk reduction in large portfolios: Why imposing the wrong constraints helps. J. Financ. 2003, 58, 1651–1683.
  6. DeMiguel, V.; Garlappi, L.; Nogales, F.J.; Uppal, R. A generalized approach to portfolio optimization: Improving performance by constraining portfolio norms. Manag. Sci. 2009, 55, 798–812.
  7. Ben-Tal, A.; Nemirovski, A.; Roos, C. Robust solutions of uncertain quadratic and conic-quadratic problems. SIAM J. Optim. 2002, 13, 535–560.
  8. El Ghaoui, L.; Oustry, F.; Lebret, H. Robust solutions to uncertain semidefinite programs. SIAM J. Optim. 1998, 9, 33–52.
  9. Lee, Y.; Kim, M.J.; Kim, J.H.; Jang, J.R.; Chang, K.W. Sparse and robust portfolio selection via semi-definite relaxation. J. Oper. Res. 2020, 71, 687–699.
  10. Swain, P.; Ojha, A.K. Robust approach for uncertain portfolio allocation problems under box uncertainty. In Recent Trends in Applied Mathematics: Select Proceedings of AMSE; Springer: Singapore, 2019; pp. 347–356.
  11. Goldfarb, D.; Iyengar, G. Robust portfolio selection problems. Math. Oper. Res. 2003, 28, 1–38.
  12. Lobo, M.S.; Boyd, S. The Worst-Case Risk of a Portfolio. Unpublished Manuscript. Available online: http://faculty.fuqua.duke.edu/(2000)%7Emlobo/bio/researchfiles/rsk-bnd.pdf (accessed on 13 July 2022).
  13. Min, L.; Dong, J.; Liu, J.; Gong, X. Robust mean-risk portfolio optimization using machine learning-based trade-off parameter. Appl. Soft Comput. 2021, 113, 107948.
  14. Won, J.H.; Kim, S.J. Robust trade-off portfolio selection. Optim. Eng. 2020, 21, 867–904.
  15. Fabozzi, F.J.; Huang, D.; Zhou, G. Robust portfolios: Contributions from operations research and finance. Ann. Oper. Res. 2010, 176, 191–220.
  16. Scutella, M.G.; Recchia, R. Robust portfolio asset allocation and risk measures. Ann. Oper. Res. 2013, 204, 145–169.
  17. Xidonas, P.; Steuer, R.; Hassapis, C. Robust portfolio optimization: A categorized bibliographic review. Ann. Oper. Res. 2020, 292, 533–552.
  18. Ghahtarani, A.; Saif, A.; Ghasemi, A. Robust portfolio selection problems: A comprehensive review. Oper. Res. 2022, 22, 3203–3264.
  19. Leyffer, S.; Menickelly, M.; Munson, T.; Vanaret, C.; Wild, S.M. A survey of nonlinear robust optimization. INFOR: Inf. Syst. Oper. Res. 2020, 58, 342–373.
  20. Zhao, Z.; Xu, F.; Du, D.; Meihua, W. Robust portfolio rebalancing with cardinality and diversification constraints. Quant. Financ. 2021, 21, 1707–1721.
  21. Tütüncü, R.H.; Koenig, M. Robust asset allocation. Ann. Oper. Res. 2004, 132, 157–187.
  22. Khodamoradi, T.; Salahi, M.; Najafi, A.R. Robust CCMV model with short selling and risk-neutral interest rate. Phys. A Stat. Mech. Its Appl. 2020, 547, 124429.
  23. Fabozzi, F.J.; Kolm, P.N.; Pachamanova, D.A.; Focardi, S.M. Robust portfolio optimization. J. Portf. Manag. 2007, 33, 40.
  24. Pınar, M.Ç. On robust mean-variance portfolios. Optimization 2016, 65, 1039–1048.
  25. Busse, J.A.; Chordia, T.; Jiang, L.; Tang, Y. Transaction costs, portfolio characteristics, and mutual fund performance. Manag. Sci. 2021, 67, 1227–1248.
  26. Hautsch, N.; Voigt, S. Large-scale portfolio allocation under transaction costs and model uncertainty. J. Econom. 2019, 212, 221–240.
  27. Yu, J.R.; Chiou, W.J.P.; Lee, W.Y.; Lin, S.J. Portfolio models with return forecasting and transaction costs. Int. Rev. Econ. Financ. 2020, 66, 118–130.
  28. Mordukhovich, B.S.; Nguyen, M.M. An Easy Path to Convex Analysis and Applications; Morgan and Claypool Publishers Series: San Rafael, CA, USA, 2014.
  29. Pan, L.; Xiu, N.; Fan, J. Optimality conditions for sparsity nonlinear programming. Sci. China Math. 2017, 60, 759–776.
  30. Rockafellar, R.T.; Wets, R.J. Variational Analysis; Springer: Berlin/Heidelberg, Germany, 1998.
  31. Bertsimas, D.; Dunn, J. Machine Learning under a Modern Optimization Lens; Dynamic Ideas LLC: Charlestown, MA, USA, 2019.
  32. Beck, A. First-Order Methods in Optimization; Society for Industrial and Applied Mathematics and Mathematical Optimization Society: Philadelphia, PA, USA, 2017.
  33. Costa, C.M.; Kreber, D.; Schmidta, M. An alternating method for cardinality-constrained optimization: A computational study for the best subset selection and sparse portfolio problems. Informs J. Comput. 2022, 34, 2968–2988.
  34. Heckel, T.; de Carvahlo, R.; Lu, X.; Perchet, R. Insights into robust optimization: Decomposing into mean-variance and risk-based portfolios. J. Invest. Strateg. 2016, 6, 1–24.
  35. Yin, C.; Perchet, R.; Soupé, F. A practical guide to robust portfolio optimization. Quant. Financ. 2021, 21, 911–928.
  36. Geissler, B.; Morsi, A.; Schewe, L.; Schmidt, M. Penalty alternating direction methods for mixed-integer optimization: A new view on feasibility pumps. SIAM J. Optim. 2017, 27, 1611–1636.
  37. Zhao, H.; Kong, L.; Qi, H.D. Optimal portfolio selections via $\ell_{1,2}$-norm regularization. Comput. Optim. Appl. 2021, 80, 853–881.
  38. Yen, Y.M.; Yen, T.J. Solving norm constrained portfolio optimization via coordinate-wise descent algorithms. Comput. Stat. Data Anal. 2014, 76, 737–759.
  39. Behr, P.; Guettler, A.; Miebs, F. On portfolio optimization: Imposing the right constraints. J. Bank. Financ. 2013, 37, 1232–1242.
  40. DeMiguel, V.; Garlappi, L.; Uppal, R. Optimal versus naive diversification: How in-efficient is the 1/n portfolio strategy? Rev. Financ. Stud. 2009, 22, 1915–1953.
  41. Ledoit, O.; Wolf, M. A well-conditioned estimator for large-dimensional covariance matrices. J. Multivar. Anal. 2004, 88, 365–411.
  42. Lai, Z.R.; Yang, P.Y.; Fang, L.; Wu, X. Short-term sparse portfolio optimization based on alternating direction method of multipliers. J. Mach. Learn. Res. 2018, 19, 2547–2574.
  43. Chou, R.K.; Chung, H. Decimalization, trading costs, and information transmission between etfs and index futures. J. Futur. Mark. 2006, 26, 131–151.
  44. Kan, R.; Wang, X.; Zhou, G. Optimal portfolio choice with estimation risk: No risk-free asset case. Manag. Sci. 2022, 68, 2047–2068.
  45. Mutunge, P.; Haugl, D. Minimizing the tracking error of cardinality constrained portfolios. Comput. Oper. Res. 2018, 90, 33–41.
  46. Fan, J.; Zhang, J.; Yu, K. Vast portfolio selection with gross-exposure constraints. J. Am. Stat. Assoc. 2012, 107, 592–606.
  47. Teng, Y.; Yang, L.; Yu, B.; Song, X. A penalty PALM method for sparse portfolio selection problems. Optim. Methods Softw. 2017, 32, 126–147.
  48. Wang, Y.; Gao, S.; Yu, Y.; Cai, Z.; Wang, Z. A gravitational search algorithm with hierarchy and distributed framework. Knowl.-Based Syst. 2021, 218, 106877.
  49. Shen, W.; Wang, J.; Ma, S. Doubly regularized portfolio with risk minimization. In Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, Québec City, QC, Canada, 27–31 July 2014.
Figure 1. Portfolio weights.
Figure 2. Sparse solutions.
Figure 3. The Sharpe ratio.
Figure 4. The cumulative return.
Table 1. List of portfolio strategies considered.
| Group | Model | Abbr. | Refer. |
| --- | --- | --- | --- |
| (1) Robust and sparse portfolios with | quadratic uncertainty set | RSQ | this paper |
| | absolute uncertainty set | RSA | this paper |
| (2) Some well-studied portfolio strategies with | $\ell_1$ regularization | L1 | Brodie et al. [2] |
| | $\ell_{1,2}$ regularization | L12 | Zhao et al. [37] |
| | Elastic Net regularization | EN | Yen and Yen [38] |
| | upper and lower bound | Box | Behr et al. [39] |
| (3) Benchmark portfolio strategies with | short-sales constrained | SC | Jagannathan and Ma [5] |
| | short-sales unconstrained | SU | Jagannathan and Ma [5] |
| | equally weighted (1/N) portfolio | EW | DeMiguel et al. [40] |
| (4) Shrinkage of covariance | sample covariance and identity matrix | SCID | Ledoit and Wolf [41] |
| | sample covariance and 1-factor matrix | SC1F | Ledoit and Wolf [3] |
Table 2. Information of the seven real data sets.
| # | Data Set | Stocks | Time Period | Source | Frequency |
| --- | --- | --- | --- | --- | --- |
| 1 | DJIA | 29 | 01/10/2017–30/10/2022 | Yahoo Finance | Weekly |
| 2 | NASDAQ | 95 | 01/10/2017–30/10/2022 | Yahoo Finance | Weekly |
| 3 | SP500 | 336 | 01/10/2017–30/10/2022 | Yahoo Finance | Weekly |
| 4 | Russell2000 | 1340 | 01/10/2017–30/10/2022 | Yahoo Finance | Weekly |
| 5 | Russell3000 | 2166 | 01/10/2017–30/10/2022 | Yahoo Finance | Weekly |
| 6 | SP100 | 71 | 01/10/2017–30/10/2022 | Yahoo Finance | Weekly |
| 7 | FF100 | 100 | 11/1999–06/2022 | K. French | Monthly |
Table 3. Portfolio out-of-sample variance ($\hat{\sigma}^2$, in $(\%)^2$), Sharpe ratio ($\widehat{SR}$), turnover ($TURN$), and the average short positions ($ASP$).
| Model | Measure | DJIA (n = 29) | NASDAQ (n = 95) | SP500 (n = 336) | Russell2000 (n = 1340) | Russell3000 (n = 2166) | FF100 (n = 100) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RSA | var | 5.3741 | 5.9065 | 11.5604 | 12.1813 | 12.7837 | 17.2215 |
| | SR | 0.0703 | 0.1807 | 0.1008 | 0.0063 | 0.0267 | 0.3215 |
| | TURN | 0.1194 | 0.1721 | 0.1430 | 0.1033 | 0.1021 | 0.1698 |
| | ASP | −1.11e-18 | 2.22e-18 | 0 | −3.08e-18 | −4.01e-18 | −6.28e-18 |
| RSQ | var | 5.3949 | 5.8633 | 12.5936 | 12.7041 | 12.7702 | 17.2097 |
| | SR | 0.0737 | 0.1889 | 0.0863 | 0.0059 | 0.0258 | 0.3216 |
| | TURN | 0.1063 | 0.1711 | 0.1334 | 0.1112 | 0.1003 | 0.1698 |
| | ASP | −6.66e-18 | −3.33e-18 | −1.43e-17 | −9.25e-18 | −9.25e-18 | 0 |
| L12 | var | 9.5931 | 7.1442 | 32.1907 | 14.8927 | 13.7879 | 18.3918 |
| | SR | 0.0860 | 0.1768 | 0.1000 | 0.0072 | 0.0295 | 0.2971 |
| | TURN | 0.0220 | 0.0286 | 0.0369 | 0.0675 | 0.0605 | 0.0296 |
| | ASP | −0.0216 | −0.0209 | −0.0240 | −0.0057 | −0.0050 | −0.0276 |
| L1 | var | 8.0373 | 6.7078 | 13.1228 | 14.2679 | 13.0166 | 16.4937 |
| | SR | 0.0864 | 0.1831 | 0.0414 | 0.0042 | 0.0279 | 0.3279 |
| | TURN | 0.1063 | 0.0677 | 0.0369 | 0.1250 | 0.1136 | 0.0629 |
| | ASP | −0.0036 | −0.0012 | −0.0124 | −0.0039 | 0.0020 | −0.0195 |
| EN | var | 8.5871 | 6.5362 | 12.9449 | 14.2074 | 13.0817 | 16.5823 |
| | SR | 0.0911 | 0.1790 | 0.0418 | 0.0029 | 0.0402 | 0.3181 |
| | TURN | 0.0256 | 0.0472 | 0.0408 | 0.1208 | 0.1181 | 0.0510 |
| | ASP | −0.0257 | −0.0198 | −0.0272 | 0.0015 | 0.0011 | −0.0257 |
| Box | var | 9.9181 | 8.7749 | 3.59e+02 | 7.3636 | 7.0004 | 34.5878 |
| | SR | 0.0250 | 0.0768 | −0.1204 | −0.0495 | −0.0023 | 0.6244 |
| | TURN | 0.8322 | 1.7146 | 3.2709 | 0.2820 | 0.2973 | 5.1895 |
| | ASP | 1.1172 | 2.7850 | 6.5137 | 0.5744 | 0.5886 | 8.4327 |
| SC | var | 10.0132 | 8.0089 | 86.1291 | 24.3110 | 16.9741 | 21.2942 |
| | SR | 0.0871 | 0.1891 | 0.1202 | 0.0233 | 0.0399 | 0.2812 |
| | TURN | 0.0388 | 0.0313 | 0.0411 | 0.0512 | 0.0481 | 0.0247 |
| | ASP | 1.38e-16 | 1.23e-16 | −1.52e-16 | 3.12e-16 | −4.19e-16 | 0 |
| SU | var | 9.9181 | 15.0721 | 5.91e+03 | 12.1721 | 9.9999 | 17.2369 |
| | SR | 0.0250 | 0.1800 | −0.1270 | −0.0482 | −0.0167 | 0.4624 |
| | TURN | 0.8322 | 2.7947 | 3.9150 | 0.3919 | 0.3722 | 5.2429 |
| | ASP | 1.1172 | 2.5580 | 5.8870 | 0.5542 | 0.5494 | 6.3456 |
| EW | var | 11.1339 | 8.1780 | 89.8252 | 21.3370 | 19.5742 | 22.2346 |
| | SR | 0.0701 | 0.1790 | 0.1151 | 0.0194 | 0.0337 | 0.2922 |
| | TURN | 0.0208 | 0.0253 | 0.0400 | 0.0429 | 0.0381 | 0.0252 |
| | ASP | 1.13e-16 | 1.13e-16 | −1.12e-16 | 4.52e-16 | −3.39e-16 | 0 |
| SCID | var | 6.9600 | 6.9782 | 2.38e+03 | 7.3676 | 7.0304 | 16.8727 |
| | SR | 0.0294 | 0.1097 | −0.1259 | −0.0497 | −0.0025 | 0.6740 |
| | TURN | 0.4115 | 0.8589 | 0.9940 | 0.2803 | 0.2884 | 1.5393 |
| | ASP | 0.6362 | 1.6735 | 2.8275 | 0.5727 | 0.5871 | 3.5605 |
| SC1F | var | 6.2670 | 6.5347 | 2.6140e+03 | 7.3790 | 7.0367 | 15.6203 |
| | SR | 0.0380 | 0.0902 | −0.1253 | −0.0510 | −0.0034 | 0.6570 |
| | TURN | 0.3032 | 1.1648 | 0.8972 | 0.2964 | 0.2962 | 1.5858 |
| | ASP | 0.4690 | 1.3556 | 2.1771 | 0.5722 | 0.5863 | 2.8086 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Zhao, H.; Jiang, Y.; Yang, Y. Robust and Sparse Portfolio: Optimization Models and Algorithms. Mathematics 2023, 11, 4925. https://doi.org/10.3390/math11244925
