Article

Robust Variable Selection Based on Penalized Composite Quantile Regression for High-Dimensional Single-Index Models

College of Science, China University of Petroleum, Qingdao 266580, China
*
Author to whom correspondence should be addressed.
Mathematics 2022, 10(12), 2000; https://doi.org/10.3390/math10122000
Submission received: 23 March 2022 / Revised: 21 May 2022 / Accepted: 23 May 2022 / Published: 10 June 2022
(This article belongs to the Special Issue Statistical Data Modeling and Machine Learning with Applications II)

Abstract

The single-index model is an intuitive extension of the linear regression model. It has become increasingly popular due to its flexibility in modeling. In this work, we focus on the estimation of the parameters and the unknown link function for the single-index model in a high-dimensional situation. The SCAD and Laplace error penalty (LEP)-based penalized composite quantile regression estimators, which can realize variable selection and estimation simultaneously, are proposed, and a practical iterative algorithm is introduced to obtain efficient and robust estimators. The choices of the tuning parameters, the bandwidth, and the initial values are also discussed. Furthermore, under some mild conditions, we establish the large sample properties and the oracle property of the SCAD and Laplace penalized composite quantile regression estimators. Finally, we evaluate the performance of the proposed estimators through two numerical simulations and a real data application.
MSC:
62F12; 62G08; 62G20; 62J07

1. Introduction

As a generalized regression model, the single-index regression model has a wide range of applications in finance, economics, biomedicine, and other fields. The single-index regression model not only avoids the so-called "curse of dimensionality" of nonparametric models, but also significantly improves the efficiency of model estimation and reveals the relationship between the response variable and high-dimensional covariates, thereby retaining the good interpretability of parametric models and the flexibility of nonparametric models simultaneously [1,2,3,4,5]. However, the single-index regression model also inherits the shortcomings of classical regression models. For example, in practical applications, especially under heavy-tailed error distributions, the bounded error variance that the single-index regression model requires is difficult to satisfy. Moreover, in the mean regression setting, the estimation results of a single-index regression model are very sensitive to extreme values. To overcome these drawbacks, robust regression methods are necessary when fitting single-index regression models to real data.
Compared with mean regression, the quantile regression proposed by [6] can measure the effect of the explanatory variables not only on the center of the distribution but also on the upper and lower tails of the response variable. Quantile regression (QR) estimation, which is not restricted by the error distribution and can effectively avoid the impact of outliers, is more robust than least squares estimation. Furthermore, in order to make full use of the information at different quantiles, composite quantile regression (CQR) was proposed by [7]. [8] added the SCAD-L2 penalty to the loss function and proposed a robust variable selection method based on weighted composite quantile regression (WCQR), which makes variable selection insensitive to high-leverage points and outliers. In this article, we study the estimation and variable selection of the single-index quantile regression model, which is specified in the following form
Y = g(X⊤γ) + ε,
where Y is the response variable, X is a d-dimensional covariate vector, γ is an unknown parameter vector, g(·) is an unknown link function, ε is the random error, and the τth conditional quantile of ε is zero, i.e., P(ε ≤ 0 | X) = τ. In order to identify the model, we assume that ‖γ‖ = 1 and that the first component of γ is positive, where ‖·‖ denotes the Euclidean norm.
There are two estimation problems for the single-index quantile regression model: the estimation of the parameters and the estimation of the link function. The study of estimation for single-index quantile regression models began with [9], which generalized the average derivative method. Meanwhile, [10] proposed a simple algorithm for quantile regression in single-index models and proved the asymptotic properties of the estimators. [3] proposed D-vine copula-based quantile regression, a new algorithm that does not require accurately specifying the shape of the conditional quantiles and avoids typical defects of linear models such as multicollinearity. [11] proposed a non-iterative composite quantile regression (NICQR) estimation algorithm for the single-index quantile regression model, which has high computational efficiency and is suitable for analyzing massive data sets.
In real data, the model is often sparse. The candidate covariates inevitably contain a few irrelevant and unnecessary variables when modeling real data, which can degrade the efficiency of the resulting estimation procedure and increase the complexity of the model. In the case of linear models, many authors have considered variable selection via penalized least squares, which allows simultaneous selection of variables and estimation of the regression parameters. Several penalty functions, including the SCAD [12], the adaptive LASSO [13], and the elastic net [14], have been shown to possess favorable theoretical properties: unbiasedness, sparsity, and continuity, which are regarded as the basic properties that a good estimator should enjoy [15]; in particular, the SCAD penalty enjoys the oracle property. [5] combined the SCAD penalty variable selection method with LM-ANN for modeling, making good use of the advantages of SCAD in dimension reduction and the efficiency of LM-ANN in modeling nonlinear relationships.
Similar to the linear regression model, the set of predictors for the single-index quantile regression model can contain a large number of irrelevant variables. Therefore, it is important to select the relevant variables when fitting the single-index quantile regression model. However, the problem of variable selection for the high-dimensional single-index quantile regression model is not well settled in the literature. In recent years, many significant research results have emerged on this variable selection problem. [16] proposed a non-iterative estimation and variable selection method for the single-index quantile regression model, whose key ingredient is obtaining the initial value and the weight of the penalty function via the inverse regression technique. [17] combined least absolute deviations (LAD) and SCAD for single-index models. However, we note that SCAD is a piecewise continuous spline function. Because of this structure, different spline pieces require different derivative formulas, and the matching formula has to be selected for each piece when carrying out the penalized optimization, which adds to the programming complexity. Therefore, [18] proposed a continuous, bounded, smooth penalty function, the Laplace error penalty (LEP), which does not have a piecewise spline structure, and proved its oracle property. LEP is infinitely differentiable everywhere except at the origin and is therefore much smoother than SCAD. Furthermore, LEP can approximate the L0 penalty, which is viewed as the optimal penalty, arbitrarily closely. Moreover, LEP yields a convex objective function under mild conditions, so that it is easier to compute and to obtain a unique optimal solution with the desired properties.
In this paper, we combine the composite quantile regression method with the SCAD penalty and the Laplace error penalty to construct two sparse estimators for the single-index quantile regression model. Our method realizes variable selection and parameter estimation simultaneously. In addition, we prove that the proposed estimators enjoy large sample properties, including √n-consistency and the oracle property. A simulation study shows that our method has some resistance to heavy-tailed errors and outliers and achieves higher accuracy in parameter estimation.
The rest of this paper is organized as follows. In Section 2, the SCAD penalized composite quantile regression and the Laplace penalized composite quantile regression for single-index models are introduced. Furthermore, an iterative algorithm for the single-index model is analyzed, and the selection of the bandwidth, the tuning parameters, and the initial values is discussed. In Section 3, we state the large sample properties of the SCAD and Laplace penalized composite quantile estimators for single-index models. In Section 4, we illustrate our method and algorithm through two numerical simulations and a real data application. Section 5 includes some concluding remarks. Technical proofs and the algorithm based on LEP are relegated to Appendix A and Appendix B, respectively.

2. Problem Setting and Methodology

2.1. Composite Quantile-SCAD Method for Single-Index Models

We assume that {(X_i, Y_i), i = 1, 2, …, n} are n independent samples from the single-index model (1). Note that there are two estimation problems: the estimation of the parameter vector γ and the estimation of the link function g(·). Given an accurate estimate of γ, the link function g(·) can be locally approximated by a linear function
g(X⊤γ) ≈ g(u) + g′(u)(X⊤γ − u) = a + b(X⊤γ − u),
for X⊤γ in a neighborhood of u, where a = g(u) and b = g′(u) are local constants. Namely, based on an accurate estimate of γ, we can obtain good local linear estimates of g(u) and g′(u), which are ĝ(u) = â and ĝ′(u) = b̂, respectively. So our main interest is to estimate the parameter vector. Following [19], we adopt the MAVE estimate of γ, which is obtained by solving the minimization problem
min_{a, b, ‖γ‖=1} Σ_{j=1}^{n} Σ_{i=1}^{n} [Y_i − a_j − b_j(X_i⊤γ − X_j⊤γ)]² w_ij,
where w_ij = k_h(X_i⊤γ − X_j⊤γ) / Σ_{l=1}^{n} k_h(X_l⊤γ − X_j⊤γ), a = (a_1, a_2, …, a_n)⊤, b = (b_1, b_2, …, b_n)⊤, k_h(·) = k(·/h)/h, k(·) is a symmetric kernel function, and h is the bandwidth. [20] combined MAVE and the LASSO to obtain a sparse estimate (sim-lasso) of γ by solving the following minimization problem
min_{a, b, ‖γ‖=1} Σ_{j=1}^{n} Σ_{i=1}^{n} [Y_i − a_j − b_j(X_i⊤γ − X_j⊤γ)]² w_ij + λ Σ_{j=1}^{n} |b_j| Σ_{k=1}^{d} |γ_k|,
where λ is a nonnegative penalty parameter. Note that the above target loss function is the squared loss of the least squares method; naturally, the LAD criterion can be extended to the single-index model as an alternative to the LS method. [17] combined LAD with SCAD to construct a sparse estimator of γ by solving the following minimization problem
min_{a, b, ‖γ‖=1} Σ_{j=1}^{n} Σ_{i=1}^{n} |Y_i − a_j − b_j(X_i⊤γ − X_j⊤γ)| w_ij + Σ_{j=1}^{n} |b_j| Σ_{k=1}^{d} p_λ(|γ_k|),
where p_λ(·) is the SCAD penalty function proposed by [12]; it is defined in terms of its first-order derivative. For θ > 0,
p′_λ(θ) = λ { I(θ ≤ λ) + (aλ − θ)_+ / [(a − 1)λ] · I(θ > λ) },
where a > 2, λ is a nonnegative penalty parameter, and the notation z_+ stands for the positive part of z. The LAD is a special case of quantile regression, corresponding to the quantile 1/2. This motivates us to generalize composite quantile regression to single-index models. Combining the SCAD penalty function with composite quantile regression, we obtain a sparse estimate γ̂_qr.sim.scad of the parameter γ as the solution to the following minimization problem
min_{a, b, ‖γ‖=1} Σ_{j=1}^{n} Σ_{q=1}^{Q} Σ_{i=1}^{n} ρ_τq[Y_i − a_j − b_j(X_i⊤γ − X_j⊤γ)] w_ij + Σ_{j=1}^{n} |b_j| Σ_{k=1}^{d} p_λ(|γ_k|),
where τ_q = q/(Q + 1) ∈ (0, 1) stands for the τ_q-quantile, q = 1, 2, …, Q with Q the number of quantiles, and ρ_τq(z) = τ_q z·I_[0,∞)(z) − (1 − τ_q) z·I_(−∞,0)(z) is the τ_q-quantile loss function. In addition, we assume that the τ_q-quantile of the random error ε is 0; thus, g(X⊤γ) is the conditional τ_q-quantile of the response variable Y. We denote the target function in (7) by Q_λ^S(a, b, γ).
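To make the pieces of (7) concrete, the following R sketch (our own illustration, not code from the paper; the function names are ours) implements the check loss ρ_τ, the first derivative p′_λ of the SCAD penalty with a = 3.7, and the equally spaced quantile levels τ_q = q/(Q + 1).

```r
# Illustrative R sketch (not from the paper): building blocks of (7).
# Check (quantile) loss rho_tau(z) = tau*z*I(z >= 0) - (1 - tau)*z*I(z < 0)
check_loss <- function(z, tau) z * (tau - (z < 0))

# First derivative of the SCAD penalty for theta > 0, with a = 3.7 (Fan and Li, 2001)
scad_deriv <- function(theta, lambda, a = 3.7) {
  lambda * ((theta <= lambda) +
            pmax(a * lambda - theta, 0) / ((a - 1) * lambda) * (theta > lambda))
}

# Equally spaced quantile levels tau_q = q/(Q + 1), q = 1, ..., Q
Q <- 5
tau_seq <- (1:Q) / (Q + 1)
```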

2.2. Composite Quantile–LEP Method for Single-Index Models

The Laplace error penalty function with two tuning parameters was proposed by [18]. Unlike other penalty functions, this penalty is naturally constructed as a bounded smooth function rather than a piecewise spline. Its shape is similar to that of SCAD, but it is much smoother, which prompts us to apply it to composite quantile regression for single-index models. Combining LEP with composite quantile regression, we obtain a sparse estimate γ̂_qr.sim.lep of the parameter γ as the solution to the following minimization problem
min_{a, b, ‖γ‖=1} Σ_{j=1}^{n} Σ_{q=1}^{Q} Σ_{i=1}^{n} ρ_τq[Y_i − a_j − b_j(X_i⊤γ − X_j⊤γ)] w_ij + Σ_{j=1}^{n} |b_j| Σ_{k=1}^{d} P_{λ,κ}(|γ_k|),
where P λ , κ ( · ) is LEP. For θ > 0 ,
P_{λ,κ}(θ) = λ(1 − e^{−θ/κ}),
where λ and κ are two nonnegative tuning parameters regularizing the magnitude of the penalty and controlling the degree of approximation to the L0 penalty, respectively. This penalty is called the Laplace penalty function because the function e^{−θ/κ} has the form of the Laplace density. We denote the target function in (8) by Q_{λ,κ}^S(a, b, γ).
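The following R sketch (again our own illustration, with hypothetical function names) implements the LEP and its first derivative, which is needed for the local linear approximation used in Appendix B; shrinking κ shows how LEP approaches the L0 penalty.

```r
# Illustrative R sketch (not from the paper): the Laplace error penalty and its derivative.
lep <- function(theta, lambda, kappa) lambda * (1 - exp(-theta / kappa))
lep_deriv <- function(theta, lambda, kappa) (lambda / kappa) * exp(-theta / kappa)

# Smaller kappa brings LEP closer to the L0 penalty lambda * I(theta > 0)
curve(lep(x, lambda = 1, kappa = 0.5), from = 0, to = 2, ylab = "penalty")
curve(lep(x, lambda = 1, kappa = 0.1), add = TRUE, lty = 2)
```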

2.3. Computation

Given the initial estimate γ̂, the SCAD penalty function can be locally linearly approximated as follows [21]. For γ̂_j ≠ 0,
p_λ(|γ_j|) ≈ p_λ(|γ̂_j|) + p′_λ(|γ̂_j|)(|γ_j| − |γ̂_j|),
for γ_j ≈ γ̂_j. Removing a few irrelevant terms, (7) can be rewritten as
min_{a, b, ‖γ‖=1} Σ_{j=1}^{n} Σ_{q=1}^{Q} Σ_{i=1}^{n} ρ_τq[Y_i − a_j − b_j(X_i⊤γ − X_j⊤γ)] w_ij + Σ_{j=1}^{n} |b_j| Σ_{k=1}^{d} p′_λ(|γ̂_k|)|γ_k|.
We denote the target function in (11) by Q_λ^{S*}(a, b, γ). Note that Q_λ^{S*} is scale-invariant in γ; hence, when minimizing Q_λ^{S*}, we restrict γ to unit length, ‖γ‖ = 1, by normalizing γ.
In order to obtain an accurate estimate of γ and g ( · ) , we introduce a new iterative algorithm. Then our estimation procedure is described in detail as follows:
Step 0.
Obtain an initial estimate of γ. Standardize the initial estimate γ̂ such that ‖γ̂‖ = 1 and γ̂_1 > 0.
Step 1.
Given an estimate γ ^ , we obtain { a ^ j , b ^ j , j = 1 , 2 , , n } by solving
min_{(a_j, b_j)} Σ_{i=1}^{n} Σ_{q=1}^{Q} ρ_τq[Y_i − a_j − b_j(X_i⊤γ̂ − X_j⊤γ̂)] w_ij + |b_j| Σ_{k=1}^{d} p′_λ(|γ̂_k|)|γ̂_k| = min_{(a_j, b_j)} Σ_{i=1}^{n+1} Σ_{q=1}^{Q} ρ[Y_i* − (A, B)(a_j, b_j)⊤] w_ij*,
where h is the optimal bandwidth, (ρ, Y_i*, A, B, w_ij*) = (ρ_τq, Y_i, 1, X_i⊤γ̂ − X_j⊤γ̂, w_ij) for i = 1, 2, …, n, and (ρ, Y_i*, A, B, w_ij*) = (1/Q, 0, 0, Σ_{k=1}^{d} p′_λ(|γ̂_k|)|γ̂_k|, 1) for i = n + 1. The rq(·) function in the R package "quantreg" can be used to obtain {â_j, b̂_j, j = 1, 2, …, n}. Moreover, for the SCAD penalty, we can apply a difference-of-convex algorithm [22] to simplify the computation.
Step 2.
Given { a ^ j , b ^ j , j = 1 , 2 , , n } , update γ ^ by solving
min_γ Σ_{j=1}^{n} Σ_{q=1}^{Q} Σ_{i=1}^{n} ρ_τq[Y_i − â_j − b̂_j(X_i⊤γ − X_j⊤γ)] w_ij + Σ_{j=1}^{n} |b̂_j| Σ_{k=1}^{d} p′_λ(|γ̂_k|)|γ_k|.
If d is very large, we can apply a fast and efficient coordinate descent algorithm [23], or combine it with the MM algorithm [24] to reduce the computation.
Step 3.
Scale b̂ ← sgn(γ̂_1)·‖γ̂‖·b̂ and γ̂ ← sgn(γ̂_1)·γ̂/‖γ̂‖.
Step 4.
Continue Step 1–Step 3 until convergence.
Step 5.
Given the final estimate γ̂ from Step 4, we estimate g(·) at any point u by ĝ(u; h, γ̂) = â, where
(â, b̂) = arg min_{(a, b)} Σ_{q=1}^{Q} Σ_{i=1}^{n} ρ_τq[Y_i − a − b(X_i⊤γ̂ − u)] k_h(X_i⊤γ̂ − u).
Remark 1.
The above algorithm is designed for the SCAD penalty function. Similarly, we can obtain the corresponding algorithm for the Laplace penalty function by replacing SCAD with LEP.
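As a minimal illustration of the local weighted quantile fits that appear in Steps 1 and 5, the following R sketch fits a local linear quantile regression of Y on the centered index at a point u for a single quantile level τ, using the rq() function from the "quantreg" package with Gaussian kernel weights. It is a simplified sketch (single quantile, no penalty term), not the full composite algorithm; the toy data and function names are ours.

```r
# Illustrative R sketch (not from the paper): local linear quantile fit of g at a
# point u for one quantile level tau, given a current index estimate gamma_hat.
library(quantreg)                       # provides rq()

local_linear_qfit <- function(y, index, u, tau, h) {
  v <- index - u                        # centered index values X_i' gamma_hat - u
  w <- dnorm(v / h) / h                 # Gaussian kernel weights k_h(.)
  fit <- rq(y ~ v, tau = tau, weights = w)
  est <- coef(fit)
  c(a_hat = unname(est[1]),             # estimate of g(u)
    b_hat = unname(est[2]))             # estimate of g'(u)
}

# Toy usage with made-up data (for illustration only)
set.seed(1)
x <- matrix(rnorm(200 * 3), 200, 3)
gamma_hat <- c(1, 1, 0) / sqrt(2)
y <- drop(sin(x %*% gamma_hat)) + rnorm(200, sd = 0.2)
local_linear_qfit(y, drop(x %*% gamma_hat), u = 0, tau = 0.5, h = 0.3)
```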

2.4. The Selections of Bandwidth, Tuning Parameters, and Initial Value

The selection of the bandwidth plays a crucially important role in local polynomial smoothing because it controls the curvature of the fitted function. Cross-validation (CV) and generalized cross-validation (GCV) can be utilized to choose a proper bandwidth, but these methods are not computationally practical due to the large amount of calculation. For local linear quantile regression, [25] obtained an approximately optimal bandwidth under suitable assumptions and derived the rule-of-thumb bandwidth h_τ = h_m {τ(1 − τ)/ψ²(Φ^{−1}(τ))}^{1/5}, where ψ(·) and Φ(·) are the probability density function and the cumulative distribution function of the standard normal distribution, respectively, and h_m is the optimal bandwidth of least squares regression. There are many algorithms for the selection of h_m. [26] found that the rule-of-thumb bandwidth h_m = {4/(d + 2)}^{1/(4+d)} n^{−1/(4+d)} works fairly well in single-index models, where d is the dimension of the kernel function. We combine a multiplier depending only on τ with the optimal bandwidth h_m of the LS regression to obtain an h_τ with good properties.
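A sketch of these rule-of-thumb bandwidths in R, assuming ψ and Φ above are the standard normal density and distribution functions:

```r
# Illustrative R sketch (not from the paper): rule-of-thumb bandwidths, assuming
# psi and Phi in the text are the standard normal density and distribution functions.
rule_of_thumb_h <- function(n, d = 1, tau = 0.5) {
  h_m <- (4 / (d + 2))^(1 / (4 + d)) * n^(-1 / (4 + d))   # LS bandwidth of [26]
  h_m * (tau * (1 - tau) / dnorm(qnorm(tau))^2)^(1 / 5)   # quantile adjustment of [25]
}
rule_of_thumb_h(n = 200, d = 1, tau = 0.5)
```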
There are several selection methods for SCAD's nonnegative tuning parameter λ, such as CV, GCV, AIC, and BIC. Following [27], we utilize the following BIC-type criterion to choose a proper tuning parameter for SCAD in this paper:
BIC(λ) = (1/σ̃) Σ_{i=1}^{n} Σ_{q=1}^{Q} ρ_τq[Y_i − g(X_i⊤γ̂(λ))] + log(n)·df/2,
where σ̃ = (1/n) Σ_{i=1}^{n} Σ_{q=1}^{Q} ρ_τq[Y_i − g(X_i⊤γ̃)], with γ̃ being the composite quantile estimator without penalty, and df is the number of non-zero coefficients of γ̂(λ). We then choose the optimal tuning parameter by minimizing the above criterion. Moreover, for LEP, we utilize the extended Bayesian information criterion (EBIC) [28] to choose the tuning parameters λ and κ:
EBIC(λ, κ) = log(σ̂²) + [log(n) + log log(d)]·df/n,
where σ̂² = (1/(n − 1)) Σ_{i=1}^{n} Σ_{q=1}^{Q} ρ_τq[Y_i − g(X_i⊤γ̂)] and df is the number of non-zero coefficients of γ̂(λ, κ). Similarly, the tuning parameters are selected by minimizing the above criterion over a grid of (λ, κ) values.
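The tuning-parameter search can be organized as a simple grid search over the criterion. The following R sketch is schematic: fit_sim_cqr is a hypothetical placeholder for the full iterative procedure of Section 2.3, assumed to return the fitted composite quantile loss and the number of non-zero index coefficients.

```r
# Schematic R sketch (not from the paper): BIC-type selection of lambda as in the
# criterion above. `fit_sim_cqr` is a hypothetical routine that runs the algorithm of
# Section 2.3 for a given lambda and returns list(loss = composite loss, df = #nonzero).
select_lambda <- function(lambda_grid, fit_sim_cqr, sigma_tilde, n) {
  bic <- sapply(lambda_grid, function(lam) {
    fit <- fit_sim_cqr(lam)
    fit$loss / sigma_tilde + log(n) * fit$df / 2
  })
  lambda_grid[which.min(bic)]
}
```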
The initial value of the unknown parameter is required at the beginning of the iteration of our algorithm. A convenient choice is γ̂/‖γ̂‖, where γ̂ is the composite quantile estimator without penalty.

3. Theoretical Properties

A good estimator is supposed to satisfy unbiasedness, continuity, and the so-called oracle property. Thus, in this section, we discuss the large sample properties of the SCAD penalized composite quantile regression and the Laplace penalized composite quantile regression for single-index models. We consider data {(X_i, Y_i), i = 1, 2, …, n} consisting of n observations from model (1), as in Section 2. Moreover, let X_i = (X_i1⊤, X_i2⊤)⊤, γ = (γ_1⊤, γ_2⊤)⊤, X_i1 ∈ ℝ^s, X_i2 ∈ ℝ^{d−s}. In addition, γ_0 = (γ_10⊤, γ_20⊤)⊤ stands for the true regression parameter of model (1) with ‖γ_0‖ = 1, where the s components of γ_10 are non-zero. We suppose the following regularity conditions hold:
(i)
The density function of X⊤γ is positive and uniformly continuous for γ in a neighborhood of γ_0. Further, the density of X⊤γ_0 is continuous and bounded away from 0 and ∞ on its support D.
(ii)
The function g ( · ) has a continuous and bounded second derivative in D.
(iii)
The kernel function k ( · ) is a symmetric density function with bounded support and a bounded first derivative.
(iv)
The density function f_Y(·) of Y is continuous and has a bounded derivative; it is bounded away from 0 and ∞ on its compact support.
(v)
The following expectations exist:
C_0 = E{ g′(X⊤γ_0)² [X − E(X|X⊤γ_0)][X − E(X|X⊤γ_0)]⊤ },
C_1 = E{ f_Y(g(X⊤γ_0)) g′(X⊤γ_0)² [X − E(X|X⊤γ_0)][X − E(X|X⊤γ_0)]⊤ }.
(vi)
h → 0 and nh → ∞.
Given (â_j, b̂_j), let H = Σ_{j=1}^{n} |b̂_j|, a_n = max{ p′_λ(|γ_0j|) : γ_0j ≠ 0 }, and let γ̂_qr.sim.scad be the solution of (7). We note that, under condition (ii), the first derivative g′ is bounded; thus, H = O_P(n).
Theorem 1.
Assume conditions (i)–(v) hold. If max{ p″_λ(|γ_0k|) : γ_0k ≠ 0 } → 0 and a_n = O(n^{−1/2}), then there exists a local minimizer of (7) such that ‖γ̂_qr.sim.scad − γ_0‖ = O_P(n^{−1/2} + a_n), with ‖γ̂_qr.sim.scad‖ = ‖γ_0‖ = 1.
According to Theorem 1, there exists a √n-consistent SCAD penalized composite quantile regression estimator of γ if a proper tuning parameter λ is selected. Let c_n = ( p′_λ(|γ_01|)sgn(γ_01), …, p′_λ(|γ_0s|)sgn(γ_0s) )⊤ and Σ_λ = diag( p″_λ(|γ_01|), …, p″_λ(|γ_0s|) ).
Lemma 1.
Assume the same conditions as in Theorem 1 and that
lim inf_{n→∞} lim inf_{θ→0+} p′_{λ_n}(θ)/λ_n > 0.
If λ → 0 and √H·λ → ∞ as n → ∞, then, with probability tending to 1, for any given γ_1 satisfying ‖γ_1 − γ_10‖ = O_P(n^{−1/2}) and any constant C, we have
Q_λ^S((γ_1⊤, 0⊤)⊤) = min_{‖γ_2‖ ≤ C n^{−1/2}} Q_λ^S((γ_1⊤, γ_2⊤)⊤).
Theorem 2.
Assume the same conditions as in Theorem 1 and that the penalty function p_λ(|θ|) satisfies condition (17). If λ → 0 and √H·λ → ∞, then, with probability tending to 1, the √n-consistent local minimizer γ̂_qr.sim.scad = ((γ̂_1^qr.sim.scad)⊤, (γ̂_2^qr.sim.scad)⊤)⊤ in Theorem 1 must satisfy:
(i) Sparsity: γ ^ 2 q r . s i m . s c a d = 0 .
(ii) Asymptotic normality:
√n { ( Q C_11 + H Σ_λ/n )( γ̂_1^qr.sim.scad − γ_10 ) + H c_n/n } →_D N( 0, 0.25 Q² C_01 ),
where C 11 is the top-left s-by-s sub-matrix of C 1 and C 01 is the top-left s-by-s sub-matrix of C 0 .
Theorem 2 shows that the SCAD penalized composite quantile regression estimator has the so-called oracle property when λ → 0 and √H·λ → ∞.
Remark 2.
In this section, we discuss the large sample properties of the SCAD penalized composite quantile estimator ( γ ^ q r . s i m . s c a d ) in detail. Similarly, we can also show the large sample properties of the Laplace penalized composite quantile estimator ( γ ^ q r . s i m . l e p ).

4. Numerical Studies

4.1. Simulation Studies

In this section, we evaluate the proposed estimators by simulation studies and compare the performance of different penalized estimators with the oracle estimator. In particular, in order to reduce the computational burden, we take the Gaussian kernel as the kernel function in our simulations. Moreover, we do not tune the parameter a and set a = 3.7, as suggested by [12] for the SCAD penalty, and we set the number of quantiles to Q = 5. Next, we compare the performance of the following four estimators for the single-index model:
  • lad.sim.scad: the LAD estimators with the SCAD penalty;
  • cqr.sim.scad: the composite quantile estimators with the SCAD penalty;
  • cqr.sim.lep: the composite quantile estimators with the Laplace error penalty (LEP);
  • Oracle: the oracle estimators (composite quantile regression without penalty under the true model).
In order to evaluate the performances of the above estimators, we consider the following criteria:
  • MAD (the mean absolute deviation) of γ̂: MAD = (1/n) Σ_{i=1}^{n} |X_i⊤γ̂ − X_i⊤γ_0|.
  • NC: the average number of non-zero coefficients that are correctly estimated to be non-zero.
  • NIC: the average number of zero coefficients that are incorrectly estimated to be non-zero.
Additionally, an estimated coefficient is treated as 0 if its absolute value is smaller than 10^{−6}. A sketch of these criteria is given below.
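The following small R sketch (ours, for illustration) computes the three criteria for a single replicate:

```r
# Illustrative R sketch (not from the paper): evaluation criteria for one replicate.
eval_criteria <- function(X, gamma_hat, gamma0, tol = 1e-6) {
  mad <- mean(abs(X %*% gamma_hat - X %*% gamma0))   # mean absolute deviation of the index
  nc  <- sum(abs(gamma_hat) >= tol & gamma0 != 0)    # true non-zeros correctly kept
  nic <- sum(abs(gamma_hat) >= tol & gamma0 == 0)    # true zeros incorrectly kept
  c(MAD = mad, NC = nc, NIC = nic)
}
```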
Scenario 1. We assume that the single-index model has the following form:
Y = 2(X⊤γ_0) + 10 exp((X⊤γ_0)²/5) + ε,
where γ_0 = γ/‖γ‖ with γ = (1, 1, 2, 0.5, 0, …, 0)⊤ being a 15-dimensional vector with only four non-zero components (the true coefficients). The covariates are generated from a multivariate normal distribution with the correlation between X_i and X_j set to 0.5^|i−j|, and the response Y is generated from the above model. Then, to examine the impact of different error distributions, we consider the following five cases (a data-generation sketch is given after the list):
  • N ( 0 , 1 ) : the standard normal distribution (N);
  • t ( 3 ) : the t-distribution with 3 degrees of freedom;
  • DE: the double exponential distribution with median 0 and scale parameter 1/2;
  • CN: the contaminated normal distribution 0.9 N(0, 1) + 0.1 N(0, 25);
  • Outlier: an outlier case, in which 10% of the responses are shifted by a constant c = 5.
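The following R sketch (ours, for illustration) generates one replicate of Scenario 1 under the standard normal error; the other error settings are obtained by changing how eps is drawn.

```r
# Illustrative R sketch (not from the paper): one replicate of Scenario 1
# with N(0,1) errors; change how `eps` is drawn for the other error settings.
set.seed(2022)
n <- 100; d <- 15
gamma  <- c(1, 1, 2, 0.5, rep(0, d - 4))
gamma0 <- gamma / sqrt(sum(gamma^2))             # normalize to unit length

Sigma <- 0.5^abs(outer(1:d, 1:d, "-"))           # correlation 0.5^|i-j| between covariates
X <- MASS::mvrnorm(n, mu = rep(0, d), Sigma = Sigma)

index <- drop(X %*% gamma0)
eps <- rnorm(n)                                  # e.g., rt(n, 3) for the t(3) setting
Y <- 2 * index + 10 * exp(index^2 / 5) + eps     # link function of Scenario 1
```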
In order to perform the simulations, we generated 200 replicates with moderate sample sizes n = 100 and 200. The corresponding results are reported in Table 1.
Scenario 2. The model set-up is similar to Scenario 1, except that the link function is X⊤γ_0 + 4|X⊤γ_0 + 1|. This link function was also analyzed by [20]. Table 2 summarizes the corresponding results.
From Table 1 and Table 2, we note that, for all five error distributions, cqr.sim.lep performs best, cqr.sim.scad second, and lad.sim.scad worst. This is consistent with our theoretical findings. Furthermore, all estimators improve as the sample size increases.

4.2. Real Data Application: Boston Housing Data

In this section, the methods are illustrated through an analysis of the Boston housing data. The data (506 observations and 14 variables) are available in the R package "MASS", and the definitions of the dependent variable (MEDV) and the explanatory variables are given in Table 3. We checked the data for missing values and examined each variable through violin plots. Figure 1 and Figure 2 show the violin plots of the first seven and last seven variables, respectively. It is obvious from Figure 1 and Figure 2 that there are clear outliers in the CRIM and B columns. To examine the linear relationships among the variables, the heat map between the variables is given in Figure 3. It can be seen from the heat map that the variables RM, PTRATIO, LSTAT, and MEDV have certain correlations, and the correlation coefficients between INDUS and NOX, CRIM and RAD, RAD and TAX, and INDUS and TAX are 0.7, 0.8, 0.9, and 0.7, respectively. Therefore, there is high correlation among the variables, so the single-index regression model can be considered.
The Boston housing data have been utilized in many regression studies, and the potential relationship between MEDV and the explanatory variables has been investigated [10,17]. For single-index quantile regression, [10] introduced a practical algorithm in which the unknown link function g(·) is estimated by local linear quantile regression and the parametric index is estimated through linear quantile regression; however, variable selection was not considered. For single-index regression models, [17] considered penalized LAD regression, which deals with variable selection and estimation simultaneously. However, the LAD estimator is only the special case of the quantile estimator with τ = 0.5, and the two studies mentioned above both use a single quantile level. In this article, we construct new sparse estimators for single-index quantile regression models based on the composite quantile regression method combined with the SCAD penalty and the Laplace error penalty.
Due to the sparsity of data in the regions concerned, the estimated quantile curves may cross at both tails, similar to [14]. The results of the real data example and the simulation studies confirm the reasonableness and effectiveness of our method in practice.
In order to improve the numerical performance, we standardize the response variable MEDV and all predictor variables except CHAS before applying our method. An estimated coefficient is treated as 0 if its absolute value is smaller than 10^{−12}. The corresponding results are reported in Table 4.
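A minimal R sketch of this preprocessing step, using the Boston data from the "MASS" package (column names are lowercase in that package; the helper name is ours):

```r
# Illustrative R sketch (not from the paper): Boston housing data preprocessing.
library(MASS)                                    # provides the Boston data set
data(Boston)

y <- drop(scale(Boston$medv))                    # standardized response MEDV
keep <- setdiff(names(Boston), c("medv", "chas"))
X <- cbind(scale(Boston[, keep]), chas = Boston$chas)  # standardize all predictors except CHAS

# Treat estimated coefficients with |value| < 1e-12 as exactly zero
threshold_zero <- function(coefs, tol = 1e-12) ifelse(abs(coefs) < tol, 0, coefs)
```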
From Table 4, we see that all methods achieve variable selection and parameter estimation simultaneously in this real problem. Moreover, all methods select the sparse model including RM, DIS, PTRATIO, and LSTAT, the same as the fit using all predictors without penalty (cqr.sim), and the estimates of cqr.sim.lep are the closest to those of cqr.sim. These results indicate that only four explanatory variables are significant and the rest are irrelevant.

5. Conclusions

In this article, we propose SCAD and Laplace penalized composite quantile regression estimators for single-index models in the high-dimensional case. Compared with the least squares method, composite quantile regression yields estimators that are robust to heavy-tailed error distributions and outliers. A practical iterative algorithm was then introduced; it is based on composite quantile regression and uses local linear smoothing to estimate the unknown link function. The method realizes variable selection and parameter estimation simultaneously by combining the two penalty functions with composite quantile regression. In addition, we proved that the proposed estimators enjoy large sample properties, including √n-consistency and the oracle property. Furthermore, the estimators were evaluated and illustrated by numerical studies. Moreover, from the Boston housing data we can draw the following conclusion: all three estimation methods select the sparse model with the same significant variables, and the estimates of cqr.sim.lep are the closest to those of cqr.sim, which suggests that applying the LEP penalty changes the estimated link function very little compared with the unpenalized fit.

Author Contributions

Formal analysis, Z.L.; Methodology, Y.S.; Software, M.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the NNSF project (61503412) of China and the NSF project (ZR2019MA016) of the Shandong Province of China.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Theorems

Proof of Theorem 1.
Given (â_j, b̂_j), we first define some additional notation. Let α_n = n^{−1/2} + a_n, Y_ij = Y_i − â_j − b̂_j X_ij⊤γ_0 with X_ij = X_i − X_j, and γ = γ_0 + α_n u, where u is a d-dimensional vector. Let S(u) = Σ_{i,j,q} ρ_τq[Y_ij − α_n b̂_j X_ij⊤u] w_ij − Σ_{i,j,q} ρ_τq(Y_ij) w_ij, and set ‖u‖ = C, where C is a large enough constant.
Our purpose is to prove that γ̂_qr.sim.scad is √n-consistent; that is, to show that for any given ε > 0 and large n, there is a large enough constant C such that
P{ inf_{‖u‖=C} Q_λ^S(γ_0 + α_n u) > Q_λ^S(γ_0) } ≥ 1 − ε,
which implies that there exists a local minimizer γ̂ in the ball {γ_0 + α_n u : ‖u‖ ≤ C} such that ‖γ̂ − γ_0‖ = O_P(α_n) with probability at least 1 − ε.
Let
D_n(u) = Q_λ^S(γ_0 + α_n u) − Q_λ^S(γ_0) ≥ S(u) + H Σ_{k=1}^{s} [ p_λ(|γ_0k + α_n u_k|) − p_λ(|γ_0k|) ] =: I + II.
First, we consider I and partition it into I 1 and I 2 . We have
I = Σ_{i,j,q} ρ_τq[Y_ij − b̂_j X_ij⊤ α_n u] w_ij − Σ_{i,j,q} ρ_τq(Y_ij) w_ij = Σ_{i,j,q} w_ij b̂_j X_ij⊤ α_n u [ I(Y_ij < b̂_j X_ij⊤ α_n u) − τ_q ] + Σ_{i,j} Q w_ij Y_ij I( b̂_j X_ij⊤ α_n u < Y_ij < 0 ) = I_1 + I_2.
Obviously,
I_1 ≤ Σ_{i,j} Q w_ij ‖b̂_j X_ij‖ α_n ‖u‖ = n α_n ‖u‖ · ( Σ_{i,j} Q w_ij ‖b̂_j X_ij‖ / n ).
Note that Σ_{i,j} Q w_ij ‖b̂_j X_ij‖ / n = O_P(1), which follows from the proof of Theorem 2 of [10]. Therefore, I_1 = O_P(n α_n)‖u‖ = O_P(n α_n²)‖u‖. Next, we compute the mean and variance of I_2 directly.
E I_2 = Σ_{i,j} Q w_ij ∫_{−∞}^{+∞} y · I( b̂_j X_ij⊤ α_n u < y < 0 ) f_{Y_ij}(y) dy = Σ_{i,j} Q w_ij ∫_{b̂_j X_ij⊤ α_n u}^{0} y f_{Y_ij}(y) dy = Σ_{i,j} Q w_ij · P( b̂_j X_ij⊤ α_n u < Y_ij < 0 ) = Σ_{i,j} Q w_ij P_ij,
where f_{Y_ij}(·) is the probability density function of Y_ij and P_ij stands for P( b̂_j X_ij⊤ α_n u < Y_ij < 0 ). Note that E I_2 > 0 and, furthermore, E I_2 does not vanish. Moreover,
Var(I_2) = Σ_{i,j} Q w_ij · E[ Y_ij · I( b̂_j X_ij⊤ α_n u < Y_ij < 0 ) − P_ij ]² ≤ Σ_{i,j} Q w_ij ( max_{i,j} |Y_ij| P_ij − P_ij² ) = Σ_{i,j} Q w_ij P_ij ( max_{i,j} |Y_ij| − P_ij ) → 0.
In addition, by taking a Taylor expansion of P_λ(|γ_k|) and using the basic inequality, we obtain
II = H Σ_{k=1}^{s} [ P_λ(|γ_0k + α_n u_k|) − P_λ(|γ_0k|) ] = H Σ_{k=1}^{s} [ α_n P′_λ(|γ_0k|) sgn(γ_0k) u_k + 0.5 α_n² P″_λ(|γ_0k|) u_k² ] ≤ √s H α_n a_n ‖u‖ + H max_{1≤k≤s} P″_λ(|γ_0k|) α_n² ‖u‖².
According to the condition H = O_P(n), max{ P″_λ(|γ_0k|) : γ_0k ≠ 0 } → 0, and ‖u‖ = C, D_n(u) in (A2) is dominated by I_2, which is positive. Thus, we prove (A1). □
Proof of Lemma 1.
Since γ = γ_0 + α_n u, let u_1 = α_n^{−1}(γ_1 − γ_01), u_2 = α_n^{−1}(γ_2 − γ_02), and u = (u_1⊤, u_2⊤)⊤. A direct computation gives
Q_λ^S((γ_1⊤, 0⊤)⊤) − Q_λ^S((γ_1⊤, γ_2⊤)⊤) = S((u_1⊤, 0⊤)⊤) − S((u_1⊤, u_2⊤)⊤) − H Σ_{k=s+1}^{d} P_λ(|γ_k|).
According to the proof of Theorem 1 and ‖u‖ = O(1), we have I_1 = O_P(n α_n²) and I_2 = O_P(1); thus, S((u_1⊤, u_2⊤)⊤) = I = O_P(1). Similarly, S((u_1⊤, 0⊤)⊤) = O_P(1). By the mean value theorem and P_λ(0) = 0, we obtain the following inequality
H Σ_{k=s+1}^{d} P_λ(|γ_k|) = H Σ_{k=s+1}^{d} P′_λ(|γ_k*|) |γ_k| ≥ H λ ( lim inf_{λ→0} lim inf_{θ→0+} P′_λ(θ)/λ ) Σ_{k=s+1}^{d} |γ_k|,
where 0 < |γ_k*| < |γ_k| (k = s+1, …, d). Since √H·λ → ∞, we have Hλ = √H(√H·λ), which is of higher order than O_P(√H); hence, the last term of (A8) dominates in magnitude. As a result, Q_λ^S((γ_1⊤, 0⊤)⊤) − Q_λ^S((γ_1⊤, γ_2⊤)⊤) < 0 for large n. This proves Lemma 1. □
Proof of Theorem 2.
(i)
Follows from Lemma 1.
(ii)
By partitioning the vector u = (u_1⊤, u_2⊤)⊤ and using P_λ(0) = 0 and (A2), we have
D_n((u_1⊤, 0⊤)⊤) = S((u_1⊤, 0⊤)⊤) + H Σ_{k=1}^{s} [ P_λ(|γ_0k + α_n u_k|) − P_λ(|γ_0k|) ] = S((u_1⊤, 0⊤)⊤) + P_λ(u_1),
where P_λ(u_1) = H Σ_{k=1}^{s} [ P_λ(|γ_0k + α_n u_k|) − P_λ(|γ_0k|) ]. Moreover, by a Taylor expansion, P_λ(u_1) can be rewritten as
P_λ(u_1) = H α_n c_n⊤ u_1 + (1/2) H α_n² u_1⊤ Σ_λ u_1.
Let
δ* = â_j + b̂_j X_{1ij}⊤(γ_01 + α_n û_1),   δ_1* = â_j + b̂_j X_{1ij}⊤ γ_01.
In order to find the minimizer û_1 of D_n((u_1⊤, 0⊤)⊤), we compute its derivative and set D_n′((u_1⊤, 0⊤)⊤) = 0, which yields the following equation:
Σ_{i,j,q} w_ij b̂_j X_{1ij} α_n [ τ_q − I( Y_i − â_j − b̂_j X_{1ij}⊤(γ_01 + α_n û_1) < 0 ) ] + H α_n c_n + H α_n² Σ_λ û_1 = 0.
That is,
(1/n) Σ_{i,j,q} b̂_j X_{1ij} w_ij [ I(Y_i < δ*) − τ_q ] = (1/n) [ H c_n + H α_n Σ_λ û_1 ].
Let
Z_1 = n^{−1/2} Σ_{i,j,q} b̂_j X_{1ij} w_ij [ I(Y_i < δ_1*) − τ_q ],
B_1 = n^{−1} Σ_{i,j,q} b̂_j X_{1ij} w_ij [ F_Y(δ_1*) − F_Y(δ*) ],
B_2 = n^{−1} Σ_{i,j,q} b̂_j X_{1ij} w_ij { [ I(Y_i < δ_1*) − I(Y_i < δ*) ] − [ F_Y(δ_1*) − F_Y(δ*) ] },
where F Y ( · ) is the cumulative distribution function of Y. Therefore, we have
(1/n) Σ_{i,j,q} b̂_j X_{1ij} w_ij [ I(Y_i < δ*) − τ_q ] = (1/√n) Z_1 + B_1 + B_2.
By taking the Taylor’s expansion for F Y ( · ) , we can obtain that
B_1 = Q α_n n^{−1} Σ_{i,j} b̂_j² f_Y(δ_1*) w_ij X_{1ij} X_{1ij}⊤ û_1 ≈ Q α_n C_11 û_1,
where f_Y(·) is the probability density function of Y. According to the direct calculation of the mean and variance as in [15], we have B_2 = o_P(Q/√n) = o_P(1/√n). Moreover, combining (A14), (A15), and û_1 = α_n^{−1}(γ̂_1 − γ_01), (A13) can be rewritten in the following form:
√n { ( Q C_11 + (H/n) Σ_λ )( γ̂_1 − γ_01 ) + (H/n) c_n } = Z_1 + o_P(1).
Note that Z_1 →_D Q · N(0, 0.25 C_01). We can then obtain
√n { ( Q C_11 + (H/n) Σ_λ )( γ̂_1 − γ_01 ) + (H/n) c_n } →_D N( 0, 0.25 Q² C_01 ).
Thus, we prove Theorem 2. □
Remark A1.
Above, we prove the √n-consistency and oracle property of the SCAD penalized composite quantile estimator γ̂_qr.sim.scad. Similarly, we can show the same properties for the Laplace penalized composite quantile estimator γ̂_qr.sim.lep.

Appendix B. The Algorithm Based on LEP

Similar to the SCAD penalty function, by the local linear approximation of LEP and removal of a few irrelevant terms, (8) can be rewritten as
min_{a, b, ‖γ‖=1} Σ_{j=1}^{n} Σ_{q=1}^{Q} Σ_{i=1}^{n} ρ_τq[Y_i − a_j − b_j(X_i⊤γ − X_j⊤γ)] w_ij + Σ_{j=1}^{n} |b_j| Σ_{k=1}^{d} P′_{λ,κ}(|γ̂_k|)|γ_k|.
We denote the target function in (A18) by Q λ , κ S * ( a , b , γ ) . Moreover, the iterative algorithm based on LEP is as follows.
Step 0.
We obtain an initial estimate of γ and standardize it such that ‖γ̂‖ = 1 and γ̂_1 > 0.
Step 1.
Given an estimate γ ^ , we obtain { a ^ j , b ^ j , j = 1 , 2 , , n } by solving
min_{(a_j, b_j)} Σ_{i=1}^{n} Σ_{q=1}^{Q} ρ_τq[Y_i − a_j − b_j(X_i⊤γ̂ − X_j⊤γ̂)] w_ij + |b_j| Σ_{k=1}^{d} P′_{λ,κ}(|γ̂_k|)|γ̂_k| = min_{(a_j, b_j)} Σ_{i=1}^{n+1} Σ_{q=1}^{Q} ρ[Y_i* − (A, B)(a_j, b_j)⊤] w_ij*,
where h is the optimal bandwidth, (ρ, Y_i*, A, B, w_ij*) = (ρ_τq, Y_i, 1, X_i⊤γ̂ − X_j⊤γ̂, w_ij) for i = 1, 2, …, n, and (ρ, Y_i*, A, B, w_ij*) = (1/Q, 0, 0, Σ_{k=1}^{d} P′_{λ,κ}(|γ̂_k|)|γ̂_k|, 1) for i = n + 1.
Step 2.
Given { a ^ j , b ^ j , j = 1 , 2 , , n } , update γ ^ by solving
min_γ Σ_{j=1}^{n} Σ_{q=1}^{Q} Σ_{i=1}^{n} ρ_τq[Y_i − â_j − b̂_j(X_i⊤γ − X_j⊤γ)] w_ij + Σ_{j=1}^{n} |b̂_j| Σ_{k=1}^{d} P′_{λ,κ}(|γ̂_k|)|γ_k|.
Step 3.
Scale b̂ ← sgn(γ̂_1)·‖γ̂‖·b̂ and γ̂ ← sgn(γ̂_1)·γ̂/‖γ̂‖.
Step 4.
Continue Step 1–Step 3 until convergence.
Step 5.
Given the final estimate γ̂ from Step 4, we estimate g(·) at any point u by ĝ(u; h, γ̂) = â, where
(â, b̂) = arg min_{(a, b)} Σ_{q=1}^{Q} Σ_{i=1}^{n} ρ_τq[Y_i − a − b(X_i⊤γ̂ − u)] k_h(X_i⊤γ̂ − u).

References

  1. Kuruwita, C.N. Variable selection in the single-index quantile regression model with high dimensional covariates. Commun. Stat.-Simul. Comput. 2021, 1–13. [Google Scholar] [CrossRef]
  2. Sara, M.; Amena, U.; Faridoon, K.; Mohammed, N.A.; Mohammed, A.; Sanaa, A. Comparison of weighted lag adaptive LASSO with Autometrics for Covariate Selection and forecasting using time-series data. Complexity 2022, 2022, 2649205. [Google Scholar]
  3. Kraus, D.; Czado, C. D-vine copula based quantile regression. Comput. Stat. Data Anal. 2017, 110, 1–18. [Google Scholar] [CrossRef] [Green Version]
  4. Imtiaz, S.; Abdul, G.; Abdollah, A.M. The COVID-19 pandemic and speculation in energy, precious metals, and agricultural futures. J. Behav. Exp. Financ. 2021, 30, 100498. [Google Scholar]
  5. Mozafari, Z.; Arab Chamjangali, M.; Arashi, M.; Goudarzi, N. Performance of smoothly clipped absolute deviation as a variable selection method in the artificial neural network based QSAR studies. J. Chemom. 2021, 35, e3338. [Google Scholar] [CrossRef]
  6. Koenker, R.; Bassett, G. Regression quantiles. Econometrica 1978, 46, 33–50. [Google Scholar] [CrossRef]
  7. Zou, H.; Yuan, M. Composite quantile regression and the oracle model selection theory. Ann. Stat. 2008, 36, 1108–1126. [Google Scholar] [CrossRef]
  8. Cao, Z.; Kang, X.; Wang, M. Doubly robust weighted composite quantile regression based on SCAD-L2. Can. J. Stat. 2021. [Google Scholar] [CrossRef]
  9. Chaudhuri, P.; Doksum, K.; Samarov, A. On average derivative quantile regression. Ann. Stat. 1997, 25, 715–744. [Google Scholar] [CrossRef]
  10. Wu, T.Z.; Yu, K.; Yu, Y. Single-index quantile regression. J. Multivar. Anal. 2010, 101, 1607–1621. [Google Scholar] [CrossRef] [Green Version]
  11. Jiang, R.; Yu, K. Single-index composite quantile regression for massive data. J. Multivar. Anal. 2020, 180, 104669. [Google Scholar] [CrossRef]
  12. Fan, J.; Li, R. Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 2001, 96, 1348–1360. [Google Scholar] [CrossRef]
  13. Zou, H. The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 2006, 101, 1418–1429. [Google Scholar] [CrossRef] [Green Version]
  14. Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. 2005, 67, 301–320. [Google Scholar] [CrossRef] [Green Version]
  15. Fan, J.; Lv, J. A selective overview of variable selection in high dimensional feature space. Stat. Sin. 2010, 20, 101–148. [Google Scholar]
  16. Kuruwita, C.N. Non-iterative estimation and variable selection in the single-index quantile regression model. Commun. Stat.-Simul. Comput. 2016, 45, 3615–3628. [Google Scholar] [CrossRef]
  17. Yang, H.; Lv, J.; Guo, C. Penalized LAD regression for single-index models. Commun. Stat.-Simul. Comput. 2016, 45, 2392–2408. [Google Scholar] [CrossRef]
  18. Wen, C.; Wang, X.; Wang, S. Laplace error penalty-based variable selection in high dimension. Scand. J. Stat. 2015, 42, 685–700. [Google Scholar] [CrossRef]
  19. Xia, Y.; Tong, H.; Li, W.K. An adaptive estimation of dimension reduction space (with discussion). J. R. Stat. Soc. Ser. B 2002, 64, 363–410. [Google Scholar] [CrossRef]
  20. Zeng, P.; He, T.; Zhu, Y. A Lasso-type approach for estimation and variable selection in single index models. J. Comput. Graph. Stat. 2012, 21, 92–109. [Google Scholar] [CrossRef]
  21. Zou, H.; Li, R. One-step sparse estimates in nonconcave penalized likelihood models. Ann. Stat. 2008, 36, 1509–1533. [Google Scholar] [PubMed]
  22. An, L.T.H.; Tao, P.D. Solving a class of linearly constrained indefinite quadratic problems by d.c. algorithms. J. Glob. Optim. 1997, 11, 253–285. [Google Scholar]
  23. Wu, T.T.; Lange, K. Coordinate descent algorithms for lasso penalized regression. Ann. Appl. Stat. 2008, 2, 224–244. [Google Scholar] [CrossRef]
  24. Hunter, D.R.; Lange, K. Quantile regression via an MM algorithm. J. Comput. Graph. Stat. 2000, 9, 60–77. [Google Scholar]
  25. Yu, K.; Jones, M. Local linear quantile regression. J. Am. Stat. Assoc. 1998, 93, 228–237. [Google Scholar] [CrossRef]
  26. Wang, Q.; Yin, X. A nonlinear multi-dimensional variable selection method for high dimensional data: Sparse MAVE. Comput. Stat. Data Anal. 2008, 52, 4512–4520. [Google Scholar] [CrossRef]
  27. Shows, H.S.; Lu, W.; Zhang, H.H. Sparse estimation and inference for censored median regression. J. Stat. Plan. Inference 2010, 140, 1903–1917. [Google Scholar] [CrossRef] [Green Version]
  28. Chen, J.; Chen, Z. Extended Bayesian information for model selection with large model spaces. Biometrika 2008, 95, 759–771. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Violin diagram of the first seven variables.
Figure 2. Violin diagram of the last seven variables.
Figure 3. Heat map between the variables.
Table 1. Simulation results for Scenario 1 based on 200 replications.
Error Distribution   Method         n = 100                     n = 200
                                    MAD (%)   NC     NIC        MAD (%)   NC     NIC
N(0,1)               lad.sim.scad   11.65     3.96   3.53       7.63      4.00   1.57
                     cqr.sim.scad   11.61     3.93   3.45       7.60      4.00   1.54
                     cqr.sim.lep    11.59     3.91   3.44       7.58      4.00   1.53
                     Oracle         11.57     3.90   3.42       7.56      4.00   1.51
t(3)                 lad.sim.scad   13.72     3.90   3.73       9.25      3.99   1.99
                     cqr.sim.scad   13.70     3.95   3.70       9.21      4.00   1.94
                     cqr.sim.lep    13.68     3.96   3.68       9.18      4.00   1.93
                     Oracle         13.67     3.98   3.67       9.15      4.00   1.91
DE                   lad.sim.scad   8.79      3.97   3.20       5.69      4.00   1.76
                     cqr.sim.scad   8.76      3.97   3.08       5.65      4.00   1.76
                     cqr.sim.lep    8.74      3.98   3.05       5.63      4.00   1.76
                     Oracle         8.72      3.99   3.00       5.61      4.00   1.76
CN                   lad.sim.scad   16.65     3.82   2.83       10.78     3.94   1.55
                     cqr.sim.scad   16.63     3.85   2.80       10.75     3.95   1.53
                     cqr.sim.lep    16.62     3.87   2.78       10.74     3.97   1.52
                     Oracle         16.61     3.89   2.77       10.71     3.98   1.50
Outlier              lad.sim.scad   13.76     3.95   2.84       10.24     3.97   1.84
                     cqr.sim.scad   13.74     3.96   2.83       10.23     3.97   1.81
                     cqr.sim.lep    13.73     3.97   2.82       10.22     3.98   1.80
                     Oracle         13.72     3.98   2.81       10.20     3.99   1.78
MAD (the mean absolute deviation) of γ̂: MAD = (1/n) Σ_{i=1}^{n} |X_i⊤γ̂ − X_i⊤γ_0|; NC: the average number of non-zero coefficients that are correctly estimated to be non-zero; NIC: the average number of zero coefficients that are incorrectly estimated to be non-zero.
Table 2. Simulation results for Scenario 2 based on 200 replications.
Error Distribution   Method         n = 100                     n = 200
                                    MAD (%)   NC     NIC        MAD (%)   NC     NIC
N(0,1)               lad.sim.scad   9.90      3.98   3.27       8.26      4.00   1.41
                     cqr.sim.scad   9.84      3.98   3.24       8.19      4.00   1.35
                     cqr.sim.lep    9.81      3.99   3.23       8.16      4.00   1.34
                     Oracle         9.79      3.90   3.21       8.14      4.00   1.32
t(3)                 lad.sim.scad   12.15     3.97   3.41       8.57      4.00   1.72
                     cqr.sim.scad   12.08     3.98   3.36       8.51      4.00   1.68
                     cqr.sim.lep    12.05     3.99   3.34       8.50      4.00   1.67
                     Oracle         12.03     3.88   3.32       8.47      4.00   1.64
DE                   lad.sim.scad   7.63      4.00   3.16       4.97      4.00   2.18
                     cqr.sim.scad   7.56      4.00   3.13       4.94      4.00   2.11
                     cqr.sim.lep    7.54      4.00   3.11       4.92      4.00   2.08
                     Oracle         7.51      4.00   3.09       4.89      4.00   2.06
CN                   lad.sim.scad   12.02     3.95   3.06       11.31     3.97   1.24
                     cqr.sim.scad   11.97     3.96   3.03       11.26     3.97   1.21
                     cqr.sim.lep    11.96     3.98   3.02       11.25     3.98   1.18
                     Oracle         11.93     3.86   2.97       11.23     3.88   1.15
Outlier              lad.sim.scad   11.95     3.95   3.18       9.47      4.00   1.62
                     cqr.sim.scad   11.92     3.96   3.15       9.42      4.00   1.59
                     cqr.sim.lep    11.89     3.98   3.13       9.41      4.00   1.58
                     Oracle         11.86     3.87   3.10       9.39      4.00   1.55
MAD (the mean absolute deviation) of γ̂: MAD = (1/n) Σ_{i=1}^{n} |X_i⊤γ̂ − X_i⊤γ_0|; NC: the average number of non-zero coefficients that are correctly estimated to be non-zero; NIC: the average number of zero coefficients that are incorrectly estimated to be non-zero.
Table 3. Description of variables for Boston housing data.
Variables   Description
MEDV        Median value of owner-occupied homes in USD thousands
CRIM        Per capita crime rate by town
ZN          Proportion of residential land zoned for lots over 25,000 sq. ft.
INDUS       Proportion of non-retail business acres per town
CHAS        Charles River dummy variable (= 1 if tract bounds river, 0 otherwise)
NOX         Nitric oxide concentrations (parts per 10 million)
RM          Average number of rooms per dwelling
AGE         Proportion of owner-occupied units built prior to 1940
DIS         Weighted distances to five Boston employment centers
RAD         Index of accessibility to radial highways
TAX         Full-value property-tax rate per USD 10,000
PTRATIO     Pupil–teacher ratio by town
B           1000(Bk − 0.63)², where Bk is the proportion of the Black population by town
LSTAT       Percentage of lower-status population
Table 4. Coefficient estimates for Boston housing data.
Variables   cqr.sim.scad          cqr.sim.lep           lad.sim.scad   ls.sim.lasso   cqr.sim
            τ = 0.25   τ = 0.75   τ = 0.25   τ = 0.75   τ = 0.5                       τ = 0.25   τ = 0.75
CRIM        0.3092     0.3089     0.3082     0.3081     0.3083         0              0.3076     0.3075
ZN          0          0          0          0          0              −0.0690        0          0
INDUS       0          0          0          0          0              0              0          0
CHAS        0          0          0          0          0              0              0          0
NOX         0          0          0          0          0              0              0          0
RM          −0.1884    −0.1883    −0.1871    −0.1870    −0.1872        −0.5300        −0.1866    −0.1864
AGE         0          0          0          0          0              0              0          0
DIS         0.1453     0.1451     0.1443     0.1442     0.1444         0.1163         0.1439     0.1437
RAD         0          0          0          0          0              −0.0460        0          0
TAX         0          0          0          0          0              0              0          0
PTRATIO     0.1877     0.1876     0.1870     0.1868     0.1871         0.1069         0.1865     0.1863
B           0          0          0          0          0              0              0          0
LSTAT       0.9042     0.9040     0.9031     0.9029     0.9032         0.8319         0.9024     0.9023
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
