Article

Estimation in Semi-Varying Coefficient Heteroscedastic Instrumental Variable Models with Missing Responses

1 College of Science, Inner Mongolia Agricultural University, Hohhot 010018, China
2 School of Statistics, Beijing Normal University, Beijing 100875, China
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(23), 4853; https://doi.org/10.3390/math11234853
Submission received: 17 October 2023 / Revised: 29 November 2023 / Accepted: 30 November 2023 / Published: 2 December 2023
(This article belongs to the Special Issue Computational Statistics and Data Analysis, 2nd Edition)

Abstract

This paper studies the estimation problem for semi-varying coefficient heteroscedastic instrumental variable models with missing responses. First, we propose adjusted estimators for the unknown parameters and smooth functional coefficients utilizing the ordinary profile least squares method and the instrumental variable adjustment technique with complete data. Second, we present an adjusted estimator of the stochastic error variance by employing the Nadaraya–Watson kernel estimation technique. Third, we apply the inverse probability-weighted method and the instrumental variable adjustment technique to construct adaptive-weighted adjusted estimators for the unknown parameters and smooth functional coefficients. The asymptotic properties of our proposed estimators are established under some regularity conditions. Finally, numerous simulation studies and a real data analysis are conducted to examine the finite sample performance of the proposed estimators.

1. Introduction

As an important category of statistical regression models, the varying-coefficient partially linear model possesses strong explanatory power and flexibility. It has been widely used in scientific research, such as econometrics, biomedicine, and engineering technology. Its general mathematical expression is:
Y = X^⊤θ(U) + Z^⊤β + ε,
where Y is the response variable, and X ∈ R^p and Z ∈ R^q are covariates. To avoid the "curse of dimensionality", the covariate U is confined to be one-dimensional. θ(·) = [θ₁(·), …, θ_p(·)]^⊤ is a p × 1 vector of smooth functional coefficients, β = (β₁, β₂, …, β_q)^⊤ is a q × 1 vector of constant coefficients, and ε is the model error, which is independent of (X, Z, U). The mean of ε is zero, and its variance is assumed to follow a heteroscedastic structure satisfying Var(ε | X, Z, U) = σ²(U). As an extension of classical linear regression models, varying-coefficient partially linear models have been investigated by many scholars under the hypothesis of homoscedasticity; see [1,2,3,4,5,6,7,8]. However, in regression analysis, some important explanatory variables may be omitted, and the sample observations may be measured with error. In such cases, the model error can be heteroscedastic. In recent years, statisticians have developed many statistical inference methods for model (1) with heteroscedasticity. For instance, Shen et al. [9] introduced a re-weighted estimation procedure for unknown parameters based on the generalized least squares method. Zhao et al. [10] proposed a two-stage iterative estimation method using the orthogonal projection technique, which estimates the unknown parameters and functional coefficients separately. Zhao et al. [11] proposed a re-weighted estimation procedure when the covariates contain an additive measurement error. Zhang and Li [12] proposed a weighted estimation and testing method when the covariates suffer from an additive measurement error. Yuan and Zhou [13] introduced an adaptive-weighted estimation method for model (1), which improves the estimation accuracy of the resulting estimators.
However, the above works do not consider the endogeneity of covariates. In practice, there may be endogenous explanatory variables in model (1); see [14,15]. In such cases, the above methods will induce endogeneity bias, which leads to inconsistent estimators. The instrumental variable method provides an effective way to eliminate this bias. In the past ten years, semi-parametric instrumental variable models have been widely studied. For instance, Cai and Xiong [16] suggested a three-stage estimation approach for semi-varying coefficient models with endogenous covariates. Zhao and Li [17] developed an effective variable selection approach for classical varying coefficient models with endogenous covariates. Zhao and Xue [18] considered interval estimation for semi-parametric instrumental variable models using the empirical likelihood method. Yuan et al. [19] proposed an effective method to identify important variables by combining the SCAD penalty and the instrumental variable adjustment technique for semi-varying coefficient models with endogenous covariates. Zhao et al. [20] applied the popular empirical likelihood approach, together with an orthogonal decomposition technique, to study interval estimation for semi-varying coefficient instrumental variable models. For more related research, please refer to [21,22,23,24]. In this paper, the covariates U and Z are assumed to be exogenous, the covariate X is endogenous, and ζ ∈ R^r is an instrumental variable related to X. Similar to [16], the dimension r of ζ is taken to be greater than or equal to the dimension p of X for identifiability, and X and ζ are specified to satisfy the following parametric model:
X = Ψζ + e,
where Ψ is an unknown p × r constant matrix, and the error term e satisfies E(e | ζ, Z, U) = 0. Since the covariates are endogenous, E(ε | X, Z, U) ≠ 0, which indicates that X is associated with the model error ε. Moreover, we further suppose that E(ε | ζ, Z, U) = 0.
In applications, missing data can occur in many areas, such as market surveys, medical research, opinion polls, and other scientific experiments. When we encounter missing data, classical statistical inference methods cannot be used directly. Thus, scholars have developed corresponding methods to handle missing data, chiefly the complete-sample method, the inverse probability-weighted technique, and imputation. Rubin [25] discussed the complete-sample method in detail, but this method reduces estimation efficiency, especially when the observed data are missing at random. Robins et al. [26] suggested an inverse probability-weighted method that assigns weights to the observed data, which can effectively diminish the bias caused by missing data. Wang and Rao [27] and Wang et al. [28] developed imputation methods for linear and semi-parametric regression models, respectively. To date, many scholars have studied statistical inference for model (1) with missing data, but few have considered heteroscedasticity and endogeneity. For instance, Li and Xue [29] constructed an imputation estimator for unknown parameters with a missing response. When the explanatory variables are missing at random, Chen et al. [30] constructed inverse probability-weighted estimators of the unknown constant and functional coefficients. For more recent work on missing data for model (1), the reader can refer to [31,32,33], among others. In this paper, the response variable may be missing, while the explanatory variables are fully observed. An indicator variable δ describes the missing mechanism: δ = 1 if Y is observed, and δ = 0 otherwise. Furthermore, we suppose that the data are missing at random (MAR), which is expressed as:
P(δ = 1 | U, X, Z, Y) = P(δ = 1 | U, X, Z) = π(U, X, Z),
where π ( · ) is referred to as the propensity score.
Although many scholars have discussed statistical inference procedures for various semi-parametric models with endogenous covariates, the existing works have not considered heteroscedasticity and missing data simultaneously. Therefore, we consider the estimation problem for models (1) and (2) with heteroscedasticity and a missing response. The adaptive-weighted adjusted estimators of the unknown parameters and functional coefficients are proposed using the profile least squares method, the instrumental variable adjustment technique, Nadaraya–Watson kernel estimation, the inverse probability-weighted method, and the weighted least squares method, and we also establish the asymptotic properties of the proposed estimators.
The rest of the paper is organized as follows. In Section 2, we introduce an adaptive-weighted adjusted estimation method to obtain the estimators of the unknown parameters and functional coefficients, and the corresponding asymptotic properties are established. In Section 3, numerous simulation studies are conducted to demonstrate the effectiveness and feasibility of the proposed estimators. A real data analysis is performed in Section 4. Section 5 summarizes the research results of this paper with some conclusions. The technical proofs are presented in Appendix A.

2. Estimation Methods and Main Results

2.1. Adjusted Profile Least Squares Estimation

In this subsection, we apply the local linear smoothing technique and instrumental variable adjustment technique to estimate unknown parameters β and smooth functional coefficients θ ( · ) . Assume that { Y i , X i , U i , Z i , ζ i , δ i } i = 1 n are independent and identically distributed (i.i.d.) samples, which come from the semi-varying coefficient heteroscedastic instrumental variable models (1)–(3); then, we have:
δ_i Y_i = δ_i X_i^⊤θ(U_i) + δ_i Z_i^⊤β + δ_i ε_i,  X_i = Ψζ_i + e_i,  i = 1, 2, …, n.
For any u in a small neighborhood of u₀, each functional coefficient θ_j(u) (j = 1, 2, …, p) can be expanded by a Taylor expansion as follows:
θ_j(u) ≈ θ_j(u₀) + θ_j′(u₀)(u − u₀),  j = 1, 2, …, p.
If β is given, then the estimator of θ j ( u 0 ) is given by minimizing the following weighted least squares objective function:
∑_{i=1}^{n} [ Y_i − Z_i^⊤β − ∑_{j=1}^{p} { θ_j(u₀) + θ_j′(u₀)(U_i − u₀) } X_{ij} ]² K_{h₁}(U_i − u₀) δ_i,
where K h 1 ( · ) = K ( · / h 1 ) / h 1 , K ( · ) is a kernel function, which is chosen as a symmetric probability density function, and h 1 is a bandwidth. For ease of presentation, we denote:
Y = (Y₁, Y₂, …, Y_n)^⊤,  X = (X₁, X₂, …, X_n)^⊤,  Z = (Z₁, Z₂, …, Z_n)^⊤,
Δ₀ = diag(δ₁, δ₂, …, δ_n),  M = [X₁^⊤θ(U₁), …, X_n^⊤θ(U_n)]^⊤,
w_{h₁}(u₀) = diag[K_{h₁}(U₁ − u₀), K_{h₁}(U₂ − u₀), …, K_{h₁}(U_n − u₀)],
X_{h₁}(u₀) = [ X₁^⊤, h₁⁻¹(U₁ − u₀)X₁^⊤ ; ⋮ ; X_n^⊤, h₁⁻¹(U_n − u₀)X_n^⊤ ], an n × 2p matrix whose i-th row is (X_i^⊤, h₁⁻¹(U_i − u₀)X_i^⊤).
Thus, the first formula of model (4) can be transformed into:
Δ₀Y − Δ₀Zβ = Δ₀M + Δ₀ε.
By minimizing the weighted least squares objective (5), the estimators of functional coefficients θ ( u 0 ) are given by:
θ̃(u₀, β) = (I_p, 0_p) { X_{h₁}(u₀)^⊤ w_{h₁}(u₀) Δ₀ X_{h₁}(u₀) }⁻¹ X_{h₁}(u₀)^⊤ w_{h₁}(u₀) Δ₀ (Y − Zβ),
where I_p and 0_p denote the p × p identity matrix and zero matrix, respectively. It is noteworthy that the explanatory variable X in this paper is endogenous, which indicates that E(ε | X) ≠ 0. Thus, the estimators of the functional coefficients in (7) are inconsistent. We therefore correct θ̃(u₀, β) using the available instrumental variables ζ. For model (2), we can easily obtain:
E(Xζ^⊤) = Ψ E(ζζ^⊤).
Therefore, a usual moment estimator of the unknown constant matrix Ψ is given by:
Ψ̂ = ( ∑_{i=1}^{n} X_i ζ_i^⊤ ) ( ∑_{i=1}^{n} ζ_i ζ_i^⊤ )⁻¹.
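The moment estimator above is a single matrix computation. The following sketch (hypothetical helper, not from the paper) illustrates it with NumPy, assuming X is stored as an n × p array and the instruments ζ as an n × r array with r ≥ p:

```python
import numpy as np

def estimate_psi(X, zeta):
    """Moment estimator of Psi in X = Psi * zeta + e:
    Psi_hat = (sum_i X_i zeta_i^T) (sum_i zeta_i zeta_i^T)^{-1}.
    X: (n, p) endogenous covariates; zeta: (n, r) instruments, r >= p."""
    Sxz = X.T @ zeta      # p x r cross-moment sum
    Szz = zeta.T @ zeta   # r x r instrument Gram matrix (symmetric)
    return np.linalg.solve(Szz, Sxz.T).T  # equals Sxz @ inv(Szz)
```

The fitted values X̂_i = Ψ̂ζ_i then replace X_i in the local-linear design below.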
Let X ^ i = Ψ ^ ζ i . Invoking (7), the adjusted estimators of functional coefficients θ ( u ) are given by:
θ̂(u₀, β) = (I_p, 0_p) { X̂_{h₁}(u₀)^⊤ w_{h₁}(u₀) Δ₀ X̂_{h₁}(u₀) }⁻¹ X̂_{h₁}(u₀)^⊤ w_{h₁}(u₀) Δ₀ (Y − Zβ),
where X̂_{h₁}(u₀) has the same structure as X_{h₁}(u₀), but with X_i replaced by X̂_i. Then, invoking (8), the estimator of M is given by:
M̂(β) = (X̂₁^⊤θ̂(U₁, β), …, X̂_n^⊤θ̂(U_n, β))^⊤ = S(Y − Zβ),
and:
S = [ (X̂₁^⊤, 0^⊤) { X̂_{h₁}(U₁)^⊤ w_{h₁}(U₁) Δ₀ X̂_{h₁}(U₁) }⁻¹ X̂_{h₁}(U₁)^⊤ w_{h₁}(U₁) Δ₀ ; ⋮ ; (X̂_n^⊤, 0^⊤) { X̂_{h₁}(U_n)^⊤ w_{h₁}(U_n) Δ₀ X̂_{h₁}(U_n) }⁻¹ X̂_{h₁}(U_n)^⊤ w_{h₁}(U_n) Δ₀ ],
where 0 denotes a 1 × p zero vector. Replacing M with M̂(β) in (6), we obtain:
Δ₀(I − S)Y = Δ₀(I − S)Zβ + Δ₀ε.
For the model (10), a least squares approach is implemented, and then the adjusted estimators of unknown parameters β are given by:
β̂ = (Z̃^⊤ Δ₀ Z̃)⁻¹ Z̃^⊤ Δ₀ Ỹ,
where Ỹ = (I − S)Y and Z̃ = (I − S)Z. Combining (8) and (11), the adjusted estimators of the functional coefficients θ(u) at u₀ are given by:
θ̂(u₀, β̂) = (I_p, 0_p) { X̂_{h₁}(u₀)^⊤ w_{h₁}(u₀) Δ₀ X̂_{h₁}(u₀) }⁻¹ X̂_{h₁}(u₀)^⊤ w_{h₁}(u₀) Δ₀ (Y − Zβ̂).
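The local-linear step underlying the estimators of this subsection can be sketched as follows; this is a hypothetical illustration (not the authors' code), assuming a Gaussian kernel, with the instrument-adjusted covariates X̂, the missingness indicators δ, and a given β passed in:

```python
import numpy as np

def gauss_kernel(t):
    """Standard Gaussian kernel K(t)."""
    return np.exp(-0.5 * t**2) / np.sqrt(2.0 * np.pi)

def local_linear_theta(u0, U, Xhat, Z, Y, delta, beta, h, kernel=gauss_kernel):
    """Local-linear estimator of theta(u0) for a given beta (a sketch):
    minimize sum_i delta_i K_h(U_i - u0) [Y_i - Z_i'beta - D_i'(a, b)]^2,
    where D_i = (Xhat_i, (U_i - u0)/h * Xhat_i); returns a = theta_hat(u0)."""
    n, p = Xhat.shape
    t = (U - u0) / h
    w = delta * kernel(t) / h                 # K_h weights times missingness indicator
    D = np.hstack([Xhat, t[:, None] * Xhat])  # n x 2p local-linear design
    r = Y - Z @ beta
    A = D.T @ (w[:, None] * D)
    b = D.T @ (w * r)
    return np.linalg.solve(A, b)[:p]          # first p entries estimate theta(u0)
```

Profiling out θ(·) this way at each U_i and then solving the weighted least squares problem for β reproduces the two-step structure above.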

2.2. Adaptive-Weighted Adjusted Profile Least Squares Estimation

In this subsection, we develop an adaptive-weighted adjusted estimation method for the unknown parameters and functional coefficients based on weighted least squares estimation. First, using the estimated model residuals, we suggest an adjusted Nadaraya–Watson kernel estimation method for the variance function. By (11) and (12), the residuals can be estimated by:
ε̂ = (ε̂₁, ε̂₂, …, ε̂_n)^⊤ = Δ₀(Y − M̂(β̂) − Zβ̂).
Note that Var ( ε i | X i , Z i , U i ) = σ 2 ( U i ) , and by using the Nadaraya–Watson kernel estimation method, an adjusted estimator of variance function σ 2 ( u 0 ) is given by:
σ̂²(u₀) = [ ∑_{i=1}^{n} δ_i ε̂_i² K_{h̃}(U_i − u₀) ] / [ ∑_{i=1}^{n} δ_i K_{h̃}(U_i − u₀) ],
where K_{h̃}(·) has the same structure as K_{h₁}(·), except that the bandwidth h₁ is replaced by h̃. Furthermore, replacing u₀ by U_i (i = 1, …, n), we can obtain:
σ̂²(U_i) = [ ∑_{k=1}^{n} δ_k ε̂_k² K_{h̃}(U_k − U_i) ] / [ ∑_{k=1}^{n} δ_k K_{h̃}(U_k − U_i) ],  i = 1, …, n.
The estimator σ̂²(u₀) of the variance function σ²(u₀) is consistent. The proof of this property is similar to that of Theorem 1 in [9]; thus, we omit the details.
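A minimal sketch of the adjusted Nadaraya–Watson variance estimator above (hypothetical helper names; Gaussian kernel assumed):

```python
import numpy as np

def nw_variance(u0, U, resid, delta, h):
    """Nadaraya-Watson estimator of sigma^2(u0): a kernel-weighted average of
    squared residuals over the observed cases (delta_i = 1); Gaussian kernel."""
    t = (U - u0) / h
    K = np.exp(-0.5 * t**2) / np.sqrt(2.0 * np.pi)
    w = delta * K                 # the common 1/h factor cancels in the ratio
    return float(np.sum(w * resid**2) / np.sum(w))
```

Evaluating this at each U_i yields the diagonal of Σ̂ used in the weighting step below.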
Then, we consider how to deal with the missing data. In general, the selection probability function π(U_i, X_i, Z_i) defined in (3) is unknown. Nonparametric methods, such as kernel estimation and local polynomial estimation, could be used to estimate it, but these may suffer from the curse of dimensionality. Therefore, similar to the method in [31], we suppose that the selection probability function satisfies:
π(U_i, X_i, Z_i; w) = exp(w₀ + w₁U_i + w₂^⊤X_i + w₃^⊤Z_i) / [ 1 + exp(w₀ + w₁U_i + w₂^⊤X_i + w₃^⊤Z_i) ],
where w = (w₀, w₁, w₂^⊤, w₃^⊤)^⊤ is an unknown parameter vector whose estimator ŵ can be obtained by the quasi-likelihood estimation method. Writing V_i = (U_i, X_i^⊤, Z_i^⊤)^⊤, we abbreviate the estimated selection probability π(U_i, X_i, Z_i; ŵ) as π(V_i, ŵ).
Based on the estimator of variance function σ ^ 2 ( U i ) and selection probability function π ( V i , w ^ ) , the adaptive-weighted adjusted estimators for functional coefficients are given by minimizing:
∑_{i=1}^{n} [ Y_i − Z_i^⊤β − ∑_{j=1}^{p} { θ_j(u₀) + θ_j′(u₀)(U_i − u₀) } X_{ij} ]² · δ_i / { σ̂²(U_i) π(V_i, ŵ) } · K_{h₂}(U_i − u₀),
where K h 2 ( · ) = K ( · / h 2 ) / h 2 with bandwidth h 2 .
Minimizing objective function (16), the adaptive-weighted estimators of the functional coefficients can be expressed as:
θ̃_w(u₀, β) = (I_p, 0_p) { X_{h₂}(u₀)^⊤ w_{h₂}(u₀) Σ̂⁻¹ Δ̂ X_{h₂}(u₀) }⁻¹ X_{h₂}(u₀)^⊤ w_{h₂}(u₀) Σ̂⁻¹ Δ̂ (Y − Zβ),
where Σ̂ = diag[σ̂²(U₁), σ̂²(U₂), …, σ̂²(U_n)] and Δ̂ = diag[δ₁/π(V₁, ŵ), …, δ_n/π(V_n, ŵ)]. Since the explanatory variable X is endogenous, the instrumental variable adjustment technique is used to correct θ̃_w(u₀, β). The adaptive-weighted adjusted estimators of the functional coefficients θ(u₀) are then given by:
θ̂_w(u₀, β) = (I_p, 0_p) { X̂_{h₂}(u₀)^⊤ w_{h₂}(u₀) Σ̂⁻¹ Δ̂ X̂_{h₂}(u₀) }⁻¹ X̂_{h₂}(u₀)^⊤ w_{h₂}(u₀) Σ̂⁻¹ Δ̂ (Y − Zβ),
where X̂_{h₂}(u₀) and w_{h₂}(u₀) have the same forms as X̂_{h₁}(u₀) and w_{h₁}(u₀), respectively, except that h₁ is replaced by h₂. Then, the estimator of M is given by:
M̂_w = (X̂₁^⊤θ̂_w(U₁, β), …, X̂_n^⊤θ̂_w(U_n, β))^⊤ = Ŝ(Y − Zβ),
where:
Ŝ = [ (X̂₁^⊤, 0^⊤) { X̂_{h₂}(U₁)^⊤ w_{h₂}(U₁) Σ̂⁻¹ Δ̂ X̂_{h₂}(U₁) }⁻¹ X̂_{h₂}(U₁)^⊤ w_{h₂}(U₁) Σ̂⁻¹ Δ̂ ; ⋮ ; (X̂_n^⊤, 0^⊤) { X̂_{h₂}(U_n)^⊤ w_{h₂}(U_n) Σ̂⁻¹ Δ̂ X̂_{h₂}(U_n) }⁻¹ X̂_{h₂}(U_n)^⊤ w_{h₂}(U_n) Σ̂⁻¹ Δ̂ ].
Substituting M ^ w into (1), we have:
(I − Ŝ)Y = (I − Ŝ)Zβ + ε.
In order to eliminate the impact of heteroscedasticity, left-multiplying (19) by the matrix Σ̂^{−1/2}, we obtain:
Σ̂^{−1/2}(I − Ŝ)Y = Σ̂^{−1/2}(I − Ŝ)Zβ + Σ̂^{−1/2}ε,
where Σ̂^{−1/2} = diag[σ̂⁻¹(U₁), σ̂⁻¹(U₂), …, σ̂⁻¹(U_n)].
By employing the inverse probability-weighted method and combining the idea of weighted least squares, we can derive the estimators of unknown parameters β by minimizing
Q(β) = [ (I − Ŝ)Y − (I − Ŝ)Zβ ]^⊤ Σ̂⁻¹ Δ̂ [ (I − Ŝ)Y − (I − Ŝ)Zβ ].
Solving the minimum problem with respect to β , the proposed adaptive-weighted adjusted estimator of β is given by:
β̂_w = (Ẑ^⊤ Σ̂⁻¹ Δ̂ Ẑ)⁻¹ Ẑ^⊤ Σ̂⁻¹ Δ̂ Ŷ,
where Ŷ = (I − Ŝ)Y and Ẑ = (I − Ŝ)Z. Moreover, substituting β̂_w into (17), we give the proposed adaptive-weighted adjusted estimators of the functional coefficients θ(u₀) as follows:
θ̂_w(u₀, β̂_w) = (I_p, 0_p) { X̂_{h₂}(u₀)^⊤ w_{h₂}(u₀) Σ̂⁻¹ Δ̂ X̂_{h₂}(u₀) }⁻¹ X̂_{h₂}(u₀)^⊤ w_{h₂}(u₀) Σ̂⁻¹ Δ̂ (Y − Zβ̂_w).
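The closed-form weighted least squares step for β̂_w has the generic shape below; this is a sketch with hypothetical names, taking the partialled-out quantities Ŷ = (I − Ŝ)Y and Ẑ = (I − Ŝ)Z as inputs together with the estimated variances and selection probabilities:

```python
import numpy as np

def beta_weighted(Z_t, Y_t, sigma2_hat, delta, pi_hat):
    """Adaptive-weighted estimator (a sketch):
    beta_hat = (Z~' W Z~)^{-1} Z~' W Y~ with W = diag(delta_i / (sigma2_i * pi_i)).
    Z_t, Y_t: the partialled-out covariates and responses from the profile step."""
    w = delta / (sigma2_hat * pi_hat)   # diagonal of Sigma^{-1} Delta_hat
    A = Z_t.T @ (w[:, None] * Z_t)
    b = Z_t.T @ (w * Y_t)
    return np.linalg.solve(A, b)
```

Down-weighting high-variance and over-represented observations in this way is what drives the efficiency gain over the unweighted profile estimator.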

2.3. Asymptotic Properties

In this subsection, we establish the asymptotic properties of the proposed estimators. First, some regularity conditions are needed. These conditions are mild, and similar conditions can be found in [9,10,11], and other varying-coefficient partially linear heteroscedasticity literature.
(C1)
The random variable U has a bounded support U and its density function f ( u ) is Lipschitz-continuous and has a second-order continuous derivative. Moreover, f ( u ) is bounded away from zero.
(C2)
The kernel K ( · ) is a symmetric probability density function with a compact support and is Lipschitz-continuous.
(C3)
For each u ∈ U, the matrix Π(u) = E(ζ₁ζ₁^⊤ | U = u) is invertible, and the matrices Π(u), Π⁻¹(u), and Ξ(u) = E(ζ₁Z₁^⊤ | U = u) are Lipschitz-continuous.
(C4)
For each u U , the functional coefficients { θ j ( u ) , j = 1 , 2 , , p } are Lipschitz-continuous and have continuous second derivatives.
(C5)
There is an s > 2 such that E‖ζ₁‖^{2s} < ∞ and E‖Z₁‖^{2s} < ∞.
(C6)
There is a δ < 1 − s⁻¹ such that lim_{n→∞} n^{2δ−1} h_i = ∞, i = 1, 2.
(C7)
The variance function σ 2 ( · ) has a continuous second derivative and is uniformly bounded on the domain.
(C8)
For the bandwidths h_i (i = 1, 2), nh_i² → ∞, nh_i⁸ → 0, and [log(1/h_i)]²/(nh_i²) → 0 as n → ∞. In addition, h_i and h̃ satisfy O(c_{ni})·O(c̃_n) = o(n^{−1/2}), where c_{ni} = h_i² + [log(1/h_i)/(nh_i)]^{1/2}, i = 1, 2, and c̃_n = h̃² + [log(1/h̃)/(nh̃)]^{1/2}.
(C9)
As a function of (X_i, Z_i, U_i), π(·) has a second-order continuous derivative. Moreover, π(·) is bounded away from zero.
Conditions (C1)–(C6) are quite generally required in the semi-varying coefficient model. Conditions (C7) and (C8) are mainly to obtain the consistent estimator of the variance function, which can be found in [9]. Condition (C9) provides a guarantee for the inverse probability weighted technique.
Theorem 1.
Suppose that regularity conditions (C1)–(C9) hold. Then, we have:
sup_{u₀ ∈ U} | σ̂²(u₀) − σ²(u₀) | = o_p(c̃_n).
Theorem 2.
Suppose that regularity conditions (C1)–(C9) hold; then, the proposed adaptive-weighted adjusted estimator of β satisfies:
√n (β̂_w − β) →_D N(0, Λ₁⁻¹Λ₂Λ₁⁻¹) as n → ∞,
where Λ₁ = E{ σ⁻²(U₁) [ Z₁ − Ξ(U₁)^⊤Ψ^⊤(ΨΠ(U₁)Ψ^⊤)⁻¹Ψζ₁ ]^{⊗2} }, Λ₂ = E{ π(V₁, w)⁻¹ σ⁻⁴(U₁) (e₁^⊤θ(U₁) + ε₁)² [ Z₁ − Ξ(U₁)^⊤Ψ^⊤(ΨΠ(U₁)Ψ^⊤)⁻¹Ψζ₁ ]^{⊗2} }, and H^{⊗2} = HH^⊤.
Theorem 3.
Suppose that regularity conditions (C1)–(C9) hold; then, the proposed adaptive-weighted adjusted estimator of θ ( u 0 ) satisfies:
√(nh₂) [ θ̂_w(u₀, β̂_w) − θ(u₀) − ½h₂²μ₂θ″(u₀) ] →_D N(0, Λ(u₀)) as n → ∞,
where Λ(u₀) = ν₀ f⁻¹(u₀) E{ π(V₁, w)⁻¹ (e₁^⊤θ(U₁) + ε₁)² } [ΨΠ(u₀)Ψ^⊤]⁻¹,
μ₂ = ∫ u²K(u) du,  ν₀ = ∫ K²(u) du.
Theorems 2 and 3 give the asymptotic distributions of our proposed adaptive-weighted adjusted estimators. These results can be utilized to conduct statistical inference for the unknown parameters and functional coefficients. Additionally, the above theorems broaden the scope of application of semi-varying coefficient models to meet practical modeling requirements. When there are no missing responses or endogenous covariates, the asymptotic variance of our proposed estimators possesses the same structure as that of the estimators in [13]. On the other hand, when the missing response and heteroscedasticity are not considered, the asymptotic variance of our proposed estimators is the same as that of the estimators in [16].

3. Simulation Studies

In this section, we carry out some simulations to evaluate the finite sample performance of the proposed adaptive-weighted adjusted estimation method. We generate the data from the semi-varying coefficient heteroscedastic instrumental variables model:
Y = Z₁β₁ + Z₂β₂ + Xθ(U) + ε,
where the explanatory variables Z₁ and Z₂ are both independently drawn from N(2, 1), the univariate covariate U is drawn from the uniform distribution U(0, 1), and the explanatory variable X is an endogenous variable generated from the model X = ζ + kε, where ζ is an instrumental variable generated from the normal distribution N(1, 1), and k is taken as 0.2 and 0.4 to represent different levels of endogeneity. We set the parameters β₁ = 1.5, β₂ = 2, and θ(U) = sin(2πU). The model error ε ~ N(0, σ²(U)) with σ²(U) = 0.25 + [c·sin(2πU)]², for c = 2, 4, respectively. The Gaussian kernel K(x) = (1/√(2π))exp(−x²/2) is adopted. The leave-one-out cross-validation (LOOCV) method is applied to choose h₁, which is derived by minimizing
CV(h₁) = (1/n) ∑_{i=1}^{n} δ_i [ Y_i − X_i^⊤θ̂_{[i]}(U_i) − Z_i^⊤β̂_{[i]} ]²,
where β̂_{[i]} and θ̂_{[i]}(·) are the adjusted profile least squares estimators given in (11) and (12), respectively, computed with the i-th observation deleted. We choose the bandwidths h̃ and h₂ by a similar method. To compare the performance of the proposed adaptive-weighted adjusted estimators under different missing probabilities, two selection probability functions are chosen as follows:
π₁(x, u, z) = P(δ = 1 | X = x, U = u, Z = (z₁, z₂)^⊤) = [ 1 + exp(1 + 1.1x − z₁ − 0.4z₂ − 0.7u) ]⁻¹,
π₂(x, u, z) = P(δ = 1 | X = x, U = u, Z = (z₁, z₂)^⊤) = [ 1 + exp(1 + 1.5x − 0.8z₁ − 0.4z₂ − 0.5u) ]⁻¹.
The corresponding average response rates are about 0.9 and 0.8 when c = 2 and k = 0.2. To show the performance of our proposed adaptive-weighted adjusted profile least squares estimation based on the instrumental variable adjustment technique (denoted IAWPLS), we contrast it with the two approaches below: (1) the naive adaptive-weighted profile least squares estimation, denoted NAWPLS; (2) the instrumental variable weighted profile least squares estimation, denoted IWPLS. The former ignores the endogeneity of the explanatory covariate and is derived by combining the inverse probability-weighted method and the adaptive-weighted profile least squares method in [13]. The latter ignores the heteroscedasticity of the model error and combines the inverse probability-weighted method and the instrumental variable adjustment method in [16]. We set the sample size to 50, 100, 200, and 300. The results are based on 500 replications for each case. For the parametric components, we use the following measures to compare the performance of the different methods: (1) Mean: the average of the estimated values; (2) MSE: the mean squared error of the corresponding estimators. The results are shown in Table 1 and Table 2 for π = π₁, π₂, c = 2, 4, k = 0.2, 0.4, and n = 50, 100, 200, 300, respectively.
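One replication of the data-generating process above can be sketched as follows (hypothetical function names, not the authors' code; the response-deletion step is omitted since it only applies the chosen selection probability to each observation):

```python
import numpy as np

def simulate(n, k=0.2, c=2.0, seed=None):
    """One replication of the simulation design:
    Y = 1.5*Z1 + 2*Z2 + X*sin(2*pi*U) + eps, Var(eps|U) = 0.25 + (c*sin(2*pi*U))^2,
    X = zeta + k*eps (endogenous), zeta ~ N(1,1), Z1, Z2 ~ N(2,1), U ~ U(0,1)."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(2.0, 1.0, size=(n, 2))
    U = rng.uniform(size=n)
    sigma = np.sqrt(0.25 + (c * np.sin(2.0 * np.pi * U))**2)
    eps = sigma * rng.normal(size=n)       # heteroscedastic model error
    zeta = rng.normal(1.0, 1.0, size=n)    # instrumental variable
    X = zeta + k * eps                     # endogeneity: X is correlated with eps
    Y = Z @ np.array([1.5, 2.0]) + X * np.sin(2.0 * np.pi * U) + eps
    return Y, X, Z, U, zeta, eps
```

By construction, X is correlated with ε while ζ is not, which is exactly the setting the instrumental variable adjustment is designed to handle.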
According to Table 1 and Table 2, we have the following results.
(1)
The IAWPLS and IWPLS estimators are asymptotically unbiased, but the NAWPLS estimators are biased. For fixed π , c , k , with the increase of n, the MSEs of all three estimators decrease.
(2)
For fixed π , c , k , when the sample size n = 50 , the MSEs of our proposed IAWPLS estimators are slightly larger than those of NAWPLS in some cases, but obviously smaller than those of the NAWPLS and IWPLS estimators when the sample size is greater than 100.
(3)
For fixed c , k , n , with the increase in the missing probability, the MSEs of all three estimators increase.
(4)
For fixed π , k , n , with the increase in c, the MSEs of all three estimators increase.
(5)
For fixed π , c , n , with the increase in k, the MSEs of all three estimators increase.
Subsequently, we further consider the behavior of the adaptive-weighted adjusted estimation method for the variance functions and functional coefficients. The corresponding estimated values are computed at n = 200 equally spaced points U_i = i/n ∈ [0, 1], and the ultimate estimated value at each point U_i is the mean over 500 simulations. Due to the similarity of the estimated curves for different sample sizes and missing probabilities, we only plot the estimated curves of the variance functions and functional coefficients when c = 2, 4, k = 0.2, 0.4, n = 200, and π = π₁. The estimated curves are shown in Figure 1 and Figure 2. To demonstrate the effectiveness of the proposed estimation method for the variance function, two methods are taken for contrast: (1) the adjusted Nadaraya–Watson kernel estimation method based on the instrumental variable adjustment technique; (2) the naive Nadaraya–Watson kernel estimation method, which ignores the endogeneity of the explanatory variables and uses the standard Nadaraya–Watson kernel estimation.
Figure 1 shows that the proposed adjusted Nadaraya–Watson kernel estimators are asymptotically unbiased, but the naive Nadaraya–Watson kernel estimators are biased, and the deviation increases with the increase of c or k. Note that the performance of our proposed adjusted Nadaraya–Watson kernel estimators may be affected by larger c and k. From Figure 2, we find that the estimated curves obtained by the IAWPLS and IWPLS methods both approach the true curves, but the estimated curves obtained by the NAWPLS method are biased, and the deviation increases with the increase of c or k.
Since the estimated curves obtained by the IAWPLS and IWPLS methods are close to each other, we further utilize the root mean squared error (RMSE) to evaluate the estimation of the functional coefficients:
RMSE = { (1/N) ∑_{k=1}^{N} ‖ θ̂(U_k) − θ(U_k) ‖² }^{1/2},
where U_k (k = 1, 2, …, N) are grid points on the bounded support U. In this case, we set N = 100 and take the U_k equally spaced on the interval [0, 1]. The RMSEs of the functional coefficient estimators are presented in Table 3 for π = π₁, π₂, c = 2, 4, k = 0.2, 0.4, and n = 50, 100, 200, 300.
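The RMSE criterion is a direct computation over the grid; a sketch with hypothetical names:

```python
import numpy as np

def rmse(theta_hat_grid, theta_grid):
    """RMSE over N grid points: [ (1/N) sum_k ||theta_hat(U_k) - theta(U_k)||^2 ]^{1/2}.
    Inputs are (N,) or (N, p) arrays of estimated and true coefficient values."""
    d = np.asarray(theta_hat_grid) - np.asarray(theta_grid)
    d = d.reshape(d.shape[0], -1)     # (N, p); p = 1 for a scalar coefficient
    return float(np.sqrt(np.mean(np.sum(d**2, axis=1))))
```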
According to Table 3, the proposed IAWPLS estimation method for the functional coefficients has smaller RMSEs than the NAWPLS and IWPLS methods for given π, k, c, n. The RMSEs of all three estimators decrease as n increases. Nevertheless, the RMSEs increase with c, k, or the missing probability.

4. Real Data Analysis

We applied our adaptive-weighted adjusted estimation method to the National Longitudinal Survey of Young Men (NLSYM) dataset, which includes 3010 samples from 1976. This dataset has been widely used to analyze endogeneity issues in parametric and semi-parametric models, such as in [18,21,23,34]. We aim to study the potential relationship between the log of hourly wage in cents (Y) and six other explanatory variables: the years of schooling (Z₁), the dummy variables black (Z₂), south (Z₃), and standard metropolitan statistical area (Z₄, smsa), age (U), and work experience (X), constructed as U − Z₁ − 6; further details regarding the variables in the dataset can be found in [34]. Similar to [20,34], we constructed the following semi-varying coefficient model:
Y = Xθ(U) + Z₁β₁ + Z₂β₂ + Z₃β₃ + Z₄β₄ + ε.
Based on the idea of [34], we took proximity to a four-year college as the corresponding instrumental variable, since the years of schooling are not randomly assigned and are endogenous. For the missing data, we used the following selection probability model to randomly delete approximately 11% of the responses:
π(x, u, z) = P(δ = 1 | X = x, U = u, Z = (z₁, z₂, z₃, z₄)^⊤) = [ 1 + exp(1 + 1.5x − 0.4z₁ − 0.4z₂ − 0.5z₃ − 0.7z₄ − 0.5u) ]⁻¹.
In addition, the Gaussian kernel function was chosen in this case, and the bandwidths h₁, h₂, and h̃ were all set to 0.3 for ease of calculation. We first applied the adjusted profile least squares method to obtain the initial estimators of the unknown parameters and functional coefficients. Then, the fitted values Ŷ and residuals Y − Ŷ were derived by a simple calculation, and a scatter plot of Ŷ against Y − Ŷ is presented in Figure 3.
Figure 3 demonstrates that the model residuals show a certain linear trend instead of random variation. Therefore, we concluded that there should be a heteroscedasticity structure. To further reveal the heteroscedasticity, an adjusted Nadaraya–Watson kernel estimation method is suggested for estimating the variance function, and the estimated curve is presented in Figure 4.
From Figure 4, we found that the variance function shows a significant downward trend as age U increases, indicating the existence of heteroscedasticity in this specified model. Then, we employed the proposed adaptive-weighted adjusted estimation method to estimate the unknown parameters and functional coefficients. For comparison, the three estimation methods IAWPLS, NAWPLS, and IWPLS were included, and the estimation results for the unknown parameters and functional coefficients are shown in Table 4 and Figure 5, respectively.
According to Table 4 and Figure 5, we found that although the IAWPLS and IWPLS estimators are relatively close to each other, the IWPLS estimation method slightly overestimated the parameter vector. In addition, there is a significant deviation in the NAWPLS estimators compared with the other estimators. Overall, our proposed estimation method can effectively eliminate the endogeneity and heteroscedasticity for semi-varying coefficient models with missing data.

5. Conclusions

In this paper, we study an estimation problem for semi-varying coefficient instrumental variable models with missing data in which the model error is simultaneously subject to heteroscedasticity. An adaptive-weighted adjusted estimation procedure is proposed based on the instrumental variable adjustment technique. The consistency of the variance function estimator is established, and the asymptotic distributions of the estimators of the unknown parameters and functional coefficients are also derived under some regularity conditions. Moreover, numerous simulation studies and an NLSYM data analysis further demonstrate the effectiveness of the proposed estimation method. However, we only discuss the estimation problem for the semi-varying coefficient instrumental variable model with heteroscedasticity and missing data in this study. More interesting research topics can be explored in the future, including variable selection and model averaging. In addition, high-dimensional data have become a focus of statistical research, and how to develop statistical inference methods and theory for semi-varying coefficient heteroscedastic instrumental variable models with high-dimensional data is an interesting research direction. These issues will be studied in future work.

Author Contributions

Conceptualization, W.Z., S.M. and J.L.; Methodology, W.Z., S.M. and J.L.; Validation, W.Z., S.M. and J.L.; Formal analysis, W.Z.; Investigation, W.Z.; Data curation, W.Z.; Writing—original draft preparation, W.Z.; Writing—review and editing, W.Z., S.M. and J.L.; Supervision, W.Z., S.M. and J.L.; Project administration, W.Z. and S.M.; Funding acquisition, W.Z. and S.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Inner Mongolia Autonomous Region of China (Nos. 2023QN01001, 2022MS07014), the National Natural Science Foundation of China (No. 12271046 and 71661027), and the Research Program of Humanities and Social Sciences at Universities of Inner Mongolia Autonomous Region of China (No. NJSY22497).

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available to protect sensitive information.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Lemma A1
(Mack and Silverman [35]). Suppose that $(X_1, Y_1), \ldots, (X_n, Y_n)$ are i.i.d. random vectors, where the $Y_i\ (i = 1, \ldots, n)$ are scalar random variables. Assume that $E|Y_i|^s < \infty$ and $\sup_x \int |y|^s f(x, y)\,dy < \infty$, where $f(x, y)$ denotes the joint probability density of the random variables $X$ and $Y$, and $K(\cdot)$ is a bounded positive function defined on a bounded support. Moreover, $K(\cdot)$ satisfies a Lipschitz condition. Given $n^{2r-1}h \to \infty$ for some $r < 1 - s^{-1}$, we have:
$$\sup_x\left| \frac{1}{n}\sum_{i=1}^{n}\left\{ K_h(X_i - x)Y_i - E\left[ K_h(X_i - x)Y_i \right] \right\} \right| = O_p\!\left( \left\{ \frac{\log(1/h)}{nh} \right\}^{1/2} \right).$$
Lemma A2
(Shi and Lau [36]). Let $T_1, \ldots, T_n$ be i.i.d. random variables. If $E|T_i|^s$ is bounded for some $s > 1$, then $\max_{1 \le i \le n}|T_i| = o(n^{1/s})$, a.s.
Lemma A3
(Chen et al. [31]). Let $\tau_i = (1, X_i', U_i, Z_i')'$, let $\lambda_{\min}$ denote the smallest eigenvalue of $\sum_{i=1}^{n}\tau_i\tau_i'$, and assume that $\sup_{i \ge 1}\|\tau_i\| < \infty$ and $\lambda_{\min} \to \infty$; then, the quasi-likelihood estimator $\hat{w} = (\hat{w}_0, \hat{w}_1, \hat{w}_2, \hat{w}_3)'$ of $w = (w_0, w_1, w_2, w_3)'$ satisfies:
$$\sqrt{n}(\hat{w} - w) = A^{-1} n^{-1/2}\sum_{i=1}^{n}\tau_i(\delta_i - \pi_i) + o_p(1),$$
where $A = E[\tau_1\tau_1'\pi_1(1 - \pi_1)]$.
Lemma A4.
Under regularity conditions (C1)–(C9), we have:
$$\max_{1 \le i \le n}\left| \frac{\delta_i}{\pi(V_i, \hat{w})} - \frac{\delta_i}{\pi(V_i, w)} \right| = o_p\!\left( n^{-\frac{1}{2}+\frac{1}{2s}} \right).$$
Proof of Lemma A4.
According to Lemma A3, we use a first-order Taylor expansion of $\delta_i/\pi(V_i, \hat{w})$ at $w$:
$$\frac{\delta_i}{\pi(V_i, \hat{w})} = \frac{\delta_i}{\pi(V_i, w)} + \left[ \frac{\partial}{\partial w}\frac{\delta_i}{\pi(V_i, w)} \right]'(\hat{w} - w) + o_p(\|\hat{w} - w\|) = \frac{\delta_i}{\pi(V_i, w)} - \frac{\delta_i\,\dot{\pi}(V_i, w)'}{\pi^2(V_i, w)}(\hat{w} - w) + o_p(1).$$
By condition (C9) and Lemma A2, we have:
$$\left| \frac{\delta_i}{\pi(V_i, \hat{w})} - \frac{\delta_i}{\pi(V_i, w)} \right| \le \max_{1 \le i \le n}\left\| \frac{\tau_i\,\delta_i\left[ 1 - \pi(V_i, w) \right]}{\pi(V_i, w)} \right\|\,\|\hat{w} - w\| + o_p(1) = o_p\!\left( n^{-\frac{1}{2}+\frac{1}{2s}} \right).$$
This completes the proof of Lemma A4. □
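The first-order expansion at the heart of this proof is easy to check numerically. The sketch below uses a hypothetical scalar logistic propensity pi(v, w) = 1/(1 + e^{-wv}) as a stand-in for the model's selection probability; for this link, the derivative of delta/pi(v, w) in w is -delta*v*(1 - pi)/pi, matching the bound used above.

```python
import numpy as np

def pi_fn(v, w):
    """Illustrative logistic propensity standing in for pi(V, w)."""
    return 1.0 / (1.0 + np.exp(-w * v))

v, w, delta = 0.7, 1.2, 1.0
dw = 1e-3                                   # small perturbation playing the role of (w_hat - w)

p = pi_fn(v, w)
# For the logistic link, d(pi)/dw = v * pi * (1 - pi), hence
# d/dw [delta / pi] = -delta * v * (1 - pi) / pi.
deriv = -delta * v * (1.0 - p) / p

exact = delta / pi_fn(v, w + dw)
first_order = delta / p + deriv * dw        # Taylor expansion used in the proof
zeroth_order = delta / p
```

The first-order approximation error is of order dw^2, i.e. o_p of the perturbation, which is exactly the remainder the proof discards.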
Lemma A5.
Under regularity conditions (C1)–(C9), as $n \to \infty$, it holds that:
$$\frac{1}{n}\hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta}\hat{X}_{h_2}(u_0) = \sigma^{-2}(u_0) f(u_0)\left[ \Psi'\Pi(u_0)\Psi \right] \otimes \begin{pmatrix} 1 & 0 \\ 0 & \mu_2 \end{pmatrix}\{1 + O_p(c_{n_2})\}, \tag{A1}$$
$$\frac{1}{n}\hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta} Z = \sigma^{-2}(u_0) f(u_0)\left[ \Psi'\Xi(u_0) \right] \otimes (1,\ 0)'\{1 + O_p(c_{n_2})\}, \tag{A2}$$
$$\frac{1}{n}\hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta} M = \sigma^{-2}(u_0) f(u_0)\left[ \Psi'\Pi(u_0)\Psi\,\theta(u_0) \right] \otimes (1,\ 0)'\{1 + O_p(c_{n_2})\}, \tag{A3}$$
$$\frac{1}{n}\hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta}\varepsilon = O_p(c_{n_2}), \tag{A4}$$
where $\otimes$ denotes the Kronecker product.
Proof of Lemma A5.
For any $u_0 \in \mathcal{U}$, by some simple calculation, we have:
$$\frac{1}{n}\hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta}\hat{X}_{h_2}(u_0) = \frac{1}{n}\begin{pmatrix} \sum_{i=1}^{n} B_i & \sum_{i=1}^{n} B_i h_2^{-1}(U_i - u_0) \\ \sum_{i=1}^{n} B_i h_2^{-1}(U_i - u_0) & \sum_{i=1}^{n} B_i h_2^{-2}(U_i - u_0)^2 \end{pmatrix},$$
where $B_i = [\delta_i/\pi(V_i, \hat{w})]\,\hat{\sigma}^{-2}(U_i)\,\hat{X}_i\hat{X}_i' K_{h_2}(U_i - u_0)$.
We first consider the term in the upper-left corner of the matrix. It is noteworthy that $\hat{\Psi}$ is the usual moment estimator of $\Psi$; according to [16], we obtain $\hat{\Psi} = \Psi + O_p(n^{-1/2})$. Therefore, combining Theorem 1, Lemma A4, and condition (C8), we can obtain:
$$\begin{aligned}
\frac{1}{n}\sum_{i=1}^{n}\hat{\sigma}^{-2}(U_i)\frac{\delta_i}{\pi(V_i, \hat{w})}\hat{X}_i\hat{X}_i' K_{h_2}(U_i - u_0)
={}& \frac{1}{n}\sum_{i=1}^{n}\sigma^{-2}(U_i)\frac{\delta_i}{\pi(V_i, w)}\Psi'\zeta_i\zeta_i'\Psi K_{h_2}(U_i - u_0) \\
&+ \frac{1}{n}\sum_{i=1}^{n}\left[ \hat{\sigma}^{-2}(U_i) - \sigma^{-2}(U_i) \right]\frac{\delta_i}{\pi(V_i, w)}\Psi'\zeta_i\zeta_i'\Psi K_{h_2}(U_i - u_0) \\
&+ \frac{1}{n}\sum_{i=1}^{n}\sigma^{-2}(U_i)\left[ \frac{\delta_i}{\pi(V_i, \hat{w})} - \frac{\delta_i}{\pi(V_i, w)} \right]\Psi'\zeta_i\zeta_i'\Psi K_{h_2}(U_i - u_0) \\
&+ \frac{1}{n}\sum_{i=1}^{n}\left[ \hat{\sigma}^{-2}(U_i) - \sigma^{-2}(U_i) \right]\left[ \frac{\delta_i}{\pi(V_i, \hat{w})} - \frac{\delta_i}{\pi(V_i, w)} \right]\Psi'\zeta_i\zeta_i'\Psi K_{h_2}(U_i - u_0) + O_p(n^{-1/2}) \\
={}& \sigma^{-2}(u_0) f(u_0)\Psi'\Pi(u_0)\Psi + o_p(\tilde{c}_n) + o_p(n^{-\frac{1}{2}+\frac{1}{2s}}) + o_p(\tilde{c}_n)\, o_p(n^{-\frac{1}{2}+\frac{1}{2s}}) + O_p(n^{-1/2}) \\
={}& \sigma^{-2}(u_0) f(u_0)\Psi'\Pi(u_0)\Psi + O_p(\tilde{c}_{n_2}).
\end{aligned}$$
By using the same argument as above, the other entries of the matrix can be handled, which leads to the result in (A1). The proofs of (A2)–(A4) are similar to that of (A1), so we omit the details here. □
Lemma A6.
Under regularity conditions (C1)–(C9), as $n \to \infty$, we have:
$$\frac{1}{n}\hat{Z}'\Sigma^{-1}\Delta\hat{Z} \to \Lambda_1, \quad a.s.,$$
where $\Sigma = \operatorname{diag}[\sigma^2(U_1), \sigma^2(U_2), \ldots, \sigma^2(U_n)]$, $\Delta = \operatorname{diag}\!\left[ \delta_1/\pi(V_1, \hat{w}), \ldots, \delta_n/\pi(V_n, \hat{w}) \right]$, and $\Lambda_1$ is given in Theorem 2.
Proof of Lemma A6.
Invoking (A1) and (A2), it is easy to show that:
$$(\hat{X}',\ 0')\left\{ \hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta}\hat{X}_{h_2}(u_0) \right\}^{-1}\hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta} Z = (\Psi'\zeta)'\left( \Psi'\Pi(u_0)\Psi \right)^{-1}\Psi'\Xi(u_0)\{1 + O_p(c_{n_2})\}.$$
Then, we have:
$$\hat{S}Z = \begin{pmatrix} (\Psi'\zeta_1)'(\Psi'\Pi(U_1)\Psi)^{-1}\Psi'\Xi(U_1) \\ \vdots \\ (\Psi'\zeta_n)'(\Psi'\Pi(U_n)\Psi)^{-1}\Psi'\Xi(U_n) \end{pmatrix}\{1 + O_p(c_{n_2})\}. \tag{A5}$$
Then, invoking (A5), some calculations yield:
$$\frac{1}{n}\hat{Z}'\Sigma^{-1}\Delta\hat{Z} = \frac{1}{n}(Z - \hat{S}Z)'\Sigma^{-1}\Delta(Z - \hat{S}Z) = \frac{1}{n}\sum_{i=1}^{n}\sigma^{-2}(U_i)\frac{\delta_i}{\pi(V_i, w)}\eta_i\eta_i' + O_p(c_{n_2}),$$
where $\eta_i = Z_i - \Xi(U_i)'\Psi(\Psi'\Pi(U_i)\Psi)^{-1}\Psi'\zeta_i$. Then, by a law of large numbers, Lemma A6 can be easily proven. □
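Lemma A6 is a weighted law of large numbers. The toy check below (with illustrative, assumed distributions for U, eta, the variance function, and the selection probability, not the paper's model) averages inverse-probability, inverse-variance weighted outer products and compares the average with its population limit.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200000
U = rng.uniform(0.0, 1.0, n)
eta = rng.normal(size=(n, 2))               # stand-in for the adjusted residual vectors eta_i
sigma2 = 1.0 + U                            # assumed variance function sigma^2(U)
pi = 0.6 + 0.3 * U                          # assumed selection probability pi(V_i, w)
delta = rng.binomial(1, pi)                 # missingness indicator delta_i

w = delta / (pi * sigma2)                   # combined inverse-probability / inverse-variance weight
G = (eta * w[:, None]).T @ eta / n          # (1/n) * sum_i w_i * eta_i eta_i'

# Here eta_i is independent of U_i with identity covariance and E[delta_i | U_i] = pi,
# so the population limit is E[1/sigma^2(U)] * I_2 = log(2) * I_2.
limit = np.log(2.0) * np.eye(2)
```

The sample average settles on the deterministic limit matrix, which plays the role of Lambda_1 in this simplified setting.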
Lemma A7.
Under regularity conditions (C1)–(C9), as $n \to \infty$, we can obtain:
$$n^{-1/2}\hat{Z}'\Sigma^{-1}\Delta(\hat{M}_w + \hat{\varepsilon}) \xrightarrow{D} N(0, \Lambda_2),$$
where $\hat{\varepsilon} = (I - \hat{S})\varepsilon$, $\hat{M}_w = (I - \hat{S})M$, and $\Lambda_2$ is given in Theorem 2.
Proof of Lemma A7.
Invoking (A1) and (A3), it is easy to check that:
$$(\hat{X}',\ 0')\left\{ \hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta}\hat{X}_{h_2}(u_0) \right\}^{-1}\hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta} M = (\Psi'\zeta)'\theta(u_0)\{1 + O_p(c_{n_2})\}.$$
Then, invoking (A5), some calculations yield:
$$\begin{aligned}
n^{-1/2}\hat{Z}'\Sigma^{-1}\Delta\hat{M}_w &= n^{-1/2} Z'(I - \hat{S})'\Sigma^{-1}\Delta(I - \hat{S})M \\
&= n^{-1/2}\sum_{i=1}^{n}\sigma^{-2}(U_i)\frac{\delta_i}{\pi(V_i, w)}\eta_i\{1 + O_p(c_{n_2})\}\left[ X_i'\theta(U_i) - (\Psi'\zeta_i)'\theta(U_i)(1 + O_p(c_{n_2})) \right] \\
&= n^{-1/2}\sum_{i=1}^{n}\sigma^{-2}(U_i)\frac{\delta_i}{\pi(V_i, w)}\eta_i(\Psi'\zeta_i)'\theta(U_i)\, O_p(c_{n_2}) + n^{-1/2}\sum_{i=1}^{n}\sigma^{-2}(U_i)\frac{\delta_i}{\pi(V_i, w)}\eta_i e_i'\theta(U_i).
\end{aligned}$$
Note that $E[\eta_i(\Psi'\zeta_i)'\theta(U_i) \mid U_i] = 0$, and then we can prove:
$$n^{-1/2}\sum_{i=1}^{n}\sigma^{-2}(U_i)\frac{\delta_i}{\pi(V_i, w)}\eta_i(\Psi'\zeta_i)'\theta(U_i)\, O_p(c_{n_2}) = O_p(n^{1/2} c_{n_2}^{2}).$$
Therefore, we derive:
$$n^{-1/2}\hat{Z}'\Sigma^{-1}\Delta\hat{M}_w = n^{-1/2}\sum_{i=1}^{n}\sigma^{-2}(U_i)\frac{\delta_i}{\pi(V_i, w)}\eta_i e_i'\theta(U_i) + O_p(n^{1/2} c_{n_2}^{2}). \tag{A6}$$
In addition, invoking (A1) and (A4), we have:
$$(\hat{X}',\ 0')\left\{ \hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta}\hat{X}_{h_2}(u_0) \right\}^{-1}\hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta}\varepsilon = O_p(c_{n_2}).$$
Then, invoking (A5), some calculations yield:
$$n^{-1/2}\hat{Z}'\Sigma^{-1}\Delta\hat{\varepsilon} = n^{-1/2} Z'(I - \hat{S})'\Sigma^{-1}\Delta(I - \hat{S})\varepsilon = n^{-1/2}\sum_{i=1}^{n}\sigma^{-2}(U_i)\frac{\delta_i}{\pi(V_i, w)}\eta_i\varepsilon_i + O_p(c_{n_2}). \tag{A7}$$
Hence, combining (A6), (A7), and condition (C8), with the help of the central limit theorem, the result of Lemma A7 can be obtained. □
The proof of Theorem 1 is similar to that of Theorem 1 in [9]. We omit the details here.
Proof of Theorem 2.
Note that:
$$\sqrt{n}(\hat{\beta}_w - \beta) = \sqrt{n}(\hat{\beta}_w - \hat{\beta}_v) + \sqrt{n}(\hat{\beta}_v - \beta),$$
where $\hat{\beta}_v = (\hat{Z}'\Sigma^{-1}\Delta\hat{Z})^{-1}\hat{Z}'\Sigma^{-1}\Delta\hat{Y}$. Then, we need to complete the proof of
$$\sqrt{n}(\hat{\beta}_v - \beta) \xrightarrow{D} N(0, \Lambda_1^{-1}\Lambda_2\Lambda_1^{-1}), \quad n \to \infty, \tag{A8}$$
and
$$\sqrt{n}(\hat{\beta}_w - \hat{\beta}_v) = o_p(1). \tag{A9}$$
Multiplying both sides of model (1) by $(I - \hat{S})$, we obtain:
$$\hat{Y} = \hat{Z}\beta + \hat{M}_w + \hat{\varepsilon}.$$
Then, we have:
$$\sqrt{n}(\hat{\beta}_v - \beta) = (n^{-1}\hat{Z}'\Sigma^{-1}\Delta\hat{Z})^{-1}\, n^{-1/2}\hat{Z}'\Sigma^{-1}\Delta(\hat{M}_w + \hat{\varepsilon}).$$
Invoking Lemma A6 and Lemma A7, we can derive the result of (A8) using the Slutsky Theorem. Since:
$$\begin{aligned}
\sqrt{n}(\hat{\beta}_w - \hat{\beta}_v) &= \sqrt{n}\left[ (\hat{Z}'\hat{\Sigma}^{-1}\hat{\Delta}\hat{Z})^{-1}\hat{Z}'\hat{\Sigma}^{-1}\hat{\Delta}\hat{Y} - (\hat{Z}'\Sigma^{-1}\Delta\hat{Z})^{-1}\hat{Z}'\Sigma^{-1}\Delta\hat{Y} \right] \\
&= \sqrt{n}\left[ (\hat{Z}'\hat{\Sigma}^{-1}\hat{\Delta}\hat{Z})^{-1} - (\hat{Z}'\Sigma^{-1}\Delta\hat{Z})^{-1} \right]\hat{Z}'\Sigma^{-1}\Delta(\hat{M}_w + \hat{\varepsilon}) + \sqrt{n}\,(\hat{Z}'\hat{\Sigma}^{-1}\hat{\Delta}\hat{Z})^{-1}\hat{Z}'(\hat{\Sigma}^{-1}\hat{\Delta} - \Sigma^{-1}\Delta)(\hat{M}_w + \hat{\varepsilon}).
\end{aligned}$$
Thus, to obtain the result of (A9), we have to prove:
$$n^{-1}(\hat{Z}'\hat{\Sigma}^{-1}\hat{\Delta}\hat{Z} - \hat{Z}'\Sigma^{-1}\Delta\hat{Z}) = o_p(1), \qquad n^{-1/2}\hat{Z}'\Sigma^{-1}\Delta(\hat{M}_w + \hat{\varepsilon}) = O_p(1),$$
$$n^{-1}\hat{Z}'\hat{\Sigma}^{-1}\hat{\Delta}\hat{Z} = O_p(1), \qquad n^{-1/2}\hat{Z}'(\hat{\Sigma}^{-1}\hat{\Delta} - \Sigma^{-1}\Delta)(\hat{M}_w + \hat{\varepsilon}) = o_p(1).$$
Using a similar method to the proof of Theorem 3 in [9], it is easy to prove the above conclusions under regularity conditions (C1)–(C9), so the details are omitted here. Then, combining (A8) and (A9) yields the result in Theorem 2. □
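Stripped of the profiling and instrument adjustment, the estimator analyzed above is a weighted least-squares solve with inverse-variance, inverse-probability weights. The sketch below implements that generic solve on simulated exogenous data; the design, the variance function, and the selection probability are illustrative assumptions, not the paper's full procedure.

```python
import numpy as np

rng = np.random.default_rng(3)
n, beta = 50000, np.array([1.5, 2.0])
Z = rng.normal(size=(n, 2))                   # exogenous covariates in this toy version
U = rng.uniform(0.0, 1.0, n)
sigma2 = np.exp(-U)                           # heteroscedastic error variance sigma^2(U)
pi = 0.7 + 0.2 * U                            # known selection probability
delta = rng.binomial(1, pi)                   # response observed iff delta = 1
Y = Z @ beta + rng.normal(0.0, np.sqrt(sigma2))

w = delta / (pi * sigma2)                     # the Sigma^{-1} Delta weights
ZtW = Z.T * w                                 # Z' Sigma^{-1} Delta
beta_hat = np.linalg.solve(ZtW @ Z, ZtW @ Y)  # (Z' W Z)^{-1} Z' W Y
```

Reweighting by delta/pi undoes the missingness and the 1/sigma^2 factor downweights noisy observations, which is why the estimator remains consistent under heteroscedasticity with responses missing at random.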
Proof of Theorem 3.
Recall the definition of θ ^ w ( u 0 , β ^ w ) in (21); we have:
$$\begin{aligned}
\hat{\theta}_w(u_0, \hat{\beta}_w) ={}& (I_p, 0_p)\left\{ \hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta}\hat{X}_{h_2}(u_0) \right\}^{-1}\hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta}(Y - Z\hat{\beta}_w) \\
={}& (I_p, 0_p)\left\{ \hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta}\hat{X}_{h_2}(u_0) \right\}^{-1}\hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta} M_\zeta \\
&+ (I_p, 0_p)\left\{ \hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta}\hat{X}_{h_2}(u_0) \right\}^{-1}\hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta} Z(\beta - \hat{\beta}_w) \\
&+ (I_p, 0_p)\left\{ \hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta}\hat{X}_{h_2}(u_0) \right\}^{-1}\hat{X}_{h_2}(u_0)' w_{h_2}(u_0)\hat{\Sigma}^{-1}\hat{\Delta}(\varepsilon + e) \\
=:{}& K_1 + K_2 + K_3,
\end{aligned}$$
where:
$$M_\zeta = \left[ (\Psi'\zeta_1)'\theta(U_1), (\Psi'\zeta_2)'\theta(U_2), \ldots, (\Psi'\zeta_n)'\theta(U_n) \right]', \qquad e = \left( e_1'\theta(U_1), e_2'\theta(U_2), \ldots, e_n'\theta(U_n) \right)'.$$
Let us consider $K_1$ first; for any point $U_i$ in the neighborhood of $u_0$, each functional coefficient $\theta(U_i)$ can be approximated by:
$$\theta(U_i) = \theta(u_0) + h_2\,\theta'(u_0)\frac{U_i - u_0}{h_2} + \frac{h_2^2}{2}\,\theta''(u_0)\left( \frac{U_i - u_0}{h_2} \right)^2 + o_p(h_2^2).$$
Then, we have:
$$M_\zeta = \begin{pmatrix} (\Psi'\zeta_1)'\theta(U_1) \\ \vdots \\ (\Psi'\zeta_n)'\theta(U_n) \end{pmatrix} = \hat{X}_{h_2}(u_0)\begin{pmatrix} \theta(u_0) \\ h_2\,\theta'(u_0) \end{pmatrix} + \frac{h_2^2}{2}\begin{pmatrix} \hat{X}_1'\left( \frac{U_1 - u_0}{h_2} \right)^2 \\ \vdots \\ \hat{X}_n'\left( \frac{U_n - u_0}{h_2} \right)^2 \end{pmatrix}\theta''(u_0) + o_p(h_2^2).$$
By Lemma 1, we obtain:
$$K_1 = \theta(u_0) + \frac{h_2^2}{2}\,\mu_2\,\theta''(u_0) + o_p(h_2^2). \tag{A10}$$
For K 2 , combining (A1), (A2), Theorem 2, and condition (C8), we can obtain that:
$$\sqrt{n h_2}\, K_2 = \sqrt{n h_2}\,\left( \Psi'\Pi(u_0)\Psi \right)^{-1}\Psi'\Xi(u_0)\{1 + O_p(c_{n_2})\}\, O_p(n^{-1/2}) = o_p(1). \tag{A11}$$
Now, we consider $K_3$. Combining (A1), Theorem 1, and Lemma A4, we can derive:
$$\sqrt{n h_2}\, K_3 = \sigma^{2}(u_0) f^{-1}(u_0)\left( \Psi'\Pi(u_0)\Psi \right)^{-1} \times \sqrt{n h_2}\,\frac{1}{n}\sum_{i=1}^{n}\sigma^{-2}(U_i)\frac{\delta_i}{\pi(V_i, w)}\hat{X}_i K_{h_2}(U_i - u_0)\left( \varepsilon_i + e_i'\theta(U_i) \right) + o_p(1).$$
Note that $\sqrt{n h_2}\,\frac{1}{n}\sum_{i=1}^{n}\sigma^{-2}(U_i)\frac{\delta_i}{\pi(V_i, w)}\hat{X}_i K_{h_2}(U_i - u_0)(\varepsilon_i + e_i'\theta(U_i))$ follows an asymptotic normal distribution with mean zero and variance
$$\nu_0 f(u_0)\sigma^{-4}(u_0)\, E\!\left\{ \pi(V_1, w)^{-1}\left( e_1'\theta(U_1) + \varepsilon_1 \right)^2 \right\}\Psi'\Pi(u_0)\Psi.$$
Then, by the Slutsky Theorem, we have:
$$\sqrt{n h_2}\, K_3 \xrightarrow{D} N\!\left( 0,\ \nu_0 f^{-1}(u_0)\, E\!\left\{ \pi(V_1, w)^{-1}\left( e_1'\theta(U_1) + \varepsilon_1 \right)^2 \right\}\left( \Psi'\Pi(u_0)\Psi \right)^{-1} \right). \tag{A12}$$
Combining (A10)–(A12) together with the Slutsky Theorem yields the result in Theorem 3. □
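The functional coefficient estimator analyzed in Theorem 3 rests on the local linear Taylor approximation used above. As a self-contained illustration, the sketch below recovers an assumed functional coefficient theta(u) = sin(2*pi*u) by kernel-weighted local linear least squares; the design, noise level, and bandwidth are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(4)
n, h = 5000, 0.05
U = rng.uniform(0.0, 1.0, n)
X = rng.normal(size=n)
theta = lambda u: np.sin(2.0 * np.pi * u)     # assumed functional coefficient
Y = theta(U) * X + 0.3 * rng.normal(size=n)

def local_linear_theta(u0):
    """Kernel-weighted local linear fit: a estimates theta(u0), b estimates h * theta'(u0)."""
    K = np.exp(-0.5 * ((U - u0) / h) ** 2)    # Gaussian kernel weights
    D = np.column_stack([X, X * (U - u0) / h])
    W = D.T * K
    a, b = np.linalg.solve(W @ D, W @ Y)
    return a

est_low, est_high = local_linear_theta(0.25), local_linear_theta(0.75)
```

The fitted values track theta near its peak (+1 at u = 0.25) and trough (−1 at u = 0.75), up to the usual O(h^2) smoothing bias that appears in (A10).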

References

  1. Zhang, W.Y.; Lee, S.Y.; Song, X.Y. Local polynomial fitting in semi-varying coefficient models. J. Multivar. Anal. 2002, 82, 166–188. [Google Scholar] [CrossRef]
  2. Zhou, X.; You, J.H. Wavelet estimation in varying-coefficient partially linear regression model. Stat. Probab. Lett. 2004, 68, 91–104. [Google Scholar] [CrossRef]
  3. Fan, J.Q.; Huang, T. Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli 2005, 11, 1031–1057. [Google Scholar] [CrossRef]
  4. Zhao, P.X.; Xue, L.G. Variable selection for semiparametric varying coefficient partially linear models. Stat. Probab. Lett. 2009, 79, 2148–2157. [Google Scholar] [CrossRef]
  5. Kai, B.; Li, R.Z.; Zou, H. New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann. Stat. 2011, 39, 305–332. [Google Scholar] [CrossRef] [PubMed]
  6. Yang, J.; Lu, F.; Yang, H. Quantile regression for robust estimation and variable selection in partially linear varying-coefficient models. Stat. J. Theor. Appl. Stat. 2017, 51, 1–21. [Google Scholar] [CrossRef]
  7. Li, Y.J.; Li, G.R.; Lian, H.; Tong, T.J. Profile forward regression screening for ultra-high dimensional semiparametric varying coefficient partially linear models. J. Multivar. Anal. 2017, 155, 133–150. [Google Scholar] [CrossRef]
  8. Zhao, P.X.; Yang, Y.P. A new orthogonality-based estimation for varying-coefficient partially linear models. J. Korean Stat. Soc. 2019, 48, 29–39. [Google Scholar] [CrossRef]
  9. Shen, S.L.; Cui, J.L.; Mei, C.L.; Wang, C.L. Estimation and inference of semi-varying coefficient models with heteroscedastic errors. J. Multivar. Anal. 2014, 124, 70–93. [Google Scholar] [CrossRef]
  10. Zhao, Y.Y.; Lin, J.G.; Xu, P.R.; Ye, X.G. Orthogonality-projection-based estimation for semi-varying coefficient models with heteroscedastic errors. Comput. Stat. Data Anal. 2015, 89, 204–221. [Google Scholar] [CrossRef]
  11. Zhao, F.R.; Song, W.X.; Shi, J.H. Statistical inference for heteroscedastic semi-varying coefficient EV models. Commun. Stat.-Theory Methods 2018, 48, 2432–2455. [Google Scholar] [CrossRef]
  12. Zhang, W.W.; Li, G.R. Weighted bias-corrected restricted statistical inference for heteroscedastic semiparametric varying-coefficient errors-in-variables model. J. Korean Stat. Soc. 2021, 50, 1098–1128. [Google Scholar] [CrossRef]
  13. Yuan, Y.Z.; Zhou, Y. Adaptive-weighted estimation of semi-varying coefficient models with heteroscedastic errors. J. Stat. Comput. Simul. 2021, 91, 3029–3047. [Google Scholar] [CrossRef]
  14. Greenland, S. An introduction to instrumental variables for epidemiologists. Int. J. Epidemiol. 2000, 29, 722–729. [Google Scholar] [CrossRef] [PubMed]
  15. Fan, J.Q.; Liao, Y. Endogeneity in high dimensions. Ann. Stat. 2014, 42, 872–917. [Google Scholar] [CrossRef]
  16. Cai, Z.W.; Xiong, H.Y. Partially varying coefficient instrumental variables models. Stat. Neerl. 2012, 66, 85–110. [Google Scholar] [CrossRef]
  17. Zhao, P.X.; Li, G.R. Modified SEE variable selection for varying coefficient instrumental variable models. Stat. Methodol. 2013, 12, 60–70. [Google Scholar] [CrossRef]
  18. Zhao, P.X.; Xue, L.G. Empirical likelihood inferences for semiparametric instrumental variable models. J. Appl. Math. Comput. 2013, 43, 75–90. [Google Scholar] [CrossRef]
  19. Yuan, J.Y.; Zhao, P.X.; Zhang, W.G. Semiparametric variable selection for partially varying coefficient models with endogenous variables. Comput. Stat. 2016, 31, 693–707. [Google Scholar] [CrossRef]
  20. Zhao, P.X.; Zhou, X.S.; Wang, X.L.; Huang, X.S. A new orthogonality empirical likelihood for varying coefficient partially linear instrumental variable models with longitudinal data. Commun. Stat. Simul. Comput. 2020, 49, 3328–3344. [Google Scholar] [CrossRef]
  21. Yao, F. Efficient semiparametric instrumental variable estimation under conditional heteroskedasticity. J. Quant. Econ. 2012, 10, 32–55. [Google Scholar]
  22. Yang, Y.P.; Chen, L.F.; Zhao, P.X. Empirical likelihood inference in partially linear single-index models with endogenous covariates. Commun. Stat.-Theory Methods 2017, 46, 3297–3307. [Google Scholar] [CrossRef]
  23. Huang, J.T.; Zhao, P.X. Orthogonal weighted empirical likelihood-based variable selection for semiparametric instrumental variable models. Commun. Stat.-Theory Methods 2018, 47, 4375–4388. [Google Scholar] [CrossRef]
  24. Tang, X.R.; Zhao, P.X.; Yang, Y.P.; Yang, W.M. Adjusted empirical likelihood inferences for varying coefficient partially non linear models with endogenous covariates. Commun. Stat.-Theory Methods 2022, 51, 953–973. [Google Scholar] [CrossRef]
  25. Rubin, D.B. Inference and missing data. Biometrika 1976, 63, 581–592. [Google Scholar] [CrossRef]
  26. Robins, J.M.; Rotnitzky, A.; Zhao, L.P. Estimation of regression coefficient when some regressors are not always observed. J. Am. Stat. Assoc. 1994, 89, 846–866. [Google Scholar] [CrossRef]
  27. Wang, Q.H.; Rao, J.N.K. Empirical likelihood-based inference in linear models with missing response data. Scand. J. Stat. 2002, 29, 563–576. [Google Scholar] [CrossRef]
  28. Wang, Q.H.; Linton, O.; Härdle, W. Semiparametric regression analysis with missing response at random. J. Am. Stat. Assoc. 2004, 99, 334–345. [Google Scholar] [CrossRef]
  29. Li, Z.Q.; Xue, L.G. The imputation estimators of semiparametric varying-coefficient models with missing data. Acta Math. Appl. Sin. 2009, 32, 422–430. (In Chinese) [Google Scholar]
  30. Chen, P.P.; Feng, S.Y.; Xue, L.G. Statistical inference for semiparametric varying coefficient partially linear model with missing data. Acta Math. Sci. 2015, 35A, 345–358. (In Chinese) [Google Scholar]
  31. Xu, H.X.; Fan, G.L.; Wu, C.X. Statistical inference for varying-coefficient partially linear errors-in-variables models with missing data. Commun. Stat.-Theory Methods 2019, 48, 5621–5636. [Google Scholar] [CrossRef]
  32. Xiao, Y.T.; Li, F.X. Estimation in partially linear varying-coefficient errors-in-variables models with missing response variables. Comput. Stat. 2020, 35, 1637–1658. [Google Scholar] [CrossRef]
  33. Yan, Y.X.; Lan, S.H.; Zhang, C.Y. Statistical inference for partially linear varying coefficient quantile models with missing responses. Symmetry 2022, 14, 2258. [Google Scholar] [CrossRef]
  34. Card, D. Using Geographic Variation in College Proximity to Estimate the Return to Schooling; Nber Working Papers; University of Toronto Press: Toronto, ON, Canada, 1993; pp. 1127–1160. [Google Scholar]
  35. Mack, Y.P.; Silverman, B.W. Weak and strong uniform consistency of kernel regression estimates. Z. Wahrscheinlichkeitstheorie Verwandte Geb. 1982, 61, 405–415. [Google Scholar] [CrossRef]
  36. Shi, J.; Lau, T.S. Empirical likelihood for partially linear models. J. Multivar. Anal. 2000, 72, 132–148. [Google Scholar] [CrossRef]
Figure 1. Plot of the variance function estimates through the use of the adjusted Nadaraya–Watson kernel estimation method (denoted using dashed lines) and the naive Nadaraya–Watson kernel estimation method (denoted using dot-dashed lines); the solid lines represent the true curves.
Figure 2. Plot of the functional coefficient estimates through the use of the IAWPLS method (denoted using dot-dashed lines), the NAWPLS method (denoted using dashed lines), and the IWPLS method (denoted using dotted lines); the solid lines represent the true curves.
Figure 3. Plot of the model residuals for NLSYM data.
Figure 4. Plot of the variance function estimate for NLSYM data.
Figure 5. Plot of the functional coefficient estimates through the use of the IAWPLS method (solid lines), the NAWPLS method (dashed lines), and the IWPLS method (dotted lines) for NLSYM data.
Table 1. Sample means and MSEs for β 1 based on the IAWPLS, NAWPLS, and IWPLS methods.
π    c   k    n     IAWPLS Mean  IAWPLS MSE   NAWPLS Mean  NAWPLS MSE   IWPLS Mean  IWPLS MSE
π1   2   0.2  50    1.5107       0.0355       1.4335       0.0329       1.5153      0.0607
π1   2   0.2  100   1.5070       0.0128       1.4379       0.0141       1.5124      0.0261
π1   2   0.2  200   1.5062       0.0056       1.4366       0.0089       1.5081      0.0121
π1   2   0.2  300   1.5019       0.0036       1.4424       0.0064       1.5023      0.0094
π1   2   0.4  50    1.5182       0.0429       1.3808       0.0377       1.5289      0.0778
π1   2   0.4  100   1.5129       0.0164       1.3898       0.0232       1.5192      0.0383
π1   2   0.4  200   1.5091       0.0074       1.3920       0.0165       1.5115      0.0203
π1   2   0.4  300   1.5030       0.0046       1.3954       0.0140       1.5041      0.0113
π1   4   0.2  50    1.5231       0.1056       1.2857       0.2360       1.5226      0.2382
π1   4   0.2  100   1.5058       0.0367       1.3025       0.0716       1.5214      0.1063
π1   4   0.2  200   1.4963       0.0143       1.3116       0.0510       1.5146      0.0481
π1   4   0.2  300   1.5019       0.0094       1.3172       0.0430       1.5062      0.0323
π1   4   0.4  50    1.5274       0.1598       1.2252       0.1324       1.5264      0.3682
π1   4   0.4  100   1.5110       0.0449       1.2361       0.0917       1.5255      0.1377
π1   4   0.4  200   1.5088       0.0182       1.2468       0.0749       1.5227      0.0813
π1   4   0.4  300   1.5053       0.0105       1.2517       0.0686       1.5204      0.0510
π2   2   0.2  50    1.5187       0.0352       1.4274       0.0364       1.5245      0.0630
π2   2   0.2  100   1.5050       0.0169       1.4289       0.0186       1.5112      0.0346
π2   2   0.2  200   1.4974       0.0072       1.4316       0.0109       1.5009      0.0151
π2   2   0.2  300   1.5027       0.0049       1.4386       0.0077       1.5028      0.0112
π2   2   0.4  50    1.4929       0.0535       1.3581       0.0502       1.4994      0.1011
π2   2   0.4  100   1.5083       0.0202       1.3785       0.0264       1.5187      0.0438
π2   2   0.4  200   1.5044       0.0079       1.3871       0.0184       1.5140      0.0221
π2   2   0.4  300   1.5032       0.0058       1.3882       0.0163       1.5093      0.0153
π2   4   0.2  50    1.5438       0.1646       1.2972       0.1531       1.5601      0.3148
π2   4   0.2  100   1.5180       0.0417       1.3103       0.0723       1.5446      0.1141
π2   4   0.2  200   1.5159       0.0192       1.3054       0.0538       1.5262      0.0625
π2   4   0.2  300   1.5062       0.0124       1.3210       0.0442       1.5165      0.0491
π2   4   0.4  50    1.5455       0.2016       1.2323       0.1424       1.5895      0.4071
π2   4   0.4  100   1.5397       0.0555       1.2566       0.0871       1.5802      0.1955
π2   4   0.4  200   1.5226       0.0271       1.2469       0.0784       1.5559      0.1111
π2   4   0.4  300   1.5223       0.0181       1.2482       0.0725       1.5511      0.1043
Table 2. Sample means and MSEs for β 2 based on the IAWPLS, NAWPLS, and IWPLS methods.
π    c   k    n     IAWPLS Mean  IAWPLS MSE   NAWPLS Mean  NAWPLS MSE   IWPLS Mean  IWPLS MSE
π1   2   0.2  50    1.9900       0.0350       1.9224       0.0344       1.9810      0.0573
π1   2   0.2  100   1.9911       0.0134       1.9292       0.0157       1.9882      0.0288
π1   2   0.2  200   1.9962       0.0058       1.9374       0.0092       1.9927      0.0121
π1   2   0.2  300   2.0009       0.0039       1.9423       0.0067       2.0015      0.0093
π1   2   0.4  50    1.9875       0.0475       1.8815       0.0384       1.9836      0.0820
π1   2   0.4  100   1.9898       0.0171       1.8880       0.0234       1.9885      0.0384
π1   2   0.4  200   1.9960       0.0070       1.8877       0.0171       1.9971      0.0200
π1   2   0.4  300   2.0014       0.0045       1.8896       0.0152       2.0079      0.0117
π1   4   0.2  50    2.0024       0.1140       1.7800       0.1347       2.0073      0.2454
π1   4   0.2  100   2.0017       0.0354       1.8183       0.0623       1.9927      0.1010
π1   4   0.2  200   2.0148       0.0159       1.8275       0.0447       2.0154      0.0525
π1   4   0.2  300   2.0065       0.0089       1.8270       0.0392       2.0043      0.0336
π1   4   0.4  50    1.9688       0.1469       1.7602       0.1441       1.9604      0.3306
π1   4   0.4  100   1.9893       0.0488       1.7568       0.0829       1.9933      0.1606
π1   4   0.4  200   2.0001       0.0195       1.7594       0.0702       2.0082      0.0887
π1   4   0.4  300   1.9996       0.0118       1.7596       0.0645       1.9995      0.0554
π2   2   0.2  50    1.9764       0.0349       1.9124       0.0362       1.9805      0.0602
π2   2   0.2  100   1.9994       0.0181       1.9349       0.0189       1.9985      0.0339
π2   2   0.2  200   2.0021       0.0073       1.9414       0.0098       1.9999      0.0155
π2   2   0.2  300   1.9979       0.0047       1.9396       0.0075       2.0023      0.0105
π2   2   0.4  50    2.0004       0.0520       1.8870       0.0409       2.0089      0.0970
π2   2   0.4  100   1.9956       0.0177       1.8896       0.0242       2.0035      0.0406
π2   2   0.4  200   1.9970       0.0087       1.8872       0.0183       2.0002      0.0230
π2   2   0.4  300   2.0024       0.0056       1.8919       0.0157       2.0018      0.0164
π2   4   0.2  50    1.9750       0.1558       1.7766       0.1635       1.9782      0.3179
π2   4   0.2  100   1.9977       0.0405       1.8048       0.0739       1.9958      0.1295
π2   4   0.2  200   1.9880       0.0178       1.8085       0.0529       1.9887      0.0647
π2   4   0.2  300   1.9898       0.0120       1.8103       0.0472       1.9911      0.0497
π2   4   0.4  50    1.9523       0.2195       1.7392       0.1803       1.9487      0.4049
π2   4   0.4  100   1.9748       0.0688       1.7389       0.0978       1.9570      0.2404
π2   4   0.4  200   1.9781       0.0241       1.7510       0.0763       1.9552      0.1095
π2   4   0.4  300   1.9933       0.0171       1.7532       0.0695       1.9933      0.0171
Table 3. Sample RMSEs for θ ( · ) based on the IAWPLS, NAWPLS, and IWPLS methods.
π    c   k    n     IAWPLS   NAWPLS   IWPLS
π1   2   0.2  50    0.4113   0.5122   0.4348
π1   2   0.2  100   0.2703   0.4154   0.2845
π1   2   0.2  200   0.1974   0.3665   0.2043
π1   2   0.2  300   0.1666   0.3657   0.1710
π1   2   0.4  50    0.4413   0.6413   0.4713
π1   2   0.4  100   0.2843   0.5921   0.3020
π1   2   0.4  200   0.2121   0.5615   0.2205
π1   2   0.4  300   0.1775   0.5600   0.1819
π1   4   0.2  50    0.7796   1.1731   0.8454
π1   4   0.2  100   0.4833   1.0768   0.5332
π1   4   0.2  200   0.3495   1.0242   0.3827
π1   4   0.2  300   0.2854   1.0083   0.3107
π1   4   0.4  50    0.8237   1.3033   0.9124
π1   4   0.4  100   0.5442   1.2269   0.6099
π1   4   0.4  200   0.3816   1.2144   0.4253
π1   4   0.4  300   0.3208   1.2050   0.3562
π2   2   0.2  50    0.5148   0.5856   0.5424
π2   2   0.2  100   0.3174   0.4392   0.3376
π2   2   0.2  200   0.2380   0.3876   0.2473
π2   2   0.2  300   0.2005   0.3746   0.2066
π2   2   0.4  50    0.5494   0.7295   0.5908
π2   2   0.4  100   0.3509   0.6042   0.3740
π2   2   0.4  200   0.2532   0.5756   0.2674
π2   2   0.4  300   0.2176   0.5659   0.2296
π2   4   0.2  50    0.9109   1.2606   0.9901
π2   4   0.2  100   0.6003   1.1050   0.6560
π2   4   0.2  200   0.4252   1.0332   0.4671
π2   4   0.2  300   0.3603   1.0185   0.3962
π2   4   0.4  50    1.0109   1.3693   1.1332
π2   4   0.4  100   0.6934   1.2575   0.7847
π2   4   0.4  200   0.5296   1.2201   0.5875
π2   4   0.4  300   0.4517   1.2134   0.5030
Table 4. The estimates for the parameters’ vector based on the IAWPLS, IWPLS, and NAWPLS methods for NLSYM data.
Method    β1       β2        β3       β4
IAWPLS    0.0806   −0.6135   1.5279   0.4864
IWPLS     0.0841   −0.5959   1.6003   0.5702
NAWPLS    0.3069   −0.1681   0.1772   −0.1093
