Sampling Importance Resampling Algorithm with Nonignorable Missing Response Variable Based on Smoothed Quantile Regression

Guo, Jingxuan; Liu, Fuguo; Härdle, Wolfgang Karl; Zhang, Xueliang; Wang, Kai; Zeng, Ting; Yang, Liping; Tian, Maozai

doi:10.3390/math11244906


by Jingxuan Guo ¹,†, Fuguo Liu ²,³,†, Wolfgang Karl Härdle ⁴,⁵,⁶, Xueliang Zhang ⁷, Kai Wang ⁷, Ting Zeng ⁷, Liping Yang ⁷ and Maozai Tian ⁷,*

¹ School of Statistics and Data Science, Beijing Wuzi University, Beijing 101149, China

² School of Statistics and Data Science, Xinjiang University of Finance and Economics, Urumqi 830012, China

³ School of Mathematics and Data Science, Changji University, Changji 831100, China

⁴ Department of Information Management and Finance, National Yang Ming Chiao Tung University, Taiwan 30010, China

⁵ Institute Digital Assets, Academy Economic Sciences, 010374 Bucharest, Romania

⁶ School of Business and Economics, Humboldt-Universität zu Berlin, 10117 Berlin, Germany

⁷ Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830011, China

* Author to whom correspondence should be addressed.

† The first two authors contributed equally to this work.

Mathematics 2023, 11(24), 4906; https://doi.org/10.3390/math11244906

Submission received: 19 October 2023 / Revised: 24 November 2023 / Accepted: 29 November 2023 / Published: 8 December 2023

(This article belongs to the Section Mathematics and Computer Science)


Abstract: The presence of nonignorable missing response variables often leads to complex conditional distribution patterns that cannot be effectively captured through mean regression. In contrast, quantile regression offers valuable insights into the conditional distribution. Consequently, this article places emphasis on the quantile regression approach to address nonrandom missing data. Taking inspiration from fractional imputation, this paper proposes a novel smoothed quantile regression estimation equation based on a sampling importance resampling (SIR) algorithm instead of nonparametric kernel regression methods. Additionally, we present an augmented inverse probability weighting (AIPW) smoothed quantile regression estimation equation to reduce the influence of potential misspecification in a working model. The consistency and asymptotic normality of the empirical likelihood estimators corresponding to the above estimating equations are proven under the assumption of a correctly specified parametric working model. Furthermore, we demonstrate that the AIPW estimation equation converges to an IPW estimation equation when the parametric working model is misspecified, thus illustrating the robustness of the AIPW estimation approach. Through numerical simulations, we examine the finite sample properties of the proposed method when the working models are correctly specified and when they are misspecified. Finally, we apply the proposed method to analyze HIV-CD4 data, thereby exploring variations in treatment effects and the influence of other covariates across different quantiles.

Keywords: empirical likelihood; nonignorable missing; quantile regression; sampling importance resampling

MSC: 62H25; 62F12

1. Introduction

Missing data analysis has gained significant attention in recent years. To analyze missing data, it is crucial to understand the response mechanism that leads to missing data. If the missingness of the variable of interest is conditionally independent of that variable, the response mechanism is considered to be random or ignorable. Otherwise, the response mechanism is considered to be nonrandom or nonignorable. Dealing with nonrandom missing data presents greater challenges, which are evident in two aspects: Firstly, the assumed response model cannot be validated solely based on observed data; secondly, the model parameters may be unidentifiable.

To obtain meaningful inferences from incomplete data with nonrandom missingness, it is necessary to satisfy a set of identifying conditions [1,2]. Moreover, the accuracy of the methods based on parameter models is greatly influenced by the correct specification of the assumed parameter model [3]. Consequently, researchers aim to impose weaker model assumptions on the response mechanism to achieve robust results. The semiparametric response model was initially considered by Kim and Yu [4], but their proposed method necessitated a validation sample to estimate the model parameters. Similarly, Shao and Wang [5] examined the same semiparametric exponential tilting model and proposed a parameter estimation approach based on calibration estimation equations. A comprehensive review of parameter estimation methods for nonrandom missing data is provided by Kim and Shao [6].

Quantile regression, introduced by Koenker and Bassett [7], has become a widely used statistical analysis tool. It offers more adaptability and flexibility compared to mean regression. Notably, quantile regression does not require the assumption of error term distribution and demonstrates robustness against heavy-tailed errors and outliers. Furthermore, by considering regressions at different quantiles of the response variable, quantile regression enables the assessment of covariate effects at various quantiles and yields a more comprehensive understanding of the conditional distribution. However, there is a scarcity of literature on quantile regression for nonrandom missing data.

The nonsmoothness of the check function for standard quantile estimators makes it impossible to directly estimate the asymptotic covariance matrix [8]. As a result, the existing theoretical results for nonrandom missing mean regression cannot be directly extended to quantile regression.

The idea of smoothing nondifferentiable objective functions can be traced back to Horowitz [9], while Whang [10] introduced the smoothed empirical likelihood approach for quantile regression. Luo et al. [11] extended this method to data with random missingness, and Zhang and Wang [12] further extended it to nonignorable missingness.

However, on the one hand, this method relies on the assumption of a parametric propensity missingness model, which introduces the risk of model misspecification. On the other hand, this method corrects estimation biases caused by missing data through inverse probability weighting but may not fully utilize the information from incomplete observations.

Regarding nonrandom missing data, previous studies have addressed the issue in different regression settings. Specifically, Niu et al. [13] and Bindele and Zhao [14] focused on estimation equation imputation in linear regression and rank regression, respectively. In the context of quantile regression, Chen et al. [15] introduced three missing quantile regression estimation equations: inverse probability weighting, estimation equation imputation, and an enhanced approach combining both methods. It is important to note that these studies assume a response mechanism with random missingness.

Moreover, the existing literature commonly utilizes kernel estimation methods [16] to estimate the conditional means involved in the imputation estimation equation. However, when the dimension of the covariates is high, the kernel estimation results can become unstable. To overcome the curse of dimensionality associated with multivariate nonparametric kernel estimation, Kim [17] proposed a parametric fractional imputation method for handling missing data. Additionally, Riddles et al. [18] extended this method to address the scenario of nonignorable missing data. They developed an EM algorithm based on a parameter working model derived from observed data and incorporated the parametric fractional imputation (FI) method. Nevertheless, these approaches heavily rely on parameter-based response models, which renders them sensitive to model misspecification. Furthermore, the likelihood-based EM algorithm is not directly applicable to quantile regression.

Utilizing estimation equations, Paik and Larsen [19] incorporated a working model for observed data and employed a sampling importance resampling (SIR) algorithm to estimate the missing data and corresponding estimation equations. Building upon this, Wang et al. [20] and Song et al. [21] extended the logistic response model used in the aforementioned approaches to develop a semiparametric exponential tilting model. However, because the tilting parameter is unknown, these methods relied on validation samples.

In this paper, we propose a smoothed empirical likelihood approach for imputing quantile regression estimation equations with nonignorable missing data based on a semiparametric response model. The novel estimation equation guarantees the second-order differentiability of the objective function with respect to the parameter vector. The imputed values for the missing data are derived from a parametric working model and obtained using sampling importance resampling.

Imputation estimation equations use information from the incomplete observations and can therefore be more efficient than IPW estimation equations. However, both theoretical results and numerical experiments show that they are sensitive to misspecification of the parametric working model. To mitigate the impact of such misspecification, this paper further proposes the AIPW smoothed quantile regression estimation equation. It is demonstrated that, when the working model is correctly specified, the asymptotic variance of the AIPW estimator has the same form as the asymptotic variance of the nonparametric model estimator. Furthermore, even when the working model is misspecified, the estimates remain consistent.

The remaining sections of this paper are organized as follows. Section 2 establishes the semiparametric response model and the AIPW quantile regression estimation equation, along with the algorithmic procedure for estimating the tilting parameter and the quantile regression coefficients using sampling importance resampling. Section 3 presents the large sample properties of the parameter estimators. Section 4 demonstrates the finite sample properties of the estimators through numerical simulations. Section 5 applies the proposed methodology to analyze the HIV-CD4 dataset.

2. Proposed Method

Consider the linear quantile regression model

$Y_{i} = Z_{i}^{⊤} θ_{τ} + ϵ_{i}, \quad i = 1, \dots, n,$

where $Y_{i}$ is the response variable, $Z_{i}$ is a fully observed q-dimensional covariate vector, $θ_{τ}$ is the unknown regression coefficient vector, and $ϵ_{i}$ is the random error term satisfying $P (ϵ_{i} \leq 0 | Z_{i}) = τ$ for $τ \in (0, 1)$; the $ϵ_{i}$ are mutually independent. In the subsequent discussion, we abbreviate $θ_{τ}$ as $θ$.

If the response variables $Y_{i}, i = 1, \dots, n$, are fully observed, the quantile regression estimator of $θ$ is obtained by solving the following minimization problem:

$\hat{θ} = \arg\min_{θ \in Θ} \frac{1}{n} \sum_{i = 1}^{n} ρ_{τ} (Y_{i} - Z_{i}^{⊤} θ),$ (1)

where $ρ_{τ} (u) = u (τ - I (u < 0))$ is the check function and $I (\cdot)$ is the indicator function. For a given $τ$, $\hat{θ}$ satisfies the following estimation equation:

$\sum_{i = 1}^{n} ψ (Z_{i}, Y_{i}; θ) \approx 0,$ (2)

where $ψ (Z_{i}, Y_{i}; θ) = Z_{i} ψ_{τ} (Y_{i} - Z_{i}^{⊤} θ)$ when $Y_{i} - Z_{i}^{⊤} θ \neq 0$, and $ψ (Z_{i}, Y_{i}; θ) = 0$ otherwise. Here, $ψ_{τ} (u) = I (u < 0) - τ$.
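The check function and the estimating function in (1) and (2) can be illustrated numerically. The following is a minimal sketch on toy data (the data-generating model, seed, and grid search are assumptions made for illustration, not the paper's design); it ignores the zero-residual convention because exact ties have probability zero for continuous responses.

```python
import numpy as np

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - I(u < 0))."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def psi(z, y, theta, tau):
    """Row i is Z_i * (I(Y_i - Z_i'theta < 0) - tau)."""
    resid = y - z @ theta
    return z * ((resid < 0).astype(float) - tau)[:, None]

rng = np.random.default_rng(0)
n, tau = 2000, 0.25
x = rng.normal(size=n)
z = np.column_stack([np.ones(n), x])
y = -1.0 + x + rng.normal(size=n)        # toy data, not the paper's design

# Intercept-only case: minimizing the average check loss over a constant
# recovers the sample tau-quantile (up to the grid resolution).
grid = np.linspace(y.min(), y.max(), 4001)
c_hat = grid[np.argmin([check_loss(y - c, tau).mean() for c in grid])]

# At the true coefficients the estimating function averages to roughly zero.
# -0.6745 is the 0.25-quantile of the standard normal error.
theta_true = np.array([-1.0 - 0.6744897501960817, 1.0])
psi_bar = psi(z, y, theta_true, tau).mean(axis=0)
```

The grid search is used only to keep the example dependency-free; in practice a linear programming or dedicated quantile regression solver would be the usual choice.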

In the scenario where the missingness of the response variable $Y_{i}$ is nonignorable, let $δ_{i}$ denote the response indicator: $δ_{i} = 1$ if $Y_{i}$ is observed, and $δ_{i} = 0$ if $Y_{i}$ is missing. The triples $(Z_{i}, Y_{i}, δ_{i}), i = 1, \dots, n$, form an independent and identically distributed sample from $(Z, Y, δ)$. We establish a semiparametric exponential tilting model for the response propensity as follows:

$P (δ = 1 | Z, Y) = \frac{1}{1 + \exp (- g (X) + γ Y)},$ (3)

where $g (\cdot)$ is an unspecified function, $X \subset Z$ is a d-dimensional vector, and there exists an instrumental variable $V = Z \setminus X$ that is conditionally independent of $δ$ given $(X, Y)$.

Let $f_{κ} (Y, Z | X)$, $κ \in \{0, 1\}$, denote the conditional density of $(Z, Y)$ given $X$ when $δ = κ$. Specifically, we have the following:

$f_{0} (Y, Z | X, γ) = f_{1} (Y, Z | X) \times \frac{O (X, Y)}{E \{O (X, Y) | X, δ = 1\}},$ (4)

where $O (X, Y) = \frac{P (δ = 0 | X, Y)}{P (δ = 1 | X, Y)} = \exp (- g (X) + γ Y)$. For the quantile estimation function $ψ (Z, Y; θ)$, let

$ψ^{eei} (Z, Y, δ; θ, γ) = δ ψ (Z, Y; θ) + (1 - δ) E \{ψ (Z, Y; θ) | X, δ = 0, γ\}.$ (5)

It can be easily shown that $E \{ψ (Z, Y; θ) | X\} = E \{ψ^{eei} (Z, Y, δ; θ, γ) | X\}$, where

$E \{ψ (Z, Y; θ) | X, δ = 0, γ\} = \frac{E \{δ \exp (γ Y) ψ (Z, Y; θ) | X\}}{E \{δ \exp (γ Y) | X\}} := m_{ψ}^{0} (X; θ, γ).$ (6)

The nonparametric kernel estimate of Equation (6) is given by

$\hat{m}_{0} (X; θ, γ) = \sum_{i = 1}^{n} ω_{i} (γ) ψ (Z_{i}, Y_{i}; θ),$

where $ω_{i} (γ) = δ_{i} \exp (γ Y_{i}) K_{h} (X - X_{i}) / \sum_{j = 1}^{n} δ_{j} \exp (γ Y_{j}) K_{h} (X - X_{j})$, $K_{h} (u) = h^{- d} K (u / h)$, and $K (\cdot)$ is a d-dimensional kernel function with bandwidth h.
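The weighted average above can be sketched in a few lines. The example assumes a scalar $X$ and a Gaussian kernel, both chosen only for illustration; the function name and toy data are hypothetical. For the toy model $Y | X \sim N (X, 1)$ with every unit observed, the tilted mean of $Y$ at $X = 0$ should be close to $γ σ^{2} = 0.5$, up to smoothing bias and sampling noise.

```python
import numpy as np

def tilted_kernel_mean(x0, x, y, delta, psi_vals, gamma, bw):
    """Kernel estimate of m(x0) = E{psi | X = x0, delta = 0} for scalar X.

    Only respondents (delta == 1) contribute; each is weighted by
    exp(gamma * Y_i) * K((x0 - X_i) / bw), normalized to sum to one.
    """
    obs = delta == 1
    kern = np.exp(-0.5 * ((x0 - x[obs]) / bw) ** 2)   # Gaussian kernel (assumption)
    w = np.exp(gamma * y[obs]) * kern
    w = w / w.sum()
    return w @ psi_vals[obs]

rng = np.random.default_rng(3)
n = 20000
x = rng.normal(size=n)
y = x + rng.normal(size=n)                 # toy model: Y | X ~ N(X, 1)
delta = np.ones(n, dtype=int)              # everyone observed, to isolate the tilting
m_hat = tilted_kernel_mean(0.0, x, y, delta, psi_vals=y, gamma=0.5, bw=0.2)
```

Because the kernel normalizing constant cancels in the ratio, it is omitted from the weights.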

Because multivariate nonparametric kernel estimation of the aforementioned conditional expectation is unstable, this paper adopts a Monte Carlo method to estimate $m_{ψ}^{0} (X; θ, γ)$. For simplicity of discussion, we assume a parametric working model $f (Y | Z, δ = 1; β)$ for the conditional distribution of the observed response. This assumption can be checked easily using the fully observed cases. Consequently, the conditional distribution of the response with nonrandom missingness satisfies

$f_{0} (Y | Z; β, γ) = f_{1} (Y | Z; β) \times \frac{\exp (γ Y)}{E \{\exp (γ Y) | Z, δ = 1; β\}}.$ (7)

Let $Y_{i}^{(j)}, j = 1, \dots, M$, be independent and identically distributed draws from $f (Y | Z_{i}, δ = 0; β, γ)$. According to the law of large numbers, as $M \to \infty$, we have

$\hat{m}_{ψ}^{0} (Z_{i}; θ, β, γ) = \frac{1}{M} \sum_{j = 1}^{M} ψ (Z_{i}, Y_{i}^{(j)} (β, γ); θ) \overset{p}{\to} m_{ψ}^{0} (Z_{i}; θ, β, γ).$

To obtain a set of random realizations from $f_{0} (Y | Z; β, γ)$, the SIR algorithm [19] can be employed based on the parametric representation in (7) for a given $(β, γ)$:

(1): Draw a random sample $S_{i} = \{\tilde{Y}_{i}^{(k)}, k = 1, \dots, M_{2}\}$ from $f (Y | Z_{i}, δ = 1; β)$.
(2): Calculate the adjustment weight for each sample point in $S_{i}$ as

$ω_{i k} (γ) = \frac{\exp (γ \tilde{Y}_{i}^{(k)})}{\frac{1}{M_{2}} \sum_{j = 1}^{M_{2}} \exp (γ \tilde{Y}_{i}^{(j)})}, \quad k = 1, 2, \dots, M_{2}.$ (8)

(3): Resample from $S_{i}$ with probabilities proportional to $ω_{i 1} (γ), \dots, ω_{i M_{2}} (γ)$ to obtain $Y_{i}^{(1)}, \dots, Y_{i}^{(M)}$. To ensure the convergence of the aforementioned process, it is crucial to have $M_{2} \to \infty$ and $M / M_{2} \to 0$.
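The three steps above can be sketched as follows, assuming for illustration a normal working model $N (μ, σ^{2})$ for respondents (the function name and parameter values are hypothetical). Under that assumption the tilted target has the closed form $N (μ + γ σ^{2}, σ^{2})$, which gives a convenient check on the draws.

```python
import numpy as np

def sir_draws(mu, sigma, gamma, m, m2, rng):
    """Approximate draws from f0(y) proportional to N(y; mu, sigma^2) * exp(gamma * y)."""
    # Step 1: proposal draws from the working model for respondents.
    y_tilde = rng.normal(mu, sigma, size=m2)
    # Step 2: importance weights exp(gamma * y), normalized to probabilities.
    log_w = gamma * y_tilde
    w = np.exp(log_w - log_w.max())      # subtract the max for numerical stability
    prob = w / w.sum()
    # Step 3: resample m values with replacement using those probabilities.
    return rng.choice(y_tilde, size=m, replace=True, p=prob)

rng = np.random.default_rng(1)
draws = sir_draws(mu=-1.0, sigma=1.0, gamma=0.5, m=2000, m2=200_000, rng=rng)
# Closed-form target under the normal assumption: N(-0.5, 1).
```

Here $M / M_{2} = 0.01$, in the spirit of the requirement that $M / M_{2} \to 0$; the specific sizes are arbitrary.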

The SIR-based quantile regression estimation equation is given by

$ψ^{eei} (Y_{i}, Z_{i}, δ_{i}; θ, β, γ) = δ_{i} ψ (Z_{i}, Y_{i}; θ) + (1 - δ_{i}) \frac{1}{M} \sum_{j = 1}^{M} ψ (Z_{i}, Y_{i}^{(j)} (β, γ); θ).$

Because this estimation equation is nonsmooth, the sandwich estimator of the asymptotic covariance matrix cannot be obtained directly. Therefore, this paper proposes using a smooth function $G_{h} (Z_{i}^{⊤} θ - Y_{i})$ as a substitute for the indicator function $I (Y_{i} - Z_{i}^{⊤} θ < 0)$ in the quantile estimation equation, thus resulting in a smooth approximation of $ψ (Z_{i}, Y_{i}; θ)$:

$ψ_{h} (Z_{i}, Y_{i}; θ) = Z_{i} \{G_{h} (Z_{i}^{⊤} θ - Y_{i}) - τ\},$

where $G_{h} (u) = G (u / h)$, $G (u) = \int_{- \infty}^{u} K (v) d v$, and $K (\cdot)$ is a kernel function supported on $[- 1, 1]$.
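To make the smoothing concrete, the sketch below takes $K$ to be the Epanechnikov kernel $K (v) = 0.75 (1 - v^{2})$ on $[- 1, 1]$, which is an assumption made here for illustration (the text only requires a kernel supported on $[- 1, 1]$). It also assumes the vector form $Z_{i} \{G_{h} (Z_{i}^{⊤} θ - Y_{i}) - τ\}$ for the smoothed estimating function, mirroring $ψ (Z_{i}, Y_{i}; θ) = Z_{i} ψ_{τ} (\cdot)$. The average gap to the unsmoothed function shrinks as the bandwidth decreases.

```python
import numpy as np

def G(u):
    """Integrated Epanechnikov kernel: 0 for u <= -1, 1 for u >= 1."""
    u = np.clip(np.asarray(u, dtype=float), -1.0, 1.0)
    return 0.5 + 0.75 * u - 0.25 * u**3

def psi_h(z, y, theta, tau, h):
    """Row i is Z_i * (G((Z_i'theta - Y_i) / h) - tau)."""
    return z * (G((z @ theta - y) / h) - tau)[:, None]

def psi(z, y, theta, tau):
    """Unsmoothed counterpart with the indicator I(Y_i - Z_i'theta < 0)."""
    return z * ((y - z @ theta < 0).astype(float) - tau)[:, None]

rng = np.random.default_rng(4)
z = np.column_stack([np.ones(500), rng.normal(size=500)])
y = z @ np.array([-1.0, 1.0]) + rng.normal(size=500)   # toy data
theta = np.array([-0.8, 0.9])
# Mean absolute discrepancy between smoothed and unsmoothed versions
# for a decreasing sequence of bandwidths.
gap = [np.abs(psi_h(z, y, theta, 0.5, h) - psi(z, y, theta, 0.5)).mean()
       for h in (0.5, 0.1, 0.02)]
```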

For nonignorable missing data, we have the following representation for the smoothed SIR-based quantile regression estimation equation:

$ψ_{h}^{eei} (Y_{i}, Z_{i}, δ_{i}; θ, β, γ) = δ_{i} ψ_{h} (Z_{i}, Y_{i}; θ) + (1 - δ_{i}) \frac{1}{M} \sum_{j = 1}^{M} ψ_{h} (Z_{i}, Y_{i}^{(j)} (β, γ); θ).$

The estimation equation based on imputation is sensitive to misspecification of $f (Y | Z, δ = 1; β)$. Because the semiparametric response model is comparatively robust, we consider the AIPW (augmented inverse probability weighting) estimation equation:

$ψ_{h}^{aipw} (Y_{i}, Z_{i}, δ_{i}; θ, β, γ) = \frac{δ_{i} ψ_{h} (Z_{i}, Y_{i}; θ)}{π (X_{i}, Y_{i}; \hat{g}_{γ}, γ)} + \left(1 - \frac{δ_{i}}{π (X_{i}, Y_{i}; \hat{g}_{γ}, γ)}\right) \frac{1}{M} \sum_{j = 1}^{M} ψ_{h} (Z_{i}, Y_{i}^{(j)} (β, γ); θ),$

where $π (X, Y; g, γ)$ denotes the response probability in (3). It can be proven that the AIPW estimation equation remains consistent even when the parametric model $f (Y | Z, δ = 1; β)$ is misspecified.
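A minimal sketch of how the two smoothed estimating functions could be assembled from imputed draws and fitted propensities is given below. The function name, the integrated Epanechnikov choice for $G$, the placeholder imputations, and the constant propensities are all assumptions made for illustration. For nonrespondents ($δ_{i} = 0$) both versions reduce to the imputed average, so they differ only on the observed cases.

```python
import numpy as np

def G(u):
    u = np.clip(np.asarray(u, dtype=float), -1.0, 1.0)
    return 0.5 + 0.75 * u - 0.25 * u**3     # integrated Epanechnikov kernel (assumption)

def eei_and_aipw(z, y, delta, y_imp, pi, theta, tau, h):
    """Return the (n, q) arrays of smoothed EEI and AIPW estimating functions.

    y_imp : (n, M) imputed draws Y_i^(j), e.g. from an SIR step
    pi    : (n,) response probabilities; only entries with delta == 1 matter
    """
    fit = z @ theta
    y_obs = np.where(delta == 1, y, 0.0)          # placeholder where Y is missing
    psi_obs = z * (G((fit - y_obs) / h) - tau)[:, None]
    m_hat = z * (G((fit[:, None] - y_imp) / h).mean(axis=1) - tau)[:, None]
    d = delta[:, None].astype(float)
    w = (delta / np.where(delta == 1, pi, 1.0))[:, None]   # delta_i / pi_i
    eei = d * psi_obs + (1.0 - d) * m_hat
    aipw = w * psi_obs + (1.0 - w) * m_hat
    return eei, aipw

rng = np.random.default_rng(5)
n, M = 6, 50
z = np.column_stack([np.ones(n), rng.normal(size=n)])
delta = np.array([1, 1, 0, 1, 0, 1])
y = np.where(delta == 1, rng.normal(size=n), np.nan)
y_imp = rng.normal(size=(n, M))                 # stand-in for SIR draws
pi = np.full(n, 0.8)                            # hypothetical propensities
eei, aipw = eei_and_aipw(z, y, delta, y_imp, pi, np.array([0.0, 0.5]), 0.5, 0.3)
```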

In practice, $(β, γ)$ are unknown and need to be estimated separately. The maximum likelihood estimator of $β$, denoted as $\hat{β}$, is the solution to the following score equation:

$\sum_{i = 1}^{n} δ_{i} \frac{\partial \ln f (Y_{i} | Z_{i}, δ_{i} = 1; β)}{\partial β} = 0.$ (9)

Next, we consider the estimation of the tilting parameter $γ$. Under the semiparametric propensity model, estimation combines two components: a profile two-step generalized method of moments (GMM) estimator for the tilting parameter $γ$ and a kernel regression estimator for the nonparametric component $g (\cdot)$. To estimate the tilting parameter $γ$, we define the profile estimation function as follows:

$ξ (Z_{i}, Y_{i}, δ_{i}; g_{γ}, γ) = \left\{\frac{δ_{i}}{π (X_{i}, Y_{i}; g_{γ}, γ)} - 1\right\} h (V_{i}) := ξ_{i} (g_{γ}, γ),$ (10)

where $h (V)$ is an arbitrary user-specified function of the instrumental variable $V$, and $g_{γ} (\cdot)$ satisfies the following:

$\exp (- g_{γ} (X_{i})) = \frac{E (1 - δ_{i} | X_{i})}{E \{δ_{i} \exp (γ Y_{i}) | X_{i}\}}.$

Under the assumption of a correctly specified missingness mechanism, it holds that $E \{ξ_{i} (g_{γ}, γ)\} = 0$, and the moment vector $ξ_{i} (g_{γ}, γ)$ over-identifies $γ$. The profile two-step GMM estimator of $γ$ is given by the following:

$\hat{γ} = \arg\min_{γ \in R} \bar{ξ} (\hat{g}_{γ}, γ)^{⊤} W_{n}^{- 1} \bar{ξ} (\hat{g}_{γ}, γ),$

where $\bar{ξ} (\hat{g}_{γ}, γ) = \frac{1}{n} \sum_{i = 1}^{n} ξ (Z_{i}, Y_{i}, δ_{i}; \hat{g}_{γ}, γ)$, and $W_{n} = \frac{1}{n} \sum_{i = 1}^{n} ξ (Z_{i}, Y_{i}, δ_{i}; \hat{g}_{γ}, γ)^{\otimes 2}$. The estimator $\hat{g}_{γ} (\cdot)$ is the kernel regression estimate of $g_{γ} (\cdot)$ and satisfies the following equation:

$\exp (- \hat{g}_{γ} (X_{i})) = \frac{\sum_{j = 1}^{n} (1 - δ_{j}) K_{h} (X_{j} - X_{i})}{\sum_{j = 1}^{n} δ_{j} \exp (γ Y_{j}) K_{h} (X_{j} - X_{i})},$

where $K_{h} (u_{1}, \dots, u_{d})$ is the d-dimensional kernel function with bandwidth h.
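The profile objective can be sketched as follows, under several simplifying assumptions that are not taken from the paper: a scalar $X$, a Gaussian kernel, the instrument functions $h (V) = (V, V^{2})$, a grid search instead of a numerical optimizer, and a weight matrix $W_{n}$ re-evaluated at each candidate $γ$ rather than fixed from a first step. The toy data use $g (X) = 1$ and a true tilting parameter of 0.5.

```python
import numpy as np

def exp_neg_g_hat(x, y, delta, gamma, bw):
    """Kernel estimate of exp(-g_gamma(X_i)) at every sample point (scalar X)."""
    k = np.exp(-0.5 * ((x[:, None] - x[None, :]) / bw) ** 2)  # Gaussian kernel (assumption)
    y0 = np.where(delta == 1, y, 0.0)       # missing Y never enters: it is multiplied by delta = 0
    num = k @ (1.0 - delta)
    den = k @ (delta * np.exp(gamma * y0))
    return num / den

def gmm_objective(gamma, x, y, delta, hv, bw):
    """Profile objective xi_bar' W_n^{-1} xi_bar, with hv the (n, r) matrix h(V_i)."""
    y0 = np.where(delta == 1, y, 0.0)
    inv_pi = 1.0 + exp_neg_g_hat(x, y, delta, gamma, bw) * np.exp(gamma * y0)  # 1 / pi_i
    xi = (delta * inv_pi - 1.0)[:, None] * hv
    xi_bar = xi.mean(axis=0)
    w_n = xi.T @ xi / len(x)
    return float(xi_bar @ np.linalg.solve(w_n, xi_bar))

rng = np.random.default_rng(2)
n = 400
x = rng.normal(size=n)
v = rng.normal(size=n)                       # instrument: affects Y but not the propensity
y = x + v + rng.normal(size=n)
pi_true = 1.0 / (1.0 + np.exp(-1.0 + 0.5 * y))
delta = (rng.uniform(size=n) < pi_true).astype(float)
y = np.where(delta == 1, y, np.nan)          # mask the missing responses
hv = np.column_stack([v, v**2])
grid = np.linspace(-1.0, 2.0, 31)
obj = np.array([gmm_objective(g, x, y, delta, hv, bw=0.4) for g in grid])
gamma_hat = grid[obj.argmin()]
```

With a single sample of this size the grid minimizer can be noisy; the sketch is meant to show how the moment function, the profiled $\hat{g}_{γ}$, and the quadratic form fit together rather than to reproduce the paper's estimator.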

Define $\hat{ψ}_{h i}^{(l)} (θ, \hat{β}, \hat{γ})$, $l = 1, 2$, as follows:

$\hat{ψ}_{h i}^{(1)} (θ, \hat{β}, \hat{γ}) = δ_{i} ψ_{h} (Z_{i}, Y_{i}; θ) + (1 - δ_{i}) \hat{m}_{ψ_{h}}^{0} (Z_{i}; θ, \hat{β}, \hat{γ}),$

$\hat{ψ}_{h i}^{(2)} (θ, \hat{β}, \hat{γ}) = \frac{δ_{i}}{π (X_{i}, Y_{i}; \hat{g}_{\hat{γ}}, \hat{γ})} ψ_{h} (Z_{i}, Y_{i}; θ) + \left(1 - \frac{δ_{i}}{π (X_{i}, Y_{i}; \hat{g}_{\hat{γ}}, \hat{γ})}\right) \hat{m}_{ψ_{h}}^{0} (Z_{i}; θ, \hat{β}, \hat{γ}),$

where $\hat{m}_{ψ_{h}}^{0} (Z_{i}; θ, \hat{β}, \hat{γ}) = \frac{1}{M} \sum_{j = 1}^{M} ψ_{h} (Z_{i}, Y_{i}^{(j)} (\hat{β}, \hat{γ}); θ)$.

Let $p_{i}$ denote the probability mass assigned to $\hat{ψ}_{h i}^{(l)} (θ, \hat{β}, \hat{γ})$, $i = 1, 2, \dots, n$. The empirical log-likelihood ratio function with respect to $θ$ is defined as follows:

$\hat{R}^{(l)} (θ) = - 2 \sup \left\{\sum_{i = 1}^{n} \log (n p_{i}) \; \Big| \; p_{i} \geq 0, \sum_{i = 1}^{n} p_{i} = 1, \sum_{i = 1}^{n} p_{i} \hat{ψ}_{h i}^{(l)} (θ, \hat{β}, \hat{γ}) = 0\right\}.$

Using the method of Lagrange multipliers, it can be shown that $\hat{R}^{(l)} (θ)$ can be expressed as follows:

$\hat{R}^{(l)} (θ) = 2 \sum_{i = 1}^{n} \log \{1 + λ^{⊤} \hat{ψ}_{h i}^{(l)} (θ, \hat{β}, \hat{γ})\},$ (11)

where $λ$ satisfies the following:

$\frac{1}{n} \sum_{i = 1}^{n} \frac{\hat{ψ}_{h i}^{(l)} (θ, \hat{β}, \hat{γ})}{1 + λ^{⊤} \hat{ψ}_{h i}^{(l)} (θ, \hat{β}, \hat{γ})} = 0.$

The empirical likelihood estimators of the quantile regression coefficients based on the two proposed estimation equations, denoted as $\hat{θ}^{(l)}$, $l = 1, 2$, are given by the following:

$\hat{θ}^{(l)} = \arg\min_{θ} \hat{R}^{(l)} (θ).$
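For a fixed $θ$, evaluating (11) requires solving the Lagrange equation for $λ$. The sketch below uses a damped Newton iteration on the concave dual objective $\sum_{i} \log (1 + λ^{⊤} ψ_{i})$; the solver, its tolerances, and the random input matrix are illustrative assumptions rather than the authors' implementation. The input is any $n \times q$ array whose rows play the role of $\hat{ψ}_{h i}^{(l)}$.

```python
import numpy as np

def el_log_ratio(psi, n_iter=100, tol=1e-10):
    """Return (R, lam) with R = 2 * sum_i log(1 + lam' psi_i)."""
    n, q = psi.shape
    lam = np.zeros(q)
    for _ in range(n_iter):
        d = 1.0 + psi @ lam
        scaled = psi / d[:, None]
        grad = scaled.sum(axis=0)             # gradient of the dual objective
        if np.linalg.norm(grad) < tol * n:
            break
        hess = scaled.T @ scaled              # sum_i psi_i psi_i' / d_i^2
        step = np.linalg.solve(hess, grad)    # Newton ascent direction
        f_old = np.log(d).sum()
        t = 1.0
        while t > 1e-8:                       # halve until feasible and non-decreasing
            d_new = 1.0 + psi @ (lam + t * step)
            if d_new.min() > 1.0 / n and np.log(d_new).sum() >= f_old - 1e-10:
                break
            t /= 2.0
        lam = lam + t * step
    return 2.0 * np.log(1.0 + psi @ lam).sum(), lam

rng = np.random.default_rng(6)
psi_vals = rng.normal(size=(200, 2)) + 0.2    # toy estimating-function values
r_stat, lam_hat = el_log_ratio(psi_vals)
r_zero, _ = el_log_ratio(psi_vals - psi_vals.mean(axis=0))   # centered rows give R = 0
```

The estimator $\hat{θ}^{(l)}$ would then be obtained by minimizing this statistic over $θ$ with an outer optimizer, which is omitted here.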

3. Theoretical Analysis

To state the theoretical properties of the proposed estimators, we first define the following matrices:

$A_{1} = E \{π (X, Y) ψ (Z, Y; θ_{0})^{\otimes 2} + (1 - π (X, Y)) m_{ψ}^{0} (Z; θ_{0})^{\otimes 2}\},$

$A_{2} = E \{π (X, Y)^{- 1} \{ψ (Z, Y; θ_{0}) - m_{ψ}^{0} (Z; θ_{0})\}^{\otimes 2} + m_{ψ}^{0} (Z; θ_{0})^{\otimes 2}\},$

$B_{1} = \mathrm{Var} (H_{1} Ψ (β_{0}) + H_{2} Φ (γ_{0})), \quad B_{2} = \mathrm{Var} (H_{3} Φ (γ_{0})),$

$T_{l} = A_{l} + B_{l}, \quad l = 1, 2,$

$H_{1} = E \{(1 - δ) \mathrm{Cov}_{0} (ψ (Z, Y; θ_{0}), s (Z, Y; β_{0}) | Z)\},$

$H_{2} = E \{(1 - δ) \mathrm{Cov}_{0} (ψ (Z, Y; θ_{0}), Y | Z)\},$

$H_{3} = E \{(1 - δ) (Y - m_{Y}^{0} (X)) (ψ (Z, Y; θ_{0}) - m_{ψ}^{0} (Z; θ_{0}))\}.$

(C1): (a) The density of $Z$ is bounded and has continuous and bounded second-order derivatives; (b) the density of $Z$ and the propensity $π (Z, Y, γ_{0})$ are bounded away from 0; and (c) $E \{exp (2 γ Y)\}$ is finite and $E \{π (Z, Y, γ_{0}) ∣ Z\} \neq 1$ almost surely.
(C2): Let $K_{h} (u) = K (u / h) / h^{d}$, where $K$ denotes a generic d-dimensional kernel function and the value of d is determined by the context of use. $K$ is a bounded, uniformly continuous, symmetric kernel of order $m$ satisfying $\int K (s) d s = 1$, $\int s_{l}^{t} K (s) d s = 0$, and $\int s_{l}^{m} K (s) d s \neq 0$ for any $l = 1, \dots, d$ and $t = 1, \dots, m - 1$, where $s = (s_{1}, \dots, s_{d})$.
(C3): The bandwidth sequence h satisfies $n h^{2 d} \to \infty$, $n h^{d} / \log n \to \infty$, and $n h^{2 m} \to 0$ as $n \to \infty$; the order m satisfies $m \geq 2$ and $2 m > d$.
(C4): Let $W (γ) = E \{ξ (g_{γ}, γ)^{\otimes 2}\}$, $Ξ (γ) = E \{(1 - δ) (h (V) - m_{V}^{0} (X, γ)) (Y - m_{Y}^{0} (X, γ))\}$, and $Λ (Z, Y, δ, γ) = \left(\frac{δ}{π (X, Y)} - 1\right) (h (V) - m_{V}^{0} (X, γ))$, where $m_{V}^{0} (X, γ) = E \{h (V) | X, δ = 0; γ\}$ and $m_{Y}^{0} (X, γ) = E \{Y | X, δ = 0; γ\}$. Define

$Φ (Z, Y, δ, γ) = - (Ξ (γ)^{⊤} W (γ)^{- 1} Ξ (γ))^{- 1} Ξ (γ)^{⊤} W (γ)^{- 1} Λ (Z, Y, δ, γ).$

Assume that $E \{Φ_{i} (γ_{0})\}^{2} < \infty$, $\partial Φ_{i} (γ) / \partial γ$ exists at $γ_{0}$, $E \{\sup_{γ} ∥ ξ (g_{γ}, γ) ∥\} < \infty$, $γ_{0}$ is the unique solution to $E \{ξ (g_{γ}, γ)\} = 0$, and $W (γ_{0})$ is positive definite.
(C5): $\{(Z_{i}, Y_{i}, δ_{i}) : i = 1, \dots, n\}$ are independent and identically distributed random vectors. The parameter space of $θ$, denoted by $B$, is a compact set in $R^{q}$, and $θ_{0} \in B$ is the unique solution to $E \{ψ (Z_{i}, Y_{i}, θ)\} = 0$. Furthermore, $∥\partial ψ (Z_{i}, Y_{i}, θ) / \partial θ∥$, $∥\partial^{2} ψ (Z_{i}, Y_{i}, θ) / \partial θ \partial θ^{⊤}∥$, and $∥ψ (Z_{i}, Y_{i}, θ)∥^{3}$ are bounded by an integrable function $H (x, y)$ within a neighborhood of $θ_{0}$.
(C6): For all $ϵ$ in a neighborhood of zero and for almost every $Z$, $F (ϵ | Z)$, $f (ϵ | Z)$, and $f (ϵ | Z, δ = 0)$ exist, are bounded away from zero, and are r times continuously differentiable with $r \geq 2$. There exists a function $C (Z)$ such that $|f^{(s)} (ϵ | Z)| \leq C (Z)$ and $|f^{(s)} (ϵ | Z, δ = 0)| \leq C (Z)$ for $s = 0, 2, \dots, r$, for almost all $Z$ and $ϵ$ in a neighborhood of zero, and $E [C (X) ∥ X ∥^{2}] < \infty$.
(C7): The kernel function $K (\cdot)$ is a probability density function such that (a) it is bounded and has a compact support; (b) $K (\cdot)$ is an rth order kernel, i.e., $\int u^{j} K (u) d u$ equals 1 if $j = 0$, 0 if $1 \leq j \leq r - 1$, and $C_{K}$ if $j = r$ for some constant $C_{K} \neq 0$; and (c) letting $\tilde{G} (u) = (G (u), G^{2} (u), \dots, G^{L + 1} (u))$ for some $L \geq 1$, where $G (u) = \int_{v < u} K (v) d v$, for any $ι \in R^{L + 1}$ satisfying $∥ ι ∥ = 1$, there is a partition of $[- 1, 1]$, $- 1 = a_{0} < a_{1} < \dots < a_{L + 1} = 1$, such that $ι^{⊤} \tilde{G} (u)$ is either strictly positive or strictly negative within $(a_{l - 1}, a_{l})$ for $l = 1, \dots, L + 1$.
(C8): The positive bandwidth parameter h satisfies $n h^{2 r} \to 0$ and $n h / \log (n) \to \infty$ as $n \to \infty$.
(C9): $Z$ has a bounded support, $E ∥ Z ∥^{4} < \infty$, and the matrices $Γ$ and $T_{l}$, $l = 1, 2$, are nonsingular.
(C10): Under complete observation of $(Z_{i}, Y_{i})$ for $i = 1, \dots, n$, the unique solution $\hat{β}$ to the score equation in (9) satisfies

$\sqrt{n} (\hat{β} - β_{0}) \overset{d}{\to} N (0, Σ)$

for some $Σ$ as $n \to \infty$.

Conditions (C1)–(C4), which are commonly found in the literature on missing data and nonparametric methods [24,25], are primarily employed to meet the requirements of Lemma 8.11 by Newey and McFadden [22] and Theorem 6.18 by Van [23]. These conditions encompass the following: (1) stochastic equicontinuity and continuity conditions; (2) linearization conditions on the objective function with respect to the nonparametric components and convergence rate conditions for the nonparametric estimators; and (3) the continuous differentiability of the estimating equations with respect to the parameter of interest. Conditions (C5)–(C9) ensure the consistency and asymptotic normality of the smoothed empirical likelihood estimator for quantile regression [10]. Condition (C10) is introduced to simplify the discussion of the asymptotic properties of the maximum likelihood estimator in the working model.

Under the assumed conditions, $\hat{γ}$ and $\hat{β}$ admit the following asymptotically linear representations:

$\sqrt{n} (\hat{γ} - γ_{0}) = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} Φ_{i} (γ_{0}) + o_{p} (1),$

$\sqrt{n} (\hat{β} - β_{0}) = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} Ψ_{i} (β_{0}) + o_{p} (1).$

In addition, we have the following lemma, whose proof is given in Appendix A:

Lemma 1.

Under conditions (C5)–(C9), we have

$E \{ψ_{h} (Z_{i}, Y_{i}; θ_{0})\} = O (h^{r}),$

$E \{m_{ψ_{h}}^{0} (Z_{i}; θ_{0})\} = E \{m_{ψ}^{0} (Z_{i}; θ_{0})\} + O (h^{r}).$

Lemma 2.

Under conditions (C1)–(C10), with the notation defined above, the following results hold as $n \to \infty$:

(1) $\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} ψ_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) \overset{d}{\to} N (0, T_{l})$;

(2) $\frac{1}{n} \sum_{i = 1}^{n} ψ_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) \overset{p}{\to} V_{l}$;

(3) $\frac{1}{n} \sum_{i = 1}^{n} \frac{\partial ψ_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})}{\partial θ^{⊤}} \overset{p}{\to} Γ$;

(4) $\max_{i} ∥ ψ_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) ∥ = o_{p} (n^{1 / 2})$.

Theorem 1.

Under conditions (C1)–(C10), if the parametric working model is correctly specified, then for $l = 1, 2$, as $n \to \infty$, we have

$\sqrt{n} (\hat{θ}^{(l)} - θ_{0}) \overset{d}{\to} N (0, Γ^{- 1} T_{l} Γ^{- ⊤}),$

where $Γ = E \{f (0 | Z) Z Z^{⊤}\}$.

If there are no missing data, we have $P (δ = 0 | Z, Y) = 0$, which implies that $H_{1}$, $H_{2}$, and $H_{3}$ are all zero. Additionally, we have

$A_{1} = A_{2} = E \{ψ (Z, Y; θ_{0})^{\otimes 2}\} = τ (1 - τ) E \{Z Z^{⊤}\}.$

These results agree with the classical asymptotic normality result for quantile regression.
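The factor $τ (1 - τ)$ in the complete-data case is the Bernoulli variance of the indicator. A short derivation, assuming the conditional distribution of $ϵ$ given $Z$ is continuous at zero so that $P (ϵ < 0 | Z) = P (ϵ \leq 0 | Z) = τ$:

```latex
\begin{aligned}
E\{ψ(Z, Y; θ_0)^{\otimes 2}\}
  &= E\left[ Z Z^{⊤} \, E\{ (I(ϵ < 0) - τ)^{2} \mid Z \} \right] \\
  &= E\left[ Z Z^{⊤} \{ τ (1 - τ)^{2} + (1 - τ) τ^{2} \} \right] \\
  &= τ (1 - τ) \, E\{Z Z^{⊤}\}.
\end{aligned}
```

The second line uses the fact that $I (ϵ < 0) - τ$ takes the value $1 - τ$ with conditional probability $τ$ and the value $- τ$ with conditional probability $1 - τ$.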

The different forms of $Λ_{1}$ and $Λ_{2}$ indicate that if the parametric working model $f (Y | Z, δ = 1; β)$ is misspecified, $\hat{θ}^{(1)}$ is no longer consistent, while $\hat{θ}^{(2)}$ remains a consistent estimator of $θ_{0}$. The following argument demonstrates the double robustness of the AIPW estimation equation. For a misspecified $f (Y | Z, δ = 1; β)$, there exists $C (Z; θ, \hat{γ})$ such that

$\hat{m}_{ψ}^{0} (Z_{i}; θ, \hat{β}, \hat{γ}) = C (Z_{i}; θ, \hat{γ}) + o_{p} (1).$

It can then be shown that

$\begin{aligned} \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \hat{ψ}_{h i}^{(2)} (θ, \hat{β}, \hat{γ}) &= \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \left[\frac{δ_{i} ψ (Z_{i}, Y_{i}; θ)}{π (X_{i}, Y_{i}; \hat{g}_{\hat{γ}}, \hat{γ})} + \left\{1 - \frac{δ_{i}}{π (X_{i}, Y_{i}; \hat{g}_{\hat{γ}}, \hat{γ})}\right\} C (Z_{i}; θ, \hat{γ})\right] + o_{p} (1) \\ &= \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \frac{δ_{i} ψ (Z_{i}, Y_{i}; θ)}{π (X_{i}, Y_{i}; \hat{g}_{\hat{γ}}, \hat{γ})} + o_{p} (1) \\ &=: \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \hat{ψ}_{h i}^{ipw} (θ, \hat{β}, \hat{γ}) + o_{p} (1). \end{aligned}$

If $π (X_{i}, Y_{i})$ is correctly specified, the IPW estimation equation is consistent, which implies that the AIPW quantile regression estimation equation remains consistent in this case.

Theorem 2.

Under the conditions of Theorem 1, if the parametric working model is correctly specified, then for $l = 1, 2$, as $n \to \infty$,

$\hat{R}^{(l)} (θ_{0}) \overset{d}{\to} r_{1}^{(l)} χ_{1 \cdot 1}^{2} + r_{2}^{(l)} χ_{1 \cdot 2}^{2} + \dots + r_{q}^{(l)} χ_{1 \cdot q}^{2},$

where $r_{i}^{(l)}$, $i = 1, \dots, q$, are the eigenvalues of $A_{l}^{- 1} Λ_{l}$, and $χ_{1 \cdot 1}^{2}, χ_{1 \cdot 2}^{2}, \dots, χ_{1 \cdot q}^{2}$ are q independent $χ^{2}$ random variables with one degree of freedom.
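Because the limit in Theorem 2 is a weighted sum of independent $χ_{1}^{2}$ variables, its quantiles generally have no closed form. One simple way to approximate a critical value is Monte Carlo simulation, sketched below. The function name and the weights passed in are hypothetical; in an application the weights would be the eigenvalues of an estimate of the matrix named in the theorem.

```python
import numpy as np

def weighted_chi2_quantile(weights, level, n_draws=200_000, seed=0):
    """Monte Carlo quantile of sum_i weights[i] * chi2_1, with independent terms."""
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=float)
    chi2 = rng.chisquare(df=1, size=(n_draws, len(w)))
    return np.quantile(chi2 @ w, level)

# With unit weights the limit is chi-square with q degrees of freedom,
# so q = 2 should give roughly the familiar 95% critical value 5.99.
crit_unit = weighted_chi2_quantile([1.0, 1.0], 0.95)
crit_weighted = weighted_chi2_quantile([1.4, 0.7], 0.95)   # hypothetical eigenvalues
```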

First, if there are no missing data, we have $T_{l} = A_{l} = τ (1 - τ) E \{Z Z^{⊤}\}$, which leads to $\hat{R}^{(l)} (θ_{0}) \overset{d}{\to} χ_{q}^{2}$, so Wilks' theorem holds. Furthermore, it is worth noting that if $β$ and $γ$ are known, we still have $T_{l} = A_{l}$, and in this case, Wilks' theorem still holds. This conclusion is consistent with Zhao [26].

4. Simulation Study

To investigate the finite-sample properties of the proposed method, this study conducted numerical simulations under both correctly specified and misspecified working model scenarios.

4.1. Simulation 1: Correctly Specified Working Model

In the numerical simulation, we generated a random vector $(x, y, δ)$, where x is the covariate, y is the response variable of interest, and $δ$ is the indicator for the observation of y. When $δ = 1$, y is observed; otherwise, y is missing. Let $x \sim N (0, 0.5)$, and generate the observed response variable y according to the following equation:

$y = μ (x) + e,$

where $μ (x) = - 1 + x$. We considered two different distributions for the random error term e: (a) $N (0, 0.9)$ and (b) $N (0, 0.49 (1 + x^{2}))$. Accordingly, the working model $f (y | x, δ = 1; β)$ follows the homoscedastic structure $N (μ (x), σ_{1}^{2})$ in case (a) and the heteroscedastic structure $N (μ (x), σ_{2}^{2} (x))$ in case (b).

The indicator variable $δ$ follows a Bernoulli distribution with success probability p, i.e., $δ \sim Bernoulli (p)$. The conditional probability of $δ = 1$ given $(x, y)$ is defined as follows:

$p (ϕ) = P (δ = 1 | x, y) = \{1 + \exp (- ϕ_{0} - ϕ_{1} y)\}^{- 1},$

where $(ϕ_{0}, ϕ_{1}) = (0.8, - 0.2)$. In this case, the missingness mechanism for the response variable y is nonignorable, with x serving as the instrumental variable. The average response rate in the sample was approximately 73%.

We consider the following quantile regression model of the response variable y on x:

$Q_{τ} (y | x) = θ_{0} + θ_{1} x,$

where $τ$ is the quantile level of interest, with $τ \in \{0.25, 0.5, 0.75\}$.

Consider the following five quantile regression estimation equations:

(1): Full Estimation Equation: $ψ_{h}^{full} (x_{i}, y_{i}; θ) = ψ_{h} (x_{i}, y_{i}; θ)$;
(2): Complete Case (CC) Estimation Equation: $ψ_{h}^{cc} (x_{i}, y_{i}, δ_{i}; θ) = δ_{i} ψ_{h} (x_{i}, y_{i}; θ)$;
(3): IPW Estimation Equation: $ψ_{h}^{ipw} (x_{i}, y_{i}, δ_{i}; θ, \hat{ϕ})$;
(4): EEI Estimation Equation: $ψ_{h}^{eei} (x_{i}, y_{i}, δ_{i}; θ, \hat{ϕ}, \hat{β})$;
(5): AIPW Estimation Equation: $ψ_{h}^{aipw} (x_{i}, y_{i}, δ_{i}; θ, \hat{ϕ}, \hat{β})$.
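The five estimating functions can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes a Gaussian kernel, so that G is the standard normal distribution function and G_h(u) = G(u/h). The smoothed score z{G_h(z^⊤θ - y) - τ}, the EEI form δψ_h + (1 - δ)m^0, and the AIPW form (δ/π)(ψ_h - m^0) + m^0 are taken from the expressions used in Appendix A; the IPW form (δ/π)ψ_h is the usual inverse-probability-weighted version and is assumed here. The argument names `pi_hat` and `m0_hat` are hypothetical placeholders for the fitted response probability and the imputed conditional mean of ψ_h given δ = 0.

```python
from statistics import NormalDist

_G = NormalDist().cdf  # assumed kernel CDF G (Gaussian kernel)


def psi_h(z, y, theta, tau, h):
    """Smoothed quantile score z * {G((z'theta - y) / h) - tau}."""
    resid = sum(zj * tj for zj, tj in zip(z, theta)) - y
    return [zj * (_G(resid / h) - tau) for zj in z]


def psi_full(z, y, theta, tau, h):
    """Full-data score: uses every response, observed or not."""
    return psi_h(z, y, theta, tau, h)


def psi_cc(z, y, delta, theta, tau, h):
    """Complete-case score: contributes only when y is observed."""
    if delta == 0:
        return [0.0] * len(z)
    return psi_h(z, y, theta, tau, h)


def psi_ipw(z, y, delta, pi_hat, theta, tau, h):
    """IPW score: observed cases weighted by 1 / pi_hat (assumed form)."""
    if delta == 0:
        return [0.0] * len(z)
    return [v / pi_hat for v in psi_h(z, y, theta, tau, h)]


def psi_eei(z, y, delta, m0_hat, theta, tau, h):
    """Imputed score: delta * psi_h + (1 - delta) * m0_hat."""
    if delta == 0:
        return list(m0_hat)
    return psi_h(z, y, theta, tau, h)


def psi_aipw(z, y, delta, pi_hat, m0_hat, theta, tau, h):
    """AIPW score: (delta / pi_hat) * (psi_h - m0_hat) + m0_hat."""
    if delta == 0:
        return list(m0_hat)
    psi = psi_h(z, y, theta, tau, h)
    return [(p - m) / pi_hat + m for p, m in zip(psi, m0_hat)]
```

Each function returns the contribution of a single observation; an estimator would sum these contributions over the sample and solve for θ. When δ = 0 the response is never touched, so a missing y can be passed as `None`.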

To generate a sample of size

n = 500

that meets the requirements of the simulation, we can use the law of total probability and express

f (y | x)

as follows:

f (y | x) = P (δ = 1 | x) f (y | x, δ = 1) + P (δ = 0 | x) f (y | x, δ = 0),

where

f (y | x, δ = 0) = f (y | x, δ = 1) \times \frac{O (x, y)}{E (O (x, Y) | x, δ = 1)} .

Under the specified nonrandom missingness mechanism, we have

O (x, y) = exp (- ϕ_{0} - ϕ_{1} y)

for the homoscedastic case of

f (y | x, δ = 1) = N (μ, σ_{1}^{2})

. In this case, we can express the ratio of the conditional probabilities as follows:

\frac{P (δ = 0 | x)}{P (δ = 1 | x)} = E (O (x, Y) | x, δ = 1) = exp (- ϕ_{0} - ϕ_{0}^{'}),

where

ϕ_{0}^{'} = μ ϕ_{1} - \frac{1}{2} σ_{1}^{2} ϕ_{1}^{2}

, which follows from the moment generating function of the normal distribution. Thus, we have

f (y | x, δ = 0) = N (μ - σ_{1}^{2} ϕ_{1}, σ_{1}^{2})

and

P (δ = 1 | x) = {\{1 + exp (- ϕ_{0} - ϕ_{0}^{'})\}}^{- 1}

.

Since x is completely observed, we can draw a sample of size

n = 500

from the mixed distribution of

f (y | x)

. A similar approach can be applied under the heteroscedastic assumption.
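The mixture sampling scheme for the homoscedastic case (a) can be sketched as follows. This is an illustrative sketch, not the authors' code. It assumes that the second argument of N(·, ·) is a variance, and it derives the two mixture ingredients directly from the response model p(ϕ) = {1 + exp(-ϕ_0 - ϕ_1 y)}^{-1} by exponential tilting of N(μ, σ_1^2) with O(x, y) = exp(-ϕ_0 - ϕ_1 y): the missing-to-observed odds given x are exp(-ϕ_0 - ϕ_1 μ + σ_1^2 ϕ_1^2 / 2), and the nonrespondent distribution is N(μ - σ_1^2 ϕ_1, σ_1^2). The function name `simulate_case_a` is hypothetical.

```python
import math
import random

PHI0, PHI1 = 0.8, -0.2  # response-model parameters from the text
SIGMA2 = 0.9            # error variance in case (a); assumed to be a variance
X_VAR = 0.5             # variance of x; assumed to be a variance


def mu(x):
    """Conditional mean of the observed response, mu(x) = -1 + x."""
    return -1.0 + x


def simulate_case_a(n, seed=0):
    """Draw (x, y, delta) triples from the mixture form of f(y | x)."""
    rng = random.Random(seed)
    sd = math.sqrt(SIGMA2)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, math.sqrt(X_VAR))
        m = mu(x)
        # E{O(x, Y) | x, delta = 1} for Y ~ N(m, SIGMA2), via the normal MGF
        odds_missing = math.exp(-PHI0 - PHI1 * m + 0.5 * SIGMA2 * PHI1 ** 2)
        p_obs = 1.0 / (1.0 + odds_missing)
        delta = 1 if rng.random() < p_obs else 0
        # respondents: N(m, SIGMA2); nonrespondents: tilted mean m - SIGMA2 * PHI1
        mean_y = m if delta == 1 else m - SIGMA2 * PHI1
        y = rng.gauss(mean_y, sd)
        data.append((x, y, delta))
    return data


sample = simulate_case_a(500, seed=2023)
observed_rate = sum(d for _, _, d in sample) / len(sample)
```

With ϕ_1 = -0.2 the nonrespondent mean is shifted upward by 0.18, which agrees with a missingness propensity that increases in y. In a large sample the observed rate produced by this sketch settles near 73%.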

It should be noted that the response variable y originates from a distribution with a complex, mixed form. As a result, discussing the true values of the parameters

(θ_{0}, θ_{1})

are difficult to obtain in closed form. This complexity makes it difficult to assess the performance of the estimation methods using conventional measures such as bias or the root mean squared error (RMSE). Consequently, we introduce the following approximate relative evaluation metrics, which use the full-data estimate as the benchmark:

ARE (Method*, Full) = \frac{ARMSE (Method*)}{ARMSE (Full)},

where

ARMSE (Method*) = \sqrt{{SD (Method*)}^{2} + {(Mean (Method*) - Mean (Full))}^{2}}

.
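The two metrics can be computed from the Monte Carlo replications of a single coefficient as in the sketch below. The function names are hypothetical, and the use of the sample standard deviation (divisor n - 1) for SD is an assumption, since the text does not state which divisor is used.

```python
import math
import statistics


def armse(estimates, full_estimates):
    """Approximate RMSE of a method, with the full-data mean as reference."""
    sd = statistics.stdev(estimates)  # assumed: sample SD with divisor n - 1
    shift = statistics.mean(estimates) - statistics.mean(full_estimates)
    return math.sqrt(sd ** 2 + shift ** 2)


def are(estimates, full_estimates):
    """Approximate relative efficiency ARMSE(Method*) / ARMSE(Full)."""
    return armse(estimates, full_estimates) / armse(full_estimates, full_estimates)
```

By construction the full-data estimator has ARE equal to 1, and values above 1 indicate a loss relative to the full-data benchmark, through either a larger spread or a shift in the mean.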

Table 1 and Table 2 summarize the mean and variance of the five coefficient estimates at different quantiles based on 1000 Monte Carlo simulations under the homoscedastic case (a) and heteroscedastic case (b) of

f (y | x, δ = 1)

. From the estimation results, it can be observed that the coefficient estimates based on complete observations have larger bias compared to the other estimation methods. When the working model

f (y | x, δ = 1; β)

was correctly specified, the proposed imputation estimates yielded smaller variances than the IPW estimates, while the AIPW estimates performed similarly to the IPW estimates. Across quantiles, the variances of all five estimators were smaller at the 0.5 quantile than at the 0.25 and 0.75 quantiles, which reflects the higher density of observations near the center of the distribution than in the tails. Under the homoscedastic assumption, the variances of the estimates at the 0.25 and 0.75 quantiles were similar. Under the assumed missingness mechanism, the missing propensity increased with the value of the response variable y, which implies higher missing rates at the upper quantiles. Consequently, the variances of the IPW and AIPW estimates were higher at the high quantile of

τ = 0.75

compared to the low quantile of

τ = 0.25

. However, proper imputation could greatly improve the estimation efficiency at the high quantile of

τ = 0.75

. This improvement was more pronounced under the heteroscedastic model. These results demonstrate that the imputation estimates are nearly unbiased when the working model is correctly specified and have higher estimation efficiency compared to the IPW and AIPW estimates.

4.2. Simulation 2: Misspecification of the Working Model

In practical situations, the true data generation mechanism is unknown, and it is challenging to accurately specify the working model

f (y | x, δ = 1; β)

for the observed data. In this study, we investigated the finite sample properties of the proposed imputation estimator and calibration estimator under the misspecification of the working model. The simulation model includes two covariates:

X_{1} \sim N (0, 1)

and

X_{2} \sim Exp (0.2)

. Given the covariates, the response variable Y is generated as follows:

Y = 1 + X_{1} + X_{2} + 0.25 (2 + X_{2}) ε,

where

ε \sim N (0, 1)

, and

X_{1}

,

X_{2}

, and

ε

are mutually independent.

In this simulation setup, the error term distribution of Y is heteroscedastic. The missing data mechanism for Y is nonrandom and follows

P (δ = 1 | X, Y) = \frac{1}{1 + exp (- 0.1 + 0.5 X_{1}^{2} + 0.15 Y)} .

The average observed rate in the model was approximately 73%, and

X_{2}

served as an instrumental variable. We generated a random sample of size

n = 500

denoted as

\{(X_{i}, Y_{i}, δ_{i}) : i = 1, \dots, n\}

. For the aforementioned simulation model, we consider the following quantile regression model:

Q_{τ} (Y | X) = (1, X^{⊤}) θ_{0} (τ),

where

θ_{0} (τ) = {(1 + 0.5 Q_{τ} (ε), 1, 1 + 0.25 Q_{τ} (ε))}^{⊤}

.
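Because 2 + X_2 is positive, the τ-quantile of 0.25 (2 + X_2) ε given X is 0.25 (2 + X_2) Q_τ(ε), which yields the stated θ_0(τ). The short sketch below evaluates these true coefficients at the three quantile levels used in the simulation. The function name `theta0` is hypothetical.

```python
from statistics import NormalDist


def theta0(tau):
    """True coefficients (intercept, X1, X2) of Q_tau(Y | X) in Simulation 2."""
    q = NormalDist().inv_cdf(tau)  # Q_tau(eps) for eps ~ N(0, 1)
    return [1.0 + 0.5 * q, 1.0, 1.0 + 0.25 * q]


for tau in (0.25, 0.5, 0.75):
    print(tau, [round(v, 4) for v in theta0(tau)])
```

Only the intercept and the coefficient of X_2 vary with τ, since X_1 enters the location but not the scale of Y.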

Under the aforementioned data generating mechanism, obtaining an explicit expression for

f (Y | X, δ = 1; β)

is challenging and requires specifying the working model based on the observed data. In this simulation model, we consider two possible working models: (1)

N (\hat{μ} (X), {\hat{σ}}^{2})

and (2)

N (\hat{μ} (X), {(0.5 + 0.25 X_{2})}^{2})

. Figure 1 and Figure 2 illustrate that the residual distribution of the working model (1) exhibited peakedness, thus violating the normality assumption and indicating model misspecification. In contrast, working model (2) took into account the correct specification of the variance.

Table 3, Table 4 and Table 5 summarize the results of the five types of quantile regression estimates at the three quantile levels, based on 1000 Monte Carlo replications. The tables include the two imputation estimates based on the parametric working models (1) and (2), the corresponding AIPW estimates, and the combined estimation equations. The imputation estimates based on the misspecified working model (1) exhibited substantial bias. Although working model (2) was also misspecified, it accounted for the heteroscedasticity in the conditional distribution of the response variable and therefore produced smaller bias and better overall performance than model (1). These findings highlight the sensitivity of imputation methods to misspecification of the working model. Across the three quantiles, the IPW estimates performed well, which indicates the robustness of the semiparametric response model. Even with misspecified parametric working models, both AIPW estimates had median absolute deviations similar to those of IPW and considerably smaller than those of the misspecified imputation estimates, which demonstrates the robustness of AIPW estimation. Of the two AIPW estimates, the one based on working model (2), whose variance structure is correctly specified, had smaller bias and higher efficiency.

5. Real Data Application

We applied our proposed method to the data of 2139 HIV-infected patients enrolled in the ACTG175 study [27]. The ACTG175 study evaluated the efficacy of monotherapy or combination therapy in HIV-infected patients with CD4 cell counts between 200 and 500 cells/mm³. Following the studies by Davidian et al. [28] and Zhang et al. [29], we categorized all the treatment regimens into two groups. The first group consisted of the standard zidovudine (ZDV) monotherapy arm, while the second group included three newer treatment arms: ZDV plus didanosine (ddI), ZDV plus zalcitabine (ddC), and ddI monotherapy. The first group comprised 532 subjects, while the second group comprised 1607 subjects. We investigated the effect of the treatment arm (trt, 0 = ZDV monotherapy only) on the

τ

quantile of the CD4 cell count (

CD 4_{96}

) measured at 96 ± 5 weeks, adjusting for the baseline CD4 cell count (

CD 4_{0}

) and other baseline covariates, including age, weight, race (0 = Caucasian), gender (0 = female), history of reverse transcriptase inhibitor use (0 = no), and whether the subject discontinued treatment before 96 weeks (offtrt, 0 = no).

Consider fitting a linear quantile regression model as follows:

\begin{matrix} Q_{τ} ({CD 4}_{96} | X) = & β_{1} (τ) + β_{2} (τ) trt + β_{3} (τ) {CD 4}_{0} + β_{4} (τ) age + β_{5} (τ) weight \\ + β_{6} (τ) race + β_{7} (τ) gender + β_{8} (τ) history + β_{9} (τ) offtrt . \end{matrix}

The dataset used in this study is sourced from the R package “speff2trial”. The study population consists of 1522 Caucasian individuals and 617 non-Caucasian individuals, with 1771 males and 368 females. The average age of the participants is 35 years, with a standard deviation of 8.7 years. Among the participants, 1253 individuals had a history of antiretroviral therapy, and 776 individuals discontinued treatment before the 96th week.

Due to attrition during the study period, approximately 37% of the participants have missing values for the variable

{CD 4}_{96}

. Although complete measurements of other variables related to

{CD 4}_{96}

, such as baseline CD4 and CD8 cell counts

{CD 4}_{0}

and

{CD 8}_{0}

, as well as CD4 and CD8 cell counts at

20 \pm 5

weeks

{CD 4}_{20}

and

{CD 8}_{20}

, were obtained at baseline and follow-up visits, these variables may not fully explain the propensity for participants to drop out. In other words, we cannot assume that the missingness of

{CD 4}_{96}

is random. Therefore, in our analysis, we consider a more comprehensive semiparametric nonrandom missingness mechanism:

P (R = 1 | X, s, Y) = π (s, Y) = \frac{1}{1 + exp (g (s) + γ Y)},

where

s

represents the set of variables associated with attrition, and

g (s)

is a function capturing the relationship between these variables and the missingness indicator R.

Figure 3 displays the histograms of the observed

{CD 4}_{96}

and its logarithm. From the figure, it can be observed that the conditional distribution

f (y | X, R = 1)

of observed

{CD 4}_{96}

is right-skewed. However, the logarithmic transformation did not result in improved symmetry, thus indicating that the normality assumption did not hold. In our analysis, we can assume that

{CD 4}_{96}

follows a truncated normal distribution with left truncation at 0, where its mean is primarily determined by the influence of eight covariates and three auxiliary variables.

The parameters

β

in the working model

f (y | X, R = 1; β)

are estimated using the truncated regression model in the R package “truncreg”. The parameter

γ

in

π (s, Y)

is estimated using the profile generalized method of moments (GMM).

Figure 4 and Figure 5 illustrate the normality properties of the residuals from the truncated regression working model. Visually, the distribution of residuals appears to be symmetric. The calculated sample skewness is 0.05, which indicates only a slight deviation from perfect symmetry. The Q-Q plot suggests that the distribution of residuals has a kurtosis less than 3. Further computation gives a kurtosis of 2.11, which indicates that the residual distribution has lighter tails than a normal distribution.

Table 6 presents the estimates of the quantile regression coefficients and corresponding 95% confidence intervals at the

τ = 0.25, 0.5, 0.75

quantile levels. The four estimation methods considered include complete case (CC) estimation, inverse probability weighting (IPW) estimation, multiple imputation (MI) estimation, and augmented inverse probability weighting (AIPW) estimation. The MI estimation is based on averaging over

L = 20

randomly generated imputations. Confidence intervals for the coefficient estimates were obtained using the bootstrap method with

B = 200

resampling iterations.
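A bootstrap interval of this kind can be sketched as follows. The text does not state which bootstrap interval was used, so the percentile interval below is an assumption, and the names `percentile_bootstrap_ci` and `estimator` are hypothetical. In the actual analysis each record would be one patient's (X, Y, δ) row, and `estimator` would repeat the whole fitting procedure, including the estimation of the nuisance parameters, on every resample.

```python
import math
import random


def percentile_bootstrap_ci(data, estimator, B=200, level=0.95, seed=0):
    """Percentile bootstrap interval for a scalar estimator (assumed variant)."""
    rng = random.Random(seed)
    n = len(data)
    stats = []
    for _ in range(B):
        # resample n records with replacement and re-estimate
        resample = [data[rng.randrange(n)] for _ in range(n)]
        stats.append(estimator(resample))
    stats.sort()
    alpha = 1.0 - level
    lower = stats[int(math.floor(alpha / 2 * (B - 1)))]
    upper = stats[int(math.ceil((1 - alpha / 2) * (B - 1)))]
    return lower, upper
```

For example, `percentile_bootstrap_ci(values, statistics.median)` returns an interval for the median of `values` after importing the standard `statistics` module.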

From Table 6, it can be observed that for the three given quantile levels and four estimation methods, patients receiving the three new combined treatment methods had significantly higher CD4 cell counts at

96 \pm 5

weeks compared to the traditional treatment method. In other words, the new treatment methods had significantly slowed down the progression of AIDS compared to the traditional method. Comparing the four estimation methods, it is evident that the complete case estimation overestimated the performance of the treatment group. The results of the IPW estimation and AIPW estimation were similar and higher than the imputation estimation. When comparing the treatment effects at different quantile levels, both the IPW estimation and imputation estimation reflected a decreasing trend in treatment effect from the 0.25th to the 0.75th quantile. Although the AIPW estimation and complete case estimation did not show a similar trend, the coefficient estimates of the AIPW estimation also indicate a more significant improvement in treatment effect for patients at lower quantiles.

Upon examining the effects of the other covariates, it is found that for all four estimation methods, the baseline CD4 level

{CD 4}_{0}

had a positive impact on the CD4 cell count at

96 \pm 5

weeks, while patients with a history of antiretroviral therapy or early treatment discontinuation exhibited poorer CD4 cell levels at

96 \pm 5

weeks. In comparison to the covariates directly related to the disease progression mentioned above, the effects of age, weight, race, gender, and other covariates on the CD4 cell count at

96 \pm 5

weeks were minimal. The impact directions and significance obtained from different methods were also not consistent. Therefore, although these variables needed to be considered in the modeling process, conclusions regarding their effects should be drawn with caution.

6. Discussion

In this study, we address the bias in quantile regression estimates by constructing imputation and AIPW estimation equations, both of which involve the estimation of conditional means under nonrandom missingness. Many existing methods rely on kernel regression to estimate conditional means. However, nonparametric estimation methods may suffer from the curse of dimensionality when the dimension of the covariates is high. Paik and Larsen [19] proposed using importance resampling to obtain Monte Carlo estimates of conditional means, and Song et al. [21] further applied this method to estimation equations. In this study, we extend these methods to quantile regression and overcome the theoretical and computational challenges caused by the nonsmoothness of the check function in classical quantile regression by employing convolution smoothing.
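The importance resampling idea can be illustrated with a small sketch. It relies on the relation f(y | x, δ = 0) ∝ f(y | x, δ = 1) O(x, y) given in Section 4.1 and assumes a normal working model for the respondents. It is a generic sampling importance resampling routine written for illustration, not the authors' implementation; the function name `sir_draws` and its arguments are hypothetical.

```python
import math
import random


def sir_draws(mu, sigma, log_odds, n_candidates=5000, n_draws=5000, seed=0):
    """Approximate draws from f(y | x, delta = 0) by importance resampling.

    Candidates come from the assumed respondent model N(mu, sigma^2) and are
    resampled with weights proportional to O(x, y) = exp(log_odds(y)).
    """
    rng = random.Random(seed)
    candidates = [rng.gauss(mu, sigma) for _ in range(n_candidates)]
    log_w = [log_odds(y) for y in candidates]
    shift = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - shift) for lw in log_w]
    return rng.choices(candidates, weights=weights, k=n_draws)
```

Averaging a function of y, such as the smoothed score ψ_h, over these draws gives a Monte Carlo estimate of its conditional mean among nonrespondents, without any kernel smoothing over the covariates. With the Simulation 1 values μ = -1, σ^2 = 0.9, and log O = -0.8 + 0.2 y, the mean of the draws should be close to the tilted mean μ - σ^2 ϕ_1 = -0.82.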

Common parameter working models are based on linear regression for observed data. Song et al.’s [21] simulation results showed that model misspecification does not lead to estimation bias. However, their simulation study was based on a regression model that satisfied the Gauss–Markov assumption, with missing response variables following a normal distribution with homoscedasticity concerning the covariates. Misspecification was reflected in the estimation of the mean or location variables. However, the advantages of quantile regression are more evident in situations involving skewness, heavy tails, and heteroscedasticity. In this study, our simulation results under heteroscedasticity showed that imputation estimation based on the assumption of a linear regression working model leads to significant estimation bias, while the AIPW estimation equation can mitigate the impact of model misspecification. We also provide theoretical proof of the consistency of AIPW estimation.

Our simulation results demonstrate that, under the ideal scenario of correctly specified parameter working models, the imputation estimator is more efficient than the IPW and AIPW estimators. The AIPW estimator based on the correctly specified model was found to be more efficient than that based on the misspecified model. Therefore, in practical applications, it is crucial to appropriately specify the parameter working model based on the observed data. Fortunately, the effectiveness of the model specification can be assessed using various methods such as Q-Q plots and histograms. For the observed response conditional distributions that do not conform to the linear regression assumption, a Box–Cox transformation can be applied to approximate a normal parameter working model. If such a parameter working model is difficult to obtain, the AIPW estimator proposed in this study can still provide relatively reliable estimates. This is because the proposed response mechanism model is semiparametric and offers certain flexibility. However, the response model constructed in this study does not consider the interaction effects between covariates

X

and the response variable Y or the potential nonlinear effects of the response variable Y on the missingness propensity.

Author Contributions

The first two authors contributed equally to this work. J.G.: Methodology, software, validation, data curation, writing; F.L.: visualization, data curation, review; W.K.H.: writing—review and editing; X.Z.: supervision, validation; K.W.: formal analysis; T.Z.: investigation; L.Y.: resources; M.T.: Conceptualization, project administration, funding acquisition, and the corresponding author. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Fundamental Research Funds for the Central Universities and the Research Funds of Renmin University of China (22XNL016).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The researchers can download the ACTG175 dataset from the R package “speff2trial”.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

SIR	Sampling Importance Resampling
IPW	Inverse Probability Weighting
AIPW	Augmented Inverse Probability Weighting
EEI	Imputed Estimation Equation

Appendix A

Appendix A.1. Proof of Lemma 1

\begin{matrix} E \{ψ_{h} (Z_{i}, Y_{i}; θ_{0})\} = & E [Z_{i} \{G_{h} (Z_{i}^{⊤} θ_{0} - Y_{i}) - τ\}] \\ = & E [Z_{i} \{G (- ε_{i} / h) - τ\}] \\ = & E \{Z_{i} \{\int_{u < - ε_{i} / h} K (u) d u - τ\}\} \\ = & E \{Z_{i} E \{\int I_{ε_{i} < - h u} (ε_{i}) K (u) d u - τ | Z_{i}\}\} \\ = & E \{Z_{i} \{\int F (- h u | Z_{i}) K (u) d u - τ\}\} . \end{matrix}

By assumption (C7), we can utilize a Taylor expansion, thus yielding

\begin{matrix} \int F (- h u | Z_{i}) K (u) d u \\ = & F (0 | Z_{i}) + \int \sum_{k = 1}^{r} F^{(k)} (0 | Z_{i}) \frac{{(- h)}^{k} u^{k}}{k!} K (u) d u + \int F^{(r + 1)} (- \tilde{h} u | Z_{i}) \frac{{(- h)}^{r + 1} u^{r + 1}}{(r + 1)!} K (u) d u \\ = & F (0 | Z_{i}) + \int F^{(r)} (0 | Z_{i}) \frac{{(- h)}^{r} u^{r}}{r!} K (u) d u + \int F^{(r + 1)} (- \tilde{h} u | Z_{i}) \frac{{(- h)}^{r + 1} u^{r + 1}}{(r + 1)!} K (u) d u \\ = & τ + \int f^{(r - 1)} (0 | Z_{i}) \frac{{(- h)}^{r} u^{r}}{r!} K (u) d u + \int f^{(r)} (- \tilde{h} u | Z_{i}) \frac{{(- h)}^{r + 1} u^{r + 1}}{(r + 1)!} K (u) d u, \tilde{h} \in [0, h] . \end{matrix}

Thus, we have

\begin{matrix} E [Z_{i} \{G_{h} (Z_{i}^{⊤} θ_{0} - Y_{i}) - τ\}] = & \frac{{(- h)}^{r}}{r!} E \{Z_{i} f^{(r - 1)} (0 | Z_{i})\} \int u^{r} K (u) d u \\ + E [Z_{i} \int f^{(r)} (- \tilde{h} u | Z_{i}) u^{r} K (u) d u] O (h^{r + 1}) . \end{matrix}

By assumption (C6), we have

∥E [Z_{i} \{\int f^{(r)} (- \tilde{h} u | Z_{i}) u^{r} K (u) d u\}]∥ \leq E [\int C (Z) ∥ Z ∥ u^{r} K (u) d u] = O (1) .

Therefore,

E {ψ_{h} (Z_{i}, Y_{i}; θ_{0})} = \frac{{(- h)}^{r}}{r!} C_{K} E [Z_{i} f^{(r - 1)} (0 | Z_{i})] + o (h^{r}) .

Similarly, we have

\begin{matrix} E \{m_{ψ_{h}}^{0} (Z_{i}; θ_{0})\} \\ = & E [E \{ψ_{h} (Z_{i}, Y_{i}; θ_{0}) | Z_{i}, δ_{i} = 0\}] \\ = & E [E \{Z_{i} \{G_{h} (Z_{i}^{⊤} θ_{0} - Y_{i}) - τ\} | Z_{i}, δ_{i} = 0\}] \\ = & E [E \{Z_{i} \{G (- ε_{i} / h) - τ\} | Z_{i}, δ_{i} = 0\}] \\ = & E [E \{Z_{i} \{\int_{u < - ε_{i} / h} K (u) d u - τ\} | Z_{i}, δ_{i} = 0\}] \\ = & E [E \{Z_{i} \{\int I_{ε_{i} < - h u} (ε_{i}) K (u) d u - τ\} | Z_{i}, δ_{i} = 0\}] \\ = & E [E \{Z_{i} \{\int F (- h u | Z_{i}, δ_{i} = 0) K (u) d u - τ\} | Z_{i}, δ_{i} = 0\}] . \end{matrix}

Based on the assumptions and the Taylor expansion, we have

\begin{matrix} \int F (- h u | Z_{i}, δ_{i} = 0) K (u) d u \\ = & F (0 | Z_{i}, δ_{i} = 0) + \int \sum_{k = 1}^{r} F^{(k)} (0 | Z_{i}, δ_{i} = 0) \frac{{(- h)}^{k} u^{k}}{k!} K (u) d u \\ + \int F^{(r + 1)} (- \tilde{h} u | Z_{i}, δ_{i} = 0) \frac{{(- h)}^{r + 1} u^{r + 1}}{(r + 1)!} K (u) d u \\ = & F (0 | Z_{i}, δ_{i} = 0) + \int F^{(r)} (0 | Z_{i}, δ_{i} = 0) \frac{{(- h)}^{r} u^{r}}{r!} K (u) d u \\ + \int F^{(r + 1)} (- \tilde{h} u | Z_{i}, δ_{i} = 0) \frac{{(- h)}^{r + 1} u^{r + 1}}{(r + 1)!} K (u) d u \\ = & F (0 | Z_{i}, δ_{i} = 0) + \int f^{(r - 1)} (0 | Z_{i}, δ_{i} = 0) \frac{{(- h)}^{r} u^{r}}{r!} K (u) d u \\ + \int f^{(r)} (- \tilde{h} u | Z_{i}, δ_{i} = 0) \frac{{(- h)}^{r + 1} u^{r + 1}}{(r + 1)!} K (u) d u, \tilde{h} \in [0, h] . \end{matrix}

Notice that

\begin{matrix} E [E \{Z_{i} \{F (0 | Z_{i}, δ_{i} = 0) - τ\} | Z_{i}, δ_{i} = 0\}] \\ = & E [Z_{i} E \{I (Y_{i} < Z_{i}^{⊤} θ_{0}) - τ | Z_{i}, δ_{i} = 0\}] \\ = & E [E \{Z_{i} \{I (Y_{i} < Z_{i}^{⊤} θ_{0}) - τ\} | Z_{i}, δ_{i} = 0\}] \\ = & E \{m_{ψ}^{0} (Z_{i}; θ_{0})\}, \end{matrix}

which implies

\begin{matrix} E \{m_{ψ_{h}}^{0} (Z_{i}; θ_{0})\} \\ = & E [E \{ψ_{h} (Z_{i}, Y_{i}; θ_{0}) | Z_{i}, δ_{i} = 0\}] \\ = & E \{m_{ψ}^{0} (Z_{i}; θ_{0})\} + \frac{{(- h)}^{r}}{r!} E \{Z_{i} f^{(r - 1)} (0 | Z_{i}, δ_{i} = 0)\} \int u^{r} K (u) d u \\ + E [Z_{i} \int f^{(r)} (- \tilde{h} u | Z_{i}, δ_{i} = 0) u^{r} K (u) d u] O (h^{r + 1}) . \end{matrix}

Under the assumption conditions, we have

\begin{matrix} ∥E [Z_{i} \{\int f^{(r)} (- \tilde{h} u | Z_{i}, δ_{i} = 0) u^{r} K (u) d u\}]∥ \leq E [\int C (Z) ∥ Z ∥ u^{r} K (u) d u] = O (1), \end{matrix}

which implies

E {m_{ψ_{h}}^{0} (Z_{i}; θ_{0})} = E {m_{ψ}^{0} (Z_{i}; θ_{0})} + \frac{{(- h)}^{r}}{r!} C_{K} E [Z_{i} f^{(r - 1)} (0 | Z_{i}, δ_{i} = 0)] + o (h^{r}) .
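The order of the smoothing bias in Lemma 1 can be checked numerically in a simple special case. The sketch below is an illustration under assumptions that are not part of the proof: a Gaussian kernel (so r = 2 and C_K = 1), a scalar Z_i = 1, and an error ε = σ(W - q_τ) with W standard normal, so that the τ-quantile of ε is zero. In this case E{G(-ε/h)} = Φ(σ q_τ / \sqrt{σ^2 + h^2}) exactly, and the leading term of the lemma is (h^2 / 2) f'(0) with f'(0) = -q_τ φ(q_τ) / σ^2. The function names are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal: kernel and error law in this example


def exact_bias(tau, h, sigma=1.0):
    """E{G(-eps / h)} - tau for eps = sigma * (W - q_tau), Gaussian kernel."""
    q = STD.inv_cdf(tau)
    return STD.cdf(sigma * q / (sigma ** 2 + h ** 2) ** 0.5) - tau


def leading_term(tau, h, sigma=1.0):
    """Leading bias term ((-h)^r / r!) * C_K * f^(r-1)(0) with r = 2, C_K = 1."""
    q = STD.inv_cdf(tau)
    f_prime_at_zero = -q * STD.pdf(q) / sigma ** 2
    return 0.5 * h ** 2 * f_prime_at_zero


for h in (0.2, 0.1, 0.05):
    print(h, exact_bias(0.75, h), leading_term(0.75, h))
```

Halving h reduces the exact bias by a factor close to four, in line with the O(h^r) rate for r = 2, and at τ = 0.5 the bias vanishes in this symmetric example.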

Appendix A.2. Proof of Lemma 2

To prove (1), we can perform a simple calculation. We have

\begin{matrix} \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} ψ_{h i}^{(1)} (θ_{0}, \hat{β}, \hat{γ}) = & \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \{δ_{i} ψ_{h} (Z_{i}, Y_{i}; θ_{0}) + (1 - δ_{i}) {\hat{m}}_{ψ_{h}}^{0} (Z_{i}; θ_{0}, \hat{β}, \hat{γ})\} \\ = & \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \{δ_{i} ψ_{h} (Z_{i}, Y_{i}; θ_{0}) + (1 - δ_{i}) m_{ψ_{h}}^{0} (Z_{i}; θ_{0}, β_{0}, γ_{0})\} \\ + \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} (1 - δ_{i}) \{{\hat{m}}_{ψ_{h}}^{0} (Z_{i}; θ_{0}, β_{0}, γ_{0}) - m_{ψ_{h}}^{0} (Z_{i}; θ_{0}, β_{0}, γ_{0})\} \\ + \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} (1 - δ_{i}) \{{\hat{m}}_{ψ_{h}}^{0} (Z_{i}; θ_{0}, \hat{β}, \hat{γ}) - {\hat{m}}_{ψ_{h}}^{0} (Z_{i}; θ_{0}, β_{0}, γ_{0})\} \\ = & \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \{δ_{i} ψ_{h} (Z_{i}, Y_{i}; θ_{0}) + (1 - δ_{i}) m_{ψ_{h}}^{0} (Z_{i}; θ_{0}, β_{0}, γ_{0})\} \\ + \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} (1 - δ_{i}) {\{\frac{\partial {\hat{m}}_{ψ_{h}}^{0} (Z_{i}; θ_{0}, β^{*}, γ^{*})}{\partial β^{⊤}}\}}^{⊤} (\hat{β} - β_{0}) \\ + \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} (1 - δ_{i}) {\{\frac{\partial {\hat{m}}_{ψ_{h}}^{0} (Z_{i}; θ_{0}, β^{*}, γ^{*})}{\partial γ}\}}^{⊤} (\hat{γ} - γ_{0}) + o_{p} (1) \\ = & \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 1} + \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 2} + \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 3} + o_{p} (1) . \end{matrix}

Based on the fact that

E {ψ_{h} (Z_{i}, Y_{i}; θ_{0})} = O (h^{r})

and

E {m_{ψ_{h}}^{0} (Z_{i}; θ_{0}, β_{0}, γ_{0})} = E {m_{ψ}^{0} (Z_{i}; θ_{0}, β_{0}, γ_{0})} + O (h^{r}),

(A1)

we have

E I_{i 1} = E {ψ_{h} (Z_{i}, Y_{i}; θ_{0})} = O (h^{r}) .

Additionally, we have

\begin{matrix} E I_{i 1}^{2} = & E [Z_{i} Z_{i}^{⊤} {\{δ_{i} G_{h} (Z_{i}^{⊤} θ_{0} - Y_{i}) + (1 - δ_{i}) E \{G_{h} (Z_{i}^{⊤} θ_{0} - Y_{i}) | Z_{i}, δ_{i} = 0\}\}}^{\otimes 2}] . \end{matrix}

According to the assumptions, as

n \to \infty

,

\begin{matrix} lim_{n h^{2 r} \to 0} E I_{i 1}^{2} = & E [{\{δ_{i} ψ (Z_{i}, Y_{i}; θ_{0}) + (1 - δ_{i}) m_{ψ}^{0} (Z_{i}; θ_{0})\}}^{\otimes 2}] \\ = & E [π (X_{i}, Y_{i}) ψ {(Z_{i}, Y_{i}; θ_{0})}^{\otimes 2} + (1 - π (X_{i}, Y_{i})) m_{ψ}^{0} {(Z_{i}; θ_{0})}^{\otimes 2}] \\ : = & A_{1}, \end{matrix}

thus yielding

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 1} \overset{d}{\to} N (0, A_{1}) .

For

I_{i 2}

, we have

\begin{matrix} \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 2} = & \frac{1}{n} \sum_{i = 1}^{n} (1 - δ_{i}) {\{\frac{\partial {\hat{m}}_{ψ_{h}}^{0} (Z_{i}; θ_{0}, β^{*}, γ^{*})}{\partial β^{⊤}}\}}^{⊤} \sqrt{n} (\hat{β} - β_{0}) \\ = & \frac{1}{n} \sum_{i = 1}^{n} (1 - δ_{i}) {\{\frac{\partial m_{ψ_{h}}^{0} (Z_{i}; θ_{0}, β^{*}, γ^{*})}{\partial β^{⊤}}\}}^{⊤} \sqrt{n} (\hat{β} - β_{0}) + o_{p} (1) . \end{matrix}

where

\begin{matrix} lim_{n h^{2 r} \to 0} E \{(1 - δ_{i}) {\{\frac{\partial m_{ψ_{h}}^{0} (Z_{i}; θ_{0}, β^{*}, γ^{*})}{\partial β^{⊤}}\}}^{⊤}\} \\ = & E \{(1 - δ_{i}) {Cov}_{0} (ψ (Z_{i}, Y_{i}; θ_{0}), s (Z_{i}, Y_{i}; β_{0}) | Z_{i})\} + o_{p} (1) . \end{matrix}

Let

H_{1} = E \{(1 - δ_{i}) {Cov}_{0} (ψ (Z_{i}, Y_{i}; θ_{0}), s (Z_{i}, Y_{i}; β_{0}) | Z_{i})\}

. By the assumption, we have

\hat{β} - β_{0} = O_{p} (n^{- 1 / 2})

and

\sqrt{n} (\hat{β} - β_{0}) \overset{d}{\to} N (0, Σ)

. Therefore, as

n \to \infty

,

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 2} \overset{d}{\to} N (0, H_{1} Σ H_{1}^{⊤}) .

Similarly, let

H_{2} = E \{(1 - δ_{i}) {Cov}_{0} (ψ (Z_{i}, Y_{i}; θ_{0}), Y | Z_{i})\}

. According to Shao [5], we have

\hat{γ} - γ_{0} = O_{p} (n^{- 1 / 2})

and

\sqrt{n} (\hat{γ} - γ_{0}) \overset{d}{\to} N (0, σ^{2})

. Thus, as

n \to \infty

,

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 3} \overset{d}{\to} N (0, σ^{2} H_{2}^{\otimes 2}) .

It can be shown that

E {I_{i 1} + I_{i 2}} = o_{p} (1)

. We have

\begin{matrix} (\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 1}) \cdot (\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 2}) = \frac{1}{n} \sum_{i = 1}^{n} I_{i 1} I_{i 2} + \frac{1}{n} \sum_{i \neq j}^{n} \sum_{j = 1}^{n} I_{i 1} I_{j 2} . \end{matrix}

where

\begin{matrix} \frac{1}{n} \sum_{i = 1}^{n} I_{i 1} I_{i 2} = & \frac{1}{n} \sum_{i = 1}^{n} ψ_{h}^{(1)} (Z_{i}, Y_{i}, δ_{i}; θ_{0}, β_{0}, γ_{0}) {\{\frac{\partial {\hat{m}}_{ψ_{h}}^{0} (Z_{i}; θ_{0}, β^{*}, γ^{*})}{\partial β^{⊤}}\}}^{⊤} (\hat{β} - β_{0}) \\ = & o_{p} (1), \end{matrix}

\begin{matrix} \frac{1}{n} \sum_{i \neq j}^{n} \sum_{j = 1}^{n} I_{i 1} I_{j 2} = & \frac{1}{n} \sum_{i \neq j} \sum_{j = 1}^{n} ψ_{h i}^{(1)} (θ_{0}, β_{0}, γ_{0}) {\{\frac{\partial m_{ψ_{h}}^{0} (Z_{j}; θ_{0}, β_{0}, γ_{0})}{\partial β^{⊤}}\}}^{⊤} (\hat{β} - β_{0}) \\ + o_{p} (n^{- 1 / 2}) . \end{matrix}

For

i \neq j

,

ψ_{h i}^{(1)} (θ_{0}, β_{0}, γ_{0})

and

\{\frac{\partial m_{ψ_{h}}^{0} (Z_{j}; θ_{0}, β_{0}, γ_{0})}{\partial β^{⊤}}\}

are independent; therefore,

\begin{matrix} E [ψ_{h i}^{(1)} (θ_{0}, β_{0}, γ_{0})] = O (h^{r}) . \\ E [{\{\frac{\partial m_{ψ_{h}}^{0} (Z_{j}; θ_{0}, β_{0}, γ_{0})}{\partial β^{⊤}}\}}^{⊤}] (\hat{β} - β_{0}) = O_{p} (n^{- 1 / 2}) . \end{matrix}

By the assumption, we have

\frac{1}{n} \sum_{i \neq j}^{n} \sum_{j = 1}^{n} I_{i 1} I_{j 2} = (n - 1) O (h^{r}) O_{p} (n^{- 1 / 2}) = O_{p} (n^{1 / 2} h^{r}) = o_{p} (1)

. Hence, we can conclude that

(\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 1}) \cdot (\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 2}) = o_{p} (1)

, which implies

Cov (\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 1}, \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 2}) = o (1) .

Similarly, we can show that

Cov (\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 1}, \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 3}) = o (1)

. Consequently,

Cov (\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 1}, \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} (I_{i 2} + I_{i 3})) = o (1) .

To establish the asymptotic properties of

(\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 2}) \cdot (\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} I_{i 3})

, we employ Taylor expansions, thus yielding the following:

\begin{matrix} \sqrt{n} (\hat{γ} - γ_{0}) = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} Φ_{i} (γ_{0}) + o_{p} (1), & \sqrt{n} (\hat{β} - β_{0}) = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} Ψ_{i} (β_{0}) + o_{p} (1) . \end{matrix}

Consequently, as

n \to \infty

, we have

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} (I_{i 2} + I_{i 3}) \overset{d}{\to} N (0, B_{1}),

where

B_{1} : = Var (H_{1} Ψ (β_{0}) + H_{2} Φ (γ_{0}))

. Furthermore, we obtain

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} ψ_{h i}^{(1)} (θ_{0}, \hat{β}, \hat{γ}) \overset{d}{\to} N (0, T_{1}),

where

T_{1} = A_{1} + B_{1}

.

To investigate the asymptotic properties of

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} ψ_{h i}^{(2)} (θ_{0}, \hat{β}, \hat{γ})

, we have

\begin{matrix} \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} ψ_{h i}^{(2)} (θ_{0}, \hat{β}, \hat{γ}) \\ = & \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} ψ_{h i}^{(2)} (θ_{0}, β_{0}, γ_{0}) \\ + \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \{\frac{δ_{i}}{π (X_{i}, Y_{i}; {\hat{g}}_{γ_{0}}, γ_{0})} - \frac{δ_{i}}{π (X_{i}, Y_{i})}\} \{ψ_{h} (Z_{i}, Y_{i}; θ_{0}) - m_{ψ_{h}}^{0} (Z_{i}; θ_{0})\} \\ + \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \{\frac{δ_{i}}{π (X_{i}, Y_{i}; {\hat{g}}_{\hat{γ}}, \hat{γ})} - \frac{δ_{i}}{π (X_{i}, Y_{i}; {\hat{g}}_{γ_{0}}, γ_{0})}\} \{ψ_{h} (Z_{i}, Y_{i}; θ_{0}) - m_{ψ_{h}}^{0} (Z_{i}; θ_{0})\} + o_{p} (1) \\ = & \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} L_{i 1} + \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} L_{i 2} + \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} L_{i 3} + o_{p} (1) . \end{matrix}

Similar to the previous proof, as

n \to \infty

, we have

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} L_{i 1} \overset{d}{\to} N (0, A_{2}),

where

A_{2} = E \{π {(X_{i}, Y_{i})}^{- 1} {(ψ (Z_{i}, Y_{i}; θ_{0}) - m_{ψ}^{0} (Z_{i}; θ_{0}))}^{\otimes 2}\} + E \{m_{ψ}^{0} {(Z_{i}; θ_{0})}^{\otimes 2}\}

.

To analyze the asymptotic behavior of

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} L_{i 2}

, we have

\begin{matrix} \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} L_{i 2} = & \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \{1 - \frac{δ_{i}}{π (X_{i}, Y_{i})}\} \{E \{ψ_{h} (Z_{i}, Y_{i}; θ_{0}) - m_{ψ_{h}}^{0} (Z_{i}; θ_{0}) | X_{i}, δ_{i} = 0\}\} + o_{p} (1) \\ = & o_{p} (1) . \end{matrix}

According to the analysis, we can conclude that

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} L_{i 2}

converges to zero in probability, i.e.,

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} L_{i 2} = o_{p} (1)

.

To establish the asymptotic properties of

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} L_{i 3}

, we have

\begin{matrix} \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} L_{i 3} \\ = & \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} δ_{i} \frac{\partial π^{- 1} (X_{i}, Y_{i}; {\hat{g}}_{γ}, γ)}{\partial γ} \{ψ_{h} (Z_{i}, Y_{i}; θ_{0}) - m_{ψ_{h}}^{0} (Z_{i}, θ_{0})\} (\hat{γ} - γ_{0}) \\ = & \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} δ_{i} \{ψ_{h} (Z_{i}, Y_{i}; θ_{0}) - m_{ψ_{h}}^{0} (Z_{i}, θ_{0})\} exp (- {\hat{g}}_{γ_{0}} (X_{i}) + γ_{0} Y_{i}) {Y_{i} - {\hat{m}}_{Y}^{0} (X_{i}; γ_{0})} (\hat{γ} - γ_{0}) \\ = & \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} δ_{i} O (X_{i}, Y_{i}) \{ψ_{h} (Z_{i}, Y_{i}; θ_{0}) - m_{ψ_{h}}^{0} (Z_{i}, θ_{0})\} {Y_{i} - m_{Y}^{0} (X_{i}; γ_{0})} (\hat{γ} - γ_{0}) + o_{p} (1) \\ = : & H_{3} \sqrt{n} (\hat{γ} - γ_{0}) + o_{p} (1), \end{matrix}

where

H_{3} = E \{(1 - δ) (Y - m_{Y}^{0} (X)) (ψ (Z, Y; θ_{0}) - m_{ψ}^{0} (Z; θ_{0}))\}

.

According to Slutsky's theorem, we have

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} L_{i 3} \overset{d}{\to} N (0, B_{2}),

where

B_{2} = Var (H_{3} Φ (γ_{0}))

. Similarly to the previous proof, it can be shown that

Cov (\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} L_{i 1}, \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} L_{i 3}) = o_{p} (1),

which implies that, as

n \to \infty

,

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(2)} (θ_{0}, \hat{β}, \hat{γ}) \overset{d}{\to} N (0, T_{2}),

where

T_{2} = A_{2} + B_{2}

.

To prove (2), we first establish the asymptotic property of

\frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(1)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2}

. By the law of large numbers and the fact that

\hat{γ} - γ_{0} = o_{p} (1)

and

\hat{β} - β_{0} = o_{p} (1)

, we have

\begin{matrix} \frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(1)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2} = \frac{1}{n} \sum_{i = 1}^{n} ψ_{h i}^{(1)} {(θ_{0}, β_{0}, γ_{0})}^{\otimes 2} + o_{p} (1) . \end{matrix}

As

n \to \infty

, under the stated assumptions, we have

\begin{matrix} \lim_{n h^{2 r} \to 0} E \{{[δ_{i} ψ_{h} (Z_{i}, Y_{i}; θ_{0}) + (1 - δ_{i}) m_{ψ_{h}}^{0} (Z_{i}; θ_{0})]}^{\otimes 2}\} \\ = & \lim_{n h^{2 r} \to 0} E [δ_{i} ψ_{h} {(Z_{i}, Y_{i}; θ_{0})}^{\otimes 2} + (1 - δ_{i}) m_{ψ_{h}}^{0} {(Z_{i}; θ_{0})}^{\otimes 2}] \\ & + \lim_{n h^{2 r} \to 0} E [2 δ_{i} (1 - δ_{i}) ψ_{h} (Z_{i}, Y_{i}; θ_{0}) m_{ψ_{h}}^{0} (Z_{i}; θ_{0})] \\ = & E [δ_{i} ψ {(Z_{i}, Y_{i}; θ_{0})}^{\otimes 2} + (1 - δ_{i}) m_{ψ}^{0} {(Z_{i}; θ_{0})}^{\otimes 2} + 2 δ_{i} (1 - δ_{i}) ψ (Z_{i}, Y_{i}; θ_{0}) m_{ψ}^{0} (Z_{i}; θ_{0})] \\ = & E [δ_{i} ψ {(Z_{i}, Y_{i}; θ_{0})}^{\otimes 2} + (1 - δ_{i}) m_{ψ}^{0} {(Z_{i}; θ_{0})}^{\otimes 2}] \\ = : & A_{1} . \end{matrix}

Therefore,

\frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(1)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2} \overset{p}{\to} A_{1} .

Similarly, for

\frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(2)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2}

, we have

\frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(2)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2} = \frac{1}{n} \sum_{i = 1}^{n} ψ_{h i}^{(2)} {(θ_{0}, β_{0}, γ_{0})}^{\otimes 2} + o_{p} (1) .

As

n \to \infty

, under the stated assumptions, we have

\begin{matrix} \lim_{n h^{2 r} \to 0} E \{{[\frac{δ_{i}}{π (X_{i}, Y_{i})} \{ψ_{h} (Z_{i}, Y_{i}; θ_{0}) - m_{ψ_{h}}^{0} (Z_{i}; θ_{0})\} + m_{ψ_{h}}^{0} (Z_{i}; θ_{0})]}^{\otimes 2}\} \\ = & \lim_{n h^{2 r} \to 0} E \{[\frac{δ_{i}}{π {(X_{i}, Y_{i})}^{2}} {\{ψ_{h} (Z_{i}, Y_{i}; θ_{0}) - m_{ψ_{h}}^{0} (Z_{i}; θ_{0})\}}^{\otimes 2} + m_{ψ_{h}}^{0} {(Z_{i}; θ_{0})}^{\otimes 2}]\} \\ = & E \{π {(X_{i}, Y_{i})}^{- 1} {\{ψ (Z_{i}, Y_{i}; θ_{0}) - m_{ψ}^{0} (Z_{i}; θ_{0})\}}^{\otimes 2} + m_{ψ}^{0} {(Z_{i}; θ_{0})}^{\otimes 2}\} \\ = & A_{2} . \end{matrix}

Therefore, we have

\frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(2)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2} \overset{p}{\to} A_{2} .
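The two estimating functions whose second moments were just computed can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names `psi_eei`, `psi_aipw` and `second_moment` are hypothetical, and the inputs (response indicator `delta`, response probability `pi`, smoothed scores `psi`, and conditional means `m`) are assumed to have been estimated beforehand.

```python
import numpy as np

def psi_eei(delta, psi, m):
    """Rows are delta_i * psi_i + (1 - delta_i) * m_i (the form inside A_1)."""
    d = delta[:, None]
    return d * psi + (1.0 - d) * m

def psi_aipw(delta, pi, psi, m):
    """Rows are (delta_i / pi_i) * (psi_i - m_i) + m_i (the form inside A_2)."""
    w = (delta / pi)[:, None]
    return w * (psi - m) + m

def second_moment(u):
    """(1/n) sum_i u_i u_i^T, the sample analogue of E{u^{otimes 2}}."""
    return u.T @ u / u.shape[0]
```

Applying `second_moment` to the output of `psi_eei` or `psi_aipw` gives the sample quantities whose probability limits are denoted A_1 and A_2 above.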

Next, we prove (3). Note that, for

l = 1, 2

, we have

\begin{matrix} E \{\frac{\partial}{\partial θ^{⊤}} ψ_{h i}^{(l)} (θ_{0}, β_{0}, γ_{0})\} = E \{\frac{\partial}{\partial θ^{⊤}} ψ_{h} (Z_{i}, Y_{i}; θ_{0})\} \\ = & E \{\frac{\partial}{\partial θ^{⊤}} Z_{i} G_{h} (Z_{i}^{⊤} θ - Y_{i})\} = E \{\frac{\partial}{\partial θ^{⊤}} Z_{i} \int I (Y_{i} < Z_{i}^{⊤} θ_{0} - u h) K (u) d u\} \\ = & E \{\frac{\partial}{\partial θ^{⊤}} Z_{i} \int F_{Y} (Z_{i}^{⊤} θ_{0} - u h) K (u) d u\} = E \{Z_{i} \int \frac{\partial}{\partial θ^{⊤}} F_{Y} (Z_{i}^{⊤} θ_{0} - u h) K (u) d u\} \\ = & E \{Z_{i} Z_{i}^{⊤} \int f_{Y} (Z_{i}^{⊤} θ_{0} - u h) K (u) d u\} = E \{Z_{i} Z_{i}^{⊤} f_{Y} (Z_{i}^{⊤} θ_{0} | Z_{i})\} + o_{p} (1) \\ = & E \{Z_{i} Z_{i}^{⊤} f (0 | Z_{i})\} + o_{p} (1) : = Γ + o_{p} (1) . \end{matrix}

By the law of large numbers, as

n \to \infty

, we have

\frac{1}{n} \sum_{i = 1}^{n} \frac{\partial {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})}{\partial θ^{⊤}} \overset{p}{\to} Γ

.
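The derivative calculation above can be checked numerically with a small sketch. It assumes a Gaussian kernel K (so that the smoothed indicator is the normal distribution function evaluated at v / h) and a score of the form Z_i {G_h(Z_i^T θ - Y_i) - τ}; the kernel choice, the centering by τ, and the function names are assumptions made for illustration only.

```python
import numpy as np
from math import erf

_erf = np.vectorize(erf)

def G_h(v, h):
    """Smoothed indicator of v > 0: integral of I(u < v / h) K(u) du for Gaussian K."""
    return 0.5 * (1.0 + _erf(v / (h * np.sqrt(2.0))))

def psi_h(Z, Y, theta, tau, h):
    """Rows are Z_i * (G_h(Z_i^T theta - Y_i) - tau)."""
    return Z * (G_h(Z @ theta - Y, h) - tau)[:, None]

def gamma_hat(Z, Y, theta, h):
    """Sample analogue of Gamma: (1/n) sum_i Z_i Z_i^T K((Z_i^T theta - Y_i) / h) / h."""
    v = (Z @ theta - Y) / h
    k = np.exp(-0.5 * v ** 2) / (np.sqrt(2.0 * np.pi) * h)
    return (Z * k[:, None]).T @ Z / len(Y)
```

Because the smoothed score is differentiable in θ, `gamma_hat` is exactly the Jacobian of the averaged `psi_h`, which is what makes the Taylor expansions in the proofs available.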

Finally, we demonstrate (4). From

\frac{n^{- 1} {(\max_{i} ∥ {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) ∥)}^{2}}{n^{- 1} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2}} \to 0,

it can be easily shown that, for

l = 1, 2

,

{max}_{i} ∥ {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) ∥ = o_{p} (n^{1 / 2})

.

Proof of Theorem 1.

By applying the Lagrange multiplier method, we obtain the empirical log-likelihood ratio function with respect to the parameter vector

θ

:

{\hat{R}}^{(l)} (θ) = 2 \sum_{i = 1}^{n} \log \{1 + λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ, \hat{β}, \hat{γ})\},

where

λ = λ (θ)

is the solution to the following equation:

g (λ) = \frac{1}{n} \sum_{i = 1}^{n} \frac{{\hat{ψ}}_{h i}^{(l)} (θ, \hat{β}, \hat{γ})}{1 + λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ, \hat{β}, \hat{γ})} = 0 .
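In practice the multiplier has no closed form and is found numerically. The following is a minimal sketch, assuming the n × q matrix `psi` already holds the estimated scores at a fixed θ; the function names are hypothetical, and the damped Newton iteration is a standard choice for empirical likelihood rather than a procedure described in the paper.

```python
import numpy as np

def el_lambda(psi, n_iter=50, tol=1e-10):
    """Solve (1/n) sum_i psi_i / (1 + lam^T psi_i) = 0 for lam by damped Newton steps."""
    n, q = psi.shape
    lam = np.zeros(q)
    for _ in range(n_iter):
        denom = 1.0 + psi @ lam
        g = (psi / denom[:, None]).mean(axis=0)
        if np.linalg.norm(g) < tol:
            break
        # Jacobian of g in lam: -(1/n) sum_i psi_i psi_i^T / (1 + lam^T psi_i)^2
        J = -(psi / denom[:, None] ** 2).T @ psi / n
        step = np.linalg.solve(J, -g)
        # Halve the step until every 1 + lam^T psi_i stays positive.
        t = 1.0
        while np.any(1.0 + psi @ (lam + t * step) <= 1e-8):
            t *= 0.5
        lam = lam + t * step
    return lam

def el_log_ratio(psi):
    """Return 2 * sum_i log(1 + lam^T psi_i) together with the multiplier lam."""
    lam = el_lambda(psi)
    return 2.0 * np.sum(np.log1p(psi @ lam)), lam
```

The iteration presumes that the zero vector lies inside the convex hull of the rows of `psi`; otherwise no solution with positive weights exists.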

In other words,

\hat{θ}

simultaneously satisfies the following two equations:

\begin{matrix} T_{1 n}^{(l)} (θ, λ) = & \frac{1}{n} \sum_{i = 1}^{n} \frac{{\hat{ψ}}_{h i}^{(l)} (θ, \hat{β}, \hat{γ})}{1 + λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ, \hat{β}, \hat{γ})}, \\ T_{2 n}^{(l)} (θ, λ) = & \frac{1}{n} \sum_{i = 1}^{n} \frac{{\partial {\hat{ψ}}_{h i}^{(l)} (θ, \hat{β}, \hat{γ}) / \partial θ^{⊤}} λ}{1 + λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ, \hat{β}, \hat{γ})} . \end{matrix}

Note that

T_{1 n}^{(l)} (\hat{θ}, 0) = n^{- 1} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (\hat{θ})

, and

T_{2 n}^{(l)} (\hat{θ}, 0) = 0

. Under the stated conditions, by Lemma A.1 in Newey and Smith [30] and Theorem 1(a) in Leng and Tang [31], it can be shown that

\hat{θ}

is a consistent estimator of

θ_{0}

. By Taylor expanding

T_{1 n}^{(l)} (\hat{θ}, λ)

and

T_{2 n}^{(l)} (\hat{θ}, λ)

around

(θ_{0}, 0)

, we have

\begin{matrix} 0 = & T_{1 n}^{(l)} (θ_{0}, 0) + \frac{\partial T_{1 n}^{(l)} (θ_{0}, 0)}{\partial θ^{⊤}} (\hat{θ} - θ_{0}) + \frac{\partial T_{1 n}^{(l)} (θ_{0}, 0)}{\partial λ} λ + o_{p} (u_{n}), \\ 0 = & T_{2 n}^{(l)} (θ_{0}, 0) + \frac{\partial T_{2 n}^{(l)} (θ_{0}, 0)}{\partial θ^{⊤}} (\hat{θ} - θ_{0}) + \frac{\partial T_{2 n}^{(l)} (θ_{0}, 0)}{\partial λ} λ + o_{p} (u_{n}), \end{matrix}

where

u_{n} = ∥ \hat{θ} - θ_{0} ∥ + ∥ λ ∥

.

The above equations can be rewritten as follows:

\begin{matrix} (\begin{matrix} λ \\ \hat{θ} - θ_{0} \end{matrix}) = & {(\begin{matrix} - \frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2} & \frac{1}{n} \sum_{i = 1}^{n} \frac{\partial {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})}{\partial θ^{⊤}} \\ \frac{1}{n} \sum_{i = 1}^{n} \frac{\partial {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})}{\partial θ^{⊤}} & 0 \end{matrix})}^{- 1} \\ (\begin{matrix} - \frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) + o_{p} (u_{n}) \\ o_{p} (u_{n}) \end{matrix}) . \end{matrix}

Based on the results of Lemma 2, we have

\sqrt{n} ({\hat{θ}}^{(l)} - θ_{0}) = {\{- \frac{1}{n} \sum_{i = 1}^{n} \frac{\partial {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})}{\partial θ^{⊤}}\}}^{- 1} \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) + o_{p} (1) .

Therefore, for

l = 1, 2

and as

n \to \infty

, we have

\sqrt{n} ({\hat{θ}}^{(l)} - θ_{0}) \overset{d}{\to} N (0, Γ^{- 1} T_{l} Γ^{- ⊤}) .

□
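The sandwich form of the limiting covariance in Theorem 1 can be evaluated from plug-in estimates as in the sketch below. The names `sandwich_cov` and `std_errors` are hypothetical, and `Gamma` and `T` are assumed to be consistent estimates of Γ and T_l obtained elsewhere.

```python
import numpy as np

def sandwich_cov(Gamma, T):
    """Return Gamma^{-1} T Gamma^{-T} using linear solves instead of explicit inverses."""
    left = np.linalg.solve(Gamma, T)          # Gamma^{-1} T
    return np.linalg.solve(Gamma, left.T).T   # (Gamma^{-1} (Gamma^{-1} T)^T)^T

def std_errors(Gamma, T, n):
    """Approximate standard errors of theta-hat for sample size n."""
    return np.sqrt(np.diag(sandwich_cov(Gamma, T)) / n)
```

Wald-type intervals then follow as the point estimate plus or minus a normal quantile times the entries of `std_errors`.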

Proof of Theorem 2.

First, we show that

∥ λ ∥ = O_{p} (n^{- 1 / 2})

. Let

λ = λ (θ_{0}) = ρ u

, where

u = λ / ∥ λ ∥

and

∥ u ∥ = 1

. We have

\begin{matrix} 0 = & \frac{1}{n} \sum_{i = 1}^{n} \frac{{\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})}{1 + λ {(θ_{0})}^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})} \\ = & \frac{1}{n} \sum_{i = 1}^{n} \frac{{\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})}{1 + ρ u^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})} \\ = & \frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) - \frac{1}{n} \sum_{i = 1}^{n} \frac{{\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{⊤} u ρ}{1 + ρ u^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})} . \end{matrix}

By multiplying both sides of the equation by

u^{⊤}

, we obtain

\begin{matrix} ∥ u^{⊤} \frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) ∥ \\ = & \frac{1}{n} \sum_{i = 1}^{n} \frac{u^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{⊤} u ρ}{1 + ρ u^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})} \\ \geq & \frac{1}{1 + ρ \max_{i} ∥ {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) ∥} \frac{1}{n} \sum_{i = 1}^{n} u^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{⊤} u ρ . \end{matrix}

Thus, we can conclude that

\begin{matrix} \frac{1}{n} \sum_{i = 1}^{n} u^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{⊤} u ρ \leq ∥ u^{⊤} \frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) ∥ \{1 + ρ \max_{i} ∥ {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) ∥\} . \end{matrix}

Based on Lemma 2, we have

ρ u^{⊤} A_{l} u + o_{p} (1) \leq O_{p} (n^{- 1 / 2}) \{1 + ρ o_{p} (n^{1 / 2})\} .

Consequently, it follows that

ρ = O_{p} (n^{- 1 / 2})

. Furthermore, we can observe that

max_{i} ∥ λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) ∥ \leq ∥ λ ∥ max_{i} ∥ {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) ∥ = O_{p} (n^{- 1 / 2}) o_{p} (n^{1 / 2}) = o_{p} (1) .

By expanding the function

g (λ)

, we obtain

\begin{matrix} 0 = g (λ) = & \frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) [1 - λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) + \frac{{[λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})]}^{2}}{{(1 + η_{i})}^{3}}] \\ = & \frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) - \frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2} λ \\ & + \frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) \frac{{[λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})]}^{2}}{{(1 + η_{i})}^{3}}, \end{matrix}

where

η_{i} \in (0, λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}))

. From the fact that

{max}_{i} ∥ λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) ∥ = o_{p} (1)

, it follows that

| η_{i} | = o_{p} (1)

.

Note that

\begin{matrix} ∥\frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) \frac{{[λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})]}^{2}}{{(1 + η_{i})}^{3}}∥ \\ \leq & \frac{\max_{i} ∥ {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) ∥}{1 - \max_{i} | η_{i} |} ∥ λ^{⊤} \frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2} λ ∥ \\ = & o_{p} (n^{1 / 2}) O_{p} (n^{- 1}) = o_{p} (n^{- 1 / 2}) . \end{matrix}

Therefore,

λ = {\{\frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2}\}}^{- 1} \{\frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})\} + ζ,

where

∥ ζ ∥ = o_{p} (n^{- 1 / 2})

.

Applying a Taylor expansion to the logarithm in

{\hat{R}}^{(l)} (θ_{0})

, we obtain

\begin{matrix} {\hat{R}}^{(l)} (θ_{0}) = & 2 \sum_{i = 1}^{n} [λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) - \frac{1}{2} {[λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})]}^{2} + \frac{1}{3} \frac{{[λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})]}^{3}}{{(1 + ξ_{i})}^{3}}] \\ = & 2 λ^{⊤} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) - \sum_{i = 1}^{n} λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{⊤} λ \\ + \frac{2}{3} \sum_{i = 1}^{n} \frac{{[λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})]}^{3}}{{(1 + ξ_{i})}^{3}} . \end{matrix}

Similarly,

\begin{matrix} |\sum_{i = 1}^{n} \frac{{[λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})]}^{3}}{{(1 + ξ_{i})}^{3}}| \leq & \frac{\max_{i} ∥ λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) ∥}{1 - \max_{i} | ξ_{i} |} ∥ \sum_{i = 1}^{n} λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{⊤} λ ∥ \\ = & o_{p} (1) n O_{p} (n^{- 1}) = o_{p} (1) . \end{matrix}

Therefore, we obtain

{\hat{R}}^{(l)} (θ_{0}) = 2 λ^{⊤} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) - \sum_{i = 1}^{n} λ^{⊤} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ}) {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{⊤} λ + o_{p} (1) .

By combining the previous results, we can express

\begin{matrix} {\hat{R}}^{(l)} (θ_{0}) = & n {\{\frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})\}}^{⊤} {\{\frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2}\}}^{- 1} \{\frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})\} \\ & - n ζ^{⊤} \frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2} ζ + o_{p} (1) . \end{matrix}
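The substitution behind this display can be verified term by term. As shorthand introduced only for this check (not notation from the paper), write \bar{ψ} for the sample mean of the estimated scores and S for the sample mean of their outer products, so that λ = S^{-1} \bar{ψ} + ζ with S symmetric:

```latex
\begin{aligned}
2 n λ^{⊤} \bar{ψ} - n λ^{⊤} S λ
  &= 2 n \bar{ψ}^{⊤} S^{-1} \bar{ψ} + 2 n ζ^{⊤} \bar{ψ}
     - n \{ \bar{ψ}^{⊤} S^{-1} \bar{ψ} + 2 ζ^{⊤} \bar{ψ} + ζ^{⊤} S ζ \} \\
  &= n \bar{ψ}^{⊤} S^{-1} \bar{ψ} - n ζ^{⊤} S ζ .
\end{aligned}
```

The cross terms in ζ^{⊤} \bar{ψ} cancel exactly, which is why only the quadratic form in \bar{ψ} and the remainder in ζ survive.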

Here,

n ζ^{⊤} \frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2} ζ = n o_{p} (n^{- 1 / 2}) o_{p} (n^{- 1 / 2}) = o_{p} (1)

. Thus,

\begin{matrix} {\hat{R}}^{(l)} (θ_{0}) = & {\{\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})\}}^{⊤} {\{\frac{1}{n} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} {(θ_{0}, \hat{β}, \hat{γ})}^{\otimes 2}\}}^{- 1} \\ \cdot & \{\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} {\hat{ψ}}_{h i}^{(l)} (θ_{0}, \hat{β}, \hat{γ})\} \\ & + o_{p} (1) . \end{matrix}

By results (1) and (2) of Lemma 2, as n tends to infinity,

{\hat{R}}^{(l)} (θ_{0})

converges in distribution to a linear combination of independent chi-squared random variables:

{\hat{R}}^{(l)} (θ_{0}) \overset{d}{\to} r_{1}^{(l)} χ_{1 \cdot 1}^{2} + r_{2}^{(l)} χ_{1 \cdot 2}^{2} + \dots + r_{q}^{(l)} χ_{1 \cdot q}^{2},

where

r_{1}^{(l)}, \dots, r_{q}^{(l)}

are the eigenvalues of

A_{l}^{- 1} T_{l}

. Here,

χ_{1 \cdot 1}^{2}, χ_{1 \cdot 2}^{2}, \dots, χ_{1 \cdot q}^{2}

denote q independent chi-squared random variables, each with one degree of freedom. This completes the proof of the theorem. □
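Because the limit in Theorem 2 is a weighted sum of chi-squared variables rather than a single chi-squared law, critical values are usually obtained by simulation. The sketch below is one hedged way to do so; the function name, the Monte Carlo approach, and the default number of draws are assumptions, and `A` and `T` stand for plug-in estimates of A_l and T_l.

```python
import numpy as np

def weighted_chi2_quantile(A, T, level=0.95, draws=100_000, seed=0):
    """Monte Carlo quantile of sum_j r_j * chi2_1, with r_j the eigenvalues of A^{-1} T."""
    # Eigenvalues are real for symmetric positive definite A and T;
    # .real only discards numerical round-off in the imaginary part.
    r = np.linalg.eigvals(np.linalg.solve(A, T)).real
    rng = np.random.default_rng(seed)
    sims = rng.chisquare(df=1, size=(draws, r.size)) @ r
    return np.quantile(sims, level), r
```

A confidence region for θ then collects the values whose empirical log-likelihood ratio does not exceed the simulated quantile. When all weights equal one, the result reduces to the usual chi-squared critical value with q degrees of freedom.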

References

1. Robins, J.M.; Ritov, Y. Toward a curse of dimensionality appropriate (CODA) asymptotic theory for semi-parametric models. Stat. Med. 1997, 16, 285–319.
2. Wang, S.; Shao, J.; Kim, J.K. An instrumental variable approach for identification and estimation with nonignorable nonresponse. Stat. Sin. 2014, 24, 1097–1116.
3. Kenward, M.G. Selection models for repeated measurements with non-random dropout: An illustration of sensitivity. Stat. Med. 1998, 17, 2723–2732.
4. Kim, J.K.; Yu, C.L. A semiparametric estimation of mean functionals with nonignorable missing data. J. Am. Stat. Assoc. 2011, 106, 157–165.
5. Shao, J.; Wang, L. Semiparametric inverse propensity weighting for nonignorable missing data. Biometrika 2016, 103, 175–187.
6. Kim, J.K.; Shao, J. Statistical Methods for Handling Incomplete Data; CRC Press: New York, NY, USA, 2022.
7. Koenker, R.; Bassett, G. Regression quantiles. Econometrica 1978, 46, 33–50.
8. Koenker, R.; Bassett, G. Tests of linear hypotheses and L1 estimation. Econometrica 1982, 50, 1577–1583.
9. Horowitz, J.L. Bootstrap methods for median regression models. Econometrica 1998, 66, 1327–1351.
10. Whang, Y.J. Smoothed empirical likelihood methods for quantile regression models. Econ. Theory 2006, 22, 173–205.
11. Luo, S.H.; Mei, C.L.; Zhang, C.Y. Smoothed empirical likelihood for quantile regression models with response data missing at random. Adv. Stat. Anal. 2017, 101, 95–116.
12. Zhang, T.; Wang, L. Smoothed empirical likelihood inference and variable selection for quantile regression with nonignorable missing response. Comput. Stat. Data Anal. 2020, 144, 106888.
13. Niu, C.; Guo, X.; Xu, W.; Zhu, L. Empirical likelihood inference in linear regression with nonignorable missing response. Comput. Stat. Data Anal. 2014, 79, 91–112.
14. Bindele, H.F.; Zhao, Y.C. Rank-based estimating equation with non-ignorable missing responses via empirical likelihood. Stat. Sin. 2018, 28, 1787–1820.
15. Chen, X.R.; Wan, A.T.K.; Zhou, Y. Efficient quantile regression analysis with missing observations. J. Am. Stat. Assoc. 2015, 110, 723–741.
16. Tang, N.S.; Zhao, P.Y.; Zhu, H.T. Empirical likelihood for estimating equations with nonignorably missing data. Stat. Sin. 2014, 24, 723–747.
17. Kim, J.K. Parametric fractional imputation for missing data analysis. Biometrika 2011, 98, 119–132.
18. Riddles, M.K.; Kim, J.K.; Im, J. A propensity-score-adjustment method for nonignorable nonresponse. J. Surv. Stat. Methodol. 2016, 4, 215–245.
19. Paik, M.; Larsen, M.D. Handling nonignorable nonresponse with respondent modeling and the SIR algorithm. J. Stat. Plan. Inference 2014, 145, 179–189.
20. Wang, X.L.; Song, Y.Q.; Lin, L. Handling estimating equation with nonignorably missing data based on SIR algorithm. J. Comput. Appl. Math. 2017, 326, 62–70.
21. Song, Y.Q.; Zhu, Y.J.; Wang, X.L.; Lin, L. Robust inference for estimating equations with nonignorably missing data based on SIR algorithm. J. Stat. Comput. Simul. 2019, 89, 3196–3212.
22. Newey, W.K.; McFadden, D. Large sample estimation and hypothesis testing. In Handbook of Econometrics; Engle, R.F., McFadden, D., Eds.; Elsevier: Amsterdam, The Netherlands, 1994; pp. 2111–2245.
23. Van der Vaart, A.W. Semiparametric statistics. In Lectures on Probability Theory and Statistics (Saint-Flour, 1999); Bernard, P., Ed.; Springer: Berlin, Germany, 2002; pp. 331–457.
24. Morikawa, K.; Kim, J.K.; Kano, Y. Semiparametric maximum likelihood estimation with data missing not at random. Can. J. Stat. 2017, 45, 393–409.
25. Morikawa, K.; Kim, J.K. Semiparametric optimal estimation with nonignorable nonresponse data. Ann. Stat. 2021, 49, 2991–3014.
26. Zhao, P.; Wang, L.; Shao, J. Empirical likelihood and Wilks phenomenon for data with nonignorable missing values. Scand. J. Stat. 2019, 46, 1003–1024.
27. Hammer, S.M.; Katzenstein, D.A.; Hughes, M.D.; Gundacker, H.; Schooley, R.T.; Haubrich, R.H.; Henry, W.K.; Lederman, M.M.; Phair, J.P.; Niu, M.; et al. A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. N. Engl. J. Med. 1996, 335, 1081–1090.
28. Davidian, M.; Tsiatis, A.A.; Leon, S. Semiparametric estimation of treatment effect in a pretest-posttest study with missing data. Stat. Sci. 2005, 20, 261–301.
29. Zhang, M.; Tsiatis, A.A.; Davidian, M. Improving efficiency of inferences in randomized clinical trials using auxiliary covariates. Biometrics 2008, 64, 707–715.
30. Newey, W.K.; Smith, R.J. Higher order properties of GMM and generalized empirical likelihood estimators. Econometrica 2004, 72, 219–255.
31. Leng, C.L.; Tang, C.Y. Penalized empirical likelihood and growing dimensional general estimating equations. Biometrika 2012, 99, 703–716.

Figure 1. Histogram of the residual distribution for the parameter working model

N (\hat{μ} (X), {\hat{σ}}^{2})

.

Figure 2. QQ plot of the residual distribution for the parameter working model

N (\hat{μ} (X), {\hat{σ}}^{2})

.

Figure 3. Histogram of complete observed data

{CD 4}_{96}

in ACTG175.

Figure 4. Histogram of residuals from the parameterized working model.

Figure 5. Q-Q plot of residuals from the parameterized working model.

Table 1. Monte Carlo mean, standard deviation (SD), and approximate relative performance (ARE) of the five methods for error term (a).

		FULL		CC		EEI		IPW		AIPW
		$θ_{0}$	$θ_{1}$	$θ_{0}$	$θ_{1}$	$θ_{0}$	$θ_{1}$	$θ_{0}$	$θ_{1}$	$θ_{0}$	$θ_{1}$
$τ = 0.25$	Mean	−1.606	1.009	−1.652	1.002	−1.598	1.008	−1.605	1.008	−1.608	1.008
	SD	0.054	0.105	0.063	0.123	0.049	0.095	0.062	0.121	0.061	0.122
	ARE	1.000	1.000	1.445	1.173	0.919	0.905	1.148	1.152	1.130	1.162
$τ = 0.5$	Mean	−0.950	1.010	−0.999	1.002	−0.953	1.008	−0.950	1.009	−0.950	1.008
	SD	0.049	0.098	0.057	0.116	0.044	0.089	0.057	0.117	0.057	0.118
	ARE	1.000	1.000	1.534	1.186	0.900	0.908	1.163	1.194	1.163	1.204
$τ = 0.75$	Mean	−0.298	1.009	−0.349	1.001	−0.310	1.008	−0.297	1.008	−0.293	1.008
	SD	0.052	0.107	0.061	0.126	0.046	0.097	0.063	0.130	0.062	0.131
	ARE	1.000	1.000	1.529	1.180	0.914	0.907	1.212	1.215	1.196	1.224

Table 2. Monte Carlo mean, standard deviation (SD), and approximate relative performance (ARE) of the five methods for error term (b).

		FULL		CC		EEI		IPW		AIPW
		$θ_{0}$	$θ_{1}$	$θ_{0}$	$θ_{1}$	$θ_{0}$	$θ_{1}$	$θ_{0}$	$θ_{1}$	$θ_{0}$	$θ_{1}$
$τ = 0.25$	Mean	−1.503	1.004	−1.534	1.004	−1.492	1.005	−1.502	1.006	−1.507	1.005
	SD	0.045	0.099	0.052	0.112	0.041	0.088	0.051	0.109	0.051	0.110
	ARE	1.000	1.000	1.345	1.131	0.943	0.889	1.134	1.101	1.137	1.111
$τ = 0.5$	Mean	−0.968	1.004	−1.000	0.0997	−0.970	1.003	−0.968	1.004	−0.968	1.003
	SD	0.040	0.092	0.046	0.104	0.036	0.081	0.046	0.105	0.047	0.106
	ARE	1.000	1.000	1.401	1.133	0.901	0.881	1.150	1.141	1.175	1.152
$τ = 0.75$	Mean	−0.433	1.006	−0.467	0.996	−0.447	1.007	−0.432	1.007	−0.428	1.006
	SD	0.044	0.101	0.051	0.119	0.039	0.090	0.053	0.124	0.052	0.123
	ARE	1.000	1.000	1.392	1.182	0.942	0.891	1.205	1.228	1.187	1.218

Table 3. The bias (Bias), standard deviation (SD), and median absolute deviation (MAD) of the five types of quantile regression coefficient estimates at

τ = 0.25

.

Method	$θ_{0}$			$θ_{1}$			$θ_{2}$
Method	Bias	SD	MAD	Bias	SD	MAD	Bias	SD	MAD
full	−0.021	0.100	0.067	0.002	0.069	0.046	0.004	0.038	0.026
cc	0.001	0.122	0.086	−0.003	0.079	0.052	0.020	0.041	0.032
ipw	−0.039	0.149	0.081	−0.001	0.103	0.053	0.006	0.065	0.029
eei.0	−0.362	0.158	0.358	0.019	0.098	0.067	0.047	0.044	0.049
eei.1	−0.003	0.137	0.093	0.002	0.082	0.054	0.003	0.043	0.029
aipw.0	−0.026	0.120	0.086	0.002	0.081	0.054	0.005	0.041	0.028
aipw.1	−0.024	0.115	0.082	0.002	0.078	0.052	0.005	0.040	0.027

Table 4. The bias (Bias), standard deviation (SD), and median absolute deviation (MAD) of the five types of quantile regression coefficient estimates at

τ = 0.5

.

Method	$θ_{0}$			$θ_{1}$			$θ_{2}$
Method	Bias	SD	MAD	Bias	SD	MAD	Bias	SD	MAD
full	0.001	0.087	0.056	0.001	0.064	0.044	0.001	0.034	0.022
cc	0.157	1.089	0.076	−0.039	0.398	0.048	−0.003	0.151	0.027
ipw	0.029	0.103	0.067	−0.017	0.395	0.047	−0.001	0.065	0.024
eei.0	−0.089	0.118	0.104	0.006	0.076	0.048	0.013	0.039	0.026
eei.1	0.007	0.127	0.085	0.001	0.077	0.049	0.001	0.039	0.026
aipw.0	−0.008	0.159	0.066	0.001	0.110	0.047	0.002	0.037	0.025
aipw.1	−0.002	0.124	0.069	0.003	0.079	0.046	0.002	0.036	0.024

Table 5. The bias (Bias), standard deviation (SD), and median absolute deviation (MAD) of the five types of quantile regression coefficient estimates at

τ = 0.75

.

Method	$θ_{0}$			$θ_{1}$			$θ_{2}$
Method	Bias	SD	MAD	Bias	SD	MAD	Bias	SD	MAD
full	0.019	0.095	0.063	0.001	0.072	0.048	−0.001	0.037	0.025
cc	0.048	0.117	0.082	−0.003	0.080	0.054	0.009	0.041	0.029
ipw	0.011	0.246	0.070	0.002	0.079	0.051	0.001	0.049	0.026
eei.0	0.092	0.127	0.106	−0.003	0.084	0.055	−0.009	0.041	0.029
eei.1	0.019	0.131	0.084	−0.001	0.081	0.054	−0.002	0.0419	0.029
aipw.0	−0.047	0.510	0.073	0.020	0.361	0.052	0.001	0.072	0.026
aipw.1	−0.017	0.424	0.072	0.006	0.248	0.052	−0.001	0.064	0.025

Table 6. Analysis results of the ACTG175 dataset.

Covariate	AIPW Estimator		IPW Estimator		EEI Estimator		CC Estimator
Covariate	Est	CI	Est	CI	Est	CI	Est	CI
$τ = 0.25$
intercept	−0.527	(−0.576, −0.478)	−0.493	(−0.656, −0.362)	−0.528	(−0.708, −0.416)	−0.509	(−0.671, −0.386)
age	−0.001	(−0.019, 0.021)	0.001	(−0.058, 0.056)	0.091	(0.029, 0.141)	−0.006	(−0.061, 0.059)
wtkg	0.007	(−0.012, 0.026)	0.029	(−0.025, 0.083)	−0.147	(−0.198, −0.085)	0.031	(−0.025, 0.091)
race	−0.057	(−0.091, −0.022)	−0.089	(−0.218, 0.026)	−0.056	(−0.152, 0.032)	−0.092	(−0.204, 0.027)
gender	−0.002	(−0.042, 0.037)	−0.061	(−0.148, 0.075)	0.024	(−0.079, 0.176)	−0.057	(−0.131, 0.076)
history	−0.231	(−0.266, −0.197)	−0.234	(−0.349, −0.142)	−0.216	(−0.293, −0.123)	−0.227	(−0.331, −0.134)
offtrt	−0.549	(−0.584, −0.513)	−0.528	(−0.716, −0.416)	−0.399	(−0.449, −0.268)	−0.553	(−0.717, −0.434)
${CD 4}_{0}$	0.474	(0.459, 0.489)	0.496	(0.446, 0.549)	0.472	(0.416, 0.506)	0.493	(0.442, 0.543)
trt	0.367	(0.332, 0.401)	0.369	(0.254, 0.475)	0.239	(0.168, 0.344)	0.377	(0.261, 0.479)
$τ = 0.5$
intercept	−0.072	(−0.144, −0.001)	−0.006	(−0.221, 0.199)	0.025	(−0.128, 0.147)	−0.008	(−0.253, 0.180)
age	−0.021	(−0.042, −0.001)	−0.045	(−0.091, 0.031)	0.134	(0.076, 0.179)	−0.051	(−0.091, 0.029)
wtkg	−0.005	(−0.024, 0.012)	0.016	(−0.035, 0.091)	−0.175	(−0.210, −0.104)	0.016	(−0.032, 0.095)
race	−0.084	(−0.121, −0.041)	−0.115	(−0.251, 0.017)	−0.051	(−0.182, 0.049)	−0.134	(−0.246, 0.013)
gender	−0.013	(−0.068, 0.042)	−0.046	(−0.212, 0.113)	−0.002	(−0.116, 0.096)	−0.066	(−0.197, 0.108)
history	−0.243	(−0.279, −0.208)	−0.276	(−0.406, −0.148)	−0.217	(−0.285, −0.099)	−0.287	(−0.407, −0.160)
offtrt	−0.384	(−0.422, −0.345)	−0.493	(−0.621, −0.302)	−0.321	(−0.383, −0.183)	−0.500	(−0.623, −0.331)
${CD 4}_{0}$	0.509	(0.493, 0.529)	0.531	(0.481, 0.597)	0.523	(0.492, 0.571)	0.517	(0.478, 0.585)
trt	0.372	(0.329, 0.415)	0.366	(0.254, 0.524)	0.231	(0.136, 0.306)	0.385	(0.261, 0.530)
$τ = 0.75$
intercept	0.456	(0.397, 0.515)	0.553	(0.319, 0.721)	0.591	(0.506, 0.896)	0.547	(0.328, 0.692)
age	0.037	(0.013, 0.059)	0.005	(−0.051, 0.068)	0.181	(0.119, 0.223)	0.009	(−0.051, 0.069)
wtkg	0.026	(0.004, 0.047)	0.045	(−0.009, 0.102)	−0.163	(−0.207, −0.106)	0.046	(−0.007, 0.106)
race	−0.122	(−0.158, −0.087)	−0.182	(−0.291, −0.057)	−0.095	(−0.131, 0.068)	−0.188	(−0.289, −0.056)
gender	0.018	(−0.029, 0.066)	−0.039	(−0.183, 0.114)	0.041	(−0.205, 0.132)	−0.037	(−0.195, 0.084)
history	−0.207	(−0.242, −0.173)	−0.250	(−0.350, −0.139)	−0.221	(−0.306, −0.109)	−0.255	(−0.348, −0.143)
offtrt	−0.214	(−0.251, −0.177)	−0.403	(−0.559, −0.230)	−0.226	(−0.358, −0.114)	−0.457	(−0.579, −0.266)
${CD 4}_{0}$	0.557	(0.536, 0.577)	0.566	(0.514, 0.674)	0.539	(0.489, 0.589)	0.559	(0.505, 0.649)
trt	0.283	(0.235, 0.330)	0.316	(0.181, 0.438)	0.142	(0.118, 0.179)	0.318	(0.203, 0.447)

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

