Article

Variational Bayesian EM Algorithm for Quantile Regression in Linear Mixed Effects Models

1 School of Statistics and Mathematics, Central University of Finance and Economics, Beijing 102206, China
2 Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830011, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(21), 3311; https://doi.org/10.3390/math12213311
Submission received: 11 September 2024 / Revised: 2 October 2024 / Accepted: 21 October 2024 / Published: 22 October 2024
(This article belongs to the Section Probability and Statistics)

Abstract

This paper extends the normal-beta prime (NBP) prior to Bayesian quantile regression in linear mixed effects models and conducts Bayesian variable selection for the fixed effects of the model. The choice of hyperparameters in the NBP prior is crucial, and we employed the Variational Bayesian Expectation–Maximization (VBEM) algorithm for model estimation and variable selection. The Gibbs sampling algorithm is a commonly used Bayesian method, and it can also be combined with the EM algorithm, denoted as GBEM. The results from our simulation and real data analysis demonstrate that both the VBEM and GBEM algorithms provide robust estimates for the hyperparameters in the NBP prior, reflecting the sparsity level of the true model. The VBEM and GBEM algorithms exhibit comparable accuracy and can effectively select important explanatory variables. The VBEM algorithm stands out in terms of computational efficiency, significantly reducing the time and resource consumption in the Bayesian analysis of high-dimensional, longitudinal data.

1. Introduction

Regression modeling for high-dimensional data has become increasingly prevalent in modern statistical applications. In such scenarios, the number of covariates, denoted as p , often far exceeds the number of observations, denoted as n . Many of these covariates are typically irrelevant, and the inclusion of numerous covariates may adversely affect parameter estimation, leading to what is known as the “curse of dimensionality” [1]. Therefore, variable selection is crucial in modeling high-dimensional data. Current variable selection methods can be broadly categorized into two classes. The first class comprises penalty likelihood methods, including classical techniques such as Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Least Absolute Shrinkage and Selection Operator (LASSO), and Smoothly Clipped Absolute Deviation (SCAD). These methods minimize the penalized function of the logarithmic likelihood and regression coefficients. By selecting an appropriate penalty function, certain components of regression coefficients are forced to be zero, achieving simultaneous parameter estimation and variable selection. The second class involves Bayesian methods, which assign a hierarchical prior to all unknown parameters. Bayesian inference is then conducted by combining the prior on unknown parameters with the model likelihood, yielding posterior distributions for the parameters. A notable advantage of Bayesian methods is the ability to obtain posterior distributions for all submodels. Additionally, in terms of prediction, model combinations often outperform individual models. Consequently, Bayesian variable selection methods are advantageous.
Mixed effects models are employed for modeling and analyzing longitudinal data. Furthermore, there is a growing interest in quantile regression for longitudinal data. Koenker [2] first applied quantile regression to analyze longitudinal data, and Bayesian quantile regression was initially proposed by Yu and Moyeed [3]. Luo et al. [4] extended Bayesian quantile regression to longitudinal data analysis, while Sriram et al. [5] utilized semiparametric Bayesian quantile regression for modeling longitudinal data, applied to insurance cost data. Chen et al. [6] incorporated spike-and-slab priors into Bayesian variable selection for quantile regression, Alhamzawi [7] employed the adaptive LASSO penalty in Bayesian quantile regression, and Li et al. [8] unified the treatment of common penalties, including LASSO, group LASSO, and elastic net, in Bayesian quantile regression. Ji et al. [9] applied spike-and-slab priors for Bayesian variable selection in quantile regression within mixed effects models. In Bayesian variable selection for quantile regression, posterior inference is typically carried out using Markov Chain Monte Carlo (MCMC) algorithms.
In the case of large datasets, this process can be computationally intensive, so a faster alternative to MCMC is needed, and Variational Bayes provides an excellent one. Huang et al. [10] proposed a Variational Bayesian algorithm for Bayesian variable selection in linear regression models that converges rapidly and extends to high-dimensional datasets. For various applications of Variational Bayesian algorithms in different models, refer to Neville [11]. Regarding high-dimensional data, Lim et al. [12] first introduced the horseshoe prior in high-dimensional linear quantile regression and obtained posterior distributions using Variational Bayesian algorithms. Ray et al. applied spike-and-slab priors with Variational Bayesian algorithms to obtain posterior distributions in high-dimensional linear [13] and logistic [14] regression. A comprehensive comparison and discussion of Variational Bayesian and MCMC methods is available in the review by Blei et al. [15].
In the literature on Variational Bayesian quantile regression, Dai et al. [16] proposed high-dimensional quantile regression with a spike-and-slab lasso penalty based on Variational Bayes (VBSSLQR), Wang et al. [17] proposed a coordinate ascent variational inference (CAVI) algorithm for Tobit quantile regression, and Li et al. [18] employed the Variational Bayesian algorithm for quantile regression models with nonignorable missing data while incorporating variable selection. The assumption of a sparse, high-dimensional model underlying these methods is generally reasonable, but it is not guaranteed to hold. Furthermore, the choice of hyperparameters in the model priors may influence the results of posterior estimation. To address these issues, Bai and Ghosh [19] introduced the NBP prior in high-dimensional linear models and derived effective GBEM and VBEM algorithms.
This paper extends the NBP prior to Bayesian quantile regression in linear mixed effects models and conducts Bayesian variable selection for the fixed effects in the regression model. The hyperparameters in the NBP prior determine the sparsity level of the resulting model. The VBEM algorithm is employed to estimate the hyperparameters of the NBP prior based on the data, thereby providing parameter estimates and variable selection results for the regression model. The remainder of this paper is structured as follows. Section 2 introduces Bayesian quantile regression in linear mixed effects models and the hierarchical representation of the NBP prior. In Section 3, the corresponding Variational Bayesian EM algorithm is proposed. Section 4 presents a simulation study of the proposed algorithm, while Section 5 applies the Variational Bayesian EM algorithm to a real data analysis. Finally, Section 6 concludes with the relevant findings.

2. Model Assumptions

In general, the linear mixed effects model takes the following form:
$$ y_{ij} = X_{ij}^T\beta + Z_{ij}^T\alpha_i + \varepsilon_{ij}, \quad i = 1, \ldots, n, \; j = 1, \ldots, n_i, \; \sum_i n_i = N, \tag{1} $$
where $y_{ij}$ is the response variable for the $j$-th observation of individual $i$, $X_{ij}$ denotes the covariates for the $j$-th observation of individual $i$, and $Z_{ij}$ represents the covariates corresponding to the random effects, with $Z_{ij} \subseteq X_{ij}$. The fixed effect coefficients $\beta = (\beta_1, \ldots, \beta_p)^T$ are unknown parameters, with the typical assumption that $\beta$ is sparse. The random effect coefficient $\alpha_i$ is a $k \times 1$ vector following a multivariate normal distribution, $\alpha_i \sim N_k(0, A)$. The term $\varepsilon_{ij}$ represents the random error in (1).
For a given quantile τ , the definition of the quantile function for the linear mixed effects model is as follows:
$$ Q_{y_{ij}}(\tau \mid X_{ij}, Z_{ij}, \alpha_i) = X_{ij}^T\beta + Z_{ij}^T\alpha_i. $$
In Bayesian quantile regression, it is often assumed that $y_{ij}$ follows an asymmetric Laplace (AL) distribution, where
$$ f(y_{ij} \mid \beta, \alpha_i, \sigma) = \frac{\tau(1-\tau)}{\sigma} \exp\left\{ -\rho_\tau\!\left( \frac{y_{ij} - X_{ij}^T\beta - Z_{ij}^T\alpha_i}{\sigma} \right) \right\}, $$
and $\rho_\tau(u) = u\{\tau - I(u < 0)\}$ is the quantile check function.
According to the findings of Kozumi and Kobayashi [20], the asymmetric Laplace (AL) distribution is expressed as a mixture of a normal distribution and an exponential distribution.
$$ y_{ij} \mid \beta, \alpha_i, \sigma, e_{ij} \sim N\!\left( X_{ij}^T\beta + Z_{ij}^T\alpha_i + k_1 e_{ij},\; \sigma k_2 e_{ij} \right), \qquad e_{ij} \mid \sigma \sim \mathrm{Exp}(1/\sigma), $$
where $k_1 = (1 - 2\tau)/\{\tau(1-\tau)\}$ and $k_2 = 2/\{\tau(1-\tau)\}$.
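As a quick check of this mixture representation, the following minimal R sketch (ours, not the paper's code; the location mu stands in for $X_{ij}^T\beta + Z_{ij}^T\alpha_i$) simulates the latent exponential scales and verifies that the $\tau$-th quantile of the simulated responses sits at the location parameter:

```r
# Minimal sketch (not from the paper): simulating the asymmetric Laplace
# response through the normal-exponential mixture of Kozumi and Kobayashi.
set.seed(1)
tau   <- 0.5
sigma <- 1
k1 <- (1 - 2 * tau) / (tau * (1 - tau))
k2 <- 2 / (tau * (1 - tau))
mu <- 0                                  # stands in for X_ij' beta + Z_ij' alpha_i
e  <- rexp(1e5, rate = 1 / sigma)        # latent exponential scale, mean sigma
y  <- rnorm(1e5, mean = mu + k1 * e, sd = sqrt(sigma * k2 * e))
# Under this mixture, the tau-th quantile of y should sit at mu:
quantile(y, probs = tau)
```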
For the fixed effect coefficient β , considering the NBP prior, the hierarchical representation is given by
$$ \beta_l \mid \omega_l^2 \sim N(0, \omega_l^2), \quad l = 1, \ldots, p, $$
$$ \omega_l^2 \sim \mathrm{BP}(a, b), \quad l = 1, \ldots, p. $$
The notation $\mathrm{BP}(a, b)$ represents the beta prime distribution with parameters $a$ and $b$. The probability density function of this distribution is given by
$$ \pi(\omega_l^2) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)} \left(\omega_l^2\right)^{a-1} \left(1 + \omega_l^2\right)^{-a-b}. $$
Specifically, when $a = b = 0.5$, $\omega_l$ follows a half-Cauchy distribution, and the prior for $\beta$ becomes the horseshoe prior. Therefore, the NBP prior can be regarded as a generalization of the horseshoe prior. When $0 < a \leq 0.5$, Bai and Ghosh [19] demonstrated that zero is a singularity of $\pi(\beta_l)$, i.e., $\pi(\beta_l)$ is unbounded at this point. The parameter $a$ determines the sparsity of $\beta$, with smaller values of $a$ resulting in greater sparsity. Figure 1 illustrates the density plot of the beta prime prior for different values of $(a, b)$. Thus, fixed hyperparameters $(a, b)$ may not capture the true sparsity level of the model. Bai and Ghosh recommend using the marginal maximum likelihood (MML) method to estimate $(a, b)$ from the given data, subsequently achieving the genuine sparsity level of the model.
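The following minimal R sketch (not the paper's code) plots the beta prime density above for a few $(a, b)$ pairs, in the spirit of Figure 1; the particular pairs are chosen only for illustration:

```r
# Minimal sketch (not from the paper): the beta prime density used in Figure 1,
# plotted for a few (a, b) pairs to show how smaller a concentrates mass near zero.
dbetaprime <- function(x, a, b) {
  exp(lgamma(a + b) - lgamma(a) - lgamma(b)) * x^(a - 1) * (1 + x)^(-a - b)
}
x <- seq(0.01, 5, length.out = 500)
plot(x, dbetaprime(x, 0.5, 0.5), type = "l", ylim = c(0, 2),
     xlab = expression(omega^2), ylab = "density")
lines(x, dbetaprime(x, 0.1, 0.5), lty = 2)   # sparser prior (small a)
lines(x, dbetaprime(x, 2,   1),   lty = 3)   # denser prior (large a)
legend("topright", legend = c("a=0.5, b=0.5", "a=0.1, b=0.5", "a=2, b=1"), lty = 1:3)
```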
The NBP prior can also be expressed as the product of a gamma and an inverse gamma random variable:
$$ \beta_l \mid \omega_l^2 \sim N(0, \omega_l^2), \quad \omega_l^2 = \lambda_l^2 \xi_l^2, \quad l = 1, \ldots, p, $$
$$ \lambda_l^2 \sim \mathrm{Ga}(a, 1), \quad l = 1, \ldots, p, $$
$$ \xi_l^2 \sim \mathrm{IG}(b, 1), \quad l = 1, \ldots, p. $$
Here, $\mathrm{Ga}(a, b)$ and $\mathrm{IG}(a, b)$ represent the gamma and inverse gamma distributions, respectively, with shape parameter $a$ and scale parameter $b$.
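A small simulation (a sketch of ours, not from the paper) can confirm that this product representation reproduces the $\mathrm{BP}(a, b)$ density given above:

```r
# Minimal sketch (not from the paper): checking by simulation that
# lambda^2 * xi^2, with lambda^2 ~ Ga(a, 1) and xi^2 ~ IG(b, 1),
# follows the beta prime BP(a, b) distribution.
set.seed(1)
a <- 0.5; b <- 0.5
lambda2 <- rgamma(1e5, shape = a, rate = 1)
xi2     <- 1 / rgamma(1e5, shape = b, rate = 1)   # inverse gamma via reciprocal gamma
omega2  <- lambda2 * xi2
hist(omega2[omega2 < 5], breaks = 100, freq = FALSE, main = "", xlab = expression(omega^2))
curve(exp(lgamma(a + b) - lgamma(a) - lgamma(b)) * x^(a - 1) * (1 + x)^(-a - b),
      add = TRUE, lwd = 2)                        # overlay the BP(a, b) density
```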
For the random effect coefficients $\alpha_i \sim N_k(0, A)$, to avoid estimating an excessive number of parameters, we assume $A = \phi^2 I$, where $I$ represents the $k \times k$ identity matrix. The prior for the hyperparameter $\phi^2$ is $\phi^2 \sim \mathrm{IG}(g, h)$, and the prior for $\sigma$ is $\sigma \sim \mathrm{IG}(c, d)$.
According to Bayes’ theorem, the logarithm of the posterior for the parameters is given by
$$
\begin{aligned}
& -\frac{N + p + nk}{2}\log 2\pi - \frac{N}{2}\log k_2 - \left(\frac{3N}{2} + c + 1\right)\log\sigma
 - \frac{1}{2\sigma k_2}\sum_{ij}\frac{\left(y_{ij} - X_{ij}^T\beta - Z_{ij}^T\alpha_i - k_1 e_{ij}\right)^2}{e_{ij}} \\
& \quad - \left(\frac{nk}{2} + g + 1\right)\log\phi^2 - \sum_i \frac{\alpha_i^T\alpha_i}{2\phi^2}
 - \sum_{l=1}^{p}\frac{\beta_l^2}{2\lambda_l^2\xi_l^2}
 - p\log\Gamma(a) + \left(a - \frac{3}{2}\right)\sum_{l=1}^{p}\log\lambda_l^2 - \sum_{l=1}^{p}\lambda_l^2 \\
& \quad - p\log\Gamma(b) - \left(b + \frac{3}{2}\right)\sum_{l=1}^{p}\log\xi_l^2 - \sum_{l=1}^{p}\frac{1}{\xi_l^2}
 + c\log d - \log\Gamma(c) - \frac{d}{\sigma}
 - \frac{1}{2}\sum_{ij}\log e_{ij} - \frac{1}{\sigma}\sum_{ij} e_{ij}
 + g\log h - \log\Gamma(g) - \frac{h}{\phi^2}.
\end{aligned}
$$

3. Variational Bayesian EM Algorithm

Let $\theta \in \Theta$ denote the model parameters, let $D$ be the observed data, and let $p(\theta \mid D)$ be the posterior distribution of $\theta$. Variational Bayesian methods aim to find a distribution $q(\theta)$, within a given family of distributions $\mathcal{F}$, minimizing the Kullback–Leibler (KL) divergence between $q(\theta)$ and $p(\theta \mid D)$. Denoting the prior distribution of the parameters by $p(\theta)$, we have
$$ \log p(D) = \int q(\theta)\,\log\frac{p(\theta)\, p(D \mid \theta)}{q(\theta)}\, d\theta + \int q(\theta)\,\log\frac{q(\theta)}{p(\theta \mid D)}\, d\theta. \tag{3} $$
In Equation (3), the first term is the evidence lower bound (ELBO), and the second term is the Kullback–Leibler (KL) divergence between $q(\theta)$ and $p(\theta \mid D)$, denoted as $\mathrm{KL}(q \,\|\, p)$. The logarithm of the marginal likelihood of the observed data, $\log p(D)$, does not depend on the choice of $q(\theta)$. Therefore, minimizing $\mathrm{KL}(q \,\|\, p)$ is equivalent to maximizing the evidence lower bound:
$$ \hat{q}(\theta) = \arg\max_{q(\theta) \in \mathcal{F}} \left\{ E_{q(\theta)}\big[\log p(D, \theta)\big] - E_{q(\theta)}\big[\log q(\theta)\big] \right\}. \tag{4} $$
For ease of solving for $\hat{q}(\theta)$, consider mean-field Variational Bayes (MFVB). Let $\hat{q}(\theta) = \prod_{t=1}^{T} q_t(\theta_t)$, and denote $\theta_{-t} = (\theta_1, \ldots, \theta_{t-1}, \theta_{t+1}, \ldots, \theta_T)$. According to the results of Blei et al. [15], the optimal $q(\theta_t)$ is given by
$$ q^*(\theta_t) \propto \exp\left\{ E_{\theta_{-t}}\big[ \log p(\theta_t \mid D, \theta_{-t}) \big] \right\}. \tag{5} $$
In this paper, the parameters under consideration are
$$ \theta = \left\{ \beta, \; \{\lambda_l^2, \xi_l^2\}_{l=1}^{p}, \; \{e_{ij}\}_{i=1,\, j=1}^{n,\, n_i}, \; \sigma, \; \phi^2, \; \{\alpha_i\}_{i=1}^{n} \right\}. $$
Under the assumptions of MFVB, the Variational Bayesian posterior of θ can be decomposed as
$$ q(\theta) = q_1(\beta) \prod_{l=1}^{p} q_2(\lambda_l^2)\, q_3(\xi_l^2) \prod_{i,j} q_4(e_{ij})\, q_5(\sigma)\, q_6(\phi^2) \prod_{i=1}^{n} q_7(\alpha_i). $$
According to Equation (5), $q_1^*(\beta) \sim N_p(\beta^*, \Sigma^*)$, where
$$ \Sigma^* = \left( \frac{E_{q_5^*}[1/\sigma]}{k_2} \sum_{ij} X_{ij} X_{ij}^T\, E_{q_4^*}[1/e_{ij}] + D^* \right)^{-1}, $$
$$ D^* = \mathrm{diag}\!\left( E_{q_2^*}[1/\lambda_1^2]\, E_{q_3^*}[1/\xi_1^2], \ldots, E_{q_2^*}[1/\lambda_p^2]\, E_{q_3^*}[1/\xi_p^2] \right), $$
$$ \beta^* = \Sigma^*\, \frac{E_{q_5^*}[1/\sigma]}{k_2} \sum_{ij} X_{ij} \left\{ E_{q_4^*}[1/e_{ij}] \left( y_{ij} - Z_{ij}^T E_{q_7^*}[\alpha_i] \right) - k_1 \right\}. $$
Furthermore, $q_2^*(\lambda_l^2) \sim \mathrm{GIG}(m^* = a - 0.5,\; s_l^*,\; 2)$ and $q_3^*(\xi_l^2) \sim \mathrm{IG}(u^* = b + 0.5,\; v_l^*)$,
where
$$ s_l^* = E_{q_1^*}[\beta_l^2]\, E_{q_3^*}[1/\xi_l^2], \qquad v_l^* = \frac{1}{2} E_{q_1^*}[\beta_l^2]\, E_{q_2^*}[1/\lambda_l^2] + 1, \quad l = 1, \ldots, p. $$
Next, $q_4^*(e_{ij}) \sim \mathrm{GIG}(0.5,\; \chi_{ij}^*,\; \psi^*)$,
where
$$ \chi_{ij}^* = \frac{1}{k_2} E_q\!\left[ \frac{\left(y_{ij} - X_{ij}^T\beta - Z_{ij}^T\alpha_i\right)^2}{\sigma} \right], \qquad \psi^* = E_{q_5^*}\!\left[\frac{1}{\sigma}\right] \frac{k_1^2 + 2 k_2}{k_2}. $$
Here, $\mathrm{GIG}(\rho, \chi, \psi)$ denotes the generalized inverse Gaussian (GIG) distribution, whose density is
$$ \mathrm{GIG}(x \mid \rho, \chi, \psi) \propto x^{\rho - 1} \exp\{ -0.5\,(\chi / x + \psi x) \}. $$
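The E-step below requires the first and inverse moments of GIG variates. As a minimal sketch (ours, not the paper's code), these standard GIG moment formulas can be evaluated in R with the built-in Bessel function:

```r
# Minimal sketch (not from the paper): E[X] and E[1/X] for X ~ GIG(rho, chi, psi),
# using the standard moment formulas written in terms of besselK.
gig_moments <- function(rho, chi, psi) {
  s    <- sqrt(chi * psi)
  r    <- sqrt(chi / psi)
  m1   <- r * besselK(s, rho + 1) / besselK(s, rho)                      # E[X]
  minv <- (1 / r) * besselK(s, rho + 1) / besselK(s, rho) - 2 * rho / chi  # E[1/X]
  c(mean = m1, inv_mean = minv)
}
gig_moments(rho = 0.5, chi = 1.2, psi = 3.0)
```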
Similarly, $q_5^*(\sigma) \sim \mathrm{IG}(c^*, d^*)$,
where
$$ c^* = \frac{3N}{2} + c, \qquad d^* = \frac{1}{2 k_2} \sum_{ij} E_q\!\left[ \frac{\left(y_{ij} - X_{ij}^T\beta - Z_{ij}^T\alpha_i - k_1 e_{ij}\right)^2}{e_{ij}} \right] + \sum_{ij} E_{q_4^*}[e_{ij}] + d. $$
Also, $q_6^*(\phi^2) \sim \mathrm{IG}(g^*, h^*)$,
where
$$ g^* = \frac{nk}{2} + g, \qquad h^* = \frac{1}{2} E_{q_7^*}\!\left[ \sum_i \alpha_i^T \alpha_i \right] + h. $$
Finally, $q_7^*(\alpha_i) \sim N_k(\alpha^*, A^*)$,
where
$$ A^* = \left( \frac{E_{q_5^*}[1/\sigma]}{k_2} \sum_j Z_{ij} Z_{ij}^T\, E_{q_4^*}[1/e_{ij}] + E_{q_6^*}\!\left[\frac{1}{\phi^2}\right] I \right)^{-1}, $$
$$ \alpha^* = A^*\, \frac{E_{q_5^*}[1/\sigma]}{k_2} \sum_j Z_{ij} \left\{ E_{q_4^*}[1/e_{ij}] \left( y_{ij} - X_{ij}^T E_{q_1^*}[\beta] \right) - k_1 \right\}. $$
The computation of ELBO is as follows:
$$
\begin{aligned}
L ={}& -\frac{N}{2}\log(2\pi k_2) + (p + N)\log 2 + c\log d - \log\Gamma(c) - p\log\Gamma(a) - p\log\Gamma(b)
 + p + \left(\frac{nk}{2} + g\right)\log h - \log\Gamma(g) \\
& - c^*\log d^* + \log\Gamma(c^*)
 - \sum_{l=1}^{p} \log\!\left[ \left(\frac{2}{s_l^*}\right)^{m^*/2} K_{m^*}\!\left(\sqrt{2 s_l^*}\right) \right]
 - u^* \sum_{l=1}^{p} \log v_l^* + p\log\Gamma(u^*) + \frac{1}{2}\log|\Sigma^*| \\
& + \sum_{l=1}^{p} (v_l^* - 1)\, E_{q_3^*}[1/\xi_l^2]
 - \sum_{ij} \log\!\left[ \left(\frac{\psi^*}{\chi_{ij}^*}\right)^{1/4} K_{1/2}\!\left(\sqrt{\psi^*\chi_{ij}^*}\right) \right]
 + \sum_{ij} \left\{ \chi_{ij}^*\, E_{q_4^*}[1/e_{ij}] + \psi^*\, E_{q_4^*}[e_{ij}] \right\} \\
& - g^*\log h^* + \log\Gamma(g^*) + \frac{1}{2}\log|A^*|,
\end{aligned}
$$
where $\psi(x) = \frac{d}{dx}\log\Gamma(x)$ is the digamma function, and $K_v(\cdot)$ is the modified Bessel function of the second kind.
Using the EM algorithm to obtain the MML estimates of the parameters $(a, b)$, the E-step, given $\nu^{(t-1)} = (a^{(t-1)}, b^{(t-1)})$, is
$$ Q\!\left(\nu \mid \nu^{(t-1)}\right) = -p\log\Gamma(a) + a \sum_{l=1}^{p} E_{a^{(t-1)}}[\log\lambda_l^2] - p\log\Gamma(b) - b \sum_{l=1}^{p} E_{b^{(t-1)}}[\log\xi_l^2] + C. $$
Here, $C$ represents a constant that does not depend on $(a, b)$.
M-step:
$$ \nu^{(t)} = \arg\max_{\nu = (a, b)} Q\!\left(\nu \mid \nu^{(t-1)}\right). $$
Taking the derivatives of $Q(\nu \mid \nu^{(t-1)})$ with respect to $a$ and $b$, respectively, and setting them to zero gives
$$ \frac{\partial Q}{\partial a} = -p\,\psi(a) + \sum_{l=1}^{p} E_{a^{(t-1)}}[\log\lambda_l^2] = 0, \qquad \frac{\partial Q}{\partial b} = -p\,\psi(b) - \sum_{l=1}^{p} E_{b^{(t-1)}}[\log\xi_l^2] = 0. $$
Bai and Ghosh [19] also demonstrated the existence and uniqueness of the $\nu^{(t)}$ sought at each iteration. We updated the parameters in the Variational Bayesian posterior through a coordinate ascent algorithm. The expectations $E_{a^{(t-1)}}[\log\lambda_l^2]$ and $E_{b^{(t-1)}}[\log\xi_l^2]$ can be replaced by $E_{q_2^{*(t-1)}}[\log\lambda_l^2]$ and $E_{q_3^{*(t-1)}}[\log\xi_l^2]$, respectively. This leads to the Variational Bayesian EM (VBEM) algorithm; for details, please refer to Algorithm 1, where
$$ E_{q_2^{*(t-1)}}[\log\lambda_l^2] = \frac{1}{2}\log\!\left(\frac{s_l^{*(t-1)}}{2}\right) + \left. \frac{\partial}{\partial\nu}\log K_{\nu}\!\left(\sqrt{2 s_l^{*(t-1)}}\right) \right|_{\nu = m^{*(t-1)}}, $$
$$ E_{q_3^{*(t-1)}}[\log\xi_l^2] = \log\!\left(v_l^{*(t-1)}\right) - \psi\!\left(u^{*(t-1)}\right). $$
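As an illustration of the M-step (a minimal sketch in our notation, not the authors' implementation; the expectation vectors below are placeholders), the two score equations can be solved with a one-dimensional root finder such as uniroot:

```r
# Minimal sketch (not from the paper): the M-step solves
# p*digamma(a) = sum(E[log lambda_l^2]) and p*digamma(b) = -sum(E[log xi_l^2]).
update_hyperpar <- function(E_log_lambda2, E_log_xi2, upper = 1e4) {
  p <- length(E_log_lambda2)
  a_new <- uniroot(function(a) p * digamma(a) - sum(E_log_lambda2),
                   interval = c(1e-8, upper))$root
  b_new <- uniroot(function(b) p * digamma(b) + sum(E_log_xi2),
                   interval = c(1e-8, upper))$root
  c(a = a_new, b = b_new)
}
# Example with made-up expectation values:
update_hyperpar(E_log_lambda2 = rnorm(10, -3, 0.5), E_log_xi2 = rnorm(10, -1, 0.2))
```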
Algorithm 1: Variational Bayesian EM Algorithm
1: Initialize the Variational Bayesian parameters: $c^* = \frac{3N}{2} + c$, $g^* = \frac{nk}{2} + g$, $d^{*(0)}$, $h^{*(0)}$, $s_l^{*(0)}$, $v_l^{*(0)}$, $\chi_{ij}^{*(0)}$, $\psi^{*(0)}$
2: Set initial values $a^{(0)} = b^{(0)} = 0.01$, $\delta = 10^{-3}$, $J = 1000$, $t = 1$, $L^{(0)} = 0$
3: while $|L^{(t)} - L^{(t-1)}| \geq \delta$ and $1 \leq t \leq J$ do
4: E-step: update the Variational Bayesian parameters
5: $m^{*(t)} \leftarrow a^{(t-1)} - \frac{1}{2}$
6: $u^{*(t)} \leftarrow b^{(t-1)} + \frac{1}{2}$
7: $D^{*(t)} \leftarrow \mathrm{diag}\!\left( E_{q_2^{*(t-1)}}[1/\lambda_1^2]\, E_{q_3^{*(t-1)}}[1/\xi_1^2], \ldots, E_{q_2^{*(t-1)}}[1/\lambda_p^2]\, E_{q_3^{*(t-1)}}[1/\xi_p^2] \right)$
8: $\Sigma^{*(t)} \leftarrow \left( \frac{E_{q_5^{*(t-1)}}[1/\sigma]}{k_2} \sum_{ij} X_{ij} X_{ij}^T\, E_{q_4^{*(t-1)}}[1/e_{ij}] + D^{*(t)} \right)^{-1}$
9: $\beta^{*(t)} \leftarrow \Sigma^{*(t)}\, \frac{E_{q_5^{*(t-1)}}[1/\sigma]}{k_2} \sum_{ij} X_{ij} \left\{ E_{q_4^{*(t-1)}}[1/e_{ij}] \left( y_{ij} - Z_{ij}^T E_{q_7^{*(t-1)}}[\alpha_i] \right) - k_1 \right\}$
10: $s_l^{*(t)} \leftarrow E_{q_1^{*(t-1)}}[\beta_l^2]\, E_{q_3^{*(t-1)}}[1/\xi_l^2]$, $l = 1, \ldots, p$
11: $v_l^{*(t)} \leftarrow \frac{1}{2} E_{q_1^{*(t-1)}}[\beta_l^2]\, E_{q_2^{*(t-1)}}[1/\lambda_l^2] + 1$, $l = 1, \ldots, p$
12: $d^{*(t)} \leftarrow \frac{1}{2 k_2} \sum_{ij} E_{q^{(t-1)}}\!\left[ \frac{(y_{ij} - X_{ij}^T\beta - Z_{ij}^T\alpha_i - k_1 e_{ij})^2}{e_{ij}} \right] + \sum_{ij} E_{q_4^{*(t-1)}}[e_{ij}] + d$
13: $h^{*(t)} \leftarrow \frac{1}{2} E_{q_7^{*(t-1)}}\!\left[ \sum_i \alpha_i^T \alpha_i \right] + h$
14: $A^{*(t)} \leftarrow \left( \frac{E_{q_5^{*(t-1)}}[1/\sigma]}{k_2} \sum_j Z_{ij} Z_{ij}^T\, E_{q_4^{*(t-1)}}[1/e_{ij}] + E_{q_6^{*(t-1)}}[1/\phi^2]\, I \right)^{-1}$
15: $\alpha^{*(t)} \leftarrow A^{*(t)}\, \frac{E_{q_5^{*(t-1)}}[1/\sigma]}{k_2} \sum_j Z_{ij} \left\{ E_{q_4^{*(t-1)}}[1/e_{ij}] \left( y_{ij} - X_{ij}^T E_{q_1^{*(t-1)}}[\beta] \right) - k_1 \right\}$
16: $\psi^{*(t)} \leftarrow E_{q_5^{*(t-1)}}[1/\sigma]\, \frac{k_1^2 + 2 k_2}{k_2}$, $\quad \chi_{ij}^{*(t)} \leftarrow \frac{1}{k_2} E_{q^{(t-1)}}\!\left[ \frac{(y_{ij} - X_{ij}^T\beta - Z_{ij}^T\alpha_i)^2}{\sigma} \right]$
17: M-step: update the hyperparameters $(a, b)$
18: $a^{(t)}$ is the solution of $-p\,\psi(a) + \sum_{l=1}^{p} E_{q_2^{*(t-1)}}[\log\lambda_l^2] = 0$
19: $b^{(t)}$ is the solution of $p\,\psi(b) + \sum_{l=1}^{p} E_{q_3^{*(t-1)}}[\log\xi_l^2] = 0$
20: Update $L^{(t)}$, $t \leftarrow t + 1$
21: end while

4. Simulation Study

Consider the generation of data from the following mixed effects model:
$$ y_{ij} = X_{ij}^T\beta + Z_{ij}^T\alpha_i + \varepsilon_{ij}, \quad i = 1, \ldots, 10, \; j = 1, \ldots, 5, $$
where $X_{ij} \sim N_p(0, I_{p \times p})$, $Z_{ij}$ consists of the first three elements of $X_{ij}$, and $\alpha_i \sim N_3(0, I_{3 \times 3})$. Two types of error terms are considered: (1) $\varepsilon_{ij} \sim N(0, 1)$; (2) $\varepsilon_{ij} \sim C(0, 1)$, the standard Cauchy distribution. The fixed effect vectors for the four simulation settings are listed below, and an R sketch of this data-generating process follows the list.
Simulation 1: $\beta = (5, \underbrace{0, \ldots, 0}_{9})$;
Simulation 2: $\beta = (\underbrace{5, \ldots, 5}_{5}, \underbrace{0, \ldots, 0}_{5})$;
Simulation 3: $\beta = (\underbrace{5, \ldots, 5}_{10})$;
Simulation 4: $\beta = (\underbrace{5, \ldots, 5}_{10}, \underbrace{0, \ldots, 0}_{90})$.
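A minimal sketch (ours, not the paper's code) of the data-generating process, shown here for Simulation 1 with normal errors:

```r
# Minimal sketch (not from the paper): generating longitudinal data from the
# mixed effects model above, for Simulation 1 with N(0,1) errors.
set.seed(2024)
n <- 10; ni <- 5; p <- 10
beta  <- c(5, rep(0, 9))                       # Simulation 1
id    <- rep(1:n, each = ni)
X     <- matrix(rnorm(n * ni * p), n * ni, p)
Z     <- X[, 1:3]                              # random-effect covariates: first 3 columns of X
alpha <- matrix(rnorm(n * 3), n, 3)            # alpha_i ~ N_3(0, I)
eps   <- rnorm(n * ni)                         # use rcauchy(n * ni) for the C(0,1) case
y     <- as.vector(X %*% beta) + rowSums(Z * alpha[id, ]) + eps
```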
We chose non-informative hyperparameter priors with $c = d = 10^{-5}$ and $g = h = 10^{-5}$. The Gibbs sampling algorithm is a commonly used Bayesian method, and it can also be combined with the EM algorithm; we denote this combination as GBEM and use it as a comparative method. Estimation accuracy and efficiency are measured by the mean squared error (MSE) of the $\beta$ estimates, the median of the mean absolute deviation (MMAD), and the runtime in seconds ($T$), recorded with the tic and toc functions in the R package "tictoc". The true positive rate (TPR) and the false discovery rate (FDR) were calculated to assess the effectiveness of variable selection. For each setup, we conducted 100 simulations; the tables report the averages over the 100 replications, with standard deviations in parentheses, and the MMAD is the median over the replications. The quantiles considered were $\tau = 0.25, 0.5, 0.75$. Other methods for comparison include Bayesian quantile regression for linear mixed effects models (BQRLM), quantile regression for linear mixed effects models (QRM), and LASSO-penalized quantile regression for linear mixed effects models (QRML); these methods are available in R through the "BeQut", "qrLMM", and "alqrfe" packages.
The NBP prior is a continuous shrinkage prior, and the posterior of β does not necessarily shrink to exactly zero. Commonly, confidence intervals are used for variable selection. Hahn and Carvalho [21] proposed the Decoupled Shrinkage and Selection (DSS) method, which was also employed in this study for variable selection. The DSS method can be considered as solving the following optimization problem:
$$ \hat{\gamma} = \arg\min_{\gamma} \; \left\| X\hat{\beta} - X\gamma \right\|^2 / n + \lambda \sum_{l=1}^{p} |\gamma_l| / |\hat{\beta}_l|. $$
Here, $\hat{\beta}$ is the posterior estimate of $\beta$, and the optimization problem can be solved with the R package "glmnet" of Friedman et al. [22].
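A minimal sketch (ours, not the paper's code) of this DSS step: an adaptive-lasso fit of the posterior-mean predictions on the design matrix, using glmnet's penalty.factor argument for the $1/|\hat{\beta}_l|$ weights; beta_hat and X are assumed to come from the fitted model, and cross-validated lambda selection is one possible choice, not necessarily the one used by the authors:

```r
# Minimal sketch (not from the paper): DSS variable selection via glmnet.
library(glmnet)
dss_select <- function(X, beta_hat, eps = 1e-10) {
  yhat <- as.vector(X %*% beta_hat)              # fitted values X * posterior mean
  w    <- 1 / pmax(abs(beta_hat), eps)           # adaptive weights 1/|beta_hat_l|
  fit  <- cv.glmnet(X, yhat, penalty.factor = w, intercept = FALSE)
  gamma <- as.vector(coef(fit, s = "lambda.1se"))[-1]
  which(gamma != 0)                              # indices of selected covariates
}
```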
Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7 and Table 8 present the performance comparisons of the various methods under the settings for Simulations 1–4 for the two types of error distributions. In all the settings, our method performed well in variable selection. Across Simulations 1–4, the difference in the MSE among the methods is minor, though the VBEM algorithm exhibits a slightly higher MSE compared to the GBEM algorithm. This is because VB is an approximate inference method, which typically underestimates parameter variance, resulting in a higher MSE for the VBEM algorithm. Additionally, it is evident that the Bayesian methods (the first three methods in the table) have a smaller MMAD, indicating better predictive performance. Among the Bayesian methods, VBEM has a significantly shorter runtime compared to GBEM and BQRLM. Overall, the VBEM algorithm demonstrates the best performance.
Figure 2, Figure 3, Figure 4 and Figure 5 display the ELBO trajectories of the VBEM algorithm for $\tau = 0.5$ in the four simulations (Left: $\varepsilon_{ij} \sim N(0, 1)$; Right: $\varepsilon_{ij} \sim C(0, 1)$). It can be observed that the VBEM algorithm converges after several iterations in all the cases. The GBEM algorithm requires many posterior samples to ensure the accuracy of the computation, leading to a longer runtime.
The proposed VBEM algorithm for Bayesian quantile regression in linear mixed effects models offers several advantages over established methods, such as MCMC-based approaches. One of the main advantages of the VBEM algorithm is its significant reduction in computational time compared to MCMC-based methods, like GBEM and BQRLM. MCMC methods are known to be computationally expensive, especially for large datasets or high-dimensional models, while the VBEM algorithm provides a faster alternative due to its variational approximation framework. VBEM’s faster convergence and lower computational costs make it highly scalable, which is crucial for analyzing high-dimensional data or complex models with many covariates. Like Bayesian methods, VBEM effectively handles the sparsity inherent in high-dimensional data by selecting relevant variables. It also achieves this while providing good parameter estimates, making it a strong choice for variable selection tasks. By operating within a Bayesian framework, VBEM inherits the ability to quantify uncertainty and provide posterior distributions for all the submodels, which is an advantage over frequentist methods, like LASSO and SCAD, that only focus on point estimates.
A limitation of VBEM is that it is an approximate method, which can result in underestimating parameter variance. This is reflected in its slightly higher mean squared error (MSE) compared to MCMC-based methods. The trade-off for increased speed is a loss in the precision of the parameter uncertainty estimates.
We also estimated the hyperparameters in the NBP prior. In Simulation 1 and Simulation 4, where the sparsity level of $\beta$ is the same (10% non-zero elements), the estimated hyperparameters are close: $a = 0.125$, $b = 0.607$ (Simulation 1) and $a = 0.122$, $b = 0.62$ (Simulation 4). In Simulation 2 and Simulation 3, as the number of non-zero elements in $\beta$ increases, the hyperparameter estimates also increase: $a = 0.515$, $b = 0.311$ and $a = 19.885$, $b = 1.189$, respectively.
When $\tau = 0.5$, Figure 6 (Left: $\varepsilon_{ij} \sim N(0, 1)$; Right: $\varepsilon_{ij} \sim C(0, 1)$) presents the estimation results of the VBEM algorithm and the GBEM algorithm in Simulation 4. The true values of the signal $\beta$ are shown in black. These findings further illustrate that there is little difference between the estimation results of the two algorithms. However, when $\varepsilon_{ij} \sim C(0, 1)$, the performance of both the VBEM algorithm and the GBEM algorithm declines.

5. Real Data Analysis

We applied the VBEM algorithm to the yeast cell cycle gene expression dataset collected from the CDC15 experiment [23]. The cell cycle process is typically divided into the M/G1-G1-S-G2-M phases. To better understand the phenomena behind the cell cycle process, it is important to identify the transcription factors (TFs) that regulate the expression levels of genes. Similar to Wang et al. [24], we analyzed a subset of the data corresponding to the G1 phase. The dataset can be obtained by loading the R package PGEE and using the command data("yeastG1"). It contains 1132 observations (283 genes, observed four times each) and 99 variables (e.g., id, y, time, and 96 TFs). The response variable $y_{ij}$ represents the log-transformed gene expression level of gene $i$ at time $j$, and Figure 7 shows the density of the response variable. $X_{ip}$ is the matching score of the binding probability of the $p$-th transcription factor on the promoter region of the $i$-th gene. The binding probability is calculated using a mixture modeling approach based on ChIP binding experimental data, as detailed in Wang et al. [24]. We modeled the data with the following formulation:
$$ y_{ij} = \alpha_0 + \alpha_1 t_{ij} + \sum_{p=1}^{96} \beta_p X_{ip} + \varepsilon_{ij}, \quad i = 1, 2, \ldots, 283, \; j = 1, 2, \ldots, 4. $$
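For reference, a minimal sketch (ours, not the paper's code) of how the dataset can be loaded and split into the response and the TF design matrix; the column handling assumes the id, y, and time variables named above:

```r
# Minimal sketch (not from the paper): loading the yeast G1-phase data and
# assembling the response vector and the 96-column TF design matrix.
library(PGEE)
data("yeastG1")
y    <- yeastG1$y                                   # log-transformed expression level
id   <- yeastG1$id
time <- yeastG1$time
tf   <- as.matrix(yeastG1[, !(names(yeastG1) %in% c("id", "y", "time"))])  # 96 TF scores
dim(tf)                                             # expected: 1132 x 96
```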
When $\tau = 0.25$, $0.5$, or $0.75$, the numbers of TFs selected by the VBEM algorithm are 22, 16, and 12, respectively, and the specific TFs are listed in Table 9. Additionally, Table 9 also shows the TFs selected by the Penalized Generalized Estimating Equations (PGEE) method proposed by Wang et al. [24]. At $\tau = 0.5$, the TFs selected by the VBEM algorithm are nearly identical to those selected by the PGEE method. While we used the DSS method to select the TFs, the PGEE method selected TFs with estimated coefficients greater than $10^{-3}$. In a real data analysis, we cannot directly verify which variable selection method is better, but the runtime of the VBEM algorithm was significantly shorter than that of the PGEE method.

6. Conclusions

In this paper, the primary contribution is the extension of the NBP prior to Bayesian quantile regression for linear mixed effects models, which improves the ability to perform Bayesian variable selection. The use of the VBEM algorithm for hyperparameter estimation in the NBP prior is another major innovation, offering a more computationally efficient alternative to traditional methods, like MCMC.
Our key findings include the following: (1) The VBEM algorithm is significantly faster than GBEM and other methods, especially when dealing with high-dimensional data, making it well-suited for large datasets and complex models. (2) While the VBEM algorithm tends to have a slightly higher mean squared error (MSE) due to its approximate nature, its performance in variable selection and predictive accuracy remains strong. (3) Bayesian methods, particularly those employing the VBEM algorithm, demonstrated superior predictive performance with a lower median mean absolute deviation (MMAD) compared to non-Bayesian methods.
The loss function of quantile regression is non-differentiable at the origin, which presents challenges for certain theoretical analyses. This limitation has been acknowledged as a major difficulty in applying VBEM to quantile regression. Unlike MCMC methods, Variational Bayesian methods (including VBEM) lack the same level of theoretical guarantees. The theoretical research on VBEM, particularly for quantile regression, is limited, making it more challenging to justify its use in some contexts where rigorous theoretical backing is required.
The proposed VBEM algorithm has substantial practical relevance for real-world decision making in fields such as genomics, finance, and healthcare, where high-dimensional data and complex models are common. The ability to quickly and accurately identify important variables while maintaining computational efficiency makes it a valuable tool for large-scale data analysis. By reducing computation time without sacrificing predictive power, the VBEM algorithm provides a practical solution for researchers and practitioners who require both accuracy and speed in data modeling and decision making.
The normal-beta prime prior can be applied to more complex and flexible models, such as nonparametric quantile regression or semiparametric quantile regression with unknown error distributions. The normal-beta prime prior can also be applied to other statistical problems, such as density estimation or classification tasks. With the rise of deep learning, researchers have begun to integrate deep learning models with Variational Bayesian methods to handle high-dimensional, nonlinear, and complex data. In quantile regression, deep neural networks can be employed to model the quantile function, and Variational Bayesian inference can be used to estimate the posterior distribution of the network parameters.

Author Contributions

The first two authors contributed equally to this work. M.T. contributed to the conceptualization, project administration, and funding acquisition for this project, with assistance from the corresponding author. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Beijing Natural Science Foundation (1242005).

Data Availability Statement

The researchers can download the yeastG1 dataset from the R package “PGEE”.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Giraud, C. Introduction to High-Dimensional Statistics; Chapman and Hall/CRC: Boca Raton, FL, USA, 2021. [Google Scholar]
  2. Koenker, R. Quantile regression for longitudinal data. J. Multivar. Anal. 2004, 91, 74–89. [Google Scholar] [CrossRef]
  3. Yu, K.; Moyeed, R.A. Bayesian quantile regression. Stat. Probab. Lett. 2001, 54, 437–447. [Google Scholar] [CrossRef]
  4. Luo, Y.; Lian, H.; Tian, M. Bayesian quantile regression for longitudinal data models. J. Stat. Comput. Simul. 2012, 82, 1635–1649. [Google Scholar] [CrossRef]
  5. Sriram, K.; Shi, P.; Ghosh, P. A Bayesian Semiparametric Quantile Regression Model for Longitudinal Data with Application to Insurance Company Costs. IIM Bangalore Res. Pap. 2011, 355. [Google Scholar] [CrossRef]
  6. Chen, C.W.S.; Dunson, D.B.; Reed, C.; Yu, K. Bayesian variable selection in quantile regression. Stat. Its Interface 2013, 6, 261–274. [Google Scholar] [CrossRef]
  7. Alhamzawi, R.; Yu, K.; Benoit, D.F. Bayesian adaptive Lasso quantile regression. Stat. Model. 2012, 12, 279–297. [Google Scholar] [CrossRef]
  8. Li, Q.; Lin, N.; Xi, R. Bayesian regularized quantile regression. Bayesian Anal. 2010, 5, 533–556. [Google Scholar] [CrossRef]
  9. Ji, Y.; Shi, H. Bayesian variable selection in linear quantile mixed models for longitudinal data with application to macular degeneration. PLoS ONE 2020, 15, e0241197. [Google Scholar] [CrossRef] [PubMed]
  10. Huang, X.; Wang, J.; Liang, F. A Variational Bayesian algorithm for Bayesian variable selection. arXiv 2016, arXiv:1602.07640. [Google Scholar]
  11. Neville, S.E. Elaborate Distribution Semiparametric Regression via Mean Field Variational Bayes. Ph.D. Thesis, University of Wollongong, Wollongong, Australia, 2013. [Google Scholar]
  12. Lim, D.; Park, B.; Nott, D.; Wang, X.; Choi, T. Sparse signal shrinkage and outlier detection in high-dimensional quantile regression with variational Bayes. Stat. Its Interface 2020, 13, 237–249. [Google Scholar] [CrossRef]
  13. Ray, K.; Szabó, B. Variational Bayes for high-dimensional linear regression with sparse priors. J. Am. Stat. Assoc. 2022, 117, 1270–1281. [Google Scholar] [CrossRef]
  14. Ray, K.; Szabó, B.; Clara, G. Spike and slab variational Bayes for high dimensional logistic regression. arXiv 2020, arXiv:2010.11665. [Google Scholar]
  15. Blei, D.M.; Kucukelbir, A.; McAuliffe, J.D. Variational inference: A review for statisticians. J. Am. Stat. Assoc. 2017, 112, 859–877. [Google Scholar] [CrossRef]
  16. Dai, D.; Tang, A.; Ye, J. High-Dimensional variable selection for quantile regression based on variational method. Mathematics 2023, 11, 2232. [Google Scholar] [CrossRef]
  17. Wang, Z.; Wu, Y.; Cheng, W. Variational inference on a Bayesian adaptive lasso Tobit quantile regression model. Stat 2023, 12, e563. [Google Scholar] [CrossRef]
  18. Li, X.; Tuerde, M.; Hu, X. Variational Bayesian Inference for Quantile Regression Models with Nonignorable Missing Data. Mathematics 2023, 11, 3926. [Google Scholar] [CrossRef]
  19. Bai, R.; Ghosh, M. On the beta prime prior for scale parameters in high-dimensional bayesian regression models. Stat. Sin. 2021, 31, 843–865. [Google Scholar] [CrossRef]
  20. Kozumi, H.; Kobayashi, G. Gibbs sampling methods for Bayesian quantile regression. J. Stat. Comput. Simul. 2011, 81, 1565–1578. [Google Scholar] [CrossRef]
  21. Hahn, P.R.; Carvalho, C.M. Decoupling shrinkage and selection in bayesian linear models: A posterior summary perspective. J. Am. Stat. Assoc. 2015, 110, 435–448. [Google Scholar] [CrossRef]
  22. Friedman, J.; Hastie, T.; Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 2010, 33, 1–22. [Google Scholar] [CrossRef]
  23. Eisen, M.B.; Spellman, P.T.; Brown, P.O.; Botstein, D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 1998, 95, 14863–14868. [Google Scholar] [CrossRef] [PubMed]
  24. Wang, L.; Zhou, J.; Qu, A. Penalized generalized estimating equations for high-dimensional longitudinal data analysis. Biometrics 2011, 68, 353–360. [Google Scholar] [CrossRef] [PubMed]
Figure 1. The density of the beta prime prior for various values of $(a, b)$.
Figure 2. ELBO of the VBEM algorithm for Simulation 1 ((Left): $\varepsilon_{ij} \sim N(0, 1)$; (Right): $\varepsilon_{ij} \sim C(0, 1)$).
Figure 3. ELBO of the VBEM algorithm for Simulation 2 ((Left): $\varepsilon_{ij} \sim N(0, 1)$; (Right): $\varepsilon_{ij} \sim C(0, 1)$).
Figure 4. ELBO of the VBEM algorithm for Simulation 3 ((Left): $\varepsilon_{ij} \sim N(0, 1)$; (Right): $\varepsilon_{ij} \sim C(0, 1)$).
Figure 5. ELBO of the VBEM algorithm for Simulation 4 ((Left): $\varepsilon_{ij} \sim N(0, 1)$; (Right): $\varepsilon_{ij} \sim C(0, 1)$).
Figure 6. Comparison of the VBEM algorithm and the GBEM algorithm for Simulation 4.
Figure 7. The density of the log-transformed gene expression levels.
Table 1. Comparison of Simulation 1 ($\varepsilon_{ij} \sim N(0, 1)$).

| τ | Method | MSE | MMAD | TPR | FDR | T (s) |
|---|--------|-----|------|-----|-----|-------|
| 0.25 | VBEM | 3.20 (0.19) | 1.42 (0.25) | 1.00 (0.00) | 0.00 (0.00) | 0.35 (0.06) |
| | GBEM | 3.19 (0.24) | 1.44 (0.16) | 1.00 (0.00) | 0.00 (0.00) | 17.61 (1.52) |
| | BQRLM | 3.44 (0.23) | 1.52 (0.28) | 1.00 (0.00) | 0.00 (0.00) | 15.26 (1.02) |
| | QRM | 3.32 (0.33) | 2.55 (0.48) | 1.00 (0.00) | 0.95 (0.13) | 0.36 (0.12) |
| | QRML | 3.24 (0.23) | 2.35 (0.38) | 1.00 (0.00) | 0.04 (0.01) | 0.46 (0.09) |
| 0.5 | VBEM | 3.08 (0.34) | 1.12 (0.14) | 1.00 (0.00) | 0.00 (0.00) | 0.26 (0.05) |
| | GBEM | 3.15 (0.21) | 1.38 (0.17) | 1.00 (0.00) | 0.00 (0.00) | 17.45 (1.45) |
| | BQRLM | 3.35 (0.21) | 1.49 (0.32) | 1.00 (0.00) | 0.00 (0.00) | 14.26 (1.52) |
| | QRM | 3.25 (0.42) | 2.46 (0.18) | 1.00 (0.00) | 0.93 (0.10) | 0.25 (0.14) |
| | QRML | 3.20 (0.23) | 2.33 (0.29) | 1.00 (0.00) | 0.06 (0.02) | 0.45 (0.06) |
| 0.75 | VBEM | 3.21 (0.32) | 1.44 (0.30) | 1.00 (0.00) | 0.00 (0.00) | 0.39 (0.02) |
| | GBEM | 3.18 (0.19) | 1.46 (0.15) | 1.00 (0.00) | 0.00 (0.00) | 17.04 (1.02) |
| | BQRLM | 3.40 (0.23) | 1.55 (0.31) | 1.00 (0.00) | 0.00 (0.00) | 15.33 (1.05) |
| | QRM | 3.49 (0.29) | 2.48 (0.41) | 0.98 (0.03) | 0.94 (0.02) | 0.33 (0.11) |
| | QRML | 3.38 (0.23) | 2.39 (0.38) | 1.00 (0.00) | 0.08 (0.02) | 0.48 (0.10) |
Table 2. Comparison of Simulation 1 ($\varepsilon_{ij} \sim C(0, 1)$).

| τ | Method | MSE | MMAD | TPR | FDR | T (s) |
|---|--------|-----|------|-----|-----|-------|
| 0.25 | VBEM | 5.50 (0.27) | 2.85 (0.35) | 1.00 (0.00) | 0.00 (0.00) | 0.45 (0.09) |
| | GBEM | 5.32 (0.45) | 2.94 (0.16) | 1.00 (0.00) | 0.00 (0.00) | 18.73 (1.33) |
| | BQRLM | 6.47 (0.35) | 3.12 (0.18) | 1.00 (0.00) | 0.00 (0.00) | 16.39 (1.52) |
| | QRM | 5.92 (0.42) | 4.59 (0.40) | 1.00 (0.00) | 0.91 (0.12) | 0.39 (0.22) |
| | QRML | 5.84 (0.23) | 4.27 (0.29) | 1.00 (0.00) | 0.03 (0.01) | 0.52 (0.05) |
| 0.5 | VBEM | 5.48 (0.21) | 2.62 (0.18) | 1.00 (0.00) | 0.00 (0.00) | 0.29 (0.06) |
| | GBEM | 5.75 (0.21) | 2.48 (0.17) | 1.00 (0.00) | 0.00 (0.00) | 17.46 (1.63) |
| | BQRLM | 6.15 (0.20) | 2.69 (0.27) | 1.00 (0.00) | 0.00 (0.00) | 13.63 (1.01) |
| | QRM | 5.87 (0.39) | 3.16 (0.18) | 1.00 (0.00) | 0.92 (0.12) | 0.27 (0.11) |
| | QRML | 5.79 (0.23) | 3.03 (0.29) | 1.00 (0.00) | 0.07 (0.01) | 0.49 (0.07) |
| 0.75 | VBEM | 5.53 (0.25) | 2.87 (0.39) | 1.00 (0.00) | 0.00 (0.00) | 0.46 (0.09) |
| | GBEM | 5.62 (0.45) | 2.98 (0.16) | 1.00 (0.00) | 0.00 (0.00) | 19.06 (1.33) |
| | BQRLM | 6.37 (0.33) | 3.15 (0.16) | 1.00 (0.00) | 0.00 (0.00) | 18.22 (1.02) |
| | QRM | 5.98 (0.52) | 4.61 (0.37) | 1.00 (0.00) | 0.94 (0.11) | 0.38 (0.24) |
| | QRML | 5.74 (0.23) | 4.07 (0.22) | 1.00 (0.00) | 0.02 (0.01) | 0.51 (0.03) |
Table 3. Comparison of Simulation 2 ($\varepsilon_{ij} \sim N(0, 1)$).

| τ | Method | MSE | MMAD | TPR | FDR | T (s) |
|---|--------|-----|------|-----|-----|-------|
| 0.25 | VBEM | 3.69 (0.17) | 1.46 (0.21) | 1.00 (0.00) | 0.00 (0.00) | 0.38 (0.04) |
| | GBEM | 3.55 (0.16) | 1.42 (0.15) | 1.00 (0.00) | 0.00 (0.00) | 16.61 (0.14) |
| | BQRLM | 3.64 (0.25) | 1.55 (0.28) | 1.00 (0.00) | 0.00 (0.00) | 16.33 (1.05) |
| | QRM | 3.74 (0.33) | 2.45 (0.31) | 1.00 (0.00) | 0.94 (0.10) | 0.41 (0.15) |
| | QRML | 3.66 (0.23) | 2.13 (0.38) | 1.00 (0.00) | 0.04 (0.02) | 0.43 (0.06) |
| 0.5 | VBEM | 2.88 (0.27) | 1.33 (0.29) | 1.00 (0.00) | 0.00 (0.00) | 0.37 (0.04) |
| | GBEM | 2.94 (0.31) | 1.35 (0.15) | 1.00 (0.00) | 0.00 (0.00) | 17.26 (1.15) |
| | BQRLM | 3.09 (0.21) | 1.52 (0.35) | 1.00 (0.00) | 0.00 (0.00) | 16.52 (1.02) |
| | QRM | 3.33 (0.29) | 2.39 (0.15) | 1.00 (0.00) | 0.92 (0.10) | 0.29 (0.10) |
| | QRML | 3.25 (0.21) | 2.28 (0.29) | 1.00 (0.00) | 0.05 (0.02) | 0.41 (0.06) |
| 0.75 | VBEM | 3.22 (0.39) | 1.49 (0.25) | 1.00 (0.00) | 0.00 (0.00) | 0.35 (0.03) |
| | GBEM | 3.19 (0.16) | 1.45 (0.11) | 1.00 (0.00) | 0.00 (0.00) | 18.06 (1.16) |
| | BQRLM | 3.42 (0.29) | 1.57 (0.30) | 1.00 (0.00) | 0.00 (0.00) | 15.49 (1.18) |
| | QRM | 3.41 (0.29) | 2.45 (0.35) | 0.96 (0.02) | 0.92 (0.02) | 0.31 (0.21) |
| | QRML | 3.35 (0.26) | 2.37 (0.31) | 1.00 (0.00) | 0.05 (0.02) | 0.45 (0.20) |
Table 4. Comparison of Simulation 2 ($\varepsilon_{ij} \sim C(0, 1)$).

| τ | Method | MSE | MMAD | TPR | FDR | T (s) |
|---|--------|-----|------|-----|-----|-------|
| 0.25 | VBEM | 6.61 (0.35) | 2.96 (0.22) | 1.00 (0.00) | 0.00 (0.00) | 0.47 (0.09) |
| | GBEM | 6.16 (0.45) | 3.13 (0.15) | 1.00 (0.00) | 0.00 (0.00) | 17.77 (0.28) |
| | BQRLM | 6.43 (0.30) | 3.25 (0.21) | 1.00 (0.00) | 0.00 (0.00) | 15.39 (1.02) |
| | QRM | 6.95 (0.40) | 4.38 (0.45) | 1.00 (0.00) | 0.90 (0.11) | 0.35 (0.21) |
| | QRML | 6.82 (0.15) | 4.17 (0.29) | 1.00 (0.00) | 0.02 (0.01) | 0.55 (0.04) |
| 0.5 | VBEM | 5.85 (0.25) | 2.52 (0.15) | 1.00 (0.00) | 0.00 (0.00) | 0.28 (0.07) |
| | GBEM | 5.96 (0.25) | 2.33 (0.11) | 1.00 (0.00) | 0.00 (0.00) | 17.05 (0.47) |
| | BQRLM | 6.14 (0.20) | 2.61 (0.21) | 1.00 (0.00) | 0.00 (0.00) | 12.98 (1.11) |
| | QRM | 5.99 (0.34) | 3.01 (0.29) | 1.00 (0.00) | 0.90 (0.10) | 0.25 (0.10) |
| | QRML | 5.89 (0.21) | 3.12 (0.20) | 1.00 (0.00) | 0.09 (0.01) | 0.38 (0.08) |
| 0.75 | VBEM | 6.08 (0.32) | 2.92 (0.31) | 1.00 (0.00) | 0.00 (0.00) | 0.47 (0.08) |
| | GBEM | 6.21 (0.45) | 2.82 (0.14) | 1.00 (0.00) | 0.00 (0.00) | 19.33 (1.02) |
| | BQRLM | 6.39 (0.36) | 3.18 (0.11) | 1.00 (0.00) | 0.00 (0.00) | 19.20 (1.01) |
| | QRM | 6.32 (0.32) | 4.55 (0.32) | 1.00 (0.00) | 0.90 (0.15) | 0.40 (0.25) |
| | QRML | 6.36 (0.23) | 4.15 (0.25) | 1.00 (0.00) | 0.03 (0.01) | 0.55 (0.02) |
Table 5. Comparison of Simulation 3 ($\varepsilon_{ij} \sim N(0, 1)$). The FDR is not applicable ("/") because all coefficients are non-zero in this setting.

| τ | Method | MSE | MMAD | TPR | FDR | T (s) |
|---|--------|-----|------|-----|-----|-------|
| 0.25 | VBEM | 3.63 (0.15) | 1.40 (0.25) | 1.00 (0.00) | / | 0.38 (0.04) |
| | GBEM | 3.42 (0.19) | 1.36 (0.14) | 1.00 (0.00) | / | 16.63 (0.22) |
| | BQRLM | 3.69 (0.28) | 1.58 (0.33) | 1.00 (0.00) | / | 17.55 (1.21) |
| | QRM | 3.70 (0.28) | 2.51 (0.35) | 1.00 (0.00) | / | 0.36 (0.22) |
| | QRML | 3.87 (0.19) | 2.59 (0.21) | 1.00 (0.00) | / | 0.41 (0.05) |
| 0.5 | VBEM | 2.76 (0.23) | 1.35 (0.29) | 1.00 (0.00) | / | 0.35 (0.02) |
| | GBEM | 2.78 (0.18) | 1.38 (0.19) | 1.00 (0.00) | / | 16.77 (0.61) |
| | BQRLM | 3.28 (0.21) | 1.49 (0.22) | 1.00 (0.00) | / | 17.02 (1.52) |
| | QRM | 3.01 (0.18) | 2.51 (0.19) | 1.00 (0.00) | / | 0.24 (0.03) |
| | QRML | 3.39 (0.19) | 2.62 (0.27) | 1.00 (0.00) | / | 0.47 (0.05) |
| 0.75 | VBEM | 3.73 (0.23) | 1.52 (0.22) | 1.00 (0.00) | / | 0.53 (0.09) |
| | GBEM | 3.56 (0.25) | 1.55 (0.19) | 1.00 (0.00) | / | 17.73 (1.28) |
| | BQRLM | 3.82 (0.20) | 1.66 (0.25) | 1.00 (0.00) | / | 15.33 (1.06) |
| | QRM | 3.61 (0.29) | 2.37 (0.35) | 1.00 (0.00) | / | 0.35 (0.02) |
| | QRML | 3.66 (0.25) | 2.52 (0.23) | 1.00 (0.00) | / | 0.41 (0.12) |
Table 6. Comparison of Simulation 3 ($\varepsilon_{ij} \sim C(0, 1)$). The FDR is not applicable ("/") because all coefficients are non-zero in this setting.

| τ | Method | MSE | MMAD | TPR | FDR | T (s) |
|---|--------|-----|------|-----|-----|-------|
| 0.25 | VBEM | 6.52 (0.26) | 2.98 (0.23) | 1.00 (0.00) | / | 0.47 (0.09) |
| | GBEM | 6.04 (0.15) | 3.09 (0.17) | 1.00 (0.00) | / | 18.93 (0.21) |
| | BQRLM | 6.48 (0.22) | 3.07 (0.29) | 1.00 (0.00) | / | 16.33 (1.21) |
| | QRM | 6.56 (0.33) | 4.01 (0.36) | 1.00 (0.00) | / | 0.39 (0.15) |
| | QRML | 6.77 (0.15) | 4.59 (0.29) | 1.00 (0.00) | / | 0.50 (0.04) |
| 0.5 | VBEM | 5.32 (0.22) | 2.69 (0.18) | 1.00 (0.00) | / | 0.35 (0.05) |
| | GBEM | 5.59 (0.25) | 2.52 (0.21) | 1.00 (0.00) | / | 17.59 (1.23) |
| | BQRLM | 6.09 (0.36) | 2.77 (0.21) | 1.00 (0.00) | / | 14.15 (1.35) |
| | QRM | 5.84 (0.34) | 3.32 (0.19) | 1.00 (0.00) | / | 0.30 (0.09) |
| | QRML | 5.74 (0.21) | 3.47 (0.15) | 1.00 (0.00) | / | 0.36 (0.05) |
| 0.75 | VBEM | 6.11 (0.52) | 2.85 (0.29) | 1.00 (0.00) | / | 0.45 (0.06) |
| | GBEM | 6.39 (0.32) | 2.78 (0.11) | 1.00 (0.00) | / | 18.36 (1.05) |
| | BQRLM | 6.35 (0.31) | 3.26 (0.28) | 1.00 (0.00) | / | 19.41 (1.55) |
| | QRM | 6.18 (0.39) | 4.05 (0.25) | 1.00 (0.00) | / | 0.35 (0.21) |
| | QRML | 6.26 (0.18) | 4.01 (0.35) | 1.00 (0.00) | / | 0.53 (0.10) |
Table 7. Comparison of Simulation 4 ($\varepsilon_{ij} \sim N(0, 1)$).

| τ | Method | MSE | MMAD | TPR | FDR | T (s) |
|---|--------|-----|------|-----|-----|-------|
| 0.25 | VBEM | 4.13 (0.18) | 1.92 (0.35) | 1.00 (0.00) | 0.00 (0.00) | 2.45 (0.12) |
| | GBEM | 4.02 (0.12) | 1.88 (0.15) | 1.00 (0.00) | 0.00 (0.00) | 63.52 (2.69) |
| | BQRLM | 4.61 (0.23) | 2.18 (0.26) | 1.00 (0.00) | 0.00 (0.00) | 65.54 (1.32) |
| | QRM | 4.62 (0.35) | 2.85 (0.42) | 1.00 (0.00) | 1.00 (0.00) | 0.89 (0.25) |
| | QRML | 4.22 (0.25) | 2.48 (0.36) | 1.00 (0.00) | 0.00 (0.00) | 0.92 (0.09) |
| 0.5 | VBEM | 2.92 (0.29) | 1.45 (0.33) | 1.00 (0.00) | 0.00 (0.00) | 1.49 (0.09) |
| | GBEM | 2.72 (0.25) | 1.44 (0.14) | 1.00 (0.00) | 0.00 (0.00) | 64.41 (2.69) |
| | BQRLM | 3.05 (0.19) | 1.54 (0.22) | 1.00 (0.00) | 0.00 (0.00) | 67.33 (1.35) |
| | QRM | 3.22 (0.29) | 2.08 (0.13) | 1.00 (0.00) | 1.00 (0.00) | 0.85 (0.05) |
| | QRML | 3.84 (0.11) | 2.59 (0.17) | 1.00 (0.00) | 0.00 (0.00) | 0.88 (0.05) |
| 0.75 | VBEM | 4.39 (0.15) | 1.73 (0.15) | 1.00 (0.00) | 0.00 (0.00) | 2.53 (0.52) |
| | GBEM | 4.19 (0.21) | 1.75 (0.25) | 1.00 (0.00) | 0.00 (0.00) | 64.73 (2.35) |
| | BQRLM | 4.22 (0.17) | 1.80 (0.25) | 1.00 (0.00) | 0.00 (0.00) | 69.21 (1.89) |
| | QRM | 4.42 (0.18) | 2.08 (0.38) | 1.00 (0.00) | 1.00 (0.00) | 0.92 (0.11) |
| | QRML | 4.62 (0.22) | 2.35 (0.35) | 1.00 (0.00) | 0.00 (0.00) | 1.82 (0.25) |
Table 8. Comparison of Simulation 4 ($\varepsilon_{ij} \sim C(0, 1)$).

| τ | Method | MSE | MMAD | TPR | FDR | T (s) |
|---|--------|-----|------|-----|-----|-------|
| 0.25 | VBEM | 5.95 (0.22) | 2.25 (0.21) | 1.00 (0.00) | 0.00 (0.00) | 3.66 (0.67) |
| | GBEM | 6.21 (0.52) | 1.92 (0.18) | 1.00 (0.00) | 0.00 (0.00) | 70.47 (3.02) |
| | BQRLM | 6.23 (0.36) | 2.42 (0.36) | 1.00 (0.00) | 0.00 (0.00) | 69.33 (1.58) |
| | QRM | 6.55 (0.28) | 2.94 (0.35) | 1.00 (0.00) | 1.00 (0.00) | 0.92 (0.14) |
| | QRML | 6.50 (0.27) | 2.82 (0.39) | 1.00 (0.00) | 0.00 (0.00) | 0.95 (0.09) |
| 0.5 | VBEM | 4.98 (0.15) | 1.62 (0.32) | 1.00 (0.00) | 0.00 (0.00) | 2.89 (0.19) |
| | GBEM | 4.93 (0.32) | 1.52 (0.21) | 1.00 (0.00) | 0.00 (0.00) | 68.17 (3.19) |
| | BQRLM | 5.02 (0.33) | 1.65 (0.21) | 1.00 (0.00) | 0.00 (0.00) | 68.25 (1.98) |
| | QRM | 5.51 (0.29) | 2.74 (0.29) | 1.00 (0.00) | 1.00 (0.00) | 0.88 (0.07) |
| | QRML | 5.41 (0.32) | 2.55 (0.13) | 1.00 (0.00) | 0.00 (0.00) | 0.92 (0.12) |
| 0.75 | VBEM | 6.89 (0.39) | 1.82 (0.19) | 1.00 (0.00) | 0.00 (0.00) | 3.92 (0.66) |
| | GBEM | 7.14 (0.21) | 1.88 (0.13) | 1.00 (0.00) | 0.00 (0.00) | 71.55 (4.055) |
| | BQRLM | 7.20 (0.63) | 2.05 (0.21) | 1.00 (0.00) | 0.00 (0.00) | 72.25 (1.33) |
| | QRM | 8.05 (0.56) | 2.89 (0.42) | 1.00 (0.00) | 1.00 (0.00) | 0.95 (0.15) |
| | QRML | 7.60 (0.19) | 2.46 (0.34) | 1.00 (0.00) | 0.00 (0.00) | 1.76 (0.32) |
Table 9. Comparison of the VBEM algorithm and the PGEE method on yeast cell cycle gene expression data.

| Method | The Selected TFs | T (s) |
|--------|------------------|-------|
| VBEM ($\tau = 0.25$) | ABF1, ARG81, ASH1, FKH1, FKH2, GAT3, GCR1, GCR2, GTS1, HMS1, MBP1, MET4, MSN4, NDD1, ROX1, SIP4, STB1, STP1, SWI4, SWI6, YAP6, ZAP1 | 156.36 |
| VBEM ($\tau = 0.5$) | ABF1, ARG81, ASH1, FKH2, GAT3, GCR2, MBP1, MET4, MSN4, NDD1, ROX1, STB1, STP1, SWI4, SWI6, YAP6 | 159.63 |
| VBEM ($\tau = 0.75$) | ABF1, ASH1, FKH2, GAT3, GCR2, MBP1, MSN4, NDD1, STB1, STP1, SWI4, SWI6 | 159.61 |
| PGEE | ABF1, FKH1, FKH2, GAT3, GCR2, MBP1, MSN4, NDD1, PHD1, RGM1, RLM1, SMP1, SRD1, STB1, SWI4, SWI6 | 1331.59 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
