Article

A Semiparametric Bayesian Joint Modelling of Skewed Longitudinal and Competing Risks Failure Time Data: With Application to Chronic Kidney Disease

1 Pan African University Institute for Basic Sciences, Technology and Innovation (PAUSTI), Nairobi 62000-00200, Kenya
2 Department of Statistics, University of Gondar, Gondar 196, Ethiopia
3 Department of Statistics and Actuarial Sciences, Jomo Kenyatta University of Agriculture and Technology (JKUAT), Nairobi 62000-00200, Kenya
4 Department of Epidemiology and Biostatistics, College of Public Health, University of South Florida, 13201 Bruce B. Downs, Tampa, FL 33612, USA
5 School of Public Health, Jomo Kenyatta University of Agriculture and Technology (JKUAT), Nairobi 62000-00200, Kenya
6 Department of Internal Medicine, College of Medicine and Health Sciences, University of Gondar, Gondar 196, Ethiopia
7 Department of Mathematics, College of Science and Humanities in Al-Kharj, Prince Sattam bin Abdulaziz University, Al-Kharj 11942, Saudi Arabia
8 Department of Mathematics, Faculty of Science, Mansoura University, Mansoura 35516, Egypt
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(24), 4816; https://doi.org/10.3390/math10244816
Submission received: 10 November 2022 / Revised: 28 November 2022 / Accepted: 12 December 2022 / Published: 18 December 2022
(This article belongs to the Special Issue Current Developments in Theoretical and Applied Statistics)

Abstract

In clinical and epidemiological studies, when the time-to-event(s) and the longitudinal outcomes are associated, modelling them separately may give biased estimates. A joint modelling approach is required to obtain unbiased results and to evaluate their association. In the joint model, a subject may be exposed to more than one type of failure event (competing risks). Treating the competing event as independent censoring of the time-to-event process may underestimate the true survival probability and give biased results. Within the joint model, longitudinal outcomes may have nonlinear (irregular) trajectories over time and exhibit skewness with heavy tails. Accordingly, fully parametric mixed-effect models may not be flexible enough to model this type of complex longitudinal data. In addition, assuming a Gaussian distribution for model errors may be too restrictive to adequately represent within-individual variations and may lack robustness against deviation from distributional assumptions. To overcome these issues simultaneously, in this paper we present semiparametric joint models for competing risks failure time and skewed longitudinal data, using a smoothing spline approach and a multivariate skew-t distribution. We also consider different parameterization approaches in the formulation of the joint models and use a Bayesian approach for statistical inference. We illustrate the proposed methods by analyzing real data on chronic kidney disease. To evaluate the performance of the methods, we also carried out simulation studies. The results of both the application and the simulation studies revealed that the proposed joint modelling approach performed well when the semiparametric, random-effects parameterization, and skew-t distribution specifications were taken into account.

1. Introduction

In several clinical and other follow-up studies, one or more biomarkers from the study subjects can be examined repeatedly before the time-to-event(s) of interest occurs. In most of these studies, the event-time and the longitudinal measurements can be recorded. For instance, in a chronic kidney disease (CKD) study, biomarkers of kidney functional progress of a patient may be measured repeatedly, and the time to end-stage renal disease (ESRD), death, hemodialysis or kidney transplantation may be recorded. There are well-established approaches in the literature to model longitudinal and survival data separately. However, when the time-to-event(s) and the longitudinal outcomes are associated, modelling them separately may give biased estimates, and a joint modelling approach is required to obtain unbiased statistical results [1,2] and to evaluate their association. As a result, methods that can model the two processes together are becoming indispensable in several follow-up studies.
The development of joint modelling for analyzing time-to-event and longitudinal data from clinical studies is an active research area and has received much attention recently. See, for example, Refs. [2,3,4,5,6,7,8] among others. The most popular submodels proposed in the joint modelling method include mixed-effects submodels e.g., [6,9] for longitudinal data, a Cox regression submodel [4] or a Weibull survival submodel [6,9] for time-to-event data.
Various extensions to the joint modelling approach have been proposed in the literature to accommodate several features of longitudinal and time-to-event data. Examples include relaxing the normality assumption of model errors [10,11,12], accounting for measurement errors in covariates [11,13], developing a semiparametric likelihood approach [14], adopting a Bayesian approach [15], considering heterogeneous populations [13], and including multivariate longitudinal responses [16,17,18]. These studies focused only on survival data with a single event (single failure type) and assumed independent censoring for the time-to-event process.
In some follow-up studies, however, a subject may be exposed to more than one type of failure event, and whether or not they are censored as a result of a particular cause of failure may depend on the occurrence of other events of interest, called competing risks. In other words, in survival time data with competing risks, the occurrence of one failure type may prevent or change the likelihood of the occurrence of another. For example, in the CKD study, the occurrence of death can prevent the occurrence of ESRD, whereas the occurrence of ESRD can change (increase) the probability of the occurrence of death. Thus, considering the competing event, death in this case, as an independent censoring of the time-to-ESRD process may underestimate the true survival probability and give biased results [19,20].
In the literature, there is little work carried out on joint modelling in CKD follow-up studies. Yang et al. [21] developed a joint model for longitudinal and competing risks in CKD data. They proposed a standard mixed-effects model (MEM) for the longitudinal outcome and a linear regression model on either of the event times. Armero et al. [22] proposed a linear MEM for longitudinal outcome and a Cox PH model with Weibull baseline hazards for competing risks in their joint model development to evaluate the progression of CKD in children. Teixeira et al. [23] also proposed a linear MEM and relative risk model with an unspecified baseline hazard for the longitudinal-competing risk processes in the development of their joint model to analyse peritoneal dialysis data.
Most of these previous studies focused on a fully parametric modelling approach for longitudinal outcomes, with model errors and random effects assumed to have a Gaussian distribution. However, in some applications, the exact form of the relationship between the longitudinal outcomes and the time-effects may be unknown (irregular). For instance, in our application data presented in Section 4.1, some CKD patients have nonlinear trajectories of the outcome eGFR over time. Thus, a fully parametric modelling approach may not be flexible enough to model such complex longitudinal data. To account for this issue and model nonlinear trajectories of longitudinal outcomes more flexibly, a semiparametric modelling approach based on smoothing spline functions can be used. For instance, Tang et al. [24] proposed semiparametric mixed-effects submodels for multivariate longitudinal breast cancer data; Refs. [12,25,26,27] proposed semiparametric mixture Tobit, partially linear mixed-effects, and quantile regression models, respectively, for HIV/AIDS dynamics; and Cavieres et al. [28] modelled spatial data semiparametrically using a thin plate smoothing spline for the spatial coordinates.
In addition, longitudinal data may contain potential outliers or exhibit skewness with heavy tails, leading to biased results and invalid statistical inference [29,30]. As a result, assuming a normal distribution for model errors may be too restrictive to adequately represent within-individual variations and lack robustness against deviation from symmetry [31]. Thus, considering more flexible distributional assumptions to well represent such data has received great attention recently [12]. In recent years, skew distributions have gained popularity as a useful tool for dealing with asymmetric longitudinal data in many applications. The most commonly used skew-elliptical distributions in the literature are multivariate skew-normal [27,28,32,33] and multivariate skew-t [8,12,26,29] distributions.
Furthermore, various parameterization (association structure) approaches have been utilized in the literature to formulate joint models and evaluate the association between time-to-event(s) and longitudinal outcome(s) processes. These include current value parameterization; see, e.g., [7,23,34,35,36,37], shared random-effects parameterization [8,10,12,21,22,26], a special type of shared random-effects parameterization with associated fixed parameters of time-dependent covariates [38,39,40,41], and correlated random-effects parameterization [42,43,44]. The choice of the association structure, however, may also need careful consideration because the statistical results from various parametrizations may differ.
Thus, to address the issues stated above simultaneously, we propose a semiparametric joint model for skewed longitudinal outcome and competing risks failure time data from chronic kidney disease. A smoothing spline approach is used to model the non-parametric time-effects. A multivariate skew-t distribution is considered to relax the normality assumption of model errors. A fully Bayesian approach is used for statistical inference, because it is computationally feasible and makes inference straightforward for such complex joint models.
The rest of this article is organized as follows: Section 2 briefly describes the submodels and the joint models. Section 3 describes the Bayesian inference. Section 4 presents the CKD application data, the analysis, and the model comparisons. Section 5 provides confirmatory simulation studies conducted to evaluate the performance of the proposed methods. Section 6 includes a discussion, concluding remarks, and suggestions for future work. Supplementary materials for this study are included in Appendix A.

2. Joint Modelling

2.1. The Longitudinal Outcome Submodel

Suppose there are $n$ study subjects, and let $y_{ij}$ be the longitudinal outcome eGFR measured at time $t_{ij}$ ($j = 1, 2, \ldots, n_i$) for subject $i$ ($i = 1, 2, \ldots, n$). The estimated glomerular filtration rate (eGFR) measurements from many CKD patients, as illustrated in Section 4.1, have nonlinear trajectories over time. Thus, to account for this nonlinear (irregular) relationship between eGFR and time, we consider non-parametric time effects and propose a semiparametric mixed-effects model for the outcome eGFR, henceforth called the “longitudinal submodel”, which is defined by
$$y_{ij} = z_{ij}^T(\beta + \varphi_i) + f_i(t_{ij}) + \epsilon_{ij}, \qquad f_i(t_{ij}) = w\big(h(t_{ij}), g_i(t_{ij})\big) \tag{1}$$
where $z_{ij} = (z_{1ij}, z_{2ij}, \ldots, z_{lij})^T$ denotes a vector of $l$ covariates, and $\beta = (\beta_1, \ldots, \beta_l)^T$ and $\varphi_i = (\varphi_{i1}, \varphi_{i2}, \ldots, \varphi_{im})^T$ are the associated population parameters (fixed-effects) and subject-specific parameters (random-effects); here $l$ and $m$ are the numbers of fixed-effect and random-effect parameters, respectively. The function $f_i(\cdot)$ denotes a non-parametric smoothing function of the measurement time $t_{ij}$, defined by $f_i(t_{ij}) = h(t_{ij}) + g_i(t_{ij})$, where $h(t_{ij})$ and $g_i(t_{ij})$ denote unknown smoothing functions for the fixed and random time-effects, respectively, which represent the population average and the inter-subject variation in the eGFR process. $\epsilon_{ij}$ denotes the within-subject model error term. We assume that $\varphi_i$, $\epsilon_i = (\epsilon_{i1}, \ldots, \epsilon_{in_i})^T$ and $g_i(t_i)$ are independent of each other.
To approximate the unknown smoothing functions $h(t_{ij})$ and $g_i(t_{ij})$ in model (1), we used a regression spline, that is, a linear combination of spline basis functions $\Psi_r(t_{ij}) = (\psi_0(t_{ij}), \psi_1(t_{ij}), \ldots, \psi_{r-1}(t_{ij}))^T$ and $\Xi_s(t_{ij}) = (\xi_0(t_{ij}), \xi_1(t_{ij}), \ldots, \xi_{s-1}(t_{ij}))^T$, respectively. That is,
$$h_r(t_{ij}) \approx \sum_{l=0}^{r-1} \zeta_l \psi_l(t_{ij}) = \Psi_r(t_{ij})^T \zeta_r, \qquad g_{is}(t_{ij}) \approx \sum_{l=0}^{s-1} \varpi_{il} \xi_l(t_{ij}) = \Xi_s(t_{ij})^T \varpi_{is} \tag{2}$$
where $\zeta_r = (\zeta_0, \zeta_1, \ldots, \zeta_{r-1})^T$ and $\varpi_{is} = (\varpi_{i0}, \varpi_{i1}, \ldots, \varpi_{i(s-1)})^T$ are the coefficients of the fixed- and random-effects smoothing functions, respectively, and $r$ and $s$ are the numbers of basis functions used in the regression spline approximation. A regression spline allows flexible shapes for the time-effects and for the subject-specific evolution of the longitudinal outcome. Natural cubic splines are used for the bases $\Psi_r(t_{ij})$ and $\Xi_s(t_{ij})$ in (2), and percentiles are used to position the knots. Substituting $h(t_{ij})$ and $g_i(t_{ij})$ by their approximations $h_r(t)$ and $g_{is}(t)$, we obtain the following partially linear mixed-effects model for the eGFR process:
$$y_{ij} = z_{ij}^T(\beta + \varphi_i) + \Psi_r(t_{ij})^T \zeta_r + \Xi_s(t_{ij})^T \varpi_{is} + \epsilon_{ij} \tag{3}$$
Let $x_{ij} = (z_{ij}, \Psi_r(t_{ij}))$, $h_{ij} = (z_{ij}, \Xi_s(t_{ij}))$, $B = (\beta^T, \zeta_r^T)^T$, and $b_i = (\varphi_i^T, \varpi_{is}^T)^T$. Then, Equation (3) can be rewritten as
$$y_{ij} = x_{ij}^T B + h_{ij}^T b_i + \epsilon_{ij} \tag{4}$$
If we regard $x_{ij}$ and $h_{ij}$ in (4) as the fixed- and random-effects design vectors with associated parameter vectors $B$ and $b_i$, respectively, then model (4) has the form of a standard linear mixed-effects model. The error vector $\epsilon_i = (\epsilon_{i1}, \epsilon_{i2}, \ldots, \epsilon_{in_i})^T$ (within-subject variation) and the random-effects $b_i$ (between-subject variation) are assumed to be independent of each other. As presented graphically in Section 4.1, the longitudinal outcome eGFR follows an asymmetric (skewed) distribution. Hence, in this study, we assume that the model errors $\epsilon_i$ follow a multivariate skew-t distribution, $\epsilon_i \sim ST_{n_i, \kappa_\epsilon}(\mu_\epsilon, \Sigma_\epsilon, \delta_\epsilon)$, where $ST_{n_i, \kappa_\epsilon}(\cdot)$ is an $n_i$-variate skew-t distribution [45] with degrees of freedom $\kappa_\epsilon$ and zero mean error vector $\mu_\epsilon = 0$; $\Sigma_\epsilon = \sigma_\epsilon^2 I_{n_i}$ is the covariance matrix of the model errors, and $\delta_\epsilon = (\delta_{\epsilon 1}, \ldots, \delta_{\epsilon n_i})^T$ denotes a skewness vector. In practical implementation, if our interest is the skewness of the overall eGFR data set, i.e., if $\delta_{\epsilon 1} = \cdots = \delta_{\epsilon n_i} \equiv \delta_\epsilon$, then the skewness vector is $\delta_\epsilon \mathbf{1}_{n_i}$, where $\mathbf{1}_{n_i} = (1, \ldots, 1)^T$ is an $n_i \times 1$ vector of ones. We also assume that $b_i$ follows a multivariate normal distribution, $b_i \sim N_{m+s}(0, \Sigma_b)$, where $\Sigma_b$ is its unstructured covariance matrix.
Furthermore, to construct the bases in Equation (2) using natural cubic splines, let $R$ knots be placed at $\tau_1 < \cdots < \tau_R$, and let $\psi_r(t_{ij}) = \xi_s(t_{ij})$. Then, following [46] (Chapter 9), the natural cubic spline basis can be written as:
$$\psi_r(t_{ij}) = (t_{ij} - \tau_r)_+^3 - (t_{ij} - \tau_{R-1})_+^3 \, \frac{\tau_R - \tau_r}{\tau_R - \tau_{R-1}} + (t_{ij} - \tau_R)_+^3 \, \frac{\tau_{R-1} - \tau_r}{\tau_R - \tau_{R-1}}, \tag{5}$$
where $(t_{ij} - \tau_r)_+ = 0$ if $t_{ij} < \tau_r$, and $(t_{ij} - \tau_r)_+ = t_{ij} - \tau_r$ otherwise; $r = 1, \ldots, R-2$. The bases in Equation (5) are constrained to be linear outside the boundary knots $[\tau_1, \tau_R]$, so both the second and third derivatives of $\psi_r(t_{ij})$ must be zero at $\tau_1$ and $\tau_R$.
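As an illustration of Equation (5), the following minimal Python sketch evaluates the $R - 2$ truncated-power basis functions at a single time point. The function names `pos_cube` and `natural_spline_basis` are hypothetical and not taken from the paper; the knots are the values reported in Section 4.2 (0, 9, 25 and 96 months), used here only as an example.

```python
def pos_cube(x):
    """Truncated cubic (x)_+^3: zero when x <= 0, x**3 otherwise."""
    return x ** 3 if x > 0 else 0.0


def natural_spline_basis(t, knots):
    """Evaluate psi_1(t), ..., psi_{R-2}(t) of Equation (5) for sorted knots."""
    n_knots = len(knots)
    tau_last, tau_prev = knots[-1], knots[-2]
    denom = tau_last - tau_prev
    basis = []
    for tau_r in knots[:n_knots - 2]:
        value = (pos_cube(t - tau_r)
                 - pos_cube(t - tau_prev) * (tau_last - tau_r) / denom
                 + pos_cube(t - tau_last) * (tau_prev - tau_r) / denom)
        basis.append(value)
    return basis


knots = [0.0, 9.0, 25.0, 96.0]  # knot positions in months (Section 4.2)
print(natural_spline_basis(12.0, knots))
```

Because the cubic and quadratic terms cancel beyond the last knot, each basis function is linear there, which is the boundary constraint described above.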

2.2. The Competing Risks Failure Time Submodel

Suppose there are $Q$ different failure types that a subject may experience during the follow-up period; let $T_{i1}^*, T_{i2}^*, \ldots, T_{iQ}^*$ be the corresponding true failure times and $C_i$ the censoring time. Then, the observed event time for subject $i$ is $T_i = \min(T_{i1}^*, \ldots, T_{iQ}^*, C_i)$. Let $\rho_i \in \{0, 1, 2, \ldots, Q\}$ denote the event indicator, where $\rho_i = 0$ indicates a censored observation and $\rho_i = q$ indicates that the $i$th subject experienced the $q$th failure type during the follow-up period. We postulate a cause-specific hazard model for the $q$th failure type at time $t$, which is given by
$$\lambda_{iq}(t; w_i, R_{iq}) = \lambda_{0q}(t) \exp\left\{\gamma_q^T w_i + R_{iq}(t)\right\}, \quad q = 1, \ldots, Q \tag{6}$$
where $\lambda_{iq}(t; w_i, R_{iq})$ denotes the cause-specific instantaneous failure rate due to failure type $q$ at time $t$, given a vector of baseline covariates (possibly time-dependent) $w_i$ and the latent process $R_{iq}$; $\lambda_{0q}(t)$ denotes the corresponding baseline hazard function, and $\gamma_q$ denotes the coefficient vector of $w_i$ associated with the $q$th competing risk.
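The construction of the observed pair $(T_i, \rho_i)$ from latent failure times can be sketched as follows. This is a minimal illustration with a hypothetical function name `observe`; it assumes that an event occurring exactly at the censoring time counts as observed, a tie convention the paper does not specify.

```python
def observe(failure_times, censoring_time):
    """Return (T_i, rho_i): observed time and event indicator (0 = censored).

    failure_times[q - 1] is the latent time of failure type q.
    Assumption: an event tied with the censoring time counts as observed.
    """
    first = min(range(len(failure_times)), key=lambda q: failure_times[q])
    if failure_times[first] <= censoring_time:
        return failure_times[first], first + 1
    return censoring_time, 0


# Hypothetical subject: latent ESRD at 30 months, death at 12.5, censoring at 48.
print(observe([30.0, 12.5], 48.0))
```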
In the joint model construction, different association structures have been proposed in the literature to evaluate the association between the longitudinal outcome and the failure time processes, and these may lead to different statistical results and inferences. Taking this issue into account is therefore also one of the focuses of this study. Following [12,22,26,36,40,41], we considered the following joint modelling parameterizations to link the longitudinal eGFR process with the $q$th competing risks process:
(1) Random-effects parameterization (P1); see, e.g., [12,22]:
$$R_{iq}(t) = b_i^T \alpha_q. \tag{7}$$
(2) Random-effects with corresponding fixed-effects parameterization (P2) [40,41], a special case of the first parameterization approach:
$$R_{iq}(t) = \sum_{l=1}^{r} \alpha_{ql} \left(\tilde{\zeta}_l + b_{il}\right), \tag{8}$$
where $\tilde{\zeta}_l$ is the fixed-effect coefficient of the $l$th smoothing function of the time-effect on outcome eGFR, $l = 1, \ldots, r$.
(3) Current value parameterization (P3); see, e.g., [7,36]:
$$R_{iq}(t) = \alpha_q m_i(t), \tag{9}$$
where $m_i(t) = z_{ij}^T(\beta + \varphi_i) + f_i(t_{ij})$ is the true value of the longitudinal outcome in model (1).
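The three association structures differ only in what enters the log-hazard. A minimal sketch, with hypothetical function names and made-up numeric values, is:

```python
def r_p1(b_i, alpha_q):
    """P1: shared random effects, R_iq = b_i^T alpha_q."""
    return sum(a * b for a, b in zip(alpha_q, b_i))


def r_p2(b_i, zeta_tilde, alpha_q):
    """P2: random effects plus their fixed-effect counterparts."""
    return sum(a * (z + b) for a, z, b in zip(alpha_q, zeta_tilde, b_i))


def r_p3(m_i_t, alpha_q):
    """P3: current value, R_iq(t) = alpha_q * m_i(t) with scalar alpha_q."""
    return alpha_q * m_i_t


# Hypothetical values, for illustration only.
b_i = [0.5, -0.2]
print(r_p1(b_i, [1.0, 2.0]))
print(r_p2(b_i, [1.0, 0.3], [1.0, 2.0]))
print(r_p3(3.0, -0.5))
```

Under P1 and P2 the term does not change with $t$, whereas under P3 it follows the subject's trajectory $m_i(t)$.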
The parameter vector $\alpha_q$ denotes the level of association between the $q$th competing event and the subject-specific longitudinal process, $q = 1, \ldots, Q$. Given $b_i$, the time to each competing risk ($T_i$) and the longitudinal outcome ($y_{ij}$) are assumed to be conditionally independent in the first two parameterizations. The corresponding failure event-$q$ free survival function, $S_q(t)$, that is, the probability that a subject either survives beyond time $t$ or has previously experienced an event from a failure type other than $q$, is given by
$$S_q(t) = \exp\left\{-\int_0^t \lambda_{0q}(s) \exp\left\{\gamma_q^T w_i + R_{iq}(s)\right\} ds\right\}, \tag{10}$$
and the density function for the $q$th competing risk failure time $T_i$ of the $i$th subject is given by:
$$f_q(T_i, \rho_i \mid \theta_t) = \lambda_{0q}^{\rho_i}(T_i) \exp\left\{\rho_i\left(\gamma_q^T w_i + R_{iq}(t)\right)\right\} \times \exp\left\{-\sum_{q=1}^{Q} \int_0^{T_i} \lambda_{0q}(s) \exp\left\{\gamma_q^T w_i + R_{iq}(s)\right\} ds\right\} \tag{11}$$
where $\rho_i$ is the censoring indicator for subject $i$. We used a piecewise-constant function [47,48] to model the $q$th baseline hazard function $\lambda_{0q}(t)$ flexibly and to accommodate various hazard shapes [49,50]. To construct such a step function, suppose that the observed time axis of the $q$th competing risk is partitioned into $D$ intervals, $(s_{q0}, s_{q1}], (s_{q1}, s_{q2}], \ldots, (s_{q,D-1}, s_{qD}]$, where $0 = s_{q0} < s_{q1} < s_{q2} < \cdots < s_{q,D-1} < s_{qD} < \infty$. The quantiles of the observed event times can be used to select the $D$ intervals. Then, the baseline hazard function for the $q$th competing risk is defined as $\lambda_{0q}(t) = \lambda_{qd}$ for $t \in (s_{q,d-1}, s_{q,d}]$, $d = 1, \ldots, D$.
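A piecewise-constant baseline hazard makes the integral in Equation (10) available in closed form whenever the term $\gamma_q^T w_i + R_{iq}$ does not vary with time. The sketch below illustrates this case; the function names, cut points and hazard levels are hypothetical and chosen only for the example.

```python
import math


def baseline_hazard(t, cuts, rates):
    """lambda_0q(t) = rates[d - 1] for t in (cuts[d - 1], cuts[d]]."""
    for d in range(1, len(cuts)):
        if cuts[d - 1] < t <= cuts[d]:
            return rates[d - 1]
    raise ValueError("t lies outside the partitioned time axis")


def cumulative_hazard(t, cuts, rates, lin_pred=0.0):
    """Integrate lambda_0q(s) * exp(lin_pred) from cuts[0] to t.

    Assumes lin_pred (covariate and association terms) is constant in time.
    """
    total = 0.0
    for d in range(1, len(cuts)):
        overlap = min(t, cuts[d]) - cuts[d - 1]
        if overlap > 0:
            total += rates[d - 1] * overlap
    return total * math.exp(lin_pred)


def survival(t, cuts, rates, lin_pred=0.0):
    """Event-free survival exp(-cumulative hazard), as in Equation (10)."""
    return math.exp(-cumulative_hazard(t, cuts, rates, lin_pred))


cuts = [0.0, 12.0, 24.0, 48.0]   # hypothetical cut points in months
rates = [0.01, 0.02, 0.04]       # hypothetical hazard level per interval
print(cumulative_hazard(30.0, cuts, rates), survival(30.0, cuts, rates))
```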

3. Bayesian Inference

Statistical inference on all the parameters in the longitudinal and competing risks models needs to be made simultaneously in order to adequately capture the underlying associations. The computational burden to make simultaneous inference based on the joint likelihood of the proposed joint models with skew-distributions can be extremely intensive, and may have convergence problems [13,51]. However, the Bayesian approach can reduce the computational burden and allow us to incorporate prior information for the unknown parameters as well. Thus, in this paper, we used a fully Bayesian approach to estimate the parameters simultaneously.
Recall that we assumed a multivariate skew-t distribution for the model errors $\epsilon_i = (\epsilon_{i1}, \ldots, \epsilon_{in_i})^T$ of the longitudinal outcome. For simplicity, let $y_i = (y_{i1}, y_{i2}, \ldots, y_{in_i})^T$ be the longitudinal outcomes of the $i$th subject defined in model (4). Then, to specify this model for MCMC computation, we introduce a random vector $S_{\epsilon_i} = (S_{\epsilon_{i1}}, \ldots, S_{\epsilon_{in_i}})^T$ and a random variable (scaling weight) $u_{\epsilon_i}$, based on the stochastic representation of a skew-t distribution [29,45,52], to represent the skew-t distribution hierarchically. Thus, we hierarchically reformulate the longitudinal outcome submodel (4) and the competing risks failure time submodel (6) jointly as follows:
$$\begin{aligned}
y_i \mid b_i, S_{\epsilon_i}, u_{\epsilon_i}; B, \sigma_\epsilon^2, \Sigma_b, \delta_\epsilon &\sim N_{n_i}\left(X_i B + H_i b_i + \delta_\epsilon S_{\epsilon_i},\; u_{\epsilon_i}^{-1} \sigma_\epsilon^2 I_{n_i}\right), \\
S_{\epsilon_i} \mid u_{\epsilon_i} &\sim N_{n_i}\left(0,\; u_{\epsilon_i}^{-1} I_{n_i}\right) I(S_{\epsilon_i} > 0), \\
u_{\epsilon_i} \mid \kappa_\epsilon &\sim \Gamma(\kappa_\epsilon/2, \kappa_\epsilon/2), \\
b_i \mid \Sigma_b &\sim N_{m+s}(0, \Sigma_b), \\
T_i \mid b_i; \lambda_q, \gamma_q, \alpha_q &\sim F(T_i \mid \cdot) = \int^{T_i} f(t \mid b_i; \lambda_q, \gamma_q, \alpha_q)\, dt, \quad q = 1, \ldots, Q,
\end{aligned}$$
where $N_{n_i}(\cdot)$ represents the $n_i$-variate normal distribution, and the marginal specification for $S_{\epsilon_i}$ is simply a scaling-weight multiple of an $n_i$-variate truncated standard normal distribution with indicator function $I(S_{\epsilon_i} > 0)$ [29,31].
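The hierarchical representation can be read as a three-step sampling recipe. The following sketch draws a single univariate skew-t error in that way. The function name `sample_skew_t_error` is hypothetical, and the sketch assumes that $\Gamma(\kappa_\epsilon/2, \kappa_\epsilon/2)$ is parameterized by shape and rate, the usual mixing distribution for a Student-t; the paper does not state the parameterization explicitly.

```python
import math
import random


def sample_skew_t_error(delta, sigma, kappa, rng):
    """Draw one skew-t error through the hierarchy u -> S -> error.

    Assumption: Gamma(kappa/2, kappa/2) is shape/rate, so the scale
    argument of gammavariate is 2/kappa.
    """
    u = rng.gammavariate(kappa / 2.0, 2.0 / kappa)     # scaling weight
    s = abs(rng.gauss(0.0, 1.0 / math.sqrt(u)))        # N(0, 1/u) truncated to s > 0
    return rng.gauss(delta * s, sigma / math.sqrt(u))  # N(delta * s, sigma^2 / u)


rng = random.Random(2022)
skewed = [sample_skew_t_error(2.0, 1.0, 5.0, rng) for _ in range(20000)]
symmetric = [sample_skew_t_error(0.0, 1.0, 5.0, rng) for _ in range(20000)]
print(sum(skewed) / len(skewed), sum(symmetric) / len(symmetric))
```

A positive skewness parameter shifts mass to the right, while setting it to zero recovers a symmetric Student-t error.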
The overall set of parameters of the two models is given by
$$\Omega = \{B, \sigma_\epsilon^2, \Sigma_b, \delta_\epsilon, \kappa_\epsilon, \lambda_1, \ldots, \lambda_Q, \gamma_1, \ldots, \gamma_Q, \alpha_1, \ldots, \alpha_Q\},$$
where $Q$ is the number of competing risks in the time-to-events process. We assume independent multivariate normal prior distributions for $B$, $\delta_\epsilon$, $\gamma_1, \ldots, \gamma_Q$, $\alpha_1, \ldots, \alpha_Q$. Inverse-gamma (IG) and inverse-Wishart (IW) priors are assumed for the within-subject variance of the longitudinal outcome, $\sigma_\epsilon^2$, and the random effects’ covariance matrix, $\Sigma_b$, respectively. A truncated exponential (Exp) prior is assumed for the degrees of freedom $\kappa_\epsilon$. We propose independent gamma (G) priors for the piecewise baseline hazards $\lambda_q$, with shape parameter $\omega_{qd1} > 0$ and scale parameter $\omega_{qd2} > 0$. Thus, the prior distributions can be written in the following way:
$$\begin{aligned}
B &\sim N_{l+r}(\beta_0, \Lambda_B), \quad \sigma_\epsilon^2 \sim IG(\phi_{\epsilon 1}, \phi_{\epsilon 2}), \quad \Sigma_b \sim IW_{m+s}(D_b, \nu_b), \\
\delta_\epsilon &\sim N(0, \Gamma_{\delta_\epsilon}), \quad \kappa_\epsilon \sim Exp(\kappa_{\epsilon 0})\, I(\kappa_\epsilon > 3), \\
\lambda_{qd} &\sim G(\omega_{qd1}, \omega_{qd2}), \quad \gamma_q \sim N_p(\gamma_{q0}, \Lambda_{\gamma_q}), \quad \alpha_q \sim N_q(\alpha_{q0}, \Lambda_{\alpha_q}).
\end{aligned}$$
For convenient implementation, we assume that the hyperparameter matrices $\Lambda_B$, $D_b$, $\Lambda_{\gamma_q}$ and $\Lambda_{\alpha_q}$ are diagonal. Assuming that the parameters in $\Omega$ are independent of one another, the joint prior distribution of $\Omega$ can be defined as follows:
$$\pi(\Omega) = \pi(B)\, \pi(\sigma_\epsilon^2)\, \pi(\Sigma_b)\, \pi(\delta_\epsilon)\, \pi(\kappa_\epsilon)\, \pi(\lambda_q)\, \pi(\gamma_q)\, \pi(\alpha_q).$$
Let $D_{obs} = \{y, X, T, \rho, w\}$ denote the overall observed data for CKD patients, which comprise data on the longitudinal outcome eGFR, covariates in both submodels, observed failure times, and event indicators. The joint likelihood function of the observed data can be defined by
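$$f(D_{obs} \mid \Omega) = \prod_{i=1}^{n} \int_{b_i} f(y_i \mid b_i, x_i, S_{\epsilon_i}, u_{\epsilon_i}; \theta_y)\, f(b_i \mid \Sigma_b) \times f(S_{\epsilon_i} \mid u_{\epsilon_i}, S_{\epsilon_i} > 0)\, f(u_{\epsilon_i})\, f(T_i, \rho_i \mid b_i, w_i; \theta_t)\, f(b_i)\, db_i$$
where $f(\cdot)$ and $f(\cdot \mid \cdot)$ are the marginal and conditional density functions, respectively. Subsequently, the joint posterior density of $\Omega$ can be approximated by
$$\begin{aligned}
\pi(\Omega \mid D_{obs}) &\propto f(D_{obs} \mid \Omega) \times \pi(\Omega) \\
&\propto \int_{b} \left(\sigma_\epsilon^2\right)^{-n n_i/2} \exp\left\{-\sum_{i=1}^{n} \frac{u_{\epsilon_i}}{2\sigma_\epsilon^2} (y_i - \mu_{y_i})^T (y_i - \mu_{y_i})\right\} \\
&\quad \times |\Sigma_b|^{-n/2} \exp\left\{-\frac{1}{2} \sum_{i=1}^{n} b_i^T \Sigma_b^{-1} b_i\right\} \\
&\quad \times \exp\left\{-\frac{1}{2} \sum_{i=1}^{n} u_{\epsilon_i} S_{\epsilon_i}^T S_{\epsilon_i}\right\} \frac{1}{\Gamma(\kappa_\epsilon/2)\,(\kappa_\epsilon/2)^{\kappa_\epsilon/2}} \prod_{i=1}^{n} u_{\epsilon_i}^{\kappa_\epsilon/2 - 1} \exp\left\{-\frac{2}{\kappa_\epsilon} \sum_{i=1}^{n} u_{\epsilon_i}\right\} \\
&\quad \times \prod_{i=1}^{n} \prod_{q=1}^{Q} \lambda_{0q}^{\rho_i}(T_i) \exp\left\{\rho_i\left(\gamma_q^T w_i + R_{iq}(t)\right)\right\} \times \exp\left\{-\sum_{q=1}^{Q} \int_0^{T_i} \lambda_{0q}(s) \exp\left\{\gamma_q^T w_i + R_{iq}(s)\right\} ds\right\} db_i \\
&\quad \times \exp\left\{-\frac{1}{2} (B - \beta_0)^T \Lambda_B^{-1} (B - \beta_0)\right\} \times \left(\sigma_\epsilon^2\right)^{-\phi_{\epsilon 1} - 1} \exp\left(-\phi_{\epsilon 2}/\sigma_\epsilon^2\right) \\
&\quad \times |\Sigma_b|^{-(\nu_b + m + s + 1)/2} \exp\left\{-\frac{1}{2} \operatorname{tr}\left(D_b \Sigma_b^{-1}\right)\right\} \times \exp\left\{-\frac{1}{2\Gamma_{\delta_\epsilon}} \delta_\epsilon^2\right\} \times \exp\left\{-\kappa_{\epsilon 0} \kappa_\epsilon\right\} \\
&\quad \times \lambda_{qd}^{\omega_{qd1} - 1} \exp\left\{-\omega_{qd2} \lambda_{qd}\right\} \times \exp\left\{-\frac{1}{2} (\gamma_q - \gamma_{q0})^T \Lambda_{\gamma_q}^{-1} (\gamma_q - \gamma_{q0})\right\} \\
&\quad \times \exp\left\{-\frac{1}{2} (\alpha_q - \alpha_{q0})^T \Lambda_{\alpha_q}^{-1} (\alpha_q - \alpha_{q0})\right\}
\end{aligned}$$
where $\mu_{y_i} = X_i B + H_i b_i + \delta_\epsilon S_{\epsilon_i}$. The Metropolis–Hastings algorithm within a Gibbs sampler can be used to draw samples from the full conditional posterior distributions of the parameters (Appendix A) and to estimate their posterior means and standard deviations. For all models, the Markov chain Monte Carlo (MCMC) procedure was implemented using WinBUGS software version 1.4.3.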
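To make the sampling scheme concrete, the sketch below runs a Metropolis–Hastings step inside a Gibbs sampler on a deliberately small toy model, not on the joint model of this paper: normal data with unknown mean and variance. The mean is updated by a random-walk Metropolis step under an assumed flat prior, and the variance is drawn from its inverse-gamma full conditional under an assumed $IG(0.01, 0.01)$ prior. All names and settings are hypothetical; the paper's own sampler was run in WinBUGS.

```python
import math
import random


def log_cond_mu(mu, y, sigma2):
    """Log full conditional of mu up to a constant (flat prior assumed)."""
    return -sum((v - mu) ** 2 for v in y) / (2.0 * sigma2)


def metropolis_within_gibbs(y, n_iter, rng, step=0.3, a=0.01, b=0.01):
    """Alternate a random-walk MH update for mu with a Gibbs draw for sigma2."""
    mu, sigma2 = 0.0, 1.0
    n = len(y)
    chain = []
    for _ in range(n_iter):
        # Metropolis-Hastings step: symmetric proposal, accept with min(1, ratio).
        proposal = mu + rng.gauss(0.0, step)
        log_ratio = log_cond_mu(proposal, y, sigma2) - log_cond_mu(mu, y, sigma2)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            mu = proposal
        # Gibbs step: sigma2 | mu, y ~ IG(a + n/2, b + SS/2).
        ss = sum((v - mu) ** 2 for v in y)
        shape, rate = a + n / 2.0, b + ss / 2.0
        sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        chain.append((mu, sigma2))
    return chain


rng = random.Random(1)
y = [rng.gauss(3.0, 2.0) for _ in range(200)]   # simulated toy data
chain = metropolis_within_gibbs(y, 4000, rng)
kept = chain[1000:]                              # discard burn-in
post_mu = sum(m for m, _ in kept) / len(kept)
post_sigma2 = sum(s for _, s in kept) / len(kept)
print(post_mu, post_sigma2)
```

In the joint model, the same alternation applies parameter block by parameter block: blocks with a recognizable full conditional are drawn directly, and the remaining ones are updated with a Metropolis–Hastings step.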

4. Application to Chronic Kidney Disease (CKD) Data

4.1. Motivating CKD Data

The motivating application data for this study were chronic kidney disease (CKD) longitudinal and failure time data. CKD is becoming a major global health issue, with an 82 percent increase since 1990 [53]. It affects around 500 million people worldwide, with 80 percent of them living in low-and middle-income countries [53]. In Ethiopia, the overall prevalence of CKD in patients with diabetes was estimated to be 35.52 percent, and 14.5 percent for CKD stages three to five [54]. According to a study conducted in Northwest Ethiopia, the prevalence of CKD stages 3 to 5 was 17.3 percent [55]. The progressive loss of kidney function of a CKD patient is measured through the estimated glomerular filtration rate (eGFR).
The data were gathered from CKD patients at the University of Gondar Comprehensive Specialized Hospital in Ethiopia. Among CKD patients treated in the hospital’s renal clinic, those who had three or more visits within the last eight years (from the initial diagnostic period to the end of the data collection period on 5 June 2022) are included in this paper. For CKD patients treated in the hemodialysis unit, only pre-dialysis measurement profiles are included. Repeatedly measured biomarkers were also available in the follow-up study, which could be used to demonstrate the proposed methods. Patients with possible acute kidney injury (AKI) and improving kidney function (estimated GFR of more than ninety for subsequent visits/measurements) and those with fewer than three visits are excluded from this study. Accordingly, 189 patients who met the inclusion and exclusion criteria described above are included in this paper to demonstrate the proposed methods.
Data on baseline socio-demographic characteristics (such as age and gender), comorbidities (such as diabetes and hypertension), repeatedly measured biomarkers of kidney function (such as serum creatinine and systolic and diastolic blood pressure records), time-to-events (death and/or ESRD) and time-to-right censoring were collected from patients’ profiles and medical records. Of the total 189 patients, 107 (56.6%) were male, 65 (34.4%) had baseline hypertension (i.e., BP ≥ 140/90 mmHg), and 45 (23.81%) had diabetes. The mean age of the patients was 55.1 years (standard deviation: 15.2 years). During the follow-up, 59 (31.2%) CKD patients experienced end-stage renal disease (ESRD), 40 (21.2%) died, and the remaining 90 (47.6%) were censored (those who were still under follow-up with eGFR > 15 mL/min per 1.73 m², and those lost to follow-up).
The longitudinal outcome variable is eGFR. Serum creatinine is usually measured (mg/dL) at each visit, and we utilized it to compute the eGFR values using the MDRD equation [56,57]. In general, the number of measures, visiting times, and time intervals between two consecutive visits differed for each patient. Some patients had their next follow-up within six months, a year, or even more than a year, despite the hospital offering a scheduled regular appointment. The frequency of this depends on the severity/stage of the CKD and the presence of other comorbidities. The median (mean) follow-up time was 16.97 (20.59) months (interquartile range: 5.33–30.67 months). In this study, patients with an eGFR value of less than ninety are included in the analysis to capture and properly model a wide range of possible trajectories of a patient’s kidney function over time.
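For readers unfamiliar with the conversion from serum creatinine to eGFR, a minimal sketch follows. The paper does not state which version of the MDRD equation [56,57] was applied; the sketch assumes the four-variable form for standardized (IDMS-traceable) creatinine, which uses the constant 175 (the earlier form uses 186). The function name and arguments are hypothetical.

```python
def egfr_mdrd(serum_creatinine_mg_dl, age_years, female, black=False):
    """Four-variable MDRD estimate of GFR in mL/min per 1.73 m^2.

    Assumption: standardized-creatinine version with constant 175.
    """
    egfr = 175.0 * serum_creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


# Hypothetical patient: 50-year-old man with serum creatinine 1.0 mg/dL.
print(egfr_mdrd(1.0, 50, female=False))
```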
Figure 1 shows the cumulative incidence function (CIF) plots of ESRD and death as well as the trajectories and distribution of the outcome eGFR over time. As the figure shows, the CIFs of ESRD and death increase rapidly from about 25 months after the first visit. It also shows a nonlinear trajectory of the outcome eGFR over time (see plots (b) and (c)) and an asymmetric (left-skewed) distribution of the outcome eGFR after a log-transformation. We further examined the normality of the outcome eGFR before and after a log-transformation using the Shapiro–Wilk test, and the results (not shown here) revealed that the normality assumption was violated.

4.2. Implementation of the Models

For the CKD data, we provide here specific formulations of the general longitudinal and competing risks models discussed in Section 2. Based on the data set, we consider diabetes (Diab), hypertension (HTN), and measurement time as covariates for the longitudinal outcome eGFR ( y i j ). Thus, the general longitudinal outcome submodel (4) is specified as follows:
l o g ( e G F R i j ) = β 1 + β 2 D i a b i j + β 3 H T N i j + b i 1 + ( ζ 1 + b i 2 ) ψ 1 ( t i j ) + ( ζ 2 + b i 3 ) ψ 2 ( t i j ) + ( ζ 3 + b i 4 ) ψ 3 ( t i j ) + e i j
where β = ( β 1 , β 2 , β 3 ) T and ζ = ( ζ 1 , ζ 2 , ζ 3 ) T are population parameters (fixed-effects); b i = ( b i 1 , b i 2 , b i 3 , b i 4 ) T is a vector of random-effects; ψ 1 ( t i j ) , ψ 2 ( t i j ) and ψ 3 ( t i j ) are natural cubic spline bases obtained based on Equation (5) and utilized in the regression spline approximation of the nonlinear time effects. We considered two internal knots at 9 and 25 months and two boundary knots at 0 and 96 months, and the quantiles of the distribution of measurement time points were used to set the knots’ locations.
In this study, we consider two competing risks failure times, such as time-to-ESRD ( T 1 * ) as an event of interest and time-to-death ( T 2 * ) as competing event. The time until a patient withdraws from the study or does not yet experience either of the two events at the end of the data collection period is considered as a right-censoring time (C). Then, for the i t h patient, the observed event time ( T i ) is computed as T i = m i n ( T 1 * , T 1 * , C i ) . Age, gender, diabetes and hypertension status patients are considered as baseline predictors of the instantaneous failure rates. By considering the first parameterization approach here, we assume that the random-effects b i which represent a subject-specific longitudinal process of eGFR affect the distribution of the time(Ti)-to-ESRD/death. Because of the high dimension of the random-effects from the longitudinal submodel, we only included b i 1 (the subject-specific intercept of eGFR) and b i 2 (the slope of ψ 1 ( t i j ) into the competing risks submodel. Thus, for Q = 2 failure types, the general competing-risks model (6) can be reformulated as:
λ i , E S R D ( t ) = λ 01 ( t ) exp γ 11 A g e i + γ 12 G e n d e r i + γ 13 D i a b i + γ 14 H T N i + α 11 b i 1 + α 12 b i 2 λ i , d e a t h ( t ) = λ 02 ( t ) exp γ 21 A g e i + γ 22 G e n d e r i + γ 23 D i a b i + γ 24 H T N i + α 21 b i 1 + α 22 b i 2
where the coefficient parameter $\alpha_{qr}$ represents the level of association between the $q$th competing event and the $r$th subject-specific longitudinal eGFR process $b_{ir}$, $q = 1, 2$; $r = 1, 2$. The baseline hazard functions $\lambda_{01}(t)$ and $\lambda_{02}(t)$ are specified using piecewise-constant functions to accommodate various hazard shapes. Four intervals are defined by the cut points 4, 17.93, 30.00, 48.00 and 95.7 months, which were chosen from the quantiles of the observed event times. Thus, $\lambda_{0q}(t) = \lambda_{qd}$ for $t \in (s_{q,d-1}, s_{q,d}]$, $d = 2, \ldots, 5$ time points, and then $\lambda_q = (\lambda_{q1}, \lambda_{q2}, \lambda_{q3}, \lambda_{q4})$, $q = 1, 2$. In general, there is no closed-form solution for the integral in the likelihood function of the $q$th competing risk failure time $T_i$ defined in Equation (11). As a result, following [40], we employed a 15-point Gauss–Kronrod rule to approximate the integral numerically. We implement and compare the following models with different parameterizations and distributional specifications of model errors.
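The piecewise-constant baseline hazard described above can be sketched as follows. The cut points are those quoted in the text; the hazard levels in `LAM` are hypothetical placeholders rather than estimates from the paper, and returning 0.0 outside the grid is an assumption made only so that the function is defined everywhere.

```python
from bisect import bisect_left

CUTS = [4.0, 17.93, 30.0, 48.0, 95.7]   # cut points quoted in the text (months)
LAM = [0.02, 0.03, 0.05, 0.04]          # hypothetical hazard levels, one per interval


def baseline_hazard(t, cuts=CUTS, lam=LAM):
    """Hazard on (cuts[d-1], cuts[d]]; 0.0 outside the grid (an assumption)."""
    d = bisect_left(cuts, t)            # smallest index d with t <= cuts[d]
    if d == 0 or d == len(cuts):
        return 0.0
    return lam[d - 1]


def cumulative_hazard(t, cuts=CUTS, lam=LAM):
    """Integral of the baseline hazard from the first cut point up to t."""
    total = 0.0
    for d in range(1, len(cuts)):
        lo, hi = cuts[d - 1], cuts[d]
        # Length of the overlap between (lo, hi] and (lo, t]
        total += lam[d - 1] * max(0.0, min(t, hi) - lo)
    return total
```

Because the hazard is constant within each interval, the baseline cumulative hazard is a sum of rectangle areas. Numerical quadrature is still needed in the joint model whenever the linear predictor inside the integral varies with time.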

4.3. Model Comparison

Choice of parameterization: We first compare joint models with different parameterizations (association structures), as described in Section 2.2, but with the same distribution of model errors in the longitudinal submodel, namely a multivariate skew-t distribution.
  • SPJM-P1: A semiparametric joint model (SPJM) with a shared random-effects parameterization (P1) defined in Equation (7);
  • SPJM-P2: An SPJM with a special case of random-effects parameterization (P2) defined in Equation (8);
  • SPJM-P3: An SPJM with current value parameterization (P3) defined in Equation (9).
Distribution specification: We further compare the chosen model among the above three joint models with a joint model that consists of a Gaussian longitudinal submodel.
To conduct the Bayesian inference, weakly-informative prior distributions were used for the parameters. In particular, an independent $N(0, 100)$ prior is assumed for each element of the fixed-effects coefficient vectors $\beta$, $\zeta$, $\gamma_1$, $\gamma_2$, $\alpha_1$ and $\alpha_2$, the skewness parameter $\delta_e$, and the piecewise baseline hazard parameters $\lambda_1$ and $\lambda_2$. $IG(0.01, 0.01)$ and $IW(D_b, 4)$ prior distributions are assumed for the scale parameter $\sigma_e^2$ and the variance–covariance matrix $\Sigma_b$, respectively, where $D_b = \mathrm{diag}(0.01, 0.01, 0.01, 0.01)$. A truncated exponential distribution $Exp(0.5)$ is assumed for the degrees of freedom parameter $\kappa_\epsilon$.
Three chains, each with 90,000 iterations and a burn-in of 45,000, were run with the MCMC method in WinBUGS version 1.4.3. Keeping every 30th sample from the remaining 45,000 iterations of each chain gave 4500 posterior samples in total. Convergence was assessed using both graphical diagnostics (trace plots, Brooks–Gelman–Rubin (BGR) diagnostic plots, and autocorrelation plots) and the Geweke diagnostic test [58]. Figure 2 presents these plots, which indicate convergence. Furthermore, Table 1 shows Geweke's test of convergence, whose null hypothesis is that the mean estimates from the early and late parts of the Markov chain are the same (chain convergence). None of the absolute values of Geweke's test statistics for the parameters exceeded the 95% critical value of 1.96, so the test provides no evidence against convergence.
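The sample-size arithmetic and the idea behind the Geweke test can be sketched as below. This is a simplified illustration, not the WinBUGS or CODA implementation: the standard Geweke statistic uses spectral-density estimates of the variance, whereas this sketch uses plain sample variances, which is only reasonable when the thinned draws are roughly uncorrelated. The function names and the 10%/50% window fractions are assumptions.

```python
from statistics import mean, variance


def retained_draws(n_chains, n_iter, burn_in, thin):
    """Number of posterior draws kept after burn-in and thinning."""
    return n_chains * ((n_iter - burn_in) // thin)


def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke z-score comparing early and late segments of one chain.

    Assumption: sample variances stand in for spectral-density estimates.
    """
    n = len(chain)
    early = chain[: int(first * n)]
    late = chain[n - int(last * n):]
    se = (variance(early) / len(early) + variance(late) / len(late)) ** 0.5
    return (mean(early) - mean(late)) / se


total = retained_draws(n_chains=3, n_iter=90_000, burn_in=45_000, thin=30)
```

A value of `|z|` below 1.96 means the early and late segment means are not significantly different at the 5% level.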

4.4. Data Analysis Results

In this section, we present the analysis of the CKD data using the aforementioned joint models under different parameterizations and distributional specifications, compare the models, and interpret the results of the best model. In the comparison of joint models, we first compare the joint models with the specified parameterizations. Table 2 presents the posterior mean estimates of the parameters and their standard deviations and 95% credible intervals from the proposed joint models with three parameterizations. We used the same MCMC set-up (i.e., same number of chains, initial values, burn-in and thinning specifications) to fit and compare all the proposed models.
The models yield marginally different estimates, most of which are significant since their 95% credible intervals (CI) do not include zero (Table 2). Overall, the findings in Table 2 indicate that the magnitudes of the estimates increase from the first joint model (SPJM-P1) to the third one (SPJM-P3). The estimates from the first two joint models are approximately similar, whereas those from the third joint model are larger and may be overestimated. For instance, the magnitudes of the estimates of $\beta_1$, $\beta_2$, $\sigma_\epsilon^2$ and $\delta_\epsilon$ increased from 4.221, 0.127, 0.006 and 0.365 in model SPJM-P1 to 5.891, 0.358, 0.009 and 0.523 in model SPJM-P3, respectively. Additionally, model SPJM-P3 provides larger estimates of the competing risk submodel parameters $\gamma_{12}$, $\gamma_{14}$ and $\gamma_{21}$ than model SPJM-P1. The estimates of the association parameters $\alpha$ are only comparable between the first two joint models because the third joint model has a different number of association parameters.
Furthermore, to choose the best Bayesian joint model parameterization which fits the data well, we also computed the deviance information criterion (DIC) [59], and found that the first joint model (SPJM-P1) gives a smaller DIC value (DIC = 3835) compared to the other joint models SPJM-P2 (DIC = 4064) and SPJM-P3 (DIC = 4111).
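For reference, the standard definition of the DIC, which is the form reported by WinBUGS, is sketched below. The notation ($D$, $\bar{D}$, $p_D$, $\bar{\theta}$) is introduced here for illustration and is not taken from the paper's own equations.

```latex
\begin{aligned}
D(\theta) &= -2 \log p(y \mid \theta), \\
\bar{D} &= E_{\theta \mid y}\left[ D(\theta) \right], \\
p_D &= \bar{D} - D(\bar{\theta}), \\
\mathrm{DIC} &= \bar{D} + p_D .
\end{aligned}
```

Here $\bar{\theta}$ is the posterior mean of the parameters and $p_D$ acts as an effective number of parameters, so smaller DIC values indicate a better trade-off between fit and complexity. With the values reported above, SPJM-P1 is lower than SPJM-P2 by $4064 - 3835 = 229$ and lower than SPJM-P3 by $4111 - 3835 = 276$.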
Having chosen the joint model with the first parameterization approach (SPJM-P1) as the best-fitting of the three, we then compare it with a Gaussian joint model. As previously mentioned, all three joint models were fitted by assuming a multivariate skew-t (ST) distribution for the model errors of the longitudinal submodel. We refitted the chosen joint model assuming a multivariate normal (N) distribution for the model errors of the longitudinal submodel, and the results are presented in Table 3.
Table 3 shows that the Gaussian joint model (SPJM-P1N) provides parameter estimates of larger magnitude than the skew-t joint model (SPJM-P1ST). For instance, the posterior mean estimates of $\beta_2$ and $\zeta_3$ changed from $-0.127$ and $-2.864$ to $-0.176$ and $-3.398$, respectively; that is, their magnitudes increased from 0.127 and 2.864 to 0.176 and 3.398. In particular, the estimated scale parameters (variances) of the model errors ($\hat{\sigma}_\epsilon^2$) and of each random-effect component ($\hat{\sigma}_{b_1}^2, \ldots, \hat{\sigma}_{b_4}^2$) from SPJM-P1N are much larger than those from SPJM-P1ST. Furthermore, SPJM-P1ST has a smaller DIC value (3835) than SPJM-P1N (5143). The posterior mean of the skewness parameter is negative and significantly different from zero ($\hat{\delta}_\epsilon = -0.365$; 95% CI: $-0.448$, $-0.282$), which reveals negative skewness in the log(eGFR) data. The relatively small posterior means of the variances and the significant skewness estimate demonstrate how important it is to take skewness into account in the longitudinal data analysis. As a result, the joint model with the skew-t distribution and shared random-effects parameterization appears to be the best-fitting joint model, outperforming the joint models with the other parameterizations and with the Gaussian distribution. The findings of the chosen joint model (SPJM-P1ST) are therefore used for further interpretation and discussion.
The results of all models show that the estimates of the parameters $\alpha_{11}, \ldots, \alpha_{22}$, which quantify the association between the longitudinal eGFR and the competing-risks (ESRD and death) failure time processes, are significantly different from zero. That is, for the study's CKD data, a joint modelling approach is preferable to modelling the two processes separately. The negative posterior mean estimates indicate that the patient-specific eGFR process has a negative association with the risks of ESRD and death. Generally speaking, the risk of experiencing ESRD and/or dying increased as the log(eGFR) measurements decreased. We also fitted SPJM-P1 with the association parameters set to zero (separate models) and computed the DIC to further support the better performance of the joint modelling. We found that the joint model (SPJM-P1) has a smaller DIC value (3835) than the separate model (DIC = 4728).
As the results of the chosen model show (Table 3), the covariates included in this study, namely diabetes ($\beta_2$), hypertension ($\beta_3$), and the spline basis functions of the measurement time ($\zeta_1$, $\zeta_2$, $\zeta_3$), are significant predictors of, and have a negative association with, the longitudinal outcome eGFR because their 95% CIs do not include zero. These covariates all lead to a decline in log(eGFR). For example, the hypertension coefficient ($\hat{\beta}_3 = -0.145$, 95% CI: [$-0.188$, $-0.104$]) means that the mean log-transformed eGFR of a CKD patient with hypertension is 0.145 units lower than that of a patient without hypertension, holding the effects of the other covariates constant. The posterior means of the parameters of baseline age ($\gamma_{11}$, $\gamma_{21}$) and gender ($\gamma_{12}$, $\gamma_{22}$) in the competing risks submodels are not significantly different from zero. However, in addition to the patient-specific eGFR values, diabetes ($\gamma_{13}$, $\gamma_{23}$) and hypertension ($\gamma_{14}$, $\gamma_{24}$) are significantly and positively associated with the failure events, ESRD and death, because their 95% CIs do not include zero. In other words, patients with diabetes and/or hypertension had a higher risk of dying or developing ESRD than patients without diabetes or hypertension. For instance, the coefficient of diabetes for the risk of ESRD, $\hat{\gamma}_{13} = 1.92$ [$HR_{13} = 6.82$, 95% CI (1.95, 27.11)], means that at any given time the hazard of ESRD for CKD patients with diabetes is around 6.82 times that of patients without diabetes, holding the effects of the other predictors constant.
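The interpretation of these coefficients rests on simple exponentiation, which the following sketch reproduces from the quoted estimates. The variable names are illustrative. The back-transformed hypertension effect is an additional reading of the log-scale coefficient rather than a figure reported in the paper.

```python
import math

# Cause-specific hazard ratio for diabetes on ESRD: HR = exp(gamma_13)
gamma_13 = 1.92
hr_diabetes_esrd = math.exp(gamma_13)

# The reported HR interval maps back to the log-hazard scale via log()
hr_ci = (1.95, 27.11)
log_hr_ci = tuple(math.log(bound) for bound in hr_ci)

# A coefficient of -0.145 on log(eGFR) corresponds to a multiplicative
# change of exp(-0.145) on the original eGFR scale.
beta_3 = -0.145
egfr_ratio_htn = math.exp(beta_3)
```

Under this reading, a hypertension coefficient of $-0.145$ on the log scale corresponds to an eGFR that is roughly 13.5% lower on the original scale, other covariates being equal.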

5. Simulation Studies

5.1. Simulation Design

We conducted simulation studies to evaluate and confirm the performance of the proposed joint model under different distributional assumptions for the model errors $\epsilon_i$. We considered a sample size of 400 subjects and assumed that each subject had 15 follow-up measurements. Two competing risks (ESRD and death) were also taken into account. Based on the real CKD data analysed in Section 4, two binary covariates $X_1$ and $X_2$ were included in the design matrix of the longitudinal outcome process. Besides $X_1$ and $X_2$, one more binary predictor and one continuous baseline predictor were included in the design matrix of the time-to-competing-risks processes. The simulation studies were designed as follows. The longitudinal outcome data $y_{ij}$ were simulated from the semiparametric mixed-effects longitudinal submodel (16). To generate the spline bases, we considered fifteen equally spaced visit times between 0 and 14 ($t_{ij} = 0, 1, 2, \ldots, 14$), at which the longitudinal measurements are taken from each subject, and used percentile-based knots. The model error term $e_{ij}$ was simulated from a $\Gamma(2, 1)$ distribution with shape 2 and scale 1, from which 2 was then subtracted to obtain skewed errors with mean zero [33]. The random-effects vector $b_i$ was simulated from an $N_4(\mu_b = 0, \Sigma_b = \mathrm{diag}(1, 1, 1, 1))$ distribution. We set the true values of the fixed-effect parameter vectors to $\beta = (4.50, 0.25, 0.31)^T$ and $\zeta = (1.2, 2.5, 3.1)^T$. The binary covariates $x_{1ij}$ and $x_{2ij}$ were generated from $\mathrm{Bernoulli}(p = 0.24)$ and $\mathrm{Bernoulli}(0.44)$ distributions, respectively. We also simulated two competing risks failure times from the cause-specific hazards models specified in Section 4. We used constant baseline hazards $\lambda_{01}(t) = 0.3$ and $\lambda_{02}(t) = 0.1$ to generate the failure times for those events.
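The stochastic ingredients of this design can be sketched with the Python standard library as follows. This is an illustration under stated assumptions, not the authors' simulation code: the function name `simulate_subject` and the seed are hypothetical, the covariates are drawn once per subject (the text does not say whether they vary over visits), and the spline terms and fixed effects are omitted.

```python
import random


def simulate_subject(rng, n_visits=15):
    """Draw covariates, random effects and skewed errors for one subject."""
    x1 = 1 if rng.random() < 0.24 else 0           # Bernoulli(0.24)
    x2 = 1 if rng.random() < 0.44 else 0           # Bernoulli(0.44)
    b = [rng.gauss(0.0, 1.0) for _ in range(4)]    # N_4(0, I) random effects
    # Gamma(shape=2, scale=1) has mean 2, so subtracting 2 centres the errors
    e = [rng.gammavariate(2.0, 1.0) - 2.0 for _ in range(n_visits)]
    return x1, x2, b, e


rng = random.Random(2022)                          # arbitrary seed for reproducibility
subjects = [simulate_subject(rng) for _ in range(400)]
```

The centred gamma errors keep a mean of zero but remain right-skewed, which is what distinguishes this setting from the Gaussian one.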
We set $\gamma_1 = (0.01, 0.15, 1.80, 1.60)$, $\gamma_2 = (0.01, 0.10, 2.00, 2.20)$, $\alpha_1 = (2.50, 3.00)$ and $\alpha_2 = (2.00, 0.50)$. Then, the true failure time $T_{iq}^*$ of the $q$th competing risk was simulated from its distribution function (cumulative incidence function), which was defined as
$$F_q(t) = P(T_q^* \le t) = 1 - \exp\left\{-\int_0^{t} \lambda_{0q}(s)\exp\left(\gamma_q^T w_i + \alpha_q^T b_i\right) ds\right\}$$
where the design vector of baseline covariates $w_i$ consists of three binary predictors and one continuous predictor. We generated the censoring time $C_i$ from an $Exp(0.5)$ distribution. Thus, the observed failure time of the $i$th subject was computed as $T_i = \min(T_{i1}^*, T_{i2}^*, C_i)$. The event status for the $i$th subject was set to $\rho_i = q$ (for $q = 1, 2$) if $T_{iq}^*$ was the smallest of the three times, so that $T_{iq}^* \le C_i$, and $\rho_i = 0$ otherwise. The prior specifications and convergence diagnostic tools were the same as in the application (Section 4).
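With constant baseline hazards, the distribution function above can be inverted in closed form, which gives a simple way to draw the latent failure times. The sketch below shows this inverse-transform step and the construction of the observed time and event status. The function names and the linear-predictor values are hypothetical, and $Exp(0.5)$ is interpreted here as an exponential with rate 0.5, which is an assumption since the text does not state whether 0.5 is a rate or a mean.

```python
import math
import random


def failure_time(u, baseline, linpred):
    """Invert F(t) = 1 - exp(-baseline * exp(linpred) * t) at a uniform draw u."""
    return -math.log(1.0 - u) / (baseline * math.exp(linpred))


def observe(t1, t2, c):
    """Observed time and status: 1 = ESRD, 2 = death, 0 = censored."""
    if t1 <= t2 and t1 <= c:
        return t1, 1
    if t2 <= c:
        return t2, 2
    return c, 0


rng = random.Random(1)                                         # arbitrary seed
t_esrd = failure_time(rng.random(), baseline=0.3, linpred=0.2)    # hypothetical predictor
t_death = failure_time(rng.random(), baseline=0.1, linpred=-0.1)  # hypothetical predictor
c = rng.expovariate(0.5)                                       # rate 0.5 (assumed)
t_obs, status = observe(t_esrd, t_death, c)
```

In the full design, `linpred` would be $\gamma_q^T w_i + \alpha_q^T b_i$ evaluated with the subject's covariates and random effects.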

5.2. Simulation Results

As a confirmatory analysis, we conducted simulation studies to evaluate the performance of the last two joint models, with multivariate skew-t and normal distributions, described in the application section. The same MCMC specification and convergence diagnostics as in the application section were used. All the MCMC samplers were implemented in WinBUGS 1.4 called from R through the R2WinBUGS package. To evaluate the behaviour of the estimators of each model and to compare them, we computed the relative bias, $\mathrm{RB}(\theta) = \frac{1}{M}\sum_{i=1}^{M}\left(\frac{\hat{\theta}_i}{\theta_T} - 1\right)$, the root-mean-square error, $\mathrm{RMSE}(\theta) = \sqrt{\frac{1}{M}\sum_{i=1}^{M}\left(\hat{\theta}_i - \theta_T\right)^2}$, the 95% coverage probability (CP; the proportion of the credible intervals, $\hat{\theta} \pm 1.96 \times se(\hat{\theta})$, that contain the true value) and the deviance information criterion (DIC), where $\hat{\theta}_i$ is the estimate of the true parameter $\theta_T$ for the $i$th simulated sample, $i = 1, \ldots, M$. Table 4 presents the simulation results: the posterior mean estimates with the corresponding RB, RMSE, CP and DIC for each parameter of the joint models. Based on the relatively smaller values of RB, RMSE and DIC and the larger value of CP, we found that the joint model with the skew-t distribution (SPJM-P1ST) performed better.
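The three performance measures defined above translate directly into code. The sketch below is a minimal illustration with made-up estimates; the function names and the example numbers are assumptions and are unrelated to the values in Table 4.

```python
def relative_bias(estimates, truth):
    """Mean of (estimate / truth - 1) over the simulated samples."""
    return sum(est / truth - 1.0 for est in estimates) / len(estimates)


def rmse(estimates, truth):
    """Square root of the mean squared deviation from the true value."""
    return (sum((est - truth) ** 2 for est in estimates) / len(estimates)) ** 0.5


def coverage(estimates, std_errors, truth, z=1.96):
    """Share of intervals est +/- z * se that contain the true value."""
    hits = sum(1 for est, se in zip(estimates, std_errors)
               if abs(est - truth) <= z * se)
    return hits / len(estimates)


# Made-up example: three replicate estimates of a parameter whose true value is 1.0
est = [1.1, 0.9, 1.3]
se = [0.1, 0.1, 0.1]
rb, err, cp = relative_bias(est, 1.0), rmse(est, 1.0), coverage(est, se, 1.0)
```

In this toy example the third interval, $1.3 \pm 0.196$, misses the true value, so the coverage is two out of three.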

6. Discussion and Conclusions

In recent years, the literature has emphasized the importance of joint modelling of survival data with multiple failure types (competing risks) and longitudinal data with many features. The primary goal of this paper was to develop, within the Bayesian framework, a joint model for longitudinal outcomes with skewness and nonlinear measurement-time effects and for competing risks failure time data, in order to make robust statistical inference. Thus, in this study, we proposed Bayesian semiparametric joint models for skewed longitudinal outcomes and competing risks failure time data that address these issues simultaneously. We used regression splines to account for the nonlinear time effects; an $n_i$-variate skew-t distribution to account for skewness in the longitudinal outcome eGFR; and piecewise-constant functions to flexibly model the hazard functions.
We illustrated the proposed joint models by analysing data collected from CKD patients at the University of Gondar Comprehensive Specialized Hospital in Ethiopia. We then compared joint models with different association structures (parameterizations) and different distributions of model errors. The computed DIC values were the main tool for comparing the joint models. As described in the data analysis and results section, a semiparametric joint model with shared random-effects parameterization and skew-t distribution fitted the data better than the other joint models. We then evaluated the association between the patient-specific eGFR process and the times to ESRD and death, together with the effects of the other covariates, and interpreted the results. Simulation studies were carried out to further evaluate the performance of the joint models proposed in this paper. The relative bias, root mean square error, and coverage probability were used as performance evaluation tools.
The findings of this study suggest that, in general, for clinical and other follow-up data with the same features as the study application CKD data, a joint modelling approach is more appropriate to capture the associations between longitudinal and competing risks data simultaneously than separate models. The findings also suggest that the specifications of the functional form of longitudinal biomarkers, the parameterization (association structure) in the construction of joint models, as well as distributional assumptions, require special attention. According to the application’s findings, diabetes, hypertension and time were significant predictors of and had a negative association with kidney function (a decline in eGFR measures), keeping the effects of other predictors constant. Hypertension and diabetes are also significantly associated with high risks of experiencing end-stage renal disease and/or death. However, in this study, patients’ age and gender were not significantly associated with the risks of end-stage renal disease and death since their 95% credible intervals include zero.
In addition to the motivating CKD follow-up data, our methodology has broader applications whenever continuous outcomes and associated biomarkers are repeatedly measured, the time-to-competing failure events are recorded, and the basic submodels and joint model specifications are met. Our simulation and application studies revealed that our work contributed to this interesting study area by making use of a more flexible methodology to model complex competing risks failure time and longitudinal data.
The methodology proposed in this paper can be extended in future research. (i) In this study, we proposed a cause-specific proportional hazards submodel for each competing risk. However, it may not quantify the overall impact of a covariate on a subject's event status [19]. Thus, future research may consider a subdistribution hazard [39] or fully specified subdistribution hazards [60] submodel to evaluate how covariates affect the cumulative incidence function. (ii) The methodology of this work could also be extended to a multivariate setting to accommodate multiple longitudinal outcomes that are repeatedly measured for each subject. We note that the first issue is currently under investigation, and we hope that its findings will be available in the near future.

Author Contributions

Conceptualization, M.M.F., S.M., G.D., W.H., A.A.-B. and M.E.-M.; methodology, M.M.F., S.M. and G.D.; software, M.M.F. and G.D.; validation, S.M., G.D., S.K., W.H., M.M.F., A.A.-B. and M.E.-M.; data curation, M.M.F., G.D. and W.H.; formal analysis, M.M.F.; Funding acquisition, A.A.-B. and M.E.-M.; investigation, W.H., M.M.F. and S.K.; resources, M.M.F., G.D., M.E.-M., A.A.-B. and W.H.; writing—original draft preparation, M.M.F., S.M. and G.D.; writing—review and editing, M.M.F., S.M., G.D., S.K., A.A.-B., W.H. and M.E.-M.; supervision, S.M., G.D., S.K. and W.H.; project administration, M.M.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Ethical approval for this study was obtained from both the Jomo Kenyatta University of Agriculture and Technology (JKUAT) Institutional Ethics Review Committee in Kenya (approval No.: JKU/IERC/02316/0539; date: 21/04/2022) and the University of Gondar (UOG) Ethical Review Board in Ethiopia (Ref.: VP/RTT/05/777/2022; date: 03/05/2022).

Data Availability Statement

The real data used to illustrate the proposed methods may be available from the corresponding author upon reasonable request. The data are not publicly available due to ethical restrictions.

Acknowledgments

The corresponding author would like to express his appreciation to the Pan African University for supporting his work. His appreciation also goes to the University of Gondar Comprehensive Specialized Hospital for the data access. We appreciate the academic editors and referees for their valuable comments and suggestions, which helped to improve the paper.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CKD: Chronic Kidney Disease
CI: Credible Interval
DIC: Deviance Information Criterion
eGFR: estimated Glomerular Filtration Rate
ESRD: End Stage Renal Disease
FPJM-P1: Fully Parametric Joint Model with Parameterization-1
GFR: Glomerular Filtration Rate
MCMC: Markov Chain Monte Carlo
MDRD: Modification of Diet in Renal Disease
RMSE: Root Mean Squared Error
SCr: Serum Creatinine
SPJM-P1: Semi-parametric Joint Model with Parameterization-1
ST: Skew-t

Appendix A. Full Conditional Distributions

Let $\theta$ denote the specific parameter of interest to be estimated and "others" denote all remaining parameters (excluding $\theta$) together with the observed data. In addition, let $y_i = (y_{i1}, \ldots, y_{ij}, \ldots, y_{in_i})^T$, where $y_{ij} = x_{ij}^T B + h_{ij}^T b_i + \delta_\epsilon S_{\epsilon ij}$. Then, the derived full conditional distribution (FCD), $\pi(\theta \mid \text{others})$, of each parameter $\theta$ in the parameter space of interest $\Omega$ is given below:
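(i) The FCD of $B$, the fixed-effects parameter vector, is
$$\pi(B \mid \text{others}) \propto \exp\left\{-\frac{1}{2}\sum_{i=1}^{n}\frac{u_{\epsilon i}}{\sigma_\epsilon^2}\left(y_i - \mu_{y_i}\right)^T\left(y_i - \mu_{y_i}\right)\right\} \times \exp\left\{-\frac{1}{2}(B - \beta_0)^T \Lambda_B^{-1}(B - \beta_0)\right\} \propto N\left(V_B\left[\Lambda_B^{-1}\beta_0 + \sum_{i=1}^{n}\frac{u_{\epsilon i}}{\sigma_\epsilon^2} X_i\left(y_i - H_i b_i - \delta_\epsilon S_{\epsilon i}\right)\right],\; V_B\right)$$
where $V_B = \left(\Lambda_B^{-1} + \sum_{i=1}^{n}\frac{u_{\epsilon i}}{\sigma_\epsilon^2} X_i X_i^T\right)^{-1}$, and $\mu_{y_i} = X_i B + H_i b_i + \delta_\epsilon S_{\epsilon i}$.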
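(ii) The FCD of the scale parameter $\sigma_\epsilon^2$ is given by
$$\pi(\sigma_\epsilon^2 \mid \text{others}) \propto \prod_{i=1}^{n}\left(\sigma_\epsilon^2\right)^{-n_i/2}\exp\left\{-\frac{u_{\epsilon i}}{2\sigma_\epsilon^2}\left(y_i - \mu_{y_i}\right)^T\left(y_i - \mu_{y_i}\right)\right\} \times \left(\sigma_\epsilon^2\right)^{-\phi_{\epsilon 1} - 1}\exp\left\{-\phi_{\epsilon 2}/\sigma_\epsilon^2\right\} \propto IG\left(\phi_{\epsilon 1} + \frac{1}{2}\sum_{i=1}^{n} n_i,\; \phi_{\epsilon 2} + \frac{1}{2}\sum_{i=1}^{n} u_{\epsilon i}\left(y_i - \mu_{y_i}\right)^T\left(y_i - \mu_{y_i}\right)\right)$$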
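(iii) The FCD of $\Sigma_b$, the covariance matrix of $b_i$, is given by
$$\pi(\Sigma_b \mid \text{others}) \propto \prod_{i=1}^{n}|\Sigma_b|^{-\frac{1}{2}}\exp\left\{-\frac{1}{2} b_i^T \Sigma_b^{-1} b_i\right\} \times |\Sigma_b|^{-\frac{(\nu_b + m + s + 1)}{2}}\exp\left\{-\frac{1}{2}\mathrm{tr}\left(\Omega_b \Sigma_b^{-1}\right)\right\} \propto |\Sigma_b|^{-\frac{1}{2}(n + \nu_b + m + 1)}\exp\left\{-\frac{1}{2}\left[\sum_{i=1}^{n} b_i^T \Sigma_b^{-1} b_i + \mathrm{tr}\left(\Omega_b \Sigma_b^{-1}\right)\right]\right\} \propto IW\left(\sum_{i=1}^{n} b_i b_i^T + \Omega_b,\; n + \nu_b\right)$$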
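(iv) The FCD of $\delta_\epsilon$, the skewness parameter of the longitudinal outcome, is given by
$$\pi(\delta_\epsilon \mid \text{others}) \propto \prod_{i=1}^{n}\exp\left\{-\frac{u_{\epsilon i}}{2\sigma_\epsilon^2}\left(y_i - X_i B - H_i b_i - \delta_\epsilon S_{\epsilon i}\right)^T\left(y_i - X_i B - H_i b_i - \delta_\epsilon S_{\epsilon i}\right)\right\} \times \exp\left\{-\frac{1}{2\Gamma_{\delta_\epsilon}}\delta_\epsilon^2\right\} \propto \exp\left\{-\frac{1}{2}\left[\sum_{i=1}^{n}\frac{u_{\epsilon i}}{\sigma_\epsilon^2}\left(y_i - X_i B - H_i b_i - \delta_\epsilon S_{\epsilon i}\right)^T\left(y_i - X_i B - H_i b_i - \delta_\epsilon S_{\epsilon i}\right) + \frac{1}{\Gamma_{\delta_\epsilon}}\delta_\epsilon^2\right]\right\}$$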
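(v) The FCD of the degrees of freedom of the longitudinal submodel error, $\kappa_\epsilon$, is
$$\pi(\kappa_\epsilon \mid \text{others}) \propto \prod_{i=1}^{n}\exp\left\{-\frac{u_{\epsilon i}}{2\sigma_\epsilon^2}\left(y_i - X_i B - H_i b_i - \delta_\epsilon S_{\epsilon i}\right)^T\left(y_i - X_i B - H_i b_i - \delta_\epsilon S_{\epsilon i}\right)\right\} \times \frac{1}{\Gamma(\kappa_\epsilon/2)}\left(\kappa_\epsilon/2\right)^{\kappa_\epsilon/2} u_{\epsilon i}^{\frac{\kappa_\epsilon}{2} - 1}\exp\left\{-\frac{\kappa_\epsilon}{2} u_{\epsilon i}\right\} \times \exp\left\{-\kappa_{\epsilon 0}\,\kappa_\epsilon\right\}$$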
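(vi) The FCD of each piecewise-constant baseline hazard parameter $\lambda_{qd}$, $q = 1, \ldots, Q$; $d = 1, \ldots, D$, is
$$\begin{aligned}
\pi(\lambda_{qd} \mid \text{others}) &\propto \prod_{i=1}^{n}\lambda_{0q}^{\rho_i}(T_i)\exp\left\{\rho_i\left(\gamma_q^T w_i + R_{iq}(t)\right)\right\} \times \exp\left\{-\int_0^{T_i}\lambda_{0q}(s)\exp\left(\gamma_q^T w_i + R_{iq}(s)\right) ds\right\} \times \lambda_{qd}^{\omega_{qd1} - 1}\exp\left\{-\omega_{qd2}\lambda_{qd}\right\} \\
&\propto \lambda_{qd}^{\omega_{qd1} - 1}\exp\left\{-\omega_{qd2}\lambda_{qd}\right\} \times \prod_{i=1}^{n}\lambda_{qd}^{\rho_i I(s_{q,d-1} \le t < s_{qd})} \times \exp\left\{-I(T_i \ge s_{q,d-1})\int_{s_{q,d-1}}^{\min(s_{qd}, T_i)}\lambda_{qd}\exp\left(\gamma_q^T w_i + R_{iq}(s)\right) ds\right\} \\
&\propto \Gamma\left(\omega_{qd1} + n_{qd},\; \omega_{qd2} + \sum_{i=1}^{n} I(T_i \ge s_{q,d-1})\int_{s_{q,d-1}}^{\min(s_{qd}, T_i)}\exp\left(\gamma_q^T w_i + R_{iq}(s)\right) ds\right)
\end{aligned}$$
where $n_{qd} = \sum_{i=1}^{n}\rho_i I(s_{q,d-1} \le t < s_{qd})$ is the number of subjects whose failure time is within the interval $(s_{q,d-1} \le t < s_{qd})$, $d = 1, \ldots, D$.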
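(vii) The FCD of $\gamma_q$, the coefficient vector of the baseline covariates in the time-to-competing-risks submodel, $q = 1, \ldots, Q$, is given by
$$\begin{aligned}
\pi(\gamma_q \mid \text{others}) &\propto \prod_{i=1}^{n}\lambda_{0q}^{\rho_i}(T_i)\exp\left\{\rho_i\left(\gamma_q^T w_i + R_{iq}(s)\right)\right\} \times \exp\left\{-\int_0^{T_i}\lambda_{0q}(s)\exp\left(\gamma_q^T w_i + R_{iq}(s)\right) ds\right\} \times \exp\left\{-\frac{1}{2}(\gamma - \gamma_0)^T\Lambda_\gamma^{-1}(\gamma - \gamma_0)\right\} \\
&\propto \lambda_{qd}^{\sum_{i=1}^{n}\rho_i I(s_{q,d-1} \le T_i < s_{qd})}\exp\left\{\sum_{i=1}^{n}\rho_i\left(\gamma_q^T w_i\right)\right\} \times \exp\left\{-I(T_i \ge s_{q,d-1})\int_{s_{q,d-1}}^{\min(s_{qd}, T_i)}\lambda_{qd}\sum_{i=1}^{n}\exp\left(\gamma_q^T w_i + R_{iq}(s)\right) ds\right\} \times \exp\left\{-\frac{1}{2}(\gamma - \gamma_0)^T\Lambda_\gamma^{-1}(\gamma - \gamma_0)\right\}
\end{aligned}$$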
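(viii) The FCD of the association parameter vector $\alpha_q$, $q = 1, \ldots, Q$, is given by
$$\pi(\alpha_q \mid \text{others}) \propto \prod_{i=1}^{n}\lambda_{0q}^{\rho_i}(T_i)\exp\left\{\rho_i\left(\gamma_q^T w_i + \alpha_q^T b_i\right)\right\} \times \exp\left\{-\int_0^{T_i}\lambda_{0q}(s)\exp\left(\gamma_q^T w_i + \alpha_q^T b_i\right) ds\right\} \times \exp\left\{-\frac{1}{2}(\alpha_q - \alpha_{0q})^T\Lambda_{\alpha_q}^{-1}(\alpha_q - \alpha_{0q})\right\}$$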
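Conjugate full conditionals such as the inverse-gamma form in item (ii) can be sampled directly within a Gibbs step. The sketch below computes the two posterior parameters from per-subject summaries and draws $\sigma_\epsilon^2$ as the reciprocal of a gamma variate. The function names and the toy inputs are illustrative assumptions; in the paper the sampling is handled by WinBUGS rather than by hand-written code.

```python
import random


def sigma2_posterior_params(phi1, phi2, n_obs, weights, rss):
    """Shape and rate of the inverse-gamma full conditional in item (ii).

    n_obs[i]   : number of measurements n_i for subject i
    weights[i] : latent scale weight u_i for subject i
    rss[i]     : residual sum of squares (y_i - mu_i)^T (y_i - mu_i)
    """
    shape = phi1 + 0.5 * sum(n_obs)
    rate = phi2 + 0.5 * sum(u * r for u, r in zip(weights, rss))
    return shape, rate


def draw_inverse_gamma(rng, shape, rate):
    """If X ~ Gamma(shape, scale = 1 / rate), then 1 / X ~ IG(shape, rate)."""
    return 1.0 / rng.gammavariate(shape, 1.0 / rate)


# Toy inputs for two subjects, with the IG(0.01, 0.01) prior used in the application
shape, rate = sigma2_posterior_params(0.01, 0.01, [3, 2], [1.0, 2.0], [4.0, 1.0])
sigma2_draw = draw_inverse_gamma(random.Random(7), shape, rate)
```

The non-conjugate conditionals, such as those for $\delta_\epsilon$, $\kappa_\epsilon$, $\gamma_q$ and $\alpha_q$, have no standard form and would need a generic sampler such as Metropolis–Hastings or slice sampling.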

References

  1. Tsiatis, A.A.; Degruttola, V.; Wulfsohn, M.S. Modeling the relationship of survival to longitudinal data measured with error. Applications to survival and CD4 counts in patients with AIDS. J. Am. Stat. Assoc. 1995, 90, 27–37.
  2. Tsiatis, A.A.; Davidian, M. Joint modeling of longitudinal and time-to-event data: An overview. Stat. Sin. 2004, 14, 809–834.
  3. Ibrahim, J.G.; Chu, H.; Chen, L.M. Basic concepts and methods for joint models of longitudinal and survival data. J. Clin. Oncol. 2010, 28, 2796.
  4. Baghfalaki, T.; Ganjali, M.; Berridge, D. Robust joint modeling of longitudinal measurements and time to event data using normal/independent distributions: A Bayesian approach. Biom. J. 2013, 55, 844–865.
  5. Gaskins, J.; Daniels, M.; Marcus, B. Bayesian methods for nonignorable dropout in joint models in smoking cessation studies. J. Am. Stat. Assoc. 2016, 111, 1454–1465.
  6. Dessiso, A.H.; Goshu, A.T. Bayesian Joint Modelling of Longitudinal and Survival Data of HIV/AIDS Patients: A Case Study at Bale Robe General Hospital, Ethiopia. Am. J. Theor. Appl. Stat. 2017, 6, 182–190.
  7. Mwanyekange, J.; Mwalili, S.; Ngesa, O. Bayesian Inference in a Joint Model for Longitudinal and Time to Event Data with Gompertz Baseline Hazards. Mod. Appl. Sci. 2018, 12, 159–172.
  8. Azarbar, A.; Wang, Y.; Nadarajah, S. Simultaneous Bayesian modeling of longitudinal and survival data in breast cancer patients. Commun. Stat.-Theory Methods 2021, 50, 400–414.
  9. Baghfalaki, T.; Ganjali, M.; Hashemi, R. Bayesian joint modeling of longitudinal measurements and time-to-event data using robust distributions. J. Biopharm. Stat. 2014, 24, 834–855.
  10. Huang, Y.; Dagne, G. A Bayesian approach to joint mixed-effects models with a skew-normal distribution and measurement errors in covariates. Biometrics 2011, 67, 260–269.
  11. Tang, A.M.; Tang, N.S. Semiparametric Bayesian inference on skew–normal joint modeling of multivariate longitudinal and survival data. Stat. Med. 2015, 34, 824–843.
  12. Zhang, H.; Huang, Y. Bayesian joint modeling for partially linear mixed-effects quantile regression of longitudinal and time-to-event data with limit of detection, covariate measurement errors and skewness. J. Biopharm. Stat. 2021, 31, 295–316.
  13. Huang, Y.; Dagne, G.A.; Park, J.G. Mixture joint models for event time and longitudinal data with multiple features. Stat. Biopharm. Res. 2016, 8, 194–206.
  14. Song, X.; Davidian, M.; Tsiatis, A.A. A semiparametric likelihood approach to joint modeling of longitudinal and time-to-event data. Biometrics 2002, 58, 742–753.
  15. Das, K. A semiparametric Bayesian approach for joint modeling of longitudinal trait and event time. J. Appl. Stat. 2016, 43, 2850–2865.
  16. Rizopoulos, D.; Ghosh, P. A Bayesian semiparametric multivariate joint model for multiple longitudinal outcomes and a time-to-event. Stat. Med. 2011, 30, 1366–1380.
  17. Tang, N.S.; Tang, A.M.; Pan, D.D. Semiparametric Bayesian joint models of multivariate longitudinal and survival data. Comput. Stat. Data Anal. 2014, 77, 113–129.
  18. Mauff, K.; Steyerberg, E.; Kardys, I.; Boersma, E.; Rizopoulos, D. Joint models with multiple longitudinal outcomes and a time-to-event outcome: A corrected two-stage approach. Stat. Comput. 2020, 30, 999–1014.
  19. Bakoyannis, G.; Touloumi, G. Practical methods for competing risks data: A review. Stat. Methods Med. Res. 2012, 21, 257–272.
  20. Rizopoulos, D. Joint Models for Longitudinal and Time-to-Event Data: With Applications in R; CRC Press: Boca Raton, FL, USA, 2012.
  21. Yang, W.; Xie, D.; Pan, Q.; Feldman, H.I.; Guo, W. Joint modeling of repeated measures and competing failure events in a study of chronic kidney disease. Stat. Biosci. 2017, 9, 504–524.
  22. Armero, C.; Forte, A.; Perpiñán, H.; Sanahuja, M.J.; Agustí, S. Bayesian joint modeling for assessing the progression of chronic kidney disease in children. Stat. Methods Med. Res. 2018, 27, 298–311.
  23. Teixeira, L.; Sousa, I.; Rodrigues, A.; Mendonca, D. Joint Modelling of Longitudinal and Competing Risks Data in Clinical Research. REVSTAT Stat. J. 2019, 17, 245–264.
  24. Tang, A.M.; Tang, N.S.; Zhu, H. Influence analysis for skew-normal semiparametric joint models of multivariate longitudinal and multivariate survival data. Stat. Med. 2017, 36, 1476–1490.
  25. Dagne, G.A.; Huang, Y. Bayesian semiparametric mixture Tobit models with left censoring, skewness, and covariate measurement errors. Stat. Med. 2013, 32, 3881–3898.
  26. Lu, T. Simultaneous inference for semiparametric mixed-effects joint models with skew distribution and covariate measurement error for longitudinal competing risks data analysis. J. Biopharm. Stat. 2017, 27, 1009–1027.
  27. Castro, L.M.; Wang, W.L.; Lachos, V.H.; Inácio de Carvalho, V.; Bayes, C.L. Bayesian semiparametric modeling for HIV longitudinal data with censoring and skewness. Stat. Methods Med. Res. 2019, 28, 1457–1476.
  28. Cavieres, J.; Ibacache-Pulgar, G.; Contreras-Reyes, J.E. Thin plate spline model under skew-normal random errors: Estimation and diagnostic analysis for spatial data. J. Stat. Comput. Simul. 2022, 92, 1–21.
  29. Sahu, S.K.; Dey, D.K.; Branco, M.D. A new class of multivariate skew distributions with applications to Bayesian regression models. Can. J. Stat. 2003, 31, 129–150.
  30. Huang, X.; Li, G.; Elashoff, R.M. A joint model of longitudinal and competing risks survival data with heterogeneous random effects and outlying longitudinal measurements. Stat. Interface 2010, 3, 185.
  31. Arellano-Valle, R.; Bolfarine, H.; Lachos, V. Bayesian inference for skew-normal linear mixed models. J. Appl. Stat. 2007, 34, 663–682.
  32. Huang, Y.; Dagne, G.; Wu, L. Bayesian inference on joint models of HIV dynamics for time-to-event and longitudinal data with skewness and covariate measurement errors. Stat. Med. 2011, 30, 2930–2946.
  33. Zhang, H.; Huang, Y. Quantile regression-based Bayesian joint modeling analysis of longitudinal–survival data, with application to an AIDS cohort study. Lifetime Data Anal. 2019, 26, 339–368.
  34. Ibrahim, J.G.; Chen, M.H.; Sinha, D. Bayesian methods for joint modeling of longitudinal and survival data with applications to cancer vaccine trials. Stat. Sin. 2004, 14, 863–883.
  35. Zugna, D. A New Bayesian Approach to Competing Risks Models in Longitudinal Studies. Presented at Atti della XLIV Riunione Scientifica, Università della Calabria, Arcavacata, Italy, 25–27 July 2008.
  36. Andrinopoulou, E.R.; Rizopoulos, D.; Takkenberg, J.J.; Lesaffre, E. Combined dynamic predictions using joint models of two longitudinal outcomes and competing risk data. Stat. Methods Med. Res. 2017, 26, 1787–1801.
  37. Rue, M.; Andrinopoulou, E.R.; Alvares, D.; Armero, C.; Forte, A.; Blanch, L. Bayesian joint modeling of bivariate longitudinal and competing risks data: An application to study patient-ventilator asynchronies in critical care patients. Biom. J. 2017, 59, 1184–1203.
  38. Williamson, P.R.; Kolamunnage-Dona, R.; Philipson, P.; Marson, A.G. Joint modelling of longitudinal and competing risks data. Stat. Med. 2008, 27, 6426–6438.
  39. Deslandes, E.; Chevret, S. Joint modeling of multivariate longitudinal data and the dropout process in a competing risk setting: Application to ICU data. BMC Med. Res. Methodol. 2010, 10, 69.
  40. Andrinopoulou, E.R.; Rizopoulos, D.; Takkenberg, J.J.; Lesaffre, E. Joint modeling of two longitudinal outcomes and competing risk data. Stat. Med. 2014, 33, 3167–3178.
  41. Hickey, G.L.; Philipson, P.; Jorgensen, A.; Kolamunnage-Dona, R. A comparison of joint models for longitudinal and competing risks data, with application to an epilepsy drug randomized controlled trial. J. R. Stat. Soc. Ser. A Stat. Soc. 2018, 181, 1105–1123.
  42. Elashoff, R.M.; Li, G.; Li, N. A joint model for longitudinal measurements and survival data in the presence of multiple failure types. Biometrics 2008, 64, 762–771.
  43. Li, N.; Elashoff, R.M.; Li, G.; Saver, J. Joint modeling of longitudinal ordinal data and competing risks survival times and analysis of the NINDS rt-PA stroke trial. Stat. Med. 2010, 29, 546–557.
  44. Baghfalaki, T.; Kalantari, S.; Ganjali, M.; Hadaegh, F.; Pahlavanzadeh, B. Bayesian joint modeling of ordinal longitudinal measurements and competing risks survival data for analysing Tehran Lipid and Glucose Study. J. Biopharm. Stat. 2020, 30, 689–703.
  45. Lee, S.; McLachlan, G.J. Finite mixtures of multivariate skew t-distributions: Some recent and new results. Stat. Comput. 2014, 24, 181–202. [Google Scholar] [CrossRef]
  46. Brown, J.D. Advanced Statistics for the Behavioral Sciences; Springer: Berlin/Heidelberg, Germany, 2018. [Google Scholar]
  47. Rajeswaran, J.; Blackstone, E.H.; Barnard, J. Joint modeling of multivariate longitudinal data and competing risks using multiphase sub-models. Stat. Biosci. 2018, 10, 651–685. [Google Scholar] [CrossRef]
  48. Sheikh, M.T.; Ibrahim, J.G.; Gelfond, J.A.; Sun, W.; Chen, M.H. Joint modelling of longitudinal and survival data in the presence of competing risks with applications to prostate cancer data. Stat. Model. 2020, 21, 72–94. [Google Scholar] [CrossRef]
  49. Liu, L.; Huang, X. Joint analysis of correlated repeated measures and recurrent events processes in the presence of death, with application to a study on acquired immune deficiency syndrome. J. R. Stat. Soc. Ser. C Appl. Stat. 2009, 58, 65–81. [Google Scholar] [CrossRef]
  50. Alvares, E.; Lázaro, V.; Gómez-Rubio, C.; Armero, C. Bayesian survival analysis with BUGS. Stat. Med. 2021, 40, 2975–3020. [Google Scholar] [CrossRef]
  51. Wu, L.; Liu, W.; Hu, X. Joint inference on HIV viral dynamics and immune suppression in presence of measurement errors. Biometrics 2010, 66, 327–335. [Google Scholar] [CrossRef]
  52. Azzalini, A.; Capitanio, A. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. J. R. Stat. Soc. Ser. B Stat. Methodol. 2003, 65, 367–389. [Google Scholar] [CrossRef]
  53. Stanifer, J.W.; Muiru, A.; Jafar, T.H.; Patel, U.D. Chronic kidney disease in low-and middle-income countries. Nephrol. Dial. Transplant. 2016, 31, 868–874. [Google Scholar] [CrossRef] [Green Version]
  54. Shiferaw, W.S.; Akalu, T.Y.; Aynalem, Y.A. Chronic Kidney Disease among Diabetes Patients in Ethiopia: A Systematic Review and Meta-Analysis. Int. J. Nephrol. 2020, 2020, 15. [Google Scholar] [CrossRef] [PubMed]
  55. Alemu, H.; Hailu, W.; Adane, A. Prevalence of chronic kidney disease and associated factors among patients with diabetes in northwest Ethiopia: A hospital-based cross-sectional study. Curr. Ther. Res. 2020, 92, 100578. [Google Scholar] [CrossRef] [PubMed]
  56. Levey, A.S.; Bosch, J.P.; Lewis, J.B.; Greene, T.; Rogers, N.; Roth, D.; for the Modification of Diet in Renal Disease Study Group. A more accurate method to estimate glomerular filtration rate from serum creatinine: A new prediction equation. Ann. Intern. Med. 1999, 130, 461–470. [Google Scholar] [CrossRef] [PubMed]
  57. Levey, A.S.; Coresh, J.; Greene, T.; Stevens, L.A.; Zhang, Y.; Hendriksen, S.; Kusek, J.W.; Van Lente, F.; for the Chronic Kidney Disease Epidemiology Collaboration. Using standardized serum creatinine values in the modification of diet in renal disease study equation for estimating glomerular filtration rate. Ann. Intern. Med. 2006, 145, 247–254. [Google Scholar] [CrossRef] [PubMed]
  58. Geweke, J. Evaluating the accuracy of sampling-based approaches to the calculations of posterior moments. Bayesian Stat. 1992, 4, 641–649. [Google Scholar]
  59. Spiegelhalter, D.J.; Best, N.G.; Carlin, B.P.; Van Der Linde, A. Bayesian measures of model complexity and fit. J. R. Stat. Soc. Ser. B Stat. Methodol. 2002, 64, 583–639. [Google Scholar] [CrossRef] [Green Version]
  60. Hosseini-Baharanchi, F.S.; Baghestani, A.R.; Baghfalaki, T.; Hajizadeh, E.; Najafizadeh, K.; Shafaghi, S. Joint Modeling of Longitudinal Measurements and Multiple Failure Time Using Fully-specified Subdistribution Model: A Bayesian Perspective. J. Reliab. Stat. Stud. 2020, 13, 221–242. [Google Scholar] [CrossRef]
Figure 1. (a) Cumulative incidence functions (CIF) for the ESRD and death events, showing that the probability of each event increases with time; (b) log(eGFR) trajectories against follow-up time since diagnosis for all CKD patients, with each patient's consecutive eGFR measurements connected by line segments; (c) a closer look at the trajectories of the longitudinal outcome eGFR for six randomly selected patients; (d) histogram of log-transformed eGFR with a superimposed normal curve, showing an asymmetric (left-skewed) distribution.
Figure 2. Trace plots (a), autocorrelation plots (b) and BGR convergence statistic plots (c) of some representative parameters from the chosen model. The trace plots show that the chains mix well. In the BGR plots, the ratio (red line) of the pooled posterior variability (green line) to the average within-chain variability across the three chains (blue line) approaches 1.0. In addition, after thinning, the autocorrelation drops rapidly as the lag increases. Thus, plots (a–c) indicate convergence.
Table 1. Results of Geweke's test of convergence: the computed value of the test statistic for each parameter of the chosen model.
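| Parameter | Geweke statistic |
|---|---|
| beta.1 | −0.20620 |
| beta.2 | −1.46783 |
| beta.3 | −0.06250 |
| zeta.1 | −1.29063 |
| zeta.2 | 0.88574 |
| zeta.3 | −1.22408 |
| delta.ϵ | 0.15702 |
| kappa.ϵ | −0.09534 |
| sigma.ϵ | 0.31079 |
| sigma.b11 | −0.09137 |
| sigma.b12 | −0.94210 |
| sigma.b13 | −0.09534 |
| sigma.b14 | −0.27347 |
| sigma.b21 | −0.06552 |
| sigma.b22 | 0.22185 |
| sigma.b23 | 0.22018 |
| sigma.b24 | −0.77517 |
| sigma.b31 | −0.27347 |
| sigma.b32 | 0.22018 |
| sigma.b33 | 0.57924 |
| sigma.b34 | 0.41516 |
| sigma.b41 | −0.06552 |
| sigma.b42 | −0.77517 |
| sigma.b43 | 0.41516 |
| sigma.b44 | 0.09023 |
| gamma1.1 | 0.46890 |
| gamma1.2 | 0.15345 |
| gamma1.3 | −0.93649 |
| gamma1.4 | −1.96307 |
| gamma2.1 | 0.39827 |
| gamma2.2 | −0.10864 |
| gamma2.3 | 0.17433 |
| gamma2.4 | −1.12829 |
| alpha1.1 | 0.01370 |
| alpha1.2 | 0.04930 |
| alpha2.1 | −0.59232 |
| alpha2.2 | −2.03888 |
| deviance | −0.18847 |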
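For readers unfamiliar with the statistic reported in Table 1, the usual form of Geweke's diagnostic can be written as below. This is a sketch under assumptions: the symbols $A$, $B$, $n_A$, $n_B$ and $\hat{S}(0)$ are notation introduced here, and the window fractions (commonly the first 10% and the last 50% of the retained draws) are conventional defaults rather than settings confirmed by the article.

$$
Z = \frac{\bar{\theta}_A - \bar{\theta}_B}{\sqrt{\hat{S}_A(0)/n_A + \hat{S}_B(0)/n_B}}
$$

Here $\bar{\theta}_A$ and $\bar{\theta}_B$ are the sample means of a parameter over an early window $A$ and a late window $B$ of one chain, $n_A$ and $n_B$ are the window lengths, and $\hat{S}_A(0)$ and $\hat{S}_B(0)$ are spectral density estimates at frequency zero, which account for autocorrelation. If the chain is stationary, $Z$ is approximately standard normal, so values well inside $\pm 1.96$ give no evidence against convergence.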
Table 2. The posterior mean estimates (Est), standard deviation (Sd) and 95% credible interval (CI) for the parameters from the proposed joint models with different parameterizations.
| Para | SPJM-P1 Est. | SPJM-P1 Sd | SPJM-P1 95% CI | SPJM-P2 Est. | SPJM-P2 Sd | SPJM-P2 95% CI | SPJM-P3 Est. | SPJM-P3 Sd | SPJM-P3 95% CI |
|---|---|---|---|---|---|---|---|---|---|
| The longitudinal submodel parameters estimates | | | | | | | | | |
| $\beta_1$ | 4.221 | 0.051 | (4.12, 4.31) | 4.226 | 0.055 | (4.13, 4.34) | 5.891 | 1.709 | (4.13, 7.79) |
| $\beta_2$ | 0.127 | 0.064 | (0.25, 0.001) | 0.13 | 0.072 | (0.26, 0.02) | 0.358 | 0.227 | (0.68, 0.06) |
| $\beta_3$ | 0.145 | 0.022 | (0.19, 0.10) | 0.142 | 0.020 | (0.18, 0.10) | 0.605 | 0.461 | (1.21, 0.13) |
| $\zeta_1$ | 1.045 | 0.100 | (1.24, 0.85) | 1.067 | 0.117 | (1.34, 0.85) | 0.653 | 0.387 | (1.20, 0.11) |
| $\zeta_2$ | 2.433 | 0.207 | (2.81, 1.99) | 2.468 | 0.269 | (2.99, 1.93) | 2.283 | 0.206 | (2.59, 1.95) |
| $\zeta_3$ | 2.864 | 0.352 | (3.48, 2.02) | 2.926 | 0.435 | (3.78, 2.06) | 1.238 | 1.082 | (2.56, 0.03) |
| $\sigma_{\epsilon}^2$ | 0.006 | 0.003 | (0.003, 0.01) | 0.006 | 0.002 | (0.003, 0.01) | 0.009 | 0.006 | (0.002, 0.03) |
| $\sigma_{b1}^2$ | 0.171 | 0.025 | (0.13, 0.22) | 0.170 | 0.025 | (0.12, 0.22) | 1.774 | 1.622 | (0.14, 4.08) |
| $\sigma_{b2}^2$ | 0.735 | 0.198 | (0.40, 1.17) | 0.743 | 0.234 | (0.40, 1.31) | 1.057 | 0.299 | (0.55, 1.72) |
| $\sigma_{b3}^2$ | 2.447 | 0.687 | (1.27, 3.95) | 2.628 | 0.810 | (1.43, 4.51) | 2.007 | 0.628 | (1.05, 3.49) |
| $\sigma_{b4}^2$ | 3.610 | 1.364 | (1.44, 6.79) | 3.986 | 1.745 | (1.62, 8.32) | 3.197 | 1.655 | (0.77, 6.97) |
| $\delta_{\epsilon}$ | 0.365 | 0.052 | (0.45, 0.28) | 0.37 | 0.035 | (0.43, 0.31) | 0.523 | 0.243 | (0.89, 0.17) |
| $\kappa_{\epsilon}$ | 3.171 | 0.178 | (3.01, 3.64) | 3.173 | 0.172 | (3.00, 3.64) | 3.206 | 0.227 | (3.01, 3.82) |
| The competing risks submodel parameters estimates | | | | | | | | | |
| $\gamma_{11}$ | 0.006 | 0.012 | (0.03, 0.02) | 0.009 | 0.012 | (0.03, 0.01) | 0.006 | 0.010 | (0.03, 0.01) |
| $\gamma_{12}$ | 0.169 | 0.350 | (0.50, 0.89) | 0.147 | 0.341 | (0.52, 0.80) | 0.184 | 0.303 | (0.42, 0.78) |
| $\gamma_{13}$ | 1.923 | 0.661 | (0.67, 3.29) | 1.914 | 0.644 | (0.73, 3.21) | 1.519 | 0.472 | (0.62, 2.45) |
| $\gamma_{14}$ | 1.587 | 0.597 | (0.38, 2.71) | 1.599 | 0.600 | (0.39, 2.72) | 2.019 | 0.524 | (0.94, 3.00) |
| $\gamma_{21}$ | 0.009 | 0.011 | (0.01, 0.03) | 0.009 | 0.011 | (0.01, 0.03) | 0.014 | 0.011 | (0.01, 0.04) |
| $\gamma_{22}$ | 0.056 | 0.368 | (0.68, 0.77) | 0.032 | 0.379 | (0.71, 0.79) | 0.095 | 0.373 | (0.64, 0.85) |
| $\gamma_{23}$ | 2.825 | 0.650 | (1.61, 4.15) | 2.849 | 0.664 | (1.59, 4.16) | 2.607 | 0.604 | (1.46, 3.85) |
| $\gamma_{24}$ | 2.396 | 0.763 | (0.87, 3.91) | 2.360 | 0.786 | (0.82, 3.95) | 1.917 | 0.872 | (0.09, 3.55) |
| Estimates of the association parameters of the joint models | | | | | | | | | |
| $\alpha_{11}$ | 2.759 | 0.824 | (4.46, 1.27) | 2.799 | 0.817 | (4.55, 1.39) | 1.266 | 0.437 | (2.09, 0.53) |
| $\alpha_{12}$ | 1.697 | 0.453 | (2.59, 0.85) | 1.670 | 0.511 | (2.75, 0.82) | | | |
| $\alpha_{21}$ | 2.076 | 0.604 | (3.40, 1.00) | 2.061 | 0.587 | (3.32, 1.00) | 1.598 | 0.356 | (2.33, 0.94) |
| $\alpha_{22}$ | 0.395 | 0.372 | (1.17, 0.29) | 0.373 | 0.363 | (1.19, 0.27) | | | |
| DIC | 3835 | | | 4064 | | | 4111 | | |
Table 3. The posterior mean estimates (Est), standard deviation (Sd), and 95% credible interval (CI) for the parameters from the selected joint model SPJM-P1 with different distributions.
| Para | SPJM-P1ST Est. | SPJM-P1ST Sd | SPJM-P1ST 95% CI | SPJM-P1N Est. | SPJM-P1N Sd | SPJM-P1N 95% CI |
|---|---|---|---|---|---|---|
| The longitudinal submodel parameters estimates | | | | | | |
| $\beta_1$ | 4.221 | 0.051 | (4.122, 4.313) | 4.017 | 0.051 | (3.916, 4.114) |
| $\beta_2$ | 0.127 | 0.064 | (0.249, 0.001) | 0.176 | 0.072 | (0.314, 0.029) |
| $\beta_3$ | 0.145 | 0.022 | (0.188, 0.104) | 0.168 | 0.022 | (0.211, 0.125) |
| $\zeta_1$ | 1.045 | 0.100 | (1.242, 0.853) | 1.071 | 0.125 | (1.322, 0.836) |
| $\zeta_2$ | 2.433 | 0.207 | (2.805, 1.987) | 2.791 | 0.297 | (3.394, 2.177) |
| $\zeta_3$ | 2.864 | 0.352 | (3.476, 2.022) | 3.398 | 0.519 | (4.444, 2.288) |
| $\sigma_{\epsilon}^2$ | 0.006 | 0.003 | (0.003, 0.013) | 0.061 | 0.003 | (0.055, 0.067) |
| $\sigma_{b1}^2$ | 0.171 | 0.025 | (0.128, 0.224) | 0.192 | 0.025 | (0.148, 0.249) |
| $\sigma_{b2}^2$ | 0.735 | 0.198 | (0.400, 1.166) | 1.087 | 0.277 | (0.630, 1.741) |
| $\sigma_{b3}^2$ | 2.447 | 0.687 | (1.265, 3.954) | 4.021 | 1.231 | (2.054, 6.93) |
| $\sigma_{b4}^2$ | 3.610 | 1.364 | (1.44, 6.785) | 7.704 | 2.611 | (3.828, 13.8) |
| $\delta_{\epsilon}$ | 0.365 | 0.052 | (0.448, 0.282) | | | |
| $\kappa_{\epsilon}$ | 3.171 | 0.178 | (3.006, 3.644) | | | |
| The competing risks submodel parameters estimates | | | | | | |
| $\gamma_{11}$ | 0.006 | 0.012 | (0.030, 0.016) | 0.007 | 0.011 | (0.029, 0.015) |
| $\gamma_{12}$ | 0.169 | 0.350 | (0.500, 0.888) | 0.215 | 0.338 | (0.428, 0.901) |
| $\gamma_{13}$ | 1.923 | 0.661 | (0.673, 3.286) | 2.070 | 0.657 | (0.8954, 3.455) |
| $\gamma_{14}$ | 1.587 | 0.597 | (0.38, 2.71) | 1.539 | 0.588 | (0.345, 2.63) |
| $\gamma_{21}$ | 0.009 | 0.011 | (0.013, 0.032) | 0.009 | 0.011 | (0.012, 0.031) |
| $\gamma_{22}$ | 0.056 | 0.368 | (0.679, 0.768) | 0.067 | 0.365 | (0.642, 0.794) |
| $\gamma_{23}$ | 2.825 | 0.650 | (1.61, 4.151) | 2.956 | 0.669 | (1.728, 4.385) |
| $\gamma_{24}$ | 2.396 | 0.763 | (0.865, 3.899) | 2.358 | 0.798 | (0.848, 3.938) |
| Estimates of the association parameters of the joint models | | | | | | |
| $\alpha_{11}$ | 2.759 | 0.824 | (4.462, 1.274) | 2.894 | 0.714 | (4.508, 1.645) |
| $\alpha_{12}$ | 1.697 | 0.453 | (2.593, 0.848) | 1.359 | 0.340 | (2.071, 0.752) |
| $\alpha_{21}$ | 2.076 | 0.604 | (3.395, 0.999) | 1.904 | 0.549 | (3.07, 0.935) |
| $\alpha_{22}$ | 0.395 | 0.372 | (1.173, 0.287) | 0.352 | 0.307 | (0.994, 0.211) |
| DIC | 3835 | | | 5143 | | |
Table 4. Simulation results. The posterior mean estimate (est), relative bias (RB), root mean square error (RMSE) and coverage probability (CP) for each parameter of the joint models.
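| Para | TPV | SPJM-P1ST RB | SPJM-P1ST RMSE | SPJM-P1ST CP | SPJM-P1N RB | SPJM-P1N RMSE | SPJM-P1N CP |
|---|---|---|---|---|---|---|---|
| $\beta_1$ | 4.50 | 0.185 | 1.757 | 100.0 | 0.029 | 0.178 | 82.40 |
| $\beta_2$ | 0.25 | 0.046 | 0.094 | 94.80 | 0.165 | 0.109 | 92.37 |
| $\beta_3$ | 0.31 | 0.012 | 0.076 | 95.3 | 0.024 | 0.086 | 94.67 |
| $\zeta_1$ | 1.20 | 0.007 | 0.172 | 95.47 | 0.071 | 0.192 | 92.63 |
| $\zeta_2$ | 2.50 | 0.085 | 0.358 | 88.93 | 0.001 | 0.295 | 94.97 |
| $\zeta_3$ | 3.10 | 0.076 | 0.19 | 61.77 | 0.029 | 0.157 | 89.20 |
| $\sigma_{\epsilon}^2$ | 2.1 | 0.096 | 1.682 | 100.0 | 1.565 | 3.29 | 50.00 |
| $\sigma_{b1}^2$ | 0.1 | 0.134 | 0.037 | 95.00 | 0.142 | 0.051 | 93.93 |
| $\sigma_{b2}^2$ | 0.15 | 0.102 | 0.084 | 93.50 | 0.55 | 0.17 | 90.80 |
| $\sigma_{b3}^2$ | 0.25 | 0.390 | 0.251 | 92.20 | 0.502 | 0.301 | 92.73 |
| $\sigma_{b4}^2$ | 0.3 | 0.097 | 0.153 | 96.43 | 0.175 | 0.204 | 93.87 |
| $\gamma_{11}$ | 0.01 | 0.292 | 0.019 | 94.73 | 0.445 | 0.021 | 93.43 |
| $\gamma_{12}$ | 0.15 | 0.199 | 0.585 | 94.33 | 0.141 | 0.546 | 94.90 |
| $\gamma_{13}$ | 1.80 | 0.268 | 0.889 | 89.73 | 0.287 | 0.831 | 87.73 |
| $\gamma_{14}$ | 1.60 | 0.185 | 0.688 | 93.13 | 0.186 | 0.672 | 91.90 |
| $\gamma_{21}$ | 0.01 | 0.861 | 0.015 | 88.37 | 0.842 | 0.014 | 87.83 |
| $\gamma_{22}$ | 0.10 | 1.246 | 0.386 | 93.60 | 1.128 | 0.35 | 93.40 |
| $\gamma_{23}$ | 2.0 | 0.044 | 0.486 | 94.63 | 0.082 | 0.499 | 94.27 |
| $\gamma_{24}$ | 2.20 | 0.107 | 0.493 | 92.00 | 0.145 | 0.518 | 87.83 |
| $\alpha_{11}$ | 2.50 | 0.289 | 6.643 | 95.57 | 0.796 | 4.51 | 93.43 |
| $\alpha_{12}$ | 3.00 | 2.151 | 7.916 | 73.10 | 1.892 | 6.708 | 69.77 |
| $\alpha_{21}$ | 2.00 | 0.881 | 4.033 | 92.33 | 0.547 | 3.083 | 92.43 |
| $\alpha_{22}$ | 0.50 | 3.669 | 4.13 | 94.97 | 5.454 | 3.564 | 80.43 |
| DIC | | 14,590 | | | 21,030 | | |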
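As a reading aid for the RB, RMSE and CP columns of Table 4, the following minimal Python sketch shows one common way to compute such summaries from simulation replicates. It is an illustration under assumptions, not the authors' code: the function name `simulation_summary`, its arguments, and the exact definitions used (RB as mean bias divided by the true value, CP as the percentage of credible intervals containing the true value) are introduced here and may differ from the article's definitions.

```python
import numpy as np

def simulation_summary(estimates, lower, upper, true_value):
    """Summarise replicate results for one parameter (hypothetical helper).

    estimates  : posterior mean from each simulated data set
    lower/upper: credible-interval limits from each simulated data set
    true_value : value used to generate the data (assumed non-zero)
    """
    estimates = np.asarray(estimates, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    # Relative bias: average deviation from the truth, scaled by the truth.
    rb = (estimates.mean() - true_value) / true_value
    # Root mean square error of the replicate estimates around the truth.
    rmse = np.sqrt(np.mean((estimates - true_value) ** 2))
    # Coverage probability (%): share of intervals that contain the truth.
    cp = 100.0 * np.mean((lower <= true_value) & (true_value <= upper))
    return rb, rmse, cp

# Toy replicates for a parameter with true value 4.5 (made-up numbers).
rb, rmse, cp = simulation_summary(
    estimates=[4.4, 4.6, 4.5, 4.7],
    lower=[4.2, 4.4, 4.3, 4.6],
    upper=[4.6, 4.8, 4.7, 4.9],
    true_value=4.5,
)
print(rb, rmse, cp)
```

Under these definitions, a CP close to the nominal 95% together with small RB and RMSE indicates that the fitted model recovers the corresponding parameter well.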
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

MDPI and ACS Style

Ferede, M.M.; Mwalili, S.; Dagne, G.; Karanja, S.; Hailu, W.; El-Morshedy, M.; Al-Bossly, A. A Semiparametric Bayesian Joint Modelling of Skewed Longitudinal and Competing Risks Failure Time Data: With Application to Chronic Kidney Disease. Mathematics 2022, 10, 4816. https://doi.org/10.3390/math10244816