Explainable Survival Analysis of Censored Clinical Data Using a Neural Network Approach

De Santi, Lisa Anita; Orlandini, Francesca; Positano, Vincenzo; Pistoia, Laura; Sorrentino, Francesco; Messina, Giuseppe; Roberti, Maria Grazia; Missere, Massimiliano; Schicchi, Nicolò; Vallone, Antonino; Santarelli, Maria Filomena; Clemente, Alberto; Meloni, Antonella

doi:10.3390/biomedinformatics5020017


by Lisa Anita De Santi ¹,², Francesca Orlandini ¹,², Vincenzo Positano ², Laura Pistoia ³, Francesco Sorrentino ⁴, Giuseppe Messina ⁵, Maria Grazia Roberti ⁶, Massimiliano Missere ⁷, Nicolò Schicchi ⁸, Antonino Vallone ⁹, Maria Filomena Santarelli ¹⁰, Alberto Clemente ¹¹ and Antonella Meloni ²,*

¹ Department of Information Engineering, University of Pisa, 56122 Pisa, Italy
² Bioengineering Unit, Fondazione Toscana G. Monasterio, 56124 Pisa, Italy
³ Unità Operativa Semplice Dipartimentale Ricerca Clinica, Fondazione Toscana G. Monasterio, 56124 Pisa, Italy
⁴ Unità Operativa Semplice Dipartimentale Day Hospital Talassemici, Ospedale “Sant’Eugenio”, 00143 Rome, Italy
⁵ Centro Microcitemie, Grande Ospedale Metropolitano “Bianchi-Melacrino-Morelli”, 89100 Reggio Calabria, Italy
⁶ Servizio Trasfusionale, Azienda Ospedaliero-Universitaria OO.RR. Foggia, 71100 Foggia, Italy
⁷ Unità Operativa Complessa Radiodiagnostica, Gemelli Molise SpA, Fondazione di Ricerca e Cura “Giovanni Paolo II”, 86100 Campobasso, Italy
⁸ Dipartimento di Radiologia, Azienda Ospedaliero-Universitaria Ospedali Riuniti “Umberto I-Lancisi-Salesi”, 60020 Ancona, Italy
⁹ Reparto di Radiologia, Azienda Ospedaliera “Garibaldi” Presidio Ospedaliero Nesima, 95126 Catania, Italy
¹⁰ CNR Institute of Clinical Physiology, 56124 Pisa, Italy
¹¹ Department of Radiology, Fondazione Toscana G. Monasterio, 56124 Pisa, Italy

* Author to whom correspondence should be addressed.

BioMedInformatics 2025, 5(2), 17; https://doi.org/10.3390/biomedinformatics5020017

Submission received: 17 February 2025 / Revised: 13 March 2025 / Accepted: 25 March 2025 / Published: 27 March 2025

(This article belongs to the Section Medical Statistics and Data Science)


Abstract:

Survival analysis is a statistical approach widely employed to model the time of an event, such as a patient’s death. Classical approaches include the Kaplan–Meier estimator and Cox proportional hazards regression, the latter assuming a linear relationship between the covariates and the log hazard. However, the linearity assumption might pose challenges with high-dimensional data, thus stimulating interest in performing survival analysis using neural network models. In the present work, we implemented a deep Cox neural network (Cox-net) to predict the time of a cardiac event using patient data collected from the Myocardial Iron Overload in Thalassemia (MIOT) project. Cox-net achieved a concordance index (c-index) of 0.812 ± 0.036, outperforming the classical Cox regression (0.790 ± 0.040), and it demonstrated resilience to varying levels of censored patients. A permutation feature importance analysis identified fibrosis and sex as the most significant predictors, aligning with clinical knowledge. Cox-net was able to represent the nonlinear relationships between covariates and maintain reliable survival curve predictions in datasets with a large number of censored patients, making it a promising tool for determining the appropriate clinical pathway for thalassemic patients.

Keywords:

survival analysis; thalassemia major; cardiac event; neural network; deep learning; XAI; permutation feature importance

1. Introduction

Survival analysis, or time-to-event analysis, refers to a set of statistical tools used to estimate the time it takes for an event of interest to occur based on a series of observations. The name “survival analysis” is related to investigating mortality rates. However, the technique can be used to estimate any clinical event. Survival analysis is used in various fields, such as demographics, biology, engineering, epidemiology, and medicine [1]. In the last case, the event of interest can be, for example, the diagnosis of an illness or death. A model that is able to predict the time of a particular event can be very useful for doctors, as it allows them to triage patients very quickly. For instance, survival analysis methods were applied during the COVID-19 pandemic period to prioritize radiography for patients characterized by a higher risk of death [2]. Moreover, these methods were applied to determine which treatment was most suitable for a given patient based on the related risk of developing a particular event [3]. Nonparametric methods, such as the Kaplan–Meier estimator, represent the classical tools still used, while semiparametric methods, such as the Cox proportional hazards regression model, represent the current standard [4]. More recently, machine learning (ML) methods have been applied to survival tasks (e.g., Random Survival Forest [5] and boosting-based methods [6]). Finally, neural networks (NNs) have been applied to the field of survival analysis. A complete list of existing methods can be found in [7]. Most of the proposed methods are Cox-based. These methods basically modify the Cox regression model by parameterizing the hazard rate and minimizing the Cox loss. Discrete-time methods, such as DeepHit [8,9], sample time in discrete intervals and predict binary events in the interval. 
Other approaches include parametric methods, such as DeepWeiSurv [10]; piece-wise exponential model (PEM) methods, such as Piecewise Constant (PC)-Hazard [11]; ordinary differential equation (ODE)-based methods, such as DeepComplete [12]; and ranking-based methods, such as RankDeepSurv [13].

Among the Cox-based models, the most representative is DeepSurv [3]. DeepSurv uses a deep NN with nonlinear hidden layer activation functions. Improvements to DeepSurv were presented by Kvamme et al. [11] and Tong et al. [14], who introduced a modified norm for handling the problem of missing features. Wang et al. introduced SurvNet, which includes input construction and survival classification modules for handling missing values and high-/low-risk classification [15]. The Deep Cox Mixture (DCM) method exploits a Monte Carlo expectation maximization algorithm to estimate a mixture of Cox models [16]. Despite the high number of studies in the field, the problem of managing censored data has not been fully addressed in the current literature [7], so the influence of the percentage of censored data on methods based on NNs is poorly understood.

In the present work, we developed a Cox neural network to overcome the limits of classical Cox regression in performing survival analysis. The influence of censored patient incidence on network performances was investigated by analyzing synthetic data and a clinical database.

2. Materials and Methods

2.1. Survival Analysis

The survival function S(t) defines the probability that a subject will not experience the event of interest up to time t, where T denotes the random time at which the event occurs:

$$S(t) = P(T > t) \tag{1}$$

The hazard function, denoted by h(t), describes the risk of the event occurring at time t given that it has not happened before t:

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \leq T < t + \Delta t \mid T \geq t)}{\Delta t} \tag{2}$$

Equation (2) describes the instantaneous risk of an event occurring at time t since the time interval $\Delta t$ tends to 0. The relationship between the hazard function h(t) and the survival function S(t) can be described by the following equation:

$$h(t) = -\frac{S'(t)}{S(t)}, \tag{3}$$

where S′(t) is the first derivative of S(t). Hence, based on Equation (3), we can express S(t) as

$$S(t) = \exp\left[-\int_{0}^{t} h(z)\,dz\right] = \exp[-H(t)] \tag{4}$$

where H(t) is the cumulative hazard function. The most commonly used method for estimating S(t) is the Kaplan–Meier estimator, which is described by the following equation:

$$\hat{S}(t) = \prod_{i:\, t_{i} < t} \frac{n_{i} - d_{i}}{n_{i}} \tag{5}$$

The term $d_{i}$ is the number of patients with the event at time $t_{i}$, and $n_{i}$ is the number of subjects at risk of developing the event just prior to time $t_{i}$.
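The product in Equation (5) can be sketched in a few lines of standard-library Python. This is an illustrative sketch rather than the code used in the study: the function name `kaplan_meier` and the toy data are assumptions, and each returned value is the product once the factor for event time t_i has been included.

```python
def kaplan_meier(times, events):
    """Return [(t_i, S)] at each distinct event time (toy sketch).

    times  : observed times (event or censoring time)
    events : 1 if the event was observed, 0 if the subject was censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv, curve = 1.0, []
    for t_i in event_times:
        # n_i: subjects still at risk just prior to t_i
        n_i = sum(1 for t in times if t >= t_i)
        # d_i: subjects with the event exactly at t_i
        d_i = sum(1 for t, e in zip(times, events) if t == t_i and e == 1)
        surv *= (n_i - d_i) / n_i
        curve.append((t_i, surv))
    return curve

# Hypothetical data: five subjects, two of them censored (event = 0)
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Censored subjects never trigger a factor of their own, but they stay in the at-risk count n_i until their censoring time, which is how the estimator uses their partial information.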

The Cox Proportional Hazards Model [4] is the most popular and widespread survival regression model. It is a semi-parametric model, where the hazard risk at time t for a subject with a set of covariates $x_{1}, \dots, x_{p}$ can be expressed by the following equation:

$$h(t \mid X) = h_{0}(t) \exp(\beta_{1} x_{1} + \beta_{2} x_{2} + \dots + \beta_{p} x_{p}) \tag{6}$$

The term $h_{0}(t)$ is the baseline hazard function, which is the hazard risk when all the covariates are equal to 0, and $\beta_{1}, \dots, \beta_{p}$ are the model parameters describing the effect of the covariates on the hazard risk. The model can be defined as semi-parametric because, while the covariates enter the log hazard linearly, the baseline can take any form [17].

The covariates $x_{1}, \dots, x_{p}$ can be continuous or categorical. Two further variables are essential: (1) whether the event of interest has occurred (categorical) and (2) the time T at which the event occurred or, for patients who did not experience the event, the last time that they were seen (continuous) [18]. Censored subjects are those for whom the time of the event is unknown. The time can be unknown for two main reasons: the subject did not experience the event during the study observation period, or the patient left the study before the event occurred. Right-censored patients are those who are considered from the beginning of the study and did not experience the event before the end of the study or who left the study before the event occurred. Left-censored patients are those who did not experience the event before the end of the study and are considered in the study after the start time has already passed, so we cannot know whether they experienced the event in the period before being included in the study.

It is important to note that the only time-dependent element is the baseline hazard $h_{0}(t)$, which does not depend on the covariates or on the coefficients $\beta_{1}, \dots, \beta_{p}$. Therefore, an important quantity used to interpret the Cox model is the hazard ratio (HR), defined as the ratio between the hazard risk of two different subjects. Given two subjects i and j, the corresponding log partial hazards $\theta_{i}$ and $\theta_{j}$ are

$$\theta_{i} = \beta_{1} x_{i1} + \beta_{2} x_{i2} + \dots + \beta_{p} x_{ip}, \quad \theta_{j} = \beta_{1} x_{j1} + \beta_{2} x_{j2} + \dots + \beta_{p} x_{jp} \tag{7}$$

So the HR can be defined as

$$HR = \frac{h(t \mid X_{i})}{h(t \mid X_{j})} = \frac{h_{0}(t) \exp(\theta_{i})}{h_{0}(t) \exp(\theta_{j})} = \frac{\exp(\theta_{i})}{\exp(\theta_{j})} \tag{8}$$

If the hazard ratio is greater than 1, it means that subject i is more likely to develop the event than subject j, while if the HR is lower than 1, subject j has a higher probability of developing the event than subject i. A hazard ratio of 1 means that the two subjects have the same hazard risk. As shown in Equation (8), the HR between subjects does not depend on time t.

Considering Equation (6), we can interpret the model in the following manner:

If $\beta_{i} < 0$ ($\exp(\beta_{i}) < 1$), when the covariate increases, the event hazard decreases, and the survival time increases.
If $\beta_{i} = 0$ ($\exp(\beta_{i}) = 1$), the covariate has no effect on the event hazard.
If $\beta_{i} > 0$ ($\exp(\beta_{i}) > 1$), when the covariate increases, the event hazard increases, and the survival time decreases.
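As a small worked example of Equations (7) and (8), the sketch below computes the hazard ratio between two subjects. The coefficient values and covariate vectors are hypothetical and chosen only for illustration; they are not estimates from the MIOT data.

```python
import math

def hazard_ratio(beta, x_i, x_j):
    """Hazard ratio between subjects i and j under the Cox model (Eq. 8)."""
    theta_i = sum(b * x for b, x in zip(beta, x_i))  # log partial hazard of i
    theta_j = sum(b * x for b, x in zip(beta, x_j))  # log partial hazard of j
    return math.exp(theta_i - theta_j)

# Hypothetical coefficients: one binary covariate, one continuous covariate
beta = [0.7, -0.2]
# The two subjects differ only in the binary covariate
hr = hazard_ratio(beta, [1, 0.5], [0, 0.5])
```

Because the subjects differ by one unit in the first covariate only, the ratio reduces to exp(0.7), about 2.01, which matches the reading of a positive coefficient given above; the baseline hazard never enters the computation.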

It is important to note that, thanks to Cox regression, we can estimate the model parameters $\beta_{1}, \dots, \beta_{p}$ without knowing the baseline hazard $h_{0}(t)$. Then, we can estimate which covariates have the greatest impact on the hazard risk.

The parameters $\beta$ are obtained by maximizing the so-called partial likelihood [4], which can be expressed by the following equation:

$$L(\beta) = \prod_{i=1}^{n} \frac{h_{0}(t_{i})\, e^{\beta^{T} x_{i}}}{\sum_{j \in R_{i}} h_{0}(t_{i})\, e^{\beta^{T} x_{j}}} \tag{9}$$

where n is the number of subjects who experienced the event of interest, $x_{i}$ is the covariate vector of the subject who had the event at time $t_{i}$, and $R_{i}$ is the risk set, i.e., the indices of the patients still at risk at time $t_{i}$ (those with $t_{j} \geq t_{i}$). The baseline hazard cancels between the numerator and the denominator. Estimation is conducted using the log partial likelihood, which can be written as

$$l(\beta) = \log(L(\beta)) = \sum_{i=1}^{n} \left(\beta^{T} x_{i} - \log \sum_{j \in R_{i}} e^{\beta^{T} x_{j}}\right) \tag{10}$$

Using Equation (10), $\hat{\beta}$ can be found by minimizing the negative log partial likelihood:

$$\hat{\beta} = \underset{\beta}{\operatorname{argmin}} \{-l(\beta)\} \tag{11}$$

This is usually carried out using the Newton–Raphson method, an iterative method for approximating the solution of an equation f(x) = 0, applied here to the gradient of the log partial likelihood.
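The objective in Equations (10) and (11) can be evaluated directly, as in the minimal sketch below. It only computes the negative log partial likelihood for a given coefficient vector; it does not implement the Newton–Raphson iterations. The function name and the toy data are assumptions made for illustration.

```python
import math

def neg_log_partial_likelihood(beta, X, times, events):
    """Negative log partial likelihood of a Cox model (Eqs. 10 and 11)."""
    theta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    nll = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i != 1:
            continue  # censored subjects contribute only through risk sets
        # Risk set R_i: subjects whose observed time is at least t_i
        risk = [math.exp(theta[j]) for j, t_j in enumerate(times) if t_j >= t_i]
        nll -= theta[i] - math.log(sum(risk))
    return nll

# Hypothetical data: one covariate, five subjects, two censored
X = [[1.0], [0.0], [1.0], [0.0], [0.0]]
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
nll_zero = neg_log_partial_likelihood([0.0], X, times, events)
nll_half = neg_log_partial_likelihood([0.5], X, times, events)
```

With all coefficients equal to zero, every subject in a risk set is equally likely to be the one who has the event, so the value reduces to the sum of the logarithms of the risk-set sizes (here log 5 + log 4 + log 2).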

In addition to the coefficients, it is also possible to estimate the baseline hazard $h_{0}(t)$ by the Aalen–Breslow estimator [19], which is a generalization of the non-parametric Nelson–Aalen estimator of the cumulative hazard function.

The Aalen–Breslow estimator of the cumulative baseline hazard can be defined as:

$$\hat{H}_{0}(t) = \sum_{t_{i} \leq t} \frac{d_{i}}{\sum_{j \in R_{i}} e^{\beta^{T} x_{j}}} \tag{12}$$

The term $d_{i}$ is the number of patients with the event at time $t_{i}$, and $R_{i}$ is the risk set at $t_{i}$, i.e., the indices of the patients with $t_{j} \geq t_{i}$.

After obtaining the coefficient estimates and the baseline hazard, we can estimate the survival curve for each patient and, finally, the average survival curve for the entire study population.
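The sum in Equation (12) accumulates one increment per event time, so the sketch below treats it as a cumulative baseline hazard and turns it into a patient-specific survival curve through Equation (4) and the Cox model, S_i(t) = exp(−H_0(t) exp(θ_i)). Function names and data are hypothetical, and the risk scores θ are assumed to be already available.

```python
import math

def breslow_cumulative_hazard(theta, times, events):
    """Return [(t_i, H0(t_i))] at distinct event times (Eq. 12)."""
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    H0, out = 0.0, []
    for t_i in event_times:
        d_i = sum(1 for t, e in zip(times, events) if t == t_i and e == 1)
        # Denominator: sum of exp(theta_j) over the risk set at t_i
        denom = sum(math.exp(th) for th, t in zip(theta, times) if t >= t_i)
        H0 += d_i / denom
        out.append((t_i, H0))
    return out

def survival_curve(theta_i, baseline):
    """Survival curve of one subject: S_i(t) = exp(-H0(t) * exp(theta_i))."""
    return [(t, math.exp(-H0 * math.exp(theta_i))) for t, H0 in baseline]

times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
theta = [0.0] * 5  # hypothetical risk scores (all subjects at baseline)
baseline = breslow_cumulative_hazard(theta, times, events)
low_risk = survival_curve(-1.0, baseline)
high_risk = survival_curve(1.0, baseline)
```

Averaging such curves over all patients gives the population-level survival curve mentioned above.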

2.2. Dataset

2.2.1. Synthetic Dataset Analysis

We generated synthetic datasets, each with a varying percentage of censored patients. These synthetic populations can be designed to closely resemble real datasets by ensuring that each variable matches the distribution of its corresponding covariate in the real data. Additionally, the coefficients ($\beta$) can be adjusted to align with those obtained from the Cox regression applied to real reference data (Equation (11)).

We simulated the survival synthetic dataset using the following steps [20], as depicted in Figure 1:

(i): We designed the time axis, including k points in a time interval T. Internal points were randomly generated from a uniform distribution. We generated the cumulative distribution function (CDF) of event occurrences by randomly sampling CDF values from a uniform distribution in ascending order to simulate a monotone increasing function.
(ii): The obtained CDF samples were fitted to a cubic smoothing spline, obtaining the simulated CDF.
(iii): We calculated the probability density function (PDF) from the CDF.
(iv): We obtained the baseline survival function as (1 − CDF).
(v): Finally, we computed the baseline hazard by dividing the PDF by the survival function.

In addition to the baseline hazard, we need the covariate matrix X and the coefficient values $\beta$. After that, we can calculate the survival function for subject i using

$$S_{i}(t) = S_{0}(t)^{\exp(X_{i} \beta)} \tag{13}$$

For each subject i, we generated a value from a uniform distribution in the range [0, 1]. The corresponding event time value in the subject's CDF was taken as the subject survival time. Censored patients were randomly generated with a uniform distribution. Synthetic datasets were generated in R using the sim.survdata() function of the coxed package.
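The sampling step can be illustrated with a short inverse-transform sketch. It is not the coxed implementation: the exponential baseline survival function, the time grid, and the function name `simulate_times` are assumptions standing in for the spline-based baseline described in steps (i) to (v), and censoring is omitted.

```python
import math
import random

def simulate_times(S0, grid, risk_scores, rng):
    """Draw one event time per subject from S_i(t) = S0(t) ** exp(theta_i)."""
    times = []
    for theta in risk_scores:
        u = rng.random()          # uniform draw in [0, 1)
        t_event = grid[-1]        # fall back to the end of the time axis
        for t, s0 in zip(grid, S0):
            cdf_i = 1.0 - s0 ** math.exp(theta)  # subject-specific CDF
            if cdf_i >= u:
                t_event = t       # first grid time where the CDF reaches u
                break
        times.append(t_event)
    return times

rng = random.Random(0)
grid = [0.5 * k for k in range(1, 201)]       # hypothetical time axis
S0 = [math.exp(-0.05 * t) for t in grid]      # assumed baseline survival
low = simulate_times(S0, grid, [-1.0] * 500, rng)   # low-risk subjects
high = simulate_times(S0, grid, [1.0] * 500, rng)   # high-risk subjects
```

Subjects with a larger linear predictor reach any given CDF level earlier, so their simulated event times are shorter on average.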

2.2.2. Clinical Dataset

The clinical dataset used in the present study contains clinical data from thalassemia major (TM) patients enrolled in the Myocardial Iron Overload in Thalassemia (MIOT) network. The MIOT project is an Italian network devoted to prospectively following TM patients at thalassemia and magnetic resonance imaging (MRI) centers, connected by a web-based database configured to collect clinical, instrumental, and laboratory patient data [21]. Clinical data are updated at every MRI scan, which is performed per protocol every 18 ± 3 months. The dataset used in this study consists of 13 variables from 481 TM patients, acquired during the study, as shown in Table 1. The selection of clinical features was based on both clinical expertise and the existing literature focusing on cardiovascular risk. The standard MRI protocol included the following [21]:

The quantification of hepatic and cardiac iron overload by the T2* MRI technique, with the subsequent conversion of hepatic T2* values into liver iron concentration (LIC) values;
The measurement of left ventricular (LV) end-diastolic volume index (EDVI), LV mass index (MI), LV ejection fraction (EF), right ventricular (RV) EDVI, and RV EF from cine images;
The measurement of left atrial (LA) and right atrial (RA) area indices (AI) from cine images;
The detection of replacement myocardial fibrosis by the late gadolinium enhancement technique;
An assessment of serum ferritin levels within one month of the MRI scan.

Thirty-six patients experienced a cardiac event during the study. Cardiac events included heart failure (18), arrhythmia (16), and pulmonary hypertension (2). Heart failure, arrhythmia, and pulmonary hypertension were diagnosed by physicians based on symptoms, signs, and instrumental findings according to the American College of Cardiology (ACC)/American Heart Association (AHA) guidelines [22,23].

The MIOT database contains the characteristics of a typical survival analysis dataset:

Both continuous and categorical variables;
The censoring variable: a categorical variable that describes whether the patient is censored, that is, whether the event of interest (i.e., cardiac event) has occurred;
The time variable: a continuous variable that represents the time at which the event occurred or, for patients who did not experience the event, the last time that they were seen.

2.3. Data Pre-Processing

As commonly observed in clinical practice, some data are missing in the MIOT database. There are different methods for treating missing values [15,24], e.g., replacing the missing clinical values with synthetically generated values, but this would wrongly alter the intrinsic characteristics of the dataset and, in the same way, the results obtained. Hence, patients with missing values in the MIOT dataset were omitted from the analysis in this study. We then normalized both the MIOT and synthetic datasets by applying standardization to continuous and discrete clinical variables to deal with the exploding gradient problem [25]:

$$z = \frac{x - \mu}{\sigma} \tag{14}$$

where x is the original value, $\mu$ is the mean, and $\sigma$ is the standard deviation.
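Equation (14) amounts to the following column-wise operation. The sketch uses the population standard deviation, which is an assumption because the text does not state whether the population or the sample estimate was used; the example values are hypothetical.

```python
import statistics

def standardize(column):
    """Z-score standardization of one variable (Eq. 14)."""
    mu = statistics.fmean(column)
    sigma = statistics.pstdev(column)  # population SD (assumed)
    return [(x - mu) / sigma for x in column]

# Hypothetical values of one continuous clinical variable
column = [35.0, 42.0, 28.0, 51.0, 39.0]
z = standardize(column)
```

After the transformation the variable has zero mean and unit standard deviation, so all inputs reach the network on a comparable scale.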

2.4. Cox-Net Model

There are different NN-based methods for survival analysis in the literature, developed as alternatives to the Cox Proportional Hazard regression analysis, including Cox-net [26]. We developed a Cox-net model based on the design of existing survival analysis NN-models [3,8,13,26,27,28,29]. The proposed Cox-net had an input layer of 13 nodes (equal to the number of features), two fully connected hidden layers, both with 256 neurons, and a 1-neuron output layer without a bias term. The ReLU activation function was chosen instead of the tanh activation function used in the original model. The Cox-net architecture is represented in Figure 2.

The output predicts the log partial hazard of the Cox regression, expressed for patient i as

$$\theta_{i} = G\{W_{h1h2} [G(W_{inh1} X_{i} + b_{h1})] + b_{h2}\}^{T} W_{hout} \tag{15}$$

where $G(W_{inh1} X_{i} + b_{h1})$ is the output of the first hidden layer, $W_{inh1}$ is the coefficient weight matrix between the input and the first hidden layer, $b_{h1}$ is the vector of the bias terms of each hidden node of the first hidden layer, $W_{h1h2}$ is the coefficient weight matrix between the first hidden layer and the second hidden layer, $b_{h2}$ is the vector of the bias terms of each hidden node of the second hidden layer, $W_{hout}$ is the coefficient weight matrix between the second hidden layer and the output layer, and G is the tanh activation function:

$$G(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} \tag{16}$$

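We used the negative partial log-likelihood loss with ridge (L2) regularization:

$$pl(\beta) = -\sum_{C(i) = 1} \left(\theta_{i} - \log \sum_{j:\, t_{j} \geq t_{i}} e^{\theta_{j}}\right) + \lambda \|W\|_{2} \tag{17}$$

where the outer sum runs over the patients who experienced the event ($C(i) = 1$).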

We applied a dropout of 0.2 to the first fully connected layer to prevent overfitting.
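The forward pass of Equation (15) and the loss of Equation (17) can be sketched without any deep learning framework. This is a toy illustration under several assumptions, not the trained model: weights are random, dropout and training are omitted, the ridge term is taken as the squared L2 norm of the weight matrices, and the activation is passed as a parameter because the text mentions both ReLU and tanh. All function and key names are hypothetical.

```python
import math
import random

def dense(x, W, b=None):
    """Fully connected layer; W holds one row of weights per output unit."""
    out = [sum(w * v for w, v in zip(row, x)) for row in W]
    return out if b is None else [o + bi for o, bi in zip(out, b)]

def relu(v):
    return [max(0.0, a) for a in v]

def tanh(v):
    return [math.tanh(a) for a in v]

def init(n_out, n_in, rng):
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]

def coxnet_forward(x, params, act=relu):
    """Two hidden layers, then a single linear output without bias (Eq. 15)."""
    h1 = act(dense(x, params["W1"], params["b1"]))
    h2 = act(dense(h1, params["W2"], params["b2"]))
    return dense(h2, params["Wout"])[0]

def cox_loss(theta, times, events, params, lam):
    """Negative partial log-likelihood plus an L2 penalty (Eq. 17)."""
    nll = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i == 1:  # only observed events contribute a term
            log_risk = math.log(sum(math.exp(theta[j])
                                    for j, t_j in enumerate(times) if t_j >= t_i))
            nll -= theta[i] - log_risk
    # Squared L2 norm of all weight matrices (assumed form of the ridge term)
    l2 = sum(w * w for key in ("W1", "W2", "Wout") for row in params[key] for w in row)
    return nll + lam * l2

rng = random.Random(0)
n_features, hidden = 13, 256  # layer sizes stated in the text
params = {"W1": init(hidden, n_features, rng), "b1": [0.0] * hidden,
          "W2": init(hidden, hidden, rng), "b2": [0.0] * hidden,
          "Wout": init(1, hidden, rng)}
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(5)]
theta = [coxnet_forward(x, params) for x in X]
theta_tanh = [coxnet_forward(x, params, act=tanh) for x in X]
loss = cox_loss(theta, [2, 3, 3, 5, 8], [1, 1, 0, 1, 0], params, lam=0.0005)
```

The only structural difference from the linear Cox model is that the risk score θ comes from the network instead of a linear combination of the covariates; the loss itself is unchanged apart from the penalty.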

2.5. Synthetic Data Generation

Since the MIOT dataset was highly unbalanced, with only 8% of the patients experiencing the event, we used synthetic populations to investigate how the proportion of censored patients affects Cox-net training and convergence. We generated eight synthetic datasets, each with a varying percentage of censored patients (0, 20, 35, 50, 65, 80, 95, and 98%). These synthetic populations were designed to closely resemble the MIOT dataset: the coefficients ($\beta$) were set equal to those obtained from the Cox regression applied to the MIOT data (Equation (11)), and each continuous variable had the same distribution (mean and standard deviation) as the corresponding covariate in the real dataset, as shown in Table 1. All the generated populations had 1000 patients with 13 features, and, as in the MIOT dataset, 2 of the features were categorical: sex and fibrosis. Considering the synthetic dataset without censored patients, by separating the entire dataset into two populations according to whether they had fibrosis or whether they were male or female and by applying the Kaplan–Meier estimator, we obtained two completely separate survival curves. In Figure 3, we can see that patients with fibrosis (Figure 3a) and male patients (Figure 3b) have a higher probability of developing a cardiac event.

2.6. Model Training and Validation

We trained Cox-net using K-fold stratified cross-validation with K = 3 on both the synthetic and MIOT datasets in order to determine whether the model could be generalized to different data. To take into account the presence of a large number of censored patients, we applied stratified cross-validation and chose a small K value to ensure that all folds had patients with a reasonable number of events. We applied a stratified split so that the size of the test set was 20% of the total dataset. Then, according to the K-fold cross-validation, the remaining patients were divided into three folds randomly during each training session, and one of these folds was considered as the validation set, while the other folds were considered as the training set.

The SGD Nesterov optimizer was applied, with a learning rate of 0.00001, a dropout of 0.4, and a momentum of 0.9. The ridge regularization parameter $\lambda$ presented in Equation (17) was set equal to 0.0005. We applied early stopping (patience = 20) to prevent overfitting by saving the model parameters corresponding to the minimum validation loss value.
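The splitting scheme described above (a stratified 20% test set, then three stratified folds on the remaining patients) can be sketched as follows. The helper names and the toy cohort of 100 patients with 10 events are assumptions for illustration; the study's actual splitting code is not given in the text.

```python
import random

def stratified_holdout(events, test_fraction, rng):
    """Split indices into (rest, test), keeping the event rate in both parts."""
    rest, test = [], []
    for label in (0, 1):
        idx = [i for i, e in enumerate(events) if e == label]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test += idx[:n_test]
        rest += idx[n_test:]
    return rest, test

def stratified_folds(events, k, rng):
    """Assign positions 0..len(events)-1 to k folds, spreading events evenly."""
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        idx = [i for i, e in enumerate(events) if e == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

rng = random.Random(0)
events = [1] * 10 + [0] * 90          # hypothetical cohort, 10% events
rest, test = stratified_holdout(events, 0.2, rng)
rest_events = [events[i] for i in rest]
# Map fold positions back to the original patient indices
folds = [[rest[i] for i in fold] for fold in stratified_folds(rest_events, 3, rng)]
```

In each cross-validation round, one fold serves as the validation set and the other two form the training set. With so few events, stratification is what guarantees that no fold ends up without any event, which would make the c-index undefined on that fold.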

2.7. Model Evaluation: C-Index

We evaluated Cox-net using the concordance index (c-index) [30], which is the most frequently used evaluation metric for survival models. The c-index is defined as the ratio of concordant pairs to comparable pairs:

$$c = \frac{\#\,\text{concordant pairs}}{\#\,\text{concordant pairs} + \#\,\text{discordant pairs}} \tag{18}$$

Two samples i and j are comparable if the sample with the lower observed time experienced an event, i.e., if $t_{j} > t_{i}$ and $e_{i} = 1$. We estimated the c-index value with Harrell’s c-index, which estimates the fraction of correctly ordered pairs out of all comparable pairs in the dataset. The principle of Harrell’s c-index is that the patient with the higher risk score should have a shorter time to disease.

Considering patients i and j, different situations can arise:

If both i and j are not censored, then we can determine when both patients developed the disease. The pair $(i, j)$ is determined to be a concordant pair if $θ_{i} > θ_{j}$ and $t_{i} < t_{j}$ , and it is determined to be a discordant pair if $θ_{i} > θ_{j}$ and $t_{i} > t_{j}$ .
If both i and j are censored, then we do not know who developed the disease first (if at all), so we do not consider this pair in the computation.
If either i or j is censored, we only observe one disease. For example, if patient i develops the disease at time $t_{i}$ and patient j is censored, then two different situations can arise:
– If $t_{j} < t_{i}$ , then we do not know for sure who developed the disease first, so we do not consider this pair in the computation.
– If $t_{j} > t_{i}$ , then we know for sure that patient i developed the disease first. Hence, $(i, j)$ is a concordant pair if $θ_{i} > θ_{j}$ , and it is a discordant pair if $θ_{i} < θ_{j}$ .
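The case analysis above translates into a short pair-counting routine. The sketch follows Equation (18) literally, so pairs with tied risk scores count as neither concordant nor discordant; some library implementations instead give ties half credit. The function name and the toy data are assumptions.

```python
def harrell_c_index(theta, times, events):
    """Fraction of comparable pairs ordered correctly by the risk score."""
    concordant = discordant = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # the subject with the shorter time must have had the event
        for j in range(n):
            if times[j] > times[i]:        # comparable pair: i had the event first
                if theta[i] > theta[j]:
                    concordant += 1        # higher risk, shorter time
                elif theta[i] < theta[j]:
                    discordant += 1
    return concordant / (concordant + discordant)

# Hypothetical risk scores for five subjects, two of them censored
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
theta = [2.0, 0.2, 1.5, 0.5, 0.0]
c = harrell_c_index(theta, times, events)
```

In this example there are seven comparable pairs, six of which are ordered correctly, giving c = 6/7. The pair formed by the subject with an event at time 3 and the subject censored at time 3 is never counted, because neither time is strictly larger.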

2.8. Explanation by Permutation Feature Importance

Neural networks are intrinsically more complex than the standard survival models, so their level of interpretability is generally low. Explainable Artificial Intelligence (XAI) helps solve this issue. In particular, evaluation against adversarial perturbation consists of applying a perturbation process to the input data and using evaluation metrics to quantify the faithfulness of the explanation to the AI model’s internal decision process [31]. In addition to evaluation against adversarial perturbation methods, we used the permutation feature importance method [32] to assess the importance of covariates in the NN-based prediction. Permutation feature importance (PFI) measures the increase in the prediction error of the model after the values of a feature are permuted. Here, a feature is considered “important” if shuffling its values increases the model prediction error, indicating that the model relied on the feature for the prediction. In contrast, a feature is “unimportant” if shuffling its values leaves the model prediction error unchanged. In survival analysis, PFI can be used to assess the importance of a covariate in the prediction of an event.

Let us consider an optimal predictive model with the associated tabular dataset D containing p features. The c-index value $e_{max}$ computed on D represents the best possible performance of the model. To assess the PFI value for the feature j, the column j of the dataset D is randomly shuffled K times; each repetition k (k in 1, …, K) generates a corrupted version of the dataset, called $D_{kj}$, on which the c-index value $e_{kj}$ is computed. The feature importance $i_{j}$ related to the feature j is calculated as

$$i_{j} = e_{max} - \frac{1}{K} \sum_{k=1}^{K} e_{kj} \tag{19}$$

A PFI analysis was performed using the permutation_importance function of the Python library sklearn.inspection to determine which features are the most predictive.
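Equation (19) can also be written out by hand, which makes the procedure explicit. The sketch below is a standard-library stand-in, not the scikit-learn routine used in the study: the linear `predict` function replaces the trained Cox-net, and the data-generating process, names, and parameter values are all hypothetical.

```python
import math
import random

def c_index(theta, times, events):
    """Harrell's c-index, ignoring pairs with tied risk scores."""
    conc = disc = 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i != 1:
            continue
        for j, t_j in enumerate(times):
            if t_j > t_i:
                conc += theta[i] > theta[j]
                disc += theta[i] < theta[j]
    return conc / (conc + disc)

def permutation_importance(predict, X, times, events, n_repeats, rng):
    """Mean drop in c-index after shuffling each column (Eq. 19)."""
    e_max = c_index([predict(row) for row in X], times, events)
    importances = []
    for j in range(len(X[0])):
        scores = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and outcome
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            scores.append(c_index([predict(r) for r in X_perm], times, events))
        importances.append(e_max - sum(scores) / n_repeats)
    return importances

rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(60)]
predict = lambda row: 1.5 * row[0]     # toy risk model: uses feature 0 only
event_t = [rng.expovariate(math.exp(predict(row))) for row in X]
censor_t = [rng.expovariate(0.5) for _ in X]
times = [min(t, c) for t, c in zip(event_t, censor_t)]
events = [int(t <= c) for t, c in zip(event_t, censor_t)]
importances = permutation_importance(predict, X, times, events, 10, rng)
```

Shuffling the feature that drives the risk score lowers the c-index towards 0.5, while shuffling a feature the model ignores leaves every prediction, and therefore the c-index, unchanged.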

3. Results

3.1. Synthetic Dataset Results

We generated synthetic datasets mimicking the characteristics of the MIOT dataset previously described. We created eight synthetic datasets containing 356 patients each, with different percentages of censored patients (range 0–98%), to investigate the influence of the percentage of censored patients on network performance. The test set included 72 patients, while the training/validation set included 284 patients (189 in the training set, and 95 in the validation set). Table 2 reports the performance of Cox-net and the standard Cox regression in the three validation folds used in the network training. The training and validation loss curves in the worst case (simulated dataset with 98% of censored patients) are reported in Figure 4.

Cox-net outperformed standard Cox regression, especially when the number of censored patients increased. In fact, the average c-index of Cox-net among folds remained above 0.8, while the average c-index of the Cox regression dropped below 0.8 when the number of censored patients increased up to 35%. The performance reproducibility among folds decreased with the increase in the number of censored patients for both models.

The model with the best performance (i.e., the one trained in Fold 1) was used to perform test set analysis (Figure 5). The results obtained on the test set were consistent with those obtained on the validation set, with a decrease in the performances in the simulated datasets with a higher number of censored patients. Overall, Cox-net performed better than the standard Cox regression in all datasets.

The survival curves obtained for the test set are shown in Figure 6 and compared with the Kaplan–Meier curves taken as references. When the number of censored patients was low, the survival curves generated by Cox-net and standard Kaplan–Meier were similar. For a higher number of censored patients, the predicted survival ratio significantly differed, with a higher risk estimation produced by the Kaplan–Meier estimator. The permutation analysis results are presented in Figure 7. As previously described, permutation feature importance analysis determines which features have the greatest influence on the output of Cox-net.

From the permutation analysis, we noticed that myocardial fibrosis and sex were the features with the highest impact on the log partial hazard. This finding was consistent among the different percentages of censored patients. We also noticed that, as the number of censored patients increased, the uncertainty in the permutation analysis rose.

3.2. MIOT Dataset Results

About 90% of the patients in the MIOT database were censored. After omitting patients with missing values, we analyzed 356 patients. Using stratified three-fold cross-validation, we obtained a training set of 189 patients, including 15 with the event, and a validation set of 95 patients, including 8 with the event, in the first and second folds. In the third fold, we obtained a training set of 190 patients, including 16 with the event, and a validation set of 94 patients, including 7 with the event. The test set contained 72 patients, including 6 with the event.

Table 3 reports the cross-validation results for the training of the Cox-net model. The results resembled those obtained in the synthetic data analysis, with a low reproducibility among folds showing a strong data dependence. Cox-net showed better performances than the standard Cox regression. The model with the best performance (i.e., the Fold 2 model) was used in the test set analysis.

Figure 8 shows the results of the permutation analysis introduced in Section 2.8 for the test set. Myocardial fibrosis appeared as the most important prediction factor, followed by sex. Based on this finding, we divided the test set into two different groups, first considering fibrosis and then considering sex, and we plotted the associated survival curves.

Figure 9 shows the survival curve analysis considering the fibrosis and sex covariates for the test set. Cox-net discriminated the two patient groups well.

4. Discussion

Artificial intelligence has revolutionary potential in healthcare applications [33]. Neural network models have strong nonlinear modeling capabilities, which allow them to effectively capture complex relationships among clinical variables. These models can provide accurate, personalized survival predictions. In this study, we trained and tested a neural network model (Cox-net) on synthetic and real clinical datasets to perform a survival analysis. In particular, we investigated the influence of the presence of censored patients on network performance and stability. As shown in Table 2, the performances of Cox-net and standard Cox regression were similar and reproducible among folds if the percentage of censored patients was low (<20%). As the percentage of censored patients increased, Cox-net outperformed standard Cox regression. For a high percentage of censored patients (i.e., >90%), the increase in the value of the c-index reached about 10%. However, the reproducibility of the performances among the folds decreased with the percentage of censored patients, revealing a significant dependence on the distribution of the data in the folds. Hence, the stability of the model showed a strong dependence on the number of events rather than the number of subjects. This finding was confirmed by examining the loss curves in the model training performed on a synthetic dataset with a large number of censored patients (Figure 4). In this case, a suboptimal convergence of the training procedure (Fold 2) was associated with a lower performance (CI = 0.745 in Fold 2 vs. 0.857 and 0.863 in Folds 1 and 3; see Table 2). The assessed trend was confirmed in a test set analysis (Figure 5). The performances of Cox-net were good (c-index about 80%) among a large range of censored patient percentages (0–80%) and better than those of standard Cox regression. A high percentage of censored patients led to a significant decrease in the value of the c-index. 
The findings of the survival curve analysis matched those of the c-index performance analysis (Figure 6). Cox-net allowed the importance of covariates to be assessed by permutation feature importance (PFI). As shown in Figure 7, the ability of PFI to identify important covariates was reduced in the presence of many censored patients, although PFI remained effective even at very high percentages of censored patients.
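The PFI procedure can be sketched as follows. This is a generic illustration, not the implementation used in this study: `predict` stands for any trained risk model, `score_fn` for any performance measure where higher is better (for instance a c-index computed against held-out times and events), and both names, as well as the default number of repetitions, are hypothetical.

```python
import numpy as np

def permutation_feature_importance(predict, X, score_fn, n_repeats=10, seed=0):
    """Mean and SD of the score drop observed when each column of X is shuffled.

    predict:  callable mapping an (n, p) array to n risk predictions
    score_fn: callable mapping predictions to a scalar score (higher is better)
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    baseline = score_fn(predict(X))

    drops = np.empty((X.shape[1], n_repeats))
    for j in range(X.shape[1]):
        for r in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its link with the outcome
            # while preserving its marginal distribution.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops[j, r] = baseline - score_fn(predict(X_perm))

    # A large mean drop marks a covariate the model relies on.
    return drops.mean(axis=1), drops.std(axis=1)
```

A covariate that the model ignores produces a drop of exactly zero, whereas a covariate carrying most of the signal produces a drop close to the gap between the baseline score and chance level.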

We used Cox-net to analyze a clinical database characterized by a high number of censored patients (>80%). Cross-validation and a test set analysis confirmed the findings obtained on the synthetic data (Table 3). Cox-net showed good performance, although its reproducibility across folds was low. The PFI analysis clearly detected the dependence of the survival rate on fibrosis, and a dependence on sex and on the left atrium area was also found (Figure 8). The findings of the PFI analysis align with those of previous clinical research, confirming that myocardial fibrosis is a strong predictor of cardiac risk in TM [34,35] and further highlighting the value of fully utilizing the multiparametric capabilities of magnetic resonance. Indeed, myocardial fibrosis represents a reparative response to myocardial injury and is considered an irreversible process. Over time, it leads to worsened ventricular function, abnormal remodeling, and increased stiffness, potentially resulting in left ventricular failure and heart failure symptoms. Additionally, reduced compliance and elevated filling pressures can raise left atrial pressures, promoting structural changes that increase the risk of atrial fibrillation [36]. Similarly, male sex is a well-established risk factor for cardiac complications in TM [37,38], likely due to a lower tolerance to iron toxicity. This increased vulnerability is thought to be driven by a greater sensitivity to chronic oxidative stress [39]. The link between the left atrial area and survival can be explained by the high incidence of arrhythmias in our population, as more than half of our patients had supraventricular arrhythmias. Atrial dilatation, leading to increased atrial pressures and altered electrical conduction, plays an important part in arrhythmia pathogenesis [40]. As shown in Figure 9, Cox-net was able to produce survival curves consistent with clinical findings.
However, as expected from the synthetic data analysis, the predictions made by Cox-net did not perfectly overlap those made by standard Cox regression, owing to the high proportion of censored patients in the MIOT database.

This study has some limitations. The study results were affected by the relatively small sample size and low incidence of cardiovascular events. Hence, it was not possible to consider the different types of cardiovascular events separately. Patients with missing values in the MIOT database were excluded from the analysis. The explicit handling of missing values has been demonstrated in the literature [15,41] and could be employed to better manage patients with missing data. The model’s discriminatory ability was measured by the concordance index because of its established use and interpretability in survival analysis. Other indices, such as Uno’s C index and Dynamic AUC, could be used to further assess the model’s predictive performance. The primary goal of this study was to evaluate the performance of the deep Cox neural network compared to that of Cox regression, which remains one of the most widely used statistical methods in survival analysis. A comparison of the proposed model with other deep learning models proposed in the literature was not performed and will be the objective of further studies.
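For reference, the concordance index mentioned above can be sketched as Harrell's estimator for right-censored data. This is a minimal illustration under the assumption that this estimator is the one intended; the function name and the handling of tied predictions (counted as 0.5) are illustrative choices. Uno's C index would additionally weight each pair by the inverse probability of censoring, which is not shown here.

```python
import numpy as np

def harrell_c_index(times, events, risk):
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when subject i had an observed event strictly
    before the follow-up time of subject j. It is concordant when subject i
    also has the higher predicted risk; tied predictions count as 0.5.
    Returns (c_index, number_of_comparable_pairs).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    risk = np.asarray(risk, dtype=float)

    # comparable[i, j] is True when i had an event before time j.
    comparable = events[:, None] & (times[:, None] < times[None, :])
    diff = risk[:, None] - risk[None, :]

    concordant = np.sum(comparable & (diff > 0)) + 0.5 * np.sum(comparable & (diff == 0))
    n_pairs = int(comparable.sum())
    if n_pairs == 0:
        raise ValueError("no comparable pairs: the c-index is undefined")
    return float(concordant / n_pairs), n_pairs
```

The second return value makes the effect of censoring explicit: with four subjects and four events there are six comparable pairs, whereas if only the earliest subject has an event, three remain, so the estimate rests on far fewer comparisons.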

5. Conclusions

Cox-net represents a possible improvement in survival analysis, as it is able to capture the nonlinear relationships between covariates and maintain acceptable performance in the presence of a high percentage of censored patients. The permutation feature importance method is a valid instrument for assessing the importance of covariates in Cox-net prediction. Hence, Cox-net offers a promising tool for improving risk stratification and supporting the clinical decision-making process. Future work will include expanding our analysis to larger and more diverse datasets to validate the generalizability of our approach across various clinical settings.

Author Contributions

Conceptualization, L.A.D.S., M.F.S., V.P. and A.M.; methodology, L.A.D.S. and F.O.; software, L.A.D.S. and F.O.; validation, L.A.D.S., V.P. and A.M.; resources, M.F.S.; data curation, L.P., F.S., G.M., M.G.R., M.M., N.S. and A.V.; writing—original draft preparation, L.A.D.S. and F.O.; writing—review and editing, L.A.D.S., V.P. and A.M.; visualization, F.O. and V.P.; supervision, A.C. and A.M.; project administration, A.C.; funding acquisition, A.C. and A.M. All authors have read and agreed to the published version of the manuscript.

Funding

The E-MIOT project receives “no-profit support” from industrial sponsorships (Chiesi Farmaceutici S.p.A. and Bayer). The funders had no role in the study design, data collection, or analysis; the decision to publish; or the preparation of the manuscript.

Institutional Review Board Statement

This study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Ethics Committee of Area Vasta Nord Ovest (protocol code 56664, date of approval 8 October 2015).

Informed Consent Statement

Informed consent was obtained from all patients involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CDF	cumulative distribution function
c-index	concordance index
HR	hazard ratio
MIOT	Myocardial Iron Overload in Thalassemia
MRI	magnetic resonance imaging
NN	neural networks
PDF	probability density function
PFI	permutation feature importance
TM	thalassemia major

References

1. Clark, T.; Bradburn, M.; Love, S.; Altman, D. Survival Analysis Part I: Basic concepts and first analyses. Br. J. Cancer 2003, 89, 232–238.
2. Matsumoto, T.; Walston, S.L.; Walston, M.; Kabata, D.; Miki, Y.; Shiba, M.; Ueda, D. Deep Learning–Based Time-to-Death Prediction Model for COVID-19 Patients Using Clinical Data and Chest Radiographs. J. Digit. Imaging 2023, 36, 178–188.
3. Katzman, J.L.; Shaham, U.; Cloninger, A.; Bates, J.; Jiang, T.; Kluger, Y. DeepSurv: Personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med. Res. Methodol. 2018, 18, 24.
4. Cox, D.R. Partial Likelihood. Biometrika 1975, 62, 269–276.
5. Ishwaran, H.; Kogalur, U.B.; Blackstone, E.H.; Lauer, M.S. Random survival forests. Ann. Appl. Stat. 2008, 2, 841–860.
6. Binder, H.; Schumacher, M. Allowing for mandatory covariates in boosting estimation of sparse high-dimensional survival models. BMC Bioinform. 2008, 9, 14.
7. Wiegrebe, S.; Kopper, P.; Sonabend, R.; Bischl, B.; Bender, A. Deep learning for survival analysis: A review. Artif. Intell. Rev. 2024, 57, 65.
8. Lee, C.; Zame, W.R.; Yoon, J.; Van Der Schaar, M. DeepHit: A deep learning approach to survival analysis with competing risks. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; pp. 2314–2321.
9. Lee, C.; Yoon, J.; Van Der Schaar, M. Dynamic-DeepHit: A Deep Learning Approach for Dynamic Survival Analysis with Competing Risks Based on Longitudinal Data. IEEE Trans. Biomed. Eng. 2020, 67, 122–133.
10. Bennis, A.; Mouysset, S.; Serrurier, M. Estimation of Conditional Mixture Weibull Distribution with Right Censored Data Using Neural Network for Time-to-Event Analysis. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer International Publishing: Berlin/Heidelberg, Germany, 2020; Volume 12084 LNAI, pp. 687–698.
11. Kvamme, H.; Borgan, Ø. Continuous and discrete-time survival prediction with neural networks. Lifetime Data Anal. 2021, 27, 710–736.
12. Aastha; Huang, P.; Liu, Y. DeepCompete: A deep learning approach to competing risks in continuous time domain. In Proceedings of the AMIA Annual Symposium Proceedings, Online, 14–18 November 2020; Volume 2020, pp. 177–186.
13. Jing, B.; Zhang, T.; Wang, Z.; Jin, Y.; Liu, K.; Qiu, W.; Ke, L.; Sun, Y.; He, C.; Hou, D.; et al. A deep survival analysis method based on ranking. Artif. Intell. Med. 2019, 98, 1–9.
14. Tong, L.; Mitchel, J.; Chatlin, K.; Wang, M.D. Deep learning based feature-level integration of multi-omics data for breast cancer patients survival analysis. BMC Med. Inform. Decis. Mak. 2020, 20, 225.
15. Wang, J.; Chen, N.; Guo, J.; Xu, X.; Liu, L.; Yi, Z. SurvNet: A Novel Deep Neural Network for Lung Cancer Survival Analysis with Missing Values. Front. Oncol. 2021, 10, 588990.
16. Nagpal, C.; Yadlowsky, S.; Rostamzadeh, N.; Heller, K. Deep Cox Mixtures for Survival Regression. In Proceedings of the Machine Learning for Healthcare Conference, Virtual, 6–7 August 2021; Volume 149, pp. 674–708.
17. Devarajan, K.; Ebrahimi, N. A semi-parametric generalization of the Cox proportional hazards regression model: Inference and applications. Comput. Stat. Data Anal. 2011, 55, 667–676.
18. Kartsonaki, C. Survival analysis. Diagn. Histopathol. 2016, 22, 263–270.
19. Lin, D. On the Breslow estimator. Lifetime Data Anal. 2007, 13, 471–480.
20. Harden, J.J.; Kropko, J. Simulating duration data for the cox model. Political Sci. Res. Methods 2019, 7, 921–928.
21. Pepe, A.; Pistoia, L.; Gamberini, M.R.; Cuccia, L.; Lisi, R.; Cecinati, V.; Maggio, A.; Sorrentino, F.; Filosa, A.; Rosso, R.; et al. National networking in rare diseases and reduction of cardiac burden in thalassemia major. Eur. Heart J. 2021, 43, 2482–2492.
22. Buxton, A.E.; Calkins, H.; Callans, D.J.; DiMarco, J.P.; Fisher, J.D.; Greene, H.L.; Haines, D.E.; Hayes, D.L.; Heidenreich, P.A.; Miller, J.M.; et al. ACC/AHA/HRS 2006 key data elements and definitions for electrophysiological studies and procedures: A report of the American College of Cardiology/American Heart Association Task Force on Clinical Data Standards (ACC/AHA/HRS Writing Committee to Develop Data Standards on Electrophysiology). Circulation 2006, 114, 2534–2570.
23. Jessup, M.; Abraham, W.T.; Casey, D.E.; Feldman, A.M.; Francis, G.S.; Ganiats, T.G.; Konstam, M.A.; Mancini, D.M.; Rahko, P.S.; Silver, M.A.; et al. 2009 Focused Update: ACCF/AHA Guidelines for the Diagnosis and Management of Heart Failure in Adults. Circulation 2009, 119, 1977–2016.
24. Batista, G.E.A.P.A.; Monard, M.C. An analysis of four missing data treatment methods for supervised learning. Appl. Artif. Intell. 2003, 17, 519–533.
25. Philipp, G.; Song, D.; Carbonell, J.G. Gradients explode—Deep Networks are shallow—ResNet explained. CoRR 2017. Available online: https://openreview.net/pdf?id=HkpYwMZRb (accessed on 16 February 2025).
26. Ching, T.; Zhu, X.; Garmire, L.X. Cox-nnet: An artificial neural network method for prognosis prediction of high-throughput omics data. PLoS Comput. Biol. 2018, 14, e1006076.
27. Akbas, K.; Balıkçı Çiçek, I.; Kaya, M.; Colak, C. Comparison of Performance of Deep Survival and Cox Proportional Hazard Models: An Application on the Lung Cancer Dataset. Med. Sci. Int. Med. J. 2022, 11, 1202.
28. Zhao, L.; Feng, D. Deep neural networks for survival analysis using pseudo values. IEEE J. Biomed. Health Inform. 2020, 24, 3308–3314.
29. Bice, N.; Kirby, N.; Bahr, T.; Rasmussen, K.; Saenz, D.; Wagner, T.; Papanikolaou, N.; Fakhreddine, M. Deep learning-based survival analysis for brain metastasis patients with the national cancer database. J. Appl. Clin. Med. Phys. 2020, 21, 187–192.
30. Alabdallah, A.; Ohlsson, M.; Pashami, S.; Rögnvaldsson, T. The Concordance Index decomposition: A measure for a deeper understanding of survival prediction models. Artif. Intell. Med. 2024, 148, 102781.
31. Ghorbani, A.; Abid, A.; Zou, J. Interpretation of Neural Networks Is Fragile. arXiv 2019, arXiv:1710.10547.
32. Molnar, C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable, 2nd ed.; 2022; Available online: https://christophm.github.io/interpretable-ml-book/ (accessed on 16 February 2025).
33. Schwalbe, N.; Wahl, B. Artificial intelligence and the future of global health. Lancet 2020, 395, 1579–1586.
34. Pepe, A.; Meloni, A.; Rossi, G.; Midiri, M.; Missere, M.; Valeri, G.; Sorrentino, F.; D’Ascola, D.G.; Spasiano, A.; Filosa, A.; et al. Prediction of cardiac complications for thalassemia major in the widespread cardiac magnetic resonance era: A prospective multicentre study by a multi-parametric approach. Eur. Heart J. Cardiovasc. Imaging 2018, 19, 299–309.
35. Meloni, A.; Pistoia, L.; Gamberini, M.R.; Cuccia, L.; Lisi, R.; Cecinati, V.; Ricchi, P.; Gerardi, C.; Restaino, G.; Righi, R.; et al. Multi-Parametric Cardiac Magnetic Resonance for Prediction of Heart Failure Death in Thalassemia Major. Diagnostics 2023, 13, 890.
36. Mewton, N.; Liu, C.Y.; Croisille, P.; Bluemke, D.; Lima, J.A. Assessment of myocardial fibrosis with cardiovascular magnetic resonance. J. Am. Coll. Cardiol. 2011, 57, 891–903.
37. Borgna-Pignatti, C.; Rugolotto, S.; De Stefano, P.; Zhao, H.; Cappellini, M.D.; Del Vecchio, G.C.; Romeo, M.A.; Forni, G.L.; Gamberini, M.R.; Ghilardi, R.; et al. Survival and complications in patients with thalassemia major treated with transfusion and deferoxamine. Haematologica 2004, 89, 1187–1193.
38. Pepe, A.; Gamberini, M.R.; Missere, M.; Pistoia, L.; Mangione, M.; Cuccia, L.; Spasiano, A.; Maffei, S.; Cadeddu, C.; Midiri, M.; et al. Gender differences in the development of cardiac complications: A multicentre study in a large cohort of thalassaemia major patients to optimize the timing of cardiac follow-up. Br. J. Haematol. 2018, 180, 879–888.
39. Kander, M.C.; Cui, Y.; Liu, Z. Gender difference in oxidative stress: A new look at the mechanisms for cardiovascular diseases. J. Cell. Mol. Med. 2017, 21, 1024–1032.
40. Solti, F.; Vecsey, T.; Kékesi, V.; Juhász-Nagy, A. The effect of atrial dilatation on the genesis of atrial arrhythmias. Cardiovasc. Res. 1989, 23, 882–886.
41. Vale-Silva, L.A.; Rohr, K. Long-term cancer survival prediction using multimodal deep learning. Sci. Rep. 2021, 11, 13505.

Figure 1. Schematization of the synthetic data generation procedure. PF: Probability of Failure; PS: Probability of Survival.

Figure 2. The Cox-net structure. The network includes two fully connected hidden layers with 256 neurons each.

Figure 3. Kaplan–Meier survival curve estimation for the synthetic dataset estimated for myocardial fibrosis (a) and sex (b) covariates.

Figure 4. Training and validation loss curves for 3-fold cross-validation in the worst case (simulated dataset with 98% of censored patients).

Figure 5. Comparison of c-index values assessed on the synthetic test set by Cox-net and standard Cox regression.

Figure 6. Survival curves estimated on synthetic test set by Cox-net and Kaplan–Meier approaches for different percentages of censored patients: (a) 0%; (b) 20%; (c) 35%; (d) 50%; (e) 65%; (f) 80%; (g) 95%; (h) 98%.

Figure 7. Permutation feature importance analysis results on the synthetic test set for different percentages of censored patients (mean ± SD): (a) 0%; (b) 20%; (c) 35%; (d) 50%; (e) 65%; (f) 80%; (g) 95%; (h) 98%.

Figure 8. Permutation feature importance analysis of the MIOT database.

Figure 9. Survival curves for Cox-net and standard Cox regression considering fibrosis and sex covariates: (a) survival curves considering fibrosis (Cox-net); (b) survival curves considering sex (Cox-net); (c) survival curves considering fibrosis (Cox regression); (d) survival curves considering sex (Cox regression).

Table 1. Clinical variables used in this study. Continuous variables are expressed as the mean ± SD, while categorical variables are expressed as percentages.

Variable	Mean ± SD or %
Age	29.3 ± 9.1 (years)
Left Ventricular End-Diastolic Volume Index (LVEDI)	86.2 ± 18.8 (mL/m²)
Left Ventricular Ejection Fraction (LVEF)	62.5 ± 5.9 (%)
Left Ventricular Mass Index (LVMI)	58.8 ± 12.8 (g/m²)
Right Ventricular End-Diastolic Volume Index (RVEDI)	82.9 ± 19.4 (mL/m²)
Right Ventricular Ejection Fraction (RVEF)	61.6 ± 6.7 (%)
Left Atrium Area Index (LAAI)	12.9 ± 2.7 (cm²)
Right Atrium Area Index (RAAI)	12.3 ± 2.4 (cm²)
Cardiac T2*	28.1 ± 12.3 (ms)
Liver Iron Concentration (LIC)	9.1 ± 9.0 (mg/g dw)
Ferritin	1594 ± 1394 (ng/mL)
Replacement Myocardial Fibrosis	21%
Male Sex	47%

Table 2. Assessed c-index values in K-fold cross-validation for Cox-net and standard Cox regression approaches.

Censored (%)	Cox-Net Fold 1	Cox-Net Fold 2	Cox-Net Fold 3	Cox-Net Mean ± SD	Cox Regression Fold 1	Cox Regression Fold 2	Cox Regression Fold 3	Cox Regression Mean ± SD
0%	0.832	0.812	0.834	0.826 ± 0.009	0.832	0.819	0.836	0.828 ± 0.007
20%	0.883	0.863	0.852	0.866 ± 0.013	0.838	0.831	0.833	0.834 ± 0.003
35%	0.832	0.828	0.798	0.819 ± 0.015	0.793	0.776	0.759	0.776 ± 0.014
50%	0.828	0.859	0.827	0.838 ± 0.015	0.756	0.764	0.760	0.760 ± 0.003
65%	0.808	0.862	0.835	0.835 ± 0.022	0.749	0.739	0.726	0.738 ± 0.009
80%	0.857	0.862	0.813	0.844 ± 0.022	0.763	0.756	0.702	0.740 ± 0.027
95%	0.817	0.850	0.739	0.802 ± 0.047	0.745	0.660	0.720	0.708 ± 0.036
98%	0.857	0.745	0.863	0.821 ± 0.054	0.697	0.771	0.758	0.742 ± 0.032

Table 3. K-fold cross-validation and test set results for MIOT data.

Model	Fold 1	Fold 2	Fold 3	Mean ± SD	Test Set
Cox-Net	0.771	0.856	0.806	0.811 ± 0.035	0.795
Cox Regression	0.736	0.806	0.829	0.790 ± 0.040	0.690
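As a quick arithmetic check, the summary column of Table 3 can be recomputed from the fold values. The sketch below assumes that the reported SD is the population standard deviation (`ddof=0`); this is inferred from the tabulated numbers rather than taken from the text, and the helper name `summarize` is hypothetical.

```python
import numpy as np

# Fold-wise c-index values copied from Table 3.
folds = {
    "Cox-Net": [0.771, 0.856, 0.806],
    "Cox Regression": [0.736, 0.806, 0.829],
}

def summarize(values, ddof=0):
    """Return (mean, SD) rounded to three decimals; ddof=0 gives the population SD."""
    v = np.asarray(values, dtype=float)
    return round(float(v.mean()), 3), round(float(v.std(ddof=ddof)), 3)

# Assumption: population SD (ddof=0), chosen because it reproduces the table.
summary = {name: summarize(v) for name, v in folds.items()}
for name, (mean, sd) in summary.items():
    print(f"{name}: {mean:.3f} ± {sd:.3f}")
```

With `ddof=0` the rounded values agree with the Mean ± SD column of Table 3; with the sample standard deviation (`ddof=1`) the Cox-Net SD would instead round to 0.043.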

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

De Santi, L.A.; Orlandini, F.; Positano, V.; Pistoia, L.; Sorrentino, F.; Messina, G.; Roberti, M.G.; Missere, M.; Schicchi, N.; Vallone, A.; et al. Explainable Survival Analysis of Censored Clinical Data Using a Neural Network Approach. BioMedInformatics 2025, 5, 17. https://doi.org/10.3390/biomedinformatics5020017

