1. Introduction
Sepsis is a life-threatening organ dysfunction caused by a dysregulated immune response to infection [1], and septic shock is its most severe form, associated with higher mortality [2]. More than half of patients with septic shock present elevated levels of circulating cardiac biomarkers such as troponin (herein referred to as myocardial injury) and some degree of impairment in echocardiographic indices of diastolic and/or systolic function (herein referred to as cardiac dysfunction), conditions commonly grouped under the term septic cardiomyopathy (SC) [3]. Patients with SC have higher mortality rates than those without it [3].
At present, there is no therapy specifically targeting SC. While positive inotropes, mainly dobutamine, are used clinically to improve cardiac function, cardiac output, and systemic oxygen delivery, excessive β-adrenergic stimulation can be associated with harm [4]. Initial enthusiasm for levosimendan, a calcium sensitizer [5], was not confirmed in a subsequent large randomised controlled trial [6], although that trial did not specifically target overt SC. One likely reason for the lack of successful interventions targeting SC is that we do not understand the root causes of cardiac involvement (myocardial injury and cardiac dysfunction) in patients with septic shock. The underlying pathophysiology is certainly complex, and studies performed in sub-optimal animal models have proposed a number of events and pathways [7] that have rarely been confirmed in human subjects.
The characterisation of patients at the molecular level is a promising approach to identify pathophysiological mechanisms and specific targets for new therapeutic interventions in critically ill patients with a particular condition [8]. For example, we have previously shown that changes in the metabolome (the lipidome in particular) and transcriptome may play a relevant role in early recovery of organ dysfunction in patients with septic shock [9,10].
In this work, we investigated the prospects of applying a machine learning (ML) pipeline to transcriptomic, proteomic and metabolomic data gathered at two time points during Intensive Care Unit (ICU) stay in patients with septic shock, together with prospectively collected measurements of high-sensitivity cardiac troponin and echocardiography. Our primary aim was to identify and characterise the multiOMICS profile of myocardial injury in patients with septic shock. Our secondary aim was to identify whether a distinct profile exists when cardiac dysfunction is associated with myocardial injury.
2. Materials and Methods
2.1. Study Design and Participants
This manuscript follows the STROBE guidelines for reporting observational studies (Table S1) [11].
This study is part of the multicentre prospective observational trial “ShockOmics” (ClinicalTrials.gov Identifier NCT02141607) [12]. Patients were recruited in the Intensive Care Units (ICU) of Hôpitaux Universitaires de Genève (Geneva, Switzerland) and Hôpital Erasme – Cliniques Universitaires de Bruxelles (Brussels, Belgium). The study was approved by the Geneva Regional Research Ethics Committee (study number 14-041) and the Ethical Committee of Hôpital Erasme-Université Libre De Bruxelles (study number P2014/171). Informed consent was obtained from the patients or their representatives.
As detailed elsewhere [12], we included consecutive adult (>18 years old) patients admitted for septic shock to the ICUs of two university hospitals, with an admission SOFA score ≥6 and an arterial lactate ≥2 mmol/L. Although septic shock was defined according to the recommendations and international guidelines at the time of inclusion [13], all patients fulfilled the Sepsis-3 criteria [1]. Peripheral blood samples for OMICS analysis and measurements of high-sensitivity cardiac troponin T (hscTnT) were collected within 16 h of ICU admission (T1) and 48 h after admission (T2). A certified intensivist performed echocardiography at the same time points. Left ventricular ejection fraction (LVEF) was measured in the apical view, according to the biplane modified Simpson’s method.
We excluded patients expected to die within 24 h after ICU admission and those with terminal illness; those who had received more than four units of red blood cells or >1 unit of fresh frozen plasma; and those with active haematological malignancy, metastatic cancer, chronic immunodepression, pre-existing end-stage renal disease requiring renal replacement therapy, recent cardiac surgery, or Child-Pugh C cirrhosis. The main reason for these exclusion criteria was to avoid confounding factors that would make it difficult to distinguish, at the transcriptomic, proteomic, and metabolomic levels, what is due to septic shock from what is due to comorbidities. We also excluded patients who did not have data in all three OMICS domains. Furthermore, we prospectively planned at least two time points in our study using expensive technologies, hence the two exclusion criteria of “terminal illness” and high risk of death within 24 h of ICU admission. Demographic and clinical characteristics of patients with and without myocardial injury (Table 1 and Table 2) were compared using Fisher’s exact test or the Mann–Whitney U test, as appropriate.
2.2. OMICS Data
Blood samples were collected in EDTA tubes and treated as follows:
For transcriptomics: after adding 400 µL of 2X Denaturing solution (Ambion, Austin, Texas, USA) to an equal volume of blood, samples were stored at −20 °C until analysis;
For proteomics, metabolomics and hscTnT quantification: after adding 900 µL of a protease inhibitor solution (Roche Applied Science, Penzberg, Germany) to 6 mL of blood, 0.5 mL plasma aliquots were obtained by two-step centrifugation and stored at −80 °C until analysis.
All subsequent analytic steps were performed in batches.
2.3. Transcriptomics Analysis
As detailed elsewhere [14], total RNA was extracted from blood samples with a MirVana Paris Kit (Applied Biosystems, Waltham, Massachusetts, United States) and treated with a Turbo DNA-free Kit (Ambion, Austin, Texas, USA). RNA quality was assessed on an Agilent Bioanalyzer with the RNA 6000 Nano Kit (Agilent, Santa Clara, CA, USA), and samples were considered suitable for processing if the RNA Integrity Number was greater than 7.5. Sequencing libraries were prepared with the TruSeq Stranded Total RNA with Ribo-Zero Globin Kit (Illumina, San Diego, CA, USA) using 800 ng of total RNA. Final libraries were validated with the Agilent DNA1000 kit (Agilent, Santa Clara, California, United States) and sequenced on a HiSeq2500 platform producing 50 × 2 bp paired-end reads. High-quality paired-end reads were aligned to the human reference genome (GRCh38) using STAR (v2.5.2b) (Github, San Francisco, CA, USA) [15], emitting only uniquely mapping reads. Reads were assigned to genes with featureCounts (v1.5.1) [16] using the GENCODE (v.25) primary assembly gene transfer format (GTF) file as the reference annotation for genomic feature boundaries.
Built-in functions of the DESeq2 package (Bioconductor) [17] were used to perform data pre-processing and to export normalised counts.
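To make the idea of normalised counts concrete, the following is a minimal Python sketch of the median-of-ratios size-factor estimation that DESeq2 applies by default. It is an illustration only, not the code used in the study (which relied on DESeq2 in R); the function names and the toy count matrix are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts[g][s] is the raw read count of gene g in sample s.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        # Genes with a zero count in any sample have a zero geometric
        # mean and are left out of the size-factor estimate.
        if any(c <= 0 for c in gene):
            continue
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for s, c in enumerate(gene):
            log_ratios[s].append(math.log(c) - log_geo_mean)
    # One factor per sample: the median ratio to the per-gene geometric mean.
    return [math.exp(median(r)) for r in log_ratios]


def normalise(counts):
    """Divide every count by the size factor of its sample."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(gene, factors)] for gene in counts]


# Toy matrix (hypothetical): sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [5, 10]]
toy_factors = size_factors(toy_counts)
toy_normalised = normalise(toy_counts)
```

In the toy matrix the second sample has exactly twice the depth of the first, so after normalisation each gene has the same value in both samples.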
2.4. Proteomics Analysis
The proteomics analysis was performed using the Tandem Mass Tag (TMT-10plex, Thermo Scientific) technique. First, immunoaffinity depletion of highly abundant proteins from plasma samples was performed using an IgY14 Seppro® column (Sigma, St. Louis, MO, USA). Eluted proteins were reduced, alkylated, and double trypsin digested. Seven TMT-10plex experiments were performed. After peptide labelling, samples were subjected to high pH fractionation with a high pH reversed phase peptide fractionation kit (Pierce, ref. 84868; Thermo Fisher, Waltham, MA, USA). A total of eight fractions from each TMT-10plex batch were analysed using an Orbitrap Fusion Lumos™ Tribrid mass spectrometer (Thermo Scientific, Waltham, MA, USA). The mass spectrometer was operated in data-dependent acquisition (DDA) mode. MS2-MS3 analysis was conducted with a top speed approach. Thermo Proteome Discoverer (v.2.1) (Thermo Scientific, Waltham, MA, USA) was used to search with the Sequest HT search engine against the Swiss-Prot human public database. For each TMT batch, the eight raw files corresponding to the eight fraction injections from the MS analyses were used to perform a single search against this database. Proteins were quantified by summing, within each TMT™ 10plex experiment, the reporter ion intensities of unique peptides. Libra channel normalization was performed for each TMT™ 10plex experiment.
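The protein quantification step described above (summing reporter ion intensities of unique peptides within one TMT experiment) can be sketched as follows. This is a simplified Python illustration with hypothetical names and toy values; it does not reproduce the search-engine output format, and the channel normalisation step is not shown.

```python
def protein_abundance(peptide_rows, n_channels=10):
    """Sum reporter ion intensities of unique peptides per protein.

    peptide_rows: iterable of (protein_id, is_unique, intensities), where
    intensities holds one reporter ion intensity per TMT channel.
    Returns {protein_id: [summed intensity per channel]}.
    """
    totals = {}
    for protein_id, is_unique, intensities in peptide_rows:
        if not is_unique:
            # Peptides shared between proteins do not contribute.
            continue
        channel_sums = totals.setdefault(protein_id, [0.0] * n_channels)
        for channel, value in enumerate(intensities):
            channel_sums[channel] += value
    return totals


# Toy example with two channels and hypothetical protein identifiers.
toy_rows = [
    ("P1", True, [1.0, 2.0]),
    ("P1", True, [3.0, 4.0]),
    ("P1", False, [100.0, 100.0]),  # shared peptide, ignored
    ("P2", True, [5.0, 6.0]),
]
toy_abundance = protein_abundance(toy_rows, n_channels=2)
```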
2.5. Metabolomics Analysis
We performed a targeted quantitative approach using a combined direct flow injection and liquid chromatography (LC) tandem mass spectrometry (MS/MS) assay (AbsoluteIDQ 180 kit, Biocrates, Innsbruck, Austria), as detailed elsewhere [9]. This method combines derivatisation and extraction of analytes with selective mass-spectrometric detection using multiple reaction monitoring (MRM) pairs. Isotope-labelled internal standards are integrated into the platform for absolute quantification of metabolites. MRM detection was used for quantification, applying the spectra parsing algorithm integrated into the MetIDQ software (Biocrates Life Science AG, Innsbruck, Austria). Concentrations were calculated and evaluated by comparing measured analytes in a defined extracted ion count section to those of specific labelled internal standards or non-labelled ones, provided by the kit. This strategy allows simultaneous quantification of up to 186 metabolites. Metabolites were retained for further analysis only if: (1) fewer than 20% of values were missing (non-detectable peak) for the quantified metabolite, and (2) at least 50% of all sample concentrations for the metabolite were above the limit of detection (LOD). In total, 130 of the 186 metabolites were used for statistical analysis.
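The two metabolite quality criteria can be expressed as a small filter. The Python sketch below reads both criteria as conditions for keeping a metabolite, which is an interpretation of the text; treating exactly 50% above the LOD as acceptable is also an assumption, and the function name and toy values are hypothetical.

```python
def keep_metabolite(concentrations, lod, max_missing=0.20, min_above_lod=0.50):
    """Decide whether a metabolite passes both quality criteria.

    concentrations: one value per sample, None for a non-detectable peak.
    lod: limit of detection for this metabolite (same unit as the values).
    """
    n = len(concentrations)
    n_missing = sum(1 for c in concentrations if c is None)
    n_above = sum(1 for c in concentrations if c is not None and c > lod)
    # Criterion 1: fewer than 20% missing values.
    # Criterion 2: at least 50% of all samples above the LOD (assumed inclusive).
    return (n_missing / n) < max_missing and (n_above / n) >= min_above_lod


# Toy data: 10 samples, LOD of 0.5 in arbitrary units.
passes = keep_metabolite([1.0, 2.0, 3.0, 0.1, 0.2, 0.3, 0.4, 4.0, 5.0, None], lod=0.5)
too_many_missing = keep_metabolite([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, None, None], lod=0.5)
mostly_below_lod = keep_metabolite([0.1] * 6 + [1.0] * 4, lod=0.5)
```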
2.6. Definition of Myocardial Injury and Cardiac Dysfunction
Myocardial injury was defined as a circulating hscTnT level >14 ng/L, the 99th percentile upper reference limit of the assay, according to the Fourth Universal Definition of Myocardial Infarction [18]. Cardiac dysfunction was prospectively defined as an LVEF < 50% or treatment with positive inotropic drugs to improve cardiac output and tissue perfusion, as judged necessary by the treating physician. Echocardiography image acquisition was performed by skilled intensivists with a National Diploma of echocardiography. Analysis of the echocardiography images was performed by two assessors with a National Diploma of echocardiography and extensive teaching experience in echocardiography (KB and AH). Both assessors were blinded to the cardiac troponin measurements and OMICS results.
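The two phenotype definitions translate directly into labelling rules. The following Python sketch is illustrative only; the function and constant names are hypothetical, and the handling of a missing LVEF measurement is an assumption not stated in the text.

```python
HSCTNT_UPPER_REFERENCE_NG_L = 14.0  # 99th percentile upper reference limit
LVEF_THRESHOLD_PCT = 50.0


def has_myocardial_injury(hsctnt_ng_l):
    """Myocardial injury: hscTnT strictly above 14 ng/L."""
    return hsctnt_ng_l > HSCTNT_UPPER_REFERENCE_NG_L


def has_cardiac_dysfunction(lvef_pct, on_inotropes):
    """Cardiac dysfunction: LVEF below 50% or treatment with inotropes.

    lvef_pct may be None when no measurement is available (assumption);
    the label then depends on inotrope use alone.
    """
    low_lvef = lvef_pct is not None and lvef_pct < LVEF_THRESHOLD_PCT
    return bool(on_inotropes) or low_lvef


# Toy patients (hypothetical values).
injury_only = (has_myocardial_injury(30.0), has_cardiac_dysfunction(60.0, False))
inotrope_case = (has_myocardial_injury(10.0), has_cardiac_dysfunction(55.0, True))
```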
2.7. Multiscale Modelling of OMICS Data
Metabolomics and transcriptomics data have been previously published. Proteomics data has not. No analyses regarding the phenotypes of myocardial injury and cardiac dysfunction or the construction of multiOMICS models have been previously published for this cohort.
The ML pipeline presented in this paper is divided into two main experimental phases: feature selection (FS) and classification. For the FS experiments, the data were divided into six groups, each corresponding to its own dataset: transcriptomics at T1, transcriptomics at T2, proteomics at T1, proteomics at T2, metabolomics at T1, and metabolomics at T2. The FS phase started with tests to compare distributions: a one-way ANOVA or a Kruskal–Wallis test, as appropriate, was used to select the variables that yielded a significant p-value and a reduced q-value [19]. The normality of all OMICS data was assessed through the Shapiro–Wilk test and the homogeneity of variance through Bartlett’s test. After comparing the distributions, the FS phase was completed with recursive feature selection implemented with random forests. In this analysis, we also calculated a stability score for each biomarker, presented as a frequency (i.e., the number of times a particular biomarker was selected in the reported number of experiments). Because of the high dimensionality of the data, we propose first reducing the dimensionality and then studying the relations among the biomarkers. With lower-dimensional data, however, the dimensionality-reduction step could be omitted.
The set of markers obtained in our FS phase was further analysed in an enrichment and pathway analysis. In particular, transcripts were analysed with Enrichr [20], proteins with Impala [21], and metabolites with MetaboAnalyst [22].
Since this is a knowledge discovery study, our main assumption is that the biomarker sets yielding the best performance in assessing myocardial injury and cardiac dysfunction (based on troponin and ejection fraction or an inotrope requirement) will also shed light on the pathophysiological processes involved in SC. For this reason, the set of biomarkers resulting from the FS phase was used to build a set of classifiers to predict cardiac injury and dysfunction. We used logistic regression, classification trees (CART), and a support vector classifier (SVC) [23]. All of these methods are well suited to knowledge discovery from small data sets because of their interpretability and their short execution time, which allows efficient training through leave-one-out cross-validation (LOOCV). The performance of each classifier was evaluated through its accuracy, sensitivity, and specificity. The accuracy is supported by the p-value obtained in McNemar’s test, and the sensitivity and specificity are reported together with their binomial confidence intervals.
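The evaluation scheme can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study code (which used R): a toy midpoint rule on a single feature stands in for the logistic regression, CART, and SVC models, and the Wilson score interval is used as one possible binomial confidence interval, since the text does not specify which interval was computed. All names and data are hypothetical.

```python
import math
from statistics import mean


def loocv_predictions(X, y, fit_predict):
    """Leave-one-out CV: train on all samples but one, predict the one left out."""
    predictions = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        predictions.append(fit_predict(train_X, train_y, X[i]))
    return predictions


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (approx. 95% for z=1.96)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def summarise(y_true, y_pred):
    """Accuracy, sensitivity and specificity; assumes both classes are present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
    }


def midpoint_rule(train_X, train_y, x):
    """Toy classifier (placeholder): threshold halfway between the class means."""
    mean0 = mean(v[0] for v, label in zip(train_X, train_y) if label == 0)
    mean1 = mean(v[0] for v, label in zip(train_X, train_y) if label == 1)
    return 1 if x[0] > (mean0 + mean1) / 2 else 0


# Toy data: one feature, two well-separated classes.
toy_X = [[1.0], [2.0], [8.0], [9.0]]
toy_y = [0, 0, 1, 1]
toy_pred = loocv_predictions(toy_X, toy_y, midpoint_rule)
toy_summary = summarise(toy_y, toy_pred)
```

Even with perfect toy predictions, the confidence intervals remain wide because each class contains only two samples, which illustrates why interval estimates matter in small cohorts.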
All of the analyses were performed using R and the main packages employed were: stats, caret, randomForest, rpart, and e1071.
4. Discussion
The present prospective clinical investigation demonstrates that the application of an ML pipeline to circulating transcriptomics, proteomics, and metabolomics data in patients with septic shock has considerable potential to predict both the risk of myocardial injury (assessed with circulating levels of troponin) and the risk of cardiac dysfunction (assessed with echocardiography-derived left ventricular ejection fraction or an inotrope requirement). To our knowledge, the present results are the first to link serial cardiac and haemodynamic measurements to OMICS data using an ML pipeline. The advantage of such a pipeline is that it can untangle relevant relations between different markers and relate them to a particular outcome (i.e., cardiac injury and cardiac dysfunction) through a data-driven approach. Thus, the application of ML techniques can improve on the standard methods used in classical clinical practice to assess these relations. Even though ML approaches usually require large amounts of data from big cohorts of patients to draw conclusions, this paper shows that it is also possible to obtain sensible results with scarce data.
These results are of great interest as they throw light on the hypothesis that the root causes of cardiomyocyte injury and cardiac dysfunction in patients with septic shock may be approached using a ML pipeline tailored for OMICS analysis.
Our findings reinforce the view that changes in the complement and coagulation systems are associated with myocardial injury in patients with septic shock. These observations are consistent with data showing that the coagulation system and microthrombosis are the main causes of ischemic cardiac involvement and myocardial injury [24]. Microthrombosis has also been shown to be a main source of respiratory dysfunction and architectural lung injury in other inflammatory diseases such as ARDS [25].
Ischemia-reperfusion injury has been shown to play a role in sepsis-associated organ dysfunction [26]. In this setting, platelets are critical mediators of thrombo-inflammation and have been shown to contribute to an exaggerated ischemia-reperfusion injury response [27,28,29]. However, the mechanisms underlying vulnerability to ischemia-reperfusion injury in septic shock patients are not well defined, nor is the role of platelets in the process of SC [30].
Changes in the coagulation system have also been associated with myocardial dysfunction in patients with ischemic myocardial injury. Indeed, there is now ample evidence supporting the concept that cardiac injury causes local inflammation and increased activation of pro-coagulant processes in patients with STEMI [31]. Moreover, biomarkers of coagulation and inflammation have been shown to help distinguish patients suffering from coronary disease and ischemic heart failure. In this regard, it is worth asking whether there is a plausible mechanistic basis by which myocardial capillary endothelial dysfunction could worsen right and left ventricular function in patients with septic shock [32]. The idea behind this question is to treat myocardial injury in septic shock patients by targeting pathways that link inflammation and thrombosis. For instance, several studies demonstrated a reduction in ischemic and clinical events with early high-dose statins [33]. The present hypothesis is in line with some animal studies demonstrating that changes in systemic haemodynamics, coronary perfusion pressure, myocardial function, and increased tumour protein 53 expression with apoptosis related to bacterial exotoxin cause cardiac dysfunction. Indeed, these in vivo changes were significantly inhibited by pretreatment with simvastatin, which provides novel evidence for the pleiotropic mechanisms by which septicaemia causes myocardial depression and hints at a potential role for simvastatin as an inhibitor of apoptosis in sepsis [34]. In our opinion, this finding highlights that subendocardial and myocardial ischemia are key forms of damage induced by inflammation and sepsis, causing diastolic and systolic cardiac dysfunction [35,36]. The fact that recent data suggest that diastolic dysfunction is more frequent than systolic dysfunction in SC, and more strongly associated with prognosis, corroborates our finding, as diastolic dysfunction is the first functional alteration during myocardial ischemia.
The underlying cause of SC could also be a disorder in communication between the intracellular contractile apparatus and the extracellular matrix, resulting in attenuation of myocardial contraction. In this regard, the fact that selected biomarkers could predict myocardial injury with good accuracy may help to identify the main causes of the transitory interruption of contraction observed in the heart during sepsis.
We also observed an association of alpha-1-antichymotrypsin and serum paraoxonase/arylesterase with myocardial injury. A recent proteomic study showed that the circulating alpha-1-antichymotrypsin level was higher in patients with myocardial injury than in patients with stable angina or healthy controls [37]. Alpha-1-antichymotrypsin can inhibit the activity of neutrophil cathepsin G and mast cell chymase, and through this mechanism may act as a mediator of inflammatory processes [38]. Paraoxonase is an antioxidant bioscavenger responsible for hydrolysing lipid peroxides, and decreased serum paraoxonase/arylesterase activity was related to poor prognosis (30-day mortality) in patients with sepsis [39]. These circulating markers suggest a link between inflammation, oxidative stress, and myocardial injury in septic shock.
Despite data scarcity, this study casts light on the relations between different biomarkers that play a role in patients with septic shock and strengthens some of the hypotheses posed by the aforementioned works. This is an exploratory methodology that may be further exploited in larger cohorts to elucidate the association between OMICS data and the biomarkers of interest.
In summary, this study showed that ML methods, applied to circulating OMICS data, can give an accurate estimation of myocardial injury and cardiac dysfunction in septic shock patients. This approach was also useful to investigate septic cardiomyopathy at the molecular level and to identify a role of complement, coagulation, and inflammation pathways in the pathophysiology of myocardial injury. Our results, obtained in a small cohort of septic shock patients, show that the analysis of circulating OMICS data with an ML pipeline is a valuable tool for research in critically ill patients.
Limitations
Circulating troponin can be influenced by age and comorbidities (such as heart failure and renal function) in addition to the burden of acute illness caused by septic shock. Hence, using a fixed troponin threshold at ICU admission to classify patients as having myocardial injury may lead to an overestimation of cases. However, in the current analysis, we focused on identifying biomarkers acquired in the ICU at the same time as the troponin measurements. In addition, there is no validated method to choose a different cut-off in patients with impaired renal clearance. Furthermore, elevated troponin has been shown to be associated with increased mortality independently of renal failure and elevated creatinine [40].
It is possible that the use of inotropes affects OMICS results. However, our study was not designed to answer this question, and our small cohort does not allow exploring the impact of inotropes independently of cardiac function. The primary aim of this study was to provide new insight into the mechanisms of the phenotypes of myocardial injury and cardiac dysfunction. Hence, we used a large range of biological intermediates covering multiple levels of information (transcripts, proteins, and metabolites), which is a novelty in the field. These analyses require substantial resources and are not easily available in a clinical setting. However, exploratory studies such as our own are often the basis for other mechanistic studies aimed at identifying biomarkers for prediction of the phenotypes or for risk stratification.
The main limitation of this paper is related to the cohort size. On the one hand, the high rate of non-eligibility and exclusion, due to the constraints of the OMICS techniques and the exclusion of cardiogenic shock patients, reduces the significance with respect to the original cohort. On the other hand, only a small amount of data was available to implement an ML pipeline to ascertain the role of circulating OMICS in assessing cardiac dysfunction and injury during septic shock. This lack of data also limited the exploitation of the full potential of ML-based approaches, so that we had to apply simple yet powerful methods that perform well under these circumstances. In our case, we used logistic regression as a baseline, CART, and SVC. Nevertheless, in the light of the results obtained, it is worth exploring and improving the pipeline presented here in future research with larger patient cohorts.