Article

The Relationship of Behavioral, Social and Diabetes Factors with LVEF Measured Using Machine Learning Techniques

by Cezara-Andreea Soysaler 1,*, Cătălina Liliana Andrei 1, Octavian Ceban 2 and Crina-Julieta Sinescu 1
1 Department of Cardiology, University of Medicine and Pharmacy “Carol Davila”, Emergency Hospital “Bagdasar-Arseni”, 050474 Bucharest, Romania
2 Economic Cybernetics and Informatics Department, The Bucharest University of Economic Studies, 010374 Bucharest, Romania
* Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(19), 9474; https://doi.org/10.3390/app12199474
Submission received: 24 August 2022 / Revised: 16 September 2022 / Accepted: 17 September 2022 / Published: 21 September 2022

Abstract

Purpose: Using a data-driven machine learning approach, ranging from classical to complex models, we aim to approximate the relationship between behavioral, social and comorbidity factors and the ejection fraction of hospitalized patients. To measure how much the independent variables influence the left ventricular ejection fraction (LVEF), classification models are built and the influences of the independent variables are interpreted. The data obtained are intended to improve the management of patients with heart failure (treatment, monitoring in primary care) in order to reduce morbidity and mortality. Patients and Methods: In this study, we enrolled 201 patients hospitalized with decompensated chronic heart failure. The models used are extreme gradient boosting (XGB) and logistic regression (LR). For a deeper analysis of the independent variables, their influences are analyzed in two ways. The first is a modern technique, Shapley values from game theory, adapted to the machine learning context for XGB; the second, the classical approach, is the analysis of logistic regression coefficients. Results: The importance of several behavioral, social and diabetes-related factors is measured. Smoking, low education and obesity are the most harmful factors, while diabetes controlled by diet or medication does not significantly affect LVEF; indeed, there is a slight tendency toward higher LVEF. Conclusions: Using machine learning techniques, we can better understand to what extent certain factors affect LVEF in this sample. Following further studies on larger groups and from different regions, prevention could be better understood and applied.

1. Introduction

Cardiovascular disease remains the main cause of mortality, responsible for about 31% of deaths worldwide, according to the World Health Organization [1]. Cardiac insufficiency (CI), or heart failure, is a progressive disease with alternating periods of stability and decompensation, characterized by the inability of the heart to ensure an oxygen supply sufficient for peripheral aerobic metabolism [2,3,4].
Globally, heart failure is a significant public health problem, with a prevalence of over 37.7 million cases worldwide, being one of the most important causes of hospitalization among adults over 65, with significant medical costs, according to the study published in 2018 by Lesyuk et al. [5,6].
Age-related physiological and morphological changes, together with demographic and social factors and comorbidities, increase the predisposition of the population to develop chronic heart failure [7]. The substrate of heart failure is plurietiological and varies critically with sex, ethnicity, age, comorbidities and environment [8,9].
In Romania, a prevalence of 4.76% was found, with 4.91% in males compared to 4.63% in females and a similar increase with age in both, reaching over 15% in population groups over 70 years [10,11]. Once diagnosed with heart failure, the patient’s expected survival rate is 50% at five years and 10% at ten years [12,13].
Thirty-five percent of patients with heart failure have a high degree of severity of the disease and extremely low quality of life, influenced by psychological effects, adverse treatment and social limitations [14,15].
Starting from the recognition of heart failure as the mechanical dysfunction of the heart, the measurement of the left ventricle function in the form of the ejection fraction has been, is and will remain considered the best parameter for diagnosing and managing the disease [16].
Noncardiac comorbidities are of great importance in the evolution of heart failure, and their management is a key element in the holistic care of patients with heart failure [17].
Early diagnosis of diabetes, with its proper treatment, periodic evaluation of the risk of cardiovascular disease and aggressive treatment of cardiovascular risk factors are the main methods of reducing morbidity and mortality [18]. Among patients with diabetes, atherosclerotic cardiovascular disease (CV) is the main cause of morbidity and mortality [19]. The risk of CV for diabetics is two to four times higher than for non-diabetics; diabetes represents a major, independent risk factor for CV and is often associated with other CV risk factors (high blood pressure, dyslipidemia, obesity) [20].
The optimal management of the patient with heart failure requires not only the treatment of heart disease but also the prevention and intervention of all comorbidities in order to improve the prognosis [21]. Hospitalized patients with a first diagnosis of acute heart failure have an increased rate of mortality in the hospital and after discharge, as well as respiration [22]. Understanding the impact due to social, behavioral and comorbidity factors on the population in general, but even more on patients with symptoms of heart failure, is a key component to improving prognosis [23,24].
The purpose and objectives of the research are prevention, rapid diagnosis and intervention on modifiable risk factors and certain comorbidities, in order to improve the prognosis of patients who have heart failure. A further aim is to identify the main drivers of heart failure, factors that can later be used to estimate the evolution of the disease and survival [25]. The data obtained are intended to improve the management of patients with heart failure (treatment, monitoring in primary care) in order to reduce morbidity and mortality. To this end, we chose to conduct an observational, prospective, comparative, prognostic study, enrolling 201 patients over 18 years old. The study was conducted for one year (October 2019–October 2020) within the Cardiology Clinic of the Emergency Hospital “Bagdasar-Arseni”.
According to estimates, 15 million people have been affected by heart failure (HF) in Europe and it is also expected to increase due to coronavirus and aging of the population [26,27]. Heart failure represents a common, expensive, and potentially fatal disease and has the highest mortality index [28,29]. It is ranked first by the substantial morbidity and mortality and the significant annual healthcare and economic burden [30]. Individual differences exist and are influenced by several factors, e.g., gender, educational level, income, comorbidity and social support [31,32,33].
The added value of this work lies in the techniques used to investigate the relationship between different factors and LVEF, together with the validation of the results. To approximate this relationship, logistic regression and XGB were used. For XGB, Shapley values were calculated to measure the intensity of each factor [34,35,36], and these Shapley values were then compared with the coefficients from logistic regression. The models were validated with leave-one-out cross-validation. Combining several ways of interpreting statistical models, from simple to complex, could reveal more about the latent relationship between factors, the intensity of each effect, and patterns specific to some variables. For example, a continuous variable may have an effect only within part of its range, whereas a classical method such as estimated coefficients reports only the overall effect of changes in that variable rather than the effect within specific intervals.
One of the fields where new data science technologies have brought benefits is, of course, medicine [37]. Most machine learning approaches aim to have a better understanding of the factors that determine a certain effect, such as readmission to hospital or of risks such as sepsis. When it comes to decision-making, machine learning models mostly only have an advisory role, and a method by which these black box models would be better understood would be using interpretability solutions [38]. Advances in technology, especially those in the fast and efficient collection and analysis of data, have led to remarkable results in fields other than medicine. In the 4th industrial revolution, the driver is the interconnection of objects and the ability to create decision-making systems based on data. In Khorasani’s paper [39], details are presented on how the internet of things, in combination with robotics and artificial intelligence, helps different industries.

2. Material and Methods

The study took place in the Cardiology Department of the Emergency Hospital “Bagdasar-Arseni”, in Bucharest, Romania. The study type was retrospective, and data was collected from 201 patients between October 2019 and October 2020. The inclusion criteria were as follows: patients over 18 years, with chronic heart failure with preserved ejection fraction and noncardiac comorbidities (neoplasia, chronic kidney disease, acute coronary syndrome, acute stroke, autoimmune disease, diabetes), and patients with chronic heart failure with reduced ejection fraction and noncardiac comorbidities. The exclusion criteria were as follows: neuropsychiatric diseases, life expectancy under 1 year, and refusal to sign the informed consent when enrolling in the study.

2.1. Data

The variables used in the research are the following. There are no missing values.

2.1.1. Independent Variables

Behavioural
  • ‘smoker’: 1 if a smoker, else 0.
  • ‘alcoholic’: 1 if alcoholic, else 0.
  • ‘obese’: 1 if obesity (BMI over 30), else 0.
Social
  • ‘women’: 1 if a woman, 0 if a man.
  • ‘from_rural’: 1 if from a rural area, 0 if from an urban area.
  • ‘low_income’: 1 if the monthly income is below 400 USD, else 0.
  • ‘low_education’: 1 if the patient has only primary education, 0 if secondary or tertiary education.
Diabetes
  • ‘no_diabet’: 1 if not a diabetic, else 0.
  • ‘diabet_antidiabetic_pills’: 1 if the patient has diabetes and takes antidiabetic pills, else 0.
  • ‘diabet_diet’: 1 if the patient has diabetes and follows a diet (without insulin or antidiabetic pills), else 0.
  • ‘diabet_insulin’: 1 if the patient has diabetes and takes insulin, else 0.

2.1.2. Dependent Variable

LVEF_over_50: Left Ventricular Ejection Fraction (LVEF) Is Greater than or Equal to 50%

Left ventricular ejection fraction (LVEF) is the central measure of left ventricular systolic function. LVEF is the fraction of chamber volume ejected in systole (stroke volume) in relation to the volume of blood in the ventricle at the end of diastole (end-diastolic volume). Stroke volume (SV) is calculated as the difference between end-diastolic volume (EDV) and end-systolic volume (ESV). The simplest classification used clinically, as per the American College of Cardiology (ACC), is as follows:
  • Hyperdynamic = LVEF greater than 70%.
  • Normal = LVEF 50% to 70% (midpoint 60%).
  • Mild dysfunction = LVEF 40% to 49% (midpoint 45%).
  • Moderate dysfunction = LVEF 30% to 39% (midpoint 35%).
  • Severe dysfunction = LVEF less than 30%.
The accurate measurement of LVEF is very important for managing patients with cardiovascular disease. LVEF also has a prognostic value in predicting adverse outcomes in patients with congestive heart failure, after myocardial infarction, and after revascularization. LVEF plays an important role in assessing the severity of a decrease in the systolic function of the heart and thus is helpful in directing the management of various cardiovascular diseases.
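The definitions above can be sketched in a few lines of Python. This is an illustration only, not the study's code: the function names are hypothetical, and assigning fractional values that fall between the listed bands (for example 49.5%) to the lower band is an assumption.

```python
def lvef_percent(edv_ml: float, esv_ml: float) -> float:
    """LVEF (%) = SV / EDV * 100, where SV = EDV - ESV."""
    if edv_ml <= 0 or not 0 <= esv_ml <= edv_ml:
        raise ValueError("expected EDV > 0 and 0 <= ESV <= EDV")
    stroke_volume = edv_ml - esv_ml
    return stroke_volume / edv_ml * 100.0


def lvef_category(lvef: float) -> str:
    """Map an LVEF percentage to the ACC categories listed above.

    Assumption: values between the listed integer bands go to the lower band.
    """
    if lvef > 70:
        return "hyperdynamic"
    if lvef >= 50:
        return "normal"
    if lvef >= 40:
        return "mild dysfunction"
    if lvef >= 30:
        return "moderate dysfunction"
    return "severe dysfunction"


def lvef_over_50(lvef: float) -> int:
    """Binary target used in the study: 1 if LVEF >= 50%, else 0."""
    return int(lvef >= 50)


# Illustrative volumes in mL (not patient data): SV = 120 - 50 = 70 mL.
example = lvef_percent(edv_ml=120, esv_ml=50)
print(round(example, 1), lvef_category(example), lvef_over_50(example))
```

With these illustrative volumes the ejection fraction is 70/120, about 58%, which falls in the normal band and gives a target value of 1.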

2.1.3. Summary Statistics

The proportions of 1 values for each variable (for the target, cases of LVEF over 50%) are presented in Table 1.
The target variable, LVEF_over_50, is balanced: 49% of patients have LVEF over 50%. The sample is 53% women, most patients have low income, and 50% do not have diabetes. Smokers make up 40% of the sample, alcoholics a quarter, and most patients (79%) have some form of obesity.
Figure 1 shows the correlation matrix between the variables. In general, there are no high correlations, which is good for the classification models used and for the analysis of the factors related to LVEF. However, there are some low or moderate correlations: for example, those who smoke tend to consume alcohol and also show a slight tendency toward low education, while women show a slight tendency not to smoke or drink. Looking at the correlations of the independent variables with the target variable LVEF_over_50, the moderate correlations are with smoker and low_education. Those who smoke and have low education seem to be the most vulnerable, with a risk of low LVEF. Table 2 shows the proportions of patients in relation to LVEF_over_50.
Because the Pearson correlation coefficient is calculated on a sample, the true value of the parameter may be 0 even when the sample value is not. For this sample of 201 patients, the critical value at p < 0.05 is |r| > 0.14. Thus, correlations with an absolute value higher than 0.14 can be considered significantly different from zero.
Table 2 shows the number of patients in each group and the percentage of them who have LVEF over 50. The larger the difference in percentages, the more likely that factor is to have a greater impact. The larger the number of patients, the less likely the difference is to be due to chance. For example, of the 81 smokers, 79% had LVEF below 50%. Similarly, 75% of alcoholics had LVEF below 50%, but this group contains fewer patients.
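The 0.14 threshold can be checked with a short calculation. The sketch below uses the Fisher z-transform, a standard large-sample approximation; the text does not say which test the authors used, so this method and the function name are assumptions.

```python
import math
from statistics import NormalDist


def critical_r(n: int, alpha: float = 0.05) -> float:
    """Approximate two-sided critical |r| via the Fisher z-transform.

    Under the null hypothesis (true correlation 0), atanh(r) is roughly
    normal with standard deviation 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("need n > 3")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.tanh(z / math.sqrt(n - 3))


# n = 201 is the sample size reported in the study.
print(round(critical_r(201), 2))  # prints 0.14
```

For smaller samples an exact t-based critical value would be slightly larger than this approximation.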
There are some differences in the impact of certain factors on LVEF, but a more detailed investigation is needed with more advanced statistical tools, which will be presented below.

3. Methodology

Classification models will be estimated to see to what extent the variables explain LVEF over 50. Separate models will be created using the three groups of variables: behavioral, social and diabetes. A classification model receives observations consisting of independent variables and a binary dependent variable. Given the independent variables, the model approximates the function that best explains the dependent variable. The result is a model that can calculate the expected value of the dependent variable, or a probability, for a new observation with known independent variables. For example, given patient data on behavior (whether they are smokers, obese or alcoholic) and the target variable LVEF over 50, we can approximate the function that gives the probability that a patient has LVEF over 50 from the behavior-related variables.
For each group, two models will be trained, one XGB and one LR, resulting in 3 XGB and 3 LR models. To interpret the variables in XGB, Shapley values [34,35,36], adapted to the machine learning context, will be analyzed; for the variables in LR, the coefficients will be interpreted.
The flow is shown in Figure 2. The interpretation of the factors for the XGB model only makes sense if the model explains LVEF to some extent. The model will therefore be tested with leave-one-out cross-validation: 201 models will be created, each trained on 200 observations and tested on the single remaining out-of-sample observation. This yields 201 predictions, one per patient, each made by a model that did not see that patient during training. Precision, recall, accuracy, and F1 score will be measured. Data processing and model training are done in Python.
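The leave-one-out procedure and the four metrics can be sketched as follows. This is not the authors' code: `fit_lookup` is a hypothetical stand-in classifier used in place of XGB or LR so that the example runs with the standard library alone, and the rows are synthetic.

```python
from collections import Counter, defaultdict


def fit_lookup(X, y):
    """Hypothetical stand-in model: majority label per feature pattern."""
    overall = Counter(y).most_common(1)[0][0]
    votes = defaultdict(Counter)
    for row, label in zip(X, y):
        votes[tuple(row)][label] += 1
    table = {pattern: c.most_common(1)[0][0] for pattern, c in votes.items()}
    # Patterns never seen in training fall back to the overall majority label.
    return lambda row: table.get(tuple(row), overall)


def leave_one_out_predictions(X, y, fit=fit_lookup):
    """Predict each row with a model trained on all the other rows."""
    predictions = []
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        predictions.append(model(X[i]))
    return predictions


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


# Synthetic rows [smoker, alcoholic, obese] with labels (LVEF over 50); not study data.
X = [[1, 0, 1], [1, 1, 1], [0, 0, 0], [0, 0, 1], [1, 0, 0], [0, 1, 0], [0, 0, 0], [1, 1, 0]]
y = [0, 0, 1, 1, 0, 1, 1, 0]
predictions = leave_one_out_predictions(X, y)
print(binary_metrics(y, predictions))
```

Passing a different `fit` function, one that trains a real classifier and returns a prediction callable, would leave the validation loop unchanged.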

4. Results

Behavioral Variables Analysis

XGB model trained with the following variables:
  • Dependent variable: LVEF over 50;
  • Independent variables: smoker, alcoholic, obese.
Following the training, a model emerged that generates probabilities based on independent variables. The probabilities presented below are generated after cross-validation—leave one out, for each patient.
Figure 3 shows the probability distributions returned by the model. The graphs are similar to boxplots, where the white point is the median. The model has learned something from the data, so there is a relationship between behavioral factors and LVEF. Blue is the probability distribution for patients with LVEF over 50, and these probabilities are clearly higher than those in orange, for the remaining patients. The ROC curve also shows that learning took place: the model performs better than random prediction. The same is observed in the confusion matrix, where accuracy is 74% for a threshold of 0.5.
Knowing that there is a relationship between these factors and LVEF, it is still important to see what each variable contributes. For this, we will compare the results given by Shapley values and logistic regression coefficients.
To better understand Figure 4, some clarifications are needed. Each point is a patient; where the points stack higher, there are more patients. If the dot is red, the patient had the value 1 (was a smoker, obese or alcoholic). If the dot is blue, the patient had the value 0. The further left a dot lies on the horizontal (OX) axis, that is, the lower its Shapley value, the more that variable decreased the patient's probability of having LVEF over 50. The further right it lies, the more that variable (with the value indicated by the color) contributed to having LVEF over 50.
It is observed that smoking had a very harmful role; if a patient smoked, their chances of having LVEF over 50 decreased a lot, even more than if they were both obese and alcoholic. If the patient was not a smoker, their chances of having LVEF over 50 increased significantly, a little more than if they were not obese. Interestingly, being obese has an impact, though not a large one compared to smoking, whereas not being obese has benefits similar to not smoking. Being an alcoholic does not have a very large impact, with most values close to 0; there seems to be some connection, but not an obvious one.
The classical approach, logistic regression coefficients, gives similar results. These can be seen in Table 3.
Smoking has the largest impact on LVEF, followed by obesity, and for both the p-value is below 0.05. As for the alcoholic variable, there seems to be some connection, but the p-value is greater than 0.05, so the observed influence may be due to chance.
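To make the idea of a Shapley value concrete, the sketch below computes exact Shapley values for one patient by enumerating every coalition of features. It is not the authors' procedure: libraries normally use faster tree-specific algorithms for XGB, `toy_model` is a hypothetical function that was not fitted to the study data, and replacing "absent" features with a baseline value is one common convention assumed here.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one observation by coalition enumeration.

    Features outside a coalition are set to their baseline value.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_model(p):
    """Hypothetical probability of LVEF over 50; NOT fitted to the study data."""
    prob = 0.8 - 0.4 * p["smoker"] - 0.2 * p["obese"] - 0.05 * p["alcoholic"]
    prob -= 0.1 * p["smoker"] * p["obese"]  # toy interaction term
    return prob


patient = {"smoker": 1, "obese": 1, "alcoholic": 1}
reference = {"smoker": 0, "obese": 0, "alcoholic": 0}
phi = shapley_values(toy_model, patient, reference)
print({k: round(v, 3) for k, v in phi.items()})
```

The values sum to the difference between the prediction for the patient and the prediction for the reference, which is the property that lets each dot in a Shapley plot be read as that variable's share of the predicted probability.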

5. Social Variables Analysis

XGB model trained with the following variables:
  • Dependent variable: LVEF over 50;
  • Independent variables: low education, low income, women, from rural.
The performance of the model is shown in Figure 5. The validation technique is the same, leave-one-out cross-validation. The probabilities for patients who actually had LVEF over 50 were much higher than those for patients without LVEF over 50. The median, marked in the graph with a white dot, is almost 0.8 for LVEF over 50 and about 0.3 for the others. The ROC curve shows that the model learned the relationship, with an AUC of 0.75. The results from the confusion matrix confirm a relationship between the independent and dependent variables, with an accuracy of 77% and an F1 score of 78%.
These independent variables explain LVEF in patients to a large extent. Next, it is important to see how the contribution is divided among the individual variables. To better understand the role played by the variables in calculating the probabilities, they are analyzed with Shapley values, and the results are presented in Figure 6.
For those with low education equal to 1 (red dots on the first line), the assigned Shapley values are strongly negative, which suggests that low education strongly decreased the probability that a patient has LVEF over 50. If the patient does not have a low education, their chances increase, but by less than they decrease for a patient with low education. The surprising result is low income: those with low income equal to 1 have a certain advantage, being slightly favored in having LVEF over 50. This may perhaps be explained by restricted access to cigarettes or alcohol. Women have a small advantage, with a slightly higher probability than men. For the variable from rural, the values are very close to zero, which suggests that this variable contributes little to the probabilities. These results are also reflected in logistic regression.
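An AUC such as the one quoted above can be read as the probability that a randomly chosen patient with LVEF over 50 receives a higher predicted probability than a randomly chosen patient without it. The sketch below computes it that way; the function name and the scores are illustrative assumptions, not study outputs.

```python
def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative labels and predicted probabilities (not study data).
labels = [1, 1, 1, 0, 0, 0]
probabilities = [0.8, 0.7, 0.3, 0.4, 0.2, 0.1]
print(round(roc_auc(labels, probabilities), 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so 0.75 sits midway between the two.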
As we can see in Table 4, the biggest influence is low education, with a negative coefficient of −2.87. Then, low_income has a protective role, with 1.58, and the variable women modifies the score, which can later be transformed into a probability, by 0.59. In the case of these three variables, the p-value is below 0.05, so we have reason to believe that they do have an impact. As for the from_rural variable, the p-value is 0.7, so the data do not allow us to conclude that it has an influence.
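Logistic regression coefficients act on a log-odds score, so exponentiating a coefficient gives an odds ratio and a sigmoid turns the score into a probability. The sketch below uses the three values quoted in the text, reading the 0.59 reported for women as its coefficient. The intercept is not reported in the text, so the 0.0 used here is a placeholder assumption and the resulting probabilities are illustrative only.

```python
import math

# Coefficients as quoted in the text for the social-variables model.
coefficients = {"low_education": -2.87, "low_income": 1.58, "women": 0.59}

# exp(coefficient) is the multiplicative change in the odds of LVEF over 50.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}


def sigmoid(score: float) -> float:
    """Convert a log-odds score to a probability."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical intercept: not given in the text, 0.0 is a placeholder.
HYPOTHETICAL_INTERCEPT = 0.0


def probability(patient: dict) -> float:
    """Probability of LVEF over 50 under the placeholder intercept."""
    score = HYPOTHETICAL_INTERCEPT + sum(
        beta * patient.get(name, 0) for name, beta in coefficients.items()
    )
    return sigmoid(score)


for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio {ratio:.2f}")
print(round(probability({"women": 1, "low_education": 1}), 3))
```

An odds ratio below 1 (low education) lowers the odds of LVEF over 50, while a ratio above 1 (low income, women) raises them.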

6. Diabetes Variables Analysis

XGB model trained with the following variables:
  • Dependent variable: LVEF over 50;
  • Independent variables: no diabetes, diabetes insulin, diabetes diet, diabetes antidiabetic pills.
In this case, the model could not explain LVEF as well from the independent variables. This suggests that there is not a very strong relationship between the diabetes categories and LVEF. In Figure 7 it can be seen that the probability distributions are similar for both types of patients, those with LVEF over 50 and the others. However, the model achieved an accuracy of 59%, slightly better than random. In addition, it is worth mentioning that the X variables here are not independent of one another: if a patient has no diabetes, the rest of the variables will be 0.
Figure 8 shows the Shapley results for the trained XGB model. Given that the model does not classify patients well based on these independent variables, the interpretation of the variables is quite complicated.
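The dependence between the diabetes indicators can be illustrated with a small encoding sketch. The column names come from the variable list in the Data section; the assumption that each patient falls into exactly one of the four categories is ours and is not stated in the text, and both function names are hypothetical.

```python
DIABETES_COLUMNS = (
    "no_diabet",
    "diabet_antidiabetic_pills",
    "diabet_diet",
    "diabet_insulin",
)


def encode_diabetes(status: str) -> dict:
    """One-hot encode a diabetes status (assumes exactly one category applies)."""
    if status not in DIABETES_COLUMNS:
        raise ValueError(f"unknown status: {status}")
    return {column: int(column == status) for column in DIABETES_COLUMNS}


def is_consistent(row: dict) -> bool:
    """Check that no_diabet == 1 implies every treatment indicator is 0."""
    treated = sum(row[column] for column in DIABETES_COLUMNS[1:])
    if row["no_diabet"] == 1:
        return treated == 0
    return treated >= 1


row = encode_diabetes("no_diabet")
print(row, is_consistent(row))
```

Under this assumption the four indicators always sum to 1, so any one of them is fully determined by the other three.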
The variables do not make a significant contribution to explaining LVEF, and this may be an interpretation in itself. A patient with diabetes and diet is about as vulnerable as a patient without diabetes or taking antidiabetic pills.
The Logistic Regression coefficients presented in Table 5 show that the variables are significant, but they explain very little of the LVEF. In other words, there are other much more important factors to explain LVEF. It is noted, however, that diet, even in the context of diabetes, helps.

7. Discussion

Heart failure remains one of the most important public health problems, a major cause of death and readmission, with an increasing incidence and a prognosis that is most often unfavorable, which calls for new therapeutic options suited to each subgroup of patients.
Various mechanisms contribute to the development of heart failure with reduced ejection fraction, many of which are underinvestigated. Obesity, hypertension, sedentary lifestyle and metabolic syndrome have been identified as important risk factors for various types of heart disease, including heart failure with reduced ejection fraction.
Hospitalizations remain relevant as a countable cost and as a marker for disease severity, quality of life and prognosis. In 2012, Cook et al. evaluated the annual global heart failure burden from all published sources and estimated it at $108 billion annually. The mean immediate HF burden value for the high-income countries was 1.42% versus 0.11% for low- or middle-income countries [40,41]. Hospitalization expenses are the most significant cost component following the expenditures for the medication [42].
The purpose of this study is to support prevention, rapid diagnosis and intervention on risk factors, also in the context of certain comorbidities, which we aim to understand better. The ultimate goal is to improve the prognosis of patients who have heart failure and offer better recommendations. A further aim is to identify the main drivers of heart failure, factors that can later be used to estimate the evolution of the disease and survival. The data of the patients admitted to the hospital for acute heart failure were evaluated and analyzed from the first day of admission to the day of discharge. Our results indicate that smoking, low education and obesity are the most harmful factors, while diabetes controlled by diet or medication does not significantly affect LVEF; indeed, there is a slight tendency toward higher LVEF.
This research has only scratched the surface of measuring and deeply understanding how certain factors affect HF. However, we believe that this approach, which combines classical and modern machine learning techniques, could lead to significant progress given more data, especially when combined with findings from other areas, such as genetics.
Numerous factors are known to contribute to the development of heart failure. The potential causes include coronary artery disease, hypertension, cardiomyopathies, valvular and congenital heart disease, arrhythmias, alcohol and drugs, high output failure (anemia, thyrotoxicosis, Paget’s disease, etc.), pericardial disease, and primary right heart failure. The meta-analyses conducted by Jones et al. [43] found an improvement in the survival rates secondary to CHF over the past 70 years. The estimated 1-year survival rate was 85.5%; however, the 5-year and 10-year survival rates were 56.7% and 34.9%, and most patients died directly from heart failure or cardiovascular diseases. Although the risk of HF decompensation among older patients has declined over time, it remains one of the leading causes of hospitalization.
We believe that sharing the findings of our study will give clinicians more knowledge about the relationship between LVEF and behavioral, social and comorbidity factors, and about predictors of LVEF in these patients.

8. Conclusions

In conclusion, by using machine learning techniques, from traditional to modern, we can better understand to what extent certain factors affect LVEF. Smoking, low education and obesity are the most harmful factors, while diabetes controlled by diet or medication does not significantly affect LVEF; indeed, there is a slight tendency to increase the LVEF. Following more studies on larger groups and including different regions, prevention could be better understood and applied. This approach to better understand the effects or correlations of some factors can be developed in the future in several ways. Another direction of research would be to use more data and technologies, for example, data from sensors and make real-time analyses.

9. Limitations of the Study

We believe that our study has a few limitations: it was a retrospective study, which has not made it possible for us to establish a nonbiased design for the selection of the patients.
The study sample is small, especially for a powerful and complex machine learning classification algorithm like XGB, and the results obtained are preliminary, which is why further research is needed to certify the results presented.

Author Contributions

Conceptualization, C.-A.S.; methodology, C.-A.S.; software, O.C.; validation, C.-A.S.; formal analysis, C.-A.S.; writing, C.-A.S.; writing (review and editing), C.L.A.; supervision, C.-J.S.; project administration, C.-J.S. All authors have read and agreed to the published version of the manuscript.

Funding

No funding was received for this work.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Ethics Committee of “Carol Davila” University of Medicine and Pharmacy, Bucharest, Romania (protocol code PO-35-F-03, date 12 July 2021).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to ethical restrictions.

Conflicts of Interest

The authors declare no conflict of interest in this work.

References

  1. World Health Organization. Cardiovascular Diseases (CVDs) Fact Sheet. Available online: https://www.who.int/en/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds) (accessed on 1 January 2022).
  2. Greisenegger, S.; Endler, G.; Hsieh, K.; Tentschert, S.; Mannhalter, C.; Lalouschek, W. Is Elevated Mean Platelet Volume Associated with a Worse Outcome in Patients with Acute Ischemic Cerebrovascular Events? Stroke 2004, 35, 1688–1691.
  3. Kaya, H.; Kurt, R.; Beton, O.; Zorlu, A.; Yucel, H.; Güneş, H.; Oguz, D.; Yilmaz, M.B. Cancer Antigen 125 is Associated with Length of Stay in Patients with Acute Heart Failure. Tex. Heart Inst. J. 2017, 44, 22–28.
  4. Chu, S.G.; Becker, R.C.; Berger, P.B.; Bhatt, D.L.; Eikelboom, J.; Konkle, B.; Mohler, E.R.; Reilly, M.; Berger, J.S. Mean platelet volume as a predictor of cardiovascular risk: A systematic review and meta-analysis. J. Thromb. Haemost. 2009, 8, 148–156.
  5. Ziaeian, B.; Fonarow, G.C. Epidemiology and aetiology of heart failure. Nat. Rev. Cardiol. 2016, 13, 368–378.
  6. Lesyuk, W.; Kriza, C.; Kolominsky-Rabas, P. Cost-of-illness studies in heart failure: A systematic review 2004–2016. BMC Cardiovasc. Disord. 2018, 18, 74.
  7. Naylor, M.D.; Brooten, D.A.; Campbell, R.L.; Maislin, G.; McCauley, K.M.; Schwartz, J.S. Transitional Care of Older Adults Hospitalized with Heart Failure: A Randomized, Controlled Trial. J. Am. Geriatr. Soc. 2004, 52, 675–684.
  8. Porcel, J.M. Pleural Effusions from Congestive Heart Failure. Semin. Respir. Crit. Care Med. 2010, 31, 689–697.
  9. Natanzon, A.; Kronzon, I. Pericardial and Pleural Effusions in Congestive Heart Failure: Anatomical, Pathophysiologic, and Clinical Considerations. Am. J. Med Sci. 2009, 338, 211–216.
  10. Andrei, C.A.; Oancea, B.; Nedelcu, M.; Sinescu, R.D. Predicting Cardiovascular Diseases Prevalence Using Neural Networks. Econ. Comput. Econ. Cybern. Stud. Res. 2015, 49, 73–84.
  11. Chioncel, O.; Tatu-Chitoiu, G.; Christodorescu, R.; Coman, I.M.; Deleanu, D.; Vinereanu, D.; Macarie, C.; Crespo, M.; Laroche, C.; Fereirra, T.; et al. Characteristic of patients with heart failure from Romania enrolled in ESC-HF Long-term (ESC-HF-LT) Registry. Rom. J. Cardiol. 2015, 25, 1–8.
  12. Bytyçi, I.; Bajraktari, G. Mortality in heart failure patients. Anadolu Kardiyol. Dergisi/Anatol. J. Cardiol. 2015, 15, 63–68.
  13. Van Nuys, K.E.; Xie, Z.; Tysinger, B.; Hlatky, M.A.; Goldman, D.P. Innovation in Heart Failure Treatment. JACC Heart Fail. 2018, 6, 401–409.
  14. Lee, H.; Oh, S.-H.; Cho, H.; Cho, H.-J.; Kang, H.-Y. Prevalence and socio-economic burden of heart failure in an aging society of South Korea. BMC Cardiovasc. Disord. 2016, 16, 215.
  15. Ni, H.; Xu, J. Recent Trends in Heart Failure-related Mortality: United States, 2000–2014. NCHS Data Brief. 2015, 231, 1–8.
  16. Das, T. National Center for Health Statistics Information. J. Consum. Health Internet 2015, 19, 40–51.
  17. Dharmarajan, K.; Rich, M.W. Epidemiology, Pathophysiology, and Prognosis of Heart Failure in Older Adults. Heart Fail. Clin. 2017, 13, 417–426.
  18. Braunstein, J.B.; Anderson, G.F.; Gerstenblith, G.; Weller, W.; Niefeld, M.; Herbert, R.; Wu, A.W. Noncardiac comorbidity increases preventable hospitalizations and mortality among medicare beneficiaries with chronic heart failure. J. Am. Coll. Cardiol. 2003, 42, 1226–1233.
  19. Dries, D.L.; Sweitzer, N.K.; Drazner, M.H.; Stevenson, L.W.; Gersh, B.J. Prognostic impact of diabetes mellitus in patients with heart failure according to the etiology of left ventricular systolic dysfunction. J. Am. Coll. Cardiol. 2001, 38, 421–428.
  20. Herzog, C.A.; Muster, H.A.; Li, S.; Collins, A.J. Impact of congestive heart failure, chronic kidney disease, and anemia on survival in the Medicare population. J. Card. Fail. 2004, 10, 467–472.
  21. Kannel, W.B.; McGee, D.L. Diabetes and Glucose Tolerance as Risk Factors for Cardiovascular Disease: The Framingham Study. Diabetes Care 1979, 2, 120–126.
  22. Christiansen, M.N.; Køber, L.; Weeke, P.; Vasan, R.S.; Jeppesen, J.L.; Smith, J.G.; Gislason, G.; Torp-Pedersen, C.; Andersson, C. Age-Specific Trends in Incidence, Mortality, and Comorbidities of Heart Failure in Denmark, 1995 to 2012. Circulation 2017, 135, 1214–1223.
  23. Ather, S.; Chan, W.; Bozkurt, B.; Aguilar, D.; Ramasubbu, K.; Zachariah, A.A.; Wehrens, X.H.; Deswal, A. Impact of Noncardiac Comorbidities on Morbidity and Mortality in a Predominantly Male Population with Heart Failure and Preserved Versus Reduced Ejection Fraction. J. Am. Coll. Cardiol. 2012, 59, 998–1005.
  24. Taylor, C.J.; Ordóñez-Mena, J.M.; Roalfe, A.K.; Lay-Flurrie, S.; Jones, N.; Marshall, T.; Hobbs, F.D.R. Trends in survival after a diagnosis of heart failure in the United Kingdom 2000-2017: Population based cohort study. BMJ 2019, 364, l223. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  25. Lankarani, M.M.; Assari, S. Baseline Depressive Symptoms Predict Subsequent Heart Disease; A 20-Year Cohort. Int. Cardiovasc. Res. J. 2016, 10, 29–34. [Google Scholar] [CrossRef]
  26. Guidelines for the Evaluation and Management of Heart Failure. J. Am. Coll. Cardiol. 1995, 26, 1376–1398. [CrossRef]
  27. Oliveros, E.; Brailovsky, Y.; Scully, P.; Nikolou, E.; Rajani, R.; Grapsa, J. Coronavirus Disease 2019 and Heart Failure: A Multiparametric Approach. Card. Fail. Rev. 2020, 6, e22. [Google Scholar] [CrossRef]
  28. Ponikowski, P.; Voors, A.A.; Anker, S.D.; Bueno, H.; Cleland, J.G.F.; Coats, A.J.S.; Falk, V.; González-Juanatey, J.R.; Harjola, V.-P.; Jankowska, E.A.; et al. 2016 ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure. Eur. J. Heart Fail. 2016, 18, 891–975. [Google Scholar] [CrossRef]
  29. Virani, S.S.; Alonso, A.; Benjamin, E.J.; Bittencourt, M.S.; Callaway, C.W.; Carson, A.P.; Chamberlain, A.M.; Chang, A.R.; Cheng, S.; Delling, F.N.; et al. Heart Disease and Stroke Statistics—2020 Update: A Report from the American Heart Association. Circulation 2020, 141, e139–e596. [Google Scholar] [CrossRef]
  30. Benjamin, E.J.; Blaha, M.J.; Chiuve, S.E.; Cushman, M.; Das, S.R.; Deo, R.; de Ferranti, S.D.; Floyd, J.; Fornage, M.; Gillespie, C. Heart disease and stroke statistics-2016 update: A report from the American Heart Association. Circulation 2016, 133, e38–e48. [Google Scholar]
  31. Savarese, G.; Lund, L.H. Global Public Health Burden of Heart Failure. Card. Fail. Rev. 2017, 3, 7–11. [Google Scholar] [CrossRef]
  32. Fortin, M.; Hudon, C.; Haggerty, J.; Akker, M.V.D.; Almirall, J. Prevalence estimates of multimorbidity: A comparative study of two sources. BMC Health Serv. Res. 2010, 10, 111. [Google Scholar] [CrossRef] [PubMed]
  33. Gijsen, R.; Hoeymans, N.; Schellevis, F.; Ruwaard, D.; Satariano, W.; Bos, G. Causes and Consequences of Comorbidity. 2022. Available online: https://www.academia.edu/22796958/Causes_and_consequences_of_comorbidity (accessed on 13 August 2022).
  34. Bouneder, L.; Léo, Y.; Lachapelle, A. X-SHAP: Towards multiplicative explainability of Machine Learning. arXiv 2022, arXiv:2006.04574. [Google Scholar]
  35. Shapley, L.S. 17. A Value for n-Person Games. In Contributions to the Theory of Games (AM-28); Princeton University Press: Princeton, NJ, USA, 1953; Volume II, pp. 307–318. [Google Scholar] [CrossRef]
  36. Ribeiro, M.T.; Singh, S.; Guestrin, C. Model-Agnostic Interpretability of Machine Learning. arXiv 2016, arXiv:1606.05386. [Google Scholar]
  37. Shailaja, K.; Seetharamulu, B.; Jabbar, M.A. Machine Learning in Healthcare: A Review. In Proceedings of the 2018 2nd International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 29–31 March 2018. [Google Scholar] [CrossRef]
  38. Ahmad, M.A.; Eckert, C.; Teredesai, A. Interpretable Machine Learning in Healthcare. In Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, New York, NY, USA, 4–7 June 2018. [Google Scholar] [CrossRef]
  39. Khorasani, M.; Loy, J.; Ghasemi, A.H.; Sharabian, E.; Leary, M.; Mirafzal, H.; Cochrane, P.; Rolfe, B.; Gibson, I. A review of Industry 4.0 and additive manufacturing synergy. Rapid Prototyp. J. 2022, 28, 1462–1475. [Google Scholar] [CrossRef]
  40. Cook, C.; Cole, G.; Asaria, P.; Jabbour, R.; Francis, D.P. The annual global economic burden of heart failure. Int. J. Cardiol. 2014, 171, 368–376. [Google Scholar] [CrossRef] [PubMed]
  41. Zugck, C.; Müller, A.; Helms, T.M.; Wildau, H.J.; Becks, T.; Hacker, J.; Haag, S.; Goldhagen, K.; Schwab, J.O. Gesundheitsökonomische Bedeutung der Herzinsuffizienz: Analyse bundesweiter Daten. DMW Dtsch. Med. Wochenschr. 2010, 135, 633–638. [Google Scholar] [CrossRef] [PubMed]
  42. Delgado, J.F.; Oliva-Moreno, J.; Llano, M.; Figal, D.A.P.; Grillo, J.J.; Comín-Colet, J.; Díaz, B.; de La Concha, L.M.; Martí, B.; Peña, L.M. Health Care and Nonhealth Care Costs in the Treatment of Patients with Symptomatic Chronic Heart Failure in Spain. Rev. Española Cardiol. (Engl. Ed.) 2014, 67, 643–650. [Google Scholar] [CrossRef]
  43. Jones, N.; Roalfe, A.K.; Adoki, I.; Hobbs, F.R.; Taylor, C.J. Survival of patients with chronic heart failure in the community: A systematic review and meta-analysis. Eur. J. Heart Fail. 2019, 21, 1306–1325. [Google Scholar] [CrossRef]
Figure 1. Correlation matrix between variables.
Figure 2. Research workflow.
Figure 3. Classification results for behavioral variables.
Figure 4. Shapley values for behavioral variables.
Figure 5. Classification results for social variables.
Figure 6. Shapley values for social variables.
Figure 7. Classification results for diabetes variables.
Figure 8. Shapley values for variables related to diabetes.
Table 1. The proportion of patients by each variable.

| Variable | Percentage |
| --- | --- |
| Smoker | 40% |
| Alcoholic | 26% |
| Obese | 79% |
| Women | 53% |
| from_rural | 36% |
| low_income | 72% |
| low_education | 25% |
| no_diabet | 50% |
| diabet_antidiabetic_pills | 24% |
| diabet_diet | 13% |
| diabet_insulin | 12% |
| LVEF_over_50 | 49% |
Table 2. The proportion of patients' diagnoses by each variable.

| Variable | Proportion Having LVEF_over_50 | Proportion Not Having LVEF_over_50 | Count of Patients with Variable = 1 |
| --- | --- | --- | --- |
| Smoker | 0.21 | 0.79 | 81 |
| Alcoholic | 0.25 | 0.75 | 52 |
| Obese | 0.43 | 0.57 | 159 |
| Women | 0.6 | 0.4 | 106 |
| from_rural | 0.4 | 0.6 | 73 |
| low_income | 0.59 | 0.41 | 145 |
| low_education | 0.08 | 0.92 | 50 |
| no_diabet | 0.49 | 0.51 | 100 |
| diabet_antidiabetic_pills | 0.46 | 0.54 | 48 |
| diabet_diet | 0.81 | 0.19 | 26 |
| diabet_insulin | 0.24 | 0.76 | 25 |
Table 3. Logistic regression output for behavioural variables.

Logit Regression Results

| Item | Value | Item | Value |
| --- | --- | --- | --- |
| Dependent Variable: | LVEF_over_50 | No. Observations: | 201 |
| Model: | Logit | Df Residuals: | 197 |
| Method: | MLE | Df Model: | 3 |
| Date: | Wednesday, 13 July 2022 | Pseudo R-squ.: | 0.1858 |
| Time: | 19:57:17 | Log-Likelihood: | −113.38 |
| Converged: | True | LL-Null: | −139.26 |
| Covariance Type: | nonrobust | LLR p-Value: | 3.371 × 10⁻¹¹ |

| | coef | std err | z | p > \|z\| | [0.025 | 0.975] |
| --- | --- | --- | --- | --- | --- | --- |
| Intercept | 1.5682 | 0.398 | 3.944 | 0.000 | 0.789 | 2.347 |
| smoker | −1.9056 | 0.373 | −5.113 | 0.000 | −2.636 | −1.175 |
| alcoholic | −0.4042 | 0.431 | −0.937 | 0.349 | −1.249 | 0.441 |
| obese | −1.0038 | 0.417 | −2.410 | 0.016 | −1.820 | −0.187 |
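The coefficients in Table 3 are on the log-odds scale, so exponentiating them gives odds ratios that are often easier to read. The following is a minimal sketch of that conversion, not part of the authors' analysis; the dictionary names are illustrative, and the numbers are copied from Table 3. For example, the smoker coefficient of −1.9056 corresponds to an odds ratio of about 0.15 for having LVEF over 50.

```python
import math

# Coefficients (log-odds) and 95% confidence bounds copied from Table 3.
# The variable names used here are illustrative, not the authors' code.
coefs = {"smoker": -1.9056, "alcoholic": -0.4042, "obese": -1.0038}
bounds = {
    "smoker": (-2.636, -1.175),
    "alcoholic": (-1.249, 0.441),
    "obese": (-1.820, -0.187),
}

# Exponentiating a logit coefficient gives the odds ratio for that indicator.
odds_ratios = {name: math.exp(beta) for name, beta in coefs.items()}
or_bounds = {
    name: (math.exp(lo), math.exp(hi)) for name, (lo, hi) in bounds.items()
}

for name in coefs:
    lo, hi = or_bounds[name]
    print(f"{name}: OR = {odds_ratios[name]:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

An interval that contains 1 on the odds-ratio scale (as for alcoholic) is consistent with the non-significant p-value reported for that variable.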
Table 4. Logistic regression output for social variables.

Logit Regression Result

| Item | Value | Item | Value |
| --- | --- | --- | --- |
| Dependent Variable: | LVEF_over_50 | No. Observations: | 201 |
| Model: | Logit | Df Residuals: | 196 |
| Method: | MLE | Df Model: | 4 |
| Date: | Wednesday, 13 July 2022 | Pseudo R-squ.: | 0.2627 |
| Time: | 19:57:41 | Log-Likelihood: | −102.68 |
| Converged: | True | LL-Null: | −139.26 |
| Covariance Type: | nonrobust | LLR p-Value: | 4.890 × 10⁻¹⁵ |

| | coef | std err | z | p > \|z\| | [0.025 | 0.975] |
| --- | --- | --- | --- | --- | --- | --- |
| Intercept | −0.9514 | 0.395 | −2.409 | 0.016 | −1.725 | −0.177 |
| women | 0.5951 | 0.346 | 1.720 | 0.085 | −0.083 | 1.273 |
| from_rural | −0.1411 | 0.374 | −0.378 | 0.706 | −0.873 | 0.591 |
| low_income | 1.5814 | 0.398 | 3.975 | 0.000 | 0.802 | 2.361 |
| low_education | −2.8754 | 0.573 | −5.014 | 0.000 | −3.999 | −1.751 |
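The summary statistics in Table 4 can be cross-checked from the two reported log-likelihoods. The sketch below assumes the usual definitions: the likelihood-ratio statistic is twice the gap between the model and null log-likelihoods, it is compared against a chi-square distribution with Df Model degrees of freedom, and the pseudo R-squared is McFadden's. These definitions are an assumption about how the output was produced, and the helper function is hypothetical.

```python
import math

# Values copied from Table 4.
ll_model = -102.68   # Log-Likelihood
ll_null = -139.26    # LL-Null
df_model = 4         # Df Model

# Likelihood-ratio statistic: twice the log-likelihood improvement.
lr_stat = 2.0 * (ll_model - ll_null)


def chi2_sf_even_df(x: float, df: int) -> float:
    """Chi-square survival function, closed form valid for even df only."""
    if df % 2 != 0:
        raise ValueError("closed form used here requires even df")
    term, total = 1.0, 1.0  # k = 0 term of the series
    for k in range(1, df // 2):
        term *= (x / 2.0) / k
        total += term
    return math.exp(-x / 2.0) * total


p_value = chi2_sf_even_df(lr_stat, df_model)

# McFadden's pseudo R-squared (assumed definition).
pseudo_r2 = 1.0 - ll_model / ll_null

print(f"LR statistic = {lr_stat:.2f}")
print(f"LLR p-value  = {p_value:.3e}")
print(f"Pseudo R2    = {pseudo_r2:.4f}")
```

Because the log-likelihoods are reported to two decimals, the recomputed p-value can only be expected to agree with the table in its leading digits.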
Table 5. Logistic regression output for variables related to diabetes.

Logit Regression Result

| Item | Value | Item | Value |
| --- | --- | --- | --- |
| Dependent Variable: | LVEF_over_50 | No. Observations: | 201 |
| Model: | Logit | Df Residuals: | 197 |
| Method: | MLE | Df Model: | 3 |
| Date: | Wednesday, 13 July 2022 | Pseudo R-squ.: | 0.07059 |
| Time: | 19:22:30 | Log-Likelihood: | −129.43 |
| Converged: | True | LL-Null: | −139.26 |
| Covariance Type: | nonrobust | LLR p-Value: | 0.0001994 |

| | coef | std err | z | p > \|z\| | [0.025 | 0.975] |
| --- | --- | --- | --- | --- | --- | --- |
| Intercept | −1.2528 | 0.463 | −2.706 | 0.007 | −2.160 | −0.345 |
| no_diabet | 1.2128 | 0.504 | 2.405 | 0.016 | 0.224 | 2.201 |
| diabet_antidiabetic_pills | 1.0857 | 0.546 | 1.988 | 0.047 | 0.015 | 2.156 |
| diabet_diet | 2.6878 | 0.680 | 3.955 | 0.000 | 1.356 | 4.020 |
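Since every predictor in Table 5 is a 0/1 indicator, the fitted probability of LVEF over 50 for each diabetes group follows directly from the intercept plus that group's coefficient, passed through the logistic function. The sketch below illustrates this with the values from Table 5. It assumes the groups are mutually exclusive and that the reference category (all indicators equal to 0) consists mainly of insulin-treated patients, since diabet_insulin is the one diabetes variable from Table 1 that does not appear in the model; that reading is an assumption, not something the table states.

```python
import math

# Values copied from Table 5.
intercept = -1.2528
coefs = {
    "no_diabet": 1.2128,
    "diabet_antidiabetic_pills": 1.0857,
    "diabet_diet": 2.6878,
}


def sigmoid(z: float) -> float:
    """Logistic function mapping log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Reference group: all three indicators equal to 0 (assumed to be
# mostly insulin-treated patients; see the note above).
probs = {"reference": sigmoid(intercept)}
for name, beta in coefs.items():
    probs[name] = sigmoid(intercept + beta)

for name, p in probs.items():
    print(f"{name}: P(LVEF_over_50) = {p:.3f}")
```

For the three indicator groups, the fitted probabilities are close to the observed proportions in Table 2 (0.49, 0.46 and 0.81), which is what one would expect from a model built only on group indicators.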
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Soysaler, C.-A.; Andrei, C.L.; Ceban, O.; Sinescu, C.-J. The Relationship of Behavioral, Social and Diabetes Factors with LVEF Measured Using Machine Learning Techniques. Appl. Sci. 2022, 12, 9474. https://doi.org/10.3390/app12199474
