1. Introduction
The heart is one of the most vital organs of the human body; if it does not function properly, infection can spread through the body, and the resulting illness often proves fatal. The heart helps regulate body temperature, blood pressure, and many other crucial health-related factors, such as blood oxygen levels [
1]. Congestive heart failure (CHF) is a condition in which the heart is unable to pump the amount of blood the body requires. Blood accumulates in the heart and in the pulmonary circulation, causing congestion; fluid then builds up around the heart and lungs, further aggravating the failure. The main causes of CHF are decreased preload, decreased myocardial contractility, and increased afterload, which may in turn result from valvular dysfunction, aneurysm, or plaque. The main types of CHF are left-sided and right-sided heart failure, each of which may be systolic or diastolic depending upon the complications [
1].
Heart disease is the leading cause of hospitalization among individuals over 65 years of age [
2]. The American Heart Association states that over 50% of all American adults (i.e., 121.5 million) have one or more cardiovascular diseases [
3]. According to World Health Organization (WHO) reports, heart disease causes more than 15 million fatalities each year, accounting for around 30% of all deaths worldwide. If left unmitigated, the toll is projected to reach approximately 22 million deaths globally by 2030. Major risk factors that can develop into heart disease include diabetes, high cholesterol, physical inactivity, high blood pressure, family genetics, excessive use of alcohol, an unhealthy diet, tobacco use, and many other factors [
4,
5,
6,
7]. The two sexes may also experience different CHF symptoms. For instance, a male patient is more likely to experience chest pain, whereas a female patient may experience additional symptoms, such as nausea, acute exhaustion, and shortness of breath, along with chest pain [
8]. These risks can be reduced by adopting a healthy lifestyle, such as consuming less salt, eating vegetables and fruits, exercising regularly, and giving up alcohol and cigarettes, which may lower the chance of developing heart disease over time.
Because heart disease is complex in nature, it requires careful management; failure to manage it can damage the heart or result in an untimely death. In more than 40% of cases, heart attacks occur suddenly and, despite the best medical treatment, are often so severe that the patient cannot be saved. Predicting CHF in its early stages can therefore help guide treatment and save many lives. Available diagnostic methods for CHF include chest X-ray, brain natriuretic peptide (BNP) testing, echocardiography, and cardiac catheterization. Manual diagnosis relies on identifying jugular vein distention in the neck, swelling of the extremities, systemic congestion, and, in some cases, liver cirrhosis. Meanwhile, the quality of service (QoS) in the healthcare industry, which demands accurate, prompt diagnosis and competent patient care, faces serious challenges. A specially trained artificial intelligence (AI) model can aid medical professionals in diagnosis, speeding up the diagnostic process, reducing their workload, and saving precious time [
9]. In this context, machine learning (ML) algorithms are effective and reliable resources for identifying and classifying patients with and without CHF. Therefore, using machine learning and data mining techniques, researchers have created a wide range of automatic systems for diagnosis [
10,
11]. Current approaches have limitations, such as the large data volumes produced by CT or MRI scans and the associated space complexity, which make it challenging to develop a solution that distinguishes, for example, heart disease from stroke. Heart disease prediction has long been considered challenging; with the development of ML algorithms, however, it has become an important and tractable subject. Employing AI on electronic health records (EHRs), which are valuable for both clinical and research purposes, helps patients and doctors save lives by enabling treatment as early as possible and by increasing productivity and effectiveness in healthcare. There is therefore a pressing need to make such prediction systems more accurate.
Different ML techniques have been employed to capture the non-linear relationships between various parameters by minimizing the error between predicted and actual outcomes [
10]. In clinical research, data mining techniques are applied to identify several types of metabolic syndromes. Classification and data mining both serve a crucial function in data analysis and heart disease prediction, and decision trees in particular have been used to identify events related to heart disease [
12]. However, in machine learning, class imbalance and outlier data can arise and degrade the effectiveness of a prediction model. Earlier research has shown that outlier data can be identified and eliminated using density-based spatial clustering of applications with noise (DBSCAN)-based methods [
13], and that the hybrid synthetic minority over-sampling technique with edited nearest neighbors (SMOTE-ENN) can be used to balance the class distribution and significantly improve the performance of prediction models [
14]. To overcome these problems, we propose an effective hybrid model for CHF prediction, in which the C4.5 algorithm is employed for feature selection from an inconsistent and noisy dataset, the k-nearest neighbors (KNN) algorithm for missing data imputation, and a deep neural network (DNN) for the prediction of CHF. In this work, various observations were made to develop a prediction model employing a unique methodology; such integrated techniques are often referred to as hybrid methods [
15].
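To make the data-handling techniques cited above concrete, the sketch below illustrates how DBSCAN-based outlier removal [13] and SMOTE-ENN rebalancing [14] could be applied to a generic numeric dataset. This is an illustrative sketch rather than part of our proposed pipeline; the function name, the `eps` and `min_samples` values, and the random seed are assumptions.

```python
# Illustrative sketch (not part of the proposed pipeline): removing outliers with
# DBSCAN and rebalancing classes with SMOTE-ENN, as done in earlier work [13,14].
# Assumes a numeric NumPy feature matrix X and binary labels y; parameter values
# are placeholders.
from sklearn.cluster import DBSCAN
from imblearn.combine import SMOTEENN

def clean_and_balance(X, y, eps=0.5, min_samples=5, random_state=42):
    # DBSCAN labels low-density points as -1 (noise); drop them as outliers.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    keep = labels != -1
    X_kept, y_kept = X[keep], y[keep]

    # SMOTE-ENN oversamples the minority class and then removes noisy samples
    # with edited nearest neighbors, yielding a more balanced training set.
    X_bal, y_bal = SMOTEENN(random_state=random_state).fit_resample(X_kept, y_kept)
    return X_bal, y_bal
```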
In this work, our goal is to help physicians diagnose CHF at an early stage by identifying the optimal features among all the features present in the dataset, which improves the overall performance of CHF prediction and minimizes the number of tests that healthcare providers must perform. A block diagram of the proposed model is shown in
Figure 1. As we can see in
Figure 1, the whole dataset is first cleaned by removing all duplicate and inconsistent records, and the remaining data are then split into training and testing sets in a 70:30 ratio. This ratio balances providing sufficient data for model learning against ensuring a robust evaluation on unseen instances. As shown in
Figure 2, during feature selection, the training set is further split into complete and incomplete sets, and features are selected from the complete set to prevent the model from choosing irrelevant or suboptimal features that would degrade its performance.
Figure 1 shows the general block diagram of our proposed model, and in
Figure 2, we can see all the steps involved in the pre-processing. The CHS dataset was used to build our models, and seven machine learning models were employed: naive Bayes (NB), random forest (RF), support vector machine (SVM), KNN, decision tree (DT), logistic regression (LR), and a deep neural network (DNN). After comparing all seven models, the DNN performed better than the others with our proposed pre-processing technique (a sketch of this comparison pipeline is given below). The main goal of this research is to design an effective CHF prediction model using a DNN classifier with improved accuracy. The contributions of our work are as follows: (a) We provide an understanding of the various risk factors for CHF prediction; we examine the numerous factors found in patient records to determine the most crucial ones, reducing the need for physical examinations, promptly identifying new patients, and shortening the diagnostic process. (b) We analyze and evaluate the most effective method for predicting the development of CHF by benchmarking widely used machine learning models for heart disease prediction. (c) Our model helps enhance CHF diagnosis using a minimal feature set, which reduces the number of tests and medical bills. (d) Additionally, the use of digitized healthcare offers many prospects for mitigating human error, enhancing therapeutic outcomes, monitoring longitudinal data, and so forth.
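The sketch below outlines the benchmarking pipeline referenced above under assumed names: a stratified 70:30 split followed by training and scoring of the six classical baselines (the DNN, described later, would be evaluated on the same split). Hyperparameters shown are library defaults or illustrative choices, not the tuned values from our experiments.

```python
# Minimal sketch of the benchmarking pipeline, assuming the cleaned CHS data are
# available as a feature matrix X and binary CHF labels y.
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, f1_score

def benchmark(X, y):
    # 70:30 split, stratified so both sets preserve the CHF/non-CHF proportions.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=42)

    models = {
        "LR": LogisticRegression(max_iter=1000),
        "NB": GaussianNB(),
        "RF": RandomForestClassifier(n_estimators=100, random_state=42),
        "SVM": SVC(),
        "KNN": KNeighborsClassifier(n_neighbors=5),
        "DT": DecisionTreeClassifier(random_state=42),
    }
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        pred = model.predict(X_te)
        print(f"{name}: accuracy={accuracy_score(y_te, pred):.3f}, "
              f"F1={f1_score(y_te, pred):.3f}")
```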
The remainder of the paper is organized as follows.
Section 2 presents the background of the proposed problem along with preliminaries.
Section 3 provides a brief description of existing methods and available models.
Section 4 discusses the proposed methodology, dataset description, and pre-processing methodology.
Section 5 and
Section 6 present state-of-the-art classification models used for the experiment and the performance evaluation metrics, respectively. The experimental setup, analysis of results, and comparison with earlier works are presented in
Section 7. Finally,
Section 8 presents the conclusions and future directions.
2. Background and Preliminaries
Congestive heart failure occurs when the heart is unable to circulate blood adequately to meet the body's requirements [
16]. The annual death rate from heart failure ranges from 5% to 75%. The main risk factors for heart failure include high blood pressure, a previous heart attack, being overweight, smoking, alcohol abuse, vitamin deficiency, lack of sleep, consuming unhealthy foods (such as animal fats), and physical inactivity [
17]. Heart failure is more common in people over 65, those who are overweight, and those who have already experienced heart failure.
Available methods for heart disease prediction are very costly, and low-income families cannot afford them. The prediction of CHF may facilitate decisions about specific medications or devices. Current risk prediction techniques, however, have only moderate success rates, most likely because they were developed using statistical analysis techniques that miss the multi-dimensional correlations hidden in large collections of data. Data mining techniques play an important role in extracting hidden information from archives of prior records to support future decision-making; almost every area of life, including business, engineering, medicine, and education, uses data mining.
Today, electronic health records (EHRs) are used for both clinical and research purposes, and data mining techniques are applied to them. Studies have identified several risk factors (high cholesterol, diabetes, alcohol consumption, smoking, and insufficient exercise) that contribute to heart disease [
5]. A variety of techniques and algorithms have been used in prior research to identify events related to heart disease, and several observations were made in this work in order to develop a prediction model using an integrated methodology [
12]. Here, we used the Cardiovascular Health Study (CHS) dataset, a challenging dataset with extensive inconsistencies and missing values. Its challenges include many attributes that are irrelevant to the problem at hand (e.g., seeing well enough to recognize a person, waking up at night, or having a grandchild born in the past 6 months), data entry errors, noisy data, and missing values. More than 30% of the data are missing, owing to reasons such as participant refusal, inability to answer certain questions, or death. Manual input error is a further problem, since it leads to missing data and also adds to data inconsistency.
To make the dataset suitable for the classification algorithms, we applied an integrated pre-processing method combining a decision tree with KNN, where the decision tree serves as the feature selection algorithm and KNN as the missing data imputation algorithm. Initially, we separated the CHF records from the raw data and split them into two subsets (complete and incomplete). The C4.5 decision tree was then applied to the complete set to calculate the gain ratio of each attribute, rank the features, select the top features, and discard the remaining features with lower information gain ratios. To handle the missing values in the incomplete set, we merged the complete and incomplete sets with respect to the selected attributes, and KNN was used to replace each missing value with the nearest plausible value in the dataset (a sketch of these two pre-processing steps is given at the end of this section). After this pre-processing, the dataset was clean and ready for classification, with no missing or inconsistent values. The pre-processed data were used as input to six standard ML algorithms (KNN, LR, NB, RF, SVM, DT) and a DNN, and performance was evaluated using 10-fold cross-validation. The results show that the integrated pre-processing method with the DNN classifier outperformed the other standard ML classifiers with maximum accuracy. Based on these outcomes, the model can be used to predict CHF and can help improve a patient's condition through the timely adoption of specific medications or devices. To further improve performance, different ML techniques can be integrated or combined in new ways to predict different diseases using various datasets; in the future, the model could serve as a reliable and cost-effective alternative to expensive diagnostic procedures. A summary of key notations used throughout this article is given in
Table 1.
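As a sketch of the two pre-processing steps just described, the code below ranks attributes by information gain ratio on the complete subset (a simplified stand-in for C4.5's attribute ranking, applicable to categorical or discretized features) and imputes missing values with a KNN imputer. Function names, k = 12, and n_neighbors = 5 are illustrative assumptions rather than the exact configuration used in our experiments.

```python
# Hedged sketch: (1) rank attributes by information gain ratio on the complete
# subset and keep the top-k; (2) impute missing values with a KNN imputer.
# The gain-ratio computation is a simple from-scratch version for categorical or
# discretized attributes; numeric encoding is assumed for the imputation step.
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

def entropy(series):
    p = series.value_counts(normalize=True)
    return -(p * np.log2(p)).sum()

def gain_ratio(df, attr, target):
    h_target = entropy(df[target])
    # Weighted entropy of the target within each value of the attribute.
    h_cond = sum((len(g) / len(df)) * entropy(g[target]) for _, g in df.groupby(attr))
    info_gain = h_target - h_cond
    split_info = entropy(df[attr])
    return info_gain / split_info if split_info > 0 else 0.0

def select_top_features(complete_df, target, k=12):
    attrs = [c for c in complete_df.columns if c != target]
    ranked = sorted(attrs, key=lambda a: gain_ratio(complete_df, a, target), reverse=True)
    return ranked[:k]

def impute_missing(merged_df, n_neighbors=5):
    # Replace each missing entry with a value derived from its nearest neighbors.
    imputer = KNNImputer(n_neighbors=n_neighbors)
    return pd.DataFrame(imputer.fit_transform(merged_df), columns=merged_df.columns)
```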
3. Related Work
The non-linear Cleveland heart disease dataset was utilized in [
18] for predicting heart disease using the properties of random forests with minor modifications. For every attribute with incomplete data, the median of that attribute was used to fill in the missing values, and the heart disease dataset was thus cleaned of missing values. More accurate heart disease predictions were achieved by employing the RF classification technique when the attributes were well defined. The accuracy of the random forest was verified using 10-fold cross-validation after training on 303 instances of data. The suggested model outperformed the competing models by achieving the maximum accuracy.
A heart sound-based technique for detecting CHF was presented by Gjoreski et al. in [
19]. They concentrated on using the analysis of cardiac sound recordings to determine the CHF condition. The approach combined the traditional ML approach and the end-to-end deep learning (DL) approach. A spectro-temporal representation of the signal was used as input for the DL algorithm to learn, whereas traditional ML used expert characteristics to learn. As many as 947 publicly accessible individual recordings and one CHF dataset were gathered, and the proposed method was evaluated. The ML models were created using 15 features to differentiate between CHF phases.
Plati et al. proposed a model using different machine learning classifiers (Bayes network (BN), decision tree, SVM, logistic model tree (LMT), RF, KNN, naive Bayes (NB), and rotation forest (ROT)) to diagnose heart failure in 422 subjects, with 10-fold cross-validation used for the evaluation [
20]. A total of 73 female and 154 male subjects made up the HF sample, whereas 106 male and 151 female subjects made up the non-HF dataset. After comparison, LMT and ROT performed better than the other classifiers.
Gjoreski et al. presented a method by stacking different ML classifiers for detecting CHF from the sounds of the heart [
21]. A professional digital stethoscope was used to capture the sounds, and a total of 152 distinct heart sounds from 122 different people were used. In the segment-based ML phase, experiments were conducted with combinations of various types of models, ranging from individual techniques to an ensemble of seven techniques (NB, Bagging, RF, SVM, KNN, Boosting, and J48). For the evaluation, these models employed the leave-one-subject-out (LOSO) cross-validation method. The experimental approach achieved an accuracy of 96%, revealing encouraging findings.
A heart failure risk assessment prediction model was proposed by Aljaaf et al. using the C4.5 classifier on multiple levels, where heart failure is categorized into five heart failure risk levels, namely, high-risk, low-risk, extreme-risk, moderate-risk, and no-risk [
22]. This study made use of heart disease information from the Cleveland Clinic Foundation. Three supplementary features (physical activity, smoking, and obesity) that significantly increase the risk of developing heart failure were added to the dataset as part of the study's focus on improving the dataset's features. Ten-fold cross-validation was employed to measure performance. With 95.5% specificity, 86.5% sensitivity, and 86.53% accuracy, the prediction model outperformed several other models.
By integrating the computing capacity of several ML techniques, such as RF, KNN, DT, and SVM, and using the Cleveland heart disease dataset, Srivastava et al. proposed a novel way to diagnose heart disease [
23]. The result showed that KNN provided the maximum accuracy of any method. After training, the model was used to create packages that could be uploaded to a web server, and a web interface was created that allowed users to enter attributes and see the results.
Awan et al. proposed a model using the UCI dataset for the prediction of heart disease, in which the KDD (knowledge discovery in databases) model and neural network approaches were used and the results before and after principal component analysis (PCA) were compared [
24]. The data were pre-processed with the PCA algorithm before being input into the ANN, which increased the accuracy rate from 94.7% to 97.7%.
A unique approach called Hybrid Random Forest with Linear Model (HRFLM) was presented by Mohan et al. in [
25] for accurate heart disease prediction utilizing heart rate time series. The dataset was gathered from the Cleveland UCI repository. The HRFLM method bases feature selection on DT entropy, and feature selection and modeling are repeated for different combinations of attributes to maximize performance in cardiovascular disease prediction. Three heart disease prediction models (RF, linear model (LM), and DT) were employed as classifiers, the characteristics of the hybrid LM and RF techniques were combined, and a confusion matrix was used for evaluation. Compared with other approaches, the HRFLM classification method achieved the highest accuracy.
Shaji et al. used data mining technologies to predict heart disease and enhance diagnostic performance [
26]. The data were gathered from Jubilee Mission Hospital in Thrissur by interacting with patients and collecting information from their discharge summaries. Twenty attributes were collected from roughly 2200 patients in total and entered into an Excel file. Classification was then performed using data mining techniques including ANN, RF, and KNN, of which ANN delivered the best outcome and maximum accuracy.
To predict cardiac illness, Sharanyaa et al. proposed a hybrid strategy that combined the virtues of fuzzy logic and the KNN algorithm [
27]. The classification methods operated on clinical data values, and the algorithms KNN, SVM, DT, and RF were used to forecast heart disease. A total of 13 features and four classifiers were used to predict the disease and maximize model accuracy. Compared with the other methods, the KNN classifier achieved the highest accuracy.
Singh et al. used the Cleveland and Framingham datasets to build a heart disease prediction model with increased accuracy [
28]. Random forest was combined with logistic regression, since the combination yields a robust model with superior accuracy. After comparing the outcomes of different ML and DL models, a hybrid technique based on a weighted average was shown to aid heart disease prediction and improve performance.
To improve the accuracy of predictions, Fitriyani et al. developed a system that supports clinical decisions by integrating SMOTE-ENN, density-based spatial clustering of applications with noise (DBSCAN), and an XGBoost-based machine learning algorithm (MLA) [
29]. This system can be utilized to diagnose an individual's heart condition earlier. The outlier data were detected and removed using DBSCAN, the imbalanced training dataset was balanced using SMOTE-ENN, and the prediction model was trained using XGBoost. The model was developed using Statlog and Cleveland, two publicly accessible datasets. A performance comparison with other classification models and past research showed that the proposed method outperformed them.
Using nine classical models (Gaussian naive Bayes (G-NB), adaptive boosting (AdaBoost), gradient boosting (GBM), SVM, extra trees classifier (ETC), SGD classifier, DT, LR, and RF), Ishaq et al. showed that effective data mining approaches applied to significant features can increase prediction accuracy for cardiovascular patients [
30]. To address the class imbalance, they used SMOTE, and RF was employed to select the highest-ranking features used to train the machine learning models.
Alaa Khaleel Faieq et al. [
31] employed ML techniques (SVM and ANN) for the prediction of heart disease using a UCI ML repository database containing the medical information of 170 subjects. SVM outperformed ANN, achieving an accuracy of 89.10%.
Abdul Saboor et al. [
32] proposed a machine learning approach comparing different state-of-the-art models (RF, XGBoost, CART, SVM, multinomial naïve Bayes (MNB), LR, linear discriminant analysis (LDA), AdaBoost classifier (AB), and ET) for heart disease prediction. The authors used the Cleveland heart disease dataset for testing.
The proposed SVM outperformed the other classifiers. Gunjan Gupta et al. [
33] designed a KNN model for heart disease prediction using a dataset collected from Kaggle and compared it with different classifiers, such as DT, ANN, NB, and RF. The comparison showed that, with k = 5, the proposed KNN model had the highest accuracy.
Umarani Nagavelli et al. [
34] proposed a tool using XGBoost to improve the accuracy of heart disease diagnosis and compared four types of machine learning (ML) models; their results showed that the XGBoost-based heart disease detection achieved the highest accuracy.
Using an RF algorithm, Vien T. Truong et al. [
35] diagnosed congenital heart disease (CHD) using postnatal echocardiography and demonstrated greater sensitivity in prenatal CHD screening with excellent performance.
All previous research has focused on distinguishing heart failure from non-heart failure using various features, datasets, and methods. The comparative analysis of earlier work in this study helps identify the strengths and weaknesses of previously proposed ML techniques for the diagnosis of heart disease. A summary of the related work is presented in
Table 2. Most of the previous research used small datasets (fewer than 300 samples), which resulted in few observations in the test data and could lead to overfitting. Several optimization techniques have been used in these works to enhance measures such as precision, recall, and accuracy. One of the primary aims of our work is to evaluate various ML algorithms to determine which method is best for CHF prediction; we anticipate that our prediction model will perform better than the approaches used in earlier research and state-of-the-art models.
7. Experimental Results Analysis
The experimental methodology and findings from all of the CHF prediction experiments are discussed in this section. The dataset included information regarding lifestyle, clinical, and bodily characteristic attributes; some of these characteristics, such as gender, smoking, diabetes, and blood pressure, are binary. After pre-processing, the dataset consisted of the 12 best features and no missing values. The pre-processed dataset was used to train the machine learning models, which were then evaluated using precision, sensitivity, specificity, accuracy, F1-score, and Matthews correlation coefficient (MCC).
Figure 1 represents the proposed methodology’s flowchart.
7.1. Experimental Design
To evaluate the performance of the models, we employed supervised machine learning techniques. To prevent overfitting, the dataset was split into a 70:30 ratio between the training set and the testing set. This ratio has been employed in various works in the literature for classification tasks [
56]. Different evaluation metrics were employed to assess the machine learning classifiers' performance. Hardware details are shown in
Table 5.
7.2. Experimental Results
Early identification of CHF is crucial for individuals at risk in order to prevent the worsening of symptoms and minimize associated risks. This study proposed an integrated automatic CHF diagnosis method using the CHS dataset. In this experiment, machine learning classifiers were employed to evaluate the significant features selected by the C4.5 algorithm; eliminating the least significant features during training and testing improved classifier performance. After most of the noise had been removed from the dataset through data cleaning, feature selection, and missing data imputation, the DNN was applied and, compared with the other ML models, showed an impressive improvement in prediction accuracy. For comparison, we used six MLAs (DT, SVM, KNN, RF, NB, and LR) that are often employed in research and are renowned for their accuracy and performance. The F1-scores of all the classifiers are shown in
Figure 4. To validate every model, we carried out 10-fold cross-validation on the training set and assessed model performance on the testing set. The choice of 10-fold cross-validation balanced statistical stability against practical constraints. With 10 folds, each iteration's smaller validation set yields a more stable and precise picture of how well the model performs across different subsets of the data, avoiding the higher variance associated with fewer folds and their larger validation sets. Conversely, increasing the number of folds improves statistical dependability but raises processing requirements, and very high fold counts may be infeasible because of long processing times, particularly with large datasets or complex models. Ten folds therefore offered a robust evaluation of model performance while satisfying both statistical rigor and practical limits on computing power. To evaluate the performance of the proposed CHF prediction model, we used seven evaluation parameters: precision, accuracy, FPR, TNR, sensitivity/recall, F1-score, and MCC (a sketch of this evaluation procedure is given below).
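The following hedged sketch shows how this evaluation procedure could be implemented: stratified 10-fold cross-validation on the training set and computation of the seven metrics on the held-out test set. The classifier and variable names are placeholders, and TNR/FPR are derived from the confusion matrix because they are not built-in sklearn scorers.

```python
# Hedged sketch of the validation procedure described above.
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, matthews_corrcoef, confusion_matrix)

def evaluate(model, X_train, y_train, X_test, y_test):
    # 10-fold cross-validation on the training set.
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
    cv_acc = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")
    print(f"10-fold CV accuracy: {cv_acc.mean():.3f} +/- {cv_acc.std():.3f}")

    # Final evaluation on the held-out 30% test set.
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    tn, fp, fn, tp = confusion_matrix(y_test, pred).ravel()
    metrics = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "sensitivity/recall": recall_score(y_test, pred),
        "TNR (specificity)": tn / (tn + fp),
        "FPR": fp / (fp + tn),
        "F1-score": f1_score(y_test, pred),
        "MCC": matthews_corrcoef(y_test, pred),
    }
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")

# Example: evaluate(model, X_tr, y_tr, X_te, y_te) for any classifier with fit/predict.
```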
7.3. Comparison and Performance Analysis
The performance outcomes showed that the proposed method outperformed the other six models, obtaining an F1-score, accuracy, sensitivity, and precision of up to 97.03%, 95.30%, 96.49%, and 97.58%, respectively. The proposed model also had the highest TNR and the lowest FPR in comparison with the other ML models (see
Table 6). The proposed methodology achieved a TNR of 90.63% and an FPR of 9.38%. This high TNR and low FPR demonstrate the DNN's capability to improve prediction accuracy and decrease the miss rate for both positive and negative subjects. In terms of MCC, the DNN achieved the maximum value of 85.75%, which confirms the efficiency of our proposed model over other state-of-the-art models. The overall performance outcomes are displayed in
Table 6, which also reveals that the results of the SVM, KNN, NB, and DT models are low when compared with other classifiers like RF, LR, and DNN. The same can also be observed from the graph in
Figure 4. Nevertheless, this overall analysis (as shown in
Table 6) cannot be utilized as the primary evidence to make conclusions about the efficiency of the provided prediction models. However, it can be used to compare the proposed methodologies in general. The obtained confusion matrix for the training set, testing set, and ROC of the model is shown in
Figure 5,
Figure 6, and
Figure 7, respectively.
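As a minor illustration, the ROC curve of Figure 7 could be produced from a fitted classifier's predicted probabilities on the test set as sketched below; this assumes a classifier exposing predict_proba, and the names and plotting details are assumptions.

```python
# Minimal sketch (assumed naming) of plotting an ROC curve for the test set.
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

def plot_roc(model, X_test, y_test):
    # Probability of the positive (CHF) class.
    scores = model.predict_proba(X_test)[:, 1]
    fpr, tpr, _ = roc_curve(y_test, scores)
    auc = roc_auc_score(y_test, scores)
    plt.plot(fpr, tpr, label=f"AUC = {auc:.3f}")
    plt.plot([0, 1], [0, 1], linestyle="--", color="grey")  # chance level
    plt.xlabel("False positive rate")
    plt.ylabel("True positive rate")
    plt.legend()
    plt.show()
```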
7.4. Discussions
The goal of this study was to create an effective ML pipeline that can predict CHF with reasonable explanations. A secondary objective was to uncover useful risk factors that contribute considerably to the classification output, in order to reduce the diagnosis cost and enhance performance. The feature selection step of the proposed method can substantially affect the classification performance of feature-based ML methods, and within each classification model our approach ultimately employed a smaller feature subset. The results indicate that a limited set of clinical, biochemical, and demographic parameters is sufficient for establishing a diagnosis of CHF, resulting in reduced time and cost demands. In this study, we employed simple DL models for the prediction of CHF. The deliberate decision to use an imbalanced dataset, consisting of 2043 instances of CHF patients and only 580 examples of no CHF, was based on several considerations. First, the occurrence of CHF in real-world populations is generally lower than that of non-CHF instances; the dataset we used reflects an inherent class imbalance, keeping our study in line with the skewed distribution of CHF occurrences. This choice was important for the clinical significance of our prediction model, as accurately identifying positive cases is of particular importance in healthcare applications. We recognized the biases that can arise from an unequal class distribution and implemented strategies to mitigate them. The study relied on performance indicators that are less sensitive to imbalance, such as the F1-score, recall, and precision; these measures offer a more thorough assessment of the model's performance when class distributions are skewed. We also performed sensitivity analyses to examine how the model's predictions change as the class imbalance varies; these analyses indicate that our method is robust across a range of challenging settings. In addition, we compared our model's performance with that of baseline models and other methodologies, and this comparison demonstrates that our chosen methodology successfully utilized the imbalanced data to obtain better outcomes. Ultimately, the utilization of an imbalanced dataset was a deliberate and defensible decision in our research: our methodology ensures that the predictive model is of medical importance and capable of addressing the natural challenges arising from imbalanced data. We welcome further input to improve the strength and clarity of our research.
The selection of a feature subset was based on a rigorous procedure aimed at maximizing the model's performance and improving its interpretability. When predicting CHF, the objective is to find factors that have a substantial impact on the prediction and offer distinct information, enabling a more detailed understanding of the patterns linked to CHF. We selected the 12 best attributes (shown in
Table 3) that showed a high correlation with the class. We analyzed and experimented with various attribute sets (from the top 8 attributes to the top 25 attributes), and the set with the top 12 attributes gave the best result (a sketch of this sweep is given after this paragraph). These features show a strong correlation with CHF, encompassing crucial medical data and risk factors that are recognized as key contributors to the development of CHF. The choice of features was also informed by past research and domain knowledge: parameters that earlier research consistently identifies as good predictors of CHF are given more weight among the selected features. This keeps the model consistent with existing knowledge about the causes of CHF and makes the predictive model more valid and useful in healthcare environments. Furthermore, choosing a limited set of 12 attributes reflects a deliberate approach to reducing dimensionality. A smaller group of attributes helps avoid overfitting, especially when working with small datasets, makes the model simpler and faster to compute, and reduces its susceptibility to capturing irrelevant patterns or noise. It also simplifies the process of converting model predictions into practical information for medical professionals, enhancing the usefulness of the predictions in real-life situations.
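A minimal sketch of this attribute-set sweep is given below, assuming features are ranked by mutual information as a stand-in for the C4.5 gain ratio and scored by 10-fold cross-validation with a simple classifier; the ranking criterion, scoring model, and names are illustrative assumptions.

```python
# Hedged sketch of the top-k attribute sweep (k = 8..25) described above.
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.linear_model import LogisticRegression

def sweep_top_k(X_train, y_train, k_range=range(8, 26)):
    # Rank features (best first); mutual information stands in for the gain ratio.
    scores = mutual_info_classif(X_train, y_train, random_state=42)
    ranked = np.argsort(scores)[::-1]
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
    best_k, best_acc = None, -np.inf
    for k in k_range:
        cols = ranked[:k]
        acc = cross_val_score(LogisticRegression(max_iter=1000),
                              X_train[:, cols], y_train, cv=cv).mean()
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k, ranked[:best_k]
```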
Using several machine learning algorithms and the CHS dataset, the current work contributes significantly to the development of such a method for the prediction of CHF. Existing models have experimented with various heart disease datasets, and since there have been few studies on the CHS dataset, we chose these models for comparison. Comparing against these models offered a good opportunity for a quantitative assessment of our proposed model, even though different heart disease datasets comprise distinct sets of attributes. Our approach cannot be directly compared with those using cardiac sound recordings [
19] or heart sound characteristics [
21], but it may be contrasted with those using datasets similar to ours [
18,
22,
23,
24,
25,
29,
30,
31,
32,
33]. Regarding the related work, many studies [
18,
22,
23,
24,
25,
29,
30,
31,
32,
57] have conducted their research on the same dataset (i.e., the UCI Cleveland heart disease dataset) with different approaches, reporting accuracies ranging from 85.81% to 97.70%. We therefore applied most of these previously used approaches to our dataset and compared their performance, as shown in
Table 6. The results obtained for CHF prediction show that the DNN technique surpassed all other ML approaches in terms of accuracy (95.30%), sensitivity (96.49%), and specificity (90.63%).
In our study, we used a moderate amount of data, applied the same pre-processed data to all classification models, and employed a simple feed-forward DNN. Using a standardized, pre-processed dataset guaranteed an equitable and impartial comparison between methods by eliminating potential biases arising from differences in pre-processing. The simplicity of the DNN enhanced transparency and interpretability, enabling a clearer understanding of the learned patterns. Despite its simplicity, the DNN is well suited to training effectively on a moderate-sized dataset, making it feasible in settings where computational capability is a constraint, and its capacity to model non-linear relationships matches the complexity of patterns that conventional methods may not adequately represent. Moreover, keeping model complexity comparable across algorithms improves the comprehensibility of the outcomes and enables a focused examination. The feed-forward DNN's inherent simplicity also facilitates replication and interpretation, enhancing the reproducibility and transparency of the study. Overall, these qualities underline the feed-forward DNN's role in offering insight into its efficiency relative to other algorithms in a fair, understandable, and reliable way.
The efficacy of the proposed DNN architecture for diagnosing CHF lies in the interaction of several design decisions that collectively improve the model's capabilities. Incorporating the ReLU activation function into the hidden layers makes the model non-linear, providing a basis for the DNN to capture the complicated, non-linear correlations in the CHF data and to find trends that conventional linear models could overlook. The model consists of four hidden layers with a steadily decreasing number of nodes (50, 30, 20, 10); this depth enables it to systematically learn features at various levels of abstraction. The optimization approach is another crucial factor: we used SGD with a carefully adjusted learning rate of 0.001, which promotes stable convergence during training, a vital element in the model's capacity to efficiently acquire representative features. ReLU activation also introduces sparsity, improving computational efficiency and the model's ability to identify important features, which in turn reduces the risk of overfitting. Furthermore, the architecture's flexibility, which allows different node configurations in each layer, helps the model accommodate the many patterns found in CHF data and encapsulate the complex structure of medical data. The combination of adaptability, non-linearity, and depth therefore makes the DNN an effective tool for CHF prediction, offering benefits over the baseline models (a sketch of this architecture is given below).
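A minimal sketch of this architecture is shown below, assuming a Keras implementation with a sigmoid output layer and binary cross-entropy loss; the framework, output layer, loss, and training schedule are not specified above and are assumptions.

```python
# Hedged sketch of the described feed-forward DNN: four hidden ReLU layers of
# 50, 30, 20, and 10 nodes, trained with SGD at a learning rate of 0.001.
from tensorflow import keras
from tensorflow.keras import layers

def build_chf_dnn(n_features):
    model = keras.Sequential([
        layers.Input(shape=(n_features,)),
        layers.Dense(50, activation="relu"),
        layers.Dense(30, activation="relu"),
        layers.Dense(20, activation="relu"),
        layers.Dense(10, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # binary CHF / no-CHF output (assumed)
    ])
    model.compile(
        optimizer=keras.optimizers.SGD(learning_rate=0.001),  # as described above
        loss="binary_crossentropy",  # assumed loss for binary classification
        metrics=["accuracy"],
    )
    return model

# Example usage (assumed training schedule):
# model = build_chf_dnn(n_features=12)
# model.fit(X_train, y_train, validation_split=0.1, epochs=100, batch_size=32)
```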
7.5. Limitations
One of the constraints of our work is the lack of appropriate input data and the strong association among features: the majority of the features in the EHR dataset do not add new information to the original feature space because of their high correlation with one another. As a result of the sparse data with high levels of inconsistency, we were unable to train more complex deep learning architectures, which can be more effective than traditional ML methods. Additionally, the way medical institutions store their information increases the effort required of researchers, since the records contain many attributes that have nothing to do with CHF. A significant amount of processing time could therefore be saved if institutions maintained patient records containing only the attributes essential for CHF prediction instead of storing and preserving everything. In our future work, we intend to independently validate the recommended method and further examine its efficacy before using deeper DL frameworks, and to enhance the performance of the DL and ML approaches by collecting additional characteristics and parameters.