1. Introduction
Breast cancer is a frequent and dangerous disease, affecting around 36% of females. It is the most prevalent cancer in women and the second-leading cause of cancer-related mortality, after lung cancer. According to the World Health Organization (WHO), breast cancer is diagnosed in more than 2 million women each year. No other illness accounts for as many disability-adjusted life years (DALYs) lost among women worldwide. Breast cancer can affect women of any age after puberty, although the risk increases later in life. Research projects more than 3 million new cases of breast cancer per year by 2040, a 40% increase over 2020, along with more than 1 million fatalities per year, a 50% increase over 2020 [1,2].
A core task in chronic illness research is image classification with deep learning, in which several target categories are defined and a model is trained to recognize each type [3]. Early diagnosis of breast cancer is the best way to prevent death and, ultimately, to lower healthcare costs. Technologies for the early detection and diagnosis of breast cancer are constantly evolving to provide patients with less intrusive procedures and more precise diagnoses. Mammography is the key factor in reducing breast cancer mortality [4,5]. Owing to the rapid advancement of smart medical equipment, the Internet of Things (IoT) offers a wide variety of applications in the healthcare industry. However, these systems are built on a centralized connection to Cloud servers, which increases security and privacy risks. Fog computing, a distributed computing architecture introduced by Cisco, extends Cloud computing to the edge of the network and keeps data computation, processing, and applications between the Cloud and the network's edge [6]. Fog computing also revives long-standing problems such as host node mobility, data center exchange, data security, and robustness. Additionally, in today's healthcare system, elderly, ill, or physically challenged people increasingly need a trustworthy continuous health monitoring system. Several research studies indicate that remote health monitoring systems enable healthcare professionals to monitor patients' health efficiently and quickly [7,8].
Figure 1 shows the Fog–Cloud-based distribution of IoT healthcare systems.
Despite the abundance of literature on breast cancer diagnosis and classification, relatively little work has addressed the classification of breast cancer from mammography images combined with remote, instantaneous diagnosis based on Fog computing concepts. Image data pre-processing and segmentation approaches are important for breast cancer detection, and ensemble deep learning (EDL) models have performed well in a wide variety of other contexts. In this study, we used publicly available mammography images from the Mammographic Image Analysis Society (MIAS) repository to train a novel transfer deep learning (TDL)-enabled automated assistance system for breast cancer diagnosis and classification, integrated with a Fog computing strategy. The data were pre-processed before being provided to the model. Transfer learning (TL) techniques, namely GoogleNet, ResNet50, ResNet101, InceptionV3, AlexNet, VGG16, and VGG19, were integrated with convolutional neural networks (CNNs), a common deep learning (DL) methodology. These TDLs were combined with principal component analysis (PCA) for feature reduction and a support vector machine (SVM) classifier to further enhance their predictive capability. We ran comprehensive simulations to test how well the proposed model works in practice.
The important contributions of this study are listed below:
Evaluation of the effectiveness of the proposed system in several scenarios, including predictive analysis, network capacity, low latency, bandwidth, secrecy, integrity, and protection;
Automated remote diagnosis of benign and malignant breast cancer;
Design of a TDL-based algorithm to analyze mammograms for the early detection of breast cancer;
Deployment of an IoT-based healthcare monitoring system utilizing Fog computing for real-time analysis.
The remainder of this manuscript is structured as follows. Section 2 reviews the pertinent work. Section 3 describes the methods and datasets employed in this study. Section 4 discusses the experimental setup and the detailed architectural design of the proposed work. Section 5 presents the experiments and a comparative analysis. Finally, Section 6 draws conclusions and provides recommendations regarding the proposed approach.
2. Existing Works
Khan et al. [9] examined the effectiveness of six methods for extracting directional features for mass classification in digital mammograms. The feature extraction methods were assessed using ROIs extracted from the MIAS database. The resulting imbalanced datasets were effectively classified using the Successive Enhancement Learning-based weighted support vector machine (SELwSVM).
Hepsag et al. [10] used DL with CNNs to categorize anomalies in mammography images as benign or malignant using two distinct databases, mini-MIAS and BCDR, obtaining accuracy, precision, recall, and f-score values between roughly 60% and 72%. The authors applied pre-processing techniques, including cropping, augmentation, and balancing of the image data, to improve their results.
To help doctors make an early diagnosis, Ting et al. [11] introduced CNN Improvement for Breast Cancer Classification (CNNI-BCC). Their studies found that CNNI-BCC can accurately classify incoming medical images of patients as benign, malignant, or normal.
Mohanty et al. [12] developed a hybrid CAD framework to classify suspicious areas as normal, abnormal, benign, or malignant. The proposed framework consisted of four computational elements: ROI generation using a cropping operation; texture feature extraction using the contourlet transform; a wrapper-based feature selection algorithm, the forest optimization algorithm (FOA), to select the best features; and a set of classifiers, such as SVM, KNN, NB, and C4.5, to categorize inputs as normal, abnormal, benign, or malignant on MIAS and DDSM.
Abd-Elmegid [13] presented a Fog computing-based architecture for predicting breast cancer prognosis. The proposed architecture employed the BCOAP model as a prediction tool and addressed the challenge of handling massive volumes of data in real time, without overloading the Cloud's data center, by delegating these tasks to Fog nodes.
Chougrad et al. [14] proposed a tailored label selection system that calculates the best confidence level for each visual concept. The authors used the CBIS-DDSM, BCDR, INBreast, and MIAS benchmark datasets to show the efficacy of their methodology and compared the findings with those of other widely used baselines.
Xu et al. [15] utilized the Fuzzy c-means clustering approach in their enhanced semi-supervised tumor identification system, which builds a pathological degree tree based on ten 3D and 2D tumor variables. Their Fog computing design also distributed a significant amount of complex data processing over the medical CT images of 143 individuals containing 452 tumors.
Zhu et al. [16] used DL approaches to build a technique for enhancing low-dose mammography image quality. Their CNN model focused on lowering the noise in low-dose mammography. After training, it can produce a good-quality image from a low-dose mammogram, as verified on datasets collected from TCIA.
Rajan et al. [17] proposed a unique technique that uses a DCNN and a modified vesselness assessment to detect the structure of the oral cancer region. During classification, each connected component is examined independently by the trained DCNN using the feature vector values specific to that area; the technique achieved a sensitivity of 92% and an accuracy of 96.8% on a training set of 1500 images.
Ragab et al. [18] developed a CAD system based on feature extraction and classification using DL methods to assist radiologists in identifying breast cancer lesions in mammograms. Four experiments were conducted to determine the best course of action. The first used end-to-end pre-trained, fine-tuned DCNN networks. In the second, the deep features extracted from the DCNNs were fed to an SVM classifier with multiple kernel functions. In the third, deep feature fusion was carried out to show that merging deep features improves the accuracy of the SVM classifiers. In the fourth, PCA was applied to compress the enormous feature vector created by feature fusion and to lower the computational cost.
Saber et al. [19] designed a special DL model based on the TL methodology to effectively aid in the automatic identification and diagnosis of suspicious breast cancer regions, evaluated using two approaches, namely an 80-20 split and cross-validation. Features were extracted from the MIAS dataset using pre-trained CNN architectures such as Inception V3, ResNet50, VGG-19, VGG-16, and Inception-V2.
Allugunti [20] presented a CAD approach based on classifiers such as CNN, SVM, and RF for identifying patients and classifying them into three categories (cancer, no cancer, and non-cancerous) under the supervision of a database. The authors examined the effects of pre-processing mammography images, which increases classification accuracy.
Rehman et al. [21] introduced the FC-DSCNN CAD system for identifying microcalcification (MC) clusters in mammograms and classifying them into malignant and benign categories. The computer vision pipeline quickly distinguishes MC objects in mammograms while automatically reducing noise and background color contrast, enhancing the classification performance of the neural network.
Canatalay et al. [22] offered three standard techniques employing TL approaches to identify and categorize breast cancer in breast X-ray images within a DL framework. The proposed approach can quickly and accurately identify a mass area in an X-ray image as benign or cancerous. The proposed model was examined using X-ray images from the Cancer Imaging Archive (CIA) repository. TL was used to increase prediction accuracy, and extensive simulations were run to evaluate the performance of the proposed model.
Zhu et al. [23] designed a unique Edge–Fog computing framework based on an ensemble ML approach. The proposed architecture provides healthcare with a Fog system that manages data from many sources to properly treat disorders, employing automated glioma diagnosis in real-world settings. The framework was designed to work under various operational conditions, such as different Edge–Fog scenarios, user needs, and service quality, precision, and prediction accuracy requirements. The efficiency of the proposed model was evaluated in terms of power consumption, latency, accuracy, and execution time.
Kavitha et al. [24] presented Optimal Multi-Level Thresholding-based Segmentation with a DL-enabled Capsule Network (OMLTS-DLCN), a new digital mammogram-based breast cancer screening model. This approach uses adaptive fuzzy-based median filtering (AFF) during pre-processing to minimize mammography image noise, and Optimal Kapur's Multilevel Thresholding with Shell Game Optimization (OKMT-SGO) to segment breast cancer. The proposed technique recognizes breast cancer using a CapsNet-based feature extractor and a Back-Propagation Neural Network (BPNN) classification model.
Jasti et al. [25] reported an evolutionary method for identifying and diagnosing breast cancer based on ML and image processing. The model integrates feature extraction, feature selection, and machine learning approaches to categorize and identify the disease. Image quality is improved using the geometric mean filter, features are extracted using AlexNet, and features are selected using the relief approach. The model uses ML methods including LS-SVM, KNN, RF, and NB for disease classification and detection.
Nasir et al. [26] developed a DL-based automated detection method based on whole slide images (WSIs) to automatically diagnose osteosarcoma, achieving an accuracy of up to 99.3%. The strategy uses blockchain technology to guarantee the confidentiality and accuracy of patient data, and it increases efficiency and reduces the burden on centralized servers by utilizing Edge and Fog computing technologies.
5. Simulations and Results
A significant portion of any proposed research involves the empirical examination of the outcomes. Performance metrics are computed from a confusion matrix of actual versus predicted classes, whose entries are abbreviated as TP and FP for true and false positives, and TN and FN for true and false negatives. Several performance metrics were used in this study: Acc, MCR, Pre, Sen, Spc, F1S, and MCC, denoting accuracy, misclassification rate, precision, sensitivity, specificity, f1-score, and Matthews correlation coefficient, respectively. These metrics can be expressed as Equations (6)–(12) [41,42,43].
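Since these seven metrics are conventional, the following LaTeX block sketches their standard definitions; it assumes the usual textbook formulas rather than reproducing the manuscript's exact typesetting of Equations (6)–(12).

```latex
\begin{align}
\mathrm{Acc} &= \frac{T_P + T_N}{T_P + T_N + F_P + F_N} \\
\mathrm{MCR} &= 1 - \mathrm{Acc} \\
\mathrm{Pre} &= \frac{T_P}{T_P + F_P} \\
\mathrm{Sen} &= \frac{T_P}{T_P + F_N} \\
\mathrm{Spc} &= \frac{T_N}{T_N + F_P} \\
\mathrm{F1S} &= \frac{2 \cdot \mathrm{Pre} \cdot \mathrm{Sen}}{\mathrm{Pre} + \mathrm{Sen}} \\
\mathrm{MCC} &= \frac{T_P T_N - F_P F_N}{\sqrt{(T_P + F_P)(T_P + F_N)(T_N + F_P)(T_N + F_N)}}
\end{align}
```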
In this work, after image data pre-processing, deep features were extracted by a DCNN with seven TL approaches: InceptionV3, GoogleNet, AlexNet, VGG16, VGG19, ResNet50, and ResNet101. Feature dimensions were then reduced using PCA, and the final features were classified by an SVM with three different kernels: linear, polynomial, and RBF. Twenty-one trained TDL models (TTMs) were constructed from these combinations: TTM-1 to TTM-3 pair InceptionV3 with the linear, polynomial, and RBF kernels, respectively; TTM-4 to TTM-6 use GoogleNet; TTM-7 to TTM-9 use AlexNet; TTM-10 to TTM-12 use VGG16; TTM-13 to TTM-15 use VGG19; TTM-16 to TTM-18 use ResNet50; and TTM-19 to TTM-21 use ResNet101, each with the same kernel ordering. A sketch of one such pipeline is given below.
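As an illustration, the following Python sketch shows the TTM-12 pipeline (VGG16 + PCA + SVM-RBF), assuming standard Keras and scikit-learn APIs; the placeholder data, PCA variance threshold, and 80-20 split are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of the TTM-12 pipeline (VGG16 features -> PCA -> SVM-RBF).
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def extract_deep_features(images: np.ndarray) -> np.ndarray:
    """Extract deep features with an ImageNet-pretrained VGG16 backbone."""
    backbone = VGG16(weights="imagenet", include_top=False, pooling="avg",
                     input_shape=(224, 224, 3))
    return backbone.predict(preprocess_input(images), verbose=0)

# Placeholder tensors; in practice, load pre-processed MIAS mammogram patches.
X = (np.random.rand(40, 224, 224, 3) * 255.0).astype("float32")
y = np.tile([0, 1], 20)  # 0 = normal, 1 = abnormal

features = extract_deep_features(X)
X_tr, X_te, y_tr, y_te = train_test_split(
    features, y, test_size=0.2, stratify=y, random_state=42)

# Reduce feature dimensionality with PCA, then classify with an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), PCA(n_components=0.95), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
print("Accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```

Swapping the backbone (e.g., ResNet50) or the SVM kernel reproduces the other twenty TTM variants.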
Table 4 displays the results (in %) obtained by the various proposed TTMs. As shown in Figure 5, TTM-12 achieved the highest classification accuracy of 99.01%, outperforming the other twenty hybrid approaches. This approach also outperformed all others in terms of precision, sensitivity, specificity, f1-score, and MCC.
The computing strategy or coordination level utilized by a Fog-enabled IoT application has a large impact on network characteristics. Several network metrics were used to validate the proposed work and demonstrate the importance of enabling IoT with Fog computing: latency, arbitration time, total processing time, jitter, network consumption, and energy consumption. Six configurations were employed to evaluate these metrics: Specification-1 (Master PC alone), Specification-2 (Master PC with one Fog Worker node), Specification-3 (Master PC with two Fog Worker nodes), Specification-4 (Master PC with three Fog Worker nodes), Specification-5 (Master PC with four Fog Worker nodes), and Specification-6 (Cloud node only).
Latency refers to the time data spend in transit over a network: the time it takes to gather, transport, process, and receive a data packet. Figure 6 illustrates the differences in latency, computed by combining transmission time and queuing delay. Because interactions use only single-hop data transfers, the latency is almost identical whether a job is submitted to the MP or to the FW nodes. In a Cloud architecture, by contrast, multi-hop data transport outside the network results in considerably higher latency. The arbitration time refers to the MP's reaction time to GTs, depending on how the network is configured. Figure 6 shows the arbitration time for various levels of Fog architecture. Arbitration is less likely to take place when assignments are given directly to the Master PC or Cloud nodes; otherwise, time is needed to distribute the load evenly among the nodes, which reduces the arbitration rate. Figure 6 provides a comparative analysis of latency and arbitration time for the various specifications.
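To make the single-hop versus multi-hop distinction concrete, the toy Python model below sums per-hop transmission and queuing delays; the hop counts and delay values are purely illustrative assumptions, not measurements from the testbed.

```python
# Toy latency model: latency = hops * (transmission time + queuing delay).
# Hop counts and per-hop delays are illustrative assumptions only.

def latency_ms(hops: int, transmission_ms: float, queuing_ms: float) -> float:
    """Total one-way latency across a path of identical hops."""
    return hops * (transmission_ms + queuing_ms)

# Fog (MP or FW node): single hop; Cloud: multi-hop path out of the local network.
fog_latency = latency_ms(hops=1, transmission_ms=2.0, queuing_ms=1.5)
cloud_latency = latency_ms(hops=6, transmission_ms=2.0, queuing_ms=1.5)
print(f"Fog (1 hop): {fog_latency:.1f} ms, Cloud (6 hops): {cloud_latency:.1f} ms")
```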
The processing time is the time needed to start, finish, and deliver work to the users. Figure 7 depicts the processing characteristics under various Fog circumstances. One apparent observation is that employing Cloud communications significantly reduces the overall processing time. The difference in response times between task requests is known as jitter, which matters for many practical applications, including data analysis in e-Healthcare systems. Figure 7 shows the jitter fluctuation for the various settings. Because the MP handles resource management, security checking, and arbitration, jitter is larger in the MP-alone scenario than when tasks are dispersed to FW nodes, and it is significantly higher when jobs are sent to the Cloud. Figure 7 provides a comparative analysis of processing time and jitter for the various specifications.
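As a sketch of how jitter can be quantified from such measurements, the snippet below computes the mean absolute difference between consecutive response times; the sample values are hypothetical.

```python
# Jitter as the mean absolute difference between consecutive response times.
# The response-time samples (in ms) are hypothetical, for illustration only.
response_times = [12.1, 12.9, 11.8, 13.4, 12.2]

jitter = sum(abs(b - a) for a, b in zip(response_times, response_times[1:])) \
         / (len(response_times) - 1)
print(f"Average jitter: {jitter:.2f} ms")
```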
Fog computing consumes less network than a Cloud computing system. Network consumption is affected both by the number of FW nodes and by where the tasks run, whether on MP-alone, FW, or CN nodes. Because the Fog environment limits the number of user requests sent to the Cloud, as shown in Figure 8, the network usage time for MP and/or FW nodes is significantly lower than that for CN nodes. Energy consumption is the total amount of energy the system requires, including its sensors and other components. As seen in Figure 8, a CN needs much more energy than an MP or an FW node; consequently, CN nodes consume far more energy than FW nodes. The energy consumed by the proposed work grows as the number of FW nodes increases. Figure 8 provides a comparative analysis of network and energy consumption for the various specifications. Based on the data gathered, Table 5 displays the averages of the observed results for the different network parameters across the various specifications.
A scalable infrastructure can add resources while retaining its limits to satisfy changing application needs. Our key concern is whether the system can scale as consumer demand grows over time, as indicated in Figure 9. With Specification-5 as the configuration, the average response time increases as the volume of requests rises. Notably, the average response time does not increase exponentially with the number of requests, demonstrating the framework's scalability.
The proposed framework, CanDiag, is compared with several existing studies in terms of various performance parameters. Table 6 compares the proposed work with state-of-the-art works based on DL and TL approaches and mammogram imaging datasets, considering the methodologies, the datasets employed, and performance parameters including accuracy, precision, sensitivity, specificity, f1-score, MCR, and MCC. The experimental outcomes show that the proposed work outperforms the prior works in some cases and falls short in others.
The proposed framework, CanDiag, was also compared with previous Fog computing-based studies on cancer diseases in terms of performance and network parameters. Network consumption, energy consumption, scalability, and related metrics were examined for the first time, underscoring the novelty of this work. Table 7 compares the proposed work with numerous key outcomes from related research. In Table 7, the abbreviations are Presence of Concepts (1), Jitter (JT), Energy Consumption (EC), Processing Time (PT), Network Consumption (NC), Arbitration Time (AT), Latency (LT), and Scalability (SB).
6. Conclusions and Future Scope
Fog computing with IoT applications has become increasingly crucial to making people's lives easier and better. Given the severity of the breast cancer situation, allowing patients to use IoT applications for remote self-diagnosis is beneficial. Conventional IoT deployments, however, rely solely on Cloud infrastructures for instantaneous data storage, scrutiny, and related tasks, which raises several concerns, including latency, network and energy consumption, and security and privacy. To solve these problems, Fog computing should be combined with IoT and Cloud computing. This study proposes using numerous TDL techniques along with PCA and SVM in a Fog-enabled system for the instantaneous diagnosis of breast cancer patients. The TTM models were built on a dataset of mammography images collected from the MIAS repository. Various performance and network parameters of this study were investigated. With an accuracy, MCR, precision, sensitivity, specificity, f1-score, and MCC of 99.01%, 0.99%, 98.89%, 99.86%, 95.85%, 99.37%, and 97.02%, respectively, the study on the dataset of mammography images categorized as normal and abnormal outperformed several earlier studies.
This proposed work can benefit individuals through the instantaneous remote self-diagnosis of breast cancer diseases. However, the work has some limitations, including the cost and difficulty of its full development and execution, which can be addressed in future works as summarized below: (i) applying these frameworks to various image-based datasets with multi-class variables; (ii) including LDA as an alternative to PCA for dimensionality reduction; (iii) extending this study to a variety of other chronic illnesses; (iv) employing alternative computing paradigms, including Edge computing, Mist computing, and Surge computing, to enhance the provided architecture; and (v) addressing the challenge of a single network platform, another area we should focus on in the future.