1. Introduction
The SARS-CoV-2 coronavirus was first discovered and reported in Wuhan, China, in 2019 and has spread globally, causing a health hazard [1, 2, 3]. On 30 January 2020, the World Health Organization labelled the outbreak a Public Health Emergency of International Concern, and on 11 March 2020, it was declared a pandemic. COVID-19 has varied effects on different people. The majority of infected patients experience mild to moderate symptoms and do not require hospitalization. Fever, exhaustion, cough, and a loss of taste or smell are all common COVID-19 symptoms [4]. Loss of smell, confusion, trouble breathing or shortness of breath, and chest discomfort are some of the major symptoms that lead to serious pneumonia in both lungs [1, 4, 5, 6]. COVID-19 pneumonia is a serious infection with a high mortality rate. The signs of a COVID-19 infection progressing into dangerous pneumonia include a fast pulse, dyspnea, confusion, rapid breathing, heavy sweating, and pulmonary embolism [7, 8]. It induces serious lung inflammation, as seen in lung microscopy [9]. It puts strain on the cells and tissue that cover the lungs’ air sacs. The oxygen for breathing is collected and supplied to the bloodstream through these sacs. Due to injury, tissue breaks off and blocks the lungs [10]. The sacs’ walls might thicken, making breathing extremely difficult.
The most prevalent method of diagnosing individuals with respiratory disorders is chest radiography imaging [11, 12, 13]. In the early stage of COVID-19, a chest radiograph may appear normal, but it gradually changes in a manner that may be associated with pneumonia or acute respiratory distress syndrome (ARDS) [11].
Figure 1 depicts the progression of chest X-ray images for a 45-year-old person infected with COVID-19. Roughly 15% of COVID-19 patients require hospitalization and oxygen therapy. Approximately 5% of people develop serious infections and require a ventilator.
During peak periods of infection transmission, securing enough oxygen and ventilators is also a major challenge for hospitals [
14,
15]. As a result, hospitals and medical practitioners are under a lot of stress trying to deal with critical patients who have been admitted to hospitals [
16]. They concentrate on providing good care to individuals who are hospitalized so that the mortality rate can be lowered, and the patients can recover quickly. However, hospitals’ capability to provide adequate treatments to hospitalized patients is sometimes limited by the availability of doctors and resources. In this scenario, a recommender system (RS) using machine learning (ML) approaches might be used to administer the best treatment while working with limited resources [
17,
18]. As the mortality rate and recovery rate of seriously hospitalized COVID-19 patients generally depend upon the amount of infection in the lungs [
19,
20,
21], the radiographic lung images of those patients can be used to recommend proper treatment in terms of a doctor, medicine, and other related resources.
From the perspective of the RS’s implementation, a new patient’s chest X-ray image is sent to the proposed system, which recommends doctors, medicines, and resources for that patient. The proposed system assumes that the database contains lung images of COVID-19 patients who were previously admitted to the hospital and recovered successfully, together with other information such as the name of the doctor assigned, the medicines given, and the resources provided, such as an intensive care unit (ICU), oxygen therapy, and ventilators. It uses a collaborative filtering method based on image similarity to find past COVID-19 patients who are similar to the new COVID-19 patient. The proposed approach uses convolutional neural networks (CNNs) for feature extraction [
22,
23,
24] from images and utilizes those feature vectors for similarity computation.
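The retrieval-and-recommendation step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature vectors are assumed to have been extracted by a CNN beforehand, cosine similarity stands in for whichever similarity measure is chosen, the record layout (`features`, `metadata`) and the `recommend` helper are hypothetical, and a majority vote over the top-k neighbours is an assumed aggregation rule.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(query_vec, database, k=5):
    """Recommend metadata for a new patient from the k most similar past patients.

    `database` is a list of records (hypothetical layout):
        {"features": [...], "metadata": {"doctor": ..., "medicine": ..., "resource": ...}}
    """
    # Rank past patients by similarity of their image features to the query.
    ranked = sorted(
        database,
        key=lambda rec: cosine_similarity(query_vec, rec["features"]),
        reverse=True,
    )
    top_k = ranked[:k]
    if not top_k:
        return {}

    # Assumed aggregation rule: majority vote per metadata field.
    recommendation = {}
    for field in ("doctor", "medicine", "resource"):
        votes = Counter(rec["metadata"][field] for rec in top_k)
        recommendation[field] = votes.most_common(1)[0][0]
    return recommendation
```

In this sketch the database of recovered patients plays the role of the "similar users" in collaborative filtering: the new patient inherits the treatment metadata that was most common among the nearest neighbours in feature space.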
The proposed collaborative RS uses image similarity to produce recommendations, as image similarity is a popular and efficient technique in image-based RS [
25,
26,
27]. Traditional image-based RS recommends images for a given input query image. The novelty of the proposed RS is that it recommends metadata such as doctors, medicines, and resources for a given input query such as a chest X-ray image. The proposed system is built around two hypotheses. The first hypothesis states that the proposed system’s performance depends on the feature extraction technique used by the CNN models. The second hypothesis states that the proposed system’s performance is also affected by the similarity measure used for similarity computation. Higher similarity assures a more accurate recommendation. The chest X-ray images are compared based on their feature vectors, which the CNN model extracts from the images. The combination of a robust search strategy and the best feature selection approach may make the RS more powerful for efficient and accurate recommendations.
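To make the second hypothesis concrete, the sketch below shows how different similarity measures can score the same pair of feature vectors differently. The formulas are assumptions chosen for illustration rather than the definitions used in this study: Euclidean similarity is taken as 1/(1 + distance), and Jaccard similarity is taken in its generalized form for non-negative real-valued vectors (sum of element-wise minima over sum of element-wise maxima). The Maxwell–Boltzmann similarity is omitted because its formula is not given in this section.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def euclidean_similarity(u, v):
    """Assumed mapping of Euclidean distance to a similarity in (0, 1]."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + dist)


def jaccard_similarity(u, v):
    """Generalized Jaccard for non-negative vectors: sum(min) / sum(max)."""
    numerator = sum(min(a, b) for a, b in zip(u, v))
    denominator = sum(max(a, b) for a, b in zip(u, v))
    # Two all-zero vectors are identical, so treat them as fully similar.
    return numerator / denominator if denominator > 0 else 1.0


if __name__ == "__main__":
    u, v = [2.0, 1.0, 0.0], [1.0, 1.0, 0.5]
    for name, fn in [("cosine", cosine_similarity),
                     ("euclidean", euclidean_similarity),
                     ("jaccard", jaccard_similarity)]:
        print(f"{name}: {fn(u, v):.4f}")
```

Because the measures live on different scales and respond differently to vector magnitude, the ranking of database images, and hence the recommendation, can change with the measure chosen.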
Figure 2 represents the global system representation of the proposed approach. The proposed system can be analyzed as a combination of an online and an offline system. The offline system is responsible for the feature extraction process from the images, and the online system handles the recommendation process. The web-based system is equipped with a performance module that calculates accuracy based on the known reference values in the test dataset.
In particular, the objectives of this article are: (i) to propose an efficient RS for COVID-19 based on chest X-ray images and to address the impact of an RS on the efficient handling of situations in hospitals during the peak period of a pandemic with limited resources; (ii) to use multiple CNN models to construct an RS using COVID-19 chest X-ray images; (iii) to propose a unique design by embedding four kinds of search paradigm in the CNN-based framework; (iv) to perform a comparative analysis of different similarity measures in the RS framework, providing metadata which includes doctors, medicines, and resources; (v) finally, to demonstrate the impact of an RS in the healthcare domain through improved services and efficient resource management.
The remainder of this study is organized as follows: Section 2 includes the discussion of RS, CNN, the feature extraction process, measures for similarity computation, and studies related to the proposed work. The proposed model is explained in Section 3, and the experimental evaluation is discussed in Section 4. Section 5 focuses on future scope, and the article is concluded in Section 7.
2. Background Literature
Many researchers have presented various models employing traditional machine learning approaches in the past for the identification of COVID-19 using radiography images [
2,
28]. Zimmerman et al. [
29] reviewed many cardiovascular uses of machine learning algorithms, as well as their applications to COVID-19 diagnosis and therapy. The authors in refs. [
30,
31] proposed image analysis tools to classify lung infection in COVID-19 based on chest X-ray images and claimed that artificial intelligence (AI) methods have the potential to improve diagnostic efficiency and accuracy when reading portable chest X-rays. In ref. [
19], the authors established an ensemble framework of five classifiers, namely K-nearest neighbors (KNN), naive Bayes, decision tree, support vector machines (SVM), and artificial neural network, for the detection of COVID-19 using chest X-ray images. Ref. [
32] describes a method for detecting SARS-CoV-2 precursor-miRNAs (pre-miRNAs) that aids in the identification of specific ribonucleic acids (RNAs). The method employs an artificial neural network and proposes a model with an estimated accuracy of 98.24%. The proposed method would be useful in identifying RNA target regions and improving recognition of the SARS-CoV-2 genome sequence in order to design oligonucleotide-based drugs against the virus’s genetic structure.
Due to the unprecedented benefits of a deep CNN in image processing, it has been successfully utilized by various researchers for the identification and accurate diagnosis of COVID-19. In ref. [
20], the authors proposed a deep learning (DL) model for the detection of COVID-19 by annotating computed tomography (CT) and X-ray chest images of patients. In ref. [
33], various DL models such as ResNet-152, VGG-16, ResNet-50, and DenseNet-121 were applied to radiographic medical images for the identification of COVID-19 and were compared and analyzed. To overcome the lack of information and enhance the training time, the authors also applied transfer learning (TL) techniques to the proposed system. A voting-based approach using DL for the identification of COVID-19 was proposed in ref. [
34]. The proposed method makes use of CT scan chest images of patients and utilizes a voting mechanism to classify a CT scan image of a new patient. Various DL algorithms for identifying COVID-19 infections from lung ultrasound imaging were reviewed and compared by the authors in ref. [
35]. The proposed method adopts four pre-trained DL models, namely InceptionV3, VGG-19, Xception, and ResNet50, for the classification of lung ultrasound images. In ref. [
36], the authors compared the results of using pre-trained CNNs with ML-based classification algorithms. The major purpose of this research was to see how CNN-extracted features affect the construction of COVID-19 and non-COVID-19 classifiers. The usefulness of DL algorithms for the detection of COVID-19 using chest X-ray images is demonstrated in ref. [
37]. The proposed approach was implemented using 15 different pre-trained CNN models, and VGG-19 showed a maximum classification accuracy of 89.3%. In ref. [
38], an object detection approach using DL for the identification of COVID-19 in chest X-ray images was presented. The suggested method claims a sensitivity of 94.92% and a specificity of 92%.
Many studies have also used image segmentation, image regrouping, and other hybrid techniques for accurate diagnosis of COVID-19 [
39]. In ref. [
40], the authors proposed an innovative model using multiple segmentation methods on CT scan chest images to determine the area of pulmonary parenchyma by identifying pulmonary infiltrates (PIs) and ground-glass opacity (GGO). In ref. [
41], the authors proposed a hybrid model for the detection of COVID-19 using feature extraction and image segmentation techniques to improve the classification accuracy in the detection of COVID-19. In ref. [
42], a hybrid approach of feature extraction and CNN on chest X-ray images for the detection of COVID-19 using a histogram-oriented gradient (HOG) algorithm and watershed segmentation methodology was proposed. This proposed hybrid technique showed satisfactory results in the detection of COVID-19 with an accuracy of 99.49%, sensitivity of 93.65%, and specificity of 95.7%. In ref. [
43], the authors came up with a new way to determine COVID-19 in images of chest X-rays using image segmentation and image regrouping. The proposed approach was found to outperform the existing models for the identification of COVID-19 in terms of classification accuracy with a lower amount of training data. In ref. [
44], the transfer learning technique was used in conjunction with image augmentation to train and validate several pretrained deep Convolutional Neural Networks (CNNs). The networks were trained to classify two different schemes: (i) normal and COVID-19 pneumonia and (ii) normal, viral, and COVID-19 pneumonia with and without image augmentation. The classification accuracy, precision, sensitivity, and specificity for both schemes were 99.7%, 99.7%, 99.7%, and 99.55% and 97.9%, 97.95%, 97.9%, and 98.8%, respectively. The high accuracy of this computer-aided diagnostic tool can significantly improve the speed and accuracy of COVID-19 diagnosis. A systematic and unified approach for lung segmentation and COVID-19 localization with infection quantification from CXR images was proposed in ref. [
45] for accurate COVID-19 diagnosis. The proposed method demonstrated exceptional COVID-19 detection performance, with sensitivity and specificity values exceeding 99%.
RS has also been useful in combating the COVID-19 pandemic by making recommendations such as medical therapies for self-care [
46], wearable gadgets to prevent the COVID-19 outbreak [
47], and unreported people to reduce infection rates by contact tracing [
48], among others. An RS based on image content was proposed in ref. [
25] that employed a random forest classifier to determine the product’s class or category in the first phase and employed the JPEG coefficients measure to extract the feature vectors of the photos in the second phase to generate recommendations using feature vector similarity. A neural network-based framework for product selection based on a specific input query image was provided by ref. [
26]. The suggested system employed a neural network to classify the supplied input query image, followed by another neural network that used the Jaccard similarity measure to find the most comparable product image to that input image. In ref. [
27], the authors developed a two-stage DL framework using a neural network classifier and a ranking algorithm for recommending fashion images based on similar input images. Traditional RS frequently faces a significant challenge in learning relevant features of both users and images in big social networks with sparse relationships between users and images, as well as the widely different visual contents of images. Refs. [
49,
50,
51] presented a strategy for solving this data sparsity problem in content-based and collaborative filtering RS by importing additional latent information to identify users’ probable preferences.
According to the literature, the majority of previous research on RS based on computer vision was conducted for the e-commerce domain, with only a few works carried out for the healthcare domain. The literature also shows that image similarity is one of the successful techniques used for designing RS in computer vision. Furthermore, the efficacy of computer vision in RS in providing solutions for combating the COVID-19 pandemic has yet to be investigated. In this context, we propose a health recommender system (HRS) that uses image similarity and collaborative filtering to provide treatment suggestions for COVID-19.
6. Discussion
6.1. Principal Findings
The proposed study presented an image-based health RS for the efficient management of hospital resources such as doctors, medicine, ICUs, ventilators, and oxygen masks during the peak period of the COVID-19 pandemic. The proposed system recommends these resources to a new patient according to his or her current health condition. It is defined as a hybrid of an offline and an online system. The offline system is in charge of extracting feature vectors from images. Using a similarity measure, the online system compares the feature vector of the query image with the feature vectors of the images in the database. The top-k most similar images are then found, as shown in Figure 18.
The test was carried out on 20,000 chest X-ray images of COVID-19 patients. The following similarity measures were compared based on the AHS value in order to select the best one for the system: (i) cosine similarity, (ii) Maxwell–Boltzmann similarity, (iii) Euclidean similarity, and (iv) Jaccard similarity. With a similarity value of more than 94%, the Maxwell–Boltzmann similarity outperformed all other similarity measures. The proposed RS’s performance was validated using the following CNN models: (i) ResNet-50, (ii) ResNet-101, (iii) ResNet-152, (iv) VGG-16, and (v) VGG-19. The performance of the CNN models was validated using parameters such as the ROC curve and FoM value. The AUC and p-values obtained from the ROC curve indicate the ability of the CNN models to correctly predict the ground truth (GT) of the input image. The ResNet-50 model was found to outperform the other CNN models with an AUC greater than 0.98 (p < 0.0001). The performance of the CNN models was also analyzed through FoM, which was defined as the error’s central tendency. The ResNet-50 CNN model was found to have the highest FoM value, 98.38. The performance of the similarity measures was also validated using the FoM value, and Maxwell–Boltzmann similarity outperformed the other three similarity measures. The overall performance of the proposed RS was evaluated using MAP@k. The MAP@k was determined using different CNN models for threshold similarities in the range of 0.7 to 0.95. The proposed RS with the ResNet-50 CNN model showed the best result, with MAP@k values of 0.98014 and 0.98861 for k = 5 and k = 10, respectively. Finally, the system recommended metadata regarding hospital resources to a new COVID-19 patient admitted to the hospital based on his or her chest X-ray image.
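As a rough illustration of how MAP@k with a similarity threshold can be computed, consider the sketch below. It rests on assumptions that this section does not confirm: a retrieved image counts as relevant when its similarity is at least the threshold T, and the average precision of each query is normalized by k, so a query scores 1.0 only when all of its top-k images clear the threshold. Other normalizations are in common use, and the function names are hypothetical.

```python
def average_precision_at_k(similarities, threshold, k):
    """AP@k for one query, given similarity scores of its retrieved images.

    Relevance is assumed to mean `score >= threshold`; the sum of precisions
    at relevant ranks is divided by k (an assumed normalization).
    """
    ranked = sorted(similarities, reverse=True)[:k]
    hits = 0
    precision_sum = 0.0
    for rank, score in enumerate(ranked, start=1):
        if score >= threshold:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / k if k > 0 else 0.0


def map_at_k(all_similarities, threshold, k):
    """Mean of AP@k over all queries (0.0 when there are no queries)."""
    if not all_similarities:
        return 0.0
    total = sum(average_precision_at_k(s, threshold, k) for s in all_similarities)
    return total / len(all_similarities)


if __name__ == "__main__":
    # Two hypothetical queries with the similarity scores of their top-3 images.
    queries = [[0.96, 0.93, 0.80], [0.97, 0.95, 0.91]]
    print(map_at_k(queries, threshold=0.9, k=3))
```

Raising the threshold makes fewer retrieved images count as relevant, which is one way the threshold can influence the reported MAP@k.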
6.2. Benchmarking
We considered various papers related to RS based on image similarity in our benchmarking strategy. This included Ullah et al. [
17], Chen et al. [
18], Tuinhof et al. [
19], and Geng et al. [
40]. In ref. [
17], an RS based on image content was proposed and divided into two phases. The RS used a random forest classifier in the first phase to determine the product’s class or category. The system then used the JPEG coefficients measure to extract the feature vectors of the photos, which were then used to provide recommendations based on feature vector similarity in the second phase. The proposed method produced correct recommendations with a 98% accuracy rate, indicating its efficacy in real-world applications. Ref. [
18] provided a neural network-based framework for product selection based on a specific input query image. A neural network was used in the proposed system to classify the supplied input query image, followed by another neural network that used the Jaccard similarity measure to determine the most comparable product image to that input image. The approach had a classification accuracy of 0.5. It offered quick and accurate online purchasing assistance and recommended products with a similarity of more than 0.5. Ref. [
19] describes a two-stage deep learning framework for recommending fashion images based on similar input images. The authors proposed using a neural network classifier as a data-driven, visually aware feature extractor. The data were then fed into ranking algorithms, which generated suggestions based on similarities. The proposed method was validated using the fashion dataset, which was made public. The proposed framework, when combined with other types of content-based recommendation systems, can improve the system’s stability and effectiveness. Ref. [
40] proposed a framework for combining an RS with visual product attributes by employing a deep architecture and a series of convolution operations that result in the overlapping of edges and blobs in images. The benchmarking table for the proposed study is shown in
Table 9.
The proposed framework for developing an entirely image-based recommendation model compares various linear and nonlinear reduction approaches to the properties of a CNN. Ref. [
82] presented an RS framework that uses chest X-ray images to predict whether a person needs COVID-19 testing. It used the same datasets as the proposed method but with a different objective. None of these studies proposed any hypothesis for their proposed systems.
In contrast, we proposed two hypotheses for our system and also evaluated and validated them in the result and performance evaluation sections, respectively.
6.3. Special Note on Searching for RS
RS works on the principle of information filtering, and the searching strategy plays an important role in finding the relevant items to produce efficient and useful recommendations. The proposed RS utilizes image similarity to find the most relevant chest X-ray images with similar infections for a new COVID-19 patient with a chest X-ray image. Although CNN models play a vital role in producing accurate feature vectors, the quality of the recommendation mainly depends on the similarity measure. A proper similarity measure producing a high similarity value can produce more accurate recommendations. The four similarity measures considered for this study were analyzed based on AHS. In this study, the AHS was determined by averaging the similarity value of the most similar image to every input image present in the test set. The similarity measure with the highest AHS was considered for the RS. The performance of the proposed RS was determined in terms of MAP@k for a top-k recommendation. To identify relevant similar images for each query image, a “threshold value (T)” of similarity was also considered in the system. A retrieved database image was considered relevant when it had a similarity greater than or equal to the threshold value. This threshold value was found to affect the overall performance of the system in terms of MAP@k for a top-k recommendation.
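The AHS-based selection described above can be sketched as follows. This is an illustrative reading of the text, not the authors' code: for each similarity measure, it assumes a precomputed matrix whose rows are test images and whose columns are database images, takes the highest similarity in each row, and averages those values. The function names and the example numbers are hypothetical.

```python
def ahs(similarity_matrix):
    """Average, over test images, of the similarity to the best database match.

    `similarity_matrix[i][j]` is the similarity between test image i and
    database image j (assumed to be precomputed).
    """
    row_maxima = [max(row) for row in similarity_matrix if row]
    if not row_maxima:
        return 0.0
    return sum(row_maxima) / len(row_maxima)


def best_measure(matrices_by_measure):
    """Return the name of the similarity measure with the highest AHS."""
    return max(matrices_by_measure, key=lambda name: ahs(matrices_by_measure[name]))


if __name__ == "__main__":
    # Made-up similarity values for two test images and two database images.
    matrices = {
        "measure_a": [[0.90, 0.80], [0.70, 0.95]],
        "measure_b": [[0.60, 0.50], [0.40, 0.30]],
    }
    for name, matrix in matrices.items():
        print(name, ahs(matrix))
    print("selected:", best_measure(matrices))
```

Note that measures defined on different scales are not directly comparable by raw value, so a selection of this kind implicitly assumes that all candidate measures are normalized to a common range.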
The input images in both the training and testing sets were large, with many pixels to process, and the method we adopted reduced the computational complexity. The similarity computation was fast and low in complexity, one reason being that no special optimization protocol or iteration was adopted. Overall, the approach offers simplicity, speed, and low complexity, and these benefits favor it over direct image comparison. Note that the top-k similar images obtained from the similarity computation were used for the recommendation. The proposed RS using CNN for feature extraction and similarity measurement can be an efficient tool to produce recommendations in the healthcare domain. The recommendations can be utilized for the proper allocation of doctors, medicine, and hospital resources to new patients.
6.4. Strengths, Weaknesses, and Extensions
The proposed method shows that an RS using a CNN for feature extraction together with a similarity measure can be an efficient tool for producing recommendations in the healthcare domain. The recommendations can be utilized for the proper allocation of doctors, medicine, and hospital resources to new patients. This study also put forward two hypotheses and evaluated and validated them.
The results of the current pilot study are encouraging. However, because the proposed RS does not include a denoising technique, noise in the chest X-ray images may affect the quality of the recommendations. Denoising can be conducted in both the offline and online systems, but it is computationally expensive. Offline denoising therefore does not hurt the system much, but the online system must be hardware interactive. The low resolution of chest X-ray images may also affect the quality of recommendations. A small database limits the number of images available for similarity calculation and may result in incorrect recommendations, whereas a large database may result in longer training time. While the study used basic ResNet-based systems, this can be extended to hybrid ResNet systems [
83,
84].
In the future, we could apply more sophisticated feature extraction techniques by fusing the different deep-learning models to achieve accurate recommendations. Better similarity methods can be explored to increase the efficiency of the proposed system. It could also be enhanced by applying segmentation techniques to make the system more robust. It can also be extended to cloud settings and big data platforms.
7. Conclusions
Through this study, we offered an RS for treating COVID-19 patients based on X-ray images of the chest. The proposed RS was divided into two phases. In phase 1, the CNN models were fine-tuned for the feature extraction carried out in phase 2. In phase 2, the fine-tuned CNN model was used to extract features from both the chest X-ray of a new COVID-19 patient and the chest X-rays of COVID-19 patients in the hospital database who had already been treated successfully. The top-k images most similar to the input query image of the new COVID-19 patient were determined and then used for recommendation. The proposed RS recommends doctors, medicines, and resources for new COVID-19 patients according to the metadata of similar patients.
The proposed RS implemented with the ResNet-50 feature extraction CNN model provides the highest MAP@k with k = 5 (top-5) and k = 10 (top-10) for all the datasets with higher threshold values of similarity. The proposed RS with the ResNet-50 CNN feature extraction model was found to be a suitable framework for treatment recommendation, with a mean average precision (MAP) of more than 0.90 for threshold similarities in the range of 0.7 to 0.9. The hypotheses of the proposed study were validated using various parameters. The proposed RS assumes that the hospital database contains related metadata, such as information about the doctors assigned, the medicines given, and the resources allocated to a patient. The major limitation of our proposed system is that we did not consider related physiological parameters such as sugar level, blood pressure, and other associated parameters that may affect the condition of COVID-19 patients with similar chest infections. In the future, the proposed RS can be enhanced by considering these parameters for better recommendations.