#### **3. Background**

Healthcare organizations generate a vast range of data and information. Thanks to progress in high-throughput (HT) omics technologies, omics data (e.g., gene expression profiles, sequence alignments, and protein sequences) have grown exponentially, rendering classical computational approaches ineffective for handling these massive amounts of heterogeneous data. Consequently, the omics sciences have turned into a Big Data science. Big Data in the health and medical areas needs infrastructures that improve data storage and management. Data sharing and security are critical in health and medical care, since researchers need easy and extensive access to data for scientific analysis and for sharing results. Cloud Computing solutions can help healthcare organizations make data analysis, sharing, access, and storage effective through Cloud services that scale as the amount of data increases. Thus, Cloud Computing services are a cost-effective solution for storing, accessing, analyzing, sharing, and protecting healthcare data and information.

The following is a list of well-known Cloud service models suitable for handling Big omics Data.


Eoulsan 2 is implemented in Java, available only for Linux systems, and distributed under the LGPL and CeCILL-C licenses at http://outils.genomique.biologie.ens.fr/eoulsan/ (accessed on 21 March 2023). The source code and sample workflows are available on GitHub at https://github.com/GenomicParisCentre/eoulsan (accessed on 21 March 2023).


#### **4. Materials and Methods**

This section highlights some of the challenges, security issues, and impediments limiting the spread of Cloud Computing in healthcare corporations. To identify the main obstacles to a wider adoption of Cloud methodologies in healthcare corporations, we searched the online database PubMed [38], looking in the available scientific literature for clues and advice that could help mitigate the current difficulties in the large-scale use of Cloud Computing in this sector.

The first step was to define the keywords used to select relevant manuscripts. The chosen keywords are: cloud computing, healthcare, security, challenges, and applications. Table 2 shows the queries obtained by combining the keywords with the selected range of publication years.

In the second step, we defined the following inclusion criteria: (i) manuscripts available on PubMed from 2009 up to December 2022 that match the selected keywords; (ii) all types of abstracts, manuscripts, conference abstracts, reviews, and letters are eligible if they contain the chosen keywords in the title and are available as free full text.
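As an illustration only, the keyword combinations and inclusion criteria described above could be turned into PubMed search strings along the following lines. The exact queries are those listed in Table 2; the composition of each query, the field tags (`[ti]` for title, `[dp]` for publication date), the `free full text[sb]` filter, and the helper `build_query` are assumptions made for this sketch, not the authors' actual query strings.

```python
# Hypothetical reconstruction: the real query strings are those of Table 2.
BASE_TERMS = ["cloud computing", "healthcare"]

# Assumed extra keyword per query, inferred from the description of Q1-Q4.
EXTRA_TERMS = {
    "Q1": [],
    "Q2": ["security"],
    "Q3": ["challenges"],
    "Q4": ["applications"],
}


def build_query(terms, start_year=2009, end_year=2022):
    """Combine title keywords, a publication-year range, and a free-full-text filter."""
    title_part = " AND ".join(f'"{term}"[ti]' for term in terms)
    return f"({title_part}) AND {start_year}:{end_year}[dp] AND free full text[sb]"


queries = {name: build_query(BASE_TERMS + extra) for name, extra in EXTRA_TERMS.items()}

for name, query in queries.items():
    print(name, query)
```

Keeping the shared terms in one list makes it explicit that every query is a restriction of the same base search, which is what makes the results comparable across queries.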

**Table 2.** The table shows the defined queries to identify relevant manuscripts related to Cloud Computing in healthcare.


Table 3 reports the number of manuscripts identified in PubMed by the queries listed in Table 2. The query results were analyzed using an in-house Python script that parses the manuscripts' titles, extracts their keywords, and computes the frequency of each keyword (articles, prepositions, adverbs, etc. are excluded from the count). Finally, the keyword frequencies are used to produce the word cloud diagram shown in Figure 1.
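The in-house script is not reproduced here. A minimal sketch of the title-keyword counting it describes, assuming the titles are available as plain strings, might look as follows. The stop-word list, the tokenization rule, the function name `keyword_frequencies`, and the sample titles are all hypothetical.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption); a real script would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "to", "with", "from", "by"}


def keyword_frequencies(titles):
    """Count how often each non-stop-word appears across a list of titles."""
    counts = Counter()
    for title in titles:
        # Lowercase the title and keep alphabetic tokens, allowing inner hyphens.
        words = re.findall(r"[a-z]+(?:-[a-z]+)*", title.lower())
        counts.update(word for word in words if word not in STOP_WORDS)
    return counts


# Hypothetical titles, used only to exercise the function.
titles = [
    "Cloud computing for healthcare: a review",
    "Security of cloud computing in healthcare",
]
freq = keyword_frequencies(titles)
print(freq.most_common(3))
```

A frequency mapping of this kind can be handed directly to a word cloud renderer, for example the `generate_from_frequencies` method of the third-party `wordcloud` package; the source does not state which tool produced Figure 1.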

**Table 3.** The table shows the total number of eligible PubMed manuscripts matching the defined queries.


Figure 1 presents the results of query *Q*1 in the form of a word cloud diagram.

**Figure 1.** Figure shows the query *Q*1 results as a word cloud diagram.

Figure 2 shows the publication growth trend of manuscripts concerning the use of Cloud Computing in healthcare.

**Figure 2.** Figure shows the growth trend of Cloud Computing in healthcare starting from 2008 up to December 2022. *Q*1 presents the growth per year of manuscripts dealing with Cloud Computing in healthcare. *Q*2 shows the trend per year of manuscripts focused on security issues in Cloud Computing, especially in healthcare. *Q*3 shows the growth of manuscripts focused on the challenges to be faced in Cloud Computing for healthcare. Finally, *Q*4 provides an overview of the growth per year of Cloud applications for healthcare.

To highlight the difficulties of adopting Cloud Computing in the healthcare sector, we analyze the results of the queries, represented graphically as pie charts. Figure 3 shows the results of query *Q*1.

*Q*1 contains the keywords cloud computing and healthcare and yields 67 keywords (for readability, the pie chart visualizes only the first 30) extracted from the titles of the scientific articles selected with the previously defined criteria. Analyzing the frequency of the keywords identified by query *Q*1, shown in Figure 3, it is worth noting that many terms are related to healthcare. This could lead to misleading conclusions about the use of Cloud Computing in healthcare, considering that keywords such as security and privacy occupy only the 35th and 38th positions, respectively.

Query *Q*2 adds the keyword security to query *Q*1 and extracts 17 keywords from the scientific works compatible with the selection criteria. Adding the keyword security narrows the scope of the search. Indeed, the result of *Q*2 shown in Figure 4 reveals that the keywords related to security and privacy now leap to the 5th, 6th, and 8th positions, respectively, highlighting the importance of security and privacy in the various areas of use of the Cloud and, in particular, in the health sector.
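Keyword positions such as those quoted above, and the truncated percentages reported in the pie charts, can be derived from a keyword-frequency table. The following sketch shows one way to do so; the function name `ranked_shares` and the counts are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def ranked_shares(freq):
    """Return (position, keyword, percentage truncated to one decimal) rows."""
    total = sum(freq.values())
    rows = []
    for position, (keyword, count) in enumerate(freq.most_common(), start=1):
        # Integer arithmetic truncates to tenths of a percent; it never rounds up.
        tenths = count * 1000 // total
        rows.append((position, keyword, tenths / 10))
    return rows


# Hypothetical counts, used only to exercise the function.
freq = Counter({"cloud": 7, "healthcare": 5, "security": 3})
rows = ranked_shares(freq)

for position, keyword, share in rows:
    print(position, keyword, share)
```

Because the shares are truncated rather than rounded, the displayed percentages may sum to slightly less than 100.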

Query *Q*3, composed of the keywords cloud computing, healthcare, and challenges, locates 20 keywords, as shown in Figure 5.

**Figure 3.** Figure shows the keyword frequency produced from query *Q*1. To improve legibility, the percentage values have been truncated to the first value after the decimal point.

**Figure 5.** Figure shows the keyword frequency produced from query *Q*3. To improve legibility, the percentage values have been truncated to the first value after the decimal point.

In particular, challenges occupies the 5th position, highlighting that the use of the Cloud in the healthcare sector must overcome various challenges, particularly those related to the sensitive nature of the data to be handled. Figure 6 displays the frequency of keywords extracted from query *Q*4.

From the analysis of Figure 6, security occupies the 19th position, while privacy does not appear in the list of frequent keywords. This could bias the interpretation of the results, suggesting that the existing Cloud Computing applications are mainly aimed at sectors other than healthcare, which are regulated by less stringent privacy requirements. In light of these observations, the relevant scientific papers to be analyzed were selected using the intersection of the results produced by the four queries as the selection criterion.

To limit the number of manuscripts to investigate, we computed the intersection among the results obtained from the four queries performed in PubMed. Figure 7 shows the intersection among the manuscripts' keywords retrieved from each query. The intersection was computed using Venny 2.0 [39], a web application used to draw Venn diagrams.

Analysing Figure 7, it is worth noting that the intersection among the four queries contains 27 manuscripts. According to the eligibility criteria, 21 manuscripts were excluded since they are not explicitly related to Cloud Computing. Finally, only the 6 manuscripts meeting the eligibility criteria were assessed.
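The intersection itself was obtained with Venny 2.0. The same operation can be expressed programmatically as a set intersection, sketched below under the assumption that each query result is available as a list of manuscript identifiers. The identifiers and the function name `common_manuscripts` are hypothetical.

```python
def common_manuscripts(results_by_query):
    """Return the identifiers that appear in the results of every query."""
    id_sets = [set(ids) for ids in results_by_query.values()]
    return set.intersection(*id_sets)


# Hypothetical identifiers, used only to exercise the function.
results = {
    "Q1": ["m1", "m2", "m3", "m4"],
    "Q2": ["m2", "m3", "m4"],
    "Q3": ["m3", "m4", "m5"],
    "Q4": ["m1", "m3", "m4"],
}
shared = common_manuscripts(results)
print(sorted(shared))
```

Only the manuscripts returned by all four queries survive, which corresponds to the central region of a four-set Venn diagram.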

**Figure 6.** Figure shows the keywords' frequency produced from query *Q*4. To improve legibility, the percentage values have been truncated to the first value after the decimal point.

**Figure 7.** Figure shows the intersection among the manuscripts' keywords retrieved from each query.
