*2.2. Data Analysis*

Bibliographic Items Co-occurrence Matrix Builder (BICOMB), provided by Professor Cui from China Medical University [33], and Microsoft Excel were used to determine the annual number of publications, the most active journals, the distribution of the journals' places of publication, and the frequency of major MeSH/subheading combination terms. In the following sections, "major MeSH/subheading combination term" is shortened to "term". The publication time of an article in this study was its final publication time, i.e., the time at which the volume, pages, or serial article number had been released.

A research hotspot is a focus of research on which researchers have conducted many studies and published many related papers. The hotspots of a field can usually be identified from the frequencies of, and the relationships among, the words that reflect the content of its articles [34]. In this study, following the g-index principle applied to word frequency, a threshold (g) was set for the number of terms in order to generate a list of high-frequency terms and a term-article matrix [35]. Egghe proposed the g-index to reflect the contribution of high-quality papers (i.e., highly cited papers) to a scientist. Similarly, because co-word analysis selects high-frequency words to reflect the hotspots of a research field, the g-index can also reflect the contribution of high-frequency words to all of the words in a given field [36]. Zhang et al. and Yang et al. have demonstrated the simplicity and effectiveness of the g-index for selecting high-frequency words in their empirical studies [35,37]. The number g was determined as follows: all major MeSH/subheading combination terms were sorted in descending order of frequency, and i denoted the rank of each term; g was the value of i at which the cumulative frequency of the first g terms was not less than g<sup>2</sup>, while that of the first (g + 1) terms was less than (g + 1)<sup>2</sup>. The first g terms were then considered high-frequency terms [37]. If other terms had the same frequency as the *g*th term, they were also identified as high-frequency terms. Next, a binary matrix of high-frequency terms by source articles was generated with BICOMB. The matrix was then imported into gCLUTO (Graphical Clustering Toolkit, developed by Rasmussen et al.) version 1.0 (University of Minnesota, Minneapolis, MN, USA) for biclustering [38].
The parameters in gCLUTO (repeated bisection as the clustering method, cosine as the similarity function, and I<sup>2</sup> as the clustering criterion function) were set to values appropriate for the biclustering analysis of literature. To determine the optimal number of clusters, the biclustering procedure was repeated with different numbers of clusters. The biclustering result of the term-article matrix was presented as mountain and matrix visualizations. Based on the semantic relationships between the MeSH/subheading combination terms and the content of the representative articles in each cluster, the basic framework of research hotspots on climate change-related infectious diseases was outlined and analyzed.
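
The g-index selection rule and the binary term-article matrix described above can be illustrated with a minimal Python sketch. It is not the BICOMB implementation; the function names (`g_threshold`, `select_high_frequency_terms`, `binary_term_article_matrix`) and the input layout (a term-to-frequency mapping and an article-to-terms mapping) are assumptions made only for illustration.

```python
from itertools import accumulate


def g_threshold(frequencies):
    """Return the largest rank g whose top-g cumulative frequency is >= g**2."""
    ranked = sorted(frequencies, reverse=True)  # descending order of frequency
    g = 0
    for rank, cumulative in enumerate(accumulate(ranked), start=1):
        if cumulative >= rank * rank:
            g = rank
    return g


def select_high_frequency_terms(term_frequencies):
    """Return the first g terms plus any terms tied with the g-th term."""
    g = g_threshold(term_frequencies.values())
    if g == 0:
        return []
    # Sort by descending frequency; ties are broken alphabetically (an assumption).
    ranked = sorted(term_frequencies.items(), key=lambda item: (-item[1], item[0]))
    cutoff = ranked[g - 1][1]  # frequency of the g-th term
    return [term for term, freq in ranked if freq >= cutoff]


def binary_term_article_matrix(terms, article_terms):
    """Build rows = terms, columns = articles; 1 if the term indexes the article."""
    article_ids = sorted(article_terms)
    matrix = [
        [1 if term in article_terms[article_id] else 0 for article_id in article_ids]
        for term in terms
    ]
    return article_ids, matrix
```

With hypothetical frequencies such as `{"a": 10, "b": 6, "c": 3, "d": 3, "e": 1}`, the cumulative frequencies are 10, 16, 19, 22, and 23, against squared ranks of 1, 4, 9, 16, and 25, so the rule gives g = 4 and the term "e" is excluded.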

Next, every hot research topic was placed in a strategic diagram to show the relational patterns within each cluster and among all of the clusters, so that the current status and evolutionary trends of this field could be revealed. In 1988, Law et al. proposed the strategic diagram to describe the internal linkages within research domains and the interactions between domains [39]. The strategic diagram is a two-dimensional chart in which the horizontal axis represents centrality and the vertical axis represents density. Centrality is the average value of external links, where external links are the total number of times that each term in a given cluster co-occurs in the same article with each term in the other clusters. Density is the average value of internal links, where internal links are the total number of times that each pair of terms in a given cluster co-occurs in the same article [40]. Centrality measures the degree to which the terms of a cluster are connected with the terms in the other clusters, and thus indicates the degree to which one theme affects the others. The greater the number and strength of links between a subject domain and other subject domains, the more central that domain is in the whole research field. Density measures the closeness of the terms within the same cluster, indicating the strength of the relations that bind the terms into a cluster, i.e., the ability of a theme to maintain and develop itself [41]. Based on the results of the co-word clustering analysis and the co-occurrence matrix of the high-frequency terms, the density and the centrality were calculated for each cluster. The origin of the coordinate system was the mean of all centralities and the mean of all densities. Using the content, centrality, and density of each cluster, the development status of the hot research topics over the two decades was presented in strategic diagrams, from which the evolutionary trend of global research on climate change and infectious diseases was analyzed and discussed.
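
The centrality and density calculation can be sketched in Python as follows. This is an illustrative sketch rather than the procedure actually used in the study. In particular, the text does not state the denominator of the "average value", so dividing the link totals by the number of terms in the cluster is an assumption, as are the function names and the input layout (a mapping from each term to the set of articles it indexes, and a mapping from each cluster to its terms).

```python
from itertools import combinations


def cooccurrence(term_articles, a, b):
    """Number of articles in which terms a and b both appear."""
    return len(term_articles[a] & term_articles[b])


def density(cluster, term_articles):
    """Internal links (all term pairs inside the cluster) averaged per term."""
    internal = sum(
        cooccurrence(term_articles, a, b) for a, b in combinations(cluster, 2)
    )
    return internal / len(cluster)  # assumed denominator: cluster size


def centrality(cluster, other_terms, term_articles):
    """External links (cluster terms vs. terms of other clusters) averaged per term."""
    external = sum(
        cooccurrence(term_articles, a, b) for a in cluster for b in other_terms
    )
    return external / len(cluster)  # assumed denominator: cluster size


def strategic_coordinates(clusters, term_articles):
    """Return {cluster: (x, y)} with the origin at the mean centrality and density."""
    raw = {}
    for name, terms in clusters.items():
        others = [t for other, ts in clusters.items() if other != name for t in ts]
        raw[name] = (
            centrality(terms, others, term_articles),
            density(terms, term_articles),
        )
    mean_centrality = sum(c for c, _ in raw.values()) / len(raw)
    mean_density = sum(d for _, d in raw.values()) / len(raw)
    return {
        name: (c - mean_centrality, d - mean_density) for name, (c, d) in raw.items()
    }
```

In the returned coordinates, a cluster with positive x and positive y lies in the quadrant of above-average centrality and above-average density, and the other three quadrants follow from the signs in the same way.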
