#### *2.1. Data Collection*

Because CM is interdisciplinary, this study used Web of Science, an academic database known for its comprehensiveness, organizational structure, and scientific robustness [26]. Articles in high-ranking journals are generally considered to have high influence [25]. The authors therefore selected high-impact, high-quality CM journals based on the 2019 Scopus journal metrics (CiteScore of at least 2) and the ranking of CM journals [27]. Moreover, to ensure timeliness, each journal must have published at least three papers related to ANN in CM between 2000 and 2020.

To summarize the research progress of the past 20 years, the search was limited to journal articles published between 2000 and 2020. To ensure that all relevant papers were captured, several search keywords were applied. The following search strings were used, and the search was conducted on the 'topic' field of the Web of Science.

(neural network) AND (construction management), (neural network) AND (engineering management)

Four hundred potential articles were initially identified and then filtered based on the criterion that ANN was the main technology or played an important role rather than serving only as a comparison. A two-stage selection strategy was adopted to meet this criterion. First, the title, abstract, and keywords of each article were examined to exclude unrelated articles. Second, the full content of each remaining paper was analyzed in detail to ensure that all the selected articles were closely related to the research objectives. Finally, 112 papers were selected for the scientometric review; the journals in which the selected papers were published are listed in Table 1. These articles provide a representative sample of existing studies on ANN in CM and form the dataset used in the current research.


**Table 1.** Distribution of the selected papers among different journals.

#### *2.2. Introduction and Process of Scientometric Analysis*

The earliest definition of scientometrics is "quantitative research on scientific development research" [138]. The purpose of scientometric analysis is to help literature reviews overcome the subjectivity of content analysis [139]. A scientometric analysis consists of text mining and citation analysis, which help researchers reach systematic literature-related findings by surfacing information that may be overlooked in manual reviews [140]. Several tools are available for scientometric analysis, such as VOSviewer and CiteSpace. CiteSpace is well suited to analyzing and visualizing networks, and specializes in identifying the major research interests and how they have evolved and are linked [141]. VOSviewer offers the basic functionality required for producing, visualizing, and exploring bibliometric networks, and also has special text-mining features [142]. Each tool has its own strengths, and different tools should be used for different kinds of analysis [143]. The combination of these two tools is widely accepted for reviews, such as those of AI in the Architecture, Engineering and Construction industry [12] and computer vision applications for construction [144]. Therefore, VOSviewer and CiteSpace were selected for the scientometric analysis in this paper. VOSviewer was used for keyword co-occurrence analysis, and CiteSpace was used for the other tasks.

For both VOSviewer and CiteSpace, the main process can be summarized as follows: (1) importing literature data from WOS into the software for visualization; (2) presenting and optimizing figures for different aspects according to the software functions; and (3) conducting an in-depth scientometric analysis of the adjusted figures. After the screening described above, each record was downloaded from WOS in full-record text format. In total, 112 valid publications were extracted. The output files were then renamed to the form 'download\_\*.txt' and imported into VOSviewer and CiteSpace for format conversion. Visualizations were generated and could be converted to different views through different functions of the software. After the visualizations were adjusted to make them easier to read and analyze, the scientometric analysis began, covering the following aspects. First, through co-author network analysis, core research groups and their cooperative relationships were identified. Second, through network analysis of participating countries/regions, the most influential countries/regions that are particularly active in ANN in CM, and the collaborations among them, were described. Third, network analysis of keywords was conducted to discern the main research interests and hot topics on ANN in CM. Fourth, timeline visualization and citation bursts were used to show how the keywords have evolved over time. Finally, network analysis of the co-cited references was carried out to mine the most representative literature in this field.
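To make the import step concrete, the following is a minimal Python sketch of reading WOS full-record plain-text files outside VOSviewer or CiteSpace. It assumes the usual tagged export layout (a two-character field tag at the start of a line, indented continuation lines, and `ER` closing each record); the function names and the sample records are hypothetical and are not part of the study's workflow.

```python
import glob
from collections import defaultdict


def parse_wos_records(text):
    """Parse a WOS plain-text export into a list of {tag: [values]} dicts."""
    records, current, tag = [], defaultdict(list), None
    for line in text.splitlines():
        if not line.strip():
            continue
        head, body = line[:2], line[3:].strip()
        if head == "ER":                      # end of one record
            records.append(dict(current))
            current, tag = defaultdict(list), None
        elif head in ("FN", "VR", "EF"):      # file-level header/footer tags
            continue
        elif head.strip():                    # a new field tag such as AU, TI, DE
            tag = head
            current[tag].append(body)
        elif tag is not None:                 # indented continuation of the last tag
            current[tag].append(body)
    return records


def load_wos_files(pattern="download_*.txt"):
    """Read every file matching the pattern (assumed naming) and merge the records."""
    records = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8-sig") as handle:
            records.extend(parse_wos_records(handle.read()))
    return records


# Hypothetical two-record sample in the assumed export layout.
SAMPLE = """FN Clarivate Analytics Web of Science
VR 1.0
PT J
AU Doe, J
   Roe, R
TI A hypothetical study of neural networks
   in construction management
DE neural network; cost estimation
PY 2015
ER

PT J
AU Poe, E
TI Another hypothetical record
PY 2018
ER

EF
"""

records = parse_wos_records(SAMPLE)
authors_first = records[0]["AU"]                      # one entry per author line
keywords_first = records[0]["DE"][0].split("; ")      # author keywords are ';'-separated
```

With records in this form, author lists (`AU`), author keywords (`DE`), and cited references (`CR`) can be passed to whatever network construction is needed; the tools used in the paper perform an equivalent conversion internally.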

### **3. Results of Scientometric Analysis**

#### *3.1. Author Analysis*

Co-authorship network analysis of current research on ANN in CM can promote access to specialists and expand research productivity [25]. The core group of authors and their cooperative relationships in this field can be determined by analyzing the structural characteristics of the corresponding authors and their cooperation networks. In this paper, the co-authorship network was generated in CiteSpace, and the collaboration map is presented in Figure 2. The size of a node indicates the number of articles published by the author, a connecting line indicates a collaborative relationship between two authors, and the color of the line indicates authors in the same cluster. Publication dates from past to present are shown as a transition from cool to warm colors. As can be seen from Figure 2, there are 253 nodes and 287 links, and the network density is 0.0101. Network density takes values between 0 and 1. A very low network density, close to 0, indicates that the authors in the network are not closely connected [145]. The most productive author on ANN in CM was *MINYUAN CHENG* with 8 articles, followed by *HSINGCHIH TSAI* (4) and *XUEFENG ZHAO* (4). Many authors tend to collaborate with a relatively stable group of collaborators, so several major groups of authors can be observed. Among them, the cluster containing *MINYUAN CHENG* is the core research team. Its members are important scholars in the application of ANN in CM and can offer highly individualized scientific research information to other researchers in this field.
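As an illustration of the density measure, the sketch below uses the textbook definition for a simple undirected network: the number of observed links divided by the number of possible links, n(n − 1)/2. This is an assumption about the general concept only; the exact computation inside CiteSpace may differ (for example, in how links are pruned or counted), so the function is not expected to reproduce the reported value.

```python
def network_density(num_nodes, num_links):
    """Density of a simple undirected network: observed links / possible links."""
    if num_nodes < 2:
        return 0.0
    possible_links = num_nodes * (num_nodes - 1) / 2
    return num_links / possible_links


# Hypothetical toy network: 5 authors joined by 4 co-authorship links.
toy_density = network_density(5, 4)   # 4 / 10 = 0.4
```

Because the number of possible links grows quadratically with the number of authors, a large co-authorship network made up of small, separate teams will have a density close to 0.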

**Figure 2.** Network of co-authorship in research on ANN in CM [15,28–137].

#### *3.2. Countries/Regions Analysis*

The leading countries/regions in research on ANN in CM can be identified through network analysis. The results help interested scholars identify leading countries with high potential for cooperation. At the same time, the results can provide top management with macro data to facilitate policy decisions on industry digitization. This section presents the countries/regions contributing to the 112 research articles extracted for the study. Figure 3 shows the collaboration network of countries/regions, which contains 29 nodes and 33 links. The size of a node represents the number of the 112 articles published by that country/region, and the thickness of a link indicates the level of the cooperative relationship. PEOPLES R CHINA (23 articles), TAIWAN (21 articles) and USA (17 articles) top the list, demonstrating that these countries/regions have made significant contributions to research in this field. However, compared with other emerging technologies such as AI and BIM, ANN in CM has not yet attracted the global attention it deserves. It is expected that more countries/regions will pay attention to and promote research in this field in the future. Betweenness centrality is an important index in CiteSpace. Freeman [146] defined the betweenness centrality of a node in terms of the shortest paths between pairs of other nodes: for each pair, the share of their shortest paths that pass through the node is computed, and these shares are summed over all pairs. The greater the betweenness centrality of a node, the higher its importance. From the perspective of centrality, CiteSpace identified a collaborative pattern, and the network reveals that PEOPLES R CHINA (centrality = 0.43), USA (centrality = 0.35) and TURKEY (centrality = 0.13) were the key infrastructure nodes in the network. Researchers from these countries collaborate more actively than others. Even so, the centrality of TURKEY is only 0.13, which indicates limited collaboration, while TAIWAN, the second most prolific region, has a centrality of 0.00. In general, these results imply that strengthening academic exchanges and contacts to expand current research productivity deserves more attention.
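The betweenness measure described above can be sketched in a few lines of Python. The code below is an illustrative implementation of normalized betweenness for an undirected, unweighted network using Brandes' accumulation scheme; it is an assumption that this matches the normalization applied in CiteSpace, and the five-node network (labels A to E) is hypothetical rather than taken from Figure 3.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Normalized betweenness for an undirected, unweighted graph.

    adjacency maps every node to the set of its neighbours.
    """
    nodes = list(adjacency)
    score = {v: 0.0 for v in nodes}
    for source in nodes:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        preds = {v: [] for v in nodes}
        sigma = {v: 0 for v in nodes}
        dist = {v: -1 for v in nodes}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate the share of shortest paths passing through each node.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    n = len(nodes)
    # Every unordered pair is visited twice, so divide by (n - 1)(n - 2).
    scale = 1 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: score[v] * scale for v in nodes}


def build_adjacency(edges):
    """Turn a list of undirected links into a neighbour-set mapping."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


# Hypothetical collaboration links between five regions.
toy_links = [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E")]
centrality = betweenness_centrality(build_adjacency(toy_links))
# B bridges most pairs (5/6), D bridges some (1/2), and A, C, E bridge none (0).
```

The toy result shows why a prolific region can still score 0.00: a node that never lies on the shortest path between two other nodes has zero betweenness, regardless of how many articles it contributes.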

**Figure 3.** Collaboration network of countries/regions in the research on ANN in CM.

#### *3.3. Keywords Analysis*

#### 3.3.1. Co-Occurrence Network of Keywords

Keywords are representative and concise descriptions of research paper content, and analyzing keywords provides an opportunity to identify major research interests in this field [147]. A network of keywords offers a good picture of a knowledge domain, which helps to identify the interests over a specific timespan and provides an understanding of how they are connected and organized [138]. Synonymous terms (e.g., cost estimate, construction cost estimation and cost prediction; genetic algorithm and GA; regression and regression analysis) were merged (as cost estimate, genetic algorithm and regression analysis, respectively), and generic keywords (e.g., management, model) were omitted during analysis because they do not reflect current research trends and affect the clustering accuracy of the analysis results [148].

To construct and map the knowledge domain on ANN in CM, the keyword co-occurrence network of the research area was obtained using VOSviewer. The main research interests on ANN in CM are shown in Figure 4. The network of co-occurring keywords has 122 nodes, 342 links, and a total link strength of 432. In this network, each node represents a different keyword, a link between two nodes represents the co-occurrence of the connected keywords, and the node size is determined by the frequency with which the keyword appears in the 112 articles.

The frequency with which keywords appear indicates the main research interests in the research field [149]. Table 2 summarizes the occurrences and node strength of each keyword. The links of a node are the number of other nodes it is connected to, while the total link strength is the sum of the strengths of all links attached to that node.
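The three quantities discussed here (occurrences, links, and total link strength) can be illustrated with a short Python sketch. It assumes simple full counting, in which each pair of keywords sharing an article adds 1 to the strength of their link; VOSviewer offers several counting options, so this is not necessarily the exact setting used. The synonym and generic-term tables reuse only the examples given in the text, and the three keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Merge rules and omitted generic terms taken from the examples in the text;
# the complete lists used in the study are not reproduced here.
SYNONYMS = {
    "construction cost estimation": "cost estimate",
    "cost prediction": "cost estimate",
    "ga": "genetic algorithm",
    "regression": "regression analysis",
}
GENERIC = {"management", "model"}


def normalise(keywords):
    """Lower-case, merge synonyms, and drop generic keywords."""
    cleaned = set()
    for keyword in keywords:
        keyword = keyword.strip().lower()
        keyword = SYNONYMS.get(keyword, keyword)
        if keyword and keyword not in GENERIC:
            cleaned.add(keyword)
    return cleaned


def cooccurrence(keyword_lists):
    """Return (occurrences, links, total link strength) per keyword."""
    occurrences, pair_weights = Counter(), Counter()
    for keywords in keyword_lists:
        terms = normalise(keywords)
        occurrences.update(terms)
        pair_weights.update(combinations(sorted(terms), 2))
    links, strength = Counter(), Counter()
    for (a, b), weight in pair_weights.items():
        links[a] += 1            # one more distinct neighbour for each end
        links[b] += 1
        strength[a] += weight    # link weight adds to both ends' totals
        strength[b] += weight
    return occurrences, links, strength


# Hypothetical keyword lists for three articles.
papers = [
    ["ANN", "Prediction", "Cost prediction"],
    ["ANN", "GA", "cost estimate", "model"],
    ["ANN", "prediction"],
]
occurrences, links, strength = cooccurrence(papers)
# "ann": 3 occurrences, 3 links, total link strength 5
```

In the toy data, "ann" co-occurs twice with "prediction", twice with "cost estimate" and once with "genetic algorithm", which gives 3 links but a total link strength of 5; the two measures diverge whenever the same keyword pair appears in more than one article.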

**Figure 4.** Main research interests on ANN in CM (co-occurrence network of keywords).



Table 2 reveals that ANN was the keyword with the highest frequency, appearing as a keyword in 75 of the 112 articles, which further supports the rationale of the literature selection. After ANN, prediction is the most frequent keyword, with a total link strength of 113. Its high ranking indicates a strong inter-relatedness between ANN and prediction. The considerable attention given to prediction can be explained by the fact that ANN, as a powerful AI algorithm, is an effective tool for prediction [150]. ANN is typically applied in prediction models for knowledge discovery from the large quantity of information and documents generated in the process of construction management, and the results can provide reliable assistance for decision-making [117] and optimization [33]. In addition, ANN can also be used for recognition and classification, such as defect classification [85] and construction activity recognition [87]. Compared with prediction, research on recognition and classification has attracted relatively little attention and deserves further exploration in the future.

Apart from the main functions of ANN in CM, the range of problems or tasks to which ANN has been applied in CM is another important issue. It can be seen from Figure 4 and Table 2 that cost estimation, performance, productivity, risk, safety, project success and duration represent other important types of nodes in the network, which are key tasks of construction management [151]. The results indicate that ANN has become an effective tool for CM and is gradually replacing the traditional mainstream methods due to its advantages [79]. It is worth mentioning that behavior, construction worker, contractor, and labor and personnel issues form an emerging type of research topic, and the importance of personnel management is further highlighted in project management [59]. There is, however, a conspicuous absence of interest in the topic of environment in the network, which needs further attention.

Finally, there are many keywords related to algorithms, such as genetic algorithm, regression analysis, algorithms, fuzzy logic, and data mining. This indicates that, in order to better complete project management tasks, a variety of methods are increasingly applied together with ANN to improve the efficiency and precision of the model.

#### 3.3.2. Timeline Visualization and Citation Bursts of Keywords

Cluster analysis is used to identify the semantic themes hidden in textual data. Figure 5 shows a timeline visualization of the cluster analysis of keywords, created with CiteSpace 5.7.R1. CiteSpace offers three text-mining algorithms for labeling clusters. The log-likelihood ratio (LLR) technique was used in this study because of its good clustering results [80]. The network is divided into 9 major co-citation clusters (with cluster IDs #0, #1, etc.). CiteSpace automatically selects a label for each cluster based on the titles, keywords, and abstracts of the articles in each cluster. Usually, the modularity (Q value) and the mean silhouette (MS) value are used to evaluate the clustering effect. Generally, the Q value lies in the interval [0, 1), and Q > 0.3 means that the community structure is significant. A cluster's silhouette value ranges from −1 to 1 and assesses the uncertainty involved in defining the cluster's nature. A value of 1 signifies that the cluster is perfectly isolated [12]. As shown in Figure 5, the network has high modularity (Q = 0.581 > 0.3), which shows that the network is divided into clusters with dense links among the nodes within each cluster. The MS value is 0.363, indicating that the homogeneity of the clusters is not high. This MS value shows that although the studies in each cluster may explore somewhat similar issues, they in fact address different ones [12].
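For readers unfamiliar with the modularity index, the sketch below computes Newman's modularity Q for an undirected, unweighted network as the sum over clusters of (internal links / m) − (cluster degree sum / 2m)², where m is the total number of links. This is the standard textbook form and is offered only as an illustration; CiteSpace's own computation (for example, its handling of weighted links) is assumed rather than confirmed to match, and the six-node network is hypothetical.

```python
def modularity(edges, community):
    """Newman modularity Q for an undirected, unweighted graph.

    edges is a list of (u, v) links; community maps each node to a cluster id.
    """
    m = len(edges)
    degree, internal, degree_sum = {}, {}, {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community[u] == community[v]:
            cluster = community[u]
            internal[cluster] = internal.get(cluster, 0) + 1
    for node, k in degree.items():
        cluster = community[node]
        degree_sum[cluster] = degree_sum.get(cluster, 0) + k
    return sum(
        internal.get(cluster, 0) / m - (total / (2 * m)) ** 2
        for cluster, total in degree_sum.items()
    )


# Hypothetical network: two triangles of keywords joined by a single link.
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),   # cluster 0
    ("d", "e"), ("e", "f"), ("d", "f"),   # cluster 1
    ("c", "d"),                           # bridge between the clusters
]
toy_clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q_value = modularity(toy_edges, toy_clusters)   # 5/14, about 0.357
```

Even this small, clearly separated toy network only reaches Q ≈ 0.357, which gives some intuition for why values above 0.3 are commonly read as a significant community structure.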

The largest cluster (#0) has 25 members and is labeled 'machine learning' by LLR; it includes deep learning, modeling, LSTM, machine learning, framework, hybrid intelligence, back-propagation neural network, prediction, etc. This result means that the enhancement and optimization of ANN has been one of the most popular research interests in recent years, in line with earlier observations; the development of ANN-based deep learning (DL) models is a representative direction. DL is a machine learning technique in which computers are trained to perform tasks that come naturally to humans [150]. Zhou, Xu, Ding, Wei and Zhou [98] combined a wavelet transform noise filter, CNN, and a long short-term memory predictor to propose a DL method. The DL method proposed by Rafiei and Adeli [56] combines unsupervised deep Boltzmann machine learning and BPNN. Actual data were used to verify the proposed DL algorithms, and the results show improved effectiveness and accuracy compared with a single ANN.

**Figure 5.** Timeline visualization of keywords (clustering structure).

The second largest cluster (#1), 'labor and personnel issues', mainly includes research on management, behavior, safety climate, worker, health, etc. The clustering results show that worker safety is another important topic. Especially since 2015, there has been a node explosion in Figure 5 (#1). This result shows that CM has paid great attention to safety and ANN has been widely used in the safety field in recent years. Emerging technologies such as laser scanning and smart sensors have made massive data acquisition a reality, which has provided tremendous support for this development [65]. Yi, Chan, Wang and Wang [94] proposed a system which could be automated by integrating smart sensor technology, location tracking technology and ANN to protect the wellbeing of those who have to work in hot and humid conditions.

In addition, cost estimation (#6) and risk allocation (#7), as the main tasks of traditional building management, were important topics before 2017, but attention to them has gradually declined in recent years. As shown in Figure 5, optimization (#8) is the smallest cluster, indicating that research on ANN applications in this area needs to be strengthened.

Citation bursts reflect the dynamics and evolution of the field by identifying items whose citations increase sharply [152]. The stronger the burst of a keyword, the more attention is paid to it in the time interval considered, and to some extent it represents the research frontier and hotspot in the subject area [153]. Figure 6 shows the top 25 keywords with the strongest citation bursts from 2000 to 2020. The light green line indicates the range of literature years reviewed, while the orange line indicates the duration of a citation burst event.

As shown in Figure 6, keywords with citation bursts can be divided into two categories: methods and issues. Among the methods appearing in the last 20 years, data mining burst first, between 2001 and 2006, followed by fuzzy logic, case-based reasoning, genetic algorithm, and regression analysis. Meanwhile, hybrid neural network, machine learning and artificial intelligence began to appear in the last 10 years, indicating that ANN, as an important representative of machine learning and artificial intelligence, is attracting more and more attention. Machine learning (ML) had the strongest citation burst (3.77), lasting from 2018 to 2020, implying that ML based on ANN represents an emerging theme in research on ANN in CM.


**Figure 6.** Top 25 keywords with the strongest citation bursts in the selected literature.

As to the issues, ANN has been applied to bridge management, project success, risk allocation, organizational capability, transaction and construction cost, bidding, dispute, safety, workers and productivity, etc., which have been hot research topics in the past 20 years. Before 2010, risk allocation had strong citation bursts due to the continuous development of the PPP model [134]. After 2010, Figure 6 shows that cost estimate (burst strength 1.87) and cost and schedule (1.86) had strong citation bursts in the literature. This implies that these were hot topics in the respective years.

#### *3.4. Document Co-Citation Analysis*

The references of frontier manuscripts can represent the knowledge base of a field [154]. Document co-citation analysis (DCA) studies a network of co-cited references. Thus, by analyzing the co-cited references, DCA objectively explores the underlying knowledge base of ANN research in CM. CiteSpace was used to analyze the documents cited in the 112 records. Figure 7 shows the detailed outcome of the document co-citation analysis, i.e., a co-citation network including 497 nodes and 1485 links. Each link represents the co-citation relationship between the two corresponding articles, while the font size represents the co-citation frequency of these documents. The node documents were among the cited documents and were not necessarily included in the 112 retrieved articles.
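The construction of a co-citation network can be sketched as follows: two references are linked whenever they appear together in the reference list of the same citing article. The Python code below is a minimal illustration under that assumption only; CiteSpace additionally applies time slicing and node selection thresholds that are not modeled here, and the reference labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_network(reference_lists):
    """Count citation frequency per reference and co-citation strength per pair."""
    cited = Counter()
    cocited = Counter()
    for references in reference_lists:
        unique = sorted(set(references))        # ignore duplicates within one list
        cited.update(unique)
        cocited.update(combinations(unique, 2))
    return cited, cocited


# Hypothetical reference lists of three citing articles.
citing_articles = [
    ["Ref A", "Ref B", "Ref C"],
    ["Ref A", "Ref B"],
    ["Ref B", "Ref D"],
]
cited, cocited = cocitation_network(citing_articles)
num_nodes = len(cited)      # 4 cited documents
num_links = len(cocited)    # 4 distinct co-cited pairs
```

Note that the nodes are the cited documents ("Ref A" to "Ref D"), not the three citing articles, which is why the nodes of a co-citation network need not belong to the retrieved set.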

**Figure 7.** Document co-citation network of ANN in CM [15,28–137].

The top 10 cited documents are listed in Table 3. These articles were widely recognized by peers and had high value for research on ANN in CM. A systematic review of these 10 high-quality articles reveals the following findings: (1) Except for two studies on the improvement of CNN methods, the remaining research topics are divided equally into cost and safety for CM. (2) It can be seen from the publication dates of top cited documents that cost and safety have been the main research interests of the last two decades. Cost boomed in the last decade, but now the focus has turned to safety.

Four important cost documents cover almost all the research directions of ANN. First, there is research on ANN itself: for example, articles compare traditional prediction methods (regression analysis, etc.) with ANN [155], develop different types of ANN algorithms (MLFN and GRNN) [55], use other algorithms (FL) to improve ANN [35], and establish a database for ANN [99]. Second, there is research on different topics: for example, articles address the cost of different types of construction projects, such as road tunnel construction cost [55] and the cost of reconstruction projects [155]. In addition, price, total cost, maintenance cost and other cost predictions are discussed from different perspectives.

For safety, the top cited literature shows that ANN is mainly used for object safety detection and accident analysis. Specific techniques such as object detection, tracking and action recognition can be used to effectively identify unsafe acts and conditions. A large body of related research in computer vision technology provides the conditions for CNN to achieve accurate object detection. Fang Q developed a CNN model for automatic non-hardhat-use detection [156]. Fang WL proposed an improved and faster approach with CNN features to detect the presence of workers and equipment in real time [157].


**Table 3.** Top 10 highly co-cited papers.
