1. Introduction
Civil infrastructure projects constitute the backbone of the development of urban and rural areas. The significance of civil infrastructure in improving the social quality of life cannot be overstated. Nevertheless, the implementation of such civil infrastructure projects requires managing a complex array of stakeholders and interests [
1]. At the crossroads of development and conservation, civil infrastructure projects encounter an intricate web of interests. Governments advocate for stable infrastructure growth; local communities seek inclusive development; private sectors look for a return on investments; environmentalists press for sustainable practices; and heritage committees prioritize preservation [
2,
3]. Although these various objectives aim toward a common good, they often have different opinions. The unique combination of diverse stakeholders, limited national budget, and societal implications makes projects a fertile ground for conflict, leading to delays in project timelines and even project cancellations [
4,
5]. Social conflict among external stakeholders in civil infrastructure projects has a significant impact on project performance as well as on social governance. For sustainable development of civil infrastructure, it is essential to manage conflict preemptively.
In the era of digitalization, people are more connected to each other than ever. The transition from traditional print and broadcast news to online news media has empowered information dissemination and communication. With the ability to reach millions of people instantly, online news platforms have democratized access to information and played a pivotal role in shaping public discourse [
6]. However, these platforms have a potential drawback, as they can amplify and intensify social conflicts through the swift dissemination of information [
7]. The rapid dissemination of information through news and the subsequent formation of public opinion have a significant impact on the progress of civil infrastructure projects [
8,
9]. Negative public opinion and social conflicts in civil infrastructure projects may lead to cost growth, schedule delay, and even project termination. In particular, conflicts change dynamically during the project execution, and unresolved conflicts cause complex problems in conjunction with subsequent conflicts. Therefore, it is essential to detect key conflict drivers and establish appropriate response strategies in a timely manner for successful project management [
4].
Recent natural language processing (NLP) technology, coupled with cutting-edge artificial intelligence and big data, has reached a level where it can help solve real-world problems [
10]. Its applicability in extracting information and identifying patterns related to social conflicts based on news content is worth exploring. Against this backdrop, this study aimed to extract conflict drivers related to civil infrastructure projects presented in news articles using ChatGPT.
2. Conflict Management in the Construction Sector
The construction industry is not free from social conflicts in modern society. Construction projects are not only technical endeavors but also social systems involving various stakeholders. In particular, the occurrence and level of conflicts during the execution of construction projects have increased as the projects have become more complex and modern society is hyper-connected [
1,
8,
9].
Understanding social conflicts is crucial as they can significantly affect project performance. Therefore, predicting potential stakeholder reactions is critical for stable and timely project implementation [
11]. Many studies attempted to understand the nature of stakeholders in conflict situations because successful stakeholder management is a key success factor of complex projects [
12]. According to Freeman [
13], stakeholders are defined as “any group or individual who can affect or is affected by the achievement of a corporation’s purpose”. In the context of infrastructure projects, this study refers to stakeholders as any group or individual who can affect or is affected by the achievement of an infrastructure project’s objective. There are two types of stakeholders in construction projects: internal and external stakeholders. Internal stakeholders are contractually involved in the project, while external stakeholders are mainly parties without any legal relationship [
14]. In the construction domain, previous conflict-related studies primarily focused on internal stakeholders by addressing contractual and technical disputes, typically represented by claims among stakeholders [
12,
15,
16,
17,
18,
19,
20].
Although civil infrastructure projects suffer from social conflicts including external stakeholders, there is a limited amount of research that attempted to study social conflicts in construction that included external stakeholders. In previous research, social conflict in civil infrastructure projects has been expressed in various terms such as social risk, social acceptance, and stakeholder satisfaction. Social conflict is influenced by various factors in accordance with the nature of the project. Previous studies identified conflict drivers for civil infrastructure projects using literature reviews and case studies. In particular, these approaches have been instrumental in delving into specific cases and deriving meaningful insights. For example, Lee et al. (2017) conducted 22 retrospective case studies and identified 15 conflict drivers in 49 conflict events [
4]. They presented conflict scenarios of civil infrastructure projects and conflict propagation pathways of each scenario. Min et al. (2018) identified 18 conflict drivers based on two cases using grounded theory and the paradigm model [
21]. They proposed a conflict analysis framework considering the causes of conflict and characteristics of civil infrastructure projects. In addition, Oppong’s research team conducted a systematic literature review of stakeholder management performance attributes in construction projects and identified 18 performance objectives and 22 performance indicators [
11]. Then, they evaluated which factors contributed to successful external stakeholder management in a subsequent study [
22].
The nature of conflict and its origins, the motivations behind collective action, and the consequences of such conflicts have attracted the interest of researchers, practitioners, public officers, and policymakers. But at the same time, literature reviews revealed a research gap in the methodologies employed in analyzing social conflicts in construction. Most previous studies on social conflict to date have been grounded in qualitative analysis. These studies often delved deeply into the conflict phenomena in specific conflict cases through interviews, surveys, and case studies. These qualitative approaches have provided context-specific insights for understanding conflict phenomena in civil infrastructure projects. Nevertheless, they are limited in generalizability and broad perspectives. Furthermore, it is crucial to detect and mitigate conflict drivers as early as possible for the sustainable implementation of the project. Although previous case-driven studies produced meaningful fruits, to the authors’ best knowledge, there is no scientific system for timely conflict management in practice. It is necessary to monitor and mitigate conflicts during the project execution in a timely manner. To address the limitations of the previous studies, this study aimed to pave the way for data-driven conflict analysis. By utilizing cutting-edge NLP technology and a large volume of textual data, this study extracted conflict drivers that occurred during the implementation of civil infrastructure projects.
3. Methodology
This study aimed to establish an automated process and method for detecting conflict drivers using ChatGPT.
Figure 1 shows the conflict driver detection process applied in this study. This study collected Korean news articles related to civil infrastructure projects using a web crawler developed by the authors. Then, ChatGPT was utilized for keyphrase extraction (KPE) and keyphrase classification (KPC) to identify conflict drivers within a given text. If input text is relevant to a conflict phenomenon, ChatGPT was requested to respond to a list of the five most relevant conflict-related keyphrases. After extracting keyphrases from entire datasets, the authors required ChatGPT to select the most relevant factor among a set of predefined classes, which consists of 18 conflict drivers and 3 other factors. Methodological background and detailed setup for the conflict driver detection are described in the following subsections and
Section 4.
3.1. Web Crawling
Web crawling is a technique that automatically collects and stores specific information on web pages [
23]. A web crawler starts with a list of URLs to visit. It connects to each website and crawls the predefined contents by parsing a hypertext markup language (HTML) document. Once the crawler has accessed the content of a page, it stores the crawled content in a temporary format in a database. Web crawling has the characteristics of an exhaustive survey, and it has the advantage of being able to obtain a large amount of data quickly and accurately beyond the limitations of manual data collection [
24]. Since this study used news articles, the web crawler used in this study was designed to crawl a title and body of a news article, the published date, and its URL.
3.2. ChatGPT-Based Keyphrase Extraction and Classification
KPE is an automated process of identifying the most relevant and representative phrases from text input. Although research on the development of a KPE model by itself is an interesting research object, this study utilized an existing text model for KPE. Recent large language models have shown superior performance on NLP tasks, including the KPE task [
25]. There are two representative text models in the KPE tasks, namely, ChatGPT [
26] and KeyBART [
27]. This study employed the ChatGPT model as a keyphrase generator to identify points of conflict in relation to infrastructure from a collection of news articles for the following reasons. First, the training datasets of ChatGPT are multi-domain documents while that of KeyBART are scientific documents. Thus, ChatGPT covers a wider range of topics compared to KeyBART in terms of natural language understanding. Second, the maximum length of input tokens of ChatGPT is 4096 for the GPT-3.5 version, while that of KeyBART is 1024. Therefore, ChatGPT can be utilized with longer texts such as scientific documents or news articles. SemEval2010 is a widely used long scientific document dataset for KPE tasks, and DUC2001 is a dataset for KPE consisting of long news articles. ChatGPT had markedly better performance for both representative datasets due to its higher input token limit [
25]. Third, ChatGPT is adjustable as the keyphrases are generated by user prompts, while KeyBART focuses on extracting keyphrases related to the main theme of input text. Appropriate keyphrases within a text may vary depending on the perspective. Owing to the intrinsic properties of its conversation-based generative AI, a user can guide ChatGPT in extracting keyphrases from a predefined perspective. In this way, ChatGPT performs KPE specific to any purpose without additional fine-tuning. Since this study aimed to identify domain-specific keyphrases focusing on conflict from long news articles, the authors determined that ChatGPT is more appropriate than KeyBART.
KPC is a process to classify the extracted keyphrases into predefined categories, which is a kind of text classification task in NLP. Most existing studies developed text models for text classification based on data in accordance with the purpose of each study [
28]. However, the previous approach suffered from the lack of data and labor-intensive manual annotation to train a model [
29]. Meanwhile, recent attempts aiming to evaluate the performance of ChatGPT reported that ChatGPT outperforms manual annotation in text classification [
30,
31]. Thus, this study utilized ChatGPT for not only KPE but also KPC.
ChatGPT is a state-of-the-art deep learning model for natural language understanding and generation introduced by OpenAI [
26]. As its name states, the generative pre-trained transformer (GPT) is a pretrained language model designed to predict which word will most appropriately follow a given text. ChatGPT is a fine-tuned model of the GPT series that generates coherent and contextually relevant responses to a given text called a prompt. ChatGPT is able to process various NLP tasks in accordance with a user request through prompt engineering. There are two kinds of prompts: system prompts and user prompts. The system prompt is a pre-configured text that is sent to the model when a new interaction starts. A user can instruct the model on the task it should perform through a system prompt. The user prompt is text that a user inputs to elicit a response from the model. The model takes this user prompt and generates a response based on pretrained data [
32]. This study first asked ChatGPT to extract conflict-related keyphrases from a given article. Then, the authors required ChatGPT to classify the key phrases, which is the response of the previous step, into one of the predefined categories.
4. Conflict Driver Extraction and Classification
4.1. Conflict Drivers
There are no standardized conflict drivers related to civil infrastructure projects. Previous studies have classified conflict drivers into several categories, such as economic, social, institutional, technical, cognitive, and environmental [
4,
5,
11,
21,
22,
33,
34,
35,
36]. Based on the literature cited above, this study identified and categorized conflict drivers into 18 categories as presented in
Table 1.
4.2. Data Collection
The authors collected news articles related to civil infrastructure projects using web crawling. With the digital transformation, the news consumption of modern society has intensified its dependence on digital news aggregators such as portal sites. This study collected news articles through the portal site Naver (
www.naver.com, accessed on 31 August 2023). When looking into the Korean domestic internet news channels, 89.8 percent of users access news articles through digital news aggregators, and Naver has the largest share of portal sites in Korea [
37].
In order to select the target infrastructure projects, the authors conducted a preliminary investigation considering the level of conflict and the project implementation period. Consequently, this study selected five representative infrastructure projects that experienced nationwide conflicts in the Republic of Korea: the Cheonseong Mountain Tunnel (Case #1), Gadeok Island New Airport (Case #2), Ilsan Bridge (Case #3), Jeju 2nd Airport (Case #4), and Miryang Transmission Tower (Case #5) projects. This study collected news articles using the five infrastructure project names as queries in Korean (
Figure 1). As a result, a total of 50,801 news articles published by 115 news media from 2001 to 2022 were collected (
Table 2). The number of published news articles in each case jumped up by year when social issues related to each infrastructure arose. Case #1 and Case #5 suffered from social conflicts at the project initiation stage in 2005 and 2013, respectively. Case #2 attracted public attention ahead of the election as it was mentioned in the election pledges of past presidential candidates in 2016 and 2020. Case #3 is a public-private partnership (PPP) project that was completed in 2007 and has been in operation since 2008. Due to the nature of road infrastructure, news articles such as traffic information have been published consistently. In the meantime, news articles increased rapidly as the competent authority raised the toll issue in 2021. Case #4 is still under debate over whether to proceed with the project and has been since 2015.
4.3. Data Annotation
To measure the performance of KPC using ChatGPT, a human-crafted gold standard consisting of 2000 keyphrases was developed. The authors randomly sampled keyphrases from the outputs of ChatGPT and annotated them as a gold standard. Two graduate students in civil engineering independently reviewed and labeled the keyphrases. The annotators were asked to assign one of the most relevant factors among the 18 conflict drivers and 3 other factors to each keyphrase. Conflict drivers were identified through a literature review. In addition, the authors defined three more factors based on the results of KPE: “Stakeholder (O-01)”, “Project attribute (O-02)”, and “Undefined (O-03)”. “Stakeholder” indicates both internal and external stakeholders, such as central and local governments, public authorities, engineering and construction companies, the public and residents, and non-governmental organizations. “Project attribute” is project-related factual information, including the project name, scope, size, scheme, and the nature of project itself. Lastly, “Undefined” plays a dummy role that is used to contain irrelevant keyphrases. These factors are not conflict drivers, but they are foundational information related to the conflict phenomenon of infrastructure projects.
As a result, 1624 conflict drivers (81.2%) and 376 other factors (18.8%) were annotated. Due to the nature of social conflict in civil infrastructure projects, the annotated factors were imbalanced. In particular, opposite movement and response (D-07) was the most (29.1%) among conflict drivers followed by ecological issue (D-03) (14.6%). The rest of the conflict drivers were less than 5% each. Meanwhile, 10.9% of the keyphrases were related to the project attribute (O-02), and 5.7% of the keyphrases were not related (O-03) to the predefined categories.
4.4. ChatGPT API Setting
The KPE and KPC process in this study was conducted through an API service with the “gpt-4.0” provided by OpenAI. There were several parameters used to obtain appropriate answers from ChatGPT. This study used the default setting except “Temperature”. It ranges from 0 to 1 and controls the randomness of the outcomes generated by ChatGPT [
38]. A higher temperature returns more random text while a lower temperature value makes ChatGPT become more deterministic. In this study, the temperature was set to 0 for consistent results.
For the KPE, this study used the following text as a system prompt: “You will be provided with an article, and your first task is to identify whether the provided article is related to social conflict during the implementation of the infrastructure project, and if your response to the first task is yes, your second task is to extract a list of the five related keyphrases at most that represent a point of social conflict in relation to the infrastructure mentioned in the article”. Each news article was used as a user prompt and ChatGPT returned a list of keyphrases related to the conflict within a text if the given article is associated with the social conflict.
For the KPC, the authors granted a classifier role to ChatGPT by using the following text as a system prompt: “You are a classifier, and you will be required to solve a given problem”. Then, the authors made ChatGPT select one of the most relevant factors with a given keyphrase by using the following template as a user prompt: “Select the options most relevant to [keyphrase] from the list below. (1) Communication issue, (2) Compensation issue, ∙∙∙”.
5. Results
This study evaluated the KPC performance by calculating the precision, recall, and F1-score. The F1-score is a metric that combines both precision and recall, providing a single measure of a model’s performance. The F1-score ranges from 0% to 100%, where a score of 100% indicates perfect precision and recall, and a score of 0% indicates that neither precision nor recall has been achieved. Precision, recall, and F1-score are based on a confusion matrix which consists of true positive (TP), false positive (FP), true negative (TN), and false negative (FN), as described in
Table 3.
Precision is the ratio of
TP predictions to all the samples a model predicted as positive (Equation (1)), while
recall is the ratio of
TP predictions to all the actual positive samples (Equation (2)). The
F1-score is a harmonic mean of
precision and
recall (Equation (3)). In addition, there are three ways to calculate the average, namely, micro average, macro average, and weighted average. The micro average considers all categories collectively; the macro average calculates the metric independently for each category and then takes the average; and the weighted average takes the average using the number of samples in each category as weights.
Table 4 shows the performances of KPC using ChatGPT. The micro, macro, and weighted average F1-scores were 85.7%, 83.6%, and 84.7%, respectively. As a result, the ChatGPT-based KPC showed a notable performance in identifying conflict drivers from news articles. In particular, ecological issue (D-03), opposite movement and response (D-07), perception and emotional issue (D-08), professional investigation issue (D-09), project organization issue (D-11), and public–private partnership issue (D-12) showed a good performance with F1-scores over 90%. Meanwhile, compensation issue (D-02), project objective issue (D-10), and technical issue (D-17) showed a somewhat insufficient performance with F1-scores below 80%. In addition, the recall of 15 conflict drivers was over 90% and that of all conflict drivers except four factors (D-07, D-08, D-09, and D-14) was computed to be greater than or equal to the precision value; that is, the ChatGPT-based KPC is practical for detecting actual conflict drivers with high performance.
The authors compared the KPC performance with other models that aimed to recognize words or phrases and classify them into predefined categories in the construction domain (
Table 5). The performance of word/phrase classification varies depending on multiple factors such as research objective, document type, the number of classes, language, and sample distribution [
39]. KPC in this study resulted in suboptimal performance compared to other research records. This could be attributed to the variability of the classification object. Keyphrases used in this study consisted of various lengths from two words to a full sentence. Also, the keyphrase distribution was imbalanced, which might disrupt the text classification performance. Nevertheless, the performance of KPC in this study is notable in that it utilized existing LLM without further training. Developing a domain-specific NLP model is constrained due to the required computing power and available dataset. In this circumstance, it is necessary to consider the trade-off between the cost needed for the model improvement and performance from the practical useability perspective.
6. Illustrative Case Study and Discussion
For the purpose of evaluating the results qualitatively, the authors analyzed two cases, namely, the Cheonseong Mountain Tunnel (Case #1) and the Gadeok Island New Airport (Case #2) projects. The authors excluded the analysis of keyphrases that belong to stakeholder (O-01), project attribute (O-02), and undefined (O-03) since this study was focused on conflict drivers.
6.1. Conflict over the Cheonseong Mountain Tunnel Project
The construction of the Cheonseong Mountain Tunnel is part of the high-speed railway project that passes through Cheonseong Mountain in the Republic of Korea (
Table 6). Its official name is the Wonhyo Tunnel, but it is better known as the Cheonseong Mountain Tunnel because of the conflict that occurred during its construction. Construction started in 2003 and was completed in 2008. The conflict arose when religious believers living on Cheonseong Mountain raised concerns about the disruption of their religious practice environment and the destruction of the ecosystem. The symbol of opposition to the construction was the salamander. Religious believers and environmental groups argued that the construction would destroy the salamander habitat and ecosystem around Cheonseong Mountain. With a religious believer’s lawsuit and hunger strike, the conflict spread nationwide. The government and the opposition made efforts to resolve the conflict; they formed a committee to review alternative routes together and re-implemented the environmental impact assessment. However, they failed to reach an agreement. Eventually, the Korean Supreme Court made a decision to dismiss the lawsuit and the construction resumed.
Figure 2 presents the number of the identified keyphrases of each conflict driver from 2002 to 2016. A total of 4338 keyphrases were identified over 15 years, and 78.4% of the keyphrases appeared over the 4 years from 2003 to 2006, indicating that conflict was serious during this period. As shown in
Table 6, beginning in late 2003, the conflict had intensified and spread nationwide with the opposition movement of religious believers and environmental groups including the lawsuit. This indicates that the results of the KPE align with actual conflict situations.
The identified keyphrases show in more detail which conflict drivers were the dominant and root causes of the conflict.
Table 7 presents the representative keyphrases related to the Cheonseong Mountain Tunnel that accounted for more than 5% of all keyphrases. The most frequently presented conflict driver was the ecological issue (D-03) followed by the opposite movement and response (D-07) and the technical issue (D-17). The keyphrases classified as D-03, D-07, and D-17 accounted for 61.4% of all keyphrases. The identified keyphrases related to D-03 and D-17 represent concerns about the destruction of the Cheonseong Mountain ecosystem due to groundwater leakage caused by the tunnel construction. In addition, the opposition movement and response shows that actual conflict events occurred during the implementation of the project.
6.2. Conflict over the Gadeok Island New Airport Project
The Gadeok Island New Airport Project was first proposed around 1990 but is still ongoing (
Table 8). This project was promoted to meet the growing demand for airport infrastructure in the southeastern region of South Korea. The main issue of this project is site selection. Due to PIMFY (Please in My Front Yard) syndrome, local governments and residents have confronted each other to attract a new airport in their neighborhood. Previous presidents and members of the National Assembly promoted policies related to the new airport in the southeastern region, but they could not easily reach a social agreement. As the conflict intensified, various questions arose about the pros and cons of each alternative. The latest alternatives were the expansion of an existing airport (Gimhae Airport) and the construction of a new airport on Gadeok Island. As of February 2021, a special law was enacted for the construction of a new airport on Gadeok Island, and construction is scheduled to break ground in 2024.
Figure 3 presents the KPE results of the Gadeok New Airport Project from 2008 to 2022. A total of 2783 keyphrases were identified over 15 years, and the appearance of keyphrases jumped in 2016 and 2020. This indicates that there was significant controversy for the decision of the Ministry of Land, Infrastructure and Transport in 2016 and the special law for the construction of the Gadeok Island New Airport in 2020 as shown in
Table 8. While the conflict in the Cheonseong Mountain Tunnel case was focused on ecological and technical issues, there have been diverse issues that arose in the Gadeok Island New Airport Project.
Table 9 presents the representative keyphrases related to the Gadeok Island New Airport Project that accounted for more than 5% of all keyphrases. The most frequently presented conflict driver was the project objective issue (D-10). Most of the extracted keyphrases are related to the necessity of a new airport in the southeastern region of South Korea and the balance of national development. This is because the international airport that is actually active in the mainland of South Korea is far from the southeastern region. Stakeholders who wish to attract new airport infrastructure propagated why it should be located in their neighborhood and how it would contribute to regional development. While project objective issues (D-10) emphasize the necessity of a new airport in a specific region, ecological issues (D-03) represent why another alternative site is not suitable for a new airport. In addition, keyphrases of facility operation and utilization issues (D-06) show future plans for how the new airport will be operated and associated with nearby infrastructure. In the Gadeok Island New Airport case, perception and emotional issues (D-08) appeared as both positive and negative perspectives because opinions for and against each alternative were mixed. Lastly, most of the keyphrases related to the laws, institution, and guideline issue (D-14) indicate the enactment and passage of the Gadeok Island New Airport Special Law.
6.3. Discussion
This study evaluated the KPC performance quantitatively using the F1-score and confirmed the KPE performance qualitatively through two case studies. The results showed that the presented process and methods are viable in identifying which conflict drivers are issues and monitoring them in order to manage them in practice.
Recent NLP research in the computer science domain has been progressing toward developing a large language model (LLM) that comprehensively understands human language and can be used for general purposes. Compared to previous text-based studies in the construction domain, the recent generative LLM showed usefulness in reducing the burden of labor-intensive annotation tasks while resulting in higher performance [
28]. It can help researchers efficiently extract latent information and meaningful insights from text data by omitting the process of training a text model from scratch or fine-tuning a pre-trained language model.
When utilizing an interactive LLM like ChatGPT, the prompt has a significant effect on outputs. The authors finalized the prompts used in this study after much trial and error. A lesson learned through trial and error is that prompts should be as specific as possible about the output the user expects. Although this study performed prompt engineering manually, there were recent attempts to apply ML technology to improve prompt engineering [
47]. Because of the difficulty of fine-tuning an LLM due to its enormous model size, utilizing ML-based prompt engineering on existing LLMs can be an efficient approach to solve domain-specific problems.
Most previous studies of conflict in construction have been retrospective, based on case studies and literature reviews. This study contributes to the shift in conflict research from a qualitative to a quantitative approach and from post-evaluation to preemptive mitigation. For example, the frequency of each conflict driver can be used to determine its importance and management priority. The illustrative case studies showed that there were symptoms of conflict beforehand, and it gradually intensified. In the Cheonseong Mountain Tunnel case, the frequency of dominant conflict drivers first popped up significantly in the first quarter of 2003, as shown in
Figure 4, and it reached the highest level in the first quarter of 2005. If practitioners had noticed the signal of conflict beforehand, they could have seized the golden hour for mitigation. Thereafter, the early mitigation of conflict might decrease the negative impact on the project execution.
In addition, there are various textual data representing the conflict phenomenon, such as social network service (SNS) data, although this study used only news articles. Such real-time online data can be used to assess public acceptance immediately and how it is changing. By monitoring the public acceptance of civil infrastructure projects in real-time, practitioners can respond proactively.
As a point of departure for data-driven conflict management, this study introduced a framework for automatically identifying conflict drivers from textual data using LLMs. Through this study, the authors confirmed the viability of developing an automated system that monitors the social conflict related to civil infrastructure projects in real-time based on online information. Although this study utilized an existing LLM without any modification, this study, as applied research, is novel in that it presents a framework and illustrative cases for NLP-powered conflict management. Data-driven conflict analysis can help construction managers manage social conflicts by generalizing conflict drivers and events, propagation paths, and response strategies. Consequently, it is expected to reduce not only construction costs but also social costs caused by conflicts.
7. Conclusions
This study presented an automated process and method for detecting conflict drivers using ChatGPT. The authors developed a web crawler to collect news articles, which are data sources for identifying conflict drivers. Using project names as queries, this study collected online news articles related to civil infrastructure projects implemented in South Korea. Then, this study utilized ChatGPT to extract conflict-related keyphrases from the article collections and classify the extracted keyphrases into predefined conflict drivers. The performance of KPC was measured by micro, macro, and weighted average F1-scores of 85.7%, 83.6%, and 84.7%, respectively. In particular, the recall values of 15 out of the 18 conflict drivers were above 90%, and the recall values of all but 4 conflict drivers were calculated to be greater than or equal to the precision values. This indicates that the output of ChatGPT-based KPC is useful for practical conflict analysis. To evaluate the performance of KPE, this study conducted two illustrative case studies using actual projects implemented in South Korea. As a result, the extracted keyphrases satisfactorily described the actual conflict phenomena.
The results of this study contribute to performing timely data-driven conflict management. The occurrence of conflict during the project implementation leads to cost overruns and schedule delays. If the conflict intensifies and spreads nationwide, it may end up with project cancellation. Therefore, it is crucial to identify and hedge conflict drivers as early as possible for the success of the project. Nevertheless, to the best of the authors’ knowledge, there is no scientific system for the timely monitoring and management of conflicts in practice. Using the process and methods presented in this study, practitioners can quickly identify which conflict drivers are emerging and what the point of the issue is based on up-to-date news. In addition, this study contributes to the body of knowledge in conflict management by laying the groundwork for facilitating data-driven conflict analysis. The qualitative approaches of previous studies have limitations in generalizing based on comprehensive analysis. By using a large amount of news data and state-of-the-art NLP technology, researchers would be able to study the conflicts more efficiently. The authors expect the result of this study will prime the pump for data-driven conflict analysis in the civil engineering and management domain by overcoming the limitations of previous qualitative approaches.
There is also room for improvement in this study. First, the degree of news digitalization was not considered. When collecting historical news data, the frequency of news publication varies depending on the activity level of online news. It is necessary to adjust the frequency-based significance of conflict drivers by considering the publication time. Second, the identification of other conflict-related factors is required for in-depth conflict analysis, such as stakeholder type, specific conflict events, and conflict impact on project performance. Third, this study collected news data from a major portal site, which may omit smaller news media. Empirically, residents’ opinions and perceptions toward nearby construction projects are often first expressed in local news media. It is more difficult to mitigate conflicts as they intensify. Therefore, it is necessary to expand data sources to include local news media in order to detect earlier signals of latent conflict. In future research, the authors will focus on quantifying the conflict index based on the results of this study by complementing the aforementioned limitations. Thus, subsequent studies will expand the body of knowledge on conflict management from qualitative to quantitative analyses and from case studies to data-driven studies.
Author Contributions
Conceptualization, S.B., D.N., J.W. and S.H.H.; methodology, S.B. and D.N.; data curation, D.N. and J.W.; writing—original draft preparation, S.B.; writing—review and editing, S.H.H. All authors have read and agreed to the published version of the manuscript.
Funding
This research was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (No. NRF-2022R1A2C1012018).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Flyvbjerg, B. What You Should Know about Megaprojects and Why: An Overview. Proj. Manag. J. 2014, 45, 6–19. [Google Scholar] [CrossRef]
- Xue, J.; Shen, G.Q.; Yang, R.J.; Zafar, I.; Ekanayake, E.M.A.C. Dynamic Network Analysis of Stakeholder Conflicts in Megaprojects: Sixteen-Year Case of Hong Kong-Zhuhai-Macao Bridge. J. Constr. Eng. Manag. 2020, 146, 4020103. [Google Scholar] [CrossRef]
- Sovacool, B.K.; Hess, D.J.; Cantoni, R.; Lee, D.; Brisbois, M.C.; Walnum, H.J.; Dale, R.F.; Rygg, B.J.; Korsnes, M.; Goswami, A.; et al. Conflicted Transitions: Exploring the Actors, Tactics, and Outcomes of Social Opposition against Energy Infrastructure. Glob. Environ. Chang. 2022, 73, 102473. [Google Scholar] [CrossRef]
- Lee, C.; Won, J.W.; Jang, W.; Jung, W.; Han, S.H.; Kwak, Y.H. Social Conflict Management Framework for Project Viability: Case Studies from Korean Megaprojects. Int. J. Proj. Manag. 2017, 35, 1683–1696. [Google Scholar] [CrossRef]
- Mok, K.Y.; Shen, G.Q.; Yang, R.J.; Li, C.Z. Investigating Key Challenges in Major Public Engineering Projects by a Network-Theory Based Analysis of Stakeholder Concerns: A Case Study. Int. J. Proj. Manag. 2017, 35, 78–94. [Google Scholar] [CrossRef]
- Braun, J.; Gillespie, T. Hosting the Public Discourse, Hosting the Public: When Online News and Social Media Converge. Journal. Pract. 2011, 5, 383–398. [Google Scholar] [CrossRef]
- Hornik, J.; Shaanan Satchi, R.; Cesareo, L.; Pastore, A. Information Dissemination via Electronic Word-of-Mouth: Good News Travels Fast, Bad News Travels Faster! Comput. Human Behav. 2015, 45, 273–280. [Google Scholar] [CrossRef]
- Ninan, J.; Clegg, S.; Mahalingam, A. Branding and Governmentality for Infrastructure Megaprojects: The Role of Social Media. Int. J. Proj. Manag. 2019, 37, 59–72. [Google Scholar] [CrossRef]
- Zeitzoff, T. How Social Media Is Changing Conflict. J. Conflict Resolut. 2017, 61, 1970–1991. [Google Scholar] [CrossRef]
- Teubner, T.; Flath, C.M.; Weinhardt, C.; van der Aalst, W.; Hinz, O. Welcome to the Era of ChatGPT et al.: The Prospects of Large Language Models. Bus. Inf. Syst. Eng. 2023, 65, 95–101. [Google Scholar] [CrossRef]
- Oppong, G.D.; Chan, A.P.C.; Dansoh, A. A Review of Stakeholder Management Performance Attributes in Construction Projects. Int. J. Proj. Manag. 2017, 35, 1037–1051. [Google Scholar] [CrossRef]
- Beringer, C.; Jonas, D.; Georg Gemünden, H. Establishing Project Portfolio Management: An Exploratory Analysis of the Influence of Internal Stakeholders’ Interactions. Proj. Manag. J. 2012, 43, 16–32. [Google Scholar] [CrossRef]
- Freeman, R.E. Strategic Management: A Stakeholder Approach; Cambridge University Press: Cambridge, UK, 2010; ISBN 0521151740. [Google Scholar]
- Bonke, S.; Winch, G. Project Stakeholder Mapping: Analyzing the Interests of Project Stakeholders. In The Frontiers of Project Management Research; Project Management Institute, PMI: Newtown Square, PA, USA, 2002; pp. 385–405. [Google Scholar]
- Ock, J.H.; Han, S.H. Lessons Learned from Rigid Conflict Resolution in an Organization: Construction Conflict Case Study. J. Manag. Eng. 2003, 19, 83–89. [Google Scholar] [CrossRef]
- Jaffar, N.; Abdul Tharim, A.H.; Shuib, M.N. Factors of Conflict in Construction Industry: A Literature Review. Procedia Eng. 2011, 20, 193–202. [Google Scholar] [CrossRef]
- Harmon, K.M.J. Conflicts between Owner and Contractors: Proposed Intervention Process. J. Manag. Eng. 2003, 19, 121–125. [Google Scholar] [CrossRef]
- Panagiotis, M.; Gregory, H. Model for Understanding, Preventing, and Resolving Project Disputes. J. Constr. Eng. Manag. 2001, 127, 223–231. [Google Scholar] [CrossRef]
- Kassab, M.; Hipel, K.; Hegazy, T. Conflict Resolution in Construction Disputes Using the Graph Model. J. Constr. Eng. Manag. 2006, 132, 1043–1052. [Google Scholar] [CrossRef]
- Awwad, R.; Barakat, B.; Menassa, C. Understanding Dispute Resolution in the Middle East Region from Perspectives of Different Stakeholders. J. Manag. Eng. 2016, 32, 05016019. [Google Scholar] [CrossRef]
- Min, J.H.; Jang, W.; Han, S.H.; Kim, D.; Kwak, Y.H. How Conflict Occurs and What Causes Conflict: Conflict Analysis Framework for Public Infrastructure Projects. J. Manag. Eng. 2018, 34, 04018019. [Google Scholar] [CrossRef]
- Oppong, G.D.; Chan, A.P.C.; Ameyaw, E.E.; Frimpong, S.; Dansoh, A. Fuzzy Evaluation of the Factors Contributing to the Success of External Stakeholder Management in Construction. J. Constr. Eng. Manag. 2021, 147, 04021142. [Google Scholar] [CrossRef]
- Kovacevic, M.; Nie, J.-Y.; Davidson, C. Providing Answers to Questions from Automatically Collected Web Pages for Intelligent Decision Making in the Construction Sector. J. Comput. Civ. Eng. 2008, 22, 3–13. [Google Scholar] [CrossRef]
- Ferrara, E.; De Meo, P.; Fiumara, G.; Baumgartner, R. Web Data Extraction, Applications and Techniques: A Survey. Knowl.-Based Syst. 2014, 70, 301–323. [Google Scholar] [CrossRef]
- Martínez-Cruz, R.; López-López, A.J.; Portela, J. ChatGPT vs State-of-the-Art Models: A Benchmarking Study in Keyphrase Generation Task. arXiv 2023, arXiv:2304.14177. [Google Scholar]
- Ouyang, L.; Wu, J.; Jiang, X.; Almeida, D.; Wainwright, C.L.; Mishkin, P.; Zhang, C.; Agarwal, S.; Slama, K.; Ray, A.; et al. Training Language Models to Follow Instructions with Human Feedback. In Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS 2022), New Orleans, LA, USA, 28 November–9 December 2022. [Google Scholar]
- Kulkarni, M.; Mahata, D.; Arora, R.; Bhowmik, R. Learning Rich Representation of Keyphrases from Text. In Findings of the Association for Computational Linguistics: NAACL 2022, Online, Seattle, WA, USA, 10–15 July 2022; Association for Computational Linguistics: Stroudsburg, PA, USA, 2022; pp. 891–906. [Google Scholar] [CrossRef]
- Baek, S.; Jung, W.; Han, S.H. A Critical Review of Text-Based Research in Construction: Data Source, Analysis Method, and Implications. Autom. Constr. 2021, 132, 103915. [Google Scholar] [CrossRef]
- Ding, Y.; Ma, J.; Luo, X. Applications of Natural Language Processing in Construction. Autom. Constr. 2022, 136, 104169. [Google Scholar] [CrossRef]
- Hassani, H.; Silva, E.S. The Role of ChatGPT in Data Science: How AI-Assisted Conversational Interfaces Are Revolutionizing the Field. Big Data Cogn. Comput. 2023, 7, 62. [Google Scholar] [CrossRef]
- Gilardi, F.; Alizadeh, M.; Kubli, M. ChatGPT Outperforms Crowd Workers for Text-Annotation Tasks. Proc. Natl. Acad. Sci. USA 2023, 120, e2305016120. [Google Scholar] [CrossRef]
- OpenAI ChatGPT API Transition Guide. Available online: https://help.openai.com/en/articles/7042661-chatgpt-api-transition-guide (accessed on 9 September 2023).
- Friedl, C.; Reichl, J. Realizing Energy Infrastructure Projects—A Qualitative Empirical Analysis of Local Practices to Address Social Acceptance. Energy Policy 2016, 89, 184–193. [Google Scholar] [CrossRef]
- Li, T.H.Y.; Ng, S.T.; Skitmore, M. Evaluating Stakeholder Satisfaction during Public Participation in Major Infrastructure and Construction Projects: A Fuzzy Approach. Autom. Constr. 2013, 29, 123–135. [Google Scholar] [CrossRef]
- Park, C.Y.; Han, S.; Lee, K.-W.; Lee, Y. Analyzing Drivers of Conflict in Energy Infrastructure Projects: Empirical Case Study of Natural Gas Pipeline Sectors. Sustainability 2017, 9, 2031. [Google Scholar] [CrossRef]
- Liu, Z.-Z.; Zhu, Z.-W.; Wang, H.-J.; Huang, J. Handling Social Risks in Government-Driven Mega Project: An Empirical Case Study from West China. Int. J. Proj. Manag. 2016, 34, 202–218. [Google Scholar] [CrossRef]
- Ministry of Culture, Sports and Tourism (MCST). Opinion Concentration Investigation Report; Korean Ministry of Culture, Sports and Tourism: Sejong, Republic of Korea, 2021. (In Korean)
- OpenAI API Reference–Create Chat Completion. Available online: https://platform.openai.com/docs/api-reference/chat/create (accessed on 9 September 2023).
- Baek, S.; Han, S.H.; Jung, W. Automated Identification of Active Players for International Construction Market Entry Using Natural Language Processing. J. Manag. Eng. 2023, 39, 04023025. [Google Scholar] [CrossRef]
- Moon, S.; Chung, S.; Chi, S. Bridge Damage Recognition from Inspection Reports Using NER Based on Recurrent Neural Network with Active Learning. J. Perform. Constr. Facil. 2020, 34, 04020119. [Google Scholar] [CrossRef]
- Ko, T.; Jeong, H.D.; Lee, G. Natural Language Processing–Driven Model to Extract Contract Change Reasons and Altered Work Items for Advanced Retrieval of Change Orders. J. Constr. Eng. Manag. 2021, 147, 04021147. [Google Scholar] [CrossRef]
- Li, R.; Mo, T.; Yang, J.; Li, D.; Jiang, S.; Wang, D. Bridge Inspection Named Entity Recognition via BERT and Lexicon Augmented Machine Reading Comprehension Neural Model. Adv. Eng. Inform. 2021, 50, 101416. [Google Scholar] [CrossRef]
- Moon, S.; Lee, G.; Chi, S.; Oh, H. Automated Construction Specification Review with Named Entity Recognition Using Natural Language Processing. J. Constr. Eng. Manag. 2021, 147, 04020147. [Google Scholar] [CrossRef]
- Zhang, R.; El-Gohary, N. A Deep Neural Network-Based Method for Deep Information Extraction Using Transfer Learning Strategies to Support Automated Compliance Checking. Autom. Constr. 2021, 132, 103834. [Google Scholar] [CrossRef]
- Jeon, K.; Lee, G.; Yang, S.; Jeong, H.D. Named Entity Recognition of Building Construction Defect Information from Text with Linguistic Noise. Autom. Constr. 2022, 143, 104543. [Google Scholar] [CrossRef]
- Zhou, Y.C.; Zheng, Z.; Lin, J.R.; Lu, X.Z. Integrating NLP and Context-Free Grammar for Complex Rule Interpretation towards Automated Compliance Checking. Comput. Ind. 2022, 142, 103746. [Google Scholar] [CrossRef]
- Liu, X.; Zheng, Y.; Du, Z.; Ding, M.; Qian, Y.; Yang, Z.; Tang, J. GPT Understands, Too. AI Open 2023. [Google Scholar] [CrossRef]
| Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).