Intelligent Checking Method for Construction Schemes via Fusion of Knowledge Graph and Large Language Models
Abstract
1. Introduction
2. Related Work
2.1. Intelligent Checking in Engineering
2.2. Large Language Model Applications
2.3. Overview of Related Work
3. Methods
- 1. Construction Standard Knowledge Graph Module: The concepts in construction standards are analyzed hierarchically from the perspectives of document structure and semantic information. A multi-level representation model for construction standard knowledge, called “Document-Module-Knowledge”, is proposed. This model organizes construction standard knowledge across multiple dimensions and granularities, providing the knowledge foundation for subsequent checking.
- 2. Construction Scheme Parsing Module: The BERT-BiLSTM model is employed for text classification to predict the query type of each construction scheme text statement. The BERT model extracts candidate checking entities from the statement, and a key checking entity is filtered from the candidates based on the predicted query type. This process helps to accurately identify the crucial standard checking points.
- 3. LLM Check Inference Module: Cypher query statements are constructed from the query type and the key checking entity of each construction scheme text statement. The corresponding checking points are retrieved from the standard knowledge graph and combined with the construction scheme text statement through prompt engineering. The resulting prompt is then passed to the LLM for check analysis and result generation.
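As a rough illustration of how the third module might turn parsing results into a graph query, the sketch below selects a Cypher template by predicted query type and fills in the key checking entity as a query parameter. The node labels (`CheckPoint`, `CheckObject`, `Event`), the property names, the two query types and the `build_cypher_query` helper are all hypothetical; the paper's actual schema and query templates are not reproduced here.

```python
from typing import Dict, Optional, Tuple

# Hypothetical Cypher templates, one per predicted query type.
# Node labels, property names and path shapes are illustrative assumptions.
CYPHER_TEMPLATES: Dict[str, str] = {
    # Theme-level query: all events reachable from a check point.
    "check_point": (
        "MATCH (p:CheckPoint {name: $theme})-[*1..2]->(e:Event) "
        "RETURN e.text AS checking_point"
    ),
    # Object-level query: events under one check object of a check point.
    "check_object": (
        "MATCH (p:CheckPoint {name: $theme})-->(o:CheckObject {name: $entity})"
        "-->(e:Event) RETURN e.text AS checking_point"
    ),
}


def build_cypher_query(
    query_type: str, theme: str, key_entity: Optional[str] = None
) -> Tuple[str, Dict[str, str]]:
    """Return a parameterized Cypher statement and its parameters."""
    if query_type not in CYPHER_TEMPLATES:
        raise ValueError(f"unknown query type: {query_type!r}")
    params = {"theme": theme}
    if query_type == "check_object":
        if not key_entity:
            raise ValueError("check_object queries need a key checking entity")
        params["entity"] = key_entity
    return CYPHER_TEMPLATES[query_type], params


query, params = build_cypher_query(
    "check_object", "Drilling Rig Operation", "rotary drilling"
)
print(query)
print(params)
```

Passing the entity as a parameter rather than formatting it into the query string keeps the templates fixed and avoids quoting problems when entity names contain special characters.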
3.1. Construction Standard Knowledge Graph Module
3.1.1. Document Structure Layer
3.1.2. Knowledge Semantic Layer
3.1.3. Combination of Knowledge and Semantic Layers
- Check Point: Describes the types of checking themes of the clause, such as “Installation of Steel Casing” or “Drilling Rig Operation”. This node is connected to either “Event” nodes or “Check Object” nodes.
- Check Object: Refers to a subdivision of the checking theme by the object being checked. For example, the theme of “Drilling Rig Operation” can be divided into “percussion drilling”, “rotary drilling” and so on, according to the type of drilling rig selected. This node is connected to a “Check Point” node at one end and an “Event” node at the other.
3.2. Construction Scheme Parsing Module
3.2.1. Theme Classification of Construction Scheme Text Statement
3.2.2. Entity Extraction of Construction Scheme Text Statement
3.2.3. Key Entity Screening
3.3. LLM Check Inference Module
3.3.1. Checking Point Retrieval
3.3.2. Prompt Engineering
- 1. Setting role parameters: Defining the identity parameters of the LLM, which guide and constrain the model’s thinking scope and language style.
- 2. Prompting domain knowledge: This aims to address gaps in the LLM’s comprehension of specialized domain knowledge.
- 3. Providing task description: Providing clear and concise task descriptions to guide the LLM in completing subsequent tasks.
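The three prompt components can be pictured as slots in a simple template. The following is a minimal sketch under stated assumptions: the `build_check_prompt` helper, the section labels and the example wording are invented for illustration and are not the templates evaluated in the paper. The checking point is left as a placeholder rather than quoting any standard.

```python
from typing import Sequence


def build_check_prompt(
    role: str, checking_points: Sequence[str], statement: str, task: str
) -> str:
    """Assemble role, domain knowledge and task description into one prompt."""
    # Number the retrieved checking points so the LLM can refer to them.
    knowledge = "\n".join(
        f"{i}. {point}" for i, point in enumerate(checking_points, start=1)
    )
    return (
        f"{role}\n\n"
        f"Relevant standard requirements:\n{knowledge}\n\n"
        f"Construction scheme statement:\n{statement}\n\n"
        f"Task: {task}"
    )


prompt = build_check_prompt(
    role="You are a reviewer of highway bridge construction schemes.",
    checking_points=["<checking point retrieved from the knowledge graph>"],
    statement="The sand content of the slurry should be lower than 2%.",
    task="Decide whether the statement complies with each requirement and explain why.",
)
print(prompt)
```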
4. Experiments
4.1. Text Parsing Performance
4.1.1. Data Preparation
4.1.2. Evaluation Metrics
4.1.3. Scheme Theme Classification Performance
4.1.4. Checking Entity Extraction Performance
4.2. Construction Scheme Compliance Checking Performance
4.2.1. Data Preparation
4.2.2. Comparison of Different LLMs
4.2.3. Comparison of Different Prompt Templates
4.2.4. Robustness Experiment
5. Discussion
5.1. Reasoning Ability of LLM
5.2. Construct Comprehensive Knowledge Base
5.3. Application in Practice
6. Conclusions
- 1. This study proposes a method for the compliance checking of construction schemes that combines a knowledge graph and an LLM. To address the lack of vertical domain knowledge in LLMs, a hierarchical knowledge graph of construction standards is constructed. Through a text analysis model and the semantic understanding and computational reasoning capabilities of the LLM, the method automates the parsing of construction schemes, the querying of standard specifications and the review reasoning, thereby achieving the compliance checking of construction schemes.
- 2. An in-depth examination was conducted to assess the influence of various factors, including the LLM base, prompt templates and parsing module errors, on the effectiveness of the method in the checking task. The experimental results demonstrate notable discrepancies in the performance of different LLMs in the checking task, indicating that semantic understanding and calculation capabilities are the key factors affecting checking performance. A well-designed prompt template can more effectively guide the model to think logically, activate its potential ability and thus improve the quality of checks. In addition, our method demonstrates robustness through the use of a two-stage checking process, which effectively mitigates the impact of potential errors in the parsing module. It is also accurate in intelligent checking tasks. However, to meet the practical application requirements of real-world scenarios, it is necessary to further optimize the deep integration strategy between the LLM and domain expertise to enhance the model’s correctness and interpretability.
- 3. In practical application, the method can be transferred to other fields by adapting the domain knowledge base to suit the specific requirements of the new field. This illustrates the high adaptability of our method, which provides a valuable avenue for further exploration of the application of LLMs in vertical domains.
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Lin, J.R.; Guo, J.F. BIM-based automatic compliance checking. J. Tsinghua Univ. (Sci. Technol.) 2020, 60, 873–879.
- Lu, J.W.; Guo, C.; Dai, X.Y.; Miu, Q.H.; Wang, X.X. The ChatGPT After: Opportunities and Challenges of Very Large Scale Pre-trained Models. Acta Autom. Sin. 2023, 49, 705–717.
- Wei, J.; Bosma, M.; Zhao, V.Y.; Guu, K.; Yu, A.W. Finetuned Language Models are Zero-Shot Learners. arXiv 2021, arXiv:2109.01652v3.
- Christiano, P.; Leike, J.; Brown, T.; Martic, M.; Legg, S. Deep reinforcement learning from human preferences. Adv. Neural Inf. Process. Syst. 2017, 49, 4302–4310.
- Li, C.T.; Han, X.; Jiang, R.H.; Yun, P.W.; Hu, P.F. Application and prospects of large models in materials science. Chin. J. Eng. 2024, 46, 290–305.
- Qin, T.; Du, S.H.; Chang, Y.Y.; Wang, C.X. Key Technologies and Emerging Trends of ChatGPT. J. Xi’an Jiaotong Univ. 2024, 58, 1–12.
- Pan, S.R.; Luo, L.H.; Wang, Y.F.; Chen, C.; Wang, J.P. Unifying Large Language Models and Knowledge Graphs: A Roadmap. arXiv 2024, arXiv:2306.08302v3.
- Xu, Y.X.; Yang, Z.B.; Lin, Y.C.; Hu, J.L.; Dong, S.B. Interpretable Biomedical Reasoning via Deep Fusion of Knowledge Graph and Pre-trained Language Models. Acta Sci. Nat. Univ. Pekin. 2024, 60, 62–70.
- Qiao, S.J.; Yang, G.P.; Yu, Y.; Han, N.; Tan, X. QA-KGNet: Language Model-driven Knowledge Graph Question-answering Model. J. Softw. 2023, 34, 4584–4600.
- Eastman, C.; Lee, J.M.; Jeong, Y.S.; Lee, J.K. Automatic rule-based checking of building designs. Autom. Constr. 2009, 18, 1011–1033.
- Zhou, P.; El-Gohary, N. Ontology-based automated information extraction from building energy conservation codes. Autom. Constr. 2017, 74, 103–117.
- Salama, D.M.; El-Gohary, N.M. Semantic text classification for supporting automated compliance checking in construction. J. Comput. Civ. Eng. 2016, 30, 4014106.
- İlal, M.S.; Gunaydin, H.M. Computer representation of building codes for automated compliance checking. Autom. Constr. 2017, 82, 43–58.
- Zhang, J.S.; El-Gohary, N.M. Semantic NLP-based information extraction from construction regulatory documents for automated compliance checking. J. Comput. Civ. Eng. 2016, 30, 4015014.
- Lee, J.; Yi, J.S. Predicting project’s uncertainty risk in the bidding process by integrating unstructured text data and structured numerical data using text mining. Appl. Sci. 2017, 7, 1141.
- Hassan, F.u.; Le, T.; Lv, X. Addressing legal and contractual matters in construction using natural language processing: A critical review. J. Constr. Eng. Manag. 2021, 147, 03121004.
- Hassan, F.u.; Le, T. Automated requirements identification from construction contract documents using natural language processing. J. Leg. Aff. Disput. Resolut. Eng. Constr. 2020, 12, 04520009.
- Zhang, R.C.; El-Gohary, N. A deep neural network-based method for deep information extraction using transfer learning strategies to support automated compliance checking. Autom. Constr. 2021, 132, 103834.
- Song, J.; Lee, J.K.; Choi, J.; Kim, I. Deep learning-based extraction of predicate-argument structure (PAS) in building design rule sentences. J. Comput. Des. Eng. 2020, 7, 563–576.
- Pan, Y.; Zhang, L.M. BIM log mining: Learning and predicting design commands. Autom. Constr. 2020, 112, 103107.
- Wang, X.Y.; El-Gohary, N. Deep learning-based relation extraction from construction safety regulations for automated field compliance checking. In Proceedings of the Construction Research Congress 2022: Computer Applications, Automation, and Data Analytics, Arlington, VA, USA, 9–12 March 2022.
- Zhang, J.; El-Gohary, N.M. Integrating semantic NLP and logic reasoning into a unified system for fully-automated code checking. Autom. Constr. 2016, 73, 45–57.
- Raskin, V.; Hempelmann, C.; Triezenberg, K.E.; Nirenburg, S. Ontology in information security: A useful theoretical foundation and methodological tool. In Proceedings of the Workshop on New Security Paradigms, Cloudcroft, NM, USA, 10–13 September 2001; pp. 53–59.
- Zhong, B.T.; Wu, H.T.; Xiang, R.; Guo, J.D. Automatic Information Extraction from Construction Quality Inspection Regulations: A Knowledge Pattern-Based Ontological Method. J. Constr. Eng. Manag. 2022, 148, 04021207.
- Yang, M.S.; Zhao, Q.; Zhu, L.; Meng, H.N.; Chen, K.H. Semi-automatic representation of design code based on knowledge graph for automated compliance checking. Comput. Ind. 2023, 150, 103945.
- Pauwels, P.; Van Deursen, D.; Verstraeten, R.; De Roo, J.; De Meyer, R.; Van de Walle, R.; Van Campenhout, J. A semantic rule checking environment for building performance checking. Autom. Constr. 2011, 20, 506–518.
- Dimyadi, J.; Pauwels, P.; Amor, R. Modelling and accessing regulatory knowledge for computer-assisted compliance audit. J. Inf. Technol. Constr. 2016, 21, 317–336.
- Wang, S.K.; Zheng, C.M.; Su, X.; Tang, Y.Q. Construction contract risk identification based on knowledge-augmented language models. Comput. Ind. 2024, 157, 104082.
- Zheng, Z.; Zhou, Y.C.; Lu, X.Z.; Lin, J.R. Knowledge-informed semantic alignment and rule interpretation for automated compliance checking. Autom. Constr. 2022, 142, 104524.
- Zhou, P.; El-Gohary, N. Semantic information alignment of BIMs to computer-interpretable regulations using ontologies and deep learning. Adv. Eng. Inform. 2021, 48, 101239.
- Zhong, B.T.; Ding, L.Y.; Luo, H.B.; Zhou, Y.; Hu, Y.Z.; Hu, H.M. Ontology-based semantic modeling of regulation constraint for automated construction quality compliance checking. Autom. Constr. 2012, 28, 58–70.
- Jiang, L.; Shi, J.Y.; Wang, C.Y. Multi-ontology fusion and rule development to facilitate automated code compliance checking using BIM and rule-based reasoning. Adv. Eng. Inform. 2022, 51, 101449.
- Wu, J.; Zhang, J.S. Model validation using invariant signatures and logic-based inference for automated building code compliance checking. J. Comput. Civ. Eng. 2022, 36, 4022002.1–4022002.14.
- Xu, X.; Cai, H. Semantic approach to compliance checking of underground utilities. Autom. Constr. 2020, 109, 103006.
- Zeng, W.; Ren, X.Z.; Su, T.; Wang, H.; Liao, Y. PanGu-α: Large-scale autoregressive pretrained Chinese language models with auto-parallel computation. arXiv 2021, arXiv:2104.12369.
- Wang, S.H.; Sun, Y.; Xiang, Y.; Wu, Z.H.; Ding, S.Y. ERNIE 3.0 Titan: Exploring larger-scale knowledge enhanced pre-training for language understanding and generation. arXiv 2021, arXiv:2112.12731.
- Du, Z.X.; Qian, Y.J.; Liu, X.; Ding, M.; Qiu, J.Z. GLM: General language model pretraining with autoregressive blank infilling. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022; pp. 320–335.
- Zhao, B.; Jin, W.Q.; Ser, J.D.; Yang, G. ChatAgri: Exploring potentials of ChatGPT on cross-linguistic agricultural text classification. Neurocomputing 2023, 557, 126708.
- Liu, X.; Zheng, Y.N.; Du, Z.X.; Ding, M.; Qian, Y.J. GPT understands, too. arXiv 2021, arXiv:2103.10385.
- Liu, X.; Ji, K.X.; Fu, Y.C.; Tian, W.L.; Du, Z.X. P-Tuning v2: Prompt tuning can be comparable to fine-tuning universally across scales and tasks. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022; pp. 61–68.
- Wang, Y.H.; Bai, H.Y.; Meng, X.Y. An Intelligent Practical Exploration of Large Language Model in Library Reference Consulting Service. Inf. Stud. Appl. 2023, 46, 96–103.
- Xu, H.; Zhu, T.Q.; Zhang, L.; Zhou, W.L.; Yu, P.S. Machine Unlearning: A Survey. ACM Comput. Surv. 2024, 56, 9.
- Tan, S.Z.; Zheng, Z.; Lu, X.Z. Exploring and Discussion on the Application of Large Language Models in Construction Engineering. Ind. Constr. 2023, 53, 162–169.
- Zhang, H.Y.; Wang, X.; Han, L.F.; Li, Z.; Chen, Z.R. Research on Question Answering System on Joint of Knowledge Graph and Large Language Models. J. Front. Comput. Sci. Technol. 2023, 17, 2377–2388.
- Wen, Y.L.; Wang, Z.F.; Sun, J.M. MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models. arXiv 2023, arXiv:2308.09729.
- JTG/T 3650-2020; Technical Specifications for Construction of Highway Bridges and Culverts. China Communications Press: Beijing, China, 2020.
- Wei, J.; Wang, X.Z.; Schuurmans, D.; Bosma, M.; Ichter, B. Chain-of-Thought prompting elicits reasoning in large language models. Adv. Neural Inf. Process. Syst. 2022, 35, 24824–24837.
- JTG F80/1-2023; Quality Inspection and Evaluation Standards for Highway Engineering Section 1 Civil Engineering. China Communications Press: Beijing, China, 2023.
- JTG F90-2015; Safety Technical Specifications for Highway Engineering Construction. China Communications Press: Beijing, China, 2015.
Phase | Method | Advantages | Disadvantages | Relevant Literature |
---|---|---|---|---|
Information Extraction | Rule-based methods | High extraction accuracy and efficiency in specific domains | Depends on manually constructed matching templates; poor portability | [12,13] |
Information Extraction | Machine learning-based methods | Reduces human cost, avoids cumbersome rule-making and generalizes well | Requires significant human involvement in complex feature engineering, and results depend closely on feature quality | [15,16,17] |
Information Extraction | Deep learning-based methods | Automatically learns underlying semantic features and is more flexible when handling large-scale data | Relies on learning from large amounts of high-quality annotated data | [18,19,20] |
Information Interpretation | Logic representation-based methods | Formal grammatical structure and consistent patterns with natural language | Limited ability to interpret and represent complex code clauses; high cost of logic construction | [22] |
Information Interpretation | Semantic network-based methods | Provides a structured framework for representing concepts, instances and their relationships and facilitates knowledge sharing | Cannot represent generative rules, requires expert involvement for complex rules or query formulation and is difficult to create and maintain | [23,24] |
Information Interpretation | Knowledge graph-based methods | Stronger semantic relationship expression, efficient data retrieval and flexible knowledge association; supports querying and reasoning | Requires manual effort for designing the schema of knowledge graphs | [25] |
Information Matching | Rule-based methods | Effective and fast in quantitative checking tasks | High human cost, difficult to maintain and unable to understand semantic information | [26,27] |
Information Matching | Semantic vector-based methods | Strong versatility and eliminates human cost | Limited by the quality of the mapping from text to vector space | [29,30] |
Compliance Checking | Semantic reasoning engine-based methods | Good execution, accurate and fast | Poor versatility, high maintenance cost and low flexibility | [22,34] |
Attribute Type | Definition | Example |
---|---|---|
Name | Standard formulated and released by the country and industry | Technical Specifications for Construction of Highway Bridges and Culverts [46] |
Number | Number of standards | JTG/T 3650-2020 |
Level | Level of standards | National level, industry level… |
Application scope | Engineering type to which the standard applies | Highway and bridge construction |
Implementation date | Effective date of the standard | 1 October 2020 |
Status | Whether the standard is currently in force | effective or abolished |
Entity Type | Definition |
---|---|
Object | Objects subject to the clauses, including building components, equipment and materials |
Attribute | Attribute requiring compliance check |
Value | Attribute value |
Comparative word | Comparative relationship between two attributes, or between an attribute and an attribute value |
Condition | Conditional information of constraint clauses |
Action | Compliance behavior adopted under the condition |
Relationship Type | Relationship Expression |
---|---|
has attribute of | [Object]—[has attribute of]—>[Attribute] |
has comparative word | [Attribute]—[has comparative word]—>[Comparative word] |
has attribute value of | [Comparative word]—[has attribute value of]—>[Value] |
constrains | [Condition]—[constrains]—>[Action] |
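To make the relationship expressions concrete, the sketch below encodes one quantitative requirement as a chain of triples following the Object, Attribute, Comparative word, Value pattern. The `Triple` type and helper functions are hypothetical, and the example values (slurry, specific gravity, ≤, 1.10 g/cm3) are borrowed from a construction scheme statement purely for illustration; they are not quoted from any standard clause.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


def quantitative_clause_triples(
    obj: str, attribute: str, comparative: str, value: str
) -> List[Triple]:
    """Chain the three quantitative relationship types for one requirement."""
    return [
        Triple(obj, "has attribute of", attribute),
        Triple(attribute, "has comparative word", comparative),
        Triple(comparative, "has attribute value of", value),
    ]


def format_triple(triple: Triple) -> str:
    """Render a triple in a bracketed head-relation-tail form."""
    return f"[{triple.head}]-[{triple.relation}]->[{triple.tail}]"


triples = quantitative_clause_triples("slurry", "specific gravity", "≤", "1.10 g/cm3")
for triple in triples:
    print(format_triple(triple))
```

A qualitative clause would instead use a single `Triple(condition, "constrains", action)`.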
Type | TextRCNN R/% | TextRCNN P/% | TextRCNN F1/% | BERT R/% | BERT P/% | BERT F1/% | BERT-BiLSTM (Ours) R/% | BERT-BiLSTM (Ours) P/% | BERT-BiLSTM (Ours) F1/% |
---|---|---|---|---|---|---|---|---|---|
Steel casing selection | 100 | 92.59 | 96.15 | 100 | 92.59 | 96.15 | 100 | 100 | 100 |
Steel casing installation | 100 | 97.06 | 98.51 | 93.94 | 96.88 | 95.38 | 96.00 | 96.00 | 96.00 |
Drilling rig setup | 96.55 | 96.55 | 96.55 | 96.55 | 93.33 | 94.92 | 100 | 94.44 | 97.14 |
Drilling operation | 94.92 | 96.55 | 95.73 | 96.61 | 98.28 | 97.44 | 93.33 | 100 | 96.55 |
Standard for slurry after hole cleaning | 97.50 | 100 | 98.73 | 95.00 | 97.44 | 96.20 | 100 | 100 | 100 |
Mud construction requirements | 76.19 | 94.12 | 84.21 | 76.19 | 88.89 | 82.05 | 100 | 76.92 | 86.96 |
Submerged concrete placement | 100 | 90 | 94.74 | 92.59 | 83.33 | 87.72 | 93.75 | 100 | 96.77 |
Overall F1/% | | | 95.73 | | | 94.02 | | | 96.79 |
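The per-class F1 values in this table are consistent with the standard harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below, with a hypothetical `f1_score` helper, recomputes two of the BERT entries from their reported R and P values; it assumes the standard definition, since the paper's metric formulas are not reproduced here.

```python
def f1_score(recall: float, precision: float) -> float:
    """Harmonic mean of precision and recall (inputs and output in percent)."""
    if recall + precision == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# BERT, "Mud construction requirements": R = 76.19, P = 88.89
print(f"{f1_score(76.19, 88.89):.2f}")  # 82.05
# BERT, "Steel casing selection": R = 100, P = 92.59
print(f"{f1_score(100, 92.59):.2f}")  # 96.15
```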
No | Model | R/% | P/% | F1/% |
---|---|---|---|---|
1 | BERT | 83.44 | 86.78 | 85.06 |
2 | BERT+BiLSTM+CRF | 82.39 | 86.76 | 84.52 |
3 | BERT+CRF (ours) | 84.62 | 87.42 | 86.00 |
Statement Type | Content | Checking Points |
---|---|---|
Quantitative Statement | 泥浆PH值控制在8∼10,比重≤1.10 g/cm3,黏度宜为17∼20 s,胶体率在98%以上,含沙率小于2%。(original) The pH value of the slurry should be controlled at 8–10, the specific gravity should be ≤1.10 g/cm3, the viscosity should preferably be 17–20 s, the colloid rate should be higher than 98%, and the sand content should be lower than 2%. (translation) | |
Qualitative Statement | 若有异常变化,首先提高孔内泥浆水头,加大泥浆比重,提起钻头,防止卡钻、埋钻等现象。(original) If there is an abnormal change, first raise the slurry head in the hole, increase the specific gravity of the slurry, and lift the drill bit to prevent it from jamming or being buried. (translation) | |
Model | Quantitative Statements (31), P/% | Qualitative Statements (19), P/% | ALL, P/% | AVG, P/% |
---|---|---|---|---|
ChatGPT-3.5 | 64.52% (20) | 84.21% (16) | 72.00% (36) | 74.37% |
ChatGLM-6B | 58.06% (18) | 89.47% (17) | 70.00% (35) | 73.77% |
ERNIE Bot | 41.94% (13) | 47.37% (9) | 44.00% (22) | 44.66% |
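Judging from the reported numbers, the ALL column pools the correct counts over all 50 statements, while the AVG column appears to be the unweighted mean of the two per-type percentages. This reading is inferred from the table rather than stated in it. The sketch below reproduces the ChatGLM-6B row under that assumption; the `percent` helper is hypothetical.

```python
def percent(correct: int, total: int) -> float:
    """Share of correct results, in percent."""
    return 100.0 * correct / total


# ChatGLM-6B counts from the table: 18 of 31 quantitative, 17 of 19 qualitative.
quant_correct, quant_total = 18, 31
qual_correct, qual_total = 17, 19

quant_p = percent(quant_correct, quant_total)
qual_p = percent(qual_correct, qual_total)
# ALL: pooled over every statement, so the larger group weighs more.
pooled_p = percent(quant_correct + qual_correct, quant_total + qual_total)
# AVG: unweighted mean of the two per-type percentages.
macro_p = (quant_p + qual_p) / 2

print(f"quantitative {quant_p:.2f}, qualitative {qual_p:.2f}")  # 58.06, 89.47
print(f"ALL {pooled_p:.2f}, AVG {macro_p:.2f}")  # 70.00, 73.77
```

The two summaries differ because the quantitative group is larger: pooling pulls the overall figure toward the quantitative result, whereas the unweighted mean treats both statement types equally.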
Template | Quantitative Statements: Number of Checking Points (29) | Quantitative Statements: Number of Correct Results | Qualitative Statements: Number of Checking Points (17) | Qualitative Statements: Number of Correct Results |
---|---|---|---|---|
Template A | 36 | 22 | 23 | 10 |
Template B | 30 | 26 | 19 | 11 |
Type | Checking Results | ||
---|---|---|---|
First stage | irrelevant (8) | relevant (2) | |
Second stage | — | unable to judge (2) | check (0) |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, H.; Yang, R.; Xu, S.; Xiao, Y.; Zhao, H. Intelligent Checking Method for Construction Schemes via Fusion of Knowledge Graph and Large Language Models. Buildings 2024, 14, 2502. https://doi.org/10.3390/buildings14082502