RoUIE: A Method for Constructing Knowledge Graph of Power Equipment Based on Improved Universal Information Extraction
Abstract
1. Introduction
- By collecting authentic data from three categories, namely inventory information, monitoring information, and management information, we have constructed a dataset for the state evaluation of power equipment. Specifically, we have introduced a standardized short-text format for monitoring information that is applicable across various types of primary power equipment.
- Based on the characteristics of the three categories of information mentioned above, we have developed an ontology layer and proposed the construction of a device-centric knowledge graph, thereby addressing the current research gap in multi-source information fusion for power equipment state evaluation and assisting in resolving operation and maintenance challenges on site.
- To better accommodate the concentrated relation representations and varying lengths of device state evaluation texts, this paper proposes RoUIE, an improved UIE model that uses a RoFormer pre-trained language model as the text encoder during fine-tuning. In addition, Distribution Focal Loss replaces Binary Cross-Entropy Loss as the loss function, further improving extraction performance (a minimal sketch of this loss appears after this list).
- By comparing our proposed model with the UIE model and other mainstream joint information extraction baseline models, we demonstrate its superiority and general applicability on both a Chinese general dataset and a specific dataset for power equipment state evaluation.
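To make the loss substitution in the third contribution concrete, below is a minimal PyTorch sketch of Distribution Focal Loss in its original form from generalized focal loss [16]. The function name, tensor shapes, and toy usage are illustrative assumptions; the precise way RoUIE adapts DFL to span-boundary prediction is described in Section 4.3 and is not reproduced here.

```python
import torch
import torch.nn.functional as F

def distribution_focal_loss(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Distribution Focal Loss as defined in generalized focal loss [16].

    logits: (N, n_bins) unnormalized scores over discretized positions.
    target: (N,) continuous targets lying in [0, n_bins - 1].
    The loss is cross-entropy against the two integer bins bracketing each
    target, weighted by how close the target lies to each bin.
    """
    left = target.floor().long()                          # lower bin y_i
    right = (left + 1).clamp(max=logits.size(1) - 1)      # upper bin y_{i+1}
    weight_left = right.float() - target                  # nearer target -> larger weight
    weight_right = target - left.float()
    loss = (F.cross_entropy(logits, left, reduction="none") * weight_left
            + F.cross_entropy(logits, right, reduction="none") * weight_right)
    return loss.mean()

# Toy usage: 4 samples, 8 discretized bins.
logits = torch.randn(4, 8)
target = torch.tensor([0.3, 2.0, 5.7, 6.9])
print(distribution_focal_loss(logits, target))
```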
2. Related Work
2.1. Current Methods of Information Extraction
2.2. Characteristics of Texts Related to State Evaluation
3. Construction Method of a Device-Centric Knowledge Graph
4. Improved UIE Model with Rotary Position Embedding
4.1. Model Architecture
4.2. Rotary Position Embedding
4.3. Distribution Focal Loss
5. Experiments
5.1. Datasets
5.2. Model Parameter Settings
5.3. Experimental Results
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
UIE | Universal Information Extraction |
BCE | Binary Cross-Entropy Loss |
DFL | Distribution Focal Loss |
DGA | Dissolved Gas Analysis |
SRO | triplets of subject, relationship, and object |
SEO | Single-Entity-Overlap |
SEL | Structural Extraction Language |
SSI | Structural Schema Instructor |
FL | Focal Loss |
CV | Computer Vision |
NLP | Natural Language Processing |
DUIE | A large-scale Chinese dataset for information extraction |
PESED | Power Equipment State Evaluation Dataset |
Appendix A
Object Type (Combination of Attributes) | Predicate (Relation) | Subject Type (Combination of Entities) |
---|---|---|
Substation | Locate in | Equipment |
Substation | Locate in | Bar |
Bar | Corresponding bar | Equipment |
Current condition | Current condition | Equipment |
Component | Consist of | Equipment |
Operation department | Operation department | Equipment |
Subclass of equipment | Corresponding Subclass | Equipment |
Date and time | Date of commencement | Equipment |
Date and time | Date of production | Equipment |
Equipment | Corresponding equipment | Defect report ID |
Defect level | Corresponding defect level | Defect report ID |
Defect type | Corresponding defect type | Defect report ID |
Defect resolution status | Defect resolution status | Defect report ID |
Description of the defect | Description of the defect | Defect report ID |
Personnel | Defect reporter | Defect report ID |
Date and time | Date of defect reporting | Defect report ID |
Team | Defect handling team | Defect report ID |
Personnel | Defect handler | Defect report ID |
Date and time | Date of defect handling | Defect report ID |
Origin of defect | Origin of defect | Defect report ID |
Equipment | Corresponding equipment | Inspection and test report ID |
Team | Inspection (test) team | Inspection and test report ID |
Date and time | Date and time of Inspection (test) | Inspection and test report ID |
Personnel | Inspection (test) engineer | Inspection and test report ID |
Approaches | Inspection (test) approaches | Inspection and test report ID |
Result of Inspection (test) | Result of Inspection (test) | Inspection and test report ID |
Equipment | Corresponding equipment | Alarm ID |
Source system | Source system | Alarm ID |
Alarm level | Alarm level | Alarm ID |
Abnormal description | Abnormal description | Alarm ID |
Dataset | Input Data from the PESED |
---|---|
PESED | {"spo_list": [{"subject_type": "Bar", "object_type": "Substation", "subject": "Test Line", "predicate": "Locate in", "object": "110 kV Test A Substation"}, {"subject_type": "Equipment", "object_type": "Bar", "subject": "1830", "predicate": "Corresponding bar", "object": "Test Line"}, {"subject_type": "Equipment", "object_type": "Operation department", "subject": "1830", "predicate": "Operation department", "object": "Substation Management Department 1"}, {"subject_type": "Equipment", "object_type": "Date and time", "subject": "1830", "predicate": "Date of production", "object": "1 Mar 2011 00:00:00"}, {"subject_type": "Equipment", "object_type": "Date and time", "subject": "1830", "predicate": "Date of commencement", "object": "30 Nov 2011 00:00:00"}, {"subject_type": "Equipment", "object_type": "Current condition", "subject": "1830", "predicate": "Current condition", "object": "hot standby"}], "text": "Incident Report: The 110 kV Test A Substation Test Line 1830 tripped and failed reclosing. Upon investigation, it was found that the equipment is managed and maintained by Substation Management Department 1, with a production date of 1 Mar 2011 00:00:00, and was officially put into operation on 30 Nov 2011 00:00:00. Its current status is hot standby."} |
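For readers building a loader for PESED-style records, the following sketch flattens an spo_list entry into plain (subject, predicate, object) triplets. The record literal is a shortened, hypothetical example written against the field layout shown above, not a verbatim dataset entry.

```python
# Hypothetical record following the PESED "spo_list" layout shown in Table A2.
record = {
    "text": "Incident Report: The 110 kV Test A Substation Test Line 1830 tripped ...",
    "spo_list": [
        {"subject_type": "Bar", "object_type": "Substation",
         "subject": "Test Line", "predicate": "Locate in",
         "object": "110 kV Test A Substation"},
        {"subject_type": "Equipment", "object_type": "Bar",
         "subject": "1830", "predicate": "Corresponding bar",
         "object": "Test Line"},
    ],
}

# Flatten each annotated SPO entry into a plain (subject, predicate, object) triplet.
triplets = [(spo["subject"], spo["predicate"], spo["object"])
            for spo in record["spo_list"]]
print(triplets)
# [('Test Line', 'Locate in', '110 kV Test A Substation'),
#  ('1830', 'Corresponding bar', 'Test Line')]
```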
Input Text | Corresponding Output Result Generated from the Model Inference |
---|---|
As shown in the input text from Table A2 | [{'Equipment': [{'end': 21, 'probability': 0.8385402622754041, 'relations': {'Corresponding bar': [{'end': 17, 'probability': 0.8421710521379353, 'start': 14, 'text': 'Test Line'}], 'Current condition': [{'end': 109, 'probability': 0.9079101521374563, 'start': 106, 'text': 'hot standby'}], 'Operation department': [{'end': 46, 'probability': 0.9754414623456341, 'start': 40, 'text': 'Substation Management Department 1'}], 'Date of production': [{'end': 75, 'probability': 0.8292706618236352, 'start': 56, 'text': '1 Mar 2011 00:00:00'}, {'end': 96, 'probability': 0.4456306614587452, 'start': 77, 'text': '30 Nov 2011 00:00:00'}], 'Date of commencement': [{'end': 96, 'probability': 0.6671295049136148, 'start': 77, 'text': '30 Nov 2011 00:00:00'}]}, 'start': 17, 'text': '1830'}], 'Bar': [{'end': 17, 'probability': 0.7685434552767871, 'relations': {'Locate in': [{'end': 14, 'probability': 0.9267710521345789, 'start': 5, 'text': '110 kV Test A Substation'}, {'end': 14, 'probability': 0.5364710458655789, 'start': 12, 'text': 'A Substation'}]}, 'start': 14, 'text': 'Test Line'}]}] |
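The nested layout above groups every extracted object under its subject span and relation name. A small helper in the style of the sketch below can flatten such a prediction into SPO triplets before scoring; the function name and the assumption that each entity dictionary may carry a 'relations' field are ours, inferred from the output shown in Table A3.

```python
def uie_output_to_triplets(uie_output):
    """Flatten nested UIE-style predictions (layout of Table A3) into SPO triplets."""
    triplets = []
    for result in uie_output:                          # one dict per input text
        for subject_type, subjects in result.items():  # e.g. 'Equipment', 'Bar'
            for subj in subjects:                      # each extracted subject span
                for predicate, objects in subj.get("relations", {}).items():
                    for obj in objects:                # each related object span
                        triplets.append((subj["text"], predicate, obj["text"]))
    return triplets

# Applied to the output in Table A3, this yields triplets such as
# ('1830', 'Corresponding bar', 'Test Line') and
# ('Test Line', 'Locate in', '110 kV Test A Substation').
```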
True Positive | False Positive | False Negative |
---|---|---|
denotes the number of correctly predicted triplets in all sentences. It consists of the corresponding triplets as follows: (Test Line, Locate in, 110 kV Test A Substation), (1830, Corresponding bar, Test Line), (1830, Current condition, hot standby), (1830, Operation department, Substation Management Department 1), (1830, Date of production, 1 March 2011 00:00:00), (1830, Date of commencement, 30 November 2011 00:00:00) | denotes the number of triplets extracted by the model that are incorrectly predicted. It consists of the corresponding triplet as follows: (Test Line, Locate in, A Substation) | denotes the number of manually annotated triplets that were not successfully extracted by the model. In the aforementioned inference stage, no manually annotated triplet was missed during extraction; hence, no corresponding triplets are listed here. |
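For completeness, the counts above translate into the evaluation metrics via the standard micro-averaged definitions; the short, self-contained sketch below reproduces the arithmetic with the Table A4 numbers plugged in (TP = 6, FP = 1, FN = 0).

```python
def prf1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard micro precision, recall and F1 from triplet counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Counts from the inference example in Table A4: 6 correct triplets,
# 1 spurious triplet (Test Line, Locate in, A Substation), 0 missed triplets.
print(prf1(tp=6, fp=1, fn=0))  # (0.857..., 1.0, 0.923...)
```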
References
- Wang, J.; Liu, H.; Pan, Z.; Zhao, S.; Wang, W.; Geng, S. Research on Evaluation Method of Transformer Operation State in Power Supply System. In Wireless Technology, Intelligent Network Technologies, Smart Services and Applications: Proceedings of 4th International Conference on Wireless Communications and Applications (ICWCA 2020); Springer: Singapore, 2022; pp. 249–256. [Google Scholar]
- Chen, L. Evaluation method of fault indicator status detection based on hierarchical clustering. J. Phys. Conf. Ser. 2021, 2125, 012002. [Google Scholar] [CrossRef]
- Qi, B.; Zhang, P.; Rong, Z.; Li, C. Differentiated warning rule of power transformer health status based on big data mining. Int. J. Electr. Power Energy Syst. 2020, 121, 106150. [Google Scholar] [CrossRef]
- Jin-qiang, C. Fault prediction of a transformer bushing based on entropy weight TOPSIS and gray theory. Comput. Sci. Eng. 2018, 21, 55–62. [Google Scholar] [CrossRef]
- Xie, C.; Zou, G.; Wang, H.; Jin, Y. A new condition assessment method for distribution transformers based on operation data and record text mining technique. In Proceedings of the 2016 China International Conference on Electricity Distribution (CICED), Xi’an, China, 10–13 August 2016; pp. 1–7. [Google Scholar] [CrossRef]
- Liu, R.; Fu, R.; Xu, K.; Shi, X.; Ren, X. A Review of Knowledge Graph-Based Reasoning Technology in the Operation of Power Systems. Appl. Sci. 2023, 13, 4357. [Google Scholar] [CrossRef]
- Meng, F.; Yang, S.; Wang, J.; Xia, L.; Liu, H. Creating knowledge graph of electric power equipment faults based on BERT–BiLSTM–CRF model. J. Electr. Eng. Technol. 2022, 17, 2507–2516. [Google Scholar] [CrossRef]
- Yang, Y.; Wu, Z.; Yang, Y.; Lian, S.; Guo, F.; Wang, Z. A survey of information extraction based on deep learning. Appl. Sci. 2022, 12, 9691. [Google Scholar] [CrossRef]
- Yang, J.; Meng, Q.; Zhang, X. Improvement of operation and maintenance efficiency of power transformers based on knowledge graphs. IET Electr. Power Appl. 2024, 1, 1–13. [Google Scholar] [CrossRef]
- Xie, Q.; Cai, Y.; Xie, J.; Wang, C.; Zhang, Y.; Xu, Z. Research on Construction Method and Application of Knowledge Graph for Power Transformer Operation and Maintenance Based on ALBERT. Trans. China Electrotech. Soc. 2023, 38, 95–106. [Google Scholar]
- Lu, Y.; Liu, Q.; Dai, D.; Xiao, X.; Lin, H.; Han, X.; Sun, L.; Wu, H. Unified structure generation for universal information extraction. arXiv 2022, arXiv:2203.12277. [Google Scholar]
- Lou, J.; Lu, Y.; Dai, D.; Jia, W.; Lin, H.; Han, X.; Sun, L.; Wu, H. Universal information extraction as unified semantic matching. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 13318–13326. [Google Scholar]
- Liu, C.; Zhao, F.; Kang, Y.; Zhang, J.; Zhou, X.; Sun, C.; Wu, F.; Kuang, K. Rexuie: A recursive method with explicit schema instructor for universal information extraction. arXiv 2023, arXiv:2304.14770. [Google Scholar]
- Fei, H.; Wu, S.; Li, J.; Li, B.; Li, F.; Qin, L.; Zhang, M.; Zhang, M.; Chua, T.S. Lasuie: Unifying information extraction with latent adaptive structure-aware generative language model. Adv. Neural Inf. Process. Syst. 2022, 35, 15460–15475. [Google Scholar]
- Su, J.; Ahmed, M.; Lu, Y.; Pan, S.; Bo, W.; Liu, Y. Roformer: Enhanced transformer with rotary position embedding. Neurocomputing 2024, 568, 127063. [Google Scholar] [CrossRef]
- Li, X.; Wang, W.; Wu, L.; Chen, S.; Hu, X.; Li, J.; Tang, J.; Yang, J. Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection. Adv. Neural Inf. Process. Syst. 2020, 33, 21002–21012. [Google Scholar]
- Zhang, Y.S.; Liu, S.K.; Liu, Y.; Ren, L.; Xin, Y.H. Joint Extraction of Entities and Relations Based on Deep Learning: A Survey. Acta Electron. Sin. 2023, 51, 1093–1116. [Google Scholar]
- Liu, W.; Yin, M.; Zhang, J.; Cui, L. A Joint Entity Relation Extraction Model Based on Relation Semantic Template Automatically Constructed. Comput. Mater. Contin. 2024, 78, 975–997. [Google Scholar] [CrossRef]
- Fei, H.; Ren, Y.; Zhang, Y.; Ji, D.; Liang, X. Enriching contextualized language model from knowledge graph for biomedical information extraction. Briefings Bioinform. 2021, 22, bbaa110. [Google Scholar] [CrossRef] [PubMed]
- Katiyar, A.; Cardie, C. Going out on a limb: Joint extraction of entity mentions and relations without dependency trees. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vancouver, BC, Canada, 30 July–4 August 2017; pp. 917–928. [Google Scholar]
- Bekoulis, G.; Deleu, J.; Demeester, T.; Develder, C. Adversarial training for multi-context joint entity and relation extraction. arXiv 2018, arXiv:1808.06876. [Google Scholar]
- Zeng, X.; Zeng, D.; He, S.; Liu, K.; Zhao, J. Extracting relational facts by an end-to-end neural model with copy mechanism. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia, 15–20 July 2018; pp. 506–514. [Google Scholar]
- Wei, Z.; Su, J.; Wang, Y.; Tian, Y.; Chang, Y. A novel cascade binary tagging framework for relational triple extraction. arXiv 2019, arXiv:1909.03227. [Google Scholar]
- Wang, Y.; Sun, C.; Wu, Y.; Zhou, H.; Li, L.; Yan, J. UniRE: A unified label space for entity relation extraction. arXiv 2021, arXiv:2107.04292. [Google Scholar]
- Li, X.; Yin, F.; Sun, Z.; Li, X.; Yuan, A.; Chai, D.; Zhou, M.; Li, J. Entity-relation extraction as multi-turn question answering. arXiv 2019, arXiv:1905.05529. [Google Scholar]
- Sui, D.; Zeng, X.; Chen, Y.; Liu, K.; Zhao, J. Joint entity and relation extraction with set prediction networks. IEEE Trans. Neural Netw. Learn. Syst. 2023, 1–12. [Google Scholar] [CrossRef]
- Wang, Y.; Yu, B.; Zhang, Y.; Liu, T.; Zhu, H.; Sun, L. TPLinker: Single-stage joint extraction of entities and relations through token pair linking. arXiv 2020, arXiv:2010.13415. [Google Scholar]
- Shang, Y.M.; Huang, H.; Mao, X. Onerel: Joint entity and relation extraction with one module in one step. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 22 February–1 March 2022; Volume 36, pp. 11285–11293. [Google Scholar]
- Dong, L.; Yang, N.; Wang, W.; Wei, F.; Liu, X.; Wang, Y.; Gao, J.; Zhou, M.; Hon, H.W. Unified language model pre-training for natural language understanding and generation. Adv. Neural Inf. Process. Syst. 2019, 32, 1–13. [Google Scholar]
- Lewis, M.; Liu, Y.; Goyal, N.; Ghazvininejad, M.; Mohamed, A.; Levy, O.; Stoyanov, V.; Zettlemoyer, L. Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv 2019, arXiv:1910.13461. [Google Scholar]
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 2020, 21, 1–67. [Google Scholar]
- Katikapalli, K. A survey of GPT-3 family large language models including ChatGPT and GPT-4. Nat. Lang. Process. J. 2024, 6, 100048. [Google Scholar] [CrossRef]
- Hang, T.; Feng, J.; Wu, Y.; Yan, L.; Wang, Y. Joint extraction of entities and overlapping relations using source-target entity labeling. Expert Syst. Appl. 2021, 177, 114853. [Google Scholar] [CrossRef]
- Rosin, G.D.; Radinsky, K. Temporal attention for language models. arXiv 2022, arXiv:2202.02093. [Google Scholar]
- Li, S.; He, W.; Shi, Y.; Jiang, W.; Liang, H.; Jiang, Y.; Zhang, Y.; Lyu, Y.; Zhu, Y. DuIE: A large-scale Chinese dataset for information extraction. In Proceedings of the 8th CCF International Conference, Natural Language Processing and Chinese Computing, NLPCC 2019, Dunhuang, China, 9–14 October 2019; Part II 8. Springer: Cham, Switzerland, 2019; pp. 791–800. [Google Scholar]
- Su, J.; Murtadha, A.; Pan, S.; Hou, J.; Sun, J.; Huang, W.; Wen, B.; Liu, Y. Global pointer: Novel efficient span-based approach for named entity recognition. arXiv 2022, arXiv:2208.03054. [Google Scholar]
- Cui, Y.; Che, W.; Liu, T.; Qin, B.; Yang, Z. Pre-Training With Whole Word Masking for Chinese BERT. IEEE/ACM Trans. Audio Speech Lang. Process. 2021, 29, 3504–3514. [Google Scholar] [CrossRef]
- Su, H.; Shi, W.; Shen, X.; Xiao, Z.; Ji, T.; Fang, J.; Zhou, J. RoCBert: Robust Chinese BERT with multimodal contrastive pretraining. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; pp. 921–931. [Google Scholar]
Unified Short Text Format | Date and Time of Action | Substation | Bar | Equipment | Component | Abnormal Description |
---|---|---|---|---|---|---|
Entities | 18 December 2023 06:35:25 | 220 kV Test Substation | Test Line | 2518 | Hydraulic Mechanism | … not reached the interlocking threshold |
Sample Text | 18 December 2023 06:35:25, the pressure of the hydraulic mechanism of the 2518 circuit breaker of the Test Line of the 220 kV Test Substation is below the required level, yet it has not reached the interlocking threshold (translated from Chinese). |
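As a rough illustration of how the entity slots above compose a unified monitoring short text, the sketch below concatenates the fields into one sentence. The English template string is our own approximation of the Chinese original and is not a format specification published with the dataset.

```python
# Hypothetical helper: compose a unified monitoring short text from its entity slots.
# The English template is only a rough rendering of the Chinese original.
def build_short_text(date_time, substation, bar, equipment, component, abnormal):
    return (f"{date_time}, the {component} of the {equipment} of the {bar} "
            f"of the {substation}: {abnormal}")

print(build_short_text(
    "18 December 2023 06:35:25", "220 kV Test Substation", "Test Line",
    "2518 circuit breaker", "hydraulic mechanism",
    "pressure below the required level, yet it has not reached the interlocking threshold"))
```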
Hyper-Parameter | Setting |
---|---|
Learning rate | 1 |
Batch size | 96 |
Num of epochs | 10 |
Max length | 512 |
Random seed for initialization | 1000 |
Optimizer | AdamW |
Model | DUIE Prec. | DUIE Rec. | DUIE F1 | PESED Prec. | PESED Rec. | PESED F1 |
---|---|---|---|---|---|---|
Casrel [23] | 74.47 | 66.00 | 69.98 | 99.15 | 52.87 | 68.97 |
TPLinker [27] | 76.66 | 76.61 | 76.63 | 98.85 | 52.89 | 68.91 |
GPLinker [36] | 76.77 | 74.55 | 75.65 | 94.56 | 48.79 | 64.37 |
Onerel [28] | 76.49 | 74.93 | 75.70 | 94.45 | 54.71 | 69.29 |
UIEbase [11] | 82.30 | 80.02 | 81.15 | 76.14 | 98.72 | 85.97 |
RoUIE | 83.41 | 80.58 | 81.97 | 80.50 | 99.23 | 88.89 |
Model | DUIE Prec. | DUIE Rec. | DUIE F1 | PESED Prec. | PESED Rec. | PESED F1 |
---|---|---|---|---|---|---|
83.19 | 80.16 | 81.65 | 78.85 | 97.45 | 87.17 | |
83.51 | 80.14 | 81.79 | 79.34 | 98.88 | 88.04 | |
82.70 | 80.64 | 81.66 | 77.30 | 99.25 | 86.91 | |
84.57 | 77.90 | 81.10 | 79.49 | 99.25 | 88.28 | |
RoUIE () | 83.41 | 80.58 | 81.97 | 80.50 | 99.23 | 88.89 |