Person Entity Alignment Method Based on Multimodal Information Aggregation
Abstract
1. Introduction
- Two multi-person relation graphs and a multi-person entity alignment dataset were constructed. Using face detection technology and open-source semi-structured encyclopedia data from the web, this study built two multi-person relation graphs from different knowledge sources and, after analysis, optimization and cleaning, created a multi-person entity alignment dataset that contains 23,512 entities and 59,691 triplets. The dataset is open-source on GitHub.
- A person entity alignment method based on multi-information aggregation is proposed. First, the single-hop and multi-hop neighborhood features of the person entity are extracted with the graph convolution layer and the dynamic graph attention layer. Second, the layer-wise gated network aggregates the single-hop and multi-hop features of each node to produce the structural feature. Finally, a cascaded convolutional network processes the entity image to detect whether a human face is present, and the pretrained SE-LResNet101E-IR [7] and bert-base-chinese [8] extract the facial and semantic features of the person entity. These features are then aggregated with the structural feature modeled from the entity relation triples to enhance the low-dimensional vector representation of the target entity.
- Experiments on the constructed multi-person entity alignment dataset were designed to verify the effectiveness of three components in the person entity alignment task: integrating multi-information on person entities, applying a dynamic graph attention network, and applying a layer-wise gated network.
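The gated aggregation of single-hop and multi-hop features described in the second contribution can be illustrated with a minimal sketch. The exact gate used by the method is not given in this section, so the form below (a sigmoid gate computed from the multi-hop feature, then an element-wise blend) is an assumption, and the names `gated_aggregate`, `weight` and `bias` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function used to squash gate pre-activations into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_aggregate(h_one_hop, h_multi_hop, weight, bias):
    """Blend one-hop and multi-hop features of a single node.

    Assumed form (not confirmed by the source):
        gate = sigmoid(weight @ h_multi_hop + bias)
        out  = gate * h_one_hop + (1 - gate) * h_multi_hop
    All vectors have the same length; weight is a square matrix.
    """
    dim = len(h_one_hop)
    out = []
    for i in range(dim):
        # Gate pre-activation for dimension i, driven by the multi-hop feature.
        z = sum(weight[i][j] * h_multi_hop[j] for j in range(dim)) + bias[i]
        g = sigmoid(z)
        # Element-wise convex combination of the two neighborhood views.
        out.append(g * h_one_hop[i] + (1.0 - g) * h_multi_hop[i])
    return out


if __name__ == "__main__":
    zero_w = [[0.0, 0.0], [0.0, 0.0]]
    # With zero weights and bias the gate is 0.5, so the result is the mean.
    print(gated_aggregate([1.0, 3.0], [3.0, 5.0], zero_w, [0.0, 0.0]))
```

In a trained model `weight` and `bias` would be learned, so the gate can decide per dimension how much distant-neighbor information to let through.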
2. Materials and Methods
2.1. Graph Convolution Layer
2.2. Dynamic Graph Attention Layer
2.3. Layer-Wise Gated Layer
2.4. Feature Extraction Layer
3. Experiment and Discussion
3.1. Datasets
3.2. Configuration
3.3. Evaluation Index
3.4. Results and Analysis
3.5. Complexity Analysis
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. Adv. Neural Inf. Process. Syst. 2013, 26.
- Wang, Z.; Lv, Q.; Lan, X.; Zhang, Y. Cross-lingual knowledge graph alignment via graph convolutional networks. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; pp. 349–357.
- Zeng, K.; Li, C.; Hou, L.; Li, J.; Feng, L. A comprehensive survey of entity alignment for knowledge graphs. AI Open 2021, 2, 1–13.
- Wang, M.; Wang, H.; Qi, G.; Zheng, Q. Richpedia: A Large-Scale, Comprehensive Multi-Modal Knowledge Graph. Big Data Res. 2020, 22, 100159.
- Chen, L.; Li, Z.; Wang, Y.; Xu, T.; Wang, Z.; Chen, E. MMEA: Entity alignment for multi-modal knowledge graph. In Proceedings of the International Conference on Knowledge Science, Engineering and Management, Hangzhou, China, 28–30 August 2020; Springer: Cham, Switzerland, 2020; pp. 134–147.
- Sun, Z.; Wang, C.; Hu, W.; Chen, M.; Dai, J.; Zhang, W.; Qu, Y. Knowledge Graph Alignment Network with Gated Multi-Hop Neighborhood Aggregation. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 222–228.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
- Defferrard, M.; Bresson, X.; Vandergheynst, P. Convolutional neural networks on graphs with fast localized spectral filtering. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; pp. 3844–3852.
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
- Brody, S.; Alon, U.; Yahav, E. How Attentive are Graph Attention Networks? In Proceedings of the International Conference on Learning Representations, Vienna, Austria, 3–7 May 2021; pp. 1–10.
- Guo, Y.; Zhang, L.; Hu, Y.; He, X.; Gao, J. Ms-celeb-1m: A dataset and benchmark for large-scale face recognition. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; pp. 87–102.
- Zhang, K.; Zhang, Z.; Li, Z.; Qiao, Y. Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks. IEEE Signal Process. Lett. 2016, 23, 1499–1503.
- Deng, J.; Guo, J.; Yang, J.; Xue, N.; Cotsia, I.; Zafeiriou, S. Arcface: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4690–4699.
- Sun, Z.; Hu, W.; Li, C. Cross-Lingual Entity Alignment via Joint Attribute-Preserving Embedding; Springer: Cham, Switzerland, 2017; pp. 3–10.
- Schlichtkrull, M.; Kipf, T.N.; Bloem, P.; van den Berg, R.; Titov, I.; Welling, M. Modeling Relational Data with Graph Convolutional Networks. In Proceedings of the Semantic Web, Monterey, CA, USA, 8–12 October 2018; Springer: Cham, Switzerland, 2018; pp. 1–13.
Data Source | Language | Entities | Relations | Triplets |
---|---|---|---|---|
Baidu Encyclopedia | Chinese | 14,226 | 19 | 38,716 |
Sogou Encyclopedia | Chinese | 9286 | 19 | 20,975 |
Datasets | Language | Entities | Relations | Triplets |
---|---|---|---|---|
DBP15K ZH-EN | Chinese | 66,469 | 2830 | 153,929 |
DBP15K ZH-EN | English | 98,125 | 2317 | 237,674 |
DBP15K JA-EN | Japanese | 65,744 | 2043 | 164,373 |
DBP15K JA-EN | English | 95,680 | 2096 | 233,319 |
DBP15K FR-EN | French | 66,858 | 1379 | 192,191 |
DBP15K FR-EN | English | 105,889 | 2209 | 278,590 |
Model (Baidu-Sogou person entity alignment dataset) | H@1 | H@10 | MR | MRR |
---|---|---|---|---|
Face similarity comparison only | 0.281 | 0.573 | 203.106 | 0.356 |
AliNet | 0.612 | 0.706 | 59.297 | 0.655 |
PEAMA (without aggregating multimodal information) | 0.623 | 0.714 | 59.093 | 0.661 |
PEAMA (only aggregating semantic feature) | 0.654 | 0.755 | 48.427 | 0.692 |
PEAMA (only aggregating face feature) | 0.711 | 0.811 | 34.996 | 0.751 |
PEAMA | 0.736 | 0.832 | 26.501 | 0.771 |
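The H@1, H@10, MR and MRR columns above are standard ranking metrics for entity alignment. As a minimal sketch, assuming each test entity yields the 1-based rank of its true counterpart among all candidates, they can be computed as follows. The function names and the toy ranks are illustrative only and are not taken from the authors' code or data.

```python
def rank_of_true_match(scores, true_index):
    """1-based rank of the true candidate: one plus the number of
    candidates whose similarity score is strictly higher."""
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def alignment_metrics(ranks, ks=(1, 10)):
    """Compute Hits@k, mean rank (MR) and mean reciprocal rank (MRR).

    ranks: list of 1-based ranks of the correct counterpart, one per
    test entity. Higher Hits@k and MRR are better; lower MR is better.
    """
    n = len(ranks)
    hits = {k: sum(1 for r in ranks if r <= k) / n for k in ks}
    mr = sum(ranks) / n
    mrr = sum(1.0 / r for r in ranks) / n
    return hits, mr, mrr


if __name__ == "__main__":
    # Toy similarity scores for one entity; the true match is candidate 2.
    print(rank_of_true_match([0.2, 0.9, 0.5], true_index=2))
    # Toy ranks for four test entities.
    print(alignment_metrics([1, 2, 10, 20]))
```

MR is sensitive to a few very badly ranked entities, while MRR emphasizes how often the true match lands near the top, which is why the tables report both.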
Model (DBP15K) | ZH-EN H@1 | ZH-EN MR | ZH-EN MRR | JA-EN H@1 | JA-EN MR | JA-EN MRR | FR-EN H@1 | FR-EN MR | FR-EN MRR |
---|---|---|---|---|---|---|---|---|---|
GCN | 0.487 | - | 0.559 | 0.507 | - | 0.618 | 0.508 | - | 0.628 |
GAT | 0.418 | - | 0.508 | 0.446 | - | 0.537 | 0.442 | - | 0.546 |
R-GCN [16] | 0.463 | - | 0.564 | 0.471 | - | 0.571 | 0.469 | - | 0.570 |
AliNet | 0.547 | 282.760 | 0.628 | 0.549 | 362.110 | 0.633 | 0.556 | 276.225 | 0.644 |
PEAMA | 0.562 | 278.372 | 0.644 | 0.551 | 341.427 | 0.634 | 0.558 | 267.285 | 0.645 |
Model | Average Time per Epoch (s) |
---|---|
AliNet | 4.3329 |
PEAMA | 7.2569 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, H.; Huang, R.; Zhang, J. Person Entity Alignment Method Based on Multimodal Information Aggregation. Electronics 2022, 11, 3163. https://doi.org/10.3390/electronics11193163