Span-Prototype Graph Based on Graph Attention Network for Nested Named Entity Recognition
Abstract
1. Introduction
2. Related Work
2.1. NER
2.2. Nested NER
2.3. Graph Attention Network
3. Model
3.1. Overall Framework
3.2. Encoder Module
3.3. Filter
3.4. Prototype Module
3.4.1. Prototype Learning
3.4.2. Span-Prototype Graph
3.5. Classifier
3.6. Training Objective
4. Experiment and Analysis
4.1. Datasets
4.2. Evaluation
4.3. Parameter Settings
4.4. Baselines and Results
4.4.1. Baseline Methods
4.4.2. Experimental Results
4.5. Ablation Study
4.6. Sensitivity Analysis
4.6.1. The Number of Layers in the Graph Attention Network (GAT)
4.6.2. The Number of Attention Heads in the GAT
4.6.3. Comparison of Different Span Representation Enhancement Methods
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Yang, Z.; Ma, J.; Chen, H.; Zhang, Y.; Chang, Y. HiTRANS: A Hierarchical Transformer Network for Nested Named Entity Recognition. In Findings of the Association for Computational Linguistics: EMNLP 2021; Association for Computational Linguistics: Punta Cana, Dominican Republic, 2021; pp. 124–132. [Google Scholar] [CrossRef]
- Chen, L.-C.; Chang, K.-H. An Extended AHP-Based Corpus Assessment Approach for Handling Keyword Ranking of NLP: An Example of COVID-19 Corpus Data. Axioms 2023, 12, 740. [Google Scholar] [CrossRef]
- Finkel, J.R.; Manning, C.D. Nested Named Entity Recognition. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP ’09), Singapore, 2–7 August 2009; Volume 1, pp. 141–150. [Google Scholar] [CrossRef]
- Lu, W.; Roth, D. Joint Mention Extraction and Classification with Mention Hypergraphs. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; Association for Computational Linguistics: Lisbon, Portugal, 2015; pp. 857–867. [Google Scholar] [CrossRef]
- Wang, B.; Lu, W. Neural Segmental Hypergraphs for Overlapping Mention Recognition. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; Association for Computational Linguistics: Brussels, Belgium, 2018; pp. 204–214. [Google Scholar] [CrossRef]
- Straková, J.; Straka, M.; Hajič, J. Neural Architectures for Nested NER through Linearization. arXiv 2019, arXiv:1908.06926. [Google Scholar]
- Ju, M.; Miwa, M.; Ananiadou, S. A Neural Layered Model for Nested Named Entity Recognition. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), New Orleans, LA, USA, 1–6 June 2018; Association for Computational Linguistics: New Orleans, LA, USA, 2018; pp. 1446–1459. [Google Scholar] [CrossRef]
- Wang, J.; Shou, L.; Chen, K.; Chen, G. Pyramid: A Layered Model for Nested Named Entity Recognition. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 5918–5928. [Google Scholar] [CrossRef]
- Shibuya, T.; Hovy, E. Nested Named Entity Recognition via Second-Best Sequence Learning and Decoding. Trans. Assoc. Comput. Linguist. 2020, 8, 605–620. [Google Scholar] [CrossRef]
- Shen, Y.; Ma, X.; Tan, Z.; Zhang, S.; Wang, W.; Lu, W. Locate and Label: A Two-Stage Identifier for Nested Named Entity Recognition. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021; pp. 2782–2794. [Google Scholar] [CrossRef]
- Zhong, Z.; Chen, D. A Frustratingly Easy Approach for Entity and Relation Extraction. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, 6–11 June 2021; pp. 50–61. [Google Scholar] [CrossRef]
- Tan, C.; Qiu, W.; Chen, M.; Wang, R.; Huang, F. Boundary Enhanced Neural Span Classification for Nested Named Entity Recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 9016–9023. [Google Scholar] [CrossRef]
- Wan, J.; Ru, D.; Zhang, W.; Yu, Y. Nested Named Entity Recognition with Span-Level Graphs. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022; Volume 1, pp. 892–903. [Google Scholar]
- Shaalan, K.; Raza, H. NERA: Named Entity Recognition for Arabic. J. Am. Soc. Inf. Sci. 2009, 60, 1652–1663. [Google Scholar] [CrossRef]
- Krupka, G.R. SRA: Description of the SRA System as Used for MUC-6. In Proceedings of the 6th Conference on Message Understanding (MUC-6 ’95), Columbia, MD, USA, 6–8 November 1995; Association for Computational Linguistics: Columbia, MD, USA, 1995; p. 221. [Google Scholar] [CrossRef]
- Bikel, D.M.; Miller, S.; Schwartz, R.; Weischedel, R. Nymble: A High-Performance Learning Name-Finder. In Proceedings of the Fifth Conference on Applied Natural Language Processing, Washington, DC, USA, 31 March–3 April 1997; Association for Computational Linguistics: Washington, DC, USA, 1997; pp. 194–201. [Google Scholar] [CrossRef]
- McCallum, A.; Li, W. Early Results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003, Edmonton, AB, Canada, 31 May–1 June 2003; Association for Computational Linguistics: Edmonton, AB, Canada, 2003; Volume 4, pp. 188–191. [Google Scholar] [CrossRef]
- Borthwick, A.; Sterling, J.; Agichtein, E.; Grishman, R. NYU: Description of the MENE Named Entity System as Used in MUC-7. In Proceedings of the 7th Message Understanding Conference, MUC 1998—Proceedings, Fairfax, VA, USA, 29 April–1 May 1998. [Google Scholar]
- Collobert, R.; Weston, J.; Bottou, L.; Karlen, M.; Kavukcuoglu, K.; Kuksa, P. Natural Language Processing (Almost) from Scratch. J. Mach. Learn. Res. 2011, 12, 2493–2537. [Google Scholar]
- Lample, G.; Ballesteros, M.; Subramanian, S.; Kawakami, K.; Dyer, C. Neural Architectures for Named Entity Recognition. arXiv 2016, arXiv:1603.01360. [Google Scholar]
- Zhang, Y.; Yang, J. Chinese NER Using Lattice LSTM. arXiv 2018, arXiv:1805.02023. [Google Scholar]
- Ma, R.; Peng, M.; Zhang, Q.; Wei, Z.; Huang, X. Simplify the Usage of Lexicon in Chinese NER. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 5951–5960. [Google Scholar] [CrossRef]
- Muis, A.O.; Lu, W. Labeling Gaps Between Words: Recognizing Overlapping Mentions with Mention Separators. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 7–11 September 2017; pp. 2608–2618. [Google Scholar] [CrossRef]
- Luo, Y.; Zhao, H. Bipartite Flat-Graph Network for Nested Named Entity Recognition. arXiv 2020, arXiv:2005.00436. [Google Scholar]
- Fisher, J.; Vlachos, A. Merge and Label: A Novel Neural Network Architecture for Nested NER. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; Association for Computational Linguistics: Florence, Italy, 2019; pp. 5840–5850. [Google Scholar] [CrossRef]
- Sohrab, M.G.; Miwa, M. Deep Exhaustive Model for Nested Named Entity Recognition. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; Association for Computational Linguistics: Brussels, Belgium, 2018; pp. 2843–2849. [Google Scholar] [CrossRef]
- Li, X.; Feng, J.; Meng, Y.; Han, Q.; Wu, F.; Li, J. A Unified MRC Framework for Named Entity Recognition. arXiv 2022, arXiv:1910.11476. [Google Scholar]
- Tan, Z.; Shen, Y.; Zhang, S.; Lu, W.; Zhuang, Y. A Sequence-to-Set Network for Nested Named Entity Recognition. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, Montreal, QC, Canada, 19–26 August 2021; International Joint Conferences on Artificial Intelligence Organization: Montreal, QC, Canada, 2021; pp. 3936–3942. [Google Scholar] [CrossRef]
- Xu, Y.; Huang, H.; Feng, C.; Hu, Y. A Supervised Multi-Head Self-Attention Network for Nested Named Entity Recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 2–9 February 2021; Volume 35, pp. 14185–14193. [Google Scholar] [CrossRef]
- Huang, P.; Zhao, X.; Hu, M.; Fang, Y.; Li, X.; Xiao, W. Extract-Select: A Span Selection Framework for Nested Named Entity Recognition with Generative Adversarial Training. In Findings of the Association for Computational Linguistics: ACL 2022; Association for Computational Linguistics: Dublin, Ireland, 2022; pp. 85–96. [Google Scholar] [CrossRef]
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph Attention Networks. arXiv 2018, arXiv:1710.10903. [Google Scholar]
- Liang, S.; Wei, W.; Mao, X.-L.; Wang, F.; He, Z. BiSyn-GAT+: Bi-Syntax Aware Graph Attention Network for Aspect-Based Sentiment Analysis. In Findings of the Association for Computational Linguistics: ACL 2022; Association for Computational Linguistics: Dublin, Ireland, 2022; pp. 1835–1848. [Google Scholar] [CrossRef]
- Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv 2017, arXiv:1609.02907. [Google Scholar]
- Eberts, M.; Ulges, A. Span-Based Joint Entity and Relation Extraction with Transformer Pre-Training. In Proceedings of the 24th European Conference on Artificial Intelligence (ECAI), Santiago de Compostela, Spain, 29 August–8 September 2020; Volume 325, pp. 2006–2013. [Google Scholar]
- Yu, J.; Bohnet, B.; Poesio, M. Named Entity Recognition as Dependency Parsing. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 6470–6476. [Google Scholar] [CrossRef]
- Lybarger, K.; Yetisgen, M.; Uzuner, Ö. The 2022 N2c2/UW Shared Task on Extracting Social Determinants of Health. J. Am. Med. Inform. Assoc. 2023, 30, 1367–1378. [Google Scholar] [CrossRef]
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. arXiv 2019, arXiv:1810.04805. [Google Scholar]
- Manning, C.; Surdeanu, M.; Bauer, J.; Finkel, J.; Bethard, S.; McClosky, D. The Stanford CoreNLP Natural Language Processing Toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Baltimore, MD, USA, 23–24 June 2014; Association for Computational Linguistics: Baltimore, MD, USA, 2014; pp. 55–60. [Google Scholar] [CrossRef]
- Zheng, Q.; Wu, Y.; Wang, G.; Chen, Y.; Wu, W.; Zhang, Z.; Shi, B.; Dong, B. Exploring Interactive and Contrastive Relations for Nested Named Entity Recognition. IEEE/ACM Trans. Audio Speech Lang. Process. 2023, 31, 2899–2909. [Google Scholar] [CrossRef]
- Katiyar, A.; Cardie, C. Nested Named Entity Recognition Revisited. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), New Orleans, LA, USA, 1–6 June 2018; Association for Computational Linguistics: New Orleans, LA, USA, 2018; pp. 861–871. [Google Scholar] [CrossRef]
- Fu, Y.; Tan, C.; Chen, M.; Huang, S.; Huang, F. Nested Named Entity Recognition with Partially-Observed TreeCRFs. arXiv 2020, arXiv:2012.08478. [Google Scholar] [CrossRef]
- Yang, S.; Tu, K. Bottom-Up Constituency Parsing and Nested Named Entity Recognition with Pointer Networks. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; Association for Computational Linguistics: Dublin, Ireland, 2022; pp. 2403–2416. [Google Scholar] [CrossRef]
- Yan, H.; Gui, T.; Dai, J.; Guo, Q.; Zhang, Z.; Qiu, X. A Unified Generative Framework for Various NER Subtasks. arXiv 2021, arXiv:2106.01223. [Google Scholar]
| | ACE2005 Train | ACE2005 Dev | ACE2005 Test | ACE2004 Train | ACE2004 Dev | ACE2004 Test | GENIA Train | GENIA Test |
|---|---|---|---|---|---|---|---|---|
| Total sentences | 7194 | 969 | 1047 | 6200 | 745 | 812 | 16,692 | 1854 |
| Total entities | 24,441 | 3200 | 2993 | 22,204 | 2514 | 3035 | 50,509 | 5506 |
| Sentences with nested entities | 2691 | 338 | 320 | 2712 | 294 | 388 | 3522 | 446 |
| Avg. sentence length | 19.21 | 18.93 | 17.2 | 22.50 | 23.02 | 23.05 | 25.35 | 25.99 |
| Total nested entities | 9389 | 1112 | 1118 | 10,149 | 1092 | 1417 | 9064 | 1199 |
| Nested entity percentage (%) | 38.41 | 34.75 | 37.35 | 45.71 | 46.69 | 45.61 | 17.95 | 21.78 |
| Models | ACE 2004 P | ACE 2004 R | ACE 2004 F1 | ACE 2005 P | ACE 2005 R | ACE 2005 F1 | GENIA P | GENIA R | GENIA F1 |
|---|---|---|---|---|---|---|---|---|---|
| HyperGraph (2018) | 73.60 | 71.80 | 72.70 | 70.60 | 70.40 | 70.50 | 77.70 | 71.80 | 74.60 |
| Second-path (2020) | 85.94 | 85.69 | 85.82 | 83.83 | 84.87 | 84.34 | 77.81 | 76.94 | 77.36 |
| Pyramid (2020) | 86.08 | 86.48 | 86.28 | 83.95 | 85.39 | 84.66 | 79.45 | 78.94 | 79.19 |
| BENSC (2020) | 85.80 | 84.80 | 85.30 | 83.80 | 83.90 | 83.90 | 79.20 | 77.40 | 78.30 |
| TreeCRFs (2021) | 86.70 | 86.50 | 86.60 | 84.50 | 86.40 | 85.40 | 78.20 | 78.20 | 78.20 |
| BartNER (2021) | 87.27 | 86.41 | 86.84 | 83.16 | 86.38 | 84.74 | 78.87 | 79.60 | 79.23 |
| PointNetwork (2022) | 86.60 | 87.28 | 86.94 | 84.61 | 86.43 | 85.53 | 78.08 | 78.26 | 78.16 |
| Ours | 87.17 | 87.40 | 87.28 | 85.71 | 86.23 | 85.97 | 79.48 | 80.01 | 79.74 |
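The reported scores follow the standard span-level precision/recall/F1 convention, where F1 is the harmonic mean of precision and recall. The short Python check below is illustrative only (it is not the authors' evaluation code); it simply recomputes F1 from the precision and recall values reported for the proposed model in the table above and confirms they agree up to rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)

# Reported (P, R, F1) triples for the proposed model, copied from the results table.
reported = {
    "ACE 2004": (87.17, 87.40, 87.28),
    "ACE 2005": (85.71, 86.23, 85.97),
    "GENIA":    (79.48, 80.01, 79.74),
}

for dataset, (p, r, f1) in reported.items():
    # The recomputed F1 should match the reported value after rounding to two decimals.
    print(f"{dataset}: reported F1 = {f1:.2f}, recomputed F1 = {f1_score(p, r):.2f}")
```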
| Model | ACE 2004 P | ACE 2004 R | ACE 2004 F1 | ACE 2005 P | ACE 2005 R | ACE 2005 F1 | GENIA P | GENIA R | GENIA F1 |
|---|---|---|---|---|---|---|---|---|---|
| w/o boundary | 86.88 | 87.21 | 87.04 | 85.36 | 86.20 | 85.78 | 79.12 | 79.80 | 79.46 |
| w/o filter | 86.53 | 87.30 | 86.91 | 84.97 | 85.85 | 85.41 | 78.83 | 80.20 | 79.51 |
| w/o prototype | 86.14 | 87.18 | 86.66 | 84.32 | 85.91 | 85.11 | 78.16 | 79.65 | 78.90 |
| Full model | 87.17 | 87.40 | 87.28 | 85.71 | 86.23 | 85.97 | 79.48 | 80.01 | 79.74 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).