Global and Local Information Adjustment for Semantic Similarity Evaluation
Abstract
1. Introduction
- To evaluate the semantic similarity of sentence pairs, we propose a model that simultaneously uses global features (information from the entire sentence) and local features (information from localized parts of the sentence). The proposed model can adjust whether to focus more on global or on local information, and it achieves higher accuracy than existing models that use global information only.
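The adjustable balance between the two feature types can be pictured as a convex combination of the two encoders' outputs. The sketch below is illustrative only: the weighted-sum scheme and the helper name `combine_features` are our assumptions, not the exact adjustment mechanism of the proposed model.

```python
def combine_features(global_feat, local_feat, alpha):
    """Blend a sentence-level (global) feature vector with a
    phrase-level (local) one; alpha in [0, 1] is the global weight."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * g + (1.0 - alpha) * l
            for g, l in zip(global_feat, local_feat)]

# Toy 4-dimensional vectors standing in for the two encoders' outputs.
g = [1.0, 0.0, 1.0, 0.0]   # e.g., final Bi-LSTM state (global)
l = [0.0, 1.0, 0.0, 1.0]   # e.g., pooled capsule output (local)
print(combine_features(g, l, 0.75))  # → [0.75, 0.25, 0.75, 0.25]
```

With `alpha = 1.0` the model degenerates to a global-only encoder, which is the baseline configuration the paper compares against.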
- In this study, the effect of dynamic routing on similarity evaluation was also investigated. Because semantically similar sentences can place corresponding phrases in relatively free positions, dynamic routing was found to hinder correct evaluation. In addition, experiments were conducted on both English and Korean datasets to demonstrate the language independence of the proposed model.
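For reference, the routing-by-agreement procedure that was ablated here (Sabour et al., "Dynamic routing between capsules") can be sketched in plain Python. The toy prediction vectors `u_hat` below are invented for illustration; this is a minimal sketch of the algorithm, not the paper's implementation.

```python
import math

def squash(v):
    # Capsule non-linearity: keeps direction, shrinks length into [0, 1).
    n2 = sum(x * x for x in v)
    scale = n2 / (1.0 + n2) / math.sqrt(n2) if n2 > 0 else 0.0
    return [scale * x for x in v]

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [x / s for x in e]

def dynamic_routing(u_hat, iterations=3):
    """u_hat[i][j] is the prediction vector from input capsule i
    to output capsule j (routing-by-agreement, Sabour et al. 2017)."""
    n_in, n_out, d = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]      # routing logits
    for _ in range(iterations):
        c = [softmax(row) for row in b]           # coupling coefficients
        s = [[sum(c[i][j] * u_hat[i][j][k] for i in range(n_in))
              for k in range(d)] for j in range(n_out)]
        v = [squash(sj) for sj in s]              # output capsule vectors
        for i in range(n_in):                     # agreement update
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][k] * v[j][k] for k in range(d))
    return v

# Two input capsules whose predictions agree on output capsule 0.
u_hat = [
    [[1.0, 0.0], [0.0, 0.2]],   # capsule 0's predictions
    [[0.9, 0.1], [0.0, -0.2]],  # capsule 1's predictions
]
v = dynamic_routing(u_hat)
lengths = [math.sqrt(sum(x * x for x in vj)) for vj in v]
print(lengths)  # capsule 0 ends up with the longer output vector
```

Routing reinforces whichever output capsule the input predictions agree on; the paper's finding is that this position-sensitive agreement is counterproductive when corresponding phrases of similar sentences appear in different positions.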
2. Related Works
3. Materials and Methods
3.1. Word Embedding
3.2. Bidirectional Long Short-Term Memory
3.3. Attention Mechanism
3.4. Capsule Network
3.5. Similarity Measure
4. Experiments
4.1. Dataset
4.1.1. English Dataset
4.1.2. Korean Dataset
4.2. Hyperparameters
4.3. Accuracy Comparison According to
4.4. Result
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Marelli, M.; Menini, S.; Baroni, M.; Bentivogli, L.; Bernardi, R.; Zamparelli, R. A SICK cure for the evaluation of compositional distributional semantic models. In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC’14), Reykjavik, Iceland, 26–31 May 2014; pp. 216–223.
- Mueller, J.; Thyagarajan, A. Siamese recurrent architectures for learning sentence similarity. In Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI-16), Phoenix, AZ, USA, 12–17 February 2016; pp. 2786–2792.
- Pontes, E.L.; Huet, S.; Linhares, A.C.; Torres-Moreno, J.M. Predicting the semantic textual similarity with Siamese CNN and LSTM. arXiv 2018, arXiv:1810.10641.
- Li, Y.; Zhou, D.; Zhao, W. Combining local and global features into a Siamese network for sentence similarity. IEEE Access 2020, 8, 75437–75447.
- Papineni, K.; Roukos, S.; Ward, T.; Zhu, W.J. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 6–12 July 2002; pp. 311–318.
- Banerjee, S.; Lavie, A. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, Ann Arbor, MI, USA, 29 June 2005; pp. 65–72.
- Lin, Z.; Feng, M.; Santos, C.N.D.; Yu, M.; Xiang, B.; Zhou, B.; Bengio, Y. A structured self-attentive sentence embedding. In Proceedings of the 5th International Conference on Learning Representations (ICLR 2017), Toulon, France, 24–26 April 2017.
- Ma, Q.; Yu, L.; Tian, S.; Chen, E.; Ng, W.W. Global-local mutual attention model for text classification. IEEE/ACM Trans. Audio Speech Lang. Process. 2019, 27, 2127–2139.
- Sabour, S.; Frosst, N.; Hinton, G.E. Dynamic routing between capsules. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 3859–3869.
- Yang, M.; Zhao, W.; Chen, L.; Qu, Q.; Zhao, Z.; Shen, Y. Investigating the transferring capability of capsule networks for text classification. Neural Netw. 2019, 118, 247–261.
- Wu, Y.; Li, J.; Wu, J.; Chang, J. Siamese capsule networks with global and local features for text classification. Neurocomputing 2020, 390, 88–98.
- Kim, J.; Jang, S.; Park, E.; Choi, S. Text classification using capsules. Neurocomputing 2020, 376, 214–221.
- Yang, Z.; Yang, D.; Dyer, C.; He, X.; Smola, A.; Hovy, E. Hierarchical attention networks for document classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 1480–1489.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Gong, J.; Qiu, X.; Wang, S.; Huang, X. Information aggregation via dynamic routing for sequence encoding. In Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe, NM, USA, 20–25 August 2018; pp. 2742–2752.
- Zheng, G.; Mukherjee, S.; Dong, X.L.; Li, F. OpenTag: Open attribute value extraction from product profiles. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 1049–1058.
- Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G.S.; Dean, J. Distributed representations of words and phrases and their compositionality. In Proceedings of the 26th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; pp. 3111–3119.
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781.
- Sahlgren, M. The distributional hypothesis. Ital. J. Linguist. 2008, 20, 33–53.
- Peng, C.; Zhang, X.; Yu, G.; Luo, G.; Sun, J. Large kernel matters--Improve semantic segmentation by global convolutional network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4353–4361.
- Google News. Available online: https://code.google.com/archive/p/word2vec/ (accessed on 7 September 2020).
- Korean Raw Corpus. Available online: http://nlp.kookmin.ac.kr (accessed on 7 September 2020).
- Song, H.J.; Heo, T.S.; Kim, J.D.; Park, C.Y.; Kim, Y.S. Sentence similarity evaluation using Sent2Vec and Siamese neural network with parallel structure. J. Intell. Fuzzy Syst. 2021. Pre-press.
- Quora Question Pairs. Available online: https://www.kaggle.com/quora/question-pairs-dataset (accessed on 7 September 2020).
- Google Translator. Available online: https://translate.google.com/ (accessed on 2 February 2021).
- Naver Question Pairs. Available online: http://kin.naver.com/ (accessed on 7 September 2020).
- Exobrain Korean Paraphrase Corpus. Available online: http://aiopen.etri.re.kr/service_dataset.php (accessed on 7 September 2020).
- German Translation Pair of Hankuk University of Foreign Studies. Available online: http://deutsch.hufs.ac.kr/ (accessed on 7 September 2020).
- Naver Spell Checker. Available online: https://search.naver.com/search.naver?sm=top_hty&fbm=0&ie=utf8&query=%EB%84%A4%EC%9D%B4%EB%B2%84+%EB%A7%9E%EC%B6%A4%EB%B2%95+%EA%B2%80%EC%82%AC%EA%B8%B0 (accessed on 7 September 2020).
- Kkma Morpheme Analyzer. Available online: http://kkma.snu.ac.kr/ (accessed on 7 September 2020).
Sentence 1 | Sentence 2 | Label |
---|---|---|
I’m a 19-year-old. How can I improve my skills or what should I do to become an entrepreneur in the next few years? | I am a 19 years old guy. How can I become a billionaire in the next 10 years? | 0 |
What are the life lessons that Batman teaches us? | What are the life lessons you can learn from the dark knight? | 1 |
Language | Sentence 1 | Sentence 2 | Label |
---|---|---|---|
Korean | 참고 인내하며 때를 기다려야 하는 날입니다. | 일이 잘 풀릴 것 같은 날입니다. | 0 |
English | It is a day when you have to be patient and wait for the time. | It’s a day that seems to be going well. | |
Korean | 가장 저렴한 방법으로 치아 미백 효과를 낼 수 있는 방법은? | 치아를 미백할 수 있는 저렴하고 효율적인 방법은? | 1 |
English | What is the cheapest way to produce teeth whitening effect? | What is an inexpensive and efficient way to whiten teeth? |
Features | Description | Hyperparameter |
---|---|---|
Global | The number of units of bidirectional long short-term memory (Bi-LSTM) | 256 |
Local | | Rectified Linear Unit |
 | | 3 |
 | | 256 |
 | | None |
 | | The total size of the input |
 | | 256 |
 | The number of dimensions used in the PrimaryCaps | 8 |
No. | Model | English Acc | Korean Acc |
---|---|---|---|
1 | LSTM [2] | 80.99% | 87.25% |
2 | Convolutional neural networks (CNN) + LSTM [3] | 80.44% | 83.76% |
3 | Group CNN + bidirectional gated recurrent unit (Bi-GRU) [4] | 82.18% | 89.52% |
4 | Bi-LSTM | 82.90% | 90.28% |
5 | CNN + Max-Pooling | 80.58% | 88.26% |
6 | Capsule Network | 81.57% | 88.85% |
7 | Capsule Network + Dynamic Routing | 78.82% | 86.48% |
8 | LSTM + Capsule Network | 82.07% | 90.25% |
9 | Bi-LSTM + Capsule Network | 82.15% | 90.56% |
10 | LSTM + Self-Attention + Capsule Network | 82.78% | 90.61% |
11 | Bi-LSTM + Self-Attention + Capsule Network | 82.90% | 91.41% |
12 | Proposed Model | 83.51% | 92.08% |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Heo, T.-S.; Kim, J.-D.; Park, C.-Y.; Kim, Y.-S. Global and Local Information Adjustment for Semantic Similarity Evaluation. Appl. Sci. 2021, 11, 2161. https://doi.org/10.3390/app11052161