Editorial

Special Issue on Machine Learning and Natural Language Processing

by Maxim Mozgovoy 1,* and Calkin Suero Montero 2

1 Active Knowledge Engineering Lab, University of Aizu, Aizuwakamatsu 965-8580, Japan
2 School of Educational Sciences and Psychology, University of Eastern Finland, FI-80101 Joensuu, Finland
* Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(17), 8894; https://doi.org/10.3390/app12178894
Submission received: 30 August 2022 / Accepted: 30 August 2022 / Published: 5 September 2022
(This article belongs to the Special Issue Machine Learning and Natural Language Processing)
The task of processing natural language automatically has been on the radar of researchers since the dawn of computing, fostering the rise of fields such as computational linguistics and human–language technologies. Recent advances in machine learning (ML) technologies and the growing power of computing hardware have significantly pushed forward the state of the art in computational linguistics and related fields. The present Special Issue was conceived as a specialised forum for discussing these advancements and charting further potential research directions. We note the high interest of the scientific community in ML-supported natural language processing, which has resulted in the publication of twenty research papers in the present issue. We also note a growing diversity of natural languages addressed in the studies of this Special Issue, including English, Catalan, Arabic, Chinese, Korean, Vietnamese, Spanish and Portuguese.
Attempting to categorise the research topics addressed here, we highlight:
  • Analysing tweets and Twitter profiles. Kasthuriarachchy et al. [1] propose an improved method for understanding noisy English texts, while Alshalan and Al-Khalifa [2] design a system for hate speech detection in Arabic tweets. Prada and Iglesias [3], in their contribution, analyse users' Twitter profiles to predict their reputation on an online marketplace.
  • Annotated text corpora. Ruiz-Dolz et al. [4] develop a corpus of debate transcripts suitable for multilingual computational argumentation research. Looking at social media, Bel-Enguix et al. [5] present a corpus focused on negation structures found in tweets. Additionally, Vu et al. [6] build parallel Korean–English and Korean–Vietnamese datasets targeted at machine translation research, whereas Shaikh et al. [7] exploit text generation models to balance highly imbalanced English text corpora.
  • Chatbots. Chen et al. [8] improve chatbot performance by augmenting a transformer-based architecture with a memory-based deep neural attention model, while Kim et al. [9] propose a method for retaining the dialogue context by using a specially designed attention mechanism.
  • Error detection. Madi and Al-Khalifa [10] address error detection in their work on grammar checking of Modern Standard Arabic texts.
  • Named entity recognition (NER). Syed and Chung [11] fine-tune a BERT model to improve its performance on English food menu data. Additionally, relying on a fine-tuned ERNIE model, Wang et al. [12] achieve high-quality Chinese NER. Kim and Kim [13], on the other hand, design a system able to perform both morphological analysis and NER in Korean, while Dias et al. [14] propose a combination of methods to build a NER system for Portuguese.
  • Natural language understanding. Son et al. [15] introduce a Sequential and Intensive Weighted Language Modelling scheme that is used together with a multi-task deep neural network to outperform state-of-the-art approaches on standard natural language understanding benchmarks. In addition, Zeng et al. [16] provide a survey of existing machine reading comprehension tasks, evaluation metrics and datasets.
  • Text classification and summarisation. Long et al. [17] propose a novel graph convolutional network-based classifier for the task of relation classification. Focusing on the specialised task of classifying industrial construction accident reports, Zhang et al. [18] present their approach and promising results. Additionally, Kim et al. [19] design a text summariser that takes the topical category of the input document into account. Zeng et al. [20], furthermore, implement a self-matching mechanism to improve the memory capacity of their document summarisation system.
The diversity of topics, languages and approaches to text analysis points to the large number of underexplored areas in this field and the possibility of further progress. We wish to thank all the authors of the present Special Issue and encourage new contributions to natural language processing technologies.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kasthuriarachchy, B.; Chetty, M.; Shatte, A.; Walls, D. From General Language Understanding to Noisy Text Comprehension. Appl. Sci. 2021, 11, 7814. [Google Scholar] [CrossRef]
  2. Alshalan, R.; Al-Khalifa, H. A Deep Learning Approach for Automatic Hate Speech Detection in the Saudi Twittersphere. Appl. Sci. 2020, 10, 8614. [Google Scholar] [CrossRef]
  3. Prada, A.; Iglesias, C.A. Predicting Reputation in the Sharing Economy with Twitter Social Data. Appl. Sci. 2020, 10, 2881. [Google Scholar] [CrossRef]
  4. Ruiz-Dolz, R.; Nofre, M.; Taulé, M.; Heras, S.; García-Fornes, A. VivesDebate: A New Annotated Multilingual Corpus of Argumentation in a Debate Tournament. Appl. Sci. 2021, 11, 7160. [Google Scholar] [CrossRef]
  5. Bel-Enguix, G.; Gómez-Adorno, H.; Pimentel, A.; Ojeda-Trueba, S.L.; Aguilar-Vizuet, B. Negation Detection on Mexican Spanish Tweets: The T-MexNeg Corpus. Appl. Sci. 2021, 11, 3880. [Google Scholar] [CrossRef]
  6. Vu, V.H.; Nguyen, Q.P.; Shin, J.C.; Ock, C.Y. UPC: An Open Word-Sense Annotated Parallel Corpora for Machine Translation Study. Appl. Sci. 2020, 10, 3904. [Google Scholar] [CrossRef]
  7. Shaikh, S.; Daudpota, S.M.; Imran, A.S.; Kastrati, Z. Towards Improved Classification Accuracy on Highly Imbalanced Text Dataset Using Deep Neural Language Models. Appl. Sci. 2021, 11, 869. [Google Scholar] [CrossRef]
  8. Chen, J.; Agbodike, O.; Wang, L. Memory-Based Deep Neural Attention (mDNA) for Cognitive Multi-Turn Response Retrieval in Task-Oriented Chatbots. Appl. Sci. 2020, 10, 5819. [Google Scholar] [CrossRef]
  9. Kim, S.; Kwon, O.W.; Kim, H. Knowledge-Grounded Chatbot Based on Dual Wasserstein Generative Adversarial Networks with Effective Attention Mechanisms. Appl. Sci. 2020, 10, 3335. [Google Scholar] [CrossRef]
  10. Madi, N.; Al-Khalifa, H. Error Detection for Arabic Text Using Neural Sequence Labeling. Appl. Sci. 2020, 10, 5279. [Google Scholar] [CrossRef]
  11. Syed, M.H.; Chung, S.T. MenuNER: Domain-Adapted BERT Based NER Approach for a Domain with Limited Dataset and Its Application to Food Menu Domain. Appl. Sci. 2021, 11, 6007. [Google Scholar] [CrossRef]
  12. Wang, Y.; Sun, Y.; Ma, Z.; Gao, L.; Xu, Y. An ERNIE-Based Joint Model for Chinese Named Entity Recognition. Appl. Sci. 2020, 10, 5711. [Google Scholar] [CrossRef]
  13. Kim, H.; Kim, H. Integrated Model for Morphological Analysis and Named Entity Recognition Based on Label Attention Networks in Korean. Appl. Sci. 2020, 10, 3740. [Google Scholar] [CrossRef]
  14. Dias, M.; Boné, J.; Ferreira, J.C.; Ribeiro, R.; Maia, R. Named Entity Recognition for Sensitive Data Discovery in Portuguese. Appl. Sci. 2020, 10, 2303. [Google Scholar] [CrossRef]
  15. Son, S.; Hwang, S.; Bae, S.; Park, S.J.; Choi, J.H. A Sequential and Intensive Weighted Language Modeling Scheme for Multi-Task Learning-Based Natural Language Understanding. Appl. Sci. 2021, 11, 3095. [Google Scholar] [CrossRef]
  16. Zeng, C.; Li, S.; Li, Q.; Hu, J.; Hu, J. A Survey on Machine Reading Comprehension—Tasks, Evaluation Metrics and Benchmark Datasets. Appl. Sci. 2020, 10, 7640. [Google Scholar] [CrossRef]
  17. Long, J.; Wang, Y.; Wei, X.; Ding, Z.; Qi, Q.; Xie, F.; Huang, W. Entity-Centric Fully Connected GCN for Relation Classification. Appl. Sci. 2021, 11, 1377. [Google Scholar] [CrossRef]
  18. Zhang, J.; Zi, L.; Hou, Y.; Deng, D.; Jiang, W.; Wang, M. A C-BiLSTM Approach to Classify Construction Accident Reports. Appl. Sci. 2020, 10, 5754. [Google Scholar] [CrossRef]
  19. Kim, S.E.; Kaibalina, N.; Park, S.B. A Topical Category-Aware Neural Text Summarizer. Appl. Sci. 2020, 10, 5422. [Google Scholar] [CrossRef]
  20. Zeng, B.; Xu, R.; Yang, H.; Gan, Z.; Zhou, W. Comprehensive Document Summarization with Refined Self-Matching Mechanism. Appl. Sci. 2020, 10, 1864. [Google Scholar] [CrossRef]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
