Information

Journal Browser

► Journal Browser

Natural Language Generation and Machine Learning

Share This Special Issue

Special Issue Editor

Dr. Ioannis Konstas

E-Mail Website
Guest Editor

School of Mathematical and Computer Sciences, Heriot Watt University, Edinburgh, Scotland, UK
Interests: natural language generation; natural language processing

Special Issue Information

Dear Colleagues,

The generation of natural language has always been one of the core tasks in the area of natural language processing, with a very wide spectrum of applications ranging from summarization systems to conversational agents. However, it was only quite recently that it received rekindled interest from the research community, mostly due to vast improvement of the linguistic capacity of neural generators and the wide adoption of more sophisticated generation systems within commercial personal assistants.

Natural language generation is an umbrella term for a variety of tasks that are usually categorized according to the different given input (e.g., text, unstructured/semi-structured/ structured meaning representations, images, dialogue history) and output formats (e.g., sentence, document, caption, dialogue utterance) they deal with. The most common tasks include text summarization, data-to-text generation with the input ranging from semi-structured record–field–value tables to meaning representations and RDF triples, caption generation of images, and conversational response generation in the context of either a closed- or open-domain dialogue. Given the subjectivity of the task and, often, the lack of large enough datasets with multiple reference text outputs, evaluating the quality of the output becomes a significant bottleneck in the deployment of a successful generation system.

The goal of this Special Issue is to present a collection of the current research in data-driven approaches to natural language generation using machine learning, aiming to capture a variety of tasks, modeling approaches, and effective evaluation techniques. We are interested in submissions of high-quality, original, technical and survey papers addressing both theoretical and practical aspects. We wish for this Special Issue to not only showcase systems with state-of-the-art performance on a specific task, but also studies that consider the ethical implications and the potential impact on society of such systems with regards to generating output which is of high fidelity, factual, and does not contradict common sense knowledge.

Dr. Ioannis Konstas
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.

Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.

Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.

External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.

e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (1 paper)

Order results

Result details

Show export options Show export options

Select all

Export citation of selected articles as:

Research

13 pages, 2178 KiB

Open AccessArticle

Multidocument Arabic Text Summarization Based on Clustering and Word2Vec to Reduce Redundancy

by Samer Abdulateef, Naseer Ahmed Khan, Bolin Chen and Xuequn Shang

Information 2020, 11(2), 59; https://doi.org/10.3390/info11020059 - 23 Jan 2020

Cited by 46 | Viewed by 6060

Abstract

Arabic is one of the most semantically and syntactically complex languages in the world. A key challenging issue in text mining is text summarization, so we propose an unsupervised score-based method which combines the vector space model, continuous bag of words (CBOW), clustering, and a statistically-based method. The problems with multidocument text summarization are the noisy data, redundancy, diminished readability, and sentence incoherency. In this study, we adopt a preprocessing strategy to solve the noise problem and use the word2vec model for two purposes, first, to map the words to fixed-length vectors and, second, to obtain the semantic relationship between each vector based on the dimensions. Similarly, we use a k-means algorithm for two purposes: (1) Selecting the distinctive documents and tokenizing these documents to sentences, and (2) using another iteration of the k-means algorithm to select the key sentences based on the similarity metric to overcome the redundancy problem and generate the initial summary. Lastly, we use weighted principal component analysis (W-PCA) to map the sentences’ encoded weights based on a list of features. This selects the highest set of weights, which relates to important sentences for solving incoherency and readability problems. We adopted Recall-Oriented Understudy for Gisting Evaluation (ROUGE) as an evaluation measure to examine our proposed technique and compare it with state-of-the-art methods. Finally, an experiment on the Essex Arabic Summaries Corpus (EASC) using the ROUGE-1 and ROUGE-2 metrics showed promising results in comparison with existing methods. Full article

(This article belongs to the Special Issue Natural Language Generation and Machine Learning)

► Show Figures

Journal Menu

Journal Browser

Natural Language Generation and Machine Learning

Share This Special Issue

Special Issue Editor

Special Issue Information

Keywords

Benefits of Publishing in a Special Issue

Published Papers (1 paper)

Research

Further Information

Guidelines

MDPI Initiatives

Follow MDPI