Punctuation Restoration with Transformer Model on Social Media Data
Round 1
Reviewer 1 Report
The article is well structured and well written. Only minor changes in writing after a good proofreading are recommended. The topic is novel, current and of potential interest to readers.
Punctuation restoration is an important topic in NLP. The authors focus on the sentiment analysis use case, but please note that there are many other potential applications, such as automatic speech recognition (ASR). I recommend elaborating on other potential areas of application and on the potential scientific and business implications. For evaluation, my recommendation is to add the WER/TER metrics.
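For context on the suggested metric, word error rate (WER) can be computed as the word-level Levenshtein distance between a reference and a hypothesis, divided by the reference length. A minimal pure-Python sketch (a generic illustration, not tied to the paper's pipeline; libraries such as jiwer offer the same functionality):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming edit-distance table, d[i][j] = distance
    # between the first i reference words and first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)
```

For example, restoring "hello , world" against the reference "hello world" would count the spurious comma token as one insertion out of two reference words, giving a WER of 0.5.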
It would be nice if you could discuss whether the method is applicable to other languages, e.g. Slavic or other European languages, and add a demonstration Colaboratory notebook for better understanding and easier reproduction.
Author Response
The answer is in the attached file.
Author Response File: Author Response.docx
Reviewer 2 Report
Author Response
Attached in the file.
Author Response File: Author Response.docx
Reviewer 3 Report
The article "Punctuation Restoration with Transformer Model on Social Media Data" focuses on the problem of correctly dividing a text's paragraphs into separate sentences. The authors argue that misplaced punctuation in social media posts can have a negative impact on the analysis of the text's sentiment. The study applies a transformer model approach to recover punctuation and break the text into sentences. According to the results, the BiLSTM achieves 90–97% punctuation-recovery accuracy, depending on the dataset.
A few significant observations:
1. Section 2 requires considerable revision. The authors have provided a cursory review of existing algorithms used for punctuation restoration. The review should be expanded. Existing studies should be analyzed, detailing the datasets, evaluation methods, results, and limitations used in them. The section should conclude with a discussion of the limitations common to all approaches and the problems that the new (author's) approach should address.
2. Section 3.1 should provide a rationale for the choice of the Amazon and Telekom datasets. Clarify whether there were alternative datasets and why they were not used. In addition, the authors provide statistics regarding Amazon in Table 5, but similar statistics for Telekom are missing.
3. In Section 3.5, the rationale for the choice of hyperparameters should be given. Information about the number of learning epochs should also be given, and it should be clarified how such a number was established.
4. In lines 150–151, the authors provide information that the training dataset size was 60% and the test dataset size was 40%. Provide an explanation why the cross-validation procedure wasn't used, since this procedure allows for a more objective evaluation.
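For context, the k-fold cross-validation procedure the comment refers to rotates which portion of the data serves as the test set, so every sample is evaluated exactly once. A stdlib-only sketch of the index splitting (a generic illustration, not the authors' pipeline; scikit-learn's `KFold` provides the same behaviour):

```python
def kfold_indices(n_samples: int, k: int = 5):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    # Distribute samples as evenly as possible across the k folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size
```

With k = 5 each iteration trains on 80% of the data and tests on the remaining 20%, and averaging the per-fold scores gives a less split-dependent estimate than a single 60/40 split.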
Author Response
The response is attached in the file.
Author Response File: Author Response.docx
Round 2
Reviewer 1 Report
The authors made the necessary corrections.
Reviewer 2 Report
Dear Editor,
The paper is related to Natural Language Processing, which is not my area of expertise, so I would not be able to review it. I therefore request that it be assigned to a more suitable reviewer.
Reviewer 3 Report
The paper has been revised, and all my comments have been taken into account.