Article
Peer-Review Record

Smart-Median: A New Real-Time Algorithm for Smoothing Singing Pitch Contours

Appl. Sci. 2022, 12(14), 7026; https://doi.org/10.3390/app12147026
by Behnam Faghih * and Joseph Timoney
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 27 May 2022 / Revised: 30 June 2022 / Accepted: 7 July 2022 / Published: 12 July 2022
(This article belongs to the Special Issue Processing Techniques Applied to Audio, Image and Brain Signals)

Round 1

Reviewer 1 Report

Herein, the authors introduce a new smoothing algorithm that rectifies pitch detection. The proposed algorithm is compared with 15 other smoothing algorithms over roughly 2700 pitch contours, using four metrics. According to all four metrics, the proposed algorithm smooths the contours more accurately than the other algorithms. A distinct conclusion is that smoothing algorithms should be designed according to the type of contour and the final application of the result. The article is interesting and merits publication in the journal after the following minor comments are addressed:

1. Elaborate more on the results depicted in the given figures.

2. For better presentation, move the long tables to an appendix.

3. Comment on future extensions in this direction in the concluding remarks.

4. Proofread the whole manuscript for possible typos and grammatical errors.

 

Author Response

Dear Reviewer,

Many thanks for your time in reviewing our paper and for helping us to improve it. We appreciate it. We have revised the paper according to your comments and those of the other reviewer. Our responses to each comment follow.

Comment 1. Elaborate more on the results depicted in the given figures.

Answer: An example has been added to the Results section to clarify how to interpret the tables.

Comment 2. For better presentation, move the long tables to an appendix.

Answer: The tables have been moved to Appendix B and all the cross-references updated.

Comment 3. Comment on future extensions in this direction in the concluding remarks.

Answer:

The future work section has been rewritten as follows.

For future work, one short-term task is to set the parameters of the smart-median according to the specific properties of the sound input, such as a particular musical instrument or instrument family, to improve its accuracy in a targeted way. Another is to refine how the smart-median finds an incorrect F0, which is currently based on its interval from the previous F0. This approach can be improved by considering a maximum noise duration. For example, if there is a considerable frequency interval between the last F0 and the current one, and several immediately subsequent F0s are near the current F0, then the big jump may be treated not as noise but as a new musical articulation. This requires an extra decision-making stage in the algorithm. In the longer term, further testing can be carried out on vocal material from a wide variety of genres and techniques. This would require the creation of new, specialist corpora, which demands considerable manual effort in both gathering and labelling, although machine learning could support this work. Such a dataset would also benefit the research field at large.
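
The extra decision-making stage described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `classify_jump`, the use of cents as the interval unit, and all three threshold values are hypothetical, and every F0 value is assumed to be a positive frequency in Hz (voiced frames only).

```python
import math


def cents(f_from, f_to):
    """Signed interval from f_from to f_to in cents (1200 cents = 1 octave)."""
    return 1200.0 * math.log2(f_to / f_from)


def classify_jump(f0, i, jump_cents=300.0, persist_frames=5, near_cents=50.0):
    """Label frame i of an F0 contour as 'steady', 'articulation', 'noise',
    or 'undecided'.

    The thresholds are hypothetical placeholders, not values from the paper.
    """
    if i <= 0 or i >= len(f0):
        raise IndexError("frame i needs a previous frame inside the contour")

    # Small interval from the previous F0: nothing to decide.
    if abs(cents(f0[i - 1], f0[i])) < jump_cents:
        return "steady"

    # Large jump: look at the immediately following frames.
    following = f0[i + 1 : i + 1 + persist_frames]
    if len(following) < persist_frames:
        return "undecided"  # not enough look-ahead yet

    # If the following F0s stay near the current one, the jump persisted,
    # so treat it as a new note rather than as noise.
    if all(abs(cents(f0[i], f)) <= near_cents for f in following):
        return "articulation"
    return "noise"


new_note = [220.0] * 5 + [330.0] * 8           # jump that persists
spike = [220.0] * 5 + [440.0] + [220.0] * 7    # single-frame outlier

print(classify_jump(new_note, 5))  # articulation
print(classify_jump(spike, 5))     # noise
```

Note that a check of this kind needs `persist_frames` frames of look-ahead, so in a real-time setting it delays the decision by that many frames.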

Comment 4. Proofread the whole manuscript for possible typos and grammatical errors.

Answer:

We have edited the text to make it clearer for readers.

Reviewer 2 Report

The paper clearly presents the existing state of the art and clearly describes the experimental setup. It has, however, a few points that need improvement.

Line 92, "the mixture of" seems to be in a different font

Lines 167-168

"All the provided files, such as the dataset and codes, are available in a GitHub repository at https://github.com/BehnamFaghihMusicTech/Smart-Median"

I went there and found a repository with only the README.md file, saying "This repository included all relevant files to Smart-Median..."

Why the past tense? Did it include the files, but now it doesn't? If so, please remove the reference to the GitHub repository; otherwise correct it.

Equation (1)

Please define the meaning of N, GTi, SMi.  After a while I understood that they were the number of frames, the values of the Ground Truth and the results of the smoothing; however it is preferable to define them, in order to make the paper easier to read.

Equations (6), (7), (8) and (14)

Does the '*' in equation (6) denote a convolution (as it usually does) or a product (programming-language style)? From the context I guess that the authors are using it with the latter meaning, in which case I would strongly suggest removing it. This also applies to (7), (8) and (14). If it denotes convolution, please say so explicitly, since the first impression is that it is a product (actually, in (7) it has no other interpretation, since alpha is a scalar).
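
To illustrate why the two readings of '*' matter, the short sketch below applies both to the same data. The sequence `x`, the weights `w`, and the scalar `alpha` are made-up values for illustration only and are not taken from the manuscript's equations.

```python
def scale(alpha, x):
    """Product reading: multiply every sample by the scalar alpha."""
    return [alpha * v for v in x]


def convolve(a, b):
    """Convolution reading: full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out


x = [1.0, 2.0, 3.0]   # illustrative samples
w = [0.5, 0.5]        # illustrative two-tap weights
alpha = 0.5           # illustrative scalar

print(scale(alpha, x))   # [0.5, 1.0, 1.5]
print(convolve(w, x))    # [0.5, 1.5, 2.5, 1.5]
```

The product keeps the length of `x`, whereas the convolution mixes neighbouring samples and lengthens the output, so the notation needs to state which operation is meant.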

 

Line 259  "the data near the is prone to artefacts"

I think something is missing here 

Line 482 "the source code is available on the gitHub repository"

Which repository?  The one named on line 167? 

Tables 2-6

It would be nice if the "best" entries were emphasized with a bold font.

Table 3

Why are the R^2 values negative? According to its definition in (1), R^2 is the ratio of two squares and therefore cannot be negative. Please clarify.
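
As background to this question: Equation (1) is not reproduced in this record, so the following is only a sketch under an assumption. If R^2 is computed with the common coefficient-of-determination formula, R^2 = 1 - SS_res / SS_tot, then it is negative whenever the smoothed contour fits the ground truth worse than the ground-truth mean does. The names `gt` and `sm` follow the reviewer's reading of GTi and SMi, and the contour values are invented for illustration.

```python
def r_squared(gt, sm):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot.

    gt: ground-truth F0 values; sm: smoothed F0 values of the same length N.
    This formula is an assumption; it may differ from the paper's Equation (1).
    """
    n = len(gt)
    mean_gt = sum(gt) / n
    ss_res = sum((g - s) ** 2 for g, s in zip(gt, sm))   # residual sum of squares
    ss_tot = sum((g - mean_gt) ** 2 for g in gt)         # total sum of squares
    return 1.0 - ss_res / ss_tot


gt = [220.0, 221.0, 219.0, 220.0]       # nearly flat ground-truth contour
sm_bad = [220.0, 221.0, 219.0, 230.0]   # one frame is 10 Hz off

print(r_squared(gt, gt))       # 1.0
print(r_squared(gt, sm_bad))   # -49.0
```

Here SS_tot is 2 and SS_res is 100, so a single large error on a nearly flat contour is enough to push the value far below zero.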

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
