Article
Peer-Review Record

Speech Enhancement for Hearing Aids with Deep Learning on Environmental Noises

Appl. Sci. 2020, 10(17), 6077; https://doi.org/10.3390/app10176077
by Gyuseok Park 1,2,†, Woohyeong Cho 3,†, Kyu-Sung Kim 2,4 and Sangmin Lee 1,3,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 20 August 2020 / Revised: 28 August 2020 / Accepted: 31 August 2020 / Published: 2 September 2020
(This article belongs to the Special Issue Intelligent Speech and Acoustic Signal Processing)

Round 1

Reviewer 1 Report

The paper is really good and correct in all its aspects.
A few minor comments:
Avoid the use of "we" in English writing.
Please insert the equation numbering as requested by the journal guidelines.
Conclusions and discussions are a bit short and need to be improved.

Author Response

Dear Reviewer,

Thank you for your valuable comments. I have revised the paper in line with them and have sent you the revised version.

My responses to each comment are as follows.

  1. Avoid the use of "we" in English writing.
  • #24, #114, #172, and #253 in the paper:
    The sentences using “we” were changed to the passive voice.

  2. Please insert the equation numbering as requested by the journal guidelines.
  • #204 in the paper:
    The equation numbering was inserted as requested by the journal guidelines.

  3. Conclusions and discussions are a bit short and need to be improved.
  • From #269 to #282 in the paper:
    A more detailed discussion and the strengths of the proposed algorithm were added to the conclusion.

Author Response File: Author Response.docx

Reviewer 2 Report

Thank you for inviting me to be a reviewer of manuscript ApplSci-921753, titled “Speech Enhancement for Hearing aids with Deep Learning on Environmental Noises”.

In this manuscript, the authors propose a new method for improving speech quality in a hearing aid environment by applying a deep learning algorithm.

The manuscript is well-structured.

The main idea and its machine learning foundations are clearly described.

The evaluation of speech quality verifies that the proposed noise reduction algorithm could effectively improve speech quality in a noisy environment. A comparison with similar studies’ results is also available.

In my opinion, the authors should add more details about similar previous studies by using references from the last five years.

There is no information on how the new method is actually implemented (programming language, development environment).

 

Some technical remarks:

Figure 1: “IRM” is actually “IBM”.

l. 215: “Table 3” -> “Table 2”.

l. 348: Please, remove source #35.

Author Response

Dear Reviewer,

Thank you for your valuable comments. I have revised the paper in line with them and have sent you the revised version.

My responses to each comment are as follows.

  1. The evaluation of speech quality verifies that the proposed noise reduction algorithm could effectively improve speech quality in a noisy environment. A comparison with similar studies’ results is also available.
    In my opinion, the authors should add more details about similar previous studies by using references from the last five years.
  • From #60 to #71 in the paper:
    Similar previous studies from the last five years were added and compared in Section 1.

  2. There is no information on how the new method is actually implemented (programming language, development environment).
  • From #193 to #194 in the paper:
    The implementation details were added to Section 2.4.

  3. Figure 1: “IRM” is actually “IBM”.
  • #148 in the paper:
    “IRM” was replaced with “IBM” in Figure 1.

  4. l. 215: “Table 3” -> “Table 2”.
  • #240 in the paper:
    “Table 3” was replaced with “Table 2” in Table 2.

  5. l. 348: Please, remove source #35.
  • #387 in the paper:
    The source, which had been carried over from the template, was removed.

Author Response File: Author Response.docx

Reviewer 3 Report

This is a very interesting paper and reads very well.

It would be interesting to know in which frequency range the model works. Since the sounds were recorded at 44.1 kHz, one might assume that the frequency range is between 20 Hz and 20 kHz. If not, it should be specified.

Line 162 states that 44.1 kHz is the highest possible sampling frequency. This does not generally hold: audio with a maximum frequency range of 20 kHz is usually sampled at 44.1 kHz, but it can also be sampled at 48 kHz, and high-quality multi-channel audio can use sampling rates of up to 192 kHz. Therefore, you should specify why the sampling rate is 44.1 kHz; for example, the device you are using limits the sampling rate to 44.1 kHz.

 

Author Response

Dear Reviewer,

Thank you for your valuable comments. I have revised the paper in line with them and have sent you the revised version.

My responses to each comment are as follows.

It would be interesting to know in which frequency range the model works. Since the sounds were recorded at 44.1 kHz, one might assume that the frequency range is between 20 Hz and 20 kHz. If not, it should be specified.

Line 162 states that 44.1 kHz is the highest possible sampling frequency. This does not generally hold: audio with a maximum frequency range of 20 kHz is usually sampled at 44.1 kHz, but it can also be sampled at 48 kHz, and high-quality multi-channel audio can use sampling rates of up to 192 kHz. Therefore, you should specify why the sampling rate is 44.1 kHz; for example, the device you are using limits the sampling rate to 44.1 kHz.

 

  • From #179 to #183 in the paper:
    The original wording did not make clear how the noise was recorded and down-sampled for hearing aid signal processing, so I made some changes. The range of the recorded audio signal was from 0 to 44.1 kHz, and the recorded audio signal was down-sampled to 16 kHz.
    The details of the down-sampling process were added to Section 2.4.
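
The arithmetic behind the down-sampling step mentioned above can be sketched as follows. This is an illustrative Python sketch only, not the authors' implementation: it assumes that 44.1 kHz is the recording sampling rate and 16 kHz the target rate, and the helper names `nyquist_hz`, `resample_ratio`, and `resampled_length` are hypothetical. By the Nyquist sampling theorem, a signal sampled at a given rate can represent frequencies only up to half that rate.

```python
from fractions import Fraction
from typing import Tuple


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency representable at the given sampling rate."""
    return sample_rate_hz / 2


def resample_ratio(src_hz: int, dst_hz: int) -> Tuple[int, int]:
    """Return (up, down): upsample by `up`, then decimate by `down`."""
    ratio = Fraction(dst_hz, src_hz)  # Fraction reduces to lowest terms
    return ratio.numerator, ratio.denominator


def resampled_length(n_samples: int, src_hz: int, dst_hz: int) -> int:
    """Number of output samples after rational resampling (rounded up)."""
    up, down = resample_ratio(src_hz, dst_hz)
    return -(-n_samples * up // down)  # ceiling division on integers


if __name__ == "__main__":
    src_hz, dst_hz = 44_100, 16_000  # assumed recording and target rates
    up, down = resample_ratio(src_hz, dst_hz)
    print(f"Nyquist frequency at {src_hz} Hz: {nyquist_hz(src_hz)} Hz")
    print(f"Nyquist frequency at {dst_hz} Hz: {nyquist_hz(dst_hz)} Hz")
    print(f"Resampling ratio (up/down): {up}/{down}")
    print(f"One second of audio: {src_hz} -> "
          f"{resampled_length(src_hz, src_hz, dst_hz)} samples")
```

Under these assumptions, the 16 kHz signal can carry content only up to 8 kHz, so a practical resampler applies a low-pass (anti-aliasing) filter before decimation to suppress anything above that limit.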

Author Response File: Author Response.docx
