Editorial

Special Issue on Current Trends and Future Directions in Voice Acoustics Measurement

Sten Ternström

Division of Speech, Music and Hearing, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden
Appl. Sci. 2023, 13(6), 3514; https://doi.org/10.3390/app13063514
Submission received: 4 March 2023 / Accepted: 7 March 2023 / Published: 9 March 2023
(This article belongs to the Special Issue Current Trends and Future Directions in Voice Acoustics Measurement)
The human voice production mechanism implements a superbly rich communication channel that at once tells us what, who, how, and much more. This is thanks to its many degrees of freedom and its large variability. This same variability, however, presents a multitude of challenges to using the sound of the voice for assessing vocal status in the clinic or in the voice studio. Decades of research notwithstanding, many acoustic and other physical measures of voice are still not solidly established as clinical evidence, and this is true even though experienced clinicians or pedagogues can often hear what the problem is.
Therefore, this Special Issue begins with a group of five reviews and tutorial articles that address the evidential and informative value of various forms of analysis, both established and proposed. The invited opening opinion piece by Brockmann-Bauser and de Paula Soares [1] implicitly encourages voice scientists to emerge from the bubble of technical/theoretical research and take stock of the most pressing needs of voice clinics. Which issues should be prioritized for clinical measurement?
The present maelstrom of new developments in machine learning (ML) and data analytics is producing a deluge of articles on automated voice diagnosis. While ML techniques clearly hold great promise, too many such papers turn out to be poorly grounded in the domain of voice science as such, and/or are difficult to assimilate for readers and reviewers who are not themselves in the ML field. The featured article by Gómez-Vilda et al. [2] gives an extensive methodological appraisal of articles that apply ML to voice analysis and offers guidance and criteria for future authors, reviewers, editors, and readers alike.
In the third article [3], Peter Pabon and I present a perspective on how the mapping of the voice over its range reveals a greater variability than is often appreciated; to such an extent, in fact, that conventional methods for sampling and aggregating observations of the voice often must be questioned. Fortunately, the paradigm of the voice map can also be used to account for how much individual voices differ in their variation across the range, and to visualize how functional variation has been affected by a pathology or by an intervention, clinical or pedagogical.
Despite increasingly insistent cautions from statisticians, there remains entrenched in many research communities, including ours, an over-reliance on null hypothesis significance testing for judging the import of quantitative findings. The invited tutorial by Anders Sand [4] uses an exemplary vocological setting to demonstrate why inferential statistics should not be a primary basis for interpreting one’s results. In doing so, he voices the statisticians’ concern for us. You may find the implications of his article to be disruptive or not, depending on your training.
Quantitative real-time measurement of voice is finding applications for feedback in pedagogy for singing, as is herein thoroughly reviewed by Lã and Fiuza [5], under the classical headings of respiration, phonation, and articulation. Many aspects of vocal performance are measurable, and voice teachers are showing an increasing interest in its quantitative assessment combined with feedback. Singing is different from speech in many ways, and its quantitative assessment poses particular challenges for the technologist and voice teacher alike.
So far, these general reviews have not concerned any particular type of measurement. The extensive review of studies of the Relative Fundamental Frequency (RFF) by McKenna et al. [6] brings us to the consideration of specific voice metrics. The RFF metric has been widely proposed for assessing laryngeal tension, with the great attraction of being non-invasive and easy to understand. The review shows that consensus on how it should be deployed largely remains to be established. On a somewhat related topic, also dealing with onsets and offsets of phonation, the article by Naghibolhosseini et al. [7] investigates whether glottal attacks/offsets as obtained by endoscopic imaging could be usable for characterizing adductor spasmodic dysphonia. Imaging is an important complement to the acoustic signal, and since modern high-speed video techniques generate extreme volumes of image data, there is a great need for automated glottis segmentation. The article by Döllinger et al. [8] reports on a major effort in applying ML to this problem, including the issues associated with training that is accumulated over several datasets.
True to its theme, this Special Issue also contains reports on new types of measurement. Measuring or at least estimating the subglottal pressure in real-time during in vivo phonation has often been touted as the ‘holy grail’ of vocology. To do so non-invasively and in an ambulatory setting is particularly ambitious. The groups around Cortés et al. have a long track record of such research, and their invited article [9] reports on their recent promising developments. Two complementary trends in algorithms for voice analysis are those toward increasing complexity and detail, versus low-power, low-complexity solutions for portable devices. Fernandes et al. [10] propose one of the latter, for extracting perturbation-related metrics, intended as analysis features for inclusion in a portable system. Still, individual metrics will not suffice to characterize vocal function. With the aim of leveraging combined measurements, Cai and Ternström [11] present a scheme for automated, real-time classification and visualization of phonation types such as breathy, pressed, etc. It is based on direct clustering of combinations of acoustic and EGG metrics, displayed as colors on voice maps.
Finally, I am most grateful to the nestors of voice simulation, Ingo Titze and Jorge Lucero, for kindly agreeing to offer their views on appropriate ways forward. Their insights and suggestions [12] close this Special Issue, and open a door to the future.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

My warm thanks are extended to all the authors and reviewers who have made this Special Issue possible, as well as to the staff at MDPI for their expeditious and rigorous assistance.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Brockmann-Bauser, M.; de Paula Soares, M.F. Do We Get What We Need from Clinical Acoustic Voice Measurements? Appl. Sci. 2023, 13, 941.
  2. Gómez-Vilda, P.; Gómez-Rodellar, A.; Palacios-Alonso, D.; Rodellar-Biarge, V.; Álvarez-Marquina, A. The Role of Data Analytics in the Assessment of Pathological Speech—A Critical Appraisal. Appl. Sci. 2022, 12, 11095.
  3. Ternström, S.; Pabon, P. Voice Maps as a Tool for Understanding and Dealing with Variability in the Voice. Appl. Sci. 2022, 12, 11353.
  4. Sand, A. Inferential Statistics Is an Unfit Tool for Interpreting Data. Appl. Sci. 2022, 12, 7691.
  5. Lã, F.M.B.; Fiuza, M.B. Real-Time Visual Feedback in Singing Pedagogy: Current Trends and Future Directions. Appl. Sci. 2022, 12, 10781.
  6. McKenna, V.S.; Vojtech, J.M.; Previtera, M.; Kendall, C.L.; Carraro, K.E. A Scoping Literature Review of Relative Fundamental Frequency (RFF) in Individuals with and without Voice Disorders. Appl. Sci. 2022, 12, 8121.
  7. Naghibolhosseini, M.; Zacharias, S.R.C.; Zenas, S.; Levesque, F.; Deliyski, D.D. Laryngeal Imaging Study of Glottal Attack/Offset Time in Adductor Spasmodic Dysphonia during Connected Speech. Appl. Sci. 2023, 13, 2979.
  8. Döllinger, M.; Schraut, T.; Henrich, L.A.; Chhetri, D.; Echternach, M.; Johnson, A.M.; Kunduk, M.; Maryn, Y.; Patel, R.R.; Samlan, R.; et al. Re-Training of Convolutional Neural Networks for Glottis Segmentation in Endoscopic High-Speed Videos. Appl. Sci. 2022, 12, 9791.
  9. Cortés, J.P.; Lin, J.Z.; Marks, K.L.; Espinoza, V.M.; Ibarra, E.J.; Zañartu, M.; Hillman, R.E.; Mehta, D.D. Ambulatory Monitoring of Subglottal Pressure Estimated from Neck-Surface Vibration in Individuals with and without Voice Disorders. Appl. Sci. 2022, 12, 10692.
  10. Fernandes, J.F.T.; Freitas, D.; Junior, A.C.; Teixeira, J.P. Determination of Harmonic Parameters in Pathological Voices--Efficient Algorithm. Appl. Sci. 2023, 13, 2333.
  11. Cai, H.; Ternström, S. Mapping Phonation Types by Clustering of Multiple Metrics. Appl. Sci. 2022, 12, 12092.
  12. Titze, I.R.; Lucero, J.C. Voice Simulation: The Next Generation. Appl. Sci. 2022, 12, 11720.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Ternström, S. Special Issue on Current Trends and Future Directions in Voice Acoustics Measurement. Appl. Sci. 2023, 13, 3514. https://doi.org/10.3390/app13063514
