Article

Evaluating Convolutional Neural Networks and Vision Transformers for Baby Cry Sound Analysis

Computer Engineering Department, Arab Academy of Science and Technology and Maritime Transport, Alexandria 1029, Egypt
*
Author to whom correspondence should be addressed.
Future Internet 2024, 16(7), 242; https://doi.org/10.3390/fi16070242
Submission received: 21 May 2024 / Revised: 24 June 2024 / Accepted: 25 June 2024 / Published: 7 July 2024

Abstract

Crying is a newborn’s main way of communicating. Although newborn cries may sound alike, they are produced by distinct physical mechanisms and carry distinct acoustic characteristics. Experienced medical professionals, nurses, and parents are able to recognize these variations based on their prior interactions. Nonetheless, interpreting a baby’s cries can be challenging for carers, first-time parents, and inexperienced paediatricians. This paper proposes a novel approach to baby cry classification using advanced deep learning techniques. This study aims to accurately classify different cry types associated with everyday infant needs, including hunger, discomfort, pain, tiredness, and the need for burping. The proposed model achieves an accuracy of 98.33%, surpassing the performance of existing studies in the field. IoT-enabled sensors are utilized to capture cry signals in real time, ensuring continuous and reliable monitoring of the infant’s acoustic environment. This integration of IoT technology with deep learning enhances the system’s responsiveness and accuracy. Our study highlights the significance of accurate cry classification in understanding and meeting the needs of infants, as well as its potential impact on improving infant care practices. The methodology, including the dataset, preprocessing techniques, and architecture of the deep learning model, is described. The results demonstrate the performance of the proposed model, and the discussion analyzes the factors contributing to its high accuracy.
Keywords: audio processing; cry sound analysis; deep learning; spectrogram; transformer models; convolutional neural networks
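The abstract and keywords indicate that cry recordings are converted into spectrograms before being fed to the CNN and vision-transformer models. The paper's exact preprocessing pipeline is not reproduced on this page, so the following is only a minimal sketch of the general technique: framing a waveform, windowing each frame, and taking the log-magnitude FFT to obtain a spectrogram image. All parameter values (16 kHz sample rate, 25 ms frames, 10 ms hop) and the synthetic sine-wave input are illustrative assumptions, not values from the study.

```python
import numpy as np

def log_spectrogram(signal, frame_len=400, hop=160):
    """Frame the waveform, apply a Hann window to each frame,
    and return the log-magnitude FFT per frame (time x frequency)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(mag)  # log compression stabilizes the dynamic range

# One second of synthetic 16 kHz "audio" stands in for a recorded cry.
sr = 16000
t = np.arange(sr) / sr
wave = np.sin(2 * np.pi * 440 * t)

spec = log_spectrogram(wave)
print(spec.shape)  # (98, 201): 98 frames, 201 frequency bins
```

A 2-D array like this can then be treated as a single-channel image: a CNN convolves over it directly, while a vision transformer would first split it into fixed-size patches.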

Graphical Abstract

Share and Cite

MDPI and ACS Style

Younis, S.A.; Sobhy, D.; Tawfik, N.S. Evaluating Convolutional Neural Networks and Vision Transformers for Baby Cry Sound Analysis. Future Internet 2024, 16, 242. https://doi.org/10.3390/fi16070242


Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
