Search Results (406)

Search Parameters:
Keywords = auditory model

22 pages, 12122 KiB  
Article
A Computational–Cognitive Model of Audio-Visual Attention in Dynamic Environments
by Hamideh Yazdani, Alireza Bosaghzadeh, Reza Ebrahimpour and Fadi Dornaika
Big Data Cogn. Comput. 2025, 9(5), 120; https://doi.org/10.3390/bdcc9050120 - 6 May 2025
Abstract
Human visual attention is influenced by multiple factors, including visual, auditory, and facial cues. While integrating auditory and visual information enhances prediction accuracy, many existing models rely solely on visual-temporal data. Inspired by cognitive studies, we propose a computational model that combines spatial, temporal, face (low-level and high-level visual cues), and auditory saliency to predict visual attention more effectively. Our approach processes video frames to generate spatial, temporal, and face saliency maps, while an audio branch localizes sound-producing objects. These maps are then integrated to form the final audio-visual saliency map. Experimental results on an audio-visual dataset demonstrate that our model outperforms state-of-the-art image and video saliency models, as well as the baseline model, and aligns more closely with behavioral and eye-tracking data. Additionally, ablation studies highlight the contribution of each information source to the final prediction. Full article
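
A minimal sketch of the kind of map fusion the abstract describes: normalized per-frame saliency maps combined by a weighted sum. The weights, map names, and normalization here are illustrative assumptions, not the authors' published formulation.

```python
import numpy as np

def normalize(m):
    """Scale a saliency map to [0, 1]; flat maps become all zeros."""
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

def fuse_saliency(spatial, temporal, face, audio, weights=(0.3, 0.3, 0.2, 0.2)):
    """Weighted linear fusion of per-frame saliency maps (weights assumed)."""
    maps = [normalize(m) for m in (spatial, temporal, face, audio)]
    return normalize(sum(w * m for w, m in zip(weights, maps)))

# Example with random 64x64 maps standing in for one frame's four branches.
gen = np.random.default_rng(0)
final_map = fuse_saliency(*[gen.random((64, 64)) for _ in range(4)])
```
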
18 pages, 4885 KiB  
Article
Decoding Poultry Welfare from Sound—A Machine Learning Framework for Non-Invasive Acoustic Monitoring
by Venkatraman Manikandan and Suresh Neethirajan
Sensors 2025, 25(9), 2912; https://doi.org/10.3390/s25092912 - 5 May 2025
Abstract
Acoustic monitoring presents a promising, non-invasive modality for assessing animal welfare in precision livestock farming. In poultry, vocalizations encode biologically relevant cues linked to health status, behavioral states, and environmental stress. This study proposes an integrated analytical framework that combines signal-level statistical analysis with machine learning and deep learning classifiers to interpret chicken vocalizations in a welfare assessment context. The framework was evaluated using three complementary datasets encompassing health-related vocalizations, behavioral call types, and stress-induced acoustic responses. The pipeline employs a multistage process comprising high-fidelity signal acquisition, feature extraction (e.g., mel-frequency cepstral coefficients, spectral contrast, zero-crossing rate), and classification using models including Random Forest, HistGradientBoosting, CatBoost, TabNet, and LSTM. Feature importance analysis and statistical tests (e.g., t-tests, correlation metrics) confirmed that specific MFCC bands and spectral descriptors were significantly associated with welfare indicators. LSTM-based temporal modeling revealed distinct acoustic trajectories under visual and auditory stress, supporting the presence of habituation and stressor-specific vocal adaptations over time. Model performance, validated through stratified cross-validation and multiple statistical metrics (e.g., F1-score, Matthews correlation coefficient), demonstrated high classification accuracy and generalizability. Importantly, the approach emphasizes model interpretability, facilitating alignment with known physiological and behavioral processes in poultry. The findings underscore the potential of acoustic sensing and interpretable AI as scalable, biologically grounded tools for real-time poultry welfare monitoring, contributing to the advancement of sustainable and ethical livestock production systems. Full article
(This article belongs to the Special Issue Sensors in 2025)
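
A hedged sketch of the feature-extraction stage the abstract names (mel-frequency cepstral coefficients, spectral contrast, zero-crossing rate) feeding a Random Forest, assuming librosa and scikit-learn; the synthetic clips, labels, and hyperparameters are placeholders rather than the authors' pipeline.

```python
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def extract_features(y, sr):
    """Summarize one vocalization by mean MFCCs, spectral contrast, and ZCR."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)
    contrast = librosa.feature.spectral_contrast(y=y, sr=sr).mean(axis=1)
    zcr = librosa.feature.zero_crossing_rate(y).mean(axis=1)
    return np.concatenate([mfcc, contrast, zcr])

# Synthetic stand-ins for labeled recordings (use librosa.load on real files).
sr, gen = 22050, np.random.default_rng(0)
clips = [(gen.standard_normal(sr), "calm") for _ in range(4)] + \
        [(gen.standard_normal(sr), "distress") for _ in range(4)]
X = np.array([extract_features(y, sr) for y, _ in clips])
labels = [lab for _, lab in clips]
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
```
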
22 pages, 6743 KiB  
Article
The Effect of Audiovisual Environment in Rail Transit Spaces on Pedestrian Psychological Perception
by Mingli Zhang, Xinyi Zou, Xuejun Hu, Haisheng Xie, Feng Han and Qi Meng
Buildings 2025, 15(9), 1400; https://doi.org/10.3390/buildings15091400 - 22 Apr 2025
Viewed by 194
Abstract
The environmental quality of rail transit spaces has increasingly attracted attention, as factors such as train noise and visual disturbances from elevated lines can impact pedestrians’ psychological perception through the audiovisual environment in these spaces. This study first collects audiovisual materials from rail transit spaces and pedestrian perception data through on-site surveys, measurements, VR environment simulations, and custom Deep Learning (DL) models. Using cluster analysis, the environments are categorized based on visual and auditory perceptions and evaluations of rail transit stations, delineating and classifying the spaces into different zones. The study further explores the interactive effects of audiovisual environmental factors on psychological perception within these zones. The results indicate that, based on audiovisual perception, the space within 300 m of a rail transit station can be divided into three zones and four distinct types of audiovisual perception spaces. The effect of the type of auditory environment on visual indicators was smaller than the effect of the visual environment on auditory indicators, and the category of vision had the greatest effect on the subjective indicators of hearing within Zones 1 and 2. This study not only provides a scientific basis for improving the environmental quality of rail transit station areas but also offers new perspectives and practical approaches for urban transportation planning and design. Full article
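
The zone delineation rests on cluster analysis of audiovisual perception data. A minimal sketch using k-means over a standardized survey matrix follows; the feature columns, sample values, and three-cluster choice are assumptions for illustration, not the study's data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical rows: measurement points near a station, with columns such as
# [distance_to_station_m, LAeq_dB, visual_rating, auditory_rating].
X = np.array([
    [50, 72, 2.1, 1.8],
    [120, 68, 2.9, 2.5],
    [180, 65, 3.2, 3.0],
    [280, 61, 3.8, 3.6],
    [300, 60, 4.0, 3.7],
])
X_std = StandardScaler().fit_transform(X)  # equalize feature scales
zones = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_std)
print(zones)  # cluster index per measurement point
```
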
26 pages, 15804 KiB  
Article
Acoustic Event Detection in Vehicles: A Multi-Label Classification Approach
by Anaswara Antony, Wolfgang Theimer, Giovanni Grossetti and Christoph M. Friedrich
Sensors 2025, 25(8), 2591; https://doi.org/10.3390/s25082591 - 19 Apr 2025
Viewed by 265
Abstract
Autonomous driving technologies for environmental perception are mostly based on visual cues obtained from sensors like cameras, RADAR, or LiDAR. They capture the environment as if seen through “human eyes”. If this visual information is complemented with auditory information, thereby also providing “ears”, driverless cars can become more reliable and safer. In this paper, an Acoustic Event Detection model is presented that can detect various acoustic events in an automotive context along with their time of occurrence to create an audio scene description. The proposed detection methodology uses the pre-trained network Bidirectional Encoder representation from Audio Transformers (BEATs) and a single-layer neural network trained on the database of real audio recordings collected from different cars. The performance of the model is evaluated for different parameters and datasets. The segment-based results for a duration of 1 s show that the model performs well for 11 sound classes with a mean accuracy of 0.93 and F1-Score of 0.39 for a confidence threshold of 0.5. The threshold-independent metric mAP has a value of 0.77. The model also performs well for sound mixtures containing two overlapping events with mean accuracy, F1-Score, and mAP equal to 0.89, 0.42, and 0.658, respectively. Full article
(This article belongs to the Section Vehicular Sensing)
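
The described architecture, a frozen pre-trained encoder followed by a single-layer head with one sigmoid per class, is easy to sketch in PyTorch. The embedding dimension and training details below are assumptions (the 11 classes and 0.5 threshold mirror the abstract), not the authors' exact configuration.

```python
import torch
import torch.nn as nn

EMB_DIM, N_CLASSES = 768, 11          # assumed BEATs embedding size; 11 classes

head = nn.Linear(EMB_DIM, N_CLASSES)  # single-layer multi-label classifier
criterion = nn.BCEWithLogitsLoss()    # independent sigmoid per event class

# Dummy batch standing in for embeddings of 1 s audio segments.
emb = torch.randn(8, EMB_DIM)
targets = torch.randint(0, 2, (8, N_CLASSES)).float()
criterion(head(emb), targets).backward()

# Inference: an event is detected when its probability exceeds the threshold,
# so two overlapping events can be active in the same segment.
detected = torch.sigmoid(head(emb)) > 0.5
```
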
29 pages, 4394 KiB  
Article
Analysis of Voice, Speech, and Language Biomarkers of Parkinson’s Disease Collected in a Mixed Reality Setting
by Milosz Dudek, Daria Hemmerling, Marta Kaczmarska, Joanna Stepien, Mateusz Daniol, Marek Wodzinski and Magdalena Wojcik-Pedziwiatr
Sensors 2025, 25(8), 2405; https://doi.org/10.3390/s25082405 - 10 Apr 2025
Viewed by 599
Abstract
This study explores an innovative approach to early Parkinson’s disease (PD) detection by analyzing speech data collected using a mixed reality (MR) system. A total of 57 Polish participants, including PD patients and healthy controls, performed five speech tasks while using an MR head-mounted display (HMD). Speech data were recorded and analyzed to extract acoustic and linguistic features, which were then evaluated using machine learning models, including logistic regression, support vector machines (SVMs), random forests, AdaBoost, and XGBoost. The XGBoost model achieved the best performance, with an F1-score of 0.90 ± 0.05 in the story-retelling task. Key features such as MFCCs (mel-frequency cepstral coefficients), spectral characteristics, RASTA-filtered auditory spectrum, and local shimmer were identified as significant in detecting PD-related speech alterations. Additionally, state-of-the-art deep learning models (wav2vec2, HuBERT, and WavLM) were fine-tuned for PD detection. HuBERT achieved the highest performance, with an F1-score of 0.94 ± 0.04 in the diadochokinetic task, demonstrating the potential of deep learning to capture complex speech patterns linked to neurodegenerative diseases. This study highlights the effectiveness of combining MR technology for speech data collection with advanced machine learning (ML) and deep learning (DL) techniques, offering a non-invasive and high-precision approach to PD diagnosis. The findings hold promise for broader clinical applications, advancing the diagnostic landscape for neurodegenerative disorders. Full article
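
A hedged sketch of the classical-ML arm of such a study: cross-validated F1 for an XGBoost classifier over an acoustic feature matrix. The data are random placeholders matching only the participant count, and the hyperparameters are illustrative.

```python
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import cross_val_score

# Placeholder features (57 speakers x 40 acoustic/linguistic descriptors)
# and labels (1 = PD, 0 = healthy control).
gen = np.random.default_rng(0)
X, y = gen.random((57, 40)), gen.integers(0, 2, 57)

model = XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")
f1 = cross_val_score(model, X, y, cv=5, scoring="f1")
print(f"F1 = {f1.mean():.2f} ± {f1.std():.2f}")
```
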
15 pages, 660 KiB  
Review
Mobile Gaming for Cognitive Health in Older Adults: A Scoping Review of App Store Applications
by Jiadong Yu, Eunie Jung, D.A. Bekerian and Chelsee Osback
Healthcare 2025, 13(8), 855; https://doi.org/10.3390/healthcare13080855 - 9 Apr 2025
Viewed by 474
Abstract
Background: Mobile gaming applications are increasingly marketed as cognitive training tools for older adults, yet their scientific validity and accessibility remain uncertain. This scoping review evaluates their effectiveness and inclusivity. Methods: A systematic search of the Apple App Store and Google Play Store identified 227 applications, with 14 meeting inclusion criteria. Apps were assessed for scientific validity, theoretical foundation, accessibility, cognitive targeting, user engagement, and monetization models. Results: While all 14 apps claimed cognitive benefits, only one cited empirical research. None included baseline cognitive assessments or progress tracking. Accessibility was limited—eight apps had visual accommodations, but none provided auditory support. Six apps were English-only, restricting linguistic inclusivity. Monetization varied, with eight requiring in-app purchases or subscriptions, posing financial barriers. Conclusions: This review highlights critical gaps in the current cognitive gaming application market for older adults. Despite their popularity, cognitive training apps for older adults lack scientific validation and accessibility, limiting their effectiveness as cognitive interventions. Developers should integrate evidence-based training, adaptive assessments, and inclusive accessibility features such as voice guidance and multilingual support. Future research should prioritize longitudinal studies to assess real-world efficacy, refine interventions targeting memory, executive function, and processing speed, and enhance inclusive design for diverse aging populations. Full article
(This article belongs to the Special Issue Smart Medicine for Older Adults)
22 pages, 10173 KiB  
Article
Tech-Enhanced Vocabulary Acquisition: Exploring the Use of Student-Created Video Learning Materials in the Tertiary-Level EFL (English as a Foreign Language) Flipped Classroom
by Jelena Bobkina, Svetlana Baluyan and Elena Dominguez Romero
Educ. Sci. 2025, 15(4), 450; https://doi.org/10.3390/educsci15040450 - 5 Apr 2025
Viewed by 582
Abstract
This study explores the effectiveness of Technology-Assisted Vocabulary Learning (TAVL) using student-created video learning materials within a tertiary-level English as a Foreign Language (EFL) flipped classroom. By leveraging the flipped classroom model, which allocates classroom time for interactive activities and shifts instructional content delivery outside of class, the research investigates how student-produced videos can enhance vocabulary acquisition and retention. Conducted with 47 university students from a Translation and Translation Studies course, the study aims to fill a gap in empirical evidence regarding this innovative approach. Quantitative analysis revealed that students who created and utilized videos (Group 1) showed the highest improvement in vocabulary scores, followed by those who only used the videos (Group 2), with the control group relying on traditional teacher-led methods showing the least improvement. Qualitative feedback highlighted that video creators experienced deeper engagement and better vocabulary retention, while users appreciated the videos’ visual and auditory elements but faced challenges with vocabulary overload. The findings suggest that incorporating student-created videos into the curriculum fosters a dynamic and collaborative learning environment, offering practical implications for enhancing vocabulary instruction through technology-enhanced pedagogical practices. Future research should focus on optimizing video production processes and integrating these methods with traditional teaching for comprehensive vocabulary learning. Full article
(This article belongs to the Section Language and Literacy Education)
15 pages, 2503 KiB  
Article
The Effect of Concurrent Auditory Working Memory Task in Auditory Category Learning
by Jie Wu, Jianghong Lu, Zixuan Che and Siying Li
Behav. Sci. 2025, 15(4), 440; https://doi.org/10.3390/bs15040440 - 31 Mar 2025
Viewed by 286
Abstract
In the auditory domain, the role of auditory working memory in shaping strategy selection and performance within both auditory rule-based and information-integration tasks remains unclear. To address this issue, the present study utilized a concurrent auditory working memory paradigm to investigate the impact of working memory on rule-based and information-integration category learning within the auditory domain. Additionally, we employed a categorization strategy model and drift-diffusion model to examine the impact of auditory working memory on auditory category learning. The categorization strategies model revealed that significantly more participants employed the optimal strategy in the control condition compared to the concurrent working memory condition in both the rule-based and information-integration tasks. Furthermore, given that most participants used the rule-based strategy in the information-integration task, the results showed a decrease in accuracy and an increase in reaction time for the concurrent working memory condition relative to the control condition in rule-based category learning. According to the drift-diffusion model, this decline observed under the concurrent working memory condition can be attributed to a reduction in information accumulation speed, increased cautious decision-making, and longer nondecision time. This study suggests that the concurrent working memory task interfered with participants’ strategy selection and performance, with decreased performance in the concurrent working memory condition stemming from slower information accumulation and extended nondecision time for the rule-based strategy, while more cautious decision-making was related to the information-integration strategy. Full article
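
The drift-diffusion interpretation maps naturally onto a small simulation: a lower drift rate (slower evidence accumulation) and a longer nondecision time lengthen reaction times, while a wider boundary models more cautious decisions. This generic Euler–Maruyama sketch uses illustrative parameters, not values fitted in the study.

```python
import numpy as np

def simulate_ddm(drift, boundary, ndt, dt=0.001, noise=1.0, gen=None):
    """One drift-diffusion trial: evidence accumulates from 0 until it hits
    +boundary or -boundary; returns (choice, reaction time in seconds)."""
    gen = gen or np.random.default_rng()
    x, t = 0.0, 0.0
    while abs(x) < boundary:
        x += drift * dt + noise * np.sqrt(dt) * gen.standard_normal()
        t += dt
    return x > 0, ndt + t  # nondecision time adds to decision time

gen = np.random.default_rng(1)
# Dual-task condition mimicked by lower drift and longer nondecision time.
control = [simulate_ddm(1.2, 1.0, 0.30, gen=gen)[1] for _ in range(200)]
dual = [simulate_ddm(0.8, 1.0, 0.38, gen=gen)[1] for _ in range(200)]
print(f"mean RT: control {np.mean(control):.3f} s, dual-task {np.mean(dual):.3f} s")
```
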
15 pages, 4456 KiB  
Article
Using Machine Learning for Analysis of Wideband Acoustic Immittance and Assessment of Middle Ear Function in Infants
by Shan Peng, Yukun Zhao, Xinyi Yao, Huilin Yin, Bei Ma, Ke Liu, Gang Li and Yang Cao
Audiol. Res. 2025, 15(2), 35; https://doi.org/10.3390/audiolres15020035 - 31 Mar 2025
Viewed by 296
Abstract
Objectives: Evaluating middle ear function is essential for interpreting screening results and prioritizing diagnostic referrals for infants with hearing impairments. Wideband Acoustic Immittance (WAI) technology offers a comprehensive approach by utilizing sound stimuli across various frequencies, providing a deeper understanding of ear physiology. However, current clinical practices often restrict WAI data analysis to peak information at specific frequencies, limiting its comprehensiveness. Design: In this study, we developed five machine learning models—feedforward neural network, convolutional neural network, kernel density estimation, random forest, and support vector machine—to extract features from wideband acoustic immittance data collected from newborns aged 2–6 months. These models were trained to predict and assess the normalcy of middle ear function in the samples. Results: The integrated machine learning models achieved an average accuracy exceeding 90% in the test set, with various classification performance metrics (accuracy, precision, recall, F1 score, MCC) surpassing 0.8. Furthermore, we developed a program based on ML models with an interactive GUI interface. The software is available for free download. Conclusions: This study showcases the capability to automatically diagnose middle ear function in infants based on WAI data. While not intended for diagnosing specific pathologies, the approach provides valuable insights to guide follow-up testing and clinical decision-making, supporting the early identification and management of auditory conditions in newborns. Full article
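
The reported metric battery (accuracy, precision, recall, F1, MCC) can be reproduced directly with scikit-learn; the labels below are placeholders, not the study's predictions.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, matthews_corrcoef)

# Placeholder outcomes: 1 = normal middle ear function, 0 = abnormal.
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 1]

for name, fn in [("accuracy", accuracy_score), ("precision", precision_score),
                 ("recall", recall_score), ("F1", f1_score),
                 ("MCC", matthews_corrcoef)]:
    print(f"{name}: {fn(y_true, y_pred):.3f}")
```
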
13 pages, 641 KiB  
Article
Sensory Modality in Students Enrolled in a Specialized Training Program for Security Forces and Its Impact on Karate Performance Indicators
by Ivan Uher, Ján Pivovarník and Mária Majherová
J. Funct. Morphol. Kinesiol. 2025, 10(2), 114; https://doi.org/10.3390/jfmk10020114 - 28 Mar 2025
Viewed by 218
Abstract
Objectives: The present study examined the sensory preferences adopted by students over three years of training in a specialized training program for security forces (STPSF). It also determined their impact on karate performance metrics. Methods: Thirty-one students aged 20 to 26 (SD = 0.81) completed the modified Visual, Aural, Read/Write, and Kinesthetic questionnaire (VARK), a tool designed to help identify students’ preferred learning styles. This research suggests a theoretical model in which the balanced and optimal engagement of visual, auditory, and kinesthetic modalities, rather than a strict mathematical equation, might provide an optimal foundation for improving proficiency in martial arts. Balanced engagement of these sensory modalities can foster a deeper understanding of karate techniques, improve performance, minimize dependence on a single sensory channel, and bolster real-time adaptability. The students were tested at two points: once at the beginning of their enrolment and again after completing their three-year training program. Results: After a relatively intensive intervention over three years, the findings suggest a positive shift in the ratio of the primary modalities, moving toward an optimal balance. Considering the ideal sensory balance of 50:50:50%, the visual modality increased from 45.8 to 50.4, approaching the optimal value. The auditory modality, initially above the ideal level at 53.8, adjusted closer to balance, reaching 51.9. In contrast, the kinesthetic modality slightly decreased from 50 to 47.5, indicating a minor deviation from the ideal state. It was further confirmed that a higher technical level, such as the third kyu, exhibits an equal distribution, approaching the optimal use of the three modalities: visual 51.5, auditory 47.6, and kinesthetic 50.7. Moreover, the progress toward an optimal synergy and a more efficient evaluation of situational possibilities within the decision-making process was more frequently noted in female students than in male students. Conclusions: Acknowledging students’ sensory processing preferences can assist the teacher, trainer, coach, and student in advancing interaction, optimizing learning strategies, improving performance, promoting analytical skills, and fostering self-assurance and determination. Full article
11 pages, 503 KiB  
Article
The Impact of Background Music on Flow, Work Engagement and Task Performance: A Randomized Controlled Study
by Yuwen Sun
Behav. Sci. 2025, 15(4), 416; https://doi.org/10.3390/bs15040416 - 25 Mar 2025
Viewed by 869
Abstract
The widespread adoption of background music in workplaces contrasts with the inconsistent empirical evidence regarding its cognitive effects, particularly concerning how music types influence the sequential pathway from flow states to work engagement and task performance. While prior research identifies flow and engagement as potential mediators, theoretical conflicts persist regarding their temporal dynamics and susceptibility to auditory habituation. This study tested three hypotheses: (1) music type indirectly affects performance through flow–engagement mediation, (2) high-arousal music impairs this pathway while structured compositions (e.g., Mozart’s K448) enhance it, and (3) repeated exposure diminishes music’s efficacy. A two-phase longitudinal experiment with 428 Chinese undergraduates employed structural equation modeling (SEM) to analyze data from randomized groups (control, high-arousal, low-arousal, and Mozart K448) that completed Backward Digit Span tasks under controlled auditory conditions. The results confirmed Mozart K448’s superior immediate mediation effect (β = 0.118, 95% CI [0.072, 0.181]) compared to high-arousal music’s detrimental impact (β = −0.112, 95% CI [−0.182, −0.056]), with flow fully mediating engagement’s influence on performance. A longitudinal analysis revealed a 53% attenuation in Mozart’s flow-enhancing effect after a 30-day familiarization (B = 0.150 vs. baseline 0.321), though residual benefits persisted. These findings reconcile the cognitive tuning and arousal–mood hypotheses by proposing a hybrid model where music initially operates through a novelty-driven dopamine release before transitioning to schema-based cognitive priming. Practically, the results advocate tiered auditory strategies: deploying structured music during skill acquisition phases while rotating selections to counter habituation. The study highlights the cultural specificity in auditory processing, challenging universal prescriptions and underscoring the need for localized music policies. By integrating flow theory with neurocognitive habituation models, this research advances evidence-based guidelines for optimizing workplace auditory environments. Full article
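
The mediation logic (music type → flow → performance) can be sketched, in its simplest form, as a product-of-coefficients indirect effect with a bootstrap confidence interval. The synthetic data and two-regression estimator below stand in for the study's full SEM; the sample size matches the abstract, the effect sizes do not.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Product-of-coefficients a*b: a from m ~ x, b from y ~ x + m."""
    a = np.polyfit(x, m, 1)[0]                    # slope of mediator on x
    X = np.column_stack([np.ones_like(x), x, m])  # intercept, x, mediator
    b = np.linalg.lstsq(X, y, rcond=None)[0][2]   # coefficient on mediator
    return a * b

gen = np.random.default_rng(0)
n = 428
x = gen.integers(0, 2, n).astype(float)  # e.g., Mozart K448 vs. control
m = 0.4 * x + gen.standard_normal(n)     # flow (mediator)
y = 0.5 * m + gen.standard_normal(n)     # task performance

boot = []
for _ in range(2000):                    # nonparametric bootstrap
    idx = gen.integers(0, n, n)
    boot.append(indirect_effect(x[idx], m[idx], y[idx]))
print("95% CI for a*b:", np.percentile(boot, [2.5, 97.5]))
```
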
49 pages, 2083 KiB  
Systematic Review
Pain and the Brain: A Systematic Review of Methods, EEG Biomarkers, Limitations, and Future Directions
by Bayan Ahmad and Buket D. Barkana
Neurol. Int. 2025, 17(4), 46; https://doi.org/10.3390/neurolint17040046 - 21 Mar 2025
Viewed by 712
Abstract
Background: Pain is prevalent in almost all populations and may often hinder visual, auditory, tactile, olfactory, and taste perception as it alters brain neural processing. The quantitative methods emerging to define pain and assess its effects on neural functions and perception are important. Identifying pain biomarkers is one of the initial stages in developing such models and interventions. The existing literature has explored chronic and experimentally induced pain, leveraging electroencephalograms (EEGs) to identify biomarkers and employing various qualitative and quantitative approaches to measure pain. Objectives: This systematic review examines the methods, participant characteristics, types of pain states, associated pain biomarkers of the brain’s electrical activity, and limitations of current pain studies. The review identifies the experimental methods researchers implement to study human pain states compared to pain-free control states, as well as the limitations of current techniques for studying human pain states and future directions for research. Methods: The research questions were formed using the Population, Intervention, Comparison, Outcome (PICO) framework. A literature search was conducted using PubMed, PsycINFO, Embase, the Cochrane Library, IEEE Xplore, Medline, Scopus, and Web of Science until December 2024, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines to obtain relevant studies. The inclusion criteria covered studies that focused on pain states and reported EEG data. The exclusion criteria covered studies that used only MEG or fMRI neuroimaging techniques and those that did not focus on the evaluation or assessment of neural markers. Bias risk was determined by the Newcastle–Ottawa Scale. Target data were compared between studies to organize the findings among the reported results. Results: The initial search resulted in 592 articles. After exclusions, 24 studies were included in the review, 6 of which focused on chronic pain populations. Experimentally induced pain methods were identified as techniques centered on tactile perception: thermal, electrical, mechanical, and chemical. Across both chronic and stimulated pain studies, pain was associated with a decreased or slowed peak alpha frequency (PAF). In the chronic pain studies, beta power increased with pain intensity. The functional connectivity and pain networks of chronic pain patients differ from those of healthy controls; this includes the processing of experimental pain. Small reported sample sizes, participant comorbidities such as neuropsychiatric disorders and peripheral nerve damage, and uncontrolled designs were the common drawbacks of the studies. Standardizing methods and establishing collaborations to collect open-access, comprehensive longitudinal data were identified as necessary future directions for generalizing neural markers of pain. Conclusions: This review presents a variety of experimental setups, participant populations, pain stimulation methods, non-standardized data analysis methods, supporting and contradicting study findings, limitations, and future directions. Comprehensive studies are needed to understand the pain–brain relationship more deeply in order to confirm or refute the existing findings and to generalize biomarkers across chronic and experimentally induced pain studies. This requires the implementation of larger, diverse cohorts in longitudinal study designs, the establishment of procedural standards, and the creation of data repositories. Additional directions include the use of machine learning and the analysis of data from long-term wearable EEG systems. The review protocol is registered on INPLASY (# 202520040). Full article
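
Peak alpha frequency, the biomarker most consistently implicated above, is typically estimated from the power spectral density. A sketch using Welch's method on a synthetic alpha-band signal follows; the sampling rate, window length, and 8–12 Hz band are conventional assumptions, not parameters from the reviewed studies.

```python
import numpy as np
from scipy.signal import welch

fs = 250                       # Hz, an assumed EEG sampling rate
t = np.arange(0, 60, 1 / fs)   # 60 s of one channel
# Synthetic signal: a 9.5 Hz alpha rhythm buried in noise.
eeg = np.sin(2 * np.pi * 9.5 * t) + np.random.default_rng(0).standard_normal(t.size)

freqs, psd = welch(eeg, fs=fs, nperseg=4 * fs)  # 4 s windows -> 0.25 Hz bins
alpha = (freqs >= 8) & (freqs <= 12)
paf = freqs[alpha][np.argmax(psd[alpha])]       # frequency of the alpha peak
print(f"Peak alpha frequency: {paf:.2f} Hz")
```
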
36 pages, 4990 KiB  
Article
Toward Inclusive Smart Cities: Sound-Based Vehicle Diagnostics, Emergency Signal Recognition, and Beyond
by Amr Rashed, Yousry Abdulazeem, Tamer Ahmed Farrag, Amna Bamaqa, Malik Almaliki, Mahmoud Badawy and Mostafa A. Elhosseini
Machines 2025, 13(4), 258; https://doi.org/10.3390/machines13040258 - 21 Mar 2025
Viewed by 490
Abstract
Sound-based early fault detection for vehicles is a critical yet underexplored area, particularly within Intelligent Transportation Systems (ITSs) for smart cities. Despite the clear necessity for sound-based diagnostic systems, the scarcity of specialized publicly available datasets presents a major challenge. This study addresses this gap by contributing in multiple dimensions. Firstly, it emphasizes the significance of sound-based diagnostics for real-time detection of faults through analyzing sounds directly generated by vehicles, such as engine or brake noises, and the classification of external emergency sounds, like sirens, relevant to vehicle safety. Secondly, this paper introduces a novel dataset encompassing vehicle fault sounds, emergency sirens, and environmental noises specifically curated to address the absence of such specialized datasets. A comprehensive framework is proposed, combining audio preprocessing, feature extraction (via Mel Spectrograms, MFCCs, and Chromatograms), and classification using 11 models. Evaluations using both compact (52 features) and expanded (126 features) representations show that several classes (e.g., Engine Misfire, Fuel Pump Cartridge Fault, Radiator Fan Failure) achieve near-perfect accuracy, though acoustically similar classes like Universal Joint Failure, Knocking, and Pre-ignition Problem remain challenging. Logistic Regression yielded the highest accuracy of 86.5% for the vehicle fault dataset (DB1) using compact features, while neural networks performed best for datasets DB2 and DB3, achieving 88.4% and 85.5%, respectively. In the second scenario, a Bayesian-Optimized Weighted Soft Voting with Feature Selection (BOWSVFS) approach is proposed, significantly enhancing accuracy to 91.04% for DB1, 88.85% for DB2, and 86.85% for DB3. These results highlight the effectiveness of the proposed methods in addressing key ITS limitations and enhancing accessibility for individuals with disabilities through auditory-based vehicle diagnostics and emergency recognition systems. Full article
(This article belongs to the Special Issue Recent Developments in Machine Design, Automation and Robotics)
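
Weighted soft voting, the core of the proposed BOWSVFS approach, averages predicted class probabilities with per-model weights. The sketch below fixes the weights by hand where the paper optimizes them with Bayesian optimization and feature selection, and uses synthetic data in place of the curated sound dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the 52-feature compact representation.
X, y = make_classification(n_samples=300, n_features=52, random_state=0)

# Soft voting: per-class probabilities are averaged with per-model weights.
ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier(random_state=0)),
                ("mlp", MLPClassifier(max_iter=500, random_state=0))],
    voting="soft",
    weights=[2, 1, 1],  # hand-set here; Bayesian-optimized in the paper
)
ensemble.fit(X, y)
print(ensemble.predict_proba(X[:3]).round(3))
```
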
38 pages, 10305 KiB  
Article
Listening Beyond the Source: Exploring the Descriptive Language of Musical Sounds
by Isabel Pires
Behav. Sci. 2025, 15(3), 396; https://doi.org/10.3390/bs15030396 - 20 Mar 2025
Viewed by 930
Abstract
The spontaneous use of verbal expressions to articulate and describe abstract auditory phenomena in everyday interactions is an inherent aspect of human nature. This occurs without the structured conditions typically required in controlled laboratory environments, relying instead on intuitive and spontaneous modes of expression. This study explores the relationship between auditory perception and descriptive language for abstract sounds. These sounds, synthesized without identifiable sources or musical structures, allow listeners to engage with sound perception free from external references. The investigation of correlations between subjective descriptors (e.g., “rough”, “bright”) and physical sound attributes (e.g., spectral and dynamic properties) reveals significant cross-modal linguistic associations in auditory perception. An international survey with a diverse group of participants revealed that listeners often draw on other sensory domains to describe sounds, suggesting a robust cross-modal basis for auditory descriptors. Moreover, the findings indicate a correlation between subjective descriptors and objective sound wave properties, demonstrating the effectiveness of abstract sounds in guiding listeners’ attention to intrinsic qualities. These results could support the development of new paradigms in sound analysis and manipulation, with applications in artistic, educational, and analytical contexts. This multidisciplinary approach may provide the foundation for a perceptual framework for sound analysis, to be tested and refined through theoretical modelling and experimental validation. Full article
(This article belongs to the Special Issue Music Listening as Exploratory Behavior)
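
Relating subjective descriptors to physical sound attributes reduces, in the simplest case, to a rank correlation between ratings and an acoustic feature. The data below are synthetic, and the "brightness versus spectral centroid" pairing is an assumed example of the kind of association the study reports.

```python
import numpy as np
from scipy.stats import spearmanr

gen = np.random.default_rng(0)
# Hypothetical per-sound data: mean "brightness" rating vs. spectral centroid.
centroid_hz = gen.uniform(500, 6000, 30)
brightness = 0.001 * centroid_hz + gen.normal(0, 0.5, 30)

rho, p = spearmanr(centroid_hz, brightness)
print(f"Spearman rho = {rho:.2f}, p = {p:.3g}")
```
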
18 pages, 2430 KiB  
Article
The Art of Replication: Lifelike Avatars with Personalized Conversational Style
by Michele Nasser, Giuseppe Fulvio Gaglio, Valeria Seidita and Antonio Chella
Robotics 2025, 14(3), 33; https://doi.org/10.3390/robotics14030033 - 13 Mar 2025
Viewed by 723
Abstract
This study presents an approach for developing digital avatars replicating individuals’ physical characteristics and communicative style, contributing to research on virtual interactions in the metaverse. The proposed method integrates large language models (LLMs) with 3D avatar creation techniques, using what we call the Tree of Style (ToS) methodology to generate stylistically consistent and contextually appropriate responses. Linguistic analysis and personalized voice synthesis enhance conversational and auditory realism. The results suggest that ToS offers a practical alternative to fine-tuning for creating stylistically accurate responses while maintaining efficiency. This study outlines potential applications and acknowledges the need for further work on adaptability and ethical considerations. Full article
(This article belongs to the Special Issue Human–AI–Robot Teaming (HART))