Search Results (2,436)

Search Parameters:
Keywords = R-CNN

28 pages, 7200 KB  
Article
SOH Estimation of Lithium Battery Under Improved CNN-BIGRU-Attention Model Based on Hiking Optimization Algorithm
by Qianli Dong, Ziyang Liu, Hainan Wang, Lujun Wang, Rui Dong and Lu Lv
World Electr. Veh. J. 2025, 16(9), 487; https://doi.org/10.3390/wevj16090487 (registering DOI) - 25 Aug 2025
Abstract
Accurate State of Health (SOH) estimation is critical for ensuring the safe operation of lithium-ion batteries. However, current data-driven approaches face significant challenges: insufficient feature extraction and ambiguous physical meaning compromise prediction accuracy, while initialization sensitivity to noise undermines stability; the inherent nonlinearity and temporal complexity of battery degradation data further lead to slow convergence or susceptibility to local optima. To address these limitations, this study proposes an enhanced CNN-BIGRU model. The model replaces conventional random initialization with a Hiking Optimization Algorithm (HOA) to identify superior initial weights, significantly improving early training stability. Furthermore, it integrates an Attention mechanism to dynamically weight features, strengthening the capture of key degradation characteristics. Rigorous experimental validation, utilizing multi-dimensional features extracted from the NASA dataset, demonstrates the model’s superior convergence speed and prediction accuracy compared to the CNN-BIGRU-Attention benchmark. Compared with other methods, the HOA-CNN-BIGRU-Attention model proposed in this study achieves higher prediction accuracy and better robustness under different conditions: the RMSEs on the NASA dataset are all kept within 0.01, with R2 above 0.91, and the RMSEs on the University of Maryland dataset are all below 0.006, with R2 above 0.98. Compared with the CNN-BIGRU-Attention baseline model without HOA optimization, the RMSE is reduced by at least 0.15% across different battery groups in the NASA dataset. Full article
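As a structural illustration of the CNN-BIGRU-Attention backbone described in this abstract, the following PyTorch sketch stacks a 1-D convolution, a bidirectional GRU, and an attention layer ahead of a regression head. Layer sizes, the input feature set, and the training loop are assumptions; the HOA weight initialization is not reproduced here.

```python
import torch
import torch.nn as nn

class CNNBiGRUAttention(nn.Module):
    """Illustrative CNN-BiGRU-Attention regressor for SOH estimation.

    Input: (batch, seq_len, n_features) windows of health indicators.
    Layer sizes are placeholders, not the paper's settings.
    """
    def __init__(self, n_features=4, conv_channels=16, gru_hidden=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, conv_channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.bigru = nn.GRU(conv_channels, gru_hidden,
                            batch_first=True, bidirectional=True)
        self.attn_score = nn.Linear(2 * gru_hidden, 1)   # attention scores per time step
        self.head = nn.Linear(2 * gru_hidden, 1)         # SOH output

    def forward(self, x):
        # x: (batch, seq_len, n_features); Conv1d expects channels first
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)      # (B, T, C)
        out, _ = self.bigru(h)                                # (B, T, 2H)
        weights = torch.softmax(self.attn_score(out), dim=1)  # (B, T, 1)
        context = (weights * out).sum(dim=1)                  # (B, 2H)
        return self.head(context).squeeze(-1)                 # (B,)

if __name__ == "__main__":
    model = CNNBiGRUAttention()
    dummy = torch.randn(8, 20, 4)  # 8 cells, 20 cycles, 4 features
    print(model(dummy).shape)      # torch.Size([8])
```
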
13 pages, 4677 KB  
Proceeding Paper
Hyperspectral Analysis of Apricot Quality Parameters Using Classical Machine Learning and Deep Neural Networks
by Martin Dejanov
Eng. Proc. 2025, 107(1), 24; https://doi.org/10.3390/engproc2025104024 (registering DOI) - 25 Aug 2025
Abstract
This study focuses on predicting β-carotene content using hyperspectral images captured in the near-infrared (NIR) region during the drying process. Several machine learning models are compared, including Partial Least Squares Regression (PLSR), Stacked Autoencoders (SAEs) combined with Random Forest (RF), and Convolutional Neural Networks (CNNs) in three configurations: 1D-CNN, 2D-CNN, and 3D-CNN. The models are evaluated using R2, Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). The PLSR model showed excellent results with R2 = 0.97 for both training and testing, indicating minimal overfitting. The SAE-RF model also performed well, with R2 values of 0.82 and 0.83 for training and testing, respectively, showing strong consistency. The CNN models displayed varying results: 1D-CNN achieved moderate performance, while 2D-CNN and 3D-CNN exhibited signs of overfitting, especially on testing data. Overall, the findings suggest that although CNNs are capable of capturing complex patterns, the PLSR and SAE-RF models deliver more reliable and robust predictions for β-carotene content in hyperspectral imaging. Full article
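For the PLSR baseline that performed best in this comparison, a minimal scikit-learn pattern would look like the sketch below; the spectra are synthetic stand-ins for the NIR hypercube pixels, and the number of latent components is an arbitrary choice.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Synthetic stand-in: 500 mean spectra with 200 NIR bands and a
# beta-carotene-like target; real inputs would come from the hyperspectral cubes.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 200))
y = X[:, :10].sum(axis=1) + rng.normal(scale=0.1, size=500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

pls = PLSRegression(n_components=10)   # component count is a placeholder
pls.fit(X_tr, y_tr)
pred = pls.predict(X_te).ravel()

print("R2  :", r2_score(y_te, pred))
print("MAE :", mean_absolute_error(y_te, pred))
print("RMSE:", np.sqrt(mean_squared_error(y_te, pred)))
```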

14 pages, 1113 KB  
Article
Image Captioning Using Topic Faster R-CNN-LSTM Networks
by Jui-Feng Yeh, Kuei-Mei Lin and Chun-Chieh Chen
Information 2025, 16(9), 726; https://doi.org/10.3390/info16090726 - 25 Aug 2025
Abstract
Image captioning is an important task in cross-modal research with numerous applications. It aims to capture the semantic content of an image and express it in a linguistically and contextually appropriate sentence. However, existing models mostly tend to focus on a topic generated by the most conspicuous foreground objects, so other topics in the image are often ignored. To address these limitations, we propose a model that can generate richer semantic content and more diverse captions. The proposed model captures not only the main topics derived from coarse-grained objects but also fine-grained visual information from background or minor foreground objects. Our image captioning system combines the ResNet, LSTM, and topic feature models. The ResNet model extracts fine-grained image features and enriches the description of objects. The LSTM model provides a longer context for semantics, increasing the fluency and semantic completeness of the generated sentences. The topic model determines multiple topics based on the image and text content, and these topics steer the model toward generating different sentences. We evaluate our model on the MSCOCO dataset. The results show that, compared with other models, our model achieves an improvement in higher-order BLEU scores and a significant improvement in the CIDEr score. Full article
(This article belongs to the Special Issue Information Processing in Multimedia Applications)

32 pages, 6455 KB  
Article
Novel Encoder–Decoder Architecture with Attention Mechanisms for Satellite-Based Environmental Forecasting in Smart City Applications
by Kalsoom Panhwar, Bushra Naz Soomro, Sania Bhatti and Fawwad Hassan Jaskani
Future Internet 2025, 17(9), 380; https://doi.org/10.3390/fi17090380 - 25 Aug 2025
Abstract
Desertification poses critical threats to agricultural productivity and socio-economic stability, particularly in vulnerable regions like Thatta and Badin districts of Sindh, Pakistan. Traditional monitoring methods lack the accuracy and temporal resolution needed for effective early warning systems. This study presents a novel Spatio-Temporal Desertification Predictor (STDP) framework that integrates deep learning with next-generation satellite imaging for time-series desertification forecasting. The proposed encoder–decoder architecture combines Convolutional Neural Networks (CNNs) for spatial feature extraction from high-resolution satellite imagery with modified Long Short-Term Memory (LSTM) networks enhanced by multi-head attention to capture temporal dependencies. Environmental variables are fused through an adaptive data integration layer, and hyperparameter optimization is employed to enhance model performance for edge computing deployment. Experimental validation on a 15-year satellite dataset (2010–2024) demonstrates superior performance with MSE = 0.018, MAE = 0.079, and R2=0.94, outperforming traditional CNN-only, LSTM-only, and hybrid baselines by 15–20% in prediction accuracy. The framework forecasts desertification trends through 2030, providing actionable signals for environmental management and policy-making. This work advances the integration of AI with satellite-based Earth observation, offering a scalable path for real-time environmental monitoring in IoT and edge computing infrastructures. Full article
(This article belongs to the Special Issue Advances in Deep Learning and Next-Generation Internet Technologies)
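A rough, non-authoritative analogue of the encoder–decoder just summarized is sketched below: a small CNN encodes each satellite frame, an LSTM models the sequence, and multi-head attention re-weights the temporal states. All shapes and layer widths are invented, and the environmental-variable fusion layer and hyperparameter optimization are omitted.

```python
import torch
import torch.nn as nn

class SpatioTemporalForecaster(nn.Module):
    """Toy CNN + LSTM + multi-head-attention forecaster (illustrative only)."""
    def __init__(self, in_ch=4, emb=64, heads=4):
        super().__init__()
        self.cnn = nn.Sequential(              # per-frame spatial encoder
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, emb),
        )
        self.lstm = nn.LSTM(emb, emb, batch_first=True)
        self.attn = nn.MultiheadAttention(emb, heads, batch_first=True)
        self.out = nn.Linear(emb, 1)           # e.g. a desertification index

    def forward(self, frames):                 # frames: (B, T, C, H, W)
        B, T = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(B, T, -1)
        states, _ = self.lstm(feats)
        ctx, _ = self.attn(states, states, states)
        return self.out(ctx[:, -1])            # predict from the last step

if __name__ == "__main__":
    model = SpatioTemporalForecaster()
    x = torch.randn(2, 6, 4, 32, 32)           # 2 samples, 6 time steps
    print(model(x).shape)                      # torch.Size([2, 1])
```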

15 pages, 5342 KB  
Article
Transfer Learning-Based Multi-Sensor Approach for Predicting Keyhole Depth in Laser Welding of 780DP Steel
by Byeong-Jin Kim, Young-Min Kim and Cheolhee Kim
Materials 2025, 18(17), 3961; https://doi.org/10.3390/ma18173961 - 24 Aug 2025
Abstract
Penetration depth is a critical factor determining joint strength in butt welding; however, it is difficult to monitor in keyhole-mode laser welding due to the dynamic nature of the keyhole. Recently, optical coherence tomography (OCT) has been introduced for real-time keyhole depth measurement, though accurate results require meticulous calibration. In this study, deep learning-based models were developed to estimate penetration depth in laser welding of 780 dual-phase (DP) steel. The models utilized coaxial weld pool images and spectrometer signals as inputs, with OCT signals serving as the output reference. Both uni-sensor models (based on coaxial pool images) and multi-sensor models (incorporating spectrometer data) were developed using transfer learning techniques based on pre-trained convolutional neural network (CNN) architectures including MobileNetV2, ResNet50V2, EfficientNetB3, and Xception. The coefficient of determination (R2) values of the uni-sensor CNN transfer learning models without fine-tuning ranged from 0.502 to 0.681, and the mean absolute errors (MAEs) ranged from 0.152 mm to 0.196 mm. In the fine-tuned models, R2 decreased by more than 17% and MAE increased by more than 11% compared to the models without fine-tuning. In addition, in the multi-sensor models, R2 ranged from 0.900 to 0.956 and MAE ranged from 0.058 mm to 0.086 mm, showing better performance than the uni-sensor CNN transfer learning models. This study demonstrated the potential of using CNN transfer learning models for predicting penetration depth in laser welding of 780DP steel. Full article
(This article belongs to the Special Issue Advances in Plasma and Laser Engineering (Second Edition))
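The uni-sensor transfer-learning setup can be imitated by replacing the classifier of a pretrained backbone with a single regression output, as in the hedged sketch below (torchvision MobileNetV2 with frozen features; the input resolution and head design are assumptions, not the paper's settings).

```python
import torch
import torch.nn as nn
from torchvision import models

def build_depth_regressor(freeze_backbone: bool = True) -> nn.Module:
    """Pretrained MobileNetV2 with its classifier replaced by a 1-unit
    regression head for keyhole-depth prediction (illustrative only)."""
    backbone = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
    if freeze_backbone:
        for p in backbone.features.parameters():
            p.requires_grad = False           # feature extraction only
    in_feats = backbone.classifier[1].in_features
    backbone.classifier = nn.Sequential(
        nn.Dropout(0.2),
        nn.Linear(in_feats, 1),               # predicted depth (e.g. in mm)
    )
    return backbone

if __name__ == "__main__":
    model = build_depth_regressor()
    pool_images = torch.randn(4, 3, 224, 224)  # coaxial weld-pool frames
    print(model(pool_images).shape)            # torch.Size([4, 1])
```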

28 pages, 2147 KB  
Article
Generalized Methodology for Two-Dimensional Flood Depth Prediction Using ML-Based Models
by Mohamed Soliman, Mohamed M. Morsy and Hany G. Radwan
Hydrology 2025, 12(9), 223; https://doi.org/10.3390/hydrology12090223 - 24 Aug 2025
Abstract
Floods are among the most devastating natural disasters; predicting their depth and extent remains a global challenge. Machine Learning (ML) models have demonstrated improved accuracy over traditional probabilistic flood mapping approaches. While previous studies have developed ML-based models for specific local regions, this study aims to establish a methodology for estimating flood depth on a global scale using ML algorithms and freely available datasets, a challenging yet critical task. To support model generalization, 45 catchments from diverse geographic regions were selected based on elevation, land use, land cover, and soil type variations. The datasets were meticulously preprocessed, ensuring normality, eliminating outliers, and scaling. These preprocessed data were then split into subgroups: 75% for training and 25% for testing, with six additional unseen catchments from the USA reserved for validation. A sensitivity analysis was performed across several ML models (ANN, CNN, RNN, LSTM, Random Forest, XGBoost), leading to the selection of the Random Forest (RF) algorithm for both flood inundation classification and flood depth regression models. Three regression models were assessed for flood depth prediction. The pixel-based regression model achieved an R2 of 91% for training and 69% for testing. Introducing a pixel clustering regression model improved the testing R2 to 75%, with an overall validation (for unseen catchments) R2 of 64%. The catchment-based clustering regression model yielded the most robust performance, with an R2 of 83% for testing and 82% for validation. The developed ML model demonstrates breakthrough computational efficiency, generating complete flood depth predictions in just 6 min, a 225× speed improvement (90–95% time reduction) over conventional HEC-RAS 6.3 simulations. This rapid processing enables the practical implementation of flood early warning systems. Despite the dramatic speed gains, the solution maintains high predictive accuracy, evidenced by statistically robust 95% confidence intervals and strong spatial agreement with HEC-RAS benchmark maps. These findings highlight the critical role of the spatial variability of dependencies in enhancing model accuracy, representing a meaningful step forward in scalable modeling frameworks with potential for global generalization of flood depth prediction. Full article
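The two-stage Random Forest design (inundation classification followed by depth regression on the pixels flagged as wet) can be sketched with scikit-learn as below; the synthetic per-pixel predictors are placeholders for the terrain and land-cover features actually used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Synthetic per-pixel predictors (e.g. elevation, slope, land cover, rainfall);
# the real study derives these from freely available global datasets.
rng = np.random.default_rng(42)
X = rng.normal(size=(10_000, 6))
is_wet = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)       # flooded / dry label
depth = np.where(is_wet, np.abs(X[:, 0]) * 0.8, 0.0)     # depth only where wet

# Stage 1: classify flooded pixels.
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, is_wet)

# Stage 2: regress depth on pixels predicted as flooded.
wet_mask = clf.predict(X) == 1
reg = RandomForestRegressor(n_estimators=200, random_state=0)
reg.fit(X[wet_mask], depth[wet_mask])

pred_depth = np.zeros(len(X))
pred_depth[wet_mask] = reg.predict(X[wet_mask])
print("mean predicted depth on wet pixels:", pred_depth[wet_mask].mean())
```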

40 pages, 9864 KB  
Article
Cascaded Hierarchical Attention with Adaptive Fusion for Visual Grounding in Remote Sensing
by Huming Zhu, Tianqi Gao, Zhixian Li, Zhipeng Chen, Qiuming Li, Kongmiao Miao, Biao Hou and Licheng Jiao
Remote Sens. 2025, 17(17), 2930; https://doi.org/10.3390/rs17172930 - 23 Aug 2025
Abstract
Visual grounding for remote sensing (RSVG) is the task of localizing the referred object in remote sensing (RS) images by parsing free-form language descriptions. However, RSVG faces the challenge of low detection accuracy due to unbalanced multi-scale grounding capabilities, where grounding accuracy is noticeably higher for large objects than for small ones. Based on Faster R-CNN, we propose Faster R-CNN in Visual Grounding for Remote Sensing (FR-RSVG), a two-stage method for grounding RS objects. Building on this foundation, to enhance the ability to ground multi-scale objects, we propose Faster R-CNN with Adaptive Vision-Language Fusion (FR-AVLF), which introduces a layered Adaptive Vision-Language Fusion (AVLF) module. Specifically, this method can adaptively fuse deep or shallow visual features according to the input text (e.g., location-related or object characteristic descriptions), thereby optimizing semantic feature representation and improving grounding accuracy for objects of different scales. Given that RSVG is essentially an expanded form of RS object detection, and considering the knowledge the model acquired in prior RS object detection tasks, we propose Faster R-CNN with Adaptive Vision-Language Fusion Pretrained (FR-AVLFPRE). To further enhance model performance, we propose Faster R-CNN with Cascaded Hierarchical Attention Grounding and Multi-Level Adaptive Vision-Language Fusion Pretrained (FR-CHAGAVLFPRE), which introduces a cascaded hierarchical attention grounding mechanism, employs a more advanced language encoder, and improves upon AVLF by proposing Multi-Level AVLF, significantly improving localization accuracy in complex scenarios. Extensive experiments on the DIOR-RSVG dataset demonstrate that our model surpasses most existing advanced models. To validate the generalization capability of our model, we conducted zero-shot inference experiments on shared categories between DIOR-RSVG and both Complex Description DIOR-RSVG (DIOR-RSVG-C) and OPT-RSVG datasets, achieving performance superior to most existing models. Full article
(This article belongs to the Section AI Remote Sensing)
21 pages, 7575 KB  
Article
Mapping Orchard Trees from UAV Imagery Through One Growing Season: A Comparison Between OBIA-Based and Three CNN-Based Object Detection Methods
by Maggi Kelly, Shane Feirer, Sean Hogan, Andy Lyons, Fengze Lin and Ewelina Jacygrad
Drones 2025, 9(9), 593; https://doi.org/10.3390/drones9090593 - 22 Aug 2025
Abstract
Extracting the shapes of individual tree crowns from high-resolution imagery can play a crucial role in many applications, including precision agriculture. We evaluated three CNN models (Mask R-CNN, YOLOv3, and SAM) and compared their tree crown results with OBIA-based reference datasets from UAV imagery for seven dates across one growing season. We found that YOLOv3 performed poorly across all dates; both Mask R-CNN and SAM performed well in May, June, September, and November (precision, recall, and F1 scores over 0.79). All models struggled in the early season imagery (e.g., March). Mask R-CNN outperformed other models in August (when there was smoke haze) and December (showing end-of-season variation in leaf color). SAM was the fastest model, and, as it required no training, it could cover more area in less time; Mask R-CNN was very accurate and customizable. In this paper, we aimed to contribute insight into which CNN model offers the best balance of accuracy and ease of implementation for orchard management tasks. We also evaluated its applicability within one software ecosystem, ESRI ArcGIS Pro, and showed how such an approach offers users a streamlined, efficient way to detect objects in high-resolution UAV imagery. Full article
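Outside the ArcGIS Pro ecosystem evaluated here, the Mask R-CNN leg of such a comparison is commonly prototyped with torchvision's pretrained detector, as in the sketch below; fine-tuning on crown annotations and the comparison against OBIA reference polygons are omitted, and the confidence threshold is arbitrary.

```python
import torch
from torchvision.models.detection import (
    maskrcnn_resnet50_fpn, MaskRCNN_ResNet50_FPN_Weights,
)

weights = MaskRCNN_ResNet50_FPN_Weights.DEFAULT
model = maskrcnn_resnet50_fpn(weights=weights).eval()

# Stand-in for a UAV orthomosaic tile; real use would tile the imagery and
# fine-tune the detection heads on labeled tree-crown masks first.
tile = torch.rand(3, 512, 512)

with torch.no_grad():
    out = model([tile])[0]            # dict of boxes, labels, scores, masks

keep = out["scores"] > 0.5            # arbitrary confidence threshold
crowns = out["masks"][keep]           # (N, 1, H, W) soft instance masks
print(f"{keep.sum().item()} detections above threshold")
```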

15 pages, 622 KB  
Review
Artificial Intelligence in the Diagnosis and Imaging-Based Assessment of Pelvic Organ Prolapse: A Scoping Review
by Marian Botoncea, Călin Molnar, Vlad Olimpiu Butiurca, Cosmin Lucian Nicolescu and Claudiu Molnar-Varlam
Medicina 2025, 61(8), 1497; https://doi.org/10.3390/medicina61081497 - 21 Aug 2025
Abstract
Background and Objectives: Pelvic organ prolapse (POP) is a complex condition affecting the pelvic floor, often requiring imaging for accurate diagnosis and treatment planning. Artificial intelligence (AI), particularly deep learning (DL), is emerging as a powerful tool in medical imaging. This scoping review aims to synthesize current evidence on the use of AI in the imaging-based diagnosis and anatomical evaluation of POP. Materials and Methods: Following the PRISMA-ScR guidelines, a comprehensive search was conducted in PubMed, Scopus, and Web of Science for studies published between January 2020 and April 2025. Studies were included if they applied AI methodologies, such as convolutional neural networks (CNNs), vision transformers (ViTs), or hybrid models, to diagnostic imaging modalities, such as ultrasound and magnetic resonance imaging (MRI), in women with POP. Results: Eight studies met the inclusion criteria. In these studies, AI technologies were applied to 2D/3D ultrasound and static or stress MRI for segmentation, anatomical landmark localization, and prolapse classification. CNNs were the most commonly used models, often combined with transfer learning. Some studies used hybrid ViT models, demonstrating high diagnostic accuracy. However, all studies relied on internal datasets, with limited model interpretability and no external validation. Moreover, clinical deployment and outcome assessments remain underexplored. Conclusions: AI shows promise in enhancing POP diagnosis through improved image analysis, but current applications are largely exploratory. Future work should prioritize external validation, standardization, explainable AI, and real-world implementation to bridge the gap between experimental models and clinical utility. Full article
(This article belongs to the Section Obstetrics and Gynecology)

20 pages, 1818 KB  
Article
Image Captioning Model Based on Multi-Step Cross-Attention Cross-Modal Alignment and External Commonsense Knowledge Augmentation
by Liang Wang, Meiqing Jiao, Zhihai Li, Mengxue Zhang, Haiyan Wei, Yuru Ma, Honghui An, Jiaqi Lin and Jun Wang
Electronics 2025, 14(16), 3325; https://doi.org/10.3390/electronics14163325 - 21 Aug 2025
Abstract
To address the semantic mismatch between limited textual descriptions in image captioning training datasets and the multi-semantic nature of images, as well as the underutilized external commonsense knowledge, this article proposes a novel image captioning model based on multi-step cross-attention cross-modal alignment and external commonsense knowledge enhancement. The model employs a backbone architecture comprising CLIP’s ViT visual encoder, Faster R-CNN, BERT text encoder, and GPT-2 text decoder. It incorporates two core mechanisms: a multi-step cross-attention mechanism that iteratively aligns image and text features across multiple rounds, progressively enhancing inter-modal semantic consistency for more accurate cross-modal representation fusion. Moreover, the model employs Faster R-CNN to extract region-based object features. These features are mapped to corresponding entities within the dataset through entity probability calculation and entity linking. External commonsense knowledge associated with these entities is then retrieved from the ConceptNet knowledge graph, followed by knowledge embedding via TransE and multi-hop reasoning. Finally, the fused multimodal features are fed into the GPT-2 decoder to steer caption generation, enhancing the lexical richness, factual accuracy, and cognitive plausibility of the generated descriptions. In the experiments, the model achieves CIDEr scores of 142.6 on MSCOCO and 78.4 on Flickr30k. Ablations confirm both modules enhance caption quality. Full article

29 pages, 2133 KB  
Article
A Wavelet–Attention–Convolution Hybrid Deep Learning Model for Accurate Short-Term Photovoltaic Power Forecasting
by Kaoutar Ait Chaoui, Hassan EL Fadil, Oumaima Choukai and Oumaima Ait Omar
Forecasting 2025, 7(3), 45; https://doi.org/10.3390/forecast7030045 - 19 Aug 2025
Abstract
Accurate short-term forecasting of photovoltaic (PV) power is crucial for grid stability control, energy trading optimization, and renewable energy integration in smart grids. However, PV generation is extremely variable and non-linear due to environmental fluctuations, which challenges conventional forecasting models. This study proposes a hybrid deep learning architecture, Wavelet Transform–Transformer–Temporal Convolutional Network–Efficient Channel Attention Network–Gated Recurrent Unit (WT–Transformer–TCN–ECANet–GRU), to capture the overall temporal complexity of PV data by integrating signal decomposition, global attention, local convolutional features, and temporal memory. The model begins by employing the Wavelet Transform (WT) to decompose the raw PV time series into multi-frequency components, thereby enhancing feature extraction and denoising. Long-term temporal dependencies are captured by a Transformer encoder, and a Temporal Convolutional Network (TCN) detects local features. Features are then adaptively recalibrated by an Efficient Channel Attention (ECANet) module and passed to a Gated Recurrent Unit (GRU) for sequence modeling. The modular pipeline thus enables multiscale learning, attention-driven robust filtering, and efficient temporal encoding. We validate the model on a real-world, high-resolution dataset from a Moroccan university building comprising 95,885 five-minute PV generation records. The model yielded the lowest error metrics among benchmark architectures, with an MAE of 209.36, an RMSE of 616.53, and an R2 of 0.96884, outperforming LSTM, GRU, CNN-LSTM, and other hybrid deep learning models. These results suggest improved predictive accuracy and potential applicability for real-time grid operation integration, supporting applications such as energy dispatching, reserve management, and short-term load balancing. Full article
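The wavelet front end of this pipeline can be prototyped with PyWavelets: decompose the PV series into sub-bands, reconstruct each band to full length, and feed sliding windows of the stacked bands to a recurrent model. The sketch below covers only that preprocessing plus a plain GRU; the Transformer, TCN, and ECANet stages are omitted and every hyperparameter is a placeholder.

```python
import numpy as np
import pywt
import torch
import torch.nn as nn

# Toy PV power series; the study uses five-minute generation records.
t = np.arange(2048)
pv = np.maximum(0, np.sin(2 * np.pi * t / 288)) + 0.05 * np.random.randn(2048)

# Multi-level discrete wavelet decomposition, then per-band reconstruction
# so every sub-band has the original length and can be stacked as channels.
wavelet, level = "db4", 3
coeffs = pywt.wavedec(pv, wavelet, level=level)
bands = []
for i in range(len(coeffs)):
    keep = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
    bands.append(pywt.waverec(keep, wavelet)[: len(pv)])
features = np.stack(bands, axis=-1)          # (T, level + 1)

# Sliding windows feeding a GRU that predicts the next PV value.
win = 48
X = np.stack([features[i:i + win] for i in range(len(pv) - win)])
y = pv[win:]

gru = nn.GRU(input_size=features.shape[1], hidden_size=32, batch_first=True)
head = nn.Linear(32, 1)
out, _ = gru(torch.tensor(X, dtype=torch.float32))
pred = head(out[:, -1]).squeeze(-1)
print(pred.shape, torch.tensor(y).shape)     # both have 2000 entries
```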

35 pages, 11854 KB  
Article
ODDM: Integration of SMOTE Tomek with Deep Learning on Imbalanced Color Fundus Images for Classification of Several Ocular Diseases
by Afraz Danish Ali Qureshi, Hassaan Malik, Ahmad Naeem, Syeda Nida Hassan, Daesik Jeong and Rizwan Ali Naqvi
J. Imaging 2025, 11(8), 278; https://doi.org/10.3390/jimaging11080278 - 18 Aug 2025
Abstract
Ocular disease (OD) represents a complex medical condition affecting humans. OD diagnosis is a challenging process in the current medical system, and blindness may occur if the disease is not detected at its initial phase. Recent studies have shown significant outcomes in identifying OD using deep learning (DL) models. Thus, this work aims to develop a multi-classification DL-based model for the classification of seven ODs, including normal (NOR), age-related macular degeneration (AMD), diabetic retinopathy (DR), glaucoma (GLU), maculopathy (MAC), non-proliferative diabetic retinopathy (NPDR), and proliferative diabetic retinopathy (PDR), using color fundus images (CFIs). This work proposes a custom CNN-based model named the ocular disease detection model (ODDM). The proposed ODDM is trained and tested on a publicly available ocular disease dataset (ODD). Additionally, the SMOTE Tomek (SM-TOM) approach is used to handle the imbalanced distribution of the OD images in the ODD. The performance of the ODDM is compared with seven baseline models, including DenseNet-201 (R1), EfficientNet-B0 (R2), Inception-V3 (R3), MobileNet (R4), Vgg-16 (R5), Vgg-19 (R6), and ResNet-50 (R7). The proposed ODDM obtained a 98.94% AUC, along with 97.19% accuracy, a recall of 88.74%, a precision of 95.23%, and an F1-score of 88.31% in classifying the seven different types of OD. Furthermore, ANOVA and Tukey HSD (Honestly Significant Difference) post hoc tests are applied to assess the statistical significance of the proposed ODDM’s results. This study concludes that the results of the proposed ODDM are superior to those of baseline models and state-of-the-art models. Full article
(This article belongs to the Special Issue Advances in Machine Learning for Medical Imaging Applications)
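The class-balancing step named in the title is available off the shelf in imbalanced-learn; a hedged sketch of how SMOTE Tomek resampling would sit in front of classifier training is shown below, applied to synthetic feature vectors rather than the actual fundus images.

```python
import numpy as np
from collections import Counter
from imblearn.combine import SMOTETomek

# Synthetic imbalanced feature matrix standing in for encoded fundus images
# (seven ocular-disease classes with very unequal counts, as in the ODD dataset).
rng = np.random.default_rng(1)
counts = [900, 300, 250, 120, 80, 60, 40]
X = np.vstack([rng.normal(loc=i, size=(n, 64)) for i, n in enumerate(counts)])
y = np.concatenate([np.full(n, i) for i, n in enumerate(counts)])

print("before:", Counter(y))
X_res, y_res = SMOTETomek(random_state=0).fit_resample(X, y)
print("after :", Counter(y_res))
# X_res / y_res would then be fed to the CNN classifier (reshaped as needed).
```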

16 pages, 1961 KB  
Article
Short-Term Wind Energy Yield Forecasting: A Comparative Analysis Using Multiple Data Sources
by Nikita Dmitrijevs, Vitalijs Komasilovs, Svetlana Orlova and Edmunds Kamolins
Energies 2025, 18(16), 4393; https://doi.org/10.3390/en18164393 - 18 Aug 2025
Abstract
Short-term wind turbine energy yield forecasting is crucial for effectively integrating wind energy into the electricity grid and fulfilling day-ahead scheduling obligations in electricity markets such as Nord Pool and EPEX SPOT. This study presents a forecasting approach utilising operational data from two wind turbines in Latvia, as well as meteorological inputs from the NORA 3 reanalysis dataset, sensor measurements from the turbines, and data provided by the Latvian Environment, Geology and Meteorology Centre (LEGMC). Forecasts with lead times of 1 to 36 h are generated to support accurate day-ahead generation estimates. Several modelling techniques, including recurrent neural networks (RNNs), convolutional neural networks (CNNs), artificial neural networks (ANNs), XGBoost, CatBoost, LightGBM, linear regression, and Ridge regression, are evaluated, incorporating wind and atmospheric parameters from three datasets: operational turbine data, meteorological measurements from LEGMC, and the NORA 3 reanalysis dataset. Model performance is assessed using standard error metrics, including Mean Squared Error (MSE), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and R-squared (R2). This study demonstrates the effectiveness of integrating reanalysis-based meteorological data with turbine-level operational measurements to enhance the accuracy and reliability of short-term wind energy forecasting, thereby supporting efficient day-ahead market scheduling and the integration of clean energy. Full article
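The shared evaluation protocol (MSE, MAE, RMSE, MAPE, R2) is easy to factor into a small helper such as the sketch below; the toy arrays stand in for the day-ahead energy yields produced by the benchmarked models.

```python
import numpy as np
from sklearn.metrics import (
    mean_squared_error, mean_absolute_error,
    mean_absolute_percentage_error, r2_score,
)

def score_forecast(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Standard error metrics used to compare the forecasting models."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "MSE": mse,
        "MAE": mean_absolute_error(y_true, y_pred),
        "RMSE": float(np.sqrt(mse)),
        "MAPE": mean_absolute_percentage_error(y_true, y_pred),
        "R2": r2_score(y_true, y_pred),
    }

if __name__ == "__main__":
    # Toy hourly energy yields (MWh); real values come from the turbines.
    y_true = np.array([1.2, 0.9, 1.5, 2.0, 1.1, 0.7])
    y_pred = np.array([1.1, 1.0, 1.4, 1.8, 1.2, 0.6])
    for name, value in score_forecast(y_true, y_pred).items():
        print(f"{name:5s} {value:.3f}")
```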

35 pages, 33285 KB  
Article
Chaotic Vibration Prediction of a Laminated Composite Cantilever Beam Subject to Random Parametric Error
by Lin Sun, Xudong Li and Xiaopei Liu
J. Compos. Sci. 2025, 9(8), 442; https://doi.org/10.3390/jcs9080442 - 17 Aug 2025
Abstract
Random parametric errors (RPEs) are introduced into the model establishment of a laminated composite cantilever beam (LCCB) to demonstrate the accuracy and robustness of a recurrent neural network (RNN) in predicting the chaotic vibration of an LCCB, and a comparative analysis of training performance and generalization capability is conducted with a convolutional neural network (CNN). In the process of dynamic modeling, the nonlinear dynamic system of an LCCB is established by considering RPEs. The displacement and velocity time series obtained from numerical simulation are used to train and test the RNN model. The RNN model converts the original data into a multi-step supervised learning format and normalizes it using the MinMaxScaler method. The prediction performance is comprehensively evaluated through three performance indicators: coefficient of determination (R2), mean absolute error (MAE), and root mean square error (RMSE). The results show that, under the condition of introducing RPEs, the RNN model still exhibits high prediction accuracy, with the maximum R2 reaching 0.999984548634328, the maximum MAE being 0.075, and the maximum RMSE being 0.121. Furthermore, performing predictions at the free end of the LCCB verifies the applicability and robustness of the RNN model with respect to spatial position variations. These results fully demonstrate the accuracy and robustness of the RNN model in predicting the chaotic vibration of an LCCB. Full article
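The data preparation described here, MinMax scaling followed by conversion of the displacement and velocity series into a multi-step supervised-learning format, is a standard pattern; one hedged way to write it is sketched below, with window lengths chosen arbitrarily.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def to_supervised(series: np.ndarray, n_in: int, n_out: int):
    """Turn a (T, features) series into (X, y) pairs of past/future windows."""
    X, y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(series[i:i + n_in])
        y.append(series[i + n_in:i + n_in + n_out])
    return np.stack(X), np.stack(y)

# Toy displacement/velocity trajectory standing in for the simulated beam response.
t = np.linspace(0, 100, 5000)
traj = np.column_stack([np.sin(t) + 0.1 * np.sin(7 * t), np.cos(t)])

scaler = MinMaxScaler()                 # scale each channel to [0, 1]
scaled = scaler.fit_transform(traj)

X, y = to_supervised(scaled, n_in=50, n_out=5)
print(X.shape, y.shape)                 # (4946, 50, 2) (4946, 5, 2)
```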

17 pages, 3027 KB  
Article
Time Series Prediction of Water Quality Based on NGO-CNN-GRU Model—A Case Study of Xijiang River, China
by Xiaofeng Ding, Yiling Chen, Haipeng Zeng and Yu Du
Water 2025, 17(16), 2413; https://doi.org/10.3390/w17162413 - 15 Aug 2025
Abstract
Water quality deterioration poses a critical threat to ecological security and sustainable development, particularly in rapidly urbanizing regions. To enable proactive environmental management, this study develops a novel hybrid deep learning model, the NGO-CNN-GRU, for high-precision time-series water quality prediction in the Xijiang River Basin, China. The model integrates a Convolutional Neural Network (CNN) for spatial feature extraction and a Gated Recurrent Unit (GRU) for temporal dependency modeling, with hyperparameters optimized via the Northern Goshawk Optimization (NGO) algorithm. Using historical water quality (pH, DO, CODMn, NH3-N, TP, TN) and meteorological data (precipitation, temperature, humidity) from 11 monitoring stations, the model achieved exceptional performance: test set R2 > 0.986, MAE < 0.015, and RMSE < 0.018 for total nitrogen prediction (Xiaodong Station case study). Across all stations and indicators, it consistently outperformed baseline models (GRU, CNN-GRU), with average R2 improvements of 12.3% and RMSE reductions up to 90% for NH3-N predictions. Spatiotemporal analysis further revealed significant pollution gradients correlated with anthropogenic activities in the Pearl River Delta. This work provides a robust tool for real-time water quality early warning systems and supports evidence-based river basin management. Full article
(This article belongs to the Special Issue Monitoring and Modelling of Contaminants in Water Environment)
