Search Results (48)

Search Parameters:
Keywords = ResRNN

17 pages, 1149 KB  
Article
IP Spoofing Detection Using Deep Learning
by İsmet Kaan Çekiş, Buğra Ayrancı, Fezayim Numan Salman and İlker Özçelik
Appl. Sci. 2025, 15(17), 9508; https://doi.org/10.3390/app15179508 - 29 Aug 2025
Viewed by 601
Abstract
IP spoofing is a critical component in many cyberattacks, enabling attackers to evade detection and conceal their identities. This study rigorously compares eight deep learning models—LSTM, GRU, CNN, MLP, DNN, RNN, ResNet1D, and xLSTM—for their efficacy in detecting IP spoofing attacks. Overfitting was mitigated through techniques such as dropout, early stopping, and normalization. Models were trained using binary cross-entropy loss and the Adam optimizer. Performance was assessed via accuracy, precision, recall, F1 score, and inference time, with each model executed a total of 15 times to account for stochastic variability. Results indicate a powerful performance across all models, with LSTM and GRU demonstrating superior detection efficacy. After ONNX conversion, the MLP and DNN models retained their performance while achieving significant reductions in inference time, miniaturized model sizes, and platform independence. These advancements facilitated the effective utilization of the developed systems in real-time network security applications. The comprehensive performance metrics presented are crucial for selecting optimal IP spoofing detection strategies tailored to diverse application requirements, serving as a valuable reference for network anomaly monitoring and targeted attack detection. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

28 pages, 983 KB  
Article
Robust Pavement Modulus Prediction Using Time-Structured Deep Models and Perturbation-Based Evaluation on FWD Data
by Xinyu Guo, Yue Chen and Nan Sun
Sensors 2025, 25(17), 5222; https://doi.org/10.3390/s25175222 - 22 Aug 2025
Viewed by 742
Abstract
The accurate prediction of the pavement structural modulus is crucial for maintenance planning and life-cycle assessment. While recent deep learning models have improved predictive accuracy using Falling Weight Deflectometer data, challenges remain in effectively structuring time-series inputs and ensuring robustness against measurement noise. This paper presents an integrated framework that combines systematic time-step modeling with perturbation-based robustness evaluation. Five distinct input sequencing strategies (Plan A through Plan E) were developed to investigate the impact of temporal structure on model performance. A hybrid Wide & Deep ResRNN architecture incorporating SimpleRNN, GRU, and LSTM components was designed to jointly predict four-layer moduli. To simulate real-world sensor uncertainty, Gaussian noise with ±3% variance was injected into inputs, allowing the Monte-Carlo-style estimation of confidence intervals. Experimental results revealed that time-step design plays a critical role in both prediction accuracy and robustness, with Plan D consistently achieving the best balance between accuracy and stability. These findings offer a practical and generalizable approach for deploying deep sequence models in pavement modulus prediction tasks, particularly under uncertain field conditions. Full article
(This article belongs to the Section Physical Sensors)
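
The perturbation-based evaluation described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the noise is modeled as multiplicative Gaussian noise with a 3% relative standard deviation (one possible reading of "±3%"), the interval is a simple percentile interval, and `toy_model` is a hypothetical stand-in rather than the authors' Wide & Deep ResRNN.

```python
import random
import statistics


def perturbation_interval(predict, inputs, rel_sigma=0.03, n_draws=1000,
                          level=0.95, seed=0):
    """Estimate a Monte-Carlo-style interval for predict(inputs).

    Each draw multiplies every input by (1 + e), with e ~ N(0, rel_sigma),
    then re-runs the model. The interval is read off the sorted outputs.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    outputs = []
    for _ in range(n_draws):
        noisy = [x * (1.0 + rng.gauss(0.0, rel_sigma)) for x in inputs]
        outputs.append(predict(noisy))
    outputs.sort()
    tail = (1.0 - level) / 2.0
    lower = outputs[int(tail * (n_draws - 1))]
    upper = outputs[int((1.0 - tail) * (n_draws - 1))]
    return statistics.fmean(outputs), lower, upper


def toy_model(deflections):
    # Hypothetical stand-in for a trained network, NOT the paper's model:
    # a larger total deflection maps to a lower modulus.
    return 1000.0 / sum(deflections)


mean, lower, upper = perturbation_interval(toy_model, [0.50, 0.35, 0.20])
print(f"mean={mean:.1f}, 95% interval=({lower:.1f}, {upper:.1f})")
```

A narrow interval suggests that predictions are stable under the assumed sensor noise; comparing interval widths across input sequencing strategies is one way such a robustness comparison could be set up.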

22 pages, 5197 KB  
Article
Electrical Resistivity Tomography Methods and Technical Research for Hydrate-Based Carbon Sequestration
by Zitian Lin, Qia Wang, Shufan Li, Xingru Li, Jiajie Ye, Yidi Zhang, Haoning Ye, Yangmin Kuang and Yanpeng Zheng
J. Mar. Sci. Eng. 2025, 13(7), 1205; https://doi.org/10.3390/jmse13071205 - 21 Jun 2025
Viewed by 694
Abstract
This study focuses on the application of electrical resistivity tomography (ERT) for monitoring the growth process of CO2 hydrate in subsea carbon sequestration, aiming to provide technical support for the safety assessment of marine carbon storage. By designing single-target, dual-target, and multi-target hydrate samples, convolutional neural networks (CNNs), recurrent neural networks (RNNs), and residual neural networks (ResNets) were constructed and compared with traditional image reconstruction algorithms (e.g., back-projection) to quantitatively analyze ERT imaging accuracy. The experiments used boundary voltage as the input and internal conductivity distribution as the output, employing the relative image error (RIE) and image correlation coefficient (ICC) to evaluate algorithmic performance. The results demonstrate that neural network algorithms—particularly RNNs—exhibit superior performance compared to traditional image reconstruction methods due to their strong noise resistance and nonlinear mapping capabilities. These algorithms significantly improve the edge clarity in target identification, enabling the precise capture of the hydrate distribution during carbon sequestration. This advancement effectively enhances the monitoring capability of CO2 hydrate reservoir characteristics and provides reliable data support for the safety assessment of hydrate reservoirs. Full article
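
For readers unfamiliar with the two metrics named in this abstract, the sketch below uses definitions that are common in the tomography literature: the relative image error as the normalized Euclidean distance between reconstructed and reference conductivity vectors, and the image correlation coefficient as their Pearson correlation. The paper's exact definitions are not given in the abstract, so these formulas and the function names are assumptions.

```python
import math


def relative_image_error(reconstructed, reference):
    """RIE = ||reconstructed - reference|| / ||reference|| (lower is better)."""
    num = math.sqrt(sum((r - t) ** 2 for r, t in zip(reconstructed, reference)))
    den = math.sqrt(sum(t ** 2 for t in reference))
    return num / den


def image_correlation_coefficient(reconstructed, reference):
    """ICC = Pearson correlation of the two images (closer to 1 is better)."""
    n = len(reference)
    mean_r = sum(reconstructed) / n
    mean_t = sum(reference) / n
    cov = sum((r - mean_r) * (t - mean_t)
              for r, t in zip(reconstructed, reference))
    var_r = sum((r - mean_r) ** 2 for r in reconstructed)
    var_t = sum((t - mean_t) ** 2 for t in reference)
    return cov / math.sqrt(var_r * var_t)


# Flattened 2x2 toy "images": the reconstruction has the right shape
# but only half the contrast of the reference.
reference = [0.0, 0.0, 1.0, 1.0]
reconstructed = [0.0, 0.0, 0.5, 0.5]
print(relative_image_error(reconstructed, reference))
print(image_correlation_coefficient(reconstructed, reference))
```

The toy case shows why the two metrics are reported together: a reconstruction can correlate perfectly with the reference in shape while still carrying a large amplitude error.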

33 pages, 17535 KB  
Article
MultiScaleFusion-Net and ResRNN-Net: Proposed Deep Learning Architectures for Accurate and Interpretable Pregnancy Risk Prediction
by Amna Asad, Madiha Sarwar, Muhammad Aslam, Edore Akpokodje and Syeda Fizzah Jilani
Appl. Sci. 2025, 15(11), 6152; https://doi.org/10.3390/app15116152 - 30 May 2025
Cited by 1 | Viewed by 957
Abstract
Women exhibit marked physiological transformations in pregnancy, mandating regular and holistic assessment. Maternal and fetal vitality is governed by a spectrum of clinical, demographic, and lifestyle factors throughout this critical period. The existing maternal health monitoring techniques lack precision in assessing pregnancy-related risks, often leading to late interventions and adverse outcomes. Accurate and timely risk prediction is crucial to avoid miscarriages. This research proposes a deep learning framework for personalized pregnancy risk prediction using the NFHS-5 dataset, and class imbalance is addressed through a hybrid NearMiss-SMOTE approach. Fifty-one primary features are selected via the LASSO to refine the dataset and enhance model interpretability and efficiency. The framework integrates a multimodal model (NFHS-5, fetal plane images, and EHG time series) along with two core architectures. ResRNN-Net further combines Bi-LSTM, CNNs, and attention mechanisms to capture sequential dependencies. MultiScaleFusion-Net leverages GRU and multiscale convolutions for effective feature extraction. Additionally, TabNet and MLP models are explored to compare interpretability and computational efficiency. SHAP and Grad-CAM are used to ensure transparency and explainability, offering both feature importance and visual explanations of predictions. The proposed models are trained using 5-fold stratified cross-validation and evaluated with metrics including accuracy, precision, recall, F1-score, and ROC–AUC. The results demonstrate that MultiScaleFusion-Net balances accuracy and computational efficiency, making it suitable for real-time clinical deployment, while ResRNN-Net achieves higher precision at a slight computational cost. Performance comparisons with baseline machine learning models confirm the superiority of deep learning approaches, achieving over 80% accuracy in pregnancy complication prediction. Full article
(This article belongs to the Special Issue Application of Artificial Intelligence in Biomedical Informatics)

19 pages, 3555 KB  
Article
Research on Park Perception and Understanding Methods Based on Multimodal Text–Image Data and Bidirectional Attention Mechanism
by Kangen Chen, Xiuhong Lin, Tao Xia and Rushan Bai
Buildings 2025, 15(9), 1552; https://doi.org/10.3390/buildings15091552 - 4 May 2025
Viewed by 1045
Abstract
Parks are an important component of urban ecosystems, yet traditional research often relies on single-modal data, such as text or images alone, making it difficult to comprehensively and accurately capture the complex emotional experiences of visitors and their relationships with the environment. This study proposes a park perception and understanding model based on multimodal text–image data and a bidirectional attention mechanism. By integrating text and image data, the model incorporates a bidirectional encoder representations from transformers (BERT)-based text feature extraction module, a Swin Transformer-based image feature extraction module, and a bidirectional cross-attention fusion module, enabling a more precise assessment of visitors’ emotional experiences in parks. Experimental results show that compared to traditional methods such as residual network (ResNet), recurrent neural network (RNN), and long short-term memory (LSTM), the proposed model achieves significant advantages across multiple evaluation metrics, including mean squared error (MSE), mean absolute error (MAE), root mean squared error (RMSE), and the coefficient of determination (R2). Furthermore, using the SHapley Additive exPlanations (SHAP) method, this study identified the key factors influencing visitors’ emotional experiences, such as “water”, “green”, and “sky”, providing a scientific basis for park management and optimization. Full article

20 pages, 1764 KB  
Article
A Temporal Convolutional Network–Bidirectional Long Short-Term Memory (TCN-BiLSTM) Prediction Model for Temporal Faults in Industrial Equipment
by Jinyin Bai, Wei Zhu, Shuhong Liu, Chenhao Ye, Peng Zheng and Xiangchen Wang
Appl. Sci. 2025, 15(4), 1702; https://doi.org/10.3390/app15041702 - 7 Feb 2025
Cited by 3 | Viewed by 2609
Abstract
Traditional algorithms and single predictive models often face challenges such as limited prediction accuracy and insufficient modeling capabilities for complex time-series data in fault prediction tasks. To address these issues, this paper proposes a combined prediction model based on an improved temporal convolutional network (TCN) and bidirectional long short-term memory (BiLSTM), referred to as the TCN-BiLSTM model. This model aims to enhance the reliability and accuracy of time-series fault prediction. It is designed to handle continuous processes but can also be applied to batch and hybrid processes due to its flexible architecture. First, preprocessed industrial operation data are fed into the model, and hyperparameter optimization is conducted using the Optuna framework to improve training efficiency and generalization capability. Then, the model employs an improved TCN layer and a BiLSTM layer for feature extraction and learning. The TCN layer incorporates batch normalization, an optimized activation function (Leaky ReLU), and a dropout mechanism to enhance its ability to capture multi-scale temporal features. The BiLSTM layer further leverages its bidirectional learning mechanism to model the long-term dependencies in the data, enabling effective predictions of complex fault patterns. Finally, the model outputs the prediction results after iterative optimization. To evaluate the performance of the proposed model, simulation experiments were conducted to compare the TCN-BiLSTM model with mainstream prediction methods such as CNN, RNN, BiLSTM, and A-BiLSTM. The experimental results indicate that the TCN-BiLSTM model outperforms the comparison models in terms of prediction accuracy during both the modeling and forecasting stages, providing a feasible solution for time-series fault prediction. Full article

33 pages, 17902 KB  
Article
Modeling and Design of a Grid-Tied Renewable Energy System Exploiting Re-Lift Luo Converter and RNN Based Energy Management
by Kavitha Paulsamy and Subha Karuvelam
Sustainability 2025, 17(1), 187; https://doi.org/10.3390/su17010187 - 30 Dec 2024
Cited by 1 | Viewed by 1166
Abstract
The significance of the Hybrid Renewable Energy System (HRES) is profound in the current scenario owing to the mounting energy requirements, pressing ecological concerns and the pursuit of transitioning to greener energy alternatives. Thereby, the modeling and design of HRES, encompassing PV–WECS–Battery, which mainly focuses on efficient power conversion and advanced control strategy, is proposed. The voltage gain of the PV system is improved using the Re-lift Luo converter, which offers high efficiency and power density with minimized ripples and power losses. Its voltage lift technique mitigates parasitic effects and delivers improved output voltage for grid synchronization. To control and stabilize the converter output, a Proportional–Integral (PI) controller tuned using a novel hybrid algorithm combining Grey Wolf Optimization (GWO) with Hermit Crab Optimization (HCO) is implemented. GWO follows the hunting and leadership characteristics of grey wolves for improved simplicity and robustness. By simulating the shell selection behavior of hermit crabs, the HCO adds diversity to exploitation. Due to these features, the hybrid GWO–HCO algorithm enhances the PI controller’s capability of handling dynamic non-linear systems, generating better control accuracy, and rapid convergence to optimal solutions. Considering the Wind Energy Conversion System (WECS), the PI controller assures improved stability despite fluctuations in wind. A Recurrent Neural Network (RNN)-based battery management system is also incorporated for accurate monitoring and control of the State of Charge (SoC) and the terminal voltage of battery storage. The simulation is conducted in MATLAB Simulink 2021a, and a lab-scale prototype is implemented for real-time validation. The Re-lift Luo converter achieves an efficiency of 97.5% and a voltage gain of 1:10 with reduced oscillations and faster settling time using a Hybrid GWO–HCO–PI controller. 
Moreover, the THD is reduced to 1.16%, which indicates high power quality and reduced harmonics. Full article

18 pages, 8573 KB  
Article
ResTUnet: A Novel Neural Network Model for Nowcasting Using Radar Echo Sequences by Ground-Based Remote Sensing
by Lei Zhang, Ruoyang Zhang, Yu Wu, Yadong Wang, Yanfeng Zhang, Lijuan Zheng, Chongbin Xu, Xin Zuo and Zeyu Wang
Remote Sens. 2024, 16(24), 4792; https://doi.org/10.3390/rs16244792 - 23 Dec 2024
Viewed by 1150
Abstract
Radar echo extrapolation by ground-based remote sensing is essential for weather prediction and flight guidance. Existing radar echo extrapolation methods can hardly capture complex spatiotemporal features, which results in low prediction accuracy and severely restricts their use in extreme weather situations. A deep learning method was recently applied for extrapolating radar echoes; however, its accuracy declines too quickly over a short time. In this study, we introduce a solution: Residual Transformer and Unet (ResTUnet), a novel model that improves prediction accuracy and exhibits good stability with a slow rate of accuracy decline. The presented ResTUnet model is designed to solve the issue of declining prediction accuracy by integrating a 1 × 1 convolution to reduce the number of neural network parameters. We constructed an observed dataset from Zhengzhou East Airport radar observations from July 2022 to August 2022 and performed 90 min experiments comprising five aspects, including extrapolation images, the Probability of Detection (POD) index, the Critical Success Index (CSI), the False Alarm Rate (FAR) index, and the Heidke Skill Score (HSS) index. The experimental results show that the ResTUnet model improved the CSI, HSS index, and the POD index by 17.20%, 11.97%, and 11.35%, respectively, compared to current models, including Convolutional Long Short-Term Memory (convLSTM), the Convolutional Gated Recurrent Unit (convGRU), the Trajectory Gated Recurrent Unit (TrajGRU), and the improved recurrent network for video predictive learning, the Predictive Recurrent Neural Network++ (predRNN++). In addition, the mean squared error of the ResTUnet model remains stable at 15% between 0 and 60 min and starts to increase after 60–90 min, which is 12% better than the current models. This enhancement in prediction accuracy has practical applications in meteorological services and decision making. Full article
(This article belongs to the Special Issue Advance of Radar Meteorology and Hydrology II)
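
The verification indices listed in this abstract are all derived from a 2 × 2 contingency table of predicted versus observed events. The sketch below uses the standard textbook formulas, with FAR computed as the false alarm ratio (false alarms divided by all predicted events), which is the usual convention in nowcasting work; the abstract does not state which variant the authors used. The function name, the threshold, and the toy reflectivity values are illustrative assumptions, not data from the paper.

```python
def nowcast_scores(predicted, observed, threshold):
    """Compute POD, FAR, CSI and HSS for one intensity threshold.

    An "event" is a grid cell whose value is >= threshold. The sketch
    assumes at least one observed and one predicted event, so no
    denominator below is zero.
    """
    hits = misses = false_alarms = correct_negatives = 0
    for p, o in zip(predicted, observed):
        p_event, o_event = p >= threshold, o >= threshold
        if p_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif p_event:
            false_alarms += 1
        else:
            correct_negatives += 1

    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    csi = hits / (hits + misses + false_alarms)
    hss_num = 2 * (hits * correct_negatives - misses * false_alarms)
    hss_den = ((hits + misses) * (misses + correct_negatives)
               + (hits + false_alarms) * (false_alarms + correct_negatives))
    return {"POD": pod, "FAR": far, "CSI": csi, "HSS": hss_num / hss_den}


# Toy flattened radar frames (values in dBZ) and a 30 dBZ event threshold.
predicted = [40, 10, 35, 5, 20, 45]
observed = [38, 30, 12, 8, 15, 50]
scores = nowcast_scores(predicted, observed, threshold=30)
print(scores)
```

Higher POD, CSI, and HSS and lower FAR indicate a better forecast; in practice the scores are usually computed per lead time and per threshold rather than once over a whole sequence.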

13 pages, 2625 KB  
Article
DeepAT: A Deep Learning Wheat Phenotype Prediction Model Based on Genotype Data
by Jiale Li, Zikang He, Guomin Zhou, Shen Yan and Jianhua Zhang
Agronomy 2024, 14(12), 2756; https://doi.org/10.3390/agronomy14122756 - 21 Nov 2024
Cited by 2 | Viewed by 2567
Abstract
Genomic selection serves as an effective way for crop genetic breeding, capable of significantly shortening the breeding cycle and improving the accuracy of breeding. Phenotype prediction can help identify genetic variants associated with specific phenotypes. This provides a data-driven selection criterion for genomic selection, making the selection process more efficient and targeted. Deep learning has become an important tool for phenotype prediction due to its abilities in automatic feature learning, nonlinear modeling, and high-dimensional data processing. Current deep learning models have improvements in various aspects, such as predictive performance and computation time, but they still have limitations in capturing the complex relationships between genotype and phenotype, indicating that there is still room for improvement in the accuracy of phenotype prediction. This study innovatively proposes a new method called DeepAT, which mainly includes an input layer, a data feature extraction layer, a feature relationship capture layer, and an output layer. This method can predict wheat yield based on genotype data and has innovations in the following four aspects: (1) The data feature extraction layer of DeepAT can extract representative feature vectors from high-dimensional SNP data. By introducing the ReLU activation function, it enhances the model’s ability to express nonlinear features and accelerates the model’s convergence speed; (2) DeepAT can handle high-dimensional and complex genotype data while retaining as much useful information as possible; (3) The feature relationship capture layer of DeepAT effectively captures the complex relationships between features from low-dimensional features through a self-attention mechanism; (4) Compared to traditional RNN structures, the model training process is more efficient and stable. 
Using a public wheat dataset from AGT, comparative experiments with three machine learning and six deep learning methods found that DeepAT exhibited better predictive performance than other methods, achieving a prediction accuracy of 99.98%, a mean squared error (MSE) of only 28.93 tones, and a Pearson correlation coefficient close to 1, with yield predicted values closely matching observed values. This method provides a new perspective for deep learning-assisted phenotype prediction and has great potential in smart breeding. Full article
(This article belongs to the Section Precision and Digital Agriculture)

17 pages, 45843 KB  
Article
How to Learn More? Exploring Kolmogorov–Arnold Networks for Hyperspectral Image Classification
by Ali Jamali, Swalpa Kumar Roy, Danfeng Hong, Bing Lu and Pedram Ghamisi
Remote Sens. 2024, 16(21), 4015; https://doi.org/10.3390/rs16214015 - 29 Oct 2024
Cited by 31 | Viewed by 4608
Abstract
Convolutional neural networks (CNNs) and vision transformers (ViTs) have shown excellent capability in complex hyperspectral image (HSI) classification. However, these models require a significant amount of training data and computational resources. On the other hand, modern Multi-Layer Perceptrons (MLPs) have demonstrated a great classification capability. These modern MLP-based models require significantly less training data compared with CNNs and ViTs, achieving state-of-the-art classification accuracy. Recently, Kolmogorov–Arnold networks (KANs) were proposed as viable alternatives for MLPs. Because of their internal similarity to splines and their external similarity to MLPs, KANs are able to optimize learned features with remarkable accuracy, in addition to being able to learn new features. Thus, in this study, we assessed the effectiveness of KANs for complex HSI data classification. Moreover, to enhance the HSI classification accuracy obtained by the KANs, we developed and proposed a hybrid architecture utilizing 1D, 2D, and 3D KANs. To demonstrate the effectiveness of the proposed KAN architecture, we conducted extensive experiments on three newly created HSI benchmark datasets: QUH-Pingan, QUH-Tangdaowan, and QUH-Qingyun. The results underscored the competitive or better capability of the developed hybrid KAN-based model across these benchmark datasets over several other CNN- and ViT-based algorithms, including 1D-CNN, 2D-CNN, 3D-CNN, VGG-16, ResNet-50, EfficientNet, RNN, and ViT. Full article

17 pages, 1307 KB  
Article
Station-Keeping Control of Stratospheric Balloons Based on Simultaneous Optimistic Optimization in Dynamic Wind
by Yuanqiao Fan, Xiaolong Deng, Xixiang Yang, Yuan Long and Fangchao Bai
Electronics 2024, 13(20), 4032; https://doi.org/10.3390/electronics13204032 - 13 Oct 2024
Cited by 1 | Viewed by 1672
Abstract
Stratospheric balloons serve as cost-effective platforms for wireless communication. However, these platforms encounter challenges stemming from their underactuation in the horizontal plane. Consequently, controllers must continually identify favorable wind conditions to optimize station-keeping performance while managing energy consumption. This study presents a receding horizon controller based on wind and balloon models. Two neural networks, PredRNN and ResNet, are utilized for short-term wind field forecast. Additionally, an online receding horizon controller, based on simultaneous optimistic optimization (SOO), is developed for action sequence planning and adapted to accommodate various constraints, which is especially suitable due to its gradient-free nature, high efficiency, and effectiveness in black-box function optimization. A reward function is formulated to balance power consumption and station-keeping performance. Simulations conducted across diverse positions and dates demonstrate the superior performance of the proposed method compared with traditional greedy and A* algorithms. Full article

18 pages, 7434 KB  
Article
Prediction of Jacking Force for Construction of Long-Distance Rectangular Utility Tunnel Using Differential Evolution–Bidirectional Gated Re-Current Unit–Attention Model
by Tianshuang Liu, Juncheng Liu, Yong Tan and Dongdong Fan
Buildings 2024, 14(10), 3169; https://doi.org/10.3390/buildings14103169 - 5 Oct 2024
Cited by 1 | Viewed by 1257
Abstract
Most of the current machine learning algorithms are applied to predict the jacking force required in micro-tunneling; in contrast, few studies about long-distance, large-section jacking projects have been reported in the literature. In this study, an intelligent framework, consisting of a differential evolution (DE), a bidirectional gated re-current unit (BiGRU), and attention mechanisms was developed to automatically identify the optimal hyperparameters and assign weights to the information features, as well as capture the bidirectional temporal features of sequential data. Based on field data from a pipe jacking project crossing underneath a canal, the model’s performance was compared with those of four conventional models (RNN, GRU, BiGRU, and DE–BiGRU). The results indicated that the DE–BiGRU–attention model performed best among these models. Then, the generalization performance of the proposed model in predicting jacking forces was evaluated with the aid of a similar case at the site. It was found that fine-tuning parameters for specific projects is essential for improving the model’s generalization performance. More generally, the proposed prediction model was found to be practically useful to professionals and engineers in making real-time adjustments to jacking parameters, predicting jacking force, and carrying out performance evaluations. Full article
(This article belongs to the Section Construction Management, and Computers & Digitization)

40 pages, 4095 KB  
Article
An End-to-End Scene Text Recognition for Bilingual Text
by Bayan M. Albalawi, Amani T. Jamal, Lama A. Al Khuzayem and Olaa A. Alsaedi
Big Data Cogn. Comput. 2024, 8(9), 117; https://doi.org/10.3390/bdcc8090117 - 9 Sep 2024
Cited by 3 | Viewed by 2178
Abstract
Text localization and recognition from natural scene images has gained a lot of attention recently due to its crucial role in various applications, such as autonomous driving and intelligent navigation. However, two significant gaps exist in this area: (1) prior research has primarily focused on recognizing English text, whereas Arabic text has been underrepresented, and (2) most prior research has adopted separate approaches for scene text localization and recognition, as opposed to one integrated framework. To address these gaps, we propose a novel bilingual end-to-end approach that localizes and recognizes both Arabic and English text within a single natural scene image. Specifically, our approach utilizes pre-trained CNN models (ResNet and EfficientNetV2) with kernel representation for localization text and RNN models (LSTM and BiLSTM) with an attention mechanism for text recognition. In addition, the AraElectra Arabic language model was incorporated to enhance Arabic text recognition. Experimental results on the EvArest, ICDAR2017, and ICDAR2019 datasets demonstrated that our model not only achieves superior performance in recognizing horizontally oriented text but also in recognizing multi-oriented and curved Arabic and English text in natural scene images. Full article

31 pages, 5139 KB  
Review
A Comprehensive Review of Methods for Hydrological Forecasting Based on Deep Learning
by Xinfeng Zhao, Hongyan Wang, Mingyu Bai, Yingjie Xu, Shengwen Dong, Hui Rao and Wuyi Ming
Water 2024, 16(10), 1407; https://doi.org/10.3390/w16101407 - 15 May 2024
Cited by 27 | Viewed by 9984
Abstract
Artificial intelligence has undergone rapid development in the last thirty years and has been widely used in the fields of materials, new energy, medicine, and engineering. Similarly, a growing area of research is the use of deep learning (DL) methods in connection with hydrological time series to better comprehend and expose the changing rules in these time series. Consequently, we provide a review of the latest advancements in employing DL techniques for hydrological forecasting. First, we examine the application of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) in hydrological forecasting, along with a comparison between them. Second, a comparison is made between the basic and enhanced long short-term memory (LSTM) methods for hydrological forecasting, analyzing their improvements, prediction accuracies, and computational costs. Third, the performance of GRUs, along with other models including generative adversarial networks (GANs), residual networks (ResNets), and graph neural networks (GNNs), is estimated for hydrological forecasting. Finally, this paper discusses the benefits and challenges associated with hydrological forecasting using DL techniques, including CNN, RNN, LSTM, GAN, ResNet, and GNN models. Additionally, it outlines the key issues that need to be addressed in the future. Full article

22 pages, 5272 KB  
Article
ECARRNet: An Efficient LSTM-Based Ensembled Deep Neural Network Architecture for Railway Fault Detection
by Salman Ibne Eunus, Shahriar Hossain, A. E. M. Ridwan, Ashik Adnan, Md. Saiful Islam, Dewan Ziaul Karim, Golam Rabiul Alam and Jia Uddin
AI 2024, 5(2), 482-503; https://doi.org/10.3390/ai5020024 - 8 Apr 2024
Cited by 11 | Viewed by 5416
Abstract
Accidents due to defective railway lines and derailments are common disasters that are observed frequently in Southeast Asian countries. It is imperative to run proper diagnosis over the detection of such faults to prevent such accidents. However, manual detection of such faults periodically can be both time-consuming and costly. In this paper, we have proposed a Deep Learning (DL)-based algorithm for automatic fault detection in railway tracks, which we termed an Ensembled Convolutional Autoencoder ResNet-based Recurrent Neural Network (ECARRNet). We compared its output with existing DL techniques in the form of several pre-trained DL models to investigate railway tracks and determine whether they are defective or not while considering commonly prevalent faults such as defects in rails and fasteners. Moreover, we manually collected the images from different railway tracks situated in Bangladesh and built our own dataset. After comparing our proposed model with the existing models, we found that our proposed architecture produced the highest accuracy among all the previously existing state-of-the-art (SOTA) architectures, with an accuracy of 93.28% on the full dataset. Additionally, we split our dataset into two parts having two different types of faults, which are fasteners and rails. We ran the models on those two separate datasets, obtaining accuracies of 98.59% and 92.06% on rail and fastener, respectively. Model explainability techniques like Grad-CAM and LIME were used to validate the result of the models, where our proposed model ECARRNet was seen to correctly classify and detect the regions of faulty railways effectively compared to the previously existing transfer learning models. Full article
(This article belongs to the Section AI Systems: Theory and Applications)