Artificial Neural Networks: Design and Applications

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: closed (31 October 2023) | Viewed by 23231

Special Issue Editors


Prof. Dr. Hong Qu
Guest Editor
School of Computer Science and Engineering, University of Electronic Science and Technology of China, 610054 Chengdu, China
Interests: artificial intelligence; machine learning; neuromorphic computing

Dr. Malu Zhang
Guest Editor
Department of Electrical and Computer Engineering, National University of Singapore, Singapore 119077, Singapore
Interests: neuromorphic computing; deep learning; speech recognition

Dr. Mingsheng Fu
Guest Editor
The Singapore-ETH Centre, National University of Singapore, Singapore 119077, Singapore
Interests: artificial intelligence; machine learning; reinforcement learning

Special Issue Information

Dear Colleagues,

Deep learning, a major driving force behind artificial neural networks, has achieved remarkable progress in fields such as image recognition, speech processing, machine translation, and board games. In particular, deep learning has drastically outperformed traditional methods, becoming the method of choice in many applications. However, many traditional network structures cannot cope with the increasing challenges of modern complex tasks, especially unstructured data and complicated decision-making problems. To address these challenges, various neural network structures and algorithms have been proposed in recent years, such as graph neural networks, spiking neural networks, and deep reinforcement learning. This Special Issue will focus on state-of-the-art neural network models, innovative learning methods, and their applications. We seek contributions on topics that include, but are not limited to:

  • Graph neural networks
  • Deep neural networks
  • Spiking neural networks
  • Computer vision
  • Natural language processing
  • Bayesian methods
  • Multi-agent reinforcement learning
  • Brain-inspired artificial neural networks

Prof. Dr. Hong Qu
Dr. Malu Zhang
Dr. Mingsheng Fu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • graph neural networks
  • deep neural networks
  • spiking neural networks
  • computer vision
  • natural language processing
  • Bayesian methods
  • multi-agent reinforcement learning
  • brain-inspired artificial neural networks

Published Papers (12 papers)

Research

14 pages, 628 KiB  
Article
The Application of Residual Connection-Based State Normalization Method in GAIL
by Yanning Ge, Tao Huang, Xiaoding Wang, Guolong Zheng and Xu Yang
Mathematics 2024, 12(2), 214; https://doi.org/10.3390/math12020214 - 9 Jan 2024
Viewed by 677
Abstract
In the domain of reinforcement learning (RL), deriving efficacious state representations and maintaining algorithmic stability are crucial for optimal agent performance. However, the inherent dynamism of state representations often complicates the normalization procedure. To overcome these challenges, we present an innovative RL framework that integrates state normalization techniques with residual connections and incorporates attention mechanisms into generative adversarial imitation learning (GAIL). This combination not only enhances the expressive capability of state representations, thereby improving the agent's accuracy in state recognition, but also significantly mitigates the common issues of gradient vanishing and explosion. Compared to traditional RL algorithms, GAIL combined with the residual connection-based state normalization method enables the agent to markedly reduce its exploration time, so that reward feedback for the current state can be provided in real time. Empirical evaluations demonstrate the superior efficacy of this methodology across various RL environments.
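The abstract does not spell out the exact architecture; as a rough illustration only, a residual connection wrapped around a state-normalization layer might look like the following PyTorch sketch (the module name and the use of LayerNorm are assumptions, not the authors' specification).

```python
import torch
import torch.nn as nn

class ResidualStateNorm(nn.Module):
    """Hypothetical sketch: normalize state features and add them back to
    the raw state, so normalization stabilizes feature scale while the
    skip path mitigates gradient vanishing/explosion."""
    def __init__(self, state_dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(state_dim)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # y = x + LayerNorm(x)
        return state + self.norm(state)
```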

25 pages, 2969 KiB  
Article
Lightweight and Elegant Data Reduction Strategies for Training Acceleration of Convolutional Neural Networks
by Alexander Demidovskij, Artyom Tugaryov, Aleksei Trutnev, Marina Kazyulina, Igor Salnikov and Stanislav Pavlov
Mathematics 2023, 11(14), 3120; https://doi.org/10.3390/math11143120 - 14 Jul 2023
Viewed by 1560
Abstract
Due to industrial demands to handle increasing amounts of training data, to lower the cost of training one model at a time, and to lessen the ecological effects of intensive computing resource consumption, accelerating the training of deep neural networks has become an exceedingly important challenge. Adaptive Online Importance Sampling (AOIS) and Intellectual Data Selection (IDS) are two new methods for accelerating training presented in this research. On the one hand, Adaptive Online Importance Sampling accelerates neural network training by lowering the number of forward and backward steps depending on how poorly the model can identify a given data sample. On the other hand, Intellectual Data Selection accelerates training by removing semantic redundancies from the training dataset and thereby lowering the number of training steps. The study reports an average 1.9x training acceleration for ResNet50, ResNet18, MobileNet v2, and YOLO v5 on a variety of datasets (CIFAR-100, CIFAR-10, ImageNet 2012, and MS COCO 2017), where training data are reduced by up to five times. Applying Adaptive Online Importance Sampling to ResNet50 training on ImageNet 2012 results in 2.37 times quicker convergence to 71.7% top-1 accuracy, which is within 5% of the baseline. Total training time for the same number of epochs as the baseline is reduced by 1.82 times, with an accuracy drop of 2.45 p.p. The time required for ResNet50 training on ImageNet 2012 with Intellectual Data Selection is decreased by 1.27 times, with a corresponding accuracy decline of 1.12 p.p. Applying both methods to ResNet50 training on ImageNet 2012 results in a 2.31x speedup with an accuracy drop of 3.5 p.p.
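As a loose sketch of the loss-aware sampling idea (not the authors' exact AOIS procedure; the function name and keep-ratio are assumptions), selecting only the hardest samples in a batch for the backward pass could look like this:

```python
import torch

def select_informative(batch_losses: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
    """Illustrative loss-based importance sampling: keep the hardest
    fraction of a batch so backward passes run only on samples the model
    currently fits poorly."""
    k = max(1, int(keep_ratio * batch_losses.numel()))
    _, idx = torch.topk(batch_losses, k)
    return idx  # indices of samples that proceed to the backward step
```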

11 pages, 1091 KiB  
Article
Probabilistic Classification Method of Spiking Neural Network Based on Multi-Labeling of Neurons
by Mingyu Sung, Jaesoo Kim and Jae-Mo Kang
Mathematics 2023, 11(5), 1224; https://doi.org/10.3390/math11051224 - 2 Mar 2023
Viewed by 1421
Abstract
Recently, deep learning has exhibited outstanding performance in various fields. Even though artificial intelligence achieves excellent performance, the amount of energy required for its computations has increased with its development. Hence, the need for a new, energy-efficient computer architecture has emerged, which leads to the neuromorphic computer. Although neuromorphic computing exhibits several advantages, such as low-power parallelism, it achieves lower accuracy than deep learning. The major challenge is therefore to improve accuracy while maintaining the energy efficiency specific to neuromorphic computing. In this paper, we propose a novel inference method that considers the probability, after the learning process is complete, that a neuron reacts to multiple target labels. The proposed method achieves improved accuracy while maintaining the hardware-friendly, low-power parallel processing characteristics of a neuromorphic processor. The method converts the spike counts occurring during learning into probabilities, and inference is conducted by considering all the spikes that occur, implementing the interaction between neurons. The inference circuit is expected to show a significant reduction in hardware cost while affording competitive computing performance.
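A rough illustration of the multi-labeling idea, based only on the abstract (array shapes and the normalization are assumptions, not the paper's exact scheme): per-neuron spike counts accumulated during training are turned into label probabilities, and test-time spikes vote through them.

```python
import numpy as np

def label_probabilities(spike_counts: np.ndarray) -> np.ndarray:
    """Turn per-neuron spike counts accumulated during training
    (rows: neurons, cols: labels) into per-neuron label probabilities."""
    totals = spike_counts.sum(axis=1, keepdims=True)
    return spike_counts / np.maximum(totals, 1)

def classify(test_spikes: np.ndarray, neuron_label_probs: np.ndarray) -> int:
    # Weight each neuron's test-time spike count by its label probabilities
    # and pick the label with the largest accumulated evidence.
    evidence = test_spikes @ neuron_label_probs
    return int(np.argmax(evidence))
```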

16 pages, 7147 KiB  
Article
Hybrid Traffic Accident Classification Models
by Yihang Zhang and Yunsick Sung
Mathematics 2023, 11(4), 1050; https://doi.org/10.3390/math11041050 - 19 Feb 2023
Cited by 1 | Viewed by 1792
Abstract
Traffic closed-circuit television (CCTV) devices can be used to detect and track objects on roads by designing and applying artificial intelligence and deep learning models. However, extracting useful information from the detected objects and determining the occurrence of traffic accidents are usually difficult. This paper proposes a CCTV frame-based hybrid traffic accident classification model that identifies whether a frame includes an accident by generating object trajectories. The proposed model utilizes a Vision Transformer (ViT) and a Convolutional Neural Network (CNN) to extract latent representations from each frame and the corresponding trajectories. Frame and trajectory features are fused to improve the traffic accident classification ability of the proposed hybrid method. In the experiments, the Car Accident Detection and Prediction (CADP) dataset was used to train the hybrid model, and the accuracy of the model was approximately 97%. The experimental results indicate that the proposed hybrid method achieves improved classification performance compared to traditional models.
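As a minimal sketch of the fusion step (the encoder choices, feature sizes, and binary head below are assumptions, not the authors' configuration), concatenating ViT frame features with CNN trajectory features might look like:

```python
import torch
import torch.nn as nn

class HybridAccidentClassifier(nn.Module):
    """Hypothetical fusion sketch: ViT features per frame (assumed
    precomputed) plus a small CNN over a rasterized trajectory map,
    concatenated and fed to a binary accident head."""
    def __init__(self, frame_dim: int = 768, traj_dim: int = 128):
        super().__init__()
        self.traj_cnn = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, traj_dim),
        )
        self.head = nn.Linear(frame_dim + traj_dim, 2)  # accident / no accident

    def forward(self, frame_feats: torch.Tensor, traj_map: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([frame_feats, self.traj_cnn(traj_map)], dim=1)
        return self.head(fused)
```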

19 pages, 10812 KiB  
Article
A Study on the Low-Power Operation of the Spike Neural Network Using the Sensory Adaptation Method
by Mingi Jeon, Taewook Kang, Jae-Jin Lee and Woojoo Lee
Mathematics 2022, 10(22), 4191; https://doi.org/10.3390/math10224191 - 9 Nov 2022
Viewed by 1223
Abstract
Motivated by the idea that there should be a close relationship between biological significance and the low-power operation of spiking neural networks (SNNs), this paper focuses on spike-frequency adaptation, whose conventional formulation deviates significantly from its biological counterpart, and develops a new spike-frequency adaptation with more biological characteristics. The paper proposes a sensory adaptation method that reflects the mechanisms of the human sensory organs, and studies network architectures and neuron models for the proposed method. It then introduces a dedicated SNN simulator that can selectively apply either conventional spike-frequency adaptation or the proposed method, and provides functional verification and an effectiveness evaluation of the method. Through intensive simulation, the paper reveals that the proposed method can produce training and testing performance similar to the conventional method while significantly reducing the number of spikes, to 32.66% and 45.63%, respectively. Furthermore, the paper contributes to SNN research by showing, through in-depth analysis, that embedding biological meaning in SNNs may be closely related to their low-power operation.
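For intuition, spike-frequency adaptation is often modeled as a spike-triggered threshold increase that decays over time; the sketch below is a generic adaptive-threshold LIF neuron (all constants and the update rule are illustrative assumptions, not the paper's sensory-adaptation model).

```python
def lif_with_adaptation(inputs, dt=1.0, tau=20.0, v_th0=1.0,
                        beta=0.2, tau_a=100.0):
    """Generic adaptive-threshold LIF sketch: each spike raises an
    adaptation variable that temporarily lifts the threshold, suppressing
    redundant spikes under sustained stimulation."""
    v, a, spikes = 0.0, 0.0, []
    for x in inputs:
        v += dt * (-v + x) / tau      # leaky integration of the input
        a -= dt * a / tau_a           # adaptation decays back to zero
        if v >= v_th0 + a:            # adaptive firing threshold
            spikes.append(1)
            v = 0.0                   # reset membrane potential
            a += beta                 # raise the threshold after a spike
        else:
            spikes.append(0)
    return spikes
```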

23 pages, 834 KiB  
Article
Rectifying Ill-Formed Interlingual Space: A Framework for Zero-Shot Translation on Modularized Multilingual NMT
by Junwei Liao and Yu Shi
Mathematics 2022, 10(22), 4178; https://doi.org/10.3390/math10224178 - 9 Nov 2022
Viewed by 1136
Abstract
The multilingual neural machine translation (NMT) model can handle translation between more than one language pair. From the perspective of industrial applications, the modularized multilingual NMT model (M2 model), which only shares modules between the same languages, is a practical alternative to the model that shares one encoder and one decoder (1-1 model). Previous work has shown that the M2 model can benefit from multiway training without suffering from capacity bottlenecks, and that it exhibits better performance than the 1-1 model. However, the M2 model trained on English-centric data is incapable of zero-shot translation due to its ill-formed interlingual space. In this study, we propose a framework to help the M2 model form an interlingual space for zero-shot translation. Using this framework, we devise an approach that combines multiway training with a denoising autoencoder task and incorporates a Transformer attention bridge module based on the attention mechanism. We show experimentally that the proposed method forms an improved interlingual space in two zero-shot experiments. Our findings further extend the use of the M2 model for multilingual translation in industrial applications.
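An attention bridge is commonly realized as a small set of learned queries attending over encoder states, producing a fixed-size, language-independent representation; the sketch below is one plausible form (the dimensions and this exact module are assumptions, not the authors' design).

```python
import torch
import torch.nn as nn

class AttentionBridge(nn.Module):
    """Sketch of an attention-bridge layer: learned queries attend over
    variable-length, language-specific encoder states and map them into a
    fixed-size shared (interlingual) representation."""
    def __init__(self, d_model: int = 512, n_queries: int = 16, n_heads: int = 8):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_queries, d_model))
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, enc_states: torch.Tensor) -> torch.Tensor:  # (B, T, d_model)
        q = self.queries.unsqueeze(0).expand(enc_states.size(0), -1, -1)
        bridged, _ = self.attn(q, enc_states, enc_states)
        return bridged                                            # (B, n_queries, d_model)
```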

15 pages, 1434 KiB  
Article
Noise-Regularized Advantage Value for Multi-Agent Reinforcement Learning
by Siying Wang, Wenyu Chen, Jian Hu, Siyue Hu and Liwei Huang
Mathematics 2022, 10(15), 2728; https://doi.org/10.3390/math10152728 - 2 Aug 2022
Viewed by 1918
Abstract
Leveraging global state information to enhance policy optimization is a common approach in multi-agent reinforcement learning (MARL). Even with this supplementary state information, agents still suffer from insufficient exploration during training. Moreover, training with batch-sampled examples from the replay buffer induces a policy overfitting problem, i.e., multi-agent proximal policy optimization (MAPPO) may not perform as well as independent PPO (IPPO), even with additional information in the centralized critic. In this paper, we propose a novel noise-injection method to regularize the agents' policies and mitigate the overfitting issue. We analyze the cause of policy overfitting in actor-critic MARL and design two specific patterns of noise injection, applied to the advantage function as random Gaussian noise, to stabilize training and enhance performance. Experimental results on the Matrix Game and StarCraft II show the higher training efficiency and superior performance of our method, and ablation studies indicate that our method maintains higher entropy in the agents' policies during training, which leads to more exploration.
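In its simplest form, the idea amounts to perturbing advantage estimates before the policy update; the sketch below shows one such pattern (the noise scale and placement are assumptions; the paper's two injection patterns differ in detail).

```python
import torch

def noisy_advantage(advantages: torch.Tensor, sigma: float = 0.1) -> torch.Tensor:
    """Illustrative noise regularization: add zero-mean Gaussian noise to
    advantage estimates before the PPO policy-gradient update, discouraging
    the policy from overfitting replayed batches."""
    return advantages + sigma * torch.randn_like(advantages)
```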

20 pages, 3422 KiB  
Article
Artificial Neural Networking (ANN) Model for Drag Coefficient Optimization for Various Obstacles
by Khalil Ur Rehman, Andaç Batur Çolak and Wasfi Shatanawi
Mathematics 2022, 10(14), 2450; https://doi.org/10.3390/math10142450 - 14 Jul 2022
Cited by 12 | Viewed by 2067
Abstract
For various obstacles in the path of a flowing liquid stream, an artificial neural network (ANN) model is constructed to study the hydrodynamic force, which depends on the object. Multilayer perceptron (MLP), back-propagation (BP), and feed-forward (FF) network models were employed to create the ANN model, which has high prediction accuracy and a robust structure. Specifically, circular-, octagon-, hexagon-, square-, and triangular-shaped cylinders are installed in a rectangular channel. The fluid enters from the left wall of the channel following one of two velocity profiles, namely linear and parabolic. The no-slip condition is maintained on the channel's upper and bottom walls, and the Neumann condition is applied at the outlet. The entire physical configuration is governed mathematically by the flow equations. The problem is solved using the finite element approach, with the LBB-stable finite element pair and a hybrid meshing scheme. The drag coefficient values are calculated by line integration around the installed obstructions for both the linear and parabolic profiles, and the developed ANN model predicts them with high accuracy for the various obstacles.
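As a hedged illustration of such a surrogate (the input encoding, layer sizes, and activations are assumptions; the paper's network may differ), an MLP regressor from obstacle/inflow descriptors to the drag coefficient could be set up as:

```python
import torch.nn as nn

# Hypothetical drag-coefficient surrogate: inputs might one-hot encode the
# obstacle shape (5 shapes) plus a flag for the inflow profile; the output
# is the predicted drag coefficient, trained on FEM-computed samples.
drag_mlp = nn.Sequential(
    nn.Linear(6, 32), nn.Tanh(),
    nn.Linear(32, 32), nn.Tanh(),
    nn.Linear(32, 1),
)
loss_fn = nn.MSELoss()  # regression against line-integrated drag values
```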

19 pages, 3323 KiB  
Article
Artificial Neural Networking (ANN) Model for Convective Heat Transfer in Thermally Magnetized Multiple Flow Regimes with Temperature Stratification Effects
by Khalil Ur Rehman, Andaç Batur Çolak and Wasfi Shatanawi
Mathematics 2022, 10(14), 2394; https://doi.org/10.3390/math10142394 - 7 Jul 2022
Cited by 9 | Viewed by 2030
Abstract
Convective heat transfer in non-Newtonian fluid flow in the presence of temperature stratification, heat generation, and heat absorption effects is examined using artificial neural networks. The heat transfer rate is studied for four different thermal flow regimes, namely (I) thermal flow toward a flat surface with thermal radiation, (II) thermal flow toward a flat surface without thermal radiation, (III) thermal flow over a cylindrical surface with thermal radiation, and (IV) thermal flow over a cylindrical surface without thermal radiation. For each regime, the Nusselt number is computed and used to construct an artificial neural network model. The model's prediction performance is reported for varied neuron numbers and input parameters, and the results are assessed. The ANN model is designed using the Bayesian regularization training procedure, and a high-performing MLP network model is used. Of the data used to create the MLP network, 80 percent was used for model training and 20 percent for testing. The results show a high degree of agreement between the values predicted by the ANN model and the target values. We found that an artificial neural network model can provide highly efficient forecasts of heat transfer rates from an engineering standpoint. For both the flat and cylindrical surfaces, the heat transfer normal to the surface increases with the Prandtl number and the heat absorption parameter, while the opposite is the case for the temperature stratification and heat generation parameters. Notably, the magnitude of heat transfer is significantly larger for Flow Regime IV than for Flow Regimes I, II, and III.
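The 80/20 setup can be sketched as follows (the paper trains with Bayesian regularization, e.g. a trainbr-style procedure; here an L2-regularized scikit-learn MLP stands in as an assumption, and the data are placeholders, not the paper's dataset):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Placeholder data: inputs could be Pr, stratification, heat generation/
# absorption parameters; the target is the Nusselt number.
X = np.random.rand(500, 4)
y = np.random.rand(500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.8)
model = MLPRegressor(hidden_layer_sizes=(20,), alpha=1e-3, max_iter=2000)
model.fit(X_tr, y_tr)
print("R^2 on held-out 20%:", model.score(X_te, y_te))
```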

14 pages, 2536 KiB  
Article
Computing Frequency-Dependent Hysteresis Loops and Dynamic Energy Losses in Soft Magnetic Alloys via Artificial Neural Networks
by Simone Quondam Antonio, Francesco Riganti Fulginei, Gabriele Maria Lozito, Antonio Faba, Alessandro Salvini, Vincenzo Bonaiuto and Fausto Sargeni
Mathematics 2022, 10(13), 2346; https://doi.org/10.3390/math10132346 - 4 Jul 2022
Cited by 4 | Viewed by 1732
Abstract
A neural network model to predict the dynamic hysteresis loops and the energy-loss curves (i.e., the energy versus the amplitude of the magnetic induction) of soft ferromagnetic materials at different operating frequencies is proposed herein. Firstly, an innovative Fe-Si magnetic alloy, grade 35H270, is experimentally characterized via an Epstein frame over a wide range of frequencies, from 1 Hz up to 600 Hz. Part of the dynamic hysteresis loops obtained through the experiments is used to train a feedforward neural network, while the remaining loops are used to validate the model. The training procedure is carefully designed to first identify the optimum network architecture (i.e., the number of hidden layers and the number of neurons per layer) and then to effectively train the network. The model proves capable of reproducing the magnetization processes and predicting the dynamic energy losses of the examined material over the whole range of inductions and frequencies considered. In addition, its computational and memory efficiency makes the model a useful tool in the design stage of electrical machines and magnetic components.
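The architecture-selection step the abstract describes could be sketched as a simple grid search over depth and width (an assumed procedure with cross-validated scoring; the authors' exact selection criterion is not stated here):

```python
from itertools import product

from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor

def best_architecture(X, y, depths=(1, 2, 3), widths=(8, 16, 32)):
    """Assumed selection procedure: cross-validated grid search over the
    number of hidden layers (depth) and neurons per layer (width)."""
    best, best_score = None, -float("inf")
    for d, w in product(depths, widths):
        net = MLPRegressor(hidden_layer_sizes=(w,) * d, max_iter=5000)
        score = cross_val_score(net, X, y, cv=3).mean()
        if score > best_score:
            best, best_score = (d, w), score
    return best  # retrain a net of this shape on the full training loops
```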

14 pages, 558 KiB  
Article
Robust Sparse Bayesian Learning Scheme for DOA Estimation with Non-Circular Sources
by Linlu Jian, Xianpeng Wang, Jinmei Shi and Xiang Lan
Mathematics 2022, 10(6), 923; https://doi.org/10.3390/math10060923 - 14 Mar 2022
Cited by 3 | Viewed by 1727
Abstract
In this paper, a robust DOA estimation scheme based on sparse Bayesian learning (SBL) is proposed for non-circular signals under impulse noise and mutual coupling (MC). Firstly, the Toeplitz property of the MC matrix is used to eliminate the effect of array mutual coupling, and the array aperture is extended by exploiting the properties of the non-circular signal. To eliminate the effect of impulse noise, the outlier part of the noise is reconstructed together with the original signal in the signal matrix, and a coarse DOA estimate is obtained by balancing the accuracy and efficiency of parameter estimation using an alternating SBL update algorithm. Finally, a one-dimensional search in the vicinity of the detected spectral peaks yields a high-precision DOA estimate. The effectiveness and robustness of the algorithm in dealing with the above errors are demonstrated by extensive simulations.
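The final refinement step can be pictured as a fine grid search around each coarse estimate; the sketch below assumes a callable spectrum_fn returning the SBL spatial spectrum at a given angle (the name and grid parameters are illustrative assumptions):

```python
import numpy as np

def refine_peak(spectrum_fn, coarse_doa: float,
                half_width: float = 1.0, step: float = 0.01) -> float:
    """One-dimensional refinement: evaluate the spatial spectrum on a fine
    grid around a coarse DOA estimate and return the angle (in degrees)
    that maximizes it."""
    grid = np.arange(coarse_doa - half_width, coarse_doa + half_width, step)
    return float(grid[np.argmax([spectrum_fn(theta) for theta in grid])])
```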

2844 KiB  
Article
An Investigation on Spiking Neural Networks Based on the Izhikevich Neuronal Model: Spiking Processing and Hardware Approach
by Abdulaziz S. Alkabaa, Osman Taylan, Mustafa Tahsin Yilmaz, Ehsan Nazemi and El Mostafa Kalmoun
Mathematics 2022, 10(4), 612; https://doi.org/10.3390/math10040612 - 16 Feb 2022
Cited by 5 | Viewed by 4297 | Correction
Abstract
The Central Nervous System (CNS) is the principal organ of the biological system and influences the other basic organs of the human body. The basic elements of this important organ are neurons, synapses, and glia (such as astrocytes, which constitute the highest percentage of glia in the human brain). Investigating, modeling, simulating, and implementing (realizing) different parts of the CNS in hardware are important steps toward a comprehensive neuronal system capable of emulating all aspects of the real nervous system. This paper uses a basic neuron model, the Izhikevich neuronal model, to achieve a high-fidelity copy of the primary nervous block that is capable of regenerating the behaviors of the human brain. The proposed approach can regenerate all aspects of the Izhikevich neuron with a high degree of similarity and performance. The new model is based on Look-Up Table (LUT) modeling of the mathematical neuromorphic system and can be realized with a high degree of correlation with the original model. The proposed procedure is considered in three cases: 100-point LUT modeling, 1000-point LUT modeling, and 10,000-point LUT modeling. Indeed, by removing the high-cost functions of the original model, the presented model can be implemented with low error, high speed, and low area resources in comparison with the original system. To test and validate the proposed hardware, a digital FPGA board (Xilinx Virtex-II) is used. Digital hardware synthesis illustrates that the presented approach can follow the Izhikevich neuron at higher speed than the original model, increase efficiency, and reduce overhead costs. Implementation results show an overall saving of 84.30% in FPGA resources and a maximum operating frequency of about 264 MHz for the proposed model, significantly higher than the 28 MHz of the original model.
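For concreteness, the LUT idea can be sketched on the standard Izhikevich equations, where the costly quadratic term 0.04v² is precomputed into a table (table size 1000 here, matching one of the paper's three cases; the discretization range and Euler step are assumptions):

```python
import numpy as np

# Precompute 0.04 * v^2 over a clipped membrane-potential range, trading a
# runtime multiplication for a table read (FPGA-friendly).
V_MIN, V_MAX, N = -80.0, 40.0, 1000
lut = 0.04 * np.linspace(V_MIN, V_MAX, N) ** 2

def izhikevich_step(v, u, I, a=0.02, b=0.2, c=-65.0, d=8.0, dt=1.0):
    """One Euler step of the Izhikevich neuron with the quadratic term
    read from the LUT instead of being computed."""
    idx = int((np.clip(v, V_MIN, V_MAX) - V_MIN) / (V_MAX - V_MIN) * (N - 1))
    dv = lut[idx] + 5.0 * v + 140.0 - u + I   # 0.04*v^2 read from the LUT
    du = a * (b * v - u)
    v, u = v + dt * dv, u + dt * du
    if v >= 30.0:                             # spike and reset
        v, u = c, u + d
    return v, u
```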
