Search Results (14)

Search Parameters:
Keywords = multi-step deep Q learning network

29 pages, 3320 KB  
Article
Risk-Aware Crypto Price Prediction Using DQN with Volatility-Adjusted Rewards Across Multi-Period State Representations
by Otabek Sattarov and Fazliddin Makhmudov
Mathematics 2025, 13(18), 3012; https://doi.org/10.3390/math13183012 - 18 Sep 2025
Viewed by 1591
Abstract
Forecasting Bitcoin prices remains a complex task due to the asset’s inherent and significant volatility. Traditional reinforcement learning (RL) models often rely on a single observation from the time series, potentially missing short-term patterns that could enhance prediction performance. This study presents a Deep Q-Network (DQN) model that uses a multi-step state representation, incorporating consecutive historical timesteps to reflect recent market behavior more accurately. By doing so, the model can more effectively identify short-term trends under volatile conditions. Additionally, we propose a novel reward mechanism that adjusts for volatility by penalizing large prediction errors more heavily during periods of high market volatility, thereby encouraging more risk-aware forecasting behavior. We validate the effectiveness of our approach through extensive experiments on Bitcoin data across minute-level, hourly, and daily timeframes. The proposed model achieves notable results, including a Mean Absolute Percentage Error (MAPE) of 10.12%, a Root Mean Squared Error (RMSE) of 815.33, and a Value-at-Risk (VaR) of 0.04. These outcomes demonstrate the advantages of integrating short-term temporal features and volatility sensitivity into RL frameworks for more reliable cryptocurrency price prediction.
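The abstract describes the volatility-adjusted reward only at a high level. A minimal sketch of one plausible formulation, assuming a rolling standard deviation of log returns as the volatility estimate and an error penalty scaled by it; both function names and the scaling constant `alpha` are illustrative, not from the paper:

```python
import numpy as np

def rolling_volatility(prices, window=24):
    """Rolling standard deviation of log returns as a simple volatility estimate."""
    returns = np.diff(np.log(prices))
    return np.std(returns[-window:])

def volatility_adjusted_reward(predicted, actual, volatility, alpha=1.0):
    """Penalize prediction error more heavily when volatility is high.

    One plausible reading of the abstract: the absolute error is scaled by
    (1 + alpha * volatility), so the same error costs more in turbulent
    markets, encouraging risk-aware forecasts.
    """
    error = abs(predicted - actual)
    return -error * (1.0 + alpha * volatility)

# Toy usage on a synthetic price path
prices = np.cumsum(np.random.randn(100)) + 30000.0
vol = rolling_volatility(prices)
print(volatility_adjusted_reward(30150.0, 30000.0, vol))
```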

25 pages, 2304 KB  
Article
From Anatomy to Genomics Using a Multi-Task Deep Learning Approach for Comprehensive Glioma Profiling
by Akmalbek Abdusalomov, Sabina Umirzakova, Obidjon Bekmirzaev, Adilbek Dauletov, Abror Buriboev, Alpamis Kutlimuratov, Akhram Nishanov, Rashid Nasimov and Ryumduck Oh
Bioengineering 2025, 12(9), 979; https://doi.org/10.3390/bioengineering12090979 - 15 Sep 2025
Viewed by 731
Abstract
Background: Gliomas are among the most complex and lethal primary brain tumors, necessitating precise evaluation of both anatomical subregions and molecular alterations for effective clinical management. Methods: To address the disconnected nature of current bioimage analysis pipelines, in which MRI-based anatomical segmentation and molecular biomarker prediction are handled as separate tasks, we introduce Molecular-Genomic and Multi-Task Net (MGMT-Net), a single deep learning framework that operates on multi-modal MRI data without any conversion. MGMT-Net incorporates a novel Cross-Modality Attention Fusion (CMAF) module that dynamically integrates diverse imaging sequences and pairs them with a hybrid Transformer–Convolutional Neural Network (CNN) encoder to capture both global context and local anatomical detail. This architecture supports dual-task decoders, enabling concurrent voxel-wise tumor delineation and subject-level classification of key genomic markers, including the IDH gene mutation, the 1p/19q co-deletion, and the TERT gene promoter mutation. Results: Extensive validation on the Brain Tumor Segmentation (BraTS 2024) dataset and the combined Cancer Genome Atlas/Erasmus Glioma Database (TCGA/EGD) datasets demonstrated high segmentation accuracy and robust biomarker classification performance, with strong generalizability across external institutional cohorts. Ablation studies further confirmed the importance of each architectural component to overall robustness. Conclusions: MGMT-Net presents a scalable and clinically relevant solution that bridges radiological imaging and genomic insights, potentially reducing diagnostic latency and enhancing precision in neuro-oncology decision-making. By integrating spatial and genetic analysis within a single model, this work represents a significant step toward comprehensive, AI-driven glioma assessment.
(This article belongs to the Special Issue Mathematical Models for Medical Diagnosis and Testing)
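The architecture beyond the abstract is not shown here; a minimal PyTorch sketch of the dual-task idea only — a shared encoder feeding a voxel-wise segmentation head and a subject-level biomarker head trained with a weighted joint loss. All layer sizes, the toy volume, and the loss weight are illustrative, and the CMAF module and Transformer encoder are omitted:

```python
import torch
import torch.nn as nn

class DualTaskNet(nn.Module):
    """Toy shared-encoder model with a segmentation head (voxel-wise tumor
    labels) and a classification head (subject-level genomic markers)."""
    def __init__(self, in_ch=4, n_classes=3, n_markers=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Conv3d(32, n_classes, 1)
        self.cls_head = nn.Sequential(
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(32, n_markers)
        )

    def forward(self, x):
        h = self.encoder(x)
        return self.seg_head(h), self.cls_head(h)

model = DualTaskNet()
mri = torch.randn(1, 4, 16, 16, 16)         # 4 MRI sequences, toy 16^3 volume
seg_logits, marker_logits = model(mri)
seg_loss = nn.functional.cross_entropy(
    seg_logits, torch.zeros(1, 16, 16, 16, dtype=torch.long))
cls_loss = nn.functional.binary_cross_entropy_with_logits(
    marker_logits, torch.zeros(1, 3))        # e.g., IDH, 1p/19q, TERT labels
loss = seg_loss + 0.5 * cls_loss             # joint objective; weight illustrative
loss.backward()
```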

19 pages, 2833 KB  
Article
Research on AGV Path Planning Based on Improved DQN Algorithm
by Qian Xiao, Tengteng Pan, Kexin Wang and Shuoming Cui
Sensors 2025, 25(15), 4685; https://doi.org/10.3390/s25154685 - 29 Jul 2025
Viewed by 851
Abstract
Traditional deep reinforcement learning methods suffer from slow convergence and poor adaptability in complex environments and are prone to falling into local optima in AGV system applications. To address these issues, this paper proposes an adaptive path planning algorithm based on an improved Deep Q-Network, called the B-PER DQN algorithm. Firstly, a dynamic temperature adjustment mechanism is constructed, in which the temperature parameter of the Boltzmann exploration strategy is adaptively adjusted by analyzing the trend of a recent reward window. Next, a prioritized experience replay mechanism is introduced to improve training efficiency and task diversity through experience-graded sampling and random obstacle configuration. Then, a refined multi-objective reward function is designed, combining directional guidance, step penalties, and a goal reward, to effectively guide the agent in learning an efficient path. Our experimental results show that, compared with other algorithms, the improved algorithm achieves a higher success rate and faster convergence in the same environment, offering an efficient and adaptive reinforcement learning solution for path planning in complex environments.
(This article belongs to the Special Issue Intelligent Control and Robotic Technologies in Path Planning)
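A minimal sketch of the adaptive Boltzmann exploration idea described above, assuming the temperature is nudged up (more exploration) when the recent reward window trends downward and down when it improves; the trend test and scaling factors are illustrative, not the paper's exact rule:

```python
import numpy as np

def boltzmann_action(q_values, temperature):
    """Sample an action from a softmax over Q-values at the given temperature."""
    logits = np.asarray(q_values) / max(temperature, 1e-6)
    logits -= logits.max()                       # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return np.random.choice(len(q_values), p=probs)

def adapt_temperature(temperature, reward_window, up=1.05, down=0.95,
                      t_min=0.1, t_max=5.0):
    """Cool down when recent rewards improve, heat up when they degrade."""
    half = len(reward_window) // 2
    trend = np.mean(reward_window[half:]) - np.mean(reward_window[:half])
    temperature *= down if trend > 0 else up
    return float(np.clip(temperature, t_min, t_max))

# Toy usage
temp = 1.0
rewards = [0.1, 0.2, 0.15, 0.3, 0.4, 0.35]
temp = adapt_temperature(temp, rewards)
print(boltzmann_action([1.2, 0.8, 0.5, 1.0], temp))
```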

27 pages, 3479 KB  
Article
A Hybrid IVFF-AHP and Deep Reinforcement Learning Framework for an ATM Location and Routing Problem
by Bahar Yalcin Kavus, Kübra Yazici Sahin, Alev Taskin and Tolga Kudret Karaca
Appl. Sci. 2025, 15(12), 6747; https://doi.org/10.3390/app15126747 - 16 Jun 2025
Viewed by 1006
Abstract
The impact of alternative distribution channels, such as bank Automated Teller Machines (ATMs), on the financial industry is growing due to technological advancements. Investing in ideal locations is critical for new ATM companies. Given the many factors to be evaluated, this study addresses the problem of determining the best locations for ATMs deployed across Istanbul's districts using a multi-criteria decision-making framework. Furthermore, fuzzy logic is used to convert expert opinions into mathematical expressions and incorporate them into the decision-making process. For the first time in the literature, a model is proposed for ATM location selection that integrates clustering with the interval-valued Fermatean fuzzy analytic hierarchy process (IVFF-AHP). With the proposed methodology, the districts of Istanbul are first clustered to identify the risky ones. Then, the most suitable alternative location in the risky district is determined using IVFF-AHP. After the ATM locations are decided with IVFF-AHP, in the last step, a Double Deep Q-Network reinforcement learning model is used to optimize the Cash-in-Transit (CIT) vehicle route. The results reveal that the proposed approach provides stable, efficient, and adaptive routing for real-world CIT operations.

25 pages, 7158 KB  
Article
Anti-Jamming Decision-Making for Phased-Array Radar Based on Improved Deep Reinforcement Learning
by Hang Zhao, Hu Song, Rong Liu, Jiao Hou and Xianxiang Yu
Electronics 2025, 14(11), 2305; https://doi.org/10.3390/electronics14112305 - 5 Jun 2025
Viewed by 1433
Abstract
In existing phased-array radar systems, anti-jamming strategies are mainly generated through manual judgment. However, manually designing or selecting anti-jamming decisions is often difficult and unreliable in complex jamming environments. Reinforcement learning has therefore been applied to anti-jamming decision-making, but existing decision-making models based on reinforcement learning often suffer from slow convergence and low decision-making accuracy. In this paper, a multi-aspect improved deep Q-network (MAI-DQN) is proposed that improves the exploration policy, the network structure, and the training methods of the deep Q-network. To address the ϵ-greedy strategy’s strong dependence on hyperparameter settings, and the excessive influence of the action on the Q-value in other deep Q-networks, this paper proposes a structure that combines a noisy network, a dueling network, and a double deep Q-network, which incorporates an adaptive exploration policy into the neural network and increases the influence of the state itself on the Q-value. These enhancements enable a highly adaptive exploration strategy and a high-performance network architecture, thereby improving the decision-making accuracy of the model. To compute the target value more accurately during training and improve the stability of parameter updates, this paper proposes a training method that combines n-step learning, soft target updates, a variable learning rate, and gradient clipping. In addition, a novel variable double-depth priority experience replay (VDDPER) method, which more closely mimics the storage and update mechanism of human memory, is used in the MAI-DQN. VDDPER improves decision-making accuracy by dynamically adjusting the sample size according to the value of experiences during training, enhancing exploration in the early stages and placing greater emphasis on high-value experiences in the later stages. These enhancements to the training method improve the model’s convergence speed. Moreover, a reward function combining signal-level and data-level benefits is proposed to adapt to complex jamming environments, ensuring fast reward convergence with fewer computational resources. Simulation results show that the proposed phased-array radar anti-jamming decision-making method based on MAI-DQN achieves a high convergence speed and high decision-making accuracy in environments where deceptive jamming and suppressive jamming coexist.
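A compact PyTorch sketch of the training-side ingredients named above (an n-step double-DQN target, soft target updates, and gradient clipping); the batch layout, hyperparameters, and toy networks are illustrative, and the noisy/dueling heads and the VDDPER buffer are omitted:

```python
import torch
import torch.nn as nn

def soft_update(target_net, online_net, tau=0.005):
    """Polyak-average online weights into the target network (soft update)."""
    with torch.no_grad():
        for t, o in zip(target_net.parameters(), online_net.parameters()):
            t.mul_(1.0 - tau).add_(tau * o)

def train_step(online, target, optimizer, batch, gamma=0.99, n=3, clip=10.0):
    """One double-DQN update using an n-step return, with gradient clipping."""
    s, a, r_n, s_n, done = batch                 # r_n: discounted n-step return
    q = online(s).gather(1, a)
    with torch.no_grad():
        next_a = online(s_n).argmax(1, keepdim=True)         # double-DQN action
        target_q = r_n + (gamma ** n) * (1 - done) * target(s_n).gather(1, next_a)
    loss = nn.functional.smooth_l1_loss(q, target_q)
    optimizer.zero_grad()
    loss.backward()
    nn.utils.clip_grad_norm_(online.parameters(), clip)      # gradient clipping
    optimizer.step()
    soft_update(target, online)                              # soft target update
    return loss.item()

# Toy usage with illustrative shapes: 4-dim states, 2 actions, batch of 8
online = nn.Linear(4, 2)
target = nn.Linear(4, 2)
target.load_state_dict(online.state_dict())
opt = torch.optim.Adam(online.parameters(), lr=1e-3)
batch = (torch.randn(8, 4), torch.randint(0, 2, (8, 1)),
         torch.randn(8, 1), torch.randn(8, 4), torch.zeros(8, 1))
print(train_step(online, target, opt, batch))
```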

48 pages, 14298 KB  
Article
A Multi-Level Speed Guidance Cooperative Approach Based on Bidirectional Periodic Green Wave Coordination Under Intelligent and Connected Environment
by Luxi Dong, Xiaolan Xie, Lieping Zhang, Shuiwang Li and Zhiqian Yang
Sensors 2025, 25(7), 2114; https://doi.org/10.3390/s25072114 - 27 Mar 2025
Viewed by 781
Abstract
To maximize arterial green wave bandwidth utilization, this study aims to minimize average travel delay at coordinated intersections and maximize vehicle throughput. To this end, this paper presents a collaborative optimization method for controlling groups of related intersections that combines multi-level speed guidance with green wave coordinated control. In an intelligent and connected environment (ICE), the driving trajectory of the initial vehicle is determined in each optimization cycle after active speed guidance is received. The trajectories of subsequent vehicles are then calculated, assessing whether each can clear the intersection before the end of the green phase. Next, a characteristic index is calculated, comprising the average speed of the arterial coordination section and its corresponding phase offset. The phase offset is then optimized to maximize the comprehensive green wave bandwidth within the control range, taking the maximum average speed and the bidirectional cyclic comprehensive green wave bandwidth as control objectives. A model is then constructed by combining multi-level vehicle speed guidance with bidirectional cyclic green wave coordinated control, and a bi-level combinatorial optimization method built on combined deep Q-learning, named Deep Q Network-Genetic Algorithm (DQNGA), is used to obtain the global optimal solution. Finally, the method is validated using traffic flow data and map sensor data from several associated road sections in a city. The results demonstrate that the proposed method reduces the average delay and the number of stops by 20.76% and 44.49%, respectively, outperforming conventional traffic control strategies. This suggests that the inefficient use of green time in arterial coordinated signal control is effectively addressed, enhancing intersection efficiency in the intelligent and connected environment.
(This article belongs to the Section Vehicular Sensing)

15 pages, 3322 KB  
Article
Development of a Fleet Management System for Multiple Robots’ Task Allocation Using Deep Reinforcement Learning
by Yanyan Dai, Deokgyu Kim and Kidong Lee
Processes 2024, 12(12), 2921; https://doi.org/10.3390/pr12122921 - 20 Dec 2024
Cited by 3 | Viewed by 2984
Abstract
This paper presents a fleet management system (FMS) for multiple robots, utilizing deep reinforcement learning (DRL) for dynamic task allocation and path planning. The proposed approach enables robots to autonomously optimize task execution, selecting the shortest and safest paths to target points. A deep Q-network (DQN)-based algorithm evaluates path efficiency and safety in complex environments and dynamically selects the optimal robot to complete each task. Simulation results in a Gazebo environment demonstrate that Robot 2 achieved a path 20% shorter than the other robots while successfully completing its task. Training results reveal that Robot 1 reduced its cost by 50% within the first 50 steps and stabilized at near-optimal performance after 1000 steps; Robot 2 converged after 4000 steps with minor fluctuations; and Robot 3 exhibited a steep cost reduction, converging after 10,000 steps. The FMS architecture includes a browser-based interface, a Node.js server, a rosbridge server, and ROS for robot control, providing intuitive monitoring and task assignment capabilities. This research demonstrates the system’s effectiveness in multi-robot coordination, task allocation, and adaptability to dynamic environments, contributing significantly to the field of robotics.
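A toy reading of the allocation step described above, where the robot with the highest estimated value (e.g., a learned Q-value reflecting path length and safety) wins the task; the interfaces and the stand-in scoring function are illustrative, not the paper's API:

```python
def assign_task(q_fn, robot_states, task):
    """Give the task to the robot whose estimated Q-value is highest,
    i.e., the one with the shortest/safest expected path."""
    scores = {rid: q_fn(state, task) for rid, state in robot_states.items()}
    return max(scores, key=scores.get)

# Toy usage with a stand-in scorer: value = negative distance to the task point
robots = {"r1": (0.0, 0.0), "r2": (4.0, 3.0), "r3": (9.0, 9.0)}
task = (5.0, 3.0)
fake_q = lambda pos, t: -((pos[0] - t[0]) ** 2 + (pos[1] - t[1]) ** 2) ** 0.5
print(assign_task(fake_q, robots, task))      # 'r2' is closest, so it wins
```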

14 pages, 3705 KB  
Article
Navigation Based on Hybrid Decentralized and Centralized Training and Execution Strategy for Multiple Mobile Robots Reinforcement Learning
by Yanyan Dai, Deokgyu Kim and Kidong Lee
Electronics 2024, 13(15), 2927; https://doi.org/10.3390/electronics13152927 - 24 Jul 2024
Cited by 3 | Viewed by 1317
Abstract
In addressing the complex challenges of path planning in multi-robot systems, this paper proposes a novel Hybrid Decentralized and Centralized Training and Execution (DCTE) Strategy, aimed at optimizing computational efficiency and system performance. The strategy resolves the prevalent issues of collision and coordination through a tiered optimization process. The DCTE strategy commences with an initial decentralized path planning step based on a Deep Q-Network (DQN), in which each robot independently formulates its path. This is followed by a centralized collision detection step, whose analysis identifies potential intersections and collision risks. Paths confirmed as non-intersecting proceed to execution, while those in collision areas trigger a dynamic re-planning step using DQN. Robots treat each other as dynamic obstacles to circumnavigate, ensuring continuous operation without disruption. The final step links the newly optimized paths with the original safe paths to form a complete and secure execution route. This paper demonstrates how this structured strategy not only mitigates collision risks but also significantly improves the computational efficiency of multi-robot systems. The reinforcement learning time was significantly shorter, with the DCTE strategy requiring only 3 min 36 s compared to 5 min 33 s in the simulation comparison. This improvement underscores the advantages of the proposed method in enhancing the effectiveness and efficiency of multi-robot systems.
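A minimal sketch of the centralized collision-detection step, assuming paths are expressed as per-timestep grid cells; robots flagged here would be re-planned with DQN as described above. The representation is illustrative, as the abstract does not specify the exact check:

```python
def find_conflicts(paths):
    """Centralized check: flag robots that occupy the same grid cell at the
    same timestep. `paths` maps robot id -> list of (x, y) cells per step."""
    occupied = {}                               # (t, cell) -> robot id
    conflicts = set()
    for robot, path in paths.items():
        for t, cell in enumerate(path):
            key = (t, cell)
            if key in occupied and occupied[key] != robot:
                conflicts.update({robot, occupied[key]})
            else:
                occupied[key] = robot
    return conflicts

paths = {
    "r1": [(0, 0), (1, 0), (1, 1), (2, 1)],
    "r2": [(2, 0), (1, 0), (1, 1), (0, 1)],     # meets r1 at t=1 and t=2
}
print(find_conflicts(paths))                    # {'r1', 'r2'} -> re-plan via DQN
```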

12 pages, 1016 KB  
Article
Secure Healthcare Model Using Multi-Step Deep Q Learning Network in Internet of Things
by Patibandla Pavithra Roy, Ventrapragada Teju, Srinivasa Rao Kandula, Kambhampati Venkata Sowmya, Anca Ioana Stan and Ovidiu Petru Stan
Electronics 2024, 13(3), 669; https://doi.org/10.3390/electronics13030669 - 5 Feb 2024
Cited by 10 | Viewed by 2250
Abstract
Internet of Things (IoT) is an emerging networking technology that connects both living and non-living objects globally. In an era where IoT is increasingly integrated into various industries, including healthcare, it plays a pivotal role in simplifying the process of monitoring and identifying diseases for patients and healthcare professionals. In IoT-based systems, safeguarding healthcare data is of the utmost importance to prevent unauthorized access and intermediary attacks. The motivation for this research lies in addressing the growing security concerns within healthcare IoT. In this paper, we combine a Multi-Step Deep Q Learning Network (MSDQN) with a Deep Learning Network (DLN) to enhance the privacy and security of healthcare data. The DLN is employed in the authentication process to identify authenticated IoT devices and prevent intermediate attacks between them. The MSDQN, on the other hand, is harnessed to detect and counteract malware and Distributed Denial of Service (DDoS) attacks during data transmission between various locations. The proposed method’s performance is assessed on parameters such as energy consumption, throughput, lifetime, accuracy, and Mean Square Error (MSE). We further compare the effectiveness of our approach with an existing method, the Learning-based Deep Q Network (LDQN).
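The abstract does not spell out the multi-step update itself; a sketch of the standard n-step Q-learning target that the MSDQN name suggests, with the bootstrap value assumed to come from a (target) Q-network:

```python
def n_step_return(rewards, bootstrap_q, gamma=0.99, n=3):
    """Standard multi-step target: r_t + γ·r_{t+1} + ... + γ^{n-1}·r_{t+n-1}
    + γ^n · max_a Q(s_{t+n}, a), where bootstrap_q is that maximum,
    assumed to be estimated by a target Q-network."""
    g = 0.0
    for k in range(min(n, len(rewards))):
        g += (gamma ** k) * rewards[k]
    return g + (gamma ** n) * bootstrap_q

# Toy usage: three one-step rewards plus a bootstrapped tail value
print(n_step_return([1.0, 0.5, 0.25], bootstrap_q=2.0))   # ≈ 3.68
```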

16 pages, 17137 KB  
Article
Deep Reinforcement Learning-Based 2.5D Multi-Objective Path Planning for Ground Vehicles: Considering Distance and Energy Consumption
by Xiru Wu, Shuqiao Huang and Guoming Huang
Electronics 2023, 12(18), 3840; https://doi.org/10.3390/electronics12183840 - 11 Sep 2023
Cited by 18 | Viewed by 2845
Abstract
Due to the vastly different energy consumption between up-slope and down-slope driving, the shortest path in a complex off-road terrain environment (2.5D map) is not always the path with the least energy consumption. For any energy-sensitive vehicle, realizing a good trade-off between distance and energy consumption in 2.5D path planning is significantly meaningful. In this paper, we propose a deep reinforcement learning-based 2.5D multi-objective path planning method (DMOP). The DMOP efficiently finds the desired path in three steps: (1) transform the high-resolution 2.5D map into a small-size map, (2) use a trained deep Q-network (DQN) to find the desired path on the small-size map, and (3) project the planned path back onto the original high-resolution map using a path-enhancement method. In addition, a hybrid exploration strategy and reward-shaping theory are applied to train the DQN, with the reward function constructed from terrain, distance, and border information. The simulation results show that the proposed method completes the multi-objective 2.5D path planning task with high efficiency and quality. Simulations also show that the method has powerful reasoning capability that enables it to perform arbitrary untrained planning tasks.
(This article belongs to the Special Issue Application of Artificial Intelligence in Robotics)
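A minimal sketch of a reward shaped from terrain, distance, and border information as the abstract describes, assuming an asymmetric slope cost so up-slope moves are penalized more than down-slope ones; the weights and the energy model are illustrative:

```python
import numpy as np

def shaped_reward(pos, next_pos, goal, height_map):
    """Reward = progress toward the goal - slope-dependent energy cost,
    with a fixed penalty for stepping outside the map border."""
    h, w = height_map.shape
    if not (0 <= next_pos[0] < h and 0 <= next_pos[1] < w):
        return -10.0                                    # border violation
    progress = (np.linalg.norm(np.subtract(goal, pos)) -
                np.linalg.norm(np.subtract(goal, next_pos)))
    climb = height_map[next_pos] - height_map[pos]
    energy = max(climb, 0.0) * 2.0 + abs(min(climb, 0.0)) * 0.5  # up-slope costs more
    return progress - energy

# Toy usage on a random 10x10 terrain
height_map = np.random.rand(10, 10) * 5.0
print(shaped_reward((0, 0), (0, 1), goal=(9, 9), height_map=height_map))
```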

17 pages, 2664 KB  
Article
AQMDRL: Automatic Quality of Service Architecture Based on Multistep Deep Reinforcement Learning in Software-Defined Networking
by Junyan Chen, Cenhuishan Liao, Yong Wang, Lei Jin, Xiaoye Lu, Xiaolan Xie and Rui Yao
Sensors 2023, 23(1), 429; https://doi.org/10.3390/s23010429 - 30 Dec 2022
Cited by 7 | Viewed by 3574
Abstract
Software-defined networking (SDN) has become one of the critical technologies for data center networks, as it can improve network performance from a global perspective using artificial intelligence algorithms. Owing to its strong decision-making and generalization ability, deep reinforcement learning (DRL) has been used in SDN intelligent routing and scheduling mechanisms. However, traditional DRL algorithms suffer from slow convergence and instability, resulting in poor network quality of service (QoS) for an extended period before convergence. To address these problems, we propose an automatic QoS architecture based on multistep DRL (AQMDRL) to optimize the QoS performance of SDN. AQMDRL uses a multistep approach to solve the overestimation and underestimation problems of the deep deterministic policy gradient (DDPG) algorithm: it uses the maximum value of the n-step action currently estimated by the neural network instead of the one-step Q-value function, which reduces the possibility of positive error generated by the Q-value function and effectively improves convergence stability. In addition, we adopt prioritized experience sampling based on SumTree binary trees to improve the convergence rate of the multistep DDPG algorithm. Our experiments show that AQMDRL significantly improves convergence performance and effectively reduces the network transmission delay of SDN compared with existing DRL algorithms.
(This article belongs to the Special Issue Adaptive Resource Allocation for Internet of Things and Networks)
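A compact SumTree in one common formulation (not necessarily the paper's exact implementation): leaves store priorities, internal nodes store subtree sums, so priority-proportional sampling takes O(log n):

```python
import random

class SumTree:
    """Binary sum-tree for prioritized experience replay."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = [0.0] * (2 * capacity)    # nodes 1..2c-1; leaves at c..2c-1
        self.next = 0

    def add(self, priority):
        idx = self.next + self.capacity
        self.update(idx, priority)
        self.next = (self.next + 1) % self.capacity
        return idx

    def update(self, idx, priority):
        delta = priority - self.tree[idx]
        while idx >= 1:                       # propagate the change to the root
            self.tree[idx] += delta
            idx //= 2

    def sample(self):
        """Walk down from the root, choosing children proportionally to their sums."""
        x, idx = random.uniform(0, self.tree[1]), 1
        while idx < self.capacity:
            left = 2 * idx
            if x <= self.tree[left]:
                idx = left
            else:
                x -= self.tree[left]
                idx = left + 1
        return idx - self.capacity            # leaf slot of the sampled experience

tree = SumTree(4)
for p in [1.0, 2.0, 4.0, 8.0]:
    tree.add(p)
print(tree.sample())                          # slot 3 sampled ~8/15 of the time
```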

19 pages, 2820 KB  
Article
A DDQN Path Planning Algorithm Based on Experience Classification and Multi Steps for Mobile Robots
by Xin Zhang, Xiaoxu Shi, Zuqiong Zhang, Zhengzhong Wang and Lieping Zhang
Electronics 2022, 11(14), 2120; https://doi.org/10.3390/electronics11142120 - 6 Jul 2022
Cited by 14 | Viewed by 3504
Abstract
Constrained by the sizes of its action and state spaces, Q-learning cannot be applied to continuous state spaces. Targeting this problem, the double deep Q network (DDQN) algorithm and corresponding improvement methods were explored. First, to improve the accuracy of the DDQN algorithm in estimating the target Q-value during training, a multi-step guided strategy was introduced into the traditional DDQN algorithm, in which the single-step reward was replaced with the reward obtained over continuous multi-step interactions of the mobile robot. Furthermore, an experience classification training method was introduced, in which the state transitions generated by mobile robot–environment interaction were divided into two different experience pools, both pools were used to train the Q network, and the sampling proportions of the two pools were updated according to the training loss. Afterward, the advantages of the multi-step guided DDQN (MS-DDQN) algorithm and the experience classification DDQN (EC-DDQN) algorithm were combined to develop a novel experience classification multi-step DDQN (ECMS-DDQN) algorithm. Finally, the path planning of these four algorithms, DDQN, MS-DDQN, EC-DDQN, and ECMS-DDQN, was simulated on the OpenAI Gym platform. The simulation results revealed that ECMS-DDQN outperforms the other three in total return value and generalization in path planning.
(This article belongs to the Section Systems & Control Engineering)
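A minimal sketch of the two-pool experience classification idea, assuming transitions are split into an "important" pool and an ordinary pool and the sampling proportion shifts toward the pool with the larger training loss; the update rule and its bounds are illustrative, not the paper's exact scheme:

```python
import random

class TwoPoolReplay:
    """Two experience pools whose sampling proportion adapts to training loss."""
    def __init__(self, ratio=0.5):
        self.pool_a, self.pool_b = [], []
        self.ratio = ratio                        # fraction drawn from pool_a

    def add(self, transition, important):
        (self.pool_a if important else self.pool_b).append(transition)

    def sample(self, batch_size):
        n_a = min(int(batch_size * self.ratio), len(self.pool_a))
        n_b = min(batch_size - n_a, len(self.pool_b))
        return random.sample(self.pool_a, n_a) + random.sample(self.pool_b, n_b)

    def update_ratio(self, loss_a, loss_b, step=0.05):
        """Shift sampling toward the pool with the larger training loss."""
        self.ratio += step if loss_a > loss_b else -step
        self.ratio = min(max(self.ratio, 0.1), 0.9)

# Toy usage
buf = TwoPoolReplay()
for i in range(20):
    buf.add(("s", "a", float(i), "s2"), important=(i % 4 == 0))
print(len(buf.sample(8)))
buf.update_ratio(loss_a=1.2, loss_b=0.7)          # pool_a now sampled more often
```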

18 pages, 13610 KB  
Article
Multi-Agent Decision-Making Modes in Uncertain Interactive Traffic Scenarios via Graph Convolution-Based Deep Reinforcement Learning
by Xin Gao, Xueyuan Li, Qi Liu, Zirui Li, Fan Yang and Tian Luan
Sensors 2022, 22(12), 4586; https://doi.org/10.3390/s22124586 - 17 Jun 2022
Cited by 24 | Viewed by 3848
Abstract
As one of the main elements of reinforcement learning, the design of the reward function is often not given enough attention when reinforcement learning is applied in concrete applications, which leads to unsatisfactory performance. In this study, a reward function matrix is proposed for training various decision-making modes, with emphasis on decision-making styles and, further, on incentives and punishments. Additionally, we model the traffic scene as a graph to better represent the interactions between vehicles and adopt a graph convolutional network (GCN) to extract features of the graph structure, helping the connected autonomous vehicles make decisions directly. Furthermore, we combine the GCN with deep Q-learning and multi-step double deep Q-learning to train four decision-making modes, named the graph convolutional deep Q-network (GQN) and the multi-step double graph convolutional deep Q-network (MDGQN). In the simulation, the superiority of the reward function matrix is demonstrated by comparison with a baseline, and evaluation metrics are proposed to verify the performance differences among the decision-making modes. Results show that the trained decision-making modes can satisfy various driving requirements, including task completion rate, safety requirements, comfort level, and completion efficiency, by adjusting the weight values in the reward function matrix. Finally, the decision-making modes trained by MDGQN performed better in an uncertain highway-exit scene than those trained by GQN.
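A minimal NumPy sketch of the graph-convolution step used to aggregate neighboring vehicles' features before Q-learning, using the standard normalized-adjacency propagation rule; the toy vehicle graph and feature sizes are illustrative:

```python
import numpy as np

def gcn_layer(adjacency, features, weights):
    """One graph-convolution layer, H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W),
    aggregating each vehicle's features with those of interacting neighbors."""
    a_hat = adjacency + np.eye(adjacency.shape[0])          # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ a_hat @ d_inv_sqrt @ features @ weights, 0.0)

# 3 vehicles, edges between interacting pairs; 4 features each (e.g., x, y, v, heading)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.random.randn(3, 4)
W = np.random.randn(4, 8)
print(gcn_layer(A, H, W).shape)     # (3, 8) node embeddings fed to the Q-network
```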

21 pages, 2226 KB  
Article
A Novel Multi-Factor Three-Step Feature Selection and Deep Learning Framework for Regional GDP Prediction: Evidence from China
by Qingwen Li, Guangxi Yan and Chengming Yu
Sustainability 2022, 14(8), 4408; https://doi.org/10.3390/su14084408 - 7 Apr 2022
Cited by 21 | Viewed by 3679
Abstract
Gross domestic product (GDP) is an important index reflecting the economic development of a region. Accurate GDP prediction for developing regions can provide technical support for sustainable urban development and economic policy formulation. In this paper, a novel multi-factor three-step feature selection and deep learning framework is proposed for regional GDP prediction. The core modeling process consists of three steps: In Step I, a feature crossing algorithm is used to deeply excavate the hidden feature information of the original datasets and fully extract key information. In Step II, BorutaRF and Q-learning algorithms analyze the deep correlation between the extracted features and the targets from two different perspectives and determine the features with the highest quality. In Step III, the selected features are used as the input of a temporal convolutional network (TCN) to build the GDP prediction model and obtain the final predictions. Based on experimental analysis of three datasets, the following conclusions can be drawn: (1) the proposed three-step feature selection method improves the prediction accuracy of the TCN by more than 10%; (2) the proposed GDP prediction framework achieves better forecasting performance than 14 benchmark models, with MAPE values below 5% in all cases.
