#### *5.1. Simulation Environment*

To verify the adaptability of the improved RRT\* algorithm, the classic RRT, the classic RRT\*, and the improved RRT\* algorithms were simulated in an underground ore transportation scenario. The vehicle parameters come from the Scooptram ST3.5 diesel LHD, as shown in Figure 12 and Table 2. The verification map comes from a large underground mine in China, as shown in Figure 13a. The design size of the drifts was 4.4 m × 3.9 m. The ore is transported by an LHD from Stope #1 to Orepass #1. The map was preprocessed so that only the route of the LHD was retained. The simplified map is shown in Figure 13b.

**Figure 12.** Scooptram ST3.5 diesel LHD.


**Table 2.** Parameters of the Scooptram ST3.5 diesel LHD.

Data Source: Epiroc official website.

**Figure 13.** The map of the case study. (**a**) The original map; (**b**) the simplified map.

The case study simulated the operation of the LHD from the stope to the orepass and verified the algorithm's ability to plan a feasible path in a long and narrow space. The LHD is required to complete ore transportation over the minimum distance while satisfying the safety and kinematic constraints. The simulation was implemented in Python 3.7 on 64-bit Windows 10, with an Intel Core i7-8550U CPU and 16 GB of memory. The simulation environment included SciPy 1.6.2, Shapely 1.8.0, and Matplotlib 3.3.4. SciPy was used to implement the formulas, Shapely was used to calculate the OBB (oriented bounding box) of the vehicles and the map polygons, and Matplotlib was used to plot the path.
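The OBB check against the map polygon can be illustrated with a minimal sketch. The simulation itself used Shapely for this step; the version below substitutes plain Python geometry so that it has no dependencies, and it is not the authors' implementation. The function names, the straight rectangular drift, and the 8.0 m × 2.0 m vehicle footprint are illustrative assumptions (the footprint is a placeholder, not a value from Table 2). Only the 4.4 m drift width comes from the text.

```python
import math

def obb_corners(cx, cy, heading, length, width):
    """Corners of a rectangle centred at (cx, cy) and rotated by heading (rad)."""
    hl, hw = length / 2.0, width / 2.0
    c, s = math.cos(heading), math.sin(heading)
    local = [(hl, hw), (-hl, hw), (-hl, -hw), (hl, -hw)]
    return [(cx + lx * c - ly * s, cy + lx * s + ly * c) for lx, ly in local]

def point_in_polygon(pt, poly):
    """Ray-casting test; points exactly on the boundary get no special handling."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def _orient(a, b, c):
    """Signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p1, p2, q1, q2):
    """True if segments p1p2 and q1q2 cross at an interior point of both."""
    d1, d2 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    d3, d4 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0

def obb_inside_drift(corners, drift):
    """The OBB is accepted if every corner is inside and no edge crosses a wall."""
    if not all(point_in_polygon(c, drift) for c in corners):
        return False
    n, m = len(corners), len(drift)
    for i in range(n):
        for j in range(m):
            if segments_cross(corners[i], corners[(i + 1) % n],
                              drift[j], drift[(j + 1) % m]):
                return False
    return True

if __name__ == "__main__":
    # Assumed straight drift, 30 m long and 4.4 m wide.
    drift = [(0.0, 0.0), (30.0, 0.0), (30.0, 4.4), (0.0, 4.4)]
    aligned = obb_corners(15.0, 2.2, 0.0, 8.0, 2.0)          # along the drift
    crosswise = obb_corners(15.0, 2.2, math.pi / 2, 8.0, 2.0)  # across the drift
    print(obb_inside_drift(aligned, drift))    # True
    print(obb_inside_drift(crosswise, drift))  # False
```

A vehicle aligned with the drift axis fits, whereas the same footprint turned across the drift does not, which is why the heading has to be part of the collision check in a long and narrow space.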

#### *5.2. Simulation Results*

Comparative simulation experiments of the classic RRT algorithm, classic RRT\* algorithm, and improved RRT\* algorithm were carried out, and the results are shown in Figure 14. The red "X" represents the starting point and end point of the path planning, the blue line represents the wall of the drifts, and the horizontal and vertical axes represent the east and north coordinates. The yellow line represents the result of the classic RRT algorithm, the green line represents the result of the classic RRT\* algorithm, and the red line represents the result of the improved RRT\* algorithm.

**Figure 14.** The simulation results.

It can be seen from Figure 14 that the path generated by the classic RRT algorithm showed strong randomness and contained many irregular corners, such as those in Circle 1 and Circle 2. In contrast, the path generated by the classic RRT\* algorithm was much smoother, but the turns at the bends of the drift were too sharp for the steering angle of the vehicles, such as those in Circle 2 and Circle 3.

Ten independent random simulations were performed for each algorithm to offset the random deviation of a single experiment. The results are shown in Table 3. The average path length obtained by the improved RRT\* algorithm was much lower than that of the classic RRT algorithm but only slightly lower than that of the classic RRT\* algorithm. The main reason is that the reconnection in the classic RRT\* algorithm can quickly approach the theoretically shortest path. The improved RRT\* algorithm inherited this feature, so there was little room for further improvement. In terms of average search time, the improved RRT\* algorithm fell between the classic RRT algorithm and the classic RRT\* algorithm. The same reason also led to the increase in the average number of search nodes. Due to the optimal tree reconnection, the improved RRT\* algorithm had a significant advantage over the classic algorithms in terms of the average number of path nodes. Fewer path nodes mean fewer control points while the vehicle is driving, which reduces the difficulty of automatic driving. The steering angle constraints made the result of the improved RRT\* algorithm fully meet the steering requirements, and the optimal tree reconnection increased the smoothness of the path, so the vehicle can follow the path directly without further adjustment, avoiding repeated calculations. In general, the improved RRT\* algorithm greatly improved the quality of the path at the cost of some solution speed.
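The path-quality indicators discussed above can be computed from a list of path nodes, as in the following sketch. It is an illustration under stated assumptions rather than the authors' evaluation code: the function names are hypothetical, the turning angle between consecutive segments stands in for the steering angle of the articulated vehicle, and the 40° limit is a placeholder, not a value from Table 2.

```python
import math

def path_length(path):
    """Total Euclidean length of a polyline given as (x, y) nodes."""
    return sum(math.hypot(b[0] - a[0], b[1] - a[1])
               for a, b in zip(path, path[1:]))

def turning_angles(path):
    """Absolute heading change (rad) at every interior node of the path."""
    angles = []
    for a, b, c in zip(path, path[1:], path[2:]):
        h1 = math.atan2(b[1] - a[1], b[0] - a[0])
        h2 = math.atan2(c[1] - b[1], c[0] - b[0])
        d = abs(h2 - h1)
        angles.append(min(d, 2.0 * math.pi - d))  # wrap to [0, pi]
    return angles

def steering_valid_ratio(path, max_turn):
    """Share of interior nodes whose turning angle is within max_turn."""
    angles = turning_angles(path)
    if not angles:
        return 1.0
    return sum(a <= max_turn for a in angles) / len(angles)

if __name__ == "__main__":
    path = [(0.0, 0.0), (10.0, 0.0), (20.0, 5.0), (20.0, 15.0)]
    max_turn = math.radians(40.0)  # placeholder limit
    print(len(path))                             # number of path nodes: 4
    print(round(path_length(path), 2))           # 31.18
    print(steering_valid_ratio(path, max_turn))  # 0.5
```

In this toy path, the second bend (about 63°) exceeds the assumed limit, so only half of the turns are valid. Averaging such indicators over repeated runs gives statistics of the kind reported in Table 3.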


**Table 3.** Statistics of 10 independent random simulations.

Obstacles such as faulty vehicles and stacked materials are common in underground drifts. Further verification was conducted with known obstacles, as shown in Figure 15. Two scenarios were considered: avoidable obstacles and unavoidable obstacles in the drift. The red line represents the final result, the yellow line represents the invalid leaves of the random tree, and the blue point represents the obstacle. For avoidable obstacles, the algorithm could pass them along a smooth curve without additional sampling. For unavoidable obstacles, the algorithm stopped sampling after a certain number of samples.
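The stopping behaviour for unavoidable obstacles can be sketched with a basic RRT loop that gives up once a sample budget is exhausted. This is a simplified illustration, not the improved RRT\* algorithm: it omits rewiring, the dynamic step size, and the steering constraints, and the rectangular corridor, the obstacle rectangles, the function names, and the budget of 2000 samples are all assumptions.

```python
import math
import random

def in_rect(p, rect):
    """rect is (xmin, ymin, xmax, ymax); boundaries count as inside."""
    return rect[0] <= p[0] <= rect[2] and rect[1] <= p[1] <= rect[3]

def segment_free(a, b, corridor, obstacles, checks=10):
    """Approximate check: test evenly spaced points along the segment a-b."""
    for i in range(1, checks + 1):
        t = i / checks
        p = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        if not in_rect(p, corridor) or any(in_rect(p, o) for o in obstacles):
            return False
    return True

def plan(start, goal, corridor, obstacles,
         max_samples=2000, step=1.0, goal_bias=0.2, seed=0):
    """Basic RRT; returns a node list, or None once the budget is used up."""
    rng = random.Random(seed)
    parent = {start: None}
    for _ in range(max_samples):
        if rng.random() < goal_bias:
            target = goal
        else:
            target = (rng.uniform(corridor[0], corridor[2]),
                      rng.uniform(corridor[1], corridor[3]))
        near = min(parent, key=lambda n: math.hypot(n[0] - target[0],
                                                    n[1] - target[1]))
        d = math.hypot(target[0] - near[0], target[1] - near[1])
        if d == 0.0:
            continue
        scale = min(step, d) / d
        new = (near[0] + scale * (target[0] - near[0]),
               near[1] + scale * (target[1] - near[1]))
        if new in parent or not segment_free(near, new, corridor, obstacles):
            continue  # invalid sample: discard it and keep sampling
        parent[new] = near
        if math.hypot(new[0] - goal[0], new[1] - goal[1]) < 1e-9:
            path = [new]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
    return None  # budget exhausted: treat the obstacle as unavoidable

if __name__ == "__main__":
    corridor = (0.0, 0.0, 20.0, 4.0)
    start, goal = (1.0, 2.0), (19.0, 2.0)
    blocking = [(9.0, 0.0, 11.0, 4.0)]  # spans the full corridor width
    partial = [(9.0, 0.0, 11.0, 2.5)]   # leaves a gap along one wall
    print(plan(start, goal, corridor, blocking))  # None
    print(plan(start, goal, corridor, partial) is not None)
```

Returning `None` after a fixed number of samples gives the caller an explicit signal that no path exists within the budget, instead of letting the planner sample indefinitely.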

**Figure 15.** The simulation result with known obstacles. (**a**) With avoidable obstacles; (**b**) with unavoidable obstacles.

The kidnapping problem of intelligent vehicles might occur due to navigation failure or other reasons. To verify the handling of the kidnapping problem, we assumed that the vehicle planned to travel from point A to point B but ended up at point B' because of kidnapping. Two scenarios were considered: turnable kidnapping and unturnable kidnapping, as shown in Figure 16. For turnable kidnapping, the vehicle drives forward to a point ahead on the original path using the maximum steering angle. For unturnable kidnapping, the vehicle drives astern to a point behind on the original path using the maximum steering angle.

**Figure 16.** The simulation result for the kidnapping problem. (**a**) With turnable kidnapping; (**b**) with unturnable kidnapping.

#### *5.3. Discussion*

With the aim of enabling the unmanned driving of intelligent vehicles in underground mines, we improved the path planning algorithm to adapt to the complex drift environment based on the RRT\* algorithm. Many existing algorithms have to rasterize the map, but rasterized maps are not suitable for the drift environment. We constructed a vectorized drift environment map and then selected the RRT\* algorithm for improvement. The vectorized map can effectively restore the details of the roadway environment and can also reduce the size of the dataset. Based on the articulated structure of underground intelligent vehicles, the dynamic characteristics were analyzed, and then the constraints were constructed. This strengthened the consideration of complex vehicle structures in this field. The process of the classic RRT\* algorithm was analyzed, and its shortcomings in adaptability to underground mining were identified. On this basis, three improvements were proposed: a dynamic step size solved the algorithm efficiency problem; steering angle constraints solved the vehicle dynamics problem; and optimal tree reconnection solved the control difficulty problem. In the simulation case study, the improved RRT\* algorithm obtained a path suitable for underground intelligent vehicles within a reasonable time. Its results increased the effective ratio of the steering angle to 100%, fully met the vehicle's requirements, eliminated the secondary optimization of the path, greatly reduced the average number of path nodes, and simplified the vehicle's automatic driving control.

However, we must admit that in order to achieve this path planning effect, a large number of invalid samples were discarded, which led to an increase in calculation time. Parallel calculation could improve the sampling efficiency and shorten the calculation time, and this will be addressed in future research. In addition, only a simulation case study was completed in this paper, and no on-site industrial experiment was carried out. The unmanned driving of underground intelligent vehicles requires the coordination of multiple modules, including communication, sensors, SLAM, and mechanical control. It is also necessary to shut down some mining operations to ensure the safety of the experiment area. Due to these difficulties, this research only completed the path planning algorithm module, and an on-site industrial experiment will be carried out in the future once each module has been prepared.

#### **6. Conclusions**

This paper proposed a path planning method based on an improved RRT\* algorithm for solving the problem of path planning for underground intelligent vehicles with an articulated structure in drift environment conditions. Through a vectorized drift map and the kinematics of the vehicles, the constraints of articulated underground intelligent vehicles can be ascertained. The RRT\* algorithm is an efficient sampling-based path planning algorithm, but it cannot meet the constraints of articulated underground intelligent vehicles. To solve this problem, this paper proposed an improved RRT\* algorithm, including a dynamic step size, steering angle constraints, and optimal tree reconnection. A simulation case study showed that the algorithm was effective and could solve the problem of underground intelligent vehicle path planning.

However, the method in this paper still has limitations, and future research will focus on the following aspects. (1) The solution time is still unsatisfactory because 86.12 s cannot meet the application requirement for underground unmanned driving, in which vehicles need to obtain a path within several seconds. Parallel calculation will be used to increase the solution speed and further reduce the calculation time. (2) There has been no joint debugging with intelligent vehicles yet. After the preparation of the industrial site, the algorithm will be combined with other modules to complete on-site industrial experiments and test the gap between the simulated and actual performance.

**Author Contributions:** Conceptualization, N.H. and H.W.; methodology, H.W.; software, J.H. and L.C.; validation, H.W., G.L. and L.C.; formal analysis, H.W.; data curation, L.C.; writing—original draft preparation, H.W.; writing—review and editing, G.L. and N.H.; visualization, J.H.; supervision, N.H.; funding acquisition, G.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Natural Science Foundation of China, grant number 52074022, and the National Key R&D Program of China, grant number 2018YFC0604400.

**Data Availability Statement:** The parameters of the Scooptram ST3.5 diesel LHD were obtained from https://www.epiroc.com/en-us/products/loaders-and-trucks/diesel-loaders/scooptram-st3-5 (accessed on 12 December 2021). The source code associated with the algorithms introduced in the paper is available from the corresponding author upon request. The environment data are not available according to the privacy policy.

**Acknowledgments:** The authors would like to thank SD Gold for drift environment data support and Epiroc for LHD parameters support.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Qi Zhang <sup>1,†</sup>, Yukai Song <sup>1,2,†</sup>, Peng Jiao <sup>1</sup> and Yue Hu <sup>1,</sup>\***


**Abstract:** Exploration in unknown dynamic environments is a challenging problem in an AI system, and current techniques tend to produce irrational exploratory behaviours and fail in obstacle avoidance. To this end, we present a three-tiered hierarchical and modular spatial exploration model that combines the intrinsic motivation integrated deep reinforcement learning (DRL) and rule-based real-time obstacle avoidance approach. We address the spatial exploration problem in two levels on the whole. On the higher level, a DRL based global module learns to determine a distant but easily reachable target that maximizes the current exploration progress. On the lower level, another two-level hierarchical movement controller is used to produce locally smooth and safe movements between targets based on the information of known areas and free space assumption. Experimental results on diverse and challenging 2D dynamic maps show that the proposed model achieves almost 90% coverage and generates smoother trajectories compared with a state-of-the-art IM based DRL and some other heuristic methods on the basis of avoiding obstacles in real time.

**Keywords:** spatial exploration; hierarchical framework; deep reinforcement learning; intrinsic motivation; path planning; obstacle avoidance
