1. Introduction
Urban pipeline facilities play a vital role in energy transportation, clean water supply, and wastewater discharge in urban areas, thereby significantly improving the living conditions of urban residents. However, as pipes age, they may undergo degradation due to factors such as corrosion, geological subsidence, or improper plumbing and digging. This deterioration can lead to economic losses and hazardous incidents [1]. Therefore, obtaining internal pipe information is crucial for the sustainable operation of urban pipelines as it provides early support for pipeline inspection and maintenance.
Traditional pipe inspection methods depend on manual work and are limited by two factors: (i) they are time-consuming and laborious; (ii) they put inspectors' lives at risk when entering unknown pipe environments. Robotic solutions for pipeline inspection therefore promise to augment human labor by automating data acquisition for pipe condition assessments. CCTV (closed-circuit television) cameras and sonar devices are among the most widely adopted robotic solutions for obtaining information about the interior of pipelines, supporting basic visual inspection as well as data collection and video analysis [1]. Although CCTV is the preferred method because it provides a complete view of the pipe interior, it requires long cables and travels slowly when inspecting small-diameter pipelines.
To gain further insight into the 3D structure and textures of the pipeline, the video data or image sequences must be processed further to achieve 3D reconstruction. This is often accomplished offline using the structure-from-motion (SfM) method [2]. Additionally, multi-sensor fusion SLAM techniques from the robotics community can be used to automatically collect sensory data and perform RGB-D 3D reconstructions within the pipes. Shang and Shen proposed a pipeline reconstruction system using a multi-camera array, which includes one centrally positioned depth camera and four oblique depth cameras. This system estimates the pipeline path and maps its surface based on a visual SLAM framework. However, this method is not capable of real-time reconstruction, and the use of multiple cameras increases the overall cost of the system [3]. In another study, Zhang et al. presented a pipeline reconstruction SLAM method that employs a cylindrical rule. This method addresses the scale drift and accuracy reduction encountered when visual SLAM is applied to pipeline reconstruction. However, its accuracy decreases significantly when the lower part of the pipeline is covered [4]. Zhang et al. also introduced a 3D reconstruction system based on 360° panoramic video. This system involves multi-view image frame extraction, panoramic reprojection, and photogrammetric processing. It provides an intuitive and clear reconstruction of the pipeline’s real scenes, but it can only be operated offline [5]. While these approaches offer valuable information, including visual, 3D, and geo-referencing data, for advanced pipe condition monitoring and assessment, they are time-consuming and require offline operation. Consequently, there is a need for autonomous, unconstrained robots that can perform continuous, real-time pipeline health monitoring [4].
In recent years, SLAM technology has gained wide popularity for solving complex problems related to real-time 3D mapping in robots. Several improved SLAM-based algorithms have been developed to enhance work efficiency. The ORB-SLAM2 method, for instance, relies on ORB features, which offer better viewpoint invariance and higher matching efficiency. However, it is susceptible to tracking failures in scenes with fast camera movement and weak textures. Additionally, the maps generated by ORB-SLAM2 are sparse point cloud maps with limited applicability [6]. Zou et al. proposed a robust RGB-D SLAM system that utilizes point and line features to improve robustness in low-texture scenes, which are more sensitive to lighting variations [7]. LIO-SLAM achieves relatively high localization and map construction accuracy by fusing LiDAR and IMU data. However, it is costly and is typically used for generating 2D maps, which may not suffice for tasks requiring 3D environment modeling [8]. DSO-SLAM, which employs the sparse direct method, extracts feature points directly from the image without the need for dense feature matching. It can function in environments with changing illumination and weak texture, making it more robust [9]. VINS-Fusion fuses data from a vision camera and an IMU, offering high positioning accuracy, real-time capability, and stability in both indoor and outdoor scenes. However, it requires precise sensor calibration, including camera–IMU time synchronization and extrinsic parameter calibration. Moreover, it is primarily designed for high-precision positioning applications and is not suitable for 3D map construction [10]. Manhattan-SLAM integrates superpixels and the Manhattan world assumption, which allows both line features and planar features to be extracted more reliably; however, it cannot be used for the 3D reconstruction of pipelines [11]. Notably, the performance of different improved SLAM algorithms can differ significantly across application environments. Consequently, when applying SLAM to a new context, it is crucial to select the most suitable SLAM approach or adapt existing methods to balance efficiency and accuracy.
Although various methods and systems have made significant progress, there is still a demand for automated, real-time 3D reconstruction of underground pipelines in complex environments. For instance, robotic vehicle inspection systems equipped with CCTV have limitations when inspecting large-diameter or irregularly shaped pipelines. Additionally, these systems often require manual operators and lack autonomy. Existing SLAM studies primarily focus on improving reconstruction accuracy, while efficiency and real-time performance receive less attention. Liang et al. proposed an end-to-end network called SVR-Net for monocular TSDF SLAM, which directly generates dense TSDF maps during localization, avoids inconsistencies in depth map fusion, and meets real-time requirements [12]. However, this approach requires pre-training that must be redone for different application scenarios. Deep learning has proven to be effective in image processing, and many works have applied it to front-end feature matching and loop closure detection in visual SLAM, reducing the false matching rate. However, it relies on substantial computing power and prior information. It is therefore particularly important to have a method that runs on a low-compute platform and is easy to operate without training or prior information. RTAB-Map is a representative method for RGB-D SLAM, which generates point clouds and triangular mesh maps [13]. Owing to its high level of integration, it is available as a binary program in the ROS system. It can run in real time on low-computing-power platforms and generates dense point cloud maps. The binary program of RTAB-Map can optimize both pose filtering and model reconstruction of the point cloud data.
Nowadays, UAVs have gained significant importance in structural health performance inspections due to their small size, low cost, flexibility, mobility, and ability to perform multiple-view inspections [14,15,16,17]. However, their application in pipeline inspection missions has been limited to the outer surface of pipes and they have not yet been deployed for 3D reconstruction of the interior pipeline [18,19,20]. This limitation is primarily attributed to the vast and complex nature of pipeline systems with narrow spaces and dim lighting conditions. Additionally, existing commercial UAVs face challenges related to carrying a variety of sensors while having limited load capacity and endurance. Therefore, there is a need to design a high-performance, multi-functional, and reliable UAV inspection system specifically tailored for pipeline inspections.
With the development of drone and computer vision technology, drones equipped with lightweight sensor devices are expected to become a mainstream option for the inspection of pipe networks. In view of this, this paper proposes a real-time 3D reconstruction system for UAVs based on RTAB-Map. The system collects data through RGB-D cameras and generates 3D point clouds of the pipeline interior through real-time processing on on-board computers; the point clouds are then viewed remotely, in real time, at the other end.
The main contributions are as follows: (i) A real-time 3D reconstruction platform for UAVs with RTAB-Map is proposed to deal with harsh scenarios such as urban underground pipelines that are difficult for people to reach. (ii) The reconstruction results of mainstream visual SLAM algorithms are compared in the underground pipe scenario, showing that the RTAB-Map method is more robust in this scenario. (iii) Experiments are carried out in real sewage pipes, and the results show that the system can perform 3D reconstruction in real time in the underground pipe scene while maintaining reconstruction accuracy.
The rest of this paper is organized as follows: Section 2 describes the methodology and framework, mainly introducing the hardware architecture of the UAV system and the RTAB-Map method. Section 3 describes the experiments and results and includes a comparison of trajectories and point clouds between RTAB-Map, ORB-SLAM2, and Manhattan-SLAM. Section 4 provides an analysis and discussion of the experimental results. Finally, Section 5 summarizes the main conclusions of this research and discusses its limitations.
4. Analysis and Discussion
In this study, the Intel RealSense D435 depth camera was used to perform real-time 3D reconstruction of underground pipes and serves as the data source for the reconstruction process. The core hardware of the UAV acquisition system used in this experiment is an onboard mini-computer with an i5-1145G7 with Radeon Graphics 2.60 GHz and 8 GB RAM, running a 64-bit Ubuntu 20.04 operating system. The system can process thirty frames per second during operation. The time consumed during data acquisition and the experimental process demonstrates that the current efficiency is sufficient for real-time 3D reconstruction. Importantly, the device’s stability ensures the high quality of the video during data acquisition.
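The thirty-frames-per-second figure above implies a per-frame time budget of roughly 33 ms. The following minimal sketch shows that arithmetic and one simple way to check measured processing times against it. The function names and the sample timings are hypothetical and are not taken from the experiments in this study.

```python
from statistics import mean


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def keeps_up(processing_ms: list[float], fps: float = 30.0) -> bool:
    """True if the average per-frame processing time fits within the frame budget."""
    return mean(processing_ms) <= frame_budget_ms(fps)


# Hypothetical timings in milliseconds; not measurements from this study.
sample_timings = [21.0, 28.5, 30.2, 25.1]
print(round(frame_budget_ms(30.0), 1))  # 33.3
print(keeps_up(sample_timings))         # True
```

Averaging is the simplest criterion; a stricter check could compare the worst-case or a high percentile of the timings against the same budget.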
3D modeling of pipeline interiors presents unique challenges such as complex pipeline structures, lack of natural light, inconspicuous inner-wall texture, and poor safety conditions. To address these challenges, the use of RTAB-Map, a graph-based SLAM system, is proposed. RTAB-Map combines loop closure detection and graph optimization methods, while also providing memory management for large-scale and long-term online operations. This system demonstrates advantages in scenarios where the camera moves quickly, surrounding objects reflect light, or illumination is unstable. Additionally, RTAB-Map’s memory management mechanism ensures the system’s stability over extended periods. To evaluate the feasibility of RTAB-Map in complex pipeline interiors, this study conducted experiments using a UAV to collect video data and extract high-frame-rate RGB and depth images. The results confirm that RTAB-Map is a viable solution for navigation and modeling in such challenging environments.
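The memory management idea mentioned above can be illustrated with a deliberately simplified sketch: map nodes are kept in a bounded working memory, and when the bound is exceeded, the least re-observed (and, among those, oldest) node is moved to long-term storage so that the per-update workload stays bounded. This is a toy model under stated assumptions, not RTAB-Map's implementation. The class names, the `weight` field, and the use of a node-count limit as a stand-in for a processing-time limit are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int     # increasing with insertion order (assumed)
    weight: int = 0  # hypothetical count of how often this location was re-observed


@dataclass
class Memory:
    max_working: int  # stand-in for a real-time processing budget (assumption)
    working: list = field(default_factory=list)
    long_term: list = field(default_factory=list)

    def __post_init__(self):
        if self.max_working < 1:
            raise ValueError("max_working must be at least 1")

    def add(self, node: Node) -> None:
        """Insert a node, then move low-value nodes out of working memory."""
        self.working.append(node)
        while len(self.working) > self.max_working:
            # The newest node is never a transfer candidate.
            candidates = self.working[:-1]
            # Lowest weight first; ties broken by the oldest (smallest id).
            victim = min(candidates, key=lambda n: (n.weight, n.node_id))
            self.working.remove(victim)
            self.long_term.append(victim)


memory = Memory(max_working=3)
for node_id, weight in [(0, 2), (1, 0), (2, 1), (3, 0), (4, 5)]:
    memory.add(Node(node_id, weight))

print([n.node_id for n in memory.working])    # [0, 2, 4]
print([n.node_id for n in memory.long_term])  # [1, 3]
```

In this toy model, transferred nodes are simply parked in `long_term`; a fuller version would also bring them back when a loop closure touches their neighbourhood.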
5. Conclusions
This paper presents a study on an unmanned aircraft system designed for dense 3D real-time reconstruction of urban pipelines using low-cost equipment. The research begins by building a UAV hardware platform for data acquisition, with the Intel RealSense D435 depth camera selected as the sensor for capturing RGB and depth information. The experimental scene focuses on a section of rubber sewage pipe, and three visual SLAM methods (RTAB-Map, ORB-SLAM2, and Manhattan-SLAM) are compared to evaluate their reconstruction effects. The results demonstrate that RTAB-Map outperforms the other two methods in terms of localization accuracy and 3D reconstruction. Additionally, RTAB-Map exhibits greater robustness in low-textured and dim urban pipe environments. The experiments confirm that the proposed UAV 3D reconstruction system, based on visual SLAM, enables real-time modeling in urban pipeline internal environments.
The proposed UAV acquisition system for the interior of urban underground pipes offers several advantages: (i) UAVs are more lightweight and flexible in large-diameter scenarios such as urban sewage pipes, which reduces operational safety risks. (ii) RTAB-Map offers better real-time performance and reconstruction accuracy than the other visual SLAM methods compared. (iii) RTAB-Map incorporates a memory management mechanism that enables data acquisition and modeling to run in real time. This mechanism is crucial for supporting pipeline stability analysis. (iv) Compared to laser scanners, the depth camera used in this system is cost-effective while providing detailed texture information about the inner pipe walls.
However, this study has certain limitations that need to be acknowledged. Firstly, because the UAV must remain lightweight, its range is limited and it cannot operate for extended periods of time. Secondly, the system faces difficulties in smooth metal pipes and other texture-deficient scenes, which is a drawback of the visual SLAM method. To overcome this limitation, future work will fuse LiDAR, IMU, and other sensors to enhance the accuracy of 3D reconstruction. Lastly, the study only examines the performance of the system in straight pipe scenarios. Future research will address more complex situations involving underground pipes with multiple structures, such as curved pipes, vertical pipes, and ‘T’ shaped pipes.