*Article* **Evaluation of RGB-D Multi-Camera Pose Estimation for 3D Reconstruction**

**Ian de Medeiros Esper 1,\*, Oleh Smolkin 2,3, Maksym Manko 2,4, Anton Popov 2,3,4, Pål Johan From 1 and Alex Mason 1,5,\***


**Abstract:** Advances in visual sensor devices and computing power are revolutionising the interaction of robots with their environment. Cameras that capture depth information along with a common colour image play a significant role. These devices are cheap, small, and fairly precise. The information provided, particularly point clouds, can be generated in a virtual computing environment, providing complete 3D information for applications. However, off-the-shelf cameras often have a limited field of view on both the horizontal and vertical axes. In larger environments, it is therefore often necessary to combine information from several cameras or positions. To concatenate multiple point clouds and generate the complete environment information, the pose of each camera in the scene must be known, i.e., all cameras must reference a common coordinate system. To achieve this, a coordinate system must be defined, and every device must then be positioned according to it. For each camera, a calibration can be performed to find its pose in relation to this coordinate system. Several calibration methods have been proposed to solve this challenge, ranging from structured objects such as chessboards to features in the environment. In this study, we investigate how three different pose estimation methods for multi-camera perspectives perform when reconstructing a scene in 3D. We evaluate the use of a charuco cuboid, a double-sided charuco board, and a robot's tool centre point (TCP) position in a real use case, where precision is key for the system. We define a methodology to identify the points in 3D space and measure the root-mean-square error (RMSE) based on the Euclidean distance from the actual point to a generated ground-truth point. The reconstruction carried out using the robot's TCP position produced the best result, followed by the charuco cuboid; the double-sided angled charuco board exhibited the worst performance.

**Keywords:** pose estimation; robotics; 3D reconstruction; charuco cuboid
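
The two operations the abstract relies on, bringing each camera's point cloud into a common coordinate system and scoring the result by RMSE over Euclidean distances, can be illustrated with a minimal sketch. It assumes each camera pose is given as a 4x4 homogeneous camera-to-world matrix and that points are already matched one-to-one with their ground-truth counterparts; the function names (`transform_point`, `merge_clouds`, `rmse`) are hypothetical and are not taken from the study's implementation.

```python
import math


def transform_point(pose, point):
    """Apply a 4x4 homogeneous pose (nested lists, row-major) to a 3D point.

    Assumption: `pose` maps camera coordinates to the common coordinate system.
    """
    x, y, z = point
    return tuple(
        pose[r][0] * x + pose[r][1] * y + pose[r][2] * z + pose[r][3]
        for r in range(3)
    )


def merge_clouds(clouds, poses):
    """Concatenate per-camera point clouds after moving each into the common frame."""
    merged = []
    for cloud, pose in zip(clouds, poses):
        merged.extend(transform_point(pose, p) for p in cloud)
    return merged


def rmse(measured, ground_truth):
    """Root-mean-square of the Euclidean distances between matched 3D points."""
    if not measured or len(measured) != len(ground_truth):
        raise ValueError("point lists must be non-empty and of equal length")
    squared = [math.dist(m, g) ** 2 for m, g in zip(measured, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative data only: two cameras, the second offset by 1 unit along x.
IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
SHIFT_X = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

merged = merge_clouds([[(0, 0, 0)], [(0, 0, 0)]], [IDENTITY, SHIFT_X])
error = rmse(merged, [(0, 0, 0), (1, 0, 0)])
```

With exact poses, as in this toy data, the merged points coincide with the ground truth and the error is zero; an inaccurate pose estimate shifts every point from that camera and raises the RMSE accordingly.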
