Article

Implementation of a Small-Sized Mobile Robot with Road Detection, Sign Recognition, and Obstacle Avoidance

Department of Electrical and Computer Engineering, Tamkang University, New Taipei City 25137, Taiwan
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(15), 6836; https://doi.org/10.3390/app14156836
Submission received: 30 June 2024 / Revised: 22 July 2024 / Accepted: 30 July 2024 / Published: 5 August 2024
(This article belongs to the Special Issue Artificial Intelligence and Its Application in Robotics)

Abstract

In this study, a small-sized mobile robot is designed and implemented within a limited volume of 18 cm × 18 cm × 21 cm. It integrates a CPU, a GPU, a 2D LiDAR (Light Detection and Ranging) sensor, and two fisheye cameras, giving the robot good computing and graphics processing capabilities. In addition, three functions, road detection, sign recognition, and obstacle avoidance, are implemented on this small-sized robot. For road detection, the captured image is divided into four areas and an Intel NUC performs the road detection calculations. The proposed method significantly reduces the system load while achieving a high processing speed of 25 frames per second (fps). For sign recognition, the YOLOv4-tiny model is used together with a data augmentation strategy that substantially improves its performance. The experimental results show that the mean Average Precision (mAP) of the used model increases by 52.14%. For obstacle avoidance, a 2D LiDAR-based method with a distance-based filtering mechanism is proposed. This mechanism retains the important data points and assigns them appropriate weights, which effectively reduces the computational complexity and improves the robot's response speed in avoiding obstacles. Experimental results and actual tests illustrate that the proposed methods for these three functions can be effectively executed on the implemented small-sized robot.

1. Introduction

With the rapid development of robotics technology, autonomous mobile robots (AMRs) have been increasingly used in industry, services, and academia [1,2,3,4]. In academia and research, small and open-source platforms like the TurtleBot3 robot [5] are ideal for exploring complex robotic problems [6]. In industrial applications, AMRs are widely used in factory automation, such as logistics and warehouse management, and can move goods autonomously to improve production efficiency. In the service field, AMRs can provide food delivery services in restaurants and guide customers in large shopping malls. Additionally, autonomous driving technology is increasingly used on roads. By integrating data from various sensors, such as cameras, radar, and ultrasonic sensors, automated driving technology enables vehicles to autonomously perform tasks such as navigation, environment sensing, and decision-making without human intervention. Accurate road detection and sign recognition enable autonomous vehicles to improve traffic safety and efficiency. The development of this technology not only changes the traditional transportation model, but also provides new opportunities for the construction of smart cities. However, the realization of efficient autonomous robots requires overcoming many technical challenges to ensure that the robot can reach the target location correctly and safely. Therefore, this study aims to build a small-sized mobile robot and implement road detection, sign recognition, and obstacle avoidance methods on this robot.
Road detection enables robots to autonomously recognize and follow roads, which is crucial for navigation in complex environments. In previous research, Ma'arif et al. [7] proposed an image-based approach to detect single-line contours but encountered difficulties with accuracy and at corners. Zhang et al. [8] proposed a low-cost vision-based road following system for autonomous mobile robots working in outdoor environments. The system combines color-based detection of the asphalt road surface with an Unscented Kalman Filter (UKF) to enhance robustness. While effective, this system may cause incorrect navigation on urban roads. Cáceres et al. [9] proposed a real-time lane detection algorithm that combines geometric and image features, dynamically adjusting the Region of Interest (ROI) based on vehicle speed. It performed well in complex lighting but struggled with strong illumination and shadow changes. In summary, these studies demonstrate different approaches and challenges in road identification and navigation technologies. Image-based techniques have significant advantages in road detection, and despite the challenges of strong light and shadow changes, these limitations can be overcome through improved algorithms and multi-sensor fusion to enhance the navigation capabilities of autonomous mobile robots in complex environments.
Sign recognition is one application of object recognition. Object recognition is an important part of robotics that enables robots to understand the information in the environment. Deep learning algorithms have proven effective for complex vision tasks [10,11,12]. There are four main branches of object recognition: image classification, object detection, instance segmentation, and semantic segmentation. Image classification is one of the most basic fields, which mainly utilizes Convolutional Neural Networks (CNNs) to recognize the whole image [13,14]. The more common algorithms include AlexNet [15], VGGNet [16], ResNet [17], etc. Object detection can further accurately identify the position of the object [18,19], usually marking a bounding box around it. Common algorithms mainly include YOLO [20,21] and SSD [22,23] for the one-stage method, and Fast R-CNN [24] and Faster R-CNN [25] for the two-stage method. Instance segmentation is a method that can recognize the location of objects [26,27] and accurately mark the outline of each object. Common algorithms include Mask R-CNN [28] and YOLACT/YOLACT++ [29,30]. Semantic segmentation [26,31] divides images into parts and classifies each part at the pixel level, using algorithms like DeepLab and FCN [32]. The field of object recognition is also developing rapidly, including real-time processing, multi-task learning, multi-modal learning, and transfer learning, and these methods can even be deployed on edge devices, enabling applications that require rapid responses. On the other hand, as these techniques are increasingly applied in sensitive areas, the need for interpretability is also growing, which remains an important issue to be addressed [33,34]. As for road sign recognition, Cui et al. [35] enhanced the YOLO model, improving precision and recall, particularly for small objects and complex scenes, thereby boosting the safety and environmental awareness of autonomous driving systems. In summary, object recognition technology plays a crucial role in robotics and autonomous driving. With the continuous development of deep learning technology, many advanced algorithms have demonstrated their powerful capabilities and wide application potential in various complex visual tasks.
In terms of obstacle avoidance, the focus is mainly on the robot's real-time identification and avoidance of obstacles, using various advanced sensors to collect information from the environment and applying either traditional or deep-learning-based algorithms to enable the robot to respond to obstacles [36]. Key research areas include sensing technology, environment sensing and data processing, path planning and decision-making, and multi-robot collaboration. Sensing technology employs RGB/RGB-D cameras, 2D/3D LiDAR, radar, ultrasonic sensors, and infrared sensors, often combined to leverage their respective strengths [37]. Environment sensing and data processing focus on data fusion, scene understanding, and real-time performance for accurate and reliable system operation. Dynamic obstacle avoidance path planning emphasizes real-time detection of and response to moving obstacles [38,39]. In this area, Dynamic Path Planning (DPP) covers the whole scope from recognizing moving obstacles to real-time path adjustment, and Global Path Planning (GPP) and Local Path Planning (LPP) are usually combined with it for complete path planning. Yang et al. [40] proposed a local path planning algorithm using high-precision GNSS/GPS for global path location, real-time dynamic sensing for environmental information, and B-spline curvature interpolation for smooth local paths, enhancing performance and stability. Guo et al. [41] proposed an AGV path planning algorithm that incorporates the improved A* algorithm and the Dynamic Window Approach (DWA) for obstacle avoidance. Meanwhile, the use of deep learning in obstacle avoidance is growing, with the related technologies still evolving. Escobar-Naranjo et al. [42] proposed a Deep Q-Network (DQN)-based approach for navigating an autonomous driving robot. Their experiments in a Gazebo simulation environment demonstrated significant improvements in the robot's navigation efficiency and accuracy. Therefore, by integrating advanced sensing technology with data processing, path planning, and decision-making, an obstacle avoidance system can achieve higher accuracy and reliability.
To summarize, with the rapid advancement of robotics technology, autonomous mobile robots are increasingly used in industrial, service, and academic fields. The aim of this study is to explore the applications of mobile robots in road detection, sign recognition, and obstacle avoidance. This study not only integrates the existing technologies but also provides more possibilities for the application of small-sized mobile robots.

2. Preliminary

2.1. Equipment

The 3D diagram and the chassis structure of the implemented small-sized mobile robot are shown in Figure 1. The size of the robot is 18 cm × 18 cm × 21 cm. The key hardware components are listed in Table 1. To fulfill all the computational performance requirements of this study, an Intel NUC and a Jetson AGX Xavier are used to give the robot good computing and graphics processing capabilities. The challenge of this design and implementation is to effectively arrange all the necessary hardware components within the limited space. Despite the larger size of the Jetson AGX Xavier, its high-performance GPU is an indispensable part of this design. Through meticulous layout, this study successfully integrates the Intel NUC, OpenCR, two XM-430 motors, batteries, a DC-DC voltage regulator, a 2D LiDAR, and two fisheye cameras into the compact structure of the small-sized robot. This setup satisfies all requirements for environmental data collection, algorithm computation, and motor control in this study. The Jetson AGX Xavier has exceptional image processing and deep learning capabilities, so it is selected as the primary computing unit for vision tasks. It supports Nvidia's Compute Unified Device Architecture (CUDA), enabling General-Purpose computing on Graphics Processing Units (GPGPU) on NVIDIA GPUs. This approach significantly accelerates processing by directly accessing the virtual instruction set and parallel computing elements of the GPU, offering performance several times, or even tens of times, faster than using a CPU alone.
Moreover, the small-sized robot is equipped with an RPLIDAR-A1 LiDAR on its top. The specification of RPLIDAR-A1 is listed in Table 2. With a resolution of less than 1 degree, it effectively provides 2D data points, significantly enhancing environmental perception capabilities. The high-precision obstacle detection offered by this LiDAR is crucial for obstacle avoidance, providing key data for this function.
The input/output relationships of the system architecture are shown in Figure 2. It primarily includes three components: sensors, data processing modules, and control systems. Two fisheye cameras and a 2D LiDAR transmit RGB images and point cloud data via USB, which are processed by the YOLO model, the line following system, and the virtual force field system to generate object information and target direction information. Data are transmitted through DDS and UART to the data integration module, where they are consolidated to produce road information. The control system receives the road information and sends control signals to the motors via UART, adjusting the motor speed to achieve autonomous navigation. Each component is deployed on hardware platforms such as the Jetson AGX Xavier, Intel NUC, and OpenCR, ensuring efficient system operation. The visual processing of the system is primarily handled by the Jetson AGX Xavier, which is connected via Universal Serial Bus (USB). The Jetson supports USB Video Class (UVC) and Linux V4L2 (Video for Linux 2), ensuring compatibility with most cameras. After receiving images from the cameras, object detection is conducted using the YOLOv4-tiny model, which is capable of quickly and accurately detecting all road markings. The Intel NUC is the main control platform, which uses ROS2's DDS as a communication bridge with the Jetson AGX Xavier and handles more complex computational tasks, such as road detection and data processing. It receives data from the LiDAR and the other fisheye camera. Leveraging the CPU's advantage in handling complex logic, control flows, and memory access, the NUC can provide these computational capabilities in real time. Finally, OpenCR, as the motor control center, communicates with the Intel NUC via UART. It receives all the processed data and converts the various data inputs into control signals to manage the two motors. In summary, the system architecture of this study successfully integrates complex computational tasks and data processing workflows. This diversified structure meets the requirements of the small-sized robot for both CPU and GPU resources, achieving excellent performance with commercially available equipment. Additionally, it demonstrates the capability of a small-sized robot to handle deep learning tasks.
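As a minimal sketch (not the actual implementation of this study), the following ROS2 nodes illustrate how detection results could be exchanged over DDS between the Jetson and the NUC. The topic name, message type, and publishing rate are illustrative assumptions.

```python
# Illustrative ROS2/DDS exchange: the Jetson side publishes the label of the most
# recently detected sign, and the NUC side consumes it for data integration.
# Topic name ('detected_sign'), message type, and rate are assumptions.
import rclpy
from rclpy.node import Node
from std_msgs.msg import String


class SignPublisher(Node):
    """Runs on the Jetson: publishes the latest detected sign label."""

    def __init__(self):
        super().__init__('sign_publisher')
        self.pub = self.create_publisher(String, 'detected_sign', 10)
        self.timer = self.create_timer(0.1, self.publish_latest)  # ~10 Hz
        self.latest_label = 'none'  # would be updated by the detection pipeline

    def publish_latest(self):
        msg = String()
        msg.data = self.latest_label
        self.pub.publish(msg)


class SignSubscriber(Node):
    """Runs on the NUC: feeds sign labels into the data integration step."""

    def __init__(self):
        super().__init__('sign_subscriber')
        self.create_subscription(String, 'detected_sign', self.on_sign, 10)

    def on_sign(self, msg):
        self.get_logger().info(f'sign: {msg.data}')


def main():
    rclpy.init()
    node = SignSubscriber()  # or SignPublisher() on the Jetson side
    rclpy.spin(node)
    rclpy.shutdown()
```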

2.2. Mechanical Model

The robot designed in this study is of the differential drive type, which means it moves by controlling the speed and rotation direction of its left and right wheels. A graphical diagram of this two-wheeled differential platform is shown in Figure 3. In this model, the velocity of the robot's center, denoted as $v_c$, is described by

$$ v_c = \frac{v_r + v_l}{2} \qquad (1) $$

where $v_l$ and $v_r$ are the velocities of the left and right wheels. The angular velocity of the robot's center, denoted as $\omega_c$, is expressed by

$$ \omega_c = \frac{v_r - v_l}{l} \qquad (2) $$

where $l$ is the distance between these two wheels. The turning radius $r$ of the robot can be expressed by

$$ r = \frac{v_c}{\omega_c} = \frac{l}{2}\,\frac{v_r + v_l}{v_r - v_l} \qquad (3) $$

The center point linear velocity and angular velocity can be expressed in terms of the wheel angular speeds by

$$ \dot{x}_c = \frac{r \cos\theta \,(\omega_r + \omega_l)}{2} \qquad (4) $$

$$ \dot{y}_c = \frac{r \sin\theta \,(\omega_r + \omega_l)}{2} \qquad (5) $$

$$ \dot{\theta} = \frac{r (\omega_r - \omega_l)}{2l} \qquad (6) $$

where $\omega_r$ and $\omega_l$ are the right wheel angular speed and the left wheel angular speed, respectively, and $r$ in Equations (4)–(6) denotes the wheel radius rather than the turning radius in Equation (3). Therefore, the x-direction and y-direction linear velocity models can be expressed by

$$ \dot{x} = v_c \cos\theta \qquad (7) $$

$$ \dot{y} = v_c \sin\theta \qquad (8) $$

where $v_c$ is the linear velocity of the robot center point, so the robot kinematics model can be expressed by

$$ \begin{bmatrix} \dot{x}_c \\ \dot{y}_c \\ \dot{\theta} \end{bmatrix} = \begin{bmatrix} \cos\theta & 0 \\ \sin\theta & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} v_c \\ \dot{\theta} \end{bmatrix} \qquad (9) $$

and the conversion relationship between $v_c$ and $\dot{\theta}$ is expressed by

$$ \begin{bmatrix} v_c \\ \dot{\theta} \end{bmatrix} = \begin{bmatrix} \dfrac{r}{2} & \dfrac{r}{2} \\ \dfrac{r}{2l} & -\dfrac{r}{2l} \end{bmatrix} \begin{bmatrix} \omega_r \\ \omega_l \end{bmatrix} \qquad (10) $$
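A minimal code sketch of these relations, as written above, is shown below. The wheel radius and wheel separation are illustrative values, not the implemented robot's actual parameters.

```python
# Sketch of the differential drive relations in Equations (9) and (10).
# R_WHEEL (wheel radius r) and L_BASE (wheel separation l) are assumed values.
import math

R_WHEEL = 0.033  # m, assumed wheel radius
L_BASE = 0.16    # m, assumed distance between the wheels


def wheel_to_body(omega_r, omega_l):
    """Equation (10): wheel angular speeds -> body linear and angular velocity."""
    v_c = R_WHEEL * (omega_r + omega_l) / 2.0
    theta_dot = R_WHEEL * (omega_r - omega_l) / (2.0 * L_BASE)
    return v_c, theta_dot


def body_to_pose_rate(v_c, theta_dot, theta):
    """Equation (9): body velocities -> world-frame pose rates (x_dot, y_dot, theta_dot)."""
    return v_c * math.cos(theta), v_c * math.sin(theta), theta_dot


# Example: right wheel faster than left -> forward motion while turning left.
v_c, theta_dot = wheel_to_body(omega_r=6.0, omega_l=5.0)
x_dot, y_dot, th_dot = body_to_pose_rate(v_c, theta_dot, theta=0.0)
```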

3. Methods

3.1. Road Detection

In this study, fisheye cameras are utilized to address the processing of road detection by achieving a wider field of view compared to traditional cameras. Fisheye cameras provide more environmental information, which is highly beneficial for the robot’s environmental perception. However, images captured by these cameras inherently exhibit distortion. In these images, straight lines in the real world appear curved, particularly towards the edges where significant arc-like distortions occur. Traditionally, various correction algorithms have been employed to rectify these distortions, such as the Straight Lines Spherical Perspective Projection Constraint (SLSPPC), the Ellipsoidal Function Model (EFM), and techniques based on three-view stitching. Considering the computational resource constraints in this study, a different approach was chosen: applying the algorithms directly to the uncorrected, distorted images. This approach requires the algorithm to handle the curved lines in the images, posing new design requirements. By acknowledging the distortion as an inherent feature of the fisheye camera images, this method incorporates it into the algorithmic processing. This reduces the computational load associated with distortion correction, enabling more efficient processing within the limitations of the available hardware.
The methodology adopted in this study is illustrated in Figure 4. The process begins by converting the captured image into the HSV (Hue, Saturation, Value) color space and applying a filter to highlight the features of road markings, as shown in Figure 4a,b. Following this, the Canny edge detection algorithm is used to extract the edge features from the image, as depicted in Figure 4c. In distorted images, the density of edge features can potentially decrease the performance of standard road detection algorithms. However, this study leverages the density as an advantage. The approach involves analyzing the direction and length of each small segment of edges in the image and converting them into a series of localized road cues. This method’s versatility lies in its ability to adjust the threshold values of the Canny algorithm based on the robot’s computational capabilities. By altering these thresholds, the number of extracted edges can be increased or decreased, thus finding a balance between accuracy and computational speed.
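A minimal OpenCV sketch of this preprocessing step (HSV filtering followed by Canny edge extraction) is given below. The HSV bounds and Canny thresholds are placeholder values, not the parameters tuned in this study.

```python
# Illustrative preprocessing for the pipeline in Figure 4: HSV filtering to
# highlight road markings, then Canny edge extraction. All numeric values are
# assumed; lowering the Canny thresholds extracts more edges, raising them fewer.
import cv2
import numpy as np

LOWER_HSV = np.array([0, 0, 180])     # assumed bounds for bright road markings
UPPER_HSV = np.array([180, 60, 255])
CANNY_LOW, CANNY_HIGH = 50, 150       # adjustable to balance accuracy and speed


def extract_marking_edges(bgr_image):
    """Return a binary edge image of the road markings in a BGR frame."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER_HSV, UPPER_HSV)
    return cv2.Canny(mask, CANNY_LOW, CANNY_HIGH)
```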
After extracting all the edges, the system divides the image into several analysis regions and uses Equation (11) to obtain the weight value of each region.
$$ gain = \begin{cases} 0.5 & \text{if } L_{yl} + L_{yr} \geq \lambda_1 \\ 0.3 & \text{if } L_{yl} + L_{yr} \geq \lambda_2 \\ 0.2 & \text{if } L_{yl} + L_{yr} \geq \lambda_3 \end{cases} \qquad (11) $$

where $L_{yl}$ and $L_{yr}$ are the average y-coordinates of the lane edge lines in each area, and $\lambda_1$ to $\lambda_3$ are the threshold values. Then, Equation (12) is applied to calculate the error value of each area. The calculated average value of each area is used to construct the final road trajectory. The edge data of these areas are weighted and summed to calculate a combined error value that indicates the target navigation direction of the robot in the current state.

$$ error_{image} = \sum_{i=0}^{n} gain_i \times \frac{Y_{li} + Y_{ri}}{2} \qquad (12) $$

In this study, the number of areas is defined as $n$, and $Y_{li}$ and $Y_{ri}$ are, respectively, defined as the y-coordinates of both ends of the road edge line in area $i$. This algorithm not only accounts for the characteristics of distorted images but also features an adaptive design that allows for parameter adjustments based on the robot's processing capabilities. By optimizing the number of edge detection regions, this study ensures the effective allocation of computational resources, avoiding inefficient consumption on excessive edge detection. This method effectively balances detection accuracy with computational efficiency, providing a robust road detection solution for the autonomous navigation system. In performance tests, an Intel NUC equipped with an i5-10210U processor was used as the computational core. With the input image divided into four processing regions, the system successfully achieved processing speeds of up to 25 fps. This rate demonstrates good efficiency in terms of computational load. Such speed is crucial for real-time path tracking in robotics, ensuring that the robot can promptly recognize road markings and avoid obstacles while in motion. In contrast, relying on outdated data for navigation significantly increases operational risks, potentially leading to inaccurate path tracking and delayed responses to obstacles. Therefore, this algorithm proves to be highly beneficial for small-sized mobile robots with limited computational performance, underlining its suitability and effectiveness for real-time autonomous navigation applications.
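The sketch below illustrates how Equations (11) and (12) could be computed from the edge image. The threshold values, the equal horizontal split into regions, and the left/right grouping of edge points are assumptions made for illustration, not the study's tuned settings.

```python
# Illustrative region-weighted error (Equations (11) and (12)): split the edge
# image into n horizontal regions, weight each region by Equation (11), and sum
# the weighted averaged y-values into one combined error. Thresholds are assumed.
import numpy as np

LAMBDA_1, LAMBDA_2, LAMBDA_3 = 600, 400, 200  # assumed thresholds (pixels)


def region_gain(l_yl, l_yr):
    """Equation (11): weight for one analysis region."""
    s = l_yl + l_yr
    if s >= LAMBDA_1:
        return 0.5
    if s >= LAMBDA_2:
        return 0.3
    if s >= LAMBDA_3:
        return 0.2
    return 0.0


def image_error(edge_image, n_regions=4):
    """Equation (12): weighted sum of per-region averaged edge y-coordinates."""
    height, width = edge_image.shape
    error = 0.0
    for i in range(n_regions):
        top = i * height // n_regions
        bottom = (i + 1) * height // n_regions
        ys, xs = np.nonzero(edge_image[top:bottom])
        if len(xs) == 0:
            continue  # no edge points in this region
        ys = ys + top  # convert band coordinates to full-image coordinates
        y_left = ys[xs < width // 2]
        y_right = ys[xs >= width // 2]
        l_yl = float(np.mean(y_left)) if len(y_left) else 0.0
        l_yr = float(np.mean(y_right)) if len(y_right) else 0.0
        error += region_gain(l_yl, l_yr) * (l_yl + l_yr) / 2.0
    return error
```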
In summary, the test results demonstrate the advantages of the small-sized mobile robot developed in this study in terms of processing speed and real-time performance. It has been found that this enhances the importance of efficient processing for the safe operation of autonomous mobile robots in complex environments. Future work will further explore the impact of different numbers of processing regions on system performance, as well as how to optimize the allocation of computational resources while ensuring real-time responsiveness.

3.2. Sign Recognition

In this study, the Jetson AGX Xavier equipped with a high-field-of-view fisheye camera was used as the image processing unit and was configured specifically to accelerate the YOLOv4-tiny model. Utilizing NVIDIA's CUDA and cuDNN libraries, YOLOv4-tiny significantly improves the speed and accuracy of road sign recognition under various conditions, including, but not limited to, strong sidelight, low light, and other unusual lighting interference. As demonstrated in Figure 5, YOLOv4-tiny consistently recognizes road signs in diverse environments, maintaining its performance even under extreme and challenging conditions. The robustness of the YOLOv4-tiny model in handling a wide range of lighting and environmental scenarios is crucial for the reliable functioning of autonomous navigation systems. It ensures that the system can operate effectively not just in ideal conditions but also in real-world scenarios where lighting and visibility can vary significantly.
To investigate the specific effects of data augmentation on the performance of YOLOv4-tiny models, a series of experiments were designed in this study to evaluate the effectiveness of various data augmentation methods. The experiments used a set of non-intersecting training and validation data and applied 8 types of data augmentation techniques built into YOLOv4-tiny: 1. Saturation, 2. Exposure, 3. Hue, 4. Blur, 5. Mosaic, 6. Gaussian, 7. Random Resizing, and 8. Cropping. By randomly combining these techniques, 28 different configurations of datasets were generated, as listed in Table 3.
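For illustration, the sketch below applies a few of the listed augmentation types (saturation, exposure, Gaussian noise, cropping, and blur) with OpenCV and NumPy. These are stand-ins for the augmentations built into the YOLOv4-tiny training pipeline, all ranges are assumed values, and bounding-box remapping after cropping is omitted.

```python
# Illustrative image augmentations (not the darknet built-ins) corresponding to
# some of the methods in Table 3; parameter ranges are assumptions.
import cv2
import numpy as np

rng = np.random.default_rng()


def jitter_saturation_exposure(bgr, sat_range=0.3, exp_range=0.3):
    """Randomly scale saturation and value (exposure) in HSV space."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV).astype(np.float32)
    hsv[..., 1] *= 1.0 + rng.uniform(-sat_range, sat_range)
    hsv[..., 2] *= 1.0 + rng.uniform(-exp_range, exp_range)
    return cv2.cvtColor(np.clip(hsv, 0, 255).astype(np.uint8), cv2.COLOR_HSV2BGR)


def add_gaussian_noise(bgr, sigma=8.0):
    """Add zero-mean Gaussian noise to every channel."""
    noise = rng.normal(0.0, sigma, bgr.shape)
    return np.clip(bgr.astype(np.float32) + noise, 0, 255).astype(np.uint8)


def random_crop(bgr, keep=0.9):
    """Crop a random window covering `keep` of each dimension (labels must be remapped)."""
    h, w = bgr.shape[:2]
    ch, cw = int(h * keep), int(w * keep)
    y0 = rng.integers(0, h - ch + 1)
    x0 = rng.integers(0, w - cw + 1)
    return bgr[y0:y0 + ch, x0:x0 + cw]


def random_blur(bgr, max_kernel=5):
    """Apply a Gaussian blur with a randomly chosen odd kernel size (1 = no blur)."""
    k = int(rng.choice([1, 3, max_kernel]))
    return bgr if k == 1 else cv2.GaussianBlur(bgr, (k, k), 0)
```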
The experimental results showed that the application of data augmentation does not always have a proportional relationship to model performance. As indicated in Table 4, some augmentation techniques, when used individually, have limited or even negative effects on the model’s improvement. This study found that specific combinations of data augmentation techniques could significantly enhance model performance, especially under limited data conditions. For instance, data augmentation methods involving size variation proved particularly effective, as they increase the model’s ability to recognize objects of various sizes, which is beneficial in situations where the amount of data is unknown.
Further analysis revealed a 52.14% improvement in model performance using all data augmentation techniques. However, the details of the analysis show that not all data augmentations will have positive effects, and some specific techniques may counteract others when applied together comprehensively. This indicates that choosing the appropriate data augmentation strategy is more important than simply increasing the number of augmentation techniques. Under the premise of limited data volume, the correct data augmentation method has a significant positive impact on the performance and robustness of the model. The experimental results in this study emphasize the effectiveness of using data augmentation to compensate for the minor performance differences between YOLOv4 and YOLOv4-tiny on devices with computational constraints.
Overall, this study shows that reasonable data augmentation can not only enhance the performance of the model when the amount of data is insufficient, but also significantly improve the robustness of the model in unknown environments. This finding is crucial for resource-constrained autonomous systems, as it shows that even lightweight models can achieve outstanding performance through the proposed data augmentation designs. In large-scale projects, this approach can effectively reduce costs and simulate unpredictable scenarios, improving practicality and keeping power consumption low while maintaining performance.

3.3. Obstacle Avoidance

In this study, a 2D LiDAR-based obstacle avoidance method is proposed. We use 2D LiDAR to finely sample the robot’s surrounding environment. In the collection of data points obtained by LiDAR, each point is treated as an independent entity, and they play an important role in path planning for obstacle avoidance. In order to improve data processing efficiency, a distance-based filtering mechanism is proposed and described by
$$ error_{obs} = gain_d \left( D_{max} - P_{dis} \right) + gain_a \left( A_{max} - P_{angle} \right) \qquad (13) $$
where $D_{max}$ is the maximum obstacle distance that is considered, $P_{dis}$ is the actual distance to the obstacle, $A_{max}$ is the maximum angle with respect to the obstacle that is considered, and $P_{angle}$ is the actual angle with respect to the obstacle. First, the data points whose distances exceed the default value $D_{max}$ are filtered out, excluding those that do not have a significant impact on the navigation decision. For the remaining data points, this study further considers the distance and angle of each point to calculate the error value with respect to the robot. Specifically, the current position of the robot is taken as the origin, and according to Equation (13), each valid data point is converted into an error value, which is weighted according to its distance and angle relative to the current position of the robot. This process is similar to a weighted summation, where each data point affects the moving direction of the robot.
It is further understood that data points closer to the robot and directly in front of it have a greater impact. Therefore, these points are assigned higher weights in the algorithm. It ultimately integrates all these weighted data points into a single value, which provides the robot with real-time directional guidance. It is worth noting that the correlation between data points is not a critical factor in the decision-making process, so the proposed method also reduces computational complexity. As shown in Figure 6, the original data points (red lines) collected by the LiDAR and the important data points (dark red lines) are filtered and used to calculate the error. The algorithm integrates these weighted data points into a comprehensive error evaluation value through Equation (13), so that the robot can effectively avoid obstacles and stay within the predetermined path. One experimental video can be viewed on this website: https://youtu.be/kbgB1MqsVRc (accessed on 22 July 2024). From the actual experimental result, we can see that the proposed 2D LiDAR-based obstacle avoidance method can allow the implemented robot to avoid obstacles autonomously.
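A minimal sketch of this mechanism is given below. The gains, $D_{max}$, and $A_{max}$ are assumed values, the angle is treated as a magnitude in Equation (13), and the way the per-point term is signed to steer away from the obstacle is an assumption made for illustration.

```python
# Illustrative distance-based filtering (Equation (13)): discard scan points
# beyond D_MAX or outside A_MAX, convert the rest into weighted error terms, and
# sum them into one steering value. All constants are assumed.
import math

D_MAX = 0.6                 # m, assumed filtering distance
A_MAX = math.pi / 2         # rad, assumed maximum angle considered
GAIN_D, GAIN_A = 1.0, 0.5   # assumed weights


def obstacle_error(scan):
    """scan: iterable of (distance_m, angle_rad) pairs, angle 0 = straight ahead."""
    error = 0.0
    for dist, angle in scan:
        if dist > D_MAX or abs(angle) > A_MAX:
            continue  # keep only the points that matter for the avoidance decision
        # Closer points and points nearer the heading contribute larger terms.
        term = GAIN_D * (D_MAX - dist) + GAIN_A * (A_MAX - abs(angle))
        # Assumed sign convention: steer away from the side the obstacle is on.
        error += -math.copysign(term, angle)
    return error
```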
In summary, an obstacle avoidance method based on 2D LiDAR is proposed in this section. Its distance-based filtering mechanism retains the important data points and assigns them appropriate weights, which effectively reduces the computational complexity and improves the robot's response speed when avoiding obstacles.

4. Results and Discussion

In this study, a highly integrated and efficient small-sized mobile robot is designed and implemented. It consists of a CPU (Intel NUC), a GPU (NVIDIA Jetson AGX Xavier), sensors (a 2D LiDAR and two fisheye cameras), a power system, and the proposed configuration of IO interfaces. The primary objectives of this study are to achieve three core functions of the small-sized robot: road detection, sign recognition, and obstacle avoidance.
For road detection, we divide the captured image into four areas and use the Intel NUC processor to perform the road detection calculations. The proposed method significantly reduces the load on the system. In addition to successfully and accurately detecting the lines on both sides of the road, the proposed method also has a high processing speed of 25 fps. In complex environments, this fast processing is critical for avoiding decisions based on delayed data. It ensures that the robot can interpret the latest images immediately and avoid obstacles in time.
For sign recognition, we use the YOLOv4-tiny model and use a data augmentation strategy to significantly improve the computing performance of this model. From a series of experiments, it can be seen that the mean Average Precision (mAP) of the used model has increased by 52.14%. Although YOLOv4-tiny is a simplified model of YOLOv4 in structure, the data augmentation strategy used in this study ensures its detection accuracy in complex environments. In addition, the high-speed processing capabilities of YOLOv4-tiny in this study can show that it has obvious advantages in object recognition for small-sized robots.
For obstacle avoidance, we propose a distance-based filtering mechanism, which can simplify the processing of data points collected by LiDAR. The proposed method can filter out some important data points and assign appropriate weights, which can effectively reduce the amount of data processing and improve the robot’s response speed to avoid obstacles.
As shown in Figure 7, testing on the Jetson AGX Xavier platform showed an average processing speed of 24.10 fps in 30 W mode and 21.90 fps in 15 W mode. There is not much difference in processing speed performance between the two. When the power supply is sufficient, the 30 W mode can significantly improve the overall efficiency of the system. However, taking into account power consumption and the operating speed of other programs, this study chose the 15 W mode. In conclusion, this study successfully implemented functions such as road detection, sign recognition, and obstacle avoidance on a small-sized robot. The system demonstrates strong resistance to environmental disturbances, proving its robustness. The results of this study have important implications for the design and application of small robots, especially in system architectures with limited resources. Through good data augmentation and algorithm optimization, even the highly simplified YOLOv4-tiny model can achieve high accuracy while maintaining high processing speed.
In future work, we plan to continue to expand and optimize the functionality and applications of the implemented small-sized mobile robot. The focus will include in-depth optimization algorithms, especially deep learning models, to enhance the accuracy and robustness of the system in more changing and complex environments. Advanced data augmentation techniques and adaptive algorithms will be explored to better adapt to dynamic environments, extreme conditions, and complex road obstacles. End-to-end learning will be one of the directions for improvement. In addition, sensor fusion techniques will be further explored by integrating various sensors such as LiDAR and RGB-D cameras to improve the accuracy of environmental perception and dynamic obstacle detection.

5. Conclusions

The main achievement of this study is the design and implementation of a small-sized mobile robot with a limited volume of 18 × 18 × 21 cm3. Under this limited volume, we selected the appropriate CPU and GPU to make the robot have good computing and graphics processing capabilities. In addition, a 2D LiDAR and two fisheye cameras were installed on the robot, and some algorithms were implemented to enable this small-sized robot to have three major functions: road detection, sign recognition, and obstacle avoidance. For road detection, the proposed method can significantly reduce the system load and have a high processing speed, which makes small-sized robots flexible and stable in road detection applications. For sign recognition, the YOLOv4-tiny model and data augmentation techniques used in this study can significantly improve recognition accuracy and effectively deal with the influence of the external environment. For obstacle avoidance, a 2D LiDAR-based obstacle avoidance method with a distance-based filtering mechanism is proposed, which can effectively reduce the computational complexity and improve the response speed of obstacle avoidance. All the proposed methods can process well in real time in the face of different environmental changes or interferences. Therefore, these methods can provide some references for the design and application of autonomous mobile robots, especially for small-sized robots with limited system resources.

Author Contributions

Conceptualization, C.-C.W., K.-D.W. and Y.-S.C.; methodology, K.-D.W. and B.-Y.Y.; validation, C.-C.W. and Y.-S.C.; analysis and investigation, K.-D.W. and B.-Y.Y.; writing—original draft preparation, K.-D.W. and B.-Y.Y.; writing—review and editing, C.-C.W.; visualization, K.-D.W. and B.-Y.Y.; project administration, C.-C.W.; funding acquisition, C.-C.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partly supported by the National Science and Technology Council (NSTC) of Taiwan, R.O.C., under grant number NSTC 112-2221-E-032-035-MY2.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Alatise, M.B.; Hancke, G.P. A review on challenges of autonomous mobile robot and sensor fusion methods. IEEE Access 2020, 8, 39830–39846. [Google Scholar] [CrossRef]
  2. Zghair, N.A.K.; Al-Araji, A.S. A one decade survey of autonomous mobile robot systems. Int. J. Electr. Comput. Eng. 2021, 11, 4891. [Google Scholar] [CrossRef]
  3. Loganathan, A.; Ahmad, N.S. A systematic review on recent advances in autonomous mobile robot navigation. Eng. Sci. Technol. Int. J. 2023, 40, 101343. [Google Scholar] [CrossRef]
  4. Amsters, R.; Slaets, P. Turtlebot 3 as a robotics education platform. In Proceedings of the Robotics in Education: Current Research and Innovations 10; Springer: Cham, Switzerland, 2020; pp. 170–181. [Google Scholar] [CrossRef]
  5. Guizzo, E.; Ackerman, E. The turtlebot3 teacher [resources_hands On]. IEEE Spectr. 2017, 54, 19–20. [Google Scholar] [CrossRef]
  6. Stan, A.C. A decentralised control method for unknown environment exploration using Turtlebot 3 multi-robot system. In Proceedings of the 2022 14th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), Ploiești, Romania, 30 June–2 July 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–6. [Google Scholar] [CrossRef]
  7. Ma’arif, A.; Nuryono, A.A. Vision-based line following robot in webots. In Proceedings of the 2020 FORTEI-International Conference on Electrical Engineering (FORTEI-ICEE), Yogyakarta, Indonesia, 24–25 September 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 24–28. [Google Scholar] [CrossRef]
  8. Zhang, H.; Hernandez, D.E.; Su, Z.; Su, B. A low cost vision-based road-following system for mobile robots. Appl. Sci. 2018, 8, 1635. [Google Scholar] [CrossRef]
  9. Cáceres Hernández, D.; Kurnianggoro, L.; Filonenko, A.; Jo, K.H. Real-time lane region detection using a combination of geometrical and image features. Sensors 2016, 16, 1935. [Google Scholar] [CrossRef] [PubMed]
  10. Soori, M.; Arezoo, B.; Dastres, R. Artificial intelligence, machine learning and deep learning in advanced robotics, a review. Cogn. Robot. 2023, 3, 54–70. [Google Scholar] [CrossRef]
  11. Zheng, X.; Liu, Y.; Lu, Y.; Hua, T.; Pan, T.; Zhang, W.; Tao, D.; Wang, L. Deep learning for event-based vision: A comprehensive survey and benchmarks. arXiv 2023, arXiv:2302.08890. [Google Scholar]
  12. DeSouza, G.N.; Kak, A.C. Vision for mobile robot navigation: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 237–267. [Google Scholar] [CrossRef]
  13. Masana, M.; Liu, X.; Twardowski, B.; Menta, M.; Bagdanov, A.D.; Van De Weijer, J. Class-incremental learning: Survey and performance evaluation on image classification. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 5513–5533. [Google Scholar] [CrossRef]
  14. Deepan, P.; Sudha, L. Object classification of remote sensing image using deep convolutional neural network. In The Cognitive Approach in Cloud Computing and Internet of Things Technologies for Surveillance Tracking Systems; Elsevier: Amsterdam, The Netherlands, 2020; pp. 107–120. [Google Scholar] [CrossRef]
  15. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 84–90. [Google Scholar] [CrossRef]
  16. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  17. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
  18. Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object detection in 20 years: A survey. Proc. IEEE 2023, 111, 257–276. [Google Scholar] [CrossRef]
  19. Wu, X.; Sahoo, D.; Hoi, S.C.H. Recent advances in deep learning for object detection. Neurocomputing 2020, 396, 39–64. [Google Scholar] [CrossRef]
  20. Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar] [CrossRef]
  21. Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 7464–7475. [Google Scholar] [CrossRef]
  22. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016, Proceedings, Part I 14; Springer International Publishing: Cham, Switzerland, 2016; pp. 21–37. [Google Scholar] [CrossRef]
  23. Li, Z.; Yang, L.; Zhou, F. FSSD: Feature fusion single shot multibox detector. arXiv 2017, arXiv:1712.00960. [Google Scholar] [CrossRef]
  24. Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1440–1448. [Google Scholar] [CrossRef]
  25. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149. [Google Scholar] [CrossRef]
  26. Minaee, S.; Boykov, Y.; Porikli, F.; Plaza, A.; Kehtarnavaz, N.; Terzopoulos, D. Image segmentation using deep learning: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 3523–3542. [Google Scholar] [CrossRef]
  27. Hafiz, A.M.; Bhat, G.M. A survey on instance segmentation: State of the art. Int. J. Multimed. Inf. Retr. 2020, 9, 171–189. [Google Scholar] [CrossRef]
  28. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 2961–2969. [Google Scholar] [CrossRef]
  29. Bolya, D.; Zhou, C.; Xiao, F.; Lee, Y.J. YOLACT: Real-time instance segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 9157–9166. [Google Scholar] [CrossRef]
  30. Bolya, D.; Zhou, C.; Xiao, F.; Lee, Y.J. YOLACT++ Better Real-Time Instance Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 44, 1108–1121. [Google Scholar] [CrossRef]
  31. Mo, Y.; Wu, Y.; Yang, X.; Liu, F.; Liao, Y. Review the state-of-the-art technologies of semantic segmentation based on deep learning. Neurocomputing 2022, 493, 626–646. [Google Scholar] [CrossRef]
  32. Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), 2018; Springer: Cham, Switzerland, 2018; pp. 801–818. [Google Scholar] [CrossRef]
  33. Linardatos, P.; Papastefanopoulos, V.; Kotsiantis, S. Explainable AI: A review of machine learning interpretability methods. Entropy 2020, 23, 18. [Google Scholar] [CrossRef]
  34. Longo, L.; Goebel, R.; Lecue, F.; Kieseberg, P.; Holzinger, A. Explainable artificial intelligence: Concepts, applications, research challenges and visions. In Proceedings of the International Cross-Domain Conference for Machine Learning and Knowledge Extraction; Springer International Publishing: Cham, Switzerland, 2020; pp. 1–16. [Google Scholar] [CrossRef]
  35. Cui, Y.; Guo, D.; Yuan, H.; Gu, H.; Tang, H. Enhanced YOLO Network for Improving the Efficiency of Traffic Sign Detection. Appl. Sci. 2024, 14, 555. [Google Scholar] [CrossRef]
  36. Cheng, C.; Sha, Q.; He, B.; Li, G. Path Planning and Obstacle Avoidance for AUV: A Review. Ocean Eng. 2021, 235, 109355. [Google Scholar] [CrossRef]
  37. Wenzel, P.; Schön, T.; Leal-Taixé, L.; Cremers, D. Vision-based mobile robotics obstacle avoidance with deep reinforcement learning. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 14360–14366. [Google Scholar] [CrossRef]
  38. Zhu, G.; Shen, Z.; Liu, L.; Zhao, S.; Ji, F.; Ju, Z.; Sun, J. AUV Dynamic Obstacle Avoidance Method Based on Improved PPO Algorithm. IEEE Access 2022, 10, 121340–121351. [Google Scholar] [CrossRef]
  39. Liu, C.-C.; Lee, T.-T.; Xiao, S.-R.; Lin, Y.-C.; Lin, Y.-Y.; Wong, C.-C. Real-time FPGA-based balance control method for a humanoid robot pushed by external forces. Appl. Sci. 2020, 10, 2699. [Google Scholar] [CrossRef]
  40. Yang, X.; Wu, F.; Li, R.; Yao, D.; Meng, L.; He, A. Real-time path planning for obstacle avoidance in intelligent driving sightseeing cars using spatial perception. Appl. Sci. 2023, 13, 11183. [Google Scholar] [CrossRef]
  41. Guo, T.; Sun, Y.; Liu, Y.; Liu, L.; Lu, J. An Automated Guided Vehicle Path Planning Algorithm Based on Improved A* and Dynamic Window Approach Fusion. Appl. Sci. 2023, 13, 10326. [Google Scholar] [CrossRef]
  42. Escobar-Naranjo, J.; Caiza, G.; Ayala, P.; Jordan, E.; Garcia, C.A.; Garcia, M.V. Autonomous Navigation of Robots: Optimization with DQN. Appl. Sci. 2023, 13, 7202. [Google Scholar] [CrossRef]
Figure 1. Description of the implemented small-sized mobile robot: (a) 3D diagram of the robot; (b) the chassis structure of the robot.
Figure 2. Input/output relationships of the system architecture.
Figure 3. Graphical diagram of a two-wheeled differential platform.
Figure 4. Image recognition of the road: (a) right mask status; (b) left mask status; (c) edge detection results; (d) final result.
Figure 5. Two results of the road sign recognition: (a) schematic diagram of detecting left turn road sign; (b) schematic diagram of actual detection situation.
Figure 6. Description of environmental data obtained by LiDAR and fisheye camera.
Figure 7. Average processing speed in 30 W and 15 W modes on the Jetson AGX Xavier platform.
Table 1. Key hardware components of the implemented robot.
Device | Processor/Graphics | Power
Nvidia Jetson AGX Xavier | NVIDIA Volta GPU with 64 Tensor cores | Max 30 W
Intel NUC10i5FNB | Intel® Core™ i5-10210U Processor | Max 25 W
OpenCR | STM32F746ZGT6 | Max 54 W
XM-430 | ARM CORTEX-M3 (72 MHz, 32-bit) | Max 27 W
Table 2. Specification of RPLIDAR-A1.
Essentials | Parameters
Measuring Range | 0.15 m–12 m
Sampling Frequency | 5.5 Hz
Angular Resolution | ≤1°
Power Consumption | 0.5 W
Table 3. List of data augmentation methods.
Number | Data Augmentation Methods
1 | None
2 | Saturation
3 | Hue
4 | Gaussian noise (Gaussian)
5 | Random
6 | Blur
7 | Crop
8 | Mosaic, Hue
9 | Blur, Gaussian
10 | Crop, Random, Mosaic
11 | Exposure, Hue, Random
12 | Exposure, Blur, Gaussian
13 | Random, Crop, Saturation
14 | Blur, Mosaic, Gaussian
15 | Saturation, Exposure, Hue
16 | Crop, Saturation, Random, Mosaic
17 | Hue, Blur, Mosaic, Random
18 | Exposure, Blur, Random, Crop
19 | Mosaic, Gaussian, Random, Crop
20 | Saturation, Hue, Blur, Gaussian
21 | Saturation, Hue, Blur, Random, Crop
22 | Exposure, Mosaic, Gaussian, Random, Crop
23 | Saturation, Exposure, Hue, Gaussian, Random
24 | Hue, Blur, Mosaic, Gaussian, Random, Crop
25 | Saturation, Exposure, Hue, Blur, Gaussian, Crop
26 | Saturation, Exposure, Hue, Blur, Mosaic, Gaussian
27 | Saturation, Hue, Blur, Mosaic, Gaussian, Random
28 | ALL
Table 4. Data augmentation model identification metrics.
Number | AP50 | AP55 | AP60 | AP65 | AP70 | AP75 | AP80 | AP85 | AP90 | AP95 | AP@50:5:95
1 | 16.49% | 12.07% | 8.45% | 3.54% | 1.51% | 0.20% | 0.06% | 0.00% | 0.00% | 0.00% | 4.23%
2 | 30.55% | 21.05% | 19.80% | 9.51% | 4.67% | 1.49% | 0.36% | 0.15% | 0.15% | 0.15% | 8.79%
3 | 18.96% | 13.52% | 7.58% | 3.88% | 1.57% | 0.06% | 0.00% | 0.00% | 0.00% | 0.00% | 4.56%
4 | 21.61% | 16.45% | 12.26% | 5.02% | 2.49% | 0.85% | 0.00% | 0.00% | 0.00% | 0.00% | 5.87%
5 | 68.8% | 64.48% | 47.93% | 42.18% | 27.3% | 17.89% | 9.39% | 4.75% | 0.65% | 0.48% | 28.39%
6 | 21.62% | 15.7% | 10.41% | 6.56% | 4.94% | 2.38% | 0.63% | 0.00% | 0.00% | 0.00% | 6.22%
7 | 71.19% | 67.94% | 60.70% | 50.07% | 35.86% | 23.00% | 15.96% | 10.64% | 3.14% | 0.00% | 33.85%
8 | 65.64% | 63.77% | 51.21% | 36.87% | 23.4% | 13.8% | 7.81% | 2.98% | 0.00% | 0.00% | 26.55%
9 | 16.50% | 16.50% | 14.93% | 13.86% | 10.51% | 3.08% | 1.18% | 0.04% | 0.04% | 0.00% | 7.66%
10 | 71.33% | 67.73% | 65.66% | 64.41% | 34.68% | 26.58% | 8.53% | 0.87% | 0.24% | 0.00% | 34.00%
11 | 84.36% | 77.53% | 71.94% | 65.87% | 40.06% | 20.73% | 7.08% | 1.54% | 1.43% | 0.00% | 37.05%
12 | 25.97% | 21.76% | 14.1% | 9.14% | 3.37% | 1.84% | 0.47% | 0.07% | 0.00% | 0.00% | 7.67%
13 | 77.32% | 72.95% | 67.97% | 62.18% | 41.7% | 26.28% | 12.29% | 4.23% | 1.18% | 0.00% | 36.61%
14 | 64.46% | 63.1% | 56.48% | 44.02% | 31.29% | 22.64% | 12% | 8.82% | 3.43% | 0.11% | 30.64%
15 | 23.17% | 17.75% | 11.4% | 7.45% | 5.06% | 3.39% | 3.26% | 1.72% | 0.03% | 0.00% | 7.32%
16 | 70.11% | 65.6% | 60.7% | 54.94% | 42.02% | 25.75% | 17.37% | 5.63% | 1.83% | 0.00% | 34.40%
17 | 74.73% | 72.88% | 64.68% | 56.10% | 45.95% | 37.04% | 16.38% | 5.87% | 0.24% | 0.00% | 37.39%
18 | 74.73% | 72.88% | 64.68% | 56.10% | 45.95% | 37.04% | 16.38% | 5.87% | 0.24% | 0.00% | 37.39%
19 | 70.00% | 65.04% | 56.77% | 46.81% | 35.95% | 27.87% | 13.95% | 5.99% | 0.47% | 0.19% | 32.30%
20 | 41.92% | 23.91% | 13.78% | 9.85% | 8.2% | 5.39% | 0.71% | 0.00% | 0.00% | 0.00% | 10.38%
21 | 78.21% | 73.79% | 55.7% | 45.88% | 13.71% | 30.29% | 7.84% | 1.43% | 0.00% | 0.00% | 30.69%
22 | 74.54% | 63.54% | 59.02% | 51.48% | 37.56% | 18.04% | 6.68% | 3.38% | 0.08% | 0.00% | 31.43%
23 | 78.50% | 71.6% | 66.33% | 43.72% | 30.93% | 21.41% | 9.41% | 2.04% | 0.44% | 0.00% | 32.44%
24 | 62.34% | 58.51% | 43.77% | 29.51% | 15.19% | 9.01% | 4.40% | 1.55% | 0.00% | 0.00% | 22.43%
25 | 79.17% | 66.27% | 55.98% | 51.78% | 34.79% | 19.99% | 10.83% | 1.55% | 0.00% | 0.00% | 32.04%
26 | 74.19% | 71.16% | 63.62% | 53.87% | 42.43% | 26.35% | 14.44% | 10.20% | 2.02% | 0.00% | 35.83%
27 | 66.87% | 66.02% | 60.52% | 53.55% | 43.26% | 26.14% | 11.73% | 3.19% | 1.88% | 0.00% | 33.32%
28 | 68.63% | 57.54% | 55.40% | 43.02% | 28.17% | 8.45% | 3.07% | 0.45% | 0.06% | 0.00% | 26.48%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
