*4.3. Multi-Robot Experiments*

Building on the single-robot navigation results, this section presents the multi-robot navigation experiments, in which the proposed Parallel Deep Deterministic Policy Gradient (PDDPG) algorithm is evaluated and analyzed. Following the curriculum learning setup, the experiments are organized in two stages: formation construction and collaborative navigation.

**Figure 4.** Trajectory of the mobile robot in the single-robot mapless navigation tasks. Left: the navigation environment in Gazebo. Right: the trajectory of the mobile robot.
