1. Introduction
The advancement of artificial intelligence, especially deep learning, coupled with the recent development of the Internet of Things, has once again drawn great public attention. Robots with autonomous learning capabilities have become one of the most important trends in robotics worldwide. However, the performance of a robot operating in a real environment, no matter how well it is designed, is sometimes quite different from the performance expected in its design environment. To minimize this difference, so-called “smart systems” should possess a certain degree of self-correction or self-learning capability. To address this problem, most traditional artificial intelligence approaches must carefully consider and evaluate all possible situations in real-world operation, to ensure that everything remains under control. Under such conditions, a carefully designed system should indeed perform as expected, since every problem related to the operational environment will have been clearly defined in advance. However, most traditional artificial intelligence systems, built on this “rigid appeal”, are realized as computer programs. That is, in terms of programming, roboticists must precisely specify the required objects (or “symbols”) and the rules (algorithms) that operate on them. A notable problem is that, when the code of a computer system is modified even slightly, the results (system functions) it produces may be quite unpredictable.
Higuchi et al. [
1] therefore emphasized that people should think about how to add plasticity characteristics into a computer system, to increase its malleability. Similarly, Vassilev et al. [
2] put forward the so-called “neutral principle”, which states that, when adding a number of similar functions into a system, the possibility of providing solutions increases, as well. Thompson and Layzell [
3] suggested that relative adaptability might be improved if strict requirements on computer systems could be appropriately relaxed.
Compared with computer systems, biological systems exhibit relatively better “adaptation” capability, as they are more capable of continuing to function (or operate) in an uncertain or even unknown environment. This is because the fitness of organisms can be thought of as presenting a relatively gentle surface; that is, their function (fitness) changes gradually when their structure changes slightly.
The artificial neuromolecular (ANM) system [
4,
5,
6], a biologically motivated information processing architecture, possesses a close “structure/function” relation. Unlike most artificial neural networks, its emphasis is on intra-neuronal dynamics (i.e., information processing within neurons). Through evolutionary learning, we have shown that the close “structure/function” relation facilitates the ANM system in generating sufficient dynamics for shaping neurons into special input/output pattern transducers that meet the needs of a specific task [
4,
5].
In this study, the ANM system was linked with the navigation problem of a snake-like robot. It should be noted that our goal was not to build a state-of-the-art snake-like robot but to use it as a test domain for investigating the perpetual learning capability of the system. More importantly, we are primarily interested in addressing the issue of learning in an uncertain or unknown environment, the so-called adaptability problem proposed by Conrad [
7]. An ambiguous or ill-defined problem domain would fall into the category of an unknown or partially known environment. A noisy or changing domain would fall into the category of a statistically uncertain or unpredictable environment.
Previously, the research team of this group used a quadruped robot controlled by the ANM system, to address the issue of adaptability [
6]. The movement of the robot was sometimes uncertain, as the robot was composed of wooden strips held together with simple disposable chopsticks. Apart from the robot itself, controlling the robot to move forward or to make a left/right turn was also uncertain, due to variable friction between the robotic legs and the test ground. Another constraint was that each leg could only produce a limited front and rear swing. The only way for the robot to generate appropriate movements was to constantly change its center of gravity during the movement process. These changes in the center of gravity over time could not be determined by a simple prior calculation. The sum of all these constraints made it infeasible for system designers to find solutions to the movement problems in advance, even through careful design and analysis.
In nature, some animals can move quickly on land and in the water, without feet, which is hard for humans to imagine and understand. What is even more surprising is that these animals show considerable adaptability to different environments and terrains. Because of this, some scholars have put considerable effort into understanding the snake’s movement and trying to create a robot that mimics it. The steps involved are to first study the snake’s motion curve and then to apply mechanical operation principles to create snake-type motion [
8]. These steps include deriving the velocities of different snake segments to perform rectilinear motion [
9,
10,
11], using the position of the motor and leaning against the rotation of the wheel to cause the snake-shaped robot to move forward [
12], and using Watt-I planar linkage mechanism to control a biped water-running robot to generate propulsion force [
13]. In short, most of the above research involves designing and calculating the operation of snake-type robots by studying the operating principle and motion patterns of these robots. Moreover, as pointed out by Liljebäck et al. [
14], the majority of literature on snake robots so far has focused on locomotion over flat surfaces. However, in the real world, there are many unidentified or unknown external factors (noise). Small changes in these factors can cause a well-designed system to become completely inoperable. Shan and Koren [
15] have proposed a motion planning system for a mechanical snake robot to move in cluttered environments, without avoiding obstacles on its way; the robot would instead “accommodate” them by continuing its motion toward the target, while being in contact with the obstacles.
Unlike the above study from our group, here, through an autonomous learning mechanism, we trained a snake-type robot (herein referred to as
Snaky) to learn how to complete the specified snake mission. The core learning mechanism (
Snaky’s brain) is the artificial neuromolecular (ANM) system, developed by one of the authors [
4] decades ago. We note that this robot has eight joints and four control rotary motors. Initially, the ANM system randomly generates different parameter sets for motor control. For each learning session, the system evaluates the performance of each set, selects some of the best-performing sets, and finally copies from the best-performing sets to lesser-performing sets, with some alterations. The learning continues until the robot completes the assigned tasks or is stopped by the system developer. It should be noted that
Snaky is definitely not a high-precision robot; that is, given the same input, its output behavior may differ from run to run. However, we must emphasize that the goal of this research was not to build a state-of-the-art snake-type robot. Instead, we used the uncertainty of the robot, as well as its interaction with the environment, as a test bed for studying the continuous optimization problem in the ANM system. Basically, if the entire system is capable of learning in a continuous manner as the complexity of the assigned task increases, we can call it a “success”.
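The learning cycle described above (generate random parameter sets, evaluate, select the best, copy with alterations) can be sketched as a simple evolutionary loop. The population size, mutation magnitude, and angle range below are illustrative assumptions, not the actual ANM settings:

```python
import random

NUM_MOTORS = 4            # four horizontally controlled rotary motors
POP_SIZE = 8              # illustrative population size (assumption)
ANGLE_RANGE = (-90.0, 90.0)  # assumed motor limits, in degrees

def random_param_set():
    """One candidate: a rotation angle for each motor."""
    return [random.uniform(*ANGLE_RANGE) for _ in range(NUM_MOTORS)]

def learn(evaluate, generations=64):
    """evaluate(param_set) -> fitness; lower is better (e.g., distance to target)."""
    population = [random_param_set() for _ in range(POP_SIZE)]
    for _ in range(generations):
        ranked = sorted(population, key=evaluate)
        best = ranked[: POP_SIZE // 2]
        # Copy the best-performing sets over the lesser-performing ones,
        # with small random alterations.
        population = best + [
            [a + random.gauss(0, 5.0) for a in random.choice(best)]
            for _ in range(POP_SIZE - len(best))
        ]
    return min(population, key=evaluate)
```

In practice, `evaluate` would run Snaky with the candidate angles and measure the resulting distance to the target; the loop stops after a fixed number of generations or when the task is completed.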
Section 2 introduces the architecture of the proposed model, the evolutionary learning mechanisms, our application domain, and the input–output interface.
Section 3 discusses the experimental results. The final section presents the concluding remarks.
4. Experiments
Two types of experiments were performed in this study. The first asked
Snaky to move to a certain target position. In this study, three independent tasks were performed separately. One asked
Snaky to move to position
L (i.e., move to the left), another to move to position
M (i.e., move forward), and the other to move to position
R (i.e., move to the right). The second asked
Snaky to move to a certain target position, as described above, but with some limitations on the range of motor rotation angle. Our intuition is that the second experiment would be comparatively more difficult than the first one because the possible solutions that the ANM system could explore are relatively limited when more constraints are put on the system. Because of this, in the second type of experiment, the distances of the target positions were comparatively shorter than those in the first experiment. Consequently, we also had the robot perform three independent tasks in this second type of experiment. One asked the robot to move to position
l (i.e., move to the left), the second to move to position
m (i.e., move forward), and the third to move to position
r (i.e., move to the right).
Figure 8 shows the positions of the above six target locations in a two-dimensional space. For all the above experiments, the fitness of the system is based on the distance between the snake robot and its designated target: the shorter the distance, the better the system’s performance.
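This fitness measure can be written directly as the straight-line distance between the robot's final position and the target; the `(x, y)` coordinate representation is an assumption for illustration:

```python
import math

def fitness(robot_pos, target_pos):
    """Lower is better: Euclidean distance from Snaky's final position
    to the designated target, both given as (x, y) tuples in the
    two-dimensional test space (assumed units)."""
    dx = robot_pos[0] - target_pos[0]
    dy = robot_pos[1] - target_pos[1]
    return math.hypot(dx, dy)
```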
4.1. General Learning
The goal of this experiment was to train
Snaky to move to each of these three assigned positions (i.e., positions
L,
M, and
R in
Figure 8) independently.
Figure 9 shows that the learning performance of
Snaky varied as learning proceeded. There are two reasons for this behavior. One is that
Snaky itself is not a highly accurate machine. The other is that
Snaky’s test floor was not a completely flat platform but instead a lattice with some kind of tidal groove. Based on the abovementioned reasons, the contact friction between
Snaky and the floor was somewhat uncertain. Therefore, for each run of the experiment in the learning process, the resulting output may differ, even if all variables were controlled under the same conditions. However, even though the performance oscillates, it falls mostly within a certain range of variation. Most importantly, the learning curve of the robot gradually improves over time. This implies that the system can not only overcome the noise but also learn continuously. For each set of motor rotation angles obtained from the above experimental results, we repeated the test five times, to check the similarity of
Snaky’s movement trajectories. The results (
Figure 10) show that when
Snaky uses the same set of motor rotation angles, it can advance toward its assigned target and produce a similar movement trajectory.
In the following, we analyze the rotation angles of the four motors obtained from each of the above experimental results.
Table 2 lists the rotation angle of each motor used by
Snaky toward each of the three designated positions after 64 generations of learning. If the angle of rotation was negative, the motor was turned to the left (only four of the horizontally controlled motors in the eight joints of the snake-type robot were used). In contrast, if it was positive, the motor was turned to the right. The experimental result shows that
Snaky will advance toward the designated target with a combination of relatively large motor rotation angles. In other words, it will rotate at a large angle in one direction and then at a large angle in the opposite direction. As shown in
Table 2, in the turning left task,
Snaky first turned to the right at a large angle and then turned to the left at a large angle. The above results are very similar to the results obtained from the task where
Snaky turned to the right (where
Snaky turned left at a large angle and then right at a large angle). This result can be explained by the fact that
Snaky takes a similar approach to the motion of swaying to produce forward motion. The most interesting result is the forward-target task. The results show that
Snaky uses two consecutive “first right to left” cross-angle rotations. This can be explained by the fact that it swings left and right at large angles to produce forward motion, while using almost equal left and right swing angles to keep the forward movement unbiased.
As mentioned earlier, for each run, we allowed
Snaky to complete the four operational steps of the action and then evaluated its mobile performance (i.e., learning performance). The practical approach taken was to sequentially assign the rotation angles of the four motors obtained via the ANM system to the individual motors of each operational step in a rolling manner. When we combine the results of the four motor rotations assigned to the four operating steps in turn, the results are even more impressive. For example, in the case of moving toward the
L position, at
S1,
M1 first approaches the right turn at approximately 80 degrees, and then
M2 and
M3 turn left by approximately 80 degrees. As a result of the above combination, as shown in
Figure 11,
Snaky changes with relatively large angles, to produce a moving action similar to the English letter
V. Similarly, at
S2,
M2 first rotates at a large angle of approximately 80 degrees with a left turn and then
M3 and
M4 rotate to the right at an approximately 80-degree angle. The result of the combination of
S1 and
S2 is a large left turn, followed by a large right turn, to produce a movement and balance effect. Similarly, from
S3 to
S4,
Snaky first generates a large angle right turn with the
M1 and
M2 motors and then mixes
M3 and
M4 to turn left, at a large angle, to form a final left turn movement. The combination of the rotation angles of all four motors (
M1,
M2,
M3, and
M4) forms an inverted V-shape.
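The “rolling manner” of assigning the four evolved angles to the motors across the operational steps S1–S4 can be sketched as a rotation of the angle list, so that the same set of rotations travels along the robot's body. This is an illustrative reading of the scheme described above, not the exact ANM controller logic; the sign convention (negative = left turn, positive = right turn) follows Table 2:

```python
def rolling_schedule(angles, num_steps=4):
    """Distribute one evolved set of four motor angles (M1..M4) over the
    operational steps by rotating the list one position per step
    (illustrative reading of the 'rolling manner' described in the text)."""
    schedule = []
    for step in range(num_steps):
        # At step k, motor Mi receives the angle originally at position i+k.
        rotated = angles[step:] + angles[:step]
        schedule.append(rotated)
    return schedule
```

For example, a set that turns M1 right and M2, M3 left at step S1 would, at S2, turn M2 left and M3, M4 in the following positions of the rotated list, matching the step-by-step pattern described for the L task.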
Another task in this study asked Snaky to move toward the R position. The results show that, similar to the previous task, training through the ANM system caused M1, M2, M3, and M4 to produce relatively large rotation angles in one direction and then opposite angles in the other direction. The difference between the movements for the L and R tasks is that Snaky turns to the left first at Step 1 (S1), then generates a right turn at S2 and S3, and finally turns slightly left at S4 (note: the result of correcting an excessive right turn). In terms of overall operational action, Snaky produces a right-turning motion.
The last task required Snaky to move straight toward the front M position. As a result, it is shown that the four motors (M1 to M4) form an uppercase English letter N-shape at S1, an inverse N-shape at S2, an N-shape at S3, and an inverse N-shape at S4. This result shows that Snaky uses an N-shape and an inverse N-shape in an interlaced way, to generate forward motion.
4.2. Constrained Learning
Unlike the above experiment, the goal of this experiment was to train
Snaky to move to the assigned locations with limitations on the range of motor rotation angles. Our intuition was that this experiment would be more difficult than the first one, because the space of solutions that the ANM system can search is relatively limited. In this experiment, independent tasks were performed to reach three different target locations (i.e., positions
l,
m, and
r), as shown in
Figure 8.
Similar to the first experiment, the learning performance oscillates but falls mostly within a certain range of variations in the learning process, even if all variables are controlled under the same conditions (
Figure 12). A particularly striking difference from the previous experimental results can be seen in the task where
Snaky is required to move forward.
Figure 12 shows that its performance measure (the distance to the target) decreases at a relatively slow rate. When we compare the distances actually moved, it can be clearly seen that, when the snake’s motor rotation angle is limited, it is relatively difficult to learn how to reach the target, because of the small range of solutions that can be searched. However, what is certain is that the system can still learn continuously. In the process of learning, the performance of the system rises and falls, but as the number of learning cycles increases, it shows a trend of continuous improvement. Overall, through autonomous learning, the ANM system can find a set of rotation angles, coordinated across the four motors, to move the snake robot to a specified target point.
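Restricting the search space as in this experiment amounts to keeping every candidate angle inside the allowed range during the evolutionary alterations. The ±45-degree bound below is an assumption for illustration; the limit actually used in the experiment is not stated here:

```python
import random

ANGLE_LIMIT = 45.0  # assumed constraint on motor rotation, in degrees

def clamp(angle, limit=ANGLE_LIMIT):
    """Keep a single motor angle inside the allowed range."""
    return max(-limit, min(limit, angle))

def mutate_constrained(angles, sigma=5.0):
    """Alter a parameter set while keeping every motor angle within
    the constrained range, shrinking the space the system can explore."""
    return [clamp(a + random.gauss(0, sigma)) for a in angles]
```

Because clamping collapses any out-of-range proposal onto the boundary, the effective solution space is much smaller than in the first experiment, which is consistent with the slower learning observed here.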
Table 3 lists the rotation angle for each motor used by
Snaky to reach each of the three designated positions after 64 generations of learning. As with the first experiment, we tested the motor rotation angles five times for each task through the ANM system. The results show that, when
Snaky uses the same rotation angles, it can advance toward its assigned target and produce a similar movement trajectory each time (
Figure 13).
When the range of motor angles is limited, the experimental results (
Figure 14) show that the four motors (
M1,
M2,
M3, and
M4) are rotated in a manner similar to the English letter
N,
V, or
U (as well as in an inverse
N-, inverse
V-, or inverse
U-shape). When the task involves moving toward position
l, from
S1 to
S4, the combination of the rotation angles for the four motors (
M1,
M2,
M3, and
M4) forms an uppercase English letter
N-shape or an inverse
N-shape.
In contrast, when moving toward position m or toward position r, the combination of angles consists of N-, V-, or U-shaped rotations (including inverse N-, V-, or U-shapes). In other words, the way the robot moved did not resemble the way it moved in the previous experiment.
5. Conclusions
In recent years, the use of “machine learning” in processing big data has drawn widespread attention to artificial intelligence. Autonomous learning plays a very important role in this field, especially for problems that are difficult to solve with systematic algorithms. This study used a snake-type robot to explore a ground motion problem for which a solution is difficult to obtain in advance with a systematic algorithm. First, Snaky is not a high-precision robot, and the rotation angle of each motor has some degree of error; second, the floor on which Snaky performed its tasks was not flat. During each movement, it came into contact with surfaces of different resistance, depending on the positions it moved over. Because of this, the resistance a snake-type robot encounters in contact with the ground is not uniform but varies over time. Combining these two factors, this study emphasizes again that Snaky itself and its interaction with the environment involve considerable uncertainty and interference, which provides a suitable experimental platform for exploring the issue of autonomous learning.
This study applied a molecular-like neural system with an evolutionary learning algorithm to allow a snake-like robot to search, in a self-learning manner, for the motor rotation angles needed to move toward a target point. Note that the rotation angle of each motor must be matched to the angles of the other three motors in order to produce effective movement. The overall experimental results show that, through autonomous learning, the snake-type robot can learn to reach the target in a continuous manner and can use different motion combinations to reach more distant targets. The preliminary results also demonstrate that different combinations of motions can create alternative paths to locations at equal distances from the starting point, or satisfy objectives of different levels. For the topic of continuous learning, this research can be extended in the future by increasing the difficulty of the test environment, such as increasing the slope of the ground or changing its flatness (regularly or irregularly). This research should also explore how the system can overcome, through self-learning, the malfunction of one or more of its motors.