Review

A Review on Inverse Kinematics, Control and Planning for Robotic Manipulators With and Without Obstacles via Deep Neural Networks

by Ana Calzada-Garcia *,†, Juan G. Victores *,†, Francisco J. Naranjo-Campos and Carlos Balaguer
RoboticsLab, Systems and Automation Engineering Department, University Carlos III of Madrid, 28911 Leganés, Spain
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Algorithms 2025, 18(1), 23; https://doi.org/10.3390/a18010023
Submission received: 16 December 2024 / Revised: 31 December 2024 / Accepted: 2 January 2025 / Published: 4 January 2025
(This article belongs to the Special Issue Optimization Methods for Advanced Manufacturing)

Abstract

Robotic manipulators are highly valuable tools that have become widespread in industry, as they can achieve great precision and velocity in pick-and-place as well as processing tasks. However, to unlock their complete potential, some problems, such as inverse kinematics (IK), need to be solved: given a Cartesian target, a method is needed to find the right configuration for the robot to reach that point. Another issue that needs to be addressed when dealing with robotic manipulators is the obstacle avoidance problem. Workspaces are usually cluttered, and the manipulator should be able to avoid colliding with objects that could damage it, as well as with itself. Two alternatives exist to do this: a controller can be designed that computes the best action at each moment given the manipulator's state, or a sequence of movements can be planned for the robot to execute. Classical approaches to all these problems, such as numeric or analytical methods, can produce precise results, but they require long computation times and do not always converge. Learning-based methods have gained considerable attention in tackling the IK problem, as well as motion planning and control. These methods can reduce the computational cost and provide solutions in every situation while avoiding singularities. This article presents a literature review of the advances made in the past five years in the use of Deep Neural Networks (DNNs) for IK, as well as for control and planning with and without obstacles, for rigid robotic manipulators. The literature has been organized into several categories depending on the type of DNN used to solve each problem. The main contributions of each reference are reviewed, and the best results are presented in summary tables.

1. Introduction

A robotic manipulator can be defined as a series of links connected by joints [1]. Manipulators are commonly used in pick-and-place or processing tasks and can be governed directly by an operator or by a logical device. They are used throughout industry in a variety of sectors, such as manufacturing, agriculture, health and medicine, space exploration, military and defense, and service.
One of the main challenges regarding robotic manipulators is solving the inverse kinematics (IK) problem: given a desired pose for the end-effector $X_d \in SE(3)$, find joint space positions $\theta$ which achieve said pose. This problem may have multiple solutions, a unique solution, or no solution at all [2]. Furthermore, some of the solutions could lead the robot to reach a singularity, causing operational issues. The problem can become more complicated if there are obstacles present in the workspace. In this case, given a desired pose for the end-effector $X_d \in SE(3)$ and a set of constraints, a sequence of configurations can be devised for the manipulator to follow so that it reaches the target without colliding (motion planning [3]). The alternative to motion planning is to design a controller so that, given the current state of the manipulator and the obstacles, the best immediate configuration is chosen at each step, bringing the robot closer to the target while keeping it away from the obstacles (motion control [4]).
The IK problem can be classically solved using analytical, numerical, or meta-heuristic algorithms. Analytical solutions are generally the most precise, and advances are still being made in this field to find new methods [5,6,7]. Analytical IK solutions compute faster than numerical approaches and are real-time capable, but they are only available for specific robot kinematics. Numerical methods can be divided into three categories: Jacobian, Newton, and heuristic methods [8]. They are the preferred option when dealing with redundant manipulators. However, they often reach singularities and have high computational costs (except for heuristic methods, which tend to be faster but cannot adapt to global constraints and sometimes end in unnatural positions). New advances regarding numerical methods can be seen in [9,10]. Finally, meta-heuristic algorithms [11], such as genetic algorithms [12] and particle swarm optimization [13], represent another possible means to solve the IK problem. However, these methods are unsuitable for dynamic environments and require high computation time.
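For illustration, the Jacobian-based numerical iteration mentioned above can be sketched for a toy planar arm. This is a minimal damped least-squares example, not the method of any specific cited work; the arm model, damping factor, and tolerance are illustrative assumptions.
```python
import numpy as np

def fk(theta, link_lengths):
    """Forward kinematics of a planar serial arm: end-effector (x, y)."""
    angles = np.cumsum(theta)
    return np.array([np.sum(link_lengths * np.cos(angles)),
                     np.sum(link_lengths * np.sin(angles))])

def jacobian(theta, link_lengths):
    """Analytical 2xN Jacobian of the planar arm."""
    angles = np.cumsum(theta)
    J = np.zeros((2, len(theta)))
    for i in range(len(theta)):
        # Joint i influences every link from i onward.
        J[0, i] = -np.sum(link_lengths[i:] * np.sin(angles[i:]))
        J[1, i] = np.sum(link_lengths[i:] * np.cos(angles[i:]))
    return J

def dls_ik(target, theta, link_lengths, damping=0.05, iters=200):
    """Damped least-squares (Levenberg-Marquardt style) IK iteration."""
    for _ in range(iters):
        error = target - fk(theta, link_lengths)
        if np.linalg.norm(error) < 1e-5:
            break
        J = jacobian(theta, link_lengths)
        # Damping keeps the update bounded near singular configurations.
        JJt = J @ J.T + damping**2 * np.eye(2)
        theta = theta + J.T @ np.linalg.solve(JJt, error)
    return theta

theta = dls_ik(np.array([1.2, 0.8]), np.zeros(3), np.ones(3))
```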
Motion control takes the manipulator and environment state as an input and decides the best immediate action according to a function. In the presence of obstacles, this function will contain the distance to the target as an objective to minimize and the distance to the obstacles as an objective to maximize. Motion control for obstacle avoidance can be achieved by means of the Jacobian pseudo-inverse [14]. On the downside, this method can lead to singularities and relies on a distance algorithm such as [15]. Artificial potential fields [16] represent another option, which consist of a set of forces that attract the manipulator toward the goal and repel it from obstacle regions. They can be used in real-time controllers. A more recent alternative is the Jacobian transpose [17], which does not need to calculate inverse Jacobian matrices or to know the dynamics of manipulators.
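The artificial potential field idea can be illustrated with a minimal sketch operating directly in Cartesian space; the gains, influence radius, and step size are illustrative assumptions, and mapping the resulting motion back to joint space (e.g., via the Jacobian) is omitted for brevity.
```python
import numpy as np

def potential_field_step(x, goal, obstacles, k_att=1.0, k_rep=0.5,
                         rho0=0.4, step=0.05):
    """One control step: attractive pull toward the goal plus a repulsive
    push from every obstacle closer than the influence radius rho0."""
    force = k_att * (goal - x)                       # attractive term
    for obs in obstacles:
        diff = x - obs
        rho = np.linalg.norm(diff)
        if 0 < rho < rho0:
            # Gradient of the classic repulsive potential (Khatib-style).
            force += k_rep * (1.0/rho - 1.0/rho0) * diff / rho**3
    return x + step * force

x, goal = np.zeros(2), np.array([1.0, 1.0])
obstacles = [np.array([0.5, 0.45])]
for _ in range(100):
    x = potential_field_step(x, goal, obstacles)
```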
Motion planning takes the manipulator and environment state as an input and outputs a sequence of actions according to a set of parameters and constraints. In the presence of obstacles, these constraints ensure that the robot does not make movements that would bring it close to colliding with them or with itself. Motion planning can be achieved via potential fields [18] as well. However, the preferred options for planning trajectories while avoiding collisions are random-sampling-based methods such as Rapidly Exploring Random Trees (RRT) [19]. RRT offers probabilistic completeness and a fast exploration speed. However, it cannot deal with dynamic, unstructured environments in which many obstacles are distributed randomly. Another sampling-based method is Probabilistic Roadmaps (PRM) [20]. The algorithm abstracts the configuration space by selecting a number of points in it and connecting them to form an undirected graph. The path search problem in free space is thus transformed into a graph search problem.
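A minimal RRT sketch in joint space follows; the collision predicate is left as a user-supplied function, and the step size, goal bias, and bounds are illustrative assumptions rather than values from any cited work.
```python
import numpy as np

def rrt(start, goal, collision_free, bounds, step=0.1,
        goal_bias=0.1, max_iters=5000):
    """Minimal RRT: grow a tree from `start` by steering the nearest node
    toward random samples (biased toward `goal`), then backtrack a path."""
    nodes, parents = [np.asarray(start, dtype=float)], [None]
    goal = np.asarray(goal, dtype=float)
    rng = np.random.default_rng(0)
    for _ in range(max_iters):
        sample = goal if rng.random() < goal_bias else \
            rng.uniform(bounds[0], bounds[1])
        nearest = min(range(len(nodes)),
                      key=lambda i: np.linalg.norm(nodes[i] - sample))
        direction = sample - nodes[nearest]
        new = nodes[nearest] + step * direction / (np.linalg.norm(direction) + 1e-9)
        if not collision_free(new):
            continue
        nodes.append(new)
        parents.append(nearest)
        if np.linalg.norm(new - goal) < step:        # reached the goal region
            path, i = [new], len(nodes) - 1
            while parents[i] is not None:            # backtrack to the root
                i = parents[i]
                path.append(nodes[i])
            return path[::-1]
    return None

bounds = (np.full(3, -np.pi), np.full(3, np.pi))
path = rrt(np.zeros(3), np.array([1.0, 1.2, -0.5]),
           collision_free=lambda q: True,            # placeholder predicate
           bounds=bounds)
```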
In this article, the approach to the IK, motion control, and motion planning problems using Deep Neural Networks (DNNs) is studied. Neural networks are capable of learning complex functions from data and then generalizing to new points outside of it. Therefore, their use in inverse kinematics and obstacle avoidance can yield multiple solutions within an acceptable error range and far enough from possible obstacles. Additionally, once training has been completed, the inference time is independent of the configuration of the robot, making these methods useful for real-time controllers.
In the following sections, different DNN architectures will be reviewed, studying their use and precision when dealing with inverse kinematics, motion control, and planning for robotic manipulators, with and without obstacles. More specifically, the key challenges addressed by the reviewed approaches are their precision, architectural and computational complexity, time cost, and real-world applicability. To select the references included in this paper, Google Scholar has been used as the main scientific search engine, filtering by year and prioritizing well-established publishers such as MDPI, IEEE, Elsevier, and Springer. The search has been conducted using keywords such as inverse kinematics, motion control, motion planning, robotic manipulator, DNN, and each specific type of DNN architecture (e.g., MLP, CNN). Other relevant reviews devoted to the application of neural networks in robotics are [8,21,22,23,24,25].
The sections are organized as follows: In Section 2, solutions to the IK problem involving DNNs are presented. They are classified into supervised learning, unsupervised learning, and Deep Reinforcement Learning, and also by the type of DNN used. This is a conventional classification within the machine learning community, which focuses on the problem statements before delving into the set of techniques or algorithms that serve to solve them. The obstacle avoidance problem is studied next, first from a motion control perspective in Section 3 and then from a motion planning one in Section 4. Section 4 also includes a brief review of the use of transformers. The results and trends found are discussed in Section 5, and some concluding remarks are presented in Section 6.

2. Deep Neural Networks for Inverse Kinematics

This section deals with the inverse kinematics (IK) problem in its simplest form. Given a Cartesian target (a Cartesian space position, or a complete pose composed of a Cartesian space position and an orientation), a Deep Neural Network (DNN) is used to learn the joint space positions that reach it, e.g., with the network architecture seen in Figure 1. The following studies focus on solving the IK problem for rigid manipulators without considering obstacles. Soft robots or dynamic control may be mentioned as examples in some cases, but they are not within the scope of this review.

2.1. Supervised Learning

Supervised learning is a type of machine learning algorithm designed to learn how to make predictions based on certain input parameters. Training requires a set of input data accompanied by the desired outputs. The algorithm can learn to identify patterns and correlations in the data and then predict the correct output for new data based on what it has learned [26]. This framework is ideal for the IK problem, as a dataset containing the Cartesian end-effector positions for a given set of joint space positions can easily be composed using forward kinematics.
An example of such a dataset is [27], which contains 10,000 input–output data pairs in which the end-effector position and orientation are the inputs and the joint angular positions are the outputs. A different dataset is provided in [28] for a 7-degrees-of-freedom (DOF) manipulator in both 2D and 3D space. Such datasets can be generated by sampling random joint space positions and applying forward kinematics, or by using motor babbling on a simulated robot or directly on the real one. Data generation and preprocessing are more extensively discussed in [29]. The effect of different datasets on the training of neural networks is studied in [30,31].
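A minimal sketch of this data generation process follows: random joint configurations are sampled within limits and pushed through forward kinematics to form (pose, joints) training pairs. The toy forward kinematics model and sample count are illustrative assumptions.
```python
import numpy as np

def generate_ik_dataset(fk, joint_limits, n_samples=10_000, seed=0):
    """Build (pose, joints) training pairs by sampling random joint
    configurations and pushing them through forward kinematics `fk`."""
    rng = np.random.default_rng(seed)
    low, high = joint_limits
    thetas = rng.uniform(low, high, size=(n_samples, len(low)))
    poses = np.stack([fk(t) for t in thetas])
    # Inputs are end-effector poses; labels are the joint positions that
    # produced them (the mapping the network must learn to invert).
    return poses.astype(np.float32), thetas.astype(np.float32)

# Toy planar 2-link forward kinematics standing in for a real robot model.
fk = lambda t: np.array([np.cos(t[0]) + np.cos(t[0] + t[1]),
                         np.sin(t[0]) + np.sin(t[0] + t[1])])
X, y = generate_ik_dataset(fk, (np.full(2, -np.pi), np.full(2, np.pi)))
```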

2.1.1. Feed-Forward Multilayer Perceptron

The feed-forward multilayer perceptron (MLP) is one of the most commonly used DNNs. For IK, it is a model representing a nonlinear mapping between an input vector (in this case, the desired Cartesian target for the end-effector $X_d$) and an output vector (the corresponding joint space positions $\theta$). It contains an input layer, an output layer, and several hidden layers, each of them containing a number of neurons with an activation function [32].
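A minimal PyTorch sketch of such an MLP follows; the pose and joint dimensions (here a 7-D position-plus-quaternion input and 6 joints), layer widths, and training hyperparameters are illustrative assumptions, not values from the cited studies.
```python
import torch
import torch.nn as nn

class IKMLP(nn.Module):
    """Minimal MLP mapping an end-effector pose X_d to joint positions."""
    def __init__(self, pose_dim=7, n_joints=6, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(pose_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_joints),
        )

    def forward(self, x):
        return self.net(x)

model = IKMLP()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
# One supervised step on a batch of (pose, joint) pairs; placeholder
# tensors stand in for data produced by forward-kinematics sampling.
poses, joints = torch.randn(64, 7), torch.randn(64, 6)
optim.zero_grad()
loss = loss_fn(model(poses), joints)
loss.backward()
optim.step()
```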
Feed-forward MLPs have proved useful for solving the IK problem, competing with analytical and numerical methods [33] and heuristic methods such as FABRIK [34]. In [35], an MLP is compared to other neural network approaches, achieving small error values for targets where the orientation is not considered. A study on the effect of different arm lengths, datasets, hyperparameters, and optimizers is conducted in [36], comparing performance in estimating IK solutions. In [37], a training dataset is virtually generated considering only the $x$, $y$, and $z$ coordinates as inputs, and different hyperparameters and architectures are tested for the MLP. The achieved distance error is low, but the results show that orientation information is needed for better end-effector performance.
Because of their simplicity, robustness, and low computational cost, feed-forward MLPs are widely used. Still, when compared to other types of DNNs, such as recurrent neural networks, they have proved less accurate [38]. Some possibilities for reducing the error are discussed in [39,40], such as dividing the workspace into different subspaces so that a separate MLP can be used for each of them. A quality function is added in [41] to select the best solutions among all those generated for a pose in the case of a redundant robot. This idea can also be applied to manipulators with fewer DOF [42]. A genetic algorithm is used in [43] to improve the orientation precision of an MLP, outperforming a Generative Adversarial Network (GAN) in small workspaces.
A recent trend for feed-forward MLPs in this area is physically inspired, data-driven learning methods for the inverse dynamics of a robotic manipulator [44]. However, as ref. [45] shows, they cannot yet achieve the necessary precision.

2.1.2. Convolutional Neural Network

Convolutional Neural Networks (CNNs) are conventionally composed of convolutional and pooling layers [46]. Fully-connected layers (also called dense layers) are typically also placed towards the output of the network. Convolutional layers are composed of input data, a filter, and a feature map, which allows the CNN to extract significant features from the input data. Pooling layers conduct downsampling operations, reducing the number of parameters in the input. Finally, fully-connected layers connect each node in one layer to every node in the next. CNNs are typically used for image processing and recognition tasks [47], but they can also serve other purposes, such as solving IK.
In [48], a network composed of a CNN connected to a shallow neural network of dense layers was tested against a simpler MLP, providing better accuracy and highlighting the importance of data scaling. Similar results are obtained in [49] for a 6-DOF industrial manipulator. A similar manipulator is used in [50], improving the precision of the CNN by tuning its hyperparameters. The network is then compared to recurrent neural networks and analytical methods under different types of noise. In general, it obtains better results than the analytical method and recurrent neural networks (RNNs) with Gated Recurrent Units, but it performs worse than RNNs with Long Short-Term Memory units and their bidirectional version. The effect of the manipulator's number of DOF is also studied, showing that precision is reduced for every network as the number of DOF grows. To reduce the pose error, ref. [51] proposes the use of a CNN combined with a Bidirectional Long Short-Term Memory network and a squeeze-and-excitation network [52].

2.1.3. Recurrent Neural Network

Recurrent neural networks (RNNs) are DNN models capable of managing sequential data [53], which makes them well suited for processing time series. In this case, inputs and outputs are not independent, but can affect the way new data are processed. RNNs possess a recurrent unit that acts as a form of memory, allowing the network to learn from past inputs. Three main types of RNN units can be selected: Gated Recurrent Units (GRU), Long Short-Term Memory (LSTM), and Bidirectional Long Short-Term Memory (BiLSTM). As noted previously, multiple studies have shown RNNs to be more suitable than other options for the IK problem [38,50].
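A minimal sketch inspired by the joint-by-joint decoding idea used in works such as [57] follows; it is not the exact architecture of any cited paper, and the dimensions and state-initialization scheme are illustrative assumptions.
```python
import torch
import torch.nn as nn

class IKLSTM(nn.Module):
    """LSTM that, given the target pose as initial context, emits one joint
    value per time step, conditioned on the previously emitted joints."""
    def __init__(self, pose_dim=3, n_joints=6, hidden=128):
        super().__init__()
        self.encoder = nn.Linear(pose_dim, hidden)
        self.cell = nn.LSTMCell(1, hidden)    # input: previous joint value
        self.head = nn.Linear(hidden, 1)
        self.n_joints = n_joints

    def forward(self, target):
        batch = target.shape[0]
        h = torch.tanh(self.encoder(target))  # target pose seeds the state
        c = torch.zeros_like(h)
        prev = torch.zeros(batch, 1)
        joints = []
        for _ in range(self.n_joints):
            h, c = self.cell(prev, (h, c))
            prev = self.head(h)               # next joint value
            joints.append(prev)
        return torch.cat(joints, dim=1)       # (batch, n_joints)

pred = IKLSTM()(torch.randn(8, 3))            # 8 targets -> 8 joint vectors
```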
In [54], an RNN is used to solve the IK problem for a 6-DOF manipulator. This network is then compared to the approach followed in [55], where a Backpropagation Neural Network (BPNN) with particle swarm optimization (PSO) is used. The RNN achieves remarkable results and is shown to outperform the BPNN, even with optimization. An LSTM is used in [56] to solve IK for a 5-DOF manipulator. This type of RNN has the ability to remember long-term information and avoids vanishing or exploding gradients. The results show high efficiency, satisfying the demands of real-time computing.
To reduce the complexity of the problem for high-DOF manipulators, ref. [57] builds several network architectures taking into account the dependencies between the different joints. The RNN model proves useful for this task, as it takes the coordinate of the target point as the initial state and outputs one joint space value based on the previous joint space positions at each time step. In fact, it achieves better results than an MLP and Hierarchical Neural Networks. More examples of RNNs for the IK problem can be seen in [58,59,60].

2.2. Unsupervised Learning

As mentioned before, supervised learning is an excellent framework for the IK problem, as it only requires a training dataset that can be obtained using forward kinematics or motor babbling. Nevertheless, other options exist that forgo this step and the costs associated with it.

2.2.1. Generative Adversarial Networks

Generative Adversarial Networks (GANs) consist of two interlinked neural networks: the generative model captures the data distribution, and the discriminative model tries to differentiate between genuine and generated data [61]. The objective of the generative model is to maximize the probability of the discriminative model making a mistake. In [62], four types of GAN are compared to a Fully Connected Neural Network for both IK and inverse dynamics, achieving similar performance in every case. The idea is to approximate the real model globally using a limited real-world dataset, which is augmented with fake data generated using GANs. As with other DNNs, performance drops as the number of DOF grows. To solve this problem, ref. [63] uses GANs to generate initial solutions to the IK problem that are later refined using an iterative numeric approach. In large workspaces, GANs have also been shown to outperform MLPs [43].
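The adversarial training loop can be sketched as follows for a pose-conditioned generator that proposes joint configurations; this is a generic conditional-GAN sketch under assumed dimensions, not the architecture of [62] or [63].
```python
import torch
import torch.nn as nn

pose_dim, n_joints, z_dim = 7, 6, 16
G = nn.Sequential(nn.Linear(pose_dim + z_dim, 128), nn.ReLU(),
                  nn.Linear(128, n_joints))           # (pose, noise) -> joints
D = nn.Sequential(nn.Linear(pose_dim + n_joints, 128), nn.ReLU(),
                  nn.Linear(128, 1))                  # (pose, joints) -> logit
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

# Placeholder batch standing in for real (pose, joints) pairs from FK data.
poses, real_joints = torch.randn(64, pose_dim), torch.randn(64, n_joints)

# Discriminator step: real pairs labeled 1, generated pairs labeled 0.
z = torch.randn(64, z_dim)
fake_joints = G(torch.cat([poses, z], dim=1))
d_loss = (bce(D(torch.cat([poses, real_joints], dim=1)), torch.ones(64, 1)) +
          bce(D(torch.cat([poses, fake_joints.detach()], dim=1)),
              torch.zeros(64, 1)))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: maximize the discriminator's mistake on generated pairs.
g_loss = bce(D(torch.cat([poses, fake_joints], dim=1)), torch.ones(64, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```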

2.2.2. Autoencoders

Autoencoders (AEs) are generative models composed of an encoder that learns to extract "latent" variables from training data and a decoder that uses those variables to reconstruct the input data [64]. Different variants of this model exist, one of the most popular being the Variational Autoencoder (VAE), which extends the AE by introducing variational inference [65]. A goal-conditioned VAE-based kinematics model is developed in [66] to express the redundant forward/inverse kinematics of a class-1 tensegrity manipulator. In [67], a framework is introduced that is capable of concurrently solving both forward and inverse kinematics for diverse models and datasets using a predefined, differentiable structure for forward kinematics. The performance of a classic AE is compared to that of its variational version, showing the VAE's greater precision. VAEs are also used in [68] to model the IK solution space and find multiple distinct joint space solutions for a given pose target.
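A minimal sketch of a pose-conditioned VAE for modeling the IK solution space follows; the latent size, network widths, and conditioning scheme are illustrative assumptions rather than the exact models of [66,67,68].
```python
import torch
import torch.nn as nn

class IKCVAE(nn.Module):
    """Pose-conditioned VAE: the decoder turns latent samples into distinct
    joint-space solutions for the same target pose."""
    def __init__(self, pose_dim=7, n_joints=6, latent=4, hidden=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_joints + pose_dim, hidden),
                                 nn.ReLU(), nn.Linear(hidden, 2 * latent))
        self.dec = nn.Sequential(nn.Linear(latent + pose_dim, hidden),
                                 nn.ReLU(), nn.Linear(hidden, n_joints))

    def forward(self, joints, pose):
        mu, logvar = self.enc(torch.cat([joints, pose], dim=1)).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        recon = self.dec(torch.cat([z, pose], dim=1))
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)
        return recon, kl

# Training minimizes reconstruction error plus KL; at inference, sampling
# z ~ N(0, I) and decoding several times for one pose yields multiple
# candidate IK solutions.
model = IKCVAE()
recon, kl = model(torch.randn(32, 6), torch.randn(32, 7))
```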

2.2.3. Normalizing Flows

Normalizing Flows (NFs) [69] represent another generative approach, created to solve some of the problems of GANs and AEs, such as vanishing gradients and mode collapse. This approach is followed in [70], proving to be computationally very efficient, especially when compared to numeric methods. It is also used in [71], quickly providing a representative set of solutions with acceptable error. A variation on this approach is tested in [72] and shown to learn faster and more accurately than the previous one with the same amount of data passed to the network during training.

2.2.4. Graph Neural Networks

Graph Neural Networks (GNN) [73] are a type of Deep Neural Network specialized in processing data that can be represented as graphs. The main element of GNNs is the use of message passing for graph nodes to update their information by communicating with their neighbors. In the case of IK, this method presents the possibility of creating one neural network to solve the problem for every robot provided it is described as a graph. This way, the training does not depend on the manipulator’s specific parameters. This can be seen in [74,75,76]. It also allows for motion constraints to be introduced as part of the robot’s graph description. It is to be noted that this approach to IK, although promising, is still not as developed as others.

2.3. Deep Reinforcement Learning

Deep Reinforcement Learning (DRL) combines reinforcement learning (RL) [77] and deep learning. RL deals with the problem of learning through interaction by trial and error, allowing the agent to make decisions from unstructured data by trying to maximize the reward received for its actions in each state. DRL techniques solve RL via DNNs and have a clear advantage over supervised learning in that they do not require a previously generated, labeled training dataset, which reduces time and computation costs. Compared to unsupervised learning, DRL can be architecturally simpler and less sensitive to hyperparameters. Additionally, DRL agents have a notion of objective via the reward signal, which can explicitly encode reaching a target and/or enforce a certain degree of collision avoidance.
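As an illustration of how such a reward signal might look for a reaching task with obstacles, consider the following sketch; the shaping terms, distances, and weights are illustrative assumptions, not the reward of any cited work.
```python
import numpy as np

def reaching_reward(ee_pos, target, link_points, obstacles,
                    safe_dist=0.10, success_dist=0.01):
    """Shaped reward for a reaching task: dense progress term toward the
    target, sparse success bonus, and a graded penalty whenever any
    sampled point on the manipulator enters an obstacle's safety zone."""
    dist = np.linalg.norm(ee_pos - target)
    reward = -dist                                   # dense shaping term
    if dist < success_dist:
        reward += 10.0                               # sparse success bonus
    for p in link_points:                            # points sampled on links
        for obs in obstacles:
            gap = np.linalg.norm(p - obs)
            if gap < safe_dist:
                reward -= (safe_dist - gap) * 50.0   # proximity penalty
    return reward
```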
DRL is used in [78] to solve the IK of a high-DOF robotic arm using the Deep Q-Network (DQN) algorithm, resulting in a maximum end-effector position error of 1 mm. In [79], Deep Deterministic Policy Gradient (DDPG) is used to solve the problem while also focusing on possible issues with data privacy. A modified version of that algorithm is used in [80]. In [81], the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm is successfully applied to the problem, overcoming the shortcomings of traditional methods and learning autonomously from the environment. An actor–critic multi-agent approach is followed in [82], where each joint is a separate agent. The results are compared to standard Jacobian-based and neural network methods and shown to converge faster. DDPG, Twin Delayed DDPG (TD3), and Soft Actor–Critic (SAC) are compared in [83], with TD3 obtaining the best results. The approach followed in this reference is to integrate the manipulator's forward kinematics model into the control framework and formulate the control task as a Markov decision process.

2.4. Performance Summary

In Table 1, a summary of the performance of each method is presented. The examples have been chosen by the relevance of the experiments conducted, discarding measurements taken during training. Of those remaining, one result is presented for each type of DNN mentioned, selected by the amount of relevant information provided in the corresponding reference. It is to be noted that, although these results are indicative of how precise each method can be, they cannot be interpreted as a rigorously fair comparison, as each measurement was taken under different circumstances.

3. Deep Neural Networks for Motion Control

This section deals with the problem of controlling a manipulator's movements to reach a Cartesian target from its current position while considering different constraints. These constraints can be obstacles present in the environment or other limitations that need to be respected. In this case, the input for the DNN is the Cartesian target plus additional information, such as the positions of obstacles, as can be seen in Figure 2. With this information, the DNN decides the best immediate movement at each moment to bring the manipulator closer to the target while respecting the constraints. The following references focus on rigid serial robotic manipulators; however, these techniques can also be applied to other types of robots, such as continuum manipulators [84] or robotic fish [85].

3.1. Supervised Learning

To apply supervised learning to motion control problems with obstacle avoidance, the positions of the obstacles have to be known beforehand so they can be incorporated into the training dataset. In refs. [41,86], for example, the points used for training are specifically selected by a quality function [87] so that they are close to the desired trajectory and a safe distance away from the obstacles. An MLP can then be trained with these data to control the manipulator and steer it around the obstacles. The issue of data generation is further discussed in [88], where obstacles are incorporated and the issue of multiple solutions for redundant manipulators is avoided.
RNNs are the best supervised method for obstacle avoidance, as the problem can be considered a time series. Although they are often used for motion planning, motion control can also be solved with RNNs, as can be seen in [89].

3.2. Unsupervised Learning

Unsupervised learning is better suited to performing motion control while avoiding collisions, as it does not require adding obstacles to a training dataset. Instead, soft constraints can be integrated into the training scheme, e.g., computing the distance between joints and obstacles and adding it to the error being minimized. This is performed from a motion control perspective in [90], where an unsupervised scheme involving an MLP and auxiliary loss functions to add constraints is proposed. In [91], an AE framework for the kinematic control of manipulators is proposed, and both non-sparse and sparse AE controllers are developed. In this case, however, no obstacles are included.
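A minimal sketch of such a soft-constraint training loss follows; it penalizes only end-effector clearance for brevity (practical schemes also sample points along the links), and the clearance margin and weight are illustrative assumptions.
```python
import torch

def unsupervised_ik_loss(fk, theta, target, obstacle_centers,
                         clearance=0.15, w_obs=10.0):
    """Training loss with a soft obstacle constraint: squared pose error
    plus a hinge penalty whenever the end-effector enters an obstacle's
    clearance zone. `fk` must be a batched, differentiable (torch) model."""
    ee = fk(theta)                                    # (batch, 3) positions
    pose_err = torch.sum((ee - target) ** 2, dim=1)
    gaps = torch.cdist(ee, obstacle_centers)          # (batch, n_obstacles)
    penalty = torch.clamp(clearance - gaps, min=0.0).sum(dim=1)
    return (pose_err + w_obs * penalty).mean()        # backprop updates theta
```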

3.3. Deep Reinforcement Learning

As can be seen from the number of results and ideas developed, DRL is the preferred deep learning technique for motion control with obstacle avoidance. Firstly, it does not need pre-generated training data, so it is faster to set up than supervised alternatives. Additionally, most unsupervised approaches rely on formulating the problem as an optimization one and adding the obstacles to the loss function; it is often more intuitive and practical to add them to a reward function instead.
Different software tools and frameworks for DRL-based control for robotic manipulators in simulation are reviewed in [92], including simulators, application programming interfaces (APIs), libraries, and methods, and their interactions with each other.
For this problem, a ground-truth model of the environment is usually not available to the agent, meaning model-free DRL algorithms are preferred. Different algorithms have been tested to achieve joint control of a manipulator so that it reaches a Cartesian space target while avoiding possible obstacles. Deep Deterministic Policy Gradient (DDPG) is used in [93], where it is applied to a 3-DOF manipulator to capture space objects and avoid collisions. Proximal Policy Optimization (PPO) is used in [94,95], where the distance to obstacles is considered instead of the collisions themselves. In these two cases, the importance of curriculum learning [96] for obtaining better results is highlighted. Soft Actor–Critic (SAC) is used in [97] in a sim-to-real (sim2real) framework, obtaining better results than DQN and DDPG. SAC, DDPG, and TD3 are compared in [98], with SAC again achieving the best results.

3.4. Performance Summary

In Table 2, a summary of the most relevant aspects of each method is presented. As discussed in the previous section, the examples have been chosen by the relevance of the experiments conducted and the amount of information provided in the corresponding reference. In the case of similar references, we chose the newest.

4. Deep Neural Networks for Motion Planning

Similarly to the previous section, the objective now is to reach a Cartesian target while avoiding collisions with possible obstacles or self-collisions. In this case, instead of deciding the best current move given the manipulator’s state and the environment information, a DNN will be used to produce the sequence of actions or trajectory that the robot will have to follow to reach the target from beginning to end, as seen in Figure 3. The following references are focused on rigid serial robotic manipulators; however, these techniques can also be applied to other types of manipulators, such as parallel robots [99].

4.1. Supervised Learning

To apply supervised learning to the motion planning problem, the obstacles need to be known beforehand so they can be added as constraints to the function that needs to be minimized. A way to achieve this is to formulate the problem as a quadratic programming one.
CNNs are not directly used for this particular case. An indirect approach uses CNNs for tasks such as obstacle detection so that a cost map can be generated, while the best path is found using other methods such as RRT [100].
RNNs are the best supervised method for motion planning with obstacle avoidance via DNNs, as the problem can be considered a time series. By choosing a proper objective function to optimize and formulating physical limits as inequality constraints, the problem can be posed as a quadratic program [101] and solved using recurrent neural networks, as is performed in [102,103]. In [104], a GRU network is used to compute the trajectory between two states while avoiding obstacles.
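In generic form, and noting that the exact constraint sets vary across [101,102,103], the per-step quadratic program can be sketched as:
```latex
\begin{aligned}
\min_{\dot{\theta}} \quad & \tfrac{1}{2}\,\dot{\theta}^{\top} W \dot{\theta}
  && \text{(minimize joint velocity effort)}\\
\text{s.t.} \quad & J(\theta)\,\dot{\theta} = \dot{x}_d
  && \text{(track the task-space reference)}\\
& \theta^{-} \le \theta \le \theta^{+}
  && \text{(joint limits as inequality constraints)}\\
& d_i(\theta) \ge d_{\text{safe}} \quad \forall i
  && \text{(clearance from each obstacle, linearized per step)}
\end{aligned}
```
Here $W$ is a positive-definite weighting matrix, $\dot{x}_d$ the task-space reference velocity, and $d_i(\theta)$ the distance to obstacle $i$; in these schemes, the recurrent network's dynamics are designed to converge to the optimum of such a program at each step.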
RNNs can also be used alongside other methods to boost their performance. For example, in [105], the SAC algorithm is used to learn obstacle avoidance by reinforcement learning, with both the actor and critic networks being LSTMs, which allows them to use historical information alongside the current state to improve the agent's decisions.

4.2. Unsupervised Learning

As with supervised learning, the key to adding obstacles to a motion planning scheme using unsupervised DNNs lies in formulating the problem as a constrained optimization problem.
From the motion planning point of view, the IK problem can be formulated as an optimization problem, as in [106], including collisions with the environment and self-collisions. The resulting cost is used directly to update the network weights via backpropagation, finally achieving great precision in a much shorter time than the supervised approach, which requires prior data generation. The approach followed in [63] to solve IK using GANs can also accept constraints. In both approaches, the IK results are used within a larger scheme to find the best motion planning paths.
In [107], a VAE is used to learn a generative latent-variable model of a dataset containing joint space positions and end-effector positions. A loss function is then minimized that takes into account the latent representations of the target, joint positions and obstacles. VAEs are also used in [108] to capture information about the high-dimensional robotic space in a lower-dimensional manifold, so that collision avoidance solutions can be found using Topological Manifold Learning [109].

4.3. Deep Reinforcement Learning

Deep Reinforcement Learning is, again, the best framework for motion planning in the presence of obstacles. In this case, the problem does not need to be reformulated or adapted; it is enough to add the target as a positive reward and the obstacles as negative ones to compute the best path between two points.
From the motion planning perspective, different solutions have been devised. Deep Deterministic Policy Gradient (DDPG) is used in [110,111], where the reward takes into account the distance to the target, a penalty for collisions, and the joint effort. PPO is used in [112], where a success reward is added, as well as an out-of-workspace penalty. This approach is shown to outperform traditional RRT methods. A SAC-based algorithm is used in [113] with the addition of the Hindsight Experience Replay (HER) technique to use the training data efficiently. A prophet-guided actor–critic (PAC) structure is designed in [114] by integrating policies trained in other, similar scenes as prophets. The DRL expert memory is expanded using Generative Adversarial Imitation Learning, and the sparsity of the reward is reduced by adding an attraction reward toward the goal and a repulsion reward away from the obstacles.

4.4. Transformer

A transformer [115] is a deep learning architecture that transforms one sequence into another using an encoder and a decoder. Transformers include a parallel multi-head attention mechanism, which allows them to weigh the importance of each token in the sequence. They have no recurrent units and therefore require less training time than RNNs. Transformers have unique advantages over other DNN architectures in the motion planning scope when the task at hand is described not as a set of states and goals but as spoken commands, human actions, or images. Some examples of this are [116,117,118,119].
A transformer is employed in [59] to learn how to map a sequence of end-effector poses to a sequence of joint space positions. In [120], a dataset is generated by randomizing the link lengths, configuration, and dimension of robotic manipulators, with the intent of learning general inverse kinematic solutions via a generative pre-trained transformer. Transformers can also be used to learn the kinematic models of soft robotic manipulators, as can be seen in [121,122].
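A minimal sketch of this sequence-to-sequence mapping follows, using PyTorch's built-in transformer; the dimensions, layer counts, and teacher-forced decoding are illustrative assumptions rather than the architecture of [59].
```python
import torch
import torch.nn as nn

class PoseToJointTransformer(nn.Module):
    """Sequence-to-sequence transformer mapping a sequence of end-effector
    poses to the corresponding sequence of joint-space positions."""
    def __init__(self, pose_dim=7, n_joints=6, d_model=128):
        super().__init__()
        self.src_proj = nn.Linear(pose_dim, d_model)
        self.tgt_proj = nn.Linear(n_joints, d_model)
        self.transformer = nn.Transformer(d_model=d_model, nhead=8,
                                          num_encoder_layers=2,
                                          num_decoder_layers=2,
                                          batch_first=True)
        self.head = nn.Linear(d_model, n_joints)

    def forward(self, poses, prev_joints):
        # Causal mask so each output attends only to past joint vectors.
        T = prev_joints.shape[1]
        mask = self.transformer.generate_square_subsequent_mask(T)
        out = self.transformer(self.src_proj(poses),
                               self.tgt_proj(prev_joints), tgt_mask=mask)
        return self.head(out)

model = PoseToJointTransformer()
# Teacher forcing: batch of 4 trajectories, 20 steps each.
joints = model(torch.randn(4, 20, 7), torch.randn(4, 20, 6))
```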

4.5. Performance Summary

In Table 3, a summary of the most relevant aspects of each method is presented. As discussed in the previous section, the examples have been chosen by the relevance of the experiments conducted and the amount of information provided in the corresponding reference.

5. Discussion

In this study, we have explored the application of Deep Neural Networks (DNNs) to the inverse kinematics (IK), motion control, and motion planning problems, both in the presence and in the absence of obstacles. These are long-standing problems with diverse, well-proven numerical and analytical solutions, but none of them is definitive.
Deep Neural Networks have gained increasing popularity in recent years and appear to be a great tool for tackling these problems. First of all, the IK problem can be easily framed as a supervised learning task. Once that is done, the capabilities of DNNs as function approximators make them well suited to finding suitable solutions. The problem can become harder to formulate with unsupervised learning but, if formulated correctly, the computational and time costs can be greatly reduced.
We have seen that simple DNNs, such as MLPs, can provide good accuracy in specific environments or settings. Newer architectures such as GANs, VAEs, or GNNs are also being introduced and present acceptable results under certain conditions, though they still have considerable room for improvement. DRL appears to be a good alternative as well, and it is better suited for control and path planning around obstacles. Our research shows that studies are tending to focus on hybrid methods, using DNNs for fast initial IK guesses and then optimizing the solution through numerical methods, genetic algorithms, or meta-heuristics such as PSO.
The obstacle avoidance problem is harder to pose in a supervised learning framework. The positions of the obstacles need to be known beforehand, and the trained DNN cannot adapt to changes in the environment. Additionally, introducing the constraints into the training dataset is not a trivial task. This is why there are not many results in this field, with the exception of RNNs, as they deal with time series; this model is mainly used for motion planning. In the case of motion control, the best strategy is to select training points that are already far from obstacles and train a simpler model, such as an MLP. Unsupervised learning appears to be a convenient option; however, constraints are still difficult to add to the framework. From both the motion control and planning perspectives, the idea is to reformulate the problem as an optimization one and then add constraints. This is why Deep Reinforcement Learning shines in this setting, as obstacles can be added intuitively to the reward function and taken into account during training. This set of techniques is mostly used for control but can also learn to plan trajectories. The results look promising, but more progress can still be made. Additionally, there is a lack of validation tests on real robots, as the learning is usually conducted in simulated environments. More recent tools, such as transformers, have yet to make an impact in this field.
The majority of the presented results have been achieved in simulated environments, where the structure and behavior of the manipulator and obstacles, as well as the physics, can be set by the user. All models trained this way face an additional issue when transitioning from simulation to real-world applications. In the first place, while high-quality 3D scene models exist, real-world scenes usually differ from those in which the robot was trained. Additionally, collecting real-world data to reconstruct scenes is time consuming. When obstacles are involved, the distances between them and the robot have to be calculated, and the transfer is limited by the model's capacity to generalize to new scenes where objects have new positions and geometries. Secondly, real-world scenarios are unpredictable, as the physics cannot be set by hand. Additionally, the manipulator may not behave ideally, and sensor noise and hardware limitations have to be considered. All of these factors will have to be taken into account when deploying any of the methods presented in this review in real-world manipulation scenarios.

6. Conclusions

DNNs hold considerable promise for improving the computational cost and precision of long-standing solutions to the inverse kinematics (IK) and obstacle avoidance problems. Theoretically, deep learning provides an ideal framework for tackling these complex problems, given its ability to model high-dimensional, non-linear relationships and the significant progress made in this field in recent years. DNNs, like traditional numeric methods, can be implemented in a much more generic and robot-agnostic manner than analytical methods. However, they can potentially achieve faster and more accurate solutions than traditional numeric methods, which often rely on iterative processes and approximations.
Our findings show that these expectations are well founded, as interesting results are being achieved even with relatively simple DNN architectures. However, there is still a gap between the expectations and the practical outcomes. This is evidenced by the lack of experimental testing on real robots, with most advances being made in simulated environments. While simulation provides valuable insights, it cannot fully account for the complexities and unpredictability of real-world scenarios, such as sensor noise, environmental variations, and hardware limitations.
Further investigation is required to refine the methodologies and enhance their robustness, ensuring that DNNs can be effectively deployed in real-world robotic systems. This will involve not only optimizing network architectures and training techniques but also developing hybrid approaches that combine deep learning with traditional control theory. Additionally, testing on actual robots under a range of dynamic and challenging conditions is essential to fully understand the practical limitations and potential of deep learning for these applications. By addressing these challenges, we can unlock the full capabilities of deep learning in solving robot IK, motion control, and motion planning, paving the way for more efficient and autonomous robotic systems.

Author Contributions

Conceptualization, A.C.-G., J.G.V., F.J.N.-C. and C.B.; methodology, A.C.-G. and J.G.V.; software, A.C.-G.; validation, A.C.-G. and J.G.V.; formal analysis, A.C.-G.; investigation, A.C.-G. and J.G.V.; resources, A.C.-G. and J.G.V.; data curation, A.C.-G. and J.G.V.; writing—original draft preparation, A.C.-G.; writing—review and editing, J.G.V.; visualization, A.C.-G.; supervision, J.G.V., F.J.N.-C. and C.B.; project administration, J.G.V. and C.B.; funding acquisition, J.G.V. and C.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by “iREHAB: AI-powered Robotic Personalized Rehabilitation”, ISCIII-AES-2022/003041 funded by ISCIII and UE; “COMPANION-CM: Inteligencia artificial y modelos cognitivos para la interacción simétrica humano-robot en el ámbito de la robótica asistencial”, Y2020/NMT-666, funded by “Proyectos Sinérgicos de I+D la Comunidad de Madrid”; and EU structural funds; “IROPER: Robótica inteligente para necesidades personales”, PLEC2021-007819, funded by MCIN/AEI/10.13039/501100011033 and by “NextGenerationEU/PRTR” of European Union.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

In the process of writing, the artificial intelligence tool ChatGPT was employed to enhance and verify the orthography and grammar of the final paragraph of Section 6. Its outputs were always reviewed and edited by the authors. The contributions and content of the paper are entirely the authors' own.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
IK: Inverse Kinematics
DNN: Deep Neural Network
PSO: Particle Swarm Optimization
DOF: Degree of Freedom
FABRIK: Forward And Backward Reaching Inverse Kinematics
MLP: Multilayer Perceptron
GAN: Generative Adversarial Network
CNN: Convolutional Neural Network
RNN: Recurrent Neural Network
LSTM: Long Short-Term Memory
BiLSTM: Bidirectional Long Short-Term Memory
GRU: Gated Recurrent Units
BPNN: Backpropagation Neural Network
AE: Autoencoder
VAE: Variational Autoencoder
NF: Normalizing Flows
GNN: Graph Neural Network
DRL: Deep Reinforcement Learning
DQN: Deep Q-Network
API: Application Programming Interface
DDPG: Deep Deterministic Policy Gradient
PPO: Proximal Policy Optimization
MAPPO: Multi-Agent Proximal Policy Optimization
TD3: Twin Delayed DDPG
SAC: Soft Actor-Critic
RRT: Rapidly Exploring Random Trees
PRM: Probabilistic Roadmap
HER: Hindsight Experience Replay
PAC: Prophet-guided Actor-Critic

References

  1. Paul, R.P. Robot Manipulators: Mathematics, Programming, and Control: The Computer Control of Robot Manipulators; MIT Press: Cambridge, MA, USA, 1981. [Google Scholar]
  2. Murray, R.M.; Li, Z.; Sastry, S.S. A mathematical Introduction to Robotic Manipulation; CRC Press: Boca Raton, FL, USA, 1994. [Google Scholar]
  3. Latombe, J.C. Robot Motion Planning; Springer Science & Business Media: New York, NY, USA, 1991; Volume 124. [Google Scholar]
  4. Lewis, F.L.; Dawson, D.M.; Abdallah, C.T. Robot Manipulator Control: Theory and Practice; CRC Press: Boca Raton, FL, USA, 2003. [Google Scholar]
  5. Chou, W.; Liu, Y. An Analytical Inverse Kinematics Solution with the Avoidance of Joint Limits, Singularity and the Simulation of 7-DOF Anthropomorphic Manipulators. Trans. FAMENA 2024, 48, 117–132. [Google Scholar] [CrossRef]
  6. Zheng, L.; Lin, P.; Liang, M.; Wang, C.; Li, Y.; Sun, J.; Han, Y.; Liu, H. Analytical Inverse Kinematics for a Prismatic-Revolute Hybrid Joints Radiography Robot Mounted on the Ambulance. IEEE/ASME Trans. Mechatronics 2024, 1–11. [Google Scholar] [CrossRef]
  7. Vu, M.N.; Beck, F.; Schwegel, M.; Hartl-Nesic, C.; Nguyen, A.; Kugi, A. Machine learning-based framework for optimally solving the analytical inverse kinematics for redundant manipulators. Mechatronics 2023, 91, 102970. [Google Scholar] [CrossRef]
  8. Aristidou, A.; Lasenby, J.; Chrysanthou, Y.; Shamir, A. Inverse kinematics techniques in computer graphics: A survey. Comput. Graph. Forum 2018, 37, 35–58. [Google Scholar] [CrossRef]
  9. Yonezawa, A.; Yonezawa, H.; Kajiwara, I. Simple inverse kinematics computation considering joint motion efficiency. IEEE Trans. Cybern. 2024, 54, 4903–4914. [Google Scholar] [CrossRef] [PubMed]
  10. Colan, J.; Davila, A.; Hasegawa, Y. Variable step sizes for iterative Jacobian-based inverse kinematics of robotic manipulators. IEEE Access 2024, 12, 87909–87922. [Google Scholar] [CrossRef]
  11. Nguyen, T.T.; Bui, T.N.; Dai, W.; Nguyen, T.V.; Tao, L.N. Apply some meta-heuristic algorithms to solve inverse kinematic problems of a 7-DoFs manipulator robot. Int. J. Mech. Eng. Robot. Res. 2021, 10, 498–504. [Google Scholar] [CrossRef]
  12. Li, M.; Qiao, L. A Review and Comparative Study of Differential Evolution Algorithms in Solving Inverse Kinematics of Mobile Manipulator. Symmetry 2023, 15, 1080. [Google Scholar] [CrossRef]
  13. Santos, D.O.; Molina, L.; Carvalho, J.G.; Carvalho, E.A.; Freire, E.O. Modifications of Fully Resampled PSO in the Inverse Kinematics of Robot Manipulators. IEEE Robot. Autom. Lett. 2024, 9, 1923–1928. [Google Scholar] [CrossRef]
  14. Li, Q.; Cang, N.; Zhang, W.; Guo, D.; Zhang, C. A Pseudo-Inverse Redundancy-Based Resolution Scheme at the Acceleration Level to Control Robotic Arm Motion. In Proceedings of the 2023 6th International Conference on Robotics, Control and Automation Engineering (RCAE), Suzhou, China, 27–29 October 2023; pp. 18–22. [Google Scholar]
  15. Gilbert, E.G.; Johnson, D.W.; Keerthi, S.S. A fast procedure for computing the distance between complex objects in three-dimensional space. IEEE J. Robot. Autom. 1988, 4, 193–203. [Google Scholar] [CrossRef]
  16. Khatib, O. Real-time obstacle avoidance for manipulators and mobile robots. Int. J. Robot. Res. 1986, 5, 90–98. [Google Scholar] [CrossRef]
  17. Lee, K.K.; Buss, M. Obstacle avoidance for redundant robots using Jacobian transpose method. In Proceedings of the 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Diego, CA, USA, 29 October–2 November 2007; pp. 3509–3514. [Google Scholar]
  18. Scoccia, C.; Palmieri, G.; Palpacelli, M.C.; Callegari, M. Real-time strategy for obstacle avoidance in redundant manipulators. In Advances in Italian Mechanism Science. IFToMM ITALY 2020. Mechanisms and Machine Science; Springer: Cham, Switzerland, 2021; Volume 91, pp. 278–285. [Google Scholar]
  19. Wei, K.; Ren, B. A method on dynamic path planning for robotic manipulator autonomous obstacle avoidance based on an improved RRT algorithm. Sensors 2018, 18, 571. [Google Scholar] [CrossRef] [PubMed]
  20. Chen, G.; Luo, N.; Liu, D.; Zhao, Z.; Liang, C. Path planning for manipulators based on an improved probabilistic roadmap method. Robot. Comput.-Integr. Manuf. 2021, 72, 102196. [Google Scholar] [CrossRef]
  21. Liu, R.; Nageotte, F.; Zanne, P.; de Mathelin, M.; Dresp-Langley, B. Deep reinforcement learning for the control of robotic manipulation: A focussed mini-review. Robotics 2021, 10, 22. [Google Scholar] [CrossRef]
  22. Sari, Y. Performance evaluation of the various training algorithms and network topologies in a neural-network-based inverse kinematics solution for robots. Int. J. Adv. Robot. Syst. 2014, 11, 64. [Google Scholar] [CrossRef]
  23. Tamizi, M.G.; Yaghoubi, M.; Najjaran, H. A review of recent trend in motion planning of industrial robots. Int. J. Intell. Robot. Appl. 2023, 7, 253–274. [Google Scholar] [CrossRef]
  24. Noroozi, F.; Daneshmand, M.; Fiorini, P. Conventional, Heuristic and Learning-Based Robot Motion Planning: Reviewing Frameworks of Current Practical Significance. Machines 2023, 11, 722. [Google Scholar] [CrossRef]
  25. Ozalp, R.; Ucar, A.; Guzelis, C. Advancements in Deep Reinforcement Learning and Inverse Reinforcement Learning for Robotic Manipulation: Towards Trustworthy, Interpretable, and Explainable Artificial Intelligence. IEEE Access 2024, 12, 51840–51858. [Google Scholar] [CrossRef]
  26. Toche Tchio, G.M.; Kenfack, J.; Kassegne, D.; Menga, F.D.; Ouro-Djobo, S.S. A comprehensive review of supervised learning algorithms for the diagnosis of photovoltaic systems, proposing a new approach using an ensemble learning algorithm. Appl. Sci. 2024, 14, 2072. [Google Scholar] [CrossRef]
  27. Nugroho, A.; Yuniarno, E.M.; Purnomo, M.H. ARKOMA dataset: An open-source dataset to develop neural networks-based inverse kinematics model for NAO robot arms. Data Brief 2023, 51, 109727. [Google Scholar] [CrossRef] [PubMed]
  28. Rathnam, R.; Godfrey, W.W. Data Driven Approach for Inverse Kinematics in 2D and 3D. In Proceedings of the 2023 IEEE 7th Conference on Information and Communication Technology (CICT), Jabalpur, India, 15–17 December 2023; pp. 1–6. [Google Scholar]
  29. Zhang, B. Inverse Kinematics Implementation Techniques in Robotics. Highlights Sci. Eng. Technol. 2024, 81, 109–120. [Google Scholar] [CrossRef]
  30. Bouzid, R.; Narayan, J.; Gritli, H. Solving Inverse Kinematics Problem for Manipulator Robots Using Artificial Neural Network with Varied Dataset Formats. In Complex Systems and Their Applications; Springer Nature: Cham, Switzerland, 2024; pp. 55–78. [Google Scholar]
  31. Bouzid, R.; Gritli, H.; Narayan, J. ANN approach for SCARA robot inverse kinematics solutions with diverse datasets and optimisers. Appl. Comput. Syst. 2024, 29, 24–34. [Google Scholar] [CrossRef]
  32. Chan, K.Y.; Abu-Salih, B.; Qaddoura, R.; Ala’M, A.Z.; Palade, V.; Pham, D.S.; Del Ser, J.; Muhammad, K. Deep neural networks in the cloud: Review, applications, challenges and research directions. Neurocomputing 2023, 545, 126327. [Google Scholar] [CrossRef]
  33. Semwal, V.B.; Gupta, Y. Performance analysis of data-driven techniques for solving inverse kinematics problems. In Intelligent Systems and Applications. IntelliSys 2021; Lecture Notes in Networks and Systems; Springer: Cham, Switzerland, 2022; Volume 294, pp. 85–99. [Google Scholar]
  34. Semwal, V.B.; Reddy, M.; Narad, A. Comparative study of inverse kinematics using data driven and fabrik approach. In Proceedings of the 2021 5th International Conference on Advances in Robotics, Kanpur, India, 30 June–4 July 2021; pp. 1–6. [Google Scholar]
  35. Sharkawy, A.N.; Khairullah, S.S. Forward and Inverse Kinematics Solution of A 3-DOF Articulated Robotic Manipulator Using Artificial Neural Network. Int. J. Robot. Control Syst. 2023, 3, 330–353. [Google Scholar] [CrossRef]
  36. Bouzid, R.; Narayan, J.; Gritli, H. Investigating neural network hyperparameter variations in robotic arm inverse kinematics for different arm lengths. In Proceedings of the 2024 Third International Conference on Power, Control and Computing Technologies (ICPC2T), Raipur, India, 18–20 January 2024; pp. 351–356. [Google Scholar]
  37. Šegota, S.B.; Anđelić, N.; Mrzljak, V.; Lorencin, I.; Kuric, I.; Car, Z. Utilization of multilayer perceptron for determining the inverse kinematics of an industrial robotic manipulator. Int. J. Adv. Robot. Syst. 2021, 18, 1729881420925283. [Google Scholar] [CrossRef]
  38. Toquica, J.S.; Oliveira, P.S.; Souza, W.S.; Motta, J.M.S.; Borges, D.L. An analytical and a Deep Learning model for solving the inverse kinematic problem of an industrial parallel robot. Comput. Ind. Eng. 2021, 151, 106682. [Google Scholar] [CrossRef]
  39. Lu, J.; Zou, T.; Jiang, X. A neural network based approach to inverse kinematics problem for general six-axis robots. Sensors 2022, 22, 8909. [Google Scholar] [CrossRef]
  40. Cagigas-Muñiz, D. Artificial Neural Networks for inverse kinematics problem in articulated robots. Eng. Appl. Artif. Intell. 2023, 126, 107175. [Google Scholar] [CrossRef]
  41. Hlaváč, V. MLP neural network for a kinematic control of a redundant planar manipulator. In Advances in Mechanism Design III. TMM 2020. Mechanisms and Machine Science; Springer: Cham, Switzerland, 2022; Volume 85, pp. 24–32. [Google Scholar]
  42. Hlaváč, V. Inverted Kinematics of a Redundant Manipulator with a MLP Neural Network. In Proceedings of the 2022 International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME), Male, Maldives, 16–18 November 2022; pp. 1–5. [Google Scholar]
  43. Habekost, J.G.; Strahl, E.; Allgeuer, P.; Kerzel, M.; Wermter, S. Cycleik: Neuro-inspired inverse kinematics. In Artificial Neural Networks and Machine Learning—ICANN 2023; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2023; Volume 14254, pp. 457–470. [Google Scholar]
  44. Wu, S.; Li, Z.; Chen, W.; Sun, F. Dynamic Modeling of Robotic Manipulator via an Augmented Deep Lagrangian Network. Tsinghua Sci. Technol. 2024, 29, 1604–1614. [Google Scholar] [CrossRef]
  45. Li, Z.; Wu, S.; Chen, W.; Sun, F. Extrapolation of Physics-Inspired Deep Networks in Learning Robot Inverse Dynamics. Mathematics 2024, 12, 2527. [Google Scholar] [CrossRef]
  46. Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A survey of convolutional neural networks: Analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6999–7019. [Google Scholar] [CrossRef]
  47. Wen, X.; Wang, Y.; Zhu, Q.; Wu, J.; Xiong, R.; Xie, A. Design of recognition algorithm for multiclass digital display instrument based on convolution neural network. Biomim. Intell. Robot. 2023, 3, 100118. [Google Scholar] [CrossRef]
  48. Elkholy, H.A.; Shahin, A.S.; Shaarawy, A.W.; Marzouk, H.; Elsamanty, M. Solving inverse kinematics of a 7-DOF manipulator using convolutional neural network. In Proceedings of the International Conference on Artificial Intelligence and Computer Vision (AICV2020), Cairo, Egypt, 8–10 April 2020; Advances in Intelligent Systems and Computing. Springer: Cham, Switzerland, 2020; Volume 1153, pp. 343–352. [Google Scholar]
  49. Kumhar, H.S.; Kukshal, V. Inverse kinematic solution for 6-r industrial robot manipulator using convolution neural network. In Recent Trends in Product Design and Intelligent Manufacturing Systems; Lecture Notes in Mechanical Engineering; Springer: Cham, Switzerland, 2022; pp. 923–930. [Google Scholar]
  50. Wagaa, N.; Kallel, H.; Mellouli, N. Analytical and deep learning approaches for solving the inverse kinematic problem of a high degrees of freedom robotic arm. Eng. Appl. Artif. Intell. 2023, 123, 106301. [Google Scholar] [CrossRef]
  51. Zhu, X.; Liu, Z.; Cai, C.; Yang, M.; Zhang, H.; Fu, L.; Zhang, J. Deep learning-based predicting and compensating method for the pose deviations of parallel robots. Comput. Ind. Eng. 2024, 191, 110179. [Google Scholar] [CrossRef]
  52. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141. [Google Scholar]
  53. Mienye, I.D.; Swart, T.G.; Obaido, G. Recurrent neural networks: A comprehensive review of architectures, variants, and applications. Information 2024, 15, 517. [Google Scholar] [CrossRef]
  54. Shaar, A.; Ghaeb, J.A. Intelligent Solution for Inverse Kinematic of Industrial Robotic Manipulator Based on RNN. In Proceedings of the 2023 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT), Amman, Jordan, 22–24 May 2023; pp. 74–79. [Google Scholar]
  55. Jiang, G.; Luo, M.; Bai, K.; Chen, S. A precise positioning method for a puncture robot based on a PSO-optimized BP neural network algorithm. Appl. Sci. 2017, 7, 969. [Google Scholar] [CrossRef]
  56. Wang, S.; Zhang, Y.; Chen, S.; Xu, M.; Yu, Y.; Liu, P. Inverse Kinematics Analysis of 5-DOF Cooperative Robot Based on Long Short-Term Memory Network. In Proceedings of the 2023 IEEE 3rd International Conference on Software Engineering and Artificial Intelligence (SEAI), Xiamen, China, 16–18 June 2023; pp. 245–249. [Google Scholar]
  57. Tsai, M.T.; King, C.T.; Ho, C.K. Exploiting Joint Dependencies for Data-driven Inverse Kinematics with Neural Networks for High-DOF Robot Arms. In Proceedings of the ISCA 34th International Conference on Computer Applications in Industry and Engineering, EPiC Series in Computing, Online, 11–13 October 2021; Volume 79, pp. 159–170. [Google Scholar]
  58. Kong, L. Kinematic resolutions of redundant robot manipulators using integration-enhanced RNNs. arXiv 2020, arXiv:2008.08228. [Google Scholar]
  59. Bensadoun, R.; Gur, S.; Blau, N.; Wolf, L. Neural inverse kinematic. In Proceedings of the International Conference on Machine Learning, PMLR, Baltimore, MD, USA, 17–23 July 2022; pp. 1787–1797. [Google Scholar]
  60. Vaishnavi, J.; Singh, B.; Vijayvargiya, A.; Kumar, R. Deep Learning Framework for Inverse Kinematics Mapping for a 5 DoF Robotic Manipulator. In Proceedings of the 2022 IEEE International Conference on Power Electronics, Drives and Energy Systems (PEDES), Jaipur, India, 14–17 December 2022; pp. 1–6. [Google Scholar]
  61. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the Advances in Neural Information Processing Systems 27, Montreal, QC, Canada, 8–13 December 2014; Volume 27. [Google Scholar]
  62. Ren, H.; Ben-Tzvi, P. Learning inverse kinematics and dynamics of a robotic manipulator using generative adversarial networks. Robot. Auton. Syst. 2020, 124, 103386. [Google Scholar] [CrossRef]
  63. Lembono, T.S.; Pignat, E.; Jankowski, J.; Calinon, S. Learning constrained distributions of robot configurations with generative adversarial network. IEEE Robot. Autom. Lett. 2021, 6, 4233–4240. [Google Scholar] [CrossRef]
  64. Zhai, J.; Zhang, S.; Chen, J.; He, Q. Autoencoder and its various variants. In Proceedings of the 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Miyazaki, Japan, 7–10 October 2018; pp. 415–419. [Google Scholar]
  65. Kingma, D.P.; Welling, M. Auto-encoding variational bayes. arXiv 2013, arXiv:1312.6114. [Google Scholar]
  66. Yoshimitsu, Y.; Osa, T.; Ikemoto, S. Forward/Inverse Kinematics Modeling for Tensegrity Manipulator Based on Goal-Conditioned Variational Autoencoder. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; pp. 6668–6673. [Google Scholar]
  67. Wilhelm, N.; Haddadin, S.; Burgkart, R.; Van Der Smagt, P.; Karl, M. Accurate Kinematic Modeling using Autoencoders on Differentiable Joints. In Proceedings of the 2024 IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan, 13–17 May 2024; pp. 7122–7128. [Google Scholar]
  68. Ho, C.K.; Chan, L.W.; King, C.T.; Yen, T.Y. A deep learning approach to navigating the joint solution space of redundant inverse kinematics and its applications to numerical IK computations. IEEE Access 2023, 11, 2274–2290. [Google Scholar] [CrossRef]
  69. Kobyzev, I.; Prince, S.J.; Brubaker, M.A. Normalizing flows: An introduction and review of current methods. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3964–3979. [Google Scholar] [CrossRef]
  70. Kim, S.; Perez, J. Learning reachable manifold and inverse mapping for a redundant robot manipulator. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 4731–4737. [Google Scholar]
  71. Ames, B.; Morgan, J.; Konidaris, G. Ikflow: Generating diverse inverse kinematics solutions. IEEE Robot. Autom. Lett. 2022, 7, 7177–7184. [Google Scholar] [CrossRef]
  72. Park, S.; Schwartz, M.; Park, J. NODEIK: Solving Inverse Kinematics with Neural Ordinary Differential Equations for Path Planning. In Proceedings of the 2022 22nd International Conference on Control, Automation and Systems (ICCAS), Busan, Republic of Korea, 27 November–1 December 2022; pp. 944–949. [Google Scholar]
  73. Scarselli, F.; Gori, M.; Tsoi, A.C.; Hagenbuchner, M.; Monfardini, G. The graph neural network model. IEEE Trans. Neural Netw. 2008, 20, 61–80. [Google Scholar] [CrossRef] [PubMed]
  74. Limoyo, O.; Marić, F.; Giamou, M.; Alexson, P.; Petrović, I.; Kelly, J. Euclidean Equivariant Models for Generative Graphical Inverse Kinematics. arXiv 2023, arXiv:2307.01902. [Google Scholar]
  75. Limoyo, O.; Maric, F.; Giamou, M.; Alexson, P.; Petrovic, I.; Kelly, J. One Network, Many Robots: Generative Graphical Inverse Kinematics. CoRR 2022. [Google Scholar] [CrossRef]
  76. Kim, J.T.; Park, J.; Choi, S.; Ha, S. Learning robot structure and motion embeddings using graph neural networks. arXiv 2021, arXiv:2109.07543. [Google Scholar]
  77. Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction, 2nd ed.; A Bradford Book; MIT Press: Cambridge, MA, USA, 2018. [Google Scholar]
  78. Malik, A.; Lischuk, Y.; Henderson, T.; Prazenica, R. A deep reinforcement-learning approach for inverse kinematics solution of a high degree of freedom robotic manipulator. Robotics 2022, 11, 44. [Google Scholar] [CrossRef]
  79. Shivkumar, S.; Kumaar, A.N. Manipulator Control using Federated Deep Reinforcement Learning. In Proceedings of the 2024 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT), Bangalore, India, 12–14 July 2024; pp. 1–6. [Google Scholar]
  80. Majumder, S.; Sahoo, S.R. A Reinforcement-Learning Approach to Control Robotic Manipulator Based on Improved DDPG. In Proceedings of the 2023 Ninth Indian Control Conference (ICC), Visakhapatnam, India, 18–20 December 2023; pp. 281–286. [Google Scholar]
  81. Zhao, C.; Wei, Y.; Xiao, J.; Sun, Y.; Zhang, D.; Guo, Q.; Yang, J. Inverse kinematics solution and control method of 6-degree-of-freedom manipulator based on deep reinforcement learning. Sci. Rep. 2024, 14, 12467. [Google Scholar] [CrossRef]
  82. Perrusquía, A.; Yu, W.; Li, X. Multi-agent reinforcement learning for redundant robot control in task-space. Int. J. Mach. Learn. Cybern. 2021, 12, 231–241. [Google Scholar] [CrossRef]
  83. Chen, Y.; Su, S.; Ni, K.; Li, C. Integrated Intelligent Control of Redundant Degrees-of-Freedom Manipulators via the Fusion of Deep Reinforcement Learning and Forward Kinematics Models. Machines 2024, 12, 667. [Google Scholar] [CrossRef]
  84. Liang, X.; He, G.; Su, T.; Wang, W.; Huang, C.; Zhao, Q.; Hou, Z.G. Finite-time observer-based variable impedance control of cable-driven continuum manipulators. IEEE Trans. Hum.-Mach. Syst. 2021, 52, 26–40. [Google Scholar] [CrossRef]
  85. Wang, Q.; Hong, Z.; Zhong, Y. Learn to swim: Online motion control of an underactuated robotic eel based on deep reinforcement learning. Biomim. Intell. Robot. 2022, 2, 100066. [Google Scholar] [CrossRef]
  86. Hlavac, V. Kinematics control of a redundant planar manipulator with a MLP neural network. In Proceedings of the 2021 International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME), Mauritius, Mauritius, 7–8 October 2021; pp. 1–5. [Google Scholar]
  87. Hlavac, V. Neural Network for the identification of a functional dependence using data preselection. Neural Netw. World 2021, 2, 109–124. [Google Scholar] [CrossRef]
  88. Hlaváč, V. Accuracy of the Inverse Kinematics of a Planar Redundant Manipulator Solved by an MLP Neural Network. In Advances in Mechanism Design IV. TMM 2024. Mechanisms and Machine Science; Springer: Cham, Switzerland, 2024; Volume 171, pp. 202–211. [Google Scholar]
  89. Li, Z.; Li, S. Recursive recurrent neural network: A novel model for manipulator control with different levels of physical constraints. CAAI Trans. Intell. Technol. 2023, 8, 622–634. [Google Scholar] [CrossRef]
  90. Stephan, B.; Dontsov, I.; Müller, S.; Gross, H.M. On Learning of Inverse Kinematics for Highly Redundant Robots with Neural Networks. In Proceedings of the 2023 21st International Conference on Advanced Robotics (ICAR), Abu Dhabi, United Arab Emirates, 5–8 December 2023; pp. 402–408. [Google Scholar]
  91. Li, Z.; Li, S. Neural network model-based control for manipulator: An autoencoder perspective. IEEE Trans. Neural Netw. Learn. Syst. 2021, 34, 2854–2868. [Google Scholar] [CrossRef] [PubMed]
  92. Calderon-Cordova, C.; Sarango, R.; Castillo, D.; Lakshminarayanan, V. A deep reinforcement learning framework for control of robotic manipulators in simulated environments. IEEE Access 2024, 12, 103133–103161. [Google Scholar] [CrossRef]
  93. Blaise, J.; Bazzocchi, M.C. Space Manipulator Collision Avoidance Using a Deep Reinforcement Learning Control. Aerospace 2023, 10, 778. [Google Scholar] [CrossRef]
  94. Kumar, V.; Hoeller, D.; Sundaralingam, B.; Tremblay, J.; Birchfield, S. Joint space control via deep reinforcement learning. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; pp. 3619–3626. [Google Scholar]
  95. Væhrens, L.; Álvarez, D.D.; Berger, U.; Bøgh, S. Learning Task-independent Joint Control for Robotic Manipulators with Reinforcement Learning and Curriculum Learning. In Proceedings of the 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), Nassau, Bahamas, 12–14 December 2022; pp. 1250–1257. [Google Scholar]
  96. Bengio, Y.; Louradour, J.; Collobert, R.; Weston, J. Curriculum learning. In Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, QC, Canada, 14–18 June 2009; pp. 41–48. [Google Scholar]
  97. Zhang, T.; Zhang, K.; Lin, J.; Louie, W.Y.G.; Huang, H. Sim2real learning of obstacle avoidance for robotic manipulators in uncertain environments. IEEE Robot. Autom. Lett. 2021, 7, 65–72. [Google Scholar] [CrossRef]
  98. Zheng, L.; Wang, Y.; Yang, R.; Wu, S.; Guo, R.; Dong, E. An efficiently convergent deep reinforcement learning-based trajectory planning method for manipulators in dynamic environments. J. Intell. Robot. Syst. 2023, 107, 50. [Google Scholar] [CrossRef]
  99. Su, T.; Liang, X.; Zeng, X.; Liu, S. Pythagorean-Hodograph curves-based trajectory planning for pick-and-place operation of Delta robot with prescribed pick and place heights. Robotica 2023, 41, 1651–1672. [Google Scholar] [CrossRef]
  100. Zhang, Q.; Liu, F.; Li, B. A heuristic tomato-bunch harvest manipulator path planning method based on a 3D-CNN-based position posture map and rapidly-exploring random tree. Comput. Electron. Agric. 2023, 213, 108183. [Google Scholar] [CrossRef]
  101. Nocedal, J.; Wright, S.J. Quadratic programming. In Numerical Optimization; Springer: New York, NY, USA, 2006; pp. 448–492. [Google Scholar]
  102. Xu, Z.; Zhou, X.; Wu, H.; Li, X.; Li, S. Motion planning of manipulators for simultaneous obstacle avoidance and target tracking: An RNN approach with guaranteed performance. IEEE Trans. Ind. Electron. 2021, 69, 3887–3897. [Google Scholar] [CrossRef]
  103. Liang, J.; Xu, Z.; Zhou, X.; Li, S.; Ye, G. Recurrent neural networks-based collision-free motion planning for dual manipulators under multiple constraints. IEEE Access 2020, 8, 54225–54236. [Google Scholar] [CrossRef]
  104. Jin, T.; Zhu, H.; Zhu, J.; Zhu, S.; He, Z.; Zhang, S.; Song, W.; Gu, J. Whole-Body Inverse Kinematics and Operation-Oriented Motion Planning for Robot Mobile Manipulation. IEEE Trans. Ind. Inform. 2024, 20, 14239–14248. [Google Scholar] [CrossRef]
  105. Kuang, X.; Zhou, S. Robotic Manipulator in Dynamic Environment with SAC Combing Attention Mechanism and LSTM. Electronics 2024, 13, 1969. [Google Scholar] [CrossRef]
  106. Tenhumberg, J.; Mielke, A.; Bäuml, B. Efficient Learning of Fast Inverse Kinematics with Collision Avoidance. In Proceedings of the 2023 IEEE-RAS 22nd International Conference on Humanoid Robots (Humanoids), Austin, TX, USA, 12–14 December 2023; pp. 1–8. [Google Scholar]
  107. Hung, C.M.; Zhong, S.; Goodwin, W.; Jones, O.P.; Engelcke, M.; Havoutis, I.; Posner, I. Reaching through latent space: From joint statistics to path planning in manipulation. IEEE Robot. Autom. Lett. 2022, 7, 5334–5341. [Google Scholar] [CrossRef]
  108. Dastider, A.; Lin, M. DAMON: Dynamic Amorphous Obstacle Navigation using Topological Manifold Learning and Variational Autoencoding. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; pp. 251–258. [Google Scholar]
  109. Lin, T.; Zha, H. Riemannian manifold learning. IEEE Trans. Pattern Anal. Mach. Intell. 2008, 30, 796–809. [Google Scholar]
  110. Zhong, J.; Wang, T.; Cheng, L. Collision-free path planning for welding manipulator via hybrid algorithm of deep reinforcement learning and inverse kinematics. Complex Intell. Syst. 2021, 8, 1899–1912. [Google Scholar] [CrossRef]
  111. Ge, D. Research on obstacle avoidance path planning of robotic manipulator based on deep reinforcement learning. In Proceedings of the Fourth International Conference on Advanced Algorithms and Neural Networks (AANN 2024), Qingdao, China, 9–11 August 2024; Volume 13416, pp. 428–433. [Google Scholar]
  112. Bhuiyan, T.; Kästner, L.; Hu, Y.; Kutschank, B.; Lambrecht, J. Deep-reinforcement-learning-based path planning for industrial robots using distance sensors as observation. In Proceedings of the 2023 8th International Conference on Control and Robotics Engineering (ICCRE), Niigata, Japan, 21–23 April 2023; pp. 204–210. [Google Scholar]
  113. Prianto, E.; Park, J.H.; Bae, J.H.; Kim, J.S. Deep reinforcement learning-based path planning for multi-arm manipulators with periodically moving obstacles. Appl. Sci. 2021, 11, 2587. [Google Scholar] [CrossRef]
  114. Liu, H.; Ying, F.; Jiang, R.; Shan, Y.; Shen, B. Obstacle-Avoidable Robotic Motion Planning Framework Based on Deep Reinforcement Learning. IEEE/ASME Trans. Mechatron. 2024, 29, 4377–4388. [Google Scholar] [CrossRef]
  115. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  116. Zhang, Y.; Liu, Y.; Liu, S.; Liang, W.; Wang, C.; Wang, K. Multimodal Perception for Indoor Mobile Robotics Navigation and Safe Manipulation. IEEE Trans. Cogn. Dev. Syst. 2024, 1–13. [Google Scholar] [CrossRef]
  117. Kim, H.; Ohmura, Y.; Kuniyoshi, Y. Transformer-based deep imitation learning for dual-arm robot manipulation. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; pp. 8965–8972. [Google Scholar]
  118. Fishman, A.; Walsman, A.; Bhardwaj, M.; Yuan, W.; Sundaralingam, B.; Boots, B.; Fox, D. Avoid Everything: Model-Free Collision Avoidance with Expert-Guided Fine-Tuning. In Proceedings of the CoRL Workshop on Safe and Robust Robot Learning for Operation in the Real World, Munich, Germany, 9 November 2024. [Google Scholar]
  119. Huang, X.; Batra, D.; Rai, A.; Szot, A. Skill transformer: A monolithic policy for mobile manipulation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 1–6 October 2023; pp. 10852–10862. [Google Scholar]
  120. Xing, D.; Xia, W.; Xu, B. Kinematics learning of massive heterogeneous serial robots. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 10535–10541. [Google Scholar]
  121. Alkhodary, A.; Gur, B. Kinematics transformer: Solving the inverse modeling problem of soft robots using transformers. arXiv 2022, arXiv:2211.06643. [Google Scholar]
  122. Alkhodary, A.; Gur, B. KineFormer: Solving the Inverse Modeling Problem of Soft Robots Using Transformers. In Proceedings of the 7th EAI International Conference on Robotic Sensor Networks (ROSENET 2023). EAI/Springer Innovations in Communication and Computing, Istanbul, Türkiye, 15–16 December 2023; Springer: Cham, Switzerland, 2023; pp. 31–45. [Google Scholar]
Figure 1. Fully connected feed-forward Deep Neural Network for learning inverse kinematics.
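To make the architecture of Figure 1 concrete, the following minimal sketch trains a fully connected feed-forward network to map a Cartesian target to joint angles. The 3-DOF planar arm, link lengths, layer sizes, and hyperparameters are illustrative assumptions rather than values taken from any reviewed work; the loss is computed in Cartesian space through a differentiable forward-kinematics model, which sidesteps the fact that a single pose may admit several valid joint solutions.

```python
# Minimal sketch of the Figure 1 idea: a fully connected feed-forward
# network mapping a Cartesian target (x, y) to joint angles. The 3-DOF
# planar arm, link lengths, sizes, and hyperparameters are illustrative.
import torch
import torch.nn as nn

L = torch.tensor([0.4, 0.3, 0.2])               # assumed link lengths (m)

def forward_kinematics(q):
    """End-effector position of a planar 3-DOF arm; q has shape (N, 3)."""
    a = torch.cumsum(q, dim=1)                  # absolute link orientations
    x = (L * torch.cos(a)).sum(dim=1)
    y = (L * torch.sin(a)).sum(dim=1)
    return torch.stack([x, y], dim=1)           # (N, 2) end-effector poses

model = nn.Sequential(                          # the feed-forward IK network
    nn.Linear(2, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 3),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(5000):
    q = (torch.rand(256, 3) - 0.5) * torch.pi   # random joint configurations
    x = forward_kinematics(q)                   # their reachable poses
    # Loss in Cartesian space, through the differentiable FK: the network is
    # not penalized for returning a different but equally valid IK solution.
    loss = nn.functional.mse_loss(forward_kinematics(model(x)), x)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```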
Figure 2. Fully connected feed-forward Deep Neural Network for learning control with obstacles.
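The controller of Figure 2 can be sketched in the same spirit: a feed-forward policy that maps the instantaneous state (joint angles, Cartesian target, obstacle position) to a bounded joint-velocity command, as in the DRL entries of Table 2. All dimensions, the integration step, and the reward shaping below are illustrative assumptions, not the design of any reviewed paper.

```python
# Sketch of the Figure 2 idea: a feed-forward policy mapping the current
# state (joint angles, target, obstacle position) to a joint-velocity
# command. Sizes, bounds, and reward shaping are illustrative assumptions.
import torch
import torch.nn as nn

n_joints, obs_dim = 7, 3
policy = nn.Sequential(
    nn.Linear(n_joints + 3 + obs_dim, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, n_joints), nn.Tanh(),        # commands bounded to [-1, 1]
)

def control_step(q, target, obstacle, dt=0.01, qdot_max=1.0):
    """One closed-loop step: query the policy and integrate the command."""
    state = torch.cat([q, target, obstacle])
    qdot = qdot_max * policy(state)             # scaled joint velocities
    return q + dt * qdot                        # next joint configuration

def reward(ee_pos, target, clearance, eps=0.05):
    """Typical DRL shaping: approach the target, penalize low clearance."""
    return -torch.norm(ee_pos - target).item() - (10.0 if clearance < eps else 0.0)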
Figure 3. Fully connected feed-forward Deep Neural Network for learning motion planning with obstacles.
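In contrast with the closed-loop controller, the planner of Figure 3 produces a whole trajectory at once. A hedged sketch follows: a feed-forward network that maps the start configuration, target, and obstacle position to a fixed-horizon sequence of joint-space waypoints in a single forward pass (offline planning). The horizon and all sizes are illustrative assumptions.

```python
# Sketch of the Figure 3 idea: a feed-forward planner producing, in one
# forward pass, a fixed-horizon sequence of joint-space waypoints from the
# start configuration, target, and obstacle position. All sizes assumed.
import torch
import torch.nn as nn

n_joints, obs_dim, horizon = 7, 3, 20
planner = nn.Sequential(
    nn.Linear(n_joints + 3 + obs_dim, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, horizon * n_joints),
)

def plan(q0, target, obstacle):
    """Return a (horizon, n_joints) joint-space trajectory toward the target."""
    out = planner(torch.cat([q0, target, obstacle]))
    return out.view(horizon, n_joints)
```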
Table 1. Relevant results for each DNN model for inverse kinematics.

| Reference | Model | Year | Metric | Performance | DOF | Time (s) | Advantages | Limitations |
|---|---|---|---|---|---|---|---|---|
| [43] | MLP | 2023 | Accuracy | 98.49% | 8 | 0.243 | Simple architecture. | Does not work well in large workspaces. |
| [50] | CNN | 2023 | Accuracy | 94.66% | 6 | 0.637 | Extracts significant features from the input dataset. | Complexity similar to RNN but worse performance. |
| [50] | RNN | 2023 | Accuracy | 95.34% | 6 | 0.659 | Deals with dependencies in time series. | More complex, higher time cost. |
| [63] | GAN | 2021 | Accuracy | 88.6% | 7 | 0.0125 | Learns a distribution of valid robot configurations. | Useful for initial guesses but lacks further precision. |
| [68] | VAE | 2023 | Distance | 0.5 cm | 7 | 0.0272 | Models the IK solution space. | Not as developed as other methods. |
| [75] | GNN | 2022 | Accuracy | 90% | 6 | 0.0097 | One network for every robot. | Not as developed as other methods. |
| [83] | DRL | 2024 | Accuracy | 98% | 4 | - | No training dataset; simpler and less sensitive to hyperparameters. | Sim2real transfer. |
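As a companion to the accuracy and time columns of Table 1, the following hedged sketch shows one common way such figures are produced: sample test poses, query the trained model, and measure Cartesian error and per-query inference time. It reuses `model` and `forward_kinematics` from the Figure 1 sketch above; the 1 cm success tolerance is an illustrative assumption, not a standard from the reviewed works.

```python
# Hedged evaluation sketch for metrics like those in Table 1, reusing
# `model` and `forward_kinematics` from the Figure 1 sketch.
import time
import torch

q_test = (torch.rand(1000, 3) - 0.5) * torch.pi
x_test = forward_kinematics(q_test)             # ground-truth reachable poses

t0 = time.perf_counter()
with torch.no_grad():
    q_pred = model(x_test)
t_per_query = (time.perf_counter() - t0) / len(x_test)

err = torch.norm(forward_kinematics(q_pred) - x_test, dim=1)
accuracy = (err < 0.01).float().mean().item()   # fraction within 1 cm
print(f"accuracy={accuracy:.2%}, mean error={err.mean().item():.4f} m, "
      f"time/query={t_per_query:.2e} s")
```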
Table 2. Relevant results for motion control with obstacle avoidance via DNN.

| Reference | Model | Year | Control | Dynamic Environment | DOF | Orientation Considered | Advantages | Limitations |
|---|---|---|---|---|---|---|---|---|
| [88] | MLP | 2024 | Position | No | 5 | No | Simple architecture. | Requires a dataset limited to points far from obstacles. |
| [89] | RNN | 2023 | Velocity | No | 7 | No | Problem can be treated as a time series. | Better suited for motion planning. |
| [90] | Unsupervised | 2023 | Position | No | 7 | Yes | Does not need obstacles added to the dataset. | Not many examples. |
| [95] | DRL | 2022 | Both | Yes | 7 | Yes | Simplest way to add obstacles. | Sim2real transfer. |
Table 3. Relevant results for deep learning obstacle avoidance via motion planning.

| Reference | Model | Year | Planning | Dynamic Environment | DOF | Orientation Considered | Advantages | Limitations |
|---|---|---|---|---|---|---|---|---|
| [102] | RNN | 2021 | Online | Yes | 6 | Yes | Problem can be treated as a time series. | Problem must be reformulated. |
| [63] | GAN | 2021 | Offline | Yes | 7 | No | Learns a distribution of valid robot configurations. | Problem must be reformulated and embedded in a larger scheme. |
| [107] | VAE | 2022 | Offline | No | 7 | No | Models a latent space of target, joint positions, and obstacles. | Problem must be reformulated and embedded in a larger scheme. |
| [114] | DRL | 2024 | Online | No | 7 | Yes | Simplest way to formulate the problem. | Sim2real transfer. |