1. Introduction
Robot configuration is based on several actuated joints that link the rigid elements in a series-wise, parallel, or hybrid approach for a general-purpose manipulation task. For precise motion tracking, the focus is on the relation between the base frame, located at the robot ground, and the end-effector frame, that defines the physical interaction with the external environment.
Multi degree-of-freedom (DOF) robots exhibit a working volume that enables the pose reaching in different configurations of the kinematic chain, except for the singularity positions. The robot’s kinematics formulate the transfer functions between the joint actuators’ values and the corresponding Cartesian coordinates of the end-effector with respect to the base frame. The forward kinematics (FK) equations describe the robotic manipulator, from joints to Cartesian coordinates, computing the position and orientation of an element (e.g., tool center point—TCP) and defining the required motion of the robot actuators. The function is unique, a given joint configuration corresponds to a specific Cartesian position and orientation. The robot control and the trajectory planning task require the robot joint space information.
In the literature, a significant amount of research work has been presented dealing with this set of equations through matrix multiplications and mathematical procedures [
1,
2]. Most industrial robots are developed with simple-form kinematics to obtain analytic closed-form solutions. Nevertheless, several factors, such as manufacturing tolerances, wear, transmission errors, and set-up errors, compromise the internal design model used in the robot controller, affecting the transformation accurately and deviating from the desired locations.
For these reasons, the inverse kinematic (IK) formulation may be an alternative approach to obtain joint variables for any cartesian coordinate information in reaching a required point. The inverse transformation may be a computationally intensive procedure for general-form multi-link manipulators, depending on the modeling, formulation used for the kinematics, robot structure, and workspace. In IK, for all the poses within the workspace volume exist
n potential joint configurations, where
n is the redundancy degree. Robots capable of reaching a point in the cartesian space with multiple configurations are known as redundant robots since they have more degrees of freedom than required to perform the point-reaching task [
3].
The closed-form analytical IK solution becomes more complex as the DOF increases, requiring optimizations and constraints [
4,
5,
6]. There is a set of conventional techniques to solve IK problems such as algebraic, geometric, and iterative methods [
1,
7,
8]. Nevertheless, these techniques show several issues for robots with complex structures, such as closed-form solutions not guaranteed, high computation costs, or results depending on the start point. For these types of manipulators, optimization methods, such as particle swarm optimization (PSO) [
9,
10,
11] or genetic algorithm (GA) [
12,
13], have been adopted to avoid high-order polynomials. Nevertheless, the impact of the initial setting and the inadequate convergence are constraints that need to be regulated [
14].
Recently, machine learning (ML) procedures have been used to model the IK, as in other industrial applications [
15,
16,
17,
18,
19,
20,
21]. The predictions are promptly obtained by comparing numerical and optimization methods, using data from the offline training phase.
One of the most recognized applications of ML schemes observed in literature is based on artificial neural networks (ANN). A planar ANN scheme was proposed by Duka in [
22], creating a dataset from the FK of a 3-DOF planar robot, used to train a shallow network with 100 neurons in its hidden layer. The results showed a mean square error (MSE) equal to 0.0054387 rad in the joint space. Habibkhah and Mayorga, in [
23], implemented an orientation function based on the cosine for a shallow network with 12 neurons, obtaining a validation of MSE equal to 0.00038523 rad. Varedi-Koulaei et al. deployed a 5-layer ANN based on an identical planar manipulator. The proposed scheme was designed with the PSO algorithm, obtaining a final validation of MSE close to 0.00013 rad in joint space [
24]. However, the inverse kinematic computation was simplified in the planar motion configuration due to the dimensional reduction.
Li et al. presented an interesting study in [
25], increasing the analysis complexity. Compared to previous works, the authors modeled an industrial robot with 6-DOFs (RS10N), using a shallow single hidden layer back-propagation neural network with 30 hidden neurons, applied in a constrained workspace and validated on a rectangle trajectory. The obtained results showed an error lower than ±0.05° with a maximum deviation of 1.00 mm from the horizontal axis. A further method was developed by Narayan et al. [
26], proposing the ANFIS (adaptive neuro-fuzzy inference systems) technique to predict the IK of a 4-DOF SCARA robot in a reduced workspace. The maximum observed deviation of the end-effector was 0.3775 mm in X-direction, 0.4135 mm in Y-direction, and 0.0027 mm in Z-direction, respectively. Nevertheless, these applications considered limited workspaces with planar or highly constrained robot motions, facilitating the corresponding inverse kinematic computation and reducing the effect of the estimated errors. To avoid the workspace limitations, an alternative approach was presented by Shah et al. [
27], dividing the spatial workspace into quadrants and applying a 5-layer ANN scheme. The study aimed to model a 5-DOFs robot. The application was validated for a single quadrant, obtaining a joint error with a range equal to ±0.003 rad. Almusawi et al. [
28] suggested a further method to address the spatial complexity in modeling the inverse kinematics of a 6-DOF robot (Denso VP6242). Inner feedback data for the current joints’ values was included as an extra input to the 10-hidden layers ANN. The process validation was performed using a spring curve trajectory demonstrating a maximum error in the Z-direction lower than 0.35 mm. Despite the promising results, the inclusion of the joint values may increase the complexity in the scheme generalization since it is based on the trajectory as an input of the ANN requirement. Singh et al. [
29] proposed a long short-term memory (LSTM) method regarding a recurrent neural network model, with two hidden layers of 10 LSTM neurons. The aim was to represent the inverse kinematics of the left arm of the Baxter Robot (7 DOFs). The study showed a final MSE lower than 0.02, obtained from the test data set. A tree-based algorithm was implemented by Thomas et al. [
30] to model the inverse kinematics of a 6-DOF parallel manipulator. The dataset was generated by recording the end-effector poses with a camera and the corresponding joint values with ultrasonic sensors. The random forest (RF) algorithm showed the best training results, with a determination coefficient equal to 94.61% and a root-mean-square error (RMSE) close to 0.036 m.
Köker et al. [
31] proposed the committee schemes as an alternative solution to represent a 6-DOF redundant robot, Hitachi M6100. The concept was based on 10 parallel shallow networks, simplified to 6 networks, with variable hidden neurons. The scheme output was selected from the parallel networks based on the minimization of the Cartesian error, obtained using the forward kinematics. Compared to a global ANN approach, the proposed method showed a significant reduction of the final error range, from 5.76 to 13.41 mm to 0.39 to 0.74 mm. Nevertheless, the multiple model training time and the higher amount of data required significant computational cost due to the high complexity.
To improve the model’s accuracy, further methodologies are the hybrid procedures based on ML models and optimization algorithms applied in the cascade form. In [
32], Köker et al. implemented a hybrid simulated annealing with GA to reduce the shallow network (46 hidden neurons in the hidden layer) error to 1.0 µm for a 4-DOF manipulator. Similarly, a neuro annealing approach based on Elman recurrent networks for the Stanford and PUMA 560 robotic manipulator was applied in [
33]. This study proposed a committee system with Elman networks as in [
31], optimizing the final output using a simulated annealing and reducing the MSE from 2.10 to 0.0000079, in joint space. Zhou et al. [
34] proposed the GA approach by applying an extreme learning machine (ELM), combined with a sequential mutation genetic algorithm (SGA). This work focused on a 6-DOF Stanford MT-ARM robot, highlighting a model error equal to 0.0225 mm with a standard deviation of 0.0081 mm.
Although the literature shows promising methods and significant results in addressing ML strategies to achieve effective and robust manipulator’s IK formulation, further studies are required to evaluate novel techniques for robot inverse kinematic modeling.
In particular, the state of art shows that several ML methods are based on general models to compute all the joint values simultaneously, increasing the computational cost and complexity. In this way, the development of a sequential methodology focused on the single joint calculation may represent a novel paradigm [
35,
36].
This paper aims to investigate and deploy an ML-model using a sequential procedure for an accurate IK formulation, addressing the model prediction using a data-driven approach. At each joint, the input vector includes the previously obtained actuators’ values and the Cartesian coordinates with orientation. Moreover, the ANN architecture is defined through an optimization phase based on the GA technique.
The main contribution of this study is to validate a complete sequential ML model methodology, assessing the results in terms of joint and Cartesian error. Additionally, the results are compared with the global approach. A further contribution is to integrate the Genetic Algorithm optimization into the IK procedure to define the network structure: the hidden layers and hidden neurons per layer. The GA score function depends on the normalized root mean square error, the maximum error, and the coefficient of determination. The novel sequential methodology is validated in a simulated environment (IRB140) and with a real physical robot (LR-Mate 200iC).
The paper is organized as follows:
Section 2 presents the novel methodology to model the robot inverse kinematics, describing the sequential procedure. Following the proposed methodology, a simulated case study applied to an articulate robot (IRB140) is described in
Section 3. Results are discussed and compared with other studies.
Section 4 shows the experimental campaign performed on a 6 DOFs robot (FANUC LR-Mate 200iC), evaluating the obtained results and validated the proposed sequential methodology for robot inverse kinematic modeling. Finally,
Section 5 presents the conclusions discussing the study findings and prospective.
2. Methodology Structure
The IK function defines the joints’ configuration (Q) required to reach the target frame with the Cartesian coordinates (S) of the end-effector. This study models the IK procedure adopting a sequential set of ANNs. The flow diagram of the proposed methodology is presented in
Figure 1. Three macro-steps can be depicted, an initial phase (“
IK boundary definition and pre-processing”) where the robot is chosen and the dataset is produced with the FK equations, obtained automatically through an iterative procedure based on the D-H parameters or by manual setting. The corresponding modeling is evaluated with the
Mean Euclidean Error (MEE) for a set of predefined trajectories in order to verify the dataset quality on the machine learning model. Once the FK is validated, the dataset is produced. The following step aims to define the ANN structure (“
Network Architecture Definition”) for the IK model. Two different approaches may be selected, a
global approach, where all the joints are computed together, or the novel
sequential approach, where the joints are computed sequentially from the TCP or the Base, integrating the previously modeled joint as extra input for the following one. In order to avoid a manual inference of the model structure, the genetic algorithm (GA) optimization is included with a predefined tunable cost-function. Finally, the produced IK model is validated using the same trajectories employed in the initial step FK validation.
Table 1 summarizes the methodology in terms of input and output elements of the proposed IK procedure.
2.1. Dataset Generation
This phase consists of the robot selection and the D-H parameters estimation for the FK dataset creation with a pseudo-random generator. The FK equations compute the Cartesian coordinates from the joints’ values using the D-H parameters. The axes X, Y, and Z are defined from the origin of each joint, positioned at the actuator centre: Z is the direction of the joint axis; X is the axis perpendicular to both Z
i and Z
i−1, from the latter; Y is derived from Z and X axes. The D-H parameters based on the presented axes definition are [
37]:
Joint angleθi: rotation angle about the Zi−1 axis, determined from Xi−1 to Xi. The angle corresponds to the joint value if the joint has a rotary actuator.
Joint distancedi: distance along the Zi−1, from the origin of the (i − 1)th frame to the intersection of the Zi−1 with Xi.
Link lengthai: distance along the Xi axis, from the intersection between the Zi and the Zi−1 axis to the origin of the ith coordinate frame.
Link twistαi: rotation about Xi, defined from Zi−1 to the Zi axis.
The D-H (
Rn×4) transformation matrices are obtained in the Equations (1) and (2), where
k is the joint number of the robotic manipulator. The transformation matrices are computed to obtain the relative modeling between consequent joints, since the multiplication of the complete set of matrices generates the model between the base frame and the final joint frame. The homogeneous transformation matrix A is defined based on the D-H parameters as follows:
where
and
correspond to the cosine and sine functions of
;
and
correspond to the cosine and sine functions of
.
As stated, the transformation matrix from the end-effector to the base frame may be obtained as a product of matrix multiplication, as:
where vectors
n,
s, and
a define the manipulator orientation and vector
p represents the position in Cartesian coordinates. The produced matrix corresponds to the FK equations. Nevertheless, the manual definition of these parameters is a time-consuming activity; considering the novel methodology, an automatic procedure is implemented, as presented in [
38]. This is an alternative method for the user, however, a manual approach may also be pursuable if the user aims to insert the D-H parameters or the FK equations, directly.
The FK model validation is a critical point in the proposed methodology, due to the data-driven nature of the IK model. The employed dataset needs to be coherent with the reality to prevent misleading results. In fact, the FK validation is carried out with a set of four different trajectories, as shown in
Figure 2a–d, with the base reference frame fixed at Zero-World [0; 0; 0; 0; 0; 0].
Furthermore, the robustness of the direct model is assessed by varying the main trajectory parameter on two different levels: reducing (low level) and expanding (high level) by 20% the initial path. This perturbation, arbitrarily selected, permits to evaluate the effectiveness of the direct model in multi-conditions, recreating a set of trajectories in the workspace. The corresponding paths are:
Perturbed parameter: radius
Perturbed parameter: radius
Perturbed parameter: radius
Perturbed parameter: length
bConsidering the robot workspace, the Cartesian error should be evaluated in terms of Euclidean distance since it is more representative than the single-axis evaluation. Therefore, the MEE index is employed to assess the FK performances, as a measure of the mean 3D distance error. The MEE is obtained as defined in Equation (12):
where
describe the error along X-, Y-, Z- axes, respectively; and
is the number of samples.
As shown in the flow diagram presented in
Figure 1, if the obtained MEE is greater than the chosen threshold, the program requests the manual input of the D-H parameters (or the FK equations) if the automatic procedure has been initially selected, or their correction in case of the manual path setting.
Once the performance is validated, the FK equations are used to produce the robot dataset, considering the unique mapping between the input joint values and the resulting cartesian coordinates. Furthermore, a pseudo-random generator is considered to obtain the RN×m matrix, based on the minimum and maximum joints’ values, where n is the number of DOFs of the selected robot and m is the number of joint space data points (10,000 elements by default). Considering the IK methodology, this data corresponds to the output dataset for the ML model; in fact, the input dataset is obtained by computing the cartesian coordinates, position and orientation, based on the FK equations.
2.2. Model Architecture Definition
The second step of the sequential methodology,
Figure 1, is the “
network architecture definition”. The procedure begins with the dataset division: 70% for training, 15% for testing, and 15% for validation. This is a conventional approach for ML applications [
39,
40,
41]. The validation dataset is generally employed for feature selection of model fine-tuning and it marginally influences the ML model. The test dataset is considered as an unbiased dataset for the performance evaluation. A different method that could be employed to reduce the three-part partition would be the k-fold cross-validation to tune the model hyperparameters [
42,
43,
44].
The network structure is determined by the number of hidden layers, the number of neurons per hidden layer, and the activation function for each layer. The depth of the network determines the capability of the architecture to model non-linearities and complex structures. The first two parameters are tuned with the GA, which is a recognized stochastic search optimization algorithm inspired by the evolution theory, while the activation function is implemented as
tan-sigmoid (tansig) to reduce the required application time. The number of hidden layers is limited from 3 to 7, and the neurons per layer is defined in the range from 10 to 80, with a 5 hidden neurons resolution. The GA iteration score is determined by Equation (13), based on training and validation performance measurements obtained by Equations (14)–(16):
where
is the target value,
and
are the maximum and minimum of the target element;
is the predicted value;
is the sample number;
is the
normalized root mean square error;
is the
maximum absolute error; and
is the coefficient of determination. The
v and
tr correspond to the
validation and
training dataset, respectively. The
α and
β coefficients need to be tuned by the user to apply the optimization algorithm. Considering the sequential network deployment, the error coefficients and the final score are expressed in the
joint space error [rad].
The defined cost-function for this method is based on three performance coefficients: , , and . The first is a normalized coefficient correlated with the error’s range and independent on the error’s dimension. This is a critical point since the robot dimension and the corresponding workspace influence the non-normalized error.
Furthermore, the maximum error and the coefficient of determination are employed together, as shown in Equation (13). In particular, the maximum error corresponds to the maximum Euclidean distance, used as a coefficient of error distribution. Its value is typically higher than the normalized coefficient, thus, a reduction factor is included based on the value, the coefficient associated with the correlation in terms of variability dependency. Considering the presence of the performance measurements for the training and the dataset validation, the constant coefficients α and β are used to induce a proportion index.
The validation dataset is crucial for the model performance evaluation with respect to the training dataset. For this reason, this application defines α equal to 0.8 and β equal to 0.2, respectively.
Finally, during the GA optimization iterations, validation threshold (70%) is verified to define high-cost function values for those non-compliant models, depicted as “ANN design Rsq < 70%”.
Two different approaches are examined for the model comparison and validation, the traditional global approach, where the cartesian coordinates are given as input and all the joints are computed simultaneously; and the sequential approach derived from the single joint computation. The sequential concept is based on the FK obtention with the ordered matrix multiplication, from the base to the end-effector. Similarly, the deployed networks are applied to learn the IK of one joint at a time; the first network input is the Cartesian coordinates and orientation, and the output is the final joint value. The following network has the same input, including the previous predicted joints values. This approach has two main advantages: (1) it increases the information provided to the ML model; (2) it allows a parallel scheme training as each joint model can be trained and optimized, separately. Furthermore, the modularity of the presented procedure allows the model evaluation for each joint, implementing multiple models for the same robot and considering the complete mechanism as a set of simpler elements.
The GA is used to optimize each joint model for the sequential scheme. A further variability term is incorporated in the model deployment to change the direction of the sequential approach. In fact, the system finds the optimized structure starting from the tool center point (TCP), as the FK computation, which means that the first ANN computes the last joint firstly, using it as an input for the following joints, up to the base or from the base, computing the opposite path. This implementation increases the required optimization time; however, it provides a further degree of freedom in terms of model deployment in reducing the consequent error. A
Sequential approach example from the TCP to the base is shown in
Figure 3.
2.3. Model Validation
The final step of the proposed methodology is the “
network architecture validation” for the
global and
sequential configurations, providing the IK solution with the associated error. The prediction accuracy for the network is measured using the
mean square error (MSE),
mean absolute error (MAE),
root mean squared error (RMSE), and the
maximum error (MAXE). Furthermore, the trajectories employed for the FK validations, presented in
Figure 2a–d, are used to validate the IK model performance.
5. Conclusions
This paper proposes a procedure to generate a ML model using a sequential scheme for an accurate IK formulation. Two anthropomorphic robots are studied, evaluating the spatial trajectory error obtained from a data-driven model in the simulated (ABB IRB140) and the laboratory experimental (FANUC LR Mate 200iC) environments. A 3-step procedure is proposed: (i) IK boundary definition and pre-processing, (ii) network architecture definition based on GA, and (iii) network validation. The models show promising results for the validation of the trajectories, executed in critical workspace positionings of the robot. The goodness of IK methodology is confirmed by evaluating the errors of the joint space and the trajectories in cartesian space.
In the simulation, the sequential procedure shows a reduction of the mean squared error index of 42.7–56.7%, compared to the global scheme. In particular, the most effective approach is obtained by modeling the robot starting from the base to the TCP, with a final mean square error equal to 1.73 × 10−6 [rad2] in joint space. Comparable results are collected from the experimental campaign with a MAE lower than 0.60 mm, using the sequential method. As in the simulated environment, the global approach highlights a greater error, that is close to 0.72 mm.
Compared with other studies in the literature [
23,
31,
32,
33,
34,
35], the proposed methodology offers several contributions. The first point is the application of simulated data points based on the FK equations obtained from an automatic procedure with a reduced experimental dataset to train the ML model. Usually, it is required a significant computation of the FK model or the D-H parameters for the dataset creation [
35]. This approach allows the creation of large datasets without unnecessary experimental measurements, reducing the application time. Furthermore, the automatic proposed procedure does not require user-time since the data generation and ANN configuration are done by the deployed methodology. In addition, if the robot does not allow a continuous data sample to produce the corresponding D-H parameters, the user may implement the direct FK equations within the program manually.
A further advantage of this methodology is the inclusion of the genetic algorithm to define and optimize the network structure. This step permits the user to avoid any kind of ANN parameters tuning; however, the limits and discretization for the number of hid-den layers and hidden neurons for each layer has to be defined. The fine-tuning requirement may represent a boundary limitation for the complete methodology application, and it may directly influence the results obtained during the model deployment and the computational training time.
The methodology also provides robust modeling in dealing with the multiple IK solutions, compared to the global approach or the pure individual joints method [
22,
23,
24,
27,
35]. In particular, the proposed case studies showed a significant reduction of the final error with respect to the global approach [
35], validating the model capabilities.
Finally, the sequential scheme capability to set and define a given joint value allows to choose a given robot configuration, especially when the initial joints are selected. This point will be explored in future applications focused on trajectory optimization schemes, minimizing the joint movement and redundant solutions. It is a significant advantage compared to global approaches [
22,
23,
24], in which the individual joints are neglected, and single-joint modeling [
35], that considers each joint as an independent element.
In this scenario, the future advances of this study will focus on dealing with high-redundant schemes and non-industrial robots as non-rigid real robots, that are complex to model analytically. In the same way, the methodology may be improved to solve configuration limitations due to workspace constraints by external objects, reduce the motion or include power consumption optimization schemes.