**1. Introduction**

The rotate vector (RV) reducer has the advantages of small size, compact structure, and a large transmission ratio [1], and is mainly used in robot joints requiring high precision and large load capacity. The quality of its assembly determines the performance, production cost, and production efficiency of the product. Foreign manufacturers already have a relatively complete theoretical system for RV reducers, while domestic RV reducers have been developed for many years without major breakthroughs in accuracy. The dynamic transmission error of an RV reducer depends on the manufacturing error of each component, assembly errors, and elastic deformation. The material properties of parts are easy to determine, but manufacturing accuracy is difficult to guarantee. Therefore, companies generally measure parts and select matched combinations during assembly to improve the dynamic transmission accuracy of the reducer.

In 2007, Kannan et al. used a particle swarm algorithm to obtain the best combination of parts [2] and successfully made the assembly deviation smaller than the sum of the part tolerances. Gentilini et al. established a finite element model that can predict and display the final shape of an assembly [3], removing the need for physical trial assembly and reducing the time and cost of product quality inspection. Khodaygan et al. proposed estimating assembly tolerances and the reliability of a mechanical assembly against quality requirements through Bayesian modeling [4], which can formulate accurate assembly functions for complex mechanical assemblies. Artificial neural networks are now widely used in speech recognition, computer vision, bioinformatics, and other fields. In recent years, some scholars have used neural networks to develop assembly models, avoiding the complicated operations and heavy workload of traditional accuracy-solving methods. Steinberg [5] utilized a gradient boosting classifier to identify assembly start delayers. Stark et al. selected three approaches based on regression, natural language processing, and clustering as intelligent services in the digital production process to reduce the time for resolving upcoming issues in assembly [6]. These works have guiding significance for improving assembly quality.

**Citation:** Jin, S.; Chen, Y.; Shao, Y.; Wang, Y. An Accuracy Prediction Method of the RV Reducer to Be Assembled Considering Dendritic Weighting Function. *Energies* **2022**, *15*, 7069. https://doi.org/10.3390/en15197069

Academic Editors: Xi Gu, Tangbin Xia, Ershun Pan, Rongxi Wang and Yupeng Li

Received: 31 August 2022; Accepted: 21 September 2022; Published: 26 September 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The current neuron model relies on the McCulloch–Pitts structure. This neuron weights the inputs at the synapses and then obtains the output through an activation function. It does not consider the local interactions performed by the dendritic structure, which is treated only as a signal conduit rather than as a computational part of the neuron. Experiments have shown that dendrites play an important role in information processing in biological neurons: they can nonlinearly integrate postsynaptic signals and filter out irrelevant background information [7–9]. The dendrites receive signals (action potentials) from upstream neurons; after the information is integrated and processed, it is transmitted to the neuron soma. The powerful information-processing capability of dendrites enables an individual neuron to process thousands of different synaptic inputs unequally [10]. Inspired by these biological phenomena, Yuki Todo et al. proposed a single dendritic neuron with synaptic, dendritic, membranous, and somatic layers [11]. Each synapse performs a sigmoid nonlinear operation on its input; the nodes in the synaptic layer contain the initial weights and thresholds of the dendritic neural network; each branch receives a signal and performs a multiplication operation at its synapse; the membrane sums the results of all branches and transmits the signal to the cell body. When the threshold is exceeded, the cell sends signals to other neurons through the axon. This model has been widely applied, for example in computer-aided medical diagnosis [12] and morphological hardware implementation [13]. Gang et al. argued that the dendritic neuron model proposed by Yuki Todo and Gao [14] is not expressive enough, can only express first-order inputs, and is not well suited to computer operation [15].
Considering that the definition of multivalued logic and the integration of potentials in biological neurons can both be described in multiplicative form, a dendrite (DD) module containing only matrix multiplication and the Hadamard product was proposed to simulate the interactive processing of data. Gang [16] developed a Taylor series using this dendritic network and constructed a relational spectrum model to analyze the synergy and coupling of hand muscles, which is helpful for the design of prosthetic hands. In this article, the DD module is used to develop a prediction model that captures internal structural dependence and increases the interaction between information, providing a new idea for predicting reducer assembly accuracy.

#### **2. Analysis of Quality Influencing Factors**

Transmission error (TE) is an important index for evaluating gear meshing quality and is directly related to the working accuracy, reliability, vibration noise, and service life of a gear transmission [17]. In the actual manufacturing and installation of the reducer, transmission errors are inevitable; the main sources are machining errors of the parts and assembly errors. Depending on the specification of the RV reducer, the transmission error is strictly limited to within 1~1.5 arcmin.

The RV reducer has a sophisticated and complex structure; its components mainly include the input shaft, planetary gears, planet carrier, flange, crankshafts, pin teeth, etc. Figure 1 is a schematic diagram of the RV-40E reducer. Figure 1a shows the component assembly structure of the reducer. The transmission of the RV reducer is divided into two stages: the first-stage involute planetary gear transmission and the second-stage cycloidal pinwheel transmission [18,19]. The movement of the input shaft drives the planetary gears to achieve the first-stage reduction. The planetary gears drive the crankshafts, which drive the swivel arm bearings and transmit the power to the cycloidal pinwheels. Under the combined action of the swivel arm bearings and the pin teeth, the cycloid wheel produces a revolution motion around the central circular axis of the pin teeth and an autorotation motion around its own axis. The autorotation drives the flange and output planet carrier at a 1:1 speed ratio to complete the second stage of reduction. The influence of the cycloidal pinwheel transmission on the transmission error is reflected directly on the output shaft, whereas the first-stage planetary gear mechanism is far from the output and its transmission ratio is only the reciprocal of the second-stage cycloidal pinwheel transmission ratio. Therefore, the transmission accuracy of the RV reducer mainly depends on the second-stage transmission. Figure 1b shows the schematic diagram of the transmission principle of the reducer [20,21]. The key parts involved in the cycloidal pinwheel transmission are the cycloid wheel, pinwheel, crankshaft, and crank bearing. In actual engineering assembly, however, the crank bearing clearance is usually adjusted to a small value, and its influence on the transmission error can be ignored. The error factors of the remaining key components were analyzed, and the degree of influence of each factor on the output shaft error was obtained. The parameters of the influencing factors θ = [θ₁, θ₂, ··· , θₙ] have errors Δθ = [Δθ₁, Δθ₂, ··· , Δθₙ]. The transmission error of the RV reducer can be written as ϕ(θ) = ϕ(θ₁, θ₂, ··· , θₙ), and the sensitivity index of each error can be defined as [22,23]:

$$S_N = \nabla \varphi(\theta) = \left[ \frac{\partial \varphi}{\partial \theta_1}, \frac{\partial \varphi}{\partial \theta_2}, \dots, \frac{\partial \varphi}{\partial \theta_n} \right] = [S_1, S_2, \dots, S_n] \tag{1}$$
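As a sketch of how Equation (1) could be evaluated numerically, the snippet below approximates each sensitivity index with a central finite difference. The transmission-error function `phi` here is a hypothetical stand-in for illustration only, not the reducer model from the paper.

```python
import numpy as np

def sensitivity_indices(phi, theta, h=1e-6):
    """Approximate S_N = grad phi(theta) from Eq. (1) by central differences."""
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = h
        grad[i] = (phi(theta + step) - phi(theta - step)) / (2 * h)
    return grad

# Hypothetical transmission-error function used only for illustration.
phi = lambda t: 2.0 * t[0] + 0.5 * t[1] ** 2
S = sensitivity_indices(phi, [1.0, 3.0])
```

Each entry of `S` plays the role of one sensitivity index Sᵢ = ∂ϕ/∂θᵢ, so relative sensitivities can be compared by normalizing against a reference entry, as the article does with the needle tooth center circle radius error.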


**Figure 1.** Schematic diagram of RV-40E reducer structure: (**a**) structure diagram; and (**b**) schematic diagram.

This article takes the RV-40E reducer as an example. The needle tooth center circle radius error is taken as the reference error, with a sensitivity index of 1, and the sensitivity indices of the other error parameters are compared against this reference. From the sensitivity calculation, the factors with larger sensitivity indices with respect to the transmission error are: the needle tooth center circle radius error δrp, the needle tooth pin radius error δrrp, the needle tooth pin hole circumferential position error δt∑, the equidistant modification error δΔrp, and the shift distance modification error δΔrrp.

#### **3. Developing a Network Model**

In a back propagation feed-forward neural network, each neuron takes the outputs of the nodes in the previous layer as its input; the input is linearly weighted and passed through a nonlinear activation function to obtain the node's output, which is then passed to the nodes in the next layer. When McCulloch–Pitts neurons are used to learn the relationship between the input space and the output space, the function of the dendritic structure as an information-processing part of the neuron is ignored. To address this, this article uses the logic extractor DD proposed by Gang to realize interactive data processing and constructs a dendrite neural network (DDNN) model for predicting the reducer transmission error. In this model, the dendrites first perform interactive preprocessing on the signals transmitted from upstream neurons and then transmit them to the cell body for linear weighting. Finally, the axon applies a nonlinear activation function and transmits the signal to downstream neurons. Compared with the McCulloch–Pitts neuron, the neuron model constructed with the DD module has lower computational complexity, stronger network generalization, and better fitting capability.

#### *3.1. Back Propagation Feed-Forward Neural Network*

In the artificial neuron, *X* = {*x*₁, *x*₂, ··· , *xₙ*} is the feature input, *n* is the number of input features, and *xᵢ* is the *i*th influencing factor. The input values of the neuron are transmitted through weighted connections; the sign of a weighting coefficient simulates the excitation or inhibition of a synapse, and its magnitude indicates the strength of the connection. The term ∑ᵢ₌₁ⁿ *wᵢxᵢ* is the cell body's integration of all input signals. The threshold *b* controls the activation of the neuron: when the weighted sum of the inputs exceeds the threshold, the neuron is activated and an output signal is generated to transmit information. With *f* a nonlinear activation function, the output *ŷ* can be expressed as:

$$\hat{y} = f(W^{2,1}X + b) \tag{2}$$

where *W*²,¹ is the weight matrix from the first layer to the second layer, and *b* is the bias.

The back propagation feed-forward (BP) neural network is a multi-layer feed-forward network trained with the error back propagation algorithm and is widely used at present. The network is usually composed of three layers: an input layer, a hidden layer, and an output layer. Its learning rule is gradient descent: the weights and thresholds between layers are gradually adjusted through back-propagated error signals to minimize the loss function of the network.

First, define *E* as the loss function propagated back from the following layer and *a* as the learning rate. The weights from the input layer to the hidden layer are represented by *vⱼ*, and those from the hidden layer to the output layer by *wⱼ*. The output vector of the output layer is *o*, and the expected output vector is *e*; *l*, *m*, and *n* denote the numbers of neurons in the output, hidden, and input layers, respectively. The output error is:

$$E = \frac{1}{2} \sum_{p=1}^{l} \left( e_p - o_p \right)^2 = \frac{1}{2} \sum_{p=1}^{l} \left\{ e_p - f \left[ \sum_{q=0}^{m} w_{qp}\, f \left( \sum_{j=0}^{n} v_{jq} x_j \right) \right] \right\}^2 \tag{3}$$

From the above formula, the output error of the network depends on the weights *vⱼ* and *wⱼ* of each layer, so the output error *E* can be changed by adjusting the weights *W* = {*vⱼ*, *wⱼ*}. The weight matrix update can be written compactly as:

$$W^{2,1(\text{new})} = W^{2,1(\text{old})} - a \frac{\partial E}{\partial W^{2,1(\text{old})}} \tag{4}$$
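The forward pass of Equation (2) and one gradient-descent update of Equation (4) can be sketched for a single layer as follows; the data, weights, and learning rate are illustrative values, not quantities from the paper.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 1))          # n = 3 input features (column vector)
W = rng.normal(size=(1, 3))          # weight matrix from layer 1 to layer 2
b = 0.0
a = 0.1                              # learning rate

y_hat = relu(W @ X + b)              # forward pass, Eq. (2)
e = np.array([[1.0]])                # expected output
# dE/dW for loss E = 0.5*(e - y_hat)^2, with the ReLU derivative
# written as an indicator on the sign of the pre-activation
grad = -(e - y_hat) * ((W @ X + b) > 0) @ X.T
W_new = W - a * grad                 # gradient-descent update, Eq. (4)
```

Repeating this update over many samples is exactly the BP learning rule described above; the loss after the step is never larger than before it for a sufficiently small learning rate.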

#### *3.2. Dynamic Adam Optimization Algorithm*

For machine learning problems with high-dimensional parameter spaces or large datasets, Kingma and Ba proposed the Adam optimization algorithm, a gradient-based method for optimizing the objective function with a learning rate that varies during the iterations [24]. It combines the strengths of AdaGrad and RMSProp: independent adaptive learning rates are set for different parameters through the first and second moments of the gradient, which makes the algorithm converge faster. It alleviates problems such as objective function fluctuation and maintains prediction performance on sparse-gradient problems and unstable data. The first- and second-order moment estimates and their bias corrections are calculated as follows:

$$m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, \mathrm{d}W^{l,l-1} \tag{5}$$

$$n_t = \beta_2 n_{t-1} + (1 - \beta_2) \left( \mathrm{d}W^{l,l-1} \right)^2 \tag{6}$$

$$\begin{cases} \hat{m} = \dfrac{m_t}{1 - \beta_1^t} \\[4pt] \hat{n} = \dfrac{n_t}{1 - \beta_2^t} \end{cases} \tag{7}$$

where *t* is the number of iterations, *mₜ* is the first-order moment vector, and *nₜ* is the second-order moment vector. *m̂* is the bias-corrected first-order moment, *n̂* is the bias-corrected second-order moment, β₁ is the first-order moment decay coefficient, and β₂ is the second-order moment decay coefficient; this article takes 0.9 and 0.999, respectively. *W^(l,l−1)* is the weight matrix from the (*l* − 1)th layer to the *l*th layer in the DD module. With the Adam optimizer, the weight matrix is updated as:

$$W^{2,1(\text{new})} = W^{2,1(\text{old})} - a^{2,1} \frac{\hat{m}}{\sqrt{\hat{n}} + \varepsilon} \tag{8}$$

where ε = 10⁻⁸ prevents division by zero.
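A minimal sketch of one Adam update implementing Equations (5)–(8), using the coefficients quoted above (β₁ = 0.9, β₂ = 0.999, ε = 10⁻⁸); the toy objective f(W) = W² is only for illustration.

```python
import numpy as np

def adam_step(W, dW, m, n, t, a=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update implementing Eqs. (5)-(8)."""
    m = beta1 * m + (1 - beta1) * dW              # Eq. (5): first moment
    n = beta2 * n + (1 - beta2) * dW ** 2         # Eq. (6): second moment
    m_hat = m / (1 - beta1 ** t)                  # Eq. (7): bias corrections
    n_hat = n / (1 - beta2 ** t)
    W = W - a * m_hat / (np.sqrt(n_hat) + eps)    # Eq. (8): weight update
    return W, m, n

W = np.array([1.0])
m = np.zeros_like(W)
n = np.zeros_like(W)
for t in range(1, 101):                           # minimize f(W) = W^2
    W, m, n = adam_step(W, 2 * W, m, n, t)        # gradient of W^2 is 2W
```

Because the update is normalized by the second moment, the effective step size stays near the learning rate `a` regardless of the gradient scale, which is the property that keeps convergence stable on sparse or fluctuating gradients.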

#### *3.3. Dendritic Unit*

In biological nervous systems, dendrites have been shown to perform logical operations [25]. The dendritic operation used in this article is the Hadamard product between the current input and the previous layer's output. The Hadamard product establishes the logical relationship between inputs, so the DD module is an expression of the logical relationships among features. The DD module can not only extract the local relationships between inputs but also use internal correlation information to strengthen the connection between features, improving network accuracy while capturing internal structural dependencies. The DD module is shown in Figure 2.

**Figure 2.** Dendrite module.

The expression of the dendrite module is as follows:

$$Z^l = W^{l,l-1} Z^{l-1} \odot X \tag{9}$$

where *Z^(l−1)* is the input of the module and *Z^l* is its output. *W^(l,l−1)* is the weight matrix from the (*l*−1)th layer to the *l*th layer. *X* = {*x*₁, *x*₂, ··· , *xₙ*} represents the original input. "⊙" denotes the Hadamard product, the multiplication of corresponding elements, which is used to establish the logical relationship between inputs.

It produces a matrix with the same dimensions as its operands. The features are deeply fused by multiplying corresponding elements, which better reflects the meaning of feature intersection.
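A minimal sketch of the DD module of Equation (9); the identity weight matrix is an illustrative choice, chosen so the module visibly reduces to the pure Hadamard interaction X ⊙ X.

```python
import numpy as np

def dd_module(W, Z_prev, X):
    """Dendrite (DD) module of Eq. (9): a matrix product followed by
    a Hadamard product with the original input X."""
    return (W @ Z_prev) * X   # '*' on ndarrays is the element-wise product

X = np.array([1.0, 2.0])
W = np.eye(2)                  # illustrative identity weights
Z1 = dd_module(W, X, X)        # reduces to X ⊙ X = [1, 4]
```

With identity weights the output carries the squared (second-order) feature terms, which is exactly the multiplicative feature interaction the text describes; trained weights mix features before the Hadamard step.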

The DD module is essentially an information processing method, similar to a variant of the self-attention mechanism. It pays attention to the input feature variable itself, increases the interaction between information, and strengthens the connection between features. By training its own information to update the parameters, it gives the network stronger information processing capability and better generalization capability.

#### *3.4. Developing a Dendritic Neural Network Model*

The artificial neuron structure ignores the capability of dendrites to process information interactively and uses them only to transmit information. In actual biological neurons, however, each neuron can have one or more dendrites. In this article, multiple DD modules are used to simulate dendritic functions and form a single dendritic neuron [26]. The dendrite extracts logical information from the data transmitted by upstream neurons; the information is interactively preprocessed and passed to the cell body for linear weighting; the signal is then nonlinearly activated and output by the axon. Figure 3 shows the structure of the neurons. As shown in Figure 3a, a biological neuron consists of dendrites, a cell body, an axon, and other structures. Figure 3b shows the filter, accumulator, and balancer in the dendritic neuron that simulate these biological functions, with the following expression:

$$\begin{cases} Z^1 = W^{1,0}X \odot X \\ y = f\left(W^{2,1}Z^1 + b\right) \end{cases} \tag{10}$$

where *X* = {*x*₁, *x*₂, ··· , *xₙ*} is the feature input, *n* is the number of influencing factors, and *W*¹,⁰ and *W*²,¹ are the weight matrices from layer 0 to layer 1 and from layer 1 to layer 2, respectively. *b* is the bias and *f* is the nonlinear activation function; ReLU is selected here.
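Equation (10) can be sketched as a single dendritic neuron; the all-ones weight matrices below are illustrative placeholders, not trained values.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def dendritic_neuron(X, W10, W21, b):
    """Single dendritic neuron of Eq. (10): DD module, then the soma."""
    Z1 = (W10 @ X) * X               # Z^1 = W^{1,0} X ⊙ X
    return relu(W21 @ Z1 + b)        # y = f(W^{2,1} Z^1 + b), f = ReLU

X = np.array([1.0, 2.0, 3.0])
W10 = np.ones((3, 3))                # illustrative weights, not fitted values
W21 = np.ones((1, 3))
y = dendritic_neuron(X, W10, W21, 0.0)
```

With these placeholder weights, W¹⁰X = [6, 6, 6], the Hadamard step gives [6, 12, 18], and the soma sums them to 36, showing how the dendrite injects second-order feature terms before the linear weighting.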

**Figure 3.** The structure of neurons: (**a**) biological neuron structure; and (**b**) functional structure diagram of dendritic neuron model.

According to the source of the transmission error, the five influencing factors are divided into two dimensions, with the information of each dimension represented by a different neuron. Figure 4 shows the fusion of information from the different neurons, from which the transmission error prediction model was constructed.

The architecture of DDNN can be expressed as follows: 

$$\begin{cases} Z^1_v = W^{1,0}_v V \odot V \\ h_1 = f\left(W^{2,1}_v Z^1_v + b\right) \end{cases} \tag{11}$$

$$\begin{cases} Z^1_e = W^{1,0}_e E \odot E \\ h_2 = f\left(W^{2,1}_e Z^1_e + b\right) \end{cases} \tag{12}$$

$$\begin{cases} A = W^{1,0}_h H \odot H \\ y = f\left(W^{2,1}_h A + b\right) \end{cases} \tag{13}$$

where *V* = {*v*₁, *v*₂, *v*₃}: *v*₁ is the needle tooth center circle radius error δrp, *v*₂ is the needle tooth pin radius error δrrp, and *v*₃ is the needle tooth pin hole circumferential position error δt∑, all from the pinwheel component. *E* = {*e*₁, *e*₂}: *e*₁ is the equidistant modification error δΔrp and *e*₂ is the shift distance modification error δΔrrp, both from the cycloid wheel component. *H* = {*h*₁, *h*₂}. The feature vectors are combined into a tensor, which is fed into the DDNN model for training. The information is interactively preprocessed in the DD modules and then transmitted to the cell body for linear weighting; the axon then applies the nonlinear activation and transmits *h*₁ and *h*₂ to the next layer of DD modules. Each DD module connects to non-repeating input units from the upper layer and repeats the above learning rule; finally, the axon outputs the predicted value. The overall DDNN prediction process is shown in Figure 5. First, the sample data set is divided into a training set and a test set. Second, the factors affecting the quality of the reducer in both sets are taken as eigenvalues and the transmission errors as labels. The training set is imported into the DDNN model for training: Equations (11)–(13) are trained to minimize the loss function, using the dynamic Adam optimization algorithm and error back propagation to update the thresholds and weights. When the convergence of the training loss reaches the expectation, training stops and the model is saved. Finally, the test set is fed into the trained model to obtain the prediction results.
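The forward pass of Equations (11)–(13) can be sketched as follows, assuming each dimension-specific neuron receives its own feature group (V or E) and the fusion neuron receives H = {h₁, h₂}; all weights and feature values below are illustrative, not the trained parameters of the paper.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def dd_neuron(X, W10, W21, b):
    """One dendritic neuron: DD module (Eq. 9) followed by the soma."""
    Z1 = (W10 @ X) * X
    return relu(W21 @ Z1 + b)

def ddnn_forward(V, E, params):
    """Forward pass of the DDNN of Eqs. (11)-(13): two dimension-specific
    dendritic neurons whose outputs h1, h2 are fused by a third neuron."""
    h1 = dd_neuron(V, params["Wv10"], params["Wv21"], params["b"])    # Eq. (11)
    h2 = dd_neuron(E, params["We10"], params["We21"], params["b"])    # Eq. (12)
    H = np.concatenate([h1, h2])
    return dd_neuron(H, params["Wh10"], params["Wh21"], params["b"])  # Eq. (13)

rng = np.random.default_rng(0)
params = {"Wv10": rng.normal(size=(3, 3)), "Wv21": rng.normal(size=(1, 3)),
          "We10": rng.normal(size=(2, 2)), "We21": rng.normal(size=(1, 2)),
          "Wh10": rng.normal(size=(2, 2)), "Wh21": rng.normal(size=(1, 2)),
          "b": 0.0}
V = np.array([0.1, 0.2, 0.3])    # pinwheel errors (illustrative values)
E = np.array([0.4, 0.5])         # cycloid-wheel errors (illustrative values)
y = ddnn_forward(V, E, params)
```

Training would consist of minimizing the Log-Cosh loss of the output `y` against the measured transmission error with the Adam update of Equation (8), exactly the procedure the flow chart in Figure 5 describes.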

**Figure 5.** Prediction flow chart of the DDNN model.

#### **4. The Model Solution**

*4.1. Preprocessing of Test Data*

This article takes the RV-40E-121 reducer as an example. Five influencing factors related to the transmission error during assembly were selected: the needle tooth center circle radius error δrp, the needle tooth pin radius error δrrp, the needle tooth pin hole circumferential position error δt∑, the equidistant modification error δΔrp, and the shift distance modification error δΔrrp. The design upper and lower limits of the five influencing factors are shown in Table 1; part dimensions are in millimeters.

**Table 1.** Design upper and lower limits of parameter error of main parts (mm).


To prevent large differences in the value ranges of the features from affecting the efficiency of the gradient descent method, the data set needs to be preprocessed before model training. First, data that deviate too much are removed; then the Max–Min normalization method linearly transforms the sample eigenvalues so that the results are mapped into the interval [0, 1]. The scaling function is as follows:

$$x_{\text{new}} = \frac{x - \min(x)}{\max(x) - \min(x)} \tag{14}$$

where *x* is the original data, max(*x*) and min(*x*) are the maximum and minimum values in the data, and *x*new is the standardized data.
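Equation (14) can be sketched directly; the sample values are illustrative, not entries from Table 1.

```python
import numpy as np

def min_max_scale(x):
    """Max-Min normalization of Eq. (14), mapping a feature into [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

# Illustrative feature column (mm); min maps to 0, max maps to 1.
scaled = min_max_scale([0.010, 0.015, 0.020])
```

In practice each of the five error features would be scaled independently, column by column, before the tensor is fed to the DDNN.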

Through the error test platform, the size parameters of the above-mentioned main components were collected as the characteristic inputs of the influencing factors, and the transmission error of the reducer was used as the output index. A data set of 300 samples was constructed. Some sample data are shown in Table 2, where the unit of transmission error is arcmin.


**Table 2.** Some sample data example (mm).

Firstly, 300 groups of samples were randomly shuffled, and then the data set was divided. A total of 90% of the samples were set as the training set, and 10% of the samples were set as the test set.

#### *4.2. Parameter Selection*

The framework used in this experiment is TensorFlow 2.0, implemented with Python 3.6. The hardware environment is: Intel i7 CPU, NVIDIA 2080Ti GPU, CUDA 11.2.1, and cuDNN 8.1.1.33.

The loss function measures the difference between the output value of the model and the true value and evaluates the fitting capability of the model. The advantage of the Log-Cosh loss function is that when the errors between samples are small it converges quickly, and when the errors are large it is not susceptible to outliers. Compared with other loss functions, its curve is smoother and it is twice differentiable, which improves the robustness of the model to a certain extent [27]. The expression of the Log-Cosh function is as follows:

$$\text{Loss}(y_i, \hat{y}_i) = \sum_{i=1}^{m} \log\left(\cosh(\hat{y}_i - y_i)\right) \tag{15}$$

where *yᵢ* is the label of the *i*th training sample, *ŷᵢ* is the predicted value of the *i*th output sample, and *m* is the total number of training samples. The training parameters were: Log-Cosh loss function, batch size 128, 200 training epochs, initial learning rate 0.001, and the Adam optimizer. After iterative training with these settings, the loss function convergence curve of the DDNN model shown in Figure 6 was obtained.
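A sketch of the Log-Cosh loss of Equation (15); `np.logaddexp` is used for numerical stability, since log(cosh(d)) = logaddexp(d, −d) − log 2.

```python
import numpy as np

def log_cosh_loss(y, y_hat):
    """Log-Cosh loss of Eq. (15), summed over samples."""
    d = np.asarray(y_hat, dtype=float) - np.asarray(y, dtype=float)
    # log(cosh(d)) = log((e^d + e^-d) / 2) = logaddexp(d, -d) - log(2)
    return np.sum(np.logaddexp(d, -d) - np.log(2.0))

loss = log_cosh_loss([1.0, 1.2], [1.0, 1.2])   # perfect prediction -> 0
```

For small residuals the loss behaves like d²/2 (fast convergence), while for large residuals it grows only linearly like |d| − log 2, which is what makes it insensitive to outliers.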

When the convergence of the training loss reached the expectation, training was stopped and the model was saved; the weight parameters were then extracted from the saved model. Table 3 shows the weight information of each layer from the last training (values kept to three decimal places); the symbol "—" means no weight value. Finally, the test set was fed into the trained model to obtain the prediction results, which were compared with the labels.

**Figure 6.** Convergence of loss function of reducer transmission accuracy error.



The prediction curve of the DDNN model is shown in Figure 7, where the blue line represents real sample values, the red line represents the predicted values.

**Figure 7.** Prediction result of transmission error sample of RV-40E-121 reducer.

#### *4.3. Predictive Performance Analysis*

To further quantify the prediction accuracy of the DDNN model, the mean square error (MSE) and mean absolute error (MAE) were introduced as performance evaluation indicators, and the effectiveness of the DDNN prediction method was verified by analyzing the model's prediction performance. They are defined as follows:

MSE (mean square error):

$$\text{MSE} = \frac{1}{m} \sum_{i=1}^{m} (y_i - \hat{y}_i)^2 \tag{16}$$

MAE (mean absolute error):

$$\text{MAE} = \frac{1}{m} \sum_{i=1}^{m} \left| y_i - \hat{y}_i \right| \tag{17}$$

where *yᵢ* is the label of the *i*th training sample, *ŷᵢ* is the predicted value of the *i*th output sample, and *m* is the total number of training samples.
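Equations (16) and (17) can be sketched directly; the label and prediction vectors are illustrative values.

```python
import numpy as np

def mse(y, y_hat):
    """Mean square error of Eq. (16)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean((y - y_hat) ** 2)

def mae(y, y_hat):
    """Mean absolute error of Eq. (17)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.mean(np.abs(y - y_hat))

m1 = mse([1.0, 2.0], [1.5, 2.5])   # -> 0.25
m2 = mae([1.0, 2.0], [1.5, 2.5])   # -> 0.5
```

Because MSE squares the residuals it penalizes the occasional large miss more heavily than MAE, so reporting both, as the article does, separates average accuracy from outlier behavior.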

MSE reflects the mean squared prediction error of the model, and MAE describes the average absolute error between the predicted and observed values [28]; smaller values of both indicate better performance. The method proposed in this paper was compared with three commonly used prediction models: the BP neural network, support vector regression with a Gaussian kernel (SVR-R), and the general regression neural network (GRNN) [29]. The initial parameter settings of all models are shown in Table 4.

**Table 4.** The initial parameter values of models.


Table 5 compares the prediction performance and computational efficiency of the four models. From the error results, the prediction accuracy of the DDNN model is better than that of the other three models for transmission error prediction; in terms of computational efficiency, GRNN and SVR-R run faster than the other two models. Considering prediction performance and computational efficiency together, the DDNN model has the highest prediction accuracy and can also meet the time requirements of actual assembly.

**Table 5.** Comparison of the DDNN with other models.


#### **5. Conclusions and Future Work**

In the actual production and assembly of the reducer, the success rate of one-time part selection is low, and repeated disassembly and reassembly lead to the generation and accumulation of errors. Through the analysis of the factors affecting RV reducer quality, five factors with larger sensitivity indices were selected as sample eigenvalues in the data set, with the transmission error as the sample label. The prediction results show that the DDNN model can capture the logical relationships between features and strengthen internal information correlation; it effectively avoids non-convergence of the loss function caused by large parameter fluctuations during updates; and it has better generalization ability, effectively predicting the dynamic transmission error of the reducer. This research has practical guiding significance for the selection of parts and components in RV reducer assembly: it can improve the assembly qualification rate, avoid repeated disassembly and reassembly, and reduce wasted labor and time costs.

However, there is still room for improvement in this study, such as limitations in the selection of quality influencing factors and in the applicability of the method to different examples. In follow-up work, the influence of quality factors on the transmission error of the reducer will be analyzed comprehensively to obtain a more ideal DDNN prediction model.

**Author Contributions:** Conceptualization; Methodology; Project administration, S.J.; Writing; Software and Validation, Y.C.; Review, Y.S.; Supervision; Editing, Y.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by National High-tech R&D Program of China (Grant No.2015AA043002), Natural Science Foundation of Zhejiang Province (Grant No. LQ22E050017), Postdoctoral Science Foundation of China (Grant No. 2021M702894), and Zhejiang Provincial Postdoctoral Science Foundation (Grant No. ZJ2021119).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author. The data are not publicly available due to corporate data privacy.

**Acknowledgments:** This work is supported by Zhejiang Shuanghuan Transmission Machinery Co., Ltd. providing the RV reducer sample and some data.

**Conflicts of Interest:** The authors declared no potential conflict of interest with respect to the research, authorship, and/or publication of this article.
