1. Introduction
Modern neural networks used to diagnose complex dynamic objects, such as aircraft engines [1,2] and industrial medical systems [3,4,5,6], are becoming increasingly complex and adaptive. However, despite high accuracy and efficiency, many algorithms [2,5,6,7] face stability and reliability problems when environmental parameters and the system’s internal structure change. Research into the symmetry of neural networks’ morphology makes it possible to increase their ability to adapt to changes in an object’s characteristics and, as a result, to optimise the diagnostic process. A symmetrical neural network morphology can reduce the need for frequent reconfiguration, minimising the risks of incorrect recognition and anomalies in the system’s operation, thereby increasing the accuracy and reliability of diagnostics.
Due to increasing requirements for the reliability and safety of technical systems [1,3,4,8], research on the influence of neural networks’ morphology symmetry on the diagnostics of dynamic objects is becoming relevant. Incorporating symmetry principles into the neural network architecture [9] can increase its resistance to changes in system parameters and reduce the computational load required to adapt to new conditions. This is essential in actual operating conditions, where the object is exposed to various external factors that affect its behaviour and characteristics.
Research into the application of neural network technologies in the diagnostics of complex dynamic systems is actively developing, especially in areas requiring high accuracy, such as aerospace engineering [10,11], energy [12,13], and industrial automation [14,15,16]. Neural network technologies are widely used for time series analysis [17], predicting anomalies [18] and malfunctions [19], and optimising the control of dynamic objects in operational conditions [20,21]. At the same time, their application emphasises recognising deviations from the norm that can develop [22,23] and predicting potential failures in the system. In particular, recurrent neural networks, such as LSTM (long short-term memory) networks [24,25], have shown the required results in time sequence diagnostics problems due to their ability to store and analyse long-term dependencies in data.
Symmetric neural architectures [26,27], in which the network structure maintains mirror or other symmetry in the distribution of connections and weights, have been shown to increase robustness to noise and input data distortion, improving the network’s ability to extract critical patterns. The use of symmetry also reduces the number of neural network parameters [26,28], which reduces the computational load and makes models less susceptible to overfitting. However, much of the research, including [26,27,28], has focused on the use of symmetry in static problems such as image recognition, and only a limited number of studies (e.g., [29]) have aimed to adapt these methods to dynamic systems.
One of the most promising areas is the use of symmetry to improve the stability and adaptability of neural networks under changing data structures and system parameters. Research [30,31,32] shows that using symmetric neural networks is appropriate when the object is exposed to multicomponent external factors. Traditional diagnostic methods often lose accuracy in such conditions and require regular calibration. Symmetry in the neural networks’ morphology can potentially provide better stability and reduce the networks’ dependence on individual variables. However, the practical implementation of this approach for dynamic systems is still an open task.
However, issues related to determining the optimal symmetry level that achieves a balance between accuracy and stability remain understudied. Most existing approaches rely on empirical data, but no formalised methods exist for determining and quantifying optimal symmetry parameters for dynamic objects. In addition, the symmetry effect on the neural network’s ability to adapt to changing system operating conditions, such as load, temperature, or other external factors, has not yet received sufficient theoretical justification and experimental verification.
An equally important task is to develop methods for integrating symmetric structures into more complex neural network architectures capable of operating in real-time. An essential requirement for most dynamic objects, especially those operating in critical conditions, is the neural networks’ ability to process data and adapt to its changes quickly. However, most research focuses on static symmetry analysis, while dynamic symmetry and its impact on network performance remain poorly understood.
The research aims to develop a model of neural network morphology symmetry and to study its influence on the accuracy, stability, and adaptability of diagnostic systems for complex dynamic objects. The research object is neural networks used for diagnostics and monitoring of the operating state of complex dynamic objects under changing external factors. The research subject is the neural networks’ symmetrical morphology and its influence on their ability to adapt to changing operating conditions, improving the stability and accuracy of diagnostics of complex dynamic objects.
The article consists of an introduction, main sections (“Materials and Methods”, “Case Study”, “Discussion”), conclusions, references, and Appendix A. The introduction substantiates the relevance of research on neural networks’ morphology symmetry to improve the accuracy, stability, and adaptability of diagnostics of complex dynamic objects under changing external factors, which optimises the diagnostic process and reduces the computational load. The “Materials and Methods” section proposes a mathematical model that takes into account the neural networks’ dynamic morphology symmetry for diagnosing complex dynamic objects. The symmetric architecture and adaptation parameters concepts are introduced, the conditions for the weights’ symmetry and their dynamic adaptation are formulated, and optimisation methods taking into account symmetry regularisation are proposed. A theorem on symmetric neural network optimisation is also proved, which ensures the solution’s stability and the minimisation of the loss function with a unique global minimum. The “Case Study” section includes mathematical modelling of the scale’s behaviour under symmetry, a convergence analysis of gradient descent with symmetry, mathematical modelling of the weights’ behaviour under symmetry, an analysis of the regularisation parameter’s influence on symmetry and overall error, and an analysis of the symmetry influence on the loss function and error dynamics. The “Discussion” section presents a generalisation of the research substantiating the advantages of symmetry in neural network architecture. This includes the influence of symmetric regularisation on optimisation stability, the convergence of training algorithms, and the stability of the weight matrix, as well as an analysis of limitations and prospects for further development in solving applied problems. The “Conclusions” present the research results.
Appendix A presents an example of a neural network diagnostic model of a helicopter turboshaft engine based on a five-layer perceptron (3-6-12-6-3 structure) that analyses key engine performance parameters (rotor speeds and gas temperature in front of the compressor turbine) to detect defects and assess the engine condition based on data collected in real flight conditions.
2. Materials and Methods
To research the influence of neural networks’ morphology symmetry, a mathematical model is proposed that considers the neural networks’ dynamic symmetry for the diagnosis of complex dynamic objects, and the concepts of symmetric architecture and adaptation parameters are introduced. Let us consider the neural network as a function f: ℝn → ℝm [33], which maps the input data to the diagnostic outputs. It is assumed that the neural network weights are determined by the matrices W and the biases by the vectors b. For the l-th layer with n neurons, the weights and biases are defined as follows:
The outputs of the l-th layer are defined as a(l) = σ(W(l)·a(l−1) + b(l)), where σ is an activation function, such as ReLU (and its modifications, such as SmoothReLU [34]) or sigmoid, and a(l−1) is the previous layer’s output.
The weights must satisfy a certain symmetric condition for a symmetric neural network. It is assumed that W(l) is a symmetric matrix, that is, W(l) = (W(l))T. Then, each element satisfies Wij(l) = Wji(l), which significantly reduces the number of unique parameters in the weight matrix. To take into account dynamic changes, a symmetry function S is introduced, which changes the weights depending on the system’s state, W(l)(t) = S(W(l), t), where t is time, and the function S dynamically adjusts the weights depending on current conditions. For example, the function S can be defined as S(W(l), t) = a(l)·W(l) + (1 − a(l))·W0(l), where W0(l) is the weights’ initial symmetric state, and a(l) is a function that regulates the contribution of the initial state and the system’s current state. It is assumed that θ(t) = {W(l)(t), b(l)(t)} is the adaptation parameters vector, including the weights and biases that depend on time.
Then, the training problem with dynamic symmetry is formulated as the optimisation of the parameters θ(t) taking into account the minimisation of the loss function L(t) = (1/N)·Σ∥f(xi, θ(t)) − yi∥2, where xi is the input data, yi is the expected output, and N is the number of training examples.
A condition on the weights’ gradients is introduced to optimise the parameters considering symmetry. It is assumed that ∇W(l) is the gradient of the loss function over the weights. To ensure the weights’ symmetry, a symmetry constraint is imposed on this gradient, and the weights’ update is carried out taking this limitation into account, where η is the training rate.
To take into account the dynamic symmetry influence, a regularising term R(W) is added, which minimises the deviation from the symmetric state, giving the resulting loss function Ltotal(t) = L(t) + λ·R(W), where R(W) = Σl∥W(l) − (W(l))T∥2 measures the deviation of the weights from symmetry, λ is the regularisation coefficient that controls symmetry, and Ltotal(t) is the resulting loss function with dynamic symmetry.
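To make this regularisation concrete, the following minimal NumPy sketch computes the symmetry penalty R(W) and the resulting total loss for a single linear layer; the helper names, the squared Frobenius norm, and the stand-in linear model are illustrative assumptions rather than the paper’s exact implementation.

```python
import numpy as np

def symmetry_penalty(W: np.ndarray) -> float:
    # R(W): squared Frobenius norm of the deviation from symmetry (assumed form).
    return float(np.linalg.norm(W - W.T, ord="fro") ** 2)

def total_loss(W: np.ndarray, X: np.ndarray, y: np.ndarray, lam: float) -> float:
    # Main term: mean squared error of a single linear layer f(x) = W x, used here as a
    # stand-in for the full network f(x_i, theta(t)), plus lambda * R(W).
    preds = X @ W.T
    mse = float(np.mean(np.sum((preds - y) ** 2, axis=1)))
    return mse + lam * symmetry_penalty(W)
```

In this sketch, lam plays the role of the regularisation coefficient λ; the penalty is zero exactly when W equals its transpose.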
A function γ(t) is introduced for dynamic symmetry, which controls the symmetry degree as a function of time. The symmetry condition is then modified so that the weights are expressed through their symmetric part weighted by γ(t), where Ws(l) is the weights’ symmetric part, and γ(t) ∈ [0, 1] determines the symmetry level.
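One plausible reading of this modified condition is a convex blend between the symmetric part of the weights and the raw weights, controlled by γ(t). The sketch below follows that reading; the function name and the definition Ws = (W + WT)/2 are assumptions made for illustration.

```python
import numpy as np

def blend_symmetry(W: np.ndarray, gamma: float) -> np.ndarray:
    # gamma in [0, 1]: 1 -> fully symmetric weights, 0 -> weights left unchanged.
    W_s = 0.5 * (W + W.T)          # symmetric part of the weight matrix
    return gamma * W_s + (1.0 - gamma) * W
```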
The final expression for determining the loss function taking into account regularisation follows from combining L(t) with the symmetry regularisation term (Equation (13)).
To solve the optimisation problem of minimising the loss function with regularisation given in Equation (13), it is necessary to minimise it with respect to the parameters θ(t), including the weights W(l) and the biases b(l). To optimise the function Ltotal(t) by W(l), based on [32,35,36], it is advisable to use the gradient descent method. According to this method, the gradients of each term are calculated (the primary term is the neural network error, and the regularisation term is the weights’ symmetry). The central part of the loss function (13) is the mean square error between the neural network prediction f(xi, θ(t)) and the expected result yi.
For simplicity, the error is considered for one training example. Then, this error’s partial derivative with respect to the weights W(l) is obtained by the chain rule; its calculation depends on the activation functions and the neural network architecture. For example, for a simple linear layer with activation a(l) = σ(W(l)·a(l−1) + b(l)), the partial derivative is expressed through the activation derivative σ′ and the previous layer’s output a(l−1).
The regularisation term responsible for the weights’ symmetry has the form R(W) = Σl∥W(l) − (W(l))T∥2, or, in expanded form, the sum of the squared differences between the symmetric elements, Σi,j(Wij(l) − Wji(l))2. For the weights Wij(l), the partial derivative of the regularising term with respect to Wij(l) is equal to 2·(Wij(l) − Wji(l)). Then, the expression for the gradient of the total loss function Ltotal(t) with respect to the weights W(l) takes the form ∇W(l)Ltotal(t) = ∇W(l)L(t) + 2·λ·(W(l) − (W(l))T), obtained after combining the expressions for the partial derivatives.
Using the gradient descent method, the weights W(l) are updated at the t-th step as W(l)(t + 1) = W(l)(t) − η·∇W(l)Ltotal(t). After substituting the expression for the gradient (22), we obtain W(l)(t + 1) = W(l)(t) − η·(∇W(l)L(t) + 2·λ·(W(l)(t) − (W(l)(t))T)). Since the weights’ symmetry is required, the weights are adjusted after each update step by averaging their values with the transposed matrix, W(l) ← (W(l) + (W(l))T)/2.
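A minimal sketch of one such update step, assuming the gradient of the primary loss is supplied by the surrounding training code: a gradient step on the regularised loss followed by symmetrisation through averaging with the transpose.

```python
import numpy as np

def symmetric_update(W, grad_main, lam=1.0, eta=0.01):
    # Total gradient: primary loss gradient plus 2*lambda*(W - W^T) (Eq. (22) form).
    grad_total = grad_main + 2.0 * lam * (W - W.T)
    W_new = W - eta * grad_total            # gradient descent step
    return 0.5 * (W_new + W_new.T)          # enforce symmetry by averaging with the transpose
```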
Considering the neural network weights’ symmetry property, which influences the stability of solutions to the loss function optimisation problem, Theorem 1, “On the symmetric neural network optimisation stability”, is formulated.
Theorem 1. If the weight matrix W is symmetric and positive definite, then minimising the loss function L(W), which has a smooth, convex shape, leads to a unique global minimum.
Proof of Theorem 1. Let W ∈ ℝn×n be a symmetric and positive definite matrix, that is, xT·W·x > 0 for all x ≠ 0. To prove the formulated Theorem 1, we consider the loss function that must be minimised over W, L(W) = f(W) + λ·∥W − WT∥2, where f(W) is a convex function depending on the weights, and the second term is a regularisation that ensures the weights’ symmetry. For a symmetric matrix W = WT, the second term vanishes, since λ·∥W − WT∥2 = 0. Thus, L(W) = f(W) for symmetric W. Since f(W) is convex, it has a unique global minimum; that is, there is a unique point W* such that f(W*) ≤ f(W) for all W.
Next, the loss function L(W) is minimised, including the symmetry regularisation. For this, the derivative of L(W) with respect to W is considered, ∇WL(W) = ∇Wf(W) + 2·λ·(W − WT). Since W is symmetric, W = WT, and the regularisation term 2·λ·(W − WT) vanishes. Therefore, ∇WL(W) = ∇Wf(W). Since W is positive definite, this property holds for f(W), ensuring the stability of the solution. The convexity of f(W) ensures that W* is the only minimum, and the positive definiteness of W confirms that W* is stable and minimal. Since there is a unique minimum of L(W) for a symmetric, positive definite matrix W, minimising this function leads to a stable and unique solution. Thus, it is proven that the symmetry and positive definiteness of the weight matrix in a neural network ensure the stable optimisation of the loss function with a unique global minimum. □
The proof of Theorem 1 relies on the symmetry and positive definiteness of the weight matrix W and the loss function L(W) convexity. The symmetry and positive definiteness of W guarantee the uniqueness and stability of a solution that minimizes the loss function L(W). The positive definiteness of W ensures that the quadratic form xT· W·x > 0 for all x ≠ 0, which confirms the solutions’ stability and minimality. The convexity of f(W) ensures the unique global minimum existence of W* such that f(W*) ≤ f(W) for all W ∈ ℝn×n. The symmetry regularization of L(W), including the term λ·∥W − WT∥2, forces W to be symmetric; for W = WT, this term is zero, and L(W) = f(W). The derivative ∇WL(W) = ∇Wf(W) + 2·λ·(W − WT) also simplifies to ∇WL(W) = ∇Wf(W) for symmetric W, and the convexity of f(W) ensures that ∇Wf(W) = 0 has a unique solution W*, confirming the optimisation’s uniqueness and stability.
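As a purely numerical illustration of Theorem 1 (not part of the proof), the sketch below minimises a convex quadratic f(W) = ∥W − A∥2 with a symmetric target A plus the symmetry regulariser, starting from several random initialisations; all runs converge to the same symmetric point. The quadratic f, the matrix A, and all settings are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)); A = 0.5 * (A + A.T)      # symmetric target, so the minimiser is A

def grad_L(W, lam=1.0):
    # Gradient of f plus the paper's symmetry term 2*lambda*(W - W^T).
    return 2.0 * (W - A) + 2.0 * lam * (W - W.T)

minima = []
for _ in range(3):                                     # several random initialisations
    W = rng.normal(size=(4, 4))
    for _ in range(2000):
        W -= 0.05 * grad_L(W)
    minima.append(W)

# All runs end at (numerically) the same symmetric point, illustrating the unique global minimum.
print(max(np.linalg.norm(m - minima[0]) for m in minima))   # ~0
print(np.linalg.norm(minima[0] - minima[0].T))               # ~0 (symmetric)
```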
Thus, a final optimisation procedure for symmetry-based weights is proposed, consisting of the following steps: calculating the gradient of the primary loss L(t); adding the symmetric regularisation gradient 2·λ·(W(l) − (W(l))T); updating the weights by gradient descent; symmetrising the weights by averaging them with their transpose; and repeating until convergence. The proposed optimisation procedure allows for considering the weights’ dynamic symmetry, minimising the overall loss function, and ensuring the neural network’s stable operation when diagnosing complex dynamic objects. For this aim, several studies were conducted in the research, described in Table 1.
The proposed mathematical model demonstrates an innovative approach to accounting for symmetry in neural networks, which is emphasised by the analysis of the weights’ behaviour when introducing symmetry, the convergence of gradient descent, and the influence of the regularisation parameter. Modelling the weights’ behaviour under symmetry shows that regularisation improves the network’s stability by minimising the discrepancies between the elements of the weight matrix and its transposed version. The gradient descent convergence analysis, taking into account symmetry, reveals that the weights’ symmetric structure contributes to more stable and predictable dynamics of parameter updates, which is confirmed by the proof of the global minimum’s uniqueness. The regularisation parameter λ plays a key role in the balance between prediction accuracy and symmetry. Increasing λ emphasises symmetry preservation, which can reduce the error during generalisation, but excessive values of the parameter can limit the model’s flexibility. The effect of symmetry on the loss function is expressed in a decrease in dynamic errors due to reduced parameter redundancy and a simplified optimisation landscape.
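Putting the above steps together, the following sketch runs the full procedure (regularised loss, gradient step, symmetrisation after each step) on synthetic data; the data, layer shape, and hyperparameters are placeholders and not the paper’s diagnostic model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
W_true = rng.normal(size=(n, n)); W_true = 0.5 * (W_true + W_true.T)   # symmetric "ground truth"
X = rng.normal(size=(200, n))
Y = X @ W_true.T + 0.01 * rng.normal(size=(200, n))                     # noisy targets

W = rng.normal(size=(n, n))                      # asymmetric initialisation
lam, eta = 1.0, 0.01
for step in range(500):
    E = X @ W.T - Y                              # prediction error
    grad_main = 2.0 * E.T @ X / len(X)           # gradient of the mean squared error
    grad = grad_main + 2.0 * lam * (W - W.T)     # add the symmetry regularisation gradient
    W = W - eta * grad                           # gradient descent step
    W = 0.5 * (W + W.T)                          # symmetrise by averaging with the transpose

print(np.linalg.norm(W - W.T))                   # ~0: weights end up symmetric
print(np.mean((X @ W.T - Y) ** 2))               # small residual error
```

Note that after the explicit symmetrisation step the regularisation gradient vanishes at the start of the next iteration; the penalty still matters whenever the gradient step reintroduces asymmetry.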
3. Case Study
3.1. Mathematical Modelling of the Scale’s Behaviour Under Symmetry
A mathematical model has been developed to prove the stability of symmetric regularisation in a neural network; it analyses the behaviour of the weights W under symmetric regularisation and estimates their stability over time. Consider the loss function Ltotal(W), presented in a generalised form in (27), which includes the main error component and symmetric regularisation, in which f(W) is a convex function depending on the weights (the primary loss function), λ > 0 is the regularisation parameter, and the regularising term ∥W − WT∥2 is minimised when W is symmetric, that is, W = WT.
This study shows that symmetric regularisation, with an appropriate choice of the parameter λ, promotes the weights’ robust behaviour, in which small perturbations of W do not lead to significant deviations in the loss function Ltotal value. Using gradient descent to update the weights W, we obtain the update rule (32), where η is the training step, and the gradient is given by ∇WLtotal(W) = ∇Wf(W) + 2·λ·(W − WT).
Stability requires that the weights W(t) converge to the equilibrium value W* while minimising Ltotal, and that small changes in the initial conditions W(0) do not lead to significant deviations of W(t) from W*. This is achieved if the Hessian matrix H of the loss function Ltotal is positive definite. The Hessian of the loss function can be written as the sum H = Hf + 2·λ·I, where Hf is the Hessian of the main loss function f(W), and 2·λ·I is the symmetric regularisation contribution.
For sufficiently large λ, H becomes a positive definite matrix, since 2·λ·I adds positive eigenvalues, which stabilises the weights’ behaviour. Stability requires that all eigenvalues of H be positive. This is ensured by choosing λ such that the smallest eigenvalue of Hf shifted by 2·λ is positive. Thus, if Hf has negative or small positive eigenvalues, adding 2·λ·I with sufficient λ shifts all eigenvalues to the positive region, ensuring stability. We define the energy function for the weights W as E(W) = Ltotal(W). Stability implies that the change in E(W) over time tends to zero as the equilibrium state W* is approached, with dE(W(t))/dt ≤ 0 and dE(W(t))/dt → 0 as W(t) → W*.
Thus, symmetric regularisation causes E(W) to decrease, and the weight system stabilises at W = WT, where the loss function is minimal. Since symmetric regularisation adds positive definiteness to the loss function Ltotal Hessian, it leads to stability in weight training since small perturbations do not cause significant deviations from the minimum point.
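The eigenvalue-shift argument can be checked numerically: the sketch below takes a random symmetric (possibly indefinite) Hf, picks λ just large enough, and verifies that Hf + 2·λ·I becomes positive definite. The random Hf is an illustrative assumption, not a Hessian taken from the diagnostic model.

```python
import numpy as np

rng = np.random.default_rng(2)
H_f = rng.normal(size=(8, 8)); H_f = 0.5 * (H_f + H_f.T)    # symmetric but possibly indefinite

min_eig = np.linalg.eigvalsh(H_f).min()
lam = max(0.0, -min_eig / 2.0) + 1e-3                       # smallest lambda shifting all eigenvalues above 0

H_total = H_f + 2.0 * lam * np.eye(8)                       # Hessian with the regularisation contribution
print(min_eig, np.linalg.eigvalsh(H_total).min())           # negative -> positive after the shift
```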
3.2. Convergence Analysis of Gradient Descent with Symmetry
To analyse the convergence of gradient descent with symmetric regularisation, we consider the full loss function Ltotal(W), which includes the main error component and symmetric regularisation and is presented in Equation (27). The analysis studies the gradient norm ∥∇Ltotal∥ and the weights’ norm ∥W(l)∥ at each step. This allows us to determine how symmetric regularisation affects the convergence speed and stability. Using gradient descent, the weight update at the t-th step is carried out according to (32), while the full gradient of the loss function Ltotal, considering symmetric regularisation, is determined according to (33). Thus, the iterative weight update rule takes the form W(t + 1) = W(t) − η·(∇Wf(W(t)) + 2·λ·(W(t) − (W(t))T)).
For convergence, it is required that Ltotal(W(t)) decreases as the number of iterations t increases. For this, it is assumed that W* is the optimal value of the weights that minimises Ltotal(W), and the change in the loss function at each step is considered. To prove the convergence, we assume that Ltotal(W) is convex and that ∇WLtotal(W) is Lipschitz continuous with constant L, that is, ∥∇WLtotal(W1) − ∇WLtotal(W2)∥ ≤ L·∥W1 − W2∥. Then, for a convex function with symmetric regularisation, the gradient descent convergence will be ensured if the training step η is chosen sufficiently small relative to L + 4·λ, where the term 4·λ is related to the symmetric regularisation. This condition allows the control of the step size and thus promotes stable convergence.
To estimate the change in the weights’ norm ∥W(t)∥, the change in the norm is considered taking the symmetric regularisation into account. At each step, the gradient value is substituted into the expression for the weights’ norm, which gives Equation (42).
Equation (42) shows that symmetric regularisation adds a term 2·λ·(W(t) − (W(t))T)2, which minimises the weights’ asymmetry, gradually bringing W closer to the symmetric state. This regularisation smoothes out the changes in the weights’ norm, which prevents sharp fluctuations and promotes stable convergence.
To prove stability, the Lyapunov method is used. Let V(W) = ∥W − W*∥ be the Lyapunov function, where W* is the minimum point. The change in V at each step, V(W(t + 1)) − V(W(t)), is then considered. Using the gradient descent weight update formula and substituting (32), then expanding the norm square, expanding the brackets and reducing ∥W − W*∥2, and taking into account that for a small step η the regularisation term 2·λ·(W(t) − (W(t))T) smooths out the asymmetry, W(t) is brought closer to symmetry (Equation (45)).
For V(W(t + 1)) − V(W(t)) ≤ 0, it is required that the second term does not exceed the first. This change will be negative if the training step η satisfies the abovementioned conditions and the symmetry regularisation λ stabilises the trajectory W, minimising V(W). Thus, V(W) decreases at each step, proving the algorithms’ convergence.
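A small numerical check of the Lyapunov argument, under the assumption of a convex quadratic main loss f(W) = ∥W − A∥2 with symmetric A (so that W* = A): V(W) = ∥W − W*∥ is tracked across gradient steps and stays non-increasing for a small step η. All constants below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(5, 5)); A = 0.5 * (A + A.T)            # W* = A for f(W) = ||W - A||_F^2
lam, eta = 0.5, 0.05

W = rng.normal(size=(5, 5))
V_prev = np.linalg.norm(W - A)
monotone = True
for _ in range(1000):
    grad = 2.0 * (W - A) + 2.0 * lam * (W - W.T)            # total gradient with the symmetry term
    W = W - eta * grad
    V = np.linalg.norm(W - A)                               # Lyapunov function V(W) = ||W - W*||
    monotone &= V <= V_prev + 1e-12
    V_prev = V

print(monotone, V_prev)                                     # True, and V is close to 0
```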
3.3. Mathematical Modelling of the Weights’ Behaviour Under Symmetry
To construct a mathematical model of the weights’ evolution W under symmetric regularisation, the weight change dynamics are described as a system of differential equations that considers the primary gradient of the loss function and the regularising symmetric term. To do this, we consider the full loss function, which includes the main error component and symmetric regularisation, presented in Equation (27), where ∥W − WT∥2 is the symmetric regularisation minimised at W = WT. This study aims to construct the evolution model of the weights W(t) taking regularisation into account, to understand how they change over time depending on the initial conditions and the training step η.
To derive the differential equation for the weights, it is assumed that the weights’ evolution is described by continuous dynamics, where a first-order differential equation determines the changes in the weights W(t) over time t, dW(t)/dt = −η·∇WLtotal(W). Substituting the total gradient obtained from Equation (27) gives the right-hand side of (46) as dW(t)/dt = −η·(∇Wf(W) + 2·λ·(W − WT)). Equation (47) describes the change in the weights W(t) under the action of the main gradient of the loss function f(W) and symmetric regularisation. Equation (47) is split into two components to analyse the weights’ behaviour under symmetry: the primary gradient and regularisation contributions. The result is the determination of the dynamics for the weights’ symmetric and antisymmetric parts, representing W as the sum of the symmetric Ws and antisymmetric Wa parts, that is, W = Ws + Wa, where Ws = (W + WT)/2 is the symmetric part, and Wa = (W − WT)/2 is the antisymmetric part.
For the symmetric part, regularisation has no effect, since Ws = WsT. The dynamics for Ws are then described only by the main gradient of the loss function (Equation (49)). The regularisation tends to reduce the antisymmetric part to zero, i.e., Wa → 0, and the dynamics for Wa include an additional damping term proportional to λ (Equation (50)). Equation (50) shows that the antisymmetric part Wa will exponentially decrease at a rate dependent on the regularisation parameter λ: the larger the value of λ, the faster Wa tends to zero, which leads to the symmetrisation of the matrix W over time. If the main loss function f(W) does not have a significant effect on Wa, then the equation approximately reduces to a linear decay equation for Wa (Equation (51)), whose solution is an exponential decay, Wa(t) = Wa(0)·e−4·η·λ·t (Equation (52)), where Wa(0) is the initial value of the antisymmetric part.
Solution (52) shows that the antisymmetric part Wa(t) exponentially tends to zero, which confirms the matrix W symmetrisation in the regularisation presence.
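To visualise this decomposition, the sketch below integrates the gradient-flow equation with a simple Euler scheme and tracks the norm of the antisymmetric part, which decays roughly exponentially; the quadratic main loss, step size, and decay-rate comment reflect this toy setup rather than the paper’s model.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(4, 4)); A = 0.5 * (A + A.T)      # symmetric target of the toy main loss
W = rng.normal(size=(4, 4))
lam, dt = 2.0, 0.01

for step in range(300):
    grad = 2.0 * (W - A) + 2.0 * lam * (W - W.T)      # right-hand side of the gradient flow
    W = W - dt * grad                                  # explicit Euler step
    if step % 100 == 0:
        W_a = 0.5 * (W - W.T)                          # antisymmetric part
        print(step, np.linalg.norm(W_a))               # decays exponentially (rate ~ 2 + 4*lam here)
```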
3.4. Analysis of the Regularisation Parameter λ Influence on Symmetry and Overall Error
To assess the influence of the regularisation coefficient λ on the weights’ W symmetry and the final error Ltotal, the analysed parameter is the loss function, presented in the form of (27), where f(W) is the primary loss function (e.g., the mean square error (14)), and λ·∥W – WT∥2 is the regularisation term that controls the weight matrix W symmetry degree. The regularisation coefficient λ determines the regularisation weight: small values of λ have a minimal effect on symmetry, while large values can exaggerate symmetrisation, potentially worsening the models’ accuracy. To analyse the weights’ behaviour with a change in λ, the loss function gradient for the weights W is determined according to (33). The calculated gradient ∇WLtotal(W) is interpreted as follows:
The first component ∇Wf(W) is aimed at minimising the primary loss function, affecting the models’ accuracy.
The second component 2·λ·(W − WT) is the regularisation gradient, proportional to the difference between W and its transpose. The regularisation gradient tends to make W symmetric.
Symmetry regularisation affects the final error and the models’ accuracy as follows:
For small values of λ, the regularisation gradient 2·λ·(W − WT) has a small weight, and the weights’ symmetry has a minimal effect on the loss function. Only the weights that minimise f(W) have a major influence on training.
For large values of λ, the regularisation gradient is amplified, forcing the weights to be symmetric, which can lead to the models’ degradation due to a narrowing of the possible values of W.
To quantify symmetry, a deviation-from-symmetry metric is introduced as the norm ∥W − WT∥. The total loss function, taking into account the model error and symmetry, then takes the form of Equation (54).
Thus, to analyse the influence of λ, changes in Ltotal(W, λ) and the norm ∥W − WT∥ are investigated for different values of λ. With increasing λ, the following is observed:
If λ is too large, W will be “driven” towards symmetric values, which can reduce accuracy because the weights will be less flexible to optimise the underlying loss function f(W).
If λ is too small, symmetry will not emerge, and the weight matrix will be dominated by model error, resulting in a suboptimal weight structure.
In this case, the change in the gradient norm as a function of λ follows from the gradient expression (33).
To experimentally confirm the obtained theoretical results using the helicopter turboshaft engines’ (TE) neural network diagnostic model [38,39] presented in Appendix A as an example, the following were obtained: a diagram of the symmetry measure ∥W − WT∥ depending on λ (Figure 1) to show how an increase in λ leads to an increase in symmetry; a diagram of the final error Ltotal depending on λ (Figure 2) to assess the dependence of the model accuracy on the symmetrisation strength; and training curves for different λ (Figure 3) to observe the convergence rate and the difference in the final error for various parameter values.
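The λ sweep behind Figures 1–3 can be mimicked on synthetic data with the following sketch, which trains with the symmetric-regularised update for several λ values and records the symmetry measure ∥W − WT∥ and the final error; it is a stand-in for, not a reproduction of, the helicopter TE model from Appendix A.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5
W_true = rng.normal(size=(n, n))                       # deliberately not symmetric
X = rng.normal(size=(300, n))
Y = X @ W_true.T + 0.05 * rng.normal(size=(300, n))

for lam in (0.1, 0.5, 1.0, 1.5):
    W = np.zeros((n, n))
    for _ in range(400):
        E = X @ W.T - Y
        grad = 2.0 * E.T @ X / len(X) + 2.0 * lam * (W - W.T)
        W -= 0.02 * grad
    mse = float(np.mean((X @ W.T - Y) ** 2))
    asym = float(np.linalg.norm(W - W.T))
    print(f"lambda={lam:.1f}  error={mse:.4f}  ||W-W^T||={asym:.4f}")
```

No explicit averaging step is applied here, so the effect of λ alone on the residual asymmetry and on the error is visible across the sweep.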
According to Figure 1, as λ increases, the symmetry measure decreases, indicating that the weights tend towards a more symmetrical configuration of the neural network. This behaviour of the symmetry measure highlights the possibility of ensuring symmetry with strong regularisation (λ ≥ 1), while smaller values of λ allow for more significant deviation from symmetry. Small fluctuations may represent small changes in weight adjustment due to other factors in the training process.
According to Figure 2, as λ increases, Ltotal decreases, indicating that the performance improves due to the regularisation effect. However, after the point λ ≈ 1, further increasing λ causes Ltotal to increase, indicating that over-regularisation may lead to underfitting and reduced model accuracy.
According to Figure 3, at λ = 0.1, the training curve shows a relatively high initial loss (≈0.5) and slow convergence, indicating insufficient regularisation. At λ = 0.1, the neural network diagnostic model takes longer to reach a stable minimum, which may reflect slight overfitting. At λ = 0.5, with a moderate value, the model achieves better convergence, reaching a lower overall loss (the maximum loss does not reach 0.4%) more quickly. It suggests a better balance where regularisation helps the model generalise without significantly limiting the training flexibility. The value of λ = 1.0 is optimal. At λ = 1.0, the training curve shows the most desirable training behaviour with fast convergence to a low final loss (the loss is almost eliminated). At λ = 1.0, the balance between regularisation and flexibility gives the best results. At a high regularisation value (λ = 1.5), the training curve converges to a higher final loss (the loss increases by 2.0 times compared to the results obtained with λ = 1.0), indicating underfitting. At λ = 1.5, the neural network diagnostic model is over-constrained, limiting its ability to reduce the error further.
3.5. The Symmetry Influence on the Loss Function and Error Dynamics
To analyse the influence of weight symmetry on the landscape of the loss function Ltotal(W), a symmetric regularisation is introduced according to (27), where f(W) is the primary loss function, and λ·∥W − WT∥2 is the symmetric regularisation. The analysis of the loss function as a function of the weights consists of researching the influence of symmetry on Ltotal(W) and its change along different directions in the weights’ space W:
Symmetric direction, in which the weight matrix W changes in the symmetric matrices space (where W = WT);
Asymmetric direction, in which the weight matrix W has a component different from WT.
Similar to previous studies, the weights W are divided into a symmetric part Ws and an antisymmetric part Wa according to (48). Then, the loss function is expressed as a function Ltotal(Ws, Wa) of both parts.
The analysis of the loss function over the weights consists of studying its behaviour along symmetric and asymmetric directions:
1. In the symmetric direction, since W = Ws and Wa = 0, the loss function takes the form Ltotal(Ws, 0) = f(Ws), in which the regularisation term disappears, and the loss function behaviour is determined only by the underlying function f(Ws). If f(W) is convex in Ws, then symmetrisation allows one to avoid local minima and focus on the global minimum.
2. For an asymmetric direction W = Wa, the loss function contains a regularisation term, in which λ·∥Wa∥2 creates an additional contribution that prevents Wa from deviating too much from zero. The regularisation contribution tends to minimise the antisymmetric part, facilitating stable optimisation.
To analyse the symmetry influence on the optimisation dynamics and the loss function landscape, the gradients and the curvature of Ltotal are analysed through the Hessian. The gradient of the loss function is decomposed into gradients over the symmetric and antisymmetric parts according to Equation (33), which shows that symmetric regularisation adds a gradient aimed at reducing the antisymmetric components, which avoids “drift” in antisymmetric directions and thus promotes smooth optimisation. To study the curvature of Ltotal(W), the Hessian is calculated as the sum of the Hessian of the main loss and the regularisation contribution 2·λ·IWa, where IWa is the indicator matrix for the antisymmetric component. The Hessian’s second part, 2·λ·IWa, is positive definite along the antisymmetric directions, which enhances the convexity of the loss function along these directions, making unwanted extremes less likely and reducing the probability of becoming stuck in local minima.
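The curvature claim can be checked on a small vectorised toy problem: for f(W) = ∥W − A∥2, the Hessian of Ltotal is 2·I plus the regularisation contribution, and its eigenvalues are larger exactly along antisymmetric directions. The commutation-matrix construction below is an illustrative assumption, and its constants follow the Frobenius-norm regulariser, differing from the 2·λ·IWa shorthand only by a fixed factor.

```python
import numpy as np

n, lam = 3, 1.0
N = n * n

# Commutation (transpose-permutation) matrix T: T @ vec(W) = vec(W^T).
T = np.zeros((N, N))
for i in range(n):
    for j in range(n):
        T[i * n + j, j * n + i] = 1.0

H_f = 2.0 * np.eye(N)                      # Hessian of the toy main loss f(W) = ||W - A||_F^2
H_reg = 4.0 * lam * (np.eye(N) - T)        # Hessian of lam * ||W - W^T||_F^2 in vectorised form
H_total = H_f + H_reg

eigvals, eigvecs = np.linalg.eigh(H_total)
for val, vec in zip(eigvals, eigvecs.T):
    direction = "symmetric" if np.allclose(T @ vec, vec) else "antisymmetric"
    print(f"{val:6.2f}  {direction}")      # curvature is larger along antisymmetric directions
```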
To experimentally confirm the obtained theoretical results using the helicopter TE neural network diagnostic model (Appendix A) example, the following were obtained: a diagram of the loss function Ltotal along symmetric and asymmetric directions (Figure 4), which allows one to see the regularisation influence on the loss function’s stability; a loss function gradients map (Figure 5), which shows how the gradients direct the weights to the global minimum, avoiding unwanted antisymmetric components; and a Hessian eigenvalues spectrum (Figure 6), which allows one to analyse the loss function’s curvature and the regularisation influence.
For the symmetric direction (the “blue curve” in Figure 4), the loss function diagram is displayed as a smoothed curve that reaches a minimum near W = 0 with a minimum loss function value of approximately Ltotal ≈ 0.9. It is noted that along the symmetric direction there is a relatively small oscillation of the Ltotal value, which indicates a more stable and predictable evolution of the loss function in the weights’ symmetric directions. The loss function along the asymmetric direction (the “red curve” in Figure 4) is characterised by significant oscillations reflected in the form of additional local extrema. The minimum value of this curve is also near W = 0, but the overall profile is wavier, and the function reaches values up to Ltotal ≈ 1.3 and higher. The obtained results indicate a tendency of asymmetric directions to form additional local minima and saddle points, which can complicate optimisation and lead to model instability.
According to Figure 5, near the coordinates x = 0 and y = 0, the gradients take minimal values. Their length is noticeably reduced, indicating the possible global minimum zone of the loss function Ltotal. In this region, the gradient values along both axes are approximately ∇xLtotal ≈ 0.1 and ∇yLtotal ≈ 0.1, which indicates proximity to the state with minimal error, where training slows down. In the zones with high gradients, for values x ≈ ±2 and y ≈ ±2, the gradients increase to values ∇xLtotal ≈ 4 and ∇yLtotal ≈ 4. These vectors represent key directions for updating the weights, in which the loss function increases steeply. Such a sharp increase in the gradient indicates a “steep descent” of the loss function, accelerating the training process while the weights are significantly far from the minimum. The map is symmetrical with respect to the axes x = 0 and y = 0. It indicates that the loss function is symmetrical with respect to the weight parameters, and the presence of symmetric regularisation makes it easier for the neural network to find the optimal direction. This symmetry suggests that the weights will tend to a symmetrical minimum at sufficiently large gradient values, minimising Ltotal faster. When the neural network’s weights are far from the minimum in the early training stages, the presence of significant gradients (with magnitudes up to 4) accelerates training, making the network capable of finding the optimum faster. When approaching the minimum point (0, 0), small gradient values help to avoid overtraining and oscillations, maintaining a stable, smooth approach to the loss function’s minimum.
According to Figure 6, the eigenvalues are distributed around the mean μ ≈ 0 with a normal distribution and standard deviation σ = 1. Most eigenvalues are concentrated between −2 and 2, indicating weak curvature dominance in the corresponding directions. Values outside this range indicate possible directions with high or low curvature, affecting the local properties of the loss function Ltotal landscape.
4. Discussion
To study the influence of neural networks’ morphology symmetry, a mathematical model was developed that considers dynamic symmetry for diagnosing complex dynamic objects. The symmetric architecture concept and adaptive parameters were proposed. The neural network is represented by a function f: ℝn → ℝm connecting the input data with the diagnostic outputs, where the weights W(l) and biases b(l) of the l-th layer are specified by the matrix (1). For symmetric networks, the weights satisfy condition (3), which reduces the number of unique parameters. A dynamic symmetry function S is introduced, which changes the weights depending on the system’s state (4). Symmetric training is achieved by minimising the loss function with regularisation that takes into account the deviation from the symmetric state (11), where R(W) is represented by (18). The weights are updated considering the gradient (22), which includes symmetry, and the resulting weights are averaged with the transposed matrix. Based on the obtained results, Theorem 1, “On the symmetric neural network optimisation stability”, is formulated and proven, stating that the symmetry and positive definiteness of the weight matrix in a neural network ensure the stable optimisation of the loss function with a single global minimum.
Symmetry regularisation in neural networks adds computational cost to the training process due to the need to control the symmetry of the weight matrices. The main cost is associated with calculating the regularisation term R(W), which includes the norm of the difference between the weight matrix and its transpose, and with calculating the corresponding gradients. These operations require additional matrix operations at each optimisation step, including transposition, addition, and subtraction, which increase the complexity proportionally to the weight matrix size. In addition, the symmetrisation step, where the weights are adjusted by averaging with their transpose, requires further matrix operations. Thus, the regularisation cost increases linearly with the number of layers and quadratically with the number of neurons in a layer. However, these additional costs can be justified by the regularisation benefits, such as reducing the number of parameters, improving the optimisation convergence, ensuring the solution’s stability, and preventing overfitting by introducing structural constraints on the network parameters.
The block diagram (Figure 7) shows the steps of the mathematical model for optimisation and weight symmetrisation.
A mathematical model analyses the behaviour of the weights W in a neural network with symmetric regularisation to prove the stability of its operation. According to (27), the loss Ltotal(W) includes the main error and the regularising term ∥W − WT∥2, which is minimised when W = WT. It is shown that an appropriate choice of the parameter λ > 0 contributes to the weights’ stability, as it is a value at which small perturbations of W do not lead to significant changes in the loss function. The use of gradient descent allows the weights to be updated according to rule (32), where the gradient contains the symmetric regularisation contribution 2·λ·(W − WT). Stability is achieved in the case of a positive definite Hessian matrix (34), obtained by choosing a λ that satisfies condition (35). This shifts all eigenvalues to the positive region, ensuring the system’s stability. The introduced energy function E(W) (36) decreases with time, tending to zero when the weights reach the equilibrium state W*, where the losses are minimal. Thus, symmetric regularisation is proven to provide stability, reducing sensitivity to minor disturbances, and allows the weights to be stabilised during the neural network’s training.
The total loss function Ltotal(W) (27), including the main error and the regularising term, is investigated to analyse the convergence of gradient descent with symmetric regularisation. The weights are updated at each step according to rule (37), in which regularisation adds a stabilising effect. The convergence of the neural network’s training is ensured by the decrease in the gradient norm ∥∇WLtotal(W)∥ with an increasing number of iterations. In this case, the training step η is chosen to satisfy condition (40), in which L is the Lipschitz constant. Symmetric regularisation minimises the asymmetry of the weights W by adding the term 2·λ·(W − WT), which smooths out changes in the weight norm, prevents sharp fluctuations, and promotes stable convergence. Stability is proven by the Lyapunov method with the function V(W) = ∥W − W*∥, where W* is the optimal weight. The change in V(W), according to (45), is negative at each step if the conditions on the training step and the regularisation parameter are met, which guarantees a decrease in V(W) and proves the algorithm’s convergence.
To construct a mathematical model of the evolution of the weights W under the action of symmetric regularisation, the weight change dynamics are described by a system of differential equations that considers the primary gradient of the loss function and the regularising symmetric term. The loss function Ltotal(W) is introduced and presented in Equation (27), including the main error and the regularisation ∥W − WT∥2, minimised at W = WT. The evolution of the weights W(t) is described by Equation (47). The matrix W (48) is decomposed into symmetric Ws and antisymmetric Wa parts to analyse the behaviour of the weights W. It is determined that the dynamics of the symmetric part are determined only by the primary gradient of the loss function (49), while the antisymmetric part tends to zero under the regularisation action according to (51), whose solution is determined by Equation (52). The obtained solution shows that the antisymmetric part Wa(t) exponentially tends to zero at a rate that increases with λ, which ensures the symmetrisation of the matrix W in time.
The influence of the regularisation coefficient λ on the symmetry of the weights W and the final error Ltotal (27) is analysed. The loss function includes the main error f(W) and the regularising term λ·∥W − WT∥2, which controls the degree of symmetry of the weight matrix. For small values of λ, symmetry has a minimal effect on the loss function, and the weights are adjusted to minimise f(W). For large values of λ, regularisation dominates, forcing the symmetry of W, which reduces the accuracy of the diagnostic model. Symmetry is estimated by the deviation metric ∥W − WT∥, and the loss function takes the form of Equation (54). It has been experimentally proven (see Figure 1) that with an increase in λ, the metric ∥W − WT∥ decreases, which increases symmetry, but excessive regularisation (λ > 1) leads to underfitting and a decrease in the diagnostic model’s accuracy. Experimentally, for the helicopter TE diagnostic model, it was found (see Figure 3) that λ = 1.0 provides the optimal balance between symmetry and flexibility, minimising the error. At λ = 0.1, slow convergence is observed. At λ = 0.5, a rapid decrease in error is achieved. At λ = 1.5, the model is over-constrained, which increases the error by a factor of two.
The influence of symmetry on the loss function and error dynamics is analysed by introducing symmetric regularisation, which adds a regularisation term λ·∥W − WT∥2 to the loss function Ltotal(W) (27), minimising the antisymmetric components of the weight matrix W. The loss function was studied in the symmetric direction (where W = WT) and the antisymmetric direction (W ≠ WT). It is shown that symmetric regularisation contributes to the stabilisation of optimisation by adding a gradient that reduces the antisymmetric components and improves the convexity of the loss function through the second derivative (Hessian). The analysis showed that along the symmetric direction, Ltotal(Ws, 0) has a more stable behaviour with fewer local minima (Figure 4, blue curve), while along the antisymmetric direction, Ltotal(0, Wa), additional extrema arise, complicating the optimisation (Figure 4, red curve). It is experimentally confirmed that symmetric regularisation reduces the oscillations of the loss function and directs the weights to the global minimum, thereby ensuring the diagnostic model’s stability. The gradients along the symmetric direction show a uniform approach to the minimum, while sharp changes are observed in the antisymmetric regions, as shown in the gradient map (Figure 5). The Hessian eigenvalues spectrum (Figure 6) indicates a weak curvature along most directions, confirming the advantage of symmetric regularisation in reducing the probability of becoming stuck in local minima.
The limitations of the research are related to the assumptions used in developing the mathematical models. The developed model examines the neural networks’ symmetric architecture as a critical factor for optimisation stability; however, in real systems, additional parameters affecting stability are possible, such as input data noise, nonlinear dependencies, and external disturbances, which were not considered in the current analysis. In this case, the loss function with symmetric regularisation is minimised using gradient descent, which involves selecting the optimal training step and the regularisation parameter λ and does not consider the dynamic changes that influence the neural network training process. The theoretical analysis results, such as Theorem 1, “On the symmetric neural network optimisation stability”, and the behaviour of the Hessians, are limited to the cases where the weight matrix is positive definite; they may not apply to neural networks with arbitrary parameters.
The limitations of the conducted research are related to the simplified assumptions used, such as the ideal positive definiteness of the weights and the absence of a significant influence of input data noise. To overcome these limitations, future research is planned to develop adaptive training methods that take into account dynamic changes in network parameters and to simulate the influence of real operating conditions, including noise and nonlinearities.
Prospects for further research are related to eliminating the identified limitations and expanding the scope of application of the developed mathematical model. Additional studies will analyse the influence of input data noise and nonlinear dependencies on optimisation stability, including modelling the actual operating conditions of neural networks. Another promising direction for further research is the development of adaptive training methods that consider dynamic changes in the neural networks’ parameters during the training process (using methods with a variable training step or introducing self-regulation mechanisms for the regularisation parameter λ). Further studies will expand the theoretical foundations of neural network models with arbitrary parameters, including the case where the weight matrix is not positive definite. This will include the development of new types of regularisation that ensure optimisation stability even when positive definiteness conditions are violated. To confirm the effectiveness of the proposed approach, experimental studies will be conducted on real data, including complex dynamic objects with asynchronous processes, which will allow for assessing the models’ applicability in applied diagnostics and prediction problems.
Future research should also consider using different benchmark datasets to test the proposed model, which will allow us to evaluate its generalizability and applicability in different areas. In addition, it is important to investigate the influence of different types of noise and dynamic factors to determine the robustness of the model under real operating conditions, as well as to validate its effectiveness in more diverse examples.
5. Conclusions
A mathematical model of a neural network with dynamic symmetry has been developed, ensuring stable optimisation and reducing sensitivity to minor disturbances. The concept of a symmetric architecture and regularisation with the introduction of a dynamic symmetry function S helps to reduce the number of unique parameters and simplifies the training process. Symmetric regularisation minimises the deviation from the symmetric state, which is theoretically proven within the framework of Theorem 1, “On the symmetric neural network optimisation stability”, guaranteeing a single global minimum of the loss function under the conditions of the weight matrix’s positive definiteness.
The analysis of weight W dynamics under symmetric regularisation confirmed that the proposed model ensures an exponential tendency of the antisymmetric components to zero, which stabilises the training process. The Lyapunov method used to prove convergence demonstrated a decrease in the loss function Ltotal(W) and the weights deviation from the equilibrium state at each gradient descent step. The results confirm that symmetric regularisation smooths out weight changes and prevents sharp fluctuations, thereby ensuring the stable convergence of neural network training under conditions with minor disturbances.
It is established that the dynamic symmetry function S(W(l), t) is a mechanism that allows for taking into account changes in the system state and adapting the neural network weights in real time while maintaining their symmetric properties. This function regulates the balance between the initial symmetric state of the weights W0(l) and their current value W(l), and also takes into account external conditions changing over time t. Formally, the function is defined as S(W(l), t) = a(l)·W(l) + (1 − a(l))·W0(l), where a(l) is a parameter that depends on the system’s current state and regulates the contribution of the initial symmetric state W0(l). Such a model allows for dynamic control of the weights’ symmetry degree, ensuring a balance between stability and adaptability. The implementation of this approach requires updating the weights W(l) taking into account the change in the symmetry function S(W(l), t) at each training step, as well as imposing additional constraints on the symmetry during the update. This is achieved by introducing a regularising term into the loss function that minimises the deviation from symmetry and by applying a correction to the weights through their symmetrisation after each update step.
The experiments using the helicopter turboshaft engines’ neural network diagnostic model showed that the regularisation coefficient λ = 1.0 provides an optimal balance between the weights’ symmetry and the model’s accuracy. At small values of λ (for example, λ = 0.1), slow convergence is observed, and with excessive regularisation (λ > 1.0), the diagnostic model’s error increases due to underfitting. The decrease in the antisymmetric components with increasing λ was monitored by introducing the symmetry metric ∥W − WT∥, which ensured the stability of the neural network weight optimisation process.
It has been experimentally proven that symmetric regularisation reduces loss function oscillations and improves convexity along symmetric directions, reducing the probability of becoming stuck in local minima and accelerating the global optimum achievement. Experimental data and the Hessian eigenvalue spectrum analysis confirmed that the symmetric neural network architecture increases the training algorithm’s stability and efficiency when working with dynamic objects.