*2.5. Artificial Neural Networks*

Tasks involving the prediction or identification of parameters belong to the category of so-called inverse problems [12]. Most often, they consist of determining unknown parameters on the basis of values measured experimentally or obtained from numerical simulations. When knowledge about the observed phenomenon, or the degree of its complexity, does not allow an expert system to be built, the most frequently used tools are ANNs.

ANNs are widely used in many areas and tasks. The underlying assumption is that ANNs are able to learn an unknown relationship between input and output data. This typically requires large amounts of data acquired in computational or experimental investigations. In recent years, much attention has been paid to so-called deep learning (DL) [20]; one of the major benefits of deep models is that they can learn successfully from fewer training data. In addition, deep learning approaches have proven suitable for big data analysis, and hierarchical learning systems show superior performance in several engineering applications [21]. In the approach described in this article, two applications are demonstrated: a DL multi-layer perceptron (MLP) was used for signal compression, and DL regression ANNs were trained to predict the axial forces in the screws of the investigated connections.

The learning process consisted of minimizing the error computed between the target values and the network outputs over successive iterations. Testing and validation were carried out on data that the network had never seen before. The ability to produce accurate predictions for data outside the training set is called network generalization [10]. In this article, the mean squared error (MSE) and the standard deviation of the errors were used as measures of the error obtained at the network output.
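The two error measures can be sketched as follows. This is a minimal illustration in plain Python; the function and variable names are chosen here and are not taken from the original work.

```python
def mse(targets, outputs):
    """Mean squared error between target values and network outputs."""
    n = len(targets)
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n

def error_std(targets, outputs):
    """Standard deviation of the output errors (population form assumed)."""
    errors = [t - o for t, o in zip(targets, outputs)]
    mean_err = sum(errors) / len(errors)
    return (sum((e - mean_err) ** 2 for e in errors) / len(errors)) ** 0.5
```

A perfect prediction gives an MSE of zero, while a constant offset between targets and outputs yields a non-zero MSE but zero standard deviation of the errors.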

Force identification provides a predicted value of the force on the basis of parameters that are sufficiently sensitive to its changes. The correct selection of these parameters is the most important issue in any identification task. The accuracy of the neural predictor can then be improved by tuning the architecture or applying different training strategies. For the aforementioned task, feed-forward ANNs are commonly used [10]. They consist of an input (first) layer, (usually) a few hidden layers, and an output layer. The number of neurons in the input and output layers is determined by the dimensionality of the training data.
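The structure above can be sketched as a forward pass through such a network. The layer sizes, random weights, and tanh activation below are illustrative assumptions, not the architecture used in the study.

```python
import numpy as np

def forward(x, weights, biases):
    """Forward pass through a feed-forward ANN: tanh hidden layers,
    linear output layer (typical for regression)."""
    a = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = W @ a + b
        a = np.tanh(z) if i < len(weights) - 1 else z
    return a

# Illustrative layer sizes: 8 inputs, two hidden layers, 1 output.
rng = np.random.default_rng(0)
sizes = [8, 20, 20, 1]
weights = [0.1 * rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
y = forward(rng.standard_normal(8), weights, biases)  # single predicted value
```

The input layer width matches the number of measured parameters, and the single output neuron matches the one quantity being predicted, which is how the data dimensionality fixes the first and last layers.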

As an alternative to the designated wave parameters (PCA, time and amplitude (TA)), an MLP was trained to reproduce its inputs in the output layer. This kind of network is called an autoencoder. The network is trained so that the signal fed at the input is reproduced at the output; hence, the input and output layers have the same dimensions. In the middle of the network, there is a hidden layer, called the bottleneck layer, which has fewer neurons than the input layer. This auto-associative neural network is used to compress signal data. After the learning process, the layers after the bottleneck are discarded, and only the input layer, the preceding hidden layers, and the bottleneck layer are retained. This part of the network, called the encoder, is used to reduce the dimension of the measured signals. It can be assumed that the parameters obtained from the encoder represent all the information contained in the signals; this approach thus compresses the input data and reduces its dimensionality. In other words, the encoder extracts the most important components of the signal and ignores the less important parts. An encoder with only one hidden layer (the bottleneck) can be compared to computing the PCA of the input data, while additional hidden layers introduce a non-linear transformation of the signal [22], so the encoder effectively performs PCA on a non-linear version of the signals.

In this work, various encoder architectures were trained: *inl* − 850 − 250 − *bl*, where *inl* = {8192, 4096, 3072} is the number of data points taken from the measured signal (starting from its beginning) and *bl* = {24, 12, 6, 4} is the bottleneck size. It was found that the most suitable DNN architecture for the compression task was 4096 − 850 − 250 − 12 − 250 − 850 − 4096, and for the regression task 13 − 50 − 50 − 50 − 1.
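The autoencoder and the extraction of its encoder half can be illustrated as follows. The weights here are random (untrained) and the tanh activations are an assumption; the sketch only shows the layer dimensions of the compression architecture and how the layers beyond the bottleneck are dropped.

```python
import numpy as np

rng = np.random.default_rng(1)

# Layer widths of the compression DNN described in the text:
# 4096 - 850 - 250 - 12 - 250 - 850 - 4096 (bottleneck of 12 neurons).
sizes = [4096, 850, 250, 12, 250, 850, 4096]
layers = [(0.01 * rng.standard_normal((m, n)), np.zeros(m))
          for n, m in zip(sizes[:-1], sizes[1:])]

def forward(x, layer_params):
    """Propagate a signal through the given layers (tanh activations assumed)."""
    for W, b in layer_params:
        x = np.tanh(W @ x + b)
    return x

signal = rng.standard_normal(4096)        # 4096 samples taken from a measured signal
reconstruction = forward(signal, layers)  # autoencoder output, same length as input

# After training, the layers beyond the bottleneck are discarded:
encoder = layers[:3]                      # 4096 -> 850 -> 250 -> 12
code = forward(signal, encoder)           # 12 compressed signal features
```

The 12-element code plays the role of the designated wave parameters: it is the compressed representation that can then feed a downstream regression network.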
