The modeling strategy here is to integrate two artificial neural networks (ANNs) that simulate the heat transfer and deformation processes, respectively, as shown in Figure 2. The two neural networks are named the thermal ANN and the mechanical ANN. The thermal ANN models the relationship between the machining parameters, the material thermophysical properties, the plate thickness, and the maximum temperature of the heated surface. The mechanical ANN models the relationship between the machining parameters, the plate thickness, the material thermophysical properties, the mechanical properties, the maximum temperature of the heated surface (the thermal ANN's prediction), and the plastic strain.
In order to alleviate the dependence of model training on the size of the sample dataset, dimensional analysis and domain knowledge were utilized here to reduce the dimensionality of the sample input space. The basic idea of this process is as follows: assuming that the m physical variables involved have k dimensions, the physical variables are converted into m − k dimensionless π factors based on the π theorem; then, the π factors are further merged based on thermal and mechanical domain knowledge to constitute sample input features of lower dimensionality, and the dimensionality-reduced dataset is used to train the ANN. The physical parameters (βi and γi) in Figure 2 represent the input features after dimensionality reduction.
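The coupling of the two networks can be sketched as follows: the thermal ANN's predicted maximum temperature is appended to the mechanical ANN's input vector. The stand-in lambda "networks" and feature values below are hypothetical placeholders, not the trained models:

```python
import numpy as np

def predict_plastic_strain(thermal_features, mech_features,
                           thermal_ann, mechanical_ann):
    """Thermal ANN predicts the maximum surface temperature; its output
    is appended to the mechanical ANN's input, as in Figure 2."""
    t_max = thermal_ann(thermal_features)
    mech_input = np.concatenate([mech_features, np.atleast_1d(t_max)])
    return mechanical_ann(mech_input)

# Stand-in "networks" for demonstration only (not the trained models)
thermal_ann = lambda x: float(x.sum()) * 0.1
mechanical_ann = lambda x: float(x.mean())

out = predict_plastic_strain(np.array([1.0, 2.0]),
                             np.array([0.5, 0.5]),
                             thermal_ann, mechanical_ann)
```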
(1) Acquisition of sample data using TEP-FEM, that is, the maximum temperature and plastic strain of the heated surface corresponding to the forming parameters for different materials and plate thicknesses;
(2) Reduction of the dimensionality of the sample input space using dimensional analysis and domain knowledge;
(3) Training, validation, and evaluation of the artificial neural networks.
3.2. Extraction of Physical Parameters
Table 1 displays the fundamental dimensions of the forming parameters involved in the IHMRF process. The variable σT in Equation (1) is used to represent the variation of yield stress with temperature:

σT = 1/(Tmax − T0) · ∫T0→Tmax σs(T) dT(1)

where σT represents the average value of the material yield stress over the temperature history from room temperature to the processing temperature, σs(T) represents the temperature-dependent yield stress, T0 represents room temperature, and Tmax represents the maximum temperature of the heated surface.
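Reading σT as the integral mean of σs(T) over [T0, Tmax], it can be approximated numerically with a trapezoidal rule. The linear softening curve below is purely illustrative, not the paper's material data:

```python
import numpy as np

def average_yield_stress(sigma_s, T0, Tmax, n=1001):
    """Trapezoidal approximation of the integral mean of sigma_s over [T0, Tmax]."""
    T = np.linspace(T0, Tmax, n)
    y = sigma_s(T)
    dT = T[1] - T[0]
    integral = np.sum((y[:-1] + y[1:]) * 0.5) * dT
    return float(integral / (Tmax - T0))

# Illustrative linear softening: 300 MPa at 20 C falling to 50 MPa at 800 C
sigma_s = lambda T: 300.0 - (300.0 - 50.0) * (T - 20.0) / (800.0 - 20.0)
sigma_T = average_yield_stress(sigma_s, 20.0, 800.0)
```

For a linear σs(T), the integral mean equals the midpoint value, here 175 MPa.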
The functional relationship describing the thermal process can be expressed as:

Tmax = fT(P, v, h, λ, Cp)
The functional relationship describing the deformation process can be expressed as:
As can be seen from Table 1, the thermal process under study involves 4 dimensions (length L, force F, time t, and temperature θ) and 6 physical variables. According to the π theorem, four basic variables need to be determined, and each of the remaining two variables can then be combined with these four basic variables to form a dimensionless π factor. The basic variables are selected as follows:
The power P and velocity v represent the magnitude of the heat source energy and are therefore naturally included as basic variables.
The plate thickness h is an important geometrical parameter of the heated plate.
The specific heat Cp is related to the ability of a material to absorb heat.
After determining the basic variables, the dimension matrix can be listed:

A =
⎡  1    1    1    2    0    0 ⎤
⎢  1    0    0    0    1    0 ⎥
⎢ −1   −1    0   −2   −1    0 ⎥
⎣  0    0    0   −1   −1    1 ⎦

where rows 1 to 4 of matrix A correspond, in order, to the four basic dimensions of length L, force F, time t, and temperature θ; columns 1 to 4 contain the exponents of the basic dimensions for the basic variables (P, v, h, and Cp); column 5 contains the exponents for the thermal conductivity; and column 6 contains the exponents for the maximum temperature. After elementary row transformations of the matrix, the π factors of the heat transfer process can be obtained as:
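The π-theorem step itself can be sketched numerically: the exponents that render a target variable dimensionless are the solution of a linear system built from the dimension matrix. The exponent values below are standard F-L-t-θ dimensions assumed for P, v, h, Cp, λ, and Tmax, not values copied verbatim from Table 1:

```python
import numpy as np

# Columns: P, v, h, Cp (the basic variables); rows: L, F, t, theta.
# Dimension exponents are assumed standard F-L-t-theta values.
A_basic = np.array([[ 1.,  1., 1.,  2.],   # length L
                    [ 1.,  0., 0.,  0.],   # force F
                    [-1., -1., 0., -2.],   # time t
                    [ 0.,  0., 0., -1.]])  # temperature theta

def pi_exponents(d_target):
    """Exponents (a, b, c, d) such that target / (P^a v^b h^c Cp^d)
    is dimensionless, i.e. the solution of A_basic @ e = d_target."""
    return np.linalg.solve(A_basic, d_target)

lam  = np.array([0., 1., -1., -1.])  # thermal conductivity, F t^-1 theta^-1
tmax = np.array([0., 0.,  0.,  1.])  # maximum temperature, theta
```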
Similarly, for the mechanical process, the yield stress σs of the material, the plate thickness h, the maximum temperature of the heated surface Tmax, and the velocity v are chosen as the basic variables. Then, the π factors of the mechanical process can be derived as:
The heat transfer equation reveals the physical significance of the πT1 factor in Equation (5) as a parameter associated with the heat source's moving speed. Therefore, the πT1 factor is used as the physical parameter β1 for training the thermal ANN. In the πT2 factor, which contains the output of the thermal ANN, the construction of the physical parameter β2 related to the maximum temperature of the heated surface is considered:

where the value of P/(vh²) is related to the maximum temperature of the heated surface. Moreover, according to the literature [12], the relationship between the πT1 factor and the πT2 factor is completely different in thin and thick plates. Therefore, the plate thickness h is used here as the physical parameter β3 for training the thermal ANN. Table 2 displays the input features related to the thermal ANN.
In IHMRF, thermal expansion and the mechanical bending action are the two types of driving forces that cause the deformation of plates. Therefore, the inputs used to train the mechanical ANN should contain terms for the temperature, the coefficient of thermal expansion, and the amount of downward pressure. According to the literature [13], the temperature at which the material yields during the heating process is approximately equal to σs/(αE). The physical parameter γ1 is constructed based on the πm1 factor and the πm2 factor:
The physical significance of the constructed physical parameter γ1 is to characterize the magnitude of the heating-induced thermal expansion force relative to the material yield stress. The physical parameter γ2 is constructed based on the πm6 factor, the πm8 factor, and the plate thickness h, considering the distribution of the temperature field in the plate-thickness direction:

The two parameters γ1 and γ2 together reflect the intrinsic effect of the temperature field. For the mechanical bending effect of the rollers, with reference to the slat beam theory, the forming depth can be characterized in the following dimensionless form:
On the other hand, the mechanical properties of the material, particularly the yield stress, are intrinsic factors that determine the magnitude of plastic deformation of the material. Therefore, πm1 and πm3 in Equation (6) are also used as input features for training the mechanical ANN. The input features related to the mechanical ANN are shown in Table 3.
3.3. Artificial Neural Network (ANN)
A feedforward neural network with backpropagation was used in this study. This is a supervised learning algorithm that maps a given set of inputs to outputs using the function f: Ωj → Ωo, where j is the number of dimensions for input and o is the number of dimensions for output. The purpose of training is to learn the function f using the sample dataset. The function f is found by minimizing the cost function L, which is a measure of how well the neural network performs with respect to the network predicted output f(x) and the target values y.
Usually, a neural network model can be divided into an input layer, one or more hidden layers, and an output layer. The input layer is composed of a set of neurons representing the input features {x1, x2, x3, …, xj}. The output layer provides the predicted output f(x). Depending on the complexity of the problem, one or more hidden layers are designed between the input and output layers to transform and process the data. The data transformation from the input layer to the 1st hidden layer can be expressed as:

g1 = φ(W1x + b1)
where g1 is the output of the 1st hidden layer, W1 is the weight matrix of the linear transformation, b1 is a bias vector, x is the input feature vector, and φ is a nonlinear activation function. Similar data transformations are performed between subsequent neighboring layers (e.g., the rth hidden layer and the (r + 1)th hidden layer), but the weight matrix Wr, the bias vector br, and the activation function φ may change.
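A minimal sketch of this layer transformation, with illustrative sizes and a tanh activation:

```python
import numpy as np

def layer_forward(W, b, x, phi=np.tanh):
    """One fully connected layer: affine transform followed by activation."""
    return phi(W @ x + b)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(5, 3))   # 3 input features -> 5 hidden neurons
b1 = np.zeros(5)
x  = np.array([0.2, -0.5, 1.0])
g1 = layer_forward(W1, b1, x)  # output of the 1st hidden layer
```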
At the beginning of the first forward pass of the artificial neural network, the weights and biases are initialized, and the input vectors enter the network for an initial prediction. The backpropagation algorithm computes the loss function and then iteratively adjusts the weights and biases to minimize it [14]. The backpropagation algorithm determines the impact of each weight and bias in the model on the predicted output values by computing the partial derivatives of the loss function with respect to the weights and biases, as follows:

where Sr is the output of the rth layer. The new weights and biases are then computed using the gradient-descent method:

Wnew = W − η · ∂L/∂W,  bnew = b − η · ∂L/∂b

where η is the learning rate.
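A minimal sketch of this gradient-descent update on a single linear neuron with a squared-error loss; the learning rate and data below are illustrative, not the paper's settings:

```python
import numpy as np

def gd_step(W, b, x, y, eta=0.1):
    """One gradient-descent update for f(x) = W @ x + b with L = (f(x) - y)^2."""
    err = W @ x + b - y          # dL/df = 2 * err
    grad_W = 2.0 * err * x       # chain rule: dL/dW = dL/df * df/dW
    grad_b = 2.0 * err           # df/db = 1
    return W - eta * grad_W, b - eta * grad_b

W, b = np.zeros(2), 0.0
x, y = np.array([1.0, 2.0]), 3.0
for _ in range(100):             # repeated updates drive the loss toward zero
    W, b = gd_step(W, b, x, y)
```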
The ANN models in this study are implemented in PyTorch. As shown in Figure 4, both the trained thermal ANN and the trained mechanical ANN contain two hidden layers. The activation function between the input and hidden layers is the widely used Tanh function. Since this study addresses a regression problem, a linear activation function is used between the last hidden layer and the output layer. The loss function L is defined as the mean squared error between the predicted output f(x) and the actual output y:

L = (1/n) Σi=1..n (yi − f(xi))²

where n is the number of samples fed into the network, yi is the actual value, and f(xi) is the predicted value. To avoid the suppression of small values by large values in the input features, all data are normalized to the range [−1, 1]. The normalized value x* for each input feature x is calculated as:

x* = 2(x − xmin)/(xmax − xmin) − 1
where xmin and xmax are the minimum and maximum values of the input feature, respectively. Before training, the sample dataset is divided into a training set (80%) and a testing set (20%). About 20% of the training set is further set aside as a validation set. The training set is used for training the ANN, the validation set is used to monitor the performance of the ANN during training, and the test set is not involved in network training and is used to evaluate the predictive ability of the final trained ANN. The weights and biases of the neural network are initialized using the Kaiming uniform distribution and are optimized using the Adam algorithm during training. The network divides the training data into mini-batches of 20 samples during each training epoch.
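The [−1, 1] min-max normalization described above can be sketched as follows (feature values illustrative):

```python
import numpy as np

def normalize(x):
    """Map each feature column linearly onto [-1, 1]."""
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

# Two illustrative input features with very different scales
features = np.array([[1.0,  10.0],
                     [2.0,  40.0],
                     [3.0, 100.0]])
scaled = normalize(features)
```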
A genetic algorithm is used to tune the hyperparameters of the ANN model (the number of neurons in each hidden layer and the initial learning rate). The genetic algorithm is an intelligent stochastic global search algorithm inspired by natural evolution, which is less likely to fall into local minima than traditional search methods because it uses a population of individuals to explore all regions of the solution space. In the genetic-algorithm optimization process, the loss function L of the model on the validation set is minimized as the optimization objective. Because a single random split is strongly affected by sampling randomness, a 5-fold cross-validation method is used to assess the performance of the ANN on the validation set; that is, the loss function L is the value evaluated by 5-fold cross-validation. Each individual in this study encodes a set of parameters: the size of the first hidden layer, the size of the second hidden layer, and the initial learning rate. The range of values for each parameter in the initial population is as follows:
Size of the first hidden layer:
Size of the second hidden layer:
Initial learning rate:
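A highly simplified sketch of such a genetic-algorithm search over (first hidden size, second hidden size, initial learning rate); the fitness function here is a hypothetical stand-in for the 5-fold cross-validated validation loss, not the paper's trained-ANN evaluation:

```python
import random

def fitness(ind):
    """Stand-in for the 5-fold cross-validated validation loss (lower is better)."""
    h1, h2, lr = ind
    return (h1 - 12) ** 2 + (h2 - 8) ** 2 + (lr - 0.01) ** 2 * 1e4

def mutate(ind):
    """Perturb each gene within simple bounds."""
    h1, h2, lr = ind
    return (max(2, h1 + random.randint(-2, 2)),
            max(2, h2 + random.randint(-2, 2)),
            min(0.1, max(1e-4, lr * random.uniform(0.5, 2.0))))

def crossover(a, b):
    """Uniform crossover: pick each gene from one of the two parents."""
    return tuple(random.choice(pair) for pair in zip(a, b))

random.seed(0)
pop = [(random.randint(2, 30), random.randint(2, 30),
        random.uniform(1e-4, 0.1)) for _ in range(20)]
for _ in range(30):                       # generations
    pop.sort(key=fitness)
    elite = pop[:10]                      # selection: keep the best half
    pop = elite + [mutate(crossover(random.choice(elite), random.choice(elite)))
                   for _ in range(10)]
best = min(pop, key=fitness)
```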