#### *2.2. Winkler Model*

Rail deformation and bending stress under specific loads are typically estimated using the Winkler model, which treats the track as an infinite beam on a continuous elastic foundation [27–29]. Using the Winkler model (Equation (1)), the vertical rail deflection (*y*) at a distance *x* from the applied load (*P*) is computed as follows:

$$y(x) = \frac{P\beta e^{-\beta x} (\cos \beta x + \sin \beta x)}{2U} \tag{1}$$

where β is the stiffness ratio, equal to $(U/(4EI))^{0.25}$; *U* is the track modulus; *E* is the modulus of elasticity of the rail; and *I* is the second moment of area of the rail.

From the Winkler model, the vertical deflection profile of a rail depends only on the track modulus when the rail size and vertical loads are known. Once a track modulus value is assumed, the rail vertical deflection profile can be estimated using Equation (1), and from this profile, *Yrel* can be calculated as the relative vertical deflection between the rail surface and the rail/wheel contact plane at a distance of 1.22 m from the nearest wheel (Figure 1a) [11,30]. The main shortcoming of this method is that the Winkler model assumes the track modulus is constant along the track, whereas field data show that the track modulus varies stochastically along the track [31,32]. Therefore, estimating the track modulus from *Yrel* measurements requires more advanced numerical models.
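The Winkler deflection of Equation (1) is straightforward to evaluate numerically. The sketch below uses illustrative values that are *not* taken from the paper (a 130 kN wheel load and approximate RE136 rail properties), and it approximates *Yrel* with a single-wheel simplification, i.e., the difference between the deflection under the wheel and the deflection 1.22 m away; the actual *Yrel* definition involves the rail/wheel contact plane of the full bogie.

```python
import math

def winkler_deflection(x, P, U, E, I):
    """Vertical rail deflection y(x) at distance x (m) from a point load P (N),
    per the Winkler beam-on-elastic-foundation solution (Equation (1))."""
    beta = (U / (4.0 * E * I)) ** 0.25          # stiffness ratio (1/m)
    bx = beta * abs(x)
    return (P * beta / (2.0 * U)) * math.exp(-bx) * (math.cos(bx) + math.sin(bx))

# Illustrative values (assumptions, not from the paper):
P = 130e3     # N, wheel load
E = 207e9     # Pa, steel
I = 3.95e-5   # m^4, approximate second moment of area of an RE136 rail
U = 41.4e6    # Pa, track modulus (41.4 MPa, a value mentioned in the paper)

y0   = winkler_deflection(0.0, P, U, E, I)    # deflection under the wheel
y122 = winkler_deflection(1.22, P, U, E, I)   # deflection 1.22 m away
y_rel = y0 - y122   # single-wheel simplification of the Yrel concept
```

With these inputs the deflection under the wheel is on the order of 1–2 mm, consistent with typical mainline track behavior.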

#### *2.3. Finite Element Model*

FEMs allow the simulation of a stochastically varying track modulus and, therefore, a more accurate simulation of *Yrel* measurements. Fallah Nafari et al. developed 90 FEMs with a stochastically varying track modulus to facilitate a more detailed investigation of the relationship between *Yrel* and the track modulus [21]. Datasets from the 90 FEMs were used for the study in this paper; hence, the details of the models are only briefly discussed here. The models were developed using CSiBridge software, where each model includes a 180.8 m track structure with two rails, crossties, and spring supports [33]. To develop each model, a normal track modulus distribution is assumed, and randomly selected numbers from this distribution are assigned to the spring supports along the track. Statistical properties of the assumed normal distributions are summarized in Table 1, and the applied loads are depicted in Figure 2. RE136 rail and 0.508 m tie spacing are used in the models.
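The random assignment of modulus values to the spring supports can be sketched as follows. This is a minimal illustration, not the paper's actual model-generation procedure: the mean and COV shown match one case reported in the paper (41.4 MPa, 0.25), the lower clipping bound is an assumption to avoid non-physical values, and the spring stiffness is approximated as the modulus times the tributary tie length.

```python
import numpy as np

rng = np.random.default_rng(0)

track_length = 180.8   # m, per the paper's FEMs
tie_spacing = 0.508    # m
n_ties = int(round(track_length / tie_spacing))   # number of spring supports

# Assumed normal track modulus distribution (mean 41.4 MPa, COV 0.25,
# matching one case from the paper).
u_mean = 41.4e6        # Pa
u_cov = 0.25
u_samples = rng.normal(u_mean, u_cov * u_mean, size=n_ties)
u_samples = np.clip(u_samples, 1e6, None)   # guard against non-physical values

# Equivalent spring stiffness per support: modulus times tributary length
# (a simplifying assumption for this sketch).
k_springs = u_samples * tie_spacing          # N/m per support
```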



**Figure 2.** The loading condition in the finite element models (FEMs).

Individual *Yrel* values are calculated from the vertical deflection profile at 0.3048 m (≈1 ft) intervals as the moving loads pass along the track model. The dynamic effects of track–train interaction are not considered in the simulations due to the software's limitations. This is acceptable within the scope of this study, which focuses mostly on Canadian freight lines, where speeds are most likely lower than 65 km/h.

Figure 3 shows an example of the track modulus inputted to the model and the corresponding *Yrel* output. Fallah Nafari et al. used basic statistical analysis and curve-fitting approaches to study the relationship between the statistical properties of the track modulus (*U*) and *Yrel* [21]. The results showed that the average and standard deviation of the track modulus over a track section length can be estimated from the average and standard deviation of *Yrel* over the same track section length. However, the estimation accuracy decreases as the track section length decreases [21]. To overcome this shortcoming and increase the estimation accuracy of the track modulus, ANNs are proposed for track modulus estimation in this study.

**Figure 3.** (**a**) Track modulus inputted to the FEM (Mean = 41.4 MPa, COV = 0.25); and (**b**) the extracted *Yrel.*

#### *2.4. Estimation of Track Modulus Average*

#### 2.4.1. Multilayer Perceptron Artificial Neural Networks

Multilayer perceptron neural networks (MLPNNs) are typically used for classification and function approximation problems [34–38]. An MLPNN operates in two stages: training and testing. Once the training process is successfully performed in a self-adaptive manner with all parameters defined (such as the learning algorithm and the network architecture, including the number of layers and the number of neurons in each layer), the network can effectively approximate the input–output mapping function.

An MLPNN is a network containing two or more neurons distributed across different layers: an input layer, an output layer, and hidden layers that connect the two (Figure 4a). Each neuron has a nonlinear, differentiable activation function that produces real values and is densely connected to other neurons through synaptic weights *wij* (*n*) (Figure 4b), which define the level of connectivity.

**Figure 4.** (**a**) Example of a two-hidden layer perceptron; (**b**) typical operation at neuron *j.*

One of the most demanding tasks before executing an MLPNN is defining all the parameters required to approximate the input–output relationship. This is called the learning process, and it consists of two phases. In the forward phase, the inputs are fed into the network from left to right, layer by layer, with fixed synaptic weights. In the backward phase, the error vector is first computed by subtracting the network output from the expected target. The error is then propagated backward from the output layer to the input layer. In this phase, the synaptic weights are adjusted to minimize the network error by solving the credit-assignment problem for each hidden unit: each synaptic weight is updated according to the contribution of the corresponding hidden unit to the overall error. More information about training the network using backpropagation and gradient descent can be found in Haykin [34].
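The forward and backward phases described above can be sketched with a minimal NumPy implementation. This is a generic illustration of backpropagation on a toy target function, not the network configuration or training setup used in the paper; the layer sizes, learning rate, and epoch count are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def tanh(z):  return np.tanh(z)
def dtanh(z): return 1.0 - np.tanh(z) ** 2

# One hidden layer of 8 tanh units, linear output (illustrative sizes).
n_in, n_hidden, n_out = 2, 8, 1
W1 = rng.normal(0, 0.5, (n_hidden, n_in)); b1 = np.zeros((n_hidden, 1))
W2 = rng.normal(0, 0.5, (n_out, n_hidden)); b2 = np.zeros((n_out, 1))

X = rng.uniform(-1, 1, (n_in, 200))   # inputs, one column per sample
T = X[0:1] * X[1:2]                   # toy target function

lr = 0.2
for epoch in range(5000):
    # Forward phase: propagate inputs layer by layer with fixed weights.
    Z1 = W1 @ X + b1; A1 = tanh(Z1)
    Y  = W2 @ A1 + b2                 # linear output layer
    # Backward phase: propagate the error and adjust the synaptic weights,
    # each weight according to its unit's contribution to the overall error.
    E   = Y - T
    dW2 = E @ A1.T / X.shape[1]; db2 = E.mean(axis=1, keepdims=True)
    dZ1 = (W2.T @ E) * dtanh(Z1)
    dW1 = dZ1 @ X.T / X.shape[1]; db1 = dZ1.mean(axis=1, keepdims=True)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

# Final forward pass with the trained weights.
Y = W2 @ tanh(W1 @ X + b1) + b2
mse = float(np.mean((Y - T) ** 2))
```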

#### 2.4.2. Estimation Procedure and Results

The inputted track modulus and the corresponding *Yrel* data from the 180.8 m track models are divided into equal groups based on a track section length (e.g., 5 m, 10 m, etc.). Once the subgroups are defined, the average and standard deviation of *Yrel* in each subgroup are used as the network's inputs, whereas the track modulus averages over the corresponding track segments are defined as the network's outputs.
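The grouping step above amounts to reshaping the sampled *Yrel* profile into fixed-length sections and computing per-section statistics. The sketch below is a simplified illustration on synthetic data; the function name and the synthetic profile are assumptions, not the paper's implementation.

```python
import numpy as np

def segment_features(y_rel, sample_spacing=0.3048, section_length=10.0):
    """Split a Yrel profile (sampled every sample_spacing meters) into
    fixed-length track sections and return the per-section mean and
    standard deviation used as the network's inputs."""
    samples_per_section = int(round(section_length / sample_spacing))
    n_sections = len(y_rel) // samples_per_section
    trimmed = np.asarray(y_rel[: n_sections * samples_per_section])
    sections = trimmed.reshape(n_sections, samples_per_section)
    return sections.mean(axis=1), sections.std(axis=1)

# Illustrative use on a synthetic Yrel profile (mm), not the paper's FEM output.
rng = np.random.default_rng(1)
y_rel = rng.normal(1.5, 0.2, size=600)   # ~183 m of samples at 0.3048 m
means, stds = segment_features(y_rel, section_length=10.0)
```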

*Yrel* data extracted from eighty-one of the ninety FEMs are used to train the neural network. The accuracy of the trained network is then tested using the remaining nine (unseen) FEMs. These nine FEMs are called "unseen models" hereafter, as they are not used in training the network. To test the trained network, the track modulus average is estimated from the *Yrel* average and standard deviation for the nine unseen models. The estimated track modulus average is then compared with the track modulus initially inputted into the FEMs to generate the *Yrel* data. The effectiveness of the proposed network is measured using three parameters: the coefficient of determination (*R*<sup>2</sup>), the root mean square error (RMSE), and the mean absolute percentage error (MAPE) [39]. These measures are defined as follows:

$$R^2 = \left(\frac{\frac{1}{N}\sum_{i=1}^{N} \left[ (o_i - \overline{o}) \left( y_i - \overline{y} \right) \right]}{\sigma_o \cdot \sigma_y} \right)^2 \tag{2}$$

$$\text{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(y_i - o_i\right)^2} \tag{3}$$

$$\text{MAPE} = \frac{1}{N} \sum_{i=1}^{N} 100\, \frac{\left| y_i - o_i \right|}{y_i} \tag{4}$$

where *oi* and *yi* are the estimated and targeted values, respectively; $\overline{o}$ and $\overline{y}$ are their averages; σ*o* and σ*y* are their standard deviations; and *N* is the number of testing samples.
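Equations (2)–(4) translate directly into NumPy. The check values below are synthetic illustrations, not results from the paper.

```python
import numpy as np

def r_squared(y, o):
    """Squared Pearson correlation between targets y and estimates o (Eq. (2))."""
    cov = np.mean((o - o.mean()) * (y - y.mean()))
    return float((cov / (o.std() * y.std())) ** 2)

def rmse(y, o):
    """Root mean square error (Eq. (3))."""
    return float(np.sqrt(np.mean((y - o) ** 2)))

def mape(y, o):
    """Mean absolute percentage error in percent (Eq. (4))."""
    return float(np.mean(100.0 * np.abs(y - o) / y))

# Illustrative check with synthetic values (MPa), not the paper's results.
y = np.array([20.0, 25.0, 30.0, 35.0, 40.0])   # targets
o = np.array([21.0, 24.0, 31.0, 34.0, 41.0])   # estimates
```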

When a network is trained, five-fold cross-validation is employed to minimize potential over-fitting and improve the network's generalization. Regarding the network architecture, a network with two hidden layers (each containing 15 hidden nodes) is used in this study. This network ensures an acceptable error range, avoids over-fitting, and offers good computational efficiency. Various tests showed that increasing the number of hidden nodes and hidden layers does not necessarily improve the network's performance. In fact, the input configuration is the most important factor controlling the network performance.
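The five-fold cross-validation scheme can be sketched as an index-splitting routine: the samples are shuffled, split into five folds, and each fold serves once as the validation set while the remaining four folds train the network. This is a generic illustration, not the paper's implementation.

```python
import numpy as np

def five_fold_indices(n_samples, seed=0):
    """Yield (train, validation) index arrays for five-fold cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, 5)
    for k in range(5):
        val = folds[k]
        train = np.concatenate([folds[j] for j in range(5) if j != k])
        yield train, val

# Illustrative use with 100 samples: five splits of 80 train / 20 validation.
splits = list(five_fold_indices(100))
```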

Five networks for five different track section lengths were fully trained for this study. The track modulus average over the five section lengths is then estimated for the nine unseen models using the trained networks. Table 2 presents the accuracy of these estimations. From the table, the network performs better as the track section length increases, although the error is acceptably small even for a 10 m section length. *R*<sup>2</sup> is 0.95 for the 10 m section length, which means that the estimated and inputted track modulus averages are well correlated. Moreover, the RMSE and MAPE are quite small, i.e., 2.81 MPa and 6.99%, respectively, considering that the inputted track modulus average ranges from 12.8 to 41.4 MPa. In addition to confirming the applicability of *Yrel* data for indicating track modulus information, the current methodology provides more accurate results than the earlier method in the literature [21]. As shown in Table 2, the *R*<sup>2</sup> value computed in the related study decreases as the track segment length is reduced, whereas the *R*<sup>2</sup> in the current study is almost constant for track sections of 10 m and longer.

**Table 2.** Estimation accuracy of the track modulus average (no noise added).


\* Mean Absolute Percentage Error; \*\* Root Mean Square Error.

The accuracy of the estimation method for the 10 and 20 m section lengths is demonstrated in Figure 5 for four example models. These four models have different track modulus averages and variations. From the figure, the values estimated by the networks are very close to the actual track modulus averages inputted to the FEMs. Better results are observed for the 20 m section length (Figure 5b), although the performance over the shorter section length (Figure 5a) is still satisfactory. Most importantly, the local fluctuation of the track modulus is well captured.

**Figure 5.** Moving average of the actual track modulus inputted to the FEMs vs. estimated values over: (**a**) the 10 m section length; and (**b**) the 20 m section length.

The effectiveness of the framework is further investigated by adding artificial noise to the *Yrel* data extracted from the FEMs. This simulates real-life conditions, in which *Yrel* measurements are affected by parameters such as the resolution of the MRail measurement system and track irregularities. The artificial noise was added based on Equation (5) [40]; an example of pure vs. noise-added *Yrel* is shown in Figure 6:

$$Y_{rel\text{-}noise} = Y_{rel} + \alpha \cdot 0.12 + \beta \cdot 0.1 \cdot Y_{rel} \tag{5}$$

where α and β are random numbers ranging from −1 to 1.
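Equation (5) can be applied as below. Two details are assumptions here: fresh α and β values are drawn independently for every sample (the paper does not state how often they are redrawn), and the constant term 0.12 is taken to be in the same units as *Yrel* (mm in this sketch).

```python
import numpy as np

def add_noise(y_rel, seed=0):
    """Add artificial noise to Yrel per Equation (5): a constant-amplitude
    term (alpha * 0.12) plus a proportional term (beta * 0.1 * Yrel),
    with alpha and beta drawn uniformly from [-1, 1] for each sample."""
    rng = np.random.default_rng(seed)
    alpha = rng.uniform(-1.0, 1.0, size=y_rel.shape)
    beta = rng.uniform(-1.0, 1.0, size=y_rel.shape)
    return y_rel + alpha * 0.12 + beta * 0.1 * y_rel

# Illustrative constant Yrel profile (mm), not the paper's FEM output.
y_rel = np.full(100, 1.5)
y_noisy = add_noise(y_rel)
```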

**Figure 6.** Demonstration of pure and noisy *Yrel*.

The noisy *Yrel* is used to train new networks, and the trained networks are then used to estimate the track modulus average. The estimated track modulus is compared with the inputted track modulus for each model, and the errors are reported in Table 3. From the table, the estimation of the track modulus average (*Uave*) from the noisy *Yrel* is still successful, even for the short track section length of 10 m, with an *R*<sup>2</sup> of 0.95 and an RMSE of 2.77 MPa. This demonstrates that the framework performs effectively even when the *Yrel* data contain noise and is thus expected to work with real-life data.


**Table 3.** Estimation accuracy of the track modulus average (with added noise).

\* Not available for comparisons since those section lengths are not available in the previous study.
