**1. Introduction**

Concrete and steel are the two most commonly used construction materials today. However, each material has its own advantages and disadvantages [1–3]. To exploit the advantages and minimize the disadvantages, an effective solution is to combine both materials, either as a "combined steel concrete structure" or as a combination of concrete elements and steel elements in "composite structures". One such combined structure is a steel tube filled with medium or high strength concrete, known as a concrete filled steel tube.

In recent decades, concrete filled steel tubes (CFSTs) have been widely used in the construction of modern buildings and bridges [4], even in high seismic risk areas [5–10]. This growth is owing to the significant advantages that the CFST column system offers over conventional steel or reinforced concrete systems, such as high axial load capacity [4], good plasticity and toughness [6], larger energy absorption capacity [7], convenient construction [11], economy of materials [12–14], and excellent seismic and fire-resistance performance [15]. In particular, this type of structure can reduce the environmental burden by eliminating formwork [16], reusing steel tubes, and using high quality concrete with recycled aggregate [17]. In a CFST, the steel is located far from the central axis, so it contributes strongly to the moment of inertia of the section and the column is very stiff [5,18]. The concrete core carries the compressive load and delays local buckling of the steel tube. Therefore, CFST structures are often used in locations subject to large compressive loads [9,15,19]. CFST columns are mainly divided into square, round, and rectangular columns, based on their cross-sectional form [15]. In particular, square and rectangular CFST columns have the advantage of easy and reliable connection with other structural members such as beams, walls, and panels [20]. Compared with square CFST columns, rectangular columns have unequal bending stiffness about their two axes, so this type of column is suitable for members such as arch ribs, pillars, abutments, and piers, and for other structural members whose loading differs greatly between the vertical and horizontal directions [6].
Because the scope of application of rectangular CFST columns is quite wide and this column is mainly subjected to compression, the main purpose of the paper is to analyze and evaluate the ultimate bearing capacity of rectangular columns.

In recent decades, rules for designing CFST columns have been proposed in design standards such as AISC-LRFD [21], ACI 318-05 [22], Japan Institute of Architecture [17], European Code EC 4, British Standard BS 5400 [23], and Australian Standard AS-5100.6 [24]. In addition, numerous experimental and numerical studies have been conducted to analyze the mechanical properties of rectangular CFST columns under axial compression. As an example, Hatzigeorgiou [25] proposed a theoretical analysis for modeling the behavior of CFST under extreme loading conditions. The verification of this approach against experimental and analytical results was later reported by Hatzigeorgiou [26]. Liu et al. [4] tested 26 rectangular CFST column samples under concentric compression, with strength and aspect ratio as the main parameters. Chitawadagi et al. [8] showed that the load capacity of CFST columns depends on properties such as the wall thickness of the tube, the strength of the in-filled concrete, the cross-sectional area of the steel tube, and the tube length. In that study, 243 rectangular CFST samples were investigated; the experimental results were compared with the column strength predicted according to design codes such as EC4-1994 and AISC-LRFD-1994. Many other tests have dealt with factors that affect the bearing capacity of rectangular CFST columns, such as the effect of concrete compaction [27], load conditions, and boundary conditions [16]. The addition of steel fibers to the core concrete was found to have a significant effect on the performance of concrete filled steel tubes [28], and many other tests have been reported [9,13,29–32]. Finite element analysis is now also frequently used for design and research thanks to the availability of commercial software packages such as ABAQUS [33] and ANSYS [34]. Tort et al. [35] carried out computational research to analyze the nonlinear response of composite frames including rectangular concrete filled steel tube beams and steel frames subjected to static and dynamic loads. On the basis of the Drucker–Prager model, Wang et al. [36] developed a finite element model that can predict the axial compression behavior of a composite column with a fiber-reinforced concrete core. Collecting 340 test data of circular, square, and rectangular CFST columns, Tao et al. [37] developed new finite element models for simulating CFST stub columns under axial compression. The new model was more flexible and accurate for modeling CFST stub columns. However, the design standards are limited in their scope of use and are not suitable for high-strength materials, and testing is often costly and time-consuming. The accuracy of finite element models is greatly affected by the input parameters, especially the selection of a suitable concrete model. Therefore, it is necessary to propose a uniform and effective approach to design rectangular CFST columns.

In recent years, artificial intelligence (AI) based on computer science has gradually become popular and has been applied in many different fields [38–41]. The artificial neural network (ANN) is a branch of AI techniques; different ANN-based modeling methods have been used by scientists in many construction engineering applications [42]. Sanad et al. [43] used ANN to estimate the ultimate shear strength of reinforced concrete deep beams. Lima et al. [44] predicted the bending resistance and initial stiffness of steel beam connections using a back-propagation algorithm. Seleemah et al. [45] applied ANN to predict the maximum shear strength of concrete beams without horizontal reinforcement. Blachowski and Pnevmatikos [46] developed a vibration control system based on the ANN method for application in earthquake engineering. As an example for structural engineering, Kiani et al. [47] applied AI techniques including support vector machines (SVM) and ANN for deriving seismic fragility curves. It is worth noticing that significant studies have been carried out to explore the prediction of damage using AI techniques. In a series of papers, Mangalathu et al. proposed various AI methods such as ANN and random forest for tracking damage of bridge portfolios [48] as well as assessing the seismic risk of skewed bridges [49]. In terms of structural failure, typical failure modes of reinforced concrete columns such as flexure, flexure–shear, and shear were investigated by Mangalathu et al. [50,51] using decision trees (DT), SVM, and ANN. Guo et al. [52,53] employed the ANN model for the identification of damage in different structures such as suspended-dome and offshore jacket platforms. Regarding structural uncertainty analysis, various published works by E. Zio should be consulted [54–56]. For rectangular CFST columns, the use of ANN has also been proposed. For example, Sadoon et al. [57] proposed an ANN model for predicting the ultimate strength of rectangular concrete filled steel tube (RCFST) members under eccentric axial load. The results showed that the ANN model was more accurate than the AISC and Eurocode 4 standards. Du et al. [10] formulated an ANN model with different input parameters to determine the axial bearing capacity of rectangular CFST columns. The results of the model were compared with the results calculated according to European Code EC 4 [23], ACI [22], and AISC360-10 [21], and the ANN model was found to be accurate. However, in the above studies, the reported correlation coefficient (R) was less than 0.98. Therefore, in this paper, we assembled a large sample set and proposed an algorithm to increase the accuracy of the prediction of the axial load bearing capacity of the CFST column.

In short, this paper is dedicated to the development and optimization of an AI-based model, namely the feedforward neural network (FNN), to predict the axial capacity P<sub>u</sub> of CFST. An optimization algorithm, invasive weed optimization (IWO), was used to finely tune the FNN parameters (i.e., weights and biases) to develop a hybrid model, namely FNN–IWO, and to improve the prediction performance. With respect to the CFST database, 99 samples were collected from the available literature and used for the training and testing phases of the FNN–IWO algorithm. Criteria such as coefficient of determination (R<sup>2</sup>), standard deviation error (ErrorStD), root mean square error (RMSE), mean absolute error (MAE), and slope were used to evaluate the performance of FNN–IWO. Finally, an investigation of the prediction capability as a function of different structural parameters was conducted.

#### **2. Materials and Methods**

#### *2.1. Feedforward Neural Network (FNN)*

An artificial neural network (ANN) is a computational or mathematical model based on biological neural networks. It consists of a number of artificial neurons (nodes) that are linked to each other; it processes information by transmitting it along the connections and calculating new values at the nodes (a connectionist approach to computation) [58,59]. ANN models are made up of three or more layers: an input layer (the leftmost layer of the network) representing the inputs, an output layer (the rightmost layer) representing the results, and one or more hidden layers representing the logical reasoning of the network [60–62]. The neurons in each layer are linked to the neurons of the preceding and following layers, with a weight associated with each link. A training algorithm is used to iteratively minimize the cost function with respect to the link weights and neuron thresholds. Networks are usually divided into two categories based on how the units are connected: the feedforward neural network (FNN) and the recurrent neural network. To date, FNN is the most popular architecture owing to its structural flexibility, good performance, and the availability of many training algorithms [63]. Currently, the most widely used training algorithm for multi-layer feedforward networks is the backpropagation algorithm (BP). In BP, network training is achieved by adjusting weights over numerous training sets and training cycles [64]. With their ability to approximate functions, FNNs have been successfully applied in a number of civil engineering and structural fields [65] such as predicting the compression strength of concrete [66], investigating the fire resistance of calves [67], determining the axial strength of cylindrical concrete pillars [58], and predicting the fire resistance of concrete tubular steel columns [65]. Therefore, in this study, FNN was selected and used to predict the axial capacity of CFST.

#### *2.2. Invasive Weed Optimization (IWO)*

IWO is a relatively recent stochastic optimization method inspired by a common phenomenon in agriculture: the colonization of fields by invasive weeds. It was first introduced by Mehrabian and Lucas in 2006 [68]. The technique is based on several features of invasive weeds, which reproduce and spread quickly and vigorously and adapt to changes in climatic conditions [69]. Capturing these characteristics leads to a powerful optimization algorithm [70]. Compared with other evolutionary algorithms, IWO has few parameters, a simple structure, and is easy to understand and program [71]. The IWO algorithm has become increasingly popular and has been successfully applied in areas such as antenna system design [72] and the design of coding sequences for DNA [73], as well as related problems in economics [74], tourism [75], and construction techniques [76]. The IWO algorithm is implemented by the following steps:
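A minimal sketch of the commonly described IWO loop is given here for orientation: initialization, seed production proportional to fitness, spatial dispersal with a shrinking standard deviation, and competitive exclusion. It is not the authors' implementation. The function name `iwo_minimize`, the arguments `init_size`, `seeds_min`, `seeds_max`, and `bounds`, and their default values are assumptions made for illustration; the population size, exponent, standard deviations, and iteration count default to the values reported later in Section 3.1.

```python
import numpy as np

def iwo_minimize(cost, dim, pop_size=50, init_size=10, max_iter=800,
                 seeds_min=0, seeds_max=5, exponent=2.0,
                 sigma_init=0.01, sigma_final=0.001,
                 bounds=(-1.0, 1.0), seed=0):
    """Hypothetical IWO sketch: returns (best_position, best_cost)."""
    rng = np.random.default_rng(seed)
    low, high = bounds

    # Step 1: spread an initial population of weeds over the search space.
    pop = rng.uniform(low, high, size=(init_size, dim))
    costs = np.array([cost(w) for w in pop])

    for it in range(max_iter):
        # Dispersal standard deviation shrinks nonlinearly with iterations.
        frac = (max_iter - it) / max_iter
        sigma = frac ** exponent * (sigma_init - sigma_final) + sigma_final

        best, worst = costs.min(), costs.max()
        offspring = []
        for w, c in zip(pop, costs):
            # Step 2: fitter weeds (lower cost) produce more seeds.
            ratio = 1.0 if worst == best else (worst - c) / (worst - best)
            n_seeds = int(np.floor(seeds_min + ratio * (seeds_max - seeds_min)))
            # Step 3: seeds land near the parent (normal dispersal).
            for _ in range(n_seeds):
                child = w + sigma * rng.standard_normal(dim)
                offspring.append(np.clip(child, low, high))

        if offspring:
            offspring = np.array(offspring)
            off_costs = np.array([cost(w) for w in offspring])
            pop = np.vstack([pop, offspring])
            costs = np.concatenate([costs, off_costs])

        # Step 4: competitive exclusion keeps only the pop_size best weeds.
        order = np.argsort(costs)[:pop_size]
        pop, costs = pop[order], costs[order]

    return pop[0], float(costs[0])
```

Because parents compete with their own seeds in step 4, the best cost in this sketch never gets worse from one iteration to the next.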


#### *2.3. Quality Assessment Criteria*

Evaluation of the AI model was performed using statistical measurements such as mean absolute error (MAE), coefficient of determination (R<sup>2</sup>), and root mean square error (RMSE). These criteria are popular for quantifying the performance of AI algorithms [76,77]. More specifically, RMSE is the square root of the mean squared difference between actual and estimated values, whereas MAE is the mean magnitude of the errors. R<sup>2</sup> evaluates the correlation between actual and estimated values [78–80]. Quantitatively, lower RMSE and MAE indicate better model performance, whereas a higher R<sup>2</sup> indicates better performance [81,82]. MAE, RMSE, and R<sup>2</sup> are expressed as follows [83,84]:

*Materials* **2020**, *13*, x FOR PEER REVIEW 5 of 25

$$\text{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| a_i - \overline{a}_i \right| \tag{1}$$

$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(a_i - \overline{a}_i\right)^2} \tag{2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left(a_i - \overline{a}_i\right)^2}{\sum_{i=1}^{N} \left(a_i - \overline{a}\right)^2} \tag{3}$$

where *a<sub>i</sub>* is the actual output, *ā<sub>i</sub>* is the predicted output, *ā* is the mean of the *a<sub>i</sub>*, and *N* is the number of samples used.
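The three criteria can be computed in a few lines. The sketch below assumes NumPy arrays of actual and predicted values; the function names `mae`, `rmse`, and `r2` are illustrative and do not come from the original study.

```python
import numpy as np

def mae(actual, predicted):
    """Mean magnitude of the errors."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

def rmse(actual, predicted):
    """Square root of the mean squared error."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Note that R<sup>2</sup> defined this way can become negative when the predictions are worse than simply using the mean of the actual values.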

#### *2.4. Data Used and Selection of Variables*

In this study, a total of 99 compression tests of rectangular CFST columns (Figure 1) were extracted from the available literature: Bridge [85], Du et al. [86], Du et al. [87], Ghannam et al. [88], Han [89], Han & Yang [90], Han & Yao [91], Lin [92], Schneider [93], Shakir-Khalil & Mouli [94], and Shakir-Khalil & Zeghiche [95]. Information of the database is summarized in Table 1, including the number of data and the percentage of proportion, whereas Table 2 presents the initial statistical analysis of the corresponding database.

**Figure 1.** Schematic diagram of the compression test for concrete filled steel tubes (CFSTs): (**a**) front view; (**b**) cross-section view of the sample.

The experimental tests were carried out considering the following steps: design, processing of steel tube, production of concrete, curing of specimens, and loading measurement [15,86]. As proposed by Sarir et al. [96] and Ren et al. [15] in investigating CFST columns, initial geometric imperfections as well as residual stress exhibited a negligible effect on the behavior of columns under axial loading. Consequently, input variables affecting the axial capacity of rectangular CFST are from two main groups: geometry of columns and mechanical properties of constituent materials. Therefore, six independent variables were selected as inputs of the problem, namely depth of cross section (H), width of cross section (W), thickness of steel tube (t), length of column (L), yield stress of steel (f<sub>y</sub>), and compressive strength of concrete (f<sub>c</sub>'). It is seen in Table 2 of the initial statistical analysis that all input variables cover a wide range of values. More precisely, H varies from 90 to 360 mm with an average value of 163 mm and a coefficient of variation of 32%. W ranges from 60 to 240 mm with an average value of 111 mm and a coefficient of variation of 32%. t ranges from 0.7 to 10 mm with an average value of 4 mm and a coefficient of variation of 48%. L varies from 100 to 3050 mm with an average value of 869 mm and a coefficient of variation of 89%. f<sub>y</sub> ranges from 194 to 515 MPa with an average value of 329 MPa and a coefficient of variation of 24%. f<sub>c</sub>' varies from 8 to 47 MPa with an average value of 31 MPa and a coefficient of variation of 39%.

It should be pointed out that the steel tube of 43 specimens was cold-formed, whereas a welded built-up tube was used in the other 56 configurations. In terms of failure modality, local outward buckling failure of the external steel was observed in all specimens, as shown in Figure 2a. This is the same as that observed in other investigations such as Han and Yao [91], Lyu et al. [97], Ding et al. [98], and Yan et al. [99]. Depending on the dimensions of the cross section, the locations of the outward folds of the steel tube are not the same. Such local buckling of the steel tube occurred mostly at the ends or in the center along the axis of the specimens, as seen in Figure 2a. In addition to outward buckling failure, fracture at the welding seam also occurred in welded specimens, as shown in Figure 2b. Such tensile fracture is the result of excessive expansion of the concrete core [99]. However, the tensile fracture of the steel tube generally occurred after the peak load [98]. Last, but not least, the concrete core was damaged in most specimens following a shear failure mode, as shown in Figure 2c [97,98]. Besides, the influence of temperature on the failure modality of stub CFST structural members is discussed in Yan et al. [99] (low temperature) and Lyu et al. [97] (high temperature). Finally, Angelo et al. [100] and Kulkarni et al. [101] tested and discussed the failure of rectangular CFST structural members at junctions with wide beams for earthquake engineering applications.

**Figure 2.** Failure of rectangular CFST specimens: (**a**) local outward buckling of steel tube (reproduced with permission from Han [89]), (**b**) tensile fracture at the welding seam of steel tube (reproduced with permission from Ding et al. [98]), (**c**) damage of concrete core (reproduced with permission from Lyu et al. [97]).

It is worth mentioning that only rectangular CFST columns (i.e., depth/width ratio greater than 1) were collected for investigation. As indicated in Table 2, the depth/width ratio ranges from 1 to 2, allowing exploration of the axial failure of CFST around the weak axis. In addition, as the depth/width ratio differs from 1, the stress applied by the confined concrete to the steel wall is not the same along the weak and strong axes, while the thickness of the steel tube is constant. Consequently, considering only rectangular CFST columns could strongly reveal the influence of both the structural geometry and the mechanical properties of the constituent materials.


**Table 1.** Information of the database used in this study.

**Table 2.** Initial statistical analysis of database.


The dataset was randomly divided into two sub-datasets: a training part (60%) and a testing part (40%). All data were scaled into the range [0,1] in order to reduce numerical biases when processed by the AI algorithms, as recommended by various studies in the literature [102–104]. The scaling between raw and scaled data is expressed by Equation (4) [105–107]:

$$x^{\text{scaled}} = \frac{x^{\text{raw}} - \beta}{\alpha - \beta} \tag{4}$$

where α and β are the maximum and minimum values of the considered variable *x*, respectively. It should be noticed that data can be converted back from the scaled space to the raw one by inverting Equation (4). Besides, a correlation analysis between the input and output variables is performed and plotted in Figure 3.
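As a minimal sketch of this preprocessing, the helpers below apply Equation (4), its inverse, and a random 60/40 split. The function names, the random seed, and the rounding of the training share (59 training and 40 testing samples out of 99) are assumptions made for illustration; the original study does not state how the split was rounded.

```python
import numpy as np

def minmax_scale(x_raw, beta, alpha):
    """Equation (4): map raw values to [0, 1]; beta = minimum, alpha = maximum."""
    return (x_raw - beta) / (alpha - beta)

def minmax_unscale(x_scaled, beta, alpha):
    """Inverse of Equation (4): map scaled values back to raw units."""
    return x_scaled * (alpha - beta) + beta

def split_indices(n_samples, train_fraction=0.6, seed=0):
    """Randomly partition sample indices into training and testing sets."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return perm[:n_train], perm[n_train:]

# Example with the reported range of the section depth H (90 to 360 mm).
h_scaled = minmax_scale(np.array([90.0, 163.0, 360.0]), beta=90.0, alpha=360.0)
h_back = minmax_unscale(h_scaled, beta=90.0, alpha=360.0)
train_idx, test_idx = split_indices(99)
```

In practice the minimum and maximum would be stored per variable so that predictions of the axial capacity can be mapped back to physical units with `minmax_unscale`.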

Figure 3 was generated in order to explore the linear statistical correlation between variables in the database. A 7 × 7 matrix was generated, in which the upper triangular part indicates the value of the correlation coefficient, whereas the lower triangular part shows the scatter plot between two associated variables. The diagonal of the matrix indicates the name of the variable (the correlation coefficient of a variable with itself being equal to 1). As an example of how to read the figure, the correlation coefficient between H and W is indicated as 0.86, whereas the corresponding scatter plot between H and W is shown on the left side of W (row 2, column 1). A high and positive value of statistical correlation was obtained in this case, confirmed by most of the data points being located around the diagonal in the scatter plot.

It can be seen that no direct correlation was observed between each input and the output (P<sub>u</sub>). The maximum value of the Pearson correlation coefficient (r) with P<sub>u</sub> was calculated as 0.78 (for variable t), followed by 0.60 (for variable f<sub>y</sub>), 0.39 (for variable W), 0.30 (for variable H), 0.27 (for variable f<sub>c</sub>'), and 0.18 (for variable L). Besides, the correlation between H and W was the highest (r = 0.86).
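For readers who wish to reproduce this kind of analysis, the sketch below computes the Pearson coefficient for one pair of variables and the full matrix for a table whose columns are the variables. The helper names `pearson_r` and `correlation_matrix` are illustrative, and the random array is only a placeholder with the same shape as the database (99 samples, 7 variables), not the actual data.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

def correlation_matrix(data):
    """Pairwise Pearson coefficients; columns of `data` are variables."""
    return np.corrcoef(np.asarray(data, dtype=float), rowvar=False)

# Placeholder data: 99 rows, 7 columns standing in for H, W, t, L, fy, fc', Pu.
placeholder = np.random.default_rng(0).random((99, 7))
corr = correlation_matrix(placeholder)  # 7 x 7, ones on the diagonal
```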

**Figure 3.** Correlation analysis between the depth of cross section (H), width of cross section (W), thickness of steel tube (t), length of column (L), yield stress of steel (f<sub>y</sub>), concrete compressive strength (f<sub>c</sub>'), and axial capacity (P<sub>u</sub>).

#### **3. Results and Discussion**

#### *3.1. Optimization of Weight Parameters of FNN using the IWO Technique*

In this section, the optimization of the weight parameters of the FNN using the IWO algorithm is presented. It is worth noticing that the architecture of the FNN model is very important. Depending on the problem of interest, the prediction results could vary significantly from one architecture to another [96,107,108]. As the numbers of inputs and outputs are fixed, the undetermined parameters of the architecture are the number of hidden layers and the number of neurons in each hidden layer [109]. As shown by many investigations in the literature, an FNN model with only one hidden layer can be sufficient for successfully capturing complex nonlinear relationships between inputs and outputs. For instance, Mohamad et al. [110] used a one hidden layer architecture for predicting ripping production, as did Singh et al. [111] for predicting cadmium removal. In civil engineering applications, prediction models involving one hidden layer have also been widely applied, for instance, by Gordan et al. [112] for earthquake slope stability and by Sarir et al. [96] for the bearing capacity of circular concrete-filled steel tube columns. Therefore, the one hidden layer FNN model was adopted in this work, which also reduces cost, processing time, and equipment requirements. In addition, the number of neurons in the hidden layer was recommended to be equal to the sum of the number of inputs and outputs [109,113,114]. Consequently, the FNN model has one hidden layer with seven neurons. The activation function for the hidden layer was chosen as a sigmoid function, whereas the activation function for the output layer was a linear function [115]. The cost function was chosen as the mean square error function [116]. Finally, Table 3 summarizes the information of the FNN model.
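The architecture described here (six inputs, seven sigmoid hidden neurons, one linear output, mean square error cost) can be sketched as a function of a single flat parameter vector, which is the form an optimizer such as IWO would act on. The ordering of weights and biases inside `theta` and all function names are assumptions made for illustration; only the layer sizes, activations, and cost come from the text. The count of 6 × 7 + 7 + 7 × 1 + 1 = 57 parameters matches the 57-dimensional optimization problem mentioned in this section.

```python
import numpy as np

N_IN, N_HID, N_OUT = 6, 7, 1
# 42 input-to-hidden weights + 7 hidden biases + 7 hidden-to-output weights + 1 output bias.
N_PARAMS = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT

def unpack(theta):
    """Split a flat parameter vector into weight matrices and bias vectors."""
    theta = np.asarray(theta, dtype=float)
    i = 0
    w1 = theta[i:i + N_IN * N_HID].reshape(N_IN, N_HID)
    i += N_IN * N_HID
    b1 = theta[i:i + N_HID]
    i += N_HID
    w2 = theta[i:i + N_HID * N_OUT].reshape(N_HID, N_OUT)
    i += N_HID * N_OUT
    b2 = theta[i:i + N_OUT]
    return w1, b1, w2, b2

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fnn_predict(theta, x):
    """Forward pass: sigmoid hidden layer followed by a linear output layer."""
    w1, b1, w2, b2 = unpack(theta)
    hidden = sigmoid(x @ w1 + b1)
    return (hidden @ w2 + b2).ravel()

def mse_cost(theta, x, y):
    """Mean square error between predictions and targets."""
    return float(np.mean((fnn_predict(theta, x) - y) ** 2))
```

A population-based optimizer would then minimize `lambda theta: mse_cost(theta, x_train, y_train)` over vectors of length `N_PARAMS`, where `x_train` and `y_train` stand for the scaled training inputs and targets.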

As revealed in the literature, a key aspect of using evolutionary algorithms for optimizing AI models is to study the relationship between population size and problem dimensionality [117–120]. In many other evolutionary algorithms such as differential evolution, the population size is recommended to be 7–10 times the number of inputs [121,122]. In this study, the population size of the IWO technique was chosen as 50. Other parameters include the variance reduction exponent, chosen as 2; the initial value of standard deviation, chosen as 0.01; the final value of standard deviation, chosen as 0.001; and the maximum number of iterations, chosen as 800. It is worth noticing that such parameter values are commonly employed for training AI models using the IWO algorithm, for instance, by Huang et al. [76] and Mishagi et al. [123]. It should also be noticed that a large population size is not necessarily useful in evolutionary algorithms and can affect the optimization results [124]. Information of the IWO algorithm is presented in Table 3.
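As a hedged illustration of how these settings interact, the standard IWO formulation of Mehrabian and Lucas [68] shrinks the dispersal standard deviation nonlinearly with the iteration counter. Assuming the present study follows that schedule (the symbols $k$, $k_{\max}$, $n$, $\sigma_{\text{init}}$, and $\sigma_{\text{final}}$ are notation introduced here, not taken from the source), the values listed above give:

$$\sigma_k = \left(\frac{k_{\max} - k}{k_{\max}}\right)^{n} \left(\sigma_{\text{init}} - \sigma_{\text{final}}\right) + \sigma_{\text{final}}, \qquad n = 2,\ k_{\max} = 800,\ \sigma_{\text{init}} = 0.01,\ \sigma_{\text{final}} = 0.001$$

$$\begin{aligned}
\sigma_{0} &= 1^{2} \times 0.009 + 0.001 = 0.01, \\
\sigma_{400} &= 0.5^{2} \times 0.009 + 0.001 = 0.00325, \\
\sigma_{800} &= 0^{2} \times 0.009 + 0.001 = 0.001.
\end{aligned}$$

Under this assumption, the search step has already fallen to about one third of its initial value halfway through the run, so most of the later iterations refine the solution locally rather than explore new regions.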


**Table 3.** Values and description of feedforward neural network (FNN) and invasive weed optimization (IWO) parameters in this study.

Figure 4a presents the evolution of the 42 weight parameters of the hidden layer, whereas Figure 4b shows the evolution of the 7 weight parameters of the output layer. During the first 300 iterations, fluctuations were observed for all weight parameters, as the IWO algorithm imitated the colonizing behavior of weeds. After about 500–600 iterations, the weight parameters of the 57-dimensional optimization problem stabilized. Consequently, at least 700–800 iterations are needed in order to ensure the stabilization of the process.

**Figure 4.** Evolution of weight parameters over 800 iterations: (**a**) weight parameters of input layer (42 parameters); (**b**) weight parameters of hidden layer (7 parameters).

Weight parameters at iteration 800 were extracted to construct the final FNN–IWO model (a combination of FNN and IWO). This model was then used as a numerical prediction function to investigate parametrically how the quality assessment criteria deviate as a function of the weight parameters. The parametric study helps to verify whether the result provided by the IWO was unique, that is, whether the IWO reached the global optimum of the problem. For illustration purposes, only the first three weight parameters were plotted. Figure 5a presents the evolution of RMSE while varying weight parameters N°1 and N°2 from their lowest to highest values. Similarly, Figure 5b presents the evolution of RMSE while varying weight parameters N°1 and N°3 from their lowest to highest values. It is seen from Figure 5a,b that the global optimum of the two RMSE surfaces matched the final set of weight parameters provided by the IWO algorithm. This confirmed that the IWO technique located the global optimum of the optimization problem, thus providing the final FNN–IWO model.
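The parametric check described above can be sketched as a grid scan: two weights are varied while all others are held at their optimized values, and the RMSE is recorded at each grid point. The code below is a minimal illustration under stated assumptions; `toy_predict`, the weight values, and the data are hypothetical stand-ins for the trained FNN–IWO model and the CFST database.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rmse_surface(predict, weights, i, j, grid_i, grid_j, X, y):
    """RMSE on a grid over weights[i] and weights[j], other weights held fixed."""
    surface = []
    for wi in grid_i:
        row = []
        for wj in grid_j:
            trial = list(weights)
            trial[i], trial[j] = wi, wj
            row.append(rmse(y, [predict(trial, x) for x in X]))
        surface.append(row)
    return surface


def surface_minimum(surface):
    """Return the (row, column) index of the smallest RMSE on the grid."""
    cells = ((a, b) for a in range(len(surface)) for b in range(len(surface[a])))
    return min(cells, key=lambda ab: surface[ab[0]][ab[1]])


def toy_predict(w, x):
    """Hypothetical stand-in model: a linear function with three weights."""
    return w[0] * x[0] + w[1] * x[1] + w[2]


optimized = [1.0, -0.5, 0.2]  # hypothetical optimizer output
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [toy_predict(optimized, x) for x in X]
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]

surface = rmse_surface(toy_predict, optimized, 0, 1, grid, grid, X, y)
a, b = surface_minimum(surface)
print("minimum at weights", grid[a], grid[b], "with RMSE", surface[a][b])
```

Such a scan only inspects two-dimensional slices through the optimized point, so agreement between the grid minimum and the optimized weights supports, but does not by itself prove, global optimality in the full weight space.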

**Figure 5.** Verification of global optimum provided by the invasive weed optimization (IWO). The surfaces of root mean square error (RMSE) show a unique optimal solution, which minimizes the value of RMSE: (**a**) between weight parameters N°1 and N°2, (**b**) between weight parameters N°1 and N°3.

Figure 6a–c present the evolution of RMSE, MAE, and R<sup>2</sup> during the optimization of the FNN weight parameters, for both training and testing data. It is seen that, as the optimization proceeded on the training data, good values of RMSE, MAE, and R<sup>2</sup> were also obtained for the testing data. It is worth noting that the testing data were entirely new to the model when applied. This observation shows that no overfitting occurred during the training phase (overfitting would appear as the performance indicators of the testing data deteriorating). The efficiency and robustness of the IWO technique are thus confirmed.
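The overfitting criterion mentioned above (testing indicators deteriorating while training indicators keep improving) can be checked on recorded error curves. The following sketch is one possible formalization; the function name and the `patience` and `rise` thresholds are assumptions chosen for illustration, not values from this study.

```python
def overfitting_onset(train_err, test_err, patience=20, rise=0.05):
    """Return the iteration of the best testing error if, at least `patience`
    iterations later, the testing error exceeds that best value by more than
    the relative margin `rise` while the training error is lower; else None."""
    best_k = 0
    for k in range(1, len(test_err)):
        if test_err[k] < test_err[best_k]:
            best_k = k  # testing error still improving
            continue
        test_worse = test_err[k] > (1 + rise) * test_err[best_k]
        train_better = train_err[k] < train_err[best_k]
        if k - best_k >= patience and test_worse and train_better:
            return best_k
    return None


# Hypothetical error curves over 100 iterations.
train_curve = [1.0 / (k + 1) for k in range(100)]
healthy_test = [max(1.0 / (k + 1), 0.05) for k in range(100)]    # decreases, then plateaus
overfit_test = [abs(k - 30) / 30 + 0.1 for k in range(100)]      # rises after iteration 30

print(overfitting_onset(train_curve, healthy_test))
print(overfitting_onset(train_curve, overfit_test))
```

A testing curve that decreases and then plateaus, as described for Figure 6, is not flagged by this check, whereas a curve that turns upward is.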


**Figure 6.** Evaluation of the performance indicators during optimization: (**a**) RMSE, (**b**) mean absolute error (MAE), and (**c**) R<sup>2</sup>, for training and testing data, respectively.

#### *3.2. Influence of the Training Set Size*

In this section, the influence of training set size (in %) on the prediction results is presented. The training dataset was varied from 10% to 90% of the total data (in steps of 10%). Figure 7 illustrates the influence of training set size, with respect to R<sup>2</sup> (Figure 7a), RMSE (Figure 7b), MAE (Figure 7c), ErrorStD (Figure 7d), and slope (Figure 7e). All relevant values are also highlighted in Table 4.

As seen in Figure 7a,e for R<sup>2</sup> and slope, the performance of the prediction model increased progressively as the training set size increased from 10% to 90%. For instance, for the testing part, R<sup>2</sup> = 0.387 when the training set size was 10%, increasing to 0.987 when the training set size was 90%. The same trend was observed in Figure 7b–d for RMSE, MAE, and ErrorStD, respectively. Moreover, the performance of the prediction model for both training and testing parts became stable from a training set size of 60% (Figure 7a). This observation indicates that no overfitting occurred when the training set size exceeded a high percentage, for instance, 80%. This point shows that the prediction model is robust, exhibiting a strong capability in tracking relevant information in the testing part even when it is small. Finally, yet importantly, the prediction model is promising for cases in which more data are available.
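The procedure of varying the training share from 10% to 90% can be sketched as a simple loop over split sizes. The example below is a hedged illustration only: a one-variable least-squares line stands in for the FNN–IWO model, the data are synthetic, and R<sup>2</sup> is computed as the coefficient of determination, which is an assumption about the definition used in this study.

```python
import random


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept (stand-in for the real model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def sweep_training_size(xs, ys, percentages=range(10, 100, 10), seed=0):
    """Testing R2 for each training share (in %), using one fixed random shuffle."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    scores = {}
    for pct in percentages:
        n_train = len(idx) * pct // 100
        train, test = idx[:n_train], idx[n_train:]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        predictions = [slope * xs[i] + intercept for i in test]
        scores[pct] = r2([ys[i] for i in test], predictions)
    return scores


# Synthetic data: a noisy straight line with 100 samples (hypothetical).
noise = random.Random(1)
xs = [k / 10 for k in range(100)]
ys = [2.0 * x + 1.0 + noise.gauss(0.0, 0.1) for x in xs]

scores = sweep_training_size(xs, ys)
for pct, score in scores.items():
    print(pct, round(score, 3))
```

Using a single shuffle for all split sizes means that each larger training set contains the smaller ones, so differences between scores reflect added data rather than a different random draw.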

**Figure 7.** Influence of training set size with respect to (**a**) R<sup>2</sup>, (**b**) RMSE, (**c**) MAE, (**d**) ErrorStD, and (**e**) slope.


