*Article* **Application of Soft Computing Techniques for Predicting Thermal Conductivity of Rocks**

**Masoud Samaei 1, Timur Massalow 2, Ali Abdolhosseinzadeh 1, Saffet Yagiz 2,\* and Mohanad Muayad Sabri Sabri <sup>3</sup>**


**\*** Correspondence: saffet.yagiz@nu.edu.kz; Tel.: +7-(7172)-705724

**Abstract:** Due to the different challenges in rock sampling and in measuring their thermal conductivity (TC) in the field and laboratory, the determination of the TC of rocks using non-invasive methods is in demand in engineering projects. The relationship between TC and non-destructive tests has not been well-established. An investigation of the most important variables affecting the TC values for rocks was conducted in this study. Currently, the black-boxed models for TC prediction are being replaced with artificial intelligence-based models, with mathematical equations to fill the gap caused by the lack of a tangible model for future studies and developments. In this regard, two models were developed based on which gene expression programming (GEP) algorithms and non-linear multivariable regressions (NLMR) were utilized. When comparing the performances of the proposed models to that of other previously published models, it was revealed that the GEP and NLMR models were able to produce more accurate predictions than other models were. Moreover, the high value of R-squared (equals 0.95) for the GEP model confirmed its superiority.

**Keywords:** thermal conductivity; geothermal systems; gene expression programming (GEP); non-linear multivariable regression (NLMR); P-wave; porosity

#### **1. Introduction**

Due to the increase in energy prices and energy demand, energy conservation and management play a significant role in human lives and governmental policies. Heat as a form of energy is transferred more rapidly in solid mediums than in gas and liquid [1]. Therefore, knowing the ability of solid materials to transfer heat can aid in conserving energy more efficiently. Three main indices were introduced for the evaluation of the thermal behavior of solid materials [2,3] as the following:


$$\kappa = \frac{\lambda}{\rho. \, \text{C}\_{\text{P}}} \tag{1}$$

The determination of rocks' TC is of great importance in geothermal, environmental, mining, and civil engineering applications [1]. This parameter is critical for the management of geothermal reservoirs, designing power-saving walls and powerhouses, CO2 sequestration, and underground waste disposal wells [2–6]. It can be measured using heat input and the temperature gradient of the host rock [7].

The physical and mechanical properties of rocks have a direct relation with TC as the mineral content [8–10], bulk density (*ρ*) [3,11], porosity (∅) [2,12], P-wave (pressurewave) velocity (*Vp*) [2,4,13], uniaxial compression strength (UCS) [2,14], saturating fluid

**Citation:** Samaei, M.; Massalow, T.; Abdolhosseinzadeh, A.; Yagiz, S.; Sabri, M.M.S. Application of Soft Computing Techniques for Predicting Thermal Conductivity of Rocks. *Appl. Sci.* **2022**, *12*, 9187. https://doi.org/ 10.3390/app12189187

Academic Editor: Ricardo Castedo

Received: 25 July 2022 Accepted: 30 August 2022 Published: 14 September 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

characteristic [15], quality and geometry of the contact between the grains [16–18], discontinuities [19], temperature [20–23], and atmospheric pressure [24].

In situ measurements of TC need specialized equipment, which is very expensive and can be time-consuming [25]. However, due to the scale effect, even if the effects of the stress level, pore fluid pressure, temperature, and permeability of rocks can be considered, the laboratory values may substantially differ from those of the in situ measurements [24,26,27]. Therefore, in recent years, several researchers proposed methods to estimate the in situ TC by measuring a rock's effective properties (e.g., [2,28–31]). Measuring the uniaxial compressive strength (UCS), density, porosity, and P-wave velocity of a rock is not as difficult as measuring in situ TC. These properties can also be obtained by laboratory or indirect methods, such as using the Schmidt hammer for the determination of the UCS [24]. In this case, the following items can be found without performing TC core experiments at several depths [30]:


Özkahraman, Selver, and Isık [2] conducted tests on rock samples that are mostly used in building constructions. They investigated the relationship between TC and rock properties, specifically for the P-wave velocity, UCS, density, and porosity. They proposed equations for the prediction of the TC regarding each of the four above-mentioned parameters, but not by simultaneously considering all parameters (multivariable equation). They found that TC has a direct relation to P-wave, UCS, and density, whereas it has an inverse relation to porosity. In another study, Ya¸sar, Erdo ˘gan, and Güneyli [4] performed laboratory tests on 12 different rock samples. They found that the type of a rock's mineral composition could also affect its TC to a high degree.

In recent years, modeling using machine learning (ML) methods to better fit with actual measurements has become more popular with scientists than other modeling methods [32–37]. ML methods are applied to different issues in geotechnical engineering [38–41]. A methodology combining physical modeling and ML was proposed by Assouline et al. [42] to estimate the apparent ground thermal diffusivity at the national scale. In this methodology, a model is built with random forests (RF) using the output values from previous diffusivity estimations, as well as geological, elevation, and temperature features. The model, which exhibited an acceptable test error of 16.5%, is then used to estimate the apparent diffusivity across Switzerland. Singh, Sinha, and Singh [3] applied two ML techniques, including artificial neural networks (ANNs) and adaptive neuro-fuzzy inference (ANFIS), to predict TC by some series of datasets on rock properties (P-wave velocity, UCS, density, and porosity). The proposed models had a strong correlation coefficient between the measured and the predicted TC. Sargam et al. [43] used a multilayer perceptron (MLP) model to study concrete's thermal conductivity. A high degree of prediction accuracy was achieved by MLP. In this regard, Khandelwal [24] conducted an analysis using a feed-forward backpropagation neural network and found that the ANNs presented more accurate results than other techniques did. To predict the TC of Jalore granite, Verma et al. [44] used artificial neural networks (ANNs), linear regression, support vector regressors (SVRs), and decision tree regressors (DTRs). According to their analysis, TC was strongly correlated with density, S-wave velocity, and P-wave velocity. Moreover, it was validated by different AI tools that thermal conductivity is highly sensitive to rock's physical properties [44]. In the study done by Wang et al. [45], TC was analyzed using a convolutional neural network (CNN) and datasets of temperature fields from lattice Boltzmann method (LBM) simulations based on three-dimensional sphere-packed porous media. CNN and LBM acquired relative errors for the effective thermal conductivity of sphere-packed porous media (0.7–22.8%) and irregular porous media (3.1–16.0%). According to them, CNN is promising for the prediction of the heat transport properties of porous media with variable boundary conditions and different morphologies. For predicting geothermal gradient, thermal conductivity, the heat productivity of rocks, and the crustal/mantle heat flow, He et al. [46] used generalized linear models (GLM), deep neural networks (DNN), and gradient-boosted regression trees (GBRT). Their

results showed that the DNN model, with a number of neurons multiplied by the features, performed better than the other models. The average relative prediction errors for SVM and DNN were 13.3 and 12.7%, respectively. In a hybrid SVM–DNN approach, the average relative prediction error decreased to 12.2%. Hajihassani et al. [47] developed an ANFIS and a linear multivariable regression (LMR) model using 44 datasets that were collected from the literature. Kang et al. [48] measured the thermal conductivity of various rocks in the Songliao Basin (China). The correlation between porosity, moisture content, density, P-wave velocity, and thermal conductivity were investigated. Seven prediction models were developed using extreme learning machine (ELM), support vector regression (SVR), and backpropagation neural network (BPNN) algorithms. The results demonstrated that the ELM-based model had better performance, speed, and accuracy for predicting rock thermal conductivity.

The above-developed computer-aided models (ANNs and ANFIS) are black-box methods. Some pitfalls remained in these studies [49,50]:


In order to overcome the disadvantages discussed above and bridge the gap between ANNs and conventional experimental models, we developed mathematical equations/models using computer programs that have high confidence compared with other studies conducted in a similar field. Using multiple related inputs in this study, we developed functional relationships that can predict a specific output. Two models were proposed using a gene expression programming (GEP) algorithm and a non-linear multivariable regression (NLMR). As shown in the numerical experiments by Ferreira [53], the GEP approach can be seen as an efficient alternative to traditional machine learning approaches. The developed models were validated using statistical indices, and they were compared with the results of previously developed models.

#### **2. Establishment of Dataset**

In rock mechanics, each test induces costs, and a long period is needed for each phase of the experimental process. Therefore, it is crucial to reach the desired accuracy by testing a minimum number of samples [54]. Yamaguchi [55] sought to analyze this problem using a statistical technique called the "decision of the sample number". He tested three different kinds of igneous rocks to evaluate their compressive strength and found that a 90% confidence level could be acquired by using only ten samples. This study showed that, due to the high similarity in their physical properties, the mechanical properties of each primary type of rock (i.e., sedimentary, metamorphic, and igneous) could be similar to each other. Therefore, testing a small number of samples would be sufficient to obtain results with a high confidence level [55].

In our study, 50 datasets, including TC, UCS, density, porosity, and P-wave velocity, were taken from the literature [2–4,24]. The type of these samples was not reported, but most of them were sedimentary rocks (i.e., by referring to their P-wave velocity and porosity values). The basic descriptive statistics of the datasets are presented in Table 1.


**Table 1.** Descriptive statistical distribution of datasets.

#### *2.1. Input Parameters for Models*

During the preliminary steps of the analysis, using polynomial and power functions, a series of single-variable regressions were conducted to obtain more details about the relationship between TC and the independent rock properties. Table 2 summarizes the results of the single-variable regressions. As seen, the P-wave velocity and density had the highest and the lowest effects on the TC, respectively.

**Table 2.** Correlation coefficients of the simple regression models between the TC and the independent variables.


To avoid redundancy in the future model generation, the relationships between the independent variables were investigated. As shown in Table 3, there were no redundancies between the independent parameters, and the relationships between porosity and density as well as between UCS and P-wave velocity were significant.

**Table 3.** Correlation coefficients of the relationships between the independent variables.


#### 2.1.1. P-Wave Velocity

Interpreting the relationship between P-wave velocity and TC requires considering the parameters that influence P-wave velocity. Among the most-cited variables that affect P-wave velocity, mineral composition, lithology, porosity, and confining stress are cited as the main factors. Gegenhuber and Schoen [56] studied the relationship between TC and the P-wave velocity for different rock types. They found that TC and P-wave velocity had a positive and robust correlation coefficient. The mineral composition was the most effective parameter influencing their relationship. Figure 1 shows the relationship between TC and P-wave and the mineral composition of rocks. There was no information found about the mineral composition of samples in the current study. As a result, the curve fitted to TC versus P-wave velocity did not include this factor (Figure 2).The effect of lithology on seismic wave velocity in rocks was well-established by Domenico [57]. According to him, a higher seismic velocity can indicate higher quartz content, which can result in higher UCS. In addition, more significant confining stress can close microcracks in samples and improve P-wave transmission. As porosity decreases, P-wave and TC values increase. Table 3 also demonstrates this phenomenon by showing that P-wave velocity has an inverse relationship with porosity. As a result, P-wave velocity can inherently represent the most influential parameter for TC as it has a direct relationship with TC.

Furthermore, according to Freund [58], the type of pore fluid influences P-waves in porous rocks. In our study, the samples were completely dry. However, this method can be used in future studies to evaluate TC in saturated conditions.

**Figure 1.** The points show the measured values, and the curve shows the relationship between TC and P-wave velocity. The grey arrow indicates the increase of the porosity.

**Figure 2.** Measured values and relationship between TC and P-wave velocity.

#### 2.1.2. Porosity of Rocks

As shown in Table 2, the porosity had an inverse and significant relationship with TC, which represented the more effective parameter for the TC of sedimentary rocks [59]. The void ratio of sedimentary rocks (from 1% up to 80%) could be higher than that of the volcanic and metamorphic ones (at most 1%). The wide range of substances that could fill the void spaces, such as air, water, chemical sediments, or organic matters, was the other parameter controlling TC. However, the reported datasets of the TC rocks, which were used in our study, were obtained in the laboratory, considering dry conditions and the ISRM standards.

#### 2.1.3. Density of Rocks

Birch [60] obtained a direct relationship between P-wave and density. Horai [61] examined 166 rock-forming minerals to find an empirical equation for the TC prediction. They found an analogous relationship between increasing TC and increasing the density, as well as increasing the P-wave velocity. In our study, in addition to the results obtained by Horai (1971), another important issue was observed. Figure 3 shows the TC–density plot for silica minerals by Horai [61], whereas Figures 2 and 4 present the TC–density and TC–P-wave plots obtained in this study. The trend line of these plots reached a density of about 2500 kg/m3, showing that there was a low gradient, while it increased for greater values of density. As observed in Figure 5, the density of rocks increased with the porosity decrease, while both of them had an extreme effect on the TC and P-wave velocity.

**Figure 3.** TC versus density for silica minerals.

**Figure 4.** Measured values of and the relationship between TC and density.

**Figure 5.** Measured values of and the relationship between porosity and density.

2.1.4. Uniaxial Compressive Strength

The strength of rocks is affected by their mineralogy, grain size, and porosity. Rocks with larger grain sizes and considerable porosity have lower UCS than that of other dense rocks [2]. Furthermore, an increase in quartz content will increase the strength of rocks as well as the TC (Clauser and Huenges [59] and Figure 6). Hence, the greater the UCS is, the larger the rocks' TC is (Figure 7). Sargam, Wang, and Cho [43] found that concrete mixtures with higher quartz content had higher TC and compressive strength.

Different groups of researchers (Pimienta et al. [62]; Esteban, Pimienta, Sarout, Delle, Piane, Haffen, Geraud, and Timms [30]; and El Sayed and El Sayed [12]) proposed models for TC prediction using only the P-wave velocity and porosity. Although their models had an acceptable degree of accuracy, the discussions on the role of the rock properties in TC prediction indicated that these properties were not enough. Other rock properties can also be interesting and give accurate predictions of TC (rock type, TC, UCS, density, saturating fluid characteristic, quality, and geometry of the contact between the grains).

**Figure 6.** Thermal conductivity of basic rock-forming minerals and their compositional relationship with rocks for volcanic and sedimentary rocks.

**Figure 7.** Measured values and relationship between TC and UCS.

#### **3. Gene Expression Programming (GEP)**

The genetic algorithm (GA) introduced by J. Holland [63] as a new stochastic optimization technique was utilized in this study, in which Darwin's theory of "survival of the fittest" was also used. This algorithm attempts to use genetic operators and a fitness function in each round of processing to optimize and classify the set of parameters that are the best solutions for the problem. Whenever the output requirements (e.g., the required proportion between measured and predicted values) are met, the algorithm is stopped [64]. A newer developed version of GA is genetic programming (GP). It was first introduced by Koza [65]. GP finds a solution for problems using variable-length sets of parameters, including mathematical functions, algebraic operators (function sets), and numbers (terminal sets). Ferreira [53] introduced gene expression programming (GEP) as a newer version of GP in which individuals are characterized as linear strings. The five central units that constitute a GEP algorithm are terminal sets, function sets, fitness functions, operators, and stop conditions [66]. Moreover, expression trees (ETs) were used to demonstrate fixed-length solutions in tree shape structures.

The fixed-length chromosome is the most obvious difference between the GEP and GP. The genomes or chromosomes are linear, symbolic strings of one or more genes with a fixed length. The genes themselves are composed of primitives (mathematical functions or variables), which are all fixed-length strings.

As shown in Figure 8, in the initial step, GEP randomly generated a series of chromosome sets that were potential solutions for the problem. In the second step, a set of chromosomes was expressed as ET. Meanwhile, ETs are interpretable as mathematical equations. Afterward, a fitness function (which is responsible for calculating the errors in predictions) evaluated the fitness of each set of parameters. If the expected results were not met, the best solutions would be selected by a selector function, and the genetic operators would combine them to generate a better set of parameters. In the following paragraphs, we will describe the most common operators.

Mutation: Since it enables immediate changes to the program output, the mutation operator is the most crucial one in the GEP algorithm. To put it another way, it changes a terminal node into a functional node and vice versa. By choosing two distinct subtrees, switching them, and then choosing another subtree, two distinct subtrees from different chromosomes are swapped. Gene inversion: Using this operator, a specific set of genes in the chromosome's head is inverted. Transposition: With the help of this operator, a chromosome's chosen portion is moved to a different location. Among the different types

of transposition operators, there are the (1) insertion sequence transposition (IS), (2) root insertion sequence transposition (RIS), and (3) gene transposition.

**Figure 8.** Flowchart of the GEP algorithm employed in our study.

Finally, if the newly created set of parameters did not have the expected fitness, the process would be repeated to reach the stopping conditions [67].

#### **4. Results**

Two empirical equations were developed in our study to predict TC using the physicomechanical properties of rocks (i.e., P-wave, porosity, density, and UCS). The first model was developed using non-linear multivariable regression (NLMR), and another model was developed using gene expression programming (GEP). To avoid overfitting during the training stages, 65% of all datasets were randomly selected for training, and the rest were used for testing the developed models. The testing results of the developed models showed how much these models could be generalized for other datasets. Furthermore, statistical indices (i.e., the coefficient of determination (R2) and the root mean square error (RMSE)) were used to evaluate the robustness of the developed models for prediction. The R-square (Equation (2)) showed how reliable a model was for future forecasts, and the RMSE (Equation (3)) showed how much the standard deviation of the residuals was (prediction errors).

$$\mathbf{R}^2 = 1 - \frac{\sum \left( \mathbf{y\_{act}} - \mathbf{y\_{pre}} \right)^2}{\sum \left( \mathbf{y\_{act}} - \overline{\mathbf{y\_{act}}} \right)^2} \tag{2}$$

$$\text{RMSE} = \sqrt{\frac{\sum\_{i=1}^{n} \left(\mathbf{y}\_{\text{pre}} - \mathbf{y}\_{\text{act}}\right)^{2}}{n}} \tag{3}$$

where ypre is the predicted value, yact is the measured (actual) value, and yact is the average of the measured values.

#### *4.1. NLMR Model*

Unlike linear regression, the NLMR model could use a wide variety of mathematical functions to find the best fitting equation between the input and output parameters [54]. However, to avoid model complexity, only polynomial and power functions were used for the model development. Based on Table 2, the NLMR functions were generated by combining non-linear single variable equations for each parameter. An optimization algorithm was used to optimize the NLMR function to achieve the best TC prediction model. The NMLR model was developed and optimized by a genetic algorithm (GA) using MATLAB software [68]. GA's general procedures are similar to GEP's, which generate random coefficients for parameters, enhance them by their operators, and try to reach a higher degree of accuracy. In order to determine the most suitable setting for the GA run, a series of trial-and-error tests were conducted. As a result, with NLMR functions, the most accurate TC prediction model was found to be Equation (4). Table 4 shows the performance indices of this model.

$$\lambda = 0.122 \times V\_p^{-0.033} + 0.0013 \times V\_p - 0.024 \times \phi^{-0.18} - 0.0177 \times \rho^{-0.0213} - 0.1171 \times lICS^{0.13} + 0.02153 \tag{4}$$

**Table 4.** Resulting performance indices' values of the proposed NLMR model.


#### *4.2. GEP Model*

The GEP model was developed through the GenXPro Tools software. This software extracts the significant features of datasets, including a high number of variables, and finds relationships among them with high accuracy. Similar to the process used in Section 4.1, the same training and testing subsets were used for the GEP model's development. To develop a model for TC prediction by GEP, a simple equation in the form of *TC* = *f*(*P*-*wave*, *porosity*, *UCS*, *density*) was first proposed. To acquire the best setting for the GEP model generation in the initial step, function sets and fitness functions were then chosen from the study of Zare Naghadehi et al. [69]. Afterward, several trial-and-error procedures were carried out to obtain the best settings. Having utilized these settings, the GEP-developed models obtained minimal error percentages. Table 5 presents the GEP software settings for the model generation during this study. The procedure of GEP modeling was illustrated in Section 3.

**Table 5.** Parameters of the GEP model.


The prediction of the GEP performance models was evaluated by both R-square and RMSE. Several models were developed to find a better model with the lowest RMSE and highest R-square. Table 6 gives the prediction performance of the selected GEP model.

**Table 6.** Resulting performance indices' values of the proposed GEP model.


The developed models are presented in terms of expression trees (ETs) or as computer codes. However, these presentations should be interpreted as a form of a mathematical equation. The extraction of the mathematical equation from ETs is an easy task. The ETs are read from left to right and bottom to the top. The ET of each gene of the GEP model is shown in Figure 9a–d, and the mathematical equation of each gene is presented as Equations (5)–(8). The genes' equations were linked, and the GEP model was generated using Equation (9).

$$\text{SubET1} = \cos\sqrt[3]{\Phi \times \cos\left(\frac{\rho + \text{UCS} - \text{V}\_{\text{P}}}{10.61 - \rho}\right)^{2}} \tag{5}$$

$$\text{SubET2} = \left( \cos \left( \cos (0.973 - \frac{\text{V}\_{\text{P}} + \text{UCS}}{\rho - \Phi}) \right) \right)^2 \tag{6}$$

$$\text{SubET3} = \frac{1}{\sqrt[3]{\Phi^2}} + \frac{1}{\Phi^3} \tag{7}$$

$$\text{SubET4} = \cos(\cos(-1 - \sqrt[3]{\tan(2\rho) - \text{UCS} + 1.71})) \tag{8}$$

$$
\lambda = \text{SubET1} + \text{SubET2} + \text{SubET3} + \text{SubET4} \tag{9}
$$

**Figure 9.** *Cont*.

**Figure 9.** (**a**) Expression tree of each gene of the GEP model for TC prediction (Equation (5)). (**b**) Expression tree of each gene of the GEP model for TC prediction (Equation (6)). (**c**) Expression tree of each gene of the GEP model for TC prediction (Equation (7)). (**d**) Expression tree of each gene of the GEP model for TC prediction (Equation (8)).

A better illustration of the predicted values of TC by the GEP model against the measured TC for training and testing subsets are shown in Figures 10 and 11, respectively.

**Figure 10.** The correlation coefficient of the GEP model for the 65% training subsets.

**Figure 11.** The correlation coefficient of the GEP model for the 35% testing subsets.

#### *4.3. Verification and Discussion of the Results*

To evaluate the prediction reliability of the developed models, the corresponding performance indices were compared with those of previously published studies. For the prediction of the TC using the same parameters and datasets as those used in our study, two multivariable equations were developed by Khandelwal [24] (Equation (10)) and Hajihassani, Marto, Khezri, and Kalatehjari [47] (Equation (11)); hence, these two models were chosen for the verification of the newly developed models. It is noteworthy that there are also some other efforts in the literature to predict TC by computer-aided methods using the same datasets [3,24,70,71].

$$
\lambda = -1.1864 + 0.006 \text{ UCS} + 0.1493 \times \text{p} + 0.0134 \text{ }\phi + 0.0004 \text{ V}\_{\text{P}} \tag{10}
$$

$$\lambda = 0.00037 \,\text{V}\_{\text{P}} - 0.01653 \,\text{\textdegree } 0-0.00058 \,\text{\textdegree } \text{\textdegree } + 0.02053 \,\text{LCS}-0.06072 \tag{11}$$

These studies used some kinds of methods called black boxes, but these methods did not have the practical potential to be utilized in applications. In the meantime, as is evident from Equations (10) and (11), in the developed mathematical equations, only simple functions are used, but to represent the effect of parameters on TC, complex non-linear functions are needed. To avoid any confusion made by the selection of training and testing subsets, the performance indicators of all models were calculated again by substituting the input variables using all 50 datasets. Thus, the accuracy of the previous model could be compared exactly to the proposed one in this study. The results of the performance indicators of the proposed models for all 50 datasets are listed in Table 7.

**Table 7.** Performance indices and ranking of the new and the previous TC predictor models calculated for all 50 datasets.


Comparing the performance indices of the proposed models and the previously published model revealed that the GEP and NLMR models produced more accurate predictions than the MVRA model did. The high value of R-square (equal to 0.95) and the low value of RMSE (equal to 0.17) confirmed the higher accuracy of the proposed GEP model. The GEP was more accurate since it used longer terms and a wider variety of mathematical functions than the NLMR one did. The existing MVRAs in the literature [24,47,71] were developed using simple non-linear variables that were regulated using commercial software. Based on the low correlation coefficients of these models, we concluded that they could not capture the complexity of the problem and the relationship between TC and the influential parameters. The NLMR model and the GEP model developed in our study were developed not only using non-linear mathematical terms to represent each parameter but also via an extensive trial and error process; thus, a soft computing approach was used to enhance the accuracy of predictions.

#### **5. Conclusions**

In our study, a literature survey was performed to the establishment of the dataset for estimating the thermal conductivity of rock via several rock properties, including the UCS, density, porosity, and P-wave velocity of rocks. Several simple variable regressions were conducted among the rocks' properties and TC. As a result of simple regressions, we found that the P-wave velocity and density had the highest and the lowest effect on the TC, respectively. Further, the relationships between porosity and density and between UCS and P-wave velocity were then considered significant and meaningful. To estimate TC via rock properties, two empirical equations were developed. First, a model was developed using NLMR, and then a second model was developed by GEP. The two statistical indices R<sup>2</sup> and RMSE were utilized to evaluate the robustness of the developed models in order to predict TC. While the GEP model had a higher value of the R<sup>2</sup> (0.95) and a lower value of RMSE (0.17), a low R<sup>2</sup> (0.82) and RMSE (0.32) were obtained for NLMR. A comparison of the performance indices of the proposed models and of the previously published models revealed that the GEP and NLMR models were able to produce more accurate predictions. As a result, the developed model can be used for estimating the TC of rocks, since performing TC-related tests might be time-consuming and cost-restrictive. Additionally, despite the fact that our study proposed methods and mathematical models that significantly increased prediction accuracy, there are some associated limitations. As mentioned in previous sections, a limited number of datasets were used in our study because data collection in geosciences is challenging. As a result, we recommend that future studies focus on rocks and parameter ranges that overlap only slightly with those

of this study. Our study and similar studies have not quantified the texture of the rocks, which is one of the most important parameters. The data collection and analysis stages of future studies should incorporate this parameter.

**Author Contributions:** Conceptualization, M.S. and S.Y.; methodology, M.S. and S.Y.; software, M.S.; formal analysis, M.S., A.A. and S.Y.; resources, M.S., A.A. and S.Y.; data curation, M.S. and S.Y.; writing—original draft, M.S., T.M., A.A., S.Y. and M.M.S.S.; writing—review and editing, M.S., T.M., A.A., S.Y. and M.M.S.S.; supervision, S.Y.; funding acquisition, M.M.S.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** The research was partially funded by the Ministry of Science and Higher Education of the Russian Federation under the strategic academic leadership program 'Priority 2030' (Agreement 075-15-2021-1333, dated 30 September 2021).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

