*4.4. Multiple Linear Regression Model Test*

Regression models (4) and (5) are valid only if three prerequisites hold: there is no serious autocorrelation between series, there is no strong multicollinearity between the independent variables, and the residuals are approximately normally distributed. Only when these three assumptions are satisfied at the same time can (4) and (5) be regarded as valid and reliable. The statistical analysis was carried out in SPSS, and the output is summarized in Table 3. The coefficient of determination of regression model (4) in Table 3 is 0.833, indicating that the four independent variables *a*, *ARD*, Δ*p* and *f* explain 83.3% of the variation in the dependent variable *C*. Because *R*<sup>2</sup> is close to 1, the goodness of fit is high; that is, there is a close linear relationship between *C* and *a*, *ARD*, Δ*p* and *f*. The adjusted *R*<sup>2</sup> of regression model (5) is only 0.631, indicating that the two independent variables Δ*p* and *Q* explain 63.1% of the variation in the dependent variable *D*, so there is a moderate linear relationship between *D* and Δ*p* and *Q*.

**Table 3.** Model summary.


For regression model (4), the significance level (the probability value corresponding to the *t* statistic) is *Sig* = 0.038 < 0.05. The null hypothesis that none of the 11 candidate independent variables has a significant impact on the dependent variable *C* is therefore rejected: among the 11 independent variables, at least the four variables *a*, *ARD*, Δ*p* and *f* have a significant impact on *C*. Of these, *a*, Δ*p* and *f* have a significant positive effect on *C*, and *ARD* has a significant negative effect. For regression model (5), the significance level (the probability value corresponding to the *t* statistic) is *Sig* = 0.001 < 0.05, indicating that at least the two independent variables Δ*p* and *Q* have a significant impact on *D*; Δ*p* has a significant negative effect on *D*, and *Q* has a significant positive effect.

The Durbin-Watson statistics of regression models (4) and (5) in Table 3 are 2.082 and 1.377, respectively, which are close to 2. In statistics, a Durbin-Watson value close to 0 or 4 indicates that the series are not independent of each other and that there is serious autocorrelation. Therefore, the series in regression models (4) and (5) can be considered independent of each other, with no serious autocorrelation.
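The Durbin-Watson statistic referred to above can be sketched as follows. This is a minimal illustration of the standard definition (sum of squared successive residual differences divided by the sum of squared residuals), not the SPSS routine itself. The function name `durbin_watson` and the residual values are hypothetical and are not data from this study.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for an ordered sequence of regression residuals.

    Values near 2 suggest no first-order autocorrelation; values near 0
    suggest positive autocorrelation and values near 4 negative autocorrelation.
    """
    if len(residuals) < 2:
        raise ValueError("at least two residuals are required")
    # Sum of squared differences between successive residuals.
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    # Sum of squared residuals.
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residuals, for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]   # strongly negative autocorrelation
clustered = [1.0, 1.0, -1.0, -1.0]     # positive autocorrelation
print(durbin_watson(alternating), durbin_watson(clustered))
```

The alternating series pushes the statistic towards 4 and the clustered series pushes it towards 0, which is the behaviour the interpretation rule above relies on.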

In statistics, it is generally accepted that there is no strong multicollinearity between independent variables when the variance inflation factor (VIF) is less than 5. According to the SPSS output, the VIF values of the four independent variables *a*, *ARD*, Δ*p* and *f* in regression model (4) are 1.761, 1.771, 3.262 and 3.133, respectively, all less than 5. The VIF values of the two independent variables Δ*p* and *Q* in regression model (5) are 1.488 and 1.488, respectively, which are also less than 5. This shows that there is no strong multicollinearity among *a*, *ARD*, Δ*p* and *f*, or between Δ*p* and *Q*; that is, the independent variables are not strongly correlated with one another.
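For a model with exactly two independent variables, such as model (5), the VIF of both variables equals 1/(1 − *r*<sup>2</sup>), where *r* is the Pearson correlation between them; this is consistent with the two identical values of 1.488 reported above. The sketch below illustrates that special case only and is not the SPSS computation. The function names and the sample arrays are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor regression."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


def implied_abs_correlation(vif):
    """|r| between two predictors implied by their common VIF."""
    return math.sqrt(1.0 - 1.0 / vif)


# Hypothetical, uncorrelated predictors: the VIF is exactly 1.
print(vif_two_predictors([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -1.0, 1.0]))

# Correlation magnitude implied by the reported VIF of 1.488.
print(round(implied_abs_correlation(1.488), 3))
```

Under this reading, a VIF of 1.488 corresponds to a correlation magnitude of roughly 0.57 between Δ*p* and *Q*, well below the level at which the VIF < 5 criterion would be violated.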

From the standard *P*–*P* diagrams and scatter diagrams of the regression standardized residuals of the dependent variables *C* and *D* in Figures 3 and 4, it can be seen that the standardized residuals lie approximately along the diagonal reference line of the *P*–*P* diagrams, so the points are approximately linear and the data match the models. In the scatter diagrams, the sample points are scattered without a regular pattern, so the residuals are random and there is no heteroscedasticity. This shows that the residuals are approximately normally distributed.

The main factors that significantly affect the regression coefficient *C* are *a*, *ARD*, Δ*p* and *f*, and the main factors that significantly affect the regression coefficient *D* are Δ*p* and *Q*. There is no serious autocorrelation between the tested series, there is no strong multicollinearity between the independent variables, and the residuals are approximately normally distributed. Therefore, regression models (4) and (5) for *C* and *D*, established with SPSS, can be regarded as valid and reliable.

**Figure 3.** Standard *P*–*P* diagram and scatter plot diagram of regression normalized residual of dependent variable *C*. (**a**) Cumulative probability of observation. (**b**) Regression standardized predicted value.

**Figure 4.** Standard *P*–*P* diagram and scatter plot diagram of regression normalized residual of dependent variable *D*. (**a**) Cumulative probability of observation. (**b**) Regression standardized predicted value.

#### **5. Mathematical Model Prediction Correction of Coal Seam Gas Content**

Through the above research and analysis, the regression models (4) and (5) of regression coefficients *C* and *D* are established. Substituting (4) and (5) into Equation (3), the mathematical model (6) for predicting coal seam gas content in coal mines in the Hancheng area is finally obtained:

$$\begin{array}{l}W' = (11.031 + 0.148a + 0.084\Delta p - 7.919ARD + 2.263f)\ln K_1\\ + (4.724 - 0.339\Delta p + 4.725Q) \end{array} \tag{6}$$

where *W*′ is the predicted value of coal seam gas content, m<sup>3</sup>/t.
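Equation (6) can be evaluated directly once the coal parameters and *K*<sub>1</sub> are known. The following is a minimal sketch that reads Equation (6) as *W*′ = *C* ln *K*<sub>1</sub> + *D*, with *C* and *D* given by the two bracketed terms. The function and argument names are hypothetical, and the inputs are assumed to be in the same units as those used to fit models (4) and (5).

```python
import math


def predict_gas_content(a, delta_p, ard, f, q, k1):
    """Predicted coal seam gas content W' (m^3/t) from Equation (6).

    Argument names are illustrative; units are assumed to match those
    used when regression models (4) and (5) were fitted.
    """
    if k1 <= 0:
        raise ValueError("K1 must be positive for ln(K1) to be defined")
    # Regression coefficient C, the first bracket of Equation (6).
    c = 11.031 + 0.148 * a + 0.084 * delta_p - 7.919 * ard + 2.263 * f
    # Regression coefficient D, the second bracket of Equation (6).
    d = 4.724 - 0.339 * delta_p + 4.725 * q
    return c * math.log(k1) + d


# Sanity check on the structure of the model: when K1 = 1, ln(K1) = 0,
# so the prediction reduces to the D term alone.
print(predict_gas_content(a=0.0, delta_p=0.0, ard=0.0, f=0.0, q=0.0, k1=1.0))
```

Keeping *C* and *D* as separate intermediate values makes it easy to inspect how each regression model contributes to the prediction.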

Three coal samples are taken from the Xiangshan Coal Mine, Xiayukou Coal Mine and Sangbei Coal Mine in the Hancheng area. Eleven basic gas parameters and coal quality indexes of the three coal samples (*Mad*, *Aad*, *Vdaf*, *TRD*, *ARD*, *F*, *Q*, *a*, *b*, Δ*p* and *f*) are measured in the laboratory. At the same time, the *K*<sub>1</sub>–*p* relationship model of the three coal samples is measured with a WTC gas outburst parameter instrument [31]. The gas content of the coal seam is then predicted with mathematical model (6), with the gas pressure *p* set to 0.5, 1.0, 2.0, 3.0, 4.0 and 5.0 MPa. The predicted value *W*′ of the mathematical model of coal seam gas content is compared with the measured value *W* (as shown in Table 4).

**Table 4.** Comparison of predicted and measured values of the mathematical model of coal bed methane content.


Remarks: error rate = (*W*′ − *W*)/*W* × 100%.

It can be seen from Table 4 that, for gas pressures of 0.5, 1.0, 2.0, 3.0, 4.0 and 5.0 MPa, the predicted values *W*′ of the mathematical model of coal seam gas content and the measured values *W* range from 3.7021 to 13.0581 m<sup>3</sup>/t. Both the predicted value *W*′ and the measured value *W* increase with increasing gas pressure. Within the gas pressure range of 0.5~5.0 MPa, the maximum deviation between the predicted value *W*′ and the measured value *W* is −2.3189 m<sup>3</sup>/t, at a gas pressure of 4.0 MPa in the Xiayukou Coal Mine. The minimum deviation is 0.0711 m<sup>3</sup>/t, at a gas pressure of 0.5 MPa in the Xiangshan Coal Mine. The average deviation is −1.2178 m<sup>3</sup>/t. The maximum error rate is −25.71%, the minimum error rate is 1.31%, and the average error rate is −12.84%.

In Figure 5, the trend of the predicted value *W*′ of the mathematical model of coal seam gas content in the three coal mines is generally consistent with that of the measured value *W*, and both follow the shape of a logarithmic function. The predicted value *W*′ for the Xiangshan Coal Mine is very close to the measured value *W*, whereas the trend for the Xiayukou Coal Mine deviates the most. At low pressure, the predicted and measured curves nearly coincide. The average absolute error rate is 12.84%, which meets the prediction requirements. The multiple linear regression model is therefore reliable for predicting coal seam gas content in the Hancheng area. Because the average deviation is −1.2178 m<sup>3</sup>/t, the predicted value *W*′ is generally lower than the measured value *W*. Therefore, Equation (6) is corrected by the average deviation of −1.2178 m<sup>3</sup>/t, that is, 1.2178 m<sup>3</sup>/t is added to the prediction. The modified mathematical model is:

$$\begin{array}{l}W' = (11.031 + 0.148a + 0.084\Delta p - 7.919ARD + 2.263f)\ln K_1\\ + (4.724 - 0.339\Delta p + 4.725Q) + 1.2178\end{array} \tag{7}$$
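The correction step amounts to subtracting the mean deviation (predicted minus measured) from each prediction of Equation (6), which is the same as adding 1.2178 m<sup>3</sup>/t. A minimal sketch of that arithmetic, together with the error rate defined in the remarks to Table 4, is given below. The function names and the example values are hypothetical and are not taken from Table 4.

```python
# Mean deviation (predicted minus measured) reported in the text, m^3/t.
MEAN_DEVIATION = -1.2178


def error_rate(predicted, measured):
    """Error rate in percent: (W' - W) / W * 100."""
    return (predicted - measured) / measured * 100.0


def corrected_prediction(predicted):
    """Apply the constant offset of Equation (7) to an Equation (6) prediction."""
    return predicted - MEAN_DEVIATION  # equivalent to predicted + 1.2178


# Hypothetical values for illustration only (not from Table 4).
w_pred, w_meas = 9.0, 10.0
print(error_rate(w_pred, w_meas))                        # uncorrected error rate
print(error_rate(corrected_prediction(w_pred), w_meas))  # error rate after correction
```

A constant offset removes the average bias but leaves the scatter around it unchanged, so individual predictions that were already above the measured value move further away from it.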

**Figure 5.** Comparison of the predicted value *W*′ of the regression model with the measured value *W* under different pressures.

#### **6. Conclusions**


**Author Contributions:** Investigation, H.L., L.D., J.C., R.L. and B.W.; methodology, H.L., L.D. and J.C.; supervision, L.D., R.L. and B.W.; writing—original draft, H.L., L.D. and J.C.; writing—review and editing, H.L., L.D., J.C. and R.L.; funding acquisition, L.D., J.C., R.L. and B.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Natural Science Foundation of China (No. 51974358, No. 52104239 and No. 51874348), Natural Science Foundation of Chongqing (No. CSTB2022NSCQ-MSX1080, No. CSTB2022NSCQ-MSX0379 and No. cstc2020jcyj-msxmX1052), and Chongqing Science Fund for Distinguished Young Scholars (No. cstc2019jcyjjqX0019).

**Data Availability Statement:** All data and/or models used in the study appear in the submitted article.

**Acknowledgments:** We also would like to thank the anonymous reviewers for their valuable comments and suggestions that led to a substantially improved manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.
