*Article* **Prediction of Air Pollutant Concentrations via Random Forest Regressor Coupled with Uncertainty Analysis—A Case Study in Ningxia**

**Weifu Ding <sup>1</sup> and Xueping Qie <sup>2,</sup>\***


**Abstract:** Air pollution received little attention until recent years, when its severe impacts on human health became widely understood. Using air pollution and meteorological monitoring data collected in Ningxia from 1 January 2016 to 31 December 2017, we analyzed the impact of ground surface temperature, air temperature, relative humidity and wind power on air pollutant concentrations. We then modeled the relationships between air pollutant concentrations and meteorological variables from the air-monitoring station data with three models: a decision tree regressor (DTR), a feedforward artificial neural network with back-propagation algorithm (FFANN-BP) and a random forest regressor (RFR). Across all pollutants, the RFR raises *R*<sup>2</sup> relative to FFANN-BP and DTR by up to 0.53 and 0.42, respectively, reduces the root mean square error (RMSE) by up to 68.7 and 41.2, and reduces the mean absolute error (MAE) by up to 25.2 and 17. The empirical results show that the proposed RFR gives the best forecasting performance of the three models and could provide local authorities with reliable and precise predictions of air pollutant concentrations. The RFR effectively establishes the relationships between the influential factors and air pollutant concentrations, suppresses overfitting and improves prediction accuracy. In addition, the limitation of machine learning to single-site prediction is overcome.

**Keywords:** air pollution; random forest; feedforward artificial neural network with back-propagation; decision tree; Ningxia
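
The three scores used in the abstract to compare the models (*R*<sup>2</sup>, RMSE and MAE) can be illustrated with a minimal sketch in plain Python. It assumes the standard textbook definitions of these metrics. The function names and the toy concentration values are hypothetical and are not taken from the Ningxia data or from the authors' code.

```python
import math

def rmse(observed, predicted):
    """Root mean square error: square root of the mean squared residual."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical pollutant concentrations (illustrative values only).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

scores = {
    "R2": r2(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
}
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

A higher *R*<sup>2</sup> and lower RMSE and MAE indicate a better fit, which is the sense in which the RFR is reported to outperform FFANN-BP and DTR.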
