#### **6. Discussion**

This study first examined the traditional haze inversion method. Analysis of the relationship between AOD and haze concentration revealed a non-linear mapping between the two. We therefore propose a CNN-based haze classification model that takes advantage of the CNN's ability to fit non-linear relationships. In the experimental comparison between the traditional method and the proposed model, the correlation coefficients of the CNN-based haze classification model in spring, summer, and fall are higher than those of the traditional inversion method. In particular, the results in summer and fall increased by 12% and 39%, respectively, which indicates that the CNN haze classification model can provide better classification results than traditional inversion methods. In addition, all correlation coefficients of the haze classification model are above 0.6, indicating a strong correlation between the haze level and the PM2.5 concentration.
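As an illustration of the seasonal comparison described above, the following minimal Python sketch computes one correlation coefficient per season between estimated haze levels and PM2.5 concentrations. It is not the code used in this study. The function names `pearson` and `seasonal_correlation`, the record layout, and the sample values are all assumptions made for illustration, and the use of the Pearson coefficient is itself an assumption.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def seasonal_correlation(records):
    """Group (season, estimate, pm25) records by season and correlate each group."""
    groups = defaultdict(lambda: ([], []))
    for season, estimate, pm25 in records:
        groups[season][0].append(estimate)
        groups[season][1].append(pm25)
    return {season: pearson(est, obs) for season, (est, obs) in groups.items()}


# Made-up illustrative values, NOT data from the study.
records = [
    ("spring", 1, 20.0), ("spring", 2, 45.0), ("spring", 3, 80.0),
    ("summer", 1, 30.0), ("summer", 2, 25.0), ("summer", 3, 60.0),
]
result = seasonal_correlation(records)
for season, r in result.items():
    print(f"{season}: r = {r:.3f}")
```

Running the same routine on the outputs of both the traditional inversion and the CNN model would give the per-season pairs of coefficients that the comparison relies on.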

In general, for both the traditional inversion method and the CNN-based method, summer has the lowest correlation coefficient, followed by winter. A possible reason is that in summer the fog concentration of the aerosol is higher than the haze concentration. In winter, there are many days of heavy haze with an uneven temporal and spatial distribution, resulting in a low recognition rate for the CNN. The improvement in fall, on the other hand, is attributed to the lower haze levels, which are mostly concentrated in the first four levels, and to the less frequent severe haze weather.

The results are consistent with other studies [3,5,20]. Winter is the most polluted season in this area of China, and this pattern may contribute to the high correlation coefficient in winter. The CNN performs better in spring, summer, and fall. The feature-capturing characteristics of the CNN and its ability to fit complex functions could benefit haze classification and prediction [31,32]. Since the CNN model could improve the inversion results when the haze is less severe, other machine learning methods might likewise be applied to a broader range of areas and haze scenarios. Given the limitations of both the proposed method and the traditional method used for comparison, the results of both are limited. However, the results still show the potential of applying neural networks to time-consuming tasks in atmospheric studies.

#### **7. Conclusions**

This research is an experimental study of the classification and prediction of haze levels with convolutional neural networks. We found that a convolutional neural network can use images to identify haze concentrations and can sufficiently fit the non-linear relationship between input and output. This demonstrates the feasibility of using a convolutional neural network to classify and invert haze. At the same time, the use of trained convolutional neural networks can reduce manual inversion work to a certain extent. Our proposed CNN-based haze level classification model greatly simplifies data processing compared to traditional inversion methods. By comparing the correlation coefficients of traditional inversion methods and CNN-based methods, we show that the haze prediction network can provide better PM2.5 concentration classification than traditional inversion methods. The results also show that the original remote sensing satellite images can provide rich features for analyzing haze problems.

This study still has many limitations, including limited data and coverage of only a limited range of the haze problem, so the method is only a starting point for further combining machine learning methods with atmospheric problems. The recognition rate exceeded 80%. Because of insufficient data, the accuracy of the recognition results is not high, but the approach remains feasible. The accuracy rate and the F1 value are higher for levels 1–4, whereas the corresponding values for levels 5 and 6 all decreased because the concentration span of these two levels is larger.
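To make the per-level comparison above concrete, the following minimal Python sketch computes precision, recall, and F1 for each haze level, along with the overall accuracy, from lists of true and predicted levels. It is only an illustration under stated assumptions: the names `per_level_metrics` and `overall_accuracy` are hypothetical, the choice of precision and recall as the metrics reported alongside F1 is an assumption, and the labels are made-up values rather than results from this study.

```python
def per_level_metrics(y_true, y_pred, levels):
    """Return precision, recall and F1 for each haze level (one-vs-rest)."""
    metrics = {}
    pairs = list(zip(y_true, y_pred))
    for level in levels:
        tp = sum(1 for t, p in pairs if t == level and p == level)
        fp = sum(1 for t, p in pairs if t != level and p == level)
        fn = sum(1 for t, p in pairs if t == level and p != level)
        # Report 0.0 when a ratio is undefined (no predictions or no samples).
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[level] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted level equals the true level."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Made-up labels, NOT results from the study: levels 1 and 2 are classified
# perfectly, while levels 5 and 6 are partly confused with each other.
y_true = [1, 1, 2, 2, 5, 5, 6, 6]
y_pred = [1, 1, 2, 2, 5, 6, 5, 6]

metrics = per_level_metrics(y_true, y_pred, levels=range(1, 7))
accuracy = overall_accuracy(y_true, y_pred)
for level, m in metrics.items():
    print(level, m)
print("accuracy:", accuracy)
```

In this toy example, confusion between two adjacent high levels lowers their F1 values while leaving those of the lower levels unchanged, which mirrors the kind of per-level breakdown discussed above.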

Because the proposed method aims to replicate and replace the traditional process with a CNN, it shares the limitations of the traditional method. The estimate depends heavily on the satellite image, which contains many other elements that could be falsely identified as haze. The traditional method reduces these false results through various calibration processes that use satellite parameters, including azimuth, zenith, emissivity, and reflectivity. When constructing a neural network to replace the labor-intensive traditional method, some of these calibration processes were removed to fit the network structure and simplify the network process. Although the proposed method has fewer calibrations and processing steps than the traditional method, its results are comparable to, and in several situations better than, those of the traditional method. This is likely due to the strong performance of neural networks in fitting complex natural processes. This study and previous studies have shown that neural networks have great capacity for simulating the progression of haze and predicting haze states [23–26,28,31,32]. Nevertheless, because of these simplifications, the time needed to construct the network, and the limitations inherited from the traditional method, there will still be many misfits and false identifications of haze phenomena. To further demonstrate the potential of neural networks for the haze problem, we believe that the following aspects can be studied further to improve the accuracy of haze prediction.

Future studies could further improve the model's ability to interpret images. To avoid the influence of clouds on the model's ability to identify polluted areas and pollutant concentrations, we manually annotated cloud information on all satellite images. We also removed some steps of traditional image processing to reduce complexity and time, which could cause errors in the outcome. In future research, the model's image-processing ability could be improved so that it replaces manual annotation while maintaining accuracy and quality. In addition, convolutional neural networks have good recognition performance for images but lack the ability to process time-series information. Therefore, to capture the temporal characteristics of haze, future studies would benefit from incorporating additional network models.

**Author Contributions:** Conceptualization, W.Z. and L.W.; methodology, L.Y. and W.H.; software, J.T. and W.H.; formal analysis, B.Y. and S.L.; data curation, W.H.; writing—original draft preparation, L.Y. and W.H.; writing—review and editing, L.Y. and W.Z.; funding acquisition, W.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was jointly supported by the Sichuan Science and Technology Program (Grant: 2021YFQ0003).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** This study created or analyzed no new data. Data sharing is not applicable to this article.

**Conflicts of Interest:** The authors declare no conflict of interest.
