Generating Daily High-Resolution Regional XCO2 by Deep Neural Network and Multi-Source Data

Tian, Wenjie; Zhang, Lili; Yu, Tao; Yao, Dong; Zhang, Wenhao; Wang, Chunmei

doi:10.3390/atmos15080985


by Wenjie Tian ¹,², Lili Zhang ¹,²,³,*, Tao Yu ¹,², Dong Yao ⁴, Wenhao Zhang ⁵ and Chunmei Wang ²

¹ Key Laboratory of Earth Observation of Hainan Province, Hainan Aerospace Information Research Institute, Sanya 572029, China

² Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China

³ State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China

⁴ QiLu Aerospace Information Research Institute, Jinan 250010, China

⁵ North China Institute of Aerospace Engineering, Langfang 065000, China

* Author to whom correspondence should be addressed.

Atmosphere 2024, 15(8), 985; https://doi.org/10.3390/atmos15080985

Submission received: 2 July 2024 / Revised: 14 August 2024 / Accepted: 15 August 2024 / Published: 16 August 2024

(This article belongs to the Special Issue Satellite Remote Sensing Applied in Atmosphere (2nd Edition))


Abstract:

CO₂ is one of the primary greenhouse gases impacting global climate change, making it crucial to understand the spatiotemporal variations of CO₂. Currently, commonly used satellites serve as the primary means of CO₂ observation, but they often suffer from striping issues and fail to achieve complete coverage. This paper proposes a method for constructing a comprehensive high-spatiotemporal-resolution XCO₂ dataset based on multiple auxiliary data sources and satellite observations, utilizing multiple simple deep neural network (DNN) models. Global validation results against ground-based TCCON data demonstrate the excellent accuracy of the constructed XCO₂ dataset (R is 0.94, RMSE is 0.98 ppm). Using this method, we analyze the spatiotemporal variations of CO₂ in China and its surroundings (region: 0°–60° N, 70°–140° E) from 2019 to 2020. The gapless and fine-scale CO₂ generation method enhances people’s understanding of CO₂ spatiotemporal variations, supporting carbon-related research.

Keywords:

DNN; XCO₂; China; regional fine scale; high resolution

1. Introduction

The global carbon dioxide (CO₂) concentration has been continuously rising in recent years, significantly impacting the ecological environment and climate change [1]. According to the latest 2021 Greenhouse Gas Bulletin by the World Meteorological Organization [2], the global average CO₂ concentration increased to 415.7 ppm in 2021, with the growth rate of CO₂ between 2020 and 2021 exceeding the average annual growth rate of the past decade. Therefore, urgent action is needed to control CO₂ emissions. Many countries around the world have established “carbon neutrality” goals [3], and China has also set targets to peak its carbon emissions by 2030 and achieve carbon neutrality by 2060 [4]. To achieve this goal, we first need to accurately monitor the concentration and spatiotemporal variations of CO₂ in the atmosphere. As a result, obtaining high-resolution CO₂ data in China is of great significance.

With its broad coverage, high stability, long time series, and high accuracy, satellite remote sensing is an ideal method for CO₂ monitoring [1,5]. Since the 1990s, satellite remote sensing has become essential for obtaining global CO₂ information. CO₂ remote sensing has undergone the development of thermal infrared and near-infrared detection methods, and CO₂ inversion algorithms have also rapidly advanced [6]. Currently, the near-infrared detection method is primarily used, employing satellite sensors such as those on the European Space Agency’s Environmental Satellite (ENVISAT) [7], Japan’s Greenhouse Gases Observing Satellite (GOSAT) [8], and the United States’ Orbiting Carbon Observatory-2 (OCO-2) [9], as well as newly developed satellites like GOSAT-2 [10], OCO-3 [11], and China’s TanSat (Tan means carbon in Chinese) [12]. These satellites can provide global column-averaged dry air CO₂ mole fraction (XCO₂) products, where XCO₂ represents the average mole fraction of CO₂ in the entire atmospheric column from the Earth’s surface to the top of the atmosphere (excluding water vapor molecules) [13]. Although satellite remote sensing can obtain CO₂ concentrations on a large scale, there are often significant spatial gaps in the data due to imaging techniques and adverse weather conditions. Furthermore, the observation swath of current satellites is relatively narrow. GOSAT has an observation width of only 10.6 km and a long revisit period [14], resulting in limited availability of effective data. Additionally, data quality is influenced by external conditions, with high atmospheric aerosol loading [15] or cloud cover directly degrading it. Therefore, while satellite remote sensing is a promising method, there are still many challenges in obtaining regional fine-scale, high-spatial-coverage CO₂ data.

Geostatistical methods are commonly used for interpolating XCO₂ data obtained from satellite observations. The main techniques typically employed include ordinary kriging and spatiotemporal kriging [16]. Compared to ordinary kriging, spatiotemporal kriging considers temporal and spatial trends, enhancing spatiotemporal effectiveness. However, geostatistical methods, represented by kriging interpolation, have the following disadvantages: they can result in resolution loss, fail to capture spatial details and textures, cannot effectively predict nonlinear time series, and have low computational efficiency when dealing with large datasets. More importantly, these methods overlook the factors that influence XCO₂. Research on the variables correlated with XCO₂, represented by Lee et al. [17,18,19,20,21], indicates that meteorological conditions, vegetation conditions, and human activities significantly impact XCO₂. Therefore, combining influencing factors and spatial characteristics can effectively enhance the accuracy of XCO₂ estimation under high-spatiotemporal-resolution requirements.

Many researchers have recently used machine learning techniques to establish relationships between auxiliary factors and satellite XCO₂, thereby estimating full-coverage XCO₂ data. For example, Zeng et al. [22] utilized an artificial neural network and predictive factors (including sea surface temperature, latitude, longitude, salinity, and chlorophyll-a concentration) to estimate surface ocean CO₂. He et al. [23] used a light gradient boosting machine. They incorporated variables such as elevation, land use, meteorological conditions, and Carbon Tracker XCO₂ data to obtain full-coverage XCO₂ data for China. Zhang et al. [24] utilized convolutional neural networks and attention mechanisms to obtain the distribution of regional XCO₂ in China from 2013 to 2019, using multi-satellite data and auxiliary variables. Similarly, Zhang et al. [25] chose a geographically weighted neural network to construct the distribution of XCO₂ in China from 2014 to 2020. Li et al. [17] selected environmental factors such as vegetation and meteorology and utilized an extreme random tree model to obtain a global CO₂ dataset with a resolution of 0.01° and an 8-day interval. The coefficient of determination (R²) is 0.83, and the root mean square error (RMSE) is 1.79 ppm.

High-resolution and high-precision global mapping of CO₂ concentrations is very important for understanding the carbon cycle. However, research on estimating high-temporal-resolution CO₂ satellite data is still minimal. Furthermore, there is potential for improving the estimation accuracy and spatial resolution to generate a global, high-precision, and high-resolution seamless CO₂ dataset. This study aims to establish relationships between multiple auxiliary data sources and satellite XCO₂ data. By utilizing existing modeled XCO₂ data, such as Carbon Tracker (CT) XCO₂ data, and other datasets, such as vegetation data and meteorological data, the study attempts to construct the global distribution of XCO₂ using multiple DNNs. The spatial resolution is set at 0.1°, and the temporal resolution is set at one day.

Finally, using this method, we reconstructed the XCO₂ distribution in China and its surroundings from 2019 to 2020. To better analyze the spatiotemporal variations of CO₂ in China and observe the influences of topography, population, wind patterns, and other factors, we selected the study region of 0°–60° N, 70°–140° E. This region primarily encompasses China but includes major industrial areas in India and Southeast Asia, as well as parts of Mongolia and Russia.

2. Data and Method

2.1. Data

(1): OCO-2 XCO₂

The Orbiting Carbon Observatory-2 (OCO-2) is a NASA satellite mission designed to measure atmospheric CO₂ levels from space; it was launched on 2 July 2014. OCO-2 aims to provide global, high-resolution measurements of CO₂ concentrations in Earth’s atmosphere to enhance our understanding of carbon cycle processes and improve climate models [26]. The primary instrument on board OCO-2 is a three-channel imaging grating spectrometer, which measures the intensity of sunlight reflected from Earth’s surface and atmosphere at specific wavelengths to derive the concentration of CO₂. OCO-2 operates in three spectral bands: the O₂ A-band near 0.76 μm and two CO₂ absorption bands near 1.61 and 2.06 μm [27]. Together, these bands allow for accurate CO₂ measurements. OCO-2 provides a spatial resolution of approximately 3 square kilometers (1.2 square miles) at the nadir, which allows for detailed observations of CO₂ concentrations across different regions [9]. OCO-2 collects data on a global scale to capture variations in CO₂ concentrations across different latitudes, longitudes, and ecosystems. OCO-2 data are publicly accessible through NASA’s Earth Observing System Data and Information System (EOSDIS) and can be obtained from the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC). The data product version selected for this study is V11r. Data source: https://search.earthdata.nasa.gov (accessed on 6 October 2023).

(2): Carbon Tracker XCO₂

The Carbon Tracker (CT) modelling system was developed by the National Oceanic and Atmospheric Administration (NOAA) in 2005. It combines climate, vegetation, and carbon cycle models to obtain detailed estimates of carbon emissions and uptake at global and regional scales. CT combines atmospheric transport models with ensemble Kalman filtering to estimate the temporal evolution of CO₂ uptake and release at the Earth’s surface from an atmospheric perspective [28]. CT tracks atmospheric CO₂ sources and sinks by comparing them with global observational data. This modelling system assesses CO₂ exchange from the “atmospheric viewpoint”. It can handle multiple ecosystem and ocean datasets, estimating natural carbon fluxes, such as those from the ocean and wildfires, as well as anthropogenic emissions from fossil fuel combustion and other human activities [29]. It distinguishes between natural carbon cycle variations and carbon emissions resulting from human activities. The global XCO₂ data provided by CT have a spatial resolution of 3° × 2° and a temporal resolution of 1 day. The data can be accessed from https://gml.noaa.gov/aftp/products/carbontracker (accessed on 10 October 2023).

(3): CAMS XCO₂ and XCH₄

The Copernicus Atmosphere Monitoring Service (CAMS) reanalysis is the latest global atmospheric composition reanalysis dataset, including aerosols, chemical species, and greenhouse gases [30]. The CAMS global reanalysis primarily covers species such as CO₂, NO₂, O₃, CH₄, and aerosols from 2003 to 2021 (the global greenhouse gas reanalysis still covers 2003 to 2020) [31]. In the generation process of CAMS XCO₂ data, OCO-2 data are not assimilated, ensuring the effective fusion of data [32]. Verification has shown the potential and feasibility of CAMS XCO₂ data for atmospheric CO₂ analysis. The CAMS XCO₂ data used in this study are generated by the Integrated Forecasting System (IFS) model and the 4DVar data assimilation system at the European Centre for Medium-Range Weather Forecasts (ECMWF) and are available from the Atmosphere Data Store [29,33]. The spatial resolution of XCO₂ and XCH₄ from CAMS is 0.75° × 0.75°, and the temporal resolution is 3 h. The data can be accessed from https://ads.atmosphere.copernicus.eu (accessed on 4 January 2024).

(4): TCCON XCO₂

The Total Carbon Column Observing Network (TCCON) is a global observational network established in 2004 to monitor and measure the concentration of total carbon columns in the atmosphere. TCCON uses high-precision infrared spectrometers to observe the spectral characteristics of CO₂, CH₄, and other related gases in the atmosphere and to infer their concentrations within the atmospheric column. This network provides high-quality data for carbon cycle and climate change research [34]. At TCCON observation sites, spectrometers measure the spectral absorption features of solar radiation passing through the atmosphere. By analyzing these absorption features, the concentrations of gases such as CO₂ and CH₄ within the atmospheric column can be inferred [35]. The data products generated by TCCON include atmospheric XCO₂, XCH₄, and other parameters, along with associated uncertainty estimates. These data products are essential for studying global carbon cycling, climate change, and greenhouse gas emissions and uptake. For this study, the data version used is GGG2020. The data can be accessed from https://tccondata.org (accessed on 20 November 2023). Figure 1 shows the distribution of TCCON sites.

(5): Vegetation data

Sun-induced chlorophyll fluorescence (SIF) data are the fluorescence signals emitted by chlorophyll molecules during photosynthesis when excited by light energy. They directly reflect the dynamic changes in the actual photosynthesis of plants [36]. The global SIF data used in this study are obtained from the National Tibetan Plateau Data Center, constructed by Zhang et al. [37]. The spatial and temporal resolution of the data is 0.05° × 0.05° and four days, respectively.

Normalized difference vegetation index (NDVI) is used to assess vegetation conditions by calculating the difference in reflectance between the near-infrared and visible light bands. It is commonly used in remote sensing image analysis and vegetation monitoring. Enhanced vegetation index (EVI) is an improved vegetation index incorporating reflectance from the red, blue, and near-infrared bands, providing a more accurate estimation of vegetation growth and health status [38]. Both NDVI and EVI can reflect vegetation status. The NDVI and EVI data used in this study are obtained from Moderate Resolution Imaging Spectroradiometer (MODIS), with a spatial and temporal resolution of 0.05° × 0.05° and eight days, respectively.
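For reference, the two indices are commonly written as below. This is a sketch of the standard definitions rather than a formula taken from this study: ρ_NIR, ρ_red, and ρ_blue denote surface reflectance in the near-infrared, red, and blue bands, and the EVI coefficients (2.5, 6, 7.5, 1) are those generally used for the MODIS product, assumed here to apply to the data used.

$$
\mathrm{NDVI} = \frac{\rho_{\mathrm{NIR}} - \rho_{\mathrm{red}}}{\rho_{\mathrm{NIR}} + \rho_{\mathrm{red}}}, \qquad
\mathrm{EVI} = 2.5\,\frac{\rho_{\mathrm{NIR}} - \rho_{\mathrm{red}}}{\rho_{\mathrm{NIR}} + 6\,\rho_{\mathrm{red}} - 7.5\,\rho_{\mathrm{blue}} + 1}
$$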

(6): Meteorological data

Fifth Generation European Reanalysis (ERA5) is a global meteorological reanalysis dataset developed and maintained by the European Centre for Medium-Range Weather Forecasts (ECMWF) [24]. It is a high-resolution dataset that provides spatiotemporal variables for various meteorological parameters, including temperature, humidity, wind speed, precipitation, and more. The ERA5 dataset utilizes advanced physical models and data assimilation techniques, combining data from satellite observations, ground-based weather stations, aircraft observations, and other sources. It has a temporal resolution of one hour, a horizontal resolution of approximately 31 km, and a vertical resolution with 137 levels. The dataset offers continuous meteorological records from 1979 to the present. ERA5 has diverse applications in climate research, weather forecasting, environmental monitoring, agriculture, water resource management, and more [39]. Using ERA5 data, researchers and practitioners can access global-scale meteorological information for climate analysis, modelling, prediction, and other weather and climate-related studies [40].

The meteorological data include the U-component of wind (U10), V-component of wind (V10), mean sea-level pressure (MSL), 2 m temperature (T2M), and surface pressure (SP) from ERA5. The spatial resolution is 0.25° × 0.25°, and the temporal resolution is monthly. The data can be accessed from https://cds.climate.copernicus.eu (accessed on 25 March 2024).

2.2. Method

A Deep Neural Network (DNN) is a common artificial neural network model with multiple hidden layers. Each hidden layer is composed of various neurons, where each neuron is connected to all neurons in the previous layer, meaning that each neuron receives the outputs of all the neurons from the last layer as inputs. The depth of a deep neural network allows it to possess more robust representation capabilities, enabling it to learn more complex and abstract features. Each hidden layer can be seen as a higher-level representation of the input data, and through successive transformations and feature extractions, the network gradually learns higher-level features.

Training a DNN involves using the backpropagation algorithm. This algorithm compares the network’s output with the desired output, calculates the loss (error), and then propagates the error backward through the network, adjusting the weights and biases accordingly to minimize the loss. This process is achieved through gradient descent, allowing the network to optimize itself and improve its performance gradually.
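As a generic illustration of this update (not a statement of the exact training setup used here), gradient descent can be written as follows. The mean squared error loss is an assumption, since the loss function is not specified; θ denotes the weights and biases, η a learning rate, ŷ_i the network output, and y_i the desired output for sample i.

$$
L(\theta) = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2, \qquad \theta \leftarrow \theta - \eta\,\frac{\partial L}{\partial \theta}
$$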

We constructed four neural network models with five fully connected layers, including three hidden layers. These models were used to predict global CO₂ concentrations from 2019 to 2020. Each model predicted the global CO₂ distribution for six months. The formulation is given by Equation (1).

$$
\mathrm{XCO}_2 = \mathrm{DNN}\left(\mathrm{LAT}, \mathrm{LON}, \mathrm{TIME}, \mathrm{SIF}, \mathrm{NDVI}, \mathrm{EVI}, \mathrm{SP}, \mathrm{T2M}, \mathrm{U10}, \mathrm{V10}, \mathrm{MSL}, \mathrm{XCH}_4^{\mathrm{cams}}, \mathrm{XCO}_2^{\mathrm{cams}}, \mathrm{XCO}_2^{\mathrm{ct}}\right) \tag{1}
$$

We primarily used five categories of features. The first category includes location and time information: the observation coordinates LAT (latitude) and LON (longitude), and the observation time (TIME). The time is the Julian day of the year (i.e., the number of days since the beginning of the year; e.g., 2 January 2019 has a Julian day of 2). The second category comprises vegetation information, including SIF, NDVI, and EVI for the specific location and time. The third category consists of meteorological information, including SP, T2M, U10, V10, and MSL. The fourth category includes XCO₂ information with temporal features from CAMS, which also includes XCH₄ information; XCO₂^cams represents the XCO₂ concentration at the given location (LAT, LON) for a specific day. The fifth category is similar to the fourth category, but the XCO₂ information is sourced from CT. Due to the simplicity of the model structure we constructed, the training time was significantly reduced. The dataset was built using the following methods.
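The structure of Equation (1) can be sketched as a small fully connected network that maps the 14 input features to one XCO₂ value. This is a minimal illustration only: the hidden-layer widths (64, 32, 16), the ReLU activation, and the random initialization are assumptions, as none of them are stated in the text, and the helper names `init_params` and `forward` are hypothetical.

```python
import math
import random

# Feature order follows Equation (1).
FEATURES = ["LAT", "LON", "TIME", "SIF", "NDVI", "EVI",
            "SP", "T2M", "U10", "V10", "MSL",
            "XCH4_cams", "XCO2_cams", "XCO2_ct"]

# ASSUMPTION: input layer, three hidden layers, output layer.
# The hidden widths are illustrative, not taken from the paper.
LAYER_SIZES = [len(FEATURES), 64, 32, 16, 1]


def init_params(sizes, seed=0):
    """Create one (weights, biases) pair per connection between layers."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = math.sqrt(2.0 / n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def forward(params, x):
    """Propagate one feature vector through the fully connected layers."""
    activation = list(x)
    for index, (weights, biases) in enumerate(params):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        is_output = index == len(params) - 1
        # ASSUMPTION: ReLU on hidden layers, linear output for regression.
        activation = z if is_output else [max(0.0, v) for v in z]
    return activation[0]


params = init_params(LAYER_SIZES)
untrained_output = forward(params, [1.0] * len(FEATURES))
```

An untrained network like this returns an arbitrary number; only after training against satellite XCO₂ would the output be meaningful.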

All data were resampled to the specified spatial resolution of 0.1° × 0.1°. First, based on the location and time of the resampled satellite observations (LAT, LON, TIME), the corresponding information from the other datasets was retrieved. Missing values were set to 0, and a dataset was constructed. A total of 80% of the data were used as the training dataset and 20% as the testing dataset. These datasets were fed into the constructed model for training using Equation (1), followed by testing. The first half of the year (January to June) was denoted as FH, and the second half of the year (July to December) was denoted as SH. The sample sizes are shown in Table 1. After obtaining the models, the data for the desired region were read and fed into the model to obtain the temporal distribution of XCO₂ for the area. The flowchart is illustrated in Figure 2.

When generating the final long-term dataset, each model needs to make additional predictions for a certain period that overlaps with the adjacent model. We set this overlapping period to ten days to ensure better continuity between the models. Taking the 2019 FH model as an example, although the training data cover 1 January 2019 to 30 June 2019, the generated dataset covers 1 January 2019 to 10 July 2019. Similarly, for the 2019 SH model, the generated dataset covers 20 June 2019 to 10 January 2020. For the overlapping period, the XCO₂ values from the two models are averaged. The time variable also needs to be adjusted when the models are used during data generation. For example, when training the 2019 SH model, the maximum value of the time variable (Julian day) is 365 (31 December 2019). When this model is used to generate the XCO₂ dataset for 1 January 2020 to 10 January 2020, the time input should therefore run from 366 to 375. Similarly, when using the 2020 FH model to generate the dataset for 21 December 2019 to 31 December 2019, the time input should be from −10 to 0.
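The time adjustment and the averaging over the overlap can be sketched as below. This assumes the day-of-year convention given earlier (2 January 2019 has a Julian day of 2) is simply extended beyond the model's own year; the function names `model_time` and `blend_overlap` are hypothetical.

```python
from datetime import date


def model_time(day, reference_year):
    """Day count relative to the model's year, with 1 January equal to 1.

    Dates before 1 January give zero or negative values, and dates after
    31 December continue counting past the length of the year.
    """
    return (day - date(reference_year, 1, 1)).days + 1


def blend_overlap(earlier, later):
    """Average two models' per-cell predictions for a day both cover."""
    return [(a + b) / 2.0 for a, b in zip(earlier, later)]


# Time inputs for the 2020 FH model over 21-31 December 2019.
december_inputs = [model_time(date(2019, 12, d), 2020) for d in range(21, 32)]

# Hypothetical predictions (ppm) from two adjacent models for two grid cells.
blended = blend_overlap([410.0, 412.0], [411.0, 413.0])
```

Here `december_inputs` runs from −10 to 0, matching the range stated for the 2020 FH model.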

Because the overlapping period lies outside the training data of each individual model, this study refers to this time as the “unknown period”. Table 2 below presents the data time range used for training each DNN model and the unknown period that needs to be predicted for each model. For convenience, we name the models after their training time, such as 19FH for the model trained on the first half of 2019.

3. Results and Discussion

3.1. DNN Testing and Accuracy Verification

After training, the performance of each model on their respective test sets is shown in Figure 3. The X-axis represents the models corresponding to specific periods, such as 19FH for the model responsible for the first half of 2019.

It can be observed in Figure 3 that the accuracy curve of the predicted data closely resembles that of the CT data, indicating that the CT data contribute significantly to the feature importance of the model. Regarding accuracy, the predicted results have an average R of 0.89 and an RMSE of 0.82 ppm, while the CT data have an average R of 0.88 and an RMSE of 0.92 ppm. The CAMS data, on the other hand, have an average R of 0.79 and an RMSE of 1.41 ppm. The DNN-predicted data are thus more consistent with OCO-2 than the CT data, while the CAMS data perform the worst. This could be due to the significant differences between the atmospheric transmission models used in CAMS and OCO-2, or it could be attributed to the fact that both CT and OCO-2 data were corrected using TCCON ground-based observations during their production. Based on the test set’s performance, using DNN to predict data in satellite data gaps is feasible.

To further validate this, we conducted TCCON ground-based verification for each model, and the results are shown in Figure 4. It compares the predicted, CT, and CAMS XCO₂ data with the TCCON data, where the TCCON data are daily averages. The average R for the DNN-predicted data is 0.91, with an RMSE of 0.97 ppm, slightly outperforming the CT data with an average R of 0.90 and an RMSE of 1.07 ppm. The CAMS data show lower consistency with TCCON, with an average R of 0.85 and an RMSE of 1.83 ppm. The validation results against TCCON indicate that the individual XCO₂ prediction models constructed by the DNN exhibit good accuracy.
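The two accuracy measures used throughout, the correlation coefficient R and the RMSE, can be computed as in the following sketch. It assumes R is the Pearson correlation coefficient, which is the usual meaning but is not stated explicitly; the sample values are hypothetical.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between paired values (same units as input)."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(predicted, observed):
    """Pearson correlation coefficient between paired values."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


# Hypothetical daily-mean XCO2 values (ppm): model output vs. ground site.
model_values = [410.2, 411.0, 412.5, 413.1]
site_values = [410.0, 411.4, 412.0, 413.5]
scores = (pearson_r(model_values, site_values), rmse(model_values, site_values))
```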

3.2. Performance of Model in Unknown Time

We have defined a requirement for overlapping periods between adjacent models to ensure temporal continuity when generating the long-term time series dataset. These overlapping periods do not include the training data for the corresponding models, meaning they are “unknown” to the DNN models. In this study, the overlapping period is set to 10 days. Table 2 in Section 2.2 shows the data time range used for training each DNN model and the unknown period that needs to be predicted for each model in this section. The accuracy comparison is shown in Figure 5.

As shown in Figure 5, the DNN models (19FH, 19SH, and 20SH) achieve an R of 0.92 and an RMSE of 1.12 ppm during the unknown period, while the CT dataset achieves an R of 0.90 and an RMSE of 1.20 ppm within the same time range. It is worth noting that the accuracy of the DNN model for the unknown period of 20FH (R is 0.86, RMSE is 1.26 ppm) is lower than that of CT (R is 0.89, RMSE is 1.12 ppm). However, the predictions of the DNN model for the training period of 20FH are better than CT (refer to Figure 4). A possible reason is that, due to the lockdown impact of the pandemic in the first half of 2020 [41], the “patterns” learned by 20FH differed from those learned by 19SH and 20SH, resulting in lower accuracy when 20FH predicts its unknown period, which falls within the time ranges of 19SH and 20SH.

Based on the previous results, we will not discuss the CAMS dataset in this context. The performance of the DNN model during the unknown period is slightly better than that of CT but lower than its accuracy within the training data’s corresponding time range (average R is 0.91 and average RMSE is 0.97 ppm). The comparison between DNN and TCCON within the time range corresponding to the DNN model’s training data may include some training data, leading to better accuracy within that specific training time range.

3.3. Comparison of DNN and Other Datasets

By integrating the XCO₂ data generated by multiple DNN models, we obtained a long and continuous time series of XCO₂ data. These data were compared with TCCON data, as well as with CT and CAMS datasets. Figure 6 shows that the DNN models perform best, with an R of 0.94 and an RMSE of 0.98 ppm, slightly outperforming the CT dataset with an R of 0.94 and an RMSE of 1.07 ppm. However, the consistency between the CAMS dataset and TCCON is the lowest, with an R of 0.84 and an RMSE of 1.07 ppm. These results demonstrate the feasibility of using deep learning models to establish relationships between various data sources and XCO₂ to fill data gaps.

The predictions made by the DNN model achieved a spatial resolution of 0.1 degrees and a temporal resolution of 1 day, providing a higher coverage compared to traditional satellite measurements. Figure 7 shows the distribution of XCO₂ over China and its surroundings, together with magnified views of some regions. The satellite observations cover only 0.14% of the area, while the DNN model, CT, and CAMS achieve full coverage. It can be observed that the DNN XCO₂ retains more details compared to the other datasets. The spatial resolution of DNN XCO₂ is 0.1° × 0.1°, CT XCO₂ has a spatial resolution of 3° × 2°, and CAMS has a spatial resolution of 0.75° × 0.75°.

For the regional XCO₂ depicted in Figure 7, CAMS tends to overestimate XCO₂ concentrations overall. The limited satellite observations show an XCO₂ mean value of 414.61 ppm. Due to the significant contribution of CT XCO₂ to the feature importance of the DNN (see Section 3.4), the mean concentration of DNN (414.97 ppm) is close to the mean concentration of CT (415.15 ppm). However, the mean concentration of CAMS is relatively higher, at 416.17 ppm. This discrepancy may be attributed to the lower consistency between CAMS and OCO-2 measurements.

3.4. Correlation and Importance of Features

We merged four training datasets to research feature correlation. Since this study only considers the magnitude of correlation for ease of comparison with importance, we do not differentiate between positive and negative correlations. The correlation is shown in Figure 8.

Figure 8 shows that LAT, CAMS XCH₄, and existing model data (CAMS XCO₂ and CT XCO₂) exhibit a relatively high correlation with XCO₂. To assess feature importance, we re-entered the test set into the model, shuffling the selected feature columns each time and recording the resulting root mean square error. The larger the RMSE, the more important the feature, as shown in Figure 9.
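The shuffling procedure described here corresponds to what is commonly called permutation importance. A minimal sketch follows, assuming one feature column is shuffled at a time and that `predict` is any function mapping a feature row to a prediction; the toy model and data are hypothetical stand-ins for the trained DNN and its test set.

```python
import math
import random


def rmse(predicted, observed):
    """Root mean square error between paired values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def permutation_importance(predict, rows, targets, n_features, seed=0):
    """Return the baseline RMSE and the RMSE after shuffling each column."""
    rng = random.Random(seed)
    baseline = rmse([predict(r) for r in rows], targets)
    shuffled_rmse = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        # Rebuild every row with only column j replaced by shuffled values.
        permuted = [list(r[:j]) + [v] + list(r[j + 1:])
                    for r, v in zip(rows, column)]
        shuffled_rmse.append(rmse([predict(r) for r in permuted], targets))
    return baseline, shuffled_rmse


# Toy stand-in model: depends on feature 0 only, ignores feature 1.
def toy_predict(row):
    return 2.0 * row[0]


rows = [[float(i), float(i % 3)] for i in range(20)]
targets = [toy_predict(r) for r in rows]
baseline, shuffled_rmse = permutation_importance(toy_predict, rows, targets, 2)
```

In this toy case, shuffling the ignored feature leaves the RMSE at its baseline, while shuffling the feature the model relies on increases it.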

From Figure 9, CT XCO₂ has the highest contribution to the model accuracy, which explains its similarity to the performance of CT. CAMS XCO₂ has the second highest contribution, indicating that the accuracy of CAMS itself may not be ideal. Overall, vegetation data contribute less than meteorological data. However, this does not necessarily imply that meteorological factors are more important than vegetation factors. We selected five meteorological factors and three vegetation factors; each factor’s importance is similar, on average.

When comparing Figure 8 and Figure 9, it is clear that feature correlation alone does not fully account for feature importance. For instance, despite the significant difference in correlation between the features XCH₄, LAT, and SP, their importance is relatively low. Only the strongly correlated features CAMS XCO₂ and CT XCO₂ exhibit high importance. Correlation can only measure trends and may not accurately describe the complex non-linear relationships in deep learning. More research is needed to improve the interpretability of deep learning models. It is worth noting that although methane exhibits a high correlation, its importance is not high. This could be due to the influence of CAMS, as the accuracy of CAMS XCO₂ data is not high. We cannot determine the accuracy of CAMS XCH₄ data, which requires further investigation. If studying the relationship between CH₄ and CO₂, TCCON could be a better choice.

3.5. Spatial–Temporal Characteristics of China and Its Surrounding XCO₂

In this study, to investigate the spatiotemporal variations of XCO₂ in China better and consider the influences of atmospheric transport, we selected the regions of 0°–60° N and 70°–140° E as our study area. This region covers the entire territory of China and parts of India, Southeast Asia, Mongolia, and Russia. It exhibits diverse topography, including plateaus, plains, and hills. The region also experiences significant climate variations, such as tropical rainforest climates, tropical monsoon climates, and temperate monsoon climates. Population density varies significantly within this area, with high population density in eastern China, the east part of the Southeast Asian peninsula, and southeastern India. Low population density is represented in Mongolia and southern Russia. The region consists primarily of developing countries with labor-intensive industries, leading to high energy demand and significant anthropogenic emissions.

We calculated this region’s seasonal mean XCO₂ values, as shown in Figure 10. The graph reveals an overall increasing trend in XCO₂ concentrations within the area. The highest XCO₂ concentration was observed in spring 2020, reaching 414.23 ppm, while the lowest concentration occurred in summer 2019, at 409.13 ppm.
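A seasonal mean of this kind can be sketched as a grouping of daily regional means by season. The season boundaries below (spring as March to May, and so on), the choice to assign December to the winter of its own calendar year, and the sample values are all assumptions, since the text does not define them.

```python
from collections import defaultdict
from datetime import date

# ASSUMPTION: meteorological seasons for the Northern Hemisphere.
SEASONS = {12: "winter", 1: "winter", 2: "winter",
           3: "spring", 4: "spring", 5: "spring",
           6: "summer", 7: "summer", 8: "summer",
           9: "autumn", 10: "autumn", 11: "autumn"}


def seasonal_means(daily_means):
    """Average (date, value) pairs into one value per (year, season)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, value in daily_means:
        key = (day.year, SEASONS[day.month])
        totals[key] += value
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Hypothetical daily regional-mean XCO2 values in ppm.
sample = [(date(2019, 3, 1), 410.0),
          (date(2019, 4, 1), 412.0),
          (date(2019, 7, 1), 409.0)]
result = seasonal_means(sample)
```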

The concentrations in spring and winter were higher than in summer and autumn. This pattern can be attributed to several factors. In winter, central and northern China regions, representing densely populated areas, experience significant CO₂ emissions due to heating requirements [33]. Additionally, during spring, the thawing of frozen soil and the early recovery of vegetation result in increased respiration compared to photosynthesis. In summer and autumn, the XCO₂ concentrations are relatively lower. This could be attributed to the dominance of photosynthesis over respiration during this period and the influence of the summer monsoon, which brings frequent strong winds that prevent CO₂ from accumulating [42].

Figure 11 displays the spatial distribution of seasonal XCO₂ in the region. The southern region generally exhibits higher XCO₂ concentrations than the northern part, primarily due to population distribution. The areas of the north, such as Siberia, have a cold climate and sparse population. In contrast, the southern regions have a warmer climate and higher population density, significantly impacting human activities. The areas with high XCO₂ concentrations are mainly concentrated in eastern China, the Southeast Asian peninsula, and India. This is attributed to the concentration of population and industrial development in these areas, resulting in higher anthropogenic emissions [24].

The high XCO₂ values in India are mainly concentrated around the Kolkata industrial area (in the northeastern coastal region of India), which is characterized by heavy emissions from traditional industries such as coal and iron. It is worth noting that in summer, the CO₂ emitted from the Mumbai industrial area on the western coast of India and the Bengaluru industrial area in southern India is carried northward by the monsoon winds from the Indian Ocean until it is blocked by the Tibetan Plateau, producing areas of high concentration [42]. In the Southeast Asian peninsula, high XCO₂ concentrations are found mainly in the northern part. This is primarily due to the concentration of coal, petroleum, and metal mineral resources in the region’s north, leading to significant CO₂ emissions from industrial mining activities.

In China, high XCO₂ values are concentrated around the middle and lower reaches of the Yellow River and the Yangtze River, which are traditional industrial regions. Overall, especially in spring and summer, the distribution of XCO₂ resembles the “three-level staircase” described in Chinese geographical studies. The high-value areas are concentrated in the flat “first step” region, including North China, East China, and the middle reaches of the Yangtze River. This can be seen in Figure 11e, where high-value areas surround Beijing, Shijiazhuang, and Tianjin. This region is economically developed, with high-emission heavy industries as its main pillars. It is worth noting that in the southern region, represented by Guangzhou and Shenzhen, the CO₂ concentration is relatively low although the population density is high. This may be due to the intense monsoons in the area, which allow for rapid CO₂ dispersion. Additionally, the industries in the region are mainly light industries, resulting in relatively lower anthropogenic emissions [23,43].

In the “second step” region, the overall concentration values are slightly lower than in the “first step”. However, localized high-value areas appear near the Sichuan Basin, as evident in Figure 11d,e. This could be attributed to the high population density and associated emissions in the area. In addition, the mountains and rivers surrounding the basin slow down atmospheric transport, impeding the dispersion of CO₂. The “second step” region also contains localized high-value areas in Xinjiang (shown in Figure 11a,e), on either side of the Tianshan Mountains around Urumqi and Korla. This area is also a population concentration zone. The “third step” is mainly dominated by the Qinghai–Tibet Plateau. Based on these findings, we believe that the main factors influencing the distribution of CO₂ are anthropogenic impact and topography, a result that may help improve future models.

4. Discussion

Research on carbon sources and sinks has gradually shifted towards finer scales [44,45,46,47], focusing on global or national major power plants, waste treatment facilities, or regional carbon emission monitoring. The differences in CO₂ concentrations in various regions can be used to assess and verify carbon emissions and support the allocation and trading of carbon emission quotas. A high-resolution CO₂ dataset is essential as a background value or monitoring baseline, as it provides more accurate estimates and monitoring capabilities for carbon emissions. This helps support countries in formulating carbon reduction policies.

This study constructed a high-resolution CO₂ dataset with a resolution of 0.1° × 0.1° using deep learning models, with OCO-2 satellite data as the training data. It is worth noting that OCO-2 observations are often spatially close to each other. However, we believe this does not affect the independence between the training and testing sets and does not compromise this study’s rigor. In the input phase of the model, the latitude, longitude, and time of each OCO-2 observation were used as the basis, and the auxiliary data corresponding to that location were extracted. During training, the OCO-2 data were not normalized in advance to the final resolution, and the auxiliary data were not resampled to the same resolution as the original OCO-2 data. Consequently, even within the same grid cell, the multiple OCO-2 observations it contains differ at least in latitude and longitude. We therefore consider that the training and testing sets meet the independence criteria. However, we cannot yet determine how much the auxiliary data should be resampled. This choice should weigh the accuracy of the resampled auxiliary data against the precision of the satellite XCO₂ observations, with the accuracy of the final XCO₂ dataset serving as the evaluation criterion.
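The extraction of auxiliary data at each observation location can be illustrated with a nearest-grid-cell lookup. This is a minimal sketch assuming a regular latitude–longitude auxiliary grid and nearest-neighbour matching; the paper does not state which matching rule was used, and the function and argument names are hypothetical.

```python
import numpy as np

def sample_nearest(grid, grid_lats, grid_lons, obs_lats, obs_lons):
    """Return, for each observation, the value of the nearest grid cell.

    grid      : 2-D array of shape (n_lat, n_lon), one auxiliary variable
    grid_lats : cell-centre latitudes, length n_lat
    grid_lons : cell-centre longitudes, length n_lon
    obs_lats, obs_lons : observation coordinates, equal length
    """
    grid = np.asarray(grid)
    grid_lats = np.asarray(grid_lats, dtype=float)
    grid_lons = np.asarray(grid_lons, dtype=float)
    obs_lats = np.asarray(obs_lats, dtype=float)
    obs_lons = np.asarray(obs_lons, dtype=float)

    # Index of the closest cell centre along each axis (assumed matching rule).
    i = np.abs(grid_lats[:, None] - obs_lats[None, :]).argmin(axis=0)
    j = np.abs(grid_lons[:, None] - obs_lons[None, :]).argmin(axis=0)
    return grid[i, j]
```

Under this scheme, two soundings that fall in the same auxiliary grid cell receive identical auxiliary values and are distinguished in the input only by their own latitude, longitude, and time.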

Theoretically, this model can generate a final CO₂ dataset with a resolution of 0.01° × 0.02°. However, with the methodology described in this paper, the accuracy is not optimal, indicating room for improvement. On the one hand, regarding the selection of auxiliary data, we chose representative data commonly used in similar research articles and conducted comparative analyses of their relevance and importance. The results show that there is no specific relationship between relevance and importance. Therefore, determining which data are suitable for predicting XCO₂ will be one of the directions for our future research. On the other hand, we believe that CO₂ exhibits periodic variations over time. This periodicity could be incorporated into the model by adding the CO₂ values from the previous or subsequent weeks at a particular location to the input space. Alternatively, to capture the temporal aspect, recurrent neural networks (RNNs), such as long short-term memory (LSTM) networks, may lead to better accuracy.
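One way such temporal information could enter the input space is sketched below. It is a hypothetical illustration rather than part of the method evaluated in this paper: the helper names, the 7-day lag, and the sine/cosine encoding of the day of year are all assumptions.

```python
import numpy as np

def lag_features(series, lag_days=7):
    """Return columns [x(t - lag), x(t + lag)] for a daily series at one location.

    Entries are NaN where the lagged or leading value falls outside the series.
    """
    x = np.asarray(series, dtype=float)
    prev = np.full_like(x, np.nan)
    nxt = np.full_like(x, np.nan)
    prev[lag_days:] = x[:-lag_days]   # value one lag earlier
    nxt[:-lag_days] = x[lag_days:]    # value one lag later
    return np.column_stack([prev, nxt])

def annual_cycle_features(day_of_year, period=365.25):
    """Encode the day of year as a point on a circle so 31 Dec is close to 1 Jan."""
    angle = 2.0 * np.pi * np.asarray(day_of_year, dtype=float) / period
    return np.column_stack([np.sin(angle), np.cos(angle)])
```

The lag columns would require a gap-free prior estimate (for example, a reanalysis field) at every location, whereas the cyclic encoding needs only the observation date.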

5. Conclusions

This study serves as a quantitative complement to current machine learning or deep learning models for retrieving atmospheric trace gases. The proposed approach applies to atmospheric CO₂ monitoring or forecasting and can be extended to other chemical compounds, such as CH₄ or NOx. Researchers have already used deep neural networks to predict CO₂ emissions [48,49]. However, in this study, we did not extensively analyze the performance of individual models in an unknown period. This limitation arises because the training data used in this study coincide with the period primarily covered by the generated models. Further research is needed to investigate the complexity of the models and the longest period each model can support. Additionally, the impact of the number of hidden layers or neurons on the final accuracy of the models has not been extensively discussed in this study and will be one of the future research directions.

In this study, using multi-source data on a global scale, we constructed seamless XCO₂ data using multiple DNN models, reducing operational difficulties and training time. The generated XCO₂ dataset has a spatial resolution of 0.1° and a temporal resolution of 1 day. Compared to other datasets, the data constructed by the DNN models exhibit significant advantages in terms of spatial and temporal resolution. The individual models perform well in predicting different periods, and the global results from 2019 to 2020, compared with TCCON data, demonstrate their accuracy (R = 0.94, RMSE = 0.98 ppm). Finally, based on the DNN method, we reconstructed the XCO₂ dataset for China and its surroundings (region: 0°–60° N and 70°–140° E) for the years 2019–2020. Temporally, XCO₂ concentrations in this region show an increasing trend, while spatially, the high XCO₂ values align with densely populated areas.
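For reference, the two validation statistics quoted above (R and RMSE) can be computed from paired model and TCCON values as in the following sketch. It assumes R denotes the Pearson correlation coefficient and that pairs with a missing value are dropped; the function name is hypothetical.

```python
import numpy as np

def r_and_rmse(pred, ref):
    """Pearson correlation and root-mean-square error between two paired series."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    ok = ~(np.isnan(pred) | np.isnan(ref))   # keep only complete pairs
    p, t = pred[ok], ref[ok]
    r = np.corrcoef(p, t)[0, 1]
    rmse = np.sqrt(np.mean((p - t) ** 2))    # same unit as the inputs (ppm)
    return float(r), float(rmse)
```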

Author Contributions

Methodology, W.T. and T.Y.; software, W.T. and W.Z.; validation, C.W.; formal analysis, W.T. and L.Z.; investigation, C.W. and D.Y.; resources, C.W.; data curation, W.T. and L.Z.; writing—original draft preparation, W.T.; writing—review and editing, L.Z.; visualization, C.W.; supervision, W.Z. and D.Y.; project administration, T.Y.; funding acquisition, T.Y. and D.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Hainan Province (grant 423MS113), the National Key Research and Development Program of China (grant 2023YFB4004503), and a grant from the State Key Laboratory of Resources and Environmental Information System, Chinese Academy of Sciences.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data of XCO₂ are not publicly available due to privacy concerns.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Sun, Y.; Yin, H.; Wang, W.; Shan, C.; Notholt, J.; Palm, M.; Liu, K.; Chen, Z.; Liu, C. Monitoring greenhouse gases (GHGs) in China: Status and perspective. Atmos. Meas. Tech. 2022, 15, 4819–4834.
2. Tarasova, O.; Vermeulen, A.; Sawa, Y.; Houweling, S.; Dlugokencky, E. The State of Greenhouse Gases in the Atmosphere Using Global Observations through 2021; Copernicus Meetings: New York, NY, USA, 2023.
3. Van Soest, H.L.; den Elzen, M.G.; van Vuuren, D.P. Net-zero emission targets for major emitting countries consistent with the Paris Agreement. Nat. Commun. 2021, 12, 2140.
4. Zhao, X.; Ma, X.; Chen, B.; Shang, Y.; Song, M. Challenges toward carbon neutrality in China: Strategies and countermeasures. Resour. Conserv. Recycl. 2022, 176, 105959.
5. Wang, Z.; Ma, P.; Zhang, L.; Chen, H.; Zhao, S.; Zhou, W.; Chen, C.; Zhang, Y.; Zhou, C.; Mao, H.; et al. Systematics of atmospheric environment monitoring in China via satellite remote sensing. Air Qual. Atmos. Health 2021, 14, 157–169.
6. Liangyun, L.; Liangfu, C.; Yi, L.; Dongxu, Y.; Zhang, X.; Naimeng, L.; Jiang, F.; Zengshan, Y.; Guohua, L.; Longfei, T. Satellite remote sensing for global stocktaking: Methods, progress and perspectives. Natl. Remote Sens. Bull. 2022, 26, 243–267.
7. Boesch, H.; Toon, G.C.; Sen, B.; Washenfelder, R.A.; Wennberg, P.O.; Buchwitz, M.; de Beek, R.; Burrows, J.P.; Crisp, D.; Christi, M.; et al. Space-based near-infrared CO₂ measurements: Testing the Orbiting Carbon Observatory retrieval algorithm and validation concept using SCIAMACHY observations over Park Falls, Wisconsin. J. Geophys. Res. Atmos. 2006, 111, D007080.
8. Inoue, M.; Morino, I.; Uchino, O.; Miyamoto, Y.; Yoshida, Y.; Yokota, T.; Machida, T.; Sawa, Y.; Matsueda, H.; Sweeney, C.; et al. Validation of XCO₂ derived from SWIR spectra of GOSAT TANSO-FTS with aircraft measurement data. Atmos. Chem. Phys. 2013, 13, 9771–9788.
9. Sun, Y.; Frankenberg, C.; Jung, M.; Joiner, J.; Guanter, L.; Kohler, P.; Magney, T. Overview of Solar-Induced chlorophyll Fluorescence (SIF) from the Orbiting Carbon Observatory-2: Retrieval, cross-mission comparison, and global monitoring for GPP. Remote Sens. Environ. 2018, 209, 808–823.
10. Gogoi, M.M.; Babu, S.S.; Imasu, R.; Hashimoto, M. Satellite (GOSAT-2 CAI-2) retrieval and surface (ARFINET) observations of aerosol black carbon over India. Atmos. Chem. Phys. 2023, 23, 8059–8079.
11. Taylor, T.E.; Eldering, A.; Merrelli, A.; Kiel, M.; Somkuti, P.; Cheng, C.; Rosenberg, R.; Fisher, B.; Crisp, D.; Basilio, R.; et al. OCO-3 early mission operations and initial (vEarly) XCO₂ and SIF retrievals. Remote Sens. Environ. 2020, 251, 112032.
12. Du, S.; Liu, L.; Liu, X.; Zhang, X.; Zhang, X.; Bi, Y.; Zhang, L. Retrieval of global terrestrial solar-induced chlorophyll fluorescence from TanSat satellite. Sci. Bull. 2018, 63, 1502–1512.
13. He, Z.; Lei, L.; Zhang, Y.; Sheng, M.; Wu, C.; Li, L.; Zeng, Z.-C.; We, L.R. Spatio-Temporal Mapping of Multi-Satellite Observed Column Atmospheric CO₂ Using Precision-Weighted Kriging Method. Remote Sens. 2020, 12, 576.
14. Johnson, M.S.; Schwandner, F.M.; Potter, C.S.; Nguyen, H.M.; Bell, E.; Nelson, R.R.; Philip, S.; O’Dell, C.W. Carbon Dioxide Emissions During the 2018 Kilauea Volcano Eruption Estimated Using OCO-2 Satellite Retrievals. Geophys. Res. Lett. 2020, 47, e2020gl090507.
15. Su, X.; Wang, L.; Gui, X.; Yang, L.; Li, L.; Zhang, M.; Qin, W.; Tao, M.; Wang, S.; Wang, L. Retrieval of total and fine mode aerosol optical depth by an improved MODIS Dark Target algorithm. Environ. Int. 2022, 166, 107343.
16. Bhattacharjee, S.; Dill, K.; Chen, J. Forecasting Interannual Space-based CO₂ Concentration using Geostatistical Mapping Approach. In Proceedings of the 2020 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT), Bangalore, India, 2–4 July 2020; pp. 1–6.
17. Li, J.; Jia, K.; Wei, X.; Xia, M.; Chen, Z.; Yao, Y.; Zhang, X.; Jiang, H.; Yuan, B.; Tao, G.; et al. High-spatiotemporal resolution mapping of spatiotemporally continuous atmospheric CO₂ concentrations over the global continent. Int. J. Appl. Earth Obs. Geoinf. 2022, 108, 102743.
18. Siabi, Z.; Falahatkar, S.; Alavi, S.J. Spatial distribution of XCO₂ using OCO-2 data in growing seasons. J. Environ. Manag. 2019, 244, 110–118.
19. Sun, B.; Han, S.; Li, W. Effects of the polycentric spatial structures of Chinese city regions on CO₂ concentrations. Transp. Res. Part D-Transp. Environ. 2020, 82, 102333.
20. Wei, C.; Wang, M.; Fu, Q.; Dai, C.; Huang, R.; Bao, Q. Temporal characteristics of greenhouse gases (CO₂ and CH₄) in the megacity Shanghai, China: Association with air pollutants and meteorological conditions. Atmos. Res. 2020, 235, 104759.
21. Zhang, H.; Peng, J.; Wang, R.; Zhang, J.; Yu, D. Spatial planning factors that influence CO₂ emissions: A systematic literature review. Urban Clim. 2021, 36, 100809.
22. Zeng, J.; Nojiri, Y.; Nakaoka, S.i.; Nakajima, H.; Shirai, T. Surface ocean CO₂ in 1990–2011 modelled using a feed-forward neural network. Geosci. Data J. 2015, 2, 47–51.
23. He, C.; Ji, M.; Li, T.; Liu, X.; Tang, D.; Zhang, S.; Luo, Y.; Grieneisen, M.L.; Zhou, Z.; Zhan, Y. Deriving Full-Coverage and Fine-Scale XCO₂ Across China Based on OCO-2 Satellite Retrievals and CarbonTracker Output. Geophys. Res. Lett. 2022, 49, e2022gl098435.
24. Zhang, M.; Liu, G. Mapping contiguous XCO₂ by machine learning and analyzing the spatio-temporal variation in China from 2003 to 2019. Sci. Total Environ. 2023, 858, 159588.
25. Zhang, L.; Li, T.; Wu, J. Deriving gapless CO₂ concentrations using a geographically weighted neural network: China, 2014–2020. Int. J. Appl. Earth Obs. Geoinf. 2022, 114, 103063.
26. Jin, C.; Xue, Y.; Yuan, T.; Zhao, L.; Jiang, X.; Sun, Y.; Wu, S.; Wang, X. Retrieval anthropogenic CO₂ emissions from OCO-2 and comparison with gridded emission inventories. J. Clean. Prod. 2024, 448, 141418.
27. Philip, S.; Johnson, M.S.; Baker, D.F.; Basu, S.; Tiwari, Y.K.; Indira, N.K.; Ramonet, M.; Poulter, B. OCO-2 Satellite-Imposed Constraints on Terrestrial Biospheric CO₂ Fluxes Over South Asia. J. Geophys. Res. Atmos. 2022, 127, e2021jd035035.
28. Peters, W.; Jacobson, A.R.; Sweeney, C.; Andrews, A.E.; Conway, T.J.; Masarie, K.; Miller, J.B.; Bruhwiler, L.M.P.; Petron, G.; Hirsch, A.I.; et al. An atmospheric perspective on North American carbon dioxide exchange: CarbonTracker. Proc. Natl. Acad. Sci. USA 2007, 104, 18925–18930.
29. Chen, H.W.; Zhang, L.N.; Zhang, F.; Davis, K.J.; Lauvaux, T.; Pal, S.; Gaudet, B.; DiGangi, J.P. Evaluation of Regional CO₂ Mole Fractions in the ECMWF CAMS Real-Time Atmospheric Analysis and NOAA CarbonTracker Near-Real-Time Reanalysis With Airborne Observations From ACT-America Field Campaigns. J. Geophys. Res.-Atmos. 2019, 124, 8119–8133.
30. Inness, A.; Ades, M.; Agusti-Panareda, A.; Barre, J.; Benedictow, A.; Blechschmidt, A.-M.; Dominguez, J.J.; Engelen, R.; Eskes, H.; Flemming, J.; et al. The CAMS reanalysis of atmospheric composition. Atmos. Chem. Phys. 2019, 19, 3515–3556.
31. Guevara, M.; Jorba, O.; Tena, C.; van der Gon, H.D.; Kuenen, J.; Elguindi, N.; Darras, S.; Granier, C.; Perez Garcia-Pando, C. Copernicus Atmosphere Monitoring Service TEMPOral profiles (CAMS-TEMPO): Global and European emission temporal profile maps for atmospheric chemistry modelling. Earth Syst. Sci. Data 2021, 13, 367–404.
32. Zhang, L.; Li, T.; Wu, J.; Yang, H. Global estimates of gap-free and fine-scale CO₂ concentrations during 2014–2020 from satellite and reanalysis data. Environ. Int. 2023, 178, 108057.
33. Li, T.; Wu, J.; Wang, T. Generating daily high-resolution and full-coverage XCO₂ across China from 2015 to 2020 based on OCO-2 and CAMS data. Sci. Total Environ. 2023, 893, 164921.
34. Malina, E.; Veihelmann, B.; Buschmann, M.; Deutscher, N.M.; Feist, D.G.; Morino, I. On the consistency of methane retrievals using the Total Carbon Column Observing Network (TCCON) and multiple spectroscopic databases. Atmos. Meas. Tech. 2022, 15, 2377–2406.
35. Messerschmidt, J.; Geibel, M.C.; Blumenstock, T.; Chen, H.; Deutscher, N.M.; Engel, A.; Feist, D.G.; Gerbig, C.; Gisi, M.; Hase, F.; et al. Calibration of TCCON column-averaged CO₂: The first aircraft campaign over European TCCON sites. Atmos. Chem. Phys. 2011, 11, 10765–10777.
36. Bai, J.; Zhang, H.; Sun, R.; Li, X.; Xiao, J.; Wang, Y. Estimation of global GPP from GOME-2 and OCO-2 SIF by considering the dynamic variations of GPP-SIF relationship. Agric. For. Meteorol. 2022, 326, 109180.
37. Zhang, Y.; Joiner, J.; Alemohammad, S.H.; Zhou, S.; Gentine, P. A global spatially contiguous solar-induced fluorescence (CSIF) dataset using neural networks. Biogeosciences 2018, 15, 5779–5800.
38. Meng, L.; Liu, H.; Zhang, X.; Ren, C.; Ustin, S.; Qiu, Z.; Xu, M.; Guo, D. Assessment of the effectiveness of spatiotemporal fusion of multi-source satellite images for cotton yield estimation. Comput. Electron. Agric. 2019, 162, 44–52.
39. Munoz-Sabater, J.; Dutra, E.; Agusti-Panareda, A.; Albergel, C.; Arduini, G.; Balsamo, G.; Boussetta, S.; Choulga, M.; Harrigan, S.; Hersbach, H.; et al. ERA5-Land: A state-of-the-art global reanalysis dataset for land applications. Earth Syst. Sci. Data 2021, 13, 4349–4383.
40. Zhu, J.; Xie, A.; Qin, X.; Wang, Y.; Xu, B.; Wang, Y. An Assessment of ERA5 Reanalysis for Antarctic Near-Surface Air Temperature. Atmosphere 2021, 12, 217.
41. Willems, S.; Vanden Bussche, P.; Van Poel, E.; Collins, C.; Klemenc-Ketis, Z.; Pati, E.Q.E.A.Q. Moving forward after the COVID-19 pandemic: Lessons learned in primary care from the multi-country PRICOV-19 study. Eur. J. Gen. Pract. 2024, 30, 2328716.
42. Das, C.; Kunchala, R.K.; Chandra, N.; Chhabra, A.; Pandya, M.R. Characterizing the regional XCO₂ variability and its association with ENSO over India inferred from GOSAT and OCO-2 satellite observations. Sci. Total Environ. 2023, 902, 166176.
43. Wu, C.; Ju, Y.; Yang, S.; Zhang, Z.; Chen, Y. Reconstructing annual XCO₂ at a 1 km × 1 km spatial resolution across China from 2012 to 2019 based on a spatial CatBoost method. Environ. Res. 2023, 236, 116866.
44. Gao, P.; Yue, S.; Chen, H. Carbon emission efficiency of China’s industry sectors: From the perspective of embodied carbon emissions. J. Clean. Prod. 2021, 283, 124655.
45. Ge, W.; Cao, H.; Li, H.; Zhang, Q.; Wen, X.; Zhang, C.; Mativenga, P. Data-driven carbon emission accounting for manufacturing systems based on meta-carbon-emission block. J. Manuf. Syst. 2024, 74, 141–156.
46. Wang, W.; Hu, Y.; Lu, Y. Driving forces of China’s provincial bilateral carbon emissions and the re-definition of corresponding responsibilities. Sci. Total Environ. 2023, 857, 159404.
47. Zhang, C.; Song, K.; Wang, H.; Randhir, T.O. Carbon budget management in the civil aviation industry using an interactive control perspective. Int. J. Sustain. Transp. 2020, 15, 30–39.
48. Ji, Z.; Song, H.; Lei, L.; Sheng, M.; Guo, K.; Zhang, S. A Novel Approach for Predicting Anthropogenic CO₂ Emissions Using Machine Learning Based on Clustering of the CO₂ Concentration. Atmosphere 2024, 15, 323.
49. Zhang, Y.; Liu, X.; Lei, L.; Liu, L. Estimating Global Anthropogenic CO₂ Gridded Emissions Using a Data-Driven Stacked Random Forest Regression Model. Remote Sens. 2022, 14, 3899.

Figure 1. Distribution of TCCON sites (source: https://tccondata.org, accessed on 25 March 2024).

Figure 2. Flowchart of the training, testing, and validation process for a single DNN model.

Figure 3. Model in test dataset (the red line is the comparison of DNN, the yellow one is the comparison of CAMS, and the blue one is the comparison of CT).

Figure 4. Model validation with TCCON globally (the red line is the comparison of DNN, the yellow one is the comparison of CAMS, and the blue one is the comparison of CT).

Figure 5. Comparison of the dataset with TCCON in an unknown period. (a) DNN and TCCON; (b) CT and TCCON. (The blue points are from 19FH, the black points are from 19SH, and the yellow points are from 20SH.)

Figure 6. Comparison between XCO₂ datasets. (a) DNN and TCCON; (b) CT and TCCON; (c) CAMS and TCCON.

Figure 7. Comparison of XCO₂ distributions over China and its surroundings (date: 12 December 2020; region: 0°–60° N, 70°–140° E). (a) Regional XCO₂ by OCO-2; (b) Regional XCO₂ by DNN; (c) Regional XCO₂ by CT; (d) Regional XCO₂ by CAMS.

Figure 8. Correlation of features.

Figure 9. Importance of features.

Figure 10. XCO₂ changes in four seasons (region: 0°–60° N 70°–140° E).

Table 1. Training sample size of each model.

Period	Time	2019	2020
FH	1 January–30 June	170 × 10⁵	161 × 10⁵
SH	1 July–31 December	169 × 10⁵	164 × 10⁵

Table 2. The unknown periods predicted by each model.

Model	Training Period	Unknown Period
19FH	1 January 2019–30 June 2019	1 July 2019–10 July 2019
19SH	1 July 2019–31 December 2019	20 June 2019–30 June 2019; 1 January 2020–10 January 2020
20FH	1 January 2020–30 June 2020	21 December 2019–31 December 2019; 1 July 2020–10 July 2020
20SH	1 July 2020–31 December 2020	20 June 2020–30 June 2020

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
