Article

Forecasting the Total Output Value of Agriculture, Forestry, Animal Husbandry, and Fishery in Various Provinces of China via NPP-VIIRS Nighttime Light Data

Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(19), 8752; https://doi.org/10.3390/app14198752
Submission received: 20 August 2024 / Revised: 25 September 2024 / Accepted: 26 September 2024 / Published: 27 September 2024

Abstract

This paper aims to establish an accurate and timely forecasting model for the total output value of agriculture, forestry, animal husbandry, and fishery (TOVAFAF) in the provinces of China using NPP-VIIRS nighttime light (NTL) remote sensing data and machine learning algorithms. Such a model can provide important data references for the timely assessment of agricultural economic development and for policy adjustment. First, multiple NTL indices for the provincial-level administrative regions of China were constructed from NTL images from 2013 to 2023 using various statistics. Correlation analysis and significance testing show that the constructed total nighttime light index (TNLI), luminous pixel quantity index (LPQI), luminous pixel ratio index (LPRI), and nighttime light squared deviation sum index (NLSDSI) are highly correlated with the TOVAFAF. Subsequently, with the data from 2013 to 2020 as the training set, each of the four NTL indices was used as a single independent variable to establish linear, exponential, logarithmic, power exponential, and polynomial models. The four NTL indices were also used together as input features to establish multiple linear regression (MLR), extreme learning machine (ELM), and particle swarm optimization-ELM (PSO-ELM) models. The data from 2021 to 2022 served as the validation set for adjusting and optimizing the model weight parameters and for a preliminary evaluation of the modeling effect. Finally, the established models were employed to forecast the TOVAFAF in 2023. The experimental results show that the ELM and PSO-ELM models explore and characterize the potential nonlinear relationship between NTL data and the TOVAFAF better than the MLR model and all the models based on a single NTL index. The PSO-ELM model achieves the best forecasting effect in 2023, with an MRE of 32.20% and an R² of 0.6460 for the linear relationship between the actual and forecast values.

1. Introduction

Agriculture is the key foundation for supporting human survival and development, ensuring national stability and security, and achieving high-quality development of the national economy. If agricultural development is not strengthened, national food security will be threatened, people’s quality of life will be greatly affected, and the national economy will also be seriously hindered [1]. The total output value of agriculture, forestry, animal husbandry, and fishery (TOVAFAF) can reflect the total scale and achievements of agricultural production during a certain period of time, which is an important indicator for evaluating the level of agricultural economic development [2,3]. The healthy development of agriculture, forestry, animal husbandry, and fishery is conducive to adjusting the urban and rural economic structure, promoting regional economic and social development, increasing farmers’ income, and improving the urban and rural ecological environment [4]. Therefore, it is of great significance for grasping the trend of agricultural economic development and adjusting policies to accurately and timely obtain the TOVAFAF [3].
TOVAFAF, as the important basis for calculating the gross domestic product (GDP), mainly comes from surveys conducted by the National Bureau of Statistics or other administrative departments. Although the data obtained in this way are very reliable and authoritative, this method consumes a lot of time, manpower, and financial resources [5,6]. In recent years, nighttime light (NTL) remote sensing images have been widely used to quantify human activities and assess socio-economic development, including the evaluation and prediction of population [7,8], carbon dioxide emission [9,10], urbanization process [11,12], electricity consumption [13,14], GDP [6,15], poverty level [16,17], and other socio-economic indicators. NTL remote sensing technology can detect and record weak visible and near-infrared radiation emitted from residential areas, buildings, streets, traffic, and other surface areas [18,19]. And the radiation intensity of these lights reflects the intensity of human activities and the level of socio-economic development on the whole [20]. Furthermore, NTL remote sensing data possess the advantages of wide coverage, short periodicity, and availability [21].
Since numerous studies have demonstrated a close correlation between NTL data and socio-economic indicators, some scholars have directly used NTL data to characterize the level and differentiation of socio-economic development [18,19,22,23,24]. For example, after revealing the correlation between county-level statistical economic data and NTL data, Kuang et al. [22] analyzed the temporal and spatial evolution characteristics of county-level economic development using NTL data. The results showed that NTL data and statistical economic data were consistent in their cold and hot spots. Chen et al. [23] constructed NTL landscape metrics from NTL data to reveal the economic development and differentiation between urban and rural areas in Fujian Province, China, from multiple perspectives. Empirical results indicated that the NTL landscape metrics were valuable indicators for analyzing the distribution and evolution of the economy. However, these studies only characterized socio-economic data qualitatively using NTL data and did not develop regression models to quantitatively estimate the level and differentiation of socio-economic development, which may lead to unstable and uncertain results.
Instead of only using NTL data, some scholars constructed regression models by selecting the NTL indices that were highly correlated with socio-economic indicators, so as to evaluate socio-economic indicators and analyze the level and differentiation of socio-economic development [25,26,27,28,29]. For example, Guo et al. [26] used the normalized total radiation index to establish linear and nonlinear regression models for the GDP of various provinces in China. The results showed that the determination coefficients of linear, power law, and logistic models were all above 0.8, laying the foundation for the estimation and prediction of GDP based on NTL data [26]. Ji et al. [28] utilized the GDP growth rate to address the saturation problem of NTL data obtained from the Defense Meteorological Satellite Program-Operational Linescan System (DMSP-OLS) and established a linear regression model for county-level GDP using NTL density data. The results showed that the estimated county-level GDP was consistent with the authoritative county-level GDP statistics, providing an effective method for grasping the development of county-level economy [28]. Most of these studies tended to use a single NTL index to construct regression models for socio-economic indicators, such as the total NTL index and the average NTL index. Additionally, the established regression models were usually conventional, such as linear regression, quadratic polynomial regression, and exponential regression.
This paper attempts to use NTL data to evaluate and forecast the TOVAFAF. In theory, NTL data from rural areas should be obtained and used for this analysis. In practice, however, NTL data mainly come from light sources in urban and urbanized areas, while rural areas have far fewer light sources. In this situation, conventional models established using a single NTL index may omit important information and fail to fully reflect the potential relationship between NTL data and the TOVAFAF. Han et al. [30] showed that the correlation between NTL data and the primary industry was not obvious, and the determination coefficient of the log-linear regression model established using the optimal lighting area index was only 0.306. Despite these difficulties, it is still feasible to evaluate and forecast the TOVAFAF using NTL data. The reason is that GDP is closely correlated with NTL data [6,28,29], and the TOVAFAF is an important component of GDP, so a certain correlation between the TOVAFAF and NTL data can be expected. Yong et al. [16] constructed multiple NTL indices as input features and established a fitting model combining particle swarm optimization (PSO) with the back propagation (BP) artificial neural network algorithm to evaluate the poverty level in southwestern China, which achieved good results. However, the BP neural network algorithm must continuously adjust its weights through gradient descent during training, which leads to slow convergence and makes it prone to getting stuck in local minima [31]. Although the PSO algorithm is usually employed to optimize the initial weights of the BP algorithm, the final weights still have to be obtained through gradient descent.
Extreme learning machine (ELM) proposed by Huang [32] is a single hidden layer feedforward neural network. Instead of constantly adjusting weight values during training, ELM randomly generates the input weight and the hidden layer bias, and obtains the output weight through the generalized inverse matrix theory, which possesses good generalization performance and fast learning ability. Given the above advantages, ELM has been widely used in classification and regression applications [33,34].
Inspired by reference [16], this paper aims to establish a forecasting model for the TOVAFAF in the provinces of China using multiple NTL indices and machine learning algorithms, so as to provide data reference and technical support for the timely assessment of agricultural economic development. The main objectives of this paper are: (1) to construct multiple NTL indices using various statistics and analyze the correlation between each NTL index and the TOVAFAF together with its significance; (2) to establish linear and nonlinear forecasting models for the TOVAFAF in the provinces of China based on a single NTL index and on multiple NTL indices, in order to explore the potential relationship between NTL data and the TOVAFAF; (3) to optimize the input weights and hidden layer biases of ELM using the PSO algorithm to improve the forecasting accuracy of the ELM model.

2. Materials and Methods

2.1. Study Area

China is located in the eastern part of Asia and on the west coast of the Pacific Ocean. China covers a land area of approximately 9.6 million km² and a sea area of approximately 4.73 million km², as well as a mainland coastline of approximately 18,000 km. At present, China is divided into 34 provincial-level administrative regions (Figure 1), including 23 provinces, five autonomous regions, four municipalities directly under the central government, and two special administrative regions.
In this paper, Taiwan Province, the Macao Special Administrative Region, and Hong Kong Special Administrative Region of China are not included in the scope of this study due to the lack of relevant statistical data. Additionally, Beijing and Shanghai, as modern and internationalized provincial cities in China, have focused on developing the tertiary industry, while the output value of the primary industry (agriculture, forestry, animal husbandry, and fishery) accounts for a relatively small proportion of the total economic output and has continued to decline in recent years. For other provincial-level administrative regions of China, the TOVAFAF shows an upward trend year by year. Therefore, Beijing and Shanghai are also not included in the scope of this study. In summary, the remaining 29 provincial-level administrative regions in China are taken as the objects of this study.

2.2. Data Sources

In recent years, the NTL data sources obtained from the Defense Meteorological Satellite Program-Operational Linescan System (DMSP-OLS) and the Suomi National Polar-orbiting Partnership-Visible Infrared Imaging Radiometer Suite (NPP-VIIRS) have been the most commonly used [16], both of which are provided by the National Oceanic and Atmospheric Administration (NOAA) of the United States. The DMSP-OLS dataset provides global NTL maps from 1992 to 2013 with a spatial resolution of 30 arc-seconds, while the NPP-VIIRS dataset provides global NTL maps from 2012 up to the present with a spatial resolution of 15 arc-seconds. Compared with the DMSP-OLS dataset, the NPP-VIIRS dataset possesses higher spatial resolution and avoids the impact of brightness saturation problems. Furthermore, the NPP-VIIRS dataset is continuously updated for future research. Therefore, the annual VIIRS nighttime lights version 2 (VNL V2) product of the NPP-VIIRS satellite created from monthly cloud-free composites was selected as the data source of this study, which can be downloaded from https://eogdata.mines.edu/products/vnl/ (accessed on 20 September 2023).
The NTL remote sensing images for provincial-level administrative regions of China from 2013 to 2023 were obtained and preprocessed by projection conversion, noise removal, and outlier processing for subsequent research and analysis [35]. Taking the NTL images from 2021 as examples, Figure 2 shows the original and preprocessed NTL images of various provincial-level administrative regions of China. Meanwhile, the TOVAFAF for the corresponding provincial-level administrative regions from 2013 to 2023 were obtained from the website of the National Bureau of Statistics of China. It should be mentioned that the TOVAFAF is expressed in the form of currency, which is calculated by multiplying the output of agriculture, forestry, animal husbandry, fishery, and their by-products by the price of their respective unit products. In this paper, the unit of the TOVAFAF for each provincial-level administrative region is expressed in 100 million yuan.

2.3. Construction for NTL Indices

Multiple NTL indices were constructed using various statistics such as sum, mean, variance, and standard deviation to fully reflect the concentration and dispersion of NTL distribution in each provincial-level administrative region. These NTL indices will be employed as feature variables to establish the forecasting model for the TOVAFAF. Table 1 lists and describes the constructed NTL indices.
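As a rough illustration of how such indices might be computed from one region's radiance raster, the sketch below derives the four indices named in the abstract. The exact definitions are given in Table 1, which is not reproduced here, so the formulas, the function name `ntl_indices`, and the `lit_threshold` parameter are assumptions inferred from the index names rather than the authors' definitions.

```python
import numpy as np

def ntl_indices(radiance, lit_threshold=0.0):
    """Compute four illustrative NTL indices for one region.

    radiance: 2-D array of pixel radiance; pixels outside the region are NaN.
    lit_threshold: hypothetical cutoff above which a pixel counts as luminous.
    """
    pixels = radiance[~np.isnan(radiance)]   # pixels inside the region
    lit = pixels[pixels > lit_threshold]     # luminous pixels only
    tnli = lit.sum()                         # total nighttime light (assumed: sum of lit radiance)
    lpqi = lit.size                          # luminous pixel quantity
    lpri = lit.size / pixels.size            # luminous pixel ratio
    # Sum of squared deviations from the mean lit radiance (assumed definition)
    nlsdsi = ((lit - lit.mean()) ** 2).sum() if lit.size else 0.0
    return {"TNLI": float(tnli), "LPQI": int(lpqi),
            "LPRI": float(lpri), "NLSDSI": float(nlsdsi)}

# Toy 2 x 3 raster with one pixel outside the region
region = np.array([[0.0, 2.0, np.nan],
                   [4.0, 0.0, 6.0]])
indices = ntl_indices(region)
print(indices)
```

In practice the raster would come from the preprocessed VNL V2 image clipped to each provincial boundary; that step is omitted here.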

2.4. Extreme Learning Machine

Extreme learning machine (ELM) is used for regression analysis in this paper, so this section introduces the principle of ELM in its regression form.
Given N distinct samples $\{(\mathbf{x}_i, \mathbf{t}_i) \mid \mathbf{x}_i \in \mathbb{R}^n,\ \mathbf{t}_i \in \mathbb{R}^m,\ i = 1, 2, \ldots, N\}$, $\mathbf{x}_i = [x_{i1}, x_{i2}, \ldots, x_{in}]$ represents the input vector with n inputs, and $\mathbf{t}_i = [t_{i1}, t_{i2}, \ldots, t_{im}]$ represents the corresponding output vector with m outputs. In this paper, the input vector $\mathbf{x}_i$ is composed of n NTL indices, and the TOVAFAF is taken as the corresponding expected output and is a single numerical value, so the dimension m of the output vector $\mathbf{t}_i$ equals 1 in the regression form of ELM. The mathematical model of ELM for the regression form is shown below:
$$\mathbf{H}_{N \times L}\, \boldsymbol{\beta}_{L \times 1} = \mathbf{T}_{N \times 1}, \tag{1}$$
where H represents the hidden layer output matrix, i.e.,
$$\mathbf{H}_{N \times L} = \begin{bmatrix} g(\mathbf{x}_1 \mathbf{w}_1 + b_1) & \cdots & g(\mathbf{x}_1 \mathbf{w}_L + b_L) \\ \vdots & \ddots & \vdots \\ g(\mathbf{x}_N \mathbf{w}_1 + b_1) & \cdots & g(\mathbf{x}_N \mathbf{w}_L + b_L) \end{bmatrix}, \tag{2}$$
where $\mathbf{w}_i = [w_{i1}, w_{i2}, \ldots, w_{in}]^{T}$ represents the input weight vector connecting the input layer and the ith hidden layer neuron, $b_i$ represents the bias of the ith hidden layer neuron, L represents the number of hidden layer neurons, and $g(\cdot)$ represents the activation function. Furthermore, β represents the output weight vector connecting the hidden layer and output layer, i.e.,
$$\boldsymbol{\beta}_{L \times 1} = [\beta_1, \beta_2, \ldots, \beta_L]^{T}, \tag{3}$$
and T represents the target output vector, i.e.,
$$\mathbf{T}_{N \times 1} = [t_1, t_2, \ldots, t_N]^{T}. \tag{4}$$
Since H and T are known in Equation (1) for the training dataset, β can be obtained according to the generalized inverse matrix theory, i.e.,
$$\boldsymbol{\beta} = \arg\min_{\boldsymbol{\beta}^{*}} \left\| \mathbf{H} \boldsymbol{\beta}^{*} - \mathbf{T} \right\| = \mathbf{H}^{+} \mathbf{T}, \tag{5}$$
where $\mathbf{H}^{+}$ represents the Moore–Penrose generalized inverse of matrix H. Once β is determined, the test data can be predicted using Equation (1).
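The training procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names, the uniform range [-1, 1] for the random weights and biases, and the synthetic data are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_fit(X, T, n_hidden, rng):
    """Train a single-output ELM: random (W, b), then beta = pinv(H) @ T."""
    n_features = X.shape[1]
    # One column of W per hidden neuron; the sampling range is an assumption
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = sigmoid(X @ W + b)          # N x L hidden layer output matrix
    beta = np.linalg.pinv(H) @ T    # Moore-Penrose solution of H beta = T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Apply the trained network to new inputs."""
    return sigmoid(X @ W + b) @ beta

# Synthetic stand-in data: 50 samples with 4 features scaled to [0, 1]
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 4))
T = np.sin(X.sum(axis=1))
W, b, beta = elm_fit(X, T, n_hidden=35, rng=rng)
pred = elm_predict(X, W, b, beta)
print("training MSE:", float(np.mean((pred - T) ** 2)))
```

Because the input weights and biases are never updated, training reduces to a single least-squares solve, which is the source of ELM's speed relative to gradient-based training.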

2.5. Accuracy Assessment

This paper employed the determination coefficient (R²), relative error (RE), and mean relative error (MRE) to evaluate the forecasting accuracy of the constructed models.
R² is used to evaluate the fitting effect of the forecasting model. The larger the R² value is, the better the fitting effect of the model is, and the more of the variation in the dependent variable the model explains. The mathematical expression for R² is as follows:
$$R^{2} = \frac{\sum_{i=1}^{N} (\hat{t}_i - \bar{t})^{2}}{\sum_{i=1}^{N} (t_i - \bar{t})^{2}}, \quad (0 \le R^{2} \le 1). \tag{6}$$
RE is used to reflect the degree of deviation between the forecasting value of the model and the actual value for a sample. The smaller the RE value is, the better the forecasting effect of the model for the sample is. The mathematical expression for RE is as follows:
$$\mathrm{RE} = \frac{\left| \hat{t}_i - t_i \right|}{t_i} \times 100\%. \tag{7}$$
MRE is used to reflect the reliability of the forecasting model. The smaller the MRE value is, the higher the reliability of the forecasting model is, and the better the stability of the model is. The mathematical expression for MRE is as follows:
$$\mathrm{MRE} = \frac{\sum_{i=1}^{N} \left| (\hat{t}_i - t_i) / t_i \right|}{N} \times 100\%. \tag{8}$$
In Equations (6)–(8), $t_i$ is the true value of the ith sample, $\hat{t}_i$ is the forecasting value of the ith sample, $\bar{t}$ is the average of the true values for all the samples, and N is the number of all the samples.
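The three measures can be written as short NumPy functions. This sketch assumes that RE and MRE use the absolute value of the deviation and that all true values are positive; the function names and the toy numbers are illustrative only.

```python
import numpy as np

def r_squared(t_true, t_pred):
    """Ratio of explained to total sum of squares around the mean of the true values."""
    t_bar = t_true.mean()
    return ((t_pred - t_bar) ** 2).sum() / ((t_true - t_bar) ** 2).sum()

def relative_error(t_true, t_pred):
    """Per-sample relative error in percent (absolute deviation assumed)."""
    return np.abs(t_pred - t_true) / t_true * 100.0

def mean_relative_error(t_true, t_pred):
    """Average of the per-sample relative errors, in percent."""
    return relative_error(t_true, t_pred).mean()

# Toy values, not taken from the paper
t_true = np.array([100.0, 200.0, 400.0])
t_pred = np.array([110.0, 180.0, 400.0])
re = relative_error(t_true, t_pred)
mre = mean_relative_error(t_true, t_pred)
r2 = r_squared(t_true, t_pred)
print(re, mre, r2)
```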

3. Results and Discussion

In this section, the relevant data from 2013 to 2020 were used as the training set for the training and establishment of forecasting models. The relevant data from 2021 to 2022 were taken as the validation set for the adjustment and optimization of the model weight parameters and the preliminary evaluation of the modeling effect. The relevant data for 2023 were employed as the independent test set to evaluate the forecasting performances of the established models. Furthermore, the forecasting performances of the models established based on single NTL index and multiple NTL indices were discussed and analyzed, respectively.

3.1. Correlation Analysis and Significance Test

Before establishing the forecasting model, the Pearson correlation analysis was carried out between the constructed NTL indices and the TOVAFAF for each provincial-level administrative region from 2013 to 2022. The significance test was conducted on the correlation using t-test. Table 2 lists the correlation coefficients for the correlation analysis and p-values for the significance test.
As can be seen from Table 2, the constructed TNLI, LPQI, LPRI, and NLSDSI indices have a high correlation with the TOVAFAF, and the p-values are all less than 0.01, indicating that the correlation has reached the extremely significant level. The correlation between the other four NTL indices (ANLI, ALPLI, NLSDI, and NLVI) and the TOVAFAF is low, and the p-values are higher than 0.05, indicating that the correlation has not reached the significant level. Therefore, the NTL indices (TNLI, LPQI, LPRI, and NLSDSI) that are significantly correlated with the TOVAFAF are selected as feature variables to establish forecasting models.
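For reference, the Pearson coefficient and the t statistic used to test it can be computed as below. The statistic is t = r·sqrt((n − 2)/(1 − r²)), which follows a Student t distribution with n − 2 degrees of freedom under the null hypothesis of no correlation. The function name and the sample values are hypothetical; a library routine such as `scipy.stats.pearsonr` would return the p-value directly.

```python
import math
import numpy as np

def pearson_with_t(x, y):
    """Return the Pearson correlation r and its t statistic (n - 2 degrees of freedom)."""
    r = float(np.corrcoef(x, y)[0, 1])
    n = len(x)
    t_stat = r * math.sqrt((n - 2) / (1.0 - r ** 2))
    return r, t_stat

# Hypothetical index values and output values, not the paper's data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
r, t_stat = pearson_with_t(x, y)
print(r, t_stat)
```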

3.2. Forecasting Model Based on Single NTL Index

The TOVAFAF for each provincial-level administrative region from 2013 to 2020 and the corresponding NTL indices were used as the training set to establish the forecasting models. Taking each of the four NTL indices (TNLI, LPQI, LPRI, and NLSDSI) in turn as the single independent variable and the TOVAFAF as the dependent variable, linear, exponential, logarithmic, power exponential, and polynomial models were established. The established models were then used to forecast the TOVAFAF for each provincial-level administrative region from 2021 to 2022, and their effectiveness was evaluated using R² and MRE. The modeling and forecasting results are shown in Table 3, Figure 3, and Figure 4.
From the perspective of the NTL index, Table 3 and Figure 3 show that the R² values of the five models (linear, exponential, logarithmic, power exponential, and polynomial) based on the LPQI index are higher than those of the corresponding models based on the other NTL indices (TNLI, LPRI, and NLSDSI). Table 3 and Figure 4 show that the MRE values of the five LPQI-based models are lower than those of the corresponding models based on the other NTL indices. These results indicate that, for every model type, the LPQI index gives better forecasting performance than the other NTL indices.
From the perspective of the model type, Table 3 and Figure 3 show that, for each NTL index (TNLI, LPQI, LPRI, and NLSDSI), the R² value of the power exponential model is higher than those of the other models (linear, exponential, logarithmic, and polynomial) based on the same index. Table 3 and Figure 4 show that, for each NTL index, the MRE value of the power exponential model is lower than those of the other models based on the same index. These results indicate that, for every NTL index, the power exponential model gives better forecasting performance than the other model types.
Among all the single-index models, the power exponential model based on the LPQI index achieves the best forecasting performance, with an R² of 0.7113 and an MRE of 48.34%. However, even for this optimal model the MRE is close to 50%, indicating that forecasting models based on a single NTL index cannot adequately reflect the relationship between NTL data and the TOVAFAF.
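To make the single-index modeling step concrete, the sketch below fits a power-type curve by ordinary least squares in log-log space. It assumes that the "power exponential model" has the form y = a·x^b, which the text does not state explicitly, and it uses synthetic values rather than the paper's data.

```python
import numpy as np

def fit_power_model(x, y):
    """Fit y = a * x**b by linear least squares on (log x, log y); requires x, y > 0."""
    b, log_a = np.polyfit(np.log(x), np.log(y), deg=1)  # slope first, then intercept
    return float(np.exp(log_a)), float(b)

# Synthetic data generated from y = 3 * x**0.5
x = np.array([1.0, 2.0, 4.0, 8.0])
y = 3.0 * x ** 0.5
a, b = fit_power_model(x, y)
forecast = a * 16.0 ** b   # forecast for a new index value x = 16
print(a, b, forecast)
```

Fitting in log space minimizes relative rather than absolute deviations, which is one reason such a curve can behave differently from a linear fit when the output values span several orders of magnitude.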

3.3. Forecasting Model Based on Multiple NTL Indices

To further explore the potential relationship between NTL data and the TOVAFAF, multiple linear regression (MLR) and ELM algorithms were employed to establish the forecasting models based on multiple NTL indices. As with the modeling process based on single NTL index, the TOVAFAF for each provincial-level administrative region from 2013 to 2020 and the corresponding multiple NTL indices were used as the training set, and the relevant data from 2021 to 2022 were used as the validation set. Due to the extremely significant correlation between the four NTL indices (TNLI, LPQI, LPRI, and NLSDSI) and the TOVAFAF, all the four NTL indices were taken as the independent variables together, and the TOVAFAF was taken as the dependent variable. Before modeling, each NTL index was normalized to the range of 0–1 to avoid the impact of different orders of magnitude on modeling results.
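The normalization step can be sketched as column-wise min-max scaling. Whether the scaling bounds were taken from the training set alone or from all years is not stated, so reusing the training-set bounds for later data is an assumption here, as are the function names and sample values.

```python
import numpy as np

def minmax_fit(X_train):
    """Column-wise minimum and maximum of the training features."""
    return X_train.min(axis=0), X_train.max(axis=0)

def minmax_apply(X, lo, hi):
    """Scale each column to [0, 1] using previously fitted bounds."""
    return (X - lo) / (hi - lo)

# Two hypothetical indices with very different magnitudes
X_train = np.array([[10.0, 1000.0],
                    [20.0, 3000.0],
                    [30.0, 5000.0]])
lo, hi = minmax_fit(X_train)
X_scaled = minmax_apply(X_train, lo, hi)
print(X_scaled)
```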
The specific mathematical expression of the MLR model in this paper is as follows:
$$t = 11577.99\,x_1 + 7915.59\,x_2 - 10210.69\,x_3 - 2189.15\,x_4 + 1494.33, \tag{9}$$
where $x_1$, $x_2$, $x_3$, and $x_4$ represent the values of TNLI, LPQI, NLSDSI, and LPRI, respectively.
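A multiple linear regression of this form can be fitted by ordinary least squares, as in the sketch below. The data and the coefficients `true_w` are synthetic placeholders chosen only to show that the fit recovers them; they are not the paper's values, and the function name is hypothetical.

```python
import numpy as np

def fit_mlr(X, t):
    """Least-squares fit of t = X @ w + intercept."""
    A = np.column_stack([X, np.ones(len(X))])   # append a column of ones for the intercept
    coef, *_ = np.linalg.lstsq(A, t, rcond=None)
    return coef[:-1], float(coef[-1])

# Synthetic, noise-free data with four normalized features
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(40, 4))
true_w = np.array([2.0, -1.0, 0.5, 3.0])
t = X @ true_w + 4.0
w, intercept = fit_mlr(X, t)
print(w, intercept)
```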
For the ELM model, the mathematical expression is given in Equations (1) to (5). The activation function $g(\cdot)$ and the number of hidden layer neurons L have an important effect on model performance. This paper took the sigmoid function as the activation function of the ELM model, as shown in Equation (10). Through optimization, the number of hidden layer neurons was set to 35. Since the input weights and hidden layer biases were generated randomly, the ELM model was run 20 times, and the average value was taken as the final result to reduce the influence of random errors on the forecasting results.
$$g(\phi) = \frac{1}{1 + e^{-\phi}} \tag{10}$$
To further improve the forecasting performance of the ELM model, the particle swarm optimization (PSO) algorithm was used to optimize the input weights and hidden layer biases of the ELM model, thereby establishing the PSO-ELM model. In this model, the number of iterations of the PSO algorithm was set to 200, and MRE of the validation set was taken as the fitness function to evaluate the optimization effect. The modeling and forecasting results based on multiple NTL indices are shown in Table 4.
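The following sketch shows one way the PSO-ELM idea could be realized: each particle encodes a candidate set of input weights and hidden biases, the output weights are solved by the pseudoinverse on the training set, and the validation MRE is the fitness to minimize. The inertia weight 0.7, the acceleration coefficients 1.5, the swarm size, the small iteration count, and the synthetic data are all assumptions; the paper states only that 200 iterations were used with validation MRE as the fitness function.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack(params, n_features, n_hidden):
    """Split a flat particle into input weights W and hidden biases b."""
    W = params[: n_features * n_hidden].reshape(n_features, n_hidden)
    return W, params[n_features * n_hidden:]

def validation_mre(params, X_tr, t_tr, X_val, t_val, n_hidden):
    """Fitness: solve beta on the training set, return MRE (%) on the validation set."""
    W, b = unpack(params, X_tr.shape[1], n_hidden)
    beta = np.linalg.pinv(sigmoid(X_tr @ W + b)) @ t_tr
    pred = sigmoid(X_val @ W + b) @ beta
    return float(np.mean(np.abs(pred - t_val) / t_val) * 100.0)

def pso_elm(X_tr, t_tr, X_val, t_val, n_hidden=5, n_particles=10, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    dim = X_tr.shape[1] * n_hidden + n_hidden
    pos = rng.uniform(-1.0, 1.0, size=(n_particles, dim))
    vel = np.zeros_like(pos)

    def evaluate(p):
        return validation_mre(p, X_tr, t_tr, X_val, t_val, n_hidden)

    best_pos = pos.copy()                              # personal best positions
    best_fit = np.array([evaluate(p) for p in pos])    # personal best fitness
    g = int(np.argmin(best_fit))
    g_pos, g_fit = best_pos[g].copy(), float(best_fit[g])
    initial_fit = g_fit                                # best fitness before any update
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Standard velocity update with assumed coefficients (0.7, 1.5, 1.5)
        vel = 0.7 * vel + 1.5 * r1 * (best_pos - pos) + 1.5 * r2 * (g_pos - pos)
        pos = pos + vel
        fit = np.array([evaluate(p) for p in pos])
        improved = fit < best_fit
        best_pos[improved], best_fit[improved] = pos[improved], fit[improved]
        g = int(np.argmin(best_fit))
        if best_fit[g] < g_fit:
            g_pos, g_fit = best_pos[g].copy(), float(best_fit[g])
    return g_pos, g_fit, initial_fit

# Synthetic stand-in data with positive targets so that MRE is well defined
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(60, 4))
t = 1.0 + np.sin(X.sum(axis=1))
best_params, final_mre, initial_mre = pso_elm(X[:40], t[:40], X[40:], t[40:])
print(initial_mre, final_mre)
```

Unlike PSO-BP, no gradient descent follows the swarm search: once the best particle is found, the output weights come directly from the pseudoinverse.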
As can be seen from Table 4, among the forecasting models established based on multiple NTL indices, the MLR model has the worst forecasting performance, with an R² of 0.6253 and an MRE of 67.48%. Its MRE is relatively large, and its forecasting performance is even inferior to that of the power exponential models based on the single TNLI or LPQI index. The R² and MRE values of the ELM model reach 0.8968 and 33.60%, respectively, so its forecasting performance is significantly better than that of the MLR model. This indicates that the relationship between the NTL data and the TOVAFAF is not a simple linear relationship but a more complex nonlinear one. To further explore this nonlinear relationship, the PSO algorithm was employed to optimize the input weights and hidden layer biases of the ELM model. The results show that the PSO-ELM model achieves the best forecasting performance, with an R² of 0.8974 and an MRE of 23.42%.

3.4. Forecasting Performance for Each Provincial-Level Administrative Region in 2023

Among all the abovementioned models, four models (TNLI-power exponential, LPQI-power exponential, ELM, and PSO-ELM) have better overall forecasting performance on the TOVAFAF than the others, where TNLI-power exponential and LPQI-power exponential denote the power exponential models based on the single TNLI and LPQI index, respectively. To further demonstrate the effectiveness of these four models, the RE was used to evaluate the forecasting effect for the 29 provincial-level administrative regions in 2023. It should be noted that five provincial-level administrative regions of China, i.e., Beijing, Shanghai, Hong Kong, Macao, and Taiwan, are not included in the scope of this study, for the reasons explained in Section 2.1.
Figure 5 shows the actual value and model forecasting value of the TOVAFAF for each provincial-level administrative region in 2023, giving a qualitative view of the forecasting effect of each model in each region. For the PSO-ELM model, except for Shandong and Zhejiang provinces, the forecasting values of the TOVAFAF in most provincial-level administrative regions are close to the actual value or fluctuate near it. The forecasting effect of the ELM model in many provinces is similar to that of PSO-ELM. For the TNLI-power exponential and LPQI-power exponential models, the forecasting values of the TOVAFAF are close to the actual value in some provinces but differ greatly from it in others.
To quantitatively evaluate the forecasting effect of various models on the TOVAFAF, Figure 6 shows the RE of the four models for each provincial-level administrative region in 2023, and Figure 7 further shows the number of provincial-level administrative regions at various levels of RE in 2023. For the PSO-ELM model, the RE values of 22 provincial-level administrative regions are below 50%, and the RE values of 18 provincial-level administrative regions are below 30%, which makes the fluctuation amplitude of the RE curve smaller than that of other models. For the ELM model, there are also 22 provincial-level administrative regions with RE below 50%. However, the number of provincial-level administrative regions with RE below 30% for the ELM model is 15, which is less than that of the PSO-ELM model. In addition, the RE values of Guangdong, Hainan, Qinghai, Shanxi, and Zhejiang provinces even exceed 90% for the ELM model. That makes the fluctuation amplitude of the RE curve of the ELM model larger than that of the PSO-ELM model. There are 11 and 14 provincial-level administrative regions with RE above 50% for the TNLI-power exponential model and the LPQI-power exponential model, respectively. And the RE values of some provincial-level administrative regions even significantly exceed 100% for the TNLI-power exponential and LPQI-power exponential models. Therefore, the fluctuation amplitude of the RE curves of the TNLI-power exponential and LPQI-power exponential models is much larger than that of the ELM and PSO-ELM models. The above results indicate that the forecasting performance of the PSO-ELM model is superior to that of other models.
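The tally of regions per RE level is a simple thresholded count, sketched below. The RE values are hypothetical placeholders, not the values plotted in Figure 6.

```python
import numpy as np

# Hypothetical per-region RE values in percent (not the paper's results)
re_values = np.array([12.0, 45.0, 28.0, 95.0, 51.0, 30.0])

below_30 = int((re_values < 30.0).sum())   # regions with RE below 30%
below_50 = int((re_values < 50.0).sum())   # regions with RE below 50%
above_50 = int((re_values > 50.0).sum())   # regions with RE above 50%
print(below_30, below_50, above_50)
```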
Furthermore, MRE was used to evaluate the overall forecasting performance of the four models in 2023. The linear relationship between the actual values and the forecasting values of each model for the 29 provincial-level administrative regions in 2023 was also analyzed. The higher the R² value is, the better the linear relationship between the actual values and the forecasting values is, and the better the forecasting performance of the model is. The results are shown in Figure 8. The red lines in the figure represent the linear trend between the actual values and the forecasting values.
Compared with the TNLI-power exponential and LPQI-power exponential models, the ELM model achieves better overall forecasting performance, with an MRE of 42.74% in 2023. This indicates that the ELM model based on multiple NTL indices reflects the potential relationship between NTL data and the TOVAFAF better than the power exponential models based on a single NTL index. Moreover, among the four models, the PSO-ELM model has the lowest MRE in 2023 (32.20%) and the highest R² for the linear relationship between the actual values and the forecasting values (0.6460). Therefore, the PSO-ELM model achieves the best overall forecasting performance in 2023, which indicates that it can more effectively explore and characterize the potential relationship between NTL data and the TOVAFAF.
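One way to obtain the R² of such a linear trend is to regress the forecasting values on the actual values and compute the share of variance explained by the fitted line, as sketched below. The direction of the regression, the function name, and the numbers are assumptions for illustration.

```python
import numpy as np

def linear_trend_r2(actual, forecast):
    """Fit forecast = slope * actual + intercept and return (slope, intercept, R^2)."""
    slope, intercept = np.polyfit(actual, forecast, deg=1)
    fitted = slope * actual + intercept
    ss_res = ((forecast - fitted) ** 2).sum()            # residual sum of squares
    ss_tot = ((forecast - forecast.mean()) ** 2).sum()   # total sum of squares
    return float(slope), float(intercept), float(1.0 - ss_res / ss_tot)

# Hypothetical values in units of 100 million yuan
actual = np.array([1000.0, 2000.0, 3000.0, 4000.0])
forecast = np.array([1200.0, 1900.0, 3300.0, 3800.0])
slope, intercept, r2 = linear_trend_r2(actual, forecast)
print(slope, intercept, r2)
```

For a straight-line fit with an intercept, this R² equals the squared Pearson correlation between the two series, so it measures how well the forecasts track the actual values up to a linear rescaling, not how close they are in absolute terms.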

4. Conclusions

To grasp the development trend of the agricultural economy in a timely manner, this paper explores the relationship between NTL data and the total output value of agriculture, forestry, animal husbandry, and fishery (TOVAFAF), so as to establish a forecasting model for the TOVAFAF in the provinces of China using NTL remote sensing data and machine learning algorithms. The conclusions are as follows:
(1) The constructed TNLI, LPQI, LPRI, and NLSDSI indices are correlated with the TOVAFAF at the extremely significant level;
(2) The ELM and PSO-ELM models established based on multiple NTL indices can better explore and characterize the potential nonlinear relationship between NTL data and the TOVAFAF than all the models established based on single NTL index and the MLR model established based on multiple NTL indices;
(3) The PSO-ELM model achieves the best forecasting effect in 2023, with an MRE of 32.20% and an R² of 0.6460 for the linear relationship between the actual values and the forecasting values.
On the whole, the PSO-ELM model established based on multiple NTL indices in this paper can effectively forecast the TOVAFAF in the provincial-level administrative regions of China. However, the forecasting accuracy still needs further improvement. In future research, we will explore temporal features of NTL remote sensing images to construct NTL indices, and we will improve the ELM and PSO algorithms to explore the potential relationship between NTL data and the TOVAFAF more deeply, so as to establish a forecasting model with higher accuracy.

Author Contributions

Data curation, R.Y.; methodology, R.Y.; supervision, Q.Z. and L.X.; validation, T.W. and Y.Z.; writing—original draft preparation, R.Y.; writing—review and editing, R.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Central Public-interest Scientific Institution Basal Research Fund (No. JBYW-AII-2024-46).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The NPP-VIIRS nighttime light remote sensing images in this paper can be obtained from https://eogdata.mines.edu/products/vnl/ (accessed on 20 September 2023).

Acknowledgments

We acknowledge the use of the NPP-VIIRS nighttime light remote sensing images in this research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Han, S.Z.; Pan, W.T.; Zhou, Y.Y.; Liu, Z.L. Construct the prediction model for China agricultural output value based on the optimization neural network of fruit fly optimization algorithm. Future Gener. Comput. Syst. 2018, 86, 663–669. [Google Scholar] [CrossRef]
  2. Yang, Y.P.; Liu, M.H. Analysis and prediction of agriculture, forestry, animal husbandry and fishery in China based on ARMA model. Econ. Trade 2015, 273. Available online: https://kns.cnki.net/kcms2/article/abstract?v=ZOnxTxd1G4LQghJz-7msoLitmhrnaPd143E6mrhLVcyVwPIsrqvJ8ejYh1Coq7ktzL_nsehWQEcFEvfjprhNlRl-g-eq1h6x175rZl2lw40Mcln4RYHsiZaCRo4ddbbLpij4pJq06syecnoT9KFbOzMbNRs-glzfRBwqVOBinI2pfILj4n-wHD37KA6mfZLKTb6oA_lcOBi3pviOY6Momg==&uniplatform=NZKPT (accessed on 25 September 2024).
  3. Chen, Y.; Nu, L.; Wu, L.F. Forecasting the agriculture output values in China based on grey seasonal model. Math. Probl. Eng. 2020, 2020, 3151048. [Google Scholar] [CrossRef]
  4. Li, R.A.; Hu, M.H.; Li, Y. Research on prediction method for total output values of agriculture, forestry, pasturage and fishery in Wuhan City. J. Wuhan Univ. Technol. 2009, 31, 153–156. [Google Scholar]
  5. Liu, H.Y.; He, X.W.; Bai, Y.B.; Liu, X.; Wu, Y.L.; Zhao, Y.Y.; Yang, H.F. Nightlight as a proxy of economic indicators: Fine-grained GDP inference around Chinese mainland via attention-augmented CNN from daytime satellite imagery. Remote Sens. 2021, 13, 2067. [Google Scholar] [CrossRef]
  6. Gu, Y.; Shao, Z.F.; Huang, X.; Cai, B.W. GDP forecasting model for China’s provinces using nighttime light remote sensing data. Remote Sens. 2022, 14, 3671. [Google Scholar] [CrossRef]
  7. Yu, S.S.; Zhang, Z.X.; Liu, F. Monitoring population evolution in China using time-series DMSP/OLS nightlight imagery. Remote Sens. 2018, 10, 194. [Google Scholar] [CrossRef]
  8. Ortakavak, Z.; Çabuk, S.N.; Cetin, M.; Senyel Kurkcuoglu, M.A.; Cabuk, A. Determination of the nighttime light imagery for urban city population using DMSP-OLS methods in Istanbul. Environ. Monit. Assess. 2020, 192, 790. [Google Scholar] [CrossRef]
  9. Shi, K.F.; Chen, Y.; Yu, B.L.; Xu, T.B.; Chen, Z.Q.; Liu, R.; Li, L.Y.; Wu, J.P. Modeling spatiotemporal CO2 (carbon dioxide) emission dynamics in China from DMSP-OLS nighttime stable light data using panel data analysis. Appl. Energy 2016, 168, 523–533. [Google Scholar] [CrossRef]
  10. Guo, B.; Xie, T.T.; Zhang, W.C.; Wu, H.J.; Zhang, D.M.; Zhu, X.W.; Ma, X.Y.; Wu, M.; Luo, P.P. Rasterizing CO2 emissions and characterizing their trends via an enhanced population-light index at multiple scales in China during 2013–2019. Sci. Total Environ. 2023, 905, 167309. [Google Scholar] [CrossRef]
  11. Lu, H.M.; Zhang, M.L.; Sun, W.W.; Li, W.Y. Expansion analysis of Yangtze River delta urban agglomeration using DMSP/OLS nighttime light imagery for 1993 to 2012. ISPRS Int. J. Geo-Inf. 2018, 7, 52. [Google Scholar] [CrossRef]
  12. Chen, Z.Q.; Yu, B.L.; Zhou, Y.Y.; Liu, H.X.; Yang, C.S.; Shi, K.F.; Wu, J.P. Mapping global urban areas from 2000 to 2012 using time-series nighttime light data and MODIS products. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 1143–1153. [Google Scholar] [CrossRef]
  13. Jasinski, T. Modeling electricity consumption using nighttime light images and artificial neural networks. Energy 2019, 179, 831–842. [Google Scholar] [CrossRef]
  14. Chen, J.D.; Gao, M.; Cheng, S.L.; Hou, W.X.; Song, M.L.; Liu, X.; Liu, Y. Global 1 km x 1 km gridded revised real gross domestic product and electricity consumption during 1992–2019 based on calibrated nighttime light data. Sci. Data 2022, 9, 202. [Google Scholar] [CrossRef]
  15. Dai, Z.X.; Hu, Y.F.; Zhao, G.H. The suitability of different nighttime light data for GDP estimation at different spatial scales and regional levels. Sustainability 2017, 9, 305. [Google Scholar] [CrossRef]
  16. Yong, Z.W.; Li, K.; Xiong, J.N.; Cheng, W.M.; Wang, Z.G.; Sun, H.Z.; Ye, C.C. Integrating DMSP-OLS and NPP-VIIRS nighttime light data to evaluate poverty in Southwestern China. Remote Sens. 2022, 14, 600. [Google Scholar] [CrossRef]
  17. Chen, R.; Zhang, F.; Chan, N.W.; Wang, Y.S. Multidimensional poverty measurement and spatial-temporal pattern analysis at county level in the arid area of Xinjiang, China. Environ. Dev. Sustain 2023, 25, 13805–13824. [Google Scholar] [CrossRef]
  18. Feng, Z.; Peng, J.; Wu, J.S. Using DMSP/OLS nighttime light data and K-means method to identify urban-rural fringe of megacities. Habitat Int. 2020, 103, 102227. [Google Scholar] [CrossRef]
  19. Zhong, L.; Liu, X.S.; Yang, P. Regional development gap assessment method based on remote sensing images and weighted Theil index. Arab. J. Geosci. 2020, 13, 1176. [Google Scholar] [CrossRef]
  20. Yu, B.L.; Deng, S.Q.; Liu, G.; Yang, C.S.; Chen, Z.Q.; Hill, C.J.; Wu, J.P. Nighttime light images reveal spatial-temporal dynamics of global anthropogenic resources accumulation above ground. Environ. Sci. Technol. 2018, 52, 11520–11527. [Google Scholar] [CrossRef]
  21. Wu, K.; Wang, X.N. Aligning pixel values of DMSP and VIIRS nighttime light images to evaluate urban dynamics. Remote Sens. 2019, 11, 1463. [Google Scholar] [CrossRef]
  22. Kuang, K.J.; Zheng, K.Y.; Chen, R.; Chen, B.; Hong, Y.; Liu, J.F. Spatio-temporal evolution of county economic development in Fujian province based on NPP-VIIRS nighttime lighting data. Areal Res. Dev. 2023, 42, 29–35. [Google Scholar]
  23. Chen, Z.Q.; Yu, S.Y.; You, X.J.; Yang, C.S.; Wang, C.X.; Lin, J.; Wu, W.T.; Yu, B.L. New nighttime light landscape metrics for analyzing urban-rural differentiation in economic development at township: A case study of Fujian province, China. Appl. Geogr. 2023, 150, 102841. [Google Scholar] [CrossRef]
  24. Zeng, P.F.; Sun, M.L.; Lu, N.; Chang, Y. Spatial expansion and intrinsic correlation measure of Qingdao urban-rural integration area based on night light data. J. Geomat. Sci. Technol. 2021, 38, 213–220. [Google Scholar]
  25. Ma, T.; Zhou, C.H.; Pei, T.; Haynie, S.; Fan, J.F. Quantitative estimation of urbanization dynamics using time series of DMSP/OLS nighttime light data: A comparative case study from China’s cities. Remote Sens. Environ. 2012, 124, 99–107. [Google Scholar] [CrossRef]
  26. Guo, Y.D.; Gao, J.H.; Ma, H.B. Spatial correlation analysis of Suomi-NPP nighttime light data and GDP data. J. Tsinghua Univ. Sci. Technol. 2016, 56, 1122–1130. [Google Scholar]
  27. Pan, J.H.; Hu, Y.X. Spatial identification of multi-dimensional poverty in rural China: A perspective of nighttime-light remote sensing data. J. Indian Soc. Remote Sens. 2018, 46, 1093–1111. [Google Scholar] [CrossRef]
  28. Ji, X.L.; Li, X.Z.; He, Y.Q.; Liu, X.L. A simple method to improve estimates of county-level economics in China using nighttime light data and GDP growth rate. ISPRS Int. J. Geo-Inf. 2019, 8, 419. [Google Scholar] [CrossRef]
  29. Duerler, R.; Cao, C.X.; Xie, B.; Huang, Z.B.; Chen, Y.Y.; Wang, K.M.; Xu, M.; Lu, Y.L. Cross reference of GDP decrease with nighttime light data via remote sensing diagnosis. Sustainability 2023, 15, 6900. [Google Scholar] [CrossRef]
  30. Han, X.D.; Zhou, Y.; Wang, S.X.; Liu, R.; Yao, Y. GDP spatialization in China based on nighttime imagery. J. Geo-Inf. Sci. 2012, 14, 128–136. [Google Scholar] [CrossRef]
  31. Gori, M.; Tesi, A. On the problem of local minima in backpropagation. IEEE Trans. Pattern Anal. Mach. Intell. 1992, 14, 76–86. [Google Scholar] [CrossRef]
  32. Huang, G.B.; Zhu, Q.Y.; Siew, C.K. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501. [Google Scholar] [CrossRef]
  33. Balasundaram, S.; Gupta, D. On optimization based extreme learning machine in primal for regression and classification by functional iterative method. Int. J. Mach. Learn. Cybern. 2016, 7, 707–728. [Google Scholar] [CrossRef]
  34. Liu, X.L.; Zhou, Y.Q.; Meng, W.P.; Luo, Q.F. Functional extreme learning machine for regression and classification. Math. Biosci. Eng. 2023, 20, 3768–3792. [Google Scholar] [CrossRef]
  35. Duan, Y.P. Research on the Effect Evaluation of Rural Construction Based on Nighttime Light Remote Sensing. Master’s Thesis, Dalian Ocean University, Dalian, China, 2023. [Google Scholar]
Figure 1. The provincial-level administrative regions in China.
Figure 2. The NPP-VIIRS NTL images of various provincial-level administrative regions of China in 2021: (a) the original NTL image and (b) the preprocessed NTL image.
Figure 3. The coefficient of determination (R²) of the models established based on a single NTL index.
Figure 4. The mean relative error (MRE) of the models established based on single NTL index.
Figure 5. The actual value and model forecasting value of TOVAFAF for each provincial-level administrative region in 2023.
Figure 6. The relative error (RE) of the established models for each provincial-level administrative region in 2023.
Figure 7. The number of provincial-level administrative regions at various levels of relative error (RE) in 2023.
Figure 8. The linear relationship and the mean relative error (MRE) between the true values and the forecasting values of (a) TNLI-power exponential, (b) LPQI-power exponential, (c) ELM, and (d) PSO-ELM in 2023.
Table 1. Detailed description of the constructed NTL indices.

| Serial Number | Index Name | Abbreviation | Detailed Description |
|---|---|---|---|
| V1 | Total nighttime light index | TNLI | The sum of light intensity for all pixels in the NTL image within the provincial-level administrative region |
| V2 | Average nighttime light index | ANLI | The ratio of TNLI to the quantity of all pixels in the NTL image within the provincial-level administrative region |
| V3 | Luminous pixel quantity index | LPQI | The quantity of luminous pixels (brightness value greater than 0) in the NTL image within the provincial-level administrative region |
| V4 | Luminous pixel ratio index | LPRI | The ratio of LPQI to the quantity of all pixels in the NTL image within the provincial-level administrative region |
| V5 | Average luminous pixel light index | ALPLI | The ratio of TNLI to the quantity of luminous pixels in the NTL image within the provincial-level administrative region |
| V6 | Nighttime light standard deviation index | NLSDI | The standard deviation of light intensity for all pixels in the NTL image within the provincial-level administrative region |
| V7 | Nighttime light variance index | NLVI | The variance of light intensity for all pixels in the NTL image within the provincial-level administrative region |
| V8 | Nighttime light squared deviation sum index | NLSDSI | The sum of squared deviations of light intensity for all pixels in the NTL image within the provincial-level administrative region |
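The index definitions in Table 1 translate almost directly into code. The sketch below assumes that the pixels of one provincial-level region have already been extracted into a flat list of light intensities, and that the standard deviation and variance are the population versions (the table does not say which). The function name and the sample values are hypothetical.

```python
from statistics import pstdev, pvariance


def ntl_indices(pixels):
    """Compute the eight indices of Table 1 from a flat list of pixel light intensities."""
    if not pixels:
        raise ValueError("pixels must be non-empty")
    n_all = len(pixels)
    luminous = [p for p in pixels if p > 0]  # brightness value greater than 0
    tnli = sum(pixels)
    lpqi = len(luminous)
    mean = tnli / n_all
    return {
        "TNLI": tnli,
        "ANLI": tnli / n_all,
        "LPQI": lpqi,
        "LPRI": lpqi / n_all,
        "ALPLI": tnli / lpqi if lpqi else 0.0,  # guard against a fully dark region
        "NLSDI": pstdev(pixels),      # assumption: population standard deviation
        "NLVI": pvariance(pixels),    # assumption: population variance
        "NLSDSI": sum((p - mean) ** 2 for p in pixels),
    }


# A toy region with four pixels, two of them dark (hypothetical values).
region = [0.0, 0.0, 2.0, 6.0]
idx = ntl_indices(region)
for name, value in idx.items():
    print(f"{name}: {value:.4f}")
```

Under the population definitions, NLSDSI equals NLVI multiplied by the number of pixels, so NLSDSI grows with the size of the region while NLVI does not.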
Table 2. Results of correlation analysis and significance test.

| Index Name | TNLI | ANLI | LPQI | LPRI | ALPLI | NLSDI | NLVI | NLSDSI |
|---|---|---|---|---|---|---|---|---|
| Correlation Coefficient | 0.6587 | 0.0570 | 0.7676 | 0.2754 | 0.0752 | 0.1092 | −0.0553 | 0.5409 |
| p-value | $1.829 \times 10^{-37}$ | $3.331 \times 10^{-1}$ | $1.430 \times 10^{-57}$ | $1.917 \times 10^{-6}$ | $2.014 \times 10^{-1}$ | $6.339 \times 10^{-2}$ | $3.481 \times 10^{-1}$ | $1.955 \times 10^{-23}$ |
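A correlation coefficient and its test statistic of the kind reported in Table 2 can be sketched as follows. The sketch assumes a Pearson correlation and the usual t test of the null hypothesis of no linear correlation, with n − 2 degrees of freedom; this section does not state which coefficient or test was used. The function names and sample values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: no linear correlation, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1.0 - r * r))


# Hypothetical index values and output values for five regions.
index_values = [1.0, 2.0, 3.0, 4.0, 5.0]
output_values = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(index_values, output_values)
t = t_statistic(r, len(index_values))
print(f"r = {r:.4f}, t = {t:.4f}")
```

The p-value follows from comparing `t` with a Student t distribution; in practice `scipy.stats.pearsonr` returns both the coefficient and the two-sided p-value in one call.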
Table 3. The modeling and forecasting results based on a single NTL index.

| Index Name | Model | Expression | R² | MRE |
|---|---|---|---|---|
| TNLI | Linear | $t = 0.0046x + 1730.5$ | 0.4257 | 88.11% |
| TNLI | Exponential | $t = 1284.5\,e^{(2 \times 10^{-6})x}$ | 0.3415 | 141.06% |
| TNLI | Logarithmic | $t = 2218.8\ln(x) - 24467$ | 0.5375 | 62.29% |
| TNLI | Power exponential | $t = 0.0083x^{0.9971}$ | 0.6518 | 58.12% |
| TNLI | Polynomial | $t = -(5 \times 10^{-9})x^{2} + 0.012x + 20.84$ | 0.5438 | 66.68% |
| LPQI | Linear | $t = 0.0238x + 784.37$ | 0.5782 | 67.51% |
| LPQI | Exponential | $t = 886.57\,e^{(9 \times 10^{-6})x}$ | 0.4851 | 71.34% |
| LPQI | Logarithmic | $t = 2589.4\ln(x) - 26065$ | 0.5820 | 60.51% |
| LPQI | Power exponential | $t = 0.0039x^{1.1682}$ | 0.7113 | 48.34% |
| LPQI | Polynomial | $t = -(3 \times 10^{-8})x^{2} + 0.0321x + 312.52$ | 0.5846 | 62.47% |
| LPRI | Linear | $t = 3477x + 3061$ | 0.0705 | 138.57% |
| LPRI | Exponential | $t = 2378.2\,e^{0.6514x}$ | 0.0149 | 112.83% |
| LPRI | Logarithmic | $t = 817.64\ln(x) + 5465.1$ | 0.1585 | 109.69% |
| LPRI | Power exponential | $t = 5951.9x^{0.381}$ | 0.2066 | 98.55% |
| LPRI | Polynomial | $t = -18965x^{2} + 16187x + 1871.3$ | 0.1965 | 94.51% |
| NLSDSI | Linear | $t = 0.0002x + 2099.3$ | 0.2758 | 104.49% |
| NLSDSI | Exponential | $t = 1397\,e^{(8 \times 10^{-8})x}$ | 0.2591 | 109.15% |
| NLSDSI | Logarithmic | $t = 1857.3\ln(x) - 25254$ | 0.3736 | 91.44% |
| NLSDSI | Power exponential | $t = 0.0038x^{0.8622}$ | 0.4835 | 78.86% |
| NLSDSI | Polynomial | $t = -(1 \times 10^{-11})x^{2} + 0.0005x + 882.81$ | 0.3428 | 98.79% |

Note: x represents the corresponding NTL index.
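One common way to obtain a power exponential model of the form t = a·x^b is ordinary least squares on the log-transformed relation ln t = ln a + b ln x. The sketch below illustrates that approach. It is an assumption, since this section does not state how the single-index models were fitted. The function name and the x values are hypothetical, and the synthetic t values are generated from the LPQI power exponential expression listed in Table 3.

```python
from math import exp, log


def fit_power_model(x, t):
    """Fit t = a * x**b by least squares on ln(t) = ln(a) + b * ln(x)."""
    log_x = [log(v) for v in x]
    log_t = [log(v) for v in t]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_t = sum(log_t) / n
    b = (sum((u - mean_x) * (v - mean_t) for u, v in zip(log_x, log_t))
         / sum((u - mean_x) ** 2 for u in log_x))
    a = exp(mean_t - b * mean_x)
    return a, b


# Noise-free synthetic points generated from t = 0.0039 * x**1.1682.
x = [2.0e4, 5.0e4, 1.0e5, 2.0e5, 4.0e5]  # hypothetical LPQI values
t = [0.0039 * v ** 1.1682 for v in x]
a, b = fit_power_model(x, t)
print(f"a = {a:.4f}, b = {b:.4f}")
```

Fitting in log space minimizes errors in ln t, which weights relative rather than absolute deviations. A nonlinear least-squares fit on the original scale would generally give somewhat different coefficients on noisy data.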
Table 4. The modeling and forecasting results based on multiple NTL indices.

| Model | R² | MRE |
|---|---|---|
| MLR | 0.6253 | 67.48% |
| ELM | 0.8968 | 33.60% |
| PSO-ELM | 0.8974 | 23.42% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yang, R.; Zhou, Q.; Xu, L.; Zhang, Y.; Wei, T. Forecasting the Total Output Value of Agriculture, Forestry, Animal Husbandry, and Fishery in Various Provinces of China via NPP-VIIRS Nighttime Light Data. Appl. Sci. 2024, 14, 8752. https://doi.org/10.3390/app14198752
