# **Nighttime Lights as a Proxy for Economic Performance of Regions**

Edited by Nataliya Rybnikova

Printed Edition of the Special Issue Published in *Remote Sensing*

www.mdpi.com/journal/remotesensing

Editor

**Nataliya Rybnikova**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editor* Nataliya Rybnikova Department of Geography and Environmental Studies University of Haifa Israel

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Remote Sensing* (ISSN 2072-4292) (available at: https://www.mdpi.com/journal/remotesensing/special_issues/Nighttime_Lights_Economic).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-3437-4 (Hbk) ISBN 978-3-0365-3438-1 (PDF)**

Cover image courtesy of the Earth Science and Remote Sensing Unit, NASA Johnson Space Center

© 2022 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.

## **About the Editor**

**Nataliya Rybnikova** was born in Lugansk, Ukraine, in 1981. She received her M.S. and first Ph.D. in economics from the Volodymyr Dahl Eastern Ukrainian National University, Ukraine, in 2004 and 2011, respectively. She received her second Ph.D. degree in remote sensing from the University of Haifa, Israel, in 2018. From 2018 to 2019, she was a Postdoctoral Fellow with the Remote Sensing Laboratory, University of Haifa, Israel. From 2019 to late 2021, she was a Postdoctoral Fellow with the Department of Mathematics, University of Leicester, UK, and the School of Environmental Studies, University of Haifa, Israel. Since then, she has been a Research Fellow at the Department of Geography and Environmental Studies, University of Haifa, Israel. She is the author of more than 30 articles and book chapters. Her research interests include processing and using artificial nighttime light imagery as a proxy for human presence on Earth.

## *Editorial* **Everynight Accounting: Nighttime Lights as a Proxy for Economic Performance of Regions**

**Nataliya Rybnikova 1,2,3**


Artificial nighttime lights, emitted from residential, industrial, commercial and entertainment areas, and captured by satellites, have proven to be a reliable proxy for on-ground human activities. Since the end of the 1990s, nighttime light data have been used to monitor population concentrations and to assess the economic performance of countries and regions (see, for instance, [1,2]). Since then, the number of such studies has grown rapidly, and nighttime lights have been used as a proxy for increasingly complex phenomena. The reason is that nighttime light data are a very promising tool for capturing the patterns of human activities remotely. Specifically, these data are nowadays available for each day and each point on the Earth, which gives them an advantage over traditional estimates, which might be scarce and irregular (due to time-consuming analysis, as in the case of defining commuting rates), non-unified (due to different national reporting standards), or confidential (due to security reasons or illegality, as in the case of shadow economies).

In the current Special Issue, we have collected ten recent examples of the successful use of nighttime light data in a variety of socio-economic tasks for which traditional direct techniques are either inapplicable or inefficient. Mohammad Reza Farzanegan and Sven Fischer [3], proceeding from the lifting of the international sanctions, proposed using nighttime light data to estimate and model the level of the shadow economy in Iran. John Gibson and Geua Boe-Gibson [4] used DMSP/OLS and VIIRS/DNB nighttime light data to assess their association with disaggregated GDP for various industries at the county level in the USA. Haoyu Liu with co-authors [5] demonstrated that county-level GDP data in the Chinese Mainland, which are costly to collect in time, money, and labor, can be substituted by nighttime lights combined with daytime remote sensing data. VIIRS-provided nighttime lights were used by Nils B. Weidmann and Gerlinde Theunissen [6] to substitute for hard-to-measure and rare data on local economic inequalities in the countries of the Global South. Bingxin Qi, Xuantong Wang, and Paul Sutton [7] combined population data with nighttime light data to model comparable estimates of educational inequality globally at the national level. Dan Lu, with co-authors [8], using the example of Chongqing municipality in China, proposed an approach that uses DMSP/OLS and VIIRS/DNB nighttime light data to model population dynamics in an agricultural mountainous region. The rest of the studies in the Special Issue are devoted to using nighttime light data for identifying urban extent. For this purpose, Yuping Wang and Zehao Shen [9], proceeding from the example of eleven urban districts of Nanjing, China, applied a threshold-based Kernel Density Estimator to Luojia 1-01 and VIIRS-provided nighttime light data. Feng Li, with co-authors [10], combined Luojia 1-01 nighttime light data with the Normalized Difference Vegetation and Water Indices to extract data from the urban areas in four capital Chinese cities. Ding Ma with co-authors [11] compared the hotspots extracted for 20 major Chinese cities from two alternative sources: the OpenStreetMap platform and VIIRS-provided nighttime light data. Finally, Nataliya Rybnikova with co-authors [12], using the example of two European countries, proposed replacing the time-consuming mechanism of identifying functional urban areas with optimal thresholds of VIIRS-provided nighttime lights.

**Citation:** Rybnikova, N. Everynight Accounting: Nighttime Lights as a Proxy for Economic Performance of Regions. *Remote Sens.* **2022**, *14*, 825. https://doi.org/10.3390/rs14040825

Received: 4 January 2022 Accepted: 17 January 2022 Published: 10 February 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

We believe that the analyses presented here will be useful both for the research community and for decision-makers who aim to better understand the patterns of regional economic development and to design more informed policies. In addition, the current studies may provide important insights for engineers who develop and launch satellites for nighttime remote sensing. Specifically, most of the studies in the field (including those of the current Special Issue) use panchromatic data, whereas the potential of multispectral imagery remains unrealized due to the low availability of the corresponding data.

The Guest Editor is grateful to the Editorial Team for ensuring smooth communication between the authors. I also thank the panel of invited reviewers whose constructive criticism helped to improve the articles and to make them more comprehensive for the reader.

Sincerely,

Nataliya Rybnikova.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **References**


## *Article* **Lifting of International Sanctions and the Shadow Economy in Iran—A View from Outer Space**

**Mohammad Reza Farzanegan \* and Sven Fischer**

Economics of the Middle East Research Group, Center for Near and Middle Eastern Studies (CNMS), School of Business and Economics, Philipps-Universität Marburg, 35032 Marburg, Germany; sven.fischer@uni-marburg.de **\*** Correspondence: farzanegan@uni-marburg.de

**Abstract:** With the implementation of the Joint Comprehensive Plan of Action (JCPOA) in 2016, Iran experienced a short period without international sanctions which resulted in an annual increase in the gross domestic product (GDP) in the following two years. However, it was not just the formal economy that was affected by the sanctions. Previous studies have shown that sanctions can negatively affect the shadow (or informal) economy and may even have a larger impact on the informal economy than on the formal economy. Nighttime lights (NTL) data allow us to study shadow economy activities that are not reported in the official GDP. This study uses a panel of NTL (the DMSP/OLS and VIIRS/DNB harmonized dataset) from 1992 to 2018 for 31 Iranian provinces to investigate the association between the lifting of sanctions and the growth of the shadow economy. The empirical results suggest an increase in shadow economy activity with the lifting of sanctions while controlling for other drivers of informal activities.

**Keywords:** shadow economy; Iran; sanctions; JCPOA; nighttime lights

**Citation:** Farzanegan, M.R.; Fischer, S. Lifting of International Sanctions and the Shadow Economy in Iran—A View from Outer Space. *Remote Sens.* **2021**, *13*, 4620. https://doi.org/10.3390/rs13224620

Academic Editor: Nataliya Rybnikova

Received: 20 September 2021 Accepted: 13 November 2021 Published: 17 November 2021


**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

The international sanctions that were introduced in 2012 by the United States (USA) and the European Union (EU) against Iran's oil and financial sectors had an immediate impact on the economy. Between 2012 and 2015, the average annual economic growth was −1.1%, compared to 2.4% in the prior four-year period [1]. Moreover, Iranian crude oil exports dropped from 2.4 to 1.4 million barrels per day in the same period [2]. Shadow economy activity also significantly decreased by more than 30 percentage points, suffering more from the sanctions than the formal economy [3]. When the formal part of the economy is under increased pressure, a declining informal economy may endanger the political stability of a country, increasing the risk of internal conflict [4,5].

After almost three years of negotiations, Iran and the five permanent members of the United Nations Security Council (China, France, Russia, United Kingdom, and the USA), the EU, and Germany came to an agreement called the Joint Comprehensive Plan of Action (JCPOA), which was implemented in 2016. As a consequence, most international sanctions against Iran were lifted, and for a short period, it appeared that economic activity returned to pre-sanction levels or better. However, in 2018, the US government decided to withdraw from the JCPOA and re-introduced sanctions against Iran, effectively ending the agreement and returning Iran to a situation under sanctions.

These circumstances allow us to study the effect of the lifting of international sanctions on the Iranian shadow economy in 2016 and 2017. Although views differ on how to define and measure the shadow economy, several established approaches address this topic. According to Schneider and Enste [6], a shadow economy includes "unreported income from the production of legal goods and services, either from monetary or barter transactions, hence all economic activities that would generally be taxable were they reported to the tax authorities." In this study, shadow economy growth is calculated with the help of the harmonized global nighttime light (NTL) dataset and gross domestic product (GDP) data from the Iranian Ministry of Economic and Financial Affairs (MEFA) [7].

The harmonized NTL dataset by Li et al. [8] utilizes monthly remote sensing data from the Defense Meteorological Satellite Program (DMSP) of the United States Department of Defense and data from the Visible Infrared Imaging Radiometer Suite (VIIRS) of the Earth Observation Group of the United States National Oceanic and Atmospheric Administration (NOAA). The authors created images of yearly data, removed noise, and harmonized the two different types of measurement to create a dataset that can be used as time-series data. Data gathered through remote sensing have several advantages, some of which are relevant for our study. In particular, they make it possible to measure economic activity that goes beyond the formal GDP, namely the shadow economy. This can help us to get a more precise picture of the impact of the lifting of sanctions on economic activity in Iran. We use the NTL data to create a panel dataset of 31 Iranian provinces with 837 observations, which we combine with the official data from MEFA. To estimate the association between the lifting of sanctions and the change of the shadow economy in Iranian provinces, we use multiple linear regression, in which the growth of the shadow economy, measured as the first difference of the logarithm, is the dependent variable. We present different specifications and add a variety of control variables to show the robustness of our results.

The study's contribution is that it uses a dataset that permits a longer study period relative to prior studies on the case of Iran. The dataset of Li et al. [8] gives us the possibility to study a longer time period for our panel data approach, instead of the two DMSP/OLS and VIIRS/DNB datasets that are usually not comparable over time. It is essential for our approach because the DMSP dataset ends in the year 2013 and does not include the years of lifted sanctions. Moreover, the effects of lifting sanctions are discussed, which is less covered in the literature compared to the effects of introducing sanctions. This is also true for the behavior of the shadow economy after economic shocks. In addition, this contribution will help to understand the dynamics of the shadow economy in similar economies. Finally, it is an important addition to several recent studies about international sanctions against Iran and already existing studies about its shadow economy [3,9–23]. The paper is structured as follows. Section 2 presents an overview of the relevant literature related to the topic, and Section 3 explains the data and methodology. In Section 4, the results are presented, and we discuss the results in Section 5. Section 6 concludes the paper.

#### **2. Literature Review**

Using nighttime light (NTL) data in economics has three main advantages. First, the growth in NTL reflects growth in economic activity, but it does not include possible GDP measurement errors in countries with low quality of national accounts [24–26]. Second, official GDP statistics do not account for the informal economy, which can be significant in many countries [3,27,28]. These two aspects can lead to underestimating the true effect of economic shocks. This has also been shown by several authors in the context of natural disasters [29–31]. The third advantage is that nightlight datasets are available for all countries and smaller geographical units and are therefore comparable across units. Existing studies mainly use NTL to investigate economic activity and social well-being [24,25,29–33], the shadow economy and remittances [3,27,28], and electricity consumption and light pollution [34–36], as well as urban ecosystems and urban extent mapping [37–39].

The first part of the literature links NTL to the formal and informal economy. Chen and Nordhaus show that NTL can be used as a proxy for GDP [24]. Their study is based on Elvidge et al., who estimated the light emissions for 21 countries and showed that the area lit is highly correlated to gross domestic product and electric power consumption [34]. Henderson et al. went a step further and developed a statistical framework to use NTL to augment official GDP growth measures [25]. They use it to improve the estimates of true GDP growth in countries with poor data quality. Additionally, Ghosh et al. discussed how nighttime lights could be used as a measure for human well-being such as GDP, poverty, informal economic activity, remittances, human ecological footprint, electrification rates, and how it can be used to calculate the Night Light Development Index (NLDI) and the Information and Technology Development Index (IDI) [32]. They also studied the role of remittances and the shadow economy in Mexico with the help of NTL from the United States (US) because a large number of remittances are sent from the US to Mexico every year [27]. According to the authors' results, the magnitude of Mexico's informal economy and the inflow of remittances are 150 percent larger than their existing official estimates in the gross national income. Moreover, Tanaka and Keola studied the shadow economy in Cambodia using census data on formally registered non-farm establishments and NTL [28]. Their results suggest that both formal and informal firms increased their estimated sales, and thus the informal sector increased quantitatively in both absolute and relative terms over time. Another approach from Shi et al. uses NTL to estimate the total freight traffic in China [33].

The second part of the literature uses the NTL to determine the impact of shocks on the economy, for example, natural disasters. Bertinelli and Strobl use NTL data as a measure of local economic activity to statistically assess the impact of hurricane strikes on local economic growth [29]. Their results suggest that, on average, hurricane strikes reduce income growth by around 1.5% at the local level, with no effect beyond the year of the strike, which is more than 2 times higher than the impact estimated from aggregate analyses. Moreover, Elliott et al. use a similar approach and focus on the impact of typhoons on local economic activity in coastal China at a spatially highly disaggregated level of approximately 1 km [30]. According to their results, a typhoon that is estimated to destroy 50% of the property reduces local economic activity by 20% for that year. Klomp examined the impact of large-scale natural disasters on economic development, measured by NTL, and found that natural disasters reduce the number of lights visible from outer space significantly in the short run, while climatic and hydrological disasters cause a large drop in the luminosity in developing and emerging market countries, and geophysical and meteorological disasters decrease light intensity more in industrialized countries [31].

The third part of the literature focuses on urbanization-related topics. Doll and Pachauri investigate how NTL and spatially explicit population data can be used to study electricity access [35]. They present satellite-derived estimates of rural populations without access to electricity in developing countries. In addition, they show the slow progress of electricity provision to households in Sub-Saharan Africa. Falchi et al., by contrast, focus on light pollution, which had not previously been quantified at the global scale [36]. With their world atlas of artificial sky luminance, the authors show that more than 80% of the world and more than 99% of the U.S. and European populations live under light-polluted skies. This can affect many dimensions of life, such as ecology, astronomy, health care, and land-use planning. Additionally, Bennie et al. studied the impact of light pollution on 43 global ecosystem types and found that all ecosystem types experienced an increase in light pollution in the period 1992 to 2012, with some ecosystems being affected more than others [37]. Zhou et al. use the DMSP/OLS nightlights to map the extent and dynamics of urban areas with a five-step method, which was tested on the cases of the United States and China [38]. Their results indicate that the urbanized area occupies about 2% of total land area in the US, ranging from lower than 0.5% to higher than 10% at the state level, and less than 1% in China, ranging from lower than 0.1% to about 5% at the province level, with some municipalities as high as 10%. After that, they developed spatially and temporally consistent global urban maps from 1992 to 2013 and found that the percentage of global urban areas relative to the world's land surface area increased from 0.23% in 1992 to 0.53% in 2013, with Asia being the continent with the most significant urban growth [39].

Despite challenges in defining and measuring a shadow economy, there is a significant body of literature on this topic [6,40–51]. Schneider and Enste summarize different approaches to measuring the shadow economy, such as the currency demand approach, the electricity demand approach, the labor force approach, or the multiple indicator multiple causes (MIMIC) model [6,50]. They use the latter to provide shadow economy estimates for many countries worldwide, including estimations for Iran. Several Iranian researchers have used the mentioned methods as well to determine the size of the shadow economy in the country [52–55]. A comprehensive study on the effects of international sanctions on Iran's informal economy comes from Farzanegan, in which he discusses important transmission channels of financial and energy sanctions, mainly through the foreign exchange markets [15]. In addition, Farzanegan and Hayo show that the international sanctions from 2012 and 2013 had a significantly stronger negative impact on the growth rate of the shadow economy than on the official GDP growth rate [3].

Economic sanctions and trade embargos have been discussed for decades in the economic literature. A theoretical framework to understand sanctions was developed by Eaton and Engers [56,57], who state that sanctions are "measures that one party (the sender) uses to influence another (the target). Sanctions, or the threat of sanctions, have been used by governments to alter the human rights, trade, or foreign policies of other governments". However, their framework models an interaction between only two parties and therefore ignores that there can be multiple senders. They also admit that many other factors are not considered in their approach. Caruso studied the impact of economic sanctions on trade by looking at the USA and 49 targeted countries from 1960 to 2000. His results suggest that extensive and comprehensive sanctions have a large negative impact on bilateral trade [58]. In addition, he shows that unilateral extensive US sanctions have a large negative impact, while limited and moderate sanctions induce a slightly positive effect on other G-7 countries' bilateral trade.

Torbat investigates the impacts of US trade and financial sanctions on Iran [9]. He summarizes the economic and political effects of different US sanctions since the establishment of the Islamic Republic of Iran in 1979. According to the author, the sanctions mainly damaged the Iranian economy, while the efficacy of sanctions has diminished in the long run and had minimal political effects. In addition, the negative effect on the US economy by losing a trade partner was low since the country is not dependent on a few trade partners. Overall, the author shows that unilateral sanctions might not have the strong effect intended by the sender country. This is also supported by further empirical studies, such as the study from Dizaji and Farzanegan, who analyzed the impact of unilateral and multilateral sanctions on Iran's military spending [12]. They found that an increase in the intensity of sanctions is associated with a larger decrease in military spending, both in the short and the long run. Moreover, they show that only the multilateral sanctions by the US and other countries have a statistically significant and negative impact on the military spending of Iran.

Additionally, Farzanegan estimates the effect of international banking and energy sanctions on Iran's military spending from 2012 to 2015 using the synthetic control method [16]. He concludes that per capita military spending was reduced by approximately USD 117 per year, on average. This supports previous findings, which show that Iran's military and security expenditures significantly respond to shocks in oil revenues or oil prices, while social spending components do not show any significant reactions [14]. The spending behavior of the Iranian government in the wake of sanctions can be linked to the quality of its political institutions [11,13]. Dizaji et al. show that sanctions have positive effects on the quality of democratic institutions in the short and medium terms, and trade openness may have a direct and positive impact on the size of its budget. However, the spending allocation depends on how trade affects the government budget and its political behavior simultaneously, which means that a weaker democracy can increase Iran's military expenditures and may reduce the share of non-military expenditures.

Based on these findings, Dizaji studies how improvements in trade openness in Iran due to lifting sanctions could affect political institutions and military spending [10]. The results of his impulse response analysis, based on an unrestricted VAR model, suggest that the response of political institutions to a one standard deviation shock to trade openness or to positive changes in trade openness is negative and statistically significant. By contrast, government revenues and defense and non-defense expenditures respond positively to a positive shock to trade openness, and shocks to trade openness influence military spending more than non-military spending.

Several other studies focus on the effects of oil price shocks and sanctions on different aspects of Iran's economy, such as GDP growth, inflation, publicly traded companies, environment, export, technology, foreign investments, and oil production [19–22]. Farzanegan and Markwardt analyze the dynamic relationship between oil price shocks and major macroeconomic variables using a VAR approach [59]. According to their results, the asymmetric effects of oil price shocks, negative and positive, significantly increased inflation. Furthermore, they find a strong positive relationship between positive oil price changes and industrial output growth, as well as the Dutch disease syndrome through significant real effective exchange rate appreciation. Gharehgozli estimates the costs of international sanctions against Iran from 2011 to 2014 using the synthetic control method and finds that sanctions during this period reduced Iran's real GDP by more than 17%, with the largest drop occurring in 2012 [18].

Another case study on Iran focuses on the impact of sanctions on the black-market premium on the Iranian Rial and US dollar exchange rate [23]. As discussed in previous studies [15,59], oil price shocks under sanctions may affect foreign exchange markets; thus, Zamani et al. investigate the effects of energy sanctions on the black-market premium on the exchange rate. Using data from 1959 to 2017 and a nonlinear autoregressive distributed lag (NARDL) model, they find that falling oil revenues caused by sanctions increase the black-market premium. Overall, previous studies on Iran have shown that economic sanctions and the lifting of sanctions can affect GDP growth, shadow economy growth, oil exports, trade, inflation, exchange rates, public spending, institutional quality, banking system, and other macroeconomic indicators connected to household welfare, with the focus mainly on the introduction of sanctions and not the removal of sanctions. Therefore, this article will be an important contribution to the latter part of the literature.

#### **3. Data and Methodology**

This study uses version 2 of the harmonized global NTL dataset by Li et al. [8] (https://figshare.com/articles/dataset/Harmonization\_of\_DMSP\_and\_VIIRS\_nighttime\_light\_data\_from\_1992-2018\_at\_the\_global\_scale/9828827/2, accessed on 12 October 2020), which is based on data from the Defense Meteorological Satellite Program/Operational Linescan System (DMSP/OLS) of the United States Department of Defense (https://eogdata.mines.edu/products/dmsp/, accessed on 5 November 2021), ranging from 1992 to 2013, and data from the Visible Infrared Imaging Radiometer Suite/Day Night Band (VIIRS/DNB) of the Earth Observation Group of the United States National Oceanic and Atmospheric Administration (NOAA) (https://eogdata.mines.edu/products/vnl/, accessed on 5 November 2021), ranging from 2012 to 2018 [60–62]. The harmonization procedure of Li et al. contains three major steps. First, they aggregated the global average radiance composite images of the VIIRS/DNB dataset from monthly to yearly observations. In addition, noise from auroras, fires, boats, and other temporary lights was excluded during this step. This might also filter out weak light emissions, for example, from small villages [26]. Second, they quantified the relationship between processed VIIRS data and DMSP NTL data in 2013 using a sigmoid function so that the processed VIIRS data have the same spatial resolution and similar radiometric characteristics as the DMSP data. Third, they applied the derived relationship at the global scale to obtain the DMSP-like data from VIIRS and finally generated the consistent NTL data by integrating the temporally calibrated DMSP NTL data (1992–2013) and DMSP-like NTL data from VIIRS (2014–2018). The authors provide one image in Tagged Image File (TIF) format for each year.
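The second harmonization step can be illustrated with a minimal sketch of a sigmoid mapping from VIIRS radiance to a DMSP-like digital number. The functional form (a logistic curve applied to log-transformed radiance) and the parameters `steepness` and `midpoint` are assumptions chosen for illustration only; they are not the specification or the fitted values of Li et al. [8].

```python
import math

DN_MAX = 63.0  # upper bound of the DMSP digital number scale

def dmsp_like_dn(radiance, steepness=1.5, midpoint=2.0):
    """Map a VIIRS radiance to a DMSP-like value with a logistic curve.

    Hypothetical form and parameters, for illustration only.
    """
    x = math.log(radiance + 1.0)  # compress the long right tail of radiance
    return DN_MAX / (1.0 + math.exp(-steepness * (x - midpoint)))

# Hypothetical yearly mean radiances for a handful of pixels.
viirs_radiance = [0.0, 0.5, 2.0, 10.0, 50.0, 200.0]
converted = [dmsp_like_dn(r) for r in viirs_radiance]
```

The curve is monotonic and saturates below 63, which mirrors the bounded range of the DMSP data that the converted VIIRS values are meant to resemble.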

Figure 1a,b show samples of the harmonized NTL dataset for the provinces of Iran from 2013, a year during the international sanctions against Iran, and from 2017, a year after the lifting of sanctions. We can see that the light intensity in most of the 31 provinces increased from 2013 to 2017. This increase alone does not indicate how the shadow economy changes over time, but we use the dataset to calculate the growth of the shadow economy, as presented in Equation (1). The shapefile used in the figures and in the process of data extraction is from the United Nations Office for the Coordination of Humanitarian Affairs (OCHA), Regional Office for the Middle East and North Africa [63]. It uses the latest provincial borders, the first-level administrative divisions, after the reform of Tehran province in 2010. With the help of the Zonal Statistics tool of the Quantum Geographic Information System (QGIS) and the shapefile, we calculate the average light intensity in each province for each year of the period 1992–2018. This leaves us with a panel dataset of 837 observations for the NTL (31 provinces over 27 years). The values represent the yearly mean of nighttime light intensity in each Iranian province and range theoretically from 0 (black) to 63 (white). Table 1 shows that the average NTL for Iranian provinces ranges between 0.139 and 25.426, well below the maximum, so the dataset is suitable for our approach: if a province had a value of 63, we would not be able to detect an increase in economic activity because this is the maximum possible value.
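The zonal averaging step can be sketched in a few lines. This is a simplified stand-in for the QGIS Zonal Statistics tool, assuming the province polygons have already been rasterized into a grid of labels aligned with the NTL image; the toy grids and the labels `"A"` and `"B"` are hypothetical.

```python
from collections import defaultdict

def zonal_mean(raster, zones):
    """Return {zone_id: mean pixel value} for a raster and a same-shaped zone grid."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for raster_row, zone_row in zip(raster, zones):
        for value, zone_id in zip(raster_row, zone_row):
            if zone_id is None:
                continue  # pixel lies outside every province
            sums[zone_id] += value
            counts[zone_id] += 1
    return {zone_id: sums[zone_id] / counts[zone_id] for zone_id in sums}

# Toy 3x4 image of digital numbers (0-63) and hypothetical province labels.
ntl_image = [
    [0, 2, 10, 63],
    [1, 3, 12, 40],
    [0, 0, 8, 20],
]
province_ids = [
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
    [None, "A", "B", "B"],
]

means = zonal_mean(ntl_image, province_ids)

# One mean per province and year gives the size of the panel.
n_provinces = 31
n_years = 2018 - 1992 + 1
panel_size = n_provinces * n_years
```

Repeating the averaging for each yearly image and each of the 31 provinces yields the 837 province-year observations mentioned in the text.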

**Figure 1.** (**a**) NTL in Iranian provinces, 2013; (**b**) NTL in Iranian provinces, 2017. Source: own illustration using the files from [8,63].


**Table 1.** Summary statistics of used data.

The economic data are from the Iranian Ministry of Economic and Financial Affairs (MEFA) (https://databank.mefa.ir/data?lang=en, accessed on 1 July 2021) and are available for many indicators from 2000 to 2019 [7]. Different measurements of shadow economies worldwide are usually available on the country level, like the data from Buehn and Schneider for the period 1999 to 2007 [41]. With their multiple indicator multiple causes (MIMIC) approach, they estimate the shadow economy in Iran to be between 17.3% and 19.1% of the official GDP, while the average of their full sample is 17.1%. An updated estimation for the period 1991 to 2017 estimates values between 13.2% and 20.5% [48]. However, our study uses NTL to calculate the growth rate of the shadow economy, mainly due to data available on the provincial level. With the following equation, we calculate the shadow economy growth:

$$
\Delta Ln(NTL_{i,t}) - \Delta Ln(GDP_{i,t}) = \Delta Ln(Shadow_{i,t}) \tag{1}
$$

We subtract the first difference of the natural logarithm of the current gross domestic product (GDP) in Iranian Rial (IRR) from the first difference of the natural logarithm of the harmonized NTL for 31 Iranian provinces (i) for the period 2005–2018 (t). The nighttime light data were previously used as an indicator of GDP [24,25]. However, Farzanegan and Hayo assume that it is likely that shadow economic activity has a positive relationship with the intensity of night light, too [3]. Therefore, the data on night light has two main components: one is related to activities registered in the official GDP (observed sector), and the other is related to activities in the shadow economy (unregistered sector).

Our dependent variable, the measurement of shadow economy growth, is the difference between the growth rates of nighttime light and GDP. To operationalize the dependent variable, we first take the natural logarithm of the formal GDP and the natural logarithm of the NTL data. Then we calculate the first difference of each of these two logarithms. The first difference (Δ) of the natural logarithm of a variable represents a growth rate similar to percentage growth, thus providing us with growth rates for NTL and GDP. Finally, we calculate the difference between the growth rate of nighttime light, ΔLn(NTL), and the growth rate of formal GDP, ΔLn(GDP), as presented in equation (1), to capture the relative development of the shadow economy.
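The three steps behind equation (1) can be sketched for a single province as follows. This is a minimal illustration under stated assumptions: the helper names `log_diff` and `shadow_growth` and the input series are hypothetical and do not come from the study's data.

```python
import math


def log_diff(series):
    """First differences of the natural log: ln(x_t) - ln(x_{t-1})."""
    logs = [math.log(x) for x in series]
    return [b - a for a, b in zip(logs, logs[1:])]


def shadow_growth(ntl, gdp):
    """Equation (1): dLn(NTL) - dLn(GDP) for one province."""
    return [n - g for n, g in zip(log_diff(ntl), log_diff(gdp))]


# Hypothetical province: lights rise 10% and then 5%, GDP rises 4% and then 5%.
ntl = [10.0, 11.0, 11.55]
gdp = [100.0, 104.0, 109.2]
growth = shadow_growth(ntl, gdp)
```

In this toy series the first value is positive because lights grow faster than GDP, and the second is approximately zero because both grow at the same rate. Each province yields one fewer growth observation than it has years of data, because the first difference consumes the initial year.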

Figure 2a,b show samples of the calculated values for the same two years as in the previous figures, where one year is in the time period of sanctions and the other in the time period of lifted sanctions. When simply comparing the changes of the shadow economy in these two years, we can already see a difference, namely a decrease of shadow economy compared to the year before in all provinces in the year 2013, and a positive value of shadow economy growth in most provinces in the year 2017, compared to the year before. The exceptions are the provinces Alborz, Ilam, Khuzestan, Kohgiluyeh and Boyerahmad, and Tehran. The summary statistics of NTL and other used variables are presented in Table 1, excluding the dummy variable.

A detailed data description, including sources, is available in Table A1 in Appendix A. From the descriptive statistics, we get a first glimpse of the behavior of the formal and informal economy, and of the difference between them, in the period studied. If we compare the volatility of ΔLn(GDP) and ΔLn(Shadow), we see that the shadow economy growth has a larger standard deviation of 0.26, compared to 0.12 for the formal economic growth. This is a first indication that the shadow economy might react more strongly to positive and negative economic shocks, which has been seen in previous studies [3]. On the one hand, we can argue that the shadow economy reacts or adapts faster to the new economic situation, but on the other hand, that also means more uncertainty for people engaged in the informal economy. The relationship between different variables can be seen in the correlation matrix in Table A2 of Appendix A.

**Figure 2.** (**a**) ΔLn(Shadow) in Iranian provinces, 2013; (**b**) ΔLn(Shadow) in Iranian provinces, 2017. Source: own illustration based on own calculations.

Using the described data, this study uses a regression analysis with ordinary least squares (OLS) and province-fixed effects. We use unbalanced panel data of 31 Iranian provinces from 2001 to 2018 and use the following specification:

$$
\begin{aligned}
\Delta Ln(Shadow_{i,t}) = {} & \alpha + \beta_1 \cdot Lifted_t + \beta_2 \cdot Trend_t + \beta_3 \cdot \Delta Unemployment_{i,t} + \beta_4 \cdot \Delta Agriculture_{i,t} + \beta_5 \cdot \Delta Industry_{i,t} \\
& + \beta_6 \cdot \Delta Ln(Shadow_{i,t-1}) + \beta_7 \cdot \Delta Ln(Shadow_{i,t-2}) + \pi_i + \varepsilon_{i,t}
\end{aligned} \tag{2}
$$

The dependent variable is the shadow economy growth that was calculated in (1), where the subscript *i* represents the Iranian provinces and the subscript *t* represents the years. In addition, the model includes a constant α and an error term *ε*. The term *Lifted* represents the dummy variable, which takes the value 1 if the year is 2016 or 2017, the two years of lifted sanctions under the JCPOA, and the value 0 otherwise. We have added a trend term that controls for time trends that are not otherwise controlled for in the model, for example, technological progress or inflation. As its values do not change between provinces, it also replaces the time-fixed effects, which are not included in our estimations because the inclusion of time dummies to capture possible trends would lead to perfect multicollinearity. The remaining independent variables are control variables that capture other drivers of shadow economy activity, namely unemployment and the size of the agriculture and industry sectors. These are often used in previous studies on the shadow economy, such as in studies by Farzanegan and Hayo or Schneider and Enste [3,50]. The service value added divided by total value added was not included due to collinearity. Furthermore, the model includes province fixed-effects π that are used to control for individual factors that affect each province over the period, for example, cultural attitudes toward formal and informal jobs. Despite relative homogeneity in terms of religion and culture, Iran is a multi-ethnic country with several ethnic minorities that are concentrated in different provinces. Moreover, we use robust standard errors that are clustered on the province level.
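The role of the province fixed effects can be illustrated with a deliberately simplified sketch of the within (fixed-effects) estimator. It keeps only the *Lifted* dummy as a regressor and omits the trend, the control variables, the lagged dependent variables, and the clustered standard errors of specification (2), so it is not the authors' estimation code. All names and the synthetic panel are hypothetical.

```python
from collections import defaultdict

LIFTED_YEARS = {2016, 2017}  # years of lifted sanctions under the JCPOA


def lifted(year):
    """Dummy variable: 1 in 2016 and 2017, 0 otherwise."""
    return 1 if year in LIFTED_YEARS else 0


def demean_by_group(rows, key):
    """Subtract each province's own mean from `key` (within transformation)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        sums[r["province"]] += r[key]
        counts[r["province"]] += 1
    return [r[key] - sums[r["province"]] / counts[r["province"]] for r in rows]


def within_slope(rows, y="y", x="x"):
    """Fixed-effects slope for a single regressor."""
    yd, xd = demean_by_group(rows, y), demean_by_group(rows, x)
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)


# Synthetic panel: province-specific intercepts plus a 0.1 shift when lifted.
intercepts = {"A": -0.05, "B": 0.02}
panel = [
    {"province": p, "year": t, "x": lifted(t), "y": a + 0.1 * lifted(t)}
    for p, a in intercepts.items()
    for t in range(2012, 2019)
]
beta_lifted = within_slope(panel)
```

Demeaning by province removes the province-specific intercepts, so the slope recovers the 0.1 shift built into the synthetic data regardless of how the intercepts differ. The dummy remains identified because it varies over time within each province.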

#### **4. Results**

The results of the empirical analysis, presented in Table 2, suggest that the lifting of international sanctions had a positive and statistically significant effect on the relative growth of Iran's shadow economy. The seven presented specifications show the relationship between the lifting-of-sanctions dummy variable and shadow economy growth, starting from a simple regression and then adding more control variables that are considered important drivers of the shadow economy. Table A3 in Appendix A shows the behavior of the dummy variable if we add the control variables individually. Additionally, we can see that the lifting of sanctions has a stronger effect on the growth rate of the shadow economy than on GDP growth. In 2016 and 2017, the GDP growth of Iran was 13.4% and 3.8% [1], respectively. This is an increase of 14.7 and 5.1 percentage points compared to 2015, when the annual GDP growth rate was −1.3%. As presented in specification (2.7), the lifting of sanctions in the time period 2016–2017 is on average associated with an increase of shadow economy growth of 13.6 percentage points. The remaining specifications show values between 14.4 and 30.9 percentage points, which are higher than the changes of the formal economy during the lifting of sanctions. This is also in line with arguments presented in the literature on the cyclical features of the shadow economy [64], especially in developing and emerging economies. A positive shock (lifting of sanctions) that boosts the formal economy has a stronger positive effect on the informal economy, while a negative shock (imposition of sanctions) that leads to a recession of the formal economy results in a more significant decline of economic activities in the informal economy, as shown in Farzanegan and Hayo [3].

**Table 2.** Lifting of international sanctions and shadow economy growth.


Notes: Robust standard errors clustered on the province level are reported in parentheses. Significance levels: \*\*\* *p* < 0.01, \*\* *p* < 0.05, \* *p* < 0.1.

Due to data availability, the number of observations in our seven specifications in Table 2 is different, but all coefficients of the dummy variable *Lifted* show the same direction and are statistically significant on the 1% level. Additionally, we lose observations by calculating the first differences and adding time lags of the dependent variable. In parentheses, we report robust standard errors clustered on the province level, and the significance levels are shown by one to three asterisks, where three asterisks refer to the 1% level, two asterisks refer to the 5% level, and one asterisk refers to the 10% level. The time lags of our shadow economy indicator are also statistically significant and negative, as expected from earlier studies [3]. The other control variables, such as different economic sectors, show mixed results. First, the coefficient of the first difference of industrial value added divided by total value added is negative and statistically significant on the 1% level in specifications 2.6 and 2.7 of Table 2; thus, an increase in industrial value added will, on average, decrease the shadow economy growth. This effect is expected and can be explained through the shift of informal industrial companies and workforce to the formal sector. In this case, the shadow economy serves as a substitute for the formal economy.

Second, the coefficient of the first difference of agriculture value added divided by total value added is negative and statistically significant on the 1% level in specification 2.7 of Table 2; thus, an increase in agriculture value added will, on average, decrease the shadow economy growth. This effect is expected and can be explained through the shift of the informal agricultural workforce to the formal sector. In this case, the shadow economy serves as a substitute for the formal economy. Third, the coefficient of the first difference of services value added divided by total value added, which is reported in Table A3 of Appendix A, is positive and statistically significant on the 1% level in specification 3.6. It was not reported in Table 2 because it was omitted due to collinearity with the other two sectors. This effect is not expected and would imply that the formal and informal service sectors are complements. This means that an increase in the service value added will increase the shadow economy growth. A possible explanation could be that a part of the workforce in the service sector will not be employed in the formal economy, so the growth of the official service sector might also increase the informal economy.

Moreover, the results suggest that an increase in unemployment is associated with an increase in shadow economy growth, as reflected by the coefficients in specifications 2.5 and 2.6 of Table 2, which are statistically significant on the 1% and 10% levels, respectively. This effect was expected, as an increase in unemployment will increase the number of people who are seeking jobs in the informal economy. The latter is also considered a safety net in the context of developing countries if the government does not provide any form of unemployment benefits. Firoozabadi et al. show in their case study of the province Sistan and Baluchistan how unemployment pushes people into the informal economy [54]. However, our result for unemployment is not robust to the inclusion of other control variables such as the value added of different economic sectors. As with the other control variables used in this study, the available time period is very limited, and as reflected in Table 2, the number of observations in our specifications falls significantly when adding some of the control variables. Therefore, there is a systematic shortening of the time period, and we should be careful not to overinterpret the control variables. As we do not want to lose observations in our shorter specifications, we kept as many observations as possible and did not limit them to the observations of specification 2.7. Overall, our main finding is that the variable of interest, the dummy variable of lifted sanctions, remains statistically significant when adding further control variables or changing the specifications in other ways, as presented in Tables 2 and A3 in Appendix A.

#### **5. Discussion**

In our study, we used an approach to estimate the change of shadow economy over time utilizing NTL data because there is no direct way of measuring the size or behavior of the shadow economy. This approach is comparable to other indirect ways of measuring the shadow economy, such as the currency demand approach, the electricity demand approach, or the labor force approach [6,50]. A more sophisticated approach is the MIMIC model, which uses multiple indicators and multiple causes and is usually used to determine the size of the shadow economy on the country level. This modeling approach, however, needs a large amount of data which we usually do not have for the province level, especially in the context of developing countries. Therefore, remote sensing data such as the NTL are an enormous help to shed light on the shadow economy in lower administrative units than the country level.

However, there are also several weaknesses of the NTL approach, and therefore limitations to our study. Firstly, we have the strong premise that NTL measures economic activity. An indication of this connection is the correlation of the mean of NTL and the GDP of 31 Iranian provinces. Pearson's correlation coefficient for the two data series is 0.65 based on 570 observations, which suggests a strong positive relationship. This has also been shown in several previous studies [24,29,31,34]. Secondly, we have a strong premise that the remaining part, when subtracting GDP from NTL, is connected to shadow economy activity. This phenomenon has also been widely discussed in the literature [3,27,28,32]. If these two premises hold, our approach will reflect the growth rate of the shadow economy.
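The correlation check behind the first premise can be sketched as follows. The function name `pearson_r` and the sample values are hypothetical. The study's own coefficient of 0.65 comes from 570 province-year observations that are not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical province-year pairs (mean NTL, GDP); not the study's data.
ntl_mean = [0.5, 1.0, 2.0, 4.0, 8.0]
gdp = [3.0, 2.0, 6.0, 7.0, 15.0]
r = pearson_r(ntl_mean, gdp)
```

A coefficient close to 1 indicates that provinces with brighter lights also tend to report higher GDP. It says nothing about the part of the lights that GDP does not explain, which is the subject of the second premise.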

However, in previous studies, nighttime lights have been used to measure various other forms of human activity, which might not all be associated with the shadow economy, for example, non-commercial private household electricity use. In addition, NTL might also contain other nighttime light sources that are not usually associated with formal or informal economic activities, for example, wildfires. Fortunately, this type of measurement error was removed in the data preparation process by Li et al. [8] and other researchers who have worked on the DMSP/OLS and VIIRS/DNB datasets before [60–62]. Another limitation is that the effect of our dummy variable for the lifting of sanctions only reflects the short-term effect on the shadow economy growth; thus, we do not know if this short period of lifted sanctions has an impact in the long run. We rather argue that the shadow economy adapts rapidly to economic challenges and thus might only be affected by this shock for a short period of time. Moreover, we do not measure the size of the shadow economy, but only the changes of the shadow economy due to international sanctions.

#### **6. Conclusions**

We provide new empirical evidence for the relationship between the shadow economy and international sanctions in Iran. First, we summarize previous findings on sanctions and the lifting of sanctions in Iran that have focused on aspects such as GDP growth, oil exports, trade, inflation, exchange rates, public spending, institutional quality, banking system, household welfare, and other macroeconomic variables. Extending the previous research on the Iranian shadow economy under sanctions [3,42], we show that the lifting of international sanctions in 2016 and 2017 was associated with an increase of the shadow economy which is larger than the increase of the official economy in Iran. Moreover, we found that the performance of the shadow economy is more volatile than the formal economy. Therefore, we can argue that the shadow economy is more flexible and can react faster to positive or negative economic shocks because the workforce and businesses in the informal economy do not need to follow administrative procedures or hiring practices. Usually, smaller-scale businesses in the informal economy are also able to react faster to shocks. This would explain why the shadow economy was hit harder when introducing sanctions, as seen in Farzanegan and Hayo [3], and recovered faster than the formal economy after lifting sanctions, as we have shown in this study. Overall, this means that the lifting of sanctions will be a relief for potential workers who seek employment both in the formal and informal economies in Iran. Our results are also in line with cyclical features of the shadow economy, suggesting that the informal economy undergoes larger output movements over the business cycle in emerging and developing economies than in developed economies [64].

**Author Contributions:** Conceptualization, M.R.F.; methodology, M.R.F.; data collection and estimations, S.F.; writing—original draft preparation, S.F.; writing—review and editing, M.R.F. and S.F.; overall study supervision, M.R.F. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** Table A1 of Appendix A includes an overview of the data sources used, which are publicly available.

**Acknowledgments:** We appreciate the helpful comments and suggestions of the Academic Editor (Nataliya Rybnikova) and four anonymous reviewers.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A.**


**Table A1.** Description of data used.

**Table A2.** Correlation matrix of variables used.


Notes: The presented Pearson's correlation coefficients are calculated based on the 217 observations that are available for all variables. Significance levels: \*\*\* *p* < 0.01, \*\* *p* < 0.05, \* *p* < 0.1.

**Table A3.** Lifting of international sanctions and shadow economy growth; alternative specifications.


Notes: Robust standard errors clustered on the province-level are reported in parentheses. Significance levels: \*\*\* *p* < 0.01, \* *p* < 0.1.

#### **References**


## *Article* **Nighttime Lights and County-Level Economic Activity in the United States: 2001 to 2019**

**John Gibson \* and Geua Boe-Gibson**

Department of Economics, University of Waikato, Private Bag 3105, Hamilton 3240, New Zealand; geua.boegibson@waikato.ac.nz

**\*** Correspondence: jkgibson@waikato.ac.nz; Tel.: +64-7-838-4289

**Abstract:** Nighttime lights (NTL) are a popular type of data for evaluating economic performance of regions and economic impacts of various shocks and interventions. Several validation studies use traditional statistics on economic activity like national or regional gross domestic product (GDP) as a benchmark to evaluate the usefulness of NTL data. Many of these studies rely on dated and imprecise Defense Meteorological Satellite Program (DMSP) data and use aggregated units such as nation-states or the first sub-national level. However, applied researchers who draw support from validation studies to justify their use of NTL data as a proxy for economic activity increasingly focus on smaller and lower level spatial units. This study uses a 2001–19 time-series of GDP for over 3100 U.S. counties as a benchmark to examine the performance of the recently released version 2 VIIRS nighttime lights (V.2 VNL) products as proxies for local economic activity. Contrasts were made between cross-sectional predictions for GDP differences between areas and time-series predictions of GDP changes within areas. Disaggregated GDP data for various industries were used to examine the types of economic activity best proxied by NTL data. Comparisons were also made with the predictive performance of earlier NTL data products and at different levels of spatial aggregation.

**Keywords:** VIIRS; DMSP; GDP; nighttime lights; cross-sectional; time-series; economic statistics

**Citation:** Gibson, J.; Boe-Gibson, G. Nighttime Lights and County-Level Economic Activity in the United States: 2001 to 2019. *Remote Sens.* **2021**, *13*, 2741. https://doi.org/10.3390/rs13142741

Academic Editor: Nataliya Rybnikova

Received: 13 May 2021 Accepted: 8 July 2021 Published: 12 July 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Satellites have been observing the Earth at night for over 50 years, but it is especially since the digital archive of nighttime lights (NTL) was established in 1992 by the National Oceanic and Atmospheric Administration (NOAA) that researchers have found an ever-growing set of uses for these data. Several key early studies by non-economists showed that NTL data from the Defense Meteorological Satellite Program (DMSP) could be used to estimate sub-national indicators of economic activity and per capita incomes [1–5]. Potential advantages of these NTL-based estimates, compared to traditional economic activity statistics like national or regional gross domestic product (GDP), are timeliness, lower cost, comparability between countries irrespective of statistical capacity, and availability for spatial units below the level at which GDP data are reported.

In the last decade, economists have also begun using NTL data. Widely cited early studies from two different research teams noted that DMSP data are noisy, but in a wide range of contexts [6,7], or alternatively, just in data-poor environments [8,9], DMSP data could add value to conventional economic statistics. In contrast to earlier studies focused particularly on comparing regions, a theme in recent studies by economists is using NTL data to track fluctuations in local economic activity in response to various shocks such as disasters [10–12], or certain policy interventions [13,14]. This use of NTL as a proxy for changes in local economic activity, plus ongoing cross-sectional use as a proxy for variation in economic performance, raises the question of how predictive NTL data are for studying differences in economic activity between areas and the temporal changes in activity within areas.

Several validation studies have considered this question by using GDP data as a benchmark for assessing predictive performance of NTL data. An early and widely cited study used national level DMSP and GDP data for 188 countries from 1992 to 2008 [7], while a similar study used these data for 1500 regions (mostly at first sub-national level) from 82 countries from 1992 to 2009 [15]. However, applied researchers who draw support from validation studies to justify their use of NTL data as an economic activity proxy have increasingly focused on smaller and lower level spatial units [16]. Several studies have used DMSP data at the third sub-national level, which includes counties, sub-districts, and NUTS3 regions [10,17–20], with some studies for even lower level spatial units such as villages [14], micro-grids [21], and even pixel-level [11,22]. This mismatch between the spatial level of validation studies and the spatial level of applied studies that use NTL data to proxy for economic activity matters because flaws in DMSP data such as spatial imprecision and blurring [23,24] make the predictive performance far worse for lower level spatial units such as the third sub-national level than for more aggregated units such as the national or first sub-national level [25].

The extant validation studies are mainly for older NTL data products such as DMSP. Some comparisons between GDP and version 1 NTL annual composites from the Visible Infrared Imaging Radiometer Suite (VIIRS) have been made [26], but these products are only for 2015 and 2016. To date, no validation studies have used version 2 VIIRS annual composites (V.2 VNL), which have recently been released [27]. To help close this gap in the literature, this study used the 2001–19 time-series of GDP for over 3100 U.S. counties as a benchmark to examine the usefulness of three NTL data sources, DMSP, V.1 VNL, and V.2 VNL, as proxies for local economic activity. We included data from the 2014–18 extension of DMSP based on pre-dawn readings (compared to the early evening readings for DMSP prior to 2014). We also used the V.2 VNL data with two other samples: a cross-country dataset, so that results could be compared with earlier validation studies [7], and state-level U.S. data, to examine the aggregation effects. Our panel data estimation framework helps to contrast cross-sectional predictive performance for differences between areas with performance for a time-series of changes within areas. A further contribution is to use GDP for various industries to see what economic activities are best proxied by NTL data. The industry-level results and related split-sample results based on agriculture's contribution to GDP and on population density provide a basis to consider how our findings may apply to other settings where the economic structure differs from that of the United States.

#### **2. Materials and Methods**

#### *2.1. Related Literature on NTL Validation Studies*

In the current context, validation studies have attempted to estimate the nature of the relationship between NTL data and traditional economic activity data for places with trustworthy data. These studies provide a basis for using NTL data as a proxy in other times and places where traditional data such as GDP are either absent or not trusted. The errors in GDP data should be independent of errors in NTL data, so some studies have noted an optimal indicator of true economic activity would weight a mixture of the two measures [7–9]. Studies using this framework have put some weight on DMSP data for examining cross-sectional differences in places where the GDP data have low reliability, but note that without further refinement of the NTL data, they are "not a reliable proxy for time-series measures of output growth" [9] (p. 241). A far lower predictive ability for time-series changes, even if DMSP data are good predictors of cross-sectional differences in economic performance, also holds at very local (third sub-national) levels in a developing country setting [28].

The VNL data from VIIRS are a refinement over DMSP data, in terms of spatial precision and temporal consistency [23], so the question of whether these data are a reliable proxy for measuring changes in economic activity has been examined, albeit within the limits of the short time-series for V.1 VNL annual composites. The V.1 VNL data predict over 70% of variation in U.S. state-level GDP (and over 85% of variation in GDP for metropolitan areas), but predict less than 4% of variation in annual rates of change in GDP [26]. Direct comparisons of VIIRS and DMSP have been limited because the V.1 VNL annual composites are only for 2015–16 [29] and the popular DMSP stable lights time-series [30] ends in 2013 (data from the DMSP 2014–18 extension are yet to be used). To deal with this issue, various researchers have constructed annual NTL estimates for 2013 from VIIRS monthly data, usually with masking procedures to remove outliers in the monthly data, and these VIIRS annual estimates predict GDP in cross-sections better than DMSP data do [25,31–33].

While several studies have noted that DMSP data are noisy measures of true luminosity, the nature of the measurement error has rarely been examined. A study at the second sub-national (NUTS2) level for Europe found mean-reversion, where errors in DMSP data negatively correlate with true values [33]. Unlike random errors that do not bias regression coefficients if NTL data are the left-hand side variable and attenuate coefficients in proportion to the reliability ratio if they are the right-hand side variable [34], mean-reverting errors in a left-hand side variable cause bias and in a right-hand side variable may overstate coefficients rather than attenuate them [35–37]. A decomposition using DMSP data adjusted for top-coding [38] found that most of the spatially mean-reverting errors were still present, implying that the blurring of the DMSP images [24] is the more important source of error in DMSP data [33].
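The contrast between random and mean-reverting errors can be made concrete with a stylized textbook setup. The notation here is an assumption introduced for illustration and is not taken from the cited studies. Let *z* be true luminosity, *x* = *z* + *u* the observed lights, and *y* = *β* *z* + *e* the outcome, where the error is *u* = *δ* *z* + *v* and the noise *v* is uncorrelated with *z* and *e*. With variances of *z* and *v* written as σ²(z) and σ²(v), a regression of *y* on *x* converges to

$$
\operatorname{plim}\, \hat{\beta} = \beta \cdot \frac{(1+\delta)\,\sigma_z^2}{(1+\delta)^2\,\sigma_z^2 + \sigma_v^2}
$$

For *β* > 0, the two cases discussed above then follow as

$$
\delta = 0:\quad \beta \cdot \frac{\sigma_z^2}{\sigma_z^2+\sigma_v^2} < \beta, \qquad\qquad -1<\delta<0,\ \sigma_v^2 \to 0:\quad \frac{\beta}{1+\delta} > \beta
$$

The first case is classical attenuation in proportion to the reliability ratio. The second shows how a negative correlation between the error and the true value can push the estimated coefficient above the true one.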

A consequence of mean-reverting errors is understated inequality between places as NTL estimates revert toward their mean. Some studies have considered inequality as an aspect of economic performance by using DMSP data as a proxy in places that lack timely or fine resolution sub-national GDP data [39,40]. However, validation studies show that DMSP data understate spatial inequality, especially in urban and high density areas, with this pattern holding across developed and developing regions of the world [25,33].

Validation studies have also examined the types of economic activity (and hence, the type of places, given different patterns of specialization) for which NTL data are a poor proxy. The GDP-luminosity relationship (using DMSP data from 1992 to 2009) is positive for countries with agricultural shares of GDP below 20%, but negative elsewhere [41]. The weaker relationship with agricultural sector activity is also seen at the third sub-national level in China in the DMSP data, while the V.1 VNL data (annual estimates from masked monthly records) are unrelated to primary sector GDP [25]. If NTL data poorly capture agricultural activity, it may help explain why NTL data are a weaker proxy for economic activity in low density areas [42], given the predominance of agriculture in such places.

#### *2.2. Data and Methods*

We used four data sources to test the relationships between night lights and county-level and state-level GDP. The first was real GDP in chained 2012 dollars, from the U.S. Bureau of Economic Analysis (BEA). The annual estimates are provided separately for each county for the 2001 to 2019 period, with three exceptions: in Alaska, the BEA combines some census areas in its reporting; in Hawaii, it combines Maui and Kalawao counties; and in Virginia, there are 23 BEA-created combination areas where one or two independent cities with 1980 populations of less than 100,000 are combined with an adjacent county. The dissolve function in ArcGIS was used to modify a county-level shapefile, so that it matched these combination areas. There were n = 3109 counties and combination areas (we refer to all of these as county-level units) with data available in each year.

The second data source was four annual products for the 2014 to 2019 period from the version 2 VIIRS nighttime lights (V.2 VNL) annual composites [27]. We used the average radiance, median radiance, and the masked variants of these two data products, summing the radiance by county-level unit in each year. While the V.2 VNL annual composites are also available for 2012 and 2013 (as they are built from monthly data available since April 2012), the values for those two years are yet to have a stray light adjustment. With the northerly latitude of much of the U.S., stray light can affect the images on many nights. This reduces comparability with the time-series from 2014 onwards, which is based on stray light corrected data, so we did not use the 2012 and 2013 V.2 VNL data.

The V.2 VNL are produced from monthly cloud-free radiance averages, with initial filtering to remove extraneous features such as fires and aurora before the resulting rough annual composites are subjected to outlier removal procedures. To isolate the background from lit grid cells, a data range threshold is set from 3 × 3 blocks of grid cells where the threshold is based on a multiyear maximum median and a multiyear percent cloud-cover grid [27]. In other words, there is a single data range threshold across all the years in the series, in contrast to the year-specific thresholds that were used for the version 1 VIIRS annual composites [29]. The data are in units of nanowatts per square centimeter per steradian (nW/cm<sup>2</sup>/sr) reported on a 15 arc-second output grid.

The third data source was the version 1 VIIRS nighttime lights (V.1 VNL) annual composites for 2015 and 2016 [29]; the only two years for which this product is available. We used the stray light corrected version (vcmsl) of these annual composites, with the outliers removed and background set to zero (ormntl). The average annual radiances from each of the 15 arc-second output pixels were summed to county-level totals.

The fourth data source was annual composites from the Defense Meteorological Satellite Program (DMSP) satellites F14, F15, F16, and F18. These composites provide an average digital number (DN) for each 30 arc-second output pixel, where DN values are 6-bit digital numbers that range from 0–63, with higher numbers indicating greater brightness. Ephemeral lights such as from fires and gas flares are removed from the annual composites, and the original processing by NOAA scientists also excluded (at pixel level) images for any nights affected by clouds, moonlight, sunlight, and other glare. The usual stable lights product has a time-series that ended in 2013 [30], with two satellites providing data for each year up to 2007, so there are 20 satellite-years available over the 2001 to 2013 period.

The DMSP satellites have an unstable orbit, tending to observe Earth earlier as they age. For example, a satellite tracking mission (see: http://www.remss.com/support/crossing-times/ accessed on 6 August 2019) shows equator crossing times for F18 of 8 pm in 2013, but 6 pm by 2018. Thus, what starts out as a Day–Night observation becomes a Dawn–Dusk observation. The Earth Observation Group at the Colorado School of Mines has exploited this feature to extend the time-series of DMSP stable lights annual composites by using pre-dawn data from satellite F15 for 2014 to 2018. Lights observed in the early hours of the morning are more likely to be from public infrastructure (e.g., street lights) than from private consumption and production activities, so the extended DMSP stable lights series may not be consistent with the earlier DMSP data, and we treated them as a separate source of information on NTL. For both sets of DMSP data, we used the sum of the DN values within a county-level unit.

Our main parameter of interest was the elasticity of GDP with respect to night lights, as estimated from the following regression:

$$
\ln(\text{real GDP})_{it} = \alpha + \beta \ln(\text{sum of lights})_{it} + \mu_i + \varphi_t + \varepsilon_{it} \tag{1}
$$

where *i* indexes the cross-sectional units (county-level units in most cases, but we also estimated Equation (1) with country and state-level data); *t* indexes years; the *μ<sub>i</sub>* are fixed effects for each cross-sectional unit; the *ϕ<sub>t</sub>* are the fixed effects for each year; and *ε<sub>it</sub>* is the disturbance term. The fixed effects let us control for time-invariant features of each cross-sectional unit, and spatially-invariant features of each time period. One could allow time effects to vary across space at some more aggregated level (e.g., at state level if there are county fixed effects), but the setup we used is the traditional one in economics studies using night lights data. The elasticity is a unit-free measure showing by what percentage the left-hand side variable changes for each percentage change in the right-hand side variable. Thus, the fact that the V.1 and V.2 VNL data are measured in nW/cm<sup>2</sup>/sr while the DMSP data are in DN values does not affect the estimation of the elasticity.
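A minimal sketch of how such an elasticity could be computed is given below. It assumes a balanced panel with one regressor, in which case the two-way fixed effects slope equals the slope after demeaning both variables by unit and by year. The function name and all data values are hypothetical, and the sketch omits standard errors and clustering.

```python
import math


def within_beta(y, x):
    """Two-way fixed effects slope for a balanced panel.

    `y` and `x` are lists of rows, one row per unit and one column per
    year. Both variables are demeaned by unit and by year before the
    slope is computed as cross-product over sum of squares.
    """
    n, t = len(y), len(y[0])

    def demean(m):
        row = [sum(r) / t for r in m]
        col = [sum(m[i][j] for i in range(n)) / n for j in range(t)]
        grand = sum(row) / n
        return [[m[i][j] - row[i] - col[j] + grand for j in range(t)]
                for i in range(n)]

    yd, xd = demean(y), demean(x)
    sxy = sum(xd[i][j] * yd[i][j] for i in range(n) for j in range(t))
    sxx = sum(xd[i][j] ** 2 for i in range(n) for j in range(t))
    return sxy / sxx


# Made-up sums of lights for three units over three years.
lights = [[10.0, 12.0, 11.0], [200.0, 210.0, 260.0], [50.0, 45.0, 70.0]]
unit_fx = [1.0, 4.0, 2.5]   # hypothetical unit fixed effects
year_fx = [0.0, 0.1, 0.3]   # hypothetical year fixed effects
ln_x = [[math.log(v) for v in row] for row in lights]
ln_y = [[unit_fx[i] + year_fx[j] + 0.12 * ln_x[i][j] for j in range(3)]
        for i in range(3)]

beta = within_beta(ln_y, ln_x)

# Rescaling lights (a change of measurement units) only shifts ln(lights)
# by a constant, which the fixed effects absorb.
ln_x_rescaled = [[math.log(1000.0 * v) for v in row] for row in lights]
beta_rescaled = within_beta(ln_y, ln_x_rescaled)
```

In this constructed example the slope recovers the value of 0.12 used to generate the data, and it is unchanged when lights are multiplied by a constant, which illustrates why the unit of measurement does not matter for the elasticity.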

The specification of Equation (1) with NTL data on the right-hand side does not imply that lights cause GDP (as any causation would go the other way) and instead, it has a predictive interpretation. The typical situation where NTL data are used as a proxy for local economic activity is because traditional measures like GDP are either unavailable or are considered untrustworthy. Thus, it is important to learn from settings like the U.S., where the GDP data are both available and trustworthy, about how closely NTL data correlate with GDP data, in order to see if the NTL data are an adequate proxy measure.

For example, many studies use NTL data to estimate impacts of a shock such as a natural disaster [10–12], which affects some cross-sectional units but not others, and occurs in some time periods but not others. The validity of using NTL data to estimate the impacts on local economic activity of such shocks (or more generally, of 'treatments') depends on the product of two relationships: (*∂GDP*/*∂lights*)·(*∂lights*/*∂treatment*). In the settings of interest, typically the *∂GDP*/*∂lights* relationship is not estimated because there are no GDP data (as any available and trustworthy GDP data would already be used for the evaluation). Instead, the validation studies from elsewhere provide evidence on the *∂GDP*/*∂lights* term that is needed for interpreting estimates of the impact of the treatment on night lights as estimates of the impact of the treatment on local economic activity. In other words, if relationships between changes in GDP and changes in NTL data are very weak, then it is hard to see how estimates of the (*∂lights*/*∂treatment*) effect are informative about how the shock impacts on economic activity and performance.
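As a rough worked illustration of this product, written in elasticity form with purely hypothetical numbers (an assumed GDP-lights elasticity of 0.1 and an assumed treatment effect that lowers log lights by 0.2, roughly an 18% fall):

$$
\frac{\partial \ln GDP}{\partial\, treatment} = \frac{\partial \ln GDP}{\partial \ln lights} \cdot \frac{\partial \ln lights}{\partial\, treatment} = 0.1 \times (-0.2) = -0.02
$$

Under these assumed values, a sizeable dimming of lights would translate into a fall in local GDP of only about 2%, which shows how strongly the interpretation depends on the first term.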

To provide a basis to interpret results of Equation (1), we considered two widely cited studies (with 1850 and 650 *Google Scholar* citations as of May 2021) that have reported estimates of Equation (1). With 17 years of DMSP data for 188 countries, the elasticity is about 0.3 (long differences give a similar value) [7]. With 18 years of DMSP data for 1500 regions (typically at the first sub-national level) from 82 countries, an even larger elasticity of about 0.4 was reported [15].

The Equation (1) specification is known as a 'fixed effects' or 'within' estimator, as the variation that allows *β* to be estimated comes from time-series changes for each cross-sectional unit. In other words, Equation (1) lets one see how changes in annual GDP vary with changes in NTL data. An alternative estimator that uses the same panel data is the 'between' estimator, where averages over time for each cross-sectional unit are used in the regression (e.g., the average GDP of a county from 2014 to 2019 is regressed on the average sum of lights in the county over the same period). The between estimator allows for examination of cross-sectional GDP differences between areas while the within estimator allows for time-series predictions of GDP changes within areas. We report the results for both estimators. The NTL data have been used in various studies in both contexts: to proxy for economic performance in cross-sectional studies, such as when long-run impacts of historical factors are considered [43], and in studies focused on fluctuations in economic activity because the intervention or shock that they study occurs in the sample period [12,44].
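The between estimator can be sketched in a few lines. The text does not say whether averages are taken before or after the log transform; this sketch averages the logged values, which is an assumption, and the function name and data are hypothetical.

```python
import math


def between_beta(y, x):
    """OLS slope of unit-level time averages of y on those of x."""
    y_bar = [sum(row) / len(row) for row in y]
    x_bar = [sum(row) / len(row) for row in x]
    my = sum(y_bar) / len(y_bar)
    mx = sum(x_bar) / len(x_bar)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x_bar, y_bar))
    sxx = sum((a - mx) ** 2 for a in x_bar)
    return sxy / sxx


# Made-up sums of lights for three units over three years.
lights = [[10.0, 12.0, 11.0], [200.0, 210.0, 260.0], [50.0, 45.0, 70.0]]
year_shock = [0.00, 0.02, -0.01]  # hypothetical shocks common to all units
ln_x = [[math.log(v) for v in row] for row in lights]
ln_y = [[2.0 + 1.05 * ln_x[i][j] + year_shock[j] for j in range(3)]
        for i in range(3)]

beta_between = between_beta(ln_y, ln_x)
```

Because the year shocks are common to all units, they shift every unit average equally and leave the cross-sectional slope at the value of 1.05 used to build the example.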

#### **3. Results**

#### *3.1. Country-Level Results*

We started with country-level results for a comparison to a key study that found a GDP-lights elasticity of 0.3 using the within estimator and DMSP data [7]. In the first two columns of Table 1, we show the results for all countries with data on real GDP in local currency units from 2014 to 2019 in the *World Development Indicators* [series NY.GDP.MKTP.KN].

The estimated GDP-lights elasticity was only 0.015 if the V.2 VNL average radiance product was used, while it was about six times larger, at 0.094, if the masked average was used. It seems that background noise and ephemeral sources of light in the unmasked data may attenuate within estimates of the elasticity. However, even after removing noise by masking, the elasticity was less than 0.1, which was far smaller than the earlier estimate of 0.3 with DMSP data. Moreover, omitting countries not in the sample of the widely cited Henderson et al. study [7] slightly lowered the estimated elasticity to 0.085 (column (3)). The other change in specification for results in the last two columns of Table 1 was to divide the sum of radiance by country area to match the way NTL data were used in the Henderson et al. study, and to add a quadratic term for the model reported in column (4); the squared term is statistically insignificant (*p* = 0.95), so the double logarithmic specification seems appropriate.

**Table 1.** Within estimator results for GDP-lights elasticities using V.2 VNL data: country-level 2014 to 2019.


Notes: Based on a panel of 203 countries (1192 observations) in columns 1 and 2, with ln(lights) based on the sum of radiance by year and country. Columns 3 and 4 are based on 181 countries (1072 observations) using lights per square mile in column 3 (and a quadratic of this, with an unreported squared term, in column 4). Models include year and country fixed effects. Standard errors in parentheses clustered at country level, \*\* *p* < 0.05.

The results in Table 1 suggest that findings from earlier periods using DMSP data may not apply in more recent periods with VIIRS NTL data. However, there are at least two issues with this evidence. First, applied studies are increasingly focused on lower level spatial units, so country-level results may provide less guidance than in the past when the NTL data were used with more aggregated spatial units. The second and more concerning issue is that country level GDP data are of widely varying reliability and so they may not provide the consistent benchmark given by sub-national GDP data for the United States.

#### *3.2. Results at County and State Level*

The results of using four V.2 VNL products (average radiance, median radiance, masked average radiance, and masked median radiance) for a panel of 3109 county-level units observed each year from 2014 to 2019 are reported in Table 2. The top panel has the "within" estimator results, based on time-series variation, and the bottom panel has "between" estimator results, based on differences in average economic performance in the cross-section. Unlike the country-level results in Table 1, which are subject to wide variation in statistical capacity between countries that make some GDP data more trustworthy than others, we considered that county-level GDP data produced by the BEA will provide a consistent level of reliability over time and space. Consequently, differences in the lights-GDP relationships are interpreted in terms of potential measurement error features of the NTL data, rather than reflecting possible errors in the GDP data that may vary with either spatial scale or types of economic activity.

**Table 2.** Relationships between VIIRS V.2 NTL and county GDP: within and between estimator results.


Notes: Based on a strongly balanced panel of 3109 county-level units, observed each year from 2014 to 2019, giving N = 18,654 observations. Standard errors in parentheses (clustered at county level for the within-estimator results), \* *p* < 0.10, \*\*\* *p* < 0.01.

The masked products were better predictors of time-series changes in GDP and cross-sectional differences in GDP than were the unmasked data products. The within-estimator *R*<sup>2</sup> values (which are always very low across all NTL data products, levels of spatial aggregation, types of economic activity, and time periods used in this study) were three points higher when using the masked data products. The between estimator *R*<sup>2</sup> values were 15 points higher (at 0.86 vs. 0.71) when using the masked VNL data products rather than their unmasked counterparts. Prior studies have shown that NTL data are more powerful cross-sectional predictors of differences in GDP (and other economic activity indicators) between areas than they are predictors of time-series changes [26,28,45]. This pattern also holds for the masked V.2 VNL data, where the *R*<sup>2</sup> values for the between estimator in the cross-section were almost 30 times as high as for the within-estimator of the time-series changes.

The GDP-lights elasticity was almost zero if using the within estimator with unmasked data products, and was 0.12 (0.13) when the masked average (median) was used. The masking procedure was designed to remove background noise and ephemeral sources of light [27]. To the extent that such noise is not auto-correlated across years, the usual pattern of random measurement error in a right-hand side variable, causing attenuation of the regression coefficient on that variable [34], seems to occur here, given that the estimated elasticity rises when masking is used to remove this noise from the data.
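The attenuation pattern can be reproduced in a small simulation. This is a simplified cross-sectional sketch rather than the panel setting used here: all values are simulated, the noise variance is set equal to the signal variance purely for illustration, and the seed is arbitrary.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(0)  # fixed seed so the run is repeatable
true_beta = 0.5         # hypothetical true slope
x_true = [rng.gauss(0.0, 1.0) for _ in range(20000)]
y = [true_beta * v for v in x_true]

# Classical (random, mean-zero) measurement error added to the regressor.
x_noisy = [v + rng.gauss(0.0, 1.0) for v in x_true]

slope_clean = ols_slope(x_true, y)
slope_noisy = ols_slope(x_noisy, y)
```

With noise variance equal to signal variance, the expected slope on the noisy regressor is about half the true slope, so removing noise (as masking does) should raise the estimated coefficient.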

With this attenuation bias pattern in mind, it may seem puzzling that the between estimator results showed a larger GDP-lights elasticity (at 1.26 rather than 1.05) when the unmasked data products were used. Although not reported in Section 3.1, a similar pattern showed up in the country-level results, where the between-estimator gave a GDP-lights elasticity of 0.96 with the unmasked data and of 0.86 with the masked data (and the difference was statistically significant at *p* < 0.02). A potential explanation lies in the impact of non-random, and specifically mean-reverting, measurement errors. The unmasked data included occurrences of apparent light (either ephemeral or noise) outside of usually lit areas. After averaging across years, the apparent radiance of these unlit areas was raised and so the apparent luminosity of these areas became closer to the mean. With this mean-reverting error, when NTL data are on the right-hand side of a regression, the coefficients can be exaggerated, as seen in the first two columns of between estimator results in Table 2. Once this noise is removed, the results in the last two columns in the lower panel of Table 2 suggest that, on average, a county where the sum of NTL is ten percent higher than for another county will have a real GDP that is 10.5 percent higher.
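The effect of mean-reverting error on a regression coefficient can be seen in a deterministic sketch. The error model below (each observed value pulled a fixed fraction of the way toward the mean) and the reversion fraction of 0.2 are assumptions chosen for illustration, not estimates from the data.

```python
def ols_slope(x, y):
    """Slope from a simple regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


true_beta = 1.05  # slope used to generate the example
lam = 0.2         # hypothetical fraction of mean reversion in the regressor

x_true = [1.0, 2.0, 4.0, 7.0, 11.0]  # made-up log lights
mean_x = sum(x_true) / len(x_true)
y = [true_beta * v for v in x_true]

# Observed regressor: dim places look brighter, bright places look dimmer.
x_obs = [v - lam * (v - mean_x) for v in x_true]

slope_obs = ols_slope(x_obs, y)  # equals true_beta / (1 - lam)
```

Compressing the spread of the regressor by a factor of (1 − λ) inflates the slope by 1/(1 − λ), which is the opposite direction to the attenuation caused by random error.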

The results in Table 2 are atypical of studies that relate NTL data to GDP data. While there are some county-level results for China [25], the validation studies with GDP data as a benchmark are mostly for spatially aggregated data at the national or first sub-national level, even as applied studies increasingly use NTL data locally [45]. It is therefore of interest to see how the results for estimating Equation (1) change when the GDP and NTL data are at the state level. This spatial aggregation suppresses much of the variation in the fluctuations; for example, the coefficient of variation for annual changes in log GDP, which is what the within-estimator is based on, has a value at the state level that is just one-sixth of the value at the county level. There is less suppression of variation for the between-estimator based on the averages over 2014–19, with the state-level coefficient of variation being one-half the county-level coefficient of variation.

An important change with state-level data is that there is less gain from masking to remove noise when using the within estimator; the top panel of Table 3 shows that the unmasked V.2 VNL data gives elasticities for changes in state-level GDP with respect to changes in state-level NTL of about 0.05, compared to 0.04 with the masked products (and these coefficients are surrounded by standard errors of about 0.03, so we cannot reject the hypothesis that the four sets of within-estimator elasticities in Table 3 are all the same). Unlike with the county-level data, predictive accuracy for annual changes in log GDP was not any higher when using the VNL masked data, and actually fell slightly from 0.05 to 0.02 (for the average radiances).


**Table 3.** Relationships between VIIRS V.2 NTL and state-level GDP: within and between estimator results.

Notes: Based on a strongly balanced panel of 51 state-level units (treating the District of Columbia as equivalent to a state), observed each year from 2014 to 2019, giving N = 306 observations. Standard errors are in parentheses (clustered at state level for the within-estimator results), \*\* *p* < 0.05, \*\*\* *p* < 0.01.

One interpretation of the fact that using masked data has little effect on the within estimator at the state level, unlike at the county-level, is that noise in estimates of annual changes in lights may cancel out as data are spatially aggregated to the state-level (noting also that there is less variability in annual GDP changes at state level than at county level). However, with even further aggregation to the country level in Table 1, using the masked data again seemed to matter (although discussion of the country-level relationships must be tempered by the fact that the GDP data across countries are likely to be a less consistent benchmark than are the sub-national data for the U.S. given the variation in statistical capacity between countries). The issue of how relationships between changes in NTL data and changes in GDP vary by level of aggregation is one that could usefully be investigated further.

The state-level results from the between estimator, in the bottom panel of Table 3, also show important differences from the county-level results. The predictive accuracy was lower, with *R*<sup>2</sup> values just below 0.70 with masked data products (or below 0.36 with unmasked data) compared to an *R*<sup>2</sup> of 0.86 at the county level. The elasticities were also lower at 0.84 compared to 1.05 in the county-level results with masked VNL data. Overall, this sensitivity to the level of spatial aggregation suggests a need to use findings from validation studies that are based on a similar level of spatial aggregation to what is used in one's own study.

#### *3.3. Results Using Earlier NTL Products*

The V.2 VNL data products have only been recently available, so much of the literature has used older NTL data products such as V.1 VNL and DMSP stable lights composites. In this section, we examine how the results of estimating Equation (1) changed when older NTL data products are used. For comparisons, we used the V.2 VNL masked average radiance as that data product had the equal best performance in Table 2. Additionally, summing a (masked) mean to a county total is conceptually more consistent with GDP, which is the sum of economic activity in a county, than the case for summing a median.

In Table 4, we report estimates of Equation (1) for 2015–16 using either V.1 or V.2 VNL data as the right-hand side variable. For the analysis of temporal changes in GDP with respect to changes in NTL (the within estimator), V.2 is clearly superior, with an elasticity about four times larger (and an *R*<sup>2</sup> over 10-times larger). This is consistent with the expectation of the data creators, that the V.2 VNL series would do better at the analysis of lighting changes, due to using the same outlier removal threshold in all years rather than using a threshold that is year-specific, as in the V.1 VNL product [27]. Nevertheless, we emphasize that the predictive power for county-level annual changes in GDP based on annual changes in NTL is very low, regardless of whether the V.1 or V.2 data are used. When cross-sectional differences were examined using the between estimator, performance of the V.1 and V.2 VNL data was very similar, with *R*<sup>2</sup> of about 0.86 and elasticities of about 1.03. Thus, existing cross-sectional results that have been established with the V.1 data should also hold with the V.2 data.

**Table 4.** Within and between estimators of GDP-lights elasticities: V.1 and V.2 VNL county-level results, 2015–16.


Notes: Based on a balanced panel of 3109 county-level units, observed in 2015 and 2016. The within estimator models include year and county fixed effects. The V.2 VNL product is the masked average radiance. Standard errors in parentheses (clustered at county level for the within-estimator), \*\* *p* < 0.05, \*\*\* *p* < 0.01.

Many studies of economic performance using NTL data continue to use DMSP data [16,33], even though the flaws in this data source, compared to VIIRS, have been known for almost a decade [23]. A key difference between these data sources is that even though the output grid for DMSP is only twice as coarse as for VNL (30 arc-seconds vs. 15 arc-seconds), the underlying spatial resolution of DMSP data is far coarser. This coarseness is due to geolocation errors [46], the smoothing of pixels into 5 × 5 blocks because onboard storage could not hold all the fine pixel data, and because there is no compensation for the expanded field-of-view as the Earth is viewed at an angle away from the nadir [24]. Consequently, the spatial precision of VNL images is at least 45 times greater than the precision of DMSP images [23]. One way that this imprecision shows up is through an exaggerated impression of urban extent from DMSP images [16,24,47].

Figure 1 shows how the lower 48 states of the U.S. (and also parts of Canada and Mexico) appear in the DMSP stable lights composite for 2013. Much of the land surface to the east of the 100° W meridian appears to be covered in light, and large clusters of light are also apparent around Denver, Salt Lake City, Phoenix, in California south of 39° N, and in Oregon and Washington north of 43° N. However, the picture shown with the V.2 VNL composite for 2014 appears very different, with cities having a far smaller lit area footprint than the DMSP data suggest (Figure 2). Notwithstanding the later overpass time of VIIRS, which may mean that some lights visible in the early evening have been turned off, the difference between Figures 1 and 2 reflects a key feature of DMSP of attributing city lights to places that are much less brightly lit (or even unlit). This feature contributes to noisy data that may distort apparent relationships between NTL and local economic activity.

There are several ways to numerically contrast Figures 1 and 2. A salient approach is to use spatial inequality statistics, as ever more studies use DMSP data to estimate inequality [39,40,48]. The overstated lit area in Figure 1 from DMSP blurring [24] makes it harder to distinguish areas of concentrated activity from other areas. Top-coding of DMSP data also attenuates differences between places. These spatially mean-reverting errors lead to far lower spatial inequality estimates when DMSP data are used, compared to when VIIRS data are used. When the Gini coefficient (an inequality measure that is zero for perfect equality and 1.0 for complete inequality) was calculated from the county-level GDP data, the average value over 2001–19 was 0.71 with no trend up or down. The V.2 VNL masked average radiances for 2014–19 gave a slightly lower value of 0.65, but it was not statistically significantly different to what the benchmark GDP data showed and also had no time trend. However, when the DMSP data for 2001–13 were used they gave an average Gini coefficient of just 0.50, significantly below the benchmark GDP estimate. Moreover, the DMSP data misleadingly suggested a downward trend in spatial inequality that was not apparent with the benchmark GDP data.
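For readers who want to compute a comparable statistic, a minimal sketch of an unweighted Gini coefficient follows. Whether the reported coefficients used any weighting is not stated in the text, so the unweighted form is an assumption, and the input values are made up.

```python
def gini(values):
    """Unweighted Gini coefficient of non-negative values.

    Uses the rank-based formula on sorted data:
    G = 2 * sum(rank * x) / (n * sum(x)) - (n + 1) / n
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    ranked_sum = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * ranked_sum / (n * total) - (n + 1.0) / n


# Illustrative inputs: identical units, then all activity in one unit.
gini_equal = gini([1.0, 1.0, 1.0, 1.0])
gini_concentrated = gini([0.0, 0.0, 0.0, 1.0])
```

With this finite-sample formula the maximum is (n − 1)/n rather than exactly 1.0, a gap that is negligible with 3109 county-level units.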

**Figure 1.** Night lights according to the DMSP stable lights annual composite, 2013.

**Figure 2.** Night lights according to masked average radiance from the V.2 VNL, 2014.

In Table 5, we report the results of estimating Equation (1) using DMSP data for the panel of 3109 county-level units observed between 2001 (when the GDP data were first available) and 2013 (when the most widely used DMSP stable lights time-series ends). The table parallels Table 2, except for the earlier time period. For each year from 2001 to 2007, two DMSP satellites provided data (F14 and F15 through 2003, F15 and F16 through 2007). To deal with this extra information, we used three procedures reflecting approaches from applied studies. The first was to simply average the DN values from the two satellites operating in a particular year [49]; the second was to discard information from one satellite so that each year only had one source of data [13]; and the third recast the analysis in terms of satellite-years and introduced fixed effects for each satellite, in addition to fixed effects for each year [8]. The satellite-year approach creates an observation from the interaction of a year and a satellite; for example, F15\_2001 is a separate observation from F14\_2001 or from F15\_2002. Thus, when this method is used, the years with two satellites providing the data are counted twice as often as the years with just a single satellite. Therefore, to put equal weight on each year, the observations from 2001 to 2007 were weighted by 0.5 (as all of these years have two satellites providing the data) while a weight of 1.0 was used for the other years. Given that economics studies rarely use inter-calibrated DMSP data [50,51], as the year dummies in Equation (1) are claimed to deal with year-by-year fluctuations in the NTL time-series caused by sensor degradation and differences between satellites [7], we also did not use inter-calibrated DMSP data products.
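The satellite-year weighting can be sketched as follows. The satellite coverage follows the description in the text and the notes to Table 5; the function and variable names are hypothetical.

```python
def satellites_for(year):
    """Satellites providing stable lights data in a given year."""
    if 2001 <= year <= 2003:
        return ["F14", "F15"]
    if 2004 <= year <= 2007:
        return ["F15", "F16"]
    if 2008 <= year <= 2009:
        return ["F16"]
    if 2010 <= year <= 2013:
        return ["F18"]
    raise ValueError("year outside the 2001 to 2013 sample")


# One observation per (satellite, year) pair.
satellite_years = [(sat, year)
                   for year in range(2001, 2014)
                   for sat in satellites_for(year)]

# Weight each observation by the inverse of the number of satellites in
# that year, so that every year carries the same total weight.
weights = {(sat, year): 1.0 / len(satellites_for(year))
           for sat, year in satellite_years}

weight_by_year = {}
for (sat, year), w in weights.items():
    weight_by_year[year] = weight_by_year.get(year, 0.0) + w
```

This yields 20 satellite-years in total, with each year from 2001 to 2013 contributing a combined weight of 1.0.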


**Table 5.** Relationships between DMSP NTL and county GDP: within- and between estimator results.

Notes: Based on a balanced panel of 3109 county-level units, each year from 2001 to 2013. The within-year averaging affects years 2001 to 2007, which each have two satellites providing data. To use observations of only one satellite per year, we used F15 from 2001 to 2007, F16 in 2008 and 2009, and F18 from 2010 onwards. Standard errors in parentheses (clustered at county level for the within-estimator), \*\*\* *p* < 0.01.

How the issue of two DMSP satellites per year is dealt with affects the within-estimates of the GDP-lights elasticity, which can vary from 0.10 (using satellite-year observations) to 0.25 (using within-year averaging). A review of 18 economics studies using DMSP data found only two used satellite fixed effects while all used year fixed effects [16]. The results in Table 5 imply possible sensitivity of the results in this literature from not exploring other ways of incorporating multiple DMSP readings within a year (the within estimator is also affected by inclusion or exclusion of particular years, as seen below). This issue has no effect on the between estimator, which gives estimated elasticities of 1.22 across-the-board, because it is the same whether one first averages between satellites within a year and then averages over years, or instead averages over all satellite-years in one go.

Given the sensitivity to different ways of dealing with the observations from years with two DMSP satellites providing data, we also report the results in Table 5 for a 6-year time-series from 2008 to 2013. By necessity over this period, there is only one satellite available per year and so there is no sensitivity to different ways of dealing with multiple satellites in the same year. Additionally, these results (in the final column of Table 5) used a time-series that was of the same length as the time-series used for the V.2 VNL results shown in Table 2.

Two key patterns emerged from comparing the results in Table 5 with those in Table 2. First, the within estimator gave a higher GDP-lights elasticity using DMSP data for the period to 2013 than when using V.2 VNL data for the period since then, being about 50% higher if attention was restricted to the two 6-year time-series. Second, the between estimator showed that DMSP data gave elasticities more similar to those from the unmasked V.2 VNL data than those from the masked VNL data. Specifically, the estimated elasticity was 1.22 with DMSP data, 1.26 with unmasked V.2 VNL data, and only 1.05 with masked V.2 VNL data. In other words, the results with DMSP data were more like those coming from V.2 VNL data that had not had the background noise removed, which is an indirect way of saying that there is evidence of noise in the DMSP data. This noise reflects two features of DMSP data noted previously: attributing light to unlit places (blurring) and top-coding in brightly lit places [23,24]. Both features produce errors that cause a reversion toward the mean, and are likely to lead to elasticities being overstated rather than understated [35–37] if DMSP NTL data are on the right-hand side of regression equations.
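How top-coding compresses differences between places can be shown with a toy example. The "true" brightness values below are hypothetical numbers expressed on the DN scale, and the sketch covers only top-coding at the 6-bit ceiling of 63, not blurring.

```python
DN_MAX = 63  # ceiling of a 6-bit digital number (0 to 63)


def top_code(values, ceiling=DN_MAX):
    """Cap each value at the sensor ceiling."""
    return [min(v, ceiling) for v in values]


# Hypothetical true brightness for pixels in a bright and a dim place.
bright_place = [40, 150, 90]
dim_place = [5, 10, 5]

# Dim-to-bright ratio of summed lights, before and after top-coding.
ratio_true = sum(dim_place) / sum(bright_place)
ratio_coded = sum(top_code(dim_place)) / sum(top_code(bright_place))
```

Only the bright place loses light to the cap, so the dim place looks relatively larger after top-coding, which is the mean-reverting pattern described above.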

The blurring and top-coding of DMSP that contribute to the noise in the NTL data are illustrated at finer scale in Figure 3, which maps four counties in western Massachusetts: Berkshire, Franklin, Hampshire, and Hampden using V.2 VNL data and DMSP data. The largest city in this region is Springfield (population: 160,000), and lights from this city (with masked average radiance exceeding 130 nW/cm<sup>2</sup>/sr) are clearly visible in the middle of Hampden county in map (a) using V.2 VNL data for 2014. The largest cities in the other counties are far smaller, with populations of about 45,000 in Pittsfield (Berkshire Co.), 40,000 in Amherst (Hampshire Co.), and only 18,000 in Greenfield (Franklin Co.). The smaller size and lower brightness (e.g., no pixels in Pittsfield had an average radiance greater than 54 nW/cm<sup>2</sup>/sr) of these other cities is also clear with the V.2 VNL data.

In contrast, the DMSP stable lights image for 2013 makes much of the area appear to be lit, with lights extending north from Springfield along the Interstate 91 (I-91) corridor to Greenfield and into Vermont and New Hampshire (Figure 3b). Likewise, most of Berkshire county appears to be lit, with some parts seeming to be almost as brightly lit as Springfield. For example, Pittsfield has areas with DN = 60, which is almost as high as some areas in Springfield that have pixels with DN = 63. However, the reality seen in the V.2 VNL radiance data was that Pittsfield was only about 40% as brightly lit as Springfield, in line with being only one-quarter as populous.

When lights are aggregated to county level, the DMSP data greatly understate the differences between places. For example, the sum of lights for Franklin county was 35% of the sum of lights for Hampden county when DMSP data for 2013 were used. In contrast, the V.2 VNL data for 2014 showed that the sum of lights for Franklin county was just 9% of what was emitted by Hampden county. The GDP of Franklin county in either 2013 or 2014 was just 12% of that of Hampden county, and so the V.2 VNL data are a far more realistic proxy for what GDP reveals about the differences in economic activity in these two places.

This feature of DMSP data in understating differences between places is due both to blurring, which attributes light to unlit or less-lit places, and top-coding [33]. At least for the example of western Massachusetts, these two problems seemed to contribute equally to understated differences between places. In certain years (1996, 1999, 2000, 2002, 2004, 2005, and 2010), 'radiance-calibrated' DMSP data were derived from certain nights when NOAA asked the Air Force to turn down the amplification on the DMSP sensors, so that DN values were not top-coded in urban areas [52]. With these data for 2010, the sum of radiance-calibrated lights in Franklin Co. was one-quarter the sum of lights for Hampden Co., while the GDP of Franklin Co. in 2010 was only 13% of that of Hampden Co. In other words, the radiance-calibrated lights data made the smaller economy seem twice as large as what the GDP data showed. This improved over the three-fold overstatement of the smaller economy implied by the usual DMSP lights data, but the fact that the radiance calibrated lights still understated the GDP differences highlights the importance of the blurring problem in DMSP data, given that this problem is not dealt with by the radiance-calibration.
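The degree of overstatement implied by each data source can be checked directly from the figures quoted above, by dividing Franklin county's share of Hampden county's lights by its share of Hampden county's GDP (stable lights for 2013, radiance-calibrated lights for 2010, and V.2 VNL for 2014, respectively):

$$
\frac{0.35}{0.12} \approx 2.9, \qquad \frac{0.25}{0.13} \approx 1.9, \qquad \frac{0.09}{0.12} = 0.75
$$

A ratio of 1.0 would mean that relative lights match relative GDP exactly, so the V.2 VNL figure is the closest of the three to the GDP benchmark.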

Features of DMSP data like blurring that contribute to exaggerated GDP-luminosity elasticities in between-estimator results seem to hold in the extended DMSP series for the 2014–18 period. In Table 6, we report results using V.2 VNL data and extended DMSP data. The between estimator elasticity of 1.05 with V.2 VNL data was hardly changed from what was reported in Table 2 (as averaging was over five of the six years used in Table 2), but DMSP data for the same period gave an elasticity of 1.14. Once again, this exaggeration of the elasticity was consistent with mean-reverting errors in DMSP data. For the within estimator results, the elasticity with DMSP data was smaller, perhaps because pre-dawn lights are less responsive to fluctuations in economic activity than are evening lights. For both the within and between estimators, the V.2 VNL data were more powerful predictors of GDP than were the DMSP data.

**Figure 3.** Night lights of western Massachusetts according to (**a**) V.2 VNL masked average radiance in 2014 and (**b**) DMSP stable lights in 2013.

A higher GDP-lights elasticity (for 2014–18) from V.2 VNL data than from extended DMSP data also holds with the country-level data. Recall from Table 1 (column 2) that the country-level elasticity with VNL data was 0.094 ± 0.038. This elasticity rose to 0.131 ± 0.034 when 2019 was omitted (so there is some sensitivity to sample periods). In contrast, with extended DMSP data, the elasticity was 0.063 ± 0.026 (the within *R*<sup>2</sup> was 0.046 compared to 0.118 with VNL data). Even noting that pre-dawn lights may vary less with economic fluctuations than do evening lights, this is a far smaller GDP-lights elasticity than seen in prior results with DMSP data.

**Table 6.** Within and between estimators of GDP-lights elasticities: DMSP and V.2 VNL county-level results, 2014–18.


Notes: Based on a balanced panel of 3109 county-level units, observed each year from 2014 to 2018. The within estimator models include year and county fixed effects. The V.2 VNL product was the masked average radiance. Standard errors in parentheses (clustered at county level for the within-estimator), \*\* *p* < 0.05, \*\*\* *p* < 0.01.
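To make the distinction between the two estimators concrete, the following is a minimal sketch on a balanced panel. It is not the estimation code used for Table 6: the data are synthetic, the function names are illustrative, and the slopes built into the synthetic panel (1.05 and 0.10) are arbitrary values chosen only to echo the orders of magnitude discussed in the text.

```python
import numpy as np


def between_slope(y, x):
    """Between estimator: OLS slope of unit means of y on unit means of x."""
    y_bar, x_bar = y.mean(axis=1), x.mean(axis=1)
    x_c = x_bar - x_bar.mean()
    return float(x_c @ (y_bar - y_bar.mean()) / (x_c @ x_c))


def within_slope(y, x):
    """Within estimator with unit and year fixed effects (balanced panel only)."""
    def demean(a):
        # Two-way demeaning removes unit effects and year effects.
        return (a - a.mean(axis=1, keepdims=True)
                - a.mean(axis=0, keepdims=True) + a.mean())
    y_d, x_d = demean(y), demean(x)
    return float((x_d * y_d).sum() / (x_d * x_d).sum())


# Synthetic balanced panel: rows are units (counties), columns are years.
rng = np.random.default_rng(0)
n_units, n_years = 200, 5
level = rng.normal(size=(n_units, 1))                 # persistent log lights
dev = rng.normal(scale=0.2, size=(n_units, n_years))  # year-to-year deviations
dev -= dev.mean(axis=1, keepdims=True)                # zero mean within each unit
year_fx = rng.normal(scale=0.05, size=(1, n_years))   # common year shocks

log_lights = level + dev
log_gdp = 1.05 * level + 0.10 * dev + year_fx

b_between = between_slope(log_gdp, log_lights)
b_within = within_slope(log_gdp, log_lights)
print(b_between, b_within)
```

The sketch shows that the same panel can carry a large cross-sectional elasticity and a small time-series elasticity at once, which is why the two estimators are reported separately.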

#### *3.4. Results Using GDP by Industry*

The U.S. has a larger share of GDP from the services sector than does any other major economy. The strength of the relationship between NTL and overall GDP depends on the structure of the economy because not all types of economic activity are equally reliant on lighting at night [25,26,41]. Thus, one way to examine how the above findings for the U.S. may apply to other countries is to look at estimates of Equation (1) that are disaggregated by industry, so that some extrapolation of the results to settings with different industrial structures can be considered.

The first two columns of Table 7 show that V.2 VNL data have higher predictive power for services sector economic activity than for goods-producing activities, whether examining cross-sectional differences or time-series changes. Hence, in countries where the services sector is less important than in the U.S., the NTL data may be less successful as a proxy for local GDP than they are in the U.S.

The private goods sector covers a range of industries and in some of them, there is a very weak, or entirely absent, relationship between NTL data and economic activity. The last two columns of Table 7 show the results for agriculture, forestry, fishing, and hunting (the primary sector), and for mining, quarrying and oil and gas extraction. The within estimator showed that changes in nighttime lights were not related to changes in primary sector economic activity, while they were only weakly related to changes in activity in the mining and oil and gas extraction sector. The between estimator results showed that GDP-lights elasticities were far smaller for these two industries than for all goods-producing industries and the *R*<sup>2</sup> values were much lower (and are almost zero for the primary sector).

Another way to consider the pattern shown in the third column of Table 7 is to divide counties into two groups, based on having an above-median or below-median share of agriculture in GDP (based on the 2014–19 averages). The within estimator results from column 3 of Table 2, where the elasticity was 0.12 ± 0.02, were re-estimated for these two sub-samples. In the counties where agriculture is more important, the elasticity was only 0.05 ± 0.02 (and the *R*<sup>2</sup> = 0.01), but where agriculture is less important, the elasticity was 0.18 ± 0.03 (and the *R*<sup>2</sup> = 0.08). Thus, NTL data may be less useful as a proxy for fluctuations in overall economic activity in places where agriculture is more important. Notwithstanding this result for fluctuations in economic activity, between estimator results in the first two columns of the lower panel of Table 8 suggest that V.2 VNL data remain a good proxy for differences in GDP between counties, whether they are more reliant on agriculture or not.


**Table 7.** Relationships between V.2 VNL masked average radiance and GDP by industry: counties, 2014–19.

Notes: Based on county-level panels, observed each year from 2014 to 2019, with N = 2935 cross-sectional units for the first two columns and N = 2850 cross-sectional units for the last two columns. The private goods-producing industries consist of agriculture, forestry, fishing, and hunting; mining, quarrying, and oil and gas extraction; construction; and manufacturing. Standard errors in parentheses (clustered at county level for the within-estimator), \*\*\* *p* < 0.01.

**Table 8.** Split-sample results for relationships between VIIRS V.2 VNL and county GDP.


Notes: Based on 3109 county-level units, observed each year from 2014 to 2019. The share of agriculture in GDP was averaged over all years and counties were then allocated into the above median or below median group based on the multi-year average. Population density was based on the 2010 census. Standard errors in parentheses (clustered at county level for the within-estimator results), \*\*\* *p* < 0.01.
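The split-sample allocation described in the notes can be sketched as follows. This is an illustrative reading of the procedure, with hypothetical names and toy data; how ties at the median were handled is not stated in the text, so assigning them to the below-median group is an assumption.

```python
import numpy as np


def above_median_mask(share_panel):
    """Flag units whose multi-year average share exceeds the median.

    share_panel: array of shape (n_units, n_years), e.g., the share of
    agriculture in GDP for each county and year.
    Ties at the median fall in the below-median group (an assumption).
    """
    avg_share = np.asarray(share_panel, dtype=float).mean(axis=1)
    return avg_share > np.median(avg_share)


# Toy example: four counties observed over two years.
ag_share = np.array([
    [0.10, 0.30],
    [0.00, 0.02],
    [0.50, 0.50],
    [0.30, 0.30],
])
mask = above_median_mask(ag_share)
print(mask)
```

The within and between estimators would then be run separately on the rows where the mask is true and where it is false.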

One reason NTL data may be a less useful proxy for fluctuations in overall economic activity in more agricultural places is that there are some forms of non-agricultural activity, like retail shopping and wholesale distribution, which may occur at night aided by concentrated artificial light, while this is less common for agriculture. Another factor is agriculture's use of space as a productive input, so population density and NTL intensity are lower in agricultural areas. For example, the counties with an above median share of agriculture in GDP had an average population density just under 40 people per square mile in the 2010 Census, while the counties with a below median share of agriculture had an average density more than 10 times higher, at almost 440 people per square mile.

The last two columns explore the role of population density more directly by splitting the sample into counties above and below the median density. In higher density counties, the predictive power of NTL data as a proxy for GDP was higher, for both the within estimator and the between estimator. The overall level and the composition of economic activity vary with population density, so relationships between NTL data and traditional indicators such as GDP will average over what could be quite disparate relationships for particular places and types of activity, and this should be borne in mind when NTL data are used as a proxy.

#### **4. Discussion**

In this paper, we used a comprehensive and updated set of DMSP, V.1 VNL, and V.2 VNL nighttime lights data. We mainly examined the relationships with county-level and state-level economic activity for the U.S. over the 2001 to 2019 period, but we also provided some country-level results to link to the previous literature. Our motivation for using this rich set of NTL data products, and for using the lowest level spatial units that have GDP data available, stems from a concern that existing validation studies that assess NTL data as a proxy for economic activity are mainly for dated and imprecise DMSP data, and the most widely cited of these studies use aggregated spatial units such as nations or the first sub-national level. However, NTL data are increasingly used to proxy for economic activity at very local levels such as the third sub-national level and below. Another feature of recent applied studies is using NTL data to proxy for temporal fluctuations in local economies when evaluating the impacts of various shocks or policy interventions. In contrast, earlier studies tended to use NTL data to study regional differences in economic performance.

A key overall finding is that masked average radiance from the V.2 VNL data product was a better cross-sectional and time-series predictor of GDP than any of the other NTL products considered here (with the masked median also a good predictor). Masking to zero out background noise and ephemeral lights substantially improved predictive performance in cross-sections of county- and state-level GDP, and for time-series changes in county-level GDP. The masked V.2 VNL also better predicted time-series changes in GDP than did the V.1 VNL data, most likely because V.2 VNL uses a single multiyear threshold to isolate the background from lit grid cells while the year-by-year thresholds used for V.1 VNL may provide a less consistent basis for detecting changes. Comparisons with the predictive performance of extended DMSP data, which are based on pre-dawn readings from 2014 to 2018, also highlight the superiority of the masked V.2 VNL data.

When the various NTL data products faced the same benchmark GDP data, some predicted better than others. At least one reason for this is that some NTL data products are more error-ridden measures of true luminosity. The patterns of GDP-luminosity elasticities help to reveal the nature of these measurement errors. If either DMSP data or unmasked VNL data are used, the cross-sectional GDP-luminosity elasticity from the between estimator is exaggerated, with county-level estimates exceeding 1.20 (or 1.14 for the extended DMSP data product) compared with an elasticity of 1.05 from the masked VNL data that should have the least noise. This exaggeration of the elasticity suggests that measurement errors in DMSP data, and in unmasked VNL data, are mean-reverting rather than random. Consequently, these measurement errors will bias regression coefficients even if NTL data are the left-hand side variable, and can exaggerate coefficients rather than attenuate them if NTL data are the right-hand side variable.
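The contrast between random and mean-reverting measurement error can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a reproduction of the paper's analysis: the true elasticity, the error variances, and the strength of mean reversion (`lam`) are arbitrary illustrative values, and all names are hypothetical.

```python
import numpy as np


def ols_slope(y, x):
    """OLS slope from a regression of y on x with an intercept."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))


rng = np.random.default_rng(1)
n = 5000
beta = 1.0                                  # assumed true elasticity
true_lights = rng.normal(size=n)            # true (log) luminosity
log_gdp = beta * true_lights + rng.normal(scale=0.1, size=n)

# Random (classical) error: unrelated to the true value.
lights_random = true_lights + rng.normal(scale=0.5, size=n)

# Mean-reverting error: dim places are overstated, bright places understated.
lam = 0.3                                   # assumed strength of mean reversion
lights_mean_rev = (true_lights
                   - lam * (true_lights - true_lights.mean())
                   + rng.normal(scale=0.1, size=n))

slope_random = ols_slope(log_gdp, lights_random)
slope_mean_rev = ols_slope(log_gdp, lights_mean_rev)
print(slope_random, slope_mean_rev)
```

With lights on the right-hand side, random error pulls the estimated slope below the true value (attenuation), while mean-reverting error compresses the variation in measured lights and pushes the slope above it.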

There are at least two other consequences of mean-reverting errors in popular NTL data products like the DMSP annual composites. First, the literature that is beginning to use these data to estimate trends in spatial inequality may prove misleading, as inequality is significantly understated by DMSP data compared to what the GDP data and VIIRS data show. Second, attempts to splice together DMSP and VNL data to obtain a longer time-series face a key difficulty in finding an adjustment factor to make the DMSP data more like the VNL data. The measurement errors in DMSP data appear to vary with true but unknown luminosity; less brightly-lit areas have apparent luminosity overstated and more brightly-lit areas have it understated. Hence, no single adjustment factor, like an inter-calibration regression coefficient, can be most appropriate in all times and places. Moreover, spatial aggregation also affects the impacts of the measurement errors, as seen in the different patterns of results at county and state level.

The NTL data did far worse at predicting time-series changes in county GDP than at predicting cross-sections of GDP. A prior study also found this in the V.1 VNL data [26], but the results here are more compelling because they are from a longer time-series, using V.2 VNL data that should better measure lighting changes because they are derived from a constant threshold across years for isolating the background from lit grid cells. The weak relationship between changes in NTL and changes in GDP raises doubts about applied studies that show the effects of their treatment (e.g., a shock) on NTL data. If the GDP-luminosity elasticity is only 0.1 (and the within *R*<sup>2</sup> values are close to zero, as seen in Table 2), which is far lower than the elasticities in the literature reported from DMSP data at the national and first-subnational level, then it is hard to see how changes in NTL data are a good proxy for changes in local economic activity. In other words, estimates of the impact of the treatment on NTL data may not be very informative about the impact of the treatment on economic activity. In particular, treatment effects may be far smaller than presumed from econometric estimates using NTL data, especially if the researchers assume that cross-sectional elasticities hold in the time-series context [45].

#### **5. Conclusions**

There are several things that we can conclude from our analyses. First, masking to reduce measurement error improved the predictive power of V.2 VNL data. Second, predictive accuracy in county-level cross-sections was about 30 times higher than for county-level time-series changes in GDP. Third, the V.2 VNL data better predicted time-series changes in GDP than did the V.1 VNL data, likely due to V.2 VNL using a single multiyear threshold for isolating background from lit grid cells while the V.1 VNL uses year-by-year thresholds. Fourth, whether examined at the country level or county level, the relationship between recent temporal fluctuations in GDP and fluctuations in V.2 VNL data yielded a far smaller elasticity than was estimated when DMSP data were used for earlier years. Fifth, cross-sections of DMSP data provided similar results to what unmasked VNL data showed, indicating noise in the DMSP data (this pattern also holds if using the extended DMSP series). Relatedly, the DMSP data understate spatial inequality and the example we provide suggests that this comes in equal parts from blurring and top-coding.

The results reported here pertain to the United States, a setting where NTL data are not especially needed for research, given the abundance of other data on economic activity. However, the patterns of results across the various NTL data products for different spatial levels and for modeling time-series changes versus cross-sectional variation in economic performance should hold more broadly. For example, just using the U.S. data, it was possible to obtain a GDP-luminosity elasticity of 0.25 if a particular way of handling years with two DMSP satellites was used, which is quite close to the existing values in the literature beyond the U.S., despite more precise VNL data suggesting an elasticity below 0.1. Moreover, the U.S. is a very diverse country, with types of economic activities in some places that are more like those in poorer countries. For example, given that NTL data are shown to be poor predictors of agricultural activity, or of changes in total economic activity in highly agricultural counties, there are grounds to question whether NTL data can be relied upon as a proxy for economic performance in predominantly agricultural settings in other countries. Relatedly, we also show that the NTL data were a less useful proxy for economic activity in less densely populated areas. Overall, our results suggest a need for greater caution in using NTL data as a proxy for economic activity, especially as findings from validation studies in different settings, or with different NTL data products, or at different levels of spatial aggregation may not translate to other settings.

**Author Contributions:** Conceptualization and methodology, J.G.; Software, validation, data curation and GIS analysis, G.B.-G.; Econometric analysis, preparation of original draft and revision editing, J.G.; Visualization, G.B.-G.; Project administration and funding acquisition, J.G. Both authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Marsden Fund Grant UOW-1901.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The annual VNL V.2 data used in this study are available for download from the Earth Observation Group of the Colorado School of Mines at https://eogdata.mines.edu/products/vnl/#annual\_v2, accessed on 9 July 2021, and the V.1 data are at https://eogdata.mines.edu/products/vnl/#v1, accessed on 6 April 2021. The DMSP stable lights annual composites are available at https://eogdata.mines.edu/products/dmsp/#download, accessed on 9 July 2021. The county-level and state-level GDP are available from the Bureau of Economic Analysis at https://www.bea.gov/data/gdp/gdp-county-metro-and-other-areas, accessed on 9 July 2021.

**Acknowledgments:** We are grateful to Xiangzheng Deng, Bonggeun Kim, and four anonymous reviewers for comments that have helped to improve this paper. We also acknowledge the use of images and data processing by the Earth Observation Group, Payne Institute for Public Policy, Colorado School of Mines.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Nightlight as a Proxy of Economic Indicators: Fine-Grained GDP Inference around Chinese Mainland via Attention-Augmented CNN from Daytime Satellite Imagery**

**Haoyu Liu <sup>1</sup>, Xianwen He <sup>1</sup>, Yanbing Bai <sup>1,</sup>\*, Xing Liu <sup>2</sup>, Yilin Wu <sup>1</sup>, Yanyun Zhao <sup>1</sup> and Hanfang Yang <sup>1</sup>**


**Abstract:** The official method of collecting county-level GDP values in the Chinese Mainland relies mainly on administrative reporting data and suffers from high costs of time, money, and human labor. To date, a series of studies have been conducted to generate fine-grained maps of socioeconomic indicators from easily accessed remote sensing data and have achieved satisfactory results. This paper proposes a transfer learning framework that regards nightlight intensities as a proxy of economic activity degrees to estimate county-level GDP around the Chinese Mainland. In the framework, paired daytime satellite images and nightlight intensity levels were applied to train a VGG-16 architecture, and the output features at a specific layer, after dimensional reduction and statistics calculation, were fed into a simple regressor to estimate county-level GDP. We trained the model with data from 2017 and utilized it to predict county-level GDP for 2018, achieving an R-squared of 0.71. Furthermore, the results of gradient visualization confirmed the validity of the proposed framework qualitatively. To the best of our knowledge, this is the first time that county-level GDP values around the Chinese Mainland have been estimated from both daytime and nighttime remote sensing data relying on an attention-augmented CNN. We believe that our work will shed light on both the evolution of fine-grained socioeconomic surveys and the application of remote sensing data in economic research.

**Keywords:** attention-augmented CNN; nightlight; fine-grained GDP estimation; daytime satellite imagery; arbitrary area representation

#### **1. Introduction**

Fine-grained, large-scale measures of economic development levels are vital to resource allocation and policy-making. Gross domestic product (GDP) is an elementary but crucial indicator for assessing regional productivity and consumption. Disaggregated GDP maps can reflect both the overall development levels and the regional imbalance within a country. It is worth emphasizing that the geographic administrative hierarchy in China is province, city, and county in descending order; the county is a relatively small administrative unit, which is quite different from the systems of many other countries. In the Chinese Mainland, the official county-level GDP values, i.e., GDP of the second sub-national administrative unit in China [1,2], are collected by local government statistical services. The final GDP values are calculated mainly from administrative reporting data and supplemented (or amended) by periodical surveys and censuses. However, official county-level GDP values are often heterogeneous and costly [3] because statistical institutions at the county level commonly suffer from a lack of specialized personnel and from limited access to essential materials [4,5].

**Citation:** Liu, H.; He, X.; Bai, Y.; Liu, X.; Wu, Y.; Zhao, Y.; Yang, H. Nightlight as a Proxy of Economic Indicators: Fine-Grained GDP Inference around Chinese Mainland via Attention-Augmented CNN from Daytime Satellite Imagery. *Remote Sens.* **2021**, *13*, 2067. https://doi.org/10.3390/rs13112067

Academic Editor: Nataliya Rybnikova

Received: 12 March 2021 Accepted: 21 May 2021 Published: 24 May 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In recent years, remote sensing data have been increasingly applied to predict socioeconomic indicators [6]. A series of studies have been conducted to develop convenient and scalable methods for estimating various indices relying on nighttime lights, satellite imagery, and emerging machine learning models. Previous studies have demonstrated the correlation between lit areas and GDP values [7,8]. Researchers successfully applied nightlight data and regression methods to generate disaggregated maps of socioeconomic indicators in different regions around the world, including both developed countries such as European Union countries, the United States [8], and Japan [9], and developing countries such as India [10,11] and China [12]. However, nightlight data are easily affected by coarse resolution, noise, and oversaturation [13,14]. They also overlook the relationship between economic developments and geographic patterns. The rapid development of convolutional neural networks and the availability of high-resolution daytime satellite imagery enable the detection of detailed land appearances thought to be strongly correlated with socioeconomic statuses, such as buildings, cars, roads, and farmlands [15,16]. Despite the informativeness of daytime satellite imagery, it is often infeasible to estimate socioeconomic indicators directly from daytime images since there are hardly enough ground-truth data to supervise the training of data-intensive CNN-based models. Building on previous studies, some researchers creatively combined daytime and nighttime remote sensing data [17,18]. They regarded nightlight intensities as a data-rich proxy of economic development degrees and applied CNN-based classifiers to predict nightlight from the corresponding daytime images. The high-dimensional output features at a specific layer of the CNNs are then fed into simple regressors to estimate the indicators of interest. In this way, fine-grained GDP maps can be generated conveniently from easily accessed data.

This paper aims to predict annual county-level gross domestic product (GDP) around the Chinese Mainland from readily accessed remote sensing data, including daytime satellite imagery and nighttime lights. Our framework is mainly based on the work of Jean et al. [18], in which paired daytime images and nightlight intensities are utilized for training a CNN classifier, and the output features at a specific layer, after dimensional reduction and statistics calculation, are fed into a simple regressor for the final estimation. We boost the model performance by incorporating an attention mechanism into the CNN architecture. To the best of our knowledge, this is the first time that the CNN-based estimation of county-level socioeconomic indicators from remote sensing data has been applied on such a large scale in China, i.e., all over the country. Since the number of image grids belonging to a county-level administrative region varies, each economic index, i.e., annual county-level GDP, corresponds to an indeterminate number of output feature vectors. To unify the dimensions of downstream model inputs, feature vectors belonging to the same county are regarded as a sample, and the representative statistics are computed as the final independent variables for the regression models.
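The aggregation step described above, in which a variable number of feature vectors per county is collapsed into a fixed-length input, can be sketched as follows. The text does not specify which representative statistics are used, so the mean and standard deviation here are assumptions for illustration only, as are the function name and toy data.

```python
import numpy as np


def county_statistics(features, county_ids):
    """Collapse per-image feature vectors into one fixed-length row per county.

    features:   array of shape (n_images, n_features)
    county_ids: array of shape (n_images,) giving each image's county
    Returns the sorted unique county ids and an array of shape
    (n_counties, 2 * n_features) holding [mean, std] per county.
    The choice of mean and std is an assumption, not the paper's stated method.
    """
    features = np.asarray(features, dtype=float)
    county_ids = np.asarray(county_ids)
    counties = np.unique(county_ids)
    rows = []
    for county in counties:
        block = features[county_ids == county]  # all images in this county
        rows.append(np.concatenate([block.mean(axis=0), block.std(axis=0)]))
    return counties, np.vstack(rows)


# Toy example: county "A" has two image tiles, county "B" has one.
feats = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
ids = np.array(["A", "A", "B"])
counties, X = county_statistics(feats, ids)
print(counties, X)
```

Whatever statistics are chosen, the point is that every county ends up with a row of the same length regardless of how many image tiles it contains.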

Our work has the following contributions.


#### **2. Related Work**

#### *2.1. Estimating GDP with Nightlight Only*

A large number of previous studies have investigated the association between economic activities and nighttime lights at different scales and in different areas. The Defense Meteorological Satellite Program (DMSP) and the Visible Infrared Imaging Radiometer Suite (VIIRS) are two main sources of nightlight data applied in socioeconomic research [19]. The DMSP annual stable lights from 1992 to 2013 published by the NOAA Earth Observation Group (EOG) boosted studies that estimated socioeconomic indicators, GDP for instance, from nighttime luminous data in the last decades. Elvidge et al. [20] examined the relationship between the area of lighting measured from the DMSP data and country-level GDP for 200 nations. Doll et al. [7] moved one step further and produced the first-ever global map of GDP using the total lit area of a country, indicating a high correlation between nightlights and GDP at the country level. As fine-grained socioeconomic data became increasingly desirable, subsequent studies tended to examine the relationship between nightlight and economic activity degrees at smaller geographic units. Doll et al. [8] successfully produced disaggregated maps for 11 European countries along with the United States at a 5 km spatial resolution using nighttime radiance data and the prevailing land-use data. There was also evidence that nightlight could be applied to predict sub-national GDP or income levels in developing countries such as India [10,11] and China [12,21]. These studies verified the rationality of considering nighttime lights (provided by the DMSP data) as a proxy of regional economic activity degrees. However, flaws in DMSP data, including pervasive blurring, no calibration, coarse spatial and spectral resolution, and inter-satellite differences [14,22], introduced inaccuracy, and even invalidity, into studies using this data source, especially for smaller units and lower density areas [23,24]. In comparison, the new-generation VIIRS data, which became available from 2012 onward, were more pertinent to the needs of socioeconomic researchers. Empirical results showed that the VIIRS data could be a promising supplementary source for socioeconomic indicator measures [25–27] and performed better than the DMSP data [23,28].

It should be noted that, despite their superiority to DMSP data, estimation at small geographic units and detection of agricultural activities remained challenges for the utilization of VIIRS data [23,26]. The estimation of county-level GDP in China from nightlight was only affected by the limitation of data sources. To the best of our knowledge, no studies have ever generated county-level GDP maps covering the whole country. Moreover, many studies either eliminated the output of primary industry, i.e., agricultural output, from local GDP [29] or incorporated additional information such as land-use status and rural population [30]. Data that are both informative for small or low-density regions and easily accessed are needed.

#### *2.2. Detection of Economic-Related Visual Patterns from Daytime Satellite Imagery via Deep Learning*

Remote sensing data are valuable for economic studies because they provide access to information hard to obtain by other means and generally cover broad geographic areas [6]. Apart from nighttime lights, daytime satellite imagery is another valuable resource for socioeconomic research. Compared with relatively low-resolution nightlight data, daytime images contain many more features and can reveal more detailed topographic information [6]. Land appearance detection relying on CNN-based architectures from daytime satellite imagery performs well in locating regions that are strongly related to socioeconomic status [15,31,32]. Engstrom et al. [15] trained CNNs to extract features concerning buildings, cars, roads, farmlands, and roof materials from high-resolution daytime images. They fed these features into a simple linear model and explained nearly sixty percent of both poverty headcount rates and average log consumption at the village level in Sri Lanka. Abitbol and Karsai [32] applied a CNN model to predict inhabited tiles' socioeconomic status and projected the class discriminative activation maps onto the original images, interpreting the estimation of wealth in terms of urban topology. To date, daytime imagery and deep neural networks have been widely applied to predict various socioeconomic indicators such as population [33–35], poverty distribution [15,18,36], and urbanization [6,37]. Despite their convenience and scalability, these studies depend largely on data-intensive CNNs and require large volumes of ground-truth labels to supervise the training process. Han et al. [6] developed a framework for learning generic spatial representations in a semi-supervised manner. They constructed a small custom dataset in which daytime satellite images were classified into three urbanization degrees by four annotators and applied it to fine-tune the CNN-based classifier pre-trained on ImageNet [38]. The output features can be adopted to predict various socioeconomic indicators, but the training labels of this method suffered from high expenses and subjective judgments.

#### *2.3. Nightlight as an Intermediate between Economic Indicators and Daytime Satellite Imagery*

In many developing countries, reliable sub-regional socioeconomic data are scarce and expensive, making it difficult for data-intensive neural networks to directly learn relevant features from informative daytime satellite imagery. Since nightlight data are much more abundant and commonly correlated with degrees of economic activities, some researchers began to regard them as a data-rich intermediate between economic indicators and daytime satellite imagery. Xie et al. [17] proposed a two-step transfer learning framework in which a fully convolutional CNN model pre-trained on ImageNet [38,39] was tuned to predict nightlight intensities from daytime images and learn poverty-related features simultaneously. They found that the model learned to identify semantically meaningful features such as urban areas, roads, and farmlands from daytime images without direct supervision of poverty indices but with only nighttime lights as a proxy. Jean et al. [18] refined this method by feeding the features learned from raw daytime satellite imagery by the tuned CNN into ridge regression models to estimate average household wealth in five low-income African countries. Their research further demonstrated that nightlight data could well serve as an intermediate between daytime satellite imagery and socioeconomic indicators relying on deep learning techniques [40]. Follow-up studies showed that the fully convolutional network, which was tuned to extract high-dimensional features from daytime images under the supervision of corresponding nightlight intensities, could be substituted for various architectures [40], including DenseNet [41] and ResNet [42], and this approach also generalized well to predict poverty-related indices in other countries outside Africa [43]. Nighttime lights also proved useful when there was a lack of ground-truth socioeconomic data, guiding the CNN-based model to compute economic scores from daytime imagery in an unsupervised way [18]. Instead of utilizing luminous data as approximate labels to train neural networks, Yeh et al. [44] trained identical ResNet18 [42] architectures on daytime and nighttime images, respectively, and then fed the concatenated output features into a ridge regressor to predict cluster-level asset wealth. Although this approach could predict economic indicators from remote sensing data in an end-to-end manner, it required much more effort in processing and matching daytime and nighttime imagery, and would be unable to generate valid estimates when the ground-truth socioeconomic data are insufficient for the CNNs to converge.

#### **3. Data**

This paper utilizes the following three data sources: daytime satellite imagery, nighttime light maps, and county-level GDP around the Chinese Mainland along with the corresponding administrative boundaries. All the data sources mentioned above are joined together to construct a complete dataset. The brief procedures of collecting and matching data are shown in Figure 1.

**Figure 1.** The brief procedures of collecting and matching data. There are three steps. (1) Determine the interval between adjacent image centers (2 km in this paper) and calculate all the center coordinates across the Chinese Mainland. (2) Download a daytime satellite image covering an area of 1 km<sup>2</sup> centered on each coordinate. (3) Select an area of 2.5 km<sup>2</sup> centered on each coordinate and sum up the nighttime light intensities. The sum of nightlight intensities is then classified into three levels, referred to as the nightlight intensity level in this paper. The nightlight intensity level serves as the label for the daytime satellite image centered on the same coordinate.

#### *3.1. Daytime Satellite Imagery*

We download daytime satellite images mainly through an API provided by Planet. Given a request consisting of a location and a date in month-year format, the API returns the corresponding image of 256 × 256 pixels. Specifically, the product we utilize is the PlanetScope Ortho Scene product (PSScene3Band), in which the distortions caused by terrain have been removed.

Each daytime image covers approximately 1 km<sup>2</sup> with a 5 m resolution, which generally enables human activities to be observed. The natural idea is to traverse the Chinese Mainland at a 1 km interval so that all the images together cover the whole territory. However, such a procedure would result in a very large number of images, leading to a high time cost in downloading images and computational overhead in training models. As a compromise, we set an interval of 2 km. In this way, the total number of images is reduced by a factor of 4, which greatly speeds up the whole framework. We collect daytime satellite images from 2016 to 2020 according to the grid coordinates. Since Planet updates its image products monthly, the images we download are at monthly granularity and taken from the middle of each year, mostly June and July. Several instances of daytime satellite imagery are shown in Figure 2.
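The effect of the sampling interval on the number of images can be checked with a minimal sketch. It assumes a hypothetical 100 km × 100 km rectangular region measured in kilometres rather than the actual territory, and the function name `grid_centers` is illustrative only.

```python
def grid_centers(width_km, height_km, interval_km):
    """Return (x, y) offsets in km of image centres on a regular grid."""
    xs = range(0, width_km, interval_km)
    ys = range(0, height_km, interval_km)
    return [(x, y) for x in xs for y in ys]

# Hypothetical square region, used only to illustrate the scaling.
dense = grid_centers(100, 100, 1)    # one centre every 1 km
sparse = grid_centers(100, 100, 2)   # one centre every 2 km

ratio = len(dense) / len(sparse)
print(len(dense), len(sparse), ratio)
```

Doubling the interval halves the number of centres along each axis, so the image count drops by a factor of four, at the cost of gaps between neighbouring 1 km<sup>2</sup> tiles.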

#### *3.2. Nighttime Light Maps*

As the pioneer of the nocturnal remote sensing technology, the Earth Observation Group (EOG) has been collecting nighttime remote sensing data for years, producing high-quality global nighttime light maps. We utilize the newest V1 annual composites made with the "vcm" version of the year 2016, which covers the Asian area. In this version, the influence of stray light has been excluded. Meanwhile, ephemeral lights and backgrounds (non-lights) are screened out to ensure the ground truth.

**Figure 2.** Instances of daytime satellite imagery with different corresponding nightlight intensity levels in 2016. From top to bottom: images with low-level, medium-level, and high-level nighttime light intensity.

The nighttime light map is then applied to construct labels for daytime satellite imagery. For each daytime satellite image in 2016, we delineate a 2.5 km<sup>2</sup> area centered on the same coordinate and sum up nightlight intensities within the area. The areas we select are slightly larger than daytime images and thus can roughly cover the gaps among those images. We regard the sum of nightlight intensities within each area as a proxy of the economic activity degree for the corresponding daytime satellite image. In addition, we apply the Gaussian mixture model (GMM) to cluster the nighttime light intensities. The Gaussian mixture model is a probabilistic model for representing normally distributed subpopulations within an overall population [45]. It is a popular clustering algorithm considered as an improvement over k-means clustering. With the GMM clustering method, we divide the nighttime light intensities into three levels: low, medium, and high. Since the proportion of low-level samples is too high, we drop a few samples with low nightlight intensity levels to maintain the data balance. The final distribution of nightlight intensity levels is shown in Table 1.
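The labelling step can be illustrated with a small sketch that fits a three-component, one-dimensional Gaussian mixture by expectation-maximization and ranks the components by their means. This is not the authors' implementation: the synthetic `sums` array, the quantile-based initialization, and all function names are assumptions made for illustration, and a library implementation of GMM could be used instead.

```python
import numpy as np

def responsibilities(x, pi, mu, var):
    """Posterior probability of each component for each sample."""
    log_p = (np.log(pi) - 0.5 * np.log(2 * np.pi * var)
             - 0.5 * (x[:, None] - mu) ** 2 / var)
    log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def fit_gmm_1d(x, k=3, n_iter=100):
    """Fit a k-component 1-D Gaussian mixture with plain EM."""
    # Start the means at evenly spaced quantiles of the data.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    var = np.full(k, x.var() / k**2 + 1e-6)
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        resp = responsibilities(x, pi, mu, var)          # E-step
        nk = resp.sum(axis=0) + 1e-12                    # M-step
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        pi = nk / nk.sum()
    return pi, mu, var

def intensity_levels(x, pi, mu, var):
    """Label samples 0 (low), 1 (medium), 2 (high) by component mean."""
    component = responsibilities(x, pi, mu, var).argmax(axis=1)
    rank = np.empty(len(mu), dtype=int)
    rank[np.argsort(mu)] = np.arange(len(mu))
    return rank[component]

# Synthetic nightlight sums (illustrative values, not the paper's data).
rng = np.random.default_rng(0)
sums = np.concatenate([rng.normal(2.0, 0.5, 200),
                       rng.normal(10.0, 1.0, 200),
                       rng.normal(30.0, 2.0, 200)])
pi, mu, var = fit_gmm_1d(sums)
levels = intensity_levels(sums, pi, mu, var)
print(np.bincount(levels))
```

Unlike k-means, each component carries its own variance and weight, so a tight low-intensity cluster and a broad high-intensity cluster can be separated at a boundary that is not simply the midpoint between their means.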

**Table 1.** Distribution of nightlight intensity levels.


#### *3.3. County-Level GDP and County Boundaries*

By default, the word "county" in this paper denotes the second sub-national administrative unit in China. County-level units can be mainly divided into three types: municipal districts, counties, county-level cities. Some county-level units, municipal districts for instance, have merged to form larger administrative regions named cities or prefectures, while others are governed directly by the first sub-national units in China, i.e., provinces.

The complex administrative hierarchy makes it difficult to collect annual county-level GDP around the Chinese Mainland from a single publication. This paper resorts to the China Economic and Social Development Statistics Database provided by China National Knowledge Infrastructure (CNKI), where over 28,000 statistical yearbooks concerning different themes released by official statistical institutions at different levels are available. Most annual county-level GDP data can be fetched from the corresponding Provincial Statistical Yearbooks, while a few are supplemented by the Municipal Statistical Yearbooks and the data retrieval function supported by CNKI. In this paper, the annual county-level GDP is measured in ten thousand Chinese Yuan.

The geographic boundary information of county-level units around China is gathered from the National Catalogue Service for Geographic Information (https://www.webmap.cn, accessed on 14 October 2021). We collected the boundary coordinates of 2900 county-level administrative units along with county names and the names of the cities or provinces these counties are governed by. GDP values are attached to geographic information via names of counties as well as names of the superior administrative units.

In the data matching process, we utilized the geofencing algorithm supported by the Python package geopandas [46] to compare image coordinates and county boundaries. Specifically, an image along with its nightlight intensity level will be matched with a county once its center falls into the target county-level administrative unit. Figure 3 shows this matching process.
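The matching rule can be sketched in plain Python. The example below deliberately swaps in a hand-written ray-casting point-in-polygon test instead of geopandas, so it does not reflect the geopandas API; the county names, boundaries, and image identifiers are hypothetical toy values.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices in order, without repeating
    the first vertex at the end.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def match_images_to_counties(centers, counties):
    """Map each image id to the first county containing its centre."""
    matches = {}
    for image_id, (x, y) in centers.items():
        for name, boundary in counties.items():
            if point_in_polygon(x, y, boundary):
                matches[image_id] = name
                break
    return matches

# Toy boundaries and centres (hypothetical coordinates).
counties = {
    "county_a": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "county_b": [(4, 0), (8, 0), (8, 4), (4, 4)],
}
centers = {"img_1": (1, 1), "img_2": (5, 3), "img_3": (9, 9)}
matches = match_images_to_counties(centers, counties)
print(matches)
```

Images whose centres fall outside every boundary, such as `img_3` here, remain unmatched. With several thousand county polygons and a dense grid of centres, a spatial index (as used by GIS libraries) would be preferable to this nested loop.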

**Figure 3.** The GDP distribution map of the Chinese Mainland in 2018 (some values along with the boundaries of county-level units are missing), and an example of matching center coordinates and county boundaries. Blue crosses denote center coordinates that fall into the boundary of Liping County, while red points denote centers that do not.

#### **4. Method**

Our target is to predict the annual county-level GDP from daytime satellite imagery and nightlight intensities. Since county-level units in China vary in shape and size, the number of satellite images and nightlight intensity levels belonging to each unit also varies. Therefore, it is necessary to unify the input dimensions before estimating GDP values. In our model, we first build an attention-augmented feature extractor under the supervision of paired daytime satellite images and nightlight intensity levels. Given a county *i* in the whole county set *C* along with the corresponding daytime satellite image set *P<sub>i</sub>* that contains *n<sub>i</sub>* images, each image, such as the *j*-th image *P<sub>i</sub><sup>j</sup>* in *P<sub>i</sub>*, is passed through the trained feature extractor to obtain the economy-related features *F<sub>j</sub>* ∈ ℝ<sup>*n*</sup>, where *n* = 4096 is the length of the output vector of the feature extractor. After dimension reduction, representative statistical characteristics of each county's reduced features, including the mean, variance, correlation, and number of images, are calculated and combined into a fixed-size representation *R<sub>i</sub>* ∈ ℝ<sup>*s*</sup>, where *s* is the number of variables finally used. Finally, the representation is fed into a regression model to predict the GDP value of each county.

#### *4.1. Training Feature Extractor via Supervised Learning and Transfer Learning*

The attention-based VGG-16 network architecture is utilized to extract features from satellite imagery. The VGG-16 [47] pre-trained on ImageNet [38] contains five convolutional blocks, and each block consists of a series of convolution layers, pooling layers, and non-linear activation functions. The convolutional blocks are trained to extract and construct complex features from raw input daytime images. The last two layers of the network are fully connected layers trained to classify inputs into 1000 predefined categories based on features extracted from the preceding structure. This paper classifies nightlight intensities into three categories, i.e., nightlight intensity levels, and applies them to supervise the training of the extractor. In the last two convolutional blocks of VGG-16, we insert an attention layer that can re-weight the activation representations. Suppose a convolutional block of VGG outputs activation features *M<sub>pre</sub>* ∈ ℝ<sup>*H*,*W*,*C*</sup>, defined as the pre-attention activation; the attention layer *A* ∈ ℝ<sup>*C*</sup> matches it along the channel dimension *C*. The post-attention activation *M<sub>post</sub>* ∈ ℝ<sup>*H*,*W*,*C*</sup> is then calculated as the Hadamard product between *M<sub>pre</sub>* and *A*, with *A* applied identically at every spatial position:

$$M\_{post}^{i,j,c} = A\_c \times M\_{pre}^{i,j,c} \tag{1}$$

where *i* = 1, ..., *H*, *j* = 1, ..., *W*, *c* = 1, ..., *C*.
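Equation (1) amounts to a channel-wise rescaling, which a NumPy broadcast expresses directly. The sketch below uses random activations and hand-picked attention weights purely for illustration; in the network the weights would be learned.

```python
import numpy as np

def apply_channel_attention(m_pre, a):
    """Scale every channel c of an (H, W, C) activation by a[c]."""
    assert m_pre.shape[-1] == a.shape[0]
    return m_pre * a   # a is broadcast over the H and W dimensions

H, W, C = 4, 4, 3
rng = np.random.default_rng(0)
m_pre = rng.random((H, W, C))      # pre-attention activation
a = np.array([0.0, 0.5, 2.0])      # hypothetical attention weights

m_post = apply_channel_attention(m_pre, a)
print(m_post.shape)
```

A weight of zero suppresses a channel entirely, while a weight above one amplifies it; the shape of the activation is unchanged, so the result can be passed to the next block as is.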

The post-attention activation modulated by the attention layer maintains the same shape as the pre-attention activation, and they are then passed into the next block as Figure 4 shows. We use the Adam optimizer to train the network. The loss function is defined as follows:

$$L = -\sum\_{i \in label} \tilde{y}\_i \log \hat{y}\_i \tag{2}$$

where *ỹ<sub>i</sub>* denotes the ground-truth class probability (i.e., low level, medium level, and high level) and *ŷ<sub>i</sub>* denotes the predicted probability. When the model converges, we remove the last fully connected layer and utilize the remaining structure to extract features *F* ∈ ℝ<sup>*n*</sup> from satellite imagery. *n*, the length of *F*, is equal to the length of the output activation flattened by the last convolutional block in VGG-16.
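For a single image, the loss in Equation (2) is the usual categorical cross-entropy over the three nightlight intensity levels. A minimal numerical sketch follows; the probability values are made up, and the clipping constant `eps` is an implementation detail added here to avoid taking the logarithm of zero.

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Categorical cross-entropy between a target and a predicted distribution."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(-np.sum(y_true * np.log(y_pred)))

y_true = np.array([0.0, 1.0, 0.0])   # one-hot target: medium level
y_pred = np.array([0.2, 0.7, 0.1])   # hypothetical predicted probabilities

loss = cross_entropy(y_true, y_pred)
print(round(loss, 4))
```

With a one-hot target, only the predicted probability of the true class contributes, so the loss reduces to the negative logarithm of that probability.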

#### *4.2. Dimension Reduction*

Once features have been extracted from each satellite image, we reduce the dimension of *F*. Since about 2000 counties are used per year in this paper, and we aim to utilize the statistical characteristics of a county's image set to fit GDP values, the dimension should be smaller than the number of counties to avoid overfitting. Therefore, we apply principal component analysis (PCA) [48] to reduce the dimension of the feature *F*.

PCA is nonparametric and does not require a parameter tuning process. It applies orthogonal linear transformations to the original vectors to extract the principal components with the maximum variance. A sufficient number of principal components should explain most of the variance of the original data while efficiently reducing the dimension. Empirically, the first six components explain approximately 80 percent of the variance, and additional gains rapidly become marginal. This paper considers up to the first 25 principal components in the dimension reduction process, i.e., k (3 ≤ k ≤ 25).
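The reduction step can be sketched with a singular value decomposition. The feature matrix below is synthetic (200 samples, 50 dimensions, six underlying factors), so its explained-variance figures do not correspond to those reported for the real VGG features; the function names are illustrative.

```python
import numpy as np

def fit_pca(features, k):
    """Return the feature mean, the top-k principal axes, and the
    explained-variance ratio of every component."""
    mean = features.mean(axis=0)
    _, s, vt = np.linalg.svd(features - mean, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return mean, vt[:k], explained

def apply_pca(features, mean, components):
    """Project centred features onto the principal axes."""
    return (features - mean) @ components.T

# Synthetic stand-in for extracted image features.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 6))
mixing = rng.normal(size=(6, 50))
features = latent @ mixing + 0.1 * rng.normal(size=(200, 50))

k = 6
mean, components, explained = fit_pca(features, k)
reduced = apply_pca(features, mean, components)
print(reduced.shape, round(float(explained[:k].sum()), 3))
```

The cumulative sum of `explained` is what one would inspect to choose k: once it flattens, further components add little. The mean and axes fitted on the training year should be reused unchanged when projecting features from the test year.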

#### *4.3. Statistical Characteristics*

To address the varying number of daytime images along with nightlight intensity levels belonging to a county, we calculate the statistical characteristics of each image set. In this way, each county has a fixed-size representation. Following the approach of [18], we consider the following base statistical characteristics: (1) the sample size *n*, i.e., the number of satellite images within a county; (2) the sample mean *μ*; (3) the standard deviation *σ*; and (4) Pearson's correlation of the reduced features *ρ*. These four statistical characteristics are fundamental statistics that can capture the vital traits of an image set. Concretely speaking, these descriptive statistical characteristics represent a sample set through central tendency (the sample mean), dispersion (the standard deviation), association (the correlation), and volume (the sample size). To enrich the independent variables, we apply a feature interaction process in which interactions and polynomial combinations of features are added, so that an augmented search space can be considered. Finally, for each county *i*, we obtain a representation *R<sub>i</sub>* of the same length *s*. The representation is later fed into a regressor to estimate the target county-level GDP.
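How a variable number of images collapses into a fixed-length vector can be shown with a short sketch. It assumes that "correlation" means the pairwise Pearson correlations between the k reduced feature dimensions, which is one plausible reading of the text; the function name and the random inputs are illustrative, and the feature interaction step is omitted.

```python
import numpy as np

def county_representation(reduced):
    """Summarize an (n_images, k) array of reduced features as one vector.

    Layout: [n, k means, k standard deviations, k(k-1)/2 correlations].
    Requires at least two images with non-constant features.
    """
    n, k = reduced.shape
    mean = reduced.mean(axis=0)
    std = reduced.std(axis=0)
    corr = np.corrcoef(reduced, rowvar=False)   # (k, k) Pearson matrix
    upper = corr[np.triu_indices(k, 1)]         # unique off-diagonal entries
    return np.concatenate(([n], mean, std, upper))

rng = np.random.default_rng(0)
county_a = rng.normal(size=(40, 3))   # a large county: 40 images
county_b = rng.normal(size=(7, 3))    # a small county: 7 images

rep_a = county_representation(county_a)
rep_b = county_representation(county_b)
print(rep_a.shape, rep_b.shape)
```

Under this layout the representation has 1 + 2k + k(k − 1)/2 entries regardless of how many images a county contains, so counties of any size can be fed to the same regressor.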

**Figure 4.** The structure of our method. Our method operates in three steps. (1) An attention-augmented VGG-16 network pre-trained on ImageNet [38] is tuned to predict nighttime light intensity levels from daytime satellite images. The middle blocks of the network are taken out as the feature extractor after transfer learning (the pre-trained VGG-16 network) and supervised training (nightlight intensity degrees as a proxy of socioeconomic indicators). (2) Reduce the dimensions of output features via PCA. (3) Calculate the embedded spatial statistical characteristics and apply regression models to predict the logarithm of county-level GDP.

#### **5. Experimental Results**

#### *5.1. Performance Evaluation*

In this study, the experiments were conducted in the environment of the Public Computing Cloud, Renmin University of China. We applied several methods to evaluate the model performance. For all methods, we take the data of 2017 as the training set and the data of 2018 as the testing set. K-fold cross-validation is utilized to determine the optimal hyperparameters for PCA and the regression process. The **nightlight** method takes the sum of nightlight intensities of each county as the independent variable and applies it to predict the corresponding county-level GDP directly via a regressor. The **no-proxy** method utilizes the same VGG-16 architecture as the feature extractor in the proposed framework but only pre-trains it on ImageNet. The **VAE** (variational auto-encoder) method uses a VAE as the feature extractor in place of the VGG-based one. A variational auto-encoder [49] is an unsupervised deep learning algorithm that aims to learn a compressed representation of the input data and to reconstruct the input from it, with the size of the hidden layer constrained. **VGG-A** denotes our proposed model, which features the attention-augmented VGG network.

As Table 2 shows, VGG-A (our proposed framework) outperforms all the other methods with an R-squared of 0.71. Satellite imagery does provide abundant additional features for predicting GDP, since using nightlight intensities alone yields an R-squared of only 0.36. Moreover, since VGG-A predicts better than both no-proxy (0.22) and VAE (0.45), we suggest that using nighttime light as a proxy helps extract more economy-related features.
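For reference, the R-squared metric used in this comparison can be computed as below on log-transformed GDP. The GDP values and prediction errors are invented for illustration and are unrelated to the figures in Table 2.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical county GDP values in ten thousand Chinese Yuan.
gdp = np.array([2.0e5, 8.0e5, 3.0e6, 1.2e7])
log_gdp = np.log(gdp)

# Hypothetical predictions of log GDP with small errors.
pred_log = log_gdp + np.array([0.2, -0.1, 0.1, -0.2])

score = r_squared(log_gdp, pred_log)
baseline = r_squared(log_gdp, np.full_like(log_gdp, log_gdp.mean()))
print(round(float(score), 3), round(float(baseline), 3))
```

A model that always predicts the mean of the test targets scores zero, which gives a reference point for reading the reported values: 0.71 means the model removes about 71 percent of the variance in log GDP that this constant baseline leaves unexplained.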


**Table 2.** Results of different methods. The county-level GDP for training and testing is logarithmically transformed since they are approximately log-normal.

Figure 5 presents the prediction error map of county-level GDP of 2018 in China. The depth of the color reflects the magnitude of the prediction error. Red denotes overestimation, while blue denotes underestimation. According to the prediction error map, we find that estimations of larger counties tend to be more accurate than those of smaller counties. A plausible reason is that larger counties usually contain more satellite image instances, so that the corresponding statistical characteristics are more representative. Another finding is that the proposed framework seems to overpredict in poorer areas and underpredict in richer areas. In the error map, the southeastern coastal areas, the most advanced regions in China, are colored blue, while central China, a less advanced region, is colored red.

**Figure 5.** The prediction error map of county-level GDP in 2018. White areas in the map represent regions where data are missing. Due to the large area of the Chinese Mainland, there are a few regions where images are either missing or of poor quality (Hainan Island, for instance). Nevertheless, the number of counties covered by the images we gained is enough for this study.

#### *5.2. Ablation Study*

To verify the effectiveness of the different modules within our method, we conduct a few ablation experiments. **VGG** uses only the VGG-16 network, with the attention layer removed. *μ*, *σ*, *ρ*, and *n* denote methods that remove the sample mean, the standard deviation, Pearson's correlation, and the sample size from the statistical characteristics, respectively.

According to the results shown in Table 3, the insertion of the attention layer effectively improves the R-squared score by 0.02. Meanwhile, any removal of the statistical characteristics decreases the final performance evidently, indicating that each of them makes a meaningful contribution.

**Table 3.** Results of ablation study.


#### *5.3. Comparison of Regression Methods*

We fed the embedded statistics into different regression models to determine the most suitable method in this case. Figure 6 reports the experimental results measured by R-squared. It can be concluded that, in general, random forest and the XGBoost algorithm with a tree booster (gbtree) perform the best, and that the results of gbtree are more stable than those of random forest. The feature interaction process, i.e., the construction of extra features, has little influence on these two methods' performances, probably because random forest and gbtree select only the most essential features to predict dependent variables. Ridge regression achieves satisfactory results when the original embedded statistics are the independent variables, while feature interaction enhances the performance of the XGBoost algorithm with a linear booster (gblinear). Meanwhile, we find that although feature interaction may enable linear models to achieve better results, the performance of these models declines rapidly when the dimension of PCA increases.

**Figure 6.** Results of different regression methods measured by R-squared.

#### *5.4. Gradient Visualization*

To explore what our feature extractor focuses on, we conduct gradient visualization with guided back-propagation as applied in [50]. Guided back-propagation computes the gradient of the target output (the nightlight intensity level in our case) with respect to the input, overriding the gradients of ReLU functions so that only non-negative gradients are back-propagated; it is a widely used method for interpreting convolutional neural networks. In Figure 7, we can observe the highlighted areas of buildings and the contours of roads in the gradient map, which accords with the perception that a more developed county tends to have denser buildings and more advanced transportation systems. It also supports the validity of using nighttime light intensities as a proxy for economic development levels.
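The modified ReLU rule at the heart of guided back-propagation can be shown on a toy vector. The sketch below covers only the backward pass through a single ReLU, not a full network, and the input and gradient values are arbitrary.

```python
import numpy as np

def relu_backward_plain(x, grad_out):
    """Standard ReLU gradient: pass where the forward input was positive."""
    return grad_out * (x > 0)

def relu_backward_guided(x, grad_out):
    """Guided rule: additionally block negative incoming gradients."""
    return grad_out * (x > 0) * (grad_out > 0)

x = np.array([-1.0, 2.0, 3.0, -4.0])          # inputs to the ReLU
grad_out = np.array([0.5, -0.3, 0.8, 0.1])    # gradient from the layer above

plain = relu_backward_plain(x, grad_out)
guided = relu_backward_guided(x, grad_out)
print(plain, guided)
```

In this example the second element survives ordinary back-propagation with a negative gradient but is zeroed by the guided rule, so only the third element, which has both a positive input and a positive incoming gradient, reaches the layer below. Applying this rule at every ReLU is what yields the sharper, edge-like maps typical of this visualization.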

**Figure 7.** Gradient visualization with guided backpropagation of VGG-A. The first row shows the input satellite imagery samples and the second row shows the corresponding visualization results. Larger gradient values result in higher brightness in the results.

#### **6. Conclusions**

This study considered nightlight maps as a proxy of socioeconomic indicators, constructing labels concerning nightlight intensity levels for daytime satellite imagery and training an attention-augmented VGG-16 network as a feature extractor. The fixed-length county-level representation was calculated from each county's statistical characteristics and was later fed into a simple regressor to predict GDP. Our method yielded a satisfactory performance with an R-squared of 0.71.

Our methods are explainable both quantitatively and qualitatively. The model trained on data from 2017 could achieve relatively high scores in predicting county-level GDP values of 2018. On the other hand, the gradient visualization indicated that the CNN-based classifier performed well in detecting visual patterns that were thought to be closely related to economic development degrees, such as roads and buildings.

Experimental studies confirmed the learning abilities of the VGG structures in our framework. Values of R-squared gradually became stable when the dimension of PCA reduction was greater than 15. The feature interaction process did not contribute much to prediction accuracy, indicating that features learned by the VGG structure were powerfully informative.

This paper contributed to modifying methods that applied remote sensing data in the estimation of socioeconomic indicators. Compared with Jean's method [18], our framework is both more applicable to county-level GDP estimation in China and more generalizable. First, Jean's method was point-to-point, while ours was capable of district-to-point estimation. Concretely speaking, there was a one-to-one relationship between ground-truth data (household wealth) and remote sensing data (paired daytime image and nightlight intensity) in Jean's method. Consequently, Jean's method required large volumes of ground-truth socioeconomic data, and the correspondence between socioeconomic data and remote sensing data was strictly constrained. In contrast, thanks to the calculation of representative statistics, our method could handle the case in which a single socioeconomic indicator (or administrative unit) corresponded to a variable number of daytime images along with nightlight intensities. Therefore, our method required relatively less ground-truth data and could be applied to estimate socioeconomic indicators for administrative units of variable sizes. Second, since our method enabled all the daytime images and nighttime light intensities belonging to an administrative unit to be used, the estimations were more robust against noise and missing data. Third, the CNN-based classifier in our method incorporated an attention mechanism. This module improved the model performance from an R-squared of 0.69 to an R-squared of 0.71.

Our method still has several limitations. First, missing images degraded model performance, making it unable to learn broad principles around China. Second, while economic development is a continuous process, we applied models trained on data from the previous year to directly predict the next year's GDP values, leaving the underlying evolution out of consideration. Third, all the images were fed into the same model and, thus, failed to incorporate regional differences.

To the best of our knowledge, there is no previous study that has succeeded in estimating county-level GDP around the Chinese Mainland utilizing daytime and nighttime remote sensing data. This paper filled in this gap, indicating the possibility of convenient socioeconomic data-collecting methods relying on deep learning techniques and remote sensing data. The framework proposed by this paper is still coarse. Nevertheless, it will be fruitful to analyze current experimental results and augment model performance. On the one hand, more accurate estimations are more helpful. On the other hand, research on China, where social development degrees of disparate sub-national regions are measured by the same national economic accounting system, will shed light on the principles of applying remote sensing data in socioeconomic studies.

**Author Contributions:** Conceptualization, H.L.; methodology, H.L. and X.H.; software, H.L. and X.H.; validation, H.L. and X.H.; formal analysis, H.L. and X.H.; investigation, H.L., X.H., Y.B. and Y.W.; resources, Y.B.; data curation, H.L.; writing—original draft preparation, H.L. and X.H.; writing review and editing, H.L., X.H., Y.B., X.L., Y.W. and Y.Z.; visualization, H.L. and X.H.; supervision, Y.B., X.L., H.Y. and Y.Z.; project administration, Y.B.; funding acquisition, Y.B., H.Y., Y.W. and Y.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was partly funded by the Fundamental Research Funds for the Central Universities, and the Research Funds of Renmin University of China (17XNLG09), fund for building world-class universities (disciplines) of Renmin University of China.

**Data Availability Statement:** The data we utilize can be obtained from Planet (www.planet.com, accessed on 11 November 2020), where a Python API key is provided to download the images.

**Acknowledgments:** This work was supported by the Public Computing Cloud, Renmin University of China. We thank Linhao Dong, Ying Hao, Yafeng Wu, Yunhui Xu and Yecheng Tang, students from Renmin University of China; the author gratefully acknowledges the support of the K.C. Wong Education Foundation, Hong Kong.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:


#### **References**


### *Article* **Estimating Local Inequality from Nighttime Lights**

**Nils B. Weidmann 1,2\*,† and Gerlinde Theunissen 2,†**


**Abstract:** Economic inequality at the local level has been shown to be an important predictor of people's political perceptions and preferences. However, research on these questions is hampered by the fact that local inequality is difficult to measure and systematic data collections are rare, in particular in countries of the Global South. We propose a new measure of local inequality derived from nighttime light (NTL) emissions data. Our measure corresponds to the local inequality in per capita nighttime light emissions, using *VIIRS*-derived nighttime light emissions data and spatial population data from *WorldPop*. We validate our estimates using local inequality estimates from the *Demographic and Health Surveys (DHS)* for a sample of African countries. Our results show that nightlight-based inequality estimates correspond well to those derived from survey data, and that the relationship is not due to structural factors such as differences between urban and rural regions. We also present predictive results, where we approximate the (survey-based) level of local inequality with our nighttime light indicator. This illustrates how our approach can be used for new cases where no other data are available.

**Keywords:** economic inequality; nighttime light emissions; VIIRS; spatial measurement

#### **1. Introduction**

In the social sciences, there is an increasing trend to use fine-grained data to capture political and economic mechanisms. Measured at high levels of resolution such as individuals or households, they allow for a precise analysis of local conditions and the social processes that people are embedded in [1]. The availability of fine-grained data is usually very good for developed countries, where researchers can rely on extensive surveys or administrative data. For many countries of the Global South, however, the availability of disaggregated data is usually limited. Oftentimes, these countries are unlikely to be covered by surveys, and administrative data shared for research purposes is sparse or does not exist.

For this reason, social science scholars have increasingly turned to alternative sources of data, such as remote sensing. One prominent example in this strand of research is the use of nighttime lights (NTL) data collected by satellites. First attempts have used NTL emissions at aggregated, lower levels of resolution. For example, earlier work has shown that nighttime light emissions can track economic performance and human development at the level of large geographic units, for example countries or states [2–5]. However, more recent work has tried to increase the resolution of these tests. For example, Weidmann and Schutte [6] show that nighttime light emissions correlate well with ground truth measurements of household wealth, as recorded in surveys. This means that satellite-based NTL data can be used also at high levels of resolution, for example for the estimation of wealth, human development or regional inequality between provinces and sub-national administrative units [7–11].

In this paper, we build on this work and attempt to use NTL data for the estimation of local inequality. In recent years, and in particular following the influential work by Piketty [12], inequality has attracted a lot of interest from the research community. Using aggregated country- or group-level measures of economic inequality, this research has shown for example that inequality can be an important driver of social conflict and political instability [13]. Again, research in this vein has relied on NTL data, but only at aggregated levels to measure inequality between [14–16] or within social groups [17]. However, recent research has also shown that people do not perceive aggregated/systemic levels of inequality. Rather, it is the local context that matters for explaining individuals' behavior. In particular, there are a number of studies showing that local inequality, i.e., inequality within an individual's immediate spatial context, affects citizens' political preferences and behavior [18–24].

**Citation:** Weidmann, N.B.; Theunissen, G. Estimating Local Inequality from Nighttime Lights. *Remote Sens.* **2021**, *13*, 4624. https://doi.org/10.3390/rs13224624

Academic Editor: Nataliya Rybnikova

Received: 28 September 2021 Accepted: 11 November 2021 Published: 17 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

To find out how this local context matters in the Global South, we need fine-grained estimates of local inequality. This is what we present in this article. Our study is not the first to study local inequality with NTL data. Existing work, however, has not used night light emissions to measure local inequality directly; rather, these studies first approximate economic performance or wealth from night lights for small geographic units, and then calculate inequality between them [9,25,26]. Our approach, in contrast, operates directly on the NTL data in combination with a population raster, and is therefore able to produce local inequality estimates for arbitrary locations on the globe and at high levels of resolution.

#### **2. Data and Methods**

In this paper, we present an approach to computing satellite-based estimates of local inequality, which we validate with local inequality estimates derived from large-scale survey data. In the following, we first describe the nighttime light data we use for our indicator, before turning to the survey data used for validation.

Our satellite-based estimates of local inequality rely on the VIIRS nighttime light data [27] (V2). We use the annual composites, where non-stationary light sources and other erroneous influences have been removed by a combination of the different images available for a given year. This methodology is described in Elvidge et al. [27]. The VIIRS nighttime lights product is one of the most recent freely available data products of remote-sensed nighttime light emissions, and it is available for the years 2012–2021. Compared to earlier products such as the frequently used DMSP-OLS nighttime light data [28], it has a number of advantages. Most importantly, VIIRS nighttime light rasters have a higher resolution of 15 arc-seconds, which corresponds to about 500 m at the equator. Furthermore, VIIRS reduces the problem of top-coding: in the DMSP-OLS NTL data, high emissions are all coded at the maximum value of 63, which eliminates a lot of variation at the upper end of the spectrum. Therefore, with VIIRS data, we can exploit considerably more variation within well-lit areas. Not surprisingly, existing research has concluded that VIIRS-derived data should be preferred for work that uses nighttime lights to study socio-economic processes [29,30].

For our approach, we rely on earlier work by Weidmann and Schutte [6], which has analyzed nighttime light emissions as a proxy for economic wealth at high levels of resolution. This work has shown that on average, more intensely illuminated areas are also the richer ones. However, since variation in illumination is to a large extent driven by settlement patterns, more populated areas emit more light at night. In our analysis, we take this into account by using a second spatial data source that maps the global population at a high resolution: the WorldPop dataset, available from https://www.worldpop.org/ (accessed on 30 July 2021) [31]. We use the population counts raster from WorldPop, which provides annual population estimates at the level of cells with a resolution of 30 arc-seconds. These counts are computed in a "top-down" fashion, by disaggregating official population statistics for administrative divisions using spatial covariates as described in Lloyd et al. [32].

For combining the VIIRS NTL data and WorldPop, we aggregate the former to a resolution of 30 arc-seconds. Dividing the nighttime light emissions value by the population living in the same cell, we obtain per capita values of nighttime light emissions at the level of the raster cells. This allows us to compute inequality estimates for any given point on the globe: given a set of longitude/latitude coordinates, we retrieve all cells within a buffer of a certain radius, and simply compute an inequality index, the Gini coefficient, across all of them. For this computation, we need the per capita nighttime light emissions as well as the population counts of each grid cell. In line with results by Weidmann and Schutte [6], we log-transform the nighttime light value before computing the inequality estimates. In our analysis below, we vary the buffer size from 2 km to 20 km, to find out what produces the most accurate estimates of local inequality. Figure 1 (left panel) illustrates the data we use for this procedure. In principle, it is possible with this approach to compute local inequality estimates for any point on the globe. For our validation exercise below, we do this for the spatial locations where the survey was conducted, which allows us to compare survey-based inequality estimates to those calculated from the nighttime lights.
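The buffer-based computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Cell` record, the haversine-based buffer selection, the use of `log1p` for the log transform, the order of operations (log first, then division by population), and the choice to weight each cell by its population count are all assumptions made for the example.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass
class Cell:
    lon: float  # longitude of the cell centre (degrees)
    lat: float  # latitude of the cell centre (degrees)
    ntl: float  # nighttime light emissions aggregated to the cell
    pop: float  # population count in the cell


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two lon/lat points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def weighted_gini(values, weights):
    """Gini coefficient of `values`, each counted with the given weight."""
    total_w = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total_w
    if mean == 0:
        return 0.0
    # Weighted mean absolute difference over all pairs, divided by 2 * mean.
    mad = sum(wi * wj * abs(vi - vj)
              for vi, wi in zip(values, weights)
              for vj, wj in zip(values, weights)) / total_w ** 2
    return mad / (2 * mean)


def local_gini(cells, lon, lat, radius_km):
    """NTL-based local Gini for the buffer of radius_km around (lon, lat)."""
    inside = [c for c in cells
              if c.pop > 0
              and haversine_km(lon, lat, c.lon, c.lat) <= radius_km]
    if not inside:
        return None  # no populated cell falls inside the buffer
    # Assumption: log-transform the emissions, then divide by population.
    per_capita = [math.log1p(c.ntl) / c.pop for c in inside]
    return weighted_gini(per_capita, [c.pop for c in inside])
```

Calling `local_gini(cells, lon, lat, 5.0)` for each survey cluster location would yield one estimate per cluster for a 5 km buffer; repeating the call with 2, 10 and 20 km gives the other buffer sizes considered in the analysis.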

**Figure 1.** Satellite imagery of nighttime light emissions in raster format for the town Kansanshi in Zambia. (**Left panel**): The computation of the local Gini coefficient requires log-transformed nighttime light emissions at the level of cells (in yellow) and population estimates (in white). (**Right panel**): The DHS Wealth Index values of the households in the survey cluster at that location. All values are hypothetical and only displayed for illustration purposes.

For our validation exercise, we require alternative estimates of local inequality. For countries where detailed official income or wealth statistics are available, these estimates can easily be computed (as for example in [33]). However, for many countries in particular in the Global South, these data cannot be used for research purposes, or are simply not collected regularly. This is why we rely on large cross-national survey data from the Demographic and Health Surveys (DHS) project (see https://dhsprogram.com, accessed on 30 July 2021). The DHS is a regular survey on living conditions and health-related data that is conducted across many countries. It uses the same survey instrument in all countries, which contains questions at the individual level but also the household level. Most importantly, the DHS also include an assessment of the household's wealth by means of a wealth index. The wealth index is created from different questions answered by the enumerator (not the respondents) about the household's assets. These answers are collapsed to the most important underlying dimension using factor analysis, and the factor scores are used to assign each household to its corresponding quintile in the distribution of scores in the country [34]. The household's quintile (1–5) is the wealth index for this household. Figure 1 (right panel) gives an example of the DHS data we use for the validation. The entire sample covers 26 countries from DHS survey waves 6, 7 and 8, with data collected in the years 2012–2019. Appendix A lists all the countries and survey waves included in the sample.

To link the survey results to our spatial index of local inequality, we also require geographic information about the location of households in the survey. These coordinates are not provided at the level of households, but at the level of survey *clusters* or primary sampling units (PSUs). In the DHS, a cluster is a group of about 25–30 households in close proximity to each other, which were selected according to the DHS's sampling scheme [35].

The DHS categorize clusters into urban and rural ones. For each cluster, the DHS provide a point (longitude/latitude) location, which, however, is randomly distorted to preserve anonymity in the data. More precisely, an urban cluster's location is randomly shifted within a radius of 2 km, while a rural cluster is assigned a random location within a radius of 5 km of its original location (10 km for a randomly chosen 1% of all rural clusters in a given country and survey wave). Therefore, the spatial reference for the survey cluster is approximate, and we construct the spatial buffers for the computation of our local inequality index such that they contain the original cluster location (with the exception of the randomly chosen 1% of the rural clusters with a spatial error of up to 10 km, which introduces measurement error in our analysis that we cannot prevent).

For our survey-based measure of local inequality, we compute the Gini inequality coefficient over the wealth index values of all households in a cluster. Since the input values have a limited range of 1–5, the upper bound of the Gini coefficients is less than 1 (the usual upper bound of the Gini index). To normalize the resulting coefficient values, we divide them by 0.382. The derivation for this value is presented in Appendix B.
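As an illustration of this normalization, the following sketch computes the Gini coefficient over the wealth quintiles of one cluster and rescales it by the bound of 0.382. The function names are hypothetical, and the plain pairwise-difference formula without a small-sample correction is an assumption; the text does not state which estimator was used.

```python
MAX_GINI_QUINTILES = 0.382  # upper bound for values in the range 1-5 (Appendix B)


def gini(values):
    """Gini coefficient via the mean absolute difference over all pairs."""
    n = len(values)
    mean = sum(values) / n
    if mean == 0:
        return 0.0
    mad = sum(abs(a - b) for a in values for b in values) / n ** 2
    return mad / (2 * mean)


def normalized_cluster_gini(wealth_quintiles):
    """Survey-based local inequality for one cluster, rescaled to 0-1."""
    return gini(wealth_quintiles) / MAX_GINI_QUINTILES
```

A cluster in which every household falls into the same quintile yields 0, while a cluster split between quintiles 1 and 5 in roughly a 2:1 ratio comes close to the maximum of 1.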

#### **3. Results**

In this section, we first present the satellite-based and survey-derived estimates of local inequality separately, before turning to a comparison of the two.

#### *3.1. Estimates of Local Inequality from Nighttime Lights Data*

As stated above, we compute spatial estimates of local inequality for all survey cluster locations in our sample, so that we can later compare them to the survey-derived inequality scores. These computations use NTL data for the same year in which the cluster was included in the survey (see below). In Figure 2, we show the overall distribution of our spatial estimates, computed with a buffer radius of 5 km. The distribution is bimodal, which is an aggregate result of the different distributions of urban and rural clusters: While urban clusters tend to have low values of inequality (most of them located around 0.20), the opposite is true for rural clusters. Here, the majority of the cases has Gini values of 0.5 and above. This could partly reflect more segregated residential patterns in cities, where neighborhoods tend to be inhabited by similarly poor or rich households. This could be different in rural areas, where rich and poor households can be located close to each other, thus resulting in a high level of local inequality.

**Figure 2.** Histogram of the overall distribution of nightlight-based Gini-coefficients, computed with a buffer radius of five kilometers. The light-grey histogram shows the distribution of urban clusters, the distribution of rural clusters is shown in dark-grey.

At the same time, this pattern can also indicate potential limitations of our satellite-based measurement method. In urban areas, a small buffer radius (2 km or 5 km) will include many cells with similar levels of illumination and similar population counts, thus leading to low levels of the NTL-based inequality indicator. A plot of the inequality scores for different buffer sizes (see Figure 3) partly confirms this: as the buffer size increases, cells within the buffers become more diverse as regards their illumination and population values, and inequality scores increase as a result. Our validation exercise later will have to test how buffer size affects the correlation between NTL-based and survey-based inequality scores, and which of them results in the best fit.

**Figure 3.** Histogram of the overall distribution of nightlight-based Gini-coefficients for different buffer sizes.

We also show the distribution of nightlight-based inequality scores separately for each country in Figure 4. The results show that the distribution of NTL-based local inequality values differs by country. Our validation exercise will have to test whether these patterns reflect actual differences in local inequality.

**Figure 4.** Boxplot of the NTL-based Gini-coefficients for a buffer radius of 5 km for individual countries. The lower and upper hinges correspond to the 25th and 75th percentiles, and the centerline indicates the 50th percentile.

#### *3.2. Estimates of Local Inequality from the DHS*

What is the level of local inequality according to the survey data from the DHS? In Figure 5, we plot the overall distribution of the survey-based inequality scores, distinguishing again between urban and rural clusters. Again, we observe a similar distribution as for the NTL-based estimates above, with urban clusters on average exhibiting lower levels of local inequality, while rural clusters have high Gini values. This is somewhat reassuring, since it shows that the patterns we found for the nightlight-based indicator above are not entirely driven by the measurement method.

We again plot the indicator distribution separately for each country (see Figure 6). In contrast to the pronounced differences between countries for the NTL-based indicator, we see considerably less variation across countries here, with most distributions centered in the range 0.25–0.5.

**Figure 6.** Boxplot of the distribution of survey-based Gini coefficients for the individual countries. The number indicates the survey wave.

#### *3.3. Validation*

In this section, we compare the local inequality estimates obtained from the surveys to those computed from the nighttime light data. As explained above, for each survey cluster and the associated level of (survey-based) local inequality, we compute a nightlight-based estimate for the same year in which the survey was conducted. In Figure 7, we show simple scatterplots of the two indicators, as well as a line indicating the linear fit. Overall, the plot shows a positive and significant correlation between the two indicators. In other words, our nightlight-based indicator is able to pick up some of the variation in local inequality we see in the surveys. Still, the large point clouds also indicate that there is considerable error where the two indicators disagree.

**Figure 7.** Scatterplot of NTL-based Gini coefficients (computed with a buffer size of five kilometers) and survey-based Gini coefficients, separately for urban and rural clusters.

To test how buffer size affects the fit between the nightlight-based and the survey-derived indicator, we plot the full distribution of clusters for different buffer sizes in Figure 8. Here, we see that neither small nor large buffer sizes maximize the fit between the two indicators. Rather, a buffer size of 5 km seems to give the best results over the entire sample.

**Figure 8.** Scatterplot of NTL-based and survey-based Gini coefficients, for different buffer sizes.

Can we also observe different patterns for the different countries in our analysis? Following our approach above, we plot the two indicators separately for each country in Figure 9. In all countries except one (Ghana), the correlation between them is positive, which is encouraging. In some countries, we observe high levels of agreement (as for example, Burkina Faso, Uganda or Zambia), while in a few others, our satellite-based measurement method does not seem to work well. In Gabon and Ghana, for example, correlations between the indicators remain low.

**Figure 9.** Scatterplot of nighttime light-based Gini coefficients (with a buffer size of five kilometers) and survey-based Gini coefficients, by country and survey wave.

Our bivariate comparison of survey-based and NTL-based indicators cannot control for other factors that could potentially affect the positive correlation we find between the nightlight-based and the survey-based indicator. For that reason, we run multivariate regression models for each buffer size (2 km, 5 km, 10 km and 20 km), with the survey-derived Gini coefficient as the outcome. Our main predictor is the inequality index computed from the satellite data. We include a number of control variables. First, we include a dummy variable for urban clusters, to remove variation in the outcome that is driven by the difference between urban and rural locations (see the discussion above). We also control for demographic factors such as the average size of the household, as well as the number of households included in the cluster. To make sure that the results are driven by inequality in the nightlight emissions and not the overall level of emissions or the size of the buffer, we also control for the sum of the nighttime light emissions in a buffer, and the total population as well as the number of cells in the buffer. The results of the regression models are shown in Table 1. We provide additional results with country/wave fixed effects in Table 2, to take into account systematic differences between countries and survey waves.

**Table 1.** OLS regression results. Dependent variable: survey-based Gini coefficient. Standard errors clustered by country and survey wave.


Note: \* *p* < 0.1; \*\*\* *p* < 0.01.

The regression results confirm that our NTL-based indicator remains a strong predictor of actual local inequality. We see that in both types of regression and for all four buffer sizes, the coefficient of this variable remains positive and highly significant. This result holds in the presence of several control variables. For example, the "urban" dummy nets out the difference between urban and rural clusters we have seen above, with urban clusters having lower levels of inequality. Furthermore, the effect of the NTL-based indicator remains when we control for the overall level of night light emissions and the total population, which are additional controls that go beyond the simple urban/rural distinction and provide additional support for the impact of our NTL-based indicator. In Appendix C, we provide additional results that limit the sample to clusters with at least 30 households, since we may be concerned that survey-based local inequality may be measured with considerable error if we have fewer observations in a cluster. Furthermore, we repeat the analysis without log-transforming the NTL. The substantive results from our main analysis remain unchanged. In short, these results show that our indicator can capture local inequality well and that the relationship we see is not due to a spurious correlation with other characteristics of the survey clusters and their spatial features.

**Table 2.** OLS regression results with country/wave fixed effects. Dependent variable: survey-based Gini coefficient. Standard errors clustered by country/survey wave.


Note: \* *p* < 0.1; \*\*\* *p* < 0.01.

#### *3.4. Predicting Local Inequality from Nighttime Lights Data*

Our above analyses show that the nightlight-based indicator picks up variation in local inequality, even when we control for a number of factors that could be driving this result. In a final analysis, we move from correlation analysis to prediction. We analyze a situation where a researcher requires estimates of local inequality, and uses simple machine learning models to predict these values based on our NTL indicator with a model fitted on available data from other locations. Specifically, we study two scenarios. In the first one, we use data from a given country to fit a prediction model, and then predict local inequality for a new location. In the second and more difficult scenario, we predict local inequality for a new country with a prediction model fitted on data from other countries. For both scenarios, our aim is to gauge the average prediction error that the researcher would have to incur when relying solely on our NTL indicator.

In both scenarios, we use very simple prediction models. Our first model is an OLS regression model similar to the one we have used above, but with only one predictor: the nightlight-based estimate of local inequality. The second model is a generalized additive model (GAM) using quadratically penalized likelihood, fitted using the gam function from R's mgcv package (see [36]). While more complex machine learning models could be applied, we do not expect significant performance gains due to the simple setup of the prediction exercise with a single predictor only. We evaluate all our models out-of-sample. In the first prediction scenario, this means that we keep a single cluster in a country as a hold-out, fit the model on the remaining clusters from that country, and then predict the level of local inequality for the cluster that was set aside. In Figure 10, we show the distribution of the absolute prediction errors across the 37 surveys in our sample, for satellite-based inequality indicators with different buffer sizes (2 km, 5 km, 10 km and 20 km) and the two different prediction models (LM and GAM). For comparison, we add an additional linear model that only contains a binary predictor for urban vs. rural locations. The tabular presentation of the results is provided in Appendix D.
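The within-country hold-out procedure can be sketched for the linear model as follows. This is an illustrative sketch only: it covers the single-predictor linear model, not the GAM (which the authors fit with R's mgcv), and the function names and plain-list inputs are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0  # no variation in x: fall back to the mean of y
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def loo_mean_abs_error(ntl_gini, survey_gini):
    """Leave-one-cluster-out mean absolute prediction error for one survey."""
    errors = []
    for i in range(len(ntl_gini)):
        # Fit on all clusters except cluster i, then predict cluster i.
        train_x = ntl_gini[:i] + ntl_gini[i + 1:]
        train_y = survey_gini[:i] + survey_gini[i + 1:]
        a, b = fit_line(train_x, train_y)
        errors.append(abs(survey_gini[i] - (a + b * ntl_gini[i])))
    return sum(errors) / len(errors)
```

Running `loo_mean_abs_error` once per survey, with lists of NTL-based and survey-based Gini values for that survey's clusters, would produce one mean absolute error per survey, which is the quantity whose distribution Figure 10 summarizes.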

**Figure 10.** Predicting wealth from nighttime light emissions, within-country. The figure shows the median (black lines), the 25th and 75th percentile (hinges) and the full ranges of the mean absolute prediction errors across the 37 surveys in our sample. Lower values indicate better performance.

The plot shows that prediction of local inequality for new locations using our spatial indicator works well. Using small buffer sizes (2 km), we miss the level of local inequality as given by the survey data only by around 0.11 on average, and 75% of the cases have an error of less than 0.125 (for the GAM). The GAM performs slightly better than the LM, but the differences are small.

In our second prediction scenario, we predict local inequality in a new country that was not used in training the model. We again use leave-one-out cross-validation, where we fit the model on all our data except one country, and then predict the values for that country. In Figure 11, we show again the distribution of absolute prediction errors for this exercise.
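A minimal sketch of this across-country scheme, again for the single-predictor linear model only, is given below. The record layout `(country, ntl_gini, survey_gini)` and the function names are hypothetical choices for illustration.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0  # no variation in x: fall back to the mean of y
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def leave_one_country_out(records):
    """Mean absolute error per country when that country is held out.

    `records` is a list of (country, ntl_gini, survey_gini) tuples.
    """
    by_country = defaultdict(list)
    for country, x, y in records:
        by_country[country].append((x, y))

    result = {}
    for held_out, test_rows in by_country.items():
        # Train on every cluster that does not belong to the held-out country.
        train = [(x, y)
                 for country, rows in by_country.items() if country != held_out
                 for x, y in rows]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        errors = [abs(y - (a + b * x)) for x, y in test_rows]
        result[held_out] = sum(errors) / len(errors)
    return result
```

Because the held-out country contributes nothing to the fit, any country-specific intercept or slope shows up directly as prediction error, which is why this scenario is the harder of the two.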

**Figure 11.** Predicting wealth from nighttime light emissions, across countries. As above, the figure shows the distribution of the mean absolute prediction errors across the 37 surveys in our sample, with lower values indicating better performance.

Figure 11 shows that as expected, prediction errors are higher as compared to the first scenario. This is not surprising, since in the second scenario, the model is not able to capture a possible country-specific relationship between the satellite-based estimates and the survey-based inequality indicator. Still, prediction errors are again of limited magnitude even in the more difficult scenario. However, unlike in the first prediction task, we see that our NTL-based indicator improves predictive performance only marginally as compared to the simple model using only the urban/rural dummy ("LM Urban") in Figure 11. In particular, the 5 km buffers seem to work best. Together, these results show that we can use our NTL-based indicator in a simple machine learning model to obtain local inequality estimates for new locations in a given country, but in particular for cases where we do have some training/calibration data available for the same country.

#### **4. Discussion**

In this article, we have introduced an indicator for local inequality derived from high-resolution night lights data. In addition to the night lights raster data, the computation of this indicator requires only a fine-grained population grid, both of which are freely available. We combine these two data sources to obtain per capita emissions values at the grid cell level, which we use to compute a Gini index of inequality for spatial buffers of a given size. We present two main analyses. In a first validation exercise, we compare the NTL-based indicator to estimates of local inequality derived from survey data. The correlations are positive and significant in almost all countries in our sample, although not surprisingly, the indicator cannot fully capture local inequality as measured by the surveys. This is to be expected: while survey estimates of wealth take into account a variety of household assets, only some of them are related to electricity consumption and are therefore possibly reflected in nightlight emissions. Furthermore, in particular in urban areas, night light emissions are less likely to be attributable to individual households, and rather reflect public infrastructure. This will also reduce the correlation between NTL emissions and individual wealth.

To address the question of whether it is possible to use our indicator for locations where no other data are available, we provide a second type of analysis. Here, we generate estimates of local inequality with simple prediction models, and compare these predicted values to the ones measured with the survey data. This analysis shows that prediction errors are generally low. When we predict Gini coefficients of local inequality with our NTL-based indicator, the best predictions have an average error around 0.05 on the 0–1 scale. This is a good result, given that it is derived exclusively from simple spatial datasets (night light emissions and population rasters). Overall, this shows that our approach can be used to generate new estimates of local inequality for locations for which no other data exist.

While our results show that night light emissions can pick up local inequality to a certain extent, they are necessarily weaker as compared to other approaches combining multiple sources of data. For example, Chi et al. [37] introduce micro-level estimates of wealth that are computed using a variety of input data, including telecommunication coverage maps as well as Facebook connectivity data. This leads to better wealth estimates, which could also be used to estimate local inequality. At the same time, however, the use of proprietary data makes this approach impossible to use for many researchers without access to these data. Furthermore, the coverage of these data may be limited to particular countries, which restricts their applicability to country-specific studies. Our approach, in contrast, uses only publicly available data, is fully replicable using open-source software (PostGIS), and can be used for comparative, cross-national work in the social sciences.

Due to its ability to pick up variation in local inequality and its exclusive reliance on publicly available data, our index enables future research in many different fields. In political science, for example, it helps to better understand how local inequality in an individual's immediate context affects political preferences and behavior. Sociologists can use these data to study the effect of local inequality on residential choice or personal relationships, and development economists can use it to identify areas in need of particular support.

While the results presented in our article are encouraging, there are several drawbacks associated with the NTL-based estimation of inequality. Due to its reliance on variation in night light emissions, this approach can only work in world regions where no saturation has been reached. For example, in most countries of the Global North, nightly illumination of streets is commonplace, which reduces variation in night light emissions and their correlation with socio-economic variables [38]. Consequently, we expect our approach to be less applicable to these countries. Furthermore, there are limitations as regards the temporal variation the indicator is able to pick up. Night light emissions change slowly, which is why our indicator will remain relatively stable even in cases of large population shifts, for example due to refugee movements. When relying on night lights as a proxy for wealth or inequality, researchers should be aware of these limitations and carefully consider whether this data source is suitable for their project.

**Author Contributions:** Conceptualization, N.B.W. and G.T.; methodology, N.B.W.; spatial data preparation, N.B.W.; survey data preparation, G.T.; analysis, G.T.; writing—original draft preparation, N.B.W.; writing—review and editing, G.T.; visualization, G.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the German Research Foundation (DFG) under the Excellence Strategy of the German Federal and State Governments, Excellence Cluster "The Politics of Inequality" (EXC-2035/1–390681379). The APC was covered by the University of Konstanz's Open Access Fund.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Replication data and code are available from https://doi.org/10.7802/2345 (accessed on 20 September 2021). The dataset contains the NTL-based data. All variables based on the DHS could not be shared directly due to the DHS terms of use, but the replication package contains information and code to obtain and process all required DHS datasets.

**Acknowledgments:** We are grateful to special issue editor Nataliya Rybnikova for help and advice.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A. Description of the Sample**

Table A1 lists all countries, survey waves ("phases") and years in our sample, along with the number of PSUs and households.


**Table A1.** List of countries and waves included in the analysis.



#### **Appendix B. Proof: Upper Bound of Gini Coefficient for DHS Wealth Index Values**

The DHS Wealth Index has values in the range 1–5, where 5 corresponds to the richest households. For a group of households, the maximum Gini value can only be achieved if each household belongs either to the lowest (1) or the highest group (5). Assume that we have an income distribution with only two different groups, where a fraction $n$ ($0 < n < 1$) of the population belongs to the group of poor households with wealth index 1, and $1 - n$ belong to the group with wealth index 5. The Lorenz curve (cumulative shares of households along the x-axis, cumulative shares of wealth along the y-axis) is piecewise linear with two linear segments. The first line segment connects $(x_0, y_0) = (0, 0)$ and $(x_1, y_1) = \left(n, \frac{n}{n + 5(1 - n)}\right)$; the second line segment connects $(x_1, y_1)$ and $(x_2, y_2) = (1, 1)$. The Gini coefficient $G$ is defined as twice the area between the equality line and the Lorenz curve, which corresponds to $G = 1 - 2B$ if $B$ is the area below the Lorenz curve. In our case, this means that the Gini coefficient is

$$G = 1 - 2\left[\frac{1}{2}x_1y_1 + (1 - x_1)y_1 + \frac{1}{2}(1 - x_1)(1 - y_1)\right]$$

or, simplified,

$$G = x_1 - y_1.$$

Substituting $x_1$ and $y_1$, we get

$$G(n) = n - \frac{n}{n + 5(1 - n)}.$$

Taking the first derivative, we get

$$\frac{d}{dn}G(n) = \frac{d}{dn}\left[n - \frac{n}{n + 5(1 - n)}\right] = \frac{4(4n^2 - 10n + 5)}{(5 - 4n)^2}.$$

Setting the derivative to zero and taking the root that lies in $(0, 1)$ results in a maximum at $n = \frac{5 - \sqrt{5}}{4}$ and with that a maximum value for the Gini at 0.382. In conclusion, the Gini coefficient cannot be higher than 0.382 for wealth values in the range 1–5.
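The bound can be checked numerically. The short sketch below evaluates $G(n) = n - \frac{n}{n + 5(1 - n)}$ at the stationary point and compares it with a coarse grid search; the function name, the grid resolution, and the hand-simplified closed form $(3 - \sqrt{5})/2$ are additions for illustration and do not appear in the derivation above.

```python
import math


def two_group_gini(n):
    """Gini when a share n of households has wealth 1 and 1 - n has wealth 5."""
    return n - n / (n + 5 * (1 - n))


# Stationary point of G(n) inside (0, 1), from 4n^2 - 10n + 5 = 0.
n_star = (5 - math.sqrt(5)) / 4

# G(n_star) simplified by hand: (3 - sqrt(5)) / 2, which rounds to 0.382.
closed_form = (3 - math.sqrt(5)) / 2

# Independent check: the largest value of G on a fine grid over (0, 1).
grid_best = max(two_group_gini(k / 10000) for k in range(1, 10000))
```

The grid maximum agrees with the closed-form value to well within the three decimals reported, which supports using 0.382 as the normalizing constant.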

#### **Appendix C. Additional Results of the Validation Analysis**

**Table A2.** OLS with country fixed effects, using only clusters with more than 30 households.

| Survey-based inequality index | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Radius | 2 km | 5 km | 10 km | 20 km |
| Intercept | 0.036 (0.080) | 0.052 (0.077) | 0.059 (0.073) | −0.026 (0.078) |
| NTL-based Gini | 0.062 \*\*\* (0.015) | 0.116 \*\*\* (0.015) | 0.151 \*\*\* (0.016) | 0.201 \*\*\* (0.019) |
| Urban | −0.060 \*\*\* (0.008) | −0.074 \*\*\* (0.008) | −0.099 \*\*\* (0.007) | −0.119 \*\*\* (0.007) |
| Household size (mean) | 0.019 \*\*\* (0.003) | 0.014 \*\*\* (0.003) | 0.012 \*\*\* (0.003) | 0.009 \*\*\* (0.002) |
| Number of households | 0.007 \*\*\* (0.002) | 0.008 \*\*\* (0.002) | 0.009 \*\*\* (0.002) | 0.008 \*\*\* (0.002) |
| Total NTL emissions (log) | −0.010 \*\*\* (0.001) | −0.001 \*\*\* (0.0002) | −0.0004 \*\*\* (0.00004) | −0.0001 \*\*\* (0.00001) |
| Total population (log) | −0.009 \*\* (0.003) | −0.009 \*\* (0.004) | −0.004 (0.003) | 0.003 (0.004) |
| Number of cells | 0.009 \*\*\* (0.002) | 0.001 \*\*\* (0.0003) | 0.0002 \*\*\* (0.0001) | 0.0001 \*\*\* (0.00001) |
| Fixed effects (country/wave) | Yes | Yes | Yes | Yes |
| Observations | 2631 | 3206 | 3824 | 4522 |
| R2 | 0.557 | 0.538 | 0.503 | 0.442 |
| Adjusted R2 | 0.553 | 0.534 | 0.500 | 0.439 |
| Residual Std. Error | 0.146 (df = 2604) | 0.152 (df = 3179) | 0.157 (df = 3797) | 0.164 (df = 4495) |

Note: Standard errors in parentheses. \*\* *p* < 0.05; \*\*\* *p* < 0.01.

**Table A3.** OLS with country fixed effects without log-transformed NTL values.


Note: \*\*\* *p* < 0.01.

#### **Appendix D. Results of the Prediction Analysis**


**Table A4.** Within-country prediction results (AE = Absolute Error).

**Table A5.** Across-country prediction results (AE = Absolute Error).


#### **References**


## *Article* **Can Nighttime Satellite Imagery Inform Our Understanding of Education Inequality?**

**Bingxin Qi 1, Xuantong Wang 2,\* and Paul Sutton 3,4**


**Abstract:** Education is a human right, and equal access to education is important for achieving sustainable development. Measuring socioeconomic development, especially the changes to education inequality, can help educators, practitioners, and policymakers with decision- and policy-making. This article presents an approach that combines population distribution, human settlements, and nighttime light (NTL) data to assess and explore development and education inequality trajectories at national levels across multiple time periods using latent growth models (LGMs). Results show that countries and regions with initially low human development levels tend to have higher levels of associated education inequality and uneven distribution of urban population. Additionally, the initial status of human development can be used to explain the linear growth rate of education inequality, but the association between trajectories becomes less significant as time increases.

**Keywords:** education inequality; nighttime light; urbanization; sustainable development; human development

#### **1. Introduction**

Assessing our socioeconomic development in a frequent, rapid, and accurate manner is important for achieving the United Nations' Sustainable Development Goals (SDGs) on various national and global scales [1]. The United Nations' 2030 Agenda for Sustainable Development was developed to transform our world by urging countries to solve current development challenges related to education, poverty, inequality, climate change, etc. [2–5]. Recently, many countries and regional organizations have made significant progress toward the achievement of these goals. Nevertheless, due to the complexity of socioeconomic development, many countries are still suffering from these problems, and some of the actions and policies are not implemented in an effective and efficient way.

To support the 2030 Agenda for Sustainable Development, it is important to monitor and evaluate the current socioeconomic development status to provide scientific evidence for facilitating the policy- and decision-making processes. Measuring socioeconomic development, especially the status of education inequality, in a timely and accurate manner can help educators, practitioners, scientists, and policymakers compare and evaluate a variety of key education indicators. Measuring education inequality, for example, can help us better evaluate the fairness and effectiveness of our education systems and the processes of current educational development [6]. Since education is the foundation of development and growth, measuring socioeconomic data related to education inequality also will help countries achieve many of the SDGs including stable economic growth [7–9], eradication of poverty [10,11], reduction of inequality and exclusion [12,13], and achievement of sustainable development [14] in the long-run.

**Citation:** Qi, B.; Wang, X.; Sutton, P. Can Nighttime Satellite Imagery Inform Our Understanding of Education Inequality? *Remote Sens.* **2021**, *13*, 843. https://doi.org/10.3390/rs13050843

Academic Editor: Nataliya Rybnikova

Received: 26 January 2021 Accepted: 21 February 2021 Published: 24 February 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

This paper presents an approach that combines multi-source data (including population distribution, human settlement, and artificial light data monitored from space) to assess changes in trajectories of human development and education inequality at a national level from 1990 to 2010. This research has utilized nighttime light (NTL) data collected by the Defense Meteorological Satellite Program (DMSP) and human settlement data from the Global Human Settlement Layer (GHSL) to measure human development and evaluate its association with education inequality. Many researchers have demonstrated that NTL data can be used to assess regional inequality and economic development [15–17]. Studies also have shown that NTL is capable of capturing regional uneven development [18–20]. Therefore, we use DMSP NTL data to estimate human development [21] and assess the associations of growth patterns with education inequality.

Education is a human right, and equal access to education is not only crucial for an individual's well-being, but also is essential for eradicating poverty, transforming our society, ensuring long-term prosperity for all, and achieving sustainable development. Many researchers have proposed that ensuring equal access to education can be achieved through distributing education resources more equally [6]. Therefore, it is important to develop indicators that can measure education inequality so we can monitor the changes to education resource allocation status over time. Nevertheless, unlike many socioeconomic indicators (e.g., the Gross Domestic Product) that are developed based on a series of sophisticated accounting and statistical methods, it is difficult to measure education inequality by assigning a monetary value to education accessibility or student achievement and attainment. Some studies have demonstrated the usage of Gini coefficients for measuring education inequality. An Education Gini (EG) index [6], for example, is developed based on education attainment of the concerned population using the following steps:

$$E_L = \left(\frac{1}{\mu}\right) \sum_{i=2}^{n} \sum_{j=1}^{i-1} p_i \left| y_i - y_j \right| p_j \tag{1}$$

where $E_L$ is the education Gini, $\mu$ is the mean years of schooling, $p_i$ and $p_j$ are the percentages of the population with certain levels of schooling, $y_i$ and $y_j$ are the years of schooling at different education attainment levels, and $n$ is the number of levels of the attainment data for the concerned population.
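As an illustration, Equation (1) can be sketched in Python as follows. This is a minimal sketch rather than the implementation used in [6]: the function name `education_gini` and the attainment shares are hypothetical, and the mean years of schooling is assumed to be the share-weighted mean of the years of schooling at each level.

```python
def education_gini(shares, years):
    """Education Gini following Eq. (1).

    shares: population proportions p_i per attainment level (assumed to sum to 1)
    years:  years of schooling y_i per attainment level
    """
    if len(shares) != len(years):
        raise ValueError("shares and years must have the same length")
    # Assumption: mu is the share-weighted mean years of schooling.
    mu = sum(p * y for p, y in zip(shares, years))
    if mu == 0:
        raise ValueError("mean years of schooling is zero; the Gini is undefined")
    total = 0.0
    # Double sum over all pairs (i, j) with j < i, as in Eq. (1).
    for i in range(1, len(shares)):
        for j in range(i):
            total += shares[i] * abs(years[i] - years[j]) * shares[j]
    return total / mu


# Hypothetical attainment distribution (illustrative values, not from the paper):
# no schooling, primary, secondary.
example_shares = [0.2, 0.5, 0.3]
example_years = [0.0, 6.0, 12.0]
print(round(education_gini(example_shares, example_years), 4))
```

When the whole population sits in a single attainment level, the double sum is empty and the index is 0, which corresponds to perfect equality.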

Thomas et al. [6] also have adopted the Lorenz curve to calculate an education Gini based on the cumulative proportion of the population with certain years of schooling, which is similar to the calculation of an income Gini. Generally, although different studies have proposed different approaches to education Gini calculation, an education Gini is mainly derived based on the proportion of the population with various education attainment levels.

Recently, many scientists also have incorporated multi-source data to enhance model performance for evaluating various socioeconomic indicators that are related to human development. There are many difficulties associated with collecting traditional socioeconomic data for measuring human well-being. For example, accurate information about the distribution of the population, settlements, and even wealth is not available for many less developed regions, and sometimes these data are of poor quality [22]. Nevertheless, remote sensing technology and satellite imagery can help us observe, explore, and evaluate the status of human development on the Earth's surface [23]. Hence, geospatial data can be an alternative way for scientists to study and monitor human activities in a timely, consistent, and affordable way. NTL data, for instance, is widely used for estimating and evaluating socioeconomic activities because it captures artificial light at night [24–26]. Based on remotely sensed NTL data, Sutton et al. [27] estimated global marketed and non-marketed economic value from classified satellite images, and Elvidge et al. [28] produced a global poverty map on a subnational scale based on population and DMSP NTL data. Therefore, the subnational data generated from NTLs can greatly help scientists measure human activities on various spatial scales.

Many scientists also have adopted Gini concepts for calculating other socioeconomic indexes based on the Lorenz curve. Elvidge et al. [21], for example, produced the Nighttime Light Development Index (NLDI) based on DMSP NTL data and LandScan population density data to measure human development. The NLDI for each country is calculated based on the Lorenz curve produced from the cumulative proportion of the NTL and the cumulative proportion of the population. Generally, results show that developed countries tend to have low NLDI values and less developed countries have high NLDI values. The results also show that the NLDI has a strong correlation with other indicators like the Human Development Index (HDI), poverty rate, and the proportion of the urban population. Therefore, the NLDI can be an alternative way of measuring human development using NTL data. Song et al. [29] also have used the Spatial Lorenz Curve (SLC) and Gini coefficients to measure land use changes based on an unsupervised land use classification method with cloud-free Landsat Thematic Mapper (TM) images. Similar to the NLDI, the SLC is calculated based on the cumulative proportion of land use and the cumulative proportion of land. These studies show that there is great potential for scientists to utilize geospatial data to monitor the allocation of resources, the distribution of population, and the different levels of development on various spatiotemporal scales. In addition, the availability of geospatial data can help us establish a consistent, objective, and globally applicable method for characterizing and measuring education inequality that is caused by development problems like income inequality, urbanization, and resource allocation.

This research utilizes multi-source data to evaluate human development levels and the uneven distribution of the urban population on various spatiotemporal scales to explore development trajectories and patterns of human development and education inequality. The rest of this paper is organized as follows. Section 2 describes data processing procedures and the development of latent growth models (LGMs) for measuring different development trajectories and patterns. Section 3 presents the results from LGMs to evaluate the growth patterns for each factor included in this study. Section 4 discusses the associations between trajectories. Finally, Section 5 summarizes the results and draws conclusions.

#### **2. Data and Method**

#### *2.1. Gini Coefficients for Human Development and Education*

In this study, we analyze the relationship between an Education Gini (EG), Nighttime Light Development Index (NLDI), and population distribution at a national level in 1990, 2000, and 2010. The NLDI for each country is calculated as a proxy for human development [21]. Moreover, an urban population Gini (UG) index also is constructed based on similar procedures [21,29] to measure the levels of urbanization with Lorenz curves. A higher UG value represents higher levels of rural–urban population distribution inequality which, in turn, indicates that a smaller share of the population is likely to benefit from improved economic activity, better shared infrastructure, and higher standards of living due to urbanization [30–32]. The datasets used in this study are described in Table 1. This study utilizes the Defense Meteorological Satellite Program nighttime light (DMSP NTL) data (Figure 1a) and the Global Human Settlement Layer (GHSL) population data (Figure 1b) to construct an NLDI and UG for countries and regions around the world. Owing to data availability constraints, population data from 2015 (rather than 2010) and DMSP NTL data from 1992 (rather than 1990) are used to calculate these indexes.


**Table 1.** Datasets for calculating the Nighttime Light Development Index (NLDI) and urbanization Gini.

Based on the population distribution, NTL intensity, and human settlements, the Gini coefficients for the NLDI and urbanization are calculated as follows:

$$G = 1 - \sum_{i=0}^{n-1} \left( N_i + N_{i-1} \right) \left( P_i - P_{i-1} \right) \tag{2}$$

where $G$ is the Gini coefficient for the NLDI or urbanization, $N_i$ is the cumulative proportion of the NTL (for calculating NLDI) or the urban population (for calculating an urbanization Gini) in the subnational entities, and $P_i$ is the cumulative proportion of the population in the same subnational entities.
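The trapezoid sum in Equation (2) can be sketched in Python as follows. This is a hedged illustration, not the authors' processing code: the function name `lorenz_gini` is hypothetical, the cumulative proportions are assumed to start from zero, and the subnational entities are assumed to be ordered from the lowest to the highest amount of NTL (or urban population) per person before accumulation, which the text does not state explicitly.

```python
def lorenz_gini(resource, population):
    """Gini coefficient from the trapezoid sum in Eq. (2).

    resource:   NTL sum (for the NLDI) or urban population (for the UG) per unit
    population: total population per subnational unit
    """
    if len(resource) != len(population):
        raise ValueError("resource and population must have the same length")
    # Units without population cannot be ranked per person, so they are skipped.
    units = [(r, p) for r, p in zip(resource, population) if p > 0]
    total_r = sum(r for r, _ in units)
    total_p = sum(p for _, p in units)
    if total_r <= 0 or total_p <= 0:
        raise ValueError("totals must be positive")
    # Assumption: rank units from least to most resource per person.
    units.sort(key=lambda unit: unit[0] / unit[1])
    gini = 1.0
    cum_r_prev = 0.0  # cumulative resource share before the current unit
    for r, p in units:
        cum_r = cum_r_prev + r / total_r
        # (N_i + N_{i-1}) * (P_i - P_{i-1}) for the current unit
        gini -= (cum_r + cum_r_prev) * (p / total_p)
        cum_r_prev = cum_r
    return gini


# Hypothetical provinces (illustrative values only): NTL sums and population.
print(lorenz_gini([0.0, 10.0], [50.0, 50.0]))
```

If every unit has the same amount of light per person, the Lorenz curve coincides with the diagonal and the function returns a value of 0 up to floating-point rounding.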

The NLDI and UG at the national level are constructed using level 0 and 1 administrative units. Level 0 represents national-level administrative boundaries, and level 1 represents state- and provincial-level boundaries. To construct the Lorenz curve for each country based on the cumulative proportion of the NTL and population, this study uses the level 1 subdivisions' administrative boundary layer (state or province) to calculate the sum of the population and NTL within each subdivision. Based on the cumulative percentage of the NTL and population data, this study calculates the NLDI value for each country for the corresponding year. The subnational NLDI at level 1 subdivisions is calculated based on the level 2 subdivisions' data using the same procedures. After matching and filtering the data (i.e., based on the ISO3 country code), a total of 141 countries and regions with data for 1990, 2000, and 2010 are included in this study for trajectory analysis, in which latent growth models (LGMs) [34] are constructed to study the trends of the EG, UG, and NLDI changes (see Appendix A) on a national scale.

#### *2.2. Development of Associative Latent Growth Models (LGMs)*

To better analyze the developmental trajectories of an Education Gini (EG), Nighttime Light Development Index (NLDI), and Urban Population Gini (UG) for each country over time, an unspecified associative latent growth model (LGM) is developed due to its greater capacity to (1) test the efficiency and adequacy of the hypothesized growth structure, especially the non-linear growth curve [35–37]; (2) integrate time-invariant and time-varying covariates [38] so as to estimate their effects on developmental trajectories; and (3) identify growth patterns based on the estimations of individual change, intra-individual differences from individual change, and within-group error [39]. More importantly, the associative LGMs allow researchers to explore interrelations among parameters for individual differences [40–42]. In other words, this model is specified to investigate the synchronous model's correlation coefficients, which are the correlations of trajectories between the factors that are included in this study [38].

It is suggested that the parallel process of LGM analysis methodology can be implemented to test the research hypotheses [43]. First, three separate unconditional (i.e., without covariates) single-factor polynomial LGMs are constructed and evaluated for the NLDI, UG, and EG, respectively. Second, these three single-factor LGMs are examined based on their model fits. Three single-factor LGMs then are combined to construct the unconditional three-factor associative LGM to further explain the associations between the growth parameters of these three major factors. Third, this study evaluates the model fits of the associative LGM and examines the growth trajectories between the NLDI, UG, and EG by interpreting model fit indices and values of growth parameters.

#### *2.3. Latent Growth Model (LGM) Configuration Procedures*

#### 2.3.1. Unconditional Latent Growth Model (LGM) Specification for All Factors

Mplus software, Version 8 [44], is used to specify, configure, and estimate the latent growth models (LGMs). To test and determine the growth shape of the Nighttime Light Development Index (NLDI), a single-factor polynomial LGM with a quadratic growth factor is specified. Since each major factor has been measured 3 times (i.e., 1990, 2000, and 2010), the factor loadings of the latent intercept are all set to 1, and those of the linear latent slope are set to 0, 1, and 2, respectively. Moreover, the factor loadings for the quadratic growth factor are set to 0, 1, and 4 [45]. Additionally, the covariances between the latent intercept, slope, and quadratic factors are set to be freely estimated. To ensure that the model is overidentified with positive degrees of freedom, the error variances and mean structures of the latent factors are set to 0.

Similar to the specification of a single-factor polynomial model with a quadratic growth LGM for the NLDI, the model specifications and constraints for an Urban Population Gini (UG) and Education Gini (EG) are set with identical configurations as the LGM for the NLDI for the purpose of determining the growth shape and model identification.
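The loading pattern described in this subsection can be illustrated with a short Python sketch. It only reproduces the measurement part of the model and is not a substitute for the Mplus estimation, which also estimates the variances and covariances of the growth factors across countries. The helper names `growth_factors` and `implied_values` are hypothetical. With three time points, three growth factors, and residual variances fixed at 0, the three factor scores of a country reproduce its three observed values exactly; the example uses the NLDI values listed for Afghanistan in Table A1.

```python
# Factor loadings: rows are 1990, 2000, 2010;
# columns are intercept, linear slope, quadratic factor.
TIME_SCORES = [0, 1, 2]
LOADINGS = [[1, t, t ** 2] for t in TIME_SCORES]  # [[1, 0, 0], [1, 1, 1], [1, 2, 4]]


def growth_factors(observed):
    """Return (intercept, slope, quadratic) that reproduce three observations exactly.

    Closed-form solution of LOADINGS @ factors = observed for the 3 x 3 case.
    """
    y0, y1, y2 = observed
    intercept = y0
    quadratic = (y2 - 2 * y1 + y0) / 2
    slope = y1 - y0 - quadratic
    return intercept, slope, quadratic


def implied_values(factors):
    """Model-implied values at the three time points for given factor scores."""
    return [sum(l * f for l, f in zip(row, factors)) for row in LOADINGS]


# NLDI for Afghanistan in 1990, 2000, and 2010 (Table A1).
afghanistan_nldi = [0.745, 0.650, 0.481]
factors = growth_factors(afghanistan_nldi)
print(factors)
print(implied_values(factors))
```

In this example both the slope and the quadratic score are negative, so the decline in the NLDI accelerates between the second and third time points.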

#### 2.3.2. Unconditional Three-Factor Associative Latent Growth Model (LGM)

The unconditional associative latent growth model (LGM) is developed by combining three separate single-factor polynomial LGMs to evaluate the associations between the latent growth factors. To ensure that the model is overidentified, the residual variances for the 9 time points are set to 0 (i.e., t1–t9, since there are 3 factors and each factor has 3 time points), and the mean structures for the growth factors also are set to 0. When the three-factor unconditional associative LGM shows an acceptable model fit, further analyses are conducted to interpret the covariances between growth parameters within and across latent factors.

#### 2.3.3. Model Estimation and the Fit Indices

Multiple fit indices are used in evaluating the latent growth models (LGMs), including Chi-square test statistics, a comparative fit index (CFI), a Tucker–Lewis index (TLI), a root mean square error of approximation (RMSEA), and a standardized root mean square residual (SRMR), which are the common fit statistics used for assessing structural equation models [46]. The thresholds used to determine whether a model is acceptable are as follows: (1) RMSEA values ranging from 0.08 to 0.10 indicate a mediocre fit [47]. It also has been argued that RMSEA values alone cannot accurately determine the model fit and that it is reasonable to combine RMSEA values with confidence intervals. Therefore, the *p* value for the test of closeness of fit with a 90% confidence interval should be greater than 0.50 to indicate an acceptable model fit [48]; (2) Hu et al. [49] suggested that values of the CFI and TLI greater than 0.95 can indicate an acceptable model fit; (3) a smaller value of the SRMR indicates a better model fit, and an SRMR value of 0 indicates a perfect model fit [50].
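For readers who want to see how three of these indices relate to the Chi-square statistics, the following Python sketch uses the commonly cited textbook formulas. It is an illustration under stated assumptions and not the output of Mplus: the Chi-square values and degrees of freedom are hypothetical, the function names are made up for this example, and the RMSEA is written with N − 1 in the denominator, whereas some programs use N.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation for sample size n."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to the baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    denom = max(d_model, d_base)
    return 1.0 if denom == 0 else 1.0 - d_model / denom


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index relative to the baseline (independence) model."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


# Hypothetical Chi-square values; n = 141 matches the number of countries
# and regions in the study, everything else is illustrative.
print(round(rmsea(12.0, 6, 141), 3))
print(round(cfi(12.0, 6, 300.0, 9), 3))
print(round(tli(12.0, 6, 300.0, 9), 3))
```

A model whose Chi-square does not exceed its degrees of freedom yields an RMSEA of 0 and a CFI of 1 under these formulas.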

#### 2.3.4. Model Parameter Estimation and Interpretation

Regarding either unconditional or associative latent growth models (LGMs), the variances of intercepts indicate the differences of countries on human development and educational status at the baseline. The variations in latent growth factors (such as the slope and quadratic rates of change) can indicate differences of individual countries in the probability of progressing in a linear or quadratic rate of change over time. Moreover, in the associative model, the direction and magnitude of the covariances among growth factors can indicate the directions and strengths of the relationships between the growth trajectories for human development and education factors.

#### **3. Results**

#### *3.1. Model Configuration Results*

Separate unconditional single-factor polynomial latent growth models (LGMs) are constructed for each factor. As shown in Table 2, the single-factor polynomial LGMs fit the Education Gini (EG) and Urban Population Gini (UG) adequately. For the Nighttime Light Development Index (NLDI), the model does not yield an acceptable fit (root mean square error of approximation (RMSEA) = 0.335), but the single-factor polynomial LGM with a quadratic growth parameter still demonstrates a better fit than those with constant and linear growth. Therefore, all factors are treated as showing quadratic change patterns. In the next step, a three-factor associative LGM is constructed to explore the associations of developmental trajectories between factors, following the model configuration procedures described in Section 2.

**Table 2.** Model fit indices, including the root mean square error of approximation (RMSEA), comparative fit index (CFI), Tucker–Lewis index (TLI), and standardized root mean square residual (SRMR), for latent growth models (LGMs) based on the Education Gini (EG), Urban Population Gini (UG), and Nighttime Light Development Index (NLDI).


\*\*\* *p*-value < 0.001 with two-tailed test.

#### *3.2. Associative Growth Trends*

Based on the results in Table 2, the three-factor associative latent growth model (LGM) yields an acceptable model fit for the dataset used in this study. Therefore, for the rest of Section 3, we use this associative model to investigate the interrelationships between the growth patterns of factors. First, to interpret how each factor is changing over time, statistically significant growth parameter estimates within each factor are presented as follows: (1) for the Nighttime Light Development Index (NLDI), the association between its initial status and linear slope growth is statistically significant (covariance (Cov1) = −0.435, standard error (S.E.) = 0.068, *p* < 0.001), and the association between the linear slope growth and quadratic growth also is statistically significant (Cov2 = −0.869, S.E. = 0.021, *p* < 0.001). In other words, countries with lower NLDI values tend to have a higher linear growth but a lower quadratic growth, whereas countries with higher NLDI values show a lower linear growth but a higher quadratic growth. (2) Regarding the Education Gini (EG), the association between the initial EG status and the linear slope growth is statistically significant (Cov3 = −0.307, S.E. = 0.076, *p* < 0.001), indicating that countries with a greater initial education inequality tend to have a slower linear rate of change. The association between the linear slope growth factor and the quadratic growth factor also is statistically significant (Cov4 = −0.845, S.E. < 0.024, *p* < 0.001), which means that countries with a higher linear rate of change in education inequality tend to have a slower quadratic rate of change. Therefore, the EG exhibits a similar growth pattern to the NLDI: countries with higher EG values at the initial stage demonstrate a slower linear growth but a higher quadratic growth, whereas countries with lower EG values at the initial stage tend to have a higher linear growth but a lower quadratic growth. (3) Regarding the Urban Population Gini (UG), the linear slope and quadratic growth factor covary significantly (Cov5 = −0.978, S.E. = 0.004, *p* < 0.001), indicating that countries with a higher linear growth in population show a slower quadratic growth.

The associative LGM also allows us to explore the statistically significant associations between growth parameters across factors (Table 3): (1) countries with lower NLDI values also have a lower UG (Cov6 = 0.293, S.E. = 0.077, *p* < 0.001); (2) countries with lower NLDI values also have lower EG values (Cov7 = 0.566, S.E. = 0.057, *p* < 0.001); (3) countries with a higher Education Gini tend to demonstrate a slower linear rate of change in the UG (Cov8 = −0.644, S.E. = 0.049, *p* < 0.001). However, as time increases, countries with higher EG values show a higher quadratic growth in the UG (Cov9 = 0.645, S.E. = 0.049, *p* < 0.001).


**Table 3.** Standardized model estimates for Education Gini (EG), Urban Population Gini (UG), and Nighttime Light Development Index (NLDI) based on intercept (INT), slope (SLP), and quadratic (QUA) growth parameters.

*p*-value: two-tailed.

The LGM trajectory analysis results also are reflected in Figure 2 (plotted based on data in Appendix A). Figure 2 shows that both the EG and NLDI experience downward trends from 1990 to 2010, which means that most of the countries included in this study now have less education inequality and higher human development levels. Nevertheless, the urbanization Gini decreases from 1990 to 2000 and then increases from 2000 to 2010. Therefore, the urban population has become more unevenly distributed in recent years. In 1990, the initial statuses of the EG, UG, and NLDI were positively associated. This indicates that the countries with initially lower levels of human development also had a higher education inequality and a more uneven urban population distribution. From 1990 to 2000, all factors experienced decreasing trends, and the EG decreased at a higher rate. From 2000 to 2010, the quadratic change rates of the UG and NLDI showed a less significant change, whereas the quadratic change rate of the EG still demonstrated a decreasing trend.

**Figure 2.** Growth trajectories of the Nighttime Light Development Index (NLDI), Education Gini (EG), and Urban Population Gini (UG) from 1990, 2000, and 2010 for all 141 countries with a 95% confidence interval.

#### **4. Discussion**

Although nighttime light (NTL) is not measuring human activities directly, results from previous studies have shown that NTL is capable of estimating socioeconomic development accurately on different spatial scales [27,28]. Therefore, we calculate the Nighttime Light Development Index (NLDI), Urban Population Gini (UG), and Education Gini (EG) at the country level based on the Defense Meteorological Satellite Program (DMSP) NTL, and the Global Human Settlement Layer (GHSL) population distribution. When analyzing the results from the associative latent growth model (LGM), we are able to identify the different growth trajectory patterns across multiple years, which can further inform us about the associations between development and education inequality.

From 1990 to 2010, we see a significant drop in education inequality. From 1990 to 2000, that drop is accompanied by similar drops in the NLDI (related to human development). However, from 2000 to 2010 the gains in the NLDI have ceased while improvements to educational inequality have continued. This bifurcation raises some interesting questions. Theory suggests that human development will correlate with higher levels of education, which appears to be true from 1990 to 2000 [51]. These trends therefore lead to a series of questions that need to be explored: (1) Is the departure from these correlated trajectories due to exogenous or endogenous forces? (2) Could the departure be related to fundamental resource constraints such as the availability of adequate food, water, and energy? (3) Will improved educational outcomes occurring simultaneously with slowed changes to human development foster increased levels of social unrest?

#### **5. Conclusions**

Here, we analyzed the trajectories of human development, urban population distribution, and education inequality using multi-source data on multiple spatiotemporal scales. Generally, human development levels are increasing and education inequality is decreasing in most of the countries. However, the urban population distribution has become more uneven over time. Different development patterns are identified through latent growth models (LGMs). For example, (1) countries with low initial human development levels tend to have greater associated education inequality; (2) countries with higher initial human development levels tend to show higher linear and lower quadratic rates of change in human development over time; (3) education inequality changes show a stronger association with the trajectories of urban population distributions than with those of human development levels. To be more specific, countries with a greater initial education inequality are associated with a slower linear rate of change in the uneven distribution of the urban population. However, as time increases, the countries with a greater initial education inequality also are associated with a greater quadratic rate of change in the uneven distribution of the urban population; (4) however, the growth patterns of the human development levels and education inequality show less significant associations.

It has been demonstrated that the Defense Meteorological Satellite Program (DMSP) nighttime light (NTL) can support the estimation of socioeconomic data, especially at the country level, as some of the outlier effects are minimized with data aggregation [52]. Nevertheless, owing to its limitations, DMSP NTL may not be able to capture human activities at smaller regional levels (e.g., city or town levels). Therefore, there is a potential for using the Visible Infrared Imaging Radiometer Suite (VIIRS) NTL data for estimating socioeconomic development in the future. VIIRS has outperformed DMSP in many ways, including its better resolution and higher sensitivity for capturing artificial lights [53], and the results derived from VIIRS NTL are more accurate [54,55]. Thus, VIIRS can help us better capture the spatial heterogeneity of economic development on a finer scale (e.g., a provincial level). VIIRS data also can help us better assess and explore disparities in education not only across countries but also between urban and rural areas within countries and regions. With more accurate subnational socioeconomic data, there is a potential for us to develop advanced models (e.g., multi-level models) to capture the within-cluster and between-cluster variations to better analyze education disparities.

Going forward, several important steps can take this research to the next level: (1) using more accurate education Gini data to estimate education inequality, as the current data is developed based on a few indicators and may not reflect the true education inequality on various scales; (2) collecting more historical data, including socioeconomic data and geospatial data, to monitor and forecast education inequality changes and to build LGMs with greater complexity that characterize the commonalities of trajectories; (3) developing suitable statistical models, such as hierarchical linear models, to cluster countries and their subnational entities in terms of their levels of development to better compare intra-group growth patterns; (4) using the VIIRS NTL data for future studies.

**Author Contributions:** Conceptualization, B.Q. and X.W.; methodology, B.Q. and X.W.; investigation, P.S., B.Q., X.W.; writing—review and editing, P.S., B.Q., X.W.; supervision, P.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

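| **Country** | **NLDI 1990** | **NLDI 2000** | **NLDI 2010** | **Urban Population Gini 1990** | **Urban Population Gini 2000** | **Urban Population Gini 2010** |
|---|---|---|---|---|---|---|
| Afghanistan | 0.745 | 0.650 | 0.481 | 0.213 | 0.162 | 0.471 |
| Albania | 0.366 | 0.140 | 0.148 | 0.165 | 0.180 | 0.103 |
| United Arab Emirates | 0.225 | 0.341 | 0.339 | 0.055 | 0.054 | 0.018 |
| Argentina | 0.194 | 0.224 | 0.281 | 0.042 | 0.039 | 0.053 |
| Armenia | 0.242 | 0.240 | 0.333 | 0.143 | 0.154 | 0.084 |
| Australia | 0.101 | 0.145 | 0.143 | 0.036 | 0.031 | 0.013 |
| Austria | 0.233 | 0.241 | 0.248 | 0.231 | 0.225 | 0.008 |
| Burundi | 0.945 | 0.813 | 0.715 | 0.084 | 0.064 | 0.637 |
| Belgium | 0.104 | 0.118 | 0.153 | 0.061 | 0.065 | 0.003 |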

**Table A1.** Nighttime Light Development Index (NLDI) and Urban Population Gini results for 141 countries and regions.


#### **References**


## *Article* **Modeling Spatiotemporal Population Changes by Integrating DMSP-OLS and NPP-VIIRS Nighttime Light Data in Chongqing, China**

**Dan Lu 1,2,3,†, Yahui Wang 1,2,3,†, Qingyuan Yang 1,2,3,\*, Kangchuan Su 1,2,3, Haozhe Zhang 1,2,3 and Yuanqing Li 1,2,3**


**Abstract:** The sustained growth of non-farm wages has led to large-scale migration of the rural population to cities in China, especially in mountainous areas. Studying the spatial and temporal pattern of this migration is important for guiding the spatial optimization of population and the effective supply of public services in mountainous areas. Here, we determined the spatiotemporal evolution of population in the Chongqing municipality of China from 2000 to 2018 by employing multi-period spatial distribution data, including nighttime light (NTL) data from the Defense Meteorological Satellite Program's Operational Linescan System (DMSP-OLS) and the Suomi National Polar-orbiting Partnership Visible Infrared Imaging Radiometer Suite (NPP-VIIRS). There was a power function relationship between the two datasets at the pixel scale, with a mean relative error of NTL integration of 8.19%, 4.78% less than that achieved by a previous study at the provincial scale. The spatial simulations of population distribution achieved a mean relative error of 26.98%, improved the simulation accuracy for mountainous population by nearly 20%, and confirmed the feasibility of this method in Chongqing. During the study period, Chongqing's population increased in the west and decreased in the east, and it increased in low-altitude areas and decreased in medium-high altitude areas. Population agglomeration was common in all districts and counties, and the population density of central urban areas and their surrounding areas significantly increased, while that of non-urban areas such as northeast Chongqing significantly decreased.

**Keywords:** population reorganization; population density; spatiotemporal patterns; DMSP-OLS; NPP-VIIRS; Chongqing

#### **1. Introduction**

Urban-rural migration is a major issue affecting the sustainable development of society, while the spatial distribution of population is a core focus of research in population geography [1]. Driven by economic globalization, developing countries occupy an increasing share of the world economy and the world's economic center continues to move to Asia [2–4]. As the largest developing country, China has experienced an unprecedented growth rate over the past 30 years. The urbanization rate has increased from 26% to 58% and the growth rate is about 2.7 times the world average (World Bank). China's rapidly developing social economy and ongoing urbanization have resulted in the relocation and reorganization of urban and rural populations [5–8], as reflected in the continuous growth of the former and substantial reductions in the latter's labor force. According to the National Bureau of Statistics of China (NBSC), the country's urban population has increased by an average of 21 million per year since 2000. In contrast, the agricultural labor force has decreased by 11 million per year [9] and the rural population has decreased by ~30.2%, from 808 million in 2000 to 564 million in 2018 (NBSC). It is worth noting that population migration from mountainous areas has been particularly significant [9]. Urban-rural migration redistributes production factors such as capital, which affects the ecosystem and the social economy, and the tensions between resources, the environment and population change accordingly [10–12]. The rural population structure has also changed (including age, gender and number), which has changed the land-use pattern and the radius of human activities, thereby affecting the construction and restoration of rural ecological civilization [6,13–15]. Therefore, mapping and estimating the spatial distribution of populations can provide scientific support for developing regionally sustainable development strategies and spatial land-use planning [16,17].

**Citation:** Lu, D.; Wang, Y.; Yang, Q.; Su, K.; Zhang, H.; Li, Y. Modeling Spatiotemporal Population Changes by Integrating DMSP-OLS and NPP-VIIRS Nighttime Light Data in Chongqing, China. *Remote Sens.* **2021**, *13*, 284. https://doi.org/10.3390/rs13020284

Received: 10 December 2020 Accepted: 12 January 2021 Published: 15 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Traditional demographic statistics and analysis mainly rely on population surveys, including censuses and sampling studies. Until now, China has carried out six censuses. Although population surveys are scientific and authoritative [10,18], their data acquisition cycle is long and townships are the smallest survey unit, such that the spatial resolution of the data is insufficient [19]. Therefore, most studies do not use the administrative unit as the research object [13,20,21]. With the rapid development of geographic information systems and remote sensing technology, multi-source remote sensing data have been widely applied in spatial population research, especially land-use and nighttime light (NTL) data [19,22–29]. For example, Yang et al. [24] combined Defense Meteorological Satellite Program's Operational Linescan System (DMSP-OLS) NTL data, enhanced vegetation index data and digital elevation model (DEM) data to simulate the population density of Zhejiang. Hu et al. [25] determined the spatial distribution of population in Sichuan and Chongqing based on NTL data and land-use data. Other studies have shown that the spatial distribution of regional populations can be well described through data processing, multi-source data fusion and model improvement [24,30]. Most research has remained focused on the spatial modeling of population at a single point in time [19,23,29,31,32]; however, few studies have adopted multivariate data to model the spatial distribution of population over a long time series.

Although the DMSP-OLS dataset provides continuous NTL data from 1992 to 2013 [33], its imagery contains problems due to OLS limitations such as discontinuity and oversaturation by bright lights [34,35]. These data were replaced by Suomi National Polar-orbiting Partnership Visible Infrared Imaging Radiometer Suite (NPP-VIIRS) NTL data after 2013, bringing clear upgrades such as improved spatial resolution and reduced saturation [33] as well as on-board calibration [36]. These upgrades also make it harder to obtain consistent long-term NTL data [35,37,38], so the two datasets must be properly integrated before a long-term population spatial distribution dataset can be constructed. There have been several attempts to integrate DMSP and VIIRS NTL data [37–41]. Zhu et al. [39] established the relationship between the two at the provincial level and used it to model China's Gross Domestic Product. Zhao et al. [37] achieved this at the pixel level and established a long-term NTL dataset in Southeast Asia. Previous studies have enhanced the consistency of NTL between DMSP and VIIRS data; however, current methods have limitations that hinder widespread application: the proposed models have regional limitations and may not be suitable for other regions [37,41]; the datasets used are not accessible to the general public [40,41]; and the generated time series has consistent NTL indices only at the administrative level and remains limited at the pixel level [39].

The municipality of Chongqing integrates a metropolis and a large rural area that is mostly mountainous area, which is characterized by intense human activities and a fragile ecological environment. According to the NBSC, ongoing urbanization in Chongqing resulted in the rural population declining from 15.33 million to 10.7 million from 2005 to 2018 (a decrease of 30.2%), exceeding the national average of 24.34%. Meanwhile, Chinese policies targeting poverty alleviation and rural revitalization have benefitted most residents in poor mountainous areas through relocation, resulting in major changes in population distribution. Therefore, exploring the spatiotemporal changes in Chongqing's population via a timely understanding of population distribution data can help guide population migration from mountainous areas, promote the sustainable development of the regional economy and inform ecological restoration in mountainous areas.

This study explored integration methods for the two NTL datasets that are suitable for the study area at the pixel level and constructed a long-term NTL dataset that provides a basis for modeling long-term population spatial distribution data. Then it simulated the spatial distribution of Chongqing's population in 2000, 2005, 2010, 2015 and 2018 by integrating the two NTL datasets and analyzing spatiotemporal changes. Our results can serve as a scientific reference for rationally allocating urban and rural resources, optimizing urban and rural spatial patterns and promoting the high-quality development of the regional economy.

#### **2. Study Area and Data**

#### *2.1. Study Area*

Chongqing is located in the eastern Sichuan Basin, covering 8.24 × 10<sup>4</sup> km<sup>2</sup> from 28°10′–32°13′ N and 105°11′–110°11′ E (Figure 1). Its 26 districts and 12 counties cover a rugged landscape that is 75.33% mountainous. Its location at the intersection of the Silk Road and the Yangtze River Economic Belt allows it to form connections between east and west while driving economic development between north and south, leading to a vital role in China's development strategy underlain by the Belt and Road Initiatives and Yangtze River Economic Belt [42]. Chongqing is one of the important population areas in Western China, having a resident population of 31.02 million in 2018, an increase of 2.53 million compared with 2000 (NBSC); its average population density is about three times the national average. Its location in the upper reaches of the Yangtze River is part of an ecological protective screen within the Yangtze River Economic Belt. Its complex topography and fragile ecological environment enhance tensions between humans and the environment, such that ecological construction and regional development face many challenges.

**Figure 1.** Location map of the study area.

#### *2.2. Data Sources*

DMSP-OLS Version 4 NTL data from 2000 to 2013 were obtained from the Payne Institute for Public Policy, Colorado School of Mines (https://eogdata.mines.edu/dmsp/downloadV4composites.html). These have a spatial resolution of 30 arc-seconds, with data values ranging from 0 to 63, and have been denoised [43]. Monthly VIIRS Cloud Mask (vcm) data from 2013 to 2018 were also obtained from the Payne Institute for Public Policy, Colorado School of Mines (https://eogdata.mines.edu/dmsp/download\_radcal.html), with a spatial resolution of 15 arc-seconds that excludes observations affected by stray light. The data contained additional noise from sources such as auroras, fires, boats, other temporary lights and outliers, probably caused by stable lights from oil or gas fires.

Land-use data (1 km × 1 km) were obtained from the Resource and Environment Science Data Centre of the Chinese Academy of Sciences (http://www.resdc.cn/) with major categories including cultivated land, forest, grassland, water, residential land and unused land. Resident population data at the county level were obtained from the Chongqing Statistical Information Net (http://data.tjj.cq.gov.cn/), while those at the township level in 2015 were derived from the China County Statistical Yearbook 2016. DEM data were obtained from the Geospatial Data Cloud (http://www.gscloud.cn/).

#### *2.3. Data Preprocessing*

Firstly, all data were extracted by administrative boundaries and the DMSP-OLS NTL data were resampled to 1 km grids. Secondly, a stepwise calibration approach at the global scale was used to reduce the temporal inconsistency of the DMSP time series [44]. Thirdly, the monthly VIIRS data from January to December were averaged to generate an annual time series of VIIRS NTL imagery. Fourthly, mask extraction was used to remove noise from the NPP-VIIRS NTL data. The DMSP-OLS NTL data and the annual NTL data provided by the NPP-VIIRS dataset were used as mask data. Masks were selected for each year according to the principle of time adjacency [34,45,46]. Finally, the maximum value of VIIRS NTL data in the main urban area of Chongqing was selected as the effective light intensity threshold and the eight-neighborhood algorithm was used to smooth the VIIRS NTL data [47]. These procedures yielded the corrected NTL data.
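The third and fourth preprocessing steps can be illustrated with a minimal sketch. It assumes each monthly image is a small in-memory grid (a list of rows) and that a mask pixel counts as lit when its value is greater than zero; the function names, the toy grids and that lit-pixel rule are illustrative assumptions, not the authors' implementation. The threshold and eight-neighborhood smoothing steps are omitted.

```python
def annual_mean(monthly_grids):
    """Average a list of monthly 2-D grids (lists of rows) cell by cell."""
    n_months = len(monthly_grids)
    rows, cols = len(monthly_grids[0]), len(monthly_grids[0][0])
    return [
        [sum(g[r][c] for g in monthly_grids) / n_months for c in range(cols)]
        for r in range(rows)
    ]


def apply_light_mask(grid, mask):
    """Keep a pixel only where the mask image is lit (> 0); set it to 0 elsewhere."""
    return [
        [value if lit > 0 else 0.0 for value, lit in zip(grid_row, mask_row)]
        for grid_row, mask_row in zip(grid, mask)
    ]


# Hypothetical 1 x 3 grids for two months and a mask from a temporally adjacent year.
monthly = [[[1.0, 2.0, 0.4]], [[3.0, 4.0, 0.0]]]
mask = [[12, 0, 5]]
annual = annual_mean(monthly)
cleaned = apply_light_mask(annual, mask)
```

In this toy case the middle pixel is bright in the annual mean but dark in the mask, so it is treated as noise and set to zero.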

#### **3. Methods**

We established a relationship model between the two kinds of NTL data (based on the pixel scale), constructed a long time series of stable NTL datasets, then modeled the spatiotemporal dynamics of Chongqing's population from 2000 to 2018.

#### *3.1. Integrating DMSP-OLS and NPP-VIIRS NTL Data*

To match the spatial resolution and radiation characteristics of the two NTL datasets, we first applied two processing steps to the VIIRS data, following Zhao et al. [37]. The first is resampling with a kernel density (KD) method so that the spatial resolution matches that of the DMSP data. The second is a logarithmic transformation. On this basis, we then fit the NTL integration model and convert the VIIRS values.

(1) Spatial Resampling Using a KD Method

Given that the blur of a DMSP NTL image follows a Gaussian point-spread function, the influence of neighborhood NTL brightness should be taken into account when converting the VIIRS spatial resolution. We adopted a quartic kernel function as follows:

$$f(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{X - X_i}{h}\right), \tag{1}$$

where *f(x)* denotes the KD estimate; *n* is the total number of samples; *h* is the window width, set here to five times the VIIRS pixel size; *K* is the kernel function; *X* is the pixel to be corrected; and *Xi* is a neighboring pixel within the window.
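A minimal sketch of quartic-kernel smoothing followed by aggregation to a coarser grid is given below. Several details are assumptions made for illustration: the kernel is the standard one-dimensional quartic (biweight) form, distances are measured in pixels, the weights are normalized so that a uniform image is left unchanged, and the 15 to 30 arc-second aggregation is done with a simple block mean. None of these choices is confirmed by the text above.

```python
import math


def quartic(u):
    """Quartic (biweight) kernel: positive for |u| < 1, zero otherwise."""
    return (15.0 / 16.0) * (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0


def kernel_smooth(grid, h):
    """Kernel-weighted mean of the pixels within distance h (in pixels) of each cell."""
    rows, cols = len(grid), len(grid[0])
    reach = int(math.ceil(h))
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            num = den = 0.0
            for dr in range(-reach, reach + 1):
                for dc in range(-reach, reach + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        w = quartic(math.hypot(dr, dc) / h)
                        num += w * grid[rr][cc]
                        den += w
            out[r][c] = num / den  # den > 0 because the centre pixel always has weight 15/16
    return out


def block_mean(grid, factor):
    """Aggregate a grid by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    return [
        [
            sum(grid[r * factor + i][c * factor + j]
                for i in range(factor) for j in range(factor)) / factor ** 2
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical usage: smooth with h = 5 VIIRS pixels, then halve the resolution.
viirs = [[0.0] * 6 for _ in range(6)]
viirs[2][3] = 40.0
dmsp_like = block_mean(kernel_smooth(viirs, 5.0), 2)
```

The smoothing spreads an isolated bright pixel over its neighbors, which loosely mimics the blur of DMSP imagery before the grids are compared.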

(2) Logarithmic Transformation

Logarithmic transformation of NPP-VIIRS data can better suppress the sharp radiance jump within urban core areas and strengthen the radiance variance within suburban and rural areas [37]. Therefore, we performed a logarithmic transformation for VIIRS data as follows:

$$\mathit{Log\_N}_{i} = \ln(N_{i} + 1), \tag{2}$$

where *Ni* denotes the aggregation result of VIIRS NTLs using the KD method and *Log\_Ni* denotes the corresponding logarithmic transformation result. To avoid invalid values caused by the logarithmic transformation, a constant of 1 was added.

(3) Conversion of the VIIRS NTL Value

Both DMSP and VIIRS products provide NTL data in 2012 and 2013. The monthly VIIRS data in 2013 include all months, while the monthly data in 2012 are only available from April to December. Considering that a slight seasonal difference may exist in annual VIIRS data, 2013 data were used to determine the relationship between the two datasets. We observed a positive correlation between DMSP and processed VIIRS values (Figure 2).

**Figure 2.** Scatter density plots of DMSP and processed VIIRS nighttime lights (NTLs) in 2013.

For further analysis, we developed a linear regression model, a quadratic polynomial regression model and a power function regression model relating the DMSP and processed VIIRS values in 2013, in order to find the best model for integrating NTL data.
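The model comparison can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the authors' procedure: the power model is taken to have the form y = a·x^b + c, the exponent b is found by scanning a fixed grid (0.01 to 3.00) while a and c are solved by least squares for each candidate, and the helper names `solve`, `fit` and `compare_models` are hypothetical.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def fit(features, y):
    """Least-squares fit of y on the feature columns; returns (coefficients, R^2)."""
    k = len(features)
    ata = [[sum(p * q for p, q in zip(features[i], features[j])) for j in range(k)]
           for i in range(k)]
    aty = [sum(p * q for p, q in zip(features[i], y)) for i in range(k)]
    coef = solve(ata, aty)
    pred = [sum(coef[i] * features[i][t] for i in range(k)) for t in range(len(y))]
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - est) ** 2 for obs, est in zip(y, pred))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return coef, 1.0 - ss_res / ss_tot


def compare_models(x, y):
    """Fit linear, quadratic and power (a * x**b + c) models; return coefficients and R^2."""
    ones = [1.0] * len(x)
    results = {
        "linear": fit([x, ones], y),
        "quadratic": fit([[v * v for v in x], x, ones], y),
    }
    best = None
    for step in range(1, 301):  # candidate exponents b = 0.01 ... 3.00 (assumed range)
        b = step / 100.0
        coef, r2 = fit([[v ** b for v in x], ones], y)
        if best is None or r2 > best[1]:
            best = ([coef[0], b, coef[1]], r2)  # stored as [a, b, c]
    results["power"] = best
    return results


# Hypothetical, noise-free data shaped like a power relationship.
x = [0.5 * i for i in range(1, 11)]
y = [4.33 * v ** 1.39 + 4.87 for v in x]
models = compare_models(x, y)
```

The model with the highest R<sup>2</sup> in `models` would then be the one carried forward; on real pixel pairs a dedicated nonlinear least-squares routine would normally replace the grid scan.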

#### *3.2. Modeling the Spatiotemporal Dynamics of Population*

NTL mainly comes from household lighting, roads and urban lightscapes, all of which are closely related to human activities. Moreover, NTL intensity directly reflects the intensity of such activities. Figure 3 shows the relationships between population density and the mean value of NTL at the county level. In Chongqing, NTL intensity grew rapidly with population growth. The quadratic polynomial model had the highest coefficient of determination of all models tested, including the linear model and the power function model.

Different land-use patterns reflect population distribution and human production [48]. Our correlation analysis between population and land-use types at the county level showed that population was positively correlated with cultivated land, water and residential land at the 1% significance level, positively correlated with unused land at the 5% significance level and negatively correlated with forest and grassland at the 5% significance level (Table 1).

**Figure 3.** Relationship between light intensity and population density at the county level in Chongqing.

**Table 1.** Correlation analysis between population and land-use type (by area).


Note: \*\* and \*\*\* are significantly different from zero at the 5% and 1% levels, respectively.

The population spatial distribution pattern was therefore closely related to NTL and land use, so we used the random-effect model to establish the relationship between population, NTL and land use. The resident population in each district and county was selected as the dependent variable and the total value of NTL (NT), the number of bright pixels (NL) and the number of dark pixels (ND) of each land-use type in each district and county were used as independent variables. Because geographical factors also affect population distribution, elevation variables were added to the model as independent variables: the number of pixels with altitudes (NPA) of 0–300 m, 300–500 m, 500–1000 m and >1000 m in each district and county. Prior to empirical simulation, stepwise regression was used to identify the key independent variables with a significance level within 20%. The key independent variables included the NT of cultivated land and forest; NL of residential land; ND of cultivated land and grassland; and the NPA of 0–300 m, 300–500 m, 500–1000 m and >1000 m. The collinearity between the variables was then tested using the variance inflation factor (VIF); the maximum VIF of a single variable was 3.42 and the overall VIF was 2.49, both well below the critical value of 10, indicating no serious collinearity problem between the variables. The empirical model settings were as follows:

$$P_{it} = \alpha_i + \beta_i \mathbf{x}_{it} + \mu_{it} \tag{3}$$

where *Pit* is the resident population of the *i*th county in the *t*th year; *i* = 1,2, ... , 38; *t* represents the known year; *xit* represents the observation value of variables in the *i*th county in the *t*th year; *α<sup>i</sup>* is the individual difference between regions; *β<sup>i</sup>* is a parameter to be estimated; and *μit* is a random error term. Considering the real situation of population distribution, water and unused land were not involved in the model calculation [19].

Next, based on the estimated results of the random-effect model, the resident population of each grid was calculated as follows:

$$P_{ijk} = P_0/N_i + \sum_{j=1}^{m} \left( a_j \times NT_{ijk} + b_j \times NL_{ijk} + c_j \times ND_{ijk} + d_j \times NPA_{ikn} \right) \tag{4}$$

where *Pijk* is the resident population in the *k*th pixel of the *j*th land-use type in the *i*th county; *P*<sub>0</sub> is a constant; *Ni* is the number of pixels in the *i*th county; *aj*, *bj*, *cj* and *dj* are coefficients; *m* is the number of land-use types; *NTijk*, *NLijk* and *NDijk* are the total value of NTL, the number of bright pixels and the number of dark pixels in the *k*th pixel of the *j*th land-use type in the *i*th county, respectively; and *NPAikn* is the number of pixels of the *n*th elevation interval in the *k*th pixel of the *i*th district or county. Negative coefficients for some variables in the simulation equation established by the random-effect model caused the estimated population of some pixels to be negative, which is impossible. Therefore, pixels with a negative estimated value were assigned a value of 0 before obtaining the preliminary estimated population data.

Finally, the statistical data for county population were used to adjust the simulation results as follows:

$$P'_{ijk} = P_{ijk} \times P_i / P'_i \tag{5}$$

where *P′ijk* is the final resident population in the *k*th pixel of the *j*th land-use type in the *i*th county; *Pi* is the statistical data of the resident population in the *i*th county; and *P′i* is the total population by preliminary estimate in the *i*th county.
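The negative-value correction and the county-level adjustment of Equation (5) can be sketched as follows. The function name and the toy numbers are hypothetical, and the handling of a county whose preliminary total is zero is an assumption added only to avoid division by zero.

```python
def adjust_to_county_total(pixel_estimates, county_total):
    """Clip negative pixel estimates to 0, then rescale so the pixels sum to the county statistic."""
    clipped = [max(p, 0.0) for p in pixel_estimates]
    prelim_total = sum(clipped)  # preliminary county total, P'_i in Equation (5)
    if prelim_total == 0:
        return clipped  # assumption: leave an all-zero county unchanged
    factor = county_total / prelim_total  # P_i / P'_i
    return [p * factor for p in clipped]


# Hypothetical county with three pixels and a statistical population of 80.
adjusted = adjust_to_county_total([-5.0, 10.0, 30.0], 80.0)
```

Here the negative pixel is set to zero, the preliminary total is 40 and every remaining pixel is doubled, so the adjusted pixels sum to the county statistic.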

#### *3.3. Evaluation of Model Accuracy*

Based on the population census data at the township level, the correlation coefficient (R), mean absolute error (MAE), mean relative error (MRE) and root mean square error (RMSE) were selected to evaluate accuracy as follows:

$$R = \frac{\sum_{i=1}^{n} \left(P_i - \overline{P}\right) \left(PE_i - \overline{PE}\right)}{\sqrt{\sum_{i=1}^{n} \left(P_i - \overline{P}\right)^2} \sqrt{\sum_{i=1}^{n} \left(PE_i - \overline{PE}\right)^2}} \tag{6}$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} |PE_i - P_i| \tag{7}$$

$$MRE = \frac{1}{n} \sum_{i=1}^{n} \frac{|PE_i - P_i|}{P_i} \tag{8}$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} \left(PE_i - P_i\right)^2}{n}}, \tag{9}$$

where *Pi* is the statistical resident population in the *i*th township provided by census data, *PEi* is the estimated resident population in the *i*th township, $\overline{P}$ is the average of the statistical population, $\overline{PE}$ is the average of the estimated population and *n* is the number of townships evaluated.
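Equations (6)–(9) translate directly into code. The sketch below assumes two equal-length lists of township populations (statistical and estimated) with no zero entries in the statistical list; the function name and sample values are hypothetical.

```python
import math


def accuracy_metrics(observed, estimated):
    """Return R, MAE, MRE and RMSE for paired statistical and estimated populations."""
    n = len(observed)
    mean_p = sum(observed) / n
    mean_pe = sum(estimated) / n
    cov = sum((p - mean_p) * (e - mean_pe) for p, e in zip(observed, estimated))
    var_p = sum((p - mean_p) ** 2 for p in observed)
    var_pe = sum((e - mean_pe) ** 2 for e in estimated)
    return {
        "R": cov / (math.sqrt(var_p) * math.sqrt(var_pe)),                             # Eq. (6)
        "MAE": sum(abs(e - p) for p, e in zip(observed, estimated)) / n,               # Eq. (7)
        "MRE": sum(abs(e - p) / p for p, e in zip(observed, estimated)) / n,           # Eq. (8)
        "RMSE": math.sqrt(sum((e - p) ** 2 for p, e in zip(observed, estimated)) / n), # Eq. (9)
    }


# Hypothetical populations for three townships.
metrics = accuracy_metrics([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
```

MRE is returned as a fraction, so it must be multiplied by 100 to be read as a percentage.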

#### **4. Results**

#### *4.1. Integration Model*

#### 4.1.1. Integration Model

The power function model had the highest coefficient of determination (R<sup>2</sup> = 0.907) of the three models tested (Figure 4). Therefore, the relationship established by the power function model was used to simulate DMSP data from 2014 to 2018. The NTL data were integrated as follows:

$$TNL_{n} = \begin{cases} TNL_{n}^{a}, & n \le 2013 \\ 4.33 \times \left(\mathit{Log\_N}_{n}\right)^{1.39} + 4.87, & n > 2013 \end{cases} \tag{10}$$

where *TNL<sup>a</sup><sub>n</sub>* is the NTL radiance value for the DMSP-OLS data in the *n*th year; *Log\_Nn* is the processed VIIRS radiance value in the *n*th year; and *TNLn* is the value for the NTL integration data in the *n*th year.
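Applied to a single pixel, Equation (10) can be sketched as below. The sketch assumes that the DMSP value is used directly for years up to and including 2013 and that the VIIRS input has already been aggregated with the KD method, so that only the logarithmic transformation of Equation (2) remains; the function and parameter names are hypothetical.

```python
import math


def integrated_ntl(year, dmsp_value=None, viirs_kd_value=None):
    """Return the integrated NTL value of one pixel for the given year."""
    if year <= 2013:
        return dmsp_value  # DMSP-OLS value used as is
    log_n = math.log(viirs_kd_value + 1.0)  # Equation (2)
    return 4.33 * log_n ** 1.39 + 4.87      # power-function conversion, Equation (10)


# Hypothetical pixels: one DMSP year and one VIIRS year.
value_2010 = integrated_ntl(2010, dmsp_value=20.0)
value_2015 = integrated_ntl(2015, viirs_kd_value=math.e - 1.0)
```

A completely dark VIIRS pixel maps to the intercept 4.87 under this formula, so a further dark-pixel rule may be needed in practice; the text above does not specify one.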

**Figure 4.** Correlation between the Defense Meteorological Satellite Program (DMSP) and processed Visible Infrared Imaging Radiometer Suite (VIIRS) values at the pixel scale for the (**a**) linear model, (**b**) quadratic polynomial model and (**c**) power function model.

#### 4.1.2. Accuracy Assessment

We assessed the accuracy of the integrated NTL data by comparing the DMSP-OLS and adjusted NPP-VIIRS data in 2013 (Table 2). The MRE value of the mean NTL generated from the integrated data was 8.19%, while the relative error (RE) values of the mean NTL varied by county, with 39.47% of counties underestimated, 60.53% overestimated and 71.05% having RE values within 10%. The maximum and minimum RE values were 42.46% (Chengkou) and 0.2% (Yubei).

**Table 2.** Accuracy assessment of the NTL integrated data by county.


A previous study exploring the relationship between the two kinds of NTL data at the provincial level produced an MRE of 12.97% [39]. In comparison, our methods clearly improved the matching accuracy, making this approach feasible for integrating NTL data.
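The improvement of 4.78% reported elsewhere in this article appears to be the difference between the two MRE values in percentage points rather than a relative change. Under that reading, the arithmetic is:

$$12.97\% - 8.19\% = 4.78 \text{ percentage points}, \qquad \frac{12.97 - 8.19}{12.97} \approx 36.9\%$$

The second expression gives the same gap as a relative reduction in MRE. It is shown only for comparison and is not a figure reported by the authors.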

#### *4.2. Modeled Spatial Distribution of Population*

#### 4.2.1. Random-Effect Model

The random-effect model results produced an overall F value of 224.42, an R<sup>2</sup> value between groups of 0.71 and an overall p-value of 0.000, indicating that the model was well-established and that the modeling equation was reasonable (Table 3).


**Table 3.** Estimated coefficients for the random-effect model.

Note: (1) \*, \*\* and \*\*\* are significantly different from zero at the 10%, 5% and 1% levels, respectively.

#### 4.2.2. Accuracy Assessment

We evaluated the population modeling results using 2015 census data for 150 randomly selected villages and towns. As terrain factors could affect the accuracy of population simulations, we divided the study area into three zones by elevation (high-altitude, ≥1000 m; medium-altitude, 500–1000 m; and low-altitude, <500 m), among which the randomly selected villages and towns were evenly distributed (Figure 5).

The four error evaluation indicators of the overall simulated population in 2015 were R (0.85), MAE (4947.58), MRE (26.98%) and RMSE (8170.45). In addition, MRE differed by altitude zone (low-altitude, 25.73%; medium-altitude, 25.90%; high-altitude, 29.34%). The REs for each village and town showed that 46% were relatively accurate, 18% were generally overestimated, 20% were generally underestimated, 8% were seriously overestimated and 8% were seriously underestimated (Table 4).

**Table 4.** Relative error (RE) classification for villages and townships.


**Figure 5.** Spatial distribution of the villages and towns selected.

#### 4.2.3. Spatial Distribution of Population in Chongqing

According to the fifth census in 2000, if the population density of a municipal district was more than 1500 persons/km<sup>2</sup>, the entire population was classified as urban. On this basis, we regarded population densities of >1500 persons/km<sup>2</sup> as high-population-density regions. In addition, according to Tan et al.'s [19] hierarchical classification method for population density, areas with a population density of 200–1500 persons/km<sup>2</sup> were classified as intermediate-density regions and areas with a population density <200 persons/km<sup>2</sup> were classified as low-density regions.
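The three density classes can be expressed as a small helper. The boundary handling (exactly 200 and exactly 1500 persons/km<sup>2</sup> both fall in the intermediate class) follows the inequalities stated above; the function name is hypothetical.

```python
def density_class(persons_per_km2):
    """Classify a population density (persons/km^2) as high, intermediate or low."""
    if persons_per_km2 > 1500:
        return "high"
    if persons_per_km2 >= 200:
        return "intermediate"
    return "low"


# Average low-altitude and high-altitude densities reported for 2018 in this article.
classes = [density_class(647.08), density_class(58.78)]
```

Applying the helper to every pixel of a simulated density grid would give the class areas that are compared across years.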

From 2000 to 2018, Chongqing's population density has generally increased in the west and decreased in the east (Figure 6). High-density regions were mainly distributed in western Chongqing and those centered on Yuzhong continued to expand. In contrast, the population density of most regions in the northeast and southeast decreased to varying degrees, trending toward low population density.

Low-density regions in Chongqing grew from 35.64 × 10<sup>3</sup> km<sup>2</sup> in 2000 to 41.07 × 10<sup>3</sup> km<sup>2</sup> in 2018 (an increase of 15.22%) (Table 5). Of the newly added low-density regions, 95% were created by population loss from intermediate-density regions, mainly in the northeast and southeast, including Fengjie, Yunyang, Wushan, Wuxi, Xiushan, Fengdu and Shizhu (Figure 7).


**Table 5.** Changes in population density from 2000 to 2018.

**Figure 6.** Simulated results of population density in Chongqing in 2000, 2005, 2010, 2015 and 2018.

**Figure 7.** Changes in population density in Chongqing from 2000 to 2018.

The total intermediate-density area decreased from 45.02 × 10<sup>3</sup> km<sup>2</sup> in 2000 to 38.38 × 10<sup>3</sup> km<sup>2</sup> in 2018 (a decrease of 14.75%). The reduced regions were mainly distributed in the primary urban zone of Chongqing and in the northeast. Intermediate-density regions in the urban zone tended to agglomerate and gradually develop into high-density regions, while intermediate-density regions in the northeast gradually lost their population and developed into low-density regions. In addition, within each district and county, population development trended toward agglomeration, manifested as a gradual increase in urban population density and the gradual evolution of intermediate-density regions into high-density regions; however, in non-urban areas, population loss was more common in intermediate-density regions, where population density decreased.

High-density regions gradually expanded from 1.07 × 10<sup>3</sup> km<sup>2</sup> in 2000 to 2.28 × 10<sup>3</sup> km<sup>2</sup> in 2018 (an increase of 113.08%). In 2000, these were mainly distributed within a radius of 24 km from Yuzhong (Figure 8), but ongoing urbanization expanded this range to a radius of 33 km by 2018. In addition, urban areas within each district and county also became distributed within high-density regions, which expanded to different degrees.

**Figure 8.** High-population-density regions in Chongqing in 2000 and 2018.

The low-altitude zone had the highest average population density and population growth while trending toward agglomeration (Table 6). The average population density here increased from 550.58 to 647.08 people/km<sup>2</sup> from 2000 to 2018, a total population increase of 3.16 million. The growth rate was fastest from 2010 to 2015, increasing by 134.69 × 10<sup>4</sup> people in only five years. In contrast, the medium- and high-altitude zones showed declining population density and total population. The medium-altitude zone showed a drop in average population density from 232.66 to 223.70 people/km<sup>2</sup> from 2000 to 2018, a total population decrease of 0.4 million. The average population density in the high-altitude zone dropped from 64.45 to 58.78 people/km<sup>2</sup> from 2000 to 2018, a total population decrease of 0.18 million.

**Table 6.** Average population density and total population of each altitude zone.


#### **5. Discussion**

The DMSP-OLS dataset represents the most widely used NTL data over the previous two decades, while the new NPP-VIIRS NTL data have been available since 2012. Despite the great significance of studying long-term population evolution in the context of urban-rural migrations, few studies have integrated the two datasets to simulate and monitor population spatial changes over the full time period. In this study, we proposed a method for integrating the DMSP-OLS and NPP-VIIRS data at the pixel scale in order to extend the temporal coverage of NTL data. We also evaluated the accuracy of the integrated NTL data and obtained an MRE of 8.19%. This is 4.78 percentage points lower than the MRE of the long-time-series NTL dataset established at the provincial level [39], which indicates that our method for NTL integration is feasible and that the resulting data have good quality and generally reliable temporal consistency.

Previous studies have simulated population spatial distribution in different regions using NTL and land-use data. Hu et al. [25] did this for Sichuan and Chongqing in 2014, with MREs for population data based on DMSP-OLS and NPP-VIIRS NTL data of 46.3% and 44.62%, respectively. Chowdhury et al. [23] developed a model for estimating the population in the Indian portion of the Indo-Gangetic Plains at both city and state levels by employing OLS NTL data. The model was validated for the population of year 1995, with an MRE of 9.4%. Liu et al. [26] simulated the spatial pattern of urban and rural residents in the Huang-Huai-Hai area with an MRE of 15.6%. Tan et al. [19] simulated the population density of China in 2000, achieving a correlation coefficient between the statistical and simulated values of 0.95. The accuracy of population simulations in mountainous areas such as Chongqing and Sichuan is lower than in plains areas such as Huang-Huai-Hai, demonstrating that population simulation in mountainous areas is more challenging and uncertain. As we were limited by the difficulty of obtaining accurate population data in towns and villages, we only tested the accuracy of population simulation in 2015; the R value (0.85) and MRE (26.98%) confirmed that the adjusted VIIRS data were capable of effectively simulating spatial population patterns. We optimized the simulation method for mountainous areas based on previous research [25], reducing the MRE by nearly 20 percentage points. We also introduced a feasible method for constructing long-term population spatial data, which is helpful for scientifically monitoring spatiotemporal trends in mountainous populations. In addition, the U.S. Department of Defense has developed the LandScan database using an innovative approach with Geographic Information Systems and Remote Sensing, which is the finest resolution global population distribution data available [49]. To further verify our results, we also evaluated LandScan data using 2015 census data for 150 randomly selected villages and towns. The R value and MRE were 0.78 and 35.7%, respectively, both poorer than the values obtained with our method, which further supports its feasibility.

It is worth mentioning that there are still some limitations in this study. First, although we were able to improve the accuracy of mountainous population spatial simulation through data processing, this method was unable to completely eliminate inherent defects in the DMSP-OLS data, such as light saturation in urban centers with high light intensity [50] and insufficient detection capabilities in low-radiation areas such as rural areas [33]. These flaws reduce the accuracy of population simulation to a certain extent. Second, the change of lighting technology (from sodium vapor to light-emitting diode) reduced NTL values in the city center [51], which may have led to an underestimation of population simulation results. Third, annual population distribution data were difficult to obtain, so we only simulated the population distribution in the five periods of 2000, 2005, 2010, 2015 and 2018. Fourth, compared with DMSP-OLS data, NPP-VIIRS data have a higher spatiotemporal resolution. The advantages of the latter were not fully integrated into the long-term NTL dataset and further research is needed to improve the spatial resolution of NTL integration.

#### **6. Conclusions**

We integrated DMSP-OLS and NPP-VIIRS NTL data to construct a long-term NTL dataset, using the random-effect model with land-use data and corrected NTL data to model the spatiotemporal dynamics of Chongqing's population from 2000 to 2018. At the pixel level, there was a power function relationship between the two datasets (R<sup>2</sup> = 0.907). Compared with an NTL integration model previously established at the provincial level, the MRE of our model was 4.78 percentage points lower. In addition, accuracy tests using 2015 data resulted in an MRE of 26.98%, an improvement of nearly 20 percentage points when compared with previous studies of mountainous populations. Therefore, our approach is feasible and provides a technical method for monitoring spatiotemporal population changes in mountainous areas.

From 2000–2018, Chongqing's population increased in the west and decreased in the east, while also increasing in low-altitude areas and decreasing in medium-high altitude areas. Moreover, population agglomeration was common. At the provincial level, high-density regions showed a significant increase, while intermediate-density regions decreased. The population density significantly increased in the central urban area and immediate surroundings in every district and county, while it significantly decreased in non-urban areas, especially in the northeast.

**Author Contributions:** The co-authors together contributed to the completion of this article. Specifically, it follows their individual contribution: conceptualization, D.L. and Y.W.; data curation, H.Z., Y.W. and K.S.; formal analysis, Q.Y. and Y.W.; investigation, D.L. and Y.L.; methodology, D.L. and Y.W.; project administration, Q.Y.; writing—original draft, D.L.; writing—review and editing, D.L., Y.W. and Q.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by National Natural Science Foundation of China, grant number 42071234 and 41901232; Chongqing Social Science Planning Project, grant number 2019PY49; the Youth Fund for Humanities and Social Sciences Research of the Ministry of Education, grant number 19XJCZH006; and the Fundamental Research Funds for the Central Universities, grant number XDJK2020C014.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Comparing Luojia 1-01 and VIIRS Nighttime Light Data in Detecting Urban Spatial Structure Using a Threshold-Based Kernel Density Estimation**

**Yuping Wang and Zehao Shen \***

Ministry of Education Key Laboratory for Earth Surface Processes, College of Urban and Environmental Sciences, Peking University, Beijing 100871, China; yupinwang@pku.edu.cn

**\*** Correspondence: shzh@urban.pku.edu.cn; Tel.: +86-010-62751179

**Abstract:** Nighttime light (NTL) data are increasingly used in urban studies and urban planning owing to their strong connection with human activities, although the detection capacity is limited by the spatial resolution of older data. In the present study, we compared the results of extractions of urban built-up areas using data obtained from the first professional NTL satellite Luojia 1-01 with a resolution of 130 m and the Visible Infrared Imaging Radiometer Suite (VIIRS). We applied an analytical framework combining kernel density estimation (KDE) under different search radii and threshold-based extraction to detect the boundary and spatial structure of urban areas. The results showed that: (1) Benefiting from a higher spatial resolution, Luojia 1-01 data were more sensitive in detecting newly emerging urban built-up areas, thus better reflected the spatial structure of the urban system, and achieved a higher extraction accuracy than VIIRS data; (2) Combined with a proper threshold, KDE improves the extraction accuracy of NTL data by making use of the spatial autocorrelation of nighttime light, and thus better detects the scale of the spatial pattern of urban built-up areas; (3) A proper search radius for KDE is critical for achieving the optimal result, which was 1000 m for Luojia 1-01 and 1600 m for VIIRS in this study. Our findings indicate the usefulness of the KDE method in applying upcoming high-resolution NTL data such as Luojia 1-01 data in urban spatial analysis and planning.

**Keywords:** kernel density estimation; Luojia 1-01 satellite; nighttime light; spatial resolution; searching radius threshold; urban built-up area

#### **1. Introduction**

Cities comprise a landscape type with the most concentrated human activities. The intense exchanges of materials, energy, and information connect cities with nearby areas via traffic and other networks, and form apparent social and environmental gradients from urban to rural areas, with diverse city structures [1]. Spatial analysis of city structures is critical for understanding city functions and evolution, while accurate discrimination of the urban boundary and the internal structure is a prerequisite for further spatial analyses [2–4]. In general, urban built-up areas within the administrative region of a specific city comprise continuous areas with adequate municipal facilities [5]. Urban built-up areas are the core areas of cities and the main focus of research into urban structure, functioning, and development [6].

Remote sensing images are a major data source for urban structure analysis. Early studies usually employed daytime remote sensing data such as Thematic Mapper (TM) images to extract urban built-up areas [7,8], but information about the buildings is not always an accurate indicator of the intensity and economic importance of urban areas. In recent years, nighttime light (NTL) data have been increasingly employed to indicate human activities at the landscape to regional scales [9]. Pioneered by Elvidge et al.'s application of NTL data in city mapping and analyses [10,11], this new approach provided distinct and sometimes more effective information for urban structure identification and spatial analysis, especially for high energy release patterns [11,12].

**Citation:** Wang, Y.; Shen, Z. Comparing Luojia 1-01 and VIIRS Nighttime Light Data in Detecting Urban Spatial Structure Using a Threshold-Based Kernel Density Estimation. *Remote Sens.* **2021**, *13*, 1574. https://doi.org/10.3390/rs13081574

Academic Editor: Nataliya Rybnikova

Received: 3 March 2021 Accepted: 13 April 2021 Published: 19 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Li and Li (2015) stressed that NTL contains various types of information that merit further research [13]. For example, the Defense Meteorological Satellite Program/Operational Linescan System (DMSP/OLS) data have been employed in various applications, including urban extent and extension analysis [14,15], regional economy assessment [16,17], energy releasing events monitoring [18], and fishery research [19]. DMSP/OLS data have significant research value because of the long time period covered. However, these data are too coarse to extract detailed spatial information [20]. Thus, later studies tried to combine DMSP/OLS data with high-resolution remote sensing data to obtain accurate results [21–23]. For example, the Visible Infrared Imaging Radiometer Suite (VIIRS) was launched in 2011 on the Suomi National Polar-Orbiting Partnership (NPP) spacecraft and it provides a new data source with a resolution of 15 arc-seconds (~500 m), implying a much-improved capacity for detecting detail. Shi et al. (2014) employed the VIIRS NTL data to extract built-up urban areas, proving its reliability in urban extent extraction [24]. The combination of VIIRS NTL data with high-resolution remote sensing data has been effective in extracting built-up areas [25,26]. The higher spatial resolution of the VIIRS NTL data gives it a clearly better ability to separate light sources from other land cover types [27].

The Luojia 1-01 satellite was launched from China on 2 June 2018 and its onboard complementary metal oxide semiconductor can produce high-resolution NTL imagery (130 m). As the world's first professional NTL satellite, Luojia 1-01 has a swath width of 250 km and it covers the Earth surface every 15 days. Luojia 1-01 data have been used to extract urban extent characteristics [28,29] and investigate artificial light pollution [30]. Compared with previous studies based on other NTL data, such as NPP-VIIRS data, a more precise extent of urban impervious surface can be obtained using Luojia 1-01 data (Appendix A, Figure A1), due to its superior capacity to detect more details and its wider measurement range [31]. Further, researchers found it feasible to detect urban expansion through the combination of Luojia 1-01 data and other imagery data. The high spatial resolution of the NTL images played a critical role in achieving more accurate results in detecting distinct energy-releasing objects, such as urban impervious surface, population density, or human activity intensity [32,33].

Along with increasing applications of NTL data in research, novel methods have been developed to explore specific features of this data source. For example, the threshold-based method is widely used to select a specific NTL value to distinguish built-up areas from non-built-up areas [34,35]. With ancillary data such as International Space Station images [32], multiple thresholds can be identified for extraction in different regions or different time periods [36]. Clustering methods are also commonly applied in urban area extraction, and are especially useful in large-scale studies [15]. Machine-learning methods represent another active frontier of built-up area classification with NTL data; related examples include support vector machine [37], artificial neural network [38], and specifically developed methods [39].

In this study, we employed Luojia 1-01 data, VIIRS data, and Landsat 8 data to develop a method for extracting urban built-up areas using kernel density estimation (KDE), taking Nanjing, the capital city of Jiangsu Province of China, as a study area. By comparing the extraction of urban built-up areas using these two NTL datasets and testing the results with the validation data, we intended to answer the following two questions: (1) What are the relative advantages of Luojia 1-01 compared with VIIRS in detecting urban spatial structure? (2) How do the search radius of KDE and the discriminating threshold value affect the effectiveness of KDE in extracting urban built-up areas, especially the urban boundaries and newly emerging built-up areas?

#### **2. Materials and Methods**

*2.1. Study Region and Data*

2.1.1. Remote Sensing Data

Nanjing is a large inland port city, the capital of Jiangsu Province in East China, with a population of 8.436 million and an urban area of 6587 km<sup>2</sup> by 2018. Nanjing comprises eleven urban districts (i.e., Gaochun, Gulou, Lishui, Liuhe, Jiangning, Jianye, Pukou, Qinhuai, Qixia, Xuanwu, Yuhuatai, and the Jiangbei New District) that are distributed on the south and the north banks of the Yangtze River. The Luojia 1-01 data product of Nanjing used in this study was imaged on 23 November 2018. It completely covered the study area with the central geographical coordinates of 117.880537°E/31.883928°N. A Landsat 8 Operational Land Imager (OLI) image from 19 April 2018 was acquired from the Geospatial Data Cloud (http://www.gscloud.cn/, accessed on 10 July 2020). The central coordinates of the image were 118.8335°E/31.7424°N, and the cloud cover was 0.31%. The VIIRS monthly synthetic product acquired in December 2018 for the same region was downloaded from https://www.ngdc.noaa.gov/eog/viirs/download\_dnb\_composites.html (accessed on 21 November 2019), the website of the National Oceanic and Atmospheric Administration, and included in this study.

The images were clipped to fit the study area. To ensure the accuracy of the area calculations, the images were projected using the Albers equal-area conic projection. To reduce the effect of light saturation, a radiometric correction for Luojia 1-01 NTL was implemented using the following formula provided by the data distribution website:

$$L = DN^{3/2} \cdot 10^{-10} \tag{1}$$

where *DN* is the digital number representing the image value of each pixel, and *L* represents the corrected radiance of the Luojia1-01 NTL data.
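Equation (1) can be applied pixel-wise to an array of digital numbers. The following is a minimal sketch, assuming the image has already been read into a NumPy array; the function name `luojia_dn_to_radiance` and the sample values are hypothetical and are not part of any Luojia 1-01 toolkit.

```python
import numpy as np

def luojia_dn_to_radiance(dn):
    """Apply Equation (1), L = DN^(3/2) * 1e-10, element-wise."""
    dn = np.asarray(dn, dtype=float)
    return np.power(dn, 1.5) * 1e-10

# Made-up block of digital numbers, for illustration only.
dn_block = np.array([[0, 100], [400, 10000]])
radiance = luojia_dn_to_radiance(dn_block)
```

Because the exponent is larger than one, bright pixels are stretched more than dim ones, which is consistent with the stated aim of reducing the effect of saturation.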

The unit of Luojia 1-01 radiance is W·m<sup>−2</sup>·sr<sup>−1</sup>·μm<sup>−1</sup>, and we converted the unit to nW·cm<sup>−2</sup>·sr<sup>−1</sup>, which is the unit of VIIRS data. To eliminate georeferencing errors in Luojia 1-01 data, a geometric correction was also done with reference to OpenStreetMap. After the correction, the image matched well with ground objects (Appendix A, Figure A2). For further processing, both NTL images were resampled to the same resolution as the Landsat 8 data (30 m) through cubic spline interpolation. Figure 1a,b show the corrected Luojia 1-01 image and the VIIRS image, respectively.

**Figure 1.** City structure of Nanjing City represented by radiometric corrected nighttime light (NTL) images of (**a**) Luojia 1-01 and (**b**) Visible Infrared Imaging Radiometer Suite (VIIRS).

[Figure map labels (new towns): Lukou, Banqiao, Qiaolin, Binjiang, Longtan, Tangshan, Chunxi, Yongyang]

#### 2.1.2. Validation Data

Validation data are essential to assess the accuracy of urban built-up areas extracted from the remote sensing data. In our study, the Nanjing Zoning Map and the urban system planning map in the Nanjing Urban Master Plan (Figure 2a,b) were included for validation purposes. The maps were used to evaluate whether the spatial pattern of the extracted built-up area can reflect the actual structure of the urban system.

**Figure 2.** Validation maps of the structure of Nanjing city represented as: (**a**) the zoning map issued by Jiangsu Provincial Bureau of Surveying Mapping and Geoinformation; (**b**) the urban system map derived from the plan of the Nanjing Jiangbei New District Administrative Committee and the Nanjing Urban Master Plan (2011–2020) issued by Nanjing Municipal Planning and Natural Resources Bureau; (**c**) the built-up area of Nanjing in 2018 from the Resource and Environment Data Cloud Platform.

For accuracy evaluation, we obtained the data of urban built-up areas in 2018 from the Resource and Environment Data Cloud Platform (http://www.resdc.cn/, accessed on 27 January 2020), supported by the Institution of Geographic Sciences and Natural Resources Research, the Chinese Academy of Sciences. This 1-km resolution raster data was derived from the Landsat 8 data through manual visual interpretation (Figure 2c).

#### *2.2. Analytical Methods*

We used the Vegetation Adjusted NTL Urban Index (VANUI) [40] to extract built-up areas in Nanjing from Luojia 1-01 and VIIRS images. KDE was conducted under different search radii, and then a threshold method was applied to extract high-value pixels as built-up areas. Extraction results were compared with the validation data of urban built-up areas in Nanjing to evaluate the accuracy. A conceptual diagram of the methods of this study is shown in Figure 3.

**Figure 3.** The conceptual diagram of analytical procedure of this study.

#### 2.2.1. VANUI for Luojia 1-01 and VIIRS

To improve the sensitivity of light density to the geographical objects it was used to represent [41–43], such as the intensity of economic activities, we combined NTL and Normalized Difference Vegetation Index (*NDVI*) to calculate the VANUI as indices for extracting built-up areas instead of directly using NTL. This spectral index has been proven effective in reducing NTL saturation and increasing variation of data values in core urban areas [40]. The normalized difference vegetation index is an indicator of vegetation coverage:

$$NDVI = \frac{NIR - R}{NIR + R} \tag{2}$$

where *NIR* represents the near-infrared band and *R* represents the red band, i.e., band 5 and band 4 in the Landsat 8 OLI, respectively.

VANUI is defined as Equation (3), where NDVI is derived from Equation (2), and NTL represents the radiance value of Luojia 1-01 and VIIRS data:

$$VANUI = (1 - NDVI) \times NTL \tag{3}$$

*VANUI* was calculated separately from the Luojia 1-01 and VIIRS data (VANUI\_LUOJIA and VANUI\_VIIRS, respectively). Figure 4 shows the spatial structures of Nanjing City derived from VANUI\_LUOJIA and VANUI\_VIIRS.
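Equations (2) and (3) translate directly into array arithmetic. The sketch below assumes that the red band, near-infrared band, and NTL radiance are co-registered NumPy arrays of equal shape; the function names are hypothetical, and setting NDVI to 0 where NIR + R = 0 is an assumption made here only to avoid division by zero.

```python
import numpy as np

def ndvi(nir, red):
    """Equation (2); cells where NIR + R == 0 are set to 0 (assumption)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    out = np.zeros_like(total)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

def vanui(ntl, nir, red):
    """Equation (3): VANUI = (1 - NDVI) * NTL."""
    return (1.0 - ndvi(nir, red)) * np.asarray(ntl, dtype=float)

# Made-up values: a vegetated lit pixel and a lit pixel with no band signal.
example = vanui(ntl=[30.0, 10.0], nir=[0.5, 0.0], red=[0.1, 0.0])
```

In this example the vegetated pixel (NDVI = 2/3) has its radiance of 30 reduced to 10, while the second pixel keeps its full radiance, which illustrates how the index down-weights light over vegetated surfaces.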

**Figure 4.** Spatial patterns of the urban indices: (**a**) Vegetation Adjusted NTL Urban Index (VANUI)\_LUOJIA, derived from Luojia 1-01 data and Normalized Difference Vegetation Index (NDVI); (**b**) VANUI\_VIIRS, derived from VIIRS data and NDVI.

#### 2.2.2. KDE Method

KDE is a distance-dependent density estimation method, in which the value of each output grid/point represents the accumulative influence of a neighborhood, described by a kernel function, on the focal grid/point density [44]. KDE is generally applied to describe spatial patterns with lateral overflow, such as species distribution ranges [45] and road density patterns [46,47]. The density in each output grid cell is calculated by adding the values of all the kernel surfaces where they overlay the grid cell center. KDE is also applicable to NTL data, as overflow generally exists in light density. Moreover, the indicative capacity of NTL varies among geographical objects. Specifically, for urban patterns, the raw NTL data could underrepresent blocks where most lights are turned off at night, such as schools, banks, and parks located in urban areas. Therefore, smoothing the nighttime light density helps to reveal urban areas that are darker at night. Conceptually, a smoothly curved surface is fitted over each point. The surface value is highest at the location of the point and it decreases as the distance from the point increases, until it reaches zero at the bandwidth distance from the point. The bivariate KDE is defined as:

$$\hat{f}(\mathbf{x}) = \frac{1}{nh^2} \sum_{j=1}^{n} K\left\{\frac{\mathbf{x} - \mathbf{X}_j}{h}\right\} \tag{4}$$

where *n* is the sample size, *h* represents the bandwidth, *K* is the kernel function, the two-dimensional *x* denotes the vector for which the function is evaluated, and the two-dimensional *X<sub>j</sub>* is the sample vector [48].

The kernel function used here is based on the quartic kernel function as follows:

$$K(\mathbf{x}) = \begin{cases} 3\pi^{-1}(1 - \mathbf{x}^T\mathbf{x})^2 & \mathbf{x}^T\mathbf{x} < 1 \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

In this study, we integrated the effect of the NTL surrounding each grid point, and classified the output grids with values higher than a specified threshold value as urban built-up areas. KDE was thus used to compare the spatial patterns of urbanization recognized by the NTL data of Luojia 1-01 and VIIRS.

We converted VANUI\_LUOJIA and VANUI\_VIIRS into a set of point features at the center of each grid cell. Each point had the same value as the index grid cell from which it was derived. We then used the KDE method to estimate the spatial pattern of the point density to obtain the spatial distribution information for each index. The value of each point was treated as its weight, i.e., the number of times the point was counted. The search radius is an important parameter in the KDE method, and it has considerable influence on the extraction result. To explore the proper search radius, KDE on VANUI\_LUOJIA and VANUI\_VIIRS was conducted under search radii ranging from 100 m to 2000 m at 100-m intervals, resulting in twenty KDE images for each NTL dataset.
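The weighted KDE described above can be sketched as a moving-window sum over the index grid. The code below is an illustration under stated assumptions, not the software used in this study: it uses the standard quartic kernel 3/π·(1 − u·u)² for u·u < 1, treats each cell value as the weight of a point at the cell center, divides by h² only (the 1/n factor is a constant that does not change the ranking of pixels), and uses hypothetical function names.

```python
import numpy as np

def quartic_kernel_weights(radius_m, cell_size_m):
    """Sample the quartic kernel on the grid of cell-center offsets."""
    half = int(np.floor(radius_m / cell_size_m))
    offsets = np.arange(-half, half + 1) * cell_size_m
    dx, dy = np.meshgrid(offsets, offsets)
    u2 = (dx**2 + dy**2) / radius_m**2  # squared scaled distance
    return np.where(u2 < 1.0, 3.0 / np.pi * (1.0 - u2) ** 2, 0.0)

def weighted_kde(index_grid, radius_m, cell_size_m):
    """Sum kernel-weighted cell values within the search radius."""
    index_grid = np.asarray(index_grid, dtype=float)
    kernel = quartic_kernel_weights(radius_m, cell_size_m)
    half = kernel.shape[0] // 2
    padded = np.pad(index_grid, half, mode="constant")  # zeros outside
    rows, cols = index_grid.shape
    density = np.zeros((rows, cols), dtype=float)
    # The kernel is symmetric, so shifting and summing equals convolution.
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            if kernel[i, j] > 0:
                density += kernel[i, j] * padded[i:i + rows, j:j + cols]
    return density / radius_m**2

# Radii from 100 m to 2000 m at 100-m intervals, as described in the text.
search_radii = list(range(100, 2001, 100))

# Made-up grid with a single lit cell, 100-m cells, 300-m radius.
grid = np.zeros((21, 21))
grid[10, 10] = 5.0
density = weighted_kde(grid, radius_m=300.0, cell_size_m=100.0)
```

The explicit loop keeps the sketch dependency-free; for full-size images a library convolution routine would likely be much faster.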

#### 2.2.3. Threshold-Based Urban Built-Up Area Extraction

KDE cannot be used to discriminate boundaries by itself. Determining a density threshold is essential for reasonably deciding the boundaries of urban built-up areas. The most widely employed methods include the mutation detection method [49], the empirical threshold method [50], and the reference comparison method based on spatial data [34] or statistical data [51]. In this study, we employed statistical data to help determine the threshold value in order to identify and extract urban built-up areas. This method has been adopted in previous NTL-based studies to determine the threshold in built-up area extraction, and the results proved satisfactory [24,52,53].

According to the Nanjing Statistical Yearbook 2019, the built-up area of Nanjing covered 817 km<sup>2</sup> in 2018. We employed this as a benchmark to extract 817 high-value pixels from each KDE image. Then we eliminated individual pixels which were not connected with other extracted pixels, and filled holes to obtain twenty optimum extraction results for Luojia 1-01 and VIIRS data with different search radii.
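The benchmark-based extraction can be sketched as selecting the highest-density pixels and then cleaning the mask. The code below is a minimal illustration with hypothetical function names; it assumes that each pixel represents a fixed area so that a pixel count stands in for the benchmark area, uses 8-connectivity to decide whether a pixel is isolated (the connectivity rule is an assumption), and omits the hole-filling step.

```python
import numpy as np

def extract_top_pixels(density, n_pixels):
    """Boolean mask marking the n_pixels cells with the highest density."""
    flat = np.asarray(density, dtype=float).ravel()
    top = np.argsort(flat, kind="stable")[::-1][:n_pixels]
    mask = np.zeros(flat.shape, dtype=bool)
    mask[top] = True
    return mask.reshape(np.shape(density))

def drop_isolated_pixels(mask):
    """Remove extracted pixels that have no extracted 8-neighbor."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant").astype(int)
    rows, cols = mask.shape
    neighbours = np.zeros((rows, cols), dtype=int)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di or dj:  # skip the pixel itself
                neighbours += padded[1 + di:1 + di + rows,
                                     1 + dj:1 + dj + cols]
    return mask & (neighbours > 0)

# Made-up density surface: the three largest values lie in the last row.
toy_density = np.arange(25, dtype=float).reshape(5, 5)
toy_mask = extract_top_pixels(toy_density, 3)
cleaned = drop_isolated_pixels(toy_mask)
```

Ranking pixels rather than fixing an absolute density value keeps the extracted area equal to the benchmark for every search radius, so the twenty results differ only in where the pixels are placed.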

#### 2.2.4. Extraction Result Evaluation

In this study, the Nanjing zoning boundaries and the Nanjing Urban System Plan map were used to assess the structure of extracted built-up areas, while the built-up area for 2018 was applied to evaluate the extraction accuracy.

As shown in Figure 2c, the validation data of the built-up area for 2018 have holes and small fragments of built-up area. Thus, the processes of fragment removal and hole filling in Section 2.2.3 were also carried out before the accuracy evaluation, resulting in validation data covering 963 km<sup>2</sup>. Four commonly used metrics were calculated for accuracy evaluation, including overall accuracy (OA), producer's accuracy (PA), user's accuracy (UA) and Kappa coefficient (KC) [54]. OA presented the ratio of correctly identified pixels, PA indicated the proportion of detected built-up areas in the validation data, UA measured the proportion of truly built-up areas in extraction results, and KC provided an overall assessment of extraction accuracy.
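The four metrics follow from the two-class confusion matrix. The sketch below uses the standard definitions for a binary built-up/non-built-up classification; the function name and the toy masks are hypothetical.

```python
import numpy as np

def accuracy_metrics(extracted, reference):
    """OA, PA, UA and Kappa for two boolean built-up masks."""
    extracted = np.asarray(extracted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    tp = float(np.sum(extracted & reference))    # built-up in both
    fp = float(np.sum(extracted & ~reference))   # commission errors
    fn = float(np.sum(~extracted & reference))   # omission errors
    tn = float(np.sum(~extracted & ~reference))  # non-built-up in both
    n = tp + fp + fn + tn
    oa = (tp + tn) / n
    pa = tp / (tp + fn)
    ua = tp / (tp + fp)
    # Agreement expected by chance, from the row and column totals.
    pe = ((tp + fn) * (tp + fp) + (fn + tn) * (fp + tn)) / n**2
    kappa = (oa - pe) / (1.0 - pe)
    return {"OA": oa, "PA": pa, "UA": ua, "Kappa": kappa}

# Toy masks giving TP = 40, FN = 10, FP = 20, TN = 30.
toy_extracted = np.array([1] * 40 + [0] * 10 + [1] * 20 + [0] * 30)
toy_reference = np.array([1] * 50 + [0] * 50)
metrics = accuracy_metrics(toy_extracted, toy_reference)
```

For these toy masks the function gives OA = 0.7, PA = 0.8, UA = 2/3, and Kappa = 0.4, which shows how Kappa discounts the agreement of 0.5 expected by chance.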

#### **3. Results**

#### *3.1. Extraction Results and the Urban Structure*

The KDE-based extraction results using the two NTL datasets with four search radii (i.e., 500 m, 1000 m, 1500 m, and 2000 m) were compared separately with the urban structure of Nanjing (Figure 5). In general, built-up areas were mostly concentrated in the central city. Districts that had long histories of urban development (e.g., Qinhuai, Xuanwu, and Gulou) were almost fully covered with built-up areas. Regions with rapid economic growth in recent years (such as Jiangbei New District and Jiangning District) also had large built-up areas.

**Figure 5.** Comparison of extraction results and the urban structure of Nanjing, in which blue areas indicate extracted built-up areas, hatched areas indicate the central city, points represent new towns, and boundaries between districts are drawn. Kernel density estimation (KDE) was conducted respectively on VANUI\_LUOJIA and VANUI\_VIIRS under the search radius of 500 m, 1000 m, 1500 m, and 2000 m.

The urban system of Nanjing is composed of the central city and nine new towns. Under a search radius of 500 m, Luojia 1-01 identified all nine new towns with the best performance. However, the results did not agree well with the central city. With the increase of the search radius, the new Tangshan town could not be detected, but the extraction results performed better in the central city. On the other hand, VIIRS was able to detect Tangshan but missed the new Chunxi town in northern Nanjing. In the new towns Qiaolin, Lukou, and Yongyang, VIIRS-based extraction results covered less area than those derived from Luojia 1-01 data.

#### *3.2. Accuracy Evaluation*

The accuracy evaluation of the extraction results using the OA, PA, UA, and Kappa metrics is shown in Figure 6 for the raw data (radius of 0 m) of Luojia 1-01 and VIIRS, and for their KDE images with radii increasing from 100 m to 2000 m; the confusion matrices at radii of 0 m (no KDE), 500 m, 1000 m, 1500 m, and 2000 m are shown in Table 1. For Luojia 1-01 extractions, the KDE results were more accurate than the raw data when the search radius was larger than 500 m, while for VIIRS, the KDE results were consistently more accurate than its raw data. Overall, the Luojia-based extractions showed higher accuracies than the VIIRS-based results in the KDEs with radii over 500 m, but the accuracy of the raw data (resampled to 1 km) of VIIRS was superior to that of Luojia 1-01. The accuracy of Luojia 1-01 extractions showed a unimodal change across the range of search radii and peaked at a radius of around 1000 m. For VIIRS-based extractions, the accuracy increased steadily and peaked at a search radius of about 1600 m. The accuracy of the best extraction result of Luojia 1-01 images (0.937, 0.733, 0.815, and 0.734 for OA, PA, UA, and Kappa, respectively) was higher than that of VIIRS images (0.935, 0.732, 0.805, and 0.730, correspondingly).

**Figure 6.** The change of accuracy evaluation metrics for built-up area extractions under different search radii using Luojia 1-01 and VIIRS data products. The evaluation values at the search radius of 0 m represent the raw data of Luojia 1-01 and VIIRS, respectively, without application of KDE.

**Table 1.** Confusion matrix of extracting built-up area with different search radii using Luojia 1-01 and VIIRS data products. BA: built-up areas; NA: non-built-up areas.


As for the spatial distribution of extraction results (Figure 7), differences can be found between the KDE patterns of the two NTL datasets, as well as among different search radii. VIIRS showed relatively poor results in the southern suburbs and failed to detect the southernmost built-up area. Under the search radius of 500 m, the KDE of Luojia 1-01 was not able to fully extract the central part of the built-up areas, whereas under the search radius of 1000 m, the extraction matched best with the validation data in the central part. However, as the radius continued to increase, omission errors could be found near the margins. Similarly, omission of built-up areas appeared in the VIIRS-based extraction when the search radius was 2000 m.

**Figure 7.** Comparison of extraction results and the validation data for urban built-up areas, in which blue areas indicate extracted built-up areas and hatched areas indicate built-up areas from the validation data. KDE was conducted respectively on VANUI\_LUOJIA and VANUI\_VIIRS under the search radius of 500 m, 1000 m, 1500 m, and 2000 m.

#### **4. Discussion**

Being recognized as a useful indicator of human activity intensity, NTL data have been increasingly applied for urban structure analyses [55,56]. The technical features of the first professional NTL satellite Luojia 1-01 have been reported earlier [57–59]. After comparing the built-up area extraction results obtained using different search radii, we found that the threshold-constrained KDEs for both Luojia 1-01 and VIIRS data effectively extracted the built-up areas in the urban center, and the false extraction of water bodies (such as the Yangtze River and Xuanwu Lake) was satisfactorily avoided. However, there were substantial differences in the detection of urban built-up area boundaries, especially in the suburbs. In addition, the search radius of KDE had considerable effects on extraction results, and differed between the two NTL data sources.

The effectiveness of extracting the urban built-up area boundaries differed between the Luojia 1-01 and VIIRS datasets. In the central city, the NTL data of Luojia 1-01 and VIIRS produced comparatively consistent results. However, Luojia 1-01 was better at identifying newly growing urban cores, showing the advantage of the finer-resolution Luojia 1-01 data in capturing the emergent urban structure under a proper KDE search radius. The higher spatial resolution of Luojia 1-01 images provides a lesser degree of mixture of land-use types and light sources, and gives it a higher sensitivity to changes in the NTL environment than the coarser-resolution VIIRS images. In particular, more omission errors occurred in VIIRS-based extraction in the suburbs, where the urban built-up areas were more fragmented and mixed with other surrounding land-use types. Nevertheless, the daily revisit interval of VIIRS data gives it a much higher sensitivity to the dynamics of ground processes, although the 15-day temporal resolution of Luojia 1-01 data generally satisfies the requirement for urban built-up area analysis. It is also important to keep in mind that the difference in local acquisition time for VIIRS (01:30 a.m.) and Luojia 1-01 (10:30 p.m.) could also cause differences in their extraction results.

The KDE-based extraction of urban built-up areas has several advantages compared with direct extraction from raw NTL data. First of all, as gridded data of on-site light intensity, NTL is a useful indicator of nighttime human activity intensity. However, each type of NTL data has a fixed spatial resolution. When it is used to detect the pattern of a particular energy-releasing spatial process such as urban development, high-resolution NTL data could miss the urban buildings or blocks where most lights are turned off at night, such as those of banks, schools, museums, and administrative offices, as well as city parks. These areas are generally scattered in the center of urban areas, but are normally empty and darker at night compared with the areas of active nightlife, such as business and entertainment centers. Therefore, the new high-resolution NTL data may generate more bias in detecting urban areas compared with older, coarser data, as demonstrated by the lower accuracy values of Luojia 1-01 than VIIRS for their raw data and short search radius (≤500 m) KDEs (Figure 6). This flaw of high-resolution NTL data can be offset by applying KDE, which makes use of the spatial autocorrelation of NTLs with a proper search radius adapted to the actual illumination environment, rather than being limited by the fixed scale of the NTL data itself. Indeed, the accuracy of KDE with the optimal search radius was clearly higher than that of the raw image for both NTL datasets, and the best KDE of Luojia 1-01 was more accurate than that of VIIRS. Our application of KDE confirmed it as a useful method of urban detection using high-resolution NTL data, which smooths the NTL space by integrating the surrounding NTL values on the focal pixels, and thus reasonably "erases" the darker points within a continuous urban area. Secondly, in a KDE image, high-value grids have high accumulative light density within the search radius. Unlike the raw data, this accumulation can help to identify "NTL hotspots" with a threshold light value and spatial magnitude that indicate an emergent urban core. Moreover, KDE can be more reasonable and accurate in detecting the urban boundary, outside of which the NTL density drops below the threshold value that is determined by the accumulation of surrounding light intensity within a radius, rather than that at a single point. This effect is especially helpful for upcoming NTL data of higher spatial resolution. Urban built-up areas comprise centralized and contiguous areas, and the extraction of their boundaries is traditionally based on manual interpretation or automatic identification according to the edge point density [6]. The KDE method integrated the effects of the surrounding pixels on each output grid, which excluded small or narrowly illuminated areas and helped to extract "centralized and contiguous" urban built-up areas. Our results indicated that, regardless of whether the Luojia 1-01 NTL or VIIRS NTL was employed, the KDE-based extraction results could map the built-up urban areas precisely. This method should be particularly useful when the core urban areas have to be highlighted and the emerging new towns need to be identified, as in the case of urban planning.

In the KDE process, the search radius is an important parameter which can largely influence the extraction result. As shown in Figure 6, both large and small radii resulted in omission of built-up areas. When the search radius is too small, the KDE process only includes the influence of a narrow neighborhood. Thus, some lighted areas with dark intervals could be left out of the extraction result. If the search radius is too large, built-up areas near the boundaries may be integrated into the non-built-up area, or some disconnected built-up areas will be misconnected. The proper search radius should be the point when the extracted built-up area decreases with the increase of the radius. For Luojia 1-01 and VIIRS datasets used in our case, proper search radii appeared to be similar (1000 m and 1600 m, respectively), indicating that the KDE optimal radius might not be optimal per se, but subject to the spatial resolution of the NTL data, the validation data, and the scale of the fragmentation features of the geographical objects.

For both Luojia 1-01 NTL data and the KDE method, there remains much to be explored in future studies. As demonstrated for other NTL data sources, other factors can affect Luojia 1-01 images and thus the extraction result, and their impact requires further work to estimate; these include seasonal changes in nighttime light brightness [60], diurnal changes related to the satellite overpass time [61], and the effect of the satellite observation angle on the nighttime light [62]. The radiometric correction formula for Luojia 1-01 was obtained after the satellite had been running smoothly for several months. This data source will be more reliable after further corrections are made for different regions and time domains, and it is thus expected to provide long-term, high-resolution NTL information to support various applications in research, management, and policy-making, for example, the evaluation of new energy demand in newly detected urban areas, as well as the trend of CO2 levels in the atmosphere. Luojia 1-01 images from June to November 2018 can be downloaded at present. With more images available in the future, the performance of Luojia 1-01 in detecting urban area changes across different temporal resolution levels could be discussed. Given the effectiveness of the KDE extraction results, choosing a proper search radius would be critical for a successful application of this method. Our study shows the effectiveness of the KDE method in a high-luminosity area; a universal principle for selecting the optimal search radius in variable luminosity contexts remains for further exploration.

#### **5. Conclusions**

The comparison with VIIRS indicates that Luojia 1-01, the first professional NTL satellite, provides a reliable new data resource for nighttime light remote sensing. Its substantially improved spatial resolution is more sensitive to nighttime light variation and thus benefits the accurate extraction of the spatial structure of urban built-up areas, especially the urban boundary and newly growing urban cores. KDE combined with a properly determined threshold can make use of the spatial autocorrelation in NTL information. This improvement would be critical for the capacity of upcoming higher-resolution NTL data to capture overall spatial patterns, and a proper search radius and NTL threshold value are critical for an optimized KDE result. The high agreement between the extraction result and the validation data indicates the potential of Luojia 1-01 data for widespread applications, including urban study and planning.

**Author Contributions:** Conceptualization, Z.S. and Y.W.; methodology, Y.W. and Z.S.; formal analysis, Y.W.; writing—original draft preparation, Y.W.; writing—review and editing, Z.S.; funding acquisition, Z.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Key Research and Development Plan of the Ministry of Science and Technology of China, grant number 2017YFC0505200, and the project of National Natural Science Foundation of China, grant number 41371190.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Acknowledgments:** This study is sponsored by the Key Research and Development Plan of the Ministry of Science and Technology of China [2017YFC0505200], and the project of National Natural Science Foundation of China [41371190]. We are grateful for the helpful comments and suggestions from Fang Qiu from University of Texas at Dallas.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **Appendix A**

**Figure A2.** Luojia 1-01 image before (**a**,**b**) and after (**c**,**d**) the geometric correction.

#### **References**


## *Article* **The Modified Normalized Urban Area Composite Index: A Satelliate-Derived High-Resolution Index for Extracting Urban Areas**

**Feng Li <sup>1</sup>, Xiaoyang Liu <sup>1</sup>, Shunbao Liao <sup>1</sup> and Peng Jia <sup>2,3,</sup>\***


<sup>3</sup> International Institute of Spatial Lifecourse Epidemiology (ISLE), Hong Kong, China

**\*** Correspondence: jiapengff@hotmail.com

**Abstract:** The accurate and efficient extraction of urban areas is of great significance for better understanding of urban sprawl, the built environment, economic activities, and population distribution. Night-Time Light (NTL) data have been widely used to extract urban areas. However, most of the existing NTL indexes are incapable of identifying non-luminous built-up areas. The high-resolution NTL imagery derived from the Luojia 1-01 satellite, with low saturation and blooming effects, can be used to map urban areas at a finer scale. A new urban spectral index, named the Modified Normalized Urban Areas Composite Index (MNUACI), which improves upon the existing Normalized Urban Areas Composite Index (NUACI), was proposed in this study. It integrates the Human Settlement Index (HSI) generated from Luojia 1-01 NTL data, the Normalized Difference Vegetation Index (NDVI) from Landsat 8 imagery, and the Modified Normalized Difference Water Index (MNDWI). Our results indicated that MNUACI improved the spatial variability and differentiation of urban components by eliminating the NTL blooming effect and increasing the variation of the nighttime luminosity. Compared to urban area classification from Landsat 8 data, the MNUACI yielded better accuracy than NTL, NUACI, HSI, and the EVI-Adjusted NTL Index (EANTLI) alone. Furthermore, the quadratic polynomial regression analysis showed that the model based on MNUACI had the best R<sup>2</sup> and Root-Mean-Square Error (RMSE) compared with NTL, NUACI, HSI, and EANTLI in terms of the estimation of impervious surface area. It is concluded that MNUACI could improve the identification of urban areas and non-luminous built-up areas with better accuracy.

**Keywords:** nighttime light; Luojia 1-01; MNUACI; urban area; urban remote sensing

#### **1. Introduction**

Urban areas are the supporting systems of urban population, built-up areas, transportation, and commercial corporations, as well as where urbanization takes place [1]. Urbanization and urban sprawl have an important influence on the urban ecological environment, climate, public health, and socioeconomic development through the transformation of land use/cover types [2–6]. Urban expansion brings a series of urban problems, such as underground water pollution, traffic congestion, increased carbon emissions, the urban heat island effect, etc. Furthermore, urban nighttime light (NTL) causes disturbance to the human circadian rhythm and sleep disorders [7]. These urbanization issues increase the burden on the urban ecological system and impact the sustainable development of cities, especially in developing countries like China. Due to China's opening and economic development strategies, the urbanization rate of China's permanent population has increased from 18.96% in 1979 to 60.60% in 2019 [8], as shown in Figure 1. This approximately exponential urban population growth has greatly promoted the accelerated expansion of China's cities. Faced with problems caused by the rapid expansion of China's cities, it is essential to design relevant analytical methods to solve these regular mapping problems of urban sprawl.

**Citation:** Li, F.; Liu, X.; Liao, S.; Jia, P. The Modified Normalized Urban Area Composite Index: A Satelliate-Derived High-Resolution Index for Extracting Urban Areas. *Remote Sens.* **2021**, *13*, 2350. https://doi.org/10.3390/rs13122350

Academic Editor: Nataliya Rybnikova

Received: 20 May 2021 Accepted: 12 June 2021 Published: 16 June 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

**Figure 1.** China's urbanization rate from 1979 to 2019.

The NTL data have been most widely used to extract urban areas at regional and global scales, such as the Defense Meteorological Satellite Program/Operational Linescan System (DMSP/OLS) and the Suomi National Polar-orbiting Partnership satellite with Visible Infrared Imaging Radiometer Suite (NPP-VIIRS) data [9–12]. The main approaches for identification of urban areas through NTL data include edge-detection, supervised classification, and threshold-based segmentation. Assuming the existence of abrupt changes of NTL in urban and suburban transition zones, Tan used a light intensity gradient to delineate the boundaries of urban areas [13]. Xue et al. adopted an edge detection method to acquire urban boundaries based on the Vegetation Adjusted NTL Urban Index (VANUI) [14]. In supervised classification methods, many studies have proved that the support vector machine (SVM) method could provide high-precision classification results. Cao et al. developed an SVM-based region-growing algorithm to distinguish urban areas from non-urban background [15]. Jing et al. proved that results obtained by k-nearest-neighbors, SVM, and the random forests classification algorithm could achieve a better agreement for the purpose of urban area detection [16]. In addition, simple thresholding methods were also usually adopted to extract urban area extent [17,18].

The integration of multi-source remote sensing data and NTL data can mitigate the blooming effect (i.e., adjacent pixels of pervious surface are usually counted as impervious surface due to similar NTL values) of NTL and improve its performance. Integrating urban NTL with vegetation index and land surface temperature (LST) provides promising approaches to differentiate urban areas from non-urban areas. For example, Lu et al. developed a Human Settlement Index (HSI) with DMSP-OLS and the normalized difference vegetation index (NDVI) data to map urban settlements [19]. Zhang et al. proposed a simple and effective VANUI that could reduce the effects of NTL saturation and overcome the overcorrection issue of HSI [20]. Zhuo et al. proposed an Enhanced Vegetation Index (EVI) Adjusted NTL Index (EANTLI) to lessen the saturation problem of NTL data [21]. Liu et al. demonstrated that an LST and EVI Regulated NTL City Index (LERNCI) was more effective in delineating the urban spatial structures than VANUI [22]. However, when land cover types include water body and bare land, it is not enough to rely solely on vegetation index or land surface temperature to separate urban area from non-urban area. Therefore, Liu et al. established a Normalized Urban Areas Composite Index (NUACI) by combining the Normalized Difference Water Index (NDWI), NTL, and the EVI to estimate the urban impervious surface [23]. NUACI could degrade oversaturation using water body and vegetation indexes, but it might still ignore certain urban areas in low-luminous areas.

A Modified NDWI (MNDWI) was found to be more suitable for water feature recognition than the NDWI, as it can better suppress non-water land noise and better enhance water features [24]. Thus, this study aimed to develop a Modified NUACI (MNUACI) on the basis of the MNDWI, NDVI, and NTL, to further improve the identification of urban areas. Moreover, the MNUACI has a higher spatial resolution than the previous DMSP/OLS- and NPP-VIIRS-based indexes because it uses NTL data from the Luojia 1-01 NTL satellite, designed and developed by Wuhan University in China, which has provided nighttime imagery with a finer resolution of 130 m since 2018. The MNUACI would be useful for a wide array of urban studies, such as urban population health [25], urban spatial structure [26], and energy carbon emissions [27], where such an index has been urgently demanded.

#### **2. Study Sites and Data Sources**

#### *2.1. Study Sites*

As shown in Figure 2, four capital cities in China from north to south, Beijing, Nanjing, Guangzhou and Haikou, were chosen as study sites. Beijing, the political, science and technology innovation and cultural center of China, is surrounded by mountains in the west, north and northeast, and the North China Plain in the southeast [28]. It has sixteen municipal districts with a total area of 16,410 km<sup>2</sup>, a resident population of 21.54 million and a GDP of 3032 billion CNY in 2018 [29]. With six downtown areas as the center and Tongzhou District as the sub-center, Beijing is expanding outward along east-west and north-south axes, and the urban secondary industry and a large labor force are beginning to migrate to the surrounding suburbs. Nanjing, a regional transportation hub, is in the lower reaches of the Yangtze River and is the only megacity in the Yangtze River Delta and East China. It includes twelve administrative districts, with a total area of 6587 km<sup>2</sup>, a resident population of 8.44 million and a GDP of 1282 billion CNY in 2018 [30]. New urban areas and towns were built along the two banks of the Yangtze River, and the central city expanded northward and southward to the suburbs, where the chemical industry was centrally located. Guangzhou, located at the northern edge of the Pearl River Delta, is an important transportation and logistics hub in South China. It consists of eleven administrative districts, with a resident population of 14.9 million and a GDP of 2286 billion CNY in 2018, covering a total area of 7434 km<sup>2</sup> [31,32]. With the metropolitan area as the city center, Guangzhou built two new districts in the south and east regions, and three sub-centers in the north and east regions. The international tourist city of Haikou borders the Qiongzhou Strait in the north and serves as the core city of the China Free Trade Zone. Haikou covers a land area of 2290 km<sup>2</sup> and a sea area of 861 km<sup>2</sup>, with a population of 2.3 million and a GDP of 151.1 billion CNY in 2018 [33]. Haikou has built the east and west coast tourism belt and the north-south tourism axis of the Nandu River urban water system.

The traditional economic growth points are mostly located in the central areas of Chinese cities. Based on a strong economic foundation, the latest urban planning and layouts of the above four cities guide and drive the flow of the large labor force to the new industrial agglomeration areas in the suburbs, thus promoting the continuous urban expansion of these cities. In addition, these four cities have experienced tremendous socioeconomic development over the past 40 years, representing the natural and socioeconomic development levels of different cities in China. Therefore, they are suitable for studying the dynamics of urbanization and its impact on urban ecosystems based on the extracted extent of urban areas.

**Figure 2.** Geolocation map of study sites (Beijing, Nanjing, Guangzhou, Haikou).

#### *2.2. Data Sources*

The cloud-free Luojia 1-01 NTL images of Beijing, Nanjing, Guangzhou and Haikou, which are dated 6 September 2018, 15 July 2018, 4 September 2018 and 5 September 2018, respectively, as illustrated in Figure 3 and Table 1, were downloaded for free from the Hubei Data and Application Center of High-Resolution Earth Observation System website (http://59.175.109.173:8888, accessed on 1 June 2021). The Luojia 1-01 NTL satellite, designed and developed by Wuhan University in China, has provided nighttime imagery with a finer resolution of 130 m since 2018. This satellite sensor records with 14-bit radiometric resolution and has improved on-board calibration functions, and it demonstrates finer spatial detail and urban spatial structure than DMSP/OLS and NPP-VIIRS data [34,35]. Landsat 8 Operational Land Imager (OLI) images with minimum cloud cover for Beijing, Nanjing, Guangzhou and Haikou on 23 October 2017, 6 June 2018, 23 October 2017, and 17 May 2018 were obtained from the United States Geological Survey (USGS) website (https://glovis.usgs.gov, accessed on 1 May 2021), as illustrated in Table 1. Except for the coastal/aerosol band and the cirrus band, OLI inherits the seven bands of the Thematic Mapper (TM) and Enhanced Thematic Mapper Plus (ETM+) sensors, improving image measurement abilities and offering compatibility with the historical Landsat images. The Landsat 8 multi-spectral imagery ranging from Band 2 to Band 7 (Blue, Green, Red, NIR, SWIR1, SWIR2, respectively) was used for the vegetation index calculation, water body index calculation and image classification.

**Figure 3.** Nighttime light images of Luojia 1-01 in (**a**) Guangzhou, (**b**) Nanjing, (**c**) Beijing, and (**d**) Haikou.



#### **3. Methodology**

Figure 4 demonstrates the methodological framework, which involves four main steps. First, basic preprocessing such as geometric rectification, reprojection, and atmospheric correction was executed for the Luojia 1-01 NTL and Landsat 8 OLI raw data. Second, the MNUACI model was developed by integrating NTL, HSI, NDVI and MNDWI data. Third, four classic classification methods were applied to NTL, HSI, NDVI and MNDWI data for mapping urban areas of study sites. Lastly, the reference urban mapping results based on Landsat 8 data were used to evaluate the accuracy of the MNUACI model.

**Figure 4.** Flowchart of methodology for the MNUACI model.

#### *3.1. Data Preprocessing*

The positioning accuracy of Luojia 1-01 NTL imagery was reported as approximately 800 m after the on-board geometric calibration [36], which is far from meeting practical positioning requirements. For each Luojia 1-01 image, at least forty ground control points (GCPs) were selected at road intersections on Landsat 8 images, and a geometric correction was carried out on the Luojia 1-01 images through an affine transformation. The final correction error of each image was kept within half a Luojia 1-01 image pixel, namely 65 m. Landsat 8 OLI records not only the radiation reflected and emitted from the earth's surface, but also the radiation scattered or emitted by the atmosphere. To quantify the real reflectance of the earth's surface, the Radiometric Calibration Tool and the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) module in the ENVI software were used to convert the DNs of the raw images into surface reflectance values. The purpose of atmospheric correction is to eliminate the influence of the atmosphere and solar illumination and to obtain the correct surface reflectance. Atmospherically corrected images therefore support more reliable data analysis. To integrate them better with the Luojia 1-01 nighttime images, the atmospherically corrected Landsat 8 images were resampled to the same 130 m resolution.
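The GCP-based affine correction described above can be sketched as a least-squares fit. This is only an illustration under stated assumptions: the study's actual software workflow is not described here, the GCP coordinates below are synthetic, and the function names `fit_affine`, `apply_affine`, and `gcp_rmse` are hypothetical.

```python
import numpy as np

def fit_affine(src_xy, dst_xy):
    """Least-squares affine transform from image (src) to reference (dst) coordinates."""
    src = np.asarray(src_xy, dtype=float)
    dst = np.asarray(dst_xy, dtype=float)
    # Design matrix [x, y, 1]; the solution is a 3 x 2 coefficient matrix.
    design = np.column_stack([src, np.ones(len(src))])
    coeffs, *_ = np.linalg.lstsq(design, dst, rcond=None)
    return coeffs

def apply_affine(coeffs, xy):
    """Map points through the fitted affine transform."""
    xy = np.asarray(xy, dtype=float)
    return np.column_stack([xy, np.ones(len(xy))]) @ coeffs

def gcp_rmse(coeffs, src_xy, dst_xy):
    """Root-mean-square residual over the GCPs, in reference units (e.g., metres)."""
    resid = apply_affine(coeffs, src_xy) - np.asarray(dst_xy, dtype=float)
    return float(np.sqrt(np.mean(np.sum(resid ** 2, axis=1))))

# Synthetic GCPs (assumption): pixel coordinates and their map coordinates
# under a known transform with 130 m pixels.
src = np.array([[0, 0], [100, 0], [0, 100], [100, 100], [50, 20]], dtype=float)
dst = np.column_stack([130.0 * src[:, 0] + 500.0, -130.0 * src[:, 1] + 9000.0])
coeffs = fit_affine(src, dst)
error_m = gcp_rmse(coeffs, src, dst)
within_half_pixel = error_m <= 65.0  # acceptance criterion stated in the text
```

With real GCPs the residual would not be zero, and the half-pixel check is where the 65 m tolerance from the text would be applied.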

#### *3.2. Modified Normalized Urban Area Composite Index (MNUACI)*

Water and vegetation index can effectively differentiate water body and vegetation types from urban areas [37]. Based on the characteristics of the two indexes, the NUACI, established by integrating water index, vegetation index and NTL data, can be used to recognize urban areas, as shown in the following equation:

$$NUACI = \begin{cases} \ 0, \ d > r \\ \ (1 - d/r) \times NTL\_{norm}, \ d \le r \end{cases}, \quad d = \sqrt{\left(NDWI - \overline{a}\_{NDWI}\right)^2 + \left(EVI - \overline{b}\_{EVI}\right)^2} \tag{1}$$

where *ā<sub>NDWI</sub>* and *b̄<sub>EVI</sub>* indicate the average NDWI and EVI from the urban impervious surface, respectively; *d* and *r* denote the distance and maximum radius of the circle from the urban core, respectively; and *NTL<sub>norm</sub>* represents the normalized NTL, which is expressed with the following equation:

$$NTL\_{\text{norm}} = \frac{NTL - NTL\_{\text{min}}}{NTL\_{\text{max}} - NTL\_{\text{min}}} \tag{2}$$

where *NTL<sub>min</sub>* and *NTL<sub>max</sub>* are the minimum and maximum NTL DN values, respectively.
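As a hedged illustration of Equations (1) and (2), the following sketch computes the normalized NTL and NUACI for a few pixels. It assumes the distance *d* is measured in (NDWI, EVI) space from the urban-sample averages, as the definitions above indicate; the function names and the sample values are hypothetical.

```python
import numpy as np

def min_max_normalize(x):
    """Equation (2): rescale values to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.nanmin(x), np.nanmax(x)
    if hi == lo:
        # Constant input: no contrast to normalize (guard added for this sketch).
        return np.zeros_like(x)
    return (x - lo) / (hi - lo)

def nuaci(ndwi, evi, ntl, a_ndwi, b_evi, r):
    """Equation (1): normalized NTL damped by the distance d in (NDWI, EVI) space."""
    ndwi = np.asarray(ndwi, dtype=float)
    evi = np.asarray(evi, dtype=float)
    d = np.sqrt((ndwi - a_ndwi) ** 2 + (evi - b_evi) ** 2)
    ntl_norm = min_max_normalize(ntl)
    return np.where(d > r, 0.0, (1.0 - d / r) * ntl_norm)

# Three illustrative pixels: dark urban-like, bright urban-like, bright vegetated.
values = nuaci(ndwi=[0.0, 0.0, 0.5], evi=[0.2, 0.2, 0.9], ntl=[0.0, 100.0, 50.0],
               a_ndwi=0.0, b_evi=0.2, r=0.4)
```

The first pixel shows the limitation discussed in the text: a pixel with urban-like spectral values but no light still receives a NUACI of zero.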

Equation (1) reveals that the integration of NDWI and EVI can eliminate the blooming effect of NTL when *d* is greater than *r*, while the saturation effect of NTL can be mitigated when *d* is less than *r*. As illustrated in Figure 5, in the absence of NTL, NUACI is unable to detect impervious surfaces, such as buildings and roads in urban areas. To make up for this drawback of NUACI, HSI was introduced to reinforce the NTL effect of urban impervious surfaces. In addition, compared with NDWI, MNDWI is more suitable for identifying water bodies against an urban background because it suppresses noise from bare soil and built-up areas [38]. Considering the advantages of HSI and MNDWI, a modified urban index, MNUACI, is constructed by integrating them, as expressed in the following equation:

$$MNUACI = \begin{cases} \ 0, \ d > r \\ \ (1 - d/r) \times HSI, \ d \le r \end{cases}, \quad d = \sqrt{\left(MNDWI - \overline{a}\_{MNDWI}\right)^2 + \left(NDVI - \overline{b}\_{NDVI}\right)^2} \tag{3}$$

where *ā<sub>MNDWI</sub>* and *b̄<sub>NDVI</sub>* indicate the average MNDWI and NDVI from urban impervious surfaces, respectively. HSI can be expressed with the following equation:

$$HSI = \frac{(1 - NDVI\_{\text{norm}}) + NTL\_{\text{norm}}}{1 - NTL\_{\text{norm}} + NDVI\_{\text{norm}} + NDVI\_{\text{norm}} \times NTL\_{\text{norm}}} \tag{4}$$

where *NDVI<sub>norm</sub>* represents the normalized NDVI, which is normalized in the same way as *NTL<sub>norm</sub>*.

The equation of MNDWI is expressed as follows:

$$MNDWI = \frac{G - MIR}{G + MIR} \tag{5}$$

where *G* and *MIR* denote the green band and mid-infrared band of Landsat 8 OLI images, respectively.
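Equations (3)–(5) can be combined in a short sketch. It is not the authors' implementation: the function names are hypothetical, the inputs are illustrative, and the small `eps` term is an assumption added here only to avoid division by zero in Equation (4) when the normalized NTL is 1 and the normalized NDVI is 0.

```python
import numpy as np

def normalize(x):
    """Min-max normalization to [0, 1], as used for NTL and NDVI."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo) if hi > lo else np.zeros_like(x)

def mndwi(green, mir):
    """Equation (5). Assumes G + MIR is nonzero for every pixel."""
    green = np.asarray(green, dtype=float)
    mir = np.asarray(mir, dtype=float)
    return (green - mir) / (green + mir)

def hsi(ndvi, ntl, eps=1e-6):
    """Equation (4); eps is a numerical safeguard that is not part of the source formula."""
    ndvi_n, ntl_n = normalize(ndvi), normalize(ntl)
    return ((1.0 - ndvi_n) + ntl_n) / (1.0 - ntl_n + ndvi_n + ndvi_n * ntl_n + eps)

def mnuaci(mndwi_img, ndvi, ntl, a_mndwi, b_ndvi, r):
    """Equation (3): HSI damped by the distance d in (MNDWI, NDVI) space."""
    mndwi_img = np.asarray(mndwi_img, dtype=float)
    ndvi = np.asarray(ndvi, dtype=float)
    d = np.sqrt((mndwi_img - a_mndwi) ** 2 + (ndvi - b_ndvi) ** 2)
    return np.where(d > r, 0.0, (1.0 - d / r) * hsi(ndvi, ntl))

# Three illustrative pixels with increasing NDVI and NTL.
ndvi = np.array([0.0, 0.5, 1.0])
ntl = np.array([0.0, 5.0, 10.0])
hsi_values = hsi(ndvi, ntl)
index = mnuaci([-0.2, -0.2, 0.6], ndvi, ntl, a_mndwi=-0.2, b_ndvi=0.5, r=0.5)
```

Because HSI stays positive even where the normalized NTL is zero (the first pixel has an HSI of about 1), replacing the NTL term with HSI is what lets the index respond to unlit impervious surfaces.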

**Figure 5.** Urban areas in the Temple of Heaven Park in Beijing, extracted from (**a**) Bands 5, 4, 2 composite image of Landsat 8, (**b**) normalized Night-Time Light (NTL), (**c**) Enhanced Vegetation Index Adjusted Nighttime Light Index (EANTLI), (**d**) Human Settlement Index (HSI), (**e**) Normalized Urban Areas Composite Index (NUACI), and (**f**) Modified Normalized Urban Areas Composite Index (MNUACI).

#### *3.3. Accuracy Analysis Methods*

A confusion (error) matrix is an effective quantitative method of characterizing the accuracy of land use/land cover types in image classification results. Commission Errors (CE) are mistakes where results are erroneously included when they should have been excluded. Omission Errors (OE) are mistakes where results are erroneously excluded when they should have been included. Overall Accuracy (OA) is the percentage of all reference data that is correctly classified. The Kappa Coefficient (KC) is a statistical measure of inter-rater or intra-rater reliability for qualitative (categorical) items [39–41]. The Jaccard Similarity Index (JSI; abbreviated JSC in Tables 2–5) is a statistic used for gauging the similarity and diversity of sample sets [42].

By calculating the CE, OE, OA, KC and JSI of the reference data and the user classification data, the consistency between both datasets can be evaluated. The detailed equations are as follows:

$$OE = 1 - \frac{n\_{tt}}{n\_{rt}}\tag{6}$$

$$CE = 1 - \frac{n\_{tt}}{n\_{ut}}\tag{7}$$

$$OA = \frac{\sum n\_{tt}}{N} \tag{8}$$

$$\text{KC} = \frac{N\sum\_{t=1}^{c} n\_{tt} - \sum\_{t=1}^{c} n\_{rt}n\_{ut}}{N^2 - \sum\_{t=1}^{c} n\_{rt}n\_{ut}} \tag{9}$$

$$JSI = \frac{|U \cap R|}{|U| + |R| - |U \cap R|} \tag{10}$$

where *n<sub>tt</sub>* refers to the number of pixels correctly classified as type *t*; *n<sub>ut</sub>* refers to the number of pixels of type *t* in the user classification data; *n<sub>rt</sub>* refers to the number of pixels of type *t* in the reference data; *c* refers to the number of types; *N* refers to the total number of pixels in all types; and *U* and *R* refer to the user classification dataset and the reference dataset, respectively.
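The five measures can be computed from two label arrays as in the sketch below. It uses the standard definitions of Cohen's kappa and the Jaccard index, treats one class (here `target=1`, assumed to mean "urban") as the class of interest for OE, CE and JSI, and assumes that class occurs in both datasets. The function name `accuracy_metrics` and the example labels are hypothetical.

```python
import numpy as np

def accuracy_metrics(reference, predicted, target=1):
    """OE, CE, OA, KC and JSI for a reference map and a user classification."""
    ref = np.asarray(reference).ravel()
    pred = np.asarray(predicted).ravel()
    classes = np.unique(np.concatenate([ref, pred]))
    n = ref.size

    # Per-class counts: correct (n_tt), reference total (n_rt), user total (n_ut).
    n_tt = np.array([np.sum((ref == c) & (pred == c)) for c in classes], dtype=float)
    n_rt = np.array([np.sum(ref == c) for c in classes], dtype=float)
    n_ut = np.array([np.sum(pred == c) for c in classes], dtype=float)

    t = int(np.where(classes == target)[0][0])
    chance = np.sum(n_rt * n_ut)
    return {
        "OE": 1.0 - n_tt[t] / n_rt[t],
        "CE": 1.0 - n_tt[t] / n_ut[t],
        "OA": n_tt.sum() / n,
        "KC": (n * n_tt.sum() - chance) / (n ** 2 - chance),
        # |U ∩ R| / |U ∪ R| for the target class.
        "JSI": n_tt[t] / (n_rt[t] + n_ut[t] - n_tt[t]),
    }

# Ten pixels: 1 = urban, 0 = non-urban (illustrative labels only).
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = accuracy_metrics(reference, predicted)
```

For these labels, three of the four reference urban pixels are recovered and one non-urban pixel is wrongly labeled urban, so OE and CE are both 0.25, OA is 0.8, and JSI is 3/5 = 0.6.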

#### *3.4. Estimation of Urban Impermeable Surface*

The Impervious Surface Area (ISA) is considered to be an important indicator of the degree of urbanization. Previous studies have confirmed a positive correlation between ISA and urban NTL data [43,44]; therefore, ISA can be used as an evaluation indicator for the extraction results of urban areas. Using the blue and near-infrared bands of Landsat 8 images, the Perpendicular Impervious Surface Index (PISI) was derived and used to represent the ISA [45]. The extraction accuracy for the ISA based on PISI ranged from 89.51% to 96.50% in the four Chinese cities, demonstrating better separability between ISA and bare soil. The ISA can be derived by the following equation:

$$ISA = 0.8192 \rho\_{Blue} - 0.5735 \rho\_{NIR} + 0.0750 \tag{11}$$

where *ρ<sub>Blue</sub>* and *ρ<sub>NIR</sub>* denote the reflectance of the blue band and near-infrared band of a Landsat 8 image, respectively.
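Equation (11) and the quadratic polynomial regression mentioned in the abstract can be sketched as follows. The regression set-up is an assumption: the text does not specify how pixels were sampled or which software was used, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def pisi(blue, nir):
    """Equation (11): PISI-based ISA from blue and near-infrared reflectance."""
    blue = np.asarray(blue, dtype=float)
    nir = np.asarray(nir, dtype=float)
    return 0.8192 * blue - 0.5735 * nir + 0.0750

def quadratic_fit(index_values, isa_values):
    """Fit ISA = c2*x^2 + c1*x + c0 and report R^2 and RMSE of the fit."""
    x = np.asarray(index_values, dtype=float).ravel()
    y = np.asarray(isa_values, dtype=float).ravel()
    coeffs = np.polyfit(x, y, deg=2)
    resid = y - np.polyval(coeffs, x)
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    rmse = float(np.sqrt(np.mean(resid ** 2)))
    return coeffs, r2, rmse

# Synthetic check: data generated from an exact quadratic relation.
x = np.linspace(0.0, 1.0, 11)
y = 2.0 * x ** 2 + 3.0 * x + 1.0
coeffs, r2, rmse = quadratic_fit(x, y)
isa_example = pisi(0.2, 0.1)
```

In practice `index_values` would hold one of the NTL indexes (for example MNUACI) and `isa_values` the PISI-derived ISA for the same pixels, so that R<sup>2</sup> and RMSE can be compared across indexes.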

#### **4. Results**

#### *4.1. Urban Area Extraction by the MNUACI*

Landsat 8 multi-spectral reflectance data from the four capital cities in China were adopted to calculate MNDWI and NDVI. The MNUACI was then derived by integrating MNDWI, NDVI and Luojia 1-01 NTL. Before performing the calculation for MNUACI, the parameters *ā<sub>MNDWI</sub>* and *b̄<sub>NDVI</sub>* in Equation (3) were determined from samples collected from the urban cores. The parameter *r* was calculated as the farthest distance between the sample points (*MNDWI*, *NDVI*) and their mean point (*ā<sub>MNDWI</sub>*, *b̄<sub>NDVI</sub>*). The parameters (*ā<sub>MNDWI</sub>*, *b̄<sub>NDVI</sub>*, *r*) for Beijing, Nanjing, Guangzhou and Haikou are (0.73, 0.54, 0.35), (0.37, 0.55, 0.51), (0.41, 0.46, 0.32) and (0.36, 0.36, 0.49), respectively.
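One plausible reading of this parameter estimation is sketched below: the two averages are the means of the urban-core samples, and *r* is the largest distance from any sample to that mean point in (MNDWI, NDVI) space. This interpretation of *r*, the function name `urban_core_parameters`, and the sample values are assumptions for illustration only.

```python
import numpy as np

def urban_core_parameters(mndwi_samples, ndvi_samples):
    """Return (mean MNDWI, mean NDVI, r) from urban-core sample pixels.

    Assumption: r is the maximum sample-to-mean distance in (MNDWI, NDVI) space.
    """
    pts = np.column_stack([
        np.asarray(mndwi_samples, dtype=float),
        np.asarray(ndvi_samples, dtype=float),
    ])
    centre = pts.mean(axis=0)
    r = float(np.max(np.linalg.norm(pts - centre, axis=1)))
    return float(centre[0]), float(centre[1]), r

# Two illustrative samples that differ only in MNDWI.
a_mndwi, b_ndvi, r = urban_core_parameters([0.3, 0.5], [0.4, 0.4])
```

The three returned values play the roles of the per-city triplets reported in the text.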

MNUACI is used to distinguish between light-intensity differences in urban core areas, and therefore to improve pixel resolution in light-saturated areas and allow the recognition of urban core structures. The Temple of Heaven Park in Beijing and Hongcheng Lake in Haikou were selected to evaluate the effectiveness of MNUACI. As illustrated in Figure 5, the urban areas (cyan) extracted from Landsat 8 were regarded as reference data, to which NTL, EANTLI, HSI, NUACI, and MNUACI were compared. It can be seen that NTL and EANTLI have similar results: neither buildings nor roads are recognized in the middle of the park due to the lack of nighttime luminosity. Although NUACI shows larger fractions of urban areas, only a small amount of impervious surface in the park can be recognized, due to the lack of sufficient luminosity. HSI and MNUACI identified the detailed structures of impervious surfaces, but MNUACI extracted impervious surfaces more accurately. As illustrated in Figure 6, NTL and EANTLI mistakenly identify most urban areas as pervious surfaces, which increases omission errors. Although HSI identifies more urban areas, it mistakenly recognizes the lake as an urban area, resulting in many commission errors. The recognition results of urban areas from NUACI and MNUACI are similar, both showing the detailed urban structure. However, MNUACI exhibits higher accuracy in urban area extraction because the use of MNDWI reduces the impact of water bodies on urban areas.

**Figure 6.** Urban areas in the Hongcheng Lake in Haikou, extracted from (**a**) Bands 5, 4, 2 composite image of Landsat 8, (**b**) normalized Night-Time Light (NTL), (**c**) Enhanced Vegetation Index Adjusted Nighttime Light Index (EANTLI), (**d**) Human Settlement Index (HSI), (**e**) Normalized Urban Areas Composite Index (NUACI), and (**f**) Modified Normalized Urban Areas Composite Index (MNUACI).

Taking Nanjing as an example, Figure 7 illustrates a latitudinal transect of NTL, NUACI and MNUACI. These three types of curve variation are similar, but DN values of MNUACI and NUACI in urban areas are distinctly higher than those of NTL, which suggests that MNUACI and NUACI can enhance the NTL effect in urban areas. For urban areas, MNUACI has higher peaks and lower valleys than NUACI, which reflects more characteristics of inner-urban variability and spatial differentiation. This suggests an easier process of urban area extraction using MNUACI. For peri-urban areas, NUACI and NTL present similar low DN values, making it difficult to identify small towns with them. In contrast, DN values are higher when MNUACI extracts urban areas in suburban regions. In addition, NTL cannot eliminate blooming effects due to a small quantity of luminosity values occurring in water and vegetation areas, while MNUACI and NUACI solve these blooming problems by introducing vegetation and water indexes.

**Figure 7.** Night-Time Light (NTL), Normalized Urban Areas Composite Index (NUACI), and Modified Normalized Urban Areas Composite Index (MNUACI) along a longitudinal transection in Nanjing.

#### *4.2. Performance Assessment of the MNUACI*

In terms of urban area recognition, a combination of NTL and auxiliary data is better than the use of NTL alone. Different extraction methods for urban areas demonstrate different performances on the same composited NTL index [46]. To assess the feasibility and effectiveness of MNUACI, several supervised and unsupervised classification approaches were separately applied to identify urban areas on NTL, EANTLI, HSI, NUACI and MNUACI images. Because the optimal thresholding method is time-consuming and laborious, the genetic algorithm (GA) was used instead to automatically determine the image segmentation threshold for the extraction of urban areas [47]. Deep learning (DL), GA, fuzzy C-means (FCM) and SVM methods were used to extract urban areas from NTL, EANTLI, HSI, NUACI and MNUACI images of Beijing, Nanjing, Guangzhou and Haikou. Moreover, urban area references of the four sample cities were obtained from Landsat 8 images using the maximum likelihood classifier (MLC) method. Corresponding urban areas for each city were derived from NTL, EANTLI, HSI, NUACI and MNUACI images with the DL, GA, FCM and SVM approaches. The point-to-point comparison method was applied to the test images and reference images. Then, precision indicators such as the Kappa coefficient, overall accuracy and the Jaccard similarity index were calculated to analyze the performance of the combinations of different indexes and approaches.
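How a genetic algorithm can search for a segmentation threshold is sketched below. The configuration of the GA used in the study [47] is not described in the text, so everything here is an assumption: the fitness function is Otsu's between-class variance, selection keeps the fitter half of the population, crossover averages two parents, and mutation adds Gaussian noise. The names `between_class_variance` and `ga_threshold` are hypothetical.

```python
import numpy as np

def between_class_variance(values, t):
    """Otsu-style fitness: separation between pixels at or below t and above t."""
    low, high = values[values <= t], values[values > t]
    if low.size == 0 or high.size == 0:
        return 0.0
    w0, w1 = low.size / values.size, high.size / values.size
    return w0 * w1 * (low.mean() - high.mean()) ** 2

def ga_threshold(values, pop_size=20, generations=30, mutation_sd=0.05, seed=0):
    """Search for a threshold that maximizes the fitness with a simple GA."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float).ravel()
    lo, hi = values.min(), values.max()
    pop = rng.uniform(lo, hi, pop_size)
    for _ in range(generations):
        fitness = np.array([between_class_variance(values, t) for t in pop])
        order = np.argsort(fitness)[::-1]
        parents = pop[order[:pop_size // 2]]  # selection: keep the fitter half
        # Crossover: average two randomly chosen parents.
        pairs = rng.integers(0, parents.size, size=(pop_size - parents.size, 2))
        children = parents[pairs].mean(axis=1)
        # Mutation: Gaussian jitter scaled to the data range.
        children = children + rng.normal(0.0, mutation_sd * (hi - lo), children.size)
        pop = np.concatenate([parents, np.clip(children, lo, hi)])
    fitness = np.array([between_class_variance(values, t) for t in pop])
    return float(pop[np.argmax(fitness)])

# Bimodal toy index: a dark (non-urban) cluster and a bright (urban) cluster.
index_values = np.concatenate([np.linspace(0.0, 0.2, 50), np.linspace(0.8, 1.0, 50)])
threshold = ga_threshold(index_values)
urban_mask = index_values > threshold
```

For this toy input any threshold between the two clusters separates them perfectly, so the search settles in that gap; real index images are rarely this cleanly bimodal.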

The OA and KC of reference images from Beijing, Nanjing, Guangzhou and Haikou are (93.42%, 0.87), (95.88%, 0.92), (98.16%, 0.96) and (96.94%, 0.94), respectively, which suggests that the classification accuracies of Landsat 8 images from the four cities are reliable.

As illustrated in Tables 2–5, based on the OA, KC and JSC of the five NTL indexes, the order of accuracy of urban area classification in Beijing, Nanjing, Guangzhou and Haikou is, respectively: MNUACI > HSI > NUACI > NTL > EANTLI; MNUACI > NUACI > NTL > HSI > EANTLI; MNUACI > NUACI > HSI > NTL > EANTLI; and MNUACI > NUACI > HSI > NTL > EANTLI. MNUACI has a higher classification accuracy than the other four NTL indexes in all four capital cities under all four classification approaches. Except in Beijing, where NUACI ranks third, NUACI ranks second after MNUACI. Based on the OA, KC and JSC of the four classification methods applied to MNUACI, SVM demonstrates the highest classification accuracy in every city. For MNUACI, the accuracy order of the four urban area extraction approaches in Beijing, Nanjing, Guangzhou and Haikou is, respectively: SVM > GA > FCM > DL; SVM > DL > GA > FCM; SVM > GA > FCM > DL; and SVM > GA > FCM > DL. In three of the four cities, therefore, the SVM method is superior to the other methods, with the GA method second, the FCM method third, and the DL method last.

**Table 2.** Accuracy comparison among urban area extraction methods using different nighttime light indexes in Beijing.


OA: Overall Accuracy; KC: Kappa Coefficient; JSC: Jaccard Similarity Index.

**Table 3.** Accuracy comparison among urban area extraction methods using different nighttime light indexes in Nanjing.


OA: Overall Accuracy; KC: Kappa Coefficient; JSC: Jaccard Similarity Index.


**Table 4.** Accuracy comparison among urban area extraction methods using different nighttime light indexes in Guangzhou.

OA: Overall Accuracy; KC: Kappa Coefficient; JSC: Jaccard Similarity Index.

**Table 5.** Accuracy comparison among urban area extraction methods using different nighttime light indexes in Haikou.


OA: Overall Accuracy; KC: Kappa Coefficient; JSC: Jaccard Similarity Index.

After applying the SVM method, the spatial distributions of the correctly extracted urban areas, commission errors and omission errors in Beijing, Nanjing, Guangzhou and Haikou are displayed in Figure 8. EANTLI and NTL produce many omission errors in peri-urban areas that lack nighttime luminosity. This might be because the Luojia 1-01 satellite acquires its images at 2:00–3:00 a.m. local time. The primary errors of HSI in the extraction of urban areas are commission errors caused by a large number of water bodies. Although NUACI improves the accuracy of urban area extraction by integrating vegetation and water bodies, NUACI, like EANTLI and NTL, still has difficulty identifying unlit urban areas due to its reliance on NTL alone. In all cases, the MNUACI results show lower commission and omission errors than the results of NUACI, HSI, EANTLI and NTL. Moreover, the spatial distribution of the MNUACI results is also closer to that of the MLC results.

**Figure 8.** Accuracy comparison of urban area extraction using an SVM method on the basis of the Modified Normalized Urban Areas Composite Index (MNUACI), Normalized Urban Areas Composite Index (NUACI), Human Settlement Index (HSI), EVI-Adjusted NTL Index (EANTLI), and Night-Time Light (NTL) in (**a**) Beijing, (**b**) Nanjing, (**c**) Guangzhou, and (**d**) Haikou.

The extraction results of urban areas in the Tongzhou District of Beijing based on MNUACI, NUACI, HSI, EANTLI and NTL by the SVM method are shown in Figure 9, and a Landsat 8 false color composite image (Figure 9a) is used as a visual reference for urban areas. For the two central city areas, MNUACI and HSI show specific spatial distribution patterns and inner-urban differentiation. NUACI and EANTLI extracted non-vegetated and illuminated regions as urban areas, while NTL extracted only illuminated regions as urban areas. For the two town areas, NTL, EANTLI and NUACI identify merely the road areas within them and miss most of the town areas, especially in Town II, while MNUACI and HSI recognize more urban areas. For the bare land area, NTL, EANTLI and NUACI identify only a minor part of the bare land, while MNUACI and HSI recognize most of it. For construction sites, NTL identifies almost the whole sites as urban areas without any differentiation, while EANTLI, HSI, NUACI and MNUACI can extract urban areas correctly, among which MNUACI has the best extraction effect. For village areas, the urban areas identified by the five indexes are similar, but for Village I, NTL, EANTLI and NUACI lead to a large number of omission errors, while HSI and MNUACI generate the fewest omission errors. Moreover, NTL and EANTLI produce slight commission errors over the river area, and HSI even mistakenly identifies rivers as urban areas. In contrast, MNUACI and NUACI do not generate such errors.

**Figure 9.** Landsat 8 false color composite image (**a**) and urban extraction results using an SVM method on the basis of the Modified Normalized Urban Areas Composite Index (MNUACI) (**b**), Normalized Urban Areas Composite Index (NUACI) (**c**), Human Settlement Index (HSI) (**d**), EVI-Adjusted NTL Index (EANTLI) (**e**), and Night-Time Light (NTL) (**f**) in the Tongzhou District, Beijing.

#### *4.3. Correlation between MNUACI and Urban Impervious Surface*

Furthermore, one thousand sample points from an ISA image and corresponding MNUACI, NUACI, HSI, EANTLI and NTL images in each city were randomly selected by using the *Create Random Points* tool as well as the *Extract Multi Values to Points* tool of the ArcGIS software. The quadratic polynomial regression models were subsequently established with MNUACI, NUACI, HSI, EANTLI and NTL for estimation of ISA. Correlation coefficients and Root-Mean Square Error (RMSE) were employed together to evaluate the performance of the established regression models.
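A quadratic polynomial fit with its goodness-of-fit measures can be sketched as follows. This is only an illustration under stated assumptions: the sampled index values and ISA fractions are taken to be two one-dimensional arrays of equal length, *R*<sup>2</sup> is computed as the coefficient of determination, and the function name `fit_quadratic` is hypothetical. It does not reproduce the ArcGIS sampling step.

```python
import numpy as np

def fit_quadratic(index_values, isa_values):
    """Fit ISA = a*x^2 + b*x + c and return (coefficients, R^2, RMSE)."""
    x = np.asarray(index_values, dtype=float)
    y = np.asarray(isa_values, dtype=float)
    coeffs = np.polyfit(x, y, deg=2)          # ordered [a, b, c]
    residuals = y - np.polyval(coeffs, x)
    rmse = float(np.sqrt(np.mean(residuals**2)))
    ss_res = float(np.sum(residuals**2))      # unexplained variation
    ss_tot = float(np.sum((y - y.mean())**2)) # total variation
    r2 = 1.0 - ss_res / ss_tot
    return coeffs, r2, rmse
```

With one thousand sample points per city, calling `fit_quadratic` once per index would yield one (*R*<sup>2</sup>, RMSE) pair per index and city, which is the kind of summary a comparison across indexes requires.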

As shown in Table 6, the average *R*<sup>2</sup> and RMSE of MNUACI, NUACI, HSI, EANTLI and NTL in Beijing, Nanjing, Guangzhou and Haikou are (0.74, 0.13), (0.49, 0.18), (0.44, 0.19), (0.21, 0.22) and (0.24, 0.22), respectively. According to the correlation coefficients and RMSE of the quadratic polynomial regression models, EANTLI and NTL have similarly low fitting accuracy, and HSI performs better than these two indexes. Apart from Beijing, the regression results of NUACI in the other three cities are better than those of EANTLI and NTL. In contrast, MNUACI shows the highest correlation coefficients and the lowest RMSE in all four cities. This suggests that the MNUACI regression model could substantially decrease the blooming effect of Luojia 1-01 NTL and identify non-luminous ISA more accurately than the other models. As illustrated in Figure 10, the scatter plots indicate that the regression models between MNUACI and ISA in Beijing, Nanjing, and Guangzhou take the form of a quadratic polynomial, whereas the model for Haikou is closer to a linear regression model. The NTL of urban core areas in developed metropolitan cities, such as Beijing, can contribute to the saturated MNUACI value. The ISA corresponding to PISI might not be the highest, because the differentiation between the blue and the near-infrared band during the daytime is weakened in the urban core area. On the contrary, the NTL of urban core areas in developing cities, such as Haikou, might rarely generate a saturated MNUACI value, which can present a good linear correspondence to ISA derived from a multispectral image.


**Table 6.** Correlation coefficients and RMSE of regression models for estimating impervious surface areas.

**Figure 10.** The quadratic polynomial regression models established based on the Modified Normalized Urban Areas Composite Index (MNUACI) and impervious surface area (ISA) in (**a**) Beijing, (**b**) Nanjing, (**c**) Guangzhou, and (**d**) Haikou.

#### **5. Discussion**

In this study, the MNUACI was proposed to improve the capability of delineating spatial structures of inner-urban areas by combining vegetation coverage and a water body index with Luojia 1-01 NTL data. Four Chinese cities with different development levels were chosen to evaluate the performance of MNUACI. To some extent, MNUACI expressed the specific spatial distribution patterns and inner-urban differentiation of urban areas. It also tackled the problems of urban area extraction in areas with low or no luminosity.

#### *5.1. Comparison with Previous Indexes*

It is difficult to identify urban areas correctly from NTL alone. Through the introduction of a vegetation index, EANTLI, HSI and NUACI all improved the differentiation within the city core area, but HSI mistakenly identifies water bodies as urban areas, and EANTLI also brings about commission errors because of the NTL blooming effect. NUACI applies both water body and vegetation indexes, which further enhances inner-city variability and differentiation. However, NUACI is still subject to the NTL blooming effect, generating commission errors in the urban core area. For bare land and suburban towns, NTL, EANTLI and NUACI all generate numerous omission errors due to the lack of luminosity information. In contrast, by using smaller NDVI values (bare land, sand land and built-ups), HSI increases the recognition rate of bare land and urban areas, so that HSI and MNUACI can identify urban areas even under low-luminosity conditions. As a modified version of NDWI, MNDWI can effectively reduce the built-ups and shadows that are misclassified as urban water bodies, while HSI can strengthen the light index value in urban and suburban areas. By introducing the improved water index and HSI, MNUACI decreased the size of saturated urban areas and increased the spatial differentiation and variability of inner-urban ones. In addition, with the introduction of HSI, MNUACI significantly improves the identification of urban areas without NTL; it especially reduces the commission errors of urban areas in suburban areas. Adjusted by MNDWI and HSI, MNUACI can not only accurately express the spatial differentiation related to urban spatial structure, but can also increase spatial variation in NTL outside saturated urban areas more than NUACI.

The accuracy of urban area extraction was compared through four classification methods on four sample datasets under different natural and socioeconomic conditions. Overall accuracy, Kappa coefficient and Jaccard similarity index were introduced to assess the accuracy of MNUACI, NUACI, HSI, EANTLI and NTL in terms of urban area mapping. Although a vegetation index was added to the NTL data to reduce the blooming effect of NTL, the problem of omission errors caused by unlit or low-lit areas was not solved for high-resolution light data such as Luojia 1-01. Therefore, NTL and EANTLI had a lower extraction accuracy for urban areas. HSI improves the identification accuracy of urban areas in unlit or low-lit areas because HSI uses a smaller NDVI. However, HSI wrongly identifies water bodies as urban areas, resulting in large commission errors. The integration of the water and vegetation indexes allows NUACI to reduce NTL saturation and the blooming effect, but the problem of omission errors associated with unlit or low-lit data remains. Among the five indexes, the urban area results extracted by MNUACI exhibit the highest accuracy and robustness. This is because MNUACI dramatically reduces the omission errors caused by unlit areas, based on small NDVI values, and eliminates the blooming effect of NTL through the vegetation and water body indexes.

For MNUACI, the SVM method has the best performance, followed by the GA method. The FCM method ranks third, and the DL method shows the lowest urban classification accuracy. Due to the lack of extensive shape and texture information from urban objects, the DL method failed to achieve the desired high accuracy of urban area classification. Although the GA method can obtain a higher classification accuracy of urban areas by simulating the natural evolution process to determine the optimal segmentation threshold, the SVM supervised classification approach using training samples shows higher accuracy in terms of Kappa coefficient, overall accuracy and Jaccard similarity index. In addition, considering the correlation coefficient and RMSE of the regression models, the degree of correlation between the NTL indexes and ISA is ordered as follows: MNUACI > NUACI > HSI > EANTLI > NTL. The above results suggest that the MNUACI model is robust and reliable for the extraction of urban areas. Therefore, with the global coverage of Luojia 1-01 NTL and NDVI data, the approach we proposed can also be applied to the study of urban socio-economic and environmental issues in other countries and regions around the world.

#### *5.2. Limitations of the Method*

The proposed MNUACI model is shown to be effective and accurate in analyzing and identifying urban areas. However, several shortcomings of MNUACI remain that could be addressed in future studies. Firstly, though the estimation errors of MNUACI were the smallest among the five indexes, there still exist large omission errors in unlit and low-lit areas, especially in peri-urban areas. More effort should be made to improve the quality of the model and to integrate more ancillary data. For instance, POI (Point of Interest), land surface temperature and population data can compensate for the defects caused by unlit and low-lit areas [48–50]. Also, co-registration errors between Luojia 1-01 NTL images and Landsat 8 images would be transmitted to MNUACI through NDVI and MNDWI, resulting in some urban area misidentification. Once the positioning accuracy of Luojia 1-01 images is improved, the fusion of NTL and Landsat 8 images will improve the performance of MNUACI. Moreover, it is complicated and difficult to derive accurate parameters *d* and *r* from statistical sample data of urban areas. Inaccurate parameters may make it impossible to exclude the impact of vegetation and water. Finally, the performance of MNUACI was tested only using Luojia 1-01 NTL data, and its applicability and feasibility need to be further evaluated using other, lower-resolution NTL data, such as NPP-VIIRS and DMSP-OLS.

#### **6. Conclusions**

Accurate and timely information on the spatial extent and distribution of urban areas, particularly at the regional and global levels, is crucial for addressing environmental and ecological issues. NTL data are valuable for regional and global urban mapping and for the analysis of urban human activities. The Luojia 1-01 satellite usually captures NTL images before dawn, when urban lights may be turned off. As a result, NTL data might miss certain important urban characteristics. In this research, a new urban index, MNUACI, is proposed, combining information from Luojia 1-01 NTL data with the NDVI and MNDWI of Landsat 8 data for a more detailed characterization of inner-urban variations in nighttime luminosity. In comparison with NUACI, HSI, EANTLI and NTL, MNUACI was superior in the identification of inner-city forms. Then, the performance of the SVM, GA, FCM and DL methods for the extraction of urban areas was evaluated in four Chinese cities for the five urban indexes. MNUACI based on the SVM method exhibits the best performance in urban area extraction, which is attributed to the integration of HSI, NDVI and MNDWI information. Regression models based on each of the five NTL indexes were established to map ISA, using the urban fraction obtained from Landsat 8 images as the reference data. The validation results reveal the closest fit between MNUACI and the corresponding ISA: the average correlation coefficient and RMSE across the four cities are 0.74 and 0.13, respectively. In conclusion, by combining multisource remotely sensed data, MNUACI can mitigate NTL pixel saturation and eliminate blooming effects, and it provides a promising approach for the identification of urban areas by enhancing inner-urban spatial differentiation and variability.

**Author Contributions:** F.L. designed the MNUACI method for extracting urban areas and drafted the manuscript. X.L. carried out the processing of nighttime light data and land use classification using a maximum likelihood classifier based on Landsat 8 images. S.L. analyzed and assessed results of the MNUACI method. P.J. supervised the study and revised the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Natural Science Foundation of Hebei Province of China (Grant No. D2018512002), the Fundamental Research Funds for the Central Universities of China (Grant No. ZY20180101) and the Key Project of Science and Technology Research for Universities of Hebei Province (Grant No. ZD2020407).

**Data Availability Statement:** Direct request for these materials may be made to the provider as indicated in the Acknowledgments. The processed data presented in this study are available on request from the first author.

**Acknowledgments:** We thank the High Resolution Earth Observation System of Hubei Data Application Center for providing Luojia 1-01 satellite imagery, and the International Institute of Spatial Lifecourse Epidemiology (ISLE) for the research support.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


## *Article* **Intra-Urban Scaling Properties Examined by Automatically Extracted City Hotspots from Street Data and Nighttime Light Imagery**

**Ding Ma <sup>1,2</sup>, Renzhong Guo <sup>1</sup>, Ying Jing <sup>3</sup>, Ye Zheng <sup>1</sup>, Zhigang Zhao <sup>1,</sup>\* and Jiahao Yang <sup>1</sup>**


**Abstract:** A country can be well-comprehended through its core cities. Similarly, we can learn about a city from its hotspots, as they manifest the concentration of urban infrastructures and human activities. Following this philosophy, this paper studies the intra-urban form and function from a complexity science perspective by exploring the power law distribution of hotspot sizes and related socio-economic attributes. To detect hotspots, we rely on spatial clustering of geospatial big data sets, including street data from the OpenStreetMap platform and nighttime light (NTL) data from visible infrared imaging radiometer suite (VIIRS) imagery. Unlike conventional spatial units, which are imposed by governments or authorities (such as census blocks), the delineation of hotspots is done in a totally bottom-up manner and, more importantly, can help us examine precisely the scaling pattern of urban morphological and functional aspects. This results in two types of urban hotspots, street-based and NTL-based, being generated across 20 major cities in China. We find that Zipf's law of hotspot sizes (both types) holds remarkably well for each city, as do the city-size distributions at the country level, indicating a statistically self-similar structure of geographic space. We further find that the urban scaling law can be effectively detected when using NTL-based hotspots as basic units. Furthermore, the comparison between the two types of hotspots enables us to gain in-depth insights into urban planning and urban economic development.

**Keywords:** urban hotspot delineation; Zipf's law; intra-urban scaling; street nodes; VIIRS imagery

#### **1. Introduction**

As a result of urbanization or the continuous influx of people into cities, the number of worldwide urbanites is predicted to be 6.9 billion by 2050, accounting for 68% of the world's population [1]. The urbanization in China has been unprecedentedly rapid as well in the past few decades [2], reaching 60.6% nationally in 2019 [3]. Consequently, the grasp of city form and function—that is, how cities look and work—has become the key to our sustainable development. Given the circumstances, city-related research has attracted scientists from a variety of subjects and has, inevitably, become cross-disciplinary, including geography, economics, computer science, and physics, etc. To converge these disciplines, scholars have called for a new science of cities in the past few decades, in which they view cities as an organized complexity [4], for studying cities' fractal shapes, complex structures, and nonlinear dynamics (e.g., [5–11]).

One major aspect of urban complexity is its underlying scaling properties. The scaling pattern of urban entities can be categorized into two perspectives: The power law distribution of a single quantity, such as city sizes (Zipf's law [12]), building heights [13], street lengths [14], and leisure venue densities [15], and the power relationship between two quantities, such as populations versus innovations ([16,17]) or gross domestic product (GDP) versus street fractality ([18,19]). This study uses the terms scaling and power law interchangeably. Urban scaling is, to a great extent, a ubiquitous pattern across different measures. Moreover, the theory developed by Bettencourt et al. [16], which is behind the power relationship between urban populations and other socio-economic measures, has been formulated as a fundamental law about cities: the universal scaling law. However, recent studies have shown that the universal scaling law may not work as expected, as the scaling exponent is sensitive to different city boundaries or ineffective urban areas [20,21]. This controversy is likely to be bound up with the top-down methods of defining geographic units by governments and authorities, such as administrative city boundaries, census tracts, and some equally partitioned cells, which are essentially for management purposes and hardly consider the scaling pattern of urban morphological and functional entities.

**Citation:** Ma, D.; Guo, R.; Jing, Y.; Zheng, Y.; Zhao, Z.; Yang, J. Intra-Urban Scaling Properties Examined by Automatically Extracted City Hotspots from Street Data and Nighttime Light Imagery. *Remote Sens.* **2021**, *13*, 1322. https://doi.org/10.3390/rs13071322

Academic Editor: Nataliya Rybnikova

Received: 3 March 2021 Accepted: 27 March 2021 Published: 30 March 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The arrival of geospatial big data has triggered a new paradigm for urban analysis since geospatial big data, such as remote sensing (RS) images and location-based social media data, has the capacity to offer fine-grained, massive-scale geographic information [22]. For instance, nighttime lights (NTL) data, also referred to as RS of human beings and their activities [23], are globally downloadable and can manifest the development of urban and regional areas. OpenStreetMap (OSM), a pioneering volunteered geospatial information platform, provides street data across the globe for probably the first time in human history [24]. Both NTL and OSM data help researchers construct alternative modeling units for spatial analyses at both intercity and intracity levels, and remove the barriers of inter-regional incomparability. The most recent relevant studies concern so-called natural cities, that is, cities objectively defined from different types of urban elements in open data, such as building footprints, street nodes, and points of interest (e.g., [25–29]). However, most of these studies take the derived cities as a whole to understand the scaling structure over a region or country, and seldom develop a "local" understanding of such spatial configuration at the intracity level.

Thus, the present study attempts to investigate the intra-urban scaling properties through the lens of city hotspots. A city is formed by highly concentrated areas of human settlements or activities within a country extent [30]. Likewise, if we scale down our scope from a country to one of its cities, such concentrations can be regarded as urban hotspots. With the advance of geographic information system (GIS) technologies, urban hotspots can be delineated more precisely with the support of geospatial big data and bottom-up approaches. The study contributes to the current literature in three aspects. Firstly, we followed the ideas of previous city delineation methods to derive two types of urban hotspots across 20 Chinese cities: street-based and NTL-based hotspots, obtained respectively from the spatial clustering of individual street nodes and of NTL image pixels, with the cutoff determined by the data's inherent scaling properties (see details in Section 2.2). Secondly, we found that Zipf's law held remarkably well for both street-based and NTL-based hotspot sizes per city, as it does for city-size distributions on the national scale. The scaling exponents derived from NTL-based hotspots were also consistent with the established regimes, implying that NTL-based hotspots can act as better spatial units for urban analysis. Thirdly, we found that the spatial discrepancy between the street-based and NTL-based hotspots can lead us to deep insights into urban planning and development.

The remainder of this paper is organized as follows. Section 2 introduces the data sets and the designed methods for urban hotspot delineation and related scaling analyses. Section 3 presents the maps of the detected hotspots across the top 20 cities in China, as well as the power law metrics of hotspot sizes and associated socio-economic attributes. Section 4 further discusses the intra-urban scaling properties. Section 5 concludes the study and points to future research directions.

#### **2. Data and Methods**

#### *2.1. Data and Data Processing*

We selected 20 well-developed cities in China as study areas and primarily made use of the following three data sets: (1) VIIRS imagery, (2) OSM street network, and (3) socio-economic grid data (Figure 1a). All data sets have national coverage. The NTL data were obtained from NOAA/NCEI [31]. We chose one monthly image for June 2020, with a resolution of 15 arc-seconds (about 500 m at the Equator). We reprojected the image and cleaned it of noise (lit spots such as burning wildfires and oil drilling), based on the method proposed by Elvidge et al. [32]. The national street network was downloaded from OSM and includes 4,419,603 segments, from which we extracted 3,172,001 street nodes based on the criterion that a node must intersect with three segments. The socio-economic grid data include GDP and population from the National Resources and Environment Database of the Chinese Academy of Sciences [33], and the environmental grid data include CO2 emissions from the National Earth System Science Data Center [34]. The raster data sets for GDP, population, and CO2 were collected in 2010 and have a 1 km resolution. To perform the analysis, we clipped both the vector and raster data using each of the 20 city administrative boundaries, then conducted zonal statistics of cells with socio-economic attributes for each city, which were further joined with city hotspots (Figure 1b).
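The node-extraction criterion can be illustrated with a small sketch. It assumes that each street segment is available as a pair of endpoint coordinates and that segments meet only at shared endpoints; it also interprets the criterion as "at least three segments", which is an assumption exposed through the hypothetical parameter `min_segments`. The function name `street_junctions` is illustrative.

```python
from collections import Counter

def street_junctions(segments, min_segments=3):
    """Return endpoints shared by at least `min_segments` street segments.

    segments: iterable of ((x1, y1), (x2, y2)) endpoint pairs.
    """
    degree = Counter()
    for start, end in segments:
        degree[start] += 1
        if end != start:  # a zero-length segment counts once, not twice
            degree[end] += 1
    # Keep only the endpoints where enough segments meet.
    return sorted(node for node, count in degree.items() if count >= min_segments)
```

Real OSM data would first need to be split at intersections so that every junction is an endpoint; that preprocessing step is outside this sketch.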

**Figure 1.** (Color online) The related datasets (**a**) and the methodological framework (**b**) in this study. (Note: The units of the raster datasets for population, GDP, and CO<sub>2</sub> emissions are 1 person/km<sup>2</sup>, 10,000 CNY/km<sup>2</sup>, and 10,000 ton/km<sup>2</sup>, respectively).

#### *2.2. Urban Hotspot Detection*

We adopted the spatial clustering method for urban hotspot calculation and delimitation. As there were two types of data sets (street junction nodes and NTL pixels) to be processed, we applied two rules for cluster detection of each data set: Point proximity and lit pixel adjacency. The threshold (distance between points or pixel value) for clustering was determined by the data's inherent scaling properties uncovered by head/tail breaks and power law detection methods.

#### 2.2.1. Spatial Clustering of Street Nodes and NTL Pixels

Urban hotspots, that is, populated areas in a city, are the basic unit of analysis in this study. Traditional urban analysis uses pre-defined administrative units provided by local authorities or grids with different resolutions. However, neither type of spatial unit can represent the merit of "concentration", as they are defined in either a top-down or an arbitrary manner. To overcome this issue, we adopted a spatial clustering approach to objectively delimit the boundary of a hotspot from the dense areas of street junctions or lit pixels.

We chose a different clustering approach for each of the two data sets. For street junctions, we first computed the triangulated irregular network (TIN) to get junction–junction proximities. As Figure 2a–c shows, the area of an urban hotspot can be directly obtained through the conversion of short TIN edges between points. For NTL images, the first step is to vectorize each raster pixel into a polygonal feature with the light value maintained (Figure 2e); the hotspot can then be derived by grouping the adjacent lit pixels (Figure 2f).
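The grouping of adjacent lit pixels can be sketched directly on the raster, without the vectorization step. The sketch below is an illustration under assumptions that the text does not fix: a pixel counts as lit when its value exceeds a given `cutoff`, and two lit pixels are adjacent when they share an edge (4-neighbour rule). The function name `lit_pixel_clusters` is hypothetical.

```python
from collections import deque
import numpy as np

def lit_pixel_clusters(image, cutoff):
    """Label edge-connected groups of pixels brighter than `cutoff`.

    Returns (labels, count): labels is 0 for dim pixels and 1..count for
    the pixels of each cluster.
    """
    lit = np.asarray(image) > cutoff
    labels = np.zeros(lit.shape, dtype=int)
    rows, cols = lit.shape
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not lit[r, c] or labels[r, c]:
                continue
            count += 1                      # start a new cluster here
            labels[r, c] = count
            queue = deque([(r, c)])
            while queue:                    # breadth-first flood fill
                i, j = queue.popleft()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    inside = 0 <= ni < rows and 0 <= nj < cols
                    if inside and lit[ni, nj] and not labels[ni, nj]:
                        labels[ni, nj] = count
                        queue.append((ni, nj))
    return labels, count
```

Cluster sizes in pixels follow from `np.bincount(labels.ravel())[1:]`; multiplying by the pixel area would give hotspot areas.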

The above procedures can be carried out easily using any mainstream GIS or RS image processing software (such as ArcGIS and Erdas). The major difficulty lies in identifying the cutoff value for the classification of short/long edges and dim/lit pixels across a set of urban areas. In other words, there is no objective criterion for linking the morphological hotspot (the concentration of urban infrastructure) to a set of proximate street junctions, or the functional hotspot (the concentration of human activities) to a group of lit pixels. The same issue occurs when delineating city boundaries (regardless of administrative boundaries) at the country or cross-country level, where prior studies (e.g., [35]) have made use of the universal scaling property to find an effective cutoff value. In a similar spirit, the next section introduces how to obtain the optimal cutoff value for the accurate delimitation of urban hotspots.

**Figure 2.** (Color online) The derivation of urban hotspots using the spatial clustering approach based on respectively street nodes (**a**–**c**) and NTL image pixels (**d**–**f**).

2.2.2. Scaling Analytics for Identifying the Cutoff for Spatial Clustering

A vast body of literature has investigated city-size distributions in different countries. Most of those studies have used the power law model to characterize the uneven spatial distribution of cities, as well as their sizes, such as Zipf's law [12]. Zipf's law states that there is an inverse relationship between the rank and the size of a city. In other words, the largest city is twice as big as the second largest city, etc. Such a statistical distribution would strikingly present the long-tail effect or scaling pattern of far more small cities than large ones. In most cases, the scaling pattern recurs within the power-law distribution and leads to an inherent hierarchy, which can be derived through the head/tail breaks classification scheme. In this study, we change our perspective from a "country-to-city" relationship to "city-to-hotspot" one. In this way, we can borrow the scaling analysis methods (power law detection and head/tail breaks), which were previously used for finding the cutoff value of city demarcation, to delineate hotspots. To start with, we shall first introduce briefly Zipf's Law, power law, and head/tail breaks.

Referring to the size *n* of each city relative to its rank number *r*, Zipf's law is denoted by Equation (1):

$$n \propto r^{-b} \tag{1}$$

where *b* is usually equal to 1, indicating that the city size is proportional to the reciprocal of its rank.
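As a quick numerical illustration of Equation (1), the exponent *b* can be read off as the negative slope of a straight line fitted to log(size) against log(rank). The sketch below uses ordinary least squares for this purpose; it is only a visual-check style estimate and not the estimator used in this study, and the function name `zipf_exponent` is hypothetical.

```python
import numpy as np

def zipf_exponent(sizes):
    """Estimate b in n ∝ r^(-b) from a least-squares line in log-log space."""
    ordered = np.sort(np.asarray(sizes, dtype=float))[::-1]  # largest first
    ranks = np.arange(1, ordered.size + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(ordered), deg=1)
    return float(-slope)
```

For an idealized set of sizes such as 1000, 500, 333.3, 250, ... (that is, 1000/*r*), the estimate is *b* = 1, the classical Zipf case.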

Another way to describe Zipf's law is through the Pareto distribution (or the power law, which is obtained as the derivative of the Pareto distribution) [36]. To do this, one uses the inverse function of Equation (1), *r* ∼ *n*<sup>−1/*b*</sup>, where *r* is further treated as a proportion, Pr, of the whole population through the cumulative distribution function (CDF). The proportion of cities whose size is greater than or equal to *x* is defined as follows:

$$\Pr[X \geq x] \propto x^{-k} \tag{2}$$

where *k* > 0. For a specific point of *x*, the power-law is acquired by the derivative of Pareto distribution by the probability density function (PDF) as:

$$\Pr[X = x] \propto kx^{-k-1} \propto Cx^{-\alpha} \tag{3}$$

where *C* is a constant and *α* = *k* + 1. In practical terms, the power-law distribution could only be discovered in one part of the whole dataset, where there must be some lower bound denoted as *xmin*. A formal form of the power-law is given as follows proposed by Clauset et al. [37]:

$$p(\mathbf{x}) = \frac{\alpha - 1}{\mathbf{x}\_{min}} \left(\frac{\mathbf{x}}{\mathbf{x}\_{min}}\right)^{-\alpha} \tag{4}$$

With the fixed lower bound *xmin*, the power law exponent α is then derived from the robust maximum likelihood estimation (MLE) method, noted as Equation (5):

$$\alpha = 1 + n \left[ \sum_{i=1}^{n} \ln \frac{x_i}{x_{min}} \right]^{-1} \tag{5}$$

So far, we can remark that, for detecting Zipf's law, the power-law exponent should be two rather than one: with *b* = 1 in Equation (1), *k* = 1/*b* = 1, and hence *α* = *k* + 1 = 2. Furthermore, a modified Kolmogorov-Smirnov test [37,38] needs to be performed to determine how well the data fit an ideal power-law model with the derived *x<sub>min</sub>* and *α* values. Each time, we generate 1000 synthetic datasets that follow a perfect power law above *x<sub>min</sub>* but have the same non-power-law distribution as the original dataset below *x<sub>min</sub>*. Then, we count how many times the maximum difference between a synthetic dataset and the fitted model is larger than that between the original dataset and the fitted model; the ratio of this count to 1000 is the goodness-of-fit index, the *p*-value. We set *p*-value ≥ 0.05 as the criterion for accepting the data as power-law-distributed in this study, meaning that at least 50 of the 1000 synthetic datasets are less "power-law-distributed" than the original dataset.
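
As an illustration only, the estimation in Equation (5) and the goodness-of-fit procedure described above can be sketched in a few lines of Python. The sketch makes several simplifying assumptions that are ours rather than part of the original workflow: the data are treated as continuous, *x<sub>min</sub>* is taken as known and held fixed, and each synthetic tail has the same number of observations as the observed tail (with a fixed *x<sub>min</sub>*, values below it do not enter the distance and are therefore not resampled). All function names are hypothetical.

```python
import math
import random


def fit_alpha(values, xmin):
    """Continuous MLE of the power-law exponent, Equation (5), for values >= xmin."""
    tail = [v for v in values if v >= xmin]
    return 1.0 + len(tail) / sum(math.log(v / xmin) for v in tail)


def ks_distance(values, xmin, alpha):
    """Largest gap between the empirical CDF and the fitted power-law CDF above xmin."""
    tail = sorted(v for v in values if v >= xmin)
    n = len(tail)
    gap = 0.0
    for i, v in enumerate(tail):
        model = 1.0 - (v / xmin) ** (1.0 - alpha)  # fitted CDF at v
        gap = max(gap, abs((i + 1) / n - model), abs(model - i / n))
    return gap


def sample_power_law(n, xmin, alpha, rng):
    """Draw n values from a continuous power law by inverse-transform sampling."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def bootstrap_p_value(values, xmin, n_sim=1000, seed=0):
    """Share of synthetic datasets that fit their own model worse than the data do."""
    rng = random.Random(seed)
    n_tail = sum(1 for v in values if v >= xmin)
    alpha = fit_alpha(values, xmin)
    d_obs = ks_distance(values, xmin, alpha)
    worse = 0
    for _ in range(n_sim):
        synth = sample_power_law(n_tail, xmin, alpha, rng)
        d_syn = ks_distance(synth, xmin, fit_alpha(synth, xmin))  # refit each time
        if d_syn > d_obs:
            worse += 1
    return worse / n_sim
```

Under these assumptions, `fit_alpha(areas, xmin)` would return the exponent for a list of hotspot areas, and `bootstrap_p_value(areas, xmin)` the corresponding *p*-value, to be compared with the 0.05 criterion.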

Zipf's law can be used as an effective assessment when performing city demarcations. In other words, if the demarcated city sizes follow Zipf's law, we consider the result valid. The question then narrows down to how to derive cities whose sizes follow Zipf's law from geospatial datasets, such as the TIN model and NTL imagery (Figure 3). Here, we introduce the head/tail breaks method [39] to effectively locate the cutoff value. Put simply, data with a power-law distribution can be divided at the arithmetic mean into a high percentage in the tail (≥60%) and a low percentage in the head (≤40%). Therefore, for TIN and image models, the head refers to long TIN edges and light pixels, and the tail refers to short edges and dark pixels. The process then runs recursively for the head part until the head percentage is no longer small (say, ≥40%). During the process, a series of arithmetic means is iteratively computed, naturally forming a scaling hierarchy of the data. The number of hierarchical levels defined by these mean values, known as the ht-index [40], can then characterize the tendency of the data to be power-law-distributed. Namely, the larger the ht-index value, the more likely it is that the data follow a power law. Prior studies have used these nested mean values as cutoffs for extracting the so-called natural cities whose sizes obey Zipf's law at either national or cross-national levels (e.g., [41]). However, the use of those values for hotspot derivation at the city level remains under-researched. The present study detects urban hotspots through a combination of head/tail breaks, for locating the feasible cutoff, and the MLE method, for examining Zipf's law.
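
The recursive splitting described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the function name and the `head_limit` parameter are hypothetical, and the stopping rule (stop once the head exceeds 40% of the current subset) is one reading of the "no longer small" criterion.

```python
def head_tail_breaks(values, head_limit=0.4):
    """Return the nested arithmetic means obtained by head/tail breaks.

    At each step the data are split at their mean; the part above the mean
    (the head) is split again as long as it remains a minority.
    """
    head = list(values)
    means = []
    while len(head) > 1:
        mean = sum(head) / len(head)
        new_head = [v for v in head if v > mean]
        # Stop when nothing lies above the mean or the head is no longer small.
        if not new_head or len(new_head) / len(head) > head_limit:
            break
        means.append(mean)
        head = new_head
    return means
```

For example, `head_tail_breaks([1] * 80 + [10] * 15 + [100] * 5)` yields two nested means (about 7.3 and 32.5), each of which could serve as a candidate cutoff, whereas evenly spread values such as `range(100)` yield none.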

#### *2.3. Power Function Fitting for Intra-Urban Scaling Law Examination*

The examination of urban scaling concerns two perspectives: the power-law detection of a single urban indicator (as mentioned in Section 2.2.2) and the power relationship between two types of urban quantities (for example, urban areas versus populations). The latter has been formulated as the universal scaling law [16] for most urban indicators, which uses power function fitting between an urban indicator and the urban population size across cities at time *t*, denoted as Equation (6):

$$Y(t) = kN(t)^{\beta} \tag{6}$$

where *Y*(*t*) is the urban indicator, *N*(*t*) is the urban population size, *β* is the scaling exponent, and *k* is a constant.

The scaling exponent *β* can be further divided into three categories: sub-linear (*β* < 1), linear (*β* ≈ 1), and super-linear (*β* > 1) scaling relationships between urban measures [16]. To elaborate, *β* < 1 normally means that the need for a city's infrastructure scales sub-linearly with its population size due to economies of scale, whereas the number of a city's innovations and crimes scales super-linearly (*β* > 1) due to endogenous social interactions. The regime of *β* ≈ 1 describes the pattern in which individual demands in a city are proportionate to the urban population size. In this study, we use the detected hotspots as alternative spatial units to reexamine the urban scaling law. To do so, we conduct power function fitting between urban socio-economic metrics (such as population, GDP, and CO2 emissions) measured within urban hotspots. To compute the scaling exponent, we first take the logarithms on both axes and adopt ordinary least-squares linear regression for fitting. The scaling exponent is then the slope of the fitted line.
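
The fitting step can be illustrated with a short sketch that estimates *β* and *k* of Equation (6) by ordinary least squares on log-transformed values. The function name is hypothetical, and the inputs are assumed to be strictly positive, one pair of values per city.

```python
import math


def scaling_exponent(population, indicator):
    """Fit Y = k * N**beta by OLS on log-log axes; return (beta, k, r_squared)."""
    xs = [math.log(n) for n in population]
    ys = [math.log(y) for y in indicator]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                       # slope of the fitted line
    k = math.exp(mean_y - beta * mean_x)   # intercept, mapped back from log space
    r_squared = (sxy * sxy) / (sxx * syy)  # squared correlation for a simple regression
    return beta, k, r_squared
```

A returned `beta` below, near, or above 1 would then correspond to the sub-linear, linear, or super-linear regime, respectively.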

#### **3. Results**

#### *3.1. Derived Urban Hotspots in the Top 20 Chinese Cities*

We applied the urban hotspot detection method to street nodes and NTL imagery, respectively, across the top 20 Chinese cities, ranked by GDP. To derive the hotspots from the street nodes, we established big TIN models for each city, whose TIN edges number from tens to hundreds of thousands (Table 1). The heavy-tailed distribution statistics were striking for each TIN model, as the average edge length (*l<sub>edge</sub>*, about 450 m on average) effectively separated short from long TIN edges in imbalanced ratios (around 80% versus 20%). This 80/20 division, namely the scaling pattern of far more short TIN edges than long ones, objectively reveals the uneven spatial distribution of street node densities. The delineation of urban hotspots for each city was then conducted by grouping and converting those short edges into many different-sized hotspots. The areas of the resulting hotspots per city followed Zipf's law well, as the mean value of the 20 cities' power-law exponents was 2.01 (for more details of the basic statistics and related power-law metrics of hotspot size, see Section 3.2). Figure 3 presents the appearance of hotspots across selected cities, clearly showing that a few of the largest patches were located downtown and numerous smaller ones were dispersed in places other than the city center.


**Table 1.** Statistics of street nodes, related triangulated irregular network (TIN) edges, and head/tail division results among 20 cities. (Note: #: Number; *l<sub>edge</sub>*: Average edge length).


**Figure 3.** (Color online) Urban hotspots based on the density of street junctions throughout the top 20 Chinese cities.

The urban hotspot extraction from NTL data involved experiments with a series of "candidate" mean values obtained through the head/tail breaking process on each image. To start with, the number of pixels per image ranged widely, from 9397 (Shenzhen) to 353,344 (Harbin), and, interestingly, also followed a fat-tailed distribution. More specifically, among the 20 city NTL images, most (14) contain fewer than 78,045 pixels, some (five) contain between 78,045 and 141,623 pixels, and only one has more than 141,623 pixels, resulting in an ht-index value of 3, meaning that there are three hierarchical levels of images with regard to the number of pixels. Moreover, the ht-index for the pixel values of each city image was even higher. Figure 4 shows clearly that each image contains far more dark pixels than light ones, and such a scaling pattern recurs at least five times, indicating that no fewer than five average lightness values per image were obtained as candidate threshold values for a single city's hotspot delineation (see Appendix A for more details of the head/tail breaks method applied to the pixel values of each city's NTL data). Therefore, for every image, we merged the vectorized pixels whose values were above each candidate threshold derived from head/tail breaks to extract the urban hotspots, followed by power-law detection for each set of hotspot results. The summary of statistical results for varying thresholds is presented in Table 2, which shows that the optimal cutoff value resided in the third level, since its power-law exponent was closest to 2, making the hotspots most akin to the Zipf's law configuration. It should be noted that the average of the cutoff values across 20 cities (33.086) largely echoes the optimal threshold (33.14) based on the VIIRS NTL data in 2013 for Chinese city demarcation [35]. Following the located cutoffs for each image, the layout of the extracted urban hotspots was overall similar to that from street nodes in terms of the imbalanced spatial distribution from city center to periphery (Figure 4).

**Table 2.** The candidate cutoff values for the NTL image and the resulting power law exponents at different levels based on the head/tail breaks method. (Note: *light*: The average of lightness thresholds at each level for 20 images; *α*: The average of power-law exponent of hotspot area for 20 cities).


By comparing Figure 3 with Figure 4, it is clear that the two types of patches overlapped with each other, but to varying degrees, indicating that there were both similarities and differences between urban physical and functional extents. Here, we applied the intersection over union (IoU) metric to compute the overlapping ratio between the two types of hotspots for each city; the average ratio for the 20 cities was around 0.27 (see more details in Appendix B). Inland cities tended to have larger ratios: Shenyang, Xian, and Zhengzhou had the most overlap (around 0.4), whereas coastal cities such as Shenzhen and Qingdao had much less (e.g., only 0.11 for Qingdao). We further opted to map the overlay between the two types of hotspots for the top four representative cities in China: Beijing, Shanghai, Guangzhou, and Shenzhen (Figure 5), whose IoU metrics are all smaller than the average (0.26, 0.17, 0.21, and 0.18, respectively). Moreover, it is intriguing to note that detailed disparities can be found with respect to the extent of dispersed patches. In other words, with similar power-law exponents (around 2), the sizes of NTL hotspots in the top cities seemed to be more even, and their spatial distribution more dispersed, than those of street hotspots.

**Figure 4.** (Color online) Urban hotspots based on NTL imagery using the third mean value as the cutoff value.

**Figure 5.** (Color online) Comparison between two types of urban hotspots in four Chinese first-tier cities.

#### *3.2. Intra-Urban Scaling Properties Based on Derived Urban Hotspots*

We applied the robust power law detection based on the MLE method to the two types of hotspots in 20 cities. For each city, we listed the power-law fitting metrics for its hotspot areas detected using the cutoffs derived from head/tail breaks (Table 3). We can see that Zipf's law held remarkably well for both types of urban hotspots. As stated, the power-law exponents for street hotspots were centered at 2.01 ± 0.15, while the averaged exponent value for NTL hotspots was slightly smaller, 1.921 ± 0.19, due to the exception of Chengdu (1.46). Most of the *p*-values were above 0.05, and readers can cross-check the results in Table 3. In addition to the hotspot sizes, we also examined the power law fit of the socio-economic status within the hotspots in the top four cities. As Figure 6 shows, the power-law distribution still holds for GDP, population, and the amount of CO2 emissions per hotspot, respectively. However, the values of the exponents differed slightly among cities. Specifically, the exponents of the three urban metrics inside the hotspots remained close to that of the hotspot size in Guangzhou and Shanghai, but less so in Beijing (fluctuating around *α<sub>Area</sub>*) and Shenzhen (all smaller than *α<sub>Area</sub>*).


**Table 3.** Power law metrics of detected urban hotspot areas. (Note: *α<sub>Area</sub>*: Power-law exponent; *p*: The goodness-of-fit index; *Area<sub>min</sub>*: The minimum area above which the power-law holds).

**Figure 6.** (Color online) Power law distribution of NTL-based hotspot sizes (**a**), GDP (**b**), population (**c**), and CO2 emissions (**d**) among the top four cities in China.

We further investigated how these extracted hotspots worked as cores of each city. Ideally, there should be a disproportionate relationship between hotspot areas and the amount of resources they contain. Indeed, 3% of the city area, constituting either type of hotspot, accommodates, on average, around 15% of GDP, 25% of population, and 20% of CO2 emissions (Table 4). Extreme cases such as Shenyang, Wuhan, and Kunming showed that the derived hotspots could even account for more than 40% of the city's total population or GDP. Such imbalanced ratios enabled us to make use of those urban indicators within the hotspots for exploring the intra-urban scaling law. After correlating the total areas, GDP, and CO2 emissions with the population, based on the two types of hotspots for each city on double logarithmic scales, we were intrigued by two findings. Firstly, there were no scaling relationships between the area/GDP/CO2 emissions and population based on the street hotspots, as indicated by the very low R<sup>2</sup> values (below 0.01), while significant scaling relationships existed when using NTL hotspots (R<sup>2</sup> values above 0.4). Secondly, the area–population and CO2 emissions–population relationships were sub-linear (0.84 and 0.68; Figure 7a,c), whereas the GDP–population relationship was super-linear (1.13; Figure 7b). The corresponding scaling exponent values, computed among the chosen 20 cities, were very consistent with values from a recent study based on 287 Chinese prefecture-level cities [17].

**Table 4.** Percentages of area, gross domestic product (GDP), population, and CO2 emissions inside urban hotspots to those of the entire city. (Note: Pop: Population; CO2: CO2 emissions).


**Figure 7.** (Color online) Scaling relations and exponents for urban indicators reflected by NTL-based hotspots (Note: Panels (**a**,**c**) show the sub-linear scaling law for area/CO2 emissions versus population; Panel (**b**) shows the super-linear scaling law of GDP and population; all metrics for each city are calculated based on the extent of contained NTL-based hotspots).

#### **4. Discussion**

Cities have long been treated as complex systems. The formation of cities can be described as a dynamic, self-organized, and nonlinear process of human settlement [5], demonstrating highly heterogeneous patterns in both its spatial and aspatial aspects [42]. The spatial aspect can refer to the fractal urban form and the aspatial aspect can refer to the long-tailed distribution of city-related metrics. However, such heterogeneities cannot be revealed effectively, since conventional urban data, normally formed through top-down approaches, lack sufficient geographic scope and granularity. In the current geospatial big data era, we can easily overcome this constraint by acquiring fine-grained open data on city form and function with countrywide coverage. Big data is not only big, but also possesses significant fractal and nonlinear properties [43], based on which we can model and analyze a city in a bottom-up manner. That is, a city boundary can be delimited at the country level, or a hotspot area delineated at the city scale, by agglomerating individual-based locations.

By adopting fractal and nonlinear ways of thinking and doing, the cutoff for hotspot boundary derivation was located effectively. Specifically, drawing the border of hotspots is similar to measuring the length of a coastline: what the two have in common is that, in reality, there is no ground truth for either. The father of fractal geometry, Benoit Mandelbrot [44], made it clear that the length of a coastline is immeasurable, while the nonlinearity or scaling property is always measurable. In the present study, we characterized the data's nonlinearity through its inherent scaling hierarchy (by head/tail breaks) and power-law or Zipf's law distribution (by the MLE method), by which we obtained the cutoff guiding the spatial clustering. Taking the NTL image as an example, the nested mean values enable us to quickly classify pixels iteratively into a minority of light ones and a majority of dark ones, without exhausting all pixel values by increasing the threshold one step at a time. Accordingly, only a few rounds of light-pixel grouping for each city were needed to generate hotspot polygons whose sizes follow Zipf's law.

The successful detection of Zipf's law for street- and NTL-based hotspots across 20 cities further supports the fractal structure of geographic space. It is well-known that a part of a fractal is similar geometrically or statistically to the whole, a property termed self-similarity. Since there is good agreement among scholars that Zipf's law holds for cities at the country scale [36,45], such a repeated statistical regularity for hotspots at the city scale in the present study can be considered evidence of the self-similarity of geographic space. The self-similarity across multiple scales lets us connect the system of geographic space with that of biology, where similar power law statistics appear across multiple layers in a human body, from organs to tissues and further to cells [46,47]. Therefore, we believe that Zipf's law can hold within even smaller sub-units than city hotspots (such as neighborhoods), and thus more refined urban center areas could be further identified with the proposed methods. This certainly warrants further study as long as the data granularity allows.

The detected hotspots of both types constituted only a small part of the city area, but accounted for a considerable portion of the urban population, wealth, and energy. This imbalanced ratio between hotspot sizes and the associated socio-economic statistics sheds light on the fact that people do not live or perform activities in all city areas. This is also known as the potential problem of the administrative city boundary for urban analysis [21]. Without an accurate capture of human urban activities, the urban scaling estimations may be subject to unexpected variations. We also examined the power relationship between selected urban measures within the entire administrative boundary among 20 cities, and failed to achieve the expected scaling exponents (small R<sup>2</sup> values or wrong regimes), similar to the case when using the street-based hotspots. By contrast, through the NTL-based hotspots, the derived scaling relationships of area/GDP/CO2 to population were consistent with the established regimes (e.g., [17,48]). The obtained scaling exponents, shown in Figure 7, indicated that, due to a more concentrated settlement and use of infrastructure, the growth of the urban economy outpaced that of the population (super-linear regime), while the demand for urban area and the related energy consumption grew more slowly than the population (sub-linear regime). The presence of the scaling law further implied that the NTL-based hotspots could work as a new, effective instrument for exploring the system of cities.

The hotspots identified by both street and NTL data, by and large, tally with the locations of the central urban areas of these 20 cities in China. As noted, street-based hotspots can represent a city's morphological aspects, whereas NTL-based hotspots can accurately reflect a city's functional aspects. The comparison between the two can give us a comprehensive image of how people utilize urban space. It is noteworthy that the disparity occurs in their spatial distributions. Given that NTL-based hotspots illustrate the aggregation of human activities, we infer that the NTL-based hotspots better manifest the actual populous urban areas than the street-based hotspots, given that street network construction and traffic planning normally show a time lag. This discrepancy normally hints at the evolution of urban centers. That is, these regions are preferred by people but apt to be neglected by municipal authorities or urban scholars. Thus, planning authorities should at least pay attention to these regions, and other urban infrastructure should be strengthened to keep pace with real human needs.

By computing IoU metrics, we find that the two types of hotspots overlap less in coastal cities than in inland cities, while coastal cities in China normally have better economic status. Meanwhile, it is worth mentioning that the NTL-based hotspots are very dispersed in the four leading metropolises, indicating that well-developed cities tend to exhibit a balanced distribution of human activities. It is further inferred that cities with higher economic status shift to a more decentralized structure through autonomous urban development. On this basis, governments need to take more measures to promote urban justice (including the even distribution of urban resources, etc.) in the process of urban development.

#### **5. Conclusions**

The ultimate goal of city science is closely related to urban smart growth and sustainable development. In natural and societal phenomena, it has been widely accepted that the scaling pattern and power-law statistics are signs of sustainability [49]. This paper provides an intra-urban perspective to study the underlying scaling structure of urban space through novel spatial units: urban hotspots, detected from geospatial big data including OSM street data and VIIRS imagery. In contrast to conventional spatial units imposed by local authorities, the present study adopted objectively delineated concentration areas as hotspots using the spatial clustering approach. This is mainly motivated by the instability of urban scaling exponents, which are affected by how cities and their sub-units are demarcated. In sum, we found (1) that Zipf's law also holds strikingly at the intra-urban level; and (2) that NTL-based hotspots can be good proxies for populous city areas, by which the urban scaling relationship can be correctly maintained.

The method for hotspot detection is a promising tool and could supplement innovative urban planning toolboxes in the big data era. Despite the strengths of urban hotspots in this work, there is still room for improvement in the following respects. Firstly, whether the intra-urban scaling law exists in other countries, beyond these 20 cities in China, remains to be verified from a global view. Secondly, it is important to add NTL images before 2020 to check whether and how the intra-urban scaling exponents change or evolve. Further, the updated raster data sets of GDP, population, and CO2 emissions after 2010 will be combined once they are available, to eliminate possible biases or inaccuracies caused by differences in data acquisition time. Thirdly, the multiscale effect of scaling analytics (e.g., detecting a more refined spatial unit and related power law statistics) within one city needs to be further investigated. Fourthly, the underlying mechanism of this scaling law, concerning policy, landform, or demographic traits, etc., has not been revealed yet. Future work will point in these directions.

**Author Contributions:** Conceptualization, D.M.; data curation, J.Y.; formal analysis, D.M. and Y.Z.; funding acquisition, D.M. and Z.Z.; methodology, D.M. and Y.J.; supervision, R.G. and Z.Z.; visualization, J.Y.; writing—original draft, D.M.; writing—review and editing, D.M. and Z.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was financially supported by the National Natural Science Foundation of China (grant no. 42001180), the China Postdoctoral Science Foundation (grant no. 2019M663038), and the National Key Research and Development Program of China (grant no. 2018YFB2100705).

**Data Availability Statement:** Data sharing not applicable.

**Acknowledgments:** We would like to thank the anonymous referees and the editor for their constructive comments. We also would like to thank Chengyue Zhang, Wei Zhu, and Qionghuan Liu for their useful suggestions on NTL data acquisition and processing.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

This appendix supplements Section 3.1 by showing average pixel values derived along with the head/tail breaks process of each nighttime image and the resulting power law metrics on hotspot sizes, by city.



#### **Appendix B**

This appendix supplements Section 3.1 by presenting the overlapping ratios between street- and NTL-based hotspots among 20 cities. We adopted *IoU* for assessing how much one type of hotspot overlaps another in a city. The *IoU* metric between two types of hotspots can be denoted by the following equation:

$$IoU = \frac{Area_{s} \cap Area_{n}}{Area_{s} \cup Area_{n}}$$

where *Area<sub>s</sub>* is the total area of street-based hotspots and *Area<sub>n</sub>* is the total area of NTL-based hotspots. The *IoU* results for each city are shown in Table A2.
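
As a rough illustration of the metric, the sketch below computes *IoU* for two hotspot layers that are assumed to have been rasterized onto a common grid of equal-area cells, so that counting cells stands in for measuring area. This rasterization, the representation of each layer as a collection of `(row, col)` cell indices, and the function name are assumptions made for the example, not the procedure used to produce Table A2.

```python
def intersection_over_union(cells_s, cells_n):
    """IoU of two hotspot layers given as collections of (row, col) grid cells."""
    street = set(cells_s)
    ntl = set(cells_n)
    union = street | ntl
    if not union:
        return 0.0  # neither layer contains any hotspot cell
    return len(street & ntl) / len(union)
```

A value of 0 indicates no overlap between the two layers, and a value of 1 indicates identical extents.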


**Table A2.** The intersection over union (IoU) metrics between two types of hotspots among 20 cities.

#### **References**


## *Article* **Delineating Functional Urban Areas Using a Multi-Step Analysis of Artificial Light-at-Night Data**

**Nataliya Rybnikova <sup>1,2,3,</sup>\*, Boris A. Portnov <sup>2</sup>, Igal Charney <sup>3</sup> and Sviatoslav Rybnikov <sup>4,5</sup>**


**Abstract:** A functional urban area (FUA) is a geographic entity that consists of a densely inhabited city and a less densely populated commuting zone, both highly integrated through labor markets. The delineation of FUAs is important for comparative urban studies and it is commonly performed using census data and data on commuting flows. However, at the national scale, censuses and commuting surveys are performed at low frequency, and, on the global scale, consistent and comparable data are difficult to obtain overall. In this paper, we suggest and test a novel approach based on artificial light at night (ALAN) satellite data to delineate FUAs. As ALAN is emitted by illumination of thoroughfare roads, frequented by commuters, and by buildings surrounding roads, ALAN data can be used, as we hypothesize, for the identification of FUAs. However, as individual FUAs differ by their ALAN emissions, different ALAN thresholds are needed to delineate different FUAs, even those in the same country. To determine such differential thresholds, we use a multi-step approach. First, we analyze the ALAN flux distribution and determine the most frequent ALAN value observed in each FUA. Next, we adjust this value for the FUA's compactness, and run regressions, in which the estimated ALAN threshold is the dependent variable. In these models, we use several readily available, or easy-to-calculate, characteristics of FUA cores, such as latitude, proximity to the nearest major city, population density, and population density gradient, as predictors. At the next step, we use the estimated models to define optimal ALAN thresholds for individual FUAs, and then compare the boundaries of FUAs, estimated by modelling, with commuting-based delineations. To measure the degree of correspondence between the commuting-based and model-predicted FUAs' boundaries, we use the Jaccard index, which compares the size of the intersection with the size of the union of each pair of delineations. 
We apply the proposed approach to two European countries, France and Spain, which host 82 and 72 FUAs, respectively. As our analysis shows, ALAN thresholds, estimated by modelling, fit FUAs' commuting boundaries with an accuracy of up to 75–100%, being, on average, higher for large and densely populated FUAs than for small, low-density ones. We validate the estimated models by applying them to another European country, Austria, which demonstrates a prediction accuracy of 47–57%, depending on the model type used.

**Keywords:** functional urban areas (FUAs); boundaries; multiple regression modelling; artificial light-at-night (ALAN); optimal threshold

#### **1. Introduction**

More than 50% of the world's population currently resides in urban areas, and this share is expected to increase to 70% by 2050 [1]. Due to a significant concentration of production factors, urban areas produce approximately 80% of the global GDP [2]. This makes the spatial dynamics of urban areas important for policy-makers and researchers alike. Decision-makers can devise informed development policies, while in the research community, this information can be used to monitor the process of urban growth and the forces behind it [3–5], and to assess the impact of urbanization on agriculture and natural landscapes [6], on biodiversity [7], on land surface temperature [8], and on other socioeconomic and physical phenomena.

**Citation:** Rybnikova, N.; Portnov, B.A.; Charney, I.; Rybnikov, S. Delineating Functional Urban Areas Using a Multi-Step Analysis of Artificial Light-at-Night Data. *Remote Sens.* **2021**, *13*, 3714. https://doi.org/10.3390/rs13183714

Academic Editor: Xuecao Li

Received: 10 August 2021 Accepted: 10 September 2021 Published: 17 September 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Urban growth is characterized by two distinctive components: *physical growth* and *functional change*. The former group of attributes reflects changes in impervious surfaces and built-up characteristics, such as building density and building volumes [9,10], as well as the population size and density of individual urban settlements [11–13]. Concurrently, functional attributes of urban growth reflect *factor mobility*, associated with various economic activities, such as commuting, commerce, industrial production and services [14]. Such exchanges are especially intense between urban cores, where a large share of production factors is concentrated, and their surrounding areas. Functionally-integrated clusters, representing geographic entities that consist of a densely inhabited city and a less densely populated commuting zone, both highly integrated through labor markets, are commonly referred to as *functional urban areas* or FUAs [15]. A FUA is conceptually different from an urban agglomeration, which is commonly defined as a major city surrounded by an adjacent hinterland [16]. The major difference between the two is *commuting*, which is crucial for delineating FUAs, but is not a prime consideration for the definition of urban agglomerations.

According to the mainstream approach adopted by the European Union (EU) and the Organization for Economic Co-operation and Development (OECD), the boundaries of FUAs are defined in three consecutive steps. First, *urban cores* are identified as contiguities of high-density grid cells with population density of at least 1500 residents per km<sup>2</sup> and the total population in the contiguous cells of at least 50,000 residents. Second, local administrative units (LAUs) with at least 50% of their residents living inside the urban core are identified. At the final step, the commuting zone, comprising LAUs, which have at least 15% of their residents employed in the core city, is determined. Together with the central city, these administrative units are assumed to form a single FUA [17].
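
To make the three steps concrete, the following sketch applies the thresholds quoted above to toy inputs. It is a simplified illustration under our own assumptions, not the official EU-OECD procedure: grid cells are identified by `(row, col)` indices, contiguity is taken as sharing an edge (four neighbours), every cell is assumed to have the same area, and the function names and dictionary keys (`core_resident_share`, `commuting_share`) are hypothetical.

```python
def urban_cores(density, cell_area_km2=1.0, min_density=1500, min_population=50000):
    """Step 1: contiguous groups of dense cells with enough total population.

    density maps (row, col) -> residents per km2; returns a list of cell sets.
    """
    dense = {cell for cell, d in density.items() if d >= min_density}
    cores, seen = [], set()
    for start in dense:
        if start in seen:
            continue
        cluster, stack = set(), [start]
        seen.add(start)
        while stack:  # flood fill over edge-sharing neighbours
            r, c = stack.pop()
            cluster.add((r, c))
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        population = sum(density[cell] * cell_area_km2 for cell in cluster)
        if population >= min_population:
            cores.append(cluster)
    return cores


def classify_laus(laus, core_share_min=0.5, commuting_share_min=0.15):
    """Steps 2 and 3: split LAUs into the city and its commuting zone.

    laus maps an LAU name to a dict with the share of residents living in the
    urban core and the share of residents employed in the city.
    """
    city = {name for name, d in laus.items()
            if d["core_resident_share"] >= core_share_min}
    zone = {name for name, d in laus.items()
            if name not in city and d["commuting_share"] >= commuting_share_min}
    return city, zone
```

The union of the two sets returned by `classify_laus` would then correspond to a single FUA in this simplified setting.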

However, commuting data, needed to perform such delineations, are laborious to collect and are infrequent and sporadic even in developed countries [18]. In addition, different countries and regions report commuting data with different frequencies, and sometimes collect them using different definitions and methodologies [18]. As a result, comparable cross-country estimates of FUA boundaries cannot always be obtained.

As artificial light-at-night (ALAN) data are freely available globally and provide a seamless global coverage, the idea of using them for the identification of human activities was investigated in several studies (see inter alia [19–21]). In previous studies, ALAN data were used in health geography [22–27], for the analysis of economic performance of countries and regions [28–31], and in population density research [20,32–36]. The use of such data in the studies of light pollution and its ecological effects is also common [24,37–40].

In recent years, there have been attempts to use ALAN data for the identification of urban areas [20,21,41–47]. In one such study, Imhoff et al. [41] examined frequency-based ALAN thresholds for three large metropolitan areas in the U.S.: Miami, Chicago and Sacramento. Having analyzed the frequencies of differently lit pixels in the ALAN images, the authors determined that pixels observed with frequencies of 85%, 89% and 94% occupy areas of approximately the same size as those reported in the Census for the corresponding metropolitan entities.

In another study, Sutton et al. [20] investigated 2000 cities across the globe, and compared their actual boundaries with those produced by three different frequency-based ALAN thresholds—40%, 80% and 90%. As the study revealed, pixels in the ALAN image, observed with a frequency of 80% or more, correspond to the actual municipal boundaries best, reaching a correlation level of about 68%.

In a separate study, Henderson et al. [21] examined frequency- and intensity-based ALAN thresholds that match the boundaries of San Francisco, Beijing and Lhasa. As the authors of this study have found, the optimal ALAN frequency-based thresholds that produce the total lit area comparable in size to the Landsat data-derived urban delineations, reach 88% for Lhasa, 97% for Beijing, and 92% for San Francisco, with the corresponding ALAN flux being equal to 19, 30 and 51 digital numbers (DN), respectively. However, the spatial correspondence between metropolitan boundaries, determined using ALAN thresholds, and actual metropolitan delineations was found to be relatively low, not exceeding 8–44%.

It should also be noted that the aforementioned studies focus on the identification of built-up urban contiguities, while, to the best of our knowledge, only one study by Bosker et al. [18] analyzed functional urban delineations based on commuting flows. The authors of this analysis compared varying percentiles of ALAN intensities, reported by the VIIRS/DNB satellite sensor for 2015, with commuting delineations in Indonesia. As this study revealed, the best fit of ~40% is observed when 7% commuting frequency delineations are compared with delineations based on the 25th percentile of ALAN intensities.

A possible reason for such a low fit of less than 40% is that FUAs, even in the same country, differ in the amount of ALAN they emit. As a result, different ALAN thresholds must be used for the delineation of FUA boundaries in different parts of the urban system. In Figure 1, we illustrate this point using two FUAs in France as an example. As evidenced by this figure, the ALAN threshold of 0.71 nW/cm<sup>2</sup>/sr fits the boundaries of the Paris FUA reasonably well, but the same threshold fits rather poorly the much smaller Chateauroux FUA, at the boundary of which the ALAN flux does not exceed 0.15 nW/cm<sup>2</sup>/sr.

Considering that ALAN emissions from different FUAs vary substantially, it is thus important to establish varying ALAN thresholds, which would fit individual FUAs. This task can be performed for each FUA separately. However, in order to be practical, the approach needs to be sufficiently general, to enable its application to different FUAs, both for countries and regions with well-established commuting data and for other locations with unavailable or sparsely available commuting information. In this paper, we develop such an approach and test it against actual FUA delineations.

**Figure 1.** Commuting-based boundaries (black lines) of the Paris (**a**) and Chateauroux (**b**) FUAs vs. the ALAN contours (blue lines), representing the 0.71 nW/cm<sup>2</sup>/sr threshold level.

#### **2. Materials and Methods**

*2.1. Study Phases*

The proposed approach is implemented in several steps, as detailed in Figure 2. The data sources and analysis stages are described in the subsections below.

**Figure 2.** Flowchart of study stages.

#### *2.2. Data Sources*

Data for the present study were drawn from the following four main sources:


**Figure 3.** ALAN maps for continental France (**a**) and Spain (**b**). Note: Areas located outside the national borders are marked in blue.

**Figure 4.** FUAs and their cores in continental France (**a**) and Spain (**b**).


**Table 1.** Descriptive statistics of the research variables.

Notes: <sup>a</sup> Calculated as straight line distance between a FUA core's centroid and centroid of the closest FUA with 1.5M+ residents; <sup>b</sup> Calculated as the ratio between the population density of the FUA core and that of the core's buffer with a 5 km width for small FUAs (less than 100,000 residents), a 15 km buffer for medium-size FUAs (100,000–250,000 residents), and a 25 km buffer for large FUAs (over 250,000 residents).

#### *2.3. Initial Determination of the ALAN Thresholds*

For the sake of simplicity, let's assume that the nighttime light source of highest intensity is located at the center of a FUA, and light intensities drop monotonically and uniformly towards the FUA's periphery (see Figure 5).

**Figure 5.** A simplified distribution of ALAN emissions (**a**) and the associated frequency distribution of ALAN values (**b**).

Such an assumption might be fully plausible for compact and monocentric urban areas (Figure 6). Under these conditions, the territorial footprint of the FUA's ALAN emissions follows a perfect circle, and the most frequently observed (i.e., modal) ALAN values are found at the FUA's outer boundary (Figure 5a). These modal values are also the dimmest ones, and, as such, they effectively define the FUA's outer boundary (Figure 5b).

**Figure 6.** Examples of compact monocentric FUAs, whose territorial footprints are close to a circular shape: Le Mans (**a**) and Limoges (**b**) in France. Note: Thin grey lines mark FUAs' boundaries.

If the above assumptions are upheld, the analysis of the frequency distribution of the observed ALAN values can help to identify the ALAN level, which coincides best with the FUA's boundary. In particular, the researcher needs to choose the *modal* ALAN value, for which ALAN intensity is expected to be close to zero (Figure 5b).
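As an illustration of this step, the modal ALAN value can be read off a histogram of the pixel values falling inside a FUA. The sketch below is a minimal Python example under stated assumptions: the function name `modal_alan`, the fixed bin width and the tie-breaking rule (prefer the dimmest bin) are hypothetical choices and are not taken from the study.

```python
from collections import Counter


def modal_alan(values, bin_width=0.05):
    """Return the centre of the most populated ALAN bin (a histogram mode).

    `values` are ALAN readings (nW/cm2/sr) of the pixels inside one FUA.
    `bin_width` is an assumed setting; ties go to the dimmest bin.
    """
    counts = Counter(int(v // bin_width) for v in values)
    best_bin = max(counts, key=lambda k: (counts[k], -k))
    return (best_bin + 0.5) * bin_width
```

The result depends on the bin width, so in practice the binning would need to be chosen with the radiometric resolution of the ALAN raster in mind.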

#### *2.4. Correction for Compactness*

The above assumption of monotonic and concentric distribution of ALAN emissions (Figure 5) holds only if the boundaries of FUAs are circularly shaped. However, if a FUA's shape is not circular, using the modal ALAN value as a delineation threshold would underestimate the actual area of the FUA. Figure A1 in Appendix A, which reports different FUAs' footprints, helps to illustrate this point. As this figure shows, the more a FUA's shape deviates from a perfect circle, the brighter the ALAN values that emerge as the most frequent. For such non-circular FUAs, it is thus necessary to correct for compactness, so as to account for the deviation of a FUA's shape from a perfect circle.

To perform such a correction, we first estimate the FUA's compactness (*c*), calculating it as the ratio between the area of a FUA and the area of its bounding circle [53,54]:

$$
c_{FUA} = \frac{S_{FUA}}{S_{BC}} \tag{1}
$$

where *S<sub>FUA</sub>* = area of a FUA; *S<sub>BC</sub>* = area of the bounding circle, calculated using the Minimum Bounding Geometry tool in the ArcGIS software.

Next, to represent FUAs that deviate from circular shapes, we model them as ellipses of the same compactness:

$$c_{El} = \frac{S_{El}}{S_{BC}} = \frac{\pi ab}{\pi a^2} = \frac{b}{a} = c_{FUA} \tag{2}$$

where *S<sub>El</sub>* = area of an ellipse with semi-axes *a* and *b* (*a* > *b*).

At the next step, to correct the initially estimated ALAN threshold (see Section 2.3) for a FUA's compactness, we calculate the radius, *r*, of the circle that has the maximal intersection with the ellipse. As shown in Box A1 in Appendix A, this radius is equal to:

$$
r = \sqrt{ab} \tag{3}
$$
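Equation (3) can be checked numerically. The Python sketch below is an independent illustration rather than the derivation from Box A1: it evaluates the Jaccard index between a circle of radius *r* and a concentric ellipse with semi-axes *a* > *b* on a grid of radii between *b* and *a*, using the closed-form antiderivative of √(k² − x²), and returns the radius with the highest index. All function names and the grid resolution are hypothetical.

```python
import math


def sqrt_integral(x, k):
    """Integral of sqrt(k^2 - t^2) dt from 0 to x, for 0 <= x <= k."""
    u = min(1.0, max(0.0, x / k))  # clamp against floating-point rounding
    return 0.5 * k * k * (u * math.sqrt(1.0 - u * u) + math.asin(u))


def jaccard_circle_ellipse(r, a, b):
    """Jaccard index of a circle (radius r) and an ellipse (a > b), b <= r <= a."""
    # x coordinate at which the circle and the ellipse intersect
    l = a * math.sqrt((r * r - b * b) / (a * a - b * b))
    ellipse_0_l = (b / a) * sqrt_integral(l, a)
    ellipse_l_a = (b / a) * (sqrt_integral(a, a) - sqrt_integral(l, a))
    circle_0_l = sqrt_integral(l, r)
    circle_l_r = sqrt_integral(r, r) - sqrt_integral(l, r)
    # first-quadrant intersection over first-quadrant union
    return (ellipse_0_l + circle_l_r) / (circle_0_l + ellipse_l_a)


def best_radius(a, b, steps=20000):
    """Grid search for the radius in [b, a] that maximizes the Jaccard index."""
    grid = [b + (a - b) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda r: jaccard_circle_ellipse(r, a, b))
```

For example, `best_radius(2.0, 1.0)` should land within one grid step of √2, in line with Equation (3).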

Lastly, we estimate the percentile of the ALAN value distribution, *p\**, corrected for compactness (see Box A1 in Appendix A for the justification):

$$p^* = \frac{2}{\pi} \arcsin\left(\frac{1-c}{1+c}\right) \tag{4}$$

According to Equation (4), with *p\** expressed in percent, for compact shapes that are close to a circle, i.e., for which *c* → 1, the optimal ALAN threshold percentile (*p\**) tends to the dimmest ALAN value (*p\** → 0), while for elongated shapes with *c* → 0, *p\** → 100, that is, the optimal ALAN threshold tends to the highest ALAN percentile (see Figure 7).
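The compactness correction is straightforward to compute. The following Python sketch evaluates Equations (1) and (4); the function names are hypothetical, and the result of Equation (4) is multiplied by 100 here so that it reads as a percentile, which is an assumption about the intended scale.

```python
import math


def compactness(fua_area, bounding_circle_area):
    """Eq. (1): ratio of the FUA's area to the area of its bounding circle."""
    return fua_area / bounding_circle_area


def optimal_percentile(c):
    """Eq. (4) rescaled to percent: 0 for a circle (c = 1), 100 as c -> 0."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("compactness must lie in [0, 1]")
    return 100.0 * (2.0 / math.pi) * math.asin((1.0 - c) / (1.0 + c))
```

As a worked case, an ellipse three times as long as it is wide has *c* = 1/3, so the argument of the arcsine is 0.5 and *p\** is one third of the range, i.e., roughly the 33rd percentile.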

**Figure 7.** Relationship between a FUA's compactness (*c*) and the optimal ALAN percentile (*p\**). Note: Shapes deviating from a perfect circle are assumed to be elliptical; see text for explanations.

#### *2.5. Regression Modelling*

After the optimal ALAN threshold is identified for each FUA by determining the modal ALAN value (see Section 2.3), and corrected for compactness (Section 2.4), we link the estimated threshold values to several explanatory variables, characterizing the FUA cores, so as to determine these variables' load on the optimal ALAN threshold value. To model these relationships, the following generic regression equation is used:

$$ALAN_i = b_0 + b_1 \cdot Lat_i + b_2 \cdot D_i + b_3 \cdot PD_i + b_4 \cdot PDD_i + \varepsilon_i \tag{5}$$

where *ALAN<sub>i</sub>* is the optimal ALAN threshold for FUA *i* (nW/cm<sup>2</sup>/sr); *Lat<sub>i</sub>* is the latitude of the FUA core's centroid (decimal degrees, dd); *D<sub>i</sub>* is the distance to the nearest major city, calculated between a given FUA core's centroid and the centroid of the nearest FUA with more than 1.5M residents (dd); *PD<sub>i</sub>* is the population density of the FUA core (persons per km<sup>2</sup>); *PDD<sub>i</sub>* is the population density decline gradient, calculated as the ratio between the FUA core's population density and the population density in the FUA core's buffer with a 5 km width for small FUAs (under 100,000 residents), a 15 km width for mid-sized FUAs (100,000–250,000 residents), and a 25 km width for large FUAs (over 250,000 residents); *b*<sub>0</sub>, ..., *b*<sub>4</sub> are regression coefficients, and *ε<sub>i</sub>* is a random error term.
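The size-dependent buffer used for the *PDD* variable can be expressed compactly. The Python sketch below encodes the three size classes given above; the function names are hypothetical, and assigning FUAs with exactly 100,000 or 250,000 residents to the middle class is an assumption, since the text only gives the ranges.

```python
def buffer_width_km(fua_residents):
    """Buffer width (km) around the FUA core, by FUA population size."""
    if fua_residents < 100_000:
        return 5       # small FUAs
    if fua_residents <= 250_000:
        return 15      # mid-sized FUAs (boundary handling is an assumption)
    return 25          # large FUAs


def density_decline_gradient(core_density, buffer_density):
    """PDD: core population density divided by the density in the buffer."""
    if buffer_density <= 0:
        raise ValueError("buffer density must be positive")
    return core_density / buffer_density
```

How the buffer's population density is measured (for instance, from a gridded population layer) is outside the scope of this sketch.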

The predictors used in the model are expected to contribute to the ALAN threshold's variance for different reasons. In particular, population density is known to be closely associated with ALAN flux (see inter alia [33,35,55]). The population density gradient, in turn, might capture changes in the pattern of population density around the FUA core, while distance to the nearest major city is likely to show how local development patterns are modulated by proximity to major urban concentrations [55]. In addition, as population concentrations in high latitudes often require more artificial illumination, especially during long winters [40], the FUA's latitude is also included in the model as a potential predictor.

In the analysis, we tested different functional forms of the models, and determined that the logarithmic transformation of the *PD* and *PDD* variables provides the best results, by improving the regression fit substantially (*p* < 0.05). The initial analysis was performed in the IBM SPSS v.25 software using its multiple regression module. To ensure the normality of the distribution of the dependent variable, *ALAN<sub>i</sub>*, we applied the Box-Cox transformation procedure to redefine the ALAN thresholds [56].
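For readers unfamiliar with the Box-Cox procedure, the Python sketch below shows the transformation and its inverse. It assumes the standard one-parameter Box-Cox form, which the text does not spell out; the default parameter of −0.55 is the value quoted in the caption of Table 3, and the function names are hypothetical.

```python
import math


def box_cox(y, lam=-0.55):
    """Standard one-parameter Box-Cox transform of a positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam=-0.55):
    """Map a transformed value back to the original ALAN scale."""
    if lam == 0:
        return math.exp(z)
    base = lam * z + 1.0
    if base <= 0:
        raise ValueError("value is outside the range of the transform")
    return base ** (1.0 / lam)
```

Predictions from a model fitted to the transformed thresholds would need to pass through `inverse_box_cox` before being used as thresholds in nW/cm<sup>2</sup>/sr.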

In addition to ordinary least squares (OLS) regressions, we also tested "random forest" regressions. Such regressions imply building an ensemble of "decision trees", each of which "votes" for a certain level of the dependent variable, with subsequent averaging of the estimates across all the decision trees [57]. In the present analysis, we implemented a standard realization of the "random forest" regression (the TreeBagger module) in the MATLAB v.R2020x software [58]. During the estimation procedure, two parameters were a matter of choice: the number of independent variables used for the construction of each individual decision tree, and the number of decision trees that comprise the forest. To ensure the comparability of the results, we used all independent variables covered by the analysis for the decision trees' construction, and set the number of trees to 100, which is usually considered to be a reasonable number for reaching a generalization error convergence (see for example [57,59]). Each decision tree was built on 80% of randomly selected observations.

#### *2.6. Adjustment for Contiguity*

When the analysis is performed, any given ALAN threshold level might identify several clusters of identically lit pixels, some of which might be related to a given FUA, while other pixels might be located elsewhere. Therefore, to identify the ALAN pixels relevant to a given FUA, the following analytical procedure was implemented. First, for each FUA, we identified pixels that overlap the FUA's core area, considering the core boundary information as an initial input (see Section 2.2: Data Sources). Next, for each pixel selected thereby, we analyzed all the pixels in its surroundings. If the ALAN value of a neighboring pixel was lower than or equal to that of the pixel under analysis but greater than the ALAN threshold identified for the FUA (see Sections 2.3 and 2.4), the pixel in question was considered to be a part of the FUA analyzed. We continued this procedure as long as all the pixels that satisfy the above criteria maintained spatial contiguity. Then, for each FUA, we selected local administrative units (LAUs) with most of their area (that is, >50%) occupied by the pixels identified thereby. These LAUs were considered to be a part of a given FUA (the MATLAB code for contiguity adjustment can be obtained from the authors upon request).
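The contiguity rule described above resembles a region-growing procedure. The following Python sketch illustrates the idea on a small raster; it is not the authors' MATLAB code. The function names, the use of an 8-pixel neighborhood, the pixel-to-LAU dictionary and the equal-area treatment of pixels are all assumptions made for illustration.

```python
from collections import Counter, deque


def grow_fua(alan, core_pixels, threshold):
    """Grow a FUA outward from its core pixels on a 2D ALAN grid.

    A neighbor joins if its value is <= that of the pixel being examined
    and > threshold. An 8-pixel neighborhood is assumed.
    """
    rows, cols = len(alan), len(alan[0])
    selected = set(core_pixels)
    queue = deque(core_pixels)
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0):
                    continue
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                if (nr, nc) in selected:
                    continue
                if threshold < alan[nr][nc] <= alan[r][c]:
                    selected.add((nr, nc))
                    queue.append((nr, nc))
    return selected


def select_laus(lau_of_pixel, selected, min_share=0.5):
    """Keep LAUs with more than `min_share` of their pixels selected.

    `lau_of_pixel` maps (row, col) to an LAU id; equal-area pixels assumed.
    """
    total, lit = Counter(), Counter()
    for pixel, lau in lau_of_pixel.items():
        total[lau] += 1
        if pixel in selected:
            lit[lau] += 1
    return {lau for lau in total if lit[lau] / total[lau] > min_share}
```

Because growth only proceeds toward equal or dimmer pixels, a bright cluster separated from the core by pixels below the threshold is never reached, which is the behavior the adjustment is meant to achieve.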

#### *2.7. Initial Validation*

To assess the performance of the estimated models (see Section 2.5), we analyzed the degree of correspondence between the empirically determined (see Section 2.4) and model-predicted ALAN thresholds adjusted for contiguity (see Section 2.6). To this end, the model estimated for France was used to predict the ALAN thresholds for individual FUAs in Spain and vice versa. In order to assess the extent to which the empirically determined and model-predicted ALAN thresholds coincide, we used different metrics, including Pearson correlation coefficients, standard error of the estimates (SEE), and weighted mean squared errors (WMSE).
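Of the three metrics mentioned above, the Pearson correlation coefficient is the simplest to reproduce. The Python sketch below computes it for paired lists of empirically determined and model-predicted thresholds; the function name is hypothetical. SEE and WMSE are not sketched here, because the text does not give their exact formulas or weights.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between two equally long sequences of numbers."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_o * var_p)
```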

Next, we compared the FUAs' delineations, either empirically determined using the modal ALAN values (see Section 2.4) and adjusted for contiguity (see Section 2.6), or model-predicted (see Section 2.5), with commuting-based FUAs' delineations (see Section 2.2). To perform such a comparison, we used the Jaccard Index (*JI*), which measures the size of the intersection of two sets relative to the size of their union [60]:

$$JI(FUA_C, FUA_T) = \frac{|FUA_C \cap FUA_T|}{|FUA_C \cup FUA_T|} \tag{6}$$

where *FUA<sub>C</sub>* = the set of local administrative units (LAUs) forming a FUA defined by commuting, and *FUA<sub>T</sub>* = the set of LAUs within either an empirically determined or model-predicted FUA boundary. The value of the index in question ranges from zero, when no intersection between the two sets is present, to one, when the two sets completely coincide and their intersection is equal to their union [60].
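Since both delineations are expressed as sets of LAUs, Equation (6) reduces to a few set operations. The Python sketch below is a minimal illustration; the function name and the LAU identifiers in the usage note are hypothetical, and returning zero for two empty sets is an assumed convention.

```python
def jaccard_index(fua_commuting, fua_threshold):
    """Eq. (6): |intersection| / |union| of two collections of LAU ids."""
    set_c, set_t = set(fua_commuting), set(fua_threshold)
    union = set_c | set_t
    if not union:
        return 0.0  # assumed convention for two empty delineations
    return len(set_c & set_t) / len(union)
```

For instance, if the commuting-based FUA consists of LAUs {a, b, c} and the ALAN-based one of {b, c, d}, two LAUs are shared out of four in total, so *JI* = 0.5.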

#### *2.8. Second-Step Validation*

For an additional validation, we applied the models estimated for France and Spain to FUAs in another European country—Austria (Figure 8).

**Figure 8.** FUAs in Austria used for the models' validation.

Although Austria differs from the two other countries under analysis in terms of size, urbanization level, topography, and FUAs' location, it was chosen for an additional model validation, to demonstrate that the estimated models perform reasonably well even in this specific case. As all FUAs in this country are located apart from each other (see Figure 8), this country is considered particularly suitable for the intended validation.

The validation procedure was carried out in the following four steps. First, we determined the optimal ALAN thresholds for each FUA empirically (see Sections 2.3 and 2.4). Second, we used the ALAN-threshold identification models, estimated for France and Spain (see Section 2.5), to predict optimal ALAN thresholds for FUAs in Austria, using relevant input variables (see Sections 2.2 and 2.5), and then adjusted these estimates for contiguity (see Section 2.6). Considering that FUAs in Austria are located in close proximity to international borders, the input information was not limited to the areas inside Austria only. For instance, population density-decline gradient and distance to the closest major city were calculated regardless of the state borders. Third, we assessed the correspondence between the empirically determined and model-predicted ALAN thresholds using Pearson correlation coefficients, SEE, and WMSE. Finally, we compared delineations, based on model-predicted ALAN thresholds, with commuting-based delineations, while expanding the study area by a 50-km buffer around the Austrian border, to cover the parts of FUAs located outside Austria and potentially extending into neighboring countries. As in the previous stage of the analysis (Section 2.7), the comparison of the shapes was performed using *JI*.

#### **3. Results**

#### *3.1. Optimal ALAN Thresholds*

The descriptive statistics of the ALAN thresholds, estimated by the multi-step approach described in Section 2.1, are reported in Table 2, separately for France and Spain, both as ALAN percentiles and actual ALAN levels in nW/cm<sup>2</sup>/sr. As evidenced by this table, the optimal ALAN thresholds identified for individual FUAs appear to vary widely, ranging from 0.15 to 9.91 nW/cm<sup>2</sup>/sr for France, and from 0.13 to 8.23 nW/cm<sup>2</sup>/sr for Spain.

**Table 2.** Descriptive statistics of the identified ALAN thresholds.


The most frequent (i.e., modal) ALAN values for all FUAs under analysis are reported in Figure 9, separately for France and Spain. In both countries, the modal ALAN values are not identical to the dimmest ones, thus indicating that the "circularity" assumption (see Section 2.3) is violated. The bottom sub-figures report ALAN thresholds, corrected for compactness using the approach described in Section 2.4. As can be seen from the comparison of the upper and bottom diagrams, modal ALAN thresholds corrected for compactness are closer to the dimmest ALAN values than before the correction (especially for France), although differences between the two distributions remain.

**Figure 9.** The modal and dimmest ALAN values estimated for individual FUAs before correcting for compactness ((**a**) = France; (**b**) = Spain) and after correcting for compactness ((**c**) = France; (**d**) = Spain). Notes: The column numbering (axis X) refers to FUA numbers listed in Table A1 of the Appendix. FUAs are sorted in an ascending order according to their modal ALAN values (upper diagrams) or according to compactness-based ALAN thresholds (bottom diagrams).

#### *3.2. Explaining the Variance of the Observed ALAN Thresholds*

In Table 3, we report the results of OLS analysis, linking individually determined optimal ALAN thresholds with geographic and socio-economic attributes of the FUAs' core areas. As evidenced by Table 3, the predictors used in the analysis help to explain ~74% of the ALAN threshold variance (R<sup>2</sup> = 0.739–0.740). Characteristically, in both models, significant predictors are nearly identical and exhibit the same signs: population density (+); population density gradient (−); latitude (+), and distance to the nearest major city (−) (*p* < 0.01).


**Table 3.** Factors affecting ALAN threshold values estimated for individual FUAs (Method—OLS; Dependent variable— ALAN optimal threshold level, Box-Cox transformed with *α* = −0.55).

Notes: <sup>a</sup> unstandardized regression coefficients; <sup>b</sup> standardized regression coefficients; <sup>c</sup> *t*-statistic and its significance level; *SEE* = standard error of the estimates; *WMSE* = weighted mean squared error; *F* = *F*-statistics; \* 0.01 significance level.

As random forest regressions do not provide explicit estimates of the explanatory variables' coefficients, we do not report these models here, but should remark that these estimates in terms of correlation with the actual ALAN threshold levels are similar to the OLS estimates reported in Table 3 (*r* = 0.856 for France and *r* = 0.883 for Spain, as opposed to *r* = 0.866 for France and *r* = 0.812 for Spain in the OLS models), while in terms of *SEE* they are poorer (*SEE* = 0.913 for France and *SEE* = 0.817 for Spain in comparison to *SEE* = 0.533 for France and *SEE* = 0.629 for Spain; see Table 3).

However, in terms of *WMSE,* random forest regressions are much better (*WMSE* = 0.945 for France and *WMSE* = 0.545 for Spain in comparison to *WMSE* = 4.521 for France and *WMSE* = 2.718 for Spain in the OLS models; see Table 3). Considering this result, we use the ALAN threshold estimates, produced by the random forest regressions, in the following analysis.

#### *3.3. Model Cross-Validation*

In Figure 10, we report the correspondence between the empirically determined and model-predicted ALAN thresholds. For this analysis, the model estimated for France (see Table 3) is applied to FUAs in Spain and vice versa. As evidenced by this figure, the estimates are fairly congruent, with *r* > 0.819.

**Figure 10.** Models cross-validation results for France (**a**) and Spain (**b**).

#### *3.4. Model-Estimated vs. Commuting-Based FUAs' Delineations*

Figure 11 shows several most successful examples of FUAs' delineations, generated by the proposed approach. Concurrently, in Figure 12, we report actual FUA delineations and model estimates for all FUAs in continental France and Spain. In addition, in Table 4, we report the degree of correspondence between the model-estimated and commuting-based delineations, assessed using *JI*.

**Figure 11.** Examples of FUAs featuring compactness-based boundaries (blue lines), model-based boundaries (green lines) and commuting-based boundaries (black lines): Paris (**a**) and Madrid (**b**) (see text for explanations).

**Figure 12.** Commuting-based (**a**) vs. model-estimated (**b**) delineations of FUAs in France and Spain.




**Table 4.** *Cont.*

As can be seen in Table 4, the calculated *JI* values range between 0.30 and 0.64, being higher for large FUAs (*JI* = 0.499–0.507) than for small FUAs (*JI* = 0.33–0.34). For densely populated FUAs, the match between the commuting-based and ALAN-based delineations is especially high, reaching 0.557–0.638, or 56–64% (see Table 4). [The *JI* values for all the French and Spanish FUAs are reported in Figure A2 in Appendix A].

#### *3.5. Second-Step Validation*

In Table 5, we report ALAN threshold values for FUAs in Austria, calculated using the 'French' and 'Spanish' models (Table 3), and compared to individually fitted ALAN thresholds. As evidenced by this table, the ALAN thresholds, estimated using the French and Spanish models, correspond to the individually fitted ALAN thresholds quite well, with *r* > 0.77 and *SEE* < 0.82. Yet, in terms of WMSE, the French model performs worse than the Spanish model (*WMSE* = 10.102 vs. *WMSE* = 0.711, respectively). In Figure 13, we report FUAs' delineations obtained by averaging the estimates of the French and Spanish models (see Table 3).




**Table 5.** *Cont.*

**Figure 13.** Commuting-based vs. models-estimated delineations of FUAs in Austria (see text for explanations).

#### **4. Discussion and Conclusions**

The delineation of geographic boundaries of FUAs is important for comparative urban studies. However, using commuting data for this task is not always feasible due to difficulties in data collection. In the present study, we suggested and tested an approach, based on the analysis of ALAN data. As ALAN is emitted from roads, frequented by commuters, and by buildings surrounding roads, ALAN emissions can be used, as we hypothesize, for the identification of FUAs.

We verify this hypothesis using data on commuting-based delineations available for France and Spain, applying a multi-step approach. First, we fit the ALAN threshold for each individual FUA, using the modal value of the ALAN frequency distribution. Next, we explain this threshold by a multiple regression analysis, using several characteristics of the FUAs' cores, such as latitude of the core's centroid, distance to the closest major city, population density, and density decline gradient. Although the boundaries of the FUA core areas used as an initial input are not generated by the analysis per se, such boundaries, if not a priori available, can be identified easily using Global Human Settlement [61] or LandScan [52] as contiguities of densely populated grids. Lastly, we cross-validate the obtained models for three European countries.

As our analysis indicates, the degree of correspondence between the individually fitted and model-predicted ALAN thresholds is relatively high (*r* > 0.819), with *Jaccard Index* values reaching up to 75% for France and up to 100% for Spain.

Our results are more robust than those obtained by Bosker and colleagues [18] for FUAs in Indonesia, according to which the correspondence between ALAN-based and 15% commuting-based FUA delineations did not exceed 28%. We explain the improvement, obtained in the present study, by the use of individually-fitted ALAN thresholds, based on the analysis of modal values, corrected for compactness.

To the best of our knowledge, this study is the first that estimates the optimal ALAN thresholds that approximate the boundaries of individual FUAs, using readily available, or easy-to-compute, characteristics of the FUAs' cores, such as latitude of the core's centroid, distance to the closest major city, population density and population density decline gradient, combined with ALAN flux data.

The proposed modelling approach might be useful for FUA delineations in countries and regions for which commuting data are unavailable, as well as for places in which commuting data are not updated on a regular basis, and for a comparative analysis of countries and regions that use different commuting-assessment procedures. Using our modelling approach, FUAs' boundaries can be determined in the following steps. First, the boundaries of FUAs' cores should be identified. If such boundaries are not readily available, they can be determined as contiguities of high-density grid cells, using input sources such as Global Human Settlement [61] or LandScan [52] grids. The procedure might follow the algorithm described in Dijkstra et al. (2019): grid cells with a population density of at least 1500 residents per km<sup>2</sup> are identified first. Afterward, the grid cells identified thereby are grouped into contiguous areas with a total population of at least 50,000 residents. For such areas, the development and locational characteristics are identified next, including the latitude of the contiguity's centroid, distance to the closest major city, population density and population density decline gradient (see Section 2.5). These characteristics of the core areas are then used as predictors in the ALAN-threshold estimation models, reported in Section 2.5 for either France or Spain, or both, to obtain the optimal ALAN threshold estimates for each individual FUA. Finally, a VIIRS-DNB raster is used to select pixels corresponding to the estimated ALAN threshold, and to identify LAUs associated with such pixels' contiguities, as detailed in Section 2.6.

The present study has several limitations. While for some FUAs, our estimates are quite accurate, reaching the levels of accuracy of 74–100%, for other, typically smaller FUAs, our estimations are less accurate. We assume that the reason might be that commuting-based boundaries rely mainly on work-related commuting, while omitting other human flows, such as travels for leisure, services, and social activities. In contrast, the suggested ALAN approach captures human activities at large. In addition, the ALAN-based approach might omit areas occupied by functions that operate mainly at daytime and emit much less light at night. For smaller FUAs, this source of error might be more pronounced than for large FUAs, where many functions operate around the clock. Another possible reason for a relatively low correspondence between some commuting- and ALAN-based delineations might be that many FUAs are not monocentric, or have a shape that is far from the circular or elliptic shapes considered in our modelling. For such cases, further studies might be needed to reflect more complex situations, in which a FUA is either polycentric, or adjacent to other FUAs so that their boundaries overlap or merge.

It should also be noted that in this study, we investigated the performance of the proposed method by applying it to three well-developed countries in Europe: France, Spain, and Austria. Yet, the question remains about the models' applicability to countries outside Europe and to countries in mid-latitudes, and, especially, to less-developed countries. We expect that applying the models to such countries might result in the overestimation of the optimal ALAN thresholds and thus in the underestimation of the commuting extent (the evidence for this conclusion is provided in [21]). Therefore, a follow-up investigation of the applicability of the proposed models to less developed countries and regions might be needed. Additionally, we need to acknowledge that a temporal mismatch between ALAN and actual FUAs' delineation exists. Whether it might affect the results of the analysis should be clarified in future studies, after newer commuting data become available.

**Author Contributions:** Conceptualization, B.A.P. and I.C.; methodology, N.R., S.R. and B.A.P.; software, S.R. and N.R.; formal analysis, N.R.; data curation, N.R.; writing—original draft preparation, N.R.; writing—review and editing, B.A.P., I.C. and N.R.; overall study supervision, B.A.P. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Council for Higher Education of Israel (postdoctoral scholarship of N.R.).

**Data Availability Statement:** Initial data and processing codes are available from N.R. upon request.

**Acknowledgments:** We express our gratitude to three anonymous reviewers for the highly valuable comments.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **Appendix A**

**Table A1.** FUAs' identification (ID number = number in axis X in Figure 9 in the main manuscript).



13 FR066 Saint-Brieuc 13 ES037 Puerto de Santa Maria, El



**Figure A1.** Light emission distribution from the center of a monocentric FUA, modelled by different geometric shapes (left panel) and distributions of ALAN in corresponding FUAs (right panel).

**Figure A2.** Jaccard Index for the estimated delineations, derived from the compactness-based (**a**,**b**) and model-based (**c**,**d**) ALAN thresholds: FUAs in France (**a**,**c**) and Spain (**b**,**d**). Note: The column numbering refers to FUA numbers listed in Table A2 below. In the graphs, FUAs are sorted in descending order according to their *JI* values.

**Table A2.** FUAs' identification (ID number = number on the X-axis in Figure A2 above).



#### **Box A1.** Estimation of the compactness-based ALAN threshold (derivation).

#### *A. Optimal Radius Calculation*

To calculate the radius of the circle (*r*\*), ensuring maximal intersection with the ellipse we should define and maximize Jaccard index: *J I* = <sup>|</sup>*SC*∩*SE*<sup>|</sup> <sup>|</sup>*SC*∪*SE*<sup>|</sup> <sup>→</sup> *max* , where *Sc* = area of the circle, and *SE* = area of the ellipse. For this sake, we should calculate and differentiate the following function:

$$JI(r) = \frac{\int_0^i y_{ellipse}(x)\,dx + \int_i^r y_{circle}(x)\,dx}{\int_0^i y_{circle}(x)\,dx + \int_i^a y_{ellipse}(x)\,dx} \tag{A1}$$

where *y<sub>circle</sub>* = √(*r*<sup>2</sup> − *x*<sup>2</sup>) is the equation of the circle and *y<sub>ellipse</sub>* = (*b*/*a*)√(*a*<sup>2</sup> − *x*<sup>2</sup>) is the equation of the ellipse. The limit of integration *i* is defined as the *x* coordinate of the intersection of *y<sub>circle</sub>* and *y<sub>ellipse</sub>*:

$$
\sqrt{r^2 - x^2} = \frac{b}{a}\sqrt{a^2 - x^2} \tag{A2}
$$

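Solving Equation (A2) for *x* gives the limit of integration:

$$i = a\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}$$

The integrals in Equation (A1) all have the same form and are evaluated in the same way: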

$$\begin{split} \int \sqrt{k^2 - x^2}\,dx &= \left\{ \begin{array}{l} x = k \sin(y) \\ dx = k \cos(y)\,dy \end{array} \right\} = \int \sqrt{k^2 - k^2 \sin^2(y)}\; k \cos(y)\,dy \\ &= \int k^2 \cos^2(y)\,dy = \frac{k^2}{2} \int (\cos(2y) + 1)\,dy = \frac{k^2}{2} \left( \frac{\sin(2y)}{2} + y \right) \\ &= \frac{k^2}{2} (\cos(y) \sin(y) + y) = \frac{k^2}{2} \left( \frac{x}{k} \sqrt{1 - \left(\frac{x}{k}\right)^2} + \arcsin\left(\frac{x}{k}\right) \right) \end{split} \tag{A3}$$
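The antiderivative in Equation (A3) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names, the midpoint rule, and the values of `k` and `x_upper` are illustrative assumptions.

```python
import math


def antiderivative(x, k):
    """Closed form from Equation (A3), integration constant set to zero."""
    u = x / k
    return 0.5 * k**2 * (u * math.sqrt(1.0 - u**2) + math.asin(u))


def midpoint_integral(f, lo, hi, n=100_000):
    """Plain midpoint rule; n is an arbitrary resolution for this sketch."""
    h = (hi - lo) / n
    return h * sum(f(lo + (j + 0.5) * h) for j in range(n))


# Illustrative values (assumptions, not taken from the paper).
k, x_upper = 3.0, 2.0

closed = antiderivative(x_upper, k) - antiderivative(0.0, k)
numeric = midpoint_integral(lambda t: math.sqrt(k**2 - t**2), 0.0, x_upper)

print(f"closed form: {closed:.8f}")
print(f"midpoint   : {numeric:.8f}")
```

With the upper limit set to *k*, the same closed form returns the quarter-circle area π*k*<sup>2</sup>/4, which is the source of the π*r*<sup>2</sup>/4 and π*ab*/4 terms in the later equations.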

Using the equations of the circle and ellipse, the limit of integration *i* (Equation (A2)), and the antiderivative in Equation (A3), we evaluate the integrals in *JI(r)* one by one. The first integral in the numerator is:

$$\begin{split} \int_0^i y_{ellipse}(x)\,dx &= \left. \frac{b}{a} \frac{a^2}{2} \left( \frac{x}{a} \sqrt{1 - \left(\frac{x}{a}\right)^2} + \arcsin\left(\frac{x}{a}\right) \right) \right|_0^{a\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}} \\ &= \frac{ab}{2} \left( \frac{\sqrt{r^2 - b^2}\sqrt{a^2 - r^2}}{a^2 - b^2} + \arcsin\left(\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) \right) \end{split} \tag{A4}$$

The second integral in the numerator is:

$$\begin{split} \int_i^r y_{circle}(x)\,dx &= \left. \frac{r^2}{2} \left( \frac{x}{r} \sqrt{1 - \left(\frac{x}{r}\right)^2} + \arcsin\left(\frac{x}{r}\right) \right) \right|_{a\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}}^{r} \\ &= \frac{\pi r^2}{4} - \frac{ab\sqrt{r^2 - b^2}\sqrt{a^2 - r^2}}{2(a^2 - b^2)} - \frac{r^2}{2} \arcsin\left(\frac{a}{r} \sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) \end{split} \tag{A5}$$


The numerator of *JI(r)* is the sum of (A4) and (A5):

$$\begin{split} \int_0^i y_{ellipse}(x)\,dx + \int_i^r y_{circle}(x)\,dx ={} & \frac{ab}{2} \arcsin\left(\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) - \frac{r^2}{2} \arcsin\left(\frac{a}{r}\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) \\ & + \frac{\pi r^2}{4} \end{split} \tag{A6}$$

The denominator of *JI(r)*, representing the union of *S<sub>C</sub>* and *S<sub>E</sub>*, equals the sum of the quarter areas of the circle and the ellipse minus their intersection, given by Equation (A6):

$$\begin{split} & \int_0^i y_{circle}(x)\,dx + \int_i^a y_{ellipse}(x)\,dx \\ &= \frac{1}{4} \left( \pi r^2 + \pi ab \right) - \left( \frac{ab}{2} \arcsin\left( \sqrt{\frac{r^2 - b^2}{a^2 - b^2}} \right) - \frac{r^2}{2} \arcsin\left( \frac{a}{r} \sqrt{\frac{r^2 - b^2}{a^2 - b^2}} \right) + \frac{\pi r^2}{4} \right) \\ &= \frac{r^2}{2} \arcsin\left( \frac{a}{r} \sqrt{\frac{r^2 - b^2}{a^2 - b^2}} \right) - \frac{ab}{2} \arcsin\left( \sqrt{\frac{r^2 - b^2}{a^2 - b^2}} \right) + \frac{\pi ab}{4} \end{split} \tag{A7}$$

Thus, *JI(r)* is equal to

$$\begin{split} JI(r) &= \frac{ab \arcsin\left(\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) - r^2 \arcsin\left(\frac{a}{r}\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) + \frac{\pi r^2}{2}}{-ab \arcsin\left(\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) + r^2 \arcsin\left(\frac{a}{r}\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) + \frac{\pi ab}{2}} \\ &= \left\{ z = ab \arcsin\left(\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) - r^2 \arcsin\left(\frac{a}{r}\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) \right\} = \frac{z + \frac{\pi r^2}{2}}{-z + \frac{\pi ab}{2}} \end{split} \tag{A8}$$
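As a sanity check on Equation (A8), the closed form can be compared with a brute-force evaluation of the intersection and union areas in the first quadrant. This is a minimal sketch under stated assumptions: the function names, the midpoint rule with `n` steps, and the example values of `a`, `b`, and `r` are illustrative and do not come from the paper.

```python
import math


def jaccard_closed_form(r, a, b):
    """JI(r) as in Equation (A8); assumes b <= r <= a and a > b."""
    s = math.sqrt((r**2 - b**2) / (a**2 - b**2))
    # min() guards against rounding pushing the arcsine argument above 1.
    z = a * b * math.asin(s) - r**2 * math.asin(min(1.0, a * s / r))
    return (z + math.pi * r**2 / 2) / (-z + math.pi * a * b / 2)


def jaccard_numeric(r, a, b, n=200_000):
    """Midpoint-rule areas of min/max of the two curves on [0, a]."""
    h = a / n
    inter = union = 0.0
    for j in range(n):
        x = (j + 0.5) * h
        y_circle = math.sqrt(max(r**2 - x**2, 0.0))   # zero beyond x = r
        y_ellipse = (b / a) * math.sqrt(a**2 - x**2)
        inter += min(y_circle, y_ellipse)
        union += max(y_circle, y_ellipse)
    return inter / union


# Illustrative ellipse and circle (assumed values).
a, b, r = 4.0, 1.0, 2.5
print(jaccard_closed_form(r, a, b), jaccard_numeric(r, a, b))
```

At both ends of the admissible range, *r* = *b* and *r* = *a*, the closed form reduces to *b*/*a*, as expected when one shape lies entirely inside the other.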

Differentiating Equation (A8) by the quotient rule and dropping the positive factor (π/2)/(−*z* + π*ab*/2)<sup>2</sup>, which does not affect the sign, the derivative of *JI(r)* is proportional to:

$$\begin{split} \frac{d(JI)}{dr} &\propto z'(ab+r^2) - 2zr + \pi rab \\ &= \left\{ z = ab \arcsin\left(\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) - r^2 \arcsin\left(\frac{a}{r}\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right), \quad z' = -2r \arcsin\left(\frac{a}{r}\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) \right\} \\ &= -2r \arcsin\left(\frac{a}{r}\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right)(ab+r^2) \\ &\quad - 2r\left(ab \arcsin\left(\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) - r^2 \arcsin\left(\frac{a}{r}\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right)\right) + \pi rab \\ &= -2rab\left(\arcsin\left(\frac{a}{r}\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right) + \arcsin\left(\sqrt{\frac{r^2 - b^2}{a^2 - b^2}}\right)\right) + \pi rab \end{split} \tag{A9}$$

Here *z*′ takes this simple form because the terms arising from differentiating the two arcsines cancel each other.

To find the maximum of *JI(r)*, we set its derivative equal to zero and solve for *r*:

$$\begin{array}{l} -2rab\left(\arcsin\left(\frac{a}{r}\sqrt{\frac{r^2-b^2}{a^2-b^2}}\right)+\arcsin\left(\sqrt{\frac{r^2-b^2}{a^2-b^2}}\right)\right)+\pi rab=0;\\ \arcsin\left(\frac{a}{r}\sqrt{\frac{r^2-b^2}{a^2-b^2}}\right)+\arcsin\left(\sqrt{\frac{r^2-b^2}{a^2-b^2}}\right)=\frac{\pi}{2};\\ \sin\left(\arcsin\left(\frac{a}{r}\sqrt{\frac{r^2-b^2}{a^2-b^2}}\right)+\arcsin\left(\sqrt{\frac{r^2-b^2}{a^2-b^2}}\right)\right)=\\ \left\{\sin(\alpha+\beta)=\sin(\alpha)\sqrt{1-\sin^2(\beta)}+\sqrt{1-\sin^2(\alpha)}\sin(\beta)\right\}=\\ \frac{\sqrt{r^2-b^2}\sqrt{a^2-r^2}}{r(a-b)}=\sin\left(\frac{\pi}{2}\right)=1;\\ \sqrt{r^2-b^2}\,\sqrt{a^2-r^2}=r(a-b);\\ \left(r^2-b^2\right)\left(a^2-r^2\right)=r^2(a-b)^2;\\ \left(r^2-ab\right)^2=0;\\ r^2=ab;\\ r=\sqrt{ab} \end{array} \tag{A10}$$
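The result *r* = √(*ab*) can also be recovered numerically by solving the stationarity condition in the second line of (A10) with bisection on [*b*, *a*]. The sketch below is illustrative only: the function names, the tolerance, and the test ellipses are assumptions, and bisection is used here simply as a convenient root finder, not as a method taken from the paper.

```python
import math


def stationarity_gap(r, a, b):
    """pi/2 minus the sum of the two arcsines in (A10); zero at the optimum."""
    s = math.sqrt((r**2 - b**2) / (a**2 - b**2))
    u = min(1.0, a * s / r)  # guard against rounding slightly above 1
    return math.pi / 2 - math.asin(u) - math.asin(s)


def solve_radius(a, b, tol=1e-12):
    """Bisection on [b, a]: the gap is +pi/2 at r = b and -pi/2 at r = a."""
    lo, hi = b, a
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if stationarity_gap(mid, a, b) > 0:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies at or to the left of mid
    return 0.5 * (lo + hi)


# Illustrative ellipses (assumed values): compare with sqrt(a * b).
for a, b in [(4.0, 1.0), (9.0, 4.0), (5.0, 2.0)]:
    print(a, b, solve_radius(a, b), math.sqrt(a * b))
```

Because both arcsine terms increase monotonically with *r* on [*b*, *a*], the condition has a single root in that interval, so the bisection bracket is well defined.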

#### *B. Optimal Percentile Calculation*

The optimal percentile *p*\* equals the share of area (2) (see figure above), that is, the part of the ellipse lying outside the circle, in the total area of the ellipse:

$$p^* = \frac{\int_i^a y_{ellipse}(x)\,dx - \int_i^r y_{circle}(x)\,dx}{\frac{\pi ab}{4}}\tag{A11}$$

For the optimal radius *r* = √(*ab*), the limit of integration *i* equals:

$$a\sqrt{\frac{r^2 - b^2}{a^2 - b^2}} = a\sqrt{\frac{b}{a+b}}\tag{A12}$$


Thus, using the equations of the circle and ellipse, the limit of integration *i* (Equation (A12)), and the antiderivative in Equation (A3), *p*\* equals:

$$\begin{split} p^* &= \frac{1}{\frac{\pi ab}{4}} \left( \left. \frac{b}{a}\frac{a^2}{2}\left( \frac{x}{a}\sqrt{1-\left(\frac{x}{a}\right)^2} + \arcsin\left(\frac{x}{a}\right) \right) \right|_{a\sqrt{\frac{b}{a+b}}}^{a} - \left. \frac{ab}{2}\left( \frac{x}{\sqrt{ab}}\sqrt{1-\left(\frac{x}{\sqrt{ab}}\right)^2} + \arcsin\left(\frac{x}{\sqrt{ab}}\right) \right) \right|_{a\sqrt{\frac{b}{a+b}}}^{\sqrt{ab}} \right) \\ &= \frac{1}{\frac{\pi ab}{4}} \left( \left( \frac{\pi ab}{4} - \frac{ab}{2}\left( \frac{\sqrt{ab}}{a+b} + \arcsin\sqrt{\frac{b}{a+b}} \right) \right) - \left( \frac{\pi ab}{4} - \frac{ab}{2}\left( \frac{\sqrt{ab}}{a+b} + \arcsin\sqrt{\frac{a}{a+b}} \right) \right) \right) \\ &= \frac{2}{\pi}\left( \arcsin\sqrt{\frac{a}{a+b}} - \arcsin\sqrt{\frac{b}{a+b}} \right) \end{split} \tag{A13}$$

Since sin(*α* − *β*) = sin(*α*)√(1 − sin<sup>2</sup>(*β*)) − √(1 − sin<sup>2</sup>(*α*)) sin(*β*),

$$\sin\left( \arcsin\sqrt{\frac{a}{a+b}} - \arcsin\sqrt{\frac{b}{a+b}} \right) = \frac{a}{a+b} - \frac{b}{a+b} = \frac{a-b}{a+b},$$

and then

$$p^* = \frac{2}{\pi} \arcsin\left(\frac{a-b}{a+b}\right) \tag{A14}$$

Finally, we define the compactness *c* of the ellipse with semi-axes *a* and *b* (*a* > *b*) as the ratio between the ellipse's area and the area of its bounding circle:

$$c = \frac{\text{Area of Ellipse}}{\text{Area of Bounding Circle}} = \frac{\pi ab}{\pi a^2} = \frac{b}{a}$$

The optimal percentile *p*\* is then equal to

$$p^\* = \frac{2}{\pi} \arcsin\left(\frac{a-b}{a+b}\right) = \frac{2}{\pi} \arcsin\left(\frac{\frac{a}{a} - \frac{b}{a}}{\frac{a}{a} + \frac{b}{a}}\right) = \frac{2}{\pi} \arcsin\left(\frac{1-c}{1+c}\right) \tag{A15}$$
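Equation (A15) is straightforward to evaluate. The sketch below computes *p*\* for a few compactness values and cross-checks it against the algebraically equivalent two-arcsine form in terms of the semi-axes, (2/π)[arcsin√(*a*/(*a*+*b*)) − arcsin√(*b*/(*a*+*b*))]. The function names and the sample values are illustrative assumptions, not part of the original text.

```python
import math


def optimal_percentile(c):
    """p* from Equation (A15); c = b/a is the compactness, 0 < c <= 1."""
    if not 0.0 < c <= 1.0:
        raise ValueError("compactness must lie in (0, 1]")
    return (2.0 / math.pi) * math.asin((1.0 - c) / (1.0 + c))


def percentile_from_axes(a, b):
    """Equivalent form in the semi-axes a > b (difference of two arcsines)."""
    alpha = math.asin(math.sqrt(a / (a + b)))
    beta = math.asin(math.sqrt(b / (a + b)))
    return (2.0 / math.pi) * (alpha - beta)


# Illustrative compactness values (assumed, not taken from the paper).
for c in (1.0, 0.75, 0.5, 0.25, 0.1):
    print(f"c = {c:4.2f}  ->  p* = {optimal_percentile(c):.4f}")

# Both forms should agree for any ellipse, e.g. a = 4, b = 1 (c = 0.25).
print(percentile_from_axes(4.0, 1.0), optimal_percentile(0.25))
```

A perfectly circular shape (*c* = 1) gives *p*\* = 0, and *p*\* grows towards 1 as the shape becomes more elongated (*c* → 0).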


