#### 2.2.2. NPP-VIIRS Night-time Light Data

NPP-VIIRS night-time light (NTL) data were also used in this study. Compared with the Defense Meteorological Satellite Program/Operational Linescan System (DMSP/OLS) data, the NPP-VIIRS night-time light data have a higher spatial resolution (15 arc-seconds, about 750 m) and a wider radiometric detection range [22,27]. These data can be obtained from the website of NOAA's National Centers for Environmental Information (NOAA/NCEI) [28]. However, because they are a preliminary product, these data are not filtered to remove detected light associated with gas flares, fires, volcanoes, or aurorae, and the dataset has not been processed to remove background noise [29]. In addition, the VIIRS annual night-time light data are being discontinued by NOAA, and only annual data for 2015 and 2016 are supported [30]. Therefore, the 'Flint' annual data were obtained from the Chinese Academy of Sciences [31]. These data are not affected by fires, volcanoes, or background noise because they have undergone statistical cleaning and average noise reduction preprocessing. These annual products can therefore be regarded as surface light, and 'Flint' version Beta 1 [32] was used in this study. The 'Flint' imagery consists of 15 arc-second grids spanning −180 to 180 degrees longitude and −65 to 75 degrees latitude. The digital number (DN) values of the pixels range from 0 to 255. The 'Flint' night-time light data for India in 2018 are shown in Figure 4.

**Figure 4.** 'Flint' India light data for 2018 (including Jammu and Kashmir state).

#### 2.2.3. Auxiliary Data

Indian national, state, and taluk boundaries were acquired from the Global Administrative Areas (GADM) database provided by the Center for Spatial Sciences at the University of California, Davis [33]. The latest version (version 3.6, released on 6 May 2018) was used. A coordinate reference system based on the WGS84 datum was adopted for the boundary files. To support the verification of heavy industry heat sources in India, high-resolution images from Google Earth were also used.

## *2.3. Data Preprocessing*

The long-term time series of active fire/hotspot data was very large, and the 'Flint' data were global in extent; therefore, some preprocessing was necessary for this study. To obtain information about heavy industry heat sources in India, the VNP14IMG and NTL data were processed as shown in Figure 5. The processing consisted of two main parts: data preprocessing and a heavy industry heat source detection model.

**Figure 5.** The architecture of the heavy industry heat source detection model using the active fire/hotspot data and night-time light data for India.

#### 2.3.1. NPP-VIIRS Active Fire/Hotspot Data Preprocessing

For the same reason given in a previous paper [2], the long-term time series of VNP14IMG products also needed to be divided. A single heavy industry area is very unlikely to be split across two or more administrative taluks in India. Therefore, the 3,998,465 fire hotspots were divided according to the taluk-level administrative boundaries.
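The division of hotspots by taluk can be illustrated with a minimal sketch. It assumes each taluk is a simple polygon given as a list of (lon, lat) vertices and uses a standard ray-casting point-in-polygon test; the function names and toy data are hypothetical and are not taken from the original processing chain, which would normally rely on GIS software and the GADM boundary files.

```python
from collections import defaultdict


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def split_by_taluk(hotspots, taluks):
    """Group (lon, lat) hotspots by the first taluk polygon that contains them."""
    groups = defaultdict(list)
    for lon, lat in hotspots:
        for name, polygon in taluks.items():
            if point_in_polygon(lon, lat, polygon):
                groups[name].append((lon, lat))
                break
    return dict(groups)


# Hypothetical toy data: two square "taluks" and three hotspots.
taluks = {
    "taluk_A": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "taluk_B": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}
hotspots = [(0.25, 0.5), (1.5, 0.5), (0.75, 0.25)]
groups = split_by_taluk(hotspots, taluks)
```

Each taluk-level group can then be processed independently, which keeps the size of any single clustering task manageable.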

#### 2.3.2. Preprocessing of NPP-VIIRS Night-time Light Data

Most heavy industrial production activities also require lighting. Therefore, superimposed night-time light data can be used to verify industrial heat sources and filter out false ones. In addition, owing to economic problems or policy decisions, including regional plans and environmental protection policies, only a small fraction of large heavy industry enterprises operated continuously between 2012 and 2018; most enterprises operated for only a few years or months. Thus, some preprocessing of the data was needed. The main processing steps were as follows.

Step 1: The annual and global 'Flint' night-time light data were clipped according to the Indian national boundary to obtain annual Indian 'Flint' night-time light data.

Step 2: The annual Indian 'Flint' night-time light data were re-sampled from 750 m to 375 m in order to match the spatial resolution of the VNP14IMG products.

Step 3: Maximum night-time light data were produced by selecting the maximum value from the annual Indian night-time light data for 2012 to 2018.
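Steps 2 and 3 above can be sketched as follows, assuming the annual grids have already been clipped to India (Step 1) and are held as small nested lists. The text does not state which resampling method was used, so nearest-neighbour replication by an integer factor of 2 (750 m to 375 m) is an assumption here, and the function names and toy values are hypothetical.

```python
def resample_nearest(grid, factor):
    """Upsample a 2-D grid by an integer factor using nearest-neighbour replication."""
    out = []
    for row in grid:
        # Repeat every pixel 'factor' times along the row ...
        fine_row = [value for value in row for _ in range(factor)]
        # ... and repeat the resulting row 'factor' times.
        out.extend([list(fine_row) for _ in range(factor)])
    return out


def max_composite(annual_grids):
    """Per-pixel maximum across a list of equally sized 2-D grids."""
    rows, cols = len(annual_grids[0]), len(annual_grids[0][0])
    return [
        [max(grid[r][c] for grid in annual_grids) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical annual DN grids (0 to 255) for two years at the coarse resolution.
annual = {
    2012: [[0, 10], [3, 4]],
    2013: [[5, 2], [3, 9]],
}
fine = {year: resample_nearest(grid, 2) for year, grid in annual.items()}
max_ntl = max_composite(list(fine.values()))
```

Taking the per-pixel maximum over all years keeps a facility visible in the composite even if it was lit in only one of the years, which is consistent with the observation that many enterprises operated for only part of the study period.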

#### *2.4. Heavy Industry Heat Source Detection Model*

In this study, we propose an Indian heavy industry heat source detection model that uses VNP14IMG and NTL data. This model consists of six parts: constructing the heat source object detection model using real-time VNP14IMG data, extracting the hot features of the heat source objects, detecting the initial heavy industrial heat sources based on an empirical threshold, calculating the mean night-time light value for each heavy industrial heat source object, detecting the final heavy industrial heat sources based on the empirical threshold for the mean night-time light, and, finally, assessing the results. Details of the model are described in this section.

Step 1: Construction of the heat source object detection model. Static and persistent industrial heat sources in the VNP14IMG time series were found to be concentrated around hot centers owing to their stable positions and temporal consistency. A heat source object detection model based on an improved adaptive K-means algorithm was therefore applied to the long-term VNP14IMG data [2].

Step 2: Extraction of the hot features of heat source objects. In this study, geometric, statistical, and heat source attribute features were used. The central point of the heat source, as well as the width and the height of the max-circumscribed rectangle, were used as the geometric features. For the statistical features, the number of fires/hotspots, the density of fires/hotspots per unit area, the initial and final detection times of the heat source object, and the mean and variance of the time intervals between detections sorted by date were adopted. For the heat source attributes, the minimum, maximum, mean, and variance of the VIIRS I-4 band brightness temperature (bright\_ti4), the I-5 band brightness temperature (bright\_ti5), the scan direction pixel size (scan), the track direction pixel size (track), and the fire radiative power (FRP) were extracted for each heat source object.

Step 3: Heavy industrial heat source objects are static and persistent, whereas biomass fires are usually sparsely distributed. The initial heavy industrial heat source identification was based on an empirical threshold [2]. Subsequently, the initial heavy industry heat sources were identified from heat-source objects.

Step 4: Once the vector data of the initial heavy industry heat sources had been registered to the raster of maximum night-time light data using the same WGS84 coordinate reference system, the mean night-time light value was calculated for each initial heavy industrial heat source object.

Step 5: The final detection of the heavy industry heat sources was carried out by applying the empirical threshold algorithm to the mean night-time light data.

Step 6: Assessment of results. The number of working heavy industry heat sources (NWH), the total number of fire hotspots for each working heavy industry heat source area (NFHWH), as well as *Slope*\_*NWH* and *Slope*\_*NFHWH* [2], were used to analyze the distribution of the heavy industry heat sources in different statistical areas for different years.
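Steps 4 and 5 above can be illustrated with a minimal sketch. It assumes each initial heat source object is approximated by its bounding rectangle, that the maximum night-time light grid is a north-up raster with a known upper-left corner and cell size, and that objects whose mean light value exceeds the threshold are retained. The threshold value, function name, and toy data are hypothetical; the empirical threshold used in the study is not given in this section.

```python
def mean_ntl(grid, origin_lon, origin_lat, cell, bbox):
    """Mean DN of grid cells whose centres fall inside bbox.

    grid[0] is the northernmost row; (origin_lon, origin_lat) is the
    upper-left corner; bbox = (lon_min, lat_min, lon_max, lat_max).
    """
    lon_min, lat_min, lon_max, lat_max = bbox
    values = []
    for r, row in enumerate(grid):
        lat = origin_lat - (r + 0.5) * cell  # latitude of the cell centre
        if not (lat_min <= lat <= lat_max):
            continue
        for c, dn in enumerate(row):
            lon = origin_lon + (c + 0.5) * cell  # longitude of the cell centre
            if lon_min <= lon <= lon_max:
                values.append(dn)
    return sum(values) / len(values) if values else None


# Hypothetical 2 x 2 maximum night-time light grid with 1-degree cells.
max_ntl = [[10, 60], [20, 0]]
# Hypothetical bounding rectangles of two initial heat source objects.
objects = {"A": (0.0, 1.0, 2.0, 2.0), "B": (1.0, 0.0, 2.0, 1.0)}
NTL_THRESHOLD = 5.0  # assumed value, for illustration only

means = {
    name: mean_ntl(max_ntl, 0.0, 2.0, 1.0, bbox) for name, bbox in objects.items()
}
final_sources = [
    name for name, m in means.items() if m is not None and m > NTL_THRESHOLD
]
```

In this toy case object A is well lit and is kept, whereas object B has no night-time light and is filtered out as a likely false detection.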

#### **3. Results and Discussion**

#### *3.1. Heavy Industrial Heat Source Distribution Characteristics at the National Level*

The spatial distribution of the 711 heavy industrial heat sources in India (Figure 6) shows that they were mainly concentrated in north-east Assam, east-central Jharkhand, the north of Chhattisgarh and Odisha, and the coastal areas of Gujarat and Maharashtra. Another interesting phenomenon was that a large number of heavy industrial heat sources lay close to a line between Kolkata in eastern India and Mumbai in western India. The spatial distribution of the 711 heavy industrial heat sources across India differed from the spatial density distribution of the 3,998,465 fire hotspots (Figure 2), especially in Punjab and Madhya Pradesh. Further investigation revealed that most of the fire hotspots in Punjab and Madhya Pradesh were due to straw burning, especially in May, October, and November. Additionally, the heavy industrial heat sources found in regions 1, 2, and 3 were mainly connected to petroleum development, whereas those in region 4 were linked to coal mining and steel production. Each detected heavy industrial heat source was verified individually using Google Earth imagery. Of the 711 heat sources, 659 could be readily confirmed as heavy industrial facilities from the Google Earth images. The type of the other 52 could not be ascertained owing to the lack of field-measured data. The accuracy of the detection model was therefore at least 659/711, or approximately 92.7%. As a database of actual heavy industrial heat sources was not available, the recall ratio could not be calculated.

**Figure 6.** Spatial distribution of 711 heavy industrial heat sources in Indian regions (including Jammu and Kashmir state).

Recent changes in working heavy industry heat sources were compared, and the values of NWH and NFHWH were calculated for each year during the period 2012 to 2018 (Figure 7). Both NWH and, in particular, NFHWH increased during this period. The trends in GDP and total population in India (Figure 8) between 2012 and 2017 were similar [24], suggesting that heavy industry developed along with the Indian economy as a whole.
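The yearly indicators can be sketched as follows. The precise definitions of *Slope*\_*NWH* and *Slope*\_*NFHWH* are given in [2] and are not reproduced here, so this sketch assumes that a heat source counts as "working" in a year if at least one hotspot was detected in it that year, and that each slope is the ordinary least-squares slope of the yearly value against the year. The function names and toy records are hypothetical.

```python
from collections import Counter


def yearly_counts(records):
    """records: list of (source_id, year) tuples, one entry per fire hotspot."""
    # NFHWH: total hotspots per year inside working heat source areas.
    nfhwh = Counter(year for _, year in records)
    # NWH: distinct heat sources with at least one hotspot in that year.
    nwh = Counter(year for _, year in set(records))
    return dict(nwh), dict(nfhwh)


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs (assumed slope definition)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Hypothetical hotspot records for three heat sources over three years.
records = [
    ("s1", 2012), ("s1", 2012),
    ("s1", 2013), ("s2", 2013), ("s2", 2013),
    ("s1", 2014), ("s2", 2014), ("s3", 2014), ("s3", 2014),
]
nwh, nfhwh = yearly_counts(records)
years = sorted(nwh)
slope_nwh = ols_slope(years, [nwh[y] for y in years])
slope_nfhwh = ols_slope(years, [nfhwh[y] for y in years])
```

A positive slope in this sketch indicates that the number of working heat sources, or of hotspots within them, grew over the period in the statistical area considered.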

**Figure 7.** Changes in heavy industrial heat sources in India. (**a**) The number of working heavy industrial heat sources (NWH) during the period 2012 to 2018. (**b**) The number of fire hotspots in working heavy industrial heat source areas (NFHWH) during the period 2012 to 2018.

**Figure 8.** Changes in GDP and in the total population of India during the period 2012 to 2017. (**a**) GDP (current billion US\$) between 2012 and 2017. (**b**) Total population (billion persons) between 2012 and 2017.

High-resolution images from Google Earth (Figure 9) were selected to verify the results of the model. Figure 9a,b shows steel plants in Jharkhand and West Bengal. The two open-pit mines shown in Figure 9c,d are located in Jharkhand and Chhattisgarh. Figure 9e–h,j shows facilities related to oil and gas production, processing, and storage in Andhra Pradesh, Gujarat, Rajasthan, and Assam. Figure 9i shows the Gagal cement works in Himachal Pradesh.

**Figure 9.** High-resolution imagery used to validate the model.
