**3. Materials and Methods**

#### *3.1. Dataset Description and Preprocessing*

We merged two sources of data for this analysis. The first was obtained from the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) [30] satellite-based dataset. It provides PM2.5, surface wind speed (m/s), surface air temperature (K), total cloud area fraction, dew point temperature at 2 m (K), 2 m eastward wind (m/s), and 2 m northward wind (m/s). The second was extracted from the National Renewable Energy Laboratory (NREL) [31] for 2013. It contains DHI, DNI, GHI, clear-sky DHI, clear-sky DNI, clear-sky GHI, and solar zenith angle. The clearness index at time t (denoted by Kt) was calculated from the GHI values. The two datasets were merged on latitude and longitude: for each location (a unique combination of latitude and longitude), records were matched within a 10 km radius. Table 2 describes the nine solar stations studied in this paper.
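The radius-based merge can be illustrated with a minimal sketch. It assumes that each station is paired with the nearest grid point of the other dataset lying within 10 km, using great-circle (haversine) distance; the text does not state how ties or multiple candidates were handled, so this pairing rule, the function names, and the example coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def match_within_radius(stations, grid_points, radius_km=10.0):
    """Map each station to its nearest grid point within radius_km, or None.

    Assumption (not from the paper): the nearest candidate wins.
    """
    matches = {}
    for name, (slat, slon) in stations.items():
        best_id, best_dist = None, radius_km
        for gid, (glat, glon) in grid_points.items():
            dist = haversine_km(slat, slon, glat, glon)
            if dist <= best_dist:
                best_id, best_dist = gid, dist
        matches[name] = best_id
    return matches


# Hypothetical coordinates, for illustration only.
stations = {"site_a": (36.00, -115.00), "site_b": (40.00, -100.00)}
grid_points = {"g1": (36.05, -115.00), "g2": (36.50, -115.00)}
matches = match_within_radius(stations, grid_points)
```

Here `site_a` lies about 5.6 km from `g1` and is matched to it, while `site_b` has no grid point within 10 km and is left unmatched. For large grids, a spatial index would be a more efficient choice than this pairwise loop.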

**Table 2.** Dataset description.


After the solar time-series data were collected from the PV module, they were stored in a database, and a series of standard preprocessing steps was applied.


**Figure 4.** Sliding-window approach.
