**Combined Multi-Layer Feature Fusion and Edge Detection Method for Distributed Photovoltaic Power Station Identification**

**Yongshi Jie <sup>1,2,3</sup>, Xianhua Ji <sup>4</sup>, Anzhi Yue <sup>1,3,5,</sup>\*, Jingbo Chen <sup>1,3</sup>, Yupeng Deng <sup>1,2</sup>, Jing Chen <sup>1,2</sup> and Yi Zhang <sup>1,2</sup>**


Received: 29 November 2020; Accepted: 18 December 2020; Published: 21 December 2020

**Abstract:** Distributed photovoltaic power stations are an effective way to develop and utilize solar energy resources. Using high-resolution remote sensing images to obtain the locations, distribution, and areas of distributed photovoltaic power stations over a large region is important to energy companies, government departments, and investors. In this paper, a deep convolutional neural network was used to extract distributed photovoltaic power stations from high-resolution remote sensing images automatically, accurately, and efficiently. Based on a semantic segmentation model with an encoder-decoder structure, a gated fusion module was introduced to address the problem that small photovoltaic panels are difficult to identify. Further, to solve the problems of blurred edges in the segmentation results and that adjacent photovoltaic panels can easily be adhered, this work combines an edge detection network and a semantic segmentation network for multi-task learning to extract the boundaries of photovoltaic panels in a refined manner. Comparative experiments conducted on the Duke California Solar Array data set and a self-constructed Shanghai Distributed Photovoltaic Power Station data set show that, compared with SegNet, LinkNet, UNet, and FPN, the proposed method obtained the highest identification accuracy on both data sets, and its F1-scores reached 84.79% and 94.03%, respectively. These results indicate that effectively combining multi-layer features with a gated fusion module and introducing an edge detection network to refine the segmentation improves the accuracy of distributed photovoltaic power station identification.

**Keywords:** distributed photovoltaic power stations; remote sensing images; convolutional neural network; multi-layer features; edge detection

### **1. Introduction**

Renewable energy, including biomass energy, wind energy, and solar energy, is sustainable and inexhaustible and plays an important role in addressing the energy crisis. Biomass energy can be converted into eco-fuels, which have been found to be a sustainable energy scenario at the local scale [1]. Wind energy is mainly converted into electricity through wind turbines. Solar energy is a clean and safe renewable energy source (RES) with strong development potential and application value [2]. Photovoltaic power generation is an effective way to use solar energy [3] and takes two main forms: centralized photovoltaic power generation and distributed photovoltaic power generation [4,5]. Centralized photovoltaic power stations are installed primarily in deserts and other ground areas, and the generated electricity is usually fed into the national public power grid [6], whereas distributed photovoltaic power stations are generally installed on the tops of buildings, and the generated electricity is mainly for the inhabitants' own use [7]. Distributed photovoltaic power stations have advantages such as unlimited installed capacity, no occupation of land resources [8], and no pollution. Thus, distributed photovoltaic power generation is an important solar energy development mode that has entered a stage of rapid development and is supported by Chinese policy [9,10]. The International Energy Agency predicts that the world's total renewable energy generation will grow by 50% between 2019 and 2024, with solar photovoltaic generation alone accounting for nearly 60% of the prospective growth. Distributed photovoltaic generation is expected to account for approximately half of the growth in total photovoltaic power generation [11]. The installed capacity of distributed photovoltaic power stations is currently growing rapidly. Consequently, the ability to accurately and efficiently acquire the installation locations, distribution, and total area of distributed photovoltaic power stations over a wide area is important to energy companies, governmental departments, and investors. For example, information on distributed photovoltaic power stations can help optimize power system planning [12].
The information of distributed photovoltaic power stations and solar irradiance data of building surfaces can be combined to predict the power generation potential [13]. Moreover, it can also support the development of open data and energy systems and facilitate the development of the energy field [14]. However, due to the spontaneity and randomness of distributed photovoltaic power station construction, it is difficult to obtain accurate information regarding the quantity and distribution of distributed photovoltaic power stations solely from governmental department planning information. In addition, distributed photovoltaic power stations are generally installed on the tops of buildings, making it difficult to investigate their distribution and area manually. High-resolution remote sensing imagery has the characteristics of high spatial resolution, high efficiency, and wide coverage. Thus, it provides the possibility for automatic identification of large-scale distributed photovoltaic power stations.

Traditional distributed photovoltaic power station identification methods rely mainly on manually designed features, and it is difficult to accurately obtain the location and area of photovoltaic power stations. Malof [15] pioneered the use of manual features for extracting distributed photovoltaic power stations and proposed a method that first obtains all the maximally stable extreme regions (MSERs) [16] from an image and then filters out the areas with low confidence. Then, color features and shape features in the remaining candidate area are extracted for classification by a support vector machine (SVM) [17]. However, this method does not obtain photovoltaic panel areas accurately. Later, Malof [18] used color, texture, and other features in the neighborhood of each pixel to represent the pixel, and then used a random forest [19] to predict the category of each pixel. However, this method also has difficulty accurately obtaining the location and area information of photovoltaic panels. On the basis of the research conducted by the authors of [18], Malof [20] cascaded the random forest and convolutional neural network [21] to identify distributed photovoltaic power stations. However, this method still relies on feature information designed by humans. In a later work, Malof [22] proposed a distributed photovoltaic power station identification model based on a VGG model [23]. However, its ability to accurately obtain the locations and shapes of photovoltaic panels is limited.

As deep learning technology has developed, a series of convolutional neural network (CNN) models have been proposed [23–30]. Semantic segmentation technology based on deep learning can use a CNN, which has strong feature-learning ability, to automatically learn object features from massive amounts of data. Compared with earlier machine learning methods, such as SVMs and random forests, CNNs have significantly improved object extraction accuracy. Semantic segmentation technology has been widely applied and has developed rapidly in fields such as medical image segmentation, autonomous driving, and video segmentation. Jiang [31] used a CNN model and small data sets to extract the heart and lungs. Zhou [32] proposed the UNet++ model, which has achieved high accuracy in nodule, nuclei, and liver segmentation. In addition to 2D medical image segmentation, 3D fully convolutional neural networks can be used to segment organs in CT images [33]. Deep learning has become a robust and effective method for medical image segmentation [34]. In the field of autonomous driving, CCNet [35] and ACFNet [36] used spatial context information and class context information, respectively, to segment objects in street scenes. Gated-SCNN [37] combined shape and semantic information to extract targets on the street. In addition, to improve the performance of target segmentation in autonomous driving, knowledge distillation has been used to retain the model's high precision while reducing the computation [38]. For the video semantic segmentation task, Paul [39] proposed an efficient video segmentation method that combines a convolutional neural network running on the GPU with optical flow running on the CPU. Pfeuffer [40] added a recurrent neural network to the video segmentation model to make full use of the temporal information of video sequences, which improved the accuracy of video segmentation. Jain [41] proposed a video segmentation model with two input branches, which made use of the feature information of the current frame and the context information of the previous frame. Nekrasov [42] proposed a video segmentation algorithm that does not rely on optical flow, which further improved the efficiency of video segmentation.
In addition to the natural image domain, semantic segmentation methods based on fully convolutional network (FCN) [43] models have been widely used for object identification from remote sensing imagery, including road extraction, building extraction, and water extraction. For example, Zhou [44] proposed a road extraction method based on an encoder-decoder structure and series-parallel dilated convolutions. Wu [45] added an attention mechanism to the model of [44], which further improved the accuracy of road extraction. Xu [46] designed a road extraction model based on DenseNet [30] that incorporates local and global attention. Gao [47] used a refined residual convolutional neural network to extract roads from high-resolution remote sensing images. Xu [48] used a deep convolutional neural network to extract buildings and optimized the results with guided filters. Yang [49] used DenseNet [30] and a spatial attention module to extract buildings. Huang [50] presented a residual refinement network for building extraction that fused aerial images and LiDAR point cloud data. Sun [51] proposed a building extraction method combining a multi-scale convolutional neural network and an SVM. Yu [52] proposed a water body extraction method based on convolutional neural networks, which used both spectral and spatial information from Landsat images. Chen [53] proposed a method that cascades superpixel segmentation and convolutional neural network classification to extract urban water bodies. Li [54] used a fully convolutional network to extract water bodies from GaoFen-2 images with limited training data. Some deep learning-based semantic segmentation methods have also been applied to the identification of distributed photovoltaic power stations. Yuan [55] was the first to introduce an FCN model for distributed photovoltaic power station identification. However, the adopted FCN model requires up-sampling by a large multiple, which may cause the loss of feature information.
Subsequently, SegNet [56] and UNet [57] were used to identify distributed photovoltaic power stations [58,59]. Although the identification results of those models are superior to the results of traditional methods, they still do not solve the problem that photovoltaic panels with small areas are easily missed and densely installed photovoltaic panels are easily adhered.

To solve the above problems, this paper proposes a distributed photovoltaic power station identification method that combines multi-layer features and edge detection. The main contributions of this paper are as follows:

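• To address the problem that small photovoltaic panels are difficult to recognize, a gated fusion module is introduced into the encoder-decoder model to effectively fuse multi-layer features, which improves the model's ability to identify small photovoltaic panels.

• To address the problems of blurred edges in the segmentation results and adhesion between adjacent photovoltaic panels, an edge detection network is combined with the semantic segmentation network for multi-task learning to extract the boundaries of photovoltaic panels in a refined manner.
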
The remainder of this article is organized as follows. Section 2 introduces the distributed photovoltaic power station identification model designed in this paper, including the encoder-decoder architecture, gated fusion module, and edge detection network. Section 3 presents the experiments and results analysis on the two data sets, including the experimental data, evaluation metrics, experimental settings, and the experimental results. The results are analyzed and compared with those of other methods. Finally, Section 4 concludes this paper.

### **2. Model Architecture and Design**

The model proposed in this paper was composed of a semantic segmentation network and an edge detection network. These two networks were trained in parallel for multi-task learning, as shown in Figure 1. The semantic segmentation network was used to extract the semantic features of photovoltaic panels, and its architecture included an encoder-decoder structure based on UNet. The encoder was EfficientNet-B1 [61]. In the semantic segmentation network, a gated fusion module was introduced to control the transmission of valuable information, effectively fuse multi-layer features, and improve the ability to identify small photovoltaic panels. The edge detection network was used to extract the edge features of the photovoltaic panels and guide the semantic segmentation network to produce segmentation results with more refined edges, alleviating the problem of blurred and unrefined edges in the segmentation results.

**Figure 1.** Structure of the proposed model.

### *2.1. Semantic Segmentation Network with Gated Fusion Multi-Layer Features*

A semantic segmentation network was used to extract the semantic features of photovoltaic panels. EfficientNet-B1 was used as the encoder, and a gated fusion module was introduced to effectively fuse multi-layer features.

### 2.1.1. Encoder and Decoder

This study adopted EfficientNet-B1, which has strong feature representation capabilities, as the encoder for feature extraction. The decoder is the same as that used in the original UNet. The EfficientNet-B1 network structure is shown in Figure 2. The basic component of EfficientNet-B1 is the MBConv module. In the MBConv module, a 1 × 1 convolution is first used to change the number of channels of the input features, followed by a depth-wise convolution. Then, the channel attention mechanism of SENet [62] is introduced, and finally, a 1 × 1 convolution is used to reduce the number of channels of the feature maps.

**Figure 2.** Structure of EfficientNet-B1.

The original UNet encoder structure consists of five stages. The feature resolution at each stage is successively changed to half of that of the previous stage through down-sampling, and the features of each stage are fused with the corresponding decoder features through skip connections. Based on the UNet structure, this paper adopted the output features of Stages 0, 2, 3, 5, and 7 of EfficientNet-B1 as the five encoder blocks of our model, as shown in Figure 3, which assumes that the size of the input image is 256 × 256 × 3.

**Figure 3.** Encoder structure in the proposed model.

The decoder is mainly used to gradually up-sample the low-resolution high-level features to restore the original size of the input image. During the up-sampling process, the corresponding features of the encoder and decoder are concatenated through skip connections. The structure of the decoder block is shown in Figure 4. The decoding features are the output features of the previous decoder block, and the encoding features are the features passed to the corresponding decoder block through the skip connections. First, the decoding features are up-sampled by a factor of two and then concatenated with the encoding features along the channel dimension. The number of channels of the concatenated features is the sum of the numbers of channels of the two features. After the concatenation and two 3 × 3 convolutional layers, the output features of the decoder block are obtained. The output features of the current decoder block are the input decoding features for the next decoder block.

**Figure 4.** Structure of the decoder blocks.
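The data flow of a decoder block can be illustrated with a minimal NumPy sketch that up-samples the decoding features, concatenates them with the encoding features along the channel dimension, and applies two 3 × 3 convolutions. Nearest-neighbour up-sampling, zero padding, the ReLU activation, the omission of bias and normalization layers, and all function names and tensor sizes are assumptions made for this sketch; the text does not specify them.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x up-sampling of a (C, H, W) array (assumed mode)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv3x3(x, weight):
    """Zero-padded 3x3 convolution followed by ReLU (activation is assumed).
    x: (C_in, H, W), weight: (C_out, C_in, 3, 3) -> (C_out, H, W)."""
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], h, w))
    for dy in range(3):
        for dx in range(3):
            patch = padded[:, dy:dy + h, dx:dx + w]
            out += np.einsum("oc,chw->ohw", weight[:, :, dy, dx], patch)
    return np.maximum(out, 0.0)

def decoder_block(decoding, encoding, w1, w2):
    """Up-sample, concatenate with the skip features, then two 3x3 convolutions."""
    up = upsample2x(decoding)
    fused = np.concatenate([up, encoding], axis=0)  # channels add up
    return conv3x3(conv3x3(fused, w1), w2)

# Hypothetical sizes: 8 decoding channels at 4x4, 4 encoding channels at 8x8.
rng = np.random.default_rng(0)
decoding = rng.normal(size=(8, 4, 4))
encoding = rng.normal(size=(4, 8, 8))
w1 = rng.normal(size=(6, 12, 3, 3)) * 0.1   # 12 = 8 + 4 concatenated channels
w2 = rng.normal(size=(6, 6, 3, 3)) * 0.1
out = decoder_block(decoding, encoding, w1, w2)
print(out.shape)
```

The first convolution must accept the summed channel count (here 8 + 4 = 12), which is the only coupling between the skip connection and the decoder weights.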

### 2.1.2. Gated Fusion Module

Inspired by the research conducted by the authors of [63], a gated fusion module was introduced to effectively fuse the multi-layer features and improve the ability to identify small photovoltaic panels. The gated fusion module structure is shown in Figure 5. The inputs are the features of adjacent encoder layers, and the features generated by the gating unit are used to measure the usefulness of the feature at each position in the spatial dimension. This arrangement controls the transmission of useful information and suppresses the transmission of useless information.

**Figure 5.** Structure of the gated fusion module.

The input to the gated fusion module consists of the features *F<sub>i</sub>* from layer *i* and the features *F<sub>i+1</sub>* from the adjacent layer *i* + 1. Due to the differences in the feature sizes and the channel numbers, *F<sub>i+1</sub>* is first up-sampled by a factor of two, and the number of channels in *F<sub>i+1</sub>* is converted to be the same as that in *F<sub>i</sub>*. Then, *F<sub>i+1</sub>* is input into the gating unit *G*. The output of the gated fusion module is *F′<sub>i</sub>*.

The gating unit *G* feeds the input features into a 1 × 1 convolution and then obtains the gated features *G<sub>i</sub>* through the sigmoid function, as shown in Equation (1). The gated feature map is used to judge the usefulness of the input features at each spatial position. The gated feature values lie in the range [0, 1]. A value less than 0.5 (approaching 0) corresponds to useless feature information, whereas a value greater than 0.5 (approaching 1) corresponds to useful feature information. The transfer of useful and useless information is controlled by element-by-element multiplication between the gated features and the input features of the gating unit:

$$G_i = \sigma(w_i * F_i), \tag{1}$$

where σ is the sigmoid function, the asterisk ('∗') represents the convolution operation, and *w<sub>i</sub>* is the weight parameter of the convolution.

The entire gated fusion module process can be defined as shown in Equation (2). For a position (*x*, *y*), when *G<sub>i+1</sub>*(*x*, *y*) is larger and *G<sub>i</sub>*(*x*, *y*) is smaller, *F<sub>i+1</sub>* transmits useful information to *F<sub>i</sub>* that *F<sub>i</sub>* lacks at this position. When *G<sub>i+1</sub>*(*x*, *y*) is smaller or *G<sub>i</sub>*(*x*, *y*) is larger, this useless information is suppressed to reduce information redundancy:

$$F_i' = (1 + G_i) \odot F_i + (1 - G_i) \odot G_{i+1} \odot F_{i+1}, \tag{2}$$

where ⊙ denotes element-by-element multiplication.
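Equations (1) and (2) can be sketched in a few lines of NumPy. The sketch assumes that the 1 × 1 convolution of the gating unit maps all channels to a single-channel gate map without a bias term, and that the features of layer *i* + 1 have already been up-sampled and channel-matched; the function names and the toy values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gate(features, weight):
    """Eq. (1): 1x1 convolution (assumed single output channel, no bias), then sigmoid.
    features: (C, H, W), weight: (C,) -> gate map of shape (1, H, W)."""
    return sigmoid(np.einsum("c,chw->hw", weight, features))[None]

def gated_fusion(f_i, f_next, w_i, w_next):
    """Eq. (2). f_next is assumed to be already resized to the shape of f_i."""
    g_i = gate(f_i, w_i)
    g_next = gate(f_next, w_next)
    return (1.0 + g_i) * f_i + (1.0 - g_i) * g_next * f_next

# Toy example: zero weights give gates of 0.5 at every position.
f_i = np.ones((2, 3, 3))
f_next = np.full((2, 3, 3), 4.0)
zero = np.zeros(2)
fused = gated_fusion(f_i, f_next, zero, zero)
print(fused[0, 0, 0])  # 1.5 * 1 + 0.5 * 0.5 * 4 = 2.5
```

When the gate of layer *i* approaches 1, the second term vanishes and the module passes roughly twice the original features through, so information from layer *i* + 1 only enters where layer *i* is judged uninformative.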

### *2.2. Combining Edge Detection for Multi-Task Learning*

The edge detection network was used to extract the edge features of photovoltaic panels. It was trained together with the semantic segmentation network through multi-task learning so that the model produced segmentation results with refined edges.

### 2.2.1. Edge Detection Network

Distributed photovoltaic stations have dense distribution characteristics, and the identified results of adjacent photovoltaic panels are prone to adhesion. In this paper, edge information extracted by the edge detection network was combined with the semantic segmentation network to ameliorate the problem of edge blurring.

In this paper, an encoder-decoder structure was adopted in the edge detection network, as shown in Figure 6. The edge detection network shares the encoder of the semantic segmentation network for feature extraction. The decoder structure of the edge detection network is also the same as that of the semantic segmentation network. The object edge feature information is gradually obtained through multiple up-sampling operations, and the edge features extracted by the encoder are fused through skip connections during the up-sampling process.

**Figure 6.** Structure of the edge detection network.

### 2.2.2. Loss Function

In the parallel training of the two networks, a semantic segmentation loss function and an edge detection loss function are used to supervise the learning process for the semantic and edge features of photovoltaic panels, respectively. The semantic segmentation network loss function is calculated from the segmentation predictions and segmentation labels, while the edge detection loss function is calculated from the edge predictions and edge labels. Both the semantic segmentation and edge detection of photovoltaic power stations are binary classification tasks. In addition, compared with the background, the segmentation labels and edge labels account for only a small proportion of the pixels. To mitigate this sample imbalance, a loss function composed of binary cross entropy (BCE) and the Dice loss function (Dice), namely, BCE + Dice [64,65], is used in both the semantic segmentation network and the edge detection network. During training, the two loss functions are summed to obtain the total model loss, as shown in the following equation:

$$\text{Loss\_total} = \text{Loss\_seg} + \text{Loss\_edge}, \tag{3}$$

where *Loss\_total* is the total loss function of our proposed model, *Loss\_seg* is the loss function of the semantic segmentation network, and *Loss\_edge* is the loss function of the edge detection network.

The BCE loss function is shown in Equation (4). The Dice loss function is given by Equation (5).

$$BCE = -\frac{1}{n} \sum_{i=1}^{n} \left( g_i \times \log(p_i) + (1 - g_i) \times \log(1 - p_i) \right), \tag{4}$$

$$Dice = 1 - \frac{2|G \cap P|}{|G| + |P|} = 1 - \frac{2\sum_{i=1}^{n} (g_i \times p_i)}{\sum_{i=1}^{n} g_i^2 + \sum_{i=1}^{n} p_i^2}, \tag{5}$$

where *n* represents the number of pixels in the image, *g<sub>i</sub>* represents the value of the *i*-th pixel in the label, *p<sub>i</sub>* denotes the value of the *i*-th pixel in the prediction result map, and *G* and *P* denote the label and prediction result map, respectively.
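A minimal NumPy sketch of Equations (3) to (5) is given below. The small constant `eps` used to avoid `log(0)` and division by zero, the unweighted sum of the BCE and Dice terms, and all function names are assumptions of the sketch rather than details confirmed by the text.

```python
import numpy as np

def bce_loss(g, p, eps=1e-7):
    """Eq. (4): mean binary cross entropy; eps clipping is an added safeguard."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(g * np.log(p) + (1.0 - g) * np.log(1.0 - p)))

def dice_loss(g, p, eps=1e-7):
    """Eq. (5): Dice loss with squared terms in the denominator."""
    inter = np.sum(g * p)
    return float(1.0 - 2.0 * inter / (np.sum(g ** 2) + np.sum(p ** 2) + eps))

def bce_dice(g, p):
    """BCE + Dice for one task (unweighted sum, assumed)."""
    return bce_loss(g, p) + dice_loss(g, p)

def total_loss(seg_label, seg_pred, edge_label, edge_pred):
    """Eq. (3): segmentation loss plus edge detection loss."""
    return bce_dice(seg_label, seg_pred) + bce_dice(edge_label, edge_pred)

label = np.array([1.0, 0.0, 1.0, 0.0])
uncertain = np.full(4, 0.5)          # a prediction of 0.5 for every pixel
print(bce_loss(label, uncertain))    # log(2), about 0.693
print(dice_loss(label, uncertain))   # 1 - 2 * 1 / (2 + 1), about 0.333
```

The Dice term depends only on the overlap between prediction and label, so it stays informative when foreground pixels are rare, whereas the BCE term averages over all pixels.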

### **3. Experiments and Result Analysis**

### *3.1. Experimental Data*

The experimental data in this study consisted of the Duke California Solar Array and Shanghai Distributed Photovoltaic Power Station data sets.

1. Duke California Solar Array data set

This data set is currently the largest manually labelled distributed photovoltaic power station data set. It contains images and the coordinates of object boundaries, which can be used to train semantic segmentation and object detection algorithms. The images in the data set were collected by the United States Geological Survey (USGS) and orthorectified to eliminate distortions caused by the camera and terrain. The image size is 5000 × 5000 pixels, the spatial resolution is 0.3 m, and each image includes three bands: red, green, and blue (RGB). To ensure comparable results, a total of 526 images from Fresno, Modesto, and Stockton were selected and split following SolarMapper [66]. Fifty percent of the images were randomly selected to form the test set, and the remaining 50% of the images were divided into a training set and a validation set at a ratio of 8:2.

Given the limited memory available on the graphics card, the original images in the training set were clipped into 256 × 256 image blocks and the data were augmented by horizontal and vertical mirroring and a rotation of 90 degrees. Finally, a total of 85,448 image blocks were collected for training. During the training of the edge detection network, photovoltaic panel edge labels are needed. In this study, the edge labels were obtained based on the semantic segmentation labels. Some sample images, segmentation labels, and edge labels from this data set are shown in Figure 7.

**Figure 7.** Samples from the Duke California Solar Array data set: (**a**) Image; (**b**) segmentation label; (**c**) edge label.
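The text states only that the edge labels were obtained from the semantic segmentation labels, without giving the procedure. One plausible way to do this, shown here purely as an assumption, is to mark every foreground pixel that has at least one background pixel among its four direct neighbours. The function name and the one-pixel edge width are hypothetical.

```python
import numpy as np

def edge_label_from_mask(mask):
    """Return a binary edge map: foreground pixels touching the background
    in the 4-neighbourhood. Pixels outside the image count as background,
    so panels cut by the image border also receive an edge there."""
    mask = mask.astype(bool)
    padded = np.pad(mask, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return (mask & ~interior).astype(np.uint8)

# Toy segmentation label: a 3 x 3 panel inside a 5 x 5 tile.
mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 1:4] = 1
edge = edge_label_from_mask(mask)
print(edge)
```

Only the centre pixel of the toy panel is surrounded by foreground on all four sides, so the resulting edge label is the ring of eight boundary pixels.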

2. Shanghai Distributed Photovoltaic Power Station Data Set

To verify the effectiveness of the proposed method for identifying distributed photovoltaic power stations in China, the Shanghai Distributed Photovoltaic Power Station data set was constructed. The images were collected from the Songjiang and Pudong New districts in Shanghai. The data set contains 1000 aerial images with a size of 2048 × 2048 pixels and a spatial resolution of 0.1 m, and each image includes three bands: red, green, and blue. The data set images were randomly divided into a training set, a validation set, and a test set at a ratio of 7:1:2. The training set data were clipped into 256 × 256 image blocks. Then, the data were augmented by horizontal and vertical mirroring, rotations of 90, 180, and 270 degrees, and contrast and brightness transformations. Finally, a total of 55,560 image blocks were collected for training. Some sample images, segmentation labels, and edge labels for this data set are shown in Figure 8.

**Figure 8.** Samples from the Shanghai Distributed Photovoltaic Power Station data set: (**a**) Image; (**b**) segmentation label; (**c**) edge label.
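The clipping and geometric augmentation steps used for both data sets can be sketched as follows. The non-overlapping stride, the function names, and the decision to return every augmented copy are assumptions; the text does not state whether blocks overlap or which blocks were retained, and the contrast and brightness transformations are left out.

```python
import numpy as np

def clip_tiles(image, size=256, stride=256):
    """Cut an (H, W, C) image into size x size blocks (stride is assumed)."""
    h, w = image.shape[:2]
    return [image[y:y + size, x:x + size]
            for y in range(0, h - size + 1, stride)
            for x in range(0, w - size + 1, stride)]

def augment(tile):
    """Return the tile plus mirrored and rotated copies."""
    return [
        tile,
        np.fliplr(tile),     # horizontal mirroring
        np.flipud(tile),     # vertical mirroring
        np.rot90(tile, 1),   # rotation by 90 degrees
        np.rot90(tile, 2),   # rotation by 180 degrees
        np.rot90(tile, 3),   # rotation by 270 degrees
    ]

image = np.zeros((512, 768, 3), dtype=np.uint8)  # hypothetical small image
tiles = clip_tiles(image)
print(len(tiles), tiles[0].shape)
```

The same transformation has to be applied to an image block, its segmentation label, and its edge label so that the three stay aligned.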

### *3.2. Evaluation Metrics*

In this study, the intersection over union (IoU), precision, recall, and F1-score were used as evaluation metrics. The IoU is the ratio of the intersection to the union of the predicted area and the labelled area. Precision is the proportion of pixels predicted as positive that are actually positive. Recall is the proportion of positive pixels that are correctly predicted as positive. The F1-score is the harmonic mean of precision and recall. The four evaluation metrics are calculated as shown in the following equations:

$$IoU = \frac{TP}{TP + FP + FN} \tag{6}$$

$$Precision = \frac{TP}{TP + FP} \tag{7}$$

$$Recall = \frac{TP}{TP + FN} \tag{8}$$

$$F1 = \frac{2 \times Precision \times Recall}{Precision + Recall}, \tag{9}$$

where TP (true positive) represents the number of pixels that are both predicted and labelled as positive, FP (false positive) represents the number of pixels that are predicted as positive but labelled as negative, and FN (false negative) represents the number of pixels that are predicted as negative but labelled as positive.
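Equations (6) to (9) translate directly into code. The following sketch uses hypothetical pixel counts and a hypothetical function name, and assumes that all denominators are non-zero.

```python
def segmentation_metrics(tp, fp, fn):
    """Compute IoU, precision, recall, and F1 from pixel counts (Eqs. 6-9)."""
    iou = tp / (tp + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"iou": iou, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts, not taken from the experiments.
m = segmentation_metrics(tp=80, fp=10, fn=20)
for name, value in m.items():
    print(f"{name}: {100 * value:.2f}%")
```

Substituting Equations (7) and (8) into Equation (9) gives F1 = 2TP / (2TP + FP + FN), so that IoU = F1 / (2 − F1) whenever both metrics are computed from the same pixel counts. For example, the F1-score of 84.79% reported for the Duke California Solar Array data set corresponds to an IoU of approximately 73.60%.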

### *3.3. Experimental Setting*

1. Experimental environment

The experiments were run on a computer with the Ubuntu 16.04.5 LTS operating system, an Intel(R) Xeon(R) E5-2678 v3 CPU, and two NVIDIA TITAN XP graphics cards, each with 12 GB of memory. PyTorch was used to build all the semantic segmentation models.

2. Training strategy and hyperparameter settings

All the models were trained using the Adam optimizer to help ensure a fast convergence speed. The batch size was 64. The initial learning rate was 1 × 10<sup>−3</sup>, and the learning rate was decayed using a cosine annealing strategy. The cycle was 10, and the minimum learning rate was 1 × 10<sup>−5</sup>.
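The learning rate schedule can be sketched with the standard cosine annealing formula. The sketch assumes that the cycle of 10 is measured in epochs and that the learning rate restarts at its initial value after each cycle; the text does not state either point, and the function name is hypothetical.

```python
import math

def cosine_annealing_lr(epoch, lr_max=1e-3, lr_min=1e-5, cycle=10):
    """Cosine-annealed learning rate with a restart every `cycle` epochs (assumed)."""
    t = epoch % cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t / cycle))

for epoch in range(12):
    print(epoch, f"{cosine_annealing_lr(epoch):.6f}")
```

Under these assumptions, the learning rate starts at 1 × 10<sup>−3</sup>, passes through 5.05 × 10<sup>−4</sup> halfway through a cycle, and approaches 1 × 10<sup>−5</sup> just before each restart.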

### *3.4. Experimental Results*

To verify the effectiveness of the proposed method, EfficientNet-B1-UNet was considered as the baseline network. Then, the gated fusion module and edge detection network were added successively. The experiments used the Duke California Solar Array data set and the Shanghai Distributed Photovoltaic Power Station data set. The experimental results on the Duke California Solar Array data set are shown in Table 1.

**Table 1.** Experimental results of each improved module on the Duke California Solar Array data set (%).


Effi-UNet denotes the UNet that uses EfficientNet-B1 as the encoder, GFM denotes the gated fusion module, and EDN denotes the edge detection network.

On the Duke California Solar Array data set, by adding the gated fusion module, the IoU of the test set was increased from 72.41% to 73.33%, F1 was increased from 84.00% to 84.61%, and recall was increased from 82.64% to 83.24%. By adding the edge detection network, the IoU of the network model was further improved from 73.33% to 73.60% and F1 was improved from 84.61% to 84.79%.

The experimental results of the Shanghai Distributed Photovoltaic Power Station data set are shown in Table 2.

**Table 2.** Experimental results from successively improved models on the Shanghai distributed photovoltaic power station data set (%).


On the Shanghai Distributed Photovoltaic Power Station data set, adding the gated fusion module increased the IoU of the test set from 87.40% to 88.34%, the F1-score from 93.27% to 93.81%, and the recall from 93.47% to 94.08%. After adding the edge detection network, the IoU of the network model was further improved to 88.74% and the F1-score improved to 94.03%.

The added modules improved all four evaluation metrics. This shows that the gated fusion module and edge detection network proposed in this paper can improve the accuracy of distributed photovoltaic panel identification tasks.

### *3.5. Results Analysis*

1. The influence of the gated fusion module on the segmentation results

Figure 9 shows sample images and their segmentation results before and after adding the gated fusion module. The first two rows of images are from the Duke California Solar Array data set, and the last two rows are from the Shanghai Distributed Photovoltaic Power Station data set. The first column is the sample image, the second column is the labelled image, and the third column shows the segmentation results of Effi-UNet. Compared with the labelled image, the Effi-UNet results failed to detect some small photovoltaic panels. The fourth column shows the segmentation results of Effi-UNet + GFM, revealing that, with the help of the GFM, the network's ability to identify small photovoltaic panels was improved, which verifies the effectiveness of the module.

2. The influence of the edge detection network on the segmentation results

By extracting edge information and conducting multi-task learning of the edge detection and segmentation networks, more refined segmentation results can be generated. In Figure 10, the first two rows of sample images were sourced from the Duke California Solar Array data set, while the last two rows were sourced from the Shanghai Distributed Photovoltaic Power Station data set. The first column is the sample image, and the second column is the segmentation label. The third column shows the Effi-UNet + GFM segmentation results. Compared with the segmentation label, the segmentation results of adjacent photovoltaic panels were adhered. The fourth and fifth columns show the semantic segmentation results and edge detection results of Effi-UNet + GFM + EDN, respectively, and the sixth column is the edge detection label. With the help of the edge detection network, fine edge results were obtained, distinguishing adjacent photovoltaic panels as far as possible and alleviating the adhesion problem.

**Figure 9.** Result samples before and after adding GFM: (**a**) Image; (**b**) label; (**c**) Effi-UNet results; (**d**) Effi-UNet + GFM results.

**Figure 10.** Result samples before and after adding the edge detection network: (**a**) Image; (**b**) segmentation label; (**c**) Effi-UNet + GFM results; (**d**) Effi-UNet + GFM + EDN segmentation results; (**e**) Effi-UNet + GFM + EDN edge detection results; (**f**) edge label.

### *3.6. Comparisons with Other Methods*

To further verify the effectiveness of the proposed method, it was compared with SegNet, LinkNet [67], UNet, and FPN [68] on the two data sets. The results and analysis are as follows.

### 3.6.1. Results on the Duke California Solar Array Data Set

The experimental results of each method on the test set of the Duke California Solar Array data set are shown in Table 3. The results show that the proposed method outperformed the other methods on all the evaluation metrics. The IoU of the proposed method reached 73.60%, and its F1-score reached 84.79%. Moreover, the IoU of the proposed method was 6.6 percentage points higher than that of SolarMapper [66]. The analysis of the results is as follows: (1) Although LinkNet, UNet, and FPN combine features from different layers, they do not consider the differences between the high-level and low-level features, nor do they make full use of object edge information. (2) In this paper, based on the encoder-decoder network structure, the multi-layer features were fused effectively by the gated fusion module, and the useful information was transferred by the gating mechanism, improving the ability to identify small photovoltaic panels. (3) Based on the semantic segmentation network, the method in this paper combined an edge detection network for multi-task learning to ameliorate the edge-blurring problem.

Figure 11 shows some of the experimental results of each method on the Duke California Solar Array data set. The segmentation results in the first and second rows show that the method proposed in this paper was better at identifying small photovoltaic panels compared with the other methods. In the segmentation results shown in the third and fourth rows, although each method identified the photovoltaic panel in the image, the method in this paper obtained more refined edges.


**Table 3.** Accuracy of each method on the Duke California Solar Array data set (%).

### 3.6.2. Results on the Shanghai Distributed Photovoltaic Power Station Data Set

Table 4 shows the evaluation results of each model on the Shanghai Distributed Photovoltaic Power Station data set, revealing that the method proposed in this paper outperformed all the other methods on all the evaluation metrics. The IoU of the method in this paper reached 88.74%, and its F1-score reached 94.03%. Due to the encoder-decoder structure, the method proposed in this paper effectively fused features from multiple layers, improved the ability to identify small photovoltaic panels, and refined the segmentation edge results using the edge detection network. Therefore, compared with the other methods, the method in this paper achieved higher accuracy.

Figure 12 shows an example of the experimental results of the proposed method and the compared methods on the Shanghai Distributed Photovoltaic Power Station data set. As seen from the results in the first row, the method proposed in this paper was better at identifying small photovoltaic panels, and the identification results were more complete. In the second row, the two separate photovoltaic panels were difficult to identify due to their small sizes. Compared with the other methods, the proposed method not only recognized them but also obtained more refined edges in the identification results. In the third row, multiple photovoltaic panels were close to each other, which was likely to cause adhesion problems in the identification process. Compared with the other methods, with the help of the edge detection network, the identification results of the method proposed in this paper had more refined edges and alleviated the adhesion problem. A comparison of the results in the fourth row shows that the identification results of the proposed method had more refined edges.

**Figure 11.** Sample results of each method on the Duke California Solar Array data set: (**a**) Image; (**b**) label; (**c**) SegNet; (**d**) LinkNet; (**e**) UNet; (**f**) FPN; (**g**) our method.

**Figure 12.** Sample results of each method on the Shanghai Distributed Photovoltaic Power Station data set: (**a**) Image; (**b**) label; (**c**) SegNet; (**d**) LinkNet; (**e**) UNet; (**f**) FPN; (**g**) our method.


**Table 4.** Accuracy of each method on the Shanghai Distributed Photovoltaic Power Station data set (%).

### **4. Conclusions**

This paper presented a novel fully connected convolutional neural network model that can automatically extract distributed photovoltaic power stations from remote sensing imagery. A distributed photovoltaic power station identification method that combines multi-layer features and edge detection was proposed to solve two problems: that small photovoltaic panels are difficult to identify and that adjacent photovoltaic panels can easily adhere. The model structure was composed of a semantic segmentation network and an edge detection network. A gated fusion module was introduced into the semantic segmentation network to conduct effective multi-layer feature fusion, and an edge detection network was used to guide the production of segmentation results with refined edges. Experiments on the Duke California Solar Array data set and the Shanghai Distributed Photovoltaic Power Station data set showed that introducing the gated fusion module alleviated the problem of missed small photovoltaic panels and enhanced the identification accuracy. By combining the edge detection network and semantic segmentation network for multi-task learning, the edge information of the photovoltaic panel was used to constrain the segmentation results, resulting in the extraction of photovoltaic panels with finer edges, which further improved the identification accuracy. Compared with SegNet, LinkNet, UNet, and FPN, the method proposed in this paper achieved the highest identification accuracy on both data sets, and its F1-scores reached 84.79% and 94.03%, respectively.

However, there are also some limitations in this study: (1) In terms of data source, due to the limitations of the current data set, the trained model is only applicable to RGB optical images and cannot be directly applied to images containing more bands. (2) In terms of the spatial resolution of the image, the training and testing of the method in this paper were carried out on images with the same spatial resolution. Because solar panels appear differently in images with different resolutions, the accuracy is uncertain when the trained model is directly used to predict images with different resolutions. (3) Since the training data only include distributed photovoltaic power stations, the trained model cannot be used to identify centralized photovoltaic power stations. Future work will address the following aspects: (1) The application of our method to multi-spectral images will be explored to further improve the segmentation performance with more spectral information. (2) Multiple images of different spatial resolutions will be collected to train our method so that it can identify distributed photovoltaic power stations in images with different resolutions. (3) A centralized photovoltaic power station data set will be constructed, and our method will be extended to the identification of centralized photovoltaic power stations. (4) The extracted results of distributed photovoltaic power stations will be combined with solar radiation data to assess the power generation potential.

**Author Contributions:** Y.J., A.Y. and X.J. designed the network architecture. Y.J. performed the experiments and wrote the paper. X.J. and J.C. (Jingbo Chen) revised the paper. Y.D., J.C. (Jing Chen) and Y.Z. built the data set. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported in part by the National Key Research and Development Project (No. 2017YFC0821900).

**Acknowledgments:** The authors sincerely thank the editors and reviewers. We also sincerely thank the authors of the Duke California Solar Array data set.

**Conflicts of Interest:** The authors declare that no conflicts of interest exist.

### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Sustainable Spatial Energy Planning of Large-Scale Wind and PV Farms in Israel: A Collaborative and Participatory Planning Approach**

**Sofia Spyridonidou <sup>1</sup> , Georgia Sismani <sup>2</sup> , Eva Loukogeorgaki 2,\*, Dimitra G. Vagiona <sup>1</sup> , Hagit Ulanovsky <sup>3</sup> and Daniel Madar <sup>3</sup>**


**Abstract:** In this work, an innovative sustainable spatial energy planning framework is developed on national scale for identifying and prioritizing appropriate, technically and economically feasible, environmentally sustainable as well as socially acceptable sites for the siting of large-scale onshore Wind Farms (WFs) and Photovoltaic Farms (PVFs) in Israel. The proposed holistic framework consists of distinctive steps allocated in two successive modules (the Planning and the Field Investigation module), and it covers all relevant dimensions of a sustainable siting analysis (economic, social, and environmental). It advances a collaborative and participatory planning approach by combining spatial planning tools (Geographic Information Systems (GIS)) and multi-criteria decision-making methods (e.g., Analytical Hierarchy Process (AHP)) with versatile participatory planning techniques in order to consider the opinion of three different participatory groups (public, experts, and renewable energy planners) within the site-selection processes. Moreover, it facilitates verification of GIS results by conducting appropriate field observations. Sites of high suitability, accepted by all participatory groups and field verified, form the final outcome of the proposed framework. The results illustrate the existence of highly suitable sites for large-scale WFs' and PVFs' siting and, thus, the potential deployment of such projects towards the fulfillment of the Israeli energy targets in the near future.

**Keywords:** spatial energy planning; site-selection process; participatory planning; onshore wind farms; photovoltaic farms; GIS; AHP; Borda Count; TOPSIS; Israel

### **1. Introduction**

Environmental concerns related to reduction in greenhouse gas emissions and mitigation of climate change effects have established Renewable Energy (RE) as a mainstream source of electricity generation globally. According to [1], by the end of 2019, the estimated share of renewables in global electricity generation was 27.3%, while the net additional installed capacity of RE Technologies (RETs) was higher compared to both fossil fuels and nuclear for a fifth consecutive year. Moreover, the electricity generated from new Wind Farms (WFs) and Photovoltaic Farms (PVFs) was more cost-efficient compared to fossil fuel power plants in many locations worldwide [2], demonstrating the strong competitiveness of wind and solar energy with conventional sources of electricity.

For the global wind energy industry, 2019 was an outstanding year, since the new WF installations corresponded to over 60 GW and the global cumulative wind power capacity reached 651 GW by the end of 2019 [3]. Asia Pacific remained the world's largest wind energy market in 2019 followed by Europe, representing 50.7% and 25.5% of the global new wind installations, respectively. Developing markets, such as the Middle East and Africa, with 1.6% of the global new wind installations, demonstrated steady, but not significant growth during 2019 and, thus, they are currently last in the global wind energy ranking. Regarding the solar PV industry, the global PV power capacity reached 627 GW by the end of 2019, with China being currently the country with the largest capacity (204.7 GW or 32.6% of the global cumulative PV power capacity) followed by the United States (76 GW or 12.1% of global PV power capacity) [1]. In the Middle East, most of the new PV installations (2 GW) were implemented in the United Arab Emirates (with some PV projects installed in 2018, but commissioned in 2019), whereas in Israel and Jordan, the additional PV power capacity reached 1.1 and 0.6 GW by the end of 2019, respectively [1,4].

**Citation:** Spyridonidou, S.; Sismani, G.; Loukogeorgaki, E.; Vagiona, D.G.; Ulanovsky, H.; Madar, D. Sustainable Spatial Energy Planning of Large-Scale Wind and PV Farms in Israel: A Collaborative and Participatory Planning Approach. *Energies* **2021**, *14*, 551. https://doi.org/10.3390/en14030551

Received: 28 December 2020; Accepted: 18 January 2021; Published: 21 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Israel, as the most developed country in the Middle East, pledged to eliminate the use of coal, gasoline, and diesel for energy production and transport by 2030, in favor of renewables and natural gas [2]. In that respect, the Israeli energy targets aim at 10% and 30% electricity generation from RE Sources (RES) by 2020 and 2030, respectively [2,5]. According to the latest available information from the Israeli Electricity Authority [6], by the end of 2019, the cumulative PV power capacity reached 1.72 GW, which corresponds to 8.7% of the national electricity demand [4], followed by concentrated solar power capacity (0.24 GW or 1.2% of electricity demand), biogas power capacity (0.04 GW or 0.2% of electricity demand), and wind power capacity (0.03 GW or 0.15% of electricity demand). Considering all the above and for facilitating the achievement of the Israeli energy targets, it is deemed significantly important to develop a sustainable spatial energy plan for the whole country, focusing on the potential deployment of the mature and cost-competitive wind and PV technologies on a national scale. Such a plan would enable the efficient determination of the most appropriate sites for the development of large-scale WFs and PVFs, and, thus, it could set a consistent starting basis towards the production of large amounts of electricity from RES in the country.

The appropriate site-selection for the efficient and sustainable deployment of WFs and PVFs corresponds to an important process, which involves various environmental, social, economic, technical, political, and legal aspects. Geographic Information Systems (GIS) have been deployed solely or in combination with Multi-Criteria Decision-Making (MCDM) methods within a large number of investigations related to either WFs' [7–23] or PVFs' [24–40] siting, aiming to address the corresponding multidimensional siting problem. The necessity of deploying GIS as a tool for the investigation of land suitability for single wind turbines or for the proper PV deployment, land use management, and detailed energy planning is also highlighted in [41] and [42], respectively. However, studies developing an integrated methodological approach for the simultaneous determination of suitable sites for both WFs and PVFs either at different areas for each RET (isolated WFs and PVFs) or at common sites (colocated WFs and PVFs) are quite rare [43–46]. In particular, Ali et al. [43], by combining GIS with the Analytical Hierarchy Process (AHP) and Local Experts' (LEs) opinion, investigated the existence of suitable areas for the siting of isolated WFs and PVFs in the Songkhla Province in Thailand, and they provided important insights for the siting of the aforementioned RETs on regional scale. A combination of GIS with AHP has been also applied in [45] to assess the suitability of the land in South Central England for installing isolated WFs and PVFs on regional scale. In [44], the author developed a multi-criteria GIS-based approach for the determination of suitable areas for isolated WFs' and PVFs' siting in Colorado State, the United States (i.e., on regional scale). 
Finally, a site-selection methodology based on the combination of GIS with AHP has been developed in [46] to determine the most suitable sites for the siting of isolated and colocated WFs and PVFs on regional scale and, more specifically, in Tehran, Iran. In Israel, however, there exists no study focusing on the sustainable site-selection of WFs and/or PVFs on any spatial scale.

The present paper focuses on the development of an innovative Sustainable Spatial Energy Planning (SSEP) methodological framework for Israel in order to identify and prioritize on national scale appropriate, technically and economically feasible, environmentally sustainable as well as socially acceptable sites for the siting of large-scale isolated and/or colocated onshore WFs and PVFs in the country. The proposed holistic framework consists of distinctive steps allocated in two successive modules (the Planning and the Field Investigation module), and it covers all relevant dimensions of a sustainable siting analysis (social, economic, and environmental). Aiming at filling research gaps existing nowadays in the site-selection processes of RETs, generally, the present SSEP framework advances a collaborative and participatory planning approach by combining spatial planning tools (GIS) and MCDM methods with versatile participatory planning techniques, in order to consider the opinion of three different participatory groups (Local Public (LP), LEs, and RE Planners (REPs)) within the site-selection processes. Moreover, it facilitates verification of GIS results by conducting appropriate field observations. Initially, within the Planning module, the required Siting Criteria (SC) for each examined RET related to economic, technical, environmental, societal, political, and legal factors are defined. All relevant spatial data are collected and digitized and a RES database including relevant thematic maps is developed in GIS to illustrate the spatial dimension of each SC. Suitable areas for WFs and PVFs are, then, determined by: (i) utilizing specific SC that represent spatial constraints for each examined RET and (ii) incorporating the LP and the LEs' opinion in the formation of the exclusion limits based on questionnaire surveys and suitable statistical analysis. Next, for each examined RET, the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) is applied and three different Suitability Index (SI) maps are created by taking into account the relevant importance of each Assessment Criterion (AC) in accordance with: (i) the LP view and concerns, (ii) the LEs' knowledge and experience, as well as (iii) the REPs' expertise.
For prioritizing the AC, the social-choice method Borda Count (BC) [47] is utilized in the case of the LP participation, while the AHP method, suitable for including experts' opinion in decision-making processes of RES [45], is applied for both LEs' and REPs' participation. The most highly suitable sites, as obtained from the Planning module, are, finally, further examined by performing direct field observations or by utilizing Google Earth Pro software in case of inaccessible locations (Field Investigation module). The final outcome of the proposed SSEP includes a set of highly suitable, accepted by LP, LEs, and REPs, and field verified, sites for the deployment of large-scale isolated and/or colocated onshore WFs and PVFs on national scale.
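To make the role of TOPSIS in producing a suitability index more concrete, the sketch below implements the common textbook form of the method (vector normalization, weighted ideal and anti-ideal solutions, closeness coefficient). It is only an illustration under stated assumptions: the normalization variant used by the authors is not given in this section, and the function name `topsis`, the three example criteria, the site values, and the weights are all hypothetical.

```python
import numpy as np


def topsis(scores, weights, benefit):
    """Return the TOPSIS closeness coefficient (0 to 1) of each alternative.

    scores  : (n_sites, n_criteria) raw criterion values
    weights : (n_criteria,) criterion weights that sum to 1
    benefit : (n_criteria,) True where larger is better, False where smaller is better
    Assumes no criterion column is all zeros and the sites are not all identical.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)

    # 1. Vector-normalize each criterion column, then apply the weights.
    weighted = scores / np.linalg.norm(scores, axis=0) * weights

    # 2. Ideal and anti-ideal values per criterion.
    ideal = np.where(benefit, weighted.max(axis=0), weighted.min(axis=0))
    anti = np.where(benefit, weighted.min(axis=0), weighted.max(axis=0))

    # 3. Euclidean distance of every site to both reference points.
    d_ideal = np.linalg.norm(weighted - ideal, axis=1)
    d_anti = np.linalg.norm(weighted - anti, axis=1)

    # 4. Closeness coefficient: 1 means the site coincides with the ideal.
    return d_anti / (d_ideal + d_anti)


# Hypothetical sites: wind velocity (m/s), slope (%), distance to grid (km).
sites = [[7.5, 5.0, 2.0],
         [6.0, 12.0, 8.0],
         [8.2, 3.0, 1.0]]
weights = [0.5, 0.2, 0.3]            # illustrative weights, not survey results
benefit = [True, False, False]       # high wind is good; slope and distance are costs

si = topsis(sites, weights, benefit)
print(np.round(si, 3))
```

In this toy case the third site is best on every criterion and the second is worst on every criterion, so their closeness values are 1 and 0; changing the weight vector (for example, LP versus LE versus REP weights) is what would yield different SI maps for the same sites.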

The remainder of the article is structured as follows. Section 2 briefly presents the proposed SSEP framework, while in Sections 3 and 4, the Planning and the Field Investigation module are described in detail, respectively. The results of the present paper are presented and discussed in Section 5, while, finally, in Section 6, the concluding remarks and key findings of this investigation are cited.

### **2. Overview of the Sustainable Spatial Energy Planning Framework**

In order to identify the most appropriate, technically/economically feasible, environmentally sustainable and socially acceptable site solutions for the deployment of large-scale WFs and PVFs in Israel, the SSEP framework shown in Figure 1 is developed and applied.

The proposed SSEP framework corresponds to a well-structured collaborative and participatory planning approach and it consists of six distinctive, successive steps allocated into two modules: the Planning module and the Field Investigation module. The Planning module aims at determining the suitability of the potential sites and it includes five steps (Steps 1–5, Figure 1). Specifically, in Step 1, the SC are defined and all required geographic information data are collected/digitized based on the special characteristics of the study area (Israel), the special siting requirements of each RET, and the REPs' expertise. Next, in Step 2, a RES GIS database is developed for configuring and illustrating, in the form of thematic maps, the spatial dimension of each SC in a GIS environment. Step 3 follows, which is related to the LP and LE participation within the site-selection process and, more specifically, in the formation of the SC exclusion limits and the prioritization of the AC. The next step (Step 4) includes the identification of appropriate sites. This is achieved by eliminating all unsuitable areas based on specific SC and by considering the LP and the LEs' opinion. Finally, in Step 5, the suitability of the potential sites is determined, and three different SI maps are developed in accordance with: (i) LP view and concerns, (ii) LEs' knowledge and experience, as well as (iii) REPs' expertise. The most highly suitable sites obtained from the Planning module are then considered as input in the Field Investigation module (Step 6 of the SSEP framework, Figure 1), aiming at verifying the corresponding GIS results based on field observations. In this way, a set of highly suitable, accepted by LP, LEs, and REPs, and field verified sites for the deployment of large-scale onshore WFs and PVFs in Israel on national scale is obtained. This set represents the overall output of the proposed SSEP. It is noted that the proposed framework could be implemented by a group of REPs, which in the present investigation is assumed to include the authors of the paper. In the following sections, the modules and the steps of the proposed SSEP framework are described thoroughly.

**Figure 1.** Proposed Sustainable Spatial Energy Planning (SSEP) framework for large-scale Wind Farms' (WFs') and Photovoltaic Farms' (PVFs') site-selection in Israel.

### **3. The Planning Module**

### *3.1. Definition of SC and Data Collection/Digitization (Step 1)*

In Step 1, the SC for WFs and PVFs are initially defined based on the special characteristics and the policies of the study area, the available analog or digital geographic information data, the special siting requirements of each RET, and the expertise of the REPs. These criteria make it possible to identify and spatially analyze the environmental, economic, technical, political, social, and legal characteristics of the study area. For each examined RET, eighteen (18) SC have been taken into account, denoted hereafter as WSC for WFs and SSC for PVFs (Table 1). A detailed description of the WSC and the SSC is given in Appendix A. All relevant required geographic information data were collected from various sources (i.e., national institutes, services, and official international and national digital databases providing officially approved cartographic data) and appropriately processed, and Geographic Information Datasets (GIDs) were finally obtained (Table 1) by deploying GIS. It is noted that a suitable 10-year or 11-year statistical analysis has also been conducted to obtain the final data of some essential SC (e.g., SSC.1 and SSC.2).

### *3.2. Development of a RES GIS Database (Step 2)*

In Step 2, a RES GIS database including relevant thematic maps was developed in GIS in order to: (a) illustrate the spatial dimension of each SC and, hence, (b) support the implementation of the remaining SSEP steps by facilitating the assessment of the positive or the negative spatial impact of each SC on the WFs' and PVFs' site-selection processes in Israel. For this development, the national legal restrictions resulting from all existing relevant policies [48–51] have been taken into account.

**Table 1.** Siting Criteria (SC), Geographic Information Datasets (GIDs), data processes, and sources employed in the present work.


### *3.3. Local Public and Local Experts' Participation in the Site-Selection Processes (Step 3)*

For implementing Step 3, two participatory techniques have been developed (Figure 2), one for each group (LP and LEs), facilitating the efficient involvement of these two groups within the site-selection processes and, more specifically, in Step 4 and Step 5 of the proposed SSEP framework (Figure 1). The first technique (Figure 2a) corresponds to a public participatory technique, and it is based on the utilization of a well-structured questionnaire, in which the focus is placed on essential social SC, while the AC are prioritized in accordance with the principles of the BC method. The second one (Figure 2b) corresponds to an experts' participatory technique. It is based on the deployment of a well-structured questionnaire, in which, contrary to the LP questionnaire, the focus is placed on essential economic, technical, environmental, and political SC, while the AC are prioritized in accordance with the principles of the AHP method. Details about these two techniques are given in the sections that follow.

**Figure 2.** Schematic view of the participatory techniques developed and applied in the present site-selection processes for: (**a**) Local Public (LP) and (**b**) Local Experts (LEs).

### 3.3.1. Local Public Participation in the Site-Selection Processes

The questionnaire for the LP has been structured into four main sections. The first section was devoted to the collection of demographic information of the participants (e.g., gender, age, education, and professional occupation related or not to RE), while the second section focused on the LP opinion about the deployment of RET for electricity generation (e.g., types of RETs that the LP recommends for the RES exploitation in Israel). The third section of the LP questionnaire included questions related to the site-selection for both examined RETs (i.e., exclusion limits for essential social SC, such as "the most appropriate distance of WFs and PVFs from residential areas"). Finally, in the fourth section of the LP questionnaire, the participants were asked to prioritize 12 AC for the deployment of isolated WFs and PVFs based on their own preferences. This prioritization was achieved according to the principles of the BC method. BC is a social choice method that aggregates the preferences of a large group of people for decision-making purposes, and it is characterized by anonymity, neutrality, and consistency [47]. In the BC method, the participants rank the alternatives (the AC in our case) in order of their preference. Once all the responses have been obtained, the overall preference order can be determined.
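The ranking-and-aggregation idea behind BC can be sketched as follows. This is a minimal illustration assuming the classic scoring rule, in which a respondent's first choice among n alternatives receives n − 1 points and the last receives 0; converting the summed scores into normalized weights is an additional assumption made here for illustration, and the criterion labels and responses are placeholders rather than survey data.

```python
from collections import defaultdict


def borda_weights(rankings):
    """Aggregate rankings (best first) into Borda scores, weights, and an overall order."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, criterion in enumerate(ranking):
            scores[criterion] += n - 1 - position   # first place gets n - 1 points
    total = sum(scores.values())
    weights = {c: s / total for c, s in scores.items()}
    # Highest score first; ties are broken alphabetically for reproducibility.
    order = sorted(scores, key=lambda c: (-scores[c], c))
    return dict(scores), weights, order


# Three hypothetical respondents ranking three placeholder criteria.
responses = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
]

scores, weights, order = borda_weights(responses)
print(scores)   # summed Borda points per criterion
print(order)    # overall preference order
```

Since only the rank positions enter the score, every respondent contributes equally and anonymously, which matches the properties of the method noted above.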

In the present study, 200 fully completed questionnaires have been obtained from the LP from all over Israel (North, Central, and South part). This geographic segmentation made it possible to investigate potentially different policy orientations on the siting problem of WFs and PVFs driven by quite different geographic locations. The results of the LP questionnaire survey have been appropriately processed by performing statistical and correlation analysis using the built-in tools of the SPSS software. In this way, essential insights related to the deployment of WFs and PVFs in Israel based on the LP views and concerns have been revealed. Moreover, the overall exclusion limits for social SC among all LP participants have been obtained, as well as the overall relevant importance (i.e., relevant weights) of the AC with respect to the goal of the examined decision-making problems (siting of WFs or PVFs).

### 3.3.2. Local Experts' Participation in the Site-Selection Processes

In the case of the LEs, the relevant questionnaire has been again structured into four main sections similar to those of the LP questionnaire. However, the questions in the third section of the LEs' questionnaire were related to the definition of exclusion limits for essential economic, technical, environmental, and political SC (e.g., WSC.1, SSC.1, and WSC.2/SSC.3 in Table 1), while, in the fourth section of the LEs' questionnaire, the participants were asked to prioritize the 12 AC in accordance with the principles of the AHP method [65,66]. In that respect, each LE performed pairwise comparisons between the AC and quantified the relative importance of each AC with respect to the goal (siting of WFs and PVFs) by deploying the fundamental nine-point scale of the AHP. The corresponding results were further processed to obtain the relative weights of the compared criteria and, thus, to form the priority vector. The robustness of the pairwise comparisons was assessed by calculating the consistency index and the consistency ratio [67]. The overall priority vector among all participating LEs has been calculated by employing the aggregating individual priorities technique (i.e., aggregation of all the individual priorities) [68,69], since in the present investigation, each LE acts as an independent individual.
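The AHP steps described above (pairwise comparison matrix, priority vector, consistency index and ratio, aggregation of individual priorities) can be sketched as below. Several points are assumptions made for illustration only: the priority vector is taken from the principal eigenvector, the random index values are the commonly tabulated ones, the aggregation uses an arithmetic mean of the individual vectors (a geometric mean is also used in the literature), and the 3 × 3 matrices are hypothetical rather than the 12-criterion matrices of the study.

```python
import numpy as np

# Commonly tabulated random consistency index values by matrix order n.
# Extend this table before using matrices larger than 10 x 10.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24,
                7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_priorities(matrix):
    """Return (priority vector, consistency index, consistency ratio)."""
    a = np.asarray(matrix, dtype=float)
    n = a.shape[0]
    eigvals, eigvecs = np.linalg.eig(a)
    k = int(np.argmax(eigvals.real))         # position of the principal eigenvalue
    lam_max = eigvals[k].real
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                          # normalize so the weights sum to 1
    ci = (lam_max - n) / (n - 1)             # consistency index
    cr = ci / RANDOM_INDEX[n]                # consistency ratio
    return w, ci, cr


def aggregate_individual_priorities(vectors):
    """Arithmetic-mean aggregation of individual priority vectors (assumed variant)."""
    mean = np.mean(np.asarray(vectors, dtype=float), axis=0)
    return mean / mean.sum()


# Hypothetical pairwise comparisons of three criteria by two experts.
expert_1 = [[1, 2, 5],
            [1 / 2, 1, 3],
            [1 / 5, 1 / 3, 1]]               # slightly inconsistent judgments
expert_2 = [[1, 1, 2],
            [1, 1, 2],
            [1 / 2, 1 / 2, 1]]               # perfectly consistent judgments

w1, ci1, cr1 = ahp_priorities(expert_1)
w2, ci2, cr2 = ahp_priorities(expert_2)
group = aggregate_individual_priorities([w1, w2])
print(np.round(w1, 3), round(cr1, 4))
print(np.round(w2, 3), round(cr2, 4))
print(np.round(group, 3))
```

A consistency ratio below about 0.1 is the conventional acceptance threshold; judgments above it are usually revisited before their priority vector is aggregated.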

The LEs' group involved in the present study consisted of 4 LEs (doctoral researchers, senior managers, and professional engineers in RE) from universities and companies from all over Israel, carefully selected considering their background on the siting of WFs and/or PVFs. These LEs quantified the exclusion limits of several essential SC and prioritized the AC based on their own extensive experience, their high level of knowledge of the local climatic conditions and of the special characteristics of the study area, as well as the availability of the land in Israel. It is noted that the number of LEs who participated in the present work is slightly larger compared to previous relevant studies, where the opinion of one [23], two [17], or three [26] experts was taken into account.

### *3.4. Determination of Appropriate Sites (Step 4)*

In Step 4, areas unsuitable for the siting of WFs and PVFs are identified and excluded from further analysis. Hence, appropriate sites for the potential deployment of the aforementioned RETs are, finally, determined. Unsuitable areas are identified by employing the SC thematic maps of the RES GIS database developed in Step 2 along with the exclusion limits of essential SC as obtained from the LP and LEs' questionnaire surveys in Step 3. The SC along with their siting aspect and their incompatibility zones for the case of WFs and PVFs are shown in Tables 2 and 3, respectively. For determining unsuitable areas, two linear geoprocessing models (one for the WFs' and one for the PVFs' site-selection) have been created, edited, and managed by building all required geoprocessing workflows in a GIS environment.
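The exclusion logic of Step 4 can be illustrated outside a GIS environment with rasterized criteria on a common grid: each criterion contributes an exclusion rule, and a cell remains appropriate only if no rule excludes it. The sketch below is a simplified stand-in for the geoprocessing models mentioned above, not a reproduction of them; the layer names, grid values, and thresholds are hypothetical and are not the incompatibility zones of Tables 2 and 3.

```python
import numpy as np


def suitable_mask(layers, rules):
    """Combine per-criterion exclusion rules into one boolean suitability mask.

    layers : dict mapping a criterion name to a 2D array on a common grid
    rules  : dict mapping the same names to a function that returns True
             where a cell must be EXCLUDED
    """
    shape = next(iter(layers.values())).shape
    excluded = np.zeros(shape, dtype=bool)
    # Apply the rules one after another, as in a linear workflow.
    for name, is_excluded in rules.items():
        excluded |= is_excluded(layers[name])
    return ~excluded


# Hypothetical 2 x 3 grid of cells.
layers = {
    "wind_velocity_m_s": np.array([[5.0, 4.0, 6.0],
                                   [7.0, 6.5, 3.0]]),
    "dist_residential_m": np.array([[800, 900, 300],
                                    [1500, 400, 2000]]),
}

# Hypothetical exclusion limits (placeholders, not values from the study).
rules = {
    "wind_velocity_m_s": lambda v: v < 4.5,     # too little wind
    "dist_residential_m": lambda d: d < 500,    # too close to housing
}

mask = suitable_mask(layers, rules)
print(mask)
```

Because the rules are combined with a logical OR on the excluded cells, the order in which they are applied does not change the final mask; only the set of rules and their limits matters.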


**Table 2.** Siting Criteria (SC) and their incompatibility zones for large-scale Wind Farms' (WFs') site-selection.


**Table 3.** Siting Criteria (SC) and their incompatibility zones for large-scale Photovoltaic Farms' (PVFs') site-selection.

### *3.5. Determination of SI of the Appropriate Sites (Step 5)*

### 3.5.1. Definition of AC

In order to prioritize the appropriate areas for large-scale WFs' and PVFs' siting, 12 AC have been defined (hereafter called WAC and SAC for wind and solar energy exploitation, respectively). Specifically, in the case of WFs, the appropriate areas resulting from Step 4 are assessed and prioritized according to the following 12 WAC: wind velocity (WAC.1), slope of terrain (WAC.2), proximity to road network (WAC.3), proximity to high-voltage electricity grid (WAC.4), distance from land protected areas (WAC.5), distance from important bird areas (WAC.6), distance from touristic zones (WAC.7), distance from archaeological, historical, and cultural areas (WAC.8), land use (WAC.9), proximity to areas with high population (WAC.10), wind energy potential (WAC.11), and visibility from the residential areas (WAC.12). As for PVFs, the appropriate land areas resulting from Step 4 are assessed and prioritized according to the following 12 SAC: GHI (SAC.1), average maximum temperature (SAC.2), slope of terrain (SAC.3), proximity to road network (SAC.4), proximity to high-voltage electricity grid (SAC.5), distance from land protected areas (SAC.6), distance from touristic zones (SAC.7), distance from archaeological, historical, and cultural areas (SAC.8), land use (SAC.9), proximity to areas with high population (SAC.10), solar energy potential (SAC.11), and land aspect (SAC.12). The criteria that are introduced for the first time in this paper as AC are described in Appendix B.

### 3.5.2. Inclusion of AC Importance by Each Participatory Group

The prioritization of the AC in the WFs' and the PVFs' site suitability analysis was made according to the outcome of the LP and the LEs' questionnaire surveys (Step 3 of the proposed SSEP framework), as previously described in Section 3.3. In addition to these two groups, the relevant importance of the AC was also quantified by the authors of this paper (herein referred to as REPs) based on their own expertise in spatial and RE planning. This quantification was implemented in accordance with the principles of the AHP method, as in the case of the LEs' group. It is noted that the different backgrounds of the three participating groups may reflect different policy orientations towards the examined RE siting problems. Thus, the complexity of such critical planning issues can be revealed.
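As an illustration of how AHP turns pairwise judgements into criterion weights, the following is a minimal sketch under stated assumptions: it uses the row geometric-mean approximation of the priority vector and Saaty's consistency ratio, and the 3 × 3 comparison matrix is invented for the example. The actual judgements of the LEs and REPs, and the exact AHP variant they applied, are not reproduced here.

```python
import math

# Saaty's random consistency index for matrix orders 3..10.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24,
                7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(pairwise):
    """Approximate the AHP priority vector by normalized row geometric means."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(pairwise, weights):
    """Consistency ratio CR = CI / RI; values below 0.10 are usually accepted."""
    n = len(pairwise)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    weighted_sums = [
        sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)
    ]
    lambda_max = sum(weighted_sums[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgements for three criteria (not taken from the surveys):
# criterion 1 is twice as important as criterion 2 and six times criterion 3.
pairwise = [
    [1.0, 2.0, 6.0],
    [0.5, 1.0, 3.0],
    [1.0 / 6.0, 1.0 / 3.0, 1.0],
]
weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
```

Because the example matrix is perfectly consistent, the weights come out as 60%, 30%, and 10% and the consistency ratio is essentially zero; real expert judgements would normally yield a small positive ratio.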

Based on all the above, Figure 3 shows the relevant importance (%) of the WAC and SAC as obtained from the LP, the LEs, and the REPs. Compared to LEs and REPs, the LP mostly emphasizes the social and environmental aspects of the present site-selection processes, since the results of the LP questionnaire survey led to the largest relevant weights for WAC.10, WAC.9, WAC.12, WAC.6, and WAC.8 and for SAC.6, SAC.10, SAC.9, and SAC.8, among all three participatory groups. At the same time, however, the LP seems to acknowledge the importance of high wind velocity and GHI at the potential sites, since for this group, large relevant weights were also obtained for WAC.1 and SAC.1. Comparing the LEs' results with those of the REPs, it can be concluded that REPs follow a clear technoeconomic policy orientation towards the siting issue, whereas LEs focus on both economic and environmental AC. Finally, all three participatory groups assign the smallest weight to WAC.7 and SAC.7.

**Figure 3.** Relevant importance (%) of (**a**) Wind Assessment Criteria (WAC) and (**b**) Solar Assessment Criteria (SAC) based on Local Public (LP), Local Experts' (LEs'), and Renewable Energy Planners' (REPs') opinion.

### 3.5.3. Site Suitability Analysis

Having defined and prioritized the AC, site suitability analysis of the appropriate sites of Step 4 is implemented. This is achieved by utilizing the TOPSIS method [70,71]. More specifically, the values of each AC are, initially, expressed on a common and objective SI scale by deploying a 10-point suitability scale. Table 4 shows indicatively the suitability classification of 4 essential WAC and SAC. Next, an *m* × *n* initial decision matrix is established, where *m* represents the number of alternative sites and *n* the number of AC. The normalization of this matrix follows. The relative weights of the AC as obtained from the application of the BC or the AHP method (depending upon the participatory group) are then taken into account in order to estimate a weighted normalized decision matrix. The prioritization of the sites and the determination of an initial SI follow. Lastly, the 10-point suitability scale is deployed to determine the final SI and the corresponding results are incorporated in GIS for illustrating the spatial suitability allocation of the proposed sites. In the present work, for each examined RET, three site suitability analyses have been implemented, taking into account the opinion of each participatory group separately. In this way, different potential site-selection plans for the sustainable deployment of WFs and PVFs in Israel can be realized.
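The TOPSIS steps described above can be sketched as follows. This is a textbook-style illustration, not the authors' implementation: it assumes vector normalization, Euclidean distances, and that every AC has already been rescaled to the 10-point scale so that larger scores are always better. The mapping of the closeness coefficient to a 0–10 SI in `closeness_to_si` is likewise an assumption, and the decision matrix and weights are invented.

```python
import math


def topsis(matrix, weights):
    """Return the TOPSIS closeness coefficient (0..1) of each alternative site.

    matrix[i][j] is the 10-point score of site i on criterion j (higher is better);
    weights[j] is the relative weight of criterion j.
    """
    m, n = len(matrix), len(matrix[0])
    # Vector normalization of each criterion column.
    norms = [math.sqrt(sum(matrix[i][j] ** 2 for i in range(m))) for j in range(n)]
    weighted = [
        [weights[j] * matrix[i][j] / norms[j] if norms[j] else 0.0 for j in range(n)]
        for i in range(m)
    ]
    # Ideal and anti-ideal solutions (all criteria treated as benefit criteria).
    ideal = [max(col) for col in zip(*weighted)]
    anti_ideal = [min(col) for col in zip(*weighted)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti_ideal)
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.0)
    return scores


def closeness_to_si(closeness):
    """Assumed linear mapping of the closeness coefficient to a 0-10 SI."""
    return round(10.0 * closeness, 2)


# Hypothetical decision matrix: 3 sites x 2 criteria, with equal weights.
decision_matrix = [[10, 10], [1, 1], [5, 5]]
criteria_weights = [0.5, 0.5]
closeness = topsis(decision_matrix, criteria_weights)
si_values = [closeness_to_si(c) for c in closeness]
```

In this toy case the first site coincides with the ideal solution and the second with the anti-ideal one, so they bracket the scale, while the third site falls in between. Repeating the call with the weights of each participatory group would give one ranking per group.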


**Table 4.** Suitability scaling of essential Wind Assessment Criteria (WAC) and Solar Assessment Criteria (SAC).

### **4. The Field Investigation Module (Step 6)**

In the Field Investigation module (Figure 1), the sites identified in Step 5 to have high suitability (SI equal to or higher than 6.0) for the siting of large-scale WFs and PVFs are selected in order to verify the corresponding GIS results by performing field observations. For this purpose, the workflow shown in Figure 4 is deployed. Initially, the precise location of the site under investigation is determined based on the coordinates available from the GIS results. Next, the site availability (i.e., no land use conflicts) is examined in the field, and the accuracy of the site's geographic boundaries determined in GIS is validated. The inspection of the site characteristics (e.g., land use, proximity to road network, etc.) follows, along with the identification of special site-specific characteristics that cannot be recognized in GIS (e.g., land occupation restrictions). Having implemented all the above, the field data are compared with the corresponding GIS results. If these data/results agree well, the SI calculated in the Planning module does not require any update, the examined site is characterized as "field verified," and it is, thus, considered as an element of the overall output of the proposed SSEP. The opposite holds in cases where the agreement between the field data and the GIS results is not adequate. It is noted that for inaccessible locations, where direct field observations/on-site analysis cannot be realized, Google Earth Pro software is alternatively deployed as shown in Figure 4. This tool is also employed to verify the slope of terrain and the elevation of the examined sites. Table 5 shows the site characteristics examined in the present investigation by direct field observations and/or by deploying the Google Earth Pro software.

**Table 5.** Site characteristics examined in the present investigation within the Field Investigation module.


**Figure 4.** Workflow followed for the realization of the Field Investigation module (Step 6 of the proposed SSEP).

### **5. Results and Discussion**

### *5.1. Creation of SC Thematic Maps*

Numerous thematic maps were created to depict the spatial dimension of SC in WFs' and PVFs' site-selection processes.

Indicatively, Figure 5a,b includes the thematic maps of WSC.1 (wind velocity at 100 m height above the ground level, 10-year analysis) and of SSC.1 (GHI, 11-year analysis), respectively, while the thematic maps of WSC.2/SSC.3 (slope of terrain) and of WSC.8/SSC.9 (distance from land protected areas) along with WSC.17 (distance from important bird areas) are shown in Figure 6a,b, respectively.

**Figure 5.** Thematic maps of (**a**) Wind Velocity (WSC.1) and (**b**) Global Horizontal Irradiance (GHI) (SSC.1) as defined in Table 1.

**Figure 6.** Thematic maps of (**a**) Slope of Terrain (WSC.2/SSC.3) and (**b**) Distance from Land Protected Areas (WSC.8/SSC.9) and Distance from Important Bird Areas (WSC.17) as defined in Table 1.

### *5.2. Insights from the Local Public Participatory Process*

LP participation in the examined WFs and PVFs site-selection processes revealed valuable insights for the proper management of the LP prospective negative reactions to the RETs' deployment in the country of Israel.

As shown in Figure 7, most citizens (87.5%) supported the development of both RETs in Israel, whereas 12.5% of the citizens participating in the LP questionnaire survey expressed a negative attitude towards the deployment of Wind Turbines (WTs). The latter percentage corresponds to citizens who mainly live in the Northern part of Israel, near either existing or planned WFs' sites. The observed opposition against WTs was attributed to (in descending order, Figure 7): (a) landscape and visual disturbance (LVD), (b) bird collision and disturbance of wildlife habitat (BCDWH), (c) environmental impact (EI), (d) lack of high wind energy potential (NHWEP) in the country, (e) acoustic disturbance (AD), and (f) safety reasons (SR). It should be mentioned that the existing WFs in Israel do not comply with the restrictions of the SSEP proposed in this paper. This could further feed these citizens' negative feelings towards WFs' deployment in Israel.

**Figure 7.** Local Public (LP) views (%) on Renewable Energy Technologies (RETs) per geographic segment and causes of negative reactions towards WTs' deployment in Israel.

Most citizens suggested the deployment of PV projects at all construction scales (Figure 8a). However, as shown in Figure 8b, large-scale projects were popular in the Southern part of Israel (55%), since they can produce larger amounts of electricity and potentially cover the energy needs of a larger part of the population. On the other hand, small-scale projects were popular in both North and Central Israel (48.6% and 40%, respectively), due to low land availability in these parts of the country.

**Figure 8.** Local Public (LP) preferences (%) on Photovoltaic (PV) projects: (**a**) per construction scale and (**b**) per both construction scale and geographic segment.

Finally, the results of the LP questionnaire survey indicated that "Public Participation" (PP) and "Appropriate Sites" (AS) correspond to two very important aspects in RETs' site-selection processes (Figure 9). The high importance of participation highlighted in the present investigation is also in line with previous studies, which acknowledge that PP is crucial for the acceptance of wind energy projects [72–74].

**Figure 9.** Local Public (LP) views (%) on the importance of Public Participation (PP) in the RETs' site-selection processes and of Appropriate Sites (AS) for RETs.

### *5.3. Determination of Appropriate Sites*

Numerous sites for WFs' and PVFs' deployment (203 and 1396, respectively) were identified by superimposing the thematic maps related to exclusion criteria (Tables 2 and 3). Wind Appropriate Sites (WAS) smaller than 2.5 km<sup>2</sup> and Solar Appropriate Sites (SAS) smaller than 5 km<sup>2</sup> were further excluded from the analysis. Hence, 24 WAS of 160.80 km<sup>2</sup> total surface area and 87 SAS of 742 km<sup>2</sup> total surface area were finally considered appropriate for the potential siting of large-scale WFs and PVFs projects, respectively.
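The minimum-area screening can be expressed as a short sketch. The thresholds (2.5 km<sup>2</sup> for WAS and 5 km<sup>2</sup> for SAS) come from the text above; the candidate site names and areas, and the helper `filter_by_area`, are hypothetical.

```python
# Minimum required farm area per technology, in km^2 (values from the text).
MIN_AREA_KM2 = {"wind": 2.5, "solar": 5.0}


def filter_by_area(sites, technology):
    """Keep only the candidate sites whose area meets the technology threshold."""
    threshold = MIN_AREA_KM2[technology]
    return {name: area for name, area in sites.items() if area >= threshold}


# Hypothetical candidate sites and their areas in km^2.
candidates = {"site_A": 1.2, "site_B": 2.5, "site_C": 7.9}
wind_sites = filter_by_area(candidates, "wind")
solar_sites = filter_by_area(candidates, "solar")
total_wind_area = sum(wind_sites.values())
```

Applied to the full set of candidate polygons, the same filter is what reduces the 203 wind and 1396 solar candidates to the 24 WAS and 87 SAS retained for the suitability analysis.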

### *5.4. Site Suitability Analyses' Results*

Table 6 presents the results of the six site suitability analyses implemented in the last step of the Planning module, where the SI values are classified into three classes: low suitability (0.01–3.99), moderate suitability (4.00–5.99), and high suitability (6.00–10.00).

Each land SI reveals the suitability of the potential sites for the considered RETs and visualizes their spatial allocation on the final suitability maps (Figures 10–12).
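The three suitability classes can be written as a simple lookup, sketched below. The class limits are those stated above; treating SI values as rounded to two decimals (so that no value falls between 3.99 and 4.00) is an assumption, and `classify_si` is a hypothetical helper.

```python
def classify_si(si):
    """Map a Suitability Index (0-10) to the class used in Table 6."""
    si = round(si, 2)  # assumed two-decimal reporting precision
    if not 0.0 <= si <= 10.0:
        raise ValueError("SI must lie between 0 and 10")
    if si >= 6.0:
        return "high"
    if si >= 4.0:
        return "moderate"
    return "low"


# Hypothetical SI values for three sites.
example_classes = [classify_si(v) for v in (3.99, 4.00, 6.00)]
```

Sites in the "high" class are the ones that qualify for the Field Investigation module.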


**Table 6.** Suitability analyses results using Geographic Information Systems (GIS) (final results of the Planning module).

**Figure 10.** Suitability Index (SI) spatial allocation based on Local Public (LP) for (**a**) Wind Appropriate Sites (WAS) and (**b**) Solar Appropriate Sites (SAS).

**Figure 11.** Suitability Index (SI) spatial allocation based on Local Experts (LEs) for: (**a**) Wind Appropriate Sites (WAS) and (**b**) Solar Appropriate Sites (SAS).

**Figure 12.** Suitability Index (SI) spatial allocation based on Renewable Energy Planners (REPs) for (**a**) Wind Appropriate Sites (WAS) and (**b**) Solar Appropriate Sites (SAS).

The results of the WFs' site suitability analyses (Table 6) demonstrate that the highest suitability of the potential sites is obtained by considering the LP opinion (2 and 11 WAS with high and moderate suitability, respectively). On the other hand, the REPs' opinion determined the potential PVFs' sites with the highest suitability (28 and 47 SAS with high and moderate suitability, respectively). As for the WAS and SAS spatial suitability allocation, Figures 10–12 show the corresponding suitability maps according to LP views and concerns, LEs' knowledge and experience, and REPs' expertise, respectively. It is noted that some sites were identified as suitable for the deployment of both RETs (e.g., WAS.2 and SAS.1).

Finally, Table 7 shows the WAS (SI > 6.0) and SAS (SI > 7.0) selected to be further examined in the Field Investigation module. This table also includes the area and the SI of the sites, along with the participatory group that led to this SI.

**Table 7.** Wind Appropriate Sites (WAS) and Solar Appropriate Sites (SAS) selected to be examined in the Field Investigation module.


### *5.5. Field Investigation Results*

The further assessment of WAS.1-WAS.4 and SAS.1-SAS.2 (Table 7) within the Field Investigation module was implemented by performing on-site analysis/direct field observations. However, for SAS.3-SAS.8 Google Earth Pro was deployed, since those sites were not accessible. Table 8 shows the main field investigation results.


**Table 8.** Field investigation results.

For all examined sites, GIS results were in very good agreement with the corresponding field investigation data, demonstrating the high credibility of the present GIS site-selection analysis. In addition, the field investigation verified the prioritization of the above sites. The special site-specific characteristics identified included, but were not limited to, the following: (a) land occupation restrictions, (b) indigenous villages that are unrecognized by the Israeli government (i.e., Bedouin villages), (c) abandoned and semi-ruined buildings, and (d) existing PV installations (apart from large-scale projects) on SAS geographic boundaries. The final characterization of each examined site as "field verified" was implemented by considering the importance of the identified site-specific characteristics in terms of their impact on the realization of the projects. Within this context, 2 WAS and 7 SAS were characterized as "field verified" sites having the SI that resulted from the Planning module (Table 7) and, thus, they form the overall output of the proposed SSEP in the case of Israel. The remaining three sites (i.e., WAS.1, WAS.2, and SAS.1), corresponding to "non-field verified" sites, require an adequate update of their SI, since their site-specific characteristics (e.g., land occupation restrictions and indigenous villages that are unrecognized by the Israeli government (Bedouin villages)) have been considered to affect to a great extent the potential deployment of large-scale WFs or PVFs.

### **6. Conclusions**

In the present work, we develop an innovative SSEP framework to identify and prioritize appropriate, technically and economically feasible, environmentally sustainable, as well as socially acceptable, siting solutions of large-scale WF and PVF projects at national scale. Spatial planning tools (GIS) and multi-criteria decision-making methods (AHP and TOPSIS) were combined with versatile participatory planning techniques, to actively involve three different participatory groups (LP, LEs, and REPs) into the site-selection processes. A field investigation procedure was introduced, for the first time, to verify the GIS suitability analysis results by performing direct field observations/on-site analysis, or by deploying alternative tools, such as Google Earth Pro, for sites that were inaccessible.

The proposed site-selection methodological framework was applied in Israel. Thirty criteria (SC and AC), corresponding to several economic, technical, environmental, societal, political, and legal aspects were employed for the WFs' and PVFs' siting. The final outcome of the proposed framework was the identification of two WAS (WAS.3 and WAS.4) situated in the North Israel and seven SAS (SAS.2–SAS.8) situated in the Central and the South Israel with high suitability for RES exploitation in Israel. The above sites were accepted by all participatory groups and they were verified in the field. Key concluding remarks of the present study can be summarized as follows:


The proposed methodology includes successive modules and definite steps and can be applied in several study areas and for various spatial planning scales. It could be also further extended by integrating drone technologies within the Field Investigation module in order to identify/map special site-specific characteristics and, thus, update, if necessary, the SI calculated in GIS. Finally, an ecological impact assessment study should accompany each proposed project in selected WAS and SAS, since the entire land of Israel is of significant importance in terms of biological diversity.

**Author Contributions:** Conceptualization, S.S., H.U., D.M., E.L. and D.G.V.; methodology, S.S., E.L., D.G.V., H.U. and D.M.; software, S.S.; validation, S.S. and G.S.; formal analysis, S.S.; investigation, S.S. and G.S.; data curation, S.S.; writing – original draft preparation, S.S. and G.S.; writing – review and editing, E.L., D.G.V., D.M. and H.U.; visualization, S.S.; supervision, E.L., D.G.V., H.U. and D.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the EUROPEAN UNION's HORIZON 2020 RESEARCH and INNOVATION PROGRAMME under the Marie Skłodowska-Curie grant agreement No. 778039 (Project: "Planning and Engagement Arenas for Renewable Energy Landscapes" (PEARLS)).

**Acknowledgments:** The authors would like to thank Na'ama Teschner, Lecturer at Ben-Gurion University of the Negev, for supporting the implementation of the present research by providing information about the study area as well as Erez Peri, PhD Student at Tel Aviv University, for providing "high-voltage electricity grid" and "military zones" data.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

### **Abbreviations**

RE, Renewable Energy; RET, Renewable Energy Technology; WF, Wind Farm; PVF, Photovoltaic Farm; RES, Renewable Energy Sources; GIS, Geographic Information Systems; MCDM, Multi-Criteria Decision-Making; AHP, Analytical Hierarchy Process; LE, Local Expert; SSEP, Sustainable Spatial Energy Planning; LP, Local Public; REP, Renewable Energy Planner; SC, Siting Criterion; TOPSIS, Technique for Order Preference by Similarity to Ideal Solution; SI, Suitability Index; AC, Assessment Criterion; BC, Borda Count; GID, Geographic Information Dataset; GHI, Global Horizontal Irradiance; WAC, Wind Assessment Criterion; SAC, Solar Assessment Criterion; WT, Wind Turbine; LVD, Landscape and Visual Disturbance; BCDWH, Bird Collision and Disturbance of Wildlife Habitat; EI, Environmental Impact; LHWEP, Lack of High Wind Energy Potential; AD, Acoustic Disturbance; SR, Safety Reason; PP, Public Participation; AS, Appropriate Site; WAS, Wind Appropriate Site; SAS, Solar Appropriate Site.

### **Appendix A**

A detailed description of the SC used in the present investigation is cited below.

*Wind Velocity (WSC.1)*: Mean wind velocity at 100 m above the ground level defined according to LEs' opinion and studies on geographic regions with relevant climatic conditions (e.g., [11]).

*GHI (SSC.1)*: Total amount of direct normal, diffuse horizontal, and ground-reflected irradiance [24]. An 11-year statistical analysis (2006–2016) was conducted for 292 sites and GIS interpolation tools were used to estimate GHI spatially on a national scale.

*Average Maximum Temperature (SSC.2)*: The performance of the modules of PV systems declines in high temperatures [75,76]. Average maximum temperature is selected, instead of mean temperature [e.g., 24,33,39], due to relatively high temperatures in Israel, especially during the summer period. A 10-year statistical analysis (2009–2018) of average maximum air temperature has been conducted for 292 sites and GIS interpolation tools were used to estimate the average maximum temperature spatially on a national scale.

*Slope of Terrain (WSC.2/SSC.3)*: Slope of terrain affects the project's investment cost. Larger slopes lead to larger installation costs.

*Elevation (WSC.3/SSC.4)*: Sites at high altitudes are avoided for large-scale WFs and PVFs, since rare flora and fauna species are commonly found at those altitudes, while the road network and the electricity grid are frequently inadequate [77].

*Military Zones (WSC.4/SSC.5)*: Land areas officially used by the National Army for training and other purposes or as firing fields; thus, they cannot be considered for any other use.

*Distance from the Existing Road Network (WSC.5/SSC.6)*: The distance of a WF or PVF from the existing road network could affect the construction/maintenance costs and it could cause adverse effects (e.g., deforestation) in the environment due to road construction [18]. A minimum distance is defined for safety and aesthetic reasons as well as a maximum threshold for reducing associated costs and environmental concerns.

*Distance from the Railways Network (WSC.6/SSC.7)*: The existing railways network and a proper buffer zone from it are excluded for safety, technical, and social reasons.

*Distance from the Existing High-Voltage Electricity Grid (WSC.7/SSC.8)*: A proper safety distance is defined from the national electricity grid to avoid any grid damage during WFs' and PVFs' installation along with a maximum threshold from it to avoid high construction/installation costs. Connection to the high or extra high-voltage grid is selected, due to risks (e.g., cable destruction due to grid overloading) associated with medium or low voltage grid [78,79].

*Distance from Land Protected Areas (WSC.8/SSC.9)*: Appropriate distance from national environmental protected areas (i.e., nature reserves and national parks) and national forests (if necessary) for preserving their environmental importance.

*Distance from Civil and Military Aviation Areas (WSC.9/SSC.10)*: The operation of WTs disturbs significantly the airports' surveillance radar signals [43], while the glint from PV panels can distract pilots' vision and disturb also airports' radars if PV panels are located close to one another [43]. Two different safety distances have been applied from all civil and military aviation areas (airports, airbases, public, or private airfields) in Israel.

*Landscape Protection/Visual and Acoustic Disturbance (WSC.10/SSC.11)*: Appropriate distance from residential areas and solitary residences contributing to landscape protection, visual, and acoustic disturbances avoidance and social acceptance.

*Distance from Touristic Zones (WSC.11/SSC.12)*: Appropriate distance from touristic sites (hotels, guesthouses and observation points, and tourist attractions) to reduce public concerns towards wind and solar energy.

*Distance from Mineral Extraction Sites/Quarrying (WSC.12/SSC.13)*: Appropriate distance from land areas officially used for mineral extraction/quarrying based on their low aesthetic value and high energy needs.

*Distance from Economic Activities (WSC.13/SSC.14)*: Appropriate distance from land areas officially used for industrial and commercial zones.

*Distance from Archaeological, Historical, Cultural Areas (WSC.14/SSC.15)*: Appropriate distance from World Heritage Sites (WHS), nominated and protected by the United Nations Educational, Scientific and Cultural Organization (UNESCO), archaeological monuments, museums, historical places, and cultural areas to preserve their historical/cultural importance.

*Distance from Water Areas (WSC.15/SSC.16)*: Appropriate distance from water bodies, rivers, canals, and streams.

*Distance from Coastline (WSC.16/SSC.17)*: Appropriate distance from the coastline according to the national legal restrictions [49].

*Distance from Important Bird Areas (WSC.17)*: Appropriate distance from areas hosting a variety of significant birds for reducing the potential risk of birds' collision on the WTs and protecting rare birds' species.

*Farm Minimum Required Area (WSC.18/SSC.18)*: Minimum required area to enable the siting of large-scale WFs and PVFs.

### **Appendix B**

A detailed description of AC not included in the SC of Appendix A is cited below.

*Land Use (WAC.9/SAC.9)*: Land areas corresponding to open areas, shrubs, grass areas, meadows, vineyards, orchards, and agricultural farms. In Israel, WFs or PVFs are permitted to be proposed and installed in sites currently used as agricultural farms, vineyards, or orchards, due to the low availability of land in the country. However, open areas are considered here as preferable for WFs' or PVFs' siting, since no land use conflict is created.

*Proximity to Areas with High Population (WAC.10/SAC.10)*: High population areas require high amounts of electricity, especially at the peak time of domestic electricity consumption in the study area (i.e., summer period). RETs installation near areas with high electricity consumption could cover the increased peak electricity demand and could contribute significantly to reducing electricity losses and, thus, the cost of energy supply.

*Wind Energy Potential (WAC.11)*: Total amount of energy that a potential onshore wind project could generate. The larger the WAC.11 value is, the higher the SI is for the specific AC. For each appropriate site, WAC.11 was quantified based on: (a) the land requirements for generating 1 MW from WTs according to the LEs' opinion and (b) the area factor indicating the fraction of the area that can be covered by WTs. This factor was defined based on previous studies related to the proper micrositing configuration in WFs [80].

*Visibility from the Residential Areas (WAC.12)*: Distance and altitude at which a WF can be seen by a resident with an unaided eye. Relevant visibility maps are produced based on the elevation raster of the total surface area of Israel and they illustrate areas where an installed WT with total height equal to 150 m is visible or not from the residential areas. The referred height is defined by the LEs based on the existing and future standards of WFs in Israel. The higher the degree of visibility from the residential areas is, the lower the SI is.

*Solar Energy Potential (SAC.11)*: Total amount of energy that a potential PV project could generate. The larger the SAC.11 value is, the higher the SI is for the specific AC. For each appropriate site, SAC.11 was quantified based on: (a) the land requirements for generating 1 MW from PV panels according to the LEs' opinion, (b) the existing standards and best practices of PV projects in Israel, as well as (c) the area factor, taken equal to 70% according to the maximum load occupancy of PV panels with the minimum shading effect [35,38].

*Land Aspect (SAC.12)*: Compass direction (e.g., Northern, Southern, or Western) that a slope faces in the proposed site. SAC.12 is quite important for the efficiency of PV installations, since it is directly linked with the amount of solar energy that could be produced during the daytime [77]. The south-oriented appropriate sites receive the highest suitability values [24,28,36].
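For the energy potential criteria (WAC.11 and SAC.11), the description above combines a site's area, an area factor, and the land required per MW. A minimal sketch of one plausible way to combine them into an installable capacity is given below. The formula, the helper `installable_capacity_mw`, and the land requirement of 0.02 km<sup>2</sup> per MW are assumptions for illustration only; the 70% area factor is the value quoted for PV, and converting capacity to energy would additionally require yield data not given here.

```python
def installable_capacity_mw(site_area_km2, area_factor, land_per_mw_km2):
    """Estimate installable capacity (MW) from usable area and land need per MW."""
    if not 0.0 < area_factor <= 1.0:
        raise ValueError("area_factor must be in (0, 1]")
    if land_per_mw_km2 <= 0.0:
        raise ValueError("land_per_mw_km2 must be positive")
    usable_area_km2 = site_area_km2 * area_factor
    return usable_area_km2 / land_per_mw_km2


# Hypothetical 10 km^2 solar site, 70% area factor (from the text),
# and an assumed land requirement of 0.02 km^2 per MW.
capacity_mw = installable_capacity_mw(10.0, 0.70, 0.02)
```

With these illustrative inputs the site would host roughly 350 MW; a larger value translates into a higher SI for the criterion.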

### **References**


### *Article* **Life-Cycle Land-Use Requirement for PV in Vietnam**

**Eleonora Riva Sanseverino <sup>1</sup> , Maurizio Cellura <sup>1</sup> , Le Quyen Luu 1,2,\*, Maria Anna Cusenza <sup>1</sup> , Ninh Nguyen Quang <sup>2</sup> and Nam Hoai Nguyen <sup>2</sup>**


**Abstract:** Over the last 15 years, photovoltaics (PV) in Vietnam has experienced development. The increased installed capacity of PV requires more land for installation sites as well as for manufacturing the plants' components and for waste treatment during the plants' decommissioning. As a developing country, in which more than 80% of the population's livelihood depends on agriculture, there are concerns about the competition for land between agriculture and solar development. This paper estimates the life-cycle land-use requirement for PV development in Vietnam, to provide scientific evidence for policy makers on the quantity of land required, so that the land budget can be suitably allocated. The direct land-use requirement for PV ranges from 3.7 to 6.7 m<sup>2</sup> MWh<sup>−1</sup> year, and the total fenced area is 7.18 to 8.16 m<sup>2</sup> MWh<sup>−1</sup> year. Regarding the life-cycle land use, the land occupation is 241.85 m<sup>2</sup>a and land transformation is 16.17 m<sup>2</sup> per MWh. Most of the required land area is for the installation of the PV infrastructure, while the indirect land use of the background process is inconsiderable.

**Keywords:** land use; life cycle thinking; photovoltaic system

**Citation:** Sanseverino, E.R.; Cellura, M.; Luu, L.Q.; Cusenza, M.A.; Nguyen Quang, N.; Nguyen, N.H. Life-Cycle Land-Use Requirement for PV in Vietnam. *Energies* **2021**, *14*, 861. https://doi.org/10.3390/en14040861

Academic Editor: Surender Reddy Salkuti Received: 4 January 2021 Accepted: 30 January 2021 Published: 7 February 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

### **1. Introduction**

The achievement of the international targets on climate change set during the Paris Climate Conference (COP21) will require a deep transition towards a decarbonized global energy sector [1]. Renewable energy resources (RESs) are recognised as one of the optimal options to reduce energy-related greenhouse gas (GHG) emissions [2]. According to the International Renewable Energy Agency (IRENA) projections, the share of renewable energy in the power sector would increase from 25% in 2017 up to 85% by 2050, mostly through growth in solar and wind power generation [2].

The strategies implemented in the energy sector in order to mitigate climate change could involve a trade-off among the economic sectors due to the competitive uses of limited natural resources [3]. This is the case of land availability and the competitive use between food and renewable energy production [4]. The exploitation of renewable energy systems, such as photovoltaics (PV), bioenergy, etc. will involve the expansion of land devoted to energy production [5–7].

Historically, land has been used for agriculture and food production. In line with socio-economic development and population growth, increased demand on food will drive the expansion of agricultural land use. According to the Food and Agriculture Organization (FAO), feeding the global population by 2050 will require a 60% increase in food production [7]. Food production security is a key pillar within the Sustainable Development Goal "Zero Hunger" (SDG 2) set by the 2030 Agenda of Sustainable Development of the United Nations which aims to end hunger and malnutrition by 2030.

Current resource use trajectories could compromise inclusiveness and sustainable development. In this framework, the water–energy–food nexus approach is emerging as a systemic and integrated management of limited resources needed to achieve competing objectives [7]. In order to support stakeholders in the resources planning, the availability of reliable data on the resource demand in the different implemented strategies is of paramount importance [8].

In this context, this study aims at evaluating the nexus between a significant increase of PV energy and the land needed for the PV system installation. In order to give a reliable estimation of the needed land, the authors apply a life-cycle approach [9–11] to assess both direct and indirect land use related to the whole life cycle of the systems examined. The functional unit is one MWh of solar power. The studied system boundary covers different stages of PV life cycle from cradle (raw material extraction and PV components manufacture) to gate (operation of PV plant and generation of solar power electricity). The end of life phase has not been considered in this paper as the life cycle inventories of the end-of-life treatment of PV modules are weak [12].

The case study is the Vietnam energy sector in which PV will increase from the current 5000 MW<sub>p</sub> up to 13 GW<sub>p</sub> in the near future [13]. Moreover, the competition of land for agriculture and PV appears more strongly as the Vietnamese economy is growing. As in many fast-developing countries, the Vietnamese economy is changing its structure from an agriculture-based economy (46.3% of gross domestic product (GDP) in 1988 to 13.96% in 2019) to an industry-based (23.96% of GDP in 1988 to 34.49% in 2019) and a service-based (from 29.74% of GDP in 1988 to 41.64% in 2019) economy [14].

PV has been widely shown to contribute to GHG emission reduction, as it emits no carbon dioxide, methane or nitrous oxide during its operation stage [15]. Over the whole life cycle, the GHG emissions of PV are less than one-fourth of those from an oil-fired steam turbine plant and one-half of those from a gas-fired combined cycle plant [16]. However, PV contributes significantly to abiotic resource consumption, freshwater ecotoxicity and human toxicity [17–19]. Considering the global expected diffusion of the PV energy system and the potential competition for land among different economic sectors, a deeper insight into life-cycle land use is required to support policy makers in land resource allocation.

Several papers have studied land use for PV installation. Pimentel et al. reported a land requirement for PV of 28 m<sup>2</sup> for one MWh [20]. However, the authors did not follow a life-cycle approach: only the land requirement for PV system installation was accounted for, and the indirect land use was not clearly mentioned.

Fthenakis and Kim conducted a review on life-cycle land requirements of different energy generation technologies: coal, natural gas, hydroelectric, PV, wind and biomass [21]. In this study, the studied PV structures were constructed in areas with solar irradiance of 1.7–2.5 MWh m<sup>−2</sup> year<sup>−1</sup> and 9.5–20.2% solar to electricity efficiency (module efficiency times performance ratio). In terms of land transformation, solar power transformed a land area of 0.2 to 0.5 m<sup>2</sup> MWh<sup>−1</sup> [21]. A solar PV plant (SPP) with 2.4 MWh m<sup>−2</sup> year<sup>−1</sup> of solar irradiance, at 13% module efficiency and 80% performance ratio, occupied 9.9 m<sup>2</sup> MWh<sup>−1</sup> year [21]. Besides, indirect land impacts related to PV modules and balance-of-system (BOS), such as inverter, transformer, mounting structures and energy for PV (e.g., fuels consumed during transportation of the PV plants' components), are negligible compared to direct land use, at between 22.5 and 25.9 m<sup>2</sup> GWh<sup>−1</sup> [21].

Later on, in 2015, Aman et al. reviewed technical and environmental aspects of solar systems, e.g., concentrating solar power (CSP) and PV [22]. In the paper, the land transformation and land occupation were compared among different types of power technologies, including coal and PV. The authors followed a life cycle approach and based the assessment on the life cycle land use for PV systems obtained by Fthenakis and Kim (9.9 m<sup>2</sup> MWh<sup>−1</sup> year) [21]. In general, the land use of PV ranges from high to low depending on the location where the PV modules are mounted [22], for example from low to high solar irradiance of the installation sites. While the direct land disturbed by the solar infrastructures was estimated at 5.9 acres per MW for small PV and 7.2 acres per MW for large PV [22], the exact amount of indirect land use due to background processes, for example Si material extraction, was not clear.

Bukhary et al. estimated and harmonized water and land use for CSP and PV technologies. The reviewed results were then incorporated into a system dynamics model to analyze water and land availability and usage, and the relevant carbon emission reduction, in six states in the USA based on their renewable portfolio standard (RPS) during 2015–2030 [23]. In terms of land use, it was indicated that SPPs require an area of 18.1 × 10<sup>6</sup> m<sup>2</sup> for 750 MW [23]. The land use for PV systems was not assessed from a life cycle perspective, since its computation was based on data inferred from [24], which estimated the land requirement for PV systems by including only land use for facility installation.

From the literature analysis, it is clear that only a few studies are available on the land requirements of PV systems and that most of them did not follow a life-cycle approach. Their focus was on the direct land use, and they neglected the indirect land use requirement, except for the work of Fthenakis and Kim [21], who performed a life cycle assessment (LCA) on a specific SPP. In addition, the reported land requirements of PV disagree, ranging among 45 m<sup>2</sup> MWh<sup>−1</sup> [25]; 28 m<sup>2</sup> MWh<sup>−1</sup> [20]; 9.9 m<sup>2</sup> MWh<sup>−1</sup> year [21]; 5.9–7.2 acres per MW [22]; and 18.1 × 10<sup>6</sup> m<sup>2</sup> for 750 MW [23]. Considering that the land-use requirement for PV depends on the solar irradiance and the technology efficiency, this disagreement may originate from the different solar irradiance at the studied installation sites, as well as from the applied technologies. Moreover, the studies differ in approach and scope, for example in whether and to what extent indirect land use is included, in both upstream (panels and BOS manufacturing) and downstream processes (end-of-life treatment of the plants' infrastructure). These differences would also contribute to the variation in the results obtained.

In this context, this study contributes to the state of the art by applying a life-cycle approach to quantify the land use requirement of PV development, including all the devices and all the processes needed for a PV system to deliver its function. Although the study provides a quantitative calculation of the land area used for PV in Vietnam, it can be used as a comparative basis for the life-cycle land use requirement of PV globally. The estimated land-use requirement for PV will support strategic land use planning, in which land resources are balanced and suitably allocated to sustainably exploit the land budget and avoid competition on land use among different socio-economic-industrial activities, including agricultural and energy production. The preliminary estimation of the land needed for the installation of PV can be useful to identify the most suitable site for the best layout of the plant in order to increase its efficiency and reduce costs [26]. The obtained results would benefit not only the Vietnamese government but also the global policy-making process in sustainably allocating limited land resources, as well as PV investors and developers in financing PV projects economically.

### **2. Methods and Data**

### *2.1. Methods*

The most common life-cycle approach applied to environmental impacts is the LCA methodology, which is specified in the international standards ISO 14040 and ISO 14044 [9,10]. Life-cycle thinking (LCT) covers environmental, social and economic impacts of a product (or service) from the natural resource extraction to the end-of-life of the product [11]. As a pillar of LCT, "LCA examines and evaluates all inputs, outputs and potential environmental impacts of a product system over its life cycle" [10]. It is a holistic approach, extending the traditional boundary of production stages to include upstream and downstream stages of material extraction and waste management along the product's value chain. LCA, consequently, considers the limited capacity of natural resources, including land, to meet the increasing demand of human socio-economic activities, and avoids shifting the environmental burden of one stage into other stages during the whole value chain of the product. Examples of this approach are the inclusion of land use for mining coal for thermal electricity production or the consideration of land use for growing feedstock for biofuels for energy production.

In this study, the life-cycle approach is applied by considering both the direct land use for SPP installation and operation and the indirect land use during background stages, e.g., raw material extraction and component production. As rooftop solar systems are installed on the roofs of buildings, they require almost no land during installation and operation. Moreover, the share of rooftop solar systems' installed capacity in Vietnam is small compared to that of SPPs [13]. Therefore, the focus is on SPP life-cycle land use requirements.

For direct land use requirement calculation, the study followed the method of Bukhary et al. [23]. First, the direct land use estimate (*L*) of each SPP is calculated based on the following Equation (1):

$$L = \frac{P}{I \times SE} \tag{1}$$

in which:

*L*: direct land use estimate. It is the direct land area occupied by the solar structure, measured in m<sup>2</sup> MWh<sup>−1</sup> year.

*P*: packing factor (unitless). It is the ratio of land cover by the array, including land area for the shading, to the actual land cover of the modules [23].

*I*: solar irradiance, measured in MWh m<sup>−2</sup> year<sup>−1</sup>.

*SE*: solar to electricity efficiency (unitless). It is the product of performance ratio and module efficiency [23].
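As a minimal sketch of Equation (1), the Python function below computes the direct land use estimate from the packing factor, the solar irradiance and the two factors of *SE*. The function name and the input values in the example are illustrative assumptions chosen within the ranges quoted in this paper; they are not data for any specific SPP.

```python
def direct_land_use(packing_factor, irradiance, module_efficiency, performance_ratio):
    """Equation (1): L = P / (I * SE), with SE = module efficiency * performance ratio.

    irradiance is in MWh per m2 per year; the result is in m2 per MWh per year.
    """
    if irradiance <= 0:
        raise ValueError("irradiance must be positive")
    if not (0 < module_efficiency <= 1 and 0 < performance_ratio <= 1):
        raise ValueError("efficiencies must lie in (0, 1]")
    solar_to_electricity = module_efficiency * performance_ratio
    return packing_factor / (irradiance * solar_to_electricity)


# Hypothetical plant: packing factor 1.4, irradiance 1.9 MWh/m2/year,
# module efficiency 0.18, performance ratio 0.8 (all assumed for illustration).
example_land_use = direct_land_use(1.4, 1.9, 0.18, 0.8)
print(f"{example_land_use:.2f} m2 per MWh per year")
```

Because *L* is inversely proportional to both *I* and *SE*, a site with higher irradiance or more efficient modules needs proportionally less land per unit of electricity.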

Then, Equation (2) below is used to harmonize the land use estimates of different SPPs in Vietnam by adjusting several technical characteristics such as solar irradiance, module efficiency, performance ratio and lifetime of the plant. Due to the differences in the solar irradiance of the installation sites as well as in the applied technologies of the PV plants, the direct land-use estimates differ among plants. The harmonized land use estimate provides a generalized result for the land use requirement of PV in Vietnam, regardless of the technical characteristics. The harmonized solar irradiance, module efficiency and performance ratio are assumed based on the literature [23,27] and the mean values of the actual situation in Vietnam, while the published features (land-use estimate, solar irradiance, module efficiency and performance ratio) are the actual data of different SPPs in Vietnam. Details of the required data are discussed in the following parts.

$$Ni_{\text{harm}} = \frac{Ni_{\text{pub}} \times I_{\text{pub}} \times ME_{\text{pub}} \times PR_{\text{pub}} \times LT_{\text{pub}}}{I_{\text{harm}} \times ME_{\text{harm}} \times PR_{\text{harm}} \times LT_{\text{harm}}} \tag{2}$$

in which:

*Ni*<sub>harm</sub>: harmonized land use estimate (m<sup>2</sup> MWh<sup>−1</sup> year).

*Ni*<sub>pub</sub>: land use estimate of all studied SPPs in Vietnam (m<sup>2</sup> MWh<sup>−1</sup> year).

*I*<sub>pub</sub>: solar irradiance in installation sites (MWh m<sup>−2</sup> year<sup>−1</sup>).

*I*<sub>harm</sub>: harmonized solar irradiance, at 1.9 MWh m<sup>−2</sup> year<sup>−1</sup>, which is the average solar irradiance of installation sites.

*ME*<sub>pub</sub>: module efficiency of studied SPPs (unitless).

*ME*<sub>harm</sub>: harmonized module efficiency, at 0.18, which is the mean value of actual modules' efficiency in Vietnam.

*PR*<sub>pub</sub>: performance ratio of studied SPPs (unitless).

*PR*<sub>harm</sub>: harmonized performance ratio (unitless), at 0.8, which is based on [23].

*LT*<sub>pub</sub>: lifetime of studied SPPs (year).

*LT*<sub>harm</sub>: harmonized lifetime (year), at 30 years, which is based on [23].
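The harmonization in Equation (2) can be sketched in a few lines of Python. The harmonized constants below are the ones stated in the definitions above (1.9, 0.18, 0.8 and 30 years); the class name, the function name and the "published" plant values are hypothetical and serve only to illustrate the calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlantParams:
    irradiance: float         # I, MWh per m2 per year
    module_efficiency: float  # ME, unitless
    performance_ratio: float  # PR, unitless
    lifetime: float           # LT, years

    def product(self) -> float:
        return (self.irradiance * self.module_efficiency
                * self.performance_ratio * self.lifetime)


# Harmonized reference values as stated in the text.
HARMONIZED = PlantParams(irradiance=1.9, module_efficiency=0.18,
                         performance_ratio=0.8, lifetime=30.0)


def harmonize(ni_pub: float, pub: PlantParams,
              harm: PlantParams = HARMONIZED) -> float:
    """Equation (2): rescale a published land use estimate to harmonized conditions."""
    return ni_pub * pub.product() / harm.product()


# Hypothetical published values for one plant (not taken from Table S1).
plant = PlantParams(irradiance=2.0, module_efficiency=0.17,
                    performance_ratio=0.82, lifetime=30.0)
ni_harm = harmonize(5.0, plant)
print(f"harmonized estimate: {ni_harm:.3f} m2 per MWh per year")
```

A plant whose published parameters already equal the harmonized ones is left unchanged by the formula, which is a convenient sanity check on any implementation.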

The indirect land-use requirement is calculated on the basis of secondary data for the production of components, e.g., modules and balance-of-system, and for the transportation of the components to the construction sites of the PV plants, extracted from the Ecoinvent 3.6 database [28]. The technologies considered include PV panels 330 kW; Inverter 100 kW; Transformer 4500 kVA; Transformer 63 MVA; Transformer 100 kVA; Sea transportation by tanker, for dry goods; Road transportation >32 metric tons; and Electrical installation of all components. The inventory data are global average values, except for Road transportation. The vehicle technology standard for Road transportation is Euro 4, which matches the road vehicle standard in Vietnam.

### *2.2. Solar Irradiance in Vietnam*

The territory of Vietnam is long and narrow, extending from the 8th to the 23rd latitude of the Northern hemisphere, within the tropical region. Consequently, the solar energy potential in Vietnam is quite large and can feasibly be exploited for the socio-economic and industrial development of the country. However, due to the shape of the territory, the exploitation potential of solar energy differs among regions. The national average solar irradiance is around 4–5 kWh m<sup>−2</sup> day<sup>−1</sup>, but it is quite low in the North and high in the South of Vietnam [29]. From the 17th latitude towards the South of Vietnam, the solar irradiance is high and stable throughout the whole year, with a difference of about 20% between the dry and rainy seasons. The duration of sunlight is more than 12 h during days from the vernal equinox to the autumnal equinox, and less than 12 h during days from the autumnal equinox to the vernal equinox. The average sunny hours are between 1800 and 2600 h annually [29].
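The irradiance figures in this section are daily values in kWh, whereas Equations (1) and (2) use annual values in MWh. The short Python sketch below shows the conversion, assuming a 365-day year; the function names are illustrative. Under that assumption, 4–5 kWh m<sup>−2</sup> day<sup>−1</sup> corresponds to roughly 1.46–1.83 MWh m<sup>−2</sup> year<sup>−1</sup>.

```python
DAYS_PER_YEAR = 365   # assumption: non-leap year
KWH_PER_MWH = 1000


def daily_kwh_to_annual_mwh(daily_kwh_per_m2: float) -> float:
    """Convert kWh per m2 per day to MWh per m2 per year."""
    return daily_kwh_per_m2 * DAYS_PER_YEAR / KWH_PER_MWH


def annual_mwh_to_daily_kwh(annual_mwh_per_m2: float) -> float:
    """Convert MWh per m2 per year to kWh per m2 per day."""
    return annual_mwh_per_m2 * KWH_PER_MWH / DAYS_PER_YEAR


low = daily_kwh_to_annual_mwh(4.0)
high = daily_kwh_to_annual_mwh(5.0)
print(f"national average: {low:.3f} to {high:.3f} MWh per m2 per year")

# The harmonized irradiance of 1.9 MWh per m2 per year, expressed per day.
print(f"harmonized value: {annual_mwh_to_daily_kwh(1.9):.2f} kWh per m2 per day")
```

The harmonized value of 1.9 MWh m<sup>−2</sup> year<sup>−1</sup> sits slightly above the national average range, which is consistent with the studied plants being located in the higher-irradiance regions.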

In general, the amount of solar radiation in Vietnam is thus relatively good. In particular, in the Central Highlands, Southern Central and South regions, the amount of solar irradiance is very good for developing PV systems. Table 1 presents the annual irradiance of some provinces in Vietnam.


**Table 1.** Annual solar irradiance of some provinces in Vietnam. Reproduced from [30], The World Bank Group: 2020.

### *2.3. Technical Characteristics of Photovoltaics (PV)*

Solar to electricity efficiency depends on the technology and the surrounding environment. It is proportional to the performance ratio and the modules' efficiency. Performance ratio is the ratio of the alternating current electricity generated by the PV modules, taking into account system loss, to the electricity calculated from the direct current module efficiency and solar irradiance [27]. A review of the performance ratio of PV found it to be around 0.8 [23], which is the value assumed for the harmonized performance ratio in Equation (2). In this study, the published performance ratios are the actual figures of different SPPs in Vietnam, ranging from 0.79 to 0.85.

Module efficiency is the percentage of solar energy converted into direct current electricity by the modules. Module efficiency depends on the surrounding environment, e.g., dust deposited on the panel and dust in the atmosphere in the form of smog or air pollution. The accumulation of dust results in efficiency loss [31,32]. It was also pointed out by Maghami et al. that dust accumulation on the panel decreases both current and voltage output, while dust in the atmosphere decreases the current output only [33]. The global average module efficiency is around 0.19 [23]. In this study, the module efficiencies are the actual figures of different SPPs in Vietnam. These numbers range from 0.17 to 0.19. The harmonized module efficiency is the mean value of the actual module efficiency in Vietnam, at 0.18, which is lower than the global average.

The typical lifetime of an SPP is between 30 and 60 years [16]. A review of LCAs of PV plants and solar rooftop systems assumed the lifetime to be 25 to 30 years [27]. Most panel manufacturers guarantee the panel efficiency for 25 years. After 25 years, the panel efficiency reduces quickly. Therefore, a lifetime of 30 years is selected for both harmonized and published values.

Packing factor is the ratio of the total land to the actual land cover [23]. The total land is the area covered by the array, including the area provided to avoid shading and maintenance activities. The actual land is the area covered by panels or mirrors [23]. The work in [21] used different packing factors for solar power technologies, ranging from 2.1 to 5. For mono-crystalline (mono-Si) PV, the packing factor of 2.5 has been applied. In this study, packing factors are actual figures of different SPPs in Vietnam. The packing factors of the studied SPPs range from 1 to 1.8.

The studied SPPs include 11 SPPs in the Central Highland, 14 SPPs in the Southern Central and seven SPPs in the South of Vietnam. They comprise commercialized SPPs and SPPs that were approved for connection to the grid. The total installed capacity of these SPPs is 2335 MW, and most of them have an installed capacity of 50 MW or more; 28 out of 33 SPPs use poly-crystalline (multi-Si) solar modules, with an installed capacity of 1949 MW, accounting for 84% of the total installed capacity of the studied SPPs. Among the studied SPPs, only two have a tracking system installed. Information about the studied SPPs and applied technologies can be found in Table S1.

### *2.4. Inventory Data*

The land-use inventories are described in terms of land occupation and land transformation. Land occupation describes the delay of land recovery over time, and is measured in m<sup>2</sup>a [34], which means m<sup>2</sup> over 30 years in this study. Land transformation describes the change in land use, consequently causing changes in the ecosystem quality, and is measured in m<sup>2</sup> [34]. Land-use inventory data of the foreground process is based on the total land use by the SPPs, or the fenced area of the power plants. This land area includes the land occupied by the infrastructure within 30 years, and the land transformed from other land use purposes into areas for the plants' infrastructure, internal roads and green covers within the fenced area of the power plants.

The inventory data for background processes are extracted from the Ecoinvent 3.6 database [28]. The process of manufacturing PV panels is scaled from the data for 1 m<sup>2</sup> of panel to one piece of panel. The process of manufacturing the inverter is scaled down from the data for one piece of a 500 kW inverter to a 100 kW inverter. For the processes of manufacturing low-voltage and high-voltage transformers, the data are taken directly from Ecoinvent. For the process of manufacturing the medium-voltage transformer, the data are scaled from the average value of the high-voltage and low-voltage transformers. Transportation processes include sea transportation from the manufacturer sites (China) to the international ports of Vietnam, and road transportation from the international ports of Vietnam to the installation sites. Sea transportation uses the global average data for transporting dry goods by tankers. Road transportation uses the European average data for transporting goods by lorry of more than 32 metric tons according to the Euro 4 standard. The distances of transportation are assumed to be 3300 km for sea transportation and 200 km for road transportation. Data for a 3 kW<sub>p</sub> electrical installation are scaled up to a 50 MW<sub>p</sub> electrical installation. The inventory data for background processes are specified in Table 2.


**Table 2.** Inventory data for background processes. Reproduced from [28], Ecoinvent: 2019.

### **3. Results and Discussion**

### *3.1. Direct Land Use of PV in Vietnam*

The direct land use estimates range from 3.7 to 7 m<sup>2</sup> MWh<sup>−1</sup> year. After harmonization, the land use estimates range from 3.7 to 6.7 m<sup>2</sup> MWh<sup>−1</sup> year. The median value is 6.01 m<sup>2</sup> MWh<sup>−1</sup> year.

There is not much difference in the direct land-use requirement among the various silicon-based PV technologies, including mono-crystalline (mono-Si) panels, poly-crystalline (multi-Si) panels and panels with a tracking system. The harmonized land-use estimates for PV plants with mono-Si and multi-Si panels are around 5.6 m<sup>2</sup> MWh<sup>−1</sup> year. This may originate from the increasing module efficiency of multi-Si panels, which is becoming closer to that of mono-Si ones. The harmonized land-use estimates for PV plants with tracking systems are slightly higher than those without tracking systems, at 5.9 m<sup>2</sup> MWh<sup>−1</sup> year.

The average land use for the fenced area of the PV plants ranges from 7.18 to 8.26 m<sup>2</sup> MWh<sup>−1</sup> year, which is lower than the average land use of PV of 9.4 to 10.6 m<sup>2</sup> MWh<sup>−1</sup> year obtained by [23]. Table 3 presents the results obtained on the land-use estimates of the SPPs in Vietnam.


**Table 3.** Harmonized land use estimates of the solar PV plants (SPPs) in Vietnam.

<sup>1</sup> Not available.

### *3.2. Life-Cycle Land-Use Requirement*

The life-cycle land-use requirement for PV in Vietnam includes 241.85 m<sup>2</sup>a of land occupation and 16.17 m<sup>2</sup> of land transformation for one MWh of solar power over 30 years.

Most of the land area is required for the foreground process of the SPPs' infrastructure. The indirect land use for the background processes of manufacturing panels, inverters and other components of the SPPs' infrastructure and transporting them to the installation sites is negligible. The contribution of different processes to the life-cycle land-use requirement is specified in Figure 1. Since most of the panels and BOS used in SPPs in Vietnam are imported from China, the indirect land use from manufacturing and transporting these components does not affect the local land budget.

**Figure 1.** Life-cycle land-use requirement for SPPs in Vietnam by processes.

The land occupation of 241.85 m<sup>2</sup>a represents the area of land occupied by the solar infrastructure to generate one MWh of solar power within 30 years of operation. The land transformation of 16.17 m<sup>2</sup> represents the area of land that needs to be transformed from its previous state for other purposes, e.g., constructing the SPP, manufacturing panels and BOS, etc. As previously stated, land occupation delays recovery, whereas land transformation causes a change in ecosystem quality. The land-use impact assessment requires identifying the type of land use, the spatial extent, the temporal extent, and the geographical location [34]. At the same time, other key elements need to be considered, such as the function of the land, the ownership of the land (co-benefits of the land for different land use purposes), assumptions on future or alternative land use and land recovery capacity [35]. This paper focuses on the land-use requirement of PV; therefore, only land areas with specific types of land are presented (see Table 4). The land-use impact assessment is excluded from the scope of this paper.


**Table 4.** Life-cycle land occupation and transformation of PV in Vietnam.

### *3.3. Limitations and Future Research*

The water–food–energy nexus, which aims to secure the supply of these resources by strengthening synergies and reducing trade-offs among these sectors, is vital for moving towards sustainable development paths. The land-use estimation of PV based on a life-cycle perspective would help avoid unnecessary land-use exploitation at global or continental scale. The life-cycle perspective allows the matter to be investigated on a larger scale than the mere national perspective.

The study sets the system boundary from cradle to gate, while the end-of-life treatment of the PV has not been considered due to the limited availability of inventory data. The missing evaluation of the PV end-of-life treatments is a weak point of this study, as the waste treatment of panels and other components, whether by landfilling, incineration or recycling, would require a substantial area of land [36]. Different end-of-life management options could involve different life-cycle land-use values [36]. This opens up future research on inventory data for the end-of-life stage of PV.

The results of the study are replicable, which allows both the LCT approach and the quantitative inventory of life cycle land use for PV to be verified. Although the case study limits the assessment to the Vietnamese context (mainly for the use phase), the available datasets that describe the life-cycle inventory of the PV systems and all the devices needed for a PV system to deliver its function are valid globally, which makes it a representative and illustrative example of an LCA study on PV land use. Therefore, it can be used to support strategic land-use planning for PV in Vietnam as well as in other countries with economic and climatic characteristics similar to those of Vietnam, by providing an order of magnitude of the land-use impact associated with a significant increase in ground-mounted PV plant installation.

### **4. Conclusions**

The increasing demands on land for both agriculture and renewable energy development require a clever strategy of land allocation. In this paper, a life-cycle approach was applied to evaluate the land-use requirement for PV development in Vietnam. It is identified that the direct land use for PV is 3.7 to 6.7 m<sup>2</sup> MWh<sup>−1</sup> year. The total fenced area of an SPP would require 7.18 to 8.16 m<sup>2</sup> MWh<sup>−1</sup> year.

When the indirect land use is included, the life-cycle land-use requirement includes 241.85 m<sup>2</sup>a of land occupation and 16.17 m<sup>2</sup> of land transformation per MWh over 30 years of lifetime. Both life-cycle land occupation and transformation mainly come from the construction and operation processes, whereas the indirect land use of background processes is negligible.

Currently, the land budget for energy development in Vietnam is about 146.07 thousand ha, which is mainly used for large hydropower plants, thermal power plants, and the distribution and transmission network [37]. There is no available information on the land budget for developing PV in particular. However, considering that the land area required for PV is 8.04 m<sup>2</sup> MWh<sup>−1</sup> year, the land area needed for about 5000 MW<sub>p</sub> or 4800 GWh of PV by 2019 is about 3800 ha, accounting for 3% of the total land budget for energy development. As the government has no plan to scale back other types of power, the land for PV development in the future would have to be transformed from land used for other purposes. The competition in land use for alternative purposes would potentially limit the exploitation of PV in Vietnam in the near future.
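The land-area estimate in the preceding paragraph can be reproduced with a few lines of arithmetic. The Python sketch below uses only the figures quoted there (8.04 m<sup>2</sup> MWh<sup>−1</sup> year, 4800 GWh per year, and a 146.07 thousand ha energy land budget); the variable and function names are illustrative.

```python
M2_PER_HA = 10_000
MWH_PER_GWH = 1_000


def pv_land_area_ha(land_use_m2_per_mwh_year: float, generation_gwh_per_year: float) -> float:
    """Land area (ha) needed to sustain a given annual PV generation."""
    area_m2 = land_use_m2_per_mwh_year * generation_gwh_per_year * MWH_PER_GWH
    return area_m2 / M2_PER_HA


land_use = 8.04                  # m2 per MWh per year, from the text
generation = 4800                # GWh per year, from the text
energy_land_budget_ha = 146.07e3  # ha, from the text

area_ha = pv_land_area_ha(land_use, generation)
share = area_ha / energy_land_budget_ha
print(f"PV land area: {area_ha:.0f} ha ({share:.1%} of the energy land budget)")
```

The unrounded result is slightly under 3900 ha, or about 2.6% of the energy land budget, in line with the rounded figures of about 3800 ha and 3% given in the text.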

**Supplementary Materials:** The following are available online at https://www.mdpi.com/1996-1073/14/4/861/s1, Table S1: List of studied solar power plants.

**Author Contributions:** Conceptualization, E.R.S. and L.Q.L.; methodology, E.R.S., M.C. and L.Q.L.; resources, all authors; data curation, M.A.C., N.N.Q. and N.H.N.; writing (original draft preparation), all authors; writing (review and editing), all authors; supervision, E.R.S. and M.C. All authors equally contributed to the paper's development and writing. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Ministry of Science and Technology of Vietnam, as a subsidiary component of the research project "Design and installation of a grid connected micro grid 100 kWp photovoltaic system and study solutions for developing solar generation in Vietnam to 2030, taking into account greenhouse gas emission reduction". Code NDT.80.ITA/20.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors extend their acknowledgement to the Department of Engineering, University of Palermo, and the Institute of Energy Science, Vietnam Academy of Science and Technology, for their support during this study.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

### **References**


**Bala Bhavya Kausika \* and Wilfried G. J. H. M. van Sark \***

Copernicus Institute of Sustainable Development, Utrecht University, Princetonlaan 8A, 3584 CB Utrecht, The Netherlands

**\*** Correspondence: B.B.Kausika@uu.nl (B.B.K.); W.G.J.H.M.vanSark@uu.nl (W.G.J.H.M.v.S.); Tel.: +31-30-253-7611 (W.G.J.H.M.v.S.)

**Abstract:** Geographic information system (GIS) based tools have become popular for solar photovoltaic (PV) potential estimations, especially in urban areas. There are readily available tools for the mapping and estimation of solar irradiation that give results with the click of a button. Although these tools capture the complexities of the urban environment, they often miss the more important atmospheric parameters that determine the irradiation and potential estimations. Therefore, validation of these models is necessary for accurate potential energy yield and capacity estimations. This paper demonstrates the calibration and validation of the solar radiation model developed by Fu and Rich, employed within ArcGIS, with a focus on the input atmospheric parameters, diffusivity and transmissivity for the Netherlands. In addition, factors affecting the model's performance with respect to the resolution of the input data were studied. Data were calibrated using ground measurements from Royal Netherlands Meteorological Institute (KNMI) stations in the Netherlands and validated with the station data from Cabauw. The results show that the default model values of diffusivity and transmissivity lead to substantial underestimation or overestimation of solar insolation. In addition, this paper also shows that calibration can be performed at different time scales depending on the purpose and spatial resolution of the input data.

**Keywords:** photovoltaic solar potential; calibration; validation; ArcGIS solar radiation; Netherlands

**Citation:** Kausika, B.B.; van Sark, W.G.J.H.M. Calibration and Validation of ArcGIS Solar Radiation Tool for Photovoltaic Potential Determination in the Netherlands. *Energies* **2021**, *14*, 1865. https://doi.org/10.3390/en14071865

Academic Editor: Jesús Polo

Received: 26 February 2021; Accepted: 22 March 2021; Published: 27 March 2021

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

### **1. Introduction**

Geographic Information System (GIS) based solar photovoltaic (PV) tools have been developed and used increasingly in the past decade, as they provide a remote assessment of PV siting, planning, integration and management [1]. These tools have been gaining popularity within the public sector (general public, governments, etc.) and also the private sector (PV installers, network operators, etc.). With increasing interest in sustainable solar energy generation, the mapping of solar PV potential has been explored by many at local [2,3], municipal [4,5] and regional scales [6]. At a local scale, it is easy and insightful to assess individual buildings. This information, once generated, can be used for answering several questions regarding the planning and siting of solar PV or solar thermal systems and even in urban planning and policy evaluations [7,8].

Early methods for PV potential calculations used computational solar radiation models which were either top-down or could not capture complex roof tops or probable shading due to the surroundings [9,10]. Then, a combination of computational models and GIS methods emerged for improving the solar irradiance calculations and for the estimation of technical [6,11–13] and socio-economic potential [14]. GIS based algorithms, on the other hand, help in capturing the spatio-temporal variation of solar irradiation and, consequently, PV yields [15]. A number of solar irradiation and PV mapping tools that are currently available and use different methodologies for rooftop PV potential analyses have been reviewed [16–18]. These algorithms are driven by geographic data and atmospheric parameters specific to the particular area. Most of the GIS based methods are based on some form of geographic data, such as satellite images, digital elevation models (DEM) [10,14,17] or LiDAR data [19–22]. These methods use different assumptions and, hence, differ in their accuracy and performance. Usually, the most common assumption is that every point on the rooftop receives an equal amount of solar radiation, irrespective of the slope, orientation and shading factors. Such assumptions often lead to inaccuracies [23]. When it comes to preparing maps or creating PV potential tools, it is necessary that the tool is customized to suit the geographic area, as solar irradiation and its associated weather parameters change drastically depending on the location and time. Commonly used solar irradiance models have been reviewed and analyzed [9,10,18]. Out of the few existing raster-based models, the GRASS r.sun model developed by Šúri and Hofierka [24] and ESRI's Solar Radiation used in ArcGIS [25], developed by Fu and Rich [26], allow for integration of attributes that vary spatially over large regions. In addition, these models also account for shadows from surrounding buildings and trees, while allowing modeling over inclined surfaces, which is of specific interest in the urban landscape.

For solar irradiance calculations, GRASS r.sun uses a Linke turbidity factor and beam and diffuse radiation coefficients, which are obtained from a data bank and calculated by decomposing global radiation measurements from a nearby weather station [27]. ArcGIS's Solar Radiation tool, on the other hand, uses simplified models and offers an easily operable interface with high-resolution geospatial graphics. In the Solar Radiation tool, the sky transmissivity and diffusivity parameters used to calculate direct and diffuse insolation can be changed as a time series: for the whole year, every month, or within a day. Diffusivity ranges from zero to one, with typical values of 0.2–0.3 for clear sky conditions. Transmissivity also ranges from zero to one, with typical values of 0.5–0.7 for clear skies. Note that transmissivity and diffusivity are inversely related [28]. GRASS r.sun is open-source software, while ESRI's Solar Radiation is proprietary software.

The atmospheric parameters (Linke turbidity factor, clear-sky index, transmissivity, etc.) can have a significant impact on the calculated annual irradiation [22,29]. These atmospheric parameters are hard to model and customize for a particular location [24]. Using the tools without validating these variables can have a significant influence on the final results; therefore, using parameters closer to local insolation values reduces the variation in solar radiation estimation [20,30]. With the Solar Radiation tool in particular, model validation is necessary, since the actual values cannot be defined from atmospheric data prior to model implementation [10]. The Australian PV Institute's (APVI) Solar Potential Tool, developed by the University of New South Wales, uses the Solar Radiation model as the background [31]. Its developers used validation methods to estimate the accuracy of the APVI tool in comparison to measurements of the output AC power of PV systems and to NREL's System Advisory Model (SAM) [32]. The study also analyzed the accuracy of ArcGIS's Solar Radiation tool with respect to insolation on shaded and unshaded surfaces [33]. Copper and Bruce [31] stated that a linear correction can be applied to ArcGIS's estimates of insolation in order to achieve better fits with the results from SAM. However, it was observed that studies do not validate these models before using them, despite the influence this has on the results.

This paper, therefore, addresses the relevance and implementation of using calibrated values for diffusivity and transmissivity for estimation of global horizontal irradiation for varying spatial resolutions and geographic areas, using the Solar Radiation tool of ArcGIS, with particular focus on the Netherlands as a case study. We used the typical meteorological year data as well as the most recent 10 years of irradiance data for calibration purposes.

This paper is further organized as follows. In Section 2 the methods and data used are presented. Section 3 shows and discusses the results for the annual and monthly analysis of parameters with a validation case. Additionally, the model implemented for varying spatial resolutions is also presented. Section 4 concludes the paper.

### **2. Materials and Methods**

### *2.1. ArcGIS Solar Radiation Tool*

It is evident that solar irradiation varies with time: during a day, within a month and throughout the year. It also varies with the climatic conditions and the position of the sun. Therefore, the challenge for the model is to predict values that are as close as possible to reality. The tool is quite simple, requiring only a couple of atmospheric parameters. In the case of the Solar Radiation tool, it is hard to calibrate these atmospheric parameters of diffusivity and transmissivity before running the model. The Solar Radiation tool of ArcGIS's Spatial Analyst Toolbox calculates the solar radiation over a geographic area or for specified point (latitude–longitude) locations, based on the hemispherical viewshed algorithm explained in [34–36]. This tool takes location, elevation, slope, orientation and atmospheric transmission as the most relevant inputs. The total amount of radiation calculated for a given location is given as global radiation in the (energy) units of Wh/m<sup>2</sup>.

The variable parameters we discuss in this paper are atmospheric diffusivity and transmissivity [28], which denote the proportion of global normal radiation flux that is diffuse and the fraction of radiation that passes through the atmosphere (averaged over all wavelengths), respectively. These values, thus, range from 0 to 1. All the calculations were performed under clear sky conditions.

The Solar Radiation tool uses a diffusivity value of 0.3 and transmissivity value of 0.5 as the default settings and this is referred to as the default model throughout this paper. For calibration of the Solar Radiation tool, solar irradiation for all combinations of diffusivity (0.2–0.7) and transmissivity (0.3–0.7) parameters (modelled values) have been simulated. In the results, for the purpose of analysis, these values will be referred to as whole numbers preceded by D or T to denote diffusivity and transmissivity, respectively. For example, D3T5 refers to a diffusivity of 0.3 and transmissivity of 0.5.
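The naming scheme can be illustrated with a minimal Python sketch that enumerates the simulated parameter grid. A step of 0.1 between successive values is an assumption here (it is implied by the whole-number labels but not stated explicitly), and the names `dt_label`, `D_VALUES`, `T_VALUES` and `GRID` are hypothetical.

```python
from itertools import product


def dt_label(diffusivity, transmissivity):
    """Build a label such as 'D3T5' for diffusivity 0.3 and transmissivity 0.5."""
    return f"D{round(diffusivity * 10)}T{round(transmissivity * 10)}"


# Assumed step of 0.1 within the ranges given in the text.
D_VALUES = [d / 10 for d in range(2, 8)]  # 0.2, 0.3, ..., 0.7
T_VALUES = [t / 10 for t in range(3, 8)]  # 0.3, 0.4, ..., 0.7

# Map every label to its (diffusivity, transmissivity) pair.
GRID = {dt_label(d, t): (d, t) for d, t in product(D_VALUES, T_VALUES)}

print(len(GRID), GRID["D3T5"])
```

Under this assumption the grid holds 6 × 5 = 30 combinations, with D3T5 being the default model.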

### *2.2. Calibration Data*

A major source of meteorological data in the Netherlands is the Royal Netherlands Meteorological Institute (KNMI) [37]. This institute provides a wide range of meteorological products and manages 50 automatic ground-based weather stations across the country, of which 33 record the solar irradiance. Calibration of the atmospheric parameters was conducted using the measured values from the KNMI network. The KNMI station at De Bilt, in the Netherlands (52.10N, 5.18E), was chosen as a reference point for data calibration. Irradiation values obtained from the ground stations were mapped and interpolated to identify variations throughout the country for 10 years (2011–2020). The De Bilt station was selected out of the 33 stations that provide irradiation data, as this station is located in the center of the Netherlands and is commonly used as a reference point by KNMI for describing and forecasting the weather in the whole of the Netherlands. In fact, the change in irradiation from coast to mainland is not very prominent (about 10%) [38] and, therefore, a single station (at the center) can well be used as a reference when performing nationwide calculations. The model will be implemented for the area of De Bilt and meteorological data from that station will be used for atmospheric data calibration. The De Bilt values were chosen for calibration in order to see whether a model calibrated at this single station performs adequately for the whole country.

Out of the 33 stations which measure irradiance, 30 stations were selected due to interruptions in the data collection of 3 stations within the 10 years. The locations of these KNMI ground measurement stations and their classification as either coast or mainland used in this study are shown in Figure 1. Daily sums of measured irradiance from the ground stations were gathered and aggregated per month and per year. In addition, irradiation maps for the country were created using a simple inverse distance weighted interpolation technique with irradiation data obtained from these 30 KNMI stations. This provides an insight into the variation in irradiance within the country over the years at low resolution, which is sufficient for checking for anomalies related to localized weather conditions or instrumentation errors [39].
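The interpolation step can be sketched in a few lines of Python. This is an illustration of inverse distance weighting in general, not the implementation used for the maps: the power of 2, the planar coordinates and the station values below are assumptions chosen for the example, and the function name `idw` is hypothetical.

```python
import math


def idw(stations, query, power=2.0):
    """Inverse distance weighted estimate at `query`.

    stations: iterable of (x, y, value) tuples in a planar coordinate system.
    query:    (x, y) location to estimate.
    power:    distance exponent (2 is a common choice, assumed here).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(query[0] - x, query[1] - y)
        if dist == 0.0:
            return value  # the query point coincides with a station
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Illustrative values only (kWh/m2), not KNMI measurements.
example_stations = [(0.0, 0.0, 1000.0), (10.0, 0.0, 1100.0)]
print(idw(example_stations, (5.0, 0.0)))  # midpoint: both stations weigh equally
```

Closer stations dominate the estimate, so a single anomalous station shows up as a local bump in the map, which is what makes the technique useful for spotting localized errors.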

**Figure 1.** Royal Netherlands Meteorological Institute (KNMI) stations in the Netherlands. Stations are categorized as coast (blue dots) and mainland (red). The station in the center (black square) is the De Bilt KNMI Station, and the station in the red square is the Baseline Surface Radiation Network (BSRN) station Cabauw.

In addition to the KNMI stations, there is a Baseline Surface Radiation Network (BSRN) station at Cabauw in the Netherlands. This is one of the stations that provides radiation measurements as part of a worldwide network [40,41]. There are about 40 stations in this global network in different climatic zones. These data are of primary importance for the validation and evaluation of various satellite and model estimates of radiation parameters. The Netherlands falls under the temperate maritime climate zone and Cabauw (51.97N, 4.93E) is a BSRN station in the Netherlands, which adheres to the highest achievable data measurement standards. Therefore, data from this station were used to validate the calibrated model [42]. This station is about 30 km southwest of De Bilt (see Figure 1).

### *2.3. Input Data for the Model*

Since the Solar Radiation tool is GIS based, it requires inputs in terms of raster or vector data. In particular, the Area Solar Radiation tool requires a DEM as an input to model solar radiation over geographic areas. The DEM used as input in this study has a resolution of 50 cm and was obtained from Actueel Hoogtebestand Nederland (AHN) [43]. Additionally, DEMs of 5 m (AHN) and 30 m (Aster DEM) [44,45] were used for irradiance calculations to evaluate the effect of spatial resolution on the outputs generated. A vector dataset of the locations and attributes of the KNMI and BSRN stations was used to map the measured irradiance values. Spatial resolution is one of the key factors deciding the quality of the output, as can be observed from Figure 2. The higher the resolution, the greater the detail in the images. Therefore, the resolution should be chosen depending on the purpose of use. Modelling irradiation on the rooftops can be performed with 50 cm data, as can be clearly seen from Figure 2c. The slopes and orientations of the rooftops can also be calculated effectively at this resolution, which helps in potential estimations at the building level. With 5 m data, this is likely only possible at the neighborhood or block level. With 30 m data, regional or national level estimations are possible.

**Figure 2.** Example of varying spatial resolution of the digital elevation models; (**a**) 30 m (**b**) 5 m and (**c**) 50 cm. The white areas correspond to missing data.

### *2.4. Method*

The Solar Radiation model was implemented for calibrating the model parameters T and D. The model has the capability to predict the irradiance values for varying temporal resolutions; daily, monthly, annual average and also within a specified time period. In this paper, the values were calibrated for two cases of varying temporal resolutions; yearly (annual average) and monthly average since this gives better information for potential estimations. In addition to these two temporal scales, we evaluated the data at varying spatial resolutions. All the modelled values were validated against a reference set for the default case, modelled values calibrated per year and modelled values calibrated every month.

The Solar Radiation modeling tool is computationally intensive; the process can run from a few hours up to multiple days, depending on the inputs provided. In this particular tool, the simulation time increases exponentially with the resolution of the sky size and the raster input [3]. This also means that the higher the resolution of the input image, the greater the detail in the results and the longer the processing time.

ArcGIS uses Python as a scripting module to perform geographic data analysis, data conversion, data management, and for map automation [46]. Therefore, a customized Python script to run all permutations of atmospheric parameters of the model was incorporated to automatically run and iterate all the combinations of D and T values without manual intervention. The computed values of different permutations and combinations were then calibrated using measured values from the KNMI ground station in De Bilt. The best fit parameters of diffusivity and transmissivity were estimated for each month and year separately. The percentage difference (PD) between measured and modelled values was used to find the best fit values per month and per year (Equation (1)) [47].
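The structure of such a calibration loop can be sketched as follows. This is a minimal illustration rather than the script used in the study: `run_model` is a hypothetical callable standing in for one run of the Solar Radiation tool, and `toy_model` and the measured values are invented purely so that the example executes without ArcGIS.

```python
from itertools import product


def calibrate(measured, run_model, d_values, t_values):
    """Find, per period, the (D, T) pair with the lowest percentage difference.

    measured:  {period: measured GHI}
    run_model: hypothetical callable (d, t) -> {period: modelled GHI}
    Returns    {period: ((d, t), pd)}.
    """
    best = {}
    for d, t in product(d_values, t_values):
        modelled = run_model(d, t)  # one model run per parameter combination
        for period, ghi_meas in measured.items():
            pd = abs((ghi_meas - modelled[period]) / ghi_meas) * 100
            if period not in best or pd < best[period][1]:
                best[period] = ((d, t), pd)
    return best


def toy_model(d, t):
    """Stand-in for the GIS tool; the formula is arbitrary, not physical."""
    return {"Jan": 100 * (t + 0.1 * d), "Jul": 300 * (t + 0.1 * d)}


d_values = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
t_values = [0.3, 0.4, 0.5, 0.6, 0.7]
measured = {"Jan": 53.0, "Jul": 186.0}  # illustrative numbers only

for period, (pair, pd) in calibrate(measured, toy_model, d_values, t_values).items():
    print(period, pair, round(pd, 3))
```

Passing the model run in as a callable keeps the search logic independent of the GIS software, so the same loop can be exercised with a stand-in function before committing to long-running simulations.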

Data fitting is highly dependent on the purpose of use, and the spatial and temporal scales at which the result is needed. In this paper, we chose to find the best fit values of global horizontal irradiation (GHI) for one location (De Bilt) over 10 years, assuming that the calibrated values from this location can be used for the whole country. The default model values and the calibrated model values (*GHImod*) were then compared with the measurements from De Bilt (*GHImeas*) using percent differences (PD) and mean bias error (MBE). MBE is the statistical model performance indicator, representing the systematic error of the prediction model to under or over estimate. The percentage difference PD and MBE are defined as:

$$\text{PD} = \left| \frac{\text{GHI}_{\text{meas}} - \text{GHI}_{\text{mod}}}{\text{GHI}_{\text{meas}}} \right| \times 100 \tag{1}$$

$$\text{MBE} = \frac{1}{N} \sum (\text{GHI}_{\text{mod}} - \text{GHI}_{\text{meas}}) \tag{2}$$

with N referring to the number of measurements and the subscripts "meas" and "mod" corresponding to the irradiation values measured at KNMI De Bilt and obtained from the Solar Radiation model for all settings of D and T, respectively. Modelled data are calibrated per month and once a year. Analysis at a local scale to depict buildings was also performed on an area close to the Cabauw station and this was chosen for validating the method.
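Equations (1) and (2) translate directly into code. The sketch below is a plain Python rendering of the two definitions; the function names and the sample values are illustrative assumptions, not data from this study.

```python
def percentage_difference(ghi_meas, ghi_mod):
    """Equation (1): absolute percentage difference for one measurement."""
    return abs((ghi_meas - ghi_mod) / ghi_meas) * 100.0


def mean_bias_error(ghi_meas, ghi_mod):
    """Equation (2): mean of (modelled - measured) over N measurements.

    A negative result means the model underestimates on average.
    """
    if len(ghi_meas) != len(ghi_mod) or not ghi_meas:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(mod - meas for meas, mod in zip(ghi_meas, ghi_mod)) / len(ghi_meas)


# Illustrative values only (kWh/m2).
sample_meas = [100.0, 200.0]
sample_mod = [80.0, 190.0]
print([percentage_difference(m, p) for m, p in zip(sample_meas, sample_mod)])
print(mean_bias_error(sample_meas, sample_mod))
```

Because PD takes the absolute value, it reports only the size of the deviation; the sign of the MBE is what reveals whether the model under- or overestimates.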

### **3. Results and Discussion**

This section presents and discusses the results of the calibration and validation methods along with insights into the spatio-temporal variation of solar radiation within the Netherlands. In addition, the purpose of using a GIS based radiation model is presented.

### *3.1. Spatio-Temporal Variation of Solar Radiation in the Netherlands*

Solar irradiation depends on the geographic position and local climatic variations. The spatial and temporal variations in the global solar irradiation in the Netherlands for the years ranging from 2011 till 2020 are shown in Figure 3. The coastal region generally has a higher level of irradiation compared to the mainland. De Bilt, which is in the center of the country, falls in the median zone. Irradiation values from this station can, therefore, be taken as the average for the whole country.

**Figure 3.** Annual global horizontal irradiation in kWh/m<sup>2</sup> derived from KNMI stations across Netherlands for the years 2011–2020. Data have been interpolated to create a continuous irradiation map. The locations of the KNMI stations are also indicated as dots in the irradiation maps.

An overview of the ranges of values recorded at the 30 meteorological stations in the Netherlands is shown in Figure 4. The boxplots show the annual irradiation as recorded at the KNMI stations grouped as coast and mainland; 12 stations along the coast and 18 from the mainland (see Figure 1). It is clear that the coastal area has higher irradiation values compared to the mainland. It is worth mentioning that these values are larger than the 30-year average (983.41 kWh/m<sup>2</sup> measured between 1981–2010) used to characterize the Dutch climate [47]. Extremely high values have been recorded over the past three years. Table A1 in Appendix A shows the averaged irradiation values for the coast and mainland categories, collected for the 30 stations in the Netherlands.

**Figure 4.** The range of irradiation values for all 30 stations categorized as coast (west) and inland (located east of the coast) for 10 years. Extremely high values were observed in the last 3 years, with record highs above 1200 kWh/m<sup>2</sup> for a few stations on the coast. The east-to-west variation of irradiation in the Netherlands can also be inferred from the graph.

From Figure 4, it is also evident that the irradiation at a given location is not the same every year. Even though the spatial variation of irradiation is prominent, up to some 15% (Figure 3), we chose the De Bilt values for validation of the solar irradiance for the whole country, as this is the central location of the country.

### *3.2. Calibrated Values vs. Default Values*

All combinations of D and T for the 10 years have been modelled for the location of De Bilt. Table 1 shows the GHI values measured at the De Bilt station per month for the year 2020 and modelled values from the same location with the default settings and calibrated values (best combinations of D and T) and their corresponding percentage difference (PD). Note that the modelled values for different years are the same for every combination each month, except for leap years, as shown in Table A2 in Appendix A. This is because solar irradiation modelling has been performed on a single location (De Bilt station) with a constant DEM for all the years, assuming that there are no height variations throughout the 10 years. The locations of the ground measurement systems are also usually unchanged and are placed in fields with no obstructions. This clearly indicates that the model is very sensitive to the provided height information, which, in turn, can be used in a manner that is dependent on the purpose of the analysis.

From Table 1, it is clear that the default model substantially underestimates the GHI. On an annual basis, for the year 2020, the default model yields an annual sum of 891.12 kWh/m<sup>2</sup>, which is about 21% less than the measured value at De Bilt. Only for two months (June and July) are the percentage differences below 6%, while in the winter months, the differences are much larger. If these values are not adjusted, they might lead to error propagation when they are used in further PV potential estimations. Therefore, it is necessary to find the right combination of D and T parameters in order to achieve better fits and, in turn, better accuracy. Choosing the correct temporal resolution for irradiance estimations is also important for the final results. For example, when trying to look at the production profile for a single household, hourly irradiance calculations can be very useful, in particular, for optimization of self-consumption. On the other hand, if the purpose is creating an irradiance map for the whole country, then it is more useful to select a seasonal or yearly variation.


**Table 1.** Global horizontal irradiation (GHI) at De Bilt: measured values (*GHImeas*), results from the Solar Radiation default model D3T5 (*GHImod*) for the year 2020, and the corresponding percentage differences (PD).

The best combination of diffusivity D and transmissivity T values was studied for the Netherlands for every month and for a year as a whole at the De Bilt location. Best fit values for each month were determined by finding the lowest PD between *GHImeas* and *GHImod* (Equation (1)). The results for the best combination of D and T and the corresponding error ranges for monthly fits are shown in Figure 5a,b and Figure 6a, respectively.

**Figure 5.** (**a**) Best fit D and T values for monthly calibrations over 10 years. The inverse relationship between D and T values is observed here, (**b**) Calibrated diffusivity (D) and transmissivity (T) combinations for 2011–2020. Although certain combinations are repeated, it is hard to find a pattern with these reoccurring combinations.

**Figure 6.** (**a**) Range of PD for the default model and the calibrated model for all the 10 years and (**b**) Scatterplot of default and best fit (calibrated values) per month and year vs. the measured values from de Bilt for 2020.

The difference in PD between the default and the calibrated model is large (Figure 6a). The PD for the calibrated model is well below 7% for most of the fits. As with the default model, the highest PD values were observed for the winter months. The most frequently repeated D and T values (four times in 10 years) are also from the winter months. The variation of best fit D and T values is shown separately for the 10 years in Figure 5a. Figure 6b shows the fits achieved by calibrating the model using the monthly and yearly fits, in comparison with the default model, and makes evident how much error can be reduced by using calibrated values. The MBE for the default model for 2020, as shown in Figure 6b, is negative, which means that the model underestimates the GHI. Furthermore, analyzing the MBE values for all 10 years revealed that the default model is consistently biased: for every year under review, it underestimated the GHI.

Calibrating the values using only one annual DT combination resulted in higher PD values than fitting the data using DT combinations optimized per month, as shown in Table 1. Modelled values obtained by using one DT combination per year underestimate the irradiance for winter months and overestimate the irradiance for summer months. These errors partly cancel, so over a year the cumulative irradiation values are closer to the reference values. However, the monthly fits are much better when looking at finer temporal scales. On the other hand, if we are looking at lower spatial resolutions (district or country level), yearly fitting could suffice. This is because detailed information would be masked, as the DEM input would be coarse (resolution of about 15 m–30 m or larger), which is not enough to distinguish between individual buildings.

To a large extent, yearly fits also reduce the error as compared to the default model, as shown in Table 2. Figure 7 plots the calibrated values of D and T when using one value for the whole year. It can be seen that certain years (2015, 2018–2020) with high levels of radiation have low diffusion and high transmission (D2T6), and low radiation years (2012 and 2013) have high diffusion and low transmission (D6T4), similar to what has been published recently [48]. The rest of the years have a median combination of diffusion and transmission (D4T5). Therefore, on the basis of the trend from these data and the lookup table (Table A2), it is feasible to predict the DT values for running the model, without the need to run simulations to recalibrate the model for annual estimations.


**Table 2.** Best fit DT values on an annual basis and the corresponding PD

**Figure 7.** Graph with best fit D and T values plotted for the years 2011–2020.

### *3.3. Validation of the Calibrated Values*

The calibrated values for the year 2020 were used to model the irradiation for a built-up area close to Cabauw. The results of the default model and of the calibrated models are shown in Figure 8. Although the underestimation in the default model is evident, it still captures the surroundings efficiently. The relationship of the default values to the calibrated year values is linear. For the case of the default model, building classification in terms of suitability and delineation of suitable areas on the rooftop can still be done on the basis of the regional min–max values of modelled solar irradiation. On the other hand, calibrated values provide more possibilities in terms of potential estimations. Therefore, potential area estimations can still be made when using the default model without calibration, as long as the irradiation values are not directly used to estimate the power production or capacity. This is especially valid for high resolution analyses. During the validation of images, high values were observed (see Figure 8), especially on south facing roofs, for the calibrated models. This could be due to the fact that the model was calibrated using data from one point (the KNMI meteorological station at De Bilt).

**Figure 8.** Modelled irradiation for a geographic area with default model (D3T5) and calibrated models.

The complexity involved in calibrating the ArcGIS model refers to the fact that one measured value is used for a whole geographic area, be it measurements from the closest ground station or a central location. In addition, the only atmospheric parameters which can be changed are the D and T. This means that for high resolution rooftop analyses, even the calibrated values may sometimes fall short. An example is shown in Figure 9, where the irradiation profiles from different roof types are presented. Figure 9a shows the DEM of a small selection from the area used for validation purposes along with the locations selected for creating the radiation profiles. Small areas on the rooftops with different orientations were selected; blue for north, red for south, pink for east, orange for west and green for flat. All these locations are highlighted in the figure. Figure 9b shows the corresponding ranges of irradiation values for each image created by the default and calibrated models in boxplots and the mean values of the selected roof areas, plotted as lines.

**Figure 9.** (**a**) Colorized digital elevation models (DEM) with selected areas on different roof orientations and slopes. (**b**) box plot of irradiation values in the images for the default and calibrated models for 2020 with mean lines from the selected areas of different roof types.

The measured value at Cabauw is depicted as a black line at 1155 kWh/m<sup>2</sup> (for 2020). This value is close to the first quartile for the monthly calibrated model, the median for the yearly calibrated model and the third quartile for the default model. In this scenario, where the calibrated model is used to model irradiation over images, or rather larger geographic areas, instead of point locations, one DT fit per year can be seen to perform better. In all three cases, east–west facing roofs have irradiation values closer to the first quartile. Flat roofs have a value that is larger than the median, but only for the calibrated models; this value is also larger than the measured irradiation. South and north facing roofs are closer to the maximum and the minimum values in the region and are significantly higher or lower than the measured values. The south facing and flat roof values from the default model are closer to the measured values, while the calibrated models overestimate the irradiation values. This suggests that the default model performs adequately when used for annual calculations and that it has a linear relation with the fitted models.

### *3.4. Irradiation Modelling with Varying Spatial Resolution*

The purpose of using ArcGIS is to be able to analyze solar irradiation based on location. Locations can vary from a point (latitude–longitude), a particular building, a street, a neighborhood or even a country. As mentioned earlier, the scale and purpose are important in selecting the required spatial resolution. Figure 10 shows the effect of spatial resolution in modelling solar radiation and makes evident which types of analysis are possible with the resulting images. The very high resolution of 50 cm is well suited to bottom-up analyses in urban applications of suitability modelling or power production and capacity estimations. On the other hand, 5 m, for example, can be used for modelling parking areas or fields or even for providing a general suitability classification of neighborhoods. Low resolution images can be useful at a regional or national level for very broad or generalized figures. It should also be noted that the processing time is related to the input resolution. For this study area of about 1 km<sup>2</sup>, the processing time recorded while running the default model was 1 min 12 s, 6 min 22 s, and 10 min 7 s for 30 m, 5 m and 50 cm, respectively. It was executed on a Windows machine with an Intel i5 processor with four cores and 8 GB of RAM. The processing becomes more complex and the processing time increases when smaller time intervals, higher resolutions and larger geographic areas are used.
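To give a sense of how the raster size grows with resolution, the following sketch counts the cells needed to cover the study area at each of the three resolutions. It assumes a square 1 km × 1 km extent, which is a simplification of "about 1 km<sup>2</sup>"; the function name `cell_count` is hypothetical.

```python
import math


def cell_count(area_km2, resolution_m):
    """Number of raster cells covering a square area at a given cell size."""
    side_m = math.sqrt(area_km2) * 1000.0           # side length of the square, in metres
    cells_per_side = math.ceil(side_m / resolution_m)  # partial cells still need a full cell
    return cells_per_side ** 2


for resolution in (30, 5, 0.5):
    print(f"{resolution} m: {cell_count(1, resolution):,} cells")
```

Under this assumption, the 50 cm raster contains 100 times as many cells as the 5 m raster, which illustrates why the choice of resolution should follow from the purpose of the analysis.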

**Figure 10.** Solar Radiation with varying spatial resolution run with the default model in ArcGIS.

### **4. Conclusions**

This paper shows the importance of using validated values of transmissivity and diffusivity for performing irradiation analysis using the ArcGIS Solar Analyst Tool. The analysis shows that there is not one unique combination of D and T values that can be used as a constant for monthly fits; this also means that, for the prediction of solar irradiation for the future, other modelling methods, such as r.sun, are also preferable in terms of control of various atmospheric parameters. However, the Solar Radiation Tool is very simplistic (easy to execute with a minimum number of atmospheric parameters required) and at the same time, it can provide a detailed overview of shading or the effect of orientations and slopes when using high resolution data.

DT combinations are highly dependent on climatic conditions and calibrated values should be used depending on the purpose and scale. Calibrating this model is relatively easy when one has access to measured radiation values and can improve the potential calculations by at least 10–20%, depending on time scales used in the analysis. It was also observed that the monthly variation of the combinations leads to higher accuracy results, which is very useful when modelling energy profiles for households or even for generating accurate potential information which is closer to reality. When looking at lower temporal scales (yearly) one DT combination will suffice.

When the model is used to predict the annual irradiation, a direct relation could be made with the measured values and, therefore, standardized values can be used, as demonstrated. However, it must be noted that we assume that one single location (De Bilt) is sufficient for calibrating the model. Hence, these values are reliable when using similar data and settings as those used in this study and, therefore, are reproducible and reusable. Better fits can be achieved when the model is calibrated using data from the closest ground measurement station, no matter which resolution or temporal scale is used.

Finally, the spatial and temporal resolutions play an important role in this model, as they are directly related to the accuracy of the model, the level of detail and the processing time. We demonstrated the use of ArcGIS in mapping the PV potential, with optimized and validated D and T values. While the method was applied to the Netherlands, it can be successfully applied to other regions. We finally recommend validating the ArcGIS model with local irradiation data before it is used for modeling/mapping purposes, if the values are to be used directly for potential estimations. This information can prove to be useful, especially in driving data dependent policies for PV penetration in order to encourage sustainable energy deployment.

**Author Contributions:** Conceptualization, B.B.K. and W.G.J.H.M.v.S.; methodology, B.B.K.; formal analysis, B.B.K.; writing—original draft preparation, B.B.K.; writing—review and editing, B.B.K. and W.G.J.H.M.v.S.; visualization, B.B.K. and W.G.J.H.M.v.S.; supervision, W.G.J.H.M.v.S.; funding acquisition, W.G.J.H.M.v.S. Both authors have read and agreed to the published version of the manuscript.

**Funding:** This research is partly financially supported by the Netherlands Enterprise Agency (RVO) within the framework of the Dutch Topsector Energy (project Advances Solar Management—1, ASM-1, and Advanced Scenario Management—2, ASM-2).

**Data Availability Statement:** Data is contained within the article. The data presented in this study are available from the links presented in the references mentioned in Sections 2.2 and 2.3.

**Acknowledgments:** The authors gratefully acknowledge Jessie Copper, UNSW Australia for the initial help in setting up the automation in ArcPy and for invaluable information on her research.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

### **Appendix A**

**Table A1.** Spatio-temporal variation of measured annual irradiation (kWh/m<sup>2</sup> ) and its standard deviation (std) in the Netherlands, comparing coast, mainland and the central De Bilt location. The coast column contains averaged irradiation values of 12 stations (blue dots in Figure 1) collected over 10 years. Similarly, the mainland irradiation values were obtained from 18 stations away from the coast (red dots in Figure 1).


<sup>1</sup> Averaged solar radiation from 1981–2010 collected from different KNMI stations [49].


**Table A2.** Monthly modelled irradiation values for all combinations of D and T at De Bilt using the Solar Radiation tool.

### **References**


### *Article* **Remote Sensing for Monitoring Photovoltaic Solar Plants in Brazil Using Deep Semantic Segmentation**

**Marcus Vinícius Coelho Vieira da Costa 1,2, Osmar Luiz Ferreira de Carvalho <sup>3</sup> , Alex Gois Orlandi 1,2 , Issao Hirata <sup>1</sup> , Anesmar Olino de Albuquerque <sup>2</sup> , Felipe Vilarinho e Silva <sup>1</sup> , Renato Fontes Guimarães <sup>2</sup> , Roberto Arnaldo Trancoso Gomes <sup>2</sup> and Osmar Abílio de Carvalho Júnior 2,\***



**Citation:** Costa, M.V.C.V.d.; Carvalho, O.L.F.d.; Orlandi, A.G.; Hirata, I.; Albuquerque, A.O.d.; Silva, F.V.e.; Guimarães, R.F.; Gomes, R.A.T.; Júnior, O.A.d.C. Remote Sensing for Monitoring Photovoltaic Solar Plants in Brazil Using Deep Semantic Segmentation. *Energies* **2021**, *14*, 2960. https://doi.org/10.3390/en14102960

Academic Editors: Benedetto Nastasi and Jesús Polo

Received: 16 March 2021 Accepted: 12 May 2021 Published: 20 May 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

**Abstract:** Brazil is a tropical country with continental dimensions and abundant solar resources that are still underutilized. However, solar energy is one of the most promising renewable sources in the country. The proper inspection of Photovoltaic (PV) solar plants is an issue of great interest for the Brazilian territory's energy management agency, and advances in computer vision and deep learning allow automatic, periodic, and low-cost monitoring. The present research aims to identify PV solar plants in Brazil using semantic segmentation and a mosaicking approach for large image classification. We compared four architectures (U-net, DeepLabv3+, Pyramid Scene Parsing Network, and Feature Pyramid Network) with four backbones (Efficient-net-b0, Efficient-net-b7, ResNet-50, and ResNet-101). For mosaicking, we evaluated a sliding window with overlapping pixels using different stride values (8, 16, 32, 64, 128, and 256). We found that: (1) the models presented similar results, suggesting that, in many scenarios, acquiring high-quality labels matters more than the choice of model; (2) U-net presented slightly better metrics, and the best configuration was U-net with the Efficient-net-b7 encoder (98% overall accuracy, 91% IoU, and 95% F-score); (3) mosaicking results (precision-recall and receiver operating characteristic area under the curve) progressively improve as the stride value decreases, at a higher computational cost. The rapid growth of solar energy in Brazil requires rapid mapping, and the proposed study provides a promising approach.

**Keywords:** solar panel; deep learning; semantic segmentation

### **1. Introduction**

Solar energy is one of the most promising renewable energy sources, being crucial for sustainable development in places with intense sunlight. Several studies have shown that solar energy systems allow for economic and efficiency gains, driven by technological and productive development that enables cost reduction to overcome technical barriers [1,2]. According to Sampaio and Gonçalez [3], the main advantages of solar energy systems are reliability, low costs of operation and servicing, low maintenance, a free energy source, clean energy, high availability, generation closer to the consumer, a low environmental impact, potential to mitigate greenhouse gas emissions, and noiselessness. In contrast, the main disadvantages are a high initial cost, large installation area, high dependence on technology development, and climatic conditions (solar irradiation). The benefits of solar technology provided an exponential increase in installed solar energy capacity between 1992 and 2020 [4,5]. This detected growth of solar energy was not foreseen in previous scenarios of the Intergovernmental Panel on Climate Change's fifth assessment report [6].

Creutzig et al. [7] considered that the cause of underestimating the potential of solar energy was rapid technological learning and political support only for specific technologies. In 2019, China led in Photovoltaic (PV) solar energy capacity, followed by the European Union and the United States of America, which together hold more than 64% of the world's total capacity.

Brazil offers good prospects for net-zero carbon energy due to its abundance of renewable energies: hydropower, bioenergy, wind, and solar [8]. Hydroelectric power is the primary generator of electric energy in Brazil. However, thermal energy is still needed to supply domestic demand in periods of prolonged drought [9–12]. Therefore, the challenge is to increase renewable energy production to supply the growing energy demand due to population growth and new technologies. One problem is that hydroelectric expansion prospects are in the Amazon region, with substantial environmental restrictions, such as extensive areas of flooding by dam reservoirs, methane emissions, and ecological changes [13–16]. In addition, climate change scenarios in Brazil for the 2030s and 2080s predict a decrease in rainfall and an increase in temperature, resulting in a reduction in hydroelectric production and an increase in solar (slight) and wind (significant) energy potential [17]. Thus, national progress needs to intensify alternative energy sources such as combining wind and solar sources [18].

The Brazilian territory has a high solar incidence availability, with a vast area close to the equator and without significant variations in the day's solar duration [19]. The semi-arid region has the most significant aptitude for installing solar power plants [20–25]. Between the two solar energy generation technologies, the Brazilian government has initially prioritized PV instead of concentrated solar power [26]. In 1995, the Hydroelectric Company of San Francisco developed the first PV system connected to Brazil's grid in Recife [27]. The crisis in the Brazilian electric sector between the years 2013 and 2015 favored the decentralization and diversification of the electric matrix sources. Therefore, since 2014, Brazil's solar energy has experienced a substantial expansion, with the first projects for PV Plants being contracted by way of public auctions. In the second half of 2015, solar energy production and distributed generation marked an inflection of growth driven by regulations and adoption of incentive changes [28]. Despite the various barriers to the development of solar energy (technological, economic, sociocultural, managerial, environmental, and political) [29–34], the current strong growth in PV energy brings optimistic perspectives for the electricity sector. Barbosa et al. [35] demonstrate, by modeling, that PV solar energy in Brazil will reach more than 36% of total electricity in 2050. This rapid expansion is mainly due to technology development, reducing investment costs, increasing the PV Panels capacity, and other enterprise cost reductions [36]. Furthermore, energy security policies and the eco-label design for improving air quality, by reducing greenhouse gas emissions, also contribute to solar energy growth [37].

Moreover, in developing countries such as Brazil, PV solar plants are vital for ensuring energy security. Thus, inspecting solar plant constructions is important in order to carry out effective public policies. In Brazil, the Brazilian Electricity Regulatory Agency (ANEEL) is responsible for regulating the installed capacity expansion and monitoring the power plant construction progress [38]. However, the inspection is manual and will increase in complexity over time, requiring laborious work by skilled professionals and high costs for fieldwork and technical analysis. According to the ANEEL database, the expected growth of PV solar plant energy is considerable. For 2021, 32 new ventures are expected, and more than 140 for 2022. Furthermore, sustainable energy sources (e.g., wind and solar) tend to have many ventures with low energy production, which increases the number of processes to evaluate and urgently requires automatic processes.

Remote sensing data (aerial photography and satellite imagery) enable inspection periodically, and have been widely used in the electrical sector for effective maintenance of electrical lines [39–41], thermal monitoring from nuclear power plants [42–45], environmental changes from hydroelectric dams [46–49], and energy consumption using nighttime light satellite imagery [50–52], among others. In solar energy, many studies use remote sensing images, such as solar energy estimates [53–56], solar power plant site selection [57–62], PV potential on building rooftops [63–66], and area estimation [67,68].

In automatic detection, Deep Learning (DL) emerges as a powerful method, especially in regards to computer vision problems using convolutional neural networks (CNN), due to its ability to process multi-dimensional arrays [69] with wide remote sensing applications [70–75]. Several reviews were carried out on the different DL methods, in which object detection, semantic segmentation, and instance segmentation were the most common approaches [76–78]. The method choice is highly dependent on the task objectives. When the main goal is to make a pixel-wise classification (as is the case with PV solar plants), semantic segmentation is a great alternative [79,80].

Previous studies in PV solar panel detection have shown promising results using the DL method, presenting very high accuracy. However, most studies consider urban PV panels using aerial or high-resolution satellite images [81–83], while PV solar plant mapping is still restricted [84]. This approach is an effective alternative to construction inspection, requiring periodic data and free satellite imagery. Previous studies on PV panel detection have not yet shown reasonable solutions for classifying large regions, and the use of mosaicking with sliding windows is a promising solution [85–87].

The primary motivation for this study is the development of a methodology based on remote sensing for the automatic monitoring of new installations of PV solar plants. In Brazil, the high growth of solar energy throughout the territory, with a continental dimension, prevents on-site inspection due to the financial and time cost, requiring the development of technological alternatives. Therefore, this research aims to evaluate the use of DL methods, representing the state of the art of computer vision, to identify and monitor PV solar power plants from ANEEL's database using Sentinel-2 images. This methodology represents an innovation for the management and monitoring of installed solar energy structures on the Brazilian territory, and similar research does not exist in the country to date.

### **2. Materials and Methods**

The present research had the following methodological steps (Figure 1): (2.1) data preparation; (2.2) DL models; (2.3) DL accuracy analysis; (2.4) mosaicking; and (2.5) mosaicking accuracy analysis.

**Figure 1.** Methodological flowchart.

### *2.1. Data Preparation*

### 2.1.1. Study Area

Brazil has a large and diverse territory, presenting different solar energy incidence [21]. Nevertheless, many areas are extremely suitable for the installation of PV panels. Therefore, we selected 24 areas to conduct this experiment (Figure 2). There are limited PV plants installed in the Brazilian territory and currently no open datasets considering Sentinel-2 data [88]. However, the development of methodologies and expansion of databases is a fundamental strategy for monitoring large-scale PV with a high growth trend.

**Figure 2.** Study Area.

### 2.1.2. Image Acquisition and Annotations

We obtained Sentinel-2 cloudless images with four channels (Red, Green, Blue, and near infra-red) for each region containing PV solar power plants. For each image, a specialist manually annotated ground truth (GT) masks considering two classes: background and PV solar plant. The background class presents a wide variety of spectral behaviors, including the different soil and vegetation compositions present in a large-scale country such as Brazil. The research considered the difference in the light incidence and the construction of panels in each region for DL model training.

### 2.1.3. Data Split

After preparing each tile with its respective annotations, we separated the dataset into training, validation, and testing sets. For each area of interest, which may contain more than one PV solar plant, we cropped at least seven 256 × 256-pixel tiles. Table 1 lists the distribution of areas and images for training, validation, and testing.

**Table 1.** Data split in training, validation, and test sets.


### *2.2. DL Models*

### 2.2.1. Architectures and Backbones

Semantic segmentation allows for a pixel-wise classification, being highly suitable for many remote sensing applications [74]. Most semantic segmentation networks include an encoder/decoder structure. The encoder aims to extract features, whereas the decoder restores the image's original dimensions. In the last few years, many architectures (e.g., U-net [89], SegNet [90], Feature Pyramid Network (FPN) [91], DeepLab [92,93], and Pyramid Scene Parsing Network (PSPNet) [94]) and backbones (e.g., ResNet [95], ResNeXt [96], and Efficient-net [97]) were proposed to increase performance in this task. This study evaluated four commonly used architectures (U-net, DeepLabv3+, FPN, and PSPNet) and four backbones (ResNet-50 (R-50), ResNet-101 (R-101), Efficient-net-b0 (Eff-b0), and Efficient-net-b7 (Eff-b7)). We used models from the Semantic Segmentation repository [98], which provides different architectures and backbones in PyTorch.

### 2.2.2. Model Configurations

In addition to choosing the appropriate models, it is crucial to make fine adjustments for the task at hand. The first problem is the reduced number of available samples. Therefore, in addition to obtaining at least seven frames from each location, we applied two augmentations in the training process: random horizontal flip and random vertical flip (both with a probability of 0.5). The second problem is class distribution (there are many more background pixels than solar panel pixels). Thus, we used a loss function that minimizes this effect, the Dice Loss:

$$\text{Dice Loss} = \frac{2 \times |\text{pred} \cap \text{GT}|}{|\text{pred}| + |\text{GT}|},\tag{1}$$

in which pred is the DL prediction, and GT is the ground truth mask. In addition, we used transfer learning with ImageNet [99] pre-trained weights for faster convergence; to avoid overfitting, we applied callbacks, saving the model with the lowest Dice Loss in the validation set. Regarding hyperparameters, we used: (a) 300 epochs; (b) Adam optimizer; (c) 5 × 10<sup>−3</sup> learning rate (lr); and (d) batch size of 5.
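As an illustration of why an overlap-based loss is less sensitive to class imbalance than pixel accuracy, the following minimal sketch computes the overlap ratio of Equation (1) on flattened masks. It is not the training code used in this study: the function names, the `eps` smoothing term, and the choice of minimizing one minus the overlap ratio (a common convention) are assumptions made for the example.

```python
def dice_coefficient(pred, gt, eps=1e-7):
    """Soft overlap ratio 2*|pred ∩ GT| / (|pred| + |GT|) on flat sequences.

    pred holds probabilities in [0, 1]; gt holds 0/1 labels.
    eps is an assumed smoothing term that avoids division by zero
    when both masks are empty.
    """
    intersection = sum(p * g for p, g in zip(pred, gt))
    return (2.0 * intersection + eps) / (sum(pred) + sum(gt) + eps)


def dice_loss(pred, gt):
    # Assumption: the quantity being minimized is 1 - overlap ratio,
    # so a perfect prediction gives a loss of 0.
    return 1.0 - dice_coefficient(pred, gt)


# One panel pixel among 100: predicting "all background" is 99% accurate,
# yet the overlap-based loss stays close to its maximum of 1.
gt_mask = [1] + [0] * 99
all_background = [0.0] * 100
imbalanced_loss = dice_loss(all_background, gt_mask)
```

In this toy case the all-background prediction would score 99% pixel accuracy while the loss remains near 1, which is the behavior that motivates using such a loss when panel pixels are rare.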

### *2.3. DL Accuracy Analysis*

Accuracy analysis is a fundamental step for DL model evaluation. Since semantic segmentation models provide a pixel-wise mask, the metrics compare the predicted mask and the GT mask through confusion matrix metrics. The confusion matrix (Table 2) has four quadrants in binary tasks: True Negatives (TN), True Positives (TP), False Positives (FP), and False Negatives (FN).

**Table 2.** Confusion matrix.


The model outputs probabilities, whereas the GT masks contain integer labels. Thus, it was necessary to establish a cutoff point for the threshold metrics. A stricter threshold tends to reduce commission errors, while a more permissive threshold tends to reduce omission errors. We therefore applied a commonly used intermediate threshold of 0.5 for three metrics (overall accuracy, IoU, and F-score):

$$\text{Overall Accuracy} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{TN} + \text{FP} + \text{FN}},\tag{2}$$

$$\text{IoU} = \frac{\text{TP}}{\text{TP} + \text{FP} + \text{FN}},\tag{3}$$

$$\text{F-score} = \frac{\text{TP}}{\text{TP} + \frac{1}{2}(\text{FP} + \text{FN})}.\tag{4}$$
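The three threshold metrics can be sketched as follows on flattened probability and GT sequences. This is an illustrative implementation only: the function names are hypothetical, and treating a probability exactly equal to the threshold as a positive is an assumption.

```python
def confusion_counts(probs, gt, threshold=0.5):
    """Count TP, TN, FP, FN after binarizing probabilities at `threshold`."""
    tp = tn = fp = fn = 0
    for p, g in zip(probs, gt):
        predicted = 1 if p >= threshold else 0  # assumption: ties count as positive
        if predicted == 1 and g == 1:
            tp += 1
        elif predicted == 0 and g == 0:
            tn += 1
        elif predicted == 1 and g == 0:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def segmentation_metrics(probs, gt, threshold=0.5):
    """Return overall accuracy, IoU, and F-score following Equations (2)-(4)."""
    tp, tn, fp, fn = confusion_counts(probs, gt, threshold)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against masks with no positive pixels in either prediction or GT.
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    f_score = tp / (tp + 0.5 * (fp + fn)) if (tp + fp + fn) else 0.0
    return accuracy, iou, f_score


accuracy, iou, f_score = segmentation_metrics(
    [0.9, 0.8, 0.4, 0.2, 0.6, 0.1], [1, 1, 1, 0, 0, 0]
)
```

Because TN does not appear in the IoU or F-score, these two metrics are unaffected by the large number of correctly classified background pixels.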

### *2.4. Mosaicking*

The 256 × 256 pixel tiles used in training may not represent an entire scene, requiring a postprocessing stage. Mosaicking using a sliding window algorithm is a very promising solution. However, combining frames side by side to reconstruct a scene may also induce errors in the single frame edges. A way to minimize this effect is to apply a sliding window with overlapping pixels, where the final pixel will be the average from the overlapped pixels. Thus, we compared six different stride values for the mosaicking strategy: 8, 16, 32, 64, 128, and 256 (adjacent frames). Figure 3 shows four images with consecutive frames using different stride values. The smaller the stride value, the more overlapping pixels (which tends to reduce errors in the frame edges).
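The overlap-averaging idea can be sketched in a few lines. The example below is a simplified illustration rather than the implementation used in this study: `predict_tile` is a hypothetical stand-in for the trained model, images are plain nested lists, and the image dimensions minus the tile size are assumed to be multiples of the stride so that the windows cover every pixel.

```python
def mosaic_predict(image, predict_tile, tile=256, stride=128):
    """Average overlapping tile predictions over a large single-band image.

    image: 2D list (rows x columns) of pixel values.
    predict_tile: hypothetical callable mapping a tile x tile window to a
        tile x tile grid of probabilities (stands in for the DL model).
    """
    height, width = len(image), len(image[0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]

    for top in range(0, height - tile + 1, stride):
        for left in range(0, width - tile + 1, stride):
            window = [row[left:left + tile] for row in image[top:top + tile]]
            probs = predict_tile(window)
            # Accumulate this window's prediction and record the visit.
            for i in range(tile):
                for j in range(tile):
                    total[top + i][left + j] += probs[i][j]
                    count[top + i][left + j] += 1

    # Final value = mean of all predictions that covered the pixel;
    # pixels never covered (only possible if the assumption above fails) get 0.
    return [
        [total[r][c] / count[r][c] if count[r][c] else 0.0 for c in range(width)]
        for r in range(height)
    ]


# Toy check: a 4 x 6 image, 4-pixel tiles, stride 2. The fake "model" fills
# each tile with the value of its top-left pixel (the window's column offset).
toy_image = [[float(col) for col in range(6)] for _ in range(4)]
fake_model = lambda w: [[w[0][0]] * len(w[0]) for _ in w]
toy_result = mosaic_predict(toy_image, fake_model, tile=4, stride=2)
```

In the toy case, the two central columns are covered by both windows and receive the mean of the two predictions, while the outer columns are covered once.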

**Figure 3.** Four examples of different stride values of two consecutive frames in ascending order, where the stride value a < b < c < d.

### *2.5. Mosaicking Accuracy Analysis*

To evaluate the mosaicking, we analyzed the ranking metrics Receiver Operating Characteristic Area Under the Curve (ROC AUC) and Precision-Recall (PR) AUC, considering six stride values: 8, 16, 32, 64, 128, and 256. The ROC curve considers the true positive rate (TP/(TP + FN)) and false positive rate (FP/(TN + FP)), and the PR curve considers the precision (TP/(TP + FP)) and recall (TP/(TP + FN)). From the points generated, it is possible to calculate the area under these curves.
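A minimal sketch of how both areas can be obtained from pixel scores is given below. It sweeps the decision threshold over the sorted scores and integrates with the trapezoidal rule. The function names, the trapezoidal integration of the PR curve, and the starting PR point (recall 0, precision 1) are assumptions of this example, since several conventions exist. Both classes are assumed to be present.

```python
def curve_points(scores, labels):
    """Return (fpr, tpr, precision, recall) lists, one point per distinct score."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    # Starting points: nothing predicted positive yet.
    fpr, tpr, precision, recall = [0.0], [0.0], [1.0], [0.0]
    tp = fp = 0
    for index, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last sample sharing this score (ties).
        if index + 1 == len(ranked) or ranked[index + 1][0] != score:
            tpr.append(tp / positives)
            fpr.append(fp / negatives)
            recall.append(tp / positives)
            precision.append(tp / (tp + fp))
    return fpr, tpr, precision, recall


def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i])."""
    return sum((x[i] - x[i - 1]) * (y[i] + y[i - 1]) / 2.0 for i in range(1, len(x)))


def roc_and_pr_auc(scores, labels):
    fpr, tpr, precision, recall = curve_points(scores, labels)
    return trapezoid_area(fpr, tpr), trapezoid_area(recall, precision)


roc_auc, pr_auc = roc_and_pr_auc([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

Unlike the threshold metrics, these areas summarize performance over all cutoff values, which makes them suitable for comparing the averaged probabilities produced with different stride values.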

### **3. Results**

### *3.1. DL Metrics Results*

Overall, the different architectures and backbones presented good results (Table 3). Among the architectures, U-net presented the best metrics, followed by DeepLabv3+, FPN, and PSPNet. Despite the higher complexity of the DeepLabv3+ architecture, U-net presented better results, probably because the targets do not vary much in scale, and handling multi-scale objects is one of the most significant benefits of DeepLabv3+. Moreover, although PSPNet provided the worst results, the difference is not large, and its training period is considerably shorter (less than half the time needed to train Eff-b7 with the U-net architecture, and nearly one-fifth of the time needed with the DeepLabv3+ architecture). When analyzing the different backbones, apart from Eff-b0 with the PSPNet architecture, the results did not change significantly. Metrics-wise, the accuracy score is high for all models (<3% variation), possibly because there are many more pixels in the background class than in the panels class. The IoU and F-score are therefore more meaningful. Eff-b7 with the U-net architecture had the best IoU and F-score results at an intermediate computational cost.


**Table 3.** Semantic segmentation evaluation (accuracy, IoU, F-score, and epoch period) using four architectures (U-net, DeepLabv3+, FPN, and PSPNet), and four backbones (Efficient-net-b7 (Eff-b7), Efficient-net-b0 (Eff-b0), ResNet-101 (R-101), and ResNet-50 (R-50)).

Figure 4 shows three examples from the test set, and three examples from the validation set with their corresponding original images (RGB channels), GT, and prediction. Despite some errors in the edges of the objects, these results suggest a correct identification of the target, with few errors.

**Figure 4.** Three examples from the test set and three examples from the validation set with their corresponding original image, ground truth (GT), and prediction.

### *3.2. Mosaicking Results*

Table 4 shows the ROC AUC and PR AUC scores for the 1536 × 768-pixel area using six different stride values (8, 16, 32, 64, 128, and 256). The analysis only considered the best model (U-net with Eff-b7 backbone). As the stride value decreases, results progressively improve in both metrics. Nevertheless, decreasing the stride value increases the computational cost, which becomes a significant limitation, especially for practical applications.
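The growth in computational cost can be illustrated by counting how many 256 × 256 windows (that is, model forward passes) are needed to cover the 1536 × 768-pixel area at each stride. The sketch below is a back-of-the-envelope calculation, not a measurement: the function name is hypothetical, and it assumes no padding and windows that stay entirely inside the image.

```python
def window_count(height, width, tile=256, stride=256):
    """Number of tile x tile windows that fit when sliding with `stride`."""
    rows = (height - tile) // stride + 1
    cols = (width - tile) // stride + 1
    return rows * cols


# Forward passes required for the 1536 x 768-pixel area at each stride value.
counts = {s: window_count(768, 1536, stride=s) for s in (8, 16, 32, 64, 128, 256)}
```

Under these assumptions, the number of windows grows from 18 at a stride of 256 to 10,465 at a stride of 8, roughly quadrupling each time the stride is halved, which is consistent with the trade-off between accuracy and processing time described above.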

**Table 4.** ROC AUC, PR AUC, and processing time for 8, 16, 32, 64, 128, and 256 stride values.


Figure 5 shows the original image, its corresponding GT, and the prediction using U-net with Eff-b7 backbone and an 8-pixel stride value on a 1536 × 768-pixel image. This mosaicking strategy enables the classification of areas with large dimensions, outputting images with no discontinuity.

**Figure 5.** Mosaic representation on a 1536 × 768-pixel image with the original image, the corresponding ground truth (GT), and prediction using the U-net with Efficient-net-b7 backbone.

### **4. Discussion**

The best result of our study was the U-net with the Eff-b7 backbone, although the other methods also reach high or adequate values. However, an unexpected result is that U-net outperformed DeepLabv3+ by a slight margin. This result is probably because the input images do not present multi-scale objects—one of the main contributions of the DeepLabv3+ method. Therefore, these results show that simpler structures may be well suited in some scenarios, highlighting the importance of testing different architectures.

Other solar panel detection studies using DL methods have demonstrated high accuracy in different locations. However, studies on PV solar plants are still far fewer than those on residential PV solar panels. Considering large-scale solar plants, Hou et al. [84] conducted a study in China with one thousand images, achieving 95% IoU with the U-net model. They used a much larger amount of data, and their results were not dissimilar to ours (92% IoU).

Generally, accuracy results are lower in residential PV solar panels due to their smaller dimension and higher susceptibility to noise interference. Yuan et al. [100] applied a simple ConvNet for large-scale solar panel mapping from aerial images, and evaluated their model in the cities of Boston and San Francisco using completeness (0.84 and 0.87) and correctness (0.81 and 0.85) metrics. Yu et al. [101] proposed DeepSolar with a substantial amount of training data using high-resolution satellite images, obtaining 93.1% recall and 88.5% precision, results very similar to our F-score (95%). Zhuang et al. [83] applied the U-net in satellite images for residential panels, achieving 74% IoU. Recently, Jie et al. [82] combined a U-net model with edge detection networks. The authors showed that the edge detection increased performance on two city panel datasets by nearly 2% IoU. This effect may be even less prominent in large solar plants since it is easier to detect borders, as shown in our study. Even though these studies trained with smaller PV solar panels, the results show an excellent ability to segment panels even with simpler models.

Thus, the results of our and previous studies suggest that the mapping of PV solar panels should be addressed in a data-driven, rather than model-driven, perspective, i.e., the DL models do not present a significant difference, and the most important endeavor is to obtain a reliable source of generating good annotations. Moreover, the present study showed significant results using data augmentation despite a limited amount of data.

The mosaicking procedure enables the classification of areas of indefinite and large sizes. We have shown that using a smaller stride value increases performance, but also the computational cost. The stride value for a practical application should take both factors into consideration. Regarding the mosaicking technique on semantic segmentation models, de Albuquerque et al. [86] performed a comparative analysis using different stride values, presenting progressively better ROC AUC scores for lower stride values, a result also verified in our research.

This research presents many possibilities for future studies. A first proposition would be to estimate energy production using the mapping of the photovoltaic solar panel from DL, and the level of solar incidence in a specific region. Another relevant test would be evaluating radar images due to cloud cover and atmospheric interference in optical images. Although synthetic aperture radar (SAR) images are noisy, they can be useful in some scenarios. Studies comparing the frame sizes according to the proposal by Bem et al. [102] can also be valuable in understanding the model's differences in various tasks (e.g., binary and multiclass) and object scales.

### **5. Conclusions**

The survey and monitoring of PV solar power plants are extremely important for energy management and planning. The high growth of solar energy in Brazil, a country with continental dimensions, generates an increase in inspection processes for ANEEL that is only manageable through technological innovation. Thus, this paper presented a comparison between DL models for the classification of PV solar plants using Sentinel-2 images with four spectral bands (RGB and near infra-red), comparing four architectures (U-net, DeepLabv3+, FPN, and PSPNet) with four backbones (ResNet-50, ResNet-101, Eff-b0, and Eff-b7), totaling 16 combinations. Additionally, we used augmentation and transfer learning. The PV panel spectral and shape characteristics facilitate the accurate detection of the panels. Results were satisfactory using the different backbones and architectures, but U-net with Eff-b7 backbone presented the best results with 98% accuracy, 92% IoU, and 95% F-score. We estimate that the most critical factors when mapping PV solar panels are a reliable source of data and their possible applications. For the classification of large regions, the image mosaicking procedure improves significantly when using more overlapping pixels, minimizing edge errors. The ROC AUC and PR AUC scores likewise increase progressively as the stride value decreases. However, the computational cost may be a significant challenge for practical applications, since the processing time increases significantly as the stride value is reduced. This methodology has many applications and satisfies the conditions for automatically classifying PV solar plants using free Sentinel-2 imagery, allowing for a significant advance in monitoring the implanted infrastructure.

**Author Contributions:** Conceptualization, M.V.C.V.d.C., O.L.F.d.C., A.G.O., and I.H.; methodology, O.L.F.d.C., O.A.d.C.J., and A.O.d.A.; software, O.L.F.d.C.; validation, M.V.C.V.d.C., A.O.d.A., and F.V.e.S.; formal analysis, O.L.F.d.C., O.A.d.C.J., and R.F.G.; investigation, M.V.C.V.d.C. and O.L.F.d.C.; resources, A.G.O., I.H., O.A.d.C.J., R.A.T.G., and R.F.G.; data curation, O.L.F.d.C., A.O.d.A., and F.V.e.S.; writing—original draft preparation, M.V.C.V.d.C., O.L.F.d.C., and O.A.d.C.J.; writing—review and editing, M.V.C.V.d.C., O.L.F.d.C., R.F.G., and O.A.d.C.J.; supervision, A.G.O., I.H., R.A.T.G., and R.F.G.; project administration, A.G.O., I.H., R.A.T.G., and R.F.G.; funding acquisition, A.G.O., I.H., R.A.T.G., R.F.G, and O.A.d.C.J. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the following institutions: National Council for Scientific and Technological Development (434838/2018-7) and Coordination for the Improvement of Higher Education Personnel (Finance Code 001).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The following research materials are available online at https://github.com/osmarluiz/large-scale-solar-panel (accessed on 12 May 2021).

**Acknowledgments:** The authors are grateful for financial support from CNPq fellowship (Osmar Abílio de Carvalho Júnior, Roberto Arnaldo Trancoso Gomes, and Renato Fontes Guimarães). Special thanks are given to the research group of the Laboratory of Spatial Information System of the University of Brasilia for technical support. The authors thank the researchers from the ANEEL, who encouraged research with deep learning. Finally, the authors acknowledge the contribution of anonymous reviewers.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


### *Article* **SHP Assessment for a Run-of-River (RoR) Scheme Using a Rectangular Mesh Sweeping Approach (MSA) Based on GIS**

**Gerardo Alcalá <sup>1</sup> , Luis Fernando Grisales-Noreña <sup>2</sup> , Quetzalcoatl Hernandez-Escobedo <sup>3</sup> , Jose Javier Muñoz-Criollo <sup>4</sup> and J. D. Revuelta-Acosta 5,\***


**Abstract:** This work proposed a base method for automated assessment of Small Hydro-Power (SHP) potential for a run-of-river (RoR) scheme using geographic information systems (GIS). The hydro-power potential (HP) was represented through a comprehensive methodology consisting of a structured raster database. A calibrated and validated hydrological model (Soil and Water Assessment Tool—SWAT) was used to estimate monthly streamflow as the Mesh Sweeping Approach (MSA) driver. The methodology was applied for the upper part of the Huazuntlan River Watershed in Los Tuxtlas Mountains, Mexico. The MSA divided the study area into a rectangular mesh. Then, at every location within the mesh, SHP was obtained. The main components of the MSA as a RoR scheme were the intake, the powerhouse, and the surge tank. The surge tank was located at cells where the hydro-power was calculated and used as a reference to later locate the intake and powerhouse by maximizing the discharge and head. SHP calculation was performed by sweeping under different values of the penstock's length, and the headrace's length. The maximum permissible lengths for these two variables represented potential hydro-power generation locations. Results showed that the headrace's length represented the major contribution for hydro-power potential estimation. Additionally, values of 2000 m and 1500 m for the penstock and the headrace were considered potential thresholds as there is no significant increment in hydro-power after increasing any of these values. The availability of hydro-power on a raster representation has advantages for further hydro-power data analysis and processing.

**Keywords:** small-hydro; GIS; tuxtlas mountains; Grid; SWAT; MSA

### **1. Introduction**

Hydroelectric power generation can be classified, based on its storage scheme, into two categories: reservoir and run-of-river (RoR) [1]. Higher power generation is associated with reservoir hydro-power, as it typically involves large-scale infrastructure. However, it also carries negative environmental and human impacts, such as changes in the hydrology of the river and community relocation. On the other hand, run-of-river schemes are better suited for Small Hydro-Power (SHP) generation projects as they use the natural river flow, requiring little or no impoundment infrastructure [2]. This makes them especially suited for sustainable and community-friendly distributed developments.

**Citation:** Alcalá, G.; Grisales-Noreña, L.F.; Hernandez-Escobedo, Q.; Muñoz-Criollo, J.J.; Revuelta-Acosta, J.D. SHP Assessment for a Run-of-River (RoR) Scheme Using a Rectangular Mesh Sweeping Approach (MSA) Based on GIS. *Energies* **2021**, *14*, 3095. https://doi.org/10.3390/en14113095

Academic Editors: Benedetto Nastasi and Meysam Majidi Nezhad

Received: 24 April 2021 Accepted: 13 May 2021 Published: 26 May 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

A run-of-river system (see Figure 1) is typically composed of an intake, a surge tank (ST), and a powerhouse (PH). Power is generated by diverting the river's flow via the headrace to the surge tank, where the water is slowed down sufficiently for suspended particles to settle. The water then continues to the powerhouse through the penstock or pipeline for electricity generation and is finally returned to the river [3]. Therefore, power generation depends on the hydrological and topographical factors associated with the locations of the different hydro-power facilities.

**Figure 1.** Run of River Scheme: (1) Diversion Weir, (2) Intake, (3) Headrace, (4) Surge Tank, (5) Penstock, (6) Powerhouse, (7) Tail Race [4].

### *1.1. GIS-Based Power Assessment*

It is not uncommon for suitable high-potential RoR sites to be located in remote mountainous areas characterized by rough terrain, making survey-based potential assessment impractical. In such situations, geographic information systems (GIS) and remote sensing (RS) provide powerful tools to overcome such challenges to a large extent [5]. GIS- and RS-based techniques have been successfully applied to potential assessment and site selection of solar [6–8], wind [9–11], and biomass [12–14] energy resources. For these cases, such resources can be distributed over a uniform rectangular grid (i.e., raster data). Raster models can be conceived as gridded maps where each element represents an area of the study zone and has information and a geographic location assigned to it [15]. The raster representation has advantages for data processing, as different variables in raster data can be considered by directly overlaying their maps, making it suitable for further analysis, such as multi-criteria decision making (MCDM).

In turn, GIS is also becoming increasingly popular as an assessment tool for the location and selection of the different types of hydro-power opportunities due to its ease of use and its cost and time effectiveness [16]. Nevertheless, unlike the other generation projects, hydro-power potential is not presented as a distribution over a rectangular grid and is not associated with a single position but with the river basin region that includes the locations of the different hydro-power facilities. In fact, hydro-power (see Equation (1)) depends on the discharge *Q* taken at the intake position and a gross hydraulic head *H*, the height difference between the surge tank and the powerhouse, where the energy is produced. Furthermore, Small Hydro-Power is typically distributed over a stream network in which the selection of one hydro-power generation site may interfere with the selection of another [17]. Namely, once a potential site is located over an influence area, it prevents other potential sites from overlapping the same influence area.

### *1.2. Hydro-power Potential Modeling*

A common approach to evaluating a river basin's hydro-power potential consists of specifying points along the river, separated by a fixed interval, representing potential sites over the stream network. These points are then evaluated using digital elevation models (DEM) to calculate the gross hydraulic head and the available flow for the catchment area [17]. Nevertheless, as hydro-power potential modeling depends on the site's discharge and topography, it is highly sensitive to the approach used.

Some models focus on identifying the locations with the highest potential. For example, Kusre et al. [18] identified, as potential sites in the Kopili River in India, streams of 5th order or greater with a bed slope greater than 2% and spaced 500 m apart. Palomino-Cuya et al. [19] estimated the maximum theoretical hydro-power potential in the La Plata basin in South America by considering the mean annual discharge. This was done at sub-basin scale by considering the mean elevation of the upstream sub-basin calculated from hypsographic curves, and at river scale in cross-sections spaced 100 m apart; in order to obtain the maximum hydro-power potential at main river scale, the energy values of each cross-section produced by the hydraulic head of 0.5 m were selected. Fujii et al. [20] estimated hydro-power for six different rivers in Beppu city Bay, with lengths ranging from 3 to 6 km, for sites located 500 m from the mouth of each river. They estimated the mean discharge for all months by GIS using precipitation data and a land use map. Bayazit et al. [21] located the sites with the most potential along the river by considering a scenario with average precipitation and one with minimum precipitation. To obtain the height gradient, focal statistics were applied over a 3 × 3 cell neighborhood by considering the minimum height among the neighbors, which gives the corresponding head.

To maximize hydro-power benefit, some works have been devoted not to identifying the site with the highest potential but to schemes of non-overlapping projects along the river. For example, Larentis et al. [16] developed a GIS-based program called Hydrospot for RoR and storage projects. Given a location along the river, they obtained another location in the river within a radius according to the best relation between the head and the slope, which is not necessarily the farthest point. They also calculated the gross potential due to the terrain head for the dam-powerhouse alternative. The program considers the interference of multiple plants and minimizes the difference between the value of the total potential in the river basin before and after every flow regulation and at the site optimization cycle. Zaidi and Khan [5] considered schemes of plants (from intake to turbine) separated by 100 m from each other. The gross head for each plant was calculated as the elevation difference from the intake to the turbine located at a horizontal distance of 500 m. The intake positions for a given scheme were selected as points separated by 100 m along the river. Different schemes of plants were generated depending on the position of the initial point, which is considered the first plant. Ibrahim et al. [22] proposed non-overlapping RoR projects along the Gude River in Ethiopia. The intake sites were generated along the given stream at uniform intervals, and a genetic algorithm was applied for optimization. The benefit of each intake was optimized by considering constraints on turbine flow and on the penstock's length and diameter. Thereafter, a second optimization was performed by selecting all non-overlapping projects. In this way, a map of optimal intakes was obtained all along the river, identifying 22 optimal RoR projects over a length of 49 km.

Abdelhady et al. [23] developed a thorough optimization model to determine the ideal arrangement of non-storage-based Small Hydro-Power projects along a selected stream. The model maximized the annual benefit through a genetic algorithm and calculated the intake location, penstock's diameter and length, and turbine size and capacity.

Nevertheless, when selecting locations, the potential is not the only variable to be considered. Some works also include social and economic factors or even the exact location of the different components of the plant. Rojanamon et al. [3] performed a complete study by considering environmental, economic, and social factors to find potential sites. Specific criteria for site selection were also provided, such as the distance between the weir site, powerhouse, and the surge tank head. Yi et al. [24] developed a model for Small Hydro-Power for a RoR and a storage scheme, considering topographic, hydrologic, and eco-environmental factors. The stream network grid was studied all along its waterways. At every location, circles with radii at 100 m intervals were drawn, with the radius corresponding to the waterway length, and the sites with the shortest possible waterway and maximum height were selected as optimal. Zapata-Sierra and Manzano-Agugliaro [25] proposed a methodology for the evaluation of hydro-power in a Mediterranean climate. They analyzed 10 basins in the Sierra Nevada and found a relation between altitude and basin area for the location of SHP by optimizing the cost of civil work, the energy production, and the population supplied with energy. Sammartano et al. [26] identified potential locations for run-of-river hydro-power plants by using GIS tools and the Soil and Water Assessment Tool (SWAT) model within the Taw at Umberleigh River Basin, Southwest England. Hydro-power potential was estimated for 2189 different locations on the river network by dividing it into equal segments of 100 m. Potential sites (segments of the river) were obtained according to environmental and economic criteria. The environmental analysis was incorporated by excluding sites located in areas of high environmental sensitivity. The economic analysis was implemented through an equation that calculated the turbine and generator costs, which depended on the hydro-power and the head. Wegner et al. [27] calculated hydro-power potential in Paraná hydrologic basin 3. The study included environmental variables such as flooded and protected areas, slope, infrastructure, indigenous land, and consolidated area, as well as water availability. To carry out a detailed study of the contributing areas, the drainage network was segmented into stretches with a maximum extension of 450 m, resulting in 3899 points to identify the potential sites within the river network.

The present work proposes a new approach for Small Hydro-Power assessment that describes the study zone's hydro-power potential, distributed on a rectangular mesh in raster format. This methodology presents a RoR scheme that considers the location of the intake, the surge tank, the powerhouse, and the penstock's and the headrace's routes. The effect of interference is not considered, as it needs a sequence of projects, so the calculation at each location is independent and can be performed via parallel computing by using R programming language packages (e.g., doParallel, parallel, foreach). The hydrologic response of the selected watershed was calculated using a well-known semi-distributed hydrologic model. The final product of the calculations is an HP raster map. The availability of a hydro-power distribution in raster format has advantages, and the map can be used as a base map for different studies, such as suitability analysis at the planning stage or the determination of a sequence of hydro-power projects. The methodology developed in this work for HP calculation is called mesh sweeping assessment (MSA), as calculations are performed for every location on the mesh.

### **2. Materials and Methods**

Theoretical potential *P* depends on the water density *ρ*, the gravitational acceleration *g*, the gross hydraulic head *H*, the discharge *Q* at a cross-section of the river, and an efficiency factor *η*, according to the following equation [28]:

$$P = \eta \rho g Q H \tag{1}$$

As previously mentioned, solar and wind potential estimation differs from hydro-power (HP) estimation, since the potential for the first two can be estimated where the panel or turbine is located. Hydro-power generation, on the other hand, relies on the discharge *Q* at the intake position and the gross head *H*, the height difference between the surge tank and the powerhouse (PH), where the turbine produces the energy.
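As a quick numerical illustration of Equation (1), the following Python sketch evaluates *P* for a single location. The function name, the default efficiency of 0.8, and the sample values of *Q* and *H* are illustrative assumptions and are not taken from the study.

```python
RHO_WATER = 1000.0  # water density in kg/m^3
G = 9.81            # gravitational acceleration in m/s^2


def hydro_power_kw(discharge_m3s, gross_head_m, efficiency=0.8):
    """Theoretical power from Equation (1), P = eta * rho * g * Q * H, in kW.

    efficiency=0.8 is an assumed placeholder, not a value from the paper.
    """
    if discharge_m3s <= 0 or gross_head_m <= 0:
        return 0.0  # no flow or no head means no potential
    watts = efficiency * RHO_WATER * G * discharge_m3s * gross_head_m
    return watts / 1000.0


# Illustrative values only: Q = 5 m^3/s taken at the intake, H = 40 m
print(round(hydro_power_kw(5.0, 40.0), 1))
```

Returning zero for non-positive inputs mirrors the convention in the text that power is set to zero wherever no valid head or discharge exists.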

For the mesh sweeping assessment methodology proposed in this work, hydro-power potential is obtained for every position in the study area, which is divided by a rectangular mesh, given a digital elevation model. This approach models a RoR scheme (see Figure 2) in three main steps: first, the location of the surge tank is established; second, the location of the powerhouse and the penstock's route are determined; and finally, the location of the intake and the headrace's route are found. There are also three main simplifications in the model: first, no interference effect is considered; second, the headrace and penstock are modeled as straight lines; and third, no head losses are considered. Nevertheless, none of these simplifications compromises the base methodology, and improvements could be implemented to obtain better modeling in any of the three steps.

**Figure 2.** Run of River scheme includes the intake, the surge tank and the powerhouse.

### *2.1. First Step: Surge Tank Location*

The main idea of the methodology is to place the surge tank at the same position where the power is calculated, on the ground and not along the river. Constructions and water bodies are considered exclusion or restricted areas, and the potential *P* is set to zero at these places.

### *2.2. Second Step: Powerhouse Location*

The process for locating the PH is represented schematically in Figure 3a–c. In Figure 3a, candidates for the PH within a radius *RPH* from the ST that meet the *PH slope criterion* are colored in purple. This criterion requires that potential sites have a gross head relative to the ST higher than 4 m. In Figure 3b, the remaining candidates meeting the *PH obstacles criteria* are colored in orange. These criteria require no obstacles between the ST and the PH candidate if joined by a straight line, which is how the penstock's route is modeled. Finally, in Figure 3c, the PH location is selected to be the site that maximizes the gross head; its corresponding penstock route is also shown. For the present study, the maximum horizontal distance *RPH* for the penstock varies from 100 to 2500 m. If no site meets all of these requirements, gross head *H*, discharge *Q*, and consequently power *P* are set to zero.
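The powerhouse selection described above can be sketched in Python as follows. This is a minimal illustration, not the authors' R implementation: the function names, the cell-sampling line check, and the small 3 × 3 elevation grid are all assumptions made for the example; only the 4 m head threshold and the "maximize gross head" rule come from the text.

```python
import math

MIN_HEAD_M = 4.0  # PH slope criterion from the text: gross head above 4 m


def line_is_clear(obstacles, start, end):
    """Check the cells sampled along the straight ST-to-candidate segment.

    The sampling scheme (one sample per cell step) is an assumed simplification.
    """
    steps = max(abs(end[0] - start[0]), abs(end[1] - start[1]))
    for k in range(steps + 1):
        t = k / steps if steps else 0.0
        r = round(start[0] + t * (end[0] - start[0]))
        c = round(start[1] + t * (end[1] - start[1]))
        if obstacles[r][c]:
            return False
    return True


def select_powerhouse(dem, obstacles, st, cell_size_m, r_ph_m):
    """Return ((row, col), head) of the candidate with maximum head, or None."""
    best = None
    st_elev = dem[st[0]][st[1]]
    for r, row in enumerate(dem):
        for c, elev in enumerate(row):
            dist = math.hypot(r - st[0], c - st[1]) * cell_size_m
            head = st_elev - elev  # gross head relative to the surge tank
            if dist > r_ph_m or head <= MIN_HEAD_M:
                continue  # outside the radius or fails the slope criterion
            if not line_is_clear(obstacles, st, (r, c)):
                continue  # fails the obstacles criterion
            if best is None or head > best[1]:
                best = ((r, c), head)
    return best


# Hypothetical 3 x 3 elevation grid (m) with the surge tank at cell (0, 0)
dem = [[100.0, 98.0, 90.0],
       [99.0, 95.0, 80.0],
       [97.0, 85.0, 70.0]]
free = [[False] * 3 for _ in range(3)]
blocked = [[False, True, False], [False] * 3, [False] * 3]

print(select_powerhouse(dem, free, (0, 0), 12.5, 30.0))
print(select_powerhouse(dem, blocked, (0, 0), 12.5, 30.0))
```

A `None` result corresponds to the case in which *H*, *Q*, and *P* are set to zero for that surge tank position.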

**Figure 3.** Powerhouse site selection process. (**a**) Slope criteria; (**b**) Obstacles criteria; (**c**) Maximum gross head. In this example, the maximum permissible penstock length is set to *RPH* = 2000 m.

### *2.3. Third Step: Intake Location*

The location of the intake is illustrated in Figure 4a–c. In Figure 4a, candidates for the intake within a radius *R<sup>I</sup>* from the ST are colored in green. These sites meet the *stream criteria*, which require that potential intake sites and the PH belong to the same stream segment in order to avoid water exchange. In Figure 4b, the remaining candidates meeting the *Intake slope criteria* are colored in pink. These criteria require that the elevation of potential sites is at least 3 m higher than that of the surge tank. In Figure 4c, the remaining intake candidates, shown in black, meet the *Intake obstacles criteria*, which guarantee no obstacles along a straight line drawn from the ST to the intake candidate. Lastly, the site with the maximum discharge is selected as the intake location. For the present study, the maximum horizontal distance *R<sup>I</sup>* for the headrace varies from 100 to 2500 m. If no site meets all of these requirements, discharge *Q* and consequently power *P* are set to zero.
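The intake filtering sequence can be illustrated with a short Python sketch. Here each candidate is assumed to carry precomputed attributes (distance to the ST, stream segment, elevation, discharge, and whether the straight headrace line is obstacle-free); the `IntakeCandidate` record, its field names, and the sample values are hypothetical, while the 3 m elevation rule and the "maximize discharge" rule follow the text.

```python
from dataclasses import dataclass

MIN_RISE_M = 3.0  # intake slope criterion from the text: at least 3 m above the ST


@dataclass
class IntakeCandidate:
    name: str
    distance_m: float     # horizontal distance to the surge tank
    stream_id: int        # stream segment the cell belongs to
    elevation_m: float
    discharge_m3s: float
    path_clear: bool      # True if the straight ST-to-intake line has no obstacles


def select_intake(candidates, st_elevation_m, ph_stream_id, r_i_m):
    """Apply the stream, slope, and obstacle criteria, then maximize discharge."""
    feasible = [
        c for c in candidates
        if c.distance_m <= r_i_m
        and c.stream_id == ph_stream_id
        and c.elevation_m >= st_elevation_m + MIN_RISE_M
        and c.path_clear
    ]
    return max(feasible, key=lambda c: c.discharge_m3s, default=None)


# Hypothetical candidates around a surge tank at 120 m elevation
candidates = [
    IntakeCandidate("A", 400.0, 1, 130.0, 2.0, True),
    IntakeCandidate("B", 900.0, 1, 140.0, 3.5, True),
    IntakeCandidate("C", 600.0, 2, 150.0, 6.0, True),   # different stream segment
    IntakeCandidate("D", 300.0, 1, 121.0, 4.0, True),   # less than 3 m above the ST
    IntakeCandidate("E", 700.0, 1, 135.0, 5.0, False),  # obstacle on the headrace line
]

best = select_intake(candidates, st_elevation_m=120.0, ph_stream_id=1, r_i_m=1000.0)
print(best.name if best else None)
```

Shrinking `r_i_m` removes the more distant candidates, which is how the sweep over headrace lengths changes the selected intake and hence the discharge.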

### *2.4. Hydro-Power Computation*

Computations are performed for the Huazuntlan River Watershed, following the flow diagram in Figure 5. The extent of the study area is 364.74 km<sup>2</sup> with a resolution of 12.5 m, which leads to 2,334,360 different locations on the map. Because the effect of interference is not considered, the calculation at each location is independent and is performed via parallel computing. The script was run in the free software R (v.2020.4.0.0) [29] using the packages sp, raster, rgeos, rgdal, and gdistance for spatial data manipulation and doParallel for parallel implementation.

**Figure 4.** Intake site selection process. (**a**) Stream criteria; (**b**) Slope criteria; (**c**) Obstacles criteria. In this example, the maximum permissible headrace length is set to *R<sup>I</sup>* = 3000 m.

**Figure 5.** Mesh sweeping approach (MSA) methodology flow diagram.

### *2.5. Watershed Description*

The Huazuntlan River Watershed is located in the Sierra de Santa Martha, which belongs to the Los Tuxtlas Mountain range, an ecological reserve and a multicultural zone. It belongs to the state of Veracruz, Mexico, and drains to the Coatzacoalcos River basin. The watershed covers approximately 118.6 km<sup>2</sup> with the outlet at 18.1510 Lat and −94.7884 Lon. The annual precipitation ranges from 993 to 1102 mm within the 19-year study period (1995 to 2013). The minimum, mean, and maximum elevation of the watershed are 39, 574, and 1673 meters above sea level, respectively. Current land uses are shown in Figure 6. They include evergreen forest (66.13%, FRSE), rangeland (14.93%, RNE1), rye (9.37%, RYER), shrubland (3.45%, RNG1), agricultural and cultivated areas (5.02%, AGMX), and grassland (1.09%, CRGR). The predominant soils in the watershed are sandy loam soils (55.25%) and loam soils (44.75%).

**Figure 6.** Huazuntlan River Watershed: **left**: digital elevation model; **right**: land use distribution. Units in meters.

### *2.6. Hydrological Model*

The hydrological response of the watershed was simulated using the Soil and Water Assessment Tool (SWAT) [30]. The SWAT is a semi-distributed hydrological model that executes at daily and monthly time steps [31–33]. The model was chosen because of its ability to simulate flows at different time steps and its temporal-spatial representation of climate, soil types, and land use. The SWAT model simulates the hydrological processes based on the following water balance equation:

$$
\Delta W_t = \Delta W_0 + \sum_{i=1}^{t} (P - Q_s - E_a - W_s - Q_g)_i \tag{2}
$$

where ∆*W*<sub>0</sub> and ∆*W*<sub>t</sub> are the initial and current soil water content, respectively, *P* is the amount of precipitation, *Q*<sub>s</sub> is the amount of surface run-off, *E*<sub>a</sub> is evaporation, *W*<sub>s</sub> is the amount of water in the vadose zone, and *Q*<sub>g</sub> is the baseflow. All variables on day *i* are in meters.
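To make the bookkeeping in Equation (2) concrete, the following Python sketch accumulates the daily terms for a few days. It is only a worked example of the balance, not SWAT itself; the function name and the three days of input values are made up for illustration and use arbitrary but consistent units.

```python
def soil_water_content(w0, daily_terms):
    """Accumulate Equation (2) over the given days.

    Each item of daily_terms is a tuple (P, Q_s, E_a, W_s, Q_g) for one day,
    in the same units as the initial water content w0.
    """
    w = w0
    for p, q_s, e_a, w_s, q_g in daily_terms:
        w += p - q_s - e_a - w_s - q_g  # net change in storage for that day
    return w


# Hypothetical three-day record: (P, Q_s, E_a, W_s, Q_g)
days = [
    (10.0, 2.0, 3.0, 1.0, 1.0),  # wet day: storage rises by 3
    (0.0, 0.0, 2.0, 0.5, 0.5),   # dry day: storage falls by 3
    (20.0, 8.0, 4.0, 2.0, 1.0),  # storm: storage rises by 5
]

print(soil_water_content(100.0, days))
```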

### 2.6.1. Databases

The SWAT model requires meteorological, topographic, land cover, and land use data for the study watershed. All input data were integrated into the model via raster data sets, weather station locations, and measured data files. The topography was described using a DEM from the shuttle radar topography mission (SRTM) [34] with a 12.5 m horizontal resolution. Meteorological data were extracted from [35], which contains daily records of precipitation, maximum and minimum temperatures, and wind speeds from 1950 to 2013 for all of North America with a 1/6° spatial resolution. Relative humidity and solar radiation inputs were generated using the generator integrated into the SWAT model and developed by the National Center for Environmental Prediction (NCEP) and the Climate Forecast System Reanalysis (CFSR). Soil information was obtained from the world's digital soil map provided by the United Nations Food and Agricultural Organization (FAO) [36]. All observed data used for calibration and validation purposes in the present study were extracted from the Banco Nacional De Datos De Aguas Superficiales (BANDAS) database [37], which contains daily and monthly discharges from 2070 streamflow stations. The land use description for the watershed was extracted from the GlobCover initiative developed by the European Space Agency, which contains global cover maps using observations from the 300 m MERIS sensor onboard the ENVISAT satellite [38].

### 2.6.2. Model Setup

ArcSWAT (v.2012.10.5.24) was used to facilitate the data entry, setup, and parametrization of the present study's hydrological model. The watershed was delineated automatically and based entirely on topographic and river network information such as DEM, flow direction and flow accumulation raster maps. The watershed outlet was selected to match the river mouth and the streamflow station. After delineating the watershed, 1477 HRUs (Hydrologic Response Units) were generated based on land use, soil type, and slope characteristics.

The simulation was executed from 1 January 1995 to 31 December 2013. This period was defined as a function of the available data. A five-year period was selected for warm-up purposes in all the simulations. From the total simulation period, eight years were used for model calibration, whereas the remaining five years were used for validation purposes.

### 2.6.3. Model Calibration, Validation, and Sensitivity Analysis

Model calibration was performed with the automatic tool Soil and Water Assessment Tool Calibration and Uncertainty software (SWAT-CUP, v. 5.2.1.1; [39]). The selected calibration algorithm was the SUFI-2 method [40,41]. In SUFI-2, the uncertainty, referred to as the 95% prediction uncertainty, is propagated using the Latin Hypercube scheme (LHs) and calculated at the 2.5% and 97.5% levels for all calibration variables [42]. The model was calibrated at monthly time steps. A total of 500 simulations were carried out in each of the eight iterations during calibration. The calibration process took place from 2000 to 2009 for the streamflow in the Huazuntlan River Watershed. The recommendations for parameter regionalization discussed by [43] were followed. Based on these directions, the parameters selected for the calibration procedure are shown in Table 1. The validation of the model was performed for a period starting in 2010 and ending in 2013.

Calibration and validation performances were assessed using the Nash-Sutcliffe efficiency (NSE) and the percentage bias (PB) from the following equations:

$$NSE = 1 - \frac{\sum_{i=1}^{n} (Q_i^o - Q_i^s)^2}{\sum_{i=1}^{n} (Q_i^o - \bar{Q}^o)^2} \tag{3}$$

$$PB = 100 \times \frac{\sum_{i=1}^{n} (Q_i^o - Q_i^s)}{\sum_{i=1}^{n} Q_i^o}\tag{4}$$

where *Q*<sub>i</sub><sup>o</sup> is the observed streamflow, *Q*<sub>i</sub><sup>s</sup> is the simulated streamflow, and *Q̄*<sup>o</sup> is the mean of the measured data. Values of *NSE* > 0.65 and −25% ≤ *PB* ≤ 25% are required in order to consider the calibration good, according to the criteria provided by [44]. The correlation factor, *R*, was also calculated to assess the linear dependence between the observed and simulated responses from the following equation:

$$R = \frac{S_{xy}}{S_x S_y} \tag{5}$$

where *S*<sub>xy</sub> is the covariance of the variables *x* and *y*, and *S*<sub>x</sub> and *S*<sub>y</sub> are the standard deviations of the corresponding variables. The assessment metrics mentioned above were also used for the validation procedure. The sensitivity analysis was performed using SWAT-CUP. The model estimates the sensitivity by changing the different input parameters and analyzing the response of the model's output to these variations. Each parameter's significance is evaluated with a t-test and its corresponding *p*-value [39].
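The three goodness-of-fit measures can be computed with a few lines of Python. This sketch assumes the standard forms of the metrics, that is, the observed mean in the NSE denominator and an unsquared residual sum in the percentage bias; the function names and the four-value sample series are illustrative and are not data from the study.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def percent_bias(obs, sim):
    """Percentage bias: positive values mean the model underestimates on average."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def correlation(x, y):
    """Correlation factor R = S_xy / (S_x * S_y), using population statistics."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    s_x = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    s_y = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return s_xy / (s_x * s_y)


# Hypothetical monthly streamflow values in m^3/s
observed = [2.0, 4.0, 6.0, 8.0]
simulated = [3.0, 4.0, 5.0, 7.0]

print(nse(observed, simulated))
print(percent_bias(observed, simulated))
print(correlation(observed, simulated))
```

Under this sign convention, a positive percentage bias indicates that the simulated flow is lower than the observed flow on average.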



### **3. Results and Discussion**

Results are presented in the order in which they were used in the proposed model. First, the hydrological model calibration and validation are presented along with the model's performance. Second, the hydrological behavior is discussed in terms of the actual physical conditions of the watershed and the model parameters. Third, hydro-power maps are developed as functions of the two most relevant variables, the headrace and penstock lengths, showing the spatial distribution of hydro-power. Additionally, the contribution of these variables is quantified by multiple regression analysis. Lastly, a more comprehensive MSA methodology is illustrated to show its use, potential, and limitations.

### *3.1. SWAT Model Sensitivity Analysis*

The 10 most sensitive of the 20 calibration parameters shown in Table 1 were ranked from the most to the least sensitive (see Table 2). These sensitive parameters were responsible for significant changes in the model output during the calibration process. Results showed that the five most sensitive parameters control the overland run-off (CN2, t = 17.64), the base-flow recession response to changes of the water table (ALPHA\_BF, t = 14.86), the flow discharge peaks and residence time through properties such as the channel hydraulic conductivity (CH\_K2, t = −8.98) and the Manning's roughness coefficient (CH\_N2, t = −7.80), and, lastly, the soil evaporation compensation (ESCO, t = 3.85), which modifies the depth distribution used to meet the soil evaporative demand to account for capillary rise. Table 2 shows that these variables present significant *p*-values (+0.000).


**Table 2.** Parameters used in streamflow calibration.

### *3.2. SWAT Calibration and Validation*

Overall, a good calibration of the proposed model was obtained when compared to the observed flow discharge at the outlet of the Huazuntlan River Watershed. Figure 7 shows the observed flow discharge and the best simulated flow signal after the automatic calibration and validation procedures. For the calibration period, NSE was 0.69, *R*<sup>2</sup> was 0.84, and PBIAS was 3.1%, which represents a good performance for monthly flow discharge calculations according to [44]. Similarly, for the validation period, NSE was 0.64, *R*<sup>2</sup> was 0.83, and PBIAS was 12.5%, which are in the range bracketing a good fit between the observed and simulated signals (see Table 3). The mean, maximum, and minimum observed discharge for the calibration period were 5.28, 28.78, and 0.03 m<sup>3</sup>/s, respectively. The simulated flow showed a mean, maximum, and minimum of 5.12, 20.75, and 0.55 m<sup>3</sup>/s, respectively. Similar flow magnitudes were thus obtained between the predictions and the observations, although minimum flow conditions were slightly over-predicted by the model. For the validation period, the mean, maximum, and minimum observed discharge were 5.99, 40.35, and 0.04 m<sup>3</sup>/s, respectively, while the simulated flow showed a mean, maximum, and minimum of 5.30, 17.33, and 0.90 m<sup>3</sup>/s, respectively. The mean discharge agreed well with its simulated counterpart. However, several peaks during both calibration and validation were not captured accurately by the model. As in the calibration period, the model slightly overestimated the base-flow conditions, especially for months with low precipitation.

**Table 3.** Observed vs simulated streamflow goodness of fit

**Figure 7.** Huazuntlan River Watershed hydrologic response. **Top**: monthly precipitation; **Bottom**: observed streamflow (circles) and simulated streamflow (solid line).

### *3.3. Hydrological Behavior of the Watershed*

The SWAT model proposed in this study performed well and indicated a linear watershed hydrologic response, as it presented a rapid response to rainfall events, except for storm events during the intense summer rainfall, which are generally preceded by dry conditions. This linearity and rapid response of the watershed is attributed to the subsurface lateral flow [45]. This type of hydrologic behavior is reflected in the relatively high values of ALPHA\_BF during the calibration period [46]. Although a non-linear response assumes that overland flow in steep mountainous catchments is the most significant contribution to the streamflow, surface run-off is unlikely to happen on forested hillslopes due to the high hydraulic conductivities of the soil, which is the case for the mainly forested Huazuntlan River Watershed. As a result, in the shallow soils found in the study area, a preferential flow mechanism is likely to occur within the watershed. This process is common in mostly humid, forested watersheds [47] due to ephemeral and perennial pipes in the soil maintained mainly by subsurface flow and burrowing animals, respectively. This type of preferential flow is quite complex to explain and simulate, and it can only be discussed through the statistical properties of the parameters defining subsurface flow in hydrological models. In addition, in SWAT models, the representation of the water table attributes and position tends to be inadequate. As a result, the parameters controlling the shallow groundwater structure carry a large uncertainty [48]. A more discrete spatial representation of the aquifer characteristics and groundwater systems within sub-watersheds may increase the model accuracy.

The overall hydrologic response of the watershed showed that observed and simulated streamflow outputs presented similar phases and trends, as well as a reasonable match with the observed peaks during the humid season (see Figure 7). However, significant deviations between the simulated and observed peaks were found. These deviations, present during high-intensity rainfall events, might be due to the precipitation heterogeneity within the watershed or the well-known limitations of the curve number (CN) method. The CN method considers neither the duration of the storm or precipitation nor its intensity [49], which can limit the ability of the SWAT model to estimate the magnitude of the flow discharge peaks [50]. Regardless of these limitations, the CN method slightly overestimated streamflow for some large rainfall events and showed some limitations in capturing the peak during the high-intensity rainfall season. This peak mismatch, as previously mentioned, is due to the spatial-temporal variability of the rainfall data, which is heterogeneous by nature. Additionally, peak discharges may be due to changes in land use influencing the hydrological phenomena of the direct run-off. In this study, an increase of 25% in the overall curve number was found during the calibration period. Although the calibration procedure produced a good performance, a more accurate representation of the CN may be obtained from a spatial and dynamic calibration of this parameter.

In large basins, the residence time plays an important role during calibration. The residence time refers to the average time that a certain amount of water takes to travel through a defined river reach. In the SWAT model, this variable is affected by Manning's roughness coefficient, which by definition influences the mean velocity of the flow traveling through streams. This roughness was assumed homogeneous in the proposed SWAT model, with a calibrated value typical of natural streams (∼0.06 [51]). Additionally, the hydraulic conductivity of the overall system of streams presented an average value of 476.15 mm/day, which represents a high transmissivity of water from the streams to the hillslope or vice versa and is consistent with the high rates of lateral flow dominating the flow hydrograph. Overall, the annual average discharges from 2000 to 2013 were 5.49 and 5.17 m<sup>3</sup>/s for the measured and simulated data, respectively. These values and the performance metrics indicate that the model can be applied for further assessment of the hydrologic response under different land use scenarios. It should be noted that some uncertainty is always introduced into a hydrologic model regardless of the agreement between the simulated and observed signals [52]. These uncertainties may arise from different sources, including the model conceptual structure, assumed initial conditions, observed input data, and selected calibration parameters. The latter have been discussed extensively, and although 20 parameters were selected as the most significant, Duan et al. [52] suggested that more variables are needed for calibration purposes.

### *3.4. Hydropower Map*

Hydropower maps were obtained by implementing the MSA according to the scheme of Figure 5 for different pairs of values (*RPH*, *RI*) representing the penstock's and headrace's maximum permissible lengths, respectively. Values of (100, 250, 500, 1000, 1500, 2000, 2500) m were used for both *RPH* and *R<sup>I</sup>* to get a more comprehensive understanding of the interaction between the two variables. Due to computing access limitations (i.e., HPC account expired), results for (*RPH*, *RI*) = (2500, 2500) m were not considered. As a result, only 48 of the 49 possible simulations were analyzed. Potential sites include mountainous regions with steep terrain, as well as downstream places.
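As a quick check on the number of runs, the parameter grid can be enumerated directly. The following is only an illustrative sketch: the variable names are not taken from the study, and it simply builds all pairs of the seven lengths and drops the one case that was not simulated.

```python
from itertools import product

# Maximum permissible lengths (m) tested for both R_PH and R_I.
LENGTHS_M = (100, 250, 500, 1000, 1500, 2000, 2500)

# The single (R_PH, R_I) case that was not simulated.
EXCLUDED = {(2500, 2500)}

# All remaining (R_PH, R_I) pairs.
runs = [pair for pair in product(LENGTHS_M, repeat=2) if pair not in EXCLUDED]

print(len(runs))  # 7 * 7 - 1 = 48
```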

Figure 8 shows hydro-power maps for three representative (*RPH*, *RI*) cases, (250, 2000), (1000, 2000) and (2500, 2000) m, where *R<sup>I</sup>* is fixed. For the sake of visualization, different scales were used. In all three maps, most of the values are close to zero. The main reason is that places far away from the river were not included within the *RPH* or the *R<sup>I</sup>* radius. In other cases, obstacles may have obstructed the headrace's or the penstock's path, or the site simply did not meet the requirements given by the model. Overall, the larger the *RPH* value, the more potential sites with higher hydro-power were found. Additionally, a spatially non-uniform hydro-power distribution was present on all maps. This behavior is due to an implicit balance between discharges *Q* and hydraulic heads *H*, both of which are required for hydro-power estimation according to Equation (1): flow discharges increase from upstream to downstream (from north to south), whereas steep elevation gradients are more common at high elevations. For instance, when (*RPH*, *RI*) was (250, 2000) m (Figure 8a), potential sites with hydro-power lower than 1500 kW tended to be located within the buffer created along the flood plains of the river and dominated by *RPH*. Locations with hydro-power higher than 1500 kW were mostly at the junction of the river's main stem and tributaries and in other downstream zones. Similarly, for Figure 8b,c, increments in hydro-power were due to the magnitude of *RPH*. In Figure 8b, the highest hydro-power values ranged from 1500 to 2500 kW over locations similar to those in Figure 8a, but their extent increased. Lastly, Figure 8c shows significant changes in the potential locations for hydro-power generation, where the locations with the highest hydro-power (>2000 kW) were located downstream and all over the middle and top sections of the watershed. Accordingly, it would be possible to select high potential sites without interfering with other high potential sites.

For a fixed value of *RPH* and three representative values of *R<sup>I</sup>*, Figure 9 shows cases (2000, 250), (2000, 1000) and (2000, 2500) m. Once again, different scales were used for each map for the sake of visualization. In the first scenario, with a low value of *R<sup>I</sup>*, there are fewer potential sites, and the highest hydro-power is found close to the mouth of the river (Figure 9a). An increment in *R<sup>I</sup>* showed that hydro-power ranged from 500 to 2000 kW and was mostly located at the watershed's headwaters. Additionally, a few locations with *HP* > 2000 kW were at the junction of the main river and its tributaries and close to the watershed's outlet (Figure 9b). The last scenario, with *RPH* = 2000 m and *R<sup>I</sup>* = 2500 m, showed a hydro-power distribution similar to that in Figure 8c. However, potential locations with moderate hydro-power conditions (1500 < *HP* < 2000 kW) were located mainly over the middle and top sections of the watershed, whereas the highest hydro-power values, *HP* > 2000 kW, were still located at the river's junction and near the mouth of the watershed (Figure 9c). Once again, it can be seen that high potential sites do not interfere with other high potential sites under this approach.

**Figure 8.** Hydropower maps for fixed maximum headrace's length *R<sup>I</sup>* = 2000 m.

**Figure 9.** Hydropower maps for fixed maximum penstock's length *RPH* = 2000 m.

The MSA shares common criteria with other methodologies. Works such as [3,16,17] located the different components of the RoR scheme by fixing maximum distances between surge tank and powerhouse. Then, the head was maximized according to the given criteria. Nevertheless, in these works, hydro-power was assessed by first establishing the position of the weir along the river. Thereafter, the position of the surge tank is conditioned when included in the model. This limits the potential sites to locations along the river and at sub-basin levels. This MSA method, on the other hand, allows multiple potential sites over the entire study area or any raster region (e.g., Figures 8 and 9).

### *3.5. RPH and R<sup>I</sup> Contributions*

To observe the effect of the *RPH* and *R<sup>I</sup>* parameters, Table 4 shows the mean value for the 243 sites with the highest hydro-power. These sites amount to only 0.01% of the total simulated data. The mean values ranged from 103 to 3432 kW. It is important to highlight that considering the mean of all values (100% of the total data) would lead to wrong conclusions, since there are plenty of locations with zero potential. This study was only concerned with the sites with the highest potentials. For the HP values in Table 4, a multiple linear regression model with an interaction effect was estimated as *P* ∼ *RPH* + *R<sup>I</sup>* + *RPH* × *R<sup>I</sup>*, with a coefficient of determination *R*<sup>2</sup> of 0.945. The coefficients of the predictors *RPH*, *R<sup>I</sup>* and *RPH* × *R<sup>I</sup>* were 0.355, 0.654, and 0.221 × 10<sup>−3</sup>, respectively. Table 5 shows a reasonable contribution of the interaction term and the penstock's length, whereas the strongest contribution to the hydro-power was due to the headrace's maximum length, at a significance level of 0.05. Namely, an increase of 100 m in *RPH* is associated with an increase of 35.456 + 0.0221 *R<sup>I</sup>* kW, while an increase of 100 m in *R<sup>I</sup>* is associated with an increase of 65.379 + 0.0221 *RPH* kW. The intercept term was not significant in the regression equation (*p* = 0.3625), since neither *RPH* nor *R<sup>I</sup>* can be set to zero, as such a combination makes no sense in a hydro-power generation system.
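The marginal effects quoted above follow directly from the interaction model. The sketch below is a minimal illustration using the rounded coefficients reported in the text (so results differ slightly from the 35.456 and 65.379 figures); the function names are hypothetical, and the intercept is omitted because its value is not given in the text.

```python
# Rounded regression coefficients reported in the text.
B_RPH = 0.355             # kW per metre of R_PH
B_RI = 0.654              # kW per metre of R_I
B_INTERACTION = 0.221e-3  # kW per square metre (R_PH x R_I term)


def gain_per_100m_rph(r_i_m: float) -> float:
    """Change in mean HP (kW) when R_PH grows by 100 m with R_I held at r_i_m."""
    return 100.0 * (B_RPH + B_INTERACTION * r_i_m)


def gain_per_100m_ri(r_ph_m: float) -> float:
    """Change in mean HP (kW) when R_I grows by 100 m with R_PH held at r_ph_m."""
    return 100.0 * (B_RI + B_INTERACTION * r_ph_m)


if __name__ == "__main__":
    # Example: effect of lengthening each conduit at (R_PH, R_I) = (1500, 2000) m.
    print(gain_per_100m_rph(2000.0))
    print(gain_per_100m_ri(1500.0))
```

Because of the interaction term, the gain from lengthening one conduit grows with the permissible length of the other.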

**Table 4.** Mean HP[kW] for the 243 maximum values.


**Table 5.** Regression Model for the Mean HP[kW] for the 243 maximum values.


Figure 10 describes the parametric space for *RPH* and *R<sup>I</sup>* through a contour plot for the hydro-power conditions shown in Table 4. The interpolation was performed using a cubic spline. A thorough analysis of the level plot supports the same conclusion as the one derived from the regression equation. Small increments in *R<sup>I</sup>* produce a significant increment in hydro-power, especially for high values of *RPH*. Besides, this description of the two variables involved in hydro-power production allows the observation of all possible system arrangements, which is an advantage derived from the use of raster information. Figure 11a plots the mean HP against the *R<sup>I</sup>* by fixing the *RPH* for different values. It shows a linear increment of the power as the *R<sup>I</sup>* length increases. Moreover, when *R<sup>I</sup>* takes values from 1500 to 2000 m, there is a significant increment in power for *RPH* values higher than 1000 m. The lines *RPH* = 2000 m and *RPH* = 2500 m show no significant difference for any *R<sup>I</sup>* value. On the other hand, Figure 11b plots mean HP against the *RPH* by fixing the *RI* for different values. There was an asymptotic trend for *RPH* values higher than 1500 observed for all *R<sup>I</sup>* values. This suggested *RPH* values within the range (1000, 1500) m behave as a threshold for this parameter, as no significant gains in power were obtained when increasing. The lines *R<sup>I</sup>* = 2000 m and *R<sup>I</sup>* = 2500 m show no significant difference for low *RPH* values.

**Figure 10.** Mean HP for the 243 maximum values.

**Figure 11.** Power as function of *R<sup>I</sup>* and *RPH* for mean of 243 maximum values. (**a**) Fixed *RPH* value (**b**) Fixed *R<sup>I</sup>* value.

Table 6 indicates the maximum hydro-power for each run. Maximum HP values range from 608 to 3752 kW for the (100, 100) m and (2000, 2500) m cases, respectively. This analysis represents the topmost hydro-power condition of each run, whereas the previous one included a representative selection of the maximum HP values (i.e., 0.01% of the entire raster database). A multiple linear regression model of the form *P* ∼ *RPH* + *R<sup>I</sup>* was fitted to explain the variability of the maximum hydro-power as a function of *RPH* and *R<sup>I</sup>*. The *R*<sup>2</sup> value was equal to 0.950, which indicates that the maximum HP depends approximately linearly on the penstock's and the headrace's lengths. The coefficients were 0.714 and 0.823 for *RPH* and *R<sup>I</sup>*, respectively (Table 7). Although both variables had a significant effect on the maximum HP, the headrace's length *R<sup>I</sup>* had a slightly stronger effect (*p* = 0.0000). Once again, the intercept term had an important effect on the maximum HP when considering a complete hydro-power generation system. However, it would be meaningless to consider its effect on HP by itself.




**Table 7.** Linear Regression Model for the maximum HP[*kW*] values.

Figure 12a shows a linear increment of the hydro-power as *R<sup>I</sup>* increases, and a significant increment when *R<sup>I</sup>* increases from 1500 to 2000 m for *RPH* values greater than 1000 m. Once again, the lines corresponding to *RPH* = 2000 m and *RPH* = 2500 m showed no significant difference for any *R<sup>I</sup>* value. Additionally, for small *R<sup>I</sup>* values within the range (100, 500) m, some hydro-power lines for different *RPH* values intersect each other, as is the case for (100, 500) m and (250, 500) m. There is no guarantee that increasing *RPH* or *R<sup>I</sup>* by a relatively small amount will increase the HP value. This is because *RPH* and *R<sup>I</sup>* stand for the maximum permissible penstock's and headrace's lengths, respectively, and in general the actual values of both lengths are lower than the permissible ones.

On the other hand, Figure 12b plots the maximum HP against *RPH* for different *R<sup>I</sup>* values. No significant increments were observed for *RPH* greater than 2000 m for any *R<sup>I</sup>* value. There are also cases where hydro-power lines intersect, or nearly intersect, each other, as is the case for (2500, 2000) m and (2500, 1500) m. Nevertheless, there are fewer such cases than in Figure 12a, since an increment in *R<sup>I</sup>* has a stronger effect on the HP.

The results indicated that, according to the MSA, the headrace's length, *R<sup>I</sup>*, plays an important role in HP assessment and has to be taken into account to design a more effective hydro-power generation system. However, hydro-power assessment models generally focus on the penstock and ignore the headrace contribution [17,22,53].

**Figure 12.** Power as function of *R<sup>I</sup>* and *RPH* for maximum HP values. (**a**) Fixed *RPH* value (**b**) Fixed *R<sup>I</sup>* value.

### *3.6. Considerations*

Although the raster representation associates a hydro-power potential with every position, this does not mean that all sites can be used at the same time, as there could be interference between potential RoR projects. In fact, each cell of the hydro-power map contains additional information, such as the locations of the powerhouse, the intake, and the surge tank, the headrace's and penstock's routes, the flow discharge, and the gross head. Therefore, when selecting a location for potential use, one must consider not only that single location but also the complete facilities and a segment of the river, which consequently restricts the selection of other potential locations. In addition to these considerations, working with a raster representation carries advantages for further data processing and analysis. In fact, implementations such as turbine selection or MCDM analysis become straightforward tasks [28,54]. For instance, Figure 13 presents a simple scheme of 5 non-overlapping projects for *RPH* = 1500 m and *R<sup>I</sup>* = 2000 m. The extent was divided into five equally sized horizontal rectangles from east to west. As a result of the approach used in this study, the highest hydro-power values were 492.61, 1355.98, 1408.47, 2000.90, and 3346.18 kW for sites located from top to bottom of the image, that is, in the downstream direction. The highest potential location among all the trials is shown in black. More complex optimization procedures could be implemented [22,55], as the MSA procedure was designed for further applications and research. That being said, the MSA methodology can be improved or extended without compromising it. An important improvement would be modeling more complex routes for the headrace and the penstock by incorporating variables such as land use, watershed geomorphology, head losses, and economic costs. It would also be possible to integrate the different scenarios for the parameters *RPH* and *R<sup>I</sup>* in a single raster, in order to provide enough information and make it open and accessible to the community.
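The band-wise selection described above can be sketched in a few lines. This is only a simplified illustration under stated assumptions: the raster is a hypothetical list of rows (north at the top), the function name is invented here, and the sketch does not check whether the facilities of neighbouring projects overlap, which a real selection would require.

```python
def best_site_per_band(grid, n_bands=5):
    """Return (value, row, col) of the highest cell in each horizontal band.

    grid    : list of equally long rows of hydro-power values (kW), top row first.
    n_bands : number of equally sized horizontal rectangles (rows are split
              as evenly as integer division allows).
    """
    n_rows = len(grid)
    if n_rows < n_bands:
        raise ValueError("grid needs at least one row per band")
    results = []
    for band in range(n_bands):
        start = band * n_rows // n_bands
        stop = (band + 1) * n_rows // n_bands
        # Tuples compare by value first, so max() picks the highest potential.
        best = max(
            (grid[r][c], r, c)
            for r in range(start, stop)
            for c in range(len(grid[r]))
        )
        results.append(best)
    return results


if __name__ == "__main__":
    # Hypothetical 10 x 3 raster whose values grow downstream (towards the bottom).
    demo = [[float(r * 3 + c) for c in range(3)] for r in range(10)]
    for value, row, col in best_site_per_band(demo):
        print(f"max {value:.1f} kW at row {row}, col {col}")
```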

**Figure 13.** Hydropower map for *RPH* = 1500 m and *R<sup>I</sup>* = 2000 m, presenting 5 non-overlapping projects.

### **4. Conclusions**

The present work developed the MSA methodology to obtain hydro-power potential maps distributed on a rectangular grid for a run-of-river scheme. The study area was the Huazuntlan River Watershed in the Tuxtlas Mountains, Veracruz, Mexico. The proposed method calculates the hydro-power at each location of the map where the position of the surge tank is established. Then, the positions of the powerhouse and the intake are determined, in that order, by using the topographic and environmental factors, along with the SWAT model to obtain the watershed hydrologic response.

Hydro-power maps were generated by controlling the penstock's and headrace's maximum permissible lengths (*RPH*, *RI*). The regression model showed that the parameter *R<sup>I</sup>* had the major effect on the hydro-power estimation, followed by *RPH* and, lastly, the interaction effect with a fair contribution. In a more realistic condition, where hydraulic losses are considered, the impact of *R<sup>I</sup>* on the hydro-power generation would be even greater than that of *RPH*. The selection of *RPH* = 1500 m and *R<sup>I</sup>* = 2000 m yielded an *HP* of 3350 kW. This scenario behaves as a threshold condition, since no significant gains were obtained after increasing either the penstock's or the headrace's maximum length.

Overall, the maps presented a non-uniform hydro-power distribution with potential sites located in the downstream, middle, and headwater areas of the watershed. Moderate hydro-power values are mostly at the junction of the river's main stem and tributaries and in other downstream areas. This makes the MSA procedure suitable for planning a sequence of RoR projects along the different sites of the watershed.

The SWAT model used here provided an effective tool to estimate streamflow conditions for the different annual seasons in the study. The model shows an overall efficiency of 0.67, which is considered a good performance for various hydrological models (i.e., 0.65 < NSE < 0.75). However, the MSA method does not limit the use of other hydrological models and is flexible to higher time increments.

Although the MSA method assumed the penstock's and headrace's routes to be straight lines, and only gross hydro-power was calculated with no consideration of losses and ecological flows, the base methodology can be easily enriched. For instance, the straight-line assumption for the headrace and the penstock could be relaxed, and the model could incorporate more variables such as land use, watershed geomorphology, head losses, and economic costs. Additionally, improvements can address the temporal and spatial resolution of the hydrological modeling and the implementation of MCDM analysis.

The central idea of the developed MSA methodology is relatively simple to apply and, therefore, replicable. The approach involves elementary geoprocesses, such as searching for extreme values in raster maps, intersecting different geometries, and calculating distances, among others. However, these basic calculations can take seconds to minutes each, translating into long computation times for large study areas. For this, one needs high-performance computing (HPC) equipment, or one must reduce the number of pixels by choosing smaller areas or a lower resolution and perform the calculations in parallel. Furthermore, the results found through this method will be sensitive to how the system is modeled. Namely, the MSA is sensitive to the definition of the headrace and penstock routes, the criteria for locating the turbines and the intake, and the hydrological model used.

**Author Contributions:** conceptualization, G.A., L.F.G.-N., Q.H.-E., J.J.M.-C., and J.D.R.-A.; methodology, G.A., L.F.G.-N., Q.H.-E., J.J.M.-C., and J.D.R.-A.; formal analysis, G.A., L.F.G.-N., Q.H.-E., J.J.M.-C., and J.D.R.-A.; investigation, G.A., L.F.G.-N., Q.H.-E., J.J.M.-C., and J.D.R.-A.; resources, G.A., L.F.G.-N., Q.H.-E., J.J.M.-C., and J.D.R.-A.; writing-original draft preparation, G.A., L.F.G.-N., Q.H.-E., J.J.M.-C., and J.D.R.-A. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** We used FAO and CONAGUA open source databases.

**Acknowledgments:** Authors acknowledge the support from the Center for Research in Mathematics (CIMAT) and "Laboratorio de Supercómputo del Bajio" which was funded by CONACyT through the project 300832, as well as the support provided by the HPC center Advanced Research Computing at Cardiff (ARCCA), Cardiff University, for the completion of this work.

**Conflicts of Interest:** The authors declare no conflicts of interest.


### *Article* **Wind Climate and Wind Power Resource Assessment Based on Gridded Scatterometer Data: A Thracian Sea Case Study**

**Nikolaos Kokkos <sup>1</sup> , Maria Zoidou <sup>1</sup> , Konstantinos Zachopoulos <sup>1</sup> , Meysam Majidi Nezhad <sup>2</sup> , Davide Astiaso Garcia <sup>3</sup> and Georgios Sylaios 1,\***


**Abstract:** The present analysis utilized the 6-hourly data of wind speed (zonal and meridional) for the period between 2011 and 2019, as retrieved from the Copernicus Marine Environmental Service (CMEMS), covering the Thracian Sea (the northern part of the Aegean Sea). Data were estimated from the global wind fields derived from the Advanced Scatterometer (ASCAT) L2b scatterometer on-board Meteorological Operational (METOP) satellites, and then processed towards the equivalent neutral-stability 10 m winds with a spatial resolution of 0.25° × 0.25°. The analysis involved: (a) descriptive statistics on wind speed and direction data; (b) frequency distributions of daily-mean wind speeds per wind direction sector; (c) total wind energy content assessment per wind speed increment and per sector; (d) total annual wind energy production (in MWh/yr); and (e) wind power density, probability density function, and Weibull wind speed distribution, together with the relevant dimensionless shape and scale parameters. Our results show that the Lemnos Plateau has the highest total wind energy content (4455 kWh/m<sup>2</sup>/yr). At the same time, the area to the SW of the Dardanelles exhibits the highest wind energy capacity factor (~37.44%), producing 7546 MWh/yr. This indicates that this zone could harvest wind energy through wind turbines, having an efficiency in energy production of 37%. Lower capacity factors of 24–28% were computed at the nearshore Thracian Sea zone, producing between 3000 and 5600 MWh/yr.

**Keywords:** marine renewable energy; wind climate; wind power assessment; wind energy capacity factor; scatterometer; Thracian Sea

### **1. Introduction**

Through the European Green Deal, the European Union (EU) has set a target to reach total decarbonization and achieve energy efficiency for its members by the year 2050 [1]. To achieve this ambitious goal, the power production sector would follow the Clean Energy Transition pathway, with renewable energy sources at the epicenter of such conversion. In this gradually changing energy mix, the offshore wind industry is expected to play a significant role, experiencing a considerable increase in the coming decades [2,3]. The EU plans to install at least 240 gigawatts (GW) of offshore wind power capacity across all European seas by 2050 [4]. Current developments illustrate the exponential growth in offshore wind installations (e.g., offshore wind grew from 1% of annual capacity additions in global wind installations in 2009 to over 10% in 2019) [3].

**Citation:** Kokkos, N.; Zoidou, M.; Zachopoulos, K.; Nezhad, M.M.; Garcia, D.A.; Sylaios, G. Wind Climate and Wind Power Resource Assessment Based on Gridded Scatterometer Data: A Thracian Sea Case Study. *Energies* **2021**, *14*, 3448. https://doi.org/10.3390/en14123448

Academic Editor: Adrian Ilinca

Received: 18 May 2021; Accepted: 8 June 2021; Published: 10 June 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Technological progress, recent developments in floating technologies, and significant cost reductions, in conjunction with local, low level, and controllable environmental impacts, appear to be the main factors driving the transformation of offshore wind energy into a safe and commercially viable form of clean power generation [5]. In any case, the total offshore installations reached 29.1 GW by the end of 2019, representing only 5% of total global wind capacity and generating barely 0.3% of global electricity production. In the EU, approximately 10 million households are now being served by offshore wind energy. In the U.S., the first commercial Offshore Wind Farm (OWF) started its operation in December 2016. However, up to the present date development activity remains impressively high, and sixteen active commercial leases for offshore wind development have been procured [6]. In Southeast Asia, countries such as China, Japan, and Taiwan lead the market, with China surpassing 1 GW in annual offshore wind installation [3].

The above indicates the enormous potential for offshore wind capacity growth. On this account, a large number of new OWFs will be designed, installed, and brought into operation, especially in Europe, where the European Commission (EC) forecasts that total offshore wind installations will range between 240 and 450 GW by 2050 [4].

Although all OWFs are concentrated in the North and the Irish Seas, a clear tendency from the private sector to harvest the Mediterranean's wind power potential has also been observed. A 30 MW wind farm comprised of 10 monopole wind turbines is expected to be installed in the Apulia region, southern Italy, as the first Mediterranean offshore wind project to be implemented. Even though 1 GW of offshore wind power is equivalent to emissions of 3.5 MT CO<sub>2</sub> (carbon dioxide), several technological, administrative, legislative, environmental, socio-economic, and financial barriers exist in the development of OWF projects. Such barriers have been summarized by Soukissian et al. [7]. The Geographic Information System (GIS) mapping of offshore marine and maritime uses could assist the selection of proper locations for and placements of turbines [8].

The most crucial suitability selection criterion for wind farm siting (i.e., the wind resource availability [9]), in conjunction with the presence of a wide continental shelf ensuring relatively shallow depths and an appropriate distance from shore [10], could be met over the Thracian Sea in the Northern Aegean. Several investigators have assessed the wind power potential in the broader area, especially in Çanakkale [5] and Bozcaada [11], the Samothraki Island [12], and the whole Aegean Sea [13]. Most studies have utilized data from meteorological stations [5,11]. Bagiorgas et al. [13] used wind data from offshore buoys. Soukissian et al. [7] downscaled the European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis data, using a high-resolution meteorological model (15 year period, 0.10° × 0.10°) validated by offshore buoy data, while Majidi Nezhad et al. [12] utilized the ERA–Interim reanalysis dataset (40 year period, mean monthly data).

In this work, the gridded 6-hourly wind data collected from ASCAT L2b scatterometer on-board METOP satellites, combined with the ECMWF ERA–Interim atmospheric reanalysis, as provided by the Copernicus Marine Environmental Monitoring Service (CMEMS), were used to assess the offshore wind power potential over the whole Thracian Sea and the Lemnos Plateau. This is an area of significant interest for wind offshore energy development, especially along the NNE–SSW axis following the wind exiting from the Dardanelles Straits [14,15].

Scatterometer data have been widely used in the literature for large-scale wind resource assessments, filling the gap left by the absence of offshore meteorological stations while providing continuous, systematic, long-term, and relatively accurate wind data. However, data reliability suffers from low pixel resolution, together with errors related to sensor malfunctioning, the wind retrieval algorithm, rain contamination, land contamination, etc. [16]. Several global and regional wind resource assessment studies based on scatterometer data exist, mostly using QuikSCAT (Pimenta et al. [17] for offshore SE Brazil; Mostafaeipour [18] for the Persian Gulf and Gulf of Oman; Karamanis et al. [19] for the Ionian Sea; and Furevik et al. [20] for the whole Mediterranean Sea). To minimize the errors induced by the above factors, recent studies have explored offshore wind resources utilizing multiplatform datasets such as QuikSCAT, RapidScat, METOP-A and METOP-B, OCEANSAT-2, and others [21].

### **2. Materials and Methods**

### *2.1. Study Area Description*

The study area covers the Thracian Sea, the northern part of the Aegean Sea (from 39.7° N to 40.9° N and 23.7° E to 26.3° E), an area with complex bathymetry and hydrography (Figure 1). This area is characterized by abrupt topographic and bathymetric changes due to the extensive width of the continental shelf (40 km) and the North Aegean Trough, a NE–SW oriented deep trench separating the Thracian Sea shelf from Lemnos Plateau [22]. Coastal morphology consists of semi-enclosed gulfs, such as Saros, Alexandroupolis, Kavala, and Strymonikos, located along the northern coast.

**Figure 1.** Study area map and CMEMS grid discretization.

This area is influenced by the outflow of the Black Sea Water, exiting from the Dardanelles Straits and the prevailing N–NE wind circulation, known as the Etesians [23]. The present analysis divided the study area into six main sub-areas based on their physiographic and meteorologic characteristics: the western Thracian Sea (stations 7–9, 17–19), central Thracian Sea (10–13, 20–23), eastern Thracian Sea (14–16, 24–29), the Lemnos Plateau (43–46, 52–54), the Dardanelles' zone of influence (47–49, 55, 56), and the Siggitikos Gulf–Mt Athos (30–33, 41).

### *2.2. Wind Data Description*

The 6-hourly data of wind speed (eastings and northings), measured 10 m above sea level with a spatial resolution of 0.25° × 0.25°, were retrieved from the Copernicus Marine Environmental Monitoring Service (CMEMS). The data product used was encoded as WIND\_GLO\_WIND\_L4\_REP\_OBSERVATIONS\_012\_006 (http://marine.copernicus.eu/documents/PUM/CMEMS-WIND-PUM-012-006.pdf, accessed on 26 April 2021), referring to a set of time-series comprised of level 4 reprocessed hindcasted wind observations, assimilated on a global ocean model. Data were estimated from the global wind fields derived from ASCAT scatterometers on-board the METOP-A and METOP-B satellites, combined with ECMWF ERA–Interim atmospheric reanalysis.

The dataset consists of six meteorological variables: wind speed, zonal and meridional wind components, wind stress amplitudes, and the associated components. The present analysis covered the period from January 2011 to December 2019. The resulting fields were estimated on a daily and monthly basis, as equivalent neutral-stability 10 m winds having spatial resolutions of 0.25° in longitude and latitude over the study area (Figure 1).

In total, 56 grid points were analyzed, while in situ daily-mean wind data were retrieved for the above defined period from the World Meteorological Organization (WMO) stations located at the Lemnos Airport and the Chrisoupolis Airport (Hellenic Meteorological Service, Figure 1). These data were used to assess the consistency of the CMEMS remotely sensed wind dataset in the study area.

### *2.3. Qualitative Wind Data Assessment*

A set of statistical parameters was used to test the quality of the CMEMS scatterometer datasets. These include the difference between temporal means (defined as *bias*), the Root Mean Square Difference (*RMSD*) and the standard deviation of the differences (*STD*) between the in situ data (considered as ground truth) and the satellite data products, the scalar correlation coefficient (*ρ*), and the regression coefficient slope (*b<sub>S</sub>*). A similar analysis was also performed by Bentamy et al. [24] between CMEMS and offshore wind data from buoys in the California, Canary, and Benguela zones. These statistical measures are estimated as:

$$Bias \, = \, \overline{X - Y} \, \tag{1}$$

$$RMSD = \sqrt{\overline{\left(X - Y\right)^2}}\tag{2}$$

$$STD = \sqrt{\overline{\left(X - Y - \overline{X - Y}\right)^2}} \tag{3}$$

$$\rho = \frac{\overline{\left(X - \overline{X}\right)\left(Y - \overline{Y}\right)}}{STD(X)\,STD(Y)} \tag{4}$$

$$b_S = \sqrt{\frac{\overline{Y^2}}{\overline{X^2}}} \tag{5}$$

where *X* is the wind speed measured by the meteorological station and *Y* the CMEMS wind speed.
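The measures in Equations (1)–(5) can be computed from paired series in a few lines. The following is a minimal sketch, not the code used in the study: the function name and the returned dictionary keys are illustrative, overbars are taken as arithmetic means over the paired samples, and *STD*(*X*) and *STD*(*Y*) in the correlation are assumed to be the population standard deviations of each series.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def comparison_stats(x, y):
    """Agreement statistics between in situ speeds x and satellite speeds y."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    diff = [a - b for a, b in zip(x, y)]
    bias = _mean(diff)                                            # Eq. (1)
    rmsd = math.sqrt(_mean([d * d for d in diff]))                # Eq. (2)
    std_diff = math.sqrt(_mean([(d - bias) ** 2 for d in diff]))  # Eq. (3)
    mx, my = _mean(x), _mean(y)
    sx = math.sqrt(_mean([(a - mx) ** 2 for a in x]))
    sy = math.sqrt(_mean([(b - my) ** 2 for b in y]))
    # Eq. (4); undefined (division by zero) if either series is constant.
    rho = _mean([(a - mx) * (b - my) for a, b in zip(x, y)]) / (sx * sy)
    b_s = math.sqrt(_mean([b * b for b in y]) / _mean([a * a for a in x]))  # Eq. (5)
    return {"bias": bias, "rmsd": rmsd, "std": std_diff, "rho": rho, "b_s": b_s}


if __name__ == "__main__":
    # Made-up numbers, only to show the call.
    station = [4.1, 6.3, 5.0, 8.2, 7.4]
    satellite = [4.6, 6.0, 5.9, 8.8, 7.1]
    print(comparison_stats(station, satellite))
```

With these definitions the three difference measures are linked by RMSD² = bias² + STD², which is a convenient consistency check.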

### *2.4. Preliminary Data Processing*

The 6-hourly wind data from 56 data points, located at the center of the CMEMS grid cells (covering the whole Thracian Sea), were retrieved in the form of u- and v-wind speed time-series (in m/s) from 1 January 2011 00:00 until 31 December 2019 21:00 (in total, 13,148 values per point). The power law was used to estimate the wind speed at wind turbine hub height (93 m) from the 10 m wind speed, as:

$$U_{hub} = U_{10} \left(\frac{Z_{hub}}{Z_{10}}\right)^{\alpha} \tag{6}$$

where *U<sub>hub</sub>* is the wind speed at the hub height of the wind turbine (m/s), *U*<sub>10</sub> is the CMEMS scatterometer data at 10 m above sea level (m/s), *Z<sub>hub</sub>* is the hub height of the wind turbine (m), *Z*<sub>10</sub> refers to 10 m above sea level, and *α* = 0.123 (as in Bagiorgas et al. [13]).
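The preprocessing of a single record can be sketched as follows: the u and v components are combined into a speed, and Equation (6) extrapolates it to hub height. This is an illustrative sketch only; the function names are hypothetical, and the direction function assumes the usual meteorological convention (degrees clockwise from north, indicating where the wind blows from), which the text does not state explicitly.

```python
import math

ALPHA = 0.123    # power-law exponent used in the text
Z_REF_M = 10.0   # reference height of the scatterometer winds (m)
Z_HUB_M = 93.0   # turbine hub height (m)


def wind_speed(u: float, v: float) -> float:
    """Scalar wind speed (m/s) from zonal (u) and meridional (v) components."""
    return math.hypot(u, v)


def wind_direction_from(u: float, v: float) -> float:
    """Direction the wind blows FROM, degrees clockwise from north (assumed convention)."""
    return math.degrees(math.atan2(-u, -v)) % 360.0


def speed_at_hub(speed_10m: float, z_hub: float = Z_HUB_M,
                 z_ref: float = Z_REF_M, alpha: float = ALPHA) -> float:
    """Power-law extrapolation of the 10 m wind speed to hub height, Eq. (6)."""
    return speed_10m * (z_hub / z_ref) ** alpha


if __name__ == "__main__":
    u10, v10 = -3.0, -4.0  # made-up components: wind blowing towards the SW
    s10 = wind_speed(u10, v10)
    print(s10, wind_direction_from(u10, v10), speed_at_hub(s10))
```

With *α* = 0.123 and a 93 m hub, the extrapolation scales the 10 m speed by a constant factor of roughly 1.3, so the direction statistics are unaffected.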

Using these wind data profiles, the mean daily and monthly values of wind speed and direction at the hub height were produced for each examined grid point. Descriptive statistical parameters on wind speed and direction data were computed as the dataset minimum, first quartile (Q1), median, mean, third quartile (Q3), and maximum values. Frequency distributions of daily-mean wind speeds were computed per wind directional sector, and relevant tables were produced. Based on these results, wind roses were developed indicating the frequency variability per wind speed increment and per wind direction sector. Mean-monthly wind speeds were computed on a year-to-year basis, and boxplots were produced.

### *2.5. Weibull Probability Function*

Several probability density functions are available in the literature to be fitted on the distributions representing the wind speed frequency curve per directional sector for the prediction of randomly distributed wind speed data [25]. The Weibull probability density function depicts an acceptable accuracy level in numerous wind power studies worldwide, expressed mathematically as:

$$f(\mathcal{W}\_b) = \frac{k}{A} \left(\frac{\mathcal{W}\_b}{A}\right)^{k-1} e^{-\left(\frac{\mathcal{W}\_b}{A}\right)^k} \tag{7}$$

where *f*(*W*) is the frequency of occurrence of wind speed *W*, *A* is the scale parameter (measure for the wind speed), and *k* is the shape parameter (description of the shape of the distribution) per directional bin. The Weibull distribution parameters were estimated by:

$$k = \left[ \frac{\sum\_{b=1}^{n} \mathcal{W}\_b^k \ln(\mathcal{W}\_b) f(\mathcal{W}\_b)}{\sum\_{b=1}^{n} \mathcal{W}\_b^k f(\mathcal{W}\_b)} - \frac{\sum\_{b=1}^{n} \ln(\mathcal{W}\_b) f(\mathcal{W}\_b)}{f(\mathcal{W}\_b \ge 0)} \right]^{-1} \tag{8}$$

$$A = \left(\frac{1}{f(\mathcal{W}\_b \ge 0)} \sum\_{b=1}^n \mathcal{W}\_b^k f(\mathcal{W}\_b)\right)^{1/k} \tag{9}$$

where *W<sub>b</sub>* is the mean wind speed per directional bin *b*, *n* is the number of bins, *f*(*W<sub>b</sub>*) is the frequency for wind speed ranging within bin *b*, and *f*(*W<sub>b</sub>* ≥ 0) is the probability for wind speed equal to or exceeding zero. To estimate the Weibull distribution parameters *k* and *A*, an analysis was performed in the R programming language (fitdistrplus package [26]) using the maximum likelihood estimation method per directional bin.
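Equation (8) is implicit in *k*, so it has to be solved iteratively before Equation (9) yields *A*. The sketch below uses a damped fixed-point iteration in plain Python. It is not the fitdistrplus routine used in the study: the function name `fit_weibull`, the starting value, the damping factor of 0.5, and the stopping tolerance are all assumptions made for illustration.

```python
import math


def fit_weibull(speeds, freqs, k0=2.0, tol=1e-8, max_iter=200):
    """Estimate the Weibull shape k and scale A from Equations (8) and (9).

    speeds: positive wind speeds W_b; freqs: their frequencies f(W_b).
    """
    if len(speeds) != len(freqs) or not speeds:
        raise ValueError("speeds and freqs must be non-empty and of equal length")
    if any(w <= 0 for w in speeds):
        raise ValueError("wind speeds must be positive (ln(W) is required)")
    total = sum(freqs)  # plays the role of f(W_b >= 0)
    mean_log = sum(f * math.log(w) for w, f in zip(speeds, freqs)) / total
    k = k0
    for _ in range(max_iter):
        s0 = sum(f * w ** k for w, f in zip(speeds, freqs))
        s1 = sum(f * w ** k * math.log(w) for w, f in zip(speeds, freqs))
        k_new = 1.0 / (s1 / s0 - mean_log)        # right-hand side of Eq. (8)
        k_next = 0.5 * (k + k_new)                # damping (assumption)
        converged = abs(k_next - k) < tol
        k = k_next
        if converged:
            break
    scale = (sum(f * w ** k for w, f in zip(speeds, freqs)) / total) ** (1.0 / k)  # Eq. (9)
    return k, scale
```

Passing individual observations with equal weights reduces the scheme to the usual maximum likelihood equations for ungrouped data, which is a convenient way to test it against synthetic Weibull samples.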

### *2.6. Wind Energy Content and Power Density*

Using the estimated Weibull probability density function, the total wind energy content per directional bin was computed. The total wind energy content (in kWh/m<sup>2</sup>/yr) can be understood as the theoretical energy potential of a particular site. Therefore, it is a useful metric for the resource assessment of an area and for comparative purposes among areas, being independent of the characteristics of the wind turbine. The available wind energy content per wind speed increment and wind direction at each gridded point of the Thracian Sea was assessed using the R-package bReeze, by:

$$E(\mathcal{W}) = \frac{1}{2} \rho\_{air} H \sum\_{b=1}^{n} \mathcal{W}\_b^3 f(\mathcal{W}\_b) \tag{10}$$

where *ρ<sub>air</sub>* is the density of air at sea level under a mean temperature of 15 °C and a pressure of one atmosphere (=1.225 kg/m<sup>3</sup>), *n* is the total number of directional bins (=16), *H* is the number of hours of the desired period (=8760 per year), *W<sub>b</sub>* is the wind speed per directional bin, and *f*(*W<sub>b</sub>*) is the probability of that bin, estimated by the Weibull distribution described in Equation (7) [6].

Wind power density is an important factor when assessing the wind potential of a location. It designates the available amount of energy per unit of time and swept area of the blades at the selected location. It is this amount of energy that will be converted to electricity by the wind turbine. The estimation of wind power density per directional bin is achieved by fitting the Weibull distribution to the respective dataset, expressed mathematically as:

$$P(\mathcal{W}) = \frac{1}{2} \rho\_{air} \sum\_{b=1}^{n} \mathcal{W}\_b^3 f(\mathcal{W}\_b) \tag{11}$$
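Equations (10) and (11) differ only by the factor *H*, so the energy content follows directly from the power density. The sketch below illustrates this with hypothetical function names. It assumes the standard sea-level air density of 1.225 kg/m<sup>3</sup> and divides by 1000 to express the result in kWh/m<sup>2</sup>/yr, a unit conversion that the equations leave implicit.

```python
RHO_AIR = 1.225          # kg/m^3, standard sea-level air density at 15 degrees C
HOURS_PER_YEAR = 8760


def power_density(speeds, probs, rho=RHO_AIR):
    """Wind power density in W/m^2, Eq. (11): 0.5 * rho * sum(W_b^3 * f(W_b))."""
    if len(speeds) != len(probs):
        raise ValueError("speeds and probs must have equal length")
    return 0.5 * rho * sum(p * w ** 3 for w, p in zip(speeds, probs))


def energy_content(speeds, probs, hours=HOURS_PER_YEAR, rho=RHO_AIR):
    """Wind energy content in kWh/m^2 over `hours`, Eq. (10)."""
    return power_density(speeds, probs, rho) * hours / 1000.0  # Wh -> kWh


# Hypothetical two-bin example: half of the time at 5 m/s, half at 10 m/s.
p = power_density([5.0, 10.0], [0.5, 0.5])
e = energy_content([5.0, 10.0], [0.5, 0.5])
print(f"{p:.1f} W/m^2, {e:.0f} kWh/m^2/yr")
```

Because of the cubic dependence on wind speed, the 10 m/s bin in this example contributes eight times more than the 5 m/s bin, even though both occur equally often.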

### *2.7. Annual Wind Energy Production*

The estimation of the annual wind energy production is as follows:

$$AEP = A\_{turb} \frac{\rho\_{air}}{\rho\_{pc}} H \sum\_{b=1}^{n} f(\mathcal{W}\_b) P(\mathcal{W}\_b) \tag{12}$$

where *A<sub>turb</sub>* is the average availability of the turbine, *ρ<sub>air</sub>* is the density of air (=1.225 kg/m<sup>3</sup>), *ρ<sub>pc</sub>* is the specific air density for power curve definition, *f*(*W<sub>b</sub>*) is the probability of the wind speed bin *W<sub>b</sub>*, estimated by the Weibull distribution, and *P*(*W<sub>b</sub>*) is the power output for that wind speed bin. Finally, *H* is the number of operational hours (=8760 h).

The Capacity Factor (*CF*) represents the productive suitability of the wind turbine, i.e., it is an indicator of the field performance of the turbine. It is the ratio between the average output power (*P<sub>out</sub>*) of the wind turbine, represented by the *AEP*, and the theoretical maximum power output on an annual basis:

$$CF = \frac{AEP}{P\_{th}H} \tag{13}$$

where *P<sub>th</sub>* is the wind turbine's theoretical power, defined as being proportional to the wind speed cubed for wind speeds lower than the rated wind speed and equal to the turbine rated power for higher wind speeds. In this work, the annual energy production and the capacity factor were assessed based on a Siemens SWT 2.3 MW wind turbine with a hub height of 93 m. This turbine was selected as a potential monopile system to be deployed at an offshore wind farm in NE Lemnos. The power curve for this turbine (consisting of wind speed and power pairs), starting at the cut-in wind speed of the turbine and ending with the cut-out wind speed, is shown in Figure 2.
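Equations (12) and (13) can be sketched as below. The function names and the three-bin power curve in the example are hypothetical and do not reproduce the Siemens SWT 2.3 MW curve. As a simplifying assumption, *P<sub>th</sub>* is taken as the turbine rated power when computing the capacity factor.

```python
def annual_energy_production(probs, powers_kw, availability=1.0,
                             rho_air=1.225, rho_pc=1.225, hours=8760):
    """AEP in MWh/yr, Eq. (12), from bin probabilities and power-curve outputs (kW)."""
    if len(probs) != len(powers_kw):
        raise ValueError("probs and powers_kw must have equal length")
    mean_power_kw = sum(f * p for f, p in zip(probs, powers_kw))
    return availability * (rho_air / rho_pc) * hours * mean_power_kw / 1000.0


def capacity_factor(aep_mwh, rated_power_mw, hours=8760):
    """CF, Eq. (13), with P_th taken as the rated power (assumption)."""
    return aep_mwh / (rated_power_mw * hours)


# Hypothetical three-bin example: below cut-in, partial load, rated power.
aep = annual_energy_production([0.2, 0.5, 0.3], [0.0, 1000.0, 2300.0])
cf = capacity_factor(aep, rated_power_mw=2.3)
print(f"AEP = {aep:.1f} MWh/yr, CF = {cf:.3f}")
```

Under this assumption, an *AEP* of about 7500 MWh/yr for a 2.3 MW turbine corresponds to a capacity factor of roughly 0.37, since the denominator is 2.3 MW × 8760 h = 20,148 MWh.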

**Figure 2.** Wind turbine power curve.

### **3. Results**

### *3.1. Assessment in Satellite Wind Analysis Accuracy*

The intercomparison of the satellite-derived wind data products against ground-truth data collected from meteorological stations led to the assessment of regional accuracy in the satellite wind analysis. Unfortunately, offshore buoy data were not available. Thus, comparisons were made on a daily-mean basis against low-altitude, land-based stations in proximity to the shore. Figure 3a,b illustrate the scatter and fitted line plots between the 10 m wind speed retrieved from CMEMS (grid points 45 and 2) and the wind data collected from the Lemnos and Chrisoupolis meteorological stations, respectively.

**Figure 3.** Density plots of CMEMS wind speed data against wind data from on-site stations in (**a**) Chrisoupolis Airport and (**b**) Lemnos Airport. The dashed line represents the perfect match line, the red line the linear regression model fitted on the scattered data, and the light red area the 95% confidence interval.

These figures illustrate the rather good correlation with a slight overestimation of CMEMS wind speed data at the open Thracian Sea area (Lemnos: *n* = 3287; *bias*: −1.35; *RMSD* = 2.43; *STD* = 2.02; *ρ* = 0.76; *b<sub>S</sub>* = 1.31), and a moderate overestimation at the Thracian coastal zone (Chrisoupolis: *n* = 1825; *bias*: −1.25; *RMSD* = 2.33; *STD* = 1.97; *ρ* = 0.50; *b<sub>S</sub>* = 1.59), in relation to the in situ meteorological datasets. In Lemnos, agreement is higher at high wind speeds (15–20 m/s and >20 m/s, *bias*: −1.03; *RMSD* = 1.76; *STD* = 1.37; *ρ* = 0.78; *b<sub>S</sub>* = 1.02). Regression equations for both areas were defined, as:

$$\text{CMEMS scatterometer data} = 1.011 \times \text{Meteorological station data} + 1.230 \tag{14}$$

for the Chrisoupolis Airport, and

$$\text{CMEMS scatterometer data} = 0.973 \times \text{Meteorological station data} + 1.463 \tag{15}$$

for the Lemnos Airport.

Errors and biases are attributed to the coarse resolution of the data product and reflect the tendency of satellite-derived ASCAT data to overestimate offshore winds [27]. Comparable findings were also reported by Alvarez et al. [28], showing that similar data products, such as the QuikSCAT, CCMP, and CFSR datasets, overestimated the wind (especially at high wind speeds >4 m/s).

### *3.2. Descriptive Wind Statistics per Sub-Area*

To analyze the wind data at hub level (93 m) and to provide analytical descriptive statistics, data from grid points were spatially aggregated according to the main physiographic units of the study area. Table 1 presents the summary values for these sub-areas. Results indicate that along the Thracian Sea continental shelf, a gradient in wind speed values exists with higher mean, median, and quartile values being exhibited towards the Eastern Thracian Sea. Furthermore, the highest offshore wind statistical parameters are shown in the Lemnos Plateau and the Dardanelles area. However, the maximum wind speed is lower than that in the West Thracian Sea. In all areas, data are positively skewed. Data are highly skewed in the west and central Thracian Sea and in Mt Athos (skewness >+5), characterized by increased maximum speeds under extreme events. Leptokurtic curves prevail over the Thracian Sea and Mt Athos area (kurtosis ~1.3), and mesokurtic curves prevail at the Lemnos Plateau and the Dardanelles.


**Table 1.** Descriptive wind statistics (in m/s) at hub height (93 m), per study sub-area.

An indicative time-series diagram illustrating the 6-hourly wind speed variability in the Lemnos Plateau (grid point 46) at the hub height is shown in Figure 4. Winds under extreme stormy conditions exceed the limit of 20 m/s, originating mainly from the Dardanelles and affecting the northern part of the Aegean Sea. Data exhibit seasonality showing higher winter values, with regular incidents exceeding 20 m/s. Mean monthly values indicate that the seasonal component oscillates with an amplitude of 6 m/s, and reveals a slightly upward trend (~0.008 m/s) over the years examined.

**Figure 4.** 6-hourly time series (blue line) and mean-monthly time-series (red line) of wind speed at hub height in the Lemnos Plateau (grid point 46).

The wind speed exhibits intra-annual variability, with higher values in winter (especially in February) and significantly lower values in spring and summer (from April to July). A representative boxplot diagram of monthly-mean wind speed values at the hub level (93 m) at point 46 (the Lemnos Plateau) is shown in Figure 5.

**Figure 5.** Boxplots for monthly wind speed values at hub height in the Lemnos Plateau (point 46).

The spatial variability of frequency distributions in daily-mean wind speeds, per wind directional sector, is shown in Figure 6. It is apparent that NE winds prevail in the study area, followed by ENE at the nearshore parts of the Thracian Sea and Mt Athos, and by NNE winds at the offshore Thracian Sea, the Lemnos Plateau, and the Dardanelles. Wind speeds and frequencies per directional bin are more dispersed in the West and Central Thracian Sea and Mt Athos area, with mean wind speeds of 5.6 m/s, 6.0 m/s, and 7.5 m/s (~30%, 36%, and 35%) from the NE and ENE directions, respectively.

**Figure 6.** Wind frequency roses at hub height over the study area.

Eastwards and offshore, wind speeds are significantly higher, of higher frequency, and appear confined along the NE direction, as in point 46 (the Lemnos Plateau) which has a mean NE wind speed of 9.5 m/s and 33% frequency of occurrence. This is attributed to the impact of orographic effects on the cyclonic synoptic circulation of surface wind field over the Black Sea and the funneling effect along the Turkish Straits. In parallel, these offshore points illustrate the influence of moderately strong S winds (~7.5 m/s, 8%).

### *3.3. Spatial Variability in Weibull Fitting Function Parameters*

To achieve a clear view of the available wind potential of an area, we may not rely only on the description of the instantaneous and mean wind speeds. The statistical parameters *k* and *A* of the fitted Weibull probability density function will provide a better understanding of wind dynamics. The probability of occurrence, and therefore the fraction of time for each wind speed range per directional sector, prevailing in the study area may be derived through this function. Table 2 presents the annual variation in Weibull parameters per directional bin for all study area sub-regions. For all bins, the Weibull shape parameter *k* varies between 1.40 in the West Thracian Sea and 1.73 in the Dardanelles region of influence, with a mean value of 1.61 throughout the gridded data at hub level (z = 93 m). At the nearshore Thracian Sea area, k-mean values range from 1.39 from the N direction to 1.63 from the WSW direction.


**Table 2.** Weibull probability density function parameters, per directional bin, at hub height for all sub-areas.

In terms of the *k*-distribution over the various directional bins, higher values occur at the NE direction in the East Thracian Sea, the Lemnos Plateau, the Dardanelles, and the Mt Athos areas (ranging from 1.88 to 2.45), at the ENE direction in the central Thracian Sea (*k* = 2.00), and the E direction in the western Thracian part (*k* = 1.88). In parallel, the Weibull scale parameter (*A*) exhibits a gradual increase from the western nearshore zone (4.79 m/s) towards the east (6.77 m/s) and then offshore until the Lemnos Plateau (7.81 m/s) and the highly dynamic Dardanelles area (8.02 m/s). The NE direction displays the highest *A*-values in all sub-areas, except for the East Thracian Sea where the NNE direction prevails. The highest NE-bin *A*-value is seen at the Lemnos area (10.42 m/s), followed by the Dardanelles region (10.39 m/s).

The Weibull probability density function, fitted on the NE wind speed data at a specific grid point located at the Lemnos Plateau, together with the cumulative probability density function and the relevant *Q-Q* and *P-P* plots, is shown in Figure 7. Based on this analysis and the wind turbine power curve (Figure 2), it can be deduced that the probability of wind speed from the NE direction falling within the turbine operational window (>5 m/s) is 79.81%.

**Figure 7.** Probability density model fitted on NE wind data at hub height (point 46, Lemnos Plateau) (**a**) data histogram and fitted Weibull function, (**b**) *Q–Q* plot, (**c**) Cumulative density function, and (**d**) *P–P* plot.
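The probability that the wind speed exceeds a threshold follows from the Weibull cumulative distribution, *P*(*W* > *w*) = exp(−(*w*/*A*)<sup>*k*</sup>). The sketch below illustrates the calculation. The scale *A* = 10.42 m/s is the NE-bin value reported for the Lemnos area, whereas the shape *k* = 2.0 and the 25 m/s upper limit are hypothetical values chosen for illustration only, and the function names are not from the study.

```python
import math


def prob_exceeding(w, k, a):
    """P(W > w) for a Weibull distribution with shape k and scale a."""
    if w < 0 or k <= 0 or a <= 0:
        raise ValueError("w must be non-negative; k and a must be positive")
    return math.exp(-((w / a) ** k))


def prob_in_window(w_low, w_high, k, a):
    """Probability that the wind speed lies between w_low and w_high."""
    return prob_exceeding(w_low, k, a) - prob_exceeding(w_high, k, a)


# A = 10.42 m/s from the text; k = 2.0 and the 25 m/s limit are assumptions.
p_above_5 = prob_exceeding(5.0, k=2.0, a=10.42)
p_window = prob_in_window(5.0, 25.0, k=2.0, a=10.42)
print(f"P(W > 5 m/s) = {p_above_5:.3f}, P(5 < W < 25 m/s) = {p_window:.3f}")
```

With a shape parameter near 2, the exceedance probability at 5 m/s is close to 0.8, the same order as the value quoted for the NE direction.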

The iso-lines connecting points of equal *k* and *A* values, as extracted from the Weibull probability distribution for the NE wind direction, are shown in Figure 8. Based on Figure 8a, it is evident that *k*-values >2.4 occur in the Dardanelles area, and that *k* decreases gradually towards the WNW direction at a stable rate of 0.1 per 20 km. On the other hand, the spatial distribution of the scale parameter *A* seems more complex, with local peaks (>10.5 m/s) at Bozcaada Island and at the Saros Gulf and a general W-E orientation of the isolines, indicating a sharp reduction in *A* towards the nearshore and onshore Thracian Sea grid points (Figure 8b).

### *3.4. Total Wind Energy Content*

Using the parameters of the Weibull distribution per grid point and integrating spatially, Table 3 presents the wind energy content per directional sector, averaged over the main sub-areas of the study region. The analysis suggests that the highest wind energy content occurs in the Lemnos Plateau area (4455 kWh/m<sup>2</sup>/yr), followed by the Dardanelles (4398 kWh/m<sup>2</sup>/yr), Siggitikos/Mt Athos (3091 kWh/m<sup>2</sup>/yr), and East Thracian Sea (2964 kWh/m<sup>2</sup>/yr).

**Figure 8.** Spatial distribution of the Weibull probability density function parameters: (**a**) the shape parameter *k* and (**b**) the scale parameter *A* (in m/s) at the hub level over the study area.

**Table 3.** Total annual wind energy content (in kWh/m<sup>2</sup> ) at hub level, per directional bin, for all sub-areas.


Table 3 indicates that the Lemnos Plateau and the Dardanelles region have a high wind energy content spread over three directional bins (NNE, NE, and ENE), representing an annual wind energy content of 3496 kWh/m<sup>2</sup> and 3431 kWh/m<sup>2</sup> , respectively. This energy content is equivalent to the power density of 399 W/m<sup>2</sup> and 391 W/m<sup>2</sup> , respectively. Approximately 22% of this sectorial energy content is being produced by winds in the 0–5 m/s range, 43% within the 5–10 m/s, 26% in the 10–15 m/s range, 7% in the 15–20 m/s, and only 2% by winds higher than 20 m/s. The contribution of the S sector in the total wind energy content of these two areas also seems quite considerable.

### *3.5. Annual Wind Energy Production*

Considering the wind profile extrapolated from *z*<sub>1</sub> = 10 m (CMEMS data) to hub height (*z*<sub>2</sub> = 93 m), together with the wind turbine power curve and dimensions, the annual wind energy production (in MWh/yr) was estimated following Equation (12). The highest wind energy may be produced in the Dardanelles region, with a spatially-averaged *AEP* value of 7546 MWh/yr. Approximately 75% of this energy (5684 MWh/yr) is concentrated along the NNE, NE, and ENE directional sectors, with the NE–*AEP* being the highest (48.8%). In parallel, most of the total *AEP* in the Dardanelles is produced by winds in the range of 10–15 m/s (46% or 3532 MWh/yr) and 5–10 m/s (37% or 2859 MWh/yr).

In the Lemnos Plateau, the spatially-averaged estimated *AEP*-value reached 7212 MWh/yr, mostly provided by the same directional bins (NNE, NE, and ENE), producing in total 5342 MWh/yr (i.e., 74% of total *AEP*). As in the Dardanelles, most of the energy is produced by winds in the range of 10–15 m/s and 5–10 m/s, with values of 3261 MWh/yr and 2701 MWh/yr, respectively. The East Thracian Sea is another area of significant interest, as the spatially-averaged *AEP* approximates 5620 MWh/yr, 75% of which is produced from the NNE, NE, and ENE sectors. Another interesting feature is the rising contribution of the S and SSW directions (5.7% and 7.9%, respectively). The Siggitikos Gulf and the area of Mt Athos exhibit *AEP* of the order of 5241 MWh/yr, while the Central Thracian and the West Thracian Sea have values of 3743 MWh/yr and 2939 MWh/yr, respectively.

Based on the above *AEP*-estimates, the capacity factor of turbine performance reached 37.44% in the Dardanelles, 35.80% in the Lemnos Plateau, 27.89% in the East Thracian Sea, and 26.02% in the Mt Athos area. The capacity factor in the western and central Thracian Sea was assessed at 14.58% and 18.58%, respectively.

### **4. Discussion**

Over the last decade, there has been a growing interest in exploiting wind energy resources, particularly in offshore marine areas. This has been fueled by the recent trend towards economic decarbonization and the steady turn towards marine renewables, together with cost reductions in turbine manufacturing, installation, and maintenance, and by the advancements in floating wind turbine technology, which is now capable of even tripling the technical potential for offshore wind across the world [2,3]. Capital investments and rates of return in this sector are highly correlated to wind energy resource density and variability, indicating the need for long-term, high-quality assessments of annual wind energy production [29].

Although the methodology for such assessments has been standardized, several bottlenecks remain regarding the availability and reliability of long-term wind data at the wind turbine level over the open sea. For this purpose, several investigators have used wind data collected by on-site sensors (e.g., offshore buoys [13]), which are usually sparse and suffer from significant periods of sensor malfunction and gaps in operation, and by land-based meteorological stations (e.g., stations on islands and in the coastal zone) [5,30], which collect long time-series but are prone to localized orographic effects. On the other hand, data collected by satellites equipped with scatterometers may cover large marine areas, but as measurements are available only at the regular intervals of satellite crossings, the outputs of numerical models are utilized to fill in the spatio-temporal gaps. The final reanalysis product contains gridded wind data, but at rather coarse spatial resolution [31]. In parallel, scatterometer operation is limited by rain, ice, large spatial wind variability, and high wind speeds.

Carvalho et al. [27] compared ASCAT-A and B wind data to wind speeds from offshore buoys and reported that scatterometer data slightly underestimated the wind field (*RMSE* = 1.55; *bias* = 0.64; *STDE* = 1.40; *R*<sup>2</sup> = 0.90) along the Atlantic coast of the Iberian Peninsula. Pickett et al. [32], working with the QuikSCAT satellite, showed that the satellite–buoy wind differences nearshore were more significant than those offshore. Wang et al. [31] performed a similar analysis at the Central California Coast, reporting that ASCAT had lower error metrics than QuikSCAT. Overall, data products overestimated winds relative to the buoy at low wind speeds and underestimated them at high wind speeds. These works show that the performance of different wind products varies considerably by study region, indicating the need for site-specific analyses.

In the present work, 6-hourly wind speed and direction data with 0.25° spatial resolution, obtained from CMEMS (blending ASCAT observations and the ERA-Interim model results), were daily averaged and then compared to ground-truth data from land-based stations over the period between 2015 and 2019. The evaluation metrics illustrated slight to moderate overestimation at the Lemnos Plateau and the Central Thracian Sea, respectively. The Weibull parameters computed here appear in agreement with those reported by Aslan [5] and Bagiorgas et al. [13].

In terms of the wind power density, Bagiorgas et al. [33] reported a mean value of 600 W/m<sup>2</sup> at the Athos offshore buoy, located in the Lemnos Plateau, at a height of 10 m above the sea surface. Similarly, Bohran [34] calculated a mean energy content, ranging between 400 and 1410 W/m<sup>2</sup>, at a height of 30 m at Bozcaada (in the Dardanelles area). Aslan [5], using onshore wind stations, assessed the annual wind power content of the Bozcaada Island (in the Dardanelles region) at 460 W/m<sup>2</sup>. Soukissian et al. [7] estimated the wind power potential of the Lemnos area at 468 W/m<sup>2</sup>, and that of the Thracian Sea at 71.7 W/m<sup>2</sup>.

Our analysis suggested that at hub height (i.e., 93 m from sea level), the spatially-averaged wind power density reaches 513 W/m<sup>2</sup> at the Lemnos Plateau and 507 W/m<sup>2</sup> at the Dardanelles, which is comparable to the above findings. Based on international standards of wind power density classification, the wind energy potential at hub height of Lemnos and the Dardanelles is classified in wind class 5 (excellent); the wind power potential of the East Thracian Sea and Siggitikos Gulf/Mt Athos in wind class 3 (fair); the Central Thracian Sea in class 2 (marginal); and the West Thracian Sea in class 1 (poor). As shown in Table 3, winds from NNE, NE, and ENE directions contribute highly to this energy production.

Considering the characteristics of a Siemens SWT 2.3 MW wind turbine, we have assessed that a mean *AEP* of 5684 MWh/yr from the NNE, NE, and ENE sectors may be produced in the Dardanelles region. These directional bins may produce an *AEP* of 5342 MWh/yr in the Lemnos Plateau and ~5600 MWh/yr in the East Thracian Sea. The selected turbine achieves a capacity factor of 37.4% in the Dardanelles and 35.8% in the Lemnos Plateau. Konstantinidis et al. [15] estimated the capacity factor of the Vestas V90-3 MW at 38.5% and that of the REpower (Senvion) 5M at 41% for the design of an OWF in the Lemnos area.

### **5. Conclusions**

This work has examined the wind power potential of the Thracian Sea, a regional sea at the northern part of the Aegean, which is of significant interest with regard to the installation and operation of wind farms. CMEMS scatterometer wind data for the period between 2011 and 2019, blended with the numerical model reanalysis, were used for the assessment of wind energy content and the annual wind energy production. At a height of 10 m, the CMEMS wind data illustrated mild overestimation of the wind field compared to the Lemnos station data. The estimated Weibull parameters and the assessed wind power density were found comparable to those reported by previous investigators. Earlier wind power assessments in the area utilized limited offshore buoy data or data from nearshore, land-based stations. The basic difference of the present analysis from previous works focusing on the area is that it is based on gridded data, which cover extended offshore zones, and that it quantifies the influence of each directional bin on the final wind energy production.

The highest spatially-averaged wind energy content at hub height occurs in the Lemnos Plateau (4455 kWh/m<sup>2</sup>/yr), followed by the Dardanelles (4398 kWh/m<sup>2</sup>/yr), Siggitikos/Mt Athos (3091 kWh/m<sup>2</sup>/yr), and East Thracian Sea (2964 kWh/m<sup>2</sup>/yr). In these areas, most of the wind energy is produced by three directional bins (i.e., NNE, NE, ENE) and by wind speeds of 5–10 m/s. The spatially-averaged wind power density reaches 513 W/m<sup>2</sup> at the Lemnos Plateau and 507 W/m<sup>2</sup> at the Dardanelles, and the wind energy production for the selected wind turbine reaches 7212 MWh/yr and 7546 MWh/yr, respectively.

**Author Contributions:** Conceptualization, G.S.; methodology, G.S., N.K.; software, G.S., N.K.; validation, M.M.N. and D.A.G.; resources, N.K., M.Z. and K.Z.; data curation, N.K., M.Z. and K.Z.; writing—original draft preparation, G.S., N.K.; writing—review and editing, G.S., M.M.N. and D.A.G.; visualization, N.K.; supervision, G.S.; project administration, G.S.; funding acquisition, G.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by European Union's Horizon 2020 Research and Innovation Program (H2020-BG-12-2016-2), grant number No. 727277—ODYSSEA (Towards an integrated Mediterranean Sea Observing System). The article reflects only the authors' view. The Commission is not responsible for any use that may be made of the information it contains.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


### *Article* **A Combined Fuzzy GMDH Neural Network and Grey Wolf Optimization Application for Wind Turbine Power Production Forecasting Considering SCADA Data**

**Azim Heydari 1,\* , Meysam Majidi Nezhad <sup>1</sup> , Mehdi Neshat <sup>2</sup> , Davide Astiaso Garcia <sup>3</sup> , Farshid Keynia <sup>4</sup> , Livio De Santoli <sup>1</sup> and Lina Bertling Tjernberg <sup>5</sup>**


**Abstract:** The trend towards cost-effective and efficient wind energy production leads to larger wind turbine generators and drives the need for more advanced and more accurate forecast models. This paper proposes a combined forecasting model that consists of empirical mode decomposition, a fuzzy group method of data handling (GMDH) neural network, and the grey wolf optimization algorithm. In the proposed forecasting model, a combination of K-means and the density-based local outlier factor (LOF) is applied to detect and clean the outliers of the raw supervisory control and data acquisition (SCADA) data. Moreover, the empirical mode decomposition is employed to decompose the signals and pre-process the data. The fuzzy GMDH neural network is the forecaster engine that estimates the future amount of wind turbine energy production, where the grey wolf optimization is used to optimize the fuzzy GMDH neural network parameters in order to achieve a lower forecasting error. The model has been applied using actual data from a pilot onshore wind farm in Sweden. The obtained results indicate that the proposed model has a higher accuracy than other single and combined forecasting models in the literature, for different time-steps ahead and seasons.

**Keywords:** power system; wind power production; SCADA data; fuzzy GMDH neural network; grey wolf optimization

### **1. Introduction**

The wind power industry has expanded tremendously and is expected to grow at a compound annual growth rate (CAGR) of 5.2% between 2020 and 2027. This extension resulted in the produced power cost of wind energy as one of the most significant renewable and low-carbon energy resources. Wind power generation is currently one of the principal forms of renewable energy power generation [1–4]. Wind energy is stochastic, uncertain, and discontinuous, adversely affecting the safe and stable operation of the power grid and the quality of the power supply [5]. The stochasticity and discontinuity of wind power could reduce the reliability of the prediction system and the wind power quality [6].

A potential answer to these issues is to improve the forecast accuracy of wind generation. Several studies [7–12] have sought to describe the distribution of the wind power prediction, and diverse scientific methodologies have been applied to improve its accuracy.

**Citation:** Heydari, A.; Majidi Nezhad, M.; Neshat, M.; Garcia, D.A.; Keynia, F.; Santoli, L.D.; Tjernberg, L.B. A Combined Fuzzy GMDH Neural Network and Grey Wolf Optimization Application for Wind Turbine Power Production Forecasting Considering SCADA Data. *Energies* **2021**, *14*, 3459. https://doi.org/10.3390/en14123459

Academic Editor: Francesco Castellani

Received: 7 May 2021 Accepted: 9 June 2021 Published: 11 June 2021

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Other studies have proposed more complex models, such as the Laplace distribution [9], the Beta distribution [10], the hyperbolic distribution [11], the Levy α-stable distribution [12], and the flexible likelihood distribution [13], to improve the fitting precision of the wind power prediction. In the previous decade, many studies assessing and predicting various aspects of energy management and power systems have been presented, covering, for example, wind and solar power generation forecasting [14–16], condition monitoring of wind turbines [17], the electricity market [18], and load forecasting [19,20]. Amjady et al. (2011) provided a short-term wind power prediction based on the ridgelet neural network (RNN) with a high estimation capability. They suggested a differential evolution algorithm with a new selection mechanism and crossover to train the network [21]. Han et al. (2017) proposed combined models based on autoregressive integrated moving average (ARMA) and a non-parametric model for wind speed forecasting [22].

The results demonstrated that non-parametric based combined models usually perform better than other models. Jonas C. Pelajo et al. (2019) developed a model to predict wind speed and energy price to determine the optimal maintenance planning of a real wind farm in the Brazilian Northeast [23]. Osório et al. (2014) proposed a combined forecasting model based on mutual information, wavelet transform, particle swarm optimization, and an adaptive neuro-fuzzy inference system framework to predict the short-term wind power and electricity market prices [24]. Gallego-Castillo et al. (2016) provided a quantile regression model based on the reproducing kernel Hilbert space (RKHS) framework for probabilistic wind power prediction. Furthermore, they implemented two types of models (online and offline) for a real wind farm [25]. Xiao et al. (2017) employed an electrical power system prediction model using a wavelet neural network (WNN) model and an improved cuckoo search algorithm. The results showed that the proposed model substantially reduced the prediction error with respect to other comparable models [26].

Kunpeng Shi et al. (2018) provided a combined model based on two-stage feature selection and improved random forest models for short-term wind power forecasting [27]. Van Quang Doan et al. (2019) presented a mesoscale ensemble model to predict wind speed ramps. The proposed model was applied at real wind farms in Japan [28]. Duan et al. (2021) developed a combined intelligent model based on the improved variational mode decomposition and a Correntropy long short-term memory neural network to predict wind power. The model was evaluated using two wind farms in China at different sampling intervals [29]. Yildiz et al. (2021) presented a new two-step deep learning approach based on the variational mode decomposition (VMD) method and a modified residual-based deep convolutional neural network for wind power forecasting [30]. Jafarzadeh et al. (2021) provided a modified fuzzy wavelet neural network for short-term wind power forecasting considering weather and power plant parameters. In order to evaluate the model, data from the Mnjil wind power plant in Iran were used [31].

In addition, GIS-based models play an important role in renewable energy potential assessment and prediction [32,33]. Furthermore, the behavior and performance of renewable energy systems can be estimated using GIS models [34,35].

Generally, in order to model the wind turbine power production analysis, a combined intelligent solution is required. It means that the data should first be modelled by a combined data pre-processing model then a combined intelligent strategy should analyze the processed data. This type of strategy plays an essential role in managing the energy production of wind farms.

In this research, we propose an integrated strategy that couples empirical mode decomposition (EMD), a fuzzy GMDH (group method of data handling) neural network, and the grey wolf optimization (GWO) algorithm to forecast the power produced by wind turbines. Furthermore, to detect and clean outliers, a combined K-means and density-based local outlier factor (LOF) method is applied.

The main contributions and novelty of this paper are illustrated as follows:


### **2. Materials and Methods**

After a brief description of the SCADA system and data gathering, this section illustrates the artificial intelligence methods proposed in this paper.

### *2.1. SCADA System*

The SCADA system, which provides remote supervision and control of wind turbines in wind farms, plays a significant role in wind power forecasting models. The SCADA data collected and applied in this paper come from a large wind farm located in Sweden. The input data include the power output of the wind turbines and the wind speed (short-term, with an interval of 10 min) for one year, from January to December 2015. Furthermore, in order to evaluate and compare the performance of the proposed hybrid model, we applied the SCADA data for two wind turbines (wind turbine 1 (WT1) and wind turbine 2 (WT2)).

### *2.2. Proposed Wind Power Forecasting Strategy*

In this study, a multi-step hybrid intelligent model has been proposed as a means to predict wind power production (see Figure 1).

**Figure 1.** General schematic of the proposed forecasting model.

Given the wide range of intelligent methods available, such as neural networks and metaheuristic optimization algorithms, in this paper we present a hybrid forecasting model based on FGMDH and GWO for wind power production forecasting. The GWO optimization algorithm can perform the neural network (FGMDH) training step well and optimize the values of the network parameters. Therefore, GWO contributes to the performance of the proposed model for predicting wind power production.

Since the structure of the input matrix plays a significant role in determining the output and accuracy of the model, in the first place, the various input signals (wind power and wind speed) are decomposed through the EMD method to different high and low frequencies (see Figure 2).

**Figure 2.** The EMD output, i.e., decomposition signals of wind power and wind speed of turbine 1 (WT1).

Five decomposed components (IMF1, IMF2, IMF3, IMF4, and the residual) are selected and applied, delayed by one time unit (t-1), as inputs of the model (see the input and output data structure in Figure 3). In addition, the lagged values (1 to 5) of the original wind power signal and the actual wind speed signal are considered as input parameters (Figure 3). In the next step, the FGMDH method is employed to predict the wind turbine power.

The FGMDH model structure includes different neurons. The parameters grouped in the form of Gaussian variables and the weight of the fuzzy rule in each neuron are unknown. In this paper, the GWO algorithm is applied with the purpose of optimizing the FGMDH model variables (the group-unknown variables in neurons).

In this study, in order to evaluate the performance and reliability of the forecasting models, the wind turbine power production is predicted for different seasons at two time steps (10-min and 1-h). The framework of the proposed model is represented in Figure 3.

**Figure 3.** The framework of the proposed model.

### *2.3. Data Cleaning*

The raw SCADA datasets usually include different forms of noise that directly reduce the accuracy of the forecasting process. Among the most notable outliers are the negative wind turbine power outputs observed when the wind speed is low or during a failure. To evaluate the distribution of the raw SCADA dataset, Figure 4 is plotted; an abnormal distribution of wind power can be seen in it.

**Figure 4.** The distribution of the raw SCADA data WT1 (wind speed and power).

In the pre-processing step, it is recommended [36] that these negative powers be set to zero. In addition, to remove the impact of the data scale, Min-Max normalization is applied for feature scaling. Meanwhile, each wind turbine has a unique power curve that represents the average efficiency of the turbine without describing its particular mechanical components. Figure 5 shows this characteristic for the first wind turbine in this research. The scattered data points indicate the outliers.

**Figure 5.** The power curve model of the WT1.
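The two pre-processing rules described above (negative power set to zero, then Min-Max scaling) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name `clean_and_scale` and the sample values are hypothetical.

```python
def clean_and_scale(power, speed):
    """Clip negative power to zero, then min-max scale each series to [0, 1]."""
    clipped = [max(p, 0.0) for p in power]

    def min_max(values):
        lo, hi = min(values), max(values)
        if hi == lo:  # constant series: avoid division by zero
            return [0.0 for _ in values]
        return [(v - lo) / (hi - lo) for v in values]

    return min_max(clipped), min_max(speed)


# Hypothetical 10-min samples: power in kW, wind speed in m/s.
power_kw = [-5.0, 0.0, 250.0, 1000.0, 2000.0]
speed_ms = [2.0, 3.0, 6.0, 9.0, 12.0]
scaled_power, scaled_speed = clean_and_scale(power_kw, speed_ms)
```

In practice the minimum and maximum would presumably be taken from the training portion of the data only, so that the same scaling can be reused on unseen samples; the text does not state which choice was made.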

The proposed data cleaning method combines K-means clustering with the density-based local outlier factor (LOF) method [37]. In the first step, K-means clustering is employed to divide the SCADA data into clusters. Then, in each cluster, the local density-based method is adopted to eliminate potential noise. The clean data obtained after applying K-means clustering and the LOF method are illustrated in Figure 6.

**Figure 6.** The K-means clustering method performance and the applied data (WT1) divided into 10 clusters (**left**). The dark points represent the clean SCADA data after applying the LOF (**right**).
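The two-step cleaning idea (cluster first, then score local density inside each cluster) can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the LOF threshold of 1.5, and the neighbourhood size are hypothetical choices, and the text reports 10 clusters for WT1 whereas the defaults here are arbitrary.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def lof_scores(points, n_neighbors):
    """Local outlier factor per point; values well above 1 suggest outliers."""
    n = len(points)
    neigh, kdist = [], []
    for i in range(n):
        order = sorted((math.dist(points[i], points[j]), j) for j in range(n) if j != i)
        neigh.append([j for _, j in order[:n_neighbors]])
        kdist.append(order[n_neighbors - 1][0])
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(kdist[j], math.dist(points[i], points[j])) for j in neigh[i]]
        lrd.append(len(reach) / max(sum(reach), 1e-12))
    return [sum(lrd[j] for j in neigh[i]) / (len(neigh[i]) * lrd[i]) for i in range(n)]


def clean_scada(points, k=2, n_neighbors=3, threshold=1.5):
    """Return the sorted indices of points kept after per-cluster LOF filtering."""
    labels = kmeans(points, k)
    keep = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if len(idx) <= n_neighbors:
            keep.extend(idx)  # too few points to score; keep them
            continue
        scores = lof_scores([points[i] for i in idx], n_neighbors)
        keep.extend(i for i, s in zip(idx, scores) if s <= threshold)
    return sorted(keep)
```

Each point would be a (wind speed, power) pair after normalization. Scoring density within a cluster rather than globally is what lets a sparse but legitimate region of the power curve survive while isolated points are removed.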

### *2.4. Empirical Mode Decomposition*

EMD is a signal decomposition method that can analyse non-linear and non-stationary time series. Moreover, it is more accessible and easier to understand than wavelet decomposition [38,39]. In addition, unlike wavelet decomposition, EMD does not require a mother function to be chosen in advance. The most important characteristic of EMD is that it is a fully data-driven decomposition, in which a signal is broken down into independent components according to its local characteristics.

The concept of EMD is to decompose the initial signal into a finite number of oscillatory functions, the intrinsic mode functions (IMFs), and a residual. Each IMF must satisfy the following conditions: (1) the number of extrema and the number of zero crossings must be equal or differ by at most one; (2) the mean value of the envelopes defined by the local maxima and local minima must be zero at every point.

The EMD is a sifting method using a real signal to extract the IMFs and residual. The calculation of the EMD can be given as the following steps [38,39]:

**Stage 1:** Recognize all local maxima and local minima in time series *S*(*t*).

**Stage 2:** Connect all local maxima and all local minima to produce the upper envelope *U*(*t*) and lower envelope *L*(*t*), respectively, using cubic splines.

**Stage 3:** Calculate the point-by-point mean envelope *m*(*t*) from the upper and lower envelopes as:

$$m(t) = \frac{\left[U(t) + L(t)\right]}{2} \tag{1}$$

**Stage 4:** Compute the difference between the actual signal and the mean envelope:

$$h(t) = S(t) - m(t) \tag{2}$$

**Stage 5:** Check whether *h*(*t*) is an intrinsic mode function (IMF). If it is, it is treated as the *i*th IMF, and the actual time series is then replaced by the residual *S*(*t*) − *h*(*t*). If not, *S*(*t*) is replaced by *h*(*t*).

**Stage 6:** Repeat Stages 1–5 until the standard deviation of two consecutive sifting results (IMFs and residual) is lower than the predefined stopping criterion.

Using the above-mentioned sifting process, many IMFs can be obtained from high frequency to low frequency, thereby disintegrating into several IMFs and a residual as:

$$r\_n(t) = S(t) - \sum\_{i=1}^{n} C\_i(t) \tag{3}$$

where *r<sub>n</sub>*(*t*) is the final residual and *n* is the number of IMFs. *C<sub>i</sub>*(*t*) (*i* = 1, 2, . . . , *n*) denotes the individual IMFs.
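A single sifting pass (Stages 1–4) can be sketched as follows. This is a deliberately simplified illustration: the envelopes are built with piecewise-linear interpolation rather than the cubic splines named in Stage 2, the envelopes are held flat beyond the outermost extrema, and the IMF check and stopping criterion of Stages 5–6 are omitted. The function names are hypothetical.

```python
def local_extrema(signal):
    """Indices of strict interior local maxima and minima (Stage 1)."""
    maxima, minima = [], []
    for i in range(1, len(signal) - 1):
        if signal[i - 1] < signal[i] > signal[i + 1]:
            maxima.append(i)
        elif signal[i - 1] > signal[i] < signal[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(signal, knots):
    """Piecewise-linear envelope through the knots, flat beyond the end knots.

    Simplification: Stage 2 in the text uses cubic splines instead.
    """
    env = []
    for t in range(len(signal)):
        if t <= knots[0]:
            env.append(signal[knots[0]])
        elif t >= knots[-1]:
            env.append(signal[knots[-1]])
        else:
            right = next(k for k in knots if k >= t)
            left = knots[knots.index(right) - 1]
            w = (t - left) / (right - left)
            env.append((1 - w) * signal[left] + w * signal[right])
    return env


def sift_once(signal):
    """One pass of Stages 1-4: returns (h, m) with h(t) = S(t) - m(t)."""
    maxima, minima = local_extrema(signal)
    if not maxima or not minima:
        raise ValueError("signal has too few extrema to sift")
    upper = envelope(signal, maxima)                         # U(t)
    lower = envelope(signal, minima)                         # L(t)
    mean_env = [(u + l) / 2 for u, l in zip(upper, lower)]   # Eq. (1)
    detail = [s - m for s, m in zip(signal, mean_env)]       # Eq. (2)
    return detail, mean_env


# Hypothetical toy signal oscillating between 0 and 2.
h, m = sift_once([0, 2, 0, 2, 0, 2, 0])
```

For this toy signal the mean envelope is constant, so *h*(*t*) is simply the signal shifted to oscillate around zero. A complete EMD would repeat the pass until *h*(*t*) satisfies the two IMF conditions, subtract that IMF, and continue on the residual.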

### *2.5. Fuzzy-GMDH Model*

The FGMDH is a machine learning strategy with a hierarchical structure [40]. In this model, every neuron has two inputs and one output. The general structure of the FGMDH system is shown in Figure 3 (the FGMDH forecasting model). In this figure, the output of each neuron in a layer serves as an input to the following layer. The final output is the mean of the outputs of the last layer.

The FGMDH structure part of Figure 3 shows that the inputs of the *m*th model in the *p*th layer are the outputs of the (*m* − 1)th and *m*th models in the (*p* − 1)th layer. The function for computing *y<sup>pm</sup>* (the output variable of the *m*th model in the *p*th layer) is as follows:

$$\boldsymbol{y}^{pm} = \boldsymbol{f}\left(\boldsymbol{y}^{p-1,m-1}, \boldsymbol{y}^{p-1,m}\right) = \sum\_{k=1}^{K} \mu\_{k}^{pm} \cdot \boldsymbol{w}\_{k}^{pm} \tag{4}$$

$$\mu\_k^{pm} = \exp\left\{ -\frac{\left(y^{p-1,m-1} - a\_{k,1}^{pm}\right)^2}{b\_{k,1}^{pm}} - \frac{\left(y^{p-1,m} - a\_{k,2}^{pm}\right)^2}{b\_{k,2}^{pm}} \right\} \tag{5}$$

where *w<sub>k</sub><sup>pm</sup>* and *µ<sub>k</sub><sup>pm</sup>* are the weight parameter and the corresponding *k*th Gaussian function, respectively. Moreover, *a<sub>k</sub><sup>pm</sup>* and *b<sub>k</sub><sup>pm</sup>* are the Gaussian parameters [41]. Furthermore, the final output is calculated by the following equation:

$$y = \frac{1}{M} \sum\_{m=1}^{M} y^{pm} \tag{6}$$

The feed-forward FGMDH is trained iteratively, which makes it suitable for solving complex problems.

A simplified fuzzy logic rule has been provided by [40] to improve the GMDH neural network:

$$\text{If } \mathbf{x}\_1 = F\_{\mathbf{k}1} \text{ and } \mathbf{x}\_2 = F\_{\mathbf{k}2}, \text{ then output } y = w\_{\mathbf{k}}.$$
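Equations (4)–(6) can be made concrete with a short sketch. It is only an illustration of the formulas as printed: the function names and the tuple layout `(a1, b1, a2, b2, w)` for one fuzzy rule are hypothetical, and the parameter values are made up.

```python
import math


def neuron_output(y_a, y_b, rules):
    """Eqs. (4)-(5): output of one FGMDH neuron with two inputs.

    rules: one (a1, b1, a2, b2, w) tuple per fuzzy rule k, where a/b are the
    Gaussian parameters for each input and w is the rule weight.
    """
    total = 0.0
    for a1, b1, a2, b2, w in rules:
        mu = math.exp(-((y_a - a1) ** 2) / b1 - ((y_b - a2) ** 2) / b2)  # Eq. (5)
        total += mu * w                                                  # Eq. (4)
    return total


def network_output(last_layer_outputs):
    """Eq. (6): the final output is the mean of the last-layer neuron outputs."""
    return sum(last_layer_outputs) / len(last_layer_outputs)


# Hypothetical neuron with two rules and made-up parameters.
rules = [(0.0, 1.0, 0.0, 1.0, 2.0), (1.0, 0.5, 1.0, 0.5, -1.0)]
y_neuron = neuron_output(0.0, 0.0, rules)
y_final = network_output([y_neuron, 1.0])
```

The unknowns per neuron are exactly the entries of `rules`, which is the parameter set that the text says GWO is used to optimize.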

### *2.6. Grey Wolf Optimization*

The GWO algorithm, a meta-heuristic algorithm based on swarm intelligence, was proposed by Mirjalili et al. [42].

GWO is inspired by the behaviour of grey wolves. Four types of grey wolves (alpha, beta, delta, and omega) are used to replicate the leadership hierarchy. In addition, the main steps of grey wolf hunting (encircling prey, hunting, attacking prey, and searching for prey) are carried out during the optimization [43].

Encircling prey: The encircling behaviour of each agent of the group is computed by the following mathematical formula:

$$\overrightarrow{d} = \left| \overrightarrow{c} \cdot \overrightarrow{x}\_p^t - \overrightarrow{x}^t \right| \tag{7}$$

$$\overrightarrow{x}^{t+1} = \overrightarrow{x}\_p^t - \overrightarrow{a} \cdot \overrightarrow{d} \tag{8}$$

The coefficient vectors *a* and *c* are computed by the following formula:

$$\begin{cases} \overrightarrow{a} = 2l \cdot r\_1 \\ \overrightarrow{c} = 2 \cdot r\_2 \end{cases} \tag{9}$$

Hunting: For a mathematical simulation of the hunting behaviour of grey wolves, it is assumed that *α*, *β*, and *δ* have better information about the possible location of the prey.

$$\begin{cases} \overrightarrow{d}\_{\alpha} = \left| \overrightarrow{c}\_1 \cdot \overrightarrow{x}\_{\alpha} - \overrightarrow{x} \right| \\ \overrightarrow{d}\_{\beta} = \left| \overrightarrow{c}\_2 \cdot \overrightarrow{x}\_{\beta} - \overrightarrow{x} \right| \\ \overrightarrow{d}\_{\delta} = \left| \overrightarrow{c}\_3 \cdot \overrightarrow{x}\_{\delta} - \overrightarrow{x} \right| \end{cases}, \quad \begin{cases} \overrightarrow{x}\_1 = \overrightarrow{x}\_{\alpha} - \overrightarrow{a}\_1 \cdot \overrightarrow{d}\_{\alpha} \\ \overrightarrow{x}\_2 = \overrightarrow{x}\_{\beta} - \overrightarrow{a}\_2 \cdot \overrightarrow{d}\_{\beta} \\ \overrightarrow{x}\_3 = \overrightarrow{x}\_{\delta} - \overrightarrow{a}\_3 \cdot \overrightarrow{d}\_{\delta} \end{cases} \tag{10}$$

$$\overrightarrow{x}^{t+1} = \frac{\overrightarrow{x}\_1 + \overrightarrow{x}\_2 + \overrightarrow{x}\_3}{3} \tag{11}$$

Attacking the prey and searching for the prey: The coefficient vector *a* takes random values in the interval [−2*a*, 2*a*]. When the magnitude of the random value is less than 1, the grey wolves are forced to attack the prey, and when it is greater than 1, the grey wolves are forced to diverge from the prey.
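The hunting update of Equations (10) and (11) can be sketched as a single function. To keep the example deterministic, the coefficient vectors are passed in rather than drawn at random as in Equation (9); the function name `gwo_update` and all numeric values are hypothetical.

```python
def gwo_update(x, leaders, a_coeffs, c_coeffs):
    """Eqs. (10)-(11): move one wolf toward the alpha, beta, and delta wolves.

    x:        current position of the wolf (list of floats)
    leaders:  [x_alpha, x_beta, x_delta], each a list of floats
    a_coeffs: three coefficient vectors a_1, a_2, a_3
    c_coeffs: three coefficient vectors c_1, c_2, c_3
    """
    candidates = []
    for leader, a, c in zip(leaders, a_coeffs, c_coeffs):
        # Distance to this leader, element by element (left part of Eq. 10).
        d = [abs(cj * lj - xj) for cj, lj, xj in zip(c, leader, x)]
        # Candidate position suggested by this leader (right part of Eq. 10).
        candidates.append([lj - aj * dj for lj, aj, dj in zip(leader, a, d)])
    # Eq. (11): average of the three candidate positions.
    return [sum(vals) / 3 for vals in zip(*candidates)]


# Hypothetical two-dimensional example.
leaders = [[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]]
zeros = [[0.0, 0.0]] * 3
ones = [[1.0, 1.0]] * 3
new_position = gwo_update([5.0, 5.0], leaders, zeros, ones)
```

With all *a* coefficients equal to zero the wolf lands exactly on the mean of the three leaders, which illustrates why small magnitudes of *a* correspond to attacking (converging) and large magnitudes to searching (diverging). In the proposed model, a position vector would presumably hold the unknown FGMDH parameters.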

### *2.7. Error Indicators*

In order to assess the accuracy and reliability of the proposed forecasting model, different error indicators have been used in this paper: The mean absolute percentage error (*MAPE*), the sum squared error (*SSE*), the root mean squared error (*RMSE*), and the mean absolute error (*MAE*). All error indicators are based on error percentage (unitless).

$$MAPE = \frac{100}{m} \sum\_{i=1}^{m} \left| \frac{\mathbf{x}\_{real\_i} - \mathbf{x}\_{for\_i}}{\mathbf{x}\_{real\_i}} \right| \tag{12}$$

$$SSE = \sum\_{i=1}^{m} \left(\mathbf{x}\_{real\_i} - \mathbf{x}\_{for\_i}\right)^2 \tag{13}$$

$$RMSE = \sqrt{\frac{1}{m} \sum\_{i=1}^{m} \left( \mathbf{x}\_{real\_i} - \mathbf{x}\_{for\_i} \right)^2} \tag{14}$$

$$MAE = \frac{1}{m} \sum\_{i=1}^{m} \left| \mathbf{x}\_{real\_i} - \mathbf{x}\_{for\_i} \right| \tag{15}$$

where *x<sub>real_i</sub>* and *x<sub>for_i</sub>* are the actual value and the predicted value, respectively, and *m* is the number of data points.
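The four indicators in Equations (12)–(15) translate directly into code. The sketch below is illustrative; the function name and sample values are hypothetical. Note one assumption: MAPE divides by the actual value, so samples whose actual value is zero (for example, power values that were set to zero during cleaning) would have to be excluded beforehand. The text does not say how such samples were handled.

```python
import math


def error_indicators(actual, forecast):
    """Compute MAPE, SSE, RMSE, and MAE as defined in Eqs. (12)-(15)."""
    m = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    mape = 100 / m * sum(abs(e / a) for e, a in zip(errors, actual))  # Eq. (12)
    sse = sum(e ** 2 for e in errors)                                 # Eq. (13)
    rmse = math.sqrt(sse / m)                                         # Eq. (14)
    mae = sum(abs(e) for e in errors) / m                             # Eq. (15)
    return {"MAPE": mape, "SSE": sse, "RMSE": rmse, "MAE": mae}


# Hypothetical actual and forecast values.
result = error_indicators([100.0, 200.0], [110.0, 180.0])
```

For these two samples the errors are −10 and 20, giving a MAPE of 10, an SSE of 500, an RMSE of about 15.81, and an MAE of 15.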

### **3. Results and Discussion**

As discussed in the previous sections, in order to evaluate the performance and efficiency of the proposed method, the real on-shore SCADA dataset for two wind turbines (WT1 and WT2) has been used in this paper. Following the framework of the proposed model (Figure 3), several components of wind speed (*IMF1<sup>Speed</sup>*, *IMF2<sup>Speed</sup>*, *IMF3<sup>Speed</sup>*, *IMF4<sup>Speed</sup>*, and *Res<sup>Speed</sup>*) and wind turbine power (*IMF1<sup>Power</sup>*, *IMF2<sup>Power</sup>*, *IMF3<sup>Power</sup>*, *IMF4<sup>Power</sup>*, and *Res<sup>Power</sup>*) are considered as input parameters of the model:

$$\text{Original and decomposition signals} \begin{cases} \text{WP}\_{(t-1)}, \text{WP}\_{(t-2)}, \text{WP}\_{(t-3)}, \text{WP}\_{(t-4)}, \text{WP}\_{(t-5)} \\ \text{WS}\_{(t-1)}, \text{WS}\_{(t-2)}, \text{WS}\_{(t-3)}, \text{WS}\_{(t-4)}, \text{WS}\_{(t-5)} \\ \text{IMF1}^{\text{Power}}\_{(t-1)}, \text{IMF2}^{\text{Power}}\_{(t-1)}, \text{IMF3}^{\text{Power}}\_{(t-1)}, \text{IMF4}^{\text{Power}}\_{(t-1)}, \text{Res}^{\text{Power}}\_{(t-1)} \\ \text{IMF1}^{\text{Speed}}\_{(t-1)}, \text{IMF2}^{\text{Speed}}\_{(t-1)}, \text{IMF3}^{\text{Speed}}\_{(t-1)}, \text{IMF4}^{\text{Speed}}\_{(t-1)}, \text{Res}^{\text{Speed}}\_{(t-1)} \end{cases}$$
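The input structure above (five lags of wind power and wind speed plus one-step-delayed decomposition components) can be assembled as in the following sketch. The function name and the toy series are hypothetical, and taking the wind power at time *t* as the target is an assumption based on the stated goal of predicting wind turbine power.

```python
def build_inputs(power, speed, power_components, speed_components, max_lag=5):
    """Build (X, y): one 20-feature row per time step t >= max_lag.

    power_components / speed_components: lists of series (IMF1..IMF4 and the
    residual), each the same length as `power`.
    """
    rows, targets = [], []
    for t in range(max_lag, len(power)):
        row = [power[t - k] for k in range(1, max_lag + 1)]    # WP(t-1)..WP(t-5)
        row += [speed[t - k] for k in range(1, max_lag + 1)]   # WS(t-1)..WS(t-5)
        row += [comp[t - 1] for comp in power_components]      # power IMFs/residual at t-1
        row += [comp[t - 1] for comp in speed_components]      # speed IMFs/residual at t-1
        rows.append(row)
        targets.append(power[t])  # assumed target: wind power at time t
    return rows, targets


# Hypothetical toy series; the components are placeholders, not real IMFs.
power = [float(v) for v in range(10)]
speed = [float(v) for v in range(10, 20)]
X, y = build_inputs(power, speed, [power] * 5, [speed] * 5)
```

Each row then has 5 + 5 + 5 + 5 = 20 features, matching the four lines of the input structure.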

In addition, several combined forecasting models such as MI-CNN [44], MRMR-HNES [45], MI-CNEA [46], GRNN, and FGMDH have been applied to measure the performance of the proposed model. First, the 1-year-dataset is selected as the prediction model data and the predicted results are calculated with the error indicators presented in the previous section. The results are shown in Table 1.


**Table 1.** Comparison of the wind turbine power forecasting errors of the models for two wind turbines (WT1 and WT2).

In Table 1, the prediction results are calculated for two wind turbines at two different time steps (*10-min* and *1-h*). Wind turbine power production is highly dependent on the wind speed, and wind speeds vary greatly from day to day. Thus, the forecasting time intervals for power production have been chosen from days in the following four months: February, May, August, and November, which have the highest fluctuations in power production.

According to the results in Table 1, the performance of the proposed model is better than the other provided models in different time steps and wind turbines. In addition, the results indicate that the performance of the forecasting models in the 10-min time step is better than the 1-h time step. Table 2 and Figure 7 indicate the results of the proposed forecasting model and other models for wind turbine power production forecasting (WT1).

**Table 2.** Comparison of the wind turbine power forecasting errors of the models for the different seasons of a year.


**Figure 7.** The wind turbine power forecasting results of the comparative models for WT1.

In Table 2 and Figure 7, the performance of the forecasting models has been evaluated at different time steps (10-min and 1-h) and in different seasons (winter, spring, summer, and fall). Based on these results, the proposed model can predict the wind turbine power more reliably and more accurately at multiple time steps ahead than the other forecasting models considered (GRNN, FGMDH, MI-CNN, MRMR-HNES, and MI-CNEA).

### **4. Conclusions**

Considering the highly volatile and nonlinear nature of wind turbine power production, a hybrid intelligent system has been proposed to improve the accuracy and efficiency of wind turbine power prediction. As the initial step, the hybrid K-means-LOF and EMD methods have been applied as pre-processing to remove the outliers and to decompose the SCADA data, respectively. Then, the processed data were given to the forecasting model (FGMDH), and the future power of the wind turbine was calculated. Furthermore, in order to complete the proposed model as a parallel calculation, the GWO algorithm has been used to optimize the FGMDH parameters. In this study, the SCADA data for two wind turbines in a real wind farm located in Sweden have been used to measure the performance and reliability of the proposed model.

The new forecasting model has been applied to predict the power of wind turbines for two time intervals ahead (10-min and 1-h) over 1 year and in different seasons. The obtained results showed that the proposed method (EMD-FGMDH-GWO) has higher accuracy and reliability at different time intervals than many other available methods such as GRNN, FGMDH, MI-CNN, MRMR-HNES, and MI-CNEA. The MAPE error indicator obtained for GRNN, FGMDH, MI-CNN, MRMR-HNES, MI-CNEA, and the proposed model is equal to 20.818, 11.883, 3.981, 5.105, 5.754, and 3.012, respectively. In addition, the proposed forecasting model can be extended and applied to different energy sources to maximize the use of renewable energy sources and better manage their use. In future studies, the proposed model will be extended to analyze different energy sources in one area simultaneously.

**Author Contributions:** Conceptualization, A.H.; data curation, A.H., M.M.N., and M.N.; investigation, A.H., M.M.N., and M.N.; methodology, A.H., M.N., and M.M.N.; validation, A.H., M.M.N., M.N., D.A.G., and F.K.; visualization, A.H., M.M.N., D.A.G., and F.K.; writing—original draft, A.H., D.A.G., F.K., and L.B.T.; writing—review and editing, A.H., D.A.G., F.K., and L.D.S.; supervision, D.A.G., F.K., L.D.S., and L.B.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


**Salah Vaisi 1,\* , Saleh Mohammadi 1,2,\* and Kyoumars Habibi <sup>3</sup>**


**Abstract:** District heating (DH) has a major potential to increase the efficiency, security, and sustainability of energy management at the community scale. However, there is a huge challenge for decision makers due to the lack of knowledge about thermal energy demand during a year. Thermal energy demand is strongly dependent on the outdoor temperature, building area, and activities. In this context, this paper presents an innovative monthly thermal energy mapping method to calculate and visualize heat demand accurately for various types of buildings. The method includes three consecutive phases: (i) calculating energy loss, (ii) completing a dataset that includes energy and building information, and (iii) generating the monthly heat demand maps for the community. Determining the amount of demand and the best location for energy generators from the perspective of energy efficiency in a DH system in an urban context is one of the important applications of heat maps. Exploring heat demand characteristics and visualizing them on maps is the foundation of smart DHs.

**Keywords:** district heat network; thermal energy modelling; heat map; university campus; Ireland; GIS energy mapping; building heat demand and generation; heat balance

### **1. Introduction**

Cities are responsible for more than 50% of total global energy demand [1], while a huge portion of this demand is utilized for thermal purposes to provide comfortable indoor temperatures as well as domestic hot water in buildings [2,3]. In the 'Heating and Cooling Strategy' of the European Commission (EC), the significant potential of heating demand to reduce energy consumption was highlighted on one hand, and on the other hand, district heat networks (DHN) were recommended as a successful method for decarbonizing cities [3]. District heating (DH) is a promising approach to achieve low carbon heat [4] and supports the use of various thermal sources, such as fossil fuels, renewables, and waste heat [5].

However, compared with the electrical smart grid, DHN did not develop, particularly in terms of sharing surplus heat technology. The main reason involves the complicated prediction of heat demand for various types of buildings, such as hotels, sports centers, colleges, residences, and commercial structures.

This study assessed the characteristics of a group of network members, i.e., buildings, in terms of their ever-changing heat demand during a month or even a day. The patterns of heat energy demand were investigated in further detail to generate a smart DHN (SDHN). Using a case study, the authors explored how the heat demand varies in each month at a university campus. The developed methodology helps to manage the heat demand in shorter time periods, and this achievement increases the efficiency as well as the security of a DHN. This method can improve knowledge in the field of DHN, moving the field toward smart DHN (SDHN). SDHN can apply all types of energy sources, including fossil fuels, renewables, or hybrids. The methodology provides a platform to share the extra heat between the network members.

**Citation:** Vaisi, S.; Mohammadi, S.; Habibi, K. Heat Mapping, a Method for Enhancing the Sustainability of the Smart District Heat Networks. *Energies* **2021**, *14*, 5462. https://doi.org/10.3390/en14175462

Academic Editors: Benedetto Nastasi and Meysam Majidi Nezhad

Received: 18 July 2021 Accepted: 26 August 2021 Published: 2 September 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Understanding the amount and pattern of heat consumption of the studied campus was the main objective of the research. The efficiency of heat consumption in the individual heat boilers was assessed and compared with the efficiency of the DH system.

In addition, based on the completed energy building database (EBD), a GIS-based tool, i.e., district heat balance (DHB) tool, was developed and applied to generate heat maps. Then, the heat maps were used to discover thermal demand density, the baseload, and the peak load of case study buildings. The main purpose of heat maps is to understand heat consumption (size) and geo-scattering of thermal anchor loads. Heat maps are used to extract effective policies to reduce thermal energy at the community scale. Determination of the best location for central heat generators in an urban context from the perspective of energy efficiency is one of the basic advantages of the heat maps. Assessing the criteria, such as minimum heat loss and costs and maximum efficiency, as well as the optimum land cost to determine the best location for heat generators, could be investigated in future studies.

### **2. Background**

### *District Heat Network (DHN)*

Over the past century, four generations of district heat network (DHN) have been developed. The first and second generations were used before 1980 and circulated steam or pressurized water at over 100 °C. The third generation (3G) circulates water at medium to high temperature (80–100 °C). These three generations are referred to as high-temperature district heat networks (HTDH). Since 2008, the fourth generation (4G) has introduced low-temperature district heating networks (LTDH), which circulate water between 30 and 70 °C. A study [6] has suggested that future grids may use 4G distribution networks with annual mean temperatures of 50 and 20 °C for supply and return, respectively. References [7,8] have assessed 'ultra-low temperature networks' with temperatures below 50 °C. In this method, high-efficiency heat pumps can be used at the endpoint (building boundary) if heat at higher temperatures is required.

The fifth generation (5G) of DHN is integrated with renewable energy resources and relies on smart technologies [9]. The 5G systems are decentralized, bi-directional, close to ground temperature networks that apply direct exchange of heat, combined heating and cooling, cold return flows, and thermal storage to balance thermal demand as much as possible [10,11]. The 5Gs focus on the decentralized heat generators installed at customer stations. The success of this system highly depends on the accurate prediction of heat demand.

Spatial analysis is essential to control DHN potential since its applicability depends on the local characteristics of heat demands and patterns. Recently, Ireland started moving toward the application of DHN when its first spatial energy demand analyses were produced for South Dublin in 2015 [12]. Nevertheless, the predominant heating method in the country is local central heating (mostly individual boilers) and uses fossil fuel.

Most of the studies on DHN focused on the water temperature, and others [13–15] presented a heat atlas that illustrated heat supply and demand at a large scale, such as a city without focusing on individual buildings' demands. A group of studies assessed the length of network pipes using mixed integer linear programming (MILP) solvers [16], a minimum spanning tree algorithm [17], or similar optimal network decision methods to reduce heat losses during the heat distribution process [18].

Based on actual heat demand data, DH managers can design efficient DH plants and consequently reduce the energy price [19]. The total daily space heating (SH) demand for each geographical zone was calculated based on the proportion of day-specific heating, in degree days [20]. Some studies also have been conducted to evaluate heat losses in DH systems by efficient transmission loads along a distribution network [21,22] as well as determining the location of thermal energy demands [23,24]. Discovering the location of huge thermal demands (large building size and high demand) in terms of DH network energy efficiency is crucial.

The authors of [25,26] developed an algorithm to evaluate design heat demand of a future urban context as a pathway for decision-makers. In references [27,28] using three building case studies, a detailed investigation was presented on building consumption profile modeling. A certain load profile was used for all buildings in the same category. However, the buildings in an urban context are not necessarily classifiable in one group.

Based on the studies conducted across Europe in terms of domestic hot water demand, Harney et al. [29] classified the demand into four categories, including short draw (e.g., handwashing), medium draw (e.g., dishwashing), shower bath, and bathtub. Each demand class was responsible for 14%, 36%, 40%, and 10% of the total volume of daily domestic hot water consumption, respectively.

Nicholas Fry [30] developed a method for mapping building-level heat demand for three US demonstration municipalities in Montana, Idaho, and Washington. The author defined nine building categories, such as single family residential, office and retail, multifamily, school/campuses, restaurants, and hospital–healthcare. According to this study, a hospital's annual heat demand is very much lower than that of a single family residence. However, the Chartered Institution of Building Services Engineers guide CIBSE TM46 [31] has suggested that both of them need 420 kWh/m<sup>2</sup>/year.

One of the best references for heat and electricity demand at the building level is CIBSE TM46 [31], which presents the heat demand of 29 building categories based on local weather, building conditioned area, and building type. For example, a school, a university building, a hotel, a general office, and a restaurant need 150, 240, 330, 120, and 370 kWh/m<sup>2</sup>/year of thermal energy, respectively. Compared with other published research, TM46 covers a comprehensive number of building types with a high level of accuracy. It also provides a method to adjust energy demand patterns, for instance, to account for the difference between full-time and part-time usage of a building. Nevertheless, TM46 does not present the demand for monthly or shorter periods.
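As a small worked illustration of how such benchmarks are used, the annual thermal demand of a building can be estimated by multiplying the benchmark by the conditioned floor area. The sketch below uses only the five benchmark values quoted above; the function name and the 1000 m<sup>2</sup> floor area are hypothetical, and the adjustment for weather or occupancy mentioned in the text is not modelled.

```python
# Annual thermal benchmarks quoted in the text from CIBSE TM46, in kWh/m2/year.
BENCHMARKS_KWH_PER_M2 = {
    "school": 150,
    "university building": 240,
    "hotel": 330,
    "general office": 120,
    "restaurant": 370,
}


def annual_heat_demand_kwh(building_type, floor_area_m2):
    """Benchmark-based estimate: benchmark (kWh/m2/year) times floor area (m2)."""
    return BENCHMARKS_KWH_PER_M2[building_type] * floor_area_m2


# Hypothetical hotel with 1000 m2 of conditioned floor area.
hotel_demand = annual_heat_demand_kwh("hotel", 1000)
```

This gives 330,000 kWh per year for the hypothetical hotel. Because the benchmark is annual, it says nothing about how that total is distributed across months, which is the gap the monthly heat maps are meant to address.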

Hanmer et al. [32] studied the heat demand patterns in UK residences and explored how social routines, e.g., the timing of work and school, affect heat consumption. Based on actual data and interviews, the authors found two heat consumption peaks, at 07:00 and 19:00, when people are leaving home for work and switching off boilers at nighttime.

Directive (EU) 2019/944 determines and regulates the formation of citizen low energy communities. Allowing a broader diffusion of on-site renewable energy sources, for example solar thermal, is believed to be a key factor in decreasing buildings' carbon footprint by generating renewable energy and maximizing its self-consumption [33]. Several studies investigated solar power community energy, but few studies focused on sharing the extra solar thermal hot water at the community scale. Solar thermal hot water generation varies during a day and a year. Solar heat demand and generation, as well as the balance between them, can be visualized using daily or monthly heat maps. Addressing the gap in this area could improve the security of SDHNs.

On the whole, the studies in the field of DHN could be classified into three categories: (i) water temperature and the related technology, (ii) network characteristics, members (buildings), and their efficiency as well as cost-effectiveness, and (iii) heat supply and demand, energy resources, smart networks [34], and technology [35].

The reviewed literature highlighted the role of smart heat networks and their efficiency, while heat demand and its consumption patterns are the fundamental part of such networks. The study of heat demand or supply maps and research in this field will help to achieve a zero-carbon city. Open access to energy consumption and buildings data is fundamental to improve energy assessments and mapping. Openness is the best way to speed up research by public administrations, governments, and scholars. Therefore, open access promotes the construction of a reliable and robust dataset [36]. The lack of open access to energy data is the main obstacle facing research in the DHN area.

### **3. Methods and Datasets**

### *3.1. The Case Study*

Trinity College Dublin (TCD), Ireland, founded in 1592 and located in Dublin city center [37], was used as a case study. Figure 1 shows the boundary of the TCD campus. The campus area is approximately 172,000 m<sup>2</sup> and its perimeter is about 1718 m. There are 68 buildings located on the campus, with a footprint area of 51,615 m<sup>2</sup> . The overall building area is approximately 156,665 m<sup>2</sup> . The footprint area shows the area of land used by the ground floor of a building on the campus site, while the overall building area includes the area of all floors, such as the basement floor. The scattering of the 68 case study buildings on the campus is shown in Figure 1.

**Figure 1.** The boundary of TCD campus.

### *3.2. Visualizing Current TCD Campus Heating Method*

The heating systems at TCD can be divided into two groups: (1) district heating (DH) system, and (2) individual boiler systems. These heating systems are not connected, so that each system is run, managed, and serviced independently. The first system applies in the west part of the campus (yellow color in Figure 2) and serves the Buttery Restaurant; Atrium; Houses 4, 10, 12, 14, 24, 26, and 35; Graduate Memorial Building; Provost's House; Public Theater; Reading Room; and Dining Hall. Individual boiler systems, on the other hand, serve the rest of the buildings on the campus. The size of green circles on the map (Figure 2) shows the number of boilers installed in each building. For example, in Berkeley Library, six boilers were installed. As visualized, it will be possible in the future to connect the individual boilers to create second or third DH zones on the campus.

To generate a robust base for a DH system, energy models and maps are essential [38]. Since a DH relies on local thermal energy demand densities as well as on resources, energy maps that involve detailed information at the building scale are a prerequisite. The DH heating system in TCD was compared with individual boiler systems.

**Figure 2.** The current heating systems of TCD.

### *3.3. Assessment of Heat Losses at TCD Campus*

Based on the actual data shared by the Estates Office of TCD, the overall annual heat loss of the boiler systems was calculated. In addition, the heat loss of each boiler was assessed. According to the ArcGIS analysis, the overall useful area of the buildings serviced by the DH system was 43,793 m<sup>2</sup> . According to the CIBSE TM46:2008 methodology [31] and the monthly thermal energy models (MTEMs) developed in [39,40], the buildings needed (in 2012) 11,312,468 kWh thermal energy per year. These buildings occupy approximately 14,916 m<sup>2</sup> of the campus land (the overall area of the footprint of buildings). The detailed information, including the area and thermal energy demand of all buildings heated by the DH system, is presented in Table 1.

**Table 1.** Information of the buildings heated by the DH system.


The Estates Office of TCD reported in 2015 that the boilers were switched off at nighttime, between 7:00 p.m. and 6:00 a.m., throughout the year [41]. The results of the survey show that the temperature of water in the boilers' storage tanks was 70 °C before switching off, but 20 °C in the morning. The 50 °C difference between the evening and morning water temperatures indicates a significant heat loss. The Estates Office switched off the boilers because at nighttime there was nearly zero demand (kWh) for space heating or hot water in many buildings. This was done as part of the campus policy for energy efficiency. The policy was also implemented during holidays and weekends.

Based on the volume of water associated (i.e., the volume of the tanks), and other necessary information (Table 2), the monthly and annual heat losses were calculated. Equation (1) was used to calculate the heat losses.

$$Q = c \times M \times (T_2 - T_1) = c\,M\,\Delta T \tag{1}$$

where Q is the amount of heat loss, c is the specific heat of water (c = 4.1868 J/(g·°C)), M is the water mass (obtained from the tank volume in liters, with 1 L of water ≅ 1000 g), and T<sub>2</sub> − T<sub>1</sub> = 50 °C.
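As a rough illustration of Equation (1), the following sketch converts a tank volume into the heat lost during one overnight cool-down. The function name `heat_loss_kwh` and the 1000 L example tank are assumptions made for illustration only; the actual tank volumes are those listed in Table 2.

```python
# Minimal sketch of Equation (1): Q = c * M * (T2 - T1).
# The tank volume in the example is hypothetical, not a value from Table 2.

SPECIFIC_HEAT_J_PER_G_C = 4.1868  # specific heat of water, J/(g.degC)
GRAMS_PER_LITRE = 1000.0          # 1 L of water is taken as about 1000 g
JOULES_PER_KWH = 3.6e6            # 1 kWh = 3.6 MJ


def heat_loss_kwh(volume_litres, delta_t_c=50.0):
    """Heat lost (kWh) when `volume_litres` of water cools by `delta_t_c` degC."""
    mass_g = volume_litres * GRAMS_PER_LITRE
    q_joules = SPECIFIC_HEAT_J_PER_G_C * mass_g * delta_t_c
    return q_joules / JOULES_PER_KWH


if __name__ == "__main__":
    # Hypothetical 1000 L storage tank cooling from 70 degC to 20 degC overnight.
    print(f"{heat_loss_kwh(1000):.2f} kWh per night")
```

Under these assumptions, each 1000 L of stored water that cools by 50 °C corresponds to roughly 58 kWh of lost heat per night; summing over all tanks and nights would give monthly and annual totals.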


**Table 2.** The calculated heat loss of boilers at TCD campus.

According to the calculations, the overall annual heat loss of TCD boilers was approximately 1679 MWh. This is a huge heat loss and needs to be addressed properly. The DH system in the west of TCD did not follow the daily switching on/off strategy because at nighttime it serves a group of buildings, e.g., residential accommodations, that need hot water continuously. In other words, the mixed-use functions of the buildings in the network, such as residential use with its continual heat demand, meant that the DH system did not need to be turned on and off daily. The purpose of calculating the heat losses from the individual boiler systems was to indicate that the other system, i.e., the DHN, is more efficient.

### *3.4. Calculation of Thermal Demand*

A university campus is a sample of a community with various buildings that have different thermal demands and patterns. The amount of heat demand is modified based on the activities in buildings [40], outdoor temperature, and building area. Therefore, calculating the accurate amount of thermal energy demand based on these factors is fundamental. The actual heat demand data applied in the assessments related to both space heating and domestic hot water. The monthly heat demand calculation was addressed comprehensively by applying valid monthly thermal energy models (MTEMs) [39] for typical educational buildings, such as a college. However, a campus also includes other types of buildings, which are important from the heat demand perspective. Sports centers, libraries, laboratories, amphitheaters, shops, and restaurants are usually found on a campus, and they need a large amount of heat energy per unit area. For example, a restaurant needs thermal energy of 370 kWh/m<sup>2</sup> per year [31].

The other types of buildings on a campus except for the typical college buildings, such as a restaurant, were called non-UC buildings, with various heat demand sizes and patterns. The monthly heat demand estimations in non-UC buildings are also essential to quantify daily or monthly thermal energy demand at the campus scale. The heat map (HM) methodology developed in this paper considered all types of buildings on a campus when calculating the heat demands. The HM method and the capability of the DHB tool are explained in further detail in the following sections.

In fact, non-UC building types comprise 28 categories, as defined by CIBSE TM46 [31]. These 28 categories cover all types of buildings on a campus, and they include many building types, for example, sport center, swimming pool, restaurant, lab, library, shop, etc. Each type has its special thermal demand benchmark. To calculate the amount of mean monthly thermal energy demand of non-UC buildings, the annual thermal energy benchmarks based on the dominant function (single function) of the buildings were derived from TM46 and divided by 12 (number of months in a year). Then the result was multiplied by the total useful floor area (TUFA) of that building, which was obtained by survey. Using Equation (2), the mean monthly heat demand was calculated for non-UC building types.

The mean monthly thermal energy demand for a non-UC building (Q<sub>1</sub>) with dominant function (i) is:

$$Q_1 = \frac{\text{TM46 benchmark}(i)}{12} \times A \tag{2}$$

where benchmark (i), in kWh/m<sup>2</sup>/yr, refers to the dominant function of the building, and A is the total useful floor area (m<sup>2</sup>) of a given building.
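Equation (2) can be sketched in a few lines of code. The benchmark values below are the annual TM46 figures quoted in the literature review of this paper; the function name `mean_monthly_demand_kwh` and the 1200 m<sup>2</sup> floor area in the example are hypothetical and serve only to show the arithmetic.

```python
# Sketch of Equation (2): Q1 = (TM46 benchmark(i) / 12) * A.
# Benchmarks are the annual values quoted in the text (kWh/m2/year);
# the floor area used in the example is hypothetical.

TM46_BENCHMARK_KWH_M2_YR = {
    "school": 150,
    "university building": 240,
    "hotel": 330,
    "general office": 120,
    "restaurant": 370,
}


def mean_monthly_demand_kwh(dominant_function, tufa_m2):
    """Mean monthly heat demand (kWh) of a non-UC building.

    dominant_function: key into TM46_BENCHMARK_KWH_M2_YR
    tufa_m2: total useful floor area of the building in m2
    """
    benchmark = TM46_BENCHMARK_KWH_M2_YR[dominant_function]
    return benchmark / 12 * tufa_m2


if __name__ == "__main__":
    # Hypothetical restaurant with 1200 m2 of useful floor area.
    print(round(mean_monthly_demand_kwh("restaurant", 1200)))
```

Because the annual benchmark is simply divided by 12, this estimate is flat across the year; it does not capture seasonal variation, which is why the MTEMs are used for the typical college buildings.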

A hybrid heat map (HM) methodology was developed as a GIS-based method suitable for the analysis and management of heat demand or surplus at the community scale, in this case, a university campus. Hybrid methodology means the combination of both the MTEMs and TM46 methods. HM was also applied to assess the efficiency and feasibility of district heating (DH) systems at the case study campus.

### *3.5. Development of the Smart District Heating Network (SDHN) Dataset*

The district heat balance (DHB) tool developed in this research is applicable for calculating heat demand and generation at daily or monthly resolutions. In addition, it can share a lot of valuable information about the buildings and energy analysis. If a user of the DHB tool clicks on a given building in ArcGIS, for example on the Lloyd Building, a window will open that delivers valuable information in a table format, as presented in Figure 3. The information can be used for multiple purposes, for instance, energy efficiency planning to reduce fossil energy consumption at the neighborhood scale. The DHB tool shares useful information such as the monthly heat demand analysis data, dominant and mixed activities in the building, gross heat density (H\_UEDg), footprint heat density (H\_UEDf), building floor area, building total area, number of floors, and the annual thermal energy demand of the case study buildings.


**Figure 3.** Screenshot of attribute table of DHB tool, energy and building information.

### **4. Results and Discussion**

### *Monthly Heat Maps for TCD Campus*

The DHB tool is a GIS-based tool developed in this research to generate monthly thermal energy maps. In ArcGIS, the DHB tool was linked with EBD (energy building database), which includes all the energy data, such as display energy certificate information, monthly thermal energy models (MTEMs) data, monthly thermal energy benchmarks (MTEBs), and heat density. It also includes all the information obtained from the survey. The attribute table of the DHB tool comprises 120,000 data cells, which were used to generate the monthly heat maps.

To generate the monthly heat demand maps for the case study buildings, two methods were combined. The method of monthly thermal energy models (MTEMs) [39] was applied to typical college buildings (PTC types), which did not need heating in summer. The CIBSE TM46 [31] method was applied to calculate the energy demand of non-UC buildings (CTC types). In ArcGIS, both methods were combined, and the heat maps were generated. Further information about these methods is given in Section 3.4.

The analysis of the thermal energy map of TCD in January 2012 revealed that the highest thermal energy demand among the buildings on the campus was 330,748 kWh belonging to the Arts Building, followed by the Panoz Institute and Lloyd Building, with 231,550 and 225,250 kWh, respectively (Figure 4). The lowest demand, however, was approximately 6500 kWh belonging to a storage facility building in the north of the campus behind the Simon Perry Building. In the figure, the darker colors show a higher thermal energy demand and lighter colors a lower thermal energy demand. NCC refers to 'No Condition Control' spaces or unconditioned spaces such as a garage.

**Figure 4.** The heat demand (HM) map of January.

The key characteristic of a spatial thermal demand map is to identify the locations of potential anchor thermal energy loads. Anchor thermal energy loads are high heat demand buildings, such as swimming pools. In addition to the high-density thermal energy load, the other parameter that should be considered in the identification of an anchor load is the continuing demand pattern. For instance, a building with high thermal energy density during a short period, such as a month or even a season, is not a thermal anchor load.

Detailed analyses of the monthly fossil thermal energy demand of TCD buildings were undertaken using the DHB tool and, as a sample, the demands in August are shown in Table 3 at the building level. The analyses were presented at the building level; however, they can be presented at a larger level, such as at a campus or an urban context level.

**Table 3.** The thermal energy demand in August at the building level.


The HM of TCD in July 2012 is presented in Figure 5. According to the map, the thermal energy demands of UC buildings (typical college buildings) at TCD were zero; however, the non-UC buildings with functions such as residential, Sports Center (swimming pool), restaurants, and library needed thermal energy during this month. For the definition and typical characteristics of a college building, refer to [40]. According to the HM of July, the highest thermal energy demand belonged to the Dining Hall with 186,384 kWh, followed by the Sports Center with 177,854 kWh.

**Figure 5.** The heat demand map of July.

According to the ArcGIS statistical analysis, the overall thermal demand of non-UC buildings in July 2012 was 1,529,438 kWh. In addition, the average daily thermal energy demand was approximately 51,000 kWh, which indicates the baseload (the lowest daily load) of the campus. On the other hand, the peak load (monthly and daily) can be derived from the analysis of the January heat map presented in Figure 4. Based on the statistical assessments, the thermal energy demand of TCD in January 2012 was approximately 3,944,680 kWh, which shows an average daily demand of 131,000 kWh. The difference between baseload and peak load was 80,000 (kWh/day), which refers to the space heating.
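The baseload and peak-load figures above follow from simple division of the monthly totals. The sketch below reproduces them under one assumption that is not stated in the text: a 30-day month is used as the divisor, because that value reproduces the rounded daily figures quoted in the paragraph. The helper names are hypothetical.

```python
# Sketch: daily baseload and peak load derived from the monthly totals.
# Assumption: a 30-day month, chosen because it reproduces the rounded
# daily figures quoted in the text.

JULY_2012_KWH = 1_529_438     # non-UC demand in July 2012 (from the text)
JANUARY_2012_KWH = 3_944_680  # campus demand in January 2012 (from the text)
DAYS_PER_MONTH = 30           # assumed divisor


def mean_daily_kwh(monthly_kwh, days=DAYS_PER_MONTH):
    """Average daily demand (kWh/day) for a monthly total."""
    return monthly_kwh / days


def round_to_thousand(value):
    """Round to the nearest 1000 kWh, as the text reports the daily figures."""
    return int(round(value, -3))


baseload = round_to_thousand(mean_daily_kwh(JULY_2012_KWH))
peak_load = round_to_thousand(mean_daily_kwh(JANUARY_2012_KWH))
space_heating = peak_load - baseload  # share attributed to space heating

if __name__ == "__main__":
    print(baseload, peak_load, space_heating)
```

The difference between the two rounded daily values is the portion of the January load that the text attributes to space heating.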

This detailed information, i.e., baseload and peak load, which was obtained from the analysis of the monthly heat maps, is crucial in designing a DH system. Such information is very valuable at an urban scale. For example, understanding the thermal density extracted from the HM helps urban planners to determine the optimal location for energy plants near the anchor loads. This strategy reduces the energy loss of hot water in a DH system because of the short distance between the energy plant and the energy consumers, i.e., the anchor loads. In addition, knowing the base and peak heat loads is useful in the design of a DH system plant in terms of water tank capacity and power. For instance, such evidence could be used by mechanical engineers to size plants efficiently.

By comparing both January and July maps (2012), essential information regarding the anchor loads of the campus was obtained. The maps revealed the location of both anchor loads, i.e., Dining Hall in the northwest of the campus and Sports Center on the opposite side, northeast.

In addition, by analysis of the July heat map (Figure 5), two classes of buildings were defined from the heat demand pattern perspective: (i) continual thermal consumers (CTC), i.e., all buildings on the map except the blue-colored and NCC ones, and (ii) periodic thermal consumers (PTC), i.e., the blue-colored buildings. PTC buildings need very low or nearly zero thermal energy during summers or at nighttime and holidays. Typical college buildings are an example of this class. CTC and PTC can be used as a foundation for thermal energy balance analysis.
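The CTC/PTC split can be sketched as a simple rule on the July demand of each building. This is an illustrative simplification rather than the procedure used in the study: it considers only the summer month (not nighttime or holidays), the cut-off `NEAR_ZERO_KWH` is a hypothetical value for "very low or nearly zero", and the zero-demand entry is a placeholder for a typical college building.

```python
# Sketch: split buildings into CTC and PTC classes from their July demand.
# The cut-off and the zero-demand entry are hypothetical illustration values.

JULY_DEMAND_KWH = {
    "Dining Hall": 186_384,         # from the July heat map
    "Sports Center": 177_854,       # from the July heat map
    "Typical college building": 0,  # hypothetical placeholder entry
}

NEAR_ZERO_KWH = 1_000  # hypothetical cut-off for "very low or nearly zero"


def classify_consumer(july_kwh, near_zero_kwh=NEAR_ZERO_KWH):
    """Return 'PTC' for near-zero summer demand, otherwise 'CTC'."""
    return "PTC" if july_kwh <= near_zero_kwh else "CTC"


classes = {name: classify_consumer(kwh) for name, kwh in JULY_DEMAND_KWH.items()}

if __name__ == "__main__":
    for name, label in classes.items():
        print(f"{name}: {label}")
```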

The detailed analysis of thermal energy demand of both classes of buildings indicates the footprint area of PTC was 28,234 m<sup>2</sup> , whereas the footprint area of the CTC class was 21,835 m<sup>2</sup> . The useful building area of PTC was 87,049 m<sup>2</sup> while the useful building area of CTC was 60,852 m<sup>2</sup> . Accordingly, the annual thermal energy consumption of the PTC class was approximately 14,640 MWh, while the annual consumption of the CTC class was 16,657 MWh. The monthly thermal energy consumption of PTC buildings at the campus scale is presented in Figure 6.
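To compare the two classes on an equal footing, the annual consumption can be divided by the useful building area. The sketch below does this with the figures quoted above; the resulting intensities are derived here for illustration, are not values reported in the source, and are not necessarily defined in the same way as the H\_UEDg attribute of the DHB tool.

```python
# Sketch: annual heat demand per unit of useful floor area for each class.
# Inputs are the figures quoted in the text; the intensity is a derived,
# illustrative quantity and not a result reported in the source.

CLASS_DATA = {
    # class: (annual consumption in MWh, useful building area in m2)
    "PTC": (14_640, 87_049),
    "CTC": (16_657, 60_852),
}


def annual_intensity_kwh_m2(annual_mwh, useful_area_m2):
    """Annual heat demand intensity in kWh/m2/year."""
    return annual_mwh * 1000 / useful_area_m2


intensity = {
    name: annual_intensity_kwh_m2(mwh, area)
    for name, (mwh, area) in CLASS_DATA.items()
}

if __name__ == "__main__":
    for name, value in intensity.items():
        print(f"{name}: {value:.1f} kWh/m2/year")
```

On these figures, the CTC class uses noticeably more heat per square meter than the PTC class, which is consistent with its year-round demand.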

**Figure 6.** The monthly thermal energy consumption of PTC buildings.

To explore the daily heat demand patterns, the hourly actual heat consumption data were obtained from the TCD energy database. Based on these hourly data, the daily heat demand pattern of case study buildings was discovered. Through the assessment of daily consumption data of PTC buildings, the daily energy demand pattern was extracted as shown in Figure 7.

**Figure 7.** The thermal energy consumption pattern of PTC buildings, Aras An Phiarsaigh, December 2013 [42].

The thermal energy consumption pattern of the PTC class depends not only on the outdoor temperature and its variation through the year, but also on the attendance timetable of students and staff. For example, at nighttime, from 6:00 p.m. when there were few students, the boilers at TCD were switched off. This strategy was also observed on weekends and holidays, as shown in Figure 7. The actual daily heat consumption data were obtained from the Cylon Active Energy Management online dataset [42].

Further assessment of the heat maps revealed that the energy plant of the DH system at TCD is located beside an anchor load, i.e., the Dining Hall (Figure 8), which is a sound decision in terms of reducing heat losses when hot water circulates between the heating plant and the anchor load. The shorter distance results in shorter piping, lower repair risk, lower water leakage risk, lower heat losses, less thermal insulation, and consequently lower expenses. The closeness of the heating plant/generator to the anchor loads (consumers) is a key criterion that should be considered in land development in future urban planning. Determining the best location for energy generators from the perspective of energy efficiency in a DH system at the community/urban scale is one of the important applications of heat energy maps.

**Figure 8.** Location of the anchor load and heating generator in the DH system.

According to the analyses of the heat maps of May and June (Figures 9 and 10), July (Figure 5), October (Figure 11), and the rest of the year (Figure 12), it can be concluded that if the DH system at TCD is expanded, an open space to the west of the Sports Center is the best location for a new DH station. Layout 1 in Figure 12 shows the analysis applied to determine the optimal location. The location was determined based on the closeness to the anchor load (Sports Center) and the availability of open space.

Based on the distance analysis presented in Layout 1 (Figure 12), the college map was divided into three sections with 241, 224, and 205 m intervals. The current DH system covers the first interval, i.e., 241 m in the west of the campus. The suggested new DH system can cover the rest of the campus with a maximum distance of 325 m, as shown in the figure. The new location is specified with a circle with an area of 415 m<sup>2</sup> .

The suggested new DH system with maximum distance of 325 m can service the buildings located in the southeast and east of the campus, such as the Parsons Building, Moyne Institute, Chemistry Building, Smurfit Institute, Panoz Institute, Hamilton Building, and Lloyd Building. Likewise, it can also serve another group of the buildings located in the north of the campus with a distance of 224 m. Samuel Beckett Theater, Aras An Phiarsaigh, and Simon Perry Building are examples of buildings in this group. In the case of connecting both DH systems (the existing plant station near Dining Hall and the recommended station near Sports Center), the maximum distance between them is approximately 480 m. Establishing the recommended new DH system instead of individual boilers can save a heat loss of 1679 MWh resulting from the daily boilers' on/off strategy. Using the DHB tool, seven monthly heat demand maps of TCD were generated and are presented in Figure 13.

**Figure 9.** Heat demand map of May.

**Figure 10.** Heat demand map of June.

**Figure 11.** Heat map of October.

**Figure 12.** Determining a location for a DH thermal station.

**Figure 13.** The monthly fossil fuel heat map of TCD in 2012.

The monthly thermal energy demand of TCD at the campus scale is presented in Table 4. The overall annual thermal demand of the campus was 31,361 MWh/yr. The lowest demand at the campus scale was in July, with nearly 1529 MWh, while the highest demand was in January, with nearly 3945 MWh. The results show that the heat demand of TCD in January was approximately 2.6 times greater than that in July.

**Table 4.** The monthly thermal energy demand at the campus level.


### **5. Conclusions**

Thermal energy demand strongly depends on the building area, activity, and outdoor temperature. Modeling and visualizing the heat demand of individual buildings and, simultaneously, of the neighborhood are fundamental to enhancing the sustainability of a smart DHN. In this paper, based on monthly thermal energy models (MTEMs) and 29 CIBSE TM46 energy benchmarks, an integrated methodology for monthly heat demand calculation and mapping for a university campus was developed. The generated heat maps provide detailed information for managing the DHN smartly. Based on the developed DHB tool, the anchor and peak heat loads were calculated. Visualizing the location and size of heat demand, such as the maximum and minimum loads and their consumption patterns, is crucial for developing a DHN. According to the heat consumption patterns, two groups of buildings, continual thermal consumers (CTC) and periodic thermal consumers (PTC), were identified. Discovering this detailed information is essential for managing a DHN more sustainably. The methodology can be replicated in any urban context to assess the heat demand at both the individual building and community levels. This method paves the way toward sharing surplus thermal energy across the DHN. Future studies may focus on the sharing of heat energy between community members using smart heat measurement technologies.

**Author Contributions:** Conceptualization, S.V. and S.M.; methodology, S.V.; software, S.V. and S.M.; validation, S.V., S.M. and K.H.; formal analysis, S.V. and S.M.; investigation, S.V.; resources, S.V. and K.H.; data curation, S.V.; writing—original draft preparation, S.V. and S.M.; writing—review and editing, S.M. and K.H.; visualization, S.V.; supervision, S.M.; project administration, S.V.; funding acquisition, S.V. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data sharing is not applicable to this article.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


### *Article* **Evaluation of Air Quality Index by Spatial Analysis Depending on Vehicle Traffic during the COVID-19 Outbreak in Turkey**

**Kadir Diler Alemdar <sup>1</sup> , Ömer Kaya <sup>1</sup> , Antonino Canale <sup>2</sup> , Muhammed Yasin Çodur <sup>1</sup> and Tiziana Campisi 2,\***


**Abstract:** As in other countries of the world, the Turkish government has implemented many preventive partial and total lockdown practices against the infectious effect of the virus. When the first virus case was detected, the public authorities introduced restrictions to reduce the mobility of people and traffic, which also had a positive effect on air quality. To this end, the paper aims to examine how the pandemic has affected traffic mobility and air quality in Istanbul. The pandemic does not only have a human health impact; this study also investigates its social and environmental effects. In our analysis, we observe, visualize, compare and discuss the impact of the pre- and post-lockdown periods on Istanbul's traffic mobility and air quality. To do so, a geographic information system (GIS)-based approach is proposed. Various spatial analyses are performed in GIS with the statistical data used; thus, the environmental effects of the pandemic can be better observed. We test the hypothesis that the lockdown reduced traffic mobility and improved air quality using a traffic density cluster set and data from air monitoring stations (five air pollutant parameters) for five months. The results show that there are positive changes in terms of both traffic mobility and air quality, especially in April–May. The PM<sub>10</sub>, SO<sub>2</sub>, CO, NO<sub>2</sub> and NO<sub>x</sub> parameter values improved by 21.21%, 16.55%, 18.82%, 28.62% and 39.99%, respectively. In addition, there was a 7% increase in the average traffic speed. In order for the changes to be permanent, it is recommended to integrate e-mobility and sharing systems into the current transportation network.

**Keywords:** pollutant emission; traffic mobility; COVID-19; sustainable transportation; paired sample *t*-test

### **1. Introduction**

The terms energy, energy production and energy use have come to be used in every field, especially in recent years, as energy is a great power that directly affects all humanity. We have to use this power in a correct, environmentally friendly, innovative and sustainable way; otherwise, the irreversible consequences of climate change and global warming will endanger the future of humanity. In addition to this ongoing danger, as of March 2020, humanity faced the Coronavirus (COVID-19) pandemic [1]. Countries have taken many measures to fight this pandemic, the most important of which is the lockdown, which is thought to prevent contact and contamination. As lockdown periods lengthened, some environmental analyses began to be performed. These showed that there was a great change in air quality, especially in crowded cities and areas with high mobility [2,3]. The main reason for this change is the decrease in the mobility of conventional motor vehicles used in urban transportation and in the emissions caused by transportation [4–6]. The reduction in emissions has many positive effects in terms of both human health and the environment. However, to make this effect permanent, it is vital to encourage and expand the use of sustainable energy and transportation types in urban transportation.

**Citation:** Alemdar, K.D.; Kaya, Ö.; Canale, A.; Çodur, M.Y.; Campisi, T. Evaluation of Air Quality Index by Spatial Analysis Depending on Vehicle Traffic during the COVID-19 Outbreak in Turkey. *Energies* **2021**, *14*, 5729. https://doi.org/10.3390/en14185729

Academic Editors: Benedetto Nastasi, Meysam Majidi Nezhad and John M. Cimbala

Received: 2 July 2021 Accepted: 9 September 2021 Published: 11 September 2021

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Istanbul, which has the highest human and vehicle mobility in Turkey [7], was selected as the study area, and the change in air quality, traffic density and vehicle speeds pre- and post-lockdown were analyzed.

The study seeks answers to two main research questions in which the pre- and post-lockdown situation is discussed. These are:


In this analysis, traffic density and vehicle speed data measured by sensors placed throughout Istanbul by the local municipality, as well as air quality index values obtained from air quality measurement stations, were considered. The data used can be classified as big data. Geographic Information Systems (GIS) were used both to handle the size of the data and to make the analysis results more understandable. Although GIS is a frequently preferred tool in the literature for various analyses of COVID-19 and for monitoring the air quality of cities, remote sensing-based machine learning has also become very popular in recent years. GIS and remote sensing techniques are widely applied to the energy field in various sectors, ranging from building construction [8] to energy management [9], as well as to mobility [10]. In the mobility sector, the adoption of new technologies related to vehicle traction (electric motors) is leading to improvements in environmental impacts [11,12]. GIS maps are used in various areas, such as monitoring and analyzing road accidents, identifying risk factors, assessing road congestion and choosing mitigation strategies [13].

Optimal control and management of mobility starts with the inclusion of electric vehicles in the national fleet, continues with monitoring by means of a series of sensors installed in the infrastructure (such as video cameras, sensors for environmental and acoustic parameters, sensors for measuring road flows, etc.), and ends with the mapping of the acquired results on a GIS basis in order to assess and draw up risk maps and to consider the optimal or most critical scenarios for a city as vehicle flows change [14–16]. The recent pandemic has highlighted a number of critical issues and benefits brought about by the drastic reduction of mobility that laid the basis for an assessment of mobility development in the pre-pandemic phase, taking into account the sustainable and resilient aspect [17].

The paper is organized as follows: Section 2 presents literature studies on traffic parameters, air quality and COVID-19. Section 3 gives brief information about the materials and methods of the study. Section 4 describes the analysis with the GIS approach through a case study for Istanbul. In Section 5, the results of the study are discussed and sustainable transportation proposals are put forward. Finally, the conclusions of the study and directions for future studies are presented.

### **2. Literature Review on Vehicular Pollutant Emission Pre and Post Pandemic**

In general, scientific or public questions about COVID-19 are discussed on social media and related scientific studies. Some of the questions discussed, especially in terms of transportation, are listed below [18]:


In this study, the authors focus on the second and fourth questions. Researchers have conducted many studies on how the partial and/or total lockdowns imposed to combat the COVID-19 pandemic changed air quality and traffic mobility. Different methods are used in current studies. In this section, studies on air quality change and traffic mobility are presented. Finally, the contribution of this paper to the literature is stated.

Çelik and Gül conducted a study to measure air quality during the COVID-19 pandemic. A comparison was made for the pre- and post-lockdown periods considering seven different emission parameters. While improvements were observed in the PM<sub>10</sub>, NO<sub>2</sub>, NO and NO<sub>x</sub> parameters, it was found that there was a partial deterioration in the O<sub>2</sub> parameter [19].

Şahin examined the impact of COVID-19 measures on air pollutants. In March 2020, the air pollutants PM<sub>10</sub>, PM<sub>2.5</sub>, NO<sub>2</sub>, CO and SO<sub>2</sub> were taken into consideration for the Anatolian and European sides of Istanbul, and decreases of 32–43%, 19–47%, 29–44%, 40–58% and 38–69% were found, respectively [20].

Wang and Su analyzed the impact of the COVID-19 pandemic on air quality in China. They observed that air quality in China improved, and global carbon emissions decreased. The decrease of the NO<sub>2</sub> parameter was seen first in Wuhan, and then in all cities [21]. Dantas et al. conducted a study on air quality in Brazil. When the partial lockdown period and the same period of the previous year are compared in terms of air quality, the median values of NO<sub>2</sub> and CO are 24.1–32.9% and 37.0–43.6% lower, respectively [22].

Tian et al. examined the impact of the pandemic in Canadian cities in terms of urban transportation and air pollution. Fuel consumption and CO<sup>2</sup> emission values were taken into consideration. Due to partial lockdown, fuel consumption and estimated CO<sup>2</sup> values declined to very low levels in April 2020. However, it started to rise again in May 2020. Furthermore, while NO<sup>2</sup> and CO parameters are found to be strongly associated with COVID-19, the situation is not the same for the SO<sup>2</sup> parameter [23].

Parker et al. observed that traffic mobility decreased by up to 50% as a result of the pandemic in the Southern California region. When the 19 March–30 June period of the last five years are compared with the same period in 2020, there are significant decreases in PM2.<sup>5</sup> and NO<sup>x</sup> parameters [24].

Gualtieri et al. analyzed changes in pollutant and greenhouse gas emissions due to pandemic restrictions. It was found that urban road traffic in Italy has decreased by 48–60%. For comparison, the NO2, O3, PM2.<sup>5</sup> and PM<sup>10</sup> parameters on 24 February 2020–30 April 2020 and 25 February 2019–2 May 2019 were evaluated for six cities. There was an improvement of 59.1% in the NO2, 17% in the PM2.<sup>5</sup> and 32.1% in the PM10. An increase of 14.7% was observed in the O<sup>3</sup> parameter [25].

Marinello et al. examined the traffic flow and air quality in a case study in Northern Italy. The consideration period is February, May 2019/2020. The results showed that the number of vehicles in traffic decreased by up to 82% in 2020. The decreases in NO<sub>2</sub> and CO emissions were above 30% and 22%, respectively. On the other hand, an increase of 13% was observed in the O<sub>3</sub> parameter [26].

Chen et al. examined the effects of travel restrictions on air pollution in 49 cities in China. They found that the negative impact of private vehicle usage on air pollution decreased during the pandemic. Significant improvements were observed in the PM<sub>2.5</sub>, PM<sub>10</sub>, SO<sub>2</sub>, NO<sub>2</sub>, CO and O<sub>3</sub> parameters [27].

Patra et al. studied short-term changes in road traffic patterns in the city of Chennai (India). It has been observed that non-compulsory travel has dropped. However, as the lockdown measures eased, road traffic started to increase. It can be stated that total lockdown is most effective in reducing road travel activity, but a partial lockdown can only provide temporary benefits [28].

Hicks et al. examined the effects of lockdowns on exhaust gas emissions in London. During the lockdown periods, a 32% decrease in the traffic volume on the Marylebone road and 15% increase in the average speed were observed. Thus, it has been revealed that vehicle emissions have also decreased [29].

Teufel et al. developed several simulations to measure the effect of reduced traffic-related heat emissions on urban temperature characteristics in the COVID-19 period. The simulation results indicate that an 80% reduction in traffic density would reduce temperatures by 1 °C on average in the city of Montreal (Canada) [30].

Boroujeni et al. conducted a study to examine the public effects of the pandemic. Impacts in terms of mobility, traffic, air pollution, noise pollution and waste generation were studied. It was observed that between January 3 and February 6, mobility in public transport centers decreased by 85%. The PM<sub>2.5</sub> parameter decreased by 24% in the state of Victoria (Australia) [31].

Doucette et al. examined daily travel and accident values in Connecticut during COVID-19. After the lockdown, the distance traveled by the vehicles was reduced by 43%. When comparing the pre- and post-stay at home periods, single vehicle crash ratios increased by 2.29 times. In addition, the fatal accident severity in a single vehicle increased by 4.10 times [32].

Lee et al. examined the changes in traffic in South Korea in the first three months of 2020, during the pandemic. There was a 9.7% decrease compared to 2019. It was observed that the number of vehicles in traffic increased after the number of cases decreased [33]. Parr et al., conducting a similar study, observed that the traffic volume decreased to 47.5% compared to 2019 [34].

During and after the pandemic, several studies have been undertaken to analyze the environmental impacts generated by mobility. One study conducted in China states that correlation analysis between measured data and the ArcGIS system was useful in revealing the relationships between pollutants and seven different sources [35].

A study in Poland monitored emissions during pandemic phases by defining a heat map, i.e., a graphical illustration of the value of the tested characteristic, depending on its concentration level and size, presented with a selected color palette [36].

In addition, a study conducted in India in the cities of Kolkata and Howrah Municipal Corporation, West Bengal was designed to assess changes in air quality from the preclosure period to the closure period. This study focused on the application of GIS-based techniques (spatial and temporal distribution of pollutants) using the interpolation method and statistical methods such as analysis of variance (ANOVA) to understand the changing association of pollutants in the pre- and during-closure phases [37].

When the aforementioned studies are examined, it is seen that partial and/or total lockdown resulted in an improvement in air quality and a decrease in traffic mobility. In most of the current studies, limited and insufficient data, the simplicity of the results, the lack of spatial analysis to better understand the results and the lack of recommendations for permanent solutions are identified as gaps in the literature. To fill these gaps, this study examines the changes in traffic mobility and air quality using GIS-based spatial analysis and suggests permanent solutions for administrators.

The contributions of this study to the literature are as follows:


### **3. Materials and Method**

A GIS-based approach was developed to answer the research questions. Data were collected to analyze air quality and traffic mobility. Traffic mobility and air quality data were processed in the GIS environment, and the changes were analyzed. The stages of the study are summarized in the framework in Figure 1. First, data were collected for the study. The collected data were then transferred to the GIS environment and visualized through the analyses applied in GIS. After spatial analysis, the effects of the pandemic are easier to interpret.

**Figure 1.** Framework of the paper.

### *3.1. Study Area*

Istanbul, with a population of about 16 million, is the most urbanized city in Turkey [38]. Naturally, human mobility is quite high, and this mobility is usually provided by road. Twenty percent of the motor vehicles in the country are located in this city. Istanbul has a total surface area of 5500 km<sup>2</sup>. However, the total surface area of the regions with high activity in Istanbul is 2000 km<sup>2</sup>. These areas are in the southeast of the European side and the southwest of the Anatolian side. The study area is given in Figure 2.

**Figure 2.** Study area.

Traffic parameters in Istanbul are different from many other cities in Turkey. The reason for this is that despite its small surface area, it has a very crowded population (although there are 16 million residents, this number increases significantly in the summer season). To meet the transportation mobility of the crowded population, highway/seaway and airline transports are all used together. For this reason, Istanbul is expressed as a "transportation laboratory". In addition, 25% of the motor vehicles in Turkey and most other types of vehicles (motorcycles, shared vehicles, electric vehicles, etc.) are located in Istanbul. Micro mobility and innovative vehicle transportation applications are first launched in Istanbul. Since traffic congestion is a big problem in Istanbul, it is seen that innovative micro mobility applications are easily adopted by the public. Another reason for this is that the young population and the number of people in Gen Z is very high. For all these reasons, Istanbul can be shown as the city where the effects of the pandemic are best analysed in the country.

Due to the high number of vehicles and other activities, problems in air quality are observed. Automobiles constitute 69% of the traffic in Istanbul, and trucks have a large share, with 16%. In addition, more than 50% of the vehicles used in Istanbul are diesel vehicles [39]. In the "Air Pollution in Istanbul" report presented by the Ministry of Foreign Affairs in 2018, the causes of air pollution are listed as vehicles (traffic), industry, mining operations, agriculture and the household sector, in that order [40]; the strongest evidence of this situation is that Istanbul ranks first in the country in terms of carbon footprint. For all these reasons, it is the city where most of the COVID-19 cases in Turkey were seen. In fact, the first case and the first death in the country occurred in this city [41]. For this reason, the partial and/or total lockdown, which is an important measure in fighting the pandemic, first took place in Istanbul in the second half of March and in April–May (2020). The authors expected traffic mobility to have decreased and air quality to have improved during this period. Traffic data must be collected in real time to show the changes between the pre- and post-lockdown periods. Such data collection is not available in most cities because it is costly. However, these data are collected by the Istanbul Metropolitan Municipality (IMM) and shared for academic studies. In addition, there are 36 active air quality monitoring stations in Istanbul, and seven emission parameters are collected at these stations. Considering all these reasons, Istanbul was chosen as the study area. The first case in Turkey was recorded on March 11, and the first death occurred on March 17. The partial and/or total lockdown was implemented throughout the country from the second half of March to June. Within the scope of the study, certain periods were considered to measure the effects of the restrictions/lockdowns. The pre- and post-lockdown periods were taken to be of equal length for consistency of results.
To do so, data in January–February–March–April–May were collected to show the changes in traffic mobility and air quality. Since the normalization process started with June, the other months were not included in the study.

### *3.2. Collected Data*

In order to measure the change in traffic mobility and air quality pre- and post-lockdown, the number of vehicles in traffic, average vehicle speeds and air quality measurement data are needed. For this purpose, traffic density data for five months were provided by IMM. Real-time traffic density data are collected 24 h a day in the city with the help of 1455 sensors. Vehicle counts are expressed in passenger car units to keep the data homogeneous. These data consist of the number of vehicles in traffic and the average vehicle speed values.

36 air quality monitoring stations were established in Istanbul by the Ministry of Environment and Urbanization of the Republic of Turkey, which officially carries out the measurements. At these stations, emission values such as PM<sub>10</sub>, SO<sub>2</sub>, CO, NO<sub>2</sub>, NO<sub>x</sub>, O<sub>3</sub> and PM<sub>2.5</sub> are measured instantaneously, and their averages are stored hourly and daily. Due to inconsistencies in the O<sub>3</sub> and PM<sub>2.5</sub> emission values, these parameters were not included in the study. Five months of data were considered for the other five air pollutants. For the five air pollutant values to be meaningful and usable in the study, there should be a difference between the pre- and post-pandemic values. This difference can be revealed by various statistical methods. In this study, it was analyzed using the "Paired-Samples T Test", which showed a significant difference between the pre- and post-pandemic air pollutant data. The values of the statistical analysis are presented in Table 1. As a result of the statistical analysis, the *p*-value was found to be 0.01. According to the literature, a *p*-value below the alpha level of 0.05 indicates a significant difference and strong correlations [42].
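The paired comparison described above can be illustrated with a minimal sketch. It assumes the test is applied to matched pre- and post-lockdown readings from the same stations; the function name and the station values are hypothetical and are not taken from Table 1. Only the test statistic and degrees of freedom are computed; the *p*-value would then be read from a Student *t* distribution.

```python
import math
import statistics


def paired_t_statistic(pre, post):
    """Return (t, degrees_of_freedom) for a paired-samples t-test."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    # Work on the per-station differences, which is what makes the test "paired".
    diffs = [a - b for a, b in zip(pre, post)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = sd_diff / math.sqrt(len(diffs))
    return mean_diff / standard_error, len(diffs) - 1


# Hypothetical readings at four stations, not values from the study.
pre_lockdown = [10.0, 12.0, 14.0, 16.0]
post_lockdown = [8.0, 11.0, 11.0, 14.0]
t_value, dof = paired_t_statistic(pre_lockdown, post_lockdown)
print(f"t = {t_value:.3f}, df = {dof}")
```

A large positive *t* here means the pre-lockdown readings are consistently higher than the post-lockdown ones at the same stations.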


**Table 1.** Statistical values of parameters.

Particulate matter (PM) is a complex mixture of solid or liquid particles larger than 0.0002 µm and smaller than 500 µm suspended in the atmosphere or in a gas mass. Particles below 10 µm are called PM<sub>10</sub> [19]. SO<sub>2</sub> is a highly harmful, colorless gas with a pungent odor, consisting of one sulfur and two oxygen atoms joined by polar covalent bonds [43]. CO is a colorless, tasteless, odorless, flammable and toxic gas. As the proportion of CO in the air increases, death can occur in a shorter time. Air basically consists of nitrogen (78%) and oxygen (21%) [44]. The most common nitrogen oxides (generally defined as NO<sub>x</sub>) are NO and NO<sub>2</sub>. NO<sub>2</sub> generally results from the burning of fossil fuels [45].

### *3.3. Geographical Information Systems*

GIS is an integrated system where collection, storage, association, query and visualization processes of geographic data are carried out [46]. GIS was preferred to analyze traffic mobility and changes in air quality. The ArcMap 10.5 program was used in this study. To ensure the consistency of spatial analysis, the data and data sources must be reliable and accurate. The authors were very sensitive in this regard and the data were obtained from official sources. The five-month traffic density data were transferred to the GIS. By transferring the data to the GIS environment, it was possible to visualize the data. The point density analysis type was used for spatial analysis of the number of vehicles and average speed values. Emission values were also transferred to GIS and spatial analysis was performed with Inverse Distance Weighted (IDW) analysis. An interpolation process needs to be applied to air pollutants, vehicle numbers and vehicle speed data. In order to visually present the analysis to the readers, the IDW interpolation method offered by the ArcMap 10.5 program was preferred. When the studies in the literature are examined, it is seen that there are many interpolation methods [47–50]. IDW is one of these methods and is frequently used in the literature. In this study, IDW was used to obtain predictive values between stations and to perform analysis. Thanks to the IDW method, which is based on Tobler's First Law of Geography, values in two distances and two value ranges can be estimated [51,52].
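As an illustration of the IDW idea described above, the following is a minimal sketch of the textbook estimator, in which each station is weighted by the inverse of its distance raised to a power. It is not the ArcMap implementation; the function name, the power of 2 and the station values are assumptions made for the example.

```python
import math


def idw(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples: list of (sx, sy, value) tuples, one per monitoring station.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # the query point coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical stations: (x in km, y in km, concentration).
stations = [(0.0, 0.0, 40.0), (10.0, 0.0, 20.0)]
midpoint_estimate = idw(stations, 5.0, 0.0)
near_first_estimate = idw(stations, 2.0, 0.0)
print(midpoint_estimate, near_first_estimate)
```

Nearer stations dominate the estimate, which is the practical meaning of Tobler's First Law in this context: a point 2 km from the first station is estimated much closer to that station's value than to the other's.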

Thanks to visualization within the borders of Istanbul, the effects of the pandemic can be better analyzed. To allow accurate comparison of the analysis results across the five months, the same classification intervals were used in all analyses. Each pixel of the resulting maps covers 900 m<sup>2</sup> (30 m × 30 m).

GIS is frequently used in the literature to analyze, visualize and interpret the change of emission values and traffic parameters. Having proved its usability in this field, GIS is very useful in terms of understanding the relevant data more easily during the pandemic period. In this context, both researchers and administrators have used GIS-based approaches quite widely.

### **4. Results**

The key part of the study is the correct transfer of data to the GIS environment and the implementation of the necessary analysis. Point density analysis was applied to show the change in traffic density data over the five-month period in the most robust way. Both the number of vehicles and the average vehicle speed data were used for this analysis. The classification intervals must be the same across months to provide a clear representation of the change over the five-month period (so as not to mislead readers and to convey correct information to the literature).
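A simplified sketch of point density is given below to make the idea concrete. It assumes a circular neighborhood around each cell center and sums the sensor values falling inside it, divided by the neighborhood area; the function name, the radius and the sensor values are hypothetical, and the GIS tool used in the study may differ in detail.

```python
import math


def point_density(points, cx, cy, radius):
    """Sum of point weights within `radius` of (cx, cy), per unit area.

    points: list of (x, y, weight) tuples, e.g. sensor locations with
    vehicle counts as weights.
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    total = sum(w for x, y, w in points if math.hypot(x - cx, y - cy) <= radius)
    return total / (math.pi * radius ** 2)


# Hypothetical sensors: (x in km, y in km, vehicles counted).
sensors = [(0.0, 0.0, 100.0), (1.0, 0.0, 50.0), (5.0, 5.0, 70.0)]
density_at_origin = point_density(sensors, 0.0, 0.0, radius=2.0)
print(density_at_origin)  # vehicles per square kilometer around the origin
```

Evaluating this function at every cell center of a regular grid yields a raster that can be classified with the same intervals for each month.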

The change maps of the number of vehicles and average vehicle speeds are given in Figure 3. While red indicates congestion in the number of vehicles, green indicates free flow. Red color on average vehicle speed indicates high speed. As is clear from Figure 3, traffic density measurement sensors are located in road networks with high mobility and dense traffic.

**Figure 3.** Maps of the changes in the number of vehicles and vehicle speed.

As can be seen from Figure 3, there are significant changes in both the number of vehicles and the average vehicle speed, especially in January–February–March and April–May.

Partial/total lockdown in April and May significantly reduced the density of vehicle traffic in the city center. In fact, this situation shows us that there will be a significant improvement in traffic density if the use of private vehicles is reduced.

There has been an increase in average vehicle speed due to the improvement in traffic density. Average vehicle speeds by months are 57.50, 55.20, 56.90, 59.00 and 58.56 (km/h), respectively.
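The size of this increase can be checked directly from the monthly averages quoted above. The sketch below assumes the split into January–March (pre-lockdown) and April–May (lockdown); the helper name is hypothetical.

```python
# Monthly average speeds in km/h, as quoted in the text.
monthly_speed_kmh = {
    "January": 57.50,
    "February": 55.20,
    "March": 56.90,
    "April": 59.00,
    "May": 58.56,
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


pre_months = ["January", "February", "March"]
post_months = ["April", "May"]
pre_mean = sum(monthly_speed_kmh[m] for m in pre_months) / len(pre_months)
post_mean = sum(monthly_speed_kmh[m] for m in post_months) / len(post_months)

# Change between the two period averages.
mean_change = percent_change(pre_mean, post_mean)
# Largest month-to-month contrast: slowest pre-lockdown month vs. fastest lockdown month.
max_change = percent_change(
    min(monthly_speed_kmh[m] for m in pre_months),
    max(monthly_speed_kmh[m] for m in post_months),
)
print(f"period averages: {mean_change:.2f}%, largest contrast: {max_change:.2f}%")
```

On these figures the period averages differ by roughly 4%, while April against February gives about 6.9%.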

The effect of restrictions on traffic also affects air quality. We used Interpolation (IDW), a type of spatial analysis, to better show this effect. Five-month changes of five air pollutants from 36 air quality stations established in Istanbul were analyzed. An important part of the change in air quality occurs due to transport mobility. In Figure 4, changes according to months and pollutants are given.

**Figure 4.** The maps of changes of air quality.

The red color indicates the positive situation and the blue color the negative situation. In addition, the improvement ratios of the air quality parameters are presented in Table 2.


**Table 2.** Month-based air quality values.


Therefore, the individual concentrations are shown in comparison in Figure 4 using a better/worse display scale depending on the regulatory threshold values for each individual concentration. NO<sub>2</sub> is one of the pollution parameters that most strongly reflects traffic mobility. In this context, both the temporal and spatial changes of the NO<sub>2</sub> parameter are given as examples in Figure 5.

When Figure 5 is examined, there is a serious decrease in the NO<sub>2</sub> parameter values in April and May. When the maps in Figure 3 are examined, it is seen that traffic mobility decreased considerably in April and May. Thus, it can be said that the NO<sub>2</sub> air pollutant increases and decreases in relation to traffic mobility.

The main emissions (air pollutants) from traffic are known as NO<sub>x</sub>, PM and CO. When the parameters are examined, it is seen that the PM<sub>10</sub> density in city centers decreases. However, the news of the partial lockdown in March caused people to migrate to rural areas. Therefore, a serious increase was observed in PM<sub>10</sub> in March.

**Figure 5.** Change of the NO<sub>2</sub> parameter according to months [53].

The SO<sub>2</sub> parameter generally remained stable except for February. The main reason for the increase in February is the energy consumed for heating due to the cold air temperature. A significant improvement is observed in the CO, NO<sub>2</sub> and NO<sub>x</sub> parameters during the partial lockdown period. The decrease in the number of vehicles in traffic can be shown as the main reason for this decrease. These parameters have a significant impact on human health. Emissions occur intensively in urban areas. Different sustainable transportation practices can be developed for urban areas to make the emission values achieved during the pandemic period permanent. These applications, which are not very suitable for rural areas, will be fully integrated with urban transportation systems.

### **5. Discussion**

Air quality is an important indicator for urban sustainability, and poor air quality has a very negative effect on human health. The World Health Organization announced that three million people die each year due to air pollution, and billions of people are adversely affected. Recently, significant changes in air quality have been observed due to the pandemic. In particular, the restriction measures applied in the fight against the pandemic have seriously affected the mobility of both people and vehicles. This situation has created a positive change in air quality. In other words, the reduction in emission values that followed the reduced use of private vehicles carries an important message. As is clear from Figure 3, a decrease in traffic density and an increase in average traffic speed are observed, especially in April, compared to previous periods. Since the emission values originating from transportation decreased, the positive change in air quality is clearly seen in Figure 4. However, this positive situation lost its effect with the normalization period, and conditions even worsened. To make the improvements permanent, especially in the areas marked in red, various micro mobility solutions need to be implemented. It may be necessary to prohibit individual motor vehicles from entering areas with poor air quality or central areas, or to introduce paid entry (congestion charge), as has been done in various cities (Singapore, London, Milan, Stockholm). As with the rest of the world, the devastating effects of climate change can now be seen concretely in Turkey. Flood and fire disasters in July and August are the best examples of this situation. While fire disasters were seen in 49 of the 81 provinces, flood disasters were seen in various provinces simultaneously. In a two-month period, approximately 100 people died in these disasters. These events show the dimensions of climate change.
The phrase "climate change in our lives" is now more appropriate than "climate change is at hand". Sustainable practices should be implemented in order to permanently ensure positive change in air quality in our living spaces. In this direction, sustainable transportation modes should be quickly integrated into the existing transport system to reduce traffic-related emissions. The most popular sustainable practices in recent years are e-mobility and sharing systems. E-mobility applications based on the electrification of transportation systems are increasing but are currently insufficient. In addition, a demand-responsive transport (DRT) approach that takes passenger demand into account should be expanded in urban transportation. In this context, the suggestions for sustainable transportation are:


With the implementation of such suggestions in urban transportation as soon as possible, positive changes in air quality will be achieved permanently. In addition, a decrease in traffic density will be observed as the use of shared transportation modes becomes widespread.

The study has a few limitations to be discussed. The traffic sensors used to measure traffic mobility are located only on the road networks with high mobility and dense traffic, which prevents the analysis of the change in other transportation networks. However, most mobility in Istanbul takes place within 2000 km<sup>2</sup> (total surface: 5500 km<sup>2</sup>). Therefore, most of the sensors are located in these areas. A total of 44 air quality monitoring stations have been established in Istanbul by the Ministry of Environment and Urbanization to measure air quality. Eight of the stations were not included in the study due to data discontinuity. Since there is no reliable data flow for the O<sub>3</sub> and PM<sub>2.5</sub> air pollutants, these parameters were not used in the study.

### **6. Conclusions**

The GIS information system related to the comparison of scenarios and related emissions from vehicle traffic can be the starting point for the definition of a permanent technical table, a meeting point for all the operators in the sector, i.e., local authorities, mobility service providers, researchers, etc.

In this way it is possible to share knowledge and experience and to define timely, effective and efficient actions to combat the phenomenon.

The information system for assessing vehicle flows and environmental emissions ensures easier and more reliable recognition of environmental risk factors and can be associated with other useful information, such as accident risk. In addition, such tools allow an assessment of scenarios in different traffic and disaster contexts (such as the recent pandemic) and enable a definition of solutions to improve the resilience and sustainability of transport in the examined area.

In the absence of pharmaceutical interventions for COVID-19, governments have taken drastic steps such as social distancing, quarantine, limited transport and partial/total lockdown to prevent the spread of the virus.

The rapid change in vehicle flows in Istanbul from January to May 2020 was accompanied by a change in vehicle emissions. The number of vehicles and the average speed were evaluated not at individual hotspots, as in several works in the literature, but on a map through the use of GIS. This made it possible to determine the real structure of vehicular traffic on the road and its influencing factors, thus hypothesizing a possible reduction in emissions.

Starting from the 2020 autumn–winter season, several countries have been facing a slow and progressive worsening of the epidemic. Although the epidemic trend during the summer of 2021 is showing reductions in some parts of the world, it is essential to strengthen monitoring and mitigation activities in light of all possible epidemic scenarios that may arise. This paper lays the groundwork for assessing the environmental effects of reducing vehicle traffic during the 2020 lockdown period in Turkey. After reconstructing the activities carried out since the start of this pandemic event, the document takes stock of several pollutants and the comparison of concentrations during the period examined.

The paper identifies the GIS and potential mitigation of impacts by proposing a shared approach to remodelling containment/mitigation measures according to the assumed scenario and risk classification. The aim of this study is therefore to show the changes in traffic mobility and air quality caused by the restrictions applied during the pandemic. To do so, a GIS-based approach was proposed. A five-month period was considered to visualize the change. As can be seen from the spatial analysis results, there are improvements in traffic mobility and air quality in April–May. The air quality parameter values are divided into two groups, the averages of January–February–March and of April–May. Thus, the change in values pre- and post-restriction can be interpreted more easily. The PM<sub>10</sub>, SO<sub>2</sub>, CO, NO<sub>2</sub> and NO<sub>x</sub> parameter values improved by 21.21%, 16.55%, 18.82%, 28.62% and 39.99%, respectively. In addition, there was an increase of up to 7% in average traffic speed. This positive development in a short period of time is very important, and new sustainable transportation practices should be included in the transportation system in order to make it permanent. The results showed that:


The comprehensive analysis of changing air quality and average speed presented in this study can help the government assess the situation and devise corresponding strategies for sustainable transportation in the future. The findings of this paper provide useful information for understanding the effects of reduced traffic in urban areas on air quality.

**Author Contributions:** Conceptualization, K.D.A., Ö.K., T.C. and M.Y.Ç.; methodology, Ö.K., K.D.A. and T.C.; software, K.D.A. and Ö.K.; validation, Ö.K., K.D.A. and T.C.; formal analysis, Ö.K., K.D.A. and M.Y.Ç.; investigation, Ö.K., K.D.A. and A.C.; resources, Ö.K., K.D.A. and M.Y.Ç.; data curation, Ö.K. and K.D.A.; writing—original draft preparation, K.D.A., Ö.K., T.C. and M.Y.Ç.; writing—review and editing, K.D.A., Ö.K., T.C. and M.Y.Ç.; visualization, K.D.A. and Ö.K.; supervision, T.C. and M.Y.Ç.; project administration, T.C. and M.Y.Ç.; and funding acquisition, T.C and A.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data available on request due to restrictions, e.g., privacy or ethical.

**Acknowledgments:** The authors thank the Istanbul Metropolitan Municipality and the Ministry of Environment and Urbanization of the Republic of Turkey for providing the data.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


### *Review* **Tides and Tidal Currents—Guidelines for Site and Energy Resource Assessment**

**Silvio Barbarelli <sup>1</sup> and Benedetto Nastasi 2,\***


**Abstract:** The main aim of this paper was to classify and to analyze the expeditious resource assessment procedure to help energy planners and system designers dealing with tides and tidal currents. Depending on the geographical features of the site to be evaluated, this paper reported the easiest methods to adopt for later working plans, crucial for preliminary considerations but to be supported by in situ measurements and by a more complex and detailed modelling. While tide trends are predictable by using Laplace equations and Fourier series, tidal currents velocities prediction is not easy, requiring suitable methods or hydraulic applications. Natural and artificial sites were analyzed and the best method for each type of them was presented. The latter together highlighting the minimum set of required information was discussed and provided as a toolkit for assessing tides and tidal current energy potential.

**Keywords:** marine renewable energy; tides; tidal current; tidal velocity; barrages; channels; bathymetry; flow rate; site analysis; coastal resources

Decarbonization strategies are directly promoting renewable energy sources (RES) exploitation to replace the current fossil fuel supply [1]. RES potential assessment becomes the first step of analysis and its data accuracy plays a primary role in go/no go investment decision making [2]. Several methods are available for solar and wind energy, while for marine resources, few tools or atlases [3] are available for preliminary studies.

Marine energy forms are multiple, including (i) mechanical ones such as tides, currents, and waves, (ii) chemical ones such as salinity gradients, or (iii) thermal ones such as constant heat sinks.

Among them, tidal energy can be considered predictable over a long time scale since it comes from the conversion of gravitational forces [4]. Its intermittency affects the design of the harvester, but not its reliability since accurate predictions are linked to its nature. However, its main drawback is the distribution over large surfaces entailing large efforts to exploit it. Furthermore, the electricity generated by tidal energy conversion is not steady and is not able to fill in the consumption peak of an energy system.
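The predictability mentioned above is commonly exploited by writing the tide as a sum of harmonic constituents. The sketch below assumes a simple cosine series; the function name and the amplitudes are hypothetical, while the periods of the principal lunar (M2, about 12.42 h) and principal solar (S2, 12.00 h) semidiurnal constituents are standard values.

```python
import math


def tide_height(t_hours, constituents, mean_level=0.0):
    """Water level at time t as a sum of cosine constituents.

    constituents: list of (amplitude_m, period_h, phase_rad) tuples.
    """
    return mean_level + sum(
        amplitude * math.cos(2.0 * math.pi * t_hours / period - phase)
        for amplitude, period, phase in constituents
    )


# Hypothetical amplitudes for the M2 and S2 constituents, zero phase lag.
constituents = [(1.0, 12.42, 0.0), (0.3, 12.00, 0.0)]
for hour in range(0, 25, 6):
    print(hour, round(tide_height(hour, constituents), 3))
```

Because the two periods differ slightly, the constituents drift in and out of phase over about two weeks, which reproduces the familiar spring and neap cycle.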

Tidal energy can be exploited in two main ways: (i) harvesting its height range in natural bays and estuaries or in artificial barrages; and (ii) extracting the kinetic energy from the tidal currents across natural and artificial channels [5].

Regarding case (i), the size of the barrage is determined by the bay or estuary in which an artificial basin of water is built, so that its level rises and falls with a different period compared with the open sea. A hydrostatic head therefore arises. The energy harvesters located along the barrage extract power from the water flowing in and out. This can be compared to a low head hydro dam [6]. The calculation, indeed, follows similarly with dedicated systems of equations.

**Citation:** Barbarelli, S.; Nastasi, B. Tides and Tidal Currents—Guidelines for Site and Energy Resource Assessment. *Energies* **2021**, *14*, 6123. https://doi.org/10.3390/en14196123

Academic Editors: Eugen Rusu and Rafael J. Bergillos

Received: 23 June 2021 Accepted: 23 September 2021 Published: 26 September 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Referring to case (ii), the tidal stream enters a bay or a channel located between the mainland and an offshore island. The most important parameter for the assessment is the maximum average power, usually assumed to be the average kinetic energy flux in an undisturbed state across the most constricted cross section of the channel, where the strongest currents are present. Furthermore, the installation of energy harvesters plays a crucial role since their presence affects the flow. For this reason, it is fundamental to determine the optimal trade-off between the amount of power to produce and the number of turbines to install, as discussed in [7].

In order to start an installation or tidal system, the site must first be selected. A distinction between natural and artificial sites must obviously be made. The artificial sites are concerned with the construction of dams that allow vast amounts of energy to be stored but require major civil works. Instead, natural sites are synonymous with small lagoons, estuarine canals, straits, etc.

Artificial sites require very high costs, which are also put towards other uses, such as environmental conservation, water storage, and viability, and not just energy production. The classic realization is a tidal barrage, a structure similar to a dam that extracts energy from water moved by tidal forces in both directions (inbound or outbound) through bays or rivers. Rather than damming water on one side like a traditional dam, a tidal barrage lets water flood into a bay or river during high tide and discharges it during low tide. This is achieved by calculating the tidal stream and regulating the opening and closure of the sluice gates at crucial times of the tidal cycle. Barrage technology therefore requires the construction of a barrage across a bay or river where tidal currents flow. Turbines located at the sluices generate electricity while water flows in and out of the enclosed estuary basin, bay, or river. These systems are comparable to a hydro dam that exploits pressure energy due to the difference in height (head): the turbines generate power when the water level outside the basin or lagoon differs from the water level within. Several kinds of turbines can be used depending on the head and flow available, in some cases even reversible pumps or pumps as turbines [8].

Embankments, caissons, pumps, sluices, and ship locks are the essential elements of a barrage. These elements are located in very large concrete blocks.

Barrage systems are penalized by the high cost of the civil infrastructure associated with the placement of a dam across estuarine systems. Because of the detrimental consequences associated with altering a large ecosystem where many species live, people have resisted barrages [9].

Tidal currents are more attractive because of their lower costs and less invasive applications [10], allowing installations in channels between an offshore island and the mainland or in a strait at the entrance to a bay. In this case, many parameters have to be considered or evaluated before making the decision [11].

With the aim of providing available power over a valuable period of time, the current velocity must be characterized in terms of spatio-temporal variance for the siting activities: the optimum range is indicated as 1.5–3.5 m/s [12].

For designing the structural loading and power capability of the system, these parameters are essential. The geology of the seabed affects the construction of a kinetic energy system significantly. Recent sediment dynamics research has postulated a threshold value for initial particle movement [13–15]. However, bottom friction relies on various settings and forcings, including the structure of bed sediments and sea bottom morphology [16] and the effect of hydrodynamic processes, such as wave interaction and current bottom-boundary layers [17].

This is vital for determining whether the sediments removed will impact turbine components such as blades and structural parts under critical conditions. In addition, shore and bed-boundary layer effects and roughness have not always been taken into account, though tide trend is often the object of the theoretical output of energy.

All these parameters should be taken into account in the complete feasibility analysis of the production plant layout [18], starting from the tide oscillations, which are well known for each site as the sum of the harmonic components, up to the geomorphology of the site, which determines the flow velocity losses and thus the energy generated, and also the availability of the installation. A shore must be able to bear heavy concrete structures to decrease erosion due to sedimented currents, or simply to be defined as low roughness for retaining structures such as gravity-based structures (GBS).

In order to properly deploy the machine in a location, many details are required at this point. In particular:


Finally, once the above parameters have been established, the machine's location and design can be considered to be almost ready in terms of feasibility.

The review articles available in the literature deal with the status of research [19] or regional outlook, such as the Ireland case [20], but a comprehensive and critical analysis of expeditious assessment methods is not available, despite the recent availability of marine databases [21].

Finally, the novelty of this paper is to provide readers an overview of the methods and analyses present in the literature for assessing the tides oscillations and, particularly, the tidal current velocities, by including, according to the sites conformation, 1D, 2D, and 3D approaches.

The main objective of this study was to help energy planners and system designers in resource assessment procedures for tides and tidal currents.

### **2. Material and Method—Tides Genesis: Prediction Models**

Tidal energy is generated directly from the gravitational and centrifugal forces between the earth, the sun, and the moon [22]. Because of the gravitational force of the moon, the sun, and the earth and the centrifugal one produced by the mutual rotation of the earth and moon [23], a tide results in an oscillation of the ocean's surface.

The moon exerts a tide-generating force about twice as big as that of the sun because it is much closer to the earth. The tidal phenomenon occurs twice every 24 h, 50 min, and 28 s [24]. A bulge of water is formed by the gravitational force of the moon, which is stronger on the side of the earth nearest the moon. The rotation of the earth-moon system creates a centrifugal force that forms another water bulge on the side of the earth furthest from the moon, as shown in Figure 1.

When a landmass lines up with this earth-moon system, the water around it is at high tide. Conversely, the water around it is at low tide when the landmass is at 90° to the earth-moon system (see Figure 2). Every landmass is therefore subjected to two high tides and two low tides during each cycle of the earth's rotation [25].

The timing of these tides changes at every point on the planet as the moon rotates around the earth, and the same apex of high or low tide occurs at the same point roughly 50 min later per day [26]. Every 29.5 days, known as the lunar cycle, the moon orbits the earth. Between spring tides and neap tides, tides vary in size.

When the sun and moon are aligned with the earth, whether on the same side of the earth or on opposite sides, spring tides occur, resulting in extremely high tides. If the sun and moon are at 90° to each other, neap tides occur, resulting in low tides (see Figure 3).

**Figure 1.** The influence of the moon on tidal genesis.

**Figure 2.** Trend of the tides related to the moon position.

In particular land conformations like harbors, estuaries, and bays, the level oscillation of the ocean water also produces a horizontal movement of the water which causes *tidal currents*. In general, a current can flow from the oceans into the harbors, bays, and estuaries as the range of tides increase; this is called a "flood current".

A current can flow into the oceans as the tides fall; this is called an "ebb current". When the tide stops to act, no horizontal motion is observed; this is referred to as "slack water".

**Figure 3.** Spring tide and neap tide.

To presume a correlation between the times of high/low tides and the times of maximum and minimum tidal currents, a *'rule of thumb'* is adopted by many technical users of the sector. This rule implies that the flood and ebb current will occur between the high and low tides, while the periods of slack water will happen at the same time as the high and low tides. However, for most places, this rule does not apply: there is no clear relation between the times of high/low tide and the times of slack water or maximum current. Three "base case" conditions exist. A "standing wave" form of current is the first. In a standing wave, the times of slack water coincide exactly with the high and low tides, with the highest flood and ebb current occurring halfway between the high and low tides.

The second is the presence of a "progressive wave". The maximum flood and ebb would arise at the times of the high and low tides in a progressive wave, with the slack water between the times of high and low tides. The two abovementioned occurrences are illustrated in Figure 4.

**Figure 4.** Standing wave and progressive wave.

A "hydraulic current" is the third case. In a hydraulic current, the current is formed at two locations joined by a waterway by the difference in height of the tides. When the difference between the two heights is the highest, the current is at its full flood or ebb. When the height of the tide at the two places is about the same, slack water occurs.

At a small number of sites, hydraulic currents exist. Some instances would be:


Most generally, progressive currents characterize the oceanic entrance of several bays and harbors, while standing wave conditions are most typical at the head of larger bays and harbors (see example in Figure 5). Most areas of the coast fall somewhere in between a progressive and a standing wave current.

The exact relationship between high and low tide times and maximum current or slack water is unique to each location and a general "rule of thumb" cannot be applied.

As the tidal currents are caused by the same forces that cause the tides, it is possible to predict the currents in a very similar way to the tides.

Using the same techniques used to analyze tides, observational data on the currents at a site can be evaluated and the results of such a study can be used to produce forecasts of tidal currents. However, tide predictions and tidal current predictions are performed separately because the relationship between tides and tidal currents is unique to each region.

• The times and heights of the tides are given by tide forecasts.

• The dates and speeds of maximum current and times of slack water are given by tidal current predictions.

It is up to the users to ensure that the right form of forecast is used for their operations.

**Figure 5.** Example of bays in North America.

### *Tidal Analysis and Prediction-Tidal Constituents*

Tides are completely predictable, as the number of harmonic elements can be foreseen. Approximately 62 constituents [27] are of sufficient size to be considered for potential use in the prediction of marine tides, although far fewer can often predict tides with useful precision.

 Generally, seven different harmonic components cause about 83% of the variation in tides. These components originate from the influence of the moon or sun and the relative periods occur once or twice per day. For example, the so-called 'M2' component is typically the dominant tidal wave caused by the moon, twice daily. The periods of tidal components are constant across locations, but the relative strengths (amplitudes) vary considerably.

 These major tidal constituents, determined by geographic coordinates [28] and which allow for prediction of the water level by harmonic analysis, are listed in Table 1 together with their period and related strength.


**Table 1.** Main tidal constituents with relative strength.

Then, given the mean sea level *H<sub>m</sub>*, it is possible to reconstruct the tide height pattern as follows by considering *n* constituents with amplitudes *H<sub>i</sub>*, frequencies *ω<sub>i</sub>*, and phases *ϕ<sub>i</sub>*:

$$h(t) = H_m + \sum_{i=1}^{n} H_i \sin(\omega_i t + \phi_i) \tag{1}$$
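As a minimal sketch of how Equation (1) can be evaluated, the snippet below sums a list of harmonic constituents. The function name, the amplitudes, and the phases are hypothetical and chosen only for illustration; the periods of M2 (12.42 h) and S2 (12.00 h) are the standard ones. Real amplitudes and phases would come from a harmonic analysis or a tidal database for the site of interest.

```python
import math


def tide_height(t_hours, mean_level, constituents):
    """Evaluate Equation (1): h(t) = H_m + sum_i H_i * sin(omega_i * t + phi_i).

    constituents is an iterable of (amplitude_m, period_hours, phase_rad).
    """
    h = mean_level
    for amplitude, period_h, phase in constituents:
        omega = 2.0 * math.pi / period_h  # angular frequency in rad/h
        h += amplitude * math.sin(omega * t_hours + phase)
    return h


# Hypothetical amplitudes and phases, for illustration only.
EXAMPLE_CONSTITUENTS = [
    (1.00, 12.42, 0.0),  # M2, principal lunar semidiurnal
    (0.40, 12.00, 0.0),  # S2, principal solar semidiurnal
]

# Hourly series over 30 days; the M2/S2 beat produces the spring-neap cycle.
series = [tide_height(t, 0.0, EXAMPLE_CONSTITUENTS) for t in range(24 * 30)]
```

With only these two constituents, the envelope of `series` alternates between roughly the sum and the difference of the two amplitudes, which is the spring-neap modulation described earlier.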

The tidal components can be produced using the global inverse solution TOPEX/Poseidon TPXO developed by Oregon State University [29]. The TPXO is a collection of global ocean tide models that best match the Laplace Tidal Equations and altimetry data in the least-square sense.

For eight primaries (i.e., M2, S2, N2, K2, K1, O1, P1, and Q1), two long periods (i.e., Mf, Mm) and three non-linear harmonic constituents (i.e., M4, MS4, MN4), on a 1440 × 721, 1/4 degree resolution complete global grid, the model considered tides as complex amplitudes of earth-relative sea-surface elevation.

If a power plant installation is considered in a natural site where tidal currents generate, like a channel, a river, an estuary, a fjord, or a strait, where more consistent tidal currents arise and flow parallel to the coast, it is necessary to know the tidal current velocities trend in order to assess the feasibility of the installation. Therefore, measurement surveys are essential. Velocity and tidal level data can be measured using an Acoustic Doppler Current Profiler (ADCP) sensor and with a built-in pressure sensor.

However, since these surveys are both expensive and time-consuming, and because of the various areas with potential suitability for tidal energy extraction, preliminary estimations require simpler, more generalized methods. A simple model would need only publicly accessible data, such as changes in sea level elevation (see Equation (1)).

### **3. Tides Applications and Tidal Currents Genesis: Prediction Models**

In this section, both tides and tidal currents are analyzed in order to detect possible applications and assess the energy annual yield. We start from the case of a basin with barrage, which directly exploits the potential energy linked to the tides, and try to find valid correlations regarding the tidal currents, which generate from the tides themselves.

### *3.1. Basin with Barrage*

Tidal barrages take advantage of the potential energy contained in the tides. Electricity is produced just like a hydroelectric dam with the exception that tidal currents flow in both directions, as opposed to only one direction for a dam [30].

From the tide coming in (flood tide) and out (ebb tide), a head difference is created; if the head difference is of a sufficient size, sluice gates are opened and water flows through the barrage turbines. Below, the two operating modes are explained in more detail.

• Ebb generation (Figure 6a)

The basin is filled through the sluices until high tide. Then, when the tide reaches maximum height, the sluice gates are closed. At this point there could be "extra pumping" to further increase the level. To obtain an adequate head across the dam, the turbine gates are held closed until the sea level starts to fall. Hence, when the tide reaches minimum height (see Figure 6a), the gates are opened so that the turbines work while the head is sufficiently high. This phase lasts as long as the difference in height (the head) is greater than zero. The sluices are then kept open, the turbines are disconnected, and the basin is filled again. The cycle repeats with the tides. Ebb generation (also known as outflow generation) takes its name from the fact that generation takes place as the tide reverses direction.

• Flood generation (Figure 6b)

In this case, energy production happens in the opposite way. The basin is filled through the turbines, which work during the flood tide and when the height is maximum (see Figure 6b). This is normally much less powerful than ebb generation, since the volume of the basin charged during flood is lower than the volume obtained when ebb generation operates. In fact, the latter is filled first during flood generation and even with extra water from inland rivers and extra streams connected to it via the land. Therefore, the available level difference between the basin side and the sea side of the barrage, essential for the turbine power produced, decreases faster than in ebb generation. Instead of enhancing it as in ebb generation, rivers flowing into the basin can further reduce the energy capacity. This is not, of course, a concern with the "lagoon" model without the inflow of rivers.

**Figure 6.** Ebb generation (**a**) and flood generation (**b**) plant scheme.

A first simple model to evaluate the energy production through turbines inserted in the dam is based on the emptying/filling of the reservoir surrounded by the dam itself. Therefore, it is possible to express the level of the sea as:

$$h_1(t) = h(t) + a \sin \omega t \tag{2}$$

where *h(t)* is the mean sea level, *h<sub>1</sub>(t)* is the sea level outside the basin, *h<sub>2</sub>(t)* is the level inside the basin, and *a* and *ω* are the amplitude and angular frequency of the tide. By introducing a suitable discharge coefficient *C<sub>d</sub>*, the velocity through a turbine is:

$$V = C_d \sqrt{2g|h_1(t) - h_2(t)|} \tag{3}$$

The flow rate through the turbine(s), having blade area *A<sub>turbine</sub>*, is instead:

$$Q = V A_{turbine} \tag{4}$$

while the level differential *dh*, referred to a basin area *B<sub>area</sub>*, changes according to:

$$dh = \frac{Q}{B\_{area}(h)}dt\tag{5}$$

The basin area is expected to change with *h* in this situation. Typically, a linear law with two experimentally determinable coefficients *k<sub>1</sub>* and *k<sub>2</sub>* takes this variation into consideration as follows:

$$B\_{area} = k\_1 h(t) + k\_2 \tag{6}$$

So, the level *h<sub>2</sub>* can be found by following the next equation:

$$h\_{2new} = h\_{2old} \pm dh \tag{7}$$

Ultimately, the power provided by the turbine becomes:

$$P = \eta\_{\rm turb} \eta\_{\rm tr} \eta\_{\rm gen} \rho \text{g} \, A\_{\rm turb} \text{C}\_d \sqrt{2 \text{g} \left| h\_1(t) - h\_2(t) \right|^3} \tag{8}$$

It is assumed that a transmission efficiency *η<sub>tr</sub>* of 67% and a global efficiency of the turbine (*η<sub>turb</sub>*) and generator unit (*η<sub>gen</sub>*) of 60%–90% are typical values for small hydro power plants. The turbine's efficiency *η<sub>turb</sub>* varies, as the flow rate and head are not constant. A turbine is usually designed to keep the efficiency constant for different flow rates within a given operating band. The efficiency will drop rapidly, however, outside this band. In traditional hydro power, Kaplan turbine efficiencies start to decline at 50% of the nominal flow. This occurrence is highlighted in Figure 7.

**Figure 7.** Turbine efficiency vs. flow rate (%).

This impact is even greater in a tidal barrage, as the maximum head is only reached for a limited period of time and cross-flows through the turbine tunnel are required when the system reaches a very low head (less than 10% of the maximum value). The efficiency of the turbine and generator is therefore assumed to be 90% for peak power and 70% for annual average power output. There is no consideration of additional losses due to transformation, gear boxes, or downtime.
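The emptying/filling model of Equations (3)–(8) can be stepped forward in time with a simple explicit scheme. The sketch below is only an illustration under stated assumptions: the sea level is taken as a pure sinusoid about the mean level, the turbines are always open (no gate logic or pumping), a single constant efficiency lumps *η<sub>turb</sub>*, *η<sub>tr</sub>*, and *η<sub>gen</sub>*, and all numerical values, as well as the function name, are hypothetical.

```python
import math

G = 9.81      # gravitational acceleration, m/s^2
RHO = 1025.0  # sea water density, kg/m^3 (assumed value)


def simulate_barrage(h2_start, amplitude, period_s, a_turb, cd, k1, k2,
                     efficiency, dt=60.0):
    """Explicit time stepping of Equations (3)-(8) over one tidal period.

    Returns the final basin level (m), the energy (J), and the peak power (W).
    """
    omega = 2.0 * math.pi / period_s
    h2, t, energy_j, peak_w = h2_start, 0.0, 0.0, 0.0
    while t < period_s:
        h1 = amplitude * math.sin(omega * t)          # sea level about the mean
        head = h1 - h2
        v = cd * math.sqrt(2.0 * G * abs(head))       # Eq. (3)
        q = v * a_turb                                # Eq. (4)
        basin_area = k1 * h2 + k2                     # Eq. (6)
        dh = min(q / basin_area * dt, abs(head))      # Eq. (5), no overshoot
        power = efficiency * RHO * G * q * abs(head)  # Eq. (8), since Q = A*Cd*sqrt(2g|head|)
        h2 += math.copysign(dh, head)                 # Eq. (7): basin follows the sea
        energy_j += power * dt
        peak_w = max(peak_w, power)
        t += dt
    return h2, energy_j, peak_w


# Hypothetical site: 2 m amplitude, 12 h period, 50 m^2 of turbine area,
# basin area of about 1 km^2 growing linearly with the level.
final_level, energy, peak_power = simulate_barrage(
    h2_start=0.0, amplitude=2.0, period_s=12 * 3600.0,
    a_turb=50.0, cd=0.9, k1=1.0e4, k2=1.0e6, efficiency=0.7)
```

An explicit scheme is used here only for brevity; an operational model would add the gate and holding phases described above and a flow-dependent efficiency such as the one in Figure 7.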

### *3.2. Model for Predicting Tidal Current Velocities*

Tidal currents are generated from tides whenever the difference in height of the sea level is converted in flow through a wide or narrow channel, in a strait or a gorge. They flow parallel to the coast like a river and revert their direction generally twice per day. Bringing essentially kinetic energy, they can be exploited by wind turbine-like machines. Here, these machines were not analyzed, because they were not the focus of this paper. More interesting is the way to assess the trend of these currents. In this section, some models for simple evaluations are proposed, and energy resources are evaluated for both artificial sites (barrages) and natural sites (channels).

### Enclosed Bay with Channel

When considering an enclosed area, such as a fjord, continuity means that the change in volume within the bay must be equal to the flow into the bay. The flow into the bay is the flow through the channel linking the bay to the open ocean plus other inputs, such as discharge from rivers. If one disregards the freshwater input, the continuity equation can be written as [31]:

$$A_{bay}\frac{dh_i}{dt} = Q \tag{9}$$

where *A<sub>bay</sub>* is the area of the enclosed bay, which is assumed not to change with the level *h* in this case; *h<sub>i</sub>* is the water level within the bay; and *Q* is the channel flow. The total drop in water level across the channel is equal to *h<sub>o</sub>* − *h<sub>i</sub>*, where *h<sub>o</sub>* is the water level in the ocean on the outer side of the bay. This total water-level drop can be divided into two parts: one related to the channel friction (∆*h<sub>f</sub>*) and one related to the flow acceleration towards the constriction (∆*h<sub>b</sub>*).

Using the Manning number, *n*, the frictional resistance can be calculated according to:

$$
\Delta h\_f = \frac{n^2 L}{R^{4/3}} \left(\frac{Q}{A\_c}\right)^2 \tag{10}
$$

where *L* is the channel's length; *R* is the hydraulic radius; and *A<sub>c</sub>* is the channel's cross-sectional area [31]. It is possible to write the variable ∆*h<sub>b</sub>* as:

$$
\Delta h\_b = \frac{Q^2}{2gA\_c^2} \tag{11}
$$
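To judge whether the two head-loss terms can be neglected for a given channel, Equations (10) and (11) can be evaluated directly. The following is a minimal sketch in SI units; the function names and the sample values in the usage lines (channel length, hydraulic radius, Manning number) are hypothetical and not taken from any site discussed here.

```python
G = 9.81  # gravitational acceleration, m/s^2


def friction_head_loss(q, channel_area, length, hydraulic_radius, manning_n):
    """Equation (10): head loss due to channel friction, in metres."""
    velocity = q / channel_area
    return manning_n ** 2 * length / hydraulic_radius ** (4.0 / 3.0) * velocity ** 2


def acceleration_head_loss(q, channel_area):
    """Equation (11): head loss due to flow acceleration into the constriction."""
    return q ** 2 / (2.0 * G * channel_area ** 2)


# Hypothetical channel: 1 m/s mean velocity through 3000 m^2, 2 km long,
# 10 m hydraulic radius, Manning number 0.03 s/m^(1/3).
flow = 3000.0 * 1.0
dh_f = friction_head_loss(flow, 3000.0, 2000.0, 10.0, 0.03)
dh_b = acceleration_head_loss(flow, 3000.0)
```

If both `dh_f` and `dh_b` are small compared with the tidal amplitude, the continuity-only treatment is a reasonable first approximation.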

The friction and acceleration terms are assumed to be negligible in the simplest case, and Equation (9) is used to model the channel velocity.

Only continuity is applied to estimate the velocity in the channel. In addition, assuming that the cross-sectional area does not change significantly during the tidal cycle, the flow depends on the channel velocity such that *Q* = *A<sub>c</sub>* · *u*. In this way, a relationship between the velocity and the tidal level is obtained. Since the water-level drop across the channel is neglected, *h<sub>i</sub>* ≈ *h<sub>o</sub>*, and denoting the approximate velocity with *u*, the equation becomes:

$$
u = \frac{A_{bay}}{A_c} \frac{dh_i}{dt} = \frac{A_{bay}}{A_c} \frac{dh_o}{dt} \tag{12}
$$

If the tidal water level varies sinusoidally with an angular frequency *ω*,

$$h(t) = H_{max} \sin(\omega t) \tag{13}$$

the velocity *u* is then

$$u = \frac{A_{bay}}{A_c} H_{max}\, \omega \sin \left(\omega t + \frac{\pi}{2}\right) = \frac{A_{bay}}{A_c} H_{max}\, \omega \sin \left[\omega \left(t + \frac{T}{4}\right)\right] \tag{14}$$

The velocity therefore has a phase shift of *T*/4 with respect to the water level, where *T* is the period of the tides, equal to 12 h, and *t* is the time in hours.

Equation (14) can be further extended to estimate the maximum velocity in the channel. Writing the sinusoidal variation of the velocity as *u(t)* = *u<sub>max</sub>* sin(*ωt*), the integration of Equation (12) over half a tidal period gives the maximum velocity:

$$
u_{max} = 1.4 \cdot 10^{-4} \left( s^{-1} \right) \frac{A_{bay}}{A_c} H_{max} \tag{15}
$$

by considering that *ω* = 2*π*/*T*, i.e., 6.28/(12 × 3600) = 1.4 × 10<sup>−4</sup> s<sup>−1</sup>.
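The continuity-only estimate can be reproduced in a few lines. The sketch below differentiates the sinusoidal level of Equation (13) and applies Equation (12); it keeps *ω* unrounded instead of using the 1.4 × 10<sup>−4</sup> s<sup>−1</sup> value. The function names and the bay and channel areas in the usage lines are hypothetical placeholders, not the Skarpsundet values.

```python
import math

TIDAL_PERIOD_S = 12 * 3600  # T = 12 h, as assumed in the text


def channel_velocity(t_s, bay_area, channel_area, h_max, period_s=TIDAL_PERIOD_S):
    """Equation (12) applied to h(t) = H_max * sin(omega * t)."""
    omega = 2.0 * math.pi / period_s
    dh_dt = h_max * omega * math.cos(omega * t_s)  # time derivative of Eq. (13)
    return bay_area / channel_area * dh_dt


def max_channel_velocity(bay_area, channel_area, h_max, period_s=TIDAL_PERIOD_S):
    """Equation (15) with omega = 2*pi/T left unrounded."""
    omega = 2.0 * math.pi / period_s
    return omega * bay_area / channel_area * h_max


# Hypothetical geometry: 10 km^2 bay, 3000 m^2 channel section, 1 m amplitude.
u_peak = max_channel_velocity(1.0e7, 3000.0, 1.0)
u_mid_tide = channel_velocity(0.0, 1.0e7, 3000.0, 1.0)  # level crossing the mean
```

In this idealized model the peak velocity occurs when the level crosses the mean (`u_mid_tide` equals `u_peak`), and the velocity vanishes at high and low water, which is the standing wave behaviour described in Section 2.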

*Case study: Skarpsundet tidal channel (Norway)* [31].

In the following, an example of how it is possible to estimate the tidal current velocity in a channel is reported together with ADCP measurements. The area is in the Norwegian fjords, and in particular the case of the Skarpsundet tidal channel is analyzed. Figure 8a,b reports the map of the site with the channel and the connected bay highlighted. Table 2 reports useful data for applying the methodology.

**Figure 8.** (**a**) Maps of Norway's fjords Elsfjorden and Ranfjorden, with the Skarpsundet channel highlighted. (**b**) Bathymetry of the Skarpsundet channel and measurements station [31].

**Table 2.** Values for the Skarpsundet channel.


### *3.3. ADCP Measurements*

A 600 kHz ADCP Workhorse Sentinel was used for measuring velocity and tidal level; the built-in pressure sensor of the ADCP was utilized for measuring the water level. By means of a downward-looking ADCP, one transect measurement and one different survey were performed. Figure 8b shows the locations of the measurements: transect measurements are marked as black lines. By using a low-frequency sampling rate, long-time measurements were performed over 43 days. The velocities measured by the ADCP (#1, see Figure 8b) were compared to the simulated ones obtained from TELEMAC-2D software (Figure 9). The velocity measurements in the North-South direction agreed with the simulations (Figure 9). Figure 10 illustrates the tidal velocity variations across the channel.

**Figure 9.** N-S tidal velocities comparison from 1 June to 9 June 2011 (red: simulated, black: measured) [31].

### *3.4. Maximum Velocity and Tidal Range*

At this point, it is necessary to test the goodness of the methodology by verifying whether the maximum velocities for the tidal range considered are in accordance with Equation (15). For this purpose, only the long-time ADCP velocity series were used, and the tidal range (the difference between the maximum and minimum height or tide level) was obtained from the Norwegian Hydrographic Service (NHS). The velocities were depth averaged and smoothed over one hour, and then their maximum values, during flood and ebb, were extracted from the time series. Figure 11 highlights the results of this analysis.

**Figure 10.** Cross-sectional measurements during incoming tide with the depth line along the transect highlighted [31].

The correlation plot shows that the velocity variation was stronger for the outgoing tide compared with the incoming.

$$
u_{max,in} = 0.11 + 0.16 \cdot H \tag{16}
$$

$$
u_{max,out} = 0.03 + 0.30 \cdot H \tag{17}
$$

Equations (16) and (17) interpolate the measured data shown in Figure 11, together with dashed lines representing error bands of ±0.05 m/s and 0.1 m/s, taking into account that *H* is the peak-to-peak height difference, equal to 2*H*<sub>max</sub> (see Equation (13)). It can be seen that the slope in the equation was different between the flood and the ebb tide, with a higher slope value for the ebb tide. The maximum velocity occurred, on average, 3 h and 35 min after high tide and 3 h 32 min after low tide.

**Figure 11.** Velocity peaks versus tidal heights: (**a**) flood tide and (**b**) ebb tide. Dashed lines display 95% interval of confidence [31].

Using Equation (15) and taking into account the parameters reported in Table 2, *u*<sub>max</sub> is 0.4 m/s if the considered tidal range *H* is 200 cm. This value agrees strongly with the measured cross-sectional average velocity during flood tide (0.42 m/s, see Figure 11a), and weakly with the measured cross-sectional average velocity during ebb tide (0.63 m/s, see Figure 11b). However, measurements at a different location, more centered in the channel, could yield different results.
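For a quick cross-check, the empirical fits of Equations (16) and (17) can be evaluated for the same tidal range. The sketch assumes that *H* is expressed in metres (so 200 cm becomes 2.0), an assumption that is not stated explicitly with the fits; the function names are illustrative.

```python
def u_max_flood(tidal_range_m):
    """Equation (16): fitted peak velocity for the incoming tide, in m/s."""
    return 0.11 + 0.16 * tidal_range_m


def u_max_ebb(tidal_range_m):
    """Equation (17): fitted peak velocity for the outgoing tide, in m/s."""
    return 0.03 + 0.30 * tidal_range_m


flood_peak = u_max_flood(2.0)  # about 0.43 m/s
ebb_peak = u_max_ebb(2.0)      # about 0.63 m/s
```

Under that assumption the fitted flood value lies close to the 0.4 m/s given by Equation (15), while the ebb value is noticeably higher, in line with the comparison above.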

### Channels

Figure 12 illustrates the stream through a variable cross-section channel. It is assumed that the current velocity *u(x, t)* is a function of time *t* and location *x* along the channel, but independent of the cross-channel position. The dynamic equation that governs the flow is

$$\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} + g \frac{\partial h}{\partial x} = -F \tag{18}$$

**Figure 12.** Flow through a channel [7].

where the pressure gradient that drives the flow is given by the slope of the surface elevation *h*, and *F(x, t)* represents an opposing force associated with natural friction and possibly with the presence of turbines. For the frictional force associated with the turbines to be independent of the cross-channel location, the turbines must be deployed in a uniform fence across the flow, so that all the water flows through the turbines themselves. If the channel is short compared with the wavelength of the tide, which usually reaches hundreds of kilometers, even in shallow water, the continuity law ensures that the flux *A* · *u* along the channel is independent of *x* (small changes in *A* associated with the rise and fall of the tide are neglected) and can be written as *Q(t)*.

Using this in Equation (18) and integrating along the channel gives

$$c\frac{dQ}{dt} - gh_0(t) = -\int_0^L F \, dx - \frac{1}{2} u_e |u_e| \tag{19}$$

where *c* is

$$c = \int_0^L \frac{1}{A(x)} \, dx \tag{20}$$

The difference in sea level between the two basins is *h<sub>0</sub>(t)*; it is assumed that this difference is unaffected by the flow through the channel and also unaffected by changes in *F* as turbines are installed.

### *3.5. The Easiest Case for Tidal Currents Velocity Estimation*

The natural friction and the head loss associated with separation at the exit could be relevant. It is assumed, however, that these effects are minimal, above all if the channel is long and wide, so that in the natural regime the difference in sea level is balanced by the acceleration. Considering a sinusoidal tide:

$$h_0(t) = a \cos(\omega t) \tag{21}$$

where *a* is the amplitude and *ω* is the frequency. This forcing is, of course, the difference between the sinusoidal tides at the two ends of the channel, rather than the forcing from only one end. By integrating Equation (18), the corresponding volume flux is

$$Q = Q_0 \sin(\omega t) \tag{22}$$

with

$$Q_0 = \frac{ga}{\omega c} \tag{23}$$

Taking into account the drag associated with the presence of turbines, expressed as

$$\int_{0}^{L} F \, dx = \lambda Q \tag{24}$$

Equation (19) becomes:

$$c\frac{dQ}{dt} - ga\cos\omega t = -\lambda Q \tag{25}$$

In this case

$$Q = \text{Re}\left[\left(\frac{ga}{\lambda - ic\omega}\right)e^{-i\omega t}\right] \tag{26}$$
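Equation (26) can be evaluated with complex arithmetic, and the geometric factor *c* of Equation (20) can be approximated from a few surveyed cross sections. The sketch below uses only the Python standard library; the function names, the trapezoidal rule for *c*, and all numerical values are assumptions made for illustration.

```python
import cmath
import math

G = 9.81  # gravitational acceleration, m/s^2


def channel_c(xs, areas):
    """Trapezoidal estimate of Equation (20): c = integral of dx / A(x)."""
    total = 0.0
    for i in range(1, len(xs)):
        total += 0.5 * (1.0 / areas[i - 1] + 1.0 / areas[i]) * (xs[i] - xs[i - 1])
    return total


def channel_flux(t_s, amplitude, omega, c, drag_lambda):
    """Equation (26): Q(t) = Re[ g*a / (lambda - i*c*omega) * exp(-i*omega*t) ]."""
    response = G * amplitude / complex(drag_lambda, -c * omega)
    return (response * cmath.exp(-1j * omega * t_s)).real


def peak_flux(amplitude, omega, c, drag_lambda):
    """Magnitude of the complex response in Equation (26)."""
    return G * amplitude / math.hypot(drag_lambda, c * omega)


# Hypothetical channel: three surveyed sections along 5 km.
c_value = channel_c([0.0, 2500.0, 5000.0], [12000.0, 8000.0, 12000.0])
omega = 2.0 * math.pi / (12 * 3600)
q_free = peak_flux(0.5, omega, c_value, 0.0)       # no turbines, Eq. (23)
q_drag = peak_flux(0.5, omega, c_value, 1.0e-4)    # with linear turbine drag
```

Setting `drag_lambda` to zero recovers *Q*<sub>0</sub> of Equation (23), and any positive drag lowers the peak flux, which is the trade-off between extracted power and flow reduction mentioned in the introduction.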

### 3.5.1. 3D and 2D Methods—Navier–Stokes Equations

When the site's geometry is complex, for example where the coast is indented and headlands and islets are present, the approaches described previously fail. Therefore, depending on the characteristics of the seabed, more robust 2D or 3D approaches are needed. The baroclinic Navier-Stokes equations are obtained by applying the Boussinesq assumptions and by reducing the vertical momentum equation to the hydrostatic pressure assumption. The fluid is usually assumed to be incompressible in the simulation. The equations for continuity and momentum are given below [32]:

$$\frac{\partial h}{\partial t} + \frac{\partial [(d+h)U]}{\partial x} + \frac{\partial [(d+h)V]}{\partial y} + \frac{\partial [(d+h)W]}{\partial z} = Q \tag{27}$$

$$\begin{split} \frac{\partial U}{\partial t} + U \frac{\partial U}{\partial x} + V \frac{\partial U}{\partial y} - fV &= -g \frac{\partial h}{\partial x} - \frac{g}{\rho_o} \int_0^h \frac{\partial \rho'}{\partial x} dz + \frac{\tau_{sx} - \tau_{bx}}{\rho_o (d+h)} \\ &\quad + v_h \left( \frac{\partial^2 U}{\partial x^2} + \frac{\partial^2 U}{\partial y^2} \right) + v_v \left( \frac{\partial^2 U}{\partial z^2} \right) \end{split} \tag{28}$$

$$\begin{split} \frac{\partial V}{\partial t} + U \frac{\partial V}{\partial x} + V \frac{\partial V}{\partial y} - fU &= -g \frac{\partial h}{\partial y} - \frac{g}{\rho_o} \int_0^h \frac{\partial \rho'}{\partial y} dz + \frac{\tau_{sy} - \tau_{by}}{\rho_o (d+h)} \\ &\quad + v_h \left( \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} \right) + v_v \left( \frac{\partial^2 V}{\partial z^2} \right) \end{split} \tag{29}$$

$$\frac{\partial p}{\partial z} = -\rho g \tag{30}$$

$$\begin{split} \frac{\partial[(d+h)c]}{\partial t} + \frac{\partial[(d+h)Lc]}{\partial x} + \frac{\partial[(d+h)Vc]}{\partial y} + \frac{\partial[(d+h)Wc]}{\partial z} \\ = D\_h \left( \frac{\partial^2 c}{\partial x^2} + \frac{\partial^2 c}{\partial y^2} \right) + D\_v \left( \frac{\partial^2 c}{\partial z^2} \right) - \lambda\_d (d+h)c + R \end{split} \tag{31}$$

For shallow waters, the above equations reduce to:

$$\frac{\partial h}{\partial t} + \frac{\partial [(d+h)U]}{\partial x} + \frac{\partial [(d+h)V]}{\partial y} = Q \tag{32}$$

$$\frac{\partial U}{\partial t} + U\frac{\partial U}{\partial x} + V\frac{\partial U}{\partial y} - fV = -g\frac{\partial h}{\partial x} - \frac{g}{\rho\_o} \int\_{-d}^{h} \frac{\partial \rho'}{\partial x} dz + \frac{\tau\_{sx} - \tau\_{bx}}{\rho\_o(d+h)} + \nu\_h \nabla^2 U \tag{33}$$

$$\frac{\partial V}{\partial t} + U\frac{\partial V}{\partial x} + V\frac{\partial V}{\partial y} - fU = -g\frac{\partial h}{\partial y} - \frac{g}{\rho\_o} \int\_{-d}^{h} \frac{\partial \rho'}{\partial y} dz + \frac{\tau\_{sy} - \tau\_{by}}{\rho\_o(d+h)} + \nu\_h \nabla^2 V \tag{34}$$

$$\frac{\partial[(d+h)c]}{\partial t} + \frac{\partial[(d+h)Uc]}{\partial x} + \frac{\partial[(d+h)Vc]}{\partial y} = D\_h\nabla^2 c - \lambda\_d(d+h)c + R \tag{35}$$

where:

*d*: local water depth relative to a reference plane;

*h*: water level;

*U*: vertically integrated velocity component towards the east;

*V*: vertically integrated velocity component towards the north;

*Q*: intensity of mass sources per unit area;

*f*: Coriolis parameter;

*ν<sub>h</sub>*: kinematic horizontal eddy viscosity;

*ρ<sub>o</sub>*: reference density;

*ρ′*: density anomaly;

*τ<sub>sx</sub>*: x-component of the wind stress acting on the sea surface;

*τ<sub>sy</sub>*: y-component of the wind stress acting on the sea surface;

*τ<sub>bx</sub>*: x-component of the bottom shear stress;

*τ<sub>by</sub>*: y-component of the bottom shear stress;

*c*: salinity or temperature (transported substance);

*C<sub>d</sub>*: seabed friction coefficient;

*D<sub>h</sub>*: lateral eddy diffusivity;

*λ<sub>d</sub>*: first-order decay rate;

*R*: source term per unit area.

A flexible finite-element-based coastal ocean model, THETIS, can be used to solve the shallow water equations on an unstructured triangular mesh [33]. THETIS is built using the Firedrake framework [34], which automates the generation of low-level application code from a high-level description of the specified finite element discretization. Other software packages are also available: DELFT3D [35], TELEMAC [36], POM [37], and MIKE 21 [38]. However, detailed knowledge of the bathymetry or 3D conformation of the seabed is required to obtain accurate solutions.

*Case study: Pentland Firth and Orkney Waters—North Scotland—tidal currents predictions*

The above methods were applied in the area of the Pentland Firth and Orkney Waters. The Pentland Firth (PF), located between mainland Scotland and the Orkney Islands, accounts for 36% of the total tidal energy in the United Kingdom and records the highest tidal currents in the world [39], with current speeds exceeding 5 m/s in some cases. In particular, TELEMAC 3D was used to predict the current speed.

TELEMAC 3D solves the Navier–Stokes Equations (27)–(35) using the finite element method, together with advection-diffusion equations for intrinsic quantities such as salinity, temperature, and concentration. It requires three main pieces of information: the geometry (model mesh), the boundary conditions of the domain, and the simulation configuration.

Acoustic Doppler Current Profiler (ADCP) measurements were used to validate the model results; these data are very useful for 3D hydrodynamic modelling, as they provide current velocities through the water column. The locations of the measurement devices are illustrated in Figure 13.

Figure 14 compares the tidal current velocities predicted by TELEMAC at the location highlighted by the red circle in Figure 13 with the ADCP measurements (black lines) for the period between 16 September 2001 and 20 September 2001. The software output was obtained by varying the seabed friction coefficient from 0.005 to 0.086 (green, red, and blue lines).


**Figure 13.** Pentland Firth and Orkney Waters in the United Kingdom: domain, bathymetry, and ADCP measurement device locations [40].

**Figure 14.** Tidal current prediction versus ADCP measurements at different measurement depths and for different seabed friction coefficients *C<sub>d</sub>* [40].

 

 

   

### 3.5.2. Influence of the Coastal Boundary Layer and Deepness of the Seabed on the Tidal Current Velocities

In some cases, when the flow is unaffected by macro-vortex areas or recirculation, tidal velocities can be expressed through appropriate coefficients [41,42] that depend on the harmonic constituents, so that a harmonic series can be used to predict their trend, as follows:

$$V(t) = A\_{\mathcal{W}} + \sum\_{i} A\_{i} \sin(\omega\_{i}t + \varphi\_{i})\tag{36}$$
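A minimal Python sketch of Equation (36) is given here for illustration. The function name, the mean term, and the amplitudes and phases are hypothetical; only the M2 and S2 periods are standard values. In practice the coefficients would come from measurements at the specific site.

```python
import math


def tidal_velocity(t_hours, mean_velocity, constituents):
    """Equation (36): V(t) = mean + sum_i A_i sin(omega_i t + phi_i).

    constituents is a list of (amplitude, period_hours, phase_rad) tuples.
    """
    total = mean_velocity
    for amplitude, period_hours, phase in constituents:
        omega = 2.0 * math.pi / period_hours  # angular frequency, rad/h
        total += amplitude * math.sin(omega * t_hours + phase)
    return total


# Hypothetical amplitudes (m/s) and phases (rad), for illustration only.
EXAMPLE_CONSTITUENTS = [
    (1.8, 12.42, 0.0),  # M2, principal lunar semidiurnal constituent
    (0.6, 12.00, 0.5),  # S2, principal solar semidiurnal constituent
]

if __name__ == "__main__":
    for hour in range(0, 25, 6):
        print(hour, round(tidal_velocity(hour, 0.1, EXAMPLE_CONSTITUENTS), 3))
```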

The previous formulation, which relates to a specific position, is not easy to model and requires supporting measurements. In addition, these velocities occur at the sea surface; a correction to the peak value *V<sub>o</sub>* is needed if the influence of depth is considered [12,43]:

$$V(z) = V\_o \left(\frac{z\_o - z}{\beta z\_o}\right)^{\frac{1}{a}} \tag{37}$$

In the above equation, *V*(*z*) is the velocity at the depth *z*, and *Vo* is the velocity at the depth (*z*o) of reference; *z* = 0 refers to the seabed position.

Close to the mainland, it is well known that the coastal boundary layer affects the tidal current velocities. This dependence is very hard to determine. A simple parabolic law can be applied to estimate the reduced velocity (*V<sub>b</sub>*) at a distance *h* from the coast inside the boundary layer:

$$V\_b = -V\_o \left(\frac{h}{h\_{\rm lim}}\right)^2 + 2V\_o \frac{h}{h\_{\rm lim}}\tag{38}$$

However, the previous equation is applicable only if the thickness of the boundary layer *h<sub>lim</sub>* is known [44].
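The parabolic law of Equation (38) can be sketched in a few lines of Python. The function name is hypothetical, and returning the undisturbed velocity *V<sub>o</sub>* beyond the boundary layer is an assumption made here so that the function is defined for any distance from the coast.

```python
def boundary_layer_velocity(h, h_lim, v_o):
    """Equation (38): V_b = -V_o (h/h_lim)^2 + 2 V_o (h/h_lim), for 0 <= h <= h_lim."""
    if h_lim <= 0:
        raise ValueError("h_lim must be positive")
    if h < 0:
        raise ValueError("h must be non-negative")
    if h >= h_lim:
        # Assumption: outside the boundary layer the velocity equals V_o.
        return v_o
    ratio = h / h_lim
    return -v_o * ratio ** 2 + 2.0 * v_o * ratio


if __name__ == "__main__":
    # Hypothetical values: 500 m boundary layer, 2.0 m/s undisturbed velocity.
    for distance in (0.0, 125.0, 250.0, 500.0):
        print(distance, boundary_layer_velocity(distance, 500.0, 2.0))
```

The law gives zero velocity at the coast and joins *V<sub>o</sub>* smoothly at *h* = *h<sub>lim</sub>*, since its slope vanishes there.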

### **4. Conclusions**

This paper attempted to provide valid support and a set of tools for quickly estimating the trend of the tides and of the tidal currents. This knowledge is fundamental for assessing the feasibility of a marine plant able to harness this source of renewable energy. It is important to understand that the tides offer a certain amount of gravitational energy, while the tidal currents essentially carry kinetic energy. The tides are generated from the mutual gravitational influence of the sun, the moon, and the Earth, and are characterized by a periodic rise and fall of the sea level.

The tidal currents originate from these differences in height, whenever the land conformation allows water flow in a channel, in a strait, or in a throat. The difference in the sea level between the ends of a channel, induced by tidal oscillations, determines the variation in the flow rate through the channel. Predicting the distribution of water velocity and the energy supply for tidal applications, combined with the cross-sectional area and depth, makes it possible to estimate the possible energy output.

A significant amount of information is easily accessible on the web and it is possible to reconstruct the tides trend by means of harmonic Fourier series, whose main constituents depend on the geographic coordinates.

The simplest way to exploit the tides is the building of a barrage for accumulating water masses at high level to feed hydraulic turbines. However, the placement of a barrage, for example in an estuary, has a considerable influence on the water and the environment within the created basin. Many governments have been hesitant in recent times to give permission for tidal barrages. In fact, a lot of issues affecting and changing the environment equilibrium arise, as reported below.

• Turbidity: As a result of smaller amounts of water being exchanged between the basin and the sea, turbidity (the quantity of matter suspended in the water) decreases. This allows sunlight to penetrate further into the water, enhancing conditions for phytoplankton. The changes propagate through the food chain, altering the environment in general.


However, many places around the world offer marine energy resources in the form of tidal currents. Tidal currents are generated by tides and are characterized by water flows carrying large amounts of kinetic energy that can be applied directly to marine turbines. Installations exploiting tidal currents are less problematic, and the environmental issues are more manageable. Unfortunately, unlike tides, the trend of tidal currents is very hard to predict, and this prediction is the essential aspect for assessing the feasibility of the power plant.

In this paper, several cases were addressed by considering multiple approaches, from the simplest to the most complex. The objective was to provide formulas that predict the trend of the tidal currents for a rapid assessment of energy production. Simple configurations, such as channels or channels connected with a bay, can be approached with simple formulas derived from the trend of the tides.

In other cases, when the land conformation is very complex, 2D or 3D approaches are needed. To support these computations, software available online, such as TOPEX/Poseidon TPXO, TELEMAC, THETIS, or POM, can predict the trend of the tidal current. However, it is necessary to provide accurate information about the bathymetry of the site.

**Author Contributions:** The authors equally contributed to the work. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


### *Article* **Solar Irradiation Evaluation through GIS Analysis Based on Grid Resolution and a Mathematical Model: A Case Study in Northeast Mexico**

**Fausto André Valenzuela-Domínguez <sup>1</sup> , Luis Alfonso Santa Cruz <sup>1</sup> , Enrique A. Enríquez-Velásquez <sup>2</sup> , Luis C. Félix-Herrán 1,\* , Victor H. Benitez <sup>3</sup> , Jorge de-J. Lozoya-Santos <sup>4</sup> and Ricardo A. Ramírez-Mendoza <sup>4</sup>**


**Abstract:** The estimation of the solar resource on certain surfaces of the planet is a key factor in deciding where to establish solar energy collection systems. This research uses a mathematical model based on easy-access geographic and meteorological information to calculate total solar radiation at ground surface. This information is used to create a GIS analysis of the State of Nuevo León in Mexico and identify solar energy opportunities in the territory. The analyzed area was divided into a grid and the coordinates of each corner are used to feed the mathematical model. The obtained results were validated with statistical analyses and satellite-based estimations from the National Aeronautics and Space Administration (NASA). The applied approach and the results may be replicated to estimate solar radiation in other regions of the planet without requiring readings from on-site meteorological stations and therefore reducing the cost of decision-making regarding where to place the solar energy collection equipment.

**Keywords:** total solar irradiation; GIS analysis; mathematical model; grid map design; statistical analysis; sustainable urban planning

**Citation:** Valenzuela-Domínguez, F.A.; Santa Cruz, L.A.; Enríquez-Velásquez, E.A.; Félix-Herrán, L.C.; Benitez, V.H.; Lozoya-Santos, J.d.-J.; Ramírez-Mendoza, R.A. Solar Irradiation Evaluation through GIS Analysis Based on Grid Resolution and a Mathematical Model: A Case Study in Northeast Mexico. *Energies* **2021**, *14*, 6427. https://doi.org/10.3390/en14196427

Academic Editors: Philippe Leclère, Jesús Polo and Benedetto Nastasi

Received: 25 July 2021; Accepted: 31 August 2021; Published: 8 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

### **1. Introduction**

Solar energy is one of the most popular sources of renewable energy around the world. Compared to other forms of energy supply, such as fossil fuels, producing energy from solar resources reduces carbon dioxide emissions. It also promotes diversification of the energy supply and, as a result, regional energy independence for regions with difficult access to the electric grid. Moreover, according to the International Energy Agency (IEA), solar power may become the world's leading energy source by 2050. Progress towards this goal is rapid; e.g., solar energy accounted for about one-quarter of all new energy production installations in the first half of 2017 in the United States, which represents almost 1.6 million solar installations in total for the whole country [1]. An important aspect to consider, as photovoltaic (PV) systems increase, is the amount of land surface used to place the systems that collect the solar resource. It is necessary to consider soil consumption, electricity consumption, and renewable electricity production, as well as their relationships and possible policies that will allow the adequate development of large-scale use of solar resource collection systems [2].

Various approaches to estimate solar irradiation have been applied globally. Regarding the task of measuring solar irradiation, it is well known that the best solution is to have meteorological stations that directly measure the total amount of solar radiation using specialized sensors such as pyranometers and pyrheliometers, but the installation, operation, and maintenance of the stations can be expensive. On the other hand, solar irradiation is regularly estimated by means of satellite readings [3] and mathematical models [4]. These approaches may cover a larger effective measurement area. When comparing the cost of measuring and estimating, the first approach is more expensive, as it requires investment in equipment and its installation and maintenance; by contrast, accurate estimation models require considering many variables such as cloud cover and solar zenith angle [5]. In this context, the present investigation focuses on the estimation of solar irradiation using mathematical models.

Mathematical models have been reported for several years to estimate total solar irradiation. A very complete work applied 294 different empirical mathematical models to estimate the solar resource in regions of China. In the study, the models were grouped according to their characteristics and two statistics were applied (root mean square error (*RMSE*) and relative root mean square error (*RRMSE*)) to validate them [4]. Another reported study used empirical models and considered the sky-diffuse radiation [6]. Research efforts have also focused on reviewing the accuracy of the models: one study analyzed twelve empirical models and compared them with models based on machine learning, the latter showing the best performance. These results reveal significant areas of opportunity in some empirical models [7]. This opens the possibility of other radiation estimation approaches such as deep learning, which has been applied through a multi-layer perceptron (MLP) method to estimate horizontal daily solar irradiation [8], Bayesian model averaging and machine learning [9], or more theory-oriented control techniques like Kalman filters [10].

An important consideration when working with mathematical models that estimate solar resources is their validation. For example, Kausika et al. [11] report a calibration and validation of a model, and for this they use experimental data from at least two meteorological stations. In addition, disadvantages are presented such as underestimation and overestimation of solar insolation.

Some research efforts on solar energy address its growth and application in Mexico. Reports include an analysis of different scenarios for the possible integration of solar systems in Mexico taking climate policies into account, which found that the cost-optimal share of solar energy in electricity and transportation would be 75% and 90%, respectively [12]. Other research reviews the solar resource assessment in Mexico and highlights the differences in estimated radiation between the maps reported for Mexico, which can be of the order of 40%. When comparing these results with estimates based on satellite measurements, which are more precise, an important drawback is observed in the work of estimating solar irradiation in Mexico [13]. A more recent investigation estimated solar irradiation, along with the assessment of available solar resources based on meteorological and geographical data, in the northwestern state of Sonora [14].

The work reported by Enríquez-Velásquez et al. [14] is based on the mathematical model developed by Obukhov et al. [15], adapted for use in a northern state of Mexico. The estimates were validated with satellite readings from the National Aeronautics and Space Administration (NASA) [16] and with meteorological station data from the Mexican National Weather Service (Servicio Meteorológico Nacional, SMN, by its acronym in Spanish) [17]. The results in [14] were close to those obtained by NASA and SMN. As part of the method, the area was divided into the 72 counties of Sonora State, which have very irregular shapes, and more research is required in this regard.

The SMN has meteorological stations all around the country and they have the capability to measure solar radiation. For the measurement, the stations are equipped with pyranometers and pyrheliometers. Besides, there are two types of stations in the SMN network: automatic meteorological stations (EMAS by its acronym in Spanish) and synoptic meteorological stations (ESMAS, by its acronym in Spanish). Although there are slight differences between the variables they measure, both have solar radiation sensors (which is important to us in this study). Moreover, all stations provide measurements that are considered valid within a radius of 5 km [17].

Even though Mexico has weather stations to measure solar radiation, the difficulty in developing estimation models arises from the limited number of meteorological locations, which is around 270 (190 EMAS and 80 ESMAS) [17]; this lack of meteorological stations is more evident in the north of the country, where the case study of this work is located. In addition, having stations implies operation and maintenance costs, which is also a constraint. Moreover, it has been reported that a significant number of stations may have erroneous measurements or may not have met certain validation criteria [18]. Another study reported that only 33% of the stations in the state of Sonora (northwest of Mexico) were reliable, due to, among other causes, possibly deficient maintenance of the stations. This caused loss or inconsistency of data, which prevented reliable readings [14].

One of the most industrially developed states in Mexico is Nuevo León, where solar energy harvesting could be a relevant option to reduce the operating costs of various industrial processes, as well as household consumption. Although the area of this state is 64,924 km<sup>2</sup>, there are only four meteorological stations to measure solar radiation [17]. This situation poses a problem and, in turn, a motivation for the present study.

Given the previous context and scenario, this research focuses on the estimation of solar irradiation using the mathematical approach reported in [14]; however, it applies a different spatial discretization. We divide the area by means of a discrete grid that covers the entire area of the state at evenly spaced points. As a reference, the coordinates (latitude and longitude) of each square's corner are taken instead of the midpoints of irregular surfaces. The case study considered is the state of Nuevo León in the northeast of Mexico. Results are validated by means of statistical methods and are compared against NASA estimations. No field stations are required, which gives an advantage over direct field measurements that rely on dedicated equipment.

The manuscript is organized as follows. Section 1 includes the context, relevance, previous work, motivation, and contribution of this research. Section 2 explains the database used for the study. Section 3 develops the applied methodology, whereas Section 4 presents the findings and discussions. The article closes with a section on conclusions and future work.

### **2. Data Sources**

This section describes *The Power Project* database of the National Aeronautics and Space Administration (NASA) through the Surface Meteorology and Solar Energy (SSE), which is available to the public through an internet portal [16]. The current research work utilizes the resources in *The Power Project* to establish a proper estimation of solar resource for the state of Nuevo León located in the northeast region of Mexico highlighted in Figure 1.

**Figure 1.** The state of Nuevo León is highlighted for reference. Geographical places of interest are named for reference as well. The image was taken and modified from the work in [19].

It is intended that this research will allow evaluating the viability of implementation of solar projects in this zone, as well as establishing methods of estimating the solar irradiation in other areas using the same program. This data source was selected due to its reliability and access to data worldwide for the parameters required for the calculations in the model utilized in this research and as a reference for comparison against data obtained from the model during the validation for the analyzed geographic zone. In addition, this data source contains a collection of around 30 years of several meteorological parameters based on satellite observation. This provides a solid comparison reference for validation of the model.

Inside the main page of The Power Project, select the option *POWER DATA ACCESS VIEWER*, and a map will be displayed. In the floating menu, select the *Climatology* option and enter the desired latitude and longitude, along with the parameters of interest. With this, the database provides the required data [16]. For the current research, clearness index (*kt*) and surface albedo (*ρ*) were obtained from the database for the calculations as well as all sky insolation incident on a horizontal surface (G) for validation of the model.

Note that other data sources such as meteorological stations from the Mexican national meteorological system (SMN) were considered but discarded due to the lack of stations in the state of Nuevo León (only three stations), which was considered not enough for comparison.

### **3. Methodology**

This section presents the research methodology followed to obtain a GIS analysis that reflects the yearly behavior of the total insolation as well as the maximum and minimum temperatures in the state of Nuevo León, Mexico. In addition, the processes for validating the model employed and for designing a representative grid are described for the state of Nuevo León. The applied methodology is presented in Figure 2.

**Figure 2.** Applied methodology.

### *3.1. Mathematical Model*

The work reported by Enríquez-Velásquez et al. [14] was based on the mathematical model developed by Obukhov et al. [15] for high latitudes (55° N). This model was applied in [14] for the calculation and evaluation of the solar resource in the Northwestern Mexican state of Sonora at latitudes between 26° and 32° N. The applied mathematical formula for the estimation of total solar irradiation arriving at an inclined surface (*G*) is as follows:

$$G = \left[G\_D\left(\frac{\cos \theta}{\cos \theta\_z}\right)\right] + G\_{DH}\left[A\_i\left(\frac{\cos \theta}{\cos \theta\_z}\right) + (1 - A\_i)\left(\frac{1 + \cos \beta}{2}\right)\right] + \left[G\_H\,\rho\left(\frac{1 - \cos \beta}{2}\right)\right] \tag{1}$$

Equation (1) gives the total solar irradiation on a surface oriented at any angle by adding the direct, diffuse, and reflected components of solar radiation. The model includes the tilt angle of the receiving surface (*β*), the surface albedo (*ρ*), the incidence angle (*θ*), the solar zenith angle (*θ<sub>z</sub>*), and the anisotropic index (*A<sub>i</sub>*). Other parameters such as the diffusion index *K<sub>D</sub>* were calculated using the conditional table reported in [14]. For further information on the calculation of each variable, the same document can be consulted.
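To make the structure of Equation (1) concrete, a minimal Python sketch is given here. It is not the authors' MATLAB implementation. It assumes that the tilt angle *β* enters the sky-view factor (1 + cos *β*)/2 and the ground-view factor (1 − cos *β*)/2, as in the standard Hay–Davies form, and that *G<sub>D</sub>*, *G<sub>DH</sub>*, and *G<sub>H</sub>* denote the direct, diffuse, and total irradiation on a horizontal surface. The function name, argument names, and input values are hypothetical.

```python
import math


def total_irradiation_tilted(g_d, g_dh, g_h, theta_deg, theta_z_deg,
                             beta_deg, albedo, a_i):
    """Equation (1): direct + diffuse + ground-reflected irradiation.

    Angles are in degrees; the result has the same units as the inputs.
    No clipping is applied, so the caller should ensure cos(theta) >= 0
    and a zenith angle well below 90 degrees.
    """
    cos_theta = math.cos(math.radians(theta_deg))
    cos_theta_z = math.cos(math.radians(theta_z_deg))
    cos_beta = math.cos(math.radians(beta_deg))
    r_b = cos_theta / cos_theta_z  # geometric factor for the direct beam
    direct = g_d * r_b
    diffuse = g_dh * (a_i * r_b + (1.0 - a_i) * (1.0 + cos_beta) / 2.0)
    reflected = g_h * albedo * (1.0 - cos_beta) / 2.0
    return direct + diffuse + reflected


if __name__ == "__main__":
    # Hypothetical inputs: surface tilted 25 degrees, albedo 0.2.
    print(total_irradiation_tilted(500.0, 100.0, 600.0, 20.0, 30.0, 25.0, 0.2, 0.3))
```

For a horizontal surface (*β* = 0 and *θ* = *θ<sub>z</sub>*), the expression reduces to *G<sub>D</sub>* + *G<sub>DH</sub>*, which is a useful consistency check.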

Table 1 shows the day taken as a reliable average representation of solar irradiation for each month. In addition, the model was used to calculate the average monthly solar irradiance on a horizontal surface throughout the year. The above-mentioned model was written in MATLAB and reads the input data from an Excel file containing the pertinent data of all the chosen geographical points of interest; once every calculation is made, it writes the resulting data to a second Excel file for further processing. Furthermore, additional input data to run the mathematical model were obtained with Python code requesting bulk data from NASA's POWER LARC API. These data were produced as several CSV files that were later merged using Excel data manipulation tools so that they could be read by the model.

**Table 1.** Average representative day for each month.

| Month | Representative Day | Day Number of the Year |
|---|---|---|
| January | 17 | 17 |
| February | 16 | 47 |
| March | 16 | 75 |
| April | 15 | 105 |
| May | 15 | 135 |
| June | 11 | 162 |
| July | 17 | 198 |
| August | 16 | 228 |
| September | 15 | 258 |
| October | 15 | 288 |
| November | 14 | 318 |
| December | 10 | 344 |
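The day numbers in Table 1 can be reproduced with the Python standard library, as in the short sketch below. The dictionary and function names are hypothetical, and a non-leap year (2021) is assumed, which is what the tabulated day numbers imply.

```python
from datetime import date

# Representative day of each month, taken from Table 1 (month -> day of month).
REPRESENTATIVE_DAYS = {
    1: 17, 2: 16, 3: 16, 4: 15, 5: 15, 6: 11,
    7: 17, 8: 16, 9: 15, 10: 15, 11: 14, 12: 10,
}


def day_number(month, day, year=2021):
    """Return the day number of the year (1-365) for a non-leap year."""
    return date(year, month, day).timetuple().tm_yday


if __name__ == "__main__":
    for month, day in REPRESENTATIVE_DAYS.items():
        print(month, day, day_number(month, day))
```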

In this research, 80 geographic coordinates were fed to the program at once. The program took around 90 min to complete the millions of calculations required for these geographical points. This represents a significant advantage over calculating each data point manually or retrieving all the results and parameters from a database. The mathematical model has the clear advantage of calculating several geographical data points at once and generating an Excel file with the results in an acceptable period of time.

### *3.2. Statistical Parameters*

To validate the model, pertinent statistical methods were used to ensure data accuracy. The data generated using the mathematical model described herein were compared against the data provided by the NASA SSE project [16], specifically for the parameter *G*, "Total radiation arriving at a horizontal surface". The eight statistical methods employed are outlined in Equations (2)–(9), where *X<sub>i</sub>* stands for each G-value calculated by the mathematical model, *Y<sub>i</sub>* is the G-value provided by NASA's SSE database, *n* is the number of months (sample size), and *i* represents the month number analyzed.

*MAE*—Mean absolute error

$$MAE = \frac{1}{n} \sum\_{i=1}^{n} |X\_i - Y\_i| \tag{2}$$

In Equation (2), mean absolute error (*MAE*) represents the average of the error's magnitude. It is desired for this value to be as close to zero as possible. *MAE* calculates a ratio relating the number of samples *n* to the magnitude of the error vector [20].

*MBE*—Mean bias error

$$MBE = \frac{1}{n} \sum\_{i=1}^{n} (X\_i - Y\_i) \tag{3}$$

In Equation (3), mean bias error (*MBE*) calculates the bias of the model results. For this value, the closer it is to zero, the better [21].

*RMSE*—Root mean square error

$$RMSE = \sqrt{\frac{1}{n} \sum\_{i=1}^{n} (X\_i - Y\_i)^2} \tag{4}$$

In Equation (4), root mean square error (*RMSE*) calculates the standard deviation of the calculated data [22]. It is desired to be as close to zero as possible.

*MPE*—Mean percentage error

$$MPE = \frac{100}{n} \sum\_{i=1}^{n} (\frac{X\_i - Y\_i}{Y\_i}) \tag{5}$$

In Equation (5), mean percentage error (*MPE*) calculates the percentage error of the model-calculated data; it describes the average relative error of the model. A ±10% value is allowable [23].

*RPE*—Relative percentage error

$$RPE = \left(\frac{X\_i - Y\_i}{Y\_i}\right) \times 100\tag{6}$$

Equation (6) calculates the relative percentage error (*RPE*), which represents the percentage error for each monthly result. A ±10% value is permissible [24].

*r*—Correlation coefficient

$$r = \frac{\sum\_{i=1}^{n} (X\_i - \bar{X})(Y\_i - \bar{Y})}{\sqrt{\sum\_{i=1}^{n} (X\_i - \bar{X})^2 \sum\_{i=1}^{n} (Y\_i - \bar{Y})^2}} \tag{7}$$

In Equation (7), *r* (correlation coefficient) indicates the correlation between two different variables in a range of ±1, where +1 is a positive linear correlation, −1 represents a negative linear correlation, and 0 stands for no correlation at all. Correlation values are desired to be as close to 1 as possible. Moreover, *X̄* represents the annual average G-value calculated using the mathematical model, and *Ȳ* is the annual average G-value provided by the NASA SSE database [25].

*R* <sup>2</sup>—Coefficient of determination

$$R^2 = 1 - \left(\frac{\sum\_{i=1}^n (Y\_i - X\_i)^2}{\sum\_{i=1}^n (Y\_i - \bar{Y})^2}\right) \tag{8}$$

In Equation (8), the coefficient of determination, *R*<sup>2</sup>, quantifies how well the model explains the variability of the reference data set, expressed as a value from 0 to 1. It is desired for *R*<sup>2</sup> to be as close to 1 as possible [26].

*t*—Student distribution test

$$t = \sqrt{\frac{(df)(MBE)^2}{(RMSE)^2 - (MBE)^2}}\tag{9}$$

As presented in Equation (9), the Student *t*-distribution test is commonly used to assess the significance of the difference between two independent data sets, and it is especially useful when dealing with a small sample size (small value of *n*, in this case, 12 months). This statistical test indicates whether a result is statistically significant according to the selected metrics. For this test, a critical value of 4.025 was chosen to reflect a confidence level of 99% and eleven degrees of freedom. In the formula, *MBE* represents the mean bias error, *RMSE* represents the root mean square error, *df* stands for degrees of freedom, and *n* is the number of months calculated. To prove statistical significance, the calculated *t*-value must be less than the chosen critical value [27].
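The statistics of Equations (2)–(9) can be computed together, as in the following Python sketch. It is an illustration rather than the authors' MATLAB code: the function name and the returned dictionary keys are hypothetical, and the degrees of freedom are taken as *df* = *n* − 1, consistent with eleven degrees of freedom for 12 months.

```python
import math


def validation_metrics(model, reference):
    """Compute the metrics of Equations (2)-(9) for paired values X_i, Y_i."""
    if len(model) != len(reference) or len(model) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(model)
    errors = [x - y for x, y in zip(model, reference)]
    mae = sum(abs(e) for e in errors) / n                     # Eq. (2)
    mbe = sum(errors) / n                                     # Eq. (3)
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)                                     # Eq. (4)
    rpe = [100.0 * e / y for e, y in zip(errors, reference)]  # Eq. (6), per month
    mpe = sum(rpe) / n                                        # Eq. (5)
    x_bar = sum(model) / n
    y_bar = sum(reference) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(model, reference))
    s_xx = sum((x - x_bar) ** 2 for x in model)
    s_yy = sum((y - y_bar) ** 2 for y in reference)
    r = s_xy / math.sqrt(s_xx * s_yy)                         # Eq. (7)
    r2 = 1.0 - sum(e * e for e in errors) / s_yy              # Eq. (8)
    spread = mse - mbe ** 2  # RMSE^2 - MBE^2, the denominator of Eq. (9)
    if spread > 0:
        t = math.sqrt((n - 1) * mbe ** 2 / spread)            # Eq. (9), df = n - 1
    else:
        # All errors identical: t is 0 for zero bias, unbounded otherwise.
        t = 0.0 if mbe == 0 else math.inf
    return {"MAE": mae, "MBE": mbe, "RMSE": rmse, "MPE": mpe,
            "RPE": rpe, "r": r, "R2": r2, "t": t}


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(validation_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]))
```

The returned *t*-value would then be compared with the chosen critical value, and each entry of `RPE` with the ±10% tolerance.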

### *3.3. Model Validation*

The NASA SSE data were used to validate the model's results for each of the five chosen locations by checking whether these results coincide with, or approximate, the data obtained from NASA SSE. Statistical metrics were calculated to analyze the performance of the model compared to the data source. The model was validated using these strategic points, scattered at the edges of the Mexican state of Nuevo León, in order to obtain an overview of the model behavior under the most extreme parameter conditions available in this territory.

The five points were chosen to represent the cardinal points of the state and its center. They were placed as follows: Monterrey to the west, China in the center of the state, La Osca to the east, Doctor Arroyo to the south, and Anáhuac to the north. This deployment allowed more diverse data to be used for validation of the model, something that would not have been possible if data from Mexico's Meteorological Service had been used instead. The latitudes and longitudes of the five locations are listed in Table 2.

**Table 2.** Latitude and longitude coordinates for all five locations in the state of Nuevo León chosen as reference points to validate the mathematical model with NASA's data.


The calculated validation data were processed using several statistical methods to ensure their significance and accuracy by comparing them to NASA's long-term acquired data. This comparison was further explored by plotting them side by side using MATLAB functions.

### *3.4. Area Justification*

To create a visual representation of the temperatures and solar radiation over the delimited surface, a grid was created to generate heat maps for the whole state. The Tech District, the area around the Monterrey Tech campus, was selected as the origin point of this grid, i.e., the rounded red-black point to the right of *Z*0, as shown in Figure 3. This location was chosen because it is of special interest for future research on solar energy, and a meteorological facility is in the process of being installed in this area. The grid was generated from this point using axes parallel to the equator and the Greenwich Meridian, respectively. Each grid cell is a square with 30 km sides, and the vertices of each cell serve as evaluation points for the desired data (maximum temperature, minimum temperature, or solar irradiation); thus, 80 of these points were produced across the state of Nuevo León. The 30 km × 30 km grid resolution was chosen to avoid retrieving too many data points while still providing acceptable results (as supported by the statistical validation and the comparison against the NASA SSE database). A higher resolution (with a shorter cell side) is not practical because, if the vertices are too close to each other, solar radiation varies little between them. Figures 3 and 4 illustrate the work done in Google Earth to distribute the 80 points and to draw the grid covering the entire surface of the state.

**Figure 3.** Geographical points (80 in total) selected for mathematical model calculation. The rounded red-black point represents the grid's origin.

**Figure 4.** The resulting grid strategically dividing the state of Nuevo León. The lines are parallel to the equator and Greenwich Meridian.
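The grid construction described in this section can be sketched in a few lines of Python. This is only an approximation under stated assumptions: a spherical Earth with a mean radius of 6371 km, and a longitudinal step evaluated once at the latitude of the origin. The function names and the offset ranges are hypothetical; the origin coordinates are those listed for Z0 in Appendix A. With these assumptions, the steps come out close to the spacing of roughly 0.27° in latitude and 0.30° in longitude seen between neighboring points in Appendix A.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)


def grid_steps_deg(origin_lat_deg, cell_km=30.0):
    """Angular steps (degrees) that span cell_km north-south and east-west."""
    dlat = math.degrees(cell_km / EARTH_RADIUS_KM)
    # A degree of longitude shrinks with the cosine of the latitude.
    dlon = dlat / math.cos(math.radians(origin_lat_deg))
    return dlat, dlon


def build_grid(origin_lat, origin_lon, rows, cols, cell_km=30.0):
    """Vertices of a grid with axes parallel to the equator and a meridian.

    rows and cols are iterables of integer offsets from the origin
    (negative values lie south or west of it).
    """
    dlat, dlon = grid_steps_deg(origin_lat, cell_km)
    return [
        (i, j, origin_lat + i * dlat, origin_lon + j * dlon)
        for i in rows
        for j in cols
    ]


# Origin Z0 as listed in Appendix A; the offset ranges are illustrative only.
Z0_LAT, Z0_LON = 25.6544, -100.5859
points = build_grid(Z0_LAT, Z0_LON, rows=range(-2, 3), cols=range(0, 3))
```

Each tuple holds the row offset, the column offset, the latitude, and the longitude of one vertex. In practice, the vertices falling outside the state boundary would then be discarded.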

### *3.5. Heat Maps*

GIS maps are used to determine the amount of solar resource available over a given surface. The total solar radiation was calculated for each point on the statewide grid for each month of the year. Average maximum and minimum monthly temperatures were taken from the NASA SSE data source for each point as a comparison reference. The model was run at each of the chosen grid locations. The results were then plotted as heatmaps to provide an overview of the annual variation in solar irradiance arriving at a horizontal surface throughout the state's territory.

The heatmaps were coded in Python with the support of the Pandas and Plotly libraries. Pandas was used for data manipulation. Plotly was used to fetch the maps and to overlay the heatmap on them, adding the corresponding legends and color bar. Finally, the Kaleido library was used to export a vector image for further enhancement and postprocessing.
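The workflow above can be illustrated with a simplified sketch. It is not the authors' script: it uses four January G\_Model values from Appendix A, hypothetical function names, and a plain Plotly `Heatmap` trace without the base map layer mentioned above.

```python
import pandas as pd

# January G_Model values (kWh/m^2) for four grid points from Appendix A.
records = [
    {"location": "A-6", "lat": 24.0294, "lon": -100.2871, "G_Model": 4.08},
    {"location": "A-7", "lat": 23.7586, "lon": -100.2871, "G_Model": 4.10},
    {"location": "B-6", "lat": 24.0294, "lon": -99.98822, "G_Model": 3.81},
    {"location": "B-7", "lat": 23.7586, "lon": -99.98822, "G_Model": 3.97},
]
df = pd.DataFrame(records)


def to_grid(frame, column):
    """Pivot long-format points into a latitude x longitude matrix."""
    return frame.pivot(index="lat", columns="lon", values=column).sort_index()


def make_heatmap(frame, column, title):
    """Build a Plotly heatmap figure (no base map layer in this sketch)."""
    import plotly.graph_objects as go  # imported lazily; only needed for plotting

    grid = to_grid(frame, column)
    fig = go.Figure(
        go.Heatmap(
            z=grid.values,
            x=list(grid.columns),
            y=list(grid.index),
            colorbar={"title": {"text": "kWh/m^2"}},
        )
    )
    fig.update_layout(title=title, xaxis_title="Longitude", yaxis_title="Latitude")
    return fig


january = to_grid(df, "G_Model")
```

A vector image could then be exported with, for example, `make_heatmap(df, "G_Model", "January").write_image("january.svg")`, which relies on the Kaleido package being installed; the file name is only an example.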

### **4. Results and Discussion**

This section presents the results obtained in this research and discusses the findings in order to identify the opportunities and potential for solar energy in the state of Nuevo León.

### *4.1. Model Validation Results*

The mathematical model was analyzed for the five locations presented in Section 3, which were selected to represent the surface of the state of Nuevo León. One point was selected for each cardinal direction and one in the middle of the state to ensure the maximum possible coverage. Based on these points, the model was validated against the NASA SSE database using the statistical parameters presented in Section 3.

The results of this analysis were satisfactory in all cases, with only small discrepancies observed. The values of *MAE*, *MBE*, and *RMSE* were close to zero in almost all cases, which indicates minimal error between the model and the reference. Slightly higher values of these parameters were obtained for China and Anáhuac, but they remain acceptable.

For the *MPE* and *RPE* tests, values close to zero percent were obtained for all localities, which means that all cases were within an acceptable range. A positive linear correlation was found in all cases using the parameter *r*. Furthermore, for the *R*<sup>2</sup> tests, the best performance was obtained for Dr. Arroyo and La Osca, followed by Monterrey and Anáhuac; a lower but still acceptable performance was observed for China.

The *t*-distribution test identified four of the analyzed points as statistically significant; the only exception was Anáhuac, whose *t*-value remained slightly above the selected critical value. Based on these statistical results, the model is a very close approximation to the values obtained from the NASA SSE database. This shows that the model provides a valid estimation of solar irradiation at the geographical coordinates of the state of Nuevo León. Table 3 provides the monthly solar irradiation values obtained from the model and the reference values from NASA SSE, and Table 4 shows the values obtained for all the statistical metrics analyzed. To review the accuracy of the results, refer to the statistical parameters in Section 3.2.


**Table 3.** Monthly comparison of *G* from the model against the *G* provided by NASA SSE database for the five target points.


**Table 4.** Statistical parameters to measure quality of estimated *G* against *G* from NASA SSE database for the five target points.

Figures 5–9 illustrate the behavior of the model with respect to the values from the NASA SSE database. The values shown in Figures 5–9 are solar irradiation in kWh/m<sup>2</sup>. For La Osca, the model and the NASA SSE values were nearly identical for all months, whereas for Dr. Arroyo and Monterrey small discrepancies were observed in certain months. Finally, the largest differences were found in Anáhuac and China for the summer months, but these differences remain small.

**Figure 5.** Values obtained for La Osca from mathematical model and NASA SSE.

**Figure 6.** Values obtained for Doctor Arroyo from mathematical model and NASA SSE.

**Figure 7.** Values obtained for Monterrey from mathematical model and NASA SSE.

**Figure 8.** Values obtained for Anáhuac from mathematical model and NASA SSE.

**Figure 9.** Values obtained for China from mathematical model and NASA SSE.

Based on the validation performed and the results in Figures 5–9, the proposed model is a close estimation of the solar resource in the region for latitudes from 23° to 28° and longitudes from −98° to −101°.

### *4.2. GIS Analysis Results*

From the GIS analysis applied to the grid of points described in Section 3 to calculate solar irradiation in Nuevo León, it was found that solar irradiation over the whole surface of the state ranged from a minimum of 2.97 kWh/m<sup>2</sup> to a maximum of 6.68 kWh/m<sup>2</sup> during the year. This illustrates the availability of the solar resource in the state and its evolution throughout the year. For each of the 12 months of the year, the GIS maps of average total solar irradiation on a horizontal surface, the average maximum temperatures, and the average minimum temperatures are presented in Figures 10–21. These GIS maps were generated from the computations in Appendix A.

**Figure 10.** GIS analysis for the month of January. (**left**) Average solar irradiation (kWh/m<sup>2</sup>), (**center**) Maximum average temperature (°C), and (**right**) Minimum average temperature (°C).

**Figure 11.** GIS analysis for the month of February. (**left**) Average solar irradiation (kWh/m<sup>2</sup>), (**center**) Maximum average temperature (°C), and (**right**) Minimum average temperature (°C).

**Figure 12.** GIS analysis for the month of March. (**left**) Average solar irradiation (kWh/m<sup>2</sup>), (**center**) Maximum average temperature (°C), and (**right**) Minimum average temperature (°C).

**Figure 13.** GIS analysis for the month of April. (**left**) Average solar irradiation (kWh/m<sup>2</sup>), (**center**) Maximum average temperature (°C), and (**right**) Minimum average temperature (°C).

**Figure 14.** GIS analysis for the month of May. (**left**) Average solar irradiation (kWh/m<sup>2</sup>), (**center**) Maximum average temperature (°C), and (**right**) Minimum average temperature (°C).

**Figure 15.** GIS analysis for the month of June. (**left**) Average solar irradiation (kWh/m<sup>2</sup>), (**center**) Maximum average temperature (°C), and (**right**) Minimum average temperature (°C).

**Figure 16.** GIS analysis for the month of July. (**left**) Average solar irradiation (kWh/m<sup>2</sup>), (**center**) Maximum average temperature (°C), and (**right**) Minimum average temperature (°C).

**Figure 17.** GIS analysis for the month of August. (**left**) Average solar irradiation (kWh/m<sup>2</sup>), (**center**) Maximum average temperature (°C), and (**right**) Minimum average temperature (°C).

**Figure 18.** GIS analysis for the month of September. (**left**) Average solar irradiation (kWh/m<sup>2</sup>), (**center**) Maximum average temperature (°C), and (**right**) Minimum average temperature (°C).

**Figure 19.** GIS analysis for the month of October. (**left**) Average solar irradiation (kWh/m<sup>2</sup>), (**center**) Maximum average temperature (°C), and (**right**) Minimum average temperature (°C).

**Figure 20.** GIS analysis for the month of November. (**left**) Average solar irradiation (kWh/m<sup>2</sup>), (**center**) Maximum average temperature (°C), and (**right**) Minimum average temperature (°C).

**Figure 21.** GIS analysis for the month of December. (**left**) Average solar irradiation (kWh/m<sup>2</sup>), (**center**) Maximum average temperature (°C), and (**right**) Minimum average temperature (°C).

The zone with the most constant solar irradiation levels during the year was the southwest of the state, except for June–August, when solar irradiation becomes almost uniform across the whole state, with slightly higher values in the north and east in June and July. The average maximum and minimum temperatures for the state range from 1.55 °C to 43.84 °C during the year across all the points analyzed.

The highest temperatures were found in the northern and eastern regions, whereas the lowest were found in the south and west. In addition, the state contains mountainous terrain in its southern and western areas, which gives way to plains in its northern and eastern regions. All of these factors must be considered in planning PV projects. On this basis, the west region, where the capital city of Monterrey is located, appears promising on average throughout the year in terms of total solar irradiance. However, the effect of low and high temperatures on the current and voltage of solar PV panels must be taken into account to properly design solar projects. An advantage of the state in terms of minimum temperatures is that there are no sub-zero temperatures, which facilitates the design and selection of the components of a photovoltaic system because voltage variations due to temperature changes are smaller.
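To make the temperature effect on voltage concrete, the following sketch applies the common linear temperature-coefficient model for the open-circuit voltage of a PV module. The module parameters (40 V at standard test conditions and a coefficient of −0.3 %/°C) are hypothetical values chosen only for illustration, and the temperature extremes reported above are used as stand-ins for cell temperature, although real cells typically run hotter than their surroundings.

```python
STC_CELL_TEMP_C = 25.0  # cell temperature at standard test conditions


def voc_at_temperature(voc_stc, temp_coeff_per_c, cell_temp_c):
    """Linear temperature model for the open-circuit voltage of a module."""
    return voc_stc * (1.0 + temp_coeff_per_c * (cell_temp_c - STC_CELL_TEMP_C))


# Hypothetical module parameters, for illustration only.
VOC_STC = 40.0      # volts at 25 degrees C
BETA_VOC = -0.003   # fractional change per degree C (i.e., -0.3 %/degree C)

# Temperature extremes reported for the state, used as stand-ins for
# cell temperature.
T_MIN_C, T_MAX_C = 1.55, 43.84

voc_cold = voc_at_temperature(VOC_STC, BETA_VOC, T_MIN_C)
voc_hot = voc_at_temperature(VOC_STC, BETA_VOC, T_MAX_C)
```

Under these assumptions the voltage rises above its rated value at the coldest temperature and falls below it at the hottest, which is the spread a string design has to accommodate.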

A factor to consider for the west and south zones of the state is that the terrain is mountainous, which makes it difficult to implement large-scale photovoltaic projects because of limited space and a lack of suitable terrain. Space is required to avoid shadowing and to distribute the solar panels properly over the terrain; the bigger the project, the more terrain it will require. On the other hand, the north and east regions are characterized by flat terrain, which allows the development of large-scale PV projects. Here, the main issue is the maximum temperatures in summer, which reduce the energy conversion efficiency of PV panels, so measures to counteract these effects must be taken in the design.

The GIS analysis applied to solar irradiation proved to be a useful tool that allows easy identification of opportunity areas for photovoltaic systems in the state of Nuevo León. The main advantage is that all the information required for decision-making is shown together, allowing comparison of the parameters that affect PV performance and planning, such as temperatures and geographical terrain. It is important to choose a proper resolution for the analyzed area. As explained in Section 3, the points were distributed on a grid of 30 km × 30 km.

Compared with the approach applied by Enríquez-Velásquez et al. [14], the approach herein has the advantage of using a grid to divide the area of interest instead of dividing the surface into municipalities, which have irregular shapes, as was done in the state of Sonora, Mexico. The grid concept is versatile and allows the solar resource of any area to be estimated in a more uniform way, taking a given latitude and longitude as the origin point of the grid. It should also be considered that the mathematical model will need to be validated if it is to be applied to other latitudes and longitudes, as suggested by Kausika et al. [11].

### **5. Conclusions and Future Work**

Based on the findings described in Section 4, it is concluded that the mathematical model used in this research successfully computed total solar irradiation with high accuracy for the state of Nuevo León in Mexico. This accuracy was demonstrated using statistical metrics comparing the model against the NASA SSE database as a reference. The model was validated for the geographic zone between latitudes 23° and 28° N and longitudes 98° and 101° W.

Based on an examination of the state of Nuevo León using GIS analysis, the solar irradiation calculated by the model was compared against maximum and minimum temperatures, and opportunities for local photovoltaic (PV) development were identified. The GIS analysis was performed over a grid of 30 km × 30 km, which covers the entire surface of the state. This approach allowed a clear visualization of the solar irradiation without rejecting data due to long distances between the points of the proposed grid. It was concluded that solar irradiation in the state of Nuevo León ranges from a minimum of 2.98 kWh/m<sup>2</sup> to a maximum of 6.68 kWh/m<sup>2</sup>. This result demonstrates the high potential for solar energy projects in the region. The comparison against extreme temperatures and geographical terrain made it possible to identify the areas in the region that are more suitable for PV development.

It was also inferred that the most suitable zones for solar collection systems may be located in the southwest, which showed constant solar irradiation values throughout the year; however, these zones are mountainous, which should be considered in the design of solar farms because of the space required to avoid shadowing among solar panels. Note that the most industrialized zone of the state is the Monterrey metropolitan area, which also receives constant solar irradiation throughout the year; this represents great potential for the sustainable development of the industrial base in the city.

Exploiting this potential would greatly reduce the carbon emissions of the state, which is one of the most industrialized in Mexico. As a result, this region would contribute to the fulfillment of Mexico's international agreements on the reduction of its carbon footprint and of commitments such as the Paris Agreement. This would be reflected in urbanization and sustainable development in electricity generation for the city of Monterrey. The mathematical model used in this research is shown to be an easy and accurate tool for estimating solar irradiation in the region; it does not depend on field references for the calculations and only requires meteorological data and geographic coordinates that are easily accessible online.

This represents an economical alternative to meteorological stations, which entail implementation costs and require constant maintenance. In addition, a large number of meteorological stations would be needed to cover the entire area of the state at the same resolution obtained by the model. Considering all of these factors, it is concluded that the proposed model and GIS analysis offer a great opportunity for solar energy planning in the public and private sectors in the state, and that these GIS methods can be used as decision-making tools and as a reference for the implementation of solar projects in the region. Furthermore, they could serve as a tool for possible cooperation with the United States regarding the sustainable development of both nations.

As future work, the model is proposed to be applied with different inclination angles for the receiving surface, as well as surface azimuth angles, to simulate the daily tracking of the sun throughout the year. In addition, more databases for solar resource estimation may be employed to reinforce the validation of our results. This would be the basis for developing a solar monitoring system that would increase the efficiency of solar collection in PV systems within the region, thus increasing their profitability and production of electrical energy. Additionally, solar tracking could be used to design arrays of heliostats to focus sunlight at a central tower to heat liquids and generate steam for electric generation using solar thermal technologies. These projects are focused on sustainable urban and industrial development facing the current challenges of climate change.

**Author Contributions:** Conceptualization, E.A.E.-V. and J.d.-J.L.-S.; methodology, E.A.E.-V., L.C.F.-H. and V.H.B.; software, F.A.V.-D.; validation, F.A.V.-D., L.A.S.C., E.A.E.-V. and L.C.F.-H.; formal analysis, F.A.V.-D., L.A.S.C. and E.A.E.-V.; investigation, F.A.V.-D., L.A.S.C., E.A.E.-V., L.C.F.-H. and V.H.B.; resources, J.d.-J.L.-S.; data curation, F.A.V.-D. and E.A.E.-V.; writing-original draft preparation, F.A.V.-D., L.A.S.C., E.A.E.-V. and L.C.F.-H.; writing—review and editing V.H.B., J.d.-J.L.-S. and R.A.R.-M.; visualization, V.H.B., J.d.-J.L.-S. and R.A.R.-M.; supervision, L.C.F.-H.; project administration, L.C.F.-H.; funding acquisition, J.d.-J.L.-S. and R.A.R.-M. All authors have read and agreed to the published version of the manuscript.

**Funding:** The APC was funded by the CampusCity Initiative from the School of Engineering and Sciences at the Tecnológico de Monterrey.

**Acknowledgments:** The authors are grateful to Tecnológico de Monterrey for having provided the software resources to carry out the simulation work. In addition, we thank NASA—The Power Project for giving free access to its databases and to consult the reference values to estimate the total solar irradiation and corroborate the results.

**Conflicts of Interest:** The authors declare no conflicts of interest. The funder had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

#### **Appendix A**

#### *Solar Irradiance Parameters for the 80 Selected Points in the State of Nuevo León*


**Table A1.** Monthly parameter measurements for each point on the state-wide grid.




**Table A2.** Monthly parameter measurements for each point on the state-wide grid.

| Location | LAT | LON | Parameter | January | February | March | April | May | June | July | August | September | October | November | December | Annual Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A-6 | 24.0294° | −100.2871° | G\_Model | 4.08 | 4.93 | 5.92 | 6.32 | 6.58 | 6.57 | 6.28 | 6.12 | 5.39 | 4.98 | 4.44 | 4 | 5.47 |
| | | | G\_NASA | 4.06 | 4.85 | 5.9 | 6.24 | 6.52 | 6.48 | 6.28 | 6.01 | 5.3 | 4.94 | 4.46 | 3.93 | 5.41 |
| | | | SRF\_ALB | 0.16 | 0.16 | 0.17 | 0.17 | 0.17 | 0.18 | 0.17 | 0.17 | 0.17 | 0.15 | 0.15 | 0.16 | 0.16 |
| | | | KT | 0.6 | 0.62 | 0.64 | 0.61 | 0.6 | 0.59 | 0.57 | 0.58 | 0.56 | 0.6 | 0.63 | 0.62 | 0.6 |
| | | | TS\_MAX | 24.76 | 28.21 | 32.65 | 36.47 | 38.11 | 36.74 | 34.62 | 34.07 | 31.13 | 29.61 | 27.11 | 24.65 | 31.51 |
| | | | TS\_MIN | 3.03 | 4.43 | 6.72 | 10.23 | 13.11 | 14.12 | 13.49 | 13.46 | 12.47 | 9.63 | 6.29 | 3.83 | 9.23 |
| A-7 | 23.7586° | −100.2871° | G\_Model | 4.1 | 4.95 | 6.02 | 6.33 | 6.68 | 6.56 | 6.27 | 6.12 | 5.3 | 5 | 4.47 | 3.96 | 5.48 |
| | | | G\_NASA | 4.15 | 4.98 | 6.03 | 6.3 | 6.57 | 6.51 | 6.26 | 6.07 | 5.3 | 5 | 4.58 | 4.01 | 5.48 |
| | | | SRF\_ALB | 0.17 | 0.17 | 0.17 | 0.17 | 0.17 | 0.18 | 0.18 | 0.18 | 0.18 | 0.16 | 0.16 | 0.17 | 0.17 |
| | | | KT | 0.6 | 0.62 | 0.65 | 0.61 | 0.61 | 0.59 | 0.57 | 0.58 | 0.55 | 0.6 | 0.63 | 0.61 | 0.6 |
| | | | TS\_MAX | 25.08 | 28.54 | 33.01 | 36.74 | 38.19 | 36.43 | 33.61 | 33.3 | 30.68 | 29.35 | 27.17 | 24.91 | 31.42 |
| | | | TS\_MIN | 3.32 | 4.66 | 6.89 | 10.16 | 13.04 | 14.1 | 13.25 | 13.24 | 12.34 | 9.61 | 6.48 | 4.11 | 9.27 |
| A-8 | 23.4877° | −100.2871° | G\_Model | 4.13 | 4.97 | 6.04 | 6.33 | 6.68 | 6.55 | 6.27 | 6.12 | 5.31 | 5.02 | 4.49 | 3.99 | 5.49 |
| | | | G\_NASA | 4.15 | 4.98 | 6.03 | 6.3 | 6.57 | 6.51 | 6.26 | 6.07 | 5.3 | 5 | 4.58 | 4.01 | 5.48 |
| | | | SRF\_ALB | 0.17 | 0.17 | 0.17 | 0.17 | 0.17 | 0.18 | 0.18 | 0.18 | 0.18 | 0.16 | 0.16 | 0.17 | 0.17 |
| | | | KT | 0.6 | 0.62 | 0.65 | 0.61 | 0.61 | 0.59 | 0.57 | 0.58 | 0.55 | 0.6 | 0.63 | 0.61 | 0.6 |
| | | | TS\_MAX | 25.97 | 29.48 | 33.99 | 37.58 | 38.5 | 36.09 | 32.61 | 32.65 | 30.51 | 29.45 | 27.64 | 25.65 | 31.68 |
| | | | TS\_MIN | 4.54 | 5.84 | 8.13 | 11.26 | 14.05 | 15.11 | 14.17 | 14.18 | 13.34 | 10.63 | 7.58 | 5.28 | 10.34 |
| A-9 | 23.2168° | −100.2871° | G\_Model | 4.15 | 4.99 | 6.05 | 6.33 | 6.67 | 6.55 | 6.26 | 6.12 | 5.32 | 5.04 | 4.52 | 4.01 | 5.5 |
| | | | G\_NASA | 4.15 | 4.98 | 6.03 | 6.3 | 6.57 | 6.51 | 6.26 | 6.07 | 5.3 | 5 | 4.58 | 4.01 | 5.48 |
| | | | SRF\_ALB | 0.17 | 0.17 | 0.17 | 0.17 | 0.17 | 0.18 | 0.18 | 0.18 | 0.18 | 0.16 | 0.16 | 0.17 | 0.17 |
| | | | KT | 0.6 | 0.62 | 0.65 | 0.61 | 0.61 | 0.59 | 0.57 | 0.58 | 0.55 | 0.6 | 0.63 | 0.61 | 0.6 |
| | | | TS\_MAX | 25.97 | 29.48 | 33.99 | 37.58 | 38.5 | 36.09 | 32.61 | 32.65 | 30.51 | 29.45 | 27.64 | 25.65 | 31.68 |
| | | | TS\_MIN | 4.54 | 5.84 | 8.13 | 11.26 | 14.05 | 15.11 | 14.17 | 14.18 | 13.34 | 10.63 | 7.58 | 5.28 | 10.34 |
| Z0 | 25.6544° | −100.5859° | G\_Model | 3.79 | 4.64 | 5.64 | 5.98 | 6.27 | 6.28 | 6.09 | 5.9 | 5.04 | 4.62 | 4.15 | 3.59 | 5.17 |
| | | | G\_NASA | 3.83 | 4.61 | 5.73 | 5.94 | 6.27 | 6.19 | 6.06 | 5.74 | 5.05 | 4.66 | 4.2 | 3.64 | 5.16 |
| | | | SRF\_ALB | 0.15 | 0.14 | 0.15 | 0.14 | 0.15 | 0.16 | 0.15 | 0.16 | 0.15 | 0.13 | 0.13 | 0.15 | 0.15 |
| | | | KT | 0.58 | 0.6 | 0.62 | 0.58 | 0.57 | 0.56 | 0.55 | 0.56 | 0.53 | 0.57 | 0.61 | 0.58 | 0.58 |
| | | | TS\_MAX | 25.32 | 28.96 | 33.57 | 37.91 | 40.24 | 39.88 | 38.3 | 37.82 | 34.1 | 31.91 | 28.42 | 25.21 | 33.47 |
| | | | TS\_MIN | 3.71 | 5.44 | 8.19 | 12.44 | 15.93 | 17.35 | 17.02 | 17.05 | 15.33 | 11.71 | 7.55 | 4.55 | 11.36 |


**Table A3.** Monthly parameter measurements for each point on the state-wide grid.

| Location | LAT | LON | Parameter | January | February | March | April | May | June | July | August | September | October | November | December | Annual Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Z5 | 27.0083° | −100.5859° | G\_Model | 3.35 | 4.15 | 5.21 | 5.75 | 6.17 | 6.43 | 6.45 | 6.1 | 5.18 | 4.37 | 3.76 | 3.16 | 5.01 |
| | | | G\_NASA | 3.35 | 4.05 | 5.18 | 5.65 | 6.09 | 6.32 | 6.43 | 5.93 | 5.07 | 4.34 | 3.67 | 3.14 | 4.93 |
| | | | SRF\_ALB | 0.16 | 0.16 | 0.16 | 0.16 | 0.16 | 0.17 | 0.17 | 0.18 | 0.17 | 0.15 | 0.15 | 0.15 | 0.16 |
| | | | KT | 0.53 | 0.55 | 0.58 | 0.56 | 0.56 | 0.57 | 0.58 | 0.58 | 0.55 | 0.55 | 0.57 | 0.53 | 0.56 |
| | | | TS\_MAX | 24.98 | 29.5 | 35.31 | 40.05 | 42.66 | 43.84 | 42.81 | 42.79 | 38.63 | 35.27 | 29.63 | 24.66 | 35.84 |
| | | | TS\_MIN | 5.26 | 7.54 | 11.29 | 15.86 | 20.27 | 22.98 | 23.05 | 23.36 | 20.71 | 15.94 | 10.3 | 5.93 | 15.21 |
| Z6 | 27.279° | −100.5859° | G\_Model | 3.33 | 4.13 | 5.19 | 5.75 | 6.18 | 6.43 | 6.45 | 6.1 | 5.17 | 4.35 | 3.74 | 3.14 | 5 |
| | | | G\_NASA | 3.35 | 4.05 | 5.18 | 5.65 | 6.09 | 6.32 | 6.43 | 5.93 | 5.07 | 4.34 | 3.67 | 3.14 | 4.93 |
| | | | SRF\_ALB | 0.16 | 0.16 | 0.16 | 0.16 | 0.16 | 0.17 | 0.17 | 0.18 | 0.17 | 0.15 | 0.15 | 0.15 | 0.16 |
| | | | KT | 0.53 | 0.55 | 0.58 | 0.56 | 0.56 | 0.57 | 0.58 | 0.58 | 0.55 | 0.55 | 0.57 | 0.53 | 0.56 |
| | | | TS\_MAX | 24.98 | 29.5 | 35.31 | 40.05 | 42.66 | 43.84 | 42.81 | 42.79 | 38.63 | 35.27 | 29.63 | 24.66 | 35.84 |
| | | | TS\_MIN | 5.26 | 7.54 | 11.29 | 15.86 | 20.27 | 22.98 | 23.05 | 23.36 | 20.71 | 15.94 | 10.3 | 5.93 | 15.21 |
| Z-2 | 25.1128° | −100.5859° | G\_Model | 3.84 | 4.68 | 5.67 | 5.99 | 6.26 | 6.27 | 6.08 | 5.9 | 5.06 | 4.66 | 4.2 | 3.64 | 5.19 |
| | | | G\_NASA | 3.83 | 4.61 | 5.73 | 5.94 | 6.27 | 6.19 | 6.06 | 5.74 | 5.05 | 4.66 | 4.2 | 3.64 | 5.16 |
| | | | SRF\_ALB | 0.15 | 0.14 | 0.15 | 0.14 | 0.15 | 0.16 | 0.15 | 0.16 | 0.15 | 0.13 | 0.13 | 0.15 | 0.15 |
| | | | KT | 0.58 | 0.6 | 0.62 | 0.58 | 0.57 | 0.56 | 0.55 | 0.56 | 0.53 | 0.57 | 0.61 | 0.58 | 0.58 |
| | | | TS\_MAX | 24.02 | 27.3 | 31.61 | 35.65 | 37.73 | 37.05 | 35.29 | 34.76 | 31.69 | 29.89 | 26.89 | 23.94 | 31.32 |
| | | | TS\_MIN | 1.83 | 3.25 | 5.58 | 9.52 | 12.67 | 13.9 | 13.54 | 13.46 | 12.18 | 9.04 | 5.37 | 2.74 | 8.59 |
| Z-3 | 24.842° | −100.5859° | G\_Model | 4 | 4.86 | 5.87 | 6.31 | 6.59 | 6.59 | 6.3 | 6.12 | 5.36 | 4.92 | 4.36 | 3.92 | 5.43 |
| | | | G\_NASA | 4.06 | 4.85 | 5.9 | 6.24 | 6.52 | 6.48 | 6.28 | 6.01 | 5.3 | 4.94 | 4.46 | 3.93 | 5.41 |
| | | | SRF\_ALB | 0.16 | 0.16 | 0.17 | 0.17 | 0.17 | 0.18 | 0.17 | 0.17 | 0.17 | 0.15 | 0.15 | 0.16 | 0.16 |
| | | | KT | 0.6 | 0.62 | 0.64 | 0.61 | 0.6 | 0.59 | 0.57 | 0.58 | 0.56 | 0.6 | 0.63 | 0.62 | 0.6 |
| | | | TS\_MAX | 24.01 | 27.37 | 31.77 | 35.78 | 38.06 | 37.05 | 34.87 | 34.3 | 31.79 | 30.13 | 27.03 | 24.01 | 31.35 |
| | | | TS\_MIN | 1.55 | 2.83 | 5.03 | 8.84 | 12 | 13.32 | 12.83 | 12.73 | 11.71 | 8.74 | 5.11 | 2.49 | 8.1 |
| Z-4 | 24.5711° | −100.5859° | G\_Model | 4.03 | 4.88 | 5.89 | 6.31 | 6.58 | 6.59 | 6.29 | 6.12 | 5.37 | 4.94 | 4.39 | 3.94 | 5.44 |
| | | | G\_NASA | 4.06 | 4.85 | 5.9 | 6.24 | 6.52 | 6.48 | 6.28 | 6.01 | 5.3 | 4.94 | 4.46 | 3.93 | 5.41 |
| | | | SRF\_ALB | 0.16 | 0.16 | 0.17 | 0.17 | 0.17 | 0.18 | 0.17 | 0.17 | 0.17 | 0.15 | 0.15 | 0.16 | 0.16 |
| | | | KT | 0.6 | 0.62 | 0.64 | 0.61 | 0.6 | 0.59 | 0.57 | 0.58 | 0.56 | 0.6 | 0.63 | 0.62 | 0.6 |
| | | | TS\_MAX | 24.01 | 27.37 | 31.77 | 35.78 | 38.06 | 37.05 | 34.87 | 34.3 | 31.79 | 30.13 | 27.03 | 24.01 | 31.35 |
| | | | TS\_MIN | 1.55 | 2.83 | 5.03 | 8.84 | 12 | 13.32 | 12.83 | 12.73 | 11.71 | 8.74 | 5.11 | 2.49 | 8.1 |



**Table A4.** Monthly parameter measurements for each point on the state-wide grid.

| Location | LAT | LON | Parameter | January | February | March | April | May | June | July | August | September | October | November | December | Annual Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B2 | 26.196° | −99.98822° | G\_Model | 3.29 | 3.98 | 5.07 | 5.66 | 6.06 | 6.52 | 6.43 | 6.21 | 5.12 | 4.43 | 3.7 | 3.11 | 4.97 |
| | | | G\_NASA | 3.24 | 3.95 | 5.08 | 5.54 | 5.99 | 6.37 | 6.45 | 6.04 | 5.08 | 4.41 | 3.63 | 3.1 | 4.91 |
| | | | SRF\_ALB | 0.15 | 0.14 | 0.14 | 0.15 | 0.15 | 0.16 | 0.16 | 0.17 | 0.16 | 0.14 | 0.14 | 0.14 | 0.15 |
| | | | KT | 0.51 | 0.52 | 0.56 | 0.55 | 0.55 | 0.58 | 0.58 | 0.59 | 0.54 | 0.55 | 0.55 | 0.51 | 0.55 |
| | | | TS\_MAX | 25.12 | 29.34 | 34.75 | 39.34 | 41.91 | 42.5 | 41.41 | 41.47 | 37.01 | 34.14 | 29.35 | 24.98 | 35.11 |
| | | | TS\_MIN | 6.89 | 9.1 | 12.63 | 16.88 | 20.87 | 22.97 | 22.84 | 23.17 | 20.99 | 16.79 | 11.71 | 7.59 | 16.04 |
| B3 | 26.4668° | −99.98822° | G\_Model | 3.27 | 3.97 | 5.06 | 5.66 | 6.06 | 6.52 | 6.44 | 6.21 | 5.11 | 4.41 | 3.68 | 3.09 | 4.96 |
| | | | G\_NASA | 3.24 | 3.95 | 5.08 | 5.54 | 5.99 | 6.37 | 6.45 | 6.04 | 5.08 | 4.41 | 3.63 | 3.1 | 4.91 |
| | | | SRF\_ALB | 0.15 | 0.14 | 0.14 | 0.15 | 0.15 | 0.16 | 0.16 | 0.17 | 0.16 | 0.14 | 0.14 | 0.14 | 0.15 |
| | | | KT | 0.51 | 0.52 | 0.56 | 0.55 | 0.55 | 0.58 | 0.58 | 0.59 | 0.54 | 0.55 | 0.55 | 0.51 | 0.55 |
| | | | TS\_MAX | 25.12 | 29.34 | 34.75 | 39.34 | 41.91 | 42.5 | 41.41 | 41.47 | 37.01 | 34.14 | 29.35 | 24.98 | 35.11 |
| | | | TS\_MIN | 6.89 | 9.1 | 12.63 | 16.88 | 20.87 | 22.97 | 22.84 | 23.17 | 20.99 | 16.79 | 11.71 | 7.59 | 16.04 |
| B4 | 26.7375° | −99.98822° | G\_Model | 3.25 | 3.95 | 5.04 | 5.65 | 6.06 | 6.53 | 6.44 | 6.21 | 5.1 | 4.39 | 3.65 | 3.07 | 4.95 |
| | | | G\_NASA | 3.24 | 3.95 | 5.08 | 5.54 | 5.99 | 6.37 | 6.45 | 6.04 | 5.08 | 4.41 | 3.63 | 3.1 | 4.91 |
| | | | SRF\_ALB | 0.15 | 0.14 | 0.14 | 0.15 | 0.15 | 0.16 | 0.16 | 0.17 | 0.16 | 0.14 | 0.14 | 0.14 | 0.15 |
| | | | KT | 0.51 | 0.52 | 0.56 | 0.55 | 0.55 | 0.58 | 0.58 | 0.59 | 0.54 | 0.55 | 0.55 | 0.51 | 0.55 |
| | | | TS\_MAX | 24.74 | 29.04 | 34.6 | 39.28 | 41.91 | 43.01 | 42.09 | 42.34 | 37.83 | 34.69 | 29.4 | 24.63 | 35.3 |
| | | | TS\_MIN | 6.86 | 9.14 | 12.87 | 17.19 | 21.32 | 23.7 | 23.7 | 24.09 | 21.64 | 17.24 | 11.96 | 7.61 | 16.44 |
| B5 | 27.0083° | −99.98822° | G\_Model | 3.17 | 3.93 | 5.03 | 5.65 | 6.17 | 6.65 | 6.67 | 6.21 | 5.27 | 4.45 | 3.57 | 3.04 | 4.98 |
| | | | G\_NASA | 3.15 | 3.86 | 4.97 | 5.61 | 6.07 | 6.62 | 6.75 | 6.17 | 5.24 | 4.41 | 3.51 | 2.99 | 4.95 |
| | | | SRF\_ALB | 0.16 | 0.15 | 0.16 | 0.16 | 0.16 | 0.18 | 0.18 | 0.19 | 0.18 | 0.16 | 0.14 | 0.15 | 0.16 |
| | | | KT | 0.5 | 0.52 | 0.56 | 0.55 | 0.56 | 0.59 | 0.6 | 0.59 | 0.56 | 0.56 | 0.54 | 0.51 | 0.55 |
| | | | TS\_MAX | 24.22 | 28.63 | 34.18 | 38.89 | 41.54 | 43.17 | 42.56 | 42.98 | 38.44 | 34.93 | 29.23 | 24.18 | 35.25 |
| | | | TS\_MIN | 6.49 | 8.85 | 12.68 | 17.05 | 21.3 | 23.93 | 24.09 | 24.52 | 21.82 | 17.26 | 11.82 | 7.31 | 16.43 |
| B6 | 27.279° | −99.98822° | G\_Model | 3.14 | 3.91 | 5.01 | 5.64 | 6.18 | 6.66 | 6.67 | 6.21 | 5.26 | 4.43 | 3.54 | 3.02 | 4.97 |
| | | | G\_NASA | 3.15 | 3.86 | 4.97 | 5.61 | 6.07 | 6.62 | 6.75 | 6.17 | 5.24 | 4.41 | 3.51 | 2.99 | 4.95 |
| | | | SRF\_ALB | 0.16 | 0.15 | 0.16 | 0.16 | 0.16 | 0.18 | 0.18 | 0.19 | 0.18 | 0.16 | 0.14 | 0.15 | 0.16 |
| | | | KT | 0.5 | 0.52 | 0.56 | 0.55 | 0.56 | 0.59 | 0.6 | 0.59 | 0.56 | 0.56 | 0.54 | 0.51 | 0.55 |
| | | | TS\_MAX | 24.22 | 28.63 | 34.18 | 38.89 | 41.54 | 43.17 | 42.56 | 42.98 | 38.44 | 34.93 | 29.23 | 24.18 | 35.25 |


**Table A5.** Monthly parameter measurements for each point on the state-wide grid.

| Location | LAT | LON | Parameter | January | February | March | April | May | June | July | August | September | October | November | December | Annual Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B-4 | 24.5711° | −99.98822° | G\_Model | 3.76 | 4.49 | 5.52 | 5.8 | 6.04 | 6.25 | 6.18 | 6.01 | 5.08 | 4.61 | 4.11 | 3.56 | 5.12 |
| | | | G\_NASA | 3.75 | 4.5 | 5.53 | 5.71 | 6 | 6.11 | 6.2 | 5.98 | 5.09 | 4.61 | 4.14 | 3.62 | 5.1 |
| | | | SRF\_ALB | 0.13 | 0.12 | 0.13 | 0.13 | 0.13 | 0.15 | 0.13 | 0.14 | 0.13 | 0.11 | 0.12 | 0.13 | 0.13 |
| | | | KT | 0.56 | 0.57 | 0.6 | 0.56 | 0.55 | 0.56 | 0.56 | 0.57 | 0.53 | 0.56 | 0.59 | 0.56 | 0.56 |
| | | | TS\_MAX | 25.29 | 28.82 | 33.29 | 36.96 | 38.29 | 37.43 | 36.03 | 35.8 | 32.07 | 30.09 | 27.62 | 25.09 | 32.23 |
| | | | TS\_MIN | 5.72 | 7.38 | 10.16 | 13.95 | 16.92 | 17.92 | 17.48 | 17.6 | 16.22 | 13 | 9.26 | 6.38 | 12.66 |
| B-5 | 24.3003° | −99.98822° | G\_Model | 3.78 | 4.51 | 5.53 | 5.8 | 6.03 | 6.24 | 6.17 | 6.01 | 5.09 | 4.63 | 4.14 | 3.59 | 5.13 |
| | | | G\_NASA | 3.75 | 4.5 | 5.53 | 5.71 | 6 | 6.11 | 6.2 | 5.98 | 5.09 | 4.61 | 4.14 | 3.62 | 5.1 |
| | | | SRF\_ALB | 0.13 | 0.12 | 0.13 | 0.13 | 0.13 | 0.15 | 0.13 | 0.14 | 0.13 | 0.11 | 0.12 | 0.13 | 0.13 |
| | | | KT | 0.56 | 0.57 | 0.6 | 0.56 | 0.55 | 0.56 | 0.56 | 0.57 | 0.53 | 0.56 | 0.59 | 0.56 | 0.56 |
| | | | TS\_MAX | 25.64 | 29.21 | 33.62 | 37.29 | 38.35 | 36.98 | 35.14 | 34.85 | 31.3 | 29.7 | 27.57 | 25.4 | 32.09 |
| | | | TS\_MIN | 4.98 | 6.55 | 9.12 | 12.63 | 15.35 | 16.2 | 15.62 | 15.72 | 14.6 | 11.59 | 8.19 | 5.68 | 11.35 |
| B-6 | 24.0294° | −99.98822° | G\_Model | 3.81 | 4.53 | 5.55 | 5.8 | 6.03 | 6.24 | 6.17 | 6.01 | 5.1 | 4.65 | 4.16 | 3.61 | 5.14 |
| | | | G\_NASA | 3.75 | 4.5 | 5.53 | 5.71 | 6 | 6.11 | 6.2 | 5.98 | 5.09 | 4.61 | 4.14 | 3.62 | 5.1 |
| | | | SRF\_ALB | 0.13 | 0.12 | 0.13 | 0.13 | 0.13 | 0.15 | 0.13 | 0.14 | 0.13 | 0.11 | 0.12 | 0.13 | 0.13 |
| | | | KT | 0.56 | 0.57 | 0.6 | 0.56 | 0.55 | 0.56 | 0.56 | 0.57 | 0.53 | 0.56 | 0.59 | 0.56 | 0.56 |
| | | | TS\_MAX | 25.64 | 29.21 | 33.62 | 37.29 | 38.35 | 36.98 | 35.14 | 34.85 | 31.3 | 29.7 | 27.57 | 25.4 | 32.09 |
| | | | TS\_MIN | 4.98 | 6.55 | 9.12 | 12.63 | 15.35 | 16.2 | 15.62 | 15.72 | 14.6 | 11.59 | 8.19 | 5.68 | 11.35 |
| B-7 | 23.7586° | −99.98822° | G\_Model | 3.97 | 4.71 | 5.75 | 6.01 | 6.25 | 6.23 | 6.05 | 6.02 | 5.11 | 4.83 | 4.32 | 3.83 | 5.26 |
| | | | G\_NASA | 4.02 | 4.78 | 5.82 | 6.03 | 6.31 | 6.17 | 6.11 | 5.92 | 5.15 | 4.82 | 4.41 | 3.85 | 5.28 |
| | | | SRF\_ALB | 0.14 | 0.13 | 0.14 | 0.13 | 0.14 | 0.16 | 0.14 | 0.15 | 0.15 | 0.12 | 0.12 | 0.14 | 0.14 |
| | | | KT | 0.58 | 0.59 | 0.62 | 0.58 | 0.57 | 0.56 | 0.55 | 0.57 | 0.53 | 0.58 | 0.61 | 0.59 | 0.58 |
| | | | TS\_MAX | 25.56 | 29.06 | 33.39 | 36.89 | 37.75 | 35.87 | 33.34 | 33.24 | 30.16 | 28.85 | 27.13 | 25.25 | 31.37 |
| | | | TS\_MIN | 4.77 | 6.22 | 8.64 | 11.84 | 14.49 | 15.39 | 14.62 | 14.7 | 13.75 | 10.9 | 7.78 | 5.47 | 10.71 |
| B-8 | 23.4877° | −99.98822° | G\_Model | 3.99 | 4.73 | 5.76 | 6.02 | 6.24 | 6.22 | 6.05 | 6.02 | 5.12 | 4.85 | 4.35 | 3.86 | 5.27 |
| | | | G\_NASA | 4.02 | 4.78 | 5.82 | 6.03 | 6.31 | 6.17 | 6.11 | 5.92 | 5.15 | 4.82 | 4.41 | 3.85 | 5.28 |
| | | | SRF\_ALB | 0.14 | 0.13 | 0.14 | 0.13 | 0.14 | 0.16 | 0.14 | 0.15 | 0.15 | 0.12 | 0.12 | 0.14 | 0.14 |
| | | | KT | 0.58 | 0.59 | 0.62 | 0.58 | 0.57 | 0.56 | 0.55 | 0.57 | 0.53 | 0.58 | 0.61 | 0.59 | 0.58 |
| | | | TS\_MAX | 26.22 | 29.72 | 34.08 | 37.46 | 37.94 | 35.44 | 32.14 | 32.36 | 29.95 | 28.87 | 27.46 | 25.76 | 31.45 |
| | | | TS\_MIN | 5.89 | 7.26 | 9.71 | 12.77 | 15.4 | 16.36 | 15.49 | 15.53 | 14.67 | 11.86 | 8.79 | 6.53 | 11.69 |
| C1 | 25.9252° | −99.68944° | G\_Model | 3.38 | 4.16 | 5.18 | 5.57 | 5.94 | 6.4 | 6.43 | 6.22 | 5.13 | 4.36 | 3.72 | 3.2 | 4.97 |
| | | | G\_NASA | 3.4 | 4.13 | 5.23 | 5.53 | 5.81 | 6.23 | 6.37 | 6.04 | 5.04 | 4.4 | 3.8 | 3.27 | 4.94 |
| | | | SRF\_ALB | 0.13 | 0.13 | 0.13 | 0.13 | 0.13 | 0.15 | 0.15 | 0.15 | 0.15 | 0.13 | 0.13 | 0.13 | 0.14 |
| | | | KT | 0.52 | 0.54 | 0.57 | 0.54 | 0.54 | 0.57 | 0.58 | 0.59 | 0.54 | 0.54 | 0.55 | 0.52 | 0.55 |
| | | | TS\_MAX | 25.67 | 29.68 | 34.81 | 39.12 | 41.46 | 41.65 | 40.51 | 40.4 | 36.15 | 33.45 | 29.34 | 25.5 | 34.81 |
| | | | TS\_MIN | 6.73 | 8.81 | 12.12 | 16.32 | 20.1 | 21.91 | 21.66 | 21.94 | 20.03 | 16.01 | 11.21 | 7.39 | 15.35 |
| C2 | 26.196° | −99.68944° | G\_Model | 3.29 | 3.98 | 5.07 | 5.66 | 6.06 | 6.52 | 6.43 | 6.21 | 5.12 | 4.43 | 3.7 | 3.11 | 4.97 |
| | | | G\_NASA | 3.24 | 3.95 | 5.08 | 5.54 | 5.99 | 6.37 | 6.45 | 6.04 | 5.08 | 4.41 | 3.63 | 3.1 | 4.91 |
| | | | SRF\_ALB | 0.15 | 0.14 | 0.14 | 0.15 | 0.15 | 0.16 | 0.16 | 0.17 | 0.16 | 0.14 | 0.14 | 0.14 | 0.15 |
| | | | KT | 0.51 | 0.52 | 0.56 | 0.55 | 0.55 | 0.58 | 0.58 | 0.59 | 0.54 | 0.55 | 0.55 | 0.51 | 0.55 |
| | | | TS\_MAX | 25.12 | 29.34 | 34.75 | 39.34 | 41.91 | 42.5 | 41.41 | 41.47 | 37.01 | 34.14 | 29.35 | 24.98 | 35.11 |
| | | | TS\_MIN | 6.89 | 9.1 | 12.63 | 16.88 | 20.87 | 22.97 | 22.84 | 23.17 | 20.99 | 16.79 | 11.71 | 7.59 | 16.04 |
| C3 | 26.4668° | −99.68944° | G\_Model | 3.27 | 3.97 | 5.06 | 5.66 | 6.06 | 6.52 | 6.44 | 6.21 | 5.11 | 4.41 | 3.68 | 3.09 | 4.96 |
| | | | G\_NASA | 3.24 | 3.95 | 5.08 | 5.54 | 5.99 | 6.37 | 6.45 | 6.04 | 5.08 | 4.41 | 3.63 | 3.1 | 4.91 |
| | | | SRF\_ALB | 0.15 | 0.14 | 0.14 | 0.15 | 0.15 | 0.16 | 0.16 | 0.17 | 0.16 | 0.14 | 0.14 | 0.14 | 0.15 |
| | | | KT | 0.51 | 0.52 | 0.56 | 0.55 | 0.55 | 0.58 | 0.58 | 0.59 | 0.54 | 0.55 | 0.55 | 0.51 | 0.55 |
| | | | TS\_MAX | 25.12 | 29.34 | 34.75 | 39.34 | 41.91 | 42.5 | 41.41 | 41.47 | 37.01 | 34.14 | 29.35 | 24.98 | 35.11 |
| | | | TS\_MIN | 6.89 | 9.1 | 12.63 | 16.88 | 20.87 | 22.97 | 22.84 | 23.17 | 20.99 | 16.79 | 11.71 | 7.59 | 16.04 |
| C4 | 26.7375° | −99.68944° | G\_Model | 3.25 | 3.95 | 5.04 | 5.65 | 6.06 | 6.53 | 6.44 | 6.21 | 5.1 | 4.39 | 3.65 | 3.07 | 4.95 |
| | | | G\_NASA | 3.24 | 3.95 | 5.08 | 5.54 | 5.99 | 6.37 | 6.45 | 6.04 | 5.08 | 4.41 | 3.63 | 3.1 | 4.91 |
| | | | SRF\_ALB | 0.15 | 0.14 | 0.14 | 0.15 | 0.15 | 0.16 | 0.16 | 0.17 | 0.16 | 0.14 | 0.14 | 0.14 | 0.15 |
| | | | KT | 0.51 | 0.52 | 0.56 | 0.55 | 0.55 | 0.58 | 0.58 | 0.59 | 0.54 | 0.55 | 0.55 | 0.51 | 0.55 |
| | | | TS\_MAX | 24.74 | 29.04 | 34.6 | 39.28 | 41.91 | 43.01 | 42.09 | 42.34 | 37.83 | 34.69 | 29.4 | 24.63 | 35.3 |
| | | | TS\_MIN | 6.86 | 9.14 | 12.87 | 17.19 | 21.32 | 23.7 | 23.7 | 24.09 | 21.64 | 17.24 | 11.96 | 7.61 | 16.44 |
| C5 | 27.0083° | −99.68944° | G\_Model | 3.17 | 3.93 | 5.03 | 5.65 | 6.17 | 6.65 | 6.67 | 6.21 | 5.27 | 4.45 | 3.57 | 3.04 | 4.98 |
| | | | G\_NASA | 3.15 | 3.86 | 4.97 | 5.61 | 6.07 | 6.62 | 6.75 | 6.17 | 5.24 | 4.41 | 3.51 | 2.99 | 4.95 |
| | | | SRF\_ALB | 0.16 | 0.15 | 0.16 | 0.16 | 0.16 | 0.18 | 0.18 | 0.19 | 0.18 | 0.16 | 0.14 | 0.15 | 0.16 |
| | | | KT | 0.5 | 0.52 | 0.56 | 0.55 | 0.56 | 0.59 | 0.6 | 0.59 | 0.56 | 0.56 | 0.54 | 0.51 | 0.55 |
| | | | TS\_MAX | 24.22 | 28.63 | 34.18 | 38.89 | 41.54 | 43.17 | 42.56 | 42.98 | 38.44 | 34.93 | 29.23 | 24.18 | 35.25 |
| | | | TS\_MIN | 6.49 | 8.85 | 12.68 | 17.05 | 21.3 | 23.93 | 24.09 | 24.52 | 21.82 | 17.26 | 11.82 | 7.31 | 16.43 |

**Table A5.** *Cont.*


**Table A6.** Monthly parameter measurements for each point on the state-wide grid.


**Table A6.** *Cont.*


**Table A6.** *Cont.*

**Table A7.** Monthly parameter measurements for each point on the state-wide grid.


| Location | LAT | LON | Parameter | January | February | March | April | May | June | July | August | September | October | November | December | Annual Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| D-2 | 25.1128° | −99.39066° | G\_Model | 3.45 | 4.21 | 5.22 | 5.58 | 5.93 | 6.38 | 6.41 | 6.22 | 5.16 | 4.42 | 3.79 | 3.26 | 5 |
| | | | G\_NASA | 3.4 | 4.13 | 5.23 | 5.53 | 5.81 | 6.23 | 6.37 | 6.04 | 5.04 | 4.4 | 3.8 | 3.27 | 4.94 |
| | | | SRF\_ALB | 0.13 | 0.13 | 0.13 | 0.13 | 0.13 | 0.15 | 0.15 | 0.15 | 0.15 | 0.13 | 0.13 | 0.13 | 0.14 |
| | | | KT | 0.52 | 0.54 | 0.57 | 0.54 | 0.54 | 0.57 | 0.58 | 0.59 | 0.54 | 0.54 | 0.55 | 0.52 | 0.55 |
| | | | TS\_MAX | 26.52 | 30.62 | 35.64 | 39.67 | 41.76 | 41.81 | 40.53 | 40.66 | 36.59 | 33.86 | 30.1 | 26.32 | 35.34 |
| | | | TS\_MIN | 8.39 | 10.49 | 13.86 | 17.89 | 21.39 | 23.01 | 22.76 | 23.06 | 21.25 | 17.4 | 12.83 | 8.98 | 16.77 |
| D-3 | 24.842° | −99.39066° | G\_Model | 3.74 | 4.47 | 5.51 | 5.79 | 6.04 | 6.26 | 6.18 | 6.01 | 5.07 | 4.6 | 4.09 | 3.54 | 5.11 |
| | | | G\_NASA | 3.75 | 4.5 | 5.53 | 5.71 | 6 | 6.11 | 6.2 | 5.98 | 5.09 | 4.61 | 4.14 | 3.62 | 5.1 |
| | | | SRF\_ALB | 0.13 | 0.12 | 0.13 | 0.13 | 0.13 | 0.15 | 0.13 | 0.14 | 0.13 | 0.11 | 0.12 | 0.13 | 0.13 |
| | | | KT | 0.56 | 0.57 | 0.6 | 0.56 | 0.55 | 0.56 | 0.56 | 0.57 | 0.53 | 0.56 | 0.59 | 0.56 | 0.56 |
| | | | TS\_MAX | 26.61 | 30.52 | 35.25 | 39.06 | 40.5 | 40.01 | 38.47 | 38.63 | 34.59 | 32.15 | 29.34 | 26.34 | 34.29 |
| | | | TS\_MIN | 8.11 | 10.07 | 13.25 | 17.11 | 20.24 | 21.5 | 21.12 | 21.38 | 19.8 | 16.22 | 12.06 | 8.68 | 15.79 |
| E1 | 25.9252° | −99.09189° | G\_Model | 3.38 | 4.16 | 5.18 | 5.57 | 5.94 | 6.4 | 6.43 | 6.22 | 5.13 | 4.36 | 3.72 | 3.2 | 4.97 |
| | | | G\_NASA | 3.4 | 4.13 | 5.23 | 5.53 | 5.81 | 6.23 | 6.37 | 6.04 | 5.04 | 4.4 | 3.8 | 3.27 | 4.94 |
| | | | SRF\_ALB | 0.13 | 0.13 | 0.13 | 0.13 | 0.13 | 0.15 | 0.15 | 0.15 | 0.15 | 0.13 | 0.13 | 0.13 | 0.14 |
| | | | KT | 0.52 | 0.54 | 0.57 | 0.54 | 0.54 | 0.57 | 0.58 | 0.59 | 0.54 | 0.54 | 0.55 | 0.52 | 0.55 |
| | | | TS\_MAX | 26.06 | 30.3 | 35.5 | 39.76 | 42.12 | 42.62 | 41.49 | 41.7 | 37.52 | 34.79 | 30.22 | 25.91 | 35.67 |
| | | | TS\_MIN | 8.2 | 10.39 | 13.87 | 17.96 | 21.66 | 23.57 | 23.44 | 23.79 | 21.77 | 17.76 | 12.95 | 8.84 | 17.02 |
| E-1 | 25.3836° | −99.09189° | G\_Model | 3.42 | 4.2 | 5.2 | 5.58 | 5.94 | 6.38 | 6.42 | 6.22 | 5.15 | 4.4 | 3.77 | 3.24 | 4.99 |
| | | | G\_NASA | 3.4 | 4.13 | 5.23 | 5.53 | 5.81 | 6.23 | 6.37 | 6.04 | 5.04 | 4.4 | 3.8 | 3.27 | 4.94 |
| | | | SRF\_ALB | 0.13 | 0.13 | 0.13 | 0.13 | 0.13 | 0.15 | 0.15 | 0.15 | 0.15 | 0.13 | 0.13 | 0.13 | 0.14 |
| | | | KT | 0.52 | 0.54 | 0.57 | 0.54 | 0.54 | 0.57 | 0.58 | 0.59 | 0.54 | 0.54 | 0.55 | 0.52 | 0.55 |
| | | | TS\_MAX | 26.52 | 30.62 | 35.64 | 39.67 | 41.76 | 41.81 | 40.53 | 40.66 | 36.59 | 33.86 | 30.1 | 26.32 | 35.34 |
| | | | TS\_MIN | 8.39 | 10.49 | 13.86 | 17.89 | 21.39 | 23.01 | 22.76 | 23.06 | 21.25 | 17.4 | 12.83 | 8.98 | 16.77 |
| E-2 | 25.1128° | −99.09189° | G\_Model | 3.45 | 4.21 | 5.22 | 5.58 | 5.93 | 6.38 | 6.41 | 6.22 | 5.16 | 4.42 | 3.79 | 3.26 | 5 |
| | | | G\_NASA | 3.4 | 4.13 | 5.23 | 5.53 | 5.81 | 6.23 | 6.37 | 6.04 | 5.04 | 4.4 | 3.8 | 3.27 | 4.94 |
| | | | SRF\_ALB | 0.13 | 0.13 | 0.13 | 0.13 | 0.13 | 0.15 | 0.15 | 0.15 | 0.15 | 0.13 | 0.13 | 0.13 | 0.14 |
| | | | KT | 0.52 | 0.54 | 0.57 | 0.54 | 0.54 | 0.57 | 0.58 | 0.59 | 0.54 | 0.54 | 0.55 | 0.52 | 0.55 |
| | | | TS\_MAX | 26.52 | 30.62 | 35.64 | 39.67 | 41.76 | 41.81 | 40.53 | 40.66 | 36.59 | 33.86 | 30.1 | 26.32 | 35.34 |
| | | | TS\_MIN | 8.39 | 10.49 | 13.86 | 17.89 | 21.39 | 23.01 | 22.76 | 23.06 | 21.25 | 17.4 | 12.83 | 8.98 | 16.77 |

**Table A7.** *Cont.*

| Location | LAT | LON | Parameter | January | February | March | April | May | June | July | August | September | October | November | December | Annual Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| F1 | 25.9252° | −98.79311° | G\_Model | 3.19 | 3.93 | 4.99 | 5.67 | 6.16 | 6.62 | 6.65 | 6.22 | 5.22 | 4.52 | 3.65 | 3.07 | 4.99 |
| | | | G\_NASA | 3.22 | 3.95 | 5.01 | 5.57 | 6.1 | 6.57 | 6.65 | 6.15 | 5.24 | 4.62 | 3.74 | 3.15 | 5 |
| | | | SRF\_ALB | 0.15 | 0.14 | 0.15 | 0.15 | 0.16 | 0.17 | 0.17 | 0.18 | 0.17 | 0.15 | 0.15 | 0.15 | 0.16 |
| | | | KT | 0.49 | 0.51 | 0.55 | 0.55 | 0.56 | 0.59 | 0.6 | 0.59 | 0.55 | 0.56 | 0.54 | 0.5 | 0.55 |
| | | | TS\_MAX | 26.09 | 30.34 | 35.32 | 39.38 | 41.56 | 42.26 | 40.9 | 41.47 | 37.25 | 34.57 | 30.09 | 25.98 | 35.44 |
| | | | TS\_MIN | 8.88 | 11.04 | 14.38 | 18.26 | 21.72 | 23.66 | 23.67 | 24 | 21.98 | 18.08 | 13.51 | 9.53 | 17.39 |
| F-1 | 25.3836° | −98.79311° | G\_Model | 3.23 | 3.96 | 5.02 | 5.68 | 6.16 | 6.61 | 6.64 | 6.22 | 5.24 | 4.56 | 3.7 | 3.12 | 5.01 |
| | | | G\_NASA | 3.22 | 3.95 | 5.01 | 5.57 | 6.1 | 6.57 | 6.65 | 6.15 | 5.24 | 4.62 | 3.74 | 3.15 | 5 |
| | | | SRF\_ALB | 0.15 | 0.14 | 0.15 | 0.15 | 0.16 | 0.17 | 0.17 | 0.18 | 0.17 | 0.15 | 0.15 | 0.15 | 0.16 |
| | | | KT | 0.49 | 0.51 | 0.55 | 0.55 | 0.56 | 0.59 | 0.6 | 0.59 | 0.55 | 0.56 | 0.54 | 0.5 | 0.55 |
| | | | TS\_MAX | 26.46 | 30.66 | 35.55 | 39.44 | 41.45 | 41.75 | 40.06 | 40.53 | 36.56 | 33.83 | 30.03 | 26.27 | 35.22 |
| | | | TS\_MIN | 9.16 | 11.27 | 14.56 | 18.31 | 21.59 | 23.34 | 23.25 | 23.53 | 21.72 | 17.96 | 13.58 | 9.75 | 17.33 |
| G0 | 25.6544° | −98.49433° | G\_Model | 3.21 | 3.94 | 5.01 | 5.67 | 6.16 | 6.62 | 6.64 | 6.22 | 5.23 | 4.54 | 3.68 | 3.1 | 5 |
| | | | G\_NASA | 3.22 | 3.95 | 5.01 | 5.57 | 6.1 | 6.57 | 6.65 | 6.15 | 5.24 | 4.62 | 3.74 | 3.15 | 5 |
| | | | SRF\_ALB | 0.15 | 0.14 | 0.15 | 0.15 | 0.16 | 0.17 | 0.17 | 0.18 | 0.17 | 0.15 | 0.15 | 0.15 | 0.16 |
| | | | KT | 0.49 | 0.51 | 0.55 | 0.55 | 0.56 | 0.59 | 0.6 | 0.59 | 0.55 | 0.56 | 0.54 | 0.5 | 0.55 |
| | | | TS\_MAX | 26.31 | 30.09 | 34.5 | 38.18 | 40.19 | 41.04 | 39.61 | 40.27 | 36.85 | 34.34 | 30.17 | 26.27 | 34.82 |
| | | | TS\_MIN | 9.83 | 11.82 | 14.88 | 18.52 | 21.93 | 24.04 | 24.1 | 24.4 | 22.56 | 18.77 | 14.37 | 10.49 | 17.97 |
| G1 | 25.9252° | −98.49433° | G\_Model | 3.19 | 3.93 | 4.99 | 5.67 | 6.16 | 6.62 | 6.65 | 6.22 | 5.22 | 4.52 | 3.65 | 3.07 | 4.99 |
| | | | G\_NASA | 3.22 | 3.95 | 5.01 | 5.57 | 6.1 | 6.57 | 6.65 | 6.15 | 5.24 | 4.62 | 3.74 | 3.15 | 5 |
| | | | SRF\_ALB | 0.15 | 0.14 | 0.15 | 0.15 | 0.16 | 0.17 | 0.17 | 0.18 | 0.17 | 0.15 | 0.15 | 0.15 | 0.16 |
| | | | KT | 0.49 | 0.51 | 0.55 | 0.55 | 0.56 | 0.59 | 0.6 | 0.59 | 0.55 | 0.56 | 0.54 | 0.5 | 0.55 |
| | | | TS\_MAX | 26.31 | 30.09 | 34.5 | 38.18 | 40.19 | 41.04 | 39.61 | 40.27 | 36.85 | 34.34 | 30.17 | 26.27 | 34.82 |
| | | | TS\_MIN | 9.83 | 11.82 | 14.88 | 18.52 | 21.93 | 24.04 | 24.1 | 24.4 | 22.56 | 18.77 | 14.37 | 10.49 | 17.97 |
| G-1 | 25.3836° | −98.49433° | G\_Model | 3.23 | 3.96 | 5.02 | 5.68 | 6.16 | 6.61 | 6.64 | 6.22 | 5.24 | 4.56 | 3.7 | 3.12 | 5.01 |
| | | | G\_NASA | 3.22 | 3.95 | 5.01 | 5.57 | 6.1 | 6.57 | 6.65 | 6.15 | 5.24 | 4.62 | 3.74 | 3.15 | 5 |
| | | | SRF\_ALB | 0.15 | 0.14 | 0.15 | 0.15 | 0.16 | 0.17 | 0.17 | 0.18 | 0.17 | 0.15 | 0.15 | 0.15 | 0.16 |
| | | | KT | 0.49 | 0.51 | 0.55 | 0.55 | 0.56 | 0.59 | 0.6 | 0.59 | 0.55 | 0.56 | 0.54 | 0.5 | 0.55 |
| | | | TS\_MAX | 26.73 | 30.57 | 34.94 | 38.39 | 40.05 | 40.47 | 38.87 | 39.45 | 36.1 | 33.61 | 30.14 | 26.58 | 34.66 |
| | | | TS\_MIN | 10.29 | 12.29 | 15.31 | 18.79 | 21.98 | 23.9 | 23.87 | 24.12 | 22.46 | 18.8 | 14.6 | 10.91 | 18.11 |

**Table A7.** *Cont.*

### **References**


### *Article* **Computing and Assessment of Discrete Angle Positions for Optimizing the Solar Energy Harvesting for Urban Sustainable Development**

**Guillermo Quiroga-Ocaña <sup>1</sup> , Julio C. Montaño-Moreno <sup>1</sup> , Enrique A. Enríquez-Velásquez <sup>2</sup> , Victor H. Benitez 3,\* , Luis C. Félix-Herrán <sup>1</sup> , Jorge de-J. Lozoya-Santos <sup>4</sup> and Ricardo A. Ramírez-Mendoza <sup>4</sup>**


**Citation:** Quiroga-Ocaña, G.; Montaño-Moreno, J.C.; Enríquez-Velásquez, E.A.; Benitez, V.H.; Félix-Herrán, L.C.; Lozoya-Santos, J.d.-J.; Ramirez-Mendoza, R.A. Computing and Assessment of Discrete Angle Positions for Optimizing the Solar Energy Harvesting for Urban Sustainable Development. *Energies* **2021**, *14*, 6441. https://doi.org/10.3390/en14206441

Academic Editors: Benedetto Nastasi, Jesús Polo and Philippe Leclère

Received: 25 July 2021 Accepted: 31 August 2021 Published: 9 October 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

**Abstract:** This paper proposes the computation and assessment of optimal tilt and azimuth angles for a receiving surface, using a mathematical model developed at the University of Tomsk, Russia. The model was validated and analyzed for the state of Nuevo León, northeast Mexico, using a set of metrics and comparing against satellite data from NASA. A point of interest in the city of Monterrey was analyzed to identify orientation patterns throughout the year for optimal solar energy gathering. The aim is to provide the best tilt angles for photovoltaic or solar thermal panels without tracking systems. In addition, this analysis is proposed as a tool to achieve optimal performance in sustainable urban development in the region. Based on the findings, a set of optimal tilt and azimuth surface angles is proposed for the analyzed coordinates. The aim is to identify the optimal performance to obtain the maximum solar irradiation possible over the year for solar projects in the region. The results show that the model can be used as a tool to accelerate decision making in the design of solar harvesting surfaces and allows the design of discrete tracking systems with an increase in solar energy harvesting above 5% annually.

**Keywords:** solar irradiation; renewable energy assessment; solar harvesting surface; urban energy tools

### **1. Introduction**

Renewable energy sources are essential for reducing greenhouse gas (GHG) emissions and for sustainable development. One of the main sources of renewable energy, owing to its accessibility, is solar radiation. The energy and land usage in several Italian provinces were studied to analyze whether solar energy is the optimal renewable energy source to reach the 2030 climate policies in Europe. It was concluded that solar energy is the cheapest renewable energy source with the largest potential in the latitude of Italy [1]. This form of energy can be harnessed by photovoltaic (PV) panels, which use semiconductors that release electrons when exposed to solar radiation [2]. This process can be used for both industrial and domestic purposes. Mexico has the potential capacity to develop renewable technologies and produce 1172 gigawatts, with solar power being the best technology in every scenario considered [3]. An area of great solar potential is the northeast, where the state of Nuevo León is located. From an economic perspective, an advantage of Nuevo León is its geographical location, which opens the possibility of forming an integrated North American energy market [3,4].

In recent years, mathematical models of solar radiation have been refined because they can significantly reduce the price and time of development of solar energy projects [5]. In this regard, several tools have been tested for their efficiency, such as in [6], where 23 software packages and 4 mobile apps regarding photovoltaic systems were analyzed; they found that none of the tools reviewed met every benchmark set of design and management purposes. In order to improve these calculations, a system of eight different factors that improve decision making during PV projects was proposed. The proposition of the system is that, for future work, researchers should focus on working on these factors, such as optimization of the PV layout design, instead of building a general design software for PV arrays.

An extensively used approach to estimating solar resources is the utilization of satellites. According to [7], a mathematical model, supported by the NASA SSE database, was applied to predict the characteristics of solar radiation for any latitude and longitude in Russia. Recently, reference [4] found that the model is a good estimation of solar resources available and depends only on the geographical location and data available in the satellite NASA SSE database. This fact allows the model to be independent of meteorological stations or physical limitations and provides better coverage of the region [4].

Researchers at Sebelas Maret University conducted studies in Libya to compute the maximum solar radiation, using a mathematical model to determine the optimal orientation and positioning of a solar panel [8]. Similar calculations were made in Tunisia [9], where the issue of computing the angle of PV panels was addressed as an alternative to solar tracking control systems. Furthermore, calculating the best tilt angle of the solar panels using mathematical modeling would make it easier to implement PV panels and ensure their maximum efficiency.

In [10], a mathematical model was used to compute the optimal tilt and azimuth angles for a PV array in the city of Sharjah in the United Arab Emirates. The results showed a set of optimal angles for Sharjah, using the sensitivity analysis of design parameters of bi-facial solar PV technology. The authors studied the effects of albedo, tilt angle, and height from the ground against power gain correlations. The application of the findings served to increase the energy supply of the existing solar panels and to reduce the energy demand for cooling in the buildings where PV systems were installed. In Punjab, Pakistan, the optimization of the tilt angle in PV systems had the highest effect during the winter; this result can be especially useful since the days are shorter and more energy is required for lighting purposes [11].

In reference [8], a study to estimate solar radiation is presented. Three mathematical models were compared: the Caperdaou, the Liu and Jordan, and the R. Sun model. These were compared against the solar radiation received by a ground collector, and it was found that the Caperdaou model was the most suitable for the Algerian region.

In Mexico, the solarimetric network is currently managed by the Mexican National Weather Service (SMN). This network consists of several meteorological stations deployed all over the country, which measure several weather variables, including solar radiation. The Mexican grid consists of two types of stations: automatic meteorological stations (EMAs) and synoptic meteorological stations (ESIMEs). EMAs consist of mechanical and electronic devices, such as sensors that monitor several meteorological variables. These stations have the following sensors: wind speed, wind direction, atmospheric pressure, temperature, relative humidity, solar radiation, and precipitation [12]. ESIMEs are sets of sensors that automatically measure the previously mentioned meteorological variables [12]. Another difference between an EMA and an ESIME lies in the way the information is presented: in EMAs, a file is created every ten minutes with all the necessary variables, whereas ESIMEs generate a synoptic message every three hours [12]. The Mexican grid has only 189 EMAs and 84 ESIMEs all over Mexico [12]. In the state of Nuevo León, there are only three EMAs and one ESIME, as shown in Figure 1, where the state is highlighted; these are not enough to provide a complete picture of the solar radiation in the state. Additionally, according to [13], more than half of the solar radiation sensors may not be correctly calibrated, leading to higher discrepancies in the station results when measuring the solar radiation. In the same study, it is stated that these measurements are not public and have a higher error than satellite readings, which is around 10%. Another study of solar radiation in Mexico analyzed the findings of the meteorological stations to validate their mathematical model, and found that only 33% of the stations in the Sonoran region in northwest Mexico met their selection criteria [4].

**Figure 1.** Mexican meteorological grid. The state of Nuevo León is highlighted in red.

Due to the scarcity of local data concerning radiation analysis at Mexican geographical coordinates, this study makes some relevant contributions to the irradiation calculations with respect to the surface angle. These calculations were made for the city of Monterrey in Mexico to build a photovoltaic array; however, they can be extrapolated to analyze any coordinates in the world. The main objective of the current research is to use a mathematical model as a tool to identify and propose several setups of tilt and azimuth angles for solar collecting surfaces with no tracking systems, bringing their energy harvesting throughout the year as close as possible to the optimum. The idea is to save the costs of the tracking systems required by other technologies (such as full tracking or partial tracking), which need sensors and mechanical systems, as well as to increase the performance and production of future efficient solar systems (PV or solar thermal), which contribute to green development in the region.

The literature review indicates the importance of calculating the parameters of solar radiation acting on collecting surfaces as a way to harvest the highest possible amount of solar energy during the year. This paper aims to calculate and assess the solar irradiation on a receiving surface at different tilt and azimuth angles to obtain the greatest amount of energy captured, utilizing a mathematical model as a tool with a data-driven approach based on the geospatial information retrieved.

This approach does not require weather stations or locally deployed sensing devices, avoiding the cost of equipment and maintenance. In addition, the proposed approach can use databases that provide the albedo, the reflection index and the geographic coordinates. This makes the approach flexible and adaptable.

This study is organized as follows: Section 2 describes the data source used to compare the mathematical model. Section 3 presents the methodology, which describes the mathematical model and statistical methods used to validate and analyze the impact of different tilt and azimuth angles for a receiving surface. Section 4 consists of a report and discussion of the accuracy of the model and the effect of changing the surface angles previously mentioned in the methodology. Finally, Section 5 provides the conclusions and offers a brief mention of further research opportunities.

### **2. Data Sources**

The present work used a resource to gather data related to solar radiation. The obtained information was later compared with the findings of the mathematical model in order to validate it. The resource was the database of NASA's Surface Meteorology and Solar Energy (SSE). As stated in the introduction, the set of EMAs and ESIMEs is scarce (only four stations) and cannot be used to have a complete representation of the state of Nuevo León. Due to this, it was decided not to use the Mexican national weather service findings to validate the model.

The database of NASA's Surface Meteorology and Solar Energy (SSE) contains information about solar measurements, such as surface albedo (*ρ*) and the clarity index (*KT*) [14]. These two parameters represent the capacity of light reflection and the atmospheric effect on light for the analyzed zone. These are used as inputs for the mathematical model to calculate the solar irradiation.

Five geographic points were selected to cover a sufficiently representative area of the state of Nuevo León: Colombia in the north of the state, Linares in the center, El Grullo in the east, Monterrey in the west and Mier y Noriega in the south. Then, the average total solar irradiation per month was calculated for these specific geographic points and compared against the data from the NASA SSE database. Table 1 presents the locations and geographic coordinates.

**Table 1.** Geographical data for the selected municipalities.


### **3. Methodology**

The methodology consisted of calculating solar irradiation based on a mathematical model proved for other geographical regions around the world. This model was applied to a specific point in the city of Monterrey as a case study. A comparison of different angles (tilt and orientation) of the receiving surface was utilized for the calculations. Then, statistical methods were used to compare the model against the values of solar irradiation at the ground level of the NASA database. It is worth mentioning that NASA only provides the solar irradiation that reaches a horizontal surface without tilt or azimuth surface angles; therefore, the NASA data source focuses only on horizontal surfaces (parallel to the ground) at different altitudes for its data of solar irradiation. On the other hand, the model used in this research work can be used to analyze any orientation of the collecting surface. Finally, the results for different surface tilts and orientations were used for comparison in order to identify the best angles for the optimum performance of solar systems in the region throughout the year.

### *3.1. Mathematical Model*

This section closely follows the results presented in [4,7]. The total radiation arriving at an inclined surface (*G*) is the sum of the direct, scattered, and reflected solar radiation that hits a receiving surface. The model has eight inputs: surface azimuth angle (*γ*), the tilt angle of the receiving surface (*β*), latitude (*φ*), longitude (*ψ*), surface albedo (*ρ*), clarity index (*KT*), the difference in hours with respect to the standard Greenwich meridian (*Dif* GMT) and the day of the year (*N*). These inputs are used in the equations presented in Appendix A. The main equation to calculate the total solar irradiation for any orientation of a tilted collecting surface is presented in Equation (1).

$$\mathbf{G} = \mathbf{G}\_D\left(\frac{\cos \theta}{\cos \theta\_z}\right) + \mathbf{G}\_{\text{DH}}\left[A\_i\left(\frac{\cos \theta}{\cos \theta\_z}\right) + (1 - A\_i)\left(\frac{1 + \cos \theta}{2}\right)\right] + \mathbf{G}\_H\,\rho\left(\frac{1 - \cos \theta}{2}\right) \tag{1}$$

where *G<sub>H</sub>*, *G<sub>DH</sub>*, and *G<sub>D</sub>* are the hourly total radiation arriving at a horizontal surface divided into three components (total, diffuse and direct, respectively; Equations (A22)–(A24)); *A<sub>i</sub>* is the anisotropic index (Equation (A26)). Finally, *θ* is the incidence angle and *θ<sub>z</sub>* is the solar zenith angle (Equations (A5) and (A6), respectively).

The latitude and longitude of the geographical location analyzed in Monterrey were 25.6544° N and 100.2874° W, respectively [15]. The surface albedo and the clarity index for every month were taken from the NASA SSE database [14]. For the inclination angle (*β*), a range from 0 to 60 degrees was selected, following a suggestion of angles between 10 and 50 degrees from a study conducted in the city of Hermosillo, located at a latitude and longitude similar to the points used in the present research [16]. Finally, in order to obtain a reliable average representation of solar irradiation for every month of the year, each month was represented by a single representative day. Table 2 shows the representative days.


**Table 2.** Representative days considered.

### *3.2. Statistical Analysis*

The accuracy of the mathematical model is validated against the data given by the NASA database [14]; several statistical methods are used to quantify how closely the model matches this data source.

For all the following formulas, *n* is the number of months in the year, *i* is the index of the analyzed month, *Y* is the reference value (NASA SSE), *X* is the value to analyze (model), *X̄* is the annual average of the values to analyze (model), and *Ȳ* is the annual average of the reference values (NASA SSE). All formulas were obtained from references [17–19].

• Mean Absolute Error (MAE)

**–** MAE provides a mean error magnitude between the two data sources; the smaller the value obtained, the better the model.

$$\text{MAE} = \frac{1}{n} \sum\_{i=1}^{n} |X\_i - Y\_i| \tag{2}$$

• Mean Bias Error (MBE)

**–** MBE provides the bias, that is, the signed average error; the closer it is to zero, the more precise the model. A value below zero indicates underestimation, and a value above zero indicates overestimation. This statistical method reflects the performance of the analyzed model.

$$\text{MBE} = \frac{1}{n} \sum\_{i=1}^{n} (X\_i - Y\_i) \tag{3}$$

• Root Mean Square Error (RMSE)

**–** Represents the standard deviation of the calculated errors. The smaller the value, the greater the accuracy.

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum\_{i=1}^{n} (X\_i - Y\_i)^2} \tag{4}$$

• Mean Percentage Error (MPE)

**–** This parameter determines the behavior of the error. Values within the range of ±10% are acceptable.

$$\text{MPE} = \frac{100}{n} \sum\_{i=1}^{n} (\frac{X\_i - Y\_i}{Y\_i}) \tag{5}$$

• Relative Percentage Error (RPE)

**–** As with MPE, values within the range of ±10% are acceptable.

$$\text{RPE} = (\frac{X\_i - Y\_i}{Y\_i}) \times 100\tag{6}$$

• Pearson correlation coefficient (r)

**–** Utilized to measure the linear correlation between two variables on a scale from minus one to one, where one is a perfect positive correlation, minus one is a perfect negative correlation, and zero represents no linear correlation.

$$\mathbf{r} = \frac{\sum\_{i=1}^{n} (X\_i - \bar{X})(Y\_i - \bar{Y})}{\sqrt{\sum\_{i=1}^{n} (X\_i - \bar{X})^2 \sum\_{i=1}^{n} (Y\_i - \bar{Y})^2}} \tag{7}$$

• Coefficient of determination (*R*<sup>2</sup>)

**–** Represents how closely the calculated values follow the reference values; the closer it is to one, the greater the precision.

$$R^2 = 1 - (\frac{\sum\_{i=1}^n (Y\_i - X\_i)^2}{\sum\_{i=1}^n (Y\_i - \bar{Y})^2})\tag{8}$$

• Student's t-statistic (t)

**–** Utilized to determine whether the values of the mathematical model are statistically representative. The smaller the value of t, the better the performance of the model. Statistical significance is assessed using a table of critical values of the t distribution, where a confidence level (*α*) and the degrees of freedom (*df*) are used to find the critical value.

$$df = n - 1\tag{9}$$

$$\mathbf{t} = \sqrt{\frac{(df)(\text{MBE})^2}{(\text{RMSE})^2 - (\text{MBE})^2}}\tag{10}$$
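The metrics in Equations (2)–(10) can be computed together from the monthly pairs of model and reference values. The following is a minimal Python sketch, not the authors' implementation; the function name `validation_metrics` and the sample values are hypothetical and serve only to illustrate the formulas.

```python
import math

def validation_metrics(model, reference):
    """Compute the metrics of Equations (2)-(10) for paired monthly values.

    model     -- values to analyze (X), e.g. monthly irradiation from the model
    reference -- reference values (Y), e.g. monthly irradiation from NASA SSE
    """
    n = len(model)
    if n != len(reference) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")

    errors = [x - y for x, y in zip(model, reference)]
    mae = sum(abs(e) for e in errors) / n                    # Eq. (2)
    mbe = sum(errors) / n                                    # Eq. (3)
    rmse = math.sqrt(sum(e * e for e in errors) / n)         # Eq. (4)
    rpe = [100.0 * e / y for e, y in zip(errors, reference)] # Eq. (6), one per month
    mpe = sum(rpe) / n                                       # Eq. (5)

    x_bar = sum(model) / n
    y_bar = sum(reference) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(model, reference))
    sxx = sum((x - x_bar) ** 2 for x in model)
    syy = sum((y - y_bar) ** 2 for y in reference)
    r = sxy / math.sqrt(sxx * syy)                           # Eq. (7)
    r2 = 1.0 - sum(e * e for e in errors) / syy              # Eq. (8)

    df = n - 1                                               # Eq. (9)
    spread = rmse ** 2 - mbe ** 2
    if spread > 0:
        t = math.sqrt(df * mbe ** 2 / spread)                # Eq. (10)
    else:
        # All errors are identical: t is undefined, reported here as 0 or inf
        t = 0.0 if mbe == 0 else math.inf

    return {"MAE": mae, "MBE": mbe, "RMSE": rmse, "MPE": mpe,
            "RPE": rpe, "r": r, "R2": r2, "t": t}

# Hypothetical values, chosen only to exercise the function
sample = validation_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

The sketch does not guard against a constant series, for which the denominators of Equations (7) and (8) vanish; a production version would need to handle that case explicitly.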

### *3.3. Impact of the Variation of Inclination and Azimuth Angles on Receiving Surfaces*

Once the mathematical model was validated, tests were made at a fixed point at the Monterrey Tec campus, which is part of the Tec district located to the south of the city of Monterrey, in a polygon of 452 hectares [20]. The computed location was at a latitude of 25.6544 N and a longitude of 100.2874 W. These tests were accomplished with different tilt angles (*β*) and surface azimuths (*γ*) for a receiving surface. The purpose of this was to build a lookup table from which it is possible to extract the information of the surface angles that are most suitable to maximize the harvesting of available solar radiation at the analyzed interest point.
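The lookup-table idea can be sketched as a sweep over candidate angles. In this minimal Python sketch, `toy_model` is a made-up stand-in for the irradiation model of Section 3.1 (it is not Equation (1)), and the angle grids and function names are illustrative assumptions.

```python
from itertools import product

def build_lookup(model, months, tilts, azimuths):
    """Evaluate model(month, tilt, azimuth) on the full grid of candidates."""
    return {(m, b, g): model(m, b, g)
            for m, b, g in product(months, tilts, azimuths)}

def best_angles(lookup, month):
    """Return the (tilt, azimuth) pair with the highest irradiation for a month."""
    candidates = {(b, g): value
                  for (m, b, g), value in lookup.items() if m == month}
    return max(candidates, key=candidates.get)

def toy_model(month, tilt, azimuth):
    # Hypothetical placeholder: prefers a flat surface in months 5-7 and a
    # 30-degree tilt otherwise, and always prefers azimuth 0.
    preferred_tilt = 0 if 5 <= month <= 7 else 30
    return 6.0 - ((tilt - preferred_tilt) / 30.0) ** 2 - (azimuth / 90.0) ** 2

lookup = build_lookup(toy_model, range(1, 13), range(0, 61, 10), (-30, 0, 30))
january_best = best_angles(lookup, 1)
june_best = best_angles(lookup, 6)
```

Replacing `toy_model` with an implementation of the actual model would leave the sweep and the selection step unchanged.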

### **4. Results and Discussion**

This section presents the results of the investigation. First, the results of the model validation are presented; the validated model is then applied to a geographical point of particular interest, and the main findings are highlighted. All tables present the solar irradiation in kWh/m<sup>2</sup>.

### *4.1. Model Validation*

In general, the results show a very close relationship between the mathematical model and the NASA database (Figure 2), with minor discrepancies in some cases. The greatest difference found in the validation is in Monterrey, during the month of August; however, this difference is 0.16 kWh/m<sup>2</sup>, representing an error of just 2.8%. Regarding the metrics, a monthly average for every set of results is used for the statistical analysis, which can be seen in Table 3. Statistical tests with the MAE, MBE, and RMSE methods provided values very close to zero at all geographic points. Most of the values for MPE and RPE fall within the acceptable range of ±10%. In certain months the RPE values exceed ±2%, but they remain within the acceptable range. For the r and *R*<sup>2</sup> tests, almost all the values are very close to 1, and r even reaches 1 for the geographical point called Colombia. Finally, the Student's t-test showed that all the cases were significant, taking into account the critical value of 4.025 for a confidence level of 99% (0.001) with 11 degrees of freedom [21]. From the results of the statistical tests, it can be concluded that the mathematical model works for the state of Nuevo León.

### *4.2. Discussion and Findings*

Once the model was validated, different angles were chosen to compute the solar irradiation arriving at a solar harvesting surface, as shown in Tables A2 and A3 of Appendix B. The mathematical model was used to determine which combinations of angles capture the greatest amount of solar irradiation.


**Table 3.** Statistical test (model vs. NASA).

(**a**) Comparison in the municipality of Colombia

(**b**) Comparison in the municipality of Monterrey

(**c**) Comparison in the municipality of El Grullo.

(**d**) Comparison in the municipality of Mier y Noriega.

(**e**) Comparison in the municipality of Linares.

**Figure 2.** Comparison between the solar irradiation calculated by the mathematical model and the one gathered from the NASA database.

Different month arrangements were formed. The azimuth value was set to zero, and the tilt angle was varied to calculate the corresponding irradiation value. Various sets and groupings of months can be selected. Table 4 shows the average results of all the combinations of all the groupings made in this paper. However, it is necessary to note the particularities of each selection.



Table 5 proposes 12 angle changes, one for each month, that would be the most efficient in terms of capturing solar irradiation, but it is more demanding in terms of path tracking. Table 6 requires only five changes, and the average efficiency is 99.93%. If the year is divided into four periods of three months each, the efficiency remains high with an average value of 99.44% with a follow-up cost of only four angular values as shown in Table 7. Table 8 divides the year into two large groups of six months, and this selection requires only two discrete positions; however, the efficiency is reduced to 94.94%. Finally, when calculating the efficiency with a fixed angle of 25 degrees, Table 9, an efficiency of 94.35% is obtained. Although different arrangements of months can be used, as an additional example, the arrangement shown in Table 10 was formed, consisting of four partitions but considering irregular distribution of grouping the months with an efficiency of 99.94%. This is done in order to improve the efficiency of the array throughout the year. These findings have an evident impact on the design of electromechanical solar tracking systems, where the proposed mathematical model can be used as a reference of optimal angles to obtain the best possible performance.


**Table 5.** The 12 months' angle combinations.

**Table 6.** Bimonthly angle combinations.



**Table 7.** Quarterly angle combinations.

**Table 8.** Biannual angle combinations.


**Table 9.** Fixed angle combinations.



**Table 10.** Alternative quarterly angle combinations.

Notice that, as discussed above, keeping the tilt angle fixed loses almost 6% of the irradiation captured when the angle is changed every month, whereas changing it quarterly is practically equivalent to changing it every month.
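The comparison between monthly, grouped, and fixed tilt schedules can be reproduced from a table of irradiation per month and tilt angle. The sketch below assumes that efficiency means the irradiation captured with one tilt per group of months, divided by the irradiation captured with the best tilt in every month; this definition, the function names, and the small table of values are assumptions for illustration, not data from this study.

```python
def best_tilt_for_group(table, months):
    """Tilt that maximizes the summed irradiation over a group of months."""
    tilts = table[months[0]]
    return max(tilts, key=lambda tilt: sum(table[m][tilt] for m in months))

def schedule_efficiency(table, groups):
    """Percentage of the monthly-optimal irradiation captured by a schedule.

    table  -- {month: {tilt_deg: irradiation}}
    groups -- list of month groups; together they must cover every month once
    """
    captured = 0.0
    for months in groups:
        tilt = best_tilt_for_group(table, months)
        captured += sum(table[m][tilt] for m in months)
    optimal = sum(max(by_tilt.values()) for by_tilt in table.values())
    return 100.0 * captured / optimal

# Hypothetical four-month table (kWh/m2), for illustration only
table = {
    1: {0: 3.0, 30: 4.0, 60: 3.5},
    2: {0: 3.5, 30: 4.2, 60: 3.4},
    3: {0: 6.0, 30: 5.0, 60: 3.0},
    4: {0: 6.2, 30: 5.6, 60: 3.2},
}
per_month = schedule_efficiency(table, [[1], [2], [3], [4]])
two_groups = schedule_efficiency(table, [[1, 2], [3, 4]])
fixed = schedule_efficiency(table, [[1, 2, 3, 4]])
```

With this toy table, splitting the months into two groups recovers the monthly optimum, while a single fixed tilt does not; how the real schedules compare depends on the actual lookup table.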

Figure 3 illustrates the relationship between the month of the year, the tilt angle (*β*) and the solar irradiation. The aim of this figure is to provide a guide that presents how solar irradiation behaves with respect to different tilt angles during the year. Additionally, Figure 3 offers an easy way to analyze data, which can be used to accelerate decision making regarding the orientation of solar harvesting systems for optimal performance or planning minimum values of solar irradiation throughout the year.

**Figure 3.** Total solar irradiation throughout the year in kWh/m<sup>2</sup> depending on the tilt angle when azimuth equals 0.

As can be seen in Figure 4, the relationship between the tilt angle and the received solar irradiation behaves like a downward-opening parabola; that is, there is a vertex where the optimum angle is found, and the further the tilt is from this point, in either direction, the lower the solar radiation received. This effect is more visible in the first and last quarters of the year, where both branches of the parabola can be observed in all months, unlike the months of May, June and July, where the optimal angle is zero.

(**a**) Total solar irradiation in the first quarter of the year.

(**c**) Total solar irradiation in the third quarter of the year.

(**d**) Total solar irradiation in the fourth quarter of the year.

**Figure 4.** Total solar irradiation throughout the year in kWh/m<sup>2</sup>, depending on the tilt angle when azimuth equals 0, broken down into quarters.
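Because each curve resembles a parabola near its maximum, the optimum tilt can be estimated between sampled angles by fitting a parabola through three points. This is an illustrative Python sketch only; the function name and the sample numbers are hypothetical and are not taken from Figure 4, and the real curves are only approximately parabolic.

```python
def parabola_vertex(p0, p1, p2):
    """Abscissa of the vertex of the parabola through three (tilt, value) points."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    num = (x1 - x0) ** 2 * (y1 - y2) - (x1 - x2) ** 2 * (y1 - y0)
    den = (x1 - x0) * (y1 - y2) - (x1 - x2) * (y1 - y0)
    if den == 0:
        raise ValueError("points are collinear; no unique vertex")
    return x1 - 0.5 * num / den

# Hypothetical samples generated from 6 - 0.001 * (tilt - 25)**2
samples = [(10, 5.775), (20, 5.975), (40, 5.775)]
estimated_optimum = parabola_vertex(*samples)
```

The estimate is only meaningful when the three samples bracket the maximum; for months where the optimum lies at the edge of the tilt range, the best sampled angle should be used instead.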

### **5. Conclusions**

A mathematical model developed at the University of Tomsk, Russia, for high latitudes was applied to obtain a set of angles that maximize energy collection in the state of Nuevo León, a strategic region of northern Mexico, showing excellent results according to the evaluated metrics. The model was evaluated at specific points of the state and was applied to a particular point within the university campus of Tec de Monterrey. The viability of the model was evaluated when applied to solar harvesting surfaces to maximize energy collection. It was found that the model allows the angular variation of a solar harvesting surface to be studied, and that it yields a set of angles that maximizes solar energy capture. These findings are of interest to solar engineering because they point to the possibility of designing discrete tracking systems, that is, tracking systems that change the angles at certain discrete positions selected by the user throughout the year. This differs from current solar tracking systems, which are designed for continuous day-to-day monitoring at a high computational and economic cost. According to our findings, discrete tracking would be simpler. However, more research is required in this regard.

Another implication of our findings is that this type of study can support the urban development of the region by reducing the cost of efficient solar collection systems for the generation of green energy and by reducing the regional carbon footprint of energy production.

It is worth highlighting that, after a series of performance tests with different tilt and surface azimuth angles for a receiving surface, the azimuth angle was found to have a minimal effect on the solar irradiation on the surface for a discrete monthly approach. On the other hand, different tilt angles produce notable variations in the solar irradiation obtained. Based on these findings, modifying the tilt and azimuth angles was concluded to be important for achieving the highest solar irradiation that can be received on a surface, such as solar panels in the analyzed location. Compared with results obtained from meteorological stations, the proposed method offers an effective quantitative advantage, since it requires neither monitoring stations nor the operating and maintenance costs involved. These results can be utilized to assess the deployment and planning of renewable energy systems based on solar panels with adjustable angles.

Finally, as future work, applications of solar tracking and monitoring systems for photovoltaic or solar thermal installations are proposed based on this paper. In particular, this study establishes the possibility of designing a discrete tracking system that takes Tables 5–10 as inputs and can be implemented as a sensorless open-loop control system, whereas conventional solar tracking systems are based on continuous angle regulation.
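A sensorless open-loop scheme of this kind reduces to a calendar lookup. The following is a minimal sketch, assuming the schedule is supplied as a mapping from groups of months to a tilt angle; the function `tilt_for_month` and the angles in `EXAMPLE_SCHEDULE` are hypothetical placeholders and are not the values reported in Tables 5–10.

```python
from typing import Dict, Tuple


def tilt_for_month(month: int, schedule: Dict[Tuple[int, ...], float]) -> float:
    """Return the tilt angle (degrees) scheduled for a calendar month (1-12).

    No sensor feedback is used: the angle depends only on the date, which is
    what makes the controller open-loop.
    """
    for months, tilt_deg in schedule.items():
        if month in months:
            return tilt_deg
    raise ValueError(f"no tilt angle scheduled for month {month}")


# Placeholder quarterly schedule: the angles are illustrative only and are
# NOT the values reported in Tables 5-10.
EXAMPLE_SCHEDULE = {
    (1, 2, 3): 40.0,
    (4, 5, 6): 5.0,
    (7, 8, 9): 10.0,
    (10, 11, 12): 45.0,
}

february_tilt = tilt_for_month(2, EXAMPLE_SCHEDULE)
```

The same lookup would cover monthly, bimonthly, biannual or fixed schedules by changing how the months are grouped in the mapping.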

**Author Contributions:** Conceptualization, E.A.E.-V. and J.d.-J.L.-S.; methodology, E.A.E.-V., V.H.B. and L.C.F.-H.; software, G.Q.-O.; validation, G.Q.-O., E.A.E.-V. and V.H.B.; formal analysis, G.Q.-O. and E.A.E.-V.; investigation, J.C.M.-M., G.Q.-O., E.A.E.-V., V.H.B. and L.C.F.-H.; resources, J.d.-J.L.-S.; data curation, G.Q.-O. and E.A.E.-V.; writing—original draft preparation, J.C.M.-M., G.Q.-O. and V.H.B.; writing—review and editing, E.A.E.-V., V.H.B., L.C.F.-H., R.A.R.-M. and J.d.-J.L.-S.; visualization, E.A.E.-V., L.C.F.-H. and J.d.-J.L.-S.; supervision, V.H.B.; project administration, V.H.B.; funding acquisition, J.d.-J.L.-S. and R.A.R.-M. All authors have read and agreed to the published version of the manuscript.

**Funding:** The APC was funded by CampusCity Initiative from the School of Engineering and Sciences at the Tecnologico de Monterrey.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **Abbreviations**

The following abbreviations are used in this manuscript:





### **Appendix A. Equations from Mathematical Model**

All the equations used in the mathematical model are shown below.

Declination angle of the sun (*δ*):

$$\delta = \left[23.45 \sin\left(\frac{360\,(284 + N)}{365}\right)\right]\left(\frac{\pi}{180}\right), \ \text{Radians} \tag{A1}$$
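As a minimal sketch of Equation (A1), assuming *N* is the day of the year counted from 1 January and that the sine argument is expressed in degrees, the declination could be computed as follows (the function name `declination` is an illustrative choice):

```python
import math


def declination(day_of_year: int) -> float:
    """Solar declination in radians for day N of the year, following Eq. (A1)."""
    # The sine argument of Eq. (A1) is in degrees; convert before calling sin.
    angle_deg = 360.0 * (284 + day_of_year) / 365.0
    # 23.45 * sin(...) is in degrees; the (pi/180) factor converts to radians.
    return math.radians(23.45 * math.sin(math.radians(angle_deg)))
```

With this form the declination is close to zero around day 81 and reaches roughly ±23.45° near the solstices.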

The solar hour angle (*ω*):

$$\omega = \left\{15\left[(\text{Time hr}) - 12 - (\text{Diff GMT}) - EoT\right] + \psi\right\}\left(\frac{\pi}{180}\right), \ \text{Radians} \tag{A2}$$

The equation of time (*EoT*):

$$EoT = \frac{9.87 \sin 2B - 7.53 \cos B - 1.5 \sin B}{60}, Hours \tag{A3}$$

Value of B:

$$B = (\frac{360}{365})(N - 81)(\frac{\pi}{180})\tag{A4}$$
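Equations (A2)–(A4) can be sketched together as below. The code follows the signs exactly as printed in Equation (A2). It assumes that "Time hr" is the clock time in hours, "Diff GMT" is the offset from GMT in hours, and *ψ* is the site longitude in degrees; these interpretations, and the function names, are assumptions rather than definitions given in this appendix.

```python
import math


def equation_of_time(day_of_year: int) -> float:
    """Equation of time in hours, following Eqs. (A3) and (A4)."""
    b = math.radians((360.0 / 365.0) * (day_of_year - 81))  # Eq. (A4)
    return (9.87 * math.sin(2.0 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)) / 60.0


def hour_angle(clock_hour: float, gmt_offset_hours: float,
               longitude_deg: float, day_of_year: int) -> float:
    """Solar hour angle in radians, with the signs as printed in Eq. (A2)."""
    eot = equation_of_time(day_of_year)
    # 15 degrees of hour angle per hour, plus the longitude term psi.
    degrees = 15.0 * (clock_hour - 12.0 - gmt_offset_hours - eot) + longitude_deg
    return math.radians(degrees)
```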

The incidence angle (*θ*):

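$$\theta = \cos^{-1}\left[(\sin\delta \cdot \sin\phi \cdot \cos\beta) + (\cos\delta \cdot \sin\phi \cdot \sin\beta \cdot \cos\gamma \cdot \cos\omega) - (\sin\delta \cdot \cos\phi \cdot \sin\beta \cdot \cos\gamma) + (\cos\delta \cdot \sin\beta \cdot \sin\gamma \cdot \sin\omega) + (\cos\delta \cdot \cos\phi \cdot \cos\beta \cdot \cos\omega)\right], \ \text{Radians} \tag{A5}$$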
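A minimal sketch of Equation (A5) follows, assuming all angles are supplied in radians, with *φ* the latitude, *β* the tilt angle and *γ* the surface azimuth angle. The function name is illustrative, and the clamp before `acos` is an added numerical safeguard, not part of the model.

```python
import math


def incidence_angle(delta: float, phi: float, beta: float,
                    gamma: float, omega: float) -> float:
    """Incidence angle in radians on a tilted surface, following Eq. (A5)."""
    cos_theta = (
        math.sin(delta) * math.sin(phi) * math.cos(beta)
        + math.cos(delta) * math.sin(phi) * math.sin(beta) * math.cos(gamma) * math.cos(omega)
        - math.sin(delta) * math.cos(phi) * math.sin(beta) * math.cos(gamma)
        + math.cos(delta) * math.sin(beta) * math.sin(gamma) * math.sin(omega)
        + math.cos(delta) * math.cos(phi) * math.cos(beta) * math.cos(omega)
    )
    # Guard against rounding slightly outside [-1, 1] before taking acos.
    return math.acos(max(-1.0, min(1.0, cos_theta)))
```

For a horizontal surface (*β* = 0) the expression reduces to the zenith angle of Equation (A6), which gives a quick consistency check.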

The solar zenith angle (*θ<sub>z</sub>*):

$$\theta_z = \cos^{-1}\left[(\sin\phi \cdot \sin\delta) + (\cos\phi \cdot \cos\delta \cdot \cos\omega)\right], \ \text{Radians} \tag{A6}$$

The solar altitude angle (*h*):

$$h = \frac{\pi}{2} - \theta_z, \ \text{Radians} \tag{A7}$$

$$h = \sin^{-1}\left[(\sin\phi \cdot \sin\delta) + (\cos\phi \cdot \cos\delta \cdot \cos\omega)\right], \ \text{Radians} \tag{A8}$$

The solar azimuth angle (*A<sub>z</sub>*):

$$A_z = \cos^{-1}\left[\frac{(\sin h \cdot \sin\phi) - \sin\delta}{\cos h \cdot \cos\phi}\right], \ \text{Radians} \tag{A9}$$
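Equations (A6)–(A9) can be evaluated in sequence, as in the following minimal sketch. It assumes inputs in radians; the function name `solar_position` and the clamping helper are illustrative additions. The azimuth expression is undefined when the sun is exactly at the zenith or the site is at a pole (cos *h* or cos *φ* equal to zero), which this sketch does not handle.

```python
import math
from typing import Tuple


def _clamp(x: float) -> float:
    """Keep a cosine or sine value inside [-1, 1] despite rounding error."""
    return max(-1.0, min(1.0, x))


def solar_position(phi: float, delta: float, omega: float) -> Tuple[float, float, float]:
    """Return (zenith, altitude, azimuth) in radians, following Eqs. (A6)-(A9)."""
    cos_zenith = (math.sin(phi) * math.sin(delta)
                  + math.cos(phi) * math.cos(delta) * math.cos(omega))
    zenith = math.acos(_clamp(cos_zenith))            # Eq. (A6)
    altitude = math.pi / 2.0 - zenith                 # Eq. (A7)
    cos_azimuth = ((math.sin(altitude) * math.sin(phi) - math.sin(delta))
                   / (math.cos(altitude) * math.cos(phi)))
    azimuth = math.acos(_clamp(cos_azimuth))          # Eq. (A9)
    return zenith, altitude, azimuth
```

Equation (A8) gives the same altitude as (A7) directly from the arcsine, so either form can be used.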

The sunset hour angle (*ω<sub>ss</sub>*):

$$\omega_{ss} = \cos^{-1}(-\tan\phi \cdot \tan\delta), \ \text{Radians} \tag{A10}$$

The hourly diffuse coefficient (*r<sub>d</sub>*):

$$r_d = \frac{\pi}{24}\left(\frac{\cos\omega - \cos\omega_{ss}}{\sin\omega_{ss} - (\omega_{ss} \cdot \cos\omega_{ss})}\right) \tag{A11}$$

The hourly transparency coefficient (*r<sub>t</sub>*):

$$r_t = r_d(a + b\cos\omega) \tag{A12}$$

Values of coefficients *a* and *b*:

$$a = 0.409 + 0.5016 \sin\left(\omega_{ss} - \frac{\pi}{3}\right) \tag{A13}$$

$$b = 0.6609 - 0.4767 \sin\left(\omega_{ss} - \frac{\pi}{3}\right) \tag{A14}$$
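The hourly coefficients of Equations (A10)–(A14) can be sketched as follows, assuming all angles are in radians. The function names are illustrative. Equation (A10) has no real solution during polar day or polar night, a case this sketch does not treat.

```python
import math
from typing import Tuple


def sunset_hour_angle(phi: float, delta: float) -> float:
    """Sunset hour angle in radians, following Eq. (A10)."""
    return math.acos(-math.tan(phi) * math.tan(delta))


def hourly_coefficients(omega: float, omega_ss: float) -> Tuple[float, float]:
    """Return (r_d, r_t) for hour angle omega, following Eqs. (A11)-(A14)."""
    r_d = (math.pi / 24.0) * (math.cos(omega) - math.cos(omega_ss)) / (
        math.sin(omega_ss) - omega_ss * math.cos(omega_ss)
    )
    a = 0.409 + 0.5016 * math.sin(omega_ss - math.pi / 3.0)   # Eq. (A13)
    b = 0.6609 - 0.4767 * math.sin(omega_ss - math.pi / 3.0)  # Eq. (A14)
    r_t = r_d * (a + b * math.cos(omega))                     # Eq. (A12)
    return r_d, r_t
```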

The clearness index (*K<sub>T</sub>*):

$$K_T = \frac{H}{H_0} \tag{A15}$$

The average daily extra-atmospheric insolation arriving at a horizontal surface (*H*<sub>0</sub>):

$$H_0 = G_{sc}\left(\frac{24}{\pi}\right)\left\{1 + 0.033\cos\left[\left(\frac{360\,N}{365}\right)\left(\frac{\pi}{180}\right)\right]\right\}\left[(\sin\phi \cdot \sin\delta \cdot \omega_{ss}) + (\cos\phi \cdot \cos\delta \cdot \sin\omega_{ss})\right], \ \mathrm{Wh/m^2} \tag{A16}$$

Solar constant (*G<sub>sc</sub>*):

$$G_{sc} = 1367\ \mathrm{W/m^2} \tag{A17}$$
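A minimal sketch of Equations (A15)–(A17) is given below. It assumes *φ*, *δ* and *ω<sub>ss</sub>* are supplied in radians and *N* is the day of the year; the function and constant names are illustrative.

```python
import math

SOLAR_CONSTANT = 1367.0  # G_sc in W/m^2, Eq. (A17)


def daily_extraterrestrial_insolation(phi: float, delta: float,
                                      omega_ss: float, day_of_year: float) -> float:
    """Daily extra-atmospheric insolation H_0 in Wh/m^2, following Eq. (A16)."""
    # Correction for the varying Earth-Sun distance over the year.
    eccentricity = 1.0 + 0.033 * math.cos(math.radians(360.0 * day_of_year / 365.0))
    geometry = (math.sin(phi) * math.sin(delta) * omega_ss
                + math.cos(phi) * math.cos(delta) * math.sin(omega_ss))
    return SOLAR_CONSTANT * (24.0 / math.pi) * eccentricity * geometry


def clearness_index(h_daily: float, h0_daily: float) -> float:
    """Clearness index K_T = H / H_0, Eq. (A15)."""
    return h_daily / h0_daily
```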

The average daily radiation arriving at a horizontal surface (*H*):

$$H = K_T(H_0), \ \mathrm{Wh/m^2} \tag{A18}$$

The diffusion index (*K<sub>D</sub>*):

$$K_D = \frac{H_D}{H} \tag{A19}$$

The average daily diffuse radiation arriving at a horizontal surface (*H<sub>D</sub>*):

$$H_D = K_D(H), \ \mathrm{Wh/m^2} \tag{A20}$$

*K<sub>D</sub>* can be determined by the equations and conditions shown in Table A1. *K<sub>D</sub>* is determined based on *K<sub>T</sub>*.

$$K_D = f(K_T) \tag{A21}$$

Total (*G<sub>H</sub>*):

$$G_H = r_t(H), \ \mathrm{W/m^2} \tag{A22}$$

Diffuse (*G<sub>DH</sub>*):

$$G_{DH} = r_d(H_D), \ \mathrm{W/m^2} \tag{A23}$$

Direct (*G<sub>D</sub>*):

$$G_D = G_H - G_{DH}, \ \mathrm{W/m^2} \tag{A24}$$
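The split of the daily radiation into hourly total, diffuse and direct components (Equations (A20) and (A22)–(A24)) can be sketched as below. Because the conditions of Table A1 are not reproduced here, the diffusion index *K<sub>D</sub>* is taken as an input rather than computed; the function name is illustrative.

```python
from typing import Tuple


def hourly_components(h_daily: float, k_d: float,
                      r_t: float, r_d: float) -> Tuple[float, float, float]:
    """Return (G_H, G_DH, G_D) in W/m^2 on a horizontal surface.

    h_daily is the daily radiation H in Wh/m^2, k_d is the diffusion index
    (supplied by the caller, e.g. from Table A1), and r_t and r_d are the
    hourly coefficients of Eqs. (A11) and (A12).
    """
    h_diffuse = k_d * h_daily      # Eq. (A20)
    g_total = r_t * h_daily        # Eq. (A22)
    g_diffuse = r_d * h_diffuse    # Eq. (A23)
    g_direct = g_total - g_diffuse  # Eq. (A24)
    return g_total, g_diffuse, g_direct
```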

The hourly extra-atmospheric radiation arriving at a horizontal surface (*G*<sub>0</sub>):

$$G_0 = G_{sc}\left\{1 + 0.033\cos\left[\left(\frac{360\,N}{365}\right)\left(\frac{\pi}{180}\right)\right]\right\}\sin h, \ \mathrm{W/m^2} \tag{A25}$$

The anisotropic index (*A<sub>i</sub>*):

$$A_i = \frac{G_D}{G_0} \tag{A26}$$
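Equations (A25) and (A26) can be sketched as follows, assuming the solar altitude *h* is in radians and positive (sun above the horizon). The function names are illustrative.

```python
import math

SOLAR_CONSTANT = 1367.0  # G_sc in W/m^2, Eq. (A17)


def hourly_extraterrestrial_irradiance(day_of_year: float, altitude: float) -> float:
    """Hourly extra-atmospheric irradiance G_0 in W/m^2, following Eq. (A25)."""
    eccentricity = 1.0 + 0.033 * math.cos(math.radians(360.0 * day_of_year / 365.0))
    return SOLAR_CONSTANT * eccentricity * math.sin(altitude)


def anisotropy_index(g_direct: float, g0: float) -> float:
    """Anisotropic index A_i = G_D / G_0, Eq. (A26)."""
    if g0 <= 0.0:
        raise ValueError("G_0 must be positive (sun above the horizon)")
    return g_direct / g0
```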


**Table A1.** Conditional table to determine the diffusion index (*KD*).

### **Appendix B. Positive and Negative Azimuth Angle Combinations for Monterrey Tec Campus, State of Nuevo León**

Table A2 presents the solar irradiation in kWh/m<sup>2</sup> calculated using the proposed model for all combinations of a positive surface azimuth angle from 0 to 15 degrees and a tilt angle from 0 to 60 degrees. Table A3 shows the same for negative surface azimuth angles from −5 to −15 degrees.

**Table A2.** Positive azimuth angle combinations for Monterrey Tec campus (latitude: 25.6544 N; longitude: 100.2874 W).




**Table A3.** Negative azimuth angle combinations for Monterrey Tec campus (latitude: 25.6544 N; longitude: 100.2874 W).






MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Energies* Editorial Office E-mail: energies@mdpi.com www.mdpi.com/journal/energies


www.mdpi.com ISBN 978-3-0365-2866-3