#### **1. Introduction**

In the past decades, large-scale wind power integration has become a trend [1]. As a result, a variety of uncertainties have been identified in power systems [2–7]. The outputs of wind farms are greatly influenced by natural environmental factors such as wind speed, which are random and, therefore, difficult to predict and control accurately [8]. There are many giant wind farms in the northwest of China. When these wind farms are connected to the power grid, a large number of generating nodes with random outputs form in the power system. This brings enormous challenges to the scheduling and planning of the power system because these schemes usually need accurate prediction data of the generating node outputs. Consequently, it is widely believed that the impacts of wind power uncertainties should be considered in the scheduling and planning of power systems [9–11]. Currently, the best way to describe the uncertainties of wind power is to construct a probability density function (PDF) [12–16].

When multiple wind farms in the same wind belt are connected to a power system simultaneously, their outputs are random and probabilistically correlated [17]. According to probability theory, when a probability density function is established for multiple subjects with a probabilistic correlation, these subjects cannot be viewed as independent events. Therefore, these wind farms cannot be considered independent [18–21]. That is, the PDF of a single wind farm is not applicable when the uncertainties of these wind farms need to be described, and it is necessary to construct the joint probability density function (JPDF) of the wind farms. Accordingly, the precise construction of the JPDF of wind farms is a foundation for the scheduling and planning of power systems with multiple wind farms. For example, Reference [22] proposes that the probabilistic correlation between multiple wind farms should be considered in the scheduling of power systems. A systematic planning method that considers the probabilistic correlation of multiple wind farms is studied in Reference [23]. In general, it is significant to research a method of constructing the JPDFs of multiple wind farms accurately and conveniently.

A large body of literature studies the construction of a JPDF for multiple wind farms. Copula theory is the most common method used to study this problem, given that it can characterize the probabilistic correlation among multiple wind farms. In Reference [24], the wind farms near the Dutch coastline are treated as two equivalent wind farms and a Gaussian Copula function is introduced to establish their JPDF. In Reference [25], the Copula function is used to build a probabilistic correlation model for the wind speed and the wind power output, and the model is then used to assess the state of the generators. In Reference [26], a variety of bivariate Copula functions are utilized to study the dependence structures of wind farms and the goodness of fit of different Copula functions is compared. In Reference [27], a number of basic Copula functions are summed with weights to form a comprehensive Copula function. Compared with a single Copula function model, the comprehensive function describes the JPDF of wind speed more accurately [28].

According to previous research, the regular steps are as follows. First, a number of Copula function forms are selected in advance according to the cumulative distribution characteristics of the wind farm outputs. Second, the unknown parameters are estimated. Finally, the most appropriate Copula function is determined by an optimization method. However, this Copula-based approach to JPDF modeling of wind farm outputs is essentially parameter estimation (PE), which depends on a prior definition of the JPDF form. On the one hand, once the wrong form is selected, no accurate modeling result can be obtained no matter how accurate the PE process is. Some scholars have tried to improve the modeling accuracy by estimating the parameters of all forms of Copula functions and then selecting the most accurate one, but this undoubtedly increases the complexity of the modeling process. On the other hand, a large number of wind farms are scattered throughout China. Consequently, the joint probability characteristics of wind farm clusters on different wind belts may follow different JPDF forms, and it is difficult to ensure the universal applicability of a modeling method based on the Copula function.

Unlike the PE method, the nonparametric kernel density estimation (NKDE) method models the probability distribution of an object directly, without any prior judgment of the function form. Accordingly, it has higher accuracy and applicability and has been applied effectively in the field of probabilistic modeling in power systems [2–31]. Existing research focuses mainly on PDF modeling of a single random variable [32]. Some literature has begun to study the NKDE method for multidimensional random variables [33–38], but few of these studies apply to the field of power systems.

In Reference [39], a JPDF model of grid node loads based on NKDE theory is proposed and the effects of the node load correlation and uncertainty on reliability are analyzed. In Reference [40], a node load conditional probability density modeling method based on NKDE theory is proposed. In these two papers, a multivariate nonparametric kernel density estimation (MNKDE) method for the probabilistic correlation modeling of node loads is successfully proposed. However, problems remain when this method is applied to the JPDF modeling of multiple wind farms. Unlike that of loads, the local amplitude of the JPDF of wind farms is larger, while the bandwidth of the existing MNKDE method is fixed. A fixed bandwidth may cause a local applicability problem because the modeling accuracy is high in some intervals but lower in others.

To solve this problem, Reference [41] proposes the idea of modifying the bandwidth based on the samples themselves, together with a mathematical model of adaptive univariate NKDE. Based on References [41,42], a new adaptive univariate NKDE model for power system state estimation is proposed. Moreover, a method to determine the bandwidth, discussed in References [43,44], is also proposed. The above references have made significant progress in solving the local applicability problem and provided the idea used in this research. However, all of the above research was aimed at the univariate NKDE model. An adaptive version of the MNKDE model has not yet been reported.

In summary, an approach of adaptive multivariate nonparametric kernel density estimation (AMNKDE) is proposed in this paper and is used to model the JPDF of multiple wind farms. The correctness and effectiveness of the approach are verified by simulation results based on the practical operation data of several wind farms in China.

The main contributions of this paper are as follows:

(1) The AMNKDE approach for the JPDF modeling of multiple wind farms is proposed in this paper. Compared with the traditional PE method based on the Copula function, the approach does not require prior judgement of the JPDF forms of multiple wind farms. Consequently, this approach possesses higher modeling accuracy and applicability.

(2) To adapt the MNKDE to the specific problem of multiple wind farms, an improved adaptive strategy is proposed. Specifically, a model of optimal bandwidth is established and the traditional fixed bandwidth is replaced with an adaptive bandwidth, which is adjusted automatically according to the samples. The improved strategy in this paper solves the local applicability problem of the existing MNKDE method and further improves the modeling accuracy.

The rest of the paper is organized as follows: The AMNKDE model for multiple wind farms and bandwidth evaluation indicators are given in Section 2. An optimized method of solving the bandwidth model based on ordinal optimization (OO) is explained in Section 3. The simulation results are compared and analyzed in Section 4. The conclusions are presented in Section 5.

#### **2. Adaptive Multivariate Nonparametric Kernel Density Estimation Model for Multiple Wind Farms**

#### *2.1. MNKDE Model for Multiple Wind Farms*

Consider *m* wind farms with *n* output data samples in each sampling period. The active power vector at the *i*th sampling point is *X<sub>i</sub>* = [*X*<sub>*i*1</sub>, *X*<sub>*i*2</sub>, ···, *X<sub>im</sub>*]<sup>*T*</sup>, *i* = 1, 2, ..., *n*. The random vector of the power outputs of the *m* wind farms is *x* = [*x*<sub>1</sub>, *x*<sub>2</sub>, ···, *x<sub>m</sub>*]<sup>*T*</sup>. The JPDF is *f*(*x*) = *f*(*x*<sub>1</sub>, *x*<sub>2</sub>, ···, *x<sub>m</sub>*). The MNKDE model of the JPDF is

$$\hat{f}(\mathbf{x}) = \frac{1}{n} \sum\_{i=1}^{n} \frac{1}{|\mathbf{H}|^{1/2}} K \Big[ \mathbf{H}^{-1/2} (\mathbf{x} - \mathbf{X}\_i) \Big],\tag{1}$$

where *H* is the bandwidth matrix, an *m* × *m* symmetric positive definite matrix, and *K*(·) is the multivariate kernel function, which must satisfy the following conditions:

$$\begin{cases} \int\_{\mathbb{R}^m} K(\mathbf{x}) d\mathbf{x} = 1 \\ \int\_{\mathbb{R}^m} \mathbf{x} K(\mathbf{x}) d\mathbf{x} = 0 \\ \int\_{\mathbb{R}^m} \mathbf{x} \mathbf{x}^T K(\mathbf{x}) d\mathbf{x} = I\_m \end{cases},\tag{2}$$

where ℝ<sup>*m*</sup> is the *m*-dimensional Euclidean space, *I<sub>m</sub>* denotes the *m* × *m* identity matrix, and *x*<sup>*T*</sup> is the transpose of *x*.

According to Reference [45], if the kernel function satisfies Formula (2), its form has little effect on the probability density modeling accuracy. Therefore, the Gaussian kernel function was chosen as the kernel function in this paper.
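As a quick numerical sanity check (not part of the original study), the one-dimensional Gaussian kernel can be tested against the three conditions of Formula (2) with a simple midpoint-rule integration. The function names, the integration range, and the step count below are illustrative assumptions.

```python
import math


def gaussian_kernel(u: float) -> float:
    """Standard one-dimensional Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_moment(order: int, lo: float = -10.0, hi: float = 10.0,
                  steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of u**order * K(u) on [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        u = lo + (j + 0.5) * width
        total += (u ** order) * gaussian_kernel(u)
    return total * width


# Conditions of Formula (2) for m = 1: unit mass, zero mean, unit second moment.
mass = kernel_moment(0)
mean = kernel_moment(1)
second = kernel_moment(2)
```

For a kernel built as a product of such one-dimensional kernels, the *m*-dimensional conditions follow from these three one-dimensional results.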

The specific form of the bandwidth matrix *H* is given in Formula (3),

$$\mathbf{H} = \begin{bmatrix} h\_{11} & h\_{21} & \cdots & h\_{m1} \\ h\_{12} & h\_{22} & \cdots & h\_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ h\_{1m} & h\_{2m} & \cdots & h\_{mm} \end{bmatrix} \tag{3}$$

In MNKDE modeling, the selection of the bandwidth matrix is the most important factor and directly affects the modeling accuracy. Generally, the bandwidth matrix is obtained from an optimal model of bandwidth. Because the bandwidth matrix has many elements, the computational complexity of the optimal model of bandwidth for MNKDE is much larger than that of univariate NKDE. To reduce the computational complexity, the method in Reference [39] was used to simplify Formula (1) in this paper.

The formula is simplified as follows:

$$\hat{f}\_m(\mathbf{x}) = \frac{1}{n} \sum\_{i=1}^n \frac{1}{h\_1 h\_2 \cdots h\_m} K\left(\frac{\mathbf{x}\_1 - \mathbf{X}\_{i1}}{h\_1}, \dots, \frac{\mathbf{x}\_m - \mathbf{X}\_{im}}{h\_m}\right),\tag{4}$$

where *K*(*x*) is defined as

$$K(\mathbf{x}\_1, \mathbf{x}\_2, \dots, \mathbf{x}\_m) = K(\mathbf{x}\_1)K(\mathbf{x}\_2)\cdots K(\mathbf{x}\_m). \tag{5}$$
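The step from Formula (1) to Formula (4) can be made explicit under one assumption: that the simplification corresponds to a diagonal bandwidth matrix with one bandwidth per wind farm. This reading is inferred from the form of Formula (4) and is not quoted from Reference [39].

$$\mathbf{H} = \operatorname{diag}(h\_1^2, \dots, h\_m^2) \;\Rightarrow\; |\mathbf{H}|^{1/2} = h\_1 h\_2 \cdots h\_m, \qquad \mathbf{H}^{-1/2}(\mathbf{x} - \mathbf{X}\_i) = \left[\frac{x\_1 - X\_{i1}}{h\_1}, \dots, \frac{x\_m - X\_{im}}{h\_m}\right]^T$$

Substituting these two expressions into Formula (1) gives Formula (4), and the product form of Formula (5) then lets each dimension be handled by a one-dimensional kernel.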

Here, the Gaussian kernel is used as the kernel function

$$K(x) = \frac{1}{\sqrt{2\pi}} e^{\left(-\frac{x^2}{2}\right)}.\tag{6}$$

According to Formulas (4)–(6), Formula (7) is as follows:

$$\hat{f}\_m(\mathbf{x}) = \frac{1}{n} \sum\_{i=1}^{n} \frac{1}{h\_1 h\_2 \cdots h\_m} \cdot \frac{e^{-\frac{1}{2} \left(\frac{x\_1 - X\_{i1}}{h\_1}\right)^2}}{\sqrt{2\pi}} \cdot \frac{e^{-\frac{1}{2} \left(\frac{x\_2 - X\_{i2}}{h\_2}\right)^2}}{\sqrt{2\pi}} \cdots \frac{e^{-\frac{1}{2} \left(\frac{x\_m - X\_{im}}{h\_m}\right)^2}}{\sqrt{2\pi}}.\tag{7}$$

Formula (7) can be further simplified as follows:

$$\hat{f}\_m(\mathbf{x}) = \frac{1}{n} \sum\_{i=1}^n \frac{1}{h\_1 h\_2 \cdots h\_m} \frac{1}{\left(\sqrt{2\pi}\right)^m} e^{-\frac{1}{2}Y(\mathbf{x})},\tag{8}$$

where the specific form of *Y*(*x*) is shown in Formula (9) as

$$Y(\mathbf{x}) = \left( \left( \frac{\mathbf{x}\_1 - X\_{i1}}{h\_1} \right)^2 + \left( \frac{\mathbf{x}\_2 - X\_{i2}}{h\_2} \right)^2 + \dots + \left( \frac{\mathbf{x}\_m - X\_{im}}{h\_m} \right)^2 \right). \tag{9}$$
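Formulas (8) and (9) translate almost directly into code. The following is a minimal Python sketch, not the authors' MATLAB implementation; the function name `mnkde_density`, the sample values, and the bandwidths are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def mnkde_density(x: Sequence[float],
                  samples: Sequence[Sequence[float]],
                  bandwidths: Sequence[float]) -> float:
    """Evaluate Formula (8) at point x with fixed bandwidths h_1..h_m."""
    n = len(samples)
    m = len(bandwidths)
    # Normalizing constant h_1 * ... * h_m * (sqrt(2*pi))**m.
    norm = math.prod(bandwidths) * math.sqrt(2.0 * math.pi) ** m
    total = 0.0
    for sample in samples:
        # Y(x) from Formula (9): squared scaled distance to this sample.
        y = sum(((x[j] - sample[j]) / bandwidths[j]) ** 2 for j in range(m))
        total += math.exp(-0.5 * y)
    return total / (n * norm)


# Hypothetical outputs (kW) of m = 2 wind farms at n = 4 sampling points.
samples = [(120.0, 90.0), (150.0, 110.0), (400.0, 380.0), (420.0, 410.0)]
bandwidths = (50.0, 50.0)
density_near_cluster = mnkde_density((135.0, 100.0), samples, bandwidths)
density_far_away = mnkde_density((900.0, 50.0), samples, bandwidths)
```

A point close to a cluster of samples receives a much higher density than a point far from every sample, which is the behavior the bandwidths control.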

#### *2.2. Optimal Model of Bandwidth*

In the MNKDE model, *H* directly influences the accuracy and smoothness of the model. If *H* is too large, the estimated probability density function *f̂*(*x*) is over-smoothed, which results in a large estimation error. If *H* is too small, the estimation accuracy at the sample points improves, but *f̂*(*x*) may fluctuate excessively, especially in its tail.

Accordingly, two bandwidth evaluation indicators are presented in this paper: the Euclidean distance and the maximum distance. The former is mainly used to evaluate the accuracy of the model and the latter is used to evaluate the smoothness of the model.

Assuming that *f*(*x*) is the real JPDF of wind power samples, the Euclidean distance is defined as follows: 

$$d\_O(\mathbf{H}) = \sqrt{\sum\_{i=1}^{n} d\_{Ji}^2(\mathbf{H})},\tag{10}$$

where *d<sub>Ji</sub>*(*H*) = *f̂*(*x<sub>i</sub>*) − *f*(*x<sub>i</sub>*), which is the geometric distance between the estimated value and the real value for each sample.

The maximum distance is defined as follows:

$$d\_M(\mathbf{H}) = \max\{d\_{Ji}(\mathbf{H})\}.\tag{11}$$

Based on Formulas (10) and (11), an optimal model of bandwidth, considering both accuracy and smoothness of the model, is:

$$\min R(\mathbf{H}) = \min [d\_O(\mathbf{H}) + d\_M(\mathbf{H})],\tag{12}$$

where *R*(*H*) is the fitness error function of MNKDE.
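The two indicators and their sum are straightforward to compute once estimated and reference density values are available at the same points. The sketch below is illustrative only: the function name and the density values are hypothetical, and taking the geometric distance as an absolute difference is an assumption, since the text defines it as a signed difference.

```python
import math
from typing import Sequence, Tuple


def fitness_error(estimated: Sequence[float],
                  reference: Sequence[float]) -> Tuple[float, float, float]:
    """Return (d_O, d_M, R) following Formulas (10)-(12)."""
    if len(estimated) != len(reference):
        raise ValueError("estimated and reference must have the same length")
    # Assumption: the geometric distance is taken as an absolute difference.
    distances = [abs(e - r) for e, r in zip(estimated, reference)]
    d_o = math.sqrt(sum(d * d for d in distances))  # Euclidean distance (10)
    d_m = max(distances)                            # maximum distance (11)
    return d_o, d_m, d_o + d_m                      # fitness error R(H) (12)


# Hypothetical density values at four sample points.
estimated = [0.10, 0.20, 0.40, 0.30]
reference = [0.10, 0.25, 0.35, 0.30]
d_o, d_m, r = fitness_error(estimated, reference)
```

A bandwidth search would call such a function once per candidate *H* and keep the candidate with the smallest *R*(*H*).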

#### *2.3. Improved Adaptive Strategy Based on the Optimal Bandwidth Adjustment Model*

According to Formula (12), previous MNKDE theory uses a fixed bandwidth: only one *H* is obtained, which minimizes the fitness error over all the samples. However, the fitness error may then be abnormally large in some local sample intervals. If *H* is modified according to the sample data to obtain an adaptive bandwidth matrix for each such local interval, and the original fixed bandwidth matrix is replaced with it, the constructed JPDF adapts to the local sample intervals and the modeling accuracy is further improved. Based on this analysis, the following improved strategy is applied to the MNKDE in this paper.

After the bandwidth matrix *H* is solved by the optimal model of bandwidth (12), the fit of each sample interval is checked. For any local sample interval *l* ∈ [*l*<sub>1</sub>, *l*<sub>2</sub>] (*l*<sub>2</sub> > *l*<sub>1</sub> and *l*<sub>1</sub>, *l*<sub>2</sub> ∈ [*X*<sub>1</sub>, *X<sub>n</sub>*]), a local adaptability problem is considered to exist in the interval if the following inequality holds:

$$d\_{Jl}(\mathbf{H}\_{\rm Best}) \ge \lambda \overline{d\_J(\mathbf{H}\_{\rm Best})},\tag{13}$$

where *l* denotes any sample interval, *d<sub>Jl</sub>*(*H<sub>Best</sub>*) is the geometric distance in *l*, *H<sub>Best</sub>* is the solution of Formula (12), *d̄<sub>J</sub>*(*H<sub>Best</sub>*) is the average geometric distance over the entire sample space, and *λ* is an adjustment factor. If *λ* is smaller, the screening is stricter and more intervals need to be adjusted; the modeling accuracy improves but the modeling becomes more complex. Conversely, if *λ* is larger, the complexity of the solution may be reduced but the modeling accuracy declines. The specific value can be determined by testing.

The average geometric distance *d̄<sub>J</sub>*(*H<sub>Best</sub>*) is as follows:

$$\overline{d\_J(\mathbf{H}\_{\rm Best})} = \frac{1}{n} \sum\_{i=1}^{n} d\_{Ji}(\mathbf{H}\_{\rm Best}).\tag{14}$$
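The screening rule of Formulas (13) and (14) can be sketched as follows. The text does not state how the distance "in *l*" is aggregated over the samples of an interval, so the sketch assumes the interval mean; the function name, the interval labels, and the data are likewise hypothetical.

```python
from statistics import mean
from typing import Dict, List, Sequence


def flag_intervals(distances: Sequence[float],
                   interval_of: Sequence[int],
                   lam: float) -> List[int]:
    """Return the ids of intervals that satisfy the inequality of Formula (13)."""
    overall_mean = mean(distances)  # average geometric distance, Formula (14)
    by_interval: Dict[int, List[float]] = {}
    for d, k in zip(distances, interval_of):
        by_interval.setdefault(k, []).append(d)
    # Assumption: the distance in an interval is summarized by its mean.
    return sorted(k for k, ds in by_interval.items()
                  if mean(ds) >= lam * overall_mean)


# Hypothetical distances for six samples grouped into three intervals.
distances = [0.01, 0.02, 0.01, 0.02, 0.30, 0.24]
interval_of = [0, 0, 1, 1, 2, 2]
flagged = flag_intervals(distances, interval_of, lam=2.0)
```

With these toy numbers only the third interval is flagged for *λ* = 2, and raising *λ* to 6 flags none, which illustrates how a larger adjustment factor relaxes the screening.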

For each interval with a local adaptability problem, a bandwidth adjustment model was built to modify the bandwidth matrix:

$$\mathbf{H}\_{l} = \frac{n\_{l}\, d\_{Jl}(\mathbf{H}\_{\rm Best})\_{\rm mid}}{\sqrt{-2\ln\delta}}\mathbf{H}\_{\rm Best},\tag{15}$$

where *H<sub>l</sub>* is the modified bandwidth in *l*, *n<sub>l</sub>* is the number of samples in *l*, *d<sub>Jl</sub>*(*H<sub>Best</sub>*)<sub>*mid*</sub> is the median of the geometric distance in *l*, and *δ* is the threshold of the kernel function.
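Formula (15) rescales the fixed bandwidths by a factor that depends only on the flagged interval. A minimal sketch is given below, assuming a diagonal bandwidth matrix stored as a list of per-dimension bandwidths; the function name and the input values are hypothetical.

```python
import math
from statistics import median
from typing import List, Sequence


def adjusted_bandwidths(best_bandwidths: Sequence[float],
                        interval_distances: Sequence[float],
                        delta: float = 0.79655) -> List[float]:
    """Scale H_Best for one flagged interval following Formula (15)."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1) so that -2 ln(delta) > 0")
    n_l = len(interval_distances)
    scale = (n_l * median(interval_distances)
             / math.sqrt(-2.0 * math.log(delta)))
    return [scale * h for h in best_bandwidths]


# Hypothetical fixed bandwidths and geometric distances in one interval.
best = [47.0, 47.0, 47.0]
adjusted = adjusted_bandwidths(best, [0.1, 0.2, 0.3])
```

For *δ* = 0.79655 the denominator evaluates to about 0.6745, the constant commonly used to convert a median absolute deviation into a standard deviation for normally distributed data.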

Thus, Formula (8) can be modified into Formula (16), which is an AMNKDE model for the JPDF modeling of multiple wind farms:

$$\begin{aligned} \hat{f}\_m(\mathbf{x}) &= \frac{\frac{1}{l\_1} \sum\_{i=1}^{l\_1} \frac{\omega\_i}{\prod \mathbf{H}\_{\rm Best}} \frac{1}{(\sqrt{2\pi})^m} e^{-\frac{1}{2}Y\_{\mathbf{H}\_{\rm Best}}(\mathbf{x})}}{\sum\_i \omega\_i} \\ &+ \frac{\frac{1}{l\_2} \sum\_{i=l\_1}^{l\_2} \frac{\omega\_i}{\prod \mathbf{H}\_{l\_1}} \frac{1}{(\sqrt{2\pi})^m} e^{-\frac{1}{2}Y\_{\mathbf{H}\_{l\_1}}(\mathbf{x})}}{\sum\_i \omega\_i} \\ &+ \cdots + \frac{\frac{1}{l\_k} \sum\_{i=l\_{k-1}}^{l\_k} \frac{\omega\_i}{\prod \mathbf{H}\_{l\_{k-1}}} \frac{1}{(\sqrt{2\pi})^m} e^{-\frac{1}{2}Y\_{\mathbf{H}\_{l\_{k-1}}}(\mathbf{x})}}{\sum\_i \omega\_i} \\ &+ \frac{\frac{1}{n} \sum\_{i=l\_k}^{n} \frac{\omega\_i}{\prod \mathbf{H}\_{l\_k}} \frac{1}{(\sqrt{2\pi})^m} e^{-\frac{1}{2}Y\_{\mathbf{H}\_{l\_k}}(\mathbf{x})}}{\sum\_i \omega\_i}, \end{aligned} \tag{16}$$

where *k* is the number of sample intervals that need to be adjusted, *H*<sub>*l<sub>k</sub>*</sub> is the modified bandwidth matrix in *l<sub>k</sub>*, ∏*H* denotes the product of the bandwidths in *H*, *Y*<sub>*H*</sub>(*x*) is Formula (9) evaluated with those bandwidths, and *ω<sub>i</sub>* is the measurement weight. In this paper, the following formula is used for *ω<sub>i</sub>* [42]:

$$
\omega\_i = \alpha + \exp\left(-\frac{{\bf s}\_i^2}{\overline{\bf s}^2}\right),
\tag{17}
$$

where *α* is a small positive number, *s<sub>i</sub>* is the standard deviation of the measurements in each sampling interval, and *s̄*<sup>2</sup> = (1/*n*) ∑<sub>*i*=1</sub><sup>*n*</sup> *s<sub>i</sub>*<sup>2</sup> is the mean of the squared standard deviations over all measurements.
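The weight of Formula (17) decreases as the local standard deviation grows, so noisier intervals contribute less. The sketch below assumes that the normalizing term is the mean of the squared standard deviations; the function name, the value of *α*, and the inputs are hypothetical.

```python
import math
from typing import List, Sequence


def measurement_weights(std_devs: Sequence[float],
                        alpha: float = 1e-3) -> List[float]:
    """Compute the weights of Formula (17) from per-interval standard deviations."""
    # Assumption: the normalizer is the mean of the squared standard deviations.
    mean_square = sum(s * s for s in std_devs) / len(std_devs)
    return [alpha + math.exp(-(s * s) / mean_square) for s in std_devs]


# Hypothetical standard deviations for three sampling intervals.
weights = measurement_weights([1.0, 2.0, 3.0])
```

The constant *α* keeps every weight strictly positive, so even the noisiest interval still contributes to the estimate.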

#### **3. Solution of the Optimal Model of Bandwidth Based on Ordinal Optimization**

For the AMNKDE proposed in this paper, the bandwidth changes from a traditional single parameter to a matrix, which makes the solution more difficult. To address this, an approach for solving the optimal model of bandwidth based on OO is proposed.

OO is an effective method for solving complex optimization problems. In Reference [30], it was successfully applied to solve the optimal model of bandwidth of univariate NKDE and achieved positive results. In this research, OO is used to solve the bandwidth optimization problem of AMNKDE. The solution procedure is shown in Figure 1, and the detailed steps are as follows:

**Figure 1.** Flowchart of the solution.

(1) In the solution space of the bandwidth matrix *H*, *N* bandwidth matrices were sampled according to the uniform distribution to form a characterization set Ω. The value of *N* is closely related to the size of the solution space. When the size of the solution space is less than 10<sup>8</sup>, *N* = 1000 is recommended by Reference [46].

(2) The *N* feasible solutions were evaluated with the rough model of Formula (10) and sorted according to the assessment results. The ordered performance curve (OPC) was then constructed. The types of OPC are given in Reference [30].

(3) Formula (18) was used to determine the number of solutions in the selected set *S*,

$$S = e^{\varepsilon} t^{\mu} g^{\rho} + \eta, \tag{18}$$

where *S* is the number of solutions in the selected set, *t* is the minimum number of good enough solutions required in the selected set, *g* is the size of the good enough solution subset, and *ε*, *μ*, *ρ*, and *η* are parameters associated with the type of OPC, whose values are 8.1378, 0.8974, 1.2058, and 6.00, respectively [29].

(4) Taking the objective function of Formula (12) as the exact model, the solutions in the selected set *S* were evaluated and ranked, and the top *t* solutions were selected as the real good enough solutions.

(5) Utilizing Formula (13), the local sample intervals with low accuracy in the model were found. The bandwidths in these intervals were adjusted according to Formula (15).
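Steps (1) to (4) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the toy rough and exact models, the bounds, and the values of *t* and *g* are hypothetical. The exponent on *g* is entered as a negative number here, which is an assumption made so that the selected set shrinks as the good enough subset grows; with a positive exponent the computed size would exceed *N* = 1000 for the values used below.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Candidate = Tuple[float, ...]


def selected_set_size(t: int, g: int, eps: float = 8.1378, mu: float = 0.8974,
                      rho: float = -1.2058, eta: float = 6.0) -> int:
    """Size of the selected set from Formula (18), rounded up."""
    return math.ceil(math.exp(eps) * t ** mu * g ** rho + eta)


def ordinal_optimization(rough: Callable[[Candidate], float],
                         exact: Callable[[Candidate], float],
                         bounds: Sequence[Tuple[float, float]],
                         n_candidates: int = 1000, t: int = 1, g: int = 50,
                         seed: int = 0) -> List[Candidate]:
    """Return the top-t candidates after rough screening and exact re-ranking."""
    rng = random.Random(seed)
    # Step (1): draw N candidates uniformly from the solution space.
    candidates = [tuple(rng.uniform(lo, hi) for lo, hi in bounds)
                  for _ in range(n_candidates)]
    # Step (2): rank all candidates with the cheap rough model.
    candidates.sort(key=rough)
    # Step (3): keep the first S candidates as the selected set.
    s = min(selected_set_size(t, g), n_candidates)
    # Step (4): re-rank the selected set with the exact model.
    return sorted(candidates[:s], key=exact)[:t]


# Toy error surface whose minimum lies at h = (40, 40, 40) (hypothetical).
def exact_model(h: Candidate) -> float:
    return sum((v - 40.0) ** 2 for v in h)


def rough_model(h: Candidate) -> float:
    return abs(h[0] - 40.0)  # deliberately crude: looks at one dimension only


best = ordinal_optimization(rough_model, exact_model, bounds=[(1.0, 100.0)] * 3)
```

The exact model is evaluated only on the selected set rather than on all *N* candidates, which is where the computational saving of OO comes from. Step (5), the adaptive adjustment of flagged intervals, would follow on the returned bandwidths.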

#### **4. Scenario Study**

In this paper, 4773 sampling sequences of wind power outputs from six wind farms in the Hubei province of China were selected as examples. The sampling time interval was 10 min. The sampling period was from 19:40 on 17 March 2009 to 23:00 on 19 April 2009. For the frequency histogram of two wind farms, the bin width chosen for this paper was 30 kW. For the frequency histogram of three wind farms, the bin width was 100 kW. For the two wind farms, the total probability density of the samples was 1.1 × 10<sup>−3</sup>. For the three wind farms, the total probability density of the samples was 1 × 10<sup>−6</sup>. When *λ* = 6, the comprehensive performance of the proposed model was best. The model could improve the overall modeling accuracy by approximately 10% compared with the traditional MNKDE model and the corresponding calculation time was only 63 s. Accordingly, *λ* = 6 was chosen for this research. According to Reference [47], *δ* was 0.79655.
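One plausible way to turn samples into a frequency histogram with a fixed bin width, normalized so that it can be compared with a density estimate, is sketched below. The function name and the sample values are hypothetical, and the normalization (count divided by the number of samples times the bin volume) is an assumption about how the histogram is scaled.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def histogram_density(samples: Sequence[Sequence[float]],
                      bin_width: float) -> Dict[Tuple[int, ...], float]:
    """Map each occupied bin index to count / (n * bin volume)."""
    counts = Counter(tuple(int(v // bin_width) for v in s) for s in samples)
    volume = bin_width ** len(samples[0])
    n = len(samples)
    return {idx: c / (n * volume) for idx, c in counts.items()}


# Hypothetical outputs (kW) of two wind farms at four sampling points.
samples = [(10.0, 20.0), (25.0, 5.0), (40.0, 70.0), (100.0, 95.0)]
density = histogram_density(samples, bin_width=30.0)
```

Under this normalization the density values of a two-dimensional histogram with 30 kW bins sum to 1/900 ≈ 1.1 × 10<sup>−3</sup>, and those of a three-dimensional histogram with 100 kW bins sum to 10<sup>−6</sup>, which agrees with the totals quoted above.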

The simulation was implemented in MATLAB and the related computing was completed on a computer with an Intel Core i5-4460 (3.20 GHz) CPU and 8 GB of RAM. The computation time of the OO in this paper was 63.350 s. To verify the validity and applicability of the proposed approach, three-dimensional and four-dimensional JPDFs were obtained from two wind farms and three wind farms, respectively, for comparison and analysis. The active power output sampling sequences of the six wind farms are shown in Figure 2.

**Figure 2.** Historical data of six wind farms.

Figure 2 shows the differences between the output trends of the former three wind farms and those of the latter three wind farms. The differences are most obvious for the sampling points between 1300 and 2300. We concluded that the JPDFs of the former three wind farms and the latter three wind farms are different.

#### *4.1. Joint Probability Density Function Modeling of Two Wind Farms*

The JPDF of Wind Farms 1 and 2 was obtained via the approach described previously and is shown in Figure 3a. The frequency histogram of Wind Farms 1 and 2, based on the sample data, is shown in Figure 3b. Figure 3 allows the two to be compared.

(a) Joint probability density function.

(b) Frequency Histogram.

**Figure 3.** Joint probability density function and frequency histogram of Wind Farms 1 and 2.

From Figure 3, we found that the outputs of Wind Farms 1 and 2 had a tail correlation. The correlation in the upper tails was stronger, which meant that both wind farms were more likely to produce larger outputs. From the function curve of the model, the modeled JPDF fit well with the real joint distribution of Wind Farms 1 and 2. The detailed calculation results are listed in Table 1.



From the results of Table 1, the modeling error was relatively low and the overall fitness error was only 6.50 × 10<sup>−5</sup>.

#### *4.2. Joint Probability Density Function Modeling of Three Wind Farms*

To guarantee the generalizability of the results, the four-dimensional JPDF of three wind farms is presented. Based on this scenario, a comparative study was carried out. Models 1, 2, and 3 are the traditional MNKDE model, the AMNKDE model and the comprehensive Copula model, respectively.

#### 4.2.1. Validity Analysis of the Improved Adaptive Strategy for MNKDE

To verify the differences between the proposed AMNKDE and the traditional MNKDE, the JPDFs of Wind Farms 1, 2, and 3 were constructed with both methods. The results are given in Table 2.


**Table 2.** Comparison of AMNKDE and MNKDE.

As shown in Table 2, compared with Model 1, the Euclidean distance, the maximum distance, and the overall fitness error of Model 2 were reduced by 6.76%, 55.5%, and 8.81%, respectively. This suggests that the modeling accuracy of MNKDE was effectively improved by the new adaptive strategy. The proposed AMNKDE adapted the bandwidths in the sample intervals [0,100] × [0,100] × [0,100] and [800,1000] × [800,1000] × [0,100]. The elements of the bandwidth matrices in these two intervals were changed from 47 to 32 and 33, respectively. The elements for the remaining intervals remained at 40.7, and the resulting set of matrices formed the adaptive bandwidth matrix. The resulting decline of the Euclidean distance for the two adjusted sample intervals was 16.7% and 55.6%, respectively. This improvement resulted in a rise of 13.1% in the Euclidean distance of the other sample intervals, but, for the entire sample space, the overall Euclidean distance and the maximum distance were evidently reduced and the overall fitness error was cut by 8.81%. In summary, the overall modeling accuracy of the MNKDE was effectively improved by applying the adaptive bandwidth strategy to the sample intervals with the local adaptability problem.

#### 4.2.2. Accuracy Comparison between AMNKDE and Copula Parameter Estimation

To verify the accuracy of the proposed AMNKDE approach, the JPDF of Wind Farms 1, 2, and 3 was also established using the comprehensive Copula method from Reference [29]. The comparison results are shown in Table 3. The optimal Copula function was composed of Gumbel Copula, Clayton Copula, and Frank Copula.

**Table 3.** Accuracy comparison between AMNKDE and Copula parameter estimation.


As shown in Table 3, compared with Model 3, which is based on the comprehensive Copula method, the Euclidean distance, maximum distance, and overall fitness error of Model 2, which is based on the proposed AMNKDE approach, were reduced by 7.8%, 57.9%, and 11.6%, respectively. The proposed AMNKDE approach therefore has higher modeling accuracy than the comprehensive Copula method. The reason is that the proposed AMNKDE approach models the JPDF directly from the sample data. Accordingly, it does not need to choose the specific form of the JPDF in advance, and the modeling accuracy depends only on the selection of bandwidth rather than on the prior definition of the JPDF form.

#### 4.2.3. Comparison of Applicability between AMNKDE and Copula Parameter Estimation

To verify the applicability of the proposed AMNKDE approach, the wind farms were changed for the comparison. The JPDFs of Wind Farms 4, 5, and 6 were established using the AMNKDE and the comprehensive Copula method from Reference [29]. The optimal Copula function still consisted of Gumbel Copula, Clayton Copula and Frank Copula. The detailed results are presented in Table 4.


**Table 4.** Comparison of Applicability between AMNKDE and Copula Parameter Estimation.

From Table 4, the proposed AMNKDE approach still maintained high modeling accuracy for the different wind farms. Compared with Table 3, the overall fitness error of the AMNKDE increased by 5.67%. In contrast, the overall fitness error of the comprehensive Copula method increased by 10.99%, which is 1.94 times the increase of the AMNKDE. It can be concluded that the proposed AMNKDE approach has higher applicability than the Copula PE method when the modeling object is changed. The reason is that the latter method needs to judge the form of the JPDF, and the joint probability distributions of different wind farms may follow different function forms. Consequently, a large error may result if the same function form is used to model different wind farms.

#### 4.2.4. Comparison of Algorithms

To analyze the validity of the OO algorithm in this paper, its calculation efficiency was compared with that of other algorithms. The genetic algorithm (GA), particle swarm optimization (PSO), and OO algorithms were used to solve the optimal model of bandwidth. The optimal bandwidth matrices obtained by these three algorithms, *H<sub>Best</sub>*, were [48.7,48.7,48.7], [55,55,55], and [52,52,52], respectively. The results of the fitness error *R*(*H*) and the computation time are shown in Figure 4.

**Figure 4.** Comparison of Accuracy and Operation Time between Different Algorithms.

As shown in Figure 4, compared with the traditional GA and the PSO algorithm, the proposed OO algorithm offered only a limited improvement in computational accuracy but a significant advantage in computational efficiency. It can be concluded that the proposed OO algorithm effectively guarantees both computational efficiency and accuracy.
