1. Introduction
In electronic and electrical engineering, nonlinear loads are widely used, which makes the harmonic pollution of three-phase symmetrical power systems increasingly serious. Nonlinear devices, such as generators, inverters, and rectifiers, inject harmonic currents into the power grid to which they are connected, resulting in voltage distortion at the point of common coupling (PCC), which in turn affects the current at this point. Moreover, the interactions among multiple harmonic loads are complicated, making it difficult to locate the sources of distortion. Therefore, to address the issue of harmonic pollution [1], accurately identifying the “problematic” harmonic sources and determining the responsibility of each harmonic source are two extremely important tasks, the latter of which is the focus of this paper.
Existing methods of harmonic responsibility division mainly include the volatility method [2,3,4], the covariance method [5,6,7], the blind source separation method [8,9,10,11], and the linear regression method [12,13,14,15,16,17,18,19,20]. The first three methods have shown higher accuracy in estimating the harmonic impedance, but they all require accurate harmonic waveform sampling data to obtain the harmonic amplitude and phase angle, so that the real and imaginary parts can be separated and calculated. The linear regression method is vulnerable to fluctuations of the background harmonic. In addition, the regression formulas derived from circuit principles in different works may require different data, and some of them do not require phase angle data [19,20]. The volatility method and the covariance method were both proposed for the responsibility assessment of a single harmonic source, while the blind source separation method and the linear regression method can be used for both a single harmonic source and multiple harmonic sources.
At present, responsibility division for multi-harmonic sources has been widely studied, and the proposed methods are generally based on accurately estimating the harmonic impedance [21]. The blind source separation method has been widely studied in recent years, with independent component analysis (ICA) and its improved variants [8,9,10,11] being the most representative. When the harmonic sources meet the preconditions of statistical independence and non-Gaussianity, individual harmonic current signals can be separated one by one from their mixed signal by merely measuring the harmonic voltage, without knowing the network harmonic impedance. The least squares method and its improved variants [12,13,14,15] are commonly used to obtain the harmonic impedance by calculating the coefficients of the Norton equivalent circuit of each harmonic source [16].
However, in real situations where the assessment of harmonic impedance is affected by highly autocorrelated harmonic monitoring data, the problem may become ill-conditioned, and the abovementioned methods could fail. The problem is very likely to be ill-conditioned when multiple harmonic sources exist, because a large amount of data is required for such systems. The partial least-squares method is an improvement of the least squares method that overcomes the disadvantage of variable dependence in system models with multiple harmonic sources [17,19]. The M-estimation robust regression method [20] can reduce the influence of singular values to a certain extent and realize the division without phase angle information, but it is mainly used for responsibility division of a single harmonic source. When it is used to divide the responsibility of multi-harmonic sources, data segments have to be selected in which the harmonic source to be analyzed fluctuates greatly while the fluctuations of the other harmonic sources are less than 5~10%. In practical cases, the limited number of available data segments makes it difficult for the result to reflect the overall harmonic responsibility over a long time scale.
The relevant works on harmonic responsibility division in the past five years are summarized in Table 1, from which it can be seen that most methods of responsibility division of multi-harmonic sources require harmonic phase angle data. However, due to limitations on communication channels and storage capacity, harmonic phase angle information is usually not stored in existing regional power quality online monitoring systems. Consequently, to apply those methods, harmonic measuring instruments have to be used to conduct special on-site tests to obtain short-term measurement data. However, the obtained data are not compatible with the monitoring system, and the corresponding results can hardly reflect the overall situation of long-term harmonic responsibility.
In general, the linear regression method can find highly accurate solutions when the utility harmonic voltage is constant; however, the utility harmonic voltage usually varies in real systems [22]. When the utility harmonic voltage fluctuates considerably, linear regression methods for harmonic impedance assessment become less accurate [23,24]. Data analysis methods can weaken the impact of background harmonic fluctuations on the accuracy but may reduce the amount of available data. Taking the linear regression method as an example, we discuss three typical data analysis methods.
The first solution is to classify the background harmonic voltage into different categories and then perform linear regression for each category. To classify the background harmonic voltage, Zang et al. [25] divided the harmonic data points into segments with constant utility harmonic voltage, which can be determined by hierarchical K-means clustering. Zhang and coworkers [26] considered the effects of background harmonic fluctuations and proposed a data selection method based on the improved Hampel weight function. The basic idea of this method is to estimate the utility background harmonic voltage by solving the phasor equation, which avoids the use of iterative methods. However, in real systems where the measurement of phase data is not mandatory, the background harmonic voltage cannot be estimated if the phase information is unknown. Moreover, the accuracy of the estimated utility background harmonic voltage relies on the accuracy of the utility harmonic impedance; the latter is usually estimated with low accuracy and is difficult to calculate when the system harmonic impedance changes [27].
The second solution is to perform linear regression first and then classify the background harmonic voltage using iterative methods. Because iterative methods are applied in this solution, the computational complexity (in both space and time) increases significantly [28]. The drawback of these two solutions is that both require harmonic phase data, which are usually not directly available in power quality monitoring systems [29].
To avoid this limitation, a third solution based on data correlation has been proposed, which assesses the harmonic responsibilities by comparing the similarity between the harmonic voltage and the harmonic current and then selecting the strongly correlated data for linear regression [30,31]. However, these methods cannot reflect the degree of correlation for combinations of multiple data sets and are used only for estimating the harmonic emission capability of the harmonic sources on both sides of the PCC. For cases where multiple harmonic sources are connected to the PCC, how to analyze the correlations between the harmonic currents and the harmonic voltage remains unsolved.
In summary, the above-mentioned division methods for multi-harmonic sources require a large amount of calculation and cumbersome steps, so the responsibility division for multi-harmonic sources needs further improvement.
The main contributions of this paper can be highlighted as follows:
- (1) A harmonic responsibility division method based on canonical correlation analysis is proposed. Compared with existing methods, it does not require harmonic phase angle data, yet it achieves higher division accuracy even when the background harmonic fluctuates considerably.
- (2) One-to-many correlation analysis between the harmonic voltage at the PCC and the data of multi-harmonic sources is realized with the canonical correlation analysis method, so that appropriate data can be selected even under large fluctuations of the background harmonic.
- (3) Using only the data available from the online power quality monitoring system, without additional harmonic measuring instruments, harmonic responsibility division on a longer time scale is realized, with a focus on the long-term stable operation of power grids.
2. Preliminaries on the Division Method of Multi-Harmonic Sources Responsibility
2.1. Projection Coefficient Calculation
Figure 1 shows the Norton equivalent circuit with multi-harmonic sources, where $\dot{U}_{PCC}$ and $\dot{I}_{PCC}$ are the harmonic voltage and harmonic current, respectively, at the PCC. On the system side, the circuit is described by the equivalent harmonic current of the harmonic source, the equivalent harmonic impedance, and the branch harmonic current; the corresponding quantities of the kth user-side harmonic source (k = 1, 2, …, n) are defined in the same way, with $\dot{I}_k$ denoting its branch harmonic current [18].
As proved in [32], it is feasible to use the actual measured branch current $\dot{I}_k$ instead of the theoretical current of each source to carry out the responsibility assessment of centralized multi-harmonic sources, and most current studies do so by default. According to the superposition theorem, $\dot{U}_{PCC}$ can be calculated as

$$\dot{U}_{PCC} = \sum_{k=1}^{n} \dot{U}_k + \dot{U}_0 = \sum_{k=1}^{n} Z_k \dot{I}_k + \dot{U}_0 \tag{1}$$

where n is the number of major harmonic sources connected to the PCC; $\dot{U}_k$ is the harmonic voltage at the PCC generated by the harmonic source k; $Z_k$ is the equivalent harmonic impedance excluding the branch k; $\dot{U}_0$ is the background harmonic voltage contributed by the other components at the PCC, including the system harmonic voltage and the user-side non-major harmonic sources. Assuming three main harmonic sources (n = 3), the relationship between the harmonic source voltage for each user and the harmonic voltage vector at the PCC is shown in Figure 2.
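According to the definition, the harmonic responsibility of source k is the ratio of the projection of $\dot{U}_k$ on $\dot{U}_{PCC}$ to the amplitude of $\dot{U}_{PCC}$, which is expressed as

$$\mu_k = \frac{|\dot{U}_k| \cos\theta_k}{|\dot{U}_{PCC}|} = \frac{h_k I_k}{U_{PCC}} \tag{2}$$

where

$$h_k = |Z_k| \cos\theta_k \tag{3}$$

with $\theta_k$ the phase angle between $\dot{U}_k$ and $\dot{U}_{PCC}$ (k = 1, 2, …, n). $U_{PCC}$ and $I_k$ are the moduli measured at the PCC. $h_k$ (k = 1, 2, …, n) is the projection coefficient, calculated by the linear regression method.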
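Figure 2 shows the relationship between $\dot{U}_{PCC}$ and the projection of $\dot{U}_k$ on $\dot{U}_{PCC}$. Based on Equation (3), Equation (1) can be expressed as

$$U_{PCC} = \sum_{k=1}^{n} h_k I_k + U_0 \cos\alpha \tag{4}$$

where α is the phase angle between $\dot{U}_0$ and $\dot{U}_{PCC}$.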
Since all the terms in Equation (4) are scalars, hk can be calculated from linear regression, which avoids solving vector equations. Note that the solution of Equation (4) will have high accuracy if U0cos α and hk are both constant. However, U0cos α varies in real systems, which may cause large errors when using the linear regression method.
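The scalar regression described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `projection_coefficients`, the synthetic data, and the use of an ordinary least-squares fit with an intercept (standing in for a constant U0cos α) are all hypothetical choices.

```python
import numpy as np

def projection_coefficients(u_pcc, currents):
    """Least-squares fit of U_PCC = sum_k h_k * I_k + c, as in Equation (4).

    u_pcc:    shape (m,),   harmonic voltage amplitudes at the PCC
    currents: shape (m, n), harmonic current amplitudes of the n sources
    Returns (h, c), where c estimates the term U0*cos(alpha), assumed constant.
    """
    u_pcc = np.asarray(u_pcc, dtype=float)
    currents = np.asarray(currents, dtype=float)
    # Append a column of ones so the intercept absorbs the background term.
    design = np.column_stack([currents, np.ones(len(u_pcc))])
    coef, *_ = np.linalg.lstsq(design, u_pcc, rcond=None)
    return coef[:-1], coef[-1]

# Synthetic check (hypothetical values): three sources, constant background.
rng = np.random.default_rng(0)
currents = rng.uniform(5.0, 15.0, size=(200, 3))
h_true = np.array([0.8, 0.5, 0.3])
u_pcc = currents @ h_true + 2.0
h_est, c_est = projection_coefficients(u_pcc, currents)
```

With a truly constant background term the fit recovers the coefficients essentially exactly; once that term fluctuates, its variation leaks into the estimates, which is the motivation for the data selection discussed next.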
2.2. Principle of Data Selection
As shown in Figure 1, where the system side and the user side are considered as two different parts, the harmonic voltage equation at the PCC is expressed as
Since
, and
is the integrated equivalent harmonic impedance of the user side. Then, we have:
Based on Equations (5) and (6), the background harmonic voltage of the system side can be expressed as
The background harmonic voltage phasor can be calculated from the vector Equation (7), and then the data for linear regression are selected via cluster analysis [21].
The cluster analysis methods for data selection have two major drawbacks: (i) vector calculation is required, and (ii) these methods only select the data with small background harmonic voltage fluctuations from the system side. When the user side contains unknown harmonic sources that fluctuate, these methods consider them to be constant, resulting in errors that are unavoidable.
The data with small background harmonic voltage fluctuations can be directly selected using the data correlation methods; the linear equation is then solved according to the selected data, and the projection coefficient is calculated. These methods require no estimation of the system side harmonic voltage, and they also minimize the influence of the unknown harmonic sources on the user side.
Although the correlation between different sequences has been widely studied in the existing data correlation methods, in this paper we propose a new data selection method based on canonical correlation analysis (CCA). Equation (4) shows that when U0cos α is a constant, UPCC is linearly related to Ik (k = 1, 2, …, n). This linear relationship between UPCC and Ik (k = 1, 2, …, n) is affected by the term U0cos α if it fluctuates. In other words, as the fluctuation of U0cos α increases, the linearity between UPCC and the linear combination of Ik (k = 1, 2, …, n) decreases, and vice versa. The data can thereby be selected according to the degree of linearity between UPCC and the linear combination of Ik (k = 1, 2, …, n): the higher the degree of linearity, the smaller the fluctuation of U0cos α. Since traditional correlation analysis methods, such as the Pearson correlation coefficient evaluation method, cannot be applied to cases with two groups of multiple random variables, in this paper we use the CCA method to analyze the correlations between the harmonic voltage and the harmonic currents for multiple harmonic sources.
3. Multi-Harmonic Source Responsibility Division Method
3.1. The Method of CCA Data Selection
The analysis in the previous section indicates that the background voltage can be considered constant when the harmonic voltage and the harmonic currents are highly correlated, and thus a multivariate correlation analysis is needed to analyze the relationship between the harmonic voltage and the harmonic currents. The CCA is a method used to study the correlation among multiple sets of data. The two sets of variables studied are X = (x1, x2, …, xn) and Y = (y1, y2, …, yn), where xi and yi are both m-dimensional vectors, and the two groups of variables are standardized. X′ and Y′ are defined as comprehensive variables, both of which are m-dimensional vectors; these two comprehensive variables replace the original variables according to the optimization method. More details about the optimization method can be found in [33].
The correlation between the input parameters X and Y can be found by calculating the Pearson correlation coefficient of the comprehensive variables, as shown in Equation (8):

$$r = \frac{\mathrm{cov}(X', Y')}{\sqrt{D(X')}\,\sqrt{D(Y')}} \tag{8}$$

where X′ and Y′ are the comprehensive variables, with X′ = X and Y′ the optimal linear combination of the variables in Y, chosen so that the correlation between X′ and Y′ is strongest; cov(X′, Y′) represents the covariance between X′ and Y′; D(X′) and D(Y′) are the variances of X′ and Y′, respectively.
If the correlation between X and Y satisfies the criterion in Equation (9), they are considered strongly correlated [34].
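For the one-to-many case used here (X′ = X is the PCC harmonic voltage, Y′ is a linear combination of the source currents), the canonical correlation can be sketched in a few lines. The sketch below relies on the standard fact that, when one group contains a single variable, the combination of Y maximizing the correlation is the least-squares fit of X on Y. The function name `canonical_correlation` and the synthetic data are hypothetical, and the code assumes that no variable is constant, since standardization would otherwise divide by zero.

```python
import numpy as np

def canonical_correlation(x, y):
    """Correlation between x and the best linear combination of the columns of y.

    x: shape (m,),   e.g. harmonic voltage amplitudes at the PCC
    y: shape (m, n), e.g. harmonic current amplitudes of n sources
    Returns (r, w): the correlation coefficient and the combination weights.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize both groups (assumes non-constant data).
    xs = (x - x.mean()) / x.std()
    ys = (y - y.mean(axis=0)) / y.std(axis=0)
    # With a single variable in X, the optimal Y' is the least-squares fit.
    w, *_ = np.linalg.lstsq(ys, xs, rcond=None)
    y_comp = ys @ w  # comprehensive variable Y'
    r = np.corrcoef(xs, y_comp)[0, 1]
    return r, w

# Synthetic comparison (hypothetical values): same currents, two background levels.
rng = np.random.default_rng(1)
currents = rng.normal(10.0, 1.0, size=(2000, 2))
base = currents @ np.array([0.8, 0.5]) + 2.0
u_small = base + rng.normal(0.0, np.sqrt(0.1), size=2000)  # background variance 0.1
u_large = base + rng.normal(0.0, np.sqrt(0.5), size=2000)  # background variance 0.5
r_small, _ = canonical_correlation(u_small, currents)
r_large, _ = canonical_correlation(u_large, currents)
```

In this synthetic setting the coefficient obtained with the larger background variance is lower than the one obtained with the smaller variance, which is the behavior the data selection relies on.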
In order to illustrate the effect of the CCA on data selection, we set two harmonic sources on the user side and let the harmonic source on the system side fluctuate. Figure 3 is a schematic diagram of the data selection results, where UPCC is the harmonic voltage at the PCC, and I1 and I2 represent the harmonic currents of harmonic sources 1 and 2, respectively. For the plots on the left of Figure 3, the data variance is 0.1, and the background harmonic voltage fluctuations are considered small, while the plots on the right show relatively large background harmonic voltage fluctuations with a variance of 0.5. When the background harmonic voltage fluctuations are small, there is no obvious similarity between the individual harmonic currents of the two harmonic sources and the harmonic voltage at the PCC, but the comprehensive variable has a strong similarity with the harmonic voltage. In the plots on the right of Figure 3, there is no obvious correlation between the comprehensive variable and the harmonic voltage at the PCC, which agrees with the theoretical analysis in Section 2 that as the background harmonic voltage fluctuations increase, the correlation between the comprehensive variable and the PCC harmonic voltage decreases. Based on the CCA, the data with small background harmonic voltage fluctuations can be screened for subsequent analysis and calculations.
3.2. Harmonic Data Sliding Window Analysis
Since it is almost impossible to directly obtain the correlation of the data at a single moment, the data need to be segmented first. The CCA method proposed in this paper is based on sliding window analysis. Assume that the length of each data set is m, and one sliding window includes t sets of harmonic voltage and current amplitude data.
The CCA is performed for each data segment, and rj denotes the correlation coefficient of the jth data segment. When rj ≥ rth (where rth is the correlation coefficient threshold), the background harmonic voltage fluctuations of the jth data segment are considered small; hence, each harmonic current and the harmonic voltage at the PCC in the jth segment can be used to calculate the harmonic responsibility coefficient. When rj < rth, the jth data segment is unsuitable and is discarded.
The CCA method based on sliding window analyzes long-term data in stages by intercepting data segments. This method analyzes the correlation between the harmonic voltage and the harmonic currents in each data segment. On this basis, the data segments with smaller background harmonic voltage fluctuations are selected, which can be used to calculate the projection coefficient by linear regression; the harmonic responsibility index can then be calculated from Equation (2).
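The window screening described above might look like the following sketch. The names `window_correlation` and `select_windows`, the one-point stride, the threshold value 0.9, and the synthetic data are assumptions made for illustration; the source does not specify them here. The per-window coefficient is computed as the correlation between the voltage and its least-squares fit on the currents, which coincides with the canonical correlation when one group has a single variable.

```python
import numpy as np

def window_correlation(u, i):
    """r between u and its best linear combination of the columns of i."""
    design = np.column_stack([i, np.ones(len(u))])
    coef, *_ = np.linalg.lstsq(design, u, rcond=None)
    return np.corrcoef(u, design @ coef)[0, 1]

def select_windows(u_pcc, currents, t, r_th):
    """Return the start indices j of length-t windows with r_j >= r_th."""
    u_pcc = np.asarray(u_pcc, dtype=float)
    currents = np.asarray(currents, dtype=float)
    selected = []
    for j in range(len(u_pcc) - t + 1):  # slide the window one point at a time
        r_j = window_correlation(u_pcc[j:j + t], currents[j:j + t])
        if r_j >= r_th:
            selected.append(j)
    return selected

# Synthetic data (hypothetical): quiet background in the first half,
# strongly fluctuating background in the second half.
rng = np.random.default_rng(2)
currents = rng.normal(10.0, 1.0, size=(400, 2))
u_pcc = currents @ np.array([0.8, 0.5]) + 2.0
u_pcc[200:] += rng.normal(0.0, 3.0, size=200)
selected = select_windows(u_pcc, currents, t=50, r_th=0.9)
```

A smaller t gives finer time resolution at the cost of more windows to evaluate, in line with the trade-off noted for the choice of the sliding window length.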
3.3. Harmonic Responsibility Division Procedures
Figure 4 depicts the procedure of the harmonic responsibility division method for multi-harmonic sources based on the CCA proposed in this paper. A brief description of the procedure is provided as follows.
Step 1: Measure the harmonic data at the PCC for n major harmonic sources during the time periods of interest. The measurements include the harmonic voltage data at the PCC, UPCC, and the harmonic current data for each source, Ik (k = 1, 2, …, n).
Step 2: Determine the length of the sliding window t by considering the accuracy and time of the calculation. The smaller the sliding window t, the higher the calculation accuracy but the longer the calculation time.
Step 3: Calculate, using the CCA method, the correlation coefficient rj between the harmonic voltage data and the harmonic current data of the jth window. The data are selected if rj ≥ rth and are discarded otherwise.
Step 4: Solve Equation (4) using the linear regression method to calculate the projection coefficient.
Step 5: Calculate the harmonic responsibility index for each harmonic source from Equation (2).
Step 6: Move j one point backwards, and repeat steps (3) to (5) until j = n.
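Step 7: Calculate the average of the harmonic responsibility index over the screened data segments in each period to obtain the total harmonic responsibility index for each user, as shown in Equation (10):

$$\bar{\mu}_k = \frac{1}{n} \sum_{j=1}^{n} \mu_{k,j} \tag{10}$$

where $\mu_{k,j}$ is the harmonic responsibility index of source k in the jth data segment, and n is the number of total data segments after screening in the analysis period.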
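Steps 4 to 7 can be sketched as below for windows that have already passed the screening of Step 3. This is an illustrative sketch rather than the authors' code: the function name `total_responsibility` is hypothetical, and the per-window index is computed from window-mean amplitudes (projection coefficient times mean current, divided by mean PCC voltage), which is an assumption about how the amplitudes enter the ratio.

```python
import numpy as np

def total_responsibility(u_pcc, currents, starts, t):
    """Average per-window responsibility indices over the selected windows.

    starts: start indices of the windows kept by the screening step
    t:      sliding window length
    Returns an array with one averaged index per harmonic source.
    """
    u_pcc = np.asarray(u_pcc, dtype=float)
    currents = np.asarray(currents, dtype=float)
    per_window = []
    for j in starts:
        u = u_pcc[j:j + t]
        i = currents[j:j + t]
        # Step 4: projection coefficients by linear regression with intercept.
        design = np.column_stack([i, np.ones(t)])
        coef, *_ = np.linalg.lstsq(design, u, rcond=None)
        h = coef[:-1]
        # Step 5: responsibility index per source (window-mean amplitudes assumed).
        per_window.append(h * i.mean(axis=0) / u.mean())
    # Step 7: average over the selected windows (assumes at least one window).
    return np.mean(per_window, axis=0)

# Synthetic data (hypothetical): constant background, so every window is usable.
rng = np.random.default_rng(3)
currents = rng.uniform(5.0, 15.0, size=(200, 2))
u_pcc = currents @ np.array([0.8, 0.5]) + 2.0
t = 50
starts = list(range(0, 151))
resp = total_responsibility(u_pcc, currents, starts, t)
```

Under this convention the indices of the major sources sum to less than one, with the remainder attributable to the background term.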