1. Introduction
Accurate line parameters are the basis of state estimation, security control and other advanced applications in energy management systems (EMS). With the continuous expansion of power grids, the operation of distribution networks is complex and changeable; meanwhile, dynamic network reconfiguration is frequent [1,2]. Unlike transmission networks, distribution networks have not realized the real-time monitoring of topology changes and periodic checks of line parameters; therefore, EMS can only rely on the topology and line impedance data recorded in grid-planning files. However, adjustments to the operation structure of a distribution network and changes in the environments around grid lines can lead to deviations between recorded data and actual parameters, which seriously affects the accuracy of EMS safety analysis and control decision results. Therefore, for a distribution network with missing or outdated topology information, it is of great significance to propose an effective line parameter identification method to improve its operation stability and management level [3,4].
Fortunately, the rapid deployment of advanced metering infrastructure (AMI) in distribution networks has enabled high-density historical data to be obtained for line parameter identification [5,6]. In [7,8,9], line parameters were estimated based on nonlinear least squares (NLS), reweighted NLS and augmented state estimation, respectively. In [10], a graphical learning algorithm based on physical parameters was proposed, and the stochastic gradient descent method was used to estimate line parameters; however, the identification efficiency was low in networks with a large number of lines. In [11], a fast graphical learning method was proposed for a large-scale distribution network, which improved the efficiency of parameter estimation. In [12,13], line parameters were identified using particle swarm optimization and a machine learning algorithm, respectively; however, these had little physical interpretability and could not guarantee the generalization of an identification model in different distribution networks. In [14], a hierarchical estimation strategy for line parameters was proposed based on the generalized equations of line voltage drops, and reinforcement learning was used to improve the robustness of the line impedance estimation method to measurement errors. Although the above methods can identify line parameters, they all require accurate prior information on the network topology.
The uncertainty of dynamic distribution network topologies limits the use of the above identification methods in engineering applications. In this regard, some scholars have proposed joint identification methods for distribution network topologies and line parameters. In [15,16], topology identification was transformed into a generalized low-rank approximation problem, and the error-in-variables (EIV) model was used to realize the joint identification of topology and line impedance in a maximum-likelihood estimation framework. In [17], the distribution network topology and line impedance were estimated through the iteration of the Kalman filtering method and Newton–Raphson method. In [18], a deep–shallow neural network was proposed, and the network topology and parameters were identified based on reinforcement learning. In [19], a complex recursive grouping algorithm for the unsupervised identification of topology and line parameters was adopted, which is applicable in distribution networks with latent nodes. The methods proposed in [15,16,17,18,19] are based on the data collected by phasor measurement units (PMU), which synchronously sample the power and voltage measurement data of buses based on global positioning system (GPS) time references.
However, due to the limitations of economic cost, PMU devices have not been fully deployed in many distribution networks [20], and smart meters are generally used as alternative measurement equipment. Compared with PMU, smart meters cannot obtain voltage angle measurement information. There are several identification methods based on smart meter data. In [21], a linear regression model was established to construct the network topology and calculate the line parameters sequentially from the leaf nodes to the root nodes; however, the computational capacity required by the algorithm increased rapidly for large-scale distribution networks. In [22], the measurement data of voltage magnitudes and power injections were used to estimate the impedance distance between observed buses, and the operation topology and line parameters were iteratively identified. In [23], a topology label matrix was constructed based on the LinDistFlow model, and the topology and line parameters were estimated by feature clustering. Nevertheless, the methods proposed in [22,23] require a large number of samples, so their effectiveness cannot be guaranteed in a distribution network with frequent topology changes, and they take only the radial grid structure into consideration.
In summary, existing parameter identification methods generally require prior information on grid topology, PMU devices, or months of measurement data. To overcome these limitations, we propose a method to identify topology and line parameters simultaneously that does not rely on prior information of the grid topology or parameters. Consequently, it addresses uncertainty in the topology, which may change flexibly and frequently in distribution networks. First, a linear power flow model is established according to the operating characteristics of the distribution network, and the voltage angle information is ignored temporarily. The initial identification results of distribution network topology and line admittance are quickly obtained through the simplified model. Then, the identification results are modified through decoupling iterative optimization among variables, aiming to reduce the influence of modeling errors on the accuracy of the parameter estimation. Finally, we demonstrate the effectiveness of our method on IEEE test cases, and the robustness of the proposed algorithm is verified through a sensitivity analysis.
The main contributions of this paper are given as follows:
(1) All the measurement data are provided by smart meters, so the method does not rely on expensive equipment such as PMU. Additionally, the number of samples required by the algorithm is modest: only one day of historical data at a five-minute sampling period (288 samples) is needed, which largely avoids topology changes during data collection;
(2) The method accurately identifies the topology and effectively estimates the line parameters; moreover, it is suitable for large-scale distribution networks and superior to methods that can estimate only the topology or only the line parameters;
(3) The proposed identification algorithm is robust to measurement noise and is applicable to distribution grids with different structures, including weakly meshed grids.
The rest of this paper is organized as follows. Section 2 provides the mechanism analysis of line parameter identification and demonstrates the overall flow of the proposed method. Section 3 and Section 4 introduce the initial identification strategy and the enhanced identification strategy, respectively. Section 5 validates the performance of our method on IEEE 33- and 118-bus distribution systems. Section 6 concludes the paper.
2. Mechanism Analysis of Line Parameter Identification in Dynamic Distribution Network
The admittance matrix contains both the topology and line parameter information of the distribution network, so this paper solves for the estimated admittance matrices $\hat{G}$ and $\hat{B}$ based on the power and voltage data collected by smart meters. Similar to state estimation, the objective function is established according to the minimum variance between the actual power injection and the power calculated by $\hat{G}$, $\hat{B}$, which can be expressed as follows.
$$\min_{\hat{G},\hat{B}} \sum_{i=1}^{N} \left[ (p_i - \hat{p}_i)^2 + (q_i - \hat{q}_i)^2 \right] \tag{1}$$
where $N$ is the total number of buses in the distribution network, $p_i$ and $q_i$ are the active and reactive power injection of bus $i$, respectively, and $\hat{p}_i$ and $\hat{q}_i$ are the estimated active and reactive power injection of bus $i$, respectively.
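According to the classical power flow equations, the estimated power injection and the estimated admittance matrix are constrained as follows.
$$\hat{p}_i = |v_i| \sum_{k=1}^{N} |v_k| \left( \hat{G}_{ik}\cos\theta_{ik} + \hat{B}_{ik}\sin\theta_{ik} \right) \tag{2}$$
$$\hat{q}_i = |v_i| \sum_{k=1}^{N} |v_k| \left( \hat{G}_{ik}\sin\theta_{ik} - \hat{B}_{ik}\cos\theta_{ik} \right) \tag{3}$$
where $|v_i|$ and $|v_k|$ are the voltage magnitudes of bus $i$ and bus $k$, respectively, $\theta_{ik}$ is the voltage angle difference between bus $i$ and bus $k$, and $\hat{G}_{ik}$ and $\hat{B}_{ik}$ are the estimated conductance and susceptance between bus $i$ and bus $k$, respectively. To simplify the notation without loss of generality, the following uniformly uses $v$ to represent $|v|$.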
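In addition, according to the characteristics of the admittance matrix, the optimization model needs to meet the following constraints.
(1) Where there is no connecting branch between bus $i$ and bus $k$, $\hat{G}_{ik} = \hat{B}_{ik} = 0$.
(2) The non-diagonal elements of the admittance matrix always satisfy:
$$\hat{G}_{ik} \le 0, \quad i \ne k \tag{4}$$
$$\hat{B}_{ik} \ge 0, \quad i \ne k \tag{5}$$
(3) Since the shunt resistance in a distribution network can be neglected [15], the diagonal elements of the admittance matrix always satisfy:
$$\hat{G}_{ii} = -\sum_{k \ne i} \hat{G}_{ik} \tag{6}$$
$$\hat{B}_{ii} = -\sum_{k \ne i} \hat{B}_{ik} \tag{7}$$
(4) The admittance matrix is symmetric:
$$\hat{G} = \hat{G}^T \tag{8}$$
$$\hat{B} = \hat{B}^T \tag{9}$$
where $[\cdot]^T$ represents the matrix transpose.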
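The structural constraints listed above can be checked numerically on any candidate pair of matrices. The following is a minimal sketch rather than part of the proposed method: the function name `check_admittance_structure`, the tolerance `tol`, and the sign convention for off-diagonal entries (non-positive conductance and non-negative susceptance, as for inductive lines without shunt elements) are assumptions made for illustration.

```python
def check_admittance_structure(G, B, tol=1e-9):
    """Return True if G and B (lists of lists) satisfy the structural constraints.

    Assumed conventions (illustrative): off-diagonal G_ik <= 0 and B_ik >= 0,
    every row sums to zero (no shunt elements), and both matrices are symmetric.
    """
    n = len(G)
    for i in range(n):
        # Zero row sums are equivalent to G_ii = -sum_{k != i} G_ik (same for B).
        if abs(sum(G[i])) > tol or abs(sum(B[i])) > tol:
            return False
        for k in range(n):
            if i == k:
                continue
            if G[i][k] > tol or B[i][k] < -tol:
                return False  # wrong sign for an off-diagonal element
            if abs(G[i][k] - G[k][i]) > tol or abs(B[i][k] - B[k][i]) > tol:
                return False  # symmetry violated
    return True


# Two-bus example: one line with series admittance g + jb = 4 - 8j (p.u.).
G_ok = [[4.0, -4.0], [-4.0, 4.0]]
B_ok = [[-8.0, 8.0], [8.0, -8.0]]
G_bad = [[4.0, -4.0], [-3.0, 4.0]]  # asymmetric, and one row does not sum to zero
```

A check of this kind is convenient after each regression step, because an unconstrained least-squares fit does not enforce any of these properties by itself.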
With the premise that the topology information of the dynamic distribution grid is unavailable, a mixed-integer programming problem could be constructed by introducing binary variables that represent the line connection status, so as to jointly identify the grid topology and line parameters; however, there is hardly any effective method for solving this problem. Furthermore, for two buses without a connecting branch, it is difficult to optimize the corresponding elements of $\hat{G}$ and $\hat{B}$ exactly to 0. In this paper, the elements of the estimated admittance matrix that correspond to unconnected lines are set to 0 by noise reduction, which implicitly encodes the topological information and reduces the number of optimization variables. On this basis, a two-stage identification method for jointly estimating the distribution network topology and line parameters is proposed to ensure the accuracy of the identification model.
The principle of noise reduction is as follows. Due to the constraints of power flow and the characteristics of the admittance matrix, if there is no connecting branch between bus $i$ and bus $k$, then $\hat{G}_{ik}$ and $\hat{B}_{ik}$ will gradually be optimized to values close to 0; consequently, the ratio $|\hat{G}_{ik}|/|\hat{G}_{ii}|$ will be sufficiently small. Therefore, we calculate $|\hat{G}_{ik}|/|\hat{G}_{ii}|$ for each non-diagonal element of $\hat{G}$; if the value is less than the noise reduction threshold $\omega_g$, the corresponding branch is eliminated, and $\hat{G}_{ik}$ and $\hat{B}_{ik}$ are set to 0. Because $|\hat{G}_{ii}|$ equals the sum of $|\hat{G}_{ik}|$ over the branches connected to bus $i$, the ratios of the connected branches average one over the number of those branches. Since the distribution network usually adopts a radial or weakly meshed grid structure, the number of branches connected to bus $i$ is much lower than $N − 1$; therefore, $\omega_g$ can reasonably be set to $1/(N − 1)$.
In order to further verify the rationality of the proposed criterion, we use four IEEE test grids to calculate $|G_{ik}|/|G_{ii}|$ for each branch. As shown in Table 1, for all branches in the four grids, the values of $|G_{ik}|/|G_{ii}|$ are greater than $\omega_g$. Accordingly, we can effectively remove the unconnected branches by noise reduction.
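The noise reduction rule described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions: the function name `noise_reduction` and the list-of-lists data layout are hypothetical, and the three-bus matrices are made-up values rather than results from the paper.

```python
def noise_reduction(G, B, omega=None):
    """Zero the off-diagonal pairs (G_ik, B_ik) whose ratio |G_ik|/|G_ii| < omega.

    G and B are square lists of lists; omega defaults to 1/(N - 1).
    Returns new matrices; the inputs are left untouched.
    """
    n = len(G)
    if omega is None:
        omega = 1.0 / (n - 1)
    G_new = [row[:] for row in G]
    B_new = [row[:] for row in B]
    for i in range(n):
        for k in range(n):
            if i != k and abs(G[i][k]) / abs(G[i][i]) < omega:
                G_new[i][k] = 0.0
                B_new[i][k] = 0.0
    return G_new, B_new


# Three-bus chain 1-2-3 with a small spurious coupling between buses 1 and 3.
G_est = [[2.0, -2.0, -0.01],
         [-2.0, 4.0, -2.0],
         [-0.01, -2.0, 2.0]]
B_est = [[-4.0, 4.0, 0.02],
         [4.0, -8.0, 4.0],
         [0.02, 4.0, -4.0]]
G_clean, B_clean = noise_reduction(G_est, B_est)
```

With $N = 3$ the threshold is 0.5, so the spurious entry (ratio 0.005) is removed, while the real branches (ratios 0.5 and 1) are kept because the comparison is strict.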
The overall flow of the proposed two-stage identification method is shown in Figure 1. In stage 1, the approximate topology and line parameters are quickly obtained based on linear regression, and the number of optimization variables is reduced significantly. On this basis, the identification results are modified by decoupling the iterative optimization between variables in stage 2. At the same time, the unconnected branches are further eliminated by noise reduction. The final identification result is output once the iteration converges.
3. Initial Identification Strategy Based on Linear Regression
The nonlinearity of power flow equations may turn the identification model into a non-convex optimization problem, which is difficult to solve; linearization can improve the convergence and robustness of the identification algorithm [24,25]. Therefore, a linear power flow model is established in this stage, and we propose an initial identification strategy based on linear regression. Although the simplification of power flow equations and the neglect of the voltage angle affect the identification accuracy of line parameters to a certain extent, the main purpose of this stage is to quickly remove most of the unconnected branches and reduce the number of optimization variables, as well as to provide the basic topology and initial values of line parameters for stage 2. In stage 2, the estimated admittance matrix is further modified based on a classical power flow model to improve the accuracy of the identification results.
According to the characteristics of a normally operating distribution grid, we can obtain the following approximations [24,25,26]:
(1) The voltage magnitude of bus $i$ is close to 1 p.u., which implies that $\Delta v_i = v_i − 1$ is a small number.
(2) The voltage angle difference $\theta_{ik}$ is usually less than 5°, so we have:
$$\sin\theta_{ik} \approx \theta_{ik} \tag{10}$$
$$\cos\theta_{ik} \approx 1 \tag{11}$$
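With the approximations discussed above, the active power flow equation can be expressed as follows.
$$p_i = v_i \sum_{k=1}^{N} v_k \left( G_{ik} + B_{ik}\theta_{ik} \right) \tag{12}$$
Dividing by $v_i$, replacing $v_k$ with $1 + \Delta v_k$, and combining Equations (10) and (11), we obtain:
$$\frac{p_i}{v_i} = \underbrace{\sum_{k=1}^{N} G_{ik}}_{(a)} + \underbrace{\sum_{k=1}^{N} G_{ik}\Delta v_k}_{(b)} + \underbrace{\theta_i \sum_{k=1}^{N} B_{ik}}_{(c)} \underbrace{- \sum_{k=1}^{N} B_{ik}\theta_k}_{(d)} + \underbrace{\sum_{k=1}^{N} B_{ik}\Delta v_k \theta_{ik}}_{(e)} \tag{13}$$
Equations (6) and (7) imply that terms (a) and (c) in Equation (13) are 0; furthermore, $\Delta v_k$ multiplied by $\theta_{ik}$ is a product of two small quantities, so term (e) is much smaller than terms (b) and (d); therefore, we can omit term (e) and obtain:
$$\frac{p_i}{v_i} = \sum_{k=1}^{N} G_{ik}\Delta v_k - \sum_{k=1}^{N} B_{ik}\theta_k \tag{14}$$
Analogously, we can obtain the linear equation of reactive power flow as follows.
$$\frac{q_i}{v_i} = -\sum_{k=1}^{N} B_{ik}\Delta v_k - \sum_{k=1}^{N} G_{ik}\theta_k \tag{15}$$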
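Equations (14) and (15) can be rewritten in matrix form to obtain the linear power flow model, as follows.
$$\begin{bmatrix} [p/v] \\ [q/v] \end{bmatrix} = \begin{bmatrix} G & -B \\ -B & -G \end{bmatrix} \begin{bmatrix} \Delta v \\ \theta \end{bmatrix} \tag{16}$$
where $[p/v] = [p_1/v_1 \; … \; p_N/v_N]^T$, $[q/v] = [q_1/v_1 \; … \; q_N/v_N]^T$, $\Delta v = [\Delta v_1 \; … \; \Delta v_N]^T$, and $\theta = [\theta_1 \; … \; \theta_N]^T$ are vectors containing the measurement data of all buses.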
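The accuracy of the linear active power model can be checked on a small example. The sketch below compares $p_i/v_i$ from the full AC equation with the linear expression $\sum_k G_{ik}\Delta v_k − \sum_k B_{ik}\theta_k$; the two-bus system, its line admittance and the operating point are hypothetical values chosen only for illustration.

```python
import math


def exact_p_over_v(i, G, B, v, theta):
    """p_i / v_i from the full AC power flow equation."""
    return sum(v[k] * (G[i][k] * math.cos(theta[i] - theta[k])
                       + B[i][k] * math.sin(theta[i] - theta[k]))
               for k in range(len(v)))


def linear_p_over_v(i, G, B, v, theta):
    """p_i / v_i from the linear model: sum_k G_ik*dv_k - sum_k B_ik*theta_k."""
    return sum(G[i][k] * (v[k] - 1.0) - B[i][k] * theta[k]
               for k in range(len(v)))


# Hypothetical two-bus system: one line with g = 4, b = -8 (p.u.).
G = [[4.0, -4.0], [-4.0, 4.0]]
B = [[-8.0, 8.0], [8.0, -8.0]]
v = [1.0, 0.98]          # voltage magnitudes in p.u.
theta = [0.0, -0.02]     # voltage angles in radians

exact = exact_p_over_v(0, G, B, v, theta)
approx = linear_p_over_v(0, G, B, v, theta)
```

For this operating point the two values differ by roughly 1%, which reflects the neglected second-order terms in the voltage deviation and the angle difference.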
Assuming that we obtain $M_1$ independent samples without voltage angle information, it is necessary to further simplify Equation (16). Considering that the voltage angle difference between the slack bus and any other bus is typically small, Equation (16) is simplified in stage 1, as follows.
$$[P/V] = \hat{G}\,\Delta V \tag{17}$$
$$[Q/V] = -\hat{B}\,\Delta V \tag{18}$$
where $[P/V] = [[p/v]_1 \; … \; [p/v]_{M_1}]$, $[Q/V] = [[q/v]_1 \; … \; [q/v]_{M_1}]$, and $\Delta V = [[\Delta v]_1 \; … \; [\Delta v]_{M_1}]$.
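The least-squares solutions of $\hat{G}$ and $\hat{B}$ are:
$$\hat{G} = [P/V]\,\Delta V^T \left( \Delta V \Delta V^T \right)^{-1} \tag{19}$$
$$\hat{B} = -[Q/V]\,\Delta V^T \left( \Delta V \Delta V^T \right)^{-1} \tag{20}$$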
In order to eliminate the unconnected branches, $\hat{G}$ and $\hat{B}$ should be modified by noise reduction; that is, we traverse $\hat{G}$ and set $\hat{G}_{ik}$ and $\hat{B}_{ik}$ to 0 if the value of $|\hat{G}_{ik}|/|\hat{G}_{ii}|$ is less than $\omega_g$.
Additionally, $\hat{G}$ and $\hat{B}$ should preserve the symmetry of the admittance matrix, so we symmetrize $\hat{G}$ and $\hat{B}$ as follows:
For convenience, the symmetrized matrices are again denoted as $\hat{G}$ and $\hat{B}$, respectively.
Moreover, to eliminate the influence of the removed branches on the regression results, we should renew the remaining non-zero elements in $\hat{G}$ and $\hat{B}$ as follows:
where the unknowns are the non-zero elements in the $i$-th row of $\hat{G}$ and $\hat{B}$, respectively, $[P/V]_i$ and $[Q/V]_i$ are vectors corresponding to the measurement data of bus $i$, and $\Delta V_i$ is the voltage matrix corresponding to the column indexes of the non-zero elements in the $i$-th row.
Repeat the steps of noise reduction, symmetrization, and element renewing until G^ and B^ remain unchanged before and after iteration; we can then obtain the initial identification results of the topology and line parameters.
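The stage 1 loop (regression, noise reduction, symmetrization, element renewing) can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the use of `numpy.linalg.lstsq`, the rule that a branch is kept only if both of its entries survive noise reduction, and the symmetrization by averaging are all assumptions. The 4-bus test data are synthetic and generated from the angle-free linear model itself.

```python
import numpy as np


def noise_reduction(G, B, omega):
    """Zero off-diagonal pairs whose ratio |G_ik|/|G_ii| is below omega."""
    G, B = G.copy(), B.copy()
    ratio = np.abs(G) / np.abs(np.diag(G))[:, None]
    drop = ratio < omega
    np.fill_diagonal(drop, False)
    G[drop] = 0.0
    B[drop] = 0.0
    return G, B


def stage1_identify(PV, QV, dV, max_iter=50):
    """PV, QV, dV are N x M1 arrays of p/v, q/v and (v - 1) samples."""
    N = dV.shape[0]
    omega = 1.0 / (N - 1)
    # Ordinary least squares for [P/V] = G dV and [Q/V] = -B dV.
    G = np.linalg.lstsq(dV.T, PV.T, rcond=None)[0].T
    B = -np.linalg.lstsq(dV.T, QV.T, rcond=None)[0].T
    for _ in range(max_iter):
        G_prev, B_prev = G.copy(), B.copy()
        G, B = noise_reduction(G, B, omega)
        # ASSUMPTION: keep a branch only if both (i,k) and (k,i) survived,
        # then average the two estimates to restore symmetry.
        keep = (G != 0) & (G.T != 0)
        G = np.where(keep, 0.5 * (G + G.T), 0.0)
        B = np.where(keep, 0.5 * (B + B.T), 0.0)
        # Renew the surviving entries row by row with a reduced regression.
        for i in range(N):
            idx = np.flatnonzero(keep[i])
            G[i, idx] = np.linalg.lstsq(dV[idx].T, PV[i], rcond=None)[0]
            B[i, idx] = -np.linalg.lstsq(dV[idx].T, QV[i], rcond=None)[0]
        if np.allclose(G, G_prev) and np.allclose(B, B_prev):
            break
    return G, B


# Synthetic check on a hypothetical 4-bus chain (line admittances g + jb in p.u.).
lines = [(0, 1, 5.0, -10.0), (1, 2, 4.0, -8.0), (2, 3, 6.0, -12.0)]
G_true = np.zeros((4, 4))
B_true = np.zeros((4, 4))
for i, k, g, b in lines:
    G_true[[i, k], [i, k]] += g
    G_true[i, k] = G_true[k, i] = -g
    B_true[[i, k], [i, k]] += b
    B_true[i, k] = B_true[k, i] = -b

rng = np.random.default_rng(0)
dV = 0.02 * rng.standard_normal((4, 60))
G_hat, B_hat = stage1_identify(G_true @ dV, -B_true @ dV, dV)
```

Because the synthetic data contain no voltage angle effects or measurement noise, the sketch recovers the true matrices; with real measurements the result would only be an initial estimate to be refined in stage 2.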
4. Enhanced Identification Strategy Based on Decoupling Iterative Optimization
In stage 1, the influence of the voltage angle on line parameter identification accuracy is temporarily ignored, and there are certain modeling errors in the linear regression model. Hence, in this stage, we will take the change of voltage angle into consideration, and iteratively modify G^ and B^, i.e., outputs in stage 1, based on the classical power flow model, so as to obtain more accurate identification results.
In stage 2, the variables of the optimization problem include the branch admittance and the voltage angle. Optimizing all variables simultaneously may slow down the execution of the algorithm, or even make the iterative process difficult to converge. In order to improve the robustness of the optimization algorithm, we take the branch admittance as the main optimization variable and propose a decoupling iterative optimization method between the voltage angle and branch admittance. In each iteration, we first obtain the estimated voltage angle corresponding to each data sample through a pseudo power flow calculation; on this basis, the adaptive ridge regression model is established to modify $\hat{G}$ and $\hat{B}$, and the topology is further corrected by noise reduction. The iterative process ends when the convergence condition is satisfied.
In the pseudo power flow calculation, an existing power system simulation tool, such as MATPOWER (version 7.1) [27], can be used to build the distribution network model; taking the measurement data of power injection and $\hat{G}$, $\hat{B}$ as inputs, the voltage angle information is then obtained through the power flow calculation.
Once the estimated value of the voltage angle is acquired, we are able to establish a more accurate mathematical model to solve for the line parameters.
4.1. Mathematical Optimization Model and Its Solving Algorithm
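Taking the line admittances as variables, the power flow equations can be rewritten as follows.
$$p_i = \sum_{j:\, l_{ij} \in E} \left[ g_{ij}\left( v_i^2 - v_i v_j \cos\theta_{ij} \right) - b_{ij}\, v_i v_j \sin\theta_{ij} \right] \tag{25}$$
$$q_i = \sum_{j:\, l_{ij} \in E} \left[ -b_{ij}\left( v_i^2 - v_i v_j \cos\theta_{ij} \right) - g_{ij}\, v_i v_j \sin\theta_{ij} \right] \tag{26}$$
where $g_{ij}$ and $b_{ij}$ are the conductance and susceptance of line $l_{ij}$, respectively, and $E$ is the set of connected lines.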
We take Equation (25) as an example to illustrate how the mathematical optimization model is established. Equation (25) shows that there is a linear relationship between $p_i$ and $g_{ij}$, $b_{ij}$. We denote the set of connected lines obtained in stage 1 as $\hat{E}$, with cardinality $\hat{M}$; then the matrix form of Equation (25) can be expressed as follows:
$$p = a_g\, g + a_b\, b \tag{27}$$
where $p = [p_1 \; … \; p_N]^T$ is the active power injection measurement, $g = [g_1 \; … \; g_{\hat{M}}]^T$ and $b = [b_1 \; … \; b_{\hat{M}}]^T$ are vectors containing the line conductance and susceptance in $\hat{E}$, and $a_g$, $a_b$ are $N \times \hat{M}$ coefficient matrices whose elements can be expressed as follows.
where the first and second columns of the line-index matrix store the start bus and end bus index of each line in $\hat{E}$, respectively. We denote $a_p = [a_g \;\; a_b]$ and $x = [g; b]$; then Equation (27) can be expressed as follows:
$$p = a_p\, x \tag{30}$$
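A loss-minimization model for solving the line parameters can be built based on Equation (30). We denote the number of data samples in stage 2 as $M_2$, and the optimization model based on the squared loss function can be expressed as follows:
$$\min_{x} \; \left\| A_p x - P \right\|_2^2 \tag{31}$$
where $A_p = [a_{p1}; …; a_{pM_2}]$ and $P = [p_1; …; p_{M_2}]$ stack the coefficient matrices and the active power measurement vectors of the $M_2$ samples.
In order to prevent overfitting and enhance the stability of the solution, the $L_2$ regularization term is added to the loss function to construct the ridge regression model as follows:
$$\min_{x} \; \left\| A_p x - P \right\|_2^2 + \lambda \left\| x \right\|_2^2 \tag{32}$$
where $\lambda$ is the regularization parameter.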
The regularization term of the ridge regression model penalizes all parameters of $x$ equally; consequently, biased estimates may occur when the distribution grid contains several lines with relatively large admittance values. Therefore, an adaptive ridge regression model is established as follows:
where ‘·’ is the matrix dot product operator, and the penalty weight coefficients are defined as follows:
where both $\delta$ and $\gamma$ are constant parameters; referring to [28], we select $\delta = 10^{-5}$ and $\gamma = 2$. $x_0$ is the initial estimated value of $x$, which can be obtained by the least squares method as follows:
Equations (33)–(35) indicate that the adaptive ridge regression model adds a penalty weight coefficient to each variable according to the initial estimated value of the corresponding line parameter. The weight coefficient is smaller when the admittance of the corresponding line is larger; as a result, the degree of shrinkage is reduced.
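The effect of per-variable penalty weights can be illustrated with a closed-form solve. The sketch below is not the paper's Equations (33)–(35): the weight formula $w_j = 1/(|x_{0,j}|^{\gamma} + \delta)$, the penalty form $\lambda \sum_j w_j x_j^2$, and the function name `adaptive_ridge` are assumptions chosen only to reproduce the described behavior, in which a larger initial estimate leads to a smaller weight and therefore to less shrinkage.

```python
import numpy as np


def adaptive_ridge(A, y, lam, delta=1e-5, gamma=2.0):
    """Solve min ||A x - y||^2 + lam * sum_j w_j x_j^2 in closed form.

    ASSUMPTION: w_j = 1 / (|x0_j|**gamma + delta), where x0 is the ordinary
    least-squares estimate; the paper's exact weight formula may differ.
    """
    x0 = np.linalg.lstsq(A, y, rcond=None)[0]
    w = 1.0 / (np.abs(x0) ** gamma + delta)
    # Normal equations of the weighted-penalty problem.
    x = np.linalg.solve(A.T @ A + lam * np.diag(w), A.T @ y)
    return x, w


rng = np.random.default_rng(1)
A = rng.standard_normal((40, 3))
x_true = np.array([10.0, 1.0, 0.0])   # two real lines and one spurious candidate
x_hat, w = adaptive_ridge(A, A @ x_true, lam=1e-3)
```

In this synthetic example the variable with the largest initial estimate receives the smallest weight, so its estimate is barely shrunk, while the spurious candidate receives the largest weight and is held near zero.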
Analogously, we can obtain (36) from Equation (26):
where $A_q = [a_{q1}; …; a_{qM_2}]$, $Q = [q_1; …; q_{M_2}]$, $a_q = [a_b \;\; -a_g]$, and $q = [q_1 \; … \; q_N]^T$.
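The per-sample coefficient matrices can be assembled directly from the line list, the voltage magnitudes and the estimated angles. The sketch below assumes the element forms that follow from the classical power flow equations, namely $v_i^2 − v_i v_j\cos\theta_{ij}$ for the conductance coefficient and $−v_i v_j\sin\theta_{ij}$ for the susceptance coefficient; the function name `line_coefficients`, the zero-based bus indexing and the three-bus data are hypothetical.

```python
import numpy as np


def line_coefficients(lines, v, theta):
    """Build a_g and a_b (N x M) for one sample so that
    p = a_g @ g + a_b @ b and q = a_b @ g - a_g @ b.

    lines: list of (start_bus, end_bus) index pairs; v, theta: length-N arrays.
    """
    N, M = len(v), len(lines)
    a_g = np.zeros((N, M))
    a_b = np.zeros((N, M))
    for l, (s, t) in enumerate(lines):
        for i, j in ((s, t), (t, s)):       # each line touches both end buses
            d = theta[i] - theta[j]
            a_g[i, l] = v[i] ** 2 - v[i] * v[j] * np.cos(d)
            a_b[i, l] = -v[i] * v[j] * np.sin(d)
    return a_g, a_b


# Hypothetical three-bus chain with series admittances g + jb (p.u.).
lines = [(0, 1), (1, 2)]
g = np.array([5.0, 4.0])
b = np.array([-10.0, -8.0])
v = np.array([1.0, 0.98, 0.97])
theta = np.array([0.0, -0.01, -0.015])

a_g, a_b = line_coefficients(lines, v, theta)
p = a_g @ g + a_b @ b          # active power injections
q = a_b @ g - a_g @ b          # reactive power injections, a_q = [a_b, -a_g]
```

Stacking the matrices `[a_g, a_b]` of all samples row-wise yields the regression matrix for the active power model, and stacking `[a_b, -a_g]` yields the one for the reactive power model.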
Based on the above analysis, the flowchart of the proposed enhanced identification algorithm is shown in Figure 2. In the figure, $\xi$ is the empirical threshold for deciding whether the iteration has converged, which is set to 0.02. Appendix A describes in detail how the value of $\xi$ is determined. It is worth noting that the adaptive ridge regression models in (33) and (36) are convex optimization problems, and stage 1 provides an appropriate initial value for the estimation of branch admittance; as a result, the enhanced identification algorithm has a fast solution speed and good convergence performance.
4.2. Parameter Selection
4.2.1. The Number of Measurement Samples
In order to improve the accuracy of the initial identification results, the measurement data of stage 1 should have a certain degree of redundancy. Fortunately, the linear regression problem is easy to solve, and the identification algorithm with redundant measurements still has a fast execution speed. The number of optimization variables is lower in stage 2 than in stage 1; as a consequence, we can reduce the samples used in stage 2 to save computing resources. In order to ensure the robustness of the algorithm, the number of independent equations M2 × N in Equation (33) should not be less than 2 × M^. Therefore, M2 should not be less than 2M^/N.
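The lower bound on the stage 2 sample count is a simple ceiling division. The helper below is a small sketch; the function name `min_stage2_samples` and the example figures (33 buses, 40 candidate lines remaining after stage 1) are hypothetical.

```python
def min_stage2_samples(n_buses, n_lines):
    """Smallest integer M2 with M2 * N >= 2 * M_hat."""
    return -(-2 * n_lines // n_buses)   # ceiling division on integers


m2 = min_stage2_samples(33, 40)   # hypothetical: 33 buses, 40 candidate lines
```

For these figures the bound is $2 \times 40 / 33 \approx 2.4$, so at least three samples are needed; in practice more samples would normally be used to average out measurement noise.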
4.2.2. The Value of λ
$\lambda$ directly affects the performance of the algorithm. When the value of $\lambda$ is too small, the penalty term has little effect; when the value of $\lambda$ is too large, there is a high risk of eliminating too many lines, which makes the power flow calculation difficult to converge. Generalized cross-validation (GCV) selects the appropriate value of $\lambda$ by minimizing $V(\lambda)$, defined as follows [29]:
where $I$ is the identity matrix, $\mathrm{tr}(\cdot)$ represents the trace of a matrix, and the influence matrix is defined as follows:
Figure 3 shows the $V(\lambda)$ curve of an IEEE 33-bus system in one iteration. The blue dot in the figure is the minimum point, and the curve changes gently near the minimum value. In order to better reflect the advantage of regularization, we select the $\lambda$ at which the curve begins to rise significantly; that is, the $\lambda$ corresponding to the red star point in the figure.
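A GCV scan over a grid of candidate values can be sketched as follows. This is an illustrative sketch for plain ridge regression, not the paper's exact definition: the scaling of $\lambda$ inside the influence matrix $H = A(A^TA + \lambda I)^{-1}A^T$, the function names, and the synthetic data are assumptions, and the adaptive penalty weights are omitted. The helper returns the plain minimizer, whereas the selection rule described above picks a larger value where the curve starts to rise, which would need an additional heuristic.

```python
import numpy as np


def gcv_score(A, y, lam):
    """GCV score V(lam) = (||(I - H) y||^2 / n) / (tr(I - H) / n)^2,
    with H = A (A^T A + lam I)^-1 A^T (ASSUMED scaling of lam)."""
    n, p = A.shape
    H = A @ np.linalg.solve(A.T @ A + lam * np.eye(p), A.T)
    resid = y - H @ y
    return (resid @ resid / n) / (np.trace(np.eye(n) - H) / n) ** 2


def pick_lambda(A, y, grid):
    """Return the grid value with the smallest GCV score, plus all scores."""
    scores = [gcv_score(A, y, lam) for lam in grid]
    return grid[int(np.argmin(scores))], scores


rng = np.random.default_rng(2)
A = rng.standard_normal((50, 5))
y = A @ np.array([3.0, -2.0, 0.0, 0.0, 1.0]) + 0.1 * rng.standard_normal(50)
grid = [10.0 ** e for e in range(-6, 3)]
lam_best, scores = pick_lambda(A, y, grid)
```

Plotting `scores` against `grid` on a logarithmic axis gives a curve of the same kind as the one discussed for Figure 3, from which either the minimizer or the onset of the rise can be read off.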