1. Introduction
The accelerated addition of distributed energy resources (DERs) such as electric vehicle charging infrastructure, heat pumps, and photovoltaic (PV) installations drives distribution systems to their limits [1]. With a high share of DERs, these systems are expected to frequently be in critical states where branch capacity limits or voltage limits are exceeded. An important aspect of operating these systems is state estimation, as it provides knowledge of the grid state and enables the operator to make informed decisions about how to control the system and to identify faults and critical system states.
Unlike the transmission level of a power system, the primary distribution (at medium voltage (MV)) and, more so, the secondary distribution have a low coverage with real-time measurements [
2,
3,
4]. This poses a challenge for state estimation, which relies on real-time measurements to produce accurate estimates of the grid state. One way to meet this challenge is to include additional data sources that, while not available in real time, provide more accurate characterizations of past and expected system states. These data sources might include, for example:
Time-resolved Smart-Meter (SM) measurements from customers in the low voltage (LV) grids [
5];
Time-resolved power recordings (RLM), which are mandatory for large customers with annual energy demands larger than 100 MWh (typically directly connected to MV nodes) [
6];
Annual energy demand data, available through the billing system from analog or digital meters, for customers without time-resolved measurements;
Standard load profiles (SLP), representing the average behavior of a consumer group [
7];
Data from exogenous sources, such as solar irradiance or wind velocity measurements, to estimate generation from PV installations or wind turbines.
An approach particularly suited to process data from different sources and at such different levels of detail is Bayesian state estimation [
8,
9,
10,
11,
12,
13,
14,
15]. Before processing real-time measurements, Bayesian state estimation models any prior knowledge of the system state using a prior probability distribution. The above data sources are then used to parameterize this distribution accurately, characterizing the expected values of the state variables, such as voltages and currents, and their covariance. As real-time measurements are obtained, the prior distribution is updated using Bayes' rule to incorporate the new information.
In applying Bayesian probability theory, it is important to note that state variables are not statistically independent. If, for instance, the voltage magnitude at one node increases, the voltage at the neighboring nodes also increases. These correlations are valuable for state estimation because they allow one to use data from one node to infer information about other nodes, particularly unmeasured ones. Two origins of these correlations can be distinguished:
Physical correlations are caused by the grid’s physical coupling: for example, a high voltage at one grid node is transmitted through a line to another node.
Behavioral correlations, or load correlations, are caused by similar (correlated) behavior of grid customers independent of the grid, i.e., even for electrically unconnected customers. For example, on a warm day, electricity consumption for air conditioning will be higher than the historical average for all households at the same time, causing the voltage to drop in different grid segments, which might not even be physically connected.
Properly accounting for these correlations in the context of state estimation can drastically improve the accuracy of the results, as will be shown in this paper. In the Bayesian framework, behavioral correlations are encoded in a prior probability distribution for the customers’ loads—the load probability distribution (LPD), specifically in the covariance matrix of the LPD. Physical correlations are explicitly modeled using a power system model to obtain the prior probability distribution for the state variables [
8].
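The role of correlations in the Bayesian update can be illustrated with a minimal sketch. It assumes a two-node system with a Gaussian prior and a single, nearly exact voltage measurement at the first node; the function name `condition_on_first`, the per-unit values, and the correlation coefficients are illustrative assumptions and are not taken from the paper.

```python
def condition_on_first(mean, cov, z, meas_var):
    """Update a 2-D Gaussian prior with a noisy measurement z of component 0."""
    # Gain of the linear Gaussian update for a scalar measurement of node 0.
    s = cov[0][0] + meas_var
    gain = [cov[0][0] / s, cov[1][0] / s]
    innovation = z - mean[0]
    post_mean = [mean[0] + gain[0] * innovation,
                 mean[1] + gain[1] * innovation]
    # Posterior covariance: prior covariance minus gain times first prior row.
    post_cov = [[cov[i][j] - gain[i] * cov[0][j] for j in range(2)]
                for i in range(2)]
    return post_mean, post_cov


prior_mean = [1.00, 1.00]  # per-unit voltages (illustrative values)
sigma = 0.02               # prior standard deviation (illustrative value)
for rho in (0.0, 0.9):     # uncorrelated vs. strongly correlated prior
    prior_cov = [[sigma**2, rho * sigma**2],
                 [rho * sigma**2, sigma**2]]
    m, c = condition_on_first(prior_mean, prior_cov, z=0.96, meas_var=1e-6)
    print(f"rho={rho}: unmeasured node mean={m[1]:.4f}, std={c[1][1] ** 0.5:.4f}")
```

With a zero prior correlation, the measurement at the first node leaves the estimate for the unmeasured node unchanged; with a high prior correlation, both its mean and its uncertainty are updated.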
Distribution system state estimation (DSSE) methods that are based on the weighted least squares (WLS) approach [
16], which is the state-of-the-art method for higher voltage levels, compensate for the lack of real-time data by processing the above-mentioned data sources about expected states into so-called pseudo-measurements [
17,
18]. These pseudo-measurements are then treated in the same way as the real-time measurements, but they are given lower weights so that they have a smaller influence on the estimation result. WLS approaches assume these pseudo-measurements to be uncorrelated, like the real-time measurements. As a consequence, they do not take into account the behavioral correlations described above. Work that adapts the WLS algorithm to consider load correlations [19] shows that doing so improves the state estimation results. A complementary set of works [
20,
21,
22,
23] has developed optimal planning and control approaches for energy systems in the presence of uncertainty.
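The weighting of pseudo-measurements can be sketched for the simplest possible case: a single quantity observed directly by one real-time measurement and one pseudo-measurement. This is a toy illustration under that assumption, not a full WLS state estimator; the function name `wls_scalar` and all numerical values are hypothetical.

```python
def wls_scalar(values, sigmas):
    """Weighted least squares estimate of one quantity from direct observations.

    Each observation is weighted by the inverse of its error variance, and the
    observations are treated as uncorrelated, as in conventional WLS.
    """
    weights = [1.0 / s**2 for s in sigmas]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# One accurate real-time measurement and one uncertain pseudo-measurement.
real_time, pseudo = 0.97, 1.02
x_hat = wls_scalar([real_time, pseudo], sigmas=[0.005, 0.05])
print(f"estimate: {x_hat:.4f}")
```

Because the pseudo-measurement carries a much larger assumed standard deviation, the estimate stays close to the real-time value.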
In [
8], the basic principle for a Bayesian Linear DSSE is demonstrated. The parameters for the LPD are taken from historical SM time series, and loads are assumed to be uncorrelated. There is also a proposal for three-phase Bayesian DSSE [
9], in which the authors use a machine learning model to obtain forecast data for the background LPD, and the authors in [
11] use probabilistic graph models for state estimation. Neither approach uses behavioral correlation information. The simplification of zero correlations in [
8] is addressed in [
12], where the authors use correlation information between loads calculated from historical data assuming full SM coverage. In [
13], the correlation coefficients for active power are sampled randomly, while a full correlation between active and reactive power is assumed. As in [
12], the authors assume full meter coverage. Another approach combines deep learning with Bayes' rule, using grids with full SM coverage to learn the LPD [
14]. In [
15], the authors explicitly investigated correlation coefficients between different types of loads for a historical data set. Their focus is on the impact of the correlation coefficients on the standard deviations of state variables. A more recently proposed Bayesian approach [
10] combines varying input sources with different time resolutions and considers load correlations. In particular, it considers real-time measurements, smart meters, and historical load data with a time resolution of at least 30 min.
All the above approaches that consider LPD correlations have in common that they assume complete availability of historical time-resolved power measurements (e.g., Smart Meter data or recorded power measurements) across the entire grid to parameterize correlation coefficients.
However, only a few European countries or US states have 100% SM coverage [
24,
25,
26]. Although the Third Energy Package (European Union, 2009) requires European member states to roll out SMs in the future, the current situation is mixed. Sweden, Norway, Spain, and Italy have a high SM coverage of over 97%. Conversely, Poland, Hungary, and Slovakia have a coverage below 10% [
24,
25]. Similarly, in the US, the SM coverage for Utah and New Mexico is below 20%, while California, Nevada, Georgia, and Maine achieve high coverage rates of over 80% [
26].
In summary, for both US and European grids, only a few real-time measurements are available, and only a few countries have 100% SM coverage. In most countries, there is a substantial share of customers for which only annual energy consumption data are available. While this is sufficient to characterize expectation values for load profiles, data are lacking to parameterize the correlation between load profiles accurately. This paper shows that accounting for behavioral correlations in the load profiles improves the state estimation accuracy, confirming findings for WLS-based approaches [19] also for Bayesian state estimation, and shows how to relax the assumption of complete SM coverage. A method is developed to synthesize LPDs from mixed sources of input data while adequately accounting for correlations between different loads, regardless of the available type of load measurements. This paper focuses on MV grids, where most nodes represent the aggregated load of multiple customers. The topology and line parameters of the grid are assumed to be known.
The main methodological contributions of this paper are as follows:
Our contributions extend the well-established Bayesian state estimation framework by a method to parameterize load probability distributions in different, practically relevant levels of data availability ranging from complete time-resolved measurement sets to yearly power consumption data only, rendering unnecessary the constraint of having a full set of time-resolved power measurements for all customers. The application of the presented method is demonstrated in a case study using the example of Bayesian DSSE, but the synthesis of accurate load profile correlations can also be applied to WLS-based methods.
In Section 2, the modeling of the grid and system states is outlined, followed by a brief introduction of both WLS and Bayes approaches, focusing on how these approaches treat load correlations. Subsequently, the principle of the Bayesian linear state estimator is outlined. Section 3 starts with the proposal of the Correlation-Aware LPD Synthesis Method and details its integration into a state estimation framework. In Section 4, the accuracy gains of considering behavioral correlations for recognizing critical system states are demonstrated for a 107-bus, 20 kV MV grid under different measurement instrumentation scenarios.
3. Correlation-Aware Load Probability Distribution Synthesis
In this section, the aim is to estimate the LPD parameters by combining non-real-time inputs with varying levels of detail. The input data
Further external inputs are required: the measurement type and the index of the MV node, the assignment of connected LV nodes to the corresponding MV node given by grid topology, and the smart meter coverage of the underlying LV grid.
The correlation-aware LPD synthesis method consists of the following steps:
Classification of every MV node into a Measurement Instrumentation Scenario (MIS); the scenarios are explained in detail in Section 3.1;
Determination of load time series (apparent power) for every MV node according to its assigned MIS, as described in Section 3.2;
Determination of the LPD parameters (mean and covariance), as explained in Section 3.3.
3.1. Measurement Instrumentation Scenario
A classification of MV nodes into different Measurement Instrumentation Scenarios (MIS) is proposed in Table 1.
The first scenario (MV) covers all MV nodes with recorded real-time measurements or with RLMs. Here, the metering devices are installed directly at the MV nodes. This usually applies only to the primary substation, central MV nodes, or MV nodes with directly connected large customers. Most MV nodes are not equipped with a metering device. However, historical data from the SMs of the underlying LV grids can be used as an information source: if all customers in an LV grid are equipped with SMs, their summed-up demand approximately equals the demand at the connected MV node (neglecting line and transformer losses).
Three scenarios are defined for the underlying LV grids, corresponding to
full,
substantial, or
low SM coverage. Here, a
low SM coverage means that the set of customers with SMs cannot be assumed to be statistically representative of the behavior of all consumers. The scenario where the LV grid is fully covered with SMs is denoted by SM
. To differentiate between LV grids with
substantial and
low SM coverage, a threshold fraction
is defined (e.g.,
). Using this threshold fraction, the measurement instrumentation scenario SM
represents grids where SM measurements are available for at least
of all customers. Analogously, the scenario SM
represents grids where SM measurements are available for less than
of all customers.
Figure 1 shows an exemplary MV node for each proposed MIS. Green points mark the placement of SMs.
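The classification rule described above can be sketched as a small function. The labels `SM_full`, `SM_substantial`, and `SM_low`, the function name `classify_mis`, and the threshold value of 0.5 used in the example are placeholders chosen here for illustration; they are not the paper's notation or parameter values.

```python
def classify_mis(has_mv_meter, n_sm, n_customers, threshold):
    """Assign an MV node to a Measurement Instrumentation Scenario (MIS).

    has_mv_meter: True if real-time measurements or RLMs exist at the MV node.
    n_sm:         number of customers with smart meters in the underlying LV grid.
    n_customers:  total number of customers in the underlying LV grid.
    threshold:    coverage fraction separating substantial from low coverage.
    """
    if has_mv_meter:
        return "MV"
    if n_customers <= 0:
        raise ValueError("an LV grid needs at least one customer")
    coverage = n_sm / n_customers
    if coverage >= 1.0:
        return "SM_full"
    if coverage >= threshold:  # "at least" the threshold fraction
        return "SM_substantial"
    return "SM_low"            # "less than" the threshold fraction


# Illustrative threshold only; the value is an assumption of this sketch.
print(classify_mis(False, n_sm=30, n_customers=50, threshold=0.5))
```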
To estimate the power consumption of non-metered grid users, system operators conventionally use SLPs [7]. In the proposed method, it is also possible to use SLPs for the LV grid profiles
of MV nodes assigned to SM
. However, using SLPs does not correctly reproduce correlations between groups of customers. Simply calculating the sample correlations between two MV nodes using the same SLP time series results in an unrealistically high correlation value of
. One way to avoid such unrealistic correlation values is to consider correlations between comparable LV grids. Hence, exemplary synthetic power time series are used. A more detailed description is given in
Table 2.
3.2. Determination of Time Series Values
The MV node time series are denoted by
at every MV node
for every time step
. In
Table 2, the calculation steps for the determination of time series values are shown for each MIS.
For MV nodes in scenario MV, the recorded measurements at the MV nodes can directly be used as time series input. The other scenarios use the power time series from the SMs of the underlying LV grid. The SM power time series for LV node connected to MV node n are denoted by . is the number of the LV nodes equipped with smart meters of a corresponding MV node n. For the scenario with the assumption of coverage of SM in LV grids, the active and reactive power of the SM are summed up for the corresponding MV node n and result in a time series with a quarter-hourly resolution. This approximation neglects distribution losses. These typically amount to no more than for active power and for reactive power, which justifies this simplification.
For LV grids with larger SM coverage than the set threshold , it is assumed that the summed-up time series of the SM measured grid users is representative for the behavior of this LV grid. To achieve correct energy demand values at the MV node, the aggregated SM time series are multiplied with a scaling factor compensating for the missing power contributions from unmeasured customers. This factor is calculated as the ratio of the summed-up annual energy demand of all LV nodes of this grid and the summed-up energy demand recorded by SM .
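The scaling step for LV grids with substantial SM coverage can be sketched as follows. The function name `scaled_mv_series` and the toy numbers are assumptions made for illustration; the scaling factor follows the ratio described in the text.

```python
def scaled_mv_series(sm_series, annual_energy_all, annual_energy_sm):
    """Aggregate smart meter series and scale them to the full LV grid demand.

    sm_series:         list of power time series, one per smart-metered LV node.
    annual_energy_all: annual energy demand of every LV node in the grid.
    annual_energy_sm:  annual energy demand recorded by the smart meters.
    """
    # Sum the smart meter measurements for each time step.
    aggregated = [sum(step) for step in zip(*sm_series)]
    # Compensate for the missing contribution of unmeasured customers.
    factor = sum(annual_energy_all) / sum(annual_energy_sm)
    return [factor * p for p in aggregated]


# Toy example: two metered nodes, one unmetered node with the same total demand.
series = scaled_mv_series([[1.0, 2.0], [3.0, 4.0]],
                          annual_energy_all=[10.0, 20.0, 30.0],
                          annual_energy_sm=[10.0, 20.0])
print(series)
```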
In the last scenario SM
, the SM information cannot be assumed to be representative of the non-SM nodes due to the small sample size. Therefore, a set of
K exemplary, synthetic time series
for
is created. These exemplary time series will be used to estimate the parameters of an LPD representing this MV node (see
Section 3.3). The number of power profiles
K should be high enough to achieve sufficient sampling. A residential profile from comparable grids is randomly assigned to every LV node that is not an SM node. The profiles are scaled to the LV nodes’ annual consumption. For each
k, the synthesized time series
is now defined as the sum of the aggregated SM profiles
for the SM nodes and of the aggregated comparable profiles for the non-SM nodes
. In the extreme special case where no smart meter measurements are available at all (i.e.,
), each synthesized time series
consists only of aggregated, synthesized time series and
. This case will be evaluated in detail in our case study in
Section 4.
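The synthesis of exemplary time series for the low-coverage scenario can be sketched as below. The function names, the fixed time step of 0.25 h, the random seed, and the toy profile pool are assumptions of this sketch; in particular, drawing profiles with replacement via `rng.choice` is one possible reading of "randomly assigned".

```python
import random


def scale_to_energy(profile, annual_energy, dt_hours=0.25):
    """Rescale a power profile so that its energy equals annual_energy."""
    energy = sum(profile) * dt_hours
    return [p * annual_energy / energy for p in profile]


def synthesize_series(sm_series, non_sm_energies, pool, K, seed=0):
    """Create K exemplary MV node time series for a low-coverage LV grid.

    sm_series:       measured series of the smart-metered LV nodes (may be empty).
    non_sm_energies: annual energy demand of each LV node without a smart meter.
    pool:            comparable residential profiles from other grids.
    """
    rng = random.Random(seed)
    n_steps = len(pool[0])
    sm_sum = ([sum(step) for step in zip(*sm_series)]
              if sm_series else [0.0] * n_steps)
    result = []
    for _ in range(K):
        total = list(sm_sum)
        for energy in non_sm_energies:
            # Randomly assign a comparable profile and scale it to the node.
            profile = scale_to_energy(rng.choice(pool), energy)
            total = [a + b for a, b in zip(total, profile)]
        result.append(total)
    return result


pool = [[1.0, 2.0, 3.0, 2.0], [2.0, 2.0, 1.0, 1.0], [0.5, 1.5, 2.5, 0.5]]
series = synthesize_series([], non_sm_energies=[2.0, 3.0], pool=pool, K=5)
print(len(series), "synthesized series")
```

Because every assigned profile is scaled to the node's annual consumption, all K series carry the same total energy and differ only in their temporal shape.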
3.3. Determination of LPD Parameters
The mean
is calculated from the time series values as the sample expectation value
. This is equivalent to the annual energy consumption divided by the number of time steps
T (
). As the synthetic time series are scaled to the annual consumption, the mean is the same for each of the
K exemplary time series.
The elements
for the LPD covariance are calculated according to
Table 3: For
-pairs, where neither
n nor
m are assigned to scenario SM
, the covariance matrix elements are calculated as the sample covariance
. If at least one of the nodes is classified as SM
(i.e., insufficient measurement coverage), then the sample covariance is calculated for every
k and then averaged over all
k resulting in
. To calculate the sample covariance for
, the
k-th synthesized time series
is used for every
SM
and the measured time series
for every
SM
. To validate this approach, a statistical analysis of an exemplary set of residential active power profiles is presented in Section 3.4.
The steps of the Correlation-Aware LPD Synthesis Method, described in the sections above, are summarized in
Figure 2.
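The calculation of a single covariance element, including the averaging over the synthesized series, can be sketched as follows. The data layout is an assumption of this sketch: each node is represented by a list of time series that has length 1 for a measured node and length K for a node with synthesized series. The function names are hypothetical.

```python
def sample_cov(x, y):
    """Unbiased sample covariance of two equally long time series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def lpd_cov_element(node_n, node_m):
    """Covariance element for a pair of MV nodes.

    A measured node contributes its single series for every k; a node with
    synthesized series contributes its k-th series. The K sample covariances
    are then averaged. Two synthesized nodes are assumed to share the same K.
    """
    K = max(len(node_n), len(node_m))
    covs = []
    for k in range(K):
        x = node_n[k] if len(node_n) > 1 else node_n[0]
        y = node_m[k] if len(node_m) > 1 else node_m[0]
        covs.append(sample_cov(x, y))
    return sum(covs) / K


measured = [[1.0, 2.0, 3.0, 4.0]]
synthesized = [[1.0, 3.0, 2.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
print(lpd_cov_element(measured, synthesized))
```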
3.4. Correlation Analysis between LV Grids
For the statistical analysis, an exemplary set of residential active power profiles is taken from OpenEI for Washington state. OpenEI provides a “publicly available dataset of calibrated and validated 15-min resolution load profiles for all major residential and commercial building types and end uses, across all climate regions in the United States” [
31]. This database has previously been used by other authors for state estimation methods [
12]. Overall, 4947 individual residential profiles were chosen. The correlation coefficient between the individual power profiles is, on average,
.
For the analysis of correlations between subgrids with a varying number
d of households, for each pair of subgrid sizes (
and
), one thousand samples of
and
are randomly chosen. The required power profiles are taken from the pool of 4947 Washington state profiles. They are summed up to obtain MV node time series (
). Based on these profiles, the correlation coefficients are calculated. As a result, 1000 correlation coefficients are obtained for each pair
and
of subgrid sizes. The resulting means and standard deviations of the correlation coefficients are plotted as heat maps in
Figure 3. Even for subgrids consisting of only 20 customers each, the variance in correlation values drops to ∼4%. This indicates that the correlation of two similarly sized subgrids taken from a comparable region provides a reliable estimate for the correlation coefficient of two individual subgrids. Furthermore, if the size of the subgrids increases, so does the accuracy of the estimate: For a sufficiently large number of households in the subgrids (>80), the mean of correlation coefficients
is an accurate estimation of the correlation as the standard deviation for correlation coefficients
is lower than
.
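The sampling procedure of this analysis can be sketched with synthetic data. The toy profile pool below (a shared daily shape plus independent noise per household) merely stands in for the OpenEI profiles and is an assumption of this sketch, as are the function names, the sample counts, and the choice of disjoint household sets for the two subgrids.

```python
import math
import random
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def subgrid_correlations(pool, d1, d2, n_samples, rng):
    """Correlation coefficients between randomly composed subgrids."""
    coeffs = []
    for _ in range(n_samples):
        chosen = rng.sample(pool, d1 + d2)  # disjoint households (assumption)
        grid_a = [sum(step) for step in zip(*chosen[:d1])]
        grid_b = [sum(step) for step in zip(*chosen[d1:])]
        coeffs.append(pearson(grid_a, grid_b))
    return coeffs


rng = random.Random(42)
T = 96  # one day at 15-min resolution
shape = [1.0 + math.sin(2 * math.pi * t / T) for t in range(T)]


def toy_household():
    """Shared daily shape with a random scale plus independent noise."""
    scale = rng.uniform(0.5, 1.5)
    return [scale * s + rng.gauss(0.0, 1.0) for s in shape]


pool = [toy_household() for _ in range(300)]
for d in (1, 20):
    rho = subgrid_correlations(pool, d, d, n_samples=200, rng=rng)
    print(f"d={d}: mean={mean(rho):.2f}, std={pstdev(rho):.2f}")
```

In this toy setting, aggregating more households per subgrid averages out the independent noise, so the correlation between subgrids increases and its spread decreases, which mirrors the qualitative trend reported above.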
4. Case Study—Correlation–Aware State Estimation with Synthesized Load Profile Distributions
This section evaluates the accuracy of the proposed state estimation approach based on correlation-aware load probability distributions from two different perspectives:
The first perspective sheds light on the practical benefits expected from the approach when applied to a typical use case. The second perspective gives a more fundamental understanding of where the improvements in accuracy come from and what the specific differences are compared to conventional approaches. First, the performance metrics used in this evaluation are defined in
Section 4.1. The simulation environment is presented in
Section 4.2, and the different test cases with respect to SM coverage and load correlation assumption are given in
Section 4.3. Finally, the recognition of critical system states is evaluated in
Section 4.4, and the accuracy of the estimated voltage prior distribution is assessed in
Section 4.5.
4.1. Performance Metrics
To evaluate the recognition of critical system states, a critical system state must be defined first. A critical system state is present if a voltage band (
,
) or thermal current limit (
) is violated for any node or branch element. In this case study, voltage limits are set to ±6% of the nominal voltage, which are limits commonly used by distribution system operators (see [
30]). The thermal current limits are taken from the chosen test power system (defined below). From the state estimation results (step 5 of state estimation workflow in
Section 2.3), the probability of critical system states
can be calculated (see Equation (
A14)): It is the probability that the system state described by the posterior distribution
violates operational limits (e.g., voltage band or thermal current limits). Note that the voltage posterior distribution
can also be used to derive a posterior distribution of branch currents
via the branch admittance matrix
, since the branch currents
.
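As a hedged illustration of this step, assume that the voltage posterior is Gaussian and that branch currents depend linearly on node voltages. The symbols below are introduced here for illustration only and are not the paper's notation. A linear map of a Gaussian is again Gaussian, so that:

```latex
\mathbf{i} = \mathbf{Y}_{\mathrm{br}}\,\mathbf{v}, \qquad
\mathbf{v} \sim \mathcal{N}\!\left(\boldsymbol{\mu}_{v}, \boldsymbol{\Sigma}_{v}\right)
\;\Longrightarrow\;
\mathbf{i} \sim \mathcal{N}\!\left(\mathbf{Y}_{\mathrm{br}}\,\boldsymbol{\mu}_{v},\;
\mathbf{Y}_{\mathrm{br}}\,\boldsymbol{\Sigma}_{v}\,\mathbf{Y}_{\mathrm{br}}^{\mathsf{H}}\right)
```

For complex-valued quantities, the Hermitian transpose applies to the covariance; a real-valued formulation in stacked real and imaginary parts would use the ordinary transpose instead.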
Finally, a probability threshold (
) must be defined at which the state is classified as critical. A system state is classified as critical if any value in the range of
is critical. The range
contains
of all estimates from the posterior distribution. Hence,
of the distribution remain on each side (=
). Therefore, the probability threshold
is set to ≈15.9%. If the calculated probability of critical system states
exceeds this threshold, the value is inside the range
and therefore assumed to be critical. The key performance indicator (KPI), used to measure how well critical system states are recognized, is the aggregated true positive rate
(see [
30]). An estimate is considered a
true-positive if the estimate
for a grid element
(node or branch) for time step
correctly recognizes a critical system state corresponding to the given limit (
). In this case,
(otherwise,
). The aggregated true positive rate,
, is now given by
where
is the number of all limit violations (over all time steps and grid elements) in the true system state time series. In some cases, one is primarily interested not in the fraction of critical cases correctly recognized but in the fraction of critical cases
incorrectly missed. To measure this, the false negative rate
is used.
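The classification rule and the two rates can be sketched as follows. The sketch assumes a scalar Gaussian posterior for a voltage magnitude in per unit and uses the ±6% voltage band stated above; the function names and example values are hypothetical. The one-sigma tail probability of a normal distribution reproduces the threshold of about 15.9%.

```python
from statistics import NormalDist

# Probability mass beyond one standard deviation on one side (about 15.9%).
THRESHOLD = 1.0 - NormalDist().cdf(1.0)


def violation_probability(mean, std, v_min, v_max):
    """Probability that a Gaussian posterior lies outside [v_min, v_max]."""
    post = NormalDist(mean, std)
    return post.cdf(v_min) + (1.0 - post.cdf(v_max))


def is_critical(mean, std, v_min=0.94, v_max=1.06):
    """Classify a state as critical if the violation probability is too high."""
    return violation_probability(mean, std, v_min, v_max) > THRESHOLD


def true_positive_rate(estimated_flags, true_flags):
    """Fraction of true limit violations that the estimate also flags."""
    n_violations = sum(true_flags)
    if n_violations == 0:
        return float("nan")
    hits = sum(1 for e, t in zip(estimated_flags, true_flags) if e and t)
    return hits / n_violations


def false_negative_rate(estimated_flags, true_flags):
    """Fraction of true limit violations that the estimate misses."""
    return 1.0 - true_positive_rate(estimated_flags, true_flags)


print(f"threshold: {THRESHOLD:.4f}")
print(is_critical(mean=1.055, std=0.01), is_critical(mean=1.0, std=0.01))
```

The flags passed to `true_positive_rate` are meant to be collected over all time steps and grid elements, in line with the aggregated rate described above.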
The second KPI evaluates how accurately the prior state distribution can be estimated. The estimated and true prior distributions (the latter being available in the simulation) are compared in terms of their expectation values and covariance matrix components. The normalized root-mean-square error
is used as an error metric. It is defined by the
between true and estimated multi-component quantities,
and
, normalized to the mean of the true values:
We applied this error metric to the expectation value and covariance matrix components. The resulting KPIs are a measure of how close the estimated and the true prior distribution are.
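A minimal sketch of such a normalized error metric is given below. It assumes real-valued components and normalization by the arithmetic mean of the true values, following the verbal definition above; the exact form used in the paper may differ, and the function name `nrmse` is hypothetical.

```python
import math


def nrmse(true_vals, est_vals):
    """Root-mean-square error normalized to the mean of the true values."""
    n = len(true_vals)
    rmse = math.sqrt(sum((e - t) ** 2 for t, e in zip(true_vals, est_vals)) / n)
    return rmse / (sum(true_vals) / n)


# Toy example with two components.
print(nrmse([2.0, 2.0], [3.0, 1.0]))
```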
4.2. Simulation Environment
The present approach was tested in a simulation study to evaluate the effect of correlations in general and the performance of the synthesized correlation-aware load probability distributions. A 20 kV MV grid with 107 buses from [
32] (
1-MV-comm--0-sw) was taken as a test system. The model is used in the pandapower format [
33] and contains consumption and generation profiles for one year with a 15-min resolution for every MV node. The grid topology is shown in
Figure 4. The renewable generation plants directly connected to MV nodes include PV, wind, biomass, and hydropower units. In total, 10 large renewable energy source (RES) units and 19 large commercial customers are connected at the MV level. LV grids are located at 79 MV nodes. The number of customers per LV grid varies from 13 to 118.
In the original dataset from SimBench, load and generation profiles are assigned to nodes in the test model from a relatively small pool of profiles, resulting in the same profile being assigned to multiple nodes. While this is accurate enough for most grid analyses, it leads to unrealistically high correlations in the behavior of different loads. Since these correlations play an important role in evaluating the performance of the proposed approach, the test system must realistically reproduce the correlations between power profiles of different MV nodes. We, therefore, retain only the static specification of the grid topology and the load types from the SimBench data set but apply a bottom-up approach to synthesize the load profiles for them. Since the load profiles from Washington state (as described in
Section 3.4) will be used as comparable profiles to estimate LPDs for subgrids with low smart meter coverage, we draw household load profiles for the test system from an independent pool of 7000 OpenEI load profiles for a different state (New York state) [
31]. This ensures that our comparable load profiles are not unrealistically similar to the true load profiles in the test data. For MV nodes with underlying LV grids dominated by household profiles, every household is assigned a new, unique residential profile from that pool. The commercial profiles from SimBench are replaced by OpenEI profiles of the same type. The PV and wind profiles are taken from SimBench. For the reactive power profiles, the power angle
between active
and reactive power
is assumed to be constant.
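Deriving a reactive power profile from an active power profile under a constant power angle can be sketched as below. The function name and the power factor of 0.95 are illustrative assumptions; the value actually used in the case study is not stated here.

```python
import math


def reactive_power(p_series, cos_phi):
    """Reactive power for a constant power angle: Q = P * tan(phi)."""
    tan_phi = math.tan(math.acos(cos_phi))
    return [p * tan_phi for p in p_series]


# Hypothetical power factor; values in kW are converted to kvar.
q = reactive_power([10.0, 20.0, 15.0], cos_phi=0.95)
print([round(v, 2) for v in q])
```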
To simulate the true system state (ground truth), a power flow calculation is executed for every time step using the load profiles described above. Simulated measurements are then obtained from true system states without adding synthetic measurement errors. For the measurement scenario, all MV nodes with large commercial consumers and RES are assumed to be equipped with RLMs. Furthermore, five voltage PMU measurements (assumed accuracy:
) are placed at buses 2, 5, 23, 77, and 87 (see
Figure 4). For the household loads, two different test cases are investigated, corresponding to smart meter coverage of 0 and
.
In the terminology of
Section 3, this results in the following MIS for the different MV nodes:
The 5 MV nodes with real-time measurements and 21 MV nodes with RLMs (large commercial consumers and RES) are assigned to MV.
The 79 MV nodes with underlying LV grids are assigned to SM/SM/SM. The SM coverage is either 0 or , depending on the test case.
On this test dataset, the performance of the approach is evaluated in the context of a typical use case: the recognition of critical system states. To this end, the approach is integrated into a complete Bayesian state estimation workflow, described in detail in
Section 2.3. The workflow uses a single LPD for all state estimation tasks. Note that the accuracy of Bayesian state estimation can further be improved if a more accurate prior voltage distribution is available for a specific state estimation task. One example would be distinguishing between weekdays and weekends or different seasons, using a separate LPD (and hence prior voltage distribution) for each setting.
4.3. Test Cases
For the evaluation, four test cases are defined:
- Test case 1:
The SM coverage of all underlying LV grids is assumed to be
(SM
), i.e., 15 min “time-resolved” measured power profiles at every load are available for the estimation. The sample LPD covariance
is calculated according to our approach as described in
Section 3.3. Hence, it considers the behavioral correlations (correlation aware, denoted in results tables by
).
- Test case 2:
The SM coverage is also assumed to be , but the load correlations are not considered (correlation-unaware, denoted in results tables by ). This results in a diagonal covariance matrix of the LPD.
- Test case 3:
The SM coverage of
(SM
) is assumed for all LV grids, i.e., only the annual energy demand is available for the estimation approach. The LPD is calculated according to our approach as described in
Section 3.3, using a value of
and drawing comparable profiles for non-SM nodes (in this case, all nodes in the LV grid) from the pool of 4947 OpenEI power profiles described in
Section 3.4. Hence, it considers the load correlations (correlation aware, denoted in results tables by
).
- Test case 4:
The same SM coverage as for the third test (SM) is assumed, but the correlation is neglected (correlation-unaware, denoted in results tables by ).
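The difference between the correlation-aware and correlation-unaware test cases can be sketched as an operation on the LPD covariance matrix: the correlation-unaware variant keeps the variances and sets all off-diagonal elements to zero. The function name and the toy matrix are assumptions of this sketch.

```python
def drop_correlations(cov):
    """Return a diagonal copy of a covariance matrix (correlation-unaware)."""
    n = len(cov)
    return [[cov[i][j] if i == j else 0.0 for j in range(n)]
            for i in range(n)]


cov_aware = [[4.0, 1.2], [1.2, 9.0]]        # toy LPD covariance
cov_unaware = drop_correlations(cov_aware)  # variances kept, covariances removed
print(cov_unaware)
```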
4.4. Recognition of Critical System States
As mentioned above, the evaluation starts with a test to recognize critical system states. For every time step
, the Bayesian linear state estimation is conducted as described in
Section 4.2 based on LPDs calculated according to the approach described in
Section 3. From the resulting posterior distribution, the probability of critical system states
is calculated as described in
Section 4.1. To quantify the accuracy, the true positive rate
(see
Section 4.1) was used (The aggregated true-negative rate, which gives the percentage of correctly recognized non-critical system states, is for all eight cases larger than
). The results are shown in
Table 4 below.
First, the results of the SM scenario are compared. In this case, taking into account behavioral correlations for calculating the probability of critical system states improves by percentage points for voltage limit violations and by percentage points for thermal current limit violations. The picture is similar for the SM scenario: Here, improves by percentage points for voltage limit violations and by percentage points for thermal current limit violations.
The benefit of considering behavioral correlations becomes even more striking when viewed from the perspective of critical system states that are missed (i.e., not correctly recognized) by each approach. This can be captured, e.g., looking at the false negative rate : In the scenario SM where no smart meter measurements are available, the conventional approach misses of true critical system states caused by voltage limit violations. Using the approach based on the correlation-aware estimation of background load probability distributions in this article, that fraction drops to , a reduction by a factor of . Analogously, for thermal current limit violations, a reduction of the by a factor of is observed.
These findings confirm the importance of considering load correlations in DSSE algorithms, as found in the context of WLS state estimation by [
19]. They also demonstrate the relevance of the Correlation-Aware LPD Synthesis method presented in this article: the method enables correlation-aware state estimation even for nodes with no smart meter coverage. The first and third test cases (SM
vs. SM
, both with correlation awareness) result in similar true positive rates. Even without smart meter coverage, the accuracy is much better than what can be achieved without considering load correlations, even if background distributions are based on full historical smart meter data.
It is important to note that, as described in
Section 3.2, the proposed approach only uses
comparable load profiles for the estimation of correlations in the case SM
. These give us an approximate indication of the true behavioral correlation between loads in the system (in fact, the estimated correlation values differ from the empirical correlation in our ground truth by
on average, with a maximum value of
). However, as the results above show, this approximation is good enough to substantially improve the accuracy of the resulting state estimation and, consequently, the recognition of critical system states.
In summary, the results confirm that considering load correlations allows for substantially more accurate recognition of critical system states. Furthermore, with the approach described in this paper, this improvement can be achieved even in cases where no smart meter measurements are available that would allow an empirical estimation of load correlations. Instead, insights from a set of comparable (but different) load profiles from an entirely different geographical region can be used to achieve accuracy comparable to the case where full smart meter coverage is available.
4.5. Accuracy of Estimated Prior Voltage Distribution
To get a more detailed view of the improved state estimation accuracy behind the results in the previous section, we now look closer at the prior voltage distribution used in the Bayesian state estimation approach.
To achieve accurate state estimation results (and hence a reliable detection of critical system states), the approximated prior voltage distribution should represent the true empirical voltage distribution (i.e., the distribution of voltage values in the simulated ground truth system state, resulting from a power flow calculation for every time step) as accurately as possible. To illustrate this, we ran the state estimation with the true empirical prior voltage distribution, which results in recognition accuracies for critical system states of >99% (: , : ).
A closer look at the Bayesian state estimation algorithm shows that changes in the LPD covariance matrix only affect the covariance matrix of the prior voltage distribution (the expectation value is not affected). Hence, to assess the impact of changes in LPD covariances on the estimated prior voltage distribution, the calculated prior voltage covariance matrices are compared to the true empirical covariance matrix obtained from the simulated ground truth system state. Specifically, the following two characteristics of the prior voltage distribution are evaluated:
Standard deviation (overall variation of voltage values for each node);
Correlation coefficient (correlation between voltage values at different nodes).
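The following sketch illustrates why a change in the load covariances leaves the expected voltage untouched but alters the voltage covariance. It assumes a linearized relation V ≈ V0 + J·S between loads and voltages, a common simplification that is not necessarily the formulation used in this article. The sensitivity matrix `J`, the operating point `v0`, and all numerical values are hypothetical.

```python
import numpy as np

def propagate(v0, J, mu_s, cov_s):
    """Propagate load mean and covariance through V = v0 + J @ S."""
    mu_v = v0 + J @ mu_s      # depends on the load means only
    cov_v = J @ cov_s @ J.T   # depends on the load covariances only
    return mu_v, cov_v

def std_and_corr(cov):
    """Split a covariance matrix into standard deviations and correlations."""
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    return std, corr

# Hypothetical two-node example (per-unit quantities).
v0 = np.array([1.0, 1.0])
J = np.array([[-0.02, -0.01],
              [-0.01, -0.03]])
mu_s = np.array([0.5, 0.8])
sigma_s = np.array([0.2, 0.3])

# Load covariance without correlations (diagonal only) ...
cov_uncorr = np.diag(sigma_s**2)
# ... and with an assumed correlation of 0.6 between the two loads.
rho = 0.6
cov_corr = cov_uncorr + rho * sigma_s[0] * sigma_s[1] * (1 - np.eye(2))

mu_a, cov_a = propagate(v0, J, mu_s, cov_uncorr)
mu_b, cov_b = propagate(v0, J, mu_s, cov_corr)
std_a, corr_a = std_and_corr(cov_a)
std_b, corr_b = std_and_corr(cov_b)

print("means identical:", np.allclose(mu_a, mu_b))
print("std without / with correlations:", std_a, std_b)
print("voltage correlation without / with:", corr_a[0, 1], corr_b[0, 1])
```

In this toy setting, both the standard deviations and the correlation coefficient of the voltages change once load correlations are included, which is why the two characteristics are evaluated separately.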
On a qualitative level, the standard deviation calculated without considering load correlations always substantially underestimates the true standard deviation, i.e., the prior distribution assumes a much lower variation of voltage values than actually occurs (see Figure 5). If load correlations are considered, on the other hand, the resulting standard deviation is very close to the true empirical values (the deviations are small, with a maximum of ±2.7% relative to the mean).
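The underestimation effect can be reproduced in a small synthetic experiment. The sketch below assumes equally correlated loads and a single node whose voltage depends linearly on all loads. The number of loads, the load standard deviation, the correlation, and the sensitivities are hypothetical, and the signed relative deviation used here is chosen for illustration and is not necessarily the metric from Equation (13).

```python
import numpy as np

# Hypothetical setting: 20 loads with equal variance and pairwise correlation.
n_loads, sigma, rho = 20, 0.3, 0.4
cov_s = sigma**2 * (rho * np.ones((n_loads, n_loads))
                    + (1 - rho) * np.eye(n_loads))

# Hypothetical sensitivity of one node voltage to each load.
w = np.full(n_loads, -0.005)

# "Ground truth": empirical standard deviation from sampled load scenarios.
rng = np.random.default_rng(1)
loads = rng.multivariate_normal(np.zeros(n_loads), cov_s, size=20000)
std_true = (loads @ w).std(ddof=1)

# Calculated standard deviation with the full load covariance matrix ...
std_with = np.sqrt(w @ cov_s @ w)
# ... and with load correlations ignored (diagonal entries only).
std_without = np.sqrt(w @ np.diag(np.diag(cov_s)) @ w)

# Signed relative deviation from the empirical value in percent.
rel_with = 100 * (std_with - std_true) / std_true
rel_without = 100 * (std_without - std_true) / std_true

print(f"with correlations:    {rel_with:+.1f} %")
print(f"without correlations: {rel_without:+.1f} %")
```

Because all sensitivities have the same sign and the loads are positively correlated, dropping the off-diagonal covariance terms removes only positive contributions to the voltage variance, so the calculated standard deviation can only shrink.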
Table 5 summarizes the results separately for
and
. It shows the
(see Equation (
13)) for each of the two characteristics of the prior voltage distribution and for each of the four test cases from
Section 4.3, separately for magnitude and angle components of
V.
We first focus on the
for the standard deviation of the prior voltage distribution. If load correlations are not taken into account (
columns), relatively high
values (∼72% for voltage magnitude and ∼45% for voltage angle) are observed, confirming the qualitative observation from
Figure 5. This is true independently of the level of smart meter coverage. Accounting for load correlations, on the other hand, reduces that error dramatically in the case of full smart meter coverage. Even in the case where no smart meter measurements are available, using the approach presented in this paper, error levels can be reduced to ∼
(for
,
vs.
) and ∼
(for
,
vs.
) compared to the baseline
.
Regarding the error for the correlation coefficients of the prior distribution, an even more dramatic improvement can be observed. Although the error without load correlations is somewhat lower than for the standard deviation (∼41% for voltage magnitude and ∼36% for voltage angle), the gain in accuracy from considering load correlations is larger than before: in the case where no smart meter measurements are available, the approach reduces error levels to ∼ (for both and ) compared to the baseline.
The comparison of true and calculated
and
shows that the consideration of behavioral correlations in LPD for Bayesian DSSE strongly improves the accuracy of the estimated prior distribution
. This translates into a more accurate posterior estimate of
, which can be observed in our assessment of recognition of critical system states above (
Section 4.4).
Naturally, the accuracy achieved for the prior voltage distribution cannot be quite as high when no smart meter measurements are available as it is with full smart meter coverage. However, the results show that the approach presented in this article substantially improves the accuracy of the prior voltage distribution even without any smart meter measurements (see Section 4.5). In particular, the level of accuracy is sufficient for practical use cases, e.g., enabling a reliable recognition of critical system states (see Section 4.4).