1. Introduction
Power line communications (PLC) systems are becoming increasingly common and are a key component of smart grids, as they allow grid usage to be monitored without adding extensive infrastructure. Currently, PLC systems are commonly used for automatic meter reading and communication between electrical substations, but there is a wide range of potential applications, including dynamic load management, appliance and lighting control, device-specific billing, and power outage detection [1,2,3,4,5,6]. There are three classifications of PLC systems based on bandwidth: ultra narrowband (UNB) systems operate at 0.3–3 kHz, narrowband (NB) systems at 3–500 kHz, and broadband (BB) systems at 1.8–250 MHz [7]. Lower transmission frequencies have slower data transmission rates but are less affected by losses and are, therefore, more effective over longer distances. In contrast, broadband systems are limited to applications within homes or industrial facilities but have much faster data transmission rates.
Power line networks are far from ideal for signal transmission. The loads, wires, junctions, and network configuration all affect the signal transmission, resulting in systems with high variability and noise. PLC systems must, therefore, be customized for individual systems to effectively transmit signals, which requires developing models that characterize transmission pathways for individual networks.
Two types of PLC models have been developed: top-down and bottom-up models [8,9]. The top-down approach generates a parametric model based on transmission line (TL) theory [10] but does not incorporate information about the network topology and loads. The model parameters are found by fitting the model output to the data. The top-down modeling approach is best used when the full network information is unavailable but measurements of the network are available. Because the network topology is not incorporated into the model, the model can no longer be used if the network components or the topology are changed.
The bottom-up model uses TL theory in combination with the network topology and conductive pathways to compute signal throughput [11]. Unlike top-down models, bottom-up models can be easily modified if network components are changed. A number of previous works have constructed bottom-up models for different applications [8,9,12,13,14,15]. The disadvantages of bottom-up models are that they require full knowledge of the network topology and pathways and that they cannot easily be modified to match measurements.
Most PLC models have been developed for home networks and incorporate either 2-conductor or 3-conductor cables [8,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26], although several models have been extended to 4-conductor cables [27,28]. In this work, the target networks are industrial networks, which generally have 3-phase power throughout the facility, so we model the network using 4-conductor cables (3 phase conductors and a ground conductor). Industrial facilities typically have a large number of motors present, unlike home networks, so we incorporate motor models in our network.
Our previous work (Ref. [29]) combined top-down and bottom-up models to gain the advantages of both. It used a Bayesian inference and sampling method called Transitional Markov Chain Monte Carlo (TMCMC) to calibrate model parameters to data. TMCMC effectively calibrates up to approximately 20 parameters, but calibrating more became computationally intractable, and a large PLC network model can contain many more unknown parameters. This work instead uses a Bayesian inference method called mean-field variational inference to calibrate model parameters and uses a more complex industrial network model instead of the home network model in Ref. [29]. Our efforts begin with the analysis of a single network, then expand to multiple network realizations, and finally address the challenge of inferring load types. This paper is organized as follows: Section 2 describes the PLC network model for an industrial network, Section 3 describes mean-field variational inference, Section 4 contains results, and Section 5 contains conclusions and suggestions for future work.
3. Mean Field Variational Inference
To estimate model parameters, we use a variational Bayes approach. We seek a posterior distribution that summarizes what we know about the model parameters based on data and prior knowledge. In general, the posterior is calculated according to Bayes' rule:

$$P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)},$$

where $D$ are the observed data, $\theta$ are the model parameters, and $P$ designates a probability. $P(\theta)$ is called the prior probability and is based upon established knowledge in the area of a particular application. $P(D \mid \theta)$, the likelihood, measures the agreement between the model output and the simulated data for given parameter values. Hence, parameters that align with prior beliefs and result in a close match with the data have the highest posterior probability.
For this work, the prior probability is modeled with a uniform distribution whose bounds are given in Table 1. We define our likelihood using a Student's t-distribution, which resulted in faster convergence than a Gaussian likelihood:

$$P(D \mid \theta) = \prod_i t\big(D_i;\, f_i(\theta),\, \sigma^2,\, \nu\big),$$

where $f(\theta)$ represents our model output and $t(\cdot\,; \mu, \sigma^2, \nu)$ denotes a Student's t-distribution with mean $\mu$, variance $\sigma^2$, and degrees of freedom $\nu$. For our purposes, we set $\sigma^2$ and $\nu$ to be 0.4 and 1, respectively.
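As a concrete illustration, the likelihood can be evaluated numerically. The sketch below uses hypothetical helper names, assumes independent frequency points, and treats $\sigma^2$ as the squared scale parameter of the t-distribution (with $\nu = 1$ the variance is formally undefined):

```python
import math

def student_t_logpdf(x, mu, sigma2, nu):
    """Log-density of a Student's t distribution with location mu,
    squared scale sigma2, and nu degrees of freedom."""
    sigma = math.sqrt(sigma2)
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1) / 2 * math.log1p(z * z / nu))

def log_likelihood(data, model_out, sigma2=0.4, nu=1.0):
    """Sum of per-frequency log-densities: each data point is modeled as
    t-distributed around the corresponding model output."""
    return sum(student_t_logpdf(d, m, sigma2, nu) for d, m in zip(data, model_out))
```

A model output that matches the data exactly yields a higher log-likelihood than any offset output, which is what the optimization exploits.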
To apply Bayesian inference, the posterior $P(\theta \mid D)$ must be approximated, as it can rarely be computed directly. Markov chain Monte Carlo (MCMC) methods are generally used to draw samples from the posterior. In MCMC, samples are drawn using a Markov chain constructed to have the posterior as its stationary distribution. Unfortunately, when the number of parameters is large, MCMC methods often take a prohibitive amount of time, and convergence can be difficult to determine.
Variational inference methods provide an alternate approach to approximate otherwise intractable posterior distributions and are faster in most cases [41,42]. Assuming our posterior distribution comes from a specific family of distributions, we can cast the inference problem as an optimization problem. Variational inference methods most commonly use the Kullback–Leibler (KL) divergence to measure the difference between distributions. We denote a member of our family of distributions by $q(\theta)$.
Prior to performing inference, all network parameters are converted from the ranges shown in Table 1 to the range [0, 1] to ensure that they are all assigned equal weights. Accordingly, the variational family is chosen to be a Gaussian distribution truncated to the interval [0, 1]. Here, we start with a standard Gaussian distribution and use rejection sampling to truncate it according to our bounds. The mean-field assumption is that the parameters are independent, which is equivalent to requiring a diagonal covariance matrix. In general, the goal is to choose a variational family that is simple enough for efficient optimization but can capture a density close to the posterior.
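The rejection-sampling step described above can be sketched as follows (function and argument names are illustrative):

```python
import random

def sample_truncated_gaussian(mu, sigma, lo=0.0, hi=1.0, rng=random):
    """Draw from a Gaussian(mu, sigma) truncated to [lo, hi] by rejecting
    samples outside the bounds. Simple and adequate when the Gaussian
    mass inside the bounds is not vanishingly small."""
    while True:
        x = rng.gauss(mu, sigma)
        if lo <= x <= hi:
            return x
```

Rejection sampling keeps the shape of the Gaussian inside the bounds exactly, at the cost of discarding out-of-range draws.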
Minimizing the KL divergence between our variational family and the posterior is equivalent to maximizing the Evidence Lower Bound (ELBO) with respect to the variational parameters $\lambda$:

$$\mathrm{ELBO}(\lambda) = \mathbb{E}_{q(\theta;\lambda)}\big[\log P(D \mid \theta) + \log P(\theta) - \log q(\theta; \lambda)\big].$$
For details on the derivation of the ELBO, see Ref. [41].
To maximize our objective function, it is desirable to have gradients of the ELBO. Finding an exact gradient is typically difficult, as this would require gradients of both the model and the variational distribution. Because we cannot compute gradients of arbitrarily complicated PLC models and variational distributions, we use a stochastic gradient estimator. To optimize the ELBO with stochastic optimization, we need an estimator of the gradient that can be computed using samples from the variational distribution:

$$\nabla_{\lambda}\,\mathrm{ELBO}(\lambda) \approx \frac{1}{S} \sum_{s=1}^{S} \nabla_{\lambda} \log q(\theta_s; \lambda)\,\big[\log P(D, \theta_s) - \log q(\theta_s; \lambda)\big], \qquad \theta_s \sim q(\theta; \lambda).$$

This is called the score function gradient estimator, and its derivation can be found in Ref. [43]. As the derivative is taken with respect to the parameters of $q$, this approach does not require the derivative of the model output $f(\theta)$.
Due to the high variance of the score function estimator, using it alone often leads to poor results, and it is common to combine it with variance reduction techniques. Using a larger number of samples leads to improved gradient estimates and faster convergence, although it also increases computational cost. In our work, we use a control variate strategy, where terms with zero expectation, specifically chosen to reduce variance, are added to the estimator. Essentially, terms of the form $\nabla_{\lambda} \log q(\theta; \lambda)\, g(\theta)$ are replaced with $\nabla_{\lambda} \log q(\theta; \lambda)\, \big(g(\theta) - b\big)$, where $b$ is a constant and $g$ is not differentiated with respect to $\lambda$. The identity $\mathbb{E}_{q}\big[\nabla_{\lambda} \log q(\theta; \lambda)\big] = 0$ ensures this does not change the mean of our gradient estimator. A carefully chosen $b$ can reduce the variance of the estimator; such a $b$ is referred to as a baseline. In this work, our baseline is constructed using a decaying running average of samples from the objective function. For further discussion of baselines, see Ref. [44].
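The decaying running-average baseline can be sketched as follows (class name and default decay value are illustrative):

```python
class RunningBaseline:
    """Decaying running average of objective-function samples, used as the
    constant b in the control variate grad-log-q * (g - b)."""
    def __init__(self, decay=0.9):
        self.decay = decay
        self.value = 0.0
        self.initialized = False

    def update(self, sample):
        """Fold a new objective sample into the average and return it."""
        if not self.initialized:
            self.value = sample
            self.initialized = True
        else:
            self.value = self.decay * self.value + (1 - self.decay) * sample
        return self.value
```

Because the baseline tracks the typical magnitude of the objective, the residual $g(\theta) - b$ is centered near zero, which is what reduces the estimator's variance.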
To perform inference on our model, we use the probabilistic programming package Pyro [45,46]. The score function gradient estimator is computed using 14 samples, which are evaluated in parallel. An AdaGrad optimizer with an initial learning rate of 0.6 is used to drive the variational distribution toward the posterior [47].
4. Results
Model calibration relies upon data simulated using the PLC model with randomly generated parameters to compute a transfer function for subsequent inference of network parameters. This also allows a direct comparison between the randomly generated parameters and the inferred parameters. Noise at a level of 1.0 dB is added to the transfer function before applying mean field variational inference.
Each network model can have over 100 parameters, and a parameter space this high-dimensional can make inferring parameters challenging. Furthermore, many of the network parameters belong to cables or loads far from the main path between transmitter and receiver and have very little effect on signals between them. Therefore, this study uses sensitivity analysis to determine which parameters contribute least to the variance of the transfer function, similar to Ref. [29]. Computing a full sensitivity analysis, as was done in Ref. [29], is computationally expensive, so this study instead uses an approximate sensitivity analysis.
To dramatically reduce the number of model evaluations that would otherwise be required by a global sensitivity analysis, a one-at-a-time (OAT) approach is employed. Each parameter is varied ten times over values distributed throughout its domain while all other parameters are fixed at their nominal values, and the variation with respect to that parameter, defined as the two-norm of the difference from the nominal transfer function, is calculated. The total variation is defined as the sum of the variations due to each parameter. An estimate of the relative sensitivity of each parameter is obtained by normalizing its variation by the total variation. Parameters are then kept or discarded based on whether their contribution exceeds an adaptively selected threshold. A series of tests found that a one-half percent threshold often reduced the parameter space by more than half, and the runtime by a significant margin, while achieving virtually equivalent results to the full parameter model.
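The OAT screening procedure can be sketched as follows (function and argument names are illustrative; parameters are assumed to be pre-scaled to [0, 1] as described in Section 3):

```python
def oat_sensitivity(model, nominal, n_steps=10, threshold=0.005):
    """One-at-a-time sensitivity screening.
    model(params) -> list of transfer-function values; nominal maps each
    parameter name to its nominal value in [0, 1]. Returns the set of
    parameter names whose normalized variation meets the threshold
    (0.5% of the total variation here)."""
    base = model(nominal)
    variation = {}
    for name in nominal:
        total = 0.0
        for k in range(n_steps):
            trial = dict(nominal)
            trial[name] = (k + 0.5) / n_steps  # values spread across [0, 1]
            out = model(trial)
            # two-norm of the difference from the nominal transfer function
            total += sum((a - b) ** 2 for a, b in zip(out, base)) ** 0.5
        variation[name] = total
    grand_total = sum(variation.values()) or 1.0
    return {n for n, v in variation.items() if v / grand_total >= threshold}
```

On a toy model where one parameter dominates the output, the screen keeps that parameter and discards the near-inert ones, mirroring how insensitive network parameters are dropped before inference.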
4.1. Demonstration on Single Network
As an initial demonstration, a network is generated with random load types, but all the network parameters are assigned a “true” value of 0.25 so that the reader can easily identify mispredicted parameters. This demonstration is meant to quantify the accuracy of the inference methodology, so the loads are fixed to their true types.
The mean and variance of each sensitive parameter at each iteration step are shown in Figure 9. The network has a total of 133 parameters, 75 of which are sensitive parameters. All but 6 of them converged to mean values between 0.08 and 0.4, showing that mean field variational inference can accurately infer most of the sensitive parameters.
Table 4 shows the mean parameter values at the end of the run ordered by decreasing sensitivity. The most sensitive parameters are inferred very accurately, while the accuracy noticeably decreases for less sensitive parameters.
Figure 10 shows one component of the resulting transfer function along with the 95% confidence interval computed from the push-forward posterior, i.e., the distribution of the transfer function computed from samples drawn from the posterior. The figure shows that the inference methodology accurately fits the model to the true transfer function and that the uncertainties are highest at frequencies where the transfer function has a peak.
The focus of this work is on the calibration methodology, so for brevity, we refrain from a detailed study of the full transfer function matrix. However, the transfer function was found to be full rank across the entire frequency range, justifying its usage for MIMO applications.
One of the main advantages of a bottom-up model is that it can still be used if the network changes. Here, a new network is generated with random parameter values over the range [0, 1] and then calibrated with mean field variational inference. The network is then modified by powering off all of the loads in the two central rooms of the network. This is done by assigning them as constant loads with a fixed resistance and parasitic capacitance.
Figure 11 shows the transfer function of the model with the loads turned off compared with the true model. While there are slight discrepancies, the calibrated model is very accurate despite not having been trained using data from the modified network.
4.2. Demonstration on Multiple Networks
The results in Section 4.1 showed that the inference method could calibrate the model for a single realization of the network. Next, the method is repeated 10 times for different network realizations to demonstrate that the method is consistently able to calibrate the model. However, instead of setting the true values of the parameters to 0.25, all network parameters are randomly generated over the range [0, 1].
Figure 12 shows the mean transfer function and true transfer function for each realization. Slight inaccuracies can be observed, particularly at frequency peaks, but generally, there is good agreement between the data and calibrated model.
Table 5 shows the magnitude of the average error for parameters in order of decreasing sensitivity. Each error in the table is averaged across the indicated parameters and the 10 realizations. The 10 most sensitive parameters have an average error magnitude of 0.029, but the errors increase as the sensitivity decreases. Note that for parameters chosen randomly in [0, 1], randomly guessing the parameters would give an average error magnitude of 0.333. Since all the error magnitudes in Table 5 are significantly lower than 0.333, the inference method can consistently identify parameters, particularly those with high sensitivity.
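The 0.333 reference figure follows from the fact that the mean absolute difference between two independent uniform [0, 1] variables is 1/3, which a quick Monte Carlo check confirms:

```python
import random

random.seed(1)
n = 200_000
# Mean absolute difference between a true parameter and an independent
# uniform guess, both drawn from [0, 1]; analytically this equals 1/3.
mean_err = sum(abs(random.random() - random.random()) for _ in range(n)) / n
```

Any calibrated error magnitude well below this value therefore indicates that the inference is extracting real information about the parameters.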
4.3. Inferring Load Types
A natural extension of this analysis is the case where unknown load types must be inferred. In a home kitchen, for example, the presence of a fridge and microwave may be known, while additional appliances may not be, although a set of likely appliances can be assumed given the setting. It follows that one may be interested in resolving the identity of loads; however, it must first be assessed whether this is feasible given the selection of potential load types, the model to which they are applied, and the inference method.
In the previous sections, the load types (constant, double RLC, or motors) during inference were identical to the true load types. In reality, it may not always be possible to know all of the loads in the networks, especially as loads may change during operation. It may, therefore, be important to infer load types in the network from data.
We performed an analysis to identify how much the transfer function changed based on load types. For brevity, we summarize the method and findings. Individual loads in a network were changed to a different load type, then inference was performed to see if a set of parameters could be found that resulted in an accurate transfer function even though one of the load types was wrong. It was found that if a motor load was replaced with a different load type or vice versa, it was generally impossible for the calibrated model to have an accurate transfer function. In contrast, constant loads and double RLC loads could generally be interchanged, and the calibrated model output would still be accurate. This suggests that constant and double RLC loads are too similar to identify from each other but that motor loads can be distinguished from the other load types.
We treat the constant and double RLC load types as a single group represented by the constant impedance load, as it has fewer parameters, and a slightly modified version of the algorithm offered in Ref. [29] is employed to predict load types. The algorithm initializes by assigning random load types to each load. This assignment is evaluated by performing a full parameter inference and recording the likelihood. Then, three loads are randomly changed in type, and the change is accepted or rejected based on whether the subsequent full parameter inference yields better agreement than the previous assignment. This process continues for a predetermined number of steps, although it may become apparent to the user that changes are no longer being accepted, indicating potential convergence. The results are shown in Table 6.
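The greedy search can be sketched as follows (names are illustrative, and the full parameter inference that scores each assignment is replaced by a placeholder `evaluate` callback returning a goodness-of-fit value):

```python
import random

def infer_load_types(loads, evaluate, n_steps=200, n_flip=3, rng=random):
    """Greedy random search over load-type assignments. loads is a list of
    load names; types are 'motor' or 'constant'. evaluate(assignment) stands
    in for a full parameter inference and returns a score to maximize."""
    assignment = {name: rng.choice(["motor", "constant"]) for name in loads}
    best = evaluate(assignment)
    for _ in range(n_steps):
        proposal = dict(assignment)
        # flip the type of three randomly chosen loads
        for name in rng.sample(loads, min(n_flip, len(loads))):
            proposal[name] = "motor" if proposal[name] == "constant" else "constant"
        score = evaluate(proposal)
        if score > best:  # accept only if agreement improves
            assignment, best = proposal, score
    return assignment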
The inference successfully identified all twenty-two of the unknown load groupings. While the method succeeded in this case, it should be noted that a large number of iterations may be required for the random triplets to happen upon the exact load identities. For this problem, with a limited number of iterations, it is more often the case that all but a random few will be inferred correctly. It should also be mentioned that positionally symmetric loads may interfere with the accuracy of load type inference. Positionally symmetric loads, in this context, are loads that are connected independently and identically to a network. As an example, two loads branching off of the same node by wires of equivalent length and matching properties are positionally symmetric. In particular, if the loads were swapped, the transfer function would be unaffected, as the network has effectively not changed. Consequently, in the presence of positionally symmetric loads, this method of load type inference would achieve equivalent objective function values whether it predicts the correct load types or a swapped version of them, and the swapped version may label individual loads incorrectly.
The success in our case is also attributable to the load types under consideration. When considering motor versus constant impedance loads, the impact of the two load types is substantially different, allowing for accurate resolution via tangible differences in the transfer function, even after inferring optimal load parameters. Additionally, accounting for only two load types simplifies the problem—more robust algorithms would likely be required if operating on a larger set of load types.
5. Conclusions
This work developed a PLC model for three-phase industrial networks and used mean field variational inference with sensitivity analysis to calibrate the model. The calibrated model has the advantages of both bottom-up and top-down models, as it incorporates full knowledge of the network topology but can also be tuned to match data. The results showed that many of the parameters could be inferred accurately. When averaged across different network realizations, the 10 most sensitive parameters had an average error of 2.9% after calibration, and the 40 most sensitive parameters had an average error of 7.1%. This work also found good agreement between the transfer function produced by the calibrated model and a network modified by powering off loads, suggesting that the model maintains accuracy when the network undergoes changes such as loads being powered off.
This study used mean field variational inference for calibration, whereas our previous work (Ref. [29]) used transitional Markov chain Monte Carlo (TMCMC). The key difference between the methods is that mean field variational inference assumes the distribution of each parameter is independent of all other parameters and follows an assumed distribution profile, which in this case is Gaussian. While these assumptions allow mean field variational inference to converge to a solution faster than TMCMC and work better in high-dimensional parameter spaces, care must be taken that it is only applied to problems where the assumptions are valid. Our previous work found that very few network parameters were strongly correlated and that the distributions generally had a single mode. That, together with the accurate results obtained here, justifies using mean field variational inference.
Our analysis of inferring load types found that motor loads could be distinguished from constant or double RLC loads. In a 6-room network, all 22 motor loads were inferred correctly, but constant loads could not be distinguished from double RLC loads. This raises the question of what kinds of loads are needed for a PLC model. It is possible that a reasonable PLC model could contain one of those two types of loads rather than both.
The accuracy of the network structure and load models should be considered when analyzing PLC models. In this and many other previous works, both the network structure and load models were generated using intuition and generalizations rather than from statistics in actual applications. While this is done due to the lack of available statistics and does not discredit such works, it must be taken into account that the models may not reflect PLC networks in actual applications.
There are several directions future work could explore. One area where more research is needed is accurate load modeling: most load models in PLC network models are unrealistic due to the lack of data from loads common in households and industrial networks. More work is also needed to infer load types and cable parameters from data; here and in Ref. [29], a greedy search algorithm is used, but that method is expensive and can be inaccurate. Future work could further explore the potential for MIMO applications in industrial facilities. Finally, research is needed to measure facilities and develop calibrated models customized to each facility.