1. Introduction
In this first section, we present an overall introduction to Smart Grid technologies from the point of view of their current challenges, which stem from the lack of high-fidelity data.
While high-quality data (i.e., data sets that ensure the collected PMU data are correct and usable for decision-making and analysis) would be desirable, we consider that Smart Grid research requires high-fidelity data, since such data will often be used in simulations and/or technical models; i.e., data sets must ensure that the PMU data are a precise, detailed, and realistic representation of reality.
The evolution of traditional power grids into Smart Grids marks a transformative leap toward more efficient, reliable, and sustainable energy systems. Leveraging advanced digital technologies, Smart Grids can enable real-time monitoring, prediction, and control of electricity flow from generation to consumption, addressing the increasing complexity and variability of modern energy demands, especially with the integration of renewable sources [
1]. These grids are designed to enhance energy distribution efficiency, reduce operational costs, and improve resilience against disruptions [
Through two-way communication systems between utilities and consumers, Smart Grids also enable dynamic and responsive grid management, allowing real-time adjustments based on consumption patterns and supporting better demand response and load balancing. This positions Smart Grid research as the foundation of a more intelligent and adaptive energy system on a worldwide scale.
Phasor Measurement Units (PMUs) are integral to advanced grid infrastructure, delivering high-resolution, time-synchronized measurements of electrical waveforms, referred to as synchrophasors. These measurements provide a precise, real-time depiction of grid status, which is critical for immediate disturbance detection and response [
3].
PMUs are indispensable for applications such as monitoring grid stability, optimizing power flow, and identifying faults or anomalies [
4]. Their capability for high-speed, accurate data acquisition is particularly crucial in Smart Grids research, where complex and interconnected systems demand timely and precise information for reliable operation.
Despite their significance, the full utilization of PMUs is constrained by challenges in data quality and data acquisition. The availability of real-world PMU data is limited due to issues related to privacy, security, and the inherent variability of grid conditions. These constraints impede the ability of researchers, developers, and related industries to access the high-fidelity data necessary for testing and validating new technologies aimed at enhancing grid performance.
High-fidelity data from PMUs are crucial for the optimal functioning of Smart Grids, enabling precise modeling of grid behavior as well as the development of advanced control algorithms and predictive maintenance strategies [
5]. The scarcity of real-world PMU data necessitates the use of synthetic data, which can effectively simulate grid operations and test new approaches in a controlled setting. For synthetic data to be useful, it must be accurate, scalable, and adaptable to various grid configurations. As the demand for advanced and resilient energy systems increases, the importance of frameworks that generate high-fidelity synthetic datasets will grow accordingly [
6].
This article presents a new application for the generation of synthetic datasets of PMU measurements. Unlike the recent literature on synchrophasor estimation techniques, this article introduces an open-access tool intended to establish a common basis for comparing real-world datasets in terms of phasor measurements. The objective is not a further improvement of estimation accuracy or reporting latency. Conversely, the focus is on the definition of a white-box model that can be easily customized in order to reproduce realistic streams of PMU measurements. The foreseen application is the generation of synthetic datasets for the characterization and validation of machine learning applications, e.g., state estimators, fault locators, virtualized protections, etc.
The subsequent sections of this article explore the development of a digital metrology framework designed to address the needs mentioned above, based on the creation of high-fidelity and metrologically relevant synthetic PMU data, to drive the next generation of Smart Grid research and innovation. We consider that, otherwise, the scarcity of real-world PMU data will continue to restrict significant advances in the field.
Section 2 details the challenges of real-world PMU data for Smart Grids and the need for synthetic data to foster innovation in the field.
Section 3 presents the needs and benefits of synthetic data for research and development in Smart Grid.
Section 4 introduces the concept of sensor network metrology as a significant path for research and the future of metrology, mainly focusing on the potentiality of synthetic data for the development of digital representations.
Section 5 presents the overall conceptual development of the framework.
Section 6 details the presented case study and describes the synthetic PMU data generator, as the base for a digital metrology framework. Finally,
Section 7 concludes the potential of the proposed framework while
Section 8 mentions further potential developments, relevant to the presented Open Science aspects, to take into consideration on the path towards open research on Smart Grids.
2. Challenges in Obtaining Real-World PMU Data
The development and implementation of Smart Grid technologies heavily depend on access to accurate and comprehensive data. Phasor Measurement Units (PMUs) are crucial in this context, providing real-time, high-resolution data indispensable for monitoring grid stability, detecting faults, and optimizing power flow.
However, despite the critical importance of PMU data, there is a significant scarcity of real-world datasets available for research and development.
2.1. Data Scarcity and Privacy Concerns
One of the most significant reasons for the scarcity of PMU data is data privacy. PMU data reveal sensitive information about power grids, which poses security risks if exposed. Thus, utilities are reluctant to share such data in the face of sophisticated threats to cybersecurity [
7]. The potential for data breaches and unauthorized access complicates not only the sharing of PMU data but also Smart Grid research in general [
8]. Additionally, stringent data privacy regulations and compliance requirements hinder data sharing, particularly across borders.
These legal constraints further complicate the acquisition of real-world PMU data for Smart Grid research. Adding to this issue, only a few utilities have deployed PMUs at a scale that provides comprehensive insights, and even when deployed, the data are often restricted to internal use, limiting external research opportunities [
7].
2.2. Variability and Inconsistencies
A major challenge in obtaining real-world PMU data is the inherent variability in grid operations. Factors such as weather, demand fluctuations, and renewable energy integration cause significant differences in data over time and locations [
9].
This variability poses a challenge for researchers who need consistent, high-fidelity data to develop and test new algorithms, models, and technologies. In many cases, the data available from PMUs are incomplete or lack the resolution necessary for detailed analysis. For instance, certain grid events or disturbances may be only partially captured or missed entirely, depending on the placement and sensitivity of the PMUs. This can lead to gaps in the data that hinder the ability to accurately model grid behavior and predict potential issues.
Inconsistencies in data collection methods and standards across different utilities further complicate the situation. PMU devices may have varying levels of accuracy, sampling rates, and synchronization capabilities, leading to discrepancies in the data they produce. These inconsistencies make it difficult to aggregate data from multiple sources or compare results across different studies, limiting the ability to generalize findings or apply them to broader contexts.
2.3. Impact on Research and Development
The combination of data scarcity, privacy concerns, and variability presents a significant barrier to advancing Smart Grid technologies. This lack of properly curated PMU data also hinders the ability to conduct large-scale simulations and stress tests, which are critical for understanding how new technologies will perform under different conditions.
As a result, the pace of innovation in Smart Grid technology is slowed [
10], with potential breakthroughs being delayed or missed altogether, as evidenced by the absence of the essential characteristics of big data [
11,
12], i.e., the 5 Vs of Big Data (volume, variability, velocity, variety, and value), which are described in detail in [
13]. Given these challenges, developing alternative approaches to real-world PMU data acquisition becomes essential.
By creating high-fidelity synthetic PMU data that accurately mimic real-world conditions, researchers can overcome the limitations of data scarcity and variability, paving the way for more rapid and robust advancements in Smart Grid technologies. This is why synthetic data generation, as discussed in the subsequent sections, plays a crucial role in Smart Grid research [
14].
3. The Need for Synthetic Data in Smart Grid Development
As the complexity and scale of power grids continue to grow, the need for accurate, comprehensive data becomes ever more critical in the development of Smart Grid technologies. However, as discussed in the previous sections, the challenges associated with obtaining real-world Phasor Measurement Unit (PMU) data—such as scarcity, privacy concerns, and variability—create significant barriers to innovation.
To address these challenges, synthetic data has emerged as a powerful tool—currently considered crucial—to bridge the gap between the need for high-fidelity, metrologically relevant data and the limitations of real-world data availability.
3.1. Benefits of Synthetic Data
Synthetic data refers to artificially generated data that accurately mimic the characteristics and patterns of real-world data. In the context of Smart Grids, synthetic PMU data can be created to simulate a wide range of grid conditions and scenarios [
15].
This capability is invaluable for researchers and developers who need to test new algorithms, models, and technologies in a controlled and repeatable environment. Hence, by using synthetic data, it becomes possible to explore the behavior of the grid under extreme or rare conditions that might not be easily captured in real-world datasets, if at all.
Another significant benefit of synthetic data is their ability to provide a safe and secure environment for experimentation [
15]. Since synthetic data are generated artificially, they do not contain any sensitive or proprietary information that could pose security risks if shared or published. This allows collaboration across institutions, since sharing data openly and adhering to data privacy regulations becomes simpler, without compromising the integrity of critical infrastructure.
Furthermore, synthetic data can be generated in large volumes and tailored to specific research needs. This scalability is particularly important for conducting stress tests, simulations, and other forms of analysis that require vast amounts of data to produce statistically significant results [
15].
3.2. Metrological Perspective and Impact
From the metrology perspective, synthetic data allows for the controlled generation of traceable datasets, ensuring consistent reference points for the validation of algorithms and models in Smart Grid applications.
By simulating grid conditions with known uncertainties, synthetic data enables a more precise assessment of measurement uncertainty, a critical factor for improving the reliability of power system monitoring and control. Additionally, synthetic datasets can be tailored to evaluate performance under diverse operational scenarios, facilitating robust uncertainty analysis and ensuring that developed solutions align with traceable, standardized measurement frameworks.
Researchers can therefore generate and use metrologically relevant datasets covering a wide variety of scenarios, since the proposed synthetic PMU data generator can be fed with point-on-wave (PoW) inputs ranging from normal operating conditions to extreme events such as cascading blackouts or cyber-physical attacks, providing a comprehensive testing ground for new Smart Grid technologies.
Point-on-wave refers to the precise measurement and monitoring of electrical parameters at specific points within a power wave. Essentially, it involves capturing data on the exact phase angle of the voltage and current waveforms in real time [
16].
3.3. Overcoming Limitations of Real-World Data
While real-world PMU data are invaluable, they are often limited in scope and quality due to factors such as inconsistent data collection, incomplete datasets, and the natural variability of grid operations [
17].
These limitations can hinder the ability to fully understand and predict grid behavior, especially in the face of emerging challenges like the integration of renewable energy sources, increased demand, and the threat of cyber-physical attacks [
7].
Synthetic data addresses these issues by offering consistent, tailored datasets that fill gaps in real-world data, improving model accuracy and supporting the development of resilient Smart Grid technologies. It can also simulate future grid scenarios, enabling proactive planning and testing of algorithms in varied conditions, ensuring robustness, and reducing risks before real-world deployment [
15].
3.4. Evolving Smart Grid Scenarios
Another critical advantage of synthetic data is their ability to simulate future grid conditions and scenarios that have not yet been observed in the real world. As Smart Grids evolve and new technologies are integrated, the grid’s behavior may change in ways that are difficult to predict based solely on historical data. Synthetic data allows researchers to explore these potential future states, enabling proactive planning and development that can anticipate and address future challenges before they arise [
17].
This advantage of synthetic data over real-world data is particularly important in the context of Smart Grids, where the stakes are high and the margin for error is small. By thoroughly testing new technologies with synthetic data, researchers can reduce the risk of unexpected failures and ensure that innovations are ready for deployment.
3.5. Enabling Innovation in Smart Grid Technologies
The ability to generate high-fidelity synthetic PMU data is not just a convenience; it is a necessity for the continued advancement of Smart Grid technologies.
As the energy landscape becomes more complex and the demand for reliable and sustainable power grows, the need for innovative solutions becomes increasingly urgent. Synthetic data provides the foundation upon which these innovations can be built, offering researchers the tools they need to explore new ideas, test new approaches, and push the boundaries of what is possible in the world of Smart Grids [
18]. By overcoming the limitations of real-world data and providing a flexible, scalable, and secure environment for research, synthetic data generation is poised to play a central role in the next wave of Smart Grid development [
19].
The creation and use of synthetic PMU data will be crucial in ensuring that Smart Grids are not only more efficient and resilient, but also capable of meeting the challenges of the future [
20]. In this sense,
Table 1 summarizes the discussed benefits of synthetic PMU data as a central part of Smart Grid development.
4. Synthetic PMU Data Generator for Digital Metrology
As we move into the digital age, the field of metrology is undergoing a profound transformation [
21]. At the heart of this evolution lies sensor network metrology, a promising new area that has the potential to redefine how we measure and interpret the world around us.
This emerging domain not only exemplifies the convergence of traditional metrological principles with cutting-edge digital technologies [
22], in particular for the design and development of Smart Cities, but it also aligns with the growing demand for real-time, high-resolution data across various sectors, also known as systems of systems (SoS) [
23].
4.1. Sensor Network Metrology: A New Paradigm
Sensor networks mark a significant evolution from isolated, standalone measurement instruments and related metrological services to highly integrated, interconnected systems (and corresponding metrological services) capable of real-time, continuous monitoring and analysis across vast spatial and temporal scales.
This shift is largely due to innovations in wireless communication, advanced data processing techniques, and the miniaturization of sensor technology. Unlike conventional measurement tools that typically function in static environments with limited flexibility, sensor networks are designed to adapt and operate in dynamic, ever-changing conditions.
This adaptability enhances the scope and accuracy of measurements, allowing for more comprehensive monitoring solutions. According to [
24], these interconnected systems play a crucial role in system life cycle management by enabling greater interoperability and system resilience across various stages of development. Typical characteristics of Systems of Systems include [
25]:
Operational and managerial independence;
Geographical distribution;
Emergent behavior;
Evolutionary development;
Heterogeneity of constituent systems.
This SoS transformation is driven by advancements in wireless technology, data processing, and the miniaturization of sensors. Unlike conventional measurement systems, sensor networks can therefore operate in dynamic environments, providing a more comprehensive and adaptive approach to metrology.
For instance, in environmental monitoring, sensor networks can be deployed across vast areas to collect data on temperature, humidity, and pollutants in real time. These data are then processed using sophisticated algorithms, allowing for predictive analytics and more accurate modeling of environmental changes [
26].
The potential applications of sensor network metrology are vast and extend beyond Smart Cities [
27], ranging from industrial automation to healthcare and agriculture.
4.2. Advantages and Challenges of Sensor Network Metrology
The benefits of sensor network metrology seem, at first sight, evident.
Firstly, it allows for the continuous collection of data and bidirectional communication, offering a real-time snapshot of conditions that can be critical in fields like manufacturing, where even slight deviations can have significant consequences; this enables industrial big-data-driven decision-making in the field of intelligent manufacturing [
28].
Secondly, the distributed nature of sensor networks enhances the robustness and reliability of measurements, as data sets from multiple sensors can be cross-referenced and validated. This is one of the most significant advantages of sensor networks: the system of systems (SoS) can be regarded as one overall measuring system, bringing to light complex solutions for distributed estimation [
29].
However, the implementation of sensor network metrology is not without challenges. One major hurdle is the need for standardized protocols and calibration methods to ensure the accuracy and interoperability of sensors within the network. Additionally, managing the vast amounts of data generated by these networks requires advanced data storage and processing solutions, as well as robust cybersecurity measures to protect sensitive information, given the multifaceted challenges of big data security and privacy [
30].
The role of smart sensors within systems of systems (SoS) in enhancing the interoperability of Smart Grids through standardization has been discussed in depth [
31]. Integrating sensor networks by means of smart sensors, such as PMUs, focuses on ensuring bidirectional communication and efficient data exchange within complex systems. This aids current research and the further development and deployment of solutions that adhere to standardized frameworks for monitoring and controlling grid operations [
32].
4.3. The Role of Digital Representations
The future of sensor network metrology relies mostly on digital representations, differentiated in
Figure 1. Digital representations, with grid-related examples, can be classified as follows [
33]:
Digital Model: A digital representation of a physical system or object (e.g., a network infrastructure map that utilizes data from a fixed point in time);
Digital Shadow: A digital model that integrates automated one-way data flow from the physical system or object (e.g., a network infrastructure map that pulls data from the system to dynamically update inventory, asset state, and constraints); and
Digital Twin: A digital model that integrates two-way data flow between the model and the physical object or system, where making a change to one can change the other; for example, a control center network map that displays real-time system status and enables engineers to control assets to mitigate issues (e.g., a network infrastructure map that utilizes real-world data to adapt its model and the corresponding predictions in order to control the monitored system).
Focusing on digital shadows and twins (also known as virtual replicas) of physical systems, researchers can simulate, predict, and optimize performance [
34]. By incorporating data from sensor networks into digital shadows/twins, it becomes possible to create highly accurate models that can anticipate system behavior under various conditions [
35], which makes them perfectly suited for synthetic PMU data-driven Smart Grid research.
4.4. The Road Ahead for Digital Metrology
The future of metrology, as shaped by sensor network technology, is likely to be characterized by increased collaboration between national metrology institutes (NMIs), industry stakeholders and research organizations [
21]. To fully realize the potential of sensor network metrology, concerted efforts will need to be made to develop new standards, protocols, and calibration techniques that can accommodate the complexities of these systems [
35].
Sensor network metrology represents a significant step forward in the digital transformation of metrology [
21]. As we expand into Smart Grid development and deployment, the continued integration of sensor networks into the metrological frameworks will be essential in meeting the demands of an increasingly data-driven world.
4.5. Frameworks Developed for Research on Digital Metrology
The development of digital metrology frameworks, mainly grounded in metrologically relevant synthetic data generators, represents a critical advance in sensor network metrology, particularly for applications such as Smart Grids and Smart Cities [
36].
These frameworks can establish a foundational knowledge base by addressing key metrological aspects, including the generation, validation, and application of synthetic data under traceable methods, like the one presented here for PMU data sets. By providing high-fidelity data, the suggested Smart Grid framework will allow researchers to simulate various scenarios and conditions, thereby improving the understanding of sensor network behavior and performance in different operational contexts [
37].
The basis for the Smart Grid research framework proposed here is the culmination of a series of PMU-based research and development efforts for Smart Grids, which include [
38,
39,
40,
41,
42].
We believe that this synthetic PMU data generator is a crucial milestone for developing reliable models and algorithms that can effectively manage and interpret dynamic environments. It also sets a precedent for future studies in sensor network metrology and paves the way for innovations in technologies such as Smart Grids and Smart Cities, where precise, accurate, and adaptable metrologically relevant data and metadata will be essential for optimal system performance, monitoring, and resilience.
5. Synthetic PMU Data Generator for the Proposed Frameworks
The proposed framework for digital metrology will be able to generate and utilize synthetic Phasor Measurement Unit (PMU) data from point-on-wave (PoW) inputs, and it is designed to address the limitations of real-world data by providing robust, adaptable, and accessible solutions for Smart Grid research.
The proposed minimum architecture of this type of framework consists of three key components, each serving a specific role in ensuring the generation, validation, and application of high-fidelity synthetic data. These components are (1) the Synthetic Data Generation Module (in the present article, synthetic PMU data, to be precise); (2) validation and verification modules ensuring metrological relevance; and (3) integration modules for Smart Grid simulators, based on reusable data structures and syncing procedures.
Although the overall design of these modules is beyond the scope of the current article, we consider it necessary to briefly describe them in order to properly define the context for the presented synthetic PMU data generator.
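As an illustration only, the following Python sketch outlines how the three modules could be exposed programmatically. The class names, method signatures, and the PmuStream container are hypothetical and do not reflect the actual interface of the released tool; they merely make the division of responsibilities concrete.

```python
from dataclasses import dataclass
from typing import Protocol
import numpy as np


@dataclass
class PmuStream:
    """Hypothetical container for a stream of synthetic PMU measurements."""
    timestamps: np.ndarray   # reporting instants, s
    phasors: np.ndarray      # complex synchrophasors, pu
    frequency: np.ndarray    # Hz
    rocof: np.ndarray        # Hz/s
    flags: np.ndarray        # quality flags (e.g., WDF/NDF)


class DataGenerationModule(Protocol):
    """(1) Synthetic Data Generation Module: PoW samples in, PMU stream out."""
    def generate(self, pow_samples: np.ndarray, fs: float) -> PmuStream: ...


class ValidationModule(Protocol):
    """(2) Validation and verification: compare a stream against a reference."""
    def validate(self, stream: PmuStream, reference: PmuStream) -> dict: ...


class SimulatorIntegrationModule(Protocol):
    """(3) Integration with Smart Grid simulators via a reusable data structure."""
    def export(self, stream: PmuStream, path: str) -> None: ...
```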
5.1. Data Generation Module
At the core of the proposed framework is the Data Generation Module, which, in the present case study, is responsible for creating synthetic PMU data for Smart Grid analytics. This module employs advanced algorithms and models that are metrologically relevant to simulate the electrical characteristics of power grids under various conditions, as presented in detail in the following section.
Synthetic data generators allow different types of simulation environment modules to test stochastic models, machine learning techniques, and system dynamics models [
43], under metrologically relevant considerations [
44].
5.2. Validation and Verification of Synthetic Data Generator
Ensuring the accuracy and reliability of synthetic data will be crucial for its effective application in Smart Grid research.
To achieve this, all digital metrology frameworks must also incorporate rigorous validation and verification processes. Validation will involve comparing synthetic data against known benchmarks and real-world datasets to assess their reliability and accuracy, as simplified in [
45].
This validation process may include statistical analysis, consistency checks, and scenario-based testing. Verification, on the other hand, will ensure that the data generation algorithms and models function correctly and produce reliable results. Techniques such as cross-validation, sensitivity analysis, and error analysis are typically used to confirm the integrity of the data and the correctness of the models [
46].
Altogether, these processes, implemented by means of in-environment modules (i.e., additional hardware components and/or software modules), will ensure that the synthetically created PMU datasets meet the specifications required by digital shadows/twins for research, development, and application in the field of Smart Grids.
5.3. Integration with Smart Grid Simulators
The Synthetic PMU Data Generator presented in the following section has been designed to integrate seamlessly with existing Smart Grid simulators and/or tools, i.e., our openly accessible asset is intended to be a simulator-environment-independent tool.
Hence, the integration must be achieved through standardized interfaces and known structured data exchange protocols, which facilitate interoperability between the proposed digital metrology framework and any other existing Smart Grid simulation platform [
47]. By incorporating the synthetic PMU data generated by the proposed digital metrology framework into any other Smart Grid simulator, researchers and developers can test and validate their newly developed technologies and algorithms in their own controlled environment, without having to deal with unstructured data or develop their own parsers and/or importing modules, thus reducing the investment required for further development [
48].
6. Synthetic PMU Data Generator
In this section, we introduce the PMU Data Generator and we validate its performance in standard and realistic test conditions. Firstly, we briefly describe the main algorithmic steps and the input and output variables of the generator. The algorithm performance is evaluated by means of full compliance testing against the P-class requirements of the IEC 60255-118-1 [
49]. Secondly, we provide a simple user guide for customizing the Synthetic PMU Data Generator according to the specific requirements of the particular dataset under analysis. Finally, we present the results obtained on two well-known datasets taken from the recent literature on PMU-based measurement applications.
6.1. Modeling Assumptions
In modern power systems’ theory, voltage and current waveforms are typically represented as a linear combination of different components [50]:

$$x(t) = A\,\big[1 + \varepsilon_A(t)\big]\cos\!\big(2\pi f t + \varphi + \varepsilon_\varphi(t)\big) + \eta_{\mathrm{nb}}(t) + \eta_{\mathrm{wb}}(t), \qquad (1)$$

where t is the independent time variable and the parameters A, f, and φ denote the amplitude, frequency, and initial phase of the fundamental component. The functions ε_A(t) and ε_φ(t) account for any variation of amplitude and phase over time. The additional terms η_nb(t) and η_wb(t) model narrow- and wide-band spurious components. The former includes harmonic and out-of-band components that can be well approximated by sinusoidal tones. The latter is typically related to measurement noise or any continuous spectrum disturbance that cannot be represented as a sum of sinusoidal tones.
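To make the signal model concrete, the following Python/numpy sketch builds a point-on-wave test waveform with the structure of (1). The modulation depths, the harmonic and inter-harmonic content, and the 80 dB SNR are illustrative choices, not prescribed values of the generator.

```python
import numpy as np

fs = 10_000            # sampling rate, Hz (one of the supported values for 50 Hz systems)
f0 = 50.0              # nominal system frequency f, Hz
A, phi = 1.0, 0.0      # fundamental amplitude (pu) and initial phase (rad)
t = np.arange(0, 10, 1 / fs)    # 10 s of point-on-wave samples

# slow amplitude and phase modulations, playing the role of eps_A(t) and eps_phi(t) in (1)
eps_A = 0.02 * np.cos(2 * np.pi * 1.0 * t)       # 2% amplitude modulation at 1 Hz
eps_phi = 0.05 * np.cos(2 * np.pi * 1.0 * t)     # 0.05 rad phase modulation at 1 Hz

# narrow-band disturbances eta_nb(t): a 1% third harmonic and a 1% inter-harmonic at 75 Hz
eta_nb = 0.01 * A * np.cos(2 * np.pi * 3 * f0 * t) + 0.01 * A * np.cos(2 * np.pi * 75.0 * t)

# wide-band disturbance eta_wb(t): white Gaussian noise at 80 dB SNR w.r.t. the fundamental
snr_db = 80.0
noise_std = (A / np.sqrt(2)) * 10 ** (-snr_db / 20)
eta_wb = np.random.default_rng(0).normal(0.0, noise_std, t.size)

# full waveform following the structure of (1)
x = A * (1 + eps_A) * np.cos(2 * np.pi * f0 * t + phi + eps_phi) + eta_nb + eta_wb
```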
In this scenario, the PMU is a measurement device capable of extracting the main parameters associated with the fundamental component at a given reporting time instant. There are three main measurements: First, the synchrophasor, which consists of the vector representation of the fundamental amplitude and phase in a polar plane rotating at the nominal system frequency. Second, the fundamental frequency, which may not coincide with the nominal system one. Third, the Rate Of Change Of Frequency (ROCOF), which is the first-order time derivative of the frequency.
In other words, the PMU maps the signal in (1) into a simplified and compressed model:

$$x(t) \;\mapsto\; \left\{\, \bar{X}_i = \frac{A_i}{\sqrt{2}}\, e^{j\varphi_i},\; f_i,\; R_i \,\right\}, \qquad (2)$$

where, at the i-th reporting time, $\bar{X}_i$ is the synchrophasor, $f_i$ the frequency, and $R_i$ the ROCOF. The transition from (1) to (2) can be seen as a lossy compression [51]. Nevertheless, it is important to underline two considerations: First, the PMU limits its analysis to the fundamental component, as this is the one considered in most monitoring and control applications. Second, the PMU considers relatively short observation intervals (a few cycles of the nominal system frequency) and adopts a reasonably high reporting rate (typically, once per cycle). It is thus reasonable to expect that the simplification introduced in (2) is still capable of capturing the main dynamics of the power signal under analysis.
6.2. Algorithmic Details
The synthetic PMU Data Generator emulates the operating functions of a real-world PMU but provides some extra insights into the quality of the measurement results. The main steps of the data processing routine are summarized in Algorithm 1.
Algorithm 1 PMU Data Generator
1: Input: signal samples, nominal system frequency, sampling rate, reporting rate, SNR and OoB thresholds ▹ input variables and parameters
2: Require: corresponding TFM basis matrix
3: for each reporting time i do
4:   extract the i-th observation interval, pu
5:   project it over the TFM basis
6:   compute the synchrophasor at the i-th reporting time, pu
7:   compute the frequency at the i-th reporting time, Hz
8:   compute the ROCOF at the i-th reporting time, Hz/s
9:   recover the fundamental component, pu
10:  estimate the SNR at the i-th reporting time, dB
11:  if the estimated SNR violates the SNR threshold then WDF ← activated ▹ wide-band distortion flag
12:  else WDF ← de-activated
13:  end if
14:  compute the frequency oscillation depth, pu
15:  if the oscillation depth exceeds the OoB threshold then NDF ← activated ▹ narrow-band distortion flag
16:  else NDF ← de-activated
17:  end if
18:  function OoB Sniffer (snob)
19:    Input: 1-s frequency measurement FIFO, Hz
20:    select the frequency range of interest for OoB, Hz
21:    compute the spectrum of the frequency oscillation, pu
22:    detect the spectrum peak, pu/Hz
23:  end function
24: end for
The input variables are as follows: The signal is a column vector of uniformly sampled values of a voltage or current waveform. Three further parameters denote the nominal system frequency, the sampling rate, and the reporting rate. The nominal system frequency can be set to 50 or 60 Hz, depending on the grid configuration under analysis. The sampling rate depends on the nominal system frequency: for 50 Hz systems it can be set to 10, 18, or 50 kHz, while for 60 Hz systems it can be set to 12, 18, or 60 kHz. In a similar way, the reporting rate is a function of the nominal system frequency. The synthetic PMU Data Generator implements the reporting rates required by the IEC standard but is also capable of operating in a sample-by-sample mode (i.e., with the reporting rate equal to the sampling rate). Such a high reporting rate may be impractical in real-world monitoring and control infrastructures, but it represents a useful tool for in-depth analysis of unexpected behavior of the network as well as for the tracking of sudden variations of the signal main parameters.
The two thresholds set the corresponding criteria for the detection of low-quality PMU measurements. The former indicates the Signal-to-Noise Ratio (SNR) we expect for the signal under test (expressed in dB). The latter defines a lower limit for the detection of out-of-band components, also known as sub- and inter-harmonics. When one or both of these criteria are violated, the PMU Data Generator is aware of operating in non-ideal conditions. In other words, the consistency between the model of the measurement and the signal under test is questionable and the definitional uncertainty may become predominant when compared to the other intrinsic and algorithmic contributions. In this case, the PMU measurements are flagged as potentially unreliable and the quantities involved in the thresholding process are also reported.
The algorithm considers observation intervals of 80 ms, partially overlapped in order to reproduce the desired reporting rate (line 4). Such an interval corresponds to 4 and 5 nominal cycles at 50 Hz and 60 Hz system rates, respectively. The interval under analysis is projected over a Taylor-Fourier Multifrequency (TFM) basis, specifically designed to minimize the spurious interference from low-order harmonic components (line 5). The projection produces the Taylor expansion coefficients of the fundamental component, referred to the interval mid-point. The 0th-order term is the synchrophasor (line 6). The 1st- and 2nd-order terms account for frequency and its rate of change (ROCOF) variations from the nominal values (lines 7 and 8).
Based on the estimates of synchrophasor, frequency, and ROCOF, the fundamental component is recovered (line 9). By comparing the recovered component with the original observation interval, the residual energy provides an estimate of the current SNR (line 10). If the estimated SNR violates the corresponding threshold, the first flag is raised (lines 11 to 13). In parallel, a function called OoB Sniffer analyzes a 1 s buffer of the frequency measurements and allows for detecting possible out-of-band components in terms of component frequency and magnitude (lines 18 to 23). If the out-of-band component magnitude exceeds the corresponding threshold, the second flag is raised (lines 14 to 17).
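For illustration, the following Python sketch reproduces the per-interval processing in a simplified form: a 2nd-order Taylor-Fourier fit of the fundamental component only (the released generator relies on an optimized multifrequency basis instead), followed by the SNR estimate and the wide-band distortion flag. The function and variable names are ours, chosen for readability.

```python
import numpy as np
from math import factorial

def tf_estimate(x_win, fs, f0, snr_thr=60.0):
    """One observation interval: synchrophasor, frequency, ROCOF, SNR, and WDF flag.

    Simplified 2nd-order Taylor-Fourier fit at the nominal frequency only; not the
    optimized multifrequency basis used by the actual PMU Data Generator.
    """
    n = x_win.size
    t = (np.arange(n) - (n - 1) / 2) / fs        # time axis referred to the interval midpoint
    w = 2 * np.pi * f0

    # design matrix: real/imaginary parts of the 0th, 1st, and 2nd Taylor terms
    cols = []
    for k in range(3):
        tk = t ** k / factorial(k)
        cols += [tk * np.cos(w * t), -tk * np.sin(w * t)]
    H = np.column_stack(cols)

    coeff, *_ = np.linalg.lstsq(H, x_win, rcond=None)
    p = coeff[0::2] + 1j * coeff[1::2]           # complex Taylor coefficients p0, p1, p2

    phasor = p[0] / np.sqrt(2)                                        # synchrophasor (RMS), pu
    freq = f0 + np.imag(p[1] / p[0]) / (2 * np.pi)                    # Hz
    rocof = np.imag(p[2] / p[0] - (p[1] / p[0]) ** 2) / (2 * np.pi)   # Hz/s

    x_hat = H @ coeff                            # recovered fundamental component
    snr_est = 10 * np.log10(np.sum(x_hat ** 2) / np.sum((x_win - x_hat) ** 2))
    wdf = bool(snr_est < snr_thr)                # wide-band distortion flag
    return phasor, freq, rocof, snr_est, wdf

# quick self-check on a clean 50.2 Hz tone observed for 80 ms at 10 kHz
fs, f0 = 10_000.0, 50.0
t = np.arange(int(0.080 * fs)) / fs
phasor, freq, rocof, snr_est, wdf = tf_estimate(np.cos(2 * np.pi * 50.2 * t + 0.3), fs, f0)
print(f"f = {freq:.3f} Hz, ROCOF = {rocof:.2f} Hz/s, WDF = {wdf}")   # f close to 50.2, ROCOF near 0
```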
The recent literature has proposed different applications of the TFM basis to phasor estimation. In this case, the estimation approach is inspired by the so-called Enhanced Taylor–Fourier Multifrequency Model [
52]. In this paper, though, the frequency response of the TFM basis has been optimized in order to guarantee maximum flexibility in balancing compliance under dynamic test conditions with a prompt response. Indeed, it is worth mentioning that the frequency response of any TFM basis depends on the included frequencies, on the derivative orders associated with each component, and on the windowing functions applied to the input signal.
6.3. Metrological Characterization
Before applying the presented PMU Data Generator on real or synthetic datasets, it is important to characterize its performance with respect to the reference standard for PMU applications, namely the IEC 60255-118-1 [
49]. For this analysis, we consider the P-class requirements, as P-class PMUs are the most widely employed in control applications like protections, load-shedding relays, or fast response mechanisms. Nevertheless, an equivalent PMU Data Generator for the M-class requirements is ready to be shared in the same repository (i.e., the Zenodo Community: Research in Sensor Network Metrology) in the upcoming months. More information is provided in the Further Work section.
In the following, the performance of the PMU Data Generator is characterized in terms of the metrics indicated by the IEC 60255-118-1. To guarantee a statistically relevant sample, if not otherwise specified, each test has been carried out for a total duration of 10 s and with the highest possible reporting rate. In this regard,
Table 2 reports the worst-case Total Vector Error (TVE), Frequency Error (FE), and ROCOF Error (RFE) for each test. These metrics are compared against the P-class requirements to facilitate compliance verification.
In order to reproduce a plausible operating condition, the test waveform has been corrupted with uncorrelated white Gaussian noise with an SNR equal to 80 dB. The sampling rate has been fixed to 10 kHz and 12 kHz for 50 Hz and 60 Hz nominal system frequencies, respectively.
The choice of the lowest sampling rate is conservative, as this implies a larger impact of measurement noise on the results. The sensitivity to different noise levels is further discussed in Section 6.5.
As reported in
Table 2, the measurements provided by the synthetic PMU Data Generator comply with the P-class performance requirements in all tests. It is worth noticing that the TVE never exceeds 0.05% even in the presence of harmonic components or dynamic test conditions. The most challenging condition proves to be the harmonic distortion test. In this regard, it is worth observing that the test requires injecting a 1% distortion on each single harmonic component up to the 50th order. The worst case here reported refers to the 6th order harmonic. In reality, though, it is unlikely that the analog front end of the measurement infrastructure would present such a high distortion at an even harmonic. Nevertheless, the synthetic PMU Data Generator complies with the limit despite the combined effect of wide- and narrow-band distortions.
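The compliance metrics in Table 2 follow the standard definitions: the TVE is the magnitude of the phasor error vector normalized by the reference magnitude, while FE and RFE are the absolute frequency and ROCOF errors. A minimal sketch of their computation from estimated and reference quantities (the array names are illustrative) is given below.

```python
import numpy as np

def worst_case_errors(x_est, x_ref, f_est, f_ref, r_est, r_ref):
    """Worst-case TVE, FE, and RFE over a test run (IEC 60255-118-1 error definitions).

    x_est/x_ref: complex synchrophasor arrays; f_*: frequency, Hz; r_*: ROCOF, Hz/s.
    """
    tve = 100 * np.abs(x_est - x_ref) / np.abs(x_ref)   # Total Vector Error, %
    fe = np.abs(f_est - f_ref)                          # Frequency Error, Hz
    rfe = np.abs(r_est - r_ref)                         # ROCOF Error, Hz/s
    return tve.max(), fe.max(), rfe.max()
```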
In
Table 3, we report the performance of the PMU Data Generator in the presence of a step-change variation in the signal magnitude or phase. For this analysis, we set the step occurrence at 547.27 ms after the beginning of the test waveform. In this way, the step occurrence does not correspond to an exact reporting instant for any of the considered reporting rates.
According to the IEC 60255-118-1 requirements, the measurements have been evaluated in terms of response time, delay and overshoot [
49]. All the metrics comply with the standard requirements. In this context, the most challenging condition is represented by the phase step changes, where larger overshoots are noticed. Nevertheless, given the relatively small response times, it is reasonable to say that this would not largely affect the overall performance of the PMU Data Generator.
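As a rough illustration of how the step-test metrics of Table 3 can be extracted from a measured profile, the sketch below computes the interval during which the error exceeds the accuracy limit (response time), the 50% crossing delay, and the percentage overshoot. It is a simplified interpretation, not a verbatim implementation of the standard's procedures.

```python
import numpy as np

def step_metrics(t, y, t_step, y0, y1, tol):
    """Simplified step-test metrics for a measured profile y(t).

    t_step: instant of the applied step; y0/y1: steady-state values before/after
    the step; tol: accuracy limit used to delimit the response-time window.
    """
    err = np.abs(y - np.where(t < t_step, y0, y1))        # error w.r.t. the ideal step
    out = np.where(err > tol)[0]                          # samples violating the limit
    resp_time = (t[out[-1]] - t[out[0]]) if out.size else 0.0

    half = y0 + 0.5 * (y1 - y0)                           # 50% crossing level
    crossed = np.where(np.sign(y - half) != np.sign(y0 - half))[0]
    delay = (t[crossed[0]] - t_step) if crossed.size else np.nan

    post = y[t >= t_step]                                 # overshoot beyond the final value, %
    overshoot = 100 * max(0.0, np.max((post - y1) * np.sign(y1 - y0))) / abs(y1 - y0)
    return resp_time, delay, overshoot
```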
6.4. Missing and Invalid Data
When dealing with real-world datasets, it is likely to come across one or more data points that are either missing, corrupted, or invalid. In order to simplify the treatment of similar scenarios, the PMU Data Generator has been equipped with a simple yet effective interpolation functionality.
The missing or invalid data should be denoted as Not-A-Number, while the corresponding entries in the time axis should be kept unaltered. The missing portion will be recovered by means of shape-preserving piecewise cubic Hermite polynomials. In this sense,
Figure 2 shows an example of a waveform with a missing portion of data and the equivalent interpolated reconstruction. On the other hand, it is worth underscoring that this operation extrapolates the waveform features from the preceding samples and represents only a plausible guess of the possible waveform evolution during the considered time interval. There is no guarantee about the trustworthiness of the obtained waveform.
For instance, the test signal considered in
Figure 2 refers to a power outage that hit the Pacific Southwest power system on 8 September 2011. The signal is not stationary and presents an underlying amplitude and phase modulation. The recovered signal in red is consistent with the rest of the samples but may introduce slight discontinuities at the end of the recovered portion.
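A minimal sketch of the interpolation functionality, using the shape-preserving PCHIP interpolator available in SciPy, is shown below. The hole width and the test tone are arbitrary, and the reconstruction remains a plausible guess rather than a trustworthy recovery of the missing samples.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

def fill_missing(t, x):
    """Replace NaN-marked samples with shape-preserving piecewise cubic Hermite values.

    The time axis t is kept unaltered; the result is only a plausible guess of the
    waveform evolution inside the gap, with no guarantee of trustworthiness.
    """
    valid = ~np.isnan(x)
    x_filled = x.copy()
    x_filled[~valid] = PchipInterpolator(t[valid], x[valid])(t[~valid])
    return x_filled

# example: a 50 Hz tone sampled at 10 kHz with a 2 ms hole marked as Not-A-Number
fs = 10_000
t = np.arange(0, 0.2, 1 / fs)
x = np.cos(2 * np.pi * 50 * t)
x[1000:1020] = np.nan
x_rec = fill_missing(t, x)
```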
6.5. Parameter Setting
As discussed in Section 6.2, the PMU Data Generator presents two thresholds that allow for monitoring the levels of wide- and narrow-band distortion, namely the SNR threshold and the out-of-band threshold. In order to suitably set these parameters, it is useful to perform a preliminary sensitivity analysis of the PMU Data Generator behavior in the presence of uncorrelated white Gaussian noise or out-of-band distortion components. If not otherwise specified, in the following the nominal system frequency and the sampling rate are set to 50 Hz and 10 kHz, respectively, but similar results could be obtained with any other combination of parameter values.
The left plot in
Figure 3 presents the TVE distributions as a function of the SNR. The boxplot representation shows how the errors are not symmetrically distributed around their mean values. Nevertheless, a clear descending trend is (expectedly) noticed when the SNR increases. The right plot, instead, shows the error of the SNR estimated by the PMU Data Generator in different test conditions. In this case, along with the P-class tests already reported in the previous section, we included three tests with an out-of-band distortion component. For this analysis, we consider an inter-harmonic component whose frequency is set to 75 Hz and whose magnitude is varied among 0.1, 1, and 10% of the fundamental one. The bar plot clearly shows that the estimated SNR is capable of detecting the presence of an inter-harmonic component.
In this context, the left plot in
Figure 4 shows the worst-case TVE and FE as a function of the out-of-band component magnitude. In terms of TVE, the performance degradation is limited: the 1% target performance is guaranteed up to a magnitude of 5% of the fundamental one. Conversely, the frequency estimates prove to be severely affected by out-of-band distortion: a 0.5% component magnitude is sufficient to exceed the performance target of 10 mHz. In the right plot of Figure 4, we show the magnitude of the out-of-band component as estimated by the PMU Data Generator. For this analysis, we vary the component frequency according to the M-class compliance test of the IEC 60255-118-1, and we consider two component magnitudes, namely 1 and 10%. It is worth noticing that the estimated magnitude is significantly larger than the noise floor (corresponding to an 80 dB SNR) for both considered configurations. By properly setting the corresponding threshold, it is then possible to detect the presence of out-of-band distortions and activate the NDF flag.
A similar example is represented by the measurement behavior in the presence of a transient or a parameter step change. In the left plot of
Figure 5, we compare the synchrophasor magnitude estimates vs. the true profile of the signal magnitude.
This plot shows the advantage of setting the reporting rate equal to the sampling rate: the time resolution is so fine that it is possible to evaluate the actual step response of the algorithmic chain. The right plot of
Figure 5, instead, shows the corresponding time evolution of the SNR estimate. As soon as the step change is within the considered observation interval, the estimation accuracy degrades and the estimated SNR collapses from its expected value of 80 dB. By properly setting the threshold, it is possible to detect these events and suitably mark the measurements with the WDF flag to indicate a potential case of signal model inconsistency or high distortion.
As mentioned in
Section 6.2, the observation interval length is set by default to 80 ms. This parameter choice is based on the fact that 80 ms is the maximum interval duration to guarantee a reporting latency compliant with P-class requirements, and—at the same time—the minimum interval duration to allow for sufficient resolution in the spectral domain to distinguish out-of-band disturbances. In case shorter observation intervals are needed, it is possible to exploit zero-padding techniques. In this regard, it is important to underline that zero-padding should be applied in a symmetric way: the real signal should be placed in the middle of the interval, as the final estimates will be referred to the 80 ms interval midpoint. Another aspect to be stressed is that zero-padding allows for increasing the granularity of the spectral domain representation, but the actual resolution depends on the number of cycles of the real signal.
Figure 6 provides a possible application of this technique. On the left, the original signal window of 80 ms consists of a fundamental tone at 50.05 Hz and a 1% out-of-band disturbance at 75 Hz. The zero-padded version contains only 40 ms of real signal. On the right, the corresponding spectral representation is given by the DFT. As expected, the resolution in the spectral domain for the zero-padded version is much coarser and risks affecting the capability of properly detecting the out-of-band component.
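The following numpy sketch reproduces the idea behind Figure 6 under the assumptions stated above (80 ms window, 50.05 Hz fundamental, 1% inter-harmonic at 75 Hz, 40 ms of real signal placed symmetrically): zero-padding restores the 12.5 Hz bin granularity of the full window, but the actual spectral resolution is still set by the 40 ms of real signal.

```python
import numpy as np

fs = 10_000
n_full = int(0.080 * fs)                          # 80 ms observation interval at 10 kHz
t = np.arange(n_full) / fs
x_full = np.cos(2 * np.pi * 50.05 * t) + 0.01 * np.cos(2 * np.pi * 75.0 * t)

# only 40 ms of real signal available: place it symmetrically inside the 80 ms interval
n_half = n_full // 2
x_short = x_full[:n_half]
pad = (n_full - n_half) // 2
x_padded = np.concatenate([np.zeros(pad), x_short, np.zeros(n_full - n_half - pad)])

# both DFTs share the same 12.5 Hz bin granularity (1 / 80 ms), but the actual resolution
# of the padded version is set by the 40 ms of real signal, i.e., 25 Hz
freqs = np.fft.rfftfreq(n_full, d=1 / fs)
spec_full = np.abs(np.fft.rfft(x_full)) / n_full
spec_padded = np.abs(np.fft.rfft(x_padded)) / n_half
# inspecting both spectra around 75 Hz shows that the out-of-band component is much harder
# to separate from the fundamental's leakage in the zero-padded case
```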
6.6. Validation on Test Cases
After having introduced the algorithmic and setting details, it is necessary to demonstrate the feasibility and practicality of the synthetic data generator with real datasets. To this end, in the following, we consider two test cases taken from the recent literature on PMU-based monitoring and control applications. If not otherwise specified, the two thresholds are set equal to 60 dB and 0.05% of the nominal fundamental magnitude, respectively. The first one is intended to spot any dataset where the measurement noise exceeds the level corresponding to a 60 dB SNR (feasible in real-world scenarios, yet inappropriate for high-accuracy synchrophasor estimation). The second one allows for detecting possible small disturbances in the proximity of the fundamental component.
The first data set makes reference to the well-known IEEE 5-bus benchmark model. In this case, we consider a varied configuration, specifically designed to evaluate an Under Frequency Load Shedding (UFLS) scheme, as described in [
53]. This model has been programmed in the MATLAB Simulink environment and is publicly available at [
54]. For this analysis, we consider the voltage signal at bus 1 for an overall test duration of 30 s.
At t = 5 s, the generator output power at bus 2 is suddenly reduced by 300 MW. The consequent UFLS action is successful in preventing a system blackout, but it produces a progressive shedding of power at buses 4 and 5. In the following phase, a progressive load restoration brings the system back to quasi-normal operating conditions.
As shown in the left plot of
Figure 7, the reference values of frequency and magnitude exhibit the effects of the power outage and of the UFLS procedures. Each intervention corresponds to a sudden oscillation, whereas the progressive return to normal operating conditions is characterized by a frequency ramp and a slowly damped amplitude modulation. The right plot, instead, focuses on the frequency measurements during the first phase of the event. In the upper graph, it is worth noticing how the synthetic PMU Data Generator not only correctly tracks the descending ramp but is also capable of capturing the fast transient variations at 5, 6.5, 6.75, and 7.25 s. The lower graph shows the frequency error with respect to the reference values and compares it with the performance targets for the closest test conditions, namely a phase modulation (60 mHz) and a frequency ramp (10 mHz). Most of the time, the frequency errors are largely within the performance targets. The only exceptions are represented by the four discontinuities. Only the first one (i.e., the largest one) is characterized by an error of several tens of mHz. Nevertheless, the divergence between measured and reference values lasts only 70 ms (i.e., within the response time limit for a phase step change).
The second data set is taken from the well-known Great Britain 34-bus benchmark model [
55]. The model has been programmed in DIgSILENT PowerFactory environment, publicly available at [
56].
In this case, we consider a varied configuration where the IFA2 interconnection with France is tripped at t = 0.05 s, resulting in the loss of 1 GW. For the sake of simplicity, this analysis refers to the voltage signal at bus 1 for an overall duration of 1 s.
In this regard, the left plot of
Figure 8 shows the frequency and magnitude evolution in the phases before and after the contingency. In a similar way to the preceding test case, the right plot shows two graphs. The upper one presents the frequency errors: they exceed the performance targets only for a few ms in correspondence with the tripping event, while they remain around 0.01 mHz for the rest of the acquisition (despite the presence of a frequency ramp and minor oscillations). In the lower graph, we present the estimated SNR.
It is interesting to observe how this value takes nearly half a second to converge to the optimal value of 80 dB. This proves that the PMU estimates are capturing only a part of the energy in the signal during such an event. Therefore, it is necessary to take into account that the PMU measurements may provide a limited or incomplete representation of the current situation.
6.7. Application Example
Having proven the reliability of the PMU Data Generator in several test cases, we now showcase a possible application. In this sense,
Figure 9 shows the distribution of ROCOF measurements in the aforementioned IEEE 5-bus dataset. On the left, the histogram of the synthetic measurements is fitted against a Rayleigh distribution. On the right, a new set of synthetic measurements is obtained by drawing random numbers according to the same Rayleigh distribution (parameter
B equal to 0.053). The similarity between the two distributions is noticeable in the histogram plot and can be quantified with a Kolmogorov–Smirnov distance of 0.98.
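A minimal sketch of this fit-and-resample procedure with SciPy is given below. The measured ROCOF values are replaced here by placeholder random samples so that the snippet is runnable, and the Rayleigh parameter B = 0.053 from the text is used only to seed the illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# placeholder for the ROCOF magnitudes produced by the PMU Data Generator
# (drawn at random here only to keep the sketch self-contained and runnable)
rocof_meas = stats.rayleigh.rvs(scale=0.053, size=3000, random_state=rng)

# fit a Rayleigh distribution to the measured values (location fixed at zero)
_, b_hat = stats.rayleigh.fit(rocof_meas, floc=0)

# generate a new synthetic measurement set from the fitted distribution
rocof_syn = stats.rayleigh.rvs(scale=b_hat, size=rocof_meas.size, random_state=rng)

# two-sample Kolmogorov-Smirnov comparison between the two sets
ks_stat, p_value = stats.ks_2samp(rocof_meas, rocof_syn)
print(f"fitted B = {b_hat:.3f}, KS statistic = {ks_stat:.3f}")
```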
7. Conclusions
In this article, we have explored the transformative potential of synthetic PMU data and introduced a comprehensive digital metrology framework, designed to address the challenges of real-world data scarcity and variability.
Synthetic PMU data will play a crucial role in advancing Smart Grid technologies by providing a controlled, scalable, and secure environment for testing and validation. Hence, our proposed digital metrology framework leverages state-of-the-art algorithms and models to generate high-fidelity synthetic data, ensuring that researchers and developers have access to the reliable and adaptable datasets needed to drive innovation.
Additionally, the proposed framework will be refined to better support the principles of Open Science and FAIR data management, ensuring that synthetic data remain accessible, interoperable, and reusable [
57]. These advancements will not only further the field of digital metrology but also strengthen the role of sensor network metrology in research and innovation.