Article

Advanced ENF Region Classification Using UniTS-SinSpec: A Novel Approach Integrating Sinusoidal Activation Function and Spectral Attention

School of Information Network Security, People’s Public Security University of China, Beijing 100038, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(19), 9081; https://doi.org/10.3390/app14199081
Submission received: 5 September 2024 / Revised: 21 September 2024 / Accepted: 27 September 2024 / Published: 8 October 2024

Abstract

The electric network frequency (ENF), often referred to as the industrial heartbeat, plays a crucial role in the power system. In recent years, it has found applications in multimedia evidence identification for court proceedings and audio–visual temporal source identification. This paper introduces an ENF region classification model named UniTS-SinSpec within the UniTS framework. The model integrates the sinusoidal activation function and spectral attention mechanism while also redesigning the model framework. Training is conducted using a public dataset on the open science framework (OSF) platform, with final experimental results demonstrating that, after parameter optimization, the UniTS-SinSpec model achieves an average validation accuracy of 97.47%, surpassing current state-of-the-art and baseline models. Accurate classification can significantly aid in ENF temporal source identification. Future research will focus on expanding dataset coverage and diversity to verify the model’s generality and robustness across different regions, time spans, and data sources. Additionally, it aims to explore the extensive application potential of ENF region classification in preventing crimes such as telecommunications fraud, terrorism, and child pornography.

1. Introduction

An ENF is produced by the operation of alternating-current generators [1], and the nominal values of ENF exhibit variation across different countries. Table 1 provides a selection of nominal voltage and ENF statistics for various countries globally [2].
Due to the limited capacity of power grids to store electricity, inconsistencies between production power and consumption power are primarily caused by factors such as load changes, generator instability, transmission line faults, or dispatch issues [1,3,4,5,6,7,8]. Adi Hajj-Ahmad et al. [5] pointed out that the fluctuation range of ENF signals is generally related to the size of the power grid. Therefore, the fluctuation patterns of power grids differ across countries and even within different regions of the same country, prompting widespread interest in the regional classification of ENF based on these differences. Because the ENF varies among regions and is inherently unforgeable, these regional differences cannot be fabricated. Notably, researchers such as Wei-Hong Chuang et al. [9] attempted to manipulate ENFs through specific editing techniques; however, their experimental results indicated that such editing attempts could be readily detected.
Contemporary approaches for regional classification heavily leverage signal processing and machine learning technologies. The efficacy of conventional signal processing techniques hinges on operator expertise. Machine learning methodologies enhance automation and precision in categorization but remain reliant on manually engineered features, which hinders the capture of intricate data insights. The advent of deep learning technology has ushered neural network-based methodologies into ENF regional classification by harnessing sophisticated feature extraction capacities for the automatic acquisition of complex data representations. Nonetheless, extant deep learning approaches encounter constraints in leveraging frequency domain information and in processing periodic signals.
In addressing the challenges posed by traditional signal processing techniques and the limitations of machine learning in ENF regional classification tasks, we refined and advanced the UniTS model through both training and testing using the OSF public dataset.
  • We conduct a comparative analysis of the sinusoidal activation function in relation to commonly utilized activation functions. Our investigation focuses on the impact of activation functions on processing periodic signals, revealing that the sinusoidal activation function leverages the periodic nature of the sinusoidal function to optimize the retention and extraction of periodic features within ENF data, thereby enhancing its efficacy in processing periodic signals.
  • We develop a spectral attention module and integrate it into the UniTS model. The spectral attention mechanism is utilized to effectively capture crucial frequency information in the original ENF data within the frequency domain, thereby addressing the model’s limited noise resistance to raw data and its inadequacies in extracting and learning frequency domain information.
  • We propose a standardized dataset configuration and preprocessing approach for ENF classification based on the categorization of ENF regions. Due to the diverse research objectives in previous studies on ENF, variations exist in both the datasets utilized and the processing methods employed. Our proposal aims to establish a unified dataset configuration and preprocessing approach that is better suited for the task of classifying ENF regions.
  • In order to validate the efficacy of the enhanced model and investigate the factors influencing its performance, a total of three experiments are devised in this study, with each experiment further subdivided into multiple groups according to its specific objectives. These objectives encompass an assessment of the fundamental classification performance of the UniTS model, an ablation experiment on the novel enhancements, and a hyperparameter search for the refined model. The ultimate average classification accuracy achieved by the UniTS-SinSpec model is 97.47%.

2. Related Work

Conventional approaches for frequency domain classification of the power grid primarily encompass signal processing and machine learning methodologies. Signal processing techniques entail the extraction of frequency domain features followed by manual comparison and identification, for instance, employing band-pass filters to extract fundamental ENF and harmonic information, as well as utilizing spectral analysis to scrutinize the ENF signal [10]. Furthermore, investigations have employed high-resolution spectral estimation methods based on eigenvalue decomposition, such as Multiple Signal Classification (MUSIC), as well as parameter estimation methods based on the signal subspace rotational invariance principle, such as Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT), to analyze the characteristics of the ENF signal for grid classification and localization [11,12]. While these approaches can partially extract features from the ENF signal and conduct regional classification, their efficiency and accuracy are heavily reliant on operator expertise.
Machine learning-based approaches have been utilized for ENF classification. These methods leverage empirical knowledge to extract features from data and subsequently employ classification algorithms. While support vector machines (SVMs), random forests (RFs), and K-nearest neighbors (KNNs) are common machine learning techniques [12,13], their reliance on manually designed features and combination strategies poses challenges when fully capturing the long-term and cyclical nature of ENFs as well as achieving efficient and accurate classifications, despite some improvements in automation and accuracy.
The advancement of deep learning technology has led to the integration of neural network-based approaches in ENF classification. These methods excel at automatically extracting intricate features from data and demonstrating a strong performance in classification tasks. Notably, Georgios Tzolopoulos et al. [14] achieved outstanding results through a multi-classifier fusion framework for regional classification. This approach leverages a convolutional neural network (CNN) to extract spatial features via convolutional layers; a long short-term memory network (LSTM) to capture long-term dependencies within time-series data; and an SVM for handling high-dimensional data. The collective performance is commendable, with a final accuracy rate of 96%.
In February 2024, Shanghua Gao et al. introduced the UniTS model [15]. This model is a unified, multi-task, time-series model suitable for prediction, classification, anomaly detection, and imputation tasks, and it can handle heterogeneous time series with different variables and lengths. As illustrated in Figure 1, the UniTS model consists of multiple stacked repetitive modules known as UNITS blocks and a lightweight tower designed for both mask generation and classification. Each UNITS block integrates components such as multi-head self-attention (MHSA), variable MHSA, a dynamic multi-layer perceptron (MLP), and a gating module. The mask tower is responsible for generating task-specific masks, while the classification tower (CLS) is dedicated to handling classification tasks. The cross-attention mechanism processes tokens and performs classification predictions. Tested on 38 datasets, the UniTS model outperformed other models and demonstrated zero-shot learning as well as prompt-based learning capabilities. Its design features encompass task tokenization, which converts task specifications into a uniform token representation for enhanced flexibility and adaptability; a unified time-series architecture incorporating dynamic linear operators and dynamic MLP modules to effectively capture relationships between sequence points of varying lengths; and joint training of generative and predictive tasks using a masked reconstruction pre-training strategy to enhance feature learning.
The early definition of time-series data can be traced back to 1927, when G. U. Yule defined time-series data as a sequence of observed values recorded in chronological order [16]. This definition laid the foundation for subsequent research on time-series analysis methods.
However, a more comprehensive and accurate definition of time-series data was proposed by John Cochrane in 2021, who considered time-series data as a sequential recording of data points within specific time intervals, enabling analysts to study variable variations over time, identify trends, seasonal patterns, and other temporal dependence structures [17]. ENF data exhibit typical characteristics of time-series analysis, which were extensively discussed in the study by Jumar et al. [18]. Furthermore, Kruse et al. [19] emphasized that ENF data can serve as valuable inputs for prediction and analysis using advanced time-series models to enhance grid reliability and predictive accuracy. While ENF data fall under the category of time-series data and are amenable to utilization by relevant models for specific tasks, they differ from general time-series datasets in several aspects: (1) Incorporation of both temporal and frequency domain information: ENF data encompass characteristics from both domains, whereas general time-series datasets primarily focus on temporal changes and patterns; and (2) cyclical and regional characteristics: ENF records exhibit pronounced cyclical patterns, such as diurnal and seasonal variations, as well as distinct regional features. Different regions may experience varying electricity demands and generation capacities at different times. Conversely, general time-series datasets’ periodicity varies based on the nature of the dataset, such as seasonal sales or meteorological observations. These distinctions underscore that, while ENF records are categorized under time-series data, their unique attributes necessitate tailored applications and analytical approaches.

3. Proposed Approaches

The UniTS model demonstrates a strong performance when handling multi-task time-series data. However, it exhibits limitations when applied to tasks involving ENF data: (1) the model lacks a specialized design for extracting periodic features from time-series data with significant periodicity, resulting in a suboptimal performance when capturing the periodic features of ENF. (2) Noise is easily introduced during the collection, transmission, and processing of ENF data. The UniTS model does not adequately consider noise resistance when processing such noisy data, consequently impacting classification accuracy. (3) ENF data encompass not only temporal information, but also rich spectral information. While the UniTS model primarily focuses on temporal features, its ability to extract and utilize spectral features is relatively limited.
To address the limitations of the UniTS model in classifying frequency data and enhance its performance, this study implemented innovative enhancements. Firstly, the original Gaussian error linear unit (GELU) activation function was substituted with the sinusoidal activation function (SAF) to leverage its periodic nature for the improved capture and representation of periodic features within input signals, thereby enhancing the processing efficacy of frequency data. Secondly, a spectral attention module was introduced to extract features and compute spectral attention weights in the frequency domain, enabling the identification and emphasis of crucial components of frequency data. This heightened the model’s sensitivity to frequency domain information and classification performance. The experimental results demonstrate that the enhanced model achieves a peak classification accuracy of 97.47% on the OSF public dataset.

3.1. Sinusoidal Activation Function

The GELU activation function used in the UniTS model is a nonlinear activation function based on the Gaussian error function, proposed by Hendrycks and Gimpel in 2016 [20]. Its definition is presented by (1):
GELU(x) = 0.5x(1 + tanh(√(2/π)(x + 0.044715x³)))    (1)
The GELU activation function smoothly approaches zero for negative values and gradually increases for positive values. This characteristic enables the GELU to outperform traditional rectified linear unit (ReLU) activation functions in many deep learning models, as it effectively handles the negative input data, thereby enhancing the model’s performance. The formula of the ReLU activation function is presented in (2). Nair and Hinton proposed the ReLU activation function in 2010 [21]. The ReLU activation function maintains linearity for positive values and directly outputs zero for negative values. This feature makes the ReLU activation function computationally efficient due to its simple calculation method, which reduces model complexity during training.
ReLU(x) = max(0, x)    (2)
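To make the relationship between the tanh approximation in (1) and the exact Gaussian-CDF form of the GELU concrete, the following minimal check (our illustration, not from the paper) compares the two formulations numerically:

```python
import math

def gelu_tanh(x: float) -> float:
    # Tanh approximation of the GELU, as in Eq. (1)
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

def gelu_exact(x: float) -> float:
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# The approximation tracks the exact function closely over typical inputs.
for x in (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0):
    assert abs(gelu_tanh(x) - gelu_exact(x)) < 1e-3
```

The closeness of the two forms is why the tanh variant is commonly used in practice: it avoids evaluating the error function while preserving the smooth, non-monotonic handling of negative inputs described above.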
However, the GELU and ReLU activation functions have limitations in capturing periodic data features, thereby hindering the extraction of repeating patterns in such data. Specifically, periodic data often exhibit conspicuous cyclic fluctuations and repetitive structures, which may be attenuated when processed through the GELU and ReLU activation functions. This could result in a diminished capacity of the model to discern these crucial features. Hence, for datasets containing significant periodic characteristics, it might be necessary to integrate alternative methods or activation functions to more effectively capture these features.
Parascandolo and Virtanen explored the potential of the SAF in deep learning [22]. Despite being less commonly utilized compared to other activation functions, the SAF has demonstrated superiority in capturing periodic features in specific scenarios. In a study by Michael S. Gashler, Stephen C. Ashmore, and their colleagues [23], researchers investigated the application of sinusoidal activation functions in deep neural networks for time-series modeling problems. They initialized the network with fast Fourier transform (FFT) and implemented dynamic parameter optimization, concluding that the SAF effectively captures periodic patterns in data and outperforms traditional activation functions in certain tasks. Additionally, Sitzmann et al. discovered that neural networks with periodic activation functions (such as the SAF) are particularly adept at representing complex natural signals and their derivatives [24]. These networks excel at fitting and representing details of time-series data, making them especially suitable for processing periodic signals such as ENFs.
The mathematical representation of the SAF is depicted in (3). This function directly applies a sinusoidal transform to the input value x, converting the linear input into a periodic output and enabling the model to effectively capture the periodic characteristics of the input signal. Figure 2 illustrates the schematic diagrams of the ReLU, GELU, and SAF.
SAF(x) = sin(x)    (3)
To compare the impact of different activation functions on processing the original periodic signal, we conducted simulations using periodic signals and applied them successively to the SAF, ReLU, and GELU, as depicted in Figure 3. The unique sinusoidal activation mechanism of the SAF enables it to more effectively capture and replicate the periodic variations in the signal, thereby preserving a greater proportion of its original periodic characteristics during processing.
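The comparison in Figure 3 can be reproduced in miniature. This sketch (our illustration, under the assumption that a plain sine wave is an adequate stand-in for a periodic ENF trace) applies the three activations of (1)–(3) to a simulated signal and inspects how much of the negative half-cycles each one preserves:

```python
import numpy as np

def saf(x):  return np.sin(x)
def relu(x): return np.maximum(0.0, x)
def gelu(x): return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x ** 3)))

t = np.linspace(0, 4 * np.pi, 2000)
signal = np.sin(t)  # simulated periodic input standing in for an ENF trace

# The SAF maps the signal symmetrically, keeping both half-cycles of every
# period; ReLU zeroes the negative half entirely, and GELU strongly damps it.
for name, f in [("SAF", saf), ("ReLU", relu), ("GELU", gelu)]:
    out = f(signal)
    print(f"{name}: min = {out.min():.3f}, max = {out.max():.3f}")
```

The symmetric output range of the SAF is the property exploited in this work: the repetitive structure of the input survives the nonlinearity, whereas the one-sided activations discard or attenuate half of each cycle.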
As previously mentioned, the ENF signal exhibits a robust periodicity. In order to bolster the model’s capacity for handling ENF data, we incorporated the sinusoidal function as the activation function. Our objective is to optimize the preservation of periodic patterns in ENF data through the SAF, thereby mitigating potential signal distortion or information loss attributable to alternative activation functions and ultimately enhancing the model’s proficiency in processing periodic signals.

3.2. Spectrum Attention Mechanism

The introduction of the attention mechanism aims to replicate the human capacity for selectively focusing on specific key information while disregarding extraneous data [25]. When processing ENF data containing spectral information using deep learning models, we also endeavor to selectively process crucial spectral information while filtering out irrelevant details. Therefore, we devised a spectral attention module tailored for ENF data based on its unique characteristics and the underlying principles of the attention mechanism. The UniTS model incorporates a CLS tower (depicted in Figure 1) for classification tasks, with the inclusion of the spectral attention module preceding the category embeddings (as illustrated in Figure 4) within the CLS tower.
In Europe and most of Asia, the nominal value of the ENF is 50 Hz [2]. The allowable fluctuation range typically falls within ±0.1 Hz, indicating that the range of 49.9 Hz to 50.1 Hz is considered normal under standard conditions [26]. Therefore, this range becomes the focal point in our analysis of ENF classification in the frequency domain.
The fundamental concept behind our designed spectral attention module is to transform the time domain signal into the frequency domain, focusing on the frequency range of 49.9 Hz to 50.1 Hz. This approach enhances the model’s noise robustness by allowing it to effectively capture information within the critical frequency band of the ENF. After the attention weights are calculated, the resulting attention weight matrix is applied to the spectrum, which is then reconverted into a time domain signal for category embedding and classification. By concentrating on this specific frequency range, the model can better distinguish the essential features from noise, thereby improving classification accuracy and reliability.
A = softmax(W2·tanh(W1·X + b1) + b2)    (4)
The equation for calculating the attention weight is presented in (4). Let X represent the input spectral data, with dimensions (N, T, F), where N denotes the batch size, T represents the time step size, and F indicates the frequency dimension. W1 and W2 denote weight matrices, while b1 and b2 are bias vectors. The activation function tanh is utilized along with softmax for normalizing the weights. Subsequently, the weighted signal is derived by multiplying the attention weight matrix with the input signal; here, A signifies the calculated attention weight matrix and X′ denotes the weighted input signal. The module processing flow proceeds as follows.

3.2.1. Frequency Domain Transformation

Conducting an FFT on the input time-series data transforms the time domain signal into the frequency domain, resulting in a spectral plot of the signal. The frequency domain representation offers a more distinct portrayal of the frequency components within the signal, enabling the model to function within the frequency domain, as illustrated in (5). In this equation, X_t denotes the time domain signal, while X_f signifies the frequency domain signal.
X_f = FFT(X_t)    (5)

3.2.2. Define Frequency Domain Attention Calculation

In the frequency range of primary focus (49.9 Hz–50.1 Hz), a linear transformation is applied to map the frequency domain signal to the attention space and calculate the attention weight for each frequency component. The specific formula is presented in (6), where W1 and W2 denote the weight matrices of the linear transformation, b1 and b2 represent the bias vectors, tanh serves as the activation function, and softmax is utilized for normalizing the weights. Inputting the frequency domain signal X_f then yields the frequency domain attention weight matrix A.
A = softmax(W2·tanh(W1·X_f + b1) + b2),   49.9 Hz ≤ x_f ≤ 50.1 Hz    (6)

3.2.3. Frequency Domain Weighting

The computed attention weight matrix A is multiplied by the frequency domain signal X_f to obtain the weighted frequency domain signal X′_f. This ensures that the model emphasizes the crucial frequency information in the signal. As depicted in (7), X′_f represents the weighted frequency domain signal.
X′_f = A · X_f    (7)

3.2.4. Inverse Transform

By applying an inverse fast Fourier transform (IFFT) to the features in the frequency domain, these features are transformed back to the time domain. This process enables the extracted frequency domain features to be integrated with the original time-series data, thereby restoring the temporal characteristics of the signal. The recovery is presented in (8), where X′_t represents the time domain signal obtained through the IFFT of the weighted frequency domain signal.
X′_t = IFFT(X′_f)    (8)
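A minimal PyTorch sketch of the four-step flow above (FFT, band-restricted attention, frequency weighting, IFFT) might look as follows. The hidden size, the use of a real FFT, the batch layout, and the choice to leave out-of-band bins unchanged are our assumptions, since the text does not specify them:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectralAttention(nn.Module):
    """Sketch of Sections 3.2.1-3.2.4: FFT, attention over the
    49.9-50.1 Hz band, frequency weighting, and IFFT."""

    def __init__(self, seq_len: int, sample_rate: float, hidden: int = 64):
        super().__init__()
        n_bins = seq_len // 2 + 1                     # rfft output size
        freqs = torch.fft.rfftfreq(seq_len, d=1.0 / sample_rate)
        # Boolean mask for the band of interest, the constraint in Eq. (6).
        self.register_buffer("band", (freqs >= 49.9) & (freqs <= 50.1))
        self.w1 = nn.Linear(n_bins, hidden)           # W1, b1
        self.w2 = nn.Linear(hidden, n_bins)           # W2, b2

    def forward(self, x_t: torch.Tensor) -> torch.Tensor:
        # 3.2.1: transform to the frequency domain, Eq. (5).
        x_f = torch.fft.rfft(x_t, dim=-1)
        mag = x_f.abs()
        # 3.2.2: attention weights restricted to 49.9-50.1 Hz, Eq. (6).
        scores = self.w2(torch.tanh(self.w1(mag)))
        scores = scores.masked_fill(~self.band, float("-inf"))
        a = F.softmax(scores, dim=-1)
        # Out-of-band bins pass through unchanged (a design assumption here;
        # the paper does not state how they are handled).
        weight = torch.where(self.band, a, torch.ones_like(a))
        # 3.2.3: frequency-domain weighting, Eq. (7).
        x_f_weighted = x_f * weight
        # 3.2.4: back to the time domain, Eq. (8).
        return torch.fft.irfft(x_f_weighted, n=x_t.shape[-1], dim=-1)

# Example: two 10 s windows sampled at an assumed 400 Hz rate.
x = torch.randn(2, 4000)
y = SpectralAttention(seq_len=4000, sample_rate=400.0)(x)
```

Because the softmax is normalized only over the in-band bins, the module learns a relative emphasis within the critical 49.9–50.1 Hz range while the rest of the spectrum is carried through to the inverse transform.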

4. Experimental and Results Analysis

4.1. Dataset and Baseline Settings

4.1.1. Dataset Description and Data Preprocessing

The OSF [27] provides power and audio recordings from various global grids, and this study utilizes multiple publicly available datasets offered by the OSF. To ensure comparable data volumes across different regions, we selected datasets based on their temporal coverage and divided them into two groups, as detailed in Table 2.
To standardize the data format, we preprocessed the datasets from seven regions uniformly, resulting in six sets of datasets, as detailed in Table 3. The specific processing methods are as follows:
  • Time sequence alignment: Arranging all regional datasets chronologically to preserve contextual time-related information of ENF data;
  • Format alignment: Standardizing all data to actual measured values rather than differences between the measured and reference values;
  • Missing value treatment: Replacing all “Nan” values with the medians of each region’s entire dataset. According to the OSF’s dataset description, “Nan” values may result from measurement device or calculation errors. Instead of simply replacing missing data with nominal values, which may introduce bias, we opted for replacing all “Nan” values with medians to accurately reflect each dataset’s operational characteristics without affecting the main trend;
  • Time-span grouping: Grouping datasets based on time spans; longer time spans provide more contextual time-related information;
  • Sequence length division: Significant fluctuation pattern differences exist in ENFs between the daytime and nighttime due to the changes in power demand and supply. For instance, electricity demand is typically higher during the day, especially on workdays, leading to a higher frequency. At night, both the electricity demand and frequency decrease [28]. Diurnal variations in wind speed also affect ENFs; higher wind speeds occur during the day, while lower speeds occur at night [29]. To investigate the sequence length impact, we selected 1 min (60 s), 1 h (3600 s), and 1 day (86,400 s) as the dataset sequence lengths—each group was divided into sequences of 60, 3600, and 86,400, respectively;
  • Training set and test set divisions: Each group was split into training and test sets at an 8:2 ratio.
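The preprocessing steps above can be sketched as follows; the function name and the synthetic one-day example are illustrative assumptions, not the paper's actual scripts:

```python
import numpy as np

def preprocess(values: np.ndarray, seq_len: int, train_ratio: float = 0.8):
    """Median imputation of NaNs, division into fixed-length sequences,
    and a chronological 8:2 train/test split, as in Section 4.1.1."""
    # Missing value treatment: replace NaNs with the region-wide median.
    median = np.nanmedian(values)
    values = np.where(np.isnan(values), median, values)
    # Sequence length division: drop the tail that does not fill a window.
    n_seq = len(values) // seq_len
    sequences = values[: n_seq * seq_len].reshape(n_seq, seq_len)
    # Train/test split at an 8:2 ratio, preserving chronological order.
    split = int(n_seq * train_ratio)
    return sequences[:split], sequences[split:]

# Example: one day of 1 Hz ENF readings around a 50 Hz nominal value.
raw = 50.0 + 0.05 * np.random.randn(86_400)
raw[::1000] = np.nan                      # simulated measurement dropouts
train, test = preprocess(raw, seq_len=3600)
```

With a 1 h sequence length (3600 samples) this yields 24 sequences per day, of which 19 go to training and 5 to testing under the 8:2 split.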

4.1.2. Baseline Setting

When employing machine learning for regional ENF classification, researchers utilize data primarily characterized by regional disparities, diverse data collection methodologies, and variations in nominal frequency. To ensure fair and accurate comparisons between our findings and existing research outcomes, we screened the conclusions of prevailing machine learning models based on three prerequisites: a minimum of three classification regions; a nominal frequency of 50 Hz; and direct acquisition from power grids.
In response to advancements in deep learning techniques, we endeavored to apply deep learning to regional ENF classification. We selected the Transformer_Hugging Face Time-Series Classification model (THF_TSC) [30] and the LSTM_TSC model [31], along with the UniTS base model, as benchmark models and trained them on our preprocessed dataset. The variant modified solely with the SAF is named ‘UniTS_SAF’; the variant incorporating only the spectral attention module is named ‘UniTS_SAM’; and the version introducing both modifications concurrently is the enhanced ‘UniTS-SinSpec’.
The aforementioned benchmark configurations facilitated the selection of baseline models meeting the comparative criteria detailed in Table 4.

4.2. Experimental Conditions and Experimental Design

The experimental framework utilized in this study for deep learning models comprises PyTorch 2.2.2 and torchvision 0.17.2. The computational infrastructure consists of two Intel (R) Xeon (R) Gold 6326 CPU @ 2.90 GHz processors (Intel, Santa Clara, CA, USA) and four NVIDIA A100-PCIE-40 GB graphics cards (NVIDIA, Santa Clara, CA, USA). The experimental models include LSTM_TSC, THF_TSC, UniTS, UniTS_SAF, UniTS_SAM, and UniTS-SinSpec, as delineated in Table 4. The datasets employed are A1, A2, A3, B1, B2, and B3, as illustrated in Table 3. Distributed training is executed using PyTorch’s distributed data parallel (DDP) tool, while hyperparameter optimization (HPO) is conducted utilizing Optuna. The initial parameter configurations for the models are detailed in Table 5.
To assess the efficacy of the enhanced model and investigate the factors influencing its performance, this study designed three experiments, each comprising multiple groups aligned with specific experimental objectives. Each group underwent 20 epochs of training.
  • Experiment 1: assessed the performance of the UniTS baseline model and examined the impacts of the temporal span and sequence length on model training. The LSTM_TSC, THF_TSC, and UniTS models were trained using preprocessed datasets A1, A2, A3, B1, B2, and B3 to establish a total of six comparative experiments.
  • Experiment 2: validated the effectiveness of the improved UniTS-SinSpec model through three ablation experiments. The first group utilized UniTS_SAF, the second group used UniTS_SAM, and the third group employed UniTS-SinSpec. The dataset used in this experiment was consistent with that used in Experiment 1, where optimal training results were achieved by the UniTS baseline model.
  • Experiment 3: Optuna was employed to search for hyperparameters to optimize the performance of the UniTS-SinSpec model. The original parameters are detailed in Table 5. Dataset usage was consistent with Experiment 1, and the optimized model was compared against benchmark models, including UniTS_SAF and UniTS_SAM.

4.3. Experimental Results and Analysis

4.3.1. Experiment 1

The findings from Experiment 1 are presented in Table 6, showing the classification accuracy of different models across varying sequence lengths and effective sample sizes. Notably, the UniTS model exhibits a significantly superior performance across all available datasets compared to the other models.
  • Impact of Effective Sample Size: Datasets with larger effective sample sizes (Dataset A) lead to a higher classification accuracy compared to smaller sample sizes (Dataset B). This demonstrates that more representative data enable better model generalization. As shown in Table 6, the UniTS, THF_TSC, and LSTM_TSC models achieve 11.92%, 8.93%, and 2.58% higher classification accuracies on Dataset A compared to Dataset B, respectively. This illustrates the importance of larger sample sizes in improving the performance across all models.
  • Impact of Sequence Length: Increasing the sequence length from 60 to 3600 significantly improves the classification accuracy, as longer sequences provide more contextual information. Specifically, the classification accuracies of the UniTS, THF_TSC, and LSTM_TSC models improve by 2.54%, 0.43%, and 5.94%, respectively, showing that longer sequence lengths enhance the models’ performance. However, extremely long sequences (86,400) pose computational challenges for complex models, like UniTS and THF_TSC, which encounter memory-related errors. In contrast, the simpler LSTM_TSC model can handle longer sequences, but at the cost of reduced accuracy, achieving only 5.64%.

4.3.2. Experiment 2

The results of Experiment 1 demonstrate that the model’s performance on Dataset A2 is superior, which led to its selection for further evaluation in Experiment 2. The findings from Experiment 2, as shown in Table 7, involve three ablation experiments: the UniTS_SAF model using only the sinusoidal activation function, the UniTS_SAM model using only the spectral attention mechanism, and the combined UniTS-SinSpec model, which integrates both components. The results reveal that both the UniTS_SAF and UniTS_SAM models show significant improvements over the baseline UniTS model. The combined UniTS-SinSpec model reached the highest average validation accuracy of 96.24%, demonstrating that integrating both components leads to greater performance gains than using either individually. However, this improvement also resulted in an increased training time due to the added complexity.

4.3.3. Experiment 3

After conducting an HPO with Optuna, we found that a batch size of 16, a model depth of 256, a hidden layer depth of 1, and a patch size and stride of 300 yielded the best performance for the UniTS-SinSpec model. The final results of Experiment 3 are shown in Table 8.
The results demonstrate that the HPO-optimized UniTS-SinSpec model achieves an average validation accuracy of 97.47% in classifying ENF data, significantly surpassing both the baseline model and the other enhanced models. Notably, the optimized UniTS-SinSpec model improves on the baseline UniTS model by nearly 6 percentage points.
It is noteworthy that the optimal patch length and stride of 300, which corresponds to feeding the 1 s resolution ENF data in sequential 5 min segments, yields the model's highest classification performance at a reduced average running time of approximately 9.0597 s. We posit that, owing to its one-dimensional nature, ENF data exhibit low complexity; thus, reducing model complexity and choosing an appropriate patch length and stride effectively enhance both the model's performance and its efficiency.
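The optimal patching can be illustrated with a short sketch. With 1 s resolution data, a patch length and stride of 300 turns each one-hour (3600-sample) sequence into twelve non-overlapping 5 min patches; the sketch below assumes non-overlapping patches, i.e., stride equal to patch length, as in the optimum reported above.

```python
import numpy as np

def to_patches(seq: np.ndarray, patch_len: int, stride: int) -> np.ndarray:
    # Slide a window of patch_len samples over the sequence with the given stride.
    starts = range(0, len(seq) - patch_len + 1, stride)
    return np.stack([seq[s : s + patch_len] for s in starts])

hour = 50.0 + 0.01 * np.random.randn(3600)   # one hour of 1 Hz-sampled ENF data
patches = to_patches(hour, patch_len=300, stride=300)
print(patches.shape)  # (12, 300): twelve 5 min patches per one-hour sequence
```

Fewer, longer patches mean a shorter token sequence for the transformer backbone, which is consistent with the reduced running time observed at this setting.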

4.4. Discussion and Conclusions

The experimental results demonstrate that the enhanced UniTS-SinSpec model achieves significantly higher classification accuracy and speed than the baseline model and the other enhanced models. Experiment 1 shows that the UniTS model outperforms the traditional LSTM_TSC and THF_TSC models on the selected dataset. Analyzing the impact of different time spans and sequence lengths reveals that longer time series offer more contextual information, with one-hour sequences proving the most suitable and yielding a marked gain in classification performance; excessively long sequences, however, leave too few valid samples and produce overly large tensors, degrading training effectiveness. Experiment 2 reveals that both the sinusoidal activation function and the spectral attention module notably enhance the model's performance, with the combined UniTS-SinSpec model performing best, validating the efficacy of this enhancement approach. The improved model does require a longer training time, highlighting a trade-off between performance and computational resources that must be considered in practical applications. Finally, Experiment 3 optimizes the UniTS-SinSpec hyperparameters with Optuna, reaching an optimized average accuracy of 97.47% on the ENF classification task and significantly surpassing both the baseline and the other enhanced models. It was also found that lower model complexity actually improves classification performance for ENF data; furthermore, feeding the ENF data in sequential 5 min (300-sample) segments achieves the best classification performance at an average runtime of only 9.0597 s.

5. Conclusions and Prospects

This study presents an enhanced ENF region classification model, UniTS-SinSpec, which effectively leverages frequency domain information and periodic features. Through rigorous experimentation, including ablation studies and hyperparameter tuning, the model achieved a classification accuracy of 97.47%. The ablation experiments highlighted the distinct contributions of the sinusoidal activation function and spectral attention mechanism, with the combined model delivering the best performance.
ENF classification has numerous important applications across various fields. In forensic analysis, ENF signals are used for time and location verifications in audio and video recordings, aiding in the authentication of multimedia content. In power grid monitoring, ENF signals are employed for power-quality monitoring and fault detection, identifying anomalies in grid stability. In audio watermarking and copyright protection, ENF serves as a unique identifier to verify the authenticity of audio content. Furthermore, ENF signals can be used for device and location tracking, determining the position of devices or individuals through signal variations, and for device identification. Additionally, ENF signals support audio–video synchronization, enabling the alignment of asynchronous media content.
Despite the significant contributions of this study, there are several limitations. The dataset used primarily originates from the OSF platform, which may limit the diversity and coverage of the data, and the model relies on high-quality, synchronized ENF data. Future research should focus on collecting data from various regions and time periods with different nominal frequencies to improve the model's generalization and robustness. Moreover, the periodic nature of the sinusoidal activation function may lead to local minima during training, increasing the risk of suboptimal convergence, as highlighted in prior work on sinusoidal activations [22]. Future studies should explore ways to optimize this activation function to mitigate such risks and improve training efficiency. Lastly, the model's performance on short-term grid anomalies, as well as its computational efficiency on large datasets, warrants further exploration.

Author Contributions

Original draft, Y.L.; review and editing, T.L.; methodology, G.Z.; data curation, K.Z.; validation, S.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded and supported by the Fundamental Research Funds for the Central Universities (Grant No. 2023JKF01ZK08) and the People’s Public Security University of China Top-notch innovative talents training funds, which supports graduate research innovation projects (Grant No. 2023yjsky010).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study are publicly available in the ‘Power Grid Frequency Data Base’ at https://osf.io/by5hu/ (accessed on 27 June 2024). No new data were created during this study.

Acknowledgments

The authors used ChatGPT to assist in checking the manuscript for spelling and grammatical errors.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Rodriguez, D.P.N.; Apolinario, J.A.; Biscainho, L.W.P. Audio Authenticity: Detecting ENF Discontinuity with High Precision Phase Analysis. IEEE Trans. Inf. Forensics Secur. 2010, 5, 534–543.
  2. National Technical Information Service. NTIS Homepage. Available online: https://www.ntis.gov/ (accessed on 26 September 2024).
  3. Kajstura, M.; Trawinska, A.; Hebenstreit, J. Application of the Electrical Network Frequency (ENF) Criterion: A case of a digital recording. Forensic Sci. Int. 2005, 155, 165–171.
  4. Huijbregtse, M.; Geradts, Z. Using the ENF criterion for determining the time of recording of short digital audio recordings. Forensic Sci. Int. 2008, 175, 148–157.
  5. Hajj-Ahmad, R.A.; Geradts, Z.; Wu, M. Instantaneous Frequency Estimation and Localization for ENF Signals. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Hollywood, CA, USA, 3–6 December 2012; pp. 2877–2881.
  6. Grigoras, C. Applications of ENF criterion in forensic audio, video, computer and telecommunication analysis. Forensic Sci. Int. 2006, 167, 136–145.
  7. Liu, Y.; Yuan, Z.; Markham, P.N.; Conners, R.; Liu, Y. Application of Power System Frequency for Digital Audio Authentication. IEEE Trans. Power Deliv. 2012, 27, 1820–1828.
  8. Garg, R.; Hajj-Ahmad, A.; Wu, M. Geo-Location Estimation from Electrical Network Frequency Signals. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, BC, Canada, 26–31 May 2013; pp. 2862–2866.
  9. Chuang, W.H.; Garg, R.; Wu, M. Anti-Forensics and Countermeasures of Electrical Network Frequency Analysis. IEEE Trans. Inf. Forensics Secur. 2013, 8, 2073–2088.
  10. Chakma, S.; Fattah, S.A. Location Identification Using Power and Audio Data Based on Temporal Variation of Electric Network Frequency and Its Harmonics. In Proceedings of the 2018 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON-ECE), Chonburi, Thailand, 14–16 December 2018; pp. 1–4.
  11. Bang, W.; Yoon, J.W. Power Grid Estimation Using Electric Network Frequency Signals. Secur. Commun. Netw. 2019, 2019, 1982168.
  12. Garg, R.; Hajj-Ahmad, A.; Wu, M. Feasibility Study on Intra-Grid Location Estimation Using Power ENF Signals. arXiv 2021, arXiv:2105.00668.
  13. Khairy, I. ENF Based Classification and Extraction. IEEE Signal Processing Cup 2016. Available online: https://github.com/ibrahimkhairy/ENF_based_classification_and_extraction (accessed on 27 June 2024).
  14. Tzolopoulos, G.; Korgialas, C.; Kotropoulos, C. On Spectrogram Analysis in a Multiple Classifier Fusion Framework for Power Grid Classification Using Electric Network Frequency. arXiv 2024, arXiv:2403.18402.
  15. Gao, S.; Koker, T.; Queen, O.; Hartvigsen, T.; Tsiligkaridis, T.; Zitnik, M. UNITS: A Unified Multi-Task Time Series Model. arXiv 2024, arXiv:2403.00131.
  16. Yule, G.U. On a Method of Investigating Periodicities in Distributed Series, with special reference to Wolfer’s Sunspot Numbers. Philos. Trans. R. Soc. Lond. A 1927, 226, 267–298.
  17. Kim, Y. Time Series Analysis for Macroeconomics and Finance; University of Kentucky: Lexington, KY, USA, 2024.
  18. Jumar, R.; Maaß, H.; Schäfer, B.; Gorjão, L.R.; Hagenmeyer, V. Database of Power Grid Frequency Measurements. arXiv 2020, arXiv:2006.01771.
  19. Kruse, T.; Vanfretti, L.; Silva, F. Data-Driven Trajectory Prediction of Grid Power Frequency Based on Neural Models. Electronics 2021, 10, 151.
  20. Hendrycks, D.; Gimpel, K. Gaussian Error Linear Units (GELUs). arXiv 2016, arXiv:1606.08415.
  21. Nair, V.; Hinton, G.E. Rectified Linear Units Improve Restricted Boltzmann Machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel, 21–24 June 2010; pp. 807–814.
  22. Parascandolo, G.; Virtanen, T. Taming the Waves: Sine as Activation Function in Deep Neural Networks. In Proceedings of the ICLR 2017, Toulon, France, 24–26 April 2017.
  23. Gashler, M.S.; Ashmore, S.C. Training Deep Fourier Neural Networks to Fit Time-Series Data. arXiv 2014, arXiv:1405.2262.
  24. Sitzmann, V.; Martel, J.N.P.; Bergman, A.W.; Lindell, D.B.; Wetzstein, G. Implicit Neural Representations with Periodic Activation Functions (SIREN). Adv. Neural Inf. Process. Syst. 2020, 33, 7462–7473.
  25. Zhou, S.; Pan, Y. Spectrum Attention Mechanism for Time Series Classification. In Proceedings of the 2021 Chinese Control and Decision Conference (CCDC), Suzhou, China, 14–16 May 2021; pp. 1–6.
  26. Tumino, P. Frequency Control in a Power System; EE Power: Penns Park, PA, USA, 2020. Available online: https://eepower.com/technical-articles/frequency-control-in-a-power-system (accessed on 25 July 2024).
  27. Power Grid Frequency. Available online: https://power-grid-frequency.org/ (accessed on 27 June 2024).
  28. Schäfer, B.; Witthaut, D.; Timme, M.; Latora, V. Non-Gaussian power grid frequency fluctuations characterized by Lévy-stable laws and superstatistics. Nat. Energy 2017, 2, 17058.
  29. Goss, R.; Mulder, F.; Howard, B. Implications of diurnal and seasonal variations in renewable energy generation for large scale energy storage. J. Renew. Sustain. Energy 2020, 12, 045501.
  30. Hugging Face. Time Series Transformer Classification. Available online: https://huggingface.co/keras-io/timeseries_transformer_classification/tree/main (accessed on 27 June 2024).
  31. Romijnders, R. LSTM Time Series Classification. Available online: https://github.com/RobRomijnders/LSTM_ (accessed on 27 June 2024).
  32. Chakma, S.; Chowdhury, D.; Sarkar, M. Power line data based grid identification using signal processing. In Proceedings of the 2016 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON-ECE), Pune, India, 19–21 December 2016; pp. 1–4.
Figure 1. UniTS architecture.
Figure 2. Illustration of activation functions.
Figure 3. Comparison of activation function effects.
Figure 4. Spectrum attention mechanism improvement diagram.
Table 1. Schematic table of nominal voltages and ENFs in selected countries.

Region  | Voltage   | Frequency
Brazil  | 110/220 V | 60 Hz
China   | 220 V     | 50 Hz
Germany | 230 V     | 50 Hz
Japan   | 100 V     | 50/60 Hz
USA     | 120 V     | 60 Hz
Myanmar | 230 V     | 50 Hz
Table 2. Original dataset.

Time Span | Location of Measurements        | Synchronous Area   | Time Range of Dataset   | Selected Range          | Frequency | Resolution | Number of Days
1 year    | Baden-Württemberg, Germany      | TransnetBW         | 2011-07–2020-03         | 2018-01-01–2019-12-31   | 50 Hz     | 1 s        | 360
          | London, United Kingdom          | National Grid      | 2014-01–2019-12         | 2018-01-01–2019-12-31   | 50 Hz     | 1 s        | 360
          | Zealand, Denmark                | Nordic Grid        | 2018-01-01–2018-12-31   | 2018-01-01–2018-12-31   | 50 Hz     | 1 s        | 360
41 days   | Karlsruhe, Germany              | Continental Europe | 2019-07-09–2019-08-18   | 2019-07-09–2019-08-18   | 50 Hz     | 1 s        | 41
          | Oldenburg, Germany              | Continental Europe | 2019-07-10–2019-08-07   | 2019-07-10–2019-08-07   | 50 Hz     | 1 s        | 41
          | Istanbul, Turkey                | Continental Europe | 2019-07-09–2019-08-18   | 2019-07-09–2019-08-18   | 50 Hz     | 1 s        | 41
          | Lisbon, Portugal                | Continental Europe | 2019-07-09–2019-08-16   | 2019-07-09–2019-08-16   | 50 Hz     | 1 s        | 41
Table 3. Preprocessed dataset.

Time Span | Region and Label                | Sequence Length | Dataset Number
1 year    | Baden-Württemberg, Germany—0    | 60              | A1
          | London, United Kingdom—1        | 3600            | A2
          | Zealand, Denmark—2              | 86,400          | A3
41 days   | Karlsruhe, Germany—3            | 60              | B1
          | Oldenburg, Germany—4            | 3600            | B2
          | Istanbul, Turkey—5              | 86,400          | B3
          | Lisbon, Portugal—6              |                 |
Table 4. Baseline model.

Model Category | Model
Machine learning
  • Power line data-based grid identification using signal processing (2016) [32].
  • Location identification using power and audio data based on temporal variations in electric network frequencies and their harmonics (2018) [10].
  • Power grid estimation using electric network frequency signals (2019) [11].
  • Spectrogram analysis in a multiple classifier fusion framework for power grid classifications using electric network frequencies (2024) [14].
Deep learning
  • LSTM_TSC (2016) [31]. 1
  • THF_TSC (2018) [30]. 2
  • UniTS (2024) [15].
  • UniTS_SAF. 3
  • UniTS_SAM. 4
  • UniTS-SinSpec.
1 LSTM Time-Series Classification model—LSTM_TSC [31]. 2 Transformer Hugging Face Time-Series Classification model—THF_TSC [30]. 3 UniTS model with only the sine activation function—UniTS_SAF. 4 UniTS model with only the spectral attention mechanism enhancement module—UniTS_SAM.
Table 5. Models’ initial parameters.

Model          | Batch Size | Initial Learning Rate | Model Layers | Hidden Layers | Patch Length and Stride
UniTS          | 32         | 0.0001                | 512          | 2             | 16
UniTS_SAF      | 32         | 0.0001                | 512          | 2             | 16
UniTS_SAM      | 32         | 0.0001                | 512          | 2             | 16
UniTS-SinSpec  | 32         | 0.0001                | 512          | 2             | 16
THF_TSC        | 32         | 0.0001                | 12           | 12            | 25
LSTM_TSC       | 32         | 0.0001                | 2            | 2             | none
Table 6. Classification accuracy comparison of models based on the sequence length and effective sample size (Experiment 1).

Dataset Number | Number of Valid Samples | Sequence Length | UniTS  | THF_TSC | LSTM_TSC
A1             | 1,575,357               | 60              | 88.95% | 57.86%  | 26.6%
A2             | 188,928                 | 3600            | 91.49% | 58.29%  | 32.54%
A3             | 1046                    | 86,400          | none   | none    | 6.25%
B1             | 188,921                 | 60              | 77.03% | 48.93%  | 24.02%
B2             | 22,656                  | 3600            | 79.61% | 50.21%  | 29.37%
B3             | 124                     | 86,400          | none   | none    | 5.64%
Table 7. Performance comparison of ablation models (Experiment 2).

Index                        | UniTS   | UniTS_SAF | UniTS_SAM | UniTS-SinSpec
Time consumed                | 33.28 s | 44.01 s   | 45.674 s  | 46.549 s
Average validation accuracy  | 91.49%  | 94.43%    | 94.11%    | 96.24%
Table 8. Model performance comparison.

Model Category   | Model Name                                                                                                                              | Average Accuracy
Machine learning | Power-line data-based grid identification using signal processing (2016) [32]                                                           | 88.23%
                 | Location identification using power and audio data based on temporal variation of electric network frequency and its harmonics (2018) [10] | 88.67%
                 | Power grid estimation using electric network frequency signals (2019) [11]                                                              | 88.23%
                 | Spectrogram analysis in a multiple classifier fusion framework for power grid classification using the electric network frequency (2024) [14] | 96%
Deep learning    | LSTM_TSC                                                                                                                                | 32.54%
                 | THF_TSC                                                                                                                                 | 57.86%
                 | UniTS                                                                                                                                   | 91.49%
                 | UniTS_SAF                                                                                                                               | 94.43%
                 | UniTS_SAM                                                                                                                               | 94.11%
                 | UniTS-SinSpec                                                                                                                           | 96.24%
                 | UniTS-SinSpec (HPO via Optuna)                                                                                                          | 97.47%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, Y.; Lu, T.; Zeng, G.; Zhao, K.; Peng, S. Advanced ENF Region Classification Using UniTS-SinSpec: A Novel Approach Integrating Sinusoidal Activation Function and Spectral Attention. Appl. Sci. 2024, 14, 9081. https://doi.org/10.3390/app14199081
