1. Introduction
The Distribution Network (DN), as the last stage in the electricity supply chain, plays a crucial role in delivering electricity to homes, businesses, and industries. It ensures a reliable supply for our modern lifestyles, where any disruption can cause inconvenience, financial losses, and safety hazards. Distribution Network Operators (DNOs) utilise a reactive approach to continuously handle variations in voltage and power quality, thus ensuring that these remain within the bounds stipulated by regulations and that the electricity supply is continuous. For European networks, the BS EN 50160 standard sets the requirements for voltage characteristics in public low-voltage (LV) networks [1]. Because LV networks constitute the lateral and most dense part of electrical power grids, they are more susceptible to disruptions stemming from numerous origins. These include, but are not limited to, aging components or cables, incorrect connection configurations following maintenance procedures, or construction works that interfere with electrical infrastructure [2,3]. Regardless of the fault origin, these occurrences have a negative impact on system reliability, resulting in costly repairs and power loss for customers, which significantly impacts the DNO because of the Customer Minutes Lost (CML) penalties. Due to the unforeseeable nature of faults, it is imperative that, when a fault arises, it is identified and isolated within the distribution system as quickly as possible to mitigate its consequences.
The current fault identification and location methods can be split into two general categories: traditional techniques and advanced data-driven techniques. The traditional techniques are reactive in nature, meaning that they are used after a fault has already occurred. These include impedance-based, travelling-wave-based, and knowledge/experience-based techniques. Impedance-based methods use voltage and current measurements to determine the type of fault and then estimate the fault location from the apparent impedance [4,5]. Although the impedance-based method is one of the oldest techniques, it is still used due to its simplicity. Recent impedance-based literature [6,7,8] extends the method to increase its accuracy by taking into account extra parameters such as time-varying load profiles, unbalanced networks, and high-impedance shunt faults. However, as impedance-based methods rely on single-ended measurements at the substation, they are prone to errors arising from the variability of cable impedance and the non-linearities and harmonics introduced by emerging load types. Travelling-wave techniques measure the propagation time of the step/ramp voltage wave generated by the fault and translate this time into the fault distance and, thus, the location [9]. The accuracy of travelling-wave techniques depends on how the recorded wave is analysed, and it can be improved by using wavelet or artificial intelligence methods for wave analysis [10,11]. This analysis requires a small measurement window (e.g., 100 microseconds) for time-domain feature extraction [12]. While travelling-wave techniques for fault location in DNs benefit from enhanced accuracy through advanced wave analysis, they require a high-sampling-rate device, which limits fault analysis to the locations where such a device is installed. Knowledge- or experience-based fault location methods rely on the expertise of network operators. Experienced personnel assess the condition of various parts of the network through a combination of visual, smell, and noise inspections, and they can often identify potential fault locations by observing specific cues [13,14]. Visual signs include physical damage, while unusual smells or sounds can indicate faulty equipment or electrical arcing; such inspections may be aided by a sniffing dog [15]. However, these methods are highly dependent on the expertise of individual operators. This dependency on specific personnel, together with the need for effective knowledge transfer, can introduce variability and potential inaccuracies in fault identification, as the outcome relies on subjective judgment and experience. As mentioned earlier, all of these traditional techniques are reactive fault localisation techniques: they are deployed once a fault has already occurred and the cables are de-energised, at which point they help pinpoint the exact fault location. Moreover, for high-impedance shunt faults, the accuracy of these techniques suffers, and such faults might even go unnoticed because their relatively low fault currents are usually below any protection device's high-current setting [16,17].
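To make the travelling-wave principle described above concrete, the following minimal sketch converts a measured propagation time into a fault distance using the standard single-ended (round-trip) and double-ended relations. It is an illustration only and is not taken from the cited works; the function names and the propagation velocity of 1.5e8 m/s used in the example are assumptions.

```python
def fault_distance_single_ended(round_trip_s: float, velocity_m_per_s: float) -> float:
    """Distance to the fault from one measurement point.

    round_trip_s is the time between the first wavefront and its reflection
    from the fault, so the wave has covered the distance twice.
    """
    if round_trip_s < 0 or velocity_m_per_s <= 0:
        raise ValueError("time must be non-negative and velocity positive")
    return velocity_m_per_s * round_trip_s / 2.0


def fault_distance_double_ended(line_length_m: float, t_a_s: float, t_b_s: float,
                                velocity_m_per_s: float) -> float:
    """Distance from end A, given synchronised arrival times at ends A and B."""
    if line_length_m <= 0 or velocity_m_per_s <= 0:
        raise ValueError("line length and velocity must be positive")
    return (line_length_m + velocity_m_per_s * (t_a_s - t_b_s)) / 2.0


if __name__ == "__main__":
    v = 1.5e8  # assumed propagation velocity in the cable (m/s), illustrative only
    print(fault_distance_single_ended(4e-6, v))
    print(fault_distance_double_ended(1000.0, 300.0 / v, 700.0 / v, v))
```

Both relations depend on timing the wavefront at microsecond resolution, which is why these techniques need a high-sampling-rate device at each measurement point.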
In most cases, the identification and location of faults in DNs can be further improved by taking a more active approach that integrates data from end-user devices, such as phasor measurement units (PMUs) and smart meters (SMs), in what is known as data-driven fault localisation. The literature covers numerous examples of using PMUs for fault localisation. The authors of [18] developed a PMU-based fault detection technique for DNs that utilises a combination of voltage deviation and power change indices for fault detection in multi-configured networks. The researchers in [19] introduced an integrated fault detection, classification, and section identification scheme that utilises the precise synchronised measurements of PMUs to pinpoint and categorise various types of faults within a distribution grid. Another approach, by [20], was to integrate the data from PMUs with machine learning (ML), utilising the magnitude and angle information of the measurements to enhance fault detection reliability. Similarly, the authors in [21] utilised PMU data along with a deep learning technique for short-term voltage stability. Ref. [22] provided a practical framework for decentralised fault localisation with PMUs performing the fault detection at fault nodes, but this requires computational capability at each PMU node, which can translate into extra costs. Generally, the utilisation of PMUs in low-voltage distribution networks (LVDNs) is impractical due to the high cost of their installation. On the other hand, SMs are currently being deployed within national energy grid upgrade initiatives [23]. However, the use of SM data for fault identification has not been studied properly: existing studies are often impractical, either because they exceed realistic smart meter capabilities or because they assume continuous measurements after de-energisation, which is not the case [24]. References [25,26] present an Artificial Neural Network (ANN)-based fault location method for the IEEE-13 bus and IEEE-37 bus systems, in which the data from Advanced Metering Infrastructure (AMI) was employed. The proposed algorithm utilises both voltage magnitudes and currents from the SMs to accurately determine fault locations. The authors in [27,28] used measurements from a feeder terminal unit at each section to detect fault currents and to identify the faulted section, an approach that relies on the accuracy of the current measurement devices. Refs. [29,30] provided a theoretically validated fault-finding algorithm that relies solely on voltage magnitudes and the bus admittance matrix, but the practicality of deploying such an algorithm for LVDNs was not specified. However, most of these techniques assume the availability of AMIs or SMs as high-sampling measurement devices with robust capabilities for current and power measurements [31,32]. These assumptions can often prove impractical given the actual capabilities of deployed SMs, which are typically only able to capture voltage magnitudes at a half-hourly sampling rate [33,34]. Moreover, the limited installation of SMs necessitates the integration of additional data sources for practical fault identification and localisation.
While most of the research in the literature focuses on reactive fault identification approaches after the fault, this work distinguishes itself by prioritising pre-fault identification. Specifically, it emphasises the detection of high-impedance shunt faults that have the potential to develop into more serious faults. For example, the degradation of cable insulation might start as a high-impedance shunt fault and develop into a serious short-circuit that interrupts supply [35,36,37]. This proactive approach represents a significant departure from the predominant reactive methods. By identifying and addressing faults at an earlier stage, the proposed methodology aims to prevent potentially catastrophic outcomes, thus enhancing the overall reliability and safety of distribution networks. Moreover, it is important to highlight that only a limited number of techniques in the existing literature endorse the deployment of PMUs for high-impedance shunt faults as a preventive pre-fault measure. These techniques could potentially attain high theoretical accuracies through their access to full voltage and current waveforms [38,39]. In contrast, the approach proposed in this paper relies exclusively on pre-installed smart meters, with the practical limitation of half-hourly voltage-magnitude measurements only, thus presenting a cost-effective and pragmatic alternative. To address this limitation, this manuscript introduces a novel approach for detecting high-impedance shunt faults and open-conductor faults in DN feeders. SM measurements, which are mainly voltage-magnitude readings, are augmented by the virtual data generated within a digital twin (DT). This DT serves as a digital replica of the physical asset, emulating its behaviour and capturing its real-time state with high fidelity. The application of DTs has demonstrated success in various facets of power systems asset management [40], including transformers [41], electrical machines [42], power electronics devices [43,44], and renewable power generation units [45,46,47]. However, there have been fewer attempts to use the DT concept for entire network management, and all of the attempts thus far have been within the scope of the transmission network [48,49,50]. The literature review therefore indicates that the DT potential within DN studies remains untapped. By integrating the DT paradigm with SM data, our objective is to significantly enhance the early detection and precise localisation of high-impedance shunt faults as they develop in cables and before protection schemes disconnect the circuit.
The presented work leverages the capabilities of the DT in tandem with SM voltage-magnitude readings. This enables the creation of a comprehensive database of fault scenarios, which are effectively matched with their distinctive voltage fingerprint. However, it is important to note that relying solely on the SM-based voltage-only approach yielded an accuracy of only 70.7% in fault type and location classification, as will be shown in later sections. As a result, this research advocates for the integration of the line Currents Symmetrical Component (CSC) to augment fault detection capabilities.
Since SMs do not directly provide current data, this paper proposes an ML-based regression method to estimate the line currents within the DT. This critical step significantly improves fault localisation and identification accuracy, raising it from the original 70.7% to 95.77%. These results affirm the pivotal role of a DT in DNs: it enables highly accurate fault detection from SM voltage-only data, as well as the estimation of the line CSC. The proposed DT-based fault detection and localisation presents an accurate SM-informed fault detection solution, thereby enhancing customer connectivity and streamlining maintenance team dispatch efficiency, all without the need for additional costly PMUs on the densely-noded distribution network.
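For readers less familiar with symmetrical components, the sketch below shows the standard Fortescue decomposition of three phase-current phasors into zero-, positive-, and negative-sequence components. It illustrates only the quantity being estimated, not the ML regression that estimates it from SM voltages; the function name and the example phasors are assumptions made for illustration.

```python
import cmath
import math

# Fortescue operator: a unit phasor rotated by +120 degrees.
A = cmath.exp(2j * math.pi / 3)


def symmetrical_components(ia: complex, ib: complex, ic: complex):
    """Return the (zero, positive, negative) sequence components of phase a."""
    i0 = (ia + ib + ic) / 3
    i1 = (ia + A * ib + A ** 2 * ic) / 3
    i2 = (ia + A ** 2 * ib + A * ic) / 3
    return i0, i1, i2


if __name__ == "__main__":
    # Hypothetical balanced positive-sequence currents of 10 A magnitude.
    ia = cmath.rect(10.0, 0.0)
    ib = cmath.rect(10.0, -2 * math.pi / 3)
    ic = cmath.rect(10.0, 2 * math.pi / 3)
    for name, value in zip(("I0", "I1", "I2"), symmetrical_components(ia, ib, ic)):
        print(name, round(abs(value), 6))
```

For a balanced set only the positive-sequence component is non-zero, so non-zero zero- or negative-sequence currents are a natural indicator of the unbalance that shunt and open-conductor faults introduce.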
The rest of the paper is organised as follows: Section 2 provides an overview of the proposed digital twin and describes the studied SPEN feeder. It also introduces the two-stage fault detection, identification, and localisation methodology. Section 3 covers simulation, data preparation, and ML model training, including the design of the high-impedance shunt fault and open-conductor fault scenarios, the simulations and training dataset, and the machine learning models. Section 4 presents the results and discusses the impact of partial SM coverage on fault localisation accuracy. Finally, Section 5 presents the conclusions, summarising the research findings and their implications.
4. Results and Discussion
In order to estimate the overall accuracy of the proposed approach, the original DIgSILENT PowerFactory model was again used to simulate 100 test scenarios in which the fault location, type, and resistance were randomised. Then, the SM voltages were captured and fed into the pre-trained ML models in order to obtain a predicted fault cable location and type. The test fault scenarios were designed to be more general than the training scenarios. This demonstrated the capability of the ML models to generalise from the training data and to make predictions for fault locations that were not specifically encountered during training. For the type classification, the model achieved an accuracy of 100% in determining the fault type. For cable location, on the other hand, an accuracy of 78% was found. However, this accuracy counts only the exact matches between the correct cable and the predicted cable, as displayed in Table 2. It can also be seen that the misclassified cables were geographically close to the correct ones; hence, an additional distance error, defined as the distance between the predicted cable's midpoint and the correct cable's midpoint, was used to express the classification error in terms of distance. The histogram displayed in Figure 11 shows clearly that, although the exact-match accuracy was 78%, 87 of the 100 test cases had a distance error between 0 and 0.02 km, 6 cases had a distance error between 0.02 and 0.04 km, and another 6 cases had a distance error between 0.04 and 0.06 km. Out of the 100 cases, only 1 case had a distance error above 0.06 km.
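The distance-error metric and its histogram can be sketched as follows. This is a minimal illustration rather than the code used in the study: it assumes that each cable is described by the planar coordinates of its two ends in kilometres and that the error is the straight-line distance between midpoints, neither of which is stated in the text. The function names and sample values are hypothetical.

```python
import math
from collections import Counter


def midpoint(cable):
    """Midpoint of a cable given as ((x1, y1), (x2, y2)) in km (assumed format)."""
    (x1, y1), (x2, y2) = cable
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def distance_error_km(predicted_cable, actual_cable):
    """Straight-line distance between the two cable midpoints, in km."""
    px, py = midpoint(predicted_cable)
    ax, ay = midpoint(actual_cable)
    return math.hypot(px - ax, py - ay)


def bin_errors(errors_km, width_km=0.02):
    """Count errors per bin; bin k covers [k * width_km, (k + 1) * width_km)."""
    counts = Counter(int(e // width_km) for e in errors_km)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    cable_a = ((0.00, 0.0), (0.02, 0.0))  # hypothetical coordinates
    cable_b = ((0.02, 0.0), (0.04, 0.0))
    print(distance_error_km(cable_a, cable_b))
    print(bin_errors([0.0, 0.01, 0.03, 0.05, 0.07]))
```

An exact match gives an error of zero, so the first bin contains all correctly classified cables together with those misclassified as a very close neighbour, which is why it can hold more cases than the exact-match accuracy alone would suggest.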
In the context of fault location within distribution networks, the achieved results hold significant promise. Relying solely on SMs for fault identification and localisation represents a shift towards more efficient and cost-effective monitoring strategies, which is only made possible by the DT model. The fault type classification accuracy of 100% underscores the robustness and added value of the proposed initial stage of CSC estimation from the SM voltages. Meanwhile, for fault location, 99% of the predicted locations were within 0.06 km of the actual fault location. This makes the proposed methodology helpful in the automated detection of faults and their types, and it helps narrow down the search area to a specific cable section. This, in turn, will lead to far fewer connectivity disruptions for customers and will improve the efficacy of maintenance teams being dispatched to the fault location.
4.1. Detection of Multiple Faults
While the proposed algorithm is effective in identifying individual faults, it is also designed with real-time applications in mind. Once a fault or network topology change is recorded, the DT undergoes a prompt event-based update process. This includes retraining the classification models so that they reflect the latest network configuration, including any existing faults. This dynamic approach allows the system to operate with the most up-to-date information and to identify and record multiple faults as they occur sequentially. An alternative approach to identifying multiple or simultaneous faults would be to exhaustively simulate all the potential permutations of simultaneous faults in the initial training dataset; however, this would be computationally expensive, and a larger network size may lead to more confusion for the classification models.
4.2. Impact of the Partial SM Coverage on Fault Localisation Accuracy
In this section, the 100 test case scenarios were re-simulated to understand how the accuracy of the proposed method is affected by partial SM coverage. The cases were based on the distributions of available smart meters shown in Figure 12, where a cross indicates an unavailable SM. In Case 1, the initial case studied previously, all 19 possible SMs were connected (19/19), so no crosses appear. In Case 2, 4 SMs were unavailable, which meant that measurements were collected from only 15 of the smart meters (15/19). Cases 3 and 4 both featured only 13 connected SMs, the difference being that, for Case 4, the disconnected SMs were the lateral ones, which reduced the visibility of the voltages. Finally, the low-coverage Cases 5 and 6 featured only 9 and 6 SMs, respectively, representing low SM penetration scenarios.
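The effect of removing meters can be illustrated with a small sketch in which a voltage fingerprint is restricted to the available SMs before being matched against stored fault scenarios. A plain nearest-neighbour lookup stands in here for the trained classification models, whose details are not given in this section; the scenario labels, the four-meter layout, and the per-unit voltages are all hypothetical.

```python
import math


def restrict(fingerprint, available):
    """Keep only the voltages of the smart meters whose indices are available."""
    return [fingerprint[i] for i in available]


def rank_scenarios(database, reading, available):
    """Rank stored scenarios by Euclidean distance to the reading.

    Only the available meters contribute to the distance.
    Returns a list of (distance, label) tuples, closest first.
    """
    observed = restrict(reading, available)
    ranked = [
        (math.dist(restrict(fingerprint, available), observed), label)
        for label, fingerprint in database.items()
    ]
    return sorted(ranked)


# Hypothetical fingerprints (per-unit voltages at 4 meters); meter 3 is on a lateral.
DATABASE = {
    "cable_A": [1.00, 0.98, 0.97, 0.90],
    "cable_B": [1.00, 0.98, 0.97, 0.99],
}
READING = [1.00, 0.98, 0.97, 0.985]

if __name__ == "__main__":
    print(rank_scenarios(DATABASE, READING, available=[0, 1, 2, 3]))
    print(rank_scenarios(DATABASE, READING, available=[0, 1, 2]))
```

In this toy example the two scenarios differ only at the lateral meter, so removing it makes them indistinguishable. This mirrors the reduced visibility described for Case 4, where the disconnected SMs were the lateral ones.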
In the previously studied Case 1, with full connectivity of 19 out of 19 SMs (19/19), the classification accuracy was the highest, at 93.6%. Case 2 included 4 unavailable SMs, so data were collected from only 15 out of 19 meters (15/19), leading to a slight drop in the classification accuracy to 93.4%, which is comparable to the value previously calculated in Section 3.3.3. Both Case 3 and Case 4 had 13 connected SMs, but, in Case 4, the disconnected meters were located on the laterals, resulting in reduced visibility and, consequently, a slightly lower classification accuracy of 93.03%, as opposed to 93.2% for Case 3. Furthermore, the low-coverage Cases 5 and 6, with 9 and 6 smart meters, respectively, exhibited classification accuracies of 92.8% and 91.5%. Figure 13 presents the classification accuracy, which was determined using the training dataset for validation, as well as the practical test accuracy, which was assessed by employing random real-world scenarios from the training dataset for testing, in all six cases. These results indicate that the number and location of the connected SMs have a significant impact on the fault localisation accuracy. Although the classification's validation accuracy remained above 90%, the realistic tests showed a significant drop in accuracy with partial SM coverage. In addition, the drop between Cases 3 and 4 was shown to be much higher for the test scenarios, thus confirming the importance of having SMs dispersed through the lateral nodes of the network. This supports the recommendation of a minimum lateral node coverage of 70% to ensure reliable accuracies.
It is worth mentioning that the accuracies and values presented in this study are directly tied to the specific characteristics of the analysed network, and they are limited to the studied high-impedance pre-fault identification, which underscores the importance of contextual considerations. Nonetheless, they serve as compelling evidence of the viability and effectiveness of the proposed DT-based methodology for fault identification and localisation. It is crucial to acknowledge that the performance metrics and model training outcomes may vary depending on factors such as network size, the density of deployed SMs, and the prevailing operating conditions.
5. Conclusions
This paper introduced a pioneering methodology for detecting high-impedance shunt faults and open-conductor faults in DN feeders. By integrating a distribution network digital twin (DNDT) with SM measurements only, particularly voltage-magnitude readings, and without PMUs, fault detection capabilities for DNOs increased significantly. Initial attempts relied solely on the SM-based voltage-only approach, which yielded an accuracy of 70.7% in fault type and location classification. However, by incorporating the line CSC through a machine learning-based regression method within the DT framework, fault localisation and identification accuracy rose to 95.77%.
Through extensive DT simulations encompassing a spectrum of fault scenarios, the trained ML models exhibited exceptional performance. Fault type classification achieved a flawless 100% accuracy rate, thereby showcasing the robustness of the approach in distinguishing the various fault types. Additionally, the predicted fault locations demonstrated remarkable proximity, whereby they deviated by less than 0.06 km from the actual fault location in 99% of the tested cases. This remarkable precision substantially reduced the area requiring investigation, thus enabling targeted dispatches of maintenance teams and minimising customer disruptions.
These results represent an advancement in fault detection within DNs. The integration of the CSC estimation stage from the SM voltages demonstrates the value of building a DNDT for attaining additional information, as well as for obtaining insights from raw SM voltage-based DT simulations. The analysis of partial SM coverage also highlights the importance of accelerating the SM roll-out programme: expediting it is essential for maximising the benefits and efficiency of applications such as fault localisation.
In summary, the proposed methodology equips DNOs with SM-data-informed fault detection that specifies the fault type with high confidence and provides precise geographic areas for investigation.
The DNDT research opens up new avenues for optimising the operation of active distribution networks. Having a DT in the network empowers broader operational capabilities within diverse distribution network contexts, thus ultimately contributing to enhanced reliability and resilience. Future investigations may focus on testing the methodology with different feeders of diverse network configurations and operational environments. Additionally, an extension of this work could involve pre-fault identification and localisation with a partial coverage of SMs or with a high integration of embedded generation.