*4.7. Other Methods*

This subsection groups all the other reviewed methods that were not classified in any of the six groups. A summary of the contributions of the reviewed methods is presented in Table 7.

Approaches that use graph-based semi-supervised learning (GBSSL) for fault identification and diagnosis in PV arrays are presented in [13,117]. The authors of [117] presented a technique based on GBSSL for recognizing, classifying, locating, and fixing errors in PV arrays, which was improved by expanding the diagnostic space of the GBSSL algorithm and adding more class labels. The model detects and locates a failure and temporarily isolates it, allowing the system to continue operating normally until the problem is resolved. The authors adjusted the way the data were normalized in order to increase the system's ability to find unlearned defects, and they were able to detect faults that the algorithm could not detect at first. The ability of the system to supply the required energy after the defect was eliminated was tested using an interleaved boost converter, and the maximum voltage loss under standard conditions was found to be 1 V, demonstrating the model's high efficiency. The model proposed in [117] improves on an earlier model proposed in [13] by adding a fault location element and achieving higher fault detection and classification accuracy. The algorithm starts by extracting a limited number of labeled data, which, together with their class labels, serve as initial values and are fed into the model. Both faulted and fault-free data are included in this dataset. To obtain the labeled faulty data, faults are purposefully introduced into the system, and the parameters are measured and stored for each sample. The fault-free data are measured and stored in the same way. An important advantage of the proposed GBSSL method is that it requires only a small amount of labeled data for the learning process.
To detect PV system faults, the current of every row of solar cells and the overall voltage of the panel are measured continuously (this is accomplished by installing 10 current sensors, one in each row of cells, and one voltage sensor at the PV system's output points). As a result, each data sample contains 11 parameters, 10 of which are the currents flowing through the rows of cells and one of which is the system's total voltage. In the modeling process, the parameters required to implement the GBSSL algorithm are then entered.

In contrast, the authors of [118] proposed an intelligent fault diagnosis approach for PV arrays based on a kernel extreme learning machine (ELM) with an improved radial basis function (RBF) kernel, optimized by a simulated annealing (SA) algorithm. Short-circuit faults, aging faults, and shadow faults are among the faults identified by the proposed method. The reported results were above 90%, indicating that the SA-RBF-ELM fault classification is accurate and stable. According to the simulation results, the suggested model has three key advantages: (1) the optimal fitness value of the PV array and the model parameters used, as one of the characteristic factors of neural network learning, considerably increase the fault detection accuracy for the four fault types described; (2) the RBF-ELM kernel function has strong learning and classification capabilities, making it suitable for detecting and classifying PV array faults; and (3) the SA algorithm can quickly optimize the parameters of the RBF-ELM fault diagnosis model, significantly improving the model's training accuracy and testing precision. The proposed algorithm is derived from the basic ELM algorithm and adds the features of a kernel-function ELM, which improve its ability to solve regression prediction problems, leading to higher accuracy and faster calculations.
The RBF-ELM model contains a regularization coefficient (C) and a radial width (α), which affect the algorithm's performance. The training accuracy of the RBF-ELM fault model is used as the optimization objective function, and the coefficients C and α are the parameters to be tuned. The simulated annealing approach is then used to optimize these parameters, yielding the best training and test accuracy in each run.

A diagnostic technique for PV systems was developed in [119] using a learning method that takes each PV site's condition into account. The technique employs a diagnostic criteria database to analyze the data acquired from the PV system.
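
For the SA-RBF-ELM approach of [118] discussed above, the parameter search can be illustrated with a minimal simulated annealing sketch. It is not the implementation of [118]: the objective `toy_training_accuracy` is a hypothetical stand-in for the RBF-ELM training accuracy, and the search ranges, step size, and cooling schedule are assumptions chosen only for illustration.

```python
import math
import random


def toy_training_accuracy(log_c, log_alpha):
    """Hypothetical stand-in for the RBF-ELM training accuracy.

    A smooth surface peaking at 1.0 for log10(C) = 2 and log10(alpha) = 0.
    A real study would train the classifier here and return its accuracy.
    """
    return math.exp(-((log_c - 2.0) ** 2 + log_alpha ** 2) / 4.0)


def simulated_annealing(objective, bounds, n_iter=2000, t0=1.0,
                        cooling=0.995, seed=0):
    """Maximize objective(*params) inside the box given by bounds."""
    rng = random.Random(seed)
    current = [(lo + hi) / 2.0 for lo, hi in bounds]  # start at the box centre
    current_score = objective(*current)
    best, best_score = list(current), current_score
    temperature = t0
    for _ in range(n_iter):
        # Propose a neighbour: Gaussian step (10% of each range), clipped to the box.
        candidate = [
            min(hi, max(lo, x + rng.gauss(0.0, 0.1 * (hi - lo))))
            for x, (lo, hi) in zip(current, bounds)
        ]
        candidate_score = objective(*candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with probability exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
        temperature *= cooling  # geometric cooling schedule
    return best, best_score


# Search log10(C) in [-2, 4] and log10(alpha) in [-3, 3] (assumed ranges).
search_box = [(-2.0, 4.0), (-3.0, 3.0)]
best_params, best_accuracy = simulated_annealing(toy_training_accuracy, search_box)
best_c, best_alpha = 10.0 ** best_params[0], 10.0 ** best_params[1]
```

Searching in log space is a design choice of this sketch, made because regularization and kernel-width parameters typically span several orders of magnitude.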

The special features of the technique in [119] include updating the diagnostic criteria, which makes it possible to detect normal or abnormal operating conditions of a PV system; detecting shadow on the modules and the pyranometer using the sophisticated verification (SV) method [120]; and providing maintenance advice through an expert system according to the precise diagnosis. The ratio of acquired data to reference data is calculated to diagnose the system's normality or abnormality. The ratio approaches "1" when actual and average meteorological data are close. For example, when the summer generated power ratio is "1" and the winter generated power ratio is "0.7", a winter shadow or snow on the modules is assumed. The criterion for diagnosis in this situation is "1". The contribution of the proposed method lies in these features: by updating the diagnostic criteria, the normality or abnormality of PV systems can be diagnosed while taking into account the PV system's characteristics as well as the climate; whether a shadow falls on the modules or on the pyranometer is determined using the SV approach and hourly data analysis; and maintenance recommendations are given based on the diagnosis outcome. The simulation results suggest that the technique offers quick and proper maintenance advice within a short detection period.

In [121], a simple short-circuit and open-circuit fault detection approach for PV systems was suggested based on the evaluation of three coefficients. The suggested technique has two steps. First, an offline simulation model extracts the variation boundaries of the three coefficients for each faulty operation. Second, an online comparison model compares the real measured coefficients with the simulated coefficients from the offline step.
Three coefficients have been established for each fault type in order to detect and diagnose both short-circuit and open-circuit faults, namely the current coefficient, the voltage coefficient, and the power coefficient. The offline step is aimed at extracting the variation boundaries of the three coefficients for each type of fault. To achieve this, three operations are conducted. First, the detected parameters are fed into a PSIM/MATLAB co-simulation to simulate both the healthy and faulty scenarios under a few climatic conditions; the goal of this stage is to extract a few MPP coordinates for each simulated instance. Second, the three coefficients are determined for each fault situation from their defining equations. Third, the variation boundaries for each fault type are extracted by adding a ±2% offset to the three derived coefficients. In the online step, the meteorological conditions and MPP values are measured and monitored by various sensors during the actual operation of the PV system, and the three actual onsite coefficients are calculated from these measurements. Finally, the real onsite coefficients are compared with the variation boundaries of each faulty case that were stored during the offline process. Fault detection is carried out on the basis of the monitored onsite power coefficient: if its value exceeds a set threshold, a DC side fault alarm is triggered. The fault type is then determined by comparing the three real onsite coefficients with the variation boundaries of each simulated faulty case. The proposed method is straightforward and efficient and does not require a large amount of training data.

The authors of [71] presented a data-driven anomaly detection and classification system that can accurately detect and categorize a wide range of PV system anomalies. The method consists of two stages.
First, the local context-aware detection (LCAD) is a hierarchical context-aware anomaly detection step using supervised learning, aimed at identifying possible anomalies in PV strings whose current characteristics differ from those of the other PV strings under similar environmental conditions. Second, the remote context-aware detection (RCAD) is a hierarchical context-aware anomaly detection step using supervised learning, aimed at identifying possible anomalies in PV strings whose current characteristics differ from those of solar PV farms; it benefits from a combination of LCAD and GCAD to detect anomalies at the string level. The domain-specific features are designed first. To reduce computational complexity and increase classification performance, the multimodal properties are carefully generated and extracted. Then, with the purpose of developing an accurate classification model that is suitable for specific categorization situations, a multimodal model training technique is constructed. The effectiveness, robustness, and cost and computing efficiency of the suggested strategy are demonstrated by the results of trials conducted over time. The proposed method has the following advantages: it is more robust against irradiance and weather variations and can accurately detect different anomalies without pre-labelled data; it achieves 90.2% detection accuracy for the top 100 anomalies that are otherwise nearly undetectable under low irradiance or weather with high cloud cover; it uses SCADA data to classify commonly occurring anomalies at the plant level; and it is cost- and computation-efficient, as it uses readily available data from existing PV systems.

According to the authors of [122], many machine learning-based fault detection methods share the following problems: fault diagnosis performance is limited by insufficient monitored information, fault diagnosis models are inefficient to train and update, and labeled fault data samples are difficult to obtain through field experiments. The authors proposed a method to overcome these problems, built on three features. The first is an effective and efficient seven-dimensional feature vector, used as the model input and based on key points and model parameters extracted from the I-V characteristic curves and on the observed environmental factors. The second is an emerging kernel-based extreme learning machine (KELM), which features extremely fast learning speed and good generalization performance and is used to automatically establish the fault diagnosis model. The Nelder-Mead simplex (NMS) optimization method is employed to optimize the KELM parameters, which affect the classification performance. The third is an improved, accurate SIMULINK-based PV modeling approach for a laboratory PV array, which facilitates fault simulation and data sample acquisition.
There are six steps leading to the establishment of the proposed model, as shown in Figure 10.

**Figure 10.** Flowchart on the establishment of the proposed fault diagnosis model [122].
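
To make the KELM classifier at the core of this model more concrete, the following is a minimal sketch of a kernel extreme learning machine in its commonly used closed form, where the output weights solve (I/C + Ω)β = T with Ω the kernel matrix and T the one-hot targets. It is not the implementation of [122]: the Gaussian kernel, the parameter values `C` and `gamma`, the two-dimensional toy samples, and all function names are assumptions, and the NMS tuning step is omitted.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian kernel exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def solve(A, B):
    """Solve A X = B by Gaussian elimination with partial pivoting."""
    n, m = len(A), len(B[0])
    M = [list(ra) + list(rb) for ra, rb in zip(A, B)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + m):
                M[r][c] -= factor * M[col][c]
    X = [[0.0] * m for _ in range(n)]
    for r in range(n - 1, -1, -1):  # back substitution
        for k in range(m):
            s = M[r][n + k] - sum(M[r][c] * X[c][k] for c in range(r + 1, n))
            X[r][k] = s / M[r][r]
    return X


def kelm_fit(X, labels, n_classes, C=100.0, gamma=2.0):
    """Return output weights beta = (I / C + Omega)^-1 T."""
    n = len(X)
    A = [[rbf_kernel(X[i], X[j], gamma) + (1.0 / C if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    T = [[1.0 if labels[i] == k else 0.0 for k in range(n_classes)]
         for i in range(n)]
    return solve(A, T)


def kelm_predict(x, X, beta, gamma=2.0):
    """Predict the class with the largest kernel-weighted score."""
    k = [rbf_kernel(x, xi, gamma) for xi in X]
    scores = [sum(k[i] * beta[i][c] for i in range(len(X)))
              for c in range(len(beta[0]))]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy two-class data (illustrative only, not PV measurements).
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]
beta = kelm_fit(train_x, train_y, n_classes=2)
pred_a = kelm_predict((0.05, 0.05), train_x, beta)
pred_b = kelm_predict((0.95, 1.0), train_x, beta)
```

Because training reduces to a single linear solve, there is no iterative weight update, which is consistent with the fast learning speed attributed to KELM above.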

The data samples for each fault condition should cover a wide range of operational irradiance and temperature, so that the fault diagnosis model is suitable for a variety of operating conditions. First, SIMULINK simulation experiments were used to obtain labeled data samples of normal and faulty conditions. Then, field experiments were conducted on the real laboratory PV array to obtain experimentally labeled data samples. Finally, the fault diagnosis model is established using the optimized KELM and is evaluated and analyzed using both simulated and experimental data samples with known fault types. The proposed KELM-based fault detection model is promising for real-time applications due to its exceptionally fast learning speed, simplicity, and high generalization performance. The authors are attempting to apply the fault diagnosis model in digital signal processor (DSP) based embedded real-time systems, in conjunction with an integrated rapid I-V tester that is currently in development.

The authors of [123] presented outlier detection rules based on instantaneous PV string current monitoring for failure detection. The rules are intended to monitor PV operation and discover faults that may go undetected by overcurrent protection devices (OCPD). Three outlier identification rules were devised and compared by the authors, namely the three-sigma rule, the Hampel identifier, and the boxplot rule. Neither weather measurements nor model training are required with the suggested strategy. The Hampel identifier performs well in cases with extremely high contamination levels (33.3% in this investigation), while the boxplot rule performs better under PV faults in cases with relatively high contamination levels (12.5% in this study). The model's reliability improves as the number of PV measurements rises. Although the outlier identification methods in this study are based on PV-string level measurements, the authors claim that the proposed approaches should be straightforward to implement, with minor modifications, at any PV installation level. For example, if the solar irradiation is assumed to be identical across the same PV level, the outlier rules can be applied at the PV-module or sub-array level. Aside from the PV string current, the measurement could include PV insulation impedance, output power, or energy yield, all of which are commonly used PV metrics. This may provide extra flexibility to fault detection methods.

Conventional wired monitoring systems suffer from limitations such as physical constraints during data cable laying, high installation and maintenance costs, and a reduced system lifespan due to overexposure to extreme weather conditions. To overcome these limitations, a Zigbee-based wireless monitoring system was developed in [124] to replace the conventional systems for online monitoring of parameters, such as temperature, irradiation, PV power output, and grid inverter power output, in grid-connected PV system applications. Moreover, it is equipped with a control function for remote system monitoring and a user-friendly web application, so that the monitored data are easily accessible via the Internet. Although the simulation results were satisfactory, the authors pointed out the limitations of the proposed method. These limitations include: (1) The proposed method is location specific.
Therefore, before implementing the system in other locations with significantly different weather conditions, the weather factor needs to be taken into consideration. (2) The program used was developed based on the available software and the programming language familiar to the authors.

In [125], a failure diagnosis algorithm was investigated that combines an online distributed monitoring system for a PV array, built on a Zigbee wireless sensor network, with a BP neural network optimized by a genetic algorithm. The Zigbee wireless network system monitors each module's output current, voltage, and irradiation, as well as the environment's temperature and irradiation. In addition, a simulated PV module is set up, on which typical faults are simulated and fault training samples are obtained. The fault sample data are then used to build and train a genetic algorithm optimized BP neural network diagnosis model, which is subsequently used to detect four different PV array operating states (normal, abnormal aging, short-circuit, shadow). Since an open-circuit fault is already noticed during the data collection phase, it is not considered one of the diagnosis model's outputs. According to the simulation data, the proposed detection system has a high level of accuracy. Operators or managers can log in to verify the parameters of each PV module and use the designed mechanism to quickly find the faulty PV module.

For the PV energy conversion system (PVECS), the authors of [126] presented a fault detection system using a fractional-order color relation classifier. Using an electrical inspection method, the output power degradation is used to monitor the physical conditions associated with changes in the circuitry of a PV array, such as grounded faults, mismatch faults, bridged faults between two PV panels, and open-circuit faults. The over-current and ground fault protection devices can also be used to isolate failures on the AC side.
As a result, the fault impact on the grid connection side can be reduced. In this flexible inferential model, iterative calculations are not required to update the parameters of the inference model. As a result, it can handle the complexity of an adjustable mechanism in a relatively short design cycle. Embedded system approaches can then be used to implement the proposed detection model. According to the simulation results, the suggested approach can detect normal conditions, mismatch faults, and four common electrical defects on the DC side. For solar radiation of 0.4–1.0 kW/m<sup>2</sup> and temperatures of 25–40 °C, the suggested detection model has an average accuracy of 88.23% in identifying the fault under low/high solar radiation and various temperatures.

The authors of [127] presented another fault detection and diagnosis method, based on a laterally primed adaptive resonance theory (LAPART) neural network. It is a low-cost way of automatically detecting and diagnosing PV system issues. The LAPART algorithm was trained to detect fault states using real-world data that were classified as normal system behavior. The algorithm was then given new data and three fault data points for an initial test. The system was also given synthetic data to examine its performance over a statistically significant month-long dataset, and it was able to correctly identify faults within the dataset. The LAPART algorithm's accuracy is determined by its ability to deliver a high likelihood of detection while reducing false alarms. The likelihood of detection is determined by comparing the number of true positive values generated by the FDD process with the total number of actual positive values. The LAPART architecture combines two fuzzy adaptive resonance theory (ART) algorithms to build a system for predicting outcomes based on the learnt associations. The fundamental equations of a single fuzzy ART algorithm cover category selection, the match criterion, and learning. The goal is to create the optimal template matrix for the provided dataset. The approach employs category selection to find the existing template that best matches the provided input. For fast learning applications, the free parameter is frequently set to 10<sup>-7</sup>. The match criterion then checks whether the template and the input being compared fulfill the user-defined vigilance parameter criterion. Depending on the level of complexity requested, the vigilance free parameter can range from 0 to 1. A high vigilance value of 0.9, for example, yields high complexity but limited generality, whereas a low value of 0.5 yields the opposite. Finally, if the match criterion is passed, the template is updated to reflect what has been learned. The LAPART algorithm is created by linking two fuzzy ARTs (FARTs), as shown graphically in Figure 11. The A and B FARTs are connected by the L matrix, which links the A and B templates.
Each FART has its own set of vigilance settings, and inputs are delivered to both the A and B sides at the same time during the learning process. The A and B sides work together to generate and update the templates, while also forming links. Testing inputs are only applied to the A side after the training is complete, allowing them to resonate with the already acquired templates. The L matrix's relationships are then used to link with the B side and generate the prediction results.
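
The three fuzzy ART steps described above (category selection, match criterion, and learning) can be sketched for a single FART module as follows. This is a minimal illustration under stated assumptions, not the LAPART implementation of [127]: it uses the standard fuzzy ART equations with complement coding, assumes that the small free parameter mentioned above is the choice parameter `alpha`, uses fast learning (`beta = 1`), and omits the L matrix and the second (B side) module. The class and function names are hypothetical.

```python
def complement_code(a):
    """Complement coding: [a, 1 - a], so every input has the same L1 norm."""
    return list(a) + [1.0 - x for x in a]


def fuzzy_and(x, w):
    """Component-wise minimum (the fuzzy AND operator)."""
    return [min(xi, wi) for xi, wi in zip(x, w)]


class FuzzyART:
    def __init__(self, rho=0.9, alpha=1e-7, beta=1.0):
        self.rho = rho      # vigilance parameter in [0, 1]
        self.alpha = alpha  # choice parameter (assumed to be the 1e-7 value)
        self.beta = beta    # learning rate; 1.0 means fast learning
        self.templates = []

    def _ranked_categories(self, coded):
        # Category selection: T_j = |I ^ w_j| / (alpha + |w_j|), highest first.
        scores = [
            (sum(fuzzy_and(coded, w)) / (self.alpha + sum(w)), j)
            for j, w in enumerate(self.templates)
        ]
        return [j for _, j in sorted(scores, key=lambda s: (-s[0], s[1]))]

    def train(self, a):
        coded = complement_code(a)
        for j in self._ranked_categories(coded):
            w = self.templates[j]
            overlap = fuzzy_and(coded, w)
            # Match criterion: |I ^ w_j| / |I| must reach the vigilance level.
            if sum(overlap) / sum(coded) >= self.rho:
                # Learning: move the template towards I ^ w_j.
                self.templates[j] = [
                    self.beta * o + (1.0 - self.beta) * wi
                    for o, wi in zip(overlap, w)
                ]
                return j
        # No template passed the match criterion: create a new category.
        self.templates.append(coded)
        return len(self.templates) - 1


samples = [[0.10, 0.20], [0.12, 0.22], [0.90, 0.80]]  # illustrative inputs in [0, 1]

strict = FuzzyART(rho=0.9)
labels_high = [strict.train(s) for s in samples]

loose = FuzzyART(rho=0.2)
labels_low = [loose.train(s) for s in samples]
```

In this toy run, the high-vigilance module separates the third sample into its own category, while the low-vigilance module absorbs all three samples into one template, which mirrors the trade-off between complexity and generality noted above.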

**Figure 11.** Creation of LAPART algorithm through linking two fuzzy ARTs [127].

The approach provided in [128] is based on monitoring the PV array's output power and is suitable for low-irradiance, high-impedance, and low-mismatch fault circumstances. The fault detection criterion is an entropy-based complexity measure that quantifies the irregularity of the time series of the normalized fault-imposed component of PV power. The proposed method can distinguish array faults from weather disturbances and partial shading. It is applicable to both grid-connected and islanded PV systems and does not require a training set or prior knowledge of the PV array. Moreover, it is an economical strategy, as it does not require costly sensors and relies only on the central IED to process the PV voltage and current measurements. The irregularity of the fault-imposed power time series is measured by the sample entropy (SampEn). The study uses the sample entropy-based complexity as the PV array fault detection index (FDI), since the complexity of time series data more effectively captures the behavior of a nonlinear system. The fault-imposed component of the PV array power is zero during normal operation. As a result, the time series in each moving data window is regular, and SampEn is equal to zero. When the solar irradiance or temperature varies, the fault-imposed power samples in each moving data window of N points are not identical, but they are fairly close together, since weather disturbances are not severe. As a result, FDI will be near zero in this situation. When a fault develops in the PV array, however, the fault-imposed power samples shift rapidly, and the pre-fault samples differ dramatically from the post-fault ones. Therefore, non-repetitive patterns can be found in the moving data windows during the initial milliseconds of fault transients, and the estimated SampEn is not zero. The fault is recognized when FDI rises to a non-zero value. It may be concluded that FDI for normal events is approximately zero, whereas FDI for fault events is non-zero. A defined threshold is used to discriminate between non-zero FDI values under fault situations and near-zero FDI values during weather disturbances and partial shading. As a result, determining the fault detection threshold is simple, and nuisance tripping can be considerably reduced. Several experiments validate the proposed fault detection algorithm's simplicity, sensitivity, scalability, resilience, and adaptability.
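
The behaviour of the sample entropy index described above can be illustrated with a short sketch. It is a minimal textbook-style SampEn computation, not the implementation of [128]: the embedding dimension `m`, the absolute tolerance `r`, the threshold `FDI_THRESHOLD`, and the two synthetic data windows are assumptions chosen for illustration.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy -ln(A / B) of the series x.

    B counts pairs of length-m templates within tolerance r (Chebyshev
    distance); A counts the same for length m + 1. Self-matches are excluded.
    """
    n = len(x)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        matches = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    matches += 1
        return matches

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined in theory; treated here as maximal irregularity
    return -math.log(a / b)


FDI_THRESHOLD = 0.1  # hypothetical threshold separating fault from non-fault windows

# Normal operation: the fault-imposed power component stays at zero.
normal_window = [0.0] * 16
# Fault transient: zero samples followed by irregular post-fault samples.
fault_window = [0.0] * 8 + [0.9, 0.2, 0.7, 0.1, 0.8, 0.3, 0.6, 0.0]

fdi_normal = sample_entropy(normal_window)
fdi_fault = sample_entropy(fault_window)
fault_detected = fdi_fault > FDI_THRESHOLD
```

For the constant window every template pair matches at both lengths, so the index is zero, whereas the irregular post-fault samples reduce the share of matches that persist at length m + 1 and push the index above the threshold.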



#### **5. Discussion**

This study presented a comprehensive review of PV system fault detection and diagnosis techniques that are based on artificial intelligence and machine learning. Conventional fault detection and diagnosis methods, which equip PV systems with overcurrent protection devices and ground fault detection interrupters, are not sufficient to detect certain faults due to low irradiance conditions, nonlinear output characteristics, the maximum power point tracker of PV inverters, or high fault impedances. This led to the need for more intelligent fault detection and diagnosis methods to replace the conventional methods, in order to improve the operational efficiency and safety of PV systems. AI-based methods, which are still being explored and improved, have been found to be an alternative to conventional methods. This paper's contribution is to outline the main features of the reviewed AI-based methods and their effectiveness in PV fault detection and diagnosis applications. The reviewed methods mostly adopt ML models, such as neural networks, wavelets, fuzzy logic, decision trees, support vector machines, graph-based semi-supervised learning, and regression, in order to develop models and algorithms that are trained to learn the relationships between the input and output parameters of PV systems. The effectiveness of these methods depends on their ability to detect a fault and pinpoint its location in the shortest possible time, their relative affordability, and their ease of use. Of note, there is currently less literature in this area of PV application than in other areas, since the topic has only recently been explored, as is evident from the oldest paper we could obtain, which dates back only about 15 years.

**Author Contributions:** Conceptualization, A.A. and C.F.M.A.; methodology and formal analysis, A.A., C.F.M.A. and M.G.; writing—original draft preparation, A.A.; writing—review and editing, C.F.M.A. and M.G.; supervision, C.F.M.A. and M.G.; funding acquisition, A.A. and C.F.M.A. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was partly funded by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) (No. 88887.514132/2020-00).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors kindly thank ENERQ-CTPEA Centro de Estudos em Regulação Qualidade de Energia for their support.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

