1. Introduction
Renewable energy has become more attractive than ever because renewable sources deliver clean, high-efficiency energy. Despite the economic slowdown caused by the COVID-19 pandemic, global renewable energy capacity has grown rapidly, adding more than 314 GW in 2021, which represents a growth of 17%, to reach 3146 GW of total installed capacity [
1,
2]. The solar photovoltaic (PV) market in recent years has seen strong expansion with at least 175 Gigawatts in direct current (GWdc) added in 2021, reaching a total installed capacity of 942 GWdc [
3]. This growth is due to technology innovations and competitiveness in the solar PV market, electricity demand, low maintenance costs, and rapid return on investment [
4]. Since the share of solar PV power is sharply increasing, safety and reliability during electricity production are becoming indispensable. For this purpose, national grid operators around the world require guaranteed availability and a high level of predictability from solar PV energy providers [
5]. However, solar PV systems like all electricity generation systems can be subject to various types of faults that affect the components of the PV system such as PV modules, cables, protections, or inverters [
6]. To guarantee reliable performance of PV plants and meet target requirements, faults have to be accurately detected and localized.
Anomalies and faults in PV installations can degrade their performance severely. Broadly speaking, faults in PV systems can be categorized based on their side of occurrence: Alternating Current (AC) or Direct Current (DC) in the PV installation [
7,
8]. Most DC-side faults are located in the PV array (e.g., partial shading and soiling effects, power degradation, DC wiring losses, mismatch of PV strings, electrical arcs, and line-to-line and line-to-ground faults). Other DC-side faults are due to the remaining Balance of System (BOS) components (e.g., cables, fuses, diodes, switches, and DC/DC converters). On the other hand, total blackout and grid outage are the two main faults occurring on the AC side. Grid outage faults are due to unstable voltage and lightning [
9]. Detecting and classifying both DC and AC faults affecting PV systems are critical issues that must be carefully addressed. To this end, various fault detection and diagnosis (FDD) procedures have been developed to detect and identify the type of fault on the DC or AC side of PV systems [
10,
11,
12,
13]. These techniques help to increase the system reliability and lifetime of PV installations.
The high efficiency nowadays needed for PV systems demands ongoing monitoring of the working condition of these systems [
14,
15]. Hence, reliable and precise FDD techniques that provide the information needed to maintain optimum working conditions are necessary to (1) detect faults quickly and localize them accurately, (2) reduce plant downtime, (3) extend long-term profitability, and (4) exploit the full power generation [
16]. The International Energy Agency (IEA) Photovoltaic Power Systems Program (PVPS) has reported in [
17] ten (10) on-site inspection techniques using mobile test equipment for PV modules and arrays. These techniques range from I-V characterization to imaging techniques used to detect and localize failures (e.g., electrical mismatch of PV strings, degradation, and cell cracks). For materials analysis, spectroscopic methods were also considered in [
17]. In the last ten years, numerous FDD methods have been proposed to accurately detect and identify faults in grid-tied PV systems [
9,
18,
19,
20].
FDD techniques in PV systems include two main categories: based on parametric models [
21,
22,
23] and data-driven models [
24,
25]. The parametric models are specified in terms of the power, type, and size of the installed PV system. Therefore, parameter identification and model calibration are mandatory for each PV installation, which is the main weakness of these models. On the other hand, non-parametric or data-driven approaches have the advantage of treating the PV system as a black box without parameterization, but they require a large amount of historical data to develop a behavioral model.
Data-driven approaches that rely on data collected from an inspected PV system to achieve outstanding performance have become appealing due to the availability of big data [
26,
27,
28]. Data-driven fault detection methods such as Gaussian process regression-based charts [
26], ensemble learning [
28], artificial neural network methods [
29,
30,
31,
32,
33], and deep learning [
34], have received considerable attention in recent years due to their capability to automatically extract relevant features from high-dimensional data. For instance, in [
28], a method based on ensemble learning models (i.e., Boosting and Bagging) is proposed to identify faults on the DC side of a PV system. Here, Bayesian optimization is used to calibrate the hyper-parameters of the ensemble learning models in the training stage. These models are then employed as residual generators, and the double exponentially weighted moving average (DEWMA) chart is applied to the generated residuals for fault detection purposes. In [
26], kernel-based regression models (i.e., Gaussian process regression and support vector regression) are used to model the nominal behavior of the inspected PV plant, and a k-nearest neighbors-based detector is employed to discriminate between healthy operation and faulty conditions. This approach employs kernel density estimation to determine the detection threshold in a nonparametric way, which extends its flexibility and applicability.
Alternatively, model-based FDD techniques are based on the comparison between the measured data and the prediction from a mathematical model. The analytical model is often established based on some fundamental understanding of the PV system under normal operation states. In the past two decades, the interest in using model-based methods for FDD by engineers was high due to their mathematical and systematic characteristics. Model-based approaches include the empirical models [
35,
36], the one-diode model [
37], and the two-diodes model [
38]. Other approaches use the Kalman filter [
39], Fourier series [
40], and thermoreflectance imaging [
41,
42] to detect and recognize faults occurring in PV systems. Essentially, in the model-based methods, residuals, which are the deviation separating the measurement from the model predictions, are employed as indicators of faults in the inspected PV systems [
43]. Hence, residuals are used for fault detection and diagnosis. This study falls within the model-based FDD framework.
Recently, the IEA-PVPS Task 13 in [
44] has reported a benchmarking study of numerous approaches for computing the performance loss rates (PLR) of commercial and research PV systems in different climatic zones. The PLR is used by solar PV power plant owners, investors, and operators to estimate the power output of a PV system over its service life. Therefore, discrepancies between calculation methods can have a significant impact on the electricity generation cost. The benchmarking study is critical because the PLR results reported for PV systems are inconsistent across the many methods currently used to compute them.
In the literature, many installed PV systems use analytical monitoring and performance analysis to avoid economic losses due to malfunctions [
45]. For example, the study in [
46] presented an automatic supervision and fault detection procedure for the DC side of PV systems, analyzing both the power losses and the deviation of the DC current and DC voltage in the MATLAB/Simulink environment. The current indicator (Ci) and voltage indicator (Vi) at the maximum power point (MPP) were evaluated for automatic detection of the main faults occurring in PV systems in [
47]. This method is simple and can be integrated into the inverter for real-time supervision of the PV system. In [
48], a level online FDD technique based on power losses analysis (PLA) for solar PV systems was proposed. The work in [
49] presents remote supervision and fault detection and identification (FDI) of grid connected PV (GCPV) systems using the PLA method through the open platform communications (OPC) technology-based monitoring. In [
50], an online monitoring FDI technique is introduced based on Vi analysis allowing for a reduction of the number of the sensors using power line communication (PLC), with an economic analysis of GCPV systems. In [
51], two fault indicators, named voltage ratio (VR) and power ratio (PR), are defined to determine the fault type, time, and location of failure. A review of the main strategies for automatic FDD based on PLA at the AC side and on both Ci and Vi at the DC output of the PV array is reported in [
52]. In [
53], an FDD method based on the evaluation of voltage and current indicators was employed to identify faults of the PV array. Even under various climatic conditions, this method is able to distinguish between degradation, permanent shading, and temporary shading faults.
A model-based FDD method was proposed in [
54] which combines (1) the physical and statistical models and (2) fault diagnosis based on the physics of each failure. Both approaches contribute to optimizing the operation and maintenance of PV plants. The FDD capabilities of this method have been verified in a case study using six years of Supervisory Control and Data Acquisition (SCADA) data from a 315 kWp PV system. The study in [
55] presents an effective approach to detecting and identifying various faults in PV systems, including short-circuit and open-circuit faults, inverter disconnection, and partial shading conditions. To this end, three indicators are used to give information on the state of the PV system regarding normal and faulty operation. In the literature, few studies exploit satellite data instead of on-site measurements to estimate PV energy production. In [
56], the satellite-derived solar irradiance data were analyzed by the PLA method to remotely detect failures in a PV system. Recently, authors in [
57] have proposed an FDD approach for detecting PV underperformance and for accurately identifying and distinguishing different fault types (e.g., inverter faults, degradation, snow, and soiling) in large-scale PV plants. The proposed approach was evaluated using historical data from one inverter of a 1.8 MWp PV plant. This method differentiates faults through their performance loss ratio, using statistical analysis to extract significant changes in the time series data. Moreover, this method makes it possible to plan cleaning events by quantifying the soiling level on the modules, and it also allows the degradation rate of the installed PV modules to be estimated. The key advantage of the automatic FDD method based on real-time difference measurement (RDM) is that it can identify the common faults in the PV array in practice. Its implementation is easy and requires few components, the detection time is improved, and the FDI accuracy depends on the fixed threshold limits derived from the behavioral PV model. Despite the rather low complexity of RDM techniques, measured data obtained under faulty conditions are still needed for better validation.
In this paper, we present an efficient technique that merges parametric models and performance loss rate (PLR) analysis to sense and recognize anomalies occurring in a grid-connected PV system (GCPVS). The considered parametric models are accurate, easy to implement in real-time monitoring tools, and computationally inexpensive. While the precision of model-based techniques relies on the quality of the measurements and on the model calibration method, the main benefit of the proposed technique resides in its capability to detect different types of anomalies on both the DC and AC sides of a GCPVS.
Figure 1 illustrates the general framework of the proposed simulation process of behavioral modeling and fault diagnosis of the GCPVS using MATLAB and LabVIEW. The main contributions of this work are recapitulated as follows.
First, we introduced a parametric approach to model the behavior of the inverter and the PV array based on Sandia National Laboratories (SNL) empirical models in order to estimate the electricity production and analyze the performance of the inspected GCPVS. More specifically, the parametric models were calibrated and validated in MATLAB using experimental data. The Sandia models were then implemented in LabVIEW to design a user interface for dynamic simulation of the GCPVS.
Furthermore, this study introduced a simple and effective diagnostic tool to uncover and identify anomalies in a PV system. To this end, the residuals were calculated with the already calibrated Sandia parametric models. Then, the examination of the residual is conducted by analyzing the Performance Loss Rates (PLR) of four electrical indicators (i.e., DC voltage, DC current, DC power, and AC power). The effectiveness of the presented FDD methods is demonstrated on real data from a 9.54 kWp grid-connected PV system. Various anomalies were investigated in this paper, such as partial shading, soiling on the PV array, and DC/AC efficiency faults. Results showed the capability of the proposed FDD technique to provide significant help for PV system monitoring.
Finally, the developed FDD strategy with threshold limits for each fault was integrated into the supervision interface under LabVIEW. The investigated faults were successfully detected and identified.
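As a rough illustration of the residual and PLR evaluation summarized in these contributions, the following Python sketch computes, for each of the four electrical indicators, the residual between the measured and model-predicted values and a relative loss in percent. The formula PLR = 100 × (predicted − measured) / predicted, the function and key names, the 10% threshold, and the sample values are all assumptions made for illustration; they are not the PLR definition or the fault thresholds of this work.

```python
from typing import Dict

# Hypothetical key names for the four indicators (DC voltage, DC current,
# DC power, AC power).
INDICATORS = ("v_dc", "i_dc", "p_dc", "p_ac")


def performance_loss_rate(measured: float, predicted: float) -> float:
    """Relative loss in percent (assumed definition, see the text above)."""
    if predicted <= 0.0:
        return 0.0  # no meaningful reference value, e.g. at night
    return 100.0 * (predicted - measured) / predicted


def evaluate_indicators(
    measured: Dict[str, float],
    predicted: Dict[str, float],
    threshold_pct: float = 10.0,  # illustrative threshold, not from the paper
) -> Dict[str, dict]:
    """Return residual, PLR, and a simple threshold flag per indicator."""
    report = {}
    for name in INDICATORS:
        residual = measured[name] - predicted[name]
        plr = performance_loss_rate(measured[name], predicted[name])
        report[name] = {
            "residual": residual,
            "plr_pct": plr,
            "flag": plr > threshold_pct,
        }
    return report


if __name__ == "__main__":
    # Made-up sample: the current drops while the voltage stays close to the model.
    predicted = {"v_dc": 400.0, "i_dc": 8.0, "p_dc": 3200.0, "p_ac": 3100.0}
    measured = {"v_dc": 396.0, "i_dc": 6.0, "p_dc": 2376.0, "p_ac": 2300.0}
    for name, row in evaluate_indicators(measured, predicted).items():
        print(name, row)
```

Evaluating several indicators side by side is what allows a pattern of losses (for example, a current loss without a voltage loss) to be associated with a fault type, provided that per-fault thresholds are defined.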
This work is organized as follows.
Section 2 is devoted to the description of the used grid-tied PV system. Then,
Section 3 presents the investigated parametric models for PV system modeling.
Section 4 briefly describes the proposed PLR-based FDD method and discusses the experimental results. The last section concludes this study and provides some future directions.
3. Modeling of Grid Connected PV System Using Parametric Models
The performance of PV installations depends on many parameters, including (1) weather conditions, (2) performance of the PV system components, and (3) grid parameters. Solar PV systems can be subjected during operation to various anomalies that affect their components (e.g., modules, cables, protections, or inverters) [
6].
The behavioral simulation of the grid-connected PV system makes it possible to create reference thresholds for the monitoring system and to analyze the performance and the power losses in order to detect and identify the various faults affecting the PV system. The aim of the behavioral simulation of the GCPVS is to obtain the expected evolution of the DC power produced by the PV array, as well as the AC power at the output of the PV inverter, under real climatic conditions.
Here, the simulation of the GCPVS is based on the Sandia parametric models, which have been implemented and evaluated in the MATLAB environment. Aging and mismatch losses were included in the models using derate factors. Shading and soiling in a PV system are considered anomalies, so they are not modeled. The appropriate selection of parametric models with good parameter estimation is the key to minimizing the error between measurement and simulation under normal operation of the PV system.
3.1. Sandia PV Array Performance Model
The behavioral model of a PV module depends on many electrical and meteorological parameters. The Sandia PV Array Performance Model (SAPM) is given by the following equations.
3.1.1. Cell and Module Temperature Models
Because the module voltage varies with its temperature, an accurate estimate of the cell temperature allows a good prediction of the voltage behavior. The PV cell or module temperature is usually obtained by prediction. Numerous models have been introduced in the literature to estimate the temperature of the PV cell/module [
62,
63,
64,
65,
66].
To estimate the back-surface module temperature (in °C), Sandia proposes an accurate model [35] (Equation (1)) that depends on three inputs: the irradiance incident on the module surface, or plane-of-array (POA) irradiance (in W/m²), the ambient air temperature Ta (in °C), and the wind velocity (in m/s).
In (1), s1 denotes the empirical parameter that sets the upper limit of the module temperature at high tilted irradiance and low wind velocity, and s2 designates the empirical parameter that defines the rate at which the module temperature decreases as the wind velocity increases.
The Sandia model of the PV cell temperature depends on the measured PV module temperature, the measured POA irradiance, and a temperature difference parameter between module and cell (ΔT), as represented by Equation (2).
ΔT is a parameter that depends on the construction, materials, and mounting configuration of the PV module [67]. In the studied PV array, ΔT is fixed at 2 °C.
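The two temperature relations can be sketched in Python as follows, assuming the standard Sandia forms Tm = G_POA · exp(s1 + s2 · WS) + Ta for the module back-surface temperature and Tc = Tm + (G_POA / G_ref) · ΔT for the cell temperature. The function and argument names are hypothetical, and the s1 and s2 values in the usage lines are placeholders, not the identified parameters reported in the tables of this work.

```python
import math


def module_temperature(g_poa: float, t_amb: float, wind: float,
                       s1: float, s2: float) -> float:
    """Back-surface module temperature (degC).

    Assumed standard Sandia form: G_POA * exp(s1 + s2 * WS) + Ta,
    with G_POA in W/m^2, Ta in degC, and wind velocity WS in m/s.
    """
    return g_poa * math.exp(s1 + s2 * wind) + t_amb


def cell_temperature(t_module: float, g_poa: float,
                     delta_t: float = 2.0, g_ref: float = 1000.0) -> float:
    """Cell temperature (degC): Tm + (G_POA / G_ref) * dT.

    delta_t defaults to the 2 degC stated in the text; g_ref = 1000 W/m^2
    is the usual reference irradiance (assumed here).
    """
    return t_module + (g_poa / g_ref) * delta_t


if __name__ == "__main__":
    # Placeholder coefficients for illustration only.
    s1, s2 = -3.56, -0.075
    t_m = module_temperature(g_poa=800.0, t_amb=25.0, wind=2.0, s1=s1, s2=s2)
    t_c = cell_temperature(t_m, g_poa=800.0)
    print(f"module: {t_m:.2f} degC, cell: {t_c:.2f} degC")
```

With a negative s2, a higher wind velocity lowers the predicted module temperature, which matches the role described for this parameter.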
3.1.2. Effective Irradiance
Effective irradiance is the available plane-of-array irradiance after accounting for spectral mismatch, soiling accumulation on the PV array, and angle-of-incidence losses. More generally, it can be regarded as the irradiance actually received by the PV cells before conversion to power. A simplified relation based on the POA irradiance measured by pyranometers or a reference PV cell is given in Equation (3), which includes a soiling factor (equal to 1 when the array is clean).
3.1.3. DC Current Model
The Sandia dynamic model [35] for estimating the DC current of the PV array at the maximum power point (MPP) is expressed by Equation (4).
Equation (4) involves the following quantities: the DC current of the PV sub-array at the MPP (A); the MPP current of one PV module at standard test conditions (STC) (A); the number of strings in parallel; the reference solar irradiance (typically 1000 W/m²); the cell temperature inside the module (°C); the reference cell temperature (25 °C); the number of cell-strings in parallel in the module; two empirically determined parameters relating the PV sub-array current to effective irradiance, which sum to 1 in the Sandia model [35]; and the current temperature coefficient, which is given in the datasheet and has also been identified.
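For orientation, the MPP current equation of the standard SAPM formulation [35], scaled here by the number of parallel strings, can be written as below. The symbol names are assumptions chosen to follow common SAPM notation, the scaling by the number of strings is an assumed convention, and the exact form of Equation (4) used in this work may differ.

```latex
I_{mp} = N_p \, I_{mp0} \left( C_0 \, E_e + C_1 \, E_e^{2} \right)
         \left[ 1 + \alpha_{Imp} \left( T_c - T_0 \right) \right],
\qquad E_e = \frac{G_e}{G_0}, \qquad C_0 + C_1 = 1
```

In this notation, E_e is the effective irradiance G_e normalized by the reference irradiance G_0, T_c and T_0 are the cell and reference temperatures, and the constraint C_0 + C_1 = 1 ensures that the model returns N_p I_mp0 at reference conditions.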
3.1.4. DC Voltage Model
The dynamic model of the estimated PV string voltage at MPP is expressed by Equations (5) and (6). This model is based on the Sandia PV array performance model [35]. The string voltage decreases as the effective cell temperature rises and increases slightly with the effective irradiance.
Equations (5) and (6) involve the following quantities: the voltage of the PV string at MPP (V); the MPP voltage of a PV module at STC (V); the effective temperature of the PV cell; the number of modules in series; the number of cells in series in a module's cell-string; Boltzmann's constant (1.38066 × 10⁻²³ J/K); the elementary charge (1.60218 × 10⁻¹⁹ C); and the thermal voltage, which depends on the effective temperature of the PV cell. Note that for a diode factor of unity and a cell temperature of 25 °C, the thermal voltage is about 26 mV per cell. The remaining quantities are two empirically determined parameters relating the voltage of the PV string to effective irradiance and the normalized temperature coefficient for MPP voltage, which is given in the datasheet and has also been identified.
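The thermal voltage and a voltage model of this kind can be sketched in Python as follows. The thermal voltage n·k·(Tc + 273.15)/q uses the constants quoted above and reproduces the value of about 26 mV at 25 °C. The string voltage function follows the standard SAPM structure under two assumptions that are not taken from this work: the temperature term is written with a normalized coefficient multiplied by the STC voltage, and the module voltage is scaled by the number of modules in series. All names and numeric values are hypothetical.

```python
import math

K_BOLTZMANN = 1.38066e-23  # J/K, value quoted in the text
Q_ELECTRON = 1.60218e-19   # C, value quoted in the text


def thermal_voltage(t_cell_c: float, n_diode: float = 1.0) -> float:
    """Thermal voltage per cell (V): n * k * (Tc + 273.15) / q."""
    return n_diode * K_BOLTZMANN * (t_cell_c + 273.15) / Q_ELECTRON


def string_voltage_mpp(e_eff: float, t_cell_c: float, v_mp0: float,
                       n_cells: int, n_modules: int,
                       c2: float, c3: float, beta_norm: float,
                       n_diode: float = 1.0, t_ref_c: float = 25.0) -> float:
    """MPP voltage of a PV string (V), SAPM-style sketch.

    e_eff is the effective irradiance normalized to the reference irradiance
    (1.0 at 1000 W/m^2). beta_norm is a normalized temperature coefficient
    in 1/degC (assumed convention).
    """
    if e_eff <= 0.0:
        raise ValueError("effective irradiance must be positive")
    log_term = thermal_voltage(t_cell_c, n_diode) * math.log(e_eff)
    v_module = (v_mp0
                + c2 * n_cells * log_term
                + c3 * n_cells * log_term ** 2
                + v_mp0 * beta_norm * (t_cell_c - t_ref_c))
    return n_modules * v_module


if __name__ == "__main__":
    print(f"thermal voltage at 25 degC: {thermal_voltage(25.0) * 1e3:.2f} mV")
    # Made-up module and string data for illustration only.
    v = string_voltage_mpp(e_eff=0.8, t_cell_c=45.0, v_mp0=30.0,
                           n_cells=60, n_modules=10,
                           c2=0.5, c3=-5.0, beta_norm=-0.004)
    print(f"string MPP voltage: {v:.1f} V")
```

The DC power at MPP can then be obtained as the product of the estimated DC current and this estimated DC voltage, which is the approach retained for the DC power simulation results.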
3.1.5. DC Power
The DC power at MPP is calculated by the following relationship
3.2. Sandia Inverter Model
The second important component in a grid-connected PV system is the PV inverter, which converts DC power to AC power for connection to the AC grid, with efficiencies of up to 99%. Here, to estimate the conversion performed by the SMA Sunny Boy inverter, we have used an accurate inverter model developed by SNL [
36]. This model is defined by the following equations.
The performance parameters are defined as follows: the predicted AC power, computed from the predicted DC voltage (V) and DC power (W); the estimated DC power at the inverter input, which equals the maximum power of the PV sub-array; the simulated DC voltage of the PV sub-array at the inverter input; the maximum AC power of the PV inverter at the Reference Operating Condition (ROC); the DC voltage level and the DC power level at which the rated AC power is reached under ROC; the DC power required to start the DC/AC conversion process; the parameter that defines the curvature of the relationship between AC power and DC power (1/W); and three empirical parameters (1/V) that allow, respectively, the DC power level at rated AC power, the start-up DC power, and the curvature parameter to vary linearly with the DC voltage of the PV sub-array at the inverter input.
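A minimal Python sketch of the Sandia inverter model in its published form is given below, using eight parameters that match the definitions above. The argument names follow common Sandia notation and are assumptions, as are the clipping of the output to the range from zero to the rated AC power and the numeric values in the usage lines, which are not the identified parameters of the studied inverters.

```python
def sandia_inverter_ac_power(p_dc: float, v_dc: float,
                             p_aco: float, p_dco: float, v_dco: float,
                             p_so: float, c0: float, c1: float,
                             c2: float, c3: float) -> float:
    """AC power (W) from DC power (W) and DC voltage (V).

    p_aco: maximum AC power at the reference operating condition (W)
    p_dco, v_dco: DC power and DC voltage at which p_aco is reached
    p_so: DC power required to start the conversion (W)
    c0: curvature parameter (1/W); c1, c2, c3: voltage dependence (1/V)
    """
    dv = v_dc - v_dco
    a = p_dco * (1.0 + c1 * dv)  # voltage-adjusted rated DC power
    b = p_so * (1.0 + c2 * dv)   # voltage-adjusted start-up power
    c = c0 * (1.0 + c3 * dv)     # voltage-adjusted curvature
    p_ac = (p_aco / (a - b) - c * (a - b)) * (p_dc - b) + c * (p_dc - b) ** 2
    # Clipping is an added assumption: no negative output, no output above rating.
    return max(0.0, min(p_ac, p_aco))


if __name__ == "__main__":
    # Made-up inverter parameters for illustration only.
    params = dict(p_aco=3000.0, p_dco=3150.0, v_dco=400.0, p_so=20.0,
                  c0=-1.0e-5, c1=1.0e-5, c2=1.0e-3, c3=1.0e-4)
    for p_dc in (10.0, 500.0, 1500.0, 3150.0):
        p_ac = sandia_inverter_ac_power(p_dc, 380.0, **params)
        print(f"P_dc = {p_dc:7.1f} W -> P_ac = {p_ac:7.1f} W")
```

At the reference voltage, the expression returns the rated AC power when the DC power equals its rated level and zero when the DC power equals the start-up power, which gives a quick sanity check for any identified parameter set.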
3.3. Parameters Identification
Identifying the parameters with Artificial Intelligence (AI) techniques allows a good calibration of the empirical model against measured data. Calibrating representative models is very important in order to approximate the experimental reference data as closely as possible and to accurately emulate the electrical behavior of the PV array and the PV inverter in the PV system.
These models are built using normal data (i.e., without any anomaly or fault) and generate residuals from real-time data to initiate the fault diagnosis process. Here, the parameters of the PV sub-array models and the inverter model were identified using the curve fitting toolbox in MATLAB, where the fitting is based on the nonlinear least squares method and the trust-region algorithm (Figure 5). In addition, the empirical models were validated using experimental data collected from the 9.54 kWp GCPVS under normal operating conditions, without faults or anomalies.
Three representative statistical indicators are considered to assess the prediction performance and accuracy of the behavioral models: the mean absolute error (MAE), the root mean square error (RMSE), and the coefficient of determination (R²). These metrics are computed from the actual values, the estimated values, the mean of the actual values, and the number of samples n.
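The three indicators can be computed with the textbook definitions, as in the Python sketch below. The function names and the sample values are illustrative. The RMSE values in this work are reported in percent, which implies a normalization that is not specified here, so the sketch returns the RMSE in the units of the data.

```python
import math
from typing import Sequence


def mae(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Mean absolute error: (1/n) * sum(|y_i - y_hat_i|)."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def rmse(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Root mean square error: sqrt((1/n) * sum((y_i - y_hat_i)^2))."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def r_squared(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    measured = [1.0, 2.0, 3.0, 4.0]    # made-up actual values
    estimated = [1.1, 1.9, 3.2, 3.8]   # made-up model outputs
    print(f"MAE  = {mae(measured, estimated):.3f}")
    print(f"RMSE = {rmse(measured, estimated):.3f}")
    print(f"R2   = {r_squared(measured, estimated):.3f}")
```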
The empirical models used to estimate the temperature of the module back-sheet surface are functions of three variables (i.e., POA irradiance, ambient air temperature, and wind velocity). Therefore, we used the procedure given in [
68] to decompose the temperature model equation into two equations in order to estimate the coefficients using the curve fitting toolbox.
Table 3 gives the identified parameters used to predict module back surface temperature. The empirically identified parameters used in order to predict DC current and DC voltage at MPP based on SAPM are given respectively in
Table 4 and
Table 5 for three sub-arrays.
Table 6 illustrates eight parameters that were identified for three inverters based on the Sandia inverter model.
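One way to see how a three-variable temperature model can be reduced to a simpler fit is to linearize it. Assuming the standard Sandia form Tm = G_POA · exp(s1 + s2 · WS) + Ta, the quantity ln((Tm − Ta) / G_POA) is linear in the wind velocity, so s1 and s2 follow from an ordinary least-squares line. The pure-Python sketch below applies this idea to synthetic data. It is a swapped-in illustration, not the MATLAB curve fitting toolbox procedure or the decomposition of [68], and all names and values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_s1_s2(g_poa: Sequence[float], t_amb: Sequence[float],
              wind: Sequence[float], t_module: Sequence[float]) -> Tuple[float, float]:
    """Estimate (s1, s2) from ln((Tm - Ta) / G) = s1 + s2 * WS by least squares."""
    xs, ys = [], []
    for g, ta, ws, tm in zip(g_poa, t_amb, wind, t_module):
        if g > 0.0 and tm > ta:  # the logarithm needs a positive argument
            xs.append(ws)
            ys.append(math.log((tm - ta) / g))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two usable samples")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("wind velocity must vary across samples")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s2 = sxy / sxx
    s1 = y_mean - s2 * x_mean
    return s1, s2


if __name__ == "__main__":
    # Synthetic, noise-free data generated from known coefficients.
    true_s1, true_s2 = -3.5, -0.08
    g = [400.0, 600.0, 800.0, 1000.0, 900.0]
    ta = [18.0, 20.0, 22.0, 25.0, 24.0]
    ws = [0.5, 1.0, 2.5, 4.0, 3.0]
    tm = [gi * math.exp(true_s1 + true_s2 * wi) + ti
          for gi, ti, wi in zip(g, ta, ws)]
    print(fit_s1_s2(g, ta, ws, tm))
```

With measured data, a nonlinear least-squares fit of the original equation is generally preferable, because the logarithm changes the weighting of the measurement errors; the linearized estimate can then serve as a starting point.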
3.4. Simulation Result under Matlab
The simulation of all behavioral models was performed in the MATLAB environment. Before the simulation, all models were well calibrated using experimental data for a reference day with the following conditions: clear sky day, medium ambient temperature, and without (1) measurement faults, (2) shading, (3) significant soiling, and (4) electrical fault or other anomalies.
Figure 6 and
Figure 7 display the measured meteorological data used as input for DC current and DC voltage empirical models.
Figure 8 compares the measured PV cell temperature with the temperatures predicted by four models. The performance metrics of the PV cell temperature models listed in Table 7 confirm their good prediction performance.
According to the regression metrics listed in
Table 7, the Sandia model of module temperature shows good performance. As an example of the simulation results, we give only the plots of sub-system 2, because the plots of sub-systems 1 and 3 look very similar; the other sub-systems are compared in the regression metrics tables.
Figure 9 shows the MPP current of the PV sub-array 2 measured and predicted for the DC current model.
Table 8 gives the regression metrics of the DC current model simulated for the three sub-arrays. We can see a good agreement between the measured data and the data estimated by the model for all PV sub-arrays. The Sandia current model is very accurate, with R² around 0.999 for all PV sub-arrays.
Figure 10 shows the MPP voltage of sub-array 2 measured and predicted from the behavioral model.
Table 9 gives the regression metrics of the DC voltage model simulated for the three PV sub-arrays. We can see a good fit between the measured data and the data estimated by the model for all sub-arrays. The R² is greater than 0.89 for the three sub-arrays. For the predicted DC voltage, the R² is lower because the DC voltage varies sensitively while the maximum power point is tracked under changing climatic conditions, whereas the RMSE is less than 0.66% for the Sandia model, as shown in Table 9.
DC power models that use a single equation are less accurate than models based on the product of the DC current and DC voltage models. Therefore, the DC power simulation results shown in Figure 11 are based on the product of the estimated DC current and DC voltage. Table 10 indicates a satisfying agreement between the measured data and the predicted values of DC power using both models (i.e., R² greater than 0.998). We notice that the simulation results for DC power are very close to those for DC current.
We also observe that the AC power predicted by the Sandia inverter model from the predicted DC voltage and DC power agrees very well with the measured data, as shown in Figure 12, with an R² greater than 0.999 and an RMSE that does not exceed 0.93% for the three simulated inverters (Table 11).
In summary, the above-described empirical models fit well with the measured data.
3.5. Model Implementation under LabVIEW
All behavioral models validated in MATLAB have been implemented in the LabVIEW code diagram using a graphical program combined with a textual program based on the C language; the result is called a virtual instrument (VI) (Figure 13).
As shown in Figures 14 to 18, the designed interface is user-friendly and uses modern indicators and waveform graphs. The modeling results were satisfactory, with a good fit to the reference electrical measurements for all PV subsystems. The considered models are easy to implement and have a low computational cost.
5. Conclusions
This paper introduces a simple and flexible diagnosis method under LabVIEW for detecting and identifying anomalies in a PV system. To this end, the method combines the desirable characteristics of empirical models and performance loss rate (PLR) evaluation. Specifically, the Sandia behavioral models of the PV array and the inverter were selected from several other models to accurately predict the electricity production and analyze the performance of the 9.54 kWp GCPVS. The Sandia models were calibrated and validated against reference experimental data in MATLAB. The models achieved good quality in predicting DC current, DC power, and AC power, with an R² of around 0.99, and an R² of about 0.98 for the PV cell temperature. On the other hand, for the predicted DC voltage of the three sub-arrays, the R² is around 0.89. Overall, the prediction results were very satisfactory, with a good fit to the reference measured data. Here, the difference between the predicted and measured values is used for fault detection and diagnosis. The proposed technique is based on the PLR evaluation of four electrical indicators (i.e., DC current, DC voltage, DC power, and AC power) computed with the Sandia parametric models. Importantly, four electrical indicators are used to further improve discrimination between the investigated anomalies on the DC and AC sides of the PV system. Results revealed that the proposed method can distinguish between faults under changeable climatic conditions, and the nine fault cases were properly detected and identified using a user-friendly interface, along with other features that were added to this interface. Essentially, the proposed technique is not complex and is easy to integrate into a real-time monitoring program. This study revealed the promising performance of combining empirical models with the PLR approach for anomaly detection and identification in PV systems.
Despite the improved detection performance, partial shading is not easy to discriminate from other faults on the DC side, which can lead to false identification. Thus, future work will improve the method's capacity to discriminate partial shading through intelligent signal processing of the shading anomaly. Moreover, we plan to evaluate the performance loss rate [
44] with advanced machine learning and deep learning models [
71]. Furthermore, another direction of improvement consists of adding Internet of Things (IoT) functionalities [
72] to the monitoring system, which improves the online supervision of performance analysis and malfunctions alarms in real-time [
73].