As the energy and environmental crisis intensifies, clean energy plays an integral role in mitigating global warming and has received growing attention from industry. As a major clean energy technology, photovoltaic (PV) power generation has attracted worldwide attention in recent years, especially in developing countries. Online intelligent multisensor monitoring systems underpin the stable operation of a PV system, and this direction is increasingly becoming a research hotspot in academia. Once the signals from the sensors have been acquired, various PV array diagnostic techniques can be applied to extract as much information as possible from them, and suitable decision-making strategies can then be built for failure detection in a PV array.
A variety of studies have applied different diagnostic approaches to this task. Takashima et al. [1, 2] proposed time domain reflectometry (TDR) and earth capacitance measurement (ECM), based on the acquired voltage and current signals, to detect the positions of open-circuit points in the circuits. Hu et al. presented a thermography-based temperature distribution analysis method to identify PV module mismatch faults [3]. Nuri Gokmen et al. measured only the operating voltage of the PV string and the ambient temperature to efficiently detect the number of open- and short-circuit faults and to discriminate them from partial shading conditions [4]. Siva Ramakrishna Madeti and S.N. Singh proposed an algorithm that uses a minimum number of voltage sensors to locate faulty modules. Its main advantage is that it avoids the need for current sensors and reduces the number of sensors used [5]. A. Chouder and S. Silvestre defined two new power loss indicators, thermal capture losses and miscellaneous capture losses, to classify different fault types in PV system operation [6]. Line-to-line faults (LLF), ground faults (GF) and arc faults (AF) are generally considered to be the main catastrophic failures in a PV array [7, 8, 9]. Arc faults cause continuous oscillations and distortions in the output current and voltage waveforms of the inverter, so the diagnosis of the fault type relies on a suitable time-frequency domain analysis of the voltage and current waveforms [10, 11, 12].
Fault detection in a PV array becomes difficult under low irradiance conditions, where line-to-line faults and ground faults may go undetected. Artificial intelligence (AI) techniques are considered promising strategies to address these problems. Zhehan Yi and Amir H. Etemadi proposed a multi-resolution signal decomposition method to extract fault features and two intelligent classification methods to identify line-to-line and ground faults [13, 14]. Several other AI techniques have also been used to detect faults in a PV array. A strategy based on fault thresholds and back propagation (BP) neural networks was proposed in [15] to detect 8 fault types at three levels: component, branch and array. In [16], a multi-class support vector machine was designed to diagnose line-to-line faults and aging faults in the photovoltaic array. Zhao et al. [17] developed a graph-based semi-supervised detection method for discriminating different faults, including short circuits, open circuits and line-to-line faults. A density peak-based clustering approach for fault diagnosis in PV arrays was proposed in [18]; however, this approach requires the number of clusters to be determined in advance. Bi Rui et al. [19] combined the open circuit voltage Voc, short circuit current Isc, maximum power voltage Vmpp and maximum power current Impp into feature vectors to characterize different fault types, and then used the fuzzy C-means (FCM) clustering method to classify these faults.
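To make the four feature quantities used in [19] concrete, the following is a minimal sketch of how Voc, Isc, Vmpp and Impp could be read from a sampled I-V curve. It is not taken from the cited work: the function name `iv_features` is hypothetical, the sweep is assumed to run from short circuit (V = 0) to open circuit (I = 0), and the example curve is a synthetic toy shape rather than measured data.

```python
import numpy as np


def iv_features(voltage, current):
    """Return (Voc, Isc, Vmpp, Impp) from a sampled I-V curve.

    Assumption of this sketch: the sweep covers the full range from
    short circuit (V = 0) to open circuit (I = 0), so the endpoints of
    the voltage-sorted curve give Isc and Voc directly.
    """
    v = np.asarray(voltage, dtype=float)
    i = np.asarray(current, dtype=float)
    order = np.argsort(v)            # sort samples by increasing voltage
    v, i = v[order], i[order]
    isc = i[0]                       # current at the lowest voltage
    voc = v[-1]                      # voltage at the end of the sweep
    k = np.argmax(v * i)             # sample with the largest power V * I
    return voc, isc, v[k], i[k]


# Synthetic toy curve (illustrative only, not a physical diode model).
v = np.linspace(0.0, 40.0, 401)
i = 8.0 * (1.0 - (v / 40.0) ** 10)
voc, isc, vmpp, impp = iv_features(v, i)
print(voc, isc, vmpp, impp)
```

Each I-V curve then yields one four-dimensional feature vector, which is the kind of input a clustering method such as FCM operates on.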
Under operating conditions, some single faults and compound faults display similar features, and the fault datasets acquired in field experiments are usually accompanied by environmental and system noise. As a result, some single and compound faults with similar characteristics cannot be discriminated by the conventional feature quantities proposed in previous papers, which mainly include voltage, current, power, Vnorm, Inorm, FF (Vnorm, Inorm), and so on. To achieve good fault discrimination, a new eigenvector composed of Vnorm, Inorm and FF was constructed to characterize 8 different faults in three-dimensional space. The simulation results show that the proposed feature vectors separate these faults effectively under noise-free conditions. Because of the noise introduced during data collection in the field experiment, however, the new feature vectors alone cannot clearly distinguish the 8 faults under operating conditions. The Gaussian kernel fuzzy C-means clustering method (GKFCM) is an unsupervised clustering algorithm that uses a kernel function to map the sample data from the original space to a high-dimensional space and then adopts a similarity function to classify the fault datasets. Compared with the traditional fuzzy C-means clustering algorithm, the GKFCM can highlight differences in sample characteristics through the nonlinear mapping of the kernel space, which effectively improves the clustering ability on complex datasets and the robustness of fault diagnosis. Therefore, a new fault classification method based on the GKFCM with the new eigenvectors (Vnorm, Inorm, FF) is proposed in this paper. In the first phase of the algorithm, the different faults are characterized by the new eigenvectors to achieve a preliminary distinction. Then, the different fault datasets are input into the GKFCM for training and testing to draw fault diagnostic conclusions. Because the proposed method combines the normalization of the feature quantities with the robustness of the GKFCM, the generalization ability of the fault diagnostic algorithm can be further guaranteed.
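The kernel-based clustering idea described above can be sketched as follows. This is a minimal illustration of one common Gaussian-kernel fuzzy C-means formulation, in which the cluster prototypes stay in the input space and the kernel-induced distance is 1 - K(x, v); it is not necessarily the exact variant used in this paper. The function names, the fuzzifier `m`, the kernel width `sigma`, the farthest-point initialization and the synthetic (Vnorm, Inorm, FF) points are all assumptions made for the example.

```python
import numpy as np


def _farthest_point_init(X, n_clusters):
    """Deterministic start (an assumption of this sketch): X[0], then
    repeatedly the sample farthest from the centers chosen so far."""
    centers = [X[0]]
    for _ in range(n_clusters - 1):
        d2 = np.min([((X - v) ** 2).sum(axis=1) for v in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    return np.array(centers)


def _memberships(X, centers, m, sigma):
    """Fuzzy memberships from the kernel-induced distance 1 - K(x, v)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-d2 / sigma**2)               # Gaussian kernel, shape (n, c)
    dist = np.maximum(1.0 - K, 1e-12)        # guard against division by zero
    w = dist ** (-1.0 / (m - 1.0))
    return w / w.sum(axis=1, keepdims=True), K


def gkfcm(X, n_clusters, m=2.0, sigma=0.3, n_iter=100, tol=1e-8):
    """Return (centers, U); U[k, i] is the membership of sample k in cluster i."""
    X = np.asarray(X, dtype=float)
    centers = _farthest_point_init(X, n_clusters)
    for _ in range(n_iter):
        U, K = _memberships(X, centers, m, sigma)
        weights = (U ** m) * K               # kernel-weighted memberships
        new_centers = (weights.T @ X) / weights.sum(axis=0)[:, None]
        converged = np.abs(new_centers - centers).max() < tol
        centers = new_centers
        if converged:
            break
    U, _ = _memberships(X, centers, m, sigma)  # memberships for final centers
    return centers, U


# Synthetic, illustrative (Vnorm, Inorm, FF) samples for two conditions.
rng = np.random.default_rng(0)
group_a = np.array([1.00, 1.00, 0.75]) + 0.01 * rng.standard_normal((20, 3))
group_b = np.array([0.60, 1.00, 0.55]) + 0.01 * rng.standard_normal((20, 3))
centers, U = gkfcm(np.vstack([group_a, group_b]), n_clusters=2)
labels = U.argmax(axis=1)                    # hard label = largest membership
print(centers)
print(labels)
```

Because the kernel weight K(x, v) decays quickly with distance, samples far from a prototype contribute little to its update, which is one way to read the robustness to noisy samples claimed for the method. The kernel width `sigma` has to be chosen relative to the spread of the feature values; the value used here only suits the synthetic data.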