**1. Introduction**

With the development of advanced technologies to increase production, modern industrial systems have become more complex and expensive. The components of industrial systems are prone to malfunction, which can incur unanticipated economic costs due to unscheduled shutdowns and repair or maintenance. Therefore, it is of particular interest to design effective fault diagnosis and fault classification approaches that automatically monitor the behaviour of industrial systems and prevent damage caused by unexpected faults. Motivated by environmental considerations and the shortage of fossil fuels, wind turbines, as one of the renewable energy sources, have come to contribute a large portion of the world's power production [1,2]. As a clean energy source, wind energy has been exploited extensively via onshore and offshore wind turbines. By the end of 2019, the overall installed capacity of all wind turbines worldwide had reached 651 GW, of which European countries contributed 205 GW.

Moreover, wind power contributed 15% of electricity generation in Europe and 20% of electricity production in the UK in 2019 [3].

Wind farms consisting of hundreds of wind turbine units are being established in many different locations, for instance, in offshore, arctic, and desert regions. In recent years, different generator topologies, such as doubly fed induction generators (DFIGs) and permanent magnet synchronous generators (PMSGs), have been widely utilized in wind turbine systems. However, like any other industrial system, wind turbines are sophisticated and prone to faults. It is observed that the operation and maintenance costs for onshore and offshore wind turbines make up 10–15% and 20–35%, respectively, of the total lifetime costs of wind energy conversion systems. Furthermore, because wind turbine systems are complex and expensive, there is a high demand in the wind turbine industry for improving reliability and availability and for reducing unscheduled downtime [4]. Motivated by the above, monitoring and fault diagnosis for wind turbine systems have received wide attention [5–9].

Fault diagnosis approaches can be classified into model-based, signal-based, and knowledge-based methods. The model-based fault diagnosis approach requires a well-established model of the practical process, developed from either physical principles or system identification techniques. By checking the residual between the model output and the real-time process output, a fault diagnosis decision can be made [10,11]. Signal-based fault diagnosis relies on appropriate sensors, which are normally installed in plant components. The faults in the process are reflected in the measured signals, and time-domain, frequency-domain, or time-frequency-domain techniques are used to extract features. By checking the consistency between the features of the real-time process and the prior knowledge of the symptoms of the healthy system, a diagnostic decision can be made [12]. Knowledge-based approaches utilize a large volume of available historical data to train universal estimators or approximators that recognize faulty conditions [13]. It is worth pointing out that the knowledge-based approach depends heavily on data processing and data-based learning, covering both historical and real-time data. Therefore, the knowledge-based fault diagnosis approach is often called the data-driven approach [14,15].
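The residual check used in model-based diagnosis can be illustrated with a minimal sketch. The function name `residual_alarm`, the fixed threshold, and the sample values below are all hypothetical and chosen only for illustration; practical schemes typically derive the threshold from the noise statistics of the healthy process rather than fixing it by hand.

```python
import numpy as np

def residual_alarm(measured, predicted, threshold):
    """Return the residual and a boolean alarm flag per sample.

    A sample is flagged when the absolute difference between the
    measured process output and the model output exceeds `threshold`
    (an assumed, hand-picked value in this sketch).
    """
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = measured - predicted
    alarm = np.abs(residual) > threshold
    return residual, alarm

# Illustrative data: the fourth sample deviates strongly from the model.
measured = [1.00, 1.02, 0.98, 1.60, 1.01]
predicted = [1.00, 1.00, 1.00, 1.00, 1.00]
residual, alarm = residual_alarm(measured, predicted, threshold=0.2)
print(alarm)
```

Only the sample whose residual exceeds the threshold raises an alarm; the small deviations of the remaining samples are treated as consistent with the model.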

Machine learning techniques play an important role in data-driven fault diagnosis. Generally speaking, machine learning techniques can be classified into three categories: unsupervised, semi-supervised, and supervised learning algorithms [16]. Unsupervised machine learning aims to learn structure in the data, such as a sparse or low-dimensional feature representation [17–20]. Supervised machine learning, used for tasks such as prediction and classification, aims to learn a knowledge base from known or labelled examples of the target pattern [21,22]. Semi-supervised machine learning represents a class of algorithms that combine supervised and unsupervised tasks [23–25].

It is noted that the datasets involved are generally large and reside in a high-dimensional space. Feature extraction and dimensionality reduction of the samples/datasets thus play an important role in data-driven fault diagnosis [26–29]. The geometric distribution of the datasets in the high-dimensional space can be analyzed in order to extract significant features effectively. There are several methods to solve this problem, and one of the most popular is the principal component analysis (PCA) algorithm [30–34]. PCA, as an unsupervised learning technique, is a statistical procedure that utilizes an orthogonal transformation to convert a set of correlated variables into linearly uncorrelated variables, namely principal components [35]. The number of retained principal components is generally smaller than the number of original variables [36–38]. The transformation in PCA is carried out so that the first principal component has the largest possible variance, and each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to the preceding components [39]. As a result, PCA has become a popular tool for fault detection and fault classification on the basis of a large volume of high-dimensional experimental samples/datasets [40–42].
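The properties of PCA described above can be sketched in a few lines of NumPy. This is a minimal illustration of conventional PCA via the singular value decomposition, not the UMPCA algorithm studied in this paper; the function name `pca` and the synthetic data are assumptions made for the example.

```python
import numpy as np

def pca(X, n_components):
    """Conventional PCA of a (samples x variables) matrix via SVD.

    Returns the projected scores, the principal directions (rows),
    and the variance captured by each retained component.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                 # centre each variable
    # Rows of Vt are orthonormal directions, ordered by decreasing variance.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    scores = Xc @ components.T              # linearly uncorrelated variables
    variances = (s ** 2) / (X.shape[0] - 1)
    return scores, components, variances[:n_components]

# Synthetic example: five correlated variables driven by two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

scores, components, variances = pca(X, n_components=2)
print(scores.shape)                          # two retained components per sample
print(np.cov(scores, rowvar=False).round(6))  # off-diagonal entries are ~0
```

The covariance matrix of the scores is diagonal up to rounding, which reflects the uncorrelatedness of the principal components, and the diagonal entries appear in decreasing order, which reflects the variance ordering.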

A wind turbine system is a complex industrial system, and its operating conditions are harsh. Therefore, the conventional PCA technique may become ineffective for fault diagnosis and fault classification in wind turbine systems subjected to multiple faults. As a result, there is a strong motivation to develop advanced PCA-based fault diagnosis and classification techniques for wind turbine systems. In this study, uncorrelated multi-linear principal component analysis (UMPCA) is integrated with fast Fourier transform (FFT) data preprocessing to form an algorithm, which is applied to a 4.8 MW wind turbine benchmark system for fault diagnosis and classification. Furthermore, comparison studies against known algorithms are carried out to demonstrate the effectiveness of the proposed algorithm.

The rest of this paper is organized as follows. In Section 2, the fundamentals of the 4.8 MW wind turbine benchmark model are introduced, and actuator and sensor faults of wind turbines are explained. In Section 3, an algorithm integrating the FFT and UMPCA techniques is presented for dimensionality reduction and feature extraction. Experimental designs covering different topologies of the actuator and sensor faults of wind turbines are proposed in Section 4. Simulation studies are presented in Section 5. In order to demonstrate the effectiveness of the proposed FFT plus UMPCA method, simulation studies of fault diagnosis and classification for wind turbines using MPCA, FFT plus MPCA, and UMPCA, respectively, are also discussed. Finally, the conclusions are summarized in Section 6.
