Article

A Dataset and a Comparison of Classification Methods for Valve Plate Fault Prediction of Piston Pump

by Marcin Rojek 1,*,† and Marcin Blachnik 2,*,†

1 Joint Doctoral School, Silesian University of Technology, 44-100 Gliwice, Poland
2 Department of Industrial Informatics, Silesian University of Technology, 44-100 Gliwice, Poland
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Appl. Sci. 2024, 14(16), 7183; https://doi.org/10.3390/app14167183
Submission received: 27 June 2024 / Revised: 9 August 2024 / Accepted: 12 August 2024 / Published: 15 August 2024

Abstract:
The article introduces datasets representing piston pump failures along with an experimental evaluation of various machine learning classification models. It starts with a detailed description of three classification datasets covering three different levels of valve plate damage, with signals recorded from sensors used in classical hydraulic systems (pressure, temperature, flow). The obtained datasets consist of 100k (Failure 1), 30k (Failure 2), and 30k (Failure 3) samples with eight attributes each. Then, a broad range of classifiers is evaluated, including three ensemble models based on decision trees: Random Forest, Gradient-Boosted Trees, and Rotation Forest, as well as the kNN algorithm and a neural network. The analysis showed that the neural network achieved the highest prediction accuracy, reaching 89%. The kNN algorithm ranked second, and the tree-based algorithms performed about 4 percentage points worse than the neural network. Next, the attribute importance analysis revealed that leak flow, output pressure, leak line pressure, and oil temperature are the most important parameters for accurate predictions. Additionally, the research includes a sensitivity analysis of the best classifier to verify the impact of sensor measurement noise and other noise sources on the prediction model performance. The analysis indicates a margin of about 5% for measurement quality.

1. Introduction

Over the last few years, artificial intelligence and machine learning have been widely discussed. For many years, attempts have been made to apply and implement machine learning models in many branches of industry. One of the tasks of this technology is predictive maintenance, i.e., the monitoring and analysis of device and system parameters in order to detect early damage to system components and avoid failures [1].
Many industrial applications and systems are based on hydraulics. Hydraulic components such as motors and pumps are key elements of hydraulic systems in automotive, mining, metal processing, and mobile machines [2]. Their reliable operation determines the stability of the production chain and significantly affects the competitive advantage of companies. Important production processes often depend on the efficiency of the hydraulic system; hence, their good technical condition is crucial. Depending on the industry and the role of a device or system in the infrastructure, damage to a single component can cause serious problems, including significant economic losses. Even a relatively inexpensive device can cause a high economic loss when it breaks down. For example, a pump failure occurred in the hydraulic system of a long-wall shearer in a coal mine: the failure of a component worth several hundred euros caused an economic loss of one hundred thousand euros. That is why the ability to predict failures is so important.
One of the obstacles to the large-scale implementation of machine learning methods in industry is the availability of a large amount of training data describing both failure states and normal operational states of the equipment. While collecting data from an operational system does not generally seem to be a difficult task, collecting data during system failures is practically impossible because companies strive to prevent failures from occurring. There are two solutions to this problem: one is the creation of digital twins [3,4], which through computer simulations allow the creation of any failure states, and the other is conducting experiments on actual equipment but in laboratory conditions.
This article describes the research and analysis of piston pump damage conducted in laboratory conditions. During the tests, simulations were carried out with the pump operating with a properly functioning valve plate as well as with three different levels of cavitation damage to the valve plate. The research involved replacing one of the components and conducting simulations under various operating conditions of the pump with a load controlled by an electric motor managed via an inverter.
In the further part of the study, the collected data were transformed into a dataset suitable for evaluation by machine learning methods. Using these prepared data, a comparison was made of various predictive models including three different ensemble models based on trees: Random Forest [5], Gradient Boosted Trees [6], and Rotation Forest [7], the kNN algorithm [8], and a neural network [9].
The novelty of the manuscript involves:
  • Introducing a dataset representing real measurements of the valve plate failures in different damage conditions
  • Performing a broad comparison of various machine learning classifiers for prediction of the state of the valve plate
  • Analysis of the feature importance of the best prediction model in order to verify knowledge extracted by the model
  • Analysis of the model’s sensitivity to the uncertainty of sensor measurements or other noise sources.
The subsequent sections are organized as follows: the next section describes the current state of knowledge in the field of predicting hydraulic equipment failures; Section 3 provides a detailed description of the laboratory tests on the pump and the method of constructing the dataset for the evaluation of machine learning models; Section 4 describes the evaluated models and the experimental procedure; Section 5 presents the results obtained along with a discussion of the performance of the evaluated models, including an analysis of the significance of the individual attributes; and the final section summarizes the conducted research and suggests directions for future work.

2. Related Work

The literature analysis shows that machine learning techniques, comprising a large family of methods, are popular tools for failure prediction. A good example of a neural network application, particularly a convolutional neural network, is described in [10]. However, classic machine-learning algorithms are dominant in this area. One can find examples of a modified kNN algorithm which, combined with the just-in-time learning rule (JITL), was used to determine the remaining useful life (RUL) of hydraulic pumps based on pressure measurements [11]. The authors of [12,13] also estimate RUL, applying a Bayesian regularized radial basis function neural network (Trainbr-RBFNN) to an external gear pump, and a modified auto-associative kernel regression (MAAKR) with the monotonicity-constrained particle filtering (MCPF) approach to a piston pump. Another application combines RBF neural networks with noise-filtering algorithms and vibrodiagnostics [12]. The literature also includes comparisons of classic machine learning algorithms, including SVM, kNN, and gradient-boosted trees [14]. Publication [15] describes a method based on the empirical wavelet transform (EWT), principal component analysis (PCA), and an extreme learning machine (ELM) for analyzing data from vibration sensors. Another approach, presented in [16,17], is based on a convolutional neural network; in these articles, pressure, vibration, and acoustic signals were used as input data for the prediction. The authors of [18] review the recent literature on ML-based condition monitoring systems. Many authors draw attention to artificial neural networks (ANN), multilayer perceptrons (MLP), or convolutional neural networks (CNN).
Remaining useful life is also predicted in [19], where the Auto-Regressive Integrated Moving Average (ARIMA) forecasting method is used. The research object was a piston pump for which the leakage volume was considered a significant parameter for RUL prediction. Another predictive maintenance model was presented in [20]. That paper focuses on a hybrid solution combining Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (ICEEMDAN), Principal Component Analysis (PCA), and a Least Squares Support Vector Machine (LSSVM). The model is optimized by a combination of the Coupled Simulated Annealing and Nelder–Mead Simplex optimization algorithms (ICEEMDAN-PCA-LSSVM). The proposed technique is compared with three established methods with multiclass classification capabilities: Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), and Artificial Neural Network (ANN). The authors of [21] describe deep learning methods in which the Bayesian optimization (BO) algorithm is used to find the best hyperparameters of the model; the vibration signal of the piston pump serves as the data source. The signal is preprocessed with the continuous wavelet transform (CWT), a preliminary CNN model is prepared, and finally BO based on a Gaussian process is used to build an adaptive CNN model (CNN-BO). Another article [22] on axial piston pumps proposes a transfer learning method for fault severity recognition, based on adversarial discriminative domain adaptation combined with a convolutional neural network (CNN); as in [21], a vibration signal is used as the data source. In [23], the authors conduct their research using one-way analysis of variance (ANOVA), again relying on the vibration signal together with the pressure at the output port of the pump. The pump wear classification methods used were: decision tree (DT), discriminant analysis (DA), kNN, naïve Bayes classifiers, and SVM.

3. The Dataset

3.1. Data Collection Process

In order to obtain data, a model of a hydraulic power unit (HPU) was built. The model is based on an axial piston pump, as it is one of the most popular types of pumps used in industrial installations. The pump used is an HSP10VO45DFR (Hydraut Europe S.R.L., Gallarate (VA), Italy), a nine-piston variable-displacement pump with an inclined swash plate. The nominal parameters of the pump are as follows:
  • displacement: 45 cm³/rev
  • max. continuous pressure: 280 bar
  • peak pressure: 350 bar
The pump cross-section is shown in Figure 1. The valve plate is marked yellow for better visualization. The cylinder block (marked blue) is the element that rotates and slides on the valve plate surface on the oil film. This allows cyclic displacement of oil by pistons from suction to the output port in the end cover of the pump. The valve plate is the element that allows cooperation between the cylinder block and end cover. It acts as a slide bearing.
In the first stage of building the HPU model, a hydraulic diagram of the system was prepared, including appropriate measurement sensors. The diagram is shown in Figure 2 and all the components used in test HPU are presented in Table 1. The test HPU built for the research uses a pump (10) driven by an electric motor (3). As a load for the pump, a hydraulic motor (14) was used. Further, motor (14) was driving another electric motor (16), which was acting as an adjustable brake for which a specific torque value could be set. The torque value was measured by a torque meter (15). Process parameters for the pump were measured by temperature (1, 4, 8), pressure (2, 5, 9) and flow (6, 11) sensors. Check valve (7) is used as a safety relief valve for the pump leak port in case of flow-meter (6) clogging. Temperature and pressure sensors (12, 13, 18, 19) for hydraulic motors (14) are used for diagnostic and safety monitoring purposes. As a standard element in hydraulic systems, a return oil filter (17) is also used.
The electric motor working as a brake is powered by a frequency inverter, which is then controlled by a PLC controller and a computer system.
The test unit is shown in Figure 3.
The constructed model allows data to be collected for pumps in both correct operating and damaged conditions.
In hydraulic systems, pumps are most often used as a source of a specific pressure, for example, to generate the required force with a hydraulic cylinder or torque with a hydraulic motor. For this reason, a forced pump load in the form of a given load torque was used when collecting training data, causing the pump to generate the corresponding pressure. There are structural losses between the cooperating pump components, causing internal oil leaks, especially when operating under high pressure; this is a normal feature of these devices. Wear and tear or failure causes these leaks to become more severe. Hydraulic oil, which has a specific nominal kinematic viscosity according to ISO 3448 [24], is usually used as the working medium. The viscosity of the working medium plays an important role in observing the operational wear of the pump or its failure state. The built system uses ISO VG46 oil, which has a nominal kinematic viscosity of 46 mm²/s at a temperature of 40 °C; oil of this viscosity is commonly used in hydraulic systems. Since the viscosity of the oil strongly depends on its temperature, the measured parameters were recorded while the oil temperature was increased: the temperature in the oil tank and in the suction line was raised from ambient temperature to approximately 50 °C. Because the data were collected in different months of the year, the starting ambient temperature ranged from around 16 °C up to 21 °C. The built system enables the registration of many parameters, but for the purposes of failure prediction research, seven of them were selected based on expert knowledge and the pump design.
The data collected during the tests show the correct and damaged condition (worn valve plate) of the piston pump.
In real hydraulic systems, the way pumps are loaded is quite random and depends on many factors. These can be sudden pressure changes, from a few bars when the pump is unloaded to 300 bar and more, but also smooth pressure changes when the system works with proportional parameter control. Taking these factors into account, both a stepped load and a sinusoidal waveform were used during data collection. Both types of load were applied randomly to better mimic the behavior of a real hydraulic system. Graphs showing the applied load torque are shown in Figure 4. The torque value was set in the range of 20–220 Nm, which corresponds to an output pressure in the range of around 25–235 bar. For the stepped type, the value of the applied torque was changed manually at intervals of approximately 4–5 min. For the sine wave, the period was set to 300 s.
The process parameters recorded during the tests were collected at three points in the system:
  • suction line: temperature of the oil
  • the output line of the pump: pressure, temperature and flow
  • leak line: pressure, temperature and flow
The data acquisition system is fairly extensive software running under the Windows OS. Its limitation is that a constant sampling frequency cannot be set, as sampling depends on the operation of the operating system and the PLC controller. As a result, a sampling rate of approximately 40 samples (data records) per second was obtained. After collecting the data, the recorded values for each sensor were joined by time stamp to form a sample record. Finally, each data sample consisted of seven values describing the temperature, pressure, and flow on the suction line, the output line, and the leak line, as described above. The data were collected for four different pump states: one with correct functioning and three representing damaged states. To collect data on the damaged condition, a functional element in the pump, the valve plate, was replaced with a damaged one. Three damaged plates with different sizes of damage, shown in Figure 5, were prepared. The first, described as “Failure 1”, is an element from a pump that was damaged during operation; the irregular cavity in the oval hole is the effect of cavitation. The two other plates (described as “Failure 2” and “Failure 3”) were intentionally damaged to carry out the tests. All valve plates have similar damage, differing in the amount of material loss. The smallest cavity is in “Failure 2”, and the biggest damage, affecting two oval holes, is in valve plate “Failure 3”. Therefore, “Failure 1” represents the middle state of damage, similar to “Failure 2” in terms of the amount of missing material.
During the research, approximately 6.6 million data records were collected. The approximate quantity of them is listed in Table 2. Each record consists of float type values given in Table 3.

3.2. Data Preprocessing and Forming the Training and Test Set

To carry out further work on the evaluation of machine learning models, the dataset was pre-processed in two steps. First, the recorded samples were filtered and rows with NaN values were removed. Next, since the sampling frequency was much higher than the rate of signal changes, the recorded values were averaged to obtain a sample rate of 1 sample per second (the records were grouped and averaged within one-second intervals). After these processing steps, the dataset size was reduced; a summary is presented in Table 4.
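The two preprocessing steps above can be sketched with pandas on synthetic stand-in data (the column names and timestamps here are hypothetical; the actual dataset uses the symbols from Table 3):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in: ~40 samples/s for 10 seconds, two example sensor columns
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", periods=n, freq="25ms"),
    "P_out": rng.normal(150.0, 5.0, n),   # output pressure [bar]
    "T_suct": rng.normal(40.0, 0.5, n),   # suction line temperature [°C]
})
df.loc[5, "P_out"] = np.nan               # simulate a faulty record

# Step 1: drop rows containing NaN values
df = df.dropna()

# Step 2: group and average within one-second intervals -> 1 sample per second
df_1hz = df.resample("1s", on="timestamp").mean()
print(len(df_1hz))  # 10 one-second groups
```

Averaging rather than subsampling preserves the slow signal trends while discarding the irregular sampling jitter described above.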
To form the final dataset used for machine learning evaluation, the obtained records were divided into a training set and a testing set. The amount of data describing Failure 1 is significantly larger than the data describing Failure 2 and Failure 3. Therefore, the data describing Failure 1 were used as the positive class in the training dataset, while the data representing Failure 2 and Failure 3 were used to create the two testing datasets. For the negative class, no-fail data were used: all records except the last 15,000 were used to create the negative class in the training dataset, while the last 15,000 records were used to create the negative class in both test datasets. Therefore, the negative class contained the same set of records in both test datasets. Finally, according to the experts’ knowledge, an additional attribute was generated, representing the difference between the temperature on the leak line and the temperature on the suction line: T_diff = T_leak − T_suct. The attribute descriptions and symbols are presented in Table 3, and dataset statistics are presented in Table 5. As can be observed, the number of samples in each class is comparable, with the class balance close to 50%.
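The split described above can be sketched as follows; the frame and function names are hypothetical, and each input frame is assumed to be already averaged to 1 sample per second and ordered by time:

```python
import pandas as pd

def add_tdiff(df):
    """Expert-suggested derived attribute: T_diff = T_leak - T_suct."""
    out = df.copy()
    out["T_diff"] = out["T_leak"] - out["T_suct"]
    return out

def build_sets(df_ok, df_f1, df_f2, df_f3, holdout=15_000):
    """Training set: Failure 1 (positive) + all no-fail records except the
    last `holdout`; each test set: Failure 2 or 3 (positive) + the same
    last `holdout` no-fail records (negative)."""
    df_ok, df_f1, df_f2, df_f3 = map(add_tdiff, (df_ok, df_f1, df_f2, df_f3))
    neg_train, neg_test = df_ok.iloc[:-holdout], df_ok.iloc[-holdout:]

    def labeled(pos, neg):
        pos = pos.assign(label=1)   # 1 = failure state
        neg = neg.assign(label=0)   # 0 = correct operation
        return pd.concat([pos, neg], ignore_index=True)

    return (labeled(df_f1, neg_train),   # training set
            labeled(df_f2, neg_test),    # test set 1
            labeled(df_f3, neg_test))    # test set 2
```

Reusing the same no-fail holdout in both test sets is what makes the negative-class rows of the two confusion matrices identical later in Section 5.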

4. The Experiments

The research component related to machine learning encompassed three groups of experiments. In the first step, the issue of selecting a predictive model for the presented task was analyzed. This study examined three groups of predictive models, including methods based on decision trees, minimal-distance-based models, and neural networks. In the subsequent phase of the research, after selecting the best model, the factors used by the model to make decisions were analyzed focusing on the attribute significance analysis. The final stage of the research involved analyzing the model’s sensitivity and its susceptibility to measurement errors. The individual topics mentioned above are described in detail below.

4.1. Models Used for Evaluation

One of the problems in machine learning applications is selecting an appropriate predictive model. Among the available methods, three main families can be distinguished: models derived from decision trees, distance-based methods, and neural networks.
The first group of algorithms, i.e., those based on decision trees, usually refers to complex models or ensembles of models that address the limitations of individual trees. For single decision trees, the problem lies in the shape of the decision boundary, which, for continuous data, takes the form of a stepwise curve. In this case, the application of complex models, including ensuring their diversity, results in multiple trees working together to increasingly better approximate the true decision boundary. As mentioned earlier, it is important to ensure the diversity of the individual trees. To solve this problem, many different algorithms have been developed, among which the most popular is Random Forest [5], where tree diversity is ensured by creating individual trees on independent bootstrap samples and random subsets of features, and the final decision of the model is generated through a democratic vote of the component models. Another popular solution is the Gradient-Boosted Trees algorithm [6]. In this case, unlike in Random Forest, subsequent trees are added sequentially, focusing on the areas where the previous trees caused the greatest model error. This is achieved by adapting the probability distribution used for sampling training vectors with replacement, so that each subsequent tree is generated on a modified subset of the training data. The final decision of the model is the result of a weighted vote of the individual component models. In the experiments carried out, another tree-based algorithm was also used: the Rotation Forest algorithm [7]. It operates similarly to Random Forest, with the difference that the algorithm generates random subspaces for which a tree is generated; however, before training the tree, the PCA algorithm is used to rotate the coordinate system. This algorithm is particularly useful when correlated variables exist, as the rotation simplifies the construction of a predictive model.
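The first two ensemble variants can be sketched with scikit-learn on synthetic data (Rotation Forest is not part of scikit-learn, so it is omitted here; the dataset parameters are illustrative only):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic 8-attribute binary problem as a stand-in for the pump data
X, y = make_classification(n_samples=600, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Bagging-style ensemble: diverse trees, democratic vote
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Boosting: trees added sequentially, each correcting the previous errors
gbt = GradientBoostingClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

print(rf.score(X_te, y_te), gbt.score(X_te, y_te))
```

Both classifiers expose the same `fit`/`predict`/`score` interface, which is what makes the broad comparison in Section 5 straightforward.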
All the tree-based algorithms have an additional significant advantage, namely, they provide the possibility of partial interpretation of the model results using the MDI (Mean Decrease in Impurity) coefficients. These coefficients are used to assess the importance of attributes, thus they can be interpreted as a measure of evaluating the quality of individual variables. Another important advantage of tree-based algorithms is the speed of training and prediction, as trees ensure low computational complexity and ease of parallelization.
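The MDI coefficients mentioned above are available directly from fitted scikit-learn tree ensembles; a minimal sketch on synthetic data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Only 3 of the 8 synthetic features are informative by construction
X, y = make_classification(n_samples=500, n_features=8, n_informative=3,
                           n_redundant=0, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# MDI importances: average impurity decrease contributed by each feature,
# normalized so that the values sum to one
mdi = rf.feature_importances_
print(np.argsort(mdi)[::-1])  # features ranked from most to least important
```

As noted in Section 4.2, these values are computed from the training data, so they should be read as a rough in-model ranking rather than a test-set importance estimate.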
A separate group of models consists of distance-based algorithms. A typical example of this group is the k-Nearest Neighbors (kNN) algorithm [8]. It belongs to the group of lazy learning methods because the entire inference process occurs during prediction, where the k nearest neighbors for a given vector are identified from among the stored training vectors, and a vote is taken among them to select the appropriate class. Besides the voting method, the factor determining the quality of kNN is also the selection of reference vectors stored in the model’s memory. These methods are also called instance selection methods. This enables the interpretation of the model, including the use of so-called prototype-based rules [25] and case-based reasoning methods [26].
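A minimal kNN sketch follows; because the method is distance-based, standardizing the attributes (as done for the pump data) matters, so a scaler is included. The data here are synthetic stand-ins:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=8, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# Fit the scaler on training data only, then apply it to both sets
scaler = StandardScaler().fit(X_tr)

# Lazy learner: all inference happens at prediction time, when the k
# nearest stored training vectors vote on the class
knn = KNeighborsClassifier(n_neighbors=5).fit(scaler.transform(X_tr), y_tr)
print(knn.score(scaler.transform(X_te), y_te))
```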
The last method evaluated is the neural network [9,27]. In this case, a fully-connected neural network also called a multi-layer perceptron (MLP) was used, without convolutional layers. This choice results from the characteristics of the data, where the data could not be decomposed into time series due to the shape of the waveform, the sampling frequency, and the speed of recorded signal changes.
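A fully-connected network of this kind can be sketched with scikit-learn's `MLPClassifier`; the layer sizes below are illustrative, not the tuned values from Table 6:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=8, random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=2)
scaler = StandardScaler().fit(X_tr)

# Multi-layer perceptron: dense layers only, no convolutions
mlp = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000,
                    random_state=2)
mlp.fit(scaler.transform(X_tr), y_tr)
print(mlp.score(scaler.transform(X_te), y_te))
```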

4.2. Feature Importance Analysis

The complement to the conducted analysis is the evaluation of attribute importance. As mentioned earlier, attribute importance can be assessed using tree-based algorithms, for example via the Mean Decrease in Impurity (MDI). This measure, however, is computed from the training data, which can lead to an overestimation of the importance of individual attributes. Therefore, in this work, the permutation feature importance measure was used to assess attribute importance [28,29]. This measure is model-independent, allowing it to evaluate tree-based models as well as kNN and neural networks. The method measures the contribution of each feature to the statistical performance of the fitted model by randomly shuffling the values of a single feature and observing the impact on the model’s prediction quality. By breaking the association between the feature and the labels, we determine how much the model relies on that particular feature while keeping the distribution of the feature unchanged. Moreover, this method can also be applied to test data, thus avoiding the issue of overestimating feature importance due to model overfitting, as often happens with the MDI estimator.
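The procedure above maps directly onto `sklearn.inspection.permutation_importance`; a minimal sketch on synthetic stand-in data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=8, n_informative=3,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Shuffle each feature on the *test* data and measure the drop in accuracy;
# the function is model-agnostic, so the same call works for kNN or an MLP
result = permutation_importance(model, X_te, y_te, n_repeats=10,
                                random_state=0)
print(result.importances_mean)  # mean accuracy drop per feature
```

The `n_repeats` parameter corresponds to the repeated random permutations summarized by the whiskers in Figure 8.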

4.3. Model Sensitivity Analysis

The objective of the model sensitivity study was to analyze the impact of measurement disturbances on the prediction accuracy of the classifier. In this area of research, the analysis was conducted based on the model with the highest prediction accuracy. The obtained model, trained on noise-free data representing Failure 1, was verified on the Failure 2 and Failure 3 test data, to which Gaussian noise with a specified variance was added. Since the input data for the predictive model underwent a standardization process, each variable had a mean equal to 0 and a standard deviation equal to 1. This data processing allows us to define a single universal noise level (σ) for all attributes and add it to each variable. In the analysis, it was assumed that subsequent cases would be tested with additive Gaussian noise by generating a noise vector of size 8 and adding it to each testing sample. The experiments were repeated 20 times for each given σ, and six σ values were evaluated: σ = 0.01, 0.05, 0.1, 0.15, 0.2, 0.25, each time measuring the prediction performance. Finally, the mean value and standard deviation were computed for each value of σ.
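The noise-sensitivity protocol can be sketched as follows on synthetic stand-in data (the classifier and its layer sizes are illustrative, not the tuned model from the experiments):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=600, n_features=8, random_state=0)
scaler = StandardScaler().fit(X)
Xs = scaler.transform(X)  # standardized: mean 0, std 1 per attribute
model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                      random_state=0).fit(Xs, y)

rng = np.random.default_rng(0)
for sigma in (0.01, 0.05, 0.1, 0.15, 0.2, 0.25):
    # 20 noisy repetitions per sigma: add an independent Gaussian noise
    # vector to every (standardized) test sample, then score the model
    accs = [model.score(Xs + rng.normal(0.0, sigma, Xs.shape), y)
            for _ in range(20)]
    print(sigma, np.mean(accs), np.std(accs))
```

Because the attributes are standardized, one σ value is a comparable noise magnitude across all sensors, which is exactly what makes the single-σ sweep meaningful.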

4.4. Model Evaluation Procedure

The model evaluation procedure comprised two steps. In the first step, the predictive model underwent parameter optimization, and in the second step, the quality of the best obtained predictive model was assessed using test datasets, which included different levels of damage to the valve plate as described in Section 3.
The full data processing schema is presented in Figure 6. It starts with the initial standardization of the training data, where the standardization parameters (mean and variance) were determined (fitted) using the training set and then applied to the test sets. Then, the grid search procedure starts with a 10-fold stratified cross-validation test. The range of optimized parameters for each model is provided in Table 6. Subsequently, the best model obtained from parameter optimization was used to assess accuracy on the test sets. Here, the test data described two different levels of valve plate damage. In this case, the standardization coefficients derived from the entire training dataset were applied to the test sets. After this initial processing, predictions were made using the model trained on the Failure 1 dataset and applied to both test datasets.
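The standardize-then-optimize step can be sketched with a scikit-learn pipeline inside a grid search; the parameter grid below is illustrative, not the full range from Table 6:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Putting the scaler inside the pipeline means it is re-fitted on the
# training part of every CV fold, so fold test data never leaks into fitting
pipe = Pipeline([("scale", StandardScaler()),
                 ("mlp", MLPClassifier(max_iter=1000, random_state=0))])
grid = {"mlp__hidden_layer_sizes": [(16,), (32, 16)]}  # illustrative grid
search = GridSearchCV(pipe, grid, cv=StratifiedKFold(n_splits=10))
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

After `fit`, `search.best_estimator_` is the refitted pipeline that would then be applied to the held-out test sets.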
The next step in the model evaluation process was assessing attribute importance using the permutation method. Applying this method to both test sets and comparing the obtained results allows for the verification of the model in terms of the consistency of the results. Obtaining similar attribute importance for two independent test datasets confirms the validity of the model’s quality. The analysis of attribute importance using the permutation method has certain limitations (the problem also applies to the MDI method). This limitation arises in the case of strongly correlated attributes. In such situations, replacing an attribute with a new one containing permuted values causes the other strongly correlated attribute to take over its role, making the decrease in prediction accuracy invisible. To circumvent the above-described problem, this work includes an analysis of the correlations between attributes. For attributes identified as strongly correlated, i.e., those with a correlation above 0.9, a PCA analysis was conducted, leaving only the first principal component for all correlated attributes. The analysis of attribute importance was then performed on the dataset with a reduced number of attributes, starting by retraining the model and then evaluating its performance on the test data.
An important factor worth highlighting is the model evaluation method used during cross-validation. The traditional approach to cross-validation involves randomly splitting the data into individual folds. This procedure was abandoned in this study due to the nature of the data. The recorded data should be treated to some extent as a signal, which means that with randomized splits and the step-wise nature of the pump load operation, we would obtain almost identical training and test data. This would consequently lead to an overestimation of the model’s quality. Therefore, a linear partitioning approach was adopted in this work. In this schema, a dataset of n samples is divided into k subsets (we used k = 10), but not randomly; rather, it is divided into k consecutive subsets, each containing n/k samples ordered by time. In each of the k iterations, one of these subsets was used for testing and the remaining ones for training.
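In scikit-learn, this linear partitioning corresponds to `KFold` with shuffling disabled, which yields k contiguous blocks of n/k time-ordered samples:

```python
import numpy as np
from sklearn.model_selection import KFold

n, k = 50, 10
X = np.arange(n).reshape(-1, 1)  # samples assumed ordered by time

# shuffle=False splits the data into k contiguous blocks, matching the
# linear partitioning described above
for train_idx, test_idx in KFold(n_splits=k, shuffle=False).split(X):
    # each test fold is a time-contiguous block of n/k samples
    assert np.all(np.diff(test_idx) == 1)
    print(test_idx[0], test_idx[-1])
```

Shuffled folds would interleave near-duplicate consecutive samples between training and testing, which is precisely the leakage the linear scheme avoids.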
In the calculations performed, simple accuracy was used as the measure of prediction performance, calculated as the number of correctly predicted cases relative to the total number of evaluated cases. The use of simple accuracy is justified because the datasets evaluated were balanced, with the class distribution being 47%, 49%, and 48%, respectively, for the Failure 1, Failure 2, and Failure 3 datasets. As a complement to the accuracy, a confusion matrix was used for evaluation, which allows for the assessment of false positive and false negative errors.
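Both evaluation measures are one-liners in scikit-learn; a toy sketch with hypothetical labels (0 = correct operation, 1 = failure state, matching the convention used in Figure 7):

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0])
y_pred = np.array([0, 1, 0, 1, 1, 0, 1, 0])

# Simple accuracy: correctly predicted cases / all evaluated cases
acc = accuracy_score(y_true, y_pred)

# Confusion matrix: rows are true classes, columns are predicted classes,
# exposing false positives (row 0, col 1) and false negatives (row 1, col 0)
cm = confusion_matrix(y_true, y_pred)
print(acc)  # 0.75
print(cm)   # [[3 1]
            #  [1 3]]
```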
All of the experiments were conducted using the scikit-learn Python library for machine learning. The final dataset used in the experiments is available at https://www.kaggle.com/datasets/mbjunior/valve-plate-failure-prediction-in-hydraulic-pumps (accessed on 25 June 2024), and the script used to perform the experiments is available on GitHub: https://github.com/mblachnik/2024_Valve_plate_failures/ (accessed on 25 June 2024).

5. Results

Below we present the results obtained for each of the evaluation steps.

5.1. Model Selection

According to the procedure described in Section 4.4, five prediction models were evaluated. The results of the model evaluation procedure were collected and presented in Table 7. They indicate that, in the case of identifying failures in hydraulic pumps, tree-based models exhibited the highest prediction error. Among the tree-based models, the Gradient Boosted Trees algorithm achieved the best results, with an accuracy of 84.55%. The kNN algorithm ranked next, showing a 3-percentage-point improvement in accuracy compared to the GBT algorithm. The best results were obtained by the neural network, which achieved an accuracy of 88.68%.
It is also worth noting that the use of autoregressive data, including up to three historical samples, did not positively affect the quality of the obtained results. These findings indicate no significant impact on model quality, and in many cases, a minimal degradation in model quality can be observed with an increasing window size.
Ultimately, the MLP model, which achieved the highest accuracy, was used for verification on the test data. The quality of the results obtained from the Failure 2 and Failure 3 datasets is presented in Table 8. Complementing the results is the confusion matrix obtained for both datasets, shown in Figure 7, where class 0 corresponds to correct pump operation and class 1 corresponds to the failure state. The results obtained differ by 13 percentage points. Such a large difference in accuracy can be explained by the levels of damage: in the Failure 2 dataset, the damage is the smallest, while in the Failure 3 dataset the damage is the largest and appears in two holes. Also notable is the difference between the cross-validation accuracy estimated on the Failure 1 dataset (88%) and the accuracy obtained on the Failure 2 dataset (80%). This difference can also be explained by the size of the damage in the valve plate. More details on the datasets are presented in Section 3.
In Figure 7, identical values can be seen for the negative class (0). This is expected, because the same set of records describing the normal pump operating condition was used for both datasets.
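The verification step boils down to fitting the best model on one dataset and scoring it on an independent one. A minimal sketch, with synthetic data standing in for the Failure 1 training set and a held-out block standing in for a Failure 2/3 test set:

```python
# Sketch of the final verification: train on one dataset, evaluate on an
# independent one, and report accuracy plus the confusion matrix.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in: the first block plays the role of the Failure 1
# training data, the held-out block that of an independent test set.
X, y = make_classification(n_samples=700, n_features=8, random_state=0)
X_train, y_train, X_test, y_test = X[:500], y[:500], X[500:], y[500:]

# Best architecture from the grid search: two hidden layers (100, 10).
clf = MLPClassifier(hidden_layer_sizes=(100, 10), max_iter=500, random_state=0)
clf.fit(X_train, y_train)

pred = clf.predict(X_test)
acc = accuracy_score(y_test, pred)
cm = confusion_matrix(y_test, pred)  # rows: true class (0 normal, 1 failure)
print(acc)
print(cm)
```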

5.2. Feature Importance Analysis

The next factor considered in the analysis was the importance of individual attributes for the obtained model. Attribute importance was assessed using the permutation method applied to both test datasets. As discussed in the previous section, the correlation between all attributes was measured first; the results are presented in Table 9. All three datasets were compared, and in each case the correlations between the attributes were almost identical, so the table reports the average correlation over the three datasets. The values indicate a very strong dependency between the output temperature (T_out) and the suction line temperature (T_suct.), with a correlation of 0.99, and a similarly high correlation between the output temperature (T_out) and the leak line temperature (T_leak), at 0.91. As described above, a typical approach to overcome the problem of correlated features in permutation-based importance assessment is PCA, so the three attributes with a correlation coefficient above 0.9 were separated from the remaining attributes and PCA was performed on them. The first principal component was then used to replace the three temperature attributes; therefore, the results presented in Figure 8 cover only six attributes (with a single temperature attribute) instead of eight. The coefficients describing the importance of the individual attributes are shown in Figure 8 separately for both datasets; the x-axis represents the decrease in accuracy, the y-axis the feature name, and the whiskers summarize the statistics over 10 random permutations of each attribute's values.
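The procedure above can be sketched as follows: three strongly correlated temperature signals are collapsed into their first principal component, and permutation importance is then computed over the six remaining attributes. The data and the classifier choice here are synthetic stand-ins, not the experimental setup itself.

```python
# Sketch of the importance analysis: correlated temperatures -> one PCA
# component, then permutation importance. Names follow Table 3; the data
# are synthetic stand-ins.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 600
T_out = rng.normal(50, 5, n)
# T_suct and T_leak are generated as noisy copies of T_out to mimic the
# 0.99 / 0.91 correlations reported in Table 9.
temps = np.column_stack([T_out,
                         T_out + rng.normal(0, 0.5, n),
                         T_out + rng.normal(0, 2.0, n)])
others = rng.normal(size=(n, 5))  # stand-ins for P_leak, P_out, F_leak, F_out, T_diff
y = (others[:, 1] + 0.5 * T_out / 50 + rng.normal(0, 0.5, n) > 1).astype(int)

t_pc1 = PCA(n_components=1).fit_transform(temps)  # single temperature feature
X = np.hstack([t_pc1, others])                    # six attributes instead of eight

clf = RandomForestClassifier(random_state=0).fit(X, y)
result = permutation_importance(clf, X, y, n_repeats=10, random_state=0)
print(result.importances_mean)  # mean accuracy drop per attribute
```

The `importances` array holds one accuracy drop per repeat, which is exactly what the whiskers in Figure 8 summarize.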
The obtained results indicate that the most important parameters in both cases were ‘Flow—leak line’ and ‘Pressure—output’. Next in order is the temperature component T for Failure 2 and P_leak for Failure 3, although their importance is much lower than that of the first two attributes. The third and fourth positions hold the same pair of attributes in both datasets, only in swapped order. F_out is the least important but still valuable feature on both datasets, while T_diff has an importance of 0, which means it can be removed from the dataset.
It is worth noting that the interpretation of the obtained results aligns with expert knowledge, where a high ‘Flow—leak line’ can indicate a valve plate failure, although the flow rate strongly depends on the pressure in the system; therefore, the ‘Pressure—output’ was marked as the second most important.

5.3. Model Sensitivity

As mentioned above, the sensitivity analysis of the model involved adding a noise vector with a given standard deviation to each test record and then assessing the prediction accuracy on the test set. The results, presented graphically in Figure 9, indicate that adding noise with a standard deviation of 0.05 (5%) causes a minimal decrease in prediction accuracy for both the Failure 2 and Failure 3 datasets. For the Failure 3 dataset, a significant drop in accuracy is observed for σ > 0.05, and for σ > 0.1 the decline is linear. In contrast, for the Failure 2 dataset, where the extent of valve plate damage is similar to the Failure 1 data, a significant deterioration is observed for σ > 0.1, with a linear decline in accuracy occurring for σ > 0.15. It is also noteworthy that, despite adding noise at a level of 0.25 (25%), the results obtained for Failure 3 were higher than those obtained for the Failure 2 dataset without any noise. In conclusion, considering the measurement errors of the sensors used in the study, shown in Table 10, the developed model should behave stably, and measurement errors should not affect the quality of the predictions.
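The sensitivity test can be sketched in a few lines: Gaussian noise with standard deviation σ is added to each (standardized) test record and accuracy is re-measured across a grid of σ values. Data and model are synthetic stand-ins; the real inputs are the eight standardized process signals.

```python
# Sketch of the noise-sensitivity analysis: accuracy vs. noise std sigma.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=800, n_features=8, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(100, 10), max_iter=300,
                    random_state=0).fit(X, y)

rng = np.random.default_rng(0)
accuracies = {}
for sigma in [0.0, 0.05, 0.1, 0.15, 0.2, 0.25]:
    # Perturb every test record with zero-mean Gaussian noise of std sigma.
    X_noisy = X + rng.normal(0.0, sigma, size=X.shape)
    accuracies[sigma] = accuracy_score(y, clf.predict(X_noisy))
print(accuracies)
```

Plotting `accuracies` against σ reproduces the shape of Figure 9: roughly flat up to the model's tolerance threshold, then a steady decline.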

6. Conclusions

The article describes the process of building a system to predict valve plate failures in a piston pump. To construct the system, three datasets were developed, describing three different failure states of the pump together with the normal operating state. For the failure states, three plates with varying degrees of degradation were used, and during the measurements various pressure, temperature, and flow parameters were recorded at different locations in the system. Using the recorded data, several machine learning models were evaluated along with hyperparameter tuning. The models were assessed on the dataset describing the moderate state of the valve plate (the Failure 1 dataset), for which the highest number of samples was recorded during the physical experiments. Among the five predictive models assessed, the neural network with (100, 10) hidden units organized in two layers achieved the best results. The final model was trained on the entire Failure 1 dataset and then evaluated on the datasets representing Failure 2 and Failure 3, allowing for an independent assessment of its quality. The results indicate reasonable accuracy: 80% for the Failure 2 dataset with the smallest damage, 88% for the Failure 1 dataset with moderate damage, and over 93% for the Failure 3 dataset with the largest damage; thus, the level of damage directly corresponds to the prediction accuracy. The feature importance analysis revealed that the most critical parameters are the flow in the leak line and the system output pressure, together with the pressure on the leak line and the temperature. This information allows reducing the number of sensors needed in a practical predictive maintenance implementation, significantly reducing costs. The results obtained can be summarized as follows:
  • three classification datasets were developed which represent three levels of valve plate damage,
  • the best performance out of the evaluated models was obtained by a neural network consisting of two hidden layers containing 100 and 10 neurons, respectively,
  • the system’s accuracy depends on the level of damage,
  • the most important attributes are the flow in the leak line, the system output pressure, the pressure on the leak line and the temperature,
  • the system performance starts to be sensitive to the input parameters when the level of noise in the input data is higher than 5 % .
Future research is planned to expand the collected data with vibrodiagnostic information, which can further improve the prediction quality of the system. The creation of a digital twin [30,31] of the pump is also considered, which would simplify the collection of data for other failure states.

Author Contributions

Conceptualization, M.R. and M.B.; methodology, M.R. and M.B.; software, M.R. and M.B.; validation, M.B.; investigation, M.R.; resources, M.R.; writing—original draft preparation, M.R.; writing—review and editing, M.B.; visualization, M.R. and M.B.; supervision, M.B.; project administration, M.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by the Ministry of Science and Higher Education grant number 11040SDW22030 and the Silesian University of Technology grant number BK-204RM42024.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset used in the experiments is available on Kaggle: https://www.kaggle.com/datasets/mbjunior/valve-plate-failure-prediction-in-hydraulic-pumps (accessed on 25 June 2024). The script used to run the experiments is available on GitHub: https://github.com/mblachnik/2024_Valve_plate_failures/ (accessed on 25 June 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
kNN: k-nearest neighbors
GBT: gradient-boosted trees
RF: random forest
MLP: multi-layer perceptron
ML: machine learning
CNN: convolutional neural network
ARIMA: auto-regressive integrated moving average
SVM: support vector machine
LSSVM: least squares support vector machine
PCA: principal component analysis
ANN: artificial neural network
BO: Bayesian optimization
DT: decision tree
EWT: empirical wavelet transform
MCPF: monotonicity-constrained particle filtering
MAAKR: modified auto-associative kernel regression
RBFNN: regularized radial basis function neural network
RUL: remaining useful life

References

1. Carvalho, T.P.; Soares, F.A.; Vita, R.; Francisco, R.d.P.; Basto, J.P.; Alcalá, S.G. A systematic literature review of machine learning methods applied to predictive maintenance. Comput. Ind. Eng. 2019, 137, 106024.
2. Jędrzykiewicz, Z.; Stojek, J.; Rosikowski, P. Napęd i Sterowanie Hydrostatyczne; Wydawnictwo Akademii Górniczo-Hutniczej: Krakow, Poland, 2017.
3. Blachnik, M.; Przyłucki, R.; Golak, S.; Ściegienka, P.; Wieczorek, T. On the development of a digital twin for underwater UXO detection using magnetometer-based data in application for the training set generation for machine learning models. Sensors 2023, 23, 6806.
4. Rivera, D.L.; Scholz, M.R.; Fritscher, M.; Krauss, M.; Schilling, K. Towards a predictive maintenance system of a hydraulic pump. IFAC-PapersOnLine 2018, 51, 447–452.
5. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
6. Natekin, A.; Knoll, A. Gradient boosting machines, a tutorial. Front. Neurorobot. 2013, 7, 21.
7. Rodriguez, J.J.; Kuncheva, L.I.; Alonso, C.J. Rotation forest: A new classifier ensemble method. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 1619–1630.
8. Kordos, M.; Blachnik, M.; Strzempa, D. Do we need whatever more than k-NN? In Proceedings of the Artificial Intelligence and Soft Computing: 10th International Conference, ICAISC 2010, Zakopane, Poland, 13–17 June 2010; Part I 10. Springer: Berlin/Heidelberg, Germany, 2010; pp. 414–421.
9. Bishop, C.M.; Bishop, H. Deep Learning: Foundations and Concepts; Springer Nature: Berlin/Heidelberg, Germany, 2023.
10. Tang, S.; Zhu, Y.; Yuan, S. A novel adaptive convolutional neural network for fault diagnosis of hydraulic piston pump with acoustic images. Adv. Eng. Inform. 2022, 52, 101554.
11. Li, Z.; Jiang, W.; Zhang, S.; Xue, D.; Zhang, S. Research on prediction method of hydraulic pump remaining useful life based on KPCA and JITL. Appl. Sci. 2021, 11, 9389.
12. Guo, R.; Li, Y.; Zhao, L.; Zhao, J.; Gao, D. Remaining useful life prediction based on the Bayesian regularized radial basis function neural network for an external gear pump. IEEE Access 2020, 8, 107498–107509.
13. Yu, H.; Li, H. Pump remaining useful life prediction based on multi-source fusion and monotonicity-constrained particle filtering. Mech. Syst. Signal Process. 2022, 170, 108851.
14. Bykov, A.; Voronov, V.; Voronova, L. Machine learning methods applying for hydraulic system states classification. In Proceedings of the 2019 Systems of Signals Generating and Processing in the Field of on Board Communications, Moscow, Russia, 20–21 March 2019; pp. 1–4.
15. Ding, Y.; Ma, L.; Wang, C.; Tao, L. An EWT-PCA and extreme learning machine based diagnosis approach for hydraulic pump. IFAC-PapersOnLine 2020, 53, 43–47.
16. Tang, S.; Zhu, Y.; Yuan, S. An improved convolutional neural network with an adaptable learning rate towards multi-signal fault diagnosis of hydraulic piston pump. Adv. Eng. Inform. 2021, 50, 101406.
17. Tang, S.; Khoo, B.C.; Zhu, Y.; Lim, K.M.; Yuan, S. A light deep adaptive framework toward fault diagnosis of a hydraulic piston pump. Appl. Acoust. 2024, 217, 109807.
18. Surucu, O.; Gadsden, S.A.; Yawney, J. Condition monitoring using machine learning: A review of theory, applications, and recent advances. Expert Syst. Appl. 2023, 221, 119738.
19. Sharma, A.K.; Punj, P.; Kumar, N.; Das, A.K.; Kumar, A. Lifetime prediction of a hydraulic pump using ARIMA model. Arab. J. Sci. Eng. 2024, 49, 1713–1725.
20. Buabeng, A.; Simons, A.; Frempong, N.K.; Ziggah, Y.Y. Hybrid intelligent predictive maintenance model for multiclass fault classification. In Soft Computing; Springer: Berlin/Heidelberg, Germany, 2023; pp. 1–22.
21. Tang, S.; Zhu, Y.; Yuan, S. Intelligent fault diagnosis of hydraulic piston pump based on deep learning and Bayesian optimization. ISA Trans. 2022, 129, 555–563.
22. Shao, Y.; Chao, Q.; Xia, P.; Liu, C. Fault severity recognition in axial piston pumps using attention-based adversarial discriminative domain adaptation neural network. Phys. Scr. 2024, 99, 056009.
23. Konieczny, J.; Łatas, W.; Stojek, J. Application of analysis of variance to determine important features of signals for diagnostic classifiers of displacement pumps. Sci. Rep. 2024, 14, 6098.
24. ISO 3448:1992; Industrial liquid lubricants—ISO viscosity classification. ISO—International Organization for Standardization: Geneva, Switzerland, 1992.
25. Blachnik, M.; Duch, W. LVQ algorithm with instance weighting for generation of prototype-based rules. Neural Netw. 2011, 24, 824–830.
26. Kolodner, J. Case-Based Reasoning; Morgan Kaufmann: Cambridge, MA, USA, 2014.
27. Blachnik, M.; Kordos, M. Comparison of instance selection and construction methods with various classifiers. Appl. Sci. 2020, 10, 3933.
28. Nicodemus, K.K.; Malley, J.D.; Strobl, C.; Ziegler, A. The behaviour of random forest permutation-based variable importance measures under predictor correlation. BMC Bioinform. 2010, 11, 110.
29. Kaneko, H. Cross-validated permutation feature importance considering correlation between features. Anal. Sci. Adv. 2022, 3, 278–287.
30. Minghui, H.; Ya, H.; Xinzhi, L.; Ziyuan, L.; Jiang, Z.; Bo, M. Digital twin model of gas turbine and its application in warning of performance fault. Chin. J. Aeronaut. 2023, 36, 449–470.
31. Tao, F.; Qi, Q. Make more digital twins. Nature 2019, 573, 490–491.
Figure 1. Pump cross section. Source: https://www.ponar-wadowice.pl/!uploads/attachments_prod/hg_metaris_ma10vo_technicalcatalog_web.pdf (accessed on 25 June 2024).
Figure 2. Hydraulic diagram.
Figure 3. Test unit for data acquisition; (a) HPU model with tested pump; (b) Hydraulic motor and electric motor-adjustable load for the tested pump.
Figure 4. Examples of applied load torque—stepped and sine wave. Steps duration ca. 4–5 min, sine wave period T = 300 s.
Figure 5. Damaged valve plates.
Figure 6. Model evaluation scheme.
Figure 7. Confusion matrix obtained for the Failure 2 and Failure 3 datasets. Class 0 indicates normal operation conditions and class 1 indicates failure state conditions; (a) Failure 2; (b) Failure 3.
Figure 8. Feature importance plot obtained for Failure 2 and Failure 3 datasets; (a) Failure 2; (b) Failure 3.
Figure 9. The influence of additive noise on prediction model accuracy for different levels of noise standard deviation.
Table 1. HPU components.
Pos. | Description | Symbol | Manufacturer
1 | Temperature sensor Pt1000 150 °C | TA2105 | IFM, Germany
2 | Pressure sensor −1…1 bar | PA3509 | IFM, Germany
3 | Electric motor 37 kW | FCMP 225S-4/PHE | AC-Motoren, Germany
4 | Temperature sensor Pt1000 150 °C | TA2105 | IFM, Germany
5 | Pressure sensor 400 bar | PT5400 | IFM, Germany
6 | Turbine flow meter | PPC-04/12-SFM-015 | Stauff, Germany
7 | Check valve | S8A1.0 | Ponar, Poland
8 | Temperature sensor Pt1000 150 °C | TA2105 | IFM, Germany
9 | Pressure sensor 10 bar | PT5404 | IFM, Germany
10 | Piston pump | HSP10VO45DFR | Hydraut, Italy
11 | Gear wheel flow meter | DZR-10155 | Kobold, Germany
12 | Temperature sensor Pt1000 150 °C | TA2105 | IFM, Germany
13 | Pressure sensor 10 bar | PT5404 | IFM, Germany
14 | Hydraulic motor | F12 060 MF | Parker, USA
15 | Torque meter | T22/1KNM | HBM, Germany
16 | Electric motor 170 kW | LSRPM250ME1 | Emerson, USA
17 | Filter | FS1 | Ponar, Poland
18 | Pressure sensor 250 bar | PT5401 | IFM, Germany
19 | Temperature sensor Pt1000 150 °C | TA2105 | IFM, Germany
Table 2. Raw data records.
Class Name | Number of Records
1 | No-fail | 2.7+ mln
2 | Failure 1 | 2.3+ mln
3 | Failure 2 | 770+ k
4 | Failure 3 | 900+ k
Table 3. Process parameters.
Column Name | Symbol
1 | Pressure—leak line | P_leak
2 | Temperature—leak line | T_leak
3 | Pressure—output | P_out
4 | Temperature—suction line | T_suct.
5 | Temperature—output | T_out
6 | Flow—leak line | F_leak
7 | Flow—output | F_out
8 | Temp. diff | T_diff
Table 4. Preprocessed data records.
Class Name | Number of Records
1 | No-fail | 68,345
2 | Failure 1 | 47,207
3 | Failure 2 | 14,333
4 | Failure 3 | 16,472
Table 5. Dataset statistics.
Dataset | # Class 0 | # Class 1 | # Samples | # Class 0 / # Samples | # Class 1 / # Samples
Failure 1 | 53,345 | 47,207 | 100,552 | 0.5305 | 0.4695
Failure 2 | 15,000 | 14,333 | 29,333 | 0.5114 | 0.4886
Failure 3 | 15,000 | 16,472 | 31,472 | 0.4766 | 0.5234
Table 6. Parameter settings for the models evaluation procedure.
Model | Param | Values
RandomForest | # trees | [50, 100, 200, 300]
RandomForest | max depth | [5, 7, 9, 12]
RandomForest | max features | [0.3, 0.5, 0.6]
RotationForest | # trees | [50, 100, 200, 300]
RotationForest | max depth | [5, 7, 9, 12]
RotationForest | # features subset | [2, 3, 4]
GradientBoostedTrees | # trees | [50, 100, 200, 300]
GradientBoostedTrees | max depth | [5, 7, 9, 12]
GradientBoostedTrees | learning rate | [0.05, 0.1, 0.2]
kNN | k | [1, 3, 5, 7, 9, 11, 15, 21, 29, 51, 71, 101, 201]
kNN | voting | democratic/weighted
MLP | architecture | [(4,), (10,), (30,), (50,), (100,), (100, 10)]
MLP | # iter. | [100, 500, 1000]
MLP | learning rate | [0.01, 0.001, 0.0001]
Table 7. Comparison of the models’ performance using grid search procedure on failure 1 dataset.
Lag | RandomForest | RotationForest | GradientBoostedTrees | kNN | MLP
0 | 0.8442 | 0.8335 | 0.8455 | 0.8739 | 0.8868
1 | 0.8223 | 0.8224 | 0.8440 | 0.8742 | 0.8803
2 | 0.8180 | 0.8295 | 0.8429 | 0.8737 | 0.8865
Table 8. Model performance obtained on the test datasets: the Failure 2 and Failure 3 datasets.
Model | Failure 2 | Failure 3
MLP | 0.8016 | 0.9313
Table 9. Correlation between attributes in Failure 1 dataset.
 | P_leak | T_leak | P_out | T_suct. | T_out | F_leak | F_out | T_diff
P_leak | 1.00 | −0.35 | 0.57 | −0.52 | −0.44 | 0.36 | 0.03 | 0.46
T_leak | −0.35 | 1.00 | 0.33 | 0.88 | 0.91 | 0.43 | 0.37 | −0.07
P_out | 0.57 | 0.33 | 1.00 | 0.05 | 0.17 | 0.79 | 0.26 | 0.47
T_suct. | −0.52 | 0.88 | 0.05 | 1.00 | 0.99 | 0.23 | 0.27 | −0.54
T_out | −0.44 | 0.91 | 0.17 | 0.99 | 1.00 | 0.32 | 0.30 | −0.47
F_leak | 0.36 | 0.43 | 0.79 | 0.23 | 0.32 | 1.00 | 0.12 | 0.29
F_out | 0.03 | 0.37 | 0.26 | 0.27 | 0.30 | 0.12 | 1.00 | 0.10
T_diff | 0.46 | −0.07 | 0.47 | −0.54 | −0.47 | 0.29 | 0.10 | 1.00
Table 10. Sensors accuracy. FS = full scale.
Pos. | Description | Symbol | Accuracy
1 | Temperature sensor Pt1000 −50…150 °C | TA2105 | 0.1% FS ± 3 K
2 | Pressure sensor −1…1 bar | PA3509 | 0.8% FS
3 | Pressure sensor 400 bar | PT5400 | 1.05% FS
4 | Turbine flow meter 1…15 L/min | PPC-04/12-SFM-015 | 1% FS
5 | Pressure sensor 10 bar | PT5404 | 1.05% FS
6 | Gear wheel flow meter 1…250 L/min | DZR-1015 S25 | 0.3% FS
7 | Torque meter 1 kNm | T22/1KNM | 0.5% FS
8 | Pressure sensor 250 bar | PT5401 | 1.05% FS
