1. Introduction
Industry is currently moving toward a paradigm in which the collection and processing of data play a significant role in improving technology. An important application of computational intelligence methods in industry is condition monitoring. These methods analyze sensor data, which reflect the physical condition of the machine, and predict the state of a subsystem [1] or detect failure conditions at an early stage. Integrating computational intelligence applications with machines poses several technical challenges: first, the need to develop machine learning models for effective and secure detection from process and sensor data, and second, their integration with the device for real-time detection [2]. Machine learning is a data-driven approach, so data quality is paramount. From a machine learning point of view, the design of the condition monitoring system should be studied carefully in order to decide on the relevant sensor information and on a proper categorization of the fault conditions for which the machine learning model will be trained. To build a reliable condition monitoring system, it is important to collect information from representative states of the working condition of the machine. This is not always possible, because machine manufacturers are not always willing to run their machines until failure; as a result, data for normal working conditions are much more abundant than data for failure conditions and degradation states [3].
The present research focuses on the detection of blockages in hydraulic circuits. Blockages have an important impact on the health of a hydraulic system mainly because of their undesired effects on pumps. A look at the main reasons for replacing pump rotors shows cavitation to be one of the most damaging phenomena produced inside pumps. Moreover, it can also lead to other defects such as bearing and rotor faults, which are related, among others, to undesired vibrations and to thermal and dynamic stress [4]. The relation between flow blockages and cavitation lies in the pressure drop produced when the cross-section of a pipe is reduced [5]. Cavitation occurs when the pressure and temperature of a fluid reach the boiling point for a short period of time. There are therefore three ways of producing cavitation: raising the temperature of the fluid, decreasing its pressure, or combining both. Physically, when the pressure of the fluid is low enough, it evaporates locally, producing a vapor bubble that travels through the pump until it reaches a higher-pressure point. There, the vapor condenses back to liquid and the bubble implodes, generating a shockwave that wears down the nearest surfaces. Long exposure to cavitation can produce severe damage in the inner parts of the pump. Achieving reliable blockage detection involves two challenges: (1) the installation of sensors that reveal the state of the circuit, and (2) the ingestion and analysis of the data in real time and as close as possible to the installation. This last concept, processing the data on-site, is often referred to as edge computing [6].
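The link between a reduced pipe section and the pressure drop mentioned above can be illustrated with an idealized worked relation. This is only a sketch under assumptions that are not stated in the article: steady, incompressible, frictionless flow in a horizontal pipe. The symbols are introduced here for illustration: $A_1, v_1, p_1$ denote area, velocity and pressure in the unobstructed section, $A_2, v_2, p_2$ the same quantities in the restricted section, $\rho$ the fluid density and $p_v(T)$ the vapor pressure of the fluid at temperature $T$.

```latex
\begin{align}
A_1 v_1 &= A_2 v_2
  && \text{(continuity)} \\
p_1 + \tfrac{1}{2}\rho v_1^2 &= p_2 + \tfrac{1}{2}\rho v_2^2
  && \text{(Bernoulli, horizontal pipe)} \\
p_1 - p_2 &= \tfrac{1}{2}\rho v_1^2 \left[ \left( \frac{A_1}{A_2} \right)^2 - 1 \right]
  && \text{(pressure drop at the restriction)} \\
p_2 &< p_v(T)
  && \text{(local condition for vapor bubbles to form)}
\end{align}
```

Under these assumptions the drop grows with the square of the area ratio, so a strongly closed valve lowers the local pressure far more than a slightly closed one. Real valves add irreversible losses, so measured pressure drops will differ from this idealized estimate.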
In this study, a closed-loop hydraulic system was built using different types of process sensors to capture information about the different working states. These states correspond to blockages artificially induced in the inlet and outlet pipes of the pump. The collected process data were analyzed to find substantial differences between non-blockage and blockage states. The objective is to design a condition monitoring system for the pump; therefore, an exploratory data analysis of the sensor data is carried out to assess data quality.
This document is limited to explaining the test bench installation, the sensor data integration approach, the signal transformations and the exploratory data analysis. As a result, a sensor data set for the described experiments is released. The machine learning model for condition monitoring will be tackled in ongoing research. The remainder of the article is structured as follows. Section 2 describes the hydraulic circuit, the data ingestion system for the sensors and the conducted experiments. Section 3 presents the exploratory data analysis of the fault categories, followed by the conclusions of the research.
3. Exploratory Data Analysis
This section presents the results of an exploratory data analysis of the sensor data to gain further insight into the data quality and the separability of the different blockage states of the pump. Exploratory data analysis is widely applied as a first investigation of data to reveal patterns and derive useful knowledge about the data under study. Statistical validations and graphical visualizations are common tools for data exploration [7]. The aim of the conducted experiments is the construction of a condition monitoring system for the pump and the recognition of different blockage or failure states. Therefore, it is important to evaluate at an early stage whether the information captured from the sensors establishes representative categories of the machine’s state.
For this analysis, we visualize the sensor data in a lower-dimensional space. The original data set comprises 12 attributes: three transformations (mean, RMS and standard deviation) for each of the 4 sensor signals. In the following, the results of dimensionality reduction with Principal Component Analysis (PCA) on the data set are shown. This mathematical method generates new reference axes (principal components) in the directions of greatest variance [8]. The objective of PCA, then, is to reduce the number of dimensions with the smallest possible loss of information.
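The feature construction and projection described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `window_features` and `pca_project`, the window length, the standardization step and the synthetic random data are all assumptions made for the example.

```python
import numpy as np


def window_features(window):
    """Mean, RMS and standard deviation of each sensor channel.

    window: array of shape (n_samples, n_sensors).
    Returns a vector of length 3 * n_sensors (12 for 4 sensors).
    """
    window = np.asarray(window, dtype=float)
    mean = window.mean(axis=0)
    rms = np.sqrt((window ** 2).mean(axis=0))
    std = window.std(axis=0)  # population standard deviation (ddof=0)
    return np.concatenate([mean, rms, std])


def pca_project(X, n_components=3):
    """Project the rows of X onto the first principal components.

    Returns the projected scores and the explained variance ratio
    of the retained components.
    """
    X = np.asarray(X, dtype=float)
    # Center and scale each attribute (assumption: sensors have different
    # units, so standardizing keeps one attribute from dominating).
    Xc = X - X.mean(axis=0)
    scale = Xc.std(axis=0)
    scale[scale == 0] = 1.0  # avoid division by zero for constant columns
    Xs = Xc / scale
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, S, Vt = np.linalg.svd(Xs, full_matrices=False)
    scores = Xs @ Vt[:n_components].T
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]


# Synthetic stand-in for the recordings: 200 windows, 500 samples, 4 sensors.
rng = np.random.default_rng(0)
windows = rng.normal(size=(200, 500, 4))
X = np.array([window_features(w) for w in windows])  # shape (200, 12)
scores, explained = pca_project(X, n_components=3)   # shape (200, 3)
```

The `explained` vector indicates how much of the total variance the three plotted axes retain, which is one way to judge how much information is lost by the reduction.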
Figure 4 shows the visualization of the sensor data set after PCA transformation in three dimensions. The PCA visualization of all 16 conditions does not show a clear separation of states. Consequently, we compare data separability between selected states of the system, i.e., the normal state and different levels of blockage. We establish the categories of soft, medium and hard blockage, corresponding to blockage levels of 20%, 50% and 80% in the valves, respectively. We also consider the case of a progressive degree of blockage in both valves in the diagonal category. The category labeled as extreme considers blockage at 80% in either the inlet or the outlet valve (IV3 or OV3) or in both of them (IO33). Table 3 gives an overview of the blockage states involved in each category.
Figure 4 shows the PCA visualization of each of the blockage categories. The states of the soft blockage category show some overlap and are less clearly separated from the normal state than the medium or hard blockage states. With increasing level of blockage, the groups appear more clearly separated in the PCA plot. Although the medium blockage category already reveals differences, the separation is clear for the hard blockage and extreme blockage states, both of which involve blockage levels of 80%. Interestingly, the progressive blockage in both valves (represented in the diagonal group) reveals overlap between states. We interpret this to mean that states of simultaneous blockage in both valves may be more difficult to detect due to underlying similarities in the sensor data.
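The visual impression of separation between two states could be complemented by a simple numeric score. The sketch below is not part of the analysis reported here: the function `separation_score`, the choice of score (distance between group centroids divided by the mean within-group spread) and the synthetic point clouds are assumptions made purely for illustration.

```python
import numpy as np


def separation_score(a, b):
    """Centroid distance of two groups relative to their within-group spread.

    a, b: arrays of shape (n_points, n_dims), e.g. PCA scores of two states.
    Larger values mean the two groups overlap less.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    gap = np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))
    # Root-mean-square distance of the points to their own centroid.
    spread_a = np.sqrt(((a - a.mean(axis=0)) ** 2).sum(axis=1).mean())
    spread_b = np.sqrt(((b - b.mean(axis=0)) ** 2).sum(axis=1).mean())
    return gap / (0.5 * (spread_a + spread_b))


# Synthetic stand-ins for three states in a 3-D PCA space (not real data).
rng = np.random.default_rng(0)
normal = rng.normal(size=(100, 3))
soft = rng.normal(size=(100, 3)) + np.array([0.5, 0.0, 0.0])  # small shift
hard = rng.normal(size=(100, 3)) + np.array([5.0, 0.0, 0.0])  # large shift

near = separation_score(normal, soft)
far = separation_score(normal, hard)
```

In this synthetic setup `far` is larger than `near`, mirroring the qualitative observation that higher blockage levels separate more clearly from the normal state. Whether such a score behaves similarly on the released data set would have to be checked.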