1. Introduction
In industry, maintenance is of crucial importance, as it directly impacts the cost, reliability, availability, quality, and performance of a company. Unwanted or unplanned equipment downtime degrades and suspends the core business, resulting in immeasurable losses and significant penalties. It was reported that Amazon suffered a setback of
$4 million in sales for a downtime of 49 min in 2013 [
1]. According to a survey by the Ponemon Institute [
2], on average an organization loses
$138,000 per hour because of downtime. Also, it is reported that the operations and maintenance costs of offshore wind turbines amount to 20% to 35% of their total generated revenue [
3], and for the oil and gas industry the maintenance expenditure can range from 15% to 70% of the total production cost [
4]. It is therefore crucial for industries to look into efficient and competent methods to prevent unexpected downtime, hence reducing operation and maintenance costs and improving reliability.
With the evolution of technology like the Internet of Things (IoT) [
5], we are able to connect manufacturing devices to networks to send and exchange data. The first step for industry is to connect their machines and gather data from them. The data generated are collected and stored in the cloud and can later be utilized for analysis and visualization. Predictive maintenance has been discussed and researched for some time [
6], but in recent years it has become more widely accessible because of the seamless advancements in modern technology [
7]. It generally involves monitoring the operating conditions of the machines and gathering the information that could be used to maximize the interval time between repairs and to minimize unscheduled disruption due to machine failures [
8]. Modern technology has the potential to identify and detect developing faults in machine components, predict a fault’s progression, and finally to provide a strategy for maintenance schedules.
Predictive maintenance (PdM) is an important part of smart manufacturing and Industry 4.0. According to a recent report, the market for PdM will be worth
$23 billion by 2026 [
9]. The purpose of PdM is to identify uncommon machine behavior and to warn of probable system damage before it occurs. However, it remains one of the key challenges faced by industry. Additionally, with the advent of various technologies, many new concepts have been introduced for PdM, such as using IoT to gather increasing amounts of data from machines equipped with sensors [10], advanced techniques for data pre-processing (e.g., feature extraction, normalization, data cleaning, and preparation), and machine learning and deep learning-based models for condition monitoring and failure detection. All of these advances have improved overall system reliability, reducing machine downtime and costs. Currently, most modern machines deployed in manufacturing production lines are equipped with advanced sensors and enhanced communication capabilities. These machines can monitor their own status by continuously observing important features such as voltage, pressure, vibration, rotation, temperature, and motor speed. These features are used to train deep learning algorithms capable of detecting and predicting faults before they occur. This allows us to augment or even replace a regular maintenance schedule with predictive maintenance, which is more reliable, reduces unnecessary maintenance and its associated costs, and guarantees the proper and timely functioning of machines.
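The data pre-processing step mentioned above typically turns raw multivariate sensor streams (voltage, pressure, vibration, temperature, etc.) into fixed-length windows suitable for model training. The following is a minimal, illustrative NumPy sketch of such windowing; the function name, window length, and prediction horizon are assumptions for illustration, not details taken from the actual pipeline.

```python
import numpy as np

def make_windows(data, window, horizon):
    """Slice a multivariate sensor stream into overlapping windows.

    data    : array of shape (timesteps, features), e.g. per-timestep
              voltage, pressure, vibration, temperature readings
    window  : number of past timesteps fed to the model
    horizon : how many steps ahead the prediction target lies
    Returns (X, idx): X has shape (samples, window, features) and
    idx[i] is the timestep that window i's prediction refers to.
    """
    X, idx = [], []
    for start in range(len(data) - window - horizon + 1):
        X.append(data[start:start + window])
        idx.append(start + window + horizon - 1)
    return np.asarray(X), np.asarray(idx)

# toy stream: 100 timesteps of 4 sensor channels
stream = np.random.default_rng(0).normal(size=(100, 4))
X, idx = make_windows(stream, window=20, horizon=5)
print(X.shape)   # (76, 20, 4)
```

Each window can then be paired with a failure label at timestep `idx[i]`, so the model learns to predict faults `horizon` steps in advance.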
Deep learning models have provided effective solutions in fault diagnosis due to their powerful feature-learning abilities. They build representations using numerous layers and learn non-linear representations of time series at increasing levels of complexity and abstraction. Many papers covering machine fault diagnosis have been published over the past few years. A detailed overview of a deep learning-based monitoring system for machine health was presented in [
11]. A simple auto-encoder, CNN and RNN-based machine health monitoring system was proposed in [
12]. A prognostic and health management architecture based on deep learning was proposed in [
13]. An auto-encoder-based CNN for induction motor diagnosis was also proposed in [
14]. A gated CNN to estimate the remaining life of a machine was proposed in [
15]. Wu et al. [
16] used LSTM to predict the remaining life of turbofan engines.
Deep learning has been studied to predict an unknown time series using historical data [
17]. A variety of modern machine learning algorithms ranging from multi-layer perceptrons (MLP) and long short-term memory (LSTM) to the combinations of RNNs and dynamic Boltzmann machines [
17] have been used in time-series forecasting. All of these algorithms have their own strengths and weaknesses, depending on the type of data and the task at hand. However, due to the complex nature of predictive maintenance use cases, none of the above-noted algorithms has provided excellent results. Moreover, hybrid methods [
18] are a new development trend in the field of time-series forecasting.
In this paper, we propose a hybrid deep learning framework based on CNN-LSTM for PdM. The proposed method combines the advantages of convolutional neural networks (CNNs), which extract effective features and patterns among multivariate input variables, and long short-term memory (LSTM), which captures complex long-term dependencies and automatically selects the model best suited for the relevant time-series data. The hybrid CNN-LSTM model is designed to make optimization easier by extracting effective features using CNNs and then incorporating LSTM in parallel to predict machine failures. The main contributions of our work are summarised as follows:
We propose a hybrid deep learning model (CNN-LSTM) for PdM. The model uses CNN to extract features from the time series and LSTM for prediction. It uses a time sequence of different errors and analyses the correlation between different input variables for better prediction.
We introduce a novel temporal skip connection component for the LSTM that enables it to capture long-range dependencies and makes optimization easier and more efficient.
Comparing the evaluation metrics of CNN, LSTM, and the hybrid CNN-LSTM, we found that our hybrid CNN-LSTM achieves the highest prediction accuracy and is more reliable and suitable for PdM forecasting.
The IoT sensors and CCTVs are responsible for collecting data from the different equipment available on a production floor. Efficient use of these data with AI encourages innovation, better productivity, and technology dissemination.
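The hybrid pipeline described above can be sketched as a single forward pass: a 1-D convolution extracts features from a multivariate sensor window, and an LSTM aggregates them into a failure probability. The following NumPy sketch is purely illustrative; the layer sizes (8 filters of width 5, a 16-unit LSTM, one sigmoid output) are assumptions, not the tuned architecture, and the temporal skip connection component is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

def conv1d(x, kernels):
    """Valid 1-D convolution over time with ReLU.
    x: (T, C_in); kernels: (C_out, k, C_in) -> (T-k+1, C_out)."""
    c_out, k, _ = kernels.shape
    out = np.empty((x.shape[0] - k + 1, c_out))
    for t in range(out.shape[0]):
        patch = x[t:t + k]  # (k, C_in)
        out[t] = np.tensordot(kernels, patch, axes=([1, 2], [0, 1]))
    return np.maximum(out, 0.0)

def lstm(seq, Wx, Wh, b):
    """Single-layer LSTM; returns the last hidden state.
    seq: (T, D); Wx: (D, 4H); Wh: (H, 4H); b: (4H,)."""
    H = Wh.shape[0]
    h, c = np.zeros(H), np.zeros(H)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for x in seq:
        z = x @ Wx + h @ Wh + b
        i, f, o = sig(z[:H]), sig(z[H:2*H]), sig(z[2*H:3*H])
        g = np.tanh(z[3*H:])
        c = f * c + i * g          # cell state update
        h = o * np.tanh(c)         # hidden state
    return h

# toy forward pass: 30 timesteps x 4 sensors -> 8 conv features -> 16-dim LSTM
window = rng.normal(size=(30, 4))
kernels = rng.normal(size=(8, 5, 4)) * 0.1
feats = conv1d(window, kernels)                 # (26, 8)
Wx = rng.normal(size=(8, 64)) * 0.1
Wh = rng.normal(size=(16, 64)) * 0.1
h_last = lstm(feats, Wx, Wh, np.zeros(64))
logit = h_last @ (rng.normal(size=16) * 0.1)    # dense output head
p_failure = 1.0 / (1.0 + np.exp(-logit))        # failure probability
```

In practice the weights would be learned end-to-end by backpropagation; this sketch only shows how the convolutional feature extractor and the recurrent aggregator compose.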
Structure of the Paper: The rest of the paper is organized as follows.
Section 2 provides related work of the state-of-the-art data-driven approaches.
Section 3 describes the Industry 4.0 use case.
Section 4 and Section 5 provide background on existing methodologies, including CNN and LSTM, used in the design of the model and present our framework by explaining its structural design, respectively.
Section 6 explains the results and the optimization strategy adopted in this study.
Section 7 explains the lesson learned throughout the course of this study.
Section 8 discusses the limitations of the proposed framework, and, finally,
Section 9 concludes the paper.
2. Related Work
Data-driven approaches have attracted many research efforts in the area of smart manufacturing. Machine learning techniques for manufacturing applications, along with their weaknesses and strengths, are explained in [
19,
20]. Successful machine learning algorithms such as Bayesian Networks, artificial neural networks, and other ensemble methods are explained in [
21]. Traditional machine learning algorithms such as logistic regression, artificial neural networks, or support vector machines yield a modest performance because of their shallow structural design and hand-crafted feature engineering [
22]. Deep learning has exhibited impressive performance in fields such as image classification, natural language processing, semantic segmentation, and object detection/recognition [
23]. Deep learning is best known for automatically extracting highly nonlinear and complex abstract features by means of multiple layers stacked on top of each other [
24]. Because of its automated feature extraction and ability to learn different levels of data abstractions, deep learning can identify hidden patterns and trends in data and predict them through a well-defined optimization pipeline. The authors of [
25] introduced long short-term memory (LSTM), a special kind of RNN, capable of dealing with complex data and learning long-term dependencies. Over time, these networks were further optimized in other work [
26,
27]. LSTMs have demonstrated their success in time-series forecasting [
28]. Recently in [
29], a CNN combined with K-means, proposed for time-series load forecasting, achieved impressive results. The main reason for this is the capability of a CNN to extract features from input data at different levels. Because a CNN has the ability to learn a vast range of non-linear problems, as explained in [
22], it is well suited for time-series production data. In contrast, LSTMs are effective in modelling time-series data because of their ability to remember the states of input data in their memory [
A deep learning method combining LSTM with a support vector machine (SVM), proposed in [30], is used to distinguish aberrant data from normal vibration signals gathered during a reduction gearbox endurance test and from helicopter test-flight data acquired by many sensors located throughout the aircraft. A method based on JDA and a deep belief network (DBN) for bearing fault diagnosis was proposed in [
31]. In JACADN, the JDA performs feature transfer between source-domain and target-domain samples, i.e., a kernel function maps the source-domain and target-domain samples into the same feature space. Anomaly detection using a generative adversarial network (GAN) for industrial sensor data was proposed in [
32]. That paper focused on the reconstruction of sound signals and the detection of anomalies. The basic notion underlying these techniques is that normal conditions can be accurately reconstructed through a smaller latent space within the neural network, whereas abnormal conditions cannot be reproduced, incurring larger reconstruction losses. This method is well suited to anomaly detection, as the volume of anomalous-condition data is typically much smaller than that of normal-condition data, and a detection model may be trained solely on normal-condition data.
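The reconstruction-error idea can be demonstrated with a minimal NumPy sketch, using a linear autoencoder (fitted via SVD) as a stand-in for a trained network; the data, latent dimension, and threshold rule are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# "Normal" sensor vectors live near a low-dimensional subspace.
latent = rng.normal(size=(500, 2))
mix = rng.normal(size=(2, 8))
normal = latent @ mix + 0.05 * rng.normal(size=(500, 8))

# Fit a linear autoencoder on normal data only: encode/decode with the
# top-2 principal directions (SVD stands in for a trained network here).
mean = normal.mean(axis=0)
_, _, Vt = np.linalg.svd(normal - mean, full_matrices=False)
V = Vt[:2].T                                     # (8, 2) encoder weights

def recon_error(x):
    z = (x - mean) @ V                           # encode to latent space
    return np.linalg.norm((x - mean) - z @ V.T)  # decode and compare

# Threshold from normal data only; off-subspace readings exceed it.
threshold = max(recon_error(x) for x in normal)
anomaly = rng.normal(size=8) * 3.0               # simulated abnormal reading
is_anomaly = recon_error(anomaly) > threshold
```

The key property is exactly the one stated above: the model is fitted on normal data only, so abnormal readings fail to reconstruct and exceed the threshold.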
The use of deep learning techniques to detect anomalies has improved upon the results of older methods. Deep learning builds on artificial neural network models; deep networks and LSTMs can learn hierarchical models over input data that describe probability distributions. Thanks to recent advancements in both hardware and neural network models, particularly over the past decade, artificial intelligence has become a vibrant subject with many practical applications such as machine prognosis, fault detection, and predictive maintenance.
3. Industry 4.0 Use Case
Consider a typical use case of a smart factory having its production lines equipped with modern sensing technologies. All data produced by the sensors installed on production lines are acquired and accessible in real time. As depicted in
Figure 1, a variety of modern AI techniques can be applied to the real-time data collected from sensors to get better insights into factory operations in real time.
In a given smart manufacturing production line, the objective is to have an active response to demand and supply. Deep learning technologies can help with maximizing the efficiency of real-time processes when analyzing large amounts of data. They can be used to quickly analyse the relevant data and identify failures that can result in reduced production over a subsequent period of time. They provide information about the status of the equipment involved, thus supporting the human workforce maintaining and monitoring this equipment [
33]. They also provide oversight for machine conditions and hence guarantee more reliability and process regularity, which eventually diminishes the necessity for regular and constant machine checkups and also reduces machine downtime [
34,
35].
In a typical smart production line, data from a large number of devices such as sensors and CCTVs (as shown in
Figure 1) are collected and then processed into a more human-understandable form. The collected data are stored in scalable data storage systems. Big data techniques are used to pre-process and analyze the data, and the analysis can be presented through rich visualizations. Once the data are prepared, deep learning algorithms such as LSTMs are used to identify the correlations between different features and take the appropriate steps based on the data. LSTMs are applied to the sensor or time-series data to find anomalies within a production line and to predict critical errors in advance, avoiding a shutdown, fatal accident, or other unwanted event. The basic difference between LSTMs and other existing models is that LSTMs are proactive rather than merely reactive. LSTM-based predictive maintenance can help many smart production lines save millions of dollars in the support and maintenance of equipment. CNNs perform automatic feature extraction and can learn a high-order representation of data (e.g., time-series or image-based data) [
13]. However, CNNs alone, when applied to time-series data, are highly reliant on data pre-processing and the tuning of a large pool of hyperparameters. In contrast, CNNs combined with LSTMs identify the most important features that contribute to the output. The combined model does not require heavy feature engineering, which makes it computationally more reliable and efficient than standalone CNNs.
7. Lessons Learned
During the course of this study we came across the following difficulties:
Data Interoperability: Currently, data are not shared across machines in a production line. Industry 4.0 is now working on system integration to make data more interoperable so that better decisions can be made to optimize the production process. Where the knowledge embedded in the data is made available across systems, we believe data integration can improve machine learning results.
Match and Validate Data: Devising rules to match data received from multiple sources can take a long time, and this becomes progressively more difficult as the number of sources grows. Machine learning models can instead be trained to learn the rules and match fresh data automatically. There is no limit to the amount of data that may be used, and more data are actually beneficial for fine-tuning the model.
Modelling: Combining multiple ML models can yield superior predictions compared to using a single model. However, devising the hybrid objective function for optimization is difficult given the complex nature of time-series data, e.g., missing entries and dynamic periodicity. For example, classification and anomaly detection techniques can be coupled to keep the precision of the classification model while retaining the benefits of anomaly detection. In this fashion, PdM can be applied to equipment or systems that do not have a huge dataset.