A Systematic Mapping Study on Machine Learning Techniques Applied for Condition Monitoring and Predictive Maintenance in the Manufacturing Sector

Phan, Thuy Linh Jenny; Gehrhardt, Ingolf; Heik, David; Bahrpeyma, Fouad; Reichelt, Dirk

doi:10.3390/logistics6020035


by Thuy Linh Jenny Phan *, Ingolf Gehrhardt, David Heik *, Fouad Bahrpeyma and Dirk Reichelt

Faculty of Informatics/Mathematics, Dresden University of Applied Sciences, 01069 Dresden, Germany

* Authors to whom correspondence should be addressed.

Logistics 2022, 6(2), 35; https://doi.org/10.3390/logistics6020035

Submission received: 15 April 2022 / Revised: 13 May 2022 / Accepted: 17 May 2022 / Published: 28 May 2022


Abstract:

Background: Today’s production facilities must be efficient in both manufacturing and maintenance. Efficiency enables the company to maintain the required output while reducing production effort or costs. With the increasing interest in process automation and the Internet of things since Industry 4.0 was introduced, such shop floors are growing in complexity. Every component of the production needs to be continuously monitored, which is the basis for predictive maintenance (PdM). To predict when maintenance is needed, the components’ conditions are monitored with the help of a condition monitoring (CM) system. However, this task is difficult for human employees, as the monitoring and analysis is very demanding. To overcome this, machine learning (ML) can be applied to ensure more efficient production. Methods: This paper aims to investigate the application of ML techniques for CM and PdM in the manufacturing sector. For this reason, a systematic mapping study (SMS) is conducted in order to structure and classify the current state of research and identify potential gaps for future investigation. Relevant literature was considered between January 2011 and May 2021. Results: Based on the guidelines for SMSs and previously defined research questions, existing publications are examined and a systematic overview of the current state of the research domain is provided. Conclusions: Techniques such as reinforcement learning and transfer learning are underrepresented, but increasingly attracting more attention. The findings of this study suggest that the most promising results belong to the applications of hybrid ML methods, where a set of methods are combined to build a more powerful model.

Keywords:

machine learning; condition monitoring; predictive maintenance; systematic mapping study; manufacturing; production

1. Introduction

With the fourth industrial revolution (referred to as Industry 4.0 and introduced by [1] in 2011), manufacturing processes and a plant’s components are designed to be more intelligent. Manufacturing is defined as the process of converting raw materials into finished products by using manual or mechanical transforming methods [2]. Machines and tools are equipped with sensors and communication devices in order to enable flexible manufacturing, resulting in harmonized and more transparent processing steps. Failures and anomalies within the production lead to higher operating costs and inefficient manufacturing due to faulty products, idle machines or systems, or inaccurate planning. The more factors that come into play, the more complex the plant and resulting management become over time. Especially with the increasing demand for process automation and the Internet of things, the process grows in complexity. Therefore, it is necessary to be aware of potential faults and failures during production and to predict when maintenance is needed. While condition monitoring (CM) deals with the issue and definition of monitoring machines and systems, predictive maintenance (PdM) can be considered as the technical perspective through which this issue is addressed in detail and practice through methodology and measurements. Both methods are capable of enhancing the highly complex processes in a manufacturing plant. Since any engineering system, at some time of its use, will not function without errors, CM takes care of the periodic or continuous monitoring of plants, machines, and processes, as well as other objects or metrics [3]. PdM instead aims to ensure that maintenance in a system is done only when it is necessary in order to reduce the resulting costs. This aim is met by condition-driven as well as time-driven tasks that evaluate, for example, failure data, equipment reliability, and failures that can be prevented [4].
While manually supervised methods are time consuming and inaccurate due to human errors, artificial intelligence (AI) is a technological innovation that replaces manual work and helps humankind, especially in manufacturing, significantly [5]. AI is a branch of computer science that develops intelligent machines, which can behave like humans and make decisions according to the logical program in their memory [6]. Machine learning (ML) is a subset of AI that mimics human intelligence by learning from the surrounding environment. There are different ML techniques that learn from the current context and generalize to unseen tasks [7]. These techniques can be useful for both CM as well as PdM. The models only need to be trained beforehand in order to be able to predict certain conditions or expected maintenance [8].

Other reviews (e.g., systematic literature reviews) and studies (e.g., [9,10]) investigating the application of ML techniques for PdM in the field of manufacturing were already published in 2021. These studies conclude that the research domain is moving fast, especially after 2016, and that targeted research is required. As a result of the increasing interest in ML techniques, as well as topics like CM and PdM in terms of manufacturing, this research paper conducts a systematic mapping study (SMS), which aims to provide a structured overview of the current literature in this area and the quantity of publications available within it. Moreover, this work allows for a better understanding of the current state of the research, as well as the detection of research gaps for further investigations. The SMS is based on the guidelines for conducting systematic mapping studies (see [11]). Five research questions were defined in order to find relevant information with the help of this study.

The rest of the paper is organized as follows. In Section 2, the research methodology (i.e., systematic mapping study) is introduced and described. Here, the first process steps are illustrated. This section also presents the aforementioned research questions (see Section 2.1). The next steps of the procedure are explained in more detail in Section 3. In addition, this section summarizes the relevant results of this study. Subsequently, Section 4 analyzes and discusses these results. Finally, Section 5 provides a conclusion, highlighting the findings of the present study. The main findings are that the field of research is developing rapidly and that further investigation in this area is needed in order to address the problems associated with production, which are constantly increasing in complexity.

2. Materials and Methods

A systematic mapping study is a research technique to identify and classify current studies available in the considered research area. Specifically, this study is conducted to provide a systematic overview on ML techniques in CM and PdM in a quantitative manner. Therefore, the number of publications is the measurement to provide data in this study. The general process steps of such a study are based on the proposed approach of Petersen et al. [11], which is illustrated in Figure 1.

This present section describes the first four steps of this study. In Section 2.1, all research questions being examined are proposed. Section 2.2 deals with the process of searching all the papers in this context. Section 2.3 identifies all papers that are relevant, based on the research questions. The keywording process in Section 2.4 deals with the keywording of all included publications.

2.1. Research Questions

The essential starting point for a systematic mapping study is the formulation of research questions (RQs). The objective of this study is to provide an overview of current approaches in the research field and to identify the quantitative results of available studies. Thus, five research questions were defined. These questions, and the rationale for each, are as follows:

RQ 1.: Which techniques are used and what is their relative frequency?
Rationale: This question defines the basis of the study and provides an overview of the current existing ML approaches applied for CM or PdM.
RQ 2.: For the identified techniques, which algorithms are used the most?
Rationale: Since ML techniques involve diverse algorithms, these approaches have to be identified in order to determine the trend or distribution.
RQ 3.: What is the distribution of online and offline algorithms in the identified scenarios?
Rationale: Which learning method is used in the present studies? Do the researchers profit from one method in particular?
RQ 4.: Are there algorithms that are currently gaining momentum?
Rationale: In order to identify and fill gaps of machine learning applied in the manufacturing sector, this question aims to determine potential algorithms for implementation within the research domain by examining their frequency of use in current research.
RQ 5.: Which applications are examined to apply condition monitoring or predictive maintenance?
Rationale: This question extracts all applications of ML techniques applied for condition monitoring or predictive maintenance. It will show the distribution in the fields of application.

2.2. Search Strategy

To answer the above questions, primary studies are collected by executing search strings in scientific databases. These queries should frame the scope of this study to define the pool of data, and thus provide the information that is needed for the RQs. Therefore, all published papers should consider ML techniques for CM or PdM in the manufacturing sector. To define the search string, the RQs in Section 2.1 are first analyzed. Since every RQ focuses on the same context, the search string needs to cover them all. Hence, search results should yield all papers with “condition monitoring”, “predictive maintenance”, “machine learning” and the higher-level category “artificial intelligence” in the title, abstract, or author-specified keywords. Furthermore, the search should result in studies within the manufacturing scope, so “manufacturing”, as well as the synonyms “production”, “manufacture”, “producing”, and “shop floor” are also considered. To avoid finding other reviews (a report on, or summary and evaluation of or in, a specific field [12]) or surveys (used to describe a method of gathering information from a sample of individuals [13]), these were also included in the string as an exclusion criterion. Ultimately, the search string could be defined as follows:

(“condition monitoring” OR “predictive maintenance”) AND (“machine learning” OR “artificial intelligence”) AND (“manufacturing” OR “production” OR “shop floor” OR “producing” OR “manufacture”) AND NOT (“survey” OR “review”).

Manufacturing and production can be seen as synonyms here [14]. Furthermore, shop floor refers to the part of a workshop or factory where production is carried out [15], so this aspect is also included.
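The structure of the search string above (three OR-groups joined by AND, followed by a negated OR-group) can be sketched in a few lines of Python. This is only an illustration: the helper names `or_group` and `build_query` are assumptions introduced here, and the output uses plain double quotes rather than the syntax of any particular database.

```python
# Term groups copied from the search string above; helper names are illustrative.
TOPIC = ["condition monitoring", "predictive maintenance"]
METHOD = ["machine learning", "artificial intelligence"]
DOMAIN = ["manufacturing", "production", "shop floor", "producing", "manufacture"]
EXCLUDED = ["survey", "review"]


def or_group(terms):
    """Join quoted terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(topic, method, domain, excluded):
    """Combine the required groups with AND and negate the excluded group."""
    required = " AND ".join(or_group(g) for g in (topic, method, domain))
    return f"{required} AND NOT {or_group(excluded)}"


query = build_query(TOPIC, METHOD, DOMAIN, EXCLUDED)
print(query)
```

Keeping the term groups as separate lists makes it straightforward to adapt the same query to the syntax each database expects.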

The queries were executed on IEEExplore (https://ieeexplore.ieee.org), ScienceDirect (https://www.sciencedirect.com), and Scopus (https://www.scopus.com) on 17 May 2021, and the resulting number of research papers is listed in Table 1. These databases were chosen since they are useful for a wider search because of their available records and are often used in the academic setting.

All publications were managed with the open-source citation and reference management software JabRef (https://www.jabref.org, accessed on 11 February 2022). Note that the query was adapted to the syntax required by each scientific database, as each uses its own. More keywords could in principle be used to further specify the search, but since Scopus only allows eight keywords in one search query, the search terms were chosen accordingly.

2.3. Screening of Papers

This process step determines the relevant publications. In order to answer the RQs, irrelevant research papers are excluded with the help of the inclusion and exclusion criteria listed in Table 2.

Since Industry 4.0 was first established in 2011 [1], all papers published from then on (which covers the last decade at the time of this study) and written in English are taken into account. Furthermore, one inclusion criterion is the ability to access the paper online, since [11] recommends reading the full texts during the course of the study. Apart from that, even though “condition monitoring” and “predictive maintenance” were used in the search queries, some results deal purely with healthcare- or network-related issues. Such studies focusing on non-manufacturing areas were excluded, although “manufacturing”, “production”, and their synonyms were included in the search term. Additionally, other systematic mapping studies and literature reviews were excluded, because this research is conducted to give a newer overview of the research area. All criteria were determined in order to have a guideline for this particular process step. The decision was always made based on the abstract. Whenever the title or the abstract did not make it clear whether a study could be included, the introduction and the conclusion were read. In case of doubt, full-text reading was conducted as well. In this way, 389 papers were examined more closely (see Section 3). Ultimately, the relevant papers for this SMS include 254 publications.
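The criteria named in this subsection can be expressed as a simple filter. The following is a minimal sketch, assuming a hypothetical `Record` structure. It covers only the criteria stated in the text (Table 2 may list more), and the sample records are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical metadata for one search result."""
    title: str
    year: int
    language: str
    full_text_available: bool
    is_review_or_mapping_study: bool
    manufacturing_related: bool


def passes_screening(r: Record) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    return (
        r.year >= 2011
        and r.language == "English"
        and r.full_text_available
        and not r.is_review_or_mapping_study
        and r.manufacturing_related
    )


# Made-up records for illustration only.
records = [
    Record("Tool wear monitoring with SVM", 2018, "English", True, False, True),
    Record("Patient monitoring with ML", 2019, "English", True, False, False),
    Record("Bearing fault diagnosis", 2009, "English", True, False, True),
]
included = [r for r in records if passes_screening(r)]
print([r.title for r in included])
```

In the study itself, the decision for each paper was made by reading the abstract and, where necessary, the full text, so a filter like this could only support the manual step, not replace it.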

2.4. Keywording

The keywording step is the fourth process step according to [11]. All tasks done in this phase are shown in Figure 2, which is also recommended by [11]. This scheme allows one to ensure that all relevant publications are considered and to reduce the time for mapping.

First of all, the author-specified keywords are collected in order to get a sense of the paper and the research area. Subsequently, the abstracts of the papers are read to identify the most important keywords and thus to create a classification scheme in which the paper is sorted. After reading the abstracts, another step is taken into account: the detailed or full-text reading, which allows one to update the classification scheme if necessary (e.g., when the abstract is not sufficient to classify the paper). This procedure generally helps to identify the needed information in order to answer the RQs. The keywords used are categories of:

the ML technique;
the algorithms applied;
the research type;
if a framework or case study is proposed;
the learning type (e.g., online and offline ML).

Since these keywords used in this study represent the information for answering the RQs, an illustration of the keyword distribution would not provide any additional information. Therefore, the keywords defined during this process will be used for the upcoming RQ results (see Section 3.2, Section 3.3, Section 3.4, Section 3.5 and Section 3.6). Nevertheless, the keyword distribution of the author-specified keywords is provided (as shown in Figure 3), which represents the study’s context as well as the used search string. The reason for the broad range of artificial intelligence techniques could be the usage of the term “machine learning” in the respective abstract or author-defined keywords. However, because “deep learning” is in fact a sub-domain of ML, it is not surprising that several papers deal with such methods. These publications were still considered in this work. In addition, “artificial intelligence” was expected, since this was included in the queries and incorporates the research area to which ML and deep learning belong.

To determine the research type of the papers, all included research was classified in its respective facet, as later depicted in Section 3.6. In addition, the distinction between case studies and proposed frameworks was considered, as this provides an understanding of the depth to which researchers have explored the techniques.

3. Results

This section summarizes the results of the conducted SMS. The exclusion process shown in Figure 4 maps the successive reduction of publications in the course of the study. From the initial result set of 819 papers, only 254 publications are relevant for answering the RQs (i.e., approximately 31.01% met the above-mentioned criteria).

The data extraction and mapping process in Section 3.1 is the last process step according to [11] and describes the analysis of these studies. All results for the RQs are divided in separate subsections (Section 3.2, Section 3.3, Section 3.4, Section 3.5 and Section 3.6) in order to provide the answer to the respective RQ. The data extraction step allows us to find these answers in a systematic manner. Appropriate diagrams and tables are provided to represent the extracted data.

3.1. Data Extraction and Mapping Process

The data extraction and mapping is the last process step of this SMS according to [11]. Based on the keywords set in the previous step, the required information for this study can be extracted. As shown in the keywording process in Figure 2, the classification scheme can be iteratively updated with each research paper read. The keywording was managed within the respective keywords field in JabRef, which makes it easy to determine the quantity of each keyword with the built-in search. This information was next used for answering the RQs. Figure 4 summarizes the mapping or exclusion process.
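The keyword counting was done with JabRef's built-in search. Outside JabRef, the same tally could be approximated from an exported BibTeX file, as sketched below. The entries are hypothetical placeholders, the function name `count_keywords` is introduced here for illustration, and a comma is assumed as the keyword separator.

```python
import re
from collections import Counter

# Hypothetical BibTeX excerpt; entry keys and keywords are placeholders.
BIBTEX = """
@article{paperA, title = {A}, keywords = {classification, SVM, offline}}
@inproceedings{paperB, title = {B}, keywords = {deep learning, LSTM, offline}}
@article{paperC, title = {C}, keywords = {classification, random forest, online}}
"""


def count_keywords(bibtex: str, separator: str = ",") -> Counter:
    """Count how many entries carry each keyword in their keywords field."""
    counts = Counter()
    for field in re.findall(r"keywords\s*=\s*\{([^{}]*)\}", bibtex):
        # A set avoids counting a keyword twice within one entry.
        counts.update({k.strip() for k in field.split(separator) if k.strip()})
    return counts


keyword_counts = count_keywords(BIBTEX)
print(keyword_counts.most_common(3))
```

The regular expression only handles keyword fields without nested braces, which is sufficient for a flat classification scheme like the one used here.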

With the removal of papers that were published before 2011, 63 papers, which account for approximately 7.7%, were excluded. The step with the most papers excluded represents the application of the quality criteria, as listed in Table 2 in Section 2.3. As shown in Figure 5, investigation into the present research domain has increased over the last decade. Although a significant number of papers were excluded, this trend is still identifiable in the illustrated chart. Even in 2021, which was not over at the time the study was conducted, there are already publications that deal with applying ML techniques for CM or PdM in the manufacturing sector.
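The shares quoted in this section follow directly from the reported counts. As a quick check, using only the numbers given in the text (819 initial results, 254 relevant papers, 63 papers published before 2011):

```python
initial_results = 819        # papers returned by the search
relevant = 254               # papers kept after the full exclusion process
published_before_2011 = 63   # papers removed by the publication-year criterion


def share(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to two decimals."""
    return round(100 * part / total, 2)


print(share(relevant, initial_results))               # 31.01
print(share(published_before_2011, initial_results))  # 7.69
```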

After exclusion, the first study in the final set still dates from 2011, so this topic has actually been worked on in the last ten years. Now, looking at the publications over all ten years, it can be seen that, on average, 61 studies were published per year before exclusion and 25 after exclusion. This also reflects the research interest in this topic. In terms of increasing publications from 2017 to 2020, it can be assumed that the number of published papers in 2021 could be at least as high as the number of publications in 2020. This assumption applies to both pre- and post-exclusion studies, as shown in the corresponding figure. In addition, more than half of the papers included and reviewed were published as articles in journals (approximately 58.66%), whereas only one article was published in a book [16]. This means that the rest (approximately 40.94%) were published as conference papers. Furthermore, the country in which the paper was written was recorded during full-text reading. The country of the first named author was used for this purpose. Nevertheless, this is only a brief assessment, as the author could only be at the respective university or research institute for a certain length of time. For improved readability, only those countries with at least five publications have been listed in Table 3, while all other countries that do not meet this criterion have been aggregated.

It can be noted that research institutes in China, Germany, India, the United Kingdom, and the USA are mainly represented and work predominantly in this research domain. However, there are various other countries around the world that are dealing with this topic as well.

Figure 6 shows the correlation between the population size of the countries listed in Table 3 and the number of publications in the respective country. In this illustration, it can be observed that, while there is some randomness in the figure, there seems to be a positive relationship between the population and number of papers published from that country.

3.2. Machine Learning Techniques Used and Their Relative Frequency (RQ 1)

This RQ helps to determine the distribution of ML techniques. A variety of such techniques can be applied in the manufacturing sector. For instance, Russell and Norvig [17] provide an overview of different artificial intelligence approaches. On the basis of this resource, a list of possible ML techniques is generated. In order to give a comprehensive summary of the ML techniques applied, several learning methods (e.g., supervised and unsupervised learning) are further broken down into detailed techniques. For this reason, the following ML techniques are considered in this study: clustering, reinforcement learning, classification, regression, dimensionality reduction (which includes feature selection as well), neural networks and deep learning, ensemble methods, and natural language processing. During full-text reading, it was also found that some papers use a technique known as transfer learning, in which pre-trained models are used to facilitate faster learning on similar problems (e.g., [18,19,20]). Therefore, this was also included as an additional ML technique. In order to identify the ML techniques used for CM and PdM in the manufacturing sector, these were considered in the screening process of this study. Here, it can be observed that classification and neural networks, but also deep learning techniques, are used the most among all papers found, whereas transfer and reinforcement learning were the techniques that were applied the least. Throughout the study, no publication using natural language processing techniques was found, and therefore this technique was not included. Since RQ 1 and RQ 2 are thematically linked to each other, the results of both RQs are presented within Section 3.3 in Table 4 for an aggregated overview. In total, ML techniques and algorithms were used or evaluated 426 times in research starting from 2011 and dealing with CM or PdM in the manufacturing sector.
It is worth mentioning that the total of 426 does not match the number of publications depicted in the mapping process in Figure 4. This is because the researchers in the existing papers did not apply only one ML technique or algorithm, but examined or evaluated several. Based on this absolute frequency, the relative frequency (which is shown in Figure 7) can be determined. The chart illustrates that over half of the techniques used or evaluated are classification and neural nets or deep learning, while transfer and reinforcement learning only account for approximately 2%.
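The difference between the number of papers and the number of technique uses, and the step from absolute to relative frequency, can be illustrated with a small sketch. The paper identifiers and label assignments below are hypothetical and do not reproduce the study's data.

```python
from collections import Counter

# Hypothetical technique labels per paper; a paper may carry several.
papers = {
    "paper_1": ["classification", "neural networks and deep learning"],
    "paper_2": ["classification"],
    "paper_3": ["regression", "ensemble methods", "classification"],
    "paper_4": ["transfer learning"],
}

# Absolute frequency: every use of a technique is counted once.
absolute = Counter(t for techniques in papers.values() for t in techniques)
total_uses = sum(absolute.values())  # counts uses, not papers

# Relative frequency: share of each technique among all uses.
relative = {t: n / total_uses for t, n in absolute.items()}

print(len(papers), total_uses)
for technique, freq in sorted(relative.items(), key=lambda kv: -kv[1]):
    print(f"{technique}: {freq:.1%}")
```

Because the denominator is the number of uses rather than the number of papers, the relative frequencies sum to one even though individual papers contribute several labels.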

3.3. Algorithms Used (RQ 2)

In order to capture all algorithms applied for CM and PdM in the manufacturing sector, it is necessary to record all algorithms in the keywording process. For each ML technique, different algorithms were used. Table 4 lists the top three algorithms (i.e., the three most used algorithms) in each technique category. All other algorithms were summarized in “other”. Because the “other” category is at least as big as one of the top three algorithms, it can be assumed that there are various algorithms investigated in the existing research.

Regardless, the top three algorithms of each technique still account for at least half of all algorithms used in each technique category (see Figure 8 and Table 4).

The preferred algorithms are support vector machines and random forests, whereas reinforcement learning is only used three times in the included papers, so it does not include an “other” category. Likewise, transfer learning is not used often. Furthermore, the principal component analysis method is mainly used in contrast to other dimensionality reduction algorithms. The same is observed with random forest, among the ensemble methods. Considering the number of times algorithms are used, classification and deep learning algorithms are used the most.

3.4. Distribution of Online and Offline Machine Learning (RQ 3)

There are several ways to train ML models. Whereas online ML typically considers one data point at a time and is able to generate models directly based on this data [17,21,22], offline ML uses historical data as often as needed in order to train the model [17]. Furthermore, for each training method, it is possible to train an ML model with mini-batch learning (MBL), which is a variant of online or offline ML and “aggregates multiple examples at each iteration” [23]. Figure 9 illustrates the distribution of online ML, offline ML, and MBL as a Venn diagram.

The large circle on the left shows all papers that have applied online ML. In contrast, the big circle on the right depicts the number of publications that use offline ML. The intersection of both circles represents those papers that used online as well as offline ML. This is possible because these publications used either their own framework that consists of an offline and an online training stage, or different approaches with various algorithms. The inner circles represent the mini-batch learning variant of offline and online ML, respectively. According to this figure, offline ML (approximately 76%) is used more often than online ML (nearly 20%), whereas around 4% of the papers that use ML techniques apply online as well as offline ML. The MBL variant is not used as much. Papers that did not disclose their learning method were not included in the figure, which was the case for a total of 31 publications.
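The three training regimes distinguished above can be contrasted on a toy problem. The sketch below fits a one-parameter linear model y = w * x with gradient descent on synthetic data. The model, learning rate, and data are assumptions chosen for illustration and are not taken from any of the reviewed papers. The mini-batch variant is shown here on stored data (i.e., as a variant of offline ML).

```python
import random


def grad(w, batch):
    """Gradient of the mean squared error of y = w * x over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def train_offline(data, lr=0.05, epochs=200):
    """Offline: revisit the full historical data set as often as needed."""
    w = 0.0
    for _ in range(epochs):
        w -= lr * grad(w, data)
    return w


def train_online(stream, lr=0.05):
    """Online: update the model once per arriving data point."""
    w = 0.0
    for point in stream:
        w -= lr * grad(w, [point])
    return w


def train_mini_batch(data, lr=0.05, batch_size=8, epochs=50):
    """Mini-batch: aggregate several examples at each iteration."""
    w = 0.0
    for _ in range(epochs):
        for i in range(0, len(data), batch_size):
            w -= lr * grad(w, data[i:i + batch_size])
    return w


# Synthetic data with true slope 3.0 and small noise (illustrative only).
rng = random.Random(0)
data = []
for _ in range(400):
    x = rng.uniform(-1, 1)
    data.append((x, 3.0 * x + rng.gauss(0, 0.1)))

w_offline = train_offline(data)
w_online = train_online(data)
w_mini = train_mini_batch(data)
print(w_offline, w_online, w_mini)
```

All three variants should recover a slope close to 3.0 on this data. They differ in how much data each update sees and in whether past data must be stored.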

3.5. Algorithms Currently Gaining Momentum (RQ 4)

In order to capture all algorithms that are currently gaining momentum, the studies are plotted over time. For this purpose, all algorithms with a number of publications of at least 10 are considered. This was the case with ten algorithms, as shown in Figure 10.

In this figure, only the year of publication and not the month is considered, to show the progression of the algorithms used. Looking at this line chart, it can be clearly seen that from 2017 on, in particular, there is a significant increase in the number of publications using these algorithms. The peak is mostly in 2020, which shows that all presented algorithms are preferred over other algorithms. Even in this year, 2021 (which is not over yet), there are already studies applying these algorithms.

To address the difference between 2021 as an incomplete year and the other full years, a comparison is made between the cumulative numbers of publications in 2020 and 2021, or 2018 and 2019. It can be observed that random forest and long short-term memory have been gaining positive momentum in the last two years, because the cumulative number of publications in 2020 and 2021 is higher than the cumulative number of publications in 2018 and 2019. While both algorithms are used more in the last years after 2017, algorithms such as support vector machines and C4.5 decision trees are showing negative momentum (i.e., they show a lower cumulative number of publications in 2020 and 2021 in comparison with 2018 and 2019 and are therefore not used as much).
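The momentum comparison described above reduces to comparing two two-year sums per algorithm. A minimal sketch follows, with hypothetical yearly counts and placeholder algorithm names (not the values behind Figure 10):

```python
# Hypothetical yearly publication counts per algorithm (not the study's data).
counts = {
    "algorithm_a": {2018: 3, 2019: 5, 2020: 9, 2021: 4},
    "algorithm_b": {2018: 6, 2019: 8, 2020: 7, 2021: 2},
}


def momentum(yearly: dict) -> int:
    """Publications in 2020 and 2021 minus publications in 2018 and 2019."""
    recent = yearly.get(2020, 0) + yearly.get(2021, 0)
    earlier = yearly.get(2018, 0) + yearly.get(2019, 0)
    return recent - earlier


for name, yearly in counts.items():
    delta = momentum(yearly)
    label = "positive" if delta > 0 else "negative" if delta < 0 else "neutral"
    print(name, delta, label)
```

Since 2021 was incomplete when the search was run, a positive difference is a conservative signal, whereas a negative difference may partly reflect the missing months.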

In general, observing Figure 10, it is expected that these algorithms will be further examined and used in upcoming studies.

3.6. Applications in Condition Monitoring or Predictive Maintenance (RQ 5)

Most of the included research papers used ML techniques and algorithms for case studies, applications, or particular use cases. To provide an overview of all applications investigated in the existing literature, a matrix bubble chart was developed, which is shown in Figure 11. It should also be noted that, even in this chart, there may be duplication in the techniques used, as there are studies that have investigated different problems or purposes. For example, a paper may not only deal with classification methods, but also have considered deep learning algorithms. Therefore, these numbers do not match with the number of publications evaluated in this study. All use cases were classified in nine categories. “CM” includes the CM of tools, tool wear, machines, process, and so on, while “prediction” describes all tasks predicting any metric, for example, machine condition or speed. “Classification/identification” reflects all tasks that deal with, for example, object identification, condition, or roughness classification. Classification and deep learning techniques, in particular, are used the most for CM. Fault detection, diagnosis, and anomaly detection are also important issues in the CM or PdM area, where classification, regression, and deep learning are used often.

For a better understanding of the work done in each particular field of application, Table 5 provides some examples of respective publications included in this SMS. From this table, it can be seen that there is some overlap, as these publications span multiple application areas. Not every publication focusing on a particular field of application uses the same ML techniques or algorithms. Thus, for example, ref. [24] used long short-term memory (LSTM) neural networks to monitor machine health, while [25] focused on support vector machines (SVMs) for machine CM. On the other hand, ref. [26] used SVMs as well as k-nearest neighbors (KNN) as classification algorithms for PdM. In these examples listed in Table 5, different metrics were used to evaluate the performance of the applied ML techniques and algorithms. For example, while [24,27,28] used the mean absolute error (MAE), mean squared error (MSE), and/or root mean squared error (RMSE), ref. [29] applied leave-one-out cross validation (LOOCV).
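The evaluation metrics named above have simple definitions. The following sketch implements MAE, MSE, RMSE, and a generic leave-one-out cross validation loop in plain Python. The toy values and the mean-predicting "model" are assumptions for illustration and do not correspond to any of the cited studies.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(mse(y_true, y_pred))


def loocv_mae(xs, ys, fit, predict):
    """Leave-one-out CV: hold out each sample once and train on the rest."""
    errors = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(abs(ys[i] - predict(model, xs[i])))
    return sum(errors) / len(errors)


def fit_mean(xs, ys):
    """Toy model: remember the mean of the training targets."""
    return sum(ys) / len(ys)


def predict_mean(model, x):
    """Toy model: always predict the stored mean."""
    return model


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.0, 5.0]
print(mae(y_true, y_pred), mse(y_true, y_pred), rmse(y_true, y_pred))
print(loocv_mae([0, 1, 2, 3], y_true, fit_mean, predict_mean))
```

MAE, MSE, and RMSE score a fixed set of predictions, whereas LOOCV is a resampling procedure that can be combined with any of these error measures. This is one reason why results reported with different evaluation setups are hard to compare directly.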

Since the form of presentation allows it, the research type facets are illustrated in Figure 11 as well. The existing publications were classified on the basis of Wieringa et al. [62], although only four out of six different research types were considered, namely evaluation and validation research, solution proposal, and experience paper. Furthermore, case studies and frameworks are also depicted in the chart. It is concluded that evaluation research papers and solution proposals are the most common research types found. There is just one experience paper, which only describes the development of a continuous diagnostics subsystem without evaluation [63]. In particular, the estimation/prediction of the remaining useful life (RUL) of machines or tools is investigated often. This seems to be a key issue in CM or PdM in manufacturing, which is solved or at least taken into consideration by ML techniques. Even if the number of publications is relatively low, there seems to be an emerging trend of a higher relative usage of “more sophisticated” techniques (e.g., neural networks and deep learning) for “more difficult” tasks, such as RUL estimation/prediction or failure prediction, in contrast to the higher percentage of classification techniques for fault detection and fault diagnosis.

3.7. High-Level Comparative Analysis and Insight into the Performance of ML Techniques

The primary purpose of this study was to investigate the literature regarding CM and PdM in relation to the questions provided at the beginning of the paper. Even though an in-depth numerical analysis is beyond the scope of this study, a brief high-level insight into the performance comparison of the ML techniques used in the field, as an attempt to provide a comparative perspective, is also given. Generally, ML techniques can be categorized into three major types of methods: supervised, unsupervised, and reinforcement learning. However, this subsection will only focus on supervised learning methods. For supervised learning applications, an appropriate numerical comparative analysis would require consistency over a set of factors outlined as follows:

data set [64];
performance metrics used (classification versus regression);
implementation standards (chance of overfitting);
distribution of errors.

Comparing the performance of the ML techniques used in the relevant literature is only possible if all of the factors listed above are identical between the papers. Otherwise, the results would have to be reproduced under the same conditions and setups as reported in the original publications. In short, comparing two different applications of ML techniques is, for the most part, only valid if the implementations were reported for the same data set and with the same performance metric(s). In addition, the validity of the analysis depends strongly on the reliability of the algorithm (which can, optimistically, be captured via an analysis of the distribution of errors) and on accounting for the potential chance of overfitting.
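The conditions above can be sketched as follows: two models are evaluated on the same data set, with the same folds and the same metric, and the per-fold errors are kept so that their distribution can be inspected. This is a minimal illustration under stated assumptions, not a procedure from the reviewed papers; the data are synthetic, and the two models (a mean predictor and a least-squares line) as well as all function names are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, used as the single shared metric."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Least-squares straight line through the training data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def cross_validate(fit, xs, ys, k=5):
    """Return one RMSE per fold; fold i holds out indices i, i+k, i+2k, ..."""
    n = len(xs)
    errors = []
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        model = fit(train_x, train_y)
        test_idx = sorted(held_out)
        errors.append(rmse([ys[i] for i in test_idx], [model(xs[i]) for i in test_idx]))
    return errors


# Synthetic degradation signal: linear trend plus alternating +/-1 disturbance.
xs = [float(i) for i in range(20)]
ys = [100.0 - 4.0 * x + (1.0 if i % 2 else -1.0) for i, x in enumerate(xs)]

errors_mean = cross_validate(fit_mean, xs, ys)  # same data, folds, and metric
errors_line = cross_validate(fit_line, xs, ys)
print("mean predictor:", [round(e, 2) for e in errors_mean])
print("linear model:  ", [round(e, 2) for e in errors_line])
```

Because both models see identical folds, the difference in the error lists can be attributed to the models rather than to the data split. Evaluating on held-out folds also gives a rough check against overfitting.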

Because the applications of ML to CM and PdM are highly diverse, only a few representative references are cited and discussed here.

Supervised ML applications are categorized here by the format of the input data: images (classification), the classical supervised learning format or multi-column format (both classification and regression), and time series (regression, in the form of time series prediction).

For the performance comparison to be valid, the data sets used must be the same. In addition, the type of problem and the hyperparameters used for building the model are also important. Because these conditions are not met across publications, different ML techniques cannot be compared directly in terms of performance; the conclusions drawn from the experimental results of individual publications are used instead. In this section, those studies that make use of basic versions of ML techniques (i.e., versions in which no additional improvements are made and no variants are used) are addressed first. The reviewed studies indicate that linear and logistic regression are generally the weakest methods for regression and classification applications, respectively. SVMs (for classification; support vector regression (SVR) for regression) and artificial neural networks (ANNs) have not been reported to consistently outperform one another [65]. In [24], the authors report the overall superiority of LSTM over recurrent neural network (RNN), ANN, SVR and other regression methods. In [26], the authors report the superiority of SVR over KNN. In many cases, ensemble methods such as random forest (RF) have shown improved performance in comparison with basic SVM and ANN [66,67]. However, the majority of the studies show that, in the presence of a sufficient amount of data, deep learning methods (in particular deep neural networks) outperform the rest of the basic ML methods [68]. This brief comparison depends strongly on the availability of adequate training data; for small-sample-size problems, simpler methods show better performance. In addition, the findings of this analysis suggest that the most promising results belong to the applications of hybrid ML methods, where a set of methods are combined to build a more powerful model. 
For example, principal component analysis (PCA) combined with supervised learning methods has predominantly been reported to lead to an improved performance. PCA is mainly used for feature reduction and thus improves the performance by reducing the complexity of the problem before ML methods are employed. In particular, evidence reported in [69] shows that PCA improves SVM; it also improves SVR, according to [70], and iterated SVR (ISVR), according to [71]. In [72], the authors report that PCA improved a convolutional neural network (CNN). However, some studies, such as [73], reported an insignificant impact of PCA on the overall performance. A more advanced class of ML technique known as autoencoders proved even more effective than the combination of PCA and ML techniques. More details are provided in [74]. In the context of image data sets, recent research has demonstrated that CNNs are inherently superior to traditional image processing techniques [75]. CNNs and their variants also proved successful when dealing with originally non-image data, in which case the data is converted to 2D images (e.g., vibrations are converted to two-dimensional images) [76,77,78]. There is an increasing interest in transfer learning in recent works on CM and PdM. Many recent works reported highly accurate results when using transfer learning, such as [79]. In [80], the authors presented a transfer convolutional neural network for fault diagnosis based on ResNet. In [81], the authors report that transfer learning with ResNet was able to outperform ConvNet. In [82], the authors demonstrated that ResNet can outperform ConvNet and AlexNet. In addition, ref. [83] presented a deep transfer network (DTN) that used domain adaptation to provide an improved performance. The authors reported superiority over SVMs, RFs, and even CNNs. From the brief comparison provided in this section, it can be concluded that CNNs and deep learning methods are the most popular methods for CM and PdM. 
Transfer learning is attracting increasing attention, and it meaningfully improves the performance of CNNs and deep learning methods. Therefore, this analysis suggests that transfer learning will continue to gain attention and has the potential to become the most frequently used technique in the coming years.
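The feature-reduction role of PCA described in this subsection can be sketched as follows. The example computes the first principal component of a small data set by power iteration and projects the samples onto it, which turns two correlated features into one. It is a minimal sketch under stated assumptions: the sensor readings are hypothetical, only the first component is extracted, and power iteration is used here for brevity, whereas library implementations typically rely on a singular value decomposition.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (feature means, unit vector of the first principal component)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # data have no variance; keep the start vector
        v = [x / norm for x in w]
    return mean, v


def project(rows, mean, direction):
    """Reduce each sample to a single score along the given direction."""
    return [
        sum((r[j] - mean[j]) * direction[j] for j in range(len(mean))) for r in rows
    ]


# Hypothetical readings of two strongly correlated sensors.
readings = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
mean, direction = first_principal_component(readings)
scores = project(readings, mean, direction)
print(direction, scores)
```

In a hybrid setup of the kind reported in the cited studies, the reduced `scores` (rather than the raw features) would then be passed to a supervised learner such as an SVM.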

4. Discussion

This section analyzes and discusses all findings in the course of answering the RQs in Section 3 and conducting the SMS. For this reason, each RQ will be discussed separately, which allows for better differentiation when discussing all of the results in this section.

The information found for RQ 1 shows that a large number of techniques are used for CM or PdM in the manufacturing sector. Some of the publications retrieved do not address issues in this sector. This is because the advanced search via a search string also found studies from other fields that use ML techniques. Additionally, because CM and PdM can be applied to a broad range of use cases, some research not belonging to this particular sector was included in the query results. Nevertheless, mainly manufacturing-related topics were found, which is indicated by the relatively high number of publications in the final set and reflects the importance of the application of ML techniques in this research field. Further, it was found that the keywords listed in the scientific databases were not always the author-specified keywords. Instead, some keywords were assigned by the database; these do not represent the keywords originally defined by the authors and distorted the keywording process immensely. For this reason, each paper was inspected a second time in an attempt to categorize it correctly. The small number of publications using reinforcement learning could be a result of the complexity of this technique: such models are difficult to train, as the success of training depends strongly on the chosen policy. Transfer learning, on the other hand, is not used often, because the included studies mostly work with the researchers’ own data sets, created in the course of their investigation or approach. Conversely, some research papers also trained on publicly available data sets (e.g., the Milling Data Set [84] or the Turbofan Engine Degradation Simulation Data Set [85], both provided by NASA).

The findings of RQ 2 confirm the results of RQ 1 (i.e., different algorithms were applied in the research found). In particular, a single paper often uses more than one algorithm in order to compare performance or accuracy on one or more defined use cases or problem tasks. This is why the number of algorithms used is higher than the number of publications considered. The same also applies to the ML techniques used. It is noteworthy that there are several overlaps because, for example, a classification problem may be solved by an ensemble method, while a clustering technique is also mainly used for the same purpose. This result also indicates the feasibility of their application in the present research field, as they are used in a variety of ways. It can therefore be stated that there are many possibilities for CM and PdM in the manufacturing sector to work with such algorithms (e.g., predicting RUL of equipment or detecting failures in production).

With regard to RQ 3, it was expected that offline ML would be applied more often than online ML. In a research context, the model is usually trained on existing historical data, whereas in industrial practice new data is continuously generated during the production process (e.g., due to changing demands [86]). As already described in Section 3.4, offline ML reuses already collected data as often as needed in order to train the model [17]. Consequently, the entire data set has to be iterated over, which leads to relatively long training times; especially when using stochastic gradient descent, the number of training steps can be high [17]. However, such a learning method increases the probability of finding the global minimum in the training phase of a model [17]. By contrast, online ML is able to update the model as new data arrives. This is particularly useful when data changes significantly over time. Furthermore, this can be practical in certain use cases to compensate for the difficulties of offline ML. However, the disadvantage of online ML is that the global minimum often cannot be found [17].
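The difference between the two learning modes can be illustrated with a one-parameter linear model. The offline variant repeatedly iterates over the complete historical data set, while the online variant performs one gradient step per arriving sample and can therefore follow a change in the process. This is a minimal sketch, not an implementation from the reviewed studies; the data, the drift from slope 3 to slope 5, the learning rate, and the names `fit_offline` and `OnlineModel` are all hypothetical.

```python
def fit_offline(xs, ys, epochs=200, learning_rate=0.01):
    """Batch gradient descent for y = weight * x over the full data set."""
    weight = 0.0
    for _ in range(epochs):
        # Every epoch touches every stored sample.
        gradient = sum(2.0 * (weight * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        weight -= learning_rate * gradient
    return weight


class OnlineModel:
    """Same model, updated with one gradient step per incoming sample."""

    def __init__(self, weight=0.0, learning_rate=0.01):
        self.weight = weight
        self.learning_rate = learning_rate

    def update(self, x, y):
        self.weight -= self.learning_rate * 2.0 * (self.weight * x - y) * x


# Historical data: the process follows y = 3 * x.
history_x = [1.0, 2.0, 3.0, 4.0, 5.0]
history_y = [3.0 * x for x in history_x]
offline_weight = fit_offline(history_x, history_y)

# Later the process drifts to y = 5 * x; samples now arrive one at a time.
online = OnlineModel(weight=offline_weight)
for _ in range(20):
    for x in history_x:
        online.update(x, 5.0 * x)

print(offline_weight, online.weight)
```

The offline model keeps the weight learned from the historical data until it is retrained, whereas the online model tracks the drifted relationship. With noisy data and a constant step size, the online estimate would keep fluctuating around the optimum rather than settling on it.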

The results of RQ 4 show that there are various algorithms that are currently gaining momentum. However, the lower boundary of 10 is set for no specific reason. On the one hand, if the limit was set too high, then too few algorithms would have been presented. On the other hand, there would have been too many algorithms that meet the requirement if the boundary was set too low. The provided chart reflects all algorithms that could be considered in upcoming research. Nevertheless, this could be a guide to other algorithms that could be investigated in future studies. According to RQ 1 or RQ 2, this could suggest exploring, for example, the technique of reinforcement learning more closely.

The last research question in particular, RQ 5, aims to illustrate different purposes for the usage of ML techniques or their algorithms. While either CM or PdM can be seen as a whole, there are various tasks belonging to each area. Hence, early fault detection is one of the key issues that PdM deals with. Furthermore, in order to predict when maintenance is needed, continuous monitoring of machinery, tools, or processes is required. Therefore, CM and PdM seem to be linked to each other, since (periodic or) continuous monitoring, a necessary tool in PdM, is one of the main tasks in CM.

Finally, all RQs could be sufficiently answered with the help of the extracted data. Many efforts have already been made in the current research. The narrowly defined SMS does not cause difficulties in conducting such a study, as enough papers exist to draw conclusions. However, this makes it difficult to identify any gaps in this research area. On the one hand, it is possible to investigate the used algorithms more closely and compare existing results. On the other hand, there are other algorithms that could be explored in the manufacturing sector for the present tasks. It can be stated that CM and PdM in combination with ML is not only relevant in today’s research, but also represents a topic for the future. One reason could be the growing complexity in production plants and machinery. The higher the degree of automation in production, and thus in the manufacturing sector, the more complex it is to maintain productivity in such an operation.

5. Conclusions

This study conducts a systematic mapping process in order to examine existing research on the usage of machine learning techniques for condition monitoring or predictive maintenance in the manufacturing sector. The main finding of this work is that the relevant research papers already provide various approaches to applying ML techniques and corresponding algorithms in these specific use cases. In the last five years, in particular, intensive research has been carried out, as seen in Figure 5 or Figure 10. It should be noted that many publications do not work with just one technique or algorithm, but use several in order to compare the individual methods in terms of their accuracy or feasibility. Therefore, a wide range of techniques, related algorithms, and also possible applications could be observed.

However, due to the fact that these issues are among the most important ones to consider in today’s highly sophisticated production environment, further evaluation of existing approaches or examination of less used but still potentially suitable algorithms should be explored in future work. For example, based on the assessment of the SMS, topics such as reinforcement learning and transfer learning are underrepresented. Reinforcement learning is based on the Markov decision process. Hence, all data used need to have the Markov property (i.e., the result of an action only depends on the current state [87]). However, it is not easy to identify the states and actions when they are continuous, as many reinforcement learning models assume discrete variables [88]. Dimensionality can also be increased if too many inputs are used [89]. Moreover, sequences and time series data can be used, but doing so is more complex [90]. Still, there is a potential for industrial application areas (e.g., [91,92]).
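To illustrate what a Markov decision process with discrete states and actions looks like in a maintenance setting, the following sketch defines a three-state machine model and solves it with value iteration. Note the substitution: value iteration is a planning method that assumes the transition probabilities are known, whereas reinforcement learning methods estimate comparable values from experience. All states, actions, probabilities, and rewards are hypothetical and chosen only for illustration.

```python
# transitions[state][action] = [(probability, next_state, reward), ...]
# The next state depends only on the current state and action (Markov property).
TRANSITIONS = {
    "good": {
        "run": [(0.9, "good", 10.0), (0.1, "worn", 10.0)],
        "maintain": [(1.0, "good", -2.0)],
    },
    "worn": {
        "run": [(0.6, "worn", 8.0), (0.4, "failed", 8.0)],
        "maintain": [(1.0, "good", -2.0)],
    },
    "failed": {
        "run": [(1.0, "failed", 0.0)],
        "maintain": [(1.0, "good", -20.0)],
    },
}


def action_value(state, action, values, gamma):
    """Expected discounted return of taking `action` in `state`."""
    return sum(
        p * (reward + gamma * values[nxt])
        for p, nxt, reward in TRANSITIONS[state][action]
    )


def value_iteration(gamma=0.9, sweeps=500):
    """Return (state values, greedy policy) after a fixed number of sweeps."""
    values = {state: 0.0 for state in TRANSITIONS}
    for _ in range(sweeps):
        values = {
            state: max(action_value(state, a, values, gamma) for a in TRANSITIONS[state])
            for state in TRANSITIONS
        }
    policy = {
        state: max(
            TRANSITIONS[state], key=lambda a: action_value(state, a, values, gamma)
        )
        for state in TRANSITIONS
    }
    return values, policy


values, policy = value_iteration()
print(policy)
```

With these assumed numbers, the resulting policy keeps a good machine running and maintains a worn or failed one. Continuous sensor readings would first have to be discretized into such states, which is one of the difficulties mentioned above.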

Through this SMS, it can be observed that algorithms such as SVMs, C4.5 decision trees, and KNNs, which were commonly used for CM and PdM in the last decade, have been experiencing a decline in use over the last two years (2020 and mid-2021). Meanwhile, RFs, LSTMs, PCA, and MLPs are algorithms that are currently showing an upswing (i.e., they are used more often, especially in 2020 and 2021). Therefore, it can be expected that the latter algorithms will continue to be used in future research in the coming years.

One limitation of this SMS is that the analysis provided in this research does not include numerical comparisons or analysis on algorithmic levels, as the scope is limited to a mapping study (i.e., results are examined in a quantitative manner, rather than using numerical analysis). No domain-specific analyses are provided, as different domains impose different constraints and requirements. Furthermore, the quality of the cited publications was not analyzed, as such an analysis, while significant, requires domain and technical considerations that are beyond the scope of this research. Finally, the number of publications found depends on the search strategy, that is, the scientific databases chosen and the search string defined: the search string is limited by the fixed number of words, and the databases return different results.

Due to the rapid development within this field of research, it is recommended to conduct another systematic mapping study in, for example, five years, to further investigate the future progression. Based on this analysis, it seems like most papers work on CM and fault detection. Therefore, it is expected that these fields of application and techniques or algorithms will gain importance in the context of these problem-solving tasks. This means that neural networks and deep learning techniques may have an increasing tendency to be used.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/logistics6020035/s1. The BibTeX file for the management of citations and references with their assignments is available online at https://doi.org/10.5281/zenodo.5841767.

Author Contributions

Conceptualization, T.L.J.P., I.G., D.H. and D.R.; investigation, T.L.J.P.; methodology, T.L.J.P., I.G. and D.H.; writing—original draft preparation, T.L.J.P.; writing—review and editing, T.L.J.P., I.G., D.H. and F.B.; visualization, T.L.J.P.; supervision, D.R.; project administration, T.L.J.P. and I.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research has been partially funded by the European Regional Development Fund (EFRE) and the Free State of Saxony (funding: RISE4PM, application number: 100364367).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing not applicable. No new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial intelligence
ANN	Artificial neural network
CM	Condition monitoring
CNN	Convolutional neural network
DTN	Deep transfer network
ISVR	Iterated support vector regression
KNN	K-nearest neighbors
LOOCV	Leave-one-out cross validation
LSTM	Long short-term memory
MAE	Mean absolute error
MBL	Mini-batch learning
ML	Machine learning
MSE	Mean squared error
PCA	Principal component analysis
PdM	Predictive maintenance
RF	Random forest
RMSE	Root mean squared error
RNN	Recurrent neural network
RQ	Research question
RUL	Remaining useful life
SMS	Systematic mapping study
SVM	Support vector machine
SVR	Support vector regression

References

Kagermann, H.; Lukas, W.D.; Wahlster, W. Industrie 4.0: Mit dem Internet der Dinge auf dem Weg zur 4. industriellen Revolution. VDI Nachrichten 2011, 13, 2–3. [Google Scholar]
Thareja, P. Manufacturing paradigms in 2010. In Proceedings of the National Conference on Emerging Trends in Manufacturing Systems, Haryana, India, 15–16 March 2005. [Google Scholar]
Rao, B.K.N. Handbook of Condition Monitoring; Elsevier Advanced Technology: Amsterdam, The Netherlands, 1996. [Google Scholar]
Mobley, R.K. An Introduction to Predictive Maintenance; Elsevier: Amsterdam, The Netherlands, 2002. [Google Scholar] [CrossRef]
PK, F.A. Learning Outcomes of Classroom Research; Chapter Artificial Intelligence; L ORDINE Nuovo Publication: New Delhi, India, 2021; pp. 65–73. [Google Scholar]
Bhbosale, S.; Pujari, V.; Multani, Z. Advantages And Disadvantages Of Artificial Intellegence. Aayushi Int. Interdiscip. Res. J. 2020, 77, 227–230. [Google Scholar]
El Naqa, I.; Murphy, M.J. What Is Machine Learning? In Machine Learning in Radiation Oncology: Theory and Applications; El Naqa, I., Li, R., Murphy, M.J., Eds.; Springer International Publishing: Cham, Switzerland; Berlin, Germany, 2015; pp. 3–11. [Google Scholar] [CrossRef]
Ren, Y. Optimizing Predictive Maintenance With Machine Learning for Reliability Improvement. ASCE-ASME J. Risk Uncertain. Eng. Syst. Part B Mech. Eng. 2021, 7, 030801. Available online: https://asmedigitalcollection.asme.org/risk/article-pdf/7/3/030801/6733140/risk_007_03_030801.pdf (accessed on 11 November 2021). [CrossRef]
Nacchia, M.; Fruggiero, F.; Lambiase, A.; Bruton, K. A Systematic Mapping of the Advancing Use of Machine Learning Techniques for Predictive Maintenance in the Manufacturing Sector. Appl. Sci. 2021, 11, 2546. [Google Scholar] [CrossRef]
Begüm, A.; Akbulut, A.; Zaim, A.H. Techniques for Apply Predictive Maintenance and Remaining Useful Life: A Systematic Mapping Study. Bilecik Şeyh Edebali Üniversitesi Fen Bilim. Derg. 2021, 8, 497–511. [Google Scholar]
Petersen, K.; Feldt, R.; Mujtaba, S.; Mattsson, M. Systematic Mapping Studies in Software Engineering. In Proceedings of the 12th International Conference on Evaluation and Assessment in Software Engineering (EASE’08), Bari, Italy, 26–27 June 2008; BCS Learning and Development Ltd.: Swindon, UK, 2008; pp. 68–77. [Google Scholar]
Oxford English Dictionary. Review. 2022. Available online: https://www.oed.com/view/Entry/164850? (accessed on 5 February 2022).
Scheuren, F. What is a Survey? American Statistical Association: Alexandria, VA, USA, 2004. [Google Scholar]
Oxford English Dictionary. Manufacturing. 2022. Available online: https://www.oed.com/view/Entry/113773? (accessed on 5 February 2022).
Oxford English Dictionary. Shop Floor. 2022. Available online: https://www.oed.com/view/Entry/178522? (accessed on 5 February 2022).
Ahmed, H.; Nandi, A.K. Compressive Sampling and Deep Neural Network (CS-DNN). In Condition Monitoring with Vibration Signals: Compressive Sampling and Learning Algorithms for Rotating Machines; IEEE: Hoboken, NJ, USA, 2019; pp. 361–377. Available online: https://ieeexplore.ieee.org/document/8958910 (accessed on 12 September 2021).
Russell, S.; Norvig, P. Artificial Intelligence: A Modern Approach; Pearson Education Limited: London, UK, 2009; Volume 3. [Google Scholar]
Variz, L.; Piardi, L.; Rodrigues, P.J.; Leitão, P. Machine Learning Applied to an Intelligent and Adaptive Robotic Inspection Station. In Proceedings of the 2019 IEEE 17th International Conference on Industrial Informatics (INDIN), Helsinki, Finland, 22–25 July 2019; Volume 1, pp. 290–295. [Google Scholar] [CrossRef] [Green Version]
Wang, P.; Gao, R.X. Transfer learning for enhanced machine fault diagnosis in manufacturing. CIRP Ann. 2020, 69, 413–416. [Google Scholar] [CrossRef]
Mukherjee, S.; Huang, X.; Rathod, V.T.; Udpa, L.; Deng, Y. Defects Tracking via NDE Based Transfer Learning. In Proceedings of the 2020 IEEE International Conference on Prognostics and Health Management (ICPHM), Detroit, MI, USA, 8–10 June 2020; pp. 1–8. [Google Scholar] [CrossRef]
Agarwal, S.; Vijaya Saradhi, V.; Karnick, H. Kernel-based online machine learning and support vector reduction. Neurocomputing 2008, 71, 1230–1237. [Google Scholar] [CrossRef]
Benczúr, A.A.; Kocsis, L.; Pálovics, R. Online Machine Learning in Big Data Streams. arXiv 2018, arXiv:1802.05872. [Google Scholar]
Li, M.; Zhang, T.; Chen, Y.; Smola, A.J. Efficient Mini-Batch Training for Stochastic Optimization. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’14), New York, NY, USA, 24–27 August 2014; Association for Computing Machinery: New York, NY, USA, 2014; pp. 661–670. [Google Scholar] [CrossRef]
Zhao, R.; Wang, J.; Yan, R.; Mao, K. Machine health monitoring with LSTM networks. In Proceedings of the 2016 10th International Conference on Sensing Technology (ICST), Nanjing, China, 11–13 November 2016; pp. 1–6. [Google Scholar] [CrossRef]
Wu, H.; Wang, Y.; Yu, Z. In situ monitoring of FDM machine condition via acoustic emission. Int. J. Adv. Manuf. Technol. 2016, 84, 1483–1495. [Google Scholar] [CrossRef]
Susto, G.A.; Schirru, A.; Pampuri, S.; McLoone, S.; Beghi, A. Machine Learning for Predictive Maintenance: A Multiple Classifier Approach. IEEE Trans. Ind. Inform. 2015, 11, 812–820. [Google Scholar] [CrossRef] [Green Version]
Li, Z.; Zhang, Z.; Shi, J.; Wu, D. Prediction of surface roughness in extrusion-based additive manufacturing with machine learning. Robot. Comput.-Integr. Manuf. 2019, 57, 488–495. [Google Scholar] [CrossRef]
Corne, R.; Nath, C.; El Mansori, M.; Kurfess, T. Study of spindle power data with neural network for predicting real-time tool wear/breakage during inconel drilling. J. Manuf. Syst. 2017, 43, 287–295. [Google Scholar] [CrossRef]
Huber, S.; Wiemer, H.; Schneider, D.; Ihlenfeldt, S. DMME: Data mining methodology for engineering applications—A holistic extension to the CRISP-DM model. Procedia CIRP 2019, 79, 403–408. [Google Scholar] [CrossRef]
Painuli, S.; Elangovan, M.; Sugumaran, V. Tool condition monitoring using K-star algorithm. Expert Syst. Appl. 2014, 41, 2638–2643. [Google Scholar] [CrossRef]
Joshuva, A.; Sugumaran, V. A data driven approach for condition monitoring of wind turbine blade using vibration signals through best-first tree algorithm and functional trees algorithm: A comparative study. ISA Trans. 2017, 67, 160–172. [Google Scholar] [CrossRef]
Essien, A.; Giannetti, C. A Deep Learning Model for Smart Manufacturing Using Convolutional LSTM Neural Network Autoencoders. IEEE Trans. Ind. Inform. 2020, 16, 6069–6078. [Google Scholar] [CrossRef] [Green Version]
Langone, R.; Alzate, C.; De Ketelaere, B.; Vlasselaer, J.; Meert, W.; Suykens, J.A. LS-SVM based spectral clustering and regression for predicting maintenance of industrial machines. Eng. Appl. Artif. Intell. 2015, 37, 268–278. [Google Scholar] [CrossRef] [Green Version]
Lee, W.J.; Wu, H.; Yun, H.; Kim, H.; Jun, M.B.; Sutherland, J.W. Predictive Maintenance of Machine Tool Systems Using Artificial Intelligence Techniques Applied to Machine Condition Data. Procedia CIRP 2019, 80, 506–511. [Google Scholar] [CrossRef]
Kroll, B.; Schaffranek, D.; Schriegel, S.; Niggemann, O. System Modeling Based on Machine Learning for Anomaly Detection and Predictive Maintenance in Industrial Plants; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2014. [Google Scholar] [CrossRef]
Susto, G.A.; Beghi, A. Dealing with time-series data in Predictive Maintenance problems. In Proceedings of the 2016 IEEE 21st International Conference on Emerging Technologies and Factory Automation (ETFA), Berlin, Germany, 6–9 September 2016; Institute of Electrical and Electronics Engineers Inc.: Berlin, Germany, 2016; pp. 1–4. [Google Scholar] [CrossRef]
Kumar, A.; Shankar, R.; Thakur, L.S. A big data driven sustainable manufacturing framework for condition-based maintenance prediction. J. Comput. Sci. 2018, 27, 428–439. [Google Scholar] [CrossRef]
Traini, E.; Bruno, G.; D’Antonio, G.; Lombardi, F. Machine Learning Framework for Predictive Maintenance in Milling. IFAC-PapersOnLine 2019, 52, 177–182. [Google Scholar] [CrossRef]
Gutschi, C.; Furian, N.; Suschnigg, J.; Neubacher, D.; Voessner, S. Log-based predictive maintenance in discrete parts manufacturing. Procedia CIRP 2019, 79, 528–533. [Google Scholar] [CrossRef]
Panicucci, S.; Nikolakis, N.; Cerquitelli, T.; Ventura, F.; Proto, S.; Macii, E.; Makris, S.; Bowden, D.; Becker, P.; O’mahony, N.; et al. A cloud-to-edge approach to support predictive analytics in robotics industry. Electronics 2020, 9, 492. [Google Scholar] [CrossRef] [Green Version]
Susto, G.A.; Wan, J.; Pampuri, S.; Zanon, M.; Johnston, A.B.; O’Hara, P.G.; McLoone, S. An adaptive machine learning decision system for flexible predictive maintenance. In Proceedings of the 2014 IEEE International Conference on Automation Science and Engineering (CASE), New Taipei, Taiwan, 18–22 August 2014; pp. 806–811. [Google Scholar] [CrossRef]
Elangovan, M.; Devasenapati, S.B.; Sakthivel, N.; Ramachandran, K. Evaluation of expert system for condition monitoring of a single point cutting tool using principle component analysis and decision tree algorithm. Expert Syst. Appl. 2011, 38, 4450–4459. [Google Scholar] [CrossRef]
Klaic, M.; Staroveski, T.; Udiljak, T. Tool wear classification using decision trees in stone drilling applications: A preliminary study. Procedia Eng. 2014, 69, 1326–1335. [Google Scholar] [CrossRef] [Green Version]
Quatrini, E.; Costantino, F.; Di Gravio, G.; Patriarca, R. Machine learning for anomaly detection and process phase classification to improve safety and maintenance activities. J. Manuf. Syst. 2020, 56, 117–132. [Google Scholar] [CrossRef]
Madhusudana, C.; Kumar, H.; Narendranath, S. Fault Diagnosis of Face Milling Tool using Decision Tree and Sound Signal. Mater. Today Proc. 2018, 5, 12035–12044. [Google Scholar] [CrossRef]
Papatheou, E.; Dervilis, N.; Maguire, A.E.; Antoniadou, I.; Worden, K. A Performance Monitoring Approach for the Novel Lillgrund Offshore Wind Farm. IEEE Trans. Ind. Electron. 2015, 62, 6636–6644. [Google Scholar] [CrossRef] [Green Version]
Krishnakumar, P.; Rameshkumar, K.; Ramachandran, K. Tool Wear Condition Prediction Using Vibration Signals in High Speed Machining (HSM) of Titanium (Ti-6Al-4V) Alloy. Procedia Comput. Sci. 2015, 50, 270–275. [Google Scholar] [CrossRef] [Green Version]
Sezer, E.; Romero, D.; Guedea, F.; Macchi, M.; Emmanouilidis, C. An Industry 4.0-Enabled Low Cost Predictive Maintenance Approach for SMEs. In Proceedings of the 2018 IEEE International Conference on Engineering, Technology and Innovation (ICE/ITMC), Stuttgart, Germany, 17–20 June 2018; pp. 1–8. [Google Scholar] [CrossRef]
Kanawaday, A.; Sane, A. Machine learning for predictive maintenance of industrial machines using IoT sensor data. In Proceedings of the 2017 8th IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 24–26 November 2017; pp. 87–90. [Google Scholar] [CrossRef]
Luo, B.; Wang, H.; Liu, H.; Li, B.; Peng, F. Early Fault Detection of Machine Tools Based on Deep Learning and Dynamic Identification. IEEE Trans. Ind. Electron. 2019, 66, 509–518. [Google Scholar] [CrossRef]
Amruthnath, N.; Gupta, T. A research study on unsupervised machine learning algorithms for early fault detection in predictive maintenance. In Proceedings of the 2018 5th International Conference on Industrial Engineering and Applications (ICIEA), Singapore, 26–28 April 2018; pp. 355–361. [Google Scholar] [CrossRef]
Praveenkumar, T.; Saimurugan, M.; Krishnakumar, P.; Ramachandran, K. Fault Diagnosis of Automobile Gearbox Based on Machine Learning Techniques. Procedia Eng. 2014, 97, 2092–2098. [Google Scholar] [CrossRef] [Green Version]
Ben Ali, J.; Saidi, L.; Harrath, S.; Bechhoefer, E.; Benbouzid, M. Online automatic diagnosis of wind turbine bearings progressive degradations under real experimental conditions based on unsupervised machine learning. Appl. Acoust. 2018, 132, 167–181. [Google Scholar] [CrossRef]
Shao, S.; Sun, W.; Wang, P.; Gao, R.X.; Yan, R. Learning features from vibration signals for induction motor fault diagnosis. In Proceedings of the 2016 International Symposium on Flexible Automation (ISFA), Cleveland, OH, USA, 1–3 August 2016; pp. 71–76. [Google Scholar] [CrossRef]
Fan, S.K.S.; Hsu, C.Y.; Tsai, D.M.; He, F.; Cheng, C.C. Data-Driven Approach for Fault Detection and Diagnostic in Semiconductor Manufacturing. IEEE Trans. Autom. Sci. Eng. 2020, 17, 1925–1936. [Google Scholar] [CrossRef]
Gangadhar, N.; Kumar, H.; Narendranath, S.; Sugumaran, V. Fault Diagnosis of Single Point Cutting Tool through Vibration Signal Using Decision Tree Algorithm. Procedia Mater. Sci. 2014, 5, 1434–1441. [Google Scholar] [CrossRef] [Green Version]
Ruiz-Sarmiento, J.R.; Monroy, J.; Moreno, F.A.; Galindo, C.; Bonelo, J.M.; Gonzalez-Jimenez, J. A predictive model for the maintenance of industrial machinery in the context of industry 4.0. Eng. Appl. Artif. Intell. 2020, 87, 103289. [Google Scholar] [CrossRef]
Amruthnath, N.; Gupta, T. Fault class prediction in unsupervised learning using model-based clustering approach. In Proceedings of the 2018 International Conference on Information and Computer Technologies (ICICT), DeKalb, IL, USA, 23–25 March 2018; pp. 5–12. [Google Scholar] [CrossRef]
Candanedo, I.; Nieves, E.; González, S.; Martín, M.; Briones, A. Machine learning predictive model for industry 4.0. Commun. Comput. Inf. Sci. 2018, 877, 501–510. [Google Scholar] [CrossRef]
Strauß, P.; Schmitz, M.; Wöstmann, R.; Deuse, J. Enabling of Predictive Maintenance in the Brownfield through Low-Cost Sensors, an IIoT-Architecture and Machine Learning. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; pp. 1474–1483. [Google Scholar] [CrossRef]
Zabiński, T.; Maoczka, T.; Kluska, J.; Madera, M.; Sȩp, J. Condition monitoring in Industry 4.0 production systems—The idea of computational intelligence methods application. Procedia CIRP 2019, 79, 63–67. [Google Scholar] [CrossRef]
Wieringa, R.; Maiden, N.; Mead, N.; Rolland, C. Requirements engineering paper classification and evaluation criteria: A proposal and a discussion. Requir. Eng. 2006, 11, 102–107.
Grishin, E. Development of intelligent algorithms for the continuous diagnostics and condition monitoring subsystem of the equipment as part of the process control system of a stainless steel pipe production enterprise. In IOP Conference Series: Materials Science and Engineering; IOP Publishing Ltd.: Bristol, UK, 2020; Volume 939.
Neu, D.A.; Lahann, J.; Fettke, P. A systematic literature review on state-of-the-art deep learning methods for process prediction. Artif. Intell. Rev. 2021, 55, 801–827.
Carvalho, T.P.; Soares, F.A.; Vita, R.; Francisco, R.d.P.; Basto, J.P.; Alcalá, S.G. A systematic literature review of machine learning methods applied to predictive maintenance. Comput. Ind. Eng. 2019, 137, 106024.
Tessaro, I.; Mariani, V.C.; Coelho, L.d.S. Machine learning models applied to predictive maintenance in automotive engine components. Multidiscip. Digit. Publ. Inst. Proc. 2020, 64, 26.
Douglas, P.K.; Harris, S.; Yuille, A.; Cohen, M.S. Performance comparison of machine learning algorithms and number of independent components used in fMRI decoding of belief vs. disbelief. Neuroimage 2011, 56, 544–553.
Zhang, W.; Yang, D.; Wang, H. Data-driven methods for predictive maintenance of industrial equipment: A survey. IEEE Syst. J. 2019, 13, 2213–2227.
Saidi, L.; Ali, J.B.; Fnaiech, F. Application of higher order spectral features and support vector machines for bearing faults classification. ISA Trans. 2015, 54, 193–206.
Sutrisno, E.; Oh, H.; Vasan, A.S.S.; Pecht, M. Estimation of remaining useful life of ball bearings using data driven methodologies. In Proceedings of the 2012 IEEE Conference on Prognostics and Health Management, Denver, CO, USA, 18–21 June 2012; pp. 1–7.
Ali, J.B.; Saidi, L. A new suitable feature selection and regression procedure for lithium-ion battery prognostics. Int. J. Comput. Appl. Technol. 2018, 58, 102–115.
Kiangala, K.S.; Wang, Z. An effective predictive maintenance framework for conveyor motors using dual time-series imaging and convolutional neural network in an industry 4.0 environment. IEEE Access 2020, 8, 121033–121049.
Michau, G.; Hu, Y.; Palmé, T.; Fink, O. Feature learning for fault detection in high-dimensional condition monitoring signals. Proc. Inst. Mech. Eng. Part O J. Risk Reliab. 2020, 234, 104–115.
Guo, L.; Gao, H.; Huang, H.; He, X.; Li, S. Multifeatures fusion and nonlinear dimension reduction for intelligent bearing condition monitoring. Shock Vib. 2016, 2016, 4632562.
Ding, X.; He, Q. Energy-fluctuated multiscale feature learning with deep convnet for intelligent spindle bearing fault diagnosis. IEEE Trans. Instrum. Meas. 2017, 66, 1926–1935.
Guo, X.; Chen, L.; Shen, C. Hierarchical adaptive deep convolution neural network and its application to bearing fault diagnosis. Measurement 2016, 93, 490–502.
Wen, L.; Li, X.; Gao, L.; Zhang, Y. A new convolutional neural network-based data-driven fault diagnosis method. IEEE Trans. Ind. Electron. 2017, 65, 5990–5998.
Oh, J.W.; Jeong, J. Convolutional neural network and 2-D image based fault diagnosis of bearing without retraining. In Proceedings of the 2019 3rd International Conference on Compute and Data Analysis, Kahului, HI, USA, 14–16 March 2019; pp. 134–138.
Shao, S.; McAleer, S.; Yan, R.; Baldi, P. Highly accurate machine fault diagnosis using deep transfer learning. IEEE Trans. Ind. Inform. 2018, 15, 2446–2455.
Wen, L.; Li, X.; Gao, L. A transfer convolutional neural network for fault diagnosis based on ResNet-50. Neural Comput. Appl. 2020, 32, 6111–6124.
Zhao, M.; Zhong, S.; Fu, X.; Tang, B.; Dong, S.; Pecht, M. Deep residual networks with adaptively parametric rectifier linear units for fault diagnosis. IEEE Trans. Ind. Electron. 2020, 68, 2587–2597.
Marei, M.; El Zaatari, S.; Li, W. Transfer learning enabled convolutional neural networks for estimating health state of cutting tools. Robot. Comput.-Integr. Manuf. 2021, 71, 102145.
Han, T.; Liu, C.; Yang, W.; Jiang, D. Deep transfer network with joint distribution adaptation: A new intelligent fault diagnosis framework for industry application. ISA Trans. 2020, 97, 269–281.
Agogino, A.; Goebel, K. NASA Ames Prognostics Data Repository. 2007. Available online: https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/ (accessed on 27 August 2021).
Saxena, A.; Goebel, K. Turbofan Engine Degradation Simulation Data Set. 2008. Available online: https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/#turbofan (accessed on 3 September 2021).
Jia, L.; Zhao, Q.; Tong, L. Retail pricing for stochastic demand with unknown parameters: An online machine learning approach. In Proceedings of the 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 2–4 October 2013; pp. 1353–1358.
van Otterlo, M.; Wiering, M. Reinforcement Learning and Markov Decision Processes. In Reinforcement Learning: State-of-the-Art; Wiering, M., van Otterlo, M., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 3–42.
Smart, W.D.; Kaelbling, L.P. Practical reinforcement learning in continuous spaces. In Proceedings of the ICML ’00: Proceedings of the Seventeenth International Conference on Machine Learning, San Francisco, CA, USA, 9 June–2 July 2000; pp. 903–910.
Wu, J.; He, H.; Peng, J.; Li, Y.; Li, Z. Continuous reinforcement learning of energy management with deep Q network for a power split hybrid electric bus. Appl. Energy 2018, 222, 799–811.
Dominey, P.F. Complex sensory-motor sequence learning based on recurrent state representation and reinforcement learning. Biol. Cybern. 1995, 73, 265–274.
Oliff, H.; Liu, Y.; Kumar, M.; Williams, M.; Ryan, M. Reinforcement learning for facilitating human-robot-interaction in manufacturing. J. Manuf. Syst. 2020, 56, 326–340.
Park, I.B.; Huh, J.; Kim, J.; Park, J. A Reinforcement Learning Approach to Robust Scheduling of Semiconductor Manufacturing Facilities. IEEE Trans. Autom. Sci. Eng. 2020, 17, 1420–1431.

Figure 1. Process steps and outcomes proposed by [11].

Figure 2. Keywording process approach proposed by [11].

Figure 3. Keyword distribution.

Figure 4. Mapping process.

Figure 5. Number of publications over time (before and after exclusion).

Figure 6. Comparison of number of publications and population.

Figure 7. Relative frequency of techniques applied.

Figure 8. Techniques and top three algorithms.

Figure 9. Distribution of online and offline ML (number of papers).

Figure 10. Algorithms used over time.

Figure 11. Applications set against research types and techniques in the form of a bubble chart (numbers in bubbles: number of publications).

Table 1. Number of search results.

Database	Search Results	Search Results Since 2011
IEEE Xplore	219	205
ScienceDirect	94	91
Scopus	506	453
Sum	819	749

Table 2. Inclusion and exclusion criteria.

Inclusion	published in 2011 or later
	written in English
	available/accessible online
Exclusion	purely medical issues
	purely network issues
	issues not belonging to the manufacturing sector
	issues not belonging to CM or PdM
	other systematic mapping studies
	systematic literature reviews

Table 3. Geographical distribution of included papers.

Geographical Provenance	Number of Papers	Proportion
Canada	5	1.97%
China	29	11.42%
Germany	29	11.42%
Greece	6	2.36%
India	31	12.20%
Italy	9	3.54%
Singapore	8	3.15%
Spain	10	3.94%
Sweden	5	1.97%
United Kingdom	16	6.30%
USA	20	7.87%
Countries with 1 paper each ^a	15	5.91%
Countries with 2 papers each ^b	18	7.09%
Countries with 3 papers each ^c	12	4.72%
Countries with 4 papers each ^d	28	11.02%
No Information	13	5.12%
Total	254	100%

^a Brazil, Croatia, France, Iran, Israel, Jordan, Morocco, New Zealand, Norway, Qatar, Saudi Arabia, Slovakia, Sri Lanka, Netherlands, Tunisia; ^b Denmark, Ireland, Japan, Luxembourg, Mexico, Pakistan, Portugal, South Africa, Switzerland; ^c Australia, Malaysia, Poland, Turkey; ^d Algeria, Austria, Belgium, Indonesia, Russia, South Korea, Taiwan.

Table 4. Techniques and top three algorithms.

Technique	Algorithm	No. of Publications	Sum
Classification	Support vector machine	48	119
	k-nearest neighbor	20
	C4.5 decision tree	14
	Other	37
Neural Nets and Deep Learning	Multi-layered perceptron	23	104
	Long short-term memory	17
	Convolutional neural network	15
	Other	49
Ensemble Methods	Random forest	34	63
	Gradient boosting machine	8
	AdaBoost	4
	Isolation forest	4
	Other	13
Regression	Support vector regression	13	55
	Logistic regression	9
	Linear regression	7
	Other	26
Dimensionality Reduction	Principal component analysis	25	39
	Linear discriminant analysis	4
	Multidimensional analysis	2
	Other	8
Clustering	k-means clustering	14	38
	Gaussian mixture model	9
	Agglomerative clustering	3
	DBSCAN	3
	Other	9
Transfer Learning			5
Reinforcement Learning	Deep Q network	1	3
	Double deep Q-learning	1
	Multi-objective reinforcement	1
		Total	426

Table 5. The respective five most cited publications in each field of application.

Application	Most Cited Publications
CM	[24,25,30,31,32]
PdM	[26,33,34,35,36]
RUL Estimation/Prediction	[37,38,39,40,41]
Classification/Identification	[29,42,43,44,45]
Prediction	[27,28,46,47,48]
Fault Detection	[33,49,50,51,52]
Fault Diagnosis	[45,53,54,55,56]
Failure Prediction	[28,39,49,57,58]
Anomaly Detection	[35,44,59,60,61]

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

