Review

Advances in Machine Learning for Sensing and Condition Monitoring

1 International Association of Engineers, Unit 1 1/F, Kowloon, Hong Kong
2 Centre for Efficiency and Performance Engineering, The University of Huddersfield, Huddersfield HD1 3DH, UK
3 Department of Mechanical Engineering, Politecnico di Milano, 20156 Milano, Italy
4 Department of Mechanical and Industrial Engineering, University of Brescia, 25123 Brescia, Italy
* Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(23), 12392; https://doi.org/10.3390/app122312392
Submission received: 7 November 2022 / Revised: 28 November 2022 / Accepted: 30 November 2022 / Published: 3 December 2022
(This article belongs to the Special Issue Applied Artificial Intelligence (AI))

Abstract

In order to overcome the complexities encountered in sensing devices with data collection, transmission, storage and analysis toward condition monitoring, estimation and control system purposes, machine learning algorithms have gained popularity for analyzing and interpreting big sensory data in modern industry. This paper puts forward a comprehensive survey of advances in machine learning algorithms and their most recent applications in the sensing and condition monitoring fields. Current case studies of developing tailor-made data mining and deep learning algorithms from practical aspects are carefully selected and discussed. The characteristics and contributions of these algorithms to the sensing and monitoring fields are elaborated.

1. Brief Introduction

Machine learning algorithms can be very useful for knowledge discovery [1], building models from training data. The knowledge discovery process of machine learning algorithms usually involves feedback at each iteration so that further improvement can be achieved. While the feedback can be provided by humans, this is time consuming and labor intensive. Data mining algorithms automate the feedback process to overcome the disadvantages of manual feedback, with the goal of discovering unknown features in the data, whereas machine learning usually needs known features learned in the training process for prediction. Machine learning includes, for example, supervised learning such as classification and regression, and unsupervised learning such as clustering and dimensionality reduction. Clustering algorithms group data without any pre-defined classes and can be employed to extract valuable information from datasets [2]. Other machine learning approaches include semi-supervised learning, reinforcement learning, self-learning, robot learning and association rule learning, which are not covered in this review of sensor applications. Deep learning refers to machine learning algorithms with multi-layer structures that extract higher-level features from the input dataset.
This review paper is organized as follows. Section 2 reviews supervised machine learning; Section 3 reviews unsupervised machine learning, specifically clustering; and Section 4 reviews deep learning.

2. Supervised Machine Learning

In order to build robust learning systems, in many cases only the relevant features of the dataset are needed. This selection process is called feature selection. Finding an optimal feature subset involves an exhaustive search of all possible feature combinations; this is an NP-hard problem and becomes computationally intractable for large datasets. To overcome the computational difficulties, greedy algorithms are constructed, as sketched below. The benefits of feature selection include overcoming the difficulties associated with high dimensionality, improving algorithm speed, and better generalization. In other words, feature selection enables the machine learning process to focus only on the important features [3,4].
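Below is a minimal sketch of such a greedy (forward) feature selection loop, assuming scikit-learn and a synthetic dataset in place of real sensor features; the classifier and the stopping rule are illustrative choices, not those of any cited study.

```python
# A minimal sketch of greedy forward feature selection; the synthetic
# dataset stands in for real sensor features.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=10, n_informative=4,
                           random_state=0)

selected, remaining = [], list(range(X.shape[1]))
best_score = 0.0
while remaining:
    # Try adding each remaining feature and keep the best improvement.
    scores = [(cross_val_score(LogisticRegression(max_iter=1000),
                               X[:, selected + [f]], y, cv=5).mean(), f)
              for f in remaining]
    score, feat = max(scores)
    if score <= best_score:          # stop when no feature helps further
        break
    best_score = score
    selected.append(feat)
    remaining.remove(feat)

print("selected features:", selected, "cv accuracy: %.3f" % best_score)
```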
Fu and Gao et al. [5] focused on fault diagnosis and classification for actuators and sensors in turbines. They employed the fast Fourier transform and principal component analysis to develop data-driven fault diagnosis and fault classification strategies.
Chu and Li et al. [6] used principal component analysis to extract the features from a sensor array comprising four gas sensors for detecting 11 types of mixtures of NO2 and CO. The extracted features were then processed by c-means clustering and a back-propagation neural network (BPNN) to identify gases.
Video surveillance systems benefit from human activity recognition tools. Ince et al. [7] presented a novel biometric system that detects human activities using the Haar wavelet transform (HWT), a highly effective tool for time-series data processing, to preserve feature information before reducing the data dimension. This biometric system used angles between skeletal joints to recognize human activities in 3D space based on RGB-depth sensor data. Dimension reduction was achieved with an averaging algorithm to decrease the computational cost, yielding faster performance while maintaining high accuracy.
Yang and Chen et al. [8] utilized time-series datasets obtained from sensors for sensor classification. Three transformation methods were employed to translate the data into images. The proposed framework succeeded in encoding the source data into the desired images, with a convolutional network employed in the classification process.
The problem of detecting ships can be challenging and may involve the analysis of images from remote sensors. Nie and Han et al. [9] utilized the Fourier transform to build a detection algorithm for locating meaningful regions. This approach was shown to be helpful for the subsequent discrimination process for panchromatic images.
New advances in sensors generate more and more datasets that may need new algorithms to handle them efficiently. Liu and Kong et al. [10] studied the additive manufacturing process with online sensors. Feature extraction methods were employed in the monitoring process. Their proposed learning approach, called MKML-ISOMAP, was deployed for handling online high-dimensional data produced by sensors. Experimental results showed that the approach achieved high prediction accuracy efficiently.
Machine learning methods can extract meaningful and valuable outputs from the patterns in the data. Machine learning includes methods based on statistical analysis, mathematical modeling, control theory and computational intelligence. Inductive logic, evolutionary computing, artificial neural networks, the Bayesian approach, and Markov chains are only a few examples. Despite their diverse backgrounds, these approaches usually share the following common procedures [1]. Firstly, a comparison engine checks the input data against the underlying model. Secondly, the results from the comparison engine are used to assign modifications to the underlying model. Thirdly, the new results from the modified model are evaluated against pre-defined conditions. If these conditions are not satisfied, the three procedures are iterated until the conditions are met. Recently, the authors in [11] proposed an integration of RBF neural networks and a passivity control framework based on sliding mode theory for offshore dock cranes, modeled as non-linear systems. Adaptive control theory was then utilized to achieve finite-time convergence of the sliding mode dynamics under disturbances and non-linearities.
Mathematical equations can be employed to find relationships between variables and are helpful in investigating the effects of the variables on the target subjects. A simple linear regression model is suitable for applications with continuous variables in a linear relationship. Such linear models may not perform well for applications with a binary target variable, as their underlying assumptions differ. The logistic transformation and logistic regression models are often employed for this type of application.
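As a brief illustration, the following sketch fits a logistic regression to a hypothetical binary target, assuming scikit-learn; the flex-sensor values and the 0.5 threshold that generates the labels are invented for the example.

```python
# A minimal sketch contrasting linear and logistic regression on a binary
# target; the sensor readings here are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
flex = rng.uniform(0, 1, size=(200, 1))       # hypothetical flex-sensor values
gesture = (flex[:, 0] > 0.5).astype(int)      # hypothetical binary gesture label

model = LogisticRegression().fit(flex, gesture)
# The predicted probability p = 1 / (1 + exp(-(w*x + b))) stays in [0, 1],
# which a plain linear model does not guarantee.
print(model.predict_proba([[0.7]])[0, 1])
```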
In gesture recognition and communication, sensors such as flex sensors and accelerometers are attached to a glove, and machine learning models predict the gesture from the sensor values. Krishnan and Vijay et al. [12] used logistic regression models to classify gestures from the values of these sensors.
Li and Cock [13] focused on detecting the cognitive load of a user from readings obtained by smart wrist-band sensors. Feature selection was employed, and the machine learning algorithms used included logistic regression, a decision tree model, and support vector machines.
A decision tree can be employed to form smaller, more homogeneous sets for a particular target variable. Decision rules are defined to split the records in the original dataset into smaller sets. Many classification and prediction problems can be handled satisfactorily by decision tree algorithms [14]. The underlying procedure of the various decision tree models is as follows: the data records are repeatedly split into smaller subsets, with the objective of achieving greater purity in the newly formed subsets than in their ancestors. The performance of a split is measured by the degree of purity it obtains. Measures such as Gini impurity, information gain, or chi-square can be used for applications with a categorical target variable; measures such as variance reduction and the F-test are used for numerically continuous target variables.
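The Gini measure mentioned above can be illustrated in a few lines of code; the class labels and the candidate split are made up for the example.

```python
# A minimal sketch of the Gini impurity measure used to score a split.
import numpy as np

def gini(labels):
    """Gini impurity of a set of class labels: 1 - sum(p_k^2)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

parent = np.array([0, 0, 0, 1, 1, 1, 1, 1])
left, right = parent[:3], parent[3:]            # a candidate split
weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(parent)
print("parent impurity %.3f -> weighted child impurity %.3f"
      % (gini(parent), weighted))
```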
How to recognize new types of attacks in intrusion detection systems is an important topic for the security of wireless sensor networks. Nancy and Muthurajkumar et al. [15] presented a new intrusion detection system employing a decision tree classification algorithm to find attacks. Their proposed fuzzy temporal decision tree algorithm was integrated with convolutional neural networks for locating intruders. The experimental results showed satisfactory detection performance and efficiency.
The cleaning of rice is an important function of a combine harvester. Chen and Lian et al. [16] developed sensors for checking rice grain impurity in harvesters. High-quality images are recorded during harvesting, and the morphological features of the particles extracted from the images serve as inputs to the decision tree model for the subsequent classification. The output in their application is a visualized tree, which is useful for classifying the particles labeled in the binary image.
Inductive-based learning refers to learning from instances: the system tries to induce general rules from the input examples [17]. In inductive methods, relational learners are employed to achieve a partial ordering among the hypotheses concerned.
Problems of missing data values are common in sensor applications. Elhassan and Abu-Soud et al. [18] developed an inductive learning algorithm for dealing with the missing-values problem. They focused on enhancing an existing inductive learning algorithm to deal with datasets with missing values and presented a new algorithm with the added ability to deal with noisy data.
Enormous amounts of spatial data are generated from remote sensing, geographical information systems, computer cartography, etc. Mihai and Mocanu [19] focused on spatial data mining with the decision tree classifier algorithm. Information theory and an inductive learning method were used to construct a decision tree, which can in turn extract relevant relationships in a set of labeled input data.
Artificial neural networks (ANNs) are well known for their high performance in tasks involving filtering and prediction. This process usually involves filtering noise from the source dataset and predicting based on the filtered dataset. Filtering of the noise can also refer to the extraction of essential patterns from the input. Under some function approximation assumptions, the filtered data can be used to predict future values of the target variables. The artificial neural network is a popular advanced tool because of its proven robustness. Feedforward networks are networks with more than one neuron but no feedback paths in their structure. Multi-layer feedforward networks have an input layer, one or more hidden layers and an output layer.
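A minimal sketch of a multi-layer feedforward pass, written directly in NumPy with random (untrained) weights, makes the layer structure concrete; the layer sizes and the four "sensor readings" are illustrative assumptions.

```python
# A minimal sketch of a multi-layer feedforward network (one hidden layer);
# weights are random placeholders rather than trained values.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # input layer -> hidden layer
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)   # hidden layer -> output layer

def forward(x):
    h = np.tanh(x @ W1 + b1)                    # hidden activations
    return h @ W2 + b2                          # output scores (no feedback paths)

x = rng.normal(size=4)                          # e.g., four sensor readings
print(forward(x))
```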
Machine vision technology captures visual information with frame-based cameras, etc., converts the images into a digital format and processes them afterwards using machine learning algorithms. Mennel and Symonowicz et al. [20] showed an image sensor with a built-in ANN that can sense and process optical images at the same time without latency. The sensor can classify and encode images optically projected onto the chip at a rate of 20 million bins per second.
Falls in the elderly can have serious consequences and are a major public health concern. Wearable inertial sensors (accelerometers and gyroscopes) can generate large datasets of various falls and activities of daily living (ADL). Yu and Qiu et al. [21] deployed three deep learning models for detecting falls from the large dataset obtained by wearable inertial sensors: a convolutional neural network, long short-term memory, and a hybrid model integrating both. The prediction of a fall during its descent may enable a safety mechanism that prevents fall-related injuries. Chen and Zhang et al. [22] proposed a cuffless blood pressure estimation framework using a CNN-based Receptive Field Parallel Attention Shrinkage Network, capturing the long-term dynamics in the photoplethysmography signal without long short-term memory.
A multi-layer perceptron neural network was employed in [23] to identify the working condition of a mechanical indexing system, using data acquired by accelerometers, with the aim of preventing the onset of vibratory phenomena or failures. The extraction of features from the raw data represents a very important phase of the diagnostic process, allowing the dimensionality of the problem, and therefore of the networks, to be reduced. Different features used in [24], based on the power spectral density, the Fourier transform (FT), wavelets, the probability density function, and higher-order spectra (HOS), were compared for a case study of an indexed rotating table. The study showed that all the considered pre-processing techniques permitted acceptable classifications, but two of them (the FT and the HOS) gave better results.
Online fault detection of aircraft becomes possible with advances in actuator and sensor technologies. Taimoor and Aijun et al. [25] increased fault detection capabilities by employing the extended Kalman filter to update the weighting parameters of a multi-layer perceptron (MLP) neural network. With online adaptation of the MLP weighting parameters, the precision of the fault detection was found to increase.
In 1965, Lotfi Zadeh proposed the concept of fuzzy logic, a multi-valued logic that can handle approximate reasoning (Zadeh et al. [26]). The truth of a statement is no longer limited to the two traditional values "true" and "false"; in fuzzy logic, the degree of truth takes any value in the interval from zero to one. Fuzzy systems may face problems such as how to define the fuzzy operators for real-world applications.
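A small sketch illustrates degrees of truth; the triangular membership functions for "cold" and "hot" and the use of min() as the fuzzy AND are common conventions, used here as illustrative assumptions.

```python
# A minimal sketch of fuzzy membership: a temperature reading belongs to
# the sets "cold" and "hot" to degrees in [0, 1] rather than true/false.
import numpy as np

def mu_cold(t):
    return float(np.clip((20.0 - t) / 20.0, 0.0, 1.0))

def mu_hot(t):
    return float(np.clip((t - 20.0) / 20.0, 0.0, 1.0))

t = 26.0
print("cold: %.2f, hot: %.2f" % (mu_cold(t), mu_hot(t)))
# A fuzzy AND is commonly taken as min(), a fuzzy OR as max().
print("cold AND hot:", min(mu_cold(t), mu_hot(t)))
```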
Cooperative cargo transportation studies the management of unmanned aerial systems utilizing information obtained by sensors. Teixeira and Neves-Jr. et al. [27] presented a fuzzy model to prevent the drones from colliding with each other or with other objects. A new approach was developed to evaluate potential fields with a fuzziness measure for collision avoidance, and four intelligent controllers were employed to monitor the motion of the drones.
A portable, wearable gait analysis system with signals obtained from pressure sensors can be used for accurate gait phase recognition and gait cycle segmentation. Yang and Gao et al. [28] applied fuzzy logic inference to achieve continuous and smooth gait phase recognition. Gait cycle segmentation was then performed using the gait phases, fully considering the internal differences among different people.
The evolution of biological species is the inspiration for the development of evolutionary computing (Jong [29]). Evolutionary computational algorithms have iterative procedures concerning the growth or shrinkage of a population. In each iteration, the population is updated, partly at random, with the objective of moving closer to the desired result, as sketched below. Metaheuristic optimization methods such as genetic algorithms, evolution strategies, ant colony optimization and particle swarm optimization are popular forms of evolutionary computing.
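The iterative population-based procedure can be sketched as a toy genetic algorithm; the fitness function, population size, selection scheme and mutation rate below are illustrative assumptions.

```python
# A minimal sketch of a genetic algorithm maximizing a toy fitness function.
import numpy as np

rng = np.random.default_rng(0)
fitness = lambda x: -np.sum((x - 0.7) ** 2, axis=1)   # peak at x = 0.7

pop = rng.uniform(0, 1, size=(30, 5))                 # random initial population
for generation in range(50):
    f = fitness(pop)
    parents = pop[np.argsort(f)[-10:]]                # select the fittest
    # Crossover: average random pairs of parents; then mutate slightly.
    idx = rng.integers(0, 10, size=(30, 2))
    pop = (parents[idx[:, 0]] + parents[idx[:, 1]]) / 2
    pop += rng.normal(0, 0.05, size=pop.shape)

best = pop[np.argmax(fitness(pop))]
print("best individual:", np.round(best, 2))
```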
During the deployment of wireless sensor networks (WSNs), clustering and routing are two major issues that need to be addressed, and both problems are NP-hard. Kuila et al. [30] employed genetic algorithms, particle swarm optimization and differential evolution for solving clustering and routing problems in WSNs, highlighting comparisons as well as the strengths and weaknesses of the algorithms.
In mobile wireless sensor networks (MWSNs), the sensor nodes are movable within a certain area. It becomes increasingly important to prolong the lifetime of the sensors for real-time and effective information. Zhang et al. [31] employed five evolutionary computing algorithms to achieve an MWSN lifetime optimization model.
Computational learning methods focus on utilizing induction to understand the common methodology among efficient learning algorithms and to identify the obstacles to effective learning (Kearns and Vazirani [32]). Mathematical analysis is often needed. There are learning algorithms that can forecast based on the values of past events, and algorithms that can improve with advice from experts or teachers. When an algorithm can finish in polynomial time, it is called feasible. Probably approximately correct learning, Vapnik–Chervonenkis theory, Bayesian inference, and algorithmic learning theory are common efficient methods of computational learning theory.
Nowadays, many manufacturing systems achieve monitoring jobs with the help of appropriate sensors. Transferring the industrial input data from sensors to knowledge-based automatic execution without human intervention can be challenging. Kozłowski et al. [33] developed a new approach to determine the remaining useful life of machine tools at an early stage and to classify the conditions of the machine tools. It utilized the support vector machine for classifying the machine tool conditions, with autoregressive integrated moving average-based identification employed to act as an expert during normal operation.
Remote sensing image captioning is about producing natural semantic descriptions of remotely sensed images. Shen et al. [34] developed a two-stage multi-task learning model for accomplishing this task. The proposed transformer generated the text describing an image from its spatial and semantic attributes, and the sentence descriptions were further improved with reinforcement learning.
In order to further enhance the prediction capability of individual machine learning methods, ensemble modeling is often employed for applications involving both forecasting and classification. Many experimental results support that ensemble modeling can further improve the forecasting capability of the individual models in the whole system (Opitz and Maclin [35]). Even though previous research supports that the performance of an ensemble can be better than that of its individual components, it has also been highlighted that the ensemble model works better when its individual components are chosen carefully for high prediction accuracy.
A simple example of an ensemble model is the combination of individual machine learning methods with linear weightings, as sketched below. Studies by Maqsood et al. [36] showed that with an ensemble model the system could achieve better prediction capability than any of its individual algorithms.
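A minimal sketch of such a linearly weighted ensemble, assuming scikit-learn and synthetic data, follows; the fixed weights are assumptions, whereas in practice they can be fitted on validation data.

```python
# A minimal sketch of an ensemble formed by linearly weighting the
# predictions of individual models.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))
y = X[:, 0] + np.sin(3 * X[:, 1]) + rng.normal(0, 0.1, 200)

models = [LinearRegression().fit(X, y),
          DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, y)]
weights = np.array([0.4, 0.6])                     # assumed linear weightings

preds = np.stack([m.predict(X) for m in models])   # (n_models, n_samples)
ensemble = weights @ preds                         # weighted combination
print("ensemble MSE: %.4f" % np.mean((ensemble - y) ** 2))
```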
Abubakr et al. [37] proposed a classification method to monitor tool condition failure in machining operations. The input data are signals obtained by sensors monitoring current, vibration and acoustic emission. The random forest method was employed for feature reduction. The authors illustrated that an ensemble of individual methods can further improve performance, and the approach has potential in tool condition monitoring applications.
At AT&T Bell Laboratories, Vapnik and colleagues, Boser and Guyon, initiated the study of support vector machine (SVM) algorithms [38]. The development of SVM algorithms has clearly focused on industrial applications (Smola and Scholkopf [39]). Support vector machines for classification (SVC) and support vector regression (SVR) are the two main types of SVM algorithms. The mathematical properties of SVM algorithms are found to be robust (Brown et al. [40]); the advantages include, for example, sparseness of the solution, flexibility for large feature spaces, and outlier handling capabilities. With these properties, SVM algorithms can handle even large datasets well. Structural risk minimization, with statistics and learning methods, is the foundation of SVM algorithms: minimization of the empirical risk while preventing the "over-fitting" problem is achieved in the structural risk minimization process. SVM algorithms map low-dimensional input data into a much higher-dimensional feature space through kernel functions, and quadratic programming is often employed for solving the global optimization problem involved.
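The following sketch shows support vector classification with an RBF kernel on synthetic two-class data, assuming scikit-learn; note the sparseness of the solution, since only the support vectors determine the decision boundary.

```python
# A minimal sketch of support vector classification with an RBF kernel;
# the synthetic two-moon data stand in for sensor readings.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.15, random_state=0)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
# Only the support vectors determine the decision boundary (sparseness).
print("support vectors: %d of %d samples" % (len(clf.support_), len(X)))
print("training accuracy: %.3f" % clf.score(X, y))
```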
A wireless sensor system was developed by Liu et al. [41] to monitor water quality in real time. The system handles the problem of data transmission delay well and provides robust capability for water quality forecasting. A wireless sensor network with a ZigBee protocol was employed to detect the quality of the water in the basin with the help of various indicators, such as the amounts of nitrogen and phosphorus in the water. SVM algorithms were deployed in this system for the automatic detection of water quality.
Useful information, such as the status of the monitored objects, is obtained by condition monitoring methods. This information can help prevent catastrophic failures. Gómez et al. [42] developed a system for dynamically monitoring the condition of railway axles. Wavelet packet transform energy and a support vector machine diagnosis model were deployed satisfactorily in their proposed system.
A hybrid artificial intelligence system is a system that combines several artificial intelligence methods to work together to achieve a target. Individual methods such as neural networks, evolutionary computing, fuzzy logic, SVM, Bayesian networks and statistical learning are often deployed to form hybrid systems such as hybrid multi-agent models, knowledge-based artificial neural networks, and hybrid optimization algorithms.
A common goal of hybrid artificial intelligence systems is to improve on the performance of the individual methods in the machine learning process. In [43], hybrid neural network regression models were combined with a fuzzy clustering technique, and clustering non-parametric regression models were developed. The neural network regression models worked iteratively with optimal fuzzy membership values for each object, with the goal of minimizing the total error of the neural network regression models. This hybrid system was shown to cope with cases in which the individual methods, i.e., the k-means and fuzzy c-means methods, could not perform satisfactorily.
Mustafa et al. [44] developed a hybrid artificial intelligence system for species recognition and early-stage herb disease detection with computer vision technologies and an electronic nose. The hybrid system employed fuzzy logic, naïve Bayes, an artificial neural network and the SVM algorithm to perform the tasks of species recognition and disease detection. The proposed hybrid technique, combining these machine learning approaches, achieved a recognition and detection rate of almost 99%.
In recent years, emerging technologies in the fields of cloud computing, robotic computing algorithms, wireless sensor networks and communication have helped advance cloud robotics in smart cities. Kumaran et al. [45] developed a cloud robotic system using hybrid artificial intelligence algorithms. The proposed system was shown to perform crowd control in smart cities satisfactorily, and the integrated framework can direct the robots to move efficiently to accomplish various tasks.
Swarm intelligence is useful in optimization problems for finding optimal solutions; the homing behavior of pigeons was one inspiration for its development. Sun et al. [46] proposed a hybrid algorithm combining a transformation technique, an evolutionary computing technique and a swarm intelligence technique. This hybrid system can address the problem of becoming trapped in local optima during optimization. The system was further integrated into the Distance Vector–Hop algorithm to locate the nodes of wireless sensor networks.
A synthetic summary of the advances in supervised machine learning for sensing and condition monitoring is presented in Table 1.

3. Unsupervised Machine Learning (Clustering)

Clustering algorithms focus on searching for similarities in the feature vectors of the data and then grouping similar vectors [47]. Clustering is also known as unsupervised pattern learning, whereas supervised pattern learning requires a training dataset; supervised learning such as classification needs the information obtained from the training dataset to guide the learning process. Clustering algorithms are deployed in many real-world applications in engineering and science.
Clustering algorithms address the problem of how to arrange the data points into clusters optimally. This combinatorial task is NP-hard, and the search for efficient solutions has led to the development of various clustering algorithms. The common goal is to restrict the total number of cluster combinations to be investigated. Hierarchical clustering, partition clustering and spectral clustering are the three main types of clustering methods. Different clustering approaches may lead to different clusters, and the nature of the problem may provide some guidance about which approach to choose.
The determination of similarity between two feature vectors of the data is a challenging task. Employing a suitable measure of similarity is essential for clustering algorithms. The procedure for clustering the vectors, based on the chosen measure, is the next issue in designing clustering algorithms. Different cluster outputs are often obtained with different clustering measures and procedures. For solving real-world tasks efficiently and accurately, opinions from experts may be very helpful.
Agglomerative clustering and divisive clustering are the two types of hierarchical clustering algorithms. Agglomerative clustering algorithms use a bottom-up procedure: each data object is first regarded as an individual cluster, and the data objects are then iteratively merged into larger clusters. Divisive clustering algorithms employ a top-down procedure: the whole set of data objects is first treated as a single cluster, and the large clusters are then iteratively divided into smaller ones. Co-clustering algorithms focus on clustering both the data objects and their features.
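A minimal sketch of the bottom-up (agglomerative) variant, assuming scikit-learn and synthetic 2D points in place of sensor feature vectors:

```python
# A minimal sketch of agglomerative hierarchical clustering.
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=60, centers=3, random_state=0)
# Each point starts as its own cluster; clusters are merged iteratively
# until the requested number of clusters remains.
agg = AgglomerativeClustering(n_clusters=3, linkage="ward").fit(X)
print(agg.labels_[:10])
```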
Centralized entities, for example the cloud or the edge, can enable automated decision making for applications in fields such as the Internet of Things when fed with data from several sensors. Nevertheless, malicious outliers among the data obtained by sensors may affect this automation process. Shukla and Sengupta [48] developed an expandable outlier detection algorithm based on hierarchical clustering together with an artificial neural network. In this system, the hierarchical clustering algorithm ensures the expandability of the outlier detection algorithm across correlated sensors, while the artificial neural network works together with statistical methods to detect outliers in the time series obtained by the sensors.
A biosensor platform can be used for the detection of drug contaminants in hormone drugs and antibiotics. M13 bacteriophage-based colorimetric sensors are able to detect extremely small amounts of target molecules, while further work is needed to enhance their capability of formulating groupings of target molecules. Kim et al. [49] proposed a statistical approach to classify the types of target molecules with high computational performance, even for very large datasets. The proposed method analyzes patterns of color change caused by reactions between the sensors and foreign materials. A hierarchical clustering algorithm is employed to separate the target materials.
A common property of partition clustering algorithms is that all the clusters are estimated at one time. Renowned partition clustering algorithms include k-means and fuzzy c-means. k-means clustering starts with k randomly generated clusters. The center of a cluster is computed as the average of all the data points within that cluster; subsequently, each point is allocated to its closest cluster center, and all the new cluster centers are evaluated again. These steps are iterated until a pre-defined criterion is satisfied. Fuzzy clustering algorithms utilize the concept of fuzzy logic: a data point no longer needs to belong to a single cluster but can instead belong to several clusters to some degree. This is the main difference between the fuzzy c-means and k-means algorithms.
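The k-means iteration described above can be sketched directly in NumPy; the two-blob data and the value of k are illustrative assumptions (a production implementation would also guard against empty clusters).

```python
# A minimal sketch of the k-means iteration.
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])
k = 2
centers = X[rng.choice(len(X), k, replace=False)]    # random initial clusters

for _ in range(20):
    # Assign each point to its closest cluster center.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    # Recompute each center as the mean of its assigned points.
    new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    if np.allclose(new_centers, centers):            # pre-defined criterion
        break
    centers = new_centers

print("final centers:\n", np.round(centers, 2))
```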
Clustering algorithms have been employed in applications involving wireless sensor networks. A common difficulty is that the clustering process may become trapped in local minima, resulting in inaccurate cluster partitions. Kotary and Nanda [50] developed hybrid clustering techniques combined with evolutionary computing so that the global optimum may be obtained. For monitoring outliers, a weighting system was proposed based on the volume and density of the data points; outliers are the points with larger weights.
There are many difficulties in the deployment of wireless underwater sensor networks (WUSNs), such as the high loss rate of transmission power in the data transfer process. Clustering may address this issue by combining wireless sensors into clusters with a local base station one hop away. As the sensor nodes are then close to the local base station, the transmitting power can be reduced significantly. Omeke et al. [51] proposed a novel k-means clustering scheme for local base station selection, which was found to prolong the lifetime of WUSNs. The proposed algorithm can decide the optimal number of clusters in real time, and the experimental results support that it can outperform traditional clustering algorithms by more than 90%.
A common property of spectral clustering algorithms is the reduction of the dimensionality of the source data before measuring the similarity among the data. The Shi–Malik algorithm is a spectral clustering algorithm popular in image segmentation.
Gao and Shi [52] developed a novel clustering algorithm to monitor behavior patterns in ship handling. Ship information is obtained from an array of sensors and then fed, with trajectory data, into the identification system. A sliding window algorithm was employed to extract information from the data given by this sensor system. The trajectories were divided into sub-trajectories, and a spectral clustering algorithm was utilized to cluster the sub-trajectories in order to discover behavior patterns. This method can help in understanding behavior patterns during ship handling and can also increase the efficiency of the learning process for ship route planning, collision avoidance decision making, etc.
Sensors for hyperspectral imaging (HSI) can handle source datasets spanning a wide spectrum of wavelengths. Yet HSI classification can be a challenging task because of the high-dimensional feature space. Sellami et al. [53] developed a new HSI classification method, combining a spectral technique with a deep neural network. The issue of redundancy between spectral groups was addressed with an unsupervised selection algorithm. Spectral-spatial features were extracted from the different groups of selected bands to improve classification accuracy, and a 3D CNN model was applied to associate and fuse each group with the target for a further enhancement.
A synthetic summary of the advances in unsupervised machine learning for sensing and condition monitoring is presented in Table 2.

4. Deep Learning

Deep learning (DL) is a machine learning (ML) framework, developed from traditional neural networks since approximately 2006. Deep learning is based on large deep neural networks (DNNs) and can be referred to as neural networks with a deep structure (Zhao and Zheng et al. [54]). Deep models have outperformed conventional techniques in recent decades and are now a common tool for data representation (Yuan and Shen et al. [55]).
The main advantage of deep learning over traditional ML is the automatic identification of features, learned through a general-purpose learning procedure (LeCun and Bengio et al. [56]). Classic machine learning methods starting from raw data require pre-processing work based on the identification and selection of feature vectors, or of a suitable internal representation, to be provided as input to the neural network. This work is intensive and time-consuming and must be carried out by expert engineers. In contrast, raw data can be supplied directly to a deep neural network: from the composition of a number of levels consisting of simple but non-linear modules, progressive transformations are obtained with gradually increasing levels of abstraction, enabling even very complex functions to be learned. In deep learning for classification purposes, the elements of the inputs that are crucial for discrimination are amplified by higher levels of representation, while irrelevant ones are suppressed.
High-level and abstract features are automatically extracted from a large variety and quantity of data captured from various sources; typical feature extraction methods are unable to obtain similar results (Samaras and Diamantidou et al. [57]). Hinton and Salakhutdinov [58] first demonstrated this superiority. As clearly expressed by LeCun and Bengio et al. [56], deep learning methods are, in short, representation-learning methods with multiple levels of representation. Each layer's representation is computed from the representation in the previous layer; the computation is based on internal parameters that are updated through a back-propagation algorithm. With multiple non-linear layers, even an intricate structure in a large dataset can be discovered. Multi-layer learning allows very high performance in complex function approximation; image, video, speech and audio processing; classification problems; and multi-sensor data aggregation, with extraordinary results in many fields such as speech, visual object and signal recognition, natural language processing, face, object and pedestrian detection, human activity identification, fault diagnosis, drug discovery, genomics, multi-task and transfer learning, and domain adaptation.
Input data quality is fundamental for the good functioning of deep learning; consequently, the tools that enable data acquisition also play a key role. Depending on the field of application and the purpose, the nature of the data and of the sensors/acquisition devices differs. The use of data from sensors/sources of different natures is increasingly frequent, and therefore data fusion, which merges complementary information such as spatial, temporal and spectral resolution data, often occurs. DL models extract abstract features from multiple input streams and can establish robust relationships between dissimilar input signals, uninfluenced by sensor type and spatial scale. Moreover, DL is robust even in cases of missing or corrupted sensor data.
Starting from the classical NNs, such as the back-propagation feedforward NN (BPNN) or the radial basis type known as a generalized regression neural network (GRNN), different DL architectures have been developed to address different kinds of problems. The mainstream deep neural models are the deep belief network (DBN), the convolutional neural network (CNN), the autoencoder (AE), the recurrent NN (RNN) and the long short-term memory network (LSTMN). In the following, these architectures are treated with reference to specific applications, focusing attention on the input data and the types of sensors adopted for dataset creation.
The CNN is among the most commonly used deep learning models for recognition and detection tasks. In order to extract features that are resistant to distortion, CNNs use interconnected network architectures.
A convolutional neural network (CNN) is a deep learning method that can use images as input, assign weight/importance to objects in the images and classify them. For simple applications, a 1D convolutional neural network may be used. More sophisticated classification models, such as CNN-Net, Encoded-Net, and CNN-LSTM, have more complicated architectures, with denser layers and larger kernel sizes than a 1D CNN. Medical care benefits from the automatic prediction of routine human activity. For the purpose of recognizing human activity, Mukherjee et al. [59] created the EnsemConvNet ensemble, which combines CNN-Net, Encoded-Net, and CNN-LSTM classification models. Each model accepts time-series data as a 2D matrix, and the EnsemConvNet classification result is created by combining the classifiers using techniques including majority voting, sum rules, product rules, and score fusion approaches. According to the reported evidence, the EnsemConvNet model outperforms long short-term memory, multi-headed CNN, and CNN hybrid models.
Input data are structured in multiple arrays, which may have different dimensions depending on the signal type: language-based signals and sequences are 1D; pictures and audio spectrograms are 2D; video and volumetric images are 3D. A CNN is a feedforward network, formed in the early stages by a sequence of convolution, pooling and non-linear activation layers and in the final stages by fully connected layers. In convolutional layers, a filtering operation is performed through a feature map in which the units are organized, giving a discrete convolution that identifies local confluences of features from the preceding layer; hence the name given to these layers and, more generally, to the deep model.
Simple features such as texture, lines, and edges are often extracted by the bottom convolution layers, whereas more abstract features are typically extracted by the top layers (Chen and Li et al. [60]). Pooling layers combine semantically related features into a single feature, resulting in more robust feature descriptions as well as down-sampling and dimensionality reduction; these operations include max-pooling, average-pooling, L2-pooling, and local contrast normalization. To improve the CNN's capacity to fit non-linear data, the units in activation layers apply non-linear functions such as the rectified linear unit (ReLU) or the sigmoid. Fully connected layers are located at the outermost level, closest to the output, and serve the purpose of classification. As in BPNNs, the back-propagation algorithm is employed for weight updates.
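A minimal sketch of this layer pattern (convolution, non-linear activation, pooling, then a fully connected classifier), assuming PyTorch and 28 x 28 single-channel inputs; the channel counts and the ten output classes are illustrative assumptions.

```python
# A minimal sketch of the standard CNN layer pattern.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # feature maps from local filters
    nn.ReLU(),                                   # non-linear activation
    nn.MaxPool2d(2),                             # down-sampling / invariance
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                   # fully connected classifier
)

x = torch.randn(8, 1, 28, 28)                    # a batch of placeholder images
print(model(x).shape)                            # torch.Size([8, 10])
```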
Numerous articles discuss the use of CNNs across many disciplines, and a significant number of these works emphasize the importance of sensor selection and sensor-related concerns.
For image recognition, cameras are adopted. The widespread use of cell phones equipped with high-resolution cameras makes available a huge amount of data that can be used for various applications. The combination of mobile phones and deep learning is a promising solution in many fields.
In a framework for indoor localization, Ashraf et al. [61] presented a deep learning-based convolutional neural network (CNN) localization method based on smartphone photos. The CNN is used to distinguish between floors, recognize indoor scenes in a variety of lighting conditions, and improve indoor localization precision. The classification is based on camera images captured at pre-defined collection points using a Samsung Galaxy S8 rear camera. For recognizing scenes, the CNN has a prediction accuracy of 91.04%. The identified scene is then used to reduce the search space in the geomagnetic database used for localization.
Chen and Cao et al. [62] used image processing and a CNN-based technique to determine UV intensity. They created a wearable UV sensor out of PDMS and photochromic material; the sensor changes color when exposed to UV radiation, and images from a cell phone were used to construct the dataset. When a CNN was trained to measure UV intensity, the influence of ambient light was considerably diminished, yielding an identification rate of more than 90% under various ambient light conditions.
Yang et al. [63] presented a comprehensible fuzzy fusion method to combine the outputs of CNN models, assessing the relevance of each classifier by examining the interaction index between classifiers. Additionally, SoftPool and Mish activation functions were added to conventional CNNs to improve their feature extraction capacity. An experimentally collected dataset and an artificially generated faulty bearing dataset were used to evaluate the performance of the suggested model and assess its capacity to extract features.
In the health monitoring of industrial systems, DL is extensively adopted, as DL-based fault diagnosis methods achieve better results than traditional ML methods. Bearing fault detection, classification and localization are problems in which CNNs have obtained very positive performance (Waziralilah and Fathiah et al. [64]).
Niu and Liu et al. [65] proposed a deep residual convolutional neural network (DR-CNN) with gray-scale images obtained from multi-sensor data (multiple 3-axis accelerometers) as input, to address the problem of bearing fault diagnostics with multi-sensor data. A CNN degrades as the network depth reaches a certain level, but in the residual network certain connections skip some of the layers of the CNN structure, making it easier for parameter gradients to propagate from the output layer to the lower levels.
Another area in which accelerometer signals are widely adopted is the biomedical field, for example for human activity recognition (HAR). In Kulchyk and Etemad [66], the authors apply a deep CNN for HAR using a publicly available dataset (Ugulino and Cardador et al. [67]) that contains raw data from four tri-axial wearable accelerometers. The suggested approach is evaluated against other conventional classifiers, such as decision trees, random forest, support vector machines (SVM), and k-nearest neighbors (kNN). A classification accuracy of 100% is achieved, with the great advantage of eliminating the need for pre-processing.
A combination of a CNN and a deep convolutional generative adversarial network (DCG), abbreviated DCG-CNN, is proposed by Sun and Zhao [68] for gas sensor condition monitoring to prevent faults; in this specific case, the gas is hydrogen. A DCG combines a CNN with a generative adversarial network (GAN), whose purpose is to produce fresh data samples with the same statistical properties as the available data, enhancing defect detection precision when imbalanced data samples are present. The method consists of the following steps: the DCG approach is used to construct synthetic 2D grey images of sensor fault signals from 1D hydrogen sensor fault signals; the experimental and synthetic signals are mixed to balance the training dataset; and a CNN is trained and evaluated using the entire dataset.
Deep learning has been found to hold great promise for wireless sensing tasks. There are, however, problems with labor-intensive training, involving the gathering of training samples, and with retraining efforts for trained systems. In order to complete wireless sensing tasks with less training effort, Wang and Gao et al. [69] examined the viability of utilizing deep learning networks. Deep generative adversarial networks (DGANs) were used to provide virtual training samples for the suggested deep learning-based wireless sensing system. A case study of wireless gesture recognition established its efficacy.
Remote sensing is another important area of DL application, with an exponential rise in papers published on the subject in recent years (Zhu and Tuia et al. [70]). Supervised CNNs give optimal performance in the direct classification of hyperspectral images in the spectral domain, as obtained by Hu and Huang et al. [71]. The adopted CNN model has only one convolutional layer, since the authors verified that the typical CNN with two convolutional layers is not applicable to hyperspectral data. According to experimental results based on multiple hyperspectral image datasets, the suggested method produced high classification performance.
CNNs analyze image-based patterns and are ineffective at modeling temporally oriented events. R-CNNs, on the other hand, are particularly well suited to model temporal changes in data. They perform temporal analysis of events in time-sequence applications, such as language and speech recognition, when given sequential inputs. The history of the sequence is stored in a state vector in the hidden units of an R-CNN, which processes one input element at a time. Back-propagation is used to train an R-CNN because the outputs of the hidden neurons at each time step are analogous to the outputs of different neurons in a deep multilayer network.
Uddin and Mehedi et al. [72] use a deep R-CNN in the field of HAR to identify human behaviors (such as sitting, standing, and walking) from data collected by wearable body sensors. In the study, two publicly available datasets, MHEALTH (mobile health) (Banos and Villalonga et al. [73]) and PUC-Rio, are used, as well as the AReM (activity recognition system based on multi-sensor data fusion) dataset gathered by the authors. The suggested method is based on data fusion from many wearable sensors, including an electrocardiogram (ECG), an accelerometer, and a magnetometer. Features are then extracted using kernel principal component analysis (KPCA), and a deep R-CNN is applied to recognize the behavior.
Controlling HMI devices or artificial limbs frequently involves the detection and classification of human movements. According to Wang and Chen et al. [74], an R-CNN is a promising decoder for classifying hand movements based on the combination of complicated time-series EMG signals and acceleration data.
Remote sensing also uses R-CNNs. Arefin and Michalski et al. [75] developed a super-resolution method based on an R-CNN architecture to produce a high-resolution image from a succession of low-resolution satellite photographs.
In order to learn long-term dependencies, Hochreiter and Schmidhuber [76] modified the recurrent neural network (RNN) and created the long short-term memory (LSTM). The LSTM employs a self-feeding loop in its inner layers that can learn time-based correlations, combining knowledge from previous inputs into the analysis of present inputs. Thanks to the combined strength of a CNN and an LSTM, both spatial and temporal information may be extracted from data.
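A minimal sketch of a CNN-LSTM for multichannel time series, assuming PyTorch: a 1D convolution extracts local (spatial) features and an LSTM models the temporal dependencies. All shapes, sizes and the five output classes are illustrative assumptions.

```python
# A minimal sketch of a CNN-LSTM hybrid for time-series classification.
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, channels=3, hidden=32, classes=5):
        super().__init__()
        self.conv = nn.Conv1d(channels, 16, kernel_size=5, padding=2)
        self.lstm = nn.LSTM(input_size=16, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, classes)

    def forward(self, x):                    # x: (batch, channels, time)
        h = torch.relu(self.conv(x))         # local features: (batch, 16, time)
        h, _ = self.lstm(h.transpose(1, 2))  # LSTM over the time dimension
        return self.head(h[:, -1])           # classify from the last time step

x = torch.randn(4, 3, 128)                   # e.g., tri-axial accelerometer windows
print(CNNLSTM()(x).shape)                    # torch.Size([4, 5])
```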
A CNN-LSTM was used by Bilgera et al. [77] to determine the position of a gas source (gas source localization, GSL) in an outdoor environment using a variety of stationary sensors (a sensor network). In the investigation, thirty commercially available metal oxide (MOX) gas sensors and one ultrasonic anemometer were used, and the data from the gas sensor array were arranged in a series of monochrome images to turn GSL into a visual learning problem.
According to Nagrecha et al. [78], a deep CNN-LSTM provides reliable findings for predicting air pollution in the field of earth environmental monitoring. Ground-based pollution sensors are used in the solution, and the sensor data are recast into a modified pseudo-image to enable the usage of deep 1D CNN and LSTM.
Xia and Huang et al. [79] applied an LSTM-CNN model to a HAR problem, using inertial sensor data from a wearable smartphone to identify activities of daily life such as standing, walking, walking downstairs, and going upstairs. The model is made up of a pre-processing phase, a two-layer LSTM to extract temporal features, two convolutional layers with a max-pooling layer to extract spatial features, a global average pooling (GAP) layer, a batch normalization (BN) layer, and an output layer (with a Softmax classifier) that produces a probability distribution over the classes. Three open datasets (UCI, WISDM, and OPPORTUNITY) were used for testing, with overall accuracy ratings of 95.78%, 95.85%, and 92.63%, respectively. The logistic regression-based Softmax classifier uses a cost-minimization method to model multi-class classification problems.
A generative deep learning model called the autoencoder (AE) converts high-dimensional data into low-dimensional feature vectors by using copies of the training data as input, reducing computational complexity. The AE is an unsupervised method for learning data coding, since it uses a feature learning paradigm that directly learns a parametric map from inputs to their representation (Ma and Sun et al. [80]; Lei et al. [81]). The two parts of an AE are the encoder, a feature-extracting function, and the decoder, which maps the feature space back into the input space. The encoder and decoder each consist of an input layer, an output layer, and numerous hidden layers in between.
In order to decrease the reconstruction error, a measurement of the disparity between the inputs and their reconstructions over all training data, a back-propagation technique is employed to modify the encoder and decoder parameters (the weights of the hidden layers). A deep AE provides a data-driven method for learning feature extraction, in an effort to lessen the over-reliance on manually produced features prevalent in conventional machine learning techniques. Different fields use established AE variations.
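A minimal autoencoder sketch, assuming PyTorch: the encoder compresses placeholder "sensor windows" into a four-dimensional code, the decoder reconstructs them, and back-propagation minimizes the reconstruction error. Layer sizes and data are illustrative assumptions.

```python
# A minimal sketch of an autoencoder trained on its own reconstruction error.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(64, 16), nn.ReLU(), nn.Linear(16, 4))
decoder = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 64))

x = torch.randn(32, 64)                        # placeholder sensor windows
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()),
                       lr=1e-3)
for step in range(200):
    recon = decoder(encoder(x))                # compress, then reconstruct
    loss = nn.functional.mse_loss(recon, x)    # reconstruction error
    opt.zero_grad()
    loss.backward()
    opt.step()
print("final reconstruction MSE: %.4f" % loss.item())
```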
By connecting the hidden representations of two single AEs, a deep coupling AE (DCAE) model is created. A DCAE is used to gather the combined information from multi-modal data. Ma et al. developed a DCAE in [80] with the objective of discovering a combined feature between vibration and acoustic data in order to categorize the health state of gears and bearings. For the purpose of combining multimodal signals obtained from several sensors, the model uses a deep learning approach based on the coupled AE, merging multimodal data fusion and feature learning into a single step. Furthermore, by self-teaching the high-level features through greedy layer-wise training, the created deep architecture can effectively extract correlations between vibration and acoustic data.
To increase precision and decrease over-fitting in HAR with smartphone-embedded accelerometer sensor data, Alo and Nweke et al. [82] present a deep sparse AE-based deep learning model. The sparse AE is an unsupervised DL technique that learns an over-complete feature representation from the raw sensor data by adding a sparsity term to the loss function and setting some of the active units to zero. Thanks to the sparsity term, the model can learn stable, linearly separable feature representations that are invariant to displacement, distortion, and change. The features of the sparse AE guarantee effective low-dimensional characteristic extraction from the high-dimensional structure of the input sensor data. Additionally, a complicated activity recognition framework is compactly represented.
The model and training algorithm of the deep belief network (DBN) were proposed by Hinton and Osindero et al. [83]. DBNs use a greedy layer-by-layer learning approach and a hierarchical structure with numerous stacked restricted Boltzmann machines (RBMs), followed by fine-tuning. Every RBM has a visible layer and a hidden layer, each containing a particular number of neurons. Although the RBM's layers are interconnected, the units within each layer are not. With this structure, the values of the hidden neurons can be updated using matrix operations. This approach is suitable for making predictions online because it can speed up training. The learned features serve as input for the following layer. A Softmax classifier is then used to update the network's parameters, and it is also used in the final layer to label each pixel and produce the classification result. RBMs are an efficient method for extracting features for the initialization of a feedforward neural network, and they greatly enhance the network's generalization performance.
A DBN resolves the problems of local optima and extended training periods caused by the random initialization of weight parameters. The convergence time is substantially shorter because the parameter space only calls for a local search.
With information gathered from numerous sensors (a three-component dynamometer, piezo-accelerometers and an acoustic emission sensor), Chen and Jin et al. [84] employed a DBN to forecast tool wear in a high-speed CNC milling machine. The study found that the DBN performed well in terms of speed, accuracy, and stability.
Zhong et al. [85] created a DBN-based technique for multivariate optical sensors and hyperspectral image classification. Their main contribution is the construction of stacks of RBMs; they modified the RBM and the learning algorithm. As part of the pre-training and fine-tuning procedures, data are trained in small batches to optimize the loss function on the validation dataset. Deep features that model several ground-truth classes are extracted from the hyperspectral images. Experiments demonstrate the effectiveness of this generative feature learning for a spatial classifier (SC) or combined spectral-SC (JSSC), demonstrating cutting-edge performance on hyperspectral image classification.
Deep learning also extends to "geometric deep learning". Some current DNN topologies can be seen as graph neural networks (GNNs); in the computer vision domain, for example, CNNs can be thought of as GNNs applied to graphs organized as regular grids of pixels. A GNN allows the processing of data represented in graph domains, e.g., chemical compounds, images, and subsets of the web [86]. Graphs can be cyclic, directed, undirected, or a mixture of these. Social networks, molecular biology, chemistry, citation networks, forecasting of environmental conditions and physics are a few relevant application domains for GNNs.
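A single graph convolution step can be sketched in NumPy as H' = ReLU(A_norm H W), where A_norm is the symmetrically normalized adjacency matrix with self-loops; the tiny three-node graph and the random (untrained) weights are illustrative assumptions.

```python
# A minimal sketch of one graph convolution layer.
import numpy as np

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)              # 3-node chain graph
A_hat = A + np.eye(3)                               # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt            # symmetric normalization

rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))                         # node features
W = rng.normal(size=(4, 2))                         # learnable weights
H_next = np.maximum(0, A_norm @ H @ W)              # one GCN layer (ReLU)
print(H_next.shape)                                 # (3, 2)
```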
In [87], Jiao et al. merged a GCN (graph convolutional network) with a modified LSTM RNN to create a group solar irradiance neural network (GSINN) that captures the graph structure of photovoltaic (PV) panel groups; the role of the LSTM RNN is to capture the temporal correlations. Meteorological data from 17 silicon radiometers of the U.S. National Renewable Energy Laboratory [88] are used to conduct a thorough study. The testing outcome demonstrates the suggested GSINN's higher performance in terms of universality, dependability, and accuracy when compared to existing prediction systems.
Shi and Rajkumar [89] applied a GNN for 3D object detection in a point cloud obtained through Lidar sensors. They propose a single-stage detection method in which a graph is constructed from the point cloud, a GNN with auto-registration refines the vertex features by aggregating features along the edges, and the network outputs (multiple bounding boxes) are merged into a single box with an assigned confidence score.
A CNN and a GCN have been applied by Zhang et al. [90] to extract discriminative features from RNA sequences. They developed a method based on a two-layer CNN and a GCN running in parallel to extract hidden features, followed by a fully connected layer that predicts RNA-binding proteins for the analysis of the essential mechanisms of gene regulation. The use of a spectral GCN in RNA sequence analysis suggests that GCNs are useful for extracting relevant characteristics from RNA sequences.
ML and DL models have also been effectively used to address the intrusion detection challenge for wireless sensor networks. Wen et al. [91] created a redundancy identification system based on a convolutional DBN (CDBN) and a performance evaluation strategy. The improved method deals with the issue of unidentified or insufficient prior samples by using unsupervised learning to extract characteristics from examples of both normal and abnormal behavior. To raise the execution efficiency of the CDBN, a knowledge contraction method was created; this mechanism optimizes the feature datasets and produces a useful classification sample space, improving the classification accuracy of intrusion detection.
Over the last decade, we have witnessed a large number of publications on ML and DL applications in science and engineering. Considering that both shallow and deep learning models are built as black boxes by ML and DL algorithms, respectively, an interpretation mechanism should be applied that allows the ML or DL model results to be interpreted or described, making them more transparent. An early work on explainable ML models is [92]. However, there is a tradeoff between the complexity of deep learning models and their simplification into a form that is interpretable or understandable by humans. On this aspect, the authors in [93] developed gradient-weighted class activation mapping (Grad-CAM) for generating “visual explanations” for decisions from a broad class of CNN-based models.
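A minimal sketch of the Grad-CAM idea is given below: the gradients of the target class score with respect to the last convolutional feature maps are global-average-pooled into channel weights, which then produce a coarse localization heatmap. The model and layer choice (a torchvision ResNet-18 and its layer4 block) and the random stand-in input are illustrative assumptions, not the setup of [93].

```python
# Grad-CAM sketch: capture the last conv feature maps and their gradients,
# pool gradients into channel weights, and form a ReLU-ed weighted sum.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()  # untrained stand-in model (assumption)
feats, grads = {}, {}

def fwd_hook(module, inp, out):
    feats["a"] = out.detach()                              # activations
    out.register_hook(lambda g: grads.update(g=g.detach()))  # their gradients

model.layer4.register_forward_hook(fwd_hook)

x = torch.randn(1, 3, 224, 224, requires_grad=True)  # stand-in "image"
scores = model(x)
scores[0, scores.argmax()].backward()                # target-class score

w = grads["g"].mean(dim=(2, 3), keepdim=True)        # per-channel weights
cam = F.relu((w * feats["a"]).sum(dim=1))            # weighted sum + ReLU
cam = F.interpolate(cam.unsqueeze(1), size=x.shape[2:], mode="bilinear")
print(cam.shape)  # (1, 1, 224, 224): coarse heatmap upsampled to input size
```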
Recently, the authors in [94] provided a comprehensive review of existing machine learning interpretability methods. Four main kinds of interpretability approaches were examined: methods for developing white-box models, methods for explaining complex black-box models, methods for promoting fairness and preventing prejudice, and methods for measuring the sensitivity of model predictions.
A concise summary of the advances in deep learning for sensing and condition monitoring is presented in Table 3.

5. Brief Conclusions

This article has provided an overview and understanding of the impact of machine learning techniques on real-time condition monitoring and sensing technologies. More specifically, various learning algorithms were analyzed with respect to the accuracy and computational complexity challenges that arise in sensor data processing. Different machine learning and deep learning models were then discussed from an application point of view. Some important and still challenging research topics are machine learning for complex sensing networks, integrated machine learning hardware for soft sensing applications, interpretation of machine learning models and a wider range of applications.

Author Contributions

Conceptualization, S.-I.A. and L.G.; formal analysis, S.-I.A., L.G., M.T. and H.R.K.; investigation, S.-I.A., L.G., M.T. and H.R.K.; methodology, S.-I.A., L.G., M.T. and H.R.K.; supervision, L.G.; writing—original draft preparation, S.-I.A., L.G., M.T. and H.R.K.; writing—review and editing, L.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bergeron, B.P. Bioinformatics Computing; Prentice Hall Professional: Hoboken, NJ, USA, 2003. [Google Scholar]
  2. Ao, S.I. Data Mining and Applications in Genomics; Springer: Dordrecht, The Netherlands, 2008. [Google Scholar] [CrossRef]
  3. Liu, H.; Motoda, H. Feature Selection for Knowledge Discovery and Data Mining; Springer Science & Business Media: New York, NY, USA, 2012; Volume 454. [Google Scholar]
  4. Tiboni, M.; Remino, C.; Bussola, R.; Amici, C. A Review on Vibration-Based Condition Monitoring of Rotating Machinery. Appl. Sci. 2022, 12, 972. [Google Scholar] [CrossRef]
  5. Fu, Y.; Gao, Z.; Liu, Y.; Zhang, A.; Yin, X. Actuator and Sensor Fault Classification for Wind Turbine Systems Based on Fast Fourier Transform and Uncorrelated Multi-Linear Principal Component Analysis Techniques. Processes 2020, 8, 1066. [Google Scholar] [CrossRef]
  6. Chu, J.; Li, W.; Yang, X.; Wu, Y.; Wang, D.; Yang, A.; Yuan, H.; Wang, X.; Li, Y.; Rong, M. Identification of gas mixtures via sensor array combining with neural networks. Sens. Actuators B Chem. 2021, 329, 129090. [Google Scholar] [CrossRef]
  7. Ince, Ö.F.; Ince, I.F.; Yıldırım, M.E.; Park, J.S.; Song, J.K.; Yoon, B.W. Human activity recognition with analysis of angles between skeletal joints using a RGB-depth sensor. ETRI J. 2020, 42, 78–89. [Google Scholar] [CrossRef]
  8. Yang, C.-L.; Chen, Z.-X.; Yang, C.-Y. Sensor Classification Using Convolutional Neural Network by Encoding Multivariate Time Series as Two-Dimensional Colored Images. Sensors 2020, 20, 168. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  9. Nie, T.; Han, X.; He, B.; Li, X.; Liu, H.; Bi, G. Ship Detection in Panchromatic Optical Remote Sensing Images Based on Visual Saliency and Multi-Dimensional Feature Description. Remote Sens. 2020, 12, 152. [Google Scholar] [CrossRef] [Green Version]
  10. Liu, C.; Kong, Z.; Babu, S.; Joslin, C.; Ferguson, J. An integrated manifold learning approach for high-dimensional data feature extractions and its applications to online process monitoring of additive manufacturing. IISE Trans. 2021, 53, 1215–1230. [Google Scholar] [CrossRef]
  11. Jiang, B.; Liu, D.; Karimi, H.R.; Li, B. RBF Neural Network Sliding Mode Control for Passification of Nonlinear Time-Varying Delay Systems with Application to Offshore Cranes. Sensors 2022, 22, 5253. [Google Scholar] [CrossRef]
  12. Krishnan, A.; Vijay, A.; Balaji, M.; Sreeja, B.S. Gesture Recognizer and Communicator using Flex Sensors and Accelerometer with Logistic Regression. In Proceedings of the 2020 3rd International Conference on Intelligent Sustainable Systems (ICISS), Thoothukudi, India, 3–5 December 2020; pp. 391–394. [Google Scholar] [CrossRef]
  13. Li, X.; De Cock, M. Cognitive load detection from wrist-band sensors. In Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers; Association for Computing Machinery: New York, NY, USA, 2020; pp. 456–461. [Google Scholar]
  14. Berry, M.J.A.; Linoff, G.S. Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management; John Wiley & Sons: Hoboken, NJ, USA, 2004. [Google Scholar]
  15. Nancy, P.; Muthurajkumar, S.; Ganapathy, S.; Santhosh Kumar, S.V.N.; Selvi, M.; Arputharaj, K. Intrusion detection using dynamic feature selection and fuzzy temporal decision tree classification for wireless sensor networks. IET Commun. 2020, 14, 888–895. [Google Scholar] [CrossRef]
  16. Chen, J.; Lian, Y.; Li, Y. Real-time grain impurity sensing for rice combine harvesters using image processing and decision-tree algorithm. Comput. Electron. Agric. 2020, 175, 105591. [Google Scholar] [CrossRef]
  17. Quinlan, J.R. Learning logical definitions from relations. Mach. Learn. 1990, 5, 239–266. [Google Scholar] [CrossRef] [Green Version]
  18. Elhassan, A.; Abu-Soud, S.M.; Alghanim, F.; Salameh, W. ILA4: Overcoming missing values in machine learning datasets—An inductive learning approach. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 4284–4295. [Google Scholar] [CrossRef]
  19. Mihai, D.; Mocanu, M. Processing GIS Data Using Decision Trees and an Inductive Learning Method. Int. J. Mach. Learn. Comput. 2021, 11, 393–398. [Google Scholar] [CrossRef]
  20. Mennel, L.; Symonowicz, J.; Wachter, S.; Polyushkin, D.K.; Molina-Mendoza, A.J.; Mueller, T. Ultrafast machine vision with 2D material neural network image sensors. Nature 2020, 579, 62–66. [Google Scholar] [CrossRef]
  21. Yu, X.; Qiu, H.; Xiong, S. A novel hybrid deep neural network to predict pre-impact fall for older people based on wearable inertial sensors. Front. Bioeng. Biotechnol. 2020, 8, 63. [Google Scholar] [CrossRef] [Green Version]
  22. Chen, Y.; Zhang, D.; Reza, H.; Deng, C.; Yin, W. A new deep learning framework based on blood pressure range constraint for continuous cuffless BP estimation. Neural Netw. 2022, 152, 181–190. [Google Scholar] [CrossRef]
  23. Tiboni, M.; Remino, C. Condition monitoring of a mechanical indexing system with artificial neural networks. In Proceedings of the WCCM 2017—1st World Congress on Condition Monitoring 2017, London, UK, 13–16 June 2017. [Google Scholar]
  24. Tiboni, M.; Incerti, G.; Remino, C.; Lancini, M. Comparison of signal processing techniques for condition monitoring based on artificial neural networks. Appl. Cond. Monit. 2019, 15, 179–188. [Google Scholar] [CrossRef]
  25. Taimoor, M.; Aijun, L. Adaptive strategy for fault detection, isolation and reconstruction of aircraft actuators and sensors. J. Intell. Fuzzy Syst. 2020, 38, 4993–5012. [Google Scholar] [CrossRef]
  26. Zadeh, L.A.; Klir, G.J.; Yuan, B. Fuzzy Sets, Fuzzy Logic, and Fuzzy Systems: Selected Papers; World Scientific: Singapore, 1996; Volume 6. [Google Scholar]
  27. Teixeira, M.A.S.; Neves, F., Jr.; Koubâa, A.; De Arruda, L.V.R.; De Oliveira, A.S. A quadral-fuzzy control approach to flight formation by a fleet of unmanned aerial vehicles. IEEE Access 2020, 8, 64366–64381. [Google Scholar] [CrossRef]
  28. Yang, Y.; Gao, W.; Zhao, Z. Research on gait cycle recognition with plantar pressure sensors. In Proceedings of the 4th International Conference on Computer Science and Application Engineering, Sanya, China, 20–22 October 2020; pp. 1–5. [Google Scholar]
  29. De Jong, K. Evolutionary computation: A unified approach. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, Berlin, Germany, 15–19 July 2017; pp. 373–388. [Google Scholar]
  30. Kuila, P.; Jana, P.K. Evolutionary computing approaches for clustering and routing in wireless sensor networks. In Sensor Technology: Concepts, Methodologies, Tools, and Applications; IGI Global: Hershey, PA, USA, 2020; pp. 125–146. [Google Scholar]
  31. Zhang, X.; Lu, X.; Zhang, X. Mobile wireless sensor network lifetime maximization by using evolutionary computing methods. Ad Hoc Netw. 2020, 101, 102094. [Google Scholar] [CrossRef]
  32. Kearns, M.J.; Vazirani, U. An Introduction to Computational Learning Theory; MIT Press: Cambridge, MA, USA, 1994. [Google Scholar]
  33. Kozłowski, E.; Mazurkiewicz, D.; Żabiński, T.; Prucnal, S.; Sęp, J. Machining sensor data management for operation-level predictive model. Expert Syst. Appl. 2020, 159, 113600. [Google Scholar] [CrossRef]
  34. Shen, X.; Liu, B.; Zhou, Y.; Zhao, J.; Liu, M. Remote sensing image captioning via Variational Autoencoder and Reinforcement Learning. Knowl.-Based Syst. 2020, 203, 105920. [Google Scholar] [CrossRef]
  35. Opitz, D.; Maclin, R. Popular ensemble methods: An empirical study. J. Artif. Intell. Res. 1999, 11, 169–198. [Google Scholar] [CrossRef]
  36. Maqsood, I.; Khan, M.R.; Abraham, A. An ensemble of neural networks for weather forecasting. Neural Comput. Appl. 2004, 13, 112–122. [Google Scholar] [CrossRef]
  37. Abubakr, M.; Hassan, M.A.; Krolczyk, G.M.; Khanna, N.; Hegab, H. Sensors selection for tool failure detection during machining processes: A simple accurate classification model. CIRP J. Manuf. Sci. Technol. 2021, 32, 108–119. [Google Scholar] [CrossRef]
  38. Boser, B.E.; Guyon, I.M.; Vapnik, V.N. A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop On Computational Learning Theory, Pittsburgh, PA, USA, 27–29 July 1992; pp. 144–152. [Google Scholar]
  39. Smola, A.J.; Schölkopf, B. A tutorial on support vector regression. Stat. Comput. 2004, 14, 199–222. [Google Scholar] [CrossRef] [Green Version]
  40. Brown, M.P.S.; Grundy, W.N.; Lin, D.; Cristianini, N.; Sugnet, C.W.; Furey, T.S.; Ares, M., Jr.; Haussler, D. Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc. Natl. Acad. Sci. USA 2000, 97, 262–267. [Google Scholar] [CrossRef] [Green Version]
  41. Liu, C.; Li, H.; Zhang, Q. Research on sewage monitoring and water quality prediction based on wireless sensors and support vector machines. Wirel. Commun. Mobile Comput. 2020, 2020, 8852965. [Google Scholar] [CrossRef]
  42. Gómez, M.J.; Castejón, C.; Corral, E.; Garcia-Prada, J.C. Railway axle condition monitoring technique based on wavelet packet transform features and support vector machines. Sensors 2020, 20, 3575. [Google Scholar] [CrossRef]
  43. Ao, S.I. Neural Network Regressions with Fuzzy Clustering. In Proceedings of the World Congress on Engineering 2007 Vol I, WCE 2007, London, UK, 2–4 July 2007; pp. 507–512. [Google Scholar]
  44. Mustafa, M.S.; Husin, Z.; Tan, W.K.; Mavi, M.F.; Farook, R.S.M. Development of automated hybrid intelligent system for herbs plant classification and early herbs plant disease detection. Neural Comput. Appl. 2020, 32, 11419–11441. [Google Scholar] [CrossRef]
  45. Manikanda Kumaran, K.; Chinnadurai, M. Cloud-based robotic system for crowd control in smart cities using hybrid intelligent generic algorithm. J. Ambient Intell. Humaniz. Comput. 2020, 11, 6293–6306. [Google Scholar] [CrossRef]
  46. Sun, X.-X.; Pan, J.-S.; Chu, S.-C.; Hu, P.; Tian, A.-Q. A novel pigeon-inspired optimization with QUasi-Affine TRansformation evolutionary algorithm for DV-Hop in wireless sensor networks. Int. J. Distrib. Sens. Netw. 2020, 16, 1550147720932749. [Google Scholar] [CrossRef]
  47. Theodoridis, S.; Koutroumbas, K. Pattern Recognition, 3rd ed.; Elsevier: Amsterdam, The Netherlands, 2006. [Google Scholar]
  48. Shukla, R.M.; Sengupta, S. Scalable and Robust Outlier Detector using Hierarchical Clustering and Long Short-Term Memory (LSTM) Neural Network for the Internet of Things. Internet Things 2020, 9, 100167. [Google Scholar] [CrossRef]
  49. Kim, C.; Lee, H.; Devaraj, V.; Kim, W.-G.; Lee, Y.; Kim, Y.; Jeong, N.-N.; Choi, E.J.; Baek, S.H.; Han, D.-W.; et al. Hierarchical cluster analysis of medical chemicals detected by a bacteriophage-based colorimetric sensor array. Nanomaterials 2020, 10, 121. [Google Scholar] [CrossRef] [Green Version]
  50. Kotary, D.K.; Nanda, S.J. Distributed robust data clustering in wireless sensor networks using diffusion moth flame optimization. Eng. Appl. Artif. Intell. 2020, 87, 103342. [Google Scholar] [CrossRef]
  51. Omeke, K.G.; Mollel, M.S.; Ozturk, M.; Ansari, S.; Zhang, L.; Abbasi, Q.H.; Imran, M.A. DEKCS: A dynamic clustering protocol to prolong underwater sensor networks. IEEE Sens. J. 2021, 21, 9457–9464. [Google Scholar] [CrossRef]
  52. Gao, M.; Shi, G.-Y. Ship-handling behavior pattern recognition using AIS sub-trajectory clustering analysis based on the T-SNE and spectral clustering algorithms. Ocean Eng. 2020, 20, 106919. [Google Scholar] [CrossRef]
  53. Sellami, A.; Abbes, A.B.; Barra, V.; Farah, I.R. Fused 3-D spectral-spatial deep neural networks and spectral clustering for hyperspectral image classification. Pattern Recognit. Lett. 2020, 138, 594–600. [Google Scholar] [CrossRef]
  54. Zhao, Z.-Q.; Zheng, P.; Xu, S.-T.; Wu, X. Object Detection with Deep Learning: A Review. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 3212–3232. [Google Scholar] [CrossRef] [Green Version]
  55. Yuan, Q.; Shen, H.; Li, T.; Li, Z.; Li, S.; Jiang, Y. Deep learning in environmental remote sensing: Achievements and challenges. Remote Sens. Environ. 2020, 241, 111716. [Google Scholar] [CrossRef]
  56. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  57. Samaras, S.; Diamantidou, E.; Ataloglou, D.; Sakellariou, N.; Vafeiadis, A.; Magoulianitis, V.; Lalas, A.; Dimou, A.; Zarpalas, D.; Votis, K. Deep learning on multi sensor data for counter UAV applications—A systematic review. Sensors 2019, 19, 4837. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  58. Hinton, G.E.; Salakhutdinov, R.R. Reducing the Dimensionality of Data with Neural Networks. Science 2006, 313, 504–507. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  59. Mukherjee, D.; Mondal, R.; Singh, P.K.; Sarkar, R.; Bhattacharjee, D. EnsemConvNet: A deep learning approach for human activity recognition using smartphone sensors for healthcare applications. Multimed. Tools Appl. 2020, 79, 31663–31690. [Google Scholar] [CrossRef]
  60. Chen, L.; Li, S.; Bai, Q.; Yang, J.; Jiang, S.; Miao, Y. Review of Image Classification Algorithms Based on Convolutional Neural Networks. Remote Sens. 2021, 13, 4712. [Google Scholar] [CrossRef]
  61. Ashraf, I.; Hur, S.; Park, Y. Application of deep convolutional neural networks and smartphone sensors for indoor localization. Appl. Sci. 2019, 9, 2337. [Google Scholar] [CrossRef] [Green Version]
  62. Chen, Y.; Cao, Z.; Zhang, J.; Liu, Y.; Yu, D.; Guo, X. Wearable ultraviolet sensor based on convolutional neural network image processing method. Sens. Actuators A Phys. 2022, 338, 113402. [Google Scholar] [CrossRef]
  63. Yang, D.; Karimi, H.R.; Gelman, L. A Fuzzy Fusion Rotating Machinery Fault Diagnosis Framework Based on the Enhancement Deep Convolutional Neural Networks. Sensors 2022, 22, 671. [Google Scholar] [CrossRef]
  64. Waziralilah, N.F.; Abu, A.; Lim, M.H.; Quen, L.K.; Elfakharany, A. A review on convolutional neural network in bearing fault diagnosis. MATEC Web Conf. 2019, 255, 06002. [Google Scholar] [CrossRef] [Green Version]
  65. Niu, G.; Liu, E.; Zhang, B.; Golda, M.; Mastro, S. A Deep Residual Convolutional Neural Network based Bearing Fault Diagnosis with Multi-Sensor Data. In Proceedings of the 2021 4th IEEE International Conference on Industrial Cyber-Physical Systems (ICPS), Victoria, BC, Canada, 10–12 May 2021; pp. 655–660. [Google Scholar] [CrossRef]
  66. Kulchyk, J.; Etemad, A. Activity Recognition with Wearable Accelerometers using Deep Convolutional Neural Network and the Effect of Sensor Placement. In Proceedings of the 2019 IEEE SENSORS, Montreal, QC, Canada, 27–30 October 2019; pp. 1–4. [Google Scholar] [CrossRef]
  67. Ugulino, W.; Cardador, D.; Vega, K.; Velloso, E.; Milidiú, R.; Fuks, H. Wearable Computing: Accelerometers’ Data Classification of Body Postures and Movements. In Advances in Artificial Intelligence—SBIA 2012; Barros, L.N., Finger, M., Pozo, A.T., Gimenénez-Lugo, G.A., Castilho, M., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 52–61. [Google Scholar]
  68. Sun, Y.; Zhao, T. Imbalanced data fault diagnosis of hydrogen sensors using deep convolutional generative adversarial network with convolutional neural network. Rev. Sci. Instrum. 2021, 92, 095007. [Google Scholar] [CrossRef]
  69. Wang, J.; Gao, Q.; Ma, X.; Zhao, Y.; Fang, Y. Learning to sense: Deep learning for wireless sensing with less training efforts. IEEE Wirel. Commun. 2020, 27, 156–162. [Google Scholar] [CrossRef]
  70. Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.-S.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36. [Google Scholar] [CrossRef] [Green Version]
  71. Hu, W.; Huang, Y.; Wei, L.; Zhang, F.; Li, H. Deep Convolutional Neural Networks for Hyperspectral Image Classification. J. Sens. 2015, 2015, 258619. [Google Scholar] [CrossRef] [Green Version]
  72. Uddin, Z.; Mehedi, M.; Alsanad, A.; Savaglio, C. A body sensor data fusion and deep recurrent neural network-based behavior recognition approach for robust healthcare. Inf. Fusion 2020, 55, 105–115. [Google Scholar] [CrossRef]
  73. Banos, O.; Villalonga, C.; Garcia, R.; Saez, A.; Damas, M.; Holgado-terriza, J.A. Design, implementation and validation of a novel open framework for agile development of mobile health applications. BioMed. Eng. OnLine 2015, 14, S6. [Google Scholar] [CrossRef] [Green Version]
  74. Wang, W.; Chen, B.; Xia, P.; Hu, J.; Peng, Y. Sensor Fusion for Myoelectric Control Based on Deep Learning With Recurrent Convolutional Neural Networks. Artif. Organs 2018, 42, E272–E282. [Google Scholar] [CrossRef]
  75. Rifat Arefin, M.; Michalski, V.; St-Charles, P.-L.; Kalaitzis, A.; Kim, S.; Kahou, S.E.; Bengio, Y. Multi-Image Super-Resolution for Remote Sensing using Deep Recurrent Networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 14–19 June 2020; pp. 816–825. [Google Scholar] [CrossRef]
  76. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  77. Bilgera, C.; Yamamoto, A.; Sawano, M.; Matsukura, H.; Ishida, H. Application of convolutional long short-term memory neural networks to signals collected from a sensor network for autonomous gas source localization in outdoor environments. Sensors 2018, 18, 4484. [Google Scholar] [CrossRef] [Green Version]
  78. Nagrecha, K.; Muthukumar, P.; Cocom, E.; Holm, J.; Comer, D.; Burga, I.; Pourhomayoun, M. Sensor-Based Air Pollution Prediction using Deep CNN-LSTM. In Proceedings of the 2020 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 16–18 December 2020; pp. 694–696. [Google Scholar]
  79. Xia, K.U.N.; Huang, J.; Wang, H. LSTM-CNN Architecture for Human Activity Recognition. IEEE Access 2020, 8, 56855–56866. [Google Scholar] [CrossRef]
  80. Ma, M.; Sun, C.; Chen, X. Deep Coupling Autoencoder for Fault Diagnosis With Multimodal Sensory Data. IEEE Trans. Ind. Inform. 2018, 14, 1137–1145. [Google Scholar] [CrossRef]
  81. Lei, Y.; Karimi, H.R.; Cen, L.; Chen, X.; Xie, Y. Processes soft modeling based on stacked autoencoders and wavelet extreme learning machine for aluminum plant-wide application. Control Eng. Pract. 2021, 108, 104706. [Google Scholar] [CrossRef]
  82. Alo, U.R.; Nweke, H.F.; Teh, Y.W.; Murtaza, G. Smartphone Motion Sensor-Based Complex Human Activity Identification Using Deep Stacked Autoencoder Algorithm for Enhanced Smart Healthcare System. Sensors 2020, 20, 6300. [Google Scholar] [CrossRef] [PubMed]
  83. Hinton, G.E.; Osindero, S.; Teh, Y.-W. A Fast Learning Algorithm for Deep Belief Nets. Neural Comput. 2006, 18, 1527–1554. [Google Scholar] [CrossRef] [PubMed]
  84. Chen, Y.; Jin, Y.; Jiri, G. Predicting tool wear with multi-sensor data using deep belief networks. Int. J. Adv. Manuf. Technol. 2018, 99, 1917–1926. [Google Scholar] [CrossRef]
  85. Zhong, P.; Gong, Z.; Li, S.; Schönlieb, C.-B. Learning to Diversify Deep Belief Networks for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3516–3530. [Google Scholar] [CrossRef]
  86. Scarselli, F.; Gori, M.; Tsoi, A.C.; Hagenbuchner, M. The Graph Neural Network Model. IEEE Trans. Neural Netw. 2009, 20, 61–80. [Google Scholar] [CrossRef] [Green Version]
  87. Jiao, X.; Li, X.; Lin, D. A Graph Neural Network Based Deep Learning Predictor for Spatio-Temporal Group Solar Irradiance Forecasting. IEEE Trans. Ind. Inform. 2022, 18, 6142–6149. [Google Scholar] [CrossRef]
  88. Sengupta, M.; Andreas, A. Oahu Solar Measurement Grid (1-Year Archive): 1-Second Solar Irradiance; Oahu, Hawaii (Data); National Renewable Energy Lab. (NREL): Golden, CO, USA, 2010. [Google Scholar] [CrossRef]
  89. Shi, W.; Rajkumar, R.R. Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 1711–1719. [Google Scholar]
  90. Zhang, J.; Liu, B.; Wang, Z.; Lehnert, K.; Gahegan, M. DeepPN: A deep parallel neural network based on convolutional neural network and graph convolutional network for predicting RNA-protein binding sites. BMC Bioinform. 2022, 23, 257. [Google Scholar] [CrossRef]
  91. Wen, W.; Shang, C.; Dong, Z.; Keh, H.-C.; Roy, D.S. An intrusion detection model using improved convolutional deep belief networks for wireless sensor networks. Int. J. Ad Hoc Ubiquitous Comput. 2021, 36, 20–31. [Google Scholar] [CrossRef]
  92. Holte, R.C. Very Simple Classification Rules Perform Well on Most Commonly Used Datasets. Mach. Learn. 1993, 11, 63–90. [Google Scholar] [CrossRef]
  93. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision 2017, Venice, Italy, 22–29 October 2017; pp. 618–626. [Google Scholar]
  94. Linardatos, P.; Papastefanopoulos, V.; Kotsiantis, S. Explainable AI: A Review of Machine Learning Interpretability Methods. Entropy 2021, 23, 18. [Google Scholar] [CrossRef]
  95. Kwapisz, J.R.; Weiss, G.M.; Moore, S.A. Activity Recognition using Cell Phone Accelerometers. ACM SigKDD Explor. Newsl. 2011, 12, 74–82. [Google Scholar] [CrossRef]
  96. Vavoulas, G.; Chatzaki, C.; Malliotakis, T.; Pediaditis, M.; Tsiknakis, M. The MobiAct Dataset: Recognition of Activities of Daily Living using Smartphones. In Proceedings of the International Conference on Information and Communication Technologies for Ageing Well and e-Health (ICT4AWE 2016), Rome, Italy, 21–22 April 2016; pp. 143–151. [Google Scholar]
  97. Micucci, D. UniMiB SHAR: A Dataset for Human Activity Recognition Using Acceleration Data from Smartphones. Appl. Sci. 2017, 7, 1101. [Google Scholar] [CrossRef] [Green Version]
  98. Banos, O.; Garcia, R.; Holgado-terriza, J.A.; Damas, M.; Pomares, H.; Rojas, I.; Saez, A.; Villalonga, C. mHealthDroid: A Novel Framework for Agile Development of Mobile Health Applications. In Ambient Assisted Living and Daily Activities. IWAAL 2014; Springer: Cham, Switzerland, 2014; pp. 91–98. [Google Scholar]
  99. Palumbo, F.; Gallicchio, C.; Pucci, R.; Micheli, A. Human activity recognition using multisensor data fusion based on Reservoir Computing. J. Ambient. Intell. Smart Environ. 2016, 8, 87–107. [Google Scholar] [CrossRef]
  100. Geiger, A.; Lenz, P.; Urtasun, R. Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 3354–3361. [Google Scholar]
  101. Anders, G.; Mackowiak, S.D.; Jens, M.; Maaskola, J.; Kuntzagk, A.; Rajewsky, N.; Landthaler, M. doRiNA: A database of RNA interactions in post-transcriptional regulation. Nucleic Acids Res. 2012, 40, D180–D186. [Google Scholar] [CrossRef]
Table 1. Summary of the advances in supervised machine learning for sensing and condition monitoring.
Ref. | First Author | Year | Model | Application Field | Dataset | Accuracy | Details
--- | --- | --- | --- | --- | --- | --- | ---
[5] | Fu | 2020 | Transforming model, based on the Fourier transform and principal component analysis (PCA) | Actuators and sensors for turbine condition monitoring | Wind turbine benchmark dataset | Comparison studies confirm good method effectiveness | Combining the Fourier transform and uncorrelated multi-linear PCA
[6] | Chu | 2021 | Neural network | Condition monitoring via gas sensor array | Dataset comprising 6 types of CO and NO2 mixtures under 4 levels of relative humidity | Estimated monitoring accuracy is 100% | PCA is used to extract features; c-means clustering and BPNN are used to monitor gases
[7] | Ince | 2020 | Haar wavelet transform (HWT) | Human activity monitoring | RGB-depth sensor dataset by Kyungsung University | Monitoring accuracy is 86.1% | The HWT is used for feature dimension reduction, and k-NN is used for monitoring
[8] | Yang | 2020 | Convolutional neural network model | Sensor monitoring | Datasets by Olszewski | Monitoring error probability is between 0.4% and 0.6% for the Wafer dataset | Three transformation methods encode time series into images, and a ConvNet is used for image classification
[9] | Nie | 2020 | Hyper-complex Fourier transform and SVM | Ship detection and condition monitoring | Images from the satellites GF-2 and ZY-3 with a 2 m resolution | Monitoring accuracy is up to 92.8% | The hyper-complex Fourier transform of a quaternion is used to locate ROIs; false alarms are eliminated by SVM training
[10] | Liu | 2021 | MKML-ISOMAP model | Additive manufacturing | Online multi-dimensional sensor dataset | Very good prediction performance | MKML-ISOMAP model is used for feature dimension reduction and feature extraction
[11] | Jiang | 2022 | Passivity-based sliding mode control and monitoring with an RBF neural network model | Dock cranes | Numerical dataset obtained via simulation | Satisfactory system performance | RBF neural network is used for adaptive control and monitoring
[12] | Krishnan | 2020 | Logistic regression | Gesture monitoring | Dataset obtained by flex sensors and accelerometers attached to a glove | Effective prediction via an audio module | Prediction of the gesture via logistic regression
[13] | Li | 2020 | Machine learning based on the Gini impurity | Cognitive load detection and monitoring | Dataset from the UbiTtention 2020 workshop of UbiComp 2020 | Low detection accuracy, 63% | Feature selection is based on the Gini impurity, and multiple ML techniques are used for training the ML models
[15] | Nancy | 2020 | Fuzzy temporal decision tree | Intrusion detection and monitoring | KDD'99 cup dataset | Monitoring accuracy is 99.6% | Combination of a dynamic recursive feature selection algorithm and a fuzzy temporal decision tree algorithm
[16] | Chen | 2020 | Decision tree model | Grain impurity monitoring | Image datasets with different grain impurities | Low monitoring accuracy, 76% | Machine vision method is used for grain impurity monitoring, and a decision tree algorithm is used for classification
[19] | Mihai | 2021 | Decision tree model | Geographic information systems | Cadastre dataset | Very good monitoring accuracy | C4.5 decision tree algorithm is used for monitoring
[20] | Mennel | 2020 | Neural network | Ultrafast machine vision | Image dataset projected onto a chip with a throughput of 20 million bins per second | Good efficiency | ANN that simultaneously senses and processes optical images
[21] | Yu | 2020 | Deep neural network | Prediction of a fall | SisFall, a fall and movement dataset with various falls and ADLs | Prediction accuracies are 93.2%, 93.8% and 96% for non-fall, pre-impact fall and fall, respectively | Hybrid ConvLSTM model
[25] | Taimoor | 2020 | MLP neural network | Actuators and sensors for fault detection and condition monitoring of an aircraft | Dataset from the non-linear dynamics of a Boeing 747 100/200 | Good accuracy | Extended Kalman filter makes the weight-updating parameters of the MLP adaptive
[27] | Teixeira | 2020 | Quadral-fuzzy approach | Monitoring of drone flight formation | Simulated dataset from experiments using the Virtual Robot Experimentation Platform | Most of the time, a safe distance between drones is preserved | Quadral-fuzzy approach ensures that the drone flight formation helps avoid collisions
[28] | Yang | 2020 | Fuzzy logic inference model | Gait phase monitoring and gait cycle segmentation | Dataset obtained from 8 pressure sensors | Recognition accuracy is 97.2% | Fuzzy logic inference is used for gait phase monitoring
[30] | Kuila | 2020 | Evolutionary algorithms | Clustering and routing problems for WSNs | Dataset from sensor nodes | Comparison table highlights strengths and weaknesses of the algorithms | Evolutionary algorithms are used for solving clustering and routing problems
[31] | Zhang | 2020 | Evolutionary algorithms | Lifetime optimization of mobile wireless sensor networks | Simulated dataset | Presents advantages and disadvantages of five evolutionary algorithms | Evolutionary algorithms are used for the MWSN lifetime optimization model
[33] | Kozłowski | 2020 | SVM | Predictive maintenance and condition monitoring, RUL prediction | Dataset from a CNC machine monitoring system | Effective monitoring and prediction | The SVM constructs a classifier and a RUL prediction method
[34] | Shen | 2020 | Variational autoencoder and reinforcement learning model | Remote sensing and monitoring and image captioning | Remote sensing image caption dataset NWPU-RESISC45 | Model is effective in learning | Variational autoencoder and reinforcement learning-based two-stage multi-task learning model
[37] | Abubakr | 2021 | Ensemble of ML models | Tool fault detection and condition monitoring | Dataset from a Matsuura MC-510V machining center | Classification accuracy is up to 96% | Single-sensor-based TCCM
[41] | Liu | 2020 | SVM | Sewage condition monitoring and water quality prediction | Datasets from a laboratory environment and a sewage monitoring site | Average prediction accuracy error is less than 2% | Support vector machine algorithm is used for prediction
[42] | Gómez | 2020 | Wavelet packet transform (WPT) | Real-time condition monitoring for railway axles | Dataset from axles with multiple crack conditions | Monitoring false alarm rate is less than 10% | The WPT is combined with an SVM diagnosis model
[45] | Kumaran | 2020 | Hybrid intelligent generic algorithm (HIGA) | Crowd monitoring in smart cities | WSN is used for dataset creation in a smart city environment | Effective crowd monitoring | Cloud-based robotic system for crowd monitoring in smart cities using the HIGA
Table 2. Summary of the advances in unsupervised machine learning for sensing and condition monitoring.
Ref. | First Author | Year | Model | Application Field | Dataset | Accuracy | Details
--- | --- | --- | --- | --- | --- | --- | ---
[48] | Shukla | 2020 | Hierarchical clustering | IoT applications with data from sensors | Dataset related to vehicular traffic count and environmental pollutant CO level | Accuracy is more than 90% | Combination of hierarchical clustering and an LSTM neural network
[49] | Kim | 2020 | Hierarchical clustering | M13 bacteriophage-based colorimetric sensors | Dataset from sensing experiments at four different temperatures | Effective classification | Hierarchical clustering is used to classify types of medical chemicals
[51] | Omeke | 2021 | Distance- and energy-constrained k-means clustering | Lifetime of WUSNs | AUV-based dataset from underwater clusters | Essentially outperformed the LEACH protocol | Distance- and energy-constrained k-means clustering scheme is used for cluster head selection
[52] | Gao | 2020 | Sub-trajectory clustering | Ship-handling behavior monitoring | Trajectory dataset from multiple sensors | Effective monitoring | Ship-handling behavior monitoring with multi-step sub-trajectory clustering analysis
[53] | Sellami | 2020 | Spectral clustering | Hyperspectral image classification and monitoring | Indian Pines and Salinas datasets | Effective classification and monitoring | Unsupervised band selection technique with hierarchical clustering
Table 3. Summary of the deep learning advances for sensing and condition monitoring.
Ref. | First Author | Year | Model | Application Field | Dataset | Accuracy | Details
--- | --- | --- | --- | --- | --- | --- | ---
[59] | Mukherjee | 2020 | EnsemConvNet model | Human activity monitoring | Publicly available datasets Wireless Sensor Data Mining (WISDM) [95], MobiAct [96] and UniMiB SHAR [97], employing accelerometers, gyroscopes and geomagnetic field sensors | Monitoring accuracies are 99.6%, 99.1% and 99.9% on the three datasets, respectively | Combination of CNN-Net, Encoded-Net and CNN-LSTM
[61] | Ashraf | 2019 | CNN model | Indoor localization and monitoring | Datasets based on smartphone photos | Monitoring accuracy is 91% | Indoor localization and monitoring based on a CNN, which is used to distinguish between floors and recognize indoor scenes in a variety of lighting conditions
[62] | Chen | 2022 | CNN model | Monitoring UV intensity | Datasets based on smartphone photos | Monitoring accuracy is more than 90% | A UV intensity monitoring app based on a mobile CNN
[65] | Niu | 2021 | DR-CNN model | Bearing fault diagnostics with multi-sensor data | Multiple 3-axis accelerometer datasets | Estimated diagnosis accuracy is 100% | New connections in the residual network skip some of the CNN structure's layers; this leads to accuracy improvement
[66] | Kulchyk | 2019 | CNN model | Human activity monitoring | Four tri-axial wearable accelerometer datasets (publicly available via [67]) | Estimated diagnosis accuracy is 100% | Novel CNN eliminates the need for pre-processing activity
[68] | Sun | 2021 | DCG-CNN model | Gas sensor condition monitoring | 1D hydrogen sensor data and synthetic 2D grey images | Monitoring accuracy is 98.9% | Synthetic 2D grey images are generated from 1D hydrogen sensor data
[69] | Wang | 2020 | DGAN model | Wireless gesture monitoring | Radio image datasets obtained from de-noised wireless measurements | Monitoring accuracies are 90% for different participants, 91% for different laboratories, and 86% for different participants and laboratories | Reducing training efforts by generating virtual samples
[71] | Hu | 2015 | CNN model with one convolutional level | Hyperspectral image classification | Three datasets from remote sensors, characterized by hundreds of observation channels with a high spectral resolution | Classification accuracies are 90.2%, 92.6% and 92.6% for the three datasets | Classification of hyperspectral images in the spectral domain
[72] | Uddin | 2020 | R-CNN model | Human activity monitoring (sitting, standing and walking) | Data fusion from many wearable sensors, including an electrocardiogram (ECG), an accelerometer and a magnetometer (publicly available datasets MHEALTH [98], PUC-Rio and AReM [99]) | Monitoring accuracy is 99% | Monitoring features are retrieved using kernel principal component analysis
[76] | Wang | 2018 | R-CNN model | HMI control | Dataset combining EMG signals and acceleration signals | Monitoring accuracy is 91.5% | Human activity monitoring (i.e., classifying hand movements)
[75] | Arefin | 2020 | R-CNN model | Remote sensing and condition monitoring | Dataset of low-resolution satellite photographs | Effective condition monitoring | Production of high-resolution images from a succession of low-resolution satellite photographs
[77] | Bilgera | 2018 | CNN-LSTM model | Monitoring the position of a gas source in an outdoor environment | Dataset from a gas sensor array arranged in a series of monochrome images | Monitoring accuracy is 95% | Monitoring in an outdoor environment using stationary sensors (i.e., a sensor network)
[78] | Nagrecha | 2020 | CNN-LSTM model | Earth environmental monitoring | Ground-based pollution sensor data recast into modified pseudo-images | Monitoring accuracy is 75% (average value for different sites and different hourly intervals) | Prediction and monitoring of air pollution
[79] | Xia | 2020 | LSTM-CNN model | Human activity monitoring (standing, walking, walking downstairs and walking upstairs) | Three open datasets (UCI, WISDM and OPPORTUNITY) | Monitoring accuracies are 95.8%, 95.8% and 92.6% for the three datasets, respectively | Logistic regression-based Softmax classifier is used to solve the multiclass monitoring problem
[80] | Ma | 2018 | DCAE model | Detection/diagnosis of the health state of gears and bearings | Vibration and acoustic datasets | Diagnosis accuracy is 94.3% | Technique that fuses multimodal data and feature learning into a single step
[82] | Alo | 2020 | Sparse AE model | Human activity monitoring (ascending stairs, descending stairs, drinking coffee, presenting talks and smoking) | Dataset from smartphone-embedded 3D accelerometers, including magnitude and rotation angle data (pitch and roll) | Monitoring accuracy is 97.2% | Method based on data fusion, a deep stacked autoencoder algorithm and orientation-invariant features is used for complex human activity monitoring
[84] | Chen | 2018 | DBN model | Monitoring tool wear in a high-speed CNC milling machine | Dataset from a three-component dynamometer, accelerometers and an acoustic emission sensor | R2 value is 98.9% | DBN model featuring a low runtime, a high accuracy and a high stability
[85] | Zhong | 2017 | D-DBN-PF model (DBN with diversifying training method in both pre-training and fine-tuning phases) | Hyperspectral image classification | Dataset of hyperspectral images from multivariate optical sensors | Classification accuracy is 93.1% | Deep features that model multiple ground-truth classes are extracted from hyperspectral images
[87] | Jiao | 2022 | GCN-LSTM model | Solar irradiance forecasting and monitoring for a group of PV panels, based on their temporal and spatial information | Solar irradiance dataset sourced from the U.S. National Renewable Energy Laboratory [88] | Root mean squared error is 0.0052 | Model is a group solar irradiance neural network (GSINN), which integrates a GCN model with an LSTM model
[89] | Shi | 2020 | CNN-GNN model | Monitoring of objects from a 3D point cloud | KITTI dataset [100] (Lidar point clouds and camera images) | Monitoring accuracy is between 75% and 90% | Single-stage monitoring method: a Point-GNN extracts features of the point cloud by iteratively updating vertex features on the same graph
[90] | Zhang | 2022 | CNN and GCN models | Discriminative features from RNA sequences for prediction and monitoring of binding sites | 24 datasets [101] (RBP binding sites obtained with different methods) | Monitoring accuracy is 87.9% (mean value over the 24 datasets) | Two-layer CNN and GCN are used in parallel to extract hidden features
[91] | Wen | 2021 | ICDBN-IDM (improved convolutional deep belief network-based intrusion detection model) | Intrusion monitoring via wireless sensor networks | Dataset on environmental information collected in real time from sensor nodes of a WSN | Monitoring accuracy is 96.8% | Redundancy monitoring algorithm based on the convolutional deep belief network and a performance evaluation strategy
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
