**1. Introduction**

Assistive robots are a category of robots that share their workspace with humans and interact with them. Their main goal is to help, assist, and monitor people, especially people with disabilities. To achieve this goal, these robots must possess a series of capabilities: perceiving their environment through their sensors and acting accordingly, interacting with people in a multimodal manner, and navigating and making decisions autonomously. This complexity requires computationally expensive algorithms to run in real time. With the advent of high-end embedded processors, several such algorithms can now be processed concurrently and in real time.

All these capabilities involve, to a greater or lesser extent, the use of machine learning techniques. In particular, in the last few years, new deep learning techniques have enabled a major qualitative leap in different problems related to perception, navigation, and human understanding. This Special Issue presents works that apply machine learning techniques to assistive technologies in general and to assistive robots in particular.

**2. Machine Learning Techniques for Assistive Robotics**

This Special Issue consists of eleven papers covering the application of machine learning techniques to assistive technologies and assistive robots: two review papers and nine research papers.

The first review [1] identifies research works, written in English and published between 2012 and 2018, on the recognition of daily activities and environments with the AdaBoost method, focusing in particular on data obtained from the sensors available in mobile devices. The second review [2] summarizes the research efforts toward the development of these kinds of systems, focusing on two social groups: older adults and children with autism.

The nine research papers are described briefly in the following paragraphs. Pires et al. [3] use artificial neural networks (ANN) to recognize activities of daily living (ADLs) from the data acquired by the sensors available in mobile devices. First, the mobile device is used for data collection; the ANN is then trained on a less restrictive computational platform; and, after training, mobile devices apply the trained ANN to identify ADLs.
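As a rough illustration of this train-off-device, deploy-on-device workflow, the following is a minimal sketch assuming window-level sensor features have already been extracted; the feature dimensions, labels, and network size are placeholders and do not reproduce the setup of Pires et al.

```python
# Minimal sketch (not the authors' implementation): training an ANN on
# mobile-sensor-style window features for ADL recognition.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 6))      # hypothetical per-window sensor features
y = rng.integers(0, 4, size=600)   # hypothetical ADL labels (e.g., walk, run, stand, sit)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
scaler = StandardScaler().fit(X_train)

# Training happens on a less restricted platform...
ann = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
ann.fit(scaler.transform(X_train), y_train)

# ...and the trained model is then applied to new windows collected on the device.
print("held-out accuracy:", ann.score(scaler.transform(X_test), y_test))
```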

In Reference [4], a system is presented that detects the performance and emotional state of elderly people while they perform exercises. With this detection, the authors aim to build an assistant that motivates these people to exercise and, concurrently, monitors their physical and emotional responses.

The paper presented by Ferreira et al. [5] addresses the recognition of eight ADLs, namely walking, running, standing, going upstairs, going downstairs, driving, sleeping, and watching television, and nine environments, namely bar, hall, kitchen, library, street, bedroom, living room, gym, and classroom, using the instance-based k-nearest neighbor (IBk) and AdaBoost methods. The primary purpose of this paper is to find the best machine learning method for ADL and environment recognition.
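The comparison between the two classifiers could be sketched as follows; this is an assumed setup with synthetic features, not the paper's actual pipeline or data.

```python
# Minimal sketch: comparing an instance-based k-NN (the "IBk" analogue) and
# AdaBoost on the same feature matrix to pick the better recognizer.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 10))    # hypothetical sensor-derived features
y = rng.integers(0, 8, size=400)  # eight ADL classes, as in the paper

for name, clf in [("IBk (k-NN)", KNeighborsClassifier(n_neighbors=5)),
                  ("AdaBoost", AdaBoostClassifier(n_estimators=100, random_state=1))]:
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```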

The main proposal in [6] is to recognize the user's environment and standing activities, and to include these features in a framework for ADL and environment identification. The paper is therefore divided into two parts: first, acoustic sensors are used to collect data for environment recognition, and second, the information about the recognized environment is fused with the information gathered by motion and magnetic sensors. Environment and ADL recognition are performed with pattern recognition techniques, and the resulting system comprises data collection, processing, fusion, and classification procedures.
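The two-stage idea can be pictured with the following minimal sketch, which assumes a simple late-fusion design (predict the environment from acoustic features, then append it to the motion features); the classifiers, feature dimensions, and labels are placeholders rather than the authors' actual design.

```python
# Minimal sketch: environment recognition from acoustics, then fusion with
# motion/magnetic features for ADL classification.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
acoustic = rng.normal(size=(300, 13))      # hypothetical MFCC-like features
motion = rng.normal(size=(300, 9))         # hypothetical accel/gyro/magnetometer features
env_labels = rng.integers(0, 5, size=300)  # environment classes
adl_labels = rng.integers(0, 4, size=300)  # ADL classes

# Stage 1: environment recognition from the acoustic stream.
env_clf = LogisticRegression(max_iter=1000).fit(acoustic, env_labels)
env_pred = env_clf.predict(acoustic)

# Stage 2: fuse the predicted environment (one-hot encoded) with motion features.
env_onehot = np.eye(5)[env_pred]
fused = np.hstack([motion, env_onehot])
adl_clf = LogisticRegression(max_iter=1000).fit(fused, adl_labels)
print("training accuracy (illustrative only):", adl_clf.score(fused, adl_labels))
```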

Modern achievements in both cognitive neuroscience and human–machine interaction technologies have enhanced the ability to control devices with the human brain by using brain–computer interface systems. In particular, the development of brain-controlled mobile robots is very important because such systems can help people suffering from devastating neuromuscular disorders to move and thus improve their quality of life. The research work presented in [7] concerns the development of a system that controls the motion of a mobile robot according to the eye blinking of a human operator via a synchronous and endogenous electroencephalography-based brain–computer interface that uses alpha brain waveforms. The received signals are filtered to extract suitable features, which are fed as inputs to a neural network trained to guide the robotic vehicle.
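A minimal sketch of such a processing chain is given below, assuming a band-pass filter around the alpha band (8–13 Hz), a simple band-power feature per window, and a small neural network mapping features to motion commands; the sampling rate, window length, and command set are illustrative assumptions, not the authors' system.

```python
# Minimal sketch: alpha-band filtering of EEG windows and a small neural
# network that maps band-power features to robot motion commands.
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.neural_network import MLPClassifier

fs = 256                                   # assumed EEG sampling rate in Hz
b, a = butter(4, [8, 13], btype="bandpass", fs=fs)

def alpha_power(window):
    """Band-pass the window and return its mean power in the alpha band."""
    filtered = filtfilt(b, a, window)
    return np.mean(filtered ** 2)

rng = np.random.default_rng(3)
windows = rng.normal(size=(200, fs))       # hypothetical 1-second EEG windows
features = np.array([[alpha_power(w)] for w in windows])
commands = rng.integers(0, 3, size=200)    # e.g., forward / left / right (illustrative)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=3)
clf.fit(features, commands)
print("predicted command for a new window:", clf.predict(features[:1]))
```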

One of the main problems for the elderly population and for people with functional disabilities is falling while unsupervised. There is therefore a need for monitoring systems with fall detection functionality. Compared to static-view sensors, mobile robots are a good solution for keeping the person in sight. Along this line, Maldonado et al. [8] propose a vision-based fall detection solution based on a mobile patrol robot that can correct its position in case of doubt. Deep learning-based computer vision is used for person detection, and fall classification is performed with a support vector machine (SVM) classifier.
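The two-step detection-then-classification pipeline could look like the sketch below, which assumes a person detector has already produced bounding boxes and that simple box geometry (aspect ratio, relative height) feeds the SVM; these features are an illustrative assumption, not the descriptors used in the paper.

```python
# Minimal sketch: an SVM deciding "fall" vs "no fall" from the geometry of
# person bounding boxes produced by a deep detector.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(4)

def box_features(box, frame_h):
    """box = (x, y, w, h) from a person detector; returns geometric features."""
    x, y, w, h = box
    return [w / h, h / frame_h]   # wide, short boxes tend to indicate a fall

# Hypothetical detections: upright people (tall boxes) and fallen people (wide boxes).
upright = [(0, 0, rng.uniform(40, 60), rng.uniform(150, 180)) for _ in range(100)]
fallen = [(0, 0, rng.uniform(150, 180), rng.uniform(40, 60)) for _ in range(100)]
X = np.array([box_features(b, 480) for b in upright + fallen])
y = np.array([0] * 100 + [1] * 100)   # 0 = no fall, 1 = fall

svm = SVC(kernel="rbf").fit(X, y)
print("fall predicted for a wide box:",
      bool(svm.predict([box_features((0, 0, 160, 50), 480)])[0]))
```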

In Reference [9], a Siamese network with an auto-encoding constraint is proposed to extract discriminative features from detection responses in a tracking-by-detection framework. The network is further improved to extract a previous-appearance-next vector from the tracklet for better association. Feature experiments show that the proposed Siamese network has advantages in terms of both discrimination and correctness.
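The combination of a similarity objective with an auto-encoding constraint can be sketched as follows; the layer sizes, the contrastive loss, and the loss weighting are assumptions made for illustration and do not reproduce the paper's architecture.

```python
# Minimal PyTorch sketch: a shared Siamese encoder over two detection features,
# with a reconstruction (auto-encoding) term added to the similarity loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseAE(nn.Module):
    def __init__(self, in_dim=128, emb_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, emb_dim))
        self.decoder = nn.Sequential(nn.Linear(emb_dim, 64), nn.ReLU(), nn.Linear(64, in_dim))

    def forward(self, a, b):
        za, zb = self.encoder(a), self.encoder(b)
        return za, zb, self.decoder(za), self.decoder(zb)

def loss_fn(a, b, same, model, margin=1.0, recon_w=0.1):
    za, zb, ra, rb = model(a, b)
    d = F.pairwise_distance(za, zb)
    contrastive = torch.mean(same * d ** 2 + (1 - same) * F.relu(margin - d) ** 2)
    reconstruction = F.mse_loss(ra, a) + F.mse_loss(rb, b)   # auto-encoding constraint
    return contrastive + recon_w * reconstruction

model = SiameseAE()
a, b = torch.randn(8, 128), torch.randn(8, 128)   # hypothetical appearance features
same = torch.randint(0, 2, (8,)).float()          # 1 = same identity, 0 = different
print("combined loss:", loss_fn(a, b, same, model).item())
```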

Classification of complex acoustic scenes under real-time scenarios is an active research area that has recently attracted several researchers from the machine learning community. In Reference [10], a framework for automatic acoustic classification for behavioral robotics is presented. Motivated by several texture classification algorithms used in computer vision, a modified feature descriptor for sound is proposed, which combines 1D local ternary patterns (1D-LTP) with the baseline Mel-frequency cepstral coefficients (MFCC). The extracted feature vector is then classified using a multi-class SVM, which is selected as the base classifier.
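A minimal sketch of such a descriptor is shown below: a simple 1D local ternary pattern histogram is concatenated with per-clip MFCCs and fed to a multi-class SVM. The pattern encoding, thresholds, and synthetic "clips" are illustrative assumptions, not the exact formulation in the paper.

```python
# Minimal sketch: fusing a simple 1D-LTP histogram with MFCCs and classifying
# the acoustic scene with a multi-class SVM.
import numpy as np
import librosa
from sklearn.svm import SVC

def ltp_1d_histogram(signal, threshold=0.01, n_neighbors=4):
    """Encode each sample against its neighbors as +1/0/-1 and return the
    histogram of the resulting upper/lower binary codes."""
    codes_upper, codes_lower = [], []
    for i in range(n_neighbors, len(signal) - n_neighbors):
        neighbors = np.concatenate([signal[i - n_neighbors:i], signal[i + 1:i + n_neighbors + 1]])
        diff = neighbors - signal[i]
        upper = (diff > threshold).astype(int)
        lower = (diff < -threshold).astype(int)
        codes_upper.append(int("".join(map(str, upper)), 2))
        codes_lower.append(int("".join(map(str, lower)), 2))
    n_bins = 2 ** (2 * n_neighbors)
    hu, _ = np.histogram(codes_upper, bins=n_bins, range=(0, n_bins), density=True)
    hl, _ = np.histogram(codes_lower, bins=n_bins, range=(0, n_bins), density=True)
    return np.concatenate([hu, hl])

def describe(y, sr):
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)  # baseline MFCC features
    return np.concatenate([mfcc, ltp_1d_histogram(y[:2000])])        # fused descriptor

# Hypothetical clips; in practice these would be labelled recordings of acoustic scenes.
sr = 16000
rng = np.random.default_rng(5)
clips = [rng.normal(scale=s, size=sr).astype(np.float32) for s in (0.1, 0.5) for _ in range(10)]
labels = [0] * 10 + [1] * 10
X = np.array([describe(y, sr) for y in clips])
svm = SVC(kernel="linear", decision_function_shape="ovo").fit(X, labels)  # multi-class SVM
print("training accuracy (illustrative):", svm.score(X, labels))
```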

Near-infrared (NIR) facial expression recognition is resistant to illumination changes. Chen et al. [11] propose a three-stream three-dimensional convolutional neural network with a squeeze-and-excitation (SE) block for NIR facial expression recognition. Each stream is fed with a different local region, namely the eyes, nose, and mouth. Experimental results on the Oulu-CASIA NIR facial expression database show that the proposed method achieves a higher recognition rate than some state-of-the-art algorithms.
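The overall shape of such a network is sketched below: three small 3D-CNN streams (one per facial region), each with an SE block, fused by concatenation before the classification layer. The channel counts, kernel sizes, and input dimensions are illustrative assumptions and do not reproduce the paper's architecture.

```python
# Minimal PyTorch sketch: three 3D-CNN streams with squeeze-and-excitation
# blocks, fused for facial expression classification.
import torch
import torch.nn as nn

class SEBlock3D(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(channels, channels // reduction), nn.ReLU(),
                                nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                        # x: (N, C, T, H, W)
        w = x.mean(dim=(2, 3, 4))                # squeeze: global average pooling
        w = self.fc(w).view(x.size(0), -1, 1, 1, 1)
        return x * w                             # excitation: channel-wise reweighting

class Stream(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            SEBlock3D(8),
            nn.AdaptiveAvgPool3d(1), nn.Flatten())

    def forward(self, x):
        return self.net(x)

class ThreeStreamNet(nn.Module):
    def __init__(self, n_classes=6):
        super().__init__()
        self.eyes, self.nose, self.mouth = Stream(), Stream(), Stream()
        self.classifier = nn.Linear(3 * 8, n_classes)

    def forward(self, eyes, nose, mouth):
        feats = torch.cat([self.eyes(eyes), self.nose(nose), self.mouth(mouth)], dim=1)
        return self.classifier(feats)

model = ThreeStreamNet()
clip = lambda: torch.randn(2, 1, 8, 32, 32)      # (batch, channel, frames, H, W), illustrative sizes
print(model(clip(), clip(), clip()).shape)       # -> torch.Size([2, 6])
```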
