**1. Introduction**

In recent years, many methods have been introduced to develop smart wheelchairs for people with different disabilities, such as assistive technology [1], user physical interfaces [2], and semi-autonomous control (shared control between user and machine) [3]. One of the most important problems for a smart wheelchair is to provide independent mobility for elderly or severely disabled people who cannot control an electric wheelchair with a joystick; restoring their mobility can significantly improve their quality of life. The development of a typical smart wheelchair depends strongly on the abilities and disabilities of the user: a patient with impaired motor function often lacks muscle control and, in the worst case, cannot move the arms and legs at all. To support the mobility of such patients, control signals can be generated from sources such as voice, thoughts, eye movements, and the tongue [4–6]. To obtain reliable signals, users must control their emotions well and concentrate intensely for accuracy, which is difficult for users with severe disabilities, although it may be a good option. For these users, a better solution is to combine multiple signals from sensors installed on the user's body and in the surrounding environment and to analyze these signals before issuing the desired wheelchair control commands [7]. This approach can reduce the difficulty of wheelchair control for people with severe disabilities compared to solutions using a single input.

**Citation:** Ngo, B.-V.; Nguyen, T.-H. A Semi-Automatic Wheelchair with Navigation Based on Virtual-Real 2D Grid Maps and EEG Signals. *Appl. Sci.* **2022**, *12*, 8880. https://doi.org/10.3390/app12178880

Academic Editors: Alexander E. Hramov and Alexander N. Pisarchik

Received: 18 July 2022; Accepted: 1 September 2022; Published: 4 September 2022

EEG signals, which originate from the human brain and pose challenging analysis problems, have attracted many researchers. In particular, recent research has shown how cognitive and motor control can improve the Brain–Computer Interface (BCI) to enhance the health of elderly people [8]. According to this study, a BCI system can help elderly people train their motor and cognitive abilities to counteract the effects of aging, making it easier for them to control household appliances and communicate in daily activities. In [9], the authors presented the physical principles of BCIs and fundamental new methods for acquiring and analyzing EEG signals for controls related to brain activity. In particular, BCI systems were classified into three main categories: active, reactive, and passive. In an active BCI, the user controls a complex external device, such as a wheelchair, through a series of functional components of the control system and sees the results of this control on a screen. A reactive BCI inherits many features of an active BCI, with the significant change that the control system is based on classifying brain responses to stimuli such as visuals, sounds, and touch. Passive BCIs are designed to monitor ongoing brain activity and thereby provide important information about the operator's mental state, intent, and interpretation of the situation. The Brain-Controlled Wheelchair (BCW) is a typical BCI application that can help people with physical disabilities interact with the outside environment. In [10], the BCW was examined from many aspects, including the type of EEG signal acquisition, the command set for the control system, and the control method. Moreover, the authors summarized the recent development of the BCW along three main axes: from wet electrodes to dry electrodes, from single-mode to multi-mode, and from synchronous to asynchronous control. These new functions have been adopted in BCWs to increase their stability and robustness.

Mapping and navigation for wheelchairs and mobile robots have attracted many researchers in recent decades. Wheelchairs and mobile robots need detailed maps of their operating spaces so that they can localize themselves during movement, and their current coordinates serve as a basis for collecting new information along the way [11]. Mapping algorithms gradually developed into Simultaneous Localization and Mapping (SLAM) algorithms and were applied to build 3D maps [12]; SLAM addresses the computational problem of constructing or updating a map of an unknown environment while simultaneously keeping track of an agent's location within it. An image processing method was employed to identify fixed artificial landmarks installed in the operating space [13]; these landmarks were then used to determine the current location of a wheelchair on a pre-built map during its movement. Alcantarilla et al. proposed a powerful and fast vision-based method for positioning a wheelchair, in which image features were extracted and matched against map components to provide the current position of a mobile robot [14,15]. In fact, mapping for mobile robots remains a major challenge because of the quality of the data obtained from the environment and the algorithms applied to them [16–18]. Landmark information plays an important role: landmark types [19–22] such as doors, stairs, walls, ceilings, and floors were selected and their features extracted for identification. Landmarks and their features can be detected based on color, texture, brightness, and obstacle size.

In recent years, the Reinforcement Learning (RL) method has achieved great success in many tasks, including games [23] and simulated control agents [24]. Applications of RL to real robots are mostly limited to manipulation settings in which the workspace can be fully observed and is very stable [25]. With mobile robots, complex environments enlarge the sample space, whereas RL methods often draw action samples from a discrete space to simplify the problem [26]. In [27], RL was applied to autonomous navigation from image input and achieved significant success. The authors analyzed agent behavior in static mazes with complex geometries, random starting positions and orientations, and varying target positions. The results showed that this RL method allowed the agent to navigate large, visually rich environments in which the start and target positions changed frequently but the maze layout was always static. In [28], Yuke Zhu et al. searched for the action sequence with minimum distance to move an agent from its current position to a target specified by an RGB image; this required collecting and processing a large number of different images before training the navigation model.

Finding a path on a static grid map is a well-known problem that has been studied extensively in the AI community, and many planning methods and algorithms have been proposed to date [29,30]. Most of these algorithms are based on heuristic search in the state space formed by the grid cells. Neural Networks (NNs) work well on many tasks whose data are collected from sensors or images and then fed to the NN as input. A grid map contains only two types of cells, movable and non-movable, which makes it a natural input for modern artificial neural network architectures such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) [31–34]. In [23], the Deep Q-Network (DQN) algorithm was applied to solve many challenges related to autonomous control [35]. This algorithm combines the Q-Learning algorithm with neural networks: DQNs handle problems with high-dimensional observation spaces by using neural networks to estimate the Q-values of the corresponding actions. Deep RL training, including DQNs and their variants, is mostly performed in virtual environments, because trial-and-error training can damage real robots in typical tasks. The large gap between a structured simulation environment and the very complex real world is the main challenge in transferring a trained model directly to a real robot.
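The Bellman update that a DQN approximates with a neural network can be illustrated with its tabular ancestor. The following is a minimal sketch, not the authors' implementation: tabular Q-Learning on a hypothetical five-cell corridor, with the goal at the right end and a small per-step cost (all hyperparameters are illustrative).

```python
import numpy as np

# Tabular Q-Learning sketch on a 1D corridor (hypothetical example).
# A DQN replaces the Q table with a neural network but uses the same
# Bellman target: r + gamma * max_a' Q(s', a').
N_STATES, N_ACTIONS = 5, 2      # states 0..4; actions: 0 = left, 1 = right
GOAL = 4
ALPHA, GAMMA, EPISODES = 0.5, 0.9, 200

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, N_ACTIONS))

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    r = 1.0 if s2 == GOAL else -0.01    # goal reward plus a small step cost
    return s2, r, s2 == GOAL

for _ in range(EPISODES):
    s, done = 0, False
    while not done:
        # epsilon-greedy exploration (epsilon = 0.1)
        a = rng.integers(N_ACTIONS) if rng.random() < 0.1 else int(Q[s].argmax())
        s2, r, done = step(s, a)
        # Bellman update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[s, a] += ALPHA * (r + GAMMA * Q[s2].max() * (not done) - Q[s, a])
        s = s2

# Greedy rollout with the learned table (bounded for safety)
greedy_path = [0]
while greedy_path[-1] != GOAL and len(greedy_path) < 10:
    greedy_path.append(step(greedy_path[-1], int(Q[greedy_path[-1]].argmax()))[0])
print(greedy_path)   # → [0, 1, 2, 3, 4]
```

After training, acting greedily on the learned Q-values walks straight to the goal, which is exactly the rollout behavior the DQN-based planner relies on.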

This article presents an RL method for obtaining optimal path planning on a virtual 2D grid map. In the first stage, the virtual 2D grid map is built from a real environment, including free spaces, obstacles, landmarks, and targets. This virtual 2D grid map is fed to the input of a DQN, whose output is the Q-value of each of four actions (Right, Left, Up, Down); the action with the largest Q-value is selected so that the wheelchair can reach the desired target from any start point in the real environment. In the second stage, when the wheelchair moves in the real environment, it uses the fully simulated scenery as a Motion Planner (MP) through the virtual 2D grid map. Moreover, the wheelchair determines its current position in both the real and virtual environments using natural landmarks. Given the start and target positions, the MP suggests the optimal path as a sequence of control commands, and a Wheelchair's Action Converter (WAC) converts these into actual control commands so that the wheelchair can complete its route. Finally, we evaluate the performance of the proposed model through a series of tests in both simulated and real environments. The results show that the RL network architecture applied here to path-finding tasks is a promising approach for mobile vehicles in real environments defined by landmarks, obstacles, and start and target points.
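The WAC's role can be sketched as a frame conversion: grid actions (Up/Down/Left/Right) are map-frame directions, while a differential-drive wheelchair executes body-frame commands, so the converter must track the current heading. The command names and the heading convention below are assumptions for illustration, not the authors' actual command set.

```python
# Hypothetical sketch of the Wheelchair's Action Converter (WAC) idea:
# translate map-frame grid actions into body-frame motion commands
# while tracking the wheelchair's heading. Command names are assumed.
HEADINGS = ["Up", "Right", "Down", "Left"]     # clockwise order

def convert(actions, heading="Up"):
    cmds = []
    for a in actions:
        diff = (HEADINGS.index(a) - HEADINGS.index(heading)) % 4
        if diff == 1:
            cmds.append("turn_right")
        elif diff == 2:
            cmds += ["turn_right", "turn_right"]   # 180-degree turn
        elif diff == 3:
            cmds.append("turn_left")
        cmds.append("forward")
        heading = a                                 # heading follows the move
    return cmds

print(convert(["Up", "Up", "Right"]))
# → ['forward', 'forward', 'turn_right', 'forward']
```

A benefit of this separation is that the planner never needs to reason about orientation: the grid map stays purely positional, and all kinematic detail lives in the converter.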

The remainder of this article is organized as follows: Section 2 presents the structure of the system, the method for selecting destinations using EEG signals, and the application of the RL algorithm to determine the optimal path of the wheelchair to the selected destination. Section 3 describes the basic specifications for wheelchair movement and discusses the experiments on the basic functions of the system and the experimental results obtained with the proposed method. Finally, Section 4 presents the conclusions of this research.

## **2. Materials and Methods**

### *2.1. System Architecture*

In this research, a system architecture for optimal path planning is proposed for wheelchair navigation toward desired targets. The architecture comprises two stages for the electric wheelchair in an indoor environment, as described in Figure 1. In the first stage, 2D grid maps whose cells simulate a real indoor environment with different targets provide the cell states and the targets' coordinates, which are the inputs of the DQNs. After training, the DQN model has optimal parameters that can estimate the Q-values of all possible actions for a given state; each DQN therefore has 4 outputs corresponding to the 4 actions (Up, Down, Left, Right). Because each 2D grid map is built for only one of the targets in a real indoor environment, each trained DQN model serves as one MP.
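The input/output shapes described above can be sketched as follows. This is a hypothetical stand-in, not the authors' trained architecture: a two-layer MLP over the flattened grid map, with the grid size, cell encoding, and layer widths all assumed for illustration.

```python
import numpy as np

# Hypothetical sketch of the DQN interface described above: the flattened
# 2D grid map (cell states) goes in, and four Q-values (Up, Down, Left,
# Right) come out. A small random-weight MLP stands in for the trained net.
rng = np.random.default_rng(1)
H, W = 8, 8                                  # grid map size (assumed)
ACTIONS = ["Up", "Down", "Left", "Right"]

W1 = rng.normal(0, 0.1, (H * W, 32))         # untrained weights, for shape only
b1 = np.zeros(32)
W2 = rng.normal(0, 0.1, (32, len(ACTIONS)))
b2 = np.zeros(len(ACTIONS))

def q_values(grid):
    """Forward pass: H x W grid (0 = free, 1 = obstacle, 2 = agent) -> 4 Q-values."""
    x = grid.reshape(-1).astype(float)
    h = np.maximum(0.0, x @ W1 + b1)         # ReLU hidden layer
    return h @ W2 + b2

grid = np.zeros((H, W))
grid[3, 3] = 2.0                             # agent somewhere on the map
q = q_values(grid)
best = ACTIONS[int(q.argmax())]
print(q.shape)    # (4,): one Q-value per action; argmax gives the greedy action
```

With a trained model, `best` would be the next move toward the target; the key point is simply the fixed four-way output head, one scalar per action.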

**Figure 1.** Representation of the system architecture for finding the optimal path of the wheelchair based on the 2D grid map.

In the second stage, the wheelchair is controlled to reach the desired target in the real indoor environment. At the start, the wheelchair determines its own start state based on natural landmarks, and the desired target position in the real environment is known. Given the initial state of the wheelchair on the grid map, the DQN model estimates the Q-values of the 4 outputs corresponding to the 4 actions (Up, Down, Left, Right), and the action with the highest Q-value is selected. This action produces a new state on the grid map, which in turn becomes the input to the DQN model for selecting the next action. The process repeats and ends when the state is the target. After the optimal path to the desired target has been determined, the MP, with its sequence of actions (Right, Left, Up, Down), and the WAC allow the wheelchair to follow this optimal path to the desired target.
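The repeat-until-target loop above can be sketched as follows. A hand-crafted Q function that simply favors actions reducing the Manhattan distance to the target stands in for the trained DQN; the grid coordinates and step bound are illustrative assumptions.

```python
# Sketch of the second-stage rollout: from the wheelchair's localized cell,
# repeatedly query Q-values, take the argmax action, update the cell, and
# stop at the target. A distance-based stand-in replaces the trained DQN.
TARGET = (0, 5)
ACTIONS = {"Up": (-1, 0), "Down": (1, 0), "Left": (0, -1), "Right": (0, 1)}

def fake_q_values(state):
    # Higher Q for actions that reduce Manhattan distance to TARGET.
    r, c = state
    return {a: -(abs(r + dr - TARGET[0]) + abs(c + dc - TARGET[1]))
            for a, (dr, dc) in ACTIONS.items()}

def plan(start, max_steps=50):
    state, path = start, []
    while state != TARGET and len(path) < max_steps:
        q = fake_q_values(state)               # DQN forward pass in reality
        action = max(q, key=q.get)             # highest Q-value wins
        dr, dc = ACTIONS[action]
        state = (state[0] + dr, state[1] + dc)
        path.append(action)
    return path

print(plan((3, 2)))
# → ['Up', 'Up', 'Up', 'Right', 'Right', 'Right']
```

The resulting action sequence is exactly what the MP hands to the WAC; a real DQN additionally encodes obstacle avoidance in its learned Q-values, which this distance heuristic ignores.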

In addition, as shown in Figure 1, the user selects a destination on the grid map using EEG signals. In a semi-automatic wheelchair system, the construction of the grid map and the selection of destinations in it are very important tasks for severely disabled people. For people with severe disabilities who cannot use normal controls, such as pressing a button, operating a joystick, or touching a control screen, the EEG signal is a useful option for controlling the semi-automatic wheelchair. Using EEG signals to control the wheelchair directly may cause stress because of the prolonged concentration required, so the user instead chooses the desired destination through a screen interface with commands suitably designed for his/her actual environment [36]. The commands on the interface screen are assigned based on the type of EEG signal produced by the user's facial behaviors. Figure 2 describes the process of collecting, processing, and analyzing EEG signals to perform control commands through the user interface. EEG signals are collected with an Emotiv EPOC system with 14 channels (14 electrodes) [37]. In particular, the EEG signals collected from the electrodes located over the prefrontal cortex are considered to be the most reliable. The EEG signals are transferred to the signal pre-processing block for filtering and scaling before being sent to the feature extraction block. For wheelchair control, the pre-processed EEG signals are sent to the classification block, which classifies the input signals to produce control commands [36–38]. The user can then use these control commands to select one of the destinations on the environmental map.
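The filtering and scaling step can be illustrated on a synthetic one-channel trace. This is a minimal sketch, not the authors' pipeline: the 1–30 Hz passband is a common EEG choice assumed here, and 128 Hz is the Emotiv EPOC's nominal sampling rate; a simple FFT band-pass stands in for whatever filter was actually used.

```python
import numpy as np

# Sketch of EEG pre-processing (filtering and scaling) on a synthetic
# 1-channel trace: DC offset + 10 Hz "EEG" component + 50 Hz mains noise.
FS = 128                        # sampling rate in Hz (Emotiv EPOC nominal)
t = np.arange(0, 2.0, 1 / FS)   # 2 s of signal
raw = 50 + 10 * np.sin(2 * np.pi * 10 * t) + 2 * np.sin(2 * np.pi * 50 * t)

def bandpass(x, fs, lo=1.0, hi=30.0):
    """FFT band-pass: zero all frequency bins outside [lo, hi] Hz."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    spec[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spec, len(x))

filtered = bandpass(raw, FS)                             # drops DC and 50 Hz
scaled = (filtered - filtered.mean()) / filtered.std()   # z-score scaling
print(round(float(abs(scaled.mean())), 6), round(float(scaled.std()), 6))
# → 0.0 1.0
```

After this stage the trace is zero-mean and unit-variance, which is the kind of normalized input that feature extraction and classification blocks typically expect.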

**Figure 2.** Brain–computer interface process flow.

The user interface is designed to be simple and easy for disabled people to use; in particular, all commands can be operated using the BCI alone, as described in Figure 3. On the interface, the user sees a vertical menu with the symbols of destination names. The names in this menu are pre-defined destinations such as living room, kitchen, and bedroom. To issue the commands for reaching a destination, the user closes the right eye for 2 s to move the cursor on the screen to the desired destination and then closes the left eye to confirm it, as shown in Figure 4. To cancel the selected commands or the selected destination, the user distorts the mouth to the right. All operations for controlling the user interface with the designed destinations were tested on many users, and the designed EEG commands produced the highest accuracy in practice.
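The menu logic above amounts to a small state machine over classified facial events. The following sketch is hypothetical: the event names and destination list are illustrative stand-ins for the outputs of the classification block, not the authors' exact interface.

```python
# Hypothetical sketch of the destination-menu state machine: a right-eye
# closure held ~2 s advances the cursor, a left-eye closure confirms, and
# a rightward mouth distortion cancels. Event names are assumptions.
DESTINATIONS = ["Living Room", "Kitchen", "Bed Room"]

class DestinationMenu:
    def __init__(self):
        self.cursor = 0          # highlighted menu entry
        self.selected = None     # confirmed destination, if any

    def on_event(self, event):
        if event == "right_eye_closed_2s":        # move cursor down (wraps)
            self.cursor = (self.cursor + 1) % len(DESTINATIONS)
        elif event == "left_eye_closed":          # confirm current item
            self.selected = DESTINATIONS[self.cursor]
        elif event == "mouth_distortion_right":   # cancel selection
            self.selected = None
        return self.selected

menu = DestinationMenu()
menu.on_event("right_eye_closed_2s")     # cursor -> Kitchen
menu.on_event("right_eye_closed_2s")     # cursor -> Bed Room
print(menu.on_event("left_eye_closed"))  # → Bed Room
```

Keeping the interface down to three robust facial events, rather than a richer command vocabulary, matches the paper's rationale of minimizing the concentration burden on the user.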

**Figure 3.** User interface for selecting the desired destination.

**Figure 4.** User interface after selecting the desired destination "Bed Room" using the EEG command.
