#### *3.1. Application Overview*

Interest in behavioral architectures has grown over time, as shown in Figure 2. In particular, of the fully evaluated papers, 7 (12.5%) were published before 2014 and 49 (87.5%) were published within the past five years.

**Figure 2.** Number of papers analyzed in this review, categorized by the year of publication.

The selected papers can be divided into two main groups: works describing cognitive architectures, behavioral adaptation models, and empathy models from a conceptual point of view (seven papers, summarized in Table 2) and works presenting experimental studies (twenty-three papers, summarized in Table 3). The papers can be further grouped on the basis of the three areas described in the introduction; Figure 3 shows their distribution across cognitive architectures (46.43%), behavioral adaptation strategies (41.07%), and empathy (15.22%).

**Figure 3.** Number of papers, categorized by the main application covered.



**Table 2.** Summary of the works describing cognitive architectures, behavioral adaptation models, and empathy models from a conceptual point of view.

**Table 3.** Summary of the works presenting experimental studies.

#### *3.2. Data Abstraction*

Data were abstracted from each selected article, as reported in Table 4. The tables give the main purpose of each work, the robot used, the extracted features, and a short description of the implemented models/algorithms. The last column reports the area to which each work belongs (cognitive architectures, behavioral adaptation, or empathy). In addition, for those papers that describe an experimental protocol, the number and type of participants involved in the experimental session are also reported. The objective of the abstraction is to provide an overview of the papers included in this survey and to facilitate their comparison.

#### *3.3. Theoretical Works on the Development of Robotics Behavioral Models*

This section describes published theoretical studies of robotic behavioral models, which belong to the three areas introduced above (Table 2).

#### 3.3.1. Concepts for the Cognitive Application Area

Human cognitive systems are often adopted as inspiration for developing cognitive architectures for robots. In recent years, in fact, assistive and companion robots have achieved advanced social proficiency when equipped with cognitive architectures. Relevant examples of this trend are discussed below.

Reference [23] described cognitive architectures, citing the Learning Intelligent Distribution Agent, Soar, and the Adaptive Control of Thought-Rational (ACT-R) architecture, with the aim of providing a set of commitments useful for developing intelligent machines. This work presents the Theory of Mind (ToM) and "perceptual-motor simulation routines," two of the fundamental theories of social cognition. In particular, ToM represents the inherent human ability to attribute mental states to other social agents. This is possible through the application of theoretical inference mechanisms to cues gathered during social interactions (e.g., facial expressions can be used to probabilistically determine a person's emotional state). The paradigm of "perceptual-motor simulation routines," on the other hand, states that people understand others' mental states through simulation mechanisms, which help the subject attribute a mental state to his/her interlocutor. The authors proposed their own approach, Engineering Human Social Cognition (EHSC), which incorporates social signal processing mechanisms to allow a more natural HRI and focuses on verbal and non-verbal cues to support interaction. Social signal processing can interpret social cues and, from them, individual mental states. The authors underlined that modelling recommendations have centered primarily on the perceptual, motor, and cognitive modelling of a robotic system spanning disciplinary perspectives; this area will require extensive work in the future. The next steps must therefore include both research and modelling efforts that assess the issues and challenges of integrating the proposed types of models and formalisms, which can aid in the development of an integrated, working system based on these recommendations. If instantiated, these recommendations would provide basic perceptual, motor, and cognitive abilities, but future efforts should address whether they would also support more complex forms of social interaction. Such a capability would permit an artificial system to better express or perceive emotions while interacting and communicating with humans in even more complex social scenarios requiring shared decision-making and problem-solving.
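As an illustration of the ToM inference mechanism described above, the following minimal sketch applies Bayes' rule to infer an interlocutor's emotional state from an observed facial-expression cue. The states, priors, and likelihoods are invented placeholders, not values from Reference [23]:

```python
# Minimal sketch of ToM-style probabilistic inference: infer a person's
# emotional state from an observed facial-expression cue via Bayes' rule.
# All priors and likelihoods below are illustrative placeholders.

PRIOR = {"happy": 0.4, "sad": 0.3, "neutral": 0.3}

# P(observed cue | emotional state), e.g., estimated from annotated data
LIKELIHOOD = {
    "smile": {"happy": 0.80, "sad": 0.05, "neutral": 0.30},
    "frown": {"happy": 0.05, "sad": 0.70, "neutral": 0.20},
}

def infer_emotion(cue: str) -> dict:
    """Return the posterior P(state | cue) over emotional states."""
    unnormalized = {s: LIKELIHOOD[cue][s] * PRIOR[s] for s in PRIOR}
    evidence = sum(unnormalized.values())
    return {s: p / evidence for s, p in unnormalized.items()}

print(infer_emotion("smile"))  # posterior mass concentrates on 'happy'
```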

Among the cognitive architectures implemented in social robots to improve HRI, Pieters et al. [35] presented a work with the aim of developing a human-aware cognitive architecture. This system is conceived to provide robots with the ability to understand the human state, both physical and affective, and then to interact in a suitable manner. Starting from cognitive models, the authors organized the architecture around a cognitive model of how memory is organized: a declarative memory for semantic and episodic facts, and a procedural memory. According to this organization, the robot's tasks are encoded as a sequence of actions and events by a symbolic task planner, with the aim of verifying whether, and in what way, a task has already been executed. In Reference [77], the authors proposed an architecture that drives the robot's behavior to acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world.

To provide robots with believable social responses and a more natural interaction, a theoretical model was developed in Reference [33]. The proposed architecture is inspired by the human brain and is structured into four principal modules, which encompass anatomical structures and cognitive functions: the sensory system, the amygdala system, the hippocampal system, and the working memory. This brain-inspired system provides robots with emotional memory, which is fundamental for learning and adapting to dynamic environments. In particular, the authors focus on artificial emotional memory, which lets robots remember emotions, associate them with stimuli, and react appropriately if unpleasant stimuli occur. External stimuli are pre-processed by the sensory system, composed of the sensory cortex and thalamus. Prediction and association between stimuli and emotions are conducted via the amygdala system, which provides emotional feedback to the hippocampal system.

Lastly, another brain-inspired architecture was developed in Reference [34]. It focuses on the autonomous development of new goals in robotic agents. Starting from neural plasticity, the Intentional Distributed Robotic Architecture (IDRA) attempts to simulate a brain circuit composed of the amygdala, the thalamus, and the cortex. The cortex is responsible for receiving signals from the sensory organs, the thalamus develops new motivations in mammals, and the amygdala manages the generation of somatosensory responses. Elementary units, called Deliberative Modules (DMs), enable a learning process that lets the robot learn and improve its skills during the execution of a task. Working memory (acting as the cerebral cortex) and goal generator (acting as the thalamus) modules compose each DM, while the amygdala is represented by instinct modules. Experiments were conducted to verify the ability of a NAO robot (https://www.softbankrobotics.com/emea/en/robots/nao/find-out-more-about-nao, retrieved July 2018) to learn to distinguish particular object shapes and to explore autonomously and learn new movements. Sensing and actuation, the main activities required for learning and cognitive development, were tested: NAO was able to learn new shapes from sensory inputs and to compose new behaviors consistent with its goals. The authors underlined their choice to directly use a high-level representation of neural function, even though a system that uses neural coding as its basic representation could be integrated into IDRA. NAO is often used to implement cognitive architectures: it was used in Reference [79] to evaluate robot-assisted therapy for children with autism and intellectual disability (as was a robot named Kaspar in Reference [78]) and in Reference [73] to examine the effect of robot-assisted language learning (RALL) on anxiety level and attitude in English vocabulary acquisition among Iranian EFL junior high school students.

Another important element that should be considered in the field of behavioral models is the mechanism of affordances. The concept of affordance refers to the relationship between human perceivers and aspects of their environment. Being able to infer affordances is central to common sense reasoning, tool use, and creative problem solving in artificial agents.

Cutsuridis et al. [36] created a cognitive control architecture of the perception–action cycle for visually guided reaching and grasping of objects by a robot or agent, which melds perception, recognition, attention, cognitive control, value attribution, decision-making, affordances, and action. The suggested visual apparatus allows the robot/agent to recognize both the object's shape and location, extract affordances, and formulate motor plans for reaching and grasping.

Haazebroek et al. [37] presented HiTEC, a novel computational (cognitive) model that allows for direct interaction between perception and action as well as for cognitive control, demonstrated by task-related attentional influences. In their model, the notion of affordance is effectively realized by allowing for automatic translation of perceptual object features (e.g., object shape) to action by means of overlap with anticipated action effect features (e.g., hand shape).

Reference [39] proposed Simulation Theory and neuroscience findings on Mirror-Neuron Systems as the basis for a novel computational model for handling affective facial expressions. The model is based on a probabilistic mapping of observations from multiple identities onto a single fixed identity ('internal transcoding of external stimuli'), and then onto a latent space ('phenomenological response').

Asprino et al. [41] presented an Ontology Design Pattern for the definition of situation-driven behavior selection and arbitration models for cognitive agents. The proposed pattern relies on the descriptions and situations ontology pattern, combined with a frame-based representation scheme. Inspired by affordance theory and behavior-based robotics principles, their reference model enables the definition of weighted relationships, or affordances, between situations (representing the agent's perception of the environmental and social context) and the agent's functional and behavioral abilities. These weighted links serve as a basis for supporting runtime task selection and arbitration policies, to dynamically and contextually select the agent's behavior.

Lastly, a cognitive industrial paradigm called context-aware cloud robotics (CACR) has been applied to advanced material handling. Compared with one-time on-demand delivery, CACR is characterized by two features: (1) context-aware services and (2) effective load balancing. A CACR case study was performed to highlight its energy-efficient and cost-saving material handling capabilities.

#### 3.3.2. Concepts for the Empathy Area

Empathy is becoming an important aspect of social robotics, and several behavioral models take it into consideration.

Reference [31] showed how different emotion-based models were created to build empathetic and emotional robots. The main cues used in these models are movements, gestures, and postures. In another paper, the same authors explored different dimensions of artificial empathy and presented several empathy models: a conceptual model of artificial empathy structured on the developmental axis of self-other cognition, statistical models based on battery level or temperature, and a four-dimension empathy model. The cues used in that article were unimodal and multimodal communication cues, as opposed to the previous work, which used movements.

Reference [80] discussed a conceptual model of artificial empathy with respect to several existing studies. The model is based on affective developmental robotics, which provides more authentic artificial empathy building on the concept of cognitive developmental robotics. The authors showed how the model worked using two different robots: an emotional communication robot called WAMOEBA and a humanoid robot called WE.

#### 3.3.3. Concepts for Behavioral Adaptation Area

Designing an intelligent agent is a difficult task because the designer must see the problem from the agent's viewpoint, considering all its sensors, actuators, and computation systems. Farahmand et al. [38] introduced a bio-inspired hybridization of reinforcement learning, cooperative co-evolution, and a culture-inspired memetic algorithm for the automatic development of behavior-based agents. Reinforcement learning is responsible for individual-level adaptation. Cooperative co-evolution operates at the population level and provides basic decision-making modules for the reinforcement-learning procedure. The culture-based memetic algorithm, a new computational interpretation of the meme metaphor, accelerates learning by sharing learned structures among all agents in the society; the lifetime performance of each agent, which is crucial for real-world applications, increases considerably when the memetic algorithm is active.
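A rough sketch of this sharing mechanism, under simplifying assumptions (tabular Q-learning agents, a "meme" modeled as the fittest agent's Q-values), might look as follows; this illustrates the general idea, not the authors' actual algorithm:

```python
class Agent:
    """Tabular Q-learning agent whose Q-table doubles as shareable 'memes'."""
    def __init__(self, n_states: int, n_actions: int):
        self.n_actions = n_actions
        self.q = {(s, a): 0.0 for s in range(n_states) for a in range(n_actions)}
        self.fitness = 0.0  # lifetime performance, updated by the environment

    def learn(self, s, a, reward, s_next, alpha=0.1, gamma=0.9):
        """Standard Q-learning update: individual-level adaptation."""
        best_next = max(self.q[(s_next, a2)] for a2 in range(self.n_actions))
        self.q[(s, a)] += alpha * (reward + gamma * best_next - self.q[(s, a)])

def share_memes(population, blend: float = 0.5) -> None:
    """Memetic step: every agent blends the fittest agent's learned
    structure (Q-values) into its own, spreading experience society-wide."""
    donor = max(population, key=lambda ag: ag.fitness)
    for agent in population:
        if agent is not donor:
            for key in agent.q:
                agent.q[key] = (1 - blend) * agent.q[key] + blend * donor.q[key]

society = [Agent(n_states=4, n_actions=2) for _ in range(5)]
society[0].learn(s=0, a=1, reward=1.0, s_next=2)
society[0].fitness = 1.0   # suppose agent 0 performed best this generation
share_memes(society)       # its experience now propagates through the society
```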

#### *3.4. Experimental Works on the Development and Implementation of the Behavioral Model*

In this section, published works on behavioral models that include an experimental loop are presented, divided into sub-categories according to the application area (Table 3).

#### 3.4.1. Experimental Works for Cognitive Architectures

Concerning cognitive architectures, Reference [53] proposed a cognitive framework inspired by the human limbic system to improve HRI between humanoid robots and children during a game session. The robot's emotional activity was modelled with computational modules representing the amygdala, hippocampus, hypothalamus, and basal ganglia, and was used to suggest optimal game actions to users. The results showed that this cognitive architecture provided an efficient mechanism for representing cognitive activity in humanoid robots, and the children's attention level was higher than in a game session without the robot.

Reference [59] used an Interactive Social Engagement Architecture (ISEA) and an interactive user interface to gather information from children. The authors tested the developed architecture with a NAO robot and two other humanoids with 186 children. ISEA integrates and combines human behavior models, behavior-based robotics, cognitive architectures, and expert user input to improve social HRI. Eight modules compose the framework: knowledge, user input, sensor processing, perceptual, memory, behavior generation, behavior arbitration, and behavior execution. The knowledge module models human behaviors, while the perceptual module manages external sensor data from the environment, processes and interprets the data, and sends results to the memory module. The behavior generation module calculates which behavior and communication strategies should be used and passes them on for arbitration and execution. Novel emergent behaviors can be obtained by combining newly generated behaviors with the behaviors stored in the memory module. Each time a behavior is displayed, the robot's internal state is updated to keep track of the newly stored data. Preliminary results showed that children seemed to find it more comfortable to engage with a robot, rather than with humans, when sharing information about their bullying experiences at school. Although this research is only midway through the grant award period, the developments and results are promising; the authors report slow and steady progress with the development of this Integrated Robotic Toolkit, although significant work remains to be explored with this approach.

Reference [52] proposed an intention understanding system consisting of perception and action modules. It is an object-augmented model, composed of two neural network models that integrate perception and action information to allow the robot to better predict the user's intention. The model was tested in a cafeteria with customers and clerks. The action module was able to understand the human intention and associate a meaning with it in order to predict an object related to that action. The combination of these modules resulted in improved human intention detection.

As explained in the theory section of the cognitive area, affordances are important elements for building a behavioral model for social robots. They encode relationships between actions, objects, and effects, and play an important role in basic cognitive capabilities such as prediction and planning [62]. Reference [62] developed a computational framework based on Dempster-Shafer (DS) theory for inferring cognitive affordances, explaining that this much richer level of affordance representation is needed to allow artificial agents to adapt to novel open-world scenarios. Reference [63] also underlined that affordances play an important role in basic cognitive capabilities such as prediction and planning; the authors consider the problem of learning affordances a key step toward understanding world properties and developing social skills.
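For readers unfamiliar with DS theory, the following minimal sketch shows Dempster's rule of combination, the operation at the core of such frameworks; the affordance hypotheses and masses are invented for illustration and are not taken from Reference [62]:

```python
from itertools import product

def combine(m1: dict, m2: dict) -> dict:
    """Dempster's rule of combination for two mass functions whose focal
    elements are frozensets of hypotheses (here, candidate affordances)."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Illustrative affordance evidence from two cues (shape vs. context)
graspable, liftable = frozenset({"graspable"}), frozenset({"liftable"})
either = graspable | liftable
m_shape = {graspable: 0.6, either: 0.4}
m_context = {liftable: 0.3, either: 0.7}
print(combine(m_shape, m_context))  # normalized combined belief masses
```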

Reference [69] also proposed a model with collaborative cognitive skills, such as geometric reasoning and situation assessment based on perspective-taking and affordance analysis.

Another important element to be taken into consideration in the implementation of a behavioral model is facial expressions. These are often based on an inner model related to the emotional state, rather than on a purely categorical choice. Chumkamon et al. [64] proposed a framework that focuses on three main topics, including the relation between facial expressions and emotions. The first point of their model is the organization of behavior, including inner-state emotion, with respect to a consciousness-based architecture. The second presents a method whereby the robot can show empathy toward its human user's expressions of emotion. The last point shows the method that enables the robot to select a facial expression in response to the human user, providing instant human-like 'emotion'; it is based on emotional intelligence (EI) and uses a biologically inspired topological online method to express, for example, encouragement or delight. Other applications of facial expressions in a cognitive architecture are shown in References [39] and [75]. Reference [55] proposed a robotic system that could learn online to recognize facial expressions without a teaching signal associated with each expression. Reference [74] created a system composed of three robots that helped elderly people during their daily activities, such as reminding them to take their medication or bringing them desired objects, and analyzed their facial expressions to recognize them.

Lastly, learning from demonstration is used in Reference [71]. The authors proposed a learning method for collaborative and assistive robots based on movement primitives. The method allows for both action recognition and human-robot movement coordination.
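As a rough illustration of the movement-primitive idea (not the specific formulation of Reference [71]), a demonstrated trajectory can be encoded as a weighted combination of basis functions and then reproduced at an arbitrary temporal resolution:

```python
import numpy as np

def rbf_features(t: np.ndarray, n_basis: int = 10, width: float = 0.02):
    """Normalized Gaussian basis functions over a phase variable t in [0, 1]."""
    centers = np.linspace(0, 1, n_basis)
    phi = np.exp(-((t[:, None] - centers[None, :]) ** 2) / (2 * width))
    return phi / phi.sum(axis=1, keepdims=True)

def learn_primitive(demo: np.ndarray, n_basis: int = 10) -> np.ndarray:
    """Fit basis weights to a demonstrated 1-D trajectory via least squares."""
    t = np.linspace(0, 1, len(demo))
    weights, *_ = np.linalg.lstsq(rbf_features(t, n_basis), demo, rcond=None)
    return weights

def reproduce(weights: np.ndarray, n_steps: int = 100) -> np.ndarray:
    """Roll the learned primitive out at any temporal resolution."""
    t = np.linspace(0, 1, n_steps)
    return rbf_features(t, len(weights)) @ weights

demo = np.sin(np.linspace(0, np.pi, 50))  # stand-in for a human demonstration
w = learn_primitive(demo)
trajectory = reproduce(w, n_steps=200)    # smooth reproduction for the robot
```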

#### 3.4.2. Experimental Works on Empathy

When a social robot interacts with human users, empathy represents one of the key factors for increasing the naturalness of HRI. Emotional models are fundamental for the social abilities needed to establish empathy with users.

Reference [81], for example, evaluated and compared the emotion recognition algorithm in two different robots (NAO and Pepper) and created metrics to evaluate the empathy of these social robots.

Reference [44] developed emotion-based assistive behavior to be implemented in socially assistive robots. According to the user's state, the model provides the robot with the ability to show appropriate emotions, which elicit suitable actions in humans. The robot's environmental and internal information, plus the user's affective state, represent the inputs for the Brian robot to alter its emotional state according to the well-being of a participant and to assist in executing tasks. In this work, the robot's emotional module is employed not to provoke emotional feelings, but rather in terms of the assistive tasks that the robot should perform to satisfy the user's well-being.

The experiments show the potential of integrating the proposed online-updating Markov chain module into a socially assistive robot to obtain compliance from individuals to engage in activities. Using robotic behavior that focuses on the well-being of the person could be beneficial to the person's health. Testing with the fully encompassing target user group is still needed to evaluate the overall robot in its intended assistive applications.
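The notion of an online-updating Markov chain can be illustrated with transition counts that are revised after every observed user-state transition; the states below are invented placeholders rather than the model of Reference [44]:

```python
from collections import defaultdict

class OnlineMarkovChain:
    """Markov chain over user states whose transition probabilities are
    re-estimated online from observed state transitions."""
    def __init__(self, states):
        self.states = list(states)
        # Nested counts with a Laplace prior of 1 per transition
        self.counts = defaultdict(lambda: defaultdict(lambda: 1))

    def observe(self, prev_state: str, next_state: str) -> None:
        """Update counts after each observed transition (online learning)."""
        self.counts[prev_state][next_state] += 1

    def transition_probs(self, state: str) -> dict:
        """Current estimate of P(next state | state)."""
        row = {s: self.counts[state][s] for s in self.states}
        total = sum(row.values())
        return {s: c / total for s, c in row.items()}

chain = OnlineMarkovChain(["engaged", "bored", "stressed"])
chain.observe("bored", "engaged")  # e.g., after an encouraging robot action
print(chain.transition_probs("bored"))
```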

Reference [43] implemented an experiment with the iCat robot (http://www.hitech-projects.com/icat/, retrieved July 2018), which aims to provide a computer-based assistant that can persuade and guide elderly people to behave in a healthy way. Previous works demonstrated that combining the robot's empathy with the user's state contributed to a better appreciation of a personal assistant [82]. iCat features an emotional model that makes it able to smile and express sadness. The authors implemented natural cues, such as understanding, listening, and looking, to perform different roles for the robot (educator, buddy, and motivator). The analysis was conducted by considering participants' personalities, the percentage of total time that participants talked, laughed, and looked at the robot, and how many times the participants said "goodbye" as a sign that they interpreted the robot as a social entity. The aim of the work was to establish behaviors for an electronic personal assistant with a high level of dialogue, emotions, and social competencies. The findings showed that the natural cues used by iCat provoked more empathy and social involvement with users. When non-social cues were used, users perceived the robot as less trustworthy and less persuasive, and tended to avoid its suggestions.

During the experiments, the physical character was found to be more trustworthy but less empathetic than the virtual character, which was not expected. This negative outcome on empathy might be due to specific constraints of the iCat: it makes a relatively large amount of noise when it moves, and the head and body movements may not be fluent enough. Another technical constraint was the occasional appearance of errors in the movements and speech, such as skipping choices of the multiple-choice questions. Furthermore, it may be that the three character roles did not capture important advantages of a physical character that can act in the real environment. For instance, more positive outcomes might show up with a character that helps attend to a medicine box in a specific location in the house, compared to a virtual character that is not a real actor in the house.

Reference [25] contributed to social pervasive robotics by proposing an affective model for social robots, emphasizing the concept of empathy. In preliminary tests, behavioral adaptation according to users' needs and preferences achieved better social inclusion in a learning scenario. The first part of the model, called the "Affective loop," was a module for the perception of humans, characterized by body-based emotion recognition. According to this module's outputs, the internal state of the robot changed, generating a complex emotional spectrum based on a psycho-evolutionary theory of emotions. The user was able to visualize the robot's internal state and adjust some system parameters for the duration and intensity of each emotion. The user's interest in the interaction was then monitored by the visual system: when it decreased, the robot changed its behavior to socially involve the user and selected its emotion according to the user's state. Affective behaviors were also adapted to the goal of interaction in a cooperative task between the robot and users.

Lastly, a comparison between two different cultures was made in Reference [30]. The authors compared the expression features of compassion, sympathy, and empathy in British English and Polish, using emotion models with sensory cues as inputs.

#### 3.4.3. Experimental Works on Behavioral Adaptation

An attempt to develop robots that are emotive and sociable like humans, showing a capability to adapt behavior in a social manner, is presented in Reference [58]. Starting from the Myers-Briggs theory of human personality, the authors mapped human psychological traits to develop an artificial emotional intelligence controller for the NAO robot. The proposed model was designed as a biological system, with a structure of emotionally driven and social behavior represented by three fuzzy logic blocks. Three variables were used as system input: a "trigger event" that incites different psychological reactions, a "behavior profiler" that models event-driven behavior to fit the profiles of the individuals whose behavior is modelled, and a "behavior booster/inhibitor" that augments or decreases the affective expressiveness. Social behavior attributes were implemented in the NAO robot controller according to this model. The robot interacted with young researchers, recognizing calls and gestures, locating people in the environment, and showing personality traits of joy, sociability, and temperament. The model considers personality traits, social factors, and external/internal stimuli, as human psychology does when interacting with others. In Reference [76], NAO was also used to assist children in developing self-regulated learning (SRL) skills. Combining the knowledge about personality traits derived from the Myers-Briggs theory and validated by Reference [58] with experimental measurements of affective reactions from a live model performed by an actor, Reference [61] developed a cognitive model of human psychological behavior. This model includes personality types and human temperaments to be implemented in the Robothespian humanoid robot. The authors tuned the block scheme developed in Reference [58] according to measurements from an actor performing as a behavioral live model; different affective behaviors were played to create affective reactions to be added to the previous model.
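The fuzzy-logic flavor of such controllers can be sketched with triangular membership functions and a pair of hand-written rules; the variables, rules, and thresholds below are illustrative and are not those of References [58] or [61]:

```python
def tri(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def expressiveness(trigger_intensity: float, sociability: float) -> float:
    """Toy Mamdani-style inference: map a trigger event's intensity and a
    sociability trait (both in [0, 1]) to affective expressiveness."""
    weak   = tri(trigger_intensity, -0.5, 0.0, 0.6)
    strong = tri(trigger_intensity,  0.4, 1.0, 1.5)
    shy    = tri(sociability, -0.5, 0.0, 0.6)
    open_  = tri(sociability,  0.4, 1.0, 1.5)

    # Rule 1: strong trigger AND open personality -> high output (1.0)
    # Rule 2: weak trigger OR shy personality     -> low output  (0.2)
    rule_high = min(strong, open_)
    rule_low = max(weak, shy)
    if rule_high + rule_low == 0:
        return 0.5  # no rule fires: neutral expressiveness
    return (rule_high * 1.0 + rule_low * 0.2) / (rule_high + rule_low)

print(expressiveness(0.9, 0.8))  # expressive reaction for a sociable profile
```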

Studies on proxemics, speed, and trajectories have provided valuable suggestions to improve HRI, especially for behavior adaptation according to the user's movements and position. In Reference [56], the authors investigated a robot's trajectories and speed when following a user in a real domestic environment, in order to provide a comfortable social interactive behavior. The authors presented a framework for people detection, state estimation, and trajectory generation that can regulate robotic behavior. To select the appropriate behavior, the robot used the user's state and localization as input, considering movements and context. Trajectories and velocity were considered in Reference [57], with a robot moving with a social partner toward the same goal. The authors developed and tested a person-aware navigation system by modifying a trajectory planner. The criterion for changing the planner was the distance between the robot and the user, according to which the robot adapted its velocity and trajectory to reach the goal while remaining close to the user. The approach described in that paper is limited because it only considers the distance to the goal while ignoring the available free space; the model could be augmented to consider free-space features, such as the free space in front of each social agent and the distances to walls and other obstacles, to be better informed.
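The distance criterion used to adapt the planner can be illustrated with a simple velocity-scaling rule; the thresholds are invented for illustration, not taken from Reference [57]:

```python
def adapt_velocity(dist_to_user: float, v_nominal: float = 1.0,
                   follow_dist: float = 1.2, max_dist: float = 3.0) -> float:
    """Scale the robot's speed so it progresses toward the goal while
    staying close to its human partner (illustrative thresholds, in meters)."""
    if dist_to_user <= follow_dist:
        return v_nominal   # side by side: keep the nominal speed
    if dist_to_user >= max_dist:
        return 0.0         # too far ahead of the user: stop and wait
    # Linearly slow down as the robot pulls away from the user
    ratio = (max_dist - dist_to_user) / (max_dist - follow_dist)
    return v_nominal * ratio

for d in (0.8, 1.8, 2.9):
    print(f"user at {d:.1f} m -> commanded speed {adapt_velocity(d):.2f} m/s")
```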

A similar work is presented by Reference [51] with a model to interpret the user's behavior and inclination toward interaction with an assistant robot. The robot was able to determine the user's behavior through body movements and extraction of posture features. According to its interpretation, the robot decided if it should move closer or should wait for a better inclination from the user to interact. The major benefit of this model is that it does not use verbal instruction from the user, which allows the robot to assess the suitability of starting a conversation by using posture and movement analysis.

Behavioral adaptation according to users' preferences and feedback on the robot's actions is presented in Reference [46]. Two learning algorithms were applied to an internally developed adaptive robot known as EMOX (EMOtioneXchange). After identifying the user's profile, the robot proposed a personalized activity, assisting and interacting with the user after the activity selection. The user's feedback after each activity was traced, giving the robot a memory of the user's preferences to help it suggest a more appreciated activity later. The robot's architecture takes observations of user behavior, feedback, and the environment as input; the robot's actions are the system output, determined through knowledge rules such as interaction traces, user profiles, a decision process, and learning from feedback. The results showed that, even though the hand-gesture interaction modality was found difficult, most participants found the robot's behavior adaptable and pertinent to their preferences.

Reference [49] presented a novel control architecture for the internally developed Brian 2.0 robot. The aim was to adapt the robot's behaviors according to the user's state, acting as a social motivator and assisting when needed. To be effectively integrated into society, robots should be provided with social intelligence to interact with humans. This architecture promoted the robot's abilities to support and motivate users during a memory game session designed to stimulate humans cognitively. Encouragement and assistance were provided through a modular learning architecture that determined the user's state and performance, recorded through sensors, cameras, and dedicated modules, and modified the robot's behavior according to these inputs. The combination of the robot's emotional state module and intelligence layer led to the establishment of the robot's current assistive action in relation to the user's state and adapted the robot's behavior to the interactive scenario, using non-verbal modalities of communication.

Reference [24] investigated a robot's behavior by proposing a model that adapts to the visitor's intention. A humanoid robot was tested in a shopping mall during approaching and interaction tasks. The robot was provided with two interaction strategies depending on the users' behaviors: when visitors showed uncertain intentions, the "proactively waiting" strategy was used and the robot went toward them, whereas the "collaboratively initiating" strategy was used when visitors' willingness to interact was evident, and the robot started a conversation and moved closer to them. To reach a more natural context for interacting with robots, Reference [47] presented an experiment with a social robot learning to perform word-meaning associations. The authors hypothesized that a different human attitude in approaching the robot could be obtained; the robot's design aimed to evoke a strong social response from humans. The social cues used influenced the tutoring of the human teacher and his behavior. The HRI was measured through a language game, during which the learner assimilated a lexicon and associated meanings, modifying the word-meaning associations based on the teacher's feedback. This can be considered a sort of behavioral adaptation, applied in a different context, that could improve the robot's social abilities. Through facial tracking, the robot was able to address participants during the interaction, which emphasized the social involvement. The robot also used additional multimodal social cues (gaze and verbal statements) to express its learning preferences, which modulated the interaction and influenced it positively. Reference [48] developed a spatial relationship model that considers interpersonal distance, body orientations, emotional state, and movements. On the basis of these inputs, the robot decides how to proceed, setting its voice and deciding whether to move toward the user. As the robot comes close to the child, entering the "personal" distance zone, the current status of the user is re-evaluated to better adapt the robot's actions. Children with cognitive disabilities interacted with the robot in free and structured game sessions. The robot's tactile sensors made it possible to capture tangible interaction, expressed through physical contact with the children; depending on the type of touch contact, the robot selected an appropriate behavior using multimodal emotional expressions. The robot's behavior can thus be adapted depending on the user's emotion, seen as an emotional stimulus for the robot's cognitive architecture. Reference [50] proposed a cognitive-emotional interactive model for interaction and communication tasks between young users and a robot. During the interaction, the emotional robot expressed its emotions using facial expressions, movements, and gestures as a consequence of the user's emotion, according to a Hidden Markov Model. This model allowed the robot to regulate emotions as humans do, providing a better interaction. The model starts from the hypothesis that robots should know a human's cognitive process in order to understand human behaviors. To this end, an object-functional role perspective method allowed robots to understand human behaviors: objects are interpreted in terms of object-functional roles and role interactions, and an activity is interpreted as an integration of object role interactions, so the robot is able to predict and understand a human activity. Because this model only deals with emotional intensity attenuation, the continuous prediction of spontaneous affect still needs to be improved; the authors are considering expanding the experimental sample size and seeking more effective evaluation approaches for affective computing. Reference [54] also proposed a model that used a child's affective states and adapted its affective and social behavior in response.
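The HMM ingredient of such models can be sketched as a forward-algorithm filter over the user's latent emotion, which the robot can then respond to; all states and matrices below are illustrative placeholders, not the model of Reference [50]:

```python
import numpy as np

states = ["happy", "sad"]            # latent user emotions (illustrative)
observations = {"smile": 0, "cry": 1}

pi = np.array([0.6, 0.4])            # initial state distribution
A = np.array([[0.8, 0.2],            # state transition matrix
              [0.3, 0.7]])
B = np.array([[0.9, 0.1],            # P(observation | state)
              [0.2, 0.8]])

def filter_emotion(obs_sequence):
    """Forward algorithm: belief over the user's current emotional state."""
    belief = pi * B[:, observations[obs_sequence[0]]]
    belief /= belief.sum()
    for obs in obs_sequence[1:]:
        belief = (belief @ A) * B[:, observations[obs]]
        belief /= belief.sum()
    return belief

belief = filter_emotion(["smile", "smile", "cry"])
print(dict(zip(states, belief.round(3))))
# The robot would then select congruent expressions and gestures for the
# most probable user emotion, e.g., comforting behavior if 'sad' dominates.
```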

In Reference [83], the authors also adapt robots' behavior to human emotional intention: an information-driven multi-robot behavior adaptation mechanism is proposed for HRI. In this mechanism, the optimal behavior policy is selected by information-driven fuzzy friend-Q learning (IDFFQ), and facial expressions with identification information are used to understand human emotional intention. The aim is to make robots capable of understanding and adapting their behaviors to human emotional intention, so that HRI runs smoothly. The importance of facial expressions for the implementation of social robots is shown in two other works: Reference [67] created a model for object, facial, gesture, voice, and biometric recognition, and Reference [68] used a Multi-channel Convolutional Neural Network (MCCNN) to extract emotions from facial expressions.
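Fuzzy Q-learning, the generic technique underlying approaches such as IDFFQ, replaces the discrete state index of classical Q-learning with membership degrees over fuzzy state labels. A minimal sketch with invented labels follows (this illustrates the generic technique, not the IDFFQ algorithm of Reference [83]):

```python
import numpy as np

ACTIONS = ["approach", "wait", "encourage"]
FUZZY_STATES = ["positive", "neutral", "negative"]  # emotional-intention labels
Q = np.zeros((len(FUZZY_STATES), len(ACTIONS)))

def memberships(valence: float) -> np.ndarray:
    """Fuzzify a valence score in [-1, 1] into the three state labels."""
    pos = max(0.0, valence)
    neg = max(0.0, -valence)
    return np.array([pos, 1.0 - abs(valence), neg])

def select_action(valence: float) -> int:
    """Weight each state's Q-row by its membership degree, then act greedily."""
    blended = memberships(valence) @ Q   # fuzzy aggregation of Q-values
    return int(np.argmax(blended))

def update(valence, action, reward, next_valence, alpha=0.1, gamma=0.9):
    """Distribute the TD update across fuzzy states by membership degree."""
    mu, mu_next = memberships(valence), memberships(next_valence)
    target = reward + gamma * np.max(mu_next @ Q)
    for i, m in enumerate(mu):
        Q[i, action] += alpha * m * (target - Q[i, action])

a = select_action(0.4)                   # mildly positive user intention
update(0.4, a, reward=1.0, next_valence=0.6)
```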

Affordances are also used to implement behavioral models that can adapt to users' needs. Another important aspect related to behavioral models is the cultural adaptation of the robot. Reference [40] proposed a multi-robot behavior adaptation mechanism based on cooperative-neutral-competitive fuzzy Q learning for coordinating local communication atmospheres in human-robot interaction. Fuzzy Q learning is an approach that fuses fuzzy logic with the discrete Q-learning method, and the authors introduced what they call "communication atmosphere" information for human-robot interactions. This approach was tested with people from different countries and with different backgrounds to overcome the problem of cultural adaptation. Reference [65] also proposed a robotic system that helps therapists in sessions of cognitive stimulation. Without taking into account aspects such as the patient's perception of the robot or the impact of the cultural environment, the application of such systems may be doomed to failure; the authors showed evidence of how the cultural adaptation of the robots has been decisive in their success.

Inspired by infant development, Reference [66] proposed a three-staged developmental framework for an anthropomorphic robot manipulator. In the first stage, the robot is initialized with a basic reach-and-enclose-on-contact movement capability and discovers a set of behavior primitives by exploring its movement parameter space. In the next stage, the robot exercises the discovered behaviors on different objects and learns the caused effects. This effectively builds a library of affordances and associated predictors. In the third stage, the learned structures and predictors are used to bootstrap complex imitation and action learning with the help of a cooperative tutor. Reference [70] developed an innovative approach that allows one or more human operators to share control authority with a high-level behavior controller on the basis of previous work on operator-centric manipulation control at the level of affordances. In their work, the affordances of the object template can be requested from the Object Template Server (OTS) and can be executed so that the robot performs the required arm motions to achieve the manipulation task.
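The staged affordance-learning idea can be illustrated with a small library that records the observed effects of behavior-object pairs and later predicts the most likely effect; the entries are invented for illustration and are not from Reference [66]:

```python
from collections import defaultdict
from typing import Optional

class AffordanceLibrary:
    """Stores observed (behavior, object) -> effect outcomes and predicts
    the most frequent effect, as a stand-in for learned effect predictors."""
    def __init__(self):
        self.outcomes = defaultdict(lambda: defaultdict(int))

    def record(self, behavior: str, obj: str, effect: str) -> None:
        """Stage 2: exercising behaviors on objects, storing caused effects."""
        self.outcomes[(behavior, obj)][effect] += 1

    def predict(self, behavior: str, obj: str) -> Optional[str]:
        """Stage 3: predict the effect; None if the pair was never explored."""
        effects = self.outcomes.get((behavior, obj))
        if not effects:
            return None
        return max(effects, key=effects.get)

lib = AffordanceLibrary()
lib.record("push", "ball", "rolls")
lib.record("push", "box", "slides")
lib.record("grasp", "ball", "lifted")
print(lib.predict("push", "ball"))  # -> 'rolls'
```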

Lastly, Reinforcement Learning techniques were also used to create an HRI system (a robot that assists a human operator) that performs a given task with minimum workload demands and optimizes the overall human–robot system performance [72].

#### **4. Discussion**

The aim of this work is to analyze the state of the art and, thus, to provide a list of hints regarding cognitive architectures, behavioral adaptation, and empathy. Future research efforts should aim to overcome the limitations of the current state of the art, as summarized in Table 4.




**Table 4.** Data abstracted from the selected articles.

The table highlights several areas to be analyzed in future works, such as sensor technology, perception, architecture design, and the presence or absence of an experimental phase.

Moreover, ethical, legal, and social aspects should be taken into consideration to build an efficient behavioral model for future robots.

#### *4.1. Sensors Technology*

A crucial aspect of HRI is how robots manage to understand the intentions and emotions of users from social cues (i.e., posture and body movements, facial expression, head and gaze orientation, and voice quality). Sensors play a fundamental role because they are used to detect these cues, which are then processed in the robot's model. One issue related to sensors is that data acquisition can face delays, so a future effort could be to design sensors that are reliable and usable in real-life situations. Moreover, the robot should have a multisensory system to acquire different types of signals [32]. To achieve this goal, microphones, 2D and 3D vision sensors, thermal cameras, a Leap Motion, a Myo armband, and face-trackers could be combined to create a system that gives the robot complete sensor coverage. Each device could cover a different area: microphones for speech, the Myo armband for IMU and EMG data, face-trackers for head pose, gaze, and FACS, vision sensors for point cloud data, thermal cameras for detecting objects in dark environments, and the Leap Motion for tracking and estimating hand position.

#### *4.2. Perception and Learning from the User*

A second ability that should be analyzed in depth is perception. The main problem of perception is to have a reliable real-time sensing and learning system, as expressed in Reference [84]. In that paper, the authors show that people's preferences and knowledge change over time, so a good system should be capable of adapting to these changes in real time and should be able to learn from the user. To achieve the latter competence, advanced learning-based methods [86] should be used to satisfy user needs while increasing the performance of the robot [87]. Additionally, future works should detect and handle emotion transitions, since humans change their emotions continually [44]. This topic is one of the main issues of HRI because the robot needs real-time data acquisition to handle emotional transitions, which is quite difficult to achieve due to delays during data acquisition.

Another limitation of this area is the disconnection between perception and action [52]; this problem should be overcome to obtain a reactive robot. Reference [28] underlined this concept, noting that changes in the user profile should be promptly detected so that behavior can be adapted automatically.

An effort that could be interesting to study more deeply in the future is the possibility for the robot to acquire complex skills learned from the user (e.g., cleaning a table). Reference [46] proposed that the robot should be able to change its internal model on the basis of what it learns from human beings, using the learning-from-demonstration approach.

Moreover, it is worth underlining the importance of increasing the number of non-verbal input parameters considered in the analysis, to make the robot more compliant and adaptable to the user's state and preferences [51].

#### *4.3. Architecture Design*

Concerning architecture design, more modular and more flexible architectures should be created in the future: robots should, for example, be able to react autonomously to an unplanned event, and a complete risk analysis procedure should be performed in the design phase to correctly handle unplanned situations [19,24].

Behavioral consistency, predictability, and repeatability should be investigated, since they are fundamental requirements in the design of socially assistive robots in different contexts, such as for children with autism, as underlined in Reference [48]. Addressing them requires an accurate case analysis, grounded in current practice and extensive experimentation. A possible approach could be the use of learning from demonstration to teach the robot skills and achieve better performance [87].

Moreover, a multidisciplinary approach should be pursued to design and develop reliable and acceptable behavioral models [34]. Psychology, biology, and physiology, among others, are areas of expertise that should be part of the development process, since they can help improve the HRI experience [88]. Innovative behavioral models for assistive robots could be developed by taking inspiration from studies of human social models or from the study of a specific anatomical apparatus. In this manner, future research will consider not only the physical safety of the robot and the human beings, but also the psychological, anatomical, and social spheres of humans [48].

It is important to underline that the robot should be appropriate not only in the context for which it was originally conceived (e.g., private home, hospital, or residential facility), but also for people with different levels of residual abilities. In other words, the robot should be able to adapt to variability and to different cultural and social contexts [89].

Lastly, a robot model should have a cloud architecture, in order to offload computationally intensive tasks to the cloud, access vast amounts of data and shared knowledge, and avoid losing information in case of connection problems.

#### *4.4. Experimental Phase*

The advantage of testing robotic solutions with real users is remarkable, as shown by the comparison between papers with and without experimental sessions. In particular, some works [33,35,61] remark on the importance of testing the proposed model as a fundamental step for future research. The main issue is that, in some scenarios, robots that complete a task during the simulation phase do not succeed in the experimental phase. This is why testing the behavior of the robot in a real environment is extremely important for finding good parameters that work in the experimental phase. In addition, the architecture should be validated using physical robots that interact with users in dynamic environments (e.g., schools, industries, and hospitals).

Lastly, another limitation found in several papers is the absence of a database (e.g., of physical forces or emotional states of the user). A database could be useful to collect the information obtained from the experimental phase, not only for the researchers working on the project, but also for other researchers who want to use the data for their own projects.

#### *4.5. Ethical, Legal, and Social Aspects*

Future research efforts should include the ethical implications of designing robots that interact with people; the data acquired should be correctly stored to guarantee privacy [19].

Regarding legal and social aspects, Reference [24] underlined that the robot should be able to adjust its parameters for different cultures that have different needs, in order to satisfy real user requests. Lastly, the introduction of regulations for social robots, like those created for drones in recent years, could be an important step toward making it possible to use robots in crowded environments [90].

#### **5. Conclusions**

This paper focuses on behavioral adaptation, cognitive architectures, and the establishment of empathy between social robots and users. The current state of the art of existing systems in this field is presented to identify the pros and cons of each work, with the aim of providing recommendations for future improvements.

Establishing a set of benchmarks to define an HRI similar to human-human interaction is an enormous challenge because of the complexity of non-verbal phenomena in social interactions, whose interpretation needs the support of psychological processes and neural mechanisms. The topic of behavioral models is vast, and several factors contribute to making robots more accepted, perceived as friends, and empathetic with users. A common limitation of the works presented is that authors often focused on a particular aspect of HRI, emphasizing a communication strategy or a particular behavior as a reaction to the user's action. Since it is not easy to include behavioral adaptation techniques, cognitive architectures, persuasive communication strategies, and empathy in a unique solution, researchers are often limited to organizing experimental studies that include only some of these factors. This, unfortunately, provides useful information on only a limited part of persuasive robotics. To preserve the importance of each contribution, it is fundamental to include, in a whole vision, all the suggestions provided by each work. Although many improvements remain to be accomplished, the already satisfying results achieved by the authors represent an excellent starting point for developing better solutions using knowledge of human cognitive and psychological structures.

**Author Contributions:** O.N. and G.A. were responsible for the literature research and methodology definition and search strategies for synthesizing the information from the papers into text and tables. O.N., L.F., A.S., and G.M. collaborated on the discussion. O.N. and L.F. were responsible for the paper structure. They contributed to the methodology definition and search strategies and wrote the discussion. F.C. was the scientific supervisor, guarantor for the review, and contributed in methodology definition, paper writing, discussion, and conclusion. All authors were involved in paper screening and selection. All authors read, provided feedback, and approved the final manuscript.

**Funding:** Research supported by the "SocIalROBOTics for active and healthy ageing" (SI-ROBOTICS) project, funded by the Italian "Ministero dell'Istruzione, dell'Università e della Ricerca" under the framework "PON—Ricerca e Innovazione 2014–2020", Grant Agreement ARS01\_01120.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
