**8. Discussion and Conclusions**

#### *8.1. Interpretation of the Results*

The overall average WER was 0.778, meaning that approximately 78% of the words in the participants' utterances were misrecognized. Continuing a dialogue at this level of speech recognition accuracy would normally be very difficult. Nevertheless, the system sustained the dialogues for 12 min 51 s on average. This suggests that the twin-robot dialogue system can sustain a dialogue for a certain time despite speech recognition failures.
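For reference, the following is a minimal sketch of how a WER is commonly computed, assuming the standard definition (substitutions + deletions + insertions, divided by the number of reference words) and whitespace-separated tokens. The function name `word_error_rate` is illustrative, and the tokenization and scoring tool actually used in this study may differ. Because insertions are counted, a WER is only approximately the fraction of misrecognized words and can exceed 1.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: tokens are separated by whitespace. Languages without
    spaces between words need a separate word segmenter first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

With this definition, a reference of four words recognized with one substitution and one deletion gives a WER of 0.5.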

The average participant utterance time was 3 min 31 s, approximately 27% of the average dialogue time (cf. the average robot utterance time of approximately 5 min 51 s). In other words, the ratio of participant utterance time to robot utterance time was approximately 3:5. Because the gap between the utterance times of the participants and the robot was not large, the participants can be considered to have actively participated in the dialogue with the twin-robot dialogue system.
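The percentages above follow directly from the reported averages. The short check below reproduces the arithmetic using only the three durations stated in this section; the variable names are illustrative.

```python
# Average durations reported in the text, converted to seconds.
dialogue_s = 12 * 60 + 51     # 12 min 51 s
participant_s = 3 * 60 + 31   # 3 min 31 s
robot_s = 5 * 60 + 51         # 5 min 51 s

# Share of the dialogue time taken up by participant speech.
participant_share = participant_s / dialogue_s

# Participant-to-robot utterance time ratio; 3:5 corresponds to 0.6.
participant_to_robot = participant_s / robot_s

print(f"participant share of dialogue time: {participant_share:.1%}")
print(f"participant / robot utterance time: {participant_to_robot:.2f}")
```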

Regarding subjective impressions, 71% of the participants answered that there was nothing strange in the dialogues with the robot. We therefore believe that the system was able to provide a dialogue without breakdown for many participants. In addition, the caregivers reported that the participants seemed to speak more actively than usual. However, this active participation may have involved a novelty effect, because none of the participants had spoken to a robot before, or an experimenter effect, because the participants received special attention in the context of this experiment. We therefore cannot determine whether the system itself encouraged some participants to participate more actively. Clarifying the effect of the system on active participation requires a long-term study.

On the other hand, there was no significant difference between the one-robot and two-robot scenarios in any measurement. Therefore, it remains unclear whether the use of two robots improves the user experience of dialogue. Nevertheless, for dialogue time, the effect size was medium (Cohen's d = −0.519). This result suggests that the presence of two robots may encourage elderly people to sustain the conversation.
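As a reminder of what the effect size expresses, the following is a minimal sketch of Cohen's d for two independent groups using the pooled standard deviation. This is an assumption: the section does not state which variant of d was computed, and the sign depends only on the order in which the two group means are subtracted. The function name `cohens_d` and the toy numbers are illustrative and are not data from this study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples (pooled standard deviation).

    A negative value means that group_a has the smaller mean.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # statistics.variance is the sample variance (n - 1 denominator).
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Toy numbers for illustration only, not measurements from the experiment.
toy_d = cohens_d([1, 2, 3], [2, 3, 4])
print(f"toy Cohen's d: {toy_d:.3f}")
```

Under Cohen's conventional benchmarks (0.2 small, 0.5 medium, 0.8 large, applied to the absolute value), a magnitude of 0.519 falls in the medium range.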
