6.6.1. Word Error Rate

Word error rate (WER) is a standard metric of speech recognition accuracy [58]. In this experiment, the WER measures the errors made when the robot recognizes the participant's speech. The WER was calculated as follows:

$$WER = (S + D + I) / N \tag{1}$$

where *S*, *D*, *I*, and *N* are the numbers of substitutions, deletions, insertions, and words in the reference, respectively. To compute the WER, we transcribed all participants' utterances in the dialogue. The WER was used to confirm the difficulty of speech recognition in dialogue with elderly people.
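
As an illustration of Equation (1), the following minimal sketch computes the WER between a reference transcript and a recognizer output. It is not the implementation used in the experiment. It assumes whitespace tokenization and exact word matching, and the function name `word_error_rate` and the example sentences are hypothetical. The sketch relies on the fact that the minimum value of *S* + *D* + *I* equals the word-level edit (Levenshtein) distance between the two word sequences.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (S + D + I) / N for whitespace-tokenized word sequences.

    Assumption: words are split on whitespace and compared exactly;
    any normalization (case, punctuation) must be done by the caller.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the minimum number of edits needed to turn the
    # reference words seen so far into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))  # empty reference: j insertions
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)   # empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    # prev[-1] is the minimum S + D + I; N is the reference length.
    return prev[-1] / len(ref)


# Hypothetical example: one substitution ("sat" -> "sit") and one
# deletion ("the") against a six-word reference.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because insertions are counted in the numerator but not in *N*, the WER can exceed 1 when the recognizer outputs many more words than the reference contains.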
