**1. Introduction**

The global healthcare system is under great pressure from rapid population aging and from shortages of healthcare personnel and funding [1]. A growing proportion of older people face impaired physical functions such as reduced muscle strength, balance, and mobility [2]. These changes make it difficult for older people to remain independent in daily living, which can in turn cause anxiety, low self-esteem, and decreased quality of life [3,4]. Epidemiological studies show that a low physical activity level is strongly correlated with functional decline in older people. Physical exercise is an effective way to counteract age-related functional decline [5].

There is strong evidence that appropriate physical rehabilitation exercises can improve the physical activity level and activities of daily living of older people [4,6,7]. Conventional exercise therapies for older people are generally conducted in a formal rehabilitation center or clinical setting and require direct supervision from a professional therapist. Even though conventional exercise therapies have been shown to increase physical activity and to improve motor functions and balance, they suffer from low rates of uptake and adherence [8,9] due to a lack of enjoyment, inconvenient transportation, and high cost [10]. For example, Kobayashi et al. [11] examined the effects of physical exercise on fall risk in older people living at home in a rural area. After a 13-week intervention, they reported that the participants' motor functions had improved. However, only 31.7% of the participants were fully adherent to the intervention program. Another study by Liu-Ambrose et al. [12] showed, in a 26-week experiment, that the Otago Exercise Program (OEP) improved functional mobility and executive functioning in older people. However, they reported that only 19.4% of the participants completed all OEP sessions and only 25.0% completed half of the sessions.

Recently, with the emergence of more affordable motion sensors (such as Kinect and inertial sensors) and gaming technologies, home-based exergames (exercise + gaming) for physical rehabilitation have appeared more promising than passive conventional exercise therapies for older people with respect to long-term uptake [9,13–15]. Such home-based solutions not only remove the distance and cost barriers of traveling to a rehabilitation center, but also allow older people to adjust the training schedule and exercise intensity flexibly. Another major advantage of exergames is that they provide auto-coaching, which compensates for the lack of global healthcare resources [16]. An auto-coaching system usually places a virtual avatar coach in the game environment. The system guides end users to follow the virtual coach's motions, and it tracks and evaluates the motions they reproduce. However, because no real coach is on site during the exercise, a rigorous motion comparison for automatic performance evaluation is critical to guarantee the effectiveness of such a system.

Different algorithms have been proposed to assess exercise performance automatically. Lin et al. [17] developed a Kinect-based rehabilitation system that uses a "seated Tai Chi" exercise to assist patients with movement disorders. They decomposed each Tai Chi form into four poses and calculated the difference in joint angles between the skeleton of the standard pose and that of the user's actual pose, both tracked by Kinect sensors. Muangmoon et al. [18] adopted a similar method to evaluate Thai dance performance. This kind of motion comparison algorithm is easy to implement because it reduces a continuous motion to a few discrete poses, but it cannot evaluate the quality of exercise comprehensively, because the simplification discards essential exercise information such as motion continuity, pace, and intensity. Another group of researchers proposed to assess motion similarity based on the correlation coefficient between two time series of human motion, one from a virtual coach (standard motion) and one from a subject (actual motion) [19,20]. Even though this approach can evaluate overall performance from continuous motion data, it fails to take into account the typical time lag and speed variance between the older subject's actual motion and the virtual coach's standard motion during exercise therapy. It is difficult for older people to follow the pace of the standard motion exactly, due to physical and cognitive impairments as well as a lack of exercise skills. Both of these disadvantages can be overcome by dynamic time warping.
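The sensitivity of a correlation-based score to time lag can be illustrated with a minimal sketch. The sinusoidal "joint-angle" trajectories, the lag values, and the function name `pearson_similarity` are assumptions made for illustration only; they are not taken from the methods in [19,20].

```python
import numpy as np


def pearson_similarity(a, b):
    """Pearson correlation between two equally long, sample-aligned series."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("series must have the same length")
    return float(np.corrcoef(a, b)[0, 1])


# Hypothetical joint-angle trajectory of the virtual coach: one full cycle.
t = np.linspace(0.0, 2.0 * np.pi, 200)
coach = np.sin(t)

# Subject A reproduces the motion almost in sync (very small lag).
nearly_in_sync = np.sin(t - 0.05)

# Subject B reproduces the same motion shape, but a quarter cycle late.
lagged = np.sin(t - np.pi / 2.0)

r_in_sync = pearson_similarity(coach, nearly_in_sync)
r_lagged = pearson_similarity(coach, lagged)
print(f"nearly in sync: r = {r_in_sync:.3f}")
print(f"quarter-cycle lag: r = {r_lagged:.3f}")
```

In this toy setup the lagged subject receives a correlation close to zero even though the shape of the motion is identical to the coach's, which is the weakness of sample-by-sample comparison described above.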

Dynamic time warping (DTW) is a well-known approach for measuring time-series similarity that minimizes the effects of time lag and of distortion along the time axis caused by speed variation [21,22]. DTW outputs the optimal alignment (least matching cost, or cumulative distance) between two time series, and it is widely applied in speech and gesture recognition [23]. Because of these advantages, some researchers have applied DTW to auto-coaching systems for physical rehabilitation exercises at home. Saraee et al. [24] used DTW to develop a remote monitoring system for evaluating home-based physical exercises. Since DTW itself could not generate a meaningful scaled score for performance evaluation, a physical therapist was required to monitor the patient remotely in real time through a webcam and determine whether the patient's performance was acceptable. Semblantes et al. [25] and Saenz-de-Urturi and Garcia-Zapirain Soto [26] used DTW and binary classification to discriminate between correct and incorrect motions. Su et al. [27] used DTW and Kinect sensors to evaluate patients' supplementary shoulder rehabilitation exercises at home by comparing them with a standard motion pre-recorded in the hospital. They integrated DTW with fuzzy logic to convert DTW matching costs (DTW distances) into three performance levels: bad, good, and excellent. Wei et al. [28] applied DTW to measure the similarity between the motion data of a trainee and the standard motion of a trainer in dance teaching. They used existing training data and the experience of experts to determine three boundaries of DTW matching costs for categorizing individual performances into four levels: below average, average, good, and excellent. Most of these studies require large, representative training data for each exercise to convert the DTW matching cost into the final performance score. This task is resource-intensive, and conversion criteria established for one exercise program are difficult to generalize to others. In addition, the final performance evaluation is categorical and thus qualitative, so it is not sensitive enough to capture a user's gradual progress during an exercise intervention. Chatzitofis et al. [29] and Mocanu et al. [30] developed home-based rehabilitation systems for heart health and physical activity, in which DTW was used to compute a quantitative performance score. However, they did not describe how the DTW matching cost was converted into a quantitative score, and the scores were not validated against ground-truth ratings. Osgouei et al. [31] recently proposed an objective method for quantitative performance evaluation of rehabilitation exergames. Using a shoulder abduction exercise as an example and two angle features (shoulder angle and arm angle), they showed how DTW can be applied to compute the motion similarity between the unknown and reference trajectories of the human skeleton joints. A normalization approach with estimated lower and upper bounds for the DTW distance was used to convert any DTW distance into an objective similarity score (0–100). The proposed method is promising since it does not require any training data. However, the objective similarity score was not validated against physicians' evaluations of the performances. In addition, it remains questionable whether the method can be extended from simple repetitive exercises to complex whole-body exercises.
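To make the notions of DTW matching cost and bound-based normalization concrete, the following minimal sketch computes a textbook DTW distance between two one-dimensional series and maps it linearly onto a 0–100 score. It is not the implementation of any cited study: the absolute-difference local cost, the function names `dtw_distance` and `similarity_score`, and in particular the choice of bounds (zero as the lower bound, the distance to a motionless subject as the upper bound) are assumptions made purely for illustration.

```python
import numpy as np


def dtw_distance(x, y):
    """Classic DTW matching cost between two 1-D series (O(n*m) dynamic program)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    # acc[i, j] = least cumulative cost of aligning x[:i] with y[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])  # assumed local cost
            acc[i, j] = local + min(acc[i - 1, j],      # x advances alone
                                    acc[i, j - 1],      # y advances alone
                                    acc[i - 1, j - 1])  # both advance
    return float(acc[n, m])


def similarity_score(distance, lower, upper):
    """Map a DTW distance linearly onto 0-100 (100 = at or below the lower bound).

    The bounds are hypothetical inputs; how to estimate them is not specified here.
    """
    if upper <= lower:
        raise ValueError("upper bound must exceed lower bound")
    clipped = min(max(distance, lower), upper)
    return 100.0 * (upper - clipped) / (upper - lower)


# Hypothetical coach trajectory (100 samples) and a subject who performs
# the same motion more slowly (150 samples).
coach = np.sin(np.linspace(0.0, 2.0 * np.pi, 100))
slow_subject = np.sin(np.linspace(0.0, 2.0 * np.pi, 150))
still_subject = np.zeros(150)  # subject who does not move at all

d_slow = dtw_distance(coach, slow_subject)
d_still = dtw_distance(coach, still_subject)

# Assumed bounds: 0 for a perfect match, the motionless case as the worst case.
score_slow = similarity_score(d_slow, lower=0.0, upper=d_still)
score_still = similarity_score(d_still, lower=0.0, upper=d_still)
print(f"slow subject:  distance = {d_slow:.2f}, score = {score_slow:.1f}")
print(f"still subject: distance = {d_still:.2f}, score = {score_still:.1f}")
```

Because DTW aligns the two series before accumulating cost, the slower subject is penalized far less than the motionless one even though the series differ in length. The resulting score depends entirely on how the bounds are chosen, which is why validating any such conversion against expert ratings matters.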

This study aims to develop and validate a DTW-based algorithm for motion similarity evaluation, in order to support effective Kinect-enabled home-based virtual coaching. We propose a simple but innovative method that directly converts the DTW matching cost into a meaningful performance score expressed as a percentage (0–100%), without requiring training data or expert experience. We further validate the effectiveness of the algorithm through a follow-up experiment with human subjects performing a complex whole-body exercise (Tai Chi) rather than simple, repetitive exercises, which would indicate that the proposed method generalizes well to different exercise programs. The developed algorithm is expected to evaluate a user's performance similarly to domain experts, which makes it promising for home-based physical rehabilitation exercises aimed at improving the quality of life of older people.
