*Article* **Interactive Application of Data Glove Based on Emotion Recognition and Judgment System**

**Wenqian Lin <sup>1,\*</sup>, Chao Li <sup>2</sup> and Yunjian Zhang <sup>3</sup>**

<sup>1</sup> School of Media and Design, Hangzhou Dianzi University, Hangzhou 310018, China


**\*** Correspondence: jiangnanshui253@126.com

**Abstract:** In this paper, the interactive application of a data glove based on an emotion recognition and judgment system is investigated. A system of emotion recognition and judgment is established based on the set of optimal features of physiological signals, and a data glove with multi-channel data transmission based on the recognition of hand posture and emotion is then constructed. Finally, a system of virtual hand control and a manipulator driven by emotion are built. Five subjects were selected to test the above systems. The test results show that the virtual hand and the manipulator can be controlled simultaneously by the data glove. Even when a subject makes no hand gesture change, the system can directly control the gesture of the virtual hand by reading the subject's physiological signals, so that gesture control and emotion control can be carried out at the same time. In the test of the manipulator driven by emotion, only the results driven by two of the four emotional trends achieved the desired purpose.

**Keywords:** human-computer interaction; data glove; virtual hand; emotion driven; test

#### **1. Introduction**

Virtual reality (VR) is a way for human beings to interact with computers and complex data; its main purpose is to allow users to enter a virtual environment in which they can have the same experience and feelings as in real life. VR involves many fields and advanced technologies.

VR systems can be classified from several perspectives. In terms of system functionality, the essential function of a VR system is environment simulation, so it can be applied to many fields such as the military and medicine. At present, there are three kinds of VR systems: (1) systems used for simulation exercises or training in the military field, (2) systems for planning and designing places and environments in the field of architecture, and (3) entertainment equipment and high-immersion systems in the entertainment field. In terms of interaction mode and user immersion, VR systems can be divided into non-interactive experiences, human-virtual environment interactive experiences, and group-virtual environment interactive experiences. In terms of data input channels, VR can be divided into platform data, model data, perception data, and control data. In terms of interaction mode and interaction equipment, VR can be divided into four types: scene display, force/touch interaction, tracking and positioning, and walking interaction. The scene display type includes helmets such as the popular VR glasses, desktops, projections, handhelds, and free stereoscopic displays. The force/touch interaction type includes data gloves with transmission functions, joysticks with force feedback, etc. The tracking and positioning type includes source and non-source tracking and positioning systems. The walking interaction type includes pedal walking and ground walking. In the design of a VR system, attention should be paid to the elements of multi-perception, immersion, interaction, and imagination space.

**Citation:** Lin, W.; Li, C.; Zhang, Y. Interactive Application of Data Glove Based on Emotion Recognition and Judgment System. *Sensors* **2022**, *22*, 6327. https://doi.org/10.3390/ s22176327

Academic Editor: Stefano Berretti

Received: 14 July 2022; Accepted: 22 August 2022; Published: 23 August 2022




Hand gesture recognition is an interactive type of VR system that relies on sensor technologies such as electromyography (EMG) and the inertial measurement unit (IMU). There have been numerous studies on hand gesture recognition based on EMG and IMU. For example, Kundu et al. [1] presented hand-gesture-based control of an omnidirectional wheelchair using IMU and myoelectric units as wearable sensors, and recognized and classified seven common gestures using shape-based feature extraction and a Dendrogram Support Vector Machine (DSVM) classifier. Classification involved recognizing the activity pattern based on the periodic shape of the trajectories of the triaxial wrist tilt angle and the EMG-RMS from two selected muscles. A classification accuracy of 94% was achieved by the DSVM classifier on k-fold cross-validation data from 5 users. Zhang et al. [2] applied a deep learning technique known as the long short-term memory (LSTM) algorithm to build a model that classifies hand gestures by training and testing on collected IMU, EMG, and finger and palm pressure data. The experimental results showed an outstanding performance of the LSTM algorithm. Song et al. [3] proposed a force myography (FMG), EMG, and IMU-based multi-sensor fusion model for hand motion classification, and evaluated its feasibility by motion classification accuracy and qualitative analysis of subjects' questionnaires. They showed that the offline classification accuracy of the combined FMG-EMG-IMU model was 81.0% for the 12 motions, clearly higher than any single sensing modality; EMG, FMG, and IMU alone achieved 69.6%, 63.2%, and 47.8%, respectively. Jiang et al. [4] presented the design and validation of a real-time gesture recognition wristband based on surface EMG and IMU sensing fusion, which can recognize 8 air gestures and 4 surface gestures with 2 distinct force levels. The results showed that the classification accuracies in the initial experiment were 92.6% and 88.8% for air and surface gestures, respectively, and that the accuracy did not change when testing 1 h and 1 day later. Yang et al. [5] applied multivariate variational mode decomposition to extract spatial-temporal features from the multiple EMG channels and used a separable convolutional neural network for modeling, proposing an extensible two-stage machine learning lightweight framework for multi-gesture task recognition. The experimental results for a 52-hand-gesture recognition task showed that the average accuracy at each stage is about 90%. Alfaro and Trejos [6] presented a user-independent gesture classification method combining EMG and IMU data. They obtained average classification accuracies in the range of 67.5–84.6%, with the Adaptive Least-Squares Support Vector Machine model obtaining accuracies as high as 92.9%. Wu et al. [7] proposed a wearable system for recognizing American Sign Language (ASL) by fusing information from an inertial sensor and surface EMG sensors. Four popular classification algorithms were evaluated for 80 commonly used ASL signs on four subjects. The results showed 96.16% and 85.24% average accuracies for intra-subject and intra-subject cross-session evaluation, respectively, with the selected feature subset and a support vector machine classifier. Shin et al. [8] studied a myoelectric interface that controls a robotic manipulator via the neuromuscular electrical signals generated when humans make hand gestures.
They proposed a system that recognizes dynamic hand motions and the configuration of a hand over time. The results showed that the average real-time classification accuracy of the myoelectric interface was over 95.6%. Shahzad et al. [9] studied the effects of surface EMG signal variation on the performance of a hand motion classifier due to arm position variation, and explored static-position and dynamic-movement strategies for classifier training. A wearable system was made position aware (POS) using an IMU for different arm movement gestures. The results showed the effectiveness of the dynamic training approach and sensor fusion techniques in improving the performance of existing stand-alone surface EMG-based prosthetic control systems. Ordóñez Flores et al. [10] proposed a new methodology and showed its application to the recognition of five hand gestures based on 8 channels of electromyography using a Myo armband device placed on the forearm. Romero et al. [11] presented the application of hand gestures and arm movements to control a dual-rotor testbench. Chico et al. [12] employed a hand gesture recognition system and the inertial measurement unit integrated in the Myo armband sensor as a human-machine interface to control the position and orientation of a virtual six-degree-of-freedom (DoF) UR5 robot.

Hand gesture recognition mainly includes two methods. One is gesture recognition based on data gloves, i.e., motion characteristics such as the bending degree, angle, and displacement of each key joint of the hand are obtained through motion sensors and then reproduced in the system database as faithfully as possible. The other is image-based gesture recognition, i.e., image data of the hand are collected through a camera, background segmentation and motion modeling are carried out through image recognition, and the hand motion is ultimately restored in the computer. The two methods have their own advantages and disadvantages. Data gloves require subjects to wear external equipment, which may affect the user interaction experience and introduce delay in data processing, but their data acquisition is strongly interference-resistant, more accurate, and not easily affected by the external environment. The image recognition method is more convenient and the user's operation is more natural, but it places certain requirements on the environment and is easily disturbed by environmental factors. In this paper, the data glove is selected as the interactive device because its data acquisition is more accurate and because the sensors used in this paper must contact the user's hand to obtain the physiological signals. In addition, data gloves are easy to modify, for example by adding additional sensors, and have more advantages and pertinence than other interactive devices for hand movement.

Some achievements have been made in the research and development of data gloves, such as the 5DT data gloves, CyberGlove force feedback data gloves, Measurand high-precision data gloves, X-IST music simulation data gloves, etc. Tarchanidis et al. [13] presented a data glove equipped with a force sensor with a resolution of 0.38 N and a sensitivity of 0.05 V/N. Kamel et al. [14] applied a data glove to tasks ranging from motion animation to signature verification and showed high accuracy in finding similarities between genuine samples as well as in differentiating genuine-forgery trials. Yoon et al. [15] presented a data glove with an adaptive mixture-of-experts model and showed its excellent performance and adaptability through tests. Kim et al. [16] used a data glove to build a sign language recognition system and indicated that the system was useful when employed with smartphones in some situations. Chen et al. [17] presented a data glove with a highly stretchable conductive fiber strain sensor, which could recognize various gestures by detecting finger motion. Fang [18] proposed a data glove to recognize and capture the gestures of 3-D arm motion, and the test results verified its effectiveness. Lin et al. [19] presented a data glove with low cost, high reliability, and easy wearability. Wang et al. [20] presented a data glove whose feedback force control provides safe, lightweight, yet powerful and stable passive force feedback. Li [21] developed a data glove to monitor hand posture and divided the sensor signal by the base signal to decrease the test error induced by instability of the light sources. Wu et al. [22] presented a data glove for capturing finger joint angles and tested its effectiveness. Sarwat et al. [23] used a data glove to construct an automated assessment system for in-home rehabilitation, helping post-stroke patients with a high level of recovery. Takigawa et al. [24] developed controlled functional electrical stimulation to realize multiple grasping postures with a data glove.

Previous research on data gloves has mostly focused on improving the accuracy of motion recognition and the pressure simulation of force feedback. However, studies on data gloves that can capture the user's behavior and obtain the user's emotion through physiological signal sensors are rare. Therefore, in this paper, a data glove with the functions of emotion recognition and of interaction between a human and a computer, or between a human and a hardware device, according to the user's emotion is presented. The data glove can be used in medicine, health, military training, academic research, and other fields.

#### **2. Classification of Emotion Trends**

In order to obtain the user's emotion through physiological signal sensors, a system of emotion recognition is needed, and emotion recognition is based on emotion evaluation [25]. Here the valence-arousal (V-A) model is used for emotion classification. In the V-A model, as shown in Figure 1, V and A indicate the degree of emotional pleasure and emotional arousal, respectively. The four poles of the emotion classification model are extracted and used to represent tired, tense, happy, and depressed, respectively. The emotion classification system based on the V-A model is extended to a plane whose four quadrants stand for high-arousal and positive-valence (quadrant I: HAPV), high-arousal and negative-valence (quadrant II: HANV), low-arousal and negative-valence (quadrant III: LANV), and low-arousal and positive-valence (quadrant IV: LAPV), respectively.

**Figure 1.** Valence-arousal model.
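To make the quadrant convention concrete, the following minimal Python sketch maps a valence-arousal pair to the four trends; the centering of both axes at zero and the function name are our assumptions, not part of the paper.

```python
# Minimal sketch: map a valence-arousal pair to one of the four
# quadrants of the V-A plane. Centering both axes at 0 is an
# assumption; any neutral point could be used instead.
def classify_quadrant(valence: float, arousal: float) -> str:
    if arousal >= 0 and valence >= 0:
        return "HAPV"  # quadrant I: high arousal, positive valence
    if arousal >= 0 and valence < 0:
        return "HANV"  # quadrant II: high arousal, negative valence
    if arousal < 0 and valence < 0:
        return "LANV"  # quadrant III: low arousal, negative valence
    return "LAPV"      # quadrant IV: low arousal, positive valence

print(classify_quadrant(0.7, 0.5))  # -> HAPV
```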

#### **3. System of Emotion Recognition and Judgment**

#### *3.1. Data Analysis of Physiological Signal (PS)*

In the present study, the skin electrical signal and the pulse wave are taken as the PS. The former is easily disturbed by other signals, so the noise interference should be removed before proceeding. To facilitate computer analysis and processing, the discrete wavelet transform is used to decompose the signal into different frequency bands through low-pass and high-pass filtering, with the filter frequencies specified in hertz. The wdencmp function in MATLAB 9.0 (R2016a) is used to denoise the skin electrical signal, and all segments of the skin electrical signal are normalized to the range of 0 to 100.
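The denoising step can be sketched as follows. The paper performs it with MATLAB's wdencmp; the Python version below uses the PyWavelets package instead, and the wavelet family (db4), decomposition level, and universal soft threshold are assumptions on our part.

```python
import numpy as np
import pywt

def denoise_and_normalize(gsr: np.ndarray, wavelet: str = "db4",
                          level: int = 4) -> np.ndarray:
    """Wavelet-threshold denoising followed by 0-100 normalization.

    The wavelet, level, and soft threshold are assumptions; the paper
    performs the equivalent step with MATLAB's wdencmp function.
    """
    coeffs = pywt.wavedec(gsr, wavelet, level=level)
    # Universal threshold estimated from the finest detail coefficients.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(len(gsr)))
    coeffs[1:] = [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    clean = pywt.waverec(coeffs, wavelet)[: len(gsr)]
    # Normalize each segment to the 0-100 range used in the paper.
    return 100 * (clean - clean.min()) / (clean.max() - clean.min())
```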

As shown in Figure 2, the pulse wave signal is composed of the main wave, the dicrotic anterior wave, the dicrotic notch, and the dicrotic wave. In the figure, the key feature points include: (1) c (peak systolic pressure), (2) e (starting point of left ventricular diastole), (3) g (maximum pressure point of the tidal wave), (4) d (point of aortic dilation depressurization), (5) f (origin of the tidal wave), and (6) b1 (point of aortic valve opening). The key amplitudes include: (1) the main wave h1, (2) the dicrotic anterior wave h2, (3) the dicrotic notch h3, and (4) the dicrotic wave h4. The key times include: (1) the time t1 from the starting point of the waveform period to the peak point c of the main wave, (2) the time t2 from the starting point of the waveform period to the lowest point of the dicrotic notch, and (3) the duration t of one waveform period. The pulse wave is smoothed with a Butterworth low-pass filter, and the relevant parameters of the pulse wave are normalized after filtering.

**Figure 2.** Key feature points of pulse wave.
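A hedged sketch of the smoothing step follows. The paper specifies a Butterworth low-pass filter; the cutoff frequency and filter order below are illustrative assumptions, while the 100 Hz sampling rate matches the pulse sensor described in Section 4.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def smooth_pulse_wave(ppg: np.ndarray, fs: float = 100.0,
                      cutoff: float = 8.0, order: int = 4) -> np.ndarray:
    """Zero-phase Butterworth low-pass filtering of the pulse wave.

    fs matches the 100 Hz pulse sampling rate reported in Section 4;
    the cutoff and order are assumptions.
    """
    b, a = butter(order, cutoff / (fs / 2), btype="low")
    filtered = filtfilt(b, a, ppg)
    # Normalize the filtered waveform, as the paper does after filtering.
    return (filtered - filtered.min()) / (filtered.max() - filtered.min())
```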

#### *3.2. Extraction of Optimal Feature of PS*

The features of the PS are divided into time-domain features, frequency-domain features, and features related to physiological processes [26]. Directly fusing the original signal features would require too much computation. As such, dimensionality reduction of the original signal features is performed using principal component analysis (PCA) to make the classifier more efficient and accurate in emotion recognition. Principal components are obtained using PCA, and the weight of each PS feature on the principal components is compared against a threshold as the criterion for feature selection. In this way, the original features that play a major role can be determined as the optimal feature subset. After obtaining the optimal feature subset, the Pearson correlation coefficient (PCC) is used to judge the relationship between the emotional interval and these features. The PCC is calculated for the features of the four emotion trends and used to derive the significance p of the features. Based on p and the correlation coefficient, the normalized thresholds of the optimal features correlated with the emotional trends are determined. These optimal features include BpNN50 (the percentage of main pulse wave intervals > 50 ms), the range of the skin electrical signal, and 1dmean (the mean value of the first-order difference of the skin electrical signal).
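The selection pipeline can be sketched as follows. The retained variance ratio, weight threshold, and function names are illustrative assumptions; the paper does not report its exact values here.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def select_optimal_features(X: np.ndarray, names: list[str],
                            var_ratio: float = 0.95,
                            weight_thresh: float = 0.3) -> list[str]:
    """Keep original features whose loading on any retained principal
    component exceeds a weight threshold. var_ratio and weight_thresh
    are illustrative values, not the paper's."""
    Xs = StandardScaler().fit_transform(X)
    pca = PCA(n_components=var_ratio).fit(Xs)
    keep = np.any(np.abs(pca.components_) > weight_thresh, axis=0)
    return [n for n, k in zip(names, keep) if k]

def feature_emotion_correlation(feature: np.ndarray, label: np.ndarray):
    """Pearson correlation r and significance p between one selected
    feature and an emotion-trend label, as used to set the thresholds."""
    return pearsonr(feature, label)
```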

#### *3.3. Establishment of the System of Emotion Judgment*

The range of the skin electrical signal is highly positively correlated with two completely opposite emotional trends (i.e., LAPV and HANV). As such, the skin electrical waveforms corresponding to these emotional trends were studied. The results show that it is necessary to add a direction judgment to the range of the skin electrical signal. Based on the set of optimal signal features from the Pearson correlation coefficient analysis, the system of emotion recognition and judgment can be built according to the process shown in Figure 3.

**Figure 3.** Process of emotion judgment model.
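Read as code, the judgment process of Figure 3 might look like the sketch below. The thresholds and branching order are hypothetical reconstructions from the text, not the authors' exact rules.

```python
# Illustrative thresholds only; the paper derives its own normalized
# thresholds from the Pearson correlation analysis in Section 3.2.
RANGE_THRESH, BPNN50_THRESH, DMEAN_THRESH = 0.6, 0.5, 0.4

def judge_emotion(bpnn50: float, gsr_range: float, gsr_1dmean: float,
                  range_rising: bool) -> str:
    """Hedged sketch of the judgment flow in Figure 3."""
    if gsr_range > RANGE_THRESH:
        # The range alone correlates with two opposite trends (LAPV and
        # HANV), so the waveform direction provides the extra judgment.
        return "HANV" if range_rising else "LAPV"
    if bpnn50 > BPNN50_THRESH:
        return "HAPV"
    if gsr_1dmean < DMEAN_THRESH:
        return "LANV"
    return "uncertain"
```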

#### **4. Design and Connection of Data Glove**

The design framework of the data glove with emotion recognition function is shown in Figure 4, where the data glove consists of data acquisition and the controller. The data are acquired from finger movement and physiological signals. The data glove customized on the basis of the DN-1 data module is taken as an example, as shown in Figure 5, wherein the data module is an attitude acquisition board used to collect information on hand motion such as finger motion parameters, angular velocity of hand rotation, hand rotation acceleration, and angle change. The controller processes and integrates the collected information on hand motion, and then packages the processed data and sends it to the host computer through Bluetooth or a USB-to-serial port. The interface of the attitude acquisition board is connected with the sensors, and the acquisition board and the controller are connected by a flat cable, as shown in Figure 6.

**Figure 4.** Design framework of data glove with emotion recognition function.

**Figure 5.** Composition of the DN-1 data glove module.

**Figure 6.** Data glove hardware structure framework.

The data acquisition module in the data glove mainly collects two kinds of data: gesture data and physiological signal data. The prototype design of the data glove is shown in Figure 7, where the gesture data sensor is a Flex 2.2 bending sensor that captures the bending degree of the five fingers and the motion posture of the palm, including acceleration, angular velocity, and angle. The length of the bending sensor is 7.7 cm, the unbent resistance is 9000 Ω, the 90-degree bending resistance is 14,000 Ω, and the 180-degree bending resistance is 22,000 Ω.

**Figure 7.** Prototype design of data glove based on emotion recognition of physiological signal.

The skin electrical signal is acquired using a Grove-GSR skin electrical kit, as shown in Figure 8 (left). Two finger sleeves containing electrodes were put on the middle part of the middle finger and the thumb of the left hand, and the frequency of signal sampling was 20 Hz. A pulse sensor, as shown in Figure 8 (right), was used to acquire the signals of the pulse wave and heart rate. The pulse sensor was fixed on the tip of the middle finger of the left hand with a bandage, and the frequency of signal sampling was 100 Hz.

**Figure 8.** GSR skin electrical kit (**left**) and pulse sensor (**right**).

Gestures are reflected by the bending of the fingers. The output format of the finger bending data is 0xaa a1 a2 a3 a4 a5 0xbb, where 0xaa and 0xbb are the head and tail of the frame, and a1, a2, a3, a4, and a5 represent the bending data of the five fingers from the thumb to the little finger, respectively. The data x read on the interface (a1–a5) is the quantization of the voltage value on the bending sensor:

$$V_x = \frac{x \times 3.3}{4096} \tag{1}$$

where *V<sub>x</sub>* is the voltage at the sensor. Based on

$$V_x = 3.3 \times \frac{R}{R + 20} \tag{2}$$

the resistance value *R* can be obtained. The value of *R* is proportional to the bending degree of the bending sensor; i.e., the values of a1, a2, a3, a4, and a5 are inversely proportional to the degree of finger bending.
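The conversion chain of Equations (1) and (2) can be sketched as follows; reading the constant 20 as a 20 kΩ divider resistor is our assumption.

```python
# Sketch of the conversion chain: raw 12-bit reading a1..a5 ->
# sensor voltage (Equation (1)) -> bending resistance (Equation (2)
# solved for R). Units: volts and kilo-ohms, assuming 20 = 20 kOhm.
def reading_to_resistance(x: int) -> float:
    v = x * 3.3 / 4096            # Equation (1): 12-bit ADC, 3.3 V range
    return 20.0 * v / (3.3 - v)   # Equation (2) rearranged for R (kOhm)

for raw in (1500, 2000, 2500):
    print(raw, round(reading_to_resistance(raw), 1), "kOhm")
```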

The finger bending data can be analyzed through the following functions: (1) the resume function determines whether the data related to the finger part have been received correctly (i.e., the correctness of the frame header and tail); (2) the finger\_calculate function calculates the bending data of the fingers; (3) the judge function determines whether each finger is bent within a reasonable range (i.e., it filters out erroneous motion information); and (4) the calculate function processes data related to finger bending, including the bending data of a single finger, storing and recording the data, calculating the offset of the bending data, etc.
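A minimal sketch of this frame handling is given below; the function names mirror those in the text, but the bodies, the one-byte-per-finger layout, and the valid range are assumptions.

```python
# Hedged sketch of the finger-frame handling described above.
FRAME_HEAD, FRAME_TAIL = 0xAA, 0xBB

def resume(frame: bytes) -> bool:
    """Check frame correctness: 7 bytes with the expected head and tail."""
    return len(frame) == 7 and frame[0] == FRAME_HEAD and frame[-1] == FRAME_TAIL

def finger_calculate(frame: bytes) -> list:
    """Extract the five bending readings a1..a5 (thumb to little finger)."""
    return list(frame[1:6])

def judge(bends: list) -> bool:
    """Filter out impossible readings; the valid range is an assumption."""
    return all(0 <= b <= 4095 for b in bends)

frame = bytes([0xAA, 120, 95, 100, 88, 60, 0xBB])
if resume(frame):
    bends = finger_calculate(frame)
    if judge(bends):
        print("finger bending data:", bends)
```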

The physiological signal acquisition sensors are the Grove-GSR skin electrical kit (fixed on the middle part of the middle finger and the thumb of the glove) and the pulse sensor (fixed on the tip of the middle finger of the glove). The sensors are connected to the data acquisition module with wires, and the acquisition module is then connected to the PC and external hardware equipment through the Bluetooth interface. Unity3D receives the data transmitted by the data acquisition board through the IO interface.

#### **5. Test of Virtual Gesture Change Driven by Emotion**

#### *5.1. System Design*

The data glove is used to control the hand gesture of the virtual hand and the manipulator, and the gesture of the virtual hand is then changed through the awakened emotion of the subjects and compared with the gesture of the manipulator. The technological process of the system is shown in Figure 9, where the DN-1 data glove is adopted and two kinds of sensors are added to it.

**Figure 9.** Technological process of the system for virtual gesture change driven by emotion.

The virtual hand model comes from network shared resources. The index finger, middle finger, ring finger, and little finger each have 5 movable joint points, and the thumb has 4 movable joint points. In total, there are 24 movable joint points for changing the gesture of the hand, as shown in Figure 10.

**Figure 10.** Virtual hand model.

Each joint of the hand is taken as a changeable unit, and two sets of attitude change systems are loaded onto the virtual hand model in Unity3D. One is the gesture system, which is driven by the finger gesture data transmitted from the data glove; that is, the virtual hand changes its gesture according to the finger gesture data so that it is consistent with the subject's hand gesture. The other set consists of action templates, i.e., gesture animation files designed in advance; the corresponding gesture action of the virtual hand is activated once the activation conditions are met.

#### *5.2. Virtual Hand Control Driven by Emotion*

Five subjects, aged between 24 and 30, participated in the test. Music materials were used to awaken the subjects' emotions during the test. The DN-1 customizable five-finger mechanical claw with the most basic functions is used as the external hardware equipment, and the connection of the test equipment is shown in Figure 11.

**Figure 11.** Connection of the test equipment for virtual gesture change driven by emotion.

The principle and test process are as follows: (1) music in the styles of terror, sadness, grandeur, and freshness is played to awaken the subjects' emotions; (2) the physiological signals (skin electricity and pulse wave) caused by the subjects' emotions are collected by the sensors placed in the data glove; (3) the four emotional trends HANV, LANV, HAPV, and LAPV, corresponding to terror, sadness, grandeur, and freshness, are detected from the physiological signals using the system of emotion recognition and judgment described in Section 3; (4) the emotion changes of the subjects are detected by the system, which is built on the relationship between emotion and the gesture of the virtual hand; and (5) the system drives the virtual hand to make the four animation gestures "1", "2", "3", and "4" corresponding to HANV, LANV, HAPV, and LAPV, as shown in Figure 12.

**Figure 12.** Four hand animation gestures.

The virtual gesture changes of subject 3 driven by emotion are shown in Figure 13, which shows the corresponding relationship between the physiological signals and the virtual gesture changes.

**Figure 13.** Gesture changes of subject 3 driven by emotion.

A gesture data acquisition module is also placed in the data glove, as described in Section 4. The system can drive the virtual hand and the manipulator to make gestures consistent with the gesture of the subject. In the process described above, when the virtual hand makes the gestures "1", "2", "3", and "4", the subjects also make the same gestures, which drives the manipulator to make the same gestures, as shown in Figure 13. Consequently, there is a time deviation between the gestures of the virtual hand and the manipulator, as shown in Table 1.

**Table 1.** The time deviation between virtual hand change driven by emotion and manipulator change.


In Table 1, the time deviation between the virtual hand change driven by emotion and the manipulator change is basically less than 20%, showing that the virtual hand and the manipulator can be controlled synchronously through the data glove. Even when the user's hand in the data glove makes no gesture change, the system can directly control the gesture of the virtual hand by reading the physiological signals of the subject, and gesture control and emotion control can be carried out at the same time to achieve the desired purpose.

#### *5.3. Manipulator Control Driven by Emotion*

A manipulator with six degrees of freedom, a weight of 4.5 kg, and a load capacity of 5 kg is directly controlled using the data glove, as shown in Figure 14; the DN-1 data glove is adopted and two kinds of sensors are added to it.

**Figure 14.** Manipulator in the test.

The control of the manipulator is divided into gesture control of the finger part and of the arm part. The control of the finger part is described in Section 4; the arm part controls the movement angles, including the elbow joint (float anglere), the wrist joint (float anglere1), and the finger root joint on the palm (float anglere2).

The output angle-related data include acceleration, angular velocity, and angle.

(1) Acceleration:

$$\text{0x55 0x51 AxL AxH AyL AyH AzL AzH TL TH SUM} \tag{3}$$

where 0x55 is the frame header and 0x51 identifies an acceleration frame; AxL, AyL, and AzL are the low bytes of the *x*, *y,* and *z* axes; AxH, AyH, and AzH are the high bytes of the *x*, *y,* and *z* axes; TL and TH are the total data transmission; and SUM is the acceleration output checksum:

$$\text{SUM} = \text{0x55} + \text{0x51} + \text{AxH} + \text{AxL} + \text{AyH} + \text{AyL} + \text{AzH} + \text{AzL} + \text{TH} + \text{TL} \tag{4}$$

where the symbols are the same as those in Equation (3).

(2) Angular velocity:

$$\text{0x55 0x52 wxL wxH wyL wyH wzL wzH TL TH SUM} \tag{5}$$

where 0x55 is the frame header and 0x52 identifies an angular velocity frame; wxL, wyL, and wzL are the low bytes of the *x*, *y,* and *z* axes; wxH, wyH, and wzH are the high bytes of the *x*, *y,* and *z* axes; TL and TH are the total data transmission; and SUM is the angular velocity output checksum:

$$\text{SUM} = \text{0x55} + \text{0x52} + \text{wxH} + \text{wxL} + \text{wyH} + \text{wyL} + \text{wzH} + \text{wzL} + \text{TH} + \text{TL} \tag{6}$$

where the symbols are the same as those in Equation (5).

(3) Angle:

$$\text{0x55 0x53 RollL RollH PitchL PitchH YawL YawH TL TH SUM} \tag{7}$$

where 0x55 is the frame header and 0x53 identifies an angle frame; RollL and RollH are the roll angle for the *x* axis; PitchL and PitchH are the pitch angle for the *y* axis; YawL and YawH are the yaw angle for the *z* axis; TL and TH are the total data transmission; and SUM is the angle output checksum:

$$\text{SUM} = \text{0x55} + \text{0x53} + \text{RollH} + \text{RollL} + \text{PitchH} + \text{PitchL} + \text{YawH} + \text{YawL} + \text{TH} + \text{TL} \tag{8}$$

where the symbols are the same as those in Equation (7).

The angle-related data are parsed by the following function:


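The original listing is not reproduced above; the following Python sketch reconstructs a plausible parser from the frame layout of Equation (7) and the checksum of Equation (8). The 180/32768 degree scaling is a common convention for such modules and is an assumption here, as is the function name.

```python
import struct

def parse_angle_frame(frame: bytes):
    """Hedged reconstruction of the angle-parsing function.

    Frame layout follows Equation (7):
    0x55 0x53 RollL RollH PitchL PitchH YawL YawH TL TH SUM.
    """
    if len(frame) != 11 or frame[0] != 0x55 or frame[1] != 0x53:
        return None
    if sum(frame[:10]) & 0xFF != frame[10]:   # checksum, Equation (8)
        return None
    # Signed 16-bit little-endian values: low byte first, as in Eq. (7).
    roll, pitch, yaw = struct.unpack("<hhh", frame[2:8])
    scale = 180.0 / 32768.0   # assumed degree scaling for this module
    return roll * scale, pitch * scale, yaw * scale
```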
The angle drive data are then replaced with physiological signal data. The driving conditions of the manipulator steering are determined by the system of emotion recognition and judgment described in Section 3. The steering settings are up (HANV), left (LANV), right (HAPV), and down (LAPV), respectively. Each piece of music lasts 30 ± 1 s. The manipulator steering change driven by emotion is shown in Figure 15. The manipulator completes the steering action only when driven by the LANV and HAPV emotions, and there is no response under the HANV and LAPV emotion conditions, showing that, although the scheme of driving the manipulator by emotion is feasible, the emotion recognition rate and the response speed of the manipulator need to be further improved.

**Figure 15.** Manipulator steering change driven by emotion.

#### **6. Conclusions**

In this paper, the interactive application of a data glove based on an emotion recognition and judgment system is studied. A data glove with multi-channel data transmission based on hand gesture recognition and emotion recognition is constructed. The system of virtual hand control and a manipulator driven by emotion is established using Unity3D as the construction tool of the computer system. In the test of virtual hand control driven by emotion, the data glove is used to simultaneously control the virtual hand on the PC side and the external mechanical claw, while the system of emotion recognition and judgment is only used for the virtual hand control. In the test of the manipulator driven by emotion, the data glove is used to directly control the manipulator, and the arm angle control is replaced by the optimal features of the physiological signals. The test results show that the virtual hand and the manipulator can be controlled simultaneously by the data glove. The main innovation lies in the discovery that, even when the subjects make no hand gesture change, the system can directly control the gesture of the virtual hand by reading the physiological signals of the subjects, and gesture control and emotion control can be carried out at the same time. In the test of the manipulator driven by emotion, only the results driven by two emotional trends achieved the desired purpose; although driving the manipulator by emotion is feasible, the system needs to be improved.

**Author Contributions:** Conceptualization, Y.Z. and W.L.; methodology, W.L. and C.L.; software, W.L. and C.L.; validation, C.L. and W.L.; writing, W.L. and C.L.; resources, W.L. and Y.Z.; review, Y.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the National Natural Science Foundation of China (Grant no. 12132015).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data sharing not applicable.

**Conflicts of Interest:** There are no conflicts of interest regarding the publication of this paper.

#### **References**


### *Article* **Motion Analysis of Football Kick Based on an IMU Sensor**

**Chun Yu <sup>1</sup>, Ting-Yuan Huang <sup>1</sup> and Hsi-Pin Ma <sup>2,3,\*</sup>**

**\*** Correspondence: hp@ee.nthu.edu.tw

**Abstract:** A greater variety of technologies is being applied in sports and health with the advancement of technology, but most optoelectronic systems have strict environmental restrictions and are usually costly. To visualize and quantitatively analyze the football kick, we introduce a 3D motion analysis system based on a six-axis inertial measurement unit (IMU) that reconstructs the motion trajectory while analyzing the velocity and the highest point of the foot during the backswing. We build a signal processing system in MATLAB and standardize the experimental process, allowing users to reconstruct the foot trajectory and obtain information about the motion within a short time. This paper presents a system that directly analyzes the instep kicking motion rather than recognizing different motions or obtaining biomechanical parameters. For an instep kicking motion with a path length of around 3.63 m, the root mean square error (RMSE) is about 0.07 m. The RMSE of the foot velocity is 0.034 m/s, which is around 0.45% of the maximum velocity. For the maximum velocity of the foot and the highest point of the backswing, the errors are approximately 4% and 2.8%, respectively. With less complex hardware, our experimental results achieve excellent velocity accuracy.

**Keywords:** sports technology; football; motion analysis; IMU; trajectory reconstruction

**Citation:** Yu, C.; Huang, T.-Y.; Ma, H.-P. Motion Analysis of Football Kick Based on an IMU Sensor. *Sensors* **2022**, *22*, 6244. https://doi.org/10.3390/s22166244

Academic Editor: Giovanni Saggio

Received: 17 July 2022; Accepted: 17 August 2022; Published: 19 August 2022




#### **1. Introduction**

For any sport, repeated practice is required to improve performance and technique. Beyond the amount of training, it is more important to train with the correct method. Practicing with improper methods is not only ineffective but also more likely to cause sports injuries. While performing a shot, players maximize speed and power to make the shot more effective. However, for amateurs, exerting excessive force can easily lead to stiffness of the kicking leg. This results in insufficient knee bending, which reduces momentum during the foot swing before contact with the ball. This problem is difficult for athletes to recognize by themselves. One way to analyze the motion is to apply multiple high-speed cameras combined with image analysis software to reconstruct the human body model and the state of motion. However, such equipment is relatively expensive and has environmental restrictions, since image-related equipment needs to be set up in a specific space or venue. On the other hand, IMU sensors are light, low-power, low-cost, and small. An IMU can consist of a three-axis accelerometer, a three-axis gyroscope, and a three-axis magnetometer. With proper filtering and data fusion, the information can be used for attitude and position estimation. Applications of IMUs include the military, automobiles, and sports.

#### *1.1. Related Work*

1.1.1. IMU in Sports

Wearable sensors with IMUs have been utilized in pedestrian dead-reckoning systems by detecting the stationary stance phase and applying zero-velocity updates (ZUPTs) for position tracking [1]. Inertial sensors were placed on the side of the shoe in [2] to obtain information about foot clearance and mean step velocity, which helps assess foot kinematics in steady-state running. Another study [3] developed a system for field-based performance analysis based on IMUs attached to both ankles. The system detects stance duration, providing users with real-time feedback. In [4], eight IMU sensors with velocity-based localization were used to capture human spatial behavior and velocity during motions such as walking, jumping, and running. The system was later reduced to three IMU sensors and utilized velocity-based localization with acceleration fine-tuning [5].

To help prevent shoulder injuries, ref. [6] presented a classification approach that tracks and discriminates shoulder motions using an IMU. The wearable motion capture platform proposed in [7] provides physical quantities during the high-speed motion of baseball pitchers. With an array of inertial and magnetic sensors, the method allows the analysis of various biomechanical parameters. A wearable device was developed in [8] by combining IMU sensors with flow sensors; it measures the velocity, acceleration, and attitude angles of human limbs. Experiments included boxing motion capture with the device on the forearm and kicking motion capture with the device on the shank. Ref. [9] presented a wearable sensing system consisting of multiple IMU sensors for basketball activity recognition. The system is able to identify walking, jogging, running, sprinting, and shooting. Another basketball-related study built a wrist-worn sensor consisting of an IMU, five environmental sensors, a processor, and a microcontroller; the activity recognition was conducted by machine learning [10]. The algorithm proposed in [11] detects four key temporal events and three temporal phases in skateboarding and can provide quantitative assessment for injury prevention.

#### 1.1.2. Football-Related Motion Analysis

Lower extremity and pelvis kinematics such as linear and angular velocities were measured during kicking by an off-the-shelf product of 17 inertial sensors. The measurements were then compared with those obtained from an optoelectronic motion analysis system [12]. The hip joint motion of football players during practice was recorded directly on a sports field by a three-IMU system [13]. The motion was characterized by hip acceleration and orientation. To quantify movement intensity and improve training load estimation, the system in [14] obtained knee and hip joint kinematics for football-specific movements performed at different intensities. A pressure-sensitive material was placed on the kicking foot in [15]. The device measured the force and center of pressure during the impact phase so that players could further improve their technique. Biomechanical differences were observed when kicking with the preferred and the non-preferred leg [16]. Both kinetics and kinematics were derived from the filmed movements. Quantitative evaluations of kick quality were provided by full-body modeling and a three-dimensional motion capture system [17]. Using a single IMU and the acceleration data, the system in [18] distinguished between running and dribbling, passing, and shooting. The study also compared three sensor locations (inside ankle, lower back, and upper back) for better accuracy. Detection and segmentation of a soccer kick were performed by a system of wearable sensors and video cameras for sports motion analysis [19].

As the above paragraphs show, most IMU-related motion analysis research focuses on activity classification or motion recognition during training or in a match. Given the environmental limitations of camera-based optoelectronic systems, the size and weight of an IMU are a clear advantage, making it a popular choice for motion analysis. Although some studies look at the motion itself, most of them focus on information related to training load or on biomechanical parameters of a specific joint or body part. In particular, no previous research has reconstructed and analyzed the instep kicking motion with a single IMU. This paper aims to present a motion analysis system with increased accessibility, providing football players of all levels with instant feedback and an auxiliary training method to improve the instep kicking technique.

The field application of this study is expected to help players and general football lovers adjust their movement posture before actually kicking. Preliminarily correcting the posture in the empty-kick stage will help players develop good kicking habits more effectively, resulting in better performance when actually kicking the ball. Therefore, this paper mainly focuses on the trajectory of the foot during the kicking motion. The sensors are calibrated, and the threshold setting is tailored to the kicking motion to avoid small spurious impacts.

To validate the reliability of the system, we use high-speed cameras to obtain the golden pattern for the trajectory. According to a systematic review [20], one of the most commonly used measures of agreement is the Bland–Altman plot, a scatter plot that shows the relationship between two methods. This metric is used in this study to evaluate the accuracy of the trajectory reconstructed from the IMU data.

The system architecture is shown in Figure 1. We collect acceleration and angular velocity data during movement through the accelerometer and gyroscope in the six-axis inertial measurement unit (IMU). After the steps of deviation calibration, attitude estimation with quaternions, coordinate transformation, and gravity compensation, we analyze the maximum velocity and the highest point of the foot before contact with the ball while reconstructing the 2D and 3D trajectories of the kicking motion.

**Figure 1.** An illustration of the proposed system. The three colored small axes on the trajectory represent the coordinates of the sensor.

This paper aims to present a 3D motion analysis system that allows users to observe the kicking motion and acquire significant motion information with only a single IMU sensor attached to the kicking foot, avoiding complex accessories that might affect training and eluding the hassle of setting up optoelectronic devices. The main contributions of this study are: (1) the synthesis of a simple motion trajectory reconstruction system for the data collected by a single six-axis IMU during an instep kicking motion, which employs the quaternion representation of orientation to describe the attitude change; (2) the customized adjustment of various parameters for the football kicking action during signal processing, and the elimination of various possible noise sources to ensure that the accumulation of integration errors is minimized; and (3) the extraction of specific motion data from the reconstructed trajectory to provide motion parameters that affect the quality of the kick during the process from backswing to kicking.

#### **2. Methodology**

The proposed sensing system includes data collection and several data processing procedures, which are detailed in the following subsections.

#### *2.1. Data Collection and Deviation Calibration*

The sensor selected in this research is the ICM-20649 [21]. It is a wide-range six-axis motion tracking device containing a three-axis accelerometer and a three-axis gyroscope, each with a 16-bit ADC, and the sampling frequency is set to 100 Hz. In preliminary measurements, we found that the upper limit of the kicking motion is about 12 g, so we set the full-scale ranges to ±30 g and ±4000 °/s for the application in this research. The precision measured at this range is acceptable because subsequent threshold and stationary-judgment mechanisms distinguish the state of motion.

This experiment uses Bluetooth to transmit real-time data. After the sensor is paired with the Bluetooth receiver, the acceleration and angular velocity data are transferred to the computer and stored as text files. After the data are converted to decimal, the values must be interpreted as two's complement to recover negative numbers.
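The two's complement step can be written compactly; the helper below is our illustration, not the authors' code.

```python
def to_signed16(raw: int) -> int:
    """Interpret a raw 16-bit register value as two's complement."""
    return raw - 0x10000 if raw & 0x8000 else raw

print(to_signed16(0xFFFE))  # -> -2
print(to_signed16(0x7FFF))  # -> 32767
```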

A modified sphere model is applied in the calibration process for sensor deviation. First, we assume the calibration equation to be G = L(g + b), where G is the acceleration before calibration, g is the real acceleration, L is the linear proportional deviation of the sensor itself, and b is the deviation of the center value of the sensor. In an ideal static state, the magnitude of the three-axis acceleration should equal 1 g, so the gravitational acceleration values at various angles form a sphere with a radius of one. In the calculation, the linear proportional deviation is first assumed to be one, and the least squares method is used to obtain the center along the three axes. The same method can be used to find the linear proportional deviation, but actual tests found that the three-axis acceleration square sum is less than one when the sensor is stationary. Therefore, normalization is performed at the end to complete the accelerometer calibration.
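A simplified sketch of the sphere-model bias fit follows, with L taken as identity as in the text; the linearized least-squares formulation is standard, but the function names are ours.

```python
import numpy as np

def calibrate_bias(static_accel: np.ndarray) -> np.ndarray:
    """Least-squares sphere fit for the accelerometer bias b.

    static_accel: (N, 3) readings taken at many orientations at rest,
    in g. Expanding |a - b|^2 = r^2 gives the linear system
    2 a.b + (r^2 - |b|^2) = |a|^2, solved here with least squares.
    """
    A = np.hstack([2 * static_accel, np.ones((len(static_accel), 1))])
    y = np.sum(static_accel**2, axis=1)
    sol, *_ = np.linalg.lstsq(A, y, rcond=None)
    return sol[:3]  # sphere center = bias estimate

def normalize(a: np.ndarray) -> np.ndarray:
    """Final normalization so the static magnitude equals 1 g."""
    return a / np.linalg.norm(a, axis=-1, keepdims=True)
```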

#### *2.2. Attitude Estimation with Quaternion*

In common states of motion, rotation is bound to participate, and the acceleration measured by the three axes of the sensor is actually expressed in the sensor's coordinates, not the earth coordinates. The data can only be applied and analyzed after attitude processing. The six-axis sensor chosen for this research includes only an accelerometer and a gyroscope. Without a magnetometer, we can only obtain the sensor attitude by accumulating the angle changes of the sensor and comparing them with the initial coordinates.

The quaternion representation of rotation is derived from the characteristics of inner and outer products between vectors. It can be considered an extension of two-dimensional real and imaginary numbers to four dimensions to represent rotation in three-dimensional space. Similar to complex numbers, quaternions are composed of a real number and three elements *i*, *j*, and *k*. Each quaternion *q* can be represented by a linear combination of them, generally expressed as *q* = *a* + *bi* + *cj* + *dk*, and they obey the following relationship:

$$i^2 = j^2 = k^2 = ijk = -1\tag{1}$$

The attitude quaternion (*q*) is a column vector of four parameters to describe a rotation along a specific axis, which can be written as:

$$\mathbf{q} = \begin{bmatrix} q_0 \\ q_x \\ q_y \\ q_z \end{bmatrix} \triangleq \begin{bmatrix} \cos\left(\frac{\theta}{2}\right) \\ E_x \sin\left(\frac{\theta}{2}\right) \\ E_y \sin\left(\frac{\theta}{2}\right) \\ E_z \sin\left(\frac{\theta}{2}\right) \end{bmatrix} \tag{2}$$

However, in a general movement, it is difficult to know the rotation axis at each sampling point, and the angle information refers to the sensor axes rather than the axis about which the sensor rotates. Since the angle information is obtained through the gyroscope, we directly update the quaternion using the angular velocity data. The vector **S**<sub>ω</sub>, which contains the angular velocities, is defined as:

$$\mathbf{S}_{\omega} = \begin{bmatrix} 0 & \omega_x & \omega_y & \omega_z \end{bmatrix} \tag{3}$$

Then, we consider the quaternion derivative that describes the rate of change in orientation:

$$\frac{d\mathbf{Q}_k}{dt} = \frac{1}{2}\,\hat{\mathbf{Q}}_{k-1} \otimes \mathbf{S}_{\omega} \tag{4}$$

The left-hand side, *d***Q**<sub>*k*</sub>/*dt*, is the orientation derivative at time step *k* expressed in quaternion form; **Q̂**<sub>*k*−1</sub> is the estimated orientation at time step *k*−1; and ⊗ is the quaternion product operator. By integrating the quaternion derivative, the orientation can be estimated over time:

$$\hat{\mathbf{Q}}_k = \hat{\mathbf{Q}}_{k-1} + \frac{d\mathbf{Q}_k}{dt} \cdot \Delta t \tag{5}$$

Finally, we can use the following equation to complete the quaternion update:

$$\mathbf{Q}_{k} \mathrel{+}= 0.5 \times \mathbf{Q}_{k-1} \otimes \mathbf{S}_{\omega} \times \Delta t \tag{6}$$

In addition, after each update, the quaternion must be normalized to obtain the true quaternion, so as to avoid scaling while the vector rotates. Once a new quaternion is obtained, the acceleration data of the sensor can be converted into the acceleration data of the initial coordinates through the following formula:

$$\mathit{accl}_{transformed} = \mathbf{Q} \otimes \mathit{accl} \otimes \mathbf{Q}_{conj} \tag{7}$$

where *accl*<sub>transformed</sub> is the acceleration data in the initial coordinates, *accl* is the acceleration data before attitude processing, and **Q** and **Q**<sub>conj</sub> represent the quaternion and its conjugate, respectively.
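Equations (4)–(7) can be combined into a small sketch; the Hamilton-product implementation and function names are ours, under the assumption of [w, x, y, z] quaternion ordering.

```python
import numpy as np

def quat_mult(p, q):
    """Hamilton product of two quaternions in [w, x, y, z] order."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def update_quaternion(q, gyro_rad, dt):
    """Equations (4)-(6): integrate angular velocity, then re-normalize."""
    omega = np.array([0.0, *gyro_rad])       # S_w of Equation (3)
    q = q + 0.5 * quat_mult(q, omega) * dt
    return q / np.linalg.norm(q)

def rotate_accel(q, accl):
    """Equation (7): bring sensor-frame acceleration into initial coords."""
    q_conj = q * np.array([1.0, -1.0, -1.0, -1.0])
    a_quat = np.array([0.0, *accl])
    return quat_mult(quat_mult(q, a_quat), q_conj)[1:]

q = np.array([1.0, 0.0, 0.0, 0.0])            # initial orientation
q = update_quaternion(q, np.radians([0.0, 0.0, 90.0]), dt=0.01)
```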

#### *2.3. Gravity Compensation*

This subsection introduces the method of compensating the gravity components and the transformation of coordinates. Since the sensor data during the entire motion have been converted into the initial sensor coordinates, we can subtract the average acceleration of the first 500 sampling points obtained in the static state, offset\_accl, from the raw acceleration data. Through this process, we obtain the movement data of the sensor without the influence of gravity.

After gravity compensation, the misalignment between the initial coordinates and the earth coordinates still needs to be dealt with. If this problem remains unsolved, the 2D and 3D motion trajectories will be tilted. Different from the previous processing of attitude changes, since the initial coordinates are those at rest and cannot be processed with angular velocity information, we build the rotation matrix from the initial coordinates to the earth coordinates by calculating the inclination of the gravity component.

First, we divide the rotation into three parts: roll, pitch, and yaw. The tilt of a three-dimensional space can be corrected with two axial rotations.

$$\text{roll} = \arctan\left(\frac{\text{offset}_y}{\text{offset}_z}\right), \quad \text{pitch} = -\arctan\left(\frac{\text{offset}_x}{\text{offset}_z}\right), \quad \text{yaw} = 0 \tag{8}$$

After obtaining the rotation angles around each axis, we form the individual rotation matrices and combine the three by matrix multiplication to obtain the complete rotation matrix:

$$\begin{aligned} \mathbf{R}_x &= \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos(\text{roll}) & -\sin(\text{roll}) \\ 0 & \sin(\text{roll}) & \cos(\text{roll}) \end{bmatrix} \\ \mathbf{R}_y &= \begin{bmatrix} \cos(\text{pitch}) & 0 & \sin(\text{pitch}) \\ 0 & 1 & 0 \\ -\sin(\text{pitch}) & 0 & \cos(\text{pitch}) \end{bmatrix} \\ \mathbf{R}_z &= \begin{bmatrix} \cos(\text{yaw}) & -\sin(\text{yaw}) & 0 \\ \sin(\text{yaw}) & \cos(\text{yaw}) & 0 \\ 0 & 0 & 1 \end{bmatrix} \\ T_{\text{rotate}} &= \mathbf{R}_z \times \mathbf{R}_y \times \mathbf{R}_x \end{aligned} \tag{9}$$

Lastly, we multiply the complete rotation matrix by the three-axis acceleration after gravity compensation to complete the transformation of coordinates:

$$\mathit{accl}_{corrected} = T_{\text{rotate}} \times \mathit{accl}_{compensated} \tag{10}$$
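Equations (8)–(10) translate directly into code; using arctan2 instead of arctan for numerical robustness is a minor liberty on our part.

```python
import numpy as np

def tilt_rotation(offset_accl: np.ndarray) -> np.ndarray:
    """Equations (8)-(9): build the initial-to-earth rotation from the
    static gravity vector; yaw is unobservable without a magnetometer
    and stays zero, so Rz is the identity."""
    ox, oy, oz = offset_accl
    roll = np.arctan2(oy, oz)
    pitch = -np.arctan2(ox, oz)
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(roll), -np.sin(roll)],
                   [0, np.sin(roll),  np.cos(roll)]])
    Ry = np.array([[np.cos(pitch), 0, np.sin(pitch)],
                   [0, 1, 0],
                   [-np.sin(pitch), 0, np.cos(pitch)]])
    return Ry @ Rx

# Equation (10): accl_corrected = tilt_rotation(offset) @ accl_compensated
```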

#### *2.4. Quadratic Integration and Threshold Setting*

After completing the transformation of coordinates and the gravity compensation, we proceed to the trajectory construction. The velocity is obtained by integrating the acceleration once, and the displacement is obtained after a second integration. The displacement between every two sampling points is used to reconstruct the trajectory of the sensor movement.

In this research, we slightly modified the integration method by averaging the acceleration values of two adjacent sampling points to obtain the acceleration belonging to that time interval. The formula can be written as:

$$v_i = v_{i-1} + \frac{a_i + a_{i-1}}{2}\,\Delta t \tag{11}$$

The result calculated by this integration method is more accurate than that calculated by the original formula *v<sub>i</sub>* = *v<sub>i−1</sub>* + *a<sub>i</sub>*Δ*t*. The velocity change, which is the area calculated by this method, is shown by the area *ā*Δ*t* in Figure 2, where *ā* is the average of the acceleration *a<sub>i</sub>* at the current sampling point and the acceleration at the previous sampling point. The purple area on the left roughly compensates for the originally missing area, so the integration error is smaller than with the original formula. We perform the integration separately on the three axes of data collected by the sensor to obtain the velocity of each axis, and then use a similar integration technique to obtain the displacement.

Threshold setting is a crucial aspect of the integration. During the experiment, the sensor is inevitably affected by external factors such as vibration, wind, and incomplete compensation of the gravity components. Slight fluctuations of the acceleration have a considerable influence on the integration error. After several repeated experiments, we found that the acceleration of the target motion is mostly above 3.92 m/s². We therefore set 0.392 m/s² as the acceleration threshold to filter the acceleration values of the target movement before integration.

**Figure 2.** Illustration of integration error cancellation. The average acceleration of two adjacent sampling points is taken for calculation.

In addition, there is a physical blind spot in the actual acceleration integration. When the sensor is stationary after a motion, the integrated acceleration areas during acceleration and deceleration cannot completely cancel each other. Even if the sensor is at rest and the acceleration has become exactly zero, the velocity remains at the value of the previous sampling point. In this case, when the velocity is integrated to obtain the displacement, the sensor appears to continue its motion at a constant velocity instead of being in a static state. Therefore, a new judgment condition is added: when the acceleration of fifteen consecutive sampling points is zero, the state is determined to be static, and the velocity is returned to zero. A reasonable velocity threshold, obtained through multiple experiments, is set to 0.196 m/s to ensure that the above-mentioned accumulation of errors does not occur.
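Equation (11) and the two guards can be combined as follows; the single-axis helper is applied to each axis in turn, and dt = 0.01 s matches the 100 Hz sampling rate. The function name is ours.

```python
import numpy as np

def integrate_with_thresholds(accl, dt=0.01, acc_thresh=0.392,
                              vel_thresh=0.196, zero_run=15):
    """Single-axis trapezoidal integration (Equation (11)) with the
    acceleration threshold and the static-state velocity reset."""
    a = np.array(accl, dtype=float)
    a[np.abs(a) < acc_thresh] = 0.0        # acceleration threshold
    v = np.zeros_like(a)
    zeros = 0
    for i in range(1, len(a)):
        v[i] = v[i - 1] + 0.5 * (a[i] + a[i - 1]) * dt   # Equation (11)
        zeros = zeros + 1 if a[i] == 0.0 else 0
        # Static-state judgment: fifteen consecutive zero-acceleration
        # samples with a small residual velocity return velocity to zero.
        if zeros >= zero_run and abs(v[i]) < vel_thresh:
            v[i] = 0.0
    s = np.zeros_like(v)
    s[1:] = np.cumsum(0.5 * (v[1:] + v[:-1]) * dt)       # displacement
    return v, s
```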

#### **3. Results**

#### *3.1. Experimental Setup*

Two high-speed cameras are used to capture images from the front and side views to provide golden patterns for the experiment. We use tripods to secure the cameras to avoid shaking and place multiple scale bars within the capture range as references for depth correction. After setting up the cameras, we tie the sensor (ICM-20649) to the top of the athlete's foot with a rubber band and perform an instep kicking motion without hitting a ball. The data received from the IMU are collected and imported into MATLAB for processing; we then draw trajectory diagrams and analyze the motion data.

The theoretical values of the experiment are provided by the camera videos. We import the videos into Tracker for mapping and export the 2D data for each angle of view, align the peaks of the front view and the side view, and then perform depth correction separately. The combined 3D data are then imported into MATLAB as the theoretical values, which are used to calculate the error of each analysis.

#### *3.2. Experimental Results*

#### Motion Trajectory Analysis

After completing the data processing introduced in the previous section, the 3D position of each IMU sampling point is obtained, and the 3D trajectory diagram is drawn with MATLAB. The average path length over several repeated experiments and the root mean square error (RMSE) against the theoretical value of the entire path are calculated to verify the accuracy of the system. The two trajectories are aligned from the beginning of the motion, and we then match sample points using the relative sampling rate given the different sample rates of the IMU and the frame rate of the camera. We calculate the distance between corresponding sample points and compute the RMSE of the position and of the velocity in the direction of the kick. Figure 3 shows a 3D motion trajectory diagram: the blue solid line is the theoretical trajectory obtained by Tracker, and the line composed of red dots is the trajectory obtained from the processed IMU data.

**Figure 3.** Three-dimensional motion trajectory diagram. For an instep kicking motion of path length around 3.63 m, the position RMSE and the velocity RMSE of the two trajectories are 0.07 m and 0.034 m/s, respectively.

#### *3.3. Foot Velocity Analysis*

On the football field, whether passing or shooting, the velocity of the ball is a crucial factor. We want to observe the maximum velocity of the athlete's foot swing and where that maximum occurs, so that athletes can be helped to transmit the most kinetic energy to the ball. With the golden pattern obtained by Tracker, we compare the velocity of the sensor with the velocity derived from the video. Figure 4 shows a 2D motion trajectory: the blue cross marks the position of the maximum velocity in the theoretical trajectory, and the red circle marks the position of the maximum velocity in the IMU trajectory.
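A minimal sketch of extracting the maximum foot speed and its location from the processed data (illustrative Python; names are ours):

```python
import numpy as np

def max_velocity_point(pos, vel):
    """Return the maximum foot speed and the position where it occurs.
    pos: (N, 3) positions; vel: (N, 3) velocities from the integration step."""
    speed = np.linalg.norm(vel, axis=1)  # instantaneous speed per sample
    i = int(np.argmax(speed))
    return speed[i], pos[i]
```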

**Figure 4.** Two-dimensional trajectory diagram with maximum velocity position. The maximum velocity occurs when the foot reaches the bottom of the motion trajectory. An average value of the maximum instantaneous velocity in repeated experiments is around 7.4 m/s, and an error of 4% is achieved.

#### *3.4. Backswing Height Analysis*

When shooting or hitting a long ball, if the knee of the kicking leg is not bent enough to raise the foot, the power of the kick is significantly reduced. We therefore observe, on the reconstructed trajectory, the height of the highest point of the foot during the pull-back motion. With the golden patterns obtained by Tracker, we assess the accuracy of the system by comparing the highest points during the backswing, and the 3D trajectory graph can be used to visualize the position of the highest point. Figure 5 shows the 3D motion trajectory and the highest point of the backswing.
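A sketch of the backswing-height extraction, assuming the index at which the forward swing begins is already known from phase segmentation (the segmentation itself is not detailed here):

```python
import numpy as np

def backswing_height(pos, forward_swing_idx):
    """Highest vertical position of the foot during the backswing,
    i.e., before the start of the forward swing.
    pos: (N, 3) positions with the third column as height."""
    heights = pos[:forward_swing_idx, 2]
    i = int(np.argmax(heights))
    return heights[i], pos[i]  # height and full 3D point for visualization
```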

**Figure 5.** Three-dimensional trajectory diagram with backswing height illustrated. An error of 2.8% is achieved for an average backswing height of 0.756 m. The image on the right shows the highest point during the backswing.

Table 1 shows the quantified results generated from the IMU data alongside the results from the high-speed cameras. For a motion with an average path length of about 3.6 m, the trajectory reconstructed from the IMU data deviates from the theoretical trajectory by only about 0.07 m, a very accurate result for motion trajectory reconstruction, which shows that our signal processing system has a certain degree of credibility. For the instantaneous foot velocity and the backswing height, the errors are approximately 4% and 2.8%, respectively.

**Table 1.** Comparison of the reconstructed trajectory, instantaneous velocity, and backswing height generated from IMU data with the high-speed cameras' results.

| Quantity | Result | Error vs. Cameras |
|---|---|---|
| Position RMSE (path length ≈ 3.63 m) | 0.07 m | — |
| Foot velocity RMSE | 0.034 m/s | — |
| Maximum instantaneous velocity | ≈7.4 m/s | ≈4% |
| Backswing height | 0.756 m | 2.8% |

Figure 6 shows the validation of the position (three axes) during the motion, comparing the IMU algorithm results with the high-speed camera results. From the Bland–Altman plot, only 4.17% (10 out of 240) of the points lie outside the 95% limits of agreement. The extent of the difference is clinically acceptable, so the two methods can be considered to be in good agreement, suggesting that the IMU algorithm can be substituted for the high-speed cameras in clinical use.
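The Bland–Altman statistics used here can be computed as in the following sketch (illustrative Python; the 95% limits of agreement are taken, as is conventional, as the bias ± 1.96 standard deviations of the differences):

```python
import numpy as np

def bland_altman(a, b):
    """Bland-Altman agreement statistics for two measurement methods.
    Returns the bias, the 95% limits of agreement, and the fraction
    of points falling outside those limits."""
    diff = np.asarray(a) - np.asarray(b)
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)   # 1.96 SD on either side of the bias
    lower, upper = bias - half_width, bias + half_width
    outside = np.mean((diff < lower) | (diff > upper))
    return bias, (lower, upper), outside
```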

**Figure 6.** Validation of position during the motion by comparing the IMU algorithm results with high-speed camera results. The Bland–Altman plots for the three axes show that the data obtained by these two methods have high similarity.

#### **4. Discussion**

While camera-based optoelectronic systems can provide high accuracy for motion capture, they are subject to environmental restrictions and limited capture rates. When derivatives of second order or higher are calculated from the measurement data, the noise level is high, often leaving the result with limited or no physical meaning unless the raw data are filtered to 10–20 Hz [22]. When these optoelectronic systems are applied to targets moving at high speed, the position will be accurate, but the velocity and acceleration may not be. At the same time, the setup of these image analysis systems is cumbersome, and they can only be used in specific environments. The IMU sensor is a natural substitute in this case: it directly provides inertial data such as acceleration and angular velocity. The sensor can be easily mounted on the person without interfering with their performance, and since it is light, players can easily adapt to the presence of the new device.

Focusing on the football kicking motion, we constructed a motion analysis system based on an IMU sensor in order to analyze the physical quantities related to improving kicking performance. For a preliminary evaluation of a kicking motion, the foot velocity and the backswing phase are both key factors related to the quality of the kick. In [23], the results showed that the foot velocity at the instant of initial impact affects the ball velocity more than any other factor. The quality of foot–ball contact is crucial to the spin and speed of the ball, and a higher foot velocity is related to more powerful kicks [24].

For the reconstructed trajectory, our system achieved high accuracy and low RMSE in both position and velocity. Since the types of target motion differ, comparing RMSE values alone, without considering the length of the motion and the dimensions evaluated, cannot fully establish whether one method outperforms another. The gait analysis algorithm of Zhou et al. [25], applied to the action of striding forward, achieved an RMSE of about 0.05 m over a stride of about 1.5 m. The acceleration-based simultaneous localization and capture method (A-SLAC) proposed in [5] reports, for a trial length of 3.6 m, an RMSE of 0.038 m in the main walking direction, 0.032 m in the vertical direction, and 0.057 m in the sideways direction, which is about 2% of the trial length. While those studies calculate the error along the direction of the stride, we calculate the error of the full 3D motion. For an instep kicking motion with an average path length of around 3.63 m, our system achieved a position RMSE of 0.07 m.

For velocity, we extracted the maximum instantaneous velocity of the kick; the results showed a 4% error compared with the images captured by the high-speed cameras. Moreover, the RMSE of the foot velocity is about 0.034 m/s, around 0.45% of the maximum velocity (7.47 m/s). For the velocity in the main walking direction in [5], the RMSE is 0.051 m/s, around 3% of the maximum velocity (1.5 m/s). These results indicate that our system performs better in velocity accuracy. Table 2 shows the accuracy evaluation results obtained for different types of motion using different IMU-based systems.


**Table 2.** Accuracy evaluation results obtained for different types of motion using different IMU-based systems. For the position RMSE of the gait-related system and the A-SLAC system, the error is calculated along the direction of movement, while our system evaluates the full 3D trajectory.

| System | Motion Type | Path Length | Position RMSE | Velocity RMSE |
|---|---|---|---|---|
| Gait analysis, Zhou et al. [25] | Striding forward | ≈1.5 m | ≈0.05 m | — |
| A-SLAC [5] | Walking | 3.6 m | 0.038 m (main direction) | 0.051 m/s |
| Proposed system | Instep kick | ≈3.63 m | 0.07 m (3D) | 0.034 m/s |

With the steady evolution of wearable IMUs, inertial components are now commonly integrated onto a single die, allowing users to obtain various kinds of motion-related data. The development of high-resolution, wide-range devices would be ideal for measuring poses in high-intensity motion. Moreover, stretchable electronics would enable devices with multiple sensors to be embedded into forms that are more suitable for mounting on the body [26–28]. Multiple inertial sensor nodes would provide even better motion tracking; with more data, a gradient descent method can be used to fuse the data and obtain a more accurate trajectory [29]. Furthermore, by fusing the position and orientation data from optoelectronic systems with the inertial data obtained from the IMU, the best set of kinematics data might be obtained. By applying sensor fusion techniques based on a multiple-model linear Kalman filter for deflection estimation, the data can be fused at low processing cost, compatible with real-time embedded applications [30].

#### **5. Conclusions**

For the motion analysis, we developed a data processing procedure to fuse the data from the accelerometer and gyroscope of the IMU. According to the experimental results, for an instep kicking motion with a trajectory length of around 3.63 m, the root mean square errors of the position and the velocity, compared with the golden patterns obtained from the high-speed cameras and the image analysis software, are about 0.07 m and 0.034 m/s, respectively. For the maximum velocity of the foot, the error is approximately 4%; this metric is related to the contact point with the ball and the timing of acceleration. The error for the highest point of the foot before hitting the ball is 2.8%.

This system can be applied to players of all ages and levels, whether to observe movement changes through the trajectory or simply to measure the height or velocity of the foot. The motion information, provided in quantified form, gives players and coaches a more specific and clear way to analyze the action. The experiment in this research does not require a large amount of equipment, nor does it need to be carried out in a specific place or room, so the convenience of practical application is greatly improved.

**Author Contributions:** Conceptualization, C.Y. and T.-Y.H.; Formal analysis, C.Y. and T.-Y.H.; Methodology, C.Y., T.-Y.H. and H.-P.M.; Writing—original draft, C.Y. and T.-Y.H.; Writing—review and editing, H.-P.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported in part by the Ministry of Science and Technology in Taiwan, R.O.C. (Grant No. 110-2221-E-007-126-), and in part by National Tsing Hua University (NTHU 110Q2703E1).

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

