*Article*

## **A Comparative Study of Markerless Systems Based on Color-Depth Cameras, Polymer Optical Fiber Curvature Sensors, and Inertial Measurement Units: Towards Increasing the Accuracy in Joint Angle Estimation**

**Nicolas Valencia-Jimenez 1,†, Arnaldo Leal-Junior 1,*,†, Leticia Avellar 1,†, Laura Vargas-Valencia 1, Pablo Caicedo-Rodríguez 2, Andrés A. Ramírez-Duque 1, Mariana Lyra 1, Carlos Marques 3, Teodiano Bastos 1 and Anselmo Frizera 1**


† These authors contributed equally to this work.

Received: 15 December 2018; Accepted: 30 January 2019; Published: 2 February 2019

**Abstract:** This paper presents a comparison between a multiple red green blue-depth (RGB-D) vision system, an intensity variation-based polymer optical fiber (POF) sensor, and inertial measurement units (IMUs) for human joint angle estimation and movement analysis. This systematic comparison aims to study the trade-off between the non-invasive nature of a vision system and its accuracy against wearable technologies for joint angle measurements. The multiple RGB-D vision system is composed of two camera-based sensors, in which a sensor fusion algorithm is employed to mitigate the occlusion and out-of-range issues commonly reported in such systems. Two wearable sensors were employed for the comparison of angle estimation: (i) a POF curvature sensor to measure 1-DOF angles; and (ii) commercially available MTw Awinda IMUs from Xsens. A protocol to evaluate the elbow joints of 11 healthy volunteers was implemented, and the comparison of the three systems is presented using the correlation coefficient and the root mean squared error (RMSE). Moreover, a novel approach for angle correction of markerless camera-based systems is proposed here to minimize the errors on the sagittal plane. Results show a correlation coefficient of up to 0.99 between the sensors with an RMSE of 4.90°, which represents a two-fold reduction compared with the uncompensated results (10.42°). Thus, the RGB-D system with the proposed technique is an attractive non-invasive and low-cost option for joint angle assessment. The authors envisage the proposed vision system as a valuable tool for the development of game-based interactive environments and for assisting healthcare professionals in the generation of functional parameters during motion analysis in physical training and therapy.

**Keywords:** joint angular kinematics; human motion analysis; RGB-D cameras; polymer optical fiber; inertial measurement units

## **1. Introduction**

There is a clear and growing interest in developing technology-based tools that systematically analyze human movement. Notably, there are many advantages to implementing automated systems that detect human motion for applications associated with children in a healthcare context or for assessing the mobility impairment of ill and elderly people [1]. The desired result of such approaches is the automated quantification of body motion, including stability, duration, coordination, and posture control, to support specialists in the decision-making process [2,3]. Despite recent advances in this area, automated quantification of human movement for children with sensory processing and cognitive impairments, and for adults with mobility disabilities, presents multiple challenges due to factors such as accessibility barriers, body-attachment requirements, and the high cost of the systems.

Automated analysis of body movements typically involves obtaining 3D joint data, such as position and orientation, which are estimated in two different ways using an intrusive or a non-intrusive approach, also known as wearable and non-wearable technologies [4]. Wearable systems are portable and can be used by people with movement impairments in unstructured scenarios [5]. However, advances in non-wearable sensing technologies and processing techniques have emerged that measure human biomechanics with high accuracy in highly structured environments [6]. For instance, camera-based markerless systems can be used in scenarios in which the user cannot tolerate a wearable device to capture data [4].

Commonly, the analysis of joint angles is conducted through portable wearable sensors, including electrogoniometers and potentiometers mounted on a single axis; however, these are bulky and limit natural movement patterns [7]. Flexible goniometers adapt better to body parts and are not sensitive to misalignments caused by the movements of polycentric joints. However, they employ strain gauges, which must be carefully attached to the skin due to their high sensitivity [8]. Advancements in micro-electro-mechanical systems (MEMS) have enabled the development of new wearable sensors.

Despite their wide range of applications, inertial measurement units (IMUs) present high sensitivity to magnetic field disturbances and need frequent calibration [9]. New sensor technologies, such as optical sensors (with advantages that include compactness, light weight, flexibility, and insensitivity to electromagnetic interference [10]), are employed for joint angle measurements using different approaches, including intensity variation [11], fiber Bragg gratings [12], and interferometers [13]. Considering their low cost, simple signal processing, and portability [14], intensity variation-based sensors emerge as a feasible alternative for joint angle assessment in movement analysis and robotic applications [15]. However, to date, POF-based curvature sensors share the same limitation as the aforementioned goniometers, since angle assessment is restricted to a single plane.

Non-wearable human motion analysis systems use either multiple image sensors or multiple motion sensors [16]. Image-based systems can be classified as either marker-based or markerless capture [17]. Although commercial marker-based systems can track human motion with high accuracy, markerless systems have many advantages [18]; in particular, they eliminate the difficulty of applying markers to users with physical or cognitive limitations [19].

However, most markerless systems require a surface model; therefore, 3D surface reconstruction is a prerequisite to markerless capture [17,20]. The advent of off-the-shelf depth sensors, such as the Microsoft Kinect [21], makes it easy to acquire depth data, which are beneficial for gesture recognition [22]. However, motion capture with depth sensors remains challenging due to their limited accuracy and range [23].

To build an efficient system, it is necessary to identify its functionalities and requirements so that it effectively addresses the user's needs [24]. For example, disabilities that prevent the use of wearable sensors make it difficult for clinical staff to acquire mobility parameters. Consequently, it is necessary to propose valid and straightforward methods for acquiring such parameters.

The main assumption of the proposed technique for accuracy enhancement is the correlation between the measurement errors and the anthropometric parameters of the user. Thus, a compensation equation was obtained, which correlates the errors with the user's anthropometric parameters. Although the proposed approach requires a longer calibration step prior to the application of the sensor system, it results in lower errors when compared with conventional approaches for angle analysis using markerless systems. The novel approach proposed here is capable of enhancing the accuracy of non-wearable systems. In addition, it is important to emphasize that such an approach can be applied to different systems (even more accurate ones) to enhance their accuracy. Furthermore, with a few adjustments to the calibration step, this technique can be applied to the assessment of 3D dynamics.

This paper presents the analysis and comparison among a multiple red green blue-depth (RGB-D) vision system, an intensity variation-based POF sensor, and IMUs, all intended for the assessment of human elbow joint angles. These wearable sensors were chosen for their compactness, which avoids occlusions for the markerless camera system. In addition, a marker-based camera system was not used in this comparison, since the markers could aid or harm joint detection by the markerless system, which would not represent the conditions of a practical application. This systematic comparison aims to study the trade-off between the markerless feature of the vision system and its accuracy by comparing it with wearable technologies for joint angle measurements. Thus, a compensation technique based on anthropometric measurements of the participants was proposed and validated using the POF curvature sensor measurements.

#### **2. Materials and Methods**

#### *2.1. RGB-D Fusion System*

This system was designed as a distributed and modular architecture using the open-source Robot Operating System (ROS) project. The architecture developed here was built using a node graph approach: the system consists of several nodes for local video processing, distributed across several different hosts and connected at runtime in a peer-to-peer topology. The inter-node connection is implemented as a handshake using the XML-RPC protocol (a Remote Procedure Call protocol that uses XML to encode its calls). The node structure is flexible, scalable, and can be dynamically modified, i.e., each node can be started and left running throughout an experimental session, or resumed and connected to the others at runtime, as sketched below.
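As an illustration of this node graph approach, the following is a minimal sketch of a per-camera client node using ROS 1 (rospy); the node name, topic, message type, and rate are assumptions for illustration, not the exact implementation used here.

```python
#!/usr/bin/env python
# Minimal sketch of one per-camera client node, assuming ROS 1 (rospy).
# The topic name and message type are illustrative choices.
import rospy
from geometry_msgs.msg import PoseArray

def main():
    rospy.init_node('rgbd_client_cam1')  # one node per camera host
    pub = rospy.Publisher('/cam1/joint_poses', PoseArray, queue_size=10)
    rate = rospy.Rate(30)  # ~30 Hz, matching the 33 ms window used by the server
    while not rospy.is_shutdown():
        msg = PoseArray()
        msg.header.stamp = rospy.Time.now()
        msg.header.frame_id = 'cam1_link'
        # ... fill msg.poses with the 15 joint poses estimated locally ...
        pub.publish(msg)
        rate.sleep()

if __name__ == '__main__':
    main()
```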

A computer vision framework composed of an unstructured and scalable network of RGB-D cameras is used to automatically estimate joint positions. This visual sensor network counteracts typical problems such as occlusion and a narrow field of view. Consequently, the system uses a distributed architecture that processes the video of each sensor independently. In this structure, a human body analysis algorithm is executed for each camera. Subsequently, the data from each sensor are transformed and represented with respect to a shared reference frame so that they can be fused to generate the global joint positions.

The vision system is composed of two RGB-D cameras, as shown in Figure 1, in which each camera is connected to a workstation equipped with an Intel Core i5 processor and a GeForce GTX GPU (a GTX960 board and a GTX580 board) to execute local data processing. All workstations are connected through a local area network, synchronized using the Network Time Protocol (NTP), and managed through ROS. The fused estimate is sent to third-party interaction software, which is executed on the workstation with the highest processing capacity.

**Figure 1.** Configuration of the RGB-D system.

Each workstation runs two primary processes: user detection and the position/orientation estimation of 15 joints. The detection process identifies candidate points corresponding to people in the scene using the NiTE software, which performs the client task of sending the movement estimates to the server over the network. The extrinsic calibration (the transformation from the 3-D world coordinate system to the 3-D camera coordinate system) and intrinsic calibration (the transformation from 3-D camera coordinates to 2-D image coordinates) of each RGB-D camera are performed using both the OpenCV package and the multi-camera network calibration tool provided by OpenPTrack (see Figure 2).
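For reference, the following is a minimal sketch of the intrinsic calibration step with OpenCV, assuming a standard checkerboard target; the board size and image paths are illustrative, and the extrinsic multi-camera step is left to the OpenPTrack tool.

```python
# Sketch of intrinsic camera calibration with OpenCV using checkerboard images.
import glob
import cv2
import numpy as np

BOARD = (9, 6)  # inner corners per checkerboard row/column (assumed)
objp = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob('calib_images/*.png'):  # illustrative path
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, BOARD)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# The camera matrix K and distortion coefficients encode the intrinsic
# transformation from 3-D camera coordinates to 2-D image coordinates.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print('Reprojection RMS error:', rms)
```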

**Figure 2.** Client flowchart.

The workstation with the highest processing capacity is also used as the system server, which is responsible for the vision fusion process using a Kalman filter. When the server receives a message with the position data of a client, it checks the time interval between the last received message and the current one. If this interval is greater than 33 ms, the system discards the received measurement and resumes counting the intervals from the next measurement received; this prevents the fusion of data with overly discrepant timestamps. If the data are within the time interval, the system transforms them from the client coordinate system into the global coordinate system defined in the extrinsic calibration process, and then inserts them into the Kalman filter. The saved data are processed through a low-pass Butterworth filter to eliminate noise and achieve a smoother estimate (see Figure 3).
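The following sketch illustrates the server's gating, frame transformation, and smoothing steps under stated assumptions: timestamps are in seconds, `T_client_to_global` is the 4×4 homogeneous transform obtained from the extrinsic calibration, and the Butterworth order and cutoff are illustrative values, not the authors' settings.

```python
# Sketch of the server-side gating and smoothing logic.
import numpy as np
from scipy.signal import butter, filtfilt

MAX_GAP = 0.033  # 33 ms window between consecutive client messages

def accept(t_now, t_last):
    """Discard measurements whose inter-arrival gap exceeds 33 ms."""
    return (t_now - t_last) <= MAX_GAP

def to_global(p_client, T_client_to_global):
    """Transform a 3-D joint position from the client to the global frame."""
    p_h = np.append(p_client, 1.0)          # homogeneous coordinates
    return (T_client_to_global @ p_h)[:3]   # then it feeds the Kalman filter

def smooth(signal, fs=30.0, cutoff=5.0, order=4):
    """Low-pass Butterworth smoothing of a saved trajectory (assumed values)."""
    b, a = butter(order, cutoff / (fs / 2.0), btype='low')
    return filtfilt(b, a, signal)
```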

**Figure 3.** Server flowchart.

The RGB-D fusion process is executed to obtain kinematic parameters and body patterns, which are shown in an assessment interface that detects and quantifies the user's movements. The system can produce parameters such as range of motion and the three-dimensional positions of the tracked body articulations. In the same way, the system can be configured to show specific articulation angles. We use the law of cosines to calculate the elbow angle, as suggested by different authors [25–28]. Equation (1) shows the relation between the forearm length *d*1, the upper arm length *d*2, and the shoulder-to-hand distance *d*3, as shown in Figure 4. The blue dots shown in Figure 4 are identified using the NiTE software, as mentioned above; three points are identified (on the shoulder, elbow, and hand), each represented by its (X, Y, Z) coordinates.

$$
\theta = \cos^{-1}\left(\frac{d_1^2 + d_2^2 - d_3^2}{2\, d_1 d_2}\right) \tag{1}
$$


**Figure 4.** Parameters to calculate the articulation angle of any joint.
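As a worked illustration of Equation (1), the following sketch computes the elbow angle from the three tracked points (shoulder, elbow, and hand); the coordinate values in the example are arbitrary.

```python
# Minimal sketch of Equation (1): elbow angle via the law of cosines from
# the shoulder, elbow, and hand points returned by the skeleton tracker.
import numpy as np

def elbow_angle(shoulder, elbow, hand):
    d1 = np.linalg.norm(elbow - hand)      # forearm length d1
    d2 = np.linalg.norm(shoulder - elbow)  # upper arm length d2
    d3 = np.linalg.norm(shoulder - hand)   # shoulder-to-hand distance d3
    cos_theta = (d1**2 + d2**2 - d3**2) / (2.0 * d1 * d2)
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Example: a right angle at the elbow.
print(elbow_angle(np.array([0.0, 0.3, 0.0]),
                  np.array([0.0, 0.0, 0.0]),
                  np.array([0.3, 0.0, 0.0])))  # -> 90.0
```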

#### *2.2. POF Sensor*

The POF curvature sensor used in this work is based on the intensity variation principle: considering the fiber input connected to a light source and the output to a photodetector, there is a power variation at the POF output when it is subjected to curvature. Such power variation is proportional to the curvature angle and occurs due to effects on the fiber such as radiation losses, light scattering, and stress-optic effects, as thoroughly discussed in Leal-Junior et al. [29]. To increase the sensor's linearity and sensitivity, as well as to reduce hysteresis, a lateral section is made on the fiber by removing the fiber cladding and part of its core (considering the core-cladding structure of conventional solid-core optical fibers [30]). This material removal creates a so-called sensitive zone, where the fiber is more sensitive to curvature variations.

The POF employed in this work, the multimode HFBR-EUS100Z (Broadcom Limited, Singapore), has a Polymethyl Methacrylate (PMMA) core with a diameter of 980 μm, whereas the cladding thickness is 10 μm. In addition, the fiber has a polyethylene coating for mechanical protection, resulting in a total diameter of 2.2 mm. The lateral section is made on the fiber through the abrasive removal of material, where the length and depth of the sensitive zone are about 14 mm and 0.6 mm, respectively. The sensitive zone length and depth were chosen as the ones that result in high sensitivity and linearity with low hysteresis, as experimentally demonstrated in Leal-Junior et al. [31]. The POF with the lateral section has one end connected to a light emitting diode (LED) IF-E97 (Industrial Fiber Optics, Tempe, AZ, USA) with a central wavelength of 660 nm, whereas the other end (output) is connected to the photodiode IF-D91 (Industrial Fiber Optics, Tempe, AZ, USA). The POF curvature sensor needs a characterization step prior to its application in movement analysis, performed by positioning the POF sensor on the experimental setup shown in Figure 5. In this setup, a DC motor with controlled position and angular velocity performs flexion and extension movements on the POF sensitive zone over an angular range of 0–90°. Then, the power attenuation is compared with the angle measured by the potentiometer (also shown in Figure 5) and a linear regression is performed.
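The characterization step amounts to a linear fit between the measured attenuation and the reference angle, as sketched below; the data arrays and the fitted coefficients are placeholders standing in for measurements from the DC-motor setup.

```python
# Sketch of the characterization step: fit a linear model relating the
# measured optical power attenuation to the potentiometer reference angle.
import numpy as np

angle_ref = np.linspace(0.0, 90.0, 10)   # potentiometer angles (deg), placeholder
power_db = -0.02 * angle_ref - 0.5       # illustrative attenuation values (dB)

# Linear regression: angle = a * attenuation + b
a, b = np.polyfit(power_db, angle_ref, 1)

def angle_from_power(p_db):
    """Estimate the joint angle from a new attenuation reading."""
    return a * p_db + b
```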

**Figure 5.** Experimental setup for POF curvature sensor characterization.

To get the user's movement parameters, the sensor is positioned on the elbow joint. Since the mounting conditions of the POF curvature sensor have a direct influence on its response (as discussed in Leal-Junior et al. [31]), each user is asked to perform a 90° angle in the sagittal plane (in both cases, i.e., with the sensor at the elbow and knee joints). Thus, the 90° movement in the sagittal plane is used to calibrate the sensor as a function of its positioning on each subject.
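One plausible reading of this per-subject calibration is a simple rescaling of the sensor output against the known 90° reference posture, as sketched below; this scheme and its function names are assumptions for illustration, not the authors' exact formulation.

```python
# Sketch of per-subject calibration: the reading captured while the subject
# holds a 90-degree flexion rescales the sensor model to the mounting
# conditions on that subject (assumed scheme).
def calibrate_gain(reading_at_90, reading_at_rest):
    """Gain mapping raw sensor readings to degrees for this subject."""
    return 90.0 / (reading_at_90 - reading_at_rest)

def joint_angle(reading, reading_at_rest, gain):
    """Convert a raw reading into a joint angle in degrees."""
    return gain * (reading - reading_at_rest)
```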
