#### *2.2. Load Localization from Depth Images Using ANN*

A supervised feedforward ANN with a classification structure is proposed to localize where the load acts on the wing. The proposed ANN classifier, depicted in Figure 2, takes the encoded depth images as inputs and outputs the location of the loads.

**Figure 2.** The proposed load localization ANN with 2 hidden layers and ReLU activation functions.

The encoded depth features of each sample in the training set were normalized across the samples using the standardization technique, so that the input features had zero mean and unit standard deviation; the formula used for standardization is given by Equation (5). The test set was standardized using the mean and standard deviation of the training set. Standardization was performed because of inevitable sensor noise, which can hinder the generalization capabilities of neural networks. The standardized inputs were then fed into the localization ANN, which consists of two hidden layers. The activation function in both layers was chosen to be ReLU (Rectified Linear Unit), over alternatives such as sigmoid and tanh, due to ReLU's fast convergence.

$$
\hat{X}_i = \frac{X_i - \mathrm{mean}(X_j)}{\sigma(X_j)} \tag{5}
$$

where the vector $X_i$ contains the encoded features of each sample, the vector $X_j$ contains a given feature across the samples, the vector $\hat{X}_i$ contains the standardized features of each sample, and $\sigma$ is the standard deviation.
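The standardization step can be sketched as follows. This is a minimal NumPy illustration, not the original implementation; the array names and toy feature values are hypothetical. Note that the test set reuses the training-set statistics, as described above.

```python
import numpy as np

def standardize(train_features, test_features):
    """Zero-mean, unit-variance scaling per Equation (5).

    Statistics are computed per feature across the training samples
    and reused for the test set, as stated in the text.
    """
    mean = train_features.mean(axis=0)
    std = train_features.std(axis=0)
    train_std = (train_features - mean) / std
    test_std = (test_features - mean) / std
    return train_std, test_std

# Hypothetical encoded depth features: 4 training samples, 3 features each
train = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],
                  [3.0, 6.0, 9.0],
                  [4.0, 8.0, 12.0]])
test = np.array([[2.5, 5.0, 7.5]])
train_s, test_s = standardize(train, test)
```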

Typically, in classification problems, the output labels are one-hot encoded, converting categorical data, in this case the load positions, into numerical data. The output of the last layer of the neural network is passed through a sigmoid function, which maps the arbitrary output scores to probabilities between zero and one. Sigmoid, instead of other activation functions, was chosen for the output layer because load localization in this work is a multi-label classification problem, where more than one correct label can exist in the output. The output labels are therefore not mutually exclusive, i.e., they are independent. The closeness between the output of the sigmoid function and the true labels (*T*) is measured by a loss or cost function. The cost function of the classification (*CFCL*) is defined as the average of the Cross Entropy Error Function (CEEF) over a batch of *N* samples with *K* labels, as follows:

$$CF_{CL} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{K} T_{ij} \log(S(x_{ij})) \tag{6}$$

The optimizer in the backpropagation algorithm updates the weights and biases so as to minimize this loss; as the loss decreases, the accuracy of the neural network increases. A classification ANN with two hidden layers of ReLU activation functions was found to be sufficient to successfully localize the loads causing bending and/or twisting deflections. The proposed ANN was trained using the Adam optimizer [20]. Both L2 regularization and dropout [21] were utilized to improve the generalization performance of the network and prevent overfitting. This yields a new cost function, consisting of the cost function defined by Equation (6) plus a regularization term weighted by the scalar *β* due to L2 regularization. The final cost function *FCFCL* is given by Equation (7), and the metric used for calculating the accuracy of load localization predictions is given by Equation (8). The localization results obtained for both concentrated and distributed loading scenarios are presented and evaluated in detail in the results section.
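A minimal NumPy sketch of the forward pass and regularized cost of Equations (6) and (7) is given below. It is illustrative only: the layer widths, weight values, and *β* are hypothetical, and the L2 term uses the squared norm, the form commonly used in practice.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def localization_cost(X, T, W1, b1, W2, b2, W3, b3, beta):
    """Two ReLU hidden layers, sigmoid output, then the
    regularized multi-label cost of Equation (7)."""
    h1 = relu(X @ W1 + b1)
    h2 = relu(h1 @ W2 + b2)
    S = sigmoid(h2 @ W3 + b3)              # per-label probabilities
    N = X.shape[0]
    # Cross-entropy term of Equation (6) (note the leading minus sign)
    ce = -np.sum(T * np.log(S + 1e-12)) / N
    # L2 penalty over all weight matrices, scaled by beta
    # (squared norm assumed, a common choice)
    l2 = beta * sum(np.sum(W ** 2) for W in (W1, W2, W3))
    return ce + l2

# Hypothetical sizes: 4 input features, two 6-unit hidden layers, 8 labels
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
T = (rng.random((5, 8)) > 0.5).astype(float)
W1, W2, W3 = (rng.normal(size=s) for s in ((4, 6), (6, 6), (6, 8)))
b1, b2, b3 = np.zeros(6), np.zeros(6), np.zeros(8)
cost = localization_cost(X, T, W1, b1, W2, b2, W3, b3, beta=0.01)
```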

$$FCF_{CL} = CF_{CL} + \beta \sum ||Weights||_2 \tag{7}$$

$$Accuracy_{CL} = \frac{\sum(Y = \hat{Y})}{N} \times 100 \tag{8}$$

where *Y* is the ground truth, $\hat{Y}$ is the prediction, and *N* is the number of samples.
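Equation (8) can be sketched as below. Since the labels here are multi-label vectors, this sketch reads the summation as counting a sample as correct only when its entire predicted label vector matches the ground truth; that interpretation, and the toy label data, are assumptions.

```python
import numpy as np

def localization_accuracy(Y, Y_hat):
    """Equation (8): percentage of samples whose predicted
    label vector exactly matches the ground truth (one
    possible reading of the summation for multi-label data)."""
    matches = np.all(Y == Y_hat, axis=1)
    return matches.sum() / len(Y) * 100.0

# Hypothetical one-hot-style label vectors for 4 samples
Y     = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]])
Y_hat = np.array([[1, 0, 1], [0, 1, 0], [1, 0, 0], [0, 0, 1]])
acc = localization_accuracy(Y, Y_hat)  # 3 of 4 samples match
```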

#### *2.3. Load Estimation from Depth Images Using ANN*

In this section, a regression ANN for estimating the magnitude of the loads acting on the wing is proposed. The input to this network is again the encoded depth images, and the output is the magnitude of the load. Unlike the ANN classifier, the output layer here consists of only a single node, which provides continuous numeric outputs in terms of loads. Because the output in this case is a single numeric value, there is no need for a sigmoid function at the output, as was the case in the classification network. Moreover, the cost function for the estimation of load magnitudes (*CFE*) is simply defined as the mean of the squared difference between the predicted value and the ground truth, as given by Equation (9). Similar to the localization part, the estimation of load magnitudes was performed using two hidden layers, but the activation functions in the first and second hidden layers were chosen to be tanh and sigmoid, respectively. The proposed load estimation ANN is illustrated in Figure 3.

$$CF_E = \frac{1}{N} \sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2 \tag{9}$$

where *Y* is the ground truth, $\hat{Y}$ is the prediction, and *N* is the number of samples.
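Equation (9) is the standard mean squared error; a minimal sketch with hypothetical load magnitudes (in newtons) follows.

```python
import numpy as np

def estimation_cost(Y, Y_hat):
    """Equation (9): mean squared error between predicted and
    true load magnitudes over a batch of N samples."""
    return np.mean((Y - Y_hat) ** 2)

# Hypothetical load magnitudes in newtons
Y = np.array([2.45, 4.90, 7.35])
Y_hat = np.array([2.55, 4.80, 7.35])
cost = estimation_cost(Y, Y_hat)
```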

**Figure 3.** The proposed ANN for the estimation of loads, with two hidden layers using tanh and sigmoid activation functions, respectively.

Both tanh and sigmoid belong to the family of sigmoid functions. The difference between the two is that the output of the sigmoid function ranges from zero to one, while the output of tanh ranges from −1 to +1. Moreover, the tanh function often converges faster than sigmoid due to its symmetric nature [22]. The formula used for calculating the accuracy of predictions [23,24] for load estimation is given by Equation (10). The results obtained for both concentrated and distributed loading scenarios are presented and evaluated in detail in the experimental results section.

$$Accuracy_E = \left(1 - \frac{||Y - \hat{Y}||}{||Y - \bar{Y}||}\right) \times 100 \tag{10}$$

where *Y* is the ground truth, $\hat{Y}$ is the prediction, and $\bar{Y}$ is the mean of the ground truth.
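Equation (10) is a relative-error measure (akin in spirit to the coefficient of determination): the prediction error norm is compared against the spread of the ground truth about its mean. A minimal sketch with hypothetical values:

```python
import numpy as np

def estimation_accuracy(Y, Y_hat):
    """Equation (10): 100 * (1 - ||Y - Yhat|| / ||Y - Ybar||),
    where Ybar is the mean of the ground truth."""
    Y_bar = np.mean(Y)
    return (1.0 - np.linalg.norm(Y - Y_hat) / np.linalg.norm(Y - Y_bar)) * 100.0

# Hypothetical ground-truth loads; a perfect prediction scores 100%
Y = np.array([2.0, 4.0, 6.0])
Y_hat = np.array([2.0, 4.0, 6.0])
acc = estimation_accuracy(Y, Y_hat)
```

Predicting the ground-truth mean everywhere scores 0%, so the metric penalizes trivial constant predictors.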

#### **3. Experimental Setup and Evaluation of the Depth Sensor for Load Monitoring**

#### *3.1. Experimental Setup*

In order to validate the effectiveness of the proposed framework, an experimental setup consisting of a composite wing of a quad tilt-wing aircraft called SUAVI [25] was used. The wing measures 50 × 25 cm in length and width, respectively. The root side of the wing was fixed, so that no tilting was induced under applied loads. Similar to works in the literature, ground tests were performed to experimentally mimic the deflections that may occur on a wing due to external loads during flight. In the experiments, different types of loads causing bending and twisting deflections on the wing were applied in two loading scenarios. In the first, called the concentrated loading case in this work, the load acts on one of the eight positions depicted in the left image of Figure 4. Six calibrated loads with magnitudes of [2.45, 4.9, 7.35, 9.81, 12.26, 14.71] N were chosen to act on these positions. Therefore, in total, eight different positions exist, each with six distinct loads, resulting in forty-eight configurations for the first case. In the second scenario, distributed loads were placed in between the eight locations, so that the loads act on multiple locations of the wing at the same time, as indicated in the right image of Figure 4. In loading positions 9 to 21, except for positions 13, 17, and 21, the loads act at two positions simultaneously; for example, position 9 represents two loads acting at positions 1 and 5. At positions 13, 17, and 21, the loads act at four positions simultaneously; for example, position 13 represents four loads acting at positions 1, 2, 5, and 6 at the same time. Therefore, the total number of positions corresponding to distributed loads is thirteen.
The load magnitudes used in the distributed scenario were the same as in the concentrated loading case, but each magnitude was distributed among the multiple positions it acted upon; for example, at position 9, two loads of 1.225 N acted at positions 1 and 5 simultaneously for a loading case of 2.45 N. Therefore, six distinct load magnitudes per location exist in the distributed loading case, resulting in seventy-eight different configurations for the second case. In total, 126 distinct loading cases were performed during the experiments. To measure the deflections occurring over the span of the wing, this work proposes the use of a single RGB-D camera. RGB-D cameras are sensors capable of providing pixel-wise depth information from images, making them suitable for full-field measurement purposes. The RGB-D sensor used for data collection in this work was a Microsoft Kinect V1 [26]. The reasons for choosing the Microsoft Kinect V1 for this work are as follows:


**Figure 4.** Positions of concentrated (**Left**) and distributed (**Right**) loads (Green) acting on the UAV wing.
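The configuration bookkeeping for the two loading scenarios described above can be double-checked with a short script (the position counts are taken from the text; this is only an arithmetic sanity check):

```python
# Concentrated case: 8 positions, each with 6 calibrated load magnitudes
concentrated = 8 * 6

# Distributed case: 13 positions (labels 9-21), again 6 magnitudes each
distributed = 13 * 6

# Total loading cases performed in the experiments
total = concentrated + distributed
```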

Khoshelham et al. [30] theoretically and experimentally evaluated the geometric quality of the depth data collected by the Kinect V1. They quantified the random error of the depth measurement to be 4 cm at a ranging distance of 5 m, and concluded that the error increases quadratically with the ranging distance from the Kinect. Based on their results, Khoshelham et al. recommended that, for general mapping applications, data be acquired at a distance of 1 to 3 m from the sensor due to the reduced quality of range data at further distances. Therefore, in this work, the Kinect was placed at a distance of one meter from the wing during the tests, and it was placed under the wing of the UAV so as to capture the whole wing. Figure 5 shows the experimental setup used in this work.

Although the methodology is evaluated on a relatively small aircraft, the same technique can be utilized for structural health monitoring of much larger ones by using a depth camera with a larger field of view, such as the Carnegie Robotics® MultiSense™ S21B [31], Arcure Omega [32], or MYNT EYE [33]. Moreover, because the UAV used in this work was small, the depth sensor was not installed on it. Nonetheless, depth cameras can be installed on a larger aircraft, with a minimum distance between the wing and the installation location chosen according to the depth sensor's operating range. Depth cameras could be installed in place of the RGB cameras that have been fitted on aircraft [3,4,7], with the advantage of not requiring the installation of additional markers or LEDs on the wing, as shown in Figure 6. Installing the depth sensors at an angle, as shown in Figure 6, will not affect their operation, since, once a deflection occurs over the span of the wing, the wing's depth changes with respect to the pose of the installed depth sensor. For large aircraft, the depth sensors can be connected to an onboard PC via wired or wireless connections. Moreover, dampers can be utilized to reduce the impact on the depth sensors of the vibrations that may occur in flight. Furthermore, if the proposed method is trained with data obtained under in-flight conditions, it can account for all of the disturbances acting on the depth camera, since these disturbances will manifest themselves in the acquired data.

**Figure 5.** Experimental setup.

**Figure 6.** Possible installation locations of depth sensors on large aircraft.
