**Automatic Change Detection System over Unmanned Aerial Vehicle Video Sequences Based on Convolutional Neural Networks**

**Víctor García Rubio <sup>1,</sup>\*, Juan Antonio Rodrigo Ferrán <sup>2</sup>, Jose Manuel Menéndez García <sup>2</sup>, Nuria Sánchez Almodóvar <sup>1</sup>, José María Lalueza Mayordomo <sup>1</sup> and Federico Álvarez <sup>2</sup>**


Received: 31 August 2019; Accepted: 11 October 2019; Published: 16 October 2019

**Abstract:** In recent years, the use of unmanned aerial vehicles (UAVs) for surveillance tasks has increased considerably. This technology provides a versatile and innovative approach to the field. However, the automation of tasks such as object recognition or change detection usually requires image processing techniques. In this paper we present a system for change detection in video sequences acquired by moving cameras. It is based on the combination of image alignment techniques with a deep learning model based on convolutional neural networks (CNNs). This approach covers two important topics. The first is the capability of our system to adapt to variations between UAV flights: in particular, differences in flight height and slight modifications of the camera's position or of the UAV's movement. These variations can be produced by multiple factors, such as weather conditions (e.g., the effect of wind), security requirements or human error. The second is the precision of our model in detecting changes in diverse environments, which has been compared with state-of-the-art methods in change detection. This has been measured using the Change Detection 2014 dataset, which provides a selection of labelled images from different scenarios for training change detection algorithms. We have used images from the dynamic background, intermittent object motion and bad weather sections. These sections were selected to test our algorithm's robustness to changes in the background, as in real flight conditions. Our system provides a precise solution for these scenarios: the mean F-measure score from the image analysis surpasses 97%, with notable precision in the intermittent object motion category, where the score is above 99%.

**Keywords:** change detection; convolutional neural networks; moving camera; image alignment; UAV

### **1. Introduction**

The use of change detection algorithms is crucial in high-precision surveillance systems. Methods that make use of these algorithms aim to detect differences between information acquired at the same location, e.g., an image captured at different moments in time. Unmanned aerial vehicles (UAVs) have revolutionized the surveillance sector thanks to their lower cost and the reduced human workload they require compared to previous systems. In addition, UAV operations can be automated, and this need for automation increases the importance of change detection methods. These methods are based on the analysis of image sequences, usually acquired by mobile vehicles. Image acquisition from such vehicles entails a considerable issue for change detection algorithms: camera movement. This is the fundamental challenge for these algorithms, as the movement produces a variable background, and the flight route will differ moderately from one flight to another. Furthermore, weather conditions and the precision of GPS positioning influence the relation between the acquired frames and the location of the UAV. Compared to moving cameras, stationary cameras significantly reduce the complexity of the change detection problem, as the background is common to every image [1].

Moving cameras introduce complexity to the problem because the reference image is continuously changing. As a result, the system needs to detect the background of each image to provide precise detection. Variable backgrounds therefore introduce a considerable computational load, which makes the problem hard to solve for real-time video surveillance tasks. An interesting approach to change detection with moving cameras is studied in [2]. That method is based on reconstruction techniques. However, the reconstruction process alone would not be precise enough for our task, a consequence of the limited precision of CNNs when generating detailed images. In addition, it constitutes an extra processing step that could slow down the system. Nevertheless, it provides a distinct perspective from the stationary-camera algorithm mentioned above. After reviewing different state-of-the-art approaches such as [2–4], we have observed an increasing tendency to use CNNs in image processing systems for change detection. Furthermore, several of the studied implementations do not consider a moving camera, as in [3,5]. Consequently, a supplementary component has been added to overcome the problems introduced by a variable background. We have therefore developed our own complete approach based on image alignment and CNNs.
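To make the alignment idea concrete, the following minimal sketch estimates a planar homography between two frames from point correspondences using the direct linear transform (DLT). This is an illustrative example of the kind of alignment step such a system relies on, not the implementation described in this paper: in practice, the correspondences would come from a feature detector and matcher, and a robust estimator such as RANSAC would be used to reject outliers. Function names (`estimate_homography`, `warp_points`) are our own, introduced for this sketch.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate the 3x3 homography H mapping src points to dst points (DLT).

    src, dst: (N, 2) arrays of corresponding points, N >= 4, non-degenerate.
    """
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(A)
    # H (flattened) is the null vector of A: the right singular vector
    # associated with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalise so the bottom-right entry is 1

def warp_points(H, pts):
    """Apply homography H to an (N, 2) array of points."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

Once the homography is known, one frame can be warped onto the coordinate system of the other, so that the two images share a common background before being compared.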

In this document, we review state-of-the-art implementations for change detection and image feature extraction in Section 2. Section 3 includes a detailed explanation of the process followed to develop our implementation. Section 4 describes the metrics obtained by our algorithm and compares them with the state-of-the-art implementations analysed previously. Finally, Section 5 presents the most relevant conclusions drawn from this work.

#### **2. Related Work**

In this section, a review of traditional change detection methods is presented in Section 2.1. Section 2.2 provides an explanation of convolutional neural networks. In Section 2.3, we analyse change detection implementations based on convolutional neural networks. Finally, in Section 2.4, traditional image processing techniques for extracting and matching common features between images are reviewed.

#### *2.1. Change Detection Using Traditional Image Processing*

Traditional image processing methods for change detection are based on modelling pixel intensity variation, which is the prime feature of most background subtraction algorithms. Implementations such as the Gaussian mixture model [6] are exceptionally popular among them. This method uses multiple Gaussian distributions to model the intensity value of each pixel in order to differentiate between background and foreground elements. However, it does not perform efficiently in complex environments where movement is permanent. Other relevant approaches are based on kernel density estimation (KDE) [7]. This method also employs probability density functions to model the intensity of background pixels; foreground pixels are then detected based on these distributions. Such methods rely on pixel-intensity statistics to estimate background probability; most of them are unable to model frequent events and adapt poorly to dynamic backgrounds. More recent approaches to change detection are SuBSENSE [8] and ViBe [9]. SuBSENSE relies on spatio-temporal binary features as well as colour information to detect changes. In the case of ViBe, a set of values taken in the past at the same location is stored for each pixel. This set is compared to the current pixel value to determine whether it belongs to the background. The model is then adapted by randomly choosing which values of the background model to substitute. Both methods are based on elemental features, such as colour analysis. This lack of high-level information brings a decrease in performance in complex environments, which represent our principal target.
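As an illustration of the ViBe scheme just described, the following is a minimal single-channel sketch. The class name `ViBeLike` and all parameter values are our own illustrative choices, not those of the original ViBe paper: each pixel keeps a set of past samples, the current value is compared against them, and pixels classified as background randomly overwrite one of their stored samples.

```python
import numpy as np

rng = np.random.default_rng(0)

class ViBeLike:
    """Toy ViBe-style background model for single-channel uint8 frames."""

    def __init__(self, first_frame, n_samples=20, radius=20,
                 min_matches=2, subsample=16):
        self.n_samples = n_samples      # samples stored per pixel
        self.radius = radius            # intensity distance for a match
        self.min_matches = min_matches  # matches needed to be background
        self.subsample = subsample      # 1/subsample update probability
        # Initialise every sample from the first frame plus noise: a
        # simplification of ViBe's neighbourhood-based initialisation.
        base = first_frame.astype(np.int16)
        self.samples = np.stack(
            [np.clip(base + rng.integers(-10, 10, base.shape), 0, 255)
             for _ in range(n_samples)]
        )

    def apply(self, frame):
        """Return a boolean foreground mask and update the model."""
        diff = np.abs(self.samples - frame.astype(np.int16))
        matches = (diff < self.radius).sum(axis=0)
        background = matches >= self.min_matches
        # Conservative update: each background pixel overwrites one of
        # its stored samples with probability 1/subsample.
        update = background & (rng.integers(0, self.subsample,
                                            frame.shape) == 0)
        idx = rng.integers(0, self.n_samples, frame.shape)
        ys, xs = np.nonzero(update)
        self.samples[idx[ys, xs], ys, xs] = frame[ys, xs]
        return ~background
```

The full algorithm additionally propagates updates to neighbouring pixels and operates on colour distances, but this sketch captures the sample-set comparison and random substitution that distinguish ViBe from distribution-based models such as the Gaussian mixture.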
