#### *2.2. Histogram of Oriented Gradients Descriptor*

The Histogram of Oriented Gradients (HOG) is a description method used in computer vision to detect objects. This descriptor stands out because it is easy to build, performs well in detection tasks, and has a low computational cost. It is built from the orientation of the gradient in localized parts of the panoramic image. The image is divided into small regions ($k_2$ horizontal cells in this work) and, for each cell, a histogram with $b$ bins is compiled from the gradient orientation of the pixels inside it. Concatenating these histograms yields the descriptor ($\vec{d} \in \mathbb{R}^{b \cdot k_2 \times 1}$). This method has been used by authors such as Mekonnen et al. [32], who developed a person detection tool, and Dong et al. [33], who proposed a HOG-based multi-stage approach for object detection and pose recognition in the field of service robots. The method was first used by Dalal and Triggs [34] to solve the people detection task. Zhu et al. [35] presented an improved version, in terms of computational time and efficiency, to detect people.

The HOG version proposed in this work is described in detail in [36].
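As an illustration only, the construction described above (horizontal cells, one orientation histogram per cell, concatenation) can be sketched in Python with NumPy. This is not the implementation of [36]: the function name `hog_descriptor`, the default values of `k2` and `b`, the use of unsigned orientations in [0°, 180°), and the weighting of each vote by the gradient magnitude are all assumptions made for this sketch.

```python
import numpy as np


def hog_descriptor(image, k2=4, b=8):
    """Sketch of a HOG-style descriptor with k2 horizontal cells and b bins.

    Assumptions of this sketch (not taken from the paper): unsigned
    orientations in [0, 180) degrees and magnitude-weighted votes.
    """
    image = np.asarray(image, dtype=float)

    # Gradient along rows (gy) and along columns (gx).
    gy, gx = np.gradient(image)
    magnitude = np.hypot(gx, gy)

    # Unsigned gradient orientation in degrees.
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0

    # Index of the orientation bin of every pixel, clipped to b - 1
    # to guard against rounding at the upper edge.
    bin_idx = np.minimum((orientation / (180.0 / b)).astype(int), b - 1)

    # Split the rows into k2 horizontal cells that span the full width.
    row_groups = np.array_split(np.arange(image.shape[0]), k2)

    histograms = []
    for rows in row_groups:
        hist = np.bincount(
            bin_idx[rows].ravel(),
            weights=magnitude[rows].ravel(),
            minlength=b,
        )
        histograms.append(hist)

    # Concatenate the k2 histograms into a (b * k2) x 1 column vector.
    return np.concatenate(histograms).reshape(-1, 1)
```

For a 32 × 64 image with `k2=4` and `b=8`, the returned vector has shape `(32, 1)`, which matches the dimension $b \cdot k_2 \times 1$ stated above.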

#### *2.3. Gist Descriptor*

The *gist* descriptor was introduced by Oliva et al. [37], and it has been commonly used to recognize scenes. Since then, several versions have appeared, which work with different features of the images, such as colour, texture, orientation, etc. [38]. Some researchers have used *gist* in mobile robotics. For instance, Chang et al. [39] used this global appearance descriptor for localization and navigation. Murillo et al. [40] also used the *gist* descriptor to solve the localization problem, but in their case it was a reduced version obtained with Principal Component Analysis (PCA).

The version we use throughout this paper is described in [36] and works with the orientation information obtained through a set of Gabor filters. From the panoramic image, $m$ different resolution levels are obtained. Then, $n_{masks}$ orientation filters are applied over each level. Finally, the pixels of every filtered image are grouped into $k_3$ horizontal blocks, and the information is arranged in a vector ($\vec{d} \in \mathbb{R}^{n_{masks} \cdot m \cdot k_3 \times 1}$).
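The three steps above (resolution levels, orientation filters, horizontal blocks) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of [36]: the helper names (`gabor_kernel`, `filter_circular`, `gist_descriptor`), the kernel parameters, the 2 × 2 averaging used to build each coarser level, the circular (FFT-based) filtering, and the use of the mean absolute response per block are all choices made for this sketch.

```python
import numpy as np


def gabor_kernel(theta, size=9, sigma=2.0, wavelength=4.0):
    """Real-valued Gabor kernel oriented at angle theta (radians).

    All parameter values are illustrative assumptions.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * x_rot / wavelength)
    # Zero mean, so that flat regions produce no response.
    return kernel - kernel.mean()


def filter_circular(image, kernel):
    """Circular convolution computed with the FFT (periodic borders)."""
    spectrum = np.fft.fft2(image) * np.fft.fft2(kernel, s=image.shape)
    return np.real(np.fft.ifft2(spectrum))


def halve_resolution(image):
    """Next resolution level: average non-overlapping 2 x 2 patches."""
    h, w = image.shape[0] // 2, image.shape[1] // 2
    return image[:2 * h, :2 * w].reshape(h, 2, w, 2).mean(axis=(1, 3))


def gist_descriptor(image, m=2, n_masks=4, k3=4):
    """Sketch of a gist-style descriptor of length n_masks * m * k3."""
    level = np.asarray(image, dtype=float)
    thetas = [np.pi * i / n_masks for i in range(n_masks)]

    features = []
    for _ in range(m):
        for theta in thetas:
            response = np.abs(filter_circular(level, gabor_kernel(theta)))
            # Group the rows into k3 horizontal blocks and average each one.
            for block in np.array_split(response, k3, axis=0):
                features.append(block.mean())
        level = halve_resolution(level)

    return np.array(features).reshape(-1, 1)
```

With `m=2`, `n_masks=4` and `k3=4`, the result has shape `(32, 1)`, in agreement with the dimension $n_{masks} \cdot m \cdot k_3 \times 1$. The sketch assumes that every resolution level is at least as large as the kernel and has at least `k3` rows.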
