## *2.1. Step 1: Hyperspectral UAV Image Classification for Generating the Land Cover Map*

Hyperspectral images (HSIs) contain hundreds of narrow, contiguous bands over a wide range of the electromagnetic spectrum. They therefore provide more detailed spectral information than multispectral images and can spectrally discriminate similar materials. A land cover map derived from HSIs distinguishes classes such as forest and crop land, which are also items in the land category framework; level-I land cover classes can thus be regarded as usage information. In this sense, the land cover information from HSIs captures not only the land surface materials but also the land use. Furthermore, the latest land surface information of the target area can be extracted from UAV images captured at the desired time points and intervals.

HSI classification methods should consider the high dimensionality of the dataset. Traditionally, HSIs have been classified by pattern recognition algorithms, such as nearest neighbor, decision trees, and linear functions [23]. k-nearest neighbor (k-NN) classification is a representative simple method that measures the similarities between the training and test data by using their Euclidean distances. Support vector machines mitigate the curse of dimensionality by determining decision boundaries in a high-dimensional space using the kernel method [23].
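
As an illustration, a pixel-wise k-NN baseline of this kind takes only a few lines with scikit-learn; the band count, class count, and k = 5 below are arbitrary placeholders rather than settings from this study:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy pixel-wise k-NN baseline: each sample is one pixel's spectrum.
rng = np.random.default_rng(0)
X_train = rng.random((100, 50))            # 100 labeled pixels, 50 bands
y_train = rng.integers(0, 6, 100)          # 6 land cover classes
knn = KNeighborsClassifier(n_neighbors=5)  # Euclidean distance by default
knn.fit(X_train, y_train)
print(knn.predict(rng.random((3, 50))))    # labels for 3 unseen pixels
```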

More recently, HSI classification has been performed with deep learning approaches. Deep learning replaces the hand-crafted feature-engineering process, which requires expert experience and careful parameter settings, with automatic extraction of the meaningful features contained in the high-dimensional bands [24]. CNNs have been widely applied to HSI classification tasks [25–28]. Many studies have successfully classified hyperspectral images using 2D-CNNs, which extract features from the spatial domain [25,26]. Efficient feature extraction by 2D-CNNs requires a data transformation process, such as dimensionality reduction, to convolve all bands of the input image. As HSIs include hundreds of spectral bands, the convolutions require many kernels, which introduces an over-fitting risk and increases the computational cost. The 2D convolution is computed as follows:

$$v\_{l,j}^{x,y} = \phi \left( \sum\_{n} \sum\_{h=0}^{H-1} \sum\_{w=0}^{W-1} w\_{ljn}^{hw} \, o\_{(l-1)n}^{(x+h)(y+w)} + b \right) \tag{1}$$

where $v\_{l,j}^{x,y}$ is the pixel value at position $(x, y)$ on the $j$th feature map in layer $l$ (the layer of the current operation); $\phi$ is the activation function; $b$ is a bias parameter; and $w\_{ljn}^{hw}$ is the weight value at position $(h, w)$ in the $n$th shared $H \times W$ kernel, where $n$ indexes the feature maps of the $(l-1)$th layer. $o\_{(l-1)n}^{(x+h)(y+w)}$ is the input at position $(x+h, y+w)$, and $(h, w)$ denotes its offset from $(x, y)$.
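
For concreteness, Equation (1) can be implemented directly; the following is a minimal NumPy sketch of the per-pixel computation (with ReLU assumed as the activation), not the vectorized form used in practice:

```python
import numpy as np

def conv2d_pixel(o_prev, W, b, x, y, phi=lambda t: np.maximum(t, 0)):
    """Eq. (1): value at (x, y) of one output feature map in layer l.

    o_prev : (N, H_in, W_in) feature maps of layer l-1
    W      : (N, H, W) shared kernel weights w_{ljn}^{hw}
    b      : scalar bias; phi: activation function (ReLU here)
    """
    N, H, Wk = W.shape
    # sum over input maps n and kernel offsets (h, w)
    acc = sum(W[n, h, w] * o_prev[n, x + h, y + w]
              for n in range(N) for h in range(H) for w in range(Wk))
    return phi(acc + b)
```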

3D-CNNs simultaneously extract the spatial and spectral features [27,28]. A 3D-CNN preserves the original input data by avoiding complex data reconstruction and considers the relationships among channels; however, 3D-CNNs are more computationally complex than 2D-CNNs. For classes with similar textures over many spectral bands, they can even perform worse than 2D-CNNs [29]. The pixel value at position $(x, y, z)$ in the $j$th 3D feature cube of the $l$th layer is computed as follows:

$$v\_{l,j}^{x,y,z} = \phi \left( \sum\_{n} \sum\_{h=0}^{H-1} \sum\_{w=0}^{W-1} \sum\_{r=0}^{R-1} w\_{ljn}^{hwr} \, o\_{(l-1)n}^{(x+h)(y+w)(z+r)} + b \right) \tag{2}$$

where $R$ is the spectral dimension of the 3D kernel and $w\_{ljn}^{hwr}$ is the weight value at position $(h, w, r)$, connected to the $n$th feature cube in the $(l-1)$th layer. $o\_{(l-1)n}^{(x+h)(y+w)(z+r)}$ represents the input at position $(x+h, y+w, z+r)$, and $(h, w, r)$ denotes its offset from $(x, y, z)$.
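
The 3D case extends the same per-pixel sketch with the spectral offset $r$ (again an illustration with ReLU assumed, not a production implementation):

```python
import numpy as np

def conv3d_pixel(o_prev, W, b, x, y, z, phi=lambda t: np.maximum(t, 0)):
    """Eq. (2): adds a spectral offset r over the R-deep 3D kernel.

    o_prev : (N, H_in, W_in, R_in) feature cubes of layer l-1
    W      : (N, H, W, R) kernel weights w_{ljn}^{hwr}
    """
    N, H, Wk, R = W.shape
    # sum over input cubes n and kernel offsets (h, w, r)
    acc = sum(W[n, h, w, r] * o_prev[n, x + h, y + w, z + r]
              for n in range(N) for h in range(H)
              for w in range(Wk) for r in range(R))
    return phi(acc + b)
```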

The abovementioned limitations can be resolved by hybridizing 2D- and 3D-CNNs [29]. In the hybrid spectral CNN (HybridSN), the output of a 3D-CNN is input to a 2D-CNN. This configuration learns the spatial representation at a more abstract level, with lower model complexity, than the 3D-CNN alone.

The present study proposes a new hybrid of 2D- and 3D-CNNs for effectively classifying hyperspectral UAV images (Figure 2). The network comprises parallel 2D- and 3D-CNN branches in its convolutional layers, which generate various meaningful feature maps from the input. First, spectral redundancy is removed by PCA along the spectral bands of the original HSI. The PCA image is then processed through the convolutional layers with 2D and 3D kernels. The first convolutional layer of each branch has eight filters, and the subsequent convolutional layers of the 2D and 3D branches have 16 and 32 kernels, respectively. The outputs of the 3D convolutional layers are reshaped to 2D, and the feature maps obtained from both branches are combined to form the spectral and spatial feature maps. These maps are input to the fully connected layers, and the pixels are finally classified into land cover classes (see the sketch after Figure 2). In the next step, these land cover classes are mapped to the land category items in the cadastral map. To reduce the complexity of the mapping and to generalize the model, we adopt level-I land cover types, namely forests, crop lands, roads, buildings, bare soil, and water bodies.

**Figure 2.** Process of step 1 in the discrepancy analysis: hyperspectral UAV image classification for generating the land cover map. PCA: principal component analysis.
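
A minimal PyTorch sketch of this two-branch layout follows. The filter counts (eight in each first layer, then 16 and 32 in the 2D and 3D branches) follow the description above; the kernel sizes, patch size, number of PCA components, and fully connected widths are illustrative assumptions rather than the exact settings of the proposed network:

```python
import torch
import torch.nn as nn

class HybridBranchNet(nn.Module):
    """Two-branch 2D/3D CNN sketch for patch-wise HSI classification."""
    def __init__(self, n_pca_bands=10, patch=9, n_classes=6):
        super().__init__()
        # 3D branch: joint spectral-spatial features (8 -> 32 filters)
        self.branch3d = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(8, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        # 2D branch: spatial features over the PCA bands (8 -> 16 filters)
        self.branch2d = nn.Sequential(
            nn.Conv2d(n_pca_bands, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
        )
        # 3D maps with the spectral axis folded into channels, plus 2D maps
        fused = 32 * n_pca_bands + 16
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(fused * patch * patch, 256), nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, x):                    # x: (B, bands, H, W) PCA patch
        f3 = self.branch3d(x.unsqueeze(1))   # -> (B, 32, bands, H, W)
        f3 = f3.flatten(1, 2)                # fold spectral dim into channels
        f2 = self.branch2d(x)                # -> (B, 16, H, W)
        return self.head(torch.cat([f3, f2], dim=1))

# usage: classify a batch of four 9x9 patches with 10 PCA components
model = HybridBranchNet()
logits = model(torch.randn(4, 10, 9, 9))     # -> (4, 6)
```

Folding the spectral dimension of the 3D output into the channel axis is one simple way of making the two branches concatenable before the fully connected layers.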

## *2.2. Step 2: Inconsistency Comparison Between the Cadastral Map and Land Cover Map*

Our proposed pixel-level inconsistency comparison automatically detects areas in which the registered land use is inconsistent with the actual land information. In a previous study [30], a restructured land use map was generated in vector format, which assigned the actual land cover classes from the imagery as attributes and the cadastral boundaries as the geometry. Although this map compares the registered land categories in cadastral maps with the actual land use, it is limited to the primary land use, which occupies the largest area in each parcel. An elaborate comparison must consider all land uses in each parcel. Figure 3 shows the process of comparing the actual land cover and cadastral map at the pixel level, which considers both primary and minor uses.

**Figure 3.** Process of step 2 in discrepancy analysis: inconsistency comparison between cadastral and land cover maps.

The proposed automatic comparison technique comprises three stages: "Encoding," "Decoding," and "Query-based comparison" (Figure 3). Because the cadastral map and land cover map are constructed in vector and raster formats, respectively, the automatic inconsistency comparison must convert the heterogeneous datasets into the same structure prior to the overlay analysis [15]. The first stage, encoding, performs raster conversion using the cadastral map attributes. For this purpose, the land category and parcel ID are assigned to each pixel of the rasterized cadastral map, which has the same pixel size as the land cover map. A combined raster map is then generated with coded values $V\_{ij}$ that combine the land cover $C\_{ij}$, land category $U\_{ij}$, and parcel ID $P\_{ij}$ values. The encoding query is expressed as follows:

$$V\_{ij} = P\_{ij} \times 10^4 + U\_{ij} \times 10^2 + C\_{ij} \quad \forall \ (i, j). \tag{3}$$

The second stage, decoding, vectorizes the combined raster map, producing a vector map that merges the land cover and land category information. The attributes of this vector map include the parcel ID, land cover, and land category, and their values are assigned by decoding the pixel values (see the sketch below).
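
The encoding and decoding stages reduce to integer arithmetic on the pixel values. A minimal sketch, assuming the land category and land cover codes each fit in two decimal digits as implied by Equation (3); the parcel ID, category, and cover codes below are illustrative:

```python
import numpy as np

def encode(P, U, C):
    """Eq. (3): pack parcel ID, land category, and land cover into one code."""
    return P * 10**4 + U * 10**2 + C

def decode(V):
    """Recover the three attribute layers from the combined raster."""
    P, rest = np.divmod(V, 10**4)
    U, C = np.divmod(rest, 10**2)
    return P, U, C

# e.g., parcel 37, category 5, cover 2 -> code 370502 (values are illustrative)
V = encode(np.array([37]), np.array([5]), np.array([2]))
print(V, decode(V))   # [370502] (array([37]), array([5]), array([2]))
```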

The combined vector map includes both the land category and land cover values in each unit area. Therefore, inconsistent areas can be automatically extracted through a query-based comparison between the corresponding values from the previous stage. The land category items are defined in terms of land use, and each item can contain multiple usages. For example, a "building site" may include buildings and bare land, and a "school site" may include buildings, bare land, trees, and grass. In contrast, the land cover information extracted from imagery describes the materials and/or objects covering the land, derived from the spectral characteristics of the image. When constructing a query to compare these two maps, we must therefore define mapping rules that determine the discrepancy between the land category items and land cover classes, which are classified under different criteria.

However, no absolute standard for mapping land category items to land cover classes can be established, because the land category items differ among country-specific cadastral systems and the number of distinguishable classes depends on the quality of the imagery. In the case study (Section 4.1), the mapping between land cover classes and land category items follows the Korean Cadastral System as a guideline. An automatic comparison can then be queried based on the corresponding mapping information, and the query result automatically determines the discrepancy between the land category and land cover. The discrepancy map is generated by dissolving the compared areas based on parcel IDs. From the discrepancy map, we can calculate the portion of each parcel in which the registered land category differs from the actual land cover. Because the discrepancy map is generated by comparing both the primary and minor land uses, it provides reference data for automatically detecting parcels that must be divided. Table 1 shows the proposed algorithm of the pixel-level comparison for detecting inconsistent areas; moreover, this algorithm can be automated in the model builder of ArcGIS 10.1 [31] (Figure 4).


**Table 1.** Proposed algorithm of pixel-level inconsistency comparison.

| Stage | Operation |
|---|---|
| Input | Cadastral map (vector); land cover map (raster) |
| 1. Encoding | Rasterize the cadastral map; assign the parcel ID and land category to each pixel; combine them with the land cover into coded values $V\_{ij}$ (Equation (3)) |
| 2. Decoding | Vectorize the combined raster; decode the pixel values into parcel ID, land category, and land cover attributes |
| 3. Query-based comparison | Compare the land category and land cover values under the mapping rules; dissolve the result by parcel ID |
| Output | Discrepancy Map (DM, vector) |

**Figure 4.** Automated model for detecting inconsistent areas.
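
A sketch of the query-based comparison stage is given below; the mapping rules and class codes are placeholders for illustration, not the Korean Cadastral System's actual items:

```python
import numpy as np

# Illustrative mapping rules: each land category code (key) lists the land
# cover classes considered consistent with it (codes are placeholders).
CONSISTENT = {5: {2, 3},   # e.g., "building site": building, bare soil
              8: {1}}      # e.g., "forest land": forest

def discrepancy_ratio(P, U, C):
    """Per-parcel share of pixels whose cover is inconsistent with the category.

    P, U, C : equally shaped integer arrays of parcel ID, land category,
              and land cover (the decoded layers from the previous stage).
    """
    flags = np.array([c not in CONSISTENT.get(u, set())
                      for u, c in zip(U.ravel(), C.ravel())]).reshape(P.shape)
    return {pid: flags[P == pid].mean() for pid in np.unique(P)}

# e.g., one 2x2 parcel (ID 37, category 5) with one inconsistent pixel
P = np.full((2, 2), 37); U = np.full((2, 2), 5)
C = np.array([[2, 2], [3, 1]])        # cover 1 (forest) is inconsistent
print(discrepancy_ratio(P, U, C))     # parcel 37 -> ratio 0.25
```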

The generated discrepancy map can also be used to detect parcels requiring division. Specifically, because the cadastral map assigns one land category value (based on the primary use) per parcel, parcels with a high ratio of minor-use area must be divided for efficient land management [21]. The proposed process reflects all uses of the land; therefore, it detects inconsistent areas and, at the same time, the parcels that must be divided according to their actual land use statuses (a sketch follows below).
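
Reusing the per-parcel ratios from the previous sketch, flagging division candidates reduces to a threshold query; the 30% threshold below is an assumed value for illustration, not one prescribed by this study:

```python
def parcels_to_divide(ratios, threshold=0.3):
    """Parcels whose minor-use (inconsistent) share exceeds the threshold."""
    return [pid for pid, r in ratios.items() if r >= threshold]

print(parcels_to_divide({37: 0.45, 38: 0.05}))   # -> [37]
```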
