#### *4.3. Limitations & Future Work*

When training a network to predict boundary likelihoods for visible object outlines, training data derived from a cadastral reference are beneficial, as they are available without further processing. The data have little bias, as no human annotator with domain knowledge is required [36]. However, the data could be improved: cadastral data contain invisible boundaries that MCG cannot detect. Limiting the training data to visible boundaries would better match what the network is expected to learn, and would thus likely increase the achievable accuracy.

When deciding whether to use RF or CNN for boundary classification, one needs to balance the feature extraction effort for RF [37] against the training data preparation and computational requirements for CNN [18]. Where training data for the CNN are limited, our CNN-based boundary classification may be adapted through data augmentation and re-balanced class weights. One advantage of our RF-based boundary classification is that it includes a feature capturing 3D information from a Digital Surface Model (DSM) [21]. 3D information has yet to be included in the CNN-based boundary classification.

Compared to computer vision, the number and size of benchmark image datasets in remote sensing are small: existing benchmarks cover aerial data for urban object classification [38] and building extraction [39], satellite imagery for road extraction, building extraction and land cover classification [40], as well as satellite and aerial imagery for road extraction [41]. Such benchmarks, in combination with open data initiatives for governmental cadastral data [42], aerial imagery [43] and crowdsourced labeling [44–46], may propel deep learning frameworks for cadastral boundary delineation, i.e., cadastral intelligence. Instead of using a VGG pre-trained on ImageNet, our approach could then be trained on diverse remote sensing and cadastral data, possibly resulting in a more effective and scalable network.
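The data augmentation and class re-balancing mentioned above for limited CNN training data can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names, the choice of flips and 90° rotations as augmentations, and the inverse-frequency weighting (the common "balanced" heuristic, n_samples / (n_classes · count)) are all assumptions made for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count of class).

    Hypothetical helper; rare classes (e.g. 'boundary') get larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def rotate90(tile):
    """Rotate a tile (nested lists) by 90 degrees clockwise."""
    return [list(row) for row in zip(*tile[::-1])]


def flip_horizontal(tile):
    """Mirror a tile left to right."""
    return [row[::-1] for row in tile]


def augment_tile(tile):
    """Return the 8 flip/rotation variants of a square tile.

    Assumed augmentation set; the original tile is the first variant.
    """
    variants = []
    current = [list(row) for row in tile]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants


if __name__ == "__main__":
    labels = ["boundary"] * 2 + ["not_boundary"] * 8
    print(balanced_class_weights(labels))
    print(len(augment_tile([[1, 2], [3, 4]])))
```

Flips and rotations are a natural choice here only under the assumption that boundary appearance is orientation-independent; augmentations that alter scale or color would need to be checked against the sensor characteristics.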

Despite the advances shown, the automation of cadastral boundary delineation is not yet complete. Methodological development is still impeded by the difficulty of identifying areas in which a large portion of cadastral boundaries is visible, and for which high-resolution remote sensing data and up-to-date cadastral data are available in digital form. Future work could investigate the approach's applicability to invisible boundaries that are marked before UAV data capture, e.g., with paint or other temporary boundary markers. In this context, the degree to which the approach can support participatory mapping could also be investigated. Furthermore, research is needed on how to align innovative approaches with existing technical, social, legal and institutional frameworks in land administration [47–49]. We are currently pursuing this by developing documentation and testing material [50] that enables surveyors and policy makers in land administration to easily understand, test and adapt our approach.

#### *4.4. Comparison to Previous Studies*

The way we reformulated our problem to be solvable by a tile-based CNN resembles an approach proposed in biomedical optics [51]. Fang et al. crop tiles centered on retinal boundary pixels and train a CNN to predict nine different boundary labels. Correspondingly labeled pixels are then connected with a graph-based approach. To transfer the latter to our case, we may investigate whether connecting tiles of similar boundary likelihood can remove the need for an initial MCG image segmentation: with Fully Convolutional Networks (FCNs) [52], each pixel of the input image would be assigned a boundary likelihood, and these pixels could be connected using Ultrametric Contour Maps (UCMs) [53] included in MCG [54]. Connecting pixels of corresponding boundary likelihoods could also be realized by using MCG-based contour closure [55], line integral convolution [56], or template matching [57].
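The tile-based formulation referred to at the start of this subsection, in which tiles are cropped around candidate boundary pixels before classification, can be sketched as follows. This is an illustrative sketch only: `crop_tiles`, its parameters, and the policy of skipping tiles that would extend past the image border are assumptions, not details taken from [51] or from our implementation.

```python
def crop_tiles(image, boundary_pixels, tile_size):
    """Crop square tiles centered on the given (row, col) pixels.

    image: 2D nested list of pixel values.
    boundary_pixels: iterable of (row, col) tile centers.
    tile_size: odd side length, so that each tile has a single center pixel.
    Tiles that would cross the image border are skipped (assumed policy;
    padding the image would be an alternative).
    """
    if tile_size % 2 == 0:
        raise ValueError("tile_size must be odd")
    half = tile_size // 2
    height, width = len(image), len(image[0])
    tiles = []
    for row, col in boundary_pixels:
        top, left = row - half, col - half
        if top < 0 or left < 0:
            continue
        if top + tile_size > height or left + tile_size > width:
            continue
        tiles.append(
            [r[left:left + tile_size] for r in image[top:top + tile_size]]
        )
    return tiles


if __name__ == "__main__":
    # 5x5 toy image whose pixel value encodes its position (row * 5 + col).
    image = [[r * 5 + c for c in range(5)] for r in range(5)]
    for tile in crop_tiles(image, [(2, 2), (0, 0)], tile_size=3):
        print(tile)
```

Each cropped tile would then be passed to the CNN, whose prediction is attributed to the tile's center pixel.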

Alternatively, the topology of MCG lines can be used to filter out false boundary likelihoods before aggregating them per line. This could be realized by not shuffling training data, thus maintaining more context information per batch, or by using graph-based approaches such as active contour models [58], suggested for road detection [59,60], or region-growing models, suggested for RF-based identification of linear vegetation [61].
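To make the per-line aggregation step concrete, the following minimal sketch groups tile-level boundary likelihoods by the MCG line they belong to and reduces each group to one value. The median is used here as a simple, outlier-tolerant stand-in; it is not the topology-based filtering proposed above, and the function name and input layout are hypothetical.

```python
from collections import defaultdict
from statistics import median


def aggregate_per_line(tile_predictions):
    """Aggregate tile-level boundary likelihoods to one value per line.

    tile_predictions: iterable of (line_id, likelihood) pairs, where
    line_id identifies the MCG line a tile was sampled from (assumed layout).
    Returns a dict mapping line_id to the median likelihood of its tiles.
    """
    grouped = defaultdict(list)
    for line_id, likelihood in tile_predictions:
        grouped[line_id].append(likelihood)
    return {line_id: median(values) for line_id, values in grouped.items()}


if __name__ == "__main__":
    predictions = [(1, 0.9), (1, 0.8), (1, 0.1), (2, 0.25), (2, 0.75)]
    print(aggregate_per_line(predictions))
```

With a median, a single implausible tile prediction on an otherwise confident line (the 0.1 for line 1 in the example) has limited influence on the aggregated value.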

Predicting the optimal MCG parameter k per image may also be achieved with CNNs. Cadastral parcels vary in size and shape depending on whether an area is, e.g., rural or urban. Larger parcels demand less over-segmentation and thus a higher k. Similarly, our high-resolution UAV data required a higher k, i.e., 0.3 and 0.4, compared to 0.1 for the aerial data. Challenges to be addressed are training with data from multiple sensors, handling varying parcel sizes in training, and automatically labeling data with the optimal segmentation parameter k.
