**5. Conclusions**

Open spatial data allow a large amount of information to be extracted, and their biggest advantage is fast and free access. The growth in data volume, combined with the development of machine learning algorithms for object identification, increases the ability to extract information accurately from satellite and aerial images. As a result, objects can be located more accurately at the early stages of urban planning or environmental analyses. It should be remembered, however, that the usefulness of open data depends on the accuracy requirements of the problem being analysed.

On this basis, the analyses presented in this paper lead to the conclusion that open vector cadastral data can be used as labels when training convolutional neural networks to solve the semantic segmentation of buildings in raster data. The solution presented in this paper simplifies the tedious, time-consuming and capital-intensive process of data labelling, and minimises errors that could otherwise propagate into the later results of the analyses. The results of the conducted analyses also allow a comparison of the effectiveness of popular network architectures, i.e., U-Net and DeepLabV3+, in solving semantic segmentation problems, and of the influence of backbones on the accuracy of building detection.
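As an illustration of this labelling step, the sketch below burns vector building footprints (already projected into pixel coordinates) into a binary training mask using a ray-casting point-in-polygon test. The function names and the sample square footprint are illustrative assumptions, not the implementation used in this study, which would typically rely on a GIS rasterisation tool.

```python
def point_in_polygon(x, y, poly):
    """Ray-casting test: is the pixel centre (x, y) inside the polygon?"""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Count edge crossings of a horizontal ray extending to the right.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize_footprints(polygons, width, height):
    """Burn footprint polygons (pixel coords) into a binary mask (1 = building)."""
    return [[1 if any(point_in_polygon(x + 0.5, y + 0.5, p) for p in polygons) else 0
             for x in range(width)]
            for y in range(height)]

# Illustrative footprint: a square building inside a 64 x 64 px image tile.
mask = rasterize_footprints([[(10, 10), (30, 10), (30, 30), (10, 30)]], 64, 64)
```

In practice, the same burning operation is performed per orthophoto tile, so that each image patch is paired with a mask derived directly from the open cadastral vectors.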

This paper also analysed the effect of the ground pixel resolution of the orthophoto on the accuracy of building identification. The analyses showed that each of the network architectures used achieves better results on data with a smaller ground pixel. This is particularly important when using DeepLabV3+, where a smaller ground pixel allows the mIoU value to be increased by approximately 13 points.
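The mIoU metric underlying this comparison can be reproduced with a straightforward per-class intersection over union. The following is a minimal sketch for the binary building/background case; the toy label lists are illustrative, not the study's data.

```python
def iou(pred, target, cls):
    """Intersection over union for one class, over flat per-pixel label lists."""
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    return inter / union if union else float("nan")

def mean_iou(pred, target, classes=(0, 1)):
    """Mean IoU over the given classes (0 = background, 1 = building)."""
    scores = [iou(pred, target, c) for c in classes]
    return sum(scores) / len(scores)

# Toy example: six pixels, two misclassified.
pred   = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
score = mean_iou(pred, target)  # both classes score 2/4, so mIoU = 0.5
```

A gain of roughly 13 points on this metric therefore means that, averaged over the background and building classes, the overlap between predicted and reference pixels improves by that fraction of the union.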

Although this needs to be confirmed in separate studies, these data can most probably be applied successfully for the purposes mentioned in the introduction: verifying the state of cadastral databases to identify unauthorised buildings; verifying the actual state of the land in the initial phase of an infrastructure investment process for a more reliable cost assessment; mapping buildings in unmapped areas; and verifying the validity of open databases.

However, it is important to keep in mind the limitations of the datasets identified in Section 2, such as building outlines following the walls rather than the roof, or radial offsets. These limitations were disregarded in this analysis as irrelevant to the problem being solved. However, they may reduce the efficiency of directly applying the presented algorithm to other problems, such as the accurate detection of building positions.

Therefore, the authors plan to extend the presented dataset with other, more diverse areas, which will make it possible to generate new training patterns and to optimise the hyperparameters of the applied models in order to increase the accuracy of object detection.

The authors also plan to apply the algorithm presented in this paper to true-orthophoto mapping, which should become more widely available for a larger area of Poland over time. This approach will allow examination of the extent to which reducing radial displacement improves the extraction of buildings with the presented algorithm. Additionally, to mitigate the problem of detecting objects under vegetation and in shaded areas, the possibility of adding further image channels based on the available Airborne Laser Scanning LiDAR data, also made available by governmental units, will be verified.
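As a sketch of what adding image dimensions from LiDAR could look like, the snippet below appends a normalised height channel (e.g. from a normalised digital surface model derived from Airborne Laser Scanning) to each RGB pixel. The data layout and the 30 m normalisation ceiling are assumptions for illustration only.

```python
def stack_channels(rgb, ndsm, max_height=30.0):
    """Append a normalised height channel to RGB pixels.

    rgb:  H x W list of (r, g, b) tuples scaled to [0, 1]
    ndsm: H x W list of heights above ground in metres (hypothetical nDSM input)
    max_height: clipping ceiling used to scale heights into [0, 1] (assumed)
    """
    return [[(r, g, b, min(h / max_height, 1.0))
             for (r, g, b), h in zip(rgb_row, h_row)]
            for rgb_row, h_row in zip(rgb, ndsm)]

# One-pixel tile: a 15 m structure becomes a fourth channel value of 0.5.
tile = stack_channels([[(0.2, 0.3, 0.1)]], [[15.0]])
```

The network's first convolutional layer would then accept four input channels instead of three, leaving the rest of the architecture unchanged.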

Moreover, the authors plan to compare a dataset with manually created masks against a dataset whose masks are adopted from cadastral data. The aim of this activity will be to verify whether the algorithm limitations proposed in this publication actually occur.

**Author Contributions:** Conceptualization, S.G.; Data curation, S.G. and K.T.; Formal analysis, T.O.; Methodology, S.G., T.O. and K.T.; Software, S.G.; Supervision, T.O.; Validation, T.O.; Visualization, S.G. and K.T.; Writing—original draft, S.G. and K.T.; Writing—review & editing, T.O. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** The data and algorithms used are available through the GitHub platform (references in the text).

**Conflicts of Interest:** The authors declare no conflict of interest.
