*3.1. Dataset Construction*

We chose Nanjing for our case study. The scope of the study was the application of deep learning methods to the quantitative analysis of the urban fabric. Plot morphology is closely related to the land use of the plot. In Chinese cities, the land use of plots is determined according to "GB 50137-2011 Code for Classification of Urban Land Use and Planning Standards of Development Land" [40], an important standard in city and town planning in China. According to land sale data for Nanjing, R2 residential plots accounted for 58.9% of all land sold in 2017, 42.5% in 2018, and 54.5% in 2019.

This indicates that residential plots make up a large proportion of urban plots, which helps to ensure a sufficient sample size. In addition, residential plots differ little from one another in morphology, which keeps the samples in the dataset similarly distributed; the performance of the quantification method can therefore be tested on its ability to distinguish between similar morphologies. Taking residential plots as the case study thus yields a high-quality dataset. We chose urban plots rather than blocks as research objects because the plot connects the buildings and the city: the buildings in one plot are usually planned as a whole, and plots with the same function have similar design indicators, which makes morphological identification meaningful.

The AutoNavi open platform provides a web mapping service API in China, from which we downloaded geometric, navigation, and infrastructure information for a particular area. Because the downloaded data include all the information visualized on the web map, we needed to select the data relevant to our study according to location or label. For example, the Area of Interest (AOI) data cover areas of various function types, such as residence areas, water areas, tourism areas, commercial areas, and education areas.

The building boundaries and road networks we downloaded from the platform cover all buildings and roads in Nanjing. In addition to map-related geometry data, infrastructure information is provided by Points of Interest (POIs). Each POI contains fields such as ID, name, category, and address, and further related information (e.g., streetscape and user comments) can be obtained by keyword search. From the POIs for Nanjing, we selected the residence-related POIs.
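As an illustration of label-based selection, the following minimal Python sketch keeps only the records whose category label belongs to a chosen set. The record layout, the field name `category`, and the sample records are assumptions made for this example; they do not reproduce the AutoNavi response schema or the actual data.

```python
from typing import Iterable

# Hypothetical POI records; field names and values are invented for
# illustration and are not the AutoNavi schema.
pois = [
    {"id": "p1", "name": "Primary school", "category": "education services"},
    {"id": "p2", "name": "Car repair", "category": "automotive services"},
    {"id": "p3", "name": "Clinic", "category": "medical facilities"},
]


def select_by_category(records: Iterable[dict], wanted: set[str]) -> list[dict]:
    """Return the records whose category label is in `wanted`."""
    return [r for r in records if r.get("category") in wanted]


# Example subset of residence-related categories (assumed label strings).
residence_related = {"education services", "medical facilities"}
selected = select_by_category(pois, residence_related)
print([r["id"] for r in selected])
```

The same pattern would apply to AOI records, with the type label of the area taking the place of the POI category.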

After downloading the data as shapefiles, we extracted the plot boundaries according to the "residence" label of the AOIs. We then selected the buildings and roads by their location relative to the extracted areas. These operations were performed in ArcGIS. This study selected the infrastructure categories closely related to residential areas: restaurants, shopping, physical facilities, public services, medical facilities, education services, and public traffic. Figure 2 shows the distribution of the different types of POIs, using Qinhuai District, one of the districts of Nanjing, as an example. The POI information was visualized with Tableau.

**Figure 2.** The distribution of the related infrastructure information around the residential cases (in Qinhuai District).
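The location-based selection of buildings described before Figure 2 was carried out in ArcGIS. As a rough, library-free sketch of the underlying idea, the Python code below keeps a building when the mean of its footprint vertices falls inside a plot polygon, using a ray-casting test. The centroid-inside rule, the function names, and the coordinates are assumptions for illustration; the exact spatial rule used in the study is not stated.

```python
Point = tuple[float, float]


def point_in_polygon(pt: Point, polygon: list[Point]) -> bool:
    """Ray-casting test: True if `pt` lies inside `polygon` (vertices in order)."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def vertex_mean(polygon: list[Point]) -> Point:
    """Mean of the vertices, a simple stand-in for the area centroid."""
    xs, ys = zip(*polygon)
    return sum(xs) / len(xs), sum(ys) / len(ys)


def buildings_in_plot(plot: list[Point], buildings: dict[str, list[Point]]) -> list[str]:
    """IDs of buildings whose vertex mean lies inside the plot (assumed rule)."""
    return [bid for bid, fp in buildings.items()
            if point_in_polygon(vertex_mean(fp), plot)]


# Invented planar coordinates, for illustration only.
plot = [(0, 0), (10, 0), (10, 10), (0, 10)]
buildings = {
    "b1": [(1, 1), (3, 1), (3, 3), (1, 3)],      # lies within the plot
    "b2": [(12, 1), (14, 1), (14, 3), (12, 3)],  # lies outside the plot
}
print(buildings_in_plot(plot, buildings))
```

A GIS package would normally handle this with a spatial join and would also deal with projections, holes, and footprints that cross a plot boundary, none of which this sketch covers.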

In total, 4172 residential plots were extracted from the Nanjing data. Some of these residential areas were still under construction or had no building information, so the final dataset contained 3817 residential areas with valid data. Each residential case includes infrastructure information, exported plot images, and geometric models, as well as semantic information attached as attribute values: the name, address, map-ID, location, and site area.
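One way to picture a record in such a dataset, together with the validity check, is the Python sketch below. The class name, field names, units, and the `under_construction` flag are hypothetical; only the listed attributes (name, address, map-ID, location, site area) come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class ResidentialCase:
    """Hypothetical record for one residential plot."""
    name: str
    address: str
    map_id: str
    location: tuple[float, float]  # assumed order: (longitude, latitude)
    site_area: float               # assumed unit: square metres
    building_footprints: list = field(default_factory=list)
    under_construction: bool = False  # assumed flag, not a platform field


def has_valid_data(case: ResidentialCase) -> bool:
    """A case is kept if it has building information and is not under construction."""
    return bool(case.building_footprints) and not case.under_construction


# Invented sample cases, for illustration only.
cases = [
    ResidentialCase("A", "addr A", "id-a", (118.79, 32.02), 12000.0,
                    building_footprints=[[(0, 0), (1, 0), (1, 1)]]),
    ResidentialCase("B", "addr B", "id-b", (118.80, 32.03), 8000.0),
    ResidentialCase("C", "addr C", "id-c", (118.81, 32.04), 9500.0,
                    building_footprints=[[(0, 0), (1, 0), (1, 1)]],
                    under_construction=True),
]
valid_cases = [c for c in cases if has_valid_data(c)]
print([c.name for c in valid_cases])
```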

Figure 3 shows the information included in the dataset after we processed the shapefiles and POIs downloaded from the AutoNavi platform. All plots were exported as images to serve as training data for the deep convolutional network model. The amount of infrastructure covering each residential area was calculated from the service radius of the POIs and visualized in radar charts, which we drew with code written in Java.

**Figure 3.** Part of the collected Nanjing residential cases with the attribute information shown on the ArcGIS platform.
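The service-radius calculation mentioned above could be read as counting, per category, the POIs whose service radius reaches a residential area. The Python sketch below follows that reading. The radius values, the coordinates, and the use of a single representative point per plot are assumptions for illustration, not the parameters of the study; the resulting per-category counts are the kind of values a radar chart would plot on its axes.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def coverage_counts(plot_location, pois, service_radius_m):
    """Count, per category, the POIs whose service radius reaches the plot.

    plot_location: (lon, lat) of a representative point of the plot (assumed).
    pois: iterable of (category, lon, lat) tuples.
    service_radius_m: dict mapping category to radius in metres (assumed values).
    """
    counts = {category: 0 for category in service_radius_m}
    for category, lon, lat in pois:
        radius = service_radius_m.get(category)
        if radius is not None and haversine_m(*plot_location, lon, lat) <= radius:
            counts[category] += 1
    return counts


# Hypothetical radii and coordinates, for illustration only.
radii = {"restaurants": 500.0, "education services": 1000.0, "medical facilities": 1000.0}
sample_pois = [
    ("restaurants", 118.791, 32.020),        # roughly 95 m from the plot
    ("education services", 118.790, 32.030),  # roughly 1.1 km from the plot
    ("medical facilities", 118.790, 32.025),  # roughly 560 m from the plot
]
counts = coverage_counts((118.790, 32.020), sample_pois, radii)
print(counts)
```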
