1. Introduction
Spatial Data Infrastructure (SDI) is a collection of technologies, policies, and institutional arrangements that facilitate the availability of and access to spatial data [1]. The key components of SDI are spatial datasets maintained by public administration, metadata, and web services, which are integrated in the main access points, mainly geoportals. SDI is a legal, administrative, and economic decision-making tool that is applied in many different thematic domains [2,3,4], including urban planning and land development [5].
The usefulness of SDIs arises from several aspects. The first is the usability of geoportals and applications [6,7]. The second is meeting the information-related needs of different groups of users [8,9,10]. Another aspect is the flexibility of the infrastructure in integrating or adapting to new technological developments, such as the semantic web and Linked Data [11,12,13]. These aspects have a significant impact on the development of SDIs. The next generation of SDIs is driven by user needs, with a focus on the use of both data and data applications. At the same time, the evolution of SDIs is being considered both in theory and through experiments and prototyping [14,15].
An important element of meeting the information-related needs of different groups of users is the availability of data with the characteristics required by end users. Regardless of the technological advances of the SDI, the possibility of using it or linking it to other sources, such as open data and volunteered systems, remains a considerable issue [16,17].
Another source of information that is of interest in many thematic areas is Big Data. These datasets are extensive primarily in terms of volume, variety, velocity, and variability. Geospatial information is advancing in all the dimensions of Big Data and draws on different sources of this data [18,19]: observations transmitted from sensors to processing centers; remote, in-situ, and on-vehicle sensors; location tracking of mobile devices; real-time tracking of moving features; and observations of the urban environment shared by citizens in social media.
Both Big Data and open data, as well as volunteered systems, create new possibilities in different aspects of spatial planning. Campagna et al. [20] pointed out that integrating social media and other volunteered resources with information from SDI offers a high potential for eliciting pluralist knowledge for spatial planning. Lin and Geertman [21] reviewed the role of social media in domains such as individual activity patterns, urban land use, transportation behavior, and landscape. Anejionu et al. [22] presented the concept of a spatial big data infrastructure, which allows social and economic aspects of cities and city-regions across the UK to be analyzed. Esch et al. [23] described examples of the use of multi-source repositories in the monitoring of regional land-use dynamics. The usability of these solutions in agricultural economics and management was indicated in [24]. Jenkins et al. [25] stated that the availability of large amounts of user-generated content allows for a computational analysis and quantification of the shared meaning of a place. A city may be referenced not only to its spatial representation, but also to its platial characteristics, which are created through individually and collectively derived representations [26].
There is a lack of in-depth research on the use of Big Data at the level of local spatial planning and practical examples of how datasets can be combined and adapted for analysis and planning procedures, which may be a reference point for different stakeholders who take part in spatial planning processes.
The aim of the paper is to present the possibility of using Big Data collections from mobile devices in conjunction with the data from the Polish Spatial Data Infrastructure main access point, for the purpose of carrying out the urban task of developing local zoning plans. The article describes how these resources open new opportunities in the aggregation of data concerning places and how they may be used in the process of defining the planned land use, taking into account values of the local landscape. The publication also focuses on assessing the quality of data sources used to obtain a thorough understanding of places and the decision-making process in spatial planning supported by the linking of SDI data to Big Data sources.
2. Materials and Methods
The research was based on the case study of the local zoning plan development of Mikolajki. It is a very popular tourist town situated in the north-east of Poland in the Land of The Great Masurian Lakes, on the Mikolajskie and Talty Lakes. The study included the whole town for the purposes of defining potential problem areas (Figure 1). The area covered by the local zoning plan was almost 95.5 hectares.
The research process consisted of three stages. In the first stage (Section 3.1), selected data sets from the main access point of the spatial data infrastructure in Poland were integrated with Big Data. The second stage (Section 3.2) consisted of the urban planning analysis, which concerned the verification of the planned land use presented in the local zoning plan with regard to the individual sites and a number of conditions. It was also important to draw attention to issues that were not addressed at the stage of preparing the spatial plan. The verification included residential and services areas, single-family housing estates, services, green areas, public transport, leisure, and tourist areas. New functions were also considered for the whole city, as well as new areas for the selected functions. The last stage of the study (Section 3.3) consisted of the assessment of the data sources used by the authors to obtain a thorough understanding of places and of the decision-making process in urban planning. The assessment of the quality of data obtained from the SDI was based on [27], while the quality principles for Big Data were formulated based on [28]. The source for the criteria relating to the decision-making process was [29]. For all aspects under consideration, a 0–10 point scale was assumed, where “0” meant that the data did not meet the adopted criteria and “10” was the highest score. The assessment was conducted by the authors after the resolution on the local spatial development plan had been adopted.
The whole process, including the integration of spatial data from different sources and the analysis, was carried out with the use of the QGIS application, version 3.14 [30]. The data analysis methods included attribute selection, layer overlay, spatial analysis, and queries.
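The attribute selection and layer overlay steps can be illustrated with a minimal Python sketch. It is not the QGIS workflow used in the study: the `Parcel` class, the `ownership` values, the coordinates, and the helper names are all hypothetical, and a plain ray-casting test stands in for the overlay tools that a GIS provides.

```python
from dataclasses import dataclass


# Hypothetical parcel record; the attribute names are assumptions,
# not the actual PSDI schema.
@dataclass
class Parcel:
    parcel_id: str
    ownership: str   # "public" or "private"
    ring: list       # polygon vertices as (x, y) tuples


def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the polygon ring."""
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        # Does this edge cross the horizontal ray going right from the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def select_by_attribute(parcels, ownership):
    """Attribute selection: keep parcels with the given ownership."""
    return [p for p in parcels if p.ownership == ownership]


def count_points_per_parcel(parcels, points):
    """Overlay: count device locations that fall inside each parcel."""
    return {
        p.parcel_id: sum(point_in_polygon(x, y, p.ring) for x, y in points)
        for p in parcels
    }


# Illustrative data only (not from the study).
parcels = [
    Parcel("A", "public", [(0, 0), (10, 0), (10, 10), (0, 10)]),
    Parcel("B", "private", [(10, 0), (20, 0), (20, 10), (10, 10)]),
]
device_points = [(2, 3), (5, 5), (12, 4), (30, 30)]

public_parcels = select_by_attribute(parcels, "public")
counts = count_points_per_parcel(parcels, device_points)
```

In a GIS, the same two operations would typically be a selection by expression on the parcel layer followed by a count of points in polygons; the sketch only makes the underlying logic explicit.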
The first data source for the urban analysis was the Polish Spatial Data Infrastructure (PSDI) and its main access point [31]. The web services used (Table 1) provided cadastral parcels, buildings, and arable lands. The data described in Table 1 are more than just geometry. The attributes of parcels contain descriptive information, including ownership (i.e., whether the parcel is private or public property). Data that refer to arable lands include data on drainage ditches, roads, and forests. The buildings layer contains, among other things, information about their function and the number of stories.
Cooperation with the Selectivv company enabled us to capture data from mobile devices, including smartphones and tablets, between May 2018 and May 2019; the data were then processed. The data were collected and updated in real time and formed a behavioral profile of users (columns Item and Share in Table 2). User profiles are created based on data from mobile devices, primarily on the installed mobile applications. The data were obtained from 250,000 applications with advertisements and more than 16,000,000 mobile pages. The final product was the visualization (Figure 2) presenting the total number of mobile device users in the area of Mikolajki in the analyzed period. This visualization (“heat map”) was generated with the tools developed by the Selectivv company. A Google map served as reference data. In Figure 2, red marks the places with the highest concentration of mobile devices, yellow those with a lower concentration, and green those with the lowest.
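How a three-color heat map of this kind can be derived from point locations is sketched below. This is an illustration only: the tools developed by Selectivv are not described in the paper, so the grid binning, the cell size, the one-third and two-thirds class breaks, and all function names are assumptions.

```python
from collections import Counter


def bin_points(points, cell_size):
    """Count device locations per square grid cell of side cell_size."""
    return Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )


def color_class(count, max_count):
    """Map a cell count to a color relative to the busiest cell.

    The one-third and two-thirds breaks are assumed for illustration.
    """
    share = count / max_count
    if share > 2 / 3:
        return "red"
    if share > 1 / 3:
        return "yellow"
    return "green"


# Illustrative coordinates only (not from the study).
points = [(1, 1), (2, 2), (3, 3), (4, 4), (12, 1), (13, 2), (25, 25)]
cells = bin_points(points, cell_size=10)
busiest = max(cells.values())
classes = {cell: color_class(n, busiest) for cell, n in cells.items()}
```

Production heat maps usually smooth the counts (for example with a kernel density estimate) rather than using hard grid cells; the grid version is used here because it keeps the classification step easy to follow.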
During the year from May 2018 to May 2019, 8330 people per day, on average, visited the area. These values refer only to users who have mobile devices. Assuming that the algorithms used by Selectivv [34] allowed approximately 362 events to be recorded for a single user, more than 3 million values were obtained during the analyzed period.
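The order of magnitude quoted above can be checked with a short calculation. The sketch assumes that the total is the product of the average daily number of users and the approximate number of events per user; the paper does not state the formula explicitly, and the variable names are illustrative.

```python
# Figures quoted in the text.
avg_daily_users = 8330   # average number of people per day
events_per_user = 362    # approximate number of events recorded per user

# Assumed reading: total values = average daily users x events per user.
total_values = avg_daily_users * events_per_user
print(f"{total_values:,}")  # 3,015,460
```

Under this reading the result is consistent with the "more than 3 million values" reported for the analyzed period.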
3. Results
3.1. Integration of Mobile Big Data and Data from PSDI
Firstly, the content of the visualization presenting the number of mobile devices in Figure 2 was overlaid on the location of the buildings (Figure 3). Buildings are the anthropogenic element that forms the main part of spatial development. The comparison with the Big Data revealed clearly that the intensity of development (i.e., not only the area of the building, but also the number of floors) affects the number of users. Figure 3 presents only the contours of buildings, but other attributes of specific spatial objects, such as the number of floors, were also analyzed.
In turn, adding information on arable lands enabled us to analyze how the number of users varies depending on the type of land: e.g., built-up land, forests, surface water. The figure that presents the mobile devices shows clearly that the accumulation of vessels in the narrow isthmus between Mikolajskie Lake and Talty Lake generated much more “traffic” than, for example, in housing estates on the outskirts of the town.
Subsequently, as the boundaries of the cadastral parcels were added to the visualizations, the areas owned by the municipality could be properly identified and used in further analyses.
The local spatial development plan was prepared by the municipality, which also owns part of the area. Therefore, the intention of the authorities was to establish the optimum planned land use. Optimum planned use meant a planned land use that would take into account sustainable development. In the presented case, the public interest includes taking into consideration the needs of the general public, which in practice meant that public roads, green spaces and leisure areas had to be designated, as well as public services.
3.2. Urban Planning Analysis
Any decisions that are made in the process of preparing the spatial plan should be justified. Due to the interdisciplinary nature of spatial planning, decisions on individual sites should be based on a number of conditions. Information about the number of users (along with their detailed characteristics, columns Item and Share in Table 2) sometimes allows one to confirm the assumptions made previously, but it also draws attention to issues that were not addressed before. Table 2 (column Planned Land Use Indicators) lists proposed indicators describing planned land uses, which could be verified based on data from user profiles.
Figure 4 shows the local spatial development plan, whose assumptions were verified based on the results presented in Section 3.1 and Table 2. The reference layer for the study was the heat map (Figure 2). The plan provides for, among other things, residential and services areas, single-family housing estates, services, green areas, public transport, leisure, and tourist areas. The colors in the legend are in accordance with the Regulation on the required scope of the draft of the local spatial development plan [35].
Figure 4 presents only the graphic part of the local plan. The boundaries of planned land uses have been precisely drawn. However, the findings, which are the textual part of the spatial plan, were also adapted to the results of the analyses. As part of the arrangements, the height of the buildings and development intensity, but also the share of the built-up area or biologically active area were determined.
After identifying the cadastral parcels and the buildings situated on them for which the Big Data set showed the highest values (red color), it was concluded that the largest “user traffic” was generated by the following zones (Figure 5): (1) hotel, (2) city center, (3) multi-family housing estate, (4) school. Subsequently, the analysis of the transport, cultural, and natural conditions allowed new functions to be set for the whole city. The integration of data on the transport system (existing and designed road, bicycle, rail, and waterway routes), cultural sites (including objects entered in the register of monuments, records of monuments, and archaeological sites, in particular the historic urban layout of the city), and natural sites (areas related to nature protection, e.g., Natura 2000 areas, protected landscape areas, or natural monuments) allowed for the designation of areas for the selected functions.
After taking into account the additional parameter (average number of users in the analyzed area), the initial assumptions were adjusted. For the proper functioning of areas that generate a large flow of users, it is necessary to provide for adequate vehicle and pedestrian traffic (ensuring smooth flow), to designate supplementary functions (parking lots, services), and to secure an appropriate share of biologically active areas. All these elements have been specified in the local zoning plan by designating public roads, internal roads, pedestrian and vehicle routes, pedestrian and bicycle paths, parking lots, service buildings, residential and service buildings, and sports and leisure areas.
Data sources allowed for the verification of designated areas of public space, which were assumed to be of particular importance for meeting the needs of residents, improving the quality of their life and fostering social contacts due to their location and functional-spatial characteristics, as defined in the study of the spatial development of the municipality. In the immediate vicinity of high-traffic areas, special attention was paid to enlarging or designing new public spaces, such as public squares, green areas, as well as a promenade.
The analysis of high-traffic areas made it possible to designate new traffic routes or modify existing ones. As far as the developed plan was concerned, not only car transport was important, but also pedestrian, bicycle, and water traffic. The streets of great importance for the proposed promenade along the eastern shore of Mikolajskie Lake were identified. These included the streets whose historic importance was highlighted in the spatial plan, as well as those that are further away from the shore of Mikolajskie Lake and are planned to take over traffic, especially motor vehicle traffic, during periods of increased tourist traffic.
In addition, it was decided that areas located outside the established boundaries of the urban area entered in the provincial register of monuments, and at the same time located directly on the shores of Mikolajskie Lake are predisposed to the development of hotel functions. In this way, a significant number of users of the space that are generated by the existing hotel (an area in the north-west of the city, on Talty Lake) will be balanced.
After detailed analyses, it was decided to assign a recreational function to the site located in the immediate vicinity of the center. This area is located not only at the back of the central facilities, but also at an appropriate distance from the public school and multi-family housing estates. From the point of view of the image of the city, it was necessary to highlight the so-called waterfront. In the past, the spatial development of Mikolajki took place through the issuing of decisions on the conditions of land development. The spatial plan designated the waterfront and lowered the permitted intensity of development there, in order to limit urban development and preserve the unique character of this part of Mikolajki.
3.3. Data Sources Assessment for the Decision-Making Process in Urban Planning
An ex-post assessment of the data sources was conducted. Figure 6 presents the assessment of data from the PSDI, while Figure 7 describes the parameters of the Big Data set. Data from the main access point of the PSDI are of good quality. Considering the assumptions for the urban planning analysis, the data presented in Section 2, and the QGIS software supporting the process of developing the land use plan, the data are perceived as valuable (score of 10 points, Figure 6) in terms of spatial resolution, completeness, and thematic accuracy.
The Big Data set met the expectations regarding the assumptions made in the project. It was characterized by a high level of validity and uniqueness (score of 9 points, Figure 7). The conformance and completeness of the data were also assessed as high.
Figure 8 shows the assessment of the decision-making process supported by Big Data and SDI (green color) against the use of PSDI resources only (grey color) in terms of creating an information model of a place (in the study, the town of Mikolajki) for the purpose of planned land use designation. Scores for 11 out of 14 indicators relating to the joint use of these two sources were higher than those for the scenario using the SDI only. Undoubtedly, the greatest advantages of linking data from the SDI to the Big Data resource (score of 10 points) were better information quality, access to more sources of information, a shorter time needed to make decisions, less time needed to analyze data, thorough studies and analysis, and improved decision quality.
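The comparison of the two scenarios on the 0–10 scale amounts to a per-indicator check of which scenario scores higher. A minimal sketch follows; the indicator names and scores are placeholders and are not the values shown in Figure 8, and the function names are hypothetical.

```python
def validate(scores):
    """Check that every score lies on the 0-10 scale used in the assessment."""
    for name, score in scores.items():
        if not 0 <= score <= 10:
            raise ValueError(f"{name}: score {score} is outside 0-10")


def improved_indicators(combined, sdi_only):
    """Return the indicators for which the combined scenario scores higher."""
    return [name for name, score in combined.items() if score > sdi_only[name]]


# Placeholder scores only (NOT the values from the study).
combined = {"indicator_a": 10, "indicator_b": 9, "indicator_c": 6}
sdi_only = {"indicator_a": 7, "indicator_b": 8, "indicator_c": 6}

validate(combined)
validate(sdi_only)
improved = improved_indicators(combined, sdi_only)
print(f"{len(improved)} of {len(combined)} indicators improved")
```

Applied to the 14 indicators of the study, the same count would yield the reported 11 indicators with higher scores for the joint use of Big Data and SDI.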
4. Discussion
The town of Mikolajki is located in the immediate vicinity of Sniardwy, the largest lake in Poland. Tourist services are provided not only by the large hotel facilities situated in the town, but also by residential and service buildings, which, apart from the owner’s apartment, also offer rooms for rent. The choice of the test area was not random, as an area so valuable (e.g., in terms of landscape) must be properly shaped to maintain a broadly understood spatial order. Planning decisions in an area with significant investment pressures must be balanced, all the more so because it is the responsibility of the city authorities (with the support of the office staff and urban planners) to answer the question of whether development can be introduced or whether it should be reduced or even prevented.
It should also be emphasized that Mikolajki is a popular tourist destination, not only among Poles. In connection with the development of the local plan, it was decided to obtain data on the largest possible number of people staying in a particular area. The advantage of the Big Data set used in the research was its volume (3 million values). The knowledge obtained from the data can be useful in shaping spatial policy not only for urban planners, but also for other stakeholders who participate in making decisions about places. At first glance, it may seem that such a high level of detail is not necessary in planning work. However, the local zoning plan defines, for example, the areas for organizing mass events, depending on the needs. Analysis of mobile device data from the specific days on which mass events take place will make it possible to adjust not only the size of the sites, but also the specific provisions of the local zoning plan.
Users of mobile devices were not categorized into residents and tourists. Taken together, these users were “space users”. The space must “absorb” users, regardless of whether they are residents or tourists. Therefore, in tourist destinations, for example, technical infrastructure or roads should be designed based on the expected maximum number of users. In this way, “traffic jams” or temporary drops in water pressure in the water supply system can be avoided. Official statistics in Poland seem to be imprecise in determining the number of “space users”, i.e., the number of people who were present in the selected place in a given period. What is more, the data usually refer to administrative units, most often municipalities. Currently, no data are collected that could be related to a specific location. The Big Data acquired in this study referred to a selected time and could be precisely related, for example, to cadastral parcels. This kind of analysis provided added value.
The conducted analyses enabled us to verify the initial assumptions in the course of the planning procedure. The validity of the adopted solutions was also confirmed by the results of public consultations, and, at the final stage, also by adopting the resolution on the local spatial development plan. The case study concerned a draft local zoning plan which, at the time the Big Data set was made available, was at the second stage of public consultations. It can therefore be assessed (on the basis of the comments submitted after the first round of consultations) that the assumptions made correspond not only to the needs of residents but also to the expectations of potential investors. At the time of writing of the article, the local zoning plan has already been approved. During the procedure, three series of public consultations were conducted, with a total of only 8 comments (some of which were repeated from the first stage of consultations). No comments were submitted after the last phase of consultations. Considering the size of the area covered by the land use plan (almost 95.5 hectares), it can be assumed that the planned land use was approved by future and current users of the space.
Linking data from the SDI to the mobile Big Data set offers a possibility to evaluate and compare the legitimacy of maintaining or introducing a given new feature, as well as adjusting local zoning plan entries. The use of these data allowed us to “balance” the spatial development of the city and give it the right direction. In order to balance spatial development, it is necessary to determine the designated use of individual areas and to choose parameters that make it possible to avoid excessive intensification of both the development and the number of users. In addition, integrating Big Data sets with SDI allows us to create a new quality in deciding on places. The presented results fit into the concept of the platial nature of the city presented by [26] and confirm the conclusion drawn by [25], who stated that large amounts of user-generated content allow us to computationally analyze and quantify the shared meaning of place. As more planning practitioners are becoming aware of the new opportunities brought by technology and different data sources, and of the ways to exploit them to deliver more transparent, participative, and well-informed policy and action [36,37], the results presented in this paper reveal how information about users of mobile devices and some information from behavioral profiles may be analyzed for the purposes of verifying local zoning plans.
The “heat map” and data from the PSDI allowed us to check which types of buildings generate an increased number of users, and thus to verify the assumptions concerning building facilities understood as structures, open areas, or objects that accommodate or are intended to accommodate residential, civic, commercial, industrial, and/or mixed-use activities. Until now, this was a matter of conjecture or of knowledge based on data that were not always precise. Data from mobile devices (including smartphones and tablets) allowed us not only to specify the number of users, but also to properly define the concept of spatial development based on the expert knowledge of urban planners. Proper orientation of spatial development should be understood primarily as compliance with the established spatial policy. Planned land use can be even better adapted to meet the needs of users of a given space. Obviously, the design process must be carried out in full, that is, all the conditions must be examined, including landscape, environmental, and cultural ones. Alongside the entire design process, new technologies allow urban planners to explore issues that were previously impossible to verify, or for which suitable data could not be obtained cost-effectively. Linking detailed data on space users to buildings, cadastral parcels, and arable lands allowed for a more accurate definition of problem areas and possible solutions. The next step consisted of defining the principles of development and land use indicators. The use of Big Data allowed us to optimize the project, among other things, in terms of the movement of people. In this sense, urban planning should be treated [38] as a weave of related criteria constituting a complete system.
Information about users of a given space may be obtained in many ways. In Poland, these may be data collected by the given city, municipality, or district office, as well as those available in the PSDI. Social networks and Big Data resources are new sources of data that may be used as an additional source of knowledge. Together, these sources change the ways in which information about a place may be collected and analyzed. These collections and the analyses carried out on them may replace other sources of knowledge and research techniques. Otherwise, in order to obtain the data listed in Table 1, surveys would have to be organized and paid for. It is worth noting, however, that even the best surveys may be fraught with errors and certainly would not cover such a large number of people.
The greatest advantage of linking data from SDI to the Big Data resource is obtaining more sources of information, thorough studies and analyses, as well as improved decision quality in the area of spatial planning tasks. All the elements mentioned above are improved thanks to the possibility of using data that were almost impossible to acquire before. The time of processing and analyzing these data is much shorter than that of traditional data obtained from the SDI. Moreover, the quality of the decision may be significantly improved, mainly due to the possibility of analyzing data that are, to a large extent, acquired independently from the user. At the same time, one should bear in mind that there are numerous sources of Big Data (such as observations from sensors, on-vehicle sensors, and citizens' social media), as well as numerous companies and entities that collect these data. Each of these datasets requires adequate software and technical and organizational solutions to store, manage, and process the data so that they may be used later in various thematic areas. An important issue related to the processing of Big Data is the handling of personal and sensitive data, which is governed by several national and international legal regulations. From the point of view of integrating SDI and Big Data, one should also pay attention to the format of the studies prepared based on Big Data, so that, in various practical applications (e.g., in town planning, as in this paper), it will be possible to integrate and analyze various types of data, including those obtained from SDI, considering their interoperability and the related standards. Both geoportals and numerous types of GIS software offer the possibility to add data sources in specific formats and to perform specific operations and analyses on these data.
5. Conclusions
The research assessed the possibilities of incorporating SDI and Big Data resources into local urban planning. The draft of the local spatial development plan of Mikolajki was verified based on integrated data sources. The results showed that the visualization of Big Data as a heat map may be used in urban tasks as a thematic layer integrated with vector and raster data sets from the SDI in geographic information system software. The analysis of the buildings, parcels, transportation, cultural, and natural conditions allowed new functions to be set for the whole city and the parameters for land management to be specified more precisely. The Big Data set met the expectations regarding the assumptions made in the project, and the decision-making process based on these sources was improved.
The contribution of this work is the presentation of a case of platial analysis in one of the local spatial planning tasks, namely the development of local zoning plans. It is also a practical example of how information about users of mobile devices and some information from behavioral profiles may be analyzed for the purposes of verifying planned land use. Analyses based on mobile Big Data sets, as well as the SDI, are a source of information that may be essential at all levels of spatial planning, starting from the local zoning plans, through the study of the conditions and directions of spatial planning, to the zoning plan of the province and, finally, the concept of national spatial planning strategy. Depending on the specificity of the area, it is worth adjusting not only the scope of the analysis, but also the set of data used. In other words, the set of user characteristics analyzed should be adapted depending on whether tourist, metropolitan, rural, or border areas are analyzed. This type of research is worth conducting at every stage of the planning procedure. Even if the planning works are already advanced, it is worth verifying the proposals.
The authors also propose to consider the possibility of changing the Polish legal regulations on access to data, e.g., from mobile networks or the main statistical office. Open access to such data, under an open data model, can stimulate the development of an improved, deeper decision-making process in spatial planning. In Poland, another limitation of Big Data use is undoubtedly the relatively high price of performing this type of analysis. Due, among other things, to the deregulation of the town planners’ profession, the price of local zoning plans is falling steadily, so spending further funds on such detailed analyses is not always accepted by urban planning firms or contracting authorities. The study showed the potential of analyzing users of mobile devices during the planning procedure. It is worth extending the research towards mobile crowd-sensing, and in particular user profiling in the area of urban tasks. Big Data creates new possibilities for spatial planning analysis, which is expressed through visualizations and modeling.