1. Introduction
Statistics from the Department of Economic and Social Affairs of the United Nations (UN DESA) indicate that 68% of the world’s population will live in cities or urban areas by 2050 [1]. This implies rapid and even uncontrolled growth, with consequent challenges for governments, such as pollution, travel problems caused by traffic and congestion, high costs of housing, food, and basic services, and security problems [2]. In particular, noise pollution is a growing concern, as it is the second most important pollutant after air pollution and has been shown to have significant effects on the health of both adults [3] and children [4].
To address the abovementioned problems, the smart city (SC) concept has emerged in recent years; it refers to the integration of the urban environment with information and communication technologies (ICTs). This concept has attracted the interest of all sectors (governments, universities, research centres, etc.), which present solutions or developments to achieve a SC [5]. The objective of the SC paradigm is the effective management of the challenges posed by the growth of urban areas through the adoption of ICTs in developments, solutions, applications, services, and even in the design of state policies [6].
Modern models of municipal governance promote the creation of public value through articulated initiatives involving citizens. In this context, the generation of useful information for citizens is essential, and citizens are increasingly demanding that it be accessible via the Internet. Making data open and mobilising collective knowledge is increasingly important to enable the creation of sustainable solutions for cities. Against this background, the concept of the city-as-a-platform emerges, which is associated with the open government movement and the application of digital technologies to expand the possibilities for the co-production of public services [7]. The city-as-a-platform is the technological and governmental infrastructure that enables society to play a direct and broader role in the life of cities. Digital technologies are applied to promote an open space for collaboration and the democratisation of information and knowledge, which requires governance that is consensual, transparent, responsive, efficient, effective, equitable, and inclusive. To promote this type of initiative, it is essential to have tools such as the one presented in this article, which provide dynamic, efficient, and fast mechanisms for analysing information and making it available to citizens. The presented platform makes it easy to capture data from IoT platforms, open data portals, etc., and to process them using techniques such as artificial intelligence and augmented reality.
Currently, smart city is a broadly used term; for example, systematic literature reviews identify more than 36 definitions that address different dimensions of the urban environment, such as mobility, technology, public services, the economy, the environment, quality of life, or governance [8]. One of the most widely used definitions is the one proposed by Elmaghraby and Losavio (2014): “an intelligent city is one that incorporates information and communication technologies to increase operational efficiency, shares information independently within the system, and improves the overall effectiveness of services and the well-being of citizens”. However, the growth of the Internet of Things from 2017 onwards has led to a large number of devices being permanently connected to the Internet. This has increased interest in overcoming the challenges of effective data management and security, and has opened a debate on their importance in urban management and in the wellbeing of the population [9].
Security is a very important element in cities; it is vital to guaranteeing a safe environment for both the citizens and the data that are generated. Any city must implement security measures to ensure the full protection of citizens’ data and of the data generated by the urban infrastructure, sensors, etc. Given the seriousness of the current pandemic, it is also important to facilitate compliance with safety measures that protect citizens, for example, by helping them maintain an adequate social distance. For this reason, systems that identify the areas with a lower density of pedestrians can be of great interest, for example, to facilitate the leisure of families who want to go out and enjoy the city. SCs need a secure and flexible platform for managing data coming from city sensors, service providers, citizens, etc. [1], as well as from real-time sensors, smart nodes, and relational or non-relational databases. This is especially true of small to medium cities or territories, which need scalable platforms that are easy to deploy and manage and that do not require specialised data analysts, which they do not normally have [10]. The construction of a SC is a dynamic process, and the management platforms have to be ready not just for real-time data ingestion but also for the inclusion of data from different sources; they must be able to manage such data, analyse them, create different visualisation models, and integrate different datasets. It is also of great relevance to exploit the data, to develop classification, optimisation, and prediction models, and to build secure dashboards that can be integrated into the control system of a city council, county council, or any other entity responsible for the management of the SC. All these challenges must be tackled while developing a SC, slowing down the process and creating several problems for developers, which raises the question: how can the development of smart cities be accelerated while reducing costs, time, and difficulties?
This article presents the deepint.net platform [11] and how it has been used to implement a model that helps maintain social distance when walking around a city. Many cities have cameras that guarantee security and/or facilitate decision-making with respect to, for example, traffic or the frequency of street cleaning. In this case study, we show how a hybrid model has been implemented that combines a facial recognition algorithm, applied to images captured in real time, with a regression model. The system collects images from security cameras, identifies the number of pedestrians on a street, calculates their density, and, on the basis of historical data, predicts what the density of pedestrians will be in the future. The Histogram of Oriented Gradients (HOG) algorithm is used to detect pedestrians and calculate their density on a street, and the XGBoost algorithm [12] is used to predict future density. The ease with which deepint.net incorporates sensor data, in this case camera images, and implements these algorithms makes it very simple to build mechanisms for automated decision-making. With this information, citizens will be able to plan their walks, knowing the density of pedestrians on a street at a given time and what is likely to happen in the future.
One of the main advantages of deepint.net over similar platforms is its serverless design, based on the AWS (Amazon Web Services) cloud environment. This results in no volume restrictions for the data and in response times corresponding to the AWS machine chosen by the client. Moreover, its ease of use makes it possible for anyone with basic knowledge to take part in the development of a SC, greatly speeding up the development and deployment phases.
The article is organised as follows: the second section describes the concept of the smart city, presents some tools for managing them, and outlines the verticals around which they can be structured and for which most use cases are developed. The third section describes deepint.net, and the fourth section presents the hybrid model used to identify pedestrian areas with a high concentration of citizens. Finally, the results are analysed and the conclusions are presented.
2. Smart Cities
Cities are constantly evolving, and regardless of their size, they are seeking solutions to improve the quality of life of their citizens, to be more efficient, and to optimise their resources. Information and communication technology (ICT) is a basic element in the development of intelligent cities and numerous projects have been launched to create information management systems, especially adapted to the needs of cities [
7].
In all these developments, it is essential to take into account aspects that are closely related to the citizens, such as human capital/education, social and relational capital, the environment, etc. For smart city models to be useful and progress together with their citizens, it is necessary for them to be efficient, flexible, easy and rapid to implement, and to integrate with other smart city tools or technologies [
6]. Many countries are making a considerable effort to develop a “smart” urban growth strategy in their metropolitan areas [
13]. The Intelligent Community Forum conducts research on the local effects of ICTs that are now available worldwide.
The role of innovation in the ICT sector is fundamental in the development of the infrastructure that provides a city with intelligence and of tools for the sustainable, citizen-oriented, realistic, and coherent management that is required [
14]. The scope of research is extraordinarily broad in this field and there are numerous options for the implementation of intelligent cities. It is therefore important to know all of them and make the right choices. Interesting options have been presented for smart cities in the fields of wireless sensor networks [
15,
16], agriculture [
17,
18], energy optimisation [
19,
20,
21,
22], optimal resource allocation [
23], risks and challenges of EV (Electric vehicle) adoption [
24], vehicle networks [
25,
26], and route optimisation [
27,
28].
The Internet of Things is a basic element in the development of intelligent cities [
6,
29]. City data, especially if accessible in real-time, can be used to effectively transform and manage the city and promote urban planning and development [
30]. Appropriate real-time solutions and systems capable of making decisions to solve run-time problems are elements that can improve the efficiency of smart cities [
31]. The present paper is an example of how data extracted from real-time images can be used to identify areas where pedestrians can walk while maintaining an adequate social distance.
Any platform for smart city management must have a robust system to acquire and process data from multiple data sources (databases, trackers, third-party applications, sensors, intelligent nodes). Architectures require flexible and scalable computing power to process large volumes of data. Today, thanks to technological advances and lower storage and sensor prices, the amount of generated and stored data is huge and growing exponentially. Multicore processing (in the form of symmetric multiprocessing (SMP) and asymmetric multiprocessing (AMP)) is becoming common, with embedded multicore CPUs expected to grow by a factor of six in the coming years, according to the Venture Development Corporation. In addition, Field Programmable Gate Arrays (FPGAs) have grown in capacity and decreased in cost, providing high-speed functionality that could previously only be achieved with Application-Specific Integrated Circuits (ASICs) [32]. Furthermore, virtualisation is driving the development of large scalable systems and blurring the boundary between hardware and software by allowing multiple operating systems to run on a single processor.
There are numerous platforms for the management of smart cities, which facilitate both massive and secure data intake and processing. These platforms have mechanisms for information analysis, data transmission, information fusion, pre-processing, etc. In addition, these platforms must be prepared for integration with other platforms, with information management systems, etc. Some of the platforms used for the management of smart cities are presented in
Table 1.
Although much research is being done on smart cities, a compact system is still needed that is efficient and scalable, and easy to implement and integrate with other platforms. The platforms presented in
Table 1 offer many options and some of them are quite flexible in terms of data management. They have all been analysed: some are more efficient at modelling data, while others are better at acquiring data from sensors, but the most interesting one, with the greatest potential, is deepint.net [11].
Deepint.net offers all the necessary elements to build a system for collecting and managing data using all the power of artificial intelligence. Deepint.net simplifies the development of the management systems of a smart city and also offers the possibility of integrating data from any source.
Deepint.net offers user-centred services and facilitates the creation of intelligent dashboards without requiring knowledge of intelligent systems, as it has an intelligent tutor that guides experts in city management in developing their own models. The platform is ready to integrate new management models and facilitates the composition of hybrid intelligent systems or mixtures of experts, so that different algorithms work together to obtain results from integrated or heterogeneous data sources.
Moreover, one of the main characteristics of smart cities is the heterogeneity of all their components [
33], both in the final applications and in the technology used in the deployed infrastructure. For example, within the IoT sector, there are many manufacturers, protocols, and communication technologies [
34,
35]. As detailed below, one of the main advantages of the platform presented in this document is that it is compatible with any technology or manufacturer.
Deepint.net incorporates the elements required for the management of any smart city without the need for ICT professionals or expert data analysts, and it has been developed under the concept of “Smart City as a Platform”. The platform includes artificial intelligence techniques for the extraction of information that is useful in the management of infrastructures, systems, and devices, making the city functional and efficient. Moreover, maintaining a rapid response time is fundamental for the functioning of a SC. Without deepint.net, the large volume of data collected from a SC is normally sent directly to the cloud, and this has high associated and variable costs, forcing cities to seek solutions that reduce the costs of payments to cloud service providers, as well as energy and bandwidth consumption. Deepint.net is a scalable, easy to use, and dynamic platform that has a fast response time, capable of satisfying the immediate needs of any city or territory.
Smart Cities as a Platform: Verticals and Domains of Smart Cities
The development of a smart city must be based on certain foundations, which range from a well-defined planning phase to the selection of the most suitable tools [
36]. In many cases, key aspects, such as an in-depth review of the human factors of the city or the validation of market-ready technologies, have been left out. This can result in a slow adoption of new systems and unsuccessful smart city projects. Vertical markets and tools for smart cities are often classified following the principles of
Table 2, which provides a solid structure upon which a smart city must be built.
3. DeepInt.net Platform for Smart Cities
Today, all sectors can extract knowledge and benefit considerably from data analysis, with cities/territories being the largest producers of data that exist today. Thanks to advances in computing, such as distributed processing techniques, improved processing capabilities, and cheaper technology, artificial intelligence techniques can now be applied to large volumes of data, offering rapid results, which was unthinkable less than a decade ago.
Deepint.net offers functionalities that cover the entire data analysis flow: a wizard for data ingestion from multiple sources and in multiple formats, a wizard for data management (pre-processing, filtering, etc.), a wizard for applying proprietary data analysis methodologies with the advantage that no algorithm needs to be programmed or configured, a wizard for creating fully customized dynamic visualizations and dashboards using drag and drop techniques, and finally, mechanisms for exporting data and visualization results, allowing, for example, for the interactive sharing of analyses with any Internet user. The architecture of the platform includes the 5 different layers described in
Figure 1.
Deepint.net is a platform that can be used to create data collection and management systems efficiently and without the need for data scientists, who are difficult to find nowadays, as it includes many data analysis algorithms created with artificial intelligence techniques. It is a platform created for the managers of smart cities, and it facilitates all aspects of data management, processing, and visualisation.
Deepint.net is a platform deployed in a self-adapting cloud environment, which allows users to apply artificial intelligence methodologies to their data, using the most widespread techniques (random forest, neural networks, etc.). The user does not have to know how these techniques work, nor how to configure them—it is not even necessary to have programming skills. Deepint.net facilitates the construction of models for data processing in a guided and clear way, indicating how to ingest data, work with the data, visualise the information to understand the data, apply a model and, finally, obtain, evaluate, interpret, and use the results. The platform incorporates a wizard that automates the process, selecting the configuration for the artificial intelligence methodology that provides the best solution to the problem the user is addressing.
Figure 2 shows which elements and tools can be used throughout the data and information management process, from data entry to the creation of dashboards or the exploitation of the data at the end. Deepint.net allows users to exploit all results through dynamic, reusable dashboards that can be shared and used within other tools available to the city, for example, by exporting results in different formats for easy reporting.
It is a platform that covers the entire data analysis flow, from the intake of information to the exploitation of the results; however, unlike other existing tools, it does not require the user to have any knowledge of programming or data analysis, only expertise in smart cities [
11].
Some of the most outstanding features of this platform are its plans tailored to the needs of cities and its ability to integrate datasets from different sources, prioritising the most common ones: formatted files (CSV/JSON), both local and available on the Internet, databases (NoSQL and SQL), streaming data (MQTT, among others), repositories based on CKAN, etc. The platform includes models for the automatic detection of data types, automatic mechanisms for data processing (filtering records according to a criterion, eliminating fields, merging sources, creating compound fields, etc.), guided mechanisms that facilitate the representation of the information provided by the user, and mechanisms for the creation of data analysis models in a guided manner, suggesting the best configuration to users. The platform also offers advanced features for users who do have knowledge of data analysis, such as metrics that allow model results to be evaluated in a simple manner and the definition of dashboards by inserting created visualisations, model results, etc., by means of ‘drag and drop’, so that users can personalise how they work with the tool.
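As an illustration of the kind of streaming source mentioned above, the following minimal sketch subscribes to an MQTT topic using the open-source paho-mqtt library. The broker host, topic name, and payload fields are hypothetical, and the snippet does not use deepint.net’s own ingestion API; it only shows what a real-time feed of this type looks like on the sensor side.

```python
# Minimal, generic sketch of a streaming (MQTT) data source using paho-mqtt.
# Broker host, topic and payload fields are hypothetical examples.
import json

import paho.mqtt.subscribe as subscribe


def on_message(client, userdata, message):
    # Each payload is assumed to be JSON with a sensor id, a timestamp and a count.
    reading = json.loads(message.payload.decode("utf-8"))
    print(reading["sensor_id"], reading["timestamp"], reading["count"])


# Blocks and invokes on_message for every message published on the topic.
subscribe.callback(on_message, "city/pedestrians", hostname="broker.example.org")
```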
In addition, deepint.net allows users to create different roles, structure user projects in such a way that with one account the tool can be used in different environments or for different clients, exploit results for the creation of reports, etc., and deploy the system in a commercial cloud environment (i.e., AWS) that allows all users to be served in a way that is adapted to their needs, on demand, with high performance and high availability.
4. A Case Study: Melbourne
In this section, a model is presented for real-time crowd detection and future crowd prediction using video surveillance footage. The use case is set in the city of Melbourne and focuses on the ability to detect the most crowded streets of the city and the streets which have the lowest number of pedestrians. This information is of critical importance to both government institutions and citizens, especially during the current COVID-19 pandemic. Crowds may contribute to the spread of the Coronavirus if the social distance among people is not adequate, and maintaining such distance is rather difficult. The information generated by the presented model may help citizens and city authorities make decisions.
The method allows the user to identify the areas of Melbourne in which crowds will appear using historical and real-time data from the video surveillance cameras of the city. The process is carried out using a hybrid algorithm with two modules: a face recognition unit and a regression unit.
4.1. Input Data
The method is designed in such a way that constant image flow is the only required input. Camera footage is analysed every 2 min by the face recognition unit and the number of detected faces is fed to the regressor unit, which is re-trained once a month. This creates a well-labelled dataset and provides real-time insight to the users. So as to test both units separately, the use case presented in this article uses independent data for each unit.
The regressor unit has been trained using a dataset which contains hourly pedestrian counts from 5 January 2009 to 31 October 2020, recorded by pedestrian sensor devices located across the city of Melbourne (the link is provided in the Data Availability Statement). It consists of 3,391,523 data instances and contains information on each sensor’s location, the time of the measurement, and the hourly pedestrian count. The data are reliable, up to date, and publicly available.
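As a purely illustrative sketch of how such a dataset might be prepared before training, the snippet below loads a CSV export with pandas and derives simple calendar features. The column names (Sensor_Name, Date_Time, Hourly_Counts) are assumptions about the export format and may differ between releases of the dataset; on deepint.net, this preprocessing is handled by the platform’s wizards rather than by hand-written code.

```python
# Sketch of preparing the hourly pedestrian-count dataset for the regressor,
# assuming columns "Sensor_Name", "Date_Time" and "Hourly_Counts" (hypothetical names).
import pandas as pd

df = pd.read_csv("pedestrian_counts.csv", parse_dates=["Date_Time"])

# Simple calendar features that a regression model can exploit.
df["hour"] = df["Date_Time"].dt.hour
df["weekday"] = df["Date_Time"].dt.weekday
df["month"] = df["Date_Time"].dt.month

# One time series per sensor, sorted chronologically.
df = df.sort_values(["Sensor_Name", "Date_Time"])
print(df[["Sensor_Name", "Date_Time", "Hourly_Counts", "hour", "weekday"]].head())
```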
As for the face recognition unit, data protection laws do not permit the publishing of open datasets of street surveillance footage without blurring the face of the people involved. This obstacle has been overcome by using a well-tested algorithm which provided a good performance in a vast variety of datasets [
37,
38,
39].
An example of the data used is shown below (
Figure 3).
4.2. Crowd Detection Method
The goal of this solution is to accurately describe and predict the location of crowds in any developed smart city. To achieve this, security camera footage is processed using a face recognition algorithm which calculates the number of individuals in each frame. This process is carried out every two minutes and the obtained number is extrapolated to estimate the people density surrounding the considered sensor. The obtained information is useful for monitoring crowd behaviours and creating a training dataset. An output describing the areas of the city by “low, medium, or high density of pedestrians” is generated.
Afterwards, a machine learning algorithm is trained with the labelled dataset and used to predict the behaviour of crowds in the near future. An output describing the predicted density of pedestrians in the different areas of the city one hour and two hours ahead is generated.
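The step that turns a per-frame count into a “low/medium/high” label can be sketched as follows. The thresholds and the assumption that each camera covers a known walkable area (in square metres) are illustrative choices, not values taken from the paper.

```python
# Sketch of mapping a face count to a density label, assuming (hypothetically) that
# each camera covers a known walkable area and that thresholds are set by the operator.
def density_label(face_count: int, covered_area_m2: float,
                  low_threshold: float = 0.05, high_threshold: float = 0.15) -> str:
    """Classify pedestrian density (people per square metre) into three bands."""
    density = face_count / covered_area_m2
    if density < low_threshold:
        return "low"
    if density < high_threshold:
        return "medium"
    return "high"


# Example: 24 faces detected by a camera covering roughly 300 m2 of street.
print(density_label(24, 300.0))  # -> "medium"
```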
The overall process is described in
Figure 4.
4.3. Face Recognition Unit
The main aim of this module is to transform the input, which consists of camera images, into a consistent dataset of people density. To achieve this, the number of faces in each frame is obtained and the average number of people over a period of time is calculated. This results in a dataset of people density in the location of each camera, which is used for training a regression unit.
This unit uses the HOG (Histogram of Oriented Gradients) algorithm due to its good performance in human face detection. This algorithm obtains a near-perfect separation on the original MIT pedestrian database and an 89% accuracy on other, more complex datasets [
37].
The face recognition process is as follows:
1. Convert the image to grayscale and calculate the gradient of each pixel. This creates a common ground for all images, so that changes in brightness no longer affect the algorithm.
2. Store the gradients in an array divided into 16 × 16-pixel squares and select the direction of the strongest gradients in each square.
3. Use a trained linear Support Vector Machine (SVM) to find face patterns.
The algorithm produces a series of locations within the image which contain people’s faces. A visual example of the output of the algorithm is shown in
Figure 5. The accuracy of the algorithm reaches 99.38% on the Labelled Faces in the Wild benchmark [
40].
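A minimal sketch of this face counting step is shown below, using the open-source face_recognition library, which wraps dlib’s HOG + linear SVM frontal face detector. The paper does not name a specific implementation, so the choice of library and the file path are assumptions.

```python
# Sketch of HOG-based face counting with the face_recognition library (dlib backend).
# The library choice and the image path are illustrative assumptions.
import face_recognition


def count_faces(image_path: str) -> int:
    """Return the number of face bounding boxes detected in one camera frame."""
    frame = face_recognition.load_image_file(image_path)
    boxes = face_recognition.face_locations(frame, model="hog")
    return len(boxes)


# Example: one frame sampled from a surveillance camera every two minutes.
print(count_faces("camera_frame.jpg"))
```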
4.4. Regression Unit
The regression unit in this case makes use of the XGBoost algorithm. This method stands out for its strong performance and accuracy, and it offers good interpretability. The goal is to model the previously obtained dataset and use it to predict future crowds as well as to monitor the current ones (
Figure 6).
XGBoost is an optimisation method which makes use of regularisation and a loss function, as described in [
12]. It addresses the limitations of applying traditional Euclidean space optimisation methods to tree ensembles and yields a gradient tree boosting method. The (simplified) objective function at boosting step $t$ is defined as:

$$\mathcal{L}^{(t)} = \sum_{i=1}^{n} l\left(y_i,\; \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t)$$

where $l$ is the loss function, $y_i$ is the real observed value, $\hat{y}_i^{(t-1)}$ is the value predicted in the previous step, $f_t$ is the function to be optimised at step $t$, and $\Omega$ is the regularisation term.
This can be viewed as a computationally enhanced application of Taylor’s theorem which, in addition, allows Euclidean space optimisation techniques to be applied.
Similarly, by taking the second-order Taylor approximation, we obtain the objective function that is actually used:

$$\tilde{\mathcal{L}}^{(t)} = \sum_{i=1}^{n} \left[ g_i\, f_t(x_i) + \tfrac{1}{2} h_i\, f_t^2(x_i) \right] + \Omega(f_t) + \text{constant}$$

where $g_i$ and $h_i$ are the first- and second-order gradient statistics of the loss function, and the remaining term is a constant.
The use of this function results in lower computational complexity compared to random forest and traditional tree ensemble models.
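A minimal sketch of how such a regression unit could be trained with the xgboost Python package is shown below, continuing from the dataframe prepared in the earlier dataset sketch. The feature set and hyper-parameter values are illustrative assumptions; on deepint.net, these choices are made automatically by the platform’s wizard.

```python
# Sketch of the regression unit with XGBoost; features and hyper-parameters are
# illustrative assumptions, not the configuration chosen by the platform.
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Encode the sensor name as a numeric category so the trees can split on it.
df["sensor_id"] = df["Sensor_Name"].astype("category").cat.codes

X = df[["hour", "weekday", "month", "sensor_id"]]
y = df["Hourly_Counts"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

model = xgb.XGBRegressor(
    n_estimators=300,              # number of boosted trees
    max_depth=6,                   # depth of each tree
    learning_rate=0.1,             # shrinkage applied to each boosting step
    objective="reg:squarederror",  # squared-error regression loss
)
model.fit(X_train, y_train)
predicted_counts = model.predict(X_test)  # predicted hourly pedestrian counts
```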
4.5. Using Deepint.net to Construct a Solution
The previously described algorithm has been implemented on deepint.net and its deployed version has been tested. An overview of the process is presented in this section.
All the data analysis is performed using wizards on deepint.net. The user must only select the data source and the model to be used, and the platform will automatically look for the best hyper-parameters and configurations. The performance of the created model can be observed directly in predicted-observed diagrams, and there are several other ways to interact with the model; for example, predictions can be made for arbitrary dates and other types of data input (
Figure 7).
Ease of use is a key feature of the platform’s design. In just a few basic steps, a model can be created, and several visualisations are available to analyse its behaviour, performance, and accuracy, and to interact with it. The basics of all the available options are described in detail in the dialog boxes; the user just has to select the desired configuration. Furthermore, advanced data scientists can fine-tune the parameters manually if they wish, or use the functionalities of deepint.net through a REST API.
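To give a sense of what such an automated configuration step involves, the sketch below runs a standard scikit-learn grid search over a few XGBoost hyper-parameters, continuing from the training split of the earlier sketch. This is a generic illustration, not deepint.net’s internal mechanism, and the parameter grid is an arbitrary example.

```python
# Generic illustration of hyper-parameter search (not the platform's own mechanism).
import xgboost as xgb
from sklearn.model_selection import GridSearchCV

param_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [4, 6, 8],
    "learning_rate": [0.05, 0.1, 0.2],
}
search = GridSearchCV(
    xgb.XGBRegressor(objective="reg:squarederror"),
    param_grid,
    scoring="neg_mean_absolute_error",  # lower absolute error is better
    cv=3,                               # 3-fold cross-validation
)
search.fit(X_train, y_train)
print(search.best_params_)
```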
After the models have been created and connected within the smart territory, the design of the platform allows the user to create a set of dashboards for real-time monitoring of the sensors (
Figure 8). Furthermore, heat maps, as shown in
Figure 6, can be added to new dashboards.
5. Results
Deepint.net is a platform which eases the development and monitoring of intelligent systems in smart territories, while providing robust results. The developed applications can be operated in real time by any user in the city: both a pedestrian who wants to plan a quiet walking route and a manager who has to decide where to reinforce street cleaning routines. Moreover, during the current COVID-19 pandemic, it would allow pedestrians to keep a safe distance from one another, as well as help the authorities keep the infection rate in their territory low. All city stakeholders can benefit.
The wizards offered by deepint.net for the integration of data sources and the creation of visualisations, dashboards, and models cover the entire data analysis life cycle. This is a key advantage of the proposal over other commercial data analysis solutions, which are more limited in functionality and usability.
The algorithm used for prediction was XGBoost as it is an optimised distributed gradient boosting library designed to be highly efficient, flexible, and portable. It implements machine learning algorithms under the Gradient Boosting framework, which provide fast and accurate results for most data science problems. It is widely used nowadays as it has achieved a better accuracy than other tree ensemble algorithms and even outperformed newer algorithms such as LightGBM [
41]. The discussed use case, designed as a mock-up to test the platform’s possibilities, used a reduced version of the dataset (the first 57,000 instances out of the 2,281,353 total instances) and obtained a mean relative error of 0.314. As can be seen in Figure 7, most values lie near the predicted-observed line, with only a small number far away. This performance corresponds to the basic results obtained when the platform automatically configures all the parameters, which is equivalent to a beginner user without any experience. For more advanced users, the wizards make it possible to improve this performance by manually fine-tuning the parameters and testing multiple configurations.
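For clarity, the reported metric can be computed as the mean of the absolute relative deviations between observed and predicted counts; the small helper below is an illustrative definition (the exact formulation used on the platform is not detailed in the paper).

```python
# Illustrative definition of the mean relative error between observed and predicted
# hourly counts; the epsilon guards against division by zero.
import numpy as np


def mean_relative_error(y_true, y_pred, eps=1e-9):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred) / (np.abs(y_true) + eps)))


# Toy example with three observed/predicted pairs.
print(mean_relative_error([120, 80, 40], [100, 90, 35]))  # ~0.139
```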
The face recognition algorithm applied in the use case has been selected due to its high performance, accuracy, and popularity, resulting in a trustworthy algorithm. A comparison with other algorithms is shown in
Table 3. The HOG algorithm stands out as the one with the highest detection rate, and it has been shown to work well in combination with subsequent algorithms in more complex tasks [
37,
38,
39].
Limitations
The proposed system assumes that the cameras used for crowd detection are located in relevant locations with an angle to capture pedestrians—a camera facing a wall could confuse the classifier and distort the heatmaps. Furthermore, the model trained in this use case is designed to emulate the process performed by the most basic user. If a model is to be used in a real-world scenario, it would need to be fine-tuned by a data scientist.
6. Conclusions and Future Work
The proposed model makes use of advanced machine learning algorithms for face recognition and an ensemble learning method that successfully predicts the present and future location of crowds within cities, as evidenced by the results. The developed platform is the cornerstone of Smart Territory development, enabling any user to achieve equivalent results seamlessly and to implement them in real-life scenarios, facilitating the entire development process. Deepint.net made the creation of an advanced crowd detection dashboard possible and greatly reduced the development time—a few working days as opposed to the typical months of R&D for creating a system from the ground up. Most methods are based on well-established Python libraries, providing a high degree of reliability to any developed system.
The platform has a much greater potential than specialised tools with regards to providing strong, resilient models to a wider public, without dropping in performance. As the number of smart cities around the world continues to increase, such advancements are more needed than ever. Deepint.net is capable of reducing the costs associated with maintenance and resource management in smart territories while accelerating their development.
Current technological advances are changing our cities from many different perspectives, and an efficient data management model is required. Such changes constitute a key challenge for any SC, and they quickly render platforms obsolete and limited. As a result, deepint.net, like any other such platform, must be constantly upgraded and must incorporate new, promising ideas and algorithms.
The concept of the “city-as-a-platform” has been successfully introduced and is driving the development of smart cities that are efficient in the use of data, boosting the development of smart applications. In this sense, it is not sufficient to have mechanisms for data processing and solution development; it is necessary to have platforms that allow these systems to be built in an efficient, fast, and secure way. The availability of “open-data” platforms and of sensors capable of providing secure and continuous data, as well as the demand for solutions for each of the verticals, has been identified in the many smart cities under development. It seems clear that what impedes municipal governments from undergoing a definitive and disruptive transformation is the use of inappropriate platforms. This hindrance has been addressed in this paper by demonstrating that deepint.net is a platform with great potential for information capture, visualisation, management, modelling, and representation. Modern municipal governance implies a systematic and decisive boost in the use of technologies such as those presented in this article. The presented user-friendly platform employs AI to manage data originating from IoT architectures.