LLM Agents for Smart City Management: Enhancing Decision Support Through Multi-Agent AI Systems
Highlights
- For smart city management, LLM-based multi-agent systems achieve 94–99% accuracy in routing urban queries and demonstrate significant improvements in response quality (G-Eval scores of 0.68–0.74) compared to standalone LLMs (0.30–0.38).
- High query-routing accuracy and response accuracy are achievable with mid-sized LLMs; the largest models are not required.
- The multi-agent LLM approach enables efficient processing of complex urban planning tasks while maintaining high relevance in responses, making it practical for real-world city management applications.
- LLM agents can effectively augment human decision making in urban planning by reducing task completion time from days to hours while maintaining accuracy and accountability in complex scenarios.
Abstract
1. Introduction
1.1. Smart City Management Background and Challenges
- (A) Data Fragmentation. Strategic decision making is hindered by incomplete datasets repurposed from their original operational contexts, creating critical information gaps.
- (B) Domain Evaluation Inconsistency. Varying data coverage across urban sectors impedes uniform situation assessment and comparison.
- (C) Development Bias. Data-rich sectors receive disproportionate attention in smart city initiatives, potentially marginalizing important but less digitally documented areas.
- (D) Automation Constraints. Complex urban decision making faces limitations due to the challenge of reconciling digital and document-based data formats, necessitating human expert interpretation.
- (E) Cross-level Data Misalignment. Inappropriate application of operational data to strategic planning creates decision making inconsistencies between management levels.
1.2. Research Objectives and Hypotheses
- Hypothesis 1. LLM agents are capable of effectively routing and processing diverse user queries related to the urban environment and social infrastructure, accessing relevant services and databases.
- Hypothesis 2. The application of RAG (Retrieval-Augmented Generation) technology in combination with LLMs improves the quality and reliability of generated responses, especially when working with local knowledge about the city and regulatory documents.
- Hypothesis 3. The integration of LLMs with existing urban information systems and services (e.g., social benefits availability service, transport accessibility service) enables the generation of more accurate and contextually relevant responses to user queries.
2. Related Work
2.1. AI Applications in Smart Cities
2.2. Large Language Models and Agents for Decision Support
2.3. Large Language Models for Smart City Tasks
3. Proposed Multi-Agent LLM-Based Approach
3.1. LLM Agent Design
3.2. Multi-Agent System Design
- Interface Layer. The uppermost layer comprises a chatbot interface that serves as the primary point of interaction between users and the system. This interface is directly integrated with the city information system, enabling seamless access to urban data resources while maintaining a conversational interaction paradigm.
- Orchestration Layer. At the core of the architecture lies the orchestration layer, centered around an “agent-conductor”—the primary agent responsible for coordinating system operations.
- Processing Layer. The processing layer consists of the following specialized secondary and system agents:
  - The Tool Calling Agent manages interactions with city services through API integration.
  - The RAG Operations Agent facilitates Retrieval-Augmented Generation using a vector database containing municipal documentation.
  - The Answer Summarization Agent synthesizes and formats final responses based on accumulated data.
  - The Validation Agent controls the integrity of an answer.
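The layered design above can be illustrated with a minimal sketch. It is not the authors' implementation: every class and function name is hypothetical, the agents are stubs, and the routing rule is a simple keyword heuristic standing in for the LLM decision made by the agent-conductor.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentResult:
    """Output of one processing-layer agent (hypothetical structure)."""
    source: str
    content: str


def tool_calling_agent(query: str) -> AgentResult:
    # Stub for a call to a city-service API.
    return AgentResult("api", f"[service data for: {query}]")


def rag_agent(query: str) -> AgentResult:
    # Stub for retrieval from a vector database of municipal documents.
    return AgentResult("rag", f"[document excerpts for: {query}]")


def summarization_agent(query: str, results: List[AgentResult]) -> str:
    # Stub for an LLM call that writes the final answer from the context.
    context = "; ".join(r.content for r in results)
    return f"Answer to '{query}' based on: {context}"


def validation_agent(answer: str, results: List[AgentResult]) -> bool:
    # Toy integrity check: non-empty answer backed by at least one context.
    return bool(answer.strip()) and len(results) > 0


def choose_pipelines(query: str) -> List[str]:
    """Keyword heuristic standing in for the conductor's LLM routing."""
    q = query.lower()
    pipelines = []
    if any(w in q for w in ("strategy", "plan", "policy", "document")):
        pipelines.append("rag")
    if any(w in q for w in ("provision", "accessibility", "average", "how many")):
        pipelines.append("api")
    return pipelines or ["rag"]  # assumption: fall back to document search


@dataclass
class Conductor:
    """Orchestration layer: routes a query, then summarizes and validates."""
    agents: Dict[str, Callable[[str], AgentResult]] = field(
        default_factory=lambda: {"api": tool_calling_agent, "rag": rag_agent}
    )

    def handle(self, query: str) -> str:
        results = [self.agents[name](query) for name in choose_pipelines(query)]
        answer = summarization_agent(query, results)
        if not validation_agent(answer, results):
            return "Unable to produce a validated answer."
        return answer
```

In a real deployment each stub would wrap an LLM call, a vector store query, or a city-service request; the sketch only shows how control passes from the conductor to the processing agents and back.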
4. Implementation
4.1. LLM-Based Multi-Agent Architecture Implementation
- Database with documents about city development;
- API that provides information about city services.
4.2. Integration with City Information Systems and Services
- Accessibility. Evaluates ease of movement by transport, walking, or multimodal opportunities [57].
- Connectivity. Calculates average travel times across the city with public or private transport [60].
- Service proximity. Assesses the availability of urban services (shops, restaurants, etc.) within specified distances [59].
- Development potential. Identifies areas suitable for developing new facilities [61].
- Centrality. Determines the position and relative importance of different urban areas [62].
- Urban Coverage Metrics Service. Evaluates accessibility and coverage of public services such as schools and hospitals within defined areas.
- Park Enhancement Service. Identifies priority areas for park development and renovation based on community impact.
- Smart Facility Placement Service. Recommends optimal locations for new public facilities considering population needs and accessibility.
- Development Impact Calculation Service. Assesses how new construction projects will affect local service provision and infrastructure.
- Quality of Life Evaluation Service. Measures community well-being across social security, public health, and personal development metrics.
5. Data
5.1. Data Sources and Preprocessing
5.2. Validation Q&A Dataset
6. Experimental Studies
6.1. Experimental Setup
- Experimental setup for Hypothesis 1: “LLM agents can effectively route and process diverse user queries related to the urban environment and social infrastructure, accessing relevant services and databases”. To assess query routing and processing efficiency, we evaluated the percentage of correctly routed queries (accuracy) to appropriate services (since this criterion is used in a widely known tool-use benchmark [68]). This evaluation was conducted across different LLM models and agent configurations, including scenarios with and without LLM-based correction of choices, and it also compared free API selection versus manual filtering for choice correction.
- Experimental setup for Hypothesis 2: “The integration of LLM with existing urban information systems and services (e.g., social benefits availability service, transport accessibility service, documentary DB) enables the generation of more accurate and contextually relevant responses to user queries”. The effectiveness of integration with urban information systems was examined by comparing the performance of various system configurations. These included LLM without RAG (pure LLM responses), LLM with a vector DB, LLM utilizing API access to city services, and LLM employing both a vector DB and city service APIs. For each configuration, we measured the accuracy of the response using G-Eval correctness, the relevance of the answer, and fact-check metrics. Additionally, query processing times were recorded for each setup to assess efficiency.
- Experimental setup for Hypothesis 3: “The integration of LLM-based methods in a city management system may significantly increase efficiency and decrease the decision making process time”. To contextualize the potential benefits of the LLM-based approach, we compared the speed of the LLM system’s responses with traditional query processing methods. This comparison was based on estimates provided by subject matter experts, specifically urban planners. We obtained time estimates for human experts to perform tasks similar to those handled by the LLM system, allowing for a direct efficiency comparison.
6.2. Evaluation Metrics
- Pipeline/tool choice accuracy was estimated as the percent of correctly chosen pipelines or tools with a function call query (without estimating final LLM answer correctness).
- Work time was estimated as the mean over the experimental dataset queries for each system configuration and LLM. To keep the results comparable, all time measurements were taken under identical infrastructure conditions; the exception is proprietary LLM services such as GPT-4o, for which constant conditions cannot be guaranteed.
- The answer correctness was estimated using three metrics: G-Eval, Answer Relevancy (AR), and fact-checking. G-Eval and AR were computed with the DeepEval framework (https://github.com/confident-ai/deepeval, accessed on 1 November 2024). GPT-4o Mini was used as the LLM evaluator for both.
  - G-Eval. The G-Eval metric is reported to outperform other state-of-the-art evaluators and to agree more closely with human judgment [69]. It uses LLMs with chain-of-thought (CoT) prompting to evaluate answers from other LLMs based on custom user criteria. These criteria can be provided as evaluation steps, that is, a list of rules specifying the precise steps the LLM should take for evaluation. We developed several rules that evaluate correctness and relevance, paying attention to numerical accuracy and correct interpretation of facts from the context. The full text of the evaluation steps used for the experiments is in Appendix A.2.6.
  - AR. The AR metric evaluates how relevant the answer from the LLM is to the correct answer from an expert. AR first uses an LLM to extract all statements from the given answer and then uses the same LLM to decide whether each statement is relevant to the correct answer. The metric equals the ratio of relevant statements to all statements in the answer. Evaluations on the WikiEval (https://huggingface.co/datasets/explodinggradients/WikiEval, accessed on 1 November 2024) dataset show that the predictions for AR are closely aligned with human predictions [70].
  - G-Eval versus AR. G-Eval produces stricter estimates than AR. With G-Eval, the LLM evaluator tries to catch subtle differences between the expected and actual answers, whereas with AR it judges whether the actual answer superficially resembles the expected answer.
  - Fact-checking. Fact-checking was performed for API-based context by calculating the percentage of correct numerical facts within LLM answers. For this purpose, all numerical values from the correct and LLM answers were parsed and compared. If the numerical values were identical, the answer was considered correct.
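The simpler metrics above can be sketched in a few lines. This is an illustrative reading, not the evaluation code used in the study: the number-parsing pattern, the choice to compare sorted lists of values, and all function names are assumptions, and the per-statement relevance verdicts for AR are taken as given (in the paper they come from an LLM evaluator).

```python
import re
from typing import Iterable, List, Sequence, Tuple

# Assumption: numbers use '.' as the decimal separator and carry no
# thousands separators or signs.
_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def extract_numbers(text: str) -> List[float]:
    """Parse all numeric values that appear in a text."""
    return [float(m) for m in _NUMBER.findall(text)]


def fact_check(expected: str, actual: str) -> bool:
    """Pass when both answers contain the same numeric values.

    Assumption: order is ignored, so the sorted value lists are compared.
    """
    return sorted(extract_numbers(expected)) == sorted(extract_numbers(actual))


def fact_check_score(pairs: Iterable[Tuple[str, str]]) -> float:
    """Percentage of (expected, actual) pairs that pass the numeric check."""
    pairs = list(pairs)
    passed = sum(fact_check(e, a) for e, a in pairs)
    return 100.0 * passed / len(pairs)


def answer_relevancy(statement_is_relevant: Sequence[bool]) -> float:
    """AR: ratio of relevant statements to all statements in the answer."""
    return sum(statement_is_relevant) / len(statement_is_relevant)


def routing_accuracy(chosen: Sequence[str], expected: Sequence[str]) -> float:
    """Percentage of queries routed to the expected pipeline or tool."""
    correct = sum(c == e for c, e in zip(chosen, expected))
    return 100.0 * correct / len(expected)
```

For example, an expected answer of "The average accessibility time to the park areas is 22.25 min." and a model answer that also states 22.25 would pass `fact_check`, while an answer stating 25 would fail.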
7. Experimental Results
7.1. LLM Effectiveness
7.2. RAG Effectiveness
7.3. Tools Selection Effectiveness
7.4. Whole Multi-Agent System Effectiveness
7.5. Performance of Multi-Agent System vs. Human Experts
- Option 1: Direct multi-agent system interaction. Decision-makers directly query an LLM-based chat system.
- Option 2: Expert-mediated. A team of experts uses the information system described in Section 4.2, without the LLM dialogue interface.
- Option 3: Traditional analysis. Analysts collect and analyze data ad hoc, without pre-existing analytical databases.
8. Implications for Smart City Management
8.1. Potential Applications and Use Cases
8.1.1. A Scenario for the Formation of a Program for the Placement of New Educational Facilities
- As a first step, it is important to understand the city administration’s plans and basic information about school development in the city. With the Assistant’s help, it is possible to obtain an answer about school availability in Saint Petersburg:
  Q: “What is the situation with school education sphere in Saint-Petersburg?”
  A: “According to the strategic planning documents [73], the field of education in St. Petersburg occupies a special place due to the historical factors of the city’s development. There are 737 educational organizations in the city, which train about 500 thousand people.”
- The next step is to assess the best place for school placement according to the user’s budget and school provision evaluation. The question to the Assistant is as follows:
  Q: “My budget is 3 billion rubles. Where is the best place to locate new schools in St. Petersburg?”
  A: “With your budget of 3 billion rubles, it is the best to locate 3 schools in St. Petersburg, costing 1070.0 million rubles, 646.8 million rubles, 280 million rubles. The total budget will amount to 2 billion rubles. Availability of schools in the allocated neighborhoods will increase from 0.49 to 0.53. The total number of schoolchildren provided with the service will amount to 3210.”
  The user also obtains a layer with the proposed school placement (given the budget, the Assistant proposes three locations) and layers with categorized blocks showing the best school provision improvement (Figure 6).
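The figures quoted in the Assistant's answer can be checked with simple arithmetic. The snippet below only sums the three quoted costs and compares them with the stated budget; it does not reproduce how the placement service selects locations, and the variable names are illustrative.

```python
# Per-school costs quoted in the Assistant's answer, in million rubles.
costs_million_rub = [1070.0, 646.8, 280.0]
# User budget of 3 billion rubles, expressed in million rubles.
budget_million_rub = 3000.0

total = sum(costs_million_rub)
remaining = budget_million_rub - total

print(f"total cost: {total:.1f} million rubles")
print(f"unspent budget: {remaining:.1f} million rubles")
```

The total of 1996.8 million rubles matches the reported figure of about 2 billion rubles and stays within the 3 billion ruble budget.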
8.1.2. A Scenario for the Formation of a Park Improvement Program
- First, the Assistant gives an answer about landscaping policies and plans in Saint Petersburg.
  Q: “What is situation with landscaping sector in St. Petersburg?”
  A: “According to the strategic planning documents [73] in St. Petersburg, landscaping is one of the priorities of the city’s policy.”
  This information establishes the importance of green space development in the city.
- The next step is to identify territories suitable for landscaping. The question to the Assistant is as follows:
  Q: “Where are green spaces located?”
  A: “The layer of green zones is shown on the map.”
  The Assistant’s answer includes a layer of the city’s green areas, as shown in Figure 7.
- In the last step, the effects of park improvement are assessed in terms of maximizing population coverage and increasing the level of provision of the population with green areas. To identify the best areas for landscaping, the following question should be asked:
  Q: “Show me which areas and parks have the greatest investment potential for landscaping.”
  A: “Total audience is 5,597,763 people. Total cost of landscaping is 246 billion rubles.”
  The user also obtains a layer with green zones categorized by the best combination of expected population coverage and landscaping costs, as shown in Figure 8.
8.2. Challenges and Limitations
- Computational scalability. The deployment of LLMs demands substantial computational resources, which confronts city governments with a trade-off between using external cloud infrastructure (with the associated data security risks) and building local computing facilities (with the associated management overhead and significant monetary investment). Our experiments demonstrated the viability of both cloud-based (GPT-4) and local models (LLaMA, Mistral), so the choice between them comes down to trade-offs that cities must weigh carefully. Local LLM deployment offers enhanced data security and privacy compliance, which is crucial when handling sensitive urban planning data. However, it requires significant computational infrastructure: our tests indicate that running LLaMA-70B demands at least 140 GB of GPU memory for optimal performance, a substantial investment for municipal IT departments.
- Real-time Data Integration. The system’s effectiveness critically depends on its ability to synchronize and process multiple data streams from urban databases, statistical services, and regulatory documents in real time. Production-ready implementation faces challenges in maintaining a consistent data flow and standardization across various city information systems while ensuring data security and access control.
- Agent System Stability. The complex orchestration between the agent-conductor and specialized agents introduces reliability concerns, particularly in production environments. The system’s dependence on carefully engineered prompts and tool descriptions, combined with the need for regular model updates and validation, creates operational vulnerabilities that could affect decision making reliability in critical urban management scenarios.
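The 140 GB figure mentioned under computational scalability is consistent with a back-of-the-envelope estimate of weight storage alone. The source does not state how the figure was derived, so the sketch below rests on assumptions: 70 billion parameters, 16-bit weights, and no allowance for activations or the key-value cache, which add to the requirement in practice. The function name is illustrative.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory needed to hold model weights only, in GB (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Assumption: a 70-billion-parameter model.
fp16_gb = weight_memory_gb(70e9, 16)  # 16-bit weights
int4_gb = weight_memory_gb(70e9, 4)   # 4-bit quantized weights

print(f"16-bit weights: {fp16_gb:.0f} GB")
print(f"4-bit weights:  {int4_gb:.0f} GB")
```

Under these assumptions, 16-bit weights take 140 GB, while 4-bit quantization would bring weight storage down to 35 GB, which is one reason a quantized variant such as llama-3.1-70b-instruct-int4 can be attractive for municipal deployments.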
8.3. Ethical Considerations
9. Conclusions and Future Work
9.1. Summary of Key Findings
9.2. Contributions to the Field
- A validated multi-agent architecture for orchestrating diverse urban data sources, achieving comprehensive context-aware responses to complex queries.
- Empirical performance comparison of leading LLMs (GPT-4, Mistral, Llama) in urban-specific tasks.
- Real-world implementation cases in school placement and park improvement planning, demonstrating practical impact on urban decision making.
9.3. Future Research Directions
10. Code and Data Availability
11. Acknowledgments
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Appendix A.1. Additional Data Description
Dataset from System | Indicators |
---|---|
General summary of the city | Population size; The total area of residential premises; Recreational areas provision; Provision of public health facilities; Average correspondence time …. |
Recreation sphere | Citizens’ recreational areas provision; Average time of transport accessibility to the beaches; Average walking time to the embankments; Average walking time to parks …. |
Sport sphere | Swimming pools provision; The average time of accessibility to the swimming pools by public transport; Gyms provision …. |
Demographic sphere | Population under the working age; Working-age population; Population over the working age; The number of preschool children; Number of school-age children …. |
Housing and maintenance services sphere | The average year of construction of residential buildings on the territory; The number of critical condition residential buildings …. |
Healthcare sphere | Provision of polyclinics; Average accessibility to the polyclinics by public transport; Average walking distance to the child clinics; Provision of the health emergency facilities …. |
Culture and leisure sphere | Average availability time to the library; Average walking distance to the museums; Average availability time to the theater; The average walking distance to cafes and restaurants …. |
Education sphere | Provision of kindergartens; Provision of schools; Average availability time to the school; Provision of the universities …. |
Question | Expected LLM Response | Question Category |
---|---|---|
What measures should be taken to ensure the development of high-speed passenger services and reduce the negative impact of freight trains? | To ensure the development of high-speed passenger services and reduce the negative impact of freight trains, it is necessary to create transport corridors bypassing St. Petersburg to allow transit traffic flows. | Question about the strategy of socio-economic development (about the whole city) |
What is the goal of developing and launching effective tools that encourage economic entities, including industrial enterprises, to introduce environmentally friendly technologies? | The goal of developing and launching effective tools to encourage economic entities, including industrial enterprises, to introduce nature-saving technologies that are safe for the environment is to reduce the volume of energy resources used while maintaining their useful effect, and to develop alternative energy sources. | Question about the strategy of socio-economic development (about the whole city) |
What were the consequences of the commissioning of a complex of protective structures in the Kurortny district of St. Petersburg? | After the commissioning of a complex of protective structures, the territory of the Kurortny district of St. Petersburg, located outside the protection zone, is more susceptible to the influence of rising water levels, which leads to an acceleration of the process of bank erosion and increased flooding of coastal areas, including residential areas, street road network, parks, beaches. | Question about the strategy of socio-economic development (about a certain territory: Kurortny district) |
What is the average availability time to the parks? | The average accessibility time to the park areas is 22.25 min. | Question about data obtained by referring to urban environment assessment models (about the whole city) |
What is the provision of hospitals in the city? | The provision of hospitals in the city is 100%, and the provision can be considered good. The provision of hospitals in the city in the accessibility zone is 94.16%, the provision can be considered good. | Question about data obtained by referring to urban environment assessment models (about the whole city) |
What is the number of people living in the Admiralteysky district? | The number of inhabitants in the Admiralteysky district is 395.73 thousand people. | Question about data obtained by referring to urban environment assessment models (about certain territories) |
Appendix A.2. LLM Prompts
Appendix A.2.1. FC System Prompt
Appendix A.2.2. FC User Prompt
Appendix A.2.3. API System Prompt
Appendix A.2.4. DB System Prompt
Appendix A.2.5. LLM-Without-Context Prompt
Appendix A.2.6. G-Eval Prompt
Appendix A.3. Additional Metrics
Model | LLM, s | LLM + ChromaDB, s | LLM + API, s | LLM + ChromaDB + API, s |
---|---|---|---|---|
gpt-4o-2024-08-06 | 2.6 | 6.1 | 17.4 | 12.1 |
mixtral-8x22b-instruct | 3.8 | 4.2 | 17.1 | 10.5 |
llama-3.1-70b-instruct | 7.4 | 10.0 | 21.8 | 14.6 |
llama-3.1-70b-instruct-int4 | 2.5 | 3.3 | 15.2 | 9.2 |
Model | City Development Strategy, G-Eval: LLM | City Development Strategy, G-Eval: LLM + ChromaDB | Accessibility of City Services, G-Eval: LLM | Accessibility of City Services, G-Eval: LLM + API | Accessibility of City Services, %: LLM | Accessibility of City Services, %: LLM + API |
---|---|---|---|---|---|---|
gpt-4o-2024-08-06 | 0.44 | 0.64 | 0.17 | 0.81 | 0.0 | 82.0 |
mixtral-8x22b-instruct | 0.46 | 0.64 | 0.30 | 0.76 | 1.3 | 80.0 |
llama-3.1-70b-instruct | 0.49 | 0.65 | 0.27 | 0.83 | 0.0 | 96.0 |
llama-3.1-70b-instruct-int4 | 0.49 | 0.62 | 0.27 | 0.80 | 0.0 | 94.0 |
Model | LLM | LLM + ChromaDB | LLM + API | LLM + ChromaDB + API |
---|---|---|---|---|
gpt-4o-2024-08-06 | 0.30 (0.28, 0.34) | 0.38 (0.34, 0.44) | 0.52 (0.47, 0.57) | 0.71 (0.67, 0.74) |
mixtral-8x22b-instruct | 0.38 (0.35, 0.41) | 0.43 (0.39, 0.47) | 0.51 (0.47, 0.57) | 0.68 (0.64, 0.71) |
llama-3.1-70b-instruct | 0.384 (0.35, 0.41) | 0.41 (0.36, 0.46) | 0.54 (0.49, 0.59) | 0.74 (0.71, 0.78) |
llama-3.1-70b-instruct-int4 | 0.38 (0.35, 0.41) | 0.39 (0.34, 0.44) | 0.50 (0.44, 0.55) | 0.71 (0.67, 0.75) |
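The gain from adding both the vector database and the service APIs can be read off the table above. The sketch below assumes the table reports G-Eval scores, which the ranges quoted in the Highlights (0.30–0.38 for standalone LLMs versus 0.68–0.74 for the full system) suggest; it uses only the point estimates and ignores the intervals in parentheses.

```python
# (standalone LLM, LLM + ChromaDB + API) point estimates from the table.
scores = {
    "gpt-4o-2024-08-06": (0.30, 0.71),
    "mixtral-8x22b-instruct": (0.38, 0.68),
    "llama-3.1-70b-instruct": (0.384, 0.74),
    "llama-3.1-70b-instruct-int4": (0.38, 0.71),
}

# Absolute improvement of the full configuration over the standalone LLM.
gains = {model: round(full - base, 3) for model, (base, full) in scores.items()}

for model, gain in gains.items():
    print(f"{model}: +{gain:.3f}")
```

Every model improves by at least 0.30 in absolute terms, and the quantized Llama variant reaches the same full-system score as GPT-4o.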
The Stage of the Process | Option 1 | Option 1 | Option 2 | Option 2 | Option 3 | Option 3 |
---|---|---|---|---|---|---|
Assessment of the cost of resources for the preparation of information and analytical materials | Time costs | The number of specialists in addition to the decision-maker | Time costs | Number of specialists in the information support group | Time costs | The number of external specialists involved |
Collection and preparation of initial data | 0.1–0.5 h | 0 | 2–8 h | 3–5 | 1–4 days | 5–10 |
Identification of problematic situations based on urban environment data | 0.1–0.5 h | 0 | 1–2 days | 3–5 | 1–5 days | 3–5 |
Analysis of complaints and appeals from citizens | 0.1–0.5 h | 0 | 1–4 days | 3–5 | 5–15 days | 5–10 |
Comparison with strategic and territorial planning documents | 0.1–0.5 h | 0 | 1–4 days | 1–2 | 2–5 days | 1–2 |
Identification of priority areas of urban policy | 0.1–0.5 h | 0 | 2–5 days | 1–2 | 2–15 days | 1–2 |
Formation of a list of projects and solutions in accordance with the priority directions of urban policy | not rated | 0 | 5–30 days | 3–5 | 15–30 days | 3–5 |
Identification of territories that form the potential of the city’s development | 0.5–4 h | 0 | 2–10 days | 1–2 | 5–15 days | 1–2 |
Placement of projects and solutions in identified areas | 1–5 days | 2 | 15–30 days | 3–5 | 15–30 days | 3–5 |
Assessment of the expected effect of project implementation | 0.1–1 h | 0 | 30–60 days | 3–5 | not performed | not performed |
Appendix A.4. Case Study Examples
Appendix A.4.1. Example 1
Appendix A.4.2. Example 2
Appendix A.4.3. Example 3
[Figure panels: The initial block network in the research area; School provision by blocks; Block categorization for potential school placement to increase school provision; Recommended locations for new schools; Blocks with the greatest impact of new school placement; School provision by blocks after the new school placement.]
References
- Martín-Rojo, I. Strategic Planning for a Smart Sustainable City Model: The Importance of Public Administration and Enterprises Cooperation. In The Strategic Paradigm of CSR and Sustainability; Palgrave Macmillan: Cham, Switzerland, 2024.
- Momot, T.; Kraivska, I.; Triplett, R.; Azueta, A.C.; Kuznicki, S. Sustainable Roadmap to Global Smart Cities: A Comparative Analysis of Smart City Strategic Plans. In Smart Technologies in Urban Engineering; Springer: Cham, Switzerland, 2023.
- Jangeed, D.; Mohammad, I.; Patel, J. Sensor Based Smart Traffic Light Control System. Int. J. Tech. Res. Sci. 2024, 9, 27–35.
- Min, K. A Study on the Application of Smart Technology to Improve the Safety of Smart Cities. Forum Public Saf. Cult. 2023, 24, 167–185.
- Lee, C.; Park, J.; Seol, S. Development and demonstration of smart construction safety technology using drones. Forum Public Saf. Cult. 2023, 24, 93–105.
- Chen, T.C. Smart Technology Applications in Healthcare Before, During, and After the COVID-19 Pandemic. In Sustainable Smart Healthcare. SpringerBriefs in Applied Sciences and Technology; Springer: Cham, Switzerland, 2023; pp. 19–37.
- Urban Platform City. Available online: https://urbanplatform.city (accessed on 15 October 2024).
- City Digital Data Platform. Available online: https://bedrockanalytics.ai/products/city-digital-data-platform (accessed on 15 October 2024).
- CityEngine. Available online: https://www.esri.com/ru-ru/arcgis/products/arcgis-cityengine/overview (accessed on 15 October 2024).
- Urban Observatory. Available online: https://urbanobservatory.maps.arcgis.com/home/index.html (accessed on 15 October 2024).
- UrbanSim. Available online: https://www.urbansim.com (accessed on 15 October 2024).
- Antony, R.; Sunder, R. A Review on Data-Driven Approach Applied for Smart Sustainable City: Future Studies. In Proceedings of International Conference on Data Science and Applications; Springer: Singapore, 2023; pp. 875–890.
- Kandt, J.; Batty, M. Smart cities, big data and urban policy: Towards urban analytics for the long run. Cities 2020, 109, 102992.
- Wolniak, R.; Stecuła, K. Artificial Intelligence in Smart Cities—Applications, Barriers, and Future Directions: A Review. Smart Cities 2024, 7, 1346–1389.
- Szpilko, D.; Jiménez Naharro, F.; Lăzăroiu, G.; Nica, E.; de-la-torre Gallegos, A. Artificial Intelligence in the Smart City—A Literature Review. Eng. Manag. Prod. Serv. 2023, 15, 53–75.
- Gracias, J.; Parnell, G.; Pohl, E.; Buchanan, R. Smart Cities—A Structured Literature Review. Smart Cities 2023, 6, 1719–1743.
- Akhrian Syahidi, A.; Kiyokawa, K.; Okura, F. Computer Vision in Smart City Application: A Mapping Review. In Proceedings of the 2023 6th International Conference on Applied Computational Intelligence in Information Systems (ACIIS), Bandar Seri Begawan, Brunei, 23–25 October 2023.
- Mukhina, K.; Visheratin, A.; Nasonov, D. Urban events prediction via convolutional neural networks and Instagram data. Procedia Comput. Sci. 2019, 156, 176–184.
- Yereseme, A.; Surendra, H.; Kuntoji, G. Sustainable integrated urban flood management strategies for planning of smart cities: A review. Sustain. Water Resour. Manag. 2022, 8, 85.
- Popov, A.; Popov, A.; Ovsyankin, A.; Schneider, V.; Evstratov, A. Development of a module for environmental monitoring of the living condition of landscaping facilities using neural networks. IOP Conf. Ser. Earth Environ. Sci. 2022, 1112, 012145.
- Bamwesigye, D.; Hlaváčková, P. Analysis of Sustainable Transport for Smart Cities. Sustainability 2019, 11, 2140.
- Smirnova, O.; Zhukova, N. Smart Navigation for Modern Cities. In Proceedings of 19th International Conference on Urban Planning, Regional Development and Information Society; Springer: Cham, Switzerland, 2014; pp. 593–602.
- Rebelo, F.; Noriega, P.; De Oliveira, T.; Santos, D.; Oliveira, S. Expected User Acceptance of an Augmented Reality Service for a Smart City. In Design, User Experience, and Usability: Users, Contexts and Case Studies; DUXU 2018; Springer: Cham, Switzerland, 2018; pp. 703–714.
- Alzahrani, N.; Alfouzan, F. Augmented Reality (AR) and Cyber-Security for Smart Cities—A Systematic Literature Review. Sensors 2022, 7, 2792.
- Arkaraprasertkul, N. AI-Powered Smart Cities: Transforming Urban Living with LLM. 2024. Available online: https://nonsmartcity.medium.com/ai-powered-smart-cities-transforming-urban-living-with-llm-9230b154b425 (accessed on 15 October 2024).
- Digital Urban Platform St. Petersburg. Available online: https://dc.idu.actcognitive.org/ (accessed on 15 October 2024).
- Zhang, W.; Han, J.; Xu, Z.; Ni, H.; Liu, H.; Xiong, H. Urban Foundation Models: A Survey. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Barcelona, Spain, 25–29 August 2024.
- Yuan, Y.; Han, C.; Ding, J.; Jin, D.; Li, Y. Urbandit: A foundation model for openworld urban spatio-temporal learning. arXiv 2024, arXiv:2411.12164.
- Awesome-Urban-Foundation-Models. Available online: https://github.com/usail-hkust/Awesome-Urban-Foundation-Models (accessed on 15 October 2024).
- Eigner, E.; Händler, T. Determinants of llm-assisted decision-making. arXiv 2024, arXiv:2402.17385.
- Benary, M.; Wang, X.D.; Schmidt, M.; Soll, D.; Hilfenhaus, G.; Nassir, M.; Sigler, C.; Knödler, M.; Keller, U.; Beule, D.; et al. Leveraging large language models for decision support in personalized oncology. JAMA Netw. Open 2023, 6, e2343689.
- Dhar, R.; Vaidhyanathan, K.; Varma, V. Can LLMs Generate Architectural Design Decisions?—An Exploratory Empirical study. arXiv 2024, arXiv:2403.01709.
- Xu, Z.; Guo, L.; Zhou, S.; Song, R.; Niu, K. Enterprise supply chain risk management and decision support driven by large language models. Appl. Sci. Eng. J. Adv. Res. 2024, 3, 1–7.
- Handler, A.; Larsen, K.R.; Hackathorn, R. Large language models present new questions for decision support. Int. J. Inf. Manag. 2024, 79, 102811.
- Laskar, M.T.R.; Alqahtani, S.; Bari, M.S.; Rahman, M.; Khan, M.A.M.; Khan, H.; Jahan, I.; Bhuiyan, A.; Tan, C.W.; Parvez, M.R.; et al. A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations. arXiv 2024, arXiv:2407.04069.
- Perković, G.; Drobnjak, A.; Botički, I. Hallucinations in LLMs: Understanding and addressing challenges. In Proceedings of the 2024 47th MIPRO ICT and Electronics Convention (MIPRO), Opatija, Croatia, 20–24 May 2024; pp. 2084–2088.
- Liu, J.; Lin, J.; Liu, Y. How Much Can RAG Help the Reasoning of LLM? arXiv 2024, arXiv:2410.02338.
- Jin, H.; Huang, L.; Cai, H.; Yan, J.; Li, B.; Chen, H. From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future. arXiv 2024, arXiv:2408.02479.
- Xiao, Z.; Zhang, D.; Wu, Y.; Xu, L.; Wang, Y.J.; Han, X.; Fu, X.; Zhong, T.; Zeng, J.; Song, M.; et al. Chain-of-Experts: When LLMs Meet Complex Operations Research Problems. In Proceedings of the Twelfth International Conference on Learning Representations, Vienna, Austria, 7–11 May 2024.
- Trirat, P.; Jeong, W.; Hwang, S.J. AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML. arXiv 2024, arXiv:2410.02958.
- Liu, Z.; Yao, W.; Zhang, J.; Yang, L.; Liu, Z.; Tan, J.; Choubey, P.K.; Lan, T.; Wu, J.; Wang, H.; et al. AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System. arXiv 2024, arXiv:2402.15538.
- Li, Y.; Wen, H.; Wang, W.; Li, X.; Yuan, Y.; Liu, G.; Liu, J.; Xu, W.; Wang, X.; Sun, Y.; et al. Personal llm agents: Insights and survey about the capability, efficiency and security. arXiv 2024, arXiv:2401.05459.
- Nan, L.; Zhang, E.; Zou, W.; Zhao, Y.; Zhou, W.; Cohan, A. On evaluating the integration of reasoning and action in llm agents with database question answering. arXiv 2023, arXiv:2311.09721.
- Wang, Q.; Wang, Z.; Su, Y.; Tong, H.; Song, Y. Rethinking the Bounds of LLM Reasoning: Are Multi-Agent Discussions the Key? arXiv 2024, arXiv:2402.18272.
- Guo, X.; Huang, K.; Liu, J.; Fan, W.; Vélez, N.; Wu, Q.; Wang, H.; Griffiths, T.L.; Wang, M. Embodied llm agents learn to cooperate in organized teams. arXiv 2024, arXiv:2403.12482.
- Feng, J.; Du, Y.; Liu, T.; Guo, S.; Lin, Y.; Li, Y. CityGPT: Empowering Urban Spatial Cognition of Large Language Models. arXiv 2024, arXiv:2406.13948.
- Jiao, Z.; Sha, M.; Zhang, H.; Jiang, X.; Qi, W. City-LEO: Toward Transparent City Management Using LLM with End-to-End Optimization. arXiv 2024, arXiv:2406.10958.
- OpenStreetMap. Available online: https://www.openstreetmap.org (accessed on 15 October 2024).
- Aino.World. Data Sources for Spatial Data Work. Available online: https://aino.world/data_sourses/ (accessed on 15 October 2024).
- Overture Maps Foundation. Overture Maps. Available online: https://overturemaps.org/ (accessed on 15 October 2024).
- Microsoft. Microsoft Planetary Computer: Buildings Dataset. Available online: https://planetarycomputer.microsoft.com/dataset/ms-buildings (accessed on 15 October 2024).
- Humanitarian Data Exchange (HDX). Kontur Population Dataset. Available online: https://data.humdata.org/dataset/kontur-population-dataset (accessed on 15 October 2024).
- Zenodo. Ensemble Digital Terrain Model (EDTM) of the World. Available online: https://zenodo.org/records/7634679 (accessed on 15 October 2024).
- Google Developers. Geocoding API Overview. Available online: https://developers.google.com/maps/documentation/geocoding/overview (accessed on 15 October 2024).
- HERE Technologies. HERE Maps. Available online: https://www.here.com (accessed on 15 October 2024).
- Kaggle. Pronto Cycle Share Dataset. Available online: https://www.kaggle.com/datasets/pronto/cycle-share-dataset (accessed on 15 October 2024).
- Mishina, M.; Khrulkov, A.; Soloveva, V.; Tupikina, L.; Mityagin, S. Method of intermodal accessibility graph construction. Procedia Comput. Sci. 2022, 212, 42–50.
- Mishina, M.; Sobolevsky, S.; Kovtun, E.; Khrulkov, A.; Belyi, A.; Budennyy, S.; Mityagin, S. Prediction of Urban Population-Facilities Interactions with Graph Neural Network. In Computational Science and Its Applications; Springer: Cham, Switzerland, 2023.
- Mishina, M.; Mityagin, S.; Belyi, A.; Khrulkov, A.; Sobolevsky, S. Towards Urban Accessibility: Modeling Trip Distribution to Assess the Provision of Social Facilities. Smart Cities 2024, 7, 2741–2762.
- Morozov, A.; Kontsevik, G.; Shmeleva, I.; Schneider, L.; Zakharenko, N.; Budennyy, S.; Mityagin, S. Assessing the transport connectivity of urban territories, based on intermodal transport accessibility. Front. Built Environ. 2023, 9, 1148708.
- Natykin, M.V.; Morozov, A.; Starikov, V.A.; Mityagin, S. A method for automatically identifying vacant area in the current urban environment based on open source data. Procedia Comput. Sci. 2023, 229, 91–100.
- Pavlova, A.; Katynsus, A.; Natykin, M.; Mityagin, S. Automated Identification of Existing and Potential Urban Central Places Based on Open Data and Public Interest. In Computational Science and Its Applications—ICCSA 2024; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2024; Volume 14813.
- Territorial Development Fund. Available online: https://xn--80aafuec2bqq9b9f.xn--p1aee.xn--p1ai/ (accessed on 15 October 2024).
- Department of the Federal State Statistics Service for St. Petersburg and the Leningrad Region of the Russian Federation. Available online: https://78.rosstat.gov.ru (accessed on 15 October 2024).
- Kontsevik, G.; Sokol, A.; Bogomolov, Y.; Mityagin, S. Modeling the citizens’ settlement in residential buildings. Procedia Comput. Sci. 2022, 212, 51–63.
- Polo, F.M.; Weber, L.; Choshen, L.; Sun, Y.; Xu, G.; Yurochkin, M. tinyBenchmarks: Evaluating LLMs with fewer examples. arXiv 2024, arXiv:2402.14992.
- Pacchiardi, L.; Cheke, L.G.; Hernández-Orallo, J. 100 instances is all you need: Predicting the success of a new LLM on unseen data by testing on a few instances. arXiv 2024, arXiv:2409.03563.
- Wang, J.; Ma, Z.; Li, Y.; Zhang, S.; Chen, C.; Chen, K.; Le, X. GTA: A Benchmark for General Tool Agents. arXiv 2024, arXiv:2407.08713.
- Liu, Y.; Iter, D.; Xu, Y.; Wang, S.; Xu, R.; Zhu, C. G-Eval: NLG evaluation using GPT-4 with better human alignment. arXiv 2023, arXiv:2303.16634.
- Es, S.; James, J.; Espinosa-Anke, L.; Schockaert, S. Ragas: Automated evaluation of retrieval augmented generation. arXiv 2023, arXiv:2309.15217.
- Schmidt, D.C.; Spencer-Smith, J.; Fu, Q.; White, J. Towards a catalog of prompt patterns to enhance the discipline of prompt engineering. ACM SIGAda Ada Lett. 2024, 43, 43–51.
- Chen, B.; Zhang, Z.; Langrené, N.; Zhu, S. Unleashing the potential of prompt engineering in Large Language Models: A comprehensive review. arXiv 2023, arXiv:2310.14735.
- The Law of St. Petersburg “On the Strategy of Socio-Economic Development St. Petersburg for the Period up to 2035” Dated December 19, 2018 N 771-164. Available online: https://www.gov.spb.ru/gov/otrasl/c_econom/strategiya-ser-2035 (accessed on 15 October 2024).
- Wang, B. Ethical Reflections on the Application of Artificial Intelligence in the Construction of Smart Cities. J. Eng. 2024.
- Ehwi, R.; Holmes, H.; Maslova, S.; Burgess, G. The ethical underpinnings of Smart City governance: Decision-making in the Smart Cambridge programme, UK. Urban Stud. 2022, 59, 2968–2984.
- Xiao, P.; Xu, J.; Zhao, C. Conflict Identification and Zoning Optimization of “Production-Living-Ecological” Space. Int. J. Environ. Res. Public Health 2022, 19, 7990.
- Zubizarreta, I.; Seravalli, A.; Arrizabalaga, S. Smart City Concept: What It Is and What It Should Be. J. Urban Plan. Dev. 2015, 142, 04015005.
- Mohammadi, A. The Pathology of Urban Master Plans in Iran. In Proceedings of the International Conference on Civil Engineering, Architecture and Urban Cityscape, Istanbul, Turkey, 28 July 2016.
- Pleshkanovska, A. City Master Plan: Forecasting Methodology Problems (on the example of the Master Plans of Kyiv). Transf. Innov. Technol. 2019, 2, 39–50.
- Abbas, S.; Ebraheem, M. Tactical Urban Projects Within Baghdad’s Master Plan. Int. J. Sustain. Dev. Plan. 2024, 19, 4167–4182.
- Kumar, T. Smarter Master Planning. In Smart Master Planning for Cities. Advances in 21st Century Human Settlements; Springer: Singapore, 2022; pp. 3–79.
Criteria | [7] | [8] | [9] | [10] | [11] | Ours |
---|---|---|---|---|---|---|
Operation of spatial data of territories | + | + | + | + | + | + |
Operation of spatial data of city objects | + | + | + | + | + | + |
Calculating spatial indexes | + | + | + | + | + | + |
Implementation of specialized models of the urban environment (accessibility, connectivity, centrality, provision of facilities) | - | - | + | + | - | + |
The ability to integrate custom models | + | - | - | - | - | + |
Operation of documental data of territories | - | - | - | - | - | + |
Support for automatic data updating | - | - | - | - | - | + |
The ability to integrate into smart city systems | - | + | - | - | - | + |
The possibility of developing a user interface for the tasks of a smart city | - | - | - | - | - | + |
The ability to integrate natural language data management | - | - | - | - | - | + |
Criteria | Ours | CityGPT | City-LEO | Aino.World |
---|---|---|---|---|
Ability to operate using administrative units | + | - | - | + |
Ability to link urban objects with administrative units | + | - | - | + |
Ability to operate urban objects | + | - | + | + |
Ability to refer to models | + | - | - | + |
Ability to visualize data on the map | + | - | + | + |
Integration of regulatory documents | + | + | - | - |
Ability to make complicated suggestions based on analyzed data | + | + | - | - |
Model | LLM: G-Eval | LLM: AR |
---|---|---|
gpt-4o-2024-08-06 | 0.30 | 0.93 |
mixtral-8x22b-instruct | 0.38 | 0.90 |
llama-3.1-70b-instruct | 0.38 | 0.95 |
llama-3.1-70b-instruct-int4 | 0.38 | 0.98 |
Model | LLM: G-Eval | LLM: AR | LLM + ChromaDB: G-Eval | LLM + ChromaDB: AR | LLM + API: G-Eval | LLM + API: AR |
---|---|---|---|---|---|---|
gpt-4o-2024-08-06 | 0.30 | 0.93 | 0.38 | 0.65 | 0.52 | 0.66 |
mixtral-8x22b-instruct | 0.38 | 0.90 | 0.43 | 0.70 | 0.51 | 0.73 |
llama-3.1-70b-instruct | 0.38 | 0.95 | 0.41 | 0.74 | 0.54 | 0.71 |
llama-3.1-70b-instruct-int4 | 0.38 | 0.98 | 0.39 | 0.71 | 0.50 | 0.69 |
Model | Pipeline Selection (All Questions), No Verification: ACC, % | Pipeline Selection (All Questions), No Verification: Time, s | Function Selection (Service Accessibility Questions), No Verification: ACC, % | Function Selection (Service Accessibility Questions), No Verification: Time, s | Function Selection (Service Accessibility Questions), LLM Verification: ACC, % | Function Selection (Service Accessibility Questions), LLM Verification: Time, s |
---|---|---|---|---|---|---|
gpt-4o-2024-08-06 | 97.3 | 3.1 | 48.0 | 2.3 | 80.0 | 5.4 |
mixtral-8x22b-instruct | 94.0 | 2.0 | 90.7 | 2.3 | 94.7 | 5.0 |
llama-3.1-70b-instruct | 99.3 | 2.1 | 96.0 | 2.5 | 96.0 | 5.7 |
llama-3.1-70b-instruct-int4 | 98.7 | 1.1 | 93.3 | 1.1 | 93.3 | 3.1 |
Model | LLM: G-Eval | LLM: AR | LLM + ChromaDB: G-Eval | LLM + ChromaDB: AR | LLM + API: G-Eval | LLM + API: AR | LLM + ChromaDB + API: G-Eval | LLM + ChromaDB + API: AR |
---|---|---|---|---|---|---|---|---|
gpt-4o-2024-08-06 | 0.30 | 0.93 | 0.38 | 0.65 | 0.52 | 0.66 | 0.71 | 0.94 |
mixtral-8x22b-instruct | 0.38 | 0.90 | 0.43 | 0.70 | 0.51 | 0.73 | 0.68 | 0.91 |
llama-3.1-70b-instruct | 0.38 | 0.95 | 0.41 | 0.74 | 0.54 | 0.71 | 0.74 | 0.95 |
llama-3.1-70b-instruct-int4 | 0.38 | 0.98 | 0.39 | 0.71 | 0.50 | 0.69 | 0.71 | 0.92 |
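The G-Eval columns of the table above can be compared directly. The following is a minimal sketch, not part of the authors' evaluation code: it copies the standalone-LLM and LLM + ChromaDB + API scores from the table and computes the absolute gain per model. The names `G_EVAL`, `g_eval_gains`, and `score_range` are illustrative assumptions.

```python
# G-Eval scores copied from the table above:
# model -> (standalone LLM, LLM + ChromaDB + API).
G_EVAL = {
    "gpt-4o-2024-08-06": (0.30, 0.71),
    "mixtral-8x22b-instruct": (0.38, 0.68),
    "llama-3.1-70b-instruct": (0.38, 0.74),
    "llama-3.1-70b-instruct-int4": (0.38, 0.71),
}


def g_eval_gains(scores):
    """Absolute G-Eval gain of the full pipeline over the standalone LLM."""
    return {model: round(full - base, 2) for model, (base, full) in scores.items()}


def score_range(scores, index):
    """(min, max) of one configuration: index 0 is standalone, 1 is full pipeline."""
    values = [pair[index] for pair in scores.values()]
    return min(values), max(values)


if __name__ == "__main__":
    for model, gain in g_eval_gains(G_EVAL).items():
        print(f"{model}: +{gain:.2f}")
    print("standalone LLM range:", score_range(G_EVAL, 0))
    print("full pipeline range:", score_range(G_EVAL, 1))
```

The two ranges reproduce the figures quoted in the Highlights (0.30–0.38 for standalone LLMs, 0.68–0.74 for the multi-agent configuration).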
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Kalyuzhnaya, A.; Mityagin, S.; Lutsenko, E.; Getmanov, A.; Aksenkin, Y.; Fatkhiev, K.; Fedorin, K.; Nikitin, N.O.; Chichkova, N.; Vorona, V.; et al. LLM Agents for Smart City Management: Enhancing Decision Support Through Multi-Agent AI Systems. Smart Cities 2025, 8, 19. https://doi.org/10.3390/smartcities8010019
Kalyuzhnaya A, Mityagin S, Lutsenko E, Getmanov A, Aksenkin Y, Fatkhiev K, Fedorin K, Nikitin NO, Chichkova N, Vorona V, et al. LLM Agents for Smart City Management: Enhancing Decision Support Through Multi-Agent AI Systems. Smart Cities. 2025; 8(1):19. https://doi.org/10.3390/smartcities8010019
Chicago/Turabian Style: Kalyuzhnaya, Anna, Sergey Mityagin, Elizaveta Lutsenko, Andrey Getmanov, Yaroslav Aksenkin, Kamil Fatkhiev, Kirill Fedorin, Nikolay O. Nikitin, Natalia Chichkova, Vladimir Vorona, et al. 2025. "LLM Agents for Smart City Management: Enhancing Decision Support Through Multi-Agent AI Systems" Smart Cities 8, no. 1: 19. https://doi.org/10.3390/smartcities8010019
APA Style: Kalyuzhnaya, A., Mityagin, S., Lutsenko, E., Getmanov, A., Aksenkin, Y., Fatkhiev, K., Fedorin, K., Nikitin, N. O., Chichkova, N., Vorona, V., & Boukhanovsky, A. (2025). LLM Agents for Smart City Management: Enhancing Decision Support Through Multi-Agent AI Systems. Smart Cities, 8(1), 19. https://doi.org/10.3390/smartcities8010019