Article

Geovisualization of Buildings: AI vs. Procedural Modeling

1
Faculty of Geodesy, University of Zagreb, Kačićeva 26, 10000 Zagreb, Croatia
2
Faculty of Civil Engineering, Architecture and Geodesy, University of Split, Matice Hrvatske 15, 21000 Split, Croatia
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(18), 8345; https://doi.org/10.3390/app14188345
Submission received: 8 August 2024 / Revised: 13 September 2024 / Accepted: 15 September 2024 / Published: 16 September 2024
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

Procedural modeling offers significant advantages over traditional methods of geovisualizing 3D building models, particularly because models are described by scripts or machine language. This approach is well suited to computer processing and allows entire building models and cities to be rendered rapidly, especially when the buildings are not highly diverse, which is where procedural modeling is strongest. The first hypothesis is that real-world buildings, despite their diversity, can still be effectively represented through procedural modeling. The second hypothesis is that this representation can be achieved in several ways. The first approach involves recognizing the basic characteristics of a building from photographs and creating a model using machine learning. The second approach uses artificial intelligence (AI) to generate detailed building models from comprehensive input data. A script is generated for each building, making reverse procedural modeling in combination with AI an intriguing field of study, which is explored in this research. To validate this method, we compare AI-generated building models with manually derived models created through traditional procedural modeling techniques. The research demonstrates that integrating AI and machine learning techniques with procedural modeling significantly improves the efficiency and accuracy of generating 3D building models. Specifically, the use of convolutional neural networks (CNNs) for image-to-geometry translation and Generative Adversarial Networks (GANs) for texture generation showed promising results in creating detailed and realistic 3D structures.
This research is significant as it introduces a novel methodology that bridges the gap between traditional procedural modeling and modern AI-driven techniques. It offers a robust solution for automated 3D building modeling, potentially revolutionizing the fields of urban planning and architectural design by enabling more efficient and accurate digital representations of complex building geometries.

1. Introduction

Procedural modeling stands out when compared to conventional methods of geovisualizing 3D building models due to its ability to encapsulate model descriptions, such as those of buildings or residences, using scripting languages or machine code. This feature renders procedural modeling exceptionally conducive to computational processing, facilitating the expeditious rendering of comprehensive building models and urban landscapes composed of homogeneous building structures. In essence, procedural modeling thrives particularly in scenarios where building diversity is relatively limited, thus accentuating its manifold advantages [1,2,3].
Procedural modeling is a technique that generates 3D models and environments algorithmically, as opposed to manually creating each element. This approach utilizes a set of rules or procedures to automatically create complex structures, allowing for the efficient generation of large-scale models with varying levels of detail. Procedural modeling is particularly effective in urban modeling, where it can be used to create extensive cityscapes with minimal manual input [4]. This method is highly flexible and scalable, making it ideal for applications in virtual reality, video games, and urban planning. The conventional approach to geovisualizing 3D building models typically involves manual or semi-automated processes that rely on CAD (Computer-Aided Design) tools and GIS (Geographic Information Systems) software. This approach often requires significant manual intervention to create accurate and detailed 3D models from 2D plans, maps, or on-site measurements [5]. These models are then used in various applications, such as urban planning, architectural design, and geospatial analysis. While this approach can produce highly detailed models, it is time-consuming and less adaptable to changes in scale or design.
The integration of AI-driven procedural modeling into architectural and urban design holds significant implications for the future of these fields. By automating the generation of detailed 3D building models, this research offers a transformative tool that can greatly enhance the efficiency and precision of design processes. Architects and urban planners can leverage these advanced techniques to quickly prototype and visualize complex structures, enabling more informed decision-making and facilitating a more iterative design process. Moreover, the ability to rapidly produce accurate and adaptable models supports the development of smart cities, where real-time data can be integrated into urban planning to create more responsive and sustainable environments. This research thus not only advances the technical capabilities of 3D modeling but also contributes to the broader goal of creating more efficient, livable, and resilient urban spaces.
Procedural modeling stands out prominently in the field of geovisualizing 3D building models due to its efficiency in rendering comprehensive building models and urban landscapes. However, despite its advantages, there is a significant gap in the ability to accurately portray the diversity and complexity of real-world buildings [6]. This research aims to bridge this gap by integrating procedural modeling with advanced AI techniques, offering a robust solution for automated and accurate 3D building modeling.
The motivation for this work stems from the need to enhance the geovisualization of buildings and automate the production process. Traditional methods often fall short in terms of efficiency and accuracy when dealing with diverse building structures [7,8,9]. By leveraging AI and machine learning, this research proposes a novel methodology to improve procedural modeling and address the existing limitations.
This research examines how procedural modeling can be combined with new techniques and tools to enhance the geovisualization of building models and streamline their production. Central to our inquiry are two overarching hypotheses. The first posits that, despite the inherent heterogeneity of real-world buildings, procedural modeling methodologies can accommodate their portrayal. The second explores the various modalities through which this accommodation can be achieved [10,11], including how to design a grammatical language that describes buildings with sets of procedural rules, which can then be transformed into 3D models.
One such modality involves leveraging computer vision algorithms to discern fundamental building attributes from photographic inputs, thus laying the groundwork for machine learning-driven model generation. Although preliminary attempts at this approach have been made, their efficacy remains limited, necessitating further refinement and exploration [12,13].
Related work explores the transformative impact of artificial intelligence on architectural heritage research, highlighting AI’s capabilities in enhancing the documentation, preservation, and analysis of historical buildings, and discusses innovative AI tools that significantly improve the accuracy and efficiency of heritage conservation efforts [14].
Other authors explore the intersection of AI and design, discussing how AI techniques are applied in architectural design, geographic visualization, and program modeling, and emphasizing AI’s role in enhancing creativity and efficiency within these fields [15].
Innovative approaches to urban planning have also been explored, focusing on developing sustainable urban spatial structures using advanced modeling techniques, with an emphasis on reducing environmental impact and improving urban living conditions. That study highlights the importance of integrating ecological considerations in city planning for future resilience [16].
Conversely, our second approach, which serves as the focal point of this paper, entails harnessing the capabilities of artificial intelligence (AI) [17,18]. Herein, detailed descriptions of buildings, encompassing all requisite specifications, serve as inputs to AI algorithms, which subsequently generate scripts tailored to each individual building. This methodology, referred to as reverse procedural modeling, represents a novel synthesis of AI-driven automation with procedural modeling paradigms.
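As a minimal sketch of the idea that a structured building description is turned into a per-building script, the following Python function fills a fixed template. The rule notation it emits is invented for illustration and is not the scripting language used in this research; the function name `to_script` and the keys `floors`, `floor_height`, and `material` are likewise assumptions. In the approach described above an AI model would write the script, whereas here a deterministic template stands in for it.

```python
def to_script(spec: dict) -> str:
    """Emit a toy procedural script for one building.

    The rule syntax below is made up for illustration only.
    """
    material = spec["material"]
    lines = [
        f"attr floors = {spec['floors']}",
        f"attr floor_height = {spec['floor_height']}",
        "Lot --> extrude(floors * floor_height) Building",
        "Building --> split_y(floor_height) { Floor }*",
        f'Floor --> facade(material="{material}")',
    ]
    return "\n".join(lines)


# Hypothetical input describing a single building.
script = to_script({"floors": 3, "floor_height": 3.0, "material": "brick"})
print(script)
```

Because the script is plain text, one script per building can be stored, compared, and regenerated independently of the renderer.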
To validate the viability and effectiveness of the proposed methodology, a comparative analysis is conducted, comparing AI-generated building models with manually created models using traditional procedural modeling techniques. By identifying discrepancies and assessing the accuracy of AI-generated models in relation to their manually crafted counterparts, the aim is to provide empirical evidence supporting the efficacy and utility of the proposed approach [19,20].
Procedural building modeling, in conjunction with advancements in artificial intelligence (AI) and access to online databases, opens the door to revolutionary methods of creating complex structures like buildings. This fusion of technologies enables the automation of modeling processes, reducing the need for manual interventions, and accelerating the entire procedure. The ways in which automatic procedural building modeling can be executed using building descriptions or visual recognition of building characteristics, AI, online databases, and machine learning will be explored in detail [21,22].
Automated procedural building modeling integrates various cutting-edge technologies and methodologies to streamline the architectural design process. It begins with the extraction of building descriptions or the recognition of building characteristics from textual or visual inputs. Natural language processing (NLP) techniques are employed to parse textual descriptions, while computer vision algorithms analyze images or videos to identify architectural features [23,24].
Once the initial input is obtained, artificial intelligence (AI) algorithms come into play to analyze and interpret the data. Deep learning models, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), are trained on large datasets to understand spatial relationships and architectural styles. These AI algorithms generate building layouts or floor plans based on learned patterns and semantic meanings derived from the input data [25,26].
Moreover, online databases serve as valuable resources for pre-existing building elements, including 3D models, textures, and materials. APIs or web scraping techniques are utilized to access these databases, enabling the automatic retrieval and integration of relevant building components into the modeling pipeline [27,28].
Machine learning techniques further optimize the procedural modeling process by iteratively refining generated designs based on user feedback and performance metrics. Reinforcement learning algorithms adaptively adjust model parameters to maximize design objectives, such as energy efficiency or structural integrity [29,30].
Parametric modeling allows for the creation of flexible building models that can be dynamically adjusted based on various parameters such as size, style, and functionality. By defining parameters and constraints, designers can iteratively explore design alternatives and refine their concepts [31,32].
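The following Python sketch illustrates the parametric idea under simple assumptions: a building is reduced to a box defined by four parameters, and a design variant is obtained by changing one of them. The class `BuildingParams`, its field names, and the helper `box_vertices` are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BuildingParams:
    """Hypothetical parameter set; field names are illustrative only."""
    width: float         # footprint width in metres
    depth: float         # footprint depth in metres
    floors: int          # number of storeys
    floor_height: float  # storey height in metres

    @property
    def height(self) -> float:
        # Total height follows from the parameters instead of being stored.
        return self.floors * self.floor_height


def box_vertices(p: BuildingParams) -> list[tuple[float, float, float]]:
    """Return the 8 corner vertices of the building's bounding box."""
    return [
        (x, y, z)
        for z in (0.0, p.height)
        for y in (0.0, p.depth)
        for x in (0.0, p.width)
    ]


base = BuildingParams(width=12.0, depth=8.0, floors=3, floor_height=3.0)
taller = replace(base, floors=5)  # a variant obtained by changing one parameter
```

Deriving the geometry from the parameters, rather than editing vertices directly, is what makes iterating over design alternatives cheap.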
Rule-based systems enforce design rules and constraints to ensure compliance with architectural guidelines and regulatory requirements. These systems detect errors and violations early in the design process, reducing costly revisions and ensuring adherence to legal mandates [33,34].
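A rule-based check of this kind can be sketched in Python as a list of named predicates applied to a building's parameters. The thresholds below (20 m, 2.4 m, 500 m^2) are placeholder values chosen for the example, not real regulatory limits, and the dictionary keys are assumptions.

```python
from typing import Callable

Building = dict[str, float]
Rule = tuple[str, Callable[[Building], bool]]

# Placeholder rules; the numeric limits are invented for illustration.
RULES: list[Rule] = [
    ("height must not exceed 20 m",
     lambda b: b["floors"] * b["floor_height"] <= 20.0),
    ("storey height must be at least 2.4 m",
     lambda b: b["floor_height"] >= 2.4),
    ("footprint must be at most 500 m^2",
     lambda b: b["width"] * b["depth"] <= 500.0),
]


def violations(building: Building, rules: list[Rule] = RULES) -> list[str]:
    """Return the message of every rule the building fails."""
    return [message for message, check in rules if not check(building)]


ok = {"width": 12.0, "depth": 8.0, "floors": 3, "floor_height": 3.0}
too_tall = {"width": 12.0, "depth": 8.0, "floors": 8, "floor_height": 3.0}
print(violations(ok))
print(violations(too_tall))
```

Running the checks on every generated candidate is one way to catch violations before a model is rendered or refined further.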
Real-time feedback mechanisms enable designers to interactively manipulate building models and receive immediate visual feedback on design changes. Graphical user interfaces (GUIs) or virtual reality (VR) environments provide intuitive interfaces for exploring and refining designs in real time [35].
Simulation and visualization tools facilitate the evaluation and validation of building models through realistic rendering, lighting analysis, and virtual walkthroughs. By visualizing design outcomes in immersive 3D environments, stakeholders can gain insights into the spatial qualities and functional aspects of proposed buildings [36,37,38].
Lastly, scalability and adaptability are crucial aspects of automated procedural building modeling systems, allowing them to accommodate diverse project requirements and scales. Cloud-based infrastructure and modular software architectures enable the efficient handling of large-scale modeling tasks and foster interoperability and extensibility [39,40].
Automated procedural building modeling represents a transformative approach to architectural design, leveraging advanced technologies to streamline the process, optimize resource utilization, and create sustainable built environments that enrich the lives of inhabitants and communities [41,42,43].
  • Building Descriptions as Input: The initial step in automated procedural building modeling is gathering building descriptions. These descriptions may contain information such as the number of floors, types of materials, room layouts, and architectural elements.
  • Visual Recognition of Building Characteristics: Utilizing computer vision techniques and image processing allows systems to automatically recognize building characteristics from visual data. This may involve identifying shapes, colors, textures, and other key elements.
  • AI and Data Analysis: After collecting descriptions or visual recognition, artificial intelligence algorithms can analyze this data and generate building model plans. These AI systems can employ techniques like deep learning for precise understanding and interpretation of data.
  • Online Databases of Building Elements: Online databases containing a plethora of building elements, such as windows, doors, roofs, and more, serve as valuable resources. These databases can be integrated into the procedural modeling pipeline, allowing for the automatic selection and placement of elements based on the generated building plans.
  • Machine Learning for Optimization: Machine learning algorithms can be employed to optimize the procedural modeling process. By learning from past models and user preferences, these algorithms can iteratively refine the generated building designs to better match desired criteria and aesthetics.
  • Parametric Modeling: Leveraging parametric modeling techniques enables the creation of flexible building models that can be easily adjusted based on various parameters such as size, style, and functionality.
  • Rule-Based Systems: Rule-based systems can enforce constraints and guidelines during the modeling process, ensuring that the generated building designs adhere to specific regulations or architectural standards.
  • Real-time Feedback and Iteration: Integrating real-time feedback mechanisms allows for iterative improvements in the procedural modeling process. Users can interact with the system, providing feedback on generated designs, which can then be incorporated to refine subsequent iterations.
  • Simulation and Visualization: Simulation and visualization techniques enable users to preview and assess the generated building models in virtual environments. This allows for better understanding and validation of the designs before physical implementation.
  • Scalability and Adaptability: The automated procedural modeling approach should be scalable and adaptable to different project scales and requirements. Whether designing individual buildings or entire cityscapes, the system should be capable of handling diverse scenarios effectively.
Automated procedural building modeling, driven by AI, online databases, and machine learning, offers a powerful solution for rapidly generating complex building structures. By leveraging these technologies in combination, designers and architects can streamline the modeling process, increase efficiency, and explore innovative design possibilities. The goal of this research is to connect all available technologies combining procedural modeling and reverse procedural modeling to automatically produce a 3D model of a building [44,45,46,47,48].
Up until now, machine learning has been used to reconstruct an entire building model from one or more photographs: computers learn from many examples of different building styles and types and optically recognize building parts. Today, AI can converse, create videos, apply machine learning, and produce building descriptions. With the help of AI and these descriptions, a building can be constructed from a script that outlines and defines its structure. This paper describes that approach. To automate the process, all steps need to be integrated into a single workflow, and the success of forming a 3D building model must ultimately be assessed [49].
Hypothesis 1.
The integration of artificial intelligence (AI) and machine learning techniques, such as convolutional neural networks (CNNs) and Generative Adversarial Networks (GANs), with procedural modeling significantly enhances the efficiency and accuracy of generating 3D building models compared to conventional methods.
Hypothesis 2.
The application of AI-driven procedural modeling in geovisualization not only reduces the time required for creating detailed 3D building models but also improves the realism and adaptability of these models in various urban planning and architectural contexts.

2. Materials and Methods

To effectively combine and utilize different available technologies, it is necessary to analyze and evaluate all existing methods aimed at expediting the process of obtaining and visualizing 3D building models.

2.1. What Can Be Achieved with Today’s Technology and Also by Combining Different Technologies?

Little research that combines procedural modeling, machine learning, and AI for 3D building modeling has been published in scientific journals. The following are some examples of the techniques and methods applied.
Procedural Modeling: Various techniques of procedural modeling are used to generate complex 3D building models. This includes generating parametric rules or scripts that automatically create detailed architectural structures. Published research often explores different approaches to procedural generation, including L-systems, shape grammars, or parameterized models.
  • Parametric Modeling: defines building elements using parameters such as height, width, and style, which can be adjusted to generate variations.
  • Shape Grammar: a formal grammar that defines rules for generating building shapes and styles based on hierarchical compositions of architectural elements.
  • Procedural Texturing: generates textures for building surfaces based on procedural algorithms, such as fractal patterns or procedural noise.
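To make the shape grammar idea concrete, the sketch below applies three toy rules in Python: a facade is split into floors, each floor is split into tiles, and each tile becomes a terminal symbol. The rule set, the default dimensions, and the function names are assumptions for illustration, not a grammar taken from the cited literature.

```python
def split(length: float, part: float) -> list[float]:
    """Split a length into equal parts whose size is close to `part`."""
    count = max(1, round(length / part))
    return [length / count] * count


def derive_facade(width: float, height: float,
                  floor_height: float = 3.0, tile_width: float = 2.0):
    """Apply Facade -> Floor*, Floor -> Tile*, Tile -> Door | Window."""
    shapes = []
    for i, fh in enumerate(split(height, floor_height)):
        for j, tw in enumerate(split(width, tile_width)):
            # Terminal rule: the first tile of the ground floor becomes a door.
            symbol = "Door" if (i == 0 and j == 0) else "Window"
            shapes.append((symbol, i, j, tw, fh))
    return shapes


shapes = derive_facade(width=8.0, height=9.0)
for shape in shapes[:3]:
    print(shape)
```

Each tuple holds the terminal symbol, its floor and column index, and the tile size; a renderer could map these to actual window and door geometry.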
Machine Learning: In the field of machine learning, researchers apply various techniques to teach models to generate 3D building geometry from images or other input data. This can involve using convolutional neural networks (CNNs) for directly generating 3D models from images or using Generative Adversarial Networks (GANs) for generating textures or detailed features.
  • Image-to-Geometry Translation: trains models to convert 2D images of buildings into 3D geometric representations using techniques like convolutional neural networks (CNNs).
  • Generative Models: use generative models like Variational Autoencoders (VAEs) or Generative Adversarial Networks (GANs) to create plausible 3D building structures and textures from noise vectors or latent representations.
  • Semantic Segmentation: trains models to segment building elements (e.g., walls, windows, doors) in images, enabling more accurate reconstruction of 3D building geometry.
Artificial Intelligence: In the realm of artificial intelligence, researchers develop systems that leverage deep neural networks or other techniques to generate 3D building models. This includes using AI for tasks such as semantic segmentation of building elements, style transfer for architectural design, or even the autonomous generation of entire cityscapes.
  • Style Transfer: applies AI techniques to transfer architectural styles from existing buildings or design concepts in order to generate new building designs with similar aesthetics.
  • Evolutionary Algorithms: use genetic algorithms or other evolutionary techniques to evolve and optimize building designs based on specified criteria such as energy efficiency or visual appeal.
  • Reinforcement Learning: trains AI agents to autonomously generate and optimize building designs through trial and error, with feedback based on predefined objectives or constraints.
These are just a few broad areas where research is ongoing. Within each area, there are numerous specific methodologies and approaches being explored to improve the efficiency and realism of 3D building modeling using procedural techniques, machine learning, and AI [44,45,46,47,48].
Although there are many advantages to using procedural techniques, machine learning, and artificial intelligence to automate the process of obtaining 3D building models, there are also certain drawbacks and challenges that researchers need to consider:
  • Lack of Precision: automated processes may result in less precise or realistic models, especially when it comes to detailed architectural elements or complex textures.
  • Limited Generalization: models trained on one dataset may not generalize well to other cities or architectural styles, limiting the usability of these methods across different areas.
  • Dependency on Input Data Quality: the quality of the resulting models often depends on the quality of input data, including image resolution or accuracy of geolocation information.
  • Need for Large Datasets: training effective machine learning and artificial intelligence models requires large datasets, which can be challenging to gather and process, especially for specific locales or architectural styles.
  • Requirement for Computational Resources: some advanced techniques, such as Generative Adversarial Networks (GANs) or deep neural networks, require large amounts of computational resources for training and model generation, which can be costly and require specialized hardware infrastructure.
  • Challenges of Result Interpretation: in some cases, machine learning and artificial intelligence techniques may generate models that are difficult to interpret or explain, which can be problematic in certain applications, such as urban planning or architectural design.
Despite these challenges, research in the field of automating the process of obtaining 3D building models continues to advance, and further work on improving algorithms, techniques, and the quality of input data may help overcome these drawbacks [49].
The impact of input variation on procedural modeling raises the question of which input modality is optimal for AI-based procedural modeling. Comparing verbal descriptions with optical recognition of building images shows how effective and suitable these two distinct approaches are for informing the generation of procedural models of 3D buildings. This research therefore explores two primary approaches for automated procedural building modeling: verbal descriptions and optical recognition of building images. Verbal descriptions as input for 3D modeling involve textual narratives or structured data outlining the characteristics of buildings. Natural language processing (NLP) techniques are employed to parse and extract relevant information. For instance, descriptions of architectural styles, materials, and dimensions are translated into detailed 3D models. For optical recognition of building images, DeepMind’s Alpha3D Experimental was used. Computer vision algorithms analyze visual data from images or videos, and techniques such as convolutional neural networks (CNNs) are used to identify and extract architectural features. Midas is used because it can be highly effective when combined with other 3D reconstruction methods that work with a single image, as in our case. This method leverages deep learning to generate accurate 3D models based on visual inputs. A comparative analysis is conducted to evaluate the efficiency of these approaches. Metrics such as model accuracy, computational efficiency, and user satisfaction are assessed. The findings will provide insights into the optimal methodology for different scenarios (Figure 1).
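The text does not specify how a depth estimate from a single image is converted into geometry. One common option, shown here purely as an assumption, is to back-project each pixel of a depth map into 3D with a pinhole camera model. The intrinsics `fx`, `fy`, `cx`, and `cy` and the toy depth values are hypothetical, and single-image depth estimators typically return relative rather than metric depth, so the scale of the resulting points would be arbitrary.

```python
def backproject(depth, fx, fy, cx, cy):
    """Convert a depth map (list of rows) into 3D points, pinhole model.

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # skip pixels without a valid depth
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map with one invalid pixel; values are placeholders.
toy_depth = [[2.0, 2.0],
             [0.0, 4.0]]
points = backproject(toy_depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
print(points)
```

The resulting point set could then be fitted with planes or boxes to recover facade and roof surfaces, a step that is outside the scope of this sketch.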
Figure 1 flowchart explanation:
  • Collect Verbal Descriptions (gather textual descriptions of buildings, including architectural styles, dimensions, and materials).
  • Utilize NLP for Data Parsing (use Natural Language Processing (NLP) techniques to parse and extract relevant information from the collected verbal descriptions).
  • Generate Initial 3D Models (create initial 3D building models based on the parsed data).
  • Collect Building Images (gather images or videos of buildings from various sources).
  • Apply Computer Vision Algorithms (use computer vision techniques to analyze the collected images and extract architectural features).
  • Extract Architectural Features (identify and extract detailed architectural elements from the images).
  • Generate Detailed 3D Models (create detailed 3D models based on the extracted features).
  • Comparative Analysis of Methods (compare the models generated from verbal descriptions and images to evaluate their accuracy and efficiency).
  • Evaluate Model Accuracy and Efficiency (assess the models based on various metrics such as accuracy, fidelity, and computational efficiency).
  • Refine Models Based on Findings (make necessary adjustments to improve the models based on the evaluation results).
This study’s experimental design focuses on developing an AI-driven procedural modeling approach aimed at improving the geovisualization of 3D building models. Two primary approaches were explored: generating 3D models from verbal descriptions and creating models based on optical recognition of building images. These approaches were subjected to a comparative analysis to evaluate their respective strengths and weaknesses in terms of model accuracy, computational efficiency, and user satisfaction.
Regarding dataset selection for this study, we selected two types of datasets that cater to the different approaches. The first dataset comprises textual descriptions of various buildings, detailing architectural styles, materials, dimensions, and other key elements. This dataset was crucial for training the NLP models. The second dataset includes images and videos of buildings, sourced from architectural databases and on-site photography. These visual datasets were utilized to train computer vision models to accurately identify and extract architectural features for 3D model generation.
The training process of the AI model was conducted in two parallel streams to address both the verbal and visual approaches. The NLP models were trained using the textual descriptions dataset, employing techniques such as tokenization and parsing to convert textual inputs into structured data. These models learned to interpret architectural terminology and generate corresponding 3D models. Concurrently, deep learning models, particularly convolutional neural networks (CNNs), were trained on the visual dataset. These models were designed to recognize architectural features from images, such as windows, doors, and structural elements, which were then used to generate detailed 3D models. Both models underwent iterative training, guided by performance metrics to ensure optimal accuracy and efficiency.
To validate the effectiveness of the AI models, we employed several evaluation indicators. Model accuracy was assessed by comparing the AI-generated 3D models against manually created or real-world counterparts, focusing on metrics such as precision, recall, and F1-score. Computational efficiency was measured by the time required to generate 3D models from both verbal descriptions and images, alongside the computational resources utilized, including CPU and GPU usage. Additionally, user satisfaction was gauged through surveys and expert reviews, which evaluated the usability, interpretability, and overall quality of the generated models. Lastly, scalability was considered by examining the models’ ability to handle varying levels of detail, from individual buildings to complex cityscapes. This comprehensive experimental design and methodology underscore the rigor of our research process, ensuring that the findings are both reliable and reproducible across similar studies.
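Precision, recall, and F1-score can be computed as in the Python sketch below. How a generated element is matched to a reference element is not stated in the text; the sketch assumes exact matching on a hypothetical (type, floor, position) label, and the element sets are toy values rather than results from this study.

```python
def precision_recall_f1(predicted: set, reference: set):
    """Compute precision, recall and F1 from two sets of element labels."""
    tp = len(predicted & reference)   # elements found in both models
    fp = len(predicted - reference)   # generated but not in the reference
    fn = len(reference - predicted)   # in the reference but not generated
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy labels: (element type, floor index, position index).
generated = {("window", 0, 0), ("window", 0, 1), ("door", 0, 2), ("window", 1, 0)}
manual = {("window", 0, 0), ("window", 0, 1), ("door", 0, 3),
          ("window", 1, 0), ("window", 1, 1)}

precision, recall, f1 = precision_recall_f1(generated, manual)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

A looser matching criterion, for example a positional tolerance, would change the counts but not the formulas.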
To validate each research hypothesis in this paper effectively, it is essential to employ specific technical methods and a carefully designed experimental framework. This approach will enhance the logical coherence of the paper by ensuring that each hypothesis is rigorously tested and supported by empirical evidence. For the first hypothesis, which posits that buildings in the real world are mostly different but can still be effectively modeled through procedural modeling, the validation process should begin by developing a procedural modeling framework capable of capturing the diversity inherent in real-world architecture. This can be accomplished by creating a flexible grammatical language designed to describe various building styles, architectural features, and structural elements. The experimental design should involve selecting a diverse dataset of building types that represent a wide range of architectural styles. By using this dataset, the framework can generate 3D models that are then compared against the original buildings. The validation of this hypothesis will be based on several metrics, including structural accuracy, aesthetic similarity, and computational efficiency. A control group of buildings, modeled using traditional methods, should be included to provide a baseline for comparison. This will allow for a robust assessment of how well the procedural models replicate the variety found in real-world structures. The second hypothesis, which suggests that this modeling can be achieved through different approaches such as AI and machine learning, requires the implementation of AI-based techniques to test its validity. Specifically, AI methods like deep learning for computer vision and Natural Language Processing (NLP) for parsing architectural descriptions can be employed to create 3D models. The experimental design for this hypothesis should involve gathering input data in the form of images and textual descriptions of buildings. 
AI models should then be trained using a subset of this data to learn to recognize and replicate architectural features. The effectiveness of these AI-generated models should be evaluated by comparing them with manually created models, focusing on their accuracy, level of detail, and the time efficiency of the modeling process. A comparative analysis should be conducted to assess which approach—AI-based or procedural—better accommodates the diversity of buildings. This analysis will provide critical insights into the strengths and weaknesses of each method. To further strengthen the validation process, a hybrid approach can be tested, wherein procedural modeling is enhanced with AI-generated inputs. This combined method could involve using AI to generate initial modeling scripts that are subsequently refined through procedural techniques. The experimental design for this approach would involve creating a workflow that integrates AI and procedural modeling and then analyzing the effectiveness of this hybrid method in handling complex architectural structures. The resulting models can be compared to those produced by AI or procedural methods alone to determine the added value of combining these techniques. Throughout the validation process, it is important to employ statistical methods to assess the significance of the results. Hypothesis testing, such as t-tests or ANOVA, can be used to evaluate whether the differences between models generated by various methods are statistically significant. This statistical validation will ensure that the conclusions drawn from the experiments are robust and reliable.
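The statistical comparison mentioned above can be sketched with Welch's t-test, one common variant of the t-test for two samples with unequal variances. The sketch uses only the Python standard library; the function name welch_t is an assumption, and the two samples are hypothetical placeholder values, not measurements from this study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means
    va, vb = variance(sample_a) / na, variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical modeling times in seconds (placeholders, not measured data).
ai_times = [10.0, 19.0, 12.0, 15.0, 14.0]
manual_times = [40.0, 55.0, 48.0, 60.0, 52.0]

t_stat, dof = welch_t(ai_times, manual_times)
```

Turning the statistic into a p-value requires the cumulative t distribution, which the standard library does not provide. If SciPy is available, scipy.stats.ttest_ind with equal_var=False performs the same test and returns the p-value directly.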

2.2. Verbal Descriptions as Input for 3D Modeling

Utilizing verbal descriptions entails the provision of textual narratives or structured data outlining the characteristics, dimensions, and architectural features of the building. This approach offers the advantage of leveraging Natural Language Processing (NLP) techniques to parse and extract relevant information from textual inputs. Verbal descriptions provide a structured and standardized format for conveying building specifications, enabling straightforward data processing and interpretation by AI algorithms. However, the efficacy of this approach may be contingent upon the accuracy and comprehensiveness of the provided descriptions, as well as the proficiency of NLP algorithms in extracting pertinent details.
For example, for the same Frauenkirche in Munich, a verbal description is not needed as input, since it is a well-known church that has been described many times on the Internet and is therefore available to ChatGPT.
If the request “write me a python script for procedural modeling of a 3D model for Frauenkirche in Munich” is input into ChatGPT, a script is generated. The full answer is available in Ref. [50] in the Supplementary Materials.
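The script returned by ChatGPT is not reproduced in this section. Purely to illustrate what a procedural script of this kind can look like, the following sketch assembles a nave and two towers from axis-aligned boxes and writes them in Wavefront OBJ syntax. It is not the ChatGPT output from Ref. [50]: the function names box, church_model, and to_obj are assumptions, and all dimensions are placeholder values rather than survey data.

```python
def box(x, y, z, width, depth, height):
    """Return the 8 vertices and 6 quad faces of an axis-aligned box."""
    vertices = [(x + dx * width, y + dy * depth, z + dz * height)
                for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)]
    # 1-based vertex indices, as used by the Wavefront OBJ format
    faces = [(1, 2, 4, 3), (5, 6, 8, 7), (1, 2, 6, 5),
             (3, 4, 8, 7), (1, 3, 7, 5), (2, 4, 8, 6)]
    return vertices, faces


def church_model(nave=(20.0, 60.0, 30.0), tower=(10.0, 10.0, 60.0)):
    """Assemble a nave and two front towers (placeholder dimensions)."""
    nave_w, nave_d, nave_h = nave
    tower_w, tower_d, tower_h = tower
    return [
        box(0.0, 0.0, 0.0, nave_w, nave_d, nave_h),
        box(0.0, -tower_d, 0.0, tower_w, tower_d, tower_h),
        box(nave_w - tower_w, -tower_d, 0.0, tower_w, tower_d, tower_h),
    ]


def to_obj(parts):
    """Serialize the parts as OBJ text, offsetting face indices per part."""
    lines, offset = [], 0
    for vertices, faces in parts:
        lines += [f"v {vx} {vy} {vz}" for vx, vy, vz in vertices]
        lines += ["f " + " ".join(str(i + offset) for i in face)
                  for face in faces]
        offset += len(vertices)
    return "\n".join(lines)


obj_text = to_obj(church_model())
```

Because the building is described by a few parameters rather than by explicit geometry, changing the tuples passed to church_model regenerates the whole model. The sketch does not enforce a consistent face orientation, which a renderer may require.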

2.3. Optical Recognition of Building Images

Alternatively, the optical recognition of building images involves the utilization of computer vision algorithms to analyze visual data captured from images or videos of buildings. This approach relies on image processing techniques to identify and extract architectural features, spatial relationships, and structural elements from visual inputs. By leveraging convolutional neural networks (CNNs) or other deep learning models, optical recognition can discern intricate details and nuances in building facades, enabling the generation of highly accurate procedural models. However, the effectiveness of this approach may be influenced by factors such as image quality, lighting conditions, and occlusions, which can impact the accuracy of feature extraction and subsequent model generation. The automatic recognition of buildings and the generation of their 3D models have been implemented and demonstrated. Using neural networks and AI has resulted in successful and well-rendered examples; however, certain building photos require further refinement to achieve a visually satisfactory result. Automation has proven to be very useful in cartographic geovisualization, for shaping individual buildings as well as entire cities. First, image segmentation is performed (Figure 2), where individual elements of the image are separated so that AI can recognize them from a database of 3D building parts. Other sources are also used for accessing building geometry parts, such as OpenStreetMap (OSM), CityGML, SketchUp 3D Warehouse, Trimble Building Point Cloud Library (PCL), Google Poly, Building Information Model (BIM) repositories (BIMobject and NBS National BIM Library), and IFC (Industry Foundation Classes) models (BIMserver and IfcOpenShell). Following this, the parts are merged into a whole, yielding a 3D model of a single building (Figure 3).
The time required for the automatic formation of a 3D model of a building for preview in Figure 3 is 10 s, and in Figure 4 it is 19 s. For full resolution of the building model, about 45 min is needed.
With the technology available today, the results depend on the input image: for building images such as those in Figure 2 and Figure 3, the model is visually satisfactory for the building in Figure 2, but not for the building in Figure 3. This is because the software can only assume what is on the far side of the building. Therefore, it is necessary to have multiple images from all sides of the building (which is sometimes not possible) or at least one of the building’s facade. However, the model is faithful to the building only if the building is symmetrical relative to the two surfaces captured from the facade. The drawback of this method is that when a building is photographed from only one side or from the front, as is typical, the rear side is assumed to mirror the front side. When this is not the case, unusual results can occur that distort the visual identity of the building, as seen in Figure 4, where the church towers are too far apart and the entire church is deformed and stretched on the side opposite the photographer’s viewpoint. Therefore, it is beneficial to introduce another step: removing excess elements from the image before the modeling process.
If a building is surrounded by other buildings, this unwanted excess can also be removed from the image automatically, adding another step to the process of obtaining a 3D model. However, compared with the duration of the whole process, the time needed for this step is negligible, and it can be performed online, for example, as in Ref. [51], with the result shown in Figure 4. The limitation of such a model is visible in Figure 4 and Figure 5: optical recognition of parts from the original image is not possible in the lower part of the cathedral, where surrounding houses and buildings obscure its base, making it invisible in the original image. Ideally, the original images should show the entire object from all sides, which in this case can be achieved using a drone or terrestrial photogrammetry.
The segmented image is not sufficiently separated from other objects, buildings, and houses shown in the lower part, where the lower part of the cathedral is obscured. Therefore, the segmented image is not suitable for automatic modeling, resulting in a distorted model with an excessive width of the object itself and an overly large distance between the cathedral’s towers (Figure 4).
Only now can a 3D model of the building be created using the previously described automatic method, with the result visible in Figure 6. Starting from an already segmented image, our method yields better results. The improvement lies not in further segmentation, which is very similar to any other segmentation of the main object in the image, but in the creation of a model that is a faithful scaled-down representation of the actual object, as can be seen in Figure 6c,d.
If automation and all the previously described methods are applied to the cathedral and surrounding objects, a complete model such as that in Figure 7 can be obtained. In narrow streets, the biggest challenge is obtaining photographs that describe each individual building well enough to serve as the source for optical recognition and the subsequent modeling process.

2.4. Limitations of the Research

While this study provides significant insights into the integration of AI and procedural modeling for 3D building visualization, several limitations must be acknowledged. First, the generalizability of the findings is constrained by the specific datasets and architectural styles used in the training and testing phases. The models developed in this research may not perform as effectively when applied to buildings from different cultural contexts or regions with distinct architectural norms. Second, the dependency on high-quality input data, particularly in the case of machine learning techniques, presents a challenge. Variations in image resolution, lighting conditions, and angles can lead to discrepancies in the accuracy of the generated 3D models. Additionally, the computational intensity of the AI-driven methodologies limits their accessibility, particularly in environments with limited processing power or resources. Finally, the focus on integrating AI into procedural modeling overlooks the potential human factors, such as the learning curve associated with the adoption of these technologies by practitioners in architecture and urban planning. Future research should consider these factors and explore the implications of AI-driven tools on the workflows of these professionals.

3. Comparative Analysis, Results, and Discussion

To ascertain the optimal input modality, a comparative analysis is conducted, evaluating the performance and outcomes of AI-based procedural modeling using both verbal descriptions and optical recognition of building images. Metrics such as model accuracy, fidelity to real-world structures, computational efficiency, and user satisfaction are assessed to gauge the efficacy and suitability of each approach. Additionally, qualitative factors such as ease of data acquisition, interpretability of results, and scalability are considered in the comparative analysis.

3.1. Implications

The analysis yields insights into the strengths and limitations of each input modality, informing recommendations for practitioners and researchers in the field of procedural modeling. Depending on the specific requirements, constraints, and objectives of the modeling task, one modality may be favored over the other. For instance, in scenarios where detailed visual specifications are readily available in textual form, verbal descriptions may suffice as input. Conversely, in cases where visual data are abundant and provide rich contextual information about building structures, the optical recognition of building images may offer superior accuracy and fidelity in procedural model generation. Moreover, hybrid approaches that combine elements of both modalities may be explored to capitalize on their respective strengths and mitigate their weaknesses.
Overall, the analysis underscores the importance of considering input modality selection as a critical factor in AI-based procedural modeling, with implications for the design, implementation, and evaluation of modeling systems in diverse application domains:
  • Building Descriptions as Input: Building descriptions serve as the foundational input for the modeling process. These descriptions can be provided in various formats, including textual descriptions, architectural blueprints, or schematic diagrams. Natural Language Processing (NLP) techniques can be employed to parse and extract relevant information from textual descriptions, facilitating the generation of building models.
  • Visual Recognition of Building Characteristics: Computer vision algorithms play a crucial role in automatically extracting building characteristics from visual data. Convolutional neural networks (CNNs) can be trained on annotated datasets to recognize specific architectural features such as facades, windows, doors, and structural components. Through image segmentation and feature detection, these algorithms can accurately identify and classify building elements in images or video streams [52].
  • AI and Data Analysis: AI algorithms, particularly deep learning models, can analyze building descriptions or visual data to understand spatial relationships, architectural styles, and design preferences. Recurrent neural networks (RNNs) and transformer models can process textual descriptions and generate building layouts or floor plans based on learned patterns and semantic meanings. Furthermore, Generative Adversarial Networks (GANs) can be employed to synthesize realistic building designs by learning from large datasets of existing structures.
  • Online Databases of Building Elements: Online databases provide a vast repository of pre-existing building elements that can be utilized in procedural modeling. These databases may include 3D models, textures, materials, and metadata associated with building components. APIs or web scraping techniques can be used to access and query these databases, enabling the automatic retrieval and integration of relevant building elements into the modeling pipeline.
  • Machine Learning for Optimization: Machine learning techniques can optimize the procedural modeling process by iteratively refining generated designs based on user feedback and performance metrics. Reinforcement learning algorithms can adaptively adjust model parameters to maximize design objectives, such as energy efficiency, structural integrity, or aesthetic appeal. Additionally, genetic algorithms and evolutionary strategies can explore the design space and discover novel solutions by simulating natural selection and genetic variation.
  • Parametric Modeling: Parametric modeling enables the creation of flexible building models that can be parametrically controlled and manipulated. By defining parameters such as dimensions, proportions, and constraints, designers can dynamically adjust the properties of building elements and iteratively explore design alternatives. Parametric modeling software such as Grasshopper for Rhino or Autodesk Dynamo provides intuitive visual interfaces for creating parametric designs and automating repetitive tasks.
  • Rule-Based Systems: Rule-based systems enforce design rules and constraints to ensure compliance with architectural guidelines and regulatory requirements. These rules may encompass zoning regulations, building codes, accessibility standards, and environmental considerations. By integrating rule-based validation checks into the modeling workflow, errors and violations can be detected early in the design process, reducing costly revisions and ensuring compliance with legal mandates.
  • Real-time Feedback and Iteration: Real-time feedback mechanisms enable designers to interactively manipulate building models and receive immediate visual feedback on design changes. Graphical user interfaces (GUIs) or virtual reality (VR) environments provide intuitive interfaces for exploring and refining designs in real time. User inputs such as gestures, voice commands, or haptic feedback can be captured and interpreted to dynamically adjust model parameters and geometry.
  • Simulation and Visualization: Simulation and visualization tools facilitate the evaluation and validation of building models through realistic rendering, lighting analysis, and virtual walkthroughs. Building information modeling (BIM) software platforms such as Autodesk Revit or Trimble SketchUp integrate simulation engines for simulating structural performance, daylighting conditions, thermal comfort, and other architectural metrics. By visualizing design outcomes in immersive 3D environments, stakeholders can gain insights into the spatial qualities and functional aspects of proposed buildings.
  • Scalability and Adaptability: The scalability and adaptability of automated procedural building modeling systems are essential for accommodating diverse project requirements and scales. Cloud-based infrastructure and distributed computing technologies enable parallel processing and resource scalability to handle large-scale modeling tasks efficiently. Furthermore, modular software architectures and open-source frameworks foster interoperability and extensibility, allowing developers to integrate custom modules and adapt the system to specific domain contexts.
Automated procedural building modeling harnesses the synergistic capabilities of AI, machine learning, online databases, and advanced computational techniques to streamline the design process, empower designers with creative tools, and accelerate the realization of architectural visions. By leveraging data-driven insights and computational intelligence, architects and urban planners can address complex design challenges, optimize resource utilization, and create sustainable built environments that enrich the lives of inhabitants and communities.
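The parametric modeling and rule-based validation items in the list above can be sketched together as follows. The class BuildingParams, the RULES list, and the function violations are illustrative assumptions; the thresholds are placeholders and do not correspond to any real building code or zoning regulation.

```python
from dataclasses import dataclass


@dataclass
class BuildingParams:
    """Parametric description of a simple building."""
    floors: int
    floor_height_m: float
    footprint_m2: float
    plot_m2: float

    @property
    def height_m(self):
        # Derived value: changes automatically when a parameter changes
        return self.floors * self.floor_height_m


# Hypothetical rules; thresholds are placeholders, not real regulations.
RULES = [
    ("max_height", lambda b: b.height_m <= 25.0),
    ("min_floor_height", lambda b: b.floor_height_m >= 2.5),
    ("max_site_coverage", lambda b: b.footprint_m2 / b.plot_m2 <= 0.6),
]


def violations(building):
    """Return the names of all rules that the building fails."""
    return [name for name, is_ok in RULES if not is_ok(building)]


draft = BuildingParams(floors=10, floor_height_m=3.0,
                       footprint_m2=400.0, plot_m2=1000.0)
revised = BuildingParams(floors=8, floor_height_m=3.0,
                         footprint_m2=400.0, plot_m2=1000.0)
```

Here the draft fails only the height rule (30 m against the placeholder limit of 25 m), and reducing the number of floors to eight resolves it. Keeping the rules as data, separate from the geometry, is one way to run such checks early in the modeling workflow.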

3.2. What Is Needed for Detailed Descriptions of Buildings?

To create the perfect building model, a comprehensive description is needed that includes the following details:
  • Building Style: information about the architectural style of the building, such as modern, contemporary, traditional, or a specific architectural movement (e.g., Art Deco, Bauhaus, Postmodern, Gothic, etc.).
  • Exterior Features: description of exterior features such as facade design, materials (e.g., concrete, glass, steel, brick), color schemes, decorative elements, and any unique architectural elements (e.g., balconies, awnings, ornamentation).
  • Building Layout and Overall Shape and Dimensions: details about the layout of the building, including the number of floors, floor plans, dimensions, proportions, and any specific structural elements (e.g., columns, beams, cantilevers), and also information about the overall shape of the building (e.g., rectangular, cylindrical, angular) and its dimensions, including length, width, and height.
  • Windows and Doors: specifications for windows and doors, including their placement, size, style (e.g., sliding, casement, double-hung, or bay windows; single, double, or sliding doors), and materials (e.g., glass, wood, metal).
  • Roof Design: description of the roof design, including its shape (e.g., flat, pitched, gabled, hipped), materials (e.g., shingles, tiles, metal), and any additional features (e.g., skylights, rooftop gardens, dormers).
  • Landscaping: information about landscaping elements surrounding the building, such as gardens, pathways, trees, and outdoor amenities (e.g., seating areas, fountains).
  • Description of Interior Layout and Features: (unnecessary if only the exterior of the building is modeled) details about interior spaces, including room layouts, partitioning, ceiling heights, flooring materials, lighting fixtures, and any specific interior design elements (e.g., staircase design, built-in furniture).
  • Contextual Information: contextual details such as the building’s location, surroundings, site constraints, and any specific environmental considerations (e.g., climate, orientation, sustainability features).
  • Facade Design: specific details about the facade design, including materials (e.g., concrete, glass, brick, metal), textures, patterns, and decorative elements (e.g., columns, arches, balconies).
  • Finishing Details: specific finishing details such as paint colors, surface treatments, and decorative elements both inside and outside the building.
By providing a thorough description encompassing these aspects, we can ensure that the resulting building model accurately reflects the desired design intent and meets the project requirements. Additionally, accompanying visual references such as architectural drawings, sketches, or reference images can further aid in the accurate interpretation and realization of the building model.
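One way to keep such a description complete and machine-readable is to store it in a structured record before passing it to an AI model. The sketch below is a simplified assumption rather than the format used in this study: the class BuildingDescription, its field names, and the example values are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class BuildingDescription:
    """Structured counterpart of the checklist above (simplified)."""
    style: str = ""
    exterior_features: list = field(default_factory=list)
    shape: str = ""
    dimensions_m: dict = field(default_factory=dict)
    floors: int = 0
    windows_and_doors: list = field(default_factory=list)
    roof: str = ""
    landscaping: list = field(default_factory=list)
    context: str = ""
    facade: str = ""
    finishing: str = ""


def missing_items(description):
    """List the checklist items that are still empty."""
    return [name for name, value in asdict(description).items() if not value]


# Hypothetical, deliberately incomplete example.
example = BuildingDescription(style="Gothic", shape="rectangular",
                              floors=1, roof="gabled, tiles")
prompt_payload = json.dumps(asdict(example), indent=2)
to_complete = missing_items(example)
```

The list to_complete shows which items still have to be supplied before the description is thorough in the sense described above, and prompt_payload is a JSON string that could accompany a text prompt.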

3.3. Final Results and Discussion

The comparative analysis reveals that the AI-based approach using verbal descriptions provides a structured and efficient method for generating 3D models. However, it lacks the detailed accuracy of models generated through optical recognition. The latter approach excels in capturing intricate architectural features but requires high-quality visual data and significant computational resources.
We also discuss the contribution of our research with reference to relevant recent literature.
The key contribution of this research lies in its AI-driven procedural automation, which enhances modeling accuracy and efficiency at an individual building level. Meanwhile, Ref. [53] focuses on reducing the computational overhead in large-scale urban environments. Together, these works highlight the potential for integrating procedural modeling with lightweight reconstruction methods to improve both the efficiency and fidelity of urban 3D visualizations, especially for smart city applications.
Our research significantly contributes to the advancements of 3D city modeling techniques that were explored in another study [54]. In both papers, procedural modeling emerges as an essential method for efficiently generating detailed urban structures, but the integration of AI, as we showed, enhances accuracy and scalability. In Ref. [54], procedural modeling is highlighted for its ability to quickly generate complex 3D city models, suitable for large-scale urban planning projects and simulations. However, the limitations regarding manual adjustments, model diversity, and texture accuracy are evident. The AI-driven approaches from our research address these limitations by employing machine learning techniques such as convolutional neural networks (CNNs) and Generative Adversarial Networks (GANs) for tasks such as image-to-geometry translation and texture generation. Thus, the AI integration proposed in this research not only streamlines the generation of diverse building models but also provides higher realism, reducing the manual effort required in traditional procedural methods, and ultimately aligning with the goals outlined in Ref. [54]. This fusion of procedural and AI techniques presents a transformative approach for urban planners and architects, offering enhanced visualization, efficiency, and adaptability in 3D smart city modeling.
We also showed that the procedural modeling approaches discussed in the literature and in Ref. [55] can be enhanced by integrating AI-driven methods. While the digital twin research focuses on workflows for large-scale 3D urban models, AI-based procedural modeling optimizes the generation of complex building structures and textures with greater efficiency. The AI techniques, particularly CNNs and GANs, enable more realistic, dynamic models, addressing limitations in traditional procedural approaches and facilitating adaptable, high-detail urban simulations.
Our research also complements the web-based application discussed in Ref. [56] by advancing procedural modeling with AI integration. While the web-based tool focuses on efficient 3DCM (3D city model) management, tiling, and spatial analysis, AI-enhanced procedural modeling allows for more detailed and accurate 3D building generation from 2D data. This AI-driven approach improves the efficiency and precision of 3DCM creation, aligning with the goal of optimizing model generation and visualization in web environments.
Our research also contributes to the advancements in procedural modeling discussed in Ref. [57] by incorporating AI to enhance model automation. While Ref. [57] focused on procedural generation through CGA-based workflows and traditional shape grammars, the AI-driven techniques that we showed here improve the accuracy and efficiency of generating diverse, complex building models. This AI integration aligns with Ref. [57]’s goal of automating design, boosting creativity, and offering flexible adjustments in urban design.
Scientific Value and Contribution:
  • Novel Framework: The integration of these methodologies offers a comprehensive solution that combines the strengths of both approaches, resulting in a novel framework for automated procedural modeling.
  • Empirical Validation: This study provides empirical evidence supporting the effectiveness of the proposed methodologies, highlighting their potential to transform the field of geovisualization.
  • Broader Implications: The findings have significant implications for urban planning, architecture, and other domains that rely on accurate and efficient building modeling.
The integration of these methodologies offers a comprehensive solution, combining the strengths of both approaches. This research contributes to the field by providing a robust framework for automated procedural modeling, addressing both the efficiency and accuracy required for diverse building structures.
To validate each research hypothesis and enhance the logical coherence of the paper, it is important to employ specific technical methods and a carefully structured experimental design. Additionally, incorporating case studies that demonstrate the practical application of these methods in the geographic visualization of different types of buildings will significantly improve the readability and comprehensibility of the paper.
For the first hypothesis, which suggests that procedural modeling can accurately portray the diversity of real-world buildings, the validation process begins with a comparative analysis between traditional geovisualization methods and the proposed procedural modeling techniques. This involves the use of advanced AI and machine learning algorithms to identify and incorporate diverse architectural features into the models. For instance, convolutional neural networks (CNNs) are employed for image recognition, while Natural Language Processing (NLP) techniques are used to interpret descriptive data about the buildings. An experimental design is established where a variety of buildings, differing in architectural style and complexity, are modeled using both traditional and procedural methods. The generated models are evaluated by experts in architecture and urban planning based on metrics such as structural accuracy, visual fidelity, and computational efficiency. This comparative analysis provides empirical evidence on how well procedural models can replicate the diversity observed in real-world design.
The second hypothesis, which posits that AI-driven procedural modeling can automate the creation of accurate building models using various input modalities such as images and textual descriptions, requires a different approach. To validate this hypothesis, AI algorithms that can process diverse types of input data to generate 3D building models must be implemented. Techniques like reverse procedural modeling, where AI uses detailed descriptions or images to generate scripts for building models, would play a crucial role. The experimental design involves feeding different types of data—such as photographs, blueprints, and verbal descriptions—into the AI-driven procedural modeling system to produce 3D models. These output models can then be compared against manually created models and their real-world counterparts to evaluate the accuracy, level of detail, and computational time. A further step could involve conducting a user study where professionals in cartography, architecture, or urban planning assess the usability and effectiveness of these AI-generated models, providing additional insights into the practical applicability of the proposed approach.
These case studies would serve to illustrate the application of the proposed methods in real-world scenarios, thereby making the research more accessible and relatable to the reader. For instance, one case study could focus on an urban residential area, demonstrating how procedural modeling can be used to create detailed 3D models of a neighborhood with diverse building types. The study might use AI-driven procedural modeling to generate models based on inputs such as images of houses and textual descriptions of building materials and styles, showcasing the system’s ability to replicate the variety found in residential architecture and their geovisualization.
Another case study could involve the recreation of a historical city center, highlighting how procedural modeling can be applied to complex architectural styles. In this scenario, archival images and descriptions would be used to model historical buildings with intricate details, demonstrating the system’s potential in heritage conservation and urban planning by accurately capturing the complexity and uniqueness of historical structures. A third case study might explore the modeling of a modern commercial district, where the procedural modeling approach could be applied to visualize large, repetitive structures such as office buildings and shopping malls. This case study would emphasize the computational advantages of procedural modeling, particularly in efficiently and accurately modeling large-scale commercial environments.

4. Challenges, Limitations, and Future Directions

Despite the significant advancements offered by the proposed AI-driven procedural modeling approach in enhancing geovisualization and automating 3D building modeling, there are several challenges and limitations that need to be addressed to fully realize its potential in practical applications. One of the primary challenges lies in the quality and availability of data. The accuracy and effectiveness of the AI models are closely tied to the quality and diversity of the input datasets. For models based on verbal descriptions, inconsistencies or incomplete data can result in errors during the generation of 3D models. Similarly, the optical recognition approach is highly dependent on the quality of images or videos; poor resolution, occlusions, or unfavorable lighting conditions can hinder accurate feature extraction and lead to less reliable outputs.
Another notable limitation is the model’s ability to generalize across different architectural styles and environments. Models that are trained on specific datasets may encounter difficulties when applied to new or unseen types of buildings, particularly those with unique or complex architectural features. This limitation can restrict the method’s applicability in diverse or global contexts, where architectural variability is more pronounced. Additionally, the computational demands associated with training and deploying deep learning models, particularly for optical recognition, are substantial. High-performance computing resources are often required, which may not be readily available in all practical settings, especially in resource-constrained environments. This limitation could hinder the widespread adoption of these methods.
A further challenge is the issue of interpretability and transparency within the AI models. Although these models can generate accurate 3D representations, the underlying decision-making processes often lack transparency, making it difficult to understand how certain architectural features were interpreted or why specific design decisions were made. This can be a significant drawback in fields that require a high degree of interpretability, such as urban planning or heritage conservation. Moreover, integrating AI-driven procedural modeling into existing architectural and urban planning workflows may present practical difficulties. Compatibility with existing software and tools, as well as the need for specialized knowledge to operate these AI systems, could pose significant barriers to adoption.
Looking towards the future, several directions can be explored to address these challenges and enhance the proposed methodology. First, there is a need for richer and more diverse datasets that encompass a broader range of architectural styles and environmental contexts, which would help improve the models’ ability to generalize. Additionally, combining verbal and visual inputs into a hybrid model could leverage the strengths of both modalities, thereby overcoming some of the limitations associated with each approach. Advances in AI techniques, such as the use of Generative Adversarial Networks (GANs) for data augmentation and reinforcement learning for model optimization, could further enhance the quality and adaptability of the models. These advanced techniques might also contribute to generating more realistic and context-aware 3D models. Improving the transparency and interpretability of AI models should also be a priority. The development of explainable AI (XAI) techniques could make the decision-making processes within these models more understandable, thus increasing trust and usability in professional applications. Additionally, research into more resource-efficient AI models that can operate effectively on lower-end hardware or within cloud-based systems could make these tools more accessible and scalable. Finally, future work should focus on improving the integration of AI-driven procedural modeling tools with industry-standard software and workflows. Developing plugins or APIs that allow for seamless operation within existing CAD or BIM systems could facilitate broader adoption within the architecture, engineering, and construction (AEC) industries.
While the proposed AI-driven procedural modeling approach represents a significant advancement, addressing these challenges and exploring the outlined future directions will be crucial for its successful application in real-world scenarios. By continuously refining and improving the methodology, it is possible to overcome the current limitations and expand the utility and impact of this research across various domains.
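The dependence of optical recognition on image quality, noted among the limitations above, suggests screening photographs before they reach the recognition step. The following is a minimal sketch of such a pre-check and is not part of the authors' pipeline. It assumes a grayscale image given as a list of rows with values from 0 to 255. The function names and all thresholds (`min_side`, `dark`, `bright`, `min_sharpness`) are hypothetical and would need tuning on real data.

```python
from statistics import mean, pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian response; low values suggest blur."""
    h, w = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def quality_report(gray, min_side=512, dark=40, bright=215, min_sharpness=50.0):
    """Return a list of detected quality issues (empty if none).

    All thresholds are illustrative assumptions, not values from the study.
    """
    h, w = len(gray), len(gray[0])
    issues = []
    if min(h, w) < min_side:
        issues.append("resolution too low")
    brightness = mean(v for row in gray for v in row)
    if brightness < dark:
        issues.append("underexposed")
    elif brightness > bright:
        issues.append("overexposed")
    # The Laplacian needs at least one interior pixel.
    if min(h, w) >= 3 and laplacian_variance(gray) < min_sharpness:
        issues.append("possibly blurred")
    return issues


if __name__ == "__main__":
    flat_dark = [[10] * 8 for _ in range(8)]
    print(quality_report(flat_dark, min_side=4))
```

Images that return a non-empty list could be re-captured or routed to the verbal-description workflow instead. Occlusions are not covered by this sketch, since detecting them would itself require a recognition model.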
Looking ahead, future research could explore the integration of AI-driven procedural modeling with recent developments in generative artificial intelligence, particularly text-to-image programs like Midjourney. These tools, which can generate detailed images from textual descriptions, could be utilized to enhance both modern and heritage architecture. For example, in modern architecture, architects could use text prompts to quickly generate initial design concepts, which could then be refined through procedural modeling techniques. This approach could streamline the creative process, allowing for more experimentation and innovation. In heritage architecture and urbanism, generative AI could play a crucial role in the restoration and preservation of historical sites. By combining textual descriptions of lost or damaged structures with procedural modeling, researchers could create accurate 3D reconstructions that are both visually compelling and historically faithful. This would not only aid in the physical restoration of heritage sites but also in the digital preservation of cultural history. These speculative integrations align with the work of Ref. [58] who discusses the speculative uses of geovisualization in reimagining architecture and landscapes. By applying similar techniques, future research could push the boundaries of architectural design and visualization. Additionally, the analytical evaluation of Midjourney’s capabilities by Ref. [59] highlights the current limitations in AI-generated representations, particularly in Islamic architectural heritage, pointing to areas where procedural modeling could complement these approaches to overcome existing challenges. Furthermore, the study by Ref. [60] on the perception of 3D geovisualization in urban planning underscores the importance of combining familiarity with innovative tools to enhance communication and design processes in urban contexts. 
Future research should focus on these potential integrations, exploring how generative AI can be combined with procedural modeling to address the specific needs of architectural and urban design. This could lead to the development of new methodologies that bridge the gap between cutting-edge AI technologies and practical applications in these fields, ultimately advancing both the efficiency and creativity of design processes.

5. Conclusions

The comparative analysis of AI-based procedural modeling using verbal descriptions versus optical recognition of building images provides valuable insights into the strengths and limitations of each input modality. By assessing metrics such as model accuracy, fidelity to real-world structures, computational efficiency, and user satisfaction, the study underscores the importance of input modality selection in the design and implementation of modeling systems. The implications for practitioners and researchers are profound, as the choice of input modality can significantly impact the efficacy and suitability of the generated models. Verbal descriptions offer a straightforward method for capturing detailed specifications, especially when textual information is readily available. Natural language processing (NLP) techniques can efficiently parse and extract relevant data from these descriptions, facilitating accurate model generation. However, the fidelity of models derived from verbal descriptions heavily depends on the completeness and clarity of the provided information. This approach may fall short in scenarios where intricate visual details are critical, as it might not capture the nuanced architectural features that are easily perceivable in images. On the other hand, the optical recognition of building images leverages the power of computer vision algorithms, particularly convolutional neural networks (CNNs), to extract detailed architectural features from visual data. This method excels in capturing the contextual richness and fine-grained details of building structures. By identifying and classifying elements such as facades, windows, and doors through image segmentation and feature detection, this approach can produce highly accurate and faithful models. However, it requires substantial computational resources and high-quality, annotated datasets to train the algorithms effectively. 
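One common way to quantify how well segmented facade elements match a reference is the per-class intersection over union (IoU). The sketch below illustrates that metric. It is an assumption for illustration and not necessarily the measure used in this study. Label masks are assumed to be equally sized grids of class names, and the function name `per_class_iou` is hypothetical.

```python
def per_class_iou(predicted, reference, classes):
    """Intersection over union per class for two equally sized label masks.

    Returns None for a class that appears in neither mask.
    """
    scores = {}
    for name in classes:
        inter = union = 0
        for p_row, r_row in zip(predicted, reference):
            for p, r in zip(p_row, r_row):
                in_p, in_r = p == name, r == name
                inter += in_p and in_r   # pixel labelled `name` in both masks
                union += in_p or in_r    # pixel labelled `name` in at least one
        scores[name] = inter / union if union else None
    return scores


if __name__ == "__main__":
    predicted = [["wall", "window"], ["wall", "door"]]
    reference = [["wall", "window"], ["window", "door"]]
    print(per_class_iou(predicted, reference, ["wall", "window", "door"]))
```

A score of 1.0 means the predicted and reference regions coincide for that class, while lower values indicate missed or spurious pixels.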
Hybrid approaches that combine verbal descriptions with optical recognition hold promise for maximizing the strengths and mitigating the weaknesses of each modality. By integrating textual and visual data, these methods can achieve a higher level of detail and accuracy in model generation. For instance, while textual descriptions can provide a broad framework and specific dimensions, visual data can enrich the model with precise architectural features and contextual information. This synergy can enhance the overall fidelity and usability of the generated models. Scalability and adaptability are critical considerations for the practical application of procedural modeling systems. Cloud-based infrastructure and distributed computing technologies enable the efficient handling of large-scale modeling tasks. Modular software architectures and open-source frameworks foster interoperability and extensibility, allowing developers to integrate custom modules and adapt systems to specific domain contexts. This flexibility ensures that procedural modeling can accommodate diverse project requirements and scales, from individual buildings to extensive urban planning projects.
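The division of labor described above, with text supplying the overall framework and dimensions and images supplying facade detail, can be sketched as a simple merge rule. This is an illustrative assumption and not the authors' implementation. The `BuildingSpec` fields, the `TEXT_PRIORITY` set, and `merge_specs` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BuildingSpec:
    """Hypothetical parameter set that a procedural script could consume."""
    width_m: Optional[float] = None
    height_m: Optional[float] = None
    floors: Optional[int] = None
    roof_type: Optional[str] = None
    windows_per_floor: Optional[int] = None


# Assumption: dimensions are trusted from text, facade features from images.
TEXT_PRIORITY = {"width_m", "height_m", "floors"}


def merge_specs(from_text: BuildingSpec, from_image: BuildingSpec) -> BuildingSpec:
    """Merge two partial specs, falling back to the other source when a value is missing."""
    merged = BuildingSpec()
    for f in fields(BuildingSpec):
        t, i = getattr(from_text, f.name), getattr(from_image, f.name)
        first, second = (t, i) if f.name in TEXT_PRIORITY else (i, t)
        setattr(merged, f.name, first if first is not None else second)
    return merged


if __name__ == "__main__":
    text = BuildingSpec(width_m=12.0, height_m=9.0, floors=3, roof_type="flat")
    image = BuildingSpec(height_m=10.5, roof_type="gable", windows_per_floor=4)
    print(merge_specs(text, image))
```

A fixed priority per field is the simplest possible policy. A fuller system might instead weight each source by a confidence score.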
The findings of this analysis emphasize the importance of selecting the input modality according to the specific requirements, constraints, and objectives of the modeling task. For example, where detailed specifications are already documented, as in architectural blueprints or schematic diagrams, verbal descriptions may suffice. Conversely, in environments rich with visual data, such as photorealistic images or video streams, optical recognition may offer superior results. Ultimately, the choice of input modality should align with the desired outcomes of the modeling process, whether accuracy, efficiency, or user satisfaction. By carefully considering the strengths and limitations of each approach, practitioners can make informed decisions that enhance the effectiveness of AI-based procedural modeling.
Future research should explore the integration of advanced AI techniques, such as Generative Adversarial Networks (GANs) and reinforcement learning, to further optimize the procedural modeling process. Additionally, the development of comprehensive online databases and the utilization of real-time feedback mechanisms can enhance the adaptability and responsiveness of modeling systems. By continuing to advance the capabilities of AI and machine learning in procedural modeling, the field can address increasingly complex design challenges and contribute to the creation of sustainable and innovative built environments.
In conclusion, this comparative analysis provides a nuanced understanding of the relative merits of verbal descriptions and optical recognition in AI-based procedural modeling. By leveraging the strengths of each modality and exploring hybrid approaches, practitioners and researchers can enhance the design, implementation, and evaluation of modeling systems, ultimately driving innovation in architectural and urban planning.
This study demonstrates the potential of combining procedural modeling with AI and machine learning to enhance the geovisualization of buildings. The proposed methodologies offer significant improvements over traditional methods, providing a foundation for future research and development in this field.
This research bridges the gap in procedural modeling by integrating advanced AI techniques, providing a robust and efficient framework for 3D building modeling. The comparative analysis highlights the strengths and limitations of different approaches, offering valuable insights for practitioners and researchers. This work significantly contributes to the field of geovisualization and the modeling of 3D buildings and sets the stage for future advancements in automated procedural modeling.
Scientific Value, Motivation, and Contribution:
  • Innovative methodologies: this research introduces innovative methodologies that significantly advance the field of procedural modeling.
  • Addressing key challenges: by effectively handling the diversity and complexity of real-world buildings, this work addresses key challenges that have limited the applicability of procedural modeling in the past.
  • Future directions: the findings set the stage for future research, offering valuable insights and a robust framework for further innovations in the AI-driven geovisualization of buildings.
Overall, this research significantly advances the field of procedural modeling and sets the stage for future innovations in AI-driven geovisualization.

Author Contributions

Conceptualization, R.N., R.Ž. and I.R.; R.N. and R.Ž.; software, R.N.; validation, R.N., R.Ž. and I.R.; formal analysis, R.N. and R.Ž.; investigation, R.N.; resources, R.N., R.Ž. and I.R.; data curation, R.N.; writing—original draft preparation, R.N. and R.Ž.; writing—review and editing, R.N., R.Ž. and I.R.; visualization, R.N. and R.Ž.; supervision, R.Ž. and I.R.; project administration, R.N. and R.Ž.; funding acquisition, R.N. and R.Ž. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data and scripts produced by the Potpora 821 are freely available from the site https://rb.gy/tsma7a.

Acknowledgments

This research is partially supported through project KK.01.1.1.02.0027, a project co-financed by the Croatian Government and the European Union through the European Regional Development Fund—the Competitiveness and Cohesion Operational Program.

Conflicts of Interest

The authors declare no conflicts of interest.

IGOs License

This is an open access article distributed under the terms of the Creative Commons Attribution IGO License (CC BY) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

References

  1. Müller, P.; Wonka, P.; Haegler, S.; Ulmer, A.; Van Gool, L. Procedural modeling of buildings. ACM Trans. Graph. 2006, 25, 614–623.
  2. Kanda, J.; He, Y.; Xie, H.; Miyata, K. Sketch2Tooncity: Sketch-based city generation using neurosymbolic model. In Proceedings of the International Workshop on Advanced Imaging Technology (IWAIT), Langkawi, Malaysia, 7–8 January 2024; Volume 13164, pp. 431–436.
  3. Kubicka-Sowińska, A.; Miszk, Ł.; Zachar, P.; Fijałkowska, A.; Ostrowski, W.; Modrzewski, J.; Papuci-Władyka, E. Reconstructing the Urban Fabric of Nea Paphos by Comparison with Regularly Planned Mediterranean Cities, Using 3D Procedural Modeling and Spatial Analysis. Bull. Am. Soc. Overseas Res. 2024, 391, 163–189.
  4. Parish, Y.I.; Müller, P. Procedural modeling of cities. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, Los Angeles, CA, USA, 12–17 August 2001; pp. 301–308.
  5. Kolbe, T.H. Representing and exchanging 3D city models with CityGML. In 3D Geo-Information Sciences; Springer: Berlin/Heidelberg, Germany, 2009; pp. 15–31.
  6. Şenyurdusev, G.; Dogru, A.O. Infrastructure for 3D Modeling of Historical Fountains in Istanbul with GIS-Based Procedural Approach. Geomat. Environ. Eng. 2024, 18, 51–72.
  7. Fatima, I.; Sooda, K. Enhancing Immersive Virtual Reality Product Experiences: Strategies for Graphics Quality, Performance Optimization, and User-Centric Interfaces. In Proceedings of the 2023 IEEE Engineering Informatics, Melbourne, Australia, 22–23 November 2023; pp. 1–5.
  8. Qin, X.; Mao, W.; Hu, Z.; Zheng, H.; Xu, X. Procedural modeling and layout method for a generic ancient Chinese city. Multimed. Tools Appl. 2024, 83, 47021–47048.
  9. Usta, Z.; Akin, A.T.; Cömert, Ç. Deep learning aided web-based procedural modelling of LOD2 city models. Earth Sci. Inform. 2023, 16, 2559–2571.
  10. Adão, T.; Pádua, L.; Marques, P.; Sousa, J.J.; Peres, E.; Magalhães, L. Procedural modeling of buildings composed of arbitrarily-shaped floor-plans: Background, progress, contributions and challenges of a methodology oriented to cultural heritage. Computers 2019, 8, 38.
  11. Kikuchi, T.; Fukuda, T.; Yabuki, N. Development of a synthetic dataset generation method for deep learning of real urban landscapes using a 3D model of a non-existing realistic city. Adv. Eng. Inform. 2023, 58, 102154.
  12. Taye, M.M. Understanding of machine learning with deep learning: Architectures, workflow, applications and future directions. Computers 2023, 12, 91.
  13. Brol, A.; Antoniuk, I. Procedural Generation of Virtual Cities. In Proceedings of the 2023 24th International Conference on Computational Problems of Electrical Engineering, Grybów, Poland, 10–13 September 2023; pp. 1–4.
  14. Varinlioglu, G.; Balaban, Ö. Artificial intelligence in architectural heritage research: Simulating networks of caravanserais through machine learning. In The Routledge Companion to Artificial Intelligence in Architecture; Routledge: London, UK, 2021; pp. 207–223.
  15. Chaillou, S. Artificial Intelligence and Architecture: From Research to Practice; Birkhäuser: Basel, Switzerland, 2022.
  16. Li, X.; Yue, J.; Wang, S.; Luo, Y.; Su, C.; Zhou, J.; Xu, D.; Lu, H. Development of Geographic Information System Architecture Feature Analysis and Evolution Trend Research. Sustainability 2023, 16, 137.
  17. Sensmeier, J. Harnessing the power of artificial intelligence. Nurs. Manag. 2017, 48, 14–19.
  18. Ávila Parra, R. House Generation Using Procedural Modeling with Rules. Master’s Thesis, Universitat Politècnica de Catalunya, Barcelona, Spain, 2021.
  19. Yeguas, E.; Muñoz-Salinas, R.; Medina-Carnicer, R. Example-based procedural modelling by geometric constraint solving. Multimed. Tools Appl. 2012, 60, 1–30.
  20. Oketunji, F. Evaluating the Efficacy of Hybrid Deep Learning Models in Distinguishing AI-Generated Text. arXiv 2023, arXiv:2311.15565.
  21. Sepasgozar, S.M.; Khan, A.A.; Smith, K.; Romero, J.G.; Shen, X.; Shirowzhan, S.; Li, H.; Tahmasebinia, F. BIM and digital twin for developing convergence technologies as future of digital construction. Buildings 2023, 13, 441.
  22. Schwarz, M.; Müller, P. Advanced procedural modeling of architecture. ACM Trans. Graph. 2015, 34, 1–12.
  23. Locatelli, M.; Seghezzi, E.; Pellegrini, L.; Tagliabue, L.C.; Di Giuda, G.M. Exploring natural language processing in construction and integration with building information modeling: A scientometric analysis. Buildings 2021, 11, 583.
  24. Tutenel, T.; Smelik, R.M.; Lopes, R.; de Kraker, K.J.; Bidarra, R. Generating consistent buildings: A semantic approach for integrating procedural techniques. IEEE Trans. Comput. Intell. AI Games 2011, 3, 274–288.
  25. Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 1–74.
  26. Saldana, M.; Johanson, C. Procedural modeling for rapid-prototyping of multiple building phases. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2013, 40, 205–210.
  27. Leroux, F.; Germain, M.; Clabaut, É.; Bouroubi, Y.; St-Pierre, T. Instance Segmentation on 3D City Meshes for Building Extraction. In IGARSS 2023–2023 IEEE International Geoscience and Remote Sensing Symposium; IEEE: Piscataway, NJ, USA, 2023; pp. 6975–6978.
  28. Beneš, B.; Št’ava, O.; Měch, R.; Miller, G. Guided procedural modeling. In Computer Graphics Forum; Blackwell Publishing Ltd.: Oxford, UK, 2011; Volume 30, pp. 325–334.
  29. Hamdia, K.M.; Zhuang, X.; Rabczuk, T. An efficient optimization approach for designing machine learning models based on genetic algorithm. Neural Comput. Appl. 2021, 33, 1923–1933.
  30. Kramer, M.; Akleman, E. A procedural approach to creating American second empire houses. J. Comput. Cult. Herit. 2020, 13, 1–19.
  31. Turrin, M.; Von Buelow, P.; Stouffs, R. Design explorations of performance driven geometry in architectural design using parametric modeling and genetic algorithms. Adv. Eng. Inform. 2011, 25, 656–675.
  32. Vanegas, C.A.; Garcia-Dorado, I.; Aliaga, D.G.; Benes, B.; Waddell, P. Inverse design of urban procedural models. ACM Trans. Graph. 2012, 31, 1–11.
  33. European Parliament: P9_TA(2024)0138, Artificial Intelligence Act. Available online: https://www.europarl.europa.eu/doceo/document/TA-9-2024-0138_EN.pdf (accessed on 23 June 2024).
  34. Li, S.L.; Li, L.; Ming-Wei, C.A.O.; Cao, L.; Jia, W.; Liu, X.P. Rapid modeling of Chinese Huizhou traditional vernacular houses. IEEE Access 2017, 5, 20668–20683.
  35. Vince, J. Introduction to Virtual Reality; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2004.
  36. Safikhani, S.; Keller, S.; Schweiger, G.; Pirker, J. Immersive virtual reality for extending the potential of building information modeling in architecture, engineering, and construction sector: Systematic review. Int. J. Digit. Earth 2022, 15, 503–526.
  37. Sung, M. Graph-Based construction of 3D Korean giwa house models. Buildings 2019, 9, 68.
  38. House, B.R. A Procedural Interface Wrapper for Houdini Engine in Autodesk Maya. Doctoral Dissertation, Texas A&M University, College Station, TX, USA, 2019.
  39. Wang, S.; Wainer, G.; Goldstein, R.; Khan, A. Solutions for scalability in building information modeling and simulation-based design. In Proceedings of the Symposium on Simulation for Architecture and Urban Design, San Diego, CA, USA, 7–10 April 2013.
  40. Yenew, A.B.; Assefa, B.G. From Algorithms to Architecture: Computational Methods for House Floorplan Generation. SN Comput. Sci. 2024, 5, 589.
  41. Mazzoli, C.; Iannantuono, M.; Giannakopoulos, V.; Fotopoulou, A.; Ferrante, A.; Garagnani, S. Building information modeling as an effective process for the sustainable re-shaping of the built environment. Sustainability 2021, 13, 4658.
  42. Colmenero Fonseca, F.; Rodríguez Pérez, R.; Perlaza Rodríguez, J.; Palomino Bernal, J.F.; Cárcel-Carrasco, J. Sustainable Built Environments: Building Information Modeling, Biomaterials, and Regenerative Practices in Mexico. Buildings 2024, 14, 202.
  43. Nauata, N.; Hosseini, S.; Chang, K.H.; Chu, H.; Cheng, C.Y.; Furukawa, Y. House-GAN++: Generative adversarial layout refinement network towards intelligent computational agent for professional architects. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 13632–13641.
  44. Leroux, F.; Germain, M.; Clabaut, É.; Bouroubi, Y.; St-Pierre, T. Improving Three-Dimensional Building Segmentation on Three-Dimensional City Models through Simulated Data and Contextual Analysis for Building Extraction. ISPRS Int. J. Geo-Inf. 2024, 13, 20.
  45. Sun, C.; Han, J.; Deng, W.; Wang, X.; Qin, Z.; Gould, S. 3D-GPT: Procedural 3D modeling with large language models. arXiv 2023, arXiv:2310.12945.
  46. Liu, R.; Li, H.; Lv, Z. Modeling methods of 3D model in digital twins. CMES-Comput. Model. Eng. Sci. 2023, 136, 985–1022.
  47. Ebert, D.S.; Musgrave, F.K.; Peachey, D.; Perlin, K.; Worley, S. Texturing and Modeling: A Procedural Approach; Elsevier: Amsterdam, The Netherlands, 2022.
  48. Qi, C.R. Deep learning on 3D data. In 3D Imaging, Analysis and Applications; Springer: Berlin/Heidelberg, Germany, 2020; pp. 513–566.
  49. Baduge, S.K.; Thilakarathna, S.; Perera, J.S.; Arashpour, M.; Sharafi, P.; Teodosio, B.; Shringi, A.; Mendis, P. Artificial intelligence and smart vision for building and construction 4.0: Machine and deep learning methods and applications. Autom. Constr. 2022, 141, 104440.
  50. MEGA. Available online: https://mega.nz/file/YchWCSbD#D6Gxl9SwBYqzyUG6OpE4J_ocY06RgnopSurEuGBdiqM (accessed on 23 June 2024).
  51. 24AI. Available online: https://24ai.tech/en/tools/cut-object/ (accessed on 23 June 2024).
  52. CSM. Available online: https://ln.run/RJnYd (accessed on 23 June 2024).
  53. Kamra, V.; Kudeshia, P.; Arabinaree, S.; Chen, D.; Akiyama, Y.; Peethambaran, J. Lightweight Reconstruction of Urban Buildings: Data Structures, Algorithms, and Future Directions. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 902.
  54. Khayyal, H.K.; Zeidan, Z.M.; Beshr, A.A.A. Creation and Spatial Analysis of 3D City Modeling based on GIS Data. Civ. Eng. J. 2022, 8, 105.
  55. Somanath, S.; Naserentin, V.; Eleftheriou, O.; Wästberg, B.S.; Logg, A. Towards Urban Digital Twins: A Workflow for Procedural Visualization Using Geospatial Data. Remote Sens. 2024, 16, 1939.
  56. Usta, Z.; Cömert, Ç.; Akın, A.T. An interoperable web-based application for 3D city modelling and analysis. Earth Sci. Inform. 2023, 17, 163–179.
  57. Zhang, M.; Wu, J.; Liu, Y.; Zhang, J.; Li, G. GIS Based Procedural Modeling in 3D Urban Design. ISPRS Int. J. Geo-Inf. 2022, 11, 531.
  58. Barrio, R.S. Reimaging Earth. Architecture and the critical and speculative uses of geovisualization. City Territ. Archit. 2023, 10, 22.
  59. Sukkar, A.W.; Fareed, M.W.; Yahia, M.W.; Abdalla, S.B.; Ibrahim, I.; Senjab, K.A.K. Analytical Evaluation of Midjourney Architectural Virtual Lab: Defining Major Current Limits in AI-Generated Representations of Islamic Architectural Heritage. Buildings 2024, 14, 786.
  60. Jaalama, K.; Fagerholm, N.; Julin, A.; Virtanen, J.-P.; Maksimainen, M.; Hyyppä, H. Sense of presence and sense of place in perceiving a 3D geovisualization for communication in urban planning–Differences introduced by prior familiarity with the place. Landsc. Urban Plan. 2021, 207, 103996.
Figure 1. The research framework methodology and comparative analysis of two research patterns—by collecting verbal descriptions and collecting building images.
Figure 2. (a) Original image, (b) optical recognition and segmented image, (c,d) 3D model of the same building. Source: combination of Python and Blender.
Figure 3. (a) Original image, (b) optical recognition and segmented image, (c,d) 3D model of the same building.
Figure 4. (a) Original image, (b) optical recognition and segmented image, (c) 3D model of the same building.
Figure 5. Image before (a) and after (b) the step of separating the main object in the picture and the combination (c) that is intended to be eventually formed into a 3D model by AI (24AI 2024).
Figure 6. A much better result of the 3D model of the church in Munich compared to the model in Figure 3. (a) Original image, (b) optical recognition and segmented image, (c,d) 3D model of the same building.
Figure 7. Model of the cathedral (Frauenkirche in Munich) and surrounding buildings from different directions (a–d).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Nikçi, R.; Župan, R.; Racetin, I. Geovisualization of Buildings: AI vs. Procedural Modeling. Appl. Sci. 2024, 14, 8345. https://doi.org/10.3390/app14188345
