1. Introduction
Over the past few decades, buildings and infrastructure have been deteriorating [1,2,3]. To meet the safety requirements of building structures, periodic structural health monitoring and condition inspection are critical because serviceability degrades over time [3,4]. Therefore, the problem of aging structures must be addressed through the early identification of defects and systematic, continuous management, which reduces the deterioration risk of buildings and extends their serviceability over the building life cycle [1,5,6]. Periodic safety inspections are conducted to ensure sustainable building maintenance. Traditionally, field inspections, including visual inspection of the surfaces of structures, are primarily performed by a group of experts. However, experts are now in short supply while the stock of aging facilities is growing rapidly; as a result, manpower-oriented inspection can lead to poor safety inspections and subjective judgments, and it is time-consuming and costly.
To overcome these limitations, many studies have sought to reduce inspection costs and improve the objectivity of safety assessments by developing new approaches to visual inspection [5,6,7,8]. Recent studies have adopted both unmanned aerial vehicles (UAVs) and artificial intelligence (AI) applications in the fields of structural safety inspection and diagnosis [5]. The use of UAVs to inspect and monitor structures can enhance the quality and efficiency of traditional visual inspection processes [6,7,8]. Additionally, combining UAVs with an AI-based automatic inspection and decision-making support system can provide an alternative to, or support for, the existing expert-centered on-site inspection methods. However, most studies on UAV-based structure condition inspections have focused on bridge structures, and few have addressed residential buildings [7,8]. Unlike bridge facilities, residential facilities require the privacy of residents to be considered. Thus, specific factors must be considered for drone operations and data collection. Meanwhile, the defect information used in previous studies [5,6,7,8] consists of patch images generated from part of a structure. Moreover, case studies applying the developed models in actual field settings, such as residential building structures, are rare. Thus, uncertainty remains about implementing AI-based building inspection in the real world. In addition, operation and data acquisition methods can vary depending on the building structure.
To address this uncertainty about the applicability of the UAV-AI-based inspection process in residential buildings, this paper introduces how UAVs should be operated, what data must be collected for building condition assessment and AI analysis, and which data conditions are required for 3D modeling visualization, focusing on the residential building structure. Specifically, the building safety inspection cases performed through UAV-based inspection are analyzed in detail.
This study is organized as follows. Section 2 reviews the literature related to the UAV-AI-based inspection method. In Section 3, the methodology, including a case study and the UAV-AI building inspection process for residential buildings, is described. A holistic and descriptive analysis was applied to understand, step by step, the obstacles and challenges to performing the safety inspection process. In Section 4, we explain the execution method, from site selection to AI-based defect analysis and 3D modeling methods. Section 5 describes the results of AI-based defect detection and presents a plan for managing defect information using 3D models and coordinate information. Additionally, notable considerations for each stage are summarized in Section 6.
2. Literature Review
To preserve facilities and maintain their performance, it is important to accurately assess the condition of buildings [9,10,11,12]. Visual inspection is an important task performed in the building safety inspection stage because the building condition information collected through visual inspection is used as a basis for evaluating a building’s condition. Nevertheless, frequent cases of building performance degradation and accidents occur owing to poor safety inspections [10,11]. Such poor inspection can be avoided by collecting and evaluating accurate information. However, the increasing proportion of aging buildings and infrastructure makes it difficult to precisely inspect all facilities during the operation and maintenance phases. Therefore, efficient facility inspection methods are required.
Recently, the number of facility inspections adopting UAVs and AI has increased [13,14,15,16,17,18,19]. For example, Peng [13] and Li et al. [14] proposed UAV-based machine vision systems for recognizing bridge cracks and quantifying their width. Li et al. [15] proposed using Faster-RCNN to improve the efficiency and accuracy of bridge crack detection with UAVs. However, these studies focused only on detecting cracks, whereas the defects exposed in buildings or infrastructure appear in various forms. Thus, some studies have focused on methods that obtain multi-label defect information using image analysis for the automatic recognition and evaluation of building defects. For instance, Perry et al. [16] proposed a streamlined bridge inspection system for detecting and measuring cracks and spalling defects. Shin et al. [17] proposed an automatic concrete damage recognition model for classifying multiple defect classes using a CNN model. Detection models that identify the type and location of complex damage in an image have also been studied [18,19]. Previous studies [13,14,15,16,17,18,19] have shown that UAVs address the constraints of conventional visual inspection, enabling automatic defect analysis using large amounts of image data with more accurate and detailed defect assessments.
Nevertheless, some obstacles remain in inspecting building exteriors using UAV and AI technology, particularly regarding data collection and the representation of defect information for inspectors [20,21]. Many previous studies [22,23,24] concentrated defect detection on local image parts; this partial defect information makes it challenging to inspect large-scale structures efficiently. Defects can be localized within a single image, but locating the defect region with respect to the whole building structure demands substantial computing resources and time. To overcome this limitation, Xu et al. [22] proposed improving the efficiency of crack detection based on large-scene images acquired by UAV. The large-scene image is divided into a grid, and each grid cell covering part of the concrete bridge member is used as input to the detection model, which makes it possible to determine the relative position of a defect within the large image. Meanwhile, Kang [23] proposed a convolutional neural network (CNN) algorithm for damage detection and a geo-tagging method using an ultrasonic beacon system (UBS) for tracking the location of the UAV. The experiments were conducted in indoor environments and areas where GPS is denied or unreliable. However, monitoring the location of the UAV requires the UBS, which has a limited operating distance. The location information of UAVs and defects in UAV-based inspection processes can also be tracked through UAV flight planning. Li et al. [24] explored the application of an autonomous UAV inspection system for high-voltage power transmission lines. They proposed the autonomous planning of inspection paths along high-voltage power transmission lines and the detection of obstacles and damage using deep learning object detection models. However, few studies have inspected the entire exterior condition of a residential building while considering the UAV operation plan and data collection for defect information extraction.
Recently, three-dimensional (3D) models reconstructed using textured mesh methods have been used to visualize the damage state of real structures [25,26,27]. Three-dimensional models help record building condition information based on actual buildings. In particular, some studies have combined AI-based damage recognition with 3D mapping to accurately determine the condition of buildings damaged by earthquakes [25,26]. Notably, Pantoja-Rosero et al. [9,27] reconstructed a 3D geometrical model to map the crack damage detected by a deep learning model. It is an advanced digital shadowing approach for visualizing the current condition of buildings. However, their deep learning model does not detect other types of damage, such as delamination and leakages. Applying a generalized AI model to detect defect information requires a pre-trained multi-label detection model. Additionally, the UAV operation plan in the inspection process differs considerably according to the building design, as buildings vary in scale and exterior contour lines. Therefore, the applicability of the UAV-AI combined inspection process and 3D reconstruction method must be explored in further cases. In this regard, this study conducted a case study, selecting as a testbed deteriorating residential buildings that have been used for more than 30 years and are not under systematic government management.
Consequently, through holistic and descriptive case analyses, we propose a field-adapted UAV operation method and an AI-based defect detection model for residential buildings. Moreover, the lessons learned, which include the limitations of drone application, points for improving data collection, and items to be considered for AI- and UAV-based inspection of residential buildings, are summarized in this paper.
3. Methodology
3.1. Case Study Analysis
A case study analysis, as one of the qualitative research methods, aims to understand specific phenomena, explanations, and interpretations based on previous practical knowledge [28]. Indeed, a case study can be descriptive, explanatory, or exploratory [29]. In this study, the explanatory case analysis method was adopted to gain a deeper understanding of the automatic building inspection process using UAV and AI technologies. To begin with, we focused on explaining how UAV-AI-based practices are implemented in building condition inspection in the real world. Then, we investigated which data are needed to generate 3D models for visualizing the inspected buildings along with defect information. In this regard, the explanatory analysis is appropriate, as it provides a descriptive approach for applying UAVs and AI in building inspection. Moreover, it provides insights into how UAVs can be adopted for inspection and integrated with AI technologies.
3.2. UAV-AI Based Inspection Process for Residential Buildings
This study presents a UAV-AI building inspection process to illustrate how applying AI and UAVs to residential building inspection operates from a practical perspective. Figure 1 shows the UAV-based building inspection process, which comprises four steps: (1) preliminary, (2) data acquisition, (3) AI-defect detection, and (4) 3D reconstruction and defect extraction. The following subsections describe how each component operates in the process.
3.2.1. Preliminary
For the UAV operation to collect data from a residential building, it is necessary to check barrier factors in the external environment that can be problematic during the flight. First, the aviation regulations applying to the site must be checked and a preliminary survey of the surrounding area conducted. This pre-investigation confirms the areas where flight is not allowed at the national level (state confidential facilities, confidentiality of private companies). Flight availability differs according to the national aviation regulations of the destination country. Second, after confirming flight availability, the documents required by each country’s or local organization’s permit form must be prepared to request official approval. Once approval is granted, an official document is sent to the building manager or owner organization to coordinate the date and obtain permission to photograph. The site investigation is conducted after approval and permission. The flight maintenance interval is determined according to the shape of the exterior wall and the scale of the building, and the UAV operation plan is precisely established considering obstacles in the surrounding environment (trees, electric wires, solar panels, antennas, etc.). This proactive approach is essential to setting up the UAV operation plan, and a properly determined plan can avoid potential problems in the data collection stages.
3.2.2. Data Acquisition
In this study, a UAV flight plan was proposed for a precise UAV-AI-based building inspection. The comprehensive flight plan is outlined in Figure 2; it enabled the exterior condition data to be acquired through two distinct approaches. The automatic flight path-planning method (Figure 2a) is a comprehensive approach to data acquisition. This flight method, which is used only as a horizontal path plan, can acquire not only the exterior of the building but also the location and environmental data of the surroundings. The manual flight method (Figure 2b), in turn, yields pivotal data for establishing reference points for the subsequent 3D modeling of the building. Using manual flights as benchmarks for overlapping and aligning flight paths based on visual distances ensures a refined data collection process that encompasses building structural and environmental factors.
(1) Image Capturing Method for 3D Reconstruction of Inspected Buildings
To capture the external facade of the designated area, we implemented an interval capturing process involving altitude reduction after transitioning from the highest point of the building at +20 m to the intended capturing region. The camera’s orientation ranged from 45° to 60° during this procedure. Photographs were captured at a tilt angle of 0°, effectively representing the front view, which is essential for 3D data collection. For 3D model filming, a strategy was employed to minimize the distortion caused by camera curvature. After a region was filmed, subsequent shots were positioned to ensure an 80% to 90% overlap with the previous area. This strategy contributed to reduced distortion and enhanced continuity in the captured data. The overlapping approach was not limited to frontal angles; it extended to oblique angles, enabling the incorporation of three-dimensional information about the external walls.
Figure 3 shows the blind spots that occurred when capturing the orthophotos. This problem can be solved via flight operations by adjusting camera angles. This inclusive approach aimed to gather data regarding architectural features such as eaves, windows, and corner details. The tilt angle of the camera was adjusted between 45° and 60° to areas in which matching and clarity problems could occur.
Aerial photography was conducted at an elevation of +20 m above the tallest point of the building to collect data suitable for the automatic mapping function, which was run using context capture software (version 11.0). This configuration provided 80–90% overlap coverage over the designated capture area. To configure the automatic flight shooting method, we captured an initial orthogonal shot by tilting the UAV camera at a 90° angle, creating an encompassing view of the apartment’s overall structure. Subsequently, a second shot was captured at 70° to capture oblique views of the outer walls, enhancing the data collection process for areas that proved challenging during modeling.
(2) Data Collection Approaches to Extract Defect Positional Information
Figure 4 shows the flight planning and indexing methods for a representative plate-type structure (Figure 4 top) and tower-type structure (Figure 4 bottom). The capture procedure planned a flight path along the building outline (Figure 4a) and a capture sequence and area (Figure 4b) according to the building elevation. Subsequently, an index was assigned to each capture target so that the location of a defect extracted by the AI-defect detection method could be recorded together with the shooting and location information (Figure 4c). The index functioned as a reference to facilitate precise modeling based on location data and smooth data collection. Additionally, it served as a marker for landing times, initial UAV positions during battery replacement, and systematic data acquisition. The capture sequence of these images closely adhered to a predetermined parameter set to obtain high-definition data. This adherence ensured a consistent separation of 5 m from the building while maintaining safety throughout the process. This separation distance served as a safeguard against collisions with protruding building elements and simultaneously upheld data quality standards.
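The relationship between the 5 m separation, the 80–90% overlap, and the spacing between consecutive shots can be illustrated with a simple pinhole-camera calculation. The sketch below is only an illustration: the sensor width (13.2 mm) and focal length (8.8 mm) are assumed values that are not given in the text and should be replaced with the actual camera specification, and the function names are hypothetical.

```python
def footprint_m(distance_m: float, sensor_mm: float, focal_mm: float) -> float:
    """Extent of wall covered by one image along one axis (pinhole model)."""
    return distance_m * sensor_mm / focal_mm


def shot_spacing_m(footprint: float, overlap: float) -> float:
    """Distance between consecutive shots that yields the requested overlap."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return footprint * (1.0 - overlap)


# Assumed camera values (not from the text): 13.2 mm sensor width, 8.8 mm focal length.
width = footprint_m(distance_m=5.0, sensor_mm=13.2, focal_mm=8.8)
spacing_80 = shot_spacing_m(width, 0.80)  # spacing for 80% overlap
spacing_90 = shot_spacing_m(width, 0.90)  # spacing for 90% overlap
print(f"footprint {width:.2f} m, spacing {spacing_80:.2f} m to {spacing_90:.2f} m")
```

Under these assumptions, a higher overlap directly shortens the spacing between shots and therefore increases the number of images and the flight time per facade.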
3.2.3. Artificial Intelligence for Defect Detection
This stage describes the AI engine-based defect detection method. The AI engine plays a key role in extracting information by automatically analyzing defects on the exterior of buildings during the safety inspection process [30,31,32].
(1) Data preparation
The efficient identification of defect types and their precise locations is essential for successfully implementing AI-based automatic object detection within image data. The fundamental requirement lies in acquiring suitable data that facilitate the training of annotated images for deep learning systems. These images encompass vital data on the defect types and their corresponding spatial coordinates. UAVs were employed to capture images of deteriorated concrete buildings. Subsequently, from this extensive dataset, images explicitly revealing instances of defects were systematically selected and meticulously annotated. The annotation process involved the creation of bounding boxes around the areas of interest, which effectively delineated the specific regions containing defects.
This compilation of images and annotations was pivotal in training the AI-based defect detection model. The data in this study were organized into two datasets. The first (Dataset_P) is for developing the defect detection model; it was collected from several concrete surface walls in university buildings and annotated with bounding boxes and three types of defects (i.e., crack, delamination, and leakage). The second (Dataset_F), collected from five different buildings, is for validating the application of the pre-trained detection model. A summary of the datasets is presented in Table 1.
The two datasets were not used interchangeably. Dataset_P was used only for pretraining the defect detection model. The model training and validation were thus based on 219 images and 1803 annotations (i.e., 155 images and 1324 annotations for training, and 64 images and 479 annotations for validation). An illustrative example of label creation is presented in Figure 5. In this study, a bounding box was used to delineate the location of defects on the exterior of a building for precise identification.
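One possible way to represent such bounding-box annotations and to check the reported split sizes is sketched below. The record layout and field names are assumptions made for illustration; the annotation format actually used in the study is not specified.

```python
from dataclasses import dataclass

# The three defect classes named in the text.
DEFECT_CLASSES = ("crack", "delamination", "leakage")


@dataclass(frozen=True)
class Annotation:
    """One labeled defect: a class name plus a pixel-space bounding box (hypothetical layout)."""
    image_id: str
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def __post_init__(self):
        if self.label not in DEFECT_CLASSES:
            raise ValueError(f"unknown defect class: {self.label}")
        if self.x_min >= self.x_max or self.y_min >= self.y_max:
            raise ValueError("box must have positive width and height")


# Split sizes reported for Dataset_P.
split = {
    "train": {"images": 155, "annotations": 1324},
    "val": {"images": 64, "annotations": 479},
}
total_images = sum(part["images"] for part in split.values())
total_annotations = sum(part["annotations"] for part in split.values())
print(total_images, total_annotations)  # 219 1803
```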
(2) Development of AI-based defect detection model
In this study, we adopted the faster region-based convolutional neural network (Faster-RCNN) model [33] to detect building surface defects. Faster-RCNN is a convolutional neural network (CNN)-based detection model that discerns both the characteristics of defects and their precise locations within images. Moreover, CNN-based models for region classification have shown remarkable efficacy in the domain of object detection. The Faster-RCNN model has distinct advantages, notably offering heightened sensitivity in defect detection while alleviating the computational strain associated with the region proposal stage. Region proposal is the process of extracting potential defect regions from the image using a region proposal network (RPN). RPNs facilitate seamless end-to-end fine-tuning of a pretrained model. Indeed, the core role of an RPN is to produce object proposals that include localization information [34]. The regions extracted through the region proposal process were adjusted to a fixed size, which is a prerequisite for their use as input to a convolution-based network. Additionally, a classification mechanism was established by routing each region through a CNN. The sliding window method was employed to delineate the regions of interest (RoIs). This technique calculated the probability that a defect exists in each candidate region from the bounding box coordinates produced by spatial window filters and anchor boxes, which are defined as parameters [33]. The subsequent step involved applying the region proposals to RoI pooling. These pooled proposals were then directed to the second module, that is, the detection network. Here, the primary focus was on refining the object localization proposals to enhance the precision of defect detection and achieve precise spatial localization.
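To make the role of anchor boxes in the sliding-window step more concrete, the sketch below generates the candidate boxes that an RPN would score at each feature-map position. It is a simplified illustration, not the configuration used in this study: the stride, scales, and aspect ratios are the commonly cited Faster-RCNN defaults and are assumptions here, and the function name is hypothetical.

```python
from itertools import product


def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x_min, y_min, x_max, y_max) tuples in image pixels.

    One set of len(scales) * len(ratios) anchors is centred on every
    sliding-window position of a feat_h x feat_w feature map.
    """
    anchors = []
    for row, col in product(range(feat_h), range(feat_w)):
        # Centre of this feature-map cell, expressed in image coordinates.
        cx = (col + 0.5) * stride
        cy = (row + 0.5) * stride
        for scale, ratio in product(scales, ratios):
            # ratio = height / width; the anchor area stays at scale ** 2.
            w = scale / ratio ** 0.5
            h = scale * ratio ** 0.5
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = make_anchors(feat_h=2, feat_w=3)
print(len(anchors))  # 2 * 3 positions * 9 anchors each = 54
```

In a full detector, each of these anchors receives an objectness score and box offsets from the RPN, and the highest-scoring boxes are passed on to RoI pooling.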
3.2.4. 3D Reconstruction and Defect Extraction
In this case study, the images collected according to the automatic and manual flight path plans were used to reconstruct 3D models of the residential buildings. For efficient image mapping, elements unnecessary for the 3D reconstruction were removed. This prevents lengthy model synthesis times caused by processing excessive data on the surrounding environment, which was not the target of the automatic mapping. Then, a three-dimensional configuration was performed, generating the basic shape of the 3D model in the initial stage from the GPS satellite signal-based images, using a vector that specifies an arbitrary position and scale. When constructing the shape of the building exterior, manually positioning detailed images that have unstable GPS signals can help improve the quality of the model. On the 3D model that has completed the matching process, the defects extracted by the AI model are presented on the elevation map for each building.
4. Case Analysis of AI-UAV-Based Building Inspection
4.1. Case Study Selection
This study analyzed five cases that were selected through pre-investigation. The selected buildings are deteriorating residential buildings that have been in use for over 30 years and are not under systematic government maintenance and management. The AI-adopted UAV inspection method was applied in field tests to verify the efficiency of the proposed approach under various building conditions, such as ashlar lines, heights, and period of use, as shown in Table 2. The buildings have diverse numbers of floors, ranging from five to fifteen.
In these case studies, the Phantom 4 RTK aircraft was used, considering the characteristics of the inspection site environment and the collected data. For residential building inspections, a low-noise aircraft is required in consideration of residents’ complaints. Small drones can easily reduce the noise for residents and the discomfort associated with drone operations. Additionally, the image resolution was considered in collecting image data on the outer wall because the drone was small and was required to maintain a distance from the building exterior of more than 5 m. Because the flight time required for the inspection plan and path was more than 20 min, battery capacity was considered when selecting UAVs.
4.2. Performance of the Automatic Defect Detection Method
Figure 6 shows the results of automatic defect detection from the surface images of building structures. The bounding boxes on the images were selected based on intersection over union (IoU). The IoU is the ratio of the intersection to the union of the true region and the predicted region of the bounding box. In this study, a bounding box was treated as a positive detection when its IoU value was greater than 50%. The metric is as follows:
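Written in standard notation, where $B_{p}$ denotes the predicted box and $B_{gt}$ the ground-truth box (symbol names assumed here for illustration), the IoU takes the following form:

```latex
\mathrm{IoU} = \frac{\operatorname{area}\left(B_{p} \cap B_{gt}\right)}{\operatorname{area}\left(B_{p} \cup B_{gt}\right)} \tag{1}
```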
The precision-recall curve (PRC) was used to examine the detection performance, and the mean average precision (mAP) was calculated [35]. Generally, the PRC is used as a metric of AI model performance, presenting the relationship between precision and recall. Precision is the proportion of the model’s positive predictions that are actually true, whereas recall is the proportion of all actual true values that the model predicts correctly. Each metric is shown as follows:
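In terms of the numbers of true positives ($TP$), false positives ($FP$), and false negatives ($FN$), the standard definitions consistent with this description are as follows (the equation numbers are inferred from the later references to Equations (4) and (5)):

```latex
\mathrm{Precision} = \frac{TP}{TP + FP} \tag{2}

\mathrm{Recall} = \frac{TP}{TP + FN} \tag{3}
```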
Here, the true positive, false positive, and false negative counts change depending on the threshold; thus, the precision and recall values also vary with the threshold value. In other words, the PRC graphically represents how the relationship between precision and recall varies as the threshold changes. The AP summarizes the shape of the PRC and is defined as the mean precision over a set of recall levels, as in Equation (4). Meanwhile, mAP is a metric used for multi-class prediction and represents the average of each class’s average precision (AP), as shown in Equation (5).
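A common way to write these two definitions is given below. The set of recall levels $R$ and the use of interpolated precision $p_{\mathrm{interp}}$ follow a widely used object-detection convention and are assumptions here, since the exact recall levels are not stated; $N$ is the number of defect classes and $\mathrm{AP}_{i}$ the average precision of class $i$.

```latex
\mathrm{AP} = \frac{1}{|R|} \sum_{r \in R} p_{\mathrm{interp}}(r),
\qquad
p_{\mathrm{interp}}(r) = \max_{\tilde{r} \ge r} p(\tilde{r}) \tag{4}

\mathrm{mAP} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{AP}_{i} \tag{5}
```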
This experiment demonstrated that applying the proposed model to the visual inspection of concrete defects in practical environments is feasible, with an mAP of 42.93%. Figure 7 shows the automatic defect detection performance of the Faster-RCNN model according to defect type, namely cracks, delamination, and leakages. For crack defects, the average precision (AP) was 40.5%, and the delamination and leakage types of defects presented results of 49.77% and 38.53%, respectively.
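The evaluation quantities described in this subsection can be checked numerically. The following minimal sketch computes the IoU of two boxes and reproduces the reported mAP from the three per-class AP values; the function name and the example boxes are illustrative assumptions, not part of the study.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: half of each box overlaps the other.
example_iou = iou((0, 0, 2, 2), (1, 0, 3, 2))
is_positive = example_iou > 0.5  # detection threshold used in the text

# Per-class AP values (in %) reported in the text.
ap_per_class = {"crack": 40.5, "delamination": 49.77, "leakage": 38.53}
map_value = sum(ap_per_class.values()) / len(ap_per_class)
print(f"IoU = {example_iou:.3f}, mAP = {map_value:.2f}%")
```

The mean of the three class-wise AP values is 42.93%, which agrees with the reported mAP.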
4.3. Application of AI-Based Building Surface Defect Inspection
In this study, the applicability of a pretrained automatic defect detection model was confirmed by applying the model to inspected buildings. This model was adopted for five cases. In each case, the model was tested on approximately 40 annotated images. However, according to the experimental results, the pretrained model exhibited low performance because it had not been trained on the data collected from the cases. Therefore, in this study, the pretrained model was fine-tuned with some of the images collected in each case and then reapplied. The 40 images of each case building included in Dataset_F were divided into training data (30 images) and test data (10 images) for fine-tuning the pretrained model, which had been trained on Dataset_P. Figure 8 shows the results before and after fine-tuning the automatic defect detection model. We found that the pretrained model can detect defects even without fine-tuning, but its performance is very low. In contrast, fine-tuning the model with a very small amount of data that does not overlap with the test data achieved a higher defect detection rate.
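A non-overlapping 30/10 split of the kind described above could be produced as in the following sketch. The file names, the fixed random seed, and the helper name are hypothetical; how the images were actually assigned to the two subsets is not stated.

```python
import random


def split_case_images(image_ids, n_train=30, seed=0):
    """Shuffle one case's images reproducibly and split them into train and test lists."""
    ids = sorted(image_ids)           # fixed starting order, independent of input order
    random.Random(seed).shuffle(ids)  # local RNG, so global random state is untouched
    return ids[:n_train], ids[n_train:]


# Hypothetical file names for the 40 annotated images of one case building.
case_images = [f"case1_{i:03d}.jpg" for i in range(40)]
train_ids, test_ids = split_case_images(case_images)
print(len(train_ids), len(test_ids))  # 30 10
```

Keeping the test images out of the fine-tuning set, as here, is what allows the before and after comparison to reflect performance on unseen images from the same building.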
4.4. 3D Model-Based Defect-Location Extraction
The state of a building can be visually expressed through the 3D modeling of image information. In this study, context capture software was used to generate 3D information. To provide practical information on the defects on the surfaces of the building structures, we related the local defect information to images of the whole building. The defect information was extracted in two steps. The first step was to detect the location and type of defect using the AI network. Then, the indexing method, which was established in the flight path plan, was used to track the position on the building exterior of each image containing defect information. The predefined index method enables the defect locations across the entire building to be determined and tracked. Inspectors can use this information to observe the location and shape of a defect and intuitively communicate the exterior condition of a building. The location information of the image can be used to determine the location of the defect by matching it to the index marked in the 3D model. Figure 9 shows the defect information presented through 3D modeling and the index method. The types and locations of defects were extracted from the AI-based defect detection model.
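The idea of combining a capture index with a detected bounding box can be sketched as follows. This is a simplified illustration under stated assumptions: images are taken on a regular grid facing the facade, the index is a (facade, column, row) triple, and the footprint and spacing values are hypothetical. The indexing scheme of Figure 4 may differ in its details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureIndex:
    """Hypothetical index assigned to each image along the flight path."""
    facade: str  # label of the building elevation, e.g. "A"
    column: int  # 0-based position along the horizontal flight direction
    row: int     # 0-based position from the top of the facade


def defect_position_m(index, box_px, image_size_px, footprint_m, spacing_m):
    """Map the centre of a detected box to facade coordinates in metres.

    The origin is the top-left corner of the image at column 0, row 0.
    """
    x_min, y_min, x_max, y_max = box_px
    img_w, img_h = image_size_px
    fp_w, fp_h = footprint_m
    step_x, step_y = spacing_m
    # Fractional position of the box centre inside its image.
    u = (x_min + x_max) / 2 / img_w
    v = (y_min + y_max) / 2 / img_h
    x = index.column * step_x + u * fp_w
    y = index.row * step_y + v * fp_h
    return index.facade, x, y


# Hypothetical detection centred in the image taken at column 2, row 1 of facade "A".
position = defect_position_m(
    CaptureIndex("A", column=2, row=1),
    box_px=(2636, 1724, 2836, 1924),
    image_size_px=(5472, 3648),
    footprint_m=(7.5, 5.0),
    spacing_m=(1.5, 1.0),
)
print(position)
```

A position expressed this way can then be matched to the corresponding region of the elevation map or 3D model, which is the role the index plays in the process described above.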
5. Lessons Learned and Limitations
From the case analysis, several factors were identified for the successful application of UAVs and AI technologies in building inspection planning and monitoring processes. Based on a holistic and descriptive analysis, we summarize the findings and problems. First, site investigations must be conducted to plan a proper flight path. Additionally, any flying altitude restrictions in the region and whether obstacles exist near the inspected buildings must be determined in advance. Flight planning, including the flight path, must consider the building shape so that defect data can be collected within the timeframe of a UAV’s battery life. Because battery capacity may prevent the UAV from collecting all the information on a large building structure in one flight, a battery supply plan, such as a charging plan or spare batteries, should also be considered. In this regard, inspection zones must be set according to the target part and position so that the inspection order can be determined.
In the UAV operation and data collection phase, owing to residents’ privacy and the risk of the UAV colliding with the building, flight close to the building surface was restricted; thus, the UAV was operated at least 5 m from the exterior of the building. However, collecting images from this distance caused a loss of defect information on the exterior walls (reduced resolution of the collected images) and consequently lowered the precision of quantitative measurement. Therefore, the collected data were required to have a high resolution. In this case study, the Phantom 4 RTK was used with a 20-megapixel high-resolution camera to identify and analyze small and large defect-related features in the collected data. This equipment facilitated flying near the building with low noise while collecting high-quality image data. Through the case analysis, we identified several constraints that primarily occur in residential buildings, such as denied flight areas, flight interruption due to resident complaints, privacy issues, distance from adjacent buildings, and battery life issues. Therefore, the restrictions on drone operations must be resolved through sufficient consultation and consent before inspections are carried out.
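The effect of the stand-off distance on image resolution can be estimated with the usual sampling-distance relation, that is, the size of wall surface covered by one pixel. The sketch below is illustrative only: the sensor width (13.2 mm), focal length (8.8 mm), and image width (5472 px) are assumed values for a 20-megapixel camera and are not given in the text.

```python
def gsd_mm_per_px(distance_m, sensor_width_mm, focal_mm, image_width_px):
    """Wall surface covered by one pixel, in millimetres (pinhole model)."""
    return distance_m * 1000.0 * sensor_width_mm / (focal_mm * image_width_px)


# Assumed camera parameters (not from the text).
gsd_5m = gsd_mm_per_px(5.0, sensor_width_mm=13.2, focal_mm=8.8, image_width_px=5472)
gsd_10m = gsd_mm_per_px(10.0, sensor_width_mm=13.2, focal_mm=8.8, image_width_px=5472)
print(f"{gsd_5m:.2f} mm/px at 5 m, {gsd_10m:.2f} mm/px at 10 m")
```

Because the sampling distance grows linearly with the stand-off distance, doubling the distance halves the detail available for measuring defect dimensions, which is consistent with the loss of measurement precision noted above.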
From a 3D modeling perspective, the quality of the data collected in the UAV significantly affects the generation of the 3D reconstruction model. The building shapes of the case analyzed in this study can be broadly divided into two types: flat and tower. However, for the tower type, blind spots occurred during the collection of image data, which limited the ability to collect high-quality images. Consequently, it affected the quality of the 3D modeling shape. This implies that when photographing buildings, the photography method must be reviewed to achieve not only the objective of acquiring defect information but also appropriately conveying visual information about the building’s shape. Thus, for appropriate 3D modeling, we propose a field-adaptive model-capturing method that removes blind spots by automatically shooting above the building and adjusting the tilt angle. Consequently, a photography method manual should be established according to the shape of the building to prepare a photography method suitable for its characteristics.
For AI-based defect detection, an object detection model capable of multiclass classification is required. Therefore, in this study, a defect-recognition model was used to detect cracks, delamination, and leakages. However, we confirmed that the performance of the pretrained model must be improved to collect sufficient building defect information. The existing model was trained using 1300 annotations, but its performance on data collected in a new environment was slightly reduced because it had not been trained on sufficiently diverse patterns. Thus, in this study, we fine-tuned the existing model by training it with some of the data collected for each case, using data that the model had not previously been trained on. As a result, the performance improved compared with that of the non-fine-tuned model. Owing to the nature of deep learning, large volumes of training data, including data collected in various environments, must be used to automatically detect defects on building exteriors. However, securing an annotated dataset of defects on building exteriors remains a limitation. Based on this case study, we suggest that a recurring fine-tuning method is required to adopt a deep-learning-based inspection approach. The recurring fine-tuning method involves applying a non-fine-tuned model to the building being inspected and using the recognized defect information as fine-tuning training data. This has the advantage of reducing the effort required for initial data processing when applying AI models and of ensuring sustainability in AI utilization.
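The recurring fine-tuning idea described above can be outlined as a minimal sketch. The `detect` and `finetune` callables stand in for an actual detection model and training routine and are hypothetical, as are the `Detection` structure and the confidence threshold used to select recognized defects; none of these details are specified in this study.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Defect classes named in this study.
DEFECT_CLASSES = ("crack", "delamination", "leakage")


@dataclass(frozen=True)
class Detection:
    image_id: str
    label: str
    score: float                        # model confidence in [0, 1]
    box: Tuple[int, int, int, int]      # (x_min, y_min, x_max, y_max) in pixels


def select_finetuning_samples(detections: Sequence[Detection],
                              min_score: float = 0.8) -> List[Detection]:
    """Keep confident detections of known defect classes as candidate
    fine-tuning annotations. The threshold is an assumed parameter."""
    return [d for d in detections
            if d.label in DEFECT_CLASSES and d.score >= min_score]


def recurring_finetune(detect: Callable[[str], List[Detection]],
                       finetune: Callable,
                       images: Sequence[str],
                       rounds: int = 2,
                       min_score: float = 0.8):
    """Repeatedly apply the current model to the inspected building's
    images and fine-tune it on its own recognized defects.

    Returns the final detector and the number of samples used per round.
    """
    samples_per_round = []
    for _ in range(rounds):
        detections = [d for image in images for d in detect(image)]
        samples = select_finetuning_samples(detections, min_score)
        samples_per_round.append(len(samples))
        if samples:
            # `finetune` is assumed to return an updated detector.
            detect = finetune(detect, samples)
    return detect, samples_per_round
```

In practice, the selected detections would likely need to be reviewed before being used as training data, because errors in the recognized defects could otherwise be reinforced across rounds.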
6. Conclusions
This study discussed the application of UAVs to inspect and monitor defects in residential buildings and structures. The case studies demonstrated an effective UAV-based visual inspection process integrated with AI technologies. UAVs serve as excellent aids in safety checks, offering safety managers an additional perspective during onsite inspections and streamlining monitoring processes. Particularly in high-rise buildings and large infrastructures, where safety managers are limited, UAVs can play a crucial role in regularly inspecting inaccessible, difficult-to-reach, or hazardous areas, thereby improving overall awareness of structural conditions and safety.
To this end, this study proposed a UAV- and AI-integrated building inspection process. A case study was then conducted using a holistic and descriptive case analysis method, in which five cases were selected for an in-depth understanding of the adoption of the UAV and AI integration process for safety inspection. Insights were derived from this comprehensive case study analysis, and the limitations of the current inspection and monitoring processes in residential buildings were identified. In this regard, several important factors (e.g., UAV operation plan, data collection method, and AI application) were extracted to determine how the AI-integrated UAV inspection method can be effectively applied to field inspection tasks. Additionally, the defect detection model was assessed to better understand the usefulness of AI technology in recognizing each defect type from the visual data collected by the UAV during the inspection process. This case study provided detailed considerations for the application of UAVs and AI in each phase of building inspection. Future work will include enhancing the performance of the AI-based multi-defect detection model using state-of-the-art networks for the general application of autonomous inspection models, as well as an AI-based building condition assessment framework with accurate quantity measurement approaches. Moreover, an automatic digital shadowing method for the digital twin model will be addressed for the digital transformation of building information. The problems discussed and the results derived in this study can contribute to future AI- and UAV-based building inspections and to future research.