Rapid Identification Method for Surface Damage of Red Brick Heritage in Traditional Villages in Putian, Fujian

Huang, Linsheng; Xu, Yian; Chen, Yile; Zheng, Liang

doi:10.3390/coatings15101140

by Linsheng Huang ¹, Yian Xu ¹, Yile Chen ²,* and Liang Zheng ²,*

¹ College of Civil Engineering, Putian University, No. 2121, Zixiao East Road, Licheng District, Putian City 351100, China

² Faculty of Humanities and Arts, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau 999078, China

* Authors to whom correspondence should be addressed.

Coatings 2025, 15(10), 1140; https://doi.org/10.3390/coatings15101140

Submission received: 24 August 2025 / Revised: 14 September 2025 / Accepted: 22 September 2025 / Published: 2 October 2025

(This article belongs to the Section Surface Characterization, Deposition and Modification)

Abstract

Red bricks are an important material for load-bearing and enclosing structures in traditional architecture and are widely used in construction in China and abroad. Owing to geography, trade, and migration, Fujian red bricks spread to Taiwan and various regions of Southeast Asia, giving rise to distinctive red brick architectural complexes. Traditional red brick buildings suffer damage such as cracking and missing bricks under the combined influence of climate and human activity. To investigate these damage types, this study takes Fujian red brick buildings as its research subject and applies the YOLOv12 rapid detection method to support structural assessment, damage type detection, and damage localization on red brick surfaces. The experimental workflow comprised on-site photo collection, image slicing and annotation, creation of an image training set, iterative model training, accuracy analysis, and verification of the experimental results. On this basis, the causes of each damage type and corresponding countermeasures were analyzed. The study aims to use computer vision image recognition to provide a practical, automated, and efficient method for identifying damage types in red brick masonry, particularly physical and mechanical damage that severely threatens the overall structural safety of a building. The model reduces the complex manual processes that inspection typically involves, thereby improving work efficiency. It enables customized, timely, minimal-impact intervention strategies for the maintenance, repair, and preservation of red brick buildings, further advancing the practical application of intelligent protection for architectural heritage.

Keywords: red brick buildings; structural damage; machine learning; detection; identification

1. Introduction

The use of red brick materials has a long history, widely employed in industrial, educational, residential, and religious buildings across both Eastern and Western cultures. The emergence of bricks and tiles in China occurred relatively early, with significant advancements in their firing techniques. During the Ming and Qing dynasties, bricks and tiles were extensively employed in building walls, roofs, floors, and intricate decorative details. Fujian’s clay, rich in iron oxide, produced red bricks with a vibrant hue. Consequently, red bricks became a defining symbol of Fujian’s regional architecture [1,2]. With the commercial and migratory activities of Fujian merchants both domestically and internationally, red brick architecture spread widely across China’s southeastern coastal regions and Southeast Asia. Examples include the Mazu temples in Taiwan and Southeast Asia, the shophouse architecture in Malaysia, and the neighborhoods in Singapore. The architectural pillars, walls, windowsills, and other structural elements in these regions predominantly feature red brick as the primary surface material, showcasing distinctive cultural characteristics and local identity. Evidently, red brick architecture embodies dual attributes of both rural and maritime cultures, serving as a testament to the dissemination and evolution of Chinese culture overseas. In 2012, Fujian’s Min Nan red brick architecture was included in China’s tentative list of World Heritage sites. In 2023, China issued guiding principles supporting Fujian’s application to designate Min Nan red brick architecture as a World Cultural Heritage site, further accelerating the progress of this nomination. Red brick architecture has emerged as a vital tangible cultural heritage for interpreting cultural exchanges between different regions both domestically and internationally.

Red brick, as the surface structural material in red brick architecture, serves to enclose spaces and protect internal structures while also embodying cultural and aesthetic intentions. In red brick cultural district buildings across China and abroad, red brick walls have long functioned as primary decorative surfaces and even load-bearing structural elements. Their integrity and stability are crucial to the survival and continuity of red brick architecture. However, prolonged exposure to natural and human environments has led to varying degrees of structural damage in brick materials under conditions such as climate exposure and impact from human activities, most notably manifested as the continuous expansion of surface cracks [3]. Additionally, influenced by complex environmental factors and real-world changes in Asia and Europe, brick surfaces are prone to erosion from water and salt, leading to material degradation, cracking, and deformation. These damage phenomena directly or indirectly lead to a decline in the quality of building use [4]. The underlying mechanisms causing this impact include the influence of soluble salt corrosion within the brick material [5,6,7] and changes in the loads borne by the brick walls of buildings [8]. The degree of aging and degradation of brick materials is also critical to the safety of building structures. Effective identification models and health monitoring are important means of improving the management of such traditional brick building heritage [9]. Although certain destructive processes are relatively slow, resulting in little or no noticeable change or deformation in the structural elements and surface texture of materials [10], it should be recognized that once damage becomes apparent, such as obvious cracks, gaps, and material spalling or loss, this indicates that the safety of the building is at significant risk. Any damage has a significant impact on the overall characteristics of the building [11].

Faced with the impact of modern building materials, forms, and extreme climate change, the vulnerability of red brick buildings has become increasingly apparent. It is therefore necessary to develop accurate identification techniques that determine the exact type of damage and the structural characteristics of brick building surfaces. Traditional professional testing relies primarily on on-site analysis and inspection by technicians, who use specialized equipment to assess the extent of surface damage and generate evaluation reports. This approach depends on the expertise and experience of the inspectors, so damage assessments may carry subjective uncertainty, which can significantly affect subsequent structural safety evaluations and restoration work [12]. As scientific instruments have matured, scholars have begun to use various detection instruments to study the damage condition of heritage building materials. Examples include the application of acoustic emission (AE) technology to the inspection of masonry structures [13], the use of laser scanning and digital image processing to assess damage in stone-built heritage sites [14], and the application of infrared thermal imaging to non-destructive testing of concrete and masonry bridges [15,16]. The maturity of these technologies allows architectural heritage materials to be assessed effectively and addressed in a timely manner. However, they require specialized scientific equipment, which makes them difficult to operate and costly for large or numerous heritage sites. This hinders their widespread adoption and limits their use as routine inspection tools.

With the development of computer vision, significant advantages have emerged in the inspection of building materials and structures. The technology is widely applied to damage detection research in civil engineering, including concrete cracks, bridge and tunnel inspections, and post-earthquake building assessments [17]. In the inspection of architectural heritage materials, machine learning-based methods have demonstrated significant advantages in the automatic detection, segmentation, and measurement of large-area surface damage. This has been validated through experimental studies on the glazed roof tiles of the Forbidden City Museum in China [18]. Building on this, scholars have proposed methods based on region-based convolutional neural networks (R-CNNs) [19] to identify and count intact and damaged components. For example, the YOLOv5 model and the Faster R-CNN framework were used to identify four types of defects in Indian tomb materials: discoloration, exposed bricks, cracks, and peeling [20]. Such methods have also been applied to historical brick buildings in Macau [21] and to damage type detection on the bricks of classical gardens in Suzhou [22], yielding useful data on the current condition of the materials. However, research into intelligent detection systems for red brick structures remains underdeveloped, and no directly applicable, effective detection technique currently guides the assessment of damage types in red brick materials. Establishing corresponding models with computer vision would enable the identification, authentication, evaluation, and classification of surface damage patterns on bricks, thereby facilitating the effective preservation of red brick architectural heritage.

The innovation of this study lies in constructing an efficient and accurate detection model for identifying surface damage types on red brick materials in complex scenarios. It establishes a scientific method for the automatic recognition and localization of brick surface damage types, achieving comprehensive detection and evaluation of damage characteristics, spatial distribution, and developmental trends. Simultaneously, this research aims to provide essential and reliable scientific guidance and evidence for the conservation and management of red brick architecture through intelligent detection of these damage types. Achieving this objective will not only advance routine maintenance of red brick structures, effectively slow material deterioration, and ensure overall structural safety but also offer theoretical and technical support for the nomination of red brick architecture as World Cultural Heritage, thereby promoting the sustainable development of this architectural heritage. At the application level, the primary tasks of this research include first developing an intelligent detection model for surface damage types in traditional red brick materials, which can provide a reference for intelligent heritage protection and effectively address common issues arising from manual efficiency and subjective judgment errors. Second, based on the research findings, it can promote interdisciplinary research across fields such as heritage conservation, computer science, and software engineering. This will offer scientific insights for studying red brick architectural heritage and other similar material structures, expanding the model’s practical application value in engineering projects.

2. Materials and Methodology

2.1. Material Preparation: Image Acquisition Area—Putian City

Red brick architecture in Putian City, Fujian Province, was selected as the research sample due to its characteristic climate and distinctive red brick construction techniques. The research team identified over a hundred of the densest and best-preserved red brick building clusters in Putian City. Ultimately, the study selected sites including Lou Tou Village and Shuang Fu Village (Table 1), where Shuang Fu Village is a traditional Chinese village and Lou Tou Village is listed as a historical and cultural town in Fujian Province. Both are typical areas with clusters of red brick buildings and are important research areas for the nomination of Min Nan red brick architecture as a UNESCO (United Nations Educational, Scientific and Cultural Organization) World Cultural Heritage site (Figure 1).

Preliminary research by the team found no publicly available datasets of red brick buildings. The team therefore made multiple site visits and collected 374 first-hand images of Fujian red brick buildings. Images were captured with a Canon EOS M6 (Canon (China) Co., Ltd., Zhuhai, China) at a resolution of 6000 × 4000 pixels, ensuring the clarity of the samples. The images focused on the various types of damage to red brick buildings and on their overall structural integrity. Because the number of effective samples is the foundation for training machine learning models, the team established image selection criteria, prioritizing clear, front-facing images that plainly showed damage features and excluding samples with poor angles, blurring, or low recognizability. Ultimately, 164 red brick images were retained as experimental samples, comprising 78 cracking samples and 86 missing-material samples. Meticulous categorization of the damage types in these images gave the model the information it needs to learn and distinguish damage features, supporting the validity and accuracy of the experimental results.

2.2. The Red Brick Material and Its Damage Types in This Study

This study selected typical red brick buildings in Putian within the candidate area for the red brick heritage application. In these buildings, red brick is primarily used in arches, windowsills, walls, and decorative elements (such as brick carvings) (Figure 2). It is worth noting that the wall of most red brick buildings consists, from the exterior to the interior, of red bricks, earthen bricks (or rammed earth), and a white lime plaster layer. Red brick therefore not only serves a load-bearing function for the building but also protects the earthen layers from rainwater damage. Consequently, this study focuses on surface structural damage to bricks in red brick architectural heritage areas, so that further deterioration of the brick structure does not cause irreversible harm to the overall building.

The research team conducted thorough on-site investigations of red brick structures and visited disused brickworks to consult with craftsmen and architects. We identified three primary causes of damage to red brick building materials.

First, we considered the impacts of climatic and hydrological conditions on brick structures. Coastal regions such as Fujian, Guangzhou, and Southeast Asia experience frequent rainfall and typhoons during the summer months (Figure 3). The humid environment causes significant temperature differences within the internal structure of brick materials, leading to expansion from within the bricks, which can result in cracking and even the loss of bricks. Exposed brick structures further damage the overall building.

Second, we considered issues related to the production process and construction techniques of red bricks. During the firing process, factors such as uneven temperature distribution, insufficient firing time, and craftsmen’s negligence can lead to deviations from production standards, resulting in variations in the hardness and oxidation levels of the bricks upon leaving the factory. If the flatness of the masonry during construction is poor, this can further impair the durability of the red bricks, leading to their detachment and posing a threat to the structural safety of the building (Figure 4).

Third, we considered conflicts between traditional alleyway dimensions and human activities. The early alleys formed by red brick architectural heritage are generally quite narrow. As vehicles and everyday items have grown larger, the original alley widths can no longer accommodate their transport, and vehicles frequently collide with walls, damaging the surface structure of the bricks. The exposed internal structure accelerates the cracking of red bricks and strips the walls of their first layer of protection, allowing plants and microorganisms to take hold and continuously erode the brick structure, which affects the safety of the building in use.

These factors have caused substantial and irreversible damage to red brick materials. The research team collected on-site data, conducted comparisons and analyses, and ultimately identified the two most common and typical types of damage to red brick building materials (Figure 5): cracking and loss. Both directly threaten the integrity of the building’s load-bearing structure:

(1) Cracking: Based on the data available to the team, the red brick manufacturing process is an objective factor influencing cracking, particularly whether the construction process strictly adheres to standards. Generally, incomplete firing and insufficient cooling time can lead to inadequate evaporation of moisture within the red brick material. Prolonged storage allows water molecules to evaporate, creating internal pressure on the brick structure and resulting in cracking. Additionally, human activities such as handling and transportation, along with physical friction and collisions with walls, can also cause cracking in brick materials. Of course, temperature changes and building foundation settlement are also major contributing factors to cracking. Once cracked, red bricks expose their internal structure to the natural environment over the long term, which is a primary factor leading to accelerated deterioration, peeling, and deformation of the bricks [23].

(2) Material missing: Factors such as internal corrosion and mechanical impact in brick structures can cause bricks to fall off or disintegrate, resulting in noticeable gaps or holes. These gaps often become a “breeding ground” for plant seeds, and the long-term attachment and growth of plants can cause fatal damage to the surface structure of the bricks, easily leading to the collapse of red brick walls. As a result, the expansion of material loss on the red brick surface exposes the internal structure of the wall. This not only damages the internal wall structure but also exacerbates the overall safety risks of the brick wall. Enhancing the intelligent identification of material loss and promptly conducting masonry repairs is a necessary measure to ensure the safety of red brick buildings.

Overall, to prevent damage such as cracking and loss of red brick surfaces from causing significant impacts on walls and internal structures (Figure 6), it is crucial to identify these two types of damage in a timely, effective, and rapid manner to protect red brick architectural heritage.

2.3. Research Process

Unlike traditional methods that rely on manual on-site measurement, assessment, and judgment of red brick damage (Figure 7), this study primarily proposes a machine learning method that combines a computer vision model based on YOLOv12 with the protection of architectural heritage (materials). Through model iteration training, type testing, and application, many wall samples can be detected in a short period. This method can objectively, efficiently, and accurately define and classify the types of damage on brick surfaces, as well as quickly identify and locate them, which helps to promptly implement targeted repair measures. Additionally, the data obtained through this method can better assist researchers in conducting an in-depth exploration of red brick building materials. However, this method must be implemented under strict and standardized procedures to achieve high precision and widespread application. The main process consists of six steps (Figure 8).

(1) Data collection

The quantity and quality of data are critical factors in training the computational models in this study. A complete dataset includes a training dataset, a validation dataset, and a test dataset. Experimental studies have shown that the model learns and trains most effectively when the number of images is 100–200 times the number of labels [24]. To give model training a solid data foundation, the research team collected image data on site, which avoids the tampering, color manipulation, and human intervention that may affect online data. Data collection emphasized regional representativeness and typological diversity, and the collected data and imagery were categorized, managed, and archived. This study focused on collecting and organizing data for two types of damage, namely cracking and material missing. Field collection took place on sunny, rainy, and cloudy days, and the resulting changes in lighting were deliberately included in the sample scenarios to make the model more robust to environmental conditions. This varied collection environment captured how the damage types appear under different conditions, so that the machine learning model can adapt to a wider range of real-world environmental changes. The entire data collection process spanned six months, covered two main villages, and encompassed various environmental changes. Detailed investigations were conducted on the size, extent, color, and severity of the damage types, ensuring the authenticity and objectivity of the samples.
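As a rough illustration of the three-way split described above, the following sketch partitions image file names into training, validation, and test subsets while keeping the two damage classes in proportion. The 70/15/15 ratio, the fixed seed, and the file names are assumptions made for the example; the study does not report its split ratios.

```python
import random
from collections import defaultdict

def stratified_split(samples, ratios=(0.7, 0.15, 0.15), seed=0):
    """Split (filename, label) pairs into train/val/test, class by class.

    The ratios are illustrative assumptions, not values from the study.
    """
    by_label = defaultdict(list)
    for name, label in samples:
        by_label[label].append(name)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {"train": [], "val": [], "test": []}
    for label, names in sorted(by_label.items()):
        rng.shuffle(names)
        n_train = int(len(names) * ratios[0])
        n_val = int(len(names) * ratios[1])
        splits["train"] += [(n, label) for n in names[:n_train]]
        splits["val"] += [(n, label) for n in names[n_train:n_train + n_val]]
        # Whatever remains after rounding goes to the test subset.
        splits["test"] += [(n, label) for n in names[n_train + n_val:]]
    return splits

# Hypothetical file names matching the reported class counts (78 + 86 = 164).
samples = [(f"crack_{i:03d}.jpg", "cracking") for i in range(78)]
samples += [(f"missing_{i:03d}.jpg", "missing") for i in range(86)]
splits = stratified_split(samples)
print({subset: len(items) for subset, items in splits.items()})
```

Splitting per class rather than over the pooled list keeps the cracking-to-missing proportion similar in every subset, which matters when the dataset is this small.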

(2) Data processing

Data processing encompasses three main aspects. The first is analysis of the sample data. Through interviews with craftsmen, the team analyzed common types of damage arising from brick craftsmanship and on brick surfaces. Based on the collected data, they conducted preliminary analyses of the location, type, structure, extent, and color changes of the damage and established corresponding type datasets. The second is optimization of sample quality, which ensures the effectiveness and accuracy of model training. To this end, the team standardized the images by resizing them all to 512 × 512 pixels, setting the horizontal and vertical resolutions to 96 dpi, and specifying a bit depth of 24. This step preserved the content and features of each image, and the consistent image size and aspect ratio improved the precision of model training, computation, and learning. The third is sample data augmentation. Augmentation techniques, including image transformation, random flipping (horizontal or vertical), scaling, and translation [25], were employed to address imbalanced sample sizes. Augmentation also enhances the model’s generalization capability, which supports experimental stability and improves detection performance across different scenarios.
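The geometry behind the resizing and flipping steps can be sketched as follows. The text does not say whether the 6000 × 4000 originals were stretched, cropped, or padded to reach 512 × 512, so the aspect-preserving "letterbox" resize shown here is only one possible choice, and the function names are hypothetical. The flip helpers show how a bounding box in normalized (cx, cy, w, h) form has to be mirrored together with the image.

```python
def letterbox_geometry(width, height, target=512):
    """Return (new_w, new_h, pad_x, pad_y) for an aspect-preserving resize.

    The image is scaled so that its longer side equals `target`, then
    padded symmetrically to a target x target square (an assumed strategy).
    """
    scale = target / max(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_x, pad_y = (target - new_w) // 2, (target - new_h) // 2
    return new_w, new_h, pad_x, pad_y

def hflip_box(box):
    """Mirror a normalized (cx, cy, w, h) box left to right."""
    cx, cy, w, h = box
    return (1.0 - cx, cy, w, h)

def vflip_box(box):
    """Mirror a normalized (cx, cy, w, h) box top to bottom."""
    cx, cy, w, h = box
    return (cx, 1.0 - cy, w, h)

# A 6000 x 4000 source image scaled into a 512 x 512 canvas.
print(letterbox_geometry(6000, 4000))
# A box in the left quarter of the image moves to the right quarter.
print(hflip_box((0.25, 0.4, 0.1, 0.2)))
```

Only the box center changes under a flip; width and height are unaffected, which is why flipping is a cheap way to enlarge an under-represented class.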

(3) Data annotation

If establishing and processing datasets are key steps in improving data consistency and accuracy, then data annotation is the step that enables the YOLOv12 model to learn real features and achieve its detection goals. To prevent subjective errors caused by individual judgment, we adopted a unified standard for image annotation. A team of craftsmen, cultural heritage preservation experts, and architectural professionals was divided into two groups to conduct two rounds of annotation, drawing, and inspection. Using the LabelImg annotation tool, precise bounding boxes were drawn for each instance of damage in the images, and each damage type was assigned a unique type code to establish an accurate type–feature relationship. The specific process was as follows: first, one group of professionals classified and standardized each type of damage in the images, forming category labels; second, the other group of experts reviewed and calibrated the category labels to ensure their accuracy. The annotation process followed strict, specified standards, with category labels covering the scope, color, and severity of damage, to support the accuracy of model learning.
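To make the link between a drawn bounding box and its type code concrete, the sketch below converts a pixel-coordinate box into the normalized one-line text format commonly used for YOLO training labels. The label names and the numeric codes 0 and 1 are hypothetical; the study does not list the codes it assigned.

```python
# Hypothetical label names and type codes, chosen only for this example.
CLASS_IDS = {"cracking": 0, "missing": 1}

def to_yolo_line(label, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-coordinate box to 'class cx cy w h' (all normalized)."""
    cx = (xmin + xmax) / 2 / img_w
    cy = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return f"{CLASS_IDS[label]} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# A crack annotated on a 512 x 512 image.
print(to_yolo_line("cracking", 128, 64, 384, 192, 512, 512))
```

Because the coordinates are normalized by image size, the same label line remains valid if the image is later rescaled without cropping.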

(4) Model training

Machine learning methods in computer vision no longer require manual extraction of damage features; with sufficient training samples, high-precision classifiers can be trained [26]. Drawing on the research team’s academic background, this study adopted the YOLOv12 model. Other techniques, such as Faster R-CNN, can also achieve the classification objective [27], but they are computationally more complex, which makes them less suitable for widespread adoption. At the time of this study, YOLOv12 was the latest version in the YOLO series, offering advanced feature selection, higher accuracy, and faster processing, which makes it well suited to detecting damage types on red brick surfaces. The research team trained the YOLOv12 model on the labeled image samples. To quantify the discrepancy between model predictions and actual labels, the study used the cross-entropy loss function, and the Adam optimizer was used to update the model weights so as to minimize the loss during training. The model was trained for a total of 200 epochs. After each epoch, 1–2 validation sets were used to evaluate model performance and confirm detection effectiveness. Throughout this process, model parameters, adjustment functions, and evaluation metrics were continuously tuned to improve the model’s ability to learn the specific features and manifestations of red brick surface damage types.
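The two ingredients named above, cross-entropy loss and the Adam optimizer, can be illustrated in a few lines of plain Python. This is a minimal sketch for a single scalar parameter, not the training code used in the study. The hyperparameters are Adam's conventional defaults rather than the study's settings, and a full detector loss in the YOLO family typically adds box-regression terms to the classification term shown here.

```python
import math

def cross_entropy(probs, target_index):
    """Negative log-likelihood of the true class under predicted probabilities."""
    return -math.log(probs[target_index])

def adam_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    `state` holds the step count and the running first and second moments.
    """
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** state["t"])  # bias-corrected mean
    v_hat = state["v"] / (1 - beta2 ** state["t"])  # bias-corrected variance
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)

# An undecided two-class prediction costs ln(2) regardless of the true class.
loss = cross_entropy([0.5, 0.5], target_index=0)

# On the first step, Adam moves the weight by roughly the learning rate.
state = {"t": 0, "m": 0.0, "v": 0.0}
w = adam_step(1.0, grad=0.5, state=state)
print(loss, w)
```

The bias correction is what makes the first update approximately equal to the learning rate in magnitude, independent of the gradient's scale.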

(5) Model testing

The purpose of model testing is to evaluate the model’s ability to identify damage type features in real-world scenarios. To this end, the research team prepared a test dataset of 34 images covering various types and environments, including but not limited to the two primary damage types described earlier. During testing, the team used multiple performance metrics to assess the model’s generalization capability. The test results also underwent a two-step validation. First, the results were quantified using average precision (AP) and miss rate (MR) to reflect the model’s actual detection performance objectively. Second, the test set was manually evaluated and statistically analyzed to reflect real-world application scenarios. Finally, the results of the two validation steps were compared, and the parameter configuration with the best performance metrics was selected as a reference for further model training and testing.
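A minimal sketch of how AP and MR can be computed for one damage class is given below. It assumes boxes in (xmin, ymin, xmax, ymax) form, an IoU matching threshold of 0.5, un-interpolated AP, and miss rate defined as the share of ground-truth boxes left undetected; the study does not state its exact definitions, so these are assumptions. For brevity the example handles a single image, whereas a full evaluation would pool predictions over the whole test set.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def evaluate(predictions, ground_truth, iou_thr=0.5):
    """Return (average_precision, miss_rate) for one class.

    predictions: list of (confidence, box); ground_truth: list of boxes.
    """
    n_gt = len(ground_truth)
    if n_gt == 0:
        return 0.0, 0.0
    matched, ap, tp = set(), 0.0, 0
    ranked = sorted(predictions, key=lambda p: -p[0])
    for rank, (_, box) in enumerate(ranked, start=1):
        # Greedily match to the best still-unmatched ground-truth box.
        best, best_iou = None, iou_thr
        for i, gt in enumerate(ground_truth):
            if i in matched:
                continue
            overlap = iou(box, gt)
            if overlap >= best_iou:
                best, best_iou = i, overlap
        if best is not None:
            matched.add(best)
            tp += 1
            ap += (tp / rank) / n_gt  # precision at this recall step
    return ap, 1.0 - tp / n_gt

gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 29))]
ap, mr = evaluate(preds, gts)
print(ap, mr)
```

In this toy case both ground-truth boxes are found, so the miss rate is zero, while the one false positive ranked second pulls AP below 1.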

(6) Results analysis

Results analysis is a critical stage in evaluating the model’s performance in automatic localization and intelligent detection of damage types. In this study, the analysis covers three aspects. First, by comparing model test results with human judgment, we assess whether the YOLOv12 model can accurately localize, efficiently identify, and rapidly detect the names and corresponding features of the different damage types. This comparison constitutes the overall performance evaluation of the model. Second, we evaluate the model’s detection performance for each damage type, covering the two primary damage types as well as any others, and statistically analyze the strengths and weaknesses of localization and identification for each type. This evaluation is based on quantified detection rates, sensitivity, and identification deviation rates. Third, we investigate the model’s detection performance under different environmental and lighting conditions, assessing factors that may cause variations in detection accuracy or make the features of a damage type indistinct. This is crucial for optimizing model performance and for addressing the challenges posed by real-world applications and environmental conditions.

2.4. Model Structure

The detection method employed in this study is based on the YOLOv12 (You Only Look Once version 12) object detection model (please refer to Appendix A and Appendix B for the model’s operating environment and parameters). Through targeted configuration and training of its network structure, it achieves rapid identification and location of typical surface damage types on red brick buildings, including cracking and missing material. The YOLO family offers the advantages of “end-to-end” detection. While maintaining high speed and light weight, it further enhances detection accuracy and feature extraction capabilities. This makes it particularly suitable for damage detection in historical buildings with complex textures and strong background interference, as in this study.

As the YOLO model family continues to evolve, YOLOv12’s attention-centric design significantly improves accuracy while maintaining real-time inference speed. It offers a superior speed–accuracy tradeoff for small to medium model sizes, making it suitable for fine-grained objects such as cracks and missing edges [28]. In contrast, two-stage approaches use a region proposal network (RPN) to generate candidate regions and then classify and regress them; they have traditionally achieved superior accuracy, but their limited inference speed makes them unsuitable for rapid on-site inspections [29]. The classic one-stage detector RetinaNet uses Focal Loss to alleviate the foreground–background imbalance in dense detection, balancing accuracy and speed, but its localization of irregular and elongated boundaries still requires improvement [30]. The distribution-based regression approach (GFL/DFL) treats bounding box regression as learning a distribution, which improves the representation of localization uncertainty. This is consistent with the distribution-based regression and attention-enhancing principles emphasized in the YOLOv12 detection head [31]. Therefore, compared with these earlier architectures, YOLOv12 better meets the target morphology and deployment requirements of this research.
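For readers unfamiliar with Focal Loss, the following sketch shows its binary form for a single prediction. The values alpha = 0.25 and gamma = 2 are the defaults commonly quoted for RetinaNet and are used here only for illustration; they are not settings from this study.

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction.

    p: predicted foreground probability; y: 1 for foreground, 0 for background.
    The (1 - p_t) ** gamma factor shrinks the loss of easy, well-classified
    examples so that the many easy background locations do not dominate.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy_background = focal_loss(0.1, 0)  # confidently and correctly rejected
hard_foreground = focal_loss(0.3, 1)  # a crack the model is unsure about
print(easy_background, hard_foreground)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a convenient sanity check.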

The model’s overall architecture continues the classic three-stage structure of backbone, neck, and head, as shown in Figure 9. Its main components include a backbone feature extraction network (backbone), a multi-scale feature fusion module (neck), and a prediction output head (head). These components work together to complete the task of feature extraction, enhancement, fusion, and prediction from the input image to the final detection result. Compared with previous models, the module design incorporates stronger local modeling capabilities and context-aware mechanisms. In particular, the introduction of structural units such as C3k2, A2C2f, ABlock, and AAttn effectively enhances the model’s feature representation capabilities during object recognition.

(1) In the backbone, the model downsamples and extracts features layer by layer through multiple cascaded Conv and C3k2 layers, extracting low-level texture information and mid- and high-level semantic information. The C3k2 module is one of the key modules in this architecture. It combines bottleneck stacking, cross-layer connections, and variable channel fusion strategies to effectively enhance the receptive field and improve the model’s nonlinear expression capabilities, making it suitable for extracting linear details such as red brick cracks.

(2) In the feature fusion stage (neck), the model completes multi-scale feature fusion through the A2C2f module combined with Concat and Upsample operations. This stage uses multiple upsampling and cross-scale connection mechanisms to fuse feature maps from different depth layers to achieve synchronous perception and precise positioning of damage of different scales (such as small cracks and large area defects). The A2C2f module further embeds the self-attention mechanism (AAttn) and position encoding module (PE), which can enhance the model’s ability to focus on damage features in complex backgrounds, suppress invalid interference information, and significantly improve the detection performance of small target damage.

(3) In the output stage (head), the model constructs a multi-scale detection head that outputs predictions at 16 × 16, 32 × 32, and 64 × 64 resolutions, corresponding to large, medium, and small targets, respectively. Each detection head consists of a CV module and a DWConv depthwise separable convolution stack to improve parameter efficiency and inference speed. In addition, the output layer uses the DFL (distribution focal loss) distribution-based regression mechanism to further improve bounding-box regression accuracy, especially the localization of irregularly shaped damage (such as blurred crack edges and unclear outlines of missing areas).
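The relationship between input size and grid resolution, and the way a DFL-style head turns a discrete distribution into a box offset, can be illustrated with a short sketch. It assumes strides of 8, 16, and 32 for the three detection scales and 16 distribution bins per box side; neither value is stated in the text, and the function names are hypothetical.

```python
import math

def grid_sizes(input_size, strides=(8, 16, 32)):
    """Feature-map side length at each detection scale (strides are assumed)."""
    return [input_size // s for s in strides]

def dfl_expected_offset(logits):
    """Decode one box side from DFL-style logits.

    A softmax turns the logits into a probability distribution over the
    integer bins 0..len(logits)-1, and the predicted offset is the expected
    value of that distribution (in units of the feature-map stride).
    """
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sum(i * p for i, p in enumerate(probs))

print(grid_sizes(512))  # [64, 32, 16]

# A distribution split evenly between bins 3 and 4 decodes to about 3.5,
# which is how an ambiguous (blurred) edge can be placed between two bins.
logits = [-20.0] * 16  # 16 bins is an assumed setting
logits[3] = logits[4] = 0.0
print(round(dfl_expected_offset(logits), 3))
```

With a 512 × 512 input, the assumed strides reproduce the 64 × 64, 32 × 32, and 16 × 16 grids described above.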

2.5. Model Training Process

In this study, training was initially set to a maximum of 500 epochs, with early stopping employed to control overfitting. Training stopped automatically at the 227th epoch because the validation loss had stopped improving, yielding a stable model. The training image resolution was uniformly set to 512 × 512 pixels, with a batch size of 16. Gradient updates were performed using a momentum-based SGD (stochastic gradient descent) optimizer, with an initial learning rate of 0.01 that was gradually decayed to 1 × 10⁻⁴ during training. The momentum term was set to 0.937, with a weight decay coefficient of 5 × 10⁻⁴. A warmup strategy was used for the first five epochs to smooth out initial gradient oscillations. AMP (automatic mixed precision) training was also employed to improve computational efficiency and convergence stability.
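The reported settings can be collected into a configuration sketch. The key names below follow the Ultralytics training-argument convention, which is an assumption; the text gives only the values. The decay function is likewise hypothetical: it assumes a linear schedule over the full 500 epochs and omits the warmup phase, since the shape of the decay is not stated.

```python
# Hyperparameters as reported in the text; key names are assumed.
TRAIN_ARGS = {
    "epochs": 500,
    "imgsz": 512,
    "batch": 16,
    "optimizer": "SGD",
    "lr0": 0.01,           # initial learning rate
    "lrf": 0.01,           # final LR as a fraction of lr0, i.e. 1e-4
    "momentum": 0.937,
    "weight_decay": 5e-4,
    "warmup_epochs": 5,
    "amp": True,
}

def learning_rate(epoch, args=TRAIN_ARGS):
    """Linear decay from lr0 to lr0 * lrf (decay shape assumed, warmup omitted)."""
    frac = min(epoch / args["epochs"], 1.0)
    factor = (1.0 - frac) * (1.0 - args["lrf"]) + args["lrf"]
    return args["lr0"] * factor

for epoch in (0, 250, 500):
    print(epoch, f"{learning_rate(epoch):.6f}")
```

Under this assumed schedule, a run that stops early never reaches the final value of 1 × 10⁻⁴, because the decay is defined over the full epoch budget.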

Looking at the overall training curve, as shown in Figure 10, the model reached its lowest validation loss of 3.801 around the 127th epoch. Afterward, the loss curve, while fluctuating somewhat, generally stabilized, demonstrating that the model had gradually achieved an effective fit to the target features. Regarding evaluation metrics, the model exhibited positive trends in key performance parameters such as precision, recall, and mAP (mean average precision). Precision (B) reached a peak of 0.886 at the 144th epoch, while recall (B) peaked at 0.783 at the 193rd epoch. mAP50-95 (B), a key metric for measuring the model’s overall performance at different IoU thresholds, reached a maximum of 0.384 at the 150th epoch, demonstrating the model’s comprehensive ability to discriminate small objects with damaged brick structures and blurred boundaries. Further analysis of the training log data reveals that among the training losses, box_loss and dfl_loss continue to decrease over the training process, while cls_loss exhibits certain convergence fluctuations. These fluctuations may be related to the partial similarities in texture and morphology between red brick damage types, further demonstrating the model’s reliance on high semantic learning and deep perception capabilities to distinguish subtle type features. Furthermore, in the later stages of training, the learning rate gradually decreases to the convergence boundary, and the model enters a stable state, indicating that the learning rate annealing strategy and warmup mechanism played a positive role in this study, particularly influencing the convergence of detailed brick surface cracks against complex backgrounds.
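The gap between the epoch of lowest validation loss (127) and the epoch at which training stopped (227) is what a patience-based early-stopping rule would produce. The sketch below is illustrative only: the patience of 100 epochs is an assumption, not a value given in the text, and the loss curve is synthetic.

```python
def early_stop_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once `patience` epochs have passed without a new
    minimum of the validation loss.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses), best_epoch

# Synthetic curve (not the study's data): minimum of 3.801 at epoch 127.
losses = [3.801 + 0.001 * abs(epoch - 127) for epoch in range(1, 501)]
stop, best = early_stop_epoch(losses, patience=100)  # patience is assumed
print(stop, best)  # 227 127
```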

During this training process, the model showed optimal performance in various key performance indicators at different epochs. Therefore, the study comprehensively evaluated metrics most relevant to red brick detection, including minimum validation loss, precision, recall, and mean average precision (mAP). Four representative training iterations were specifically selected for subsequent model testing phase comparisons and analysis: (1) The 127th epoch model achieved the lowest value of 3.801 in validation loss, indicating that the overall fitting ability of the model was the most balanced, and was named model A. (2) The 144th epoch model achieved the highest value of 0.886 in precision, reflecting the model’s advantage in positioning accuracy, and was named model B. (3) The 150th epoch model achieved the maximum value of 0.384 in average precision index mAP50-95 (B), indicating that the model had the strongest comprehensive detection performance under multiple IoU thresholds, and was named model C. (4) The 193rd epoch model achieved the highest value of 0.783 in recall, demonstrating its superiority in missed detection control, and was named model D.
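The selection of the four checkpoints amounts to taking the optimum of each criterion over the per-epoch training log. A minimal sketch follows, assuming a hypothetical log format of one dictionary per epoch. Only the four peak values are taken from the text; the other numbers are placeholders.

```python
def select_checkpoints(log):
    """Return the epoch at which each selection criterion is optimal."""
    return {
        "A_min_val_loss": min(log, key=lambda r: r["val_loss"])["epoch"],
        "B_max_precision": max(log, key=lambda r: r["precision"])["epoch"],
        "C_max_map50_95": max(log, key=lambda r: r["map50_95"])["epoch"],
        "D_max_recall": max(log, key=lambda r: r["recall"])["epoch"],
    }

# Illustrative rows: peak values (3.801, 0.886, 0.384, 0.783) are quoted in
# the text; all remaining numbers are placeholders, not measured results.
log = [
    {"epoch": 127, "val_loss": 3.801, "precision": 0.80, "recall": 0.70, "map50_95": 0.36},
    {"epoch": 144, "val_loss": 3.90, "precision": 0.886, "recall": 0.68, "map50_95": 0.37},
    {"epoch": 150, "val_loss": 3.88, "precision": 0.82, "recall": 0.72, "map50_95": 0.384},
    {"epoch": 193, "val_loss": 3.95, "precision": 0.79, "recall": 0.783, "map50_95": 0.35},
]
print(select_checkpoints(log))
```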

3. Results

The bar chart in Figure 11 and Table 2 illustrate the performance differences among the four phased models (model A, model B, model C, and model D) across five core evaluation metrics. These metrics are precision, recall, mAP@0.5, mAP@0.5:0.95, and fitness, which measure the models’ positioning accuracy, coverage, detection quality at different IoU thresholds, and overall evaluation capabilities in object recognition.
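For reference, the quantities underlying these metrics can be written out directly. The sketch below uses the standard definitions of IoU, precision, and recall; the box format and the example counts are hypothetical and do not come from the study.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

print(round(iou((0, 0, 2, 2), (1, 1, 3, 3)), 3))  # 0.143
print(precision_recall(tp=8, fp=2, fn=4))
```

A detection counts as a true positive at mAP@0.5 when its IoU with a ground-truth box of the same class is at least 0.5; mAP@0.5:0.95 averages the result over IoU thresholds from 0.5 to 0.95.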

In terms of precision, model B performed the best of the four, reaching 0.8482, clearly higher than model A (0.7682), model C (0.6922), and model D (0.6847), demonstrating its advantage in reducing false positives. In terms of recall, model D achieved the best score of 0.7821, followed by model C (0.7546) and model A (0.7141). Model B, however, was relatively low at 0.5981, suggesting a tendency toward missed detections. In terms of mAP@0.5, model A led with a score of 0.8395, demonstrating its excellent overall detection performance under a relatively loose intersection over union (IoU) threshold. Model D followed at 0.7602, while models B and C were lower (0.7086 and 0.7364, respectively). Under the more stringent mAP@0.5:0.95 metric, model C achieved the highest value of 0.3862, followed by model A (0.3523). Models B and D both scored below 0.32, demonstrating model C’s relative advantage in detection stability across multiple thresholds. On the comprehensive fitness metric, model A scored 0.4282, the highest of the four, narrowly ahead of model C (0.4208) and well above model B (0.3524) and model D (0.3696), reflecting its overall ability to balance precision, recall, and mAP performance. In summary, while each model has its own strengths in a single dimension, the data in Figure 11 show clear differences in the core performance metrics of these models, providing the necessary quantitative basis for model selection in specific scenarios.

The PR (precision–recall) curves generated by the four phased models (models A, B, C, and D) on the test set further analyze their precision and recall performance in different object recognition tasks. Figure 12 plots the PR curves for two types of damage (cracking and missing), with the corresponding AP (average precision) and mAP@0.5 metrics for each category noted in the legend.

As shown in Figure 12, model A demonstrates the best overall detection performance, achieving an AP of 0.802 for the cracking category and 0.877 for the missing category, the latter being the highest among the four models. Its overall mAP@0.5 score reaches 0.840. Its PR curve is smooth and covers a wide range, indicating that the model maintains high precision and recall across multiple thresholds and is well balanced. In contrast, while model B achieves the highest precision during training, its PR curve is relatively inferior, with an AP of 0.806 for the cracking category and only 0.611 for the missing category, significantly lower than the other models. Its overall mAP@0.5 score is only 0.709. This indicates that its preference for high precision at the expense of recall results in significant missed detections, and its PR curve also shows a clear downward trend. Model C achieves the highest AP of 0.808 for the cracking category, demonstrating its strong ability to identify crack-like defects. However, its AP for the missing category is only 0.665. Its overall mAP@0.5 score is 0.736, the third highest of the four models. Its PR curve is relatively steep, indicating that the model performs well on high-confidence predictions but exhibits some instability in low-recall areas. Model D shows balanced performance across both categories, with an AP of 0.792 for cracking and 0.728 for missing, resulting in an overall mAP@0.5 score of 0.760. Its PR curve is stable and covers a wide range, and it strikes a good balance between recall and precision, demonstrating good generalization potential.
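The overall mAP@0.5 values follow from the per-class AP values as a simple mean over the two damage categories. The short check below uses only the AP figures quoted above; the dictionary layout and function name are illustrative.

```python
# Per-class AP at IoU 0.5, as quoted in the text for the four models.
AP_AT_50 = {
    "A": {"cracking": 0.802, "missing": 0.877},
    "B": {"cracking": 0.806, "missing": 0.611},
    "C": {"cracking": 0.808, "missing": 0.665},
    "D": {"cracking": 0.792, "missing": 0.728},
}

def mean_ap(per_class_ap):
    """mAP as the unweighted mean of the per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap)

for name, aps in AP_AT_50.items():
    print(name, f"{mean_ap(aps):.4f}")
```

The computed means agree with the reported mAP@0.5 scores (0.840, 0.709, 0.736, and 0.760) to within the rounding of the per-class inputs.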

As shown in Figure 13, a comparison of the confusion matrices of the four models on the test set further validates the patterns previously identified in the analysis of precision, recall, mAP, and PR curves. Model A demonstrates strong recognition capabilities for both cracking and missing damage, achieving an accuracy of 0.83 in the missing category in particular and showing good overall balance. Model B, on the other hand, clearly underperforms in the missing category, with a high confusion rate and numerous misclassifications between the background and missing categories. This is consistent with its poor recall as shown in the PR curve. Model C achieves a high accuracy (0.88) in the cracking category but exhibits some bias in the missing category, suggesting slightly uneven performance. Model D performs most effectively in the cracking category, with an accuracy of 0.92, and reaches 0.71 in the missing category, demonstrating good overall robustness, although it exhibits some error in the background category.
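Per-class accuracies of this kind are typically read from the diagonal of a normalized confusion matrix. The sketch below assumes rows are predicted classes and columns are true classes, with each column normalized to sum to 1. The layout is an assumption, and the counts are hypothetical rather than taken from Figure 13.

```python
CLASSES = ["cracking", "missing", "background"]

def normalize_by_true_class(matrix):
    """Divide each column by its sum so every true class sums to 1."""
    n = len(matrix)
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [
        [matrix[r][c] / col_sums[c] if col_sums[c] else 0.0 for c in range(n)]
        for r in range(n)
    ]

# Hypothetical counts: rows = predicted class, columns = true class.
counts = [
    [40, 1, 6],   # predicted cracking
    [2, 45, 4],   # predicted missing
    [8, 4, 0],    # predicted background, i.e. missed detections
]
normalized = normalize_by_true_class(counts)
for i, name in enumerate(CLASSES):
    print(name, round(normalized[i][i], 2))
```

In this layout, the bottom row holds damage that was missed, while the last column holds background regions wrongly reported as damage.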

As shown in Figure 14, the models differ in how well they recognize cracked and missing bricks on training-set images. As shown in Figure 14(1), model A detects all three cracks in the image, showing strong detection sensitivity, while models B, C, and D produce missed or false detections, especially at the crack edges. In Figure 14(2), model A detected four crack areas but labeled the crack at the bottom of the image twice, mistakenly dividing a single crack into two. In contrast, although models B, C, and D avoided repeated detection, they all missed the crack on the far left, indicating that their sensitivity to fine cracks is slightly insufficient. Figure 14(3) further shows the detection of missing bricks. Model A gave relatively accurate recognition results for all missing bricks on the left side of the image. Models B, C, and D all had some missed detections, indicating that model A has a greater advantage in this type of defect recognition task. However, the figure also shows that when the surroundings closely resemble the visual characteristics of a particular damage type, the model may assign them to that damage type, which increases the error rate of the detection results. For instance, the model might detect a dark surrounding area and misclassify it as the missing type. Finally, in Figure 14(4), all four models experienced a certain degree of missed detection when faced with the complex situation of large-area damage. However, overall, model A had relatively fewer missed detections, demonstrating a more stable detection effect.
Overall, model A achieved detection accuracy above 0.80 on the training set, demonstrating more comprehensive coverage and capturing greater detail in damage but also exhibiting individual false detection issues. The other three models performed relatively conservatively, with less stable detection accuracy. While they missed more instances, they produced fewer false positives.

As shown in Figure 15, in detection experiments using real-world images, the performance of the four models further demonstrates their generalization capabilities on new, unseen image data. It is important to note that these images are all brand-new data, not encountered by the models during training. They have a resolution of 6000 × 4000 pixels, significantly higher than the standard size used in the training set. This high-resolution, real-world imagery not only incorporates more complex texture details and lighting variations but also, to a certain extent, simulates the actual application scenarios of the models in real-world architectural heritage conservation sites, thus more accurately reflecting the models’ robustness and generalization capabilities.
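The text does not state how the 6000 × 4000 pixel images were passed to a model trained at 512 × 512. One common option, shown here purely as an assumption, is to split the image into overlapping tiles at the training resolution and run detection on each tile. The function names and the 64-pixel overlap are hypothetical.

```python
def tile_origins(length, tile, overlap):
    """Start offsets of tiles that cover `length` pixels along one axis."""
    if length <= tile:
        return [0]
    stride = tile - overlap
    origins = list(range(0, length - tile + 1, stride))
    if origins[-1] + tile < length:
        origins.append(length - tile)  # final tile flush with the edge
    return origins

def tile_boxes(width, height, tile=512, overlap=64):
    """(x1, y1, x2, y2) windows covering a width x height image."""
    return [
        (x, y, x + tile, y + tile)
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]

boxes = tile_boxes(6000, 4000)
print(len(boxes))  # 126 tiles of 512 x 512
```

Detections from each tile would then need to be shifted back by the tile origin and merged, for example with non-maximum suppression across tile borders.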

As shown in Figure 15(1), in the missing brick detection task, only model A was able to fully identify the locations of the three missing bricks, demonstrating strong adaptability, while models B, C, and D all missed detections, indicating that their stability was insufficient under high-resolution complex backgrounds. In Figure 15(2), model C performed the best, accurately detecting all cracked locations, demonstrating its strong generalization ability in crack recognition, while models A and B missed detections, and model D produced false detections, suggesting that it was somewhat confused by the complex background texture. Figure 15(3) further verified this trend. Model C was still able to fully identify all cracks, maintaining a high degree of detection consistency, while model A missed the crack on the left, and models B and D had additional false detections, indicating that their discrimination ability was reduced under the interference of complex textures and shadows. Finally, in Figure 15(4), model A and model D completely detected all missing bricks, showing good robustness, while model B and model C misidentified some brick edges as cracks, indicating that they still have certain limitations in distinguishing missing and crack features.

In summary, this section compares the detection performance of four models using confusion matrices, quantitative metrics, and test images from different scenarios. Model C demonstrates superior stability and accuracy in crack detection, while models A and D excel in missing brick identification. Comprehensive evaluation results demonstrate that model A outperforms the other three models overall. The confusion matrix shows that model A achieves high recognition accuracy for the two main categories of cracking and missing objects and is particularly stable in avoiding background false detections. In contrast, models B and D exhibit significant missed detections and false detections in the missing object category. While model C’s overall performance is similar to that of model A, its stability in complex scenarios is slightly lower. Quantitative results and visualization examples confirm this, demonstrating that model A is more robust in practical applications.

4. Discussion

This study focuses on detection of surface damage in historic masonry buildings in Putian. Addressing the challenges of efficiently identifying defects such as cracks and missing bricks during actual conservation and restoration, a deep learning detection model based on an improved YOLOv12 architecture was proposed and trained. The results demonstrate that the proposed model exhibits significant potential in heritage conservation applications. On one hand, the method is capable of processing high-resolution on-site images (6000 × 4000 pixels) and maintains satisfactory detection performance even on previously unseen samples, demonstrating its excellent generalization capabilities. On the other hand, the multi-task detection framework, combining segmentation and counting, not only provides the spatial distribution and geometric boundaries of defects but also outputs quantitative statistical information on them, providing refined data support for subsequent structural safety assessments and restoration projects.

As shown in Figure 16, heat map visualization further reveals the internal mechanism of model A during inference. From backbone to neck and then to head, the response area of the features gradually shrinks and focuses, reflecting how the model extracts and aggregates target information at different stages. In the missing brick cases in Figure 16(1),(2), the high-response area in the backbone stage covers the entire structure of the brick body, the neck stage gradually focuses on the missing edge, and the head stage finally locks onto the missing brick position. The segmentation result is highly consistent with the actual defect boundary, and the count module accurately gives the number of missing bricks. In Figure 16(3), which contains multiple missing bricks, the response of the backbone is relatively scattered, but after the multi-scale feature aggregation of the neck and head, the model can effectively distinguish and locate multiple defects, and the segmentation mask and statistical results both show strong robustness. In single-defect detection, as shown in Figure 16(4), model A shows extremely high accuracy and can accurately lock onto the missing area without being disturbed by adjacent mortar joints. In the crack detection cases (Figure 16(5)–(8)), the heat map shows that the response of the backbone stage is distributed along the brick joints and cracks, reflecting sensitivity to linear texture features. The neck stage further narrows the response range and highlights the main direction of the crack. The head stage forms a highlight cluster at the center of the crack, and the mask generated by the final segmentation module is highly consistent with the actual location of the crack.
For example, in the subtle crack cases of Figure 16(7),(8), the model can still maintain the detection path of step-by-step focus from backbone to head and achieve complete labeling and accurate counting of cracks in the final results. This detection chain not only improves the interpretability of the model but also demonstrates its applicability and accuracy in complex scenarios.
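The heat maps described above are of the Grad-CAM type: each channel of a layer's activation map is weighted by the spatial mean of its gradient, the weighted channels are summed, and negative values are clipped. The sketch below illustrates that computation on plain nested lists. It assumes the activations and the gradients of a chosen detection score have already been extracted from the network, and the toy values are hypothetical.

```python
def grad_cam(activations, gradients):
    """Grad-CAM map from per-channel activations and gradients.

    Both inputs are nested lists shaped [channels][height][width].
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = spatial mean of the gradient for that channel.
    weights = [
        sum(sum(row) for row in grad) / (height * width) for grad in gradients
    ]
    # Weighted sum over channels followed by ReLU.
    cam = [
        [
            max(0.0, sum(w * act[y][x] for w, act in zip(weights, activations)))
            for x in range(width)
        ]
        for y in range(height)
    ]
    peak = max(max(row) for row in cam)
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    return [[v / peak for v in row] for row in cam] if peak > 0 else cam

# Toy example: two channels on a 2 x 2 feature map (values are hypothetical).
acts = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(acts, grads))
```

Applying the same computation to layers in the backbone, neck, and head is what allows the stage-by-stage comparison of response areas.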

5. Conclusions

This research addresses the intelligent detection of surface defects in ancient red brick buildings, focusing on two of the most common and representative types of damage: brick cracking and missing brick material. Traditional manual inspection methods not only are inefficient and subjective but also lack stability under complex textures and variable lighting conditions, posing a real challenge for cultural heritage preservation. In recent years, the rapid development of deep learning, particularly object detection models, has provided new solutions for building defect detection. However, keeping models lightweight and efficient while preserving accuracy in complex scenarios remains a pressing research challenge. Based on this, this study introduces the YOLOv12 architecture to automatically identify and locate red brick defects through an end-to-end deep detection framework. Combined with targeted model optimization and systematic experimental verification, this study provides a new and feasible technical solution for the digital preservation of cultural heritage.

In terms of research methods, this study adapted and analyzed the YOLOv12 architecture, focusing on leveraging the multi-scale feature extraction and fusion advantages of its three-stage structure (backbone, neck, and head) to enhance the ability to discriminate cracks and missing areas in complex brick texture environments. A systematic training strategy, including a maximum of 500 epochs, early stopping, a 512 × 512 pixel input resolution, a batch size of 16, a momentum-based SGD optimizer, and warmup and learning rate annealing strategies, ensured stable convergence and optimized performance under limited computing resources. During the experiment, key training epochs were extracted, and four staged models (A, B, C, and D) were generated. The detection performance of each model was systematically compared and evaluated based on multiple metrics, including validation loss, precision, recall, and mean average precision (mAP). This study also conducted comprehensive tests on training and validation sets, as well as high-resolution (6000 × 4000 pixel) on-site data. Furthermore, the model’s detection mechanism was analyzed in depth using confusion matrices, PR curves, and Grad-CAM heat map visualization. The research reached the following three key conclusions.

(1) Model A performs best overall. It maintains a good balance in key indicators such as precision, recall, and mAP and demonstrates strong robustness and generalization capabilities on the training set and on high-resolution on-site images. The results indicate that the model effectively identifies two types of red brick damage, cracking and missing, with AP values of 0.802 and 0.877, respectively. The study also analyzes and verifies that the model maintains stable detection performance across different damage categories and complex backgrounds, providing technical support for intelligent detection of red brick damage types.
(2) The interpretability of the detection mechanism is enhanced. Grad-CAM heat map analysis shows that model A mainly extracts global brick texture features in the backbone stage, gradually focuses on the damaged area in the neck stage, and further locks onto the target and outputs accurate positioning results in the head stage. This step-by-step focusing mechanism not only verifies the effectiveness of the model but also improves its interpretability and usability in cultural heritage protection scenarios.
(3) The model adapts well to high-resolution scenarios. In the test on unseen 6000 × 4000 pixel on-site data, model A identifies most missing bricks and cracks completely, and its robustness is better than that of models B, C, and D. In missing brick detection in particular, model A has the most complete recognition results, and it also maintains high accuracy in crack detection. This shows that the method has the potential to be extended to actual heritage protection projects.

Nevertheless, this study still has certain shortcomings: (1) The scale and category distribution of the training data are still limited, and more complex damage types (such as weathering, spalling, and biological attachment) are not covered, which limits the application scope of the model. (2) The current detection results mainly remain at the positioning and classification level, and quantitative assessment of the severity and development trend of the damage has not yet been achieved. (3) Although the high-resolution inference used in the study verifies the feasibility of the model, it involves a trade-off between inference time and computing power consumption in actual deployment, and further optimization toward lightweight and real-time performance is needed.

In summary, this study not only verifies the effectiveness of YOLOv12 in detecting defects in ancient red brick buildings but also proposes a systematic adaptation, training, and testing method and reveals its detection mechanism and application potential through multi-dimensional comparison and visualization. Subsequent work will further expand the range of defect types and the data scale, continuously optimize the model architecture, and explore integration with digital technologies such as BIM and 3D reconstruction. Concurrently, consideration will be given to integrating the model with mobile devices to develop inspection applications for on-site maintenance technicians, thereby achieving more intelligent and automated inspection in the field of cultural heritage conservation.

Author Contributions

Conceptualization, L.H.; methodology, L.H. and Y.X.; software, Y.C. and L.Z.; validation, Y.C.; formal analysis, L.H., Y.C., and L.Z.; investigation, L.H., Y.X., and L.Z.; resources, L.Z.; data curation, L.H. and Y.X.; writing—original draft preparation, L.H. and L.Z.; writing—review and editing, Y.C. and L.H.; visualization, L.Z. and Y.X.; supervision, L.Z. and Y.C.; project administration, L.H. and L.Z.; funding acquisition, L.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the (1) Fujian Provincial Social Science Foundation Project “Arrangement and Research on Historical Materials of Mazu Architectural Images in Ming and Qing Dynasty” (Grant number FJ2023C053); (2) Fujian Provincial Social Science Foundation Project “Research on the Historical Evolution and Contemporary Reconstruction of Mazu Cultural Space” (Grant number FJ2023JDZ059); (3) Fujian Province College Student Innovation and Entrepreneurship Training Program Project (Grant number S202511498012); (4) Faculty Research Grants funded by Macau University of Science and Technology (Grant numbers FRG-25-041-FA and FRG-25-067-FA); and (5) Guangdong Provincial Department of Education’s key scientific research platforms and projects for general universities in 2023: Guangdong, Hong Kong, and Macau Cultural Heritage Protection and Innovation Design Team (Grant number 2023WCXTD042). The funders had no role in the study conceptualization, data curation, formal analysis, methodology, software, decision to publish, or preparation of the manuscript. There was no additional external funding received for this study.

Data Availability Statement

The datasets used and analyzed during the current study are available from Linsheng Huang (hls0707@ptu.edu.cn) on reasonable request.

Acknowledgments

We are very grateful to the students of Putian University for their help during this study. They contributed to the research and collection of information and to data classification and statistics in the pre-study period. These students are Linchong Guo, Qingling He, Ying Huang, Zhiyi Hou, Ying Xiao, Huanru Zhang, and Jiaxin Zhang.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Experimental Environment

Table A1. Experimental environment.

Category	Version	Category	Version
Operating system	Windows 10 (22H2) Professional Edition/AMD64 Architecture	Python	3.11.13
Processor	AMD Ryzen 7 9800X3D 8-Core Processor	System memory	64 GB
Pytorch	2.7.0+cu128	Torch cuda	12.8
GPU	NVIDIA GeForce RTX 5070 Ti 16 GB	CUDA	12.9.41
Flash attn	2.7.4.post1	NumPy	1.26.4
Pandas	2.3.0	Matplotlib	3.10.3
OpenCV	4.9.0	Ultralytics	8.3.63

Appendix B. Model Structure

Table A2. Model structure.

Stage	Layer/Module	Output Channels	Stride	Notes
Backbone	Conv	96	2	Stem conv
	Conv (groups = 2)	192	2	Downsample
	C3k2 (2 × C3k)	384	1	Bottleneck blocks
	Conv (groups = 4)	384	2	Downsample
	C3k2 (2 × C3k)	768	1	Bottleneck blocks
	Conv	768	2	Downsample
	A2C2f (4 × ABlock)	768	1	Attention-enhanced
	Conv	768	2	Downsample
	A2C2f (4 × ABlock)	768	1	Transformer-like
Neck	Upsample + Concat	-	-	Feature fusion
	A2C2f (2 × C3k)	768	1	Fusion block
	Upsample + Concat	-	-	PANet-like fusion
	A2C2f (2 × C3k)	384	1	Fusion block
	Conv	384	2	Downsample
	Concat + A2C2f (2 × C3k)	768	1	Fusion block
	Conv	768	2	Downsample
	Concat + C3k2 (2 × C3k)	768	1	Fusion block
Head	Detect	-	-	Three detection scales, with DFL

References

National ICH: Construction Technique of Minnan Folk Dwellings (Hui’an County), General Office of Fujian Provincial People’s Government. Available online: https://www.fj.gov.cn/english/cultureandtravel/cultureandarts/202503/t20250328_6789060.htm (accessed on 20 August 2025).
Qiu, H.; Zhang, J.; Zhuo, L.; Xiao, Q.; Chen, Z.; Tian, H. Research on Intelligent Monitoring Technology for Roof Damage of Traditional Chinese Residential Buildings Based on Improved YOLOv8: Taking Ancient Villages in Southern Fujian as an Example. Herit. Sci. 2024, 12, 231.
Xia, Q.; Liu, T.; Li, Y.; Xiong, Y.; Ma, Y. Simulation and Multi-Dimensional Damage Evolution Analysis of Detailed Micro-Model Representing Ancient Brick Masonry Compressive Behavior. Constr. Build. Mater. 2024, 457, 139410.
Hao, Y.; Yao, Z.; Wu, R.; Bao, Y. Damage and Restoration Technology of Historic Buildings of Brick and Wood Structures: A Review. Herit. Sci. 2024, 12.
Wons, W.; Kłosek-Wawrzyn, E.; Rzepa, K. Corrosion of Porous Building Ceramics Caused by Double Sulphate Salt. Materials 2025, 18, 1041.
Marrone, C.; Franzoni, E. Enhancing the Durability of Historic Brick Masonry: The Role of Diammonium Phosphate and Chitosan in Reducing Salt-Induced Damage. J. Cult. Herit. 2025, 73, 150–157.
Sharma, S.; Esposito, R.; D’Altri, A.M.; Castellazzi, G. Salt Crystallisation and Weathering in Masonry Retaining Walls: A Multiphase Modelling Approach. J. Build. Eng. 2025, 111, 112999.
Knyziak, P.; Spodzieja, S.; Krentowski, J.R.; Pawłowicz, J.A.; Sawczyński, S.; Gil-Mastalerczyk, J. Semi-Destructive Testing Methods for Examining the Structural Condition of Historic Buildings. Eng. Fail. Anal. 2025, 171, 109354.
Lignola, G.P.; Buratti, N.; Cattari, S.; Parisi, F.; Ubertini, F.; Alfano, S.; Ierimonti, L.; Meoni, A.; Sivori, D.; Virgulto, G. Validated and Optimized Strategies for Preserving Historical Heritage Towards Natural and Anthropic Risks: Insights from the DETECT-AGING Project. Buildings 2025, 15, 693.
Niedostatkiewicz, M.; Majewski, T. Causes of Defects and Damage to Brick Masonry Elements in Historic Buildings. Civ. Environ. Eng. Rep. 2024, 34, 423–448.
Alaei, A.; Hejazi, M.; Vintzileou, E.; Miltiadou-Fezans, A.; Skłodowski, M. Effect of Damage and Repair on the Dynamic Properties of Persian Brick Masonry Arches. Eur. Phys. J. Plus 2023, 138, 231.
Zheng, L.; Chen, Y.; Yan, L.; Zhang, Y. Automatic Detection and Recognition Method of Chinese Clay Tiles Based on YOLOv4: A Case Study in Macau. Int. J. Archit. Heritage 2023, 18, 1–20.
Verstrynge, E.; Lacidogna, G.; Accornero, F.; Tomor, A. A Review on Acoustic Emission Monitoring for Damage Detection in Masonry Structures. Constr. Build. Mater. 2021, 268, 121089.
Armesto-González, J.; Riveiro-Rodríguez, B.; González-Aguilera, D.; Rivas-Brea, M.T. Terrestrial Laser Scanning Intensity Data Applied to Damage Detection for Historical Buildings. J. Archaeol. Sci. 2010, 37, 3037–3047.
Aquino-Rocha, J.H.; Póvoas, Y.V.; Bezerra-Batista, P.I. Flaw Recognition in Reinforced Concrete Bridges Using Infrared Thermography: A Case Study. Rev. Fac. Ing. Univ. Antioq. 2024, 110, 99–109.
Janků, M.; Březina, I.; Grošek, J. Use of Infrared Thermography to Detect Defects on Concrete Bridges. Procedia Eng. 2017, 190, 62–69.
Attard, L.; Debono, C.J.; Valentino, G.; Di Castro, M. Tunnel Inspection Using Photogrammetric Techniques and Image Processing: A Review. ISPRS J. Photogramm. Remote. Sens. 2018, 144, 180–188.
Zhu, X.; Zhu, Q.; Zhang, Q.; Du, Y. Deep Learning-Based 3D Reconstruction of Ancient Buildings with Surface Damage Identification and Localization. Structures 2025, 73, 108383.
Zou, Z.; Zhao, X.; Zhao, P.; Qi, F.; Wang, N. CNN-Based Statistics and Location Estimation of Missing Components in Routine Inspection of Historic Buildings. J. Cult. Herit. 2019, 38, 221–230.
Mishra, M.; Barman, T.; Ramana, G.V. Artificial Intelligence-Based Visual Inspection System for Structural Health Monitoring of Cultural Heritage. J. Civil Struct. Health Monit. 2022, 14, 103–120.
Yang, X.; Zheng, L.; Chen, Y.; Feng, J.; Zheng, J. Recognition of Damage Types of Chinese Gray-Brick Ancient Buildings Based on Machine Learning—Taking the Macau World Heritage Buffer Zone as an Example. Atmosphere 2023, 14, 346.
Yan, L.; Chen, Y.; Zheng, L.; Zhang, Y. Application of Computer Vision Technology in Surface Damage Detection and Analysis of Shedthin Tiles in China: A Case Study of the Classical Gardens of Suzhou. Herit. Sci. 2024, 12, 72.
Moropoulou, A.; Polikreti, K.; Ruf, V.; Deodatis, G. San Francisco Monastery, Quito, Equador: Characterisation of Building Materials, Damage Assessment and Conservation Considerations. J. Cult. Herit. 2003, 4, 101–108.
Li, Y.; Zhao, M.; Mao, J.; Chen, Y.; Zheng, L.; Yan, L. Detection and Recognition of Chinese Porcelain Inlay Images of Traditional Lingnan Architectural Decoration Based on YOLOv4 Technology. Herit. Sci. 2024, 12, 137.
Yang, S.; Tian, Y.; Zheng, M.; Du, Y.; Chen, H.; Song, F.; Gao, X.; Li, L. A Review of Image Enhancement Technology Research. In Proceedings of the 3rd International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI), Taiyuan, China, 3–5 December 2021; pp. 715–720.
Seng, K.P.; Ang, L.-M.; Schmidtke, L.M.; Rogiers, S.Y. Computer Vision and Machine Learning for Viticulture Technology. IEEE Access 2018, 6, 67494–67510. [Google Scholar] [CrossRef]
Tang, H.; Peng, L. Influence of Building Recognition of High-Point Monitoring Image by the Optimized Faster R-CNN on Urban Planning. Int. J. Artif. Intell. Tools 2022, 31, 2250013:1–2250013:16. [Google Scholar] [CrossRef]
Tian, Y.; Ye, Q.; Doermann, D. Yolov12: Attention-centric real-time object detectors. arXiv 2025, arXiv:2502.12524. [Google Scholar] [CrossRef]
Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28, 1137–1149. [Google Scholar] [CrossRef]
Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
Li, X.; Wang, W.; Wu, L.; Chen, S.; Hu, X.; Li, J.; Tang, J.; Yang, J. Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection. Adv. Neural Inf. Process. Syst. 2020, 33, 21002–21012. [Google Scholar]

Figure 1. Location map of the red brick built-up area surveyed in this study. The few Chinese characters in the map are place names only. (Image source: drawn and annotated by the author.)

Figure 2. Areas of brick use in construction. (Image source: drawn and annotated by the author.)

Figure 3. Climate temperature and precipitation conditions in the sampling area of Putian, Fujian Province. (Image source: https://www.worldweatheronline.com/putian-weather-averages/fujian/cn.aspx (accessed on 21 July 2025).)

Figure 4. The construction process for red brick walls. (Image source: drawn and annotated by the author; the figure was first drawn by hand and then colored and modified in Photoshop CS5.)

Figure 5. The two main types of damage with the greatest impact on red brick buildings. (Image source: photographed and annotated by the author.)

Figure 6. Threat to building structural safety posed by damage to red brick surfaces. (Image source: drawn and annotated by the author.)

Figure 7. On-site sampling operations for traditional red brick damage types. (a) An inspection worker stands on a lifting platform, using equipment to inspect or maintain the walls of a brick building. (b) An inspection worker, on a hoist outside the building, performs work on the building’s walls, with personnel below providing guidance. (c) An inspection worker on the ground uses a drone to inspect the building. (d) An inspection worker controls the drone using a remote control, which inspects the building’s walls, with the images transmitted to a tablet device in real time. (Image source: drawn and annotated by the author.)

Figure 8. The main steps in the implementation process of this study. (Image source: drawn by the author.)

Figure 9. Model structure. (Image source: drawn and annotated by the author, compiled from the images exported after model training.)

Figure 10. Numerical records during model training. (Image source: drawn and annotated by the author, compiled from the images exported after model training.)

Figure 11. Comparative analysis of overall model indicators. (Image source: drawn and annotated by the author.)

Figure 12. Comparative analysis of model precision–recall curves. (Image source: drawn and annotated by the author, compiled from the images exported after model training.)

Figure 13. Comparative analysis of model confusion matrices. (Image source: drawn and annotated by the author, compiled from the images exported after model training.)

Figure 14. Test results of the model on the training set. (1)–(4) represent different model recognition results. (Image source: photographed and annotated by the author.)

Figure 15. Test results of the model in real scenarios. (1)–(4) represent different model recognition results. (Image source: photographed and annotated by the author.)

Figure 16. The detection process and results of the model on red bricks of historical buildings in traditional villages in Putian. (1)–(8) represent different model recognition results. In the heat map, red areas mark regions where the model's feature responses are most concentrated, indicating the highest density of damage. Green areas show the feature responses diffusing gradually from the focal point of damage toward its edges and boundaries, delineating cracked and missing sections from the rest of the brick wall. Blue areas mark regions where feature responses are still present, indicating that the model's detection covers the entire sampled brick wall. (Image source: drawn and annotated by the author.)

Table 1. Basic information on sample collection areas.

Village	Area Range (km²)	Features	Panoramic View of the Village
Shuangfu Village	1.5–2	Watershed	[image]
Loutou Village	1–1.3	Plain	[image]

Note: The area ranges in this table were obtained from official sources: the Land Use Status Map and the Putian City Land Use Master Plan (2021–2035), published by the Putian City Natural Resources Bureau.

Table 2. Metrics for models A to D.

Model	Precision	Recall	mAP50	mAP50-95	Fitness
Model A	0.768	0.714	0.839	0.382	0.428
Model B	0.848	0.589	0.708	0.312	0.352
Model C	0.679	0.756	0.736	0.385	0.420
Model D	0.687	0.782	0.760	0.326	0.369
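
The Fitness column in Table 2 can be cross-checked against the two mAP columns. The sketch below assumes that fitness is the weighted sum 0.1 × mAP50 + 0.9 × mAP50-95, a convention common in YOLO-family training toolkits; the article does not state the weighting, so the weights and the helper names (`fitness`, `summarize`, `best_model`) are illustrative assumptions. Under that assumption, each recomputed value agrees with the reported one to within rounding.

```python
# Values copied from Table 2: (precision, recall, mAP50, mAP50-95, reported fitness).
TABLE_2 = {
    "Model A": (0.768, 0.714, 0.839, 0.382, 0.428),
    "Model B": (0.848, 0.589, 0.708, 0.312, 0.352),
    "Model C": (0.679, 0.756, 0.736, 0.385, 0.420),
    "Model D": (0.687, 0.782, 0.760, 0.326, 0.369),
}

# ASSUMPTION: these weights are not given in the article; they follow a
# convention common in YOLO-family toolkits and are used here only as a check.
W_MAP50 = 0.1
W_MAP50_95 = 0.9


def fitness(map50: float, map50_95: float) -> float:
    """Weighted sum of the two mAP metrics (hypothetical helper)."""
    return W_MAP50 * map50 + W_MAP50_95 * map50_95


def summarize(table: dict) -> dict:
    """Recompute fitness for every model and keep the reported value alongside."""
    rows = {}
    for name, (_precision, _recall, map50, map50_95, reported) in table.items():
        rows[name] = {"fitness": fitness(map50, map50_95), "reported": reported}
    return rows


def best_model(table: dict) -> str:
    """Name of the model with the highest recomputed fitness."""
    rows = summarize(table)
    return max(rows, key=lambda name: rows[name]["fitness"])


if __name__ == "__main__":
    for name, row in summarize(TABLE_2).items():
        print(f"{name}: recomputed={row['fitness']:.3f} reported={row['reported']:.3f}")
    print("Highest fitness:", best_model(TABLE_2))
```

Because the assumed weighting places 90% of the score on mAP50-95, Model C (mAP50-95 of 0.385) scores close to Model A despite a noticeably lower mAP50, while Model B's high precision does not raise its fitness.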

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

