Article

Hyperparameter Tuning Technique to Improve the Accuracy of Bridge Damage Identification Model

1 Department of Future and Smart Construction Research, Korea Institute of Civil Engineering and Building Technology (KICT), Goyang-si 10223, Republic of Korea
2 Department of Multimedia Contents, Jangan University, Pyeongtaek-si 17731, Republic of Korea
* Author to whom correspondence should be addressed.
Buildings 2024, 14(10), 3146; https://doi.org/10.3390/buildings14103146
Submission received: 26 August 2024 / Revised: 26 September 2024 / Accepted: 28 September 2024 / Published: 2 October 2024

Abstract

In recent years, active research has been conducted on using deep learning to evaluate damage to aging bridges. However, existing methods are ill-suited to practical use because their performance deteriorates as the number of damage classes grows and they are not trained on photographs of actual sites. To address this, this study used image data from an actual bridge management system as training data and employed a learning model combined per member from among several instance segmentation models, including YOLO, Mask R-CNN, and BlendMask. Techniques such as hyperparameter tuning are widely used to improve the accuracy of deep learning models, and this study aimed to improve the accuracy of the existing model in this way. The hyperparameters optimized in this study are the DEPTH of the neural network, the learning rate (LR), and the number of iterations (ITER). This technique can improve accuracy by tuning only the hyperparameters while using the existing bridge damage identification model as it is. In the experiments, setting DEPTH, LR, and ITER to their optimal values improved the mAP by approximately 2.9% compared with the existing model.

1. Introduction

Bridges are crucial structures in roadways and transportation systems. In South Korea, aging bridges more than 30 years old account for approximately 18% of all road bridges, making their maintenance a critical issue. Construction technology has advanced with the advent of the Fourth Industrial Revolution, actively promoting high-rise, large-scale, and complex facilities. However, the current inspection of aging bridges relies on handwritten records, partial photographs, and visual observations by inspectors. This makes objective and quantitative assessment difficult because constant monitoring of facilities is unfeasible. Additionally, the results are managed as paper reports, which makes the information difficult to utilize. Bridge damage can be categorized by severity as follows: (i) structurally significant damage that requires immediate repair and maintenance action, and (ii) damage that requires preventive maintenance because it occurs routinely and has less immediate structural impact. In South Korea, material testing and visual inspection are used to assess the condition of each part of a facility based on condition changes, including defects, damage, and deterioration. The maintenance method is then determined and implemented based on the type and degree of defects, the importance of the facility, the environmental conditions during usage, and economic considerations. Visual inspection using on-site survey tools is the damage investigation and inspection method generally applied to concrete structures such as bridges and tunnel retaining walls. As this method relies significantly on the subjectivity of the inspector, the objectivity and reliability of the records are often inadequate. Consequently, various automatic damage image analysis techniques have been investigated for the objective damage detection of aging bridges.
The management of aging bridges in Korea is conducted through regular visual inspections and non-destructive testing (NDT), similar to practices in the US and Europe [1]. In Korea, bridges are managed in three grades. Grade 1 bridges are the most important large-scale facilities, either special bridges or those with a length of 1000 m or more; grade 2 bridges are 500 to 1000 m long; and grade 3 bridges are those completed more than 10 years ago with a length of less than 500 m, which undergo regular inspections once a year. Inspection methods include visual inspections, where inspectors visit the bridge in person to check for external abnormalities such as cracks, deformation, corrosion, and aging, and inspections using video equipment such as cameras or drones. Korea is actively introducing smart technologies such as AI and IoT to extend the life of bridges and ensure safety, utilizing them to detect structural damage, steel corrosion, and other defects early and to establish maintenance plans.
Hence, it is crucial to establish a revolutionary and highly reliable inspection system. This can be achieved by incorporating artificial intelligence and the rapidly evolving technology of the Fourth Industrial Revolution, particularly deep learning (DL), into facility damage inspection, which has traditionally relied on manpower. On-site survey tools (e.g., crack rulers and crack microscopes) are implemented as crack inspection and checking methods for small-scale ground concrete structures for the visual inspection and manual creation of an appearance survey network map. However, the subjectivity of the inspector may degrade the objectivity and reliability of the records. Furthermore, a different inspector may experience difficulty in determining whether the damage is progressive. To overcome these challenges and enhance the objectivity and accuracy of facility damage inspection and the convenience of data recording and storage, an image processing method that uses a digital camera to extract the appearance survey results automatically via image acquisition was studied [2].
The image processing technique involves processing and analyzing the images acquired from facilities. It includes the input and output images and preprocessing steps for digitization, segmentation, defect management, and defect detection. Various approaches have been explored for effective crack detection. These include morphology techniques that employ morphology operations [3,4,5] and methods that use RGB channel values for contrast-based detection [6]. Additionally, various shooting devices and equipment have attracted scholarly attention [7,8].
To compensate for these limitations, the demand for image analysis technology using machine learning and DL has increased, thereby facilitating research in related fields. In particular, DL-based image processing methods have been investigated to inspect and analyze the appearance of large-scale aging infrastructures using unmanned aerial vehicles (UAVs), such as drones equipped with imaging devices [9,10,11]. However, drone-assisted damage inspection is in the early stages of development and requires expensive equipment and drone experts. Furthermore, its practical applicability is limited by regulatory and institutional factors.
Previous studies using image data to perform visual inspections of bridges are limited to detecting cracks on concrete surfaces. This is because most damage to aging bridges is detected from cracks, making their early detection and maintenance crucial. Most boundary detection-based algorithms and machine learning techniques, such as Canny, crack forest, the feature fusion attention network, and the graph convolutional network, are limited in precisely classifying various types of damage.
Convolutional neural networks (CNNs) have exhibited excellent performance in classification and object detection using training images. Hoskere et al. used deep CNNs (DCNNs) to detect six damage types in facilities, including cracks, scaling, and corrosion [12]. However, DL-assisted techniques are limited in that their performance depends on the training images, and classes with few samples or unclear features are difficult to classify. For instance, when classifying bridge damage as efflorescence, corrosion, cracks, scaling, and spalling, a model can readily distinguish efflorescence, which appears as white deposits on the bridge surface, from crack-like damage such as scaling, spalling, and cracks, which may enhance model performance; however, distinguishing among cracks, scaling, and spalling themselves may degrade it. Owing to the properties of DL technology, higher classification accuracy is achieved with a larger number of training objects [13].
In addition to image analysis, DL analysis of vibration data has been actively studied in recent years as an effective means of monitoring structural damage. During the past decade, different machine learning tools have been used to develop various parametric and nonparametric vibration-based structural damage detection (SDD) systems for civil structures. These tools include, but are not limited to, artificial neural networks, support vector machines, self-organizing maps, and CNNs. Extensive analytical and experimental studies have been conducted in an attempt to demonstrate the efficiency of these SDD systems. However, in the current study, the research scope was limited to image-based damage identification through DL.
This study proposes a DL model for automatically identifying six damage types (crack, water leak, efflorescence, concrete scaling, concrete spalling, and corrosion) in old bridges and performs hyperparameter tuning to compare and verify the accuracy and detection speed. This damage-type model is applicable only to concrete bridges. The bridges targeted in this study are grade 3 bridges. In Korea, bridges are classified for management according to their size and importance, and grade 3 is one of these classes. Grade 3 bridges are generally small-scale concrete bridges with a length of less than 500 m. They are mainly located on local or minor roads and have relatively lower management priority than major bridges managed by the state or local governments. Because grade 3 bridges carry low traffic volumes and are not located on important roads, they may not require regular inspections and maintenance as frequent as those for large bridges (grade 1 and 2 bridges), but regular visual inspections and maintenance are still necessary to prevent safety issues due to aging.
In a previous study, we proposed a technique using super-resolution (SR) that utilizes low-quality bridge images from the collected data for training [13]. This technique performs normalization and data augmentation to convert the images into a suitable form for the learning model. This model detects small objects by increasing the resolution and labeling properly, exhibiting comparable or superior performance to those of existing bridge damage detection models. The data preprocessed in this manner are annotated and extracted as a dataset for each member. The dataset produces results through a model combination process that selectively performs learning and detection among algorithms such as Mask R-CNN, YOLO, and BlendMask. In this study, we propose a bridge member-specific combination learning model that is combined with the bridge damage object detection deep combination framework (BDODC-F). We performed experiments and analyses to improve the accuracy and detection speed of the proposed model by tuning the hyperparameters of the previously proposed combination learning model.
The remainder of this paper is organized as follows: Section 2 introduces related works, Section 3 describes the proposed and implemented methods, Section 4 outlines the experimental results to evaluate the performance of the proposed method, and Section 5 presents the conclusions.

2. Related Works

2.1. Bridge Damage Identification

A previous study, which reported the application of DL in maintaining bridge facilities, proposed a method for inspecting bridges by using UAVs and applying DL algorithms to develop an automated bridge inspection system [14]. First, UAVs were used to collect the bridge image data. The UAVs were equipped with high-resolution cameras and other sensors to capture and record bridge conditions in real time. The collected image data were fed into the DL algorithm, which was built on a CNN architecture, for training. CNNs exhibit excellent performance in image processing by extracting input image features, which are subsequently used to determine whether a bridge is defective or damaged. Compared to traditional methods, the DL-based bridge inspection system developed by the authors achieved higher accuracy and efficiency, improving the defect detection and prediction while minimizing the labor and time. Another study used DL for bridge facility maintenance by proposing a DL algorithm for crack detection in concrete structures [15]. The CNN, which was trained using the image data of the surface of concrete structures as input, detected and classified the crack patterns in the images. The results were more accurate than those obtained using traditional methods, enabling real-time processing. The system was efficient in inspecting large concrete structures because it could effectively process large amounts of data. Feng et al. [16] used a CNN to develop a model that exhibited high accuracy in identifying damage in images of the entire bridge area. Bukhsh et al. [17] demonstrated that transfer learning can be used to improve the performance of a bridge damage detection model without using a rich dataset. They developed a model to detect the damaged parts of a structure by fine-tuning the VGG16 model. Zhang et al. [18] proposed a DL-assisted real-time bridge damage detection system that determined the health of a structure using Faster R-CNN. 
This facilitated the real-time detection and analysis of damaged areas in bridge images. Recent studies have focused on implementing image generation and data augmentation techniques using generative adversarial networks to detect the degree of damage in real structures more accurately [19]. These studies were early attempts at applying DL to bridge damage identification and the identification accuracy was very low, making them difficult to apply in practice.

2.2. Damage Identification Using Deep Learning

The human method of inspecting bridges relies on experience and intuition and can respond flexibly in complex situations. In particular, skilled engineers can perform inspection tasks with very high accuracy. However, human visual inspection is subject to fatigue and subjectivity: results can vary from inspector to inspector, and minute damage or defects that are difficult to detect can be missed. Several studies have compared human inspection with deep learning-based automatic detection; deep learning is efficient in handling large amounts of data and has shown much higher consistency than humans, especially in repetitive inspection tasks. One study confirmed that deep learning can detect defects with higher accuracy and speed while minimizing subjective human judgment.
Several optimization theories have been used to improve the identification of existing network structures [20]. Song et al. proposed a preliminary selection and refined identification scheme based on Faster R-CNN. This optimized solution was used to identify three types of bridge surface damage: honeycomb pitting, crack, and salt out. Compared with Faster R-CNN, this method improved the mean average precision (mAP) of the three categories by 23.89%, 21.04%, and 35.43%, respectively [21]. Yu et al. used a k-means clustering algorithm to obtain anchor sizes corresponding to the damage sizes [22]. The shooting distance and focal length were fixed during the data acquisition phase to enable visual characterization of the true damage size. The proposed method was used for the identification of three common types of damage on bridge surfaces: crack, spalling, and exposed reinforcement. The identification results showed that the average identification accuracy for the three types of damage was 84.56%, which was higher than that of Faster R-CNN with predefined anchor points.
In contrast to parametric optimization, some new modular structures and convolutional forms can be used to facilitate the development of object detection networks. Yang et al. improved YOLOV3 using the Spatial Pyramid Pooling (SPP) module [23]. Meanwhile, a transfer learning strategy based on the pretrained model was used to learn a small sample of concrete bridge damage data. The experimental results showed that the SPP module could improve the mAP by 1.3%. Chen et al. used deformable convolution to extract more accurate features based on YOLOV3. Transfer learning was also used to improve the identification accuracy of rust, collapse, cracks, and weeds on the bridge surface [24]. The highest mAP of 84% was obtained when the number of transfer layers was 10. Moreover, pruning and group convolution were used to compress the model to accelerate the inference. A higher compression rate results in a worse identification effect of the compressed model. Most studies on bridge surface damage identification using DL have focused on applying and optimizing the latest models, and no research has been conducted on improving the accuracy through hyperparameter tuning.
Therefore, this study aims to find the optimal hyperparameters to improve the accuracy of the previously proposed model-combined automatic bridge damage identification algorithm. This is significant in that it can be applied to damage types that match the characteristics of grade 3 bridges, which are concrete structures. In addition, the reliability of the accuracy improvement can be verified through experiments using inspection photos from actual bridge information systems. In this study, we focused on comparing the identification accuracy of existing deep learning techniques with that achieved through hyperparameter tuning.

3. Materials and Methods

3.1. Instance Segmentation Model

3.1.1. Mask R-CNN

Mask R-CNN is a CNN-based model that extends the Faster R-CNN object detection framework to include instance segmentation [25]. It enables both object detection and pixel-level segmentation within a single unified architecture. The breakdown of its key components is as follows:
  • Backbone network: Mask R-CNN typically uses a pretrained CNN such as ResNet or VGGNet as its backbone network. The backbone network is responsible for extracting high-level features from the input image.
  • Region proposal network (RPN): The RPN generates region proposals, which are potential bounding box locations of objects in the image. It achieves this by sliding a small network over the feature map of the backbone network and predicting objectness scores and bounding box adjustments.
  • Region of Interest (RoI) align: RoI align extracts fixed-size feature maps from the feature map of the backbone network for each region proposal. It uses bilinear interpolation to obtain more accurate feature representations, thereby enabling precise localization.
  • Region classification and bounding box regression: Each RoI is fed into two parallel fully connected layers, one for predicting the class probabilities of the object within the RoI and another for regressing the refined bounding box coordinates.
  • Mask prediction: Mask R-CNN introduces an additional branch for predicting pixel-level masks for each object instance. This branch uses the RoI-aligned features and applies a small fully convolutional network to generate a binary mask for each class-agnostic RoI.
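The bilinear interpolation at the heart of RoI align can be illustrated with a minimal sketch (pure Python, simplified to a single sampling point per location; the function name is ours, not part of any Mask R-CNN implementation):

```python
import math

def bilinear_sample(feature, x, y):
    """Bilinearly interpolate a 2D feature map at a fractional (x, y) location.

    feature: 2D list of floats (rows = y axis, columns = x axis).
    This is the sampling step RoI align performs for each sub-bin, which
    avoids the coordinate rounding used by the older RoI pooling.
    """
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1 = min(x0 + 1, len(feature[0]) - 1)
    y1 = min(y0 + 1, len(feature) - 1)
    dx, dy = x - x0, y - y0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

# Sampling the center of a 2x2 map averages all four values.
print(bilinear_sample([[0.0, 1.0], [2.0, 3.0]], 0.5, 0.5))  # -> 1.5
```

Because the sample point keeps its fractional coordinates, small localization errors are not introduced, which is why RoI align yields more precise masks than RoI pooling.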
The important hyperparameters of Mask R-CNN include the following:
  • Number of anchors: The number of anchors at each spatial position of the feature map affects the number of region proposals generated by the RPN. It is typically set based on the scale and aspect ratios of the objects present in the dataset.
  • Region proposal intersection over union (IoU) threshold: This is the IoU threshold used to determine positive and negative anchor boxes during RPN training. Anchors whose IoU overlap with a ground truth box exceeds the threshold are considered positive.
  • RoI pooling/align output size: This is the desired output size of the RoI pooling or RoI align operation, which determines the fixed spatial resolution of the RoI-aligned feature maps.
  • Learning rate: The initial learning rate determines the step size for updating the model parameters during training. It plays a crucial role in controlling the convergence speed and final performance.
The bridge defect detection process using Mask R-CNN first preprocesses the input image and feeds it to the model, which extracts important features through a backbone network such as ResNet, often combined with a feature pyramid network (FPN). The region proposal network (RPN) then proposes candidate regions where defects are likely to exist, and the proposed regions are precisely aligned through RoI align to preserve location and size information. For each region, the type of defect is predicted, its location is indicated through a bounding box, and a pixel-level mask is generated to segment the shape of the defect more precisely. Finally, the result containing information about the defect, such as its location, type, size, and shape, is output. The architecture of Mask R-CNN is shown in Figure 1.

3.1.2. YOLO

YOLO is a single-stage detector based on real-time object detection [26]. It achieves high speed and accuracy and uses backbone networks such as DarkNet53, DarkNet19, and CSPDarkNet. The network architecture of YOLO is designed by scaling the input image to a fixed size and extracting feature maps through the CNN, as shown in Figure 2. YOLO aims to detect objects in an image by dividing it into a grid and predicting the bounding boxes and class probabilities directly. Its key components are outlined as follows:
  • Input image division: YOLO divides the input image into a grid of cells, typically of size S × S. Each cell is responsible for predicting the bounding boxes and class probabilities for objects that fall within its boundaries.
  • Anchor boxes: YOLO uses anchor boxes, which are predefined boxes with different shapes and aspect ratios, to predict multiple bounding boxes per cell. These anchor boxes are sized based on the expected object sizes in the dataset.
  • Convolutional layers: YOLO employs a series of convolutional layers to extract features from the input image. These layers capture both the low- and high-level features, thereby enabling the model to detect objects at different scales and levels of abstraction.
  • Bounding box prediction: Each cell in the grid predicts multiple bounding boxes using anchor boxes. For each bounding box, YOLO predicts the coordinates (x, y, width, height) relative to the boundaries of the cell. Subsequently, these coordinates are mapped to the coordinates of the entire image.
  • Objectness score and class prediction: For each bounding box, YOLO predicts the objectness score, which represents the confidence that the bounding box contains an object. It also predicts the class probabilities for different object categories using the softmax activation function.
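As an illustration of the coordinate mapping described above, a YOLOv1-style decoding of a cell-relative prediction into image coordinates might look as follows (our own simplified sketch; later YOLO versions parameterize boxes differently, e.g., via anchor offsets):

```python
def decode_box(row, col, tx, ty, tw, th, S, img_w, img_h):
    """Map a cell-relative YOLO prediction to image-space coordinates.

    (row, col): index of the grid cell in an S x S grid.
    (tx, ty):   box center offset within the cell, each in [0, 1].
    (tw, th):   box width/height as fractions of the whole image.
    Returns (x1, y1, x2, y2) in image pixels.
    """
    cx = (col + tx) / S * img_w   # cell offset -> absolute center x
    cy = (row + ty) / S * img_h   # cell offset -> absolute center y
    w, h = tw * img_w, th * img_h
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

# Center of the top-left cell of a 2x2 grid over a 100x100 image,
# with a box covering half the image in each dimension.
print(decode_box(0, 0, 0.5, 0.5, 0.5, 0.5, 2, 100, 100))  # -> (0.0, 0.0, 50.0, 50.0)
```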
The important hyperparameters of YOLO include the following:
  • Grid size (S): The size of the grid used to divide the input image. A larger grid size allows for finer spatial resolution but increases the computational complexity.
  • Number of anchor boxes: The number of anchor boxes assigned to each grid cell. YOLO predicts the bounding boxes based on these anchor boxes, and the number of anchor boxes affects the ability of the model to detect objects of different sizes and aspect ratios.
  • Confidence threshold: The minimum confidence threshold above which a predicted bounding box is considered a valid detection. Lowering this threshold can increase the number of detected objects but may also increase false positives.
  • Non-maximum suppression (NMS) threshold: During post-processing, YOLO applies NMS to eliminate redundant bounding box detections. The NMS threshold determines the IoU overlap that is required for two bounding boxes to be considered duplicates and removes the one with lower confidence.
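The IoU computation and greedy NMS described above can be sketched in a few lines (a simplified illustration, not the implementation of any particular YOLO release):

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, thresh):
    """Greedy NMS: keep boxes in descending score order, dropping any box
    whose IoU with an already-kept box exceeds thresh."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
# The second box overlaps the first (IoU ~0.68 > 0.5), so it is suppressed.
print(nms(boxes, scores, 0.5))  # -> [0, 2]
```

Raising the NMS threshold keeps more overlapping detections; lowering it removes near-duplicates more aggressively, mirroring the trade-off described for the confidence threshold above.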

3.1.3. BlendMask

The BlendMask model performs the semantic and instance segmentation tasks simultaneously, using the instance mask branch and semantic branch [27] to achieve high accuracy and speed. The pipeline of the BlendMask model is shown in Figure 3. BlendMask is an object detection model that combines the strengths of both one-stage and two-stage detectors. It integrates the anchor-based and anchor-free approaches to achieve accurate and efficient object detection. Its key components are summarized as follows:
  • Backbone network: BlendMask typically uses a backbone network such as ResNet or ResNeXt to extract high-level features from the input image. The backbone network plays a crucial role in capturing semantic information.
  • Anchor-based head: BlendMask uses an anchor-based head, similar to two-stage detectors such as Faster R-CNN. This head generates a set of predefined anchors and predicts the bounding box offsets and class probabilities for each anchor using features from the backbone network.
  • Anchor-free head: BlendMask also incorporates an anchor-free head, inspired by one-stage detectors such as FCOS. This head directly predicts the bounding box coordinates and class probabilities for objects, without relying on predefined anchors.
  • Context enhancement module (CEM): BlendMask introduces the CEM to enhance the feature representation by capturing context information. The CEM fuses features from different levels of the backbone network to improve the ability of the model to handle objects at various scales.
  • Mask prediction: BlendMask extends object detection to instance segmentation by predicting the pixel-level masks for each object. It uses a mask prediction branch that obtains features from the backbone network and generates a binary mask for each object instance.
The important hyperparameters of BlendMask include the following:
  • Anchor sizes and aspect ratios: The anchor sizes and aspect ratios determine the scales and shapes of the predefined anchors that are used in the anchor-based head. These parameters need to be set based on the distribution of object sizes and aspect ratios in the dataset.
  • Anchor-free detection threshold: BlendMask assigns objects to anchor-free locations based on a detection threshold. Objects with detection scores above this threshold are considered positive detections.
  • Mask IoU threshold: During mask prediction, BlendMask uses an IoU threshold to assign predicted masks to ground truth instances. The threshold determines the required overlap between the predicted and ground truth masks.
  • Loss weights: BlendMask combines multiple loss functions, including the classification loss, bounding box regression loss, and mask segmentation loss. The weights assigned to each loss term affect the overall training objective and can be adjusted to balance the importance of different tasks.
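The loss-weight combination can be illustrated as a simple weighted sum (the weight names here are our own; actual BlendMask implementations define these in their training configuration):

```python
def blendmask_total_loss(l_cls, l_box, l_mask,
                         w_cls=1.0, w_box=1.0, w_mask=1.0):
    """Weighted sum of the task losses. Adjusting the weights rebalances
    classification, bounding box regression, and mask segmentation
    during training."""
    return w_cls * l_cls + w_box * l_box + w_mask * l_mask

# Halving the box and mask weights de-emphasizes those tasks
# relative to classification.
print(blendmask_total_loss(1.0, 2.0, 3.0, w_box=0.5, w_mask=0.5))  # -> 3.5
```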

3.2. DL-Combined Model Framework for Bridge Damage Identification Based on Discrete Learning by Members

We used a DL framework for detecting the bridge damage objects proposed in a previous study [13]. First, the image quality was enhanced and normalized using SR to improve the detection performance for each object and strengthen the diversity and consistency of the dataset. Second, an optimized detection model was established for each object using a DL combination module based on individual learning and optimized into a unified model. The proposed DL-based framework was optimized for the following six types of bridge damage: efflorescence, concrete scaling, concrete spalling, cracks, corrosion, and water leaks. The architecture of this framework is illustrated in Figure 4.

3.2.1. Dataset Extraction Module by Members

The DL model training data were collected, labeled, and preprocessed during dataset extraction. The bridge images collected using the bridge management system (BMS) were labeled for the six damage types and the members. Subsequently, they were stored in the npz data structure based on the label content and transformed into data types suitable for the GPU. The labelme [28] open-source labeling tool was used for labeling to yield the final dataset, which comprised 3742 data points, as annotations and image sets for each member. Table 1 shows the number of data points per damage type.
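Assuming labelme's standard JSON layout (a top-level "shapes" list whose entries carry a "label" and polygon "points"), per-damage-type counts such as those in Table 1 can be tallied with a short sketch (the annotation content below is hypothetical):

```python
import json
from collections import Counter

def count_labels(annotation_json):
    """Count annotated shapes per damage label in one labelme JSON string."""
    shapes = json.loads(annotation_json).get("shapes", [])
    return Counter(shape["label"] for shape in shapes)

# Hypothetical labelme annotation with two damage polygons.
sample = json.dumps({"shapes": [
    {"label": "crack", "points": [[0, 0], [5, 5], [0, 5]]},
    {"label": "corrosion", "points": [[10, 10], [20, 10], [20, 20]]},
]})
print(count_labels(sample))  # crack: 1, corrosion: 1
```

In practice, such counts would be accumulated over every annotation file per member before the polygons are rasterized into the npz arrays used for training.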

3.2.2. Combined Learning and Detection Module

The combined learning and detection module analyzed the characteristics of the bridge damage objects for each member to develop a DL model optimized for identification. The identification performance of the derived model was enhanced by tuning, i.e., efficiently learning the previously constructed image set for each member. We investigated the instance segmentation models Mask R-CNN, YOLO, and BlendMask to identify the damaged objects because these models reportedly exhibit a rapid detection time and a larger input image size. Among these, we used the BlendMask DL model with the following structural components:
  • Backbone network: A feature map is extracted from the input image using a CNN-based network such as ResNet or ResNeXt.
  • Feature pyramid network: This generates feature maps of different sizes and resolutions, enabling the detection of objects with different sizes and locations.
  • RoI align: Each object region is cropped to a fixed size and connected to the feature map using RoI align while preserving the accurate location information.
  • Mask head: A mask prediction branch and class prediction branch are separately configured for each object instance from the RoI align feature map.
  • Mask encoding: A mask encoding vector is generated by combining the feature maps from the mask and class prediction branches.
  • Decoder network: The mask encoding vector is received as the input and up-sampled to the original image size. Subsequently, pixel-wise segmentation generates an instance segmentation mask.
  • Blending module: Instance masks with the same label (class) are blended using the blending module, and the final instance segmentation result is obtained as the output.
After training, the generated identification model was used for damage object identification. The model, which received a new bridge image as input, automatically identified the damaged objects in the image. The most optimized model was used as the identification model based on speed and accuracy measurements. The R-CNN-series models exhibited detection speed issues owing to the bottleneck caused by the separation between the object area estimation and object recognition. Additionally, high-performance GPUs were required to achieve the maximum detection speed owing to the large size of the network parameters. Detection models were generated for each member and combined using the aggregation module. Each detection model was trained on a dataset for each member. After receiving an input image, the final integrated detection result was derived using the combined framework of detection models for each member. Figure 5 depicts the result screen of the bridge damage detection.
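The aggregation step can be sketched as merging the outputs of the per-member models into one tagged result list (the data layout and names here are our assumption for illustration, not the framework's actual API):

```python
def aggregate(member_detections):
    """Merge detections from per-member models into one combined list.

    member_detections: {member_name: [detection dicts]}; each detection
    is tagged with the member whose model produced it, so the final
    integrated result preserves provenance.
    """
    merged = []
    for member, detections in member_detections.items():
        for det in detections:
            merged.append({**det, "member": member})
    return merged

# Hypothetical outputs from two per-member models.
results = aggregate({
    "girder": [{"type": "crack", "score": 0.91}],
    "pier": [{"type": "efflorescence", "score": 0.84},
             {"type": "spalling", "score": 0.77}],
})
print(len(results))  # -> 3 detections, each tagged with its member
```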

3.3. Hyperparameter Tuning

DL model hyperparameters are adjustable values that influence the model structure and learning algorithm. As these hyperparameters directly affect the performance and learning process of the model, it is critical to set them to appropriate values. Some common DL hyperparameters and their characteristics are as follows:
  • Learning rate: A scaling factor used by the backpropagation algorithm to update the weights. A low value may slow learning, whereas a high value may cause divergence. In general, this value is set to a power of 10, such as 0.1, 0.01, or 0.001.
  • Batch size: The number of data samples that are processed simultaneously. A small batch size can lower the memory usage and enable faster computation; however, it may increase instability owing to noise. A large batch size provides stable gradient descent updates but increases the memory requirements and computational costs.
  • Number of epochs: The number of times that the entire dataset is fed into the model. DL models typically express the number of iterations in units of “epochs”: one epoch means that the entire dataset has been fed into the model once and the weights updated accordingly. For instance, if there are 1000 data points and the batch size is 100, 10 iterations are required to process the entire dataset once; the epoch count is then incremented by one after the 10th iteration. Determining the right number of iterations is a critical issue in DL models: too few iterations may result in underfitting and subpar performance owing to inadequate training, whereas too many can cause overfitting.
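The epoch/iteration relationship described above can be expressed as two small helpers (a sketch; the function names are ours):

```python
import math

def iterations_per_epoch(num_samples, batch_size):
    """Number of weight-update iterations needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

def epochs_completed(total_iterations, num_samples, batch_size):
    """Whole epochs completed after a given number of iterations."""
    return total_iterations // iterations_per_epoch(num_samples, batch_size)

# The example from the text: 1000 data points with batch size 100
print(iterations_per_epoch(1000, 100))  # 10 iterations per epoch
print(epochs_completed(10, 1000, 100))  # 1 epoch after the 10th iteration
```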
The aforementioned hyperparameters are crucial in DL models. Thus, they need to be adjusted over multiple experimental iterations to determine their optimal values, thereby improving the model performance and generalization ability. A previous study provided practical recommendations for parameter tuning when training DL models with backpropagation, a gradient-based learning algorithm [29]. Some of these recommendations are as follows:
  • Initialization method: During initialization, the weights should be selected randomly from an appropriate distribution.
  • Learning rate (LR) scheduling: The initial learning rate and its decay rate should be tuned, and then fixed, to achieve optimal performance.
  • Regularization: Techniques such as L1 or L2 regularization and dropout can be used to mitigate overfitting problems.
  • Batch size: The batch size must be adjusted to strike a balance between the learning speed and generalization performance.
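To make the LR-scheduling recommendation concrete, a common step-decay schedule can be sketched as follows. The decay factor and interval below are illustrative assumptions, not the values used in this study; the starting LR of 0.01 matches the default in Table 3.

```python
def step_decay_lr(base_lr, decay_rate, decay_every, iteration):
    """Step-decay LR schedule: the base LR is multiplied by decay_rate
    once every decay_every iterations."""
    return base_lr * decay_rate ** (iteration // decay_every)

# Starting from LR 0.01 and halving every 100,000 iterations:
print(step_decay_lr(0.01, 0.5, 100_000, 0))        # 0.01
print(step_decay_lr(0.01, 0.5, 100_000, 150_000))  # 0.005
print(step_decay_lr(0.01, 0.5, 100_000, 250_000))  # 0.0025
```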
These recommendations can serve as useful guidelines for tuning DL model parameters, thereby facilitating the construction of a more stable and better-performing DL model. Yu et al. analyzed various hyperparameter tuning techniques, such as grid search, random search, and Bayesian optimization [30]. Snoek et al. proposed a method for hyperparameter tuning of DL models using Bayesian optimization [31], which constructs a probability model from real-world experimental results and uses it to infer the optimal hyperparameter combination. Feurer et al. introduced sequential model-based algorithm configuration (SMAC) within an automated machine learning framework [32]; SMAC explores and evaluates the hyperparameter space of DL models to determine the optimal combination based on real-world experimental results. Recently, a method for optimizing DL hyperparameters using the Slurm and Optuna frameworks was investigated, enabling experiments such as DL runs in distributed computing environments and autoencoder training [33].
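Of the strategies surveyed above, plain random search is the simplest to sketch. The toy objective below merely stands in for a real training-and-validation run (which would return, e.g., validation mAP); the search space keys mirror the DEPTH and LR names used later in this paper.

```python
import random

def random_search(space, objective, n_trials=200, seed=0):
    """Random search over a discrete hyperparameter space: sample one
    configuration per trial and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy objective standing in for a real training run: it pretends that a
# 50-layer backbone with LR 0.015 scores best (a made-up assumption).
space = {"DEPTH": [50, 101], "LR": [0.001, 0.01, 0.015]}
def toy_map(params):
    return -abs(params["LR"] - 0.015) - (0.001 if params["DEPTH"] == 101 else 0.0)

best, score = random_search(space, toy_map)
print(best)  # {'DEPTH': 50, 'LR': 0.015}
```

A Bayesian optimizer such as Optuna would replace the uniform sampling with a model-guided proposal, but the outer loop has the same shape.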

4. Experiments and Results

4.1. Experimental Setup

We used an Intel(R) Core(TM) i7-10900k CPU at 2.90 GHz with 96 GB of RAM and an NVIDIA GeForce RTX3090 GPU. The experiments were conducted on this GPU-equipped system to test the performance of the established detection model, and Python 3.8.16 was used as the programming language. The details of the test bridges are presented in Table 2; 3724 damage images of class 3 deteriorated bridges were used as training data.
The safety grade of a class 3 bridge is evaluated through regular safety inspections and detailed safety diagnoses and is divided into five grades, from A to E, depending on the bridge condition. Grade A denotes the best condition with no problems, and the condition worsens toward grade E, which requires restrictions on use or maintenance and reinforcement. All bridges tested in this paper are grade B or C; although these bridges have minor defects, they are structurally sound. However, they require periodic maintenance.

4.2. Hyperparameters Used in Experiment

Table 3 lists the parameters of the existing models. The backbone was a ResNet-based model with a default depth of 101 layers.
In this study, hyperparameter tuning was performed based on the initialization, LR-scheduling, and batch-size recommendations described above. Specifically, we experimented with variable values for DEPTH, MAX_ITER, and LR: the DEPTH value was set to 101 or 50 layers; the LR value was tested at 0.001, 0.01, and 0.015; and the ITERATION value was set to 50,000, 100,000, 150,000, 200,000, and 250,000. Table 4 lists the hyperparameter values used as variables.
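The variable values above form a full grid of candidate configurations, which can be enumerated as follows. The key names mirror the Detectron2-style config fields shown in Table 3; treating each grid entry as a config override is our assumption about how the runs were organized.

```python
from itertools import product

DEPTHS = [50, 101]
LRS = [0.001, 0.01, 0.015]
ITERS = [50_000, 100_000, 150_000, 200_000, 250_000]

# Every combination of the variable hyperparameters listed in Table 4
grid = [{"DEPTH": d, "BASE_LR": lr, "MAX_ITER": it}
        for d, lr, it in product(DEPTHS, LRS, ITERS)]
print(len(grid))  # 30 candidate configurations (2 x 3 x 5)
```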

4.3. Measurement Methods

We employed the average precision (AP), a standard metric for evaluating prediction accuracy in DL detection models. Correct classification was determined from the matching percentage by measuring the intersection over union (IoU) between the ground-truth object region and the model-predicted region. As shown in Figure 6, the IoU is the ratio of the overlap between the ground truth and the detection result to their combined area (union). Accordingly, a wider overlapping area between the polygon at the actual object location and the predicted polygon indicates a better prediction.
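For pixel-wise masks such as the polygons in Figure 6, the IoU computation reduces to counting pixels; a minimal sketch:

```python
import numpy as np

def mask_iou(pred, gt):
    """IoU of two boolean segmentation masks: overlapping pixels divided
    by the pixels covered by either mask (their union)."""
    pred, gt = np.asarray(pred, dtype=bool), np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(pred, gt).sum()) / float(union)

gt = np.zeros((10, 10), dtype=bool);   gt[2:8, 2:8] = True      # 36-pixel ground truth
pred = np.zeros((10, 10), dtype=bool); pred[4:10, 4:10] = True  # 36-pixel prediction
print(round(mask_iou(pred, gt), 4))  # 0.2857 (16 overlapping / 56 union pixels)
```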
AP50 and AP75 denote the values measured by assuming successful classification for matching rates greater than 50% and 75%, respectively. APs, APm, and APl denote the AP measured for small (s), medium (m), and large (l) detected objects, respectively; for these, results within a 50–95% matching rate were considered successful classifications. Furthermore, the mAP is the average of all AP values, representing the average performance of the model. The six damage types (efflorescence, concrete scaling, concrete spalling, crack, water leak, and corrosion) were trained and used in the experiment. However, the detection results were analyzed only for the efflorescence, concrete scaling, and concrete spalling types for each member, excluding cases with few or zero objects; the overall detection performance values (such as mAP) cover all types. The experiments were performed using 50 and 20 images per member for training and validation, respectively. Table 5 lists the detection performance results for the default parameter values (DEPTH: 101, ITER: 270,000, and LR: 0.01), which serve as the baseline against which the hyperparameter-tuned models are compared.
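The 50–95% averaging just described can be written out explicitly. The sketch below averages AP over the ten IoU thresholds 0.50, 0.55, …, 0.95; the toy AP curve is purely illustrative (real AP values come from precision–recall evaluation of the detector).

```python
def ap_averaged_over_thresholds(ap_at):
    """Average AP over the IoU thresholds 0.50, 0.55, ..., 0.95,
    i.e., the 50-95% matching-rate range described in the text."""
    thresholds = [0.50 + 0.05 * i for i in range(10)]
    return sum(ap_at(t) for t in thresholds) / len(thresholds)

# Toy AP curve: AP typically falls as the required matching rate rises;
# with this curve the averaged AP sits between AP50 and AP95.
toy_ap = lambda t: 1.0 - t  # illustrative, not measured data
averaged = ap_averaged_over_thresholds(toy_ap)
```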

4.4. Model Performance with DEPTH of 50 Layers

Table 6 presents the model performance for each ITER at a DEPTH of 50 and LR of 0.001.
The results show a significant decrease in detection performance with these hyperparameters. A low LR value may cause the DL model to exhibit inferior detection performance. LR is the scaling factor that is used to update the weight. First, a low LR value decreases the size of the weight updates, thereby slowing the convergence. The detection performance may be limited if the model fails to converge to the optimal parameters under sufficient iterations. Second, using a low LR increases the probability that the model will fall into a local minimum and fail to attain the global minimum, thereby limiting the model performance. Third, a low LR may cause the model to learn insufficient complex patterns and meaningful features, resulting in underfitting and poor detection performance. Finally, a low LR can make it difficult to learn sufficiently complex patterns using small or simple datasets. Detection was not possible when ITER was set to 50,000, which can be attributed to the insufficient training of the model. DL models update their weights by iterating through the data multiple times to achieve the optimal performance. Sufficient training may not be achieved at a low ITER value because only a limited number of iterations can be performed on the entire dataset.
Table 7 presents the model performance results according to ITER at an LR value of 0.01.
The detection performance of each model increased consistently with the ITER value. At ITER values of 200,000 and 250,000, the performance was superior to that of the existing model (mAP: 83.965).
Table 8 lists the model performance according to ITER at an LR value of 0.015.
At ITER values of 100,000, 150,000, and 250,000, the detection performance was superior to that of the existing model.

4.5. Model Performance with DEPTH of 101 Layers

Table 9 shows the model performance for each ITER with a DEPTH of 101 and an LR of 0.001.
The highest detection performance of 85.814 mAP was achieved at an ITER of 250,000 and LR of 0.001. However, the detection performance results were not obtained for each ITER at an LR of 0.01 owing to the size, complexity, and class imbalance of the dataset. These hyperparameter values (DEPTH: 101 and LR: 0.01) were ineffective because the model did not perform well even after 2–3 days of training and approximately 270,000 iterations for performance measurement. Table 10 shows the model performances at an LR of 0.015 for each ITER.
Although the detection performance was lower than that of the existing model, it was substantial at an ITER of 250,000. Furthermore, detection was impossible at an ITER of 50,000 because the model was not trained sufficiently, similar to the case of DEPTH: 50, LR: 0.001, and ITER: 50,000. Figure 7 presents a graph showing the case with the highest accuracy for all hyperparameters. It can be observed that, except for the case of DEPTH: 50, ITER: 250,000, and LR: 0.001, better results were achieved than those of the existing model (DEPTH: 101, ITER: 270,000, LR: 0.01).
According to the results in Table 11, the proposed model achieved the highest performance in both AP50 and AP75: AP50 increased by 0.8 and AP75 by 9.7 compared with the second-best recent model, BR-DETR. This is possibly because identification was performed by a framework combining several models rather than a single model, further optimized through hyperparameter tuning.
The experimental results identified hyperparameter values that yield superior performance compared with the existing models. However, some hyperparameter settings led to overfitting owing to the small amount of per-member image data collected, which should be addressed by securing more data. A few cases exhibited interactions between hyperparameters; for instance, using a large LR together with a small ITER may cause issues such as slow convergence to the optimal solution and underfitting. Thus, developing a detection model with optimal hyperparameter values chosen from such experiments can yield more accurate detection. The experiments revealed that hyperparameter tuning influences detection performance; however, a search method suited to the type and size of the dataset and the characteristics of the detection model is required to find these optimal hyperparameters.

5. Conclusions

Determining optimal values and improving performance via hyperparameter tuning in DL models is challenging. To this end, this study compared the detection performance while varying the parameters related to initialization, LR scheduling, and batch size, among various parameter-tuning methods. The experimental results demonstrated that, of the 23 hyperparameter combinations evaluated, 6 exhibited better detection performance than the traditional model. The hyperparameters DEPTH, LR, and ITER with values of 50, 0.015, and 250,000, respectively, improved the detection performance by 2.863%. In R-CNN-based DL models, detection is generally faster at a depth of 50 than at 101 because the amount of computation and the number of parameters increase with model depth. Thus, in addition to the accuracy improvement, the detection speed was approximately 40% higher than that of the existing model, indicating that these hyperparameter values were highly efficient. However, inappropriate hyperparameter values significantly decrease the detection performance or speed; in the worst case, detection itself may become impossible. This underscores the importance of adequately tuning the hyperparameters to achieve excellent speed and accuracy. At present, research is being conducted on identifying and quantifying various types of damage in concrete structures such as bridges. The experimental results obtained in this study can serve as important references for the hyperparameter tuning of DL models that detect similar damage types. Combining hyperparameters with the other settings selected for a DL model can yield different results and performance. Therefore, it is crucial to evaluate various hyperparameter values through experimentation and validation, adjusting them appropriately to achieve optimal results.
Recently, the aging of bridges has been accelerating owing to climate change. Climate change can alter the physical properties of structures and accelerate problems such as corrosion, reducing their durability and safety. In particular, increased rainfall, rising temperatures, and changes in humidity can accelerate corrosion and affect earthquake resistance. Therefore, measures to adapt to climate change should include more accurate and rapid maintenance and repair procedures that apply new technologies to existing structures. To respond quickly to such changes, it is important to continuously develop automatic identification models using artificial intelligence [38].
In the experiment in this study, data bias occurred owing to the limited training dataset. In future research, it is necessary to improve this bias by securing additional datasets. We also plan to introduce an algorithm that automatically determines optimal hyperparameter values and tests and verifies the results. Additionally, the proposed bridge damage automatic identification model will be applied to the latest Vision Transformer-based model. Further research is required to develop a DL model framework with higher identification accuracy, and it should be tested and validated on real bridge sites to improve the technology.

Author Contributions

Conceptualization, S.-W.C.; software, S.-W.C. and S.-S.H.; validation, S.-W.C. and S.-S.H.; resources, S.-W.C.; data curation, S.-W.C.; writing—original draft, S.-W.C.; writing—review and editing, S.-W.C., S.-S.H. and B.-K.K.; supervision, B.-K.K.; project administration, B.-K.K.; funding acquisition, B.-K.K. All authors have read and agreed to the published version of the manuscript.

Funding

Research for this paper was carried out under the KICT Research Program (20240143-001, Research on Smart Construction Technology for Leading the future construction industry and Creating new market), funded by the Ministry of Science and ICT.

Data Availability Statement

DIV2K dataset—Super Resolution Benchmark Dataset (link: https://data.vision.ee.ethz.ch/cvl/DIV2K/, accessed on 30 October 2022).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kušter Marić, M.; Mandić Ivanković, A.; Vlašić, A.; Bleiziffer, J.; Srbić, M.; Skokandić, D. Assessment of reinforcement corrosion and concrete damage on bridges using non-destructive testing. Građevinar 2019, 71, 843–862. [Google Scholar]
  2. Kim, A.R.; Kim, D.; Byun, Y.S.; Lee, S.W. Crack detection of concrete structure using deep learning and image processing method in geotechnical engineering. J. Korean Geotech. Soc. 2018, 34, 145–154. [Google Scholar]
  3. Byun, T.B.; Kim, J.H.; Kim, H.S. The Recognition of Crack Detection Using Difference Image Analysis Method based on Morphology. J. Korea Inst. Inf. Commun. Eng. 2006, 10, 197–205. [Google Scholar]
  4. Lee, B.Y.; Kim, Y.Y.; Kim, J.K. Development of image processing for concrete surface cracks by employing enhanced binarization and shape analysis technique. J. Korea Concr. Inst. 2005, 17, 361–368. [Google Scholar] [CrossRef]
  5. Lee, B.J.; Shin, J.I.; Park, C.H. Development of image processing program to inspect concrete bridges. In Proceedings of the Korea Concrete Institute Conference, Gwangju, South Korea, 7 May 2008; pp. 189–192. [Google Scholar]
  6. Kim, K.B.; Cho, J.H. Detection of concrete surface cracks using fuzzy techniques. J. Korean Inst. Inf. Commun. Eng. 2010, 14, 1353–1358. [Google Scholar] [CrossRef]
  7. Kim, Y. Development of crack recognition system for concrete structure using image processing method. J. Korean Inst. Inf. Technol. 2016, 14, 163–168. [Google Scholar] [CrossRef]
  8. Park, H.S. Performance analysis of the tunnel inspection system using high speed camera. J. Korean Inst. Inf. Technol. 2013, 11, 1–6. [Google Scholar]
  9. Cho, S.; Kim, B.; Lee, Y.I. Image-based concrete crack and Spalling detection using deep learning. J. Korean Soc. Civ. Eng. 2018, 66, 92–97. [Google Scholar]
  10. Kim, J.W.; Jung, Y.W. Study on Rapid Structure Visual Inspection Technology Using Drones and Image Analysis Techniques for Damaged Concrete Structures. Proc. Korean Soc. Civ. Eng. 2017, 1788–1789. [Google Scholar] [CrossRef]
  11. Lee, J.H.; Kim, I.H.; Jung, H.J. A feasibility study for detection of bridge crack based on UAV. Trans. Korean Soc. Noise Vib. Eng. 2018, 28, 110–117. [Google Scholar] [CrossRef]
  12. Hoskere, V.; Narazaki, Y.; Hoang, T.; Spencer, B., Jr. Vision-based structural inspection using multiscale deep convolutional neural networks. In Proceedings of the 3rd Huixian International Forum on Earthquake Engineering for Young Researchers, Urbana, IL, USA, 11–12 August 2017. [Google Scholar]
  13. Hong, S.S.; Hwang, C.; Chung, S.W.; Kim, B.K. A deep learning-based bridge damaged objects automatic detection model using bridge members model combination framework. J. Next-Gener. Converg. Inf. Serv. Technol. 2023, 12, 105–118. [Google Scholar] [CrossRef]
  14. Liu, K.; Han, X.; Chen, B.M. Deep learning based automatic crack detection and segmentation for unmanned aerial vehicle inspections. In Proceedings of the IEEE International Conference on Robotics and Biomimetics (ROBIO), Dali, China, 6–8 December 2019; pp. 381–387. [Google Scholar] [CrossRef]
  15. Silva, V.L.S.; Cardoso, M.A.; Oliveira, D.F.B.; Moraes, R. Stochastic. Concrete cracks detection based on deep learning image classification. In Optimization Strategies Applied to OLYMPUS Benchmark Proceedings; European Association of Geoscientists & Engineers: Utrecht, The Netherlands, 2018. [Google Scholar] [CrossRef]
  16. Feng, C.; Zhang, H.; Wang, S.; Li, Y.; Wang, H.; Yan, F. Structural damage detection using deep convolutional neural network and transfer learning. KSCE J. Civ. Eng. 2019, 23, 4493–4502. [Google Scholar] [CrossRef]
  17. Bukhsh, Z.A.; Jansen, N.; Saeed, A. Damage detection using in-domain and cross-domain transfer learning. Neural Comput. 2021, 33, 16921–16936. [Google Scholar] [CrossRef]
  18. Zhang, C.; Chang, C.C.; Jamshidi, M. Concrete bridge surface damage detection using a single-stage detector. Comput.-Aided Civ. Infrastruct. Eng. 2020, 35, 389–409. [Google Scholar] [CrossRef]
  19. Munawar, H.S.; Hammad, A.W.A.; Waller, S.T.; Islam, M.R. Modern crack detection for bridge infrastructure maintenance using machine learning. Hum.-Cent. Intell. Syst. 2022, 2, 95–112. [Google Scholar] [CrossRef]
  20. Zhang, Y.; Yuen, K.V. Review of artificial intelligence-based bridge damage detection. Adv. Mech. Eng. 2022, 14, 16878132221122770. [Google Scholar] [CrossRef]
  21. Song, W.; Cai, S.; Guo, H.; Gao, F.; Zhang, J.; Liu, G.; Wei, H. Bridge apparent damage detection system based on deep learning. In Fuzzy Systems and Data Mining V; IOS Press: Amsterdam, The Netherlands, 2019; pp. 475–480. [Google Scholar]
  22. Yu, L.; He, S.; Liu, X.; Ma, M.; Xiang, S. Engineering-oriented bridge multiple-damage detection with damage integrity using modified faster region-based convolutional neural network. Multimed. Tools Appl. 2022, 81, 18279–18304. [Google Scholar] [CrossRef]
  23. Yang, J.; Li, H.; Huang, D.; Jiang, S. Concrete bridge damage detection based on transfer learning with small training samples. In Proceedings of the 2021 7th International Conference on Systems and Informatics (ICSAI), Chongqing, China, 13–15 November 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–6. [Google Scholar]
  24. Chen, X.; Ye, Y.; Zhang, X.; Yu, C. Bridge damage detection and recognition based on deep learning. J. Phys. Conf. Ser. 2020, 1626, 012151. [Google Scholar] [CrossRef]
  25. He, K.; Gkioxari, G.; Dollar, P.; Girshick, R. Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 386–397. [Google Scholar] [CrossRef]
  26. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar] [CrossRef]
  27. Chen, H.; Sun, K.; Tian, Z.; Shen, C.; Huang, Y.; Yan, Y. BlendMask: Top-down meets bottom-up for instance segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 8570–8578. [Google Scholar] [CrossRef]
  28. Torralba, A.; Russell, B.C.; Yuen, J. Labelme: Online image annotation and applications. Proc. IEEE 2010, 98, 1467–1484. [Google Scholar] [CrossRef]
  29. Bengio, Y. Practical recommendations for gradient-based training of deep architectures. In Neural Networks: Tricks of the Trade, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 437–478. [Google Scholar] [CrossRef]
  30. Yu, T.; Zhu, H. Hyper-parameter Optimization: A Review of Algorithms and Applications. arXiv 2020, arXiv:2003.05689. [Google Scholar]
  31. Snoek, J.; Larochelle, H.; Adams, R.P. Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems 25; Curran Associates, Inc.: Red Hook, NY, USA, 2012. [Google Scholar]
  32. Feurer, M.; Klein, A.; Eggensperger, K.; Springenberg, J.; Blum, M.; Hutter, F. Efficient and robust automated machine learning. In Advances in Neural Information Processing Systems 28; Curran Associates, Inc.: Red Hook, NY, USA, 2015. [Google Scholar]
  33. Pokhrel, P.A. Comparison of AutoML Hyperparameter Optimization Tools for Tabular Data. Ph.D. Dissertation, Youngstown State University, Youngstown, OH, USA, 2023. [Google Scholar]
  34. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part I 14. Springer International Publishing: Cham, Switzerland, 2016; pp. 21–37. [Google Scholar]
  35. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
  36. Zhu, X.; Su, W.; Lu, L.; Li, B.; Wang, X.; Dai, J. Deformable DETR: Deformable Transformers for End-to-End Object Detection. arXiv 2020, arXiv:2010.04159. [Google Scholar]
  37. Wan, H.; Gao, L.; Yuan, Z.; Qu, H.; Sun, Q.; Cheng, H.; Wang, R. A novel transformer model for surface damage detection and cognition of concrete bridges. Expert Syst. Appl. 2023, 213, 119019. [Google Scholar] [CrossRef]
  38. Milić, P.; Kušter Marić, M. Climate change effect on durability of bridges and other infrastructure. Građevinar Časopis Hrvat. Saveza Građevinskih Inženjera 2023, 75, 893–906. [Google Scholar]
Figure 1. Mask R-CNN architecture [25].
Figure 2. YOLO architecture [26].
Figure 3. BlendMask pipeline [27].
Figure 4. Architecture of proposed framework.
Figure 5. Bridge damage detection results.
Figure 6. Calculation of IoU in bridge damage detection results.
Figure 7. Graph of highest accuracy according to hyperparameters.
Table 1. Numbers of learning data points.
Type of Damage       Learning Data
Water leak           163
Efflorescence        1220
Concrete scaling     368
Concrete spalling    602
Corrosion            482
Crack                889
Total                3724
Table 2. Details of test bridges.
Bridge Name      Yeonhwa Bridge   Deokmun Bridge   Shinyul Bridge   Daechi Bridge   Hwaseong Bridge
Total length     29.8 m           20 m             54 m             36 m            21.6 m
Total width      13 m             10.5 m           10 m             10 m            10 m
Height           3 m              5 m              3 m              2.4 m           4.4 m
Completion       2016             1982             1987             1987            1983
Table 3. Main parameters of SRCNN and BlendMask.
SRCNN:
  Epochs: 300
  Batch size: 100
  Loss rate: 1 × 10^−4

BlendMask:
  BACKBONE:
    NAME: "build_fcos_resnet_fpn_backbone"
    DEPTH: 101
  ROI_HEADS:
    BATCH_SIZE_PER_IMAGE: 512
  ITER: 270,000
  SOLVER:
    BASE_LR: 0.01
    BIAS_LR_FACTOR: 1.0
    MOMENTUM: 0.9
    WARMUP_METHOD: linear
    WEIGHT_DECAY: 0.0001
Table 4. Hyperparameters used as variables in experiment.
DEPTH: 101    LR: 0.001, 0.01, 0.015    ITER: 50,000–250,000 (50,000 increments, 5 values)
DEPTH: 50     LR: 0.001, 0.01, 0.015    ITER: 50,000–250,000 (50,000 increments, 5 values)
Table 5. Damaged object detection performance results for existing models.
Measure    Existing Model
AP50       92.675
AP75       92.121
APs        76.414
APm        92.508
APl        94.013
mAP        83.965
Table 6. Experimental results for DEPTH of 50 and LR of 0.001.
LR: 0.001
ITER       50,000    100,000    150,000    200,000    250,000
AP50       -         11.404     22.439     31.656     33.479
AP75       -         0.339      1.328      2.074      1.916
APs        -         3.567      6.208      8.886      9.104
APm        -         3.837      7.22       8.733      9.198
APl        -         3.988      6.048      7.747      8.358
mAP        -         2.275      4.754      6.441      6.851
Table 7. Experimental results for DEPTH of 50 and LR of 0.01.
LR: 0.01
ITER       50,000    100,000    150,000    200,000    250,000
AP50       90.497    92.726     93.33      92.629     92.641
AP75       71.81     87.558     91.109     92.267     92.092
APs        55.535    66.577     73.494     75.964     76.558
APm        74.419    85.424     89.929     92.511     93.711
APl        75.051    89.505     90.706     93.117     97.628
mAP        64.521    76.961     81.676     84.531     86.372
Table 8. Experimental results for DEPTH of 50 and LR of 0.015.
LR: 0.015
ITER       50,000    100,000    150,000    200,000    250,000
AP50       90.843    92.762     92.702     92.748     92.744
AP75       78.625    92.762     91.328     91.744     92.232
APs        62.06     76.675     75.135     71.786     77.204
APm        82.462    92.055     92.55      89.408     94.697
APl        80.827    93.472     94.137     94.348     97.826
mAP        70.527    84.467     84.112     82.402     86.828
Table 9. Experimental results for DEPTH of 101 and LR of 0.001.
LR: 0.001
ITER       50,000    100,000    150,000    200,000    250,000
AP50       92.284    92.823     92.825     92.700     92.773
AP75       86.137    89.528     91.943     92.362     91.905
APs        67.671    68.660     74.152     74.094     76.007
APm        85.607    87.300     91.465     90.698     93.251
APl        88.856    88.944     93.698     93.771     97.207
mAP        76.697    78.542     83.456     83.693     85.814
Table 10. Experimental results for DEPTH of 101 and LR of 0.015.
LR: 0.015
ITER       50,000    100,000    150,000    200,000    250,000
AP50       0         77.232     85.413     88.195     92.679
AP75       0         28.415     67.493     57.54      91.941
APs        0         30.179     42.324     43.962     74.271
APm        0         47.073     67.273     68.074     92.659
APl        0         39.296     63.465     63.525     92.529
mAP        0         34.276     54.01      53.329     82.756
Table 11. Detection results of different models on bridge damage dataset.
Model                                       AP50     AP75
Single Shot MultiBox Detector (SSD) [34]    81.4     64.4
YOLOv3 [26]                                 87.9     76.2
YOLOv4 [35]                                 89.4     76.4
DETR [36]                                   90       79
BR-DETR [37]                                91.9     82.5
BDODC-F (proposed)                          92.7     92.2

Share and Cite

Chung, S.-W.; Hong, S.-S.; Kim, B.-K. Hyperparameter Tuning Technique to Improve the Accuracy of Bridge Damage Identification Model. Buildings 2024, 14, 3146. https://doi.org/10.3390/buildings14103146