1. Introduction
As bridge construction and transportation networks rapidly expand, steel bridge deck pavement systems, recognized for their exceptional performance, have become increasingly common in bridge projects worldwide [1]. These systems consist of a pavement layer and a steel bridge deck; the pavement protects the deck against corrosion and provides the friction that vehicles need [2]. The pavement layer and the steel bridge deck bear loads together and deform in a coordinated manner [3]. During operation, however, these systems are prone to defects, which pose significant safety risks to urban traffic. Regular inspection and evaluation are therefore vital to maintaining their structural integrity and ensuring traffic safety [4].
Recently, artificial intelligence has become increasingly prevalent in fields such as healthcare, finance, and engineering, bringing about substantial improvements and innovations [5,6]. Machine learning, a subset of artificial intelligence, facilitates automation by identifying patterns in data [7,8]. This study uses machine learning algorithms to classify the condition levels of steel deck pavement systems as an alternative to traditional manual inspection, which is often time-consuming and labor-intensive. Machine learning is more efficient and enables transit agency personnel to assess the condition levels of steel deck pavement systems with limited data. This approach reduces workload and costs and decreases the likelihood of human error.
With advancements in computing power, artificial intelligence has increasingly captured public attention. Machine learning has been applied effectively in many fields and has been shown to handle nonlinear problems well [9,10,11]. Mangalathu et al. employed machine learning algorithms, including K-Nearest Neighbor and Random Forest, to classify the failure modes of reinforced concrete beam–column joints and predict their shear capacity [12]. Bakouregui et al. developed a gradient boosting model to estimate the load-bearing capacity of reinforced concrete columns strengthened with fiber-reinforced polymer strips [13]. Ikumi et al. used artificial neural networks to estimate the post-cracking tensile strength of fiber-reinforced concrete [14]. Guan et al. used a Random Forest algorithm to estimate the maximum interstory displacement angle and maximum roof acceleration of frame structures subjected to earthquakes [15]. Feng et al. employed an adaptive boosting algorithm, which combines multiple weak learners into a robust model, to estimate the compressive strength of concrete [16]. In addition to building machine learning prediction models, researchers often use interpretability methods to help explain how the models reach their decisions [17].
In recent years, some researchers have applied machine learning algorithms to bridge condition prediction. Nasab et al. developed a framework to improve the accuracy of bridge deck condition prediction using machine learning algorithms and Ohio bridge data [18]. Rajkumar et al. designed a Random Forest model to estimate the condition ratings of bridges in Florida, which yields an efficient model with fewer input parameters [19]. Martinez et al. used data collected from 2802 bridges in Canada over a 10-year period to predict the bridge condition index with a decision tree algorithm, and verified the model through cross-validation [20]. Liu et al. used a convolutional neural network to build a method for predicting the condition of bridge structural components and optimized the model parameters [21]. Assaad et al. selected the factors affecting the bridge according to their feature importance and developed a bridge deck pavement defect prediction model using the K-Nearest Neighbor algorithm [22]. Although these studies applied machine learning to bridge conditions, the databases they used suffer from class imbalance, and the imbalance among categories was not taken into account. To address this issue, this study develops a decision-making tool for assessing the condition levels of deck pavement systems from imbalanced data.
This study develops a decision-making tool to assist transit agency personnel in assessing the condition of steel bridge deck pavement systems. We build a data-driven prediction model to replace the manual inspection methods currently used, thereby saving costs and improving efficiency. The condition levels of the pavement layer and steel bridge deck were tested and validated using seven machine learning techniques. The primary contributions of this paper are as follows: (1) a decision-making tool is presented for estimating the condition levels of deck pavement systems, combining data balancing techniques with machine learning classification algorithms; (2) because data imbalance is a common issue in classification problems, the Synthetic Minority Oversampling Technique (SMOTE) was used to create a balanced synthetic dataset for the deck pavement system, and this dataset was used to evaluate the effectiveness of the machine learning models; (3) the model hyperparameters were optimized through a combination of 10-fold cross-validation and grid search to improve generalization performance; (4) on the original dataset, the best-performing model achieved an accuracy of 0.841, and after SMOTE was applied to address the imbalance, its accuracy improved to 0.929.
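The oversampling idea behind SMOTE can be illustrated with a minimal sketch: each synthetic sample is placed at a random position on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The function name `smote_oversample`, its parameters, and the toy feature vectors below are hypothetical and chosen only for illustration; this is not the implementation or the data used in this study.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base sample.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy data: 8 majority-class and 3 minority-class feature vectors.
majority = [(float(i), float(i % 3)) for i in range(8)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.5)]

# Generate enough synthetic minority samples to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # 8 8
```

Because every synthetic point is a convex combination of two real minority samples, the generated data stay inside the region already occupied by the minority class rather than duplicating existing records.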
4. Conclusions
At present, the condition of steel bridge deck pavement systems is usually inspected manually. Data-driven intelligent algorithms can remove this dependence on manual labor. However, the imbalance among data categories in real steel bridge deck pavement system databases reduces the accuracy of machine learning predictions. This study proposes a decision-making tool for predicting condition levels in steel bridge deck pavement systems that is specifically designed to handle imbalanced data.
To address the class imbalance problem, a balanced synthetic database was generated with SMOTE and used to train the machine learning models. Seven machine learning algorithms (LightGBM, XGBoost, RF, AdaBoost, KNN, MLP, and LR) were evaluated. LightGBM performed best: on the original database it achieved an accuracy of 0.841, a precision of 0.845, and a recall of 0.265, whereas on the generated database it achieved an accuracy of 0.929, a precision of 0.929, and a recall of 0.930. These results show that the database generated with SMOTE effectively mitigates the data category imbalance. The condition level classification approach proposed in this study therefore holds significant potential for application to steel bridge deck pavement systems.
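The combination of high accuracy and low recall reported for the original database is a typical symptom of class imbalance. The following minimal sketch reproduces the pattern on hypothetical toy labels (the class names `good` and `poor` and the counts are assumptions for illustration, not data from this study, and the averaging used for the reported metrics is not specified here).

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true, y_pred, label):
    """Of the samples predicted as `label`, the fraction that truly are."""
    truths = [t for t, p in zip(y_true, y_pred) if p == label]
    return sum(t == label for t in truths) / len(truths) if truths else 0.0


def recall(y_true, y_pred, label):
    """Of the samples that truly are `label`, the fraction that were found."""
    preds = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in preds) / len(preds) if preds else 0.0


# Hypothetical imbalanced test set: 90 "good" and 10 "poor" condition records.
y_true = ["good"] * 90 + ["poor"] * 10
# A classifier biased toward the majority class detects only 2 of the 10 "poor" records.
y_pred = ["good"] * 90 + ["good"] * 8 + ["poor"] * 2

acc = accuracy(y_true, y_pred)
rec_poor = recall(y_true, y_pred, "poor")
prec_poor = precision(y_true, y_pred, "poor")
macro_recall = (recall(y_true, y_pred, "good") + rec_poor) / 2

print(f"accuracy={acc:.2f} recall(poor)={rec_poor:.2f} "
      f"precision(poor)={prec_poor:.2f} macro recall={macro_recall:.2f}")
# accuracy=0.92 recall(poor)=0.20 precision(poor)=1.00 macro recall=0.60
```

Accuracy is dominated by the majority class, so it stays high even though most minority-class records are missed; recall exposes the problem, which is why it is reported alongside accuracy and precision.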
This study applies hyperparameter optimization during the training of the prediction models, which requires a large number of iterations and consumes significant computational resources. When sufficient computational resources are available, parallelizing the training process across multiple computing nodes can shorten the training time. Future research will focus on further improving data balancing methods to reduce computational costs while maintaining quality.
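The computational cost noted above grows multiplicatively: a grid search with k-fold cross-validation runs one train-and-validate cycle per hyperparameter combination per fold. The sketch below illustrates this with a small K-Nearest Neighbor classifier as a dependency-free stand-in; the grid values, the helper names, and the toy two-cluster data are assumptions for illustration and do not reproduce the search space or models used in this study.

```python
import itertools
import math
import random


def k_fold_indices(n, n_folds, seed=0):
    """Shuffle range(n) and split it into n_folds nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::n_folds] for f in range(n_folds)]


def knn_predict(train_X, train_y, x, k, weighted):
    """Classify x by an (optionally distance-weighted) vote of its k nearest neighbours."""
    nearest = sorted(zip(train_X, train_y), key=lambda s: math.dist(s[0], x))[:k]
    votes = {}
    for point, label in nearest:
        w = 1.0 / (math.dist(point, x) + 1e-9) if weighted else 1.0
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)


def cv_accuracy(X, y, params, folds):
    """Mean validation accuracy of one hyperparameter setting over all folds."""
    scores = []
    for held_out in folds:
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        hits = sum(knn_predict(train_X, train_y, X[i], **params) == y[i]
                   for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / len(scores)


def grid_search(X, y, grid, n_folds=10):
    """Exhaustively score every grid combination with k-fold cross-validation."""
    folds = k_fold_indices(len(X), n_folds)
    candidates = [dict(zip(grid, values))
                  for values in itertools.product(*grid.values())]
    scored = [(cv_accuracy(X, y, p, folds), p) for p in candidates]
    best_score, best_params = max(scored, key=lambda s: s[0])
    return best_params, best_score, len(candidates) * n_folds


# Hypothetical toy data: two Gaussian clusters of 30 points each.
rng = random.Random(1)
X = ([(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(30)]
     + [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(30)])
y = [0] * 30 + [1] * 30

grid = {"k": [1, 3, 5], "weighted": [False, True]}  # 3 x 2 = 6 combinations
best_params, best_score, n_evaluations = grid_search(X, y, grid, n_folds=10)
print(best_params, n_evaluations)  # n_evaluations is 6 combinations x 10 folds = 60
```

For gradient-boosted models each of these evaluations is a full training run, so the total cost scales with the grid size times the number of folds. The evaluations are independent of one another, which is what makes the search straightforward to distribute across computing nodes.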