**3. Experimental Setup**

This work aims to define an efficient and automatable method to identify and localize damage in concrete bridge images using DCNNs and Transfer Learning. This section presents the Transfer Learning schemes followed to train the pre-trained VGG16 model on the proposed dataset. In addition, the weakly supervised semantic segmentation framework based on the above-explained interpretation techniques is also discussed.

### *3.1. VGG16 Fine-Tuning and Training*

The VGG16 model can capture high-level features [21] and has the ability to generalize to other datasets [32]. Moreover, it has shown excellent performance in many studies on damage classification of concrete surfaces [31–33]. Therefore, it was chosen as the base model for the learning approach proposed in this paper.

Training this deep network from scratch requires enormous computational resources, a large amount of labeled data, and excessive training time. Thus, three Transfer Learning settings were explored in this work and compared in terms of standard classification metrics, training time, and generalization ability.

First, the pre-trained VGG16 with the ImageNet weights was loaded, and the last fully connected layers of the model were adjusted to the number of classes (i.e., four classes). Then, based on the assumption that the high-level layers of DCNNs are more sensitive to the target dataset, the last layers of the pre-trained model were retrained on the constructed dataset to update their learning parameters.

Gradient descent and back-propagation were applied following three different approaches in this work:

1. retraining only the classification layers, keeping all convolutional layers frozen;
2. retraining the classification layers and the last convolutional layers;
3. retraining the classification layers and the last two convolutional layers.

Figure 4 presents the three Transfer Learning-based training settings investigated in this paper.

The dataset presented in Section 2 was randomly split into three subsets: 70% of the images were used in the training set, 10% in the validation set, and 20% in the testing set. The number of images per subset and per class is shown in Table 1.
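The 70/10/20 split can be reproduced with `torch.utils.data.random_split`; a sketch on a small placeholder dataset (the proportions are from the text, while the dataset object and seed are hypothetical stand-ins for the dataset of Section 2):

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder for the bridge-damage dataset of Section 2 (100 dummy images).
dataset = TensorDataset(torch.randn(100, 3, 8, 8), torch.randint(0, 4, (100,)))

n = len(dataset)
n_train = int(0.7 * n)   # 70% for training
n_val = int(0.1 * n)     # 10% for validation
n_test = n - n_train - n_val  # remaining 20% for testing

train_set, val_set, test_set = random_split(
    dataset, [n_train, n_val, n_test],
    generator=torch.Generator().manual_seed(0),  # reproducible random split
)
```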

**Table 1.** Number of images per subset per class.


In addition, data augmentation techniques (i.e., random horizontal and vertical flips and random rotations) were applied to the training set to avoid overfitting.

**Figure 4.** Transfer Learning configurations: (**a**) retraining only the classification layers, (**b**) retraining the classification layers and the last convolutional layers, (**c**) retraining the classification layers and the last two convolutional layers.

The optimization method recommended in [32], based on Stochastic Gradient Descent (SGD) with momentum and a small learning rate, was used in this work's experiments. The SGD with momentum optimizer reduces the computational load and accelerates training convergence. The training was conducted for 25 epochs, at which point the results had converged; low training and validation errors were achieved while mitigating overfitting. The cross-entropy loss function was optimized using SGD with a learning rate of 0.001, a momentum of 0.9, and a mini-batch size of 32. All the experiments were carried out using PyTorch in Google Colaboratory (Colab) with the 12 GB NVIDIA Tesla K80 GPU provided by the platform.
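The optimization setup above translates to a short PyTorch configuration; a sketch on a tiny stand-in model (the experiments use the adapted VGG16, and one step is shown in place of the full 25-epoch loop):

```python
import torch
import torch.nn as nn

# Tiny stand-in model; the experiments train the adapted VGG16.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 4))

# Hyper-parameters from the text: cross-entropy loss, SGD with
# lr=0.001 and momentum=0.9, mini-batch size 32.
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

inputs = torch.randn(32, 3, 8, 8)     # one mini-batch of 32 images
targets = torch.randint(0, 4, (32,))  # labels for the 4 classes

# One training step (repeated over all mini-batches for 25 epochs).
optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step()
```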
