*3.2. Model Development*

Model development from scratch requires a large number of annotated images. For an object detection task where only a small number of training examples is available, a common solution is to fine-tune a CNN that has been pre-trained on related source data. In this work, we adopted transfer learning and used the pre-trained convolutional neural network Inception V2 described in the previous section.

The annotated images were converted to the TFRecord format used within TensorFlow, and the dataset was randomly split into 75% for training the model and 25% for testing. This split allows checking that the model does not simply overfit to the training data and can therefore be expected to perform well on unseen data.
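The random 75%/25% split described above can be sketched in plain Python as follows; the helper name and the fixed seed are illustrative assumptions, not part of the original pipeline:

```python
import random

def split_dataset(examples, train_frac=0.75, seed=42):
    """Randomly shuffle and split examples into train/test subsets.

    Mirrors the 75%/25% split used in the text. The seed is fixed only
    so the split is reproducible in this sketch (an assumption, not a
    detail from the original work).
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```

In practice the split is applied to the list of annotated image records before they are serialized to TFRecord files.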

The training and validation of the Faster R-CNN model was performed on a laptop computer with a GPU running the 64-bit Windows 10 operating system in a TensorFlow environment, and was monitored using TensorBoard. Figure 7 shows two examples of the monitored losses (classification loss and localization loss), with and without smoothing, during the training process. Training was manually stopped after 11,000 epochs, which took about 7 h. From the plots, it can be observed that, after an initial phase, the smoothed losses decreased at approximately constant rates.
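The smoothed loss curves mentioned above correspond to the exponential-moving-average smoothing that TensorBoard applies to scalar plots. A minimal sketch of that smoothing, assuming an illustrative smoothing weight of 0.6 (TensorBoard exposes this as a slider), is:

```python
def smooth(values, weight=0.6):
    """Exponential moving average, similar to TensorBoard's smoothing slider.

    weight close to 0 reproduces the raw curve; weight close to 1 gives
    heavy smoothing. The default 0.6 here is only an example value.
    """
    smoothed, last = [], values[0]
    for v in values:
        last = weight * last + (1.0 - weight) * v
        smoothed.append(last)
    return smoothed
```

Applying this to the raw per-step loss values produces the smoother trend lines seen alongside the noisy curves in Figure 7.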

**Figure 7.** Changes of classification loss and localization loss (with and without smoothing) during the training process (stopped after 11,000 epochs).

After training, the model was tested by detecting cracks in new images. For each detected crack, the model produces a bounding box on the image together with a confidence score indicating how certain the detection is according to the trained model. This value gives users an assessment of the quality of the crack detection. Figure 8 shows two tested images containing cracks. The crack in the left image was properly detected and localized as a single crack, whereas the crack in the right image was simultaneously detected as one crack with 99% confidence (inside the green bounding box) and improperly divided into two cracks: one inside the blue bounding box (96% confidence) and one inside the red bounding box (81% confidence).
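A common post-processing step for such detections is to keep only bounding boxes whose confidence exceeds a threshold before displaying them. The sketch below assumes each detection is a `(box, score)` pair and uses an illustrative 0.5 threshold; neither detail is taken from the original implementation:

```python
def filter_detections(detections, min_score=0.5):
    """Keep only bounding boxes whose confidence is at least min_score.

    detections: list of (box, score) pairs, where box is e.g. a tuple of
    corner coordinates and score is in [0, 1]. The 0.5 default is a
    common choice, used here only for illustration.
    """
    return [(box, score) for box, score in detections if score >= min_score]
```

With the scores from Figure 8, all three boxes (0.99, 0.96, 0.81) would pass a 0.5 threshold; raising the threshold above 0.96 would suppress the two boxes that improperly split the right-hand crack.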

**Figure 8.** Example of two tested images: the crack in the left image properly detected and localized as one crack (with 100% confidence); the crack in the right image simultaneously properly detected as one crack (with 99% confidence) and improperly detected as two cracks (with 96% and 81% confidence, respectively).

After the model was developed, it was deployed for automatic surface crack detection during mechanical tests in the laboratory.
