#### *3.2. Comparison between Regular Fully Connected Layers and Global Average Pooling*

Our method used a global average pooling layer instead of regular fully connected layers to improve the classification accuracy. To verify the advantage of global average pooling, we computed classification results with regular fully connected layers for comparison with the results based on global average pooling. In this experiment, two fully connected layers with dropout (an empirically chosen dropout probability of 50%) were compared against the global average pooling classifier. Fully connected layers are usually accompanied by dropout to promote convergence. Dropout is a regularization technique that prevents overfitting in neural network models [14,22]: neurons are randomly selected with a given probability to be dropped and ignored during training, so that the network learns multiple independent internal representations and improves its generalization ability.
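The dropout scheme described above can be sketched as follows. This is a minimal numpy illustration of inverted dropout at the 50% probability used in our experiment, not the actual training code; the function name and toy activations are ours.

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    """Inverted dropout: zero each activation with probability p during
    training and rescale survivors by 1/(1-p), so that the expected
    activation is unchanged and no rescaling is needed at inference."""
    if not training or p == 0.0:
        return x
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= p          # keep with probability 1 - p
    return x * mask / (1.0 - p)

acts = np.ones((4, 8))                       # toy activations
out = dropout(acts, p=0.5)                   # roughly half the entries become zero
```

Because the surviving activations are rescaled, the same forward pass can be reused at test time simply by setting `training=False`.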

The comparison was based on the same five-fold cross-validation dataset mentioned above. The highest classification accuracy of the regular fully connected layers was 97.3%, lower than the highest accuracy of 98.8% achieved by the global average pooling method. Moreover, the global average pooling method also showed better convergence behavior during model training. Figure 6 plots the training accuracy and validation accuracy curves of both global average pooling and conventional fully connected layers. The global average pooling method shows a steadier convergence process with less fluctuation. It also has a narrower gap between the training and validation curves, implying better generalization ability than the fully connected layers method.
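For clarity, the five-fold protocol can be sketched as below. This is an illustrative pure-Python split over the 638 original patches, with a hypothetical helper name; the actual fold assignment used in our experiments may differ.

```python
import random

def five_fold_indices(n_samples, k=5, seed=42):
    """Shuffle sample indices once, then yield (train, val) index lists
    for each of the k folds; every sample appears in exactly one
    validation fold."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]    # round-robin split
    for i in range(k):
        val = folds[i]
        train = [j for f in folds if f is not folds[i] for j in f]
        yield train, val

splits = list(five_fold_indices(638))        # 5 (train, val) pairs
```

Each model is then trained on the four training folds and evaluated on the held-out fold, and the reported accuracy is averaged over the five runs.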

**Figure 6.** The training accuracy and validation accuracy curves of the global average pooling method and the conventional fully connected layers method.

#### *3.3. Comparison between Augmented Dataset and Original Dataset*

Our method used a data augmentation strategy to overcome the limitation of the small and imbalanced training dataset. In this section, we compare the zebrafish egg classification performance with and without data augmentation. In addition to the model trained on the augmented dataset of 7864 image patches, another model was trained on the original dataset of 638 patches without augmentation. The accuracy improvement with the augmented dataset was significant: validation accuracy rose from 83.8% to 98.8% after data augmentation.
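A label-preserving augmentation step can be sketched as follows. This numpy example assumes simple flip and 90-degree rotation transforms (the eight elements of the dihedral group D4); the exact set of transforms used to build our augmented dataset may differ.

```python
import numpy as np

def augment(patch):
    """Generate 8 variants of a square image patch: 4 rotations of the
    original plus 4 rotations of its horizontal mirror. The class label
    is unchanged, so minority-class (unfertilized) patches can be
    expanded without new annotations."""
    variants = []
    for img in (patch, np.fliplr(patch)):
        for k in range(4):
            variants.append(np.rot90(img, k))
    return variants

patch = np.arange(9).reshape(3, 3)           # toy asymmetric patch
aug = augment(patch)                         # 8 distinct variants
```

Applying more transforms to the minority class than to the majority class is one simple way to rebalance the dataset at the same time.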

To further analyze the effect of balancing the imbalanced dataset by augmentation, we computed the sensitivity, specificity, precision, and accuracy of the methods with and without data augmentation (as shown in Table 2). The metrics were defined as Sensitivity = NTP/(NTP + NFN), Specificity = NTN/(NTN + NFP), Precision = NTP/(NTP + NFP), and Accuracy = (NTP + NTN)/(NTP + NTN + NFP + NFN), where N stands for the number of samples and TP, FP, TN, FN denote true positive, false positive, true negative, and false negative, respectively. From Table 2, it can be observed that data augmentation led to evident improvements in both sensitivity and accuracy, while the specificity and precision of the two methods were at the same level. We also compared the convergence behavior of model training with and without data augmentation. As shown in Figure 7, data augmentation led to faster convergence and a smaller gap between the training accuracy curve and validation accuracy curve, implying better stability and generalization ability.
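The metric definitions above translate directly into code. The confusion-matrix counts below are purely illustrative (chosen so the sensitivity matches the 68.0% reported for the model trained without augmentation), not the actual counts behind Table 2.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the four metrics from confusion-matrix counts, with the
    unfertilized class taken as positive."""
    return {
        "sensitivity": tp / (tp + fn),               # recall of positives
        "specificity": tn / (tn + fp),               # recall of negatives
        "precision":   tp / (tp + fp),
        "accuracy":    (tp + tn) / (tp + fp + tn + fn),
    }

m = classification_metrics(tp=17, fp=2, tn=95, fn=8)
print(m["sensitivity"])  # 0.68
```

With few unfertilized (positive) samples, overall accuracy can stay high while sensitivity collapses, which is why all four metrics are reported rather than accuracy alone.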

**Table 2.** Comparison of the classification performance between the methods with and without data augmentation.


**Figure 7.** The training accuracy curve and validation accuracy curve of model training with and without data augmentation.

#### *3.4. Comparison with Other Zebrafish Embryo Microscopic Image Analysis Studies*

As we surveyed the existing studies, there was rarely any research on classifying zebrafish egg fertilization status from microscopic images. The study most similar to ours is that of Liu et al. [4], who used a support vector machine (SVM) to classify zebrafish embryo hatching status based on hand-crafted image features. It is hard to rigorously compare our method with Liu's method since the application purposes differ. As a rough comparison, their method achieved an average recognition accuracy of 97.4 ± 61.0%, while our method had an average accuracy of 95.0 ± 2.2%. Although the two methods have similar mean accuracy, the standard deviation of our method (2.2%) is much smaller than theirs (61.0%), meaning that our method is considerably more stable. Moreover, our method does not require any hand-crafted features, so the cost of algorithm design and the degree of subjective interference are much lower. It is evident that our deep learning approach has better stability and objectivity than traditional machine learning methods based on hand-crafted features.

#### **4. Discussion**

In this study, exploratory research was conducted on CNN-based zebrafish egg phenotype classification from microscopic images. Due to the particularity of zebrafish egg research, we faced the problems of a small, imbalanced dataset and subtle inter-class differences. To tackle these problems, the strategies of transfer learning, data augmentation, and global average pooling were used.

It is known that training a deep network from scratch with random initialization is a formidable task. It requires millions of well-annotated training images, which are difficult to obtain in our study. Transfer learning is a technique that reuses the deep features an existing model has learned from large-scale natural image datasets, either as an initialization or as a fixed feature extractor for the task of interest. In several studies, transfer learning has been applied to medical image analysis and achieved dramatic performance improvements for classification tasks on small datasets [24,25]. In this study, we used a VGG-16 model previously trained on millions of natural scene images [15]. Compared to medical images such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI), bright-field microscopic images share more common image features with natural scene images; therefore, we directly used the original VGG-16 model without modifying its network architecture. As shown in our experimental results (Table 1), a mean accuracy of 95.0% was obtained with five-fold cross-validation, demonstrating the effectiveness of transfer learning.

To further address the small imbalanced dataset problem, a data augmentation strategy was used in this study. The effect of data augmentation was evident. As shown in Table 2, the sensitivity and accuracy improved dramatically after data augmentation. The model trained without data augmentation yielded quite low sensitivity (68.0%), implying that this model tended to make negative judgments (fertilized). This is because the training data without augmentation contained far fewer unfertilized eggs than fertilized eggs, making the model inadequate at recognizing unfertilized eggs. Therefore, dedicated data augmentation is crucial for training a network that recognizes both types of eggs.

To cope with the subtle inter-class differences, global average pooling was used instead of conventional fully connected layers. The global average pooling classifier enforces correspondences between feature maps and categories without introducing extra weights to be optimized, and thus reduces fluctuation during the training process and promotes fast and steady convergence. As reflected in the experimental results (Figure 6), global average pooling not only yielded improved average classification accuracy but also led to faster and more stable convergence of the training and validation curves. Such an advantage is crucial for biological microscopic image classification, since genetic or biological changes usually result in quite subtle phenotype differences.
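The pooling operation itself is simple enough to state directly. The numpy sketch below collapses a stack of feature maps to one score per channel; the two-channel toy input (one map per egg class) is ours for illustration and does not reflect the actual feature-map count of the trained network.

```python
import numpy as np

def global_average_pool(feature_maps):
    """Collapse an (H, W, C) stack of feature maps to a length-C vector
    by averaging each map over its spatial dimensions, so each channel
    maps to one class score with no extra trainable weights."""
    return feature_maps.mean(axis=(0, 1))

fmaps = np.zeros((7, 7, 2))          # two toy 7x7 maps, one per egg class
fmaps[:, :, 1] = 1.0                 # second map strongly activated
scores = global_average_pool(fmaps)  # -> array([0., 1.])
```

Because the pooled scores depend on every spatial location equally, the operation also makes the classifier less sensitive to the exact position of the egg inside the patch.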

As a limitation of this study, the proposed method still used a conventional template matching scheme to locate each egg in the well-plate image. There are several state-of-the-art neural networks for fast object detection, such as Faster-RCNN and YOLO [26]. However, when we tested these models, they performed well at locating the eggs but failed to accurately distinguish between fertilized and unfertilized eggs. Therefore, we chose a conventional CNN structure equipped with global average pooling to handle the subtle inter-class differences. In future work, we will focus on combining object detection networks with our network architecture so that the whole workflow (including detection and classification) can be performed with a single network.

#### **5. Conclusions**

This study applied deep learning techniques to classify fertilized and unfertilized zebrafish eggs from bright-field microscopic images. Transfer learning and data augmentation schemes were used to overcome the problem of a small imbalanced training dataset. Global average pooling was adopted to improve classification accuracy in the presence of subtle inter-class differences. Our future research will focus on applying this method in the daily zebrafish egg acquisition workflow, so that the proposed algorithm can promote the research outcomes of high-throughput biological experiments.

**Author Contributions:** Conceptualization, F.C. and S.L.; methodology, S.S.; software, S.S.; validation, S.S.; formal analysis, S.S.; investigation, S.L. and L.L.; resources, S.L. and L.L.; data curation, L.L.; writing—original draft preparation, S.S.; writing—review and editing, F.C.; visualization, S.S.; supervision, F.C.; project administration, F.C. and S.L.; funding acquisition, F.C. and S.L.

**Funding:** This research was supported by the National Natural Science Foundation of China, No. 21607115; the general program of the National Natural Science Foundation of China, No. 21777116; and the Xinghai Scholar Cultivating Funding of Dalian University of Technology, No. DUT14RC(3)066.

**Acknowledgments:** The authors would like to thank Hongkai Wang for his advice on deep neural network training.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
