**1. Introduction**

Insecticide-treated nets (ITNs) are a common vector control tool and have considerably decreased the burden inflicted by malaria [1]. However, in recent years, species of the mosquito genus *Anopheles*, the principal vector for malaria, have demonstrated an increased resistance to the pyrethroid-based insecticides used to treat ITNs. This increase in resistance to pyrethroids threatens the efficacy of ITNs and may have contributed to an increase in malaria cases in affected areas [2]. Consequently, alternative effective insecticides for use on ITNs need to be identified to maintain the efficacy of this intervention and meet the gap in global disease control that pyrethroid resistance has created [3,4]. ITNs treated with a mixture of pyriproxyfen (PPF) and pyrethroids offer an alternative to standard pyrethroid-treated ITNs in areas where pyrethroid-resistant malaria vectors are prevalent [5–8]. The mode of action of PPF affects the fertility, longevity, and lifetime fecundity of malaria vectors [9,10], and PPF-treated ITNs have been shown to sterilise *Anopheles* mosquitos under both laboratory and field conditions [11,12]. As vector ovary development is inhibited by exposure to PPF [8], and females that fail to develop morphologically normal eggs have been shown to not oviposit [13,14], a means of measuring efficacy and monitoring the durability of PPF and PPF-treated tools is through the assessment of eggs for signs of abnormal or inhibited development [8,12]. Although different means of scoring sterility exist (e.g., by looking for the ability to prevent egg laying or oviposition inhibition), another method to determine fertility status is based on trained experts manually dissecting ovaries and classifying egg development according to Christopher's stages [15]. However, this can be a time-consuming process and requires a level of expertise not always available. Therefore, to increase the throughput and robustness of data used to measure the efficacy and durability of PPF-based ITNs, and to aid efficient and reproducible data collection in research settings, freely available alternative methods for the accurate, quick, and automatic classification of ovary development are required.

**Citation:** Fowler, M.T.; Lees, R.S.; Fagbohoun, J.; Matowo, N.S.; Ngufor, C.; Protopopoff, N.; Spiers, A. The Automatic Classification of Pyriproxyfen-Affected Mosquito Ovaries. *Insects* **2021**, *12*, 1134. https://doi.org/10.3390/insects12121134

Academic Editor: Geoffrey M. Attardo

Received: 9 November 2021 Accepted: 14 December 2021 Published: 17 December 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In recent years, deep learning models and convolutional neural networks (CNNs) have made significant progress across a range of computer vision problems, including image classification [16]. A CNN implements a convolution operation across several distinct layers to convert an input (i.e., an image) into an output (i.e., a classification). The convolution operation applies a filter or kernel (usually a 3 × 3 or 5 × 5 matrix) to a two-dimensional representation of an image. This matrix slides over the full 2D grid, performing calculations on the data according to the kernel's weights and transforming the data into a representation of patterns found within the image (e.g., edges) [17]. During training, a CNN uses forward and backward propagation through the network to automatically adjust and determine the most appropriate kernel weights [18,19]. These weights can then identify different pattern types found within a dataset, with layers earlier in the network identifying primitive features in an image, such as edges and colours, while deeper layers detect more complex shapes, patterns, or objects [20,21].
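The sliding-kernel operation described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than code from the study: the function name `conv2d_valid`, the toy 5 × 6 image, and the hand-picked Sobel-style kernel are all assumptions chosen for clarity. As in most deep learning libraries, the kernel is applied without flipping (strictly, a cross-correlation).

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2D `image` (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the image patch currently under the kernel.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Toy greyscale image (assumption): dark left half (0), bright right half (1).
image = np.zeros((5, 6))
image[:, 3:] = 1.0

# Hand-picked kernel that responds to vertical edges (Sobel-style).
# In a CNN these weights would be learned rather than fixed.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])

feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The resulting 3 × 4 feature map is non-zero only in the columns where the kernel straddles the dark-to-bright boundary, which is the sense in which a kernel "detects" an edge.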

This type of architecture enables the automatic training and detection of multiple visual features, which can then be used to identify and classify variance between images. However, the area of application for deep learning and CNNs has been constrained by its reliance on large datasets to avoid overfitting (i.e., to ensure generalisability) and, thus, achieve high accuracy rates [22]. Nevertheless, the size of a dataset can be increased through data augmentation, which applies a range of techniques to artificially enlarge the available data space and allow generalisable models to be built. Data augmentation includes geometric transformation, colour augmentation, and random cropping of the available data (amongst other techniques), thereby creating randomised novel images from those that are already available [23]. However, even with data augmentation, most datasets are still insufficient to avoid overfitting. In such cases, transfer learning can be used, whereby open-source architectures and pretrained weights, derived using very large datasets, are repurposed and fine-tuned for a different but related task [24]. Models trained against the ImageNet dataset (which contains over 14 million images and 20,000 classes) are freely available and regularly achieve high levels of accuracy [25]. Three common and high-performing models used in transfer learning, all pretrained and tested against the ImageNet dataset, are (1) VGG-16 [26], (2) ResNet-50 [27], and (3) InceptionV3 [28,29].

Machine learning has already been successfully utilised within entomology for a number of species classification tasks, such as the identification of pest insect species [30], the recognition of lepidopteran species [31], and the classification of mosquito species [32–35]. Additionally, automatic tools have been developed to count the eggs laid by female mosquitos, which can be used to estimate fecundity [36–38]. However, current work on the automatic classification of mosquito fertility and egg development is limited. As such, this study is aimed at bridging this gap and uses deep learning, data augmentation, and transfer learning to develop a quick, robust, and practical method to classify the fertility status (i.e., 'fertile' or 'infertile') of mosquito ovaries from colour images. To be successful, this new method must (1) be automatic and require no, or limited, expert knowledge to categorise an image, (2) achieve close to the human accuracy rate of 99–100% (rate determined by the agreement between two scorers assessing the dataset used in this study), (3) be in an easily distributable, non-proprietary, and low-cost format, and (4) classify ovary fertility of an image faster than the estimated 2 s taken by human experts (rate determined by the mean time taken for four trained technicians to classify 30 random ovary images).

Using a novel dataset of dissected ovary images, data augmentation, and transfer learning, we were able to build and train a CNN in TensorFlow that can detect and classify the development status ('fertile' or 'infertile') of 157 ovaries in 38.5 s at a 94% accuracy rate. As such, this study proposes a new method for the automatic classification of the fertility status of *Anopheles* mosquito ovaries that is quick, accurate, and easily distributable, and that is not dependent on trained experts to score egg development.
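To relate the reported throughput to the 2 s human benchmark, the per-image time can be worked out directly from the figures quoted above. The short calculation below is only an arithmetic check: the variable names are illustrative, and it assumes the 38.5 s covers classification of all 157 images (this section does not state the hardware used or whether model loading is included).

```python
# Figures quoted in the text.
n_images = 157            # ovaries classified by the model
model_total_s = 38.5      # total model time for those images (seconds)
human_s_per_image = 2.0   # estimated human time per image (seconds)

model_s_per_image = model_total_s / n_images
speedup = human_s_per_image / model_s_per_image

print(f"Model: {model_s_per_image:.3f} s per image")
print(f"Speed-up over human scoring: {speedup:.1f}x")
```

On these numbers the model takes roughly a quarter of a second per image, about eight times faster than the estimated human rate.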

#### **2. Materials and Methods**

#### *2.1. Image Dataset*

As no publicly available datasets exist, data from ongoing research were used for this study. A total of 524 images of dissected ovaries from 5–8 day old female mosquitos were collected and labelled with the appropriate fertility status (based on Christopher's stage of egg development). These were all full colour images obtained from three sources where fertility status was determined and corroborated by two trained expert scorers. A summary of the datasets used here is found in Table 1.


**Table 1.** Image dataset summary.

The first dataset contained a total of 124 blood-fed adult pyrethroid-resistant female *An. gambiae* s.l. mosquitoes which had survived exposure to either a control untreated net or a PPF-treated net (Royal Guard) in experimental hut studies performed in accordance with current WHO guidelines [39]. Mosquitoes were collected as wild free-flying adults in experimental huts in Cove, Southern Benin, with 36.3% (*n* = 45) classified as being infertile and the remaining 63.7% (*n* = 79) classified as being fertile. The second dataset contained 187 blood-fed adult pyrethroid-resistant female *An. gambiae* Akron mosquitoes from insectary-maintained colonies. All samples had survived exposure to either a control untreated net or a PPF-treated net (Royal Guard) in WHO cone bioassays [39]. Of the total samples in the second dataset, 77.0% (*n* = 144) were classified as being infertile, and 23.0% (*n* = 43) were classified as being fertile. All samples in the first and second dataset were, after exposure, held in plastic holding cups and provided 10% glucose for 72 h to allow enough time to become gravid. Prior to dissection, mosquitoes were killed by placing them in a freezer at −20 °C for 5–10 min and then dissected on a dissecting slide by separating the abdomen from the head and thorax to expose the ovaries using dissecting needles. After dissection, eggs and ovaries of each mosquito were observed and photographed using a microscope equipped with a digital camera at 4× or 10× magnification. Developmental status of the eggs in each mosquito's ovaries was classified and validated by two scorers according to Christopher's stage of egg development [15]. Mosquitoes were classified as 'fertile' if eggs had fully developed to Christopher stage V and 'infertile' if eggs had not fully developed and remained in stages I–IV (see Figure 1).

**Figure 1.** Christopher stages of egg development. Mosquitos whose eggs have fully developed to stage V (normal elongated, boat/sausage-shaped eggs with lateral floats) are classified as 'fecund' or 'fertile'. If eggs have not fully developed and remain in stages I–IV (less elongated, round shape, lacking floats), the mosquito is classified as 'non-fecund' or 'infertile'.

The third dataset contained 125 free-flying freshly blood-fed pyrethroid-resistant female *An. funestus* s.l. mosquitoes collected from the walls and roofs of houses in Mwanza, Northwest Tanzania. Of these mosquitos, 46.4% (*n* = 58) were classed as being infertile and 53.6% (*n* = 67) were classified as being fertile. The fourth dataset also contained free-flying freshly blood-fed pyrethroid-resistant female mosquitoes collected from the walls and roofs of houses in Mwanza, Northwest Tanzania; however, these were *An. gambiae* s.l., with 56.8% (*n* = 50) classed as infertile and 43.2% (*n* = 38) classed as fertile. Following the CDC bottle bioassay guidelines [40], all samples from datasets 3 and 4 were, immediately after collection, exposed for 60 min to glass bottles treated with 1× the diagnostic dose of 100 μg/mL of PPF solution or to control bottles treated with acetone, and were then left for 72 h post exposure to allow time to become gravid. Dissection was then carried out under a stereoscopic dissecting microscope (Nikon model C-PSN) at 5× magnification to assess ovary development. The status of ovaries and eggs was again categorised by two scorers as either 'fertile' or 'infertile' according to Christopher's stage of egg development, with those in Christopher stage V determined to be 'fertile' and those in stages I–IV classed as 'infertile' [13]. After dissection, one image per mosquito was captured with a Motic camera microscope and saved to a tablet PC.
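The per-dataset counts reported in this section can be tallied to confirm that they add up to the 524 images stated at the start of Section 2.1. The snippet below is a simple consistency check using only the numbers given in the text; the dictionary labels are shorthand introduced here, not names used by the authors.

```python
# (infertile, fertile) counts per dataset, as reported in the text.
datasets = {
    "1: An. gambiae s.l., Cove (hut trial)": (45, 79),
    "2: An. gambiae Akron (insectary)": (144, 43),
    "3: An. funestus s.l., Mwanza": (58, 67),
    "4: An. gambiae s.l., Mwanza": (50, 38),
}

total_infertile = sum(inf for inf, _ in datasets.values())
total_fertile = sum(fer for _, fer in datasets.values())
total = total_infertile + total_fertile

for name, (inf, fer) in datasets.items():
    n = inf + fer
    print(f"Dataset {name}: n={n}, infertile={inf / n:.1%}, fertile={fer / n:.1%}")

print(f"All: n={total}, infertile={total_infertile}, fertile={total_fertile}")
```

The four datasets sum to 524 images, of which 297 are labelled infertile and 227 fertile.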

#### *2.2. Pre-Processing and Train/Test Split*

After data were loaded into Python, all images were rescaled to 224 × 224 pixels to ensure consistency and improve processing times (Figure 2A). Before analysis, images were randomly allocated to a training set and a test set using a 70% (*n* = 367) and 30% (*n* = 157) split, respectively. The training set is used to fit the model, i.e., to teach it to assign the correct class to an image. The set used here to train the model consisted of a total of 367 images, 151 (41.1%) classed as fertile and 216 (58.9%) classed as infertile. The test set is held back from training and is used to measure the accuracy of the model. Here, 157 total images were allocated to testing accuracy, with 76 (48.4%) classed as fertile and 81 (51.6%) classed as infertile.
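A random 70%/30% allocation of this kind can be sketched with the Python standard library alone. This is an illustrative stand-in, not the authors' code: the function name `train_test_split_indices`, the fixed seed, and the rounding rule are assumptions, and the text does not state which library was used or whether the split was stratified by class (this sketch is not).

```python
import random


def train_test_split_indices(n_items, train_fraction=0.7, seed=0):
    """Return shuffled (train, test) index lists for `n_items` samples."""
    indices = list(range(n_items))
    # A local RNG keeps the split reproducible without touching global state.
    random.Random(seed).shuffle(indices)
    n_train = round(n_items * train_fraction)
    return indices[:n_train], indices[n_train:]


# 524 images, as reported in Section 2.1.
train_idx, test_idx = train_test_split_indices(524)
print(len(train_idx), len(test_idx))
```

With 524 images and a 0.7 training fraction, rounding gives the 367/157 split reported above. The class proportions within each set will vary with the seed.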

**Figure 2.** Summary of analysis workflow. (**A**) Data are pre-processed as described in Section 2.2. Images and labels are loaded before the images are resized and undergo a random 70%/30% split into training and test sets. (**B**) The training set undergoes data augmentation as described in Section 2.3. Each original image produces 18 variations based on a random rotation around 360°, a random horizontal flip, a random vertical flip, and a random brightness shift between 60% and 140%. Each variation retains the same classification as its original. (**C**) The training set is fitted to a range of CNNs, and classifiers are built and tested as described in Section 2.4.
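The augmentation step summarised in panel (B) can be approximated with plain NumPy as follows. This is a hedged sketch under several assumptions and is not the authors' pipeline: the helper names `rotate_nearest` and `augment`, the nearest-neighbour interpolation, the black fill for pixels rotated in from outside the frame, the 50% flip probability, and treating the brightness shift as a multiplicative factor are all illustrative choices that the text does not specify.

```python
import numpy as np


def rotate_nearest(img, angle_deg):
    """Rotate an H x W x C image about its centre (nearest neighbour, zero fill)."""
    h, w = img.shape[:2]
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, locate its source pixel.
    src_x = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    src_y = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    src_x = np.rint(src_x).astype(int)
    src_y = np.rint(src_y).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(img)
    out[valid] = img[src_y[valid], src_x[valid]]
    return out


def augment(img, rng, n_variants=18):
    """Create randomised variants of one uint8 image; the label is unchanged."""
    variants = []
    for _ in range(n_variants):
        out = rotate_nearest(img, rng.uniform(0.0, 360.0))
        if rng.random() < 0.5:
            out = out[:, ::-1]          # horizontal flip
        if rng.random() < 0.5:
            out = out[::-1, :]          # vertical flip
        factor = rng.uniform(0.6, 1.4)  # brightness between 60% and 140%
        out = np.clip(out.astype(np.float32) * factor, 0, 255).astype(np.uint8)
        variants.append(out)
    return variants


rng = np.random.default_rng(0)
# Random noise stands in for a real 224 x 224 colour ovary image.
image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
variants = augment(image, rng)
print(len(variants), variants[0].shape)
```

Each call yields 18 variants with the same shape as the input, so every variant can simply inherit the fertility label of the image it was derived from.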
