1. Introduction
The marine ecosystem is one of the largest and richest ecosystems on Earth. It extends from the coastline to the deep sea, covers about 70% of the Earth’s surface, and supports rich biodiversity and complex ecological communities. Marine plankton are the primary and secondary producers in the ocean, with a complex composition and vast numbers, and they are crucial to maintaining the balance of marine material circulation, energy flow, and the marine ecological environment [1, 2, 3]. In marine ecological environment monitoring, the abundance, species, and distribution of plankton are important monitoring indicators. It is particularly important to collect and analyze plankton data in real time to gain a deep understanding of the relationship between plankton abundance and distribution and changes in the surrounding environment [4, 5]. The comprehensive analysis of such data can provide key evidence to help predict potential marine and climate change effects, formulate sustainable resource management strategies, and promote the adoption of necessary environmental protection and restoration measures [6, 7, 8].
Therefore, the study of plankton has received widespread attention. The identification of plankton is a prerequisite for studying and monitoring changes in plankton ecology, and has both scientific and economic value [9]. In the early days, in order to understand how the distribution of plankton changes in time and space, plankton samples were usually collected by trawling and water sampling, and then classified and counted manually [10]. This process is time-consuming, labor-intensive, and costly, which hindered the monitoring and research of plankton at larger temporal and spatial scales. In recent years, with the development of imaging technology, underwater imaging systems have become able to capture images of individual plankton continuously and automatically. Examples include the Video Plankton Recorder (VPR), Shadow Image Particle Profile Evaluation Recorder, Zooplankton Visualization System, Scripps Plankton Camera, and In Situ Fish Plankton Imaging System [11, 12, 13, 14, 15]. These imaging systems usually generate a large amount of image data that taxonomists must label manually, which makes species identification slow and the workload huge. At the same time, because plankton species are so varied, researchers may lack the necessary knowledge and experience, which can lead to misidentification and human error [16, 17].
Over the last decade, numerous studies on automatic plankton image recognition have emerged worldwide [18, 19, 20, 21]. Before the emergence of deep learning, plankton image recognition mainly relied on traditional machine-learning methods. Tang et al. [22] proposed a pattern recognition system for the large numbers of plankton images acquired by a towed VPR, classifying them with an improved learning vector quantization network classifier and realizing fully automatic plankton recognition for the first time. Hu and Davis [23] used the co-occurrence matrix (COM) as a feature and a support vector machine (SVM) as a classifier; the combination of COM and SVM reduced the classification error rate in automatic plankton image recognition. Zheng et al. [24] proposed an automatic plankton image classification system based on multiple kernel learning (MKL) combined with multi-view features. By combining different types of features and feeding them to multiple classifiers to make better use of the feature information, a higher classification accuracy was achieved. These machine-learning-based recognition methods extract target features and process them using classifiers. The design of these handcrafted features depends on past experience and does not exploit large-scale data. In addition, handcrafted features have few parameters, which limits the recognition accuracy for plankton. More recently, driven by rapid advances in computing technology and hardware integration, deep learning has emerged quickly and achieved significant breakthroughs in various fields [25, 26]. Li and Cui [27] proposed a method for classifying plankton images using a deep residual network, which improved the accuracy and real-time performance of plankton recognition. To overcome the problem of class imbalance, Lee et al. [28] proposed a fine-grained classification method for a large plankton database using a convolutional neural network (CNN) based on transfer learning. Nandini et al. [29] used CNN models with VGG16 and ResNet as the backbone networks and deployed them on the Heroku platform to classify different plankton species and record newly added classes, so that new data can be fed back into the model to help identify new plankton species; they achieved an accuracy of 83.7%. Most existing zooplankton image recognition methods only use imagers to capture images underwater and then export the images, or transmit them through cables or communication systems, to onshore computers or servers for identification and analysis, which cannot meet the needs of real-time observation. To realize the real-time in situ monitoring of plankton distribution, lightweight algorithms have gradually gained attention and found applications. For example, Guo et al. [30] combined deep-learning methods with digital embedded holography and applied the lightweight network ShuffleNet to classify marine plankton holograms. However, the acquisition and reconstruction of holographic images is complicated and places high demands on light sources, recording media, and reconstruction algorithms. Microscopic imaging, by contrast, can directly observe the microstructure of plankton without complex recording and reconstruction processes and can quickly acquire a large amount of high-quality plankton image data. This immediacy is crucial for studying the ecological behaviors, distribution patterns, and population dynamics of plankton [31, 32, 33].
In view of this, this study makes improvements in two respects: reducing the number of model parameters and the amount of computation, and improving the accuracy of plankton recognition in real scenarios. A lightweight recognition method for plankton microscopic images based on MobileNetV2 is proposed. First, the layer structure of the feature extraction network is redesigned to balance the depth and width of the network, which reduces the number of parameters and the amount of computation and thereby accelerates both training and inference. Second, the coordinate attention (CA) mechanism is introduced to dynamically adjust the feature weights between different channels and to promote the interaction of global information within the image, which enables the model to focus on the key feature regions of plankton and improves the discrimination and extraction accuracy of the features. Finally, in line with the target recognition task, the tail classifier of the network is improved to use the network parameters more efficiently, so that plankton can be identified accurately in situ while the model remains lightweight. Combined with the hydrological information and the spatial and temporal position information observed by other sensors, the abundance of each category of plankton can then be obtained. This study not only demonstrates, for the first time, the potential of the high-precision classification of plankton micrographs in real-time environments, but also provides a new technological path for the long-term in situ monitoring of plankton. With the in situ recognition algorithm of this study integrated, the imaging instrument will be able to monitor plankton automatically and continuously, which is of great significance for marine ecological environment monitoring and ocean and climate prediction.
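The parameter savings that motivate a MobileNet-style backbone can be illustrated with a quick count. The sketch below compares a standard 3 x 3 convolution with a depthwise separable one (a depthwise convolution followed by a 1 x 1 pointwise convolution), the kind of building block the MobileNet family relies on. The channel sizes are illustrative assumptions, biases are ignored, and the function names are hypothetical; this is not the layer configuration of the model proposed here.

```python
def standard_conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    c_in, c_out = 32, 64  # illustrative channel sizes, not taken from the paper
    std = standard_conv_params(c_in, c_out)
    sep = depthwise_separable_params(c_in, c_out)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

For these sizes the separable block needs 2336 weights against 18432 for the standard convolution. The ratio equals 1/c_out + 1/k^2, which is why the saving grows with the number of output channels.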
4. Discussion
To address the large parameter counts, insufficient real-time performance, and limited accuracy of current convolutional neural networks in plankton identification, this study proposes a lightweight plankton image recognition algorithm based on an improved MobileNetV2. The algorithm takes 12 types of plankton images as the research object, and a plankton dataset containing 6674 images was constructed through manual screening combined with data augmentation. Based on the MobileNetV2 model, three key improvements were made: reconstruction of the feature extraction backbone network, introduction of the coordinate attention module, and improvement of the network classifier. The effectiveness of these improvements was verified through ablation experiments. The experimental results show that, compared with mainstream models such as MobileNetV2, EfficientNet-B0, and ShuffleNet0.5, the proposed model achieves better recognition performance on the plankton test set, with an accuracy of 95.46% and an F1 score of 94.48%. The proposed model also has an advantage in model size. The inference time for a single plankton image is only 6.15 ms, which makes the model suitable for deployment on edge computing devices. Taking into account the number of model parameters, recognition accuracy, and speed, the proposed method offers the best overall performance among the compared models and realizes the accurate, fast, and efficient recognition of plankton. In addition, by further combining the hydrological information observed by the depth sensor, the abundance of each type of plankton can be obtained.
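For reference, both accuracy and the F1 score can be derived from a confusion matrix. The sketch below is a minimal illustration that assumes macro-averaged F1 (the text does not state which averaging was used) and uses a made-up three-class matrix rather than the results reported here; the function names are hypothetical.

```python
from typing import List


def accuracy(cm: List[List[int]]) -> float:
    """Fraction of samples on the diagonal; cm[i][j] counts true class i predicted as j."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def macro_f1(cm: List[List[int]]) -> float:
    """Unweighted mean of per-class F1 (macro averaging is an assumption here)."""
    n = len(cm)
    scores = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted as c, true class differs
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


if __name__ == "__main__":
    # Toy confusion matrix for three classes (illustrative values only).
    cm = [
        [8, 1, 1],
        [2, 7, 1],
        [0, 1, 9],
    ]
    print(f"accuracy = {accuracy(cm):.4f}")
    print(f"macro F1 = {macro_f1(cm):.4f}")
```

Macro averaging weights every class equally, so a rare class that is recognized poorly lowers the score as much as a common one. That property is relevant when class sizes are imbalanced, which is common in plankton datasets.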
Although the proposed model achieves a significant performance improvement in the plankton identification task, there is still room for improvement. First, the dataset covers only 12 types of plankton; it will be expanded in the future to include a wider and more diverse array of species. Second, modern augmentation techniques such as synthetic data generation and generative adversarial networks will be integrated. Additionally, more images from different angles will be added as training data to improve the model’s robustness and generalization ability. Finally, since ocean observation equipment often faces strict power and computing resource constraints, the model structure and parameters can be further optimized to improve the operational efficiency and stability of the algorithm. Overall, this study optimizes the model structure and reduces computational complexity, yielding a lightweight network with improved recognition efficiency, and it enables real-time, high-precision plankton identification that can be combined with hydrological information. Deploying the proposed model on the embedded platform of the in situ imager will help to realize the long-term, fixed-point, real-time stereoscopic observation of marine plankton, providing strong support for ecological environment monitoring and marine scientific research.