Construction and Comparison of Machine-Learning Forecast Models of Albacore Thunnus alalunga Fishing Grounds in the South Pacific Ocean
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Sources
2.2. Data Processing
2.3. Feature Engineering
2.4. Modeling Methods
2.5. Evaluating Indicators
2.6. Statistical Analysis
3. Results
3.1. Distribution of Albacore Fishing Ground in the South Pacific
3.2. Correlation between Environmental Variables and CPUE of Albacore
3.3. Comparative Analysis of Binary Classification Models
3.4. Comparative Analysis of Multiple Classification Models
3.5. Performance of Fishing Ground Forecast Models
4. Discussion
4.1. Model Comparison and Analysis
4.2. Impact of Environmental Factors on Model Prediction
4.3. Impact of Different Partitioning Methods on the Model
4.4. Outlook
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Nikolic, N.; Morandeau, G.; Hoarau, L.; West, W.; Arrizabalaga, H.; Hoyle, S.; Fonteneau, A. Review of albacore tuna, Thunnus alalunga, biology, fisheries and management. Rev. Fish. Bio. Fish. 2017, 27, 775–810.
- Watanabe, H.; Kubodera, T.; Masuda, S.; Kawahara, S. Feeding habits of albacore Thunnus alalunga in the transition region of the central North Pacific. Fish. Sci. 2004, 70, 573–579.
- Dragon, A.C.; Senina, I.; Titaud, O.; Calmettes, B.; Conchon, A.; Arrizabalaga, H.; Lehodey, P. An ecosystem-driven model for spatial dynamics and stock assessment of North Atlantic albacore. Can. J. Fish. Aquat. Sci. 2015, 72, 864–878.
- Lehodey, P.; Senina, I.; Nicol, S.; Hampton, J. Modelling the impact of climate change on South Pacific albacore tuna. Deep-Sea Res. Pt. II 2015, 113, 246–259.
- Vaihola, S.; Yemane, D.; Kininmonth, S. Spatiotemporal Patterns in the Distribution of Albacore, Bigeye, Skipjack, and Yellowfin Tuna Species within the Exclusive Economic Zones of Tonga for the Years 2002 to 2018. Diversity 2023, 15, 1091.
- Williams, A.J.; Allain, V.; Nicol, S.J.; Evans, K.J.; Hoyle, S.D.; Dupoux, C.; Vourey, E.; Dubosc, J. Vertical behavior and diet of albacore tuna (Thunnus alalunga) vary with latitude in the South Pacific Ocean. Deep-Sea Res. Pt. II 2015, 113, 154–169.
- Reglero, P.; Santos, M.; Balbín, R.; Laíz-Carrión, R.; Alvarez-Berastegui, D.; Ciannelli, L.; Jiménez, E.; Alemany, F. Environmental and biological characteristics of Atlantic bluefin tuna and albacore spawning habitats based on their egg distributions. Deep-Sea Res. Pt. II 2017, 140, 105–116.
- Zainuddin, M.; Saitoh, K.; Saitoh, S.I. Albacore (Thunnus alalunga) fishing ground in relation to oceanographic conditions in the western North Pacific Ocean using remotely sensed satellite data. Fish. Oceanogr. 2008, 17, 61–73.
- Mondal, S.; Lee, M.A. Habitat modeling of mature albacore (Thunnus alalunga) tuna in the Indian Ocean. Front. Mar. Sci. 2023, 10, 1258535.
- Tussadiah, A.; Pranowo, W.S.; Syamsuddin, M.L.; Riyantini, I.; Nugraha, B.; Novianto, D. Characteristic of eddies kinetic energy associated with yellowfin tuna in southern Java Indian Ocean. IOP Conf. Ser. Earth Environ. Sci. 2018, 176, 012004.
- Singh, A.A.; Sakuramoto, K.; Suzuki, N. Impact of climatic factors on albacore tuna Thunnus alalunga in the South Pacific Ocean. Amer. J. Clim. Chan. 2015, 4, 295.
- Lindegren, M.; Checkley, D.M., Jr.; Koslow, J.A.; Goericke, R.; Ohman, M.D. Climate-mediated changes in marine ecosystem regulation during El Niño. Glob. Change Biol. 2018, 24, 796–809.
- Gao, F.; Chen, X.; Guan, W.; Li, G. A new model to forecast fishing ground of Scomber japonicus in the Yellow Sea and East China Sea. Acta Ocean. Sin. 2016, 35, 74–81.
- Cui, X.; Tang, F.; Zhou, W.; Wu, Z.; Yang, S.; Hua, C. Fishing ground forecasting model of Ommastrephes bartramii based on support vector machine (SVM) in the Northwest Pacific Ocean. South China Fish. Sci. 2016, 12, 1–7 (In Chinese with English abstract).
- Mao, J.; Chen, X.; Yu, J. Forecasting fishing ground of Thunnus alalunga based on BP neural network in South Pacific Ocean. Acta Ocean. Sin. 2016, 10, 34–43 (In Chinese with English abstract).
- Chen, X.; Fan, W.; Cui, X.; Zhou, W.; Tang, F. Fishing ground forecasting of Thunnus alalunga in Indian Ocean based on random forest. Acta Ocean. Sin. 2013, 35, 158–164 (In Chinese with English abstract).
- Vaihola, S.; Kininmonth, S. Environmental Factors Determine Tuna Fishing Vessels’ Behavior in Tonga. Fishes 2023, 8, 602.
- Hou, J.; Zhou, W.; Fan, W.; Zhang, H. Research on fishing grounds forecasting models of albacore tuna based on ensemble learning in South Pacific. South China Fish. Sci. 2020, 5, 42–50 (In Chinese with English abstract).
- Song, L.; Li, T.; Zhang, T.; Sui, H.; Li, B.; Zhang, M. Comparison of machine learning models within different spatial resolutions for predicting the bigeye tuna fishing grounds in tropical waters of the Atlantic Ocean. Fish. Oceanogr. 2023, 32, 509–526.
- Xu, L.; Chi, D. Machine Learning Classification Strategy for Imbalanced Data Sets. Comput. Eng. Appl. 2020, 56, 12–27 (In Chinese with English abstract).
- Wardhani, N.W.S.; Rochayani, M.Y.; Iriany, A.; Sulistyono, A.D.; Lestantyo, P. Cross-validation metrics for evaluating classification performance on imbalanced data. In Proceedings of the 2019 International Conference on Computer, Control, Informatics and Its Applications, Tangerang, Indonesia, 23–24 October 2019; pp. 14–18.
- Shabani, F.; Kumar, L.; Ahmadi, M. A Comparison of Absolute Performance of Different Correlative and Mechanistic Species Distribution Models in an Independent Area. Ecol. Evol. 2016, 6, 5973–5986.
- Feng, Y.; Chen, X.; Gao, F.; Liu, Y. Impacts of changing scale on Getis-Ord Gi hotspots of CPUE: A case study of the neon flying squid (Ommastrephes bartramii) in the northwest Pacific Ocean. Acta Ocean. Sin. 2018, 37, 67–76.
- Krishna, K.; Murty, M.N. Genetic K-means algorithm. IEEE Trans. Sys. Man. Cyber. Pt. B 1999, 29, 433–439.
- Jacod, J.; Protter, P. Discretization of Processes; Springer: Berlin, Heidelberg, 2011.
- Ren, L.; Ma, Y.; Shi, H.; Chen, X. Overview of Machine Learning Algorithms. In Lecture Notes in Electrical Engineering; Springer: Singapore, 2020; pp. 672–678.
- LaValley, M.P. Logistic regression. Circulation 2008, 117, 2395–2399.
- Hearst, M.A.; Dumais, S.T.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. Springer Science+Business Media, LLC: New York, NY, USA, 1998; Volume 13, pp. 18–28.
- Suryanarayana, I.; Braibanti, A.; Sambasiva Rao, R.; Ramam, V.A.; Sudarsan, D.; Nageswara Rao, G. Neural networks in fisheries research. Fish. Res. 2008, 92, 115–139.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Chen, T.; He, T.; Benesty, M.; Khotilovich, V.; Tang, Y.; Cho, H.; Chen, K.; Mitchell, R.; Cano, I.; Zhou, T.; et al. Xgboost: Extreme Gradient Boosting; R Package, version 0.4-2; The R Foundation: Vienna, Austria, 2015; Volume 1, pp. 1–4.
- Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased boosting with categorical features. Ad. Neural. Infor. Process. Syst. 2018, 31, 6637–6647.
- Džeroski, S.; Ženko, B. Is combining classifiers with stacking better than selecting the best one? Mach. Learn. 2004, 54, 255–273.
- Fawcett, T. An introduction to ROC analysis. Pat. Recog. Let. 2006, 27, 861–874.
- Majnik, M.; Bosnić, Z. ROC analysis of classifiers in machine learning: A survey. Intel. Dt. Analy. 2013, 17, 531–558.
- Harding, J.; Shahbaz, M.; Srinivas, S.; Kusiak, A. Data mining in manufacturing: A review. ASME Trans. J. Manuf. Sci. Eng. 2006, 128, 969–976.
- Tan, H.; Wu, Y.; Shen, B.; Jin, P.; Ran, B. Short-term traffic prediction based on dynamic tensor completion. IEEE Trans. Intell. Transp. Syst. 2016, 17, 2123–2133.
- Cui, Y.; Liu, S.; Zhang, Y.; Xu, B.; Ji, Y.; Zhang, C.; Xue, Y. Habitat characteristics of Octopus ocellatus and their relationship with environmental factors during spring in Haizhou Bay, China. Chin. J. Appl. Ecol. 2022, 33, 1686–1692 (In Chinese with English abstract).
- Yang, C.; Li, X.; Liu, Q.; Wang, Y. Application of machine learning methods for estimating the biomass of economically crabs in the Zhoushan fishery. Mar. Sci. 2023, 9, 61–70 (In Chinese with English abstract).
- Mugo, R.; Saitoh, S.I. Ensemble modelling of Skipjack tuna (Katsuwonus pelamis) habitats in the western North Pacific using satellite remotely sensed data; a comparative analysis using machine-learning models. Remote Sens. 2020, 12, 2591.
- Zainuddin, M.; Kiyofuji, H.; Saitoh, K.; Saitoh, S.I. Using multi-sensor satellite remote sensing and catch data to detect ocean hot spots for albacore (Thunnus alalunga) in the northwestern North Pacific. Deep-Sea Res. Pt. II 2006, 53, 419–431.
- Ashida, H.; Gosho, T.; Watanabe, K.; Okazaki, M.; Tanabe, T.; Uosaki, K. Reproductive traits and seasonal variations in the spawning activity of female albacore, Thunnus alalunga, in the subtropical western North Pacific Ocean. J. Sea. Res. 2020, 160, 101902.
- Jackson, G.D.; Meekan, M.G.; Wotherspoon, S.; Jackson, C.H. Distributions of young cephalopods in the tropical waters of Western Australia over two consecutive summers. ICES. J. Mar. Sci. 2008, 65, 140–147.
- Shcherbina, A.Y.; D’Asaro, E.A.; Riser, S.C.; Kessler, W.S. Variability and interleaving of upper-ocean water masses surrounding the North Atlantic salinity maximum. Oceanography 2015, 28, 106–113.
- Mondal, S.; Wang, Y.C.; Lee, M.A.; Weng, J.S.; Mondal, B.K. Ensemble three-dimensional habitat modeling of Indian Ocean immature albacore tuna (Thunnus alalunga) using remote sensing data. Remote Sens. 2022, 14, 5278.
- Arrizabalaga, H.; Dufour, F.; Kell, L.; Merino, G.; Ibaibarriaga, L.; Chust, G.; Irigoien, X.; Santiago, J.; Murua, H.; Fraile, I.; et al. Global habitat preferences of commercially valuable tuna. Deep-Sea Res. Pt. II 2015, 113, 102–112.
- Kai, E.T.; Marsac, F. Influence of mesoscale eddies on spatial structuring of top predators’ communities in the Mozambique Channel. Prog. Oceanogr. 2010, 86, 214–223.
- Zhou, C.; He, P.; Xu, L.; Bach, P.; Wang, X.; Wan, R.; Zhang, Y. The effects of mesoscale oceanographic structures and ambient conditions on the catch of albacore tuna in the South Pacific longline fishery. Fish. Oceanogr. 2020, 29, 238–251.
- Iriarte, J.L.; González, H.E.; Liu, K.K.; Rivas, C.; Valenzuela, C. Spatial and temporal variability of chlorophyll and primary productivity in surface waters of southern Chile (41.5–43 S). Estuarine. Coast. Shelf. Sci. 2007, 74, 471–480.
- Lougee, L.A.; Bollens, S.M.; Avent, S.R. The effects of haloclines on the vertical distribution and migration of zooplankton. J. Exp. Mar. Bio. Eco. 2002, 278, 111–134.
- Wu, J.; Jin, L. Exploration of the classification and main characteristics of marine ecosystems. Inter. J. Mar. Sci. 2023, 13, 1–7.
- Xu, H.; Song, L.; Shen, J.; Li, Y.; Zhang, M. The relationship between the spatial-temporal distribution of albacore tuna CPUE and the marine environment variables in waters near the Cook Islands based on GAM. Mar. Sci. Bull. 2023, 4, 444–455 (In Chinese with English abstract).
- Du Pontavice, H.; Gascuel, D.; Reygondeau, G.; Maureaud, A.; Cheung, W.W. Climate change undermines the global functioning of marine food webs. Glob. Change Biol. 2020, 26, 1306–1318.
- Lehodey, P.; Chai, F.; Hampton, J. Modelling climate-related variability of tuna populations from a coupled ocean–biogeochemical-populations dynamics model. Fish. Oceanogr. 2003, 12, 483–494.
Model | Introduction | Parameters |
---|---|---|
K-Nearest Neighbors (KNN) (reproduced from Springer, 2020) [26] | KNN classifies a new sample using only the K training samples closest to it. | K = 9 (number of nearest neighbors), weight: distance |
Logistic Regression (LR) (reproduced from American Heart Association, 2008) [27] | LR is a classic classification algorithm widely used in fields such as life sciences, social sciences, and finance. | penalty coefficient = 1.2, max_iterate = 100 |
Support Vector Machine (SVM) (reproduced from Springer, 1998) [28] | SVM is a powerful classification algorithm that finds an optimal line or surface (a hyperplane) separating the data into two categories while maximizing the margin between them. | penalty coefficient = 1 |
Artificial Neural Network (ANN) (reproduced from Fish. Res., 2008) [29] | ANN is a model widely used in fields such as image, audio, and natural language processing. | HiddenLayerSizes = 20, learning rate: adaptive |
Random Forest (RF) (reproduced from Mach. Learn., 2001) [30] | Random Forest is an ensemble learning algorithm based on decision trees. | n_estimators = 300 |
Extreme Gradient Boosting (XGB) (reproduced from CRAN, 2015) [31] | XGBoost is an efficient implementation of gradient-boosted trees that combines techniques such as tree pruning and feature subsampling. | n_estimators = 200, learning rate = 0.05 |
CatBoost (CAT) (reproduced from Ad. Neural. Infor. Process. Syst., 2018) [32] | CatBoost is a machine learning algorithm based on gradient boosting that supports tasks such as classification, regression, and ranking. | n_estimators = 200, learning rate = 0.05 |
Stacking (STK) (reproduced from Mach. Learn., 2004) [33] | Stacking is an ensemble learning technique that combines different models to achieve higher accuracy in the final prediction. | Estimators = {KNN, RF, XGB}, meta_learner = ANN, cv = 5 |
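
To make the KNN row concrete, the following is a minimal pure-Python sketch of distance-weighted voting with K = 9. The helper name `knn_predict`, the toy feature values, and the choice of Euclidean distance with inverse-distance weights are assumptions made for illustration; the table only states K = 9 and weight = distance.

```python
import math
from collections import defaultdict

def knn_predict(train_x, train_y, query, k=9):
    """Distance-weighted k-nearest-neighbour vote (hypothetical helper)."""
    # Euclidean distance from the query to every training point, nearest first.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = defaultdict(float)
    for d, label in dists[:k]:
        # An exact match decides the class outright (avoids division by zero).
        if d == 0.0:
            return label
        # Closer neighbours contribute larger weights (assumed 1 / distance).
        votes[label] += 1.0 / d
    return max(votes, key=votes.get)

# Toy data: two features per sample; 1 = high-CPUE cell, 0 = low-CPUE cell.
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4),
           (2.0, 2.0), (2.1, 1.9), (1.9, 2.2), (2.2, 2.1), (1.8, 1.8)]
train_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

pred_low = knn_predict(train_x, train_y, (0.1, 0.1))
pred_high = knn_predict(train_x, train_y, (2.0, 2.1))
```

With K = 9 almost every toy point takes part in each vote, so the inverse-distance weights, not the raw neighbour counts, decide the class.
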
 | Actual Positive Class | Actual Negative Class |
---|---|---|
Predicted positive class | TP (True positives) | FP (False positives) |
Predicted negative class | FN (False negatives) | TN (True negatives) |
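
The four cells of the confusion matrix are enough to compute the usual binary indicators. The sketch below assumes the standard definitions of accuracy, precision, recall, and F1; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"A": accuracy, "P": precision, "R": recall, "F1": f1}

# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=80, fp=20, fn=10, tn=90)
```

Because accuracy counts true negatives while F1 does not, the two can diverge sharply when one class dominates the sample.
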
Variables | Correlation | p |
---|---|---|
Year | −0.00 | 0.81 |
Month | 0.27 | <0.05 |
Lat | −0.37 | <0.05 |
Lon | 0.12 | <0.05 |
SOI | 0.13 | <0.05 |
PDO | 0.01 | 0.48 |
SST | −0.48 | <0.05 |
CHL | −0.05 | <0.05 |
SSS | −0.17 | <0.05 |
DO | 0.49 | <0.05 |
SSHA | 0.06 | <0.05 |
WS | −0.02 | 0.28 |
MLD | 0.37 | <0.05 |
PAR | −0.23 | <0.05 |
EKE | −0.06 | <0.05 |
ST50 | −0.46 | <0.05 |
SS50 | −0.32 | <0.05 |
ST100 | −0.39 | <0.05 |
SS100 | −0.37 | <0.05 |
ST200 | −0.36 | <0.05 |
SS200 | −0.42 | <0.05 |
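
As a rough illustration of how a coefficient in the table above can be obtained, the sketch below computes a Pearson correlation in pure Python. Whether the paper used Pearson or a rank-based coefficient is not stated in this excerpt, so the choice is an assumption, and the SST and CPUE values are invented toy numbers. The p-value, which requires a t-distribution, is not computed here.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Toy values only: CPUE falling as sea surface temperature rises.
sst = [20.0, 22.0, 24.0, 26.0, 28.0]
cpue = [30.0, 26.0, 27.0, 18.0, 15.0]
r = pearson_r(sst, cpue)
```
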
Scheme | Division Basis | Number of High-Abundance Fishing Grounds | Number of Low-Abundance Fishing Grounds | Imbalance Rate |
---|---|---|---|---|
A | Median | 2312 | 2311 | 1 |
B | First tertile (33.3rd percentile) | 3097 | 1526 | 2 |
C | First quartile (25th percentile) | 3467 | 1156 | 3 |
D | 40th percentile (two-fifths quantile) | 2774 | 1849 | 1.5 |
E | K-means clustering | 650 | 3973 | 6 |
F | 1-R partition nodes | 102 | 4521 | 44 |
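
The percentile-based schemes can be sketched as a simple threshold on CPUE. In the example below the imbalance rate is taken as the larger class count divided by the smaller one, which matches the values in the table; the helper names, the nearest-rank cut, and the toy CPUE values are assumptions for illustration.

```python
def percentile_threshold(values, pct):
    """Nearest-rank style cut: about pct percent of values fall below it."""
    ordered = sorted(values)
    idx = len(ordered) * pct // 100
    return ordered[min(idx, len(ordered) - 1)]

def label_high_low(values, pct):
    """Label each value 1 (high abundance) or 0 (low abundance)."""
    threshold = percentile_threshold(values, pct)
    return [1 if v >= threshold else 0 for v in values]

def imbalance_rate(labels):
    """Ratio of the larger class count to the smaller one."""
    high = sum(labels)
    low = len(labels) - high
    if min(high, low) == 0:
        raise ValueError("one class is empty")
    return max(high, low) / min(high, low)

# Toy CPUE values 1..20 (distinct, so the cuts are unambiguous).
cpue = list(range(1, 21))
rate_median = imbalance_rate(label_high_low(cpue, 50))    # like scheme A
rate_quartile = imbalance_rate(label_high_low(cpue, 25))  # like scheme C
rate_p40 = imbalance_rate(label_high_low(cpue, 40))       # like scheme D
```

Lowering the cut point enlarges the high-abundance class, which is why the imbalance rate rises from scheme A to scheme C.
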
Scheme | Accuracy (A) | Precision (P) | Recall (R) | F1 |
---|---|---|---|---|
A | 0.744 | 0.743 | 0.746 | 0.744 |
B | 0.746 | 0.773 | 0.879 | 0.822 |
C | 0.823 | 0.833 | 0.955 | 0.89 |
D | 0.753 | 0.771 | 0.837 | 0.802 |
E | 0.925 | 0.828 | 0.589 | 0.688 |
F | 0.976 | 0.417 | 0.192 | 0.263 |
Scheme | Segmentation Method | Parameter | Sample Proportion |
---|---|---|---|
A | Tertile method | - | 1:1:1 |
B | K-means clustering | K = 3 | 2872:1415:336 |
C | Quintile method | - | 1:1:1:1:1 |
D | K-means clustering | K = 5 | 1972:1538:683:317:106 |
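
The unequal class sizes of the K-means schemes follow from how the clusters form. Below is a minimal one-dimensional K-means (Lloyd's algorithm) sketch, assuming the clustering is applied to CPUE values alone; the deterministic initialisation, the function name, and the toy data are illustrative assumptions rather than the paper's procedure.

```python
def kmeans_1d(values, k, max_iters=100):
    """One-dimensional K-means; returns (centres, labels)."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: spread the initial centres across the sorted data.
    centres = [float(ordered[(2 * i + 1) * n // (2 * k)]) for i in range(k)]

    def nearest(v):
        return min(range(k), key=lambda j: abs(v - centres[j]))

    for _ in range(max_iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v)].append(v)
        # Move each centre to the mean of its group; keep it if the group is empty.
        updated = [sum(g) / len(g) if g else centres[j]
                   for j, g in enumerate(groups)]
        if updated == centres:
            break
        centres = updated
    return centres, [nearest(v) for v in values]

# Toy CPUE values with many low, some medium and few high catches.
cpue = [1, 2, 3, 2, 1, 10, 11, 12, 30, 31]
centres, labels = kmeans_1d(cpue, k=3)
class_sizes = [labels.count(c) for c in range(3)]
```

Unlike the tertile and quintile splits, the class sizes here depend on where the data cluster, so a skewed CPUE distribution yields skewed classes.
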
Scheme | Compared With | Top-k | t | p |
---|---|---|---|---|
A | B | 1 | −43.392 | <0.001 |
A | B | 2 | −33.639 | <0.001 |
C | D | 1 | −20.906 | <0.001 |
C | D | 2 | −19.472 | <0.001 |
C | D | 3 | −13.458 | <0.001 |
C | D | 4 | −16.306 | <0.001 |
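
The comparison above rests on two quantities: a Top-k accuracy per scheme and a t statistic comparing schemes. The sketch below shows one way to compute both in pure Python. It assumes a paired t-test over matched runs, which this excerpt does not confirm, and every probability and accuracy value is an invented toy number.

```python
import math
import statistics

def top_k_accuracy(probs, truths, k):
    """Share of samples whose true class is among the k most probable."""
    hits = 0
    for row, truth in zip(probs, truths):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(truths)

def paired_t(a, b):
    """Paired t statistic for matched samples a and b (assumed test)."""
    diffs = [x - y for x, y in zip(a, b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

# Toy class probabilities for four samples over three abundance classes.
probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.3, 0.6], [0.4, 0.35, 0.25]]
truths = [0, 2, 2, 1]
top1 = top_k_accuracy(probs, truths, k=1)
top2 = top_k_accuracy(probs, truths, k=2)

# Toy Top-1 accuracies of two schemes over four matched runs.
scheme_a = [0.60, 0.62, 0.61, 0.63]
scheme_b = [0.70, 0.71, 0.73, 0.72]
t_stat = paired_t(scheme_a, scheme_b)
```

A negative t statistic means the first scheme scored lower than the second on average, which is the sign pattern seen in every row of the table.
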
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, J.; Chen, F.; Dai, Q.; Zhu, W.; Li, D.; Yu, W.; Zhou, W. Construction and Comparison of Machine-Learning Forecast Models of Albacore Thunnus alalunga Fishing Grounds in the South Pacific Ocean. Fishes 2024, 9, 375. https://doi.org/10.3390/fishes9100375