Near-Infrared Spectroscopy Machine-Learning Spectral Analysis Tool for Blueberries (Vaccinium corymbosum) Cultivar Discrimination
Abstract
1. Introduction
2. Materials and Methods
2.1. Vaccinium Cultivar Leaves
- Spring: 44 plants—44 leaves, 1 per plant;
- Fall: 22 plants—22 leaves, 1 per plant;
- Winter: 32 plants—32 leaves, 1 per plant.
2.2. Infrared Spectra Acquisition
2.3. Data Analysis and Prediction
3. Results
3.1. Spectral Analysis Results
3.2. Machine-Learning Classification Results
4. Discussion
4.1. Choosing the Best DDR Method for Each Season and Leaf Side Using the ROC Curves
4.2. Choosing the Best Spectral Region for Each Season and Leaf Side Using and
4.3. Analysis of the Best Overall Results
4.4. Analysis of Confusion Matrices Related to the Best Overall Results
5. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Species | Blueberry Type | Cultivar | Spring (30th May), R1 | Spring (30th May), R2 | Fall (1st September), R1 | Winter (1st December), R1 |
---|---|---|---|---|---|---|
V. corymbosum | Northern | Bluecrop | 2 | 2 | 2 | 2 |
Duke | 2 | 2 | 2 | 2 | ||
Legacy | 2 | 2 | 2 | 2 | ||
Draper | 2 | 2 | 2 | |||
Chandler | 2 | 2 | 2 | 2 | ||
Bluejay | 2 | |||||
Goldtraube | 2 | 2 | 2 | |||
Huron | 2 | 2 | 2 | |||
Aurora | 2 | 2 | 2 | |||
Liberty | 2 | 2 | 2 | |||
Elliott | 2 | |||||
Patriot | 2 | |||||
Bluegold | 2 | |||||
Southern | Camellia | 2 | 2 | 2 | 2 | |
Misty | 2 | 2 | 2 | |||
O’Neal | 2 | 2 | |||
Ozarkblue | 2 | 2 | ||||
Biloxi | 2 | |||||
Star | 2 | 2 |
Classifier | Hyperparameters |
---|---|
AdaBoostClassifier (AdaBoost) | Default parameters |
BaggingClassifier (BaggC) | Default parameters |
DecisionTreeClassifier (DeTreeC) | max_depth: 5 |
GaussianNB (GauNB) | Default parameters |
KNeighborsClassifier (KNN) | Default parameters |
LinearDiscriminantAnalysis (LinDis) | Default parameters |
LinearSVC (LinSVC) | random_state: 0 |
LogisticRegression (LogReg) | solver: “lbfgs” |
QuadraticDiscriminantAnalysis (QuadDis) | Default parameters |
Support-vector Machines (SVC) | gamma: “auto”, probability: True |
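
The classifier table above can be written down as a small factory function. This is a minimal sketch, assuming the listed names refer to scikit-learn estimators (the text here does not name the library) and that the KNN entry corresponds to scikit-learn's `KNeighborsClassifier`. The function name `build_classifiers` and the dictionary keys are illustrative, taken from the abbreviations in the table.

```python
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC, LinearSVC
from sklearn.tree import DecisionTreeClassifier


def build_classifiers():
    """Return the ten classifiers from the table, keyed by abbreviation.

    Assumption: scikit-learn estimators; only the hyperparameters listed in
    the table are set, everything else is left at the library default.
    """
    return {
        "AdaBoost": AdaBoostClassifier(),
        "BaggC": BaggingClassifier(),
        "DeTreeC": DecisionTreeClassifier(max_depth=5),
        "GauNB": GaussianNB(),
        "KNN": KNeighborsClassifier(),
        "LinDis": LinearDiscriminantAnalysis(),
        "LinSVC": LinearSVC(random_state=0),
        "LogReg": LogisticRegression(solver="lbfgs"),
        "QuadDis": QuadraticDiscriminantAnalysis(),
        "SVC": SVC(gamma="auto", probability=True),
    }


classifiers = build_classifiers()
```

Keeping the estimators in one dictionary makes it straightforward to loop over every classifier with the same cross-validation split and compare them on equal terms.
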
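Leaf Side_Season | DDR Process | I (9920–7275 cm⁻¹) AUC | I (9920–7275 cm⁻¹) Accuracy (%) | II (7274–6314 cm⁻¹) AUC | II (7274–6314 cm⁻¹) Accuracy (%) | III (6313–5390 cm⁻¹) AUC | III (6313–5390 cm⁻¹) Accuracy (%) | IV (5389–4924 cm⁻¹) AUC | IV (5389–4924 cm⁻¹) Accuracy (%) | V (4923–4073 cm⁻¹) AUC | V (4923–4073 cm⁻¹) Accuracy (%) | All Spectrum (9920–4073 cm⁻¹) AUC | All Spectrum (9920–4073 cm⁻¹) Accuracy (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|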
Adaxial_Fall | Fast ICA | 0.8 | 81.82 | 0.975 | 77.28 | 1 | 100 | 1 | 68.19 | 1 | 90.9 | 1 | 81.82 |
Adaxial_Spring | DL | 0.5833 | 40.91 | 0.9524 | 43.18 | 0.6429 | 50 | 0.619 | 40.91 | 0.5714 | 45.45 | 0.619 | 52.27 |
Adaxial_Winter | Fast ICA | 0.9833 | 75 | 0.9833 | 84.38 | 1 | 93.75 | 0.9833 | 78.13 | 1 | 96.88 | 0.96667 | 84.38 |
Abaxial_Fall | Fast ICA | 0.9 | 77.27 | 0.5 | 77.27 | 1 | 95.45 | 0.725 | 81.82 | 1 | 86.36 | 0.975 | 90.91 |
Abaxial_Spring | Fast ICA | 0.9524 | 34.09 | 1 | 52.27 | 1 | 70.45 | 0.9881 | 50 | 1 | 59.09 | 0.9881 | 54.55 |
Abaxial_Winter | Fast ICA | 1 | 71.88 | 1 | 84.38 | 1 | 93.75 | 0.75 | 87.5 | 1 | 100 | 1 | 87.5 |
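
The per-region figures above can be reduced to one best region per season and leaf side. The sketch below is only an illustration: the selection rule (highest accuracy, with AUC as a tie-breaker) is an assumption, since the rule is not stated in this text, and the names `REGIONS`, `RESULTS` and `best_region` are hypothetical. The values are copied from the table as (AUC, accuracy) pairs.

```python
# Spectral regions in the column order of the table above.
REGIONS = ["I", "II", "III", "IV", "V", "All Spectrum"]

# (AUC, accuracy in %) per region, copied row by row from the table.
RESULTS = {
    "Adaxial_Fall": [(0.8, 81.82), (0.975, 77.28), (1, 100),
                     (1, 68.19), (1, 90.9), (1, 81.82)],
    "Adaxial_Spring": [(0.5833, 40.91), (0.9524, 43.18), (0.6429, 50),
                       (0.619, 40.91), (0.5714, 45.45), (0.619, 52.27)],
    "Adaxial_Winter": [(0.9833, 75), (0.9833, 84.38), (1, 93.75),
                       (0.9833, 78.13), (1, 96.88), (0.96667, 84.38)],
    "Abaxial_Fall": [(0.9, 77.27), (0.5, 77.27), (1, 95.45),
                     (0.725, 81.82), (1, 86.36), (0.975, 90.91)],
    "Abaxial_Spring": [(0.9524, 34.09), (1, 52.27), (1, 70.45),
                       (0.9881, 50), (1, 59.09), (0.9881, 54.55)],
    "Abaxial_Winter": [(1, 71.88), (1, 84.38), (1, 93.75),
                       (0.75, 87.5), (1, 100), (1, 87.5)],
}


def best_region(pairs):
    """Pick the region with the highest accuracy, breaking ties by AUC.

    Assumed rule for illustration only.
    """
    idx = max(range(len(pairs)), key=lambda i: (pairs[i][1], pairs[i][0]))
    return REGIONS[idx]


best = {name: best_region(pairs) for name, pairs in RESULTS.items()}
for name, region in best.items():
    print(f"{name}: {region}")
```

Under this assumed rule the selected regions are III for both fall data sets and for Abaxial_Spring, V for both winter data sets, and the full spectrum for Adaxial_Spring, which agrees with the Spectral Region column of the best-results table that follows.
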
Leaf Side_Season | Spectral Region | DDR Process | # of Components | # of Features | Classifier | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) | AUC |
---|---|---|---|---|---|---|---|---|---|---|
Adaxial_Fall | III | Fast ICA | 12 | 140 | LinSVC | 100 | 100 | 100 | 100 | 1 |
Adaxial_Spring | All Spectrum | DL | 15 | 330 | LinSVC | 52.27 | 38.54 | 48.44 | 40.63 | 0.619 |
Adaxial_Winter | V | Fast ICA | 14 | 120 | LogReg | 96.88 | 95.31 | 96.88 | 95.83 | 1 |
Abaxial_Fall | III | Fast ICA | 10 | 110 | LogReg | 95.45 | 93.18 | 95.45 | 93.94 | 1 |
Abaxial_Spring | III | Fast ICA | 13 | 190 | LinSVC | 70.45 | 61.46 | 70.31 | 64.69 | 1 |
Abaxial_Winter | V | Fast ICA | 15 | 190 | LinSVC | 100 | 100 | 100 | 100 | 1 |
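
To make the relationship between the accuracy, precision, recall and F1-score columns concrete, the sketch below computes macro-averaged metrics for a hypothetical case. The assumptions are all mine rather than taken from the text: metrics are macro-averaged over classes, a class that is never predicted counts as precision 0, and results are averaged over two folds that each hold one test leaf per cultivar. The labels are synthetic (16 classes, one misclassified leaf), not the study's data.

```python
def macro_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1) as fractions.

    Undefined ratios (a class never predicted or never present) count as 0.
    """
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        predicted = sum(1 for p in y_pred if p == c)
        actual = sum(1 for t in y_true if t == c)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical test fold: 16 cultivars, one leaf each.
y_true = list(range(16))
pred_with_error = [1] + list(range(1, 16))  # cultivar 0 is mistaken for cultivar 1
pred_perfect = list(range(16))

fold_a = macro_metrics(y_true, pred_with_error)
fold_b = macro_metrics(y_true, pred_perfect)

# Average each metric over the two hypothetical folds.
mean_metrics = tuple((a + b) / 2 for a, b in zip(fold_a, fold_b))
print(mean_metrics)
```

With these assumptions the fold averages are 0.96875 (accuracy and recall), 0.953125 (precision) and about 0.958333 (F1-score), which agree to within rounding with the Adaxial_Winter row of the table above. That agreement is consistent with this reading of the metrics but does not confirm that it is the procedure used in the study.
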
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ribeiro, P.; Barbosa, M.I.; Sousa, C.; Rodrigues, P.M. Near-Infrared Spectroscopy Machine-Learning Spectral Analysis Tool for Blueberries (Vaccinium corymbosum) Cultivar Discrimination. Foods 2025, 14, 1428. https://doi.org/10.3390/foods14081428