Article

Identifying Optimal Wavelengths from Visible–Near-Infrared Spectroscopy Using Metaheuristic Algorithms to Assess Peanut Seed Viability

by Mohammad Rajabi-Sarkhani 1, Yousef Abbaspour-Gilandeh 1,*, Abdolmajid Moinfar 1, Mohammad Tahmasebi 1, Miriam Martínez-Arroyo 2,*, Mario Hernández-Hernández 3 and José Luis Hernández-Hernández 4
1 Department of Biosystems Engineering, College of Agriculture and Natural Resources, University of Mohaghegh Ardabili, Ardabil 56199-11367, Iran
2 National Technology of Mexico/Acapulco Institute of Technology, Acapulco 39905, Mexico
3 Faculty of Engineering, Autonomous University of Guerrero, Chilpancingo 39070, Mexico
4 National Technological of Mexico/Chilpancingo Institute of Technology, Chilpancingo 39070, Mexico
* Authors to whom correspondence should be addressed.
Agronomy 2023, 13(12), 2939; https://doi.org/10.3390/agronomy13122939
Submission received: 9 November 2023 / Revised: 23 November 2023 / Accepted: 27 November 2023 / Published: 29 November 2023
(This article belongs to the Section Agricultural Biosystem and Biological Engineering)

Abstract

Peanuts, owing to their composition of complex carbohydrates, plant protein, unsaturated fatty acids, and essential minerals (magnesium, iron, zinc, and potassium), hold significant potential as a vital component of the human diet. Additionally, their low water requirements and nitrogen fixation capacity make them an appropriate choice for cultivation in adverse environmental conditions. Because the germination ability of seeds profoundly impacts the final yield of the crop, assessing seed viability is extremely important. Conventional methods for assessing seed viability and germination are both time-consuming and costly. To address these challenges, this study investigated visible–near-infrared (Vis/NIR) spectroscopy in the wavelength range of 500–1030 nm as a nondestructive and rapid method to determine the viability of two varieties of peanut seeds: North Carolina-2 (NC-2) and Spanish flower (Florispan). The seeds were subjected to three levels of artificial aging through heat treatment, involving incubation in a controlled environment at a relative humidity of 85% and a temperature of 50 °C over 24 h intervals. Noise in the absorbance spectra was substantially reduced by combining Savitzky–Golay (SG) smoothing with multiplicative scatter correction (MSC). To identify the optimal wavelengths for seed viability assessment, a range of metaheuristic algorithms was employed, including world competitive contest (WCC), league championship algorithm (LCA), genetic algorithm (GA), particle swarm optimization (PSO), ant colony optimization (ACO), imperialist competitive algorithm (ICA), learning automata (LA), heat transfer optimization (HTS), forest optimization (FOA), discrete symbiotic organisms search (DSOS), and cuckoo optimization (CUK). These algorithms offer powerful optimization capabilities for effectively extracting relevant wavelength information from spectral data.
Results revealed that all the algorithms predicted the allometric coefficient of seeds with remarkable accuracy, achieving correlation coefficients exceeding 0.985 and errors below 0.0036. In terms of execution time, the ICA (2.3635 s) and LCA (44.9389 s) algorithms were the fastest and slowest, respectively. Conversely, the FOA and the LCA algorithms identified the smallest number of optimal wavelengths (10 wavelengths). Subsequently, the seeds were classified based on the wavelengths selected via the FOA (10 wavelengths) and DSOS (16 wavelengths) methods, in conjunction with logistic regression (LR), decision tree (DT), multilayer perceptron (MP), support vector machine (SVM), k-nearest neighbor (K-NN), and naive Bayes (NB) classifiers. The DSOS–DT and FOA–MP methods demonstrated the highest accuracy, yielding values of 0.993 and 0.983, respectively. Conversely, the DSOS–LR and DSOS–K-NN methods obtained the lowest accuracy, with values of 0.958 and 0.961, respectively. Overall, our findings demonstrated that Vis/NIR spectroscopy, coupled with variable selection algorithms and learning methods, presents a suitable and nondestructive approach for detecting seed viability.

1. Introduction

Peanuts hold significant economic importance globally, serving as a stable and cost-effective source of complex carbohydrates, plant protein, unsaturated fatty acids, and essential minerals such as magnesium, iron, zinc, and potassium [1]. This plant’s ability to thrive with low water requirements makes peanuts a favorable choice for cultivation in arid and semi-arid regions [2]. Moreover, in adverse environmental conditions, it exhibits resistance and is capable of performing adequately in poor soil conditions [3]. Additionally, peanuts’ excellent nitrogen fixation properties make them a suitable rotation option with cereal crops in successive planting seasons, contributing to soil fertility improvement [4]. Since peanut cultivation is conducted via seeds, evaluating seed germination vigor and viability is of paramount importance.
Seeds are considered the fundamental elements in agriculture and forestry, as they are directly or indirectly involved in establishing fields for various crops, vegetables, fruits, fodder, and economic forest products [5]. The quality of seeds profoundly impacts crop growth uniformity, yield, and overall crop quality. Furthermore, the safety and quality of seeds and their products directly affect human health [6]. Among the parameters associated with seed quality, seed vigor is a key criterion because it reflects, more fully than standard germination tests, the potential for seed germination, emergence in the field, resistance to biotic and abiotic stresses, and the ability to withstand different storage conditions [7]. Furthermore, it is well established that seeds with high viability, which give farmers strong yield performance and reduced crop variability, are profitable for the seed industry [8]. A vigorous seed possesses the potential to thrive in environmental conditions that may not be optimal for its species. Such seeds exhibit high and uniform germination rates, germinate quickly, and produce robust seedlings, ultimately leading to higher field yields [9]. The study of the relationship between the growth rates of different parts of an organism, or of the organism as a whole, is known as allometry. Under identical environmental conditions, the growth and functioning of the root and aerial systems are closely interrelated, and this relationship can be quantified through allometric relationships. Specifically, the allometric coefficient, which is calculated from the length of the aerial parts and roots, represents the ratio of shoot length to root length [10]. Several research studies have indicated that the allometric coefficient is influenced by seed vigor and can serve as an indicator for diagnosing seed quality [11,12].
Traditionally, various methods, including standard germination tests, electrical conductivity tests, seedling growth tests, accelerated aging tests, cold tests, and tetrazolium tests, have been proposed and employed to evaluate seed germination. However, these methods typically require significant time, are nonautomatic and may lead to seed destruction, and often necessitate specialized training and expertise. Consequently, they are not well-suited for large-scale applications or for protecting endangered species. Therefore, nondestructive and high-throughput screening methods are essential for the seed industry to provide high-quality seeds with superior characteristics to ensure the supply of high-germination seeds to farmers before planting [13].
In recent years, there have been significant advancements in electronic technologies and equipment, leading to notable improvements in the resolution and accuracy of light- and image-based systems. These systems are now capable of determining qualitative indicators of the chemical components of materials, either in a static setting or online in production lines. This progress has enabled the fast and precise classification of materials with reduced labor requirements [14]. Light- and image-based detection systems have been successfully employed in assessing the quality of agricultural food products, offering reliable and accurate results. By minimizing the influence of human intervention, these systems have become a preferred approach due to their consistency and stability [15,16,17]. Among the noninvasive methods used for identifying the chemical components of agricultural products, near-infrared spectroscopy (NIR) has gained widespread popularity in recent years. NIR operates based on the absorption of electromagnetic radiation within the wavelengths of 780 to 2500 nm [18]. When agricultural products are exposed to this radiation, their spectral response varies depending on the wavelength due to scattering and absorption processes. The tissue structures of these products, consisting of cells and intracellular/extracellular environments, are responsible for radiation scattering. Additionally, the absorption of electromagnetic rays is mainly influenced by the C–H, O–H, and N–H bonds present in major compounds such as water, sugars, chlorophylls, carotenoids, and so on. The NIR spectrum comprises broad wavebands resulting from the overlapping of absorption bands, which are closely associated with the overtones and combinations of these chemical bonds. As a result, organic and biological substances can be effectively detected using NIR spectroscopy [19].
The investigation of artificially aged soybean seeds in comparison to healthy seeds revealed that changes in radiation absorption within the wavelength range of 1000–2500 nm can effectively distinguish between healthy and old seeds [20]. Similarly, differentiating between viable and nonviable soybean seeds, which underwent accelerated aging through heat treatment, was accomplished using NIR reflectance spectra in the wavelength range of 400–2500 nm. Partial least square discriminant analysis (PLS-DA) was employed in this research to classify viable and nonviable seeds, with the best model achieving an accuracy of 95% in the short-wave infrared (SWIR) region of 750–2500 nm [21]. In the case of tomato seeds subjected to accelerated aging, NIR spectroscopy in absorption mode, within the wavelength range of 911–2258 nm, was utilized to classify viable and nonviable seeds. Both PLS-DA and interval partial least squares discriminant analysis (iPLS-DA) were employed to construct corresponding models. Specific spectral regions (1160–1170, 1383–1397, 1647–1666, 1884–1860, and 1915–1940 nm) were identified via iPLS-DA for the classification of viable and nonviable tomato seeds, resulting in a classification accuracy of 94% [22]. In a study focusing on spinach seeds, NIR spectroscopy within the wavelength range of 833–1667 nm was used to differentiate between viable and nonviable seeds. The optimal wavelengths were selected using successive projections algorithms (SPA), and classification models created with these 10 selected wavelengths demonstrated satisfactory accuracy in distinguishing viable seeds from nonviable ones [23].
Recently, several studies have investigated the feasibility of Fourier transform infrared spectroscopy (FTIR) and laser-induced breakdown spectroscopy (LIBS) for detecting the seed vigor of soybean and Brachiaria. FTIR is sensitive to molecular changes and therefore provides information about the carbohydrates, proteins, amides, and lipids in seeds. Because these are the main molecules influencing seed viability, FTIR was able to accurately determine the seed vigor of soybean [24] and Brachiaria [25]. LIBS, in turn, identifies the elements present in a substance. Studies applying LIBS to the seed vigor of soybean and Brachiaria found that the presence of calcium (Ca) in the seeds was the main characteristic responsible for the major variance in the data. Because Ca plays an important role in the enzyme activities of the plant during germination, LIBS was able to determine the seed vigor of soybean [26] and Brachiaria [27] by detecting Ca.
Although previous studies revealed that the use of NIR spectroscopy for diagnosing seed viability had acceptable accuracy, the practical and commercial feasibility of using the entire wavelength range is not economical. This can be explained by the high cost associated with producing spectroscopic instruments based on the full spectrum [28]. Therefore, there is a need to explore methods for identifying optimal wavelengths that would enable the production of industrial-commercial tools at a lower cost. The application of chemometrics techniques in analyzing spectroscopic data poses a fundamental challenge due to the high dimensionality of the data set. This refers to situations where the number of features greatly exceeds the size of the data set itself [29]. For instance, in spectroscopic applications with a large number of wavelengths, the classification parameters also increase, leading to a significant decrease in the performance of the classification tool [30]. When obtaining a substantial number of training data becomes impractical, reducing the size of the feature subset becomes crucial as it helps in reducing the number of required training data and, in turn, enhances the performance of the classification algorithm [31]. Dimension reduction serves as a common method to address this challenge by removing noise and unnecessary features. It proves to be an efficient approach for improving accuracy, reducing computational complexity, building more generalized models, and reducing storage space requirements [32]. The main idea behind feature selection is to select a subset of features by eliminating those with little or no informative value and removing highly correlated features [33]. Generally, feature selection methods aim to optimize two conflicting objectives: maximizing the association with the target class and minimizing redundancy (correlation) among the selected features [34].
The current research aimed to develop an intelligent model based on NIR spectroscopy data and machine learning analysis to assess the viability of peanut seeds. To achieve this, two peanut cultivars, North Carolina-2 (NC-2) and Florispan, were selected and exposed to three levels of artificial aging. The NIR spectroscopy data of the samples were collected, and deep-learning approaches were employed to create predictive models for germination indicators. Moreover, metaheuristic variable selection methods, namely world competitive contest (WCC), league championship algorithm (LCA), genetic algorithm (GA), particle swarm optimization (PSO), ant colony optimization (ACO), imperialist competitive algorithm (ICA), learning automata (LA), heat transfer optimization algorithm (HTS), forest optimization algorithm (FOA), discrete symbiotic organism search (DSOS), and cuckoo optimization (CUK), were used to select optimal wavelengths based on seed age and qualitative classification of peanuts.

2. Materials and Methods

Because this research comprises several stages, a flow chart of the overall process is shown in Figure 1 to give readers a clearer overview. The remainder of this section describes the steps in Figure 1 in order. The measurement methods used in this research are accepted standard methods, so their detailed description is omitted and the reader is referred to the original sources. Likewise, the variable selection algorithms and machine learning methods are well established and can be implemented in different software packages, so their mathematical details are omitted. To keep the article concise, interested readers are referred to the original articles by the developers of these methods.

2.1. Seed Selection and Aging Treatment

Two common peanut seed cultivars for cultivation in Iran, namely North Carolina 2 (NC-2) and Florispan, were chosen for the experiments. Seeds from the last crop year were selected to ensure optimal conditions for survival and germination vigor. The seeds with similar mass and size were selected to minimize the impact of unfavorable factors on the experiment results. Seeds with length in the range of 17–18 mm, width in the range of 8–9 mm, thickness in the range of 9–10 mm, and mass in the range of 1.1–1.2 g were considered. To induce artificial senescence, 300 seeds of each cultivar were subjected to accelerated aging treatment in three time intervals, with 24 h between each interval. The seeds were placed in a single layer of aluminum nets positioned above water containers in an incubator. Before placing the seeds, the containers were thoroughly cleaned with a 15% sodium hypochlorite solution to prevent fungal contamination. The incubator was set to maintain a relative humidity of 85% and a temperature of 50 °C. At 24 h intervals, one-third of the samples were removed from the incubator, resulting in three different aging periods for the seeds [35].

2.2. Preparation of Vis/NIR Spectra from Samples

A PS-100 model spectroradiometer (Apogee Instruments, Inc., Logan, UT, USA) was utilized to acquire the spectra of peanut seeds. This spectroradiometer is compact, lightweight, and portable, equipped with a sputtering-type monochromator with a resolution of 1 nm and a linear silicon CCD array detector containing 2048 pixels, covering the spectral range of 250–1150 nm (Vis/NIR). Furthermore, the spectroradiometer PS-100 can be connected to a computer via an optical fiber, and the acquired spectra are displayed and stored in the SpectraWiz® spectrometer software through a USB port. For obtaining the absorption spectra of the samples, a probe-detector sensor was employed, which is designed in a way that the light source is positioned at a 45° angle to the detector, allowing for the acquisition of internal diffuse reflection rather than using mirror reflection acquisition from the sample. Internal diffuse reflection occurs when the radiation penetrates into the cell structure of the sample and after hitting the common surfaces of the cell wall and spreading, it goes back and leaves the sample surface [36]. In this measuring method, light radiation penetrates into the sample, and a part of the radiation is absorbed depending on the molecular structure of the sample, and the rest of the radiation exits the sample at an angle of 45 degrees and is detected via the detector. In this way, the effects of external and internal composition of the sample on the amount of radiation absorption can be measured simultaneously. This internal diffuse reflection provides information about the internal contents of the sample. Figure 2 illustrates the process of acquiring the spectrum using the probe-detector sensor.

2.3. Standard Germination Test

For the standard germination test, 100 samples were selected for each treatment. The germination test was conducted using the method of placing seeds between wet papers. The samples were placed in a germinator with a constant temperature of 25 °C and kept in these conditions for 10 days to facilitate germination. Before conducting the test, the containers used were disinfected with a 15% hypochlorite solution, and the peanut seeds were treated with 1% mercury chloride [37]. The emergence of a 2 mm radicle was taken as the criterion for seed germination. Normal and abnormal seedlings were identified and counted according to the guidelines of the International Seed Testing Association (ISTA) from the fifth day up to the tenth day. Abnormal seedlings include those without a primary root system, with weak secondary roots, with necrotic spots in the tissue, or with a damaged terminal bud or a missing cotyledon. On the tenth day, the seedlings were placed in a dryer at a temperature of 60 °C for 24 h [10]. The mass and length of the seedlings were measured using a scale with an accuracy of 0.0001 g and a caliper with an accuracy of 0.01 mm, respectively. Based on the counts and measurements, various indicators related to seed germination were calculated for each treatment group. These indicators include germination energy (GE), mean daily germination (MDG), germination value (GV), daily germination speed (DGS), and germination vigor (GVI). Additionally, the allometric coefficient (AC) was calculated for each seed. The relationships used to calculate the seed germination indices are presented in Table 1.

2.4. Preprocessing of Vis/NIR Spectra

After acquiring the spectra and transferring them to the computer using Excel 2013 software, a single spectrum was created by averaging the two acquired spectra from the sides of the peanut, representing the index spectrum of each sample. Spectral data may contain irrelevant information and noise, such as fluorescence background, stray light, detector noise, cosmic rays, instrument noise, laser power fluctuations, and so on. To extract accurate information and enhance subtle differences between different samples, spectral preprocessing is a crucial step in spectral data analysis [39].
In this study, the first step of spectral preprocessing involved using the Savitzky–Golay (SG) method to smooth the curves and remove random noise. This method effectively smooths out slight fluctuations caused by noise in the curve while enhancing spectral peaks related to changes in the sample components [39,40]. Subsequently, multiplicative scatter correction (MSC) was employed to eliminate the noise caused by light scattering. MSC removes baseline translations and displacements caused by scattering effects between samples, thereby improving the signal-to-noise ratio of the original spectrum [41]. The Unscrambler 10.4 software was used for spectral data preprocessing. Figure 3 depicts the original spectrum curves of the samples and their preprocessed curves for each cultivar. As is clear from Figure 3c,d, preprocessing reduced the spurious variance between the spectral curves. In the wavelength ranges of 520–530 and 590–610 nm, the inflection points of the curves turn in different directions. In the wavelength ranges of 555–565 and 945–955 nm, some curves are at a relative maximum while others are at a relative minimum. These differences, revealed by the preprocessing, help to identify the optimal wavelengths.
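As an illustration of the two preprocessing steps described above, the following minimal Python sketch applies Savitzky–Golay smoothing followed by MSC to a small set of spectra. It is not the Unscrambler workflow used in the study. The 5-point window and quadratic polynomial are assumptions, since the SG settings are not stated in this section, and the function names `savgol_smooth` and `msc` as well as the sample values are hypothetical.

```python
def savgol_smooth(spectrum):
    """5-point quadratic Savitzky-Golay smoothing (assumed settings).

    Uses the standard convolution weights (-3, 12, 17, 12, -3) / 35.
    The two points at each edge are left unchanged.
    """
    weights = (-3, 12, 17, 12, -3)
    smoothed = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        window = spectrum[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(weights, window)) / 35
    return smoothed


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum x is regressed on the reference r (x = a + b * r) and
    corrected as (x - a) / b, which removes additive and multiplicative
    scatter effects between samples.
    """
    n_samples, n_bands = len(spectra), len(spectra[0])
    ref = [sum(s[j] for s in spectra) / n_samples for j in range(n_bands)]
    ref_mean = sum(ref) / n_bands
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_bands
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


# Hypothetical absorbance values: one base spectrum and two copies distorted
# by an offset and a gain, mimicking scatter differences between seeds.
base = [0.20, 0.35, 0.50, 0.45, 0.30, 0.25, 0.40, 0.60]
raw = [base, [2 * v + 1 for v in base], [0.5 * v + 0.2 for v in base]]
corrected = msc([savgol_smooth(s) for s in raw])
```

In this toy case the three spectra differ only by offset and gain, so after correction they coincide. Real spectra also differ chemically, and those differences are what remain for the wavelength selection step.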

2.5. Methods of Choosing the Optimal Wavelength

Optimization methods and algorithms are divided into two categories: exact algorithms and approximate algorithms. Exact algorithms are able to find the optimal solution accurately, but they are not efficient enough for hard optimization problems, and their execution time increases exponentially according to the dimensions of the problems. Approximate algorithms are able to find good solutions (near optimal) in a short solution time for hard optimization problems [42]. Approximate algorithms are divided into three categories: heuristic, metaheuristic, and hyper heuristic. The two main problems of heuristic algorithms are that they get stuck in local optimal points and that they display premature convergence to these points. Metaheuristic algorithms are presented to solve these heuristic algorithm problems. In fact, metaheuristic algorithms are one of the types of approximate optimization algorithms that have solutions for exiting from local optimal points and can be used in a wide range of problems [43].
To address complex optimization problems with numerous variables, metaheuristic algorithms are employed as suitable and efficient approaches. These algorithms rapidly provide approximate solutions, avoiding the need for time-consuming optimal solutions [44]. A key advantage of metaheuristic algorithms is their ability to escape local optima, ensuring a more comprehensive search for optimal solutions [45]. The working principle of these algorithms involves introducing a set of initial solutions randomly. A fitness function is then calculated to assess the optimality of each solution in the initial population. If the statistical criteria for optimization quality are not met, the algorithm produces a new generation of solutions, repeating the cycle until the desired optimization criteria are satisfied [46]. Metaheuristic approaches are typically categorized into two main groups: evolutionary algorithms (EA) and swarm intelligence (SI) [47]. Evolutionary algorithms have tried to simulate the process of genetic evolution of organisms or communities using mathematical principles. EA draws inspiration from biological evolution mechanisms such as reproduction, mutation, recombination, and selection. In the context of optimization problems, the introduced solutions represent individuals within a population, and the fitness function evaluates the quality and accuracy of these solutions. Through iterative steps, the evolutionary algorithm facilitates the evolution of the initial population toward overall optimization [48]. In contrast, swarm intelligence optimization methods involve a collection of simple solutions with no complexity of artificial agents. In general, swarm intelligence algorithms try to simulate the routing pattern of different organisms to reach food by using mathematical principles. The concept behind SI algorithms is inspired by natural systems, where each agent performs a basic task. 
However, the interaction, cooperation, and somewhat random responses of these agents lead to emergent intelligent behavior that is not achievable by any individual agent alone [49]. SI-based feature selection methods have been utilized in previous research, and their operational principles have been thoroughly described in the literature [30]. In this study, the optimal wavelengths were selected using variable selection methods based on various metaheuristic algorithms, including WCC [50], LCA [51], GA [52], PSO [53], ACO [54], ICA [55], LA [56], HTS [57], and FOA [58]. Wavelength selection was carried out using the FeatureSelect software package within MATLAB 2017 [59].
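The generate, evaluate, and regenerate cycle described in this section can be sketched with a small genetic algorithm over binary wavelength masks. This is only an illustrative sketch in Python, not the FeatureSelect implementation used in the study. The fitness function (absolute Pearson correlation between the mean of the selected bands and the target, minus a per-band penalty) is a deliberately simplified assumption, and all names and parameter values (`select_wavelengths`, `penalty`, `p_mut`, population size, number of generations) are hypothetical.

```python
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def fitness(mask, spectra, target, penalty=0.01):
    """Simplified fitness: reward correlation, penalize the number of bands."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return -1.0
    feature = [sum(row[j] for j in selected) / len(selected) for row in spectra]
    return abs(pearson(feature, target)) - penalty * len(selected)


def select_wavelengths(spectra, target, pop_size=30, generations=40,
                       p_mut=0.05, seed=0):
    """Return the indices of the bands kept by the best mask found."""
    rng = random.Random(seed)
    n_bands = len(spectra[0])
    # Initial population: the full spectrum as a baseline plus random masks.
    population = [[1] * n_bands] + [
        [rng.randint(0, 1) for _ in range(n_bands)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=lambda m: fitness(m, spectra, target), reverse=True)
        parents = population[:pop_size // 2]          # selection (elitist)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bands)           # one-point recombination
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            children.append(child)                    # mutated offspring
        population = parents + children
    best = max(population, key=lambda m: fitness(m, spectra, target))
    return [j for j, bit in enumerate(best) if bit]
```

Because the best half of each generation is carried over unchanged, the best mask never gets worse from one generation to the next. A practical implementation would replace the toy fitness with the error of a regression or classification model built on the selected bands.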

2.6. Modeling Methods to Predict Seed Viability

Traditional artificial intelligence methods typically employed a mathematical representation approach to describe optimization problems and discover optimal solutions under specific constraints based on logical mathematical principles. Such an approach is commonly referred to as knowledge-based [60]. Nevertheless, in most natural phenomena in which predicting trends or classifying an occurrence should be conducted, logical–mathematical laws may not be able to describe these phenomena adequately. This limitation arises from the fact that these phenomena are often abstract and not easily captured with mathematical formulations [61]. To overcome the limitations of the knowledge-based approach, an alternative strategy inspired by human processes has been developed. Humans are capable of learning from repeated tasks, receiving feedback, and adjusting their decisions or actions accordingly to achieve favorable outcomes [60]. This approach, based on iterative learning from experience, is referred to as a learning-based approach, in contrast to the knowledge-based approach [61]. Similarly, we can develop machines to perform specific tasks using a learning-based approach, known as machine learning (ML). In this current study, machine learning methods such as LR [62], DT [63], MP, SVM [64], k-NN [65], and NB [66] have been employed in the WEKA 3.8.6 software package for detecting and classifying the vitality of seeds.
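To make the learning-based idea concrete, the sketch below implements k-NN, one of the simplest of the classifiers listed above, in plain Python. It is not the WEKA implementation used in the study. The training values stand for absorbances at two selected wavelengths and are hypothetical, as are the function name `knn_predict`, the choice of k = 3, and the Euclidean distance.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a sample by majority vote among its k nearest training samples."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical absorbances at two selected wavelengths for six labelled seeds.
train_x = [[0.10, 0.20], [0.15, 0.22], [0.12, 0.18],
           [0.80, 0.90], [0.85, 0.88], [0.82, 0.95]]
train_y = ["viable", "viable", "viable",
           "nonviable", "nonviable", "nonviable"]

label = knn_predict(train_x, train_y, [0.11, 0.20])
```

The classifier holds no explicit rule about seed chemistry. Its decision comes entirely from labelled examples, which is the distinction from the knowledge-based approach drawn above.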

2.7. Evaluation Criteria of Optimal Wavelength Selection Algorithms and Machine Learning Models

To assess the effectiveness and performance of the optimal wavelength selection algorithms, two statistical measures were utilized: the root mean square error (RMSE) and the coefficient of determination (R²) [67,68]. These measures were calculated using Equations (1) and (2) [68,69,70,71]:
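$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \acute{y}_i\right)^2} \tag{1}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \acute{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \tag{2}$$
where $y_i$ and $\acute{y}_i$ are the actual and predicted values, respectively, $\bar{y}$ is the mean of the actual values, and $n$ is the number of samples.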
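As a worked example of these two measures, the short Python sketch below computes RMSE and R² using the usual definitions (R² as one minus the ratio of the residual sum of squares to the total sum of squares). The function names and the four actual/predicted values are hypothetical and are not data from this study.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical allometric coefficients: measured vs. model output.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

error = rmse(actual, predicted)           # sqrt(0.10 / 4), about 0.158
fit = r_squared(actual, predicted)        # 1 - 0.10 / 5.0 = 0.98
```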
To evaluate the accuracy of the classification models for peanut seed viability, several statistical criteria were employed, including accuracy, precision, sensitivity, specificity, and the receiver operating characteristic (ROC). Accuracy is the fraction of all samples that are classified correctly. Precision is the fraction of samples assigned to a category that truly belong to it. Sensitivity represents the fraction of positive cases correctly identified, and specificity denotes the fraction of negative cases correctly identified. These metrics were calculated using Equations (3)–(6) [69].
$$\mathrm{Accuracy} = \frac{\text{True Positive} + \text{True Negative}}{(\text{True Positive} + \text{False Negative}) + (\text{False Positive} + \text{True Negative})} \tag{3}$$
$$\mathrm{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}} \tag{4}$$
$$\mathrm{Sensitivity} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \tag{5}$$
$$\mathrm{Specificity} = \frac{\text{True Negative}}{\text{False Positive} + \text{True Negative}} \tag{6}$$
where True Positive is the number of samples in the i-th category that the algorithm correctly recognized. False Negative is the number of samples in the i-th category that the algorithm has misdiagnosed. False Positive is the number of samples outside the i-th category that the algorithm placed in the i-th category. True Negative is the number of samples outside the i-th category that the algorithm did not place in the i-th category.
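The per-category counts defined above can be read directly from a confusion matrix. The following Python sketch shows one way to do this for the i-th category. The three-class confusion matrix is hypothetical (rows are true categories, columns are predicted categories) and does not reproduce any result of this study, and the function names are illustrative.

```python
def one_vs_rest_counts(confusion, i):
    """Return (TP, FN, FP, TN) for category i of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    tp = confusion[i][i]
    fn = sum(confusion[i]) - tp                    # category i, missed
    fp = sum(row[i] for row in confusion) - tp     # others placed in i
    tn = total - tp - fn - fp                      # others kept out of i
    return tp, fn, fp, tn


def class_metrics(confusion, i):
    """Accuracy, precision, sensitivity and specificity for category i."""
    tp, fn, fp, tn = one_vs_rest_counts(confusion, i)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
    }


# Hypothetical results for three aging levels, 50 seeds per level.
confusion = [
    [45, 5, 0],
    [3, 44, 3],
    [0, 2, 48],
]
metrics = class_metrics(confusion, 0)
```

For the first category this gives 45 true positives, 5 false negatives, 3 false positives, and 97 true negatives, so the sensitivity is 45/50 = 0.90 and the specificity is 97/100 = 0.97.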

3. Results and Discussion

3.1. Examination of Seed Viability Indices

Table 2 presents the results related to seed viability indices for each treatment. Accordingly, it is evident that the accelerated aging test significantly impacted the seed groups, and there were noticeable differences in seed viability indices among the treatments. With an increase in the aging period, the germination percentage for both seed varieties significantly decreased, and in the third period, almost half of the seeds failed to germinate.

3.2. Comparison of the Efficiency of Optimal Wavelength Selection Methods

In this study, wavelength selection was treated as a regression problem, with the seed allometric coefficient as the continuous output (target) variable and the wavelengths as the input (independent) variables. The higher the seed allometric coefficient, the higher the likelihood of germination [10]. Optimization algorithms were therefore employed to search for wavelengths that yield regression models with the highest correlation between actual and predicted values. Table 3 provides descriptive statistical measures of algorithm accuracy and the number of optimal wavelengths selected by each algorithm.
Based on Table 3, all variable selection algorithms predict the allometric coefficient with high accuracy (CR > 0.98) and low error (RMSE < 0.003). Because accuracy barely separates them, execution time appears to be the most logical criterion for selecting the optimal algorithm. Algorithms that require less computational time are more practical for commercial-scale implementation [67,68,69,70]. In research contexts the seed recognition system handles only one seed at a time, whereas at the commercial scale it must handle millions of seeds. The shorter the execution time of the algorithm, the lower the computing load on the hardware, the shorter the seed separation time, and the better the system performance. The ranking of variable selection algorithms based on execution time is as follows: ICA < PSO < GA < HTS < FOA < WCC < DSOS < ACO < CUK < LA < LCA. Previous research has also indicated the high popularity and practicality of the ICA [72], PSO [73], and GA [74] algorithms due to their low execution time.
On the other hand, the number of selected wavelengths is also a crucial criterion in industrial-scale applications, because the cost of producing spectroscopic tools for practical purposes depends directly on the number of wavelengths the instrument must detect. As the number of wavelengths decreases, the production cost decreases accordingly [14]. Additionally, instruments that detect fewer wavelengths can provide higher accuracy and resolution in their measurements [75]. The variable selection algorithms are therefore ranked by the number of optimal wavelengths as follows: LCA < FOA < CUK < WCC < PSO < GA < DSOS < ACO < HTS < LA < ICA. The LCA and FOA algorithms outperform the others by identifying only 10 optimal wavelengths. Because FOA runs twice as fast as LCA, it can be considered the optimal method for wavelength selection.
In Figure 4, the algorithms' performance is compared in terms of correlation and RMSE in each round of algorithm execution. The LA algorithm showed a strong relationship between the number of executions and its performance, achieving lower error and higher correlation with each round. Conversely, the DSOS algorithm is relatively unaffected by additional executions, as re-running it does not significantly change its error rate or correlation. These findings align with the results of Masoudi-Sobhanzadeh et al. [59], who implemented and compared the algorithms using different datasets.
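The trade-off between the two criteria can be made explicit with a small Python sketch. It uses the number of wavelengths (NOF) and execution time (ET) reported in Table 3 for WCC, LCA, GA, PSO, ACO, ICA, and LA; the Pareto-style filter is an illustrative device and an assumption of this sketch, not part of the analysis reported in the study.

```python
# (number of selected wavelengths, execution time) from the Table 3 rows
# for WCC, LCA, GA, PSO, ACO, ICA, and LA.
rows = {
    "WCC": (14, 23.3603),
    "LCA": (10, 44.9389),
    "GA": (16, 3.4135),
    "PSO": (15, 2.5181),
    "ACO": (16, 31.1699),
    "ICA": (16, 2.3635),
    "LA": (15, 33.8388),
}


def dominates(a, b):
    """True if a is no worse than b on both criteria and differs from b."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b


# Rank by each criterion separately (smaller is better for both).
by_time = sorted(rows, key=lambda name: rows[name][1])
by_count = sorted(rows, key=lambda name: rows[name][0])

# Algorithms that no other listed algorithm beats on both criteria at once.
non_dominated = sorted(
    name for name, value in rows.items()
    if not any(dominates(other, value) for other in rows.values())
)

print("by execution time:", by_time)
print("by wavelength count:", by_count)
print("non-dominated:", non_dominated)
```

Among these seven rows, the fastest algorithms and the most parsimonious ones are different, so the choice depends on which criterion matters more for the intended application.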

3.3. Examination of Averages of Vis/NIR Absorption Spectra and Evaluation of the Location of the Selected Wavelength

Figure 5 displays the mean Vis/NIR absorption spectra for the three seed aging periods along with the locations of the selected optimal wavelengths. As seed age increases, absorption decreases along the entire curve. This phenomenon can be attributed to the loss of water from the sample during aging, because the amount of radiation absorbed depends largely on the amount of water in the chemical components [76]. Similar trends have been reported by other researchers [47,75,77]. The absorption changes in the spectral range of 500–550 nm can be attributed to carotenoids and anthocyanins [78]. Changes in the range of 650–700 nm are related to the presence of chlorophyll [79], whereas changes in the 680–760 nm range are associated with the presence of amino acids in the seed [77]. Changes in the 750–850 nm region are attributed to the water content of the sample [80]. The alterations in the 860–910 nm range indicate the presence of CH and OH bonds in carbohydrates [81], and changes in the 930–1030 nm range are explained by the presence of protein compounds in the sample [82]. The presence of each of these substances in the seed composition increases the probability of germination, which highlights the importance of detecting them [78]. Hence, it can be concluded that the optimal variable selection result includes at least one wavelength in each of the ranges mentioned.

3.4. Results of Seed Viability Classification Modeling based on Selected Wavelengths via Machine Learning Methods

Table 4 presents the results of the seed viability classification models based on the wavelengths selected using the FOA algorithm (lowest number of wavelengths) and the DSOS algorithm (highest number of wavelengths). As shown in Table 4, all classifiers performed well in determining seed viability. Overall, classification using the wavelengths selected via the FOA algorithm performed better. To compare the classifiers' performance more precisely, the results are presented graphically in Figure 6.
As shown in Figure 6, the effectiveness of every method except DT and MP decreased as the number of variables increased. The MP method gave consistent results across both sets of variables (10 and 16 variables), and the DT method performed best with the larger set. The LR, SVM, and KNN methods were strongly affected by the increased number of variables, which led to a sharp drop in their performance. Previous research has indicated that MP and DT are well suited to mining large-scale data with many variables. This behavior can be explained by the fact that the LR and SVM methods are affected by collinearity among variables [73]. The DT method, by contrast, can examine different features in different branches through its node-branch expansion and thereby overcome collinearity, which is essential for achieving good performance [61,69]. The MP method, for its part, has a layered structure with several neurons in each layer; each neuron models a set of features, which provides a solution to the problem of nonlinearity among variables [83]. However, the LR approach, when run with a limited number of variables, has an advantage over the other methods: it can explicitly describe the relationship between each wavelength and the response variable and rank the importance of wavelengths in the target classification [60]. Hence, given the high accuracy of all algorithms in seed classification, no particular algorithm can be considered superior to the others, and the appropriate algorithm can be chosen based on specific research or operational goals. Various classification methods have been employed in other studies to classify seed viability. For instance, corn seed viability was classified using SVM, KNN, random forest, and a deep convolutional neural network (CNN), with the best accuracy achieved using CNN [35].
Peanut seed viability was classified using SVM, DT, and LDA, and the best accuracy was obtained using the DT classifier [35]. Hyperspectral images were used to identify damaged rice seeds with SVM, KNN, DT, and deep forest (DF) classifiers; the DF classifier, a newer method developed from the DT classifier, provided higher accuracy than the others [84]. The germination vigor of sugar beet seeds was predicted using hyperspectral images and KNN, SVM, and RF classifiers, with the SVM classifier providing the best performance [77].
Tables 5–8 display the confusion matrices for the DT–DSOS, MP–FOA, LR–DSOS, and KNN–DSOS classifications, respectively, enabling a clear comparison between the best and worst classifications in each treatment. The most accurate diagnosis was obtained for the third senescence period in both seed varieties, and seeds with poor viability were correctly identified and distinguished from healthy seeds. The correct identification of healthy seeds was also acceptable, with most misclassified seeds belonging to the second senescence period. Overall, diagnostic accuracy did not differ significantly between the seed varieties.

4. Conclusions

In this research, Vis/NIR spectroscopic technology in absorption mode was utilized to detect aging and classify two varieties of peanut seeds. The results revealed significant differences between the spectral curves of the aging treatments, with healthy (young) seeds displaying higher absorption compared to unhealthy (artificially aged) seeds. Notably, the differences in absorption properties due to aging were so distinctive that the two seed varieties did not hinder the identification of healthy seeds from unhealthy ones. Based on the absorption spectrum curves of different treatments, it can be concluded that seed aging induces changes in its chemical components, which can be effectively detected through Vis/NIR spectroscopy.
This study employed metaheuristic optimization algorithms for optimal wavelength selection. All of these algorithms proved highly capable of identifying the optimal wavelengths and modeled the seed's allometric coefficient with excellent accuracy, achieving correlation coefficients higher than 0.985 and errors lower than 0.0036. The main differences among the algorithms therefore lie in how well each one matches the user's specific objectives. The ICA, PSO, and GA algorithms had much shorter execution times than the others (2.3635, 2.5181, and 3.4135, respectively), but they selected 60% more optimal wavelengths. These algorithms therefore appear more suitable for online and fast applications. On the other hand, the LCA, FOA, and CUK algorithms selected the fewest optimal wavelengths, but their execution times were significantly longer (33.3829, 17.1878, and 44.9389, respectively). This limitation may be a significant drawback in online applications.
In this study, LR, SVM, KNN, DT, MP, and naive Bayes classifiers were utilized for seed classification. The highest accuracies were achieved via the DSOS–DT and FOA–MP methods, with accuracies of 0.993 and 0.983, respectively. Conversely, the lowest accuracies were obtained with the DSOS–LR and DSOS–KNN methods, with values of 0.958 and 0.961, respectively. These results indicate that the DT and MP algorithms exhibit superior performance in seed classification and that their performance remains unaffected by an increase in the number of wavelengths. The other classifiers, however, appear to be sensitive to the number of wavelengths, which leads to relatively weak classification performance. In conclusion, combining Vis/NIR spectroscopy with machine learning and variable selection algorithms offers promising prospects for practical seed viability detection tools.

Author Contributions

Conceptualization, M.R.-S., A.M. and M.T.; methodology, M.R.-S., Y.A.-G., A.M. and M.T.; software, M.R.-S., A.M. and M.T.; validation, M.R.-S., Y.A.-G. and A.M.; formal analysis, Y.A.-G.; investigation, Y.A.-G. and M.R.-S.; resources, M.R.-S.; data curation, M.R.-S. and Y.A.-G.; writing—original draft preparation, M.R.-S., A.M. and M.T.; writing—review and editing, Y.A.-G., M.M.-A., M.H.-H. and J.L.H.-H.; visualization, M.R.-S.; supervision, Y.A.-G.; project administration, Y.A.-G.; funding acquisition, Y.A.-G., M.M.-A., M.H.-H. and J.L.H.-H. All authors have read and agreed to the published version of the manuscript.

Funding

This study was financially supported by the University of Mohaghegh Ardabili.

Data Availability Statement

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Zou, S.; Tseng, Y.-C.; Zare, A.; Rowland, D.L.; Tillman, B.L.; Yoon, S.-C. Peanut maturity classification using hyperspectral imagery. Biosyst. Eng. 2019, 188, 165–177.
  2. Deepa, R.; Anandhi, A.; Bailey, N.O.; Grace, J.M., III; Betiku, O.C.; Muchovej, J.J. Potential environmental impacts of peanut using water footprint assessment: A case study in Georgia. Agronomy 2022, 12, 930.
  3. Wang, X.; Chen, C.Y.; Dang, P.; Carter, J.; Zhao, S.; Lamb, M.C.; Chu, Y.; Holbrook, C.; Ozias-Akins, P.; Isleib, T.G. Variabilities in symbiotic nitrogen fixation and carbon isotope discrimination among peanut (Arachis hypogaea L.) genotypes under drought stress. J. Agron. Crop Sci. 2023, 209, 228–241.
  4. Tan, X.L.; Azam-Ali, S.; Goh, E.V.; Mustafa, M.; Chai, H.H.; Ho, W.K.; Mayes, S.; Mabhaudhi, T.; Azam-Ali, S.; Massawe, F. Bambara groundnut: An underutilized leguminous crop for global food security and nutrition. Front. Nutr. 2020, 7, 601496.
  5. Rahman, A.; Cho, B.-K. Assessment of seed quality using non-destructive measurement techniques: A review. Seed Sci. Res. 2016, 26, 285–305.
  6. Kaur, N.; Erickson, T.E.; Ball, A.S.; Ryan, M.H. A review of germination and early growth as a proxy for plant fitness under petrogenic contamination—Knowledge gaps and recommendations. Sci. Total Environ. 2017, 603, 728–744.
  7. Bastos, L.L.d.S.; Calvi, G.P.; Lima Júnior, M.d.J.V.; Ferraz, I.D.K. Degree of seed desiccation sensitivity of the Amazonian palm Oenocarpus bacaba depends on the criterion for germination. Acta Amaz. 2021, 51, 85–90.
  8. Carrera-Castaño, G.; Calleja-Cabrera, J.; Pernas, M.; Gómez, L.; Oñate-Sánchez, L. An updated overview on the regulation of seed germination. Plants 2020, 9, 703.
  9. Marcos Filho, J. Seed vigor testing: An overview of the past, present and future perspective. Sci. Agric. 2015, 72, 363–374.
  10. Moghaddam, S.S.; Rahimi, A.; Noorhosseini, S.A.; Heydarzadeh, S.; Mirzapour, M. Effect of seed priming with salicylic acid on germinability and seedling vigor fenugreek (Trigonella Foenum-Graecum). Yuz. Yıl Univ. J. Agric. Sci. 2018, 28, 192–199.
  11. Ebrahimi, M.; Miri, E. Effect of humic acid on seed germination and seedling growth of Borago officinalis and Cichorium intybus. Ecopersia 2016, 4, 1239–1249.
  12. Mohajeri, F.; Taghvaei, M.; Ramrudi, M.; Galavi, M. Effect of priming duration and concentration on germination behaviors of (Phaseolus vulgaris L.) seeds. Int. J. Ecol. Environ. Conserv. 2016, 22, 603–609.
  13. Xia, Y.; Xu, Y.; Li, J.; Zhang, C.; Fan, S. Recent advances in emerging techniques for non-destructive detection of seed viability: A review. Artif. Intell. Agric. 2019, 1, 35–47.
  14. Li, A.; Yao, C.; Xia, J.; Wang, H.; Cheng, Q.; Penty, R.; Fainman, Y.; Pan, S. Advances in cost-effective integrated spectrometers. Light Sci. Appl. 2022, 11, 174.
  15. Abasi, S.; Minaei, S.; Jamshidi, B.; Fathi, D. Dedicated non-destructive devices for food quality measurement: A review. Trends Food Sci. Technol. 2018, 78, 197–205.
  16. El-Mesery, H.S.; Mao, H.; Abomohra, A.E.-F. Applications of non-destructive technologies for agricultural and food products quality inspection. Sensors 2019, 19, 846.
  17. Ali, M.M.; Hashim, N. Non-destructive methods for detection of food quality. In Future Foods; Elsevier: Amsterdam, The Netherlands, 2022; pp. 645–667.
  18. Wei, X.; Xu, N.; Wu, D.; He, Y. Determination of branched-amino acid content in fermented Cordyceps sinensis mycelium by using FT-NIR spectroscopy technique. Food Bioprocess Technol. 2014, 7, 184–190.
  19. Xia, Y.; Huang, W.; Fan, S.; Li, J.; Chen, L. Effect of spectral measurement orientation on online prediction of soluble solids content of apple using Vis/NIR diffuse reflectance. Infrared Phys. Technol. 2019, 97, 467–477.
  20. Kusumaningrum, D.; Lee, H.; Lohumi, S.; Mo, C.; Kim, M.S.; Cho, B.K. Non-destructive technique for determining the viability of soybean (Glycine max) seeds using FT-NIR spectroscopy. J. Sci. Food Agric. 2018, 98, 1734–1742.
  21. Ambrose, A.; Kandpal, L.M.; Kim, M.S.; Lee, W.-H.; Cho, B.-K. High speed measurement of corn seed viability using hyperspectral imaging. Infrared Phys. Technol. 2016, 75, 173–179.
  22. Shrestha, S.; Deleuran, L.C.; Gislum, R. Separation of viable and non-viable tomato (Solanum lycopersicum L.) seeds using single seed near-infrared spectroscopy. Comput. Electron. Agric. 2017, 142, 348–355.
  23. Lakshmanan, M.K.; Boelt, B.; Gislum, R. A chemometric method for the viability analysis of spinach seeds by near infrared spectroscopy with variable selection using successive projections algorithm. J. Near Infrared Spectrosc. 2023, 31, 24–32.
  24. Larios, G.; Nicolodelli, G.; Ribeiro, M.; Canassa, T.; Reis, A.R.; Oliveira, S.L.; Alves, C.Z.; Marangoni, B.S.; Cena, C. Soybean seed vigor discrimination by using infrared spectroscopy and machine learning algorithms. Anal. Methods 2020, 12, 4303–4309.
  25. Oliveira, I.C.; Franca, T.; Nicolodelli, G.; Morais, C.P.; Marangoni, B.; Bacchetta, G.; Milori, D.M.; Alves, C.Z.; Cena, C. Fast and accurate discrimination of Brachiaria brizantha (A. Rich.) Stapf seeds by molecular spectroscopy and machine learning. ACS Agric. Sci. Technol. 2021, 1, 443–448.
  26. Larios, G.S.; Nicolodelli, G.; Senesi, G.S.; Ribeiro, M.C.; Xavier, A.A.; Milori, D.M.; Alves, C.Z.; Marangoni, B.S.; Cena, C. Laser-induced breakdown spectroscopy as a powerful tool for distinguishing high-and low-vigor soybean seed lots. Food Anal. Methods 2020, 13, 1691–1698.
  27. Cioccia, G.; Pereira de Morais, C.; Babos, D.V.; Milori, D.M.B.P.; Alves, C.Z.; Cena, C.; Nicolodelli, G.; Marangoni, B.S. Laser-Induced Breakdown Spectroscopy Associated with the Design of Experiments and Machine Learning for Discrimination of Brachiaria brizantha Seed Vigor. Sensors 2022, 22, 5067.
  28. Li, L.; Chen, S.; Deng, M.; Gao, Z. Optical techniques in non-destructive detection of wheat quality: A review. Grain Oil Sci. Technol. 2022, 5, 44–57.
  29. Rostami, M.; Berahmand, K.; Forouzandeh, S. A novel method of constrained feature selection by the measurement of pairwise constraints uncertainty. J. Big Data 2020, 7, 83.
  30. Rostami, M.; Berahmand, K.; Nasiri, E.; Forouzandeh, S. Review of swarm intelligence-based feature selection methods. Eng. Appl. Artif. Intell. 2021, 100, 104210.
  31. Gokalp, O.; Tasci, E.; Ugur, A. A novel wrapper feature selection algorithm based on iterated greedy metaheuristic for sentiment classification. Expert Syst. Appl. 2020, 146, 113176.
  32. Tang, X.; Dai, Y.; Xiang, Y. Feature selection based on feature interactions with application to text categorization. Expert Syst. Appl. 2019, 120, 207–216.
  33. Liu, Y.; Nie, F.; Gao, Q.; Gao, X.; Han, J.; Shao, L. Flexible unsupervised feature extraction for image classification. Neural Netw. 2019, 115, 65–71.
  34. Chen, R.-C.; Dewi, C.; Huang, S.-W.; Caraka, R.E. Selecting critical features for data classification based on machine learning methods. J. Big Data 2020, 7, 52.
  35. Zou, Z.; Chen, J.; Zhou, M.; Zhao, Y.; Long, T.; Wu, Q.; Xu, L. Prediction of peanut seed vigor based on hyperspectral images. Food Sci. Technol. 2022, 42, e32822.
  36. Nicolai, B.M.; Beullens, K.; Bobelyn, E.; Peirs, A.; Saeys, W.; Theron, K.I.; Lammertyn, J. Nondestructive measurement of fruit and vegetable quality by means of NIR spectroscopy: A review. Postharvest Biol. Technol. 2007, 46, 99–118.
  37. Hampton, J.G.; Tekrony, D.M. Handbook of Vigour Test Methods; The International Seed Testing Association: Zurich, Switzerland, 1995.
  38. Panwar, P.; Bhardwaj, S. Handbook of Practical Forestry; Agrobios: Jodhpur, India, 2005.
  39. Maguire, J.D. Speed of germination-aid in selection and evaluation for seedling emergence and vigor. Crop Sci. 1962, 2, 176–177.
  40. Hunter, E.; Glasbey, C.; Naylor, R. The analysis of data from germination tests. J. Agric. Sci. 1984, 102, 207–213.
  41. Chu, Y.W.; Tang, S.S.; Ma, S.X.; Ma, Y.Y.; Hao, Z.Q.; Guo, Y.M.; Guo, L.B.; Lu, Y.F.; Zeng, X.Y. Accuracy and stability improvement for meat species identification using multiplicative scatter correction and laser-induced breakdown spectroscopy. Opt. Express 2018, 26, 10119–10127.
  42. Balamurugan, R.; Natarajan, A.; Premalatha, K. Stellar-mass black hole optimization for biclustering microarray gene expression data. Appl. Artif. Intell. 2015, 29, 353–381.
  43. Sörensen, K. Metaheuristics—The metaphor exposed. Int. Trans. Oper. Res. 2015, 22, 3–18.
  44. Zhang, S.; Lee, C.K.; Chan, H.K.; Choy, K.L.; Wu, Z. Swarm intelligence applied in green logistics: A literature review. Eng. Appl. Artif. Intell. 2015, 37, 154–169.
  45. Hu, Y.; Zheng, J.; Zou, J.; Yang, S.; Ou, J.; Wang, R. A dynamic multi-objective evolutionary algorithm based on intensity of environmental change. Inf. Sci. 2020, 523, 49–62.
  46. Wang, C.; Pan, H.; Su, Y. A many-objective evolutionary algorithm with diversity-first based environmental selection. Swarm Evol. Comput. 2020, 53, 100641.
  47. Zhang, L.; Sun, H.; Rao, Z.; Ji, H. Hyperspectral imaging technology combined with deep forest model to identify frost-damaged rice seeds. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2020, 229, 117973.
  48. Gong, D.; Xu, B.; Zhang, Y.; Guo, Y.; Yang, S. A similarity-based cooperative co-evolutionary algorithm for dynamic interval multiobjective optimization problems. IEEE Trans. Evol. Comput. 2019, 24, 142–156.
  49. Yong, Z.; Dun-wei, G.; Wan-qiu, Z. Feature selection of unreliable data using an improved multi-objective PSO algorithm. Neurocomputing 2016, 171, 1281–1290.
  50. Masoudi-Sobhanzadeh, Y.; Motieghader, H. World Competitive Contests (WCC) algorithm: A novel intelligent optimization algorithm for biological and non-biological problems. Inform. Med. Unlocked 2016, 3, 15–28.
  51. Kashan, A.H. League Championship Algorithm (LCA): An algorithm for global optimization inspired by sport championships. Appl. Soft Comput. 2014, 16, 171–200.
  52. McCall, J. Genetic algorithms for modelling and optimisation. J. Comput. Appl. Math. 2005, 184, 205–222.
  53. Shi, Y. Particle swarm optimization: Developments, applications and resources. In Proceedings of the 2001 Congress on Evolutionary Computation (IEEE Cat. No. 01TH8546), Seoul, Republic of Korea, 27–30 May 2001; pp. 81–86.
  54. Dorigo, M.; Bonabeau, E.; Theraulaz, G. Ant algorithms and stigmergy. Future Gener. Comput. Syst. 2000, 16, 851–871.
  55. Atashpaz-Gargari, E.; Lucas, C. Imperialist competitive algorithm: An algorithm for optimization inspired by imperialistic competition. In Proceedings of the 2007 IEEE Congress on Evolutionary Computation, Singapore, 25–28 September 2007; pp. 4661–4667.
  56. Beigy, H.; Meybodi, M.R. Cellular learning automata with multiple learning automata in each cell and its applications. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 2009, 40, 54–65.
  57. Patel, V.K.; Savsani, V.J. Heat transfer search (HTS): A novel optimization algorithm. Inf. Sci. 2015, 324, 217–246.
  58. Ghaemi, M.; Feizi-Derakhshi, M.-R. Feature selection using forest optimization algorithm. Pattern Recognit. 2016, 60, 121–129.
  59. Masoudi-Sobhanzadeh, Y.; Motieghader, H.; Masoudi-Nejad, A. FeatureSelect: A software for feature selection based on machine learning approaches. BMC Bioinform. 2019, 20, 170.
  60. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
  61. Park, H.; Son, J.-H. Machine learning techniques for THz imaging and time-domain spectroscopy. Sensors 2021, 21, 1186.
  62. Dreiseitl, S.; Ohno-Machado, L. Logistic regression and artificial neural network classification models: A methodology review. J. Biomed. Inform. 2002, 35, 352–359.
  63. Priyanka; Kumar, D. Decision tree classifier: A detailed survey. Int. J. Inf. Decis. Sci. 2020, 12, 246–269.
  64. Cervantes, J.; Garcia-Lamont, F.; Rodríguez-Mazahua, L.; Lopez, A. A comprehensive survey on support vector machine classification: Applications, challenges and trends. Neurocomputing 2020, 408, 189–215.
  65. Cunningham, P.; Delany, S.J. k-Nearest neighbour classifiers—A Tutorial. ACM Comput. Surv. (CSUR) 2021, 54, 128.
  66. Wickramasinghe, I.; Kalutarage, H. Naive Bayes: Applications, variations and vulnerabilities: A review of literature with code snippets for implementation. Soft Comput. 2021, 25, 2277–2293.
  67. Kumar, A.; Alsadoon, A.; Prasad, P.; Abdullah, S.; Rashid, T.A.; Pham, D.T.H.; Nguyen, T.Q.V. Generative adversarial network (GAN) and enhanced root mean square error (ERMSE): Deep learning for stock price movement prediction. Multimed. Tools Appl. 2022, 81, 3995–4013.
  68. Karunasingha, D.S.K. Root mean square error or mean absolute error? Use their ratio as well. Inf. Sci. 2022, 585, 609–629.
  69. Menard, S. Coefficients of determination for multiple logistic regression analysis. Am. Stat. 2000, 54, 17–24.
  70. Dhakate, P.P.; Patil, S.; Rajeswari, K.; Abin, D. Preprocessing and Classification in WEKA using different classifiers. Int. J. Eng. Res. Appl. 2014, 4, 91–93.
  71. Ferreira, A.J.; Figueiredo, M.A. Efficient feature selection filters for high-dimensional data. Pattern Recognit. Lett. 2012, 33, 1794–1804.
  72. Kumar, C.A.; Sooraj, M.; Ramakrishnan, S. A comparative performance evaluation of supervised feature selection algorithms on microarray datasets. Procedia Comput. Sci. 2017, 115, 209–217.
  73. Majd, A.; Sahebi, G.; Daneshtalab, M.; Plosila, J.; Lotfi, S.; Tenhunen, H. Parallel imperialist competitive algorithms. Concurr. Comput. Pract. Exp. 2018, 30, e4393.
  74. Atashpendar, A.; Dorronsoro, B.; Danoy, G.; Bouvry, P. A scalable parallel cooperative coevolutionary PSO algorithm for multi-objective optimization. J. Parallel Distrib. Comput. 2018, 112, 111–125.
  75. Ramdania, D.; Irfan, M.; Alfarisi, F.; Nuraiman, D. Comparison of genetic algorithms and Particle Swarm Optimization (PSO) algorithms in course scheduling. J. Phys. Conf. Ser. 2019, 1402, 022079.
  76. Pang, L.; Wang, J.; Men, S.; Yan, L.; Xiao, J. Hyperspectral imaging coupled with multivariate methods for seed vitality estimation and forecast for Quercus variabilis. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2021, 245, 118888.
  77. Fu, J.; Yu, H.-D.; Chen, Z.; Yun, Y.-H. A review on hybrid strategy-based wavelength selection methods in analysis of near-infrared spectral data. Infrared Phys. Technol. 2022, 125, 104231.
  78. Yang, J.; Sun, L.; Xing, W.; Feng, G.; Bai, H.; Wang, J. Hyperspectral prediction of sugarbeet seed germination based on gauss kernel SVM. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2021, 253, 119585.
  79. Saputri, D.A.S.; Pahlawan, M.F.R.; Murti, B.M.; Masithoh, R.E. Vis/NIR spectroscopy for non-destructive method in detecting soybean seeds viability. IOP Conf. Ser. Earth Environ. Sci. 2022, 1038, 012043.
  80. Pahlawan, M.; Wati, R.; Masithoh, R. Development of a low-cost modular VIS/NIR spectroscopy for predicting soluble solid content of banana. IOP Conf. Ser. Earth Environ. Sci. 2021, 644, 012047.
  81. Wati, R.; Pahlawan, M.; Masithoh, R. Development of calibration model for pH content of intact tomatoes using a low-cost Vis/NIR spectroscopy. IOP Conf. Ser. Earth Environ. Sci. 2021, 686, 012049.
  82. Savi, A.; De Aguiar, L.; Tonial, L.; Lafay, C.; Assmann, T.; De Bortolli, M. Fast and Non-Destructive Determination of N, P, and K in Sorghum, Oat, and Corn Residue Using Near-Infrared Spectroscopy. J. Agric. Sci. 2019, 11, 304.
  83. Arora, R. Comparative analysis of classification algorithms on different datasets using WEKA. Int. J. Comput. Appl. 2012, 54, 21–25.
  84. Neo, E.R.K.; Yeo, Z.; Low, J.S.C.; Goodship, V.; Debattista, K. A review on chemometric techniques with infrared, Raman and laser-induced breakdown spectroscopy for sorting plastic waste in the recycling industry. Resour. Conserv. Recycl. 2022, 180, 106217.
Figure 1. Flowchart of research implementation stages.
Figure 2. The spectrum acquisition process along with the probe-detector sensor.
Figure 3. Raw and preprocessed VIS/NIR spectra of peanut seeds, (a) NC-2 raw spectrum, (b) Florispan raw spectrum, (c) NC-2 preprocessing, and (d) Florispan preprocessing.
Figure 4. Comparison of correlation and RMSE of variable selection algorithms based on convergence, mean convergence, and stability.
Figure 5. Averages of Vis/NIR absorption spectra and the location of the selected wavelengths.
Figure 6. Results of classification for (a) seed viability identification and (b) FOA algorithm selected wavelength.
Table 1. Calculation relationships of the studied indicators.
| The Studied Index | Equation | References |
|---|---|---|
| Germination Energy | $GE = \frac{MCGP}{N} \times 100$ | [10] |
| Germination Value | $GV = MDG \times PV$ | [37] |
| Germination Vigor | $GVI = \frac{GP \times \mathrm{Mean}(PL + PR)}{100}$ | [38] |
| Allometric Coefficient | $AC = \frac{PL}{PR}$ | [12] |
| Daily Germination Speed | $DGS = \frac{1}{MDG}$ | [39] |
| Mean Daily Germination | $MDG = \frac{GP}{T}$ | [40] |
Where MCGP is the maximum cumulative germination percentage, N is the total number of seeds sown, ti is the number of days after the start of germination, GP is the final germination percentage, T is the length of the germination period (days), SFW is the seedling wet weight (grams), SDW is the seedling dry weight (grams), PL is the seedling length (centimeters), and PR is the root length (centimeters).
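As a rough illustration of how three of the Table 1 relationships chain together, the sketch below computes MDG, DGS, and GV in Python. The function names are hypothetical. The inputs T = 8 days and PV = 22 are assumptions: the section does not state them, and they were back-solved here because they reproduce the values reported in Table 2 for NC-2 in Period 1 (GP = 91, MDG = 11.3750, GV = 250.2500).

```python
def mean_daily_germination(gp, t):
    """MDG = GP / T, with GP in percent and T in days."""
    return gp / t


def daily_germination_speed(mdg):
    """DGS = 1 / MDG."""
    return 1 / mdg


def germination_value(mdg, pv):
    """GV = MDG * PV, with PV as used in Table 1."""
    return mdg * pv


# GP is taken from Table 2 (NC-2, Period 1).
# T and PV are ASSUMED values, not stated in the article.
gp, t, pv = 91, 8, 22

mdg = mean_daily_germination(gp, t)
dgs = daily_germination_speed(mdg)
gv = germination_value(mdg, pv)

print(mdg, round(dgs, 4), gv)
```

Under these assumed inputs the script prints 11.375, 0.0879, and 250.25, which agrees with the corresponding Table 2 entries to the reported precision.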
Table 2. Results of seed viability indices for peanut seeds.
| Variety | Period | Number | Germination Percentage | Germination Energy | Mean Daily Germination | Germination Value | Daily Germination Speed | Germination Vigor |
|---|---|---|---|---|---|---|---|---|
| NC-2 | 1 | 100 | 91 | 86 | 11.3750 | 250.2500 | 0.0879 | 7.0752 |
| NC-2 | 2 | 100 | 69 | 61 | 8.6250 | 146.6250 | 0.1159 | 3.7826 |
| NC-2 | 3 | 100 | 47 | 34 | 5.8750 | 64.6250 | 0.1702 | 0.9795 |
| Florispan | 1 | 100 | 94 | 92 | 11.7500 | 329.0000 | 0.0851 | 9.5836 |
| Florispan | 2 | 100 | 72 | 63 | 9.000 | 153.0000 | 0.1111 | 4.0761 |
| Florispan | 3 | 100 | 52 | 65 | 6.5000 | 84.5000 | 0.1538 | 2.0072 |
Table 3. The results obtained from the variable selection algorithms for the regression problem.
| AL | NOF | Wavelengths (nm) | ET | RMSE | CR (R²) |
|---|---|---|---|---|---|
| WCC | 14 | 704, 694, 775, 835, 1025, 991, 906, 824, 852, 738, 795, 699, 963, 767 | 23.3603 | 0.0028 | 0.9870 |
| LCA | 10 | 748, 915, 783, 967, 887, 869, 801, 696, 744, 883 | 44.9389 | 0.0025 | 0.9872 |
| GA | 16 | 870, 799, 783, 636, 846, 785, 734, 737, 762, 954, 827, 913, 714, 810, 904, 725 | 3.4135 | 0.0028 | 0.9868 |
| PSO | 15 | 911, 784, 992, 713, 839, 726, 928, 840, 691, 791, 963, 832, 775, 737, 817 | 2.5181 | 0.0026 | 0.9870 |
| ACO | 16 | 780, 759, 868, 777, 814, 704, 804, 982, 952, 775, 1017, 934, 685, 905, 800, 657 | 31.1699 | 0.0027 | 0.9870 |
| ICA | 16 | 791, 844, 786, 731, 929, 1003, 798, 675, 1022, 774, 710, 888, 777, 978, 901, 697 | 2.3635 | 0.0027 | 0.9867 |
| LA | 15 | 935, 790, 768, 796, 776, 955, 732, 818, 883, 694, 866, 1027, 783, 722, 824 | 33.8388 | 0.0025 | 0.9876 |
| HTS | 16 | 920, 828, 762, 804, 811, 503, 862, 837, 785, 779, 698, 846, 845, 957, 854, 633 | 14.3798 | 0.0028 | 0.9866 |
| FOA | 10 | 754, 825, 731, 778, 962, 902, 794, 738, 707, 856 | 17.1878 | 0.0025 | 0.9874 |
| DSOS | 16 | 915, 681, 672, 977, 815, 994, 956, 798, 939, 581, 522, 819, 690, 793, 760, 806 | 23.8145 | 0.0033 | 0.9854 |
| CUK | 12 | 874, 734, 706, 878, 1018, 775, 972, 742, 791, 843, 967, 723 | 33.3829 | 0.0027 | 0.9870 |

Where AL is the algorithm, NOF is the number of features, ET is the elapsed time, RMSE is the root mean squared error, and CR (R²) is the squared correlation coefficient.
Table 4. Classification results for determining seed viability.
| Algorithm | Classifiers | Accuracy | Precision | Sensitivity | Specificity | ROC Area |
|---|---|---|---|---|---|---|
| FOA | Logistic Regression | 0.9750 | 0.9260 | 0.9250 | 0.9850 | 0.9960 |
| FOA | Naive Bayes | 0.9770 | 0.9370 | 0.9330 | 0.9860 | 0.9970 |
| FOA | Decision trees | 0.9820 | 0.9460 | 0.9460 | 0.9890 | 0.9820 |
| FOA | k-Nearest Neighbors | 0.9790 | 0.9380 | 0.9380 | 0.9870 | 0.9630 |
| FOA | Support Vector Machines | 0.9770 | 0.9290 | 0.9210 | 0.9840 | 0.9860 |
| FOA | Multilayer Perceptron | 0.9830 | 0.9500 | 0.9500 | 0.9900 | 0.9980 |
| DSOS | Logistic Regression | 0.9580 | 0.8750 | 0.8750 | 0.9750 | 0.9880 |
| DSOS | Naive Bayes | 0.9710 | 0.9160 | 0.9130 | 0.9820 | 0.9930 |
| DSOS | Decision trees | 0.9930 | 0.9800 | 0.9790 | 0.9960 | 0.9890 |
| DSOS | k-Nearest Neighbors | 0.9610 | 0.8840 | 0.8830 | 0.9770 | 0.9300 |
| DSOS | Support Vector Machines | 0.9620 | 0.8990 | 0.8880 | 0.9770 | 0.9790 |
| DSOS | Multilayer Perceptron | 0.9830 | 0.9500 | 0.9500 | 0.9900 | 0.9980 |
Table 5. Confusion matrix for DT–DSOS classification.
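| Treatment | Number | a | b | c | d | e | f | True Positive Rate |
|---|---|---|---|---|---|---|---|---|
| a: NC-2, Period 1 | 40 | 38 | 2 | 0 | 0 | 0 | 0 | 95% |
| b: Florispan, Period 1 | 40 | 0 | 40 | 0 | 0 | 0 | 0 | 100% |
| c: NC-2, Period 2 | 40 | 0 | 1 | 38 | 1 | 0 | 0 | 95% |
| d: Florispan, Period 2 | 40 | 0 | 0 | 1 | 39 | 0 | 0 | 97.5% |
| e: NC-2, Period 3 | 40 | 0 | 0 | 0 | 0 | 40 | 0 | 100% |
| f: Florispan, Period 3 | 40 | 0 | 0 | 0 | 0 | 0 | 40 | 100% |
| False positive rate | | 0% | 1.5% | 0.5% | 0.5% | 0% | 0% | |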
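The true positive and false positive rates in Table 5 can be recomputed from the matrix counts. The sketch below is a minimal Python illustration, not the authors' code. It assumes rows are the true treatments and columns are the predicted treatments, which matches the placement of the rate row and column. The names `CM`, `LABELS`, and `per_class_rates` are hypothetical.

```python
LABELS = ["a", "b", "c", "d", "e", "f"]

# Counts copied from Table 5 (DT-DSOS).
# ASSUMPTION: rows are true classes, columns are predicted classes.
CM = [
    [38, 2, 0, 0, 0, 0],
    [0, 40, 0, 0, 0, 0],
    [0, 1, 38, 1, 0, 0],
    [0, 0, 1, 39, 0, 0],
    [0, 0, 0, 0, 40, 0],
    [0, 0, 0, 0, 0, 40],
]


def per_class_rates(cm):
    """Return a list of (TPR %, FPR %) tuples, one per class."""
    total = sum(sum(row) for row in cm)
    rates = []
    for k, row in enumerate(cm):
        tp = row[k]                                 # correctly classified samples of class k
        actual = sum(row)                           # all samples that truly belong to class k
        fp = sum(r[k] for r in cm) - tp             # other classes predicted as k
        negatives = total - actual                  # all samples that are not class k
        rates.append((100 * tp / actual, 100 * fp / negatives))
    return rates


rates = per_class_rates(CM)
for label, (tpr, fpr) in zip(LABELS, rates):
    print(f"{label}: TPR = {tpr:.1f}%, FPR = {fpr:.1f}%")
```

With 40 samples per treatment, each false positive rate is taken over the 200 samples of the other five treatments, so the three samples misassigned to class b give 3/200 = 1.5%.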
Table 6. Confusion matrix for MP–FOA classification.
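| Treatment | Number | a | b | c | d | e | f | True Positive Rate |
|---|---|---|---|---|---|---|---|---|
| a: NC-2, Period 1 | 40 | 37 | 3 | 0 | 0 | 0 | 0 | 92.5% |
| b: Florispan, Period 1 | 40 | 5 | 34 | 1 | 0 | 0 | 0 | 85% |
| c: NC-2, Period 2 | 40 | 0 | 1 | 38 | 1 | 0 | 0 | 95% |
| d: Florispan, Period 2 | 40 | 0 | 0 | 1 | 39 | 0 | 0 | 97.5% |
| e: NC-2, Period 3 | 40 | 0 | 0 | 0 | 0 | 40 | 0 | 100% |
| f: Florispan, Period 3 | 40 | 0 | 0 | 0 | 0 | 0 | 40 | 100% |
| False positive rate | | 2.5% | 2% | 1% | 0.5% | 0% | 0% | |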
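The section does not state how the summary metrics in Table 4 were aggregated across the six classes. One plausible reading, offered here only as an assumption, is one-vs-rest macro-averaging. The Python sketch below applies that assumption to the Table 6 counts. The function name `macro_metrics` is hypothetical, and rows are again assumed to be true classes.

```python
# Counts copied from Table 6 (MP-FOA).
# ASSUMPTION: rows are true classes, columns are predicted classes.
CM = [
    [37, 3, 0, 0, 0, 0],
    [5, 34, 1, 0, 0, 0],
    [0, 1, 38, 1, 0, 0],
    [0, 0, 1, 39, 0, 0],
    [0, 0, 0, 0, 40, 0],
    [0, 0, 0, 0, 0, 40],
]


def macro_metrics(cm):
    """Average one-vs-rest metrics over all classes (macro-averaging)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    sums = {"accuracy": 0.0, "precision": 0.0, "sensitivity": 0.0, "specificity": 0.0}
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                    # class k samples predicted as something else
        fp = sum(row[k] for row in cm) - tp     # other samples predicted as class k
        tn = total - tp - fn - fp
        sums["accuracy"] += (tp + tn) / total
        sums["precision"] += tp / (tp + fp)     # every class here has at least one prediction
        sums["sensitivity"] += tp / (tp + fn)
        sums["specificity"] += tn / (tn + fp)
    return {name: value / n for name, value in sums.items()}


metrics = macro_metrics(CM)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under this assumption the averages come out at about 0.983 accuracy, 0.950 precision, 0.950 sensitivity, and 0.990 specificity, which is consistent with the FOA Multilayer Perceptron row of Table 4. The agreement supports the macro-averaging reading but does not confirm that the authors computed the values this way.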
Table 7. Confusion matrix for LR–DSOS application.
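| Treatment | Number | a | b | c | d | e | f | True Positive Rate |
|---|---|---|---|---|---|---|---|---|
| a: NC-2, Period 1 | 40 | 38 | 2 | 0 | 0 | 0 | 0 | 95% |
| b: Florispan, Period 1 | 40 | 2 | 35 | 3 | 0 | 0 | 0 | 87.5% |
| c: NC-2, Period 2 | 40 | 0 | 4 | 31 | 5 | 0 | 0 | 77.5% |
| d: Florispan, Period 2 | 40 | 0 | 0 | 5 | 32 | 3 | 0 | 80% |
| e: NC-2, Period 3 | 40 | 0 | 0 | 0 | 3 | 36 | 1 | 90% |
| f: Florispan, Period 3 | 40 | 0 | 0 | 0 | 0 | 2 | 38 | 95% |
| False positive rate | | 1% | 3% | 4% | 4% | 2.5% | 0.5% | |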
Table 8. Confusion matrix for KNN–DSOS application.
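| Treatment | Number | a | b | c | d | e | f | True Positive Rate |
|---|---|---|---|---|---|---|---|---|
| a: NC-2, Period 1 | 40 | 34 | 6 | 0 | 0 | 0 | 0 | 85% |
| b: Florispan, Period 1 | 40 | 6 | 33 | 1 | 0 | 0 | 0 | 82.5% |
| c: NC-2, Period 2 | 40 | 0 | 3 | 36 | 1 | 0 | 0 | 90% |
| d: Florispan, Period 2 | 40 | 0 | 0 | 2 | 35 | 3 | 0 | 87.5% |
| e: NC-2, Period 3 | 40 | 0 | 0 | 0 | 3 | 35 | 2 | 87.5% |
| f: Florispan, Period 3 | 40 | 0 | 0 | 0 | 0 | 1 | 39 | 97.5% |
| False positive rate | | 3% | 4.5% | 1.5% | 2% | 2% | 1% | |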
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
