1. Introduction
Seeds are the most fundamental means of agricultural production [
1], and seed vigor is a comprehensive index to measure seed quality [
2], which is a comprehensive evaluation of the activity intensity and characteristics of seeds during germination and emergence, including seed germination percentage, emergence percentage, seedling growth potential, plant resistance and yield potential, etc. [
3]. Seed vigor first increases and then decreases during seed growth, development and storage. In the seed development stage, seed vigor rises with increasing seed maturity and reaches its peak at physiological maturity [
4]. In the storage stage after full maturity, seed vigor decreases with the storage time due to natural aging [
5]. Understanding how seed vigor changes, and evaluating it accurately and quickly at different stages, is important for judging seed maturity during development and for the safe storage of seed after maturity. However, seed vigor is very difficult to evaluate and quantify accurately and comprehensively because of the complex characteristics mentioned above. The evaluation and determination of seed vigor has therefore always been a difficult problem in the field of seed quality testing. At present, the standard seed vigor testing methods recognized by the International Seed Testing Association (ISTA) include the electrical conductivity test (for chickpea (Cicer arietinum), soybean (Glycine max), kidney bean (Phaseolus vulgaris), pea (Pisum sativum), radish (Raphanus sativus), etc.), the artificial accelerated aging method (for soybean seeds, etc.), the controlled deterioration method (for Brassica seeds), radicle emergence (for corn (Zea mays), oilseed rape (Brassica napus), radish seeds, etc.), and tetrazolium staining (for soybean seeds) [6]. For species other than those mentioned above, seed vigor evaluation is mostly still at the stage of experimental research, and there is no definite standard testing method.
With the development of science and technology and of interdisciplinary applications, multispectral imaging (MSI), a new non-destructive physical technique, offers a new approach to seed quality evaluation. The technology acquires spectral information at 19 wavelengths (365, 405, 430, 450, 470, 490, 515, 540, 570, 590, 630, 645, 660, 690, 780, 850, 880, 940, 970 nm) and integrates traditional vision with spectral technology: it simultaneously obtains the spatial and spectral information of the object, quickly and accurately determines the surface characteristics of the target, detects its internal chemical composition, and displays their differences and changes. In recent years, many scholars have explored the application of this technique in seed vigor determination. Multispectral imaging combined with an LDA model can accurately distinguish aged and unaged cowpea (
Vigna unguiculata) seeds, and it has a high overall correct classification rate in predicting high vigor seeds, medium vigor seeds and dead seeds [
7]. Cong [
8] combined the method of seedling evaluation with the nCDA model in the multispectral software, and then predicted the perennial ryegrass (
Lolium perenne) seeds with high, medium and low vigor with high accuracy. Galletti et al. [
9] used a two-component principal component analysis (PCA) for exploratory analysis of multispectral imaging data, selected the 5 most significant of the 19 wavelengths, and classified high and low vigor seeds with high precision using a QDA-based model. The accuracy was 86–95% in tomato seeds (
Solanum lycopersicum) and 88~97% in carrot seeds (
Daucus carota). Liu et al. [
10] used PCA, LS-SVM, BPNN and RF models to evaluate the seed quality of watermelon (
Citrullus lanatus), and the results showed significant differences between high-quality watermelon seeds and the other watermelon seeds (including dead seeds and low vigor seeds). Among the models used in that study, the LS-SVM and RF models had high classification accuracy. These studies indicate that multispectral imaging has good application prospects in seed vigor testing.
Alfalfa (
Medicago sativa) is a perennial herb of the genus Medicago in the legume family, known as the “King of forage” due to its rich protein content [
11]. Because of its good forage, ecological and medicinal value, it is widely cultivated around the world, so timely and accurate monitoring of its seed vigor is very important for planting, transportation and storage. “Artificial Accelerated Aging Determination of Vigor of Grass Seed Testing Procedures” (NY/T 3187-2018) is currently the only standard issued in China that can be used to test the seed vigor of eight grass species, including alfalfa. However, it has a narrow scope of application, involves cumbersome operating steps, and is destructive to seeds. Therefore, building on the research on seed vigor determination described above, this study explored and evaluated the feasibility of identifying and testing alfalfa seeds at different vigor levels using multispectral imaging, laying a foundation for applying multispectral imaging to the testing of seed vigor and other important quality attributes.
2. Materials and Methods
2.1. Materials
“Zhongmu No.1” alfalfa seeds from different harvest years (2004, 2008 and 2019) and different maturity stages (green ripe, yellow ripening and full ripe) were used as experimental materials. All seeds came from the Alfalfa Seed Production Base (39°12′ N, 106°95′ E) of the Institute of Animal Science, Chinese Academy of Agricultural Sciences, Saiusu, Etoke Banner, Inner Mongolia, at an altitude of 1150 m, with an average annual temperature of 8.5 °C, average annual relative humidity of 52%, and average annual precipitation of 225 mm. We found that alfalfa pods formed from bottom to top, and that the pods at sections 2, 3 and 4 below the top of the plant were relatively full at the green ripe stage. Therefore, seed samples of the three stages were taken from these relatively full pods and dried indoors, and the abnormal pods in the inflorescence were removed. The green ripe sample was collected on 3 July 2021, the yellow ripening sample on 12 July 2021, and the full ripe sample on 31 July 2021. After harvest, the seeds were stored in the Forage Seed Laboratory of China Agricultural University at an average temperature of 25 °C and an average humidity of 35%.
2.2. Multispectral Imaging
The multispectral imaging instrument VideometerLab4 (Videometer A/S, Herlev, Denmark) was used to acquire images of the seed samples. Calibration and light setup of the instrument were carried out before imaging. Absolute reflectance was calibrated with a bright and a dark reference object, and geometric alignment was performed with a dotted plate. The 660/700 nm and 405/600 nm excitation/emission combinations were then added to the default light settings to capture fluorescence images of chlorophyll A and chlorophyll B [
9,
12]. This was a completely independent analysis, and the data obtained were not involved in multivariate analysis. The seed sample was placed on the bottom of the instrument sphere and a high-resolution, multispectral original image of 2056 × 2056 pixels was obtained in seconds (
Supplementary Figures S1–S6). The raw image was acquired at 19 wavelengths (365, 405, 430, 450, 470, 490, 515, 540, 570, 590, 630, 645, 660, 690, 780, 850, 880, 940, 970 nm) and was then processed. The Blob tool of the VideometerLab software was used to separate the seed sample from the irrelevant background, and the area, aspect ratio, color index, saturation, mean spectral reflectance, pixel distribution histogram and other measurements of each seed were obtained (
Supplementary Tables S1 and S2).
In total, 400 seeds were randomly taken from each batch; 50 seeds at a time were placed in a Petri dish and put into the instrument. The morphological and spectral data of every seed were extracted and exported. The seed germination test followed the order in which the seeds were placed on the Petri dishes, so that the germination result and the multispectral data of every seed corresponded one to one. The test was conducted in September 2021.
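The one-to-one correspondence described above can be kept by keying every record on its Petri dish and its position within the dish. The following is only a minimal sketch under that assumption: the `SeedRecord` type, the `pair_records` helper, the field names and the sample values are hypothetical and do not reflect the actual VideometerLab export format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeedRecord:
    dish: int        # Petri dish number within a batch (8 dishes of 50 seeds)
    position: int    # seed position within the dish (1..50)
    area_mm2: float  # hypothetical morphological feature from imaging
    outcome: str     # category recorded in the germination test


def pair_records(spectral, germination):
    """Join imaging data and germination outcomes on (dish, position)."""
    if set(spectral) != set(germination):
        raise ValueError("imaging and germination keys do not match one to one")
    return [
        SeedRecord(dish, pos, spectral[(dish, pos)], germination[(dish, pos)])
        for (dish, pos) in sorted(spectral)
    ]


# Illustrative values only, not data from the study.
spectral = {(1, 1): 2.91, (1, 2): 3.05, (2, 1): 2.40}
germination = {(1, 1): "normal", (1, 2): "hard", (2, 1): "dead"}
records = pair_records(spectral, germination)
```

Raising an error on mismatched keys is a deliberate choice in this sketch: a seed that was imaged but not scored (or the reverse) would otherwise silently shift the labels used for model training.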
2.3. Determination of Seed Germination Characteristics
Germination tests of alfalfa seeds harvested in different years (2004, 2008 and 2019) and harvested in 2021 at different maturity stages (green ripe, yellow ripening and full ripe) were conducted according to the ISTA International Rules for Seed Testing 2021 at 20 °C with 8 h of light and 16 h of darkness, from October to November 2021. Data for 400 seeds taken at random from each sample were collected using multispectral imaging before the germination test. Normal seedlings, abnormal seedlings, hard seeds, fresh ungerminated seeds and dead seeds were recorded at the last count, and the germination potential and germination percentage were calculated according to the following formula:
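The formula itself is not reproduced in this text. As a hedged illustration only, the sketch below computes a germination percentage from final-count categories, assuming it is the share of normal seedlings among all seeds tested, with an optional variant that also counts hard seeds. The function name, the category keys and the counts are hypothetical. Germination potential is omitted because the first-count day it depends on is not stated here.

```python
def germination_percentage(counts, include_hard=False):
    """Percentage of seeds counted as germinated at the final count.

    counts: mapping from final-count category to number of seeds.
    include_hard: if True, hard seeds are added to normal seedlings
    (an assumption, offered as an alternative reporting convention).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no seeds counted")
    germinated = counts.get("normal", 0)
    if include_hard:
        germinated += counts.get("hard", 0)
    return 100.0 * germinated / total


# Illustrative counts for one 400-seed sample, not data from the study.
counts = {"normal": 312, "abnormal": 20, "hard": 40, "fresh": 8, "dead": 20}
gp_normal_only = germination_percentage(counts)
gp_with_hard = germination_percentage(counts, include_hard=True)
```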
2.4. Artificial Accelerated Aging Determination of Seed Vigor
The germination percentage of accelerated-aged seeds was determined according to Artificial Accelerated Aging Determination of Grass Seed Vigor Testing Procedures (NY/T 3187-2018). The upper layer of the aging box was covered with clean gauze of appropriate size to prevent small seeds from falling through the holes, and the seeds were spread evenly and flatly on the gauze and placed in the aging box. At least 400 seeds were placed in an aging box and held for 48 h at 42 °C and 90–100% relative humidity. After aging, the seeds were taken out immediately, and the seed germination test was started within 1 h. The results of vigor determination were expressed as the percentage (%) of normal seedlings in the total number of seeds tested per treatment at the end of germination.
2.5. Data Analysis
The spectral information and seed morphology information collected by MSI were used for multivariate analysis as follows:
Principal component analysis (PCA), an exploratory technique of multivariate data analysis, identifies hidden patterns in the extracted morphological characteristics and spectral data of seeds and provides an overview of systematic variation in the data. It was used to explore the possibility of grouping seeds with similar morphology and spectra. By retaining a small number of components obtained from linear transformations of the original variables, dimensionality can be reduced while keeping sufficient information.
Support vector machine (SVM) can better solve the problems of small samples and high dimensions by mapping data to high-dimensional space and constructing the optimal classification hyperplane for data classification. The learning strategy of SVM is to minimize the intra-class observation distance and maximize the inter-class observation distance. In this study, the linear kernel was used in SVM for classification.
Linear discriminant analysis (LDA) projects data from a high-dimensional space to a lower-dimensional space so that the within-class variance of each category is small and the difference between class means is large. High-dimensional data of the same category are therefore grouped together in the low-dimensional space, while different categories lie far apart.
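The projection idea behind LDA can be illustrated with a two-class Fisher discriminant on two features. This is a simplified sketch in plain Python, not the MASS implementation used in the study; the feature choice (CIELab L* and projected area), the class labels, the function names and all values are assumptions made for illustration.

```python
def class_mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, m):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix around mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def project(w, x):
    return w[0] * x[0] + w[1] * x[1]


def fit_fisher_lda(high, low):
    """Return the projection vector w and the midpoint decision threshold."""
    m_high, m_low = class_mean(high), class_mean(low)
    h, l = scatter(high, m_high), scatter(low, m_low)
    # Pooled within-class scatter matrix Sw = S_high + S_low
    sxx, sxy, syy = h[0] + l[0], h[1] + l[1], h[2] + l[2]
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m_high[0] - m_low[0], m_high[1] - m_low[1]
    # w = Sw^-1 (m_high - m_low), using the closed-form 2x2 inverse
    w = ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)
    threshold = 0.5 * (project(w, m_high) + project(w, m_low))
    return w, threshold


def predict(w, threshold, x):
    return "high" if project(w, x) > threshold else "low"


# Illustrative (L*, area in mm^2) pairs, not data from the study.
high = [(52.0, 3.1), (50.5, 3.0), (53.2, 3.3), (51.1, 2.9)]
low = [(44.0, 2.6), (45.5, 2.8), (43.1, 2.5), (46.0, 2.7)]
w, threshold = fit_fisher_lda(high, low)
```

With this direction, the projected class means are pushed apart relative to the within-class spread, which is the property the paragraph above describes; a new seed is assigned to whichever side of the midpoint its projection falls on.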
Random forest (RF) is an ensemble classification model based on decision tree predictors, which has excellent accuracy and can process input samples with high-dimensional features without dimensionality reduction. In developing the RF model with the combined spectral and morphological data, the number of classification trees (ntree) was set to 500, and the number of variables tried at each split (mtry) was set to 3.
An equal number of seeds randomly taken from each seed lot was divided into a training set (70% of the seed sample) and a validation set (30% of the seed sample). The classification model was built on the training set, and the validation set was used to verify the resulting model. Accuracy, sensitivity, specificity and precision were calculated from the confusion matrix of the validation set to evaluate the classification performance of the SVM, LDA and RF models. The formulas are as follows:
In the formulas, X and Y represent two samples, TX represents the sample X with correct prediction, TY represents the sample Y with correct prediction; FX represents the sample Y that is predicted to be sample X; FY means sample X that is predicted to be sample Y.
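The formulas themselves are not reproduced in this text. The sketch below uses the standard confusion-matrix definitions expressed in the TX, TY, FX and FY notation defined above, treating X as the positive class (an assumption). The function name and the example counts are hypothetical.

```python
def binary_metrics(tx, ty, fx, fy):
    """Standard confusion-matrix metrics with X taken as the positive class.

    tx: X samples predicted as X      ty: Y samples predicted as Y
    fx: Y samples predicted as X      fy: X samples predicted as Y
    """
    total = tx + ty + fx + fy
    return {
        "accuracy": (tx + ty) / total,
        "sensitivity": tx / (tx + fy),  # share of true X recovered
        "specificity": ty / (ty + fx),  # share of true Y recovered
        "precision": tx / (tx + fx),    # share of predicted X that is X
    }


# Illustrative validation counts, not results from the study.
metrics = binary_metrics(tx=54, ty=48, fx=12, fy=6)
```

For these illustrative counts, 60 seeds truly belong to each class, so sensitivity is 54/60 and specificity is 48/60, while precision divides by the 66 seeds predicted as X.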
Normalized canonical discriminant analysis (nCDA) is a supervised transformation-building method whose underlying calculation is similar to that of PCA. The two or more categories to be distinguished are assigned to different layers to generate a single-band image for visual analysis that acts as a linear discriminator between two categories, one mainly positive and the other mainly negative.
The FactoMineR, e1071, MASS and randomForest packages in R were used for PCA, SVM, LDA and RF analysis and prediction, respectively. nCDA analysis was performed using the MSI-Transformation Builder in Videometer Software V.3.22.0 (Videometer A/S, Herlev, Denmark).
4. Discussion
In this experiment, samples were taken at different stages of seed development and storage in order to obtain seeds with different vigor levels. The results of the germination test and the artificial accelerated aging test were consistent with expectations, indicating that the samples from different development and storage stages formed a gradient of high, medium and low seed vigor.
Compared with other nondestructive testing techniques such as soft X-ray and near infrared spectroscopy, MSI can obtain spectral and spatial information of seeds at the same time, which makes the information more diversified. The fusion method of spectral data and morphological data and the screening of fusion data have always been a difficulty and research focus in the field of nondestructive testing technology [
13]. The results of chlorophyll fluorescence imaging showed that chlorophyll degradation occurred in both the development stage and the storage stage of seeds. For alfalfa seeds, the fluorescence intensity of chlorophyll A was highest at the green ripe stage, when chlorophyll degradation is incomplete and physiological potential is low [14]; at this stage, the higher the fluorescence intensity, the lower the seed vigor [9]. Degradation of chlorophyll B molecules occurred at the storage stage after maturity, and the chlorophyll fluorescence intensity of high-vigor seeds was higher than that of low-vigor seeds, which was similar to previous results for peanut (
Arachis hypogaea) [
12].
In this experiment, key information with good discrimination was also screened out from morphological data and spectral data. The projected area, length and width of seeds increased with the increase in maturity, indicating a positive correlation between seed size and vigor at the seed development stage. This was consistent with the results of studies on the relationship between seed size and vigor in rice (
Oryza sativa) [
15] and sunflower (
Helianthus annuus) seeds [
16]. The area and vigor of seeds in different harvest years were negatively correlated, which was consistent with the research results on soybean [
17] and corn [
18]. CIELab is an international standard for color testing [
19]. Among the three parameters obtained by multispectral imaging, L* represents brightness, A* represents the color change degree from green to red, and B* represents the color change degree from blue to yellow [
20], which can be applied to judge the maturity and vigor of seeds [
21]. For seeds of different maturity, L* increased with the increase in vigor, and A* and B* also had significant changes in three stages, but there was no regularity. For seeds of different harvest years, L* and B* decreased and A* increased with decreasing seed vigor. In general, CIELab L* is a reliable color index for evaluating the seed vigor of alfalfa. The mean spectral reflectance of seeds with different vigor levels varied greatly from 470 to 690 nm, which may be related to the change of seed color and chlorophyll content [
22,
23]. The mean spectral reflectance of seeds with different maturity levels varied from 940 to 970 nm, which may be related to the changes in lipid content and water content during seed development [
24]. However, seeds in different harvest years were fully mature, so there was little difference at 940–970 nm.
Related studies using leguminous seeds as experimental materials showed that PCA combined with multispectral data was not effective in distinguishing six leguminous seeds [
25], nor was it effective in distinguishing alfalfa seeds from sweet clover (
Melilotus officinalis) seeds [
26]. Nor was it effective in distinguishing alfalfa seeds of different cultivars or different degrees of natural aging [
27,
28]. By contrast, LDA performed well in the above experiments, and its prediction ability was superior to that of SVM in distinguishing different leguminous seeds and different varieties of alfalfa. LDA was also superior to SVM and RF in distinguishing alfalfa seeds with different degrees of natural aging, which is consistent with the results of this study. This may be due to the different ways in which the models work. During modeling, the SVM model relies mainly on spectral information in the near-infrared region [
26,
27], the LDA model relies mainly on reflectance in the 430–490 nm band of the visible spectral range [
26], while RF focuses on seed morphology and spectral indicators together [
10].
In addition, in this study, the determination ability of multispectral imaging was extended to predict the alfalfa seed germination [
28,
29]. Seed vigor is uneven even among seeds of the same maturity or harvest year, which makes this index difficult to determine accurately. According to the nCDA prediction results, the results were not ideal when normal seedlings and hard seeds were treated as separate classes, but were very good when the two were combined into one group. The statistical results of the germination test also showed that many hard seeds were predicted to be normal seedlings and many normal seedlings were predicted to be hard seeds, indicating that the nCDA prediction model could not distinguish hard seeds from normal seedlings well. However, when hard seeds and normal seedlings were grouped together, they could be well distinguished from other ungerminated seeds and abnormal seedlings, which indicated that the nCDA model placed hard seeds and seeds that developed into normal seedlings in one category. Hard seeds are similar to normal seedlings and belong to high vigor seeds with strong germination potential [
30]. Therefore, this method can be applied to predict high vigor seeds in leguminous species. In practice, when the germination percentage of leguminous seeds is reported for the standard germination test, the sum of normal seedlings and hard seeds is usually given as the final germination percentage. Therefore, nCDA was also well suited to predicting alfalfa seed germination percentage. In addition, the experimental results showed that nCDA had a strong ability to predict dead seeds, with an average accuracy of 93.3%, which gives it great application potential in predicting seed viability.
When seed vigor, standard germination percentage and viability are tested with the traditional methods specified in the current standards, the tests take a long time and the accuracy of the results depends to a large extent on the experience of the analysts. For example, errors arise in determining seed germination percentage from the differentiation between normal and abnormal seedlings, and from the judgment of staining area and location. Using MSI for germination percentage and viability tests gives results immediately and greatly shortens the testing period, so that it can serve not only to test seed quality but also to predict it. By combining computer imaging with spectral information and applying multivariate analysis, the errors caused by differences in subjective judgment are largely avoided, and the results are more objective and accurate. In addition, the whole test process is nondestructive, which is of great value especially for quality monitoring in seed resource banks and for quality testing of expensive and rare seeds. Although multispectral imaging has very good application prospects and many advantages in seed quality testing, it also has limitations. For example, the equipment is expensive, the technology has been in use for only a short time, and mature models have not yet been built for many species and testing items, so there is still a long way to go before standardized testing and wider adoption. However, as the cost of computer hardware and artificial intelligence decreases, the application value of multispectral imaging in seed quality testing is expected to increase greatly.