*3.5. Classification Model Built by Feature-Level Fusion of Spectra and Texture Data*

The classification of maize with different mold levels based on hyperspectral imaging involved the rapid collection of a large number of hyperspectral images, each composed of two spatial dimensions and one spectral dimension. Spectral or texture information was then extracted from these images and used to predict the category of each maize sample. The large number of spectral or texture variables across the full wavelength range often contained noise from environmental and instrumental sources, leading to complex calibration models with poor predictive ability. In addition, complex calibration models developed on the whole spectrum are not practical for online or at-line applications. To resolve these issues, spectral and texture features were extracted separately using the variable selection algorithms VCPA, IRIV, and mVCPA-IRIV; feature-level classification models were then established on the extracted features for the Vis-SWNIR and LWNIR regions, respectively.
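To illustrate the texture-extraction step, the sketch below computes two gray-level co-occurrence matrix (GLCM) texture parameters, energy and contrast, for a single band image. This is a minimal NumPy implementation, not the authors' code; the number of gray levels, the pixel distance, and the horizontal-only offset are assumptions for illustration.

```python
import numpy as np

def glcm_features(band, levels=8, d=1):
    """Energy and contrast from a horizontal gray-level co-occurrence
    matrix (GLCM) of a single band image (illustrative parameters)."""
    band = np.asarray(band, dtype=float)
    # quantize the band to `levels` discrete gray levels
    peak = band.max()
    q = np.zeros(band.shape, dtype=int)
    if peak > 0:
        q = np.floor(band / peak * (levels - 1)).astype(int)
    # count co-occurrences of gray levels at horizontal distance d
    glcm = np.zeros((levels, levels))
    for i, j in zip(q[:, :-d].ravel(), q[:, d:].ravel()):
        glcm[i, j] += 1
    glcm = glcm + glcm.T          # make the matrix symmetric
    glcm /= glcm.sum()            # normalize to joint probabilities
    idx = np.arange(levels)
    energy = np.sqrt((glcm ** 2).sum())
    contrast = (glcm * (idx[:, None] - idx[None, :]) ** 2).sum()
    return energy, contrast
```

A uniform band image yields energy 1 and contrast 0, while strongly alternating gray levels drive contrast up, which is what makes these parameters sensitive to the surface changes caused by mold growth.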

To analyze the features selected from the spectral and texture data, the characteristic wavelengths selected by the same variable selection method were concatenated, and their distribution maps are shown in Figure 7. Comparing the number of selected characteristic bands, VCPA selected fewer variables than IRIV and mVCPA-IRIV for both the spectral and texture data in the Vis-SWNIR and LWNIR regions. For all three variable selection methods, more characteristic bands were selected from the texture data than from the spectral data, indicating that texture carried more information about moldy maize than the spectra. Comparing the distributions of the characteristic bands selected by the three methods, many common regions were identified in both the spectral and texture data. The shared regions were concentrated at 629–649, 728–743, 764–772, 855–860, and 1055–1248 nm for the spectral data (Figure 7a,c), and at 410–490, 584–592, 679–693, 866–876, 953–963, 1029–1060, 1167–1192, 1235–1267, and 1688–1701 nm for the texture data (Figure 7b,d). According to previous studies, Stasiewicz et al. [53] classified different levels of Aspergillus flavus in maize at 850 nm and 857 nm. Moreover, 768 nm and 853 nm were used to differentiate fungal-contaminated maize from healthy samples in Chu's study [54]. Furthermore, 1029–1267 nm belongs to the second overtone of N-H stretching in proteins, as well as C-H stretching in lipids [47,48], and 1688–1701 nm is attributed to the second overtone of S-H stretching; these bands are associated with protein, fat, and starch [55]. Mold growth breaks down the fat, protein, and starch of maize kernels, which changes the reflectance spectra and texture features of the kernels and explains the selection of the above characteristic bands. Apart from these shared regions, some differences remained in the selected characteristic bands, which may be caused by the different principles of the variable selection methods.
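The shared-region analysis above can be sketched as a set intersection over the band indices retained by each method, with the common indices then grouped into contiguous wavelength ranges. The function and parameter names below are hypothetical; the `gap` threshold for merging neighboring indices is an assumption.

```python
def shared_regions(selections, wavelengths, gap=1):
    """Bands retained by every variable selection method, grouped into
    contiguous wavelength regions (illustrative sketch).

    selections  -- iterable of index sets, one per method (VCPA, IRIV, ...)
    wavelengths -- wavelength (nm) for each band index
    gap         -- maximum index gap allowed within one region
    """
    # bands chosen by all methods, in ascending order
    common = sorted(set.intersection(*map(set, selections)))
    regions, start = [], None
    for a, b in zip(common, common[1:] + [None]):
        if start is None:
            start = a                     # open a new region
        if b is None or b - a > gap:
            # close the region when the run of indices breaks
            regions.append((wavelengths[start], wavelengths[a]))
            start = None
    return regions
```

Each returned pair gives the start and end wavelength of one shared region, matching the way the shared ranges (e.g., 629–649 nm) are reported in the text.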

The classification results of the feature-level fusion of spectral and texture information are shown in Table 5. The feature-level fusion models achieved better accuracy and reliability in the Vis-SWNIR range than in the LWNIR region. In particular, the feature-level fusion models of VCPA, IRIV, and mVCPA-IRIV improved accuracy by 3.33%, 5.00%, and 1.67%, respectively, compared to the pixel-level fusion model in the Vis-SWNIR region (Table 4). In detail, the prediction accuracy of the models based on the features selected by VCPA, IRIV, and mVCPA-IRIV was 93.33%, 95.00%, and 91.67% for the Vis-SWNIR region, and 90.00%, 83.33%, and 91.97% for the LWNIR region. Although IRIV achieved the best prediction results in the Vis-SWNIR region, its prediction ability was poor in the LWNIR region. In addition, IRIV was time-consuming in variable selection, and the number of variables it retained for modeling was much larger than that of VCPA. In contrast, VCPA achieved classification accuracies of 93.33% and 90.00% for the Vis-SWNIR and LWNIR regions, making it more robust than IRIV and mVCPA-IRIV. It is worth noting that, as a hybrid variable selection method, mVCPA-IRIV did not yield the best prediction results in either the Vis-SWNIR or the LWNIR region. This is consistent with the results of employing two-step hybrid methods to determine TVB-N content in tilapia fillets for single Vis-NIR and NIR data blocks [24]. This may be because feature selection was performed on a single data block with a relatively small number of variables, which is not well suited to hybrid variable selection methods.
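The feature-level fusion step itself amounts to concatenating the selected spectral and texture features of each sample before modeling. The sketch below illustrates this; a nearest-centroid rule stands in for the paper's classifier (which is not specified in this excerpt), and the z-score normalization is an assumption to put the two feature blocks on a common scale.

```python
import numpy as np

def fuse_and_classify(spec_tr, tex_tr, labels, spec_te, tex_te):
    """Feature-level fusion sketch: concatenate the selected spectral
    and texture features of each sample, then classify with a
    nearest-centroid stand-in for the paper's actual model."""
    X  = np.hstack([spec_tr, tex_tr])    # fused training features
    Xt = np.hstack([spec_te, tex_te])    # fused test features
    # z-score so spectral and texture features share one scale
    mu, sd = X.mean(0), X.std(0) + 1e-12
    X, Xt = (X - mu) / sd, (Xt - mu) / sd
    classes = np.unique(labels)
    # one centroid per mold-level class
    centroids = np.stack([X[labels == c].mean(0) for c in classes])
    # assign each test sample to its nearest class centroid
    dist = ((Xt[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return classes[dist.argmin(1)]
```

Because fusion happens at the feature level, only the bands retained by VCPA, IRIV, or mVCPA-IRIV enter `spec_tr` and `tex_tr`, which is what keeps these models far smaller than the full-spectrum pixel-level models.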

**Figure 7.** The distribution of key wavelengths selected by different variable selection algorithms from spectra and texture data of Vis-SWNIR and LWNIR regions, respectively. (**a**) Spectrum in Vis-SWNIR region, (**b**) energy parameters in Vis-SWNIR region, (**c**) spectrum in LWNIR region, and (**d**) contrast parameters in LWNIR region. Note: red points, black points, and green points represent the variables retained by VCPA, IRIV, and mVCPA-IRIV in the spectrum or texture parameters, respectively. Orange points represent the bands jointly selected by all three algorithms.


**Table 5.** The classification results of the feature-level fusion of spectral and texture information.

In conclusion, the feature-level fusion models based on variable selection showed greater potential for distinguishing maize with different mold levels than the pixel-level fusion model using the full bands in both the Vis-SWNIR and LWNIR regions. Notably, in the study by Anting et al. [37], a feature-level-PCA strategy using sample images and spectra to evaluate the fermentation degree of black tea produced results similar to those of our proposed strategy.
