3.1.2. Establishment of an SSC Model Using the IRIV-SPA

Based on the spectral data of fresh jujubes cultivated in the open field, the IRIV algorithm with inverse elimination was used to select variables. The number of cross-validation folds, the maximum number of principal components, and the number of iterations were set to 10, 20, and 8, respectively. The number of retained variables during the IRIV iterations is shown in Figure 2. As the iteration number increased, the number of retained variables decreased and the downward trend gradually flattened. Overall, 87 wavelength variables were retained at the 8th iteration. The DMEAN and P–values of the retained variables are shown in Figure 3.

**Figure 2.** Number of retained variables.

**Figure 3.** DMEAN and P–values of the nonparametric Mann–Whitney U test for each variable.

The variables were classified by type by combining Figure 3 with the rules in Table 2. The selected variables and their types are shown in Figure 4. In total, 4 strong informative variables and 83 weak informative variables were selected from the 1951 variables. Inverse elimination was then performed, and 71 characteristic wavelengths were preserved.

**Figure 4.** Selection of characteristic wavelength using IRIV based on open-field cultivation.
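The classification step described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: Table 2 is not reproduced in this section, so the rules below are assumed to follow the commonly cited IRIV convention, in which DMEAN is the mean RMSECV with a variable included minus the mean RMSECV with it excluded, and 0.05 is the significance threshold. The function names and the example values are hypothetical.

```python
# Assumed IRIV rules (Table 2 is not shown here, so treat these as an assumption):
#   DMEAN < 0 and P <  0.05 -> strong informative
#   DMEAN < 0 and P >= 0.05 -> weak informative
#   DMEAN >= 0 and P >= 0.05 -> uninformative
#   DMEAN >= 0 and P <  0.05 -> interfering
INFORMATIVE = {"strong informative", "weak informative"}


def classify_variable(dmean, p_value, alpha=0.05):
    """Return the assumed IRIV type for one variable.

    dmean   : mean RMSECV with the variable included minus mean RMSECV
              with it excluded (negative means the variable helps).
    p_value : P-value of the Mann-Whitney U test on the two RMSECV groups.
    """
    significant = p_value < alpha
    if dmean < 0:
        return "strong informative" if significant else "weak informative"
    return "interfering" if significant else "uninformative"


def retain_informative(stats, alpha=0.05):
    """Keep only strong and weak informative variables.

    stats maps a variable label to a (dmean, p_value) pair.
    """
    return {
        label: classify_variable(dmean, p_value, alpha)
        for label, (dmean, p_value) in stats.items()
        if classify_variable(dmean, p_value, alpha) in INFORMATIVE
    }


# Hypothetical values for illustration only; they are not taken from Figure 3.
example_stats = {
    "var_a": (-0.031, 0.002),
    "var_b": (-0.004, 0.410),
    "var_c": (0.010, 0.620),
    "var_d": (0.045, 0.003),
}

if __name__ == "__main__":
    for label, kind in retain_informative(example_stats).items():
        print(label, kind)
```

In each IRIV round only the variables returned by a filter such as `retain_informative` would be carried forward, which is consistent with the shrinking variable count in Figure 2.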

The number of characteristic wavelengths selected by the IRIV algorithm was still high. To reduce the data dimensionality further, SPA was applied as a second extraction step to the 71 characteristic wavelengths selected by IRIV. At an RMSE of 1.0257%, 10 characteristic wavelengths were extracted. In order of importance, the wavelengths extracted by IRIV-SPA were 957, 1008, 2339, 920, 2248, 2394, 1137, 1976, 647, and 602 nm.
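The projection step at the core of SPA can be illustrated as follows. This is a simplified sketch under stated assumptions, not the procedure used in the study: it builds a single chain of minimally collinear columns from a given starting column, whereas a full SPA run would also compare chains of different lengths and starting points by a validation RMSE. The function name `spa_select` and the toy data are hypothetical.

```python
def spa_select(columns, start, n_select):
    """Select n_select column indices by successive projections.

    columns  : list of column vectors (each a list of floats), one per wavelength.
    start    : index of the starting column.
    n_select : number of columns to select, including the starting one.
    """
    work = [list(col) for col in columns]  # working copies, updated in place
    selected = [start]
    for _ in range(n_select - 1):
        ref = work[selected[-1]]
        ref_norm_sq = sum(v * v for v in ref)
        if ref_norm_sq == 0.0:
            break  # degenerate reference column, nothing left to project on
        best_index, best_norm_sq = None, -1.0
        for j, col in enumerate(work):
            if j in selected:
                continue
            # Project column j onto the subspace orthogonal to the reference.
            coef = sum(a * b for a, b in zip(col, ref)) / ref_norm_sq
            work[j] = [a - coef * b for a, b in zip(col, ref)]
            norm_sq = sum(v * v for v in work[j])
            # The column with the largest remaining norm is the least
            # collinear with everything selected so far.
            if norm_sq > best_norm_sq:
                best_index, best_norm_sq = j, norm_sq
        if best_index is None:
            break  # no unselected columns remain
        selected.append(best_index)
    return selected


# Toy data: column 1 is nearly collinear with column 0 and should be skipped.
toy_columns = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 1.0],
]

if __name__ == "__main__":
    print(spa_select(toy_columns, start=0, n_select=3))
```

Because each selected column is the one with the largest projection onto the orthogonal complement of the previous choice, redundant wavelengths are dropped first, which is why SPA suits a second reduction pass over an already filtered set.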

Based on the characteristic wavelengths extracted by IRIV and IRIV-SPA, LS-SVM was used to establish SSC detection models, which were then used to predict the SSC of fresh jujubes from the two cultivation modes. The prediction results are shown in Table 4.


**Table 4.** Prediction results of SSC using IRIV and IRIV-SPA.

Compared with the model built using full wavelengths (Table 3), the IRIV-LS-SVM model improved the prediction ability. For the SSC of open-field samples, the Rp<sup>2</sup> increased from 0.80 to 0.85 and the RPD from 2.25 to 2.52, while the RMSEP decreased from 1.14% to 1.02%. For the SSC of rain-shelter samples, the Rp<sup>2</sup> increased from 0.59 to 0.71 and the RPD from 1.12 to 1.14, while the RMSEP decreased from 2.54% to 2.50%. For open-field samples, the prediction results of the IRIV-SPA-LS-SVM model and the full-wavelength LS-SVM model were basically the same. For rain-shelter samples, the IRIV-SPA-LS-SVM model was slightly worse than the full-wavelength LS-SVM model. With the IRIV-LS-SVM and IRIV-SPA-LS-SVM models built on fresh jujubes cultivated in the open field, the SSC of samples from the same cultivation mode was well predicted, but the SSC of samples from rain-shelter cultivation was poorly predicted. Therefore, the cultivation mode influences the SSC detection model, and the model needs further optimization to improve its predictive ability. Compared with IRIV, IRIV-SPA substantially reduced the number of extracted characteristic wavelengths (from 71 to 10) while maintaining model performance. The IRIV-SPA algorithm therefore achieved better overall performance and was used to select characteristic wavelengths in the following research.
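For readers who want to reproduce the three indicators compared above, the following sketch shows one common way to compute them. The section does not state its formulas, so two assumptions are made: Rp<sup>2</sup> is taken as the squared Pearson correlation between measured and predicted values, and RPD is taken as the sample standard deviation of the measured values divided by the RMSEP. The function name and the example values are hypothetical.

```python
import math


def prediction_metrics(y_true, y_pred):
    """Return Rp2, RMSEP, and RPD for a prediction set.

    Assumed definitions (not confirmed by the source):
      Rp2   = squared Pearson correlation of measured vs. predicted values
      RMSEP = sqrt(mean squared prediction error)
      RPD   = sample standard deviation of measured values / RMSEP
    """
    n = len(y_true)
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n

    ss_true = sum((t - mean_true) ** 2 for t in y_true)
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    cross = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))

    rmsep = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    sd_true = math.sqrt(ss_true / (n - 1))

    return {
        "Rp2": cross ** 2 / (ss_true * ss_pred),
        "RMSEP": rmsep,
        "RPD": sd_true / rmsep,
    }


# Made-up SSC values (%) for illustration only; they are not data from Table 4.
measured = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 15.0, 16.0]

if __name__ == "__main__":
    for name, value in prediction_metrics(measured, predicted).items():
        print(f"{name}: {value:.3f}")
```

Under this definition, RPD multiplied by RMSEP recovers the standard deviation of the measured set, so the two indicators move in opposite directions for a fixed prediction set.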
