*2.4. Spectral Preprocessing and Sample Set Division*

The raw spectra inevitably exhibited baseline shifts and background noise caused by the data acquisition environment, sample size, instrument, and other factors. Appropriate preprocessing can improve both the stability and the signal-to-noise ratio of spectral data. Multiplicative scatter correction (MSC) was used to eliminate baseline drift [37]. The standard normal variate (SNV) transformation was used to center and scale each spectrum across its wavelengths [38]. Several Savitzky–Golay (S-G) filters [39], with frame sizes of 3–9 and fitting orders of 1–7, were tested for smoothing the spectra, and the optimal combination of frame size and fitting order was chosen according to R and RMSE values. Each preprocessing method was applied on its own, as were the combinations of S-G with MSC and S-G with SNV. The preprocessed spectra were then used to construct separate PLSR models of SCC, and the best method or combination was selected according to the corresponding R and RMSE values.
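The three preprocessing steps can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function names (`msc`, `snv`, `sg_smooth`, `sg_candidates`) are hypothetical, the mean spectrum is assumed as the MSC reference, only odd frame sizes are assumed for the S-G search, and the fitting order is restricted to values smaller than the frame size, as a polynomial fit requires. Spectra are assumed to be stored as rows of a NumPy array.

```python
import numpy as np
from scipy.signal import savgol_filter


def msc(spectra, reference=None):
    """Multiplicative scatter correction of row-wise spectra."""
    spectra = np.asarray(spectra, dtype=float)
    # Assumption: the mean spectrum of the set is the reference.
    if reference is None:
        ref = spectra.mean(axis=0)
    else:
        ref = np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        # Fit row ≈ intercept + slope * ref, then undo both effects.
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row)."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def sg_smooth(spectra, frame_size, order):
    """Savitzky-Golay smoothing along the wavelength axis."""
    return savgol_filter(spectra, window_length=frame_size,
                         polyorder=order, axis=1)


def sg_candidates(frame_sizes=range(3, 10, 2), orders=range(1, 8)):
    """Valid (frame size, fitting order) pairs; order must be < frame size."""
    return [(f, p) for f in frame_sizes for p in orders if p < f]


if __name__ == "__main__":
    # Synthetic spectra (illustrative only, not data from the study).
    rng = np.random.default_rng(0)
    base = np.sin(np.linspace(0, 3, 200)) + 2.0
    gains = rng.uniform(0.8, 1.2, size=(10, 1))
    offsets = rng.uniform(-0.1, 0.1, size=(10, 1))
    raw = gains * base + offsets + rng.normal(0, 0.01, size=(10, 200))

    sg_msc = msc(sg_smooth(raw, frame_size=7, order=2))  # S-G then MSC
    sg_snv = snv(sg_smooth(raw, frame_size=7, order=2))  # S-G then SNV
    print(sg_msc.shape, sg_snv.shape, len(sg_candidates()))
```

In a full search, each pair from `sg_candidates()` would be used to preprocess the spectra before fitting a PLSR model, and the pair giving the best R and RMSE would be kept.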

The samples were divided into a calibration set (Cs) and a validation set (Vs) using the sample set partitioning based on joint x-y distances (SPXY) algorithm, with the spectral data and SCC as inputs. The Cs:Vs ratio was 3:1 in this study.
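A compact sketch of an SPXY-style split is given below for illustration. It follows the commonly described procedure: pairwise Euclidean distances are computed separately in the spectral (x) and SCC (y) spaces, each is normalized by its maximum, the two are summed, and samples are then picked by Kennard-Stone max-min selection. The function name `spxy_split` and its `cal_fraction` parameter are hypothetical, and the sketch assumes at least two calibration samples and non-constant x and y values.

```python
import numpy as np


def spxy_split(X, y, cal_fraction=0.75):
    """Return (calibration indices, validation indices) via SPXY selection."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(len(X), -1)
    n = len(X)
    n_cal = int(round(cal_fraction * n))  # 3:1 split -> 75 % calibration

    def pairwise(a):
        # Euclidean distance between every pair of rows.
        diff = a[:, None, :] - a[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=2))

    dx, dy = pairwise(X), pairwise(y)
    # Joint x-y distance: each part normalized by its own maximum.
    d = dx / dx.max() + dy / dy.max()

    # Start with the two most distant samples.
    first, second = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(first), int(second)]
    remaining = [i for i in range(n) if i not in selected]

    # Repeatedly add the sample farthest from its nearest selected sample.
    while len(selected) < n_cal:
        min_dist = d[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(min_dist))]
        selected.append(pick)
        remaining.remove(pick)

    return np.array(selected), np.array(remaining)


if __name__ == "__main__":
    # Synthetic data (illustrative only, not data from the study).
    rng = np.random.default_rng(0)
    spectra = rng.normal(size=(40, 10))
    scc = rng.normal(size=40)
    cal_idx, val_idx = spxy_split(spectra, scc)
    print(len(cal_idx), len(val_idx))  # 30 10
```

The full distance matrix is held in memory here, which is fine for a few hundred samples but would need a more economical formulation for much larger sets.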
