*3.2. Visible/NIR Spectral Analysis of PCC Samples*

Using the WinISI III software, the chemically determined values were entered at the corresponding spectral positions, and the spectral data were analyzed together with the chemical analysis data. Figure 1A shows the raw visible/NIR spectra of the PCC samples, with wavelength on the horizontal axis and absorbance, expressed as log(1/R), on the vertical axis. Several PCC samples showed a clear downward trend in their absorption peaks over the 400 to 800 nm range, indicating that different samples had specific absorption characteristics in the visible band. The large variation among the spectra also indirectly indicated that the composition of the samples differed.

The raw NIR spectra contained comprehensive information on all chemical structures, together with a large amount of irrelevant information and noise. Mathematical pretreatment methods were therefore applied to remove noise, compensate for baseline shift, reduce the influence of non-target variation, and smooth the spectra. Derivative transformation can partially compensate for baseline offset between samples and reduce instrument drift effects [22]. Figure 1B shows the spectra after SNV+Detrend and first-order derivative pretreatment. The pretreated spectra showed more pronounced undulations, the peaks became more numerous and sharper, and absorption peaks appeared in regions that were originally smooth. Figure 1C shows the spectra after SNV only and first-order derivative pretreatment, in which a fitting phenomenon could be observed. Several characteristic peaks were more clearly visible in the processed spectra: the peak at 672 nm is associated with chlorophyll [23], and the peak at about 760 nm corresponds to the third overtone of the O-H vibration [24].
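As a rough illustration of the pretreatment chain described above, the following Python sketch applies SNV, detrending, and a first derivative to synthetic spectra. It is not the implementation used by the authors' software: the second-order polynomial detrend, the use of `np.gradient` as a simple stand-in for the derivative treatment, the 2 nm sampling interval, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def detrend(spectra, wavelengths, degree=2):
    """Subtract a least-squares polynomial baseline from each spectrum.

    degree=2 is an assumption; the source does not state the polynomial order.
    """
    # Standardise the wavelength axis so the polynomial fit is well conditioned.
    x = (wavelengths - wavelengths.mean()) / wavelengths.std()
    basis = np.vander(x, degree + 1)                      # (n_wavelengths, degree + 1)
    coeffs, *_ = np.linalg.lstsq(basis, spectra.T, rcond=None)
    return spectra - (basis @ coeffs).T


def first_derivative(spectra, wavelengths):
    """Finite-difference first derivative along the wavelength axis."""
    return np.gradient(spectra, wavelengths, axis=1)


# Synthetic demo data (hypothetical): one band near 672 nm plus a
# sample-specific scale, offset, and sloping baseline.
wavelengths = np.arange(400.0, 2500.0, 2.0)               # 400 to 2498 nm, assumed 2 nm step
peak = np.exp(-0.5 * ((wavelengths - 672.0) / 15.0) ** 2)
rng = np.random.default_rng(0)
scale = rng.uniform(0.8, 1.2, size=(5, 1))
offset = rng.uniform(0.0, 0.5, size=(5, 1))
slope = rng.uniform(-1e-4, 1e-4, size=(5, 1))
raw = scale * peak + offset + slope * wavelengths

# SNV + Detrend followed by a first derivative, in that order.
pretreated = first_derivative(detrend(snv(raw), wavelengths), wavelengths)
print(pretreated.shape)
```

Dropping the `detrend` call gives the "SNV only" variant, and dropping both `snv` and `detrend` gives the variant with no scatter correction.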

**Figure 1.** Visible/near-infrared spectra of purple leaf Chinese cabbages. (**A**): original spectra; (**B**): spectrum after SNV+Detrend and first derivative processing; (**C**): spectrum after SNV only and first derivative processing.

#### *3.3. Establishment of Quantitative Models for Anthocyanidin Content in PCC*

## 3.3.1. Model for Cyanidin Content Prediction

The spectra of the scanned samples and the chemical analysis data were processed by PLSR to establish calibration models; the resulting calibration equations are shown in Table 2. All spectral pretreatment models performed well, with RSQ values above 0.91; a successful calibration usually has a coefficient of determination above 0.9. For cyanidin, the model built on the full spectral range (400 to 1100 nm and 1100 to 2498 nm) with no scatter correction (None) and first-order derivative pretreatment gave the highest 1-VR (0.942) and the lowest SECV (693.004), with an RSQ of 0.965. Figure 2A shows the cross-validation result of this prediction model, that is, the linear regression between the NIR-predicted values and the chemically determined reference values. The slope of the line was 0.976, which is close to 1, and the samples were scattered irregularly on both sides of the line. The model fits well and is suitable for quantitative prediction. Therefore, the model with no scatter correction (None) and first-order derivative pretreatment was chosen for the rapid screening of high-quality PCC breeding materials.

For delphinidin, the highest 1-VR among the prediction models was 0.172, with an SECV of 12.030, obtained with SNV only and first-order derivative pretreatment in the 400–800 nm band. These values indicated a weak correlation, so the model could not accurately predict the delphinidin content of PCC. For pelargonidin, no scatter correction and first-order derivative pretreatment in the 400–800 nm visible band gave the highest 1-VR (0.467) and the lowest SECV (3.887), so the model was poorly predictive and could not accurately predict the pelargonidin content. For peonidin, Detrend only and first-order derivative pretreatment in the 400–800 nm visible band gave the highest 1-VR (0.652), with an SECV of 3.557; the model's predictions were only weakly correlated with the reference values and could not accurately predict the peonidin content of PCC, which needs further study. Because the contents of delphinidin, pelargonidin, and peonidin were relatively low, accounting for less than 5% of the total anthocyanidins, their contribution to the quality of PCC is negligible.
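To make the calibration statistics above more concrete, the following Python sketch fits a single-response PLS model and computes calibration and cross-validation figures on synthetic data. It is only a sketch under stated assumptions: the NIPALS-style PLS1 routine, the three latent variables, the five-fold split, and the formulas used for RSQ, SECV, and 1-VR (coefficient of determination and root-mean-square error, without the degrees-of-freedom corrections a commercial package may apply) are illustrative choices, and the data are hypothetical, not the PCC measurements.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a PLS1 model (single response) with a NIPALS-style deflation loop."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)          # weight vector
        t = Xr @ w                      # scores
        tt = t @ t
        p = Xr.T @ t / tt               # X loadings
        c = yr @ t / tt                 # y loading
        Xr = Xr - np.outer(t, p)        # deflate X
        yr = yr - c * t                 # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in X space
    return x_mean, y_mean, coef


def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean


def kfold_predictions(X, y, n_components, n_folds=5, seed=0):
    """Predict every sample from a model that did not see it during fitting."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    y_cv = np.empty(len(y))
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        model = pls1_fit(X[train_idx], y[train_idx], n_components)
        y_cv[test_idx] = pls1_predict(model, X[test_idx])
    return y_cv


def r_squared(y_ref, y_pred):
    ss_res = np.sum((y_ref - y_pred) ** 2)
    ss_tot = np.sum((y_ref - y_ref.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def rmse(y_ref, y_pred):
    return np.sqrt(np.mean((y_ref - y_pred) ** 2))


# Hypothetical data: one absorption band whose height tracks the reference value.
rng = np.random.default_rng(1)
wavelengths = np.linspace(400.0, 800.0, 200)
peak = np.exp(-0.5 * ((wavelengths - 672.0) / 15.0) ** 2)
y = rng.uniform(0.0, 10.0, 60)                       # reference values, arbitrary units
X = (0.1 * y[:, None] * peak
     + rng.normal(0.0, 0.02, (60, 1))                # per-sample baseline offset
     + rng.normal(0.0, 0.01, (60, 200)))             # measurement noise

n_lv = 3                                             # assumed number of latent variables
y_fit = pls1_predict(pls1_fit(X, y, n_lv), X)
y_cv = kfold_predictions(X, y, n_lv)

rsq = r_squared(y, y_fit)            # calibration coefficient of determination
secv = rmse(y, y_cv)                 # standard error of cross-validation (as RMSE)
one_minus_vr = r_squared(y, y_cv)    # cross-validated coefficient of determination
slope = np.polyfit(y, y_cv, 1)[0]    # slope of predicted vs. reference values
print(f"RSQ={rsq:.3f}  SECV={secv:.3f}  1-VR={one_minus_vr:.3f}  slope={slope:.3f}")
```

Read this way, a 1-VR close to RSQ together with a slope near 1 suggests that the calibration generalizes to samples left out of the fit, whereas a low 1-VR, as reported for delphinidin, suggests that it does not.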

**Table 2.** Calibration equations of cyanidin content in purple leaf Chinese cabbage using different pretreatment models.


**Figure 2.** The cross-validation and external validation results of cyanidin and total anthocyanidin prediction models. (**A**): cross-validation of cyanidin prediction model; (**B**): cross-validation of total anthocyanidins prediction model; (**C**): external validation of cyanidin prediction model; (**D**): external validation of total anthocyanidins prediction model.
