*4.7. Chloroplast RNA Editing Analysis*

For chloroplast RNA editing analysis, total DNA and RNA were isolated from wild-type and *yl* leaves as described in Sections 4.4 and 4.5. The DNA and RNA were then submitted to Berry Genomics (Beijing) for DNA resequencing and rRNA-depleted strand-specific RNA-seq, respectively. For RNA-seq analysis, four cDNA libraries (two biological replicates per genotype) were constructed. All sequencing data used in this research were deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive under BioProject ID PRJNA616185.

Clean reads were quality checked using FastQC (version 0.11.3) (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) [48]. After quality control, reads were aligned to the soybean chloroplast genome (ncbi.nlm.nih.gov/nuccore/DQ312375.1) [49] using HISAT (version 2.0.0) [50]. SNP calling was performed using GATK combined with Samtools [51,52]. SNPs with read numbers ≥ 20 in the RNA-seq and the corresponding DNA resequencing data were compared, and positions at which the chloroplast allele differed between the two data sets were considered candidate RNA editing sites. Because C-to-U editing in plastid and mitochondrial mRNAs appears to be ubiquitous in land plants, the editing efficiency of each site was calculated using the following equation: Editing (%) = U/(C + U) × 100, where U is the number of plastid mRNA reads carrying the base that differs from the DNA at that site, and C is the number of reads carrying the base identical to the DNA at the same site.
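The candidate-site filter and the editing-efficiency equation above can be illustrated with a minimal Python sketch. The function names, the argument layout, and the reading that the ≥ 20 read threshold applies to both the DNA and the RNA data are assumptions made for illustration; the sketch does not reproduce the GATK/Samtools workflow used in this study.

```python
MIN_READS = 20  # read-number threshold stated in the text


def is_candidate_site(dna_allele, rna_allele, dna_reads, rna_reads,
                      min_reads=MIN_READS):
    """Return True if a position qualifies as a candidate RNA editing site.

    Assumption: the threshold must be met in both data sets, and the
    allele called from RNA-seq must differ from the allele called from DNA.
    """
    enough_reads = dna_reads >= min_reads and rna_reads >= min_reads
    return enough_reads and dna_allele != rna_allele


def editing_percent(c_reads, u_reads):
    """Editing (%) = U / (C + U) x 100.

    c_reads: reads carrying the base identical to the DNA.
    u_reads: reads carrying the base that differs from the DNA.
    """
    total = c_reads + u_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return 100.0 * u_reads / total


if __name__ == "__main__":
    # Hypothetical counts for one site, not data from this study.
    if is_candidate_site("C", "T", dna_reads=150, rna_reads=100):
        print(editing_percent(c_reads=20, u_reads=80))
```

In cDNA-derived reads an edited U is read as T, which is why the hypothetical RNA allele in the usage example is written as "T".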

To experimentally validate the RNA editing sites derived from RNA-seq, the genomic and transcript regions surrounding these sites were amplified from an additional biological replicate of the wild type and of the *yl* mutant using KOD FX High-Fidelity DNA polymerase (Toyobo, Osaka, Japan). The PCR products were sequenced, and the genomic and transcript sequences were compared to identify nucleotide changes resulting from RNA editing. The extent of RNA editing was estimated from the relative heights of the nucleotide peaks at each site in the sequencing traces. The primer sequences are listed in Table S3.
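As a rough illustration of the peak-height estimate, the sketch below assumes that the editing extent is taken as the edited-base (T) peak height divided by the sum of the T and C peak heights at the site. The text does not state the exact formula, so this ratio, the function name, and the example values are all assumptions.

```python
def editing_extent_from_peaks(c_peak_height, t_peak_height):
    """Estimate editing extent (%) from peak heights at one site.

    Assumed formula: T / (C + T) x 100, where T is the peak of the
    edited base (U is read as T in cDNA) and C is the unedited base.
    """
    if c_peak_height < 0 or t_peak_height < 0:
        raise ValueError("peak heights must be non-negative")
    total = c_peak_height + t_peak_height
    if total == 0:
        raise ValueError("no signal at this site")
    return 100.0 * t_peak_height / total


if __name__ == "__main__":
    # Hypothetical peak heights in arbitrary units.
    print(editing_extent_from_peaks(c_peak_height=300, t_peak_height=900))
```

Peak heights give only a semi-quantitative estimate, so values obtained this way are best read alongside the read-count-based percentages rather than in place of them.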
