### *2.1. Plant Materials*

All watermelon lines used in this study were obtained from domestic seed companies (Hyundai Seed Company, Gyeonggi-do, South Korea). A total of 70 lines with red (33), yellow (17), or orange (20) flesh were used (Table S1). Seeds were sown in 72-cell polyethylene flats and grown in a greenhouse at 25 °C during the 16 h light period and 20 °C during the 8 h dark period until the second and third true leaves appeared. Leaf samples were then collected, and genomic DNA was isolated from the leaves using the WizPrep™ Plant DNA Mini Kit (Wizbiosolutions, Seongnam, South Korea).

### *2.2. Sequence Retrieval and Identification of the PPR Family Members in Watermelon*

To retrieve PPR genes in watermelon, the PPR motif "PF01535" from the Pfam database (http://pfam.sanger.ac.uk/) was used as a query in BLASTP searches against the watermelon (*Citrullus lanatus* subsp. *vulgaris* cv. 97103) protein sequences (version 2) in the Cucurbit Genomics Database (CuGenDB; http://cucurbitgenomics.org). Additionally, Arabidopsis and rice PPR proteins [1,7] were used as queries in a BLASTP search (E-value cutoff of 1 × 10<sup>−5</sup>) against the watermelon (version 2) protein sequences. Apart from the BLASTP analysis, '*Pentatricopeptide repeat*' was used as a keyword in a functional annotation search of the watermelon 97103 (version 2) genome at CuGenDB. The results of these searches were combined, and redundant sequences were removed. Non-redundant watermelon PPR proteins in which SMART (http://smart.embl-heidelberg.de/) confirmed the presence of a PPR motif (E-value < 0.1) were retained for further analysis. To analyze protein structure and PPR motif types in the translated PPR protein sequences, the hmmsearch program from the HMMER package [30] was used to classify the proteins into the P or PLS (E1, E2, E+, and DYW) classes. PPR proteins with fewer than two P motifs were excluded from further analysis, following previous studies on rice and cotton [7,31]. Finally, the identified candidate PPR genes were named *Citrullus lanatus* PPR (*ClaPPR*).
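The merging and filtering steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function name `select_ppr_candidates`, the protein identifiers, and the motif counts are hypothetical, and the per-protein P-motif counts are assumed to have already been derived from the hmmsearch output.

```python
def select_ppr_candidates(search_hits, p_motif_counts, min_p_motifs=2):
    """Merge hits from several searches and keep proteins with enough P motifs.

    search_hits     -- list of iterables of protein identifiers, one per search
    p_motif_counts  -- dict mapping protein identifier to its number of P motifs
    min_p_motifs    -- proteins with fewer P motifs than this are excluded
    """
    # Taking the union of all hit lists removes redundant identifiers.
    merged = set()
    for hits in search_hits:
        merged.update(hits)
    # Proteins absent from the motif table are treated as having zero P motifs.
    return sorted(
        pid for pid in merged if p_motif_counts.get(pid, 0) >= min_p_motifs
    )


# Hypothetical identifiers and counts, for illustration only.
blastp_hits = ["prot_A", "prot_B", "prot_C"]
keyword_hits = ["prot_B", "prot_D"]
p_counts = {"prot_A": 5, "prot_B": 1, "prot_C": 0, "prot_D": 2}

candidates = select_ppr_candidates([blastp_hits, keyword_hits], p_counts)
print(candidates)
```

In this toy input, `prot_B` and `prot_C` are dropped because they carry fewer than two P motifs, mirroring the exclusion rule stated above.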

### *2.3. Chromosomal Locations, Genomic Distribution, Exon/Intron Organization, and Synteny Analysis*

Information on the accession number, chromosomal location, CDS, and protein sequence of each non-redundant PPR gene of watermelon 97103 (version 2) was retrieved from the Cucurbit Genomics Database (CuGenDB; http://cucurbitgenomics.org). The identified *ClaPPR* genes were mapped proportionally onto the watermelon chromosomes using the CIRCOS software [32]. The exon/intron organization of the *ClaPPR* genes was predicted using the Gene Structure Display Server (GSDS 2.0; http://gsds.cbi.pku.edu.cn/). To analyze gene duplication and syntenic relationships among the *ClaPPR* genes, the Multiple Collinearity Scan toolkit (MCScanX) was employed with default parameters as previously reported [33]. Watermelon 97103 (version 2) genes were classified into segmental or tandem duplication types using an all-against-all BLASTP comparison (E-value ≤ 1 × 10<sup>−10</sup>). Putative segmentally duplicated *ClaPPR* gene pairs in the watermelon genome were visualized in CIRCOS. To evaluate the selection pressure on duplicated *ClaPPR* genes, the rates of non-synonymous (Ka) and synonymous (Ks) substitution were determined using the PAL2NAL web program (http://www.bork.embl.de/pal2nal/).
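As an illustration of how Ka and Ks values are commonly read, the following Python sketch applies the conventional interpretation of the Ka/Ks ratio (below 1, purifying selection; approximately 1, neutral evolution; above 1, positive selection). These thresholds are the standard convention and are an assumption here, since they are not stated in this section; the function name `selection_type` and the example values are hypothetical.

```python
def selection_type(ka, ks, tol=1e-9):
    """Classify a duplicated gene pair by its Ka/Ks ratio.

    Returns None when Ks is not positive, because the ratio is undefined.
    """
    if ks <= 0:
        return None
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical Ka and Ks values, for illustration only.
for ka, ks in [(0.1, 0.5), (0.6, 0.3), (0.2, 0.2), (0.1, 0.0)]:
    print(ka, ks, selection_type(ka, ks))
```

Pairs with Ks equal to zero are returned as `None` rather than forced into a category, so that undefined ratios remain visible in downstream tables.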

### *2.4. Gene Ontology (GO), Motif Identification, and Subcellular Location Prediction*

The *ClaPPR* genes were searched by BLAST against the Arabidopsis genome (https://www.arabidopsis.org/Blast/cereon.jsp), and the corresponding Arabidopsis homolog accessions (selected at an E-value of 1 × 10<sup>−10</sup>) for each *ClaPPR* were retrieved. GO annotation of the *ClaPPR* genes was performed using these Arabidopsis accessions. The AgriGO web-based tool (v1.2) was used for gene ontology (GO) enrichment analysis (*p* < 0.05) of the *ClaPPR* genes (http://systemsbiology.cau.edu.cn/agriGOv2/index.php) [34]. Conserved motifs among the *ClaPPR* proteins were identified using MEME (Multiple Em for Motif Elicitation) version 5.5.1 (http://meme-suite.org) [35] with the following parameters: motif width of 13–50 residues, a maximum of 25 motifs, and all remaining parameters at default. The subcellular localization of the *ClaPPR* proteins was predicted using TargetP 2.0 (https://services.healthtech.dtu.dk/service.php?TargetP-2.0) and Predotar v.1.04 [36] with default parameters.
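The homolog selection step can be sketched in Python as follows. The sketch assumes that the BLAST results are available in the standard 12-column tab-separated format (E-value in column 11, bit score in column 12); the searches in this study were run through a web interface, so this input format, the function name `best_hits`, the choice of the highest bit score as the tie-breaking criterion, and all identifiers are assumptions made for illustration.

```python
def best_hits(tabular_text, max_evalue=1e-10):
    """Return the best subject per query from 12-column tabular BLAST output.

    Hits with an E-value above max_evalue are ignored; among the remaining
    hits, the one with the highest bit score is kept for each query.
    """
    best = {}
    for line in tabular_text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: hit[0] for query, hit in best.items()}


# Hypothetical hits, for illustration only.
example = "\n".join([
    "query_1\tsubject_a\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-50\t200",
    "query_1\tsubject_b\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-20\t90",
    "query_2\tsubject_c\t30.0\t50\t35\t0\t1\t50\t1\t50\t1e-3\t40",
])

homologs = best_hits(example)
print(homologs)
```

Here `query_2` has no retained homolog because its only hit exceeds the E-value threshold, so it would carry no transferred GO annotation.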
