### *3.4. Predicted Subcellular Localization of PPR Proteins*

Increasing molecular evidence suggests that PPR proteins play a pivotal role in RNA editing of transcripts in mitochondria and chloroplasts. Therefore, we predicted the subcellular localization of *ClaPPR* proteins in watermelon using the TargetP 2.0 and Predotar v.1.04 programs. TargetP 2.0 predicted that approximately 32% of *ClaPPR* proteins were targeted to mitochondria and 6% to chloroplasts, whereas Predotar v.1.04 predicted approximately 44% and 16%, respectively (Table S4). Combining the results from both programs, we predicted that approximately 65% of *ClaPPR* proteins were targeted to chloroplasts (73 of 422) or mitochondria (204 of 422); few proteins were predicted to be targeted to the ER (5%), and the localization of the remaining proteins (30%) was uncertain. In the P subfamily, about half of the proteins (106 of 197) were predicted to be mitochondrial and 18% (36 of 197) chloroplastic (Figure S3). In the PLS subfamily, proteins of the DYW, E2, and E+ subgroups showed a similar pattern, with almost half predicted in mitochondria (43–50%) and smaller proportions in chloroplasts (19%, 10%, and 30%, respectively). In the PLS and E1 subgroups, 47% of PLS proteins had an uncertain localization and 67% of E1 proteins were predicted to be mitochondrial; approximately 33% of PLS proteins were targeted to the chloroplast and approximately 33% of E1 proteins to the ER (Figure S3).
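
The text does not state how the TargetP 2.0 and Predotar v.1.04 calls were merged into a combined prediction. The following Python sketch shows one possible consensus rule and how the resulting percentages could be tallied. The rule itself, the label strings, and the function names `combine_calls` and `localization_percentages` are assumptions made for illustration, not the authors' procedure.

```python
from collections import Counter

# Labels treated as a positive targeting prediction (assumed label strings).
ORGANELLES = {"mitochondria", "chloroplast", "ER"}


def combine_calls(targetp_call, predotar_call):
    """Merge two per-protein predictions into one label (hypothetical rule).

    - both programs name the same compartment -> that compartment
    - only one program names a compartment    -> that compartment
    - the programs disagree, or neither names one -> "uncertain"
    """
    a_hit = targetp_call in ORGANELLES
    b_hit = predotar_call in ORGANELLES
    if a_hit and b_hit:
        return targetp_call if targetp_call == predotar_call else "uncertain"
    if a_hit:
        return targetp_call
    if b_hit:
        return predotar_call
    return "uncertain"


def localization_percentages(labels):
    """Return the percentage of proteins assigned to each label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Toy input: (TargetP call, Predotar call) for four made-up proteins.
toy_calls = [
    ("mitochondria", "mitochondria"),
    ("other", "chloroplast"),
    ("chloroplast", "mitochondria"),
    ("other", "other"),
]
consensus = [combine_calls(t, p) for t, p in toy_calls]
percentages = localization_percentages(consensus)
print(consensus)
print(percentages)
```

Under a different rule (for example, requiring agreement between both programs), the proportion of uncertain proteins would be larger, so the rule needs to be stated alongside any combined figure.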

**Figure 3.** Phylogenetic relationships among the *ClaPPR* family genes. The full-length amino acid sequences of 422 *ClaPPR* proteins and 44 PPR proteins from Arabidopsis were aligned, and the NJ tree was built with 1000 bootstrap replicates using MEGA7.0. The P and PLS subfamilies are represented by blue and black lines, respectively. P subfamily members that clustered with PLS subfamily members are indicated by a blue dot.

### *3.5. Gene Ontology (GO) Annotation of ClaPPR Genes*

To elucidate the role of PPR genes in watermelon, GO annotation was performed for the *ClaPPR*s. The results suggested that 364 of the 422 *ClaPPR* transcripts participated in biological processes (82.69%), cellular components (92.85%), and molecular functions (55.22%) (Figure 4). Functional categorization indicated that a large portion of the *ClaPPR*s were likely related to metabolic processes (179), followed by nucleobase-containing compound metabolic (168), unknown biological (91), and other cellular (50) processes (Figure 4A; Table S5). In the cellular component category, 191 and 136 *ClaPPR* genes were predicted to be targeted to mitochondria and chloroplasts, respectively, 146 to other intracellular components, 37 to the nucleus, and 19 to plastids (Figure 4B; Table S5). For molecular functions, a total of 130 *ClaPPR* genes showed putative participation in binding functions, including protein (28), RNA (15), DNA (5), and other (82) binding. Several *ClaPPR*s were also predicted to be involved in activities including transferase (15), catalytic (11), transporter (5), kinase (5), and hydrolase (7) activity, as well as unknown molecular functions (69) (Figure 4C; Table S5). In addition, GO enrichment analysis using AgriGO [34] gave results similar to those of the GO annotation. In the biological process category, all the *ClaPPR* families were enriched for RNA modification (GO:0009451) (Figure S4A). In the molecular function category, binding functions, namely zinc ion (GO:0008270), transition metal ion (GO:0046914), cation (GO:0043169), and protein (GO:0005515) binding, were the enriched terms (Figure S4B).

**Figure 4.** Functional annotation of *ClaPPR* proteins by Gene Ontology (GO) analysis. According to the GO annotation, the *ClaPPR* proteins were assigned to the functional categories of (**A**) biological process, (**B**) cellular component, and (**C**) molecular function.

### *3.6. Expression Profiles of ClaPPR Genes in Different Stages of Fruit Development in 97103 Watermelon*

Different PPR members may differ in their levels of mRNA accumulation among tissues during plant physiological processes. To explore the putative biological functions of PPR members in watermelon, the expression profiles of P- and PLS-subfamily members, including the PLS subgroups (PLS, DYW, E1, E2, and E+), were investigated in fruit rind and flesh at different days after pollination (DAP) during fruit development of watermelon 97103, using RNA-seq data from the cucurbit expression atlas. Based on hierarchical clustering and the expression heat map (Figure 5), *ClaPPR* genes from each subgroup were differentially expressed in rind and flesh during fruit development. The majority of members from each subgroup accumulated preferentially in the rind rather than the flesh, and the genes could be clustered into distinct expression groups (clades). Based on expression pattern, the 197 *ClaPPR* genes of the P subfamily were distributed into seven distinct clades. Clade II of the P subfamily includes 27 genes that were abundantly expressed at the earlier stages (10 and 18 DAP) in both rind and flesh, with expression declining at later stages (Figure 5A). These results suggest that these *ClaPPR* genes might be involved in the early stages of development of each tissue. Clade III comprises 13 genes (*ClaPPR334*, *ClaPPR304*, *ClaPPR341*, *ClaPPR331*, *ClaPPR294*, *ClaPPR377*, *ClaPPR277*, *ClaPPR102*, *ClaPPR222*, *ClaPPR13*, *ClaPPR391*, *ClaPPR228*, and *ClaPPR32*) that were strongly upregulated in the flesh at almost all stages. Clade IV, comprising 22 genes, showed preferential accumulation in the flesh; in the rind, these genes showed significant expression only at 10 DAP. Clade V, the largest with 81 genes, was abundantly expressed only in the rind at almost all stages, suggesting that these genes could have a functional role in rind development. Clades VI and VII contain 31 and 11 genes (*ClaPPR421*, *322*, *333*, *364*, *27*, *183*, *285*, *263*, *160*, *298*, *119*, and *16*), respectively, with transcripts accumulating mainly in the rind at all stages; their expression in the flesh was higher at the early (10 DAP) and later (26 and 34 DAP) stages, respectively (Figure 5A).

Analysis of the expression patterns of the PLS-subfamily subgroups indicated that *ClaPPR* members in clade I of DYW (47 genes), clades II (13 genes) and III (31 genes) of E2, and clade I of E+ (6 genes) showed significantly higher expression in the rind than in the flesh, where they were expressed only at the early stage (10 DAP) (Figure 5B). Similarly, in the DYW subgroup, some genes in clades II and III were upregulated in the rind (10–34 DAP) and flesh (26–34 DAP). Flesh-specific expression of *ClaPPR* genes was also noted in clade IV of DYW (*ClaPPR330*, *143*, *240*, *31*, *185*, *98*, *55*, *168*, *226*, *189*, *313*, and *191*), where these genes were highly expressed in the flesh at 26–34 DAP. Clade I of the E2 subgroup, which comprises 13 genes, was upregulated in flesh tissues at 26 DAP (*ClaPPR281*, *216*, *171*, *139*, *320*, *230*, *201*, *37*, and *271*) and 34 DAP (*ClaPPR201*, *37*, *271*, and *131*) and showed higher expression in the rind at 10 DAP (Figure 5C). Furthermore, clade IV of E2 (containing 29 genes) also included a few genes (*ClaPPR321*, *178*, *81*, *50*, *242*, *395*, and *85*) that were highly expressed in the flesh at 34 DAP. In the PLS subgroup, by contrast, a total of 7 genes in clade II (*ClaPPR401*, *90*, *280*, *350*, *121*, *179*, and *411*) were more highly expressed in the rind than in the flesh at all DAP stages (Figure 5D). E1 and E+, which are smaller subgroups with only 3 and 10 *ClaPPR* members, respectively, showed relatively higher expression in the rind (clade I of each) than in the flesh; only a few genes were expressed at a higher level in the flesh (*ClaPPR312* in a clade of E1; *ClaPPR52*, *307*, and *395* in clade II of E+) (Figure 5E,F). 
These results indicate that *ClaPPR* genes in each subgroup showed high expression in both rind and flesh, either at all stages or at particular stages, which provides a preliminary understanding of their possible participation in watermelon fruit development.

### *3.7. Sequence Variation in ClaPPR Genes and Development of CAPS Markers for Flesh Color*

To utilize *ClaPPR* genes in watermelon breeding, we investigated the relationship between watermelon flesh color and sequence variation in *ClaPPR* genes. We identified SNPs in the *ClaPPR* gene sequences of 24 watermelons with different flesh colors (red, yellow, and orange) using our WGRS data (BioProject: PRJNA516776) [29]. A total of 368 SNPs were observed in 139 *ClaPPR* genes in the WGRS data. After detailed analysis of all SNPs, we selected four SNP-carrying genes, *ClaPPR11*, *ClaPPR25*, *ClaPPR95*, and *ClaPPR140*, from the 9 red-, 9 yellow-, and 6 orange-fleshed watermelons and compared them with the 97103 reference genome (Table S7). The SNPs in these genes were monomorphic within a given flesh color type and polymorphic between that type and the others; each SNP nearly co-segregated with a particular flesh color phenotype (orange: *ClaPPR11*; yellow: *ClaPPR25* and *ClaPPR95*; red: *ClaPPR140*). Three of the four SNPs (in *ClaPPR11*, *ClaPPR25*, and *ClaPPR140*) were non-synonymous substitutions that altered an amino acid residue in the PPR motifs, which could cause functional variation in the corresponding genes between the desired and non-desired flesh types. Therefore, to test the association between the four SNP-carrying *ClaPPR* genes and flesh color, CAPS marker primer sets were designed and analyzed by restriction digestion. Using these four CAPS markers, 70 commercial cultivars with red (33), yellow (17), and orange (20) flesh were genotyped to assess the reliability and applicability of the markers in watermelon breeding (Figure 6; Table S7). The genotyping results for flesh color determination, based on the SNPs of the four *ClaPPR* genes, are described in Table S8. 
*ClaPPR11* showed a high match rate of 0.87 (87%) between marker genotypes and the orange-flesh phenotype across all surveyed lines, and *ClaPPR140* co-segregated perfectly with red flesh color, with a match rate of 1 (100%). *ClaPPR25* and *ClaPPR95* co-segregated well with yellow flesh, with match rates of 0.79 and 0.76, respectively (Table S8). Furthermore, joint genotyping with *ClaPPR25* + *ClaPPR95* provided an average match rate of 0.94, indicating high reliability and applicability of the flesh-type-specific *ClaPPR* gene-based SNPs identified in this study.
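
The section reports match rates but does not give the formula behind them. A minimal Python sketch of one plausible definition follows: the fraction of surveyed lines in which the marker call (target allele present or absent) agrees with whether the line has the target flesh color. This definition, the function name `match_rate`, and the toy data are assumptions for illustration only and do not reproduce the values in Table S8.

```python
def match_rate(marker_positive, phenotypes, target_color):
    """Fraction of lines where the marker call agrees with the phenotype.

    marker_positive: list of bool, True if the line carries the allele
                     associated with target_color (assumed encoding).
    phenotypes:      list of str, observed flesh color of each line.
    target_color:    the flesh color the marker is meant to identify.
    """
    if len(marker_positive) != len(phenotypes):
        raise ValueError("marker calls and phenotypes must have equal length")
    if not phenotypes:
        raise ValueError("at least one line is required")
    agreements = sum(
        call == (color == target_color)
        for call, color in zip(marker_positive, phenotypes)
    )
    return agreements / len(phenotypes)


# Toy data for five made-up lines and a hypothetical red-flesh marker.
toy_phenotypes = ["red", "red", "yellow", "orange", "yellow"]
toy_calls = [True, True, False, False, True]  # last line is a mismatch
toy_rate = match_rate(toy_calls, toy_phenotypes, "red")
print(toy_rate)
```

With this definition, both a missed target-color line and a falsely flagged line of another color lower the rate equally.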

**Figure 5.** Expression profiles of *ClaPPR* genes during watermelon fruit development (Cultivar 97103; BioProject: SRP012849). The heat maps of *ClaPPR* expression levels, subdivided into clusters (labeled with Roman numerals), were built using log2-transformed FPKM values of fruit rind and flesh at the developmental stages of 10, 18, 26, and 34 days after pollination (DAP). (**A**) Expression profiles of *ClaPPR* genes in the P subfamily. (**B**–**F**) Expression profiles of *ClaPPR* genes in the PLS subfamily subgroups: (**B**) DYW, (**C**) E2, (**D**) PLS, (**E**) E1, and (**F**) E+. High (red) and low (blue) transcript abundances are shown in color according to the *Z*-score scale bar.
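
The caption describes heat maps built from log2-transformed FPKM values and colored by *Z*-score. A minimal Python sketch of that per-gene scaling is given below. The pseudocount of 1 (added so that zero FPKM values can be logged), the use of the population standard deviation, and the function name `row_zscores` are assumptions; the caption does not specify these details.

```python
import math
import statistics


def row_zscores(fpkm_row, pseudocount=1.0):
    """Scale one gene's FPKM values to Z-scores across samples.

    Each value is transformed as log2(FPKM + pseudocount), then centered
    on the row mean and divided by the row (population) standard deviation.
    A gene with identical values in every sample gets all zeros.
    """
    logged = [math.log2(value + pseudocount) for value in fpkm_row]
    mean = statistics.fmean(logged)
    spread = statistics.pstdev(logged)
    if spread == 0:
        return [0.0 for _ in logged]
    return [(x - mean) / spread for x in logged]


# Toy FPKM values for one made-up gene at 10, 18, 26, and 34 DAP.
toy_row = [0.0, 1.0, 3.0, 7.0]  # log2(x + 1) gives 0, 1, 2, 3
toy_z = row_zscores(toy_row)
print([round(z, 3) for z in toy_z])
```

Because the scaling is done per gene, the colors compare stages and tissues within a gene and do not indicate which genes have higher absolute expression.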

### *3.8. Comparative Expression Patterns of ClaPPR Genes at Different Fruit Ripening Stages of Two Cultivated Watermelon Varieties with Red and Pale-Yellow Flesh*

To explore the role of *PPR*s in fruit ripening of watermelons with different flesh colors, we compared the expression profiles of *ClaPPR* genes across fruit growth and ripening stages, including immature white (10 DAP), white-pink (18 DAP), red (26 DAP), and full-ripe (34 and 42 DAP), in two cultivated watermelons differing in flesh color, 'COS' (pale-yellow flesh) and 'LSW-177' (red flesh) (Table S6; BioProject: SRP012849). Almost all of the 422 *ClaPPR* genes were expressed in at least one of the DAP stages in the two varieties, and the genes exhibited preferential and stage-specific expression between varieties. In the P subfamily, most *ClaPPR* members showed uniformly upregulated expression in both COS and LSW-177 at 10 DAP. However, at 26 DAP, these genes were more abundantly expressed in LSW-177 than in COS (Figure S5A). At 10 DAP, almost all of the *ClaPPR* genes in both the DYW and E2 subgroups showed significantly upregulated expression in the red flesh of LSW-177 compared with COS. At 18 DAP, these genes exhibited robust, uniformly upregulated expression in both COS and LSW-177 relative to the remaining DAP stages (Figure S5B–C). The PLS subgroup members showed relatively similar expression levels in COS and LSW-177 at 10 DAP and a higher level in LSW-177 than in COS at 26 DAP, as observed in the P subfamily (Figure S5D). Both E1 and E+ subgroup members exhibited significantly higher expression in LSW-177 than in COS at 10 DAP, while their expression at 18 DAP was almost uniform between LSW-177 and COS, as observed in the DYW and E2 subgroups (Figure S5E–F). At the full-ripe stages (34 and 42 DAP), only the P subfamily exhibited significant expression; some of the *ClaPPR* members had higher transcript accumulation in COS than in LSW-177, particularly at 34 DAP (Figure S5A and D). 
These results suggest that *ClaPPR* genes can be considered candidate genes associated with the growth and ripening of watermelon fruits.

**Figure 6.** Gel pictures for the validated cleaved amplified polymorphic sequence (CAPS) markers based on single-nucleotide polymorphisms (SNPs) in *PPR* genes. The names of the validated CAPS markers, *ClaPPR11* (**A**), *ClaPPR25* (**B**), *ClaPPR95* (**C**), and *ClaPPR140* (**D**), are shown below their corresponding gel pictures. Numbers in lanes, underlined with different colors, indicate the DNA sample names of the 70 lines, which are the same as those listed in Table S7. M represents a 100 bp ladder. A "*green asterisk*" marks DNA samples cleaved by the enzyme. DNA samples of C803 (red-flesh) and C819 (yellow-flesh) were used as controls (Table S7) during the restriction digestion of PCR amplicons.
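
The cleaved and uncleaved banding patterns in such gels arise because a SNP creates or abolishes a restriction site within the PCR amplicon. The Python sketch below illustrates the principle by computing fragment lengths for two alleles. The amplicon sequences are invented, and EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example; the enzymes and amplicons actually used for the four markers are not given in this section.

```python
def digest_fragment_lengths(amplicon, site, cut_offset):
    """Return fragment lengths after cutting a linear amplicon.

    site:       recognition sequence to search for (top strand only).
    cut_offset: number of bases into the site at which the cut falls.
    An amplicon without the site is returned as one uncut fragment.
    """
    cut_positions = []
    start = 0
    while True:
        index = amplicon.find(site, start)
        if index == -1:
            break
        cut_positions.append(index + cut_offset)
        start = index + 1
    bounds = [0] + cut_positions + [len(amplicon)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical 46 bp amplicons that differ at a single base (A -> G).
left_flank = "ACGT" * 5
right_flank = "TTGCA" * 4
cleavable_allele = left_flank + "GAATTC" + right_flank   # site present
resistant_allele = left_flank + "GAGTTC" + right_flank   # SNP removes site

cut_fragments = digest_fragment_lengths(cleavable_allele, "GAATTC", 1)
uncut_fragments = digest_fragment_lengths(resistant_allele, "GAATTC", 1)
print(cut_fragments)    # two shorter bands expected on the gel
print(uncut_fragments)  # one full-length band
```

A heterozygous line would show both patterns together, that is, the full-length band plus the two shorter fragments.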
