**4. Discussion**

Watermelon (*Citrullus lanatus*) is one of the economically important crops in the *Cucurbitaceae* family. Watermelon fruits contain sugars, carotenoids (lycopene, β-carotene, and phytoene), and various health-promoting nutritional compounds (glutathione, citrulline, and arginine), which contribute significantly to the human diet [28]. Watermelon has a relatively compact genome (~425 Mbp), and its gene families are being investigated owing to the availability of sequenced and re-sequenced draft genomes [29,42]. PPR proteins form one of the largest gene families in terrestrial plants. In the present study, 422 PPR proteins were identified in the 97103 watermelon genome, of which 197 and 225 members belonged to the P subfamily and PLS subfamily, respectively (Figure 1; Table S2). This number is in accordance with previous studies that reported >400 PPR genes per plant genome, including *Arabidopsis*, foxtail millet, maize, and rice [1,5–7]. Analysis of gene structure revealed that more than 70% of the *ClaPPR* genes were intronless, consistent with reports that a majority of plant PPR genes (for example, 80% of *PPR* genes in *Arabidopsis*) lack introns [3,5,7]. This could be the result of amplification by retrotransposition events, in which intron-poor genes might have originated from intron-rich PPRs [1,43,44].

Although conserved protein motif analysis does not recover the same motifs as those used to distinguish the different types of PPR proteins, it can be used to determine the conserved molecular functions in the P and PLS subfamilies of PPR genes [6]. Conserved motif analysis showed that 25 motifs are present in *ClaPPR* proteins: motifs 21 and 25 are present only in the P subfamily; motifs 5, 13, and 20 are found only in the DYW subgroup of the PLS subfamily; and motifs 3, 7, and 17 are shared by both DYW and E2 (Figure S2). This type of motif distribution has also been observed for PPR proteins in maize, where some motifs were even conserved between the two genomes [6]. These subgroup-specific motifs could be a significant component of the corresponding *ClaPPR* genes in different subgroups and may determine their conserved molecular functions. However, extensive future studies on the characterization of these *ClaPPRs* are required to elucidate their conserved functions. The *ClaPPR*s were unevenly distributed across the chromosomes and often clustered together in short chromosomal regions (Figures 1A and 2). These results indicated that chromosome size was not closely associated with the number of genes [31] and that duplication events could have resulted in the expansion of these genes, as suggested in a previous study [5]. In the analysis of duplicated *ClaPPR* genes, we noted a total of 11 segmentally duplicated genes distributed across all the chromosomes (Figure 2; Table S3), suggesting that segmental duplication is more frequent than tandem duplication in watermelon, which corresponds to the findings of previous genome-wide studies [45,46]. In our phylogenetic analysis, *ClaPPR* proteins could be classified, based on the pattern of PPR motifs, into two groups: the P subfamily and the PLS subfamily (Figure 3). Several P subfamily proteins nevertheless clustered together with PLS subfamily proteins, similar to the evolutionary relationships reported for PPR proteins in poplar and rice [3,7].

PPR proteins have been predominantly predicted to be located in the mitochondria and plastids [7]. In the present study, most of the identified *ClaPPR* proteins (approximately 65%) were predicted to localize to the chloroplast (73 of 422) or mitochondria (204 of 422) (Figure S3). GO analysis also indicated that many *ClaPPR* proteins appear to be located in the mitochondria (191) and chloroplast (136) (Figure 4). PPR proteins act as RNA-binding proteins and modulate organellar gene expression at the RNA level through post-transcriptional or translational regulation [47]. A large number of *ClaPPRs* were also annotated with binding functions, including DNA, RNA, protein, and other binding functions (Figure 4 and Figure S4), which is consistent with the known functions of PPR proteins. Consequently, defects or mutations in organelle-targeted PPR proteins often result in organelle dysfunction, which ultimately leads to altered phenotypes, including cytoplasmic male sterility [24], defective embryo development [12], abnormal photosynthetic ability and aberrant pigmentation in seeds [48], and flesh color variation [27]. Future studies on the functional characterization of *ClaPPR* genes will clarify their implications for watermelon breeding.
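
The "approximately 65%" figure can be reproduced from the counts quoted above. The following is a minimal sketch, not the pipeline used in this study; the variable and function names are illustrative assumptions, and only the counts (73, 204, and 422) come from the text.

```python
# Counts quoted in the text (Figure S3); all names below are illustrative.
TOTAL_CLAPPR = 422
predicted_localization = {"chloroplast": 73, "mitochondria": 204}


def organelle_fraction(counts, total):
    """Return the share of proteins predicted in any of the listed organelles."""
    return sum(counts.values()) / total


fraction = organelle_fraction(predicted_localization, TOTAL_CLAPPR)
print(f"Organelle-targeted ClaPPR proteins: {fraction:.1%}")
```

The same helper could be applied to the GO-based counts (191 and 136), provided the two categories do not overlap; the text does not state whether they do.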

There is increasing evidence that PPR genes play a significant role in plant growth and developmental processes, and their mRNA expression patterns have been explored in cotton floral buds [31], maize kernels [6], and rice panicles [7]. In the present study, we investigated the expression patterns of *ClaPPR* genes during watermelon fruit development (BioProject: SRP012849). The in silico expression analyses indicated that *ClaPPRs* were differentially expressed in the rind and flesh tissues (Figure 5). Watermelon fruits undergo rapid cell division and expansion in the early stages of fruit development, resulting in changes in cell wall structure and accumulation of compounds (carbohydrates and organic acids) in vacuoles; these are followed by the fruit ripening stages, which involve changes in carotenoid profiles and conversion of carbohydrates to sugars [49]. The expression levels of some *ClaPPR* members of the P subfamily were relatively high in fruit flesh (Clade III), in fruit rind (Clade V), or in both flesh and rind tissues (Clades II, IV, VI, and VII), suggesting that they could be important in coordinating the development of rind and flesh in watermelon (Figure 5A). In the PLS subfamily, most of the subgroups were preferentially expressed at higher levels in the rind at all DPA as well as in the flesh at 10 DPA (clade I of DYW; clades II and III of E2; and clade I of E+), suggesting that these genes might be important in fruit rind and early fruit flesh development (Figure 5B–F). DYW (clades II, III, and IV) and E2 (clade IV) members presented high expression levels in the fruit flesh at the late ripening stages (26 and 34 DPA), indicating that these genes might be essential for carotenoid accumulation and fruit ripening in watermelon (Figure 5B–C). These results imply that *ClaPPR* genes might be involved in watermelon fruit development and the chloroplast-to-chromoplast transition.

Fruit flesh color is an important trait of watermelon; variations in carotenoid profiles often result in red (lycopene), yellow (phytoene), and orange (β-carotene) flesh [29]. Fruit ripening has been reported to be influenced by environmental factors, hormones, and developmental gene regulation [49,50]. Therefore, identification and characterization of genes that govern fruit growth and ripening would be helpful in watermelon breeding. The present study explored the expression profiles of *ClaPPR* genes to evaluate their possible roles in the fruit growth and ripening stages (10–42 DPA) of 'COS' (pale yellow flesh) and 'LSW-177' (red flesh; hereafter LSW177) watermelons (BioProject: PRJNA338036). At 10 and 18 DPA, *ClaPPR* genes from various subgroups were generally upregulated in both COS and LSW177 (Figure S5). However, DYW, E2, and E+ subgroup genes were more strongly upregulated in LSW177 than in COS. Similarly, genes of both the P and PLS subfamilies were upregulated only in LSW177 at 26 DPA. In contrast, these two subfamilies showed higher transcript accumulation in COS than in LSW177 at the full-ripening stage (34 DPA) (Figure S5A,D). The expression profiles also suggested that 242 and 226 *ClaPPR* genes were upregulated at all stages in LSW177 and COS, respectively, among which 60 and 52 genes showed high expression levels (log2 values between 3 and 5) (Figure S5; Table S6). Thus, LSW177 had about 1.07 times as many upregulated *ClaPPR* genes as COS (242/226 ≈ 1.07). It therefore appears that the mechanism of fruit growth and ripening in LSW177 may be more complex than that in COS, and that the PPR family is functionally involved in the growth and ripening of fruits. However, further studies are required to elucidate the complete role of these upregulated *ClaPPR*s in watermelon fruit.
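
The comparison between the two cultivars rests on simple arithmetic, which can be checked as follows. This is a minimal sketch using only the gene counts quoted above; the dictionary and variable names are illustrative assumptions, and the per-cultivar share of highly expressed genes is an additional derived quantity that is not reported in the text.

```python
# Gene counts quoted in the text (Figure S5; Table S6); names are illustrative.
upregulated = {"LSW177": 242, "COS": 226}
highly_expressed = {"LSW177": 60, "COS": 52}  # log2 values between 3 and 5

# Ratio of upregulated ClaPPR genes, LSW177 relative to COS.
ratio = upregulated["LSW177"] / upregulated["COS"]

# Share of upregulated genes that are highly expressed, per cultivar.
share_high = {cv: highly_expressed[cv] / upregulated[cv] for cv in upregulated}

print(f"LSW177/COS ratio of upregulated genes: {ratio:.2f}")
for cultivar, share in share_high.items():
    print(f"{cultivar}: {share:.1%} of upregulated genes highly expressed")
```

A ratio this close to 1 reflects a modest difference in gene counts, so it is best read alongside the stage-specific expression differences described above.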

The mRNA expression of PPR genes has been reported to be regulated by microRNAs (miRNAs) through cleavage or translational repression in plants [3,7,51]. In watermelon, a previous study showed that eight PPR genes (Cla008388, Cla012681, Cla015802, Cla018752, Cla011015, Cla005585, Cla019381, and Cla006187) have complementary sites for miRNAs [52]. These miRNAs have been reported to be involved in melatonin-mediated cold tolerance in watermelon by suppressing the expression of the abovementioned PPR genes through either cleavage (gene names as in the present study; *ClaPPR29*:*miR399-5p*, *ClaPPR268*:*miR8029-39*, and *ClaPPR59*:*novel-m0058-5p*) or translational inhibition (*ClaPPR234*:*miR159-5p*, *ClaPPR21*:*miR6284-3p*, *ClaPPR348*:*novel-m0030-5p*, *ClaPPR104*:*novel-m0051-5p*, and *ClaPPR179*:*novel-m0051-5p*) [52]. In addition, a recent study showed that a total of 218 PPR genes have complementary sites for 160 miRNAs [53]. Hence, further research is required on the dynamic expression of miRNAs and their corresponding *ClaPPR* targets, as crosstalk between miRNAs and PPRs is likely to contribute to the regulation of plant growth and fruit development in watermelon.

Based on the SNPs in *ClaPPR* genes (Table S7), four CAPS markers were developed and found to co-segregate with their corresponding flesh colors, with match rates ranging from 0.87 to 1 (Figure 6; Table S8). Notably, *ClaPPR11* co-localized with a β-carotene-related QTL on Chr1 [54], whereas *ClaPPR140* co-localized with a lycopene-related QTL on Chr4 [55]. *ClaPPR25* and *ClaPPR95* did not co-localize with any QTL for flesh color. Therefore, the identified SNPs in these *ClaPPR* genes might be used for fine mapping of flesh color loci in the watermelon genome. A few amino acid positions (the 4th and 34th) in a PPR motif have been found to act as attachment points that help PPR proteins bind to target mRNAs, thus inhibiting translation [56]. In *Arabidopsis*, a single nonsynonymous mutation at the 4th amino acid in the 12th PPR motif completely inhibited the function of a PPR gene called *Proton Gradient Regulation3* [57]. Among the selected SNP-carrying candidate genes, *ClaPPR11* and *ClaPPR140* had a nonsynonymous mutation at the 2nd amino acid position in the 13th and 11th motif, respectively, while *ClaPPR25* had a nonsynonymous mutation at the 23rd amino acid position in the 18th motif (Table S7), suggesting that these SNPs could influence the binding activity of the corresponding *ClaPPR* proteins and therefore play a role in their functions. Furthermore, the aforementioned *ClaPPR* genes could be considered important candidates for watermelon fruit-related traits, and the developed CAPS markers will help breeders to distinguish fruit flesh colors economically at the seedling stage.
