**3. Results**

#### *3.1. Comparative Analysis of NLR Gene Composition in the Genomes of Five Arecaceae Species*

A total of 399, 536, 85, 262, and 135 NLR genes were retrieved from the ANNA database for the genomes of *C. simplicifolius*, *D. jenkinsiana*, *P. dactylifera*, *E. guineensis* and *C. nucifera*, respectively. *D. jenkinsiana* possessed the largest number of NLR genes, 1.34-, 6.31-, 2.05- and 3.97-fold the numbers in *C. simplicifolius*, *P. dactylifera*, *E. guineensis* and *C. nucifera*, respectively. All NLR genes were assigned to the CNL or RNL subclass based on the classification provided in ANNA; no TNL genes were found. CNL genes overwhelmingly outnumbered RNL genes, accounting for 99.50%, 99.25%, 100%, 99.62% and 98.52% of NLRs in *C. simplicifolius*, *D. jenkinsiana*, *P. dactylifera*, *E. guineensis* and *C. nucifera*, respectively. Only two, four, one, and two RNL genes were present in *C. simplicifolius*, *D. jenkinsiana*, *E. guineensis* and *C. nucifera*, respectively, and no RNL genes were identified in *P. dactylifera*. Domain composition analysis revealed that fewer than half of the NLR genes in *C. simplicifolius*, *D. jenkinsiana* and *C. nucifera* encode intact NLR proteins possessing all three domains (CC/RPW8-NBS-LRR); the remaining NLR genes lack the CC/RPW8 domain at the N-terminus, the LRR domain at the C-terminus, or the domains at both termini (Table 1). The proportion of intact NLR genes in the *E. guineensis* genome is much higher, with 144 (143 CNL and 1 RNL) of the 262 genes having all three domains. Some CNL genes were classified as "other" because of their atypical domain compositions (Tables 1 and S1). For example, the *D. jenkinsiana* genome encodes one "other" CNL gene (NCCLNCCL), and the *C. nucifera* genome encodes two "other" CNL genes (NCCLNCC and CNCNL).
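The fold differences and CNL proportions quoted above follow directly from the gene counts. The short sketch below recomputes them from the numbers given in the text; the dictionary and function names are illustrative and not part of the original analysis.

```python
# NLR and RNL gene counts per genome, as reported in the text (ANNA database).
nlr_total = {
    "C. simplicifolius": 399,
    "D. jenkinsiana": 536,
    "P. dactylifera": 85,
    "E. guineensis": 262,
    "C. nucifera": 135,
}
rnl_count = {
    "C. simplicifolius": 2,
    "D. jenkinsiana": 4,
    "P. dactylifera": 0,
    "E. guineensis": 1,
    "C. nucifera": 2,
}


def fold_relative_to(reference, counts):
    """Ratio of the reference species' gene count to each other species' count."""
    ref = counts[reference]
    return {sp: round(ref / n, 2) for sp, n in counts.items() if sp != reference}


def cnl_percentage(total, rnl):
    """Percentage of NLR genes that are CNLs (no TNLs were found, so CNL = total - RNL)."""
    return {sp: round(100 * (total[sp] - rnl[sp]) / total[sp], 2) for sp in total}


fold = fold_relative_to("D. jenkinsiana", nlr_total)
cnl_pct = cnl_percentage(nlr_total, rnl_count)

for species in nlr_total:
    print(species, fold.get(species, "reference"), cnl_pct[species])
```

Running it gives ratios of 1.34, 6.31, 2.05 and 3.97 and CNL shares of 99.5, 99.25, 100.0, 99.62 and 98.52, matching the values stated in the paragraph.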


**Table 1.** The number of identified NLR genes in the five Arecaceae genomes.

In addition to the three typical domains, integrated exogenous domains were detected in NLR genes from all five genomes: eight, 15, five, three and five distinct integrated domains (IDs) were found in 13, 14, six, seven and seven NLR genes of the *C. simplicifolius*, *D. jenkinsiana*, *P. dactylifera*, *E. guineensis* and *C. nucifera* genomes, respectively (Table S2). All the NLR-ID genes belong to the CNL subclass. The number of NLR genes with fused IDs (NLR-ID genes) in the five genomes shows a significant positive correlation with the total number of NLR genes (Figure 1a). On average, 4.15% of the NLR genes in each genome have the NLR-ID structure. A comparison of ID diversity shows that a total of 32 non-redundant IDs were present in the five genomes (Table S2). Some of these IDs, including the v-SNARE domain and the PKinase domain, have been detected in proteins with immune function. Plant SNARE domain-containing proteins are targets of filamentous fungal effectors and are monitored by NLRs for programmed cell death [31]. PKinases are known to function in the immune pathways of both plants and mammals and are also often found in the receptor-like PKinases that transduce PAMP-triggered immunity [32]. Of the 32 ID types, one was found in NLR genes from three species and two were found in NLR genes from two species (Table S2). The remaining IDs were each found in only one genome, suggesting that species-specific domain fusions occur frequently (Figure 1b).

**Figure 1.** Exogenous fusion domains of NLR genes in *D. jenkinsiana*, *C. nucifera*, *C. simplicifolius*, *E. guineensis* and *P. dactylifera*. (**a**) Spearman correlation between the total number of NLR genes in each species and the number of NLR genes with fused IDs. r represents the correlation coefficient; *p* < 0.05. (**b**) Exogenous domains fused to NLR genes in a species-specific manner or shared among different Arecaceae species. Black circles indicate the presence of an exogenous fusion domain, and gray circles indicate its absence.
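The relationship in Figure 1a can be approximated from the counts given in the text. The sketch below is a plain-Python Spearman calculation (average ranks for ties, then Pearson correlation on the ranks), together with the unweighted mean NLR-ID proportion. It is an illustration under the assumption that the per-genome counts quoted above are the inputs; it does not compute a *p*-value, and the authors' actual statistical software is not stated in this section.

```python
import math

# Order: C. simplicifolius, D. jenkinsiana, P. dactylifera, E. guineensis, C. nucifera
total_nlr = [399, 536, 85, 262, 135]   # total NLR genes per genome (from the text)
nlr_id = [13, 14, 6, 7, 7]             # NLR genes carrying integrated domains


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rho computed as the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


rho = spearman(total_nlr, nlr_id)
mean_pct = sum(100 * k / n for k, n in zip(nlr_id, total_nlr)) / len(total_nlr)

print(f"Spearman rho = {rho:.3f}")
print(f"Mean NLR-ID proportion = {mean_pct:.2f}%")
```

With only five genomes the coefficient is sensitive to each data point; the mean proportion computed this way comes out close to the 4.15% reported in the text.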

#### *3.2. Organization of NLR Genes in Arecaceae Genomes*

The clustered organization of NLR genes has been proposed as an important mechanism for generating NLR diversity and functional members. Our results show that the majority of NLR genes were organized into clusters rather than singletons in the *C. nucifera*, *D. jenkinsiana*, and *E. guineensis* genomes, with 54.8%, 70.7% and 77.7% of NLR genes detected in clusters, respectively (Table 2). In contrast, singletons predominated in the other two Arecaceae genomes, with only 15 (17.6%) and 176 (44.1%) NLR genes organized into clusters in the *P. dactylifera* and *C. simplicifolius* genomes, respectively. Among the five Arecaceae genomes, the clustered loci of *E. guineensis* contained the most genes on average (3.37 genes/locus). The largest gene clusters were locus 47 (eight genes) in *C. nucifera*; loci 34 and 126 (10 genes each) in *D. jenkinsiana*; locus 80 (11 genes) in *E. guineensis*; locus 237 (five genes) in *C. simplicifolius*; and locus 47 (three genes) in *P. dactylifera* (Table 2).
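To make the singleton/cluster distinction concrete, the following sketch groups genes into loci by physical distance and then summarizes them in the same terms as Table 2. The clustering criterion is an assumption made for illustration (neighbouring NLR genes on the same chromosome no more than `max_gap` bp apart, with a hypothetical default of 250 kb); this section does not state the rule actually applied, and the gene coordinates are toy data.

```python
from collections import defaultdict


def group_into_loci(genes, max_gap=250_000):
    """Group (gene_id, chrom, start) records into loci.

    Assumed rule: consecutive genes on one chromosome whose start positions
    differ by at most max_gap belong to the same locus.
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom, start in genes:
        by_chrom[chrom].append((start, gene_id))

    loci = []
    for chrom in sorted(by_chrom):
        current, prev_start = [], None
        for start, gene_id in sorted(by_chrom[chrom]):
            if prev_start is not None and start - prev_start > max_gap:
                loci.append(current)
                current = []
            current.append(gene_id)
            prev_start = start
        if current:
            loci.append(current)
    return loci


def summarize(loci):
    """Counts and proportions analogous to the columns discussed for Table 2."""
    clustered = [locus for locus in loci if len(locus) > 1]
    n_genes = sum(len(locus) for locus in loci)
    n_clustered = sum(len(locus) for locus in clustered)
    return {
        "singleton_loci": len(loci) - len(clustered),
        "clustered_loci": len(clustered),
        "pct_genes_in_clusters": round(100 * n_clustered / n_genes, 1),
        "mean_genes_per_clustered_locus": (
            round(n_clustered / len(clustered), 2) if clustered else 0.0
        ),
    }


# Hypothetical coordinates, for illustration only.
toy_genes = [
    ("g1", "chr1", 100_000), ("g2", "chr1", 200_000), ("g3", "chr1", 900_000),
    ("g4", "chr2", 50_000), ("g5", "chr2", 120_000), ("g6", "chr2", 300_000),
]
toy_loci = group_into_loci(toy_genes)
toy_summary = summarize(toy_loci)
print(toy_loci)
print(toy_summary)
```

In the toy data, `g3` is a singleton and the other five genes fall into two clustered loci, so 83.3% of genes are clustered with a mean of 2.5 genes per clustered locus.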

**Table 2.** The organization of NLR genes in five Arecaceae species.


NLR genes may undergo duplication via different mechanisms [33]. We surveyed the duplication patterns of NLR genes in the five Arecaceae species using MCScanX. The results showed that different duplication types dominated NLR gene expansion in the five genomes. The majority of NLR genes in *C. nucifera* (57.0%) and *E. guineensis* (70.1%) were generated by tandem/proximal duplication, whereas most NLR genes in the remaining three genomes were classified as dispersed duplicates. Only a small proportion of NLR genes in *C. nucifera*, *D. jenkinsiana* and *E. guineensis* were generated by whole-genome duplication (WGD)/segmental duplication, and no WGD/segmental duplicates were found among the NLR genes of *C. simplicifolius* or *P. dactylifera* (Figure 2). However, the proportion of segmentally duplicated genes might have been underestimated, because the syntenic relationships of NLR genes may have been disrupted during long-term evolution.
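The per-type percentages above can be tallied from a gene-level classification. The sketch below assumes the classification came from the `duplicate_gene_classifier` program bundled with MCScanX, whose `.gene_type` output lists each gene with a numeric code (0 singleton, 1 dispersed, 2 proximal, 3 tandem, 4 WGD/segmental); the text names only MCScanX, so this choice, the function name, and the toy input are assumptions.

```python
from collections import Counter

# Codes as documented for MCScanX duplicate_gene_classifier output.
TYPE_NAMES = {
    0: "singleton",
    1: "dispersed",
    2: "proximal",
    3: "tandem",
    4: "WGD/segmental",
}


def tally_duplication_types(gene_type_lines, nlr_ids):
    """Percentage of NLR genes per duplication type.

    gene_type_lines: iterable of "gene_id<whitespace>code" lines.
    nlr_ids: set of gene IDs annotated as NLRs.
    Tandem and proximal duplicates are merged, as in the text.
    """
    counts = Counter()
    for line in gene_type_lines:
        parts = line.split()
        if len(parts) != 2 or parts[0] not in nlr_ids:
            continue  # skip malformed lines and non-NLR genes
        label = TYPE_NAMES[int(parts[1])]
        if label in ("tandem", "proximal"):
            label = "tandem/proximal"
        counts[label] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical classifier output, for illustration only.
toy_lines = ["geneA\t3", "geneB\t2", "geneC\t1", "geneD\t4", "geneE\t0"]
toy_nlrs = {"geneA", "geneB", "geneC", "geneD"}
toy_pct = tally_duplication_types(toy_lines, toy_nlrs)
print(toy_pct)
```

Here `geneE` is ignored because it is not in the NLR set, leaving 50% tandem/proximal, 25% dispersed and 25% WGD/segmental among the four toy NLR genes.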

**Figure 2.** Types of gene duplication of NLR genes in *C. simplicifolius*, *D. jenkinsiana*, *P. dactylifera*, *E. guineensis* and *C. nucifera*.
