*3.4. The MO Allows the Establishment of a Threshold That Can Serve as a Parameter for Other Methods That Detect Origins*

To compare the MO with the origins obtained by different experimental approaches, we plotted trend lines for each methodology analyzed (Figure 2A–E). We observed the expected positive correlation between the number of origins and the size of each chromosome, i.e., the larger the chromosome, the more origins are required to replicate it within the S-phase duration. As the MO is estimated from parameters that are relatively constant in a wild-type population, the MO trend line (the black lines in Figure 2A–E) establishes a threshold that can serve as a parameter when estimating the number of origins by other methods.
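The comparison described above can be sketched in a few lines of Python. This is only an illustration, assuming the trend lines are ordinary least-squares fits of origin counts against chromosome length; the fitting procedure is not stated here, the function names `fit_trend_line` and `relation_to_threshold` are hypothetical, and the numbers are placeholders rather than data from any of the cited studies.

```python
import numpy as np


def fit_trend_line(lengths_mb, origin_counts):
    """Fit origins = slope * length + intercept by least squares (assumed method)."""
    slope, intercept = np.polyfit(lengths_mb, origin_counts, deg=1)
    return float(slope), float(intercept)


def relation_to_threshold(method_fit, mo_fit, lengths_mb):
    """Say whether a method's trend line lies above or below the MO trend line.

    The two lines are compared only at the observed chromosome lengths.
    """
    x = np.asarray(lengths_mb, dtype=float)
    gap = (method_fit[0] - mo_fit[0]) * x + (method_fit[1] - mo_fit[1])
    if np.all(gap > 0):
        return "above"
    if np.all(gap < 0):
        return "below"
    return "crosses"


# Hypothetical placeholder values, NOT measurements from the study.
lengths_mb = [0.5, 1.0, 1.5, 2.0, 2.5]   # chromosome lengths (Mb)
counts = {
    "MO": [2, 4, 6, 8, 10],              # minimum origins per chromosome
    "method_x": [3, 6, 9, 12, 15],       # a method that detects more origins
    "method_y": [1, 2, 3, 4, 5],         # a method that detects fewer origins
}

fits = {name: fit_trend_line(lengths_mb, c) for name, c in counts.items()}
for name in ("method_x", "method_y"):
    print(name, relation_to_threshold(fits[name], fits["MO"], lengths_mb))
```

A method whose line is reported as "below" would, on average, map fewer origins than the minimum needed to complete replication within the S-phase, which is the situation the threshold is meant to flag.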

**Figure 2.** Comparative analysis among different approaches used to estimate replication origins in trypanosomatids (*T. cruzi*, *L. major*, *T. brucei*) and yeasts (*S. cerevisiae* and *S. pombe*) (A–E). Graphs showing positive correlations between chromosome length and the number of replication origins estimated by different approaches: minimum number of origins, MO (black), origins estimated by DNA combing (red), origins estimated by MFA-seq (green), potential origins mapped by SNS-seq (blue), origins estimated by microarray (yellow), known origins (purple), and A+T-rich islands (gray). (**A**) *T. cruzi*, (**B**) *L. major*, (**C**) *T. brucei*, (**D**) *S. cerevisiae*, and (**E**) *S. pombe*. The trend lines for all approaches, as well as their equations, are shown. Studies are referenced in each graph.

Before proceeding with our analysis, it is worth noting that replication origins can be classified into three categories according to their usage: constitutive, which are always activated in all cells of a given population; flexible, whose usage varies from cell to cell; and dormant, which are not fired during a normal cell cycle but are activated in the presence of replication stress [52]. However, because of the technical difficulty in distinguishing flexible from dormant origins, we refer to both simply as non-constitutive origins.

Thus, when comparing the trend lines of the origins estimated by DNA combing (the red lines in Figure 2A–E) with the MO trend lines (the black lines in Figure 2A–E), we observed that *T. cruzi*, *L. major*, *T. brucei*, *S. cerevisiae*, and *S. pombe* use, on average, more origins per chromosome than the minimum required, i.e., the red lines lie above the black ones in Figure 2A–E. This makes sense, given that the DNA combing approach estimates the pool of all origins (constitutive + non-constitutive) fired in a population.

For *L. major*, in addition to the trend line for origins estimated by DNA combing (the red line in Figure 2B), we also plotted a trend line for potential origins mapped by small leading nascent strand purification coupled to next-generation sequencing (SNS-seq) [16] (the blue line in Figure 2B), and a trend line for origins mapped by marker frequency analysis (MFA-seq) [10] (the green line in Figure 2B). The trend line of the potential origins mapped by SNS-seq is far above the others (the blue line in comparison with the others in Figure 2B). This makes sense because the SNS-seq approach has high accuracy and resolution in detecting small sites of replication, which can include DNA repair, potential origins, and other events that generate DNA synthesis in a population. In contrast, the trend line of origins mapped by MFA-seq is below the threshold imposed by the minimum origins (MO trend line) (the black line in comparison with the green one in Figure 2B). This implies that, using only the origins mapped by MFA-seq [10], *L. major* would be unable to replicate its nuclear genome within the S-phase duration. Although this seems implausible, it can be easily explained by the fact that MFA-seq analysis has low resolution and accuracy, and is probably able to identify only the constitutive origins in a population rather than the entire pool of fired origins, as the DNA combing approach does, for example [3,8]. Nevertheless, further studies are necessary to determine how many origins are actually used by a single cell of *L. major* during a standard cell cycle.

In *T. brucei*, we also plotted a trend line for the origins mapped by MFA-seq, and the situation is similar to that observed in *L. major* (the black dots in comparison with the green ones in Figure 2C), i.e., some chromosomes cannot be duplicated using only the origins mapped by MFA-seq, as already explained in a recent study from our group [8]. Briefly, the reason is the same as explained above: the low resolution of the MFA-seq analysis.
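The per-chromosome check implied here (which chromosomes have fewer mapped origins than the MO) can be sketched as follows. This is a minimal illustration only: the helper `chromosomes_below_threshold` is hypothetical, and the chromosome names and counts are placeholders, not values from the MFA-seq or MO analyses.

```python
def chromosomes_below_threshold(mo_by_chrom, mapped_by_chrom):
    """Return {chromosome: shortfall} where mapped origins are fewer than the MO.

    Chromosomes missing from `mapped_by_chrom` are treated as having 0 mapped origins.
    """
    shortfalls = {}
    for chrom, minimum in mo_by_chrom.items():
        mapped = mapped_by_chrom.get(chrom, 0)
        if mapped < minimum:
            shortfalls[chrom] = minimum - mapped
    return shortfalls


# Hypothetical placeholder values, NOT data from the study.
mo_by_chrom = {"chr1": 2, "chr2": 3, "chr3": 5}
mapped_by_chrom = {"chr1": 2, "chr2": 1, "chr3": 4}

below = chromosomes_below_threshold(mo_by_chrom, mapped_by_chrom)
print(below)
```

Any chromosome reported by such a check could not be fully duplicated within the S-phase using only the mapped origins, which points to origins that the method failed to detect.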

For *S. cerevisiae*, in addition to plotting a trend line for origins estimated by DNA combing [53] (the red line in Figure 2D) and a trend line for origins mapped by microarray analysis [12] (the green line in Figure 2D), we also plotted a trend line for the known positions of origins [54] (the purple line in Figure 2D). The known positions of origins refer to conserved DNA sequences (called autonomously replicating sequences, ARS) where the assembly of the pre-replication complex occurs [53,55,56]. The trend line of the known origins is above the MO trend line (the purple line in comparison with the black one in Figure 2D), and above the trend line of the origins estimated by DNA combing (the purple line in comparison with the red one in Figure 2D). This makes sense because the known positions of origins are potential sites for the establishment of origins. However, not all of these sites are activated during the S-phase, i.e., in *S. cerevisiae*, there are many more potential sites for the establishment of origins than those that are actually used to complete replication within the S-phase duration [19,57–60]. The trend line of origins mapped by microarray analysis is above the threshold imposed by the minimum origins (MO trend line) (the green line in comparison with the black one in Figure 2D), but below the trend lines of both the known origins and the origins estimated by DNA combing (the comparison among the green, purple, and red lines in Figure 2D). This was expected for the same reason raised before, i.e., just like MFA-seq, microarray analysis has low accuracy in detecting the entire pool of origins activated in a population and maps mainly the constitutive origins.

Unlike *S. cerevisiae*, *S. pombe* lacks a consensus DNA sequence that determines origin sites. However, its origins coincide with chromosomal A+T-rich islands [13]. Thus, in addition to plotting a trend line for origins estimated by DNA combing [50] (the red line in Figure 2E) and a trend line for origins mapped by microarray analysis [61] (the yellow line in Figure 2E), we also plotted a trend line for the A+T-rich islands [13] (the gray line in Figure 2E). All three of these trend lines (red, yellow, and gray) are above the threshold imposed by the minimum origins (MO trend line) and practically overlap each other, although, as the comparison among all the trend lines in Figure 2E shows, the trend line of the origins estimated by DNA combing is slightly above the other two, as expected. This overlap of the trend lines raises a question about the dynamics of origin usage during the S-phase in *S. pombe*, which seems to be relatively peculiar when compared to other single-celled eukaryotes [50].

Of note, no MFA-seq or microarray data are available for *T. cruzi* so far, which prevents a deeper comparative analysis in this organism.
