*2.7. TPM Statistical Test*

To test whether the repeat is associated in transcripts with the closest upstream gene rather than with the closest downstream gene, we used previously available RNA-seq data from [databank]. Reads from three different lineages (CL Brener-P, CL Brener-S, and Y) were used for both epimastigote and trypomastigote forms. Transcripts per million (TPM) values were averaged between forms within the same genome for each genic region containing a 5′ gene/repeat/3′ gene arrangement. For each region, we compared the TPM of the repeat with the TPM of the upstream gene and with that of the downstream gene, and took the smaller of the two differences as the sign of the most likely posttranscriptional genic association of the repeat ("−1" if the smaller TPM difference was with the upstream gene, "+1" if with the downstream gene). A one-tailed sign test was then performed in R [28] to test the hypothesis of a stronger association with the upstream gene. This procedure was carried out on two subsets of genic regions: (1) regions in which a member of the four gene families with which repeats were found to be more strongly associated (trans-sialidases, etc.) appeared at either of the neighboring positions, and (2) the remaining genic regions ("background regions"), in which neither flanking gene was a member of those four gene families. Cases in which a second repeat flanked the 5′ end of the upstream gene or the 3′ end of the downstream gene were discarded.
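The sign assignment and the one-tailed sign test described above can be illustrated with a minimal sketch. It is written in Python for illustration only; the analysis itself was run in R [28], and this is not the authors' script. The function names and the example TPM values are hypothetical. Two details are assumptions not stated in the text: the "difference" is taken as an absolute difference, and regions in which both differences are equal are scored 0 and dropped, as is usual for a sign test.

```python
from math import comb


def assign_sign(tpm_upstream, tpm_repeat, tpm_downstream):
    """Return -1 if the repeat's TPM is closer to the upstream gene,
    +1 if closer to the downstream gene, 0 on a tie (assumed handling)."""
    d_up = abs(tpm_repeat - tpm_upstream)      # assumption: absolute difference
    d_down = abs(tpm_repeat - tpm_downstream)
    if d_up < d_down:
        return -1
    if d_down < d_up:
        return 1
    return 0


def one_tailed_sign_test(signs):
    """Exact one-tailed sign test.

    H0: upstream and downstream associations are equally likely (p = 0.5).
    H1: association with the upstream gene (-1) is more frequent.
    Returns P(X >= n_up) for X ~ Binomial(n, 0.5), ties excluded.
    """
    n_up = sum(1 for s in signs if s == -1)
    n_down = sum(1 for s in signs if s == 1)
    n = n_up + n_down                          # ties (0) are dropped
    if n == 0:
        return 1.0                             # no informative regions
    return sum(comb(n, k) for k in range(n_up, n + 1)) / 2 ** n


# Hypothetical (upstream, repeat, downstream) TPM triples, one per genic region.
regions = [
    (12.0, 11.5, 3.0),
    (40.2, 38.0, 10.1),
    (5.0, 4.8, 9.9),
    (7.5, 2.0, 2.4),
]
signs = [assign_sign(up, rep, down) for up, rep, down in regions]
p_value = one_tailed_sign_test(signs)
print(signs, p_value)
```

Under the same assumptions, the equivalent call in R would be an exact binomial test on the count of "−1" regions, for example `binom.test(n_up, n, p = 0.5, alternative = "greater")`. The test would be run separately on the gene-family subset and on the background regions.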
