*2.1. Phylogeny and Clustering Based on ACE2 Domain-Based Homology*

First, we examined all substitutions in which the replacing residue has similar chemical properties and similar side-chain binding atoms, which suggests that these substitutions would not impede SARS-CoV-2 transmission. Note that all mutations are defined relative to the human ACE2 domains D1, D2, and D3 (Figure 1).


**Figure 1.** Substitutions in D1, D2, and D3 domains of ACE2 across eighteen species.

In the D1 domain: out of eighteen species, eight were found to carry a substitution at position 30 in which D (aspartate) is replaced by E (glutamate), and four were found to carry the D38E substitution. The side-chain oxygen atom of aspartate has been reported to take part in ionic interactions, and the same side-chain oxygen atom is present in glutamate, so this substitution may not affect the protein–protein interaction properties [27,28]. In the T27S substitution, threonine and serine both possess an OH group that participates in binding, and in the H34L substitution, both histidine and leucine use the NH group (backbone HN) to interact with another amino acid. Consequently, if we consider only the interacting atoms that are critical for these substitutions, we can conclude that these changes would not impede binding between the S protein and ACE2.

In the D2 domain: the L79I substitution is notable across the eighteen species because leucine and isoleucine share similar chemical properties. Thus, if we assess the change by the chemical properties of the residues, which are the main contributing factor for protein–protein interaction, we can conclude that it will not significantly affect binding between ACE2 and the RBD of the S protein.

In the D3 domain: out of eleven substitutions, three (R393K, K353H, and K353R) involve residues of a similar type with similar side-chain interacting atoms; changes at these positions would therefore not affect the interaction of ACE2 with the S protein.
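The reasoning applied to the three domains can be illustrated with a minimal Python sketch that flags a substitution as conservative when both residues fall into the same coarse side-chain class. The grouping in `SIDE_CHAIN_CLASS` and the helper names are illustrative assumptions, not the classification scheme used in this study. In particular, the sketch compares side chains only, so it does not capture the backbone NH argument made above for H34L and reports that substitution as non-conservative.

```python
import re

# Coarse side-chain classes: an illustrative assumption, not the paper's scheme.
SIDE_CHAIN_CLASS = {
    "acidic": "DE",
    "basic": "KRH",
    "hydroxyl": "ST",
    "aliphatic": "AVLIM",
    "aromatic": "FWY",
    "amide": "NQ",
    "special": "CGP",
}
RESIDUE_TO_CLASS = {
    residue: name
    for name, letters in SIDE_CHAIN_CLASS.items()
    for residue in letters
}


def parse_substitution(label):
    """Split a label such as 'D30E' into (reference, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def is_conservative(label):
    """True when both residues belong to the same coarse side-chain class."""
    reference, _, replacement = parse_substitution(label)
    return RESIDUE_TO_CLASS[reference] == RESIDUE_TO_CLASS[replacement]


# Substitutions named in the text for domains D1, D2, and D3.
substitutions = ["D30E", "D38E", "T27S", "H34L", "L79I", "R393K", "K353H", "K353R"]
report = {label: is_conservative(label) for label in substitutions}

for label, conservative in report.items():
    print(label, "conservative" if conservative else "not conservative by side chain")
```

A finer scheme, for example a substitution matrix or an explicit list of interacting atoms per residue, would be needed to reproduce arguments that rest on specific binding atoms.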

Secondly, homology across all nineteen species was derived from the amino acid sequences, and the associated phylogenetic trees were drawn (Figure 2).
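As a rough illustration of how sequence homology can be turned into the pairwise distances that a tree-building method consumes, the following Python sketch computes a fraction-identity distance matrix from pre-aligned sequences. The text does not state which alignment or tree-construction method was used, so the distance definition here is an assumption, and the three short sequences and their names are hypothetical placeholders rather than ACE2 data.

```python
def fraction_identity(seq_a, seq_b):
    """Fraction of identical positions between two aligned sequences.

    Columns where both sequences have a gap ('-') are skipped; a gap
    facing a residue counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else 0.0


def distance_matrix(aligned):
    """Return sorted names and the matrix of 1 - identity for every pair."""
    names = sorted(aligned)
    matrix = [
        [1.0 - fraction_identity(aligned[p], aligned[q]) for q in names]
        for p in names
    ]
    return names, matrix


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "A": "TDKHEL",
    "B": "TEKHEL",
    "C": "SEKLEI",
}
names, matrix = distance_matrix(toy_alignment)
for name, row in zip(names, matrix):
    print(name, [round(value, 3) for value in row])
```

Such a matrix could then be passed to a distance-based tree method such as neighbor joining or UPGMA.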

Six clusters of the nineteen species were formed using the K-means clustering technique based on sequence homology of the three domains (Figure 3). The species groups {*S*1, *S*2, *S*3} and {*S*6, *S*13} stayed together under both the full-length ACE2 sequence homology and the combined three-domain sequence similarity. The species S16, S17, and S18 showed the same behavior.

Furthermore, the sequence homology of the D1, D2, and D3 domains placed species S15 in the cluster containing S9, S10, and S12, although the ACE2 sequence of S15 was similar to those of S8 and S9. Although S4 was very similar to S9, S10, and S12 in full-length ACE2 homology, it grouped with S5 and S11 with respect to the three domain-based sequence spatial organizations. In addition, S7 was found in the proximity of S6 and S13, although it was very similar to S5 and S11 based on ACE2 homology.
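The domain-based clustering described above can be sketched in Python as follows. This is a minimal K-means (Lloyd's algorithm) under stated assumptions: each species is represented by a three-component vector of per-domain similarity scores for D1, D2, and D3, the numeric values and species names are hypothetical, and a deterministic farthest-point initialisation replaces the usual random start so that the result is reproducible. The feature construction actually used for Figure 3 is not specified in this section.

```python
import math


def kmeans(points, k, iterations=100):
    """Lloyd's K-means with deterministic farthest-point initialisation."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        farthest = max(
            points, key=lambda p: min(math.dist(p, c) for c in centers)
        )
        centers.append(farthest)

    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(
                    sum(column) / len(members) for column in zip(*members)
                )
    return labels, centers


# Hypothetical (D1, D2, D3) similarity scores, for illustration only.
features = {
    "sp_a": (0.95, 0.90, 0.92),
    "sp_b": (0.94, 0.91, 0.93),
    "sp_c": (0.60, 0.55, 0.58),
    "sp_d": (0.62, 0.54, 0.57),
    "sp_e": (0.30, 0.35, 0.32),
}
species = list(features)
labels, centers = kmeans([features[name] for name in species], k=3)
for name, label in zip(species, labels):
    print(name, "-> cluster", label)
```

Running the same routine on full-length similarity features and on the three-domain features, then comparing the label assignments, is one way to reproduce the kind of comparison made above, in which some species change cluster between the two representations.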
