2.1. Virophage Sputnik
In 2008, transmission electron microscopy of the viral particle factory of a giant virus, Mamavirus ACMV (Acanthamoeba (A) castellanii mamavirus), revealed small virions. ACMV resembles the giant Mimivirus APMV (Acanthamoeba polyphaga mimivirus), which belongs to the genus Mimivirus, family Mimiviridae, and ACMV was found in water from the Bradford cooling tower. The small virions were named the virophage Sputnik, after the Earth’s first satellite, a name meaning “travelling companion” [
2,
4,
25,
26,
42,
43] (Figure 2).
This virophage was initially classified among the satellite viruses [
11,
26,
44,
45] and is now included in the genus
Sputnikovirus and the family
Lavidaviridae [
1,
15,
28]. The Sputnik virophage, similar to the other described virophages, lacks an envelope and has an icosahedral capsid 50–70 nm in diameter, composed of 260 pseudohexameric and 12 pentameric capsomeres, which encloses a circular double-stranded DNA 18,343 bp in length. This genome carries the V20 gene, whose product contains 595 amino acids, with 437 amino acids determining the APMV giant virus MCP protein [
11,
24,
26,
42]. This indicates that the Sputnik virophage evolved from other genetic elements before associating with giant viruses [
11,
24,
25,
26,
44,
46]. A trimeric MCP protein is the most abundant component of the Sputnik virophage capsid, forming a hexagonal surface lattice characterized by a triangulation number of T = 27 [
26].
This MCP protein assembles into pseudohexameric and pentameric capsomeres to form the outer shell of the capsid of this virophage [26]. It has also been shown that the pseudohexameric capsomeres on the surface of this virophage are covered by 55 Å “protrusions”, each with a triangular head protruding from the center of a pseudohexameric unit [26]. Their function is not fully known, although they are presumed to play a role in the recognition and adhesion of the Sputnik virophage to the ACMV giant virus particle, the APMV giant virus particle, or both, allowing the virophage and the giant virus to enter eukaryotic host cells such as amoebae [26,47]. In contrast, the pentameric capsomeres lack “protrusions” and have a cavity in the center of the pentamer, which may serve as a pathway for DNA exit or entry [26]. It has been shown that inside the capsid of this virophage, there is a lipid bilayer 4 nm thick, with lipids accounting for 12–24% of the particle and phosphatidylserine as the main component [
11,
26].
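The capsomere counts quoted above are consistent with the geometry of a T = 27 icosahedral lattice. As a minimal sketch (the function names are hypothetical and are not taken from the cited studies), the standard Caspar–Klug relations, T = h² + hk + k² and a total of 10T + 2 capsomeres of which 12 are pentameric, can be checked as follows:

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number for lattice indices h and k."""
    return h * h + h * k + k * k


def capsomere_counts(t: int) -> dict:
    """Capsomere counts of an icosahedral capsid with triangulation number t.

    An icosahedral shell has 10*t + 2 capsomeres in total; 12 of them sit
    on the fivefold vertices (pentameric), the rest are (pseudo)hexameric.
    """
    total = 10 * t + 2
    pentamers = 12
    return {"total": total, "pentamers": pentamers, "hexamers": total - pentamers}


# T = 27 corresponds to h = 3, k = 3 (9 + 9 + 9).
t = triangulation_number(3, 3)
counts = capsomere_counts(t)
print(t, counts)
```

For T = 27 this gives 272 capsomeres, that is, 260 pseudohexameric and 12 pentameric units, in agreement with the values reported for Sputnik.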
The organization of the Sputnik genome is typical of viral genomes, namely a tight arrangement of genes with little overlap. The virophage genome contains 21 genes encoding proteins ranging in size from 88 to 779 amino acids [
24,
25,
26,
44,
48]. Among these are genes encoding its trimeric MCP and mCP proteins, as well as proteins predicted to be involved in its replication [
24,
25,
26]. The genome of this virophage shows a high A+T content of 73%, which is very similar to the characteristics of the
APMV giant virus [
24,
26]. Of its 21 genes, 13 have no homologues in the GOS (Global Ocean Survey) environmental databases [24]. In contrast, the remaining eight encode proteins with detectable homologues in these databases, of which three are derived from APMV giant virus, one is a homologue of the integrase of archaeal viruses, and the other four are a predicted primase–helicase, a packaging ATPase (with homologues in bacteriophages and eukaryotic viruses), a distant homologue of the DNA-binding subunit of bacterial insertion sequence transposase, and a Zn ribbon protein [
24,
26]. Three of the predicted Sputnik proteins, encoded by g06, g12, and g13, were most closely related to the products of the ACMV and APMV giant virus genes [
24,
26], except that the protein encoded by g06 is more similar to the ACMV giant virus homologue, while the proteins encoded by g12 and g13 are similar to the respective ACMV and APMV virus homologues [
24,
26].
Furthermore, g06 and g07 encode proteins containing a highly conserved collagen triple helix motif, while g13 encodes a protein consisting of two domains involved in viral DNA replication [
24,
26]. It has been shown [
24,
26] that the C-terminal domain of the protein encoded by g13 is a highly conserved superfamily 3 helicase that clusters in phylogenetic trees with homologues from giant viruses, the nucleocytoplasmic large DNA viruses (NCLDVs). In contrast, its amino-terminal portion is a domain for which homologues with high similarity have only been detected among proteins from the GOS data set and which, due to the presence of a signature sequence motif, represents a highly derived version of the primase [
24,
26]. The protein encoded by g03 of this virophage has also been shown to be similar, but only to a limited extent, to the packaging ATPase of the FtsK–HerA superfamily present in all NCLDVs and many bacteriophages [
24,
26]. Adjacent to the primase–helicase gene, g14 of this virophage encodes a protein that contains a Zn ribbon motif and is similar to several proteins in the GOS database [
24,
26].
Additionally, g04 encodes a Zn ribbon protein but lacks highly conserved homologues [
24,
26], while the protein encoded by g10 shows significant sequence similarity to integrases of the tyrosine recombinase family from archaeal viruses and proviruses [24]. In contrast, the protein encoded by g17 has homologues in the GOS database and belongs to the family of bacterial insertion sequence transposase DNA-binding subunits/domains (transposase A proteins) [
24,
26]. In this virophage, g20 has been shown to encode the MCP protein, while g08 and g19 encode the mCP protein [24,26]. It has also been recorded that the genes most closely related to sequences in the GOS database are, in order, g13, g03, and g14: the protein encoded by g13 is involved in essential replicative functions, the protein encoded by g03 is responsible for packaging the genome, and the protein encoded by g14 has a potential function in regulating gene expression [24]. Since the FtsK-like ATPase (ATP-dependent DNA translocase) and the primase–helicase of Sputnik are similar to typical viral genes, the Sputnik virophage is likely related to as yet unknown giant viruses, which are abundantly represented in marine metagenomic sequences [
24,
26]. Hence, it is indicated [
24,
26] that Sputnik virophage genes are evolutionarily related to at least three distinct sources: the family of giant viruses that includes ACMV and APMV, a family of viruses and plasmids of archaea, and another putative family of viruses.
For the Sputnik virophage, the dominant host among viruses of Lineages A, B, and C of the family
Mimiviridae is the giant
Mamavirus ACMV, which infects the amoebae
A. castellanii [
1,
2,
17,
26,
42,
48,
49]. However, with the same kinetics, it can also effectively infect the giant virus, Mimivirus APMV, of the same genus and family but infecting the amoeba
A. polyphaga [
3,
4,
11,
21,
49]. It has been recorded that on the surface of these giant viruses, there are 140 nm fibrils, 1.4 nm in diameter, composed of glycosylated proteins, terminating in a protein head anchored to the capsid [
26]. These fibers form a protective layer resembling bacterial peptidoglycan and are involved in the penetration of the Sputnik virophage with the giant virus into the amoebae [
26]. The three proteins, R135, L829, and L725, present in
Mamavirus ACMV filaments are essential elements [
26], as the R135 protein is a GMC (glucose–methanol–choline)-type oxidoreductase, which is probably involved in the adhesion of this virophage to Mamavirus ACMV, since virophage replication has not been detected during coinfection with the naked (fibril-less) form of the giant virus [26]. The Sputnik virophage is assumed to be internalized by amoebae together with ACMV or APMV giant viruses in the same endocytic vacuole [
26,
28,
33,
46,
50], and its replication occurs in the viral particle factory of these giant viruses, which harms them. It has been recorded that the coinfection of Sputnik with the ACMV giant virus results in the formation of defective giant virus virions, reducing the replication efficiency of the giant virus by up to 70%. This effect on ACMV giant viruses protects the amoebae [
3,
4,
11,
24,
26,
42,
46,
50]. It has been shown that after 24 h of culture of amoebae with this giant virus, approximately 92% of these protozoa are lysed, whereas coinfection with the virophage and the giant virus results in a value of 79% [
26].
2.2. Virophage Sputnik 2
Virophage Sputnik 2 was described in 2012 in a giant virus of the genus
Lentillevirus of the family
Mimiviridae that parasitized the amoeba
A. polyphaga, which was isolated from a contact lens solution belonging to a 17-year-old patient with keratitis [
11,
17,
25,
26,
43,
49]. This virophage, analogous to Sputnik, has circular dsDNA genetic material, an icosahedral capsid approximately 70 nm in diameter and belongs to the genus
Sputnikovirus, family
Lavidaviridae [
1,
11,
15,
17,
24,
25,
26,
28,
43,
49]. The genome of this virophage has 18,338 bp and 20 or 21 genes, which encode proteins of 88 to 779 amino acids [
11,
25,
26]. Four of its genes are similar to those of eukaryotes and bacteriophages, three to those of APMV giant viruses, and one to those of archaeal viruses [
11,
25,
26]. The virophage Sputnik 2 replicates in the viral particle factory of giant viruses of the
Mimiviridae family of Lineages A, B, and C, causing their destruction and thus protecting their amoeba host cells [
2,
3,
4,
11,
17,
26,
43,
48]. With the discovery of the Sputnik 2 virophage in the giant virus
Lentillevirus, a provirophage integrated into its genome was also found [
12], as well as a new class of mobile genetic elements; that is, small fragments of DNA in the form of independent mobile “pieces” found in both the virophage and giant virus genomes, which have been called transpovirions [
12,
17,
25,
26,
49]. These pieces are similar to the transposons (jumping genes) found in eukaryotic organisms, which can insert their DNA independently into the host cell genome or stay outside the host cell genome [
25,
26,
42]. It is thought that, due to the presence of these mobile DNA molecules, the virophage Sputnik 2 acts as a “carrier” of genes between the giant virus Lentillevirus and the amoeba A. polyphaga, which is an example of a tripartite CVV system consisting of the novel transpovirion, the virophage, and the giant virus parasitizing the amoeba [
12,
17,
28,
42].
2.3. Virophage Sputnik 3
This virophage was described in 2013 [
4] in a soil sample collected in Marseille, France, containing
Mimiviridae family C-lineage giant viruses. However, as with the Guarani virophage, it has been shown that this virophage can be “free” of the giant virus [
25]. This study developed a new protocol for obtaining virophages using giant viruses from the
Mimiviridae family—Lineages A, B, and C, including
APMV giant viruses parasitizing the amoebae
A. polyphaga [
1,
2,
4,
5,
11,
17,
25,
26]. The virophage Sputnik 3, similar to Sputnik and Sputnik 2, has been shown to have a circular dsDNA genome and an icosahedral capsid approximately 70 nm in diameter [
11,
24,
43]. The genome of this virophage is similar to that of Sputnik 2. It consists of 18,338 base pairs [
11,
17,
25,
26], which is only slightly fewer than the 18,343 bp found in the Sputnik virophage [
25,
43]. This virophage, analogous to the Sputnik 2 virophage, contains 20 or 21 genes, which, similar to the Sputnik virophage, encode proteins ranging in size from 88 to 779 amino acids [
25,
44]. Similar to the Sputnik virophage, three of the genes of this virophage are homologous to the genes of the APMV giant virus, one is homologous to the genes of archaeal viruses, and four are similar to the genes of viruses of eukaryotic organisms and bacteriophages [
11,
25,
26]. This virophage’s remaining 12–13 genes have no detectable homologues in the GOS bases [
11,
26]. This mosaicism of Sputnik 3 virophage genes suggests the involvement of its genes in lateral transfers between different viruses that encode proteins of as yet unknown origin and function [
25,
26]. The virophage Sputnik 3, similar to the virophages Sputnik and Sputnik 2, replicates in the viral particle factory of giant viruses of the
Mimiviridae family, mainly of Lineage C, although also of Lineages A and B, causing the formation of abnormal virions of these giant viruses and reducing their infectivity and lytic capacity against their host cells, which are amoebae [
2,
3,
4,
11,
26,
48]. Like Sputnik and Sputnik 2, this virophage belongs to the genus
Sputnikovirus, the family
Lavidaviridae. It differs from them by fewer than 10 base pairs, and all these virophages have a low G+C content (approximately 30%), which is also typical of giant viruses of the family
Mimiviridae [
11,
15,
28].
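The base composition values quoted for these genomes (an A+T content of 73% for Sputnik, a G+C content of approximately 30% for the group) are complementary percentages of the same sequence, and the reported genome sizes can be compared directly. A minimal sketch, assuming a genome is available as a plain string of A, C, G, and T; the toy sequence and the function names below are illustrative only and are not real virophage data:

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def at_content(seq: str) -> float:
    """A+T percentage, the complement of the G+C percentage."""
    return 100.0 - gc_content(seq)


# Toy 10-base sequence (hypothetical): 2 of 10 bases are G or C.
toy = "ATATTAGCAT"
print(gc_content(toy), at_content(toy))

# Genome sizes as reported in the text for Sputnik and Sputnik 2/3.
size_difference = 18_343 - 18_338
print(size_difference)
```

On the reported sizes, Sputnik (18,343 bp) and Sputnik 2 or Sputnik 3 (18,338 bp) differ by 5 bp, which fits the statement that they differ by fewer than 10 base pairs.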
2.5. Virophage Mavirus
The Mavirus virophage was obtained in 2010 from the coastal waters of Texas, USA, from the giant virus, CroV (
Cafeteria (
C)
roenbergensis virus), genus
Cafeteriavirus, family
Mimiviridae, infecting the unicellular heterotrophic marine flagellate
C. roenbergensis [
1,
2,
8,
11,
17,
43,
49,
51]. This virophage is named for its high similarity to the self-replicating eukaryotic Maverick/Polinton transposable elements [
17,
52]. This virophage has an icosahedral capsid 50–60 nm in diameter, which is formed only by the main trimeric MCP protein and which, despite its complexity (triangulation number T = 27), does not need auxiliary proteins for capsid assembly [
17,
32,
42,
46,
51]. The genome of Mavirus virophage is a circular dsDNA of 19,063 bp in size, presumably containing 20 genes, among which 13 have been identified as specific genes (g04, g05, g07–09, g10–12, and g14–18), all with a characteristic A+T content of 69.74% [
8,
11,
17,
25,
32,
51,
53]. These genes encode, among other things, the main replication helicase of the NCLDV type, a retroviral integrase, a protein-primed DNA polymerase B (Polβ), an endonuclease, a lipase, and an ATPase, as well as the MCP protein and a cysteine protease [
8,
32,
51]. Ten genes of the Mavirus virophage have been shown to share sequence similarity with proteins of retroviruses and dsDNA viruses, as well as bacteria and eukaryotes. In addition, at least four proteins of this virophage, including its MCP protein, encoded by g18, are homologous to the corresponding proteins of the Sputnik virophage [
8,
11,
12,
17,
51]. The Mavirus genome also encodes a retrovirus-type integrase and a Polβ homologous to the corresponding Maverick/Polinton transposon proteins, and it resembles these transposons in gene length and content, DNA repeats, and host range [
8,
12,
13,
41]. These genetic features of Mavirus virophage allow it to integrate at multiple sites in the CroV giant virus genome [
8,
13]. Studies based on DNA scoring plots of this virophage, and its phylogenetic analysis, have distinguished eight different types of endogenous virophages associated with it. The genes of these endogenous virophages are transcriptionally silent and do not undergo constitutive expression [
8,
13]. However, when C. roenbergensis cells are infected with the CroV giant virus, the expression of Mavirus virophage genes is activated, leading to the replication and synthesis of its virions from provirophages [
8,
13]. As a result, C. roenbergensis flagellate cells carrying these provirophages are not directly protected against CroV giant virus infection. However, when infection with this virus occurs, Mavirus virophages are released, and in subsequent coinfections they inhibit CroV giant virus replication and protect the C. roenbergensis flagellates that these viruses parasitize [
13]. The protection of
C. roenbergensis cells by provirophages against the CroV giant virus appears to follow an altruistic model, as some cells are sacrificed to protect others [
8,
13]. A mutualistic relationship between the Mavirus virophage and the flagellate C. roenbergensis has also been demonstrated: the flagellate gives the Mavirus virophage the opportunity to exist as a provirophage, while the flagellate populations benefit from the Mavirus virophage infecting the CroV giant virus [
13]. It is assumed [
8] that the Mavirus virophage enters the host cell of the CroV giant virus by clathrin-dependent endocytosis, independently of the giant virus, that is, without its participation.
2.6. Virophage OLV (Organic Lake Virophage)
Virophage OLV was described in 2011 as an infecting agent of giant viruses of the family
Phycodnaviridae infecting unnamed phototrophic marine algae obtained from the waters of the hypersaline Organic Lake in southeast Antarctica, whose waters have remained unchanged for decades [
1,
2,
5,
11,
17,
29]. The OLV virophage has an icosahedral capsid with a diameter of 50 nm [
16,
29], and its dsDNA genome is circular with a size of 26,421 bp, characterized by a relatively low G+C content (36.5%) [
11,
17,
29,
49,
52]. The genome of this virophage probably contains 24 genes, 15 of which (namely, g01–11, g15, g21, g24, and g26) were identified as specific, showing 27–42% amino acid similarity to Sputnik virophage proteins [
11,
22,
29]. In addition, of the proteins of the virophage OLV, six are homologous to proteins found in the virophage Sputnik [
29,
37], among which the homologue of Sputnik g20 is its MCP protein and the homologue of g03 is a DNA packaging ATPase. In contrast, the homologue of g13 is a putative DNA polymerase, while the homologues of g09, g18, g21, and g32 are proteins of as yet unknown function [
2,
29,
37,
50]. The homologues of g20, g03, and g13 of Sputnik in the OLV virophage have been shown to determine its primary functions, demonstrating similarity in these virophages [
29,
37].
Furthermore, studies of the OLV virophage have linked six of its genes to genes of giant viruses of the
Phycodnaviridae family, indicating that gene exchange between these virophages and viruses occurs during coinfection [
28,
29,
37]. This was also observed for its g12, derived from an unknown giant virus infecting the alga
Chlorella sp. [
29,
37]. By comparing the genome of the OLV virophage with that of the
Organic Lake phycodnaviruses (OLPV) of the Phycodnaviridae family, it was shown that a 7408 bp region of the OLV virophage, encoding the g17–22 proteins, is 32–65% similar to sequences in the OLPV-1 and OLPV-2 giant viruses of the Phycodnaviridae family [
29,
51]. This virophage also has unique genes that indicate specific adaptation to its helper–host system, including a DNA methyltransferase specific for N6 adenine [29]. In addition, it has been described that the OLV virophage genes g12, g13, g17, g19, g20, g22, and g23 encode, among others, proteins with homologues in NCLDVs, including APMV, and in the Sputnik virophage, including a transmembrane protein and a protein with a collagen triple helix structure, as well as a protein presumably mediating the interaction of the OLV virophage with the giant virus [
2,
29,
37,
50]. By infecting giant viruses of the
Phycodnaviridae family, which infect unnamed phototrophic marine algae, the OLV virophage influences their growth and abundance, thus playing a pivotal role in regulating the microbial network of Organic Lake waters [
29].
2.7. Virophage RNV (Rio Negro Virophage)
The RNV virophage, discovered in 2011 in the waters of the Negro River in Amazonia, was the first virophage found in Brazil; it was detected in the amoeba
A. castellanii infected with Samba giant virus, genus
Mimivirus, family
Mimiviridae Lineage A [
2,
8,
11,
17,
25,
30]. The capsid of the RNV virophage has icosahedral symmetry with a diameter of only 35 nm, and its genome consists of dsDNA. Although there are no precise data on its shape, it is indicated to be similar to the circular dsDNA of the Sputnik virophage [
2,
30]. The genome of the RNV virophage is 18,145 bp long and contains 20 genes ranging from 330 to 2340 bp, confirming its close relationship not only to the Sputnik virophage but also to the Sputnik 2 and Sputnik 3 virophages [
25,
30].
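The gene lengths reported in base pairs can be related to the protein sizes reported in amino acids for the Sputnik group. As a rough sketch, assuming that each gene is a single uninterrupted open reading frame whose length includes one stop codon (an assumption made here for illustration, not a statement from the cited studies; the function name is hypothetical):

```python
def orf_length_to_residues(length_bp: int) -> int:
    """Number of amino acids encoded by an ORF of the given length.

    Assumes the length is a whole number of codons and includes one
    stop codon, which does not contribute an amino acid.
    """
    if length_bp < 6 or length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3 and at least 6 bp")
    return length_bp // 3 - 1


# Longest and shortest RNV gene lengths as reported in the text.
longest = orf_length_to_residues(2340)
shortest = orf_length_to_residues(330)
print(longest, shortest)
```

Under this assumption, the longest RNV gene (2340 bp) corresponds to a protein of 779 amino acids, the same size as the largest protein reported for the Sputnik, Sputnik 2, and Sputnik 3 virophages.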
It is also indicated that the sequence of the MCP protein gene of the virophage RNV is partly identical to the gene encoding the MCP protein of the virophage Sputnik [
8,
30]. This virophage has similar gene content and symmetry to the Sputnik 2 virophage, despite SNPs (single nucleotide polymorphisms) and insertions found in coding and noncoding regions [
30]. A 49 bp insertion at position 11,841 in the RNV virophage has also been recorded, which results in an elongation of its region between g14 and g15 [
6] and is a repeat of the previous 49 nucleotides, except for an SNP at Position 22, where cytosine has been replaced by guanine [30]. SNPs were also found in the genome of this virophage at Positions 16,075 and 18,121, with a deletion at Position 18,145 and an additional three guanines inserted at Position 18,016 [
30]. Comparing the genome of the RNV virophage with that of the Sputnik, Sputnik 2, and Sputnik 3 virophages, it can be observed that it lacks the last 244 bp [
30]. This virophage replicates in the Samba giant virus infecting
A. castellanii amoebae, causing defective morphology and abnormal capsids in this giant virus and reducing its abundance in amoebae by more than 80% [
2].
2.8. Virophage PGVV (Phaeocystis Globosa Virus Virophage)
The PGVV virophage was obtained in 2013 from the giant virus PgV–16T (
Phaeocystis globosa virus), genus
Prymnesiovirus, family
Phycodnaviridae, infecting algae of the genus
Phaeocystis residing in the Dutch coastal waters of the North Sea [
1,
2,
17,
21,
31]. At the time of discovery, it was considered to be most closely related to Mavirus virophage and OLV. However, the identification of PLVs in metagenomic datasets suggested that it was not a virophage but a PLV [
1,
3,
21,
31]. This was justified by its lack of the cysteine protease conserved in virophages and by its distinct versions of the genes for MCP, mCP, and ATPase, which resemble those of PLVs [
1,
21,
31]. Its circular double-stranded DNA genome is 19,527 bp in length, contains only 36% G+C, and encodes 16 predicted genes, three of which show similarity to genes located in OLV and Mavirus virophages; it is housed in a capsid of icosahedral symmetry, 50–80 nm in diameter [
1,
2,
15,
21,
31,
40]. The PGVV virophage replicates as a linear plasmid in the viral particle factories of the PgV-16T giant virus, or as a provirus integrated into this giant virus, occurring at multiple sites in its genome [
1,
2,
21,
31,
40]. It is understood [
21] that the genome of the PGVV virophage associated with the PgV–16T giant virus has only three homologous genes, including primase, and that PGVV cannot exist as a free viral particle, which probably makes it the first example of a mobile virophage element; that is, a transpovirion. Initially, it was thought that the PGVV virophage had no recognizable genes coding for capsid proteins. However, it has now been shown that its g12 probably encodes a distant version of the double jelly-roll MCP protein, and g10 encodes the minor capsid protein mCP, which would support the theory that it is a virophage and not a PLV [
1,
2,
21,
31]. It should be added that the PLV described in 2023 coinfecting the alga Phaeocystis globosa together with the giant virus Phaeocystis globosa virus 14T (PgV-14T) is probably a new virophage that has been named PLV “Gezel–14T”, which is different from all known virophages but also has a destructive effect on specific giant viruses [
6].
2.11. Virophage Zamilon
The virophage Zamilon was described in 2014 in soil samples collected in Tunisia, in which a Mont1 giant virus belonging to the family
Mimiviridae was found to infect the amoebae
A. polyphaga [
1,
2,
11,
17,
25,
33]. The name of this virophage comes from the Arabic word “zamilon”, which means neighbour, and is linked to the fact that this virophage, unlike other known virophages, does not harm giant viruses, as it does not induce the formation of their morphologically abnormal virions and does not affect their lytic capacity [
2,
11,
25,
33]. The Zamilon virophage replicates in the viral particle factories of giant viruses of the
Mimiviridae family of Lineages C and B but not A [
25,
33]. Its capsid has helical symmetry and is approximately 60 nm in diameter [
16,
17,
43,
46], and its circular dsDNA genome consists of 17,276 bp with a G+C content of 29.7% and contains 20 genes ranging from 222 to 2337 bp in length [
11,
17,
33]. Approximately 6000 bp from the end of its genome contains an inverted portion, also recorded in giant viruses of
Mimiviridae Lineage A [
33]. The genome of this virophage shows 76% nucleotide identity with that of the Sputnik virophage over 75% of its length, resulting in the vast majority of its genes showing high similarity with those of Sputnik virophage [
11,
25,
36]. Seventeen of its 20 genes show homology with Sputnik genes, with an identity of 31–86%, of which two genes show an additional 50–67% similarity with genes of the
Megavirus chiliensis giant virus of the family
Mimiviridae. One of its genes is 72% identical to the gene of
Moumouvirus monve giant virus of the family
Mimiviridae [
3,
11,
33,
47]. In this virophage, g12 is the closest homologue of the V9 Sputnik gene, which encodes an unidentified protein but is also related to the putative cysteine protease of Mavirus virophage, with which it has 32% identity and 83% similarity [33]. Additionally, g11 and g18 of the virophage Zamilon are closely related to genes of the virophage Sputnik encoding a potential integrase and a DNA-packaging protein with a putative ATPase domain [33]. The greater homology of g19 of this virophage with giant viruses of the Mimiviridae family Lineages B and C than with Lineage A and the virophage Sputnik has been shown to determine its infectivity [25]. It has also been demonstrated that its g08 is a homologue of Sputnik’s g14, which, as in the Zamilon virophage, has no predicted function [33]. In this virophage, g01 and g02 show some similarity to g15 and g02 of Sputnik (≥30% identity), and the predicted protein sequence encoded by g01 contains a putative domain related to the transmembrane domain of cytochrome c oxidase subunit II [
33]. It was also recorded that the protein sequence encoded by Zamilon virophage g09, which encodes a putative helicase, shows homology to the putative DNA primase/polymerase of virophage OLV and the putative DNA primase of virophage PGVV [
33]. The virophage Zamilon also exhibits a unique evolutionary feature: the ZnR ribbon protein domain [
11]. It is assumed that the proteins encoded by its genes homologous to those of the Sputnik virophage probably include a transposase as well as proteins that determine the formation of its capsid [
33]. It has also been indicated that the virophage Zamilon shares a common trimeric fold with noumea viruses for its receptor-binding proteins, which may also be responsible for the host cell receptor for the giant virus [
54].
2.12. Virophage RVP (Rumen Virophage)
In 2015, various metagenomes were analysed, including those from freshwater, seawater, the activated sludge of a wastewater bioreactor, and the rumen of sheep, and sixteen virophage sequences encoding the MCP capsid protein were described [
1,
2,
17,
25,
27,
55]. Two nearly complete and two partial genomes of these virophages, collected from the sheep rumen metagenome, were named rumen virophages (RVPs) [
25,
27,
55]. The genome of the RVP virophages is linear, as in polintons, with the longest being 26,209 bp long and encoding 22 genes [
1,
25,
55]. Of these 22 genes, the three relatively longest ones presumably encode a protein similar to the Polβ subunit of various polintons, while the others encode “unspecified” proteins described in the GOS database, in addition to a polynucleotide kinase [
25,
55]. The RVP virophage most likely infects giant viruses of the family
Mimiviridae, replicating in unspecified eukaryotic protist host cells living in the rumen of sheep [
2,
17,
25,
55]. Their capsid has no described symmetry, and no integrases were detected in the genome of these virophages, which may suggest that they parasitize giant viruses without integration into their genome. However, their occasional integration via an in-trans integrase cannot be excluded [
55]. RVP virophages may also be hybrids of virophages and polintons: like virophages, they carry genes for MCP, ATPase, and cysteine protease and are capable of forming infectious virions, and like the self-replicating eukaryotic Polinton/Maverick transposable elements, they encode a protein-primed Polβ [
1,
25,
55]. To date, the minor mCP protein has not been found in the genome of the RVP virophage, indicating that the construction of its capsid differs from that characteristic of many virophages, including Sputnik and Mavirus virophages [
1,
25,
55].
2.13. Virophage DSLV1 (Dishui Lake Virophage 1)
In 2016, metagenomic material later defined as a virophage, which was named DSLV1 (Dishui Lake Virophage 1), was obtained from the waters of the artificial Dishui Lake in Shanghai, China; it has also been found in other freshwater bodies [
1,
2,
17,
25,
34,
40,
56]. Although no giant virus host has been identified for this virophage, its hosts are assumed to be giant viruses of the family
Phycodnaviridae, which are thought to infect unspecified freshwater algae [
2,
34]. Virophage DSLV1 has indeterminate capsid symmetry, and its genome is a circular double-stranded DNA with 43.2% G+C [50]. Metagenomic analysis has shown that it is 28,788 bp long and contains 28 genes, 15 of which show homology to genes of described virophages, particularly those identified in Yellowstone Lake; that is, the YSLV virophages. In addition, two genes of this virophage are similar to genes of giant viruses of the family
Phycodnaviridae [
11,
34,
51,
52]. It is also indicated that more than half of the 28 genes of the DSLV1 virophage have the highest sequence similarity to the genes of the YSLV 3 virophage (33–70%) [
34]. Among these are five genes encoding the MCP protein, the mCP protein, a DNA helicase, a packaging ATPase, and a cysteine protease [34]. In addition, examination of the DSLV1 virophage genome revealed five highly conserved regions shared between the DSLV1 and YSLV 3 virophages, suggesting that the two virophages are related [
34]. It has been reported that in samples in which DSLV1 virophages were found, 46 other virophage sequences were recorded, including six MCP protein-related genes closely related to OLV and YSLV virophages—mainly YSLV 3, where similarity was determined to be 33–70% [
25].
2.14. Virophage QLV (Qinghai Lake Virophage)
This virophage was identified in 2016 in the surface waters of Qinghai Lake in the mountains of Tibet, which is rich in a planktonic microbial community [
1,
2,
33,
35]. This virophage is thought to replicate in giant viruses of the family
Phycodnaviridae, parasitizing unnamed freshwater algae [
35], and is most closely related to the virophage OLV and the virophages YSLV 1–7, particularly YSLV 1–4 [
1,
2,
35]. The QLV virophage has an undefined capsid, and its genome is a circular dsDNA of 23,379 bp in length, with a G+C content of only 33.2%, containing 25 genes [
35]. An analysis of its gene content has identified genes considered to be universally conserved for both QLV and other virophages, including genes encoding the FtsK–HerA family ATPase (g01), cysteine protease (g06), MCP protein (g18), mCP protein (
g19), and DNA helicase/primase/polymerase (g23) [
35]. The products of its core genes have also been shown to be responsible for the replication of its DNA and the packaging of its virions [
35]. This virophage shares seven homologous genes with the YSLV3 virophage (41% amino acid identity), eight with the OLV virophage (39% amino acid identity), nine with the YSLV1 virophage (40% amino acid identity), and 11 with the YSLV4 virophage (46% amino acid identity). Its amino acid identity with the virophages Sputnik, Mavirus, Zamilon, and ALM was determined to be less than 35% [
29,
35]. It has also been shown that its genes g02 and g19 encode, respectively, a glycoprotein and a RecB family recombinase-containing protein, RecB being a subunit of the RecBCD enzyme, which takes part in the recombinational repair of double-strand DNA breaks [
35]. This protein also affects the glycoproteins involved in forming the capsid and is important in adhesion and in interactions among the virophage, the giant virus, and the host cell [
35]. It was also indicated that the amino acid sequence of Gp02 of the QLV virophage has less than 48% identity to the corresponding proteins of
phycodnaviruses (
Paramecium bursaria
Chlorella virus and
Acanthocystis turfacea
Chlorella virus), which are known to infect unicellular green algae [
35]. Of its 25 genes, 11 are unique, as they have not been found in other virophages [
29,
35], and it is further indicated that its evolutionary affinity with OLV-like virophages and the homology of its genes, especially g02, with those of giant viruses of the family
Phycodnaviridae are evidence that it replicates in these viruses [
35,
49].
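The homologue counts and identities quoted above for QLV can be tabulated to make the ranking of its closest relatives explicit. The following is a minimal Python sketch: the figures are those given in the text [29,35], whereas the names `QLV_RELATIVES`, `rank_relatives`, and `shared_fraction` are hypothetical and introduced only for illustration.

```python
# Shared homologous genes and amino acid identity (%) between QLV and
# its relatives, as quoted in the text above [29,35].
# All identifiers below are illustrative assumptions, not from the source.
QLV_TOTAL_GENES = 25

QLV_RELATIVES = {
    "YSLV3": (7, 41),   # (shared genes, % amino acid identity)
    "OLV": (8, 39),
    "YSLV1": (9, 40),
    "YSLV4": (11, 46),
}


def rank_relatives(relatives):
    """Return names sorted by shared-gene count, then identity, descending."""
    return sorted(relatives, key=lambda name: relatives[name], reverse=True)


def shared_fraction(shared, total=QLV_TOTAL_GENES):
    """Fraction of QLV genes that have a homologue in the relative."""
    return shared / total


if __name__ == "__main__":
    for name in rank_relatives(QLV_RELATIVES):
        shared, identity = QLV_RELATIVES[name]
        print(f"{name}: {shared} genes "
              f"({shared_fraction(shared):.0%} of QLV), {identity}% identity")
```

On these figures, YSLV4 ranks first, sharing 11 of the 25 QLV genes, which is consistent with the placement of QLV among the OLV-like virophages described above.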
2.16. Virophages CpV–PLV Curly, CpV–PLV Moe, CpV–PLV Larry
These three virophages (CpV–PLV Curly, CpV–PLV Moe, and CpV–PLV Larry) were described in 2019 along with the CpV–BQ2 giant virus from the family
Phycodnaviridae, which infects the freshwater alga
Chrysochromulina parva, found in the waters of Lake Tai in China and Lake Erie in the United States [
25,
36]. Initially, no virophage particles were observed during the isolation of the giant virus CpV–BQ2, the host of these virophages, probably because the virophage particles, or their genomes, were packaged inside the giant virus, as with the virophage PGVV and the giant virus PgV–16T [
36]. Further studies [
36] recovered the genomes of these virophages and showed that they encode, among other things, the major capsid protein (MCP) and the minor capsid protein (mCP), although it remains unclear whether these three virophages are provirophages or virophages that persist within the CpV–BQ2 giant virus [
36]. So far, it has only been accepted that they belong to the PLV group of viruses and possess between 19 and 23 genes, including all the core genes of PLVs and several genes involved in modifying their genomes [
36]. To date, the symmetry and size of their capsids have not been determined; their genomes consist of dsDNA [
36]. The CpV–PLV Curly genome is 22,761 bp long, with a G+C content of only 37.8% [
36]. It encodes 19 genes, eight of which have predicted functions. Gene g11 encodes the mCP protein and g12 the MCP protein, whereas g17 encodes a hypothetical protein similar to that encoded by the homologous gene of the QLV virophage [
36]. This virophage is closely related to the virophages PGVV, YSLV 1, and YSLV 3 [
36]. In addition to the MCP and mCP proteins, the packaging ATPase, a superfamily 3 helicase, and a tyrosine recombinase, it has five uncharacterized conserved proteins shared only with the YSLV virophages [
36]. Besides the genes typical of virophages, the CpV–PLV Curly virophage also encodes a probable endonuclease with the HNH (His–Asn–His) structural motif and a DNA methyltransferase [
36], and one of its genes probably encodes an E3 ubiquitin ligase [
36].
The CpV–PLV Moe virophage was shown to be very similar to the CpV–PLV Curly virophage; despite its smaller genome of 21,750 bp, with only 30.1% G+C content, it encodes 23 genes [
36]. The genome of the CpV–PLV Larry virophage is the largest of the three at 22,879 bp, with 39.3% G+C content, but it encodes only 20 genes. It has the same core elements as the CpV–PLV Curly and CpV–PLV Moe virophages, albeit with the 5′ half of the genome inverted [
36] and, unlike the CpV–PLV Curly and CpV–PLV Moe virophages, it probably encodes a DNA cytosine methyltransferase [
36].
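The genome figures reported for the three CpV–PLV virophages can be set side by side to show that the smallest genome (Moe) carries the most genes. The sketch below is a minimal Python illustration using only the values quoted in this section [36]. The class `VirophageGenome` and the `genes_per_kb` measure (gene count divided by genome length) are assumptions made here for comparison and are not taken from the source.

```python
# Genome statistics for the three CpV-PLV virophages, as quoted in the
# text above [36]. Class, field, and property names are illustrative
# assumptions; genes_per_kb is a rough descriptor, not a source metric.
from dataclasses import dataclass


@dataclass(frozen=True)
class VirophageGenome:
    name: str
    length_bp: int
    gc_percent: float
    gene_count: int

    @property
    def genes_per_kb(self) -> float:
        """Predicted genes per 1000 bp of genome."""
        return self.gene_count / (self.length_bp / 1000)


CPV_PLVS = [
    VirophageGenome("CpV-PLV Curly", 22_761, 37.8, 19),
    VirophageGenome("CpV-PLV Moe", 21_750, 30.1, 23),
    VirophageGenome("CpV-PLV Larry", 22_879, 39.3, 20),
]

if __name__ == "__main__":
    # List the genomes from the most to the least gene-dense.
    for g in sorted(CPV_PLVS, key=lambda g: g.genes_per_kb, reverse=True):
        print(f"{g.name}: {g.length_bp} bp, {g.gc_percent}% G+C, "
              f"{g.gene_count} genes, {g.genes_per_kb:.2f} genes/kb")
```

With these inputs, Larry has the longest genome and Moe the highest gene density, at roughly 1.06 genes per kb compared with about 0.83 for Curly and 0.87 for Larry.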
2.17. Virophage CVV–SW01 (Chlorella Virus Virophage)
In 2022, a
Chlorella virophage, CVV–SW01, residing in the giant
Chlorella virus XW01 (CV–XW01) of the family
Mimiviridae, which parasitizes algae of the genus
Chlorella, was obtained from the waters of Lake Dishui [
23]. With the description of this virophage, the CVV system was identified in these unicellular eukaryotic hosts [
23]; such systems had previously been recorded in protozoa and in the unicellular eukaryote
Bigelowiella natans [
8,
12,
18,
20,
21,
22,
23]. These findings prompted study of the CVV system as a potential mechanism influencing ecological phenomena in aquatic environments, including the evolution of giant viruses and virophages [
23]. This virophage has a capsid with icosahedral symmetry, and its circular dsDNA genome is 24,744 bp long with a G+C content of only 35.6%. Its genome encodes 23 genes, 13 of which have homologues in the virophage DSLV5, indicating a close affinity between the two [
23]. The genome of the virophage CVV–SW01 contains the genes conserved among virophages, namely those encoding the packaging ATPase, cysteine protease, MCP, and mCP, and one of its genes probably encodes a DNA helicase [
23]. This virophage is closely related to the Lake Dishui virophages, particularly DSLV5; it also shows an affinity to the Lake Mendota virophages and YSLV 3 [
23].
Furthermore, as many as 82 genes of CV–XW01, the giant virus host of this virophage, show homology with genes of the CroV giant virus, its closest relative [
23]. Moreover, the codon usage preferences of the giant virus CV–XW01 and the virophage CVV–SW01 are very similar to those of the giant virus CroV and its virophage Mavirus, respectively, suggesting that the giant virus CV–XW01 hosts the virophage CVV–SW01 [
23]. The giant viruses CV–XW01 and CroV also share 74.7% genomic sequence identity, indicating that the giant virus CV–XW01 may be the second species of the genus
Cafeteriavirus or the first species of a new genus closely related to it. Despite the close relationship between the two giant viruses, CV–XW01 and CroV, their virophages are poorly related. Given these facts, it is suggested that the interaction among the virophage CVV–SW01, the giant virus CV–XW01, and the alga
Chlorella sp. is likely to differ from that among the virophage Mavirus, the giant virus CroV, and the flagellate
C. roenbergensis (now
C. burkhardae) [
23]. Notably, Dishui Lake virophages, the closest relatives of virophage CVV–SW01, are likely to parasitize the Dishui Lake 1 green algal giant virus, which is poorly related to the giant virus CV–XW01 [
23]. Evidence of cross-species infection by virophages has also been reported, which may arise because virophages, through horizontal gene transfer and recombination, are “linked” to a dynamic network of mobile genetic elements, such as Maverick/Polinton transposons, PLVs, proviruses, transpovirons, and retrotransposons, and can thus acquire versatile adaptations for colonizing and parasitizing different giant viruses [
23].